Scenario:
Notes:
In data profiling, Unique values refer to entries that appear only once in the column, while Distinct values are all the different values present in the column, regardless of how often they occur.
Essentially, unique focuses on values that are exclusive to a single instance, while distinct considers all variations.
However, if we wants to generate a Summary Statistics table for all the Columns of a Table then we can generate it using the following MQuery.
Suppose we have a Table "tbl_OrderDetails" as follows.
1) Column Distribution:
Using the "Colum Distribution" option from Power Query > View options, we can see how the data is being distributed across each Column as follows.
The number of distinct and unique values is displayed for each column. Unique values, which are a subset of distinct values, represent entries that appear only once in a column of dataset or table.
For a Primary key (Eg: Order_Id ) column, the Distinct and Unique count is same as it has one unique id for each record.
Notes:
In data profiling, Unique values refer to entries that appear only once in the column, while Distinct values are all the different values present in the column, regardless of how often they occur.
For instance, in a column with the values [A, B, B, C, A, D]:
Distinct values (4): These are [A, B, C, D], representing all different values in the column.
Unique values (2): These are [C, D], as they occur only once in the column.
2) Column Quality:
Using "Colum Quality" option from Power Query > View options, we can see the percentage of Valid, Error, and Empty values across each Column as follows.
With "Colum Profile" option from Power Query > View options, we can see the Data Quality, Summary Statistics, and Data Distribution per each Distinct item of a selected Column only.
For the Column Profile example, I have selected the column "Units Sold" from the "tbl_Sales" table as shown below:
The Column Profile displays the Summary Statistics of Column like Count, Error, Empty, Distinct, Unique, NaN(not a number), Zero, Min, Max, Average, Standard Deviation, Even and Odd values in the Column.
However, if we wants to generate a Summary Statistics table for all the Columns of a Table then we can generate it using the following MQuery.
= Table.Profile(tbl_OrderDetails)
This query will generate the Summary Statistics of the table "tbl_OrderDetails" as follows.
No comments:
Post a Comment
Hi User, Thank You for visiting My Blog. If you wish, please share your genuine Feedback or comments only related to this Blog Posts. It is my humble request that, please do not post any Spam comments or Advertising kind of comments, which will be Ignored.