Sunday, August 8, 2021

How to do Data Profiling and display Summary Statistics of a Table in Power BI

How to generate Summary Statistics of a Table Data in Power BI
Scenario:
Suppose we have a Table "tbl_OrderDetails" as follows.


Now we can do data profiling for this table using Power Query> View options as follows.

1) Column Distribution:
Using the "Colum Distribution" option from Power Query > View options, we can see how the data is being distributed across each Column as follows.
The number of distinct and unique values is displayed for each column. Unique values, which are a subset of distinct values, represent entries that appear only once in a column of dataset or table.
For a Primary key (Eg: Order_Id ) column, the Distinct and Unique count is same as it has one unique id for each record. 


Notes:
In data profiling, Unique values refer to entries that appear only once in the column, while Distinct values are all the different values present in the column, regardless of how often they occur.
For instance, in a column with the values [A, B, B, C, A, D]:
Distinct values (4): These are [A, B, C, D], representing all different values in the column.
Unique values (2): These are [C, D], as they occur only once in the column.

Essentially, unique focuses on values that are exclusive to a single instance, while distinct considers all variations.

2) Column Quality:
Using "Colum Quality" option from Power Query > View options, we can see the percentage of Valid, Error, and Empty values across each Column as follows.


3) Column Profile:
With "Colum Profile" option from Power Query > View options, we can see the Data Quality, Summary Statistics, and Data Distribution per each Distinct item of a selected Column only.
For the Column Profile example, I have selected the column "Units Sold" from the "tbl_Sales" table as shown below:

The Column Profile displays the Summary Statistics of Column like Count, Error, Empty, Distinct, Unique, NaN(not a number), Zero, Min, Max, Average, Standard Deviation, Even and Odd values in the Column.


However, if we wants to generate a Summary Statistics table for all the Columns of a Table then we can generate it using the following MQuery.

= Table.Profile(tbl_OrderDetails)

This query will generate the Summary Statistics of the table "tbl_OrderDetails" as follows.


--------------------------------------------------------------------------------------------------------
Thanks, TAMATAM ; Business Intelligence & Analytics Professional
--------------------------------------------------------------------------------------------------------

No comments:

Post a Comment

Hi User, Thank You for visiting My Blog. If you wish, please share your genuine Feedback or comments only related to this Blog Posts. It is my humble request that, please do not post any Spam comments or Advertising kind of comments, which will be Ignored.

Featured Post from this Blog

How to compare Current Snapshot Data with Previous Snapshot in Power BI

How to Dynamically compare two Snapshots Data in Power BI Scenario: Suppose we have a sample Sales data, which is stored with Monthly Snapsh...

Popular Posts from this Blog