Tuesday, March 4, 2025

How to return the Metadata of Table or Dataset using Power Query

How to Check the Cardinality of Columns of a Table or Dataset using Power Query
Scenario:
Suppose we have a Table or Dataset "fct_Orders_Sample" in the Data Model. From this Table, we want to return the following Metadata:
"No. of Columns", "Column Name", "Column Type", "Column Size", "Column Cardinality" 

We can achieve this Scenario, using the following Power M-Query:

let
    /* Define your table name here */
    TableName = "fct_Orders_Sample",  

    /* Use the actual table data here */
    Source = #"fct_Orders_Sample",

    /* Retrieve the schema of the table */
    Schema = Table.Schema(Source),
    ColumnNames = Schema[Name],
    ColumnTypes = Schema[Kind],
    
    /* Efficiently calculate the column sizes */
    ColumnSizes = List.Transform(ColumnNames, each 
        let
            ColumnData = Table.Column(Source, _),
            DataSize = List.Sum(List.Transform(ColumnData, 
                each if _ is null then 0 else Binary.Length(Text.ToBinary(Text.From(_)))))
        in
            Number.Round(DataSize / (1024 * 1024), 4)  /* Convert size to MB */
        ),

    /* Calculate the total size of the table */
    TotalSize = Number.Round(List.Sum(ColumnSizes), 4),

    /* Calculate the cardinality for each column */
    Cardinality = List.Transform(ColumnNames, each List.Count(List.Distinct(Table.Column(Source, _)))),

    /* Find the maximum cardinality and its corresponding column */
    MaxCardinality = List.Max(Cardinality),
    MaxCardinalityColumn = ColumnNames{List.PositionOf(Cardinality, MaxCardinality)},

    /* Retrieve the last refresh date from the table properties */
    RefreshInfo = Value.Metadata(Source),
    LastRefreshDate = Record.FieldOrDefault(RefreshInfo, "RefreshDate", DateTime.LocalNow()),

    /* Create a metadata table from the lists */
    MetadataTable = Table.FromColumns(
        {List.Repeat({TableName}, List.Count(ColumnNames)), ColumnNames, 
        ColumnTypes, List.Repeat({List.Count(ColumnNames)}, List.Count(ColumnNames)), 
        ColumnSizes, List.Repeat({TotalSize}, List.Count(ColumnNames)), 
        Cardinality, List.Repeat({MaxCardinality}, List.Count(ColumnNames)), 
        List.Repeat({MaxCardinalityColumn}, List.Count(ColumnNames)), 
        List.Repeat({LastRefreshDate}, List.Count(ColumnNames))}, 
        {"TableName", "ColumnName", "ColumnType", "NoOfColumns", 
        "ColumnSizeInMB", "TotalSizeOfTableInMB", "ColumnCardinality", 
        "MaxCardinality", "MaxCardinalityColumn", "LastRefreshDate"}
    )
in
    MetadataTable

Result:


--------------------------------------------------------------------------------------------------------
Thanks, TAMATAM ; Business Intelligence & Analytics Professional
--------------------------------------------------------------------------------------------------------

No comments:

Post a Comment

Hi User, Thank You for visiting My Blog. If you wish, please share your genuine Feedback or comments only related to this Blog Posts. It is my humble request that, please do not post any Spam comments or Advertising kind of comments, which will be Ignored.

Featured Post from this Blog

How to compare Current Snapshot Data with Previous Snapshot in Power BI

How to Dynamically compare two Snapshots Data in Power BI Scenario: Suppose we have a sample Sales data, which is stored with Monthly Snapsh...

Popular Posts from this Blog