Saturday, August 21, 2021

How to Group Data and select First Non Null Value from a Column using M Query in Power BI

How to Group Data and select First Non Null Value from a Column using Power Query in Power BI
Scenario :
Suppose we have a table "SalesOrders"  with Order_Id column having some Null and Zero as values.


Now we wants to Group this data by Country_Id, CountryName and CustomerName columns and and return the First Non Blank value from the Order_Id column.

We can achieve this using the below M-Query :
= Table.Group(#"Changed Type1", {"Country_Id","CountryName", "CustomerName"}, {{"FirstNonNullOrder", each List.First(List.RemoveNulls([Order_Id])), type nullable number}})

Notes : 
List.Remove will remove the Null values from the Column.
List.First will selects the First Value from the Column.

The following are some M-Qurery steps for reference :
let
    Source = Excel.Workbook(File.Contents(SrcFile), null, true),
    SalesOrders_Sheet = Source{[Item="SalesOrders",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(SalesOrders_Sheet, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Order_Id", Int64.Type}, {"OrderMonth", type text}, {"SalesCountry_Id", Int64.Type}, {"Cust_Id", Int64.Type}, {"CustomerName", type text}, {"CountryName", type text}, {"Units_Sold", Int64.Type}, {"Unit_Price", Int64.Type}, {"Gross_Sales", Int64.Type}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"SalesCountry_Id", "Country_Id"}}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Renamed Columns",{{"OrderMonth", type date}}),
    GroupedRows = Table.Group(#"Changed Type1", {"Country_Id","CountryName", "CustomerName"}, {{"FirstNonNullOrder", each List.First(List.RemoveNulls([Order_Id])), type nullable number}}, GroupKind.Local),
    #"Appended Query" = Table.Combine({#"Grouped Rows", SalesOrdersByCountry}),
    #"Removed Duplicates" = Table.Distinct(#"Appended Query"),
    #"Merged Queries" = Table.NestedJoin(#"Removed Duplicates", {"Country_Id"}, tbl_Region, {"Country_Id"}, "tbl_Region", JoinKind.LeftOuter),
    #"Expanded tbl_Region" = Table.ExpandTableColumn(#"Merged Queries", "tbl_Region", {"Region_Name"}, {"Region"}),
    #"Reordered Columns" = Table.ReorderColumns(#"Expanded tbl_Region",{"Country_Id", "CountryName", "CustomerName", "Region", "FirstNonNullOrder"})
in
    #"Reordered Columns"

Notes :
SrcFile is the parameter defined for Source File Path
"T:\T_Tech_Lab\PowerBI_Lab\DataSets\ds_Demo_Sample.xlsx" meta [IsParameterQuery=true, Type="Text", IsParameterQueryRequired=true]

Table.Combine : This query will be used to Append two or more Datsets.
Table.NestedJoin : This query will be used to Join the two Datasets.
Table.Distinct : This query will be used to remove the duplicates from a Dataset.

When there is a space in the step name ( eg : Grouped Rows) then Power Query will encode that step as #"Grouped Rows" otherwise it read the step name as it is as GroupedRows.

GroupKind.Local = A local group is formed from a consecutive sequence of rows from an input table with the same key value. This will increase the speed of the Group operation when we are dealing with the large Dataset grouping.

Result :

-------------------------------------------------------------------------------------------------------- 
Thanks, TAMATAM ; Business Intelligence & Analytics Professional 
--------------------------------------------------------------------------------------------------------

No comments:

Post a Comment

Hi User, Thank You for visiting My Blog. Please post your genuine Feedback or comments only related to this Blog Posts. Please do not post any Spam comments or Advertising kind of comments which will be Ignored.