How to generate Random Sample Dimension and Fact Tables using M-Query
Scenario:
Suppose if we want to generate the Dimension and Tables with Random Sample data using Power Query.
1) dim_Date - a sample Calendar dimension with list of Dates for 2021.
2) dim_Customer - a customer dimension with sample list of Customers.
3) dim_Products - a products dimension with sample list of Products
4) fact_Sales_Orders - a fact table with random sample of Order details data.
We can create those sample tables using the following Power Query logics.
1) dim_Date: The sample Calendar dimension with list of Dates for 2021.
let
/* Generate a list of dates for the entire year 2021 */
StartDate = #date(2021, 1, 1),
EndDate = #date(2021, 12, 31),
DateList = List.Dates(StartDate, Duration.Days(EndDate - StartDate) + 1, #duration(1, 0, 0, 0)),
/* Convert the list of dates into a table */
DateTable = Table.FromList(DateList, Splitter.SplitByNothing(), {"Date"}),
/* Add additional columns for required attributes */
ExtendedTable = Table.AddColumn(DateTable, "Year", each Date.Year([Date]), Int64.Type),
AddMonth = Table.AddColumn(ExtendedTable, "Month", each Text.Proper(Text.Start(Date.MonthName([Date]), 3)), type text),
AddDay = Table.AddColumn(AddMonth, "Day", each Date.Day([Date]), Int64.Type),
AddMonthYear = Table.AddColumn(AddDay, "Month_Year", each [Month] & "-" & Text.From(Date.Year([Date])), type text),
AddQuarter = Table.AddColumn(AddMonthYear, "Quarter", each "Q" & Text.From(Date.QuarterOfYear([Date])), type text),
AddQtrYear = Table.AddColumn(AddQuarter, "Qtr_Year", each [Quarter] & "-" & Text.From(Date.Year([Date])), type text),
/* Change data types as necessary */
FinalTable = Table.TransformColumnTypes(AddQtrYear, { {"Date", type date}, {"Year", Int64.Type}, {"Month", type text}, {"Day", Int64.Type}, {"Month_Year", type text}, {"Quarter", type text}, {"Qtr_Year", type text} })
in
FinalTable
Result:
2) dim_Customer: The customer dimension with sample list of Customers.
let
CustomerNames = {"Sun Hot", "Moon Cool", "Water Flow", "Sky Star", "Earth Strong", "Fire Blaze", "Ocean Deep", "Cloud Soft", "Wind Swift", "Mountain High"},
Regions = {"North", "South", "East", "West"},
Source = Table.FromRecords(
List.Transform({1..10}, each [
Customer_ID = Text.Combine({"C", Number.ToText(_, "00")}),
/* Picks a name from the predefined list */
Customer_Name = CustomerNames{_ - 1},
/* Cycles through regions */
Customer_Region = Regions{Number.Mod(_, List.Count(Regions))}
])
),
DataTypeChange = Table.TransformColumnTypes(Source, { {"Customer_ID", type text}, {"Customer_Name", type text}, {"Customer_Region", type text} })
in
DataTypeChange
Result:
3) dim_Products: The products dimension with sample list of Products
let
Products = {
[Product_ID="P01", Prod_Name="LED TV", Prod_Category="Electronics", Prod_Price=500, Prod_Cost=300],
[Product_ID="P02", Prod_Name="Smartphone", Prod_Category="Electronics", Prod_Price=800, Prod_Cost=500],
[Product_ID="P03", Prod_Name="Laptop", Prod_Category="Electronics", Prod_Price=1200, Prod_Cost=900],
[Product_ID="P04", Prod_Name="Refrigerator", Prod_Category="Home Appliances", Prod_Price=1500, Prod_Cost=1000],
[Product_ID="P05", Prod_Name="Washing Machine", Prod_Category="Home Appliances", Prod_Price=1800, Prod_Cost=1400],
[Product_ID="P06", Prod_Name="Microwave Oven", Prod_Category="Home Appliances", Prod_Price=600, Prod_Cost=400],
[Product_ID="P07", Prod_Name="Sofa", Prod_Category="Furniture", Prod_Price=2000, Prod_Cost=1500],
[Product_ID="P08", Prod_Name="Wooden Table", Prod_Category="Furniture", Prod_Price=900, Prod_Cost=600],
[Product_ID="P09", Prod_Name="Wooden Chair", Prod_Category="Furniture", Prod_Price=2500, Prod_Cost=1900]
},
Source = Table.FromList(Products, Record.FieldValues, {"Product_ID", "Product_Name", "Category", "Price", "Cost"})
in
Source
Result:
4) fact_Sales_Orders: The fact table with random sample of Order details data.
let
SalesRecords = List.Transform({1..30}, each
let
/* Generate a Range of Random Dates */
DateInfo =
let
StartDate = #date(2021, 1, 1),
EndDate = #date(2021, 3, 31),
RandomDaysOffset = Number.RoundDown(
Number.RandomBetween(0, Duration.Days(EndDate - StartDate))),
RandomDate = Date.AddDays(StartDate, RandomDaysOffset),
Date_ID = Number.FromText(Date.ToText(RandomDate, "yyyyMMdd")),
Month = Date.Month(RandomDate)
in
[Order_Date = RandomDate, Order_Month = Month],
/* Random Product selection */
RandomProduct = Number.RandomBetween(1, 9)
in
[
Order_ID = _,
Product_ID =
/*
Exclude specific products for certain months:
- For January (Order_Month = 1), exclude products P01, P02, and P03.
- For February (Order_Month = 2), exclude products P04, P05, and P06.
- For March (Order_Month = 3), exclude products P07, P08, and P09.
If the RandomProduct falls within the exclusion list for the respective month,
then the Product_ID is set to null (excluded from the data).
Otherwise, it assigns the product.
*/
if
(DateInfo[Order_Month] = 1 and List.Contains({1, 2, 3}, RandomProduct)) or
(DateInfo[Order_Month] = 2 and List.Contains({4, 5, 6}, RandomProduct)) or
(DateInfo[Order_Month] = 3 and List.Contains({7, 8, 9}, RandomProduct))
then null
else Text.Combine({"P", Number.ToText(RandomProduct, "00")}),
Customer_ID = Text.Combine({"C", Number.ToText(Number.RandomBetween(1, 10), "00")}),
Order_Date = DateInfo[Order_Date],
Quantity_Sold = Number.RandomBetween(1, 5),
Revenue = Number.RandomBetween(500, 3000),
Profit = Number.RandomBetween(50, 500)
]
),
/* Remove records where Product_ID is null */
FilteredRecords = Table.SelectRows(Table.FromRecords(SalesRecords), each [Product_ID] <> null),
/* Adjust column data types */
DataTypeChange = Table.TransformColumnTypes(FilteredRecords,{{"Order_ID", Int64.Type}, {"Product_ID", type text}, {"Customer_ID", type text}, {"Order_Date", type date}, {"Quantity_Sold", Int64.Type}, {"Revenue", type number}, {"Profit", type number}})
in
DataTypeChange
Result:
Please note that, the random values will keep change on each refresh of the above Queries.
--------------------------------------------------------------------------------------------------------
Thanks, TAMATAM ; Business Intelligence & Analytics Professional
--------------------------------------------------------------------------------------------------------