Tuesday, March 17, 2026

How to use Dynamic SQL to Consolidate Data from multiple Tables with different Structures

How to use Cursor in SQL to Consolidate Data from various Tables with different Structures
Scenario:
Suppose you are working in a pharmaceutical company where information is distributed across several systems. Each system captures different attributes:
  • Clinical trials record patient groups, their clinical responses, and efficiency rates.
  • Production records track manufacturing details such as batch yield ratios and batch quantities.
  • Sales transactions capture commercial data like regions, customer types, and sales amounts.
While this design works well for transactional systems, it becomes difficult when you need to analyze or report across all of them in a consistent way.
For example, management may want to see how a drug performed in trials, how efficiently it was produced, and how much revenue it generated, in a consolidated view.

The challenge is to create a Unified_Data_Repository that standardizes data from these diverse sources. Instead of wide tables with many columns, the goal is to reshape the data into a normalized tall format. 

In this structure, each column value is transformed into a row, making it easier to query, compare, and report across the various heterogeneous tables.

The consolidated output table follows this simple schema:
  • ID: represents the original row identifier from the source table.
  • Criteria_Name: the name of the column being transformed.
  • Criteria_Value: the actual value from that column, stored as text.
  • Src_Object_Name: the source table name, ensuring traceability.
  • _LastModified_On: a timestamp marking when the record was inserted.
We can achieve this using Dynamic SQL with a Cursor, as explained below:
-----------------------------------------------------------------
Step 1: Create Sample Input Source Tables:
-----------------------------------------------------------------
-- Sample of Clinical trial data
CREATE TABLE [dbo].[Clinical_Trials] 
(
    Trial_ID INT,
    Drug_Code VARCHAR(50),
    Patient_Group VARCHAR(50),
    Clinical_Response VARCHAR(100),
    Efficiency_Rate DECIMAL(10,2)
);

INSERT INTO [dbo].[Clinical_Trials] VALUES
(1001, 'Drug_A01', 'Adults', 'Positive', 87.5),
(1002, 'Drug_B02', 'Children', 'Neutral', 65.0),
(1003, 'Drug_C03', 'Elderly', 'Negative', 42.3);

SELECT * FROM [dbo].[Clinical_Trials]


-- Sample of Production records
CREATE TABLE [dbo].[Production_Records] 
(
    Batch_ID INT,
    Drug_Code VARCHAR(50),
    Plant_ID VARCHAR(50),
    Batch_Yield_Ratio DECIMAL(10,2),
    Batch_Quantity DECIMAL(10,2)
);

INSERT INTO [dbo].[Production_Records] VALUES
(2001, 'Drug_A01', 'Plant_001', 95.2, 1200.5),
(2002, 'Drug_B02', 'Plant_002', 89.7, 980.0),
(2003, 'Drug_C03', 'Plant_003', 76.4, 750.2);

SELECT * FROM [dbo].[Production_Records]


-- Sample of Sales transactions
CREATE TABLE [dbo].[Sales_Transactions] 
(
    Sale_ID INT,
    Region VARCHAR(50),
    Drug_Code VARCHAR(50),
    Customer_Type VARCHAR(50),
    Sales_Amount DECIMAL(10,2)
);

INSERT INTO [dbo].[Sales_Transactions] VALUES
(3001, 'North America', 'Drug_A01', 'Hospital', 15000.0),
(3002, 'Europe', 'Drug_B02', 'Pharmacy', 12000.5),
(3003, 'Asia', 'Drug_C03', 'Distributor', 9800.0);

SELECT * FROM [dbo].[Sales_Transactions]

-----------------------------------------------------------------
Step 2: Create the Unified Output Table to store final Output Data:
-----------------------------------------------------------------
CREATE TABLE [dbo].[Unified_Data_Repository] 
(
    ID INT,
    Criteria_Name VARCHAR(200),
    Criteria_Value VARCHAR(200),
    Src_Object_Name VARCHAR(200),
    _LastModified_On DATETIME
);

-----------------------------------------------------------------
Step 3: Create a Temp Table with Source Table Names and Key Columns:
-----------------------------------------------------------------
IF OBJECT_ID('tempdb..##Src_Table_Mapping') IS NOT NULL 
DROP TABLE ##Src_Table_Mapping;

CREATE TABLE ##Src_Table_Mapping (
    Source_Table VARCHAR(128),
    Key_Column  VARCHAR(128)
);

INSERT INTO ##Src_Table_Mapping VALUES
('Clinical_Trials', 'Trial_ID'),
('Production_Records', 'Batch_ID'),
('Sales_Transactions', 'Sale_ID');

SELECT * FROM ##Src_Table_Mapping;

-----------------------------------------------------------------
Step 4: Dynamic SQL Query with Cursor to consolidate Data from all Source tables
-----------------------------------------------------------------
-- Declare variables to hold table name and SQL statement
DECLARE @TableName NVARCHAR(200);
DECLARE @sql NVARCHAR(MAX);
DECLARE @KeyColumn NVARCHAR(200);

-- Define a Cursor that will loop through all Source table names
DECLARE _CursorX Cursor FOR
SELECT Source_Table, Key_Column FROM ##Src_Table_Mapping;

-- Open the Cursor (initializes the result set)
OPEN _CursorX;

-- Fetch the first record (first table name from mapping)
FETCH NEXT FROM _CursorX INTO @TableName, @KeyColumn;

-- Loop until all records are processed
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = '';

    -- Build dynamic SQL for the current table
    -- This Pivots columns into rows and inserts them into Unified_Data_Repository
    SELECT @sql = 
        'INSERT INTO [dbo].[Unified_Data_Repository] 
        (ID, Criteria_Name, Criteria_Value, Src_Object_Name, _LastModified_On) ' +
        (
        SELECT STRING_AGG(
            CONVERT(NVARCHAR(MAX),  -- avoid STRING_AGG's 8000-character truncation on wide tables
            'SELECT ' + @KeyColumn + ' AS ID, ''' + COLUMN_NAME + ''' AS Criteria_Name, ' +
            'CAST(' + COLUMN_NAME + ' AS NVARCHAR(200)) AS Criteria_Value, ' +
            '''' + @TableName + ''' AS Src_Object_Name, GETDATE() AS _LastModified_On ' +
            'FROM [dbo].[' + @TableName + ']'),
            ' UNION ALL '
            )
        FROM INFORMATION_SCHEMA.COLUMNS
        WHERE TABLE_NAME = @TableName
        AND COLUMN_NAME <> @KeyColumn  -- Exclude Key column from Pivot
        );

    -- Execute the dynamic SQL for this table
    EXEC(@sql);

    -- Fetch the next record (next table name)
    FETCH NEXT FROM _CursorX INTO @TableName, @KeyColumn;
END

-- Close the Cursor (release resources)
CLOSE _CursorX;

-- Deallocate the Cursor (remove definition)
DEALLOCATE _CursorX;
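
For reference, the statement the cursor builds and executes for the Clinical_Trials table would look roughly like the sketch below (column order may vary, since STRING_AGG does not guarantee ordering unless WITHIN GROUP is specified):

```sql
-- Sketch of the dynamic SQL generated for the Clinical_Trials table
INSERT INTO [dbo].[Unified_Data_Repository]
(ID, Criteria_Name, Criteria_Value, Src_Object_Name, _LastModified_On)
SELECT Trial_ID AS ID, 'Drug_Code' AS Criteria_Name,
       CAST(Drug_Code AS NVARCHAR(200)) AS Criteria_Value,
       'Clinical_Trials' AS Src_Object_Name, GETDATE() AS _LastModified_On
FROM [dbo].[Clinical_Trials]
UNION ALL
SELECT Trial_ID, 'Patient_Group', CAST(Patient_Group AS NVARCHAR(200)),
       'Clinical_Trials', GETDATE() FROM [dbo].[Clinical_Trials]
UNION ALL
SELECT Trial_ID, 'Clinical_Response', CAST(Clinical_Response AS NVARCHAR(200)),
       'Clinical_Trials', GETDATE() FROM [dbo].[Clinical_Trials]
UNION ALL
SELECT Trial_ID, 'Efficiency_Rate', CAST(Efficiency_Rate AS NVARCHAR(200)),
       'Clinical_Trials', GETDATE() FROM [dbo].[Clinical_Trials];
```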
-----------------------------------------------------------------
 Step 5: View the final consolidated Output
-----------------------------------------------------------------
SELECT * 
FROM [dbo].[Unified_Data_Repository]
ORDER BY Src_Object_Name, ID;
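
As a quick sanity check, each source table should contribute (number of rows × number of non-key columns) records. With 3 rows and 4 non-key columns per table in this sample, every source should yield 12 rows:

```sql
-- Row-count check: 3 rows x 4 non-key columns = 12 rows per source table
SELECT Src_Object_Name, COUNT(*) AS Row_Count
FROM [dbo].[Unified_Data_Repository]
GROUP BY Src_Object_Name;
```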

-----------------------------------------------------------------
Data Analysis and Insights:
-----------------------------------------------------------------
We can use the consolidated dataset to analyze the relationship between clinical outcomes and commercial performance. 
By pivoting attributes such as Drug_Code, Efficiency_Rate, and Sales_Amount, the dataset can be transformed into an analytical view that highlights how scientific effectiveness connects with market adoption.
The example below demonstrates how a focused query can turn the normalized repository into actionable insights:

WITH Drug_Data AS (
    SELECT Src_Object_Name, Criteria_Name, Criteria_Value, ID
    FROM Unified_Data_Repository
    WHERE Criteria_Name IN ('Drug_Code','Efficiency_Rate','Sales_Amount')
)
SELECT t1.Criteria_Value AS Drug_Code,
       MAX(CASE WHEN t2.Criteria_Name = 'Efficiency_Rate' THEN t2.Criteria_Value END) AS Efficiency_Rate,
       MAX(CASE WHEN t2.Criteria_Name = 'Sales_Amount' THEN t2.Criteria_Value END) AS Sales_Amount
FROM Drug_Data t1
INNER JOIN Drug_Data t2 ON t1.ID = t2.ID
WHERE t1.Criteria_Name = 'Drug_Code'
GROUP BY t1.Criteria_Value;
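
Since Criteria_Value is stored as text, numeric attributes sort and compare as strings. If you need proper numeric ordering or arithmetic, you can cast the pivoted values back to numbers; a sketch using TRY_CAST (available in SQL Server 2012 and later):

```sql
-- Same pivot, but returning Efficiency_Rate and Sales_Amount as numbers
WITH Drug_Data AS (
    SELECT Src_Object_Name, Criteria_Name, Criteria_Value, ID
    FROM Unified_Data_Repository
    WHERE Criteria_Name IN ('Drug_Code','Efficiency_Rate','Sales_Amount')
)
SELECT t1.Criteria_Value AS Drug_Code,
       MAX(CASE WHEN t2.Criteria_Name = 'Efficiency_Rate'
                THEN TRY_CAST(t2.Criteria_Value AS DECIMAL(10,2)) END) AS Efficiency_Rate,
       MAX(CASE WHEN t2.Criteria_Name = 'Sales_Amount'
                THEN TRY_CAST(t2.Criteria_Value AS DECIMAL(10,2)) END) AS Sales_Amount
FROM Drug_Data t1
INNER JOIN Drug_Data t2 ON t1.ID = t2.ID
WHERE t1.Criteria_Name = 'Drug_Code'
GROUP BY t1.Criteria_Value
ORDER BY Efficiency_Rate DESC;
```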


Insights Derived from Analysis:
By using the consolidated dataset and pivoting attributes like Drug_Code, Efficiency_Rate, and Sales_Amount, we can understand how clinical effectiveness relates to sales performance. From this analysis, we can derive the following key insights.

1) Clinical vs. Commercial Performance:
  • Drug_A01 shows the highest efficiency rate (87.5%) and the strongest sales (15,000). This indicates strong clinical success translating into strong market adoption.
  • Drug_C03 has the lowest efficiency (42.3%) and the weakest sales (9,800), suggesting limited clinical effectiveness is reflected in lower demand.
2) Correlation Between Metrics:
  • Drug_A01 shows a positive correlation: higher efficiency rates tend to align with higher sales amounts.
  • Drug_B02, with moderate efficiency (65%) but relatively strong sales (12,000.5), shows that commercial performance can sometimes outpace clinical effectiveness, possibly due to market factors like pricing, distribution, or brand strength.
3) Strategic Decision Support:
  • Drug_B02 may need further clinical improvement to match its commercial success, ensuring long‑term sustainability and patient satisfaction.
  • Drug_C03 could require both clinical enhancements and stronger commercial strategies to improve its overall performance.
Note:
This article shows how Dynamic SQL with Cursors can be used to consolidate data. The examples are for learning purposes, and you can refine and adapt the logic to fit your own requirements.
--------------------------------------------------------------------------------------------------------
Thanks, TAMATAM ; Business Intelligence & Analytics Professional
--------------------------------------------------------------------------------------------------------

Saturday, March 7, 2026

How to Apply Permutations and Combinations Logic in SQL to Analyze Transaction Data

How to Use SQL Window Functions to Analyze Transactional Data with Permutations and Combinations
Scenario:
Suppose you are working with a financial system that processes millions of transactions every day. To detect anomalies, understand customer behavior, or ensure compliance, analysts often need to examine both the relationships between transactions (combinations) and the order in which transactions occur (permutations).

Statistical Concepts:
In probability theory, permutations and combinations are methods of counting the possible outcomes.
N = Total number of elements in the sample space (here, number of transactions).
r = Number of elements selected at a time (e.g., pairs of transactions).

Combinations: Choosing r elements from N, where order does not matter.

C(N, r) = N! / (r! * (N - r)!)

Example: Choosing r = 2 transactions from N = 5 results in C(5, 2) = 10 possible pairs. In combinations, (A, B) is considered the same as (B, A), so duplicates are ignored.

Permutations: Choosing r elements from N, where order matters.

P(N, r) = N! / (N - r)!

Example: Choosing r=2 transactions from N=5 results in P(5,2)=20 possible sequences. In permutations, (A,B) is different from (B,A), so both orders are considered.
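
The two formulas can be verified with plain arithmetic for N = 5, r = 2 (a quick check, not part of the scenario data):

```sql
-- 5! = 120, 2! = 2, (5-2)! = 3! = 6
SELECT 120 / (2 * 6) AS Combinations_C_5_2,  -- 10 unordered pairs
       120 / 6       AS Permutations_P_5_2;  -- 20 ordered sequences
```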

In this article, we will use a simple scenario with sample data to show how permutations, combinations, and running balances can be applied in SQL to uncover useful insights.

Sample Table: tbl_Sample_Transactions

CREATE TABLE tbl_Sample_Transactions (
    TransactionID INT,
    AccountID VARCHAR(10),
    TransactionDate DATE,
    Amount DECIMAL(10,2),
    Type VARCHAR(20)
);

INSERT INTO tbl_Sample_Transactions VALUES
(1, 'A001', '2024-01-01', 500, 'Deposit'),
(2, 'A001', '2024-01-02', -200, 'Withdrawal'),
(3, 'A001', '2024-01-03', -100, 'Transfer'),
(4, 'A002', '2024-01-01', 1000, 'Deposit'),
(5, 'A002', '2024-01-04', -300, 'Withdrawal'),
(6, 'A002', '2024-01-05', -200, 'Transfer'),
(7, 'A003', '2024-01-02', 700, 'Deposit'),
(8, 'A003', '2024-01-03', -400, 'Withdrawal'),
(9, 'A003', '2024-01-06', -100, 'Transfer'),
(10,'A003', '2024-01-07', 300, 'Deposit');

SELECT * FROM tbl_Sample_Transactions


1) Combinations: Transaction Pairs (Order Doesn’t Matter)
Combinations are useful when we want to study pairs of transactions that occur together, regardless of order. For financial analysis, this helps detect linked activities such as deposit–withdrawal pairs or withdrawal–transfer pairs.

Here, this query considers combinations of r = 2 transactions from N. We avoid duplicates by keeping only one ordering (Txn1, Txn2) and discarding the reversed pair (Txn2, Txn1), which is enforced by the join condition t_left.rn < t_right.rn.

-- Combinations (N, r=2)
SELECT 
    t_left.AccountID,
    t_left.TransactionID AS Txn1,
    t_left.Type AS Txn1Type,
    t_right.TransactionID AS Txn2,
    t_right.Type AS Txn2Type
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY AccountID ORDER BY TransactionID) AS rn
    FROM tbl_Sample_Transactions
) t_left
INNER JOIN (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY AccountID ORDER BY TransactionID) AS rn
    FROM tbl_Sample_Transactions
) t_right
ON t_left.AccountID = t_right.AccountID AND t_left.rn < t_right.rn
WHERE t_left.AccountID = 'A003'   -- Filter the Account A003 for Analysis
ORDER BY t_left.AccountID, Txn1, Txn2;

-- We can also re-write the above query using CTE as per below:

WITH Ranked_Transactions AS (
    SELECT 
        AccountID,
        TransactionID,
        Type,
        ROW_NUMBER() OVER (PARTITION BY AccountID ORDER BY TransactionID) AS rn
    FROM tbl_Sample_Transactions
    WHERE AccountID = 'A003'   -- Filter the Account A003 for Analysis
    )
SELECT 
    t1.AccountID,
    t1.TransactionID AS Txn1,
    t1.Type AS Txn1Type,
    t2.TransactionID AS Txn2,
    t2.Type AS Txn2Type
FROM Ranked_Transactions t1
INNER JOIN Ranked_Transactions t2
  ON t1.AccountID = t2.AccountID  -- keep pairs within the same account, even if the filter is removed
 AND t1.rn < t2.rn
ORDER BY t1.AccountID, Txn1, Txn2;
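
Account A003 has N = 4 transactions, so by the combination formula the query should return C(4, 2) = 4! / (2! * 2!) = 6 pairs. A quick count confirms the expected cardinality:

```sql
-- Expected: C(4,2) = 6 pairs for Account A003
WITH Ranked_Transactions AS (
    SELECT AccountID, TransactionID,
           ROW_NUMBER() OVER (PARTITION BY AccountID ORDER BY TransactionID) AS rn
    FROM tbl_Sample_Transactions
    WHERE AccountID = 'A003'
)
SELECT COUNT(*) AS Pair_Count   -- returns 6
FROM Ranked_Transactions t1
INNER JOIN Ranked_Transactions t2
  ON t1.AccountID = t2.AccountID AND t1.rn < t2.rn;
```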

Result:

Financial Insights from Combinations:

  • A Deposit paired with a Withdrawal (Txn7 and Txn8) highlights immediate cash-out after funding. Fraud detection teams often flag this as potential cash recycling, where funds are deposited and withdrawn quickly to disguise their origin.
  • A Withdrawal paired with a Transfer (Txn8 and Txn9) suggests funds are being moved across accounts after withdrawal. This is a classic layering technique in money laundering, where money is shuffled to obscure its trail.
  • A Deposit paired with another Deposit (Txn7 and Txn10) indicates repeated funding activity. Analysts use this to study liquidity behavior, customer saving patterns, and potential inflows from multiple sources.
2) Permutations: Transaction Sequences (Order Matters)
Permutations are useful when we want to study the order of transactions. In financial terms, this helps us analyze transaction flows — the exact sequence in which deposits, withdrawals, and transfers occur.

Here, this query considers permutations of r = 2 transactions from N, taking both directions (Txn1 → Txn2 and Txn2 → Txn1) via the join condition t_left.rn <> t_right.rn.

-- Permutations (N, r=2)
SELECT 
    t_left.AccountID,
    t_left.TransactionID AS Txn1,
    t_left.Type AS Txn1Type,
    t_right.TransactionID AS Txn2,
    t_right.Type AS Txn2Type
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY AccountID ORDER BY TransactionID) AS rn
    FROM tbl_Sample_Transactions
) t_left
INNER JOIN (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY AccountID ORDER BY TransactionID) AS rn
    FROM tbl_Sample_Transactions
) t_right
ON t_left.AccountID = t_right.AccountID AND t_left.rn <> t_right.rn
WHERE t_left.AccountID = 'A003'   -- Filter the Account A003 for Analysis
ORDER BY t_left.AccountID, Txn1, Txn2;

-- We can also re-write the above query using CTE as per below:

WITH Ranked_Transactions AS (
    SELECT 
        AccountID,
        TransactionID,
        Type,
        ROW_NUMBER() OVER (PARTITION BY AccountID ORDER BY TransactionID) AS rn
    FROM tbl_Sample_Transactions
    WHERE AccountID = 'A003'   -- Filter the Account A003 for Analysis
    )
SELECT 
    t1.AccountID,
    t1.TransactionID AS Txn1,
    t1.Type AS Txn1Type,
    t2.TransactionID AS Txn2,
    t2.Type AS Txn2Type
FROM Ranked_Transactions t1
INNER JOIN Ranked_Transactions t2
  ON t1.AccountID = t2.AccountID
 AND t1.rn <> t2.rn
ORDER BY t1.AccountID, t1.TransactionID, t2.TransactionID;
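
For the same account, the permutation query should return P(4, 2) = 4! / 2! = 12 ordered sequences, exactly twice the number of combinations, since each pair is counted in both directions:

```sql
-- Expected: P(4,2) = 12 ordered sequences for Account A003
WITH Ranked_Transactions AS (
    SELECT AccountID,
           ROW_NUMBER() OVER (PARTITION BY AccountID ORDER BY TransactionID) AS rn
    FROM tbl_Sample_Transactions
    WHERE AccountID = 'A003'
)
SELECT COUNT(*) AS Sequence_Count   -- returns 12
FROM Ranked_Transactions t1
INNER JOIN Ranked_Transactions t2
  ON t1.AccountID = t2.AccountID AND t1.rn <> t2.rn;
```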

Result:

Financial Insights from Permutations:

  • A Deposit followed by a Withdrawal (Txn7 → Txn8) shows immediate cash-out after deposit. This sequence is a strong indicator of suspicious activity, often linked to fraud or money laundering.
  • A Withdrawal followed by a later Deposit (Txn8 → Txn10) reflects the account being replenished after a drawdown. This sequence provides insights into customer liquidity management and spending behavior.
  • A Transfer followed by a Deposit (Txn9 → Txn10) highlights funds being moved and then topped up. Analysts interpret this as either legitimate fund management or potential layering in money laundering schemes.
3) Running Balance Analysis:
Running balances are a way to track how account funds evolve over time. This is essentially a permutation analysis extended across the entire sequence of transactions.
Instead of viewing each transaction separately, we calculate the cumulative impact of all the transactions in sequence. 
This approach provides a continuous picture of how deposits, withdrawals, and transfers affect the account balance at every step in time.

-- Running Balance calculation for all Accounts
SELECT 
    AccountID,
    TransactionID,
    TransactionDate,
    Amount,
    SUM(Amount) OVER ( PARTITION BY AccountID ORDER BY TransactionDate
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS Running_Balance
FROM tbl_Sample_Transactions
ORDER BY AccountID, TransactionDate;

Result:

Financial Insights from Running Balance:
  • Account A001 starts with a deposit of 500, drops to 300 after a withdrawal, and further reduces to 200 after a transfer.
  • Running balances make it easy to spot accounts that are trending downward and may risk overdraft.
  • Compliance teams can monitor whether balances stay above required minimums.
  • Customer behavior can be analyzed more easily: steady deposits indicate saving habits, while frequent withdrawals or transfers highlight spending or fund movement patterns.
  • Risk managers can use these balances to forecast liquidity needs and detect unusual depletion trends across accounts.
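
The overdraft-risk point above can be turned into a direct query: wrap the running balance in a CTE and flag the rows where the balance falls below a chosen minimum (the threshold of 100 here is an assumed example value, not from the source data):

```sql
-- Flag transactions where the running balance dips below a minimum threshold
WITH Balances AS (
    SELECT AccountID, TransactionID, TransactionDate, Amount,
           SUM(Amount) OVER (PARTITION BY AccountID ORDER BY TransactionDate
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Running_Balance
    FROM tbl_Sample_Transactions
)
SELECT AccountID, TransactionDate, Running_Balance
FROM Balances
WHERE Running_Balance < 100   -- assumed minimum-balance threshold
ORDER BY AccountID, TransactionDate;
```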
Note / Disclaimer:
Examples in this article are for illustration only. Real-world financial systems involve added complexities such as multi-currency, regulatory rules, and fraud detection. Always validate queries and logic against your organization’s compliance and data governance requirements.

--------------------------------------------------------------------------------------------------------
Thanks, TAMATAM ; Business Intelligence & Analytics Professional
--------------------------------------------------------------------------------------------------------

How to use Power Query to Handle Multi‑Row Headers Dynamically from Excel to Power BI

How to use Power Query to Handle Composite Headers Dynamically from Excel to Power BI
Scenario:
Suppose we have a sample of FTE data in an Excel sheet where the actual headers begin at row number 4. However, the first three rows may contain additional text or values that provide context. For example, in a weekly allocation report, rows 1–3 might contain date ranges or descriptive labels above the actual header row, as shown below:


Considerations in Output:

When building the final table, we need to treat the first five columns differently from the last five:
  • Columns 1 to 5 (A to E): These are structural fields such as UserID, UserName, Project_ID, FTE_Alloc_Status, and FTE_Percent. For these, we must ignore any text or values in rows 1 to 3 and simply use the header from row 4.
  • Columns 6 to 10 (F to J): These represent weekly allocations (WK1 to WK5). For these, we must concatenate the text or values from rows 1 to 3 with the header in row 4 to form a composite header.
    • Example: WK1_30-12-2024_03-01-2025.
This approach ensures that the weekly columns carry both the identifier (WK1, WK2, etc.) and the contextual date ranges, making the dataset more meaningful and self-explanatory.

We can achieve this using the Power Query script below:

let
    /* LoadWorkbookAndSheet: 
       Load the Excel workbook from the given source path and access the target sheet "FTE_Sample". 
    */
    Source = Excel.Workbook(File.Contents(SrcPath), null, true),
    SrcData_Sheet = Source{[Item="FTE_Sample",Kind="Sheet"]}[Data],

    /* FindHeaderRowIndex: 
       Dynamically detect the row index where actual headers begin. 
       This is identified by locating "UserID" in the first column.
    */
    HeaderRowIndex = List.PositionOf(SrcData_Sheet[Column1], "UserID"),

    /* ExtractHeaderAndPreHeaderRows: 
       Capture all rows up to the header row. 
       Separate the HeaderRow (row containing actual column names) 
       and PreHeaderRows (rows above the header that may contain contextual values). 
    */
    AllRowsToHeader = List.FirstN(Table.ToRows(SrcData_Sheet), HeaderRowIndex+1),
    HeaderRow = AllRowsToHeader{HeaderRowIndex},
    PreHeaderRows = List.RemoveLastN(AllRowsToHeader, 1),

    /* BuildCompositeHeaders: 
       Construct raw composite headers by combining HeaderRow with PreHeaderRows. 
       Non-null values are concatenated using an underscore (_) to form meaningful column names. 
    */
    RawCompositeHeaders = List.Transform(
        List.Zip({HeaderRow} & PreHeaderRows),
        each Text.Combine(
            List.Select(List.Transform(_, Text.From), (x) => x <> null and x <> ""),
            "_"
        )
    ),

    /* ApplySkipLogic: 
       Define the structural columns (A to E) that should not use composite headers. 
       For these columns, retain only the HeaderRow value. 
    */
    SkipColumns = {"UserID", "UserName", "Project_ID", "FTE_Alloc_Status", "FTE_Percent"},
    CompositeHeaders = List.Transform(
        List.Zip({HeaderRow, RawCompositeHeaders}),
        each if List.Contains(SkipColumns, _{0}) then _{0} else _{1}
    ),

    /* PromoteHeadersAndRename: 
       Promote the detected HeaderRow as column names. 
       Then replace them with CompositeHeaders to ensure contextual information is included 
       for selected columns while keeping structural columns clean. 
    */
    PromoteHeaders = Table.PromoteHeaders(
        Table.Skip(SrcData_Sheet, HeaderRowIndex), 
        [PromoteAllScalars=true]
    ),
    FinalTable = Table.RenameColumns(
        PromoteHeaders, 
        List.Zip({Table.ColumnNames(PromoteHeaders), CompositeHeaders})
    )
in
    FinalTable

Output:


--------------------------------------------------------------------------------------------------------
Thanks, TAMATAM ; Business Intelligence & Analytics Professional
--------------------------------------------------------------------------------------------------------
