Saturday, March 7, 2026

How to Apply Permutations and Combinations Logic in SQL to Analyze Transaction Data

How to Use SQL Window Functions to Analyze Transactional Data with Permutations and Combinations
Scenario:
Suppose you are working with a financial system that processes millions of transactions every day. To detect anomalies, understand customer behavior, or ensure compliance, analysts often need to examine both the relationships between transactions (combinations) and the order in which transactions occur (permutations).

Statistical Concepts:
In probability theory, permutations and combinations are methods of counting the possible outcomes.
N = Total number of elements in the sample space (here, number of transactions).
r = Number of elements selected at a time (e.g., pairs of transactions).

Combinations: Choosing r elements from N, where order does not matter.

C(N, r) = N! / (r! * (N - r)!)

Example: Choosing r=2 transactions from N=5 results in C(5,2) = 10 possible pairs. In combinations, (A,B) is considered the same as (B,A), so duplicates are ignored.

Permutations: Choosing r elements from N, where order matters.

P(N, r) = N! / (N - r)!

Example: Choosing r=2 transactions from N=5 results in P(5,2)=20 possible sequences. In permutations, (A,B) is different from (B,A), so both orders are considered.
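As a quick sanity check on these formulas, the counts can be reproduced with Python's standard library (an illustrative sketch; the analysis in this article itself is done in SQL):

```python
# Cross-check C(5,2) = 10 and P(5,2) = 20 against explicit enumeration
from math import comb, perm
from itertools import combinations, permutations

transactions = ["A", "B", "C", "D", "E"]  # N = 5 sample transactions

pairs = list(combinations(transactions, 2))      # order does not matter
sequences = list(permutations(transactions, 2))  # order matters

print(len(pairs), comb(5, 2))      # 10 10
print(len(sequences), perm(5, 2))  # 20 20
```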

In this article, we will use a simple scenario with sample data to show how permutations, combinations, and running balances can be applied in SQL to uncover useful insights.

Sample Table: tbl_Sample_Transactions

CREATE TABLE tbl_Sample_Transactions (
    TransactionID INT,
    AccountID VARCHAR(10),
    TransactionDate DATE,
    Amount DECIMAL(10,2),
    Type VARCHAR(20)
);

INSERT INTO tbl_Sample_Transactions VALUES
(1, 'A001', '2024-01-01', 500, 'Deposit'),
(2, 'A001', '2024-01-02', -200, 'Withdrawal'),
(3, 'A001', '2024-01-03', -100, 'Transfer'),
(4, 'A002', '2024-01-01', 1000, 'Deposit'),
(5, 'A002', '2024-01-04', -300, 'Withdrawal'),
(6, 'A002', '2024-01-05', -200, 'Transfer'),
(7, 'A003', '2024-01-02', 700, 'Deposit'),
(8, 'A003', '2024-01-03', -400, 'Withdrawal'),
(9, 'A003', '2024-01-06', -100, 'Transfer'),
(10,'A003', '2024-01-07', 300, 'Deposit');

SELECT * FROM tbl_Sample_Transactions;


1) Combinations: Transaction Pairs (Order Doesn’t Matter)
Combinations are useful when we want to study pairs of transactions that occur together, regardless of order. For financial analysis, this helps detect linked activities such as deposit–withdrawal pairs or withdrawal–transfer pairs.

In this query, we generate combinations of r=2 transactions from N. The join condition t_left.rn < t_right.rn keeps only one ordering of each pair (Txn1, Txn2), so the flipped pair (Txn2, Txn1) never appears as a duplicate.

-- Combinations (N, r=2)
SELECT 
    t_left.AccountID,
    t_left.TransactionID AS Txn1,
    t_left.Type AS Txn1Type,
    t_right.TransactionID AS Txn2,
    t_right.Type AS Txn2Type
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY AccountID ORDER BY TransactionID) AS rn
    FROM tbl_Sample_Transactions
) t_left
INNER JOIN (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY AccountID ORDER BY TransactionID) AS rn
    FROM tbl_Sample_Transactions
) t_right
ON t_left.AccountID = t_right.AccountID AND t_left.rn < t_right.rn
WHERE t_left.AccountID = 'A003'   -- Filter the Account A003 for Analysis
ORDER BY t_left.AccountID, Txn1, Txn2;

Result:

Financial Insights from Combinations:

  • A Deposit paired with a Withdrawal (Txn7 and Txn8) highlights immediate cash-out after funding. Fraud detection teams often flag this as potential cash recycling, where funds are deposited and withdrawn quickly to disguise their origin.
  • A Withdrawal paired with a Transfer (Txn8 and Txn9) suggests funds are being moved across accounts after withdrawal. This is a classic layering technique in money laundering, where money is shuffled to obscure its trail.
  • A Deposit paired with another Deposit (Txn7 and Txn10) indicates repeated funding activity. Analysts use this to study liquidity behavior, customer saving patterns, and potential inflows from multiple sources.
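The pair logic of the query above can be cross-checked outside SQL. This illustrative Python sketch enumerates the same unordered pairs for account A003 (four transactions, so C(4,2) = 6 pairs), mirroring the t_left.rn < t_right.rn join condition:

```python
from itertools import combinations

# Account A003's transactions from tbl_Sample_Transactions
a003 = [(7, "Deposit"), (8, "Withdrawal"), (9, "Transfer"), (10, "Deposit")]

# Each unordered pair appears exactly once, like the rn < rn join
for (id1, type1), (id2, type2) in combinations(a003, 2):
    print(f"Txn{id1} ({type1}) + Txn{id2} ({type2})")

print(len(list(combinations(a003, 2))))  # C(4, 2) = 6
```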

2) Permutations: Transaction Sequences (Order Matters)
Permutations are useful when we want to study the order of transactions. In financial terms, this helps us analyze transaction flows — the exact sequence in which deposits, withdrawals, and transfers occur.

In this query, we generate permutations of r=2 transactions from N. The join condition t_left.rn <> t_right.rn keeps both directions, so (Txn1 → Txn2) and (Txn2 → Txn1) each appear.

-- Permutations (N, r=2)
SELECT 
    t_left.AccountID,
    t_left.TransactionID AS Txn1,
    t_left.Type AS Txn1Type,
    t_right.TransactionID AS Txn2,
    t_right.Type AS Txn2Type
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY AccountID ORDER BY TransactionID) AS rn
    FROM tbl_Sample_Transactions
) t_left
INNER JOIN (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY AccountID ORDER BY TransactionID) AS rn
    FROM tbl_Sample_Transactions
) t_right
ON t_left.AccountID = t_right.AccountID AND t_left.rn <> t_right.rn
WHERE t_left.AccountID = 'A003'   -- Filter the Account A003 for Analysis
ORDER BY t_left.AccountID, Txn1, Txn2;

Result:

Financial Insights from Permutations:

  • A Deposit followed by a Withdrawal (Txn7 → Txn8) shows immediate cash-out after deposit. This sequence is a strong indicator of suspicious activity, often linked to fraud or money laundering.
  • A Withdrawal followed by a Deposit (Txn8 → Txn7) reflects replenishment after overdraft. This sequence provides insights into customer liquidity management and spending behavior.
  • A Transfer followed by a Deposit (Txn9 → Txn10) highlights funds being moved and then topped up. Analysts interpret this as either legitimate fund management or potential layering in money laundering schemes.
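Again as an illustrative Python cross-check: enumerating ordered pairs for account A003's four transactions yields P(4,2) = 12 sequences, mirroring the t_left.rn <> t_right.rn join condition:

```python
from itertools import permutations

# Account A003's transactions from tbl_Sample_Transactions
a003 = [(7, "Deposit"), (8, "Withdrawal"), (9, "Transfer"), (10, "Deposit")]

# Both (Txn1, Txn2) and (Txn2, Txn1) are kept, like the rn <> rn join
sequences = list(permutations(a003, 2))
print(len(sequences))  # P(4, 2) = 12
```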

3) Running Balance Analysis:
Running balances are a way to track how account funds evolve over time. This is essentially a permutation analysis extended across the entire sequence of transactions.
Instead of viewing each transaction separately, we calculate the cumulative impact of all the transactions in sequence. 
This approach provides a continuous picture of how deposits, withdrawals, and transfers affect the account balance at every step in time.

-- Running Balance calculation for all Accounts
SELECT 
    AccountID,
    TransactionID,
    TransactionDate,
    Amount,
    SUM(Amount) OVER ( PARTITION BY AccountID ORDER BY TransactionDate
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS Running_Balance
FROM tbl_Sample_Transactions
ORDER BY AccountID, TransactionDate;

Result:


Financial Insights from Running Balance:
  • Account A001 starts with a deposit of 500, falls to 300 after a withdrawal, and drops further to 200 after a transfer.
  • Running balances make it easy to spot accounts that are trending downward and may risk overdraft.
  • Compliance teams can monitor whether balances stay above required minimums.
  • Customer behavior can be analyzed more easily: steady deposits indicate saving habits, while frequent withdrawals or transfers highlight spending or fund movement patterns.
  • Risk managers can use these balances to forecast liquidity needs and detect unusual depletion trends across accounts.
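The windowed SUM above is simply a cumulative sum per account. As an illustrative Python sketch, the same arithmetic for account A001 can be reproduced with itertools.accumulate:

```python
from itertools import accumulate

# Account A001's amounts in TransactionDate order
a001_amounts = [500, -200, -100]

running_balance = list(accumulate(a001_amounts))
print(running_balance)  # [500, 300, 200]
```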

Note / Disclaimer:
Examples in this article are for illustration only. Real-world financial systems involve added complexities such as multi-currency, regulatory rules, and fraud detection. Always validate queries and logic against your organization’s compliance and data governance requirements.

--------------------------------------------------------------------------------------------------------
Thanks, TAMATAM; Business Intelligence & Analytics Professional
--------------------------------------------------------------------------------------------------------

How to use Power Query to Handle Multi‑Row Headers Dynamically from Excel to Power BI

How to use Power Query to Handle Composite Headers Dynamically from Excel to Power BI
Scenario:
Suppose we have a sample of FTE data in an Excel sheet where the actual headers begin at row number 4. However, the first three rows may contain additional text or values that provide context. For example, in a weekly allocation report, rows 1–3 might contain date ranges or descriptive labels above the actual header row, as shown below:


Considerations in Output:

When building the final table, we need to treat the first five columns differently from the last five:
  • Columns 1 to 5 (A to E): These are structural fields such as UserID, UserName, Project_ID, FTE_Alloc_Status, and FTE_Percent. For these, we must ignore any text or values in rows 1 to 3 and simply use the header from row 4.
  • Columns 6 to 10 (F to J): These represent weekly allocations (WK1 to WK5). For these, we must concatenate the text or values from rows 1 to 3 with the header in row 4 to form a composite header.
    • Example: WK1_30-12-2024_03-01-2025.
This approach ensures that the weekly columns carry both the identifier (WK1, WK2, etc.) and the contextual date ranges, making the dataset more meaningful and self-explanatory.
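The header rule above (keep the row-4 name as-is for structural columns, concatenate rows 1 to 4 for weekly columns) can be sketched in Python before looking at the M code. The row values here are made up for illustration:

```python
# Hypothetical pre-header rows (1-3) and header row (4): one structural
# column plus two weekly columns
header_row = ["UserID", "WK1", "WK2"]
pre_header_rows = [
    [None, "30-12-2024", "06-01-2025"],
    [None, "03-01-2025", "10-01-2025"],
]
skip_columns = {"UserID", "FTE_Percent"}  # structural columns keep row-4 names

composite = []
for col in zip(header_row, *pre_header_rows):
    name = col[0]
    if name in skip_columns:
        composite.append(name)  # ignore any values from rows 1-3
    else:
        # concatenate non-empty values with "_" to form the composite header
        composite.append("_".join(str(v) for v in col if v not in (None, "")))

print(composite)
# ['UserID', 'WK1_30-12-2024_03-01-2025', 'WK2_06-01-2025_10-01-2025']
```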

We can achieve this using the Power Query below:

let
    /* LoadWorkbookAndSheet: 
       Load the Excel workbook from the given source path and access the target sheet "FTE_Sample". 
    */
    Source = Excel.Workbook(File.Contents(SrcPath), null, true),
    SrcData_Sheet = Source{[Item="FTE_Sample",Kind="Sheet"]}[Data],

    /* FindHeaderRowIndex: 
       Dynamically detect the row index where actual headers begin. 
       This is identified by locating "UserID" in the first column.
    */
    HeaderRowIndex = List.PositionOf(SrcData_Sheet[Column1], "UserID"),

    /* ExtractHeaderAndPreHeaderRows: 
       Capture all rows up to the header row. 
       Separate the HeaderRow (row containing actual column names) 
       and PreHeaderRows (rows above the header that may contain contextual values). 
    */
    AllRowsToHeader = List.FirstN(Table.ToRows(SrcData_Sheet), HeaderRowIndex+1),
    HeaderRow = AllRowsToHeader{HeaderRowIndex},
    PreHeaderRows = List.RemoveLastN(AllRowsToHeader, 1),

    /* BuildCompositeHeaders: 
       Construct raw composite headers by combining HeaderRow with PreHeaderRows. 
       Non-null values are concatenated using an underscore (_) to form meaningful column names. 
    */
    RawCompositeHeaders = List.Transform(
        List.Zip({HeaderRow} & PreHeaderRows),
        each Text.Combine(
            List.Select(List.Transform(_, Text.From), (x) => x <> null and x <> ""),
            "_"
        )
    ),

    /* ApplySkipLogic: 
       Define columns that should not use composite headers. 
       For these columns (e.g., "UserID", "FTE_Percent"), retain only the HeaderRow value. 
    */
    SkipColumns = {"FTE_Percent", "UserID"},
    CompositeHeaders = List.Transform(
        List.Zip({HeaderRow, RawCompositeHeaders}),
        each if List.Contains(SkipColumns, _{0}) then _{0} else _{1}
    ),

    /* PromoteHeadersAndRename: 
       Promote the detected HeaderRow as column names. 
       Then replace them with CompositeHeaders to ensure contextual information is included 
       for selected columns while keeping structural columns clean. 
    */
    PromoteHeaders = Table.PromoteHeaders(
        Table.Skip(SrcData_Sheet, HeaderRowIndex), 
        [PromoteAllScalars=true]
    ),
    FinalTable = Table.RenameColumns(
        PromoteHeaders, 
        List.Zip({Table.ColumnNames(PromoteHeaders), CompositeHeaders})
    )
in
    FinalTable

Output:


--------------------------------------------------------------------------------------------------------
Thanks
--------------------------------------------------------------------------------------------------------

Tuesday, February 17, 2026

How to return the Last Refresh Date of SQL Tables in Power BI

How to use Power Query to return the Last Refresh Date of SQL Tables in Power BI
Scenario:
We have a set of tables/views sourced from Azure Synapse SQL, each containing a column named _modified_timestamp. Using this column, we need to generate a summary dataset in Power BI that returns the last refresh date for each table/view.

Src_Tables = {"dim_Country", "dim_Product", "dim_Segment"}

We can achieve this Scenario using the following Power Query:

let
    // Define Data Source Parameters
    _server = "MyServer-synapse-server.database.windows.net",
    _database = "db_MyAnalytics",
    _schema = "dbo",

    // Define Source Connection
    Source = Sql.Database(_server, _database),

    // List of tables for which we need to check last refresh Date
    Src_Tables = {"dim_Country", "dim_Product", "dim_Segment"},
 
    // Function to run native query for each table
    GetLastRefreshDate = (TableName as text) =>
        let
            Query = "SELECT MAX(_modified_timestamp) AS Last_Refresh_Date FROM " & _schema & "." & TableName,
            Query_Result = Value.NativeQuery(Source, Query),
            WithServer = Table.AddColumn(Query_Result, "Server_Name", each _server),
            WithDb = Table.AddColumn(WithServer, "Database_Name", each _database),
            WithSchema = Table.AddColumn(WithDb, "Schema_Name", each _schema),
            WithTable = Table.AddColumn(WithSchema, "Table_Name", each TableName),
            Result = Table.ReorderColumns(WithTable, {"Server_Name","Database_Name","Schema_Name","Table_Name","Last_Refresh_Date"})
        in
            Result,

    // Apply function to get the last refresh date for each table
    All_Results = List.Transform(Src_Tables, each GetLastRefreshDate(_)),

    // Combine each table result into one table
    Final_Result = Table.Combine(All_Results)
in
    Final_Result

Output:


Note: 
Use parameters to define source connection details such as Server, Database, and Schema. Avoid hard‑coding these values directly within the query definition.
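As an illustrative sketch of what the GetLastRefreshDate function assembles per table (Python here only to show the string construction; the real work is done by the M query above):

```python
schema = "dbo"
src_tables = ["dim_Country", "dim_Product", "dim_Segment"]

# One MAX(_modified_timestamp) query per table, as in GetLastRefreshDate
queries = [
    f"SELECT MAX(_modified_timestamp) AS Last_Refresh_Date FROM {schema}.{t}"
    for t in src_tables
]
print(queries[0])
# SELECT MAX(_modified_timestamp) AS Last_Refresh_Date FROM dbo.dim_Country
```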


--------------------------------------------------------------------------------------------------------
Thanks
--------------------------------------------------------------------------------------------------------
