Thursday, January 25, 2018

How to Perform Incremental Loads using Temp Tables in SSIS

SSIS Task to Perform Incremental Loads using Temp Tables
Incremental Load is process of comparing the Target data table against the Source Data based on a Id Column or Date Stamp or Time Stamp and updating or loading the data to Target table as per below conditions:

If there are any New records in Source data then we have to insert those records in Target data table.

For example, on a Daily/Weekly basis we have to insert the Region wise sales data.

If there are any updated values in Source data then we have to update those records Target data table.
For example, updating the Net Sales data which is adjusted.

If there any records exist in Target but not available in the latest Source data, then we have to Delete those records from Target data table.

Scenario:
Suppose we have a Target Table, in which we have to load the data from Source Table
USE [SalesDB]
GO

SELECT [Order_Id],[ProdName],[RegionName],[FiscalQtr],[NetSales]
FROM [dbo].[tbl_Source]
GO

The Source Table as follows from which we Load the data to Target Table
USE [AnalyticsDB]
GO

SELECT [Order_Id],[ProdName],[RegionName],[FiscalQtr],[NetSales]
FROM [dbo].[tbl_Target]
GO


Now if you observes the data in the Source against the Destination Table, the following
Incremental Load activities to be perform here :
  • First 6 records already exist in the Target but the ProductName,RegionName and NetSales for records 3 and 5 (Order_Id : 113,115) are modified in Source which only need to Update in Target .
  • The records 7,8 (Order_Id 119,120) in the Source are new which are not exist in the Target which need to Insert in Target .
  • The records 7,8 (Order_Id 117,118) are missing in the latest Source so that those records need to be delete from the Target.
Now we will design a SSIS Package to perform the above Incremental Load activities.

Phase-1: Create the Temp Tables initially before Building a Package:
First we need to Create the two Temp Staging Tables one Time in the Target Database, which helps us build the Package initially. Later the Package will take care of Creating those Tables during Run Time, in future.
A)The Temp Staging Table for Update the Records in Target Table
IF OBJECT_ID('[tempdb]..[##tempUpdateTable]', 'U') IS NOT NULL
 Begin
    Truncate Table [dbo].[##tempUpdateTable]
  --Print ('Temp Table already exists and it has been Truncated')
 End
Else
Begin
CREATE TABLE[dbo].[##tempUpdateTable](
[Order_Id] [Int] NOT NULL,
[ProdName] [nvarchar](255) NULL,
[RegionName] [nvarchar](255) NULL,
[FiscalQtr] [nvarchar](255) NULL,
[NetSales] [float] NULL
) ON [PRIMARY]
 -- Print ('A Temp Table has been Created')
End
GO

B)The Temp Staging Table for Delete the Records from Target Table
IF OBJECT_ID('[tempdb]..[##tempDeleteTable]', 'U') IS NOT NULL
 Begin
    Truncate Table [dbo].[##tempDeleteTable]
  --Print ('Temp Table already exists and it has been Truncated')
 End
Else
Begin
CREATE TABLE[dbo].[##tempDeleteTable](
[Order_Id] [Int] NOT NULL,
[ProdName] [nvarchar](255) NULL,
[RegionName] [nvarchar](255) NULL,
[FiscalQtr] [nvarchar](255) NULL,
[NetSales] [float] NULL
) ON [PRIMARY]
 -- Print ('A Temp Table has been Created')
End
GO

Phase-2: Create the SQL Server Connections to Source, Temp and Target Databases:


Notes:
Please set the Property RetainSameConnection=True for the TAMATAM-PC.tempdb under connection properties.
Also set the Property DelayValidation=True in Control Flow Properties and in the Data Flow Properties.

Phase-3: Create the Temp Staging Table for Update with Execute SQL Task:
We have to Create a Temp Table which we will use as Staging Table, in which we store the records which are Modified in the Source. From this Staging Table we update those records in the Target Table. 
We use the below Query , which will Truncates if already exists Otherwise it re-creates the Temp Table on every run.
IF OBJECT_ID('[tempdb]..[##tempUpdateTable]', 'U') IS NOT NULL
 Begin
    Truncate Table [dbo].[##tempUpdateTable]
  --Print ('Temp Table already exists and it has been Truncated')
 End
Else
Begin
CREATE TABLE[dbo].[##tempUpdateTable](
[Order_Id] [Int] NOT NULL,
[ProdName] [nvarchar](255) NULL,
[RegionName] [nvarchar](255) NULL,
[FiscalQtr] [nvarchar](255) NULL,
[NetSales] [float] NULL
) ON [PRIMARY]
 -- Print ('A Temp Table has been Created')
End
GO

Next Connect the Execute SQL Task to Data Flow and Design the Data Flow Task for Inserting New records to Target and the Modified records to Temp Staging Table from Source.

Phase-4: Designing the Data Flow Task for Inserting New records to Target and the Modified records to Temp Staging Table from Source:
A) Connect to the Source and Target Tables

Source DB :
Target DB :

B) Next Sore the Source and Target Tables data with common Key Column (Order_Id)
Sort Source by Order_Id :
Sort Target by Order_Id :

C) Next apply the Merge Left Outer Join between Source and Target Tables
Merge Join:

D) Next apply the Conditional Split Transformation to Insert the New Records to Target and Modified Records to Temp Staging Table.

Conditional Split:
We have to split the data based on following conditions.
Insert_New = ISNULL(Order_Id_Target)

This Condition will returns the Null when the an Order_Id from Source is not exists in Target. So that those New records from Source need to insert to Target.

(ISNULL(ProdName_Target) ? "Default_Val" : ProdName_Target) != (ISNULL(ProdName_Source) ? "Default_Val" : ProdName_Source) || (ISNULL(RegionName_Target) ? "Default_Val" : RegionName_Target) != (ISNULL(RegionName_Source) ? "Default_Val" : RegionName_Source) || (ISNULL(FiscalQtr_Target) ? "Default_Val" : FiscalQtr_Target) != (ISNULL(FiscalQtr_Source) ? "Default_Val" : FiscalQtr_Source) || 
NetSales_Target != NetSales_Source

This Condition will checks whether is any Modification to data in Source by comparing it against the Target. These records need to Update in Target. Here we will first re-direct those rows to a Temp Staging Table.
Also, that we are handing NULLs of the Columns in runtime with some default value, so that we can compare the Source against the Target. 


E) Next Connect to the Destinations to Insert the New Records to Target and Modified Records to Temp Staging Table.
Target Table : 

Temp Stage Table :


Now the Data Flow is Ready to Load new records to Target and Modified records to Temp Update Stage Table from Source.

Phase-5: Updating Modified records from Temp Staging Table to Target:
Next Connect the Data Flow Task to Execute SQL Task to Update Modified records from Temp Staging Table to Target Target Table.
Use the below Statement to update records from Temp Staging Table to Target Table
Update Tbl_Target 
      Set [Order_Id]=U.[Order_Id],
            [ProdName]=U.[ProdName],
            [RegionName]=U.[RegionName],
            [FiscalQtr]=U.[FiscalQtr],
            [NetSales]=U.[NetSales]
From Tbl_Target  T Inner Join [##tempUpdateTable] U
On T.[Order_Id]=U.[Order_Id]
Now all the Tasks are ready to Perform the Incremental Load activities to Insert New and Update Modified from Source to Target Table.
Now Keep all the Tasks in one Sequence Container and run the Package to See the results in the Target Table.
Output in Target Table ( with Inserted New , Updated Modified from Source ) :
USE [AnalyticsDB]
GO
Select * From [Tbl_Target]

Modified records ( Order_Ids : 113,115,116 )
Inserted records ( Order_Ids : 119,120 )

-----------------------------------------------------------------------------------------------------------
Now to Delete the Records from the above Target ( Order_Ids : 117,118) which are missing in latest Source Table.
We can do that by designing the flow, in continuation to the above flow, as follows.

Phase-6: Creating a Temp Staging Table to Store the records from Source which have to be delete from Target:
We use the below Query , which will Truncates if already exists Otherwise it re-creates the Temp Table on every run.
IF OBJECT_ID('[tempdb]..[##tempDeleteTable]', 'U') IS NOT NULL
 Begin
    Truncate Table [dbo].[##tempDeleteTable]
  --Print ('Temp Table already exists and it has been Truncated')
 End
Else
Begin
CREATE TABLE[dbo].[##tempDeleteTable](
 [Order_Id] [Int] NOT NULL,
 [ProdName] [nvarchar](255) NULL,
 [RegionName] [nvarchar](255) NULL,
 [FiscalQtr] [nvarchar](255) NULL,
 [NetSales] [float] NULL
) ON [PRIMARY]
 -- Print ('A Temp Table has been Created')
End
GO
Next Connect this Execute SQL Task to the Data Flow and Design the Flow 

Phase-7: Design the Data Flow to Store the records from Source to a Temp Staging Table to which have to be delete from Target:
A) Connect to the Target Table



B) Next Connect Target Table to Lookup Transformation to Lookup against Source Table







C) Next Connect Lookup Transformation to the Temp Staging Table
Now Connect the Lookup No Match Output to Temp Stage Table, to Load the Records which are not exist in the Source but exists in the Target to the Temp Staging Table.







Now the Data Flow Task is ready to Load the records to Temp Delete Stage Table.

Phase-8: Connect the Data Flow to Execute SQL Task to Delete from Target:
Next Connect the Data Flow Task to Execute SQL Task to Delete the records from Target Table by Joining with Temp Delete Staging Table.
Use the below Statement to Delete records from Target Table by Joining with Temp Table
Delete T From [tbl_Target] T
 Inner Join [##tempDeleteTable] D
On T.Order_Id=D.Order_Id
Now all the Tasks are ready to Perform the Incremental Load activities to Delete the Records Target Table, which are not exist in the latest Source Table.


Finally connect the two Sequence Containers (one for Insert New, Update Modified ; another for Delete) and  and run the Package to See the results in the Target Table.


Final Output in Target Table ( with Inserted New , Updated Modified and Deleted non Match with Source) :
USE [AnalyticsDB]
GO
Select * From [Tbl_Target]
Modified records ( Order_Ids : 113,115,116 )
Inserted records ( Order_Ids : 119,120 )
Deleted records ( Order_Ids : 117,118 )



--------------------------------------------------------------------------------------------------------
Thanks, TAMATAM ; Business Intelligence & Analytics Professional
--------------------------------------------------------------------------------------------------------

No comments:

Post a Comment

Hi User, Thank You for visiting My Blog. Please post your genuine Feedback or comments only related to this Blog Posts. Please do not post any Spam comments or Advertising kind of comments which will be Ignored.

Featured Post from this Blog

How to compare Current Snapshot Data with Previous Snapshot in Power BI

How to Dynamically compare two Snapshots Data in Power BI Scenario: Suppose, we have a sample Sales data, which is stored with Monthly Snaps...

Popular Posts from this Blog