Thursday, January 25, 2018

How to Perform Incremental Loads to Sync Source Data with Destination using MERGE Command in SSIS

SSIS Task to Perform Incremental Loads to Sync Source Data Table with Target Table using MERGE Command in SSIS
Incremental Load is process of comparing the Target data table against the Source data based on a Id Column or Date Stamp or Time Stamp and updating or loading the data to Target table as per below conditions:If there are any New records in Source data then we have to insert those records in Target data table. For example, on a Daily/Weekly basis we have to insert the Region wise sales data.
For example, updating the Net Sales data which is adjusted.
Incremental Load activities to be perform here :
  • First 6 records already exist in the Target but the ProductName,RegionName and NetSales for records 3 and 5 (Order_Id : 113,115) are modified in Source which only need to Update in Target .
  • The records 7,8 (Order_Id 119,120) in the Source are new which are not exist in the Target which need to Insert in Target .
  • The records 7,8 (Order_Id 117,118) are missing in the latest Source so that those records need to be delete from the Target.
USE [AnalyticsDB]
GO
Select * From [Tbl_Target]
Inserted records ( Order_Ids : 119,120 )
Deleted records ( Order_Ids : 117,118 )
If there are any updated values in Source data then we have to update those records Target data table. 
If there any records exist in Target but not available in the latest Source data, then we have to Delete those records from Target data table.
Scenario:
Suppose we have a Target Table, in which we have to load the data from Source Table
USE [SalesDB]
GO
SELECT [Order_Id]
      ,[ProdName]
      ,[RegionName]
      ,[FiscalQtr]
      ,[NetSales]
  FROM [dbo].[tbl_Source]
GO
The Source Table as follows from which we Load the data to Target Table
USE [AnalyticsDB]
GO
SELECT [Order_Id]
      ,[ProdName]
      ,[RegionName]
      ,[FiscalQtr]
      ,[NetSales]
  FROM [dbo].[tbl_Target]
GO
Now if you observes the data in the Source against the Destination Table, the following
Now we will design a SSIS Package to perform the above Incremental Load activities.

Phase-1: Create the Temp Table initially before Building a Package:
First we need to Create the two Temp Staging Tables one Time in the Target Database, which helps us build the Package initially. Later the Package will take care of Creating those Tables during Run Time, in future.
The Temp Staging Table for Loading the data from Source, then performing the Incremental Load Activities in Target ( Insert New,Update Modified and Delete Not match with Source).
IF OBJECT_ID('[tempdb]..[##tempSourceTable]', 'U') IS NOT NULL
 Begin
    Truncate Table [dbo].[##tempSourceTable]
  --Print ('Temp Table already exists and it has been Truncated')
 End
Else
Begin
CREATE TABLE[dbo].[##tempSourceTable](
[Order_Id] [Int] NOT NULL,
[ProdName] [nvarchar](255) NULL,
[RegionName] [nvarchar](255) NULL,
[FiscalQtr] [nvarchar](255) NULL,
[NetSales] [float] NULL
 ON [PRIMARY]
 -- Print ('A Temp Table has been Created')
End
GO

Phase-2: Create the SQL Server Connections to Source, Temp and Target Databases:

Notes:
Please set the Property RetainSameConnection=True for the TAMATAM-PC.tempdb under connection properties.
Also set the Property DelayValidation=True in Control Flow Properties and in the Data Flow Properties.

Phase-3: Create the Temp Staging Table for Source with Execute SQL Task:
We have to Create a Temp Table which we will use as Staging Table, in which we store the data from the Source. From this Staging Table we Insert, update and Delete the records in the Target Table. 

We use the below Query , which will Truncates if already exists Otherwise it re-creates the Temp Table on every run.
IF OBJECT_ID('[tempdb]..[##tempSourceTable]', 'U') IS NOT NULL
 Begin
    Truncate Table [dbo].[##tempSourceTable]
  --Print ('Temp Table already exists and it has been Truncated')
 End
Else
Begin
CREATE TABLE[dbo].[##tempSourceTable](
[Order_Id] [Int] NOT NULL,
[ProdName] [nvarchar](255) NULL,
[RegionName] [nvarchar](255) NULL,
[FiscalQtr] [nvarchar](255) NULL,
[NetSales] [float] NULL
 ON [PRIMARY]
 -- Print ('A Temp Table has been Created')
End
GO
Next Connect the Execute SQL Task to Data Flow and Design the Data Flow Task for loading data from  Source to a Temp Staging Table.

Phase-4: Designing the Data Flow Task for for loading data from Source to a Temp Staging Table :
Source DB :
Temp Stage Table :


Now the Data Flow Task is Ready. Next Connect the Data Flow Task to Execute SQL Task to Perform the Incremental Load Activities.

Phase-5: Connect the Data Flow Task to Execute SQL Task to Perform the Incremental Load Activities :
We use the below Query (MERGE Command), in the Execute SQL Task which performs the Incremental Load Activities in Target ( Insert New,Update Modified and Delete Not match with Source)
MERGE tbl_Target AS T
USING ##TempSourceTable AS S
ON (T.Order_Id=S.Order_Id)
WHEN NOT MATCHED BY TARGET
THEN INSERT(Order_Id,ProdName,RegionName,FiscalQtr,NetSales)
VALUES(S.Order_Id,S.ProdName,S.RegionName,S.FiscalQtr,S.NetSales)
WHEN MATCHED AND 
(
ISNULL(T.ProdName,'abc')<>ISNULL(S.ProdName,'abc') OR
ISNULL(T.RegionName,'abc')<>ISNULL(S.RegionName,'abc') OR 
T.FiscalQtr<>S.FiscalQtr OR T.NetSales<>S.NetSales
)
THEN UPDATE SET T.ProdName=S.ProdName, 
T.RegionName=S.RegionName,
T.FiscalQtr=S.FiscalQtr,T.NetSales=S.NetSales
WHEN NOT MATCHED BY SOURCE THEN DELETE;

Notes :
Some times we may get the below Error when we performing comparison operations like below
ISNULL(T.ProdName,'abc')<>ISNULL(S.ProdName,'abc') OR
T.FiscalQtr<>S.FiscalQtr OR T.NetSales<>S.NetSales 
due to different Collations having by SourceDB and TargetDB. 


Msg 468, Level 16, State 9, Line 12
Cannot resolve the collation conflict between "Latin1_General_CI_AI" and "SQL_Latin1_General_CP1_CI_AS" in the not equal to operation.
We need to Make Sure that the 
To avoid this Error we need to Ensure both Source and Target DB should have the same Collation.
Final Phase: Now Execute the Package to perform the Incremental Load Activities :

Final Output in Target Table ( with Inserted New , Updated Modified and Deleted non Match with Source) :

Modified records ( Order_Ids : 113,115,116 )
Inserted records ( Order_Ids : 119,120 )
Deleted records ( Order_Ids : 117,118 )

--------------------------------------------------------------------------------------------------------
Thanks, TAMATAM ; Business Intelligence & Analytics Professional
--------------------------------------------------------------------------------------------------------

No comments:

Post a Comment

Hi User, Thank You for visiting My Blog. Please post your genuine Feedback or comments only related to this Blog Posts. Please do not post any Spam comments or Advertising kind of comments which will be Ignored.