Tuesday, 19 November 2013

MSBI-SSIS: SCD-Slowly Changing Dimension

MSBI-SSIS: SCD-Slowly Changing Dimension
SCD is Slowly Changing Dimension. As the name suggests, a dimension which changes slowly. For Example, say there is a table Employee, which stores information regarding employee as below:

BusinessEntityID, PANNumber, First_Name, last_Name LoginID, OrganizationNode
OrganizationLevel, JobTitle, BirthDate, MaritalStatus, Gender, HireDate, SalariedFlag
CurrentFlag, ModifiedDate

In this Employee table, the data for an employee doesn't change very often, but yes we can’t say that the changes won’t be there. The changes, which may happen, are:

  • Mistakenly spelling of First_Name is stored incorrect.
  • The employee gets married and marital status changes.
  • Last_Name changes.
  • The employee gets promotion and job designation changes and organization level         changes.
  • The columns which doesn't change except if we assume that no mistake happens while data entry are HireDate, Gender, PANNumber
The changes discussed don’t happen frequently, but may happen after certain time.

SCD supports four types of changes: 
Changing Attribute, Historical Attribute, Fixed Attribute, and Inferred Member.

Type 1 (Changing Attribute)
When the changes in any attribute or column overwrites the existing records.
For example; as discussed first name of employee is misspelled and wrong spelling is stored in first name of that employee. For making the first name correct, we don’t need to add one more record for the same employee, so we can overwrite the first name. SCD which does this kind of changes comes into type 1 category. This SCD transformation directs these rows to an output named Changing Attributes Updates Output.

Emp ID
First Name
Last Name
1
Raj Sekar
Gupta
1
Raja Sekhar
Gupta

This SCD transformation directs these rows to an output named Changing Attributes Updates Output.
Type 2 (Historical Attribute): 
when we need to maintain the history of records, whenever some particular column value changes.
For example the employee gets promotion, designation changes and organization level changes. In such case we need to maintain the history of the employee, that with which designation he joined, and when his designation and organizational level changes.

For these kinds of changes, there will be multiple records for the same employee with different designation. Then to indentify the current records, we can either add a column as current flag, which will be ‘y’ for the current or latest records, Or else we can add two column as start date and end date (expiry date), through which we can maintain history of employees records. This SCD directs these rows to two outputs: Historical Attribute Inserts Output and New Output.

EmpID
FirstName
DEsignation
StartDate
EndDate
Current
1
Raja Sekhar
Graduate Engineer
20-01-2010
25-01-2011
N
1
Raja Sekhar
Analyst Programmer
25-01-2011
25-01-2012
N
1
Raja Sekhar
Business Analyst
25-01-2012
1-01-2099
Y


Type 3 (Fixed Attribute) : when the attribute must not change.
For example HireDate, Gender, PANNumber should never change. So whenever changes will occur in these columns value then either it should throw error or the changes can be saved in some other destination. But changes should not be applied in the columns.

This SCD transformation detects changes and directs the rows with changes to an output named Fixed Attribute Output.

Inferred member:
  are those records of the dimension, which are found missing during fact load. 

For example, say there is a fact table which contains employee and department information. While generating the fact table from Employee_Data table and Department_Data table , sometimes happens that Employee_Data contains some departments name which has no records in department table , and during fact table generation those records are found missing from department table, these kind of member of dimension department are called inferred member. It’s like ‘Fact arriving earlier than dimensions’.

This SCD transformation directs these rows to an output named Inferred Member Updates. When data for the inferred member is loaded, you can update the existing record rather than create a new one.

Source:

2 comments:

  1. thank u for sharing this best information..really very nice.we are offering msbi online training

    ReplyDelete
  2. Thank you for posting the great content…I was looking for something like this…I found it quiet interesting, hopefully you will keep posting such blogs….Keep sharing

    best online MSBI training
    online abinitio training in usa,uk,india,canada
    selenium training
    best R programming online training
    SAP HANA online training in india

    ReplyDelete

Follow Me by Email

ExcelKingdom-Popular Posts

ExcelKingdom-Random Posts