Tuesday, 19 November 2013

What is Slowly Changing Dimension in Data Warehouse

SCD-Slowly Changing Dimension - Data Warehouse
SCD is Slowly Changing Dimension. As the name suggests, a dimension which changes slowly. For Example, say there is a table Employee, which stores information regarding employee as below:

BusinessEntityID, PANNumber, First_Name, last_Name LoginID, OrganizationNode
OrganizationLevel, JobTitle, BirthDate, MaritalStatus, Gender, HireDate, SalariedFlag
CurrentFlag, ModifiedDate

In this Employee table, the data for an employee doesn't change very often, but yes we can’t say that the changes won’t be there. The changes, which may happen, are:
  • Mistakenly spelling of First_Name is stored incorrect.
  • The employee gets married and marital status changes.
  • Last_Name changes.
  • The employee gets promotion and job designation changes and organization level         changes.
  • The columns which doesn't change except if we assume that no mistake happens while data entry are HireDate, Gender, PANNumber
The changes discussed don’t happen frequently, but may happen after certain time.

SCD supports four types of changes: 
Changing Attribute, Historical Attribute, Fixed Attribute, and Inferred Member.

Type 1 (Changing Attribute)
When the changes in any attribute or column overwrites the existing records.
For example; as discussed first name of employee is misspelled and wrong spelling is stored in first name of that employee. For making the first name correct, we don’t need to add one more record for the same employee, so we can overwrite the first name. SCD which does this kind of changes comes into type 1 category. This SCD transformation directs these rows to an output named Changing Attributes Updates Output.

Emp ID
First Name
Last Name
Raj Sekar
Raja Sekhar

This SCD transformation directs these rows to an output named Changing Attributes Updates Output.
Type 2 (Historical Attribute): 
when we need to maintain the history of records, whenever some particular column value changes.
For example the employee gets promotion, designation changes and organization level changes. In such case we need to maintain the history of the employee, that with which designation he joined, and when his designation and organizational level changes.

For these kinds of changes, there will be multiple records for the same employee with different designation. Then to indentify the current records, we can either add a column as current flag, which will be ‘y’ for the current or latest records, Or else we can add two column as start date and end date (expiry date), through which we can maintain history of employees records. This SCD directs these rows to two outputs: Historical Attribute Inserts Output and New Output.

Raja Sekhar
Graduate Engineer
Raja Sekhar
Analyst Programmer
Raja Sekhar
Business Analyst

Type 3 (Fixed Attribute) : when the attribute must not change.
For example HireDate, Gender, PANNumber should never change. So whenever changes will occur in these columns value then either it should throw error or the changes can be saved in some other destination. But changes should not be applied in the columns.

This SCD transformation detects changes and directs the rows with changes to an output named Fixed Attribute Output.

Inferred member:
  are those records of the dimension, which are found missing during fact load. 

For example, say there is a fact table which contains employee and department information. While generating the fact table from Employee_Data table and Department_Data table , sometimes happens that Employee_Data contains some departments name which has no records in department table , and during fact table generation those records are found missing from department table, these kind of member of dimension department are called inferred member. It’s like ‘Fact arriving earlier than dimensions’.

This SCD transformation directs these rows to an output named Inferred Member Updates. When data for the inferred member is loaded, you can update the existing record rather than create a new one.


No comments:

Post a Comment

Hi User, Thank You for Visiting My Blog. Please Post Your Feedback/Comments/Query.

Subscribe to Blog Posts by Email

ExcelKingdom-Popular Posts

ExcelKingdom-Random Posts