In Type 2 Slowly Changing Dimension, a
new record is added to the table to represent the new information. Therefore,
both the original and the new record will be present. The new record gets its
own primary key.
In our example, recall we originally
have the following table:
Customer Key
|
Name
|
State
|
1001
|
Williams
|
New
York
|
After Williams moved from New York to
Los Angeles, we add the new information as a new row into the table:
Customer Key
|
Name
|
State
|
1001
|
Williams
|
New
York
|
1005
|
Williams
|
Los
Angeles
|
Advantages
- This allows us to accurately keep all historical information.
Disadvantages
- This will cause the size of the table to grow fast. In cases where the number of rows for the table is very high to start with, storage and performance can become a concern.
- This necessarily complicates the ETL process.
Usage
About 50% of the time.
When to use Type 2
Type 2 slowly changing dimension should
be used when it is necessary for the data warehouse to track historical
changes.
In a history table, we should have a column called CURR_ROW_IND (Current Row Indicator) as a CHAR (Y/N). The current row would be Y, and all other history rows would be N. In this example, we assume that the most current address would be the one with the largest CUSTOMER_KEY for the customer. To build a query, we would have to use a GROUP BY clause and a MAX function on the CUSTOMER_KEY. Returning the results could be sluggish with a large history file. Using the clause WHERE CURR_RO_IND = Y, returns results much faster.
ReplyDeleteHi,
ReplyDeletethnx for sharing this, Can anyone explain how the customer key can be changed?
That is surrogate key only - a sequential no.! Not a customer Id
DeleteNo increased complex ETL with Pentaho Data Integration, for it has a dedicated step for Dimension treatment. SCD 1 or 2 is click away. No need for 3.
ReplyDeleteThe complexity is with it's execution, the execution with max function will always be time consuming.
ReplyDeletedifferences in sap bods 4.0 and 4.1 and 4.2
ReplyDeleteIn scd2 we have to maintain historical data...if we want to change the one perticular...first look in to table by using lookup...if the record not exist then directoly insert...if It alredy exist then change it with start and end date ...start date with current sysdate and end with flag no like 9999
ReplyDeleteHi Question
ReplyDeleteIN SCD2, what if source record is deleted? How do we handle in SCD2 ?
Hi Question
ReplyDeleteIN SCD2, what if source record is deleted? How do we handle in SCD2 ?
Customer Key should remain same. This is wrong example. It should have version column or effective date column or current_Record IND (Y /N)
ReplyDeleteWhen customer key remains same, it cannot be treated as a new row and hence tracking history is impossible as the original record gets updated when key is same. Effective date or column as suggested is used in scd type 3 to maintain previous and current record only(partial history)
DeleteCustomer key is not natural key in this scenario. The natural key will remain same, and the primary key (customer key) is generated anew for each version.
Delete