Pages

Thursday, 13 September 2012

SCD Type 2,Slowly Changing Dimension Use,Example,Advantage,Disadvantage

In Type 2 Slowly Changing Dimension, a new record is added to the table to represent the new information. Therefore, both the original and the new record will be present. The new record gets its own primary key.
In our example, recall we originally have the following table:

Customer Key
Name
State
1001
Williams
New York

After Williams moved from New York to Los Angeles, we add the new information as a new row into the table:

Customer Key
Name
State
1001
Williams
New York
1005
Williams
Los Angeles

Advantages
  • This allows us to accurately keep all historical information.
Disadvantages
  • This will cause the size of the table to grow fast. In cases where the number of rows for the table is very high to start with, storage and performance can become a concern.
  • This necessarily complicates the ETL process.
Usage
About 50% of the time.

When to use Type 2
Type 2 slowly changing dimension should be used when it is necessary for the data warehouse to track historical changes.

12 comments:

  1. In a history table, we should have a column called CURR_ROW_IND (Current Row Indicator) as a CHAR (Y/N). The current row would be Y, and all other history rows would be N. In this example, we assume that the most current address would be the one with the largest CUSTOMER_KEY for the customer. To build a query, we would have to use a GROUP BY clause and a MAX function on the CUSTOMER_KEY. Returning the results could be sluggish with a large history file. Using the clause WHERE CURR_RO_IND = Y, returns results much faster.

    ReplyDelete
  2. Hi,

    thnx for sharing this, Can anyone explain how the customer key can be changed?

    ReplyDelete
    Replies
    1. That is surrogate key only - a sequential no.! Not a customer Id

      Delete
  3. No increased complex ETL with Pentaho Data Integration, for it has a dedicated step for Dimension treatment. SCD 1 or 2 is click away. No need for 3.

    ReplyDelete
  4. The complexity is with it's execution, the execution with max function will always be time consuming.

    ReplyDelete
  5. differences in sap bods 4.0 and 4.1 and 4.2

    ReplyDelete
  6. In scd2 we have to maintain historical data...if we want to change the one perticular...first look in to table by using lookup...if the record not exist then directoly insert...if It alredy exist then change it with start and end date ...start date with current sysdate and end with flag no like 9999

    ReplyDelete
  7. Hi Question
    IN SCD2, what if source record is deleted? How do we handle in SCD2 ?

    ReplyDelete
  8. Hi Question
    IN SCD2, what if source record is deleted? How do we handle in SCD2 ?

    ReplyDelete
  9. Customer Key should remain same. This is wrong example. It should have version column or effective date column or current_Record IND (Y /N)

    ReplyDelete
    Replies
    1. When customer key remains same, it cannot be treated as a new row and hence tracking history is impossible as the original record gets updated when key is same. Effective date or column as suggested is used in scd type 3 to maintain previous and current record only(partial history)

      Delete
    2. Customer key is not natural key in this scenario. The natural key will remain same, and the primary key (customer key) is generated anew for each version.

      Delete