We come accross Date dimension mostly while working in Data warehousing.To make this post I have referred"The Data
Warehouse Toolkit: The Complete Guide to Dimensional Modeling - By
Ralph Kimball, Margy Ross"
The date dimension is the one
dimension nearly guaranteed to be in every data mart because virtually every
data mart is a time series. In fact, date is usually the first dimension in the
underlying sort order of the database so that the successive loading of time
intervals of data is placed into virgin territory on the disk.Unlike most of
our other dimensions, we can build the date dimension table in advance. We may
put 5 or 10 years of rows
representing days in the table so that we can cover the history we have stored,
as well as several years in the future. Even 10 years’ worth of days is only about 3,650 rows, which is a
relatively small dimension table. For a daily date dimension table in a retail
environment, we recommend the partial list of columns shown in Figure
|
Each column in the date
dimension table is defined by the particular day that the row represents. The day-of-week column contains the name of
the day, such as Monday. This column would be used to create reports comparing
the business on Mondays with Sunday business. The day number in calendar month column starts with 1 at the beginning
of each month and runs to 28, 29, 30, or 31, depending on the month. This column
is useful for comparing the same day each month. Similarly, we could have a
month number in year (1, ... , 12)..
The holiday indicator takes on the values of Holiday or Nonholiday.
Remember that the dimension table attributes serve as report labels. Simply
populating the holiday indicator with a Y or an N ould be far less useful.
Imagine a report where we’re comparing holiday sales for a given product versus
nonholiday sales.
A similar argument holds
true for the weekday indicator, which
would have a value of Weekday or Weekend. Saturdays and Sundays obviously would
be assigned the Weekend value. Of course, multiple date table attributes can be
jointly constrained, so we can easily compare weekday holidays with weekend holidays,
for example.The selling season column is set to the name of the retailing season,
if any.Examples in the United States could include Christmas, Thanksgiving,
Easter,Valentine’s Day, Fourth of July, or None.
Some designers pause at
this point to ask why an explicit date dimension table is needed.The reason that if the date
key in the fact table is a date-type field,then any SQL query can directly
constrain on the fact table date key and use natural SQL date semantics to
filter on month or year while avoiding a supposedly expensive join. This
reasoning falls apart for several reasons.
- First of all, if our relational database can’t handle an efficient join to the date dimension table, we’re already in deep trouble.
- Also, on the performance front, most databases don’t index SQL date calculations, so queries constraining on an SQL-calculated field wouldn’t take advantage of an index.
- In terms of usability, the typical business user is not versed in SQL date semantics,so he or she would be unable to directly leverage inherent capabilities associated with a date data type. SQL date functions do not support filtering by attributes such as weekdays versus weekends, holidays, fiscal periods, seasons, or major events. Presuming that the business needs to slice data by these nonstandard date attributes, then an explicit date dimension table is essential
No comments:
Post a Comment