Below are the few
points that i would like to list out about Staging area
- Staging area is
a place where you hold temporary tables on data warehouse server. Staging
tables are connected to work area or fact tables. We basically need
staging area to hold the data, and perform data cleansing and merging,
before loading the data into warehouse.
- In the absence
of a staging area, the data load will have to go from the OLTP system to
the OLAP system directly, which in fact will severely hamper the
performance of the OLTP system. This is the primary reason for the
existence of a staging area. In addition, it also offers a platform for
carrying out data cleansing.
- Staging area is
a temp schema used to
- Do
flat mapping i.e. dumping all the OLTP data in to it without applying any
business rules. Pushing data into staging will take less time because
there are no business rules or transformation applied on it.
- Used
for data cleansing and validation using First Logic
- A staging area is
like a large table with data separated from their sources to be loaded
into a data warehouse in the required format. If we attempt to load data
directly from OLTP, it might mess up the OLTP because of format changes
between a warehouse and OLTP. Keeping the OLTP data intact is very
important for both the OLTP and the warehouse.
- According to the complexity of the
business rule, we may require staging area, the basic need of staging area
is to clean the OLTP source data and gather in a place. Basically it’s a
temporary database area. Staging area data is used for the further process
and after that they can be deleted.
Please share your view about staging area......
Very nicely explained.
ReplyDeleteAnd also its always better to have all data, which is to be loaded, under one roof, so it would be easy to perform transformations(if any)
did anybody ask you?
DeleteThanks anyway, duh/
So a staging area is actually a database too, but only separated from OLTP databases and meant to perform transformations on? Am I understanding this correctly?
ReplyDeleteAlso, staging area is used to extract the data from various OLTP sources because all these sources might not be available at the same time.So, its a good idea to dump the required data in staging temporarily whenever the OLTP's are available and use it as per our need.
ReplyDelete