DATAWAREHOUSE CONCEPTS: Aggregator


Thursday, 13 September 2012

Informatica Workflow taking more time to execute even after using Sorted input

Sorted input reduces the amount of data cached during the session and improves the performance of the Aggregator. Even so, a workflow can still take a long time to execute with sorted input enabled in the Aggregator. To avoid this, keep the points below in mind when building the mapping or configuring the session.

Don’t use incremental aggregation in the session.
Don’t use nested aggregate functions inside aggregate expressions.
Don’t treat the source data as data driven.

Friday, 7 September 2012

Aggregate Tables in OBIA

Aggregate tables are an important part of OBIA. I have already discussed the Aggregator in my earlier posts on "Aggregators". In a data warehouse there are many facts, and we often have to sum up fact data with respect to a given dimension, for example by date (the Date dimension).

When we do these summations for each fact, the performance of the mapping slows down. Hence we make use of the Aggregator to do these summations and other aggregate functions such as Min, Max and Avg.

OBIA pre-calculates these sums and stores them in aggregate tables. In OBIA, aggregate table names carry the suffix _A, so you can easily identify an aggregate table by its suffix.
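The idea of pre-calculating sums can be sketched in a few lines of Python. This is only an illustration of the concept, not how OBIA builds its tables: the rows, the `sales_fact` name and the `build_aggregate_table` helper are all made up for the example.

```python
from collections import defaultdict

# Hypothetical detail-level fact rows: (date_key, product, amount).
sales_fact = [
    ("2012-09-01", "A", 100.0),
    ("2012-09-01", "B", 50.0),
    ("2012-09-02", "A", 70.0),
    ("2012-09-02", "B", 30.0),
    ("2012-09-02", "C", 25.0),
]


def build_aggregate_table(fact_rows):
    """Pre-calculate SUM(amount) per date_key once, at load time."""
    totals = defaultdict(float)
    for date_key, _product, amount in fact_rows:
        totals[date_key] += amount
    return dict(totals)


# The _A suffix mirrors the OBIA naming convention; the table itself is invented.
sales_fact_a = build_aggregate_table(sales_fact)

# A report reads one pre-summed row instead of rescanning the detail rows.
print(sales_fact_a["2012-09-02"])  # 125.0
```

The cost of the summation is paid once during the load, so every later query by date is a single lookup.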

There are also many ways to improve the performance of the Aggregator, which I have covered in my previous posts.

Monday, 3 September 2012

Aggregator transformation vs Expression Transformation

The Aggregator transformation is used to perform aggregate calculations such as sum, average, max and min. It is an active transformation that converts detailed values to summary values. Compared with the Expression transformation, the difference is that the Expression transformation calculates row by row, whereas the Aggregator calculates over a group of rows. The Integration Service performs aggregate calculations as it reads, and stores the necessary group and row data in an aggregate cache.
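The row-by-row versus group distinction can be shown with a small Python sketch. The function names, the `dept` and `salary` fields and the 10% bonus rule are assumptions made for illustration; they are not Informatica objects.

```python
from collections import defaultdict

rows = [
    {"dept": "HR", "salary": 1000},
    {"dept": "HR", "salary": 1500},
    {"dept": "IT", "salary": 2000},
]


def expression_like(rows):
    """Row-by-row calculation: one output row for every input row."""
    return [{**r, "bonus": r["salary"] * 0.10} for r in rows]


def aggregator_like(rows, group_by):
    """Group calculation: one output row per distinct group value."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[group_by]] += r["salary"]
    return [{group_by: key, "total_salary": total} for key, total in totals.items()]


print(len(expression_like(rows)))     # 3 rows in, 3 rows out
print(aggregator_like(rows, "dept"))  # 3 rows in, 2 rows out
```

The changed row count is why the Aggregator is called an active transformation.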

The Aggregator transformation consists of the four components below:

1) Group by ports

2) Aggregate Expression

3) Sorted input

4) Aggregate Cache

You can read more about the Aggregator in my posts below:

1) Different Components in Aggregator

2) How to improve the performance of Aggregator

Sunday, 2 September 2012

Types of Transformations supported by Sorted Input

Only the transformations below support sorted input to increase performance:

Aggregator Transformation
Joiner Transformation
Lookup Transformation

Friday, 31 August 2012

Null Values in Aggregate Functions

When you configure the Integration Service, you can choose how it handles null values in aggregate functions: as NULL or as zero. By default the Integration Service treats null values as NULL in aggregate functions. If you don’t want that behaviour, you can configure it to treat null values as zero.
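The choice matters most for averages. The sketch below is a plain Python model, not Informatica code: `None` stands in for NULL, the `nulls_as_zero` flag is a hypothetical stand-in for the Integration Service setting, and it assumes that with nulls kept as NULL an aggregate skips them and returns NULL only when every value is NULL.

```python
def agg_sum(values, nulls_as_zero=False):
    """SUM sketch; None stands in for NULL."""
    if nulls_as_zero:
        values = [0 if v is None else v for v in values]
    present = [v for v in values if v is not None]
    # Assumption: NULLs are skipped, and an all-NULL input yields NULL.
    return sum(present) if present else None


def agg_avg(values, nulls_as_zero=False):
    """AVG sketch under the same assumptions as agg_sum."""
    if nulls_as_zero:
        values = [0 if v is None else v for v in values]
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


amounts = [10, None, 20]
print(agg_sum(amounts))                           # 30
print(agg_avg(amounts))                           # 15.0
print(agg_avg(amounts, nulls_as_zero=True))       # 10.0
print(agg_sum([None, None]))                      # None
print(agg_sum([None, None], nulls_as_zero=True))  # 0
```

With nulls treated as zero the average is divided by three rows instead of two, so the same data gives a different result.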

Different Components of the Aggregator Transformation

The Aggregator transformation is an active transformation and has the following components:

Aggregate expression: This expression is entered in an output port. It can include non-aggregate expressions and conditional clauses.
Group by port: Used to create groups. It can be any input, input/output, output, or variable port.
Sorted input: To use sorted input, pass the Aggregator transformation data that is sorted by the group by ports, in either ascending or descending order. Sorted input is used to improve session performance.
Aggregate cache: The Integration Service stores data in the aggregate cache until it completes the aggregate calculations. It stores the group values in the index cache and the row data in the data cache.
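Why sorted input needs less cache can be seen in a simplified Python model. The two functions below are illustrative assumptions, not the Integration Service's real algorithm: one keeps every group in memory until the input ends, the other relies on the input being sorted by the group key and holds only the current group.

```python
def aggregate_unsorted(rows):
    """No ordering guarantee: every group stays cached until the last row."""
    cache = {}
    for key, amount in rows:
        cache[key] = cache.get(key, 0) + amount
    return list(cache.items())


def aggregate_sorted(rows):
    """Input sorted by key: a group is complete as soon as the key changes."""
    results = []
    current_key, running_total, started = None, 0, False
    for key, amount in rows:
        if started and key != current_key:
            results.append((current_key, running_total))  # emit finished group
            running_total = 0
        current_key, started = key, True
        running_total += amount
    if started:
        results.append((current_key, running_total))  # emit the last group
    return results


rows = [("east", 10), ("east", 5), ("north", 7), ("west", 1), ("west", 2)]
print(aggregate_sorted(rows))    # [('east', 15), ('north', 7), ('west', 3)]
print(aggregate_unsorted(rows))  # same totals, but all groups were held at once
```

The sorted version never holds more than one running total, which is the saving that sorted input is meant to provide.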

How to improve the performance of an Aggregator transformation?

Use sorted input: Sorted input reduces the amount of data cached during the session and improves session performance. Hence use a Sorter transformation to sort the data before passing it to the Aggregator transformation.

Fewer input/output ports: Limit the number of connected input/output or output ports to reduce the amount of data the Aggregator transformation stores in the data cache.

Use Filter: If you are using a Filter transformation, place it before the Aggregator. This spares the Aggregator transformation from aggregating rows that would be discarded anyway.

Wednesday, 29 August 2012

Sorting in Informatica

Sorting the data improves performance. For a relational source it is always better to sort the data at the Source Qualifier level. A SQL override cannot be used on flat files, so for flat files we use a Sorter transformation.

Flat file source - Use Sorter Transformation
Relational source - Use Sorter/Source Qualifier
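The two cases can be compared in a short Python sketch. It is an analogy rather than Informatica code: an in-memory SQLite table stands in for the relational source, an `ORDER BY` stands in for the Source Qualifier SQL override, and `sorted()` stands in for the Sorter transformation. The `orders` table and its columns are invented for the example.

```python
import csv
import io
import sqlite3

# Relational source: let the database sort, as an ORDER BY in a SQL override would.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("west", 1), ("east", 10), ("west", 2), ("east", 5)],
)
relational_rows = conn.execute(
    "SELECT region, amount FROM orders ORDER BY region, amount"
).fetchall()
conn.close()

# Flat file source: there is no SQL to override, so the rows are sorted
# after they have been read, which is the job of the Sorter.
flat_file = io.StringIO("west,1\neast,10\nwest,2\neast,5\n")
flat_rows = sorted((region, int(amount)) for region, amount in csv.reader(flat_file))

print(relational_rows == flat_rows)  # True
```

Both paths deliver the same ordered rows; the difference is only where the sorting work is done.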

Using Sorter in Aggregator:

Giving sorted input to the Aggregator transformation improves performance. If the input is not sorted, the Aggregator has to cache all incoming rows before it can finish any group, which degrades performance. If you place a Sorter before the Aggregator but leave "Sorted Input" unchecked, the Aggregator will still cache all incoming data. Check "Sorted Input" only when the sort ports and the Aggregator group by ports are the same and appear in the same order.
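The need for matching sort and group by ports can be demonstrated with a small Python model. `streaming_sum` is a hypothetical helper that closes a group whenever the key changes, as a sorted-input aggregation would. It only illustrates the logic; what the Integration Service actually does with wrongly sorted input is not covered here.

```python
from itertools import groupby
from operator import itemgetter


def streaming_sum(rows, key_index):
    """Sum the amount (index 2) per run of consecutive rows sharing a key."""
    return [
        (key, sum(row[2] for row in group))
        for key, group in groupby(rows, key=itemgetter(key_index))
    ]


# Rows are (region, product, amount), sorted by product, not by region.
by_product = [("east", "A", 10), ("west", "A", 1), ("east", "B", 5), ("west", "B", 2)]

# Grouping by region on data sorted by product splits each region in two.
print(streaming_sum(by_product, 0))
# [('east', 10), ('west', 1), ('east', 5), ('west', 2)]

# Sorting by the group by key first gives one row per region.
by_region = sorted(by_product, key=itemgetter(0))
print(streaming_sum(by_region, 0))
# [('east', 15), ('west', 3)]
```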
