Informatica provides you the option of enhancing the performance of the Informatica session by the The PowerCenter® Partitioning Option. After tuning all the performance bottlenecks we can further improve the performance by addition partitions.
We can either go for Dynamic
partitioning (number of partition passed as parameter) or Non-dynamic
partition (number of partition are fixed while coding). Apart from used for
optimizing the session, Informatica partition become useful in situations where
we need to load huge volume of data or when we are using Informatica source
which already has partitions defined, and using those partitions will allow to
improve the session performance.
The
partition attributes include setting the partition point, the number of
partitions, and the partition types.
Partition
Point:
There can be one or more pipelines inside a mapping. Adding a partition point will divide this pipeline into many pipeline stages. Informatica will create one partition by default for every pipeline stage. As we increase the partition points it increases the number of threads. Informatica has mainly three types of threads –Reader, Writer and Transformation Thread.
There can be one or more pipelines inside a mapping. Adding a partition point will divide this pipeline into many pipeline stages. Informatica will create one partition by default for every pipeline stage. As we increase the partition points it increases the number of threads. Informatica has mainly three types of threads –Reader, Writer and Transformation Thread.
The
number of partitions can be set at any partition point. We can define up to 64
partitions at any partition point in a pipeline. When you increase the number
of partitions, you increase the number of processing threads, which can improve
session performance. However, if you create a large number of partitions or
partition points in a session that processes large amounts of data, you can
overload the system.
You
cannot create partition points for the following transformations:
•
Source definition
•
Sequence Generator
• XMLParser
• XML
target
•
Unconnected transformations
The
partition type controls how the Integration Service distributes data among
partitions at partition points. The Integration Service creates a default
partition type at each partition point.
Type of partitions are :
1.
Database partitioning,
2.
Hash auto-keys
3.
Hash user keys
4. Key
range
5.
Pass-through
6.
Round-robin.
Database
Partitioning
For Source Database Partitioning, Informatica will check the database system for
the partition information if any and fetches data from corresponding node in
the database into the session partitions. When you use Target database
partitioning, the Integration Service loads data into corresponding database
partition nodes.
Use database partitioning for
Oracle and IBM DB2 sources and IBM DB2 targets.
Pass
through
Using Pass through partition will not affect the distribution of data across
partitions instead it will run in single pipeline.which is by default for all
your sessions. The Integration Service processes data without redistributing
rows among partitions. Hence all rows in a single partition stay in the
partition after crossing a pass-through partition point.
Key
range
Used
when we want to partition the data based on upper and lower limit. The
Integration Service will distribute the rows of data based on a port or set of
ports that we define as the partition key. For each port, we define a range of
values.Based on the range that we define the rows are send to different partitions
Hash
auto-keys: The
Integration Service uses a hash function to group rows of data among
partitions. The Integration Service groups the data based on a partition key.
Hash
user keys: The
Integration Service uses a hash function to group rows of data among
partitions. We define the number of ports to generate the partition key.