How to remove duplicate rows in Informatica

Sunday, 12 August 2012

How to remove duplicate rows in Informatica

Scenario:Route unique rows to one table and duplicate to another using Informatica transformations.

SOURCE:

COL1	COL2	COL3
100	101	102
10	11	12
100	101	102
999	998	997
100	101	102
1	2	3
1	2	3

Target Table 1: Table containing all the unique rows

COL1	COL2	COL3
100	101	102
10	11	12
999	998	997
1	2	3

Target Table 2: Table containing all the duplicate rows

COL1	COL2	COL3
100	101	102
100	101	102
1	2	3

We need to use transformations in such a way that Source table duplicate values are filter into separate table. As you guessed we need to use both Router and Aggregator

Bring the source table and connect to an Aggregator Transformation
Add a new column in Aggregator (group by key column) for eg. Count_rec having the count of the key column
Connect to Router with two groups

1.ORGINAL: write count_rec=1

2.DUPLICATE: write count_rec>1.

Connect two groups to corresponding target table.

11 comments:

Unknown4 July 2013 at 14:48
In case of flat file:
SRC-->SQ-->Sorter-->Aggr-->Router-->Tgt1 and Tgt2
In Aggregator,create the variable ports
1) PRES_ID = ID
2) V_COUNT => IF(PRES_ID=PREV_ID,V_COUNT+1,1)
3) PREV_ID = ID
Create the output port
4) O_COUNT = V_COUNT
Pass the output to Router and create a group filter condition with O_COUNT=1.
The default group will contain the duplicate records.
ReplyDelete
Replies
Anonymous27 July 2013 at 11:53
Can't we write the same logic in an expression transformation instead of aggr ??
ReplyDelete
Replies
Unknown7 October 2013 at 14:18
we can also write same exp in expression t/r for performanece
ReplyDelete
Replies
Unknown9 October 2013 at 17:54
Count_rec having the count of the key column means how can i perform
this if anybody
ReplyDelete
Replies
Anonymous24 March 2014 at 13:20
Duplicate row in target were unique like expecting 100 101 102 two times,but in target getting one time. please provide alternative.
ReplyDelete
Replies
Kavya22 April 2014 at 16:15
Sort the source data.

1 2 3
1 2 3
10 11 12
100 101 102
100 101 102
100 101 102
999 998 999

In the expression transformation, create a variable port and increment it by 1.
Then assign the variable port to an output port and add one dummy column with value 1

V_c1=c1
V_c2=c2
V_c3=c3
V_count=IIF(V_c1=V_prev_c1 &&V_c2=V_prev_c2 &&V_c3=V_prev_c3,V_count+1,1)
V_prev_c1=c1
V_prev_c2=c2
V_prev_c3=c3
O_count=V_count
O_dummy=1

C1 c2 c3 count dummy
1 2 3 1 1
1 2 3 2 1
10 11 12 1 1
100 101 102 1 1
100 101 102 2 1
100 101 102 3 1
999 998 999 1 1

Then use Router transformation with 2 groups

1) count=dummy-----unique rows
2) count!=dummy-----duplicate rows
ReplyDelete
Replies
Unknown11 August 2014 at 09:42
If we give --> write count_rec=1,write count_rec>1 in Router T/F then output table 1 (containig all unique rows)
will have 2 records

10,11,12
1,2,3
ReplyDelete
Replies
SHAUN18 November 2016 at 19:11
Thank u
ReplyDelete
Replies
hari20 July 2018 at 23:12
Without using any transaction how to find whether there are duplicates in the target.(I don t want to use any sql and filter or select distinct option) in informatica
ReplyDelete
Replies

Add comment

DATAWAREHOUSE CONCEPTS

Pages

Sunday, 12 August 2012

How to remove duplicate rows in Informatica

11 comments:

ShareThis