Sunday, June 23, 2013

Informatica Questions with Answers

What are the differences between Connected and Unconnected Lookup?

The differences are illustrated in the below table
Connected LookupUnconnected Lookup
Connected lookup participates in dataflow and receives input directly from the pipelineUnconnected lookup receives input values from the result of a LKP: expression in another transformation
Connected lookup can use both dynamic and static cacheUnconnected Lookup cache can NOT be dynamic
Connected lookup can return more than one column value ( output port )Unconnected Lookup can return only one column value i.e. output port
Connected lookup caches all lookup columnsUnconnected lookup caches only the lookup output ports in the lookup conditions and the return port
Supports user-defined default values (i.e. value to return when lookup conditions are not satisfied)Does not support user defined default values


What is meant by active and passive transformation?

An active transformation is the one that performs any of the following actions:
1) Change the number of rows between transformation input and output. Example: Filter transformation.
2) Change the transaction boundary by defining commit or rollback points., example transaction control transformation.
3) Change the row type, example Update strategy is active because it flags the rows for insert, delete, update or reject.

On the other hand a passive transformation is the one which does not change the number of rows that pass through it. Example: Expression transformation.

What is the difference between Router and Filter?

Following differences can be noted,
RouterFilter
Router transformation divides the incoming records into multiple groups based on some condition. Such groups can be mutually inclusive (Different groups may contain same record)Filter transformation restricts or blocks the incoming record set based on one given condition.
Router transformation itself does not block any record. If a certain record does not match any of the routing conditions, the record is routed to default groupFilter transformation does not have a default group. If one record does not match filter condition, the record is blocked
Router acts like CASE.. WHEN statement in SQL (Or Switch().. Case statement in C)Filter acts like WHERE condition is SQL.


What can we do to improve the performance of Informatica Aggregator Transformation?

Aggregator performance improves dramatically if records are sorted before passing to the aggregator and "sorted input" option under aggregator properties is checked. The record set should be sorted on those columns that are used in Group By operation.
It is often a good idea to sort the record set in database level e.g. inside a source qualifier transformation, unless there is a chance that already sorted records from source qualifier can again become unsorted before reaching aggregator.

What are the different lookup cache(s)?

Informatica Lookups can be cached or un-cached (No cache). And Cached lookup can be either static or dynamic. A static cache is one which does not modify the cache once it is built and it remains same during the session run. On the other hand, A dynamic cache is refreshed during the session run by inserting or updating the records in cache based on the incoming source data. By default, Informatica cache is static cache.
A lookup cache can also be divided as persistent or non-persistent based on whether Informatica retains the cache even after the completion of session run or deletes it

How can we update a record in target table without using Update strategy?

A target table can be updated without using 'Update Strategy'. For this, we need to define the key in the target table in Informatica level and then we need to connect the key and the field we want to update in the mapping Target. In the session level, we should set the target property as "Update as Update" and check the "Update" check-box.
Let's assume we have a target table "Customer" with fields as "Customer ID", "Customer Name" and "Customer Address". Suppose we want to update "Customer Address" without an Update Strategy. Then we have to define "Customer ID" as primary key in Informatica level and we will have to connect Customer ID and Customer Address fields in the mapping. If the session properties are set correctly as described above, then the mapping will only update the customer address field for all matching customer IDs.

Under what condition selecting Sorted Input in aggregator may fail the session?


  • If the input data is not sorted correctly, the session will fail.
  • Also if the input data is properly sorted, the session may fail if the sort order by ports and the group by ports of the aggregator are not in the same order.

Why is Sorter an Active Transformation?

This is because we can select the "distinct" option in the sorter property.
When the Sorter transformation is configured to treat output rows as distinct, it assigns all ports as part of the sort key. The Integration Service discards duplicate rows compared during the sort operation. The number of Input Rows will vary as compared with the Output rows and hence it is an Active transformation.

Is lookup an active or passive transformation?

From Informatica 9x, Lookup transformation can be configured as as "Active" transformation.
However, in the older versions of Informatica, lookup is a passive transformation

What is the difference between Static and Dynamic Lookup Cache?

We can configure a Lookup transformation to cache the underlying lookup table. In case of static or read-only lookup cache the Integration Service caches the lookup table at the beginning of the session and does not update the lookup cache while it processes the Lookup transformation.
In case of dynamic lookup cache the Integration Service dynamically inserts or updates data in the lookup cache and passes the data to the target. The dynamic cache is synchronized with the target.
In case you are wondering why do we need to make lookup cache dynamic

What is the difference between STOP and ABORT options in Workflow Monitor?

When we issue the STOP command on the executing session task, the Integration Service stops reading data from source. It continues processing, writing and committing the data to targets. If the Integration Service cannot finish processing and committing data, we can issue the abort command.
In contrast ABORT command has a timeout period of 60 seconds. If the Integration Service cannot finish processing and committing data within the timeout period, it kills the DTM process and terminates the session.

What are the new features of Informatica 9.x in developer level?

From a developer's perspective, some of the new features in Informatica 9.x are as follows:
  • Now Lookup can be configured as an active transformation - it can return multiple rows on successful match
  • Now you can write SQL override on un-cached lookup also. Previously you could do it only on cached lookup
  • You can control the size of your session log. In a real-time environment you can control the session log file size or time
  • Database deadlock resilience feature - this will ensure that your session does not immediately fail if it encounters any database deadlock, it will now retry the operation again. You can configure number of retry attempts.

How to Delete duplicate row using Informatica

Scenario 1: Duplicate rows are present in relational database

Suppose we have Duplicate records in Source System and we want to load only the unique records in the Target System eliminating the duplicate rows. What will be the approach?
Assuming that the source system is a Relational Database, to eliminate duplicate records, we can check the Distinct option of the Source Qualifier of the source table and load the target accordingly.

 Scenario1:

We have a target source table containing 3 columns : Col1, Col2 and Col3. There is only 1 row in the table as follows:

Col1 Col2 Col3
-----------------
a b c

There is target table containg only 1 column Col. Design a mapping so that the target table contains 3 rows as follows:

Col
-----
a
b
c

Solution: Not using a Normalizer transformation:

Create 3 expression transformations exp_1,exp_2 and exp_3 with 1 port each. Connect col1 from Source Qualifier to port in exp_1.Connect col2 from Source Qualifier to port in exp_2.Connect col3 from source qualifier to port in exp_3. Make 3 instances of the target. Connect port from exp_1 to target_1. Connect port from exp_2 to target_2 and connect port from exp_3 to target_3.

Scenario 2:

There is a source table that contains duplicate rows.Design a mapping to load all the unique rows in 1 target while all the duplicate rows (only 1 occurence) in another target.

Solution :

Bring all the columns from source qualifier to an Aggregator transformation. Check group by on the key column. Create a new output port count_col in aggregator transformation and write an expression count(key_column). Make a router transformation with 2 groups:Dup and Non-Dup. Check the router conditions count_col>1 in Dup group while count_col=1 in Non-dup group. Load these 2 groups in different targets.

Scenario 3:

There is a source table containing 2 columns Col1 and Col2 with data as follows:

Col1 Col2
a l
b p
a m
a n
b q
x y

Design a mapping to load a target table with following values from the above mentioned source:

Col1 Col2
a l,m,n
b p,q
x y

Solution:

Use a sorter transformation after the source qualifier to sort the values with col1 as key. Build an expression transformation with following ports(order of ports should also be the same):

1. Col1_prev : It will be a variable type port. Expression should contain a variable e.g val
2. Col1 : It will be Input/Output port from Sorter transformation
3. Col2 : It will be input port from sorter transformation
4. val : It will be a variable type port. Expression should contain Col1
5. Concatenated_value: It will be a variable type port. Expression should be decode(Col1,Col1_prev,Concatenated_value||','||Col2,Col1)
6. Concatenated_Final : It will be an outpur port conating the value of Concatenated_value

After expression, build a Aggregator Transformation. Bring ports Col1 and Concatenated_Final into aggregator. Group by Col1. Don't give any expression. This effectively will return the last row from each group.

Connect the ports Col1 and Concatenated_Final from aggregator to the target table.

Scenario 4:

Design an Informatica mapping to load first half records to 1 target while other half records to a separate target.

Solution:

You will have to assign a row number with each record. To achieve this, either use Oracle's psudo column rownum in Source Qualifier query or use NEXTVAL port of a Sequence generator. Lets name this column as rownumber.

From Source Qualifier, create 2 pipelines:

First Pipeline:
Carry first port Col1 from SQ transformation into an aggregator transformation. Create a new output port "tot_rec" and give the expression as COUNT(Col1). Do not group by any port. This will give us the total number of records in Source Table. Carry this port tot_rec to an Expression Transformation. Add another port DUMMY in expression transformation with default value 1.

Second Pipeline:
from SQ transformation, carry all the ports(including an additional port rownumber generated by rownum or sequence generator) to an Expression Transformation. Add another port DUMMY in expression transformation with default value 1.

Join these 2 pipelines with a Joiner Transformation on common port DUMMY. carry all the source table ports and 2 additional ports tot_rec and rownumber to a router transformation. Add 2 groups in Router : FIRST_HALF and SECOND_HALF. Give condition rownumber<=tot_rec/2 in FIRST_HALF. Give condition rownumber>tot_rec/2 in SECOND_HALF. Connect the 2 groups to 2 different targets.

No comments:

Post a Comment