Informatica Interview Questions

Q. What do you mean by Enterprise Data Warehousing?

When an organization's data is consolidated at a single point of access, it is called enterprise data warehousing. It provides a global view of the data to the server through a single source store, on which periodic analysis can be performed. This yields better results, but the time required is high.

Q. What is the difference between a database, a data warehouse and a data mart?

A database contains a set of sensibly related data, which is normally small in size compared to a data warehouse. A data warehouse holds assortments of all sorts of data, from which data is extracted according to the customer's needs. A data mart, on the other hand, is a set of data designed to cater to the needs of a specific domain; for instance, an organization may keep separate data marts for its sales, finance, and marketing departments.

Q. What is meant by a domain?

When all related nodes and relationships are covered by a single organizational point, it is called a domain. Through this, data management can be improved.

Q. What is the difference between a repository server and a powerhouse?

The repository server controls the complete repository, which includes tables, charts, and various procedures. Its main function is to ensure repository integrity and consistency. A powerhouse server, on the other hand, governs the execution of various processes among the factors of the server's database repository.

Q. How many repositories can be created in informatica?

Any number of repositories can be created in Informatica, but eventually it depends on the number of ports.

Q. What is the benefit of partitioning a session?

Partitioning a session means creating separate, parallel execution sequences within the session. Its main purpose is to improve the server's operation and efficiency: extraction, transformation, and other outputs of the individual partitions are carried out in parallel.

Q. How are indexes created after completing the load process?

To create indexes after the load process, command tasks at the session level can be used. Index-creation scripts can be attached to the session's workflow or the post-session implementation sequence. This type of index creation cannot be controlled at the transformation level.

Q. Explain sessions. Explain how batches are used to combine executions?

A set of instructions that must be executed to transform data from a source to a target is called a session. A session can be carried out using the Session Manager or the pmcmd command. Batch execution can be used to combine session executions either serially or in parallel; batches can hold different sessions running in serial or parallel fashion.

Q. How many number of sessions can one group in batches?

One can group any number of sessions, but migration is easier if the number of sessions in a batch is small.

Q. Explain the difference between mapping parameter and mapping variable?

A mapping variable is a value that can change during the session's execution; upon completion, the Informatica server stores the final value of the variable and reuses it when the session restarts. Values that do not change during the session's execution are called mapping parameters. The mapping defines the parameters and their usage, and values are assigned to these parameters before the session starts.

Q. What is complex mapping?

Following are the features of a complex mapping:

Difficult requirements

Many numbers of transformations

Complex business logic

Q. How can one identify whether mapping is correct or not without connecting session?

One can check whether a mapping is correct without running a session by using the debugging option.

Q. Can one use mapping parameter or variables created in one mapping into any other reusable transformation?

Yes, one can, because a reusable transformation does not contain any mapplet or mapping.

Q. Explain the use of aggregator cache file?

Aggregator transformations process rows in chunks during each run, storing transitional values in local buffer memory. If extra memory is required, the Aggregator uses cache files to store the transformation values.

Q. Briefly describe lookup transformation?

Lookup transformations have access rights to RDBMS-based data sets. The server makes access faster by using lookup tables to look at explicit table data or the database. The final data is obtained by matching the lookup condition for all lookup ports supplied during the transformation.

Q. What does role playing dimension mean?

The dimensions that are utilized for playing diversified roles while remaining in the same database domain are called role playing dimensions.

Q. How can repository reports be accessed without SQL or other transformations?

Repository reports are generated by the Metadata Reporter. There is no need for SQL or other transformations, since it is a web application.

Q. What are the types of metadata stored in the repository?

The repository stores metadata such as source definitions, target definitions, mappings, mapplets, reusable transformations, sessions, and workflows.

Q. Explain the code page compatibility?

When data moves from one code page to another, provided that both code pages have the same character sets, data loss cannot occur. All the characters of the source page must be available in the target page. If all the characters of the source page are not present in the target page, the target is only a subset, and data loss will definitely occur during transformation because the two code pages are not compatible.

Q. How can you validate all mappings in the repository simultaneously?

All the mappings cannot be validated simultaneously because each time only one mapping can be validated.

Q. Describe Expression transformation?

Values can be calculated in single row before writing on the target in this form of transformation. It can be used to perform non aggregate calculations. Conditional statements can also be tested before output results go to target tables.

Q. What do you mean by filter transformation?

It is a means of filtering rows in a mapping. Data is passed through the filter transformation and the filter condition is applied. The filter transformation contains all input/output ports, and only the rows that meet the condition pass through the filter.

Q. What is Joiner transformation?

The Joiner transformation combines two affiliated heterogeneous sources residing in different locations, while a source qualifier transformation can only combine data emerging from a common source.

Q. What is Lookup transformation?

It is used for looking up data in a relational table through a mapping. Lookup definitions can be imported from any relational database to which both the client and the server can connect. One can use multiple lookup transformations in a mapping.

Q. How Union Transformation is used?

It is a diverse input group transformation which can be used to combine data from different sources. It works like UNION All statement in SQL that is used to combine result set of two SELECT statements.
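
Informatica's Union transformation is configured in the Designer rather than written as code, but its UNION ALL semantics can be sketched in Python (the source names and rows below are invented for illustration):

```python
def union_all(*pipelines):
    """Merge rows from several input groups into one pipeline branch.

    Like SQL's UNION ALL (and Informatica's Union transformation),
    duplicate rows are NOT removed and each pipeline's row order is kept.
    """
    merged = []
    for rows in pipelines:
        merged.extend(rows)
    return merged

# Two hypothetical sources sharing one overlapping row
src_a = [("C001", "Alice"), ("C002", "Bob")]
src_b = [("C002", "Bob"), ("C003", "Carol")]

print(union_all(src_a, src_b))  # the duplicate ("C002", "Bob") appears twice
```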

Q. What do you mean Incremental Aggregation?

The incremental aggregation option can be enabled when a session is created for a mapping that includes an Aggregator. PowerCenter then passes changed source data through the mapping and uses the historical cache data to perform new aggregation calculations incrementally.

Q. What is the difference between a connected look up and unconnected look up?

When a lookup takes its inputs directly from other transformations in the pipeline, it is called a connected lookup. An unconnected lookup does not take inputs directly from other transformations; instead it can be invoked from any transformation as a function using a :LKP expression. An unconnected lookup can therefore be called multiple times in a mapping.

Q. What is a mapplet?

A reusable object that is created using the Mapplet Designer is called a mapplet. It contains a set of transformations and permits one to reuse that transformation logic in multiple mappings.

Q. Briefly define reusable transformation?

A reusable transformation can be used numerous times in mappings. It is stored as metadata, separately from the mappings that use it. Whenever any change is made to the reusable transformation, the mappings that use it are invalidated.

Q. What does update strategy mean, and what are the different option of it?

Informatica performs row-by-row processing, and by default every row is marked to be inserted into the target table. An update strategy is used whenever a row has to be updated or inserted based on some condition. The condition must be specified in the update strategy for the processed row to be marked as updated or inserted.

Q. What is the scenario which compels the Informatica server to reject rows?

This happens when the server faces DD_Reject in an update strategy transformation. Rows are also rejected when they violate a database constraint or when a field in the row is truncated.

Q. What is surrogate key?

A surrogate key is a replacement for the natural primary key. It is a unique identifier for each row in the table. It is very beneficial because the natural primary key can change, which eventually makes updates more difficult. A surrogate key is always a number or integer.
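
As an illustrative sketch (not Informatica's Sequence Generator itself), a surrogate key generator could look like this in Python; the class name and natural-key values are invented:

```python
from itertools import count

class SurrogateKeyGenerator:
    """Assigns a stable integer surrogate key to each natural key.

    The natural key (e.g. a customer number) may change over time; the
    surrogate key, once assigned, never does.
    """
    def __init__(self, start=1):
        self._next = count(start)
        self._assigned = {}

    def key_for(self, natural_key):
        if natural_key not in self._assigned:
            self._assigned[natural_key] = next(self._next)
        return self._assigned[natural_key]

gen = SurrogateKeyGenerator()
print(gen.key_for("CUST-ALPHA"))  # 1
print(gen.key_for("CUST-BETA"))   # 2
print(gen.key_for("CUST-ALPHA"))  # 1 -- same row keeps the same key
```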

Q. What are the prerequisite tasks to achieve the session partition?

In order to perform session partitioning, one needs to configure the session to partition source data and install the Informatica server on a machine with multiple CPUs.

Q. Which files are created during a session run by the Informatica server?

During a session run, the server creates files such as the session log, workflow log, reject (bad) files, cache files, and the target output files.

Q. Briefly define a session task?

It is a set of instructions that guides the PowerCenter server about how and when to transfer data from sources to targets.

Q. What does command task mean?

This task permits one or more shell commands (in UNIX) or DOS commands (in Windows) to run during the workflow.

Q. What is standalone command task?

This task can be used anywhere in the workflow to run the shell commands.

Q. What is meant by pre and post session shell command?

A command task can be called as the pre- or post-session shell command for a session task. One can run it as a pre-session command, a post-session success command, or a post-session failure command.

Q. What is predefined event?

It is a file-watch event. It waits for a specific file to arrive at a specific location.

Q. How can you define a user defined event?

User defined event can be described as a flow of tasks in the workflow. Events can be created and then raised as need arises.

Q. What is a work flow?

A workflow is a set of instructions that tells the server how to execute tasks.

Q. What are the different tools in workflow manager?

Following are the different tools in the Workflow Manager:

Task Developer

Worklet Designer

Workflow Designer

Q. Tell me any other tools for scheduling purpose other than workflow manager pmcmd?

The tool for scheduling purpose other than workflow manager can be a third party tool like ‘CONTROL M’.

Q. What is OLAP (On-Line Analytical Processing)?

A method by which multi-dimensional analysis occurs.

Q. What are the different types of OLAP? Give an example?

ROLAP (e.g. BO), MOLAP (e.g. Cognos), HOLAP, DOLAP

Q. What do you mean by worklet?

When workflow tasks are grouped into a reusable set, it is called a worklet. Workflow tasks include timer, decision, command, event wait, mail, session, link, assignment, and control tasks.

Q. What is the use of target designer?

Target Definition is created with the help of target designer.

Q. Where can we find the throughput option in informatica?

The throughput option can be found in the Workflow Monitor: right-click on the session, click on Get Run Properties, and look under Source/Target Statistics.

Q. What is target load order?

Target load order is specified on the basis of the source qualifiers in a mapping. If multiple source qualifiers are linked to different targets, one can specify the order in which the Informatica server loads data into the targets.

Q. What are the differences between Connected and Unconnected Lookup?

The differences are summarized below:

A connected lookup participates in the dataflow and receives input directly from the pipeline; an unconnected lookup receives input values from the result of a :LKP expression in another transformation.

A connected lookup can use both a dynamic and a static cache; an unconnected lookup cache cannot be dynamic.

A connected lookup can return more than one column value (output port); an unconnected lookup can return only one column value, i.e. the return port.

A connected lookup caches all lookup columns; an unconnected lookup caches only the lookup ports used in the lookup condition and the return port.

Q. What can we do to improve the performance of Informatica Aggregator Transformation?

Aggregator performance improves dramatically if records are sorted before passing to the aggregator and “sorted input” option under aggregator properties is checked. The record set should be sorted on those columns that are used in Group By operation.

It is often a good idea to sort the record set at the database level, e.g. inside a source qualifier transformation, unless there is a chance that the already sorted records from the source qualifier can become unsorted again before reaching the aggregator.
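
Why sorted input helps can be sketched in Python: when rows arrive grouped, the aggregator can emit each group as soon as the key changes, holding only one group in memory at a time. This is an illustrative sketch with invented data, not PowerCenter's actual implementation:

```python
from itertools import groupby
from operator import itemgetter

def aggregate_sorted(rows, key_idx=0, val_idx=1):
    """Aggregate pre-sorted rows group by group.

    Because rows arrive sorted on the group-by column, a group is complete
    as soon as the key changes, so only ONE group's running total is ever
    held in memory -- which is why 'Sorted Input' spares the Aggregator
    from caching every group at once.
    """
    for key, grp in groupby(rows, key=itemgetter(key_idx)):
        yield key, sum(r[val_idx] for r in grp)

rows = sorted([("books", 5), ("toys", 2), ("books", 3), ("toys", 4)])
print(list(aggregate_sorted(rows)))  # [('books', 8), ('toys', 6)]
```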

Q. What are the different lookup cache(s)?

Informatica lookups can be cached or un-cached (no cache), and a cached lookup can be either static or dynamic. A static cache is not modified once it is built and remains the same throughout the session run. A dynamic cache, on the other hand, is refreshed during the session run by inserting or updating records in the cache based on the incoming source data. By default, an Informatica lookup cache is static.

A lookup cache can also be classified as persistent or non-persistent, based on whether Informatica retains the cache after the session run completes or deletes it.
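
The static vs. dynamic distinction can be sketched in Python; the cache layout and row values here are invented for illustration, not Informatica internals:

```python
def process_with_dynamic_cache(cache, source_rows, key="id"):
    """Sketch of a dynamic lookup cache: each incoming source row either
    inserts a new entry into the cache or updates an existing one, so the
    cache stays synchronized with the target as the session runs.
    A static cache, by contrast, would never be modified once built."""
    actions = []
    for row in source_rows:
        k = row[key]
        if k not in cache:
            cache[k] = dict(row)        # new row: insert into cache
            actions.append(("insert", k))
        elif cache[k] != row:
            cache[k] = dict(row)        # changed row: update the cache
            actions.append(("update", k))
        else:
            actions.append(("no change", k))
    return actions

cache = {1: {"id": 1, "city": "Pune"}}
rows = [{"id": 1, "city": "Mumbai"}, {"id": 2, "city": "Delhi"}]
print(process_with_dynamic_cache(cache, rows))
```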

Q. How can we update a record in target table without using Update strategy?

A target table can be updated without using an 'Update Strategy' transformation. For this, we need to define the key of the target table at the Informatica level and then connect the key and the field we want to update in the mapping target. At the session level, we should set the target property to "Update as Update" and check the "Update" check-box.

Let’s assume we have a target table “Customer” with fields as “Customer ID”, “Customer Name” and “Customer Address”. Suppose we want to update “Customer Address” without an Update Strategy. Then we have to define “Customer ID” as primary key in Informatica level and we will have to connect Customer ID and Customer Address fields in the mapping. If the session properties are set correctly as described above, then the mapping will only update the customer address field for all matching customer IDs.
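
The "Update as Update" behaviour described above can be sketched in Python (the field names follow the Customer example; the code is illustrative, not PowerCenter internals):

```python
def update_as_update(target_rows, incoming_rows, key="Customer ID"):
    """'Update as Update' semantics: an incoming row whose key matches an
    existing target row overwrites the connected fields; rows with no
    matching key are ignored (nothing is inserted)."""
    by_key = {row[key]: row for row in target_rows}
    for row in incoming_rows:
        if row[key] in by_key:
            by_key[row[key]].update(row)
    return target_rows

target = [{"Customer ID": 1, "Customer Address": "old street"}]
update_as_update(target, [
    {"Customer ID": 1, "Customer Address": "new street"},  # matched: updated
    {"Customer ID": 2, "Customer Address": "ignored"},     # no match: dropped
])
print(target)
```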

Q. Under what condition selecting Sorted Input in aggregator may fail the session?

If the input data is not sorted correctly, the session will fail.

Also if the input data is properly sorted, the session may fail if the sort order by ports and the group by ports of the aggregator are not in the same order.

Q. Why is Sorter an Active Transformation?

This is because we can select the “distinct” option in the sorter property.

When the Sorter transformation is configured to treat output rows as distinct, it assigns all ports as part of the sort key. The Integration Service discards duplicate rows compared during the sort operation. The number of Input Rows will vary as compared with the Output rows and hence it is an Active transformation.

Q. Is lookup an active or passive transformation?

From Informatica 9.x onwards, the Lookup transformation can be configured as an "Active" transformation.

However, in the older versions of Informatica, lookup used to be a passive transformation.

Q. What is the difference between Static and Dynamic Lookup Cache?

We can configure a Lookup transformation to cache the underlying lookup table. In case of static or read-only lookup cache the Integration Service caches the lookup table at the beginning of the session and does not update the lookup cache while it processes the Lookup transformation.

In case of dynamic lookup cache the Integration Service dynamically inserts or updates data in the lookup cache and passes the data to the target. The dynamic cache is synchronized with the target.

In case you are wondering why do we need to make lookup cache dynamic, read this article on dynamic lookup

Q. What is the difference between STOP and ABORT options in Workflow Monitor?

When we issue the STOP command on the executing session task, the Integration Service stops reading data from source. It continues processing, writing and committing the data to targets. If the Integration Service cannot finish processing and committing data, we can issue the abort command.

In contrast ABORT command has a timeout period of 60 seconds. If the Integration Service cannot finish processing and committing data within the timeout period, it kills the DTM process and terminates the session.

Q. What are the new features of Informatica 9.x in developer level?

From a developer’s perspective, some of the new features in Informatica 9.x are as follows:

Now Lookup can be configured as an active transformation – it can return multiple rows on successful match

Now you can write SQL override on un-cached lookup also. Previously you could do it only on cached lookup

You can control the size of your session log. In a real-time environment you can control the session log file size or time

Database deadlock resilience feature – this will ensure that your session does not immediately fail if it encounters any database deadlock, it will now retry the operation again. You can configure number of retry attempts.

Q. Does an Informatica Transformation support only Aggregate expressions?

No. Apart from aggregate expressions, the Aggregator transformation also supports non-aggregate expressions and conditional clauses.

Q. How does Aggregator Transformation handle NULL values?

By default, the Aggregator transformation treats null values as NULL in aggregate functions, but we can configure it to treat null values in aggregate functions as either NULL or zero.

Q. What is Incremental Aggregation?

We can enable the session option, Incremental Aggregation for a session that includes an Aggregator Transformation. When the Integration Service performs incremental aggregation, it actually passes changed source data through the mapping and uses the historical cache data to perform aggregate calculations incrementally.

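
The idea of incremental aggregation can be sketched in Python: only the changed source rows are folded into totals carried over in the historical cache. The group names and amounts below are invented:

```python
def incremental_aggregate(historical_cache, changed_rows):
    """Fold only the changed source rows into running totals kept in the
    historical cache, instead of re-aggregating the entire source."""
    for group, amount in changed_rows:
        historical_cache[group] = historical_cache.get(group, 0) + amount
    return historical_cache

cache = {"2023": 100}                      # totals from previous runs
incremental_aggregate(cache, [("2023", 5), ("2024", 7)])
print(cache)  # {'2023': 105, '2024': 7}
```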
Q. What are the performance considerations when working with Aggregator Transformation?

Filter the unnecessary data before aggregating it. Place a Filter transformation in the mapping before the Aggregator transformation to reduce unnecessary aggregation.

Improve performance by connecting only the necessary input/output ports to subsequent transformations, thereby reducing the size of the data cache.

Use Sorted input which reduces the amount of data cached and improves session performance.

Q. What differs when we choose Sorted Input for Aggregator Transformation?

The Integration Service creates index and data cache files in memory to process the Aggregator transformation. If it requires more space than allocated for the index and data cache sizes in the transformation properties, it stores overflow values in cache files, i.e. it pages to disk. One way to increase session performance is to increase the index and data cache sizes in the transformation properties. But when we check Sorted Input, the Integration Service processes the Aggregator transformation in memory and does not use cache files.

Q. Under what conditions selecting Sorted Input in aggregator will still not boost session performance?

Incremental Aggregation, session option is enabled.

The aggregate expression contains nested aggregate functions.

The session is configured to treat source rows as data driven.

Q. Suppose we do not group by on any ports of the aggregator. What will be the output?

If we do not group values, the Integration Service will return only the last row for the input rows.

Q. What is the expected value if the column in an aggregator transform is neither a group by nor an aggregate expression?

The Integration Service produces one row for each group based on the group-by ports. A column which is neither part of the group-by key nor an aggregate expression returns the corresponding value from the last record of the group received. However, if we specifically use the FIRST function, the Integration Service returns the value from the first row of the group instead. So the default behaviour is that of the LAST function.
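
The LAST-by-default behaviour can be sketched in Python with an invented group of rows:

```python
def group_output(rows, passthrough_idx):
    """For a port that is neither a group-by port nor an aggregate
    expression, the Integration Service returns the value from the LAST
    row of the group by default; FIRST() would return the first row's."""
    last = rows[-1][passthrough_idx]   # default LAST semantics
    first = rows[0][passthrough_idx]   # what FIRST(...) would return
    return last, first

# One invented group of rows: (group key, pass-through month column)
group_rows = [("books", "jan"), ("books", "feb"), ("books", "mar")]
print(group_output(group_rows, 1))  # ('mar', 'jan')
```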

Q. Give one example for each of Conditional Aggregation, Non-Aggregate expression and Nested Aggregation.

Use conditional clauses in the aggregate expression to reduce the number of rows used in the aggregation. The conditional clause can be any clause that evaluates to TRUE or FALSE.

SUM( SALARY, JOB = 'CLERK' )

Use non-aggregate expressions in group by ports to modify or replace groups.

IIF( PRODUCT = 'Brown Bread', 'Bread', PRODUCT )

The expression can also include one aggregate function within another aggregate function, such as:

MAX( COUNT( PRODUCT ))
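
A Python analogue of conditional aggregation (the staff rows are invented) shows how the condition limits which rows feed the aggregate:

```python
def conditional_sum(rows, value_key, cond):
    """Python analogue of SUM(SALARY, JOB = 'CLERK'): only rows for which
    the condition evaluates to TRUE contribute to the aggregate."""
    return sum(r[value_key] for r in rows if cond(r))

# Invented staff rows
staff = [{"job": "CLERK", "salary": 800},
         {"job": "MANAGER", "salary": 2000},
         {"job": "CLERK", "salary": 950}]
print(conditional_sum(staff, "salary", lambda r: r["job"] == "CLERK"))  # 1750
```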

Q. What is a Rank Transform?

Rank is an Active Connected Informatica transformation used to select a set of top or bottom values of data.

Q. How does a Rank Transform differ from Aggregator Transform functions MAX and MIN?

Like the Aggregator transformation, the Rank transformation lets us group information. The Rank Transform allows us to select a group of top or bottom values, not just one value as in case of Aggregator MAX, MIN functions.

Q. What is a RANK port and RANKINDEX?

The Rank port is an input/output port used to specify the column on which we want to rank the source values. By default, Informatica creates an output port RANKINDEX for each Rank transformation; it stores the ranking position of each row in a group.

Q. How can you get ranks based on different groups?

Rank transformation lets us group information. We can configure one of its input/output ports as a group by port. For each unique value in the group port, the transformation creates a group of rows falling within the rank definition (top or bottom, and a particular number in each rank).

Q. What happens if two rank values match?

If two rank values match, they receive the same value in the rank index and the transformation skips the next value.
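
The tie-handling rule can be sketched in Python; the values are invented for illustration:

```python
def rank_index(values, top=True):
    """Assign RANKINDEX values the way the Rank transformation does:
    tied values share the same rank and the next rank position is
    skipped (300, 300, 200 ranks as 1, 1, 3)."""
    ordered = sorted(values, reverse=top)
    ranks, prev = {}, object()
    for pos, v in enumerate(ordered, start=1):
        if v != prev:
            ranks[v] = pos   # ties keep the first position seen
            prev = v
    return [ranks[v] for v in values]

print(rank_index([300, 200, 300, 100]))  # [1, 3, 1, 4]
```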

Q. What are the restrictions of Rank Transformation?

We can connect ports from only one transformation to the Rank transformation.

We can select the top or bottom rank.

We need to select the Number of records in each rank.

We can designate only one Rank port in a Rank transformation.

Q. How does a Rank Cache work?

During a session, the Integration Service compares an input row with rows in the data cache. If the input row out-ranks a cached row, the Integration Service replaces the cached row with the input row. If we configure the Rank transformation to rank based on different groups, the Integration Service ranks incrementally for each group it finds. The Integration Service creates an index cache to store the group information and a data cache for the row data.
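
The replace-if-out-ranked behaviour of the rank data cache resembles a bounded top-N structure; here is an illustrative Python sketch using a heap (not Informatica's actual cache code):

```python
import heapq

def top_n_cache(incoming_rows, n):
    """Keep only the current top-n values, replacing the weakest cached
    value whenever an incoming value out-ranks it -- the gist of how the
    rank data cache stays bounded during a session."""
    cache = []                            # min-heap: weakest value on top
    for value in incoming_rows:
        if len(cache) < n:
            heapq.heappush(cache, value)
        elif value > cache[0]:            # incoming row out-ranks a cached row
            heapq.heapreplace(cache, value)
    return sorted(cache, reverse=True)

print(top_n_cache([5, 1, 9, 3, 7], 3))  # [9, 7, 5]
```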

Q. How does Rank transformation handle string values?

The Rank transformation can return strings at the top or the bottom of the session sort order. When the Integration Service runs in Unicode mode, it sorts character data using the sort order associated with the Integration Service's code page, which may be French, German, etc. When the Integration Service runs in ASCII mode, it ignores this setting and uses a binary sort order to sort character data.

Q. What is a Sorter Transformation?

Sorter Transformation is an Active, Connected Informatica transformation used to sort data in ascending or descending order according to specified sort keys. The Sorter transformation contains only input/output ports.

Q. How does Sorter handle Case Sensitive sorting?

The Case Sensitive property determines whether the Integration Service considers case when sorting data. When we enable the Case Sensitive property, the Integration Service sorts uppercase characters higher than lowercase characters.

Q. How does Sorter handle NULL values?

We can configure the way the Sorter transformation treats null values. Enable the property Null Treated Low if we want to treat null values as lower than any other value when it performs the sort operation. Disable this option if we want the Integration Service to treat null values as higher than any other value.
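
The 'Null Treated Low' property can be sketched in Python, treating None as NULL; this is an illustrative sketch, not the Sorter's actual implementation:

```python
def sort_rows(values, null_treated_low=True, ascending=True):
    """Sort values with explicit NULL placement, mimicking the Sorter's
    'Null Treated Low' property: when enabled, None sorts lower than any
    other value; when disabled, None sorts higher than any other value."""
    def key(v):
        if v is None:
            return (0 if null_treated_low else 1, 0)
        return (0.5, v)  # non-null values sit between the two null ranks
    return sorted(values, key=key, reverse=not ascending)

print(sort_rows([3, None, 1]))                          # [None, 1, 3]
print(sort_rows([3, None, 1], null_treated_low=False))  # [1, 3, None]
```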

Q. How does a Sorter Cache work?

The Integration Service passes all incoming data into the Sorter Cache before Sorter transformation performs the sort operation.

The Integration Service uses the Sorter Cache Size property to determine the maximum amount of memory it can allocate to perform the sort operation. If it cannot allocate enough memory, the Integration Service fails the session. For best performance, configure Sorter cache size with a value less than or equal to the amount of available physical RAM on the Integration Service machine.

If the amount of incoming data is greater than the amount of Sorter cache size, the Integration Service temporarily stores data in the Sorter transformation work directory. The Integration Service requires disk space of at least twice the amount of incoming data when storing data in the work directory.

Q. What is a Union Transformation?

The Union transformation is an Active, Connected, non-blocking multiple input group transformation used to merge data from multiple pipelines or sources into one pipeline branch. Like the UNION ALL SQL statement, the Union transformation does not remove duplicate rows.

Q. What are the restrictions of Union Transformation?

All input groups and the output group must have matching ports. The precision, datatype, and scale must be identical across all groups.

We can create multiple input groups, but only one default output group.

The Union transformation does not remove duplicate rows.

We cannot use a Sequence Generator or Update Strategy transformation upstream from a Union transformation.

The Union transformation does not generate transactions.

Q. What is Persistent Lookup Cache?

Lookups are cached by default in Informatica. Lookup cache can be either non-persistent or persistent. The Integration Service saves or deletes lookup cache files after a successful session run based on whether the Lookup cache is checked as persistent or not.

Q. What is the difference between Reusable transformation and Mapplet?

Any Informatica transformation created in the Transformation Developer, or a non-reusable transformation promoted to reusable from the Mapping Designer, which can be used in multiple mappings, is known as a reusable transformation. When we add a reusable transformation to a mapping, we actually add an instance of the transformation. Since the instance of a reusable transformation is a pointer to that transformation, when we change the transformation in the Transformation Developer, its instances reflect these changes.

A Mapplet is a reusable object created in the Mapplet Designer which contains a set of transformations and lets us reuse the transformation logic in multiple mappings. A Mapplet can contain as many transformations as we need. Like a reusable transformation when we use a mapplet in a mapping, we use an instance of the mapplet and any change made to the mapplet is inherited by all instances of the mapplet.

Q. What are the ERROR tables present in Informatica?

PMERR_DATA – stores data and metadata about a transformation row error and its corresponding source row.

PMERR_MSG – stores metadata about an error and the error message.

PMERR_SESS – stores metadata about the session.

PMERR_TRANS – stores metadata about the source and transformation ports, such as name and datatype, when a transformation error occurs.

Q. Can we copy a session to new folder or new repository?

Yes, we can copy a session to a new folder or repository, provided the corresponding mapping is already present there.