ETL Testing Interview Questions and Answers

ETL testing is also known as data warehouse testing. The ETL process became a popular concept in the 1970s. The process of extracting data from source systems and bringing it into the data warehouse is commonly called ETL, which stands for extraction, transformation, and loading. Note that ETL refers to a broad process, not three well-defined steps. The acronym ETL is perhaps too simplistic, because it omits the transportation phase and implies that each of the other phases of the process is distinct. Nevertheless, the entire process is known as ETL.

The methodology and tasks of ETL have been well known for many years and are not necessarily unique to data warehouse environments: a wide variety of proprietary applications and database systems form the IT backbone of any enterprise. Data has to be shared between applications or systems in order to integrate them, giving at least two applications the same picture of the world. This data sharing was mostly addressed by mechanisms similar to what we now call ETL.

ETL testing is the process of verifying that the data loaded from a source to a destination after business transformation is accurate. It also involves the verification of data at the various intermediate stages used between source and destination. Because the ETL process involves a number of steps, it also needs to be tested in several ways.
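As a concrete illustration, one common ETL test reconciles the source and target after a load by comparing row counts and column aggregates. The following is a minimal Python sketch of that idea; the table names (orders, dw_orders) and the amount column are hypothetical examples, and the in-memory SQLite databases stand in for the real source system and warehouse.

import sqlite3

def fetch_scalar(conn, query):
    """Run a query that returns a single value and return it."""
    return conn.execute(query).fetchone()[0]

def reconcile(source_conn, target_conn):
    # 1. Row-count check: every extracted row should have been loaded.
    src_count = fetch_scalar(source_conn, "SELECT COUNT(*) FROM orders")
    tgt_count = fetch_scalar(target_conn, "SELECT COUNT(*) FROM dw_orders")
    print("row_count:", "PASS" if src_count == tgt_count else "FAIL")

    # 2. Aggregate check: totals should survive the transformation unchanged.
    src_sum = fetch_scalar(source_conn, "SELECT SUM(amount) FROM orders")
    tgt_sum = fetch_scalar(target_conn, "SELECT SUM(amount) FROM dw_orders")
    print("amount_sum:", "PASS" if src_sum == tgt_sum else "FAIL")

if __name__ == "__main__":
    # In-memory stand-ins; in practice these point at the source and the warehouse.
    source = sqlite3.connect(":memory:")
    source.executescript("CREATE TABLE orders(amount REAL); INSERT INTO orders VALUES (10),(20);")
    target = sqlite3.connect(":memory:")
    target.executescript("CREATE TABLE dw_orders(amount REAL); INSERT INTO dw_orders VALUES (10),(20);")
    reconcile(source, target)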

The tracing level is the amount of data stored in the log files. Tracing levels can be classified into two types: normal and verbose. The normal level logs detailed information about the session run as a whole, while the verbose level logs information for each and every row that is processed.
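In Informatica the tracing level is set in the session or transformation properties, not in code, but the difference between a summary-level log and a row-level log can be sketched with Python's standard logging module as an analogy only.

import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("etl")

def run_transformation(rows, verbose=False):
    processed = 0
    for row in rows:
        if verbose:
            # "Verbose"-style tracing: one entry for each and every row.
            log.info("processing row %s", row)
        processed += 1
    # "Normal"-style tracing: summary information about the run.
    log.info("transformation finished, %d rows processed", processed)

run_transformation([{"id": 1}, {"id": 2}], verbose=True)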

A fact table without measures is known as a factless fact table. It can be used to count or track the occurrence of events; for example, it can record an event such as employee attendance in a company. The numeric data based on columns in a fact table is known as measures.
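The sketch below illustrates the idea with a hypothetical factless fact table for training attendance: the table holds only foreign keys and no measures, and counting its rows answers questions such as how many employees attended on a given day. The table and column names are examples, not a standard schema.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_training_attendance (   -- factless fact table
        date_key     INTEGER NOT NULL,        -- FK to the date dimension
        employee_key INTEGER NOT NULL,        -- FK to the employee dimension
        course_key   INTEGER NOT NULL         -- FK to the course dimension
    );
    INSERT INTO fact_training_attendance VALUES
        (20240101, 1, 10),
        (20240101, 2, 10),
        (20240102, 1, 11);
""")

# The "measure" is simply the count of event rows.
for row in conn.execute("""
        SELECT date_key, COUNT(*) AS attendees
        FROM fact_training_attendance
        GROUP BY date_key"""):
    print(row)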

To improve performance, transactions are subdivided; this is called partitioning. Partitioning enables the Informatica Server to create multiple connections to various sources. The types of partitions are described below, followed by a short sketch of both strategies.

Round-Robin Partitioning: Informatica distributes data evenly among all partitions. This partitioning is applicable when the number of rows to be processed in each partition is approximately the same.

Hash Partitioning: The Informatica Server applies a hash function to the partitioning keys in order to group data among partitions. It is used when you need to ensure that groups of rows with the same partitioning key are processed in the same partition.
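The following simplified Python sketch shows how the two distribution strategies differ; it is not Informatica's implementation, only an illustration of the idea. Round-robin balances row counts, while hashing keeps rows with the same key together.

from collections import defaultdict

def round_robin_partition(rows, num_partitions):
    """Distribute rows evenly across partitions, one after another."""
    partitions = defaultdict(list)
    for i, row in enumerate(rows):
        partitions[i % num_partitions].append(row)
    return partitions

def hash_partition(rows, num_partitions, key):
    """Send rows with the same partitioning key to the same partition."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[hash(row[key]) % num_partitions].append(row)
    return partitions

rows = [{"customer": "A", "amount": 10},
        {"customer": "B", "amount": 20},
        {"customer": "A", "amount": 30}]

print(dict(round_robin_partition(rows, 2)))      # roughly equal row counts
print(dict(hash_partition(rows, 2, "customer"))) # both "A" rows land together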

The advantage of using the DataReader Destination Adapter is that it populates an ADO record set (consisting of records and columns) in memory and exposes the data from the Data Flow task by implementing the DataReader interface, so that other applications can consume the data.
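The real component is an SSIS destination consumed through the ADO.NET DbDataReader interface; purely as an analogy, the consumption pattern can be sketched in Python as an in-memory record set exposed through a forward-only reader that a separate consumer pulls from.

class InMemoryReader:
    def __init__(self, columns, records):
        self.columns = columns          # column names of the record set
        self._records = iter(records)   # rows produced by the data flow

    def read(self):
        """Return the next record, or None when the record set is exhausted."""
        return next(self._records, None)

# "Data flow" side: populate the record set in memory.
reader = InMemoryReader(["id", "name"], [(1, "alpha"), (2, "beta")])

# "Consumer" side: another application pulls the rows one at a time.
while (record := reader.read()) is not None:
    print(dict(zip(reader.columns, record)))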

A data source view allows you to define the relational schema that will be used in the Analysis Services databases. Dimensions and cubes are created from data source views rather than directly from data source objects.

Import the source into the Source Analyzer. PowerConnect acts as a gateway between Informatica and SAP. The next step is to generate the ABAP code for the mapping; only then can Informatica pull data from SAP.

PowerConnect is used to connect to and import sources from external systems.