How to extract data from several sources and load it into the same target table

DATA TRANSFORMATION

How to apply transformation logic to data

DATA LOADING

How to load data into the final target table

Introduction to Data Mart

DATA WAREHOUSING LIFE CYCLE

Types of Tables:

Fact Table

Dimension Table

Schemas:

Star Schema

Snowflake Schema

Fact Constellation Schema

Week-2

Hadoop Real Time Cluster with Hue
Types of Nodes in Hadoop
Daemons in Hadoop Ecosystem
HDFS Daemons
YARN Daemons
Live Hadoop Demo with all the components
Hue hands-on with HDFS
Hue hands-on with Hive

SCD (Slowly Changing Dimensions)

The different types of SCDs that developers build and testers must test, depending on each client's data needs

We implement SCD-1 and SCD-2 in class to make the implementation transparent, and explain SCD-3 to show how to work with limited historical data, which is sometimes required

Types:

SCD-1

SCD-2

SCD-3
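
The difference between SCD-1 and SCD-2 can be sketched in a few lines of Python. This is only an illustration, not the class implementation: the `CustomerRow` type, its column names (`valid_from`, `valid_to`, `is_current`) and the two helper functions are assumptions made for the example. SCD-1 overwrites the attribute and keeps no history; SCD-2 closes the current row and appends a new current row.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class CustomerRow:
    # Hypothetical dimension row; column names are illustrative only.
    customer_id: int
    city: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the row is still open
    is_current: bool = True


def apply_scd1(rows: List[CustomerRow], customer_id: int, new_city: str) -> List[CustomerRow]:
    """SCD-1: overwrite the attribute in place, so no history is kept."""
    return [
        replace(r, city=new_city) if r.customer_id == customer_id else r
        for r in rows
    ]


def apply_scd2(rows: List[CustomerRow], customer_id: int, new_city: str,
               change_date: date) -> List[CustomerRow]:
    """SCD-2: expire the current row and append a new current row."""
    result = []
    changed = False
    for r in rows:
        if r.customer_id == customer_id and r.is_current and r.city != new_city:
            # Close the old version instead of overwriting it.
            result.append(replace(r, valid_to=change_date, is_current=False))
            changed = True
        else:
            result.append(r)
    if changed:
        result.append(CustomerRow(customer_id, new_city, change_date))
    return result


dim = [CustomerRow(1, "Pune", date(2020, 1, 1))]
scd1_result = apply_scd1(dim, 1, "Delhi")
scd2_result = apply_scd2(dim, 1, "Delhi", date(2021, 6, 1))
```

A tester would typically check that the SCD-1 result still has one row per customer, while the SCD-2 result has exactly one current row per customer plus the expired history rows.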

Week-3

ETL Tool Implementation and Testing Concepts

Basic Concepts in SQL (Select, Update, Insert and Delete)

How to design frequently used testing queries that validate source and target data for both data quality and correctness, to confirm that the expected data reached the intermediate and final target tables.

Overview of ETL Tool Architecture

How an ETL tool works and why ETL tools are needed in the market

What are the similarities between ETL tools and SQL

Why we use ETL tools when SQL is the one standard that any database or data warehouse can use

How ETL tools work at the back end

Different ETL Tools in the market

What types of tasks a tester performs on ETL for testing purposes

Informatica PowerCenter (Leading tool in the market)

Week-4

Informatica PowerCenter Mapping and Use Cases

ETL Logic in Informatica PowerCenter

Informatica usage with Data Warehouse Projects

Why we use Informatica with Big Data Tools

Differentiate between HQL, Pig Latin and Informatica

Informatica basic mappings with test cases

Informatica complex mappings with test cases

Week-5

Data Sources, Data Ingestion in Hadoop and SQOOP

Data Validation in Hadoop using SQOOP scripts

SQOOP Performance testing

Data Truncation and count

MINUS Query

COUNT Query

UNION Query

JOIN Query
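
The COUNT, MINUS and JOIN checks listed above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration under stated assumptions: the `src` and `tgt` tables and their sample rows are hypothetical, and SQLite spells the MINUS operator as `EXCEPT` (Oracle uses `MINUS`).

```python
import sqlite3

# In-memory database with a hypothetical source and target table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE src (id INTEGER, name TEXT);
CREATE TABLE tgt (id INTEGER, name TEXT);
INSERT INTO src VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO tgt VALUES (1, 'a'), (2, 'b');
""")


def row_counts(conn):
    """COUNT check: compare row counts between source and target."""
    src = conn.execute("SELECT COUNT(*) FROM src").fetchone()[0]
    tgt = conn.execute("SELECT COUNT(*) FROM tgt").fetchone()[0]
    return src, tgt


def missing_in_target(conn):
    """MINUS check: rows present in source but absent from target."""
    return conn.execute(
        "SELECT id, name FROM src EXCEPT SELECT id, name FROM tgt ORDER BY id"
    ).fetchall()


def unmatched_by_join(conn):
    """JOIN check: source keys with no matching key in target."""
    return conn.execute(
        "SELECT s.id FROM src s LEFT JOIN tgt t ON s.id = t.id "
        "WHERE t.id IS NULL ORDER BY s.id"
    ).fetchall()


counts = row_counts(conn)
missing = missing_in_target(conn)
unmatched = unmatched_by_join(conn)
```

With the sample rows above, the counts differ and both the MINUS and JOIN checks point at the same source row that never reached the target. Running the MINUS check in both directions (source minus target, then target minus source) also catches unexpected extra rows in the target.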

Week-6

Hadoop Ecosystem

Data Lake and Layers

Hive Query Testing in Hadoop Ecosystem

Different Interfaces available to access Hadoop Ecosystem for Tester

Real time HQL scripts for validation and testing

Test use cases in Hive layers: Staging, Data Store and Data Warehouse

Hive performance testing

Shell Scripts to test commands

Additional benefits:

Interview Questions

Resume preparation

Real-time scenario examples and solution discussion

Assignments

Cloudera and Hortonworks

Training with a real-time project by a working IT professional

This course is designed around a combination of Informatica, Big Data and the Hadoop Ecosystem