3 Data Warehouse Oracle Warehouse Builder Oracle OLAP/ Data MinerAs you know, data warehouse is a repository of an organization's electronically stored data which is designed to facilitate reporting and analysis.When you build a data warehouse system, you can consider the building process as two parts. One is to collect data from different data sources, understand it and integrate it into target platform. It is called ETL process. The other part is to find pattern, predict value and generate report after building one unified data repository.In the mining project, MCMS, we are using oracle warehouse builder as an ETL processing tool to build a data warehouse, and for later part, we are currently using oracle data miner. In this presentation I will briefly review oracle warehouse builder and then my collegue Ying’ will present about data mining.Oracle Warehouse BuilderOracle OLAP/ Data MinerFind PatternPredict Behaviour or value(Classification/ Regression)Generate ReportETL (Extract/ Transform/ Load)Data Quality ControlMeta data Management“one of the major ETL tools in the market “

4 ETL ProcessExtract: extract data from sources and put in a so-called Staging Area(SA),usually with the same structure as the source.Here, the abbreviation of ETL means extract, transform and load. extract data from sources and put it in the target platform usually with the same structure as the source. after extraction all data will be located at one platform so you can easily join and union tables and filters and sort the calculations. In this step, you can check on data quality and cleans the data if necessary. Many experts recommend cleans data in this step.Load. Finally data is loaded into a central warehouse, usually into fact and dimension tables.Transform: join and union tables, filter and sort the calculations. In thisstep, we can check on data quality and cleans the data if necessary.Load: finally, data is loaded into a central warehouse, usually into factand dimension tables.

5 OWB ArchitectureOWB, oracle warehouse builder is one of the major ETL tool in the market and you can use it as free for academic usage.The OWB architecture looks like this. In the client side, you have two application called design centre and repository browser. Repository borswer is web based repository viewing tool and design centre is GUI based application which is mainly used to design warehouse and conduct ETP process. In development time, you login to oracle database server through design centre, design metadata and ETL processes, and then all what you’ve done is saved into server repository called warehouse builder repository. If you deploy and execute your designs in runtime, they are implemented as database objects and scripts suchas PL/SQL and stored into oracle data base.

6 Design CentreThe design centre have many functionality for data modelling, data compliance, integration and quality management.Details for each function will be followed on the rest of this presentation.

10 Data Quality ManagementData ProfilingMissing or invalid valuesDistributions of the values in a specific columnData Rule for CleansingWhen you build a data warehouse system, mange data quality is the one of the most important thing. Oracle warehouse builder have a data profiling tool, you can check your data, any missing data or invalid data and then you can make some cleansing strategy here.Once you set the data rules to cleanse your data, the tool deals with detecting and removing errors and inconsistencies from data in order to improve the quality of data. I will show you how it works later on demonstration session.Based on the data profiling results, Warehouse Builder derives a set of data rules that you can use to cleanse the source data. You can automatically generate corrections based on these data rules by performing data correction actions.

11 Metadata Management Dependency ManagementData Lineage at attribute levelImpact Analysis at attribute levelMetadata SnapshotsChange Management (diff, merge and reconcile)Reporting (browser)APIs (Scripting, SQL, PL/SQL)Exchange (import/export)Finally, about metadata such as table and view, you can check dependencies between metadata and you can get some idea how any change of metadata will affect the others. And also you can audit the metadata changes.

12 Define Sources & TargetsDemonstrationDefine Sources & TargetsExtractData Profiling1. Identifying datasources/ targets and importing metadata2. Import data and design and execute mappings (Extract)3. Data profiling and decide data cleansing strategy“Derived Data Rule”“Generated Code”TransformLoadnow let me give you a demonstration using OWB step by step. First I will define data sources and targets and then import data from source and design and execute mappings to extract the data. Before merging data from different sources, I will do data profiling first and make sure the data is cleansed. Finally, the data are cleansed, merged and ready to be used for reporting and mining.4. Design and executemappings (Merging) and cleansing5. Design dimension tables“Generated Code”