Data Factory

Data Factory is an open framework and toolkit for creating data flows that collect, inspect, process and publish data. With Data Factory, you can fully automate your data collection, processing and publication.

Why use an open framework?

No proprietary tooling and no vendor lock-in

Easily extensible for adding custom functionality

All outputs and artifacts are standards-based and fully portable

The Flow of the Data Factory Process

We have created an open ecosystem of tools and specifications for powerful and frictionless data processing flows.

Various Sources → Normalization & Cleaning → Transformation & Modification → Verification & Validation → Various Outputs

01 Loading data from various sources and file types

02 Normalizing, cleaning and tidying the data - making it documented and portable

03 Transforming the data - changing the structure and/or the contents of the data, combining with other datasets etc.

04 Making sure that the data is correct, valid and adheres to your own verification rules

05 Storing the processed data in any file or data storage system
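
To make the five steps above concrete, here is a minimal sketch of such a flow in plain Python, using only the standard library. It does not use Data Factory's own tooling: the function names (load_rows, normalize, transform, validate, store, run_flow) and the sample city data are hypothetical and chosen only for illustration.

```python
import csv
import io
import json


def load_rows(csv_text):
    """Step 01: load rows from a source (here, an in-memory CSV string)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def normalize(rows):
    """Step 02: tidy the data by trimming whitespace and lowercasing headers."""
    return [{k.strip().lower(): v.strip() for k, v in row.items()} for row in rows]


def transform(rows):
    """Step 03: change the contents, here by casting population to an integer."""
    return [{**row, "population": int(row["population"])} for row in rows]


def validate(rows):
    """Step 04: apply a custom verification rule and fail loudly if it is broken."""
    for row in rows:
        if row["population"] < 0:
            raise ValueError(f"negative population in {row!r}")
    return rows


def store(rows):
    """Step 05: serialize to JSON; a real flow would write to a file or database."""
    return json.dumps(rows, indent=2)


def run_flow(csv_text):
    """Chain the five steps in order (all names here are illustrative)."""
    return store(validate(transform(normalize(load_rows(csv_text)))))


# Hypothetical sample input with untidy headers and values.
RAW = " City , Population \n London , 8900000 \n Paris , 2100000 \n"

if __name__ == "__main__":
    print(run_flow(RAW))
```

Each step takes rows in and hands rows on, so a step can be swapped or extended without touching the others, which is the property an extensible flow depends on.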

Why Data Factory?

A professionally selected collection of the best tools and practices

An end-to-end solution

All parts are fully integrated

Backed by a team of professionals with years of experience in similar projects

Use Data Factory for your project

Get our Professional Support

At Datahub, we provide solutions to publish and deploy your data with power and simplicity. Datahub is the fastest way for individuals, teams and organizations to publish, deploy and share their data.