TensorFlow Extended: Data Validation and Transform

Companies are looking for ways to incorporate machine learning into their business to lower costs and increase revenue. But machine learning models are only as good as their training data, which is often generated by ad hoc pipelines involving multiple products, systems, and usage logs. Code bugs, system failures, or human errors can occur at multiple points of this generation process. As a result, understanding the data and finding any anomalies early is critical for preventing data errors downstream. As a machine learning platform scales to larger data and runs continuously, there's a strong need for a reusable component that enables rigorous checks for data quality and promotes best practices for data management.

Join expert Armen Donigian to explore the crucial skills of data analysis, transformation, and validation. Over three hours, you'll gain hands-on experience designing and transforming features, running experiments, and analyzing, serving, and profiling machine learning models using the recently open-sourced TensorFlow Extended (TFX), which allows you to leverage the state-of-the-art technology that powers most of Google’s ML systems to solve your particular business or scientific problems.

What you'll learn and how you can apply it

By the end of this live online course, you’ll understand:

The problems TFX can help you solve

How to integrate various parts of the TensorFlow ecosystem

How to validate your data using TensorFlow Data Validation

How to transform & process features using TensorFlow Transform

And you’ll be able to:

Apply the same design and implementation principles from the technology that made Google successful to your specific project

About your instructor

Armen Donigian has undergraduate and graduate degrees in Computer Science from UCLA and USC. He started his career building tracking & navigation algorithms at Orincon (later acquired by Lockheed Martin). Armen then joined the Global Differential GPS group at Jet Propulsion Laboratory (NASA), performing clock and orbit corrections using GPS/GLONASS satellites, which were also used for testing the Mars Science Laboratory Curiosity rover.

Bitten by the startup bug, Armen has helped several startups build data-driven products and scale infrastructure as a Senior Data & Machine Learning Engineer. Armen previously led the development of machine learning explainability methods & currently works as the Head of Personalization & Recommender Systems at Honey Science.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

An overview of the problems TFX can help you solve (30 minutes)

Lecture: Problem statements; common vocabulary; context for rest of the course

Hands-on exercise: Knowledge check

Case study: An end-to-end example integrating various parts of the TensorFlow ecosystem together (50 minutes)

How to transform & process features using TensorFlow Transform (40 minutes)

Lecture: How to define a preprocessing function (a logical description of the pipeline that transforms the raw data into the data used to train a machine learning model); the Apache Beam implementation used to transform data by converting the preprocessing function into a Beam pipeline; how to define data formats and schema; integrating with TensorFlow Training
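The idea behind a preprocessing function can be sketched in plain Python. This is a conceptual illustration only, not the TensorFlow Transform API: the helper names (analyze, make_preprocessing_fn) and the trip_miles column are hypothetical. It shows the two phases the lecture description implies: a full pass over the data to compute statistics, followed by a per-row transform that reuses those statistics as constants. In TensorFlow Transform itself, analyzers such as tft.scale_to_z_score, called inside a preprocessing_fn(inputs), play this role, and Apache Beam runs the full-pass step.

```python
from statistics import mean, pstdev


def analyze(rows, column):
    """Analysis phase: one full pass over the dataset to collect statistics."""
    values = [row[column] for row in rows]
    return {"mean": mean(values), "std": pstdev(values)}


def make_preprocessing_fn(stats, column):
    """Transform phase: build a per-row function that treats the statistics
    from the analysis phase as fixed constants (hypothetical helper)."""
    def preprocessing_fn(row):
        out = dict(row)
        std = stats["std"]
        # Guard against a constant column, where the standard deviation is 0.
        out[column + "_scaled"] = (row[column] - stats["mean"]) / std if std else 0.0
        return out
    return preprocessing_fn


# Hypothetical raw training data.
raw_rows = [{"trip_miles": 1.0}, {"trip_miles": 2.0}, {"trip_miles": 3.0}]

stats = analyze(raw_rows, "trip_miles")
preprocessing_fn = make_preprocessing_fn(stats, "trip_miles")

# The same function is applied to training rows and to new rows at serving
# time, which keeps the two code paths consistent.
transformed = [preprocessing_fn(row) for row in raw_rows]
serving_row = preprocessing_fn({"trip_miles": 2.0})
```

Separating the analysis pass from the per-row function is the design point: the statistics are computed once on the training data, and every later row, including rows seen at serving time, is transformed with the same constants.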
