Delivering this course:

Vadim is a cloud and architecture expert. He is the founding partner of DoIT International, an AWS Certified Solutions Architect, and a Google Developer Expert. In recent years, Vadim has helped many companies turn their cloud ambitions into fully developed cloud deployments. Vadim oversees technology and makes the hard stuff simple.

Choose between different data processing products on the Google Cloud Platform

Intended Audience

This class is intended for data analysts, data scientists, and business analysts. It is also suitable for IT decision makers evaluating Google Cloud Platform for use by data scientists.

This class is for people who do the following with big data:

Extracting, loading, transforming, cleaning, and validating data for use in analytics

Designing pipelines and architectures for data processing

Creating and maintaining machine learning and statistical models

Querying datasets, visualizing query results and creating reports

Prerequisites

Before attending this course, participants should have roughly one (1) year of experience with one or more of the following:

A common query language such as SQL

Extract, transform, load activities

Data modeling

Machine learning and/or statistics

Programming in Python

Modules

Module 1 - Introduction

In this module you will be introduced to Google Cloud Platform and the data handling aspects of the platform.

What is the Google Cloud Platform?

GCP Big Data Products

Usage scenarios

Lab: Sign up for Google Cloud Platform

Module 2 - Foundation of Google Cloud Platform

In this module, we introduce the foundations of Google Cloud Platform, compute and storage, and show how they work together to provide data ingest, storage, and federated analysis.

CPUs on demand (Compute Engine)

Lab: Start a Google Compute Engine instance and access it over SSH

A global filesystem (Cloud Storage)

Lab: Set up an Ingest-Transform-Publish data processing pipeline

Cloud Shell

Module 3 - Data Analytics on the Cloud

In this module we introduce the common Big Data use cases that Google will manage for you. These are workloads that are already widespread in industry and that we make easy to migrate to the cloud.

Stepping stones to the cloud

Cloud SQL: your SQL database on the cloud

Lab: Import data into Cloud SQL and run queries on rentals data

Dataproc

Lab: Machine Learning with SparkML

Module 4 - Scaling data analysis

This module covers the more transformational technologies in Google Cloud Platform, which may not have immediate parallels to the technologies attendees are using today (“what’s next”).

Fast random access

Datalab

Demo: Sample notebook in Datalab

BigQuery

Lab: Build a machine learning dataset

Machine Learning with TensorFlow

Lab: Train and use a neural network

Fully built models for common needs

Lab: Translate

Genomics API (optional)

Module 5 - Data processing architectures

In this module we will introduce you to data processing architectures in Google Cloud Platform.