Our experienced instructor provides a hands-on introduction to Apache Spark that is easy to follow and focuses on the parts most useful for day-to-day work. Whatever questions you have about Spark, you’ll get them answered.

Overview

The course covers the fundamentals of Apache Spark, including Spark’s architecture and internals, the core APIs for using Spark, SQL and other high-level data access tools, and Spark’s streaming capabilities and machine learning APIs.

Each topic includes lecture content along with hands-on labs in the Databricks notebook environment. Students may keep the notebooks and continue to use them with the free Databricks Community Edition offering after the class ends; all examples are guaranteed to run in that environment.

Learning Objectives

After taking this class, students will be able to:

Use the core Spark APIs to operate on data

Articulate and implement typical use cases for Spark

Build data pipelines and query large data sets using Spark SQL and DataFrames

Analyze Spark jobs using the administration UIs inside Databricks

Create Structured Streaming jobs

Work with graph data (entities and the relationships between them) using the GraphFrames APIs

Understand how a Machine Learning pipeline works

Understand the basics of Spark’s internals

Who should apply?

This course is aimed at data engineers and data scientists interested in the most current technologies, analysts and BI professionals with basic coding skills, and developers looking to specialize in big data. Course material is written in Python, but you don’t have to be an expert to follow and understand it. The course is also a good fit for IT managers who want a better understanding of Apache Spark and the capabilities it can deliver.

How much does it cost?

Course fee

Maximum number of participants: 20

Have a question?

Register

If you have questions about the course or would like to register, feel free to contact us.

Timing (Day 1-3)

9:00-10:30    Morning session 1
10:30-10:45   Break (Coffee)
10:45-12:00   Morning session 2
12:00-13:00   Lunch
13:00-14:15   Afternoon session 1
14:15-14:30   Break (Coffee)
14:30-16:00   Afternoon session 2

Who is the instructor?

Miklós Tóth is a senior instructor at Datapao. Besides his work at Datapao, Miklós works as a data scientist on various machine learning projects and teaches Java Spring Framework classes. Prior to teaching Apache Spark, Miklós worked for AUDI Academy as an IT trainer.