Spark with Python

About The Event

Overview

Apache Spark with Python training will advance your expertise in distributed programming with Spark and Python. The skills gained in this course, covering core Spark, Python, Spark SQL and Spark Streaming, will help you solve complex problems. Deep knowledge of Spark with Python will set you apart and support your career growth.

Objective

Hadoop MapReduce made it possible to solve complex problems on distributed systems, but with some limitations. This course discusses the limitations of Hadoop MapReduce and how Spark overcomes them. We describe RDDs, which are the core of Spark, and in-memory computation. Understanding persistent RDDs and in-memory computation, and solving Big Data problems using Spark with Python, is the core of this course. The discussion then moves on to Spark SQL and problem solving with Spark SQL DataFrames. Hands-on exercises run in parallel with every discussion. Handling streaming data with Spark Streaming is also covered. The last part of the course is Spark program optimization: optimizing Spark core, Spark SQL and Spark Streaming, and optimizing the use of cluster resources. We also discuss running Spark on YARN, Standalone and Mesos clusters. The training works through many small projects to get you working on Spark clusters.
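To give a feel for what "persistent RDDs" and "in-memory computation" mean, here is a minimal sketch in plain Python. It is a toy model only: the class `ToyRDD` and its `evaluations` counter are hypothetical names invented for illustration and are not part of the PySpark API. The sketch assumes just two ideas: transformations are recorded lazily, and a persisted dataset keeps its computed result in memory so later actions reuse it.

```python
# Toy model of a lazily evaluated, optionally persisted dataset.
# ToyRDD is a hypothetical illustration and is NOT the PySpark API.

class ToyRDD:
    def __init__(self, compute):
        self._compute = compute   # zero-argument function that produces the data
        self._persist = False     # whether to keep the result in memory
        self._cache = None        # cached result once computed and persisted
        self.evaluations = 0      # how many times this dataset was recomputed

    @classmethod
    def from_list(cls, items):
        items = list(items)
        return cls(lambda: items)

    def map(self, fn):
        # Transformation: only records the step, no work happens yet.
        return ToyRDD(lambda: [fn(x) for x in self._materialize()])

    def filter(self, pred):
        # Transformation: also lazy.
        return ToyRDD(lambda: [x for x in self._materialize() if pred(x)])

    def persist(self):
        # Mark this dataset to be kept in memory after its first computation.
        self._persist = True
        return self

    def _materialize(self):
        if self._cache is not None:
            return self._cache    # reuse the in-memory result
        self.evaluations += 1
        data = self._compute()
        if self._persist:
            self._cache = data
        return data

    def collect(self):
        # Action: triggers the computation and returns the data.
        return list(self._materialize())

    def count(self):
        # Action: triggers the computation and returns the number of items.
        return len(self._materialize())


squares = ToyRDD.from_list(range(10)).map(lambda x: x * x)
evens = squares.filter(lambda x: x % 2 == 0).persist()

first = evens.collect()   # computes the whole chain and caches the result
total = evens.count()     # answered from the cached result
print(first, total, evens.evaluations)
```

Without the `persist()` call, the second action would recompute the whole chain. Avoiding that kind of repeated work is the motivation for keeping data in memory.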

Day-wise distribution of classes:

Day 1: Python, Big Data, Spark introduction and components of Spark, operations on a single RDD,