Mastering Big Data Analytics

This course covers a range of Big Data tools to help you understand how Big Data is applied in the Data Science domain. It is an advanced course offered exclusively to students enrolling in the PGP-Data Science and Engineering (DSE) program before 31 Mar 2019.

Course Benefits

About the course

Today, we’re surrounded by data. People upload videos, take pictures on their cell phones, text friends,
update their Facebook status, leave comments around the web, click on ads, and so forth. Machines, too, are
generating and keeping more and more data. To process such large datasets, there is a need for specialized
tools.

This course covers two important frameworks, Hadoop and Spark, which provide some of the most widely used
tools for carrying out large-scale big data tasks. The first module starts with an introduction to Big Data
and quickly advances into big data ecosystem tools and technologies such as HDFS, YARN, MapReduce, and
Hive.
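To give a feel for the MapReduce model mentioned above, here is a minimal sketch in plain Python. It is not Hadoop code and is not taken from the course material; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample lines are made up for illustration. It only shows the three steps the model is built around: map, shuffle, and reduce.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence on an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input, invented for this sketch
sample_lines = ["big data needs big tools", "Spark and Hadoop process big data"]
counts = word_count(sample_lines)
```

In a real cluster, the map and reduce steps would run in parallel on many machines and the shuffle would move data between them; this sketch runs everything in one process purely to show the flow of data.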

In the second module, the course takes you through an introduction to Spark and then dives into Scala and
Spark concepts such as RDDs, transformations, actions, persistence, and deploying Spark applications. The
course also covers Spark Streaming and Kafka, as well as data formats such as JSON, XML, Avro, Parquet, and
Protocol Buffers.
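The difference between transformations, actions, and persistence can be hard to picture at first. The sketch below is a plain-Python analogy only, not Spark code and not the Spark API: lazy map and filter stand in for transformations, list stands in for an action, and keeping the resulting list stands in for persistence. All names in it (square, log, and so on) are made up for illustration.

```python
# Plain-Python analogy only; none of these names are Spark APIs.
log = []  # records each time real work is performed


def square(x):
    """Square a number and record that the computation actually ran."""
    log.append(x)
    return x * x


numbers = range(1, 6)

# "Transformations": these only describe the pipeline; nothing runs yet
squared = map(square, numbers)
even_squares = filter(lambda value: value % 2 == 0, squared)
work_before_action = len(log)

# "Action": asking for a concrete result triggers the whole pipeline
result = list(even_squares)
work_after_action = len(log)

# "Persistence": reuse the materialized result instead of recomputing it
total = sum(result)
largest = max(result)
work_after_reuse = len(log)
```

Here no squaring happens until list is called, and reusing result afterwards does not trigger any further work. Spark's own behavior has more to it (partitioning, storage levels, lineage), which the course covers in the second module.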

Projects

Yellow Taxi trip analysis using Hive

Sentiment Analysis on Twitter in Real Time

GL Gurus

Vinod Raju

Sajan Kedia

Course certificate

Earn a Big Data course certificate from Great Learning, which you can share in the
Certifications section of your LinkedIn profile, on printed resumes, CVs, or other documents.

Special ePortfolio Mention

The projects you complete and the skills you acquire as part of this course will have a special place in your ePortfolio.

FAQs

How soon after signing up would I get access to the Learning Content?

You will get access to the course content after you complete your PG Program in Data Science and Engineering.

For what duration will I have access to the course?

The course will be available to you for a lifetime. You can revisit the course content anytime you want.

Whom can I contact if I have queries regarding the course?

Once you enroll, you will have access to a course discussion forum where you can post all your queries and
they will be answered by GL Gurus.

What is Cloud Lab?

Cloud Lab is a cloud-based Spark and Hadoop environment that Great Learning offers with
this course. Here, you can execute all the online demos and work on real-life case studies seamlessly. As
part of this course, Cloud Lab access will be valid for 60 days only.

What will be the mode of training for this course?

This course will be completely online and will provide you with access to high-quality
content videos, quizzes, case studies, and real-life projects. There will be a discussion forum where our GL
Gurus will be available for answering your queries.

Will the training and course material help me prepare for the Big Data Hadoop certification exam?

Yes, Great Learning’s training and course materials will prepare you for the Big Data
Hadoop certification exam.

Avail this course for free by enrolling in the DSE program before 31 Mar 2019.

Yellow Taxi trip analysis using Hive

The NYC Yellow Taxi trip analysis project uses a dataset that is well suited to putting your big data
skills to the test. The project gives you the chance to hone and master exploratory data analysis on the
given dataset.

The ultimate aim of the project is to derive the highest possible revenue figures using Hadoop and Hive.

Sentiment Analysis on Twitter in Real Time

With over 500 million tweets, each limited to 280 characters, Twitter is home to some of the crispest and
most concisely written content on the web. From space tweets to LeBron James on chicken nuggets or Donald
Trump’s infamous ‘covfefe’ tweet, it hosts ideas, comments, and sentiments with minimal jargon and plenty of
information. This makes it an ideal platform for Sentiment Analysis using Machine Learning.

This project will enable you to run analysis on real-time tweet data, derive opinions, understand trends
across a range of topics trending around the globe, and produce a visual plot using PySpark.
