Big Data Certification

Become a SAS® Certified Big Data Professional.

Demonstrate your ability to use the tools and technology designed to handle big data. The SAS Certified Big Data Professional program delivers the extra edge you're looking for.

SAS® Big Data Certification Curriculum

10 Courses

Course content is designed to prepare you for the certification exams.

Case Studies

Real-world case studies enable you to apply what you have learned.

2 Exams

Pass both exams to earn your certification credential.

Topics Covered

Critical SAS programming skills.

Accessing, transforming and manipulating data.

Improving data quality for reporting and analytics.

Essential communication skills.

Fundamentals of statistics and analytics.

Working with Hadoop, Hive, Pig and SAS.

Exploring and visualizing data.

SAS software covered

Base SAS®

SAS® Enterprise Guide®

SAS® Enterprise Miner™

SAS® In-Memory Statistics

SAS® Studio

SAS/STAT®

SAS® Visual Analytics

DataFlux® Data Management Server

DataFlux® Data Management Studio

Choose a Format

Classroom

Instructor-led training in a classroom setting.

Monday-Friday classes for six weeks.

Real-world case studies that help you apply what you learn.

Access to SAS software for practice.

Dedicated coach to guide you.

Certification exam vouchers.

Self-Paced

E-learning with 24/7 online access.

Complete at your own pace over six months.

Real-world case studies that help you apply what you learn.

Access to SAS software for practice.

Access to an online community to support your learning.

Course Listing

Programming Review

SAS Fundamentals: Programming, SQL and Macro Language

This course focuses on data manipulation techniques using the DATA step and SQL procedure to access, transform, join and summarize SAS data sets. You'll learn how to use components of the SAS macro facility to make text substitutions in SAS code and to write simple macro programs.

Topics Covered

Summarizing and presenting data.

Querying and subsetting data.

Transforming character, numeric and date variables.

Combining SAS data sets, including complex joins and merges.

Performing DO loop and SAS array processing.

Restructuring or transposing SAS data sets.

Performing text substitution in SAS code.

Using macro variables.

Creating simple macro definitions.

Big Data Preparation, Statistics and Visual Exploration

Big Data Challenges and Analysis-Driven Data

This course provides an overview of the challenges associated with big data and analysis-driven data.

Topics Covered

Reading external data files.

Storing and processing data.

Combining Hadoop and SAS.

Recognizing and overcoming big data challenges.

Exploring Data With SAS Visual Analytics

In this course, you'll learn how to use SAS Visual Analytics Explorer to explore in-memory tables from the SAS® LASR™ Analytic Server and perform advanced data analyses.

Topics Covered

Using chi-square statistics to detect associations among categorical variables.

Fitting a multiple logistic regression model.

Scoring new data using developed models.

Preparing Data for Analysis and Reporting

In this course, you'll learn how to perform data management tasks, such as improving data quality, entity resolution and data monitoring.

Topics Covered

Creating and reviewing data explorations.

Creating and reviewing data profiles.

Creating data jobs for data improvement.

Establishing monitoring aspects for your data.

Understanding the Quality Knowledge Base (QKB) components.

Using the component editors.

Understanding various definition types.

Building a new data type (optional).

Crafting Compelling (and true) Data Stories

Storytelling is a necessary skill when talking to key stakeholders. Insights uncovered in your data can move mountains if the right people say yes. But how do you move someone from simple curiosity all the way to "Let's do this!"? In this course, you'll learn why storytelling is a skill you need to develop, when a story works and when it doesn't, and how to communicate data in a meaningful way.

Big Data Programming and Loading

Introduction to SAS and Hadoop: Essentials

This course teaches you how to use SAS programming methods to read, write and manipulate Hadoop data. You'll learn how to use Base SAS methods to read and write raw data with the DATA step, manage the Hadoop Distributed File System (HDFS) and execute MapReduce and Pig code from SAS via the HADOOP procedure. You'll also learn how to use SAS/ACCESS® Interface to Hadoop, which provides LIBNAME access and SQL pass-through techniques, to read and write Hive or Impala table structures.

Topics Covered

Accessing Hadoop distributions using the LIBNAME statement and the SQL pass-through facility.

Creating and using SQL procedure pass-through queries.

Using options and efficiency techniques for optimizing data access performance.

Joining data using the SQL procedure and the DATA step.

Reading and writing Hadoop files with the FILENAME statement.

Executing and using Hadoop commands with PROC HADOOP.

Using Base SAS procedures with Hadoop.

DS2 Programming Essentials With Hadoop

This course focuses on DS2, a fourth-generation SAS proprietary language for advanced data manipulation, which enables parallel processing and storage of large data with reusable methods and packages.

Topics Covered

Identifying the similarities and differences between the SAS DATA step and the DS2 DATA step.

Using the SAS In-Database Code Accelerator to execute DS2 code outside of a SAS session.

Executing DS2 code in the SAS High-Performance Analytics grid using the HPDS2 procedure.

Hadoop Data Management With Hive, Pig and SAS

In this course, you will use processing methods to prepare structured and unstructured big data for analysis. You will learn to organize the data into structured tabular form using Apache Hive and Apache Pig. You will also learn SAS software technology and techniques that integrate with Hive and Pig, as well as how to use these open-source capabilities by programming with Base SAS and SAS/ACCESS Interface to Hadoop, and with SAS Data Integration Studio.

Topics Covered

Moving data into the Hadoop ecosystem.

Using Hive to design a data warehouse in Hadoop, perform data analysis using the Hive query language (HiveQL) and join data sources.

Performing extract, transform and load (ETL).

Organizing data in Hadoop by usage.

Analyzing unstructured data using Pig.

Joining massive data sets using Pig.

Using user-defined functions (UDFs).

Analyzing big data in Hadoop using Hive and Pig.

Using SAS programming to submit Hive and Pig programs that execute in Hadoop, and store results in Hadoop or return results to SAS.

Using SAS programming to move data between the SAS server and the HDFS.

Constructing SAS Data Integration Studio jobs that integrate with Hive and Pig processes and the HDFS.

Getting Started With SAS In-Memory Statistics

This course focuses on accessing data on the SAS LASR Analytic Server and performing exploratory analysis and preparation. Topics include starting the server, loading data and manipulating data on the SAS LASR Analytic Server using the IMSTAT procedure. IMSTAT topics include deriving new temporary and permanent tables and columns, calculating summary statistics (e.g., mean, frequency and percentile), and creating filters and joins on in-memory data.