Scientists, developers, and other technologists from many different industries are taking advantage of AWS to perform big data analytics and meet the challenges of the increasing volume, variety, and velocity of digital information. AWS offers a portfolio of cloud computing services to help you manage big data by reducing costs, scaling to meet demand, and increasing the speed of innovation. In this quest, you’ll learn to work with advanced services for Big Data.

In this Quest, you will delve deeper into the uses and capabilities of Amazon Redshift. You will use a remote SQL client to create and configure tables, and gain practice loading large data sets into Redshift. You will explore the effects of schema variations and compression. You will explore visualization of Redshift data, and connect Redshift with Amazon Machine Learning to create a predictive data model.

Serverless architectures allow you to build and run applications and services without needing to provision, manage, and scale infrastructure. This quest will show how to design, build, and deploy interactive serverless web applications. A simple HTML/JavaScript web interface calls Amazon API Gateway, which sends requests to AWS Lambda backends that query data in Amazon DynamoDB.
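To make the request flow concrete, here is a minimal sketch of a Lambda backend that looks up one DynamoDB item for an API Gateway request. It is an illustration, not the quest's own code. It assumes API Gateway's Lambda proxy integration, and the table name `ExampleTable`, the key attribute `id`, and the helper names are all hypothetical.

```python
import json


def get_table(table_name):
    """Return a DynamoDB Table resource.

    boto3 is imported lazily so the request logic below can be exercised
    locally with a stand-in table object and no AWS credentials.
    """
    import boto3
    return boto3.resource("dynamodb").Table(table_name)


def handle_request(event, table):
    """Look up one item by the 'id' query-string parameter (assumed key name)."""
    # With proxy integration, queryStringParameters is None when no
    # query string was sent, so fall back to an empty dict.
    params = event.get("queryStringParameters") or {}
    item_id = params.get("id")
    if item_id is None:
        return {"statusCode": 400, "body": json.dumps({"error": "missing id"})}

    result = table.get_item(Key={"id": item_id})
    item = result.get("Item")  # the "Item" key is absent when nothing matches
    if item is None:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}

    # default=str converts DynamoDB Decimal values that json cannot encode.
    return {"statusCode": 200, "body": json.dumps(item, default=str)}


def lambda_handler(event, context):
    # "ExampleTable" is a hypothetical table name used only for illustration.
    return handle_request(event, get_table("ExampleTable"))
```

Passing the table into `handle_request` keeps the lookup logic separate from the AWS client, so it can be checked with a small fake object before deployment.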

In this lab, you will deploy a fully functional Hadoop cluster in just a few minutes, ready to analyze log data. You will start by launching an Amazon EMR cluster and then use a HiveQL script to process sample log data stored in an Amazon S3 bucket. HiveQL is a SQL-like scripting language for data warehousing and analysis. You can then use a similar setup to analyze your own log files.

In this lab, you will experiment with and compare different types of data loading using Amazon Redshift. You will create tables, load data from Amazon S3 and from remote hosts, and practice troubleshooting data loading errors. For the lab to function as written, please DO NOT change the auto-assigned region.
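As a rough illustration of what loading from S3 involves, the sketch below composes a Redshift COPY command as a string and includes a query against the `STL_LOAD_ERRORS` system table, which records rows that failed to load. The table name, bucket path, and IAM role ARN in the usage line are placeholders, and the helper `build_copy_statement` is hypothetical rather than part of the lab materials.

```python
def build_copy_statement(table, s3_path, iam_role_arn, delimiter="|", region=None):
    """Compose a Redshift COPY command that loads delimited text from S3."""

    def quote(value):
        # Double any single quotes (the standard SQL escape) so the value
        # can be embedded in a string literal.
        return "'" + value.replace("'", "''") + "'"

    parts = [
        f"COPY {table}",
        f"FROM {quote(s3_path)}",
        f"IAM_ROLE {quote(iam_role_arn)}",
        f"DELIMITER {quote(delimiter)}",
    ]
    if region is not None:
        # REGION is only needed when the bucket is in a different
        # region from the cluster.
        parts.append(f"REGION {quote(region)}")
    return "\n".join(parts) + ";"


# Query to inspect the most recent load failures when a COPY goes wrong.
RECENT_LOAD_ERRORS = (
    "SELECT starttime, filename, line_number, colname, err_reason\n"
    "FROM stl_load_errors\n"
    "ORDER BY starttime DESC\n"
    "LIMIT 10;"
)

if __name__ == "__main__":
    # All identifiers below are placeholders, not values from the lab.
    print(build_copy_statement(
        "sales",
        "s3://example-bucket/sales/",
        "arn:aws:iam::123456789012:role/ExampleRole",
    ))
```

The generated statement would be run from a SQL client connected to the cluster. If the load fails, running `RECENT_LOAD_ERRORS` is one way to see which file, line, and column caused the problem.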

This lab demonstrates how to upload data to Amazon S3 and make it available for anyone to access via a web browser. You will learn how to create an Amazon S3 bucket, configure it to host a website, upload objects to it, and use JavaScript to display those objects on a web page. Along the way, you’ll learn some best practices for creating open data. At the end of this lab you will have deployed a simple website that makes data easy to access and provides basic documentation of the data.

This lab demonstrates how to launch an Amazon Elastic MapReduce (EMR) cluster for Big Data processing and use Hive with SQL-style queries to analyze data. You will create a Hadoop cluster using Amazon EMR, which will allow you to run interactive Hive queries against data stored in Amazon S3. You will use Hive to normalize the data into a more useful form, and you will run queries to analyze it.

This lab will give you a basic understanding of the Amazon Redshift data warehouse service. It will demonstrate the basic steps required to get started with Redshift: creating a cluster, loading data, and performing queries against that data.

Amazon Kinesis Firehose is the easiest way to load streaming data into AWS. This hands-on lab will demonstrate how Amazon Kinesis Firehose can capture and automatically load streaming data into an Elasticsearch cluster.
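On the producer side, sending data to Firehose comes down to handing a delivery stream one record at a time. The sketch below shows one possible way to do that with the boto3 `put_record` call. The stream name, the event fields, and the helper functions are hypothetical, and the lab itself may generate its data differently.

```python
import json


def to_firehose_record(event):
    """Serialize one event dict into the Record argument for put_record."""
    # Compact JSON, UTF-8 encoded; Firehose treats the payload as opaque bytes.
    payload = json.dumps(event, separators=(",", ":"))
    return {"Data": payload.encode("utf-8")}


def send_event(stream_name, event, client=None):
    """Send a single event to a Firehose delivery stream."""
    if client is None:
        # Imported lazily so the serialization logic can be tested
        # without boto3 or AWS credentials.
        import boto3
        client = boto3.client("firehose")
    return client.put_record(
        DeliveryStreamName=stream_name,
        Record=to_firehose_record(event),
    )
```

For example, `send_event("example-stream", {"status": 200, "path": "/index.html"})` would submit one JSON document, assuming a delivery stream with that name exists and is configured with the Elasticsearch cluster as its destination.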