In this lab, you will set up an Azure Machine Learning Studio account. You will then walk through the various features and capabilities of Azure Machine Learning Studio. You will load data from local and external sources. You will clean, manipulate and transform the data to make it usable for machine learning. Finally, you will create a binary classification model using two-class boosted decision trees to build a targeted mailing list.

An Azure Machine Learning Studio Workspace allows you to use Machine Learning Studio to create and manage machine learning experiments and predictive web services. You can create multiple Workspaces, each one containing a set of your experiments, datasets, trained predictive models, web services, and notebooks. As the owner of a Workspace, you can invite other users to share the Workspace so you can collaborate with them on predictive analytics solutions.

In this exercise, you will create an Azure Machine Learning Studio Workspace.

Azure Machine Learning Studio is a browser-based, visual, drag-and-drop authoring environment for machine learning in Azure that requires no code. It allows you to build, deploy and share predictive analytics solutions in a fully managed cloud service with minimal overhead and fast time to insights.

In this task, you will walk through the Azure Machine Learning Studio interface, where you create and configure machine learning projects with imported datasets and other assets.

In this exercise, you will create a simple AzureML experiment to read and summarize a dataset using the Summarize Data module.

In this exercise, you will access an online data source using the Import Data module in Azure Machine Learning Studio.

In this exercise, you will clean, manipulate and transform data using Azure Machine Learning Studio. You will implement a data cleansing process in your experiment to remove duplicate rows, remove outliers, and remove rows missing key data points. You will then verify that your data cleansing and transformations work by inserting the Summarize Data module at key points in the data preparation pipeline.
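The three cleansing steps can be sketched outside of Studio in a few lines of Python. This is only an illustration of the logic, not what the Studio modules do internally. The function name, the field names, and the z-score rule for outliers are assumptions made for the example; the lab does not state how outliers are defined.

```python
from statistics import mean, stdev

def clean_rows(rows, key_fields, numeric_field, z_limit=3.0):
    """Drop duplicate rows, rows missing key fields, and outliers in one numeric field.

    rows is a list of dicts. The z-score outlier rule is an assumption for this sketch.
    """
    # Step 1: remove exact duplicates, keeping the first occurrence.
    seen = set()
    unique = []
    for row in rows:
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)

    # Step 2: remove rows where a key field (or the numeric field) is missing.
    required = set(key_fields) | {numeric_field}
    complete = [r for r in unique
                if all(r.get(k) not in (None, "") for k in required)]

    # Step 3: remove rows more than z_limit standard deviations from the mean.
    values = [r[numeric_field] for r in complete]
    if len(values) < 2:
        return complete
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return complete
    return [r for r in complete
            if abs(r[numeric_field] - mu) / sigma <= z_limit]
```

Comparing `len(rows)` before and after each step plays roughly the role that Summarize Data plays in the experiment: it confirms that each step removed what you expected and nothing more.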

In this exercise, you will build an AzureML experiment that produces a targeted mailing list using a classification algorithm in Azure Machine Learning Studio.

The type of algorithm we will use is called a binary classifier. A binary classifier is a type of algorithm that classifies elements into one of two groups. In our case, the question is whether or not we should send an advertisement to an individual (read: marketing wants to know whether a potential customer is worth the cost of the stamp). Other example use cases might be whether a piece of email is junk or good, whether a patient’s lab value is positive or negative, or whether sentiment is positive or negative.

The specific algorithm we will use is the Two-Class Boosted Decision Tree. Decision trees are a great entry point into machine learning because they are intuitive and easy to understand. The Two-Class Boosted Decision Tree is one of the easiest methods to get good performance. However, it operates in memory, so the available memory limits the size of the dataset it can handle, and it may not be well suited for larger datasets.
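To give a feel for what "boosted" means, here is a minimal sketch of gradient boosting in Python: each round fits a tiny one-split tree (a stump) to the errors left by the rounds before it, and the final prediction sums all of them. This is not the Studio module's implementation. It handles a single numeric feature only, and the function names, the example data, and the default settings are assumptions made for the illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(xs, residuals):
    """Find the threshold whose two leaf means best fit the residuals (least squares)."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue  # a split must put rows on both sides
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lmean, rmean)
    return None if best is None else best[1:]

def fit_boosted_stumps(xs, ys, n_rounds=20, learning_rate=0.5):
    """Return a list of (threshold, left_value, right_value) stumps."""
    stumps = []
    scores = [0.0] * len(xs)
    for _ in range(n_rounds):
        # Residual = label minus current predicted probability
        # (the negative gradient of the log loss).
        residuals = [y - sigmoid(s) for y, s in zip(ys, scores)]
        stump = fit_stump(xs, residuals)
        if stump is None:
            break
        t, lmean, rmean = stump
        scaled = (t, learning_rate * lmean, learning_rate * rmean)
        stumps.append(scaled)
        scores = [s + (scaled[1] if x <= t else scaled[2])
                  for x, s in zip(xs, scores)]
    return stumps

def predict_proba(stumps, x):
    """Sum every stump's contribution and squash the total to a probability."""
    score = sum(lv if x <= t else rv for t, lv, rv in stumps)
    return sigmoid(score)

# Hypothetical data: customer age and whether they responded to a mailing.
ages = [22, 25, 30, 35, 40, 45, 50, 55]
responded = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_boosted_stumps(ages, responded)
```

Note that training keeps the whole dataset and a running score for every row in memory, which illustrates why memory becomes the limiting factor as the data grows.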

In Azure ML Studio, you can use the Execute R Script module to embed R code in an experiment and run it as part of that experiment. This lets you use custom R functions and packages that are not immediately available in Azure ML Studio.

In this exercise, we are going to use an R script to sample our dataset. You might want to do this if you have a large dataset and want to use an algorithm such as Two-Class Boosted Decision Trees, which operates in memory and therefore needs a smaller dataset. We will execute the R script by using the Execute R Script module in Azure ML Studio.
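The lab's script is written in R and is not reproduced here. For illustration only, the same idea, drawing a reproducible random subset of rows, looks like this in Python. The function name, the `fraction` parameter, and the default seed are assumptions made for the sketch.

```python
import random

def sample_rows(rows, fraction, seed=42):
    """Return a random subset of rows, without replacement.

    A fixed seed makes the sample reproducible between runs.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the range (0, 1]")
    if not rows:
        return []
    k = max(1, round(len(rows) * fraction))
    rng = random.Random(seed)  # private generator; global random state is untouched
    return rng.sample(rows, k)
```

Sampling without replacement keeps every selected row distinct, so the smaller dataset contains no artificial duplicates.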

So far, we have looked at different tasks for importing and preparing data, building experiments, and training models. Now, we are going to convert our training experiment to a predictive experiment. The predictive experiment generates predictions: it takes a single input and produces a result. It will be deployed as a web service on Azure to make it available for use by external applications.
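Conceptually, a predictive web service is a function that accepts one record and returns one scored result. The Python sketch below shows that shape only. It does not call Azure or reproduce the request format of the deployed service; `toy_model`, the `age` field, and the output field names are hypothetical stand-ins for the trained model and its schema.

```python
import json

def toy_model(record):
    """Hypothetical stand-in for the trained model: returns a response probability."""
    return 0.9 if record["age"] >= 40 else 0.1

def score_one(record, predict_proba, threshold=0.5):
    """Turn one input record into a label and the probability behind it."""
    probability = predict_proba(record)
    return {
        "scored_label": int(probability >= threshold),
        "scored_probability": probability,
    }

def handle_request(body, predict_proba):
    """Parse a JSON request body, score the single record, and return JSON."""
    record = json.loads(body)
    return json.dumps(score_one(record, predict_proba))

response = handle_request('{"age": 45}', toy_model)
```

An external application would send a record like this over HTTP and read the label from the response to decide whether to add the customer to the mailing list.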