Install .NET SDK

Enable .NET channel

To install .NET Core from Red Hat on RHEL, you first need to register using the Red Hat Subscription Manager. If this has not been done on your system, or if you are unsure, see the Red Hat Getting Started Guide.

Install the .NET SDK

After registering with the Subscription Manager and enabling the .NET Core channel, you are ready to install and enable the .NET SDK.

In your terminal, run the following commands:

Terminal

yum install rh-dotnet22 -y
scl enable rh-dotnet22 bash

Register Microsoft key and feed

Before installing .NET, you'll need to register the Microsoft key, register the product repository, and install required dependencies. This only needs to be done once per machine.
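As a sketch, the registration typically looks like the following on RHEL 7. The exact package URL differs per RHEL release, so verify it against Microsoft's package repository for your version before running:

```shell
# Register the Microsoft package signing key and product repository
# (RHEL 7 shown; the URL is release-specific -- confirm for your system).
sudo rpm -Uvh https://packages.microsoft.com/config/rhel/7/packages-microsoft-prod.rpm
```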

Add machine learning

This opens ML.NET Model Builder in a new docked tool window in Visual Studio. Model Builder will guide you through the process of building a machine learning model in the following steps.

In your terminal, run the following commands:

Command prompt

mkdir myMLApp
cd myMLApp
dotnet new console -o consumeModelApp

The mkdir command creates a new directory named myMLApp, and the cd myMLApp command puts you into the newly created app directory. The dotnet command creates a new application of type console for you. The -o parameter creates a directory named consumeModelApp where your app is stored, and populates it with the required files.

Binary classification - Use this when you want to analyze text and predict whether it belongs to category A or category B (e.g. analyzing the sentiment of customer reviews as either positive or negative).

Multiclass classification - Use this when you want to analyze text and classify into multiple categories (e.g. labeling new GitHub issues).

Regression - Use this when you want to predict a numeric value (e.g. predicting house prices).

In this case, you will predict sentiment based on the content (text) of customer reviews, so you will use the binary classification ML task.

Download and add data

Download the Wikipedia detox dataset and save it as wikipedia-detox-250-line-data.tsv in the myMLApp directory you created.

Each row in the wikipedia-detox-250-line-data.tsv dataset represents a different review left by a user on Wikipedia. The first column represents the sentiment of the text (0 is non-toxic, 1 is toxic), and the second column represents the comment left by the user. The columns are separated by tabs. The data looks like the following:
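The sample rows did not survive extraction here. Rows in that format would look like the following (these comments are illustrative placeholders, not actual entries from the dataset):

```
Sentiment	SentimentText
1	You are a disgrace.
0	Thanks for fixing the formatting on that page.
```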

Add data

In Model Builder, you can add data from a local file or connect to a SQL Server database. In this case, you will add wikipedia-detox-250-line-data.tsv from a file.

Select File as the input data source in the drop-down, and in Select a file find and select wikipedia-detox-250-line-data.tsv.

Under Column to predict (Label), select "Sentiment."

The Label is what you are predicting, which in this case is the Sentiment found in the first column of the dataset. The rest of the columns (in this case the actual Sentiment Text from the reviews in the second column) are Features, which are attributes that help predict the Label.
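To make the Label/Feature split concrete, here is a small shell sketch (the file name and comment text are made up for illustration) that separates the label column from the feature-text column in tab-separated data:

```shell
# Build a tiny tab-separated sample:
# column 1 = the Label (Sentiment), column 2 = the Feature (SentimentText)
printf '1\tYou are a disgrace.\n0\tThanks for the edit!\n' > sample.tsv

# Split each row into the Label being predicted and the Feature used to predict it
awk -F'\t' '{ print ($1 == 1 ? "toxic" : "non-toxic") ": " $2 }' sample.tsv
```

This prints `toxic: You are a disgrace.` followed by `non-toxic: Thanks for the edit!`, mirroring how the first column drives the prediction and the second supplies the text.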

After adding your data, go to the Train step.

Train your model

Now you will train your model with the wikipedia-detox-250-line-data.tsv dataset.

Model Builder evaluates many models with varying settings to give you the best performing model.

The default Time to train, the amount of time you would like Model Builder to spend exploring various models, is 10 seconds. Note that for larger datasets, you should set a longer training time.

Select Start Training to start the training process.

Progress

You can keep track of the progress of model training in the Progress section.

Status - This shows you the status of the model training process: how much time is left and when training has completed.

Best accuracy - This shows you the accuracy of the best model that Model Builder has found so far. Higher accuracy means the model predicted more correctly on test data.

Best algorithm - This shows you which algorithm performed the best so far during Model Builder's exploration.

Last algorithm - This shows you the last algorithm that was explored by Model Builder.

What do these commands mean?

The mlnet auto-train command runs ML.NET with AutoML to explore many iterations of models with varying combinations of data transformations, algorithms, and algorithm options and then chooses the highest performing model.

--task: You must specify the ML task, which in this case is binary classification.

--dataset: You choose wikipedia-detox-250-line-data.tsv as the dataset (internally, the CLI will split the one dataset into training and testing datasets).

--label-column: You must specify the target column you want to predict (or the Label). In this case, you want to predict the Sentiment, which is in the first column (index 0).

--max-exploration-time: You must also specify the amount of time you would like the ML.NET CLI to explore the different models, in this case 10 seconds. Note that for larger datasets, you should set a longer training time.
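Put together, the options above form a single invocation along these lines. Flag names varied across ML.NET CLI preview releases (in some versions the label option is spelled --label-column-index), so confirm the exact spelling against the CLI's help output:

```shell
mlnet auto-train \
  --task binary-classification \
  --dataset "wikipedia-detox-250-line-data.tsv" \
  --label-column-index 0 \
  --max-exploration-time 10
```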

Progress

While the ML.NET CLI is exploring different models, the progress bar will indicate how much time is left in the training process. As new models are explored, the Best Accuracy, Best Algorithm, Last Algorithm, and time duration will change.

Best accuracy - This shows you the accuracy of the best model so far. Higher accuracy means the model predicted more correctly on test data.

Best algorithm - This shows you which algorithm has performed the best so far.

Last algorithm - This shows you the last algorithm that was explored.

Evaluate your model

After Model Builder selects the best model, it takes you to the Evaluate step, which shows you various outputs, including how many models were explored and the ML task (in this case binary classification).

Model Builder also displays the top 5 models explored, along with several evaluation metrics for each, including AUC, AUPRC, and F1-score; you can learn more about these metrics in the ML.NET documentation.

After evaluating your model, move on to the Code step.

After the ML.NET CLI selects the best model, it will display the Experiment Results, which shows you a summary of the exploration process, including how many models were explored and the top 5 models that were found in the given time.

Top 5 models

While the ML.NET CLI generates code for the highest performing model, it also displays the top 5 models with the highest accuracy that it found in the given exploration time, along with several evaluation metrics for each, including AUC, AUPRC, and F1-score; you can learn more about these metrics in the ML.NET documentation.

Generate code

In the Code step in Model Builder, select Add Projects.

Model Builder adds both the machine learning model and the projects for training and consuming the model to your solution. In the Solution Explorer, you should see the code files that were generated by Model Builder, including:

myMLAppML.ConsoleApp is a .NET console app which contains ModelBuilder.cs (used to build/train the model) and Program.cs (used to run the model).
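Assuming the generated project name above, a typical way to exercise the generated code is to run the console app with the .NET CLI (a sketch; requires the .NET SDK on your PATH):

```shell
cd myMLAppML.ConsoleApp
dotnet run
```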

Next steps

Now that you've used Model Builder for Sentiment Analysis, you can try other scenarios. Try out the Price Prediction scenario in Model Builder using the Taxi Fare dataset to keep building ML.NET models with Model Builder.

Congratulations, you've built your first machine learning model with the ML.NET CLI!

Now that you've used the ML.NET CLI for Sentiment Analysis, you can try other scenarios. Try out a Price Prediction scenario (e.g. the regression task in the ML.NET CLI) using the Taxi Fare dataset to keep building ML.NET models with the ML.NET CLI.