Author: Neel

If you have ever worked with APIs, you may already be familiar with Swagger. If you have not heard of it, this post covers the basics of Swagger and the steps to configure it in your .NET Core 2.0 application.

First of all, let us see what Swagger is.

One-liner for Swagger: a UI representation of your RESTful APIs

Swagger is a set of rules (in other words, a specification) for a format describing REST APIs.

The format is both machine-readable and human-readable.

As a result, it can be used to share documentation among product managers, testers and developers, but can also be used by various tools to automate API-related processes.

For example, if you have a set of APIs and you want proper documentation for them, you can use Swagger. You can even test the API calls from Swagger, and a lot more can be done with it; have a look here for more details.

In this post, we will see how we can use Swagger with a .NET Core 2.0 application.

We are almost done and we have already enabled Swagger in the Core 2.0 application.

As the last step, we will change the launch browser setting so that the application opens Swagger when we run it.

To do this, open the properties of the application -> go to the Debug tab -> write swagger in the Launch browser text box:

That is it. Just run the application and you will see the colorful Swagger landing page as shown below:

Manage versions of APIs

You can even manage different versions of your APIs. For example, if you also have v2 APIs and need to expose them in Swagger, you just need to add a couple of steps.

First add a new SwaggerDoc in the ConfigureServices method, and then add a new endpoint in the Configure method of the Startup.cs class as shown below:

Now just run the application and you can switch between versions of your APIs by selecting a value from the dropdown list as shown below:

XML documentation

We are all quite familiar with XML documentation for an API. For example, if you have hosted your APIs and want to send their details to another team, you can simply pass along an XML document that contains all the details of the API such as descriptions, expected input, output etc.

Let us see how we can create XML documents with Swagger in a .NET Core 2.0 application:

First of all, we need to enable XML support. Once we tick the checkbox, the application will create an XML file in the bin\debug folder.

Make sure you choose an appropriate name for each Enum because it is automatically reflected in the bot. For example, if you name the Enum MovieTheatreLocation, the bot will use it as:

Please select a movie theatre location

Also, note that the sequence of the selection boxes depends on the order in which you declare the Enum fields in code.

For example, in the above class the Enum fields are declared in the following sequence:

public MovieTheatreLocation movieTheatreLocation;
public MovieTheatre movieTheatre;
public MovieTypes movieTypes;
public ClassTypes classTypes;
[Optional]
public DoYouNeedAMeal doYouNeedAMeal;
public FoodMenu foodMenu;
public DateTime? DateOfJurney;
[Numeric(1, 5)]
public int? NumberOfAdult;
public int? NumberOfChild;

So the movie theatre location selection box will come first and the number of children box will come last.

Now, we need to create a separate dialog for MovieBooking, so add one class called MovieBotDialog:

In this class, we will write how the bot should behave when the user replies or when the whole conversation is over.

As you can see above, if the user types “hi”, we initiate the RootDialog class, and in RootDialog we write some code to collect the name of the user. RootDialog stores the name in the context user data so that it can be used later in the conversation.

.NET Core was introduced in the last few months and people have started adopting it.

In this series of posts, I will cover some frequent issues seen during .NET Core development and some important topics for Core.

In this post we will see how to resolve the error:

No executable found matching command “dotnet-ef”

This error occurs when we want to migrate the database with the models and run the command below:

dotnet ef migrations add FirstMigration

Reason:

The dotnet-ef tool might not have been added along with your template.

This error occurs if you have not added Microsoft.EntityFrameworkCore.Tools to the Tools section of the project.json file, or if you have not added a DotNetCliToolReference to an ItemGroup section. I explain more in the solution below.

Solution:

Below are some steps you can try to resolve the error:

For .NET Core 2.0

1. Add Microsoft.EntityFrameworkCore.SqlServer to your project if it is not added yet. You can add it as a NuGet package by running the command below:

dotnet add package Microsoft.EntityFrameworkCore.SqlServer

2. Add a DotNetCliToolReference to an ItemGroup in your csproj file as below (in 99% of cases, this should work):

Looking at how fast companies are adopting Bots, this is a really good time to start learning the Bot Framework and adopting Bots for your business.

Some pains in the real world without Bots:

You have to read the whole FAQ to find one specific piece of information about a website or company

You have to wait for the next business day to get answers to your queries

You have to send emails to get or to send some information

You have to do manual work to answer some repetitive questions

More manpower is required if the number of questions increases suddenly

This would eventually affect your business. Time to try the Bots.

Let us see what the Bots are:

An Internet bot, also known as web robot, WWW robot or simply bot, is a software application that runs automated tasks (scripts) over the Internet. Typically, bots perform tasks that are both simple and structurally repetitive, at a much higher rate than would be possible for a human alone.

In simple words, Bots can be integrated with your website and can answer the questions posted by users without the need for humans.

Let us see how to create a simple Bot application using Visual Studio 2017.

Also, if you want to have Bot Application as a template, then as a workaround just download this project (the download starts once you click on the link) and put the extracted folder into the location below:

This is the third post in the Machine Learning Questions and Answers series. Look here for previous posts.

The fundamental requirement for starting any Machine Learning project is to identify which algorithm to apply to the business problem at hand.

For that, you must be able to pick the algorithm by looking at the problem. You should be able to decide whether to choose Classification, Regression, Clustering, Recommendation etc.

Today’s question is somewhat around that.

How to choose a Machine Learning Algorithm from the problem given by the Business?

Answer –

The very first step is to create a Problem Statement from the problem given by the Business.

Importance of the Problem Statement

Whenever you start working on any Machine Learning project, you first need to understand what the problem is

You should be able to understand what type of problem you want to solve, which mainly depends on how you set the problem statement

Setting a problem statement includes the input and output of the business problem you want to solve

The problem statement must be concrete and should not span multiple sentences

For example, if your company is receiving a lot of spam or fake emails that may harm your organization, you need an algorithm that can identify which emails are spam\fake. In this case, an email is the input and whether the email is spam\fake is the output. So the Problem statement would be “Is the mail spam\fake or not?”

Once the problem statement is defined, you should be able to identify which algorithm to choose among the types below:

Any Classification Algorithm

Any Clustering Algorithm

Any Regression Algorithm

Any Recommendation Algorithm

When to use Classification Algorithms:

Use Classification when your problem statement is a question that classifies something (i.e. Is this good or bad?) or when your goal is to predict discrete values, e.g. {1,0}, {True, False}, {Spam, No Spam}

Classification is Supervised Learning.

Real Time Examples:

Is this mail spam or not?

Is the customer happy or not?

After coming out of a restaurant, did the person look satisfied or not? Here Restaurant can be replaced with anything like Bank, shop, mail etc

Is the feedback provided by the customer positive or negative?

Is the trading day an Up-day or a Down-day?

Classification algorithms are used by banks to classify loan applicants by their probability of defaulting payments

All of the above problem statements follow a certain pattern. They involve taking an object or entity and classifying it. For example, in Spam detection we are classifying an Email

On the other hand, you have the categories into which it needs to be classified. For example, in Binary classification there are 2 categories, such as whether a tweet is positive or not; with multiple categories, the item is assigned to one of several classes.

Example of Classification Algorithm:

K – Nearest neighbor

Decision Trees

Bayesian Classifier
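As a rough illustration of the first algorithm in this list, here is a minimal K-Nearest neighbor sketch in plain Python. The two email features (number of links, number of ALL-CAPS words) and the sample values are made up for the example; a real spam filter would need far richer features.

```python
from collections import Counter
import math


def knn_predict(training_data, new_point, k=3):
    """Classify new_point by majority vote among its k nearest labelled neighbours."""
    # Sort the labelled examples by Euclidean distance to the new point.
    by_distance = sorted(
        training_data,
        key=lambda example: math.dist(example[0], new_point),
    )
    # Take the labels of the k closest examples and return the most common one.
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical features per email: (number of links, number of ALL-CAPS words).
emails = [
    ((8, 12), "spam"),
    ((7, 9), "spam"),
    ((9, 15), "spam"),
    ((1, 0), "not spam"),
    ((0, 1), "not spam"),
    ((2, 1), "not spam"),
]

print(knn_predict(emails, (6, 10)))  # spam
print(knn_predict(emails, (1, 2)))   # not spam
```

The labelled emails play the role of the training data, and the pre-defined categories are the two labels.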

When to use Clustering Algorithms:

For example, there is a large group of users and you want to divide them into groups based on some common attributes. The key part is that the groups are unknown beforehand. So you should go with a Clustering Algorithm when the groups are not known beforehand.

If your problem statement contains questions like how something is organized, how to group something, or how to concentrate on particular groups, then you should go with Clustering.

Clustering is Unsupervised Learning.

Real Time Examples:

When you have a large social network site and you want to divide the users on the basis of the Likes they made on posts or on the basis of Demographics, clustering helps to identify the meaningful groups

A company learns that particular products are not selling as expected, so those products can be clustered. Then only the cluster of such products has to be checked rather than comparing the sales value of all the products

Most search engines, like Yahoo and Google, use Clustering Algorithms to cluster web pages by similarity and identify the ‘relevance rate’ of search results. This helps search engines reduce the computational time for the users.

We all know Dominos guarantees to deliver pizza in 30 minutes. They use Clustering to choose Pizza shop locations in such a way that the traveling time of the Pizza guy is minimized. That is the reason they are so sure that they will reach customers within 30 minutes

You should go with Clustering when the major focus is to create groups which have similar attributes. For example, you’re given a set of historical transactions which record who bought what. By using clustering techniques, you can derive the segmentation of your customers.

Example of Clustering Algorithm:

K- Means Algorithm

Expectation maximization
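To make the idea concrete, below is a small K-Means sketch in plain Python. It is only an illustration: the points are invented, and starting from the first k points (instead of a random choice) is a simplification made here so the result is repeatable.

```python
import math


def k_means(points, k, iterations=10):
    """Group points into k clusters; returns (centroids, labels)."""
    # Simplification: start from the first k points instead of a random choice.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(point, centroids[c]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


# Hypothetical customers described by two numeric attributes.
customers = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
centroids, labels = k_means(customers, k=2)
print(labels)
```

No labels are supplied anywhere; the two groups come purely from how close the points are to each other.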

When to use Regression Algorithms:

For example, you want to compute some continuous value, as opposed to Classification where the output is categorical. So whenever you are asked to predict some future value of a process which is currently running, you can go with a Regression Algorithm.

Here you deal with numbers, and your problem statement may contain questions about how much, how many, how long, or the impact of one thing on another. In such cases you should go with Regression.

Regression is Supervised Learning.

Real Time Examples:

What will be the value of Bitcoin in Dollars on any particular future date

As you can see in all the above examples, we have some continuous process like BitCoin price, Commute Time, Sales etc and the output depends upon certain inputs. For example, Commute time depends on the time of the day when I want to travel + the distance + weather etc.

Example of Regression Algorithm:

Linear Regression Algorithm

Logistic Regression Algorithm (despite its name, it is usually applied to Classification problems)

Polynomial Regression Algorithm etc

When to use Recommendation Algorithms:

For example, when you want to determine what kind of theme a user would like in future based on the user’s past behavior, as in Collaborative filtering (for example, “Users like you also liked” kind of suggestions)

So when your problem statement contains questions like what the user should do next, what to suggest to the user, or what the top choices of particular users are, you should go with Recommendation.

Recommendation is generally treated as Unsupervised Learning.

Real Time Examples:

If a user buys a Washing Machine, what else would he buy in future

What are the top 10 choices of books for a particular user

What kind of artist the user would like when he comes back to a music-related application

Providing user A the list of items bought by another user B whose behavior is very similar to that of user A

As you can see in the above examples, it is mainly based on a particular user and his\her historical behavior.

Example of Recommendation Algorithm:

Collaborative filtering Algorithm

Content-based filtering Algorithm etc
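The "user A gets what a similar user B bought" idea can be sketched in a few lines of Python. This is a deliberately tiny, user-based variant of collaborative filtering; the user names, the purchases and the choice of Jaccard overlap as the similarity measure are all assumptions made for the illustration.

```python
def jaccard(items_a, items_b):
    """Similarity of two item sets: size of the overlap divided by size of the union."""
    union = items_a | items_b
    return len(items_a & items_b) / len(union) if union else 0.0


def recommend(purchases, user):
    """Suggest items bought by the most similar other user that `user` does not have yet."""
    others = [name for name in purchases if name != user]
    most_similar = max(
        others, key=lambda name: jaccard(purchases[user], purchases[name])
    )
    return sorted(purchases[most_similar] - purchases[user])


# Hypothetical purchase history per user.
purchases = {
    "user_a": {"washing machine", "detergent"},
    "user_b": {"washing machine", "detergent", "fabric softener"},
    "user_c": {"guitar", "amplifier"},
}

print(recommend(purchases, "user_a"))  # ['fabric softener']
```

Here user_b is the closest match for user_a, so the one item user_b has that user_a lacks becomes the suggestion.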

Conclusion:

Whenever you want to identify which Algorithm to use:

Set one concrete Problem statement from the problem given to you

The choice you make here will completely determine the next steps

Choose the Algorithm by looking at the problem statement and what exactly is needed

I have started a series of Machine Learning Questions and Answers; you can find the first post here.

Let us see some more questions and answers.

1) What is OverFitting in Machine Learning? OR Is OverFitting good or bad in Machine Learning?

Answer –

Overfitting is not good for Machine learning projects.

As the name suggests, it is nothing but trying to fit something more closely than required.

One liner for Overfitting:

When you have memorized something but have not learned it, and thus are not prepared for the future

Definition:

Overfitting occurs when we capture low-level details of a particular data set but fail to capture its higher-level, more abstract details. This creates problems for future examples

In other words, Overfitting happens when a model learns the detail and noise in the training data to the extent that it negatively impacts the performance of the model on new data

And as per Wikipedia – Overfitting is the production of an analysis that corresponds too closely or exactly to a particular set of data, and may, therefore, fail to fit additional data or predict future observations reliably.

Real life example of Overfitting:

You started reading a chemistry book

You read every single low-level detail like every single letter and digit

You read the whole book from start to end

But what if you cannot think about the bigger picture, like how the different topics relate to each other

Thus you have memorized the book but have not learned it

So if someone asks you questions from that book, it is not certain that you can answer all of them

Let us see some examples:

You have some data and you plot it on the X and Y axes as below:

Suppose you try to separate the X and 0 with something like below:

Though it looks valid, it is a poor fit and may create problems in future. This can happen when you take the data too literally. In the above example, it works perfectly for the current data, but it will not work as well once new data is added.

Instead, you can take something simple as below:

With this, even if you add some more data, the distance of 0 would be smaller compared to the line we drew above.

In short, prepare for the future instead of giving too much attention to the present.

Let us take some visual examples:

For example, you want to create a program which can identify a lion.

In this case, we are overfitting when we collect very specific and detailed data like the height and width of the lion and other minor, deep details as below:

So the program will perfectly identify a lion whose details closely match those mentioned above, but it becomes useless when identifying new examples like a White Lion.

The program would fail to identify a White Lion because it was trained on unnecessary deep details instead of abstract and useful ones.
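The "remembered but not learned" effect can be reproduced with numbers. The sketch below is an illustration under invented data: the underlying rule is y = 2x, the training points carry some noise, and two models are compared. The "memorizer" simply repeats the nearest training answer, while the simple model is a straight line fitted by least squares.

```python
def mean_squared_error(model, data):
    """Average squared gap between the model's prediction and the actual value."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


def memorizer(train):
    """Overfit 'model': returns the y of the closest training x, noise included."""
    return lambda x: min(train, key=lambda pair: abs(pair[0] - x))[1]


def straight_line(train):
    """Simple model: least-squares line through the training points."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in train) / sum(
        (x - mean_x) ** 2 for x, _ in train
    )
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


# Hypothetical data: the true rule is y = 2x, the training values are noisy.
training_points = [(1, 3.0), (2, 3.0), (3, 7.0), (4, 7.0), (5, 11.0), (6, 11.0)]
unseen_points = [(1.4, 2.8), (2.4, 4.8), (3.4, 6.8), (4.4, 8.8), (5.4, 10.8)]

for name, build in [("memorizer", memorizer), ("straight line", straight_line)]:
    model = build(training_points)
    print(
        name,
        "training error:", round(mean_squared_error(model, training_points), 3),
        "unseen error:", round(mean_squared_error(model, unseen_points), 3),
    )
```

The memorizer has zero error on the data it has already seen, yet it does worse than the plain line on the unseen points, which is the pattern described above.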

2) How can we avoid Overfitting?

Answer –

We can avoid Overfitting by splitting the data 3 ways:

Training set

Cross-validation set

Test set

This way we ensure that the model does not depend on any one particular set
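A 3-way split can be sketched as follows. The 60/20/20 proportions and the fixed seed are assumptions chosen for the example, not a rule; the usual idea is to fit on the training set, compare candidate models on the cross-validation set, and keep the test set for one final check.

```python
import random


def three_way_split(rows, train=0.6, validation=0.2, seed=42):
    """Shuffle rows and cut them into training, cross-validation and test sets."""
    shuffled = list(rows)
    # A fixed seed keeps the split repeatable between runs.
    random.Random(seed).shuffle(shuffled)
    train_end = round(len(shuffled) * train)
    validation_end = train_end + round(len(shuffled) * validation)
    return (
        shuffled[:train_end],
        shuffled[train_end:validation_end],
        shuffled[validation_end:],
    )


training_set, cross_validation_set, test_set = three_way_split(range(100))
print(len(training_set), len(cross_validation_set), len(test_set))  # 60 20 20
```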

Apart from this, you can:

Keep it Simple

Feature selection: consider using fewer feature combinations and decrease the number of numeric attributes bins

I have recently started posting about Machine Learning and got some very positive feedback, because people like the way I explain Machine Learning topics in simple words.

By popular demand, I am starting a series of Machine Learning Questions and Answers.

I will keep posting questions along with their answers here as I come across them.

So let us start:

1) What are the types of Machine Learning and what is the difference between Supervised Machine Learning and UnSupervised Machine learning?

Answer – For this, I have already written a post which you can find here.

2) What is the difference between Classification and Clustering?

Answer –

One liner for Classification:

Classifying data into pre-defined categories

One liner for Clustering:

Grouping data into a set of categories

Key difference:

Classification takes data and puts it into pre-defined categories, whereas in Clustering the set of categories into which you want to group the data is not known beforehand.

Let us go a bit deeper into Classification first:

In classification, you start with one instance (one object) to be classified

You classify it into pre-defined categories, which are nothing but the labels

You do this based on training data which has already been classified

For example, in sentiment analysis you would classify a comment as positive or negative, based on a set of training data which has already been classified into positive and negative comments

So if you have understood Supervised Machine Learning, you will realize that Classification is nothing but Supervised Machine Learning

Simple understanding with an example:

You give your algorithm (your friend) some data (a set of people), called Training data, and teach it which data corresponds to which label (Male or Female). Then you point your algorithm to certain data, called Test data, and ask it to determine whether it is Male or Female. The better your teaching, the better its prediction.

Some real-life examples:

Is the e-mail spam or not

Is a comment on a Facebook post or a Tweet on Twitter positive or negative

Is the trading day an Up-day or a Down-day

Handwritten Digit Recognition

Speech Recognition

Image Recognition

Example of Classification Algorithm:

K – Nearest neighbor

Decision Trees

Bayesian Classifier

Steps of Classification Setup:

The problem has to be defined first

Then you represent your data in the form of numerical attributes called Features. This is done both for the training data, which has already been classified, and for the test data, which has to be classified in the future

You take your training data and feed it into a classification algorithm to train a model

Take a new instance that needs to be classified, or take the test data, and pass it to the classifier
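The steps above can be sketched end to end in Python. The classifier used here is a nearest-centroid classifier, picked only because it is short; the sentiment features (counts of positive and negative words per comment) and the sample values are invented for the illustration.

```python
import math


def train_nearest_centroid(training_data):
    """Training step: average the feature vectors of each label into one centroid."""
    grouped = {}
    for features, label in training_data:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(axis) / len(rows) for axis in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(model, features):
    """Prediction step: return the label whose centroid is closest to the features."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical features per comment: (count of positive words, count of negative words).
training_data = [
    ((3, 0), "positive"),
    ((4, 1), "positive"),
    ((0, 3), "negative"),
    ((1, 4), "negative"),
]

model = train_nearest_centroid(training_data)  # train a model on labelled data
print(classify(model, (5, 1)))  # positive
print(classify(model, (0, 2)))  # negative
```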

Now let us see something more about Clustering:

Instead of taking a single instance (as in Classification above), we take a large number of instances

We divide these instances into groups

Whereas we had pre-defined categories in Classification, in clustering the groups are unknown beforehand

Basically, before the clusters are formed we do not know what to call them, because until then we do not know what the common attributes inside these clusters will be

Yes, it is UnSupervised Machine Learning

Simple understanding with an example:

In Clustering, you provide the data (a set of people) to the algorithm (your friend) and ask it to group the data.

Now it is up to the algorithm to decide the best way to group them (gender, color or age group).

Again, you can definitely influence the decision made by the algorithm by providing extra inputs.

Some Real-life examples:

How can we divide a set of articles into groups that share the same theme (we do not know the themes of the articles ahead of time)

Identifying groups of houses according to their house type, value and geographical location

Clustering earthquake epicenters to identify dangerous zones

Placing telephone towers in a new city using clustering such that all its users receive optimum signal strength

Example of Clustering Algorithm:

K- Means Algorithm

Expectation maximization

Steps of Clustering Setup:

You start with the problem statement, which is the dataset that needs to be clustered

Then you would represent points in that dataset using features.

No training step here

You would directly feed the data into Clustering algorithm to find the final clusters, without any training steps

3) Can Classifier and Clustering go hand in hand OR Can Classifier and Clustering work together?

Answer – Yes they can.

For example, you have a set of articles -> you divide these articles into clusters based on the tags -> the articles are grouped based on the tags

Now you have a new article -> the article is sent to the Classifier and the Classifier assigns one of the tags that were discovered during Clustering above -> the tag is identified

So basically, the articles which were grouped into different clusters based on the tags become the training data for the Classifier.
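Here is one way this hand-off could look in Python. It is a sketch under assumptions: the articles are reduced to two made-up numeric features, the clustering step is a very simple "leader" clustering (a point joins the first cluster whose leader is within a chosen radius, otherwise it starts a new cluster), and the classifier is a nearest-neighbour lookup. Any clustering and any classifier could be swapped in.

```python
import math


def leader_clustering(points, radius):
    """Assign each point to the first cluster leader within `radius`, else start a new cluster."""
    leaders, labels = [], []
    for point in points:
        for cluster_id, leader in enumerate(leaders):
            if math.dist(point, leader) <= radius:
                labels.append(cluster_id)
                break
        else:
            # No existing leader is close enough: this point starts a new cluster.
            leaders.append(point)
            labels.append(len(leaders) - 1)
    return labels


def nearest_neighbour_label(labelled_points, new_point):
    """Classifier: copy the label of the closest already-labelled point."""
    return min(labelled_points, key=lambda pair: math.dist(pair[0], new_point))[1]


# Hypothetical article features: (share of sports words, share of finance words).
articles = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8), (0.85, 0.05)]

# Step 1: clustering discovers the groups; no labels are given up front.
cluster_ids = leader_clustering(articles, radius=0.3)

# Step 2: the clustered articles become training data for the classifier.
training_data = list(zip(articles, cluster_ids))
print(cluster_ids)
print(nearest_neighbour_label(training_data, (0.75, 0.25)))
```

The cluster ids stand in for the discovered tags; a new article is simply given the id of the group it sits closest to.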

Conclusion:

Classification assigns a category to 1 new item based on already labeled items, while Clustering takes a bunch of unlabeled items and divides them into categories

In Classification, the categories\groups are known beforehand, while in Clustering they are unknown beforehand

In Classification, there are 2 phases, the training phase and then the test phase, while in Clustering there is only 1 phase: dividing the data into clusters

Classification is Supervised Learning while Clustering is UnSupervised Learning.

I have written a post in which I explained Machine Learning in simple words. You can find the post here.

The heart of Machine Learning is a bunch of Algorithms. Algorithms play a very important role in creating models. Nowadays languages like Python and R and tools like Azure and AWS have made our lives much easier when we want to create Machine Learning projects.

But one should understand the algorithm instead of just using the pre-existing libraries that these languages already provide.

In this article, I will try to explain Linear Regression.

Let us go back to our school days. You might remember the equation:

Y = MX + B

where:

B is the intercept,

M is the slope, which can be positive or negative

X is the independent variable

Y is the dependent variable

So if you have X, you can figure out what Y is.

In simple words, “Linear Regression” is a method to predict the dependent variable (Y) based on the values of the independent variable (X).

Situation 1:

If X increases and Y also increases, it is called a positive relationship:

Situation 2:

If X increases but Y decreases, it is called a negative relationship:

Let us see how to create a regression line:

To conduct regression, we require several observations. We can plot those observations on the X and Y axes:

Once all the observations are plotted, we can draw the line that best fits those observation dots; this is called the Regression line:

As we know, the observations will never all lie on a straight line; there is always a difference between the estimated value and the actual value. In the end, we need to minimize this difference, which we call the error:

We aim to minimize these errors, and the above line has large errors when we compare the actual values with the estimated values:
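One common way to minimize these errors is the least-squares method, which picks M and B so that the sum of the squared errors is as small as possible. The Python sketch below applies the standard closed-form formulas to a handful of invented observations; the function names are chosen here for the illustration.

```python
def fit_line(observations):
    """Return (M, B) of the least-squares line Y = M*X + B through the observations."""
    n = len(observations)
    mean_x = sum(x for x, _ in observations) / n
    mean_y = sum(y for _, y in observations) / n
    # Slope: how X and Y vary together, divided by how much X varies on its own.
    m = sum((x - mean_x) * (y - mean_y) for x, y in observations) / sum(
        (x - mean_x) ** 2 for x, _ in observations
    )
    # Intercept: makes the line pass through the point of averages.
    b = mean_y - m * mean_x
    return m, b


def sum_of_squared_errors(observations, m, b):
    """Total squared difference between the actual and the estimated values."""
    return sum((y - (m * x + b)) ** 2 for x, y in observations)


# Hypothetical observations as (X, Y) pairs.
observations = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]

m, b = fit_line(observations)
print("M =", round(m, 3), "B =", round(b, 3))
print("errors for best line:", round(sum_of_squared_errors(observations, m, b), 3))
print("errors for another line:", round(sum_of_squared_errors(observations, 1.0, 1.0), 3))
```

For these observations the fitted values come out at roughly M = 0.6 and B = 2.2, and any other line, such as the M = 1, B = 1 line tried at the end, gives a larger total error.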

Let us take some examples to understand the Positive relationship and Negative relationship.

For example, if we study more our grades would increase:

It is a positive relationship where:

y is estimated grades

x is study time

b0 can be derived mathematically; it is the y-intercept

b1 can also be derived mathematically; it is the slope

In simple words, if your study time is 0, grades are 10%, and if you increase study time by 10 hours, grades would be greater than 10%. Grades can be calculated using the above equation.

Though there may be a lot of other features which affect the grades, for simplicity we have considered study time as the only feature. This is also called Univariate Linear Regression.

Multiple Linear Regression

We can extend our feature set by selecting more parameters like the IQ of the person, the interest in the subject etc. For example, we can plot the grades against the interest of the person in a particular subject and the study time on a single graph, where the vertical axis plots grades and the two horizontal axes plot the interest in the subject and the study time:

In this case, we can again fit a predictor to the data. But instead of drawing a line through the data we have to draw a plane through the data because the function that best predicts the grades is a function of two variables.
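Fitting such a plane can be sketched in plain Python by solving the so-called normal equations. Everything specific here is an assumption for the illustration: the samples are invented so that they lie exactly on grades = 10 + 5 * study time + 2 * interest, and the small equation solver is written out only to keep the example free of external libraries.

```python
def solve(matrix, vector):
    """Solve matrix * x = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    rows = [list(row) + [value] for row, value in zip(matrix, vector)]
    for col in range(n):
        # Swap in the row with the largest entry in this column for stability.
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    # Back substitution, from the last unknown to the first.
    solution = [0.0] * n
    for r in reversed(range(n)):
        known = sum(rows[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (rows[r][n] - known) / rows[r][r]
    return solution


def fit_plane(samples):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    design = [(1.0, x1, x2) for x1, x2, _ in samples]
    targets = [y for _, _, y in samples]
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(3)] for i in range(3)]
    xty = [sum(row[i] * y for row, y in zip(design, targets)) for i in range(3)]
    return solve(xtx, xty)


# Hypothetical samples: (study time in hours, interest score, grade).
samples = [(0, 0, 10), (1, 0, 15), (0, 1, 12), (2, 1, 22), (1, 3, 21), (3, 2, 29)]

b0, b1, b2 = fit_plane(samples)
print(round(b0, 3), round(b1, 3), round(b2, 3))
```

With real, noisy data the plane would not pass through every point, but the same calculation still gives the plane with the smallest total squared error.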

Now, let us see the Negative relationship:

If you spend more time on Facebook, grades would decrease:

As you can see above, as x increases, it is multiplied by -b1 (the negative slope). Thus y decreases.

X is an independent variable which we can manipulate, control, change, and Y is a dependent variable which is nothing but the outcome of X’s activity.