Nexosis @ Work & Play

Building the Bigfoot Classinator: The Model

In part 3 of Building the Bigfoot Classinator series, Guy discusses how he built and tested the model.

This is part 3 of a four-part series where I explain how I built the Bigfoot Classinator. In part 1, I talked about the problem we are solving and the approach I used. In part 2, I told you how I munged the data and got it loaded into the Nexosis API. In this part, I'll show you how I built and tested the model.

Go back and read the other parts if you need some catching up. Or blithely plod forward with no guidance from the past. I'm not gonna tell you how to live your life.

It's only a model

Models are the things in machine learning that do the predicting. They are, in a sense, an intersection of your data and an algorithm. Typically, when you build a model, you divide your data into test data and training data, you pick an algorithm and adjust its parameters, and then you build a model. Usually, your first model is, shall we say, suboptimal. And so you tweak your data and your parameters, and build it again. And again. And again. Eventually, you get something that is good, or at least good enough.

With the Nexosis API, we do the grunt work for you. So, all you need to do is make an HTTP request to start a model building session. To kick off the model build for the Bigfoot Classinator, I used Postman to make an HTTP POST request against the following URL.

https://ml.nexosis.com/v1/sessions/model

The body of this request told the API three things: the type of model I wanted to build, the column to target, and the data source to build the model from.

The dataSourceName contains the name of the dataset I uploaded. It could also name a view, which joins multiple datasets. This is why it's called the dataSourceName and not the dataSetName.

The targetColumn is the column I wanted to predict. I wanted to predict the reportClass so I put that in there. Nothing fancy here.

The predictionDomain contains the type of machine learning problem I wanted to solve. Since I was predicting a class, I chose classification. Other options include regression (if I wanted to predict a number instead of a class) and anomalies (if I wanted to find anomalous records). I didn't want those other things for this application, so classification it is.

As always, if you are following along, don't forget to include HTTP headers for your Content-Type of application/json and your api-key.
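If you would rather script this than click through Postman, here is a minimal Python sketch of the same request using only the standard library. The three body fields, the URL, and the two headers come from the description above. The dataset name "bigfoot-sightings" and the API key are placeholders I made up, so substitute your own.

```python
import json
import urllib.request

SESSIONS_URL = "https://ml.nexosis.com/v1/sessions/model"


def build_model_request(api_key, data_source_name):
    """Build (but do not send) the POST request that starts a model session."""
    body = {
        "dataSourceName": data_source_name,    # dataset or view to learn from
        "targetColumn": "reportClass",         # the column to predict
        "predictionDomain": "classification",  # predicting a class, not a number
    }
    return urllib.request.Request(
        SESSIONS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


# "bigfoot-sightings" is a hypothetical dataset name, not the one from this series.
request = build_model_request("YOUR-API-KEY", "bigfoot-sightings")

# To actually submit it, uncomment the next line:
# response = urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the body and headers before you spend an API call.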

Wait

Once I submitted this request, I got the following response. Well, not exactly this response. I've trimmed some of the extra bits from it that you don't need to worry about for today. Don't fret, if you're following along, you'll get to see them.

A lot of what I got back was stuff I put in it. We can see the dataSourceName, targetColumn, and predictionDomain that I entered earlier. We also get a sessionId, a status, and statusHistory. What is all that?

Here's the thing: sometimes it can take a while to build a model. And the more data, the longer it takes. So, instead of keeping an HTTP connection open that long, with all that implies for server-side threading models, we return a sessionId that can be used for polling the server to get the status of the session.

Of course, I wanted to check the status immediately, so I did. I just made a simple HTTP GET to this URL replacing <sessionId> with the sessionId I got back.
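If you don't feel like mashing the send button in Postman, the polling can be scripted. The following is a rough Python sketch, not an official client. It leans on several assumptions of mine: the status URL pattern, the status values "completed", "failed", and "cancelled", and the five second polling interval are all guesses that you should check against the API documentation. Only the sessionId, status, and modelId field names come from this post.

```python
import json
import time
import urllib.request

# Assumed status endpoint; verify the real URL in the API documentation.
SESSION_URL = "https://ml.nexosis.com/v1/sessions/{session_id}"


def fetch_session(session_id, api_key):
    """GET the session and return the parsed JSON body."""
    request = urllib.request.Request(
        SESSION_URL.format(session_id=session_id),
        headers={"api-key": api_key},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_model(session_id, api_key, fetch=fetch_session,
                   interval_seconds=5.0, max_polls=120):
    """Poll the session until it finishes, then return its modelId."""
    for _ in range(max_polls):
        session = fetch(session_id, api_key)
        status = str(session.get("status", "")).lower()
        if status == "completed":  # assumed status value
            return session["modelId"]
        if status in ("failed", "cancelled"):  # assumed status values
            raise RuntimeError(f"session {session_id} ended as {status}")
        time.sleep(interval_seconds)
    raise TimeoutError(
        f"session {session_id} still running after {max_polls} polls"
    )
```

The fetch parameter exists so you can swap in a fake while testing, which keeps the loop honest without hitting the server.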

You will note a new field in the results: the modelId. It identifies the model that has been created, and that model can be tested using Postman. So, I did just that. I hit the following URL with an HTTP POST, replacing <modelId> with the returned modelId.

https://ml.nexosis.com/v1/models/<modelId>/predict

For the body of the request, I provided reports of two Bigfoot sightings in an array. Personally, I think these are great works of fiction. No commentary on Bigfoot here, it's just that I made them up.
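For the script-inclined, here is a small Python sketch of how such a predict request might be put together. Treat the details as assumptions: the "data" key wrapping the array and the "observed" column name are my guesses at the shape, and the two reports are freshly invented stand-ins. Use the column names from your own dataset and check the API documentation for the exact body format.

```python
import json
import urllib.request

PREDICT_URL = "https://ml.nexosis.com/v1/models/{model_id}/predict"


def build_predict_request(api_key, model_id, sightings):
    """Build (but do not send) the POST request that asks for predictions."""
    body = {"data": sightings}  # assumed wrapper key around the array of reports
    return urllib.request.Request(
        PREDICT_URL.format(model_id=model_id),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


# Two made-up reports; "observed" is a hypothetical column name.
sightings = [
    {"observed": "I saw Bigfoot walking through the woods."},
    {"observed": "I found a huge footprint by the creek."},
]
request = build_predict_request("YOUR-API-KEY", "YOUR-MODEL-ID", sightings)

# To actually submit it, uncomment the next line:
# response = urllib.request.urlopen(request)
```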

Seeing Bigfoot in the woods is certainly a Class A sighting. Finding a big footprint is a Class B sighting. Looks like it worked. Hooray!

Next time on the Bigfoot Classinator

Next time, I'll be posting the exciting conclusion to this series where I write some code around this model to make something an end user would actually use. As always, the code for this is on GitHub. Go check it out.

Guy Royse

Guy is one of our developer evangelists at Nexosis. He spends his days sharing with developers why our API is so great and his nights reminiscing about Hogwarts and dreaming of retiring to his dream job: Santa Claus.