All the models. Such predictions. Wow.

Aidan McLaughlin -
August 17, 2015

Bounding box coordinates found with our new face localization model!

Here at indico, we like to move quickly. Combine our passion for creating powerful tools with our left-brain impulse for GSD, and you have our summer in a nutshell. It’s been awesome. So what should you look out for, and why should you care? For those of you who are already familiar with our services, we’re filling out our offerings and making it easier to compare results (see the Intersections tool below) and to grab all the metadata you need in one call (in other words, get batch results from multiple models). For those of you who are unfamiliar with our stuff, you came at a good time! It should now be easier than ever to start exploring your data using indico. Now, let’s see what we’ve got.

Note: I’m going to be using Python for the examples shown below, but we also have language wrappers in Ruby, Node.js, Java, R, and PHP. If none of those are your jam, you can also make calls to one of our RESTful endpoints directly.

New Text Models

Keywords (And It’s Multilingual!)

This model extracts the less common words from a piece of text, giving you an idea of which parts are most salient. Although the implementation is different, it works like a low-level version of text_tags: the words it pulls out may be indicative of a topic area, but the only prediction we’re making is which words are likely to stand out in context, and are therefore most important.

Also, it’s available in multiple languages! We’re starting to work on finding training data in various languages so that you’re not restricted to only analyzing text in English. Admittedly, it will take time, but we’re eager to support our models in as many languages as we can find good data for.

import indicoio
indicoio.keywords('Did you hear about that coffee shop that has kittens for you to play with?')
# returned results
{
    u'coffee': 0.16531870846662627,
    u'kittens': 0.2311858662209601,
    u'shop': 0.15917130269049332
}
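
Since the result is a plain dictionary mapping each keyword to a score, ranking the keywords takes only the standard library. Here is a minimal sketch that reuses the scores returned above; the helper name top_keywords is ours for illustration and is not part of indicoio.

```python
# Scores copied from the keywords() result shown above.
results = {
    'coffee': 0.16531870846662627,
    'kittens': 0.2311858662209601,
    'shop': 0.15917130269049332,
}


def top_keywords(scores, n=2):
    """Return the n highest-scoring keywords, best first (hypothetical helper)."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:n]]


print(top_keywords(results))  # ['kittens', 'coffee']
```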

Named Entity Recognition (NER)

Our NER model predicts which words and phrases in a piece of text refer to a specific person, place, or organization. This is helpful when you want to catch mentions of well-known proper nouns. If the model believes that a word is a named entity but is unsure of its class, that uncertainty shows up in the score it gives to the unknown class. The confidence value describes the model’s overall certainty that the word is a named entity at all.
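
To make the relationship between the class scores and the confidence value concrete, here is a minimal sketch of reading such a result. The response layout, the sample values, and the helper best_entities are all assumptions for illustration; check the docs for the exact format the NER model returns.

```python
# Hypothetical response: entity text -> class scores plus overall confidence.
# Both the layout and the numbers are made up for illustration.
sample = {
    'London': {
        'categories': {'location': 0.85, 'organization': 0.05,
                       'person': 0.02, 'unknown': 0.08},
        'confidence': 0.97,
    },
    'Jaguar': {
        'categories': {'location': 0.05, 'organization': 0.30,
                       'person': 0.10, 'unknown': 0.55},
        'confidence': 0.61,
    },
}


def best_entities(result, min_confidence=0.5):
    """Map each sufficiently confident entity to its most likely class."""
    picked = {}
    for text, info in result.items():
        if info['confidence'] >= min_confidence:
            scores = info['categories']
            picked[text] = max(scores, key=scores.get)
    return picked


print(best_entities(sample))
```

In this made-up sample, the second entity clears the confidence cutoff but lands in the unknown class, which is the situation described above.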

Twitter Engagement

Austin covered this in more depth in his post, but just to recap: get an idea of how your tweet will be received! This model has learned the features of tweets that get large numbers of retweets and favorites. All you need to do is throw in your tweet’s text.

Intersections

This tool isn’t actually a new model, but it’s a new way of interacting with our existing ones. All you need to do is provide some text data, and then choose two of our models you’d like to examine for a possible relationship. Intersections will then tell you how the two models’ results correlate and how confident it is in each of those connections. Still not quite sure what it does? Here’s a quick example using the first chapter of The Scarlet Letter.

The dictionary is nested according to the order in which you specify the two APIs (no more than two). Here we have sentiment_hq as our primary key, and the possible text_tags topics at the second level of nesting. Finally, for each topic, we have the correlation between the two sets of values and the model’s confidence in its prediction.
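
The nesting described above can be walked with two loops. This is a minimal sketch over a hand-written dictionary: the topic names and numbers are invented, and the key names correlation and confidence are assumptions about the response format, not actual output for The Scarlet Letter.

```python
# Hypothetical intersections result: first model -> second model's topic
# -> correlation and confidence. All values are made up for illustration.
sample = {
    'sentiment_hq': {
        'books': {'correlation': 0.42, 'confidence': 0.90},
        'religion': {'correlation': -0.31, 'confidence': 0.75},
        'cooking': {'correlation': 0.05, 'confidence': 0.20},
    }
}


def strongest_links(result, min_confidence=0.5):
    """List (primary, topic, correlation) for confident links, strongest first."""
    rows = []
    for primary, topics in result.items():
        for topic, stats in topics.items():
            if stats['confidence'] >= min_confidence:
                rows.append((primary, topic, stats['correlation']))
    # Rank by absolute value so strong negative correlations are not buried.
    return sorted(rows, key=lambda row: abs(row[2]), reverse=True)


for row in strongest_links(sample):
    print(row)
```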

Analyze Text

This tool allows you to combine calls when you’d like to analyze the same set of text with multiple models. It’s a great baseline if you’d like to test indico out. However, we recommend that you use a subset of your total data, or store the results to avoid running out of calls when experimenting. This example uses just the first three sentences.
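
One way to follow the advice about storing results is to cache each response on disk, keyed by the input text, so that repeated experiments don’t spend additional calls. Here is a minimal sketch using only the standard library; cached_analyze and the stand-in fake_analyze function are hypothetical, and in practice you would pass in whatever function actually makes the API call.

```python
import hashlib
import json
import os
import tempfile


def cached_analyze(text, analyze, cache_dir):
    """Return analyze(text), reusing a JSON file on disk when one exists."""
    key = hashlib.sha256(text.encode('utf-8')).hexdigest()
    path = os.path.join(cache_dir, key + '.json')
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    result = analyze(text)  # the only place a real API call would happen
    with open(path, 'w') as handle:
        json.dump(result, handle)
    return result


# Stand-in for a real API call; it records every text it is asked about.
calls = []


def fake_analyze(text):
    calls.append(text)
    return {'length': len(text)}


cache_dir = tempfile.mkdtemp()
first = cached_analyze('some example text', fake_analyze, cache_dir)
second = cached_analyze('some example text', fake_analyze, cache_dir)
print(first == second, len(calls))  # True 1
```

The second request for the same text is served from disk, so the stand-in function runs only once.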

New Image Models

Content Filtering

Automatically determine if an image is NSFW. Just pass it an image in a compatible format, and the returned number reflects how inappropriate the content is on a scale from 0 (appropriate) to 1 (inappropriate).

import indicoio
# run from within the downloads folder; the file path is relative to the current directory
indicoio.content_filtering('poptarts.jpg')
# returned results
0.3174145519733429
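
Because the score is a single number between 0 and 1, turning it into a yes/no decision is a matter of choosing a cutoff. Here is a minimal sketch; the 0.5 threshold and the is_nsfw helper are illustrative assumptions, so tune the cutoff to your own tolerance.

```python
def is_nsfw(score, threshold=0.5):
    """Flag an image whose content filtering score meets the cutoff."""
    if not 0.0 <= score <= 1.0:
        raise ValueError('score must be between 0 and 1')
    return score >= threshold


# The score returned for poptarts.jpg above falls below the 0.5 cutoff.
print(is_nsfw(0.3174145519733429))  # False
```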

Facial Localization

Use this tool to find faces in an image: just give it a picture and you’ll receive a list of coordinates that bound each face.
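
Once you have the bounding boxes, measuring or cropping each face is simple arithmetic. In this minimal sketch, the key names top_left_corner and bottom_right_corner, the [x, y] pixel ordering, and the sample values are assumptions for illustration; check the docs for the exact response format.

```python
# Hypothetical result: one dict per detected face, corners as [x, y] pixels.
faces = [
    {'top_left_corner': [37, 39], 'bottom_right_corner': [110, 112]},
    {'top_left_corner': [200, 50], 'bottom_right_corner': [260, 130]},
]


def box_size(face):
    """Return (width, height) of one bounding box in pixels."""
    left, top = face['top_left_corner']
    right, bottom = face['bottom_right_corner']
    return right - left, bottom - top


print([box_size(face) for face in faces])  # [(73, 73), (60, 80)]
```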

Bonus Points! Our FER (Facial Emotion Recognition) model can now take the argument detect_faces=True to automatically analyze just the faces in a picture. Nice!

And the Finale…New Documentation!

We’re excited to announce that we’ve moved to self-hosted docs! They’ll be easier for us to edit, which means we can build new models for you much faster. Using self-hosted docs also means we’ll have an easier time adding nice little features, like jumping to specific pieces of documentation for a specific language, among many other benefits. If you see any issues with them, feel free to send a message via our chat. They’re still a work in progress and I’d be happy to help out.

Thanks for reading, and I hope you have fun with the new toys! Please feel free to contact us if you have any questions; we’re always down to work with you to figure things out.