Introducing Text Analysis API

Human beings are remarkably adept at understanding each other, given that we speak in languages of our own construction which are merely symbols of the information we’re trying to convey.

We’re skilled at understanding for two reasons. First, we’ve had literally millions of years to acquire the necessary skills. Second, we generally speak in the same terms and the same languages. Still, it’s an incredible feat to extract understanding and meaning from such an avalanche of signal.

Consider this: researchers in Japan used the K Computer, currently the fourth most powerful supercomputer in the world, to process a single second of human brain activity.

It took the computer 40 minutes to process that single second of brain activity.

To reach the level of understanding that today’s applications and news organizations require, machines would have to sift through astronomical amounts of data, separating the meaningful from the meaningless. Much like our brains consciously process only a fraction of the information they store, a machine that could separate the wheat from the chaff would be capable of extracting remarkable insights.

We live in the dawn of the computer age, but in the thirty years since personal computing went mainstream, we’ve seen little progress in how computers work on a fundamental level. They’ve gotten faster, smaller, and more powerful, but they still require huge amounts of human input to function. We tell them what to do, and they do it. But what if what we’re truly after is understanding? To endow machines with the ability to learn from us, to interact with us, to understand what we want? That’s the next phase in the evolution of computers.

Enter NLP

Natural Language Processing (NLP) is the catalyst that will spark that phase. NLP is a branch of Artificial Intelligence that allows computers to not just process, but to understand human language, thus eliminating the language barrier.

Contextual services built into your smartphone, such as Siri and Google Now, rely heavily on NLP. NLP is why Google knows to show you directions when you say “How do I get home?”

There are many other examples of NLP in products you already use, of course. The technology driving NLP, however, is not quite where it needs to be (which is why you get so frustrated when Siri or Google Now misunderstands you). To truly reach its potential, this technology, too, has a next step to take: understanding you. It’s not enough to recognize generic human traits or tendencies; NLP has to be smart enough to adapt to your needs.

Most startups and developers simply don’t have the time or the resources to tackle these issues themselves. That’s where we come in. AYLIEN (that’s us) has combined three years of our own research with emerging academic studies on NLP to provide a set of common NLP functionalities in the form of an easy-to-use API bundle.

Article Extraction

This tool extracts the main body of an article, removing all extraneous clutter, but leaving intact vital elements like embedded images and video.

Article Summarization

This one does what it says on the tin: summarizes a given article in just a few sentences.
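To give a feel for what summarizing an article can involve, here is a minimal sketch of one classic approach, frequency-based extractive summarization. It is not a description of how our Summarization endpoint works: the function name `summarize`, the `max_sentences` parameter and the tiny stopword list are all assumptions made for illustration.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list (illustrative assumption, not a real resource).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "it",
             "that", "for", "on", "with", "as", "are", "was"}


def _content_words(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Keep the sentences whose words are, on average, most frequent in the text."""
    # Naive sentence splitting: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(_content_words(text))

    def score(sentence):
        tokens = _content_words(sentence)
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    # Rank sentence indices by score, then restore the original order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


if __name__ == "__main__":
    print(summarize("Cats are pets. Cats like cats. The weather was fine.", max_sentences=1))
```

A real summarizer has to cope with much messier input (abbreviations, quotes, headlines), which is exactly the kind of work an API saves you from.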

Classification

The Classification feature uses a database of more than 500 categories to tag an article according to the IPTC NewsCodes standard.

Entity Extraction

This tool can extract any entities (people, locations, organizations) or values (URLs, emails, phone numbers, currency amounts and percentages) mentioned in a given text.
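The "values" half of that list can be approximated with regular expressions, as the sketch below shows. It is deliberately simplistic and is not how our Entity Extraction endpoint is implemented: the `extract_values` function and the three patterns are assumptions made for illustration. Recognizing people, locations and organizations is the harder part and needs far more than pattern matching.

```python
import re

# Deliberately simple patterns (illustrative assumptions, not production-grade).
PATTERNS = {
    "url": re.compile(r"https?://[^\s,;)]+"),
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "percentage": re.compile(r"\d+(?:\.\d+)?%"),
}


def extract_values(text):
    """Return a dict mapping each value type to the list of matches found in text."""
    return {name: pattern.findall(text) for name, pattern in PATTERNS.items()}


if __name__ == "__main__":
    sample = "Contact sales@example.com or visit https://example.com/pricing for 20% off."
    for kind, matches in extract_values(sample).items():
        print(kind, matches)
```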

Concept Extraction

Concept Extraction continues the work of Entity Extraction, linking the entities mentioned to the relevant DBpedia and Linked Data entries, including their semantic types (such as DBpedia and schema.org types).

Language Detection

Language Detection, of course, detects the language of a document from a database of 62 languages, returning that information in ISO 639-1 format.
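For a rough intuition of how a language can be guessed, the toy sketch below counts how many common function words from each language appear in the text and returns the matching ISO 639-1 code. It covers three languages with hand-picked word lists, and both those lists and the `detect_language` function are assumptions for illustration, not the method behind our endpoint.

```python
import re

# Hand-picked function words per ISO 639-1 code (illustrative assumption).
COMMON_WORDS = {
    "en": {"the", "and", "is", "of", "to", "in"},
    "fr": {"le", "la", "et", "est", "de", "les"},
    "de": {"der", "die", "und", "ist", "das", "nicht"},
}


def detect_language(text):
    """Return the ISO 639-1 code with the most word overlap, or None if nothing matches."""
    words = set(re.findall(r"\w+", text.lower()))
    scores = {code: len(words & common) for code, common in COMMON_WORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(detect_language("Le chat est dans le jardin"))
```

Short texts, mixed-language documents and closely related languages quickly defeat an approach this simple.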

Sentiment Analysis

Sentiment Analysis detects the tone, or sentiment, of a text in terms of polarity (positive or negative) and subjectivity (subjective or objective).
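To make the two dimensions concrete, here is a toy lexicon-based sketch that reports both polarity and subjectivity. It is not our model: the word lists, the `analyze_sentiment` function and the extra "neutral" label for ties are all assumptions made for illustration.

```python
import re

# Tiny hand-made word lists (illustrative assumption, not a real sentiment lexicon).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "sad"}


def analyze_sentiment(text):
    """Count opinion words and derive a polarity label and a subjectivity label."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)

    if pos > neg:
        polarity = "positive"
    elif neg > pos:
        polarity = "negative"
    else:
        polarity = "neutral"  # tie or no opinion words at all (assumed extra label)

    # Treat any opinion word as a sign that the text is subjective.
    subjectivity = "subjective" if pos + neg > 0 else "objective"
    return {"polarity": polarity, "subjectivity": subjectivity}


if __name__ == "__main__":
    print(analyze_sentiment("I love this phone, the camera is great"))
```

Word counting like this misses negation, sarcasm and context, which is where trained models earn their keep.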

Hashtag Suggestion

Because discoverability is crucial to social media, Hashtag Suggestion automatically suggests ultra-relevant hashtags to engage audiences across social media.

This suite of tools is the result of years of research mixed with good, old-fashioned hard work. We’re excited about the future of the Semantic Web, and we’re proud to offer news organizations and developers an easy-to-use API bundle that gets us one step closer to realizing our vision.

We’re happy to announce that you can start using the Text API from today for free. Happy hacking, and let us know what you think.

Parsa is an AI, Machine Learning and NLP enthusiast, whose aim is to make these techniques and technologies more accessible and easier to use for developers and data scientists. When he’s not working he likes to play chess ('parsabg' on lichess.org).
