
How much have you read lately about how bad bullying detection software is? Even the CEO of Twitter declared in an internal memo, exposed by The Verge, “We suck at dealing with abuse and trolls on the platform and we’ve sucked at it for years.”

As one of the great poets of our time put it, “haters gonna hate, hate, hate, hate, hate”. So why is this such a difficult problem? There are multiple scholarly papers on the web about this problem, some of them hundreds of pages long, and that is just for English! Can you imagine solving the problem for multiple languages? After all, 48% of Facebook users and 49% of Twitter users use a language other than English on those sites. Solving abuse in English solves only half the problem. Let’s look at a couple of techniques used today.

False Positives vs False Negatives

If our intent is weighted more towards stopping abuse than towards freedom of speech, then we have to weigh any solution towards producing more false positives and fewer false negatives. A false positive would be the software identifying a phrase like “I had to suck through the straw hard to drink my milkshake” as offensive because of the word “suck”. A false negative would be the software identifying the phrase “you suck, and I should pour this milkshake on your head” as NOT offensive, despite the threat, because it shares the words “suck” and “milkshake” with innocuous sentences. Both examples are extreme, but you get the point. If we are mitigating abuse and bullying, then we need to err on the side of false positives, not false negatives.
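The two error types above can be made concrete with a toy tally. This is a minimal sketch: the two sentences come from the article, but the labels and the naive “contains a bad word” rule are invented for illustration.

```python
# Two examples from the article, hand-labeled: (text, actually_offensive)
examples = [
    ("I had to suck through the straw hard to drink my milkshake", False),
    ("you suck, and I should pour this milkshake on your head", True),
]

def naive_flag(text):
    # A deliberately crude rule: flag anything containing "suck".
    return "suck" in text.lower()

# A false positive: flagged but harmless. A false negative: missed but offensive.
false_positives = sum(1 for text, offensive in examples
                      if naive_flag(text) and not offensive)
false_negatives = sum(1 for text, offensive in examples
                      if not naive_flag(text) and offensive)
print(false_positives, false_negatives)  # → 1 0
```

The crude rule catches the threat but also flags the milkshake sentence, which is exactly the trade-off the paragraph describes: erring towards false positives.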

Keyword Search

If an online venue is overflowing with trolls and abuse, there is no time to lose and something must be done about it; panic drives people to adopt short-term fixes. One such naïve approach is to look for “potentially offensive” terms, then have moderators review those posts. Unfortunately, this produces a huge number of both false positives and false negatives. While ethnic slurs are generally always offensive, words like “suck” and “fat” have plenty of legitimate meanings.

On the other hand, completely neutral terms may be combined into an offensive utterance, e.g. “Most people on Earth are nice except <insert your favorite ethnic / social group>”.
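A keyword blacklist fails in both directions at once. Here is a minimal sketch; the blacklist and both test sentences are invented for this example.

```python
# Hypothetical "potentially offensive" term list, as a naive moderator queue
# might use it.
BLACKLIST = {"suck", "fat"}

def keyword_filter(text):
    # Strip basic punctuation and compare word-by-word against the blacklist.
    words = {w.strip(".,!?").lower() for w in text.split()}
    return bool(words & BLACKLIST)

# False positive: a legitimate use of a blacklisted word gets flagged.
print(keyword_filter("This vacuum cleaner does not suck enough"))   # → True
# False negative: offense built from entirely neutral words slips through.
print(keyword_filter("Most people on Earth are nice except them"))  # → False
```

No amount of list curation fixes the second case, because no individual word in it is offensive.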

So yes, some cases will be resolved, and the effort will make it seem that someone is working to resolve the issue, but it won’t really solve the problem.

Statistical and Machine Learning Approaches

Statistical and machine learning approaches try to detect offensive messages via a more holistic view: here, dear computer, is a set of offensive posts. Have a look, work out what they have in common, and when I send you a new message, tell me whether it is similar to the offensive ones.

There are several problems here. First, a post may contain only a small offensive part and otherwise be perfectly acceptable. However, there is no guarantee the classifier will know which part that is, so it will not be able to draw the correct conclusion. Second, as of today, most if not all classifiers view the content as a bag of words, and as we know, switching a subject noun and an object noun yields a completely different meaning. If the words are normalized to lemmas, we risk getting more false positives; if they are not, we will get more false negatives. Finally, getting an adequate, well-annotated training set is not always simple or even possible, especially if the content is very diverse and comes in huge volumes. It may be as hopeless as trying to get a limited number of monkeys with typewriters to produce the complete works of Shakespeare.
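The bag-of-words blindness to word order is easy to demonstrate. A sketch, with two invented sentences that swap subject and object:

```python
from collections import Counter

def bag_of_words(text):
    # Count word occurrences; all ordering information is discarded.
    return Counter(text.lower().split())

a = bag_of_words("the moderator insulted the user")
b = bag_of_words("the user insulted the moderator")

# Completely different meanings, identical feature vectors.
print(a == b)  # → True
```

Any classifier fed these features literally cannot tell who insulted whom.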

Semantics

There is no getting around the requirement of understanding the message semantically or, simply put, what the message is about. This means parsing the entire message, mapping pieces to syntactic structures, and linking words to their actual senses. Is it “suck” as in “be inadequate” or “suck” as in “draw fluid into one’s mouth”? We have to find this out.
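One common way to link a word to a sense is to compare its context against cue words for each candidate sense. This is a minimal sketch only: the two senses of “suck” come from the article, but the sense names and cue words are invented, and a real system would use a full semantic network rather than a hand-written table.

```python
# Hypothetical sense inventory: sense name → context cue words.
SENSES = {
    "suck": {
        "be_inadequate": {"you", "really", "totally"},
        "draw_fluid":    {"straw", "milkshake", "drink", "mouth"},
    }
}

def disambiguate(word, context_words):
    # Pick the sense whose cue words overlap the context the most.
    best, best_overlap = None, -1
    for sense, cues in SENSES.get(word, {}).items():
        overlap = len(cues & set(context_words))
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best

print(disambiguate("suck", ["straw", "drink", "my", "milkshake"]))  # → draw_fluid
```

With no overlapping cues the sketch just falls back to the first sense listed; real disambiguation needs the parsing and syntactic mapping the paragraph describes.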

But it’s still not enough. While some words may be offensive in their own right, in most cases we have to discover the sentiment, too. Not just the sentiment score of the message, but the actual targeted sentiment towards a specific entity in the message. We have to flag a message like “Klingons have never been nice” but let pass “Klingons have never been so nice”.
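The Klingon pair can be separated by a rule in the spirit of the one described below for Carabao LVM (negation plus a favorable adjective reads as negative towards the target), with one extra check: the intensifier “so” flips “never been so nice” into a compliment. The word lists and the rule itself are invented for this sketch.

```python
# Hypothetical lexicons for a targeted-sentiment rule.
FAVORABLE = {"nice", "kind", "smart"}
NEGATIONS = {"never", "not"}

def offensive_toward_group(tokens):
    negated = any(t in NEGATIONS for t in tokens)
    favorable = any(t in FAVORABLE for t in tokens)
    intensified = "so" in tokens  # "never been SO nice" is praise, not insult
    return negated and favorable and not intensified

print(offensive_toward_group("klingons have never been nice".split()))     # → True
print(offensive_toward_group("klingons have never been so nice".split()))  # → False
```

A bag-of-words system sees these two sentences as differing by one stopword; a rule that knows what “so” does to the adjective gets them right.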

Semantically and grammatically aware systems like those we built with our Carabao Linguistic Virtual Machine determine offensiveness by relying on pragmatics, which function as “business logic” in this case. The pragmatics and the semantic network in Carabao LVM are shared by all languages (negation plus a favorable adjective signals negativity in all languages; associating a social or ethnic group with unfavorable adjectives is offensive, too), and this adds yet another advantage: once this business logic is built, it is shared by all the language models of Carabao LVM (at this time, 22 languages).

The application using Carabao LVM does not have to implement any language-specific logic; all it has to do is look for specific identifiers and flag the content that carries them. Your abuse detection doesn’t have to suck!

Machine translation is the science of taking human language content, usually in text form (although speech-to-speech systems are being developed), and using a computer to digest it and produce output in another language that is faithful to the original. Simply put: English in, German out. So why is that so unbelievably hard to do?

Going back to the history of machine translation (hereafter called MT), you had scientists who believed it was a word-for-word lookup and replacement problem. It is easy for even the uninitiated to see that this solution would break down quickly. For one thing, there was no word rearrangement. How would a Spanish speaker be able to read an English source sentence translated into Spanish if all of the adjectives came before the nouns instead of after them? That is only a tiny fraction of the problems, but an easy one to understand.

So this was obviously a case of needing rules for word rearrangement, which in turn required that the MT engine figure out the part of speech of each word in the source. How could you move the adjectives behind the noun if you didn’t know which words were the adjectives and which noun they were modifying? Thus “direct” systems were born, where the engine would rearrange the words based on the source and then flip the words to the target language.

Pretty soon computers got a whole lot more powerful, and with more computing power came even more rules. Direct systems then morphed into “transfer” systems, where you could actually break the engine down into analysis, transfer, and synthesis. The quality improved significantly, but that is sort of like saying it was less bad. Still, investment dollars were plentiful, because the holy grail of a near-perfect system would literally change the world.

In the 80s, scientists started playing with hidden Markov models (don’t worry, I’m not going to explain what they are) and statistical MT. Basically: based on the words around a word, and on all of the corpus the engine has already read, what are the odds that the answer is this sense of the word and not that other sense? Of course, in that world more data is good data, and statistical MT actually got the world on board with systems like Google Translate and other free systems.
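The adjective-reordering problem that motivated direct systems can be sketched in a few lines. Everything here is invented for illustration: a three-word lexicon with part-of-speech tags, and one rule that moves an adjective behind the noun it precedes, as Spanish usually requires.

```python
# Toy lexicon: English word → (Spanish word, part-of-speech tag).
LEXICON = {"the": ("el", "DET"), "red": ("rojo", "ADJ"), "car": ("coche", "NOUN")}

def translate(tokens):
    tagged = [LEXICON[t] for t in tokens]
    out = []
    for word, tag in tagged:
        if out and tag == "NOUN" and out[-1][1] == "ADJ":
            # Reordering rule: in Spanish the adjective follows the noun.
            adj = out.pop()
            out.append((word, tag))  # noun first...
            out.append(adj)          # ...then the adjective
        else:
            out.append((word, tag))
    return " ".join(w for w, _ in out)

print(translate("the red car".split()))  # → el coche rojo
```

Notice the rule only works because the lexicon supplies part-of-speech tags, which is exactly the analysis step the paragraph says direct systems were forced to add.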

I’m going to go out on a limb here and declare my belief that Google Translate has failed with its current technology. If more data is the answer, Google has been scanning the web for existing translations to digest since 2007, and it still is just “less bad”.

So why have all of the millions of person-hours that have been put into statistical MT failed? Have you ever heard the expression “if you are a hammer, everything looks like a nail”? They are using the wrong tools, and, to some degree, have been for decades. Think about it. Where do languages sit at the university level? Liberal arts. Writing is an act of art, not science. Translating art to art (translating Moby Dick from English to Spanish) is NOT a scientific endeavor; it is an artistic endeavor. Good translators are artists. They capture the mood, the meaning, and even the essence of the source, and try to replicate that experience for the reader of the translation. They are literally painting a verbal picture on behalf of the author, in another language.

OK, so how do we get closer to the artist? For one thing, we need to understand the meaning of the source. We need to understand things like intensity, ambiguity, culture, and many more things that are very hard to quantify scientifically. At LinguaSys, with the founders’ decades of experience in this space, we understood all of this, and it shows in the path we have been taking for over a decade of development of our MT technology. Before we begin to think about translating source content, we deeply analyze it both semantically and syntactically. By no means are we done, but what we have been able to produce for our clients to date is a system that retains fidelity, sometimes at the expense of good syntax. More importantly, we understand where to use MT and where not to use it; where most MT fails is when it is used wrongly. We have proven that MT works great when there is a given domain we can concentrate on, such as, for one client, financial services. That is a limited domain, and knowing that the source will belong to that domain gives us strong clues for disambiguating words that have multiple meanings. Based on the state of the art today (and we are, I believe, the state of the art today), MT for the masses (Bing, Google, etc.) is one big FAIL. Use it at your own risk. Your mileage may vary.
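The value of a restricted domain for disambiguation can be sketched as a simple sense lookup. The sense inventory and the “bank” / “interest” examples here are invented; the point is only that a known domain lets you pick a sense before any deeper analysis.

```python
# Hypothetical sense inventory: word → {domain: sense}.
SENSES = {
    "bank":     {"finance": "financial_institution", "geography": "river_edge"},
    "interest": {"finance": "money_paid_on_loans",   "general":   "curiosity"},
}

def sense_in_domain(word, domain):
    # Prefer the sense registered for the known domain; otherwise fall back
    # to whatever sense is listed first.
    options = SENSES.get(word, {})
    return options.get(domain) or next(iter(options.values()), None)

print(sense_in_domain("bank", "finance"))      # → financial_institution
print(sense_in_domain("interest", "finance"))  # → money_paid_on_loans
```

In an open-domain system neither lookup is safe; in a financial-services deployment both are near certainties, which is why constrained-domain MT performs so much better.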

Wherever new invention in AI takes us, it must concentrate more on semantics and less on statistics. Yes, statistics helps a lot and can play a big role as another voting part of the engine in making a disambiguation decision, but for MT, or speech recognition, or any of the other imperfect technologies that get “less bad” on a daily basis, the secret sauce to getting them “good” is semantics.

Create Your Own Applications in Days, Not Months, With the Groundbreaking LinguaSys Server That Allows Anyone With Basic Programming Skills to Create Natural Language Conversational Interfaces in Over 20 Languages

BOCA RATON, FL., February 11, 2015 – Millions of developers with only basic programming skills will now be able to create their own Artificial Intelligence (AI) Natural Language Understanding Interfaces (NLUI), with the announcement of today’s release of LinguaSys Natural Language User Interface Server. The new, multitenant, NLUI Server runs on Microsoft Azure’s open cloud platform, so users can quickly build, deploy and manage applications across a global network.

Anyone who can write XML scripts can create complex NLU applications such as hotel reservations, car rentals, or an NLU interface to a favorite Customer Relationship Management (CRM) system, or build their own Siri-like application by leveraging LinguaSys’ extensive semantic network of 20+ languages and their NLU engine, which moves most of the AI from the developer to the engine.

“We’re revolutionizing the byzantine Natural Language Understanding marketplace by commoditizing the ability to create AI interfaces in hours or days, not months and years,” said Brian Garr, CEO of LinguaSys. “NLUI Server allows you to write an AI interface once, and have it accept input in all LinguaSys supported languages.”

This new NLU capability is available on the LinguaSys GlobalNLP™ portal at https://nlp.linguasys.com/. Trial subscriptions are free, and the portal comes with an editor, a validator, and a runtime.

“While our competitors are creating highly proprietary systems that only they can control, at exorbitant pricing levels, we are making NLU a commodity to enhance the customer experience for large and small enterprises around the world,” said Garr. “This is not smoke and mirrors. Try it yourself, today.”

LinguaSys solves human language challenges in Big Data and social media for blue chip clients around the world. Its natural language processing software provides real time multilingual text analysis, sentiment analytics and fast, cost-effective natural language user interfaces. The solutions are powered by LinguaSys’ Carabao Linguistic Virtual Machine™, a proprietary interlingual technology, to deliver faster and more accurate results. Designed to be easily customized by clients, the solutions can be used via SaaS or behind the firewall. Headquartered in Boca Raton, FL, LinguaSys is an IBM Business Partner. www.linguasys.com | @LinguaSys | Join us on LinkedIn: http://linkd.in/1rC1qzi

So let’s say that you need to build something really cool, like a virtual assistant that understands human language. Now let’s say that you take a bunch of MIT mathematics guys and tell them to figure it out. How likely is it, do you venture, that these guys will ever pay attention to the actual meanings of the words being evaluated? Right. Thus we have “state of the art” virtual assistants based on prediction algorithms and models. So why don’t Apple and Microsoft care about the meaning of words? Probably because hundreds, if not thousands, of developer jobs depend upon statistical methods. This is why Microsoft and Apple will never get to the next stage of NLU. The theory of feeding millions of words into a hidden Markov model, and “the more data, the better”, has hit a wall, and no one (mathematicians) wants to admit it. Let’s face it: if “more data is better”, then why does Google Translate put out such awful translations most of the time? They’ve been trawling the web adding more data for a decade.

The missing link in all of this? Semantics. Right, the idea of understanding the meaning of all of the words in an utterance. Just because the odds favor one sense of a word (probability), there is obviously an error rate, because the odds are almost never 100%. Forget about the rest of the world; English alone is incredibly ambiguous. Take the word “tank”. Inside the LinguaSys English lexicons, “tank” has nine possible meanings. I tanked the test. Fill the gas tank. The tank rolled into battle… etc. The truth, as I see it and experience it, is that semantics beats statistics hands down. In fact, we at LinguaSys have beaten statistics hands down. We recently competed against two of the very largest providers of natural language understanding at a major auto manufacturer. All three vendors had to complete the same task. Guess what: we won! And although I don’t have the actual metrics for this, I imagine we did it in about a tenth of the time and cost. Because we deal in semantics, if I need to include the option to bring pets in my hotel reservation virtual assistant, I don’t have to build an embedded grammar listing all the kinds of pets there could be. I already know, because of my semantic tree, all of the “children” of pets. One line of code vs. hours of grammar building. Now expand this tree to 19 other languages, and you start to see the power of semantics.
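The “children of pets” idea is a plain tree traversal. A minimal sketch, with an invented toy hierarchy standing in for a real semantic network:

```python
# Hypothetical fragment of a semantic tree: concept → direct children.
TREE = {
    "pet": ["dog", "cat", "hamster"],
    "dog": ["poodle", "beagle"],
    "cat": [],
    "hamster": [],
    "poodle": [],
    "beagle": [],
}

def descendants(concept):
    # Recursively collect every concept below the given one.
    found = []
    for child in TREE.get(concept, []):
        found.append(child)
        found.extend(descendants(child))
    return found

print(sorted(descendants("pet")))
# → ['beagle', 'cat', 'dog', 'hamster', 'poodle']
```

One lookup against the tree replaces an embedded grammar that enumerates every pet by hand, and because the tree is concept-level rather than word-level, the same lookup serves every language whose words map into it.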

We live in an era of statistics, and it certainly has its place, but the Microsofts and Apples of the world will, I fear, never fulfill their vision, simply because embracing semantics would cost too many developers their jobs. When you are a hammer, everything looks like a nail.