In the previous post, I offered a list of eight linguistics-based market segments, and a slide deck surveying them. And I promised a series of follow-up posts based on the slides.

Let me begin by explaining what I mean by some of the items on that list (taken from Slide 2), starting from the bottom.

Machine translation is a small business, with small specialized vendors. Lernout & Hauspie attempted to combine it with voice recognition in a complex financial play, but that collapsed in a miasma of stock fraud. The remnants ended up as part of what became Nuance Communications.

Nuance is a roll-up of most of the important independent voice recognition vendors. So far voice recognition has worked best in two areas: “hands-free” computer use/dictation and IVR (interactive voice response). While both are important, neither is exactly a mainstream enterprise software business. So voice recognition is not closely integrated with the other market segments.

“Natural language processing” other than voice recognition isn’t much of a business at this time (with apologies to Progress EasyAsk). It doesn’t make the list at all.

Spam filtering is obviously a major business, whether or not it is getting combined into more general security and/or messaging product suites. Antispam vendors actually perform a lot of machine learning, much like text miners do. But the types of rules they wind up with are quite different. And their hardest problems usually aren’t linguistic ones, as the spammers have gone beyond text to, e.g., words depicted in graphical images. Besides, even where linguistics is involved, identifying words used by bad guys trying to spoof you (and the rest of the world) is a very different problem from understanding your particular users.

Why and to what extent I see the other five as separate markets is explained in connection with the subsequent 17 slides.