Basic Advice for Your Language Tech Start-up

I talk frequently to companies in, or entering, the language technology market. That’s text and social analytics, sentiment analysis, and all things applied NLP, from good-old entity extraction to natural language generation (NLG) to emoji semantics. Companies that contact me want guidance on feature sets, technical capabilities, competitive positioning, and potential sales targets, and they want to show off their wares in order to win attention. Early-stage companies covet coverage, and most welcome funding, partner, talent, and (what’s golden:) prospective-customer referrals.

The ones I reach out to: Well, I make it my business to spot players and trends early, to help advisees place the winning bets. Sometimes I write about startups and innovation and I regularly bring them in to events I organize including — do check it out — the Sentiment Analysis Symposium conference, taking place July 15-16 in New York. (The emoji reference above is to what should be a fascinating SAS15 talk, Emojineering @ Instagram, presented by engineer Thomas Dimson; and Prof. Robert Dale will offer an NLG workshop.)

Geneea, a Czech start-up founded last year, aims to build an “intelligent multilingual text analytics and interpretation platform.” Sounds ambitious, doesn’t it? Actually, technically, it’s almost the opposite. Open source software (Geneea chose options including OpenNLP and Mallet) eliminates technology barriers to entry, including in text analytics. You do have to choose the most appropriate options and use them effectively, but I see the greater challenge in finding a market and a path to it. The path to market is facilitated by connections, but you do have to prove your technical capabilities by delivering data interpretation that suits business tasks. Not so easy.

I had a productive conversation last month with several Geneea team members. I’ll distill out and share some key points, from that conversation and others, acknowledging that I may learn as much from the startups I talk to as they learn from me. Around the same time, I talked with folks including industry veteran Alyona Medelyan (check out Maui automatic extraction of keywords, concepts & terminology) and David Johnson and colleagues at Decooda (“cognitive text mining and big data analytics” targeting the insights industry).

An early-stage company needs to recognize that, per Tom Nowak of Geneea (quoting with his permission), “any piece of wisdom — experience & expertise — is most welcome and very important for startup strategy.” So point #1 is:

Solicit targeted advice, early.

Obvious, yes, but in my experience, some start-ups stay heads-down developing technology that ends up over-fitting any paying application. Also:

Look for comparators, companies to learn from that have succeeded (or failed) in what you hope to accomplish, whether similar in business model, function, technology, or target market.

Exploit open source. It’s free, proven, and comes with community support. What successful text analytics companies have built around open source? Attivio and Crimson Hexagon for two.

Tom and his Geneea colleagues have been working since last summer on their text analysis platform, which they’ll deploy online, available via a Web service, RESTful API (application programming interface). Others I cited above — Maui, Decooda, Luminoso — are also deploying via an API, which fits another bit of guidance:

Design to industry standards, at least to start, to allow your product to be easily plugged in to others’ platforms and workflows.

Lock-in is for later, once you’re established. A bit of related wisdom:

Market education is expensive. Time spent in explaining your idiosyncratic methods or terminology is time that communicates costs rather than business benefit.

(Decooda, how’s the “cognitive” label working out for you? Sure, IBM uses it, but I’m not convinced anyone understands it.)

Especially if you design to standards, you need to differentiate.

Identify, build out, and communicate things you do — not just better, faster, or cheaper than others, but that others don’t. Competing on better (including more accurately), faster, or cheaper is competing. You want to avoid competition, if you can swing it.

In the language-technology world, ability to handle under-resourced languages or excel in under-supported business domains is a good differentiator.

Another differentiator: ability to discern and extract information that others don’t.

Language coverage is a differentiator for Geneea, which is located in Prague and supports Czech in addition to English. Czech is a jump-off point to other central and eastern European languages, many of which are under-resourced. But I believe…

You need more than one competence, more than one selling point. Seek to create technical differentiation if you can, but also design to meet someone’s business needs.

Seek to partner with established organizations including agencies and consultancies. They have assets you don’t: brand visibility, technology, domain knowledge, and business relationships. They provide a channel.

Partners (and investors) have a stake in your success. Keep that in mind.

But be wary of partnerships where you’re just one player among many. Some companies cultivate ecosystems that play tech partners off, one against the other, with no revenue assurance. (Salesforce, I’m looking at you.)

If you’ve gotten this far without skipping to the bottom, you realize that the majority of my points apply broadly. They’re not specific to language tech companies. Do they reflect your experience? I’d welcome knowing, via comments or direct contact.

And if you’re in the text or social analytics world, commercializing technology or developing solutions in NLP, sentiment analysis, or related areas, I’d love to hear your story. Get in touch!