How I Estimate (Social/Sentiment/Text Analytics) Market Size

Business loves market-size estimates. They quantify opportunity, measured as the gap between an existing and an addressable market, with growth rates providing a reality check. They help an individual solution provider understand how it stacks up against the competition, and they guide investors in deciding where to place funds. Being an obliging consultant and industry analyst, I do my best to compute good estimates for certain technology sectors I cover, for instance, text analytics. My methods are quite simple, actually. (The real smart-work is in the data collection.) Allow me to clue you in on how I estimate social/sentiment/text analytics market size.

First, why those particular analytical software technologies, and why is an independent analyst the best (or even the only) source of market-size figures?

Text analytics applies natural-language processing (NLP) to extract information from text, bringing text-sourced business intelligence to a broad array of applications. Sentiment analysis is a particular NLP application, although it can also be tackled via human analysis (a.k.a. “reading” and “listening”) including via crowd-sourcing. Sentiment analysis is about finding and exploiting subjective information in content: Attitudes, opinions, emotions, and intent. Social analytics, my number 3 area, applies text and sentiment analysis to content and network analysis to understand interconnectedness and message flow. These three related analytics species are often most interesting when they’re linked to analysis of enterprise transactional, operational, and profile data, leavened with geospatial and behavioral analyses.

It’s a great time to be in these fields, which are experiencing strong, steady market growth, reflecting the technologies’ ability to make sense of online, social, and enterprise ‘unstructured’ sources. Yet the dollar/euro/renminbi/rupee value realized by vendors lags far, far behind that of mainstream enterprise applications. In part, the gap stems from market maturity. While it took CRM a couple of decades to build to 2010’s $16.5 billion in applications revenue and BI and analytics 30+ years to get to $10.5 billion, I’d guess-date their first significant commercial uses back 12 years for text analytics, 6 for sentiment analysis, and 3 for social analytics. Further, the newer, more specialized analytical technologies aren’t widely built into everyday business operations despite their ability to transform business-stakeholder interactions.

These technologies are broadly applied, yet they are too new or too narrow to have drawn the full attention of the big-firm analysts. Enterprise use still seems to be typically at an individual, small-group, or departmental level, or it forms only a small part of much larger applications such as e-discovery and customer experience management. Limited-scope enterprise use flies under the radar of the enterprise CIOs and IT executives who buy expensive consulting and reports from the likes of Forrester, Gartner, and IDC, so analysts at those firms, were they inclined to run market-size numbers (Forrester, at least, isn’t in that business), don’t have a business driver. And solution-focused analysts, concerned with e-discovery, market research, and CEM, very justifiably don’t typically get into the nitty-gritty of technology details.

Enough of the why. How do I compute market-size estimates?

My approach starts by identifying companies wholly or partly in the space. For social analytics, “partly in the space” includes Social CRM/engagement, competitive/market intelligence, and customer experience vendors. Similarly, text analytics is a contributor to e-discovery, publishing, intelligence, and other applications.

I then find or estimate revenues and growth rates. Sometimes it’s easy: A company is publicly traded or for some other reason is required to release figures. French companies, for instance, must release revenue figures, which can be found online via sites such as Bilan Gratuits. Companies that do business with the U.S. government have to provide figures, which become part of public records that are searchable at sites such as USAspending.gov. Data on fast-growing companies is exposed when they apply for Inc 5000 recognition. A tip: You can learn a lot by simply, directly asking the CEO for data. Sometimes she’ll tell you, about her company and about competitors. A promise to keep individual data points confidential — to release only aggregate figures — can help.

When figures aren’t current, an estimate of past years’ growth, for an individual company and for a sector, can be used to project them forward.
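A minimal sketch of that projection, assuming a constant compound annual growth rate (my simplification; a real projection might vary the rate by year). The function name and the sample figures are hypothetical, chosen only for illustration.

```python
def project_revenue(last_known: float, annual_growth: float, years: int) -> float:
    """Compound a stale revenue figure forward at a constant annual growth rate.

    last_known    -- most recent revenue figure available (any currency unit)
    annual_growth -- assumed growth rate as a fraction, e.g. 0.25 for 25%
    years         -- whole years between the known figure and the target year
    """
    if years < 0:
        raise ValueError("years must be non-negative")
    return last_known * (1.0 + annual_growth) ** years


# Hypothetical example: $4.0M reported two years ago, 25% assumed annual growth.
estimate = project_revenue(4.0, 0.25, 2)
print(f"Projected revenue: ${estimate:.2f}M")  # Projected revenue: $6.25M
```

The growth rate can come from the company's own history or, failing that, from a sector-wide estimate.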

When I can get only whole-company figures for a vendor that’s in multiple markets — large companies such as Autonomy, IBM, SAP, and SAS are examples — I allocate a portion of overall revenue to the narrower space.
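The allocation step might look like the following sketch. The vendor names, revenue figures, and share percentages are all made up for illustration; choosing the share is the judgment call, and the arithmetic is the easy part.

```python
def allocate_segment_revenue(total_revenue: float, segment_share: float) -> float:
    """Attribute a fraction of a multi-market vendor's revenue to one segment."""
    if not 0.0 <= segment_share <= 1.0:
        raise ValueError("segment_share must be between 0 and 1")
    return total_revenue * segment_share


# Hypothetical whole-company revenues (in $M) and assumed segment shares.
vendors = {
    "Big Vendor X": (1000.0, 0.02),
    "Big Vendor Y": (500.0, 0.10),
}

for name, (revenue, share) in vendors.items():
    segment = allocate_segment_revenue(revenue, share)
    print(f"{name}: ${segment:.1f}M attributed to the segment")
```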

Where necessary, I create or adjust estimates based on elements such as a) typical/average deal size multiplied by the number of customers, b) headcount multiplied by an estimate of revenue per employee for a company at the subject company’s market stage, and c) investment: dividing a funding amount by the proportion of ownership it likely bought gives an implied valuation, which I then divide by a typical valuation-to-revenue multiple.
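The three heuristics reduce to simple arithmetic. Here is a sketch with hypothetical inputs; the function names, the deal size, the revenue-per-employee figure, and the valuation multiple are all assumptions for illustration, not benchmarks.

```python
def revenue_from_deals(avg_deal_size: float, customers: int) -> float:
    """(a) Typical deal size multiplied by the number of customers."""
    return avg_deal_size * customers


def revenue_from_headcount(headcount: int, revenue_per_employee: float) -> float:
    """(b) Headcount multiplied by revenue per employee at this market stage."""
    return headcount * revenue_per_employee


def revenue_from_investment(investment: float, ownership_share: float,
                            valuation_to_revenue: float) -> float:
    """(c) Implied valuation (investment / share bought) divided by a multiple."""
    if not 0.0 < ownership_share <= 1.0:
        raise ValueError("ownership_share must be in (0, 1]")
    if valuation_to_revenue <= 0.0:
        raise ValueError("valuation_to_revenue must be positive")
    implied_valuation = investment / ownership_share
    return implied_valuation / valuation_to_revenue


# Hypothetical start-up: 40 customers at $50K each, 25 employees at an assumed
# $100K revenue per head, and a $5M round assumed to buy 25% at an 8x multiple.
estimates = {
    "deals": revenue_from_deals(50_000, 40),
    "headcount": revenue_from_headcount(25, 100_000),
    "investment": revenue_from_investment(5_000_000, 0.25, 8.0),
}
for method, value in estimates.items():
    print(f"{method}: ${value:,.0f}")
```

When the three figures land in the same neighborhood, that is some reassurance; when they diverge widely, it is a signal to revisit the inputs.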

A fair amount of guesswork is involved, along with many approximations, and boundary setting is part of the game. I bin the ankle-biters (low-revenue start-ups, many of which are flying under the radar) into a single guesstimate. I exclude the value of academic, government, and industrial research from my estimates: work doesn’t contribute to a market valuation unless something, a product or service, is sold. I also exclude the sometimes-substantial value of work done for in-house use, for instance text analytics done by companies such as Thomson Reuters or Reed Elsevier in the course of creating information products.

I end up with estimates that I consider accurate enough to release to clients or in an article, often with a disclaimer that my work should be taken as inexact and that actual values likely lie in a defined interval around my released figures. The aim is to provide useful business guidance, to support confident decision making in fast-moving, dynamic technology markets.
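To make the roll-up concrete, here is a sketch that sums per-vendor estimates, adds a single long-tail figure for the small start-ups, and reports an interval around the total. The symmetric percentage margin is an assumption made for illustration, and every name and number is hypothetical.

```python
def market_size(vendor_estimates: dict[str, float], long_tail: float,
                margin: float) -> tuple[float, float, float]:
    """Return (low, point, high) for a market-size estimate.

    vendor_estimates -- segment revenue per identified vendor
    long_tail        -- one binned guesstimate for small start-ups
    margin           -- assumed symmetric relative uncertainty, e.g. 0.2 for 20%
    """
    if not 0.0 <= margin < 1.0:
        raise ValueError("margin must be in [0, 1)")
    point = sum(vendor_estimates.values()) + long_tail
    return point * (1.0 - margin), point, point * (1.0 + margin)


# Hypothetical segment revenues in $M.
vendors = {"Vendor A": 120.0, "Vendor B": 45.0, "Vendor C": 20.0}
low, point, high = market_size(vendors, long_tail=15.0, margin=0.2)
print(f"Estimated market: ${point:.0f}M (likely ${low:.0f}M to ${high:.0f}M)")
# Estimated market: $200M (likely $160M to $240M)
```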