How Text Analytics Works for Social Media

Social media provides a treasure trove of insights for brands looking to understand consumers.

But due to the vast amount of social media data, powerful analysis is required to uncover insights. In addition to image analysis, one of the primary methods of uncovering insights from social media data is text analysis.

What is text analysis?

Text analytics is the process of deriving information from text sources (Gartner). Text analysis can be applied to any text-based dataset, including social media, surveys, forum posts, support tickets, call transcripts, and more.

Computers have historically had trouble understanding natural human language due to its nuance, subjectivity and idiosyncrasies. But new technology and techniques have greatly increased the accuracy of text analytics. While humans are still better at understanding language, the vast amount of text data makes automated analysis solutions particularly useful for processing data at scale.

Understand sentiment and emotion

Measure share of voice

Use text analysis to understand what percentage of a conversation is about a particular brand, product, or topic.
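As a rough illustration, share of voice can be computed as the fraction of posts that mention a brand. The sketch below is a minimal example under stated assumptions: the brand names, the sample posts, and the share_of_voice helper are all hypothetical, and real platforms use far more robust matching than exact keywords.

```python
import re

def share_of_voice(posts, brand_terms):
    """Return, per brand, the fraction of posts that mention any of its terms."""
    counts = {brand: 0 for brand in brand_terms}
    for post in posts:
        # Lowercase and split into a set of word tokens for exact matching.
        words = set(re.findall(r"[a-z0-9']+", post.lower()))
        for brand, terms in brand_terms.items():
            if words & terms:
                counts[brand] += 1
    total = len(posts)
    return {brand: (n / total if total else 0.0) for brand, n in counts.items()}

# Hypothetical sample data, for illustration only.
posts = [
    "Loving my new Acme blender",
    "Acme support was slow today",
    "Switched to Zenith and never looked back",
    "What should I cook tonight?",
]
brand_terms = {"Acme": {"acme"}, "Zenith": {"zenith"}}

shares = share_of_voice(posts, brand_terms)
for brand, share in shares.items():
    print(f"{brand}: {share:.0%} of posts")
```

Here the denominator is every post in the conversation. Dividing by brand-mentioning posts only would instead give each brand's share relative to its competitors.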

Identify key topics, words, and phrases

Drill down within any conversation to understand what is driving it and how the content of the conversation has changed over time.
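One simple way to see what a conversation is about, and how that shifts, is to count the most frequent terms in each time period. This is only a sketch: the stopword list, the monthly buckets, and the sample posts are assumptions made for the example, not part of any particular product.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; real lists are much longer.
STOPWORDS = {"the", "a", "is", "it", "my", "and", "to", "of", "so", "this"}

def top_terms(posts, n=3):
    """Return the n most frequent non-stopword terms across the posts."""
    counts = Counter()
    for post in posts:
        for word in re.findall(r"[a-z']+", post.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts.most_common(n)

# Hypothetical posts grouped by month.
posts_by_month = {
    "2024-01": ["The battery is great", "Great battery and great screen"],
    "2024-02": [
        "Shipping delay again",
        "The delay is so long",
        "Battery fine, shipping delay",
    ],
}

trend = {month: top_terms(posts, n=2) for month, posts in posts_by_month.items()}
for month, terms in trend.items():
    print(month, terms)
```

In this toy data the leading terms move from "great" and "battery" in the first month to "delay" and "shipping" in the second, which is the kind of shift this analysis is meant to surface.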

Quantify purchase intent

Identify intent to purchase and any other stages of the consumer buying cycle that your brand cares about.

Answer any question (with machine learning)

Want to measure something that’s specific to your brand or product? Machine learning-based text analysis allows you to create your own categories and train a platform to categorize social posts accordingly.

Now that you know what you can do with text analytics, let’s look at the two primary approaches to it.

Linguistic Rules vs. Machine Learning Analysis

There are two main approaches to text analytics:

Linguistic rules

Machine learning models

Each method has specific strengths and weaknesses, depending on your analysis goals. Choosing the right approach for your use case is important for maximizing both efficiency and the relevance of the insights.

Linguistic Rules

Rule-based pattern matching can be based on simple boolean keywords or more complex models compiled over time by language experts. The linguistic rules can range from identifying parts of speech, syntax, and inflections to rules about different topics, regions, and stylistic variations. This rule-based method can be quickly applied to a set of documents for fast analysis.
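At the simple end of that range, a boolean keyword rule can be sketched in a few lines. The rule below flags possible purchase intent when a post contains any "buy" word and no "refund" word. The keyword sets and the matches_rule helper are hypothetical, chosen only to illustrate the idea.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))

def matches_rule(text, any_of, none_of=frozenset()):
    """Boolean rule: (any word in any_of) AND NOT (any word in none_of)."""
    words = tokenize(text)
    return bool(words & any_of) and not (words & none_of)

# Hypothetical keyword sets for a purchase-intent rule.
INTENT_ANY = {"buy", "buying", "order", "ordering", "purchase"}
INTENT_NONE = {"refund", "return", "returned"}

posts = [
    "Thinking about buying the new Acme blender",
    "I want a refund for my order",
    "Great weather today",
    "Never going to buy from them again",
]

flags = [matches_rule(p, INTENT_ANY, INTENT_NONE) for p in posts]
for post, flag in zip(posts, flags):
    print(flag, "-", post)
```

The rule runs quickly and its behavior is easy to inspect, but the last post is flagged even though it expresses the opposite of purchase intent. Catching that case would require adding another rule.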


Linguistic Rules Benefits

Fast analysis

The analysis runs quickly (after the rules have been created).

Mistakes are easy to spot

It is easy to see where rules succeed and where they return irrelevant data.

Granular analysis

Text can be broken into smaller chunks for analysis.

Results closely match expectations

Rules-based analysis will find what you’re looking for, but often serves to reinforce initial assumptions instead of challenging them with a broader perspective.

Linguistic Rules Trade-offs

There are always exceptions to rules

Language is variable, constantly changing, and often informal. It is impossible for rules to account for all the ways meaning can be expressed in text. Text analysis based on linguistic rules often misses information that is relevant due to the rigidity of the rules.

Building complex rules can take years

Complex rules based on expert knowledge sometimes require years of research to compile the necessary resources to perform the analysis.

Detailed development for each language

Languages that have not been widely studied cannot be easily analyzed until extensive research has been done on the unique features of their grammar and vocabulary.

Narrow approach

Rules are created by humans with inherent biases, and will only match patterns which were expected to be found. Discovering trends and new ways of expressing ideas is hampered by the reliance on static resources.

Machine Learning

Machine learning-based analysis discovers patterns naturally from text examples. Using statistical methods, documents are compared to one another to determine the most important and useful patterns in the corpus for the desired behavior. Machine learning analysis methods are diverse and can range from simple to complex, but they all share the same fundamental goal of learning the most valuable and distinctive patterns based on examples provided by a human.
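To make "learning from examples" concrete, here is a minimal sketch of one of the simplest statistical methods, a multinomial Naive Bayes classifier with add-one smoothing. It is an illustrative choice rather than the method any particular platform uses, and the labels ("intent", "other") and training posts are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter(labels)      # label -> number of examples
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus log likelihood of each known word.
            score = math.log(n_docs / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical labeled examples provided by a human.
train_texts = [
    "I want to buy this phone",
    "Where can I order one",
    "Planning to purchase it next week",
    "The weather is nice today",
    "Watching a movie tonight",
    "My cat is sleeping",
]
train_labels = ["intent", "intent", "intent", "other", "other", "other"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("I might buy one next month"))
print(model.predict("the movie is nice"))
```

No keyword list is written by hand here. The model infers which words are distinctive for each category from the labeled examples, so changing its behavior means changing the examples rather than editing rules.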

Machine Learning Benefits

Train with examples

Training requires fewer complex linguistic resources, and the model learns the patterns that are useful for the task at hand.

Customizable and adjustable

Models can be altered and adjusted to adapt to new conditions that weren’t anticipated.

Increased flexibility

Machine learning models capture important context missed by rule-based approaches because they rely on applying patterns using probability and statistics.

More discovery

Machine learning models reveal changes in the way ideas are expressed that human experts would not have expected.

Analyze any language

Analyzing a new language requires less linguistic expertise because research and development requires fewer custom resources.

Machine Learning Trade-offs

Must provide training data

Machine learning requires extensive training, but that training allows for more relevant insights.

Slight decrease in precision

The lack of strict rules leads to a slight dip in precision as a trade-off for uncovering more hidden insights, including more of the context in the conversation.
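This trade-off is usually measured with precision (how many flagged posts were actually relevant) and recall (how many relevant posts were flagged). The sketch below shows the arithmetic on made-up labels. The numbers are invented for illustration and are not results from any real system.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two equal-length lists of booleans."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)      # true positives
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)  # false positives
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical ground truth and outputs for eight posts.
actual      = [True, True, True, True, False, False, False, False]
rule_flags  = [True, True, False, False, False, False, False, False]
model_flags = [True, True, True, True, True, False, False, False]

rule_scores = precision_recall(rule_flags, actual)
model_scores = precision_recall(model_flags, actual)
print("rules:", rule_scores)
print("model:", model_scores)
```

In this toy case the strict rules flag only relevant posts but miss half of them, while the looser model finds every relevant post at the cost of one false positive.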

Flexibility matters

Both types of text analysis have their strengths and weaknesses. Ultimately, having the flexibility to switch between linguistic rules and machine learning models depending on the goals of your analysis will provide the best results.
