How to use web and machine intelligence to uncover hidden user insights?

The sudden rise of data across media platforms at the turn of the 21st century, together with advances in artificial intelligence and natural language processing, is presenting opportunities like never before. It is now easier and more practical to tap into crowdsourced wisdom. Web intelligence is disrupting, and in places replacing, traditional market research, surveys and focus groups. User insights can now be tracked in real time and at scale. It is now the hack for market intelligence.

In this LA insight piece, we will explore:

The sources of hidden user insights and how they differ

The use cases of artificial intelligence in generating insights

How should an organization use AI advisory services

1. The sources of hidden user insights and how they differ

The sum of all data held by the big 4 of data (Google, Amazon, Microsoft and Facebook) is estimated to be at least 1,200 petabytes. For perspective, a petabyte is 1,000,000 gigabytes, so 1,200 petabytes is 1.2 billion gigabytes. That is a huge amount of data. Using analytics, organizations can tap into this large store and extract the information they need. With the right expertise, the old days of guessing what works are numbered.

There are 3 main wells of data where insights can be obtained: social media, the public web and enterprise data.

a. Social media data

The large social media platforms include Facebook, Twitter, Instagram, Tumblr and Reddit. Users and communities converse there, generating and distributing their collective words online. These are generally unsolicited opinions on a broad range of topics, identifiable through keywords.
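As a rough illustration of identifying relevant conversations through keywords, the sketch below filters a handful of posts by keyword and counts hashtags. The sample posts, the product name "AcmePhone" and the helper names `matches_keywords` and `hashtag_counts` are hypothetical and exist only for this example. Real data would come from a platform's API or a data export.

```python
import re
from collections import Counter

# Hypothetical sample posts; the brand "AcmePhone" is made up for illustration.
posts = [
    "Loving the new #AcmePhone camera, battery could be better",
    "Anyone tried the acmephone yet? Thinking of switching",
    "Great weather for a run today #fitness",
]

def matches_keywords(text, keywords):
    """Return True if any keyword appears as a whole word or hashtag (case-insensitive)."""
    tokens = re.findall(r"#?\w+", text.lower())
    normalized = {tok.lstrip("#") for tok in tokens}
    return any(keyword.lower() in normalized for keyword in keywords)

def hashtag_counts(texts):
    """Count how often each hashtag appears across a collection of posts."""
    counts = Counter()
    for text in texts:
        counts.update(tag.lower() for tag in re.findall(r"#\w+", text))
    return counts

# Keep only the posts that mention the tracked keyword.
relevant = [post for post in posts if matches_keywords(post, ["acmephone"])]
tags = hashtag_counts(posts)
```

Exact keyword matching is a deliberately simple choice here. It misses misspellings and nicknames, which is one reason the natural language processing steps discussed later are useful.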

Pros:

Able to collect real-time insights

A large volume of data points

Often unsolicited user conversations, which give rise to valuable insights

Cons:

Curated comments from paid influencers and PR firms may skew opinions

Not all organizations have an active social media presence; this works well for B2C but is more challenging for B2B

Platform comparison

Facebook:

Most popular social media platform with 2.2 billion monthly active users

People go to an organization’s Facebook page to learn more about the company and see what is going on

Used across the demographic groups

Twitter:

A microblogging site; the 280-character limit forces users to be more succinct and impactful with their content

Due to the large volume of Tweets, each Tweet has a short life cycle

Hashtags are a key tool for users to reach a larger audience beyond their own followers

Attracts the younger crowd

Instagram:

Mobile photo and video sharing platform

Has 1 billion monthly active users and 95 million photos and videos are shared daily

Used for publishing and viewing visually appealing content

Has the highest engagement rate across social media platforms

Companies typically post beautiful photos of their products or pay an ‘influencer’ to feature them

Used mainly for awareness and as a sales funnel

High growth platform

Attracts the younger crowd

b. Public web data

The public web comprises mainly user review sites, news websites, forums and blogs. While social media stems from the unsolicited opinions of the online community, the narrative on the public web is usually driven by thought leaders, be it the author who pens an article, users answering questions posted on forums, or ‘experts’ answering questions posted online. Topics and interests can be easily narrowed down. For example, we can go into health subforums to hear the discussions of users concerned with health.

Pros:

Able to dive into niche sources, forums, sites for targeted insights

Easy to divide and explore insights of different demographics

Monitor the news mentions

Cons:

More noise in the collected data due to its unstructured format

c. Enterprise data

Enterprise data comprises mainly the voice of the user, analytics reports generated from the organization’s social media pages to track conversions, information on and characteristics of existing users, and customer service logs. This information offers accurate insights into what drives positive and negative user feedback. It is less contaminated than public data.

Pros:

Proprietary information to develop the edge which competitors do not have

Easy to create an ‘alike’ group for future user acquisition

Cons:

Organizations need to first have a proper data collection process before being able to utilise the insights

Tying the 3 sources of user data together

Social media allows the organization to collect a large volume of unsolicited opinions on any topic it is interested in. The public web contains more focused opinions and provides a platform to see which topics are trending or of greater interest. Enterprise data is a direct way to see what attracts users to, or repels them from, existing services.

2. The use cases of artificial intelligence in generating insights

With the large volume of data collected by machines, there needs to be an efficient, accurate way to scrub and interpret the data. This is where artificial intelligence, natural language processing and automation come in.

a. Natural language processing

Natural language processing allows a better all-around understanding of text. Most of the information we receive comes from reading and interpreting a body of text, be it from the news, online sources or text messages. What if we could automate this interpretation process and make it more accurate, fast and scalable? And repeat it consistently, without human error or fatigue? With natural language processing, this is possible. We can unlock an immense amount of value.

A good use case is in the financial markets. Every moment is about making decisions on whether to buy or sell an asset. For example, we need to decide whether Stock A is a buy or a sell. We often turn to bank research reports or consult the opinions of ‘experts’ across various channels. Instead of spending 10 to 30 minutes reading every research piece, we can use natural language processing to parse the pieces and arrive at an equivalent decision. This can be systematically more accurate than a human’s interpretation, especially when averaged over a large volume of runs. It is also faster, which matters because time is of the essence when making these decisions. Beyond financial markets, natural language processing can be applied in the commercial space to analyze topical terms, reviews and text online, especially when the task is too tedious and complex for an employee to do manually.

Natural language processing works by first cleaning up messy user voices, often written in slang, colloquialisms and abbreviations, before processing them. Sentiment analysis is a common natural language processing use. It can monitor the voice of the user and derive a general degree of happiness, anger and other emotions expressed.
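The two steps described above, cleaning and then scoring sentiment, can be sketched in a few lines. This is a minimal lexicon-based illustration and not how a production sentiment model works. The `SLANG`, `POSITIVE` and `NEGATIVE` tables and the function names are hypothetical, and a real system would rely on much larger domain-specific resources or a trained model.

```python
import re

# Hypothetical lookup tables, kept tiny for illustration only.
SLANG = {"gr8": "great", "luv": "love", "thx": "thanks", "u": "you"}
POSITIVE = {"great", "love", "good", "happy", "thanks"}
NEGATIVE = {"bad", "slow", "angry", "hate", "broken"}

def normalize(text):
    """Lowercase, tokenize, and expand known slang or abbreviations."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [SLANG.get(tok, tok) for tok in tokens]

def sentiment_score(text):
    """Return (positive hits - negative hits) / token count, a value in [-1, 1]."""
    tokens = normalize(text)
    if not tokens:
        return 0.0
    pos = sum(tok in POSITIVE for tok in tokens)
    neg = sum(tok in NEGATIVE for tok in tokens)
    return (pos - neg) / len(tokens)

cleaned = normalize("Luv it, gr8 phone!")
happy = sentiment_score("Luv it, gr8 phone!")
unhappy = sentiment_score("app is slow and broken")
```

A word-counting approach like this ignores negation and sarcasm ("not great" still scores as positive), so it is best read as a picture of the pipeline's shape rather than of its accuracy.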

b. Machine learning classification of topics

The basis of every analysis is clean, categorised data. With all the data we have collected on the web, we need to clean, classify and organise it properly. We can use supervised learning methods to group data points into categories.

For example, company A produces 3 products and we have 1,000,000 data points to work with. We need to classify all user opinions into categories relating to the 3 products. From the large volume of data collected about company A, we hand-label 50 posts into categories A, B or C. Next, we train a machine learning algorithm on these labelled posts and use it to classify the remaining data points into the respective categories. The end state is that all 1,000,000 points are classified into the categories.
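The label-then-classify workflow above can be sketched with a small multinomial Naive Bayes classifier. Naive Bayes is our choice for illustration only, since the approach described does not depend on a particular algorithm. The six labelled posts, the product types and the `NaiveBayes` class are hypothetical stand-ins for the 50 hand-labelled posts.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

class NaiveBayes:
    """Minimal multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter(labels)      # label -> number of posts
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior of the category
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical hand-labelled posts for products A, B and C.
labelled = [
    ("the phone camera is sharp", "A"),
    ("phone battery drains fast", "A"),
    ("laptop keyboard feels great", "B"),
    ("laptop screen is too dim", "B"),
    ("headphones have deep bass", "C"),
    ("headphones are comfortable to wear", "C"),
]
model = NaiveBayes().fit(*zip(*labelled))

# The "remaining data points" that were never hand-labelled.
unlabelled = ["my laptop screen cracked", "new headphones sound amazing"]
predictions = [model.predict(post) for post in unlabelled]
```

In practice, it is worth setting aside some of the hand-labelled posts to check the classifier's accuracy before trusting its output on the full data set.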

c. Computer vision

A common observation in Singapore is that more users are active on Instagram than on the other social media platforms. On Instagram, 60 million photos are uploaded daily, and many of them carry no useful captions or text references. Text analytics alone therefore misses a lot of information, and the organization loses valuable insights.

Visual communication remains very popular today, as seen in Instagram’s rapid growth to its current 1 billion active users. For individuals, posting a picture can bring more social validation than a text post describing what one has done or where one has been. Computer vision allows us to identify and extract information from pictures. Organisations can then better track the conversation around their brands.

Currently available computer vision APIs offer impressive functionality.

Here is a showcase of computer vision. When we search for ‘Singapore’ on Google Images, the first image that pops up is the following picture from the Visit Singapore site.

Singapore

We run this picture through Google Vision and receive the following results.

Google Vision

From the picture, we see new information: tags, labels, web sources, detection of adult content and so on. One use case for computer vision is to extract user context and insights by running a large number of pictures through it, referenced by hashtags and location tags.

Analyzing both text and images gives the organization an advantage over competitors

Pictures are not pegged to any particular language, so no translation is needed, which could otherwise introduce more noise

Pictures give information that is not necessarily found in the caption

Pictures can reveal new demographics and sentiments
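To illustrate how labels from a large number of pictures might be tied back to hashtags, the sketch below aggregates image labels per hashtag. It makes no API call. It assumes a computer vision service has already returned labels with confidence scores, and the record layout, field names and sample values are hypothetical rather than any vendor's actual response format.

```python
from collections import Counter, defaultdict

# Hypothetical records: each pairs a post's hashtags with image labels that a
# computer vision service has already returned, as (description, confidence).
annotated_posts = [
    {"hashtags": ["#singapore"], "labels": [("skyline", 0.97), ("night", 0.91)]},
    {"hashtags": ["#singapore", "#food"], "labels": [("noodles", 0.88), ("night", 0.55)]},
    {"hashtags": ["#food"], "labels": [("noodles", 0.93), ("table", 0.72)]},
]

def labels_by_hashtag(records, min_confidence=0.7):
    """Count confident image labels under each hashtag they appear with."""
    summary = defaultdict(Counter)
    for record in records:
        # Drop labels the vision service was not confident about.
        confident = [name for name, score in record["labels"] if score >= min_confidence]
        for tag in record["hashtags"]:
            summary[tag].update(confident)
    return summary

summary = labels_by_hashtag(annotated_posts)
top_food_label = summary["#food"].most_common(1)[0][0]
```

The `min_confidence` threshold of 0.7 is an arbitrary illustrative value. Raising it trades coverage for cleaner labels.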

3. How should an organization use AI advisory services

We understand that for many organizations, the value-add lies in actionable recommendations rather than aggregated data. Using web and machine intelligence to solve real-world problems is a multi-disciplinary approach and goes beyond a trivial software solution. It encompasses social sciences, business administration, economics, computer science and artificial intelligence.

The human touch is essential in identifying the problem, then recommending and integrating solutions. We listen to senior managers and hear their concerns. We customize the qualitative and quantitative frameworks to the needs of each organization. We build an extended relationship with the team and function as an involved member rather than a vendor.