SAP HANA Text Analysis

As many are aware, twenty-first century corporations are facing a crisis. Many corporations have been accurately and comprehensively storing data for years. This data comes in a variety of forms, such as social media posts, email, blogs, news, feedback, tweets, and business documents.

It is very important to extract meaningful information without having to read every single sentence. But what is meaningful information? The extraction process should identify the “who”, “what”, “where”, “when” and “how much” (among other things) in this data.
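As a rough illustration of the idea, the sketch below pulls the “when” and “how much” out of a sentence using two regular expressions. The sample sentence, the pattern names, and the `extract` helper are all invented for this example. Real text analysis relies on linguistic processing and dictionaries rather than a pair of hand-written patterns, and this is not how SAP HANA does it.

```python
import re

# Hypothetical sample sentence, invented for illustration.
TEXT = "Acme shipped 300 units to Berlin on 2024-03-15 for $4,500."

# Toy patterns only: an ISO-style date and a dollar amount.
# "Who", "what" and "where" need dictionaries or linguistic models
# and are deliberately left out of this sketch.
PATTERNS = {
    "when": r"\b\d{4}-\d{2}-\d{2}\b",
    "how_much": r"\$\d[\d,]*(?:\.\d+)?",
}

def extract(text):
    """Return every match for each labelled pattern found in the text."""
    return {label: re.findall(pattern, text) for label, pattern in PATTERNS.items()}

print(extract(TEXT))
```

Even this small case shows why hand-written rules do not scale: every new date format, currency, or language would need another pattern.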

For example, you can use social media data to find out:

What are people saying about my brand or products?

How many people recommend my brand vs. advocate against it?

Text Analysis is the solution to all of these problems.

In this article we will explain:

What is Text Analysis?

Why is Text Analysis so important for business?

How does SAP HANA support text analysis?

Before learning about Text Analysis, you first need to understand Structured Data and Unstructured Data.

Structured and Unstructured Data:

Structured Data: Data that resides in a fixed field within a record or file is called structured data. This includes data contained in relational databases and spreadsheets.

For example, data stored in database tables is structured data.

Structured data has the advantage of being easily entered, stored, queried and analyzed.

Unstructured Data: Unstructured data files often include text and multimedia content. Examples include e-mail messages, word processing documents, videos, photos, audio files, presentations, webpages and many other kinds of business documents.

Digging through unstructured data can be cumbersome and costly. Email is a good example of unstructured data. It’s indexed by date, time, sender, recipient, and subject, but the body of an email remains unstructured. Other examples of unstructured data include books, documents, medical records, and social media posts.
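The email example can be made concrete with a minimal sketch. It uses Python's standard `email` module to split a message into its structured header fields and its unstructured body. The sample message and the `split_email` helper are hypothetical and exist only for illustration.

```python
from email import message_from_string

# Hypothetical sample message, invented for illustration.
RAW_EMAIL = """\
From: alice@example.com
To: support@example.com
Subject: Order feedback
Date: Mon, 06 Jan 2025 09:30:00 +0000

I ordered the blue headphones last week. The sound is great,
but delivery to Berlin took ten days.
"""

def split_email(raw):
    """Return (structured header fields, unstructured body text)."""
    msg = message_from_string(raw)
    # Headers sit in fixed, named fields and map directly to table columns.
    fields = {name: msg[name] for name in ("From", "To", "Subject", "Date")}
    # The body is free text: nothing marks the product, place or duration.
    body = msg.get_payload()
    return fields, body

fields, body = split_email(RAW_EMAIL)
print(fields)
print(body)
```

The header fields can be queried like any database column, while the product, the city and the delivery time mentioned in the body can only be found by reading or analyzing the text itself.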

Why is unstructured data so important for business?

Experts estimate that 80 to 90 percent of the data in any organization is unstructured. The amount of unstructured data in enterprises is also growing significantly, often many times faster than structured databases are growing.

The challenge lies in extracting meaningful information from unstructured data.