Innovation in data processing and machine learning technology

There are many formulas in Google Sheets for analyzing quantitative data, but spreadsheets often capture valuable text data as well. Text data in Google Sheets can come from many sources: Google Form responses, notes columns, descriptions, and more. As humans we can make sense of this data by reading it, but this becomes difficult as your data grows into hundreds or thousands of rows.

Cloud Natural Language (Cloud NL) takes the machine learning technology used by Google Search and Google Assistant and makes it possible for anyone to perform syntax analysis, sentiment analysis, and entity analysis with their own data. Cloud NL is accessible from Google Sheets with a few lines of Apps Script, and can help extract meaning from text data in a scalable way.

In this post, we’ll import a Kaggle dataset into Google Sheets, write some Apps Script to send our text data to Cloud NL, and visualize the results in Data Studio.

Importing data into Sheets

The Boston Airbnb Open Data dataset on Kaggle includes a snapshot of guest reviews for thousands of Airbnb listings in Boston. One listing can have hundreds of reviews, which makes this data an ideal candidate for Cloud NL.

For our analysis, we will use Cloud NL to identify the entities mentioned in the reviews (think “parking”, “location”, “wifi”, etc.), and determine the sentiment (positive or negative) expressed with regard to these entities.

The Cloud NL method we’ll be using, entity sentiment analysis, currently supports English. Since the comment text in our data is in several languages, we’ll translate the text to English before sending it to Cloud NL. Google Sheets has two built-in functions that easily make this happen: DETECTLANGUAGE and GOOGLETRANSLATE.

Sending text data to Cloud Natural Language

Now, we are ready to write some Apps Script to manage the API calls and add the Cloud NL data to our sheet.

First, we’ll write a function called retrieveEntitySentiment that takes a line of text as a parameter and queries the analyzeEntitySentiment method of Cloud NL. Calling this method requires a Google Cloud Platform account and project, enabling the Google Natural Language API, and creating an API key. In the function, we will provide this API key, the request URL, and our request body for analyzeEntitySentiment. The request body includes our text along with its language and type, and a field called encodingType. Once our request JSON is ready, the function will call the endpoint and parse the JSON response with the entities and related sentiment.

Now we can incorporate this function into our primary function, let’s call it markEntitySentiment. This primary function will identify which reviews have not yet been sent to Cloud NL, call the retrieveEntitySentiment function, and record the response data in a format ready for further analysis. We will also set variables that identify the review sheet and index the columns we will need to reference.

In the next part of the function, we will process each row, calling Cloud NL for each review comment. We will call the API only for rows that contain a text comment, and also have not already been processed in a previous run. By building this check into our code, we can set the function to run on various triggers, such as on form submit, on edit, or on a regular pre-defined time interval.

Now that our Apps Script is complete, we are ready to run markEntitySentiment and watch the response data stream into the entitySentiment tab.

Visualizing NL insights with Data Studio

Once the script finishes running, we’ll find we have over 5,000 rows of data in the entitySentiment sheet! Now to visualize the results. Google Data Studio (beta) is a reporting and visualization tool with built-in connections to several data sources, including Google Sheets. Data Studio makes it possible to share interactive dashboards while helping to protect the security of the raw data.

To get started, we’ll navigate to datastudio.google.com, start a new report, and create a new data source. After choosing the Google Sheets connector, we’ll be able to add our entitySentiment and listings sheets as data sources for our report.

Data Studio has several chart options to choose from. In our report, we’ll start with two simple tables showing the entities by average sentiment and number of mentions for both the most positive and negative entities.

Adding in rating data from the listings tab, we will use scorecard charts to show our listing’s ratings against the average ratings across all Boston properties. Data Studio also offers a geo map chart that shows us the location of the listing using latitude and longitude coordinates from the data.

Take a look at the final product, as an interactive web embed! With just a few clicks, Data Studio will give you the code snippet you need to embed your report into a web page, making it available to the public or to a certain set of users.