Exercise

Quick taste of text mining

It is always fun to jump in with a quick and easy example. Sometimes we can find out the author's intent and main ideas just by looking at the most common words.

At its heart, bag of words text mining represents a way to count terms, or n-grams, across a collection of documents. Consider the following sentences, which we've saved to text and made available in your workspace:

Manually counting words in the sentences above is a pain! Fortunately, the qdap package offers a better alternative. You can easily find the top 4 most frequent terms (including ties) in text by calling the freq_terms function and specifying 4.

frequent_terms <- freq_terms(text, 4)

The frequent_terms object stores all unique words and their count. You can then make a bar chart simply by calling the plot function on the frequent_terms object.

plot(frequent_terms)

Instructions

100xp

We've created an object in your workspace called new_text containing several sentences.

Load the qdap package

Print new_text to the console.

Create term_count consisting of the 10 most frequent terms in new_text.