Grassroots SEO: Knowing Your Twitter Audience

The actual creation of a blog site isn’t too tough; it’s reaching your audience that is hard. Just like any startup or new business venture, you should research your market if you intend to monetize your brand. And when it comes to finding an audience interested in programming, automation, and other nerdy things, the pool seems to evaporate like water in a heat exchanger. Luckily, we have Python to find our audience for us while we sleep and work our day jobs.

Social Media Market Share

Where to Find Our People

As of April 2016, Facebook has a cool 1.6 billion users and Twitter 320 million. LinkedIn may only have 100 million users, but those interacting on LinkedIn are more professionally focused and may be interested in reading a technical blog. These are the “Big 3.” And as we discussed previously regarding SEO, visibility is a huge deal.

Market Research – Automated

In order to identify our consumer base (those likely to be interested in the AML blog), we can start with Twitter. Twitter is like a focus group for the 21st century: there is so much data out there about your audience that you only need the tools to extract it. The Twitter API and a Python package called Tweepy are what you will need.

Objective: Using Python, we will learn about our audience, extract keywords they seem to be interested in, generate new words to search for and get to know our people.

Creating a Blog Audience is Tough Business

Since we do not want to create another “Twitter Bot,” we will not be automatically following those we identify. According to our objective, we are trying to understand our audience, not chase them away! We can start with a few keywords that describe AML, such as Python, data science, programming, and automation. Tweepy takes these words and returns only the tweets that contain them.

"""--------------The Process------------
Tweepy ref - http://docs.tweepy.org/en/v3.5.0/api.html
0. Get a twitter account and create a new app -- apps.twitter.com
1. Have a list of things you think your consumers are interested in
2. Open the Twitter stream and analyze the matches from the Twitter API
3. Keep an index of screen_names, tweets and any other key information you may be interested in
-maybe even keep links to other blogs to scrape yourself for new key words??
4. Post that data into a database or save it to a text file
5. Extend - Start gaining followers by "favouriting" their content
6. Extend Even Further - Create some charts for your Chief Marketing Officer (CMO) with pandas
"""

Think of this code as “Listening to your customer”. In fact, we’ll be importing the StreamListener() from the Tweepy package. Check out the imports:

from tweepy import Stream, API
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import time  # we will use this to limit our consumption
import json  # to capture the tweets
# we will also use MySQL to save the data
import MySQLdb
# and what if we want to make some graphs with the data?
import pandas as pd
import matplotlib.pyplot as plt

You are going to need a consumer key, consumer secret, access token, and access token secret (each an alphanumeric string) so that the Twitter API can authenticate you through OAuth. You get these from Twitter when you register your app.
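The four credentials are just strings you paste in from apps.twitter.com; the values below are placeholders, not real keys. The first pair goes to Tweepy's `OAuthHandler()`, and the second pair to `auth.set_access_token()`:

```python
# Placeholders -- paste your own values from apps.twitter.com
consumer_key = "YOUR_CONSUMER_KEY"
consumer_secret = "YOUR_CONSUMER_SECRET"
access_token = "YOUR_ACCESS_TOKEN"
access_secret = "YOUR_ACCESS_SECRET"
```

Keep these out of any code you publish; anyone holding them can act as your app.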

Strategy First

Before we go any further with the main loop, let’s list the questions we would like answered from this code:

Who is saying what?

Where are they saying it from and in what language? (Who knows, maybe we want to be international??)

Are there any other blogs we should follow that our consumers are talking about?

Can we save our competitors blogs to web-crawl later?

Maybe we should blog about topics they do or don’t cover?

Code Second

Using the StreamListener() that we imported from the Tweepy package, we can now subclass it in our own Listener() object. on_data() is actually a Tweepy method that receives the Twitter stream in JSON format. If you are familiar with Python, this should get you pretty pumped, because the returned data can be put into a Python dictionary for easy manipulation.
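Since Tweepy hands on_data() the raw JSON text, the body of our Listener can be sketched as ordinary Python. This is a simplified stand-in, assuming a tweet payload shaped like Twitter's tweet object (the getLinks stub below is just a placeholder for the function described next):

```python
import json

def on_data(data):
    """What our Listener's on_data() method does with one raw tweet."""
    all_data = json.loads(data)        # JSON text -> Python dict
    tweet = all_data["text"]
    screen_name = all_data["user"]["screen_name"]
    print(screen_name, "::", tweet)    # "Who is saying what?"
    getLinks(tweet)                    # harvest any links for later
    return True                        # returning True keeps the stream open

def getLinks(tweet):
    # placeholder -- the real version does a regex search for "http"
    pass
```

Returning False instead of True is how a Tweepy listener tells the stream to disconnect, which pairs nicely with the time import above if you want to cap your consumption.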

The last line of code here is getLinks(tweet), a user-defined function that does a regex search for anything containing “http”. In other words, we’re looking for links a user posted within their tweet, which we can save for later.
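The post doesn't show getLinks() itself, so here is one plausible version: a minimal sketch using the standard re module. The exact pattern is an assumption; this one grabs any http or https run up to the next whitespace:

```python
import re

def getLinks(tweet):
    """Return every link mentioned in a tweet -- anything starting with http."""
    return re.findall(r"https?://\S+", tweet)

links = getLinks("Great read https://example.com/post and http://blog.example.org too")
print(links)  # ['https://example.com/post', 'http://blog.example.org']
```

Note that Twitter wraps most links in its t.co shortener, so you may want to resolve them before filing them away as competitor blogs to crawl.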

Extending My Code

Eventually you will see opportunities to extend this code. Maybe you will want to grow your search terms: right now we are only using four topics, and we would eventually want to grow our terms along with our audience. Speaking of growth, we should probably dump everything into a database. import MySQLdb, anyone?
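One hedged way to grow your search terms: count the hashtags in the tweets you have already saved and promote the most common ones to new track keywords. The saved_tweets list below is made-up data standing in for your collected stream:

```python
from collections import Counter
import re

seed_terms = ["python", "data science", "programming", "automation"]

# Pretend these came out of our saved stream:
saved_tweets = [
    "Loving #pandas for #datascience work",
    "New #python release today! #programming",
    "#pandas makes cleaning data painless",
]

hashtags = Counter(
    tag.lower()
    for tweet in saved_tweets
    for tag in re.findall(r"#(\w+)", tweet)
)

# Promote anything mentioned more than once that we aren't already tracking
new_terms = [tag for tag, n in hashtags.most_common()
             if n > 1 and tag not in seed_terms]
print(new_terms)  # ['pandas']
```

Feed new_terms back into the track list on your next streaming session and the search grows itself, which is exactly the kind of chart-worthy feedback loop your CMO will want to see in pandas.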