Tutorial: creating a Twitterbot

TL;DR

Although it sounds like a lot of effort, creating a Twitter bot is actually really easy! This tutorial, along with some simple tools, can help you create Twitter bots that respond when they see certain phrases, or that periodically post a tweet. These bots work with Markov chains, which can generate text that looks superficially good, but is actually quite nonsensical. You can make the bots read your favourite texts, and they will produce new random text in the same style!

All code is available on GitHub

The examples on this page use a custom Python library, written by Edwin Dalmaijer (that’s me). This library is open source, and available on GitHub for free. You are very welcome to download and use it, but I would like to kindly ask you to not use it for doing evil stuff. So don’t start spamming or harassing people!

Step 2: Create a Twitter app

Fill out the details on the form. You have to give your app a name, description, and website (this can be a simple placeholder, like http://www.example.com).

Read the Developer Agreement, and check the box at the bottom if you agree. Then click on the ‘Create your Twitter application’ button.

Step 3: Keys and access tokens

This is an important step, as you will need the keys and access tokens for your app. They allow you to sign in to your account via a Python script.

After creating your new app, you were redirected to its own page. If you weren’t, go to apps.twitter.com and click on your app’s name.

On the app’s page, click on the ‘Keys and Access Tokens’ tab.

At the bottom of this page, click on the ‘Create my access token’ button.

Make sure you make note of the following four keys, as you will need these later. (Just leave the tab open in your browser, or copy them to a text file or something. Make sure nobody else can access them, though!)

Consumer Key (API Key)

[ copy this value from Consumer Settings ]

Consumer Secret (API Secret)

[ copy this value from Consumer Settings ]

Access Token

[ copy this value from Your Access Token ]

Access Token Secret

[ copy this value from Your Access Token ]

Getting a Python Twitter library

Before being able to log into Twitter via Python, you need a Python wrapper for the Twitter API. There are several options out there, and I haven’t exhaustively researched all of them. This guide will go with Mike Verdone’s (@sixohsix on Twitter) Python Twitter Tools library, which is nice and elegant. You can also find its source code on GitHub.

If you don’t have it already, you need to install setuptools. Download the ez_setup.py script, and run it with the Python installation you want to have it installed in.

BONUS STEP: If you don’t know how to run a Python script, read this step. On Linux and OS X, open a terminal. Then type cd DIR and hit Enter, but replace DIR with the path to the folder that contains the ez_setup.py script you just downloaded. Next, type python ez_setup.py and hit Enter. This will run the script that installs setuptools. On Windows, the easiest thing to do is to make a new batch file. To do this, open a text editor (Notepad is fine). Now write "C:\Python27\python.exe" "ez_setup.py" on the first line, and pause on the second line. (Replace "C:\Python27" with the path to your Python installation!) Save the file as "run_ez_setup.bat", and make sure to save it in the same folder as the ez_setup.py script. The .bat extension is very important, as it will make your file into a batch file. Double-click the batch file to run it. This will open a Command Prompt in which the ez_setup.py script will be run.

After installing setuptools, you can use it to easily install the Twitter library. On Linux and OS X, open a terminal and type easy_install twitter. On Windows, create another batch file, write "C:\Python27\Scripts\easy_install.exe" twitter on the first line, and pause on the second line. (Replace "C:\Python27" with the path to your Python installation!)

If you did everything correctly, the Twitter library should now be installed. Test it by opening a Python console, and typing import twitter. If you don’t see any errors, that means it works.
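If you would rather get a yes-or-no answer than a traceback, a small check along the following lines could work. This is only a sketch: the helper name is_installed is made up for this example and is not part of any library mentioned here.

```python
def is_installed(module_name):
    """Return True if module_name can be imported, False otherwise."""
    try:
        __import__(module_name)
    except ImportError:
        return False
    return True


# Report on the Twitter library that was just installed
for name in ("twitter",):
    status = "found" if is_installed(name) else "NOT found"
    print("%s: %s" % (name, status))
```

If this reports NOT found, the library probably ended up in a different Python installation than the one you are running the check with.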

Setting up

Time to get things ready for your Twitter bot script. You need two things: a dedicated folder to store things in, and the markovbot Python library. The markovbot library is written by me, and you can easily grab it off GitHub. You’re also very welcome to post any questions, issues, or additions to the code on GitHub.

Create a new folder, and give it a name. In this example, we will use the name ‘TweetBot’ for this folder.

Getting data

To establish a Markov chain, you need data. And lots of it. You also need the data to be in machine-readable form, e.g. in a plain text file. Fortunately, Project Gutenberg offers an excellent online library, with free books that also come in the form of text files. I’m talking about . (Please do note that Project Gutenberg is intended for humans to read, and not for bots to crawl through. If you want to use their books, make sure you download and read them yourself. Also, make sure that you have the right to download a book before you do it. Not all countries have the same copyright laws as the United States, where Gutenberg is based.)

Copy (or move) the text file you just downloaded to the ‘TweetBot’ folder, and name it ‘Freud_Dream_Psychology.txt’.

Writing your script

You should have everything now: A Twitter account with developer’s access, a Twitter library for Python, the custom markovbot library, and some data to read. Best of all: You’ve organised all of this in a folder, in precisely the way it is described above. Now, it’s finally time to start coding the actual Twitterbot!

Start your favourite code editor, and open a new Python script.

Save the script in the ‘TweetBot’ folder, for example as ‘my_first_bot.py’.

Start by importing the MarkovBot class from the markovbot module you need:

import os
from markovbot import MarkovBot

I’m assuming you are familiar enough with Python that you know what importing a library means. If you are not, maybe it’d be good to read about them.

The next step is to initialise your bot. The MarkovBot class requires no input arguments, so creating an instance is as simple as this:

# Initialise a MarkovBot instance
tweetbot = MarkovBot()

The next step is important! Before he can generate any text, the MarkovBot needs to read something. You can make it read your Freud example book.

The bot expects you to give him a full path to the file, so you need to construct that first:

# Get the current directory's path
dirname = os.path.dirname(os.path.abspath(__file__))
# Construct the path to the book
book = os.path.join(dirname, 'Freud_Dream_Psychology.txt')
# Make your bot read the book!
tweetbot.read(book)

At this point, your bot is clever enough to generate some text. You can try it out, by using its generate_text method. This takes one argument, and three (optional) keyword arguments (but only one of those is interesting).

generate_text’s argument is the number of words you want in your text. Let’s try 25 for now.

generate_text’s interesting keyword argument is seedword. You can use it to define one or more keywords that you would like the bot to try and start its sentence with:
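As a rough sketch, the two ways of calling generate_text described above could be wrapped like this. The helper name preview_text is invented for this example, and it assumes that ‘bot’ is a MarkovBot instance that has already read a book, and that seedword accepts a list of words.

```python
def preview_text(bot, n_words=25, seedwords=None):
    """Ask the bot for some text and return it (nothing is posted to Twitter).

    'bot' is assumed to be a MarkovBot instance that has already read a book.
    """
    if seedwords is None:
        # No keywords: the bot is free to start wherever it likes
        return bot.generate_text(n_words)
    # The bot will try to start its sentence with one of these words
    return bot.generate_text(n_words, seedword=seedwords)


# Example use (assuming 'tweetbot' was created and trained as shown above):
#     print(preview_text(tweetbot, 25))
#     print(preview_text(tweetbot, 25, seedwords=['dream', 'psychoanalysis']))
```

The output will differ on every run, so don’t be surprised if the same call gives you a different sentence each time.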

That’s cool, but what about the Twitter part? Remember you generated those keys and access tokens? You’ll need them now:

# ALL YOUR SECRET STUFF!
# Consumer Key (API Key)
cons_key = ''
# Consumer Secret (API Secret)
cons_secret = ''
# Access Token
access_token = ''
# Access Token Secret
access_token_secret = ''

Replace each set of empty quotes ('') with your own keys and tokens (these should also be between quotes).

First note on the codes here: It’s actually not very advisable to stick your crucial and secret information in a script. Your script can be read by humans and machines, so this is a highly unsafe procedure! It’s beyond the scope of this tutorial, but do try to find something better if you have the time.

Second note: Now that you are pasting your secret stuff into a plain script, make sure you paste it correctly! There shouldn’t be any spaces in the codes, and it’s really easy to miss a character while copying. If you run into any login error, make sure that your keys have been copied in correctly!
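One possible "something better" is to keep the keys out of the script entirely, and to read them from environment variables instead. The sketch below assumes four variable names that are made up for this example (TWITTER_CONS_KEY and friends), so use whatever names you actually set in your terminal. It also strips stray spaces, which takes care of one of the copy-paste mistakes mentioned above.

```python
import os

# Hypothetical variable names; they only need to match whatever you set in
# your terminal before starting the script.
ENV_NAMES = {
    'cons_key': 'TWITTER_CONS_KEY',
    'cons_secret': 'TWITTER_CONS_SECRET',
    'access_token': 'TWITTER_ACCESS_TOKEN',
    'access_token_secret': 'TWITTER_ACCESS_TOKEN_SECRET',
}


def load_credentials(environ=os.environ):
    """Read the four keys from environment variables and return them as a dict.

    Surrounding whitespace is stripped. A ValueError that names the missing
    variables is raised if any of them is absent or empty.
    """
    creds = {}
    missing = []
    for name, env_name in ENV_NAMES.items():
        value = environ.get(env_name, '').strip()
        if value:
            creds[name] = value
        else:
            missing.append(env_name)
    if missing:
        raise ValueError('Missing environment variables: '
                         + ', '.join(sorted(missing)))
    return creds


# Example use:
#     creds = load_credentials()
#     cons_key = creds['cons_key']
```

This does not make the keys impossible to steal, but it does mean you can share or publish your script without sharing your account.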

Only one more thing to do before you can start up the bot: You need to decide what your bot should do.

There are two different things the MarkovBot class can do. The first is to periodically post something. This is explained further down.

The second thing the MarkovBot class can do, is to monitor Twitter for specific things. You can specify a target string, which the bot will then use to track what happens on Twitter. For more information on the search string, see the Twitter API website.

The target string determines what tweets your bot will reply to, but it doesn’t determine what the bot says. For that, you need to specify keywords. These can go in a big list, which the bot will use whenever he sees a new tweet that matches the target string. Your bot will try to find any of your keywords in the tweets he reads, and will then attempt to use the keywords he found to start a reply with. For example, if your target string is ‘#MarryMeFreud’, and your keywords are [‘marriage’, ‘ring’, ‘flowers’, ‘children’], then your bot could find a tweet that reads I want your flowers and your children! #MarryMeFreud. In this case, the bot would read the tweet, find ‘flowers’ and ‘children’, and it will attempt to use those to start his reply. (Note: He won’t use both words, this is a very simple hierarchical thing, where the bot will try ‘flowers’ first, and ‘children’ if ‘flowers’ doesn’t work.)
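To make the keyword idea a bit more concrete, here is a toy version of the "which keywords are in this tweet?" step. It is a simplified illustration, not the library’s actual code: the function name find_keywords is made up, and the real matching and ordering inside MarkovBot may well differ.

```python
import re


def find_keywords(tweet_text, keywords):
    """Return the keywords that occur in the tweet, in keyword-list order."""
    # Lower-case everything and split on anything that is not a letter,
    # digit, or underscore, so that 'children!' still counts as 'children'
    words = set(re.findall(r'\w+', tweet_text.lower()))
    return [kw for kw in keywords if kw.lower() in words]


keywords = ['marriage', 'ring', 'flowers', 'children']
tweet = 'I want your flowers and your children! #MarryMeFreud'
print(find_keywords(tweet, keywords))  # prints ['flowers', 'children']
```

A bot could then try the first keyword in that list as the start of its reply, and fall back on the next one if that doesn’t produce any text.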

In addition to the above, you can also have the MarkovBot add prefixes and suffixes to your tweets. This allows you to, for example, always start your tweet with a mention of someone (e.g. ‘@example’), or to always end with a hashtag you like (e.g. ‘#askFreud’).

Finally, the MarkovBot allows you to impose some boundaries on its behaviour. Specifically, it allows you to specify the maximal conversational depth at which it is still allowed to reply. If you are going to use your bot to reply to people, this is something you really should do. For example, if your bot always replies to people who mention ‘@example’, they are likely to wish to talk to Edward Xample. It’s funny to get one or two random responses, but as the conversation between people and Edward Xample continues, you really don’t want your bot to keep talking to them. For this purpose, you can set the maxconvdepth to 2. This will allow your bot to reply only in conversations with no more than two replies.

# Set some parameters for your bot
targetstring = 'MarryMeFreud'
keywords = ['marriage', 'ring', 'flowers', 'children', 'religion']
prefix = None
suffix = '#FreudSaysIDo'
maxconvdepth = None

The MarkovBot’s twitter_autoreply_start method can start a Thread that will track tweets with your target string, and automatically reply to them using your chosen parameters.
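A minimal sketch of logging in and starting the auto-reply Thread could look like the function below. The wrapper start_autoreply is invented for this example. The order of the four arguments to twitter_login is an assumption based on the variable names used above, so check the library’s source if the login fails.

```python
def start_autoreply(bot, cons_key, cons_secret, access_token,
                    access_token_secret, targetstring, keywords=None,
                    prefix=None, suffix=None, maxconvdepth=None):
    """Log the bot in to Twitter, then start the auto-reply Thread.

    'bot' is assumed to be a MarkovBot instance that has already read a book.
    """
    # Logging in has to happen before any other Twitter-related method.
    # (The argument order is an assumption, see the text above.)
    bot.twitter_login(cons_key, cons_secret, access_token,
                      access_token_secret)
    # Track tweets that match the target string, and reply to them
    bot.twitter_autoreply_start(targetstring, keywords=keywords,
                                prefix=prefix, suffix=suffix,
                                maxconvdepth=maxconvdepth)


# Example use, with the variables defined earlier in the script:
#     start_autoreply(tweetbot, cons_key, cons_secret, access_token,
#                     access_token_secret, targetstring, keywords=keywords,
#                     prefix=prefix, suffix=suffix, maxconvdepth=maxconvdepth)
```

Of course, you can also call the two methods directly in your script; the function is only there to keep the example in one piece.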

If you want to stop your MarkovBot from automatically replying, you can call its twitter_autoreply_stop method.

How quickly your bot replies to tweets is highly dependent on how many tweets are posted that match your target string.

# (Don't do this directly after starting it, or your bot will do nothing!)

tweetbot.twitter_autoreply_stop()

Another thing the MarkovBot can do, is to periodically post a new tweet. You can start this with the twitter_tweeting_start method.

The keywords, prefix, and suffix keywords are available for this function too. The keywords work a bit differently though: For every tweet, one of them is randomly selected. You can also pass None instead of a list of keywords, in which case your bot will just freestyle it.

One very important thing, is the timing. The twitter_tweeting_start method provides three keywords to control its timing: days, hours, and minutes. The time you specify here, is the time the bot waits between tweets. You can use all three keywords, or just a single one. If you don’t specify anything, the bot will use its default of one day.
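To get a feel for the timing, the helper below works out how long the bot would wait between tweets. It assumes that the three values are simply added up, which is a guess about the library rather than a documented fact. The helper name tweet_interval_seconds is made up for this example. If the guess is right, passing only hours=2 would mean one day plus two hours (because of the one-day default), so it is probably safest to always pass all three keywords.

```python
from datetime import timedelta


def tweet_interval_seconds(days=1, hours=0, minutes=0):
    """Return the waiting time between tweets in seconds.

    Assumes days, hours, and minutes are added together.
    """
    return timedelta(days=days, hours=hours, minutes=minutes).total_seconds()


# 19 hours and 30 minutes between tweets
print(tweet_interval_seconds(days=0, hours=19, minutes=30))  # prints 70200.0

# The matching call on a logged-in bot could then look like this:
#     tweetbot.twitter_tweeting_start(days=0, hours=19, minutes=30,
#                                     keywords=None, prefix=None,
#                                     suffix='#askFreud')
```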

If you want your bot to stop, you can use the twitter_tweeting_stop method.

# (Don't do this directly after starting it, or your bot will do nothing!)

tweetbot.twitter_tweeting_stop()

Spamming and trolling

Although Twitter bots can easily be used to spam and troll people, I kindly ask you not to do it. You gain absolutely nothing by doing it, and Twitter’s API is built in such a way that it automatically blocks accounts that do too much, so you will be shut down for spamming. Nobody likes spammers and trolls, so don’t be one.

Conclusion

Creating a Twitter bot is easy! If you want to use my software to create your own, please do go ahead. It’s free and open, and this page lists the instructions on how to download it. I would love to hear about your projects, so please feel free to leave a comment with your story.

UPDATE: Join competitions with your bot!

Stefan Bohacek was kind enough to point out the existence of a monthly bot competition that he organises. Have a look at his Botwiki website to learn about this month’s theme, and how to compete.

It sounds like you might have misplaced the markovbot folder. You can get it from GitHub. Press the Download ZIP button, download the zip archive, and then unzip it. After that, copy the ‘markovbot’ folder to the same folder as where you keep your script. (Alternatively, you could copy it to your Python installation’s site-packages folder.)

Sorry if that was a really basic explanation; it’s definitely not intended to be condescending! If the problem persists, could you try to give me some more info to work with? E.g. what you did to get the markovbot library, what you’re trying in your code, etc.

I am having the same issue with importing the module. I downloaded the markovbot-master from the link provided. Then placed the “markovbot” folder into the directory where my script is. But when I try to import I am given:

Unfortunately, I can’t replicate your error. Would you be able to provide some additional info?

1) What version of Python are you running? On what operating system?
2) What’s the contents of your ‘markovbot’ folder? Is there an __init__.py and a markovbot.py? Anything else?
3) Did you use an archiving tool to unzip the markovbot-master.zip archive?
4) Is there an error if you try the following? from markovbot.markovbot import MarkovBot

I am also getting the same import error running Python 3.5 on Windows 7 64bit. I do not get any errors when I import all of “markovbot” but when I use from markovbot import MarkovBot I get the same error. I also tried explicitly importing the module and was also unsuccessful.

So it works when you use import markovbot? In that case, simply initialise a MarkovBot instance by using my_bot = markovbot.MarkovBot(). I’m not sure why the direct imports aren’t working, as I can’t reproduce the issue. Sorry!

Thanks! That’s a good question, and I guess it depends on what kind of emoticons you’re using. Are they unicode characters? Does Twitter automatically turn some combinations of characters into an emoticon, or do they have their own encoding?

If I remember correctly, the entire library is unicode-safe. So you should be able to update a Tweet with unicode emoticons.

If you use the ‘generate_text’ (or ‘_construct_tweet’) method, it will only create a string. You need the ‘twitter_autoreply_start’ or ‘twitter_tweeting_start’ methods to actually post random text to Twitter (they use sub-threads that monitor tweets and post replies, or that periodically tweet). Make sure that you use the ‘twitter_login’ method before calling the other twitter-related methods. Also, you want to make sure that the main thread is doing something, otherwise you will end the sub-threads if the main thread runs out of things to do. (You can, for example, use the ‘time.sleep’ function to keep the main-thread busy for as long as you’d like. Or you could turn on/off the auto-replying depending on what time it is, for example to make the bot only reply during business hours.)
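The business-hours idea could be sketched as follows. The function is_business_hours and its default hours (9 to 17, Monday to Friday) are made up for this example, and the commented loop assumes a logged-in bot called ‘tweetbot’ plus the targetstring and keywords variables from the tutorial.

```python
from datetime import datetime


def is_business_hours(now=None, start_hour=9, end_hour=17):
    """Return True on Monday to Friday between start_hour and end_hour."""
    if now is None:
        now = datetime.now()
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start_hour <= now.hour < end_hour


# Sketch of a main loop that uses it (this also keeps the main thread busy):
#     import time
#     replying = False
#     while True:
#         if is_business_hours() and not replying:
#             tweetbot.twitter_autoreply_start(targetstring, keywords=keywords)
#             replying = True
#         elif not is_business_hours() and replying:
#             tweetbot.twitter_autoreply_stop()
#             replying = False
#         time.sleep(60)
```

Note that this uses the clock of the computer that runs the script, so the "business hours" are in that computer’s time zone.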

Thanks so much for this tutorial it’s the most comprehensive i’ve found so far and i’m a complete newbie so it helps a lot.

unfortunately i have the same problem. when i run the script in terminal it seems to be working fine as it prints out the generated text. only there’s no tweet being generated in the actual account. I was just using the periodical tweet so as to try out if it’s doing anything but so far it stays silent.

I assume you’ve removed the access codes for obvious reasons, but just to confirm: In your actual script, the cons_key etc. are the keys you generated with Twitter’s Dev website, right?

If you’re running this script using a terminal, it will terminate directly after starting the bot. This means it’ll never really tweet anything. You’ll need to keep the main thread alive for the bot to remain active in the background. The simplest way of doing this, is by simply sleeping for a while: import time; time.sleep(86400) (This will keep it active for 86400 seconds, which should be 24 hours.)
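A slightly more careful version of the same idea is sketched below. It waits for the requested time, lets you cut the wait short with Ctrl+C, and then calls a clean-up function of your choice. The name keep_main_thread_alive and its arguments are made up for this example.

```python
import time


def keep_main_thread_alive(seconds, on_exit=None, sleep=time.sleep):
    """Block the main thread for 'seconds', then call on_exit (if given).

    Pressing Ctrl+C in the terminal ends the wait early; on_exit is still
    called in that case.
    """
    try:
        sleep(seconds)
    except KeyboardInterrupt:
        # The user pressed Ctrl+C: stop waiting, but still clean up below
        pass
    finally:
        if on_exit is not None:
            on_exit()


# Example use, after logging in and starting the periodic tweeting:
#     keep_main_thread_alive(86400, on_exit=tweetbot.twitter_tweeting_stop)
```

The sleep argument only exists so the function can be tried out without actually waiting; in a real script you can leave it alone.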

Hi.
In the step where we install the twitter library by writing the .bat file, the path is given as "C:\Python27\easy_install", but I found that it is actually "C:\Python27\Scripts\easy_install". I hope that is the general setting for all Python 2.7 installations on Windows.

I have a quick question, what does the bot do when it finds a tweet to reply to but that tweet doesn’t have any of the keywords? I can’t seem to get the autoreply function to work, and that seems to be a commonality.

Sorry about that, the issue was with selecting a database. Databases within a bot are a new feature, and they allow for using a single bot with multiple Markov chains, for example to allow it to reply in different languages. (Or to make it seem like your bot has multiple personalities.) The issue should be solved with the updates I made today. Grab the new code off GitHub, and you should be fine.

As for your actual question: If the bot can’t find a usable keyword, it will simply not use one at all. That means the text will be generated at random, without keyword prompting. This should never lead to any crashes, though!

Unfortunately, it’s not clever enough to actually understand the tweets it’s replying to. That’s kind of the point of a Markov chain: it describes the statistical regularities in language, and uses those to generate text. The keywords are a poor man’s way to get them in the right direction, sort of.

If you do want replies that make sense, you might have to turn to natural language processing, for example with NLTK (a Python package). In addition, you’ll probably want to use a different algorithm to generate responses (Markov chains are too simplistic for this kind of functionality).

As for the images: Yes, in theory. You can use the Twitter Python package that I reference in the article to upload media. See the Twitter dev site for more info.

That’s actually a very good question! Once you initialise the bot, log it in to Twitter, and activate the auto-reply and/or auto-tweeting, there will be a Thread running in the background. When your script terminates, this Thread will also terminate, thereby stopping the automatic Twitter activity. To get around this, you need to keep the main Thread busy. You can do this, for example, by sleeping. In the example script on GitHub you’ll notice that in lines 90-93 the script waits for a week, before stopping the Twitter stuff and terminating. In this way, your terminal will stay open, and the bot will stay active.

My new artbot: The Internet of Things, Lost and Found, is up and running.
It finds something that has been lost on the Internet about every 22 hours, then it looks for the owner.

But, a real place too, for when real objects are found. A tall black cylinder that opens for its function of capturing the lost and returning the found. The brains of the Bot is a Raspberry Pi 3 v. B, mounted in the head in a little-used room in the centre of the capital of New Zealand. Trigger: Lost&Found

line 1207, in _error
raise Exception(u"ERROR in Markovbot.%s: %s" % (methodname, msg))
Exception: ERROR in Markovbot.generate_text: No data is available yet in database 'default'. Did you read any data yet?

As for the other error: It seems you’re using a database that doesn’t have anything in it yet. Before you can generate text, you need to train your bot. Example:

# Initialise a MarkovBot instance
tweetbot = MarkovBot()

# Get the current directory's path
dirname = os.path.dirname(os.path.abspath(__file__))
# Construct the path to the book
book = os.path.join(dirname, u'Freud_Dream_Psychology.txt')
# Make your bot read the book!
tweetbot.read(book)

This will put the text in the ‘default’ database. You can also add it to a specific database, by using the database keyword:

tweetbot.read(book, database='dreams')

You can use the same database keyword for generating text, auto-replying and auto-tweeting:

# Generate some text, using the 'dreams' database
tweetbot.generate_text(20, database='dreams')

Unfortunately, multiple keywords are not supported at this point. Feel free to have a crack at implementing it, though. Pull requests with new features are always welcome

EDIT: I just realised your question could be interpreted in two ways! The first is answered above: Can the bot use two keywords in the same tweet to generate a response (answer is no; one keyword will be chosen at random). The second interpretation is: Can the bot reply to tweets that contain two words, e.g. “Mars” AND “space”. The answer to that question is yes! To do so, you can set the bot’s target string to 'Mars space'. If you would like your bot to reply to tweets that contain “Mars” OR “space”, your target string should be 'Mars,space'. More info can be found on the Twitter Dev website.
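The AND/OR rule for target strings can be mimicked in a few lines of Python, which is handy for checking a target string before you let your bot loose. This is a simplified model of the behaviour described above, not Twitter’s actual matching code: the function name matches_target is made up, and hashtags and mentions get no special treatment here.

```python
import re


def matches_target(tweet_text, targetstring):
    """Return True if the tweet matches the target string.

    Commas separate alternative phrases (OR). Within a phrase, every
    space-separated term has to be present (AND). Case is ignored.
    """
    words = set(re.findall(r'\w+', tweet_text.lower()))
    for phrase in targetstring.lower().split(','):
        terms = phrase.split()
        if terms and all(term in words for term in terms):
            return True
    return False


print(matches_target('I love Mars', 'Mars space'))  # prints False
print(matches_target('I love Mars', 'Mars,space'))  # prints True
```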

Thanks so much for your quick and descriptive reply. I am going to dig into the twitter dev documentation and get my feet wet with some python learning. I want to be able to get the bot to auto-reply only to a certain list of twitter users & have the options to not use Markov chain but rather match a list of single line tweets based on keywords. It’s either going to be a great learning opportunity or an opportunity to keep frustration at bay 😉

p.s. If you do paid script work, feel free to ping me.
p.p.s . Hope your health is back on track, permanently now.

I found a few syntax errors with python 3.5.2, I think (I wouldn’t say I’m a good programmer so very likely wrong) that its from specifying Unicode, as from a bit of research i think the latest version has str as Unicode regardless.
I ‘think’ that i’ve fixed it by removing the instances of [str, unicode] and replacing them with [str].

Last thing, is it possible to test the bot by tweeting manually from python? Or any way I can use the commands and my own bot to just tweet out a string. (If this is possible, if you could point me where your code is looking for the strings specified so i could try and understand then replicate this that’d be really helpful)

After using the very high tech method of running the code, copypasting the error into google and finding what the issue is (I love stack overflow), I’ve got it to work in the latest update!! It was just silly things like next is now __next__. However I still don’t know how to make it tweet a string I pass it, and how to search for keywords in other tweets, then use this to reply.

How large is the database that you’re using? If it’s very limited, it’s not going to be able to generate bodies of text. (Your error message indicates that your bot attempted to generate a new message, but couldn’t.)

Thanks! I was able to fix it with a larger file.
Because my TwitterBot uses PT-BR as default language I also had to change from ‘UTF-8’ to ‘ISO8859-1’ and modify the lines that automatically make ‘suffix = preffix’ if it’s not Unicode in markovbot.py.

It would probably be useful to publish your code on GitHub, so that other people (including myself) can see what you changed to make TwitterBot compatible with different languages!

As for calling twitter_autoreply_start within a while loop: That won’t do anything, other than keeping the main Thread alive by continuously looping. The bot works by launching two Threads in the background, one for automatically tweeting and one for automatically replying to certain tweets. The only thing that the twitter_autoreply_start method does, is telling one of these Threads to start auto-replying, and what parameters to use for this.

The thing is, those two Threads will stop running in the background when your main Thread stops. So you’ll need to keep the main Thread busy to prevent the bot from stopping. This is explained and demonstrated in the example script.

Hi, thanks a ton for this tutorial, it was really useful. I’ve managed to get the bot to generate text in the terminal. I’m very new to all this, so please excuse me if my questions are very basic.

I wasn’t able to use easy_install twitter directly (it kept saying that I didn’t have root folder access even though I was using an administrative account), so I instead used sudo easy_install twitter. No error showed up, but is this an effective solution?

Secondly, can I use XCode or Idle as a console to test with import twitter? If so, how? When I entered it in Idle as follows
import twitter
and hit enter. The prompter appears as follows:
>>|
I don’t know what I should be doing or how to run this (its a shell), or if the error should have appeared upon hitting enter.

And finally, I tried incorporating the sleep timer in the example in the script, but the following error showed up:
NameError: name ‘time’ is not defined
What should I do to fix this?

i have a query.. what to do if i have to make my bot search a particular phrase on twitter and then reply them with my choice of reply.
for example.. i created a new text file.. wrote 4-5 sample tweets of my choice.. and replaced it with .txt file of book u mentioned to download.. but it doesn’t work..

sorry ,if my question sounds bit dumb.. i have zero knowledge of programming.. still learnt while going through this post of yours.. whatever u have mentioned here.. i have tried it works for me.. my bot searches certain string and then replies them with quotes of that book..

Thanks for the tutorial, this was very informative and helpful especially for a newbie like myself. Is there anyway to edit the bot so that it can tweet a .txt in sequence, word by word instead of generating random text?

Nice tutorial, used your code and @DeepThoughtBot1 is online. It uses the Hitch Hikers Guide to the Galaxy as a book. This bot also takes a image and a thought from reddit and makes a new pic to post. Thank for the Markov chains code now it is really cool.

Just had a look at it, and it works beautifully! I wouldn’t have predicted that Douglas Adams’s work would be so suitable for Markov chains, but it turns out to be really good! I also really like the Reddit interaction. Is it a high-karma /r/showerthought superimposed on a high-karma /r/pics post?

Thank you for this excellent tutorial. It worked so well that my app was banned after about 2 hours of use – they removed its write permission

So rather than it seeking out keywords and adding replies to other users’ tweets, which is not permitted in Twitter at all, I am trying to adapt it to simply post a tweet and only reply when somebody responds to that tweet. Does that sound possible?

Hahaha, I did warn about this! 😉 As another note of caution, also for others reading this: Be careful about which keywords you pick to respond to! For my bots, I use #askFreud and #askBuddha, which are clearly intended to actively ask something from either entity. Moreover, they are almost completely unused, so the only thing my bots do is answer to people who actively seek them out, and only occasionally they will give an unsuspecting tweeter a funny surprise.

To answer your question: It’s definitely possible to reply only to tweets targeted at your handle, and it’s actually very easy! You could do it by simply using @MyBotName as your search string.

I entered the script and pressed Run Module, which gave:
Traceback (most recent call last):
File "D:\nodeproject\markovbot-master\my_first.bot.py", line 2, in <module>
from markovbot import MarkovBot
File "D:\nodeproject\markovbot-master\markovbot\__init__.py", line 26, in <module>
from markovbot35.py import MarkovBot
ModuleNotFoundError: No module named 'markovbot35'
Is there a solution?

I’ve got my bot running and I’m having a ton of fun. I’m curious whether there’s a way to get the autoreply feature to accept keywords from tweets. If, perhaps, somebody tweets #targetstring *new_keyword*, could the bot input that somehow?

When I first initialized the bot a couple of days ago, it was totally able to reply. Now, I’m not able to get it to work. I shut terminal down, then I copied out the code above verbatim, but still nothing works!

Those messages indicate that reviving a Thread worked, so there shouldn’t be a problem with that. Do you have any other info at all? Error messages, anything else you did? You’re saying you had your own bot up and running, but then you said you copied the code from above; which one was it?

This is already possible! The read method adds the content of a source to an internal database (you can even specify which internal database!). So the following would add the content of two sources to the default database:

# This assumes 'bot' is a MarkovBot instance
bot.read('example_1.txt', overwrite=False)
bot.read('example_2.txt', overwrite=False)

To be clear, the overwrite keyword argument’s default input is False, so the following would be equivalent:

# This assumes 'bot' is a MarkovBot instance
bot.read('example_1.txt')
bot.read('example_2.txt')


And the following would add the contents of the two example sources to different internal databases:

# This assumes 'bot' is a MarkovBot instance
bot.read('example_1.txt', database='funny', overwrite=False)
bot.read('example_2.txt', database='sad', overwrite=False)

As a final example, the following code first adds ‘example_1.txt’ to the default internal database. But the second call overwrites the existing database, and then adds the content of ‘example_2.txt’:
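The snippet below illustrates that overwrite behaviour with a toy stand-in, so that it can be run without the library. The read_into function and the plain dict of word lists are made up for this illustration, and are not how MarkovBot stores its data internally.

```python
# With the library itself, the two calls described above would presumably be:
#     bot.read('example_1.txt')
#     bot.read('example_2.txt', overwrite=True)


def read_into(databases, words, database='default', overwrite=False):
    """Toy stand-in for reading: store a list of words in a named database."""
    if overwrite or database not in databases:
        # Start from scratch: anything that was read earlier is forgotten
        databases[database] = []
    databases[database].extend(words)
    return databases


databases = {}
read_into(databases, ['first', 'source'])
read_into(databases, ['second', 'source'], overwrite=True)
print(databases['default'])  # prints ['second', 'source']
```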

Edwin – Thanks for this great script. Wondering if you could help me understand how to improve the variety of the output. Currently, I’m using MarkovBot to mix together lyrics from two different artists – however I find that the bot tends to simply quote whole (or near-whole) lines of lyrics from just one of the artists, instead of mixing up the lyrics between the two artists.

Is there a way to reduce how much of any particular lyric is used? (The equivalent in your sample bot would be to avoid just quoting an entire sentence from Freud’s book). Also, how to ensure that it generates tweets using material from both artists?

I’d prefer the tweets be more like “A couple words from Artist #1″ + “A couple words from Artist #2″ + “A couple words from Artist #1″ etc. Or at least feel a little more mixed up, and not directly from just one or two songs from the same artist.

I’ve tried combining lyrics into paragraphs in the txt file, so they are more “book-like.” I’ve tried further randomizing the seed in markovbot27.py by putting the randomized seed through a second “random.randint” function. I’ve tried tweaking the triples function. I’ve tried mixing the order of the songs in the txt file so they flip-flop from one artist to the other. I’ve also tried putting each artist in their own txt file, w/ two separate bot.read functions per your description above. None of it has really seemed to make a difference, and I think I’ve about reached the end of my limited coding knowledge.

That’s simply a by-product of how Markov chains work: The probability of a particular word following two other words is assessed for all successive words in your two songs. Because there are so few of them, it’s very likely that each word pair in your songs only occurs with one successor. This means that the probability of that one particular successor coming out of the Markov chain is 1. The remedy is simple: You need more data than just two songs.
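A tiny, self-contained Markov chain shows the effect. This is a bare-bones sketch of the general technique (word pairs mapped to their possible successors), not the code inside markovbot, and the function names are made up for this example.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map every pair of successive words to the words that followed it."""
    words = text.split()
    chain = defaultdict(list)
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        chain[(w1, w2)].append(w3)
    return chain


def generate(chain, start_pair, max_words=10, rng=random):
    """Walk the chain from start_pair until it dead-ends or is long enough."""
    w1, w2 = start_pair
    output = [w1, w2]
    while len(output) < max_words and (w1, w2) in chain:
        # Pick one of the words that followed this pair in the source text
        w1, w2 = w2, rng.choice(chain[(w1, w2)])
        output.append(w2)
    return ' '.join(output)


lyric = 'I want your flowers and your children'
chain = build_chain(lyric)

# Every word pair has exactly one successor, so there is nothing to choose
print(all(len(successors) == 1 for successors in chain.values()))  # prints True
# The "random" text is therefore the original line, word for word
print(generate(chain, ('I', 'want')))  # prints I want your flowers and your children
```

Only when the same word pair shows up in several places, followed by different words, does the chain get a chance to jump from one line to another. That takes a lot more text than a handful of songs.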

Thanks, I can see what you’re saying, and I can try to work with that. But also – looking at the example tweet I posted, I can find hundreds of places in the corpus where words like “and” “it” “is” “a” etc are used in various songs from both artists. Is there a reason why the Markov chain didn’t break at one of those words and skip to a new location in the corpus, instead of staying within one linear lyric?

Is there a way to force breaks at words like “and”/”is”/etc so that whenever the bot encounters one of those words, it looks elsewhere in the corpus for the next word?

Yes, you’re right, those are common words! But as I said, the chain looks at pairs of successive words, so it also takes into account the word either before or after the “and”/”it”/”is”. In the end, there might be fewer common combinations than you think.

There isn’t a way to force the chain to “break”, i.e. switch between corpora when generating new text. That’s because within a database, no information is retained on the origin of the words: If the bot read data from multiple sources, it forgets where it read what.

help me with this
error message:
Traceback (most recent call last):
File "my_first_bot.py", line 4, in <module>
from markovbot import MarkovBot
File "C:\Users\u_name\Desktop\TweetBot\markovbot\__init__.py", line 26, in <module>
from markovbot35 import MarkovBot
ImportError: No module named 'markovbot35'

Edwin,
Absolute newbie here when it comes to bots but proficient with Py. Okay, I’ve had tons of classes about Py, but never ventured out on my own. So, I’m wanting to create a bot to help me with Twitter and a research project I’m involved in. I’m excited to use your tutorial and code but I have a question about the data source. Can I create my own text file that contains the data I want the bot to “read” and use in place of a book? Is there a special format I have to use other than just making it a plain text file? Any insight, including criticism, will be accepted.
Thanks again,
Joey

Hey, so if you look in the markovbot.py, the autoreply code (around line 868) captures the user and screen name. You could create an array outside of this function and populate it here or append the data to a file.