I made an Ira Glass bot

Thu 20 December 2018

A couple of weeks back, This American Life ran an episode on how we
read things differently depending on the context. They started the
show with a segment about InspiroBot. Host Ira Glass declared his love
for InspiroBot and interviewed the people behind it.

Since I love semi-auto-generated texts and Ira Glass is one of my
favorite journalists, I decided to make an Ira Glass "bot". Calling it
a bot is a bit of an overstatement. It's not like it can hold a
conversation or anything. Here's what I did:

Extracted everything Ira Glass said

I first tried a regex, but that got hairy fast, so I picked up Scrapy,
which I've used before. That got me reacquainted with XPath selectors.
The syntax is about as readable as regexes, but it's very powerful.
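To give a feel for what an XPath selector does here, this is a minimal sketch, not the Scrapy spider I actually ran. It uses the limited XPath subset in Python's standard library so it runs without installing anything, and the markup (a `div` per speaker turn with an `h4` naming the speaker) is a made-up stand-in for the real transcript pages, which may be structured quite differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical transcript markup, invented for this example.
SAMPLE = """
<article>
  <div class="host"><h4>Ira Glass</h4><p>From WBEZ Chicago, it's This American Life.</p></div>
  <div class="subject"><h4>Guest</h4><p>Thanks for having me.</p></div>
  <div class="host"><h4>Ira Glass</h4><p>Stay with us.</p></div>
</article>
"""


def extract_lines(markup, speaker="Ira Glass"):
    """Return the text of every paragraph spoken by `speaker`."""
    root = ET.fromstring(markup.strip())
    # Select each <p> inside a <div> whose <h4> child equals the speaker name.
    paragraphs = root.findall(f".//div[h4='{speaker}']/p")
    return [p.text.strip() for p in paragraphs if p.text]


lines = extract_lines(SAMPLE)
print(lines)
```

Real web pages are rarely well-formed XML, which is one reason to reach for Scrapy's selectors instead of the standard library for actual scraping.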

Generated 100000 text snippets

from pydodo import EnglishMarkov
import time
import os

outputfolder = "./generated"


def get_start_number(folder):
    # Continue numbering after the highest existing file, or start at 0.
    ls = os.listdir(folder)
    try:
        result = max([int(item.split(".")[0]) for item in ls]) + 1
    except ValueError:
        result = 0
    return result


def get_model(input):
    mm = EnglishMarkov()
    mm.construct(open(input))
    mm = mm.remove_pines()
    return mm


def generate(model, n, start_number, folder):
    t1 = time.time()
    count = 0
    while count < n:
        # Generate a sentence
        sent = model.generate_sentence()
        # Only hang on to it if it's longer than 90 characters.
        if len(sent) > 90:
            fn = os.path.join(folder, f"{start_number + count}.txt")
            with open(fn, "w") as fh:
                fh.write(sent)
            count += 1
            print(f"{count} / {n}")
    t2 = time.time()
    # Sentences generated per second
    print(n / (t2 - t1))


model = get_model("./data/all_data.txt")
generate(model, 100000, get_start_number("./generated"), "./generated")
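The script above leaves the actual Markov model to the library. For anyone curious about the general idea, here is a toy word-level Markov chain in plain Python. It is only an illustration of the technique, not how pydodo works internally, and the function names and the tiny corpus are made up for the example.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate_sentence(chain, start, max_words=30, rng=random):
    """Walk the chain from `start` until a sentence ends or no word follows."""
    sentence = [start]
    while len(sentence) < max_words:
        followers = chain.get(sentence[-1])
        if not followers:
            break
        word = rng.choice(followers)
        sentence.append(word)
        if word.endswith((".", "?", "!")):
            break
    return " ".join(sentence)


corpus = "I love radio. I love stories. Stories love radio."
chain = build_chain(corpus)
sentence = generate_sentence(chain, "I", rng=random.Random(0))
print(sentence)
```

Because repeated followers are kept in the list, words that follow each other more often in the source text are picked more often, which is what makes the output sound vaguely like the original speaker.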

The front end

The front end is all static HTML/CSS with a dash of JavaScript to
load in new text snippets. The loadRandomUrl function picks a random
number in the range 1 to 100000, fetches the corresponding text snippet
and inserts it on the page.
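The selection step is simple enough to sketch. This is the same idea written in Python to match the script above, not the site's actual JavaScript. The `generated` base path and the `.txt` naming are assumptions borrowed from the generation script, and `random_snippet_url` is a hypothetical name.

```python
import random

SNIPPET_COUNT = 100000  # number of generated snippets, from the post


def random_snippet_url(base="generated", count=SNIPPET_COUNT, rng=random):
    """Pick a random snippet number in 1..count and build its URL."""
    number = rng.randint(1, count)  # randint includes both endpoints
    return f"{base}/{number}.txt"


url = random_snippet_url(rng=random.Random(42))
print(url)
```

One thing to watch: the generation script starts numbering at 0 when the output folder is empty, so the range used on the page has to match whatever file names actually ended up in the folder.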