Press Release: Cure for Writer's Block Based on Automated Analysis of 137,000 Songs

PITTSBURGH—What might inspire someone to write a song called "I Am Your Adult" or "Numb Bones Dreaming?" In both these cases, the songwriters' muse was a computational tool that generates novel titles for popular music.

Called Titular, http://muse.fawm.org/titular, the creativity tool for songwriters is one of several developed by Burr Settles, a post-doctoral fellow in Carnegie Mellon University's Machine Learning Department. Another tool, LyriCloud, http://muse.fawm.org/lyricloud, makes lyrical suggestions based on words selected by the user. More than simple rhyming dictionaries or random word generators, these computational tools use lessons gleaned from analyzing thousands of existing songs to make suggestions that are both novel and meaningful.

"Writing lyrics that succinctly tell a story, express emotion or create an image in the listener's mind is challenging," said Settles, who also plays guitar in the Pittsburgh pop/rock band Delicious Pastries. "Even professional songwriters have trouble coming up with evocative titles or juicy words for the refrain. It seemed to me that the kinds of computational techniques we use to analyze natural language could be adapted to help overcome writer's block."

Settles created Titular and LyriCloud last year for February Album Writing Month (FAWM, http://fawm.org), an international songwriting event that he helped launch in 2004. FAWM challenges participants ("fawmers") to compose 14 new works of music, roughly an album's worth, in only 28 days. Last year, 4,000 people registered for FAWM and generated more than 10,000 new songs. Titular and LyriCloud, along with tools related to song structure and plot lines, were made available on the FAWM website to help participants reach their 14-song goals.

Others have created computational lyric-writing tools. Rock star David Bowie famously created one called Verbasizer, which helped him brainstorm ideas by generating random word permutations. But Settles noted that it takes a lot of work to sift through the results of something like Verbasizer. "Most of the suggestions feel truly random, and not very meaningful," he explained. Likewise, many other online lyric generators are by and large "Mad-Lib"-style, fill-in-the-blank templates that are often more amusing than inspiring.

To create the linguistic models for Titular and LyriCloud, Settles extracted 137,787 songs by 15,940 artists from online lyrics sites. The songs span multiple genres and include both obscure and well-known artists, such as Beyoncé, R.E.M. and Van Halen.

For Titular, Settles then used natural language analysis tools to tag each word in the titles based on its part of speech — common noun, plural noun, past participle verb, etc. — and automatically create templates that could be used to generate new titles based solely on patterns in the data. A template such as "(adjective) (common noun)," for instance, might yield the title "Irish Rover." The template "you're so (adjective)" might yield "You're So Vain." And the template "(-ing form verb) with a (common noun)" might yield "Walking With a Zombie."
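The template idea described above can be illustrated with a minimal Python sketch. It is not Settles' implementation: the three hand-tagged titles, the tag names (ADJ, NOUN, VERB_ING) and the LIT marker for words that stay fixed are all assumptions made for illustration, and a real system would obtain its tags from a part-of-speech tagger run over the full collection of titles.

```python
import random

# Hypothetical hand-tagged titles as (word, tag) pairs. The tag "LIT" is an
# assumed marker for words kept literally in the template.
TAGGED_TITLES = [
    [("Irish", "ADJ"), ("Rover", "NOUN")],
    [("You're", "LIT"), ("So", "LIT"), ("Vain", "ADJ")],
    [("Walking", "VERB_ING"), ("With", "LIT"), ("a", "LIT"), ("Zombie", "NOUN")],
]


def build_models(tagged_titles):
    """Turn tagged titles into templates plus a lexicon of words per tag."""
    templates = []
    lexicon = {}
    for title in tagged_titles:
        template = []
        for word, tag in title:
            if tag == "LIT":
                template.append(("LIT", word))   # keep the word itself
            else:
                template.append(("SLOT", tag))   # leave a slot for this tag
                lexicon.setdefault(tag, []).append(word)
        templates.append(tuple(template))
    return templates, lexicon


def fill_template(template, lexicon, rng):
    """Fill every slot in one template with a random word of the right tag."""
    words = []
    for kind, value in template:
        words.append(value if kind == "LIT" else rng.choice(lexicon[value]))
    return " ".join(words)


def generate_title(templates, lexicon, rng):
    """Pick a template at random and fill it."""
    return fill_template(rng.choice(templates), lexicon, rng)


if __name__ == "__main__":
    templates, lexicon = build_models(TAGGED_TITLES)
    rng = random.Random()
    for _ in range(5):
        print(generate_title(templates, lexicon, rng))
```

With only three source titles the output is limited to recombinations such as "You're So Irish" or "Vain Zombie"; the variety in a tool like Titular would come from the much larger pool of titles and templates.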

LyriCloud takes a word of interest to the songwriter, or seed word, as input and suggests up to 25 related words arranged visually in a word cloud. The word "dream," for instance, might generate possible modifiers of dream, such as "broken" or "deep." The suggestions might also include antonyms, such as "nightmare," or possible rhymes, such as "smithereen."

For each seed word, LyriCloud selects related words based on a similarity score it generates. The score is calculated in advance and is based in part on two counts drawn from all of the songs in its database: how often the candidate word appears in the same line of a lyric as the seed word, and how many unique words the seed word occurs with in any line of lyrics.
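The two counts that feed the similarity score can be sketched in Python as follows. This is an illustration under stated assumptions, not LyriCloud's code: the four lyric lines are invented, a word pair is counted once per line, and the function names are hypothetical. How the counts are combined into a final score is not described here, so the sketch stops at the counts themselves.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Invented lyric lines, standing in for the lines of a real lyrics database.
LYRIC_LINES = [
    "a broken dream in the night",
    "deep in a dream",
    "the night is deep",
    "broken heart broken dream",
]


def cooccurrence_counts(lines):
    """For every word, count the lines it shares with every other word."""
    pair_counts = defaultdict(Counter)
    for line in lines:
        # Use a set so each pair is counted once per line (an assumption).
        words = sorted(set(line.lower().split()))
        for a, b in combinations(words, 2):
            pair_counts[a][b] += 1
            pair_counts[b][a] += 1
    return pair_counts


def seed_statistics(pair_counts, seed, candidate):
    """Return (lines shared by seed and candidate, unique words seen with seed)."""
    neighbours = pair_counts.get(seed, Counter())
    return neighbours[candidate], len(neighbours)


if __name__ == "__main__":
    counts = cooccurrence_counts(LYRIC_LINES)
    print(seed_statistics(counts, "dream", "broken"))  # (2, 7)
    print(seed_statistics(counts, "dream", "deep"))    # (1, 7)
```

Because the counts depend only on the lyrics and not on any user input, they can be computed once in advance, which matches the description above.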

In blind evaluations conducted by workers hired through Amazon's Mechanical Turk service, 68 percent of the titles generated by Titular were judged to be good song titles, compared with 93 percent of the real song titles evaluated. Evaluators of LyriCloud rated its suggestions higher, for both relatedness and inspirational quality, than words selected at random or words selected purely for topical similarity.

The tools likely will get a workout during this year's FAWM, which Settles expects will generate more music in 28 days than could be listened to in 28 days. During last year's FAWM, Titular was the most popular of the online creativity tools, with 11,408 page views. Of the songs that fawmers uploaded to the site and tagged according to the tool used, 66 were attributed to Titular and 29 to LyriCloud. Of 14 songs ultimately selected for inclusion on the annual FAWM compilation CD, one was written with a little help from Titular — "I Am Your Adult," by British singer/songwriter Expendable Friend. Other Internet users subsequently discovered the tools, which have been in continuous use.

When not composing or performing music, Settles does research at CMU on such projects as Never-Ending Language Learning (NELL), http://rtw.ml.cmu.edu/rtw/, a computer system that is learning to read by continuously gleaning facts from Web pages. Settles specializes in "active learning," a type of machine learning in which a computer program is able to ask humans for guidance when it is confused or unsure of something. The Machine Learning Department is part of the School of Computer Science. Follow the school on Twitter @SCSatCMU.