How To Write A Name Generator (In Ruby)

I love reading fantasy; I’ve even written about some of my favourite fantasy series on this blog. One of the things that I have always found interesting about fantasy literature (besides unworkable economies and unsustainable population densities – I tend to over-analyse when I read :)) is how authors come up with the names for all the characters. Large fantasy series often contain hundreds of characters – that’s a lot of names. This line of thought naturally led me to think about what I would do if I ever needed to make up a bunch of names, and being the software developer that I am, the answer was naturally: get my computer to make up the names for me.

If you search the web for name generators you get quite a few results. Unfortunately most of those don’t tell you how they do what they do, and even that is beside the point, since I wasn’t really happy with the results that most of these name generators produce. Either the results are way too random (how about 6 consonants in a row) or they are not random enough, with clear traces of human intervention (i.e. choosing from a list of pre-made names). Then I found Chris Pound’s excellent name generator page. One of the things he has on this page is his language confluxer (lc) script, so for my first attempt at writing a name generator I decided to basically take his script and clean it up a little bit. There were two reasons for this:

he uses a pretty clever algorithm for his name generator: it is completely data driven and is therefore able to avoid the 6 consonants/vowels in a row issue while producing output that sounds similar to the data it is based on

it was a yucky Perl script and nobody wants to work with that (except Perl programmers), so I felt it was my duty to make it a little bit nicer and since I’ve been playing around with Ruby lately, well you get the picture :)

The Name Generator Algorithm

As I said the script is completely data driven in that it takes a list of words (names in our case) as input and uses these to produce a bunch of randomised names that hopefully sound similar to the original input. It does the following:

produces a list of starting letter pairs from the input data (all our names will start with one of these pairs)

produces a map of which letters can follow which other letters based on the input data

generates words/names by randomly selecting a starting pair and then appending to the word by randomly choosing a letter from the map based on what the last letter in our new word currently is

this continues until the word length falls into a particular range (this range is hard-coded in the script)

There are a few more little twists that make this whole thing function but that is the essence of the algorithm.
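The steps above can be sketched roughly as follows. This is not the converted lc script, only a minimal illustration of the idea. The method names (`build_tables`, `generate_name`), the default length range of 4 to 8, and the choice to pick a target length up front are all assumptions made for the example.

```ruby
# Minimal sketch of the pair-based name generation described above.
# All names and defaults here are hypothetical, not taken from the original script.

# Builds the two data structures the algorithm needs:
#   starting_pairs - the first two letters of every input name
#   followers      - for each letter, the list of letters seen directly after it
def build_tables(names)
  starting_pairs = []
  followers = Hash.new { |hash, key| hash[key] = [] }

  names.each do |name|
    word = name.downcase
    next if word.length < 2

    starting_pairs << word[0, 2]
    # Duplicates are kept on purpose so common transitions are picked more often.
    word.chars.each_cons(2) { |current, following| followers[current] << following }
  end

  [starting_pairs, followers]
end

# Generates one name by starting from a random pair and repeatedly appending
# a random letter that followed the current last letter in the input data.
# Assumption: a target length is chosen from the range before generation starts.
def generate_name(starting_pairs, followers, min_length: 4, max_length: 8, rng: Random.new)
  raise ArgumentError, "no starting pairs available" if starting_pairs.empty?

  target_length = rng.rand(min_length..max_length)
  name = starting_pairs.sample(random: rng).dup

  while name.length < target_length
    candidates = followers.fetch(name[-1], [])
    break if candidates.empty? # dead end: this letter only ever ended a name

    name << candidates.sample(random: rng)
  end

  name.capitalize
end
```

To try it, build the tables once with something like `starting_pairs, followers = build_tables(%w[Aragorn Arwen Boromir])` and then call `generate_name(starting_pairs, followers)` as many times as you need names. Note that in this sketch a dead end can produce a name shorter than the minimum length; handling that case is one of the places where a real script needs a few extra twists.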

Faithful Perl-to-Ruby Conversion

So the first thing I did was take the Perl script and do a direct conversion into Ruby. Here is what I got:

At this point I had a bit of a shock at how eerily similar the Ruby version of the script looks compared to the Perl version (*shudders*). Anyway, you can just take the above script, put it into a file and run it. You’ll need to give it a data file (here is the one I used).

Cleaning Up The Basic Name Generator

The problems with the above script are:

it is not self-documenting

it is hard to test

it is hard to extend

Anyways, I decided to make it a little bit nicer and easier to play around with by breaking it up into a couple of classes (in the interest of object orientation and stuff):

name_generator_main.rb – the script entry point

NameGenerator – concerned with name generation (as you might expect)

DataHandler – concerned with reading the input data and producing the maps and arrays on which the NameGenerator relies

ArgumentParser – concerned with dealing with the command line arguments

That’s pretty damn good for a random name generator. The best part is that, since it is completely data driven, changing the input data completely alters the output. So if you pass in a file with a bunch of French names, you will get French-sounding random names, and so on. Try it yourself!

http://blog.barrkel.com/ Barry Kelly

In case any of your other readers are interested, this technique is based on Markov chains. It can be extended beyond the pairs you use to triples or larger tuples, and can also be applied on a word basis for generating text, e.g. nonsense post-modern, Biblical, Shakespearean or Dickensian style prose.

http://www.skorks.com Alan Skorkin

Thanks, I might look into that myself.

Ryan Stout

Looks good, though I can’t get the sample code to work in ruby 1.8. You have a few spots with default arguments, then no default arguments for the last argument. If you can, let me know what I need to do to get it running.

Thanks.

http://www.skorks.com Alan Skorkin

You’re right, I tried it out on ruby 1.8 and it died. Apparently with ruby 1.8 arguments without default values need to come before arguments with default values (as far as method parameters are concerned). With ruby 1.9 this doesn’t seem to matter. So there we go, live and learn :).

I’ve now updated the code, if you download and try to run it now it should all work. Thanks for pointing out the issue.

http://twitter.com/ehoque ehsanul

Totally awesome, that output looks great! Seeing a lot about Markov chains these days, perhaps a sign that I should be looking into them more closely..

Seyed Razavi

Love the code and it works well on linux but it seems to be broken on Windows (Vista) throwing a “SystemStackError: stack level too deep” error in the generate_name method.

http://www.skorks.com Alan Skorkin

It’s never been tried on Vista, I don’t really have a copy :), so I can’t run it to see what happens. However from what you say it is overflowing the stack and the only way that would happen is if the exit condition for the recursive function is not working correctly.

If you would like to debug the code on Vista I would really appreciate it.

Otherwise we’ll have to wait until Win7 comes out (since I am probably gonna get a copy of that one) and try it on that (which means we’ll be writing Vista off as a lost cause).

Seyed Razavi

OK, a bit of poking about and I think the problem is with different versions of ruby.