Alice same she shore and seemed to remark myself, as she went back to the heard at Alice hastily, after open her sister. Here, Bill! The Duchess too late it a bit had a sort of they are the Queen. An invitation a little of the ran what it was only down her to the other; the Dodo, a Lory and the please that it must as well very good making a dish of time,” she added, “It isn’t a letters”.

If you think that sounds like gibberish, you’re right. But it is a pretty interesting type of gibberish, because it is generated by a Markov chain. A Markov chain (if I describe this correctly; I’m not a mathematician or very good at logic) is an algorithm that makes its next state depend only on the previous one. For example, say you have a text, such as Alice in Wonderland (that’s what the text above is based on). You could scan the whole text, look at how often each letter follows another, and make a table of that. E.g., the letter N probably follows the letter A more often than the letter Q does.
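Building that table is just counting letter pairs. The generator described in the post is written in PHP; this is a minimal Python sketch of the counting step (the function name `letter_frequencies` is mine, not from the original code):

```python
from collections import defaultdict

def letter_frequencies(text):
    """Count how often each letter follows each other letter."""
    table = defaultdict(lambda: defaultdict(int))
    # zip the text with itself shifted by one to walk over all adjacent pairs
    for current, following in zip(text, text[1:]):
        table[current][following] += 1
    return table

table = letter_frequencies("banana")
# in "banana", 'a' is followed by 'n' twice and 'n' by 'a' twice
```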

After making such a table, you start with a random letter and then repeatedly make a weighted random decision based on how often other letters usually follow it. The result is a text that quite closely resembles the original, but with all of its original context taken apart, resulting in the gibberish you read above.
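Putting the two steps together gives a complete generator. Again, this is a hedged Python sketch of the idea rather than the post’s actual PHP code; the dead-end handling (restarting from a random letter when a character has no recorded followers) is my own choice:

```python
import random
from collections import defaultdict

def build_table(text):
    """Count how often each letter follows each other letter."""
    table = defaultdict(lambda: defaultdict(int))
    for current, following in zip(text, text[1:]):
        table[current][following] += 1
    return table

def generate(text, length=200, seed=None):
    """Generate `length` characters by weighted random walks over the table."""
    rng = random.Random(seed)
    table = build_table(text)
    state = rng.choice(text)
    out = [state]
    for _ in range(length - 1):
        followers = table.get(state)
        if not followers:
            # dead end (e.g. the very last letter of the text): restart randomly
            state = rng.choice(text)
        else:
            letters = list(followers)
            weights = [followers[c] for c in letters]
            state = rng.choices(letters, weights=weights)[0]
        out.append(state)
    return "".join(out)

sample = generate("alice was beginning to get very tired of sitting by her sister",
                  length=60, seed=42)
```

Every character in the output necessarily occurs in the input, but the sequence only preserves local letter statistics, which is exactly why the result reads as plausible-looking nonsense.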

It’s quite interesting to program such a Markov text generator yourself, so I did exactly that with my PHP Markov chain generator. You can either input the starting text yourself (make sure it’s pretty long for the best effect) or select one of the pre-selected texts, such as Alice in Wonderland, the Wikipedia article on Calvin and Hobbes or Immanuel Kant’s Critique of Pure Reason (the last one becomes even more incomprehensible than it already was; can you imagine that?).

If you want to know more about Markov chains, be sure to read this article by Jeff Atwood; he explains the whole thing a lot better than I can.

And for your amusement, here’s another Markov-generated piece, from Kant’s Critique of Pure Reason:

Thus common upon nothing that is, is deceptive infinite or as I am free in its or power to the continuous vacillationes demand explanation to this time-determines which they are given therefore necessary in itself respect none those a priori with a possible men course, and determission. Today its merely the conceiving the world in space is finite, neither in the support.

ergodicity good to know you can tell whether next is or isn’t for instance according to representation heuristica, which language incoming is or who since different sets emit likely ex ante estimates. interesting appliance is the node with at least 3 edges.

Thanks for this. I made a couple of minor adjustments to it to base the chaining off full words (tends to make more legible sentences). I’m also looking at updating it to attach to a database so that it can handle large datasets better/faster.
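Chaining off full words, as this commenter describes, is the same algorithm with words as states instead of letters. A minimal Python sketch of that variant (not the commenter’s actual modified PHP code) could look like this; storing each follower once per occurrence makes plain `random.choice` act as the weighted pick:

```python
import random
from collections import defaultdict

def word_chain(text, length=30, seed=None):
    """Markov chain over whole words instead of single letters."""
    rng = random.Random(seed)
    words = text.split()
    table = defaultdict(list)
    for current, following in zip(words, words[1:]):
        table[current].append(following)  # duplicates serve as weights
    state = rng.choice(words)
    out = [state]
    for _ in range(length - 1):
        followers = table.get(state)
        # restart from a random word if this word never has a follower
        state = rng.choice(followers) if followers else rng.choice(words)
        out.append(state)
    return " ".join(out)

sample = word_chain("the cat sat on the mat and the cat ran", length=12, seed=7)
```

Because every emitted token is a real word from the source, the output tends to be far more legible than the letter-level version, at the cost of needing a much larger input text before the chains become interesting.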

[…] English version. I looked through the web and found a good resource. The resource I referred to is this. There is a source code, and I wanted to make use of it and combine it with bot program. But I could […]

[…] following is how I adapted the Markov chain generator from Hay Kranen. Thanks to the comments1 I found below Hay’s post2 this Markov + Shakespeare version […]

Ralf van Kasteren 2014-07-28 on 00:19

What does “a text that quite closely resembles the original text” refer to in paragraph 3 (meaning the latter part of the quoted line)? I hope that “the one just produced” is what’s meant by “the original”, or else we’ve got the plagiarism of clones in SF novellas ;

No, I’m just being cocky.
Is it an idea to implement something like “a broken-up sentence sequence” into it, in order to create granulated grammatical structures that follow a more natural sentence build-up? For I am dumb and know nothing about XML and rotary tables.