[Participation Agreement] The Clone Progenitors

I'll be making a game, titled "The Clone Progenitors/Wrong Thread", loosely based on the "Wrong Thread" thread, and thus passing the "Wrong Thread" diversifier.

I have reservations about sharing the game concept at this stage, but I need some help from forum users: I need people to approve the use of their username, avatar, and persona as presented on the forums. I can imagine some would like to know what they are signing up for, so here is the gist of it (highlight to view):

The game will be played inside the Idle Forums, or, actually, a clone of the forums, generated by the complex procedure of pressing Ctrl+S while browsing them. In the game, the player will converse with members of the forum by writing free text, and other people will respond with a mix of scripted posts and posts generated by Markov chains based on that forum member's post history. I considered using RNNs, but I didn't really understand how they work.

It goes without saying that the game will pass no judgment on any user, and hopefully the Markov chains won't produce anything anyone will find offensive. I'll block some words from being used in the generated text and not include posts of toxic tone as source for the chains, just to be safe.
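For anyone curious what the word blocking could look like, here is a minimal sketch. It is only an illustration, not the game's actual code: the `BLOCKED_WORDS` entries, the function names, and the retry-then-fallback behaviour are all assumptions made up for this example. Excluding source posts with a toxic tone is a judgment call and is not covered here.

```python
import re

# Hypothetical placeholder entries; the real list would be chosen by hand.
BLOCKED_WORDS = {"badword", "slur"}

def is_clean(text, blocked=BLOCKED_WORDS):
    """Return True if none of the words in the text are on the blocklist."""
    words = re.findall(r"[a-z']+", text.lower())
    return not any(word in blocked for word in words)

def generate_clean(generate_fn, max_tries=10, fallback="..."):
    """Call generate_fn until it yields clean text, or give up and use the fallback."""
    for _ in range(max_tries):
        candidate = generate_fn()
        if is_clean(candidate):
            return candidate
    return fallback
```

Here `generate_fn` stands in for whatever produces a generated post; rejecting and retrying is simpler than trying to remove words from the middle of a generated sentence.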

And so, I would love to have *your* permission to use your forum avatar, username, and post history.

I barely finished something that has any value entertainment-wise, so I'll continue to update it in the coming days. If I didn't include you in the game - sorry, I ran out of time, and getting a user's posting history is a lengthier process than I would like. I'll add you in as soon as I find the time. Publishing before the deadline also means there might be bugs with the core story progression, but hopefully the unscripted sandbox will be entertaining enough by itself in the meantime.

Yeah, this can be used to generate some fantastically bizarre threads!

Could you reveal a bit more about what this game is actually doing, because it is quite fascinating? For example, if you had a much larger library of sentences, would the posts be more coherent? Does the player's input affect the replies? Etc.

Sure - I'll explain how this currently works, but keep in mind I'm working on a big update that will make some of this out of date.

The first step of the process was designing a really, extremely, bafflingly, shitty 'crawler' bot. I had it go through the posts of users who agreed to participate (sorry, everyone who's still not featured in the game, I've been a little busy but I'll make sure to find some time to include you!), and generate a single text file containing a concatenation of those posts.

To generate a comment, I pick a user at random, and generate a Markov Chain based on their post history (I only used about 50 comments for each user, since, even with the bot, extracting the data was a lot of work).

Markov Chains, unlike, say, RNNs or other deep learning algorithms, are very simple for a computer to generate. They are actually simple models in probability theory, which for some reason lend themselves pretty well to generating reasonably coherent text - I'm sure there's some explanation, Chomsky probably discussed it in one of his books.
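To give a rough idea of how little is involved, here is a minimal word-level Markov chain sketch. It is not the game's actual code: the function names, the sample text, and the choice of a first-order chain (each word depends only on the one before it) are assumptions for illustration.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that follow it in the source text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, length=20, rng=random):
    """Start at a random word, then repeatedly pick a random follower."""
    if not chain:
        return ""
    word = rng.choice(list(chain))
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: this word was never followed by anything
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)

# Made-up sample standing in for one user's concatenated posts.
sample_posts = "pretty good game . pretty good thread . wrong thread again ."
chain = build_chain(sample_posts)
print(generate(chain, length=8, rng=random.Random(0)))
```

Because followers are stored with repeats, a word that often follows another in the source is proportionally more likely to be picked, which is where the "sounds like the user" effect comes from.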

I don't think a larger data sample would make sentences substantially more coherent. What it would do is make the manner of speech resemble that of the user more closely. For example, if I were to use Nick's user, the larger the sample text, the more often it would probably say "pretty good". As it currently stands, the only user whose generated comments I could recognize as theirs is Argobot, since she mainly engages with the Idle Book Club forums, which have a different tone.

The player input does have an effect on the replies... barely. My plan is to give, in the Markov Chain model, a bigger weight to comments in the data source that resemble the last comment made by the player. But until then, I just have some scripted events (if the player's comment contains some keyword, automatically respond with some pre-written comment).
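Both ideas can be sketched roughly as follows. This is an illustration under assumptions, not the game's implementation: the `SCRIPTED_REPLIES` entry, the word-overlap (Jaccard) similarity measure, and the `boost` parameter are all invented for the example.

```python
import random
from collections import defaultdict

# Hypothetical keyword -> canned reply pairs for the scripted events.
SCRIPTED_REPLIES = {
    "wrong thread": "Sorry, wrong thread.",
}

def scripted_reply(player_comment):
    """Return a pre-written reply if the comment contains a keyword, else None."""
    lowered = player_comment.lower()
    for keyword, reply in SCRIPTED_REPLIES.items():
        if keyword in lowered:
            return reply
    return None

def similarity(a, b):
    """Word overlap between two comments (Jaccard index), from 0.0 to 1.0."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)

def build_weighted_chain(comments, player_comment, boost=5.0):
    """Build a chain in which transitions from similar comments count for more."""
    chain = defaultdict(lambda: defaultdict(float))
    for comment in comments:
        weight = 1.0 + boost * similarity(comment, player_comment)
        words = comment.split()
        for current, following in zip(words, words[1:]):
            chain[current][following] += weight
    return chain

def next_word(chain, word, rng=random):
    """Pick a follower of `word` with probability proportional to its weight."""
    followers = chain.get(word)
    if not followers:
        return None
    candidates = list(followers)
    weights = [followers[c] for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With `boost=0` this reduces to an ordinary unweighted chain, so the parameter controls how strongly the player's last comment steers the reply.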

If I wanted to make the sentences more coherent, I would probably have to first connect the Markov chain to some word database that can tell me whether a word is a noun, a verb, an adjective, etc., and then learn some heavy linguistics and implement sentence structure into the algorithm,