Solving Hangman using Language Models

We had to do a shared task for the Natural Language Processing course recently, and it turned out to be one of the most intellectually satisfying and exciting assignments I have taken up at college.

Basically, the problem statement is this: given a series of dashes ____ of some arbitrary length and a maximum of 8 wrong guesses, can you build a hangman solver that plays the game for you? Performance is judged by the average Levenshtein distance between the predicted word and the actual word, taken over all samples in the test dataset (lower is better). The baseline score for the task was 0.85, and we needed to beat it. Below, I describe the approaches we tried and how the results improved.
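For concreteness, here is a minimal sketch of the evaluation metric as I understand it. The function names and toy inputs are my own, not the official grader's:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def average_distance(predictions, answers):
    """The score: mean edit distance over all test samples."""
    return sum(levenshtein(p, a) for p, a in zip(predictions, answers)) / len(answers)
```

A perfect solver scores 0.0; guessing wrong letters leaves dashes or wrong characters in the prediction, each of which costs one edit.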

Initially, we used a simple word-length-based model: given a word of length k, guess letters in decreasing order of how frequently they occur in words of length k in the training set. This got us a score of 1.37.
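A rough sketch of this baseline, assuming a small toy corpus of my own (the real training set and exact counting details may differ):

```python
from collections import Counter

def build_guess_orders(corpus):
    """For each word length k, rank letters by how many length-k words contain them."""
    by_length = {}
    for word in corpus:
        counts = by_length.setdefault(len(word), Counter())
        counts.update(set(word))  # count each letter once per word
    return {k: [c for c, _ in counts.most_common()]
            for k, counts in by_length.items()}

corpus = ["apple", "angle", "eagle", "tree", "reel"]
orders = build_guess_orders(corpus)
# For a 5-letter blank word, guess orders[5][0] first, then orders[5][1], ...
```

The solver then just walks down the ranked list for the pattern's length until it runs out of wrong guesses.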

We felt an n-gram model would obviously improve the results given the nature of the problem, so we applied a model using n-grams up to length 5. We got a score of 1.31 here :( . Later on, we realised this was because we only used each n-gram to predict its last letter, so we were throwing away information from the letters to the right of each dash.

Correcting the above brought the score to 1.05, but we still needed to improve it by a good margin. We then realised we had made the same mistake with the bigrams. Fixing that led to a significant improvement, down to 0.75.
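To illustrate the fix, here is a hedged sketch of scoring a candidate letter using every n-gram window that covers the blank, not just the window ending at it. The padding scheme and function names are assumptions of mine, not our exact implementation:

```python
from collections import Counter

def ngram_counts(corpus, n):
    """Count all n-grams in the corpus, with boundary padding."""
    counts = Counter()
    for word in corpus:
        padded = "^" * (n - 1) + word + "$" * (n - 1)
        for i in range(len(padded) - n + 1):
            counts[padded[i:i + n]] += 1
    return counts

def score_letter(pattern, pos, letter, counts, n):
    """Sum counts of all n-grams covering position `pos` with `letter` filled in."""
    padded = "^" * (n - 1) + pattern[:pos] + letter + pattern[pos + 1:] + "$" * (n - 1)
    p = pos + n - 1  # the blank's index after padding
    total = 0
    for start in range(p - n + 1, p + 1):
        gram = padded[start:start + n]
        if "_" not in gram:  # skip windows containing other unknowns
            total += counts.get(gram, 0)
    return total

counts = ngram_counts(["cat", "car", "can"], 2)
# Filling the blank in "c_t": 'a' is supported by "ca" on the left AND "at" on the right.
```

The broken version only looked at the window ending at the blank (here, just "ca"); summing over all covering windows lets the known letter to the right vote as well.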

We had almost stopped trying new things when my team-mate decided we should try interpolating the n-gram models. He quickly hacked up a solution, and after tweaking the weights on the probabilities output by each n-gram model, we got a score of 0.73.
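The interpolation itself is just a weighted sum of each model's probability for a candidate letter. A minimal sketch, with illustrative weights rather than the ones we actually tuned:

```python
def interpolate(prob_by_order, weights):
    """Combine per-order n-gram probabilities P(letter | context) with fixed weights.

    prob_by_order: {n: probability from the order-n model}
    weights:       {n: interpolation weight for that model}
    """
    return sum(weights[n] * prob_by_order[n] for n in prob_by_order)

# Example: weights favouring mid-order models (made up for illustration).
weights = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.2, 5: 0.2}
probs = {1: 0.08, 2: 0.15, 3: 0.30, 4: 0.40, 5: 0.55}
score = interpolate(probs, weights)
```

The solver then picks the unguessed letter with the highest interpolated score. Higher-order models are sharper but sparse; the lower-order terms act as a smoothing fallback when a long context was never seen in training.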