Evil Hangman

If you aren't familiar with the game Hangman, here is how it works:

  1. One player decides on a word for the other player to guess. That player writes down the number of characters in the word as dashes. (e.g. TREE becomes ----)
  2. The other player guesses letters that may be in the word. If the letter appears anywhere in the word, the player which decided the word reveals the locations of the letters which match. (e.g. guessing E on TREE would make it become --EE). Otherwise, the player will loose a guess.
  3. If the player runs out of guesses, they lose and the player which decided the word reveals it.

Your job will be to implement a computer version of Hangman, where the human guesses against the computer. While there are many viable strategies for building competitive computer game players, there is one approach that has been fairly neglected in modern research — cheating. Why spend all the effort trying to teach a computer the nuances of strategy when you can simply write a program to play dirty and win handily all the time?

Cheating at Hangman

If we choose our word at the end of the game, rather than the beginning, we can defeat the player under most cases, we just have to make it look like we are following the rules. To do this, we must reveal letters only when it gives us a larger set of words than if we didn't reveal any. Here is an example:

Say we make the player guess a four letter word, and our dictionary contains the following words:

ALLY BETA COOL DEAL ELSE FLEW GOOD HOPE IBEX

The user then guesses the letter E and we try to make the largest set of words possible. To do this, we break down our words into word families:

  • ----, which contains ALLY, COOL, and GOOD
  • -E--, which contains BETA and DEAL
  • --E-, which contains FLEW and IBEX
  • E--E, which contains ELSE
  • ---E, which contains HOPE

We then don't reveal any letters (as ---- was the largest set of words), and reduce the set of words we can use to:

ALLY COOL GOOD

After that, we just repeat the same process till either the user runs out of guesses or guesses all the letters in the only remaining word.

Stop! Make sure you fully understand how we will be cheating at Hangman before proceeding. This is critical to the understanding of this project.

Implementation

First, download the dictionary file you will be using here.

Here's a brief overview of the steps your program will complete:

1) Ask the user how many letters to play with.
2) Open `dictionary.txt` for reading.
3) Read all the words out of that file of the correct length and put them
   in a list or set (your choice).
4) Close `dictionary.txt`.
5) While the user still has guesses left:
   a) Let the user know how many guesses they have left and the letters
      they can guess.
   b) Ask the user for which letter they guess.
   c) Make sure they can actually guess that, if not, be nice and go back
      to the previous step.
   d) Determine, given their guess, which family of words to keep will be
      best for us (according to the rules above).
   e) If you gave them any letters in that family, congratulate them,
      otherwise, let them know there was not that letter and dock a guess.
   f) If the game state no longer has any dashes, congratulate them for
      winning and exit.

Once they run out of guesses, choose a random word from the set of possible words and let them know that was the word. Then mock them for not getting it.

You may want to see how all this works in action by downloading and playing my version here. But try not to look at the source code, remember that this project was for you to solve.

Hints

  • Letter frequency is not the word family. Although the words ELSE and HERE both have two Es, they are parts of separate families: E--E and -E-E, respectively.
  • Check and make sure there are enough words. In order to play Hangman with a word length of N, there must be at least one word with length N. Keep in mind there are some gaps in the dictionary for word length, the longest word has a length of 29, but there are no words with length 28 or 27.

Extensions

The strategy of choosing the largest set of possible words is a good heuristic but non-optimal. One possible improvement is to, if possible, choose the family of letters where the user gets no letters on their last guess, as to kill them off quicker.

Or, if you wish to extend your program to make it fully optimal, a minimax strategy could do this (warning: dangerous amounts of recursion ahead).

Credit

The Evil Hangman assignment originally comes from Keith Schwarz as a part of Stanford University's Nifty Assignments.