Hit me with the word list, daddy-O! https://git.charlesreid1.com/cs/five-letter-words/raw/branch/master/sgb-words.txt

What Is It

Five Letter Words is a set of (surprise!) five-letter words, created by Donald Knuth as part of the Stanford GraphBase. The set contains 5757 common five-letter words that meet the following criteria:

  • no proper nouns
  • no punctuation, hyphens, or accent marks
  • no extremely rare words that would only be useful to Scrabble players
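The format criterion above (five letters, no punctuation or accents) is easy to check mechanically. Here is a minimal Python sketch, assuming the word list file holds one lowercase word per line; the function name load_words and the local filename in the usage comment are illustrative and not taken from the repository. The other two criteria (no proper nouns, no overly rare words) are editorial choices and can't be verified this way.

```python
import re

# Assumption: every valid entry is exactly five unaccented lowercase letters.
WORD_RE = re.compile(r"[a-z]{5}")


def load_words(lines):
    """Return the list of words from an iterable of lines.

    Blank lines are skipped. Anything that is not exactly five
    lowercase ASCII letters raises ValueError.
    """
    words = []
    for line in lines:
        word = line.strip()
        if not word:
            continue
        if WORD_RE.fullmatch(word) is None:
            raise ValueError(f"not a plain five-letter word: {word!r}")
        words.append(word)
    return words


# Usage, assuming the list was saved locally as sgb-words.txt:
#
#     with open("sgb-words.txt") as f:
#         words = load_words(f)
#     print(len(words))
```

If the file matches the description above, the printed length should be 5757.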

Code

Several exercises from The Art of Computer Programming ask us to manipulate and analyze the five-letter words in various ways. Knuth recommends Volume 4, exercises 26 through 37, as warm-up exercises for getting to know the five-letter-words list.

Code for each of these exercises is contained in the repository here: https://git.charlesreid1.com/cs/five-letter-words

These are not extraordinarily difficult problems (each took less than 10 minutes to implement), but some take a while to run, and a few are more involved and call for the techniques described in Algorithms/Dynamic Programming.

In the text, Knuth also mentions letter coverage: finding the number of five-letter words that can cover the first N letters of the alphabet. This is a more complicated task that requires some dynamic programming. See Letter Coverage.
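To give a feel for the dynamic programming involved, here is a rough Python sketch under one possible reading of the problem: find the smallest number of words whose letters, taken together, include every one of the first N letters of the alphabet. That reading, and the names word_mask and min_words_to_cover, are assumptions made for illustration; this is not necessarily the formulation or the approach used on the Letter Coverage page.

```python
def word_mask(word, n):
    """Bitmask of which of the first n letters (a = bit 0) appear in word."""
    mask = 0
    for ch in set(word):
        idx = ord(ch) - ord("a")
        if 0 <= idx < n:
            mask |= 1 << idx
    return mask


def min_words_to_cover(words, n):
    """Smallest number of words that together contain the first n letters.

    Returns None if no combination of words covers them all.
    Dynamic programming over subsets: best[s] is the fewest words
    needed to cover exactly the letter set s.
    """
    full = (1 << n) - 1
    # Many words share a mask; keep each distinct, non-empty mask once.
    masks = {word_mask(w, n) for w in words}
    masks.discard(0)

    inf = float("inf")
    best = [inf] * (full + 1)
    best[0] = 0
    # Adding a word only ever sets more bits, so visiting states in
    # increasing numeric order finalizes each one before it is used.
    for covered in range(full + 1):
        if best[covered] == inf:
            continue
        for m in masks:
            nxt = covered | m
            if best[covered] + 1 < best[nxt]:
                best[nxt] = best[covered] + 1
    return None if best[full] == inf else best[full]
```

The table has 2^N entries, so this sketch is only practical for small N; covering the whole alphabet this way would need tens of millions of states and a smarter formulation.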

Flags