Using letter frequency to solve Wordle

Published on 30 December 2021 by Andrew Owen (3 minutes)

As a writer, I found my curiosity piqued by Wordle (the latest internet gaming sensation). It’s essentially a word-based variation on Mastermind. The key differences are that instead of six colors you have 26 letters, and only combinations that spell words are valid.

Having an interest in cryptanalysis, I remember the letter sequence EATOIN SHRDLU. It lists the letters in descending order of how often they appear in a sample of English text. The full sequence is EATOIN SHRDLU CMFGYP WBVKXJ QZ. The sample isn’t current, but English hasn’t changed that much since the analysis was done.

I wondered if I could use this information to devise a strategy for solving Wordle. The only way to solve it in one move is to guess the word. Solving it in two moves still requires guesswork. If the first two guesses are good, you may be able to work out the word in three moves. When the first guess is wrong, most strategies involve picking subsequent words that reuse any letters the first guess got right. But is there a better way?

Using letter frequency, I think it should always be possible to guess the word in a maximum of five moves. Typically, it should only take four. And sometimes it may be possible in three. The three words I came up with that most closely match the letter frequency are:

  1. ATONE
  2. DIRLS
  3. CHUMP
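
One way to check how closely these three words follow the frequency sequence is to count how many of the 15 most frequent letters they cover. This is a minimal sketch that assumes the sequence quoted earlier in this post; the helper name `coverage` is hypothetical and only there for illustration.

```python
# Letter order as quoted in this post, most frequent first (spaces removed).
FREQUENCY_ORDER = "EATOINSHRDLUCMFGYPWBVKXJQZ"


def coverage(words, top_n):
    """Split the top_n most frequent letters into (covered, missed) lists."""
    used = set("".join(words).upper())
    top = FREQUENCY_ORDER[:top_n]
    covered = [c for c in top if c in used]
    missed = [c for c in top if c not in used]
    return covered, missed


covered, missed = coverage(["ATONE", "DIRLS", "CHUMP"], top_n=15)
print(len(covered), "of 15 covered; missed:", "".join(missed))
# prints: 14 of 15 covered; missed: F
```

With this sequence, the three words use 15 distinct letters and cover 14 of the top 15. Only F is missed, with P taking its place.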

Trying it out on today’s word, the results are:

  1. One correct letter (O) in the wrong place. The standard strategy would be to pick a second word with an O in it, but in a different location.
  2. Two correct letters (D and L), one in the correct place. We now have three correct letters in total, one in the correct place.
  3. Two correct letters (C and U), both in the correct place. We now have all five correct letters, three in the correct place.
  4. Since CDULO isn’t a word, it must be COULD.
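
The results above can be reproduced with a small scoring function. This is a sketch of the usual Wordle marking rules as I understand them, not the game's actual code: `G` means right letter in the right place, `Y` means right letter in the wrong place, and `-` means the letter is absent. The function name `feedback` is hypothetical.

```python
from collections import Counter


def feedback(guess, answer):
    """Score a guess against the answer using G / Y / - marks."""
    result = ["-"] * len(guess)
    # Answer letters that were not matched exactly can still earn a 'Y'.
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
    for i, g in enumerate(guess):
        if result[i] == "-" and remaining[g] > 0:
            result[i] = "Y"
            remaining[g] -= 1  # each answer letter is used at most once
    return "".join(result)


for guess in ["ATONE", "DIRLS", "CHUMP"]:
    print(guess, feedback(guess, "COULD"))
# prints:
# ATONE --Y--
# DIRLS Y--G-
# CHUMP G-G--
```

After the three guesses, C, U and L are fixed in positions 1, 3 and 4, which leaves O and D for positions 2 and 5.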

The next step is to apply the strategy and track the results over time.

I ran the strategy over the original 40 words and got a mean of 4.25 turns.

Subsequently, a friend of mine did some deeper analysis and found that SALET is the best starting word in easy mode. By using that word and then applying letter frequency to the next guess, I’m often able to get the word in two turns.

Following the acquisition by the New York Times, the dictionary has been changed, but SALET is still a good starter. If it contains none of the answer’s letters, RHINO is a good follow-up guess. If you still have no matching letters, the best third guess is COMFY.

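But if you’re playing in hard mode, there are traps. Hard mode forces every guess to reuse the hints already revealed, so once you know the word ends in ATCH you can only test one first letter per guess. BATCH, CATCH, HATCH, LATCH, MATCH, PATCH and WATCH are all valid answers, so even starting with one of them doesn’t guarantee you’ll get the right answer in six guesses.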
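
The arithmetic behind the trap can be sketched in a few lines. This assumes that only the seven answers listed above are in play and that the four letters ATCH are already known to be in place; the helper name `fits` is hypothetical.

```python
TRAP_WORDS = ["BATCH", "CATCH", "HATCH", "LATCH", "MATCH", "PATCH", "WATCH"]
MAX_GUESSES = 6


def fits(word, greens):
    """True if word keeps every known green letter ('_' means unknown)."""
    return all(g == "_" or g == w for g, w in zip(greens, word))


candidates = [w for w in TRAP_WORDS if fits(w, "_ATCH")]
# In hard mode every guess must itself fit _ATCH, so each guess
# can only test one new first letter.
worst_case = len(candidates)
print(worst_case, "guesses in the worst case, but only", MAX_GUESSES, "allowed")
# prints: 7 guesses in the worst case, but only 6 allowed
```

In normal mode you could spend one guess on a word that packs in several of the first letters at once. Hard mode rules that out.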

However, another programmer has determined that hard mode can always be solved computationally if you start with HALTS. Following letter frequency, this would make my second guess NOIRE, but it’s not even in the large dictionary. Fortunately, its anagram IRONE is. If neither of those contains any of the answer’s letters, the revised best third guess I’ve come up with is DUCKY. However, I’ve rarely had to use the third guess with SALET-RHINO.

I won the first nine-round competition in a Facebook Wordle group against people who I assume are all using different strategies. As of writing, I’m in the lead in the second competition. So there is circumstantial evidence that this is a good human strategy.

I gave up playing Wordle in September 2022. I’m still looking for a good Mastermind app for the iPad.