Solving Redactle with Decision Trees (valentin.sh)
32 points by foobuzzHN 4 months ago | 5 comments



This is fun!

Of course a good Redactle player can do much better than what's described here, because this algorithm (by design) doesn't use any information about the article other than yes-or-no "does such-and-such a word appear in the article?", one bit per guess. But in fact there's a ton more information than that even before you start guessing words.
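
To put a rough number on the "one bit per guess" point: a pure yes-or-no strategy cannot distinguish N articles in fewer than ceil(log2(N)) guesses in the worst case. A minimal sketch of that bound follows; the function name and the pool sizes are made up for illustration and are not taken from the article.

```python
def min_yes_no_guesses(num_articles: int) -> int:
    """Worst-case lower bound on yes/no guesses needed to single out one
    article among `num_articles`, i.e. ceil(log2(num_articles))."""
    if num_articles < 1:
        raise ValueError("need at least one article")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, using only integers.
    return (num_articles - 1).bit_length()


# Hypothetical pool sizes, chosen only to illustrate the bound.
for n in (1, 1024, 10_000):
    print(n, min_yes_no_guesses(n))
```

A real decision tree will usually need more than this, since no single word splits the remaining articles exactly in half.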

If you have the full text of all the articles, as here, then of course solving is pretty trivial: just look for an article with the right pattern of word lengths. The game might be working with slightly older or newer versions than the ones in your database, but there are any number of ways to do the matching fuzzily.
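
One of those ways, as a rough sketch rather than anything the article does: compare the sequence of word lengths visible in the puzzle against each article's sequence using Python's standard `difflib.SequenceMatcher`, and take the closest. The helper names and the two toy "articles" below are invented for illustration; they are not the real Wikipedia text.

```python
import re
from difflib import SequenceMatcher


def length_pattern(text, limit=200):
    """Lengths of the first `limit` alphanumeric words in `text`."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    return [len(w) for w in words[:limit]]


def best_match(observed_lengths, articles):
    """Return (title, similarity) for the article whose word-length
    pattern is closest to the observed one."""
    best_title, best_score = None, -1.0
    for title, text in articles.items():
        matcher = SequenceMatcher(
            None, observed_lengths, length_pattern(text), autojunk=False
        )
        score = matcher.ratio()
        if score > best_score:
            best_title, best_score = title, score
    return best_title, best_score


# Toy database (made-up sentences, not the actual articles).
articles = {
    "Orphanage": "An orphanage is a residential institution devoted to the care of orphans.",
    "Decision tree": "A decision tree is a decision support tool that uses a tree-like model.",
}

# Lengths as the puzzle might show them if its version of the text
# lacked one word ("residential") compared with the database copy.
observed = [2, 9, 2, 1, 11, 7, 2, 3, 4, 2, 7]

print(best_match(observed, articles))
```

Because the matcher aligns subsequences rather than comparing position by position, a word added or dropped between versions only costs a little similarity instead of breaking the match.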

The sort of thing a good human solver might do is quite different. I pulled up a random Redactle game by way of example: https://redactle.net/en/Q160645 (note: redactle.net is the canonical place to play Redactle these days, and recommended in preference to the redactlegame.com referenced in the article). Here's the sort of thing you might notice before guessing any words:

- The article begins "An 9 is ...", so our subject begins with a vowel.

- There's something weird in the structure of the first sentence. We've got "... 7 and 8 3, for 7 7, 6 be 5 for by ...". After staring at it for a bit, you might decide that the 6 is probably something like "should", and the 3 pretty much has to be "who". So "7 and 8" are probably two categories of person. You could pretty much solve the whole thing just by staring at this hard enough, but let's look elsewhere now.

- Redactle players get used to spotting things that are likely to be numbers. In the first paragraph under the "8" heading we see: "... that for 5 1.1 6 that an 6 5 in the 11, they 6 behind 5 5 in 6 by 1 5". The 1.1 is obviously a number. "for 5 [number] [word] that ..." is probably "for every [number] [unit] that ...", and the "behind" a bit later means that it's probably a unit of time. Something like: for every 2.3 months that an 6 spent in the 11, they lagged behind other 5 in 6 by 2 weeks.

- That smells like child development, doesn't it? So maybe our subject is something like ... some sort of institution, like a juvenile detention centre, that it turns out to be bad for children to spend time in? Something like that.

- So we should look at that first sentence again. Surely "7 and 8" must be "infants and children" or "parents and children". You might well guess the title at this point, but if inspiration doesn't strike:

- A bit lower down we see "In 9 5 10 are 2 6 in use, ..." and that "2 6" might jump out at you as surely being "no longer". The 10 is surely the plural of our 9-letter subject. Ah, it must be: "In countries where [plural of title] are no longer in use". So this continues to sound like some sort of institution, something that many (more enlightened? richer?) countries have stopped using because they're a Bad Thing in some sense. (Perhaps because of the child-development thing conjectured above.)

- 9 letters, beginning with a vowel. Outdated sort of institution children are put in, which turns out not to be good for their development. At this point, anyone good enough to have spotted the things I mentioned above will surely think of "orphanage", and now we can make sense of that thing in the first sentence: it's not "should", it's rather the opposite. "... infants and children who, for various reasons, cannot be cared for by their ..."

- And now you can put the answer in, with pretty high confidence, without having guessed any other words at all.
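
The purely mechanical part of the clues above (9 letters, starts with a vowel, plural has 10 letters) can be sketched as a filter over a word list. Everything here is an assumption for illustration: the tiny word list is hand-picked, and `plural` is a naive rule that ignores irregular nouns.

```python
VOWELS = set("aeiou")


def plural(word):
    """Naive English plural (illustrative only; ignores irregular nouns)."""
    if word.endswith(("s", "x", "z", "ch", "sh")):
        return word + "es"
    if word.endswith("y") and word[-2] not in VOWELS:
        return word[:-1] + "ies"
    return word + "s"


def title_candidates(words, length=9, plural_length=10):
    """Keep words that fit the visible constraints: the given length,
    an initial vowel (the article starts with "An"), and a plural of
    the given length (the 10-letter word seen lower down)."""
    return [
        w
        for w in words
        if len(w) == length
        and w[0] in VOWELS
        and len(plural(w)) == plural_length
    ]


# Hand-picked word list, purely hypothetical.
words = ["orphanage", "institute", "apartment", "asylum",
         "infirmary", "workhouse", "academy"]

print(title_candidates(words))
```

Note that the length clues alone still leave several candidates here; it's the child-development reading of the body text that picks out "orphanage", which is the part a human solver is actually doing.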

This way of playing isn't for everyone. It's much harder work, and takes longer, than throwing out guesses and seeing what turns up. But it's pretty satisfying when it works out.


That’s super fascinating, thanks for the comment. I had no idea you could go so far without a guess.

Making use of the Wikipedia article text (and even the titles) seems mostly, but not entirely, against the spirit of the game; the technique used to solve it here is practically identical to vanilla Wordle. It’d be much harder to code, but ideally it would be able to approach the solution with some sort of walk through an embedding, like humans do.


I think it would be very interesting to try to teach a computer to play the game without having explicit knowledge of what's in all the articles. I'm not sure just what level of ignorance you'd want to enforce: good human players don't have articles memorized but do e.g. have a good idea of the typical structure of Wikipedia articles.

I don't know whether it's feasible to get an LLM to play somehow. LLMs are famously bad at letter-counting, but probably have substantial fractions of Wikipedia kinda-memorized. Perhaps it would be possible to fine-tune an LLM to play Redactle, e.g. by giving it a bunch of training data that's a good human Redactle player's musings on games?


> (note: redactle.net is the canonical place to play Redactle these days, and recommended in preference to the redactlegame.com referenced in the article)

Thanks. I tried the latter for a few minutes, thought it seemed like an interesting game, but found the experience unbearable.


redactle.net has no ads (I think; I have pretty stringent anti-ad measures in place) and a bunch of nice features that make the game more fun to play. For instance, you can configure it to always show you the lengths of the words explicitly, you can "annotate" still-redacted words with your guesses about them, you can put it in a mode where when you guess a word you also auto-guess its various inflections, etc. Also, it's actively developed (though less so than it used to be, because the developer has a small child). Definitely recommended.

And yes, it's a very interesting game, whether you try to solve it as quickly as possible (maybe spamming lots of often-useful guesses before you start thinking at all) or in as few guesses as possible (the best "snipers" have something like an 80% success rate at needing no guesses other than the title words themselves) or just play casually in whatever way maximizes the fun and minimizes the stress for you.



