Wrote up something in Python that achieves something similar, but for increasing numbers of characters. Found a few hits fairly quickly. pypy3 is pushing 420k evaluations a second on my laptop.
e.g.
"The SHA256 for this sentence begins with: five, two, and d"
"The SHA256 for this sentence begins with: seven, e, and nine"
"The SHA256 for this sentence begins with: one, five, e, and three"
My brain fart this morning was forgetting to account for the newline, so I started out spitting out bad options.
You can wrap tqdm around the "permutations(TOKENS, k)" call if you want to measure progress. I haven't spent time optimising this; for example, the dict lookup is likely avoidable with a little work via an index lookup, which may be cheaper. I've also not attempted to parallelise it, which would be fairly easy to do.
#!/usr/bin/env python
import string
import hashlib
from itertools import permutations
WORD_DIGIT = {
    "one": 1,
    "two": 2,
    "three": 3,
    "four": 4,
    "five": 5,
    "six": 6,
    "seven": 7,
    "eight": 8,
    "nine": 9,
}
TOKENS = [
    "one",
    "two",
    "three",
    "four",
    "five",
    "six",
    "seven",
    "eight",
    "nine",
] + list(string.ascii_lowercase)
SEPARATOR = ", "
STARTING_TEXT = "The SHA256 for this sentence begins with: "
for k in range(2, 6):
    for perm in permutations(TOKENS, k):
        sha_start = ""
        for char in perm:
            if char in WORD_DIGIT:
                sha_start += str(WORD_DIGIT[char])
            else:
                sha_start += char
        test_string = STARTING_TEXT + SEPARATOR.join(perm[:-1]) + ", and " + ''.join(perm[-1:]) + "\n"
        checksum = hashlib.new("sha256")
        checksum.update(test_string.encode())
        if checksum.hexdigest().startswith(sha_start):
            print(test_string)
This is an interesting starting point for a Python implementation, but as far as I can tell it doesn't work correctly. The list of tokens leaves out zero and includes a bunch of ASCII letters (g through z) that aren't valid hex digits. That second problem won't produce wrong results, but it wastes a lot of time testing combinations that can't possibly appear in a SHA256 hex digest. I also think the original tweet calculated the hash without a trailing newline.
More significantly, itertools.permutations is probably not the right tool for this. It never repeats a token at different positions, so for example you'll never end up testing "The SHA256 for this sentence begins with: f, f, and f", which is a potentially valid result.
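Here is a rough sketch of what those fixes might look like, not a tested replacement for the script above. It swaps permutations for itertools.product so tokens can repeat, restricts the tokens to the sixteen hex digits (including "zero"), and hashes without a newline by default. The names DIGIT_WORDS, TOKEN_TO_HEX, build_sentence and search are made up for illustration, and the sentence format is assumed to be the same one the parent comment used.

```python
import hashlib
from itertools import product

# Assumption: digits are spelled out as words and a-f stay as bare letters,
# matching the sentence format used earlier in the thread.
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]

# Map each token to the single hex character it stands for.
TOKEN_TO_HEX = {word: str(i) for i, word in enumerate(DIGIT_WORDS)}
TOKEN_TO_HEX.update({c: c for c in "abcdef"})

PREFIX = "The SHA256 for this sentence begins with: "


def build_sentence(tokens):
    """Join tokens as 'a, b, and c' (hypothetical helper; expects at least 2 tokens)."""
    return PREFIX + ", ".join(tokens[:-1]) + ", and " + tokens[-1]


def search(k, trailing=""):
    """Yield sentences whose SHA-256 hex digest starts with the k characters they claim.

    `trailing` is appended before hashing; pass "\n" to mimic a trailing newline.
    """
    # product(..., repeat=k) allows the same token at several positions,
    # which permutations() never does.
    for tokens in product(TOKEN_TO_HEX, repeat=k):
        claimed = "".join(TOKEN_TO_HEX[t] for t in tokens)
        sentence = build_sentence(tokens)
        digest = hashlib.sha256((sentence + trailing).encode()).hexdigest()
        if digest.startswith(claimed):
            yield sentence


if __name__ == "__main__":
    for k in range(2, 5):
        for hit in search(k):
            print(hit)
```

With 16 tokens there are 16**k candidates for each k, and every one of them is a prefix that a hex digest could actually start with, whereas the 35-token list spends most of its time on sentences containing letters past f.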