Since I'm slacking off, here's a straightforward, not at all optimized Go implem...

Twirrim · on Sept 11, 2023

Wrote up something in python that achieves similar, but for increasing numbers of characters. Found a few hits fairly quickly. pypy3 is pushing 420k evaluations a second on my laptop.

e.g.

"The SHA256 for this sentence begins with: five, two, and d"

"The SHA256 for this sentence begins with: seven, e, and nine"

"The SHA256 for this sentence begins with: one, five, e, and three"

My brain fart this morning was forgetting to account for the newline, so I started out spitting out bad options.

Twirrim · on Sept 11, 2023

You can wrap tqdm around the "permutations(TOKENS, k)" if you want to measure progress. I haven't spent time trying to make this particularly optimised, for example that dict lookup is likely avoidable with a little bit of work via index lookup, and maybe cheaper. I've also not attempted to parallelise it, which would be fairly easy to do.

  #!/usr/bin/env python
  
  import string
  import hashlib
  from itertools import permutations
  
  WORD_DIGIT = {
          "one":1,
          "two":2,
          "three":3,
          "four":4,
          "five":5,
          "six":6,
          "seven":7,
          "eight":8,
          "nine":9}
  TOKENS = [
          "one",
          "two",
          "three",
          "four",
          "five",
          "six",
          "seven",
          "eight",
          "nine",
          ] + list(string.ascii_lowercase)
  
  SEPARATOR = ", "
  
  STARTING_TEXT = "The SHA256 for this sentence begins with: "
  
  for k in range(2, 6):
      for perm in permutations(TOKENS, k):
          sha_start = ""
          for char in perm:
              if char in WORD_DIGIT:
                  sha_start += str(WORD_DIGIT[char])
              else:
                  sha_start += char
          test_string = STARTING_TEXT + SEPARATOR.join(perm[:-1]) + ", and " + ''.join(perm[-1:]) + "\n"
          checksum = hashlib.new("sha256")
          checksum.update(test_string.encode())
          if checksum.hexdigest().startswith(sha_start):
              print(test_string)

rainsford · on Sept 12, 2023

This is an interesting starting point for a python implementation, but it doesn't work correctly as far as I can tell. The list of tokens leaves out zero and includes a bunch of ascii characters that aren't valid hex. That second problem won't produce wrong results, but it'll spend a lot of time testing combinations that can't possibly appear in a SHA256 hex output. I also think the original tweet calculated without a newline.

More significantly, permutation in python is probably not the right tool for this. I believe it will not repeat token values at different positions, so for example you'll never end up testing "The SHA256 for this sentence begins with: f, f, and f", which is certainly a potentially valid result.

Twirrim · on Sept 12, 2023

> The list of tokens leaves out zero and includes a bunch of ascii characters that aren't valid hex

I rewrote it in rust last night and realised that, yeah. Lots of wasted time on irrelevant characters.

>permutation in python is probably not the right tool for this.

Correct, missed that. I need `itertools.product(<iterable>, repeat=2)`

stavros · on Sept 13, 2023

Interestingly, cPython is twice as fast as PyPy for me.

Karellen · on Sept 11, 2023

Obvious variations include:

"SHA256" vs "SHA-256"

"begins" vs "starts"

":" vs ""

"%s," vs "%s"

"%s and" vs "%s, and"

"." vs ".\n"

andrewstuart2 · on Sept 11, 2023

Yeah, I'm surprised how simple it is to find these.

The SHA256 for this sentence begins with: 1c0510f.

The SHA256 for this sentence begins with: 4, 5, b, a, a, 5 and f.

vluft · on Sept 11, 2023

another pretty easy optimization is swapping to https://github.com/minio/sha256-simd, particularly if you're on a process with the sha extensions.

versus just spawning a goroutine per attempt, it'll likely be much faster as well to just split the search space into number of cores and have one chugging away on each chunk of the search space. (I have a thing to do vanity git commit hashes and that's what I do for that; for 7 characters in the hash, it takes well under a second on my CPU on average)

googlryas · on Sept 11, 2023

While we're there, since the message has a fixed prefix, you can initiate the sha256 computation with it, and then repeatedly branch off of that for the test suffix.

oefrha · on Sept 12, 2023

Oh I didn’t mean per attempt. If you have <= 16 cores, just wrap the first level loop body in a goroutine; if you have >16 cores, wrap the second level to spawn 256 goroutines. Very little code change, big speedup.

awegio · on Sept 11, 2023

> There's exactly one hit

There are N possible sequences, and you try N times with a success probability of 1/N each (because it is a good hash function). This means the expected number of hits is 1.

thehappypm · on Sept 11, 2023

The probability is 1/e

Dylan16807 · on Sept 12, 2023

The probability of missing is 1/e

dvh · on Sept 11, 2023

You can increase the odds by adding various suffixes, e.g "what are the odds" or "imagine that" or "can you beat that" or "can you do better" ....

drexlspivey · on Sept 11, 2023

Why would that increase the odds? It’s the same chance on every iteration

jrm4 · on Sept 11, 2023

I think they mean "increase your possible search space?"

krackers · on Sept 11, 2023

you effectively have multiple chances to try to get a given target hash (prefix)

mike_hock · on Sept 11, 2023

Trying different prefixes doesn't take less time than trying different sequences. Unless you're hellbent on finding a match for a particular sequence.

I'd settle on a fixed prefix instead and try longer sequences to make it look more impressive.

recursive · on Sept 11, 2023

Every time you lengthen the sequence you increase the "luck" requirement.

invalidator · on Sept 12, 2023

I like little puzzles so I gave it a shot in Ruby.

  #!/usr/bin/ruby
  
  require 'digest'
  
  charmap = %w{zero one two three four five six seven eight nine a b c d e f}
  
  (0..0xf).map do |prefix|
    Process.fork do
      (0..0xffffff).each do |i|
        hex = "%01x%06x" % [prefix, i]  # "0123456"
        digit_words = hex.chars.map { |c| charmap[c.to_i(16)] }  # ["zero", "one", ...]
        sentence = "The SHA256 for this sentence begins with: %s, %s, %s, %s, %s, %s and %s." % digit_words
        sha = Digest::SHA256.hexdigest(sentence)
        if sha[0,7] == hex
          puts(sentence)
        end
      end
    end
  end
  
  Process.waitall

8 minutes on an old quad-core. I don't expect there's much to optimize since all the heavy lifting is in the libraries. I didn't bother with synchronization so you might get a scrambled result if it somehow finds a flurry of hits. :)

skykooler · on Sept 12, 2023

What is the reply? Following the link leads to the tweet with no context.

tantalor · on Sept 12, 2023

> Was just verifying your tweet's hash, and then...omg!!! I couldn't believe what I realised. The SHA256 of THIS tweet starts with exactly the same 7 characters as your tweet's hash. What are the chances of that?

H8crilA · on Sept 11, 2023

This means there's quite a lot of solutions. Running the entire 2^56 would take over 770 core-days if one core can achieve an impossible 1 giga-iterations per second.

yuliyp · on Sept 11, 2023

It's 2^28, not 2^56. A hex digit is 4 bits, and we have 7 such digits to match.

H8crilA · on Sept 11, 2023

Ugh, you are right. That was a bit embarrassing :)

jeremyloy_wt · on Sept 12, 2023

https://go.dev/play/p/6Lg9L2up5mI

Here is a slightly different version I wrote that has 0 allocations per loop.

Took about 26 seconds.

I'm welcome to suggestions on how to make it more readable! Its been a few years since I have written Go professionally

lun4r · on Sept 11, 2023

not that difficult either. i wrote a script in 10 minutes and had a result in under a minute: https://twitter.com/schuilr/status/1701268438931460311