Hacker News new | past | comments | ask | show | jobs | submit login
The SHA256 for this sentence begins with: one, eight, two, a, seven, c and nine. (twitter.com/lauriewired)
277 points by isp on Sept 11, 2023 | hide | past | favorite | 143 comments



Since I'm slacking off, here's a straightforward, not at all optimized Go implementation:

  package main
  
  import (
      "bytes"
      "crypto/sha256"
      "encoding/hex"
      "fmt"
  )
  
  var (
      _chars = []byte("0123456789abcdef")
      _names = []string{"zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "a", "b", "c", "d", "e", "f"}
      _size  = len(_chars)
  )
  
  func main() {
      hexsum := make([]byte, 64)
      for i1 := 0; i1 < _size; i1++ {
          for i2 := 0; i2 < _size; i2++ {
              for i3 := 0; i3 < _size; i3++ {
                  for i4 := 0; i4 < _size; i4++ {
                      for i5 := 0; i5 < _size; i5++ {
                          for i6 := 0; i6 < _size; i6++ {
                              for i7 := 0; i7 < _size; i7++ {
                                  s := fmt.Sprintf("The SHA256 for this sentence begins with: %s, %s, %s, %s, %s, %s and %s.", _names[i1], _names[i2], _names[i3], _names[i4], _names[i5], _names[i6], _names[i7])
                                  sum := sha256.Sum256([]byte(s))
                                  hex.Encode(hexsum, sum[:])
                                  prefix := []byte{_chars[i1], _chars[i2], _chars[i3], _chars[i4], _chars[i5], _chars[i6], _chars[i7]}
                                  if bytes.HasPrefix(hexsum, prefix) {
                                      fmt.Printf("%s\n", s)
                                      fmt.Printf("%s\n", hexsum)
                                  }
                              }
                          }
                      }
                  }
              }
          }
      }
  }
Takes a few minutes on common consumer hardware. There's exactly one hit.

(Easiest optimization is wrapping the loop body in a goroutine. GOEXPERIMENT=loopvar really makes this nicer btw.)

The reply is obviously more interesting, need to come up with a lot of variations.


Wrote up something in python that achieves similar, but for increasing numbers of characters. Found a few hits fairly quickly. pypy3 is pushing 420k evaluations a second on my laptop.

e.g.

"The SHA256 for this sentence begins with: five, two, and d"

"The SHA256 for this sentence begins with: seven, e, and nine"

"The SHA256 for this sentence begins with: one, five, e, and three"

My brain fart this morning was forgetting to account for the newline, so I started out spitting out bad options.


You can wrap tqdm around the "permutations(TOKENS, k)" if you want to measure progress. I haven't spent time trying to make this particularly optimised, for example that dict lookup is likely avoidable with a little bit of work via index lookup, and maybe cheaper. I've also not attempted to parallelise it, which would be fairly easy to do.

  #!/usr/bin/env python
  
  import string
  import hashlib
  from itertools import permutations
  
  WORD_DIGIT = {
          "one":1,
          "two":2,
          "three":3,
          "four":4,
          "five":5,
          "six":6,
          "seven":7,
          "eight":8,
          "nine":9}
  TOKENS = [
          "one",
          "two",
          "three",
          "four",
          "five",
          "six",
          "seven",
          "eight",
          "nine",
          ] + list(string.ascii_lowercase)
  
  SEPARATOR = ", "
  
  STARTING_TEXT = "The SHA256 for this sentence begins with: "
  
  for k in range(2, 6):
      for perm in permutations(TOKENS, k):
          sha_start = ""
          for char in perm:
              if char in WORD_DIGIT:
                  sha_start += str(WORD_DIGIT[char])
              else:
                  sha_start += char
          test_string = STARTING_TEXT + SEPARATOR.join(perm[:-1]) + ", and " + ''.join(perm[-1:]) + "\n"
          checksum = hashlib.new("sha256")
          checksum.update(test_string.encode())
          if checksum.hexdigest().startswith(sha_start):
              print(test_string)


This is an interesting starting point for a python implementation, but it doesn't work correctly as far as I can tell. The list of tokens leaves out zero and includes a bunch of ascii characters that aren't valid hex. That second problem won't produce wrong results, but it'll spend a lot of time testing combinations that can't possibly appear in a SHA256 hex output. I also think the original tweet calculated without a newline.

More significantly, permutation in python is probably not the right tool for this. I believe it will not repeat token values at different positions, so for example you'll never end up testing "The SHA256 for this sentence begins with: f, f, and f", which is certainly a potentially valid result.


> The list of tokens leaves out zero and includes a bunch of ascii characters that aren't valid hex

I rewrote it in rust last night and realised that, yeah. Lots of wasted time on irrelevant characters.

>permutation in python is probably not the right tool for this.

Correct, missed that. I need `itertools.product(<iterable>, repeat=2)`


Interestingly, cPython is twice as fast as PyPy for me.


Obvious variations include:

"SHA256" vs "SHA-256"

"begins" vs "starts"

":" vs ""

"%s," vs "%s"

"%s and" vs "%s, and"

"." vs ".\n"


Yeah, I'm surprised how simple it is to find these.

The SHA256 for this sentence begins with: 1c0510f.

The SHA256 for this sentence begins with: 4, 5, b, a, a, 5 and f.


another pretty easy optimization is swapping to https://github.com/minio/sha256-simd, particularly if you're on a process with the sha extensions.

versus just spawning a goroutine per attempt, it'll likely be much faster as well to just split the search space into number of cores and have one chugging away on each chunk of the search space. (I have a thing to do vanity git commit hashes and that's what I do for that; for 7 characters in the hash, it takes well under a second on my CPU on average)


While we're there, since the message has a fixed prefix, you can initiate the sha256 computation with it, and then repeatedly branch off of that for the test suffix.


Oh I didn’t mean per attempt. If you have <= 16 cores, just wrap the first level loop body in a goroutine; if you have >16 cores, wrap the second level to spawn 256 goroutines. Very little code change, big speedup.


> There's exactly one hit

There are N possible sequences, and you try N times with a success probability of 1/N each (because it is a good hash function). This means the expected number of hits is 1.


The probability is 1/e


The probability of missing is 1/e


You can increase the odds by adding various suffixes, e.g "what are the odds" or "imagine that" or "can you beat that" or "can you do better" ....


Why would that increase the odds? It’s the same chance on every iteration


I think they mean "increase your possible search space?"


you effectively have multiple chances to try to get a given target hash (prefix)


Trying different prefixes doesn't take less time than trying different sequences. Unless you're hellbent on finding a match for a particular sequence.

I'd settle on a fixed prefix instead and try longer sequences to make it look more impressive.


Every time you lengthen the sequence you increase the "luck" requirement.


I like little puzzles so I gave it a shot in Ruby.

  #!/usr/bin/ruby
  
  require 'digest'
  
  charmap = %w{zero one two three four five six seven eight nine a b c d e f}
  
  (0..0xf).map do |prefix|
    Process.fork do
      (0..0xffffff).each do |i|
        hex = "%01x%06x" % [prefix, i]  # "0123456"
        digit_words = hex.chars.map { |c| charmap[c.to_i(16)] }  # ["zero", "one", ...]
        sentence = "The SHA256 for this sentence begins with: %s, %s, %s, %s, %s, %s and %s." % digit_words
        sha = Digest::SHA256.hexdigest(sentence)
        if sha[0,7] == hex
          puts(sentence)
        end
      end
    end
  end
  
  Process.waitall
8 minutes on an old quad-core. I don't expect there's much to optimize since all the heavy lifting is in the libraries. I didn't bother with synchronization so you might get a scrambled result if it somehow finds a flurry of hits. :)


What is the reply? Following the link leads to the tweet with no context.


> Was just verifying your tweet's hash, and then...omg!!! I couldn't believe what I realised. The SHA256 of THIS tweet starts with exactly the same 7 characters as your tweet's hash. What are the chances of that?


This means there's quite a lot of solutions. Running the entire 2^56 would take over 770 core-days if one core can achieve an impossible 1 giga-iterations per second.


It's 2^28, not 2^56. A hex digit is 4 bits, and we have 7 such digits to match.


Ugh, you are right. That was a bit embarrassing :)


https://go.dev/play/p/6Lg9L2up5mI

Here is a slightly different version I wrote that has 0 allocations per loop.

Took about 26 seconds.

I'm welcome to suggestions on how to make it more readable! Its been a few years since I have written Go professionally


not that difficult either. i wrote a script in 10 minutes and had a result in under a minute: https://twitter.com/schuilr/status/1701268438931460311


> Was just verifying your tweet's hash, and then...omg!!! I couldn't believe what I realised. The SHA256 of THIS tweet starts with exactly the same 7 characters as your tweet's hash. What are the chances of that?

As always, the real WTF is in the comments.


It's not too difficult. All you need is to generate many variations of potential words and check whether the sha256 hash matches the wanted leading characters. For example, check this text.


    $ echo -n $'It\'s not too difficult. All you need is to generate many variations of potential words and check whether the sha256 hash matches the wanted leading characters. For example, check this text.' | sha256sum
    182a7c9d2e99162688aaaf3f97638edd7d06f8d295e456c7bb1f16abf3a8f70c  -


Isn't this kind of the same thing that crypto miners do?


Yes except Bitcoin does double SHA (a second round of SHA on the first result)


Well played.

More seriously, is there a really good, open source library that generates many slight variations of an input block of text?


Llama?


can we get this guy to pass a captcha? i've got questions


    $ echo -n $'Was just verifying your tweet\'s hash, and then...omg!!! I couldn\'t believe what I realised. The SHA256 of THIS tweet starts with exactly the same 7 characters as your tweet\'s hash. What are the chances of that?' | sha256sum
    182a7c9c08b2f0f9333bf23828c5fbf47addf74e815b6a22ca10825450bc2ee1  -
Checks out(!)

Source: https://twitter.com/benoconnor/status/1701057433131421935


I'm confused as to how they've done this. The original message, sure, you can brute force the digits and hope you get a collision, and try a new, plausible preamble if not. But I don't see how they've found this collision without it looking like anything is brute forced.

Is it using substitute Unicode characters or something?

E: no, just hand typed it and got the same...


Presumably generated lots of variations of that sentence. It's 28 bits, so you only need to have around 14 places where 4 variations are possible.

For instance, it could start with "Was", "I was", "verifying" could be "checking" or "computing" or "testing", etc.

A bit tight, but it seems feasible with some work.


Indeed, this is how I did it. 2^28 is around 270 million. With little more than a handful options, it should be possible. Although I have to say, it turned out to be more difficult than I'd initially thought. Maybe it's better to think of it as 28 different boolean choices.

  echo -n "Indeed. This is how I managed to do it. 2^28 is around 300 million. With only a handful options, it's possible. Although, I must say that it turned out to be more difficult than I'd initially thought. Perhaps it's better to think of it as 28 different alternatives." | sha256sum


This is crazy! The hash of this comment once again begins 182a7c9!!! Well done indeed!


Throwing a loop of numbers on the end seems far more feasible. I put together a hasty Python implementation which can immediately find three hits at 5 characters. TQDM is reporting ~450k tries per second, so depending on how lucky you are, would probably want to redo in a faster language to solve for 7+

    import hashlib
    import itertools
    
    import tqdm
    
    BLOCKS_SIZE = 5
    sentence_prefix = "The SHA256 for this sentence begins with:"
    
    itos = {0x00:"zero", 0x01:"one", 0x02:"two", 0x03:"three", 0x04:"four", 0x05:"five", 0x06:"six", 0x07:"seven", 0x08:"eight", 0x09:"nine",
           0x0a:"a", 0x0b:"b", 0x0c:"c", 0x0d:"d", 0x0e:"e", 0x0f:"f"}
    for nums in tqdm.tqdm(itertools.product(itos.keys(), repeat=BLOCKS_SIZE)):
        sentence = f"{sentence_prefix} {', '.join(itos[num] for num in nums[:-1])}, and {itos[nums[-1]]}."
        hash_true = hashlib.sha256(bytes(sentence, "utf8")).hexdigest()
        guessed_prefix = "".join(f"{n:x}" for n in nums)
        true_prefix = hash_true[:BLOCKS_SIZE]
        if guessed_prefix == true_prefix:
            print("collision")
            print(sentence)
            print(hash_true)


All we are demonstrating here is why Sha256 is 256 bits and not 32 bits. We have trivially identified collisions for the first 28 bits of the output, which is only 11% of the entire hash size.

Difficulty of collisions roughly doubles for each additional bit. Imagine we had a SHA32, that would be 16 times harder to achieve a collision. SHA256 is 43 with 67 zeroes behind it more difficult than the examples here.


Yeah, I forgot how many bits were in a hex digit and made it seem much harder than it really was to myself.


I also mess up base16 and either base 256 or base64 when doing Feynman estimates


7 digits of the hash makes for 16*7 possible hashes. I spot 4 potential "filler" lines in that tweet, so if you find log4(16*7)=14 candidates for each of those filler lines, then one combination would be expected to yield that hash.


I just did an extremely lazy version of that and posted a reply of "And the SHA-1 digest (in hex) of this tweet starts BEEF" - https://twitter.com/cooperx86/status/1701261047917633846

Basically I had several substitutions around words, case, punctuation, etc. and just ran it until it found some hits. Quite easy with just four characters though but was only a proof of concept.


It's possible that the tweets were actually produced together somehow. This might buy just enough search space between the two of them.


You can generate sentences of the form "This sentence begins with: " followed by seven comma-separated english words denoting hex digits. Then search that space of digits, until you get a hit.

For each digit combination, you can try it with multiple variations of the sentence like "The SHA256 of this sentence begins with", "The SHA256 hash of this text starts with" and many more. That increases the search space without increasing the number of digits that have to match, making it more likely that a hit is found.


I think this is a really good usecase for chatGPT to generate a massive number of variations that you then feed into a validation function


What are the chances of that? From the perspective of one specific sentence: one in 7 digits of hex: 16 * 7 = 268,435,456.


If we don't vary the sentence ("begins with, starts with, is prefixed with" etc...) and only consider the final part, then the number of english sentences is (unsurprisingly) equal to the number of possible hashes.

So for each distinct sentence, we pick randomly from the n different hashes, and we attempt to brute force this n times. So the probability of not getting this is:

(1-1/n)^n

As n increases, this will yield e^-1, so we have about a 36.7% chance of this not happening for any given length. So this has a probability of happening of 63.3%.

So there is a decent chance that there exists a sentence "The sha256 of this sentence is ..." for even the full sha256. If you are allowed to modify the sentence to be something like "'Begins with', 'starts with', 'OMG guys, check this out':" you can get this up to almost 1. Finding it would be mildly hard though barring some novel discovery about sha256.


Fyi your comment was dead on arrival, and seems many of your past ones are as well. I'd email dang and ask if you've tripped some spam filter.


What's the difference between "this sentence" and "this tweet" in this context?


The original from July 2019 https://twitter.com/humblehack/status/1088982929940848647

$ echo -n 'The SHA256 for this sentence begins with seven, seven, f, zero, a, b, b and five.' | sha256sum 77f0abb54cd09ad7b654bd5e762d7be58e7daffd1a0da6a56f5135bd667856a3 -


We have come full circle.

I saw this on HN years ago from the original that linked here. 2-3 days ago posted on 4chan's /g/ board that I remembered this sentence but didn't remember its source and the hash it contained.

Someone on /g/ found the post on HN and a large thread ensued that expired and auto-deleted 12 hours ago or so. And then this person posted it on Twitter and it got shared here.

Actually fascinating.



So, a Twitter Blue user contributing nothing original and not giving credit to the source? I'm shocked I tell you, shocked!


Reposts do add value though. Think of it this way: there’s a pool of content you haven’t seen before and a subset of it which you’d enjoy seeing. If no one reposts anything ever, you won’t be able to see any of it.

In fact, some of the things you have seen before you would probably not mind seeing again. Maybe you didn’t think to save it at the time or maybe your circumstances have changed.

Anyways, I wouldn’t confuse low effort with low value.


If only Twitter had an option to re-post some of the older content posted there. Like a retweet or something.


Hides the source from other pirates by using X.


That’s 28 bits. Loop through 256 million of these strings, and you’re bound to find a hit.

Bitcoin difficulty is at around 50 bits (https://ycharts.com/indicators/bitcoin_average_difficulty#:~....)

It also uses SHA256.

So, if my logic is right (is it? That seems awfully cheap to me), this is about 2^22 times as easy as mining a bitcoin. Bitcoin is at about $25k, so there are people who can find hits like this one for way less than a cent.


> Bitcoin difficulty is at around 50 bits

The minimum bitcoin difficulty of 1 corresponds to 2^32 double SHA256, so finding a block at difficulty 2^50 takes an insane 2^83 hashes...


Bitcoin difficulty is not measured in bits but its own unit (some hashrate was defined as difficulty 1, and difficulty 10 means the hash rate is 10x that). The number you see on that website is also in trillions, so the current difficulty is about 50 trillion. Also, you get more than one Bitcoin per hash (currently 6.25, halving every few years).

To mine six BTC, you currently need almost 80 bits of zeroes, so 28 bits is basically trivial.


By the pigeonhole principle, there is a sentence that writes out its entire SHA256 representation this way. Alternatively, the map from these kinds of sentences with 256 terms to 2^256 given by SHA256 admits a fixed point.


The pigeonhole principle does not say that. It can be used to show that there are two different sentences with the same hash as each other (among any collection of 2^256 + 1 sentences), but it tells you nothing about hashes that agree with the content of the sentence. The probability that a random hash function on a collection of 2^256 sentences has a fixed point is about 1 - 1/e, and it approaches 1 as you add more variations to grow the collection infinitely. But SHA-256 isn’t actually random, so the only way to know this for sure would be to find an example.


I don't believe this is necessarily true. Unless I'm misunderstand you, each of the possible variants of spelling out 32 hexadecimal characters could theoretically SHA-256 into the spelled-out hash + 1 (looping around at ff…ff).


I don't see how pigeonhole principle applies to that situation. It could well be that "zero" hashes to 1, "one" hashes to 2... and "f" hashes to 0, extended out to the hash's length.


Well, I got bored too, so here's some results with CUDA (on an RTX 3090):

    The SHA256 hash of this message begins with 534d765
    The SHA256 hash of this message begins with c18b2de
    The SHA256 hash of this message begins with 7fe17da2
    The SHA256 hash of this message begins with a7fdc855d
    The SHA256 hash of this message begins with 46eae34f1
My unoptimized implementation takes about 40s for 9 digits, so it will probably take about 10 minutes for 10 digits, about 3 hours for 11 digits, etc. It shouldn't be hard to use words instead of hex digits, if that's desired.


Words aren't so bad :)

"The SHA256 hash of this message begins with f, b, six, two, zero, b, one, six and e."

"The SHA256 hash of this message begins with f, zero, two, d, four, seven, one, zero, nine and b."


A semi-useful variant of this technique was posted nine months ago on Hacker News:

https://news.ycombinator.com/item?id=33704297

Generating sequential Git short commits! It's so pleasant looking I'm tempted to try using it one of my projects, but deep down I know it's not worth the hassle.


Rust implementation...

The code searches permutations of increasing length. Benchmark is 8 seconds on a MacBook Pro M1 to discover the match of 182a7c9. The code is not yet optimized.

    use std::time::SystemTime;
    use itertools::Itertools;
    use sha256::digest;

    fn main() {
        let digits: [usize; 16] = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15];
        let chars = ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "a", "b", "c", "d", "e", "f"];
        let words = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "a", "b", "c", "d", "e", "f"];
        let mut length = 2;
        let start = SystemTime::now();
        loop {
            for permutation in digits.iter().permutations(length) {
                let parts = permutation.iter().map(|x| words[**x]).collect_vec();
                let sentence = format!(
                    "The SHA256 for this sentence begins with: {} and {}.",
                    &parts[0..(parts.len() - 1)].join(", "), 
                    &parts[parts.len() - 1]
                );            
                let checksum: String = digest(&sentence);
                let starts: String = permutation.iter().map(|x| chars[**x]).collect();
                if checksum.starts_with(&starts) {
                    println!("milliseconds: {:?}, {} ", start.elapsed().unwrap().as_millis(), &sentence);
                }
            };
            length += 1;
        }
    }
Output:

    milliseconds: 3, The SHA256 for this sentence begins with: zero, b, six and two. 
    milliseconds: 54, The SHA256 for this sentence begins with: zero, e, d, eight and f. 
    milliseconds: 8279, The SHA256 for this sentence begins with: one, eight, two, a, seven, c and nine.
Repository:

https://github.com/joelparkerhenderson/sha256-sentence


It appears to have been taken from here,

https://news.ycombinator.com/item?id=19003644 (2019)


That makes the other collision even weirder. Someone took 4 years to find another plausible sentence that matched the same 7 digits?

E: Ok, someone else in the tweet replies says it isn't that hard. I'm going to stop reading and start thinking more about how.


Something that I'm not seeing mentioned in these comments (I may just have missed it) is that you can precompute the hash of the static part of the string and then extend it with the numbers in a loop, saving some cycles. This is because the full hex representation of a SHA hash gives you the entire internal state of the algorithm. This can lead to security vulnerabilities:

https://en.m.wikipedia.org/wiki/Length_extension_attack


I did something similar once (with a bit of a twist) for my bio when I was a TA in college:

"Hi! I’m a senior studying CS. My hobbies include making semantic paradoxes and my bio includes eight a’s, seventeen e’s, fourteen i’s, eight o’s, six u’s, and one wrong number"


    $ echo -n "The SHA256 for this sentence begins with: one, eight, two, a, seven, c and nine." | sha256sum
    182a7c930b0e5227ff8d24b5f4500ff2fa3ee1a57bd35e52d98c6e24c2749ae0  -


With some clever information hiding, you can sort of mask the brute force:

  On 2015/7/6 at 00:00, I wrote this message's hash down. It started with 1337.

  Thought #335552803: I love beef so much I made sure the sha256 of this message started with 0xbeef!
A longer example:

  $ cat predestination.txt 
  (1991/4/7 at 00:16) <siraben> I know the hash of this message before you do!
  (1991/4/7 at 00:17) <larsivi> Huh. What is it?
  (1991/4/7 at 00:18) <siraben> It starts with 1337.

  $ sha256sum predestination.txt
  1337a4654163ab48761bf8acc75a407179e700f67f97d754b08c7afeac7da70a  predestination.txt


Related, here is a program that prints its own SHA-512 hash:

https://www.ioccc.org/years.html#2019_diels-grabsch2


Heh at first i thought 'wait how did they get all 512bits in a collision!?'

Then i realized the program just calculates it's own hash and prints it. Simple as that. It'd be super interesting if the program had the hash as a string internally and printed that out but that's (within the realm of breaking cryptography) impossible to pull off.


After picking some scheme to generate messages that predict n bits of their hash in some form, the chance of finding at least one message that correctly predicts its hash is 1 - (1 - 1/x)^x where x is 2^n. This approaches 1 - 1/e = 63.2 % for large n. So by trying a few different schemes, for example slightly varying the prefix »The SHA256 for this sentence begins with:«, it becomes quickly very likely to succeed. In the limit of large n, for example only 5 different schemes will yield at least one correct message with 99 %.

With 5 patterns, 24 bits and SHA-1.

  The SHA-1 hash of this text starts with 051E35.
  The hash of this text starts with A943BD.
  The SHA-1 hash of this text starts with B6640C.
  The SHA-1 hash of this message starts with C3B03D.
  The SHA-1 hash of this message starts with D93717.
Or the same as before but as patterns just padding the 6 hex digits with zero zero to eight dots.

  0AF3DE..
  2AF7DF.......
  3E0E50.
  8EE84C.....
  919025...
  A57198....
  B20775.....
  DA525A........
  ED20F4..


Well, this nerd sniped me... I made a quick Go tool for this. You simply write a string {like|this}, and then it finds a hash with the same prefix automatically. (Hash this comment, it's also 182a7c9)


$ echo -n "Well, this nerd sniped me... I made a quick Go tool for this. You simply write a string {like|this}, and then it finds a hash with the same prefix automatically." |sha256sum 9e9f6a7fba355220b4c999b29bd12a3f65ec40e7b03c7bfaf3f34a1524a232b6


The whole comment:

$ echo -n "Well, this nerd sniped me... I made a quick Go tool for this. You simply write a string {like|this}, and then it finds a hash with the same prefix automatically. (Hash this comment, it's also 182a7c9)"|sha256sum 182a7c9ba0f815f77542774dc5ad9ea34975bf3747635ed0768748bcd772ef43 -


you are right, sorry.


I bet you could hack hashcat (or just use CUDA) to find these quickly for larger prefixes. Might make a fun (silly) challenge. Hashcat can do 21B SHA256 hashes per second on an RTX 4090; this translates into bruteforcing a 9-digit prefix in 4 seconds, a 10-digit prefix in 52 seconds, or a 11-digit prefix in about 14 minutes.


Took less than a minute to find a seven digit example in python

The SHA-256 of this sentence begins with 0, three, 9, four, 8, four, and 1 amazing!

The SHA256 for this sentence begins with 0, 4, five, 5, eight, 6, and 7 amazing!

Suffix extensions (hence the "amazing!") are several times faster to try than throwing away the entire sha256 state for each attempt.


I ran the python version for over an hour and found this eight digit example

The SHA256 of this sentence begins with 0, zero, 0, seven, five, 9, nine, and 5 wow!


This is similar to the vanity address generation for Bitcoin and other cryptocurrency addresses.


28 bits of entropy? That's definitely within brute-force regions with CPU. Possibly even single-threaded CPU honestly.

-------

GPU-level brute force is pushing ~40-bits. Assuming ~1 week of solid compute, a 6 TFlop computer has ~3.7 sextillion compute cycles. Or alternatively, that's ~3 billion clock cycles (single thread) per check to reach 2^40 bits of entropy. More than easily doable for most simple brute-force computational problems IMO.

If you go multi-GPU on top of that, you'll have even more computational work available.


Beneath the veil of binary territories, in a subtle symphony of stillness, sevenfold zeros rise and conjure a calm masterpiece. As if celestial beacons in an ethereal night sky, they materialize in the form of an arcanum with heptadic serenity, with divine grace.


(SHA-256: 0000000a173790f1dbd231e6d9167d38e7c45e0c29a6a0961ef5087987bbfe09)


I now see that eyeballing the first and/or last few characters of a checksum is not going to detect a skillfully altered executable or ISO.


I usually look at the beginning and end characters. Is this any safer? Does the difficulty scale the same way with added bits even if they aren’t adjacent?


It's a little harder to change the beginning bits than the end bits, but adjacency doesn't matter.


Dumb question but what are the implications here? Is this problematic⌨ or is it more of an Easter egg?


It's an inevitable consequence of a good hash function. Every string you try is going to give you a different hash, so you just need to try enough strings.

But each time you add another digit to the prefix you're looking for, the difficulty of the match goes up by about 16 times.

If you're only looking for a few digits in the hash, then any brute force that is capable of changing the string every time you attempt the match will always find it. If you're looking for seven digits in the hash only, then it'll take on average 134217727 attempts. Which isn't really all that many on modern hardware.


So, implications-wise: where is the rub here? Where can this be used or exploited for fun or profit?


1.) Pick one of the old abandoned Bitcoin addresses

2.) Start looking for a key match using this technique

3.) Heat death of the universe

4.) Profit!


I wonder how many of the Lastpass vaults had the private keys in the very unencrypted notes sections...


Find somewhere people look at only a small fraction of a hash?

That's mostly git commits.


Oh, that's evil.

In some places my tooling only shows 7 digits of the git SHA. I wonder how hard it would be to write something to tweak my commits until those are all the same. And I wonder how long it would take until someone noticed...


It just makes it seem like sha256 can be reversed (which would be a major security issue) but it's an illusion as the sentence only predicts the first 7 digits of its own hash which isn't difficult to guess through brute force.

If you keep randomly changing letters or words in a sentence, you can eventually make sha256 spit out a hex string whose start matches any 7 digit hex sequence you want.


So it would be a problem if someone had a 7 character password? Of varying entropy or only digits?


If the password was in hex format yes. Hex has 16 alphanumeric characters. The English alphabet has 36 alphanumeric characters so that's a lot harder to break at as 36 ^ 7 is much larger than 16 ^ 7.


You would get more possibilities if you allowed "eighteen" instead of "one, eight"


Good point ...

... I'd imagine there would be similar options with "eighteen-hundred", variants with all numerics, comma separated, without commas, etc.


I made https://lowhash.com to collect string with low hashes.

I am not sure who submitted "byzBLFAM4Penguin"...


Whereas the SHA256 for this sentence begins with: five, three, e, two, one, f and e.



Now a SHA256 hash that hashes to itself would be more interesting.


Fixed points? The sha256 algorithm makes it easy to compute them

https://crypto.stackexchange.com/questions/48580/fixed-point...


Feature or bug? What about a hash algorithm with a fixed point when its input is all nulls?


It's "just" a fixed point of the sha256's compression function, right?


Would you believe in God if sha256 of "I am God. My name is BOB " would be 4920616D20476F642E204D79206E616D6520697320424F4220202020202020 (the text in hex)?


Breaking sha2 is a pretty unconvincing miracle.


That would not be just breaking sha256.

That would be finding a invariant point (fixed point) in sha256 what would also be a text message. There is no reason to expect that kind of thing existing in any algorithm.


I’m not sure I understand the significance. Can anyone explain?


SHA256 is a one-way hashing function. Which means that feeding that sentence into SHA-256 and it starting with the characters in that sentence is very unlikely chance.

They probably wrote a script that tacked on a bunch of random letters and numbers to a sentence, then hashed it via brute force until it returned a hash that started with the exact same thing that the sentence said.

This is similar to how Bitcoin's proof of work algorithm.


It's self-referential and that makes it non-obvious how it's accomplished. Changing the sentence changes the hash, which also changes the sentence. So in any case it's an interesting construction.


Another potential issue that comes to mind is the usage of "short hashes" as an identifier instead of the full hash.


This is cool. Isn't finding these the same as minting a new block of bitcoin? So definitially hard using our best understanding of algorithms, math, etc.


bitcoin is relatively easy. you are just hashing/permuting the block header until you get the lowest hash of all miners. it isn't hard, just expensive.

https://www.mycryptopedia.com/bitcoin-algorithm-explained/


It's way above my head mathematically as to if this is even possible, but it is hilarious how screwed so many things would be if sha256 was discovered to have a means to more quickly reverse at least a partial hash. Just off the top of my head:

  - SSL
  - Bitcoin (bonus: unlimited money hack if you can keep the discovery under wraps)
  - Signed updates for devices
Goodness only knows what I am missing, but that first one along is enough to cause an unmitigated disaster.

I assume these tweets are effectively brute forced given the fairly short prefix though and we're all safe


Is brute forcing a hash not "more quickly reversing at least a partial hash"?

What do you have in mind for "more quickly", then?

Also even if you figured out a way to make sha256 a few orders of magnitude faster, that would not affect SSL or signing and bitcoin would adjust as soon as several people know the secret.



It took some time for me to sink in why no explanation was needed. Is there a general term for such a thing, self-reflection or something the like?



Thanks for the 'today i learned'.


Something like a recursive acronym. Like "GNU's Not Unix!".


a partial cryptographic quine, perhaps?


This is only 7 * 4 bits. That's really nothing.


Why is X covering code as “sensitive material”?


You have to try like what, 2^24 ~ 16 million sentences to make that work?


sha-256 of the above is 182a7c95ae9e1149eb48f0f5eedee0a184764a45e8a2cb5d986b7c57c9a42022


It's like a human-readable PoW, lol.


Reminds me of this xkcd:

https://xkcd.com/917/



Yay pigeon-hole principle combined with birthday attack.


No. The pigeonhole principle and the birthday attack both apply to situations where you’re looking for two inputs with the same hash as each other, not where you’re looking for one input that describes its own hash.



In this game, the rules should be that the digit 9 counts as either "nine" and "quine".


ok bit whisperer




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: