Great question, and you're right to be skeptical. Extraordinary claims do require extraordinary evidence. We've put great care into making the work fully reproducible, and have provided all the files you need to reproduce it rather than take the claim at face value.
Before we continue, you can verify that in a general sense (not for this specific hash), we have the technical capability to make end-to-end collisions through our linked tool for an already-broken hash (the MD5 hash from 1991 which was considered broken by 2008):
If you're skeptical, you can use literally any MD5 tool, including the ones built into Linux and Windows PowerShell, or any online MD5 calculator, to test the differing files.
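For anyone who would rather script that check than trust a single tool, here is a minimal Python sketch using only the standard `hashlib` module. The function names and the file paths you pass in are placeholders (assumptions for illustration), not anything shipped with the linked tool.

```python
import hashlib
from pathlib import Path


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()


def is_md5_collision(a: bytes, b: bytes) -> bool:
    """True only if the inputs differ but their MD5 digests match."""
    return a != b and md5_hex(a) == md5_hex(b)


def check_files(path_a: str, path_b: str) -> bool:
    """Read two files (paths are placeholders) and test them for an MD5 collision."""
    return is_md5_collision(Path(path_a).read_bytes(), Path(path_b).read_bytes())
```

Calling `check_files` on the two downloaded files should return `True` only if they really differ byte for byte while sharing one MD5 digest. Note that this says something about MD5 only, not about SHA-256.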
Our certificates implement the full SHA-256 algorithm but with the relaxations we mention throughout the paper, and we provide the source code. You can verify it yourself using any and all means including our certificates or writing your own version by hand if you don't trust our code. In addition to this, our results have been verified by other cryptographers. Thanks again for the question.
Edited to clarify, thanks for pointing it out. It wouldn't be responsible for us to hold off publishing until we had reached the same stage for SHA-256, since at that point TLS and other certificates would be considered compromised.
The neat thing about bitcoin is that the incentive to break it is so high that it would almost certainly be the first place you would learn that SHA2 had been broken. Not on a website like this. I can verify its integrity by opening robinhood on my phone.
>The neat thing about bitcoin is that the incentive to break it is so high that it would almost certainly be the first place you would learn that SHA2 had been broken.
We actually see the incentive in the other direction. If we were able to reduce the search space for bitcoin proof-of-work (by applying thousands of higher-order algebraic theorems end-to-end to reduce the search space somewhat[1]), we would be financially incentivized not to tell anyone and to mine at a discount. The financial incentive is against open research and disclosure. We don't get anything out of disclosing this except a neat publication.
[1] Interestingly, ASICs (which are usually used to mine bitcoin) basically encode every operation verbatim; they don't use higher-order mathematics at all. However, reducing mining complexity is not really on the horizon, even with our latest approaches, since it would require end-to-end complete control over the double-SHA-256 pipeline. That's considerably harder than just finding a collision when you're allowed to search just the tail part (the final rounds).
> Secure hash functions are used to make a short version of a large file. Ideally, it has several properties including making it infeasible to find two files with the same cryptographic hash. We've just gotten 92% of the way there. This has security ramifications in that other researchers are expected to be able to complete the work through similar methods as explored in the paper. We weren't sure if this was a remarkable result, since it's not a full collision
I thought this meant they were able to generate collisions for 92% of files/hashes they tried, but it sounds like they're able to generate hashes that are 92% identical?
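To make the second reading concrete, here is a small Python sketch that measures what fraction of bit positions two digests agree on. This only illustrates the "92% identical" interpretation in this comment; the paper's own 92% figure may be defined differently, and the function name and sample messages are made up for illustration. Keep in mind that two unrelated digests already agree on roughly half of their bits by chance.

```python
import hashlib


def matching_bit_fraction(d1: bytes, d2: bytes) -> float:
    """Fraction of bit positions on which two equal-length digests agree."""
    if len(d1) != len(d2):
        raise ValueError("digests must have the same length")
    # XOR leaves a 1 bit wherever the two digests differ.
    differing = sum(bin(x ^ y).count("1") for x, y in zip(d1, d2))
    return 1.0 - differing / (8 * len(d1))


# Two arbitrary, unrelated messages (illustrative only).
a = hashlib.sha256(b"message one").digest()
b = hashlib.sha256(b"message two").digest()
print(f"{matching_bit_fraction(a, b):.1%} of bits agree")
```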
Possible. It's up to people to decide if they're OK with a known 92% collision out there (with the unknown being there could be a 100%), or go for something stronger.
Thanks, you have this exactly right. The unknown part is especially worrying because we haven't yet implemented many of the strongest techniques for the final stretch, e.g. Wang-style message modification. Our result is basically a very strong direction in this cryptographic research, but not a full break yet.
Thank you for pointing out that that section could be clearer. I've now updated it. It now reads:
>We've just gotten 92% of the way to finding a single collision (this means that there is no full collision yet). This has security ramifications in that other researchers are expected to be able to complete the work through similar methods as explored in the paper, and eventually produce collisions at will. We weren't sure if this was a remarkable result, since it's not a full collision, but we shared the work with the leading cryptographer in the field, who holds the world records in reduced-round attacks, and got great encouragement to proceed to publish it as a paper, so we did so.
(If we had found a single full collision, we would have just written "we broke SHA-256". This is 92% of the way to a full collision. Any collision is considered a great reduction in the security of the hash, because it means that there are two different files with the same cryptographic hash. This is what happened to other algorithms such as MD5, as demonstrated in the linked tool.)
I'd expect a finding / paper like this to be submitted to the IACR ePrint server [1] to bring it to the attention of the cryptographic community. I can't see that it's been submitted yet.
Venue should not imply credibility but in this case it would certainly help bring the proper scrutiny.
You can verify the certificates yourself or just wait for us to make an end-to-end collision generator as we did for MD5[1] - you can use that to generate a collision in seconds on your phone or any computer. If you wait for us to complete the end-to-end collision, in a sense it will be a little too late, as TLS certificates and other security that relies on SHA-256 need time to move away. We think it's responsible to disclose at this stage, and as mentioned, our peer reviewer said it is a "very good result" that is "worth publishing". We've gone to great pains to make our method completely reproducible, even writing in the article that we'll help anyone who is having trouble with any part.
I looked into citation [5] since it sounded interesting but the DOI link has been hallucinated and goes to some other article. I assume many of the others are similarly bogus.
Yes, I'm the author of the paper. It's received more than a tiny bit of peer review. I'm happy to answer any questions about it or answer anything that is unclear.
> This report was generated on 2026-03-22 as the final artifact of the SHA-256 Cryptanalysis Research Project. Collaboration: Robert V. (research direction, strategy) and Claude/Anthropic (implementation, computation).
This Claude guy is pretty prolific it seems.
But I'll wait for some known cryptographers to chime in
> it is possible that we'll find relations that carry across the entire double-SHA-256 pipeline
Bitcoin mining is a partial second preimage of 0x00 though, not a collision, that statement just seems to be so outside the realm of what they’re claiming to have done. Even MD5, the most widely known to be broken hash, would be secure when used in the same way bitcoin uses SHA256 (other than being too short now, bitcoin miners have done 80 bits of work at this point many times over).
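To show why mining is a different problem from collision finding, here is a toy proof-of-work sketch in Python. It is a simplification under stated assumptions: real Bitcoin compares the double-SHA-256 of an 80-byte block header against a numeric target, while this sketch just counts leading zero bits, and `mine`, the header bytes, and the nonce encoding are illustrative choices.

```python
import hashlib
from typing import Optional


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as in Bitcoin's proof-of-work."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the start of the digest."""
    value = int.from_bytes(digest, "big")
    return 8 * len(digest) - value.bit_length()


def mine(header: bytes, difficulty_bits: int, max_nonce: int = 1 << 24) -> Optional[int]:
    """Brute-force a nonce whose hash starts with difficulty_bits zero bits."""
    for nonce in range(max_nonce):
        candidate = header + nonce.to_bytes(4, "little")
        if leading_zero_bits(double_sha256(candidate)) >= difficulty_bits:
            return nonce
    return None  # no nonce found within the search limit
```

The miner has to hit a fixed output pattern (many leading zeros) for a largely fixed input, which is a partial preimage search. A collision attack lets the attacker choose both inputs freely and only needs them to match each other, so it does not directly speed up this loop.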
Also, a collision on single-sha256 would imply a collision of double-sha256 right off the bat, since the inputs to the second round would be matching. But as you say, a collision attack doesn't do much to BTC mining.
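The propagation argument is easy to check on a toy hash. The Python sketch below truncates SHA-256 to 16 bits so that a collision can be found by brute force in a fraction of a second; the truncation, the function names, and the counter-based messages are assumptions made purely for illustration, and nothing here finds a collision on real SHA-256.

```python
import hashlib
from itertools import count
from typing import Dict, Tuple


def toy_hash(data: bytes) -> bytes:
    """SHA-256 truncated to 16 bits so collisions are cheap to find (toy only)."""
    return hashlib.sha256(data).digest()[:2]


def double_toy_hash(data: bytes) -> bytes:
    """The toy hash applied twice, mirroring double-SHA-256."""
    return toy_hash(toy_hash(data))


def find_collision() -> Tuple[bytes, bytes]:
    """Return two distinct messages with the same toy_hash value."""
    seen: Dict[bytes, bytes] = {}
    for i in count():
        msg = str(i).encode()
        digest = toy_hash(msg)
        if digest in seen:
            return seen[digest], msg
        seen[digest] = msg


m1, m2 = find_collision()
# The single-hash collision carries over: the second pass sees identical input.
print(toy_hash(m1) == toy_hash(m2), double_toy_hash(m1) == double_toy_hash(m2))
```

The search is guaranteed to terminate by the pigeonhole principle, since there are only 65,536 possible 16-bit outputs.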
We publish this work as responsible disclosure. While a full SHA-256 collision (sr = 64) has not yet been achieved, the tools and techniques presented here represent significant methodological advances that bring it closer. Organizations relying on SHA-256 for collision resistance should begin evaluating migration paths to SHA-3 or other post-quantum hash functions. The cryptographic community should treat the collision resistance of SHA-256 as having a finite and shrinking safety margin.
In the linked work, we've broken 92% of SHA-256 across its full 64 rounds, and were encouraged to publish it by the leading cryptographer in the field (who held the previous record). Currently, SHA-256 is the basis of TLS certificates, bitcoin, and many other security applications. We think it is time to begin to migrate to other hash families, because we expect the rest of SHA-256 to fall soon.
As long as there is no verification of the results, or of their relevance to reaching higher numbers, this means about as much as nearly winning the lottery by guessing 9 of the 12 numbers correctly: you did not win the lottery.
I know people (especially around here) hate it when people just post AI output, and I generally agree, since it is trivial for anyone else who is interested to do the same thing. However, the majority of the comments here are from people seemingly asking the author (or someone else) to explain how significant this is, without having taken that step themselves. So while I normally wouldn't do this, in this case it seems helpful. Claude thought the paper was interesting and had a novel cryptographic technique, but found the claims of a near-term break of the SHA-256 algorithm to be unsupported. Here's the conversation:
That's not how this works, though. I don't care if the method is interesting. I care if it works. I can write an interesting proof that P=NP but that doesn't make it valid.
It's on the author to explain what they mean. Here, they haven't.
Does the fact that Claude wrote the paper help Claude to think the paper was interesting? <facepalm> I'd suggest sticking to your "I don't normally do this" idea
[1] https://stateofutopia.com/papers/2/intermediate-report.pdf
https://stateofutopia.com/experiments/md5collider
> Our certificates implement the full SHA-256 algorithm
We knew MD5 is broken. Do you have a POC for breaking SHA-256, too?
[1] https://eprint.iacr.org/
[1] https://stateofutopia.com/experiments/md5collider
Do some research and write a paper about breaking Bitcoin.
https://news.ycombinator.com/item?id=38668893
(Also my work does not demonstrate any weakness in SHA256, it's just an application of the birthday paradox)
[1] https://eprint.iacr.org/2024/349
https://claude.ai/share/b10b95ef-5d9f-43dd-9005-3d1d89f9dbc1
See also https://en.wikipedia.org/wiki/Brandolini%27s_law:
> The amount of energy needed to refute bullshit is an order of magnitude bigger than that needed to produce it.