The other day I needed to add SHA-3 (Keccak) support to a project in
order to ensure interoperability with some other applications. I
actually wanted to add SHA-3 support back when it first came out,
before I had a specific use for it, but at the time I looked at some
code samples, and they looked way too confusing. However, when looking
for code samples the other day, I found some more straightforward
implementations that were clear enough that I could figure out how
the algorithm worked and write my own implementation.

I’ve implemented
over a dozen hash algorithms, for reasons I won’t go into here. One
upside to doing this is that I get a decent understanding of how
each one works internally, so I can be familiar with its pros and cons.
Now that I understand how SHA-3 (Keccak) works, and can compare it to some
other hash algorithms, I’m actually somewhat appalled by what I’m
seeing.

### Hash Properties Overview

Cryptographic hash algorithms (like the Secure Hash Algorithm family) take a file,
document, or some other set of data, and produce a fixed size series
of bytes to describe the data, typically between 20 and 64 bytes.
These hashes are supposed to change drastically if even a single bit
is altered in the input data that is being hashed. They’re also
supposed to ensure it’s computationally difficult to produce data
targeting a specific hash.
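
As a quick illustration of the fixed output size and the "changes drastically" property, here is a minimal sketch using Python's standard `hashlib` module. The sample message and the helper name `differing_bits` are just illustrative choices, not anything from a particular project.

```python
import hashlib

# Output sizes in bytes for a few well-known algorithms.
DIGEST_SIZES = {name: hashlib.new(name).digest_size
                for name in ("sha1", "sha256", "sha512", "sha3_256", "blake2b")}

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = b"The quick brown fox"
flipped = bytes([original[0] ^ 0x01]) + original[1:]   # same data with one bit altered

digest_a = hashlib.sha3_256(original).digest()
digest_b = hashlib.sha3_256(flipped).digest()

print(DIGEST_SIZES)
print(differing_bits(digest_a, digest_b), "of", len(digest_a) * 8, "output bits differ")
```

For a well-behaved hash, roughly half of the output bits would be expected to differ after a single-bit change in the input.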

There are two main properties desired in a cryptographic hash algorithm:

1) That it should be
too difficult to generate two documents that have the same hash
(resistance to collision attacks). This
ensures that if party A shows party B a document, and gets B to
approve the document’s hash in some sort of whitelist or signature,
party A cannot then use a different document with the same hash against
B’s whitelist or signature, as if it were also approved, without the consent of B.

2) That if a certain
set of data is whitelisted or signed by a party, it would be too difficult to generate a different set of data that
matches the whitelisted or signed hash (resistance to preimage attacks).

These two properties
are really similar. The only difference is whether you’re targeting
a preexisting hash that exists somewhere, or generating two different
sets of data that can have any hash, as long as they’re identical.

If the first
property is not applicable for a hash algorithm, it may still be
usable for cases where you’re not accepting data generated by a
third party, but only validating your own. If the second property is
not applicable, there may still be some use cases where the hash
algorithm can be used, especially if only a certain subset of hashes
can have data generated against them which match. However, if either
property does not hold true, you generally should be using some other
hash algorithm, and not rely on something which is broken.

### Classical Cryptographic Hash Algorithm Design

For decades hash
algorithms basically did the following:

1. Initialize a “state” of a particular size, generally the output size, with a series of fixed values.
2. Break the data up into a series of blocks of a larger particular size.
3. Take each block and use bit operations and basic math operations (addition, subtraction, multiplication) on its data, ignoring overflow, to reduce the block to a much smaller size, generally the output size, to be merged with the “state”. Each block is processed with some fixed data.
4. Combine each of those states. This might be done by xoring or adding them all together, and throwing away any overflow.
5. Append the size of the data to the final block, producing an extra block if necessary, performing steps 3 and 4 upon it.
6. Return the state as the result.
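
The steps above can be sketched in a few lines of Python. This is a deliberately tiny toy, not any real algorithm and not secure: the 16-byte block, the 8-byte state, the starting values in `IV`, the constant `0x9E3779B9`, and the names `compress` and `toy_hash` are all assumptions made up for illustration.

```python
import struct

MASK32 = 0xFFFFFFFF
BLOCK_BYTES = 16                    # toy block size: four 32-bit words
IV = (0x01234567, 0x89ABCDEF)       # step 1: fixed starting values (arbitrary here)

def rotl32(value: int, shift: int) -> int:
    return ((value << shift) | (value >> (32 - shift))) & MASK32

def compress(state, block: bytes):
    """Steps 3 and 4: reduce one 16-byte block to 8 bytes and merge it into the state."""
    a, b = state
    for i, word in enumerate(struct.unpack(">4I", block)):
        a = (a + (word ^ b) + 0x9E3779B9 * (i + 1)) & MASK32   # addition, overflow thrown away
        b = rotl32(b ^ a, 7)
    return ((state[0] + a) & MASK32, state[1] ^ b)

def toy_hash(data: bytes) -> bytes:
    # Step 5: a marker byte, zero fill, then the data size in bits
    # (this spills into an extra block when the last block is too full).
    padded = data + b"\x80"
    padded += b"\x00" * (-(len(padded) + 8) % BLOCK_BYTES)
    padded += struct.pack(">Q", len(data) * 8)
    state = IV
    for offset in range(0, len(padded), BLOCK_BYTES):          # step 2: fixed-size blocks
        state = compress(state, padded[offset:offset + BLOCK_BYTES])
    return struct.pack(">2I", *state)                          # step 6: the state is the hash

print(toy_hash(b"abc").hex(), toy_hash(b"abd").hex())
```

Note that the value returned is the entire chained state, which matters for the misuse discussed further on.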

### SHA-3 Competition

All hash algorithms
sooner or later don’t quite hold up to their expectations, although
this isn’t always a problem. For example, if a hash algorithm is
supposed to take 1000 years to brute force a match to an existing
hash, and someone figures out an algorithm to do it in 950 years, it
doesn’t provide the security it theoretically advertises, but the
margin of security is so high, it doesn’t matter.

However, in
recent years, real attacks violating one of the two key cryptographic
hash properties, which could be performed in hours or even minutes,
have been found against the popular MD5 and SHA-1 algorithms. These
attacks don’t necessarily invalidate MD5 and SHA-1 from every
potential use, but it’s bad enough that they should be avoided
whenever possible, and should not be relied upon for security.

There’s also an
issue with these classical hash algorithms regarding how the states are chained and returned as the hash. It
makes it easy for hashes to be misused in some cases. Someone who
doesn’t necessarily know what data matches a particular hash can
still calculate the hash of data + data2 (more precisely, of data
followed by its padding and then data2), which is known as a length
extension attack. This might be an issue in
certain naïve usages of these hash algorithms.
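
To make that concrete, here is a hedged sketch of the idea against a toy chained hash. The function `compress` is only a stand-in built on SHA-256 (an assumption for illustration, not how any real algorithm compresses a block), and `toy_md`, `md_padding`, the secret, and the messages are all hypothetical names and values.

```python
import hashlib
import struct

BLOCK = 64
IV = bytes(32)

def compress(state: bytes, block: bytes) -> bytes:
    # Stand-in compression function (an assumption for this sketch, not any real design).
    return hashlib.sha256(state + block).digest()

def md_padding(total_len: int) -> bytes:
    """Marker byte, zero fill, then the 64-bit bit length of everything hashed so far."""
    zeros = -(total_len + 1 + 8) % BLOCK
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", total_len * 8)

def toy_md(data: bytes, state: bytes = IV, already_hashed: int = 0) -> bytes:
    padded = data + md_padding(already_hashed + len(data))
    for offset in range(0, len(padded), BLOCK):
        state = compress(state, padded[offset:offset + BLOCK])
    return state

# The victim publishes only this hash; the attacker never sees `secret`.
secret, message = b"secret-key", b"amount=10"
known_hash = toy_md(secret + message)
known_len = len(secret + message)

# The attacker resumes from the published hash, which *is* the final chained state.
glue = md_padding(known_len)
extra = b"&amount=9999"
forged_hash = toy_md(extra, state=known_hash, already_hashed=known_len + len(glue))

# Same result as hashing the full extended data, computed without knowing the secret.
assert forged_hash == toy_md(secret + message + glue + extra)
```

The attacker only needs the published hash and the length of the hidden data. This is the kind of naïve usage that constructions like HMAC exist to avoid.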

Further, all the
classical algorithms were shown to not quite hold up to their
expectations, even though they’re still considered secure enough
for the time being. This led to a desire to create a new series of
hash algorithms which have a different structure than the classical
ones. Therefore a competition was held to create some new ones, determine the best candidates for future use, and award the title “SHA-3”.

#### BLAKE and BLAKE2

One of the SHA-3
competition finalists was BLAKE. This hash algorithm was composed of
a few different components. Its authors suggested studying
BLAKE with each component left out in turn, to understand how strong BLAKE was based on a
majority of its components. After the competition was over, and after much
study, it was realized that one of the components of BLAKE wasn’t
necessary, and that the number of operations overall could be reduced
to improve its speed, without really hurting its security. Combining that with some interesting ideas
introduced by Skein and Keccak, two other finalists, BLAKE2
was created. BLAKE2 is one of the fastest cryptographic hash
algorithms, and is also considered to be one of the strongest
available.

BLAKE2 works as
follows:

- Initialize a “state” of a particular size, with a series of fixed values, mostly zeros.
- Break the data up into a series of blocks of a larger particular size.
- Take each block, the block number, and a flag, and use bit operations and addition, ignoring overflow, reducing the size of these in half, to be merged with the “state”. Each block+number+flag is processed with some fixed data.
- Combine each of those states.
- The flag used alongside the final block is different than all prior blocks.
- Return the state as the result.

Theoretically BLAKE2
is stronger than classical hashes, because there is data added to
each block that is processed, which cannot be simply influenced by
passing data to it. This makes computing data to go from state A to
state B more difficult, because you would need a different set of
data to do so depending on where the block you’re trying to replace
is. Calculating a hash from another hash for data + data2 is more
difficult, because of the flag change. The state for data would be
different if more blocks were appended to the data.
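
A rough sketch of why the extra per-block data matters is below. It is not BLAKE2 itself: `compress` is a stand-in built on SHA-256, and `toy_counter_hash` is a hypothetical name. The sketch counts bytes consumed so far (which is how BLAKE2's own counter works) and passes a flag for the final block.

```python
import hashlib

BLOCK = 64

def compress(state: bytes, block: bytes, counter: int, is_final: bool) -> bytes:
    # Stand-in for BLAKE2's real compression function (assumption for this sketch).
    extra = counter.to_bytes(8, "little") + (b"\xff" if is_final else b"\x00")
    return hashlib.sha256(state + extra + block).digest()

def toy_counter_hash(data: bytes) -> bytes:
    state = bytes(32)
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)] or [b""]
    consumed = 0
    for index, block in enumerate(blocks):
        consumed += len(block)                    # running count of bytes fed in so far
        is_final = index == len(blocks) - 1
        state = compress(state, block.ljust(BLOCK, b"\x00"), consumed, is_final)
    return state

block = b"A" * BLOCK
start = bytes(32)

# The same block applied to the same state gives a different result at a different position,
at_first_position = compress(start, block, BLOCK, False)
at_second_position = compress(start, block, 2 * BLOCK, False)
assert at_first_position != at_second_position

# and the state published for a final block is not the state a longer input passes through.
assert compress(start, block, BLOCK, True) != compress(start, block, BLOCK, False)
```

Because the counter and flag are supplied by the algorithm rather than by the data, a block crafted for one position does not carry over to another position.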

#### Keccak

The actual winner of
the SHA-3 competition was the Keccak algorithm. It was chosen because
it was really different from classical hashes (not necessarily a good
thing), and really fast in hardware implementations.

Keccak works as
follows:

- Initialize a large “state” of a particular size, with zeros.
- Break the data up into a series of blocks of a smaller particular size.
- Take each block, and use bit operations, to be merged with the larger “state”. Each block is processed with some fixed sparse data.
- Combine each of those states.
- The final block has two bits flipped.
- Return a small portion of the state as the result.

Like BLAKE2, Keccak
aims to be stronger than classical hashes because there is state data
larger than the size of the block, which cannot be immediately
influenced by data (but can still be influenced). Calculating a hash based upon another with
appended data is also difficult, because the result is a truncated
state. The bit flips in the final block would help as well.
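
For readers who want to see the steps listed above in code, here is a compact Python sketch of SHA3-256, written in the style of the publicly available compact Keccak reference implementations. The names `rol64`, `keccak_f1600`, and `sha3_256_sketch` are my own for this example, and it is meant for reading rather than production use. Note that SHA-3 as standardized flips a couple of extra domain bits in the padding compared to the original Keccak proposal, which is why the first padding byte is `0x06`.

```python
import hashlib

MASK64 = (1 << 64) - 1

def rol64(value: int, shift: int) -> int:
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64

def keccak_f1600(lanes):
    """The 24-round permutation on a 5x5 grid of 64-bit lanes, indexed lanes[x][y]."""
    lfsr = 1
    for _ in range(24):
        # theta: mix each column's parity into neighbouring columns
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4] for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate each lane and move it to a new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # chi: the only non-linear step (the AND operations)
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: xor a sparse round constant into a single lane
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes

def sha3_256_sketch(data: bytes) -> bytes:
    rate = 136                      # block size in bytes; input never touches the other 64 bytes
    state = bytearray(200)          # large 1600-bit state, all zeros
    padded = bytearray(data) + bytes(rate - len(data) % rate)
    padded[len(data)] ^= 0x06       # SHA-3 domain bits plus the first padding bit
    padded[-1] ^= 0x80              # final padding bit
    for offset in range(0, len(padded), rate):
        for i in range(rate):       # merge the block into the first `rate` bytes of the state
            state[i] ^= padded[offset + i]
        lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
                  for y in range(5)] for x in range(5)]
        lanes = keccak_f1600(lanes)
        for x in range(5):
            for y in range(5):
                state[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return bytes(state[:32])        # only a small portion of the state is returned

assert sha3_256_sketch(b"abc") == hashlib.sha3_256(b"abc").digest()
```

Nothing in `sha3_256_sketch` feeds a block counter or the data size into the state; the only position-dependent input is the padding in the last block.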

### My thoughts

After implementing
Keccak and understanding its design, I became alarmed by how much is
missing. The use of pure bit operations makes it easier to compute in
reverse, aside from a few AND operations. The computation of the
final state uses bit flippings inside the block, as opposed to
outside beyond it, making it easier to tamper with (although
theoretically still difficult). But most importantly, there is an utter lack
of a block counter or data size anywhere.

Classical hash
algorithms make it difficult to insert blocks anywhere in the
middle of data and get a matching hash. Even if you were able to
compute some data to go from state A back to state A, you could not
insert this somewhere, because the data size at the end must then be
different, resulting in a different hash. Same goes for BLAKE2
because of its block counter appended to each block it processes.

**Keccak has absolutely nothing here.**
Keccak’s initial
state is all zeros, something which every Keccak hash must use. If
you could compute data which is a multiple of the block size which
when processed in Keccak would go from state all zeros to state all
zeros, __you have just broken Keccak__. You would be able to prepend this data as many times as you want to the beginning of any other set of data, and produce the exact same hash. **This would apply to every single Keccak hash produced, it targets all Keccak hashes in existence.**

Now, I’m no expert
cryptographer. Perhaps Keccak’s bit operations ensure that a state
of all zeros could never be produced from it. Maybe there even exists
a proof for this which is in a document or article I haven’t seen yet.
However, if there is no such proof, __then Keccak may be broken much worse than even the classical hash algorithms are__. With the classical algorithms that are broken, you generally have to break each hash with a very large amount of computations specific to each hash result you want broken. With Keccak here, once a prefix becomes known, you can just prepend it, **no computation necessary**.

#### Improving Keccak

Of course if Keccak
is indeed broken in this fashion, it’s not hard to fix. The data
size could be processed in the end, or a block counter could be used
alongside each block like BLAKE2; the state is certainly large enough
to be able to handle it. If one were to change Keccak, I’d also
move the bit flippings at the end to occur beyond the block size
inside the state, just to make it more difficult to tamper with the
final result. In the meantime though, __is Keccak, and therefore SHA-3, even safe to use?__

## 13 comments:

A few points:

- Preimage attack (second on your list) implies collision attack (first on your list).

- Brute forcing a collision is much easier (square root) than brute forcing a preimage.

- SHA1 and MD5 are basically just broken with regards to collision, because their hash lengths are now in the range of bruteforceability. (Hash lengths of 128 bits means collision attack of 64 bits (sqrt(2^128) = 2^64), which is in the range of bruteforceability.)

- Your description of how basic hash algorithms work (namely Merkle-Damgard construction) lacks an important step: Transformation of the input into a prefix-free string in order to avoid attacks that append inputs.

I have not looked into them in detail yet, but it looks to me as if the new hash algorithms are heavily inspired by how cryptographic random-number generators work. There is an internal state that is larger than the output and you can feed in as much entropy as you want.

Your analysis is very interesting though. I cannot wait to see some interesting comments on this by people who have looked into it more deeply.

Hi 施特凡,

I linked to the Wikipedia page about preimages and collisions right before my "list" for those that want to learn more about those.

Obviously if you can brute force a hash you can create two data sets with the same hash. However, there may be techniques available that allow the creation of two data sets with the same hash without being required to brute force a hash. People were generating two documents with the same MD5 in a few seconds on a Pentium 4, several years before it became feasible to brute force an MD5 hash with a server farm or stack of GPUs, which usually takes a few minutes.

Regarding the Merkle–Damgård construction, I did not go into full details, as my objective was to focus on the high level aspects and compare and contrast them with BLAKE2 and Keccak. Stuff like compression, sponges, block ciphers, padding, and more was left out on purpose. I intentionally did not cover why HMAC (or other algorithms) does stuff the way it does to avoid prefix and suffix issues.

I have not looked at every new hash algorithm either. However, from what I recall about BLAKE2, the full output is not larger than the internal state (excluding the counters). However, most of the modern hash algorithms offer features to make KDF, MAC, and PRF construction fairly easy. BLAKE2 is built upon ChaCha20 which is a CSPRNG internally.

I too eagerly await the interesting comments. I also hope someone can prove that a prefix block or blocks breaking Keccak is not possible. Although from people I spoke to so far on the topic, they don't think it is.

The simple answer is that the sponge mode of operation, which is the mode used by SHA-3, [guarantees that it cannot be distinguished from the ideal hash with effort less than approximately 2^(c/2) operations](https://keccak.team/files/SpongeIndifferentiability.pdf), where c is the capacity of the hash function. This is often called indifferentiability. In SHA-3, c = 448, 512, 768, or 1024 depending on the output length. This rules out any generic attacks of cost < 2^(c/2) that could break the hash with reasonable probability.

With respect to the core permutation, which needs to approximate a randomly sampled permutation, it does indeed only use xor, rotation and bitwise and. But repeated 24 times, and with a very well thought-out linear layer, this is sufficient to deter all known attacks. In fact, much fewer rounds would be necessary for this hash to be considered secure; 24 rounds constitute an enormous security margin.

So yes, you could find some output that would lead back to the all-zero state. But this would require either an astronomical computation cost (e.g., more than all the energy in the galaxy put together), or an unknown breakthrough in cryptanalysis that would help a highly conservative design (for reference, most conservative block ciphers designed in the 90s, such as AES, Serpent, Twofish, etc, are still as secure as they were then). None of these things seem likely to happen anytime soon.

@Bob: You cannot say that it is secure because it meets the security definition. That is circular. The question is: Why and how does it meet the security definition?

I think the Insane Coder is looking for an intuitive understanding /why/ and /how/ the transformation (sponge) function is making this attack hard.

Hi 施特凡,

You are correct.

However, to go a step beyond that: we all know that ciphers and hashes get weaker over time. People find ways to break rounds in the process. People find exploits for certain aspects. The outcome is that attacks exist that are better than brute force.

At some point, if an attack exists that can be performed by a government or a large corporation with a massive farm, you have to wonder how susceptible your algorithm is. The key here is that your typical algorithm does not have an achilles heel. It's highly unlikely that anyone is going to spend billions of dollars just to break a single digest produced that is used by software people like us are developing. However, if you can spend a billion dollars to break every single digest produced in a particular algorithm that's being used? That might be a feasible target and worth pouring a huge investment into.

4 rounds of Keccak can already be broken with minimal resources. Possibly even 6 rounds. Yet, the Keccak team recently introduced a 12 round variant because they believe that's strong too: https://keccak.team/kangarootwelve.html

Now if I have two hash algorithms in front of me, both with the same theoretical security margin, but one probably has an achilles heel and the other does not, why would I use the one with the achilles heel?

While talking this over with some colleagues, a truly scary thought came up.

What if Keccak has a kleptographic backdoor in it?

Dual_EC_DRBG is a random number algorithm that NIST standardized back in the 2000s. Researchers realized its design allowed for a backdoor, where if the constants used by it were computed in a certain way, those who computed those constants could use some secret data generated during those computations to predict the random output. Several years later, Snowden more or less confirmed that indeed this occurred, and the NSA has a backdoor into the algorithm.

Now we look at Keccak. Of all the popular hash algorithms in use, it's the only one which does nothing to prevent prepend and insertion attacks. This could be an oversight by its developers, or it could be just one part of hiding an intentional backdoor to break the hash in certain ways. Although we haven't proven that a zero to zero sequence must exist, we cannot immediately disprove it either (would love to see a proof!).

Now the round and rotation constants look like nothing up my sleeve numbers, but

> Bernstein, et al., demonstrate that use of nothing-up-my-sleeve numbers as the starting point in a complex procedure for generating cryptographic objects, such as elliptic curves, may not be sufficient to prevent insertion of back doors. If there are enough adjustable elements in the object selection procedure, the universe of possible design choices and of apparently simple constants can be large enough so that a search of the possibilities allows construction of an object with desired backdoor properties.

Now this is not a case of elliptic curves, but those round constants for Keccak look really sparse. No other popular hash algorithm uses such sparse constants. The Keccak team describes the algorithm they used to generate their constants, but out of all the possible algorithms used, there is no rationale for why this one was picked over something less sparse.

Now you may say to yourself:

> But insane coder, so what if skeleton keys to break all instances of Keccak hashes can exist, there is no reason to believe they do and that some entity already has them. So what if the round constants look too simple, that doesn't mean anything. You're just being paranoid!

To this I respond, cryptography is about being paranoid. Consider the amount of research we now have into Post-Quantum-Computing cryptographic algorithms. Quantum Computers that actually can compute and break anything significant are science fiction, and there's a good chance they will never exist in our universe. Yet research is currently pouring tons of resources into this paranoia.

Seriously, why should we use an algorithm which might be compromised when we have other options?

I found some time to look at the algorithm and at least can answer one of your initial questions: There cannot be a transfer from the zero state to the zero state because of the Iota step in the f function, where constants are added in each round (after the rotations, additions and permutations).

Hello 施特凡,

Yes constants can be xor'ed in. But those same constants can be xor'ed out.

How does this step prevent the f function from ever outputting all zeros?

Or are you saying how it's added with the 5 variables cycling prevents this?

Hello 施特凡,

Using the inversion code found here: https://github.com/KeccakTeam/KeccakTools/blob/master/Sources/Keccak-f.h#L496

We computed a state that when run through the Keccak F function would output all zeros.

These are the 25 64-bit state values that permuted through all 24 Keccak rounds would result in all zeros:

14721456279641464894

7724296291459751827

10886275407155278477

319949407487385933

2401862744500137422

6621695293454996830

15688751689846876917

2272708723695410598

3914634542405067510

121355999895083400

4275415388935110838

4250087273847779342

11299929515691117547

7305221439536160700

17775658536205013067

4402618029247410770

12748353195565577610

17929335980907442738

908399519615871481

5655375619681861610

7730053291985347927

4776334871276930609

14375008313403715986

11069090975619611366

8758282975667999104

Here it is in C99 if you want to test it:

```c
#include <stdint.h>

/* The 25 64-bit state values listed above, in the same order. */
uint64_t state[25] =
{
  UINT64_C(14721456279641464894),
  UINT64_C(7724296291459751827),
  UINT64_C(10886275407155278477),
  UINT64_C(319949407487385933),
  UINT64_C(2401862744500137422),
  UINT64_C(6621695293454996830),
  UINT64_C(15688751689846876917),
  UINT64_C(2272708723695410598),
  UINT64_C(3914634542405067510),
  UINT64_C(121355999895083400),
  UINT64_C(4275415388935110838),
  UINT64_C(4250087273847779342),
  UINT64_C(11299929515691117547),
  UINT64_C(7305221439536160700),
  UINT64_C(17775658536205013067),
  UINT64_C(4402618029247410770),
  UINT64_C(12748353195565577610),
  UINT64_C(17929335980907442738),
  UINT64_C(908399519615871481),
  UINT64_C(5655375619681861610),
  UINT64_C(7730053291985347927),
  UINT64_C(4776334871276930609),
  UINT64_C(14375008313403715986),
  UINT64_C(11069090975619611366),
  UINT64_C(8758282975667999104),
};
```

If we know of a series of input blocks (input whose length is a multiple of the block size) that would produce this internal state, then we have successfully determined a zero to zero cycle.

We also have some leeway in that there are multiple internal states that would permute to zero. For SHA3-256, there should be multiple solutions where we can change the first 17 values and this will still work.

My colleagues and I have determined that there are multiple 272 byte inputs to SHA3-256 and 216 bytes of input to SHA3-512 that would cause a state cycle.

Some math regarding this topic:

In Keccak, for every `f(x) = y`, there exists a non-empty `z` such that `f(z||x) = y`, and the value of `z` is actually independent of the values of `x` and `y`. There's also more than one possible `z`, and `f(z||z||x) = y` as well as `f(z||z||z||x) = y` and so on.

In the realm of a two block `z` (a skeleton key) for SHA3-256, the first block of input would need to produce an internal state where the last 8 64-bit integers match the numbers I posted above. Regarding the first 17 integers, whatever the state is from the first block, the input for the second block would be the first 17 integers from the state xor'd against the first 17 numbers from above. This means that what we need for the second block of input here is fully known, and all we care about targeting in our first block are those 8 64-bit values (which matches the security level).

Since we have 2^1088 inputs for the first block, targeting 512 bits, there should exist approximately 2^(1088-512) skeleton keys, which is ~2^576 for SHA3-256. Under NIST's initial desire to lower the security level for SHA3-256, it would have been 2^(1344-256), which is ~2^1088 skeleton keys. These skeleton keys exist independently of the padding bit flips suggested by the initial Keccak proposal and what SHA-3 ended up using.
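
The two-block construction described above can be tried out on a toy sponge that is small enough to search exhaustively. Everything here is an assumption for illustration: the 24-bit state, the 16-bit block, and the Feistel-style `perm` are stand-ins and have nothing to do with the real Keccak permutation, and `toy_sponge` and `find_skeleton_key` are hypothetical names. At SHA3-256's real size the same first-block search would have to hit a 512-bit target, which is why nobody is expected to do it by brute force.

```python
RATE_BYTES = 2                  # toy rate: 16 bits of the 24-bit state accept input
RATE_MASK = 0xFFFF
HALF_MASK = 0xFFF
ROUNDS = 12

def _round_fn(half: int, rnd: int) -> int:
    x = (half * 0x9E5 + 0x3D1 * (rnd + 1)) & HALF_MASK
    return x ^ (x >> 5)

def perm(state: int) -> int:
    """Toy 24-bit invertible permutation (a Feistel network), standing in for Keccak-f."""
    left, right = state >> 12, state & HALF_MASK
    for rnd in range(ROUNDS):
        left, right = right, left ^ _round_fn(right, rnd)
    return (left << 12) | right

def perm_inv(state: int) -> int:
    left, right = state >> 12, state & HALF_MASK
    for rnd in reversed(range(ROUNDS)):
        left, right = right ^ _round_fn(left, rnd), left
    return (left << 12) | right

def toy_sponge(data: bytes) -> bytes:
    padded = data + b"\x01" + b"\x00" * (-(len(data) + 1) % RATE_BYTES)
    state = 0                                            # all-zero initial state
    for offset in range(0, len(padded), RATE_BYTES):
        state ^= int.from_bytes(padded[offset:offset + RATE_BYTES], "big")
        state = perm(state)
    return (state & RATE_MASK).to_bytes(RATE_BYTES, "big")

def find_skeleton_key():
    target = perm_inv(0)                                 # the state that permutes to all zeros
    for block1 in range(1 << 16):
        after_first = perm(block1)                       # state after block 1, starting from zeros
        if after_first >> 16 == target >> 16:            # bits the input cannot reach already match
            block2 = (after_first ^ target) & RATE_MASK  # xor away the rest of the difference
            return block1.to_bytes(2, "big") + block2.to_bytes(2, "big")
    return None

z = find_skeleton_key()
if z is not None:
    print(z.hex(), toy_sponge(z + b"hello") == toy_sponge(b"hello"))
```

Once such a `z` is found for the toy, prepending it any number of times leaves the toy hash of every message unchanged, which is the property being discussed.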

Apparently this problem was already identified back in 2011, as a colleague of mine just found this paper: https://eprint.iacr.org/2011/261.pdf

What we know is just more of a simplification of that, and we've already computed the (much) easier half of the problem. The concept which really bothered me in this article is mentioned by the paper:

> it seems that one property present in older designs (MD5, SHA-1, SHA-2) that we took for granted and there was no attempt to define it precisely as a property (or requirement) for SHA3.

Therefore,

> For all other known cryptographic hash functions (MD5, SHA-1, SHA-2, SHA-3 candidates), there is not known explicit form for a class of second preimages for an arbitrary message M. However, for Keccak one class of second preimages is known and is defined.

Therefore we can conclude that Keccak is an algorithm which allows for backdoors. We cannot assume at this point that anyone has the skeleton keys for this backdoor, but we also cannot assume that no one has a skeleton key.

Troubling to consider is that Keccak has its roots in Panama which as a hash function offered a 2^128 security level against collisions, but using some of its mathematical properties actually has a break in 2^6. It was created and broken several years later by some of Keccak's authors. The highly skilled Keccak team, extremely knowledgeable regarding this kind of novel structure used in these algorithms, improved Panama against these earlier attacks they found, producing Keccak. However, by the same token, they would also have the skill to hide a backdoor in the algorithm. There's a decent chance they have some skeleton keys at their disposal if they created an algorithm which has some mathematical property others are not yet aware of.

Knowing this, why should we use an algorithm which might be compromised when we have other options?


