The other day I needed to add SHA-3 (Keccak) support to a project in
order to ensure interoperability with some other applications. I
actually wanted to add SHA-3 support back when it first came out
before I had a specific use for it, but at the time, I looked at some
code samples, and it looked way too confusing. However, when looking
for code samples the other day, I found some more straightforward
implementations that were clear enough for me to figure out how the
algorithm worked and write my own implementation.
I’ve implemented
over a dozen hash algorithms, for reasons I won’t go into here. One
upside to doing this is that I get a decent understanding of how each
algorithm works internally, and become familiar with its pros and cons.
Now that I understand how SHA-3 (Keccak) works and can compare it to
some other hash algorithms, I’m actually somewhat appalled by what I’m
seeing.
Hash Properties Overview
Cryptographic hash algorithms (like the Secure Hash Algorithm family) take a file,
document, or some other set of data, and produce a fixed-size series
of bytes, typically between 20 and 64, that describes the data.
These hashes are supposed to change drastically if even a single bit
is altered in the input data that is being hashed. They’re also
supposed to ensure it’s computationally difficult to produce data
targeting a specific hash.
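To make that concrete, here is a small demonstration using Python’s hashlib; the input bytes are just placeholders. Flipping a single bit of the input yields a completely unrelated-looking digest of the same fixed size.

```python
import hashlib

# Two inputs that differ in only a single bit (the final byte is 0x00 vs 0x01).
data1 = b"example document contents\x00"
data2 = b"example document contents\x01"

# SHA-256 always produces a 32-byte digest, regardless of the input length.
print(hashlib.sha256(data1).hexdigest())
print(hashlib.sha256(data2).hexdigest())
# The two digests look completely unrelated, even though the inputs
# differ in just one bit.
```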
There are two main properties desired in a cryptographic hash algorithm:
1) It should be prohibitively difficult to generate two documents that
have the same hash. This ensures that if party A shows party B a
document and gets B to approve the document’s hash in some sort of
whitelist or signature, party A cannot then present a different
document that B’s whitelist or signature also approves, without B’s
consent.
2) If a certain set of data is whitelisted or signed by a party, it
should be prohibitively difficult to generate a different set of data
that matches the whitelisted or signed hash.
These two properties
are really similar. The only difference is whether you’re targeting
a preexisting hash that exists somewhere, or generating two different
sets of data that can have any hash, as long as the two hashes are identical.
If the first
property is not applicable for a hash algorithm, it may still be
usable for cases where you’re not accepting data generated by a
third party, but only validating your own. If the second property is
not applicable, there may still be some use cases where the hash
algorithm can be used, especially if only a certain subset of hashes
can have data generated against them which match. However, if either
property does not hold true, you should generally be using some other
hash algorithm rather than relying on something which is broken.
Classical Cryptographic Hash Algorithm Design
For decades hash
algorithms basically did the following:
- Initialize a “state” of a particular size, generally the output size, with a series of fixed values.
- Break the data up into a series of blocks of a larger particular size.
- Take each block and use bit operations and basic math operations (addition, subtraction, multiplication) on its data, ignoring overflow, to reduce the block to a much smaller size, generally the output size, to be merged with the “state”. Each block is processed with some fixed data.
- Combine each of those states. This might be done by XORing or adding them together, and throwing away any overflow.
- Append the size of the data to the final block, producing an extra block if necessary, and perform the previous two steps upon it.
- Return the state as the result.
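Purely as an illustration, here is a toy sketch of that classical structure in Python. The block size, constants, and mixing step are all made up for this post and have none of the strength of a real compression function; the point is only to show the overall shape: a fixed initial state, per-block mixing, folding each result into the state, and the data size processed at the end.

```python
# Toy sketch of the classical structure described above.  The mixing
# function is invented for illustration; it is NOT a real or secure
# compression function, and the output is far too short for real use.

MASK32 = 0xFFFFFFFF                # ignore overflow by masking to 32 bits
BLOCK_SIZE = 8                     # bytes per block (tiny, for illustration)
ROUND_CONSTANTS = [0x9E3779B9, 0x7F4A7C15, 0x85EBCA6B, 0xC2B2AE35]

def compress(state, block):
    """Reduce one block into the running state with bit ops and addition."""
    for i in range(0, BLOCK_SIZE, 4):
        word = int.from_bytes(block[i:i + 4], "little")
        for rc in ROUND_CONSTANTS:                     # fixed per-round data
            state = ((state ^ word) * 0x01000193 + rc) & MASK32
            state = ((state << 13) | (state >> 19)) & MASK32   # 32-bit rotate
    return state

def toy_classical_hash(data):
    state = 0x67452301                                 # fixed initial value
    # Break the data into blocks, zero-padding the last one.
    padded = data + b"\x00" * (-len(data) % BLOCK_SIZE)
    # Append the data size as an extra final block.
    padded += len(data).to_bytes(BLOCK_SIZE, "little")
    for i in range(0, len(padded), BLOCK_SIZE):
        # Fold each compressed block back into the state, discarding overflow.
        state = (state + compress(state, padded[i:i + BLOCK_SIZE])) & MASK32
    return state.to_bytes(4, "little").hex()

print(toy_classical_hash(b"hello world"))
```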
SHA-3 Competition
Sooner or later, every hash algorithm fails to quite live up to its
expectations, although this isn’t always a problem. For example, if a hash algorithm is
supposed to take 1000 years to brute force a match to an existing
hash, and someone figures out an algorithm to do it in 950 years, it
doesn’t provide the security it theoretically advertises, but the
margin of security is so high that it doesn’t matter.
However, in recent years, real attacks violating one of the two key
cryptographic hash properties, attacks which can be performed in hours
or even minutes, have been found against the popular MD5 and SHA-1
algorithms. These
attacks don’t necessarily invalidate MD5 and SHA-1 from every
potential use, but it’s bad enough that they should be avoided
whenever possible, and should not be relied upon for security.
There’s also an issue with these classical hash algorithms regarding
how the states are chained and returned as the hash, which makes it
easy for hashes to be misused in some cases. Someone who doesn’t
necessarily know what data produced a particular hash can still use
that hash to calculate the hash of data + data2. This might be an issue
in certain naïve usages of these hash algorithms.
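Here is a toy illustration of that chaining problem. It assumes a simplified hash whose output is simply its final internal state, and it skips padding and the length block entirely; a real attack against a classical design also has to account for the padding bytes, but the principle is the same: whoever knows the hash of some data can keep hashing from that point without ever seeing the data.

```python
# Toy illustration of the chaining issue.  The hash below is invented
# for this post: its output is just its final state, and (unlike real
# designs) it has no padding or length block, to keep the example short.

MASK32 = 0xFFFFFFFF
BLOCK_SIZE = 8

def compress(state, block):
    for i in range(0, BLOCK_SIZE, 4):
        word = int.from_bytes(block[i:i + 4], "little")
        state = ((state ^ word) * 0x01000193 + 0x9E3779B9) & MASK32
        state = ((state << 13) | (state >> 19)) & MASK32
    return state

def toy_chained_hash(data, state=0x67452301):
    """Hash 'data' starting from 'state'; the result is the final state."""
    assert len(data) % BLOCK_SIZE == 0          # keep the toy simple
    for i in range(0, len(data), BLOCK_SIZE):
        state = compress(state, data[i:i + BLOCK_SIZE])
    return state

secret = b"data the attacker never sees...."       # 32 bytes, 4 blocks
published_hash = toy_chained_hash(secret)          # all the attacker sees
data2 = b"appended"                                # 1 block the attacker picks

# The attacker simply resumes hashing from the published hash:
forged = toy_chained_hash(data2, state=published_hash)
print(forged == toy_chained_hash(secret + data2))  # True
```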
Further, all the
classical algorithms were shown to not quite hold up to their
expectations, even though they’re still considered secure enough
for the time being. This led to a desire to create a new series of
hash algorithms which have a different structure than the classical
ones. Therefore a competition was held to create new algorithms, determine the best candidate for future use, and award the winner the title “SHA-3”.
BLAKE and BLAKE2
One of the SHA-3
competition finalists was BLAKE. This hash algorithm was composed of
a few different components. Its authors suggested studying BLAKE with
each component left out in turn, to understand how strong BLAKE remained
based on a majority of its components. After the competition was over, and after much
study, it was realized that one of the components of BLAKE wasn’t
necessary, and that the overall number of operations could be reduced
to improve its speed, without really hurting its security. Taking that,
alongside some interesting ideas introduced with Skein and Keccak, two other finalists, BLAKE2
was created. BLAKE2 is one of the fastest cryptographic hash
algorithms, and is also considered to be one of the strongest
available.
BLAKE2 works as
follows:
- Initialize a “state” of a particular size, with a series of fixed values, mostly zeros.
- Break the data up into a series of blocks of a larger particular size.
- Take each block, the block number, and a flag, and use bit operations and addition, ignoring overflow, to reduce these to half their combined size, to be merged with the “state”. Each block + number + flag is processed with some fixed data.
- Combine each of those states.
- The flag used alongside the final block is different than all prior blocks.
- Return the state as the result.
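A toy sketch of that structure might look like the following. The mixing function is made up; the only point is to show where the block counter and the finalization flag enter the computation, not to reproduce BLAKE2’s actual rounds.

```python
# Toy sketch of a BLAKE2-like structure.  The mixing function is made
# up for illustration and bears no resemblance to BLAKE2's real rounds.

MASK64 = 0xFFFFFFFFFFFFFFFF
BLOCK_SIZE = 16

def mix(state, block, counter, final_flag):
    """The block, its position counter, and the flag all feed into the mix."""
    for i in range(0, BLOCK_SIZE, 8):
        word = int.from_bytes(block[i:i + 8], "little")
        state = (state + word + counter) & MASK64
        state ^= 0xA5A5A5A5A5A5A5A5 if final_flag else 0x0F0F0F0F0F0F0F0F
        state = ((state << 27) | (state >> 37)) & MASK64   # 64-bit rotate
    return state

def toy_blake2_like(data):
    state = 0                                      # mostly-zero initial state
    padded = data + b"\x00" * (-len(data) % BLOCK_SIZE) or b"\x00" * BLOCK_SIZE
    blocks = [padded[i:i + BLOCK_SIZE] for i in range(0, len(padded), BLOCK_SIZE)]
    for number, block in enumerate(blocks):
        final_flag = (number == len(blocks) - 1)   # last block is flagged
        state = (state + mix(state, block, number, final_flag)) & MASK64
    return state.to_bytes(8, "little").hex()

print(toy_blake2_like(b"hello world"))
```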
Theoretically BLAKE2
is stronger than classical hashes, because extra data is mixed into
each block that is processed, and that extra data cannot simply be
influenced by passing input to the hash. This makes computing data to go from state A to
state B more difficult, because you would need a different set of
data to do so depending on where the block you’re trying to replace
is. Calculating a hash from another hash for data + data2 is more
difficult, because of the flag change. The state for data would be
different if more blocks were appended to the data.
Keccak
The actual winner of
the SHA-3 competition was the Keccak algorithm. It was chosen because
it was really different from classical hashes (not necessarily a good
thing), and really fast in hardware implementations.
Keccak works as
follows:
- Initialize a large “state” of a particular size, with zeros.
- Break the data up into a series of blocks of a smaller particular size.
- Take each block and merge it into the larger “state” using bit operations. Each block is processed with some fixed sparse data.
- Combine each of those states.
- The final block has two bits flipped.
- Return a small portion of the state as the result.
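Again, purely as an illustration, a toy sponge-style sketch of those steps might look like this. The state size, rate, and “permutation” are invented for this post and are nothing like the real Keccak-f permutation or its padding rule.

```python
# Toy sponge-style sketch.  The "permutation" here is invented purely
# for illustration and is nothing like the real Keccak-f permutation.

STATE_SIZE = 32      # total state in bytes (the real Keccak-f[1600] uses 200)
RATE = 8             # bytes of state that each data block is XORed into
OUTPUT_SIZE = 8      # only a small portion of the state is returned

def permute(state):
    """Made-up invertible shuffling of the whole state."""
    out = bytearray(STATE_SIZE)
    for i in range(STATE_SIZE):
        rotated = ((state[i] << 3) | (state[i] >> 5)) & 0xFF
        out[(i * 7 + 3) % STATE_SIZE] = rotated ^ ((i * 0x9E) & 0xFF)
    return bytes(out)

def toy_sponge_hash(data):
    state = bytes(STATE_SIZE)                      # initial state: all zeros
    # Final-block bit flips, loosely analogous to Keccak's padding:
    # one bit right after the data, one at the end of the block.
    padded = bytearray(data) + b"\x01"
    padded += b"\x00" * (-len(padded) % RATE)
    padded[-1] |= 0x80
    for i in range(0, len(padded), RATE):
        block = padded[i:i + RATE]
        # XOR the block into the first RATE bytes of the state, then permute.
        state = permute(bytes(s ^ d for s, d in zip(state[:RATE], block)) + state[RATE:])
    return state[:OUTPUT_SIZE].hex()               # return a truncated portion

print(toy_sponge_hash(b"hello world"))
```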
Like BLAKE2, Keccak
aims to be stronger than classical hashes because there is state data
larger than the size of the block, which cannot be immediately
influenced by data (but can still be influenced). Calculating a hash based upon another with
appended data is also difficult, because the result is a truncated
state. The bit flips in the final block would help as well.
My thoughts
After implementing
Keccak and understanding its design, I became alarmed by how much is
missing. The use of pure bit operations makes it easier to compute in
reverse, aside from a few AND operations. The computation of the
final state uses bit flips inside the block, as opposed to outside of
it in the rest of the state, making it easier to tamper with (although
theoretically still difficult). But most importantly, there is an utter
lack of a block counter or data size anywhere.
Classical hash
algorithms make it difficult to insert blocks anywhere in the
middle of data and get a matching hash. Even if you were able to
compute some data to go from state A back to state A, you could not
insert this somewhere, because the data size at the end must then be
different, resulting in a different hash. The same goes for BLAKE2,
because of the block counter included with each block it processes.
Keccak has absolutely nothing here.
Keccak’s initial
state is all zeros, something which every Keccak hash must use. If
you could compute data which is a multiple of the block size which
when processed in Keccak would go from state all zeros to state all
zeros, you have just broken Keccak. You would be able to prepend this
data as many times as you want to the beginning of any other set of
data, and produce the exact same hash. This would apply to every
single Keccak hash produced; it targets all Keccak hashes in existence.
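To make the concern concrete, here is a deliberately weakened toy sponge with a state so tiny that such a zero-to-zero block sequence can actually be found by brute force. To be clear: nothing here says such a sequence can be found for the real Keccak-f permutation; the sketch only shows what the consequence would be if one ever were.

```python
# A deliberately weakened toy sponge with a tiny state, so that a
# "zero state back to zero state" block sequence can be found by brute
# force.  The permutation is made up; this demonstrates the consequence
# of such a sequence existing, not that one exists for real Keccak.

RATE = 2             # bytes absorbed per block
CAPACITY = 1         # hidden byte of state
STATE_SIZE = RATE + CAPACITY
OUTPUT_SIZE = 2

def permute(state):
    """Tiny invertible made-up permutation (nothing like Keccak-f)."""
    a, b, c = state
    a = (((a << 3) | (a >> 5)) & 0xFF) ^ 0x5C
    b = (b + a) & 0xFF
    c = c ^ b ^ 0xA7
    return bytes((c, a, b))                        # also shuffle positions

def permute_inverse(state):
    c, a, b = state                                # undo the position shuffle
    c = c ^ b ^ 0xA7
    b = (b - a) & 0xFF
    a = a ^ 0x5C
    a = ((a >> 3) | (a << 5)) & 0xFF
    return bytes((a, b, c))

def absorb(state, block):
    mixed = bytes(s ^ d for s, d in zip(state[:RATE], block)) + state[RATE:]
    return permute(mixed)

def toy_hash(data):
    # Keccak-style padding: bit flips only, no data size anywhere.
    padded = bytearray(data) + b"\x01"
    padded += b"\x00" * (-len(padded) % RATE)
    padded[-1] |= 0x80
    state = bytes(STATE_SIZE)                      # all-zero initial state
    for i in range(0, len(padded), RATE):
        state = absorb(state, padded[i:i + RATE])
    return state[:OUTPUT_SIZE].hex()

# Search for two blocks that take the all-zero state back to all zeros.
target = permute_inverse(bytes(STATE_SIZE))        # state that permutes to zeros
prefix = None
for first in range(256 ** RATE):
    block1 = first.to_bytes(RATE, "little")
    mid = absorb(bytes(STATE_SIZE), block1)
    if mid[RATE:] == target[RATE:]:                # hidden capacity bytes line up
        block2 = bytes(m ^ t for m, t in zip(mid[:RATE], target[:RATE]))
        prefix = block1 + block2
        break

assert absorb(absorb(bytes(STATE_SIZE), prefix[:RATE]), prefix[RATE:]) == bytes(STATE_SIZE)

# The prefix can now be prepended to ANY message without changing its hash.
msg = b"any message at all"
print(toy_hash(msg))
print(toy_hash(prefix + msg))            # identical to the hash of msg alone
print(toy_hash(prefix + prefix + msg))   # still identical
```

With the real 1600-bit Keccak state, a brute-force search like this is of course out of the question; the concern is only whether such a sequence could be found some other way.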
Now, I’m no expert
cryptographer. Perhaps Keccak’s bit operations ensure that a state
of all zeros could never be produced from it. Maybe there even exists
a proof for this which is in a document or article I haven’t seen yet.
However, if there is no such proof, then Keccak may be broken much
worse than even the classical hash algorithms are. With the classical
algorithms that are broken, you generally have to break each hash
with a very large amount of computations specific to each hash result
you want broken. With Keccak here, once a prefix becomes known, you
can just prepend it, no computation necessary.
Improving Keccak
Of course, if Keccak is indeed broken in this fashion, it’s not hard
to fix. The data size could be processed at the end, or a block counter
could be used alongside each block as in BLAKE2; the state is certainly
large enough to handle it.
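As a rough sketch of that block-counter idea only (not a vetted design), the counter could be XORed into the capacity portion of the state before each permutation call, so that identical blocks absorbed at different positions no longer mix in identically:

```python
# Rough sketch only: a toy sponge absorb step with a block counter
# XORed into the capacity portion of the state.  This illustrates the
# idea; it is not a vetted or analyzed construction.

RATE = 8
STATE_SIZE = 32

def permute(state):
    out = bytearray(STATE_SIZE)
    for i in range(STATE_SIZE):
        rotated = ((state[i] << 3) | (state[i] >> 5)) & 0xFF
        out[(i * 7 + 3) % STATE_SIZE] = rotated ^ ((i * 0x9E) & 0xFF)
    return bytes(out)

def absorb_with_counter(state, block, counter):
    # The block goes into the rate as usual; the block counter goes into
    # the capacity, which input data cannot reach directly.  Prepending
    # extra blocks would shift every later block's counter, so even a
    # "zero to zero" prefix would change the final result.
    rate = bytes(s ^ d for s, d in zip(state[:RATE], block))
    capacity = bytearray(state[RATE:])
    for i, byte in enumerate(counter.to_bytes(8, "little")):
        capacity[i] ^= byte
    return permute(rate + bytes(capacity))

def toy_counter_hash(data):
    padded = bytearray(data) + b"\x01"
    padded += b"\x00" * (-len(padded) % RATE)
    padded[-1] |= 0x80
    state = bytes(STATE_SIZE)
    for number, i in enumerate(range(0, len(padded), RATE)):
        state = absorb_with_counter(state, padded[i:i + RATE], number)
    return state[:8].hex()

print(toy_counter_hash(b"hello world"))
```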
If one were to change Keccak, I’d also move the bit flips at the end to
occur beyond the block size, into the rest of the state, just to make
it more difficult to tamper with the final result. In the meantime
though, is Keccak, and therefore SHA-3, even safe to use?