How do you stop an attacker from just changing a hash?

In the previous episode, I explained how hashes give you the best error detection even when under a direct attack. But the best hash function by itself is not enough. This episode will explain step by step how an attacker can defeat simple attempts to protect data and what you can do about it.

Make sure to listen to the full episode, where I describe how a second hash creates something called an HMAC and how it can be used to prevent length extension attacks. You can also read the full transcript below.

Transcript

In the previous episode, I explained how hashes give you the best error detection even when under a direct attack. But the best hash function by itself is not enough. This episode will explain step by step how an attacker can defeat simple attempts to protect data and what you can do about it.

Let’s start with this scenario: You’re designing a game that allows a player to log in and then issue instructions to a central server that tell the server where to move the player’s character. If you were playing this game, would you want your opponents to be able to issue commands as if the commands came from you? Not only would the game be unfair, but nobody would want to play anymore once the vulnerability became known. So you need to design the game server so it won’t accept any false or malicious commands from other players.

If you’re only worried about detecting accidental errors made during transmission of the instructions as the messages travel to the server computer, then you could use a hash. Just create a hash of each instruction and send that along with each instruction. The server can then perform the same hash, compare them, and ask for the instructions to be retransmitted if the computed hash doesn’t match the hash that was sent. This means that your game must remember commands for at least a short while in case they need to be sent again.
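Here's a minimal sketch of that idea in Python. It assumes SHA-256 as the hash function, and the names make_packet and arrived_intact are made up for illustration. Nothing here is secret, so it only catches accidental changes.

```python
import hashlib

def make_packet(command: bytes):
    """Client side: pair the command with its SHA-256 digest."""
    return command, hashlib.sha256(command).digest()

def arrived_intact(command: bytes, sent_digest: bytes) -> bool:
    """Server side: recompute the digest and compare it with the one sent."""
    return hashlib.sha256(command).digest() == sent_digest

command, digest = make_packet(b"move forward")

print(arrived_intact(command, digest))          # True
print(arrived_intact(b"move forwarx", digest))  # False, so ask for a resend
```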

There are things you can do that would allow the server to sometimes correct an error on its own without asking for the data to be sent again. That’s a whole other topic and I’ll explain error correction in a future episode.

The problem with sending a simple hash is that an attacker can send commands intended to disrupt another player’s character. All that attacker has to do is send a command that causes the other character to jump off a cliff and the attacker wins the game. The server accepts the command because it has a valid hash, and a valid hash only shows that the command wasn’t accidentally changed during transit. It says nothing about who actually sent the command.
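To see why the plain hash gives no protection here, consider this short sketch. The command format and the server_accepts function are hypothetical. The point is that the attacker needs nothing secret, because anybody can run the same public hash function.

```python
import hashlib

def server_accepts(command: bytes, sent_digest: bytes) -> bool:
    """A server that only checks a plain hash (hypothetical)."""
    return hashlib.sha256(command).digest() == sent_digest

# The attacker writes a command for somebody else's character and
# simply hashes it with the same public function.
forged = b"player=alice;action=jump_off_cliff"
forged_digest = hashlib.sha256(forged).digest()

print(server_accepts(forged, forged_digest))  # True
```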

How do we fix this?

One of the most important things to realize about security is that there are boundaries that define what’s trusted vs. everything else outside of the boundary that’s not trusted. At some point, something has to be trusted or you might as well go back to designing a single user game that can only be played on a single computer. Then if that user wants to cheat, no problem. The game will quickly become boring though.

There’s very little you can do by trying to add security to the player’s computer. It has to be considered outside of the security boundary and therefore a source of possible attacks.

Alright, so back to the game design. Maybe your first thought is to add a MAC or a message authentication code instead of a simple hash. A MAC is really nothing more than a hash of some secret key along with the message. The hash function doesn’t have to change and produces the same type of output. Just adding a key to the message results in a different hash value.

So where does this key come from? It has to come from the server because that’s inside the trusted zone. In other words, without realizing it, when players agree to play your game, what they’re actually saying is that they agree to trust your server to make sure that the game is fair. They’re not going to trust some other player’s computer.

When a player signs into your server to begin playing, they each get their own secret key. Now when sending commands, the game first starts out with the secret key, then adds a command. The combined key plus command is then hashed and the resulting hash value is sent along with the message.

The server needs to remember each secret key for each player because when it gets a command along with a hash, it does the same thing by hashing the key plus the command and then comparing the calculated hash with the hash that was sent. If the two values match, then it proves two things now. Not only does it prove that the command wasn’t changed, but it also proves that whoever sent the command must also know the secret key. Notice that the secret key is never actually sent with each command. It only needs to be sent when the player logs in. There are ways to do this safely that I’ll also have to explain in a future episode.
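The login and verification steps might look something like this sketch. It assumes SHA-256 and a 32-byte random key, and the names player_keys, login, simple_mac, and server_accepts are invented for this example. How the key reaches the player safely is not shown.

```python
import hashlib
import hmac
import secrets

player_keys = {}  # server-side table: player name -> secret key

def login(player: str) -> bytes:
    """Create a fresh secret key for the player and remember it."""
    key = secrets.token_bytes(32)
    player_keys[player] = key
    return key  # assumed to reach the player over a protected channel

def simple_mac(key: bytes, command: bytes) -> bytes:
    """Hash of the secret key followed by the command."""
    return hashlib.sha256(key + command).digest()

def server_accepts(player: str, command: bytes, sent_mac: bytes) -> bool:
    key = player_keys.get(player)
    if key is None:
        return False
    # compare_digest compares without leaking timing information
    return hmac.compare_digest(simple_mac(key, command), sent_mac)

alice_key = login("alice")
mac = simple_mac(alice_key, b"move forward")
print(server_accepts("alice", b"move forward", mac))  # True

# Without Alice's key, an attacker cannot produce a matching value.
guess = simple_mac(secrets.token_bytes(32), b"jump off cliff")
print(server_accepts("alice", b"jump off cliff", guess))  # False
```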

What’s happening now with your game is called authentication. And a MAC can be used to not only detect errors but to authenticate messages too.

You might think the solution is done. That there’s no way for an attacker to replace a hash when the data contains something secret. But there’s a problem with most hash functions. Many hash functions are designed to produce a hash by accepting blocks of data of fixed sizes. The hash function doesn’t care how many blocks of data there are. It just allows a program to keep sending it more and more data as needed. The important point is that after hashing each block, the current hash value represents the entire state of the process so far.
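You can see this "keep sending more data" behavior in Python's hashlib, used here as one example of such a hash interface. The sketch feeds a message in two pieces and gets the same result as hashing it all at once. The command strings are made up.

```python
import hashlib

# Feed the data in pieces, as a program would while data arrives.
h = hashlib.sha256()
h.update(b"move forward;")
h.update(b"move left;")
chunked = h.digest()

# Hash the same bytes in one call.
one_shot = hashlib.sha256(b"move forward;move left;").digest()

print(chunked == one_shot)  # True
```

The hash object carries everything it needs between calls to update, which is exactly the property the next part of the episode relies on.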

And here’s the really important thing to consider. When the secret key is added to the original message and a hash value is generated, that hash value already factors in the secret. An attacker doesn’t even need to know the secret and just needs to pick up where the hash left off. Nothing in the original message can be changed, but that doesn’t stop an attacker from adding more commands to the message.

Let me give you an example. You’re playing the game and issue a command to move your character forward. This command is hashed with your secret and sent to the server. An attacker cannot change it without that change being detected. But if an attacker can obtain that command along with its MAC, then the attacker can load that MAC into its hash function and continue where it left off. All the attacker needs to do is add another command to move your character to the left and generate a new hash. Sending this command might seem confusing. After all, what will the server receive? It will get a command to move forward and then to move left. The server will check the MAC by using its copy of your secret key. Everything will verify just fine. What the server actually decides to do doesn’t really matter because whatever it decides to do, it will still be accepting a command that didn’t come from you.
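The following sketch plays out that attack against SHA-256, which is my choice for the demonstration. Python's hashlib won't let you load a hash value back in as a starting state, so the sketch includes its own small compression function. It also assumes the attacker knows or guesses the key length, and the key and command strings are hypothetical. One detail the spoken description skips: the forged message also contains the padding bytes that the hash function added internally to the original message, shown here as glue.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

def _icbrt(n):
    """Largest integer x with x**3 <= n (exact integer binary search)."""
    lo, hi = 0, 1 << 40
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

# SHA-256 round constants: first 32 fractional bits of the cube roots
# of the first 64 primes (2 through 311).
_primes = [p for p in range(2, 312) if all(p % d for d in range(2, p))]
K = [_icbrt(p << 96) & MASK for p in _primes]

def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def _compress(state, block):
    """Mix one 64-byte block into the eight-word SHA-256 state."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = _rotr(w[i - 15], 7) ^ _rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = _rotr(w[i - 2], 17) ^ _rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        t1 = (h + (_rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25))
              + ((e & f) ^ (~e & g)) + K[i] + w[i]) & MASK
        t2 = ((_rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22))
              + ((a & b) ^ (a & c) ^ (b & c))) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]

def sha256_padding(total_len):
    """Padding SHA-256 appends to a message of total_len bytes."""
    zeros = b"\x00" * ((55 - total_len) % 64)
    return b"\x80" + zeros + struct.pack(">Q", total_len * 8)

def extend(known_mac, known_len, extra):
    """Continue hashing from known_mac; known_len counts key plus message."""
    glue = sha256_padding(known_len)
    state = list(struct.unpack(">8I", known_mac))  # pick up where it left off
    tail = extra + sha256_padding(known_len + len(glue) + len(extra))
    for i in range(0, len(tail), 64):
        state = _compress(state, tail[i:i + 64])
    return glue, struct.pack(">8I", *state)

# Player and server share this key (hypothetical value).
key = b"alice-secret-key"
command = b"move forward;"
mac = hashlib.sha256(key + command).digest()

# Attacker: sees command and mac, never the key itself.
glue, forged_mac = extend(mac, len(key) + len(command), b"move left;")
forged_command = command + glue + b"move left;"

# Server: checks the forged message with its copy of the key.
print(hashlib.sha256(key + forged_command).digest() == forged_mac)  # True
```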

This attack is called a length extension attack and the solution is not very complicated. You really just need to know about it so that instead of creating a simple MAC, you create something called an HMAC. This is a hash-based message authentication code.

I know, you might be thinking that a MAC already uses a hash, so what does it mean to hash something that’s already been hashed? What value could that provide?

There’s a small twist to this that makes all the difference. What we want to protect against is somebody adding more data and using the original hash as the starting point to get around not knowing the secret.

The way this works starts out just like a MAC, by hashing the secret plus the original message. But instead of stopping there, an HMAC hashes the secret key again, this time combined with the first hash. There are two hashes but we’re not hashing the same thing twice. We are adding the secret key in twice, once with the message and then again with that first hash.
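Here is a sketch of those two hashes written out by hand and checked against Python's built-in hmac module. It assumes SHA-256. The standard HMAC construction adds one detail to the description above: the key is padded to the hash's block size and mixed with two different constant bytes, so the inner and outer hashes each use a different variation of the same secret. The function name manual_hmac_sha256 and the sample key are made up.

```python
import hashlib
import hmac

def manual_hmac_sha256(key: bytes, message: bytes) -> bytes:
    block_size = 64  # SHA-256 works on 64-byte blocks
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()  # long keys are hashed first
    key = key.ljust(block_size, b"\x00")    # short keys are zero-padded

    inner_key = bytes(b ^ 0x36 for b in key)
    outer_key = bytes(b ^ 0x5C for b in key)

    # First hash: secret plus the message.
    inner = hashlib.sha256(inner_key + message).digest()
    # Second hash: secret plus the first hash.
    return hashlib.sha256(outer_key + inner).digest()

key = b"alice-secret-key"  # hypothetical
message = b"move forward;"

by_hand = manual_hmac_sha256(key, message)
library = hmac.new(key, message, hashlib.sha256).digest()
print(by_hand == library)  # True
```

In real code you would call the library version rather than writing your own. The hand-written one is only here to show where the secret goes in twice.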

Now, if an attacker tries to extend the message and generate a new hash, then the attacker will need the secret key in order to perform the second hash. And as long as the secret key remains secret, then an HMAC will protect your messages from length extension attacks.
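For the game server, the check could be sketched like this using Python's hmac module with SHA-256. The function names sign_command and server_accepts, the key, and the command strings are all hypothetical.

```python
import hashlib
import hmac

def sign_command(key: bytes, command: bytes) -> bytes:
    """Player side: compute the HMAC that travels with the command."""
    return hmac.new(key, command, hashlib.sha256).digest()

def server_accepts(key: bytes, command: bytes, sent_tag: bytes) -> bool:
    """Server side: recompute the HMAC with its copy of the key."""
    return hmac.compare_digest(sign_command(key, command), sent_tag)

key = b"alice-secret-key"
command = b"move forward;"
tag = sign_command(key, command)

print(server_accepts(key, command, tag))                  # True
# Extra commands appended by someone without the key are rejected.
print(server_accepts(key, command + b"move left;", tag))  # False
```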