Avoid being predictable. This advice applies to almost everything you do as a programmer. This episode will focus on the filesystem and how being predictable can make it much easier for an attacker to gain control.

We use files to store information and configuration. And we also use files to communicate. Especially between companies or departments within a company. Maybe you’re writing an application that needs to wait on some information that will be sent to you when it’s ready. It’s a lot of information so you agree with the other team that they can just write it all to a file and send it to you when it’s ready. Your application just needs to wait for the file to appear, open it, and start reading.

The only question is where should the file be placed and what should it be named. This is where the problem of predictability comes in.

Listen to the full episode for examples of how you can solve the predictability problem. You’ll learn why simple solutions are not enough and how you can use an HMAC, or hash-based message authentication code, to help. You can also read the full transcript below.

Transcript

This advice applies to almost everything you do as a programmer. This episode will focus on the filesystem and how being predictable can make it much easier for an attacker to gain control.

I remember once when living in Washington, the city was doing roadwork and they removed the center turn lane and replaced it with a median that cars could not cross. This was in a busy commercial street with lots of shops. I guess they were worried that the extra turning was causing too many accidents. It definitely inconvenienced me because I could no longer turn left directly into my favorite lunch place. I now had to drive further down the road and make a u-turn and come back.

The interesting thing about this was the bank that was right next to where I went to eat. That bank was able to get the city to change their mind. And they had to open up a special turn lane just for the bank. Was it because the bank had enough money to pay for the changes? I’m sure it cost a lot to tear down the new median and put a turn lane back where cars used to be able to turn before the work began.

The real reason why the bank was able to make the city reverse their decision did have to do with money. But not because the bank paid the city. It was because the bank was able to show that with the new turn restrictions in place, there was only a single route for an armored car to take in order to get to the bank. A single path that the armored cars would always be taking. Just think about that for a minute.

If you were planning to set a trap for an armored car to steal the money inside, then wouldn’t you want to know which roads the armored car would be driving on? Normally, the exact path is kept secret and changes often. But when there’s only one path possible, then it’s no secret anymore.

This story had a happy ending for me. I was able to use the new turn lane to turn into the bank, drive through their parking lot and then get back on the road for a quick hop into the restaurant. The bank might not have been too happy about me and all the other cars making use of their driveway to make a u-turn, but we weren’t allowed to make a direct u-turn so the bank was put to good use.

The lesson here is that predictability leads to security vulnerabilities. So avoid being predictable. How does this relate to filesystems?

We use files to store information and configuration. And we also use files to communicate. Especially between companies or departments within a company. Maybe you’re writing an application that needs to wait on some information that will be sent to you when it’s ready. It’s a lot of information so you agree with the other team that they can just write it all to a file and send it to you when it’s ready. Your application just needs to wait for the file to appear, open it, and start reading.

The only question is where should the file be placed and what should it be named. This is where the problem of predictability comes in.

Maybe the file location is protected so that only the other team has permission to write anything there. But it’s more likely that you might use a more public location. I’m not suggesting that the file be placed somewhere the entire world can see and access. But there’s often a place that seems secure and is only used by a smaller set of people. It’s not open to the whole world. But it’s not locked down to just the one team either.

The problem with this is that anybody who has access to write a file to that location and who knows the name of the file your application is waiting on can trick your application into reading fake data.

Or maybe even an empty file with the right name at the right location is all it takes to cause your application to fail or crash. That could be the goal of an attacker. We don’t know what an attacker wants to do. And maybe an attack is accidental. Maybe some other customer places a file with the vulnerable name in the same location not intending to cause any problems at all. They had no idea you would be looking for that file.

This is more likely if you agree on a simple name such as data.txt.

A good solution to this needs a bit more code and some coordination on both sides. But it is possible to avoid being predictable.

What you want is to change the name of the file. And possibly even the location. Changing just the name would be like trying to disguise an armored car with a different color each delivery. Changing the location too adds an extra layer of protection. You’ll have to decide how much protection your app needs.

I’ve mentioned before that anything can be broken into. My dad told me when I was young that locks are there only to keep the honest people out. If somebody really wants to defeat your security, then it’s going to happen. It’s our job to make that as difficult as possible within the expected realm of what’s likely to happen.

If you’re protecting something very important, then you need extra security. I remember another story that I heard a long time ago. It said that all bicycles weigh 50 pounds. An old 50-pound bicycle made from steel tubes and rusted in many places needs no lock. And a new carbon fiber bicycle weighing just 5 pounds needs a lock weighing 45 pounds.

Now, you could put in a simple change where you read the name and location of a data file from a configuration file. This configuration file would be locked down so that only your application could read it. This lets you change the name and location of the data file by just changing the configuration file. You don’t need to change the source code of your application, build it, and deploy a new application every time you want to change the name and location of the data file. Then you just agree with the other team or company on a regular schedule where the name and location of the data file will be changed. This is a manual approach. It’s simple but needs a lot of human intervention to keep it running.
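Here’s a minimal sketch of that manual approach in Python. The JSON format, the key names data_dir and data_name, and the function name load_data_file_path are all made up for illustration. The permission check assumes a POSIX-style system and is only one possible way to confirm the config file is locked down.

```python
import json
import os
import stat
from pathlib import Path


def load_data_file_path(config_path):
    """Return the data file path described by a JSON config file.

    Hypothetical config layout: {"data_dir": "...", "data_name": "..."}
    """
    config_path = Path(config_path)

    # On POSIX systems, refuse a config file that the group or other
    # users have any access to. Only the owner should have access.
    if os.name == "posix":
        mode = stat.S_IMODE(config_path.stat().st_mode)
        if mode & (stat.S_IRWXG | stat.S_IRWXO):
            raise PermissionError(f"{config_path} is accessible to other users")

    with config_path.open() as f:
        config = json.load(f)

    # The name and location come from the config file, not the source code.
    return Path(config["data_dir"]) / config["data_name"]
```

Changing the name or location is then just an edit to the config file on both sides. There’s nothing to rebuild or deploy.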

What you don’t want is to change the name and location in a predictable way. Maybe you try putting the current year, month, and day in the file name so it will change automatically each day. This is still predictable. You might as well call the file important_data_attack_here.txt.
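To see why a date in the name doesn’t help, here’s a short Python sketch. The data_YYYYMMDD.txt naming pattern and both function names are assumptions for illustration. The point is that anybody who has seen one file name can work out the next ones without knowing any secret.

```python
from datetime import date, timedelta


def dated_name(day):
    """Build a date-stamped file name (hypothetical naming pattern)."""
    return f"data_{day:%Y%m%d}.txt"


def guess_upcoming_names(today, days=3):
    """What an attacker can do: list the names for the next few days."""
    return [dated_name(today + timedelta(days=n)) for n in range(1, days + 1)]


# An attacker who saw data_20240228.txt can plant tomorrow's file today.
print(guess_upcoming_names(date(2024, 2, 28), days=2))
# prints ['data_20240229.txt', 'data_20240301.txt']
```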

A better solution is to combine two pieces of information in the name of the file. One is specific to the team or company that creates the file but still generic and non-identifiable. In other words, don’t use the company or team name. Use a series of random looking numbers and letters that are assigned to that source of data.
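One way to come up with that kind of identifier is to let a random number generator pick it once and then assign it to the source. This is only a sketch. The function name new_source_id and the length of 8 bytes are my own assumptions.

```python
import secrets


def new_source_id(num_bytes=8):
    """Return a random hex string to assign to one source of data.

    It says nothing about which team or company it belongs to.
    """
    return secrets.token_hex(num_bytes)


# Generate this once per source and record it on both sides.
print(new_source_id())
```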

If this is all you do though, all you’ve done is put a really good disguise on the armored car. It’s still predictable.

The second piece of information will need to be generated. This is the part that changes. And this is the thing that solves the predictable vulnerability. There might still be other vulnerabilities. But you’ve just made an attacker work harder with this second piece.

You generate this second piece through a keyed cryptographic hash. Listen to episode 173 about hash-based message authentication codes (HMACs) for more information. You take a secret key that both you and the other team or company know and use it as the key for the hash that generates the changing part of the name. If done right, with the hash also computed over the file’s contents, this will not only make the file name unpredictable but will also verify that the contents of the file haven’t changed or been tampered with. You really need to listen to the other episode to learn the full story.
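Here’s a rough sketch of how the two pieces could fit together, using Python’s standard hmac module. Several things here are assumptions of mine and not something prescribed in the episode: the function names, the use of a delivery label such as a date as the input that changes, the "|" separator, SHA-256, and cutting the name down to 32 hex characters. Both sides would have to agree on all of these details, and a real system might use separate keys for the name and for the contents.

```python
import hashlib
import hmac


def transfer_file_name(secret_key, source_id, delivery):
    """Derive the file name for one delivery from the shared secret key.

    source_id: the random-looking identifier assigned to the sender.
    delivery:  a value both sides already agree on that changes each
               time (assumed here to be a date string).
    """
    message = f"{source_id}|{delivery}".encode()
    tag = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return f"{source_id}_{tag[:32]}.dat"


def content_tag(secret_key, data):
    """HMAC over the file contents, sent along with the file."""
    return hmac.new(secret_key, data, hashlib.sha256).hexdigest()


def verify_content(secret_key, data, expected_tag):
    """Check the contents against the tag using a constant-time compare."""
    return hmac.compare_digest(content_tag(secret_key, data), expected_tag)
```

The receiver computes the same name from the same key and waits for exactly that file. Without the key, the delivery label alone isn’t enough to work out the name. After reading the file, the receiver recomputes the content tag and rejects the file if it doesn’t match.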

That solution required both sides to know a secret key. There’s another form of cryptography that uses two keys. One is public and the other is private. This is the system used on the internet when you visit a protected website over HTTPS, using TLS, the successor to SSL.

The main thing to understand is that you need to use some form of cryptography when coming up with the name of the files used for communication. Anything else is just too predictable.