Journaling records extra information in case there are problems.
If your computer loses power or maybe you’re saving your files to a removable drive and you eject the drive without notice, then you might lose your work.
In order to prevent larger problems like this, filesystems go through checks to make sure they’re in good shape before you can use them. This check is sometimes called fsck on Linux and Mac and chkdsk on Windows.
Journaling can help prevent this check and help the filesystem recover in case there are problems. But it might not help restore your data.
Listen to the full episode or read the full transcript below to learn about different types of journals such as a metadata journal, a data journal, and a copy-on-write approach.
Transcript
We normally think of our filesystems as a safe place to write information. After working on a document for hours, you want to be able to press save and feel good that your work will still be around when you need it.
But the real story is this isn’t always what happens. If your computer loses power or maybe you’re saving your files to a removable drive and you eject the drive without notice, then you might lose your work.
If that’s not bad enough, the initial problem, if left uncorrected, could lead to your entire filesystem becoming corrupt. This could cause you to lose information in other files completely unrelated to your recent work.
In order to prevent larger problems like this, filesystems go through checks to make sure they’re in good shape before you can use them.
One thing to understand is that the filesystem has different priorities than you might think. It wants to keep everything organized. Having two different files both think they own the same storage area on a disk is bad. It’s bad for your data too. But the filesystem is more concerned with the fact that the files are mixed up. This is what the system check is looking for. Some problems can be corrected easily if they’re discovered right away. Other problems might require you to make some decisions about what to keep and what to throw away.
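To make the idea of two files owning the same storage area more concrete, here’s a minimal sketch of that one check. The allocation table, the file names, and the function name are all made up for illustration. A real tool like fsck or chkdsk checks far more than this.

```python
from collections import defaultdict

def find_cross_linked_blocks(block_map):
    """Return {block_number: [file names]} for blocks claimed by more than one file."""
    owners = defaultdict(list)
    for name, blocks in block_map.items():
        for block in blocks:
            owners[block].append(name)
    # Keep only the blocks that have more than one owner.
    return {block: sorted(names) for block, names in owners.items() if len(names) > 1}

# Hypothetical allocation table: file name -> block numbers the file claims.
allocation = {
    "report.doc": [10, 11, 12],
    "photo.jpg": [12, 13],
    "notes.txt": [20],
}

conflicts = find_cross_linked_blocks(allocation)
print(conflicts)  # {12: ['photo.jpg', 'report.doc']}
```

Notice that the check can tell you block 12 is claimed twice, but it can’t tell you which file should keep it. That’s the kind of decision you might be asked to make.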
Journaling can help prevent this and help the filesystem recover in case there are problems. But it might not help restore your data. Journaling can help eliminate the need for a full check and find problems sooner.
Think of it like this. Let’s say that you’re renting a house when some natural disaster happens. Maybe strong winds tear the roof off the house. Or heavy rains cause flooding. Now the owner of the house has insurance which should help fix the house. You can think of this insurance like a journal. But what about your furniture and clothes? Your pictures and electronics? You need a different kind of insurance for these things. The journal won’t help you.
This may not be the best example. Because it really depends on the type of journal. Some might be able to take care of everything. The point is, just like in real life where you have to pay attention to details of your insurance, you should also pay attention to the details of your filesystem journal.
The first thing to find out is if you have a journal at all. Without a journal, if there are any problems, the operating system will need to do a full scan of the filesystem to check for errors when it starts back up. This could take a long time. It’s not like you can just look up and see a big hole in your roof. A large filesystem can have problems anywhere and it all has to be scanned. This can take minutes or even hours to complete.
During this time, your computer is waiting and can’t finish mounting that filesystem. This could delay the entire startup sequence. For your own personal computer, that’s bad enough. For a server that your company relies on, not being able to start up right away is even worse.
The reason this process takes so long is that your filesystem has no idea what it was doing before it crashed. There could be one problem or hundreds. And they could be anywhere.
A journal helps because it allows the filesystem to first record what it’s about to do. Then if anything happens, it at least knows where to look for problems. If everything goes smoothly and completes okay, then another journal entry can be made to record success. Or if there was a problem, then the original journal entry will still be there with details about what the filesystem was going to do next.
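Here’s a rough sketch of that record-first idea. It keeps the journal in a Python list instead of on disk, and the class name, method names, and entries are assumptions made up for this example, not how any particular filesystem does it.

```python
class Journal:
    """A toy in-memory journal: record intent first, record success afterwards."""

    def __init__(self):
        self.entries = []  # stands in for an on-disk log

    def begin(self, op_id, description):
        # Written BEFORE the filesystem touches anything.
        self.entries.append(("begin", op_id, description))

    def commit(self, op_id):
        # Written only after the operation finished without problems.
        self.entries.append(("commit", op_id, None))

    def unfinished(self):
        """Return operations that began but never recorded success."""
        started = {}
        for kind, op_id, description in self.entries:
            if kind == "begin":
                started[op_id] = description
            elif kind == "commit":
                started.pop(op_id, None)
        return started

journal = Journal()
journal.begin(1, "create notes.txt")
journal.commit(1)
journal.begin(2, "grow report.doc to 4096 bytes")
# Imagine the power goes out here, so operation 2 never records success.

print(journal.unfinished())  # {2: 'grow report.doc to 4096 bytes'}
```

After a crash, the filesystem only has to look at the unfinished entries instead of scanning everything.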
What gets written to the journal determines how much can be repaired later if needed.
Let’s say the journal only records metadata. This keeps the journal small and fast. When you want to save a large file with hours of work put into making changes to that file, the journal will record things such as the name of the file and how big it is.
If the data fails to be written because the application crashed before you could save your changes, then no filesystem or journal can help you. Your work was gone before it could ever reach the filesystem. The only thing a journal can help you with is making sure that any changes made to disk are recorded.
When you insert the USB drive into a computer again, the journal will let the filesystem know if there were problems previously and where to look for those problems. It will then try to repair itself. It can finish a previous write operation by giving the file the correct size and the correct name. And as far as it’s concerned, the filesystem is back to normal and intact.
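A small sketch can show what that kind of repair does and doesn’t give you. This is only an imitation using ordinary files in a temporary folder. The journal entry, the file name, and the function name are hypothetical, and a real metadata journal works on internal filesystem structures, not on files like this.

```python
import os
import tempfile

def replay_metadata_entry(directory, name, size):
    """Make the file exist with the journaled name and size. Contents are not restored."""
    path = os.path.join(directory, name)
    # Create the file if the crash happened before it ever appeared.
    with open(path, "ab"):
        pass
    # Force the file to the size the journal says it should have.
    os.truncate(path, size)
    return path

with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical journal entry left behind by an interrupted save.
    entry = {"name": "report.doc", "size": 16}
    path = replay_metadata_entry(workdir, entry["name"], entry["size"])
    recovered_size = os.path.getsize(path)
    with open(path, "rb") as f:
        recovered_bytes = f.read()

print(recovered_size)  # 16
```

The file ends up with the right name and the right size, so the structure is consistent again. But the bytes inside are just filler (typically zeros), not the work you were trying to save.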
A different filesystem with a data journal might be able to go a step further. It’s a lot more work for the filesystem but it ensures that your new data gets written somewhere safe before modifying the original file.
If this initial save fails for any reason, then your original file is still intact, your filesystem is unchanged and still intact, and the only thing that needs to be done is to clear out the journal.
Assuming the journal can be written to first and the new file safely stored inside the journal, only then will the filesystem start making changes to the real filesystem. If any failures happen now, then the filesystem can recover later by trying again to copy the data from the journal into the real filesystem.
In other words, once your data makes it safely into a data journal, then it’s safe even if it hasn’t yet made it to its final location. This means that the data needs to be written twice. That’s what I mentioned earlier about this being a lot more work. Not all filesystems will have this kind of journal.
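Here’s a minimal sketch of that write-twice pattern, again using ordinary files. The JSON journal format, the file names, and the `crash_before_copy` switch are all assumptions made up to simulate a power loss. Real data journals also protect against a half-written journal entry, which this sketch doesn’t attempt.

```python
import json
import os
import tempfile

def journaled_write(journal_path, target_path, data, crash_before_copy=False):
    # First write: store the new data in the journal and flush it to disk.
    with open(journal_path, "w") as j:
        json.dump({"target": target_path, "data": data}, j)
        j.flush()
        os.fsync(j.fileno())
    if crash_before_copy:
        return  # simulate losing power right after the journal write
    apply_journal(journal_path)

def apply_journal(journal_path):
    """Second write: copy journaled data to its final location, then clear the journal."""
    if not os.path.exists(journal_path):
        return False  # nothing to recover
    with open(journal_path) as j:
        entry = json.load(j)
    with open(entry["target"], "w") as f:
        f.write(entry["data"])
        f.flush()
        os.fsync(f.fileno())
    os.remove(journal_path)
    return True

with tempfile.TemporaryDirectory() as workdir:
    journal_path = os.path.join(workdir, "journal.json")
    target_path = os.path.join(workdir, "report.txt")

    journaled_write(journal_path, target_path, "hours of work", crash_before_copy=True)
    existed_after_crash = os.path.exists(target_path)

    # What recovery might do the next time the filesystem is mounted.
    replayed = apply_journal(journal_path)
    with open(target_path) as f:
        recovered = f.read()
    journal_cleared = not os.path.exists(journal_path)

print(existed_after_crash, replayed, recovered, journal_cleared)
# False True hours of work True
```

The data gets written once to the journal and once to its final location. That’s the extra work, and it’s also why the crash in the middle didn’t lose anything.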
I’ll end this episode with one final thought. There are other kinds of journals with similar protection. One of my favorites is called copy-on-write. It works like how we might do things in real life.
Let’s say we want to build a new bridge over a river. How would we do this in real life? Well, we’d probably start by leaving the existing bridge in place and unaffected. We definitely wouldn’t tear down the current bridge and put up a note to remind ourselves that we intend to build a new bridge. And we also wouldn’t build a new bridge somewhere else to have a safe copy in case there were problems tearing down the current bridge and building the new one.
No, the best approach is to build the new bridge right next to the current bridge. In other words, we’d make a copy of the current bridge by building a new and better bridge. Once the better copy is done, then we can start redirecting cars onto the new bridge. And once the new bridge is being used, we can start work to tear down the old bridge.
With this approach, instead of writing the data in a journal, the filesystem writes the data directly in its new location. If it all goes well, then this copy becomes the current file.
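You can imitate the bridge approach in your own programs. The sketch below is an application-level analogy and not how a copy-on-write filesystem works internally, since those work with blocks and pointers instead of whole files. The function name and file names are made up. It relies on Python’s `os.replace`, which swaps the new file into place in a single step when both files are on the same filesystem.

```python
import os
import tempfile

def save_copy_on_write(path, data):
    """Build the new file next to the old one, then switch over in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # The new bridge: a temporary file in the same folder as the current file.
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # Redirect traffic: the new file takes over the name of the old one.
        os.replace(temp_path, path)
    except BaseException:
        # Something failed, so the old file is untouched. Clean up the unfinished copy.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "bridge.txt")
    save_copy_on_write(path, "old bridge")
    save_copy_on_write(path, "new bridge")
    with open(path) as f:
        final_contents = f.read()
    leftovers = sorted(os.listdir(workdir))

print(final_contents)  # new bridge
print(leftovers)       # ['bridge.txt']
```

If anything goes wrong while the new copy is being written, the current file is still there, just like the old bridge.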