
Filesystems allow you to refer to your content with different names.

You can usually get by just fine without linking files and directories, until a situation arises where linking would help.

A simple explanation of a link for right now is that it allows you to create multiple names for your files and directories so you can get to the content from different locations or through different paths. There are different kinds of links that you’ll learn about in this episode.

Listen to the full episode to learn about symbolic links (also called soft links), junction points, and hard links. You might be surprised that you use hard links all the time without doing anything special. Or you can read the full transcript below.

Transcript

You can usually get by just fine without linking files and directories. Until a situation arises where linking would help. Recognizing these situations and then knowing how to use linking will let your filesystem help you. If not, then you’ll just cause yourself more work. And if you ever come across a linked file without a full understanding, then you’ll be more likely to make a mistake.

If you make a mistake manually, then you might be able to stop and fix it. But if you write a program that misuses linking, then the results can be much worse.

A simple explanation of a link for right now is that it allows you to create multiple names for your files and directories so you can get to the content from different locations or through different paths. There are different kinds of links that I’ll explain in a moment.

First, why would you want different names? Won’t that just make things more complicated?

Sure, it can. Imagine a phone directory with your friend’s name and number. Now create a bunch of fake names with the same phone number and you’ll make a mess of things. It’s enough to make anybody think that links are bad. And this is part of the problem. You need to learn how to use links properly and when. Because there are times when links will help and other times when they’ll just make things worse.

A better example of where a link would help is this. Go back to the same phone directory and imagine you have a listing for a taxi company called Fred’s Town Cars. But you can never remember the name of the taxi company when you need it. It would help if you create a new entry called Taxi and instead of writing in a phone number, you just write the name of Fred’s Town Cars.

Now whenever you need a taxi, you look up Taxi and find the actual company name. This also helps if Fred’s Town Cars goes out of business and you need to find another company. Once you create a new entry, just update the entry for Taxi to point to the new company.

This system saves you time when you forget the actual company name. It’s a little more work to always go through the Taxi link entry. So you’d probably just use the link when you forget the name and otherwise go directly to the real phone entry. But imagine going to the Taxi entry and finding the contents of the real entry right away complete with the phone number. This is what a link in your filesystem would do. Sure, it points somewhere else but makes it seamless.

Okay, now that you know that links can actually help in certain cases, it’s time to explore what kinds of links you can make.

There are really only two kinds of links: symbolic links and hard links.

I’m not sure but I think the name soft links was created because soft is the opposite of hard. It’s also a more common word than symbolic and easier to remember. Just know that a symbolic link and a soft link are the same thing. You can call them by either name but symbolic is the more official of the two.

A symbolic link is just like the Taxi phone book entry we created. It has its own entry in the phone book and the contents of this entry just point to the name of the target of the link.

It’s possible for symbolic links to be broken. If you erase the entry in your phone book for Fred’s Town Cars, then this doesn’t do anything with the Taxi entry. The Taxi entry still thinks it points to Fred’s Town Cars which no longer exists. On your filesystem, if you delete the file that a symbolic link points to, then the symbolic link now points to something that no longer exists and you get an error when you try to open the link.
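Here’s a minimal sketch of a broken symbolic link using Python’s standard os module. The file names are made up to match the phone book example, and the sketch assumes a system where you’re allowed to create symbolic links (on Windows that can require extra privileges).

```python
import os
import tempfile


def dangling_link_demo():
    """Create a symbolic link, delete its target, and report what is left."""
    with tempfile.TemporaryDirectory() as folder:
        # Hypothetical names taken from the phone book analogy.
        target = os.path.join(folder, "freds_town_cars.txt")
        link = os.path.join(folder, "taxi")

        with open(target, "w") as f:
            f.write("555-0100\n")
        os.symlink(target, link)

        os.remove(target)  # erase the real entry; the link is not touched

        link_entry_exists = os.path.lexists(link)  # checks the link itself
        target_reachable = os.path.exists(link)    # follows the link
        try:
            open(link).close()
            open_failed = False
        except FileNotFoundError:
            open_failed = True
        return link_entry_exists, target_reachable, open_failed


if __name__ == "__main__":
    print(dangling_link_demo())
```

The link entry is still in the folder, but following it leads nowhere, so trying to open it fails.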

What happens if you delete the symbolic link itself? Only the link gets deleted. The target of the link remains unchanged. Think what would happen if you removed the link for Taxi from your phone book. The entry for Fred’s Town Cars would remain untouched. Although you’d be back to the same problem of trying to remember the taxi company name.
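The opposite case can be sketched the same way. This is only an illustration with made-up file names, assuming a system that lets you create symbolic links.

```python
import os
import tempfile


def remove_link_demo():
    """Delete a symbolic link and confirm the target is left alone."""
    with tempfile.TemporaryDirectory() as folder:
        # Hypothetical names taken from the phone book analogy.
        target = os.path.join(folder, "freds_town_cars.txt")
        link = os.path.join(folder, "taxi")

        with open(target, "w") as f:
            f.write("555-0100\n")
        os.symlink(target, link)

        os.remove(link)  # removes the link entry, not what it points to

        with open(target) as f:
            contents = f.read()
        return os.path.lexists(link), contents


if __name__ == "__main__":
    print(remove_link_demo())
```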

The nice thing about symbolic links is they’re very flexible. You can have a symbolic link that points somewhere else entirely. Maybe to an old phone book. On your filesystem, a symbolic link can point to a different partition or a different hard drive, or even a target on a different computer. You can create symbolic links to other files or folders.

A symbolic link can do this because all it contains is the path and name of the target. It’s like a shortcut to you typing that path and name yourself. In fact, the concept of a shortcut is very similar to a symbolic link. It’s a shortcut that the filesystem knows how to handle itself.
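To see that a symbolic link really just stores a path, here’s a small sketch that reads the stored text back with os.readlink. The names are hypothetical, and the target deliberately doesn’t exist, because the filesystem doesn’t check the path when the link is created. I’m assuming a POSIX-style system here; Windows may report the stored path in a different form.

```python
import os
import tempfile


def stored_path_demo():
    """Return the text stored inside a symbolic link."""
    with tempfile.TemporaryDirectory() as folder:
        link = os.path.join(folder, "taxi")
        # The target is never created; the link only records its name.
        os.symlink("freds_town_cars.txt", link)
        return os.readlink(link)


if __name__ == "__main__":
    print(stored_path_demo())
```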

Before we get to hard links, let’s jump ahead to junction points. This is an idea introduced by Microsoft and I’m only aware of junction points existing on Windows filesystems. You can think of a junction point as a more restricted version of a symbolic link. You lose the ability to link anywhere you want and have to stay on the same computer. A junction point can target a folder on another local volume, but it can’t jump to another computer. You can also only link to other folders. In exchange for this restriction, you get a faster link that can be followed right at the source.

What I mean is this. Let’s say you have a symbolic link on a remote computer to another folder also on that remote computer. It’ll be your own local computer that figures this out and follows the symbolic link. But had it been a junction point, then the remote computer could have done this for you.

Junction points have some other restrictions such as how soon they can be used when your computer is starting up. Not everything is aware of them. Once your computer is fully started, then they should work similarly to a symbolic link to a folder. The reason for this is that junction points are implemented as mount points.

When you think of mounting, you normally mount another complete filesystem volume. This gives you the ability to extend your filesystem with the contents of another filesystem. A junction point lets you pick a directory on your current partition and mount it at the junction point.

You can use symbolic links to give you a fixed location to use in your programs that can then point somewhere else without needing to change your program every time the target changes. Maybe you have an application that uses another program. And this other program gets updated often and uses its current version number in its path when installing each new version. If your application wants to keep up with the current version of this other application, then it needs to change the path that it uses to find the other application. But what if instead your application goes through a symbolic link? You can keep the name and path of the symbolic link fixed. So your program can remain the same. Then whenever a new version of the other application is installed, the symbolic link can be updated to point to the new version.
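One way this could look in practice is sketched below. The folder layout, the link name "current", and the function point_current_at are all assumptions made up for this example. The sketch builds the new link under a temporary name and then renames it over the old one, which on POSIX-style systems swaps the link in a single step.

```python
import os
import tempfile


def point_current_at(install_root, version):
    """Repoint install_root/current at install_root/<version>."""
    link = os.path.join(install_root, "current")
    temp_link = link + ".tmp"
    if os.path.lexists(temp_link):
        os.remove(temp_link)
    # A relative target is resolved against the folder holding the link.
    os.symlink(version, temp_link)
    # Renaming replaces the old link itself; it does not follow it.
    os.replace(temp_link, link)
    return link


def version_switch_demo():
    """Install two fake versions and read each one through the fixed path."""
    with tempfile.TemporaryDirectory() as root:
        for version in ("1.0", "2.0"):
            os.mkdir(os.path.join(root, version))
            with open(os.path.join(root, version, "tool.txt"), "w") as f:
                f.write("tool " + version)

        seen = []
        for version in ("1.0", "2.0"):
            link = point_current_at(root, version)
            # The application always uses the same path through the link.
            with open(os.path.join(link, "tool.txt")) as f:
                seen.append(f.read())
        return seen


if __name__ == "__main__":
    print(version_switch_demo())
```

The application only ever opens the path through "current", so nothing in it changes when the link is repointed.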

Now we can move to hard links. The first thing to understand about hard links is that they only refer to other files. You can’t create a hard link to a folder. And hard links are limited to the same filesystem volume. You can’t create a hard link to a file on another disk or computer.
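A quick way to check the file-versus-folder rule is to ask the operating system for both kinds of hard link and see which request is refused. This is a sketch with hypothetical names, and it assumes a typical Linux or Windows filesystem where hard links to folders are rejected.

```python
import os
import tempfile


def try_hard_link(source, new_name):
    """Return True if the operating system accepts the hard link."""
    try:
        os.link(source, new_name)
        return True
    except OSError:
        return False


def hard_link_rules_demo():
    """Try to hard link a file and then a folder."""
    with tempfile.TemporaryDirectory() as folder:
        a_file = os.path.join(folder, "notes.txt")
        a_dir = os.path.join(folder, "projects")
        open(a_file, "w").close()
        os.mkdir(a_dir)

        file_ok = try_hard_link(a_file, os.path.join(folder, "notes_again.txt"))
        dir_ok = try_hard_link(a_dir, os.path.join(folder, "projects_again"))
        return file_ok, dir_ok


if __name__ == "__main__":
    print(hard_link_rules_demo())
```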

The next thing to realize, and this one is not often explained, is this: every file you create is actually a hard link. Of course, we don’t refer to files as hard links because it sounds weird. But that’s essentially what happens. I think that once you understand this, then understanding hard links becomes easy.

We normally think of creating a hard link once a file has already been created and we want a link to the same file. This new hard link is actually the second hard link. The first was created with the file initially.
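You can watch this happen by asking the filesystem for a file’s link count. The sketch below uses os.stat and its st_nlink field, with hypothetical file names, and assumes a filesystem that supports hard links.

```python
import os
import tempfile


def link_count_demo():
    """Show the link count before and after adding a second hard link."""
    with tempfile.TemporaryDirectory() as folder:
        first = os.path.join(folder, "report.txt")
        second = os.path.join(folder, "report_again.txt")

        with open(first, "w") as f:
            f.write("hello")
        count_after_create = os.stat(first).st_nlink  # the first hard link

        os.link(first, second)  # what we usually call "creating a hard link"
        count_after_link = os.stat(first).st_nlink

        same_file = os.path.samefile(first, second)
        return count_after_create, count_after_link, same_file


if __name__ == "__main__":
    print(link_count_demo())
```

A brand new file already reports a count of one, and the so-called new hard link just brings it to two.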

Let me back up a bit and explain what happens when you create a file.

When you create a file, the filesystem needs to find some space available on the disk. It might start out with a single allocation unit or maybe more. This initial space belongs to the file and as the file grows and you write more data to the file, then the filesystem will find more available allocation units and link them together so they all belong to the file.

And the filesystem will make room for other attributes like the last episode describes. And there could be alternate data streams belonging to the file. Everything gets linked together except for one important piece of information: the name.

How can the name be left out? It seems like the most important part, right? How will you ever be able to find all these linked allocation units with your important data if you don’t have the name?

Well, the name is important. It just doesn’t belong with the data. You see, the name is how you find all this data and that means it belongs in the parent folder.

This means that the actual file itself doesn’t know or care what name you gave to it. It just knows that it does have a name somewhere.
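Renaming a file is a handy way to see this. In the sketch below, which uses made-up names and assumes a POSIX-style filesystem that exposes inode numbers through os.stat, the name in the folder changes while the identity of the file data stays the same.

```python
import os
import tempfile


def rename_demo():
    """Rename a file and check that only the folder entry changed."""
    with tempfile.TemporaryDirectory() as folder:
        old_name = os.path.join(folder, "draft.txt")
        new_name = os.path.join(folder, "final.txt")

        with open(old_name, "w") as f:
            f.write("data")
        id_before = os.stat(old_name).st_ino  # identifies the file data

        os.rename(old_name, new_name)  # edits the parent folder's entry
        id_after = os.stat(new_name).st_ino

        return id_before == id_after, os.listdir(folder)


if __name__ == "__main__":
    print(rename_demo())
```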

The file itself will remain on disk as long as the name exists in some folder somewhere to refer to the file. The name becomes a hard link to the file data. It’s called a hard link because if you delete the only name, then the file data is no longer needed and will get deleted too.

This is different than a symbolic link where deleting the link leaves the target unchanged. You can have as many symbolic links to a file or folder as you want. And you can do the same thing with hard links to files.

The only difference is that while the actual file data may not care about what name is used to get to it or where that name resides, it does care about how many hard links there are.

So when you create a file like normal, I explained that this actually creates the first hard link. It also sets the count of how many hard links exist in the file data to be one.

Now, when you delete a file, what you’re actually doing is removing the name from a folder. This means that you’re deleting a hard link. So the filesystem goes to the actual file data and lowers the count of remaining hard links by one. And if the remaining count is zero, then that means there’s no folder entry remaining anywhere and the file data can be removed.
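Here’s a small sketch of deleting one of two names. The file names are hypothetical and the sketch assumes a filesystem that supports hard links.

```python
import os
import tempfile


def delete_one_name_demo():
    """Delete one of two hard links and read the data through the other."""
    with tempfile.TemporaryDirectory() as folder:
        first = os.path.join(folder, "report.txt")
        second = os.path.join(folder, "report_again.txt")

        with open(first, "w") as f:
            f.write("hello")
        os.link(first, second)

        os.remove(first)  # removes one name and lowers the count by one

        remaining = os.stat(second).st_nlink
        with open(second) as f:
            contents = f.read()
        return remaining, contents


if __name__ == "__main__":
    print(delete_one_name_demo())
```

The data is still there because one name remains. Only removing that last name lets the filesystem reclaim it.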

When you create another hard link to a target file, what you’re doing is creating a name somewhere in a folder that refers to the file and then the filesystem goes to the file data and increases the count of hard links by one.

The file data doesn’t care what name you use. It only cares about how many names it has at any given time. That’s why the name doesn’t belong with the file data.
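The bookkeeping can be sketched as a toy model. This isn’t how any real filesystem is written; the classes FileData and ToyFilesystem are invented for illustration, and they only model names living in folders and a count living with the data.

```python
class FileData:
    """The file's data. It knows how many names it has, not what they are."""

    def __init__(self, content):
        self.content = content
        self.link_count = 0


class ToyFilesystem:
    """Folders hold names; each name refers to some FileData."""

    def __init__(self):
        self.folders = {}  # folder name -> {entry name -> FileData}
        self.stored = []   # file data currently kept on the toy "disk"

    def create(self, folder, name, content):
        """Creating a file stores the data and adds its first hard link."""
        data = FileData(content)
        self.stored.append(data)
        self.link(data, folder, name)
        return data

    def link(self, data, folder, name):
        """Add a name in a folder and raise the data's link count."""
        entries = self.folders.setdefault(folder, {})
        if name in entries:
            raise FileExistsError(name)
        entries[name] = data
        data.link_count += 1

    def delete(self, folder, name):
        """Remove a name; drop the data only when no names remain."""
        data = self.folders[folder].pop(name)
        data.link_count -= 1
        if data.link_count == 0:
            self.stored.remove(data)


if __name__ == "__main__":
    fs = ToyFilesystem()
    data = fs.create("docs", "a.txt", "hi")
    fs.link(data, "backup", "b.txt")
    fs.delete("docs", "a.txt")
    print(data.link_count, data in fs.stored)
```

Notice that FileData never stores a name. The names only appear in the folder dictionaries, which is the point the transcript is making.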

With a hard link, you don’t have to worry about the target getting deleted and leaving you with a broken link, because each name used to link to the data is tracked and the data will only be deleted when the last link gets deleted. For most files, there will only be a single hard link and the two are thought of as the same thing. Delete the name and you delete the file. Now you know the full story of what happens.