Programming involves change, and managing that change is the only way to make sense of it. You’ll learn about the staging area in this episode and how it affects your commits.

Git actually has three areas where it keeps your files. The most visible of these is called the working directory.

The second area is the staging area. This is where files go when you add them and before they get committed.

The third area is the archive area. I’m not sure if this area has an official name or not so that’s how I refer to this area. This is the area where Git holds all the versions of your files as they existed through each commit. Every time you commit changes, you’re updating the archive area with the latest changes. And the archive area continues to hold all the older versions of your files.

Listen to the full episode to learn more, especially how to avoid committing empty files that you create through your integrated development environment. You can also read the full transcript below.

Transcript

You’ll learn about the staging area in this episode and how that affects your commits.

Normally, you don’t have to worry about the staging area. But some aspects of Git will be confusing without this understanding.

Imagine the work you do by changing your files as if you were packing bags for a trip. Now you could take each bag out to your car right away but that may not be the most efficient way to load your car. It’s better to leave each bag where it is first and then take them all out to a loading area where you can then load them into your car. In other words, you stage them or prepare them first before finally loading them.

Git actually has three areas where it keeps your files. The most visible of these is called the working directory. This is the area you see when you browse your files and is where you can open them and make changes to them. This is the area that you’re used to. When you check out a different branch, this is the area that gets updated with the files that match that branch. Many people don’t even think about the other two areas in Git. The working directory is so visible, and so clearly the place where you do your work, that it’s easy to never consider the possibility that there could be other areas.

The second area is the staging area. This is where files go when you add them and before they get committed. When you create a new file in your project, it’ll initially show up as an untracked file in your working directory. You tell Git that you’d like to track changes to this file by adding it. What actually happens is that adding the file makes a copy of the file in your staging area. This is enough to let Git know that the file is tracked even though it hasn’t yet been committed.

The third area is the archive area. I’m not sure if this area has an official name or not so that’s how I refer to this area. This is the area where Git holds all the versions of your files as they existed through each commit. Every time you commit changes, you’re updating the archive area with the latest changes. And the archive area continues to hold all the older versions of your files.

Now, these areas can get confusing when a file exists in more than one at the same time. Let’s start out with the simple case of a new file. The new file will exist in your working directory. If it’s someplace outside of your working directory, then it doesn’t relate to this Git repo at all. Git only works with files that are anywhere inside the directory structure of your project repo.

Okay, so you have a new file. If you run the command “git status”, then Git will tell you that there is an untracked file in your working directory. Any commit you make at this point will not affect that file. It will remain untracked. And if you delete the file, then it’s gone forever as far as Git is concerned. That file will not be found anywhere in your Git history if it was never added and then committed.
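Here is a minimal sketch of that first status check. It assumes a throwaway repo created with mktemp and a hypothetical file named notes.txt, neither of which comes from the episode.

```shell
set -e
# Work in a throwaway repo so nothing real is touched (assumption for this sketch).
repo=$(mktemp -d)
cd "$repo"
git init -q

# Create a new file but do not add it yet.
echo "first draft" > notes.txt

# The short status format marks untracked files with "??".
status=$(git status --short)
echo "$status"    # prints "?? notes.txt"
```

The "??" marker is Git's way of saying it sees the file but is not tracking it.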

The first step then to get this new file under Git’s control is to run the command “git add” and specify the file name. The file will remain in your working directory but if you run the status command again, now Git will show the status of the file as being ready to be committed. Git has made a copy of the file in your staging area.
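As a rough sketch, adding the file and checking the status again might look like this. The temporary repo and the file name notes.txt are placeholders chosen for illustration.

```shell
set -e
# Throwaway repo and file name are assumptions for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "first draft" > notes.txt

# Copy the file's current contents into the staging area.
git add notes.txt

# "A" in the first column means the file is staged as a new addition.
status=$(git status --short)
echo "$status"    # prints "A  notes.txt"

# The file is still sitting in the working directory as well.
ls notes.txt
```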

Where does this staging area exist? It’s actually inside the hidden .git folder at the root of your Git repo. That’s the same folder where Git stores every committed version of your files. So Git puts both the staging area and the archive area inside the hidden .git folder. They’re still separate areas even though Git puts them in the same hidden folder.
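If you want to peek inside, the sketch below shows one way to do it. It assumes a standard repo layout, where Git keeps the staging area in a file named index (Git's documentation calls the staging area "the index") and keeps stored file contents under an objects folder. The repo and file name are placeholders.

```shell
set -e
# Throwaway repo and file name are assumptions for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "first draft" > notes.txt
git add notes.txt

# Ask Git where its hidden folder is (normally ".git").
git_dir=$(git rev-parse --git-dir)

# The staging area is the "index" file; stored contents live under "objects".
ls "$git_dir/index"
ls -d "$git_dir/objects"

# List what is currently recorded in the staging area.
staged=$(git ls-files --stage)
echo "$staged"
```

You should not edit anything in that folder by hand. Looking is harmless, though, and it makes the two hidden areas feel less mysterious.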

If you commit your changes now, Git will put the contents of the staging area into the commit being made. After that, nothing is left waiting in the staging area. Your new file is no longer new. It’s now part of your Git repo and will exist as part of that commit. If you delete the file now and commit that delete, then the file might seem like it’s gone but it remains in the archive area. All you need to do to get the file back is to check out the commit ID that included the file.
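The sketch below walks through that commit, delete, and recover sequence. It uses a throwaway repo, a placeholder identity, and a hypothetical notes.txt. Note one variation: instead of checking out the whole commit, it checks out just the one file from that commit, which avoids moving you off your branch.

```shell
set -e
# Throwaway repo, identity, and file name are assumptions for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Add and commit the new file, then remember that commit's ID.
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
first_commit=$(git rev-parse HEAD)

# Delete the file and commit the delete.
git rm -q notes.txt
git commit -q -m "Delete notes"

# Bring the file back from the commit that still contains it.
git checkout -q "$first_commit" -- notes.txt
restored=$(cat notes.txt)
echo "$restored"    # prints "first draft"
```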

Let’s pretend that we didn’t delete the file and it’s still sitting in the working directory and in the archive area. If you change the file and run the status command, now Git will tell you that the file has changed. You can again add the file to your staging area and make another commit.

Or this time, you can skip the staging area if you want to. I do this a lot. By running the “git commit” command with the -a option, you tell Git to commit everything that’s currently being tracked and that has changed. This will pick up changes currently waiting in the staging area as well as changes to tracked files that have not yet been staged. It’s a quick way to commit your changes. But you need to be careful because you can accidentally include changes in your commit that you might not have intended.
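Here is a small sketch of that shortcut, again in a throwaway repo with made-up file names and a placeholder identity. It also shows the limit of the shortcut: a file that was never added stays untracked and is left out of the commit.

```shell
set -e
# Throwaway repo, identity, and file names are assumptions for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Start with one committed, tracked file.
echo "one" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked file"

# Change the tracked file without staging it, and create a brand new file.
echo "two" >> tracked.txt
echo "scratch" > untracked.txt

# Commit every changed tracked file in one step, skipping "git add".
git commit -q -a -m "Commit all tracked changes"

# Only the never-added file is left over.
status=$(git status --short)
echo "$status"    # prints "?? untracked.txt"
```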

Here’s a confusing case that you need to be aware of. Well, it’s not so confusing once you understand how the staging area works. But it can cause problems if you don’t understand this. Let’s say that you modify a file and then add it to the staging area. But before you commit the changes, you modify the file again. Maybe you noticed a small mistake and fixed it by making another change. If you ask Git for its status now, it will tell you that the file is staged for the next commit and that the file is also modified and has changes. How can it be both staged and modified at the same time?

This is what I meant when I said that when you add a file, Git makes a copy of the file in the staging area. That copy is a snapshot of the file as it was at the moment you added it. If you make additional changes, then those changes won’t update the staged copy until you add the file again.
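You can see the staged snapshot for yourself. The sketch below assumes a throwaway repo, a placeholder identity, and a hypothetical notes.txt that goes through three versions. Only the second version is staged.

```shell
set -e
# Throwaway repo, identity, and file name are assumptions for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Version 1 is committed.
echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Version 2 is staged.
echo "version 2" > notes.txt
git add notes.txt

# Version 3 exists only in the working directory.
echo "version 3 with a fix" > notes.txt

# First "M": staged change. Second "M": further unstaged change.
status=$(git status --short)
echo "$status"         # prints "MM notes.txt"

# A leading colon asks Git for the copy held in the staging area.
staged_copy=$(git show :notes.txt)
echo "$staged_copy"    # prints "version 2"
```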

Adding a file in Git doesn’t mean that you’re adding a new file. Sure, it can mean that. But what it really means is that you want to add the file to the staging area so it’ll be ready for the next commit.

If you forget to do this, here’s what happens. Say you have a file in the staging area and then change it again, so that it also shows up as changed in the working directory. When you commit, only the changes in the staging area get committed. Your latest changes in your working directory will continue to show up as changes that can be committed.
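A short sketch of that situation, using a throwaway repo, a placeholder identity, and a hypothetical notes.txt:

```shell
set -e
# Throwaway repo, identity, and file name are assumptions for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Stage the first version.
echo "version 1" > notes.txt
git add notes.txt

# Edit again, but forget to add the file a second time.
echo "version 2, fixed" > notes.txt

# A plain commit takes only what is in the staging area.
git commit -q -m "Add notes"

committed=$(git show HEAD:notes.txt)
echo "$committed"    # prints "version 1"

# The newer edit is still waiting in the working directory.
status=$(git status --short)
echo "$status"       # prints " M notes.txt"
```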

The easiest place to fall into this trap is when you’re using a development environment that integrates with Git. When you add new files through your editor, sometimes these tools will add the new files to Git right away. But usually, this initial file is not very interesting, so you’ll probably edit it right away. After you finish making your changes, you might want to commit everything. But what gets committed will be that empty initial file. You have to remember to add the file again before making your commit to make sure that you commit the version that you’re most interested in.
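The sketch below imitates that workflow by hand and shows the fix. The empty widget.txt and the first “git add” stand in for what an editor might do automatically. Real tools vary, so treat this as an illustration rather than a description of any particular editor.

```shell
set -e
# Throwaway repo, identity, and file name are assumptions for this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Stand-in for the editor: create an empty file and add it right away.
: > widget.txt
git add widget.txt

# Now you write the real content.
echo "real content" > widget.txt

# The staged copy is still the empty file (size 0 bytes).
staged_size=$(git cat-file -s :widget.txt)
echo "$staged_size"    # prints "0"

# The fix: add the file again so the staged copy matches your work.
git add widget.txt
git commit -q -m "Add widget"

committed=$(git show HEAD:widget.txt)
echo "$committed"      # prints "real content"
```

Checking the status just before you commit is a cheap habit that catches this. If a file shows up as both staged and modified, add it again first.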