Can this object be thrown away yet? Keeping track of how many places are still using an object is one way to answer this question.
The way it works is like this. When you first create an object it starts out with a reference count of zero. That just means that it’s not being used. You normally want to increase that reference count right away. Some libraries might even do this for you. The point is that if you intend to use and continue referring to some object instance, then you need to either increase the reference count or know for sure that it’s already been increased for you.
This shows that the object is being used in a single place in the code. You don’t have to go hunting through the code to find out where it’s being used. You can just ask the object itself what its reference count is. Now, whenever you’re done with the object, you release it instead of deleting it. There’s usually a method on the object called release that you can call. This will cause the reference count to decrement by one, and if the count goes to zero, then the object can delete itself.
What happens when things go wrong? And how can things go wrong? What if two objects each refer to the other? In other words, each object has a reference counted data member of the other object and they’ve both added a reference to the other object to show that it’s in use.
This situation is called a cyclic dependency. It doesn’t have to be just two objects. Any cycle of at least two objects can cause the problem. Maybe object A is holding a reference on object B which is then holding a reference on object C. You get a cyclic dependency if object C also has added a reference on object A. It forms a loop. None of the reference counts can go to zero and the objects are stuck.
Listen to the full episode or read the full transcript below to find out how to fix this problem.
Transcript
In real life, we do this all the time. Usually a bit more informally though. You see a cup on the table and before tossing the drink and washing the cup, it’s a good idea to ask first. “Is anybody still using this cup?” When nobody answers, that’s when you clean the cup.
The problem with this approach comes when somebody asks, “Hey, what happened to my cup?”
The topic today is actually similar to episode 127 about smart pointers. That’s because smart shared pointers use reference counting to keep track of their usage. For today though, I want to focus just on the reference counting. How does it work and what are some of the downsides and benefits?
Most of the uses of reference counting are related to knowing when it’s safe to delete objects. But there’s also another aspect to this and that’s when to free system resources such as file handles, window handles, database connections, etc.
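As a rough sketch of the resource side of this, here is one way it could look in C++ using std::shared_ptr with a custom deleter. The Connection type, the openConnections counter, and the openConnection function are all made up for illustration and stand in for a real handle.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a system resource such as a database connection.
struct Connection {
    int id;
};

int openConnections = 0;  // how many connections are currently open

// Opens a connection whose cleanup runs when the last shared_ptr lets go.
std::shared_ptr<Connection> openConnection(int id) {
    ++openConnections;
    return std::shared_ptr<Connection>(new Connection{id}, [](Connection* c) {
        --openConnections;  // stands in for closing the real handle
        delete c;
    });
}

// Returns the number of connections still open after both users are done.
int demoResource() {
    std::shared_ptr<Connection> first = openConnection(7);
    std::shared_ptr<Connection> second = first;  // a second user shares it
    first.reset();                                // one user is done
    assert(openConnections == 1);                 // still open for the other user
    second.reset();                               // the last user is done
    return openConnections;                       // nothing is left open
}
```

The point of the sketch is only that the handle gets closed when the last user lets go, not when the first one does.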
There’s also the question of how far reference counting should be used. Should it be used for everything or only certain class instances? When you try to use reference counting everywhere, then you end up with something like the Objective-C language. This language recently added a compiler option to turn on automatic reference counting. It’s the same concept and automatic reference counting just means that the compiler inserts code for you to make sure that references are incremented and decremented at the right times.
First of all, let me say something about timing. It’s always possible that a different thread or even a different process will start executing just as a reference counted object is about to be reclaimed. This can delay the actual object cleanup for an unknown amount of time. Usually this will be fairly short. But I mention it because deterministic execution of objects going out of scope is usually listed as one of the strengths of reference counting. All you really know is what a single thread will do next.
What exactly is reference counting? While the cup example that I started out with could sometimes apply, I used it to draw attention to the fact that reference counting is normally used when multiple places in your code need to use some object. We don’t normally share cups so I wouldn’t expect the reference count of a cup to get any bigger than one.
The way it works is like this. When you first create an object it starts out with a reference count of zero. That just means that it’s not being used. You normally want to increase that reference count right away. Some libraries might even do this for you. The point is that if you intend to use and continue referring to some object instance, then you need to either increase the reference count or know for sure that it’s already been increased for you.
This shows that the object is being used in a single place in the code. You don’t have to go hunting through the code to find out where it’s being used. You can just ask the object itself what its reference count is.
Now, whenever you’re done with the object, you release it instead of deleting it. There’s usually a method on the object called release that you can call. This will cause the reference count to decrement by one, and if the count goes to zero, then the object can delete itself.
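The steps so far could be sketched in C++ roughly like this. The class name RefCounted and the methods addRef, release, and refCount are assumptions picked for the example and are not taken from any particular library. The plain int counter is also not safe to share between threads, so treat this as a single-threaded illustration only.

```cpp
#include <cassert>

// Hypothetical intrusively counted class. All names here are illustrative.
class RefCounted {
public:
    static int liveObjects;              // live instances, tracked only for the demo

    RefCounted() { ++liveObjects; }      // a new object starts with a count of zero

    void addRef() { ++count_; }          // a new user registers its interest

    void release() {
        // Drop one reference. When nobody is left, the object deletes itself.
        if (--count_ == 0) {
            delete this;
        }
    }

    int refCount() const { return count_; }

private:
    ~RefCounted() { --liveObjects; }     // private so callers must go through release()
    int count_ = 0;
};

int RefCounted::liveObjects = 0;

// Walks through the lifecycle and returns how many objects are still alive.
int demoLifecycle() {
    RefCounted* cup = new RefCounted();  // count is 0: not in use yet
    cup->addRef();                       // count is 1: first user
    cup->addRef();                       // count is 2: a second user shares it
    assert(cup->refCount() == 2);        // you can ask the object itself
    cup->release();                      // count is 1: the object stays alive
    assert(RefCounted::liveObjects == 1);
    cup->release();                      // count is 0: the object deletes itself
    return RefCounted::liveObjects;
}
```

Making the destructor private is one design choice that keeps other code from calling delete directly, so release is the only way out.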
Going back to the cup on the table, just the fact that it’s still sitting there means that somebody is still using it.
I’ll continue explaining more about reference counting right after this message from our sponsor.
I already mentioned everything you need to know about reference counting. If you need to use an object, then either increase the reference count or make sure it’s already been increased for you. And then when you’re done with an object, release it.
Now we can get into some of the finer points. What happens when things go wrong? And how can things go wrong?
First of all, unless you’re using a language like Objective-C designed to use reference counting everywhere, you’ll probably want to be selective in where you use it. It can add a lot of extra overhead to your code and slow down your application.
Maybe the class helps protect your code from accidentally forgetting to release an object before you assign it the value of another reference counted object. It can do this by implementing the releasing and adding in the assignment operator for you.
Let’s say you have two reference counted objects A and B and you want to write code that says A is assigned B. What needs to happen? Well first, since A is going away, we need to release it. This might run the destructor for A if no other code is using it. Then we need to add a reference to B.
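Here is a minimal sketch of an assignment operator that does this bookkeeping. Counted and Handle are hypothetical names made up for the example. One deliberate difference from the order described above: the sketch adds the reference to B before releasing A. That is a common ordering because it stays safe if A and B happen to refer to the same object.

```cpp
#include <cassert>

// Hypothetical counted object. The live counter makes deletion visible.
struct Counted {
    static int live;
    int refs = 0;
    Counted() { ++live; }
    ~Counted() { --live; }
    void addRef() { ++refs; }
    void release() { if (--refs == 0) delete this; }
};
int Counted::live = 0;

// Hypothetical handle that adds and releases references for you.
class Handle {
public:
    explicit Handle(Counted* p = nullptr) : ptr_(p) { if (ptr_) ptr_->addRef(); }
    Handle(const Handle& other) : ptr_(other.ptr_) { if (ptr_) ptr_->addRef(); }
    ~Handle() { if (ptr_) ptr_->release(); }

    Handle& operator=(const Handle& other) {
        // Take the new reference first so that self-assignment cannot
        // delete the object before we hold it again.
        if (other.ptr_) other.ptr_->addRef();
        if (ptr_) ptr_->release();       // may run the old object's destructor
        ptr_ = other.ptr_;
        return *this;
    }

    int refs() const { return ptr_ ? ptr_->refs : 0; }

private:
    Counted* ptr_;
};

// Assigns b to a and returns the reference count of the object both now share.
int demoAssign() {
    Handle a(new Counted());
    Handle b(new Counted());
    assert(Counted::live == 2);
    a = b;                               // a's old object is released and deleted
    assert(Counted::live == 1);
    return a.refs();                     // both handles refer to b's object
}
```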
If you compare this with a garbage collected language such as C#, then the process becomes much simpler. The object that A referred to gets forgotten about and A starts referring to the same thing that B refers to.
The lesson here is that you shouldn’t try to bring one style of programming from one language to another. References in C# and garbage collection don’t use reference counting. And C++ doesn’t use garbage collection so you can’t just forget about instances.
All I’m saying is that when using reference counting in C++, you shouldn’t also try to use reference counting to avoid doing things the way C++ is designed. Using reference counting in C++ is great for certain objects. Just don’t try to use reference counting for everything.
When you have objects that you want to share with many places in your code, then making them reference counted is a great way to manage their lifetime. Otherwise, how will you know when you’re done with an object if it should be deleted or not? Reference counting lets you release the object and it will take care of itself.
An object can have data members and some of these might also be reference counted. If this is the case, then releasing an object that causes it to be deleted may end up cascading to other reference counted objects and some of them might get deleted too.
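To make the cascade a bit more concrete, here is a small sketch using std::shared_ptr. The Part and Assembly types and the destroyed log are invented for the example. One member is referenced only by the assembly and goes away with it, while the other is also referenced elsewhere and survives.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

std::vector<std::string> destroyed;  // records which objects were deleted, in order

// Hypothetical reference counted data member.
struct Part {
    std::string name;
    explicit Part(std::string n) : name(std::move(n)) {}
    ~Part() { destroyed.push_back(name); }
};

// Hypothetical owner holding two reference counted members.
struct Assembly {
    std::shared_ptr<Part> ownPart = std::make_shared<Part>("own part");
    std::shared_ptr<Part> sharedPart;
    explicit Assembly(std::shared_ptr<Part> shared) : sharedPart(std::move(shared)) {}
    ~Assembly() { destroyed.push_back("assembly"); }
};

// Releases the assembly and returns the log of what got deleted as a result.
std::vector<std::string> demoCascade() {
    destroyed.clear();
    auto shared = std::make_shared<Part>("shared part");  // count is 1
    auto assembly = std::make_shared<Assembly>(shared);   // shared part count is 2
    assert(shared.use_count() == 2);
    assembly.reset();   // assembly is deleted and releases both of its members
    return destroyed;   // "shared part" is absent because `shared` still holds it
}
```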
Here’s one big problem with reference counting though. What if two objects each refer to the other? In other words, each object has a reference counted data member of the other object and they’ve both added a reference to the other object to show that it’s in use.
This situation is called a cyclic dependency. It doesn’t have to be just two objects. Any cycle of at least two objects can cause the problem. Maybe object A is holding a reference on object B which is then holding a reference on object C. You get a cyclic dependency if object C also has added a reference on object A. It forms a loop. None of the reference counts can go to zero and the objects are stuck.
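The stuck loop can be reproduced with std::shared_ptr in a few lines. The Node type and its live counter are assumptions for the demo. The raw pointer is kept only so the sketch can break the loop by hand at the end and not leak.

```cpp
#include <cassert>
#include <memory>

// Hypothetical object that holds a strong reference to the next object.
struct Node {
    static int live;
    std::shared_ptr<Node> next;
    Node() { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Builds A -> B -> C -> A, lets go of all three, and returns how many are stuck.
int demoCycle() {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    auto c = std::make_shared<Node>();
    a->next = b;
    b->next = c;
    c->next = a;                     // this closes the loop
    assert(a.use_count() == 2);      // held by `a` and by c->next

    Node* rawA = a.get();            // non-owning, kept only for cleanup below
    a.reset();
    b.reset();
    c.reset();                       // our code has now released everything
    int stuck = Node::live;          // all three objects are still alive

    rawA->next.reset();              // break the loop by hand so nothing leaks
    assert(Node::live == 0);
    return stuck;
}
```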
The way you get around this is through something called a weak reference. This lets you refer to some object without actually adding to its reference count. If we make the reference in object C that’s referring back to object A into a weak reference, then it won’t affect the usage of object A.
So how does this all work? Let’s say your code creates an object A and increments its reference count to one. This shows that A is in use. Now object A needs to create object B so B’s reference count will also be one. And to stay with the example, object B needs to create object C. So object C’s reference count is also one. Now how can object C find out about object A in the first place? Well, maybe object A calls some method in object C. Here’s where we could get into trouble. If object C says, “I might need to use object A again so I’m going to add a reference to it,” then we have a cyclic dependency, because that will bump up the reference count of object A to two.
Then when your code is done with object A and releases it, the reference count drops down to one. Your code forgets all about object A but it stays around in memory. And so does object B and object C. They all stick around because they all have reference counts of one.
Instead, had object C created a weak reference to object A, then it would have left object A’s reference count unchanged at one. When your code finished with object A and released it, A’s reference count would go to zero and it would delete itself. Since object A is going away, one of the last things it does is release its hold on object B. That causes object B to be deleted. And the same thing happens to object C. All of them get deleted.
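The same walk-through could look something like this with std::shared_ptr and std::weak_ptr. The struct names A, B, and C follow the example, while the backToA member and the live counter are made up for the sketch.

```cpp
#include <cassert>
#include <memory>

int live = 0;  // how many of the demo objects are currently alive

struct A;

struct C {
    std::weak_ptr<A> backToA;        // weak: does not add to A's reference count
    C() { ++live; }
    ~C() { --live; }
};

struct B {
    std::shared_ptr<C> c = std::make_shared<C>();  // B creates and holds C
    B() { ++live; }
    ~B() { --live; }
};

struct A {
    std::shared_ptr<B> b = std::make_shared<B>();  // A creates and holds B
    A() { ++live; }
    ~A() { --live; }
};

// Releases A and returns how many objects are left alive afterwards.
int demoWeak() {
    auto a = std::make_shared<A>();
    a->b->c->backToA = a;            // C learns about A without holding on to it
    assert(a.use_count() == 1);      // unchanged at one
    assert(live == 3);

    // When C needs A, it asks for a temporary strong reference.
    if (auto strongA = a->b->c->backToA.lock()) {
        assert(strongA.use_count() == 2);
    }

    a.reset();                       // A goes to zero, then B, then C
    return live;
}
```

Calling lock is how the weak side finds out whether the object is still around before using it.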
Weak references are good at preventing cyclic dependencies. But they require you as the programmer to figure out where they need to go. The compiler can’t help you here, because weak references are more about how you intend to use your objects and how they relate to each other.
If you have reference counted inventory objects and a reference counted backpack, then it makes sense for the backpack to hold strong references on all the items in the backpack. But if the items themselves ever need to refer back to the backpack, then this should be done with weak references.
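A possible shape for the backpack example, again with std::shared_ptr and std::weak_ptr. The Backpack and Item types, the container member, and the addItem helper are all hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Backpack;

// Hypothetical inventory item that can refer back to its backpack.
struct Item {
    std::string name;
    std::weak_ptr<Backpack> container;          // weak reference back to the backpack
    explicit Item(std::string n) : name(std::move(n)) {}
};

// Hypothetical backpack holding strong references on its items.
struct Backpack {
    std::vector<std::shared_ptr<Item>> items;
};

// Puts an item in the backpack and records the weak back reference.
void addItem(const std::shared_ptr<Backpack>& pack, const std::shared_ptr<Item>& item) {
    pack->items.push_back(item);
    item->container = pack;
}

// Returns true if the item notices that its backpack is gone.
bool demoBackpack() {
    auto pack = std::make_shared<Backpack>();
    auto rope = std::make_shared<Item>("rope");
    addItem(pack, rope);
    assert(pack.use_count() == 1);   // the item's weak reference is not counted
    assert(rope.use_count() == 2);   // held by `rope` and by the backpack
    pack.reset();                    // the backpack is deleted, no cycle in the way
    assert(rope.use_count() == 1);   // the backpack released its hold on the item
    return rope->container.expired();
}
```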