fbpx

Semaphores are often confused and characterized as just a more general form of a mutex. There’s actually some big differences though. I’m also going to explain some differences between the simple lock of the last episode and a mutex.

Because you can declare a semaphore to only allow a single thread at a time to successfully wait on the semaphore, it’s often described as a more general purpose mutex. After all, a mutex can only ever allow a single thread to obtain the lock. But don’t be fooled by this similarity. Use a mutex when you want to guarantee exclusive access to a specific resource. And use a semaphore when you want to signal to some other threads that one or more resources are available.

Another difference between how semaphores are used vs. how mutexes are used is in what operations are called. For a mutex, you have many different threads that each want to obtain the mutex so they each lock it. Only one of the threads will get the lock right away and the other threads will all block. Each thread that uses a mutex will call lock and then release in that order.

Semaphores are not meant to be used like this. They also have two operations but there’s a good reason that the method names are different. With a semaphore, each thread needs to figure out which side it’s on and it should only call that one method. If a thread is making things available to other threads and needs to signal that something is ready, then it should call signal. And if a thread is waiting on something to be ready and then consuming items, then it should call wait.

Just be aware that semaphores by themselves don’t actually protect specific resources. You’ll still want to use one or more mutexes for this purpose. Listen to the full episode for more explanation or read the full transcript below.

Transcript

Before we begin, I want to mention that some things in software don’t always have names or meanings that everybody agrees with. And some things are easy to get mixed up which can propagate this. I called the simple instruction that I described to implement locks a set and test instruction. Probably due to the way I think about it. And while some references also refer to it by this name, I think the more common name is the test and set instruction. It’s a minor point, but I wanted to mention it in case you wanted to read more about the topic.

Today, we’re going to discuss semaphores and I’ll start out as usual with a description of a problem and then explain how you might use semaphores.

Many adventure games have a similar concept of stamina. The hero usually has a limited supply of stamina and performing actions such as running will use up stamina while resting will restore stamina.

This is just like how we get tired when running and sometimes need to catch our breath for a while by resting.

The common programming name for this is the producer/consumer problem. It’s actually a little more difficult to solve properly than it might first appear. It’s easy to get into a race condition that can cause your application to think the stamina is full by the resting code and that the stamina is empty by the running code. Both sides think they can’t do anything and you have a form of a deadlock. Semaphores help with this but they’re a bit tricky to understand.

A simple way to think of a semaphore is like a mutex that can count higher than one. But wait a minute, what exactly is a mutex?

This is where we’re going to build a little more understanding from the previous episode about locks. If you haven’t already listened to episode 100, go ahead and do that now. Okay, in many ways a lock is similar to a mutex. Most languages and textbooks treat the terms interchangeably. I used the term lock in the last episode as a simple name for the concept of synchronizing access to a shared resource. The word mutex is just a shortened word made up from mutual exclusion. It’s the same thing really. You want to be able to make sure that just one thread at a time has access.

We think in terms of locking a mutex and releasing a mutex. So far, a lock and a mutex are identical. But a mutex also has a very important concept of an owner. In the last episode, I explained how the first reader could lock the w lock and the last reader could release the w lock. If the w lock was a mutex instead of the overly simplistic lock that I described, then this wouldn’t work. That’s because the readers will likely be implemented with different threads. When a thread locks a mutex, then only that thread can release or unlock the mutex. The locking thread owns the mutex. This has some advantages such as protecting the mutex from being unlocked by some other code and it also allows the operating system to automatically release a mutex if the owning thread ends without releasing the mutex.

Another big advantage to knowing who the owner is for a mutex is that mutexes can be reentrant. In other words, what if you could unlock a door and then later find that you need to pass through the same door again? Remember that mutexes are supposed to only let a single thread pass. Because we can track the fact that your thread already has access to a mutex, then trying to lock it again should be no problem.

So while a simple lock and a mutex might appear to be similar, a mutex adds some extra capabilities.

A semaphore actually behaves more like the simple lock only with the ability to let more than one thread at a time have access. Or not. You get to select the capacity of a semaphore and if you create a semaphore with a capacity of one, then it behaves similar to the simple lock from the previous episode. You can just as easily create a semaphore with a capacity of 10 or 100.

I say similar but a semaphore is used a bit differently than a lock. First of all the actions aren’t normally thought of as lock and unlock. There’s actually not a fully consistent way of describing the operations of a semaphore. I’ll describe the operations as signal and wait. But you’re likely to also see up and down, release and acquire, post and pend, vacate and procure, or even just v and p. The names v and p actually are initials from the original Dutch names which I’m not going to try pronouncing.

Because you can declare a semaphore to only allow a single thread at a time to successfully wait on the semaphore, it’s often described as a more general purpose mutex. After all, a mutex can only ever allow a single thread to obtain the lock. But don’t be fooled by this similarity. Use a mutex when you want to guarantee exclusive access to a specific resource. And use a semaphore when you want to signal to some other threads that one or more resources are available. That’s why I like the terms signal and wait for semaphores.

I’ll explain how to use semaphores to deal with the producer/consumer problem of the adventure game stamina right after this message from our sponsor.

( Message from Sponsor )

Another difference between how semaphores are used vs. how mutexes are used is in what operations are called. For a mutex, you have many different threads that each want to obtain the mutex so they each lock it. Only one of the threads will get the lock right away and the other threads will all block. Blocking here just means that the thread can’t go any further until it gets ownership of the mutex. The operating system will make sure that the blocked threads are waiting efficiently. We don’t want any of them to continuously be trying to obtain ownership of the mutex. All this will do is cause the processor usage to jump to 100% and the remaining battery level to start dropping fast. Once a thread is done with the mutex, it releases it. So in other words, threads call lock to obtain the mutex and release when they’re done. Each thread that uses a mutex will call lock and then release in that order.

Semaphores are not meant to be used like this. They also have two operations but there’s a good reason that the method names are different. With a semaphore, each thread needs to figure out which side it’s on and it should only call that one method. If a thread is making things available to other threads and needs to signal that something is ready, then it should call signal. And if a thread is waiting on something to be ready and then consuming items, then it should call wait.

With this understanding, we have a producer/consumer problem with the hero’s stamina. The act of resting will produce more stamina and the act of running will consume stamina. But when you think about it, there could be other things that produce and consume stamina. Sleeping is a great way to produce more stamina. And fighting a monster is a great way to use up a lot of stamina.

Any of the producers can call signal on a staminaAvailable semaphore to let various consumers know that more stamina is available. The consumers will likely be waiting for extra stamina by calling wait.

We have a slight problem here though. How do we know when stamina is needed? If the consumers can only call wait, then they have no way to signal the producers that extra stamina is needed. They can’t call signal themselves because that means that more stamina is actually available.

The solution to this is to create another semaphore that acts as the opposite. Instead of calling this semaphore staminaAvailable, we’ll call this one staminaUsed. Now anytime a consumer such as a mighty swing of a sword draws some stamina by first waiting, it can then signal that it used that stamina through the second semaphore.

So when I said that a thread needs to figure out which side of the semaphore it’s going to be on, that really only applies to each semaphore. Don’t go switching sides unless you’re just asking for some complicated code that will almost certainly just be hiding some deadlocks.

Having the extra semaphore is a great way for the producers to know when they need to enter the scene. They can wait on the staminaUsed semaphore and when a producer gets access to the notification that some stamina was just used, then it can start the process to make some more. And of course, when it’s done and has some stamina ready, then it calls signal on the staminaAvailable semaphore. Also, the way that I described this works great for a audio description. The actual code for this will probably use a different design and instead of having each possible producer of stamina wait for the staminaUsed semaphore, you’ll likely just have a single manager class doing the waiting and then figuring out the best combination of producers. And something similar on the consumer side too. Maybe this would be a good application of the mediator pattern described in episode 75.

Just be aware that semaphores by themselves don’t actually protect specific resources. You’ll still want to use one or more mutexes for this purpose. The word semaphore derives from some form of the word flag or signal or something like that. So use semaphores to signal availability and then use a mutex to access specific resources. In the case here of the hero’s stamina, the actual amount of stamina available is probably something that you’ll want to keep track of in its own variable and make sure to lock and release a mutex anytime you need to access the amount.