In the previous post, we learned about processes and threads. Modern OSes are responsible for scheduling processes and threads, and there are often dependencies between them, so we need mechanisms to synchronize these tasks. In this post, I'll introduce those synchronization mechanisms. It's important to note that I'll use the term task instead of process or thread, because most of these mechanisms work for both. It's also interesting that in the Linux kernel, processes and threads are both represented as tasks :)
Critical Section and Race Condition
There could be hundreds of processes running on your operating system when you open the Task Manager in Windows 10 (I don't use Windows :). Yet even the strongest desktop CPU has only around 10 cores and 20 hardware threads, so at any moment only a small fraction of processes or threads can actually run on the CPU. A critical section is a region of code that accesses a shared resource and must not be executed by more than one task at a time. The shared resource can be at the macro level, such as a peripheral device, or at the micro level, such as a shared variable: if many processes need to modify the same integer variable in memory, access to that variable must be limited to one process at a time. A race condition happens when two or more tasks enter the critical section at the same time and the outcome depends on their interleaving.
Synchronization Mechanisms
Different processes or threads may have dependencies, and critical sections arise during their execution; thus we need mechanisms to handle both problems.
Mutex Lock
A mutex lock allows only one task to hold the lock at a time; if the lock is already held by one task, other tasks acquiring it either spin, repeatedly testing the lock, or go into a wait queue until the lock is released. A spin lock differs slightly from a blocking lock: with a spin lock, other tasks keep running in a loop, testing the lock, even while it is held. There's a trade-off between the two. A blocking lock does not waste CPU cycles spinning on the lock, but has the overhead of maintaining a wait queue and switching contexts. A spin lock wastes CPU cycles in its busy loop, but avoids that overhead. Thus, for a rather large critical section, a blocking lock is preferable; for a rather small critical section, a spin lock may have better performance.
// A naive lock: the test and the set are two separate steps,
// so two tasks can both see *t == 0 (this is the flaw discussed below).
int t = 0; // 0 = free, 1 = held

int lock(int *t) {
    if (*t == 0) { // see the lock status...
        *t = 1;    // ...then acquire it: not atomic!
        return 0;
    } else
        return 1;
}

void unlock(int *t) {
    if (*t == 1)
        *t = 0;
}

// P1:
while (1 == lock(&t));
// enter critical section
unlock(&t);

// P2:
while (1 == lock(&t));
// enter critical section
unlock(&t);
Now the question comes: how can we test whether the lock is already held? The easiest way is to read the lock's status, and then acquire the lock if it is free. But there's a problem here. Consider two tasks, P1 and P2, that both need to enter a critical section. P1 reads the lock status and sees that the lock is available. Just as P1 finishes executing the first line of lock(), an interrupt occurs; P1's stack pointer and program counter are saved to its PCB/TCB, and P1 is suspended by this sudden system event. Now P2 reads the lock status; the lock still looks available, so P2 acquires it. When P1 resumes after the interrupt, the problem appears: P1 will also take the lock, because its check succeeded before the interrupt! Both P1 and P2 now hold the lock. In kernel mode we could disable interrupts, but in user mode we cannot. Fortunately, hardware provides atomic operations, which make the test and the update happen as one indivisible step: either the whole operation completes, or none of it happens at all. Three common atomic primitives are Test-and-Set, Fetch-and-Increment, and Compare-and-Swap.
Condition Variable
A condition variable is a wait/signal mechanism built on top of a mutex. With a mutex alone, only one task can enter the critical section, and the others block until the lock is released. A condition variable looks a bit like a mutex, but it serves a different purpose: a task blocks on a condition rather than on a lock, waits until that condition is met, and then resumes. If several tasks are all waiting for the same condition, they are put into a wait queue. When the condition variable is signaled, only one task from the queue resumes; we can also broadcast to wake every task in the queue.
Typically, a condition variable and a mutex are used together to enforce the sleep and wakeup order; otherwise a lost wakeup or deadlock could happen. When a task holding the mutex needs to wait for a condition, the mutex is released and the task goes to sleep; when another task signals it, the mutex is reacquired before the task continues. We'll see an example in the next several sections.
Semaphore
A semaphore is a variable with two primitive operations: P() (a.k.a. wait) and V() (a.k.a. post). A semaphore s starts with an initial value N. A P() operation on s atomically checks whether s is greater than 0 and, if so, decreases s by 1; otherwise the caller waits until s becomes greater than 0. A V() operation on s atomically increases s by 1 and wakes up one task, if any, that is blocked in P().
Looking closer, a binary semaphore and a mutex lock seem very much alike, because their mechanisms are nearly the same. However, they are different things, and there is one significant difference: a mutex lock has an owner, meaning only the task holding the lock may release it, while a binary semaphore has no owner. For example, suppose semaphore s is currently 0 after task A did a P operation, so any other task doing P on s will block. Later, task B, not A, can do a V operation, and the first task blocked on s will resume.
Barrier
A barrier is a synchronization method which, as its name suggests, sets a stopping point for a group of tasks: every task must stop at this point and cannot proceed until all other tasks have reached the barrier. A basic barrier needs two variables: one records the pass/stop state of the barrier, and the other counts how many tasks have entered it. It is quite easy to implement a barrier with a condition variable.
Classic Synchronization Problems
In this section, I'll walk through two classic synchronization problems using the mechanisms we discussed in the last section, illustrated with the Pthreads library. I also implemented a pthread-like thread library in another article.
The Producer-Consumer Problem
There are several producers who produce a certain kind of product, and several consumers who need those products. There is a buffer of size N: producers put products into the buffer when it is not full, and consumers take products from the buffer when it is not empty.
Using condition variables: Obviously, the buffer is the critical section, so anyone, producer or consumer, needs a mutex lock to access it. When the buffer count equals 0, meaning empty, consumers wait on the condition variable signaling that the buffer is not empty. When the count equals N, meaning full, producers wait on the condition variable signaling that the buffer is not full.
It is important to note that when pthread_cond_wait() returns without error, the associated predicate may still be false (a so-called spurious wakeup). Similarly, when pthread_cond_timedwait() returns with the timeout error, the associated predicate may actually be true, due to an unavoidable race between the expiration of the timeout and the change of the predicate's state. The burden is on the application to recheck the predicate on every return, which is why the waits below are wrapped in while loops rather than if statements.
#define N 10 // buffer capacity

pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;
pthread_cond_t not_full = PTHREAD_COND_INITIALIZER;
int buffer = 0;

// Producer:
while (1) {
    pthread_mutex_lock(&lock);
    while (buffer == N) {
        pthread_cond_wait(&not_full, &lock);
    }
    // Put Product
    buffer++;
    if (buffer == 1) {
        pthread_cond_signal(&not_empty);
    }
    pthread_mutex_unlock(&lock);
}

// Consumer:
while (1) {
    pthread_mutex_lock(&lock);
    while (buffer == 0) {
        pthread_cond_wait(&not_empty, &lock);
    }
    // Get Product
    buffer--;
    if (buffer == N - 1) {
        pthread_cond_signal(&not_full);
    }
    pthread_mutex_unlock(&lock);
}
Using semaphores: Again, the buffer is the critical section, so we use a binary semaphore lock to ensure only one task enters it at a time. We add two more semaphores, empty and full, initialized to N and 0 respectively. A producer does P(empty) before entering the critical section and V(full) afterward, to signal that a product is available to pick up. A consumer does P(full) before entering the critical section and V(empty) afterward, to signal that a slot in the buffer has been freed.
sem_t lock;
sem_t empty;
sem_t full;
sem_init(&lock, 0, 1);
sem_init(&empty, 0, N);
sem_init(&full, 0, 0);

// Producer:
while (1) {
    sem_wait(&empty); // P(empty)
    sem_wait(&lock);  // P(mutex)
    // Put Product
    sem_post(&lock);  // V(mutex)
    sem_post(&full);  // V(full)
}

// Consumer:
while (1) {
    sem_wait(&full);  // P(full)
    sem_wait(&lock);  // P(mutex)
    // Get Product
    sem_post(&lock);  // V(mutex)
    sem_post(&empty); // V(empty)
}
The Reader-Writer Problem
Readers and writers share the same file, but at any one time either readers or a single writer may operate on it. When a reader is reading the file, other readers may also read, but writers must wait. When a writer is writing the file, neither readers nor other writers may access it.
Using condition variable:
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
int readerCount = 0;
int writerCount = 0;

// Reader:
void read() {
    pthread_mutex_lock(&lock);
    while (writerCount > 0) { // wait while a writer is active
        pthread_cond_wait(&cond, &lock);
    }
    readerCount++;
    pthread_mutex_unlock(&lock);
    // Read
    pthread_mutex_lock(&lock);
    readerCount--;
    if (readerCount == 0) {
        pthread_cond_broadcast(&cond); // the last reader wakes waiting writers
    }
    pthread_mutex_unlock(&lock);
}

// Writer:
void write() {
    pthread_mutex_lock(&lock);
    while (readerCount > 0 || writerCount > 0) {
        pthread_cond_wait(&cond, &lock);
    }
    writerCount++;
    // Write
    writerCount--;
    pthread_cond_broadcast(&cond); // wake all waiting readers and writers
    pthread_mutex_unlock(&lock);
}
Using semaphore:
sem_t rd_lock; // protects readerCount
sem_t wrt;     // held by the writer, or collectively by the readers
sem_init(&rd_lock, 0, 1);
sem_init(&wrt, 0, 1);
int readerCount = 0;

// Reader:
void read() {
    sem_wait(&rd_lock);   // P(rd_lock)
    if (readerCount == 0)
        sem_wait(&wrt);   // P(wrt): the first reader locks out writers
    readerCount++;
    sem_post(&rd_lock);   // V(rd_lock)
    // Read
    sem_wait(&rd_lock);   // P(rd_lock)
    readerCount--;
    if (readerCount == 0)
        sem_post(&wrt);   // V(wrt): the last reader lets writers in
    sem_post(&rd_lock);   // V(rd_lock)
}

// Writer:
void write() {
    sem_wait(&wrt);       // P(wrt)
    // Write
    sem_post(&wrt);       // V(wrt)
}
It's interesting to see that before pthread_cond_wait(), the mutex must always be locked, and inside pthread_cond_wait() that mutex is released automatically; otherwise a deadlock would happen. Likewise, after pthread_cond_signal() in one task, the programmer must release the mutex, because the last step of another task's pthread_cond_wait() is to reacquire the mutex automatically.
Monitors
Recall
Using condition variables and semaphores puts pressure on programmers, because they must ensure the program runs in the correct order; otherwise deadlock will happen (deadlock is discussed in detail in a later section). Now let's recall two cases:
pthread_cond_t not_empty;
pthread_cond_t not_full;
int buffer = 0;

// Producer (mutex lock/unlock removed on purpose):
while (1) {
    while (buffer == N) {
        pthread_cond_wait(&not_full, &lock);
    }
    // Put Product
    buffer++;
    if (buffer == 1) {
        pthread_cond_signal(&not_empty);
    }
}

// Consumer (mutex lock/unlock removed on purpose):
while (1) {
    while (buffer == 0) {
        pthread_cond_wait(&not_empty, &lock);
    }
    // Get Product
    buffer--;
    if (buffer == N - 1) {
        pthread_cond_signal(&not_full);
    }
}
If we remove the mutexes from the producer-consumer solution, it may seem OK, since we already have a sleep and wakeup policy. But the problem lies in the execution order. In user space, interrupts cannot be disabled, so a timer interrupt can occur between any two statements. Now let's assume the following execution order:
Step 1: The consumer executes first, and a timer interrupt happens between while (buffer == 0) and pthread_cond_wait(&not_empty, &lock).
Step 2: The producer now gets its CPU share and successfully puts 1 product into the buffer.
Step 3: The consumer executes again and goes to sleep without rechecking the buffer.
Thus the consumer loses the not_empty signal, and once the producer fills the buffer, both producer and consumer sleep forever.
Now let's recall the semaphore solution to the producer-consumer problem. Suppose that, in the consumer's code, we reverse the order of P(full) and P(mutex), so the consumer downs the mutex first. Let's see what happens:
sem_t lock;
sem_t empty;
sem_t full;
sem_init(&lock, 0, 1);
sem_init(&empty, 0, N);
sem_init(&full, 0, 0);

// Producer:
while (1) {
    sem_wait(&empty); // P(empty)
    sem_wait(&lock);  // P(mutex)
    // Put Product
    sem_post(&lock);  // V(mutex)
    sem_post(&full);  // V(full)
}

// Consumer (note: P(mutex) now comes before P(full)):
while (1) {
    sem_wait(&lock);  // P(mutex)
    sem_wait(&full);  // P(full)
    // Get Product
    sem_post(&lock);  // V(mutex)
    sem_post(&empty); // V(empty)
}
Now let's assume the following execution order:
Step 1: The consumer executes first, locks the mutex, then goes to sleep waiting on full.
Step 2: The producer then executes, downs empty, and gets stuck locking the mutex, because the mutex is held by the sleeping consumer.
So both the producer and the consumer sleep forever.
Motivation
An unfortunate ordering of semaphore and mutex operations makes programmers responsible for the policy's correctness. This problem shows how careful you must be when using semaphores: one subtle error and everything comes to a grinding halt. It is like programming in assembly language, only worse, because the errors are race conditions, deadlocks, and other forms of unpredictable and irreproducible behavior.
To make it easier to write correct programs, Brinch Hansen (1973) and Hoare (1974) proposed a higher-level synchronization primitive called a monitor. A monitor is a collection of procedures, variables, and data structures that are all grouped together in a special kind of module or package. Processes may call the procedures in a monitor whenever they want to, but they cannot directly access the monitor's internal data structures from procedures declared outside the monitor. This rule is common in modern object-oriented languages such as Java, where synchronized methods play the same role.
Deadlock and Livelock
It seems that every problem can be solved using the mechanisms we discussed above. However, things are not that easy. Consider the Dining Philosophers problem: five philosophers sit around a round table, and each philosopher needs two forks, the one on the left and the one on the right, to eat spaghetti.
Deadlock: At the same moment, each philosopher picks up the left fork. Each of them needs another fork to continue eating, but there are no forks left, so they keep holding their left fork and wait forever. That is deadlock: everybody holds part of the resources and waits for the rest, which can never be obtained.
Livelock: At the same moment, each philosopher holds the left fork. Each needs another fork to continue eating, so they all put down the left fork and then all pick up the right fork... still stuck. Then each puts down the right fork and picks up the left fork again... They repeat this loop forever and nobody gets to eat. That is livelock: everybody keeps moving instead of holding and waiting, but nobody ever obtains the full set of resources.