


Material Type: Notes; Class: S-Gen Purpose Cmptn-GPU; Subject: Computer Science; University: University of Massachusetts - Amherst; Term: Spring 2006;



CMPSCI 691W Parallel and Concurrent Programming Spring 2006
Lecturer: Emery Berger Scribe: Kevin Grimaldi
This lecture introduces various tools available for synchronization in concurrent programs. Synchronization is used to avoid race conditions and to coordinate the actions of threads. Locks can be used to allow multiple threads to safely access shared data. Common pitfalls and problems that arise from using locks are also covered.
One of the primary motivations for synchronization is to allow multiple threads to safely share state. When multiple accesses to a shared resource are made simultaneously, they are only safe if:
The “too much milk” problem illustrates the need for coordination among multiple threads accessing a resource. If you arrive home, look in the fridge, find no milk, and head to the store, your roommate may come home in the meantime, notice the lack of milk, and also go to the store. Without some form of synchronization you can end up with too much milk, which then goes bad. Simply leaving and checking for a note does not solve the problem either: if your roommate checks for milk and a note after you have checked but before you have left your note, both of you still go to the store. Locks are needed to solve this problem.
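As a minimal sketch of the locked version, the C++ below puts the check and the purchase inside one critical section. The names `Fridge`, `roommate`, and `run_two_roommates` are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical shared state for the "too much milk" scenario.
struct Fridge {
    std::mutex lock;  // protects milk
    int milk = 0;     // cartons currently in the fridge
};

// Each roommate checks for milk and buys it while holding the same lock,
// so no other roommate can check in between the two steps.
void roommate(Fridge& fridge) {
    std::lock_guard<std::mutex> guard(fridge.lock);
    if (fridge.milk == 0) {
        fridge.milk += 1;  // "go to the store" while still holding the lock
    }
}

// Runs two roommates concurrently and returns the final carton count.
int run_two_roommates() {
    Fridge fridge;
    std::thread a(roommate, std::ref(fridge));
    std::thread b(roommate, std::ref(fridge));
    a.join();
    b.join();
    return fridge.milk;
}
```

Whichever roommate takes the lock second sees the milk already bought, so exactly one carton ends up in the fridge. Holding the lock for the whole trip to the store is simple but keeps the other roommate waiting.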
Locks provide mutual exclusion: they prevent more than one thread at a time from entering a section of code, called a critical section. Implementing locks requires hardware-level atomic updates, such as test-and-set or compare-and-swap. Test-and-set atomically sets the value of a word to 1 and returns the previous value. Compare-and-swap atomically replaces the value of a word with a new value only if it currently equals an expected value, and reports whether the replacement happened.
Unfortunately, these atomic operations are very slow, especially on Intel architectures. They generally require the entire pipeline to be flushed, making them orders of magnitude slower than the equivalent non-atomic operations.
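One way to see how test-and-set yields a lock is the sketch below. It assumes C++11 atomics, where `std::atomic<bool>::exchange(true)` plays the role of test-and-set (it stores true and returns the previous value). The class `TasSpinLock` and the helper `count_with_tas_lock` are hypothetical names for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// A lock built on a single atomic word.
class TasSpinLock {
    std::atomic<bool> held{false};
public:
    void lock() {
        // exchange(true) atomically writes true and returns the old value.
        // If the old value was false, this thread acquired the lock;
        // if it was true, someone else holds it, so try again.
        while (held.exchange(true, std::memory_order_acquire)) {
            // busy-wait
        }
    }
    void unlock() { held.store(false, std::memory_order_release); }
};

// Increments a plain int from several threads, guarded by the lock,
// and returns the final count.
int count_with_tas_lock(int num_threads, int increments) {
    TasSpinLock lock;
    int counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < increments; ++i) {
                lock.lock();
                ++counter;  // critical section
                lock.unlock();
            }
        });
    }
    for (auto& th : threads) th.join();
    return counter;
}
```

Without the lock, the unsynchronized `++counter` would be a data race and updates could be lost. With it, `count_with_tas_lock(4, 10000)` returns 40000.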
3-2 Lecture 3: February 8
Different types of locks interact differently with thread scheduling when the lock is contended. The first type, blocking locks, suspend a thread immediately when it tries to acquire a lock held by another thread, allowing other threads to run. This minimizes processor time spent waiting for the lock to be released, but guarantees a context switch every time a thread tries to acquire a lock that is already held. The cost of the context switch, along with its effect on the cache, can be nontrivial.
As the name implies, spin locks spin in a loop waiting for the lock to be released instead of suspending the thread. This can sometimes avoid the cost of a context switch, but can also waste a lot of processor time doing nothing. Note that spin locks only make sense on multiprocessor systems. On a uniprocessor system the thread holding the lock cannot run while the waiter spins, so the waiter is guaranteed to do nothing useful until its quantum expires.
Instead of taking an all-or-nothing approach, hybrid locks spin for a short time in the hope of avoiding an unnecessary context switch, then yield to another thread to avoid wasting processor time. Different variants use either a fixed timeout or an exponential backoff algorithm.
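A hedged sketch of the spin-then-back-off idea follows. The spin limit and the backoff bounds are arbitrary values assumed for illustration, and `HybridLock` and `count_with_hybrid_lock` are hypothetical names. The sketch gives up the processor with `std::this_thread::sleep_for`, which is one possible way to yield.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

class HybridLock {
    std::atomic<bool> held{false};
public:
    void lock() {
        const int spin_limit = 100;  // assumed tuning value
        const auto max_backoff = std::chrono::microseconds(1024);  // assumed cap
        auto backoff = std::chrono::microseconds(1);
        int spins = 0;
        while (held.exchange(true, std::memory_order_acquire)) {
            if (++spins < spin_limit) {
                continue;  // spin phase: hope the holder releases soon
            }
            // Backoff phase: stop burning processor time and sleep,
            // doubling the wait after each failed attempt.
            std::this_thread::sleep_for(backoff);
            if (backoff < max_backoff) {
                backoff *= 2;
            }
        }
    }
    void unlock() { held.store(false, std::memory_order_release); }
};

// Increments a shared counter under the hybrid lock and returns the total.
int count_with_hybrid_lock(int num_threads, int increments) {
    HybridLock lock;
    int counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < increments; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : threads) th.join();
    return counter;
}
```

A fixed-timeout variant would keep `backoff` constant instead of doubling it.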
Another variant of traditional locks, called queueing locks, maintains a FIFO queue of threads waiting for a lock to ensure fairness and scalability. These were a hot research topic in the nineties but, as of this lecture, had not been used in real systems because of performance issues.
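To make the FIFO behaviour concrete, here is a ticket lock, which is a simple relative of queueing locks rather than necessarily the design discussed in lecture: each thread takes a numbered ticket and waits until that number is served, so threads acquire the lock in arrival order. `TicketLock` and `count_with_ticket_lock` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

class TicketLock {
    std::atomic<unsigned> next_ticket{0};  // next ticket to hand out
    std::atomic<unsigned> now_serving{0};  // ticket currently allowed in
public:
    void lock() {
        // Taking a ticket is one atomic fetch-and-add, so arrival order
        // is recorded even under contention.
        const unsigned my_ticket =
            next_ticket.fetch_add(1, std::memory_order_relaxed);
        while (now_serving.load(std::memory_order_acquire) != my_ticket) {
            std::this_thread::yield();  // let other threads run while waiting
        }
    }
    void unlock() {
        // Only the lock holder writes now_serving, so load-then-store is safe.
        now_serving.store(now_serving.load(std::memory_order_relaxed) + 1,
                          std::memory_order_release);
    }
};

// Increments a shared counter under the ticket lock and returns the total.
int count_with_ticket_lock(int num_threads, int increments) {
    TicketLock lock;
    int counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < increments; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : threads) th.join();
    return counter;
}
```

All waiters here poll the same `now_serving` word. List-based queueing locks instead give each waiter its own location to spin on, which is where the scalability benefit comes from.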
Locks can be used to enforce mutual exclusion, but they also introduce the possibility of various errors.
One very common error in the use of locks, especially in C or C++, is forgetting to unlock when done with a lock. This can be hard to spot, especially when a function has several return paths or when an exception is thrown in C++.
A solution to this problem is the resource acquisition is initialization (RAII) paradigm. An object created on the stack acquires the lock in its constructor and releases it in its destructor, which runs when the object goes out of scope, no matter how the code block is exited. This is similar to the way locks are handled in Java, where all of the built-in locks are scoped.
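The scoped-lock idea can be sketched as below. `ScopedLock` is a hypothetical hand-written guard (the C++ standard library offers `std::lock_guard` for the same purpose), and `TrackedMutex` is a hypothetical wrapper that exists only so the single-threaded demo can observe whether the lock is held.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical wrapper that records whether the mutex is currently held.
// The flag is only inspected by the single-threaded demo below.
struct TrackedMutex {
    std::mutex m;
    bool held = false;
    void lock() { m.lock(); held = true; }
    void unlock() { held = false; m.unlock(); }
};

// Acquires the lock in the constructor and releases it in the destructor.
class ScopedLock {
    TrackedMutex& mutex;
public:
    explicit ScopedLock(TrackedMutex& m) : mutex(m) { mutex.lock(); }
    ~ScopedLock() { mutex.unlock(); }
    ScopedLock(const ScopedLock&) = delete;             // a guard is not copyable
    ScopedLock& operator=(const ScopedLock&) = delete;
};

// Returns true if the lock was held inside the block and released again
// after the block was exited by an exception.
bool lock_released_after_exception() {
    TrackedMutex mutex;
    bool held_inside = false;
    try {
        ScopedLock guard(mutex);
        held_inside = mutex.held;
        throw std::runtime_error("early exit");  // no explicit unlock anywhere
    } catch (const std::runtime_error&) {
        // guard's destructor already ran during stack unwinding
    }
    return held_inside && !mutex.held;
}
```

The same destructor runs on an ordinary `return` or on falling off the end of the block, so no exit path can forget the unlock.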
Certain other problems can occur in C, such as leaving the “un” out of “unlock”. This results in deadlock, since the thread is waiting for itself to release the lock. This particular problem can be found relatively easily
These read-write locks introduce several options for what to do when both readers and writers are queued up to obtain the lock. If readers are favored over writers, the writers can be starved, while favoring the writers can result in readers being starved. A more complicated solution is to alternate between readers and writers every time the lock is acquired and released, which prevents either type of thread from starving.
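As an illustration of one of these policies, the sketch below favors writers: new readers are held back whenever a writer is active or waiting. It is built from a mutex and a condition variable purely for illustration, and all names (`WriterPreferringRWLock`, `RwResult`, `run_readers_and_writers`) are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

class WriterPreferringRWLock {
    std::mutex m;
    std::condition_variable cv;
    int active_readers = 0;
    int waiting_writers = 0;
    bool writer_active = false;
public:
    void lock_shared() {
        std::unique_lock<std::mutex> lk(m);
        // Readers defer to any active or queued writer. This is the policy
        // choice that favors writers and can starve readers.
        cv.wait(lk, [&] { return !writer_active && waiting_writers == 0; });
        ++active_readers;
    }
    void unlock_shared() {
        std::lock_guard<std::mutex> lk(m);
        if (--active_readers == 0) cv.notify_all();
    }
    void lock() {
        std::unique_lock<std::mutex> lk(m);
        ++waiting_writers;
        cv.wait(lk, [&] { return !writer_active && active_readers == 0; });
        --waiting_writers;
        writer_active = true;
    }
    void unlock() {
        std::lock_guard<std::mutex> lk(m);
        writer_active = false;
        cv.notify_all();
    }
};

struct RwResult {
    int final_value;  // value of the shared pair after all writers finish
    int torn_reads;   // times a reader saw the pair half-updated
};

// Writers update two ints together; readers check that they always match.
RwResult run_readers_and_writers(int writers, int readers, int iterations) {
    WriterPreferringRWLock rw;
    int a = 0, b = 0;
    std::atomic<int> torn{0};
    std::vector<std::thread> threads;
    for (int w = 0; w < writers; ++w) {
        threads.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                rw.lock();
                ++a;
                ++b;
                rw.unlock();
            }
        });
    }
    for (int r = 0; r < readers; ++r) {
        threads.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                rw.lock_shared();
                if (a != b) ++torn;
                rw.unlock_shared();
            }
        });
    }
    for (auto& th : threads) th.join();
    return RwResult{a, torn.load()};
}
```

Dropping the `waiting_writers == 0` test in `lock_shared` would turn this into a reader-favoring lock, under which a steady stream of readers could starve the writers.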
While safety is provided by mutexes and read-write locks, threads can coordinate with each other using other primitives such as semaphores or condition variables.
The general definition of a semaphore is a visual signaling apparatus, such as a traffic light. This is vaguely related to the computer science definition of a semaphore, which is “a non-negative integer counter with atomic increment and decrement” that blocks whenever decrementing would make it go negative.
Given a semaphore, threads can increment and decrement it. If a thread tries to decrement the semaphore when it is already at zero, the thread goes to sleep until another thread increments the semaphore. This can be used to signal between threads, for example to let a thread wait for an event to occur.
Semaphores can also be used when at most a certain number of threads should be given access to a resource simultaneously. To be used in this manner, the semaphore is initialized to the number of threads that can use the resource; threads then decrement the semaphore before using the resource and increment it again when done.
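The resource-limiting pattern can be sketched as follows. The `Semaphore` class here is a hypothetical hand-rolled counter built on a mutex and a condition variable, assumed only so the example compiles as C++11. `max_concurrent_users` is likewise a hypothetical helper that records the largest number of threads ever inside the guarded region.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

class Semaphore {
    std::mutex m;
    std::condition_variable cv;
    int count;
public:
    explicit Semaphore(int initial) : count(initial) {}
    void decrement() {
        std::unique_lock<std::mutex> lk(m);
        // Sleep while decrementing would make the counter negative.
        cv.wait(lk, [&] { return count > 0; });
        --count;
    }
    void increment() {
        {
            std::lock_guard<std::mutex> lk(m);
            ++count;
        }
        cv.notify_one();  // wake one sleeper, if any
    }
};

// Starts num_threads threads that all use a resource limited to `permits`
// simultaneous users; returns the highest concurrency actually observed.
int max_concurrent_users(int num_threads, int permits) {
    Semaphore sem(permits);
    std::atomic<int> inside{0};
    std::atomic<int> max_inside{0};
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&] {
            sem.decrement();  // take a permit before using the resource
            const int now = ++inside;
            int prev = max_inside.load();
            // Compare-and-swap loop: raise the recorded maximum if needed.
            while (now > prev && !max_inside.compare_exchange_weak(prev, now)) {
            }
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
            --inside;
            sem.increment();  // return the permit when done
        });
    }
    for (auto& th : threads) th.join();
    return max_inside.load();
}
```

Initializing the semaphore to 0 instead gives the signaling use: a waiting thread calls `decrement()` and sleeps until another thread calls `increment()` to announce the event.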
Suppose that you want to make a blocking queue such that threads go to sleep if they try to dequeue an item from an empty queue. To check whether the queue is empty, a thread must hold a lock for the queue, but then to wait for something to be put in the queue the thread needs to go to sleep until it is signalled by a thread placing something in the queue. If this whole process is not done atomically, an item might be placed in the queue after the thread checks the queue but before it goes to sleep, and the thread will then never be signalled. On the other hand, if the thread goes to sleep while holding the lock, no one else will ever be able to get the lock to put an item in the queue.
To solve this problem, condition variables are used. They are neither conditions nor variables. Condition variables allow a thread to atomically release a lock and then go to sleep. Other threads can then either wake up a single waiting thread or wake up all waiting threads. Waking up all waiting threads should be used with caution as it can result in what is known as the thundering herd problem, where a large group of threads wake up contending for the same object.
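The blocking queue described above can be sketched with `std::condition_variable`, whose `wait` releases the lock and puts the thread to sleep atomically, then reacquires the lock before returning. `BlockingQueue` and `sum_through_queue` are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

template <typename T>
class BlockingQueue {
    std::mutex m;
    std::condition_variable not_empty;
    std::queue<T> items;
public:
    void enqueue(T value) {
        {
            std::lock_guard<std::mutex> lk(m);
            items.push(std::move(value));
        }
        not_empty.notify_one();  // wake a single waiting consumer
    }
    T dequeue() {
        std::unique_lock<std::mutex> lk(m);
        // wait() atomically releases the lock and sleeps, so an enqueue
        // cannot slip in between the emptiness check and going to sleep.
        // The loop re-checks the queue because wakeups can be spurious.
        while (items.empty()) {
            not_empty.wait(lk);
        }
        T value = std::move(items.front());
        items.pop();
        return value;
    }
};

// Sends 1..n through the queue from a producer to a consumer thread
// and returns the sum the consumer computed.
int sum_through_queue(int n) {
    BlockingQueue<int> queue;
    int sum = 0;
    std::thread consumer([&] {
        for (int i = 0; i < n; ++i) sum += queue.dequeue();
    });
    std::thread producer([&] {
        for (int i = 1; i <= n; ++i) queue.enqueue(i);
    });
    producer.join();
    consumer.join();
    return sum;
}
```

`notify_one` wakes a single waiter. Replacing it with `notify_all` would wake every sleeping consumer for each item, which is the thundering herd situation described above.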