Notes on Shared Memory Programming | CS 415, Study notes of Computer Science

Material Type: Notes; Professor: Li; Class: PARALLEL PROGRAMMING; Subject: Computer Science; University: Portland State University; Term: Unknown 1989;

Uploaded on 08/19/2009

Shared-Memory Programming

Jingke Li

Portland State University

Jingke Li (Portland State University) CS 415/515 Shared-Memory Programming 1 / 32

Programming Shared Memory Systems

In a shared-memory system, a single address space exists, i.e. each memory location is given a unique address, and any memory location is accessible by any of the processors. A shared-memory system is typically used in two ways:

  • Multi-programming — Execute multiple unrelated programs concurrently. (A feature of operating systems; will not be discussed further.)
  • Multi-threading (Multi-tasking) — Break a single application into multiple symmetrical threads (i.e. starting and ending at the same time).

Threads vs. Processes

Threads are related to the concept of processes. Both are runtime entities created by a program, and can run concurrently with their parent and with their siblings. They differ in “weight” and in control structure location.

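[Figure: a single-threaded process (stack, user code, and data in user space; process structure in kernel space) next to a multi-threaded process (user code and data shared, one stack per thread, thread structure in user space; process structure in kernel space).]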

Threads vs. Processes (cont.)

Process — a self-sufficient, independent flow of control.

  • Has its own variables, stack, file descriptors, and memory map
  • Inherits a copy of parent’s environment (e.g. variables)
  • “Heavyweight” — expensive to start and switch between contexts
  • Structure is in kernel space, can only be accessed through system calls
  • Typically supported at the OS level

Thread — an independent flow of control (with its stack, pc, etc.)

  • Shares much of the parent’s state with other threads: program code, variables, memory map, etc.
  • “Lightweight” — relatively cheap to start and switch context
  • Structure is in user space, allowing for very fast access
  • Supported at the OS level or at the programming language level

Example

  Thread 1    Thread 2
  stmt A      stmt X
  stmt B      stmt Y
  stmt C      stmt Z

There are several possible orderings, including

  stmt A, stmt B, stmt X, stmt C, stmt Y, stmt Z
  stmt X, stmt Y, stmt A, stmt B, stmt C, stmt Z

Any statement order that results from interleaving the two threads, with each thread's own order preserved, defines a legal execution of the program!
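The set of legal executions can be enumerated mechanically. The sketch below is only an illustration: the class name `Interleavings` and its methods are made up for this note, and statements are shortened to single letters. It merges the two statement lists in every way that keeps each thread's own order.

```java
import java.util.ArrayList;
import java.util.List;

class Interleavings {
    // Collects every merge of a and b that keeps each thread's own statement order.
    static List<String> all(String[] a, String[] b) {
        List<String> out = new ArrayList<>();
        build(a, 0, b, 0, "", out);
        return out;
    }

    private static void build(String[] a, int i, String[] b, int j,
                              String prefix, List<String> out) {
        if (i == a.length && j == b.length) {
            out.add(prefix.trim());
            return;
        }
        // Either thread may execute its next statement.
        if (i < a.length) build(a, i + 1, b, j, prefix + " " + a[i], out);
        if (j < b.length) build(a, i, b, j + 1, prefix + " " + b[j], out);
    }

    public static void main(String[] args) {
        List<String> orders = all(new String[] {"A", "B", "C"},
                                  new String[] {"X", "Y", "Z"});
        orders.forEach(System.out::println);
        System.out.println(orders.size() + " legal orderings");
    }
}
```

For two threads of three statements each there are 20 such orderings (6 choose 3), and the two orderings listed above are among them.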

Issues in Shared-Memory Programming

Expressing Parallelism:

  • Compiler directives: e.g. OpenMP
  • Libraries: e.g. Pthreads
  • Language constructs: e.g. parbegin/parend, fork/join
  • New languages: e.g. Cilk/Cilk++

Synchronization Mechanism:

  • For accessing shared data
  • For coordinating computation

Compiler Directives

A non-intrusive approach for providing shared-memory programming power. In this case, the parallelism information is presented in the form of compiler directives inserted into the user program. The program can compile and run directly on a sequential computer by simply disregarding the directives.

  • These directives do not change a program’s sequential semantics. They are only picked up by parallelizing compilers.
  • They generally can specify more detailed information regarding parallelism than constructs can (e.g. how variables should be treated).

Example: OpenMP — A specification for a set of compiler directives, library routines, and environment variables that can be used to specify shared-memory parallelism in Fortran and C/C++ programs. OpenMP is jointly defined by a group of major computer hardware and software vendors.

Thread Libraries

Thread libraries provide a less-intrusive alternative to language constructs for providing programmers with shared-memory programming power.

Examples:

  • POSIX Threads (Pthreads) — supported by all UNIX vendors
  • Solaris Threads
  • Win32 and OS/2 Threads
  • Java threads are built directly on OS threads libraries
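As a rough illustration of the library style, here is a minimal sketch using Java's `Thread` class (Java is chosen only because it appears later in these notes; the class name `ForkJoinSketch` and the sum-of-squares task are invented for the example). Threads are created and started by ordinary library calls, and `join()` waits for each one to finish.

```java
class ForkJoinSketch {
    // Starts n threads, each filling its own array slot, then joins them all.
    static long sumOfSquares(int n) {
        long[] partial = new long[n];
        Thread[] workers = new Thread[n];
        for (int i = 0; i < n; i++) {
            final int id = i;   // lambdas may only capture effectively final locals
            workers[i] = new Thread(() -> partial[id] = (long) id * id);
            workers[i].start();
        }
        long total = 0;
        for (int i = 0; i < n; i++) {
            try {
                workers[i].join();   // after join(), the worker's write is visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            total += partial[i];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(4));   // 0 + 1 + 4 + 9
    }
}
```

Because every thread writes to a different slot, no synchronization beyond `join()` is needed in this particular sketch.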

New Programming Languages

This is the most aggressive approach. Designing a new language has the advantage that parallel features can truly be integrated into the language model. However, the challenge of gaining wide acceptance is huge.

So far, not many attempts have been made in this area.

Cilk/Cilk++ is a borderline example: it is C/C++ based, with a small set of keywords added. When the Cilk/Cilk++ keywords are removed from Cilk/Cilk++ source code, the result is a valid C/C++ program.

The Need for Synchronization

  • Resolving Competition: The non-determinism in the default semantics of multi-threaded programs can easily cause a program to produce different results on different runs.
  • Facilitating Cooperation: Another situation calling for synchronization is when one thread needs to use the data produced by another thread, e.g. a producer-consumer type of application. There has to be a way for the producer to inform the consumer that data is ready, or for the consumer to tell the producer that more data is needed.

Example

Consider two threads, each of which is to add one to a shared data item, x. To accomplish this, the contents of location x must be read, x + 1 computed, and the result written back to the location. So we have,

  Thread 1          Thread 2
  read x            read x
  compute x + 1     compute x + 1
  write x back      write x back

Because the statements of the two threads can be interleaved, we may get different results at the end. If one thread finishes all three steps before the other starts, x increases by two; if both threads read x before either writes it back, one update is lost and x increases by only one.
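The effect of a particular interleaving can be replayed deterministically. The sketch below does not use real threads; it is a small simulation (the class `LostUpdate` and the schedule encoding are assumptions made for this note) in which each entry of the schedule names the thread that performs its next step.

```java
class LostUpdate {
    // Replays one schedule; schedule[k] is the thread (0 or 1) that takes the next step.
    // Each thread performs three steps in order: read x, compute x + 1, write x back.
    static int run(int[] schedule) {
        int x = 0;                    // the shared data item
        int[] reg = new int[2];       // each thread's private working copy
        int[] step = new int[2];      // number of steps each thread has completed
        for (int t : schedule) {
            switch (step[t]++) {
                case 0: reg[t] = x; break;            // read x
                case 1: reg[t] = reg[t] + 1; break;   // compute x + 1
                case 2: x = reg[t]; break;            // write x back
                default:
                    throw new IllegalArgumentException("thread " + t + " has only three steps");
            }
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(run(new int[] {0, 0, 0, 1, 1, 1}));   // one thread after the other
        System.out.println(run(new int[] {0, 1, 0, 1, 0, 1}));   // steps fully interleaved
    }
}
```

The first schedule yields 2; the second yields 1, because both threads read the old value before either writes.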

Critical Sections

The problem of accessing shared data can be generalized by considering shared resources. A mechanism for ensuring that only one thread accesses a particular resource at a time is to establish sections of code involving the resource as so-called critical sections and arrange that only one thread executes a critical section for the resource at a time.

  • The first thread to reach its critical section for a particular resource enters and executes the critical section. It prevents all other threads from entering their critical sections for the same resource.
  • Once the thread has finished its critical section, another thread is allowed to enter a critical section for the same resource.

This mechanism is known as mutual exclusion.
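A minimal sketch of mutual exclusion, assuming Java's `ReentrantLock` as the lock (the notes do not prescribe a particular API, and the class name `CriticalSectionSketch` is invented here). The increment of the shared item is the critical section, so no update can be lost.

```java
import java.util.concurrent.locks.ReentrantLock;

class CriticalSectionSketch {
    // Several threads add one to a shared item; the increment is a critical section.
    static int run(int threads, int incrementsPerThread) {
        ReentrantLock lock = new ReentrantLock();
        int[] x = {0};                       // the shared resource
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int k = 0; k < incrementsPerThread; k++) {
                    lock.lock();             // enter: other threads must wait here
                    try {
                        x[0] = x[0] + 1;     // read, compute, write back
                    } finally {
                        lock.unlock();       // leave: one waiting thread may enter
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return x[0];
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10000));
    }
}
```

With the lock in place the result is always threads × incrementsPerThread; without it, the count could come out lower on some runs.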

Deadlock

Deadlock can occur with two threads when each requires a resource held by the other. For example,

  Thread 1          Thread 2
  requests A        requests B
  requests B        requests A
  uses A and B      uses A and B

Deadlock occurs when Thread 1 holds A, Thread 2 holds B, and each then requests the resource held by the other.

Deadlock can also occur in a circular fashion with several threads.

  Thread 1        Thread 2        ...    Thread k
  holds A1        holds A2        ...    holds Ak
  requests A2     requests A3     ...    requests A1

Deadlock can be avoided if all threads request resources in the same order.
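One way to enforce a common request order is to give every resource a rank and always lock the lower rank first. The sketch below assumes this ranking scheme; the `Resource` class, its `rank` field, and `useBoth` are hypothetical names introduced for the example.

```java
import java.util.concurrent.locks.ReentrantLock;

class OrderedLocking {
    static final class Resource {
        final int rank;                       // assumed unique per resource
        final ReentrantLock lock = new ReentrantLock();
        int uses;
        Resource(int rank) { this.rank = rank; }
    }

    // Locks both resources, lower rank first, regardless of argument order.
    static void useBoth(Resource p, Resource q) {
        Resource first = p.rank < q.rank ? p : q;
        Resource second = (first == p) ? q : p;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                p.uses++;                     // "uses A and B"
                q.uses++;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    static int run(int rounds) {
        Resource a = new Resource(1), b = new Resource(2);
        Thread t1 = new Thread(() -> { for (int i = 0; i < rounds; i++) useBoth(a, b); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < rounds; i++) useBoth(b, a); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return a.uses + b.uses;
    }

    public static void main(String[] args) {
        System.out.println(run(1000));
    }
}
```

Thread 2 names the resources in the opposite order, but both threads still lock A before B, so the hold-and-wait cycle described above cannot form.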

Semaphores

When multiple copies of a particular shared resource are available for multiple threads to use, a simple lock, which admits only one thread at a time, is not an appropriate way to control access.

A semaphore, s, is a non-negative integer operated upon by two operations named P and V.

  • P(s) — waits until s is greater than zero and then decrements s by one and allows the thread to continue.
  • V(s) — increments s by one to release one of the waiting threads (if any).

The P and V operations are performed atomically. A mechanism for activating waiting threads is also implicit in the operations.

Threads delayed by P(s) sleep until released by a V(s) on the same semaphore. Semaphore routines exist on Unix systems. They are not part of Pthreads itself, though the Unix semaphore routines can be used in Pthreads programs.
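A small sketch of a counting semaphore guarding several identical resource units, assuming Java's `java.util.concurrent.Semaphore`, whose acquire and release operations play the roles of P and V. The class `ResourcePool` and the peak-tracking counters are invented for illustration.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class ResourcePool {
    // Returns the largest number of threads that held a resource unit at once.
    static int maxConcurrentUsers(int units, int threads) {
        Semaphore s = new Semaphore(units);          // s starts at the number of units
        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                s.acquireUninterruptibly();          // P(s): wait until s > 0, then s--
                try {
                    int now = inUse.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    Thread.yield();                  // pretend to use the resource
                    inUse.decrementAndGet();
                } finally {
                    s.release();                     // V(s): s++, wake one waiter
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println(maxConcurrentUsers(3, 12));
    }
}
```

The reported peak varies from run to run, but it can never exceed the number of units.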

Using (Binary) Semaphores as Locks

Mutual exclusion of critical sections can be achieved with one semaphore having the value 0 or 1, which acts as a lock variable.

  • The semaphore is initialized to 1, indicating that no thread is in its critical section associated with the semaphore.
  • Each mutually exclusive critical section is preceded by a P(s) and terminated with a V(s) on the same semaphore s; i.e.,

      Thread 1    Thread 2    Thread 3
      ...         ...         ...
      P(s)        P(s)        P(s)
      cr.sec      cr.sec      cr.sec
      V(s)        V(s)        V(s)

  • The first thread to reach its P(s) operation will set s to 0, inhibiting the other threads from proceeding past their P(s) operations.
  • When the thread reaches its V(s) operation, it sets s to 1 and one of the threads waiting is allowed to proceed.

General (Counting) Semaphores

General semaphores can take on positive values other than 0 and 1. Such semaphores provide a means of recording the number of “resource units” available or used, and can be used to solve producer/consumer problems.

Example:

producer()
{
    request_t *request;
    while (TRUE) {
        request = get_request();
        add_request(request);
        V(queue_length);        /* one more request is available */
    }
}

consumer()
{
    request_t *request;
    while (TRUE) {
        P(queue_length);        /* wait until a request is available */
        request = remove_request();
        thread_request(request);
    }
}
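The same pattern can be sketched as a runnable program. The version below is an assumption-laden illustration rather than the course's code: it uses Java's `Semaphore` for queue_length, a `ConcurrentLinkedQueue` as the request queue, integers as requests, and a fixed number of requests so that it terminates.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;

class ProducerConsumer {
    // Produces the requests 1..n and returns the sum of the requests consumed.
    static long run(int n) {
        Queue<Integer> queue = new ConcurrentLinkedQueue<>();
        Semaphore queueLength = new Semaphore(0);     // no requests available yet
        long[] sum = new long[1];                     // touched only by the consumer

        Thread producer = new Thread(() -> {
            for (int r = 1; r <= n; r++) {
                queue.add(r);                         // add_request
                queueLength.release();                // V(queue_length)
            }
        });
        Thread consumer = new Thread(() -> {
            for (int k = 0; k < n; k++) {
                queueLength.acquireUninterruptibly(); // P(queue_length)
                sum[0] += queue.remove();             // remove_request, then handle it
            }
        });
        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum[0];
    }

    public static void main(String[] args) {
        System.out.println(run(100));
    }
}
```

The semaphore value always equals the number of requests added but not yet claimed, so the consumer never removes from an empty queue.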

A Graphic View of Condition Variables

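[Figure: flowcharts of the two operations. wait: Lock, then test cond; if it holds (y), Unlock and Continue; if not (n), Unlock and Sleep. signal: Lock, set cond = TRUE, Unlock, Wakeup, Continue.]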

Example

Consider one or more threads designed to take action when a counter, x, is zero. Another thread is responsible for decrementing the counter.

action()
{
    ...
    lock();
    while (x != 0)
        wait(s);
    unlock();
    take_action();
    ...
}

counter()
{
    lock();
    x--;
    if (x == 0)
        signal(s);
    unlock();
    ...
}

  • The wait operation will automatically release the lock, to allow another thread to alter the condition.
  • The signal operation will automatically wake up any or all waiting threads (implementation dependent).
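  • The awakened threads will try to acquire the lock again, and one of them will succeed. Since the condition may have changed in the meantime, it is rechecked in a while loop rather than tested once.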
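A runnable sketch of this example, assuming Java's `ReentrantLock` and `Condition` in place of the lock/wait/signal primitives. The class `CountdownAction` and the field `seenByAction` are made up for the illustration; unlike the pseudocode, the "action" here is just recording x, and it is done while the lock is still held.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class CountdownAction {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition s = lock.newCondition();   // condition variable tied to lock
    private int x;
    private int seenByAction = -1;

    CountdownAction(int start) { x = start; }

    void action() {
        lock.lock();
        try {
            while (x != 0)                  // recheck after every wakeup
                s.awaitUninterruptibly();   // releases the lock while asleep
            seenByAction = x;               // the "action": record the value observed
        } finally {
            lock.unlock();
        }
    }

    void counter() {
        lock.lock();
        try {
            x--;
            if (x == 0)
                s.signalAll();              // wake every waiting thread
        } finally {
            lock.unlock();
        }
    }

    // Starts one waiting thread, counts down from start, returns the x it observed.
    static int run(int start) {
        CountdownAction c = new CountdownAction(start);
        Thread waiter = new Thread(c::action);
        waiter.start();
        for (int i = 0; i < start; i++) c.counter();
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return c.seenByAction;
    }

    public static void main(String[] args) {
        System.out.println(run(5));
    }
}
```

The waiting thread proceeds only once the counter has reached zero, whether it started waiting before or after the final decrement.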

Monitors

Monitors provide an object-oriented style of access control: the protected data and the operations (monitor procedures) that can operate upon it are encapsulated inside a structure. Reading and writing the data can only be done through monitor procedures, and only one thread can be executing a monitor procedure at any instant.

A monitor procedure could be implemented using a semaphore to protect its entry; i.e.,

monitor_proc1()
{
    P(monitor_semaphore);
    ...                         /* body of the monitor procedure */
    V(monitor_semaphore);
    return;
}

Java Threads Example

The concept of a monitor exists in Java. The keyword synchronized makes a method thread safe by preventing more than one thread from executing synchronized methods on the same object at the same time.

public class Adder {
    public int[] array;
    private int sum = 0, index = 0, numThreads = 10, threadsQuit;

    public Adder() {
        array = new int[1000];
        threadsQuit = 0;
        initializeArray();
        startThreads();
    }

    private void initializeArray() {
        for (int i = 0; i < 1000; i++)
            array[i] = i;
    }
    ...
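As a separate, self-contained illustration of the monitor idea (this is not the continuation of the Adder class; the class `SumMonitor` and its methods are invented for this note), the sketch below keeps the protected data private and exposes it only through synchronized methods.

```java
class SumMonitor {
    private long sum = 0;   // protected data, reachable only via the methods below

    // Monitor procedures: at most one thread runs either of them on a given object.
    synchronized void add(int value) { sum += value; }
    synchronized long get() { return sum; }

    // Each of the threads adds 1..n to one shared monitor; returns the final sum.
    static long run(int threads, int n) {
        SumMonitor m = new SumMonitor();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int v = 1; v <= n; v++) m.add(v);
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return m.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 100));
    }
}
```

Dropping the synchronized keyword from add would reintroduce the lost-update problem on sum.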

Thread-Safe Routines

System calls or library routines are called thread safe if they can be called from multiple threads simultaneously and always produce correct results.

  • Standard I/O is designed to be thread safe (and will print messages without interleaving the characters).
  • Routines that access shared data and static data may require special care to be made thread safe.
  • The thread-safety problem of any routine can be sidestepped by forcing only one thread to execute the routine at a time. (This could be achieved by simply enclosing the routine in a critical section.)
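The last point can be sketched as follows, assuming a hypothetical ticket-numbering routine that keeps its state in static data. The class `TicketCounter` and its methods are invented for the example; the wrapper simply turns the whole routine into one critical section.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class TicketCounter {
    private static int last = 0;        // static data shared by all callers

    // Not thread safe on its own: the read, add, and write can interleave.
    private static int unsafeNext() { return ++last; }

    // Thread-safe wrapper: only one thread at a time executes the routine.
    static synchronized int next() { return unsafeNext(); }

    // Draws tickets from several threads and counts how many distinct ones were issued.
    static int distinctTickets(int threads, int perThread) {
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) seen.add(next());
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctTickets(4, 1000));
    }
}
```

Every call returns a different ticket, so the number of distinct tickets equals the number of calls. The price is that callers are serialized, which is why this is a fallback rather than a general design.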

Desirable Access Control Policies

The above synchronization mechanisms provide basic access control on shared data. For many applications, stronger policies are required on top of them, such as

  • Safe: A safe policy is one that enforces deterministic results, i.e. the access order is always the same no matter how many times the program is run.
  • Fair: A fair policy guarantees access to all tasks that have requested access before second or third accesses are granted to any pending tasks.
  • Deadlock-Free: A deadlock-free policy guarantees that no deadlock can ever happen, no matter in what order or at what speed the requests are generated.