Multithreading and Fork-Join - Programming Languages and Techniques II - Lecture Slides, Slides of Programming Languages

Programming languages differ mainly in syntax, not in the underlying logic. This course discusses core concepts shared by many programming languages and techniques. Key points for this lecture are: Multithreading and Fork-Join, Sequential Programming, Threads of Execution, Synchronize, Concurrent Access, Simplified View of History, Moore's Law, Parallelism and Concurrency, Analogy, Shared Memory

Typology: Slides

2012/2013

Uploaded on 09/29/2013

dhanvant

A Sophomoric Introduction to Shared-Memory Parallelism and Concurrency

Introduction to Multithreading & Fork-Join Parallelism

Changing a major assumption

So far, most or all of your study of computer science has assumed that one thing happens at a time.

This is called sequential programming: everything is part of one sequence.

Removing this assumption creates major challenges & opportunities

  • Programming: Divide work among threads of execution and coordinate (synchronize) among them
  • Algorithms: How can parallel activity provide speed-up (more throughput: work done per unit time)?
  • Data structures: May need to support concurrent access (multiple threads operating on data at the same time)

What to do with multiple processors?

  • Next computer you buy will likely have 4 processors
    • Wait a few years and it will be 8, 16, 32, …
    • The chip companies have decided to do this (not a “law”)
  • What can you do with them?
    • Run multiple totally different programs at the same time
      • Already do that? Yes, but with time-slicing
    • Do multiple things at once in one program
      • Our focus – more difficult
      • Requires rethinking everything from asymptotic complexity to how to implement data-structure operations

Parallelism vs. Concurrency

Note: Terms not yet standard but the perspective is essential

  • Many programmers confuse these concepts

There is some connection:

  • Common to use threads for both
  • If parallel computations need access to shared resources, then the concurrency needs to be managed

First 3ish lectures on parallelism, then 3ish lectures on concurrency

Parallelism: Use extra resources to solve a problem faster

[Figure: work divided among several resources]

Concurrency: Correctly and efficiently manage access to shared resources

[Figure: multiple requests directed at a single resource]

Parallelism Example

Parallelism: Use extra computational resources to solve a problem faster (increasing throughput via simultaneous execution)

Pseudocode for array sum

  • Bad style for reasons we’ll see, but may get roughly 4x speedup

    int sum(int[] arr){
      res = new int[4];
      len = arr.length;
      FORALL(i=0; i < 4; i++) { // parallel iterations
        res[i] = sumRange(arr,i*len/4,(i+1)*len/4);
      }
      return res[0]+res[1]+res[2]+res[3];
    }

    int sumRange(int[] arr, int lo, int hi) {
      result = 0;
      for(j=lo; j < hi; j++)
        result += arr[j];
      return result;
    }

Concurrency Example

Concurrency: Correctly and efficiently manage access to shared resources (from multiple possibly-simultaneous clients)

Pseudocode for a shared chaining hashtable

  • Prevent bad interleavings (correctness)
  • But allow some concurrent access (performance)

    class Hashtable<K,V> {
      void insert(K key, V value) {
        int bucket = …;
        prevent-other-inserts/lookups in table[bucket]
        do the insertion
        re-enable access to table[bucket]
      }
      V lookup(K key) {
        (similar to insert, but can allow concurrent lookups to same bucket)
      }
    }
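One way to make the pseudocode concrete is to use each bucket's chain as its own lock with Java's built-in synchronized blocks. The sketch below is only an illustration under that assumption: the class name BucketLockedHashtable, the Entry helper, and the fixed bucket count are all hypothetical. It is also stricter than the pseudocode, because it does not let two lookups proceed in the same bucket at once. Threads working on different buckets do not block each other.

```java
import java.util.LinkedList;

// Hypothetical sketch: a chaining hashtable with one lock per bucket.
class BucketLockedHashtable<K, V> {
    // Illustrative helper type for one key/value pair in a chain.
    private static final class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry<K, V>>[] table;

    @SuppressWarnings("unchecked")
    BucketLockedHashtable(int numBuckets) {
        table = (LinkedList<Entry<K, V>>[]) new LinkedList[numBuckets];
        for (int i = 0; i < numBuckets; i++) {
            table[i] = new LinkedList<>();
        }
    }

    private int bucketOf(K key) {
        return Math.floorMod(key.hashCode(), table.length);
    }

    void insert(K key, V value) {
        LinkedList<Entry<K, V>> chain = table[bucketOf(key)];
        synchronized (chain) {            // prevent other inserts/lookups in this bucket
            for (Entry<K, V> e : chain) {
                if (e.key.equals(key)) {  // key already present: overwrite
                    e.value = value;
                    return;
                }
            }
            chain.add(new Entry<>(key, value));
        }                                 // leaving the block re-enables access
    }

    V lookup(K key) {
        LinkedList<Entry<K, V>> chain = table[bucketOf(key)];
        synchronized (chain) {            // simplification: lookups also exclude each other
            for (Entry<K, V> e : chain) {
                if (e.key.equals(key)) {
                    return e.value;
                }
            }
            return null;                  // key not found
        }
    }
}
```

Allowing concurrent lookups in one bucket, as the pseudocode suggests, would need a different mechanism such as a reader/writer lock per bucket.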

Shared memory

[Figure: three threads, each with its own call stack and program counter (pc=…)]

Unshared: locals and control

Shared: objects and static fields

Threads each have own unshared call stack and current statement

  • (pc for “program counter”)
  • local variables are numbers, null, or heap references

Any objects can be shared, but most are not

Other models

We will focus on shared memory, but you should know several other models exist and have their own advantages

  • Message-passing: Each thread has its own collection of objects. Communication is via explicitly sending/receiving messages
    • Cooks working in separate kitchens, mail around ingredients
  • Dataflow: Programmers write programs in terms of a DAG. A node executes after all of its predecessors in the graph
    • Cooks wait to be handed results of previous steps
  • Data parallelism: Have primitives for things like “apply function to every element of an array in parallel”
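As a rough illustration of the data-parallel model, Java's standard stream library offers primitives of this kind. The sketch below is not part of the lecture: the class name DataParallelDemo and its methods are hypothetical. It asks the library to apply a function to every array element, and to sum an array, while the library decides how to split the work among threads.

```java
import java.util.Arrays;

// Hypothetical demo class; only the java.util.Arrays and stream calls are real library API.
class DataParallelDemo {
    // Apply "square" to every element; the library may process elements in parallel.
    static int[] squareAll(int[] input) {
        return Arrays.stream(input)
                     .parallel()
                     .map(x -> x * x)
                     .toArray();          // results keep the original element order
    }

    // Sum all elements as a long; the library combines partial sums for us.
    static long sumAll(int[] input) {
        return Arrays.stream(input)
                     .parallel()
                     .asLongStream()
                     .sum();
    }
}
```

The programmer never creates or coordinates threads here, which is the main contrast with the shared-memory thread model.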

Java basics

First learn some basics built into Java via java.lang.Thread

  • Then a better library for parallel programming

To get a new thread running:

  1. Define a subclass C of java.lang.Thread, overriding run
  2. Create an object of class C
  3. Call that object’s start method
    • start sets off a new thread, using run as its “main”

What if we instead called the run method of C?

  • This would just be a normal method call, in the current thread
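The difference between start and run can be observed by recording which thread executes run. The following is a minimal sketch with hypothetical class names (NameRecorder, StartVersusRun). It uses join, which waits for a thread to finish, so that the recorded name can be read safely.

```java
// Hypothetical subclass of Thread that records which thread executed run.
class NameRecorder extends Thread {
    String ranOn;   // name of the thread that executed run

    @Override
    public void run() {
        ranOn = Thread.currentThread().getName();
    }
}

class StartVersusRun {
    // Calling run directly: an ordinary method call in the current thread.
    static String viaRun() {
        NameRecorder r = new NameRecorder();
        r.run();
        return r.ranOn;
    }

    // Calling start: run executes in a new thread, here named "helper".
    static String viaStart() {
        NameRecorder r = new NameRecorder();
        r.setName("helper");
        r.start();
        try {
            r.join();               // wait until run has returned
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return r.ranOn;
    }
}
```

viaRun returns the name of the calling thread, while viaStart returns "helper", the name given to the new thread.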

Let’s see how to share memory and coordinate via an example…

Parallelism idea

  • Example: Sum elements of a large array
  • Idea: Have 4 threads simultaneously sum 1/4 of the array
    • Warning: This is an inferior first approach

[Figure: partial sums ans0, ans1, ans2, ans3 are added to produce ans]

  • Create 4 thread objects, each given a portion of the work
  • Call start() on each thread object to actually run it in parallel
  • Wait for threads to finish using join()
  • Add together their 4 answers for the final result

First attempt, continued (wrong)

    class SumThread extends java.lang.Thread {
      int lo; int hi; int[] arr; // arguments
      int ans = 0; // result
      SumThread(int[] a, int l, int h) { … }
      public void run(){ … } // override
    }

    int sum(int[] arr){ // can be a static method
      int len = arr.length;
      int ans = 0;
      SumThread[] ts = new SumThread[4];
      for(int i=0; i < 4; i++) // do parallel computations
        ts[i] = new SumThread(arr,i*len/4,(i+1)*len/4);
      // wrong: the threads are created but never started
      for(int i=0; i < 4; i++) // combine results
        ans += ts[i].ans;
      return ans;
    }

Second attempt (still wrong)

    int sum(int[] arr){ // can be a static method
      int len = arr.length;
      int ans = 0;
      SumThread[] ts = new SumThread[4];
      for(int i=0; i < 4; i++){ // do parallel computations
        ts[i] = new SumThread(arr,i*len/4,(i+1)*len/4);
        ts[i].start(); // start not run
      }
      // still wrong: nothing waits for the threads to finish
      for(int i=0; i < 4; i++) // combine results
        ans += ts[i].ans;
      return ans;
    }

    class SumThread extends java.lang.Thread {
      int lo; int hi; int[] arr; // arguments
      int ans = 0; // result
      SumThread(int[] a, int l, int h) { … }
      public void run(){ … } // override
    }

Join (not the most descriptive word)

  • The Thread class defines various methods you could not implement on your own
    • For example: start, which calls run in a new thread
  • The join method is valuable for coordinating this kind of computation
    • Caller blocks until/unless the receiver is done executing (meaning the call to run returns)
    • Else we would have a race condition on ts[i].ans
  • This style of parallel programming is called “fork/join”
  • Java detail: code that calls join has a compile error unless it handles java.lang.InterruptedException, which join may throw
    • In basic parallel code, should be fine to catch-and-exit
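Putting start and join together, a complete fork/join version of the array sum might look like the sketch below. The slides elide the constructor and run bodies, so the ones here are assumptions, as is the wrapper class ParallelSum. The sketch also rethrows InterruptedException as an unchecked exception rather than exiting, which is one possible way to handle it.

```java
// Sketch of a helper thread that sums arr[lo..hi); the bodies are assumed, not from the slides.
class SumThread extends Thread {
    private final int[] arr;   // arguments
    private final int lo;
    private final int hi;
    int ans = 0;               // result, read by the caller after join

    SumThread(int[] a, int l, int h) {
        arr = a;
        lo = l;
        hi = h;
    }

    @Override
    public void run() {
        for (int i = lo; i < hi; i++) {
            ans += arr[i];
        }
    }
}

// Hypothetical wrapper class holding the sum method.
class ParallelSum {
    static int sum(int[] arr) {
        int len = arr.length;
        SumThread[] ts = new SumThread[4];
        for (int i = 0; i < 4; i++) {          // fork: do parallel computations
            ts[i] = new SumThread(arr, i * len / 4, (i + 1) * len / 4);
            ts[i].start();                     // start, not run
        }
        int ans = 0;
        try {
            for (int i = 0; i < 4; i++) {      // join: combine results
                ts[i].join();                  // wait for helper i to finish
                ans += ts[i].ans;              // safe to read only after join
            }
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return ans;
    }
}
```

For example, ParallelSum.sum(new int[]{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}) gives 55. The expression i * len / 4 is kept from the slides and would overflow for extremely large arrays.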

Shared memory?

  • Fork-join programs (thankfully) do not require much focus on sharing memory among threads
  • But in languages like Java, there is memory being shared. In our example:
    • lo, hi, arr fields written by “main” thread, read by helper thread
    • ans field written by helper thread, read by “main” thread
  • When using shared memory, you must avoid race conditions
    • While studying parallelism, we will stick with join
    • With concurrency, we will learn other ways to synchronize