Introduction to Threading in Python: Concepts, Implementation, and Best Practices, Quizzes of Programming Languages

Alexandria University Programming Languages


Typology: Quizzes

2022/2023

Uploaded on 01/22/2023

omnia-nabil-gharieb-ghonem 🇪🇬


Introduction to threading

Review

What Is a Thread?

A thread is a separate flow of execution. This means that your program will have two things happening at once. But for most Python 3 implementations the different threads do not actually execute at the same time: they merely appear to.

Threads run on only one processor (generally)

It’s tempting to think of threading as having two (or more) different processors running your program, each one doing an independent task at the same time. That’s almost right. The threads may be running on different processors, but they will only be running one at a time.

Getting multiple tasks running simultaneously requires a non-standard implementation of Python, writing some of your code in a different language, or using multiprocessing, which comes with some extra overhead.

Because of the way the CPython implementation of Python works, threading may not speed up all tasks. This is due to interactions with the GIL (Global Interpreter Lock), which essentially allows only one Python thread to run at a time.

Threads are for I/O Bound programs

Tasks that spend much of their time waiting for external events are generally good candidates for threading. Problems that require heavy CPU computation and spend little time waiting for external events might not run faster at all.

This is true for code written in Python and running on the standard CPython implementation. If your threads are written in C they have the ability to release the GIL and run concurrently. If you are running on a different Python implementation, check with the documentation to see how it handles threads.

If you are running a standard Python implementation, writing in only Python, and have a CPU-bound problem, you should check out the multiprocessing module instead.
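To make the I/O-bound case concrete, here is a minimal sketch that compares running a few waiting tasks one after another with running them in threads. The function names (fake_io_task, run_sequential, run_threaded) are made up for this illustration, and time.sleep() is only a stand-in for a real wait such as a network request:

```python
import threading
import time


def fake_io_task(delay):
    """Hypothetical stand-in for an I/O wait (network, disk, ...)."""
    time.sleep(delay)


def run_sequential(count, delay):
    """Run the tasks one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(count):
        fake_io_task(delay)
    return time.perf_counter() - start


def run_threaded(count, delay):
    """Run the tasks in separate threads and return the elapsed seconds."""
    start = time.perf_counter()
    threads = [
        threading.Thread(target=fake_io_task, args=(delay,))
        for _ in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait for every task before stopping the clock
    return time.perf_counter() - start


if __name__ == "__main__":
    print("sequential: %.2f s" % run_sequential(3, 0.2))
    print("threaded:   %.2f s" % run_threaded(3, 0.2))
```

Because each task spends its time waiting rather than computing, the threaded version should take roughly as long as one task, while the sequential version takes roughly the sum of all of them. A CPU-bound task would not be expected to show the same gain on standard CPython.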

Starting a Thread

To start a separate thread, you create a Thread instance and then tell it to .start():

import logging
import threading
import time

def thread_function(name):
    logging.info("Thread %s: starting", name)
    time.sleep(2)
    logging.info("Thread %s: finishing", name)

if __name__ == "__main__":
    format = "%(asctime)s: %(message)s"
    logging.basicConfig(format=format, level=logging.INFO,
                        datefmt="%H:%M:%S")

    logging.info("Main : before creating thread")
    x = threading.Thread(target=thread_function, args=(1,))
    logging.info("Main : before running thread")
    x.start()
    logging.info("Main : wait for the thread to finish")
    # x.join()
    logging.info("Main : all done")

If you look around the logging statements, you can see that the main section is creating and starting the thread:

x = threading.Thread(target=thread_function, args=(1,))
x.start()

When you run this program as it is (with line twenty, the x.join() call, commented out), the output will look like this:

$ ./single_thread.py
Main : before creating thread
Main : before running thread
Thread 1: starting
Main : wait for the thread to finish
Main : all done
Thread 1: finishing

Daemon Threads

In computer science, a daemon is a process that runs in the background.

Python threading has a more specific meaning for daemon. A daemon thread will shut down immediately when the program exits. One way to think about these definitions is to consider the daemon thread a thread that runs in the background without worrying about shutting it down.

If a program is running Threads that are not daemons, then the program will wait for those threads to complete before it terminates. Threads that are daemons, however, are just killed wherever they are when the program is exiting.

Let’s look a little more closely at the output of your program above. The last two lines are the interesting bit. When you run the program, you’ll notice that there is a pause (of about 2 seconds) after __main__ has printed its all done message and before the thread is finished.
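As a rough sketch of the difference, a thread can be marked as a daemon by passing daemon=True when it is created. The helper names below (background_task, make_threads) are invented for this example:

```python
import threading
import time


def background_task():
    """Hypothetical background work; the sleep stands in for real activity."""
    time.sleep(0.2)


def make_threads():
    """Create one regular thread and one daemon thread (not yet started)."""
    regular = threading.Thread(target=background_task)
    daemon = threading.Thread(target=background_task, daemon=True)
    return regular, daemon


if __name__ == "__main__":
    regular, daemon = make_threads()
    print("regular.daemon =", regular.daemon)
    print("daemon.daemon  =", daemon.daemon)
    regular.start()
    daemon.start()
    # The interpreter waits for the regular thread before exiting.
    # If only the daemon thread were still running, it would simply be killed.
    print("main: all done")
```

When the main section ends, Python waits for the regular thread to finish. A daemon thread that is still running at that point is stopped without getting a chance to clean up, so it is best suited to work that can safely be abandoned.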

The harder way of starting multiple threads is the one you already know:

import logging
import threading
import time

def thread_function(name):
    logging.info("Thread %s: starting", name)
    time.sleep(2)
    logging.info("Thread %s: finishing", name)

if __name__ == "__main__":
    format = "%(asctime)s: %(message)s"
    logging.basicConfig(format=format, level=logging.INFO,
                        datefmt="%H:%M:%S")

    # Start three threads and remember them so they can be joined later.
    threads = list()
    for index in range(3):
        logging.info("Main : create and start thread %d.", index)
        x = threading.Thread(target=thread_function, args=(index,))
        threads.append(x)
        x.start()

    # Wait for each thread in turn.
    for index, thread in enumerate(threads):
        logging.info("Main : before joining thread %d.", index)
        thread.join()
        logging.info("Main : thread %d done", index)

This code uses the same mechanism you saw above to start a thread: create a Thread object, and then call .start(). The program keeps a list of Thread objects so that it can wait for them later using .join().

Running this code multiple times will likely produce some interesting results. Here’s an example output from my machine:

$ ./multiple_threads.py
Main : create and start thread 0.
Thread 0: starting
Main : create and start thread 1.
Thread 1: starting
Main : create and start thread 2.
Thread 2: starting
Main : before joining thread 0.
Thread 2: finishing
Thread 1: finishing
Thread 0: finishing
Main : thread 0 done
Main : before joining thread 1.
Main : thread 1 done
Main : before joining thread 2.
Main : thread 2 done

If you walk through the output carefully, you’ll see all three threads getting started in the order you might expect, but in this case they finish in the opposite order! Multiple runs will produce different orderings. Look for the Thread x: finishing message to tell you when each thread is done.

The order in which threads are run is determined by the operating system and can be quite hard to predict. It may (and likely will) vary from run to run, so you need to be aware of that when you design algorithms that use threading.

Fortunately, Python gives you several primitives that you’ll look at later to help coordinate threads and get them running together. Before that, let’s look at how to make managing a group of threads a bit easier.

Did you test this on the code with the daemon thread or the regular thread? It turns out that it doesn’t matter. If you .join() a thread, that statement will wait until either kind of thread is finished.
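The point about .join() can be checked with a small sketch. The worker and run_and_join names are hypothetical, and the shared list is just a simple way to see whether the thread finished its work before .join() returned:

```python
import threading
import time


def worker(results):
    """Hypothetical task: wait briefly, then record that it completed."""
    time.sleep(0.2)
    results.append("finished")


def run_and_join(daemon):
    """Start a thread (daemon or regular), join it, and return its results."""
    results = []
    thread = threading.Thread(target=worker, args=(results,), daemon=daemon)
    thread.start()
    thread.join()  # blocks until the thread is done, daemon or not
    return results


if __name__ == "__main__":
    print("regular:", run_and_join(daemon=False))
    print("daemon: ", run_and_join(daemon=True))
```

In both cases the list already contains the worker's entry when .join() returns, which is consistent with the statement that .join() waits for either kind of thread.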

Using a ThreadPoolExecutor (Recommended)

There’s an easier way to start up a group of threads than the one you saw above. It’s called a ThreadPoolExecutor, and it’s part of the standard library in concurrent.futures (as of Python 3.2).

The easiest way to create it is as a context manager, using the with statement to manage the creation and destruction of the pool.

Here’s the main from the last example rewritten to use a ThreadPoolExecutor:

import concurrent.futures

# [rest of code]

if __name__ == "__main__":
    format = "%(asctime)s: %(message)s"
    logging.basicConfig(format=format, level=logging.INFO,
                        datefmt="%H:%M:%S")

    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
        executor.map(thread_function, range(3))

The code creates a ThreadPoolExecutor as a context manager, telling it how many worker threads it wants in the pool. It then uses .map() to step through an iterable of things, in your case range(3), passing each one to a thread in the pool.

The end of the with block causes the ThreadPoolExecutor to do a .join() on each of the threads in the pool. It is strongly recommended that you use ThreadPoolExecutor as a context manager when you can so that you never forget to .join() the threads.
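The example above discards whatever .map() returns. As a small sketch of how the results could be collected, the following uses a made-up square function and a hypothetical squares_with_pool helper; .map() yields results in the same order as the inputs, regardless of which worker finishes first:

```python
import concurrent.futures


def square(n):
    """Hypothetical task run by the pool's worker threads."""
    return n * n


def squares_with_pool(values, max_workers=3):
    """Run square() over values in a thread pool and return the results."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Consuming the iterator collects the results in input order.
        return list(executor.map(square, values))


if __name__ == "__main__":
    print(squares_with_pool(range(5)))
```

Consuming the iterator returned by .map() has a second benefit: an exception raised inside a worker is re-raised when its result is retrieved. If the results are never read, as in the example above, such an exception passes unnoticed.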

Basic Synchronization Using Lock

To solve your race condition above, you need to find a way to allow only one thread at a time into the read-modify-write section of your code. The most common way to do this is called Lock in Python. In some other languages this same idea is called a mutex. Mutex comes from MUTual EXclusion, which is exactly what a Lock does.

A Lock is an object that acts like a hall pass. Only one thread at a time can have the Lock. Any other thread that wants the Lock must wait until the owner of the Lock gives it up.

The basic functions to do this are .acquire() and .release(). A thread will call my_lock.acquire() to get the lock. If the lock is already held, the calling thread will wait until it is released. There’s an important point here. If one thread gets the lock but never gives it back, your program will be stuck. You’ll read more about this later.

Fortunately, Python’s Lock will also operate as a context manager, so you can use it in a with statement, and it gets released automatically when the with block exits for any reason.

Let’s look at the FakeDatabase with a Lock added to it. The calling function stays the same:

class FakeDatabase:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def locked_update(self, name):
        logging.info("Thread %s: starting update", name)
        logging.debug("Thread %s about to lock", name)
        with self._lock:
            logging.debug("Thread %s has lock", name)
            local_copy = self.value
            local_copy += 1
            time.sleep(0.1)
            self.value = local_copy
            logging.debug("Thread %s about to release lock", name)
        logging.debug("Thread %s after release", name)
        logging.info("Thread %s: finishing update", name)

It’s worth noting here that the thread running this function will hold on to that Lock until it is completely finished updating the database. In this case, that means it will hold the Lock while it copies, updates, sleeps, and then writes the value back to the database.
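The calling function is not shown in this text, so the following is only a sketch of how several threads might drive a locked update. LockedCounter is a stripped-down, hypothetical stand-in for FakeDatabase (no logging, shorter sleep), and run_updates is an assumed helper:

```python
import concurrent.futures
import threading
import time


class LockedCounter:
    """Hypothetical, simplified stand-in for FakeDatabase."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def locked_update(self):
        # Only one thread at a time may run the read-modify-write section.
        with self._lock:
            local_copy = self.value
            local_copy += 1
            time.sleep(0.01)  # widen the window where a race could occur
            self.value = local_copy


def run_updates(workers):
    """Run one locked update per worker thread and return the final value."""
    counter = LockedCounter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as executor:
        for _ in range(workers):
            executor.submit(counter.locked_update)
    return counter.value


if __name__ == "__main__":
    print("final value:", run_updates(4))
```

With the lock in place, every update is applied and the final value equals the number of updates. The cost is that the updates run strictly one after another, because each thread holds the lock for the whole read-modify-write sequence, including the sleep.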

Deadlock

Before you move on, you should look at a common problem when using Locks. As you saw, if the Lock has already been acquired, a second call to .acquire() will wait until the thread that is holding the Lock calls .release(). What do you think happens when you run this code:

import threading

l = threading.Lock()
print("before first acquire")
l.acquire()
print("before second acquire")
l.acquire()
print("acquired lock twice")
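Run as written, that program hangs at the second .acquire(), because the only thread that could release the Lock is the one waiting for it. The sketch below shows the same situation without hanging, by giving the second .acquire() a timeout. The second_acquire_succeeds helper is made up for this example, and threading.RLock is brought in here only as a comparison; it is not part of the code above:

```python
import threading


def second_acquire_succeeds(lock, timeout=0.5):
    """Acquire lock, then try to acquire it again from the same thread.

    Returns True if the second attempt succeeded within the timeout.
    """
    lock.acquire()
    try:
        got_it = lock.acquire(timeout=timeout)  # gives up instead of hanging
        if got_it:
            lock.release()  # undo the second acquire
        return got_it
    finally:
        lock.release()  # undo the first acquire


if __name__ == "__main__":
    print("Lock: ", second_acquire_succeeds(threading.Lock()))
    print("RLock:", second_acquire_succeeds(threading.RLock()))
```

With a plain Lock the second attempt times out, which is the deadlock made visible. An RLock can be acquired again by the thread that already holds it, as long as every .acquire() is matched by a .release().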
