Midterm I SOLUTIONS, Lecture notes of Operating Systems

University of California, Berkeley ... CS162: Operating Systems and Systems Programming ... You are allowed 1 page of handwritten notes (both sides).

Uploaded on 05/11/2023
University of California, Berkeley
College of Engineering
Computer Science Division: EECS
Summer 2019
Jack Kolb
Midterm I SOLUTIONS
July 18th, 2019
CS162: Operating Systems and Systems Programming
Your Name:
SID AND 162 Login
(e.g. s042):
TA Name:
Discussion Section
Time:
General Information:
This is a closed book exam. You are allowed 1 page of handwritten notes (both sides). You have
two (2) hours to complete as much of the exam as possible. Make sure to read all of the questions
first, as some of the questions are substantially more time consuming.
Write all of your answers directly on this paper. Make your answers as concise as possible. On
programming questions, we will be looking for performance as well as correctness, so think through
your answers carefully. If there is something about the questions that you believe is open to
interpretation, please ask us about it!
Problem    Possible    Score
1          22
2          12
3          16
4          16
5          16
Total      82

Problem 1: True/False [22 pts]

Please explain your answer in two sentences or fewer (Answers longer than this may not get credit!). Also, answers without an explanation get no credit.

Problem 1a[2pts]: The number of allocated kernel stacks is always equal to the number of allocated userspace stacks.

⬜ True X False

Explain:

If a process is running a user-level threading library, then there can be more stacks in userspace (one per user-level thread) than in the kernel (just one kernel thread for the process). Additionally, there are kernel threads that perform tasks entirely within the operating system and do not require a userspace stack.

Problem 1b[2pts]: When using a simple “Base and Bound” scheme to enforce memory protection, the CPU only sees virtual memory addresses.

⬜ True X False

Explain:

This depends on how the Base and Bound scheme is implemented. If memory addresses are translated at run time (adding the Base value to each address referenced before it is passed on to the memory controller), then the CPU does only see virtual memory addresses. However, if instead a program’s instructions are relocated at load time, then the CPU still uses physical addresses directly.
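The run-time variant described above can be sketched as follows. This is a minimal illustration, not any particular hardware's design: the `struct base_bound` type and the `bb_translate` helper are hypothetical names, and 32-bit addresses are an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-process relocation registers. */
struct base_bound {
    uint32_t base;   /* physical address where the process's region starts */
    uint32_t bound;  /* size of the region in bytes */
};

/* Run-time translation: check the virtual address against the bound, then
 * add the base. Returns false (a protection fault) if the address is out of
 * range; otherwise stores the physical address in *paddr. */
bool bb_translate(const struct base_bound *bb, uint32_t vaddr, uint32_t *paddr) {
    assert(bb != NULL && paddr != NULL);
    if (vaddr >= bb->bound) {
        return false;
    }
    *paddr = bb->base + vaddr;
    return true;
}
```

With translation done this way, the program only ever names addresses in the range 0 to bound - 1, and the physical address exists only after the addition.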

Problem 1c[2pts]: When fork returns a negative integer, an error has occurred that must be dealt with in both the parent and child process.

⬜ True X False

Explain:

When fork returns a negative integer, an error has occurred, but no child process is actually created.
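As a rough sketch of the three possible outcomes of fork, the hypothetical helper `run_child` below handles the error case only in the caller, since no child exists to run that branch. The helper name and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that exits with exit_code. Returns that code as observed by
 * the parent, or -1 if fork or waitpid fails. */
int run_child(int exit_code) {
    assert(exit_code >= 0 && exit_code <= 255);
    pid_t pid = fork();
    if (pid < 0) {
        /* Error: only the calling process reaches this branch. */
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: fork returned 0. */
        _exit(exit_code);
    }
    /* Parent: pid holds the child's PID. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```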

Problem 1d[2pts]: Switching between two threads within the same process is generally more efficient than switching between two threads belonging to different processes.

X True ⬜ False

Explain:

Switching between two threads always requires changing the execution context (registers, stack pointer, program counter, etc.). For two threads belonging to different processes, the OS must also change the current address space, which incurs an additional cost.

Problem 1i[2pts]: Synchronization primitives in base Pintos are implemented with test&set.

⬜ True X False

Explain:

The synchronization primitives in synch.c are implemented by disabling interrupts.

Problem 1j[2pts]: According to the End-to-End Principle, reliable transport should be implemented by the two communications endpoints, not the network infrastructure.

X True ⬜ False

Explain:

The End-to-End Principle states that many features should be implemented by end hosts on a computer network, rather than built in to the network infrastructure itself. Reliable transport is a good example of such a feature, and we saw how TCP achieves this in lecture.

Problem 1k[2pts]: After a call to fork, stdin, stdout, and stderr are reset to their default states in the child process.

⬜ True X False

Explain:

A newly-forked child process inherits the file descriptors of its parent, including the descriptors for stdin, stdout, and stderr.
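Descriptor inheritance can be observed with a pipe, as in the sketch below. The helper name `child_writes_inherited_fd` is hypothetical; the child never opens anything itself and simply writes to a descriptor that was created before the fork.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that writes msg to a pipe descriptor inherited from the
 * parent. The parent reads the data into out (NUL-terminated) and returns
 * the number of bytes read, or -1 if pipe or fork fails. */
ssize_t child_writes_inherited_fd(const char *msg, char *out, size_t out_size) {
    assert(msg != NULL && out != NULL && out_size > 0);
    int fds[2];
    if (pipe(fds) < 0) {
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: fds[1] is usable here only because it was inherited. */
        close(fds[0]);
        ssize_t w = write(fds[1], msg, strlen(msg));
        _exit(w < 0 ? 1 : 0);
    }
    close(fds[1]);  /* parent keeps only the read end */
    ssize_t total = 0;
    ssize_t n;
    while ((size_t) total < out_size - 1 &&
           (n = read(fds[0], out + total, out_size - 1 - (size_t) total)) > 0) {
        total += n;
    }
    out[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```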

Problem 2: Short Answer [12 pts]

Problem 2a[3pts]: What is the Interrupt Vector Table and what role does it play in protecting the kernel?

The Interrupt Vector Table is a mapping from interrupt type (expressed as a number) to the proper handler for that interrupt (typically expressed as an address in memory to jump to). This ensures that we start executing kernel code only at predefined entry points.
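The mapping can be pictured as an array of function pointers indexed by interrupt number. The sketch below is a user-level model only; the table size of 256 and the names `install_handler`, `dispatch`, and `timer_handler` are assumptions, not part of any real kernel.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_VECTORS 256  /* assumed table size */

typedef int (*handler_fn)(int vector);

/* The table itself: interrupt number -> handler entry point. */
static handler_fn vector_table[NUM_VECTORS];

/* In a real system only the kernel may fill in this table. */
void install_handler(int vector, handler_fn fn) {
    assert(vector >= 0 && vector < NUM_VECTORS);
    vector_table[vector] = fn;
}

/* Dispatch: the interrupt number selects an entry; the caller never supplies
 * a jump target directly. Returns -1 if no handler is installed. */
int dispatch(int vector) {
    if (vector < 0 || vector >= NUM_VECTORS || vector_table[vector] == NULL) {
        return -1;
    }
    return vector_table[vector](vector);
}

/* Example handler used for illustration. */
int timer_handler(int vector) {
    return vector * 10;
}
```

Because user code can only name an index, it can enter the kernel at exactly the addresses the kernel chose to install.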

Problem 2b[3pts]: Why are device drivers divided into a top half and bottom half? What is each half responsible for?

Device drivers execute on two separate occasions. First, the top half is invoked by the kernel’s IO subsystem to issue a request to hardware, say to read or write data on a hard drive. Second, the bottom half is invoked by an interrupt handler, after the hardware has fulfilled the original request. The bottom half does the work required to copy data off of the device and back into kernel memory (this is usually not something we want to do within the interrupt handler itself).

Problem 2c[3pts]: In Pintos, how would we allow a struct thread to be an element of two lists at the same time?

We can allow this by adding a second struct list_elem member to struct thread.
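The idea can be sketched with a simplified stand-in for the Pintos list types. The `struct list_elem` and `list_entry` definitions below only imitate the shape of the Pintos ones, and `struct thread_sketch`, its member names, and `link_after` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for Pintos's struct list_elem. */
struct list_elem {
    struct list_elem *prev;
    struct list_elem *next;
};

/* Recovers the enclosing struct from a pointer to one of its list_elem
 * members (the same idea as the list_entry macro in Pintos). */
#define list_entry(ELEM, STRUCT, MEMBER) \
    ((STRUCT *) ((char *) (ELEM) - offsetof(STRUCT, MEMBER)))

struct thread_sketch {
    int tid;
    struct list_elem ready_elem;  /* hypothetical: link in one list */
    struct list_elem wait_elem;   /* hypothetical: link in a second list */
};

/* Links b directly after a (no sentinel handling; illustration only). */
void link_after(struct list_elem *a, struct list_elem *b) {
    assert(a != NULL && b != NULL);
    b->prev = a;
    b->next = a->next;
    if (a->next != NULL) {
        a->next->prev = b;
    }
    a->next = b;
}
```

Each list threads through its own member, so membership in one list never disturbs the prev and next pointers used by the other.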

Problem 2d[3pts]: Does First-Come First-Served or Round-Robin scheduling have lower overhead? Explain.

First-come first-served scheduling has lower overhead because it will require fewer context switches between threads. Because FCFS is non-preemptive, a context switch will only occur when one thread terminates and the scheduler needs to pick a new thread to run. Round-Robin scheduling is preemptive and will context switch between threads whenever the running thread’s quantum has expired.
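The difference in context-switch counts can be estimated with a small model. The sketch below assumes all jobs arrive at time 0, never block, and that a switch is counted whenever the CPU moves from one job to a different one; the function names and the `MAX_JOBS` limit are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_JOBS 16  /* assumed upper limit for this sketch */

/* FCFS: one switch each time a job finishes and the next one takes over. */
int fcfs_switches(size_t n) {
    return n == 0 ? 0 : (int) (n - 1);
}

/* Round-robin with a fixed quantum over jobs with the given CPU bursts. */
int rr_switches(const int *bursts, size_t n, int quantum) {
    assert(bursts != NULL && n <= MAX_JOBS && quantum > 0);
    int remaining[MAX_JOBS];
    size_t left = 0;
    for (size_t i = 0; i < n; i++) {
        remaining[i] = bursts[i];
        if (bursts[i] > 0) {
            left++;
        }
    }
    int switches = 0;
    int last = -1;  /* index of the job that ran most recently */
    while (left > 0) {
        for (size_t i = 0; i < n; i++) {
            if (remaining[i] <= 0) {
                continue;
            }
            if (last != -1 && last != (int) i) {
                switches++;
            }
            last = (int) i;
            remaining[i] -= quantum;
            if (remaining[i] <= 0) {
                left--;
            }
        }
    }
    return switches;
}
```

For three jobs of 30 time units each and a quantum of 10, this model counts 8 switches under Round-Robin against 2 under FCFS; once the quantum exceeds every burst, the two counts coincide.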

List all possible outputs of the main program in each row of the table below (where one row corresponds to one run of the program). You may not need all the tables provided. Assume that all system calls succeed EXCEPT that execv() may possibly fail and assume that the child’s PID is 162.

Run    Standard Output       greetings.txt
1      Parent howdy!: 162    Parent howdy!: 162
2      Child hello!          Child hello! Parent howdy!: 162 Parent howdy!: 162
3      (no output)           Child hello! Child hello! Parent howdy!: 162 Parent howdy!: 162

Case 1: After the parent process forks, the child process fails to execv, prompting the child process to send a SIGTERM signal to the parent process. The parent process invokes signal_handler() upon returning, restoring file descriptor 1 to stdout. The parent process then prints “Parent howdy!: 162” to both stdout and greetings.txt.

Case 2: After the parent process forks, it continues execution and sends a SIGTERM signal to the child process. The parent process calls wait, context switching back to the child process. Upon executing, the child process invokes signal_handler() and restores file descriptor 1 to stdout. Thus, the child process prints “Child hello!” to both stdout and greetings.txt. When the parent process returns from wait, it prints “Parent howdy!: 162” to greetings.txt twice because its file descriptor 1 never got restored to stdout.

Case 3: After the parent process forks, the child process continues execution and writes “Child hello!” twice to greetings.txt. The child then exits, context switching back to the parent process. The parent process sends a SIGTERM signal to the zombie child process, which has no effect because the child process is no longer running. The parent process calls wait, which returns right away. The parent process then prints “Parent howdy!: 162” twice to greetings.txt.

Common Mistakes:

● Students would often mix the output of greetings.txt with the output of stdout.
● Some students would print the parent process’s greeting before the child process’s greeting, but the wait call in the parent process prevents it from writing any output before the child process exits.

Thanos is now trying to decide in what order to lock the Infinity Locks in the locklist. Which of the following definitions of lock_locklist can cause deadlock? Explain your answer for each selection. If deadlock is possible, please provide an acquisition ordering in your explanation.

Problem 4c[2pts]: Lock acquisition code:

void lock_locklist(struct locklist *list) {
    int start = 0;
    for (int i = start; i < NUM_LOCKS; i++) {
        int index = i;
        lock_acquire (list->my_locks + index);
    }
}

⬜ Can Cause Deadlock X Cannot Cause Deadlock

Explain:

This cannot cause deadlock. The preference order for each thread is the same (acquiring lower indices first), so we will never cause deadlock.

Problem 4d[2pts]: Lock acquisition code:

void lock_locklist(struct locklist *list) {
    int start = getpid() % NUM_LOCKS;
    for (int i = start; i < start + NUM_LOCKS; i++) {
        int index = i % NUM_LOCKS;
        lock_acquire (list->my_locks + index);
    }
}

X Can Cause Deadlock ⬜ Cannot Cause Deadlock

Explain:

Example: thread 1 acquires lock 1, thread 2 acquires lock 2. Thread 1 blocks waiting for lock 2. Thread 2 acquires locks 3-5, but blocks on acquiring lock 1.

This can cause deadlock. The preference order for each thread is different, so we acquire locks in different orders and can end up with deadlock.
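To see why a start-dependent rotation is dangerous, the sketch below lists the acquisition order for a given start index and checks whether two orders ever take some pair of locks in opposite relative order, which is what makes a circular wait possible. `NUM_LOCKS` is given an assumed value of 6 here, and `rotated_order` and `orders_conflict` are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_LOCKS 6  /* assumed value for illustration */

/* Fills order[] with lock indices in the order the rotating scheme acquires
 * them for a given start index. */
void rotated_order(int start, int order[NUM_LOCKS]) {
    assert(start >= 0 && start < NUM_LOCKS);
    for (int i = 0; i < NUM_LOCKS; i++) {
        order[i] = (start + i) % NUM_LOCKS;
    }
}

/* Returns true if some pair of locks is acquired in opposite relative order
 * by the two sequences. */
bool orders_conflict(const int a[NUM_LOCKS], const int b[NUM_LOCKS]) {
    int pos_b[NUM_LOCKS];
    for (int i = 0; i < NUM_LOCKS; i++) {
        pos_b[b[i]] = i;
    }
    for (int i = 0; i < NUM_LOCKS; i++) {
        for (int j = i + 1; j < NUM_LOCKS; j++) {
            if (pos_b[a[i]] > pos_b[a[j]]) {
                return true;
            }
        }
    }
    return false;
}
```

Starts of 1 and 2 produce conflicting orders (lock 1 comes first in one and last in the other), while any two threads that share a start index never conflict.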

Problem 4e[2pts]: Lock acquisition code:

void lock_locklist(struct locklist *list) {
    int start = NUM_LOCKS - 1;
    for (int i = start; i >= 0; i--) {
        int index = i;
        lock_acquire (list->my_locks + index);
    }
}

⬜ Can Cause Deadlock X Cannot Cause Deadlock

Explain:

This cannot cause deadlock. The preference order for each thread is the same (acquiring higher indices first), so we will never cause deadlock.

Problem 4f[2pts]: Lock acquisition code:

void lock_locklist(struct locklist *list) {
    int start = NUM_LOCKS/2;
    for (int i = start; i < start + NUM_LOCKS; i++) {
        int index = i % NUM_LOCKS;
        lock_acquire (list->my_locks + index);
    }
}

⬜ Can Cause Deadlock X Cannot Cause Deadlock

Explain:

This cannot cause deadlock. The preference order for each thread is the same (starting from the middle indices first), so we will never cause deadlock.

Problem 5: Just One TCP Connection [16 pts]

Bob is trying to build a system which helps people process jobs. After learning about how sockets work in CS162, he wants to write a server program which listens on port 162 and takes jobs from clients. Each job is represented by the following data structure:

struct Job {
    int job_number;
    char job_details[200];
    int result;
};

However, Bob thinks that having clients make a new connection for each job they want the server to process seems to be an inefficient design because it consumes resources unnecessarily.

For 5a and 5b, full points are awarded to answers which have demonstrated not only correct conceptual understandings but also clarity in the expression. Students need to both identify the resource constraints and then elaborate on why establishing new connections will consume that scarce resource.

Problem 5a[2pts]: Briefly explain why establishing new connections consumes additional resources in userspace.

Acceptable answers include: More memory needed to manage the connection (e.g., int for the file descriptor), more userspace buffers to store contents sent to or received from each connection, and potentially more threads to work with connections concurrently, which consumes memory.

Common Mistakes:

  1. Not clear about the scarce resource. e.g. Memory, CPU time, bandwidth, disk, ...
  2. We do not consider the time taken to process a new user connection as a resource. You have to point out clearly that the CPU time is the scarce resource here
  3. When talking about buffer, answers need to highlight that they are referring to userspace buffer
  4. TCP state management (e.g. ACK, retries, seq number …) is in the kernel space
  5. “because we need to establish sockets” is too vague

Problem 5b[2pts]: Briefly explain why establishing new connections consumes additional resources in the OS kernel.

Acceptable answers include: More elements in the process’s file descriptor table, more elements in the OS-wide file description table, and more kernel buffers to store data sent to or received from the network interface card.

Common Mistakes:

  1. Unclear about the scarce resource. e.g. Memory, CPU time, bandwidth, disk, ...
  2. Discussions of an “inode table” are not relevant, as this is a network socket.
  3. Statements like “the OS needs to do more work” do not specify a scarce resource
  4. Some answers mentioned additional interrupts but did not specify where they would come from

Bob comes up with a scheme where the server just needs to maintain, for each client, a single connection through which multiple jobs can be sent and through which processed jobs can be sent back. More specifically, each connection is managed by one thread and whenever a new job is received, a new thread is created to do the work in parallel and then send back the results via the same connection.

Graphically, it looks like this:

Problem 5c[2pts]: Bob approaches you for help with implementing his design. He’s done most of the coding but left out some critical functions. You are required to fill in the missing lines below. Note that you might not need all the lines provided.

Assumptions:

  1. When you call ssize_t read(int fd, void *buf, size_t count), it always reads count bytes if there is data available. Otherwise, the call blocks.
  2. When you call ssize_t write(int fd, const void *buf, size_t count), it always writes count bytes successfully.

You do not have to handle the case when read fails because of client disconnection.

1.  struct Job {
2.      int job_number;
3.      char job_details[200];
4.      int result;
5.  };
6.  struct Arg_struct{
7.      Job *job;
8.      int socket_fd;
9.  }
10. void do_work (struct Job *job) {
11.     /* This function does the work and set the result back into job */
12.     /* This function is compute-intensive */
13. }
14. void *new_conn(void *arg) {
15.     /* Handles a new client connection */
16.     struct Job *new_job;
17.     struct Arg_struct *new_args;
18.     ssize_t bytes_read;
19.     int con_sockfd = (int) arg;
20.     while (1) {

After you are done writing the program, you realize that it doesn’t work. Clients are receiving gibberish when they try to read each struct that is sent back. You show it to master systems programmer Jeff Dean, who sees two problems in the code that cause the program to fail.

First, the assumptions that read and write will always transfer count bytes do not hold. In fact, these system calls often read or write fewer than count bytes before returning.

Problem 5d[2pts]: Provide a reason why a write system call might write fewer than count bytes.

The write system call may be interrupted while in progress by a signal, in which case some of the requested bytes will not be written.

Common Mistakes:

  1. \x00 does not stop the write call from continuing.
  2. If the user buffer passed in as the argument for write is shorter than count, this could lead to access of an invalid portion of memory and cause the program to crash.

Problem 5e[2pts]: To solve this problem, Bob proposes writing a new function with the signature:

write_bob(int fd, const void* buf, size_t count)

write_bob makes use of the write syscall in its implementation and makes sure that it always writes count bytes before returning. Provide an implementation for write_bob. Hint: If write returns 3, then 3 bytes are already written. You don't have to write them again.

void write_bob(int fd, const void *buf, size_t count) {
    char *buffer = (char *) buf;
    /* Your code below */
    size_t bytes_written = 0;
    while (bytes_written < count) {
        bytes_written += write(fd, buffer + bytes_written, count - bytes_written);
    }
}

Common Mistakes:

  1. Arithmetic on a void pointer is invalid. Therefore, buffer, instead of buf, should be used. To make things simpler, we cast the pointer for you on the first line as a hint.
  2. Making the write call twice doesn’t guarantee successful writing of the full buffer. Even repeating write ten times may not work.
  3. Do not write again whatever is already written.
  4. This function returns void, so there is no need to return an integer.
  5. fflush and fsync are not relevant in this context.

Your temporary variable should use ssize_t instead of int. Otherwise there could be a buffer overflow vulnerability. However, we did not deduct points for this.
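The exam solution ignores failures of write. A more defensive variant might look like the sketch below; `write_all` is a hypothetical name, and returning an int status is a departure from the void signature the exam asks for.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Keeps calling write until all count bytes are written.
 * Returns 0 on success, or -1 on error with errno set by write. */
int write_all(int fd, const void *buf, size_t count) {
    assert(buf != NULL || count == 0);
    const char *buffer = (const char *) buf;
    size_t bytes_written = 0;
    while (bytes_written < count) {
        /* ssize_t holds both a byte count and the -1 error value. */
        ssize_t n = write(fd, buffer + bytes_written, count - bytes_written);
        if (n < 0) {
            if (errno == EINTR) {
                continue;  /* interrupted before any byte was written: retry */
            }
            return -1;
        }
        bytes_written += (size_t) n;
    }
    return 0;
}
```

Keeping the result of write in an ssize_t before adding it to the running total avoids folding a -1 error return into an unsigned byte count.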

Problem 5f[3pts]: Jeff Dean points out a new problem with the code, because the threads share the same connection socket. Bob proposes solving this problem with a mutex lock. List between which of the lines above you need to add lock_acquire() and lock_release() with a global mutex to solve Jeff’s problem. For instance, if you needed a lock_acquire() between lines 1 and 2, you would write (1, 2) under lock_acquire() below. You may not need all six spaces.

lock_acquire()

(43, 44) (____, ____) (____, ____)

lock_release()

(44, 45) (____, ____) (____, ____)

You must acquire a lock before the call to write and release it after the write.

We took away one point if any unnecessary locks were added. Locking around read is unnecessary since only a single thread is reading from that connection. This, however, is not true for write.
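In POSIX terms, the locking pattern could look like the following sketch. It uses pthread_mutex_lock and pthread_mutex_unlock in place of the exam's lock_acquire() and lock_release(); the names `conn_lock` and `send_record` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Global mutex shared by every worker thread that writes to the socket. */
static pthread_mutex_t conn_lock = PTHREAD_MUTEX_INITIALIZER;

/* Sends one whole record on a shared descriptor. The lock is held across the
 * entire write loop so that records from different threads cannot interleave.
 * Returns 0 on success, -1 if write fails. */
int send_record(int fd, const void *record, size_t len) {
    assert(record != NULL || len == 0);
    const char *p = (const char *) record;
    size_t sent = 0;
    int rc = 0;
    pthread_mutex_lock(&conn_lock);
    while (sent < len) {
        ssize_t n = write(fd, p + sent, len - sent);
        if (n < 0) {
            rc = -1;
            break;
        }
        sent += (size_t) n;
    }
    pthread_mutex_unlock(&conn_lock);  /* released on both success and error */
    return rc;
}
```

Holding the lock for the whole loop matters once partial writes are possible: releasing it between partial writes would let another thread's bytes land in the middle of a struct.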

Problem 5g[3pts]: Bob successfully builds the server above, but his computer is slow, and can only handle up to 5 jobs being processed concurrently per connection. Jeff proposes a solution where a semaphore would block new jobs from being processed if 5 other jobs are already being processed, and only allow new jobs to start once an existing job is finished. Assume each thread has a semaphore initialized to 5 and that the line numbers are as originally (ignore the lock operations above). List between which lines you need to add sema_up() and sema_down() to allow only up to 5 jobs to be concurrently processed per connection.

sema_down()

(41, 42) (____, ____) (____, ____)

sema_up()

(42, 43) (____, ____) (____, ____)

Call sema_down before do_work and call sema_up after do_work.

We took away one point if any unnecessary semaphores were added. If the unnecessary semaphores have the potential to cause deadlock, the answer was not given any points.
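The same pattern can be sketched with POSIX semaphores, where sem_wait plays the role of sema_down() and sem_post the role of sema_up(). The names `job_slots`, `job_slots_init`, `run_limited`, and `record_free_slots` are hypothetical, and a single process-wide semaphore is assumed for brevity rather than one per connection.

```c
#include <assert.h>
#include <semaphore.h>
#include <stddef.h>

#define MAX_ACTIVE_JOBS 5

/* Counts the free job slots; starts at 5. */
static sem_t job_slots;

int job_slots_init(void) {
    return sem_init(&job_slots, 0, MAX_ACTIVE_JOBS);
}

/* Runs work(arg) while holding one of the five slots. */
void run_limited(void (*work)(void *), void *arg) {
    assert(work != NULL);
    sem_wait(&job_slots);  /* sema_down: blocks while 5 jobs are running */
    work(arg);
    sem_post(&job_slots);  /* sema_up: frees the slot for a waiting job */
}

/* Example job: records how many slots are free while it is running. */
void record_free_slots(void *arg) {
    sem_getvalue(&job_slots, (int *) arg);
}
```

Reversing the two calls would raise the count before any work starts and never enforce the limit, which is why the down has to come first.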
