Assignment 4 for Sample Solutions - Dependability | CS 686 | Assignments Computer Science


Department of Computer Science University of Virginia

CS686 - DEPENDABLE COMPUTING

ASSIGNMENT 4

SAMPLE SOLUTIONS

Individual Activity

There is far more to these questions than most of you realized. Read the following carefully.

1. For the dual redundant architecture, we noted that software has to be deterministic and

predictable. More specifically, we noted that great care needs to be exercised in the use of threads and

processes. Develop an approach to the use of threads that would ensure deterministic behavior

of a multi-threaded program.

Avoiding threads and concurrency altogether could be considered as an option....

There are two issues to deal with:

(1) Fundamental differences in control flow within the threads.

(2) Timing differences between the two target systems that could affect control flow.

To deal with (1): Threads must operate with a non-pre-emptive scheduler so that there is no

dependence on a system clock for switching between threads. Non-pre-emptive scheduling

means that threads stop executing when they block and not because some timer expires.

Although the high-level architecture of such a system includes a “clock”, that clock is not very

well defined. Even if the clock meant the actual hardware oscillator, you could not count on

precisely the same timing between the two units because of different individual delays.

Also, there can be no separate local devices, such as disks, that could behave non-deterministically.
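
As an illustration of point (1), here is a minimal sketch of non-pre-emptive scheduling, assuming Python generators stand in for threads and a `yield` stands in for a blocking point. The names `worker`, `run_round_robin`, and `run_once` are hypothetical and not part of the assignment. Because a switch can only occur at a `yield`, and never because a timer expires, the interleaving is the same on every run.

```python
from collections import deque


def worker(name, steps, log):
    """A cooperative 'thread': does one step of work, then blocks explicitly."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # the only place where a thread switch can happen


def run_round_robin(threads):
    """Non-pre-emptive round-robin scheduler: no timer, switches only at yields."""
    ready = deque(threads)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run until the thread blocks (yields)
            ready.append(current)  # still alive, so requeue at the back
        except StopIteration:
            pass                   # thread finished, so drop it


def run_once():
    """Run two cooperative threads and return the order of their steps."""
    log = []
    run_round_robin([worker("A", 3, log), worker("B", 2, log)])
    return log


if __name__ == "__main__":
    print(run_once())
```

Running `run_once()` repeatedly gives the same log each time, which is the property the two units of the dual redundant architecture need to share.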

To deal with (2): Even if the same scheduling algorithm is used, the processor timing will

differ thereby offering the possibility of divergence between threads if one gets ahead of

another. This probably means no paging since that affects speed and requires a peripheral

disk. Care has to be taken with cache management to be sure cache content is identical.

Finally, various internal processor clocks could differ so there is a need to synchronize the

machines periodically. The simplest way to do that is to synchronize when output is to be

generated (as seen by the comparator), but that might be too infrequent. Many systems

synchronize on a fixed time boundary, i.e., there is a set of specific synch points in the code that require

the two units to synchronize. For example, the Space Shuttle uses four machines in the

on-board Primary Flight Computer and they synchronize every 40 milliseconds.
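
The synch-point idea can be sketched as follows. This is a simplified model under stated assumptions: two Python threads stand in for the two units, `time.sleep` models their differing processor speeds, a `threading.Barrier` stands in for the synch point, and synchronization happens once per computation frame rather than on a real 40 millisecond clock. The names `replica`, `run_pair`, and `FRAMES` are hypothetical.

```python
import threading
import time

FRAMES = 5  # number of synch points in this sketch (hypothetical value)


def replica(unit_id, delay, barrier, outputs):
    """One unit: compute a frame, record the result, then wait at the synch point."""
    state = 0
    for frame in range(FRAMES):
        state = state * 31 + frame      # deterministic computation for this frame
        time.sleep(delay)               # models a unit that runs faster or slower
        outputs[unit_id].append(state)
        barrier.wait()                  # neither unit proceeds until both arrive


def run_pair():
    """Run two replicas in lockstep; return their outputs and any mismatched frames."""
    outputs = {0: [], 1: []}
    mismatches = []

    def compare():
        # Runs once per synch point, after both units have arrived.
        if outputs[0][-1] != outputs[1][-1]:
            mismatches.append(len(outputs[0]) - 1)

    barrier = threading.Barrier(2, action=compare)
    units = [
        threading.Thread(target=replica, args=(unit_id, delay, barrier, outputs))
        for unit_id, delay in ((0, 0.001), (1, 0.003))
    ]
    for unit in units:
        unit.start()
    for unit in units:
        unit.join()
    return outputs, mismatches


if __name__ == "__main__":
    print(run_pair())
```

Even though one unit is slower than the other, the faster unit can never get more than one frame ahead, and the comparison at each synch point sees matching values.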

2. Consider the disk structure known as a mirrored disk. Writes go to two separate identical disks

so that there are always two copies of the data. If one of the disks fails, it is replaced with a

new unit. The new unit is then resilvered, i.e., made to contain the same data as the surviving

original unit. Develop a resilvering algorithm that could be used by the recovery software.
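
One possible shape for such an algorithm is sketched below. It is an illustration under stated assumptions, not necessarily the intended solution: disks are modeled as in-memory lists of blocks, copying proceeds one block at a time from the survivor to the new unit, ordinary writes that arrive during resilvering go to both units, and reads are served only from the survivor until the copy completes. The class `Mirror` and its methods are hypothetical names. A real implementation would also need per-block locking between the copier and concurrent writers, which this single-threaded sketch omits.

```python
class Mirror:
    """Toy model of a mirrored pair in which one unit is being resilvered."""

    def __init__(self, survivor_blocks):
        self.survivor = list(survivor_blocks)
        self.new = [None] * len(self.survivor)  # blank replacement unit
        self.next_block = 0                     # blocks below this index are copied

    def write(self, index, data):
        # Normal mirrored write, applied to both units even during resilvering.
        # If the block has not been copied yet, the later copy carries the new data.
        self.survivor[index] = data
        self.new[index] = data

    def read(self, index):
        # Until resilvering completes, only the survivor is trusted.
        return self.survivor[index]

    def resilver_step(self):
        """Copy one block to the new unit; return False once all blocks are copied."""
        if self.next_block >= len(self.survivor):
            return False
        i = self.next_block
        self.new[i] = self.survivor[i]
        self.next_block += 1
        return True

    def in_sync(self):
        """True when every block has been copied and both units hold the same data."""
        return self.next_block >= len(self.survivor) and self.new == self.survivor


if __name__ == "__main__":
    m = Mirror([b"a", b"b", b"c", b"d"])
    m.resilver_step()        # copy block 0
    m.write(0, b"X")         # write to an already-copied block
    m.write(3, b"Y")         # write to a block not yet copied
    while m.resilver_step():
        pass
    print(m.in_sync(), m.new)
```

Copying in small steps lets the system keep servicing requests while resilvering is in progress, at the cost of having to reason about writes that interleave with the copy.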
