CSC/ECE 506: Architecture of Parallel Computers

Problem Set 3

Due July 20, 2001

Problems 1 and 4 will be graded. There are 45 points on these problems.

Note: You must do all the problems, even the non-graded ones. If you do not do some of them, half as many points as they are worth will be subtracted from your score on the graded problems.

Problem 1. (25 points) Consider a multiprocessor with a sequentially consistent memory system. Each processor has a cache implementing a basic 3-state write-invalidate protocol (similar to the one in Figure 5.13, p. 295 of Culler, Singh & Gupta), and a one-entry write buffer between the CPU and the cache, so that stores need not block the processor.

Upon a store, if the referenced cache block isn’t in the exclusive state, then the register value is transferred into the write buffer, and the necessary protocol action is launched to obtain ownership. Once the processor obtains ownership of the block, the value is transferred from the write buffer to the cache, and the entry is removed from the write buffer. Load instructions stall the processor (upon a cache miss); store instructions don’t stall the processor so long as the sequential consistency (SC) memory model can be obeyed.
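The protocol behavior described above can be pictured as a small transition table. What follows is only a minimal sketch, assuming the usual MSI naming (Modified, Shared, Invalid) for the three states, with Modified playing the role of the "exclusive" state in the text. The event names (PrRd, PrWr, BusRd, BusRdX), the Flush action, and the function name `step` are assumptions borrowed from common textbook usage, not taken from this problem set. The write buffer and all timing are deliberately left out.

```python
# Hypothetical sketch of a 3-state write-invalidate (MSI) protocol.
# State names, event names, and the helper `step` are assumptions.

# (current state, event) -> (next state, bus action or None)
TRANSITIONS = {
    # Events issued by the local processor
    ("I", "PrRd"): ("S", "BusRd"),    # read miss: fetch a shared copy
    ("I", "PrWr"): ("M", "BusRdX"),   # write miss: fetch with ownership
    ("S", "PrRd"): ("S", None),       # read hit
    ("S", "PrWr"): ("M", "BusRdX"),   # privilege miss: obtain ownership
    ("M", "PrRd"): ("M", None),       # read hit
    ("M", "PrWr"): ("M", None),       # write hit
    # Events snooped from the bus (issued by another cache)
    ("I", "BusRd"): ("I", None),
    ("I", "BusRdX"): ("I", None),
    ("S", "BusRd"): ("S", None),
    ("S", "BusRdX"): ("I", None),     # another cache takes ownership
    ("M", "BusRd"): ("S", "Flush"),   # supply the dirty block, keep a shared copy
    ("M", "BusRdX"): ("I", "Flush"),  # supply the dirty block, then invalidate
}


def step(state, event):
    """Return (next_state, bus_action) for one cache block."""
    return TRANSITIONS[(state, event)]
```

For example, a store to a block held in Shared state is `step("S", "PrWr")`, which yields `("M", "BusRdX")`; a second cache holding the same block in Shared state sees `step("S", "BusRdX")` and drops to Invalid.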

Assume the following conditions:

• cache block size is 4 words;
• a read/write instruction that hits in the cache takes 1 cycle;
• a write instruction that incurs a privilege miss (e.g., the block is in the cache in Shared state) takes 3 cycles;
• a read/write instruction that incurs a miss (the block has to be fetched from memory) takes 5 cycles;
• all arithmetic instructions take 1 cycle; and
• there is no resource or network contention.
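The cycle counts listed above can be collected into a small lookup, which may help when filling in the timing diagram. This is a sketch under stated assumptions: the state letters "M", "S", "I" (Modified/exclusive, Shared, Invalid) and the function name `instruction_cycles` are illustrative choices, and the function gives only the latency of one instruction in isolation. It does not model the write buffer or any stall imposed by the memory model.

```python
# Cycle counts taken from the listed assumptions.
HIT_CYCLES = 1
PRIVILEGE_MISS_CYCLES = 3
MISS_CYCLES = 5
ARITHMETIC_CYCLES = 1


def instruction_cycles(op, state=None):
    """Latency of one instruction, ignoring stalls.

    op:    "LOAD", "STORE", or any arithmetic mnemonic (e.g. "ADD").
    state: assumed state letter of the referenced block in the local
           cache ("M", "S", or "I"); unused for arithmetic.
    """
    if op not in ("LOAD", "STORE"):
        return ARITHMETIC_CYCLES
    if state not in ("M", "S", "I"):
        raise ValueError("memory operations need a block state of M, S, or I")
    if state == "I":
        return MISS_CYCLES             # block must be fetched from memory
    if op == "STORE" and state == "S":
        return PRIVILEGE_MISS_CYCLES   # present, but without ownership
    return HIT_CYCLES
```

For instance, `instruction_cycles("STORE", "S")` gives 3 and `instruction_cycles("LOAD", "S")` gives 1.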

The trace below gives the interleaved order in which the instructions from the two processors (P1 and P2) were executed. Assume that I0 starts at time 0 and successive instructions start execution on consecutive cycles as long as there is no memory model-induced stalling. U and V are in different cache blocks. Initially both processors have U and V in the Shared state in their respective caches.

You should give a timing diagram (x-axis is time, y-axis is instructions I0--I9), showing when an instruction starts execution and when it finishes execution. Also give a table that has the following fields:

instruction | consistency actions | miss type/hit | reason for stall (if any)

I0: P1: LOAD R5, U
I1: P2: STORE U, R1
I2: P2: ADD R1, R2
I3: P2: STORE V, R1
I4: P1: LOAD R4, U
I5: P1: STORE V, R1
I6: P1: ADD R5, R4
I7: P1: MULT R5, R4
I8: P2: LOAD R4, U
I9: P1: STORE V, R5
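When building the table by hand, it can help to have the Problem 1 trace in a structured form first. The following is a minimal sketch that parses the trace lines into records; the names `parse_trace` and `memory_operand` and the record keys are hypothetical, and the code only restates the trace without attempting the timing analysis.

```python
import re

TRACE = """\
I0: P1: LOAD R5, U
I1: P2: STORE U, R1
I2: P2: ADD R1, R2
I3: P2: STORE V, R1
I4: P1: LOAD R4, U
I5: P1: STORE V, R1
I6: P1: ADD R5, R4
I7: P1: MULT R5, R4
I8: P2: LOAD R4, U
I9: P1: STORE V, R5
"""

# label, processor, opcode, first operand, second operand
LINE_RE = re.compile(r"(I\d+): (P\d+): (\w+) (\w+), (\w+)")


def parse_trace(text):
    """Turn each trace line into a dict, preserving the interleaved order."""
    rows = []
    for line in text.strip().splitlines():
        label, proc, op, a, b = LINE_RE.fullmatch(line.strip()).groups()
        rows.append({"instruction": label, "processor": proc,
                     "op": op, "operands": (a, b)})
    return rows


def memory_operand(row):
    """Return the memory location touched (U or V), or None for arithmetic."""
    if row["op"] == "LOAD":    # LOAD register, location
        return row["operands"][1]
    if row["op"] == "STORE":   # STORE location, register
        return row["operands"][0]
    return None
```

Calling `parse_trace(TRACE)` yields ten records in execution order; `memory_operand` then picks out which of U or V each load or store refers to.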

Problem 2. (20 points) This problem should be solved using the MESI protocol for a bus-based shared-memory multiprocessor. Assume the following:

• Direct-mapped cache organization
• P1 and P2 each have exactly 2 cache lines
• Cache-block size: 4 words
• Cache-to-cache block transfer takes 4 cycles
• Read/write hit (when no bus action is needed) takes 1 cycle
• Invalidation takes 2 cycles
• Memory-to-cache block transfer takes 8 cycles

B1 and B2 are two memory blocks that map to the same cache line. They contain the data items W, X, Y, Z, and P, Q, R, S, respectively, as shown below. Each data item is one word.
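Because the two blocks compete for the same line of a direct-mapped cache, bringing one in evicts the other. The sketch below illustrates just that conflict for a single processor's cache. The block labels B1 and B2 follow the figure labels; the class name `SharedLine` and its method are hypothetical, and coherence (MESI states, invalidations from the other processor, cycle counts) is intentionally not modeled.

```python
# Word-to-block mapping from the problem statement:
# one block holds W, X, Y, Z and the other holds P, Q, R, S.
BLOCK_OF = {word: "B1" for word in "WXYZ"}
BLOCK_OF.update({word: "B2" for word in "PQRS"})


class SharedLine:
    """The one direct-mapped cache line that both blocks map to.

    Hypothetical helper: tracks only which block is resident,
    not its coherence state.
    """

    def __init__(self):
        self.block = None  # nothing cached initially

    def access(self, word):
        """Return True on a hit; on a miss, load the word's block."""
        block = BLOCK_OF[word]
        hit = self.block == block
        self.block = block  # a miss replaces whatever was resident
        return hit
```

With a fresh `SharedLine`, accessing Z misses, a following access to W hits (same block), an access to P misses and evicts the first block, and a later access to W misses again.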

B1: W X Y Z
B2: P Q R S

P1: READ Z

P2: READ W

P1: READ W

P1: WRITE Z

P2: READ W

P1: READ P

P2: WRITE P

P1: READ P

P1: READ Q

P2: READ W

P1: WRITE Q

P2: READ X

P1: READ R