Advanced Computer Architecture Exam 2, Spring 2000, Exams of Computer Architecture and Organization

Exam 2 for ECE 4100, Advanced Computer Architecture, University of X, Spring 2000. The exam consists of 4 problems, each with multiple parts, and covers disk technology, cache organization, pipelines, and superscalar pipelining. The problems ask for disk capacity and access times, cache organization and access times, and superscalar pipeline execution.

Uploaded on 08/05/2009

ECE 4100 Advanced Computer Architecture Spring 2000
4 problems, 8 pages Exam 2 23 March 2000
Name (print): Solutions
SS Number:

Problem   1    2    3    4    total
          25   25   25   25   100
          25   25   25   25   100
For maximum credit, show all work.
Hand in everything you write on.
Put your name on everything you hand in.
Good luck!

Problem 1 (5 parts, 25 points) Disk Technology

Answer the following questions for the disk described in the table below. For maximum credit, show your work for each part.

heads                 8
cylinders             8192
sectors per track     64 sectors/track
sector size           512 bytes
controller overhead   1 ms
average seek time     10 ms
rotation speed        7200 revolutions/minute

1a) [5 points] What is the size (in bytes) of the disk?

8 heads * 8192 tracks/head * 64 sectors/track * 512 B/sector = 2 Gbytes

1b) [5 points] How much data (in bytes) can be accessed without repositioning the read heads?

8 tracks * 64 sectors/track * 512 B/sector = 256 Kbytes (or 2 GB / 8192 cylinders = 256 KB)
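
The capacity arithmetic in 1a and 1b can be checked with a short sketch. The constants come from the disk table; the function names are hypothetical and chosen only for illustration.

```python
# Disk geometry from the Problem 1 table.
HEADS = 8
CYLINDERS = 8192
SECTORS_PER_TRACK = 64
SECTOR_BYTES = 512


def track_bytes():
    """Bytes stored on one track."""
    return SECTORS_PER_TRACK * SECTOR_BYTES


def cylinder_bytes():
    """Bytes reachable without moving the heads: one track per head."""
    return HEADS * track_bytes()


def disk_bytes():
    """Total capacity: one cylinder's worth of data per cylinder."""
    return CYLINDERS * cylinder_bytes()


print(disk_bytes())      # 2147483648 (2 * 2**30, i.e. 2 Gbytes)
print(cylinder_bytes())  # 262144 (256 * 2**10, i.e. 256 Kbytes)
```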

1c) [5 points] What is the data transfer rate (in bytes/sec) for the disk?

7200 RPM / (60 sec/min) = 120 RPS = 120 tracks/second

120 tracks/second * 64 sectors/track * 512 B/sector = 3.93 Mbytes/second

1d) [5 points] What is the average time (in seconds) to read a 40 Kbyte file? Assume the file is contiguously stored in a single track.

1 ms overhead + 10 ms seek time + (40 KB / 3.93 MB/sec) transfer time + ½ * (1 / 120 RPS) rotational latency = 25.3 ms
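
A minimal sketch of the 1c and 1d calculations follows. The function names are hypothetical. The sketch assumes that the 25.3 ms figure treats 40 Kbyte as 40,000 bytes; with 40 * 1024 bytes the same formula gives roughly 25.6 ms.

```python
RPM = 7200
SECTORS_PER_TRACK = 64
SECTOR_BYTES = 512
OVERHEAD_S = 0.001   # controller overhead
SEEK_S = 0.010       # average seek time


def transfer_rate():
    """Sustained rate in bytes/second: one full track per revolution."""
    revs_per_sec = RPM / 60
    return revs_per_sec * SECTORS_PER_TRACK * SECTOR_BYTES


def read_time(file_bytes):
    """Average read time in seconds: overhead + seek + half a rotation + transfer."""
    rotational_latency = 0.5 / (RPM / 60)
    return OVERHEAD_S + SEEK_S + rotational_latency + file_bytes / transfer_rate()


print(transfer_rate())                   # bytes per second
print(round(read_time(40_000) * 1e3, 1)) # milliseconds, 40 Kbyte read as 40,000 bytes
```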

1e) [5 points] Suppose the disk is being backed up over a 100 Mbits/sec network. Assume the disk is 75% full. Ignoring all system limits except for the disk and the network, how long (in minutes) will the backup take to complete?

Amount backed up: 0.75 * 2 GB * 8 bits/byte = 12 billion bits

Disk transfer time and network transfer costs will dominate: (0.75 * 2 GB) / (3.93 MB/sec) + (12 billion bits) / (100 Mb/sec) = 382 sec + 120 sec = 502 sec ≈ 8.4 minutes.
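
The backup estimate can be reproduced as below. This is a sketch under the solution's own assumptions: 2 Gbytes is treated as 2 billion bytes, the disk rate is the rounded 3.93 Mbytes/second, and the disk and network times are added rather than overlapped. The names are hypothetical.

```python
DISK_BYTES = 2e9          # 2 Gbytes, taken as 2 billion bytes as in the solution
FRACTION_FULL = 0.75
DISK_RATE_BPS = 3.93e6    # bytes/second, the rounded 1c result
NET_RATE_BITS = 100e6     # bits/second


def backup_seconds():
    """Disk read time plus network send time, with no overlap between them."""
    data_bytes = FRACTION_FULL * DISK_BYTES
    disk_time = data_bytes / DISK_RATE_BPS
    net_time = data_bytes * 8 / NET_RATE_BITS
    return disk_time + net_time


def backup_minutes():
    return backup_seconds() / 60


print(round(backup_seconds()))     # seconds
print(round(backup_minutes(), 1))  # minutes
```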


2d) [5 points] Suppose the cache access time is 10ns and main memory access time is 75ns/word. What is the maximum number of misses out of 1000 total accesses that this cache memory system can have before it exceeds an effective access time of 25ns?

10 ns + m * (75 ns/word * 2 words/line) > 25 ns

m > 15/150 = 1/10 = 100/1000

100 misses.
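
The threshold can be confirmed with integer arithmetic. This is a sketch that assumes every access pays the 10 ns cache time and each miss adds a full two-word line fill, as in the inequality above. The function names are hypothetical.

```python
def total_time_ns(misses, accesses=1000, hit_ns=10, word_ns=75, words_per_line=2):
    """Total time for all accesses: hit time on each, line fill on each miss."""
    return accesses * hit_ns + misses * word_ns * words_per_line


def max_misses(limit_ns=25, accesses=1000, hit_ns=10, word_ns=75, words_per_line=2):
    """Largest miss count whose average access time does not exceed limit_ns."""
    penalty = word_ns * words_per_line          # 150 ns per miss
    budget = accesses * (limit_ns - hit_ns)     # ns available for miss penalties
    return min(accesses, budget // penalty)


print(max_misses())                # 100
print(total_time_ns(100) / 1000)   # 25.0 ns average at exactly 100 misses
```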


Problem 3 (3 parts, 25 points) Pipelines, Caches, and Performance

A pipelined processor with a separate instruction and data cache has five stages: instruction fetch, decode, execute, memory (data load or store), and write back to register file. Its cycle time is 20 ns and it can start a new instruction on every cycle when there are no hazards. The word size of the processor is 32 bits.

The instruction cache is fully associative, with a total capacity of 4KB (4096 bytes) and a line size of 2 words. The instruction cache access time is 20 ns.

The data cache is 2-way set associative with a line size of 4 words and a total capacity of 4KB (4096 bytes). The data cache access time is 20 ns.

Main memory access time is 60 ns/word. In both caches, a missed word is not passed to the processor until the entire line is received from main memory.

Parts 3a, 3b, and 3c are independent of each other.

3a) [10 points] The processor is executing a sequential code block containing 400 nonbranching instructions and each instruction is 1 word long. The instructions are stored contiguously in memory. The instruction cache is empty at the start of the code execution.

For this part (3a) of the problem, assume the hit rate of the data cache during the execution of this code is 100%. Also, assume no hazards occur in the code fragment; stalls only occur due to memory access time.

3a.1) When an instruction is not found in the instruction cache, how many stall cycles occur (per instruction)?

60 ns/word * 2 words/line = 120 ns; 120 ns / (20 ns/cycle) = 6 cycles

______________________________ 6 _____ stall cycles.

3a.2) What is the instruction cache's hit rate while executing this code block?

There are no conflict misses since the instruction cache is fully associative. There are no capacity misses since all 400 instructions (= 400 words = 1600 bytes) easily fit in the 4KB instruction cache. With a line size of L = 2 words/line, there are 400/L = 200 compulsory misses. 200 misses / 400 total accesses = 50% miss rate, so the hit rate is also 50%.

_________________ 50 ___________ % hit rate.

3a.3) What is the average instruction throughput of this processor executing this code block? Express your answer in cycles per instruction.

1 + 0.50 miss_rate * (6 stall_cycles/miss) = 4 CPI.
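
The three parts of 3a chain together as sketched below. The sketch assumes the code block starts on a line boundary and fits in the cache, so the only misses are one compulsory miss per line. The function names are hypothetical.

```python
import math

CYCLE_NS = 20
MEM_NS_PER_WORD = 60
ICACHE_LINE_WORDS = 2


def miss_stall_cycles(line_words=ICACHE_LINE_WORDS):
    """Cycles spent waiting for a whole line to arrive from main memory."""
    return MEM_NS_PER_WORD * line_words // CYCLE_NS


def sequential_miss_rate(n_instructions, line_words=ICACHE_LINE_WORDS):
    """One compulsory miss per line touched by a sequential, one-word-per-instruction block."""
    misses = math.ceil(n_instructions / line_words)
    return misses / n_instructions


def cpi(n_instructions):
    """Base CPI of 1 plus the average stall cycles per instruction."""
    return 1 + sequential_miss_rate(n_instructions) * miss_stall_cycles()


print(miss_stall_cycles())        # 6
print(sequential_miss_rate(400))  # 0.5
print(cpi(400))                   # 4.0
```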

3b) [5 points] How many sets does the data cache contain?

N = 4 KB / ((4 words/line) * (2 lines/set) * (4 bytes/word)) = 4096 bytes / (32 bytes/set) = 128 sets.
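
The set count generalizes to any capacity, line size, and associativity. The sketch below also derives the address fields, which the exam does not ask for; that part assumes 32-bit byte addresses, and the function names are hypothetical.

```python
def num_sets(capacity_bytes, line_words, ways, word_bytes=4):
    """Sets = capacity / (bytes per line * lines per set)."""
    line_bytes = line_words * word_bytes
    return capacity_bytes // (line_bytes * ways)


def address_fields(capacity_bytes, line_words, ways, word_bytes=4, address_bits=32):
    """Return (tag, index, offset) widths in bits.

    Assumes byte addressing, a 32-bit address by default, and power-of-two sizes.
    """
    line_bytes = line_words * word_bytes
    offset_bits = line_bytes.bit_length() - 1
    index_bits = num_sets(capacity_bytes, line_words, ways, word_bytes).bit_length() - 1
    return address_bits - index_bits - offset_bits, index_bits, offset_bits


print(num_sets(4096, 4, 2))        # data cache: 128 sets
print(num_sets(4096, 2, 512))      # fully associative instruction cache: 1 set
print(address_fields(4096, 4, 2))  # (21, 7, 4)
```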

3c) [10 points] Assume we have another fragment of code which is preloaded into the instruction cache so that there are no instruction cache misses. Assume the 2-way set associative data cache has a hit rate of 90%.


Problem 4 (4 parts, 25 points) Superscalar Pipelining

Consider a superscalar microprocessor with a superscalar factor of six. It has a design similar to the PowerPC (and the processor simulated in Project 2), using reservation stations, a renaming buffer, and a reorder buffer to support out-of-order execution. The functional units have the following latencies (measured in number of cycles):

IntegerALU:    1 cycle
FP multiply:   5 cycles
Memory:        10 cycles
FP divide:     10 cycles
FP add/sub:    10 cycles
Branch:        1 cycle

4a) [10 points] Suppose the following program fragment is executed. No other instructions are currently in the pipeline. The branch instruction (I6) is mispredicted to be “not taken” (i.e., R6 is 0). Fill in the last 2 columns of the table below as follows:

  1. Indicate for each instruction, whether the results of executing the instruction are written to the register file (Y/N) and
  2. Number the order in which the instructions are retired (or “committed”). If an instruction is cancelled after being issued, put a “C” in the column entry for that instruction.

Inst#  Label  Instruction        Write to Register File?  Order of Retirement
I1            ADD R6, R1, R2     Y                        1
I2            ADD R3, R6, R5     Y                        2
I3            DIVF F2, F4, F6    Y                        3
I4            ADDF F2, F3, F2    Y                        4
I5            SUB R6, R4, R5     Y                        5
I6            BEQZ R6, skip      N                        ---
I7            SUBF F8, F8, F9    N                        C
I8     skip:  MULTF F3, F1, F0   Y                        6
I9            ADDF F8, F6, F9    Y                        7
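
The two answer columns follow from in-order retirement plus cancellation of the wrong-path instruction. The sketch below is not a cycle-accurate model: it only walks the correct path, numbers instructions as they commit, and marks anything fetched past a mispredicted branch as cancelled. It assumes every branch is predicted not taken and handles only branches that are actually taken. The names `PROGRAM` and `retire` are hypothetical.

```python
PROGRAM = [
    # (name, label, instruction text)
    ("I1", None, "ADD R6, R1, R2"),
    ("I2", None, "ADD R3, R6, R5"),
    ("I3", None, "DIVF F2, F4, F6"),
    ("I4", None, "ADDF F2, F3, F2"),
    ("I5", None, "SUB R6, R4, R5"),
    ("I6", None, "BEQZ R6, skip"),
    ("I7", None, "SUBF F8, F8, F9"),
    ("I8", "skip", "MULTF F3, F1, F0"),
    ("I9", None, "ADDF F8, F6, F9"),
]


def retire(program, taken_branches):
    """Return {name: (writes_register_file, retirement_order)}.

    taken_branches maps a branch's name to the label it actually jumps to.
    Instructions between such a branch and its target are on the wrong path.
    """
    labels = {label: k for k, (_, label, _) in enumerate(program) if label}
    table = {}
    order = 0
    i = 0
    while i < len(program):
        name = program[i][0]
        if name in taken_branches:
            # A branch writes no register; the table above gives it no number.
            table[name] = ("N", "---")
            target = labels[taken_branches[name]]
            for wrong_name, _, _ in program[i + 1:target]:
                table[wrong_name] = ("N", "C")   # issued, then cancelled
            i = target
        else:
            order += 1
            table[name] = ("Y", order)
            i += 1
    return table


for name, row in retire(PROGRAM, {"I6": "skip"}).items():
    print(name, *row)
```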


4b) [15 points] Suppose that the superscalar pipeline is empty and the following instructions are issued. F3 holds value 80, F4 holds value 4, and F5 holds value 2.

Inst #  Instruction
I1      DIVF F1, F3, F
I2      MULTF F2, F3, F
I3      ADDF F3, F1, F
I4      ADDF F3, F4, F
I5      MULTF F4, F2, F
I6      SUBF F5, F4, F

There are 3 reservation stations for the FPAdd functional unit, 2 reservation stations for the FPMultiply functional unit, and 1 reservation station for the FPDivide functional unit. Assume that none of the functional units have finished execution yet.

What is the status of the reservation stations and renaming buffer once all the instructions have issued?

Reserv. Station  Destination  Busy?  Operation  Source 1     Source 2
Name             Reg. Color                     Value/Color  Value/Color
FPAdd1           RB3          Y      +          RB1          RB
FPAdd2           RB4          Y      +          4            RB
FPAdd3           RB6          Y      -          RB5          2
FPMultiply1      RB2          Y      *          80           2
FPMultiply2      RB5          Y      *          RB2          4
FPDivide1        RB1          Y      /          80           4

Renaming Buffer:

Color  Register #  Value  Value Valid?  Busy?
RB1    F1          ---    N             Y
RB2    F2          ---    N             Y
RB3    F3          ---    N             Y
RB4    F3          ---    N             Y
RB5    F4          ---    N             Y
RB6    F5          ---    N             Y
RB7                                     N
RB8                                     N

  1. Indicate all reservation stations that contain instructions that can begin execution at this point; if none can, circle “None.”

FPAdd1 FPAdd2 FPAdd3 FPMultiply1 FPMultiply2 FPDivide1 None
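
The renaming behind the two tables can be sketched as follows. Each instruction reads its sources first (a pending color if one exists, otherwise the register value) and then claims the next color for its destination. Several operands here are assumptions: the last operand of each instruction is cut off in the listing, so I1, I2, I5, and I6 use the registers implied by the values in the reservation-station table (80 = F3, 4 = F4, 2 = F5), and the second source of I3 and I4 is set to F2 purely as a placeholder. The sketch does not enforce the 3/2/1 station limits. All names are hypothetical.

```python
UNIT = {"+": "FPAdd", "-": "FPAdd", "*": "FPMultiply", "/": "FPDivide"}

REGS = {"F3": 80, "F4": 4, "F5": 2}   # values given in the problem statement

PROGRAM = [
    # (operation, destination, source 1, source 2)
    ("/", "F1", "F3", "F4"),   # I1: source 2 inferred from the table value 4
    ("*", "F2", "F3", "F5"),   # I2: source 2 inferred from the table value 2
    ("+", "F3", "F1", "F2"),   # I3: source 2 is an ASSUMED placeholder
    ("+", "F3", "F4", "F2"),   # I4: source 2 is an ASSUMED placeholder
    ("*", "F4", "F2", "F4"),   # I5: source 2 inferred from the table value 4
    ("-", "F5", "F4", "F5"),   # I6: source 2 inferred from the table value 2
]


def issue(program, regs):
    """Return rows of (station, destination color, operation, source 1, source 2)."""
    rename = {}   # architectural register -> most recent pending color
    used = {}     # functional unit -> stations handed out so far
    rows = []
    for n, (op, dest, s1, s2) in enumerate(program, start=1):
        # Sources are read before the destination is renamed.
        src1 = rename.get(s1, regs.get(s1))
        src2 = rename.get(s2, regs.get(s2))
        unit = UNIT[op]
        used[unit] = used.get(unit, 0) + 1
        color = f"RB{n}"
        rename[dest] = color
        rows.append((f"{unit}{used[unit]}", color, op, src1, src2))
    return rows


def ready(rows):
    """Stations whose two sources are both values rather than pending colors."""
    return [r[0] for r in rows if not isinstance(r[3], str) and not isinstance(r[4], str)]


rows = issue(PROGRAM, REGS)
for row in rows:
    print(*row)
print(ready(rows))
```

Under the table's entries, only the stations holding two plain values can start before any functional unit finishes; the placeholder operands for I3 and I4 do not affect that result, because the table shows both of them waiting on a color.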