


COMPUTER SCIENCES DEPARTMENT. UNIVERSITY OF WISCONSIN—MADISON. PH.D. QUALIFYING EXAMINATION. Computer Architecture. Qualifying Examination. Fall 2020.



The performance of a data cache design has, for many years, been studied using Hill’s 3C model, where the 3 C’s stand for compulsory, capacity, and conflict misses. Later, this model was extended to add a 4th C, coherence misses, in multiprocessor caches.
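As an illustration of how the three uniprocessor categories are usually separated, the following is a minimal sketch and not part of the exam. It assumes LRU replacement, a trace of block addresses, and the common convention that a non-compulsory miss counts as a capacity miss if it also misses in a fully associative LRU cache of the same total size, and as a conflict miss otherwise. The function name `classify_misses` and its parameters are hypothetical.

```python
from collections import OrderedDict


def classify_misses(block_addrs, num_blocks, assoc):
    """Classify accesses to a set-associative LRU cache using the 3C model.

    Assumptions (illustrative only): block_addrs is a list of block
    addresses, num_blocks is the total cache capacity in blocks, and
    assoc divides num_blocks evenly.
    """
    num_sets = num_blocks // assoc
    sets = [OrderedDict() for _ in range(num_sets)]  # the cache under study
    full = OrderedDict()  # fully associative LRU cache of equal capacity
    seen = set()          # every block referenced so far
    counts = {"hit": 0, "compulsory": 0, "capacity": 0, "conflict": 0}

    for b in block_addrs:
        # Access the cache under study (LRU within each set).
        s = sets[b % num_sets]
        real_hit = b in s
        if real_hit:
            s.move_to_end(b)
        else:
            if len(s) >= assoc:
                s.popitem(last=False)  # evict the least recently used block
            s[b] = True

        # Access the fully associative reference cache.
        full_hit = b in full
        if full_hit:
            full.move_to_end(b)
        else:
            if len(full) >= num_blocks:
                full.popitem(last=False)
            full[b] = True

        if real_hit:
            counts["hit"] += 1
        elif b not in seen:
            counts["compulsory"] += 1  # first reference to this block
        elif not full_hit:
            counts["capacity"] += 1    # would miss even with full associativity
        else:
            counts["conflict"] += 1    # caused only by the set mapping
        seen.add(b)

    return counts
```

For example, in a direct-mapped cache of four blocks, blocks 0 and 4 map to the same set, so the trace `[0, 4, 0, 4]` yields two compulsory misses followed by two conflict misses.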
Handling conditional branching is seen as very important for high-performance microprocessors. However, different ISAs have taken significantly different approaches to providing support for handling conditional branching. For example, IA-64 and ARM support predication in the ISA, which in effect eliminates conditional branch instructions. Other ISAs have conditional branch instructions and rely on the microarchitecture to employ dynamic branch prediction for high performance. You are the lead architect for a company that builds high-performance microprocessors, and dealing with branches is becoming increasingly important. Assume that your company has a fixed ISA that already has support for both predication and conditional branches. Your co-worker Bob argues that your next processor should focus on predication, while your co-worker Carol argues that you should focus on branch prediction.
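To make the dynamic branch prediction side of this debate concrete, here is a minimal sketch of one classic scheme, a table of 2-bit saturating counters indexed by the branch address. It is an illustration only and not part of the exam. The function name `two_bit_accuracy`, the table size, the initial counter state, and the trace format of `(pc, taken)` pairs are all assumptions.

```python
def two_bit_accuracy(trace, table_size=16):
    """Return the prediction accuracy of a 2-bit saturating-counter predictor.

    Assumptions (illustrative only): trace is a list of (pc, taken) pairs,
    counters start at 1 (weakly not-taken), and the table is indexed by
    pc modulo table_size.
    """
    counters = [1] * table_size  # states 0..3; 2 and 3 predict taken
    correct = 0
    for pc, taken in trace:
        idx = pc % table_size
        predicted_taken = counters[idx] >= 2
        if predicted_taken == taken:
            correct += 1
        # Train the counter toward the actual outcome, saturating at 0 and 3.
        if taken:
            counters[idx] = min(3, counters[idx] + 1)
        else:
            counters[idx] = max(0, counters[idx] - 1)
    return correct / len(trace) if trace else 0.0


# A loop branch taken nine times and then not taken once:
loop_trace = [(0x40, True)] * 9 + [(0x40, False)]
print(two_bit_accuracy(loop_trace))  # 0.8
```

Under these assumptions the predictor mispredicts only the first iteration and the loop exit. A strictly alternating branch, by contrast, is mispredicted every time from this starting state, which is the kind of hard-to-predict behavior for which predication is often proposed.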
In recent years, as single-core performance stagnated, companies responded by designing multi-core CPUs. In addition to multi-core CPUs, general-purpose GPUs (GPGPUs) have also become increasingly popular. GPGPUs contain a large number of simpler “cores”, referred to as “streaming multiprocessors” (or as “compute units” by other GPU manufacturers).
an 8-issue out-of-order processor with a very large instruction window (a window size of perhaps 1,000 instructions).