









These notes cover the memory hierarchy of computers.
Typology: Thesis










The optimizations fall into five categories:
Reducing the hit time: small and simple first-level caches and way prediction.
Increasing cache bandwidth: pipelined caches, multibanked caches, and nonblocking caches. These techniques have varying impacts on power consumption.
Reducing the miss penalty: critical word first and merging write buffers. These optimizations have little impact on power.
Reducing the miss rate: compiler optimizations. Any improvement made at compile time also lowers power consumption.
Reducing the miss penalty or miss rate via parallelism: hardware prefetching and compiler prefetching. These optimizations generally increase power consumption, primarily because of prefetched data that go unused.
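The three quantities named in these categories (hit time, miss rate, miss penalty) combine in the standard average memory access time formula: hit time plus miss rate times miss penalty. The sketch below shows how changing one term moves the average. The function name `amat` and all cycle counts are illustrative assumptions, not figures from these notes.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time, in the same unit as hit_time."""
    return hit_time + miss_rate * miss_penalty


# Illustrative numbers only: 1-cycle hit, 5% misses, 100-cycle penalty.
baseline = amat(hit_time=1.0, miss_rate=0.05, miss_penalty=100.0)

# Lowering the miss rate (for example, through compiler optimizations).
lower_miss_rate = amat(hit_time=1.0, miss_rate=0.02, miss_penalty=100.0)

# Lowering the miss penalty (for example, through critical word first).
lower_penalty = amat(hit_time=1.0, miss_rate=0.05, miss_penalty=60.0)

print(baseline, lower_miss_rate, lower_penalty)
```

With these assumed numbers, the baseline is about 6 cycles, the lower miss rate gives about 3 cycles, and the lower miss penalty gives about 4 cycles.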
Second Optimization: Way Prediction to Reduce Hit Time
Third Optimization: Pipelined Cache Access to Increase Cache Bandwidth
Sixth Optimization: Critical Word First and Early Restart to Reduce Miss Penalty
Two strategies:
Critical word first: request the missed word first from memory and send it to the processor as soon as it arrives; let the processor continue execution while the rest of the words in the block are filled.
Early restart: fetch the words in normal order, but as soon as the requested word of the block arrives, send it to the processor and let the processor continue execution.
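A small timing model can make the difference between the two strategies concrete. This is a sketch under stated assumptions: the block is delivered one word at a time, the first word arrives after `first_word_latency` cycles, each later word takes `cycles_per_word` more cycles, and critical word first wraps around to the start of the block after reaching its end. The function names and cycle counts are hypothetical.

```python
def delivery_order(requested_word, words_per_block, critical_word_first):
    """Order in which the words of the missed block arrive from memory."""
    if critical_word_first:
        # Start at the requested word and wrap around the block (assumed).
        return [(requested_word + k) % words_per_block
                for k in range(words_per_block)]
    # Early restart keeps the normal order 0, 1, 2, ...
    return list(range(words_per_block))


def restart_cycle(requested_word, words_per_block, critical_word_first,
                  first_word_latency=100, cycles_per_word=4):
    """Cycle at which the processor receives the requested word."""
    order = delivery_order(requested_word, words_per_block, critical_word_first)
    position = order.index(requested_word)
    return first_word_latency + position * cycles_per_word


def block_fill_cycle(words_per_block, first_word_latency=100, cycles_per_word=4):
    """Cycle at which the whole block has arrived (no optimization)."""
    return first_word_latency + (words_per_block - 1) * cycles_per_word


# The processor misses on word 5 of an 8-word block.
print(restart_cycle(5, 8, critical_word_first=True))   # 100
print(restart_cycle(5, 8, critical_word_first=False))  # 120
print(block_fill_cycle(8))                              # 128
```

In this model, early restart only helps when the requested word is not the last in the block, while critical word first always delivers the requested word at the first-word latency.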
Eighth Optimization: Compiler Optimizations to Reduce Miss Rate
The increasing performance gap between processors and main memory has inspired compiler writers to scrutinize the memory hierarchy to see if compile-time optimizations can improve performance.
The goal is to maximize accesses to the data loaded into the cache before the data are replaced.
Two techniques:
Loop interchange
Blocking
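The effect of loop interchange can be illustrated by counting misses in a toy cache model. The sketch below assumes a row-major 256 x 256 array of words and a direct-mapped cache with 64 lines of 8 words each; these sizes, and the helper names, are assumptions chosen for illustration. It covers loop interchange only, not blocking.

```python
def count_misses(addresses, num_lines=64, block_words=8):
    """Count misses in a direct-mapped cache for a stream of word addresses."""
    lines = [None] * num_lines          # block number currently held per line
    misses = 0
    for addr in addresses:
        block = addr // block_words
        index = block % num_lines
        if lines[index] != block:       # miss: fetch the block, evict the old one
            lines[index] = block
            misses += 1
    return misses


def column_order(rows, cols):
    """Original loop nest: inner loop walks down a column (stride = cols)."""
    for j in range(cols):
        for i in range(rows):
            yield i * cols + j          # row-major address of x[i][j]


def row_order(rows, cols):
    """Interchanged loop nest: inner loop walks along a row (stride = 1)."""
    for i in range(rows):
        for j in range(cols):
            yield i * cols + j


ROWS, COLS = 256, 256
col_misses = count_misses(column_order(ROWS, COLS))
row_misses = count_misses(row_order(ROWS, COLS))
print(col_misses, row_misses)  # 65536 8192
```

In this model the column-order traversal misses on every access, because each block is evicted before its other words are touched, while the interchanged loops use all 8 words of a block before moving on.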
As DRAMs grew in capacity, the number of address lines a package needed became a problem. The solution was to multiplex the address lines, thereby cutting the number of address pins in half.
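Address multiplexing can be sketched as splitting the full cell address into two halves that share the same pins. The sketch assumes an even number of address bits and that the upper half is sent first as the row address and the lower half second as the column address. The function names are hypothetical.

```python
def split_address(address, total_bits):
    """Split a DRAM cell address into (row, column) halves.

    Assumes total_bits is even; the row half is sent first and the
    column half second, over the same pins.
    """
    half = total_bits // 2
    row = address >> half
    column = address & ((1 << half) - 1)
    return row, column


def address_pins(total_bits, multiplexed):
    """Number of address pins needed on the package."""
    return total_bits // 2 if multiplexed else total_bits


row, column = split_address(0xABCDE, total_bits=20)
print(row, column)                            # 687 222
print(address_pins(20, multiplexed=False))    # 20
print(address_pins(20, multiplexed=True))     # 10
```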
Improving Memory Performance Inside a DRAM Chip
First, DRAMs added timing signals that allow repeated accesses to the row buffer without another row access time.
Second, synchronous DRAM (SDRAM): a clock signal was added to the DRAM interface, so that repeated transfers would not bear the overhead of synchronizing with the memory controller.
Third, DRAMs were made wider, to overcome the problem of getting a wide stream of bits from the memory without having to make the memory system too large as memory system density increased.
Fourth, double data rate (DDR): to increase bandwidth, data are transferred on both the rising edge and the falling edge of the DRAM clock signal, thereby doubling the peak data rate.
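The doubling from DDR follows directly from the arithmetic: peak rate is clock frequency times transfers per clock cycle times bus width. The sketch below assumes a 133 MHz clock and an 8-byte-wide data bus purely for illustration; the function name is hypothetical.

```python
def peak_transfer_rate(clock_hz, bus_bytes, double_data_rate):
    """Peak data rate in bytes per second."""
    # DDR transfers on both the rising and the falling clock edge.
    transfers_per_cycle = 2 if double_data_rate else 1
    return clock_hz * transfers_per_cycle * bus_bytes


CLOCK_HZ = 133_000_000   # assumed clock rate
BUS_BYTES = 8            # assumed bus width

sdr = peak_transfer_rate(CLOCK_HZ, BUS_BYTES, double_data_rate=False)
ddr = peak_transfer_rate(CLOCK_HZ, BUS_BYTES, double_data_rate=True)
print(sdr // 1_000_000, "MB/s")  # 1064 MB/s
print(ddr // 1_000_000, "MB/s")  # 2128 MB/s
```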
With page tables stored in main memory, every memory access logically takes two accesses: one memory access to obtain the physical address and a second access to get the data.
If the accesses have locality, then the address translations for the accesses must also have locality.
Translation lookaside buffer (TLB): a special cache that keeps these address translations, so that a memory access rarely requires a second access to translate the address.
TLB entry: like a cache entry, where the tag holds portions of the virtual address and the data portion holds a physical page address, protection field, valid bit, and usually a use bit and a dirty bit.
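The lookup path can be sketched with a dictionary standing in for the TLB. This is a minimal model under stated assumptions: a 4 KiB page size, a Python dict playing the role of the page table in memory, a single `writable` flag standing in for the protection field, and no capacity limit or replacement policy. All class and field names are hypothetical.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size in bytes


@dataclass
class TLBEntry:
    physical_page: int      # data portion: physical page number
    writable: bool = True   # simplified protection field
    valid: bool = True
    used: bool = False      # use bit
    dirty: bool = False     # dirty bit


class TLB:
    def __init__(self, page_table):
        self.page_table = page_table   # virtual page -> physical page
        self.entries = {}              # tag (virtual page) -> TLBEntry
        self.hits = 0
        self.misses = 0

    def translate(self, virtual_address, write=False):
        vpn, offset = divmod(virtual_address, PAGE_SIZE)
        entry = self.entries.get(vpn)
        if entry is None or not entry.valid:
            # TLB miss: the extra memory access to the page table happens here.
            self.misses += 1
            entry = TLBEntry(physical_page=self.page_table[vpn])
            self.entries[vpn] = entry
        else:
            self.hits += 1
        if write:
            if not entry.writable:
                raise PermissionError("write to a read-only page")
            entry.dirty = True
        entry.used = True
        return entry.physical_page * PAGE_SIZE + offset


tlb = TLB(page_table={0: 7, 1: 3})
print(tlb.translate(0x0010))               # 28688, TLB miss
print(tlb.translate(0x0020))               # 28704, TLB hit (same page)
print(tlb.translate(4096 + 5, write=True)) # 12293, TLB miss, sets dirty bit
print(tlb.hits, tlb.misses)                # 1 2
```

The second access hits because it falls on the same page as the first, which is the locality argument above in miniature.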