



































































Computer Architecture. Lecture 32: Heterogeneous Systems. Prof. Onur Mutlu. Carnegie Mellon University. Spring 2014, 4/20/2015 ...




































































Prof. Onur Mutlu, Carnegie Mellon University, Spring 2014, 4/20/2015
The memory hierarchy
Caches, caches, more caches
Virtualizing the memory hierarchy: Virtual Memory
Main memory: DRAM
Main memory control, scheduling
Memory latency tolerance techniques
Non-volatile memory
Multiprocessors
Coherence and consistency
In-memory computation and predictable performance
Multi-core issues (e.g., heterogeneous multi-core)
Interconnection networks
Midterm II is this Friday (April 24, 2015)
12:30-2:30pm, CIC Panther Hollow Room ( th floor)
Please arrive 5 minutes early and sit with 1-seat separation
Same rules as Midterm I, except you get to have 2 cheat sheets
Covers all topics we have examined so far, with more focus on Lectures 17-32 (Memory Hierarchy and Multiprocessors)

Midterm II Review is Wednesday (April 22)
Come prepared with questions on concepts and lectures
Detailed homework and exam questions and solutions: study on your own and ask TAs during office hours
Solve past midterms (and finals) on your own…
And check your solutions against the online solutions
Questions will be similar in spirit
http://www.ece.cmu.edu/~ece447/s15/doku.php?id=exams
Do Homework 7 and go over past homeworks.
Study and internalize the lecture material well.
  Buzzwords can help you. Ditto for slides and videos.
Understand how to solve all homework & exam questions.
Study hard.
Also read: https://piazza.com/class/i3540xiz8ku40a?cid=335
Reminder of 447 policy:
Absolutely no form of collaboration allowed
No discussions, no code sharing, no code reviews with fellow students, no brainstorming, …
All labs and all portions of each lab have to be your own work
Just focus on doing the lab yourself, alone
740 is the next course in sequence
Tentative Time: Lect. MW 7:30-9:20pm (Rect. T 7:30pm)
Content:
  Lectures: More advanced, with a different perspective
  Recitations: Delving deeper into papers, advanced topics
  Readings: Many fundamental and research readings; will do many reviews
  Project: More open-ended research project. Proposal, milestones, final poster and presentation
    Done in groups of 1-3
    Focus of the course is the project (and papers)
  Exams: Lighter and fewer
  Homeworks: None
Stay tuned…
Heterogeneity (asymmetry) in system design
Evolution of multi-core systems
Handling serial and parallel bottlenecks better
Heterogeneous multi-core systems
Heterogeneity and asymmetry have the same meaning
  Contrast with homogeneity and symmetry
Heterogeneity is a very general system design concept (and
Idea: Instead of making all instances of the same “resource” identical (i.e., homogeneous or symmetric), design some instances to be different (i.e., heterogeneous or asymmetric)
Different instances can be optimized to be more efficient in executing different types of workloads or satisfying different requirements/goals
  Heterogeneity enables specialization/customization
Different workloads executing in a system can have different behavior
  Different applications can have different behavior
  Different execution phases of an application can have different behavior
  The same application executing at different times can have different behavior (due to input set changes and dynamic events)
  E.g., locality, predictability of branches, instruction-level parallelism, data dependencies, serial fraction, bottlenecks in parallel portion, interference characteristics, …
Systems are designed to satisfy different metrics at the same time
  There is almost never a single goal in design, depending on design point
  E.g., performance, energy efficiency, fairness, predictability, reliability, availability, cost, memory capacity, latency, bandwidth, …
Symmetric: One size fits all
  Energy and performance suboptimal for different “workload” behaviors
Asymmetric: Enables customization and adaptation
  Processing requirements vary across workloads (applications and phases)
  Execute code on best-fit resources (minimal energy, adequate perf.)
[Figure: an asymmetric chip mixing core types (C, C4, C5) next to a symmetric chip of 16 identical cores (C)]
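The "best-fit resources" idea above can be sketched in a few lines. This is only an illustration under stated assumptions, not anything from the lecture: the `Core` type, the `best_fit` helper, and the two core configurations with their performance and power numbers are all hypothetical. The sketch picks the core that meets a deadline (adequate performance) with the least energy.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Core:
    """A hypothetical core type; units are relative, not measured."""
    name: str
    perf: float   # work units completed per unit time
    power: float  # energy consumed per unit time


def best_fit(cores: Iterable[Core], work: float, deadline: float) -> Optional[Core]:
    """Return the core that finishes `work` by `deadline` with minimal energy.

    Returns None when no core is fast enough.
    """
    best = None
    best_energy = None
    for core in cores:
        time = work / core.perf
        if time > deadline:
            continue  # this core does not give adequate performance
        energy = core.power * time
        if best_energy is None or energy < best_energy:
            best, best_energy = core, energy
    return best


# Hypothetical asymmetric chip: one large fast core, one small efficient core.
CORES = [
    Core("big", perf=4.0, power=8.0),
    Core("small", perf=1.0, power=1.0),
]

if __name__ == "__main__":
    relaxed = best_fit(CORES, work=4.0, deadline=10.0)
    tight = best_fit(CORES, work=4.0, deadline=2.0)
    print(relaxed.name if relaxed else None)
    print(tight.name if tight else None)
```

With a relaxed deadline the small core is adequate and uses less energy; with a tight deadline only the big core qualifies. A symmetric design would have to make one of those choices for every workload.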
We Have Already Seen Examples Before (in 447)
CRAY-1 design: scalar + vector pipelines
Modern processors: scalar instructions + SIMD extensions
Decoupled Access Execute: access + execute processors
Thread Cluster Memory Scheduling: different memory scheduling policies for different thread clusters
RAIDR: Heterogeneous refresh rate
Hybrid memory systems
  DRAM + Phase Change Memory
  Fast, Costly DRAM + Slow, Cheap DRAM
  Reliable, Costly DRAM + Unreliable, Cheap DRAM
  …
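To make one of these examples concrete, here is a rough sketch of why a heterogeneous refresh rate helps. It assumes rows are grouped into bins by how often they must be refreshed. The bin intervals, row counts, and the `refreshes_per_second` helper are hypothetical values chosen for illustration, not figures from RAIDR or from the lecture.

```python
from typing import Dict


def refreshes_per_second(bins: Dict[int, int]) -> float:
    """Total row refreshes per second.

    `bins` maps a refresh interval in milliseconds to the number of rows
    that are refreshed at that interval.
    """
    return sum(rows * (1000.0 / interval_ms) for interval_ms, rows in bins.items())


TOTAL_ROWS = 8192  # hypothetical bank size

# Symmetric policy: every row is refreshed at the worst-case interval.
UNIFORM = {64: TOTAL_ROWS}

# Asymmetric policy: only the few weak rows keep the short interval
# (hypothetical split; the row counts sum to TOTAL_ROWS).
BINNED = {64: 32, 128: 160, 256: 8000}

if __name__ == "__main__":
    uniform = refreshes_per_second(UNIFORM)
    binned = refreshes_per_second(BINNED)
    print(f"uniform: {uniform:.0f} refreshes/s")
    print(f"binned:  {binned:.0f} refreshes/s")
    print(f"reduction: {1 - binned / uniform:.1%}")
```

The point is the same as for cores: treating all rows identically forces the worst-case cost on every row, while differentiating them lets most rows use the cheaper setting.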