Performance Measurement in Computer Architecture: Metrics, Trends, and Technologies, Assignments of Computer Architecture and Organization

An overview of a university course, CS/EE 6810, focusing on measuring performance, cost, and power in computer architecture. Topics include instruction level parallelism, memory hierarchy, multiprocessors, storage systems, networks, and processor technology trends. Students will learn about performance metrics, technology trends, and power consumption trends. The course includes lectures, homework assignments, and midterms.

Typology: Assignments

Pre 2010

Uploaded on 08/30/2009

koofers-user-6xz

Introduction

- Background: CS 3810 or equivalent, based on Hennessy and Patterson’s Computer Organization and Design
- Text for CS/EE 6810: Hennessy and Patterson’s Computer Architecture, A Quantitative Approach, 4th Edition
- Topics:
  - Measuring performance/cost/power
  - Instruction level parallelism, dynamic and static
  - Memory hierarchy
  - Multiprocessors
  - Storage systems and networks

Organizational Issues

- Office hours: MEB 3414, by appointment
- TA: Kenneth Williams; TA office hours: TBA
- Special accommodations, add/drop policies (see class webpage)
- Class web page and class mailing list at http://www.eng.utah.edu/~cs
- Grades:
  - Two midterms, 25% each
  - Homework assignments, 50%; you may skip one
  - No tolerance for cheating

Microprocessor Performance

15x performance growth can be attributed to architectural innovations

Where Are We Headed?

Modern trends:

- Clock speed improvements are slowing
  - power constraints
  - already doing less work per stage
- Difficult to further optimize a single core for performance
- Multi-cores: each new processor generation will accommodate more cores

Technology Trends

- DRAM density increases by 40-60% per year; latency has reduced by 33% in 10 years (the memory wall!); bandwidth improves twice as fast as latency decreases
- Disk density improves by 100% every year; latency improvement is similar to DRAM
- Networks: primary focus on bandwidth; 10Mb → 100Mb in 10 years; 100Mb → 1Gb in 5 years
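The multi-year figures above can be put on a common per-year footing. The following is a minimal sketch, assuming simple compound growth; the helper name `annual_growth_rate` is introduced here for illustration and is not part of the course material.

```python
def annual_growth_rate(factor: float, years: float) -> float:
    """Compound annual growth rate implied by a total improvement `factor` over `years`."""
    return factor ** (1.0 / years) - 1.0


# Network bandwidth figures quoted in the slide.
rate_10_to_100 = annual_growth_rate(100 / 10, 10)    # 10Mb -> 100Mb in 10 years
rate_100_to_1g = annual_growth_rate(1000 / 100, 5)   # 100Mb -> 1Gb in 5 years

# DRAM latency: a 33% reduction over 10 years leaves 0.67x of the original latency.
# The per-year reduction is one minus the yearly shrink factor.
dram_latency_rate = 1.0 - 0.67 ** (1.0 / 10)

print(f"10Mb -> 100Mb: {rate_10_to_100:.1%} per year")
print(f"100Mb -> 1Gb:  {rate_100_to_1g:.1%} per year")
print(f"DRAM latency:  {dram_latency_rate:.1%} reduction per year")
```

Under this assumption the two network transitions work out to roughly 26% and 58% per year, while DRAM latency improves by only about 4% per year, which is the gap the "memory wall" remark points at.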

Power Consumption Trends

- Dyn power ∝ activity × capacitance × voltage² × frequency
- Capacitance per transistor and voltage are decreasing, but number of transistors and frequency are increasing at a faster rate
- Leakage power is also rising and will soon match dynamic power
- Power consumption is already between 100 and 150 W in high-performance processors today
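The proportionality for dynamic power can be explored numerically. This is a small sketch that treats the relation as an equality with a constant of 1, so only ratios are meaningful; the 10% scaling factors are hypothetical values chosen for illustration.

```python
def relative_dynamic_power(activity: float, capacitance: float,
                           voltage: float, frequency: float) -> float:
    """Relative dynamic power: activity * capacitance * voltage^2 * frequency.

    The proportionality constant is taken as 1, so results are only
    meaningful as ratios against a baseline.
    """
    return activity * capacitance * voltage ** 2 * frequency


baseline = relative_dynamic_power(1.0, 1.0, 1.0, 1.0)

# Hypothetical scenario: voltage and frequency are both lowered by 10%.
scaled = relative_dynamic_power(1.0, 1.0, 0.9, 0.9)

# Hypothetical scenario: only frequency is lowered by 10%.
freq_only = relative_dynamic_power(1.0, 1.0, 1.0, 0.9)

print(f"voltage and frequency -10%: {scaled / baseline:.3f}x baseline power")
print(f"frequency only -10%:        {freq_only / baseline:.3f}x baseline power")
```

Because voltage enters squared, lowering voltage together with frequency cuts power much more than lowering frequency alone (about 0.73x versus 0.90x in this sketch).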

Summarizing Performance

- Consider 25 programs from a benchmark set: how do we capture the behavior of all 25 programs with a single number?

[Table: execution times of three programs (rows) on Sys-A, Sys-B, and Sys-C (columns); values not preserved]

- Total (average) execution time
- Total (average) weighted execution time
- Average of normalized execution times
- Geometric mean of normalized execution times
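The four summary statistics listed above can be computed side by side. The sketch below uses hypothetical execution times, hypothetical weights, and a hypothetical choice of Sys-A as the reference machine, since the original table values are not available.

```python
import math

# Hypothetical execution times in seconds for three programs on each system.
times = {
    "Sys-A": [10.0, 8.0, 25.0],
    "Sys-B": [12.0, 9.0, 20.0],
    "Sys-C": [8.0, 8.0, 30.0],
}
weights = [0.5, 0.3, 0.2]   # hypothetical workload weights (sum to 1)
reference = "Sys-A"         # hypothetical reference machine for normalization


def arithmetic_mean(xs):
    return sum(xs) / len(xs)


def weighted_mean(xs, ws):
    return sum(x * w for x, w in zip(xs, ws))


def geometric_mean(xs):
    # Computed in log space to avoid overflow on long benchmark lists.
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def summarize(system):
    """Return the four single-number summaries for one system."""
    t = times[system]
    normalized = [x / r for x, r in zip(t, times[reference])]
    return {
        "am_time": arithmetic_mean(t),
        "weighted_time": weighted_mean(t, weights),
        "am_normalized": arithmetic_mean(normalized),
        "gm_normalized": geometric_mean(normalized),
    }


for name in times:
    print(name, {k: round(v, 3) for k, v in summarize(name).items()})
```

With data like this, the four summaries need not rank the systems the same way, which is why the choice of summary matters.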

AM Example

- We fixed a reference machine X and ran 4 programs A, B, C, D on it such that each program ran for 1 second
- The exact same workload (the four programs execute the same number of instructions that they did on machine X) is run on a new machine Y, and the execution times for each program are 0.8, 1.1, 0.5, and 2 seconds
- With AM of normalized execution times, we can conclude that Y is 1.1 times slower than X: perhaps not for all workloads, but definitely for one specific workload (where all programs run on the ref-machine for an equal number of cycles)
- With GM, you may find inconsistencies
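The numbers in this example can be checked directly. The sketch below uses the execution times stated above; the variable names are introduced here for illustration.

```python
import math

times_x = [1.0, 1.0, 1.0, 1.0]   # programs A, B, C, D on reference machine X
times_y = [0.8, 1.1, 0.5, 2.0]   # the same workload on machine Y

# Execution times on Y normalized to X.
normalized = [y / x for x, y in zip(times_x, times_y)]

am = sum(normalized) / len(normalized)
gm = math.exp(sum(math.log(r) for r in normalized) / len(normalized))

# Ratio of total execution times for this specific workload.
total_ratio = sum(times_y) / sum(times_x)

print(f"AM of normalized times: {am:.3f}")
print(f"GM of normalized times: {gm:.3f}")
print(f"Total time on Y / total time on X: {total_ratio:.3f}")
```

The AM matches the ratio of total execution times (4.4 s versus 4 s, so Y is 1.1 times slower), while the GM comes out below 1 and would suggest that Y is faster. That disagreement is one instance of the inconsistency mentioned in the last bullet.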

Summarizing Performance

- GM: does not require a reference machine, but does not predict performance very well
  - So we multiplied execution times and determined that Sys-A is 1.2x faster… but on what workload?
- AM: does predict performance for a specific workload, but that workload was determined by executing programs on a reference machine
  - Every year or so, the reference machine will have to be updated

Normalized Execution Times

- Advantage of GM: no reference machine required
- Disadvantage of GM: does not represent any “real entity” and may not accurately predict performance
- Disadvantage of AM of normalized: need weights (which may change over time)
- Advantage: can represent a real workload
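The claim that GM needs no reference machine, while AM of normalized times depends on one, can be illustrated with a two-program case. The execution times below are hypothetical and chosen to make the effect obvious.

```python
import math

# Hypothetical execution times in seconds for two programs on two machines.
times = {
    "A": [1.0, 1000.0],
    "B": [10.0, 100.0],
}


def arithmetic_mean(xs):
    return sum(xs) / len(xs)


def geometric_mean(xs):
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def normalized(system, ref):
    """Execution times of `system` divided by those of reference machine `ref`."""
    return [t / r for t, r in zip(times[system], times[ref])]


am_b_vs_a = arithmetic_mean(normalized("B", "A"))   # B with A as reference
am_a_vs_b = arithmetic_mean(normalized("A", "B"))   # A with B as reference
gm_b_vs_a = geometric_mean(normalized("B", "A"))
gm_a_vs_b = geometric_mean(normalized("A", "B"))

print(f"AM, B normalized to A: {am_b_vs_a:.2f}")
print(f"AM, A normalized to B: {am_a_vs_b:.2f}")
print(f"GM, B normalized to A: {gm_b_vs_a:.2f}")
print(f"GM, A normalized to B: {gm_a_vs_b:.2f}")
```

With these numbers the AM makes each machine look about 5 times slower than whichever machine is chosen as the reference, whereas the GM gives 1 in both directions. The GM is consistent across references, but as noted above it does not correspond to the running time of any real workload.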

Measuring System CPI

- Assume that an architectural innovation only affects CPI
- For 3 programs, base CPIs: 1.2, 1.8, 2.5; CPIs for proposed model: 1.4, 1.9, 2.
- What is the best way to summarize performance with a single number? AM, HM, or GM of CPIs?

Example

- AM of CPI for base case = (1.2 cyc/instr + 1.8 cyc/instr + 2.5 cyc/instr) / 3. The sum, 5.5 cycles, is the execution time if each program ran for one instruction; therefore, AM of CPI defines a workload where every program runs for an equal number of instructions
- HM of CPI = 1 / AM of IPC; defines a workload where every program runs for an equal number of cycles
- GM of CPI: warm fuzzy number, not necessarily representing any workload
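The three means can be computed for the base CPIs given earlier, together with a check of which workload each one corresponds to. This is a sketch: the instruction and cycle counts per program are arbitrary values picked for illustration.

```python
import math

base_cpis = [1.2, 1.8, 2.5]   # base-case CPIs from the slide
n = len(base_cpis)

am_cpi = sum(base_cpis) / n
hm_cpi = n / sum(1.0 / c for c in base_cpis)
gm_cpi = math.exp(sum(math.log(c) for c in base_cpis) / n)

# IPC is the reciprocal of CPI, so HM of CPI should equal 1 / (AM of IPC).
am_ipc = sum(1.0 / c for c in base_cpis) / n

# Workload 1: every program runs the same number of instructions (arbitrary count).
instrs_each = 1_000_000
cpi_equal_instrs = sum(c * instrs_each for c in base_cpis) / (instrs_each * n)

# Workload 2: every program runs the same number of cycles (arbitrary count).
cycles_each = 1_000_000
cpi_equal_cycles = (cycles_each * n) / sum(cycles_each / c for c in base_cpis)

print(f"AM of CPI: {am_cpi:.4f}  (equal-instruction workload: {cpi_equal_instrs:.4f})")
print(f"HM of CPI: {hm_cpi:.4f}  (equal-cycle workload:       {cpi_equal_cycles:.4f})")
print(f"GM of CPI: {gm_cpi:.4f}")
```

The overall CPI of the equal-instruction workload matches the AM, and the overall CPI of the equal-cycle workload matches the HM. The GM falls between the two and matches neither workload.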
