Performance Evaluation I, Lecture Slide - Computer Science, Slides of Introduction to Computers

Performance Measures, Timing, Timing Mechanisms, Measurement Pitfalls, Modeling Discretization Error, Relative Error, Profiling, Profiling Errors, DCPI Architecture

Typology: Slides

2010/2011

Uploaded on 10/07/2011

Performance Evaluation I
November 5, 1998
Topics
Performance measures (metrics)
Timing
Profiling
15-213
class22.ppt

Performance expressed as a time

Absolute time measures (metrics)

difference between start and finish of an operation

synonyms: running time, elapsed time, response time, latency, completion time, execution time

most straightforward performance measure

Relative (normalized) time measures

running time normalized to some reference time

(e.g. time/reference time)

Guiding principle: Choose performance measures that track running time.

Performance expressed as a rate (cont)

Key idea: Report rates that track execution time.

Example: Suppose we are measuring a program that convolves a stream of images from a video camera.

Bad performance measure: MFLOPS

number of floating point operations depends on the particular convolution algorithm: n^2 matrix-vector product vs n log n fast Fourier transform. An FFT with a bad MFLOPS rate may run faster than a matrix-vector product with a good MFLOPS rate.

Good performance measure: images/sec

a program that runs faster will convolve more images per second.
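The MFLOPS pitfall can be made concrete with a few lines of C. This is a minimal sketch: `mflops` and `images_per_sec` are hypothetical helper names, and the numbers in the note after the code are invented for illustration, not measurements.

```c
#include <assert.h>

/* Rate in millions of floating point operations per second. */
double mflops(double flop_count, double seconds)
{
    assert(seconds > 0.0);
    return flop_count / seconds / 1e6;
}

/* Rate in images processed per second. */
double images_per_sec(double images, double seconds)
{
    assert(seconds > 0.0);
    return images / seconds;
}
```

Suppose a matrix-vector convolution performs 2e9 floating point operations on one image in 4 seconds: `mflops(2e9, 4.0)` is 500 and `images_per_sec(1, 4.0)` is 0.25. If an FFT-based version needs only 1e8 operations and 1 second, its MFLOPS rate drops to 100 but it delivers 1 image per second. The lower MFLOPS rate belongs to the faster program, while images/sec tracks running time.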

Timing mechanisms

Clocks

returns elapsed time since epoch (e.g., Jan 1, 1970)

Unix getclock() function

coarse grained (e.g., us resolution on Alpha)

    long int secs, ns;
    struct timespec start, stop;   /* structs, not uninitialized pointers */

    getclock(TIMEOFDAY, &start);
    P();                           /* the procedure being timed */
    getclock(TIMEOFDAY, &stop);
    secs = (stop.tv_sec - start.tv_sec);
    ns = (stop.tv_nsec - start.tv_nsec);
    /* integer arithmetic so that the value matches %ld */
    printf("%ld ns\n", secs*1000000000L + ns);
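getclock() is specific to Digital Unix. On current POSIX systems the same pattern can be sketched with clock_gettime(). The helper names `elapsed_ns` and `time_call_ns` are assumptions made for this example, and CLOCK_MONOTONIC is chosen here (instead of a time-of-day clock) so that clock adjustments do not disturb the interval.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Nanoseconds from *start to *stop; stop must not be earlier than start. */
long long elapsed_ns(const struct timespec *start, const struct timespec *stop)
{
    long long secs = (long long)stop->tv_sec - (long long)start->tv_sec;
    long long ns = (long long)stop->tv_nsec - (long long)start->tv_nsec; /* may be negative */
    long long total = secs * 1000000000LL + ns;
    assert(total >= 0);
    return total;
}

/* Time a single call to p() with the POSIX monotonic clock. */
long long time_call_ns(void (*p)(void))
{
    struct timespec start, stop;
    clock_gettime(CLOCK_MONOTONIC, &start);
    p();
    clock_gettime(CLOCK_MONOTONIC, &stop);
    return elapsed_ns(&start, &stop);
}
```

Combining seconds and nanoseconds before checking the sign matters: the nanosecond difference alone can be negative even when the interval is positive.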

Timing mechanisms (cont)

Performance counters

counts system events (CYCLES, IMISS, DMISS, BRANCHMP)

very fine grained

short time span (e.g., 9 seconds on 450 MHz Alpha)

Using the Alpha cycle counter

    unsigned int counterRoutine[] = { /* Alpha cycle counter */
        0x601fc000u,
        0x401f0000u,
        0x6bfa8001u
    };
    /* call the machine code stored in the array as a function */
    unsigned int (*counter)(void) = (void *)counterRoutine;
    unsigned int cycles;

    cycles = counter();
    P();
    cycles = counter() - cycles;
    printf("%u cycles\n", cycles);
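The 9 second figure is consistent with a 32-bit cycle count: 2^32 cycles at 450 MHz take about 9.5 seconds before the counter wraps. The sketch below assumes that width. `wrap_seconds` and `cycles_elapsed` are hypothetical helper names, not part of the lecture code.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until a counter of the given width wraps at clock rate hz. */
double wrap_seconds(unsigned bits, double hz)
{
    assert(bits >= 1 && bits <= 63 && hz > 0.0);
    return (double)(1ULL << bits) / hz;
}

/* Cycles between two readings of a 32-bit counter.
   The result is reduced modulo 2^32, so it stays correct
   when the counter wraps once between the readings. */
uint32_t cycles_elapsed(uint32_t start, uint32_t stop)
{
    return (uint32_t)(stop - start);
}
```

An interval that spans one wrap is still reported correctly: `cycles_elapsed(0xFFFFFFF0u, 0x10u)` is 0x20. Intervals longer than one wrap period cannot be recovered from the two readings alone, which is why the counter only suits short measurements.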

Measurement pitfalls

Discretization errors

need to measure large enough chunks of work

but how large is large enough?

Unexpected cache effects

artificial hits or misses

cold start misses due to context swapping

Anatomy of a timer

timer period: dt secs/tick

timer resolution: 1/dt ticks/sec

[Figure: time axis with clock interrupts (ticks) at $T_1, T_2, \ldots, T_n$, spaced dt apart; the program execution time runs from $T_{start}$ to $T_{finish}$.]

Assume here that $T_k = T_{k-1} + dt$.

Measurement pitfall #1: Discretization error

[Figure: time axis with ticks $T_1, T_2, \ldots, T_n$ spaced dt apart; the actual program execution time runs from $T_{start}$ to $T_{finish}$.]

measured time: $(T_n - T_1)$

actual time: $(T_n - T_1) + (T_{finish} - T_n) - (T_{start} - T_1)$

absolute error = measured time - actual time

$f_{start} = (T_{start} - T_1)/dt$    fraction of interval overreported

$f_{finish} = (T_{finish} - T_n)/dt$    fraction of interval underreported

absolute error = $dt \, f_{start} - dt \, f_{finish} = dt \, (f_{start} - f_{finish})$

max absolute error = $\pm dt$
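The error formula can be checked with a small simulation. The sketch below models an idealized timer whose reading is the time of the most recent tick, with ticks at 0, dt, 2dt, and so on. `timer_read` and `abs_error` are hypothetical names, and real timers add overheads that this model ignores.

```c
#include <assert.h>

/* Reading of a tick-driven timer with period dt at real time t:
   the time of the most recent tick. */
double timer_read(double t, double dt)
{
    assert(dt > 0.0 && t >= 0.0);
    /* truncation equals floor here because t/dt is non-negative */
    return (double)(long long)(t / dt) * dt;
}

/* measured time - actual time for a run from t_start to t_finish. */
double abs_error(double t_start, double t_finish, double dt)
{
    assert(t_finish >= t_start);
    double measured = timer_read(t_finish, dt) - timer_read(t_start, dt);
    double actual = t_finish - t_start;
    return measured - actual;
}
```

With dt = 1, a run from 0.75 to 1.25 has f_start = 0.75 and f_finish = 0.25, and `abs_error(0.75, 1.25, 1.0)` returns 0.5 = dt (f_start - f_finish): a run of half a tick is reported as a full tick. A run from 0.25 to 1.75 returns -0.5, because one and a half ticks of work are reported as one tick.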

Examples of discretization error (cont)

[Figure: time axis showing the actual running time.]

Actual time = nearly $2\,dt$

Measured time = $dt$

Absolute measurement error = $-dt$

Estimating the timer period dt

    start = 0;
    while (start == (end = get_etime()))
        ;   /* spin until the timer reading changes */
    dt = end - start;
    printf("dt = %lf\n", dt);

Digital Unix Alpha systems: dt = 1ms

Relative error analysis

Let $t$ and $t'$ be the actual and measured running times of the loop, respectively, and let $dt$ be the timer period.

Also, let $t' - t$ be the absolute error and let $|t' - t|/t$ be the relative error.

Problem: What value of $t'$ will result in a relative error less than or equal to $E_{max}$?

Fact (1): $|t' - t| \le dt$

Fact (2): $t' - dt \le t$

We want $|t' - t|/t \le E_{max}$. It is sufficient that:

$dt/t \le E_{max}$    (Fact 1)

$dt/E_{max} \le t$    (algebra)

$dt/E_{max} \le t' - dt$    (Fact 2)

$dt/E_{max} + dt \le t'$    (algebra)
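The final bound translates directly into code. A minimal sketch, with `min_measured_time` as a hypothetical helper name:

```c
#include <assert.h>

/* Smallest measured time t' (in seconds) that guarantees a relative
   error of at most e_max, from the bound dt/E_max + dt <= t'. */
double min_measured_time(double dt, double e_max)
{
    assert(dt > 0.0 && e_max > 0.0);
    return dt / e_max + dt;
}
```

For instance, a timer with a 10 ms period and a 1% error target needs a measured time of at least 0.01/0.01 + 0.01 = 1.01 seconds. Halving the allowed error roughly doubles the required running time.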

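Relative error analysis (cont)

for (i=0; i<...

Example: with dt = .001 s and E_max = .05, the bound dt/E_max + dt <= t' gives .001/.05 + .001 <= t', so t' >= 0.021 seconds (21 ms).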
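One way to reach a required measured time is to keep doubling the loop count until the timer reports at least that much. The sketch below is an illustration under stated assumptions, not the lecture's code. `time_per_call` is a hypothetical helper, and `sim_clock` and `sim_work` are stand-ins that simulate a timer with a 1 ms period and a procedure that takes 300 us, so that the example behaves deterministically. With a real clock and a real P() the structure is the same.

```c
#include <assert.h>
#include <limits.h>

/* Simulated environment (hypothetical): sim_work() "takes" 300 us and
   sim_clock() only advances in whole 1 ms ticks. */
long sim_us = 0;
void sim_work(void) { sim_us += 300; }
double sim_clock(void) { return (double)(sim_us / 1000) / 1000.0; }

/* Run p() n times per trial, doubling n until the measured time reaches
   t_min seconds. Returns the estimated seconds per call and stores the
   final loop count in *n_out. */
double time_per_call(void (*p)(void), double (*now)(void),
                     double t_min, long *n_out)
{
    long n = 1;
    assert(t_min > 0.0);
    for (;;) {
        double start = now();
        for (long i = 0; i < n; i++)
            p();
        double t_measured = now() - start;
        if (t_measured >= t_min) {
            *n_out = n;
            return t_measured / (double)n;
        }
        assert(n <= LONG_MAX / 2);   /* guard against overflow */
        n *= 2;                      /* too short to trust: do more work */
    }
}
```

Calling `time_per_call(sim_work, sim_clock, t_min, &n)` with t_min taken from the bound dt/E_max + dt <= t' returns an estimate close to the simulated 300 us per call, even though a single call is shorter than one tick and would often be reported as taking zero time.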

Measurement summary

It’s difficult to get accurate times

discretization error

but can’t always measure short procedures in loops

  – changes cache behavior
  – mallocs
  – global state

It’s difficult to get repeatable times

cache effects due to ordering and context switches

Moral of the story:

Adopt a healthy skepticism about measurements!

Always subject measurements to sanity checks.

Profiling

The goal of profiling is to account for the cycles used by a program or system.

Basic techniques

source translation

  – gprof [Graham, 1982]

binary translation

  – pixie [MIPS, 1990]
  – Atom [DEC, 1993]

direct simulation

  – SimOS [Rosenblum, 1995]

statistical sampling

  – SpeedShop [Zagha, 1996] (performance counter interrupts)
  – DCPI [Anderson, 1997] (performance counter interrupts)
  – prof (existing interrupt source)
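The idea behind statistical sampling can be shown with a toy model: interrupt at a fixed period, note which routine is running, and use the fraction of samples per routine as an estimate of its share of the cycles. The sketch below works on a hand-written execution timeline instead of a real interrupt source. `struct run`, `sample_timeline` and `MAX_ROUTINES` are hypothetical names, and this is not how any of the listed tools is implemented.

```c
#include <assert.h>

#define MAX_ROUTINES 16

/* One stretch of the execution timeline: which routine ran, for how long. */
struct run { int routine; long cycles; };

/* Take a sample every `period` cycles and count how many samples land in
   each routine. Returns the total number of samples taken. */
long sample_timeline(const struct run *runs, int n_runs, long period,
                     long counts[MAX_ROUTINES])
{
    assert(period > 0);
    for (int r = 0; r < MAX_ROUTINES; r++)
        counts[r] = 0;

    long samples = 0;
    long next = period;   /* time of the next sampling interrupt */
    long run_end = 0;     /* time at which the current run ends */
    for (int i = 0; i < n_runs; i++) {
        assert(runs[i].routine >= 0 && runs[i].routine < MAX_ROUTINES);
        assert(runs[i].cycles >= 0);
        run_end += runs[i].cycles;
        while (next <= run_end) {   /* interrupt fires during this run */
            counts[runs[i].routine]++;
            samples++;
            next += period;
        }
    }
    return samples;
}
```

For a timeline in which routine 0 runs for 700 cycles, routine 1 for 300, routine 0 again for 200 and routine 2 for 800, sampling every 100 cycles yields 20 samples split 9, 3 and 8, which matches the true shares of 45%, 15% and 40%. With a coarser period the estimate degrades, which is the sampling counterpart of the discretization error seen with timers.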