Computer Architecture is Back: The Berkeley View on the Parallel Computing Landscape

These slides survey the changing landscape of computer architecture and the need for new approaches based on parallelism. They challenge old conventional wisdom and stress the importance of bringing together architects, language designers, application experts, numerical analysts, algorithm designers, and programmers to find solutions. They also cover the challenges of building large designs at ≤65 nm and the memory wall, and close with the shift toward increasing parallelism and the industry's bet that breakthroughs will appear before it is too late.

Computer Architecture is Back
- The Berkeley View on the
Parallel Computing Landscape
David Patterson, Krste Asanović, Kurt Keutzer,
and a cast of thousands
U.C. Berkeley
January, 2007


High Level Message

  • Everything is changing
  • Old conventional wisdom is out
  • We desperately need new approach to HW and SW based on parallelism since industry has bet its future that parallelism works
  • Need to create a “watering hole” to bring everyone together to quickly find that solution
      - architects, language designers, application experts, numerical analysts, algorithm designers, programmers, …

Conventional Wisdom (CW) in Computer Architecture

1. Old CW: Power is free, but transistors expensive
   New CW is the “Power wall”: Power is expensive, but transistors are “free”
     • Can put more transistors on a chip than have the power to turn on
2. Old CW: Only concern is dynamic power
   New CW: For desktops and servers, static power due to leakage is 40% of total power
3. Old CW: Monolithic uniprocessors are reliable internally, with errors occurring only at pins
   New CW: As chips drop below 65 nm feature sizes, they will have high soft and hard error rates

Conventional Wisdom (CW) in Computer Architecture

4. Old CW: By building upon prior successes, continue raising level of abstraction and size of HW designs
   New CW: Wire delay, noise, cross coupling, reliability, clock jitter, design validation, … stretch development time and cost of large designs at ≤65 nm
5. Old CW: Researchers demonstrate new architectures by building chips
   New CW: Cost of 65 nm masks, cost of ECAD, and design time for GHz clocks ⇒ Researchers no longer build believable chips
6. Old CW: Performance improvements yield both lower latency and higher bandwidth
   New CW: BW improvement > (latency improvement)²
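As a rough illustration of item 6, the sketch below turns the rule of thumb into arithmetic: for a given latency improvement factor, it reports the bandwidth improvement that the rule says will be exceeded. The function name `min_bandwidth_gain` and the sample factors are assumptions made for this example, not values from the slides.

```python
def min_bandwidth_gain(latency_gain: float) -> float:
    """Lower bound on bandwidth improvement implied by the rule of thumb
    BW improvement > (latency improvement)^2.

    latency_gain is an improvement factor (2.0 means latency was halved).
    """
    if latency_gain < 1.0:
        raise ValueError("latency_gain must be an improvement factor >= 1")
    return latency_gain ** 2


# Hypothetical latency improvement factors, chosen only for illustration.
for gain in (1.5, 2.0, 3.0):
    bound = min_bandwidth_gain(gain)
    print(f"latency improves {gain}x -> bandwidth improves by more than {bound}x")
```

Read this way, a design that halves latency over some period would be expected to see bandwidth improve by more than a factor of four over the same period.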

[Figure: Uniprocessor Performance (SPECint). Vertical axis: Performance (vs. VAX-11/780), log scale from 1 to 10000. Horizontal axis: year, 1978 to 2006. Annotations on the curve: 25%/year, 52%/year, ??%/year, 3X.]

Uniprocessor Performance (SPECint)

  • VAX: 25%/year 1978 to 1986
  • RISC + x86: 52%/year 1986 to 2002
  • RISC + x86: ??%/year 2002 to present

From Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th edition, Sept. 15, 2006

⇒ Sea change in chip design: multiple “cores” or processors per chip
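To make the growth rates above concrete, the following sketch compounds each annual rate over its stated period. It assumes simple year-over-year compounding, and the helper name `cumulative_speedup` is invented for this example; the slides give only the rates and the year ranges.

```python
def cumulative_speedup(annual_rate: float, years: int) -> float:
    """Total speedup after compounding annual_rate (0.25 = 25%/year) for years."""
    return (1.0 + annual_rate) ** years


# Rates and year ranges as listed on the slide.
vax_era = cumulative_speedup(0.25, 1986 - 1978)   # 8 years at 25%/year
risc_era = cumulative_speedup(0.52, 2002 - 1986)  # 16 years at 52%/year

print(f"1978 to 1986 at 25%/year: about {vax_era:.1f}x")
print(f"1986 to 2002 at 52%/year: about {risc_era:.0f}x")
```

Under this assumption the first period compounds to roughly 6x and the second to roughly 800x, which is why a slowdown after 2002 matters so much for software that relied on uniprocessor gains.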

Sea Change in Chip Design

  • Intel 4004 (1971): 4-bit processor, 2312 transistors, 0.4 MHz, 10 micron PMOS, 11 mm² chip
  • RISC II (1983): 32-bit, 5 stage pipeline, 40,760 transistors, 3 MHz, 3 micron NMOS, 60 mm² chip
  • 125 mm² chip, 0.065 micron CMOS = 2312 RISC II + FPU + Icache + Dcache
      - RISC II shrinks to ≈ 0.02 mm² at 65 nm
      - Caches via DRAM or 1 transistor SRAM or 3D chip stacking
      - Proximity Communication via capacitive coupling at > 1 TB/s? (Ivan Sutherland @ Sun / Berkeley)
  • Processor is the new transistor!
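The shrink figure above can be sanity-checked with a back-of-the-envelope calculation. The sketch below assumes ideal scaling, in which die area shrinks with the square of the feature size, and ignores wires, pads, and anything else that does not scale that way; `scaled_area_mm2` is a hypothetical helper written for this example.

```python
def scaled_area_mm2(area_mm2: float, old_feature_um: float, new_feature_um: float) -> float:
    """Area after an ideal shrink: area scales with (feature size)^2."""
    return area_mm2 * (new_feature_um / old_feature_um) ** 2


# RISC II: 60 mm^2 at 3 micron, shrunk to 0.065 micron (65 nm).
risc2_area = scaled_area_mm2(60.0, 3.0, 0.065)

# Area taken by 2312 such cores (one per Intel 4004 transistor).
cores_area = 2312 * risc2_area

print(f"RISC II at 65 nm: about {risc2_area:.3f} mm^2")
print(f"2312 cores: about {cores_area:.0f} mm^2 of a 125 mm^2 chip")
```

Ideal scaling gives a value near 0.03 mm², the same order of magnitude as the slide's ≈ 0.02 mm², and 2312 such cores occupy roughly half of a 125 mm² die, leaving the rest for the FPUs and caches the slide mentions.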

Parallelism again? What’s different this time?

  • “This shift toward increasing parallelism is not a triumphant stride forward based on breakthroughs in novel software and architectures for parallelism; instead, this plunge into parallelism is actually a retreat from even greater challenges that thwart efficient silicon implementation of traditional uniprocessor architectures.”
      - Berkeley View, December 2006
  • HW/SW industry bet its future that breakthroughs will appear before it’s too late

From Multiprogramming to Multithreading

  • Multiprogrammed workloads (mix of independent sequential tasks) might obviously benefit from first few generations of multicores
  • But how will single tasks get faster on future manycores?

7 Questions for Parallelism

  • Applications:
      1. What are the apps?
      2. What are kernels of apps?
  • Hardware:
      3. What are the HW building blocks?
      4. How to connect them?
  • Programming Model & Systems Software:
      5. How to describe apps and kernels?
      6. How to program the HW?
  • Evaluation:
      7. How to measure success?

(Inspired by a view of the Golden Gate Bridge from Berkeley)

Apps and Kernels

  • Old CW: Since cannot know future programs, use old programs to evaluate future computers
      - e.g., SPEC2006, EEMBC
  • What about parallel codes?
      - Few, tied to old models, languages, architectures, …
  • New approach: Design future computers for patterns of computation and communication important in the future
  • Claim: 13 “dwarfs” are key for next decade, so design for them!
      - Representative codes may vary over time, but these dwarfs will be important for > 10 years

Do dwarfs work well outside HPC?

  • Examine effectiveness of the 7 dwarfs elsewhere
      1. Embedded Computing (EEMBC benchmark)
      2. Desktop/Server Computing (SPEC2006)
      3. Machine Learning
           - Advice from Mike Jordan and Dan Klein of UC Berkeley
      4. Games/Graphics/Vision
      5. Data Base Software
           - Advice from Jim Gray of Microsoft and Joe Hellerstein of UC
  • Result: Added 7 more dwarfs, revised 2 original dwarfs, renumbered list

13 Dwarfs (so far)

1. Dense Linear Algebra
2. Sparse Linear Algebra
3. Spectral Methods
4. N-Body Methods
5. Structured Grids
6. Unstructured Grids
7. MapReduce
8. Combinational Logic
9. Graph Traversal
10. Dynamic Programming
11. Back-track/Branch & Bound
12. Graphical Model Inference
13. Finite State Machine

  • Claim is that parallel architecture, language, compiler… that do these well will run parallel apps of future well
  • Note: MapReduce is embarrassingly parallel; perhaps FSM is embarrassingly sequential?
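The contrast drawn in the note can be sketched in a few lines of Python. This is an illustrative toy, not code from the Berkeley View: the word-count task, the three-state machine that looks for the substring "ab", and all function names are assumptions chosen for the example. In the MapReduce-style function each chunk is processed independently, so the map step can be handed to a pool of workers; in the FSM, each step needs the state produced by the previous step.

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk: str) -> int:
    """Map step: depends only on its own chunk, so chunks can run in any order."""
    return len(chunk.split())


def mapreduce_word_count(chunks, workers: int = 4) -> int:
    """Map each chunk independently in a worker pool, then reduce by summing."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    return sum(partial_counts)


# Hypothetical FSM over the alphabet {a, b}; state 2 means "ab" has been seen.
TRANSITIONS = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 2, (2, "b"): 2,
}


def run_fsm(text: str, start: int = 0) -> int:
    """Step through the input one symbol at a time.

    Each iteration reads the state written by the previous iteration, which is
    the loop-carried dependence behind the "embarrassingly sequential" remark.
    """
    state = start
    for symbol in text:
        state = TRANSITIONS[(state, symbol)]
    return state


print(mapreduce_word_count(["the quick brown", "fox jumps", "over the lazy dog"]))
print(run_fsm("aab"))
```

The slide itself ends the note with a question mark, and this straightforward loop should be read the same way: it shows where the dependence comes from, not that no parallel formulation exists.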


HW: What are the problems?

  • Power limits leading edge chip designs
      - Intel Tejas Pentium 4 cancelled due to power issues
  • Yield on leading edge processes dropping dramatically
      - IBM quotes yields of 10-20% on 8-processor Cell
  • Design/validation of leading edge chips is becoming unmanageable
      - Verification teams > design teams on leading edge processors