Revolution in Computer Architecture: Old Conventional Wisdom vs. New Conventional Wisdom, Slides of Electronics engineering

Bharat Ratna Dr. B. R. Ambedkar University Electronics engineering

These slides describe the shift in conventional wisdom in computer architecture: from the belief that chips are reliable internally and power is free to the recognition of high error rates and power costs in sub-65nm technologies. They also cover the difficulty of building believable prototypes, the old conventional wisdom that compilers take a decade to introduce an architecture innovation, and the sea change in chip design toward multiple cores or processors per chip.

Typology: Slides

2012/2013

Uploaded on 03/23/2013

dhrupad 🇮🇳


High Level Message

• Everything is changing; old conventional wisdom is out
• We DESPERATELY need a new architectural solution for microprocessors based on parallelism
• Need to create a “watering hole” to bring everyone together to quickly find that solution
– architects, language designers, application experts, numerical analysts, algorithm designers, programmers, …

Docsity.com


Outline

• Part I: A New Agenda for Computer Architecture
– Old Conventional Wisdom vs. New Conventional Wisdom
• Part II: A “Watering Hole” for Parallel Systems
– Research Accelerator for Multiple Processors
• Conclusion

Conventional Wisdom (CW) in Computer Architecture

• Old CW: Power is free, transistors expensive
• New CW: “Power wall”: power expensive, transistors free (can put more on chip than can afford to turn on)
• Old CW: Multiplies are slow, memory access is fast
• New CW: “Memory wall”: memory slow, multiplies fast (200 clocks to DRAM memory, 4 clocks for FP multiply)
• Old CW: Increasing Instruction Level Parallelism via compilers, innovation (out-of-order, speculation, VLIW, …)
• New CW: “ILP wall”: diminishing returns on more ILP
• New: Power Wall + Memory Wall + ILP Wall = Brick Wall
- Old CW: Uniprocessor performance 2X / 1.5 yrs
- New CW: Uniprocessor performance only 2X / 5 yrs?
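The doubling periods and clock counts on this slide can be turned into comparable numbers with a little arithmetic. The sketch below is only an illustration; the helper name `annual_growth` is hypothetical and does not come from the slides.

```python
def annual_growth(factor: float, years: float) -> float:
    """Annual growth rate implied by improving `factor`-fold every `years` years."""
    return factor ** (1.0 / years) - 1.0

# Figures taken from the slide.
old_cw = annual_growth(2.0, 1.5)   # 2X / 1.5 yrs
new_cw = annual_growth(2.0, 5.0)   # 2X / 5 yrs

# Memory wall: 200 clocks to DRAM versus 4 clocks for an FP multiply.
multiplies_per_dram_access = 200 // 4

print(f"Old CW: {old_cw:.0%} per year")
print(f"New CW: {new_cw:.0%} per year")
print(f"FP multiplies per DRAM access: {multiplies_per_dram_access}")
```

Under these figures, the old rate is roughly 59% per year and the new one roughly 15% per year, and one DRAM access costs as much as 50 floating-point multiplies.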

Uniprocessor Performance (SPECint)

[Figure: performance relative to VAX-11/780 (log scale) versus year, 1978 to 2006]

• VAX: 25%/year, 1978 to 1986
• RISC + x86: 52%/year, 1986 to 2002
• RISC + x86: ??%/year, 2002 to present

From Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th edition, 2006

⇒ Sea change in chip design: multiple “cores” or processors per chip
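To relate the annual growth rates in the figure to doubling times, the following sketch applies plain compound-growth arithmetic. The function names `doubling_time` and `cumulative_gain` are hypothetical helpers introduced for illustration.

```python
import math

def doubling_time(annual_rate: float) -> float:
    """Years needed to double at a compound annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)

def cumulative_gain(annual_rate: float, start_year: int, end_year: int) -> float:
    """Total improvement factor from compounding over the given span."""
    return (1.0 + annual_rate) ** (end_year - start_year)

# Rates and year ranges as listed for the figure.
vax_era = cumulative_gain(0.25, 1978, 1986)
risc_era = cumulative_gain(0.52, 1986, 2002)

print(f"25%/year doubles every {doubling_time(0.25):.1f} years")
print(f"52%/year doubles every {doubling_time(0.52):.1f} years")
print(f"Gain 1978 to 1986: {vax_era:.1f}X, gain 1986 to 2002: {risc_era:.0f}X")
```

A 52%/year rate corresponds to doubling about every 1.7 years, which is in line with the "2X / 1.5 yrs" figure quoted as old conventional wisdom.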

Déjà vu all over again?

• “… today’s processors … are nearing an impasse as technologies approach the speed of light..” David Mitchell, The Transputer: The Time Is Now (1989)
• Transputer had bad timing (uniprocessor performance ↑) ⇒ Procrastination rewarded: 2X sequential perf. / 1.5 years
• “We are dedicating all of our future product development to multicore designs. … This is a sea change in computing” Paul Otellini, President, Intel (2005)
• All microprocessor companies switch to MP (2X CPUs / 2 yrs) ⇒ Procrastination penalized: 2X sequential perf. / 5 yrs

| Manufacturer/Year | AMD/’05 | Intel/’06 | IBM/’04 | Sun/’ |
|---|---|---|---|---|
| Processors/chip | 2 | 2 | 2 | |
| Threads/Processor | 1 | 2 | 2 | |
| Threads/chip | 2 | 4 | 4 | 32 |

21st Century Computer Architecture

• Old CW: Since cannot know future programs, find set of old programs to evaluate designs of computers for the future
– E.g., SPEC
• What about parallel codes?
– Few available, tied to old models, languages, architectures, …
• New approach: Design computers of future for numerical methods important in future
• Claim: key methods for next decade are 7 dwarves (+ a few), so design for them!
– Representative codes may vary over time, but these numerical methods will be important for > 10 years

6/11 Dwarves Cover 24/30 SPEC

SPECfp
- 8 Structured grid
  - 3 using Adaptive Mesh Refinement
- 2 Sparse linear algebra
- 2 Particle methods
- 5 TBD: Ray tracer, Speech Recognition, Quantum Chemistry, Lattice Quantum Chromodynamics (many kernels inside each benchmark?)
SPECint
- 8 Finite State Machine
- 2 Sorting/Searching
- 2 Dense linear algebra (data type differs from dwarf)
- 1 TBD: 1 C compiler (many kernels?)
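The "6/11" and "24/30" figures in the slide title can be checked against the per-category counts listed above. This is a small tally, not part of the original slides; the dictionary layout is an assumption made for illustration.

```python
# Benchmark counts per category, copied from the slide.
# The 3 Adaptive Mesh Refinement codes are a subset of the 8 structured-grid codes.
specfp = {"Structured grid": 8, "Sparse linear algebra": 2,
          "Particle methods": 2, "TBD": 5}
specint = {"Finite State Machine": 8, "Sorting/Searching": 2,
           "Dense linear algebra": 2, "TBD": 1}

def covered(suite: dict) -> int:
    """Benchmarks already mapped to a dwarf (everything except TBD)."""
    return sum(count for name, count in suite.items() if name != "TBD")

total = sum(specfp.values()) + sum(specint.values())
covered_total = covered(specfp) + covered(specint)
dwarves_used = len({name for suite in (specfp, specint)
                    for name in suite if name != "TBD"})

print(f"{dwarves_used} dwarves cover {covered_total}/{total} SPEC benchmarks")
```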

21st Century Code Generation

• Old CW: Takes a decade for compilers to introduce an architecture innovation
• New approach: “Auto-tuners” first run variations of program on computer to find best combinations of optimizations (blocking, padding, …) and algorithms, then produce C code to be compiled for that computer
– E.g., PHiPAC (BLAS), ATLAS (BLAS), Sparsity (sparse linear algebra), Spiral (DSP), FFTW
– Can achieve 10X over conventional compiler
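The auto-tuning loop described above (run variants on the target machine, keep the fastest, emit code for it) can be sketched in a few lines. This is a toy illustration, not how PHiPAC, ATLAS, or the other tools are implemented: the functions `blocked_matmul` and `autotune`, the candidate block sizes, and the `BLOCK_SIZE` macro name are all assumptions. In pure Python the timings do not reflect cache behavior the way compiled C would; only the search structure is the point.

```python
import time

def blocked_matmul(a, b, n, block):
    """Multiply two n x n matrices (lists of lists) using square blocking."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, n, block):
            for jj in range(0, n, block):
                for i in range(ii, min(ii + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + block, n)):
                        aik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + block, n)):
                            row_c[j] += aik * row_b[j]
    return c

def autotune(n=48, candidates=(4, 8, 16, 48), repeats=3):
    """Time each candidate block size on this machine and return the fastest."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {}
    for block in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            blocked_matmul(a, b, n, block)
            best = min(best, time.perf_counter() - start)
        timings[block] = best
    return min(timings, key=timings.get), timings

best_block, timings = autotune()
# Emit the winning parameter as a C fragment, mirroring the "produce C code" step.
print(f"#define BLOCK_SIZE {best_block}")
```

The chosen block size may differ from machine to machine and even from run to run, which is the reason the search is performed empirically rather than fixed in advance.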

Best Sparse Blocking for 8 Computers

• All possible column block sizes selected for 8 computers; how could compiler know?

[Figure: best block size for each computer, plotted as row block size (r) against column block size (c), for Intel Pentium M, Sun Ultra 2, Sun Ultra 3, AMD Opteron, IBM Power 3, IBM Power 4, Intel/HP Itanium, and Intel/HP Itanium 2]
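One reason the best r × c block size is hard to predict is that blocking a sparse matrix stores explicit zeros to fill out each block. The sketch below computes that storage overhead (often called the fill ratio) for a made-up sparsity pattern. It is an assumption-laden illustration, not the method used by Sparsity: the function `fill_ratio` and the 8 × 8 pattern are hypothetical, and the real choice also depends on how fast each machine runs each block shape.

```python
def fill_ratio(nonzeros, r, c):
    """Stored entries (including padded zeros) divided by true nonzeros
    when the matrix is stored as aligned r x c blocks."""
    blocks = {(i // r, j // c) for i, j in nonzeros}
    return len(blocks) * r * c / len(nonzeros)

# Hypothetical 8 x 8 pattern: dense 2 x 2 blocks along the diagonal.
pattern = {(i, j)
           for base in range(0, 8, 2)
           for i in (base, base + 1)
           for j in (base, base + 1)}

for r in (1, 2, 4, 8):
    for c in (1, 2, 4, 8):
        print(f"r={r} c={c} fill={fill_ratio(pattern, r, c):.2f}")
```

For this pattern, 2 × 2 blocks add no padding, while 4 × 4 blocks double the stored entries. An auto-tuner would weigh this overhead against measured per-block speed on each computer.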

21st Century Measures of Success

• Old CW: Don’t waste resources on accuracy, reliability
– Speed kills competition
– Blame Microsoft for crashes
• New CW: SPUR is critical for future of IT
– Security
– Privacy
– Usability (cost of ownership)
– Reliability
• Success not limited to performance/cost

“20th century vs. 21st century C&C: the SPUR manifesto,” Communications of the ACM, 48:3, 2005.

Parallel Framework – Apps (so far)

• Original 7 dwarves: 6 data parallel, 1 no coupling TLP
• Bonus 4 dwarves: 2 data parallel, 2 no coupling TLP
• EEMBC (Embedded): Stream 10, DLP 19, Barrier TLP 2
• SPEC (Desktop): 14 DLP, 2 no coupling TLP

[Figure: EEMBC, SPEC, and the dwarves placed along a spectrum of parallelism: Streaming DLP, DLP, No coupling TLP, Barrier TLP, Tight TLP. Annotations: “Most New Architectures” and “Most Important Apps?”]

Outline

• Part I: A New Agenda for Computer Architecture
– Old Conventional Wisdom vs. New Conventional Wisdom
• Part II: A “Watering Hole” for Parallel Systems
– Research Accelerator for Multiple Processors
• Conclusion

Build Academic MPP from FPGAs

• As ≈ 25 CPUs will fit in Field Programmable Gate Array (FPGA), 1000-CPU system from ≈ 40 FPGAs?
– 16 32-bit simple “soft core” RISC at 150MHz in 2004 (Virtex-II)
– FPGA generations every 1.5 yrs; ≈ 2X CPUs, ≈ 1.2X clock rate
• HW research community does logic design (“gate shareware”) to create out-of-the-box, MPP
– E.g., 1000 processor, standard ISA binary-compatible, 64-bit, cache-coherent supercomputer @ ≈ 100 MHz/CPU in 2007
– RAMPants: Arvind (MIT), Krste Asanović (MIT), Derek
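The sizing estimate on this slide is simple division, and the per-generation scaling can be applied the same way. The sketch below uses only the slide's figures; the helper names `fpgas_needed` and `project` are hypothetical, and the one-generation projection is a naive extrapolation from the 2004 data point, not a claim made in the slides.

```python
import math

def fpgas_needed(total_cpus: int, cpus_per_fpga: int) -> int:
    """Number of FPGAs required to host the requested CPU count."""
    return math.ceil(total_cpus / cpus_per_fpga)

def project(cpus: float, clock_mhz: float, generations: int):
    """Apply the slide's per-generation scaling: about 2X CPUs, about 1.2X clock."""
    return cpus * 2 ** generations, clock_mhz * 1.2 ** generations

# 1000-CPU system at roughly 25 CPUs per FPGA.
print(fpgas_needed(1000, 25))

# One FPGA generation (about 1.5 years) after the 2004 data point
# of 16 soft cores at 150 MHz.
next_cpus, next_clock = project(16, 150.0, generations=1)
print(f"{next_cpus} CPUs per FPGA at about {next_clock:.0f} MHz")
```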

Why RAMP Good for Research MPP?

| | SMP | Cluster | Simulate | RAMP |
|---|---|---|---|---|
| Scalability (1k CPUs) | C | A | A | A |
| Cost (1k CPUs) | F ($40M) | C ($2-3M) | A+ ($0M) | A ($0.1-0.2M) |
| Cost of ownership | A | D | A | A |
| Power/Space (kilowatts, racks) | D (120 kw, 12 racks) | | A+ (.1 kw, 0.1 racks) | A (1.5 kw, 0.3 racks) |
| Community | D | A | A | A |
| Observability | D | C | A+ | A+ |
| Reproducibility | B | D | A+ | A+ |
| Reconfigurability | D | C | A+ | A+ |
| Credibility | A+ | A+ | F | B+/A- |
| Perform. (clock) | A (2 GHz) | A (3 GHz) | F (0 GHz) | C (0.1-.2 GHz) |
| GPA | C | B- | B | A- |
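The GPA row can be roughly reproduced from the letter grades. The sketch below rests on several assumptions that are not stated in the slides: a 4.0 scale with A+ counted as 4.0, a split grade such as "B+/A-" taken as the average of its two parts, and a missing grade skipped. The source lists only three Power/Space grades, so this sketch assigns the D to SMP and leaves Cluster unset.

```python
# Assumed grade-point mapping (not given in the slides).
POINTS = {"A+": 4.0, "A": 4.0, "A-": 3.7, "B+": 3.3, "B": 3.0,
          "B-": 2.7, "C": 2.0, "D": 1.0, "F": 0.0}

def grade_points(grade: str) -> float:
    """Points for one grade; a split grade like 'B+/A-' is averaged."""
    parts = grade.split("/")
    return sum(POINTS[p] for p in parts) / len(parts)

def gpa(grades) -> float:
    """Mean grade points over the grades that are present (None is skipped)."""
    known = [grade_points(g) for g in grades if g is not None]
    return sum(known) / len(known)

# Rows in slide order: scalability, cost, cost of ownership, power/space,
# community, observability, reproducibility, reconfigurability,
# credibility, performance.
report = {
    "SMP":      ["C", "F", "A", "D", "D", "D", "B", "D", "A+", "A"],
    "Cluster":  ["A", "C", "D", None, "A", "C", "D", "C", "A+", "A"],  # power/space grade missing
    "Simulate": ["A", "A+", "A", "A+", "A", "A+", "A+", "A+", "F", "F"],
    "RAMP":     ["A", "A", "A", "A", "A", "A+", "A+", "A+", "B+/A-", "C"],
}

for name, grades in report.items():
    print(f"{name}: {gpa(grades):.2f}")
```

Under these assumptions the ordering matches the slide's GPA row, with RAMP highest and SMP lowest.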
