Raw Experience, Lecture Slides - Assembly Programming, Slides of Assembly Language Programming

Raw Chips, Raw Microprocessors, MPEG-2 Encoder Performance, Phong Shading, Shadow Volume, Operand Routing, Operand Transport, Latency, More Scalability Problems, Tiled Processors

Typology: Slides

2010/2011

Uploaded on 10/11/2011

Dr. Rodric Rabbah, IBM. 1 6.189 IAP 2007 MIT
6.189 IAP 2007
Lecture 17
The Raw Experience

Raw Chips

October 02

Raw Microprocessor

● 16 tiles (16 issue)
● 180 nm ASIC (IBM SA-27E)
● ~100 million transistors
● 1 million gates
● 3-4 years of development
● 1.5 years of testing
● 200K lines of test code
● Core frequency:
  - 425 MHz @ 1.8 V
  - 500 MHz @ 2.2 V
● Frequency competitive with IBM-implemented PowerPCs in the same process
● 18 W average power
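As a back-of-the-envelope reading of the figures above, the sketch below multiplies tile count by core frequency to get an upper bound on instructions issued per second. It assumes one instruction issued per tile per cycle, which is how "16 tiles (16 issue)" reads. The function name `peak_issue_rate` is hypothetical, and the result is an illustrative bound, not a measured number from the slides.

```python
def peak_issue_rate(tiles: int, freq_hz: float, issue_per_tile: int = 1) -> float:
    """Upper bound on instructions issued per second across all tiles.

    Assumption: every tile issues `issue_per_tile` instructions each cycle.
    """
    return tiles * issue_per_tile * freq_hz


# The two operating points listed on the slide: (supply voltage, MHz).
for volts, mhz in [(1.8, 425), (2.2, 500)]:
    rate = peak_issue_rate(16, mhz * 1e6)
    print(f"{mhz} MHz @ {volts} V: {rate / 1e9:.1f} G instructions/s (peak)")
```

Real throughput sits below this bound whenever tiles stall on operands or memory.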

One Cycle in the Life of a Tiled Processor

● Application uses as many tiles as needed to exploit its parallelism

[Figure: tile array with the labels "mem" (three times), "4-way automatically parallelized C program", "2-thread MPI app", "httpd", and "Zzz..."]

Raw in Action

MPEG-2 Encoder Performance

[Figure: two plots against # of tiles (1, 4, 8, 16), one showing speedup and one showing frames/s, for 350 x 240 images and 720 x 480 images]

Legend:
- Square: linear speedup
- Diamond: hand-optimized, slice-parallel implementation
- Circle: slice-parallel implementation
- Triangle: baseline macroblock-parallel implementation
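The two plotted quantities are related: speedup on n tiles is the n-tile frame rate divided by the one-tile frame rate, and the "linear speedup" reference line corresponds to a parallel efficiency of 1. The sketch below shows that arithmetic. The frame rates in it are hypothetical placeholders, not values read from the plots, and the function names are illustrative.

```python
def speedup(rate_n_tiles: float, rate_one_tile: float) -> float:
    """Speedup computed from throughput (frames/s) on n tiles vs. one tile."""
    return rate_n_tiles / rate_one_tile


def efficiency(speedup_value: float, n_tiles: int) -> float:
    """Parallel efficiency: 1.0 means linear speedup."""
    return speedup_value / n_tiles


# Hypothetical throughput in frames/s, NOT values taken from the slide.
frames_per_s = {1: 2.0, 4: 7.6, 8: 14.0, 16: 24.0}

for n_tiles, rate in frames_per_s.items():
    s = speedup(rate, frames_per_s[1])
    print(f"{n_tiles:2d} tiles: speedup {s:5.2f}, efficiency {efficiency(s, n_tiles):.2f}")
```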
Programmable Graphics Pipeline

[Figure: screenshot from Counterstrike]

Simplified graphics pipeline: Input, Vertex (V), Sync, Triangle Setup, Pixel (P)

Phong Shading

● Per-pixel Phong-shaded polyhedron
● 162 vertices, 1 light

[Figure: output, rendered using the Raw simulator]
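For reference, per-pixel Phong shading evaluates the Phong reflection model at every pixel using the interpolated surface normal, rather than once per vertex. The sketch below is the textbook model for a single light of unit intensity. It is not the code that ran on Raw, and the coefficient defaults (`ka`, `kd`, `ks`, `shininess`) are arbitrary assumptions chosen for illustration.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def phong_intensity(normal, to_light, to_viewer,
                    ka=0.1, kd=0.6, ks=0.3, shininess=32):
    """Phong reflection model for one pixel and one unit-intensity light.

    normal, to_light, to_viewer: 3-vectors at the pixel (any length).
    ka, kd, ks, shininess: assumed material coefficients.
    """
    n, l, v = normalize(normal), normalize(to_light), normalize(to_viewer)
    n_dot_l = dot(n, l)
    if n_dot_l <= 0.0:
        # Light is behind the surface: only the ambient term remains.
        return ka
    # Reflect the light direction about the normal: R = 2(N.L)N - L.
    r = tuple(2.0 * n_dot_l * nc - lc for nc, lc in zip(n, l))
    specular = max(dot(r, v), 0.0) ** shininess
    return ka + kd * n_dot_l + ks * specular


# Light and viewer both along the normal: all three terms contribute fully.
print(phong_intensity((0, 0, 1), (0, 0, 1), (0, 0, 1)))
```

A renderer would call a function like this once per pixel with the normal interpolated across the triangle, which is what makes the approach expensive and a natural fit for many parallel tiles.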

Shadow Volumes

● 4 textured triangles
● 1 point light
● Rendered in 3 passes

[Figure: output, rendered using the Raw simulator]

[Figure: cycles spent in Pass 1, Pass 2, and Pass 3 for the fixed pipeline and for the reconfigurable pipeline]

● 40% faster Shadow Volumes (64 tiles)
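One way to see why reconfiguring a pipeline between passes can help is a simple load-balance model: if a pass is limited by its slowest stage, then re-dividing tiles among stages to match each pass's workload shortens the pass. The sketch below is only that toy model. The two-stage split, the workload numbers, and the function names are all assumptions for illustration; they are not taken from the slides, and the model ignores any cost of reconfiguring.

```python
def pass_time(work, tiles):
    """Time for one pass, set by the slowest stage: max(work / tiles)."""
    return max(w / t for w, t in zip(work, tiles))


def best_split(work, total_tiles):
    """Brute-force the two-stage tile split that minimizes pass time."""
    splits = ((t, total_tiles - t) for t in range(1, total_tiles))
    return min(splits, key=lambda s: pass_time(work, s))


# Hypothetical (vertex_work, pixel_work) per pass, in arbitrary units.
passes = [(10.0, 90.0), (60.0, 40.0), (20.0, 80.0)]
TOTAL_TILES = 64
FIXED_SPLIT = (32, 32)  # assumed fixed allocation, same for every pass

fixed_total = sum(pass_time(w, FIXED_SPLIT) for w in passes)
reconf_total = sum(pass_time(w, best_split(w, TOTAL_TILES)) for w in passes)

print(f"fixed: {fixed_total:.2f}  reconfigurable: {reconf_total:.2f}")
```

Because the fixed split is one of the candidates `best_split` considers, the reconfigurable total in this model can never be worse than the fixed one.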

Case Study: Beamformer

[Figure: bar chart of beamformer throughput in MFLOPS. Platforms: 1 GHz Pentium III, and 420 MHz Raw with a single tile, 16 tiles, and 64 tiles. Versions: C program, unoptimized StreamIt, and optimized StreamIt.]

The Raw Experience

● Insights into the design of the Raw architecture
● Raw parallelizing compiler
● StreamIt language and compiler

Area and Frequency Scalability Problems

[Figure: N ALUs connected through a bypass network (Bypass Net) to a register file (RF), with scaling annotations ~N^2 and ~N^3]
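Reading the slide's scaling annotations as ~N^2 and ~N^3, the sketch below shows how quickly such costs grow relative to a small baseline design. The baseline of 4 ALUs and the function name `relative_cost` are assumptions made for illustration; the code only evaluates the stated growth rates and says nothing about absolute area.

```python
def relative_cost(n_alus: int, exponent: int, baseline: int = 4) -> float:
    """Cost of an N-ALU design relative to a `baseline`-ALU design,
    for a structure whose cost grows as N**exponent."""
    return (n_alus / baseline) ** exponent


print(" N   ~N^2   ~N^3")
for n in (4, 8, 16, 32):
    print(f"{n:2d} {relative_cost(n, 2):6.0f} {relative_cost(n, 3):6.0f}")
```

Each doubling of N multiplies a ~N^2 structure by 4 and a ~N^3 structure by 8, while the number of ALUs doing useful work only doubles.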

Operand Routing is Global

[Figure: array of ALUs with a bypass network (Bypass Net) and a register file (RF)]