


































































Introduction to Cell Processor: Cell Basic Design Concept, Cell Hardware Overview, Cell Processor, Cell Processor Components, Cell Performance Characteristics, Cell Application Affinity



































































Class Agenda
● Motivation for multicore chip design
● Cell basic design concept
● Cell hardware overview
  – Cell highlights
  – Cell processor
  – Cell processor components
● Cell performance characteristics
● Cell application affinity
● Cell software overview
  – Cell software environment
  – Development tools
  – Cell system simulator
  – Optimized libraries
● Cell software development considerations
● Cell blade
Technology Scaling – We’ve hit the wall
[Figure: relative device performance versus year, 1988 to 2012, for conventional bulk CMOS, SOI (silicon-on-insulator), high-mobility, and double-gate devices; the trend beyond the plotted data is marked "?"]
Power Density – The fundamental problem
[Figure: power density in W/cm² on a log scale from 1 to 1000, plotted against a horizontal axis in μm; labeled processors include Pentium®, Pentium Pro®, Pentium II®, and Pentium III®, with "Hot Plate" and "Nuclear Reactor" as reference levels]
Has This Ever Happened Before?
[Figure: chart with a reference level labeled "Steam Iron, 5 W/cm²"; annotations "?" and "opportunity"]
Cell
Systems and Technology Group
Cell History
● IBM, SCEI/Sony, Toshiba Alliance formed in 2000
● Design Center opened in March 2001
  – Based in Austin, Texas
● Single Cell BE operational Spring 2004
● 2-way SMP operational Summer 2004
● February 7, 2005: First technical disclosures
● October 6, 2005: Mercury announces Cell Blade
● November 9, 2005: Open Source SDK & Simulator published
● November 14, 2005: Mercury announces Turismo Cell offering
● February 8, 2006: IBM announces Cell Blade
Cell Basic Concept
● Compatibility with 64-bit Power Architecture™
  – Builds on and leverages IBM investment and community
● Increased efficiency and performance
  – Attacks on the “Power Wall”
    – Non-homogeneous coherent multiprocessor
    – High design frequency at a low operating voltage with advanced power management
  – Attacks on the “Memory Wall”
    – Streaming DMA architecture
    – 3-level memory model: main storage, local storage, register files
  – Attacks on the “Frequency Wall”
    – Highly optimized implementation
    – Large shared register files and software-controlled branching to allow deeper pipelines
● Interface between user and networked world
  – Image-rich information, virtual reality
  – Flexibility and security
● Multi-OS support, including RTOS / non-RTOS
  – Combines real-time and non-real-time worlds
Cell Design Goals
● Cell is an accelerator extension to Power
  – Built on a Power ecosystem
  – Uses best-known system practices for processor design
● Sets a new performance standard
  – Exploits parallelism while achieving high frequency
  – Supercomputer attributes with extreme floating-point capabilities
  – Sustains high memory bandwidth with smart DMA controllers
● Designed for natural human interaction
  – Photo-realistic effects
  – Predictable real-time response
  – Virtualized resources for concurrent activities
● Designed for flexibility
  – Wide variety of application domains
  – Highly abstracted to highly exploitable programming models
  – Reconfigurable I/O interfaces
  – Virtual trusted computing environment for security
Cell Chip
Cell Processor Components (1)
● Power Processor Element (PPE):
  – General-purpose, 64-bit RISC processor (PowerPC AS 2.0.2)
  – 2-way hardware multithreaded
  – L1: 32 KB instruction, 32 KB data
  – L2: 512 KB
  – Coherent load / store
  – VMX
  – Real-time controls
● Element Interconnect Bus (EIB):
  – Four 16-byte data rings supporting multiple simultaneous transfers per ring
  – 96 bytes/cycle peak bandwidth
  – Over 100 outstanding requests
[Figure: chip diagram showing the Power Core and L2 Cache attached to the 96 Byte/Cycle Element Interconnect Bus]
In the beginning: the solitary Power Processor
Custom designed for high frequency, space, and power efficiency
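To put the 96 bytes/cycle EIB figure in more familiar units, the sketch below converts it to bytes per second for a given bus clock. The slides do not state the bus clock, so the `bus_clock_hz` parameter and the 1.6 GHz value in the usage lines are assumptions chosen only for illustration.

```python
EIB_PEAK_BYTES_PER_CYCLE = 96  # from the slide: 96 bytes/cycle peak bandwidth


def eib_peak_bandwidth(bus_clock_hz: float) -> float:
    """Return peak EIB bandwidth in bytes per second for an assumed bus clock."""
    if bus_clock_hz <= 0:
        raise ValueError("bus_clock_hz must be positive")
    return EIB_PEAK_BYTES_PER_CYCLE * bus_clock_hz


if __name__ == "__main__":
    assumed_clock_hz = 1.6e9  # hypothetical bus clock, not taken from the slides
    gb_per_s = eib_peak_bandwidth(assumed_clock_hz) / 1e9
    print(f"Peak EIB bandwidth at the assumed clock: {gb_per_s:.1f} GB/s")
```

This is a peak figure only; sustained bandwidth depends on how transfers are distributed across the four data rings.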
Cell Processor Components (2)
● Synergistic Processor Element (SPE):
  – Provides the computational performance
  – Simple RISC user-mode architecture
    – Dedicated resources: unified 128 x 128-bit register file (RF), 256 KB Local Store
    – Dedicated DMA engine: up to 16 outstanding requests
● Memory Management & Mapping
  – SPE Local Store aliased into PPE system memory
  – MFC/MMU controls / protects SPE DMA accesses
    – DMA transfers of 1, 2, 4, 8, 16, and 128 bytes up to 16 KB for I/O access
    – Two queues for DMA commands: Proxy & SPU
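The DMA limits on this slide (transfers up to 16 KB, up to 16 outstanding requests) imply that a large buffer has to be moved in pieces. The following is a minimal planning sketch in Python, not Cell SDK code: `plan_dma` and `batches` are hypothetical helper names, and the sketch ignores alignment and the list of legal small transfer sizes, so the final chunk may need padding on real hardware.

```python
MAX_DMA_BYTES = 16 * 1024  # from the slide: transfers up to 16 KB
MAX_OUTSTANDING = 16       # from the slide: up to 16 outstanding requests


def plan_dma(total_bytes: int, chunk_bytes: int = MAX_DMA_BYTES):
    """Split a transfer into (offset, size) chunks no larger than chunk_bytes."""
    if total_bytes < 0:
        raise ValueError("total_bytes must be non-negative")
    if not 0 < chunk_bytes <= MAX_DMA_BYTES:
        raise ValueError("chunk_bytes must be between 1 and MAX_DMA_BYTES")
    chunks = []
    offset = 0
    while offset < total_bytes:
        size = min(chunk_bytes, total_bytes - offset)
        chunks.append((offset, size))
        offset += size
    return chunks


def batches(chunks, limit: int = MAX_OUTSTANDING):
    """Group chunks so that at most `limit` requests are in flight at once."""
    return [chunks[i:i + limit] for i in range(0, len(chunks), limit)]


if __name__ == "__main__":
    plan = plan_dma(40_000)
    for group in batches(plan):
        print(len(group), "requests:", group)
```

Note that 16 requests of 16 KB each add up to 256 KB, the same size as the Local Store, so a single full batch of maximum-size transfers already covers the whole Local Store.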
[Figure: chip diagram with eight SPEs, each with its own Local Store, attached to the 96 Byte/Cycle Element Interconnect Bus alongside the Power Core and L2 Cache]
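For a sense of scale, the SPE storage figures on the components slide (a 128 x 128-bit register file and a 256 KB Local Store) can be converted to bytes and quadwords. The short sketch below only does that arithmetic; the function names are illustrative and not part of any Cell toolchain.

```python
REGISTERS = 128                 # from the slide: 128 registers
REGISTER_BITS = 128             # from the slide: each register is 128 bits wide
LOCAL_STORE_BYTES = 256 * 1024  # from the slide: 256 KB Local Store


def register_file_bytes() -> int:
    """Total register file capacity in bytes."""
    return REGISTERS * REGISTER_BITS // 8


def local_store_quadwords() -> int:
    """Number of 128-bit (16-byte) quadwords that fit in the Local Store."""
    return LOCAL_STORE_BYTES // (REGISTER_BITS // 8)


if __name__ == "__main__":
    print("Register file:", register_file_bytes(), "bytes")
    print("Local Store:", local_store_quadwords(), "quadwords")
```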