





























This lecture is part of a complete lecture series on Advanced Theory of Computation. Key points in this lecture are: Multi-Core Computer, TLP, Technology Providers, Chip Multi-Processor, Cores Run in Parallel, Instruction-Level Parallelism, Shared Memory, Distributed Memory, Database Servers, Intel Xeon Processors, Advantages
Typology: Slides






























Multi-core architectures
Replicate multiple processor cores on a single die.
[Figure: a multi-core CPU chip containing Core 1, Core 2, Core 3, and Core 4]
Within each core, threads are time-sliced (just like on a uniprocessor)
[Figure: cores 1 to 4, each running several threads]
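The time slicing described above can be sketched as a small simulation. This is only an illustrative model, not how any real operating system scheduler is implemented: the function names, the round-robin policy, and the notion of a fixed `quantum` of work units are all assumptions made for the example.

```python
from collections import deque


def run_time_sliced(threads, quantum):
    """Simulate round-robin time slicing on a single core.

    threads: dict mapping a thread name to its remaining work units.
    quantum: work units a thread may run before it is preempted.
    Returns the list of (thread, units_run) slices in execution order.
    """
    ready = deque(threads.items())
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        ran = min(quantum, remaining)
        timeline.append((name, ran))
        if remaining > ran:
            # Not finished yet: go to the back of the ready queue.
            ready.append((name, remaining - ran))
    return timeline


def run_multicore(per_core_threads, quantum):
    """Each core time-slices its own threads, independently of the others."""
    return {core: run_time_sliced(threads, quantum)
            for core, threads in per_core_threads.items()}


# Hypothetical workload: two threads on core1, one thread on core2.
schedule = run_multicore(
    {"core1": {"A": 3, "B": 2}, "core2": {"C": 4}},
    quantum=2,
)
print(schedule)
```

In this model, threads A and B take turns on core1 while thread C has core2 to itself, which mirrors the picture of several threads per core with the cores running side by side.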
Parallelism at the machine-instruction level
The processor can re-order, pipeline instructions, split them into microinstructions, do aggressive branch prediction, etc.
Instruction-level parallelism enabled rapid increases in processor speeds over the last 15 years
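To make the idea of instruction-level parallelism concrete, the sketch below assigns instructions to cycles based only on their data dependencies and on how many instructions the processor may issue per cycle. It is a toy model under stated assumptions: every instruction takes exactly one cycle, and the names `schedule_instructions` and `issue_width` are invented for this example rather than taken from any real processor.

```python
def schedule_instructions(instrs, issue_width):
    """Place each instruction in the earliest cycle its inputs allow.

    instrs: list of (name, [names it depends on]) in program order.
    issue_width: maximum number of instructions issued per cycle.
    Assumes every instruction completes in exactly one cycle.
    """
    cycle_of = {}
    issued_in = {}  # cycle -> number of instructions already issued
    for name, deps in instrs:
        # Earliest legal cycle: one after the latest producer.
        cycle = max((cycle_of[d] + 1 for d in deps), default=0)
        # Skip cycles that are already full.
        while issued_in.get(cycle, 0) >= issue_width:
            cycle += 1
        cycle_of[name] = cycle
        issued_in[cycle] = issued_in.get(cycle, 0) + 1
    return cycle_of


# Hypothetical program: i3 needs the results of i1 and i2; i4 is independent.
program = [
    ("i1", []),
    ("i2", []),
    ("i3", ["i1", "i2"]),
    ("i4", []),
]

narrow = schedule_instructions(program, issue_width=1)
wide = schedule_instructions(program, issue_width=2)
print(max(narrow.values()) + 1, "cycles with issue width 1")
print(max(wide.values()) + 1, "cycles with issue width 2")
```

With one instruction per cycle the four instructions need four cycles; with two per cycle, the independent instructions overlap and the same program fits in two cycles.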
General context: Multiprocessors
A multiprocessor is any computer with several processors
SIMD (single instruction, multiple data)
MIMD (multiple instruction, multiple data)
[Figure: Lemieux cluster, Pittsburgh Supercomputing Center]
Shared memory: In this model, there is one (large) common shared memory for all processors
Distributed memory: In this model, each processor has its own (small) local memory, and its content is not replicated anywhere else
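The two memory models lead to two programming styles, which the following sketch contrasts. It is an analogy only: Python threads in one process always share memory, so the "distributed" version merely imitates private local data plus explicit messages. The function names and the use of a `queue.Queue` as the message channel are assumptions for illustration.

```python
import queue
import threading


def shared_memory_sum(chunks):
    """All workers update one common total, guarded by a lock."""
    total = {"value": 0}
    lock = threading.Lock()

    def worker(chunk):
        partial = sum(chunk)
        with lock:  # shared data must be synchronized
            total["value"] += partial

    workers = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return total["value"]


def distributed_memory_sum(chunks):
    """Each worker keeps private data and sends its result as a message."""
    mailbox = queue.Queue()

    def worker(chunk):
        local = list(chunk)      # private copy, never touched by others
        mailbox.put(sum(local))  # explicit communication

    workers = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # Combine one message per worker.
    return sum(mailbox.get() for _ in workers)


data = [[1, 2, 3], [4, 5], [6]]
print(shared_memory_sum(data), distributed_memory_sum(data))
```

Both versions compute the same sum. The difference is where coordination happens: through a lock on common data in the first, and through messages between otherwise isolated workers in the second.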
What applications benefit from multi-core?
Each can run on its own core
More examples
Editing a photo while recording a TV show through a digital video recorder
Downloading software while running an anti-virus program
“Anything that can be threaded today will map efficiently to multi-core”
BUT: some applications are difficult to parallelize
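One common way to quantify this caveat is Amdahl's law, which the slides do not mention by name and which is brought in here only as an illustration. It bounds the speedup of a program when just a fraction of its work can run in parallel. The function name and parameters below are chosen for this example.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only part of a program is parallel.

    parallel_fraction: share of the runtime that can be spread over cores (0..1).
    cores: number of cores available.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical programs running on a four-core chip.
for fraction in (1.0, 0.9, 0.5):
    print(fraction, round(amdahl_speedup(fraction, 4), 2))
```

For example, if half of a program is inherently serial, four cores give at most a 1.6x speedup, however well the other half is threaded.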
Without SMT, only a single thread can run at any given time
[Figure: processor pipeline with BTB and I-TLB, Decoder, Trace Cache, Rename/Alloc, Uop queues, Schedulers, Integer, Floating Point, L1 D-Cache and D-TLB, uCode ROM, L2 Cache and Control, Bus]
Thread 1: floating point
Without SMT, only a single thread can run at any given time
[Figure: processor pipeline with BTB and I-TLB, Decoder, Trace Cache, Rename/Alloc, Uop queues, Schedulers, Integer, Floating Point, L1 D-Cache and D-TLB, uCode ROM, L2 Cache and Control, Bus]
Thread 2: integer operation