Multi-core Architectures, Study notes of Advanced Computer Architecture

Multi-core Architectures, Software and Hardware multi-threading, SMT and CMP architectures, Design Issues

Typology: Study notes

2018/2019

Uploaded on 05/01/2019

suhail-ansari


Microprocessor

Methods To Increase Performance:

  • The number of transistors available has a huge effect on the performance of a processor.
  • More transistors also allow for a technology called pipelining.
  • Parallelism

ELEC6200-001, Fall 08

Parallelism in Microprocessors

  • Pipelining is most prevalent
    ▫ Used in everything, even microcontrollers
    ▫ Decreases cycle time
    ▫ Allows up to 1 instruction per cycle (IPC)
    ▫ No programming changes
    ▫ Some Pentium 4s have more than 30 stages!
  • Parallelism classifications:
    ▫ Instruction level
    ▫ Loop level
    ▫ Thread level (future trend)
    ▫ Process level (future trend)
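The claim that pipelining allows up to 1 IPC can be checked with a small cycle count. The sketch below is an idealized model, not something taken from the slides: it assumes every stage takes one cycle and that there are no stalls, hazards, or branch mispredictions. The function names are illustrative.

```python
def unpipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Ideal pipeline: the first instruction needs n_stages cycles to drain,
    then one further instruction completes every cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def ipc(n_instructions: int, cycles: int) -> float:
    """Average instructions completed per cycle."""
    return n_instructions / cycles if cycles else 0.0


if __name__ == "__main__":
    n, k = 1000, 5
    slow = unpipelined_cycles(n, k)   # 5000 cycles
    fast = pipelined_cycles(n, k)     # 1004 cycles
    print(f"unpipelined: {slow} cycles, IPC = {ipc(n, slow):.3f}")
    print(f"pipelined:   {fast} cycles, IPC = {ipc(n, fast):.3f}")
```

Under these assumptions the IPC approaches, but never exceeds, 1 as the instruction count grows, which is why going beyond 1 IPC needs superscalar issue.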

Superscalar pipeline


Competing technologies

  • Simultaneous Multithreading (SMT)
    • The SMT architecture is similar to that of a superscalar processor.
    • SMT processors extend a wide superscalar core with hardware to execute instructions from multiple threads concurrently.
  • Out-of-Order Execution
    • Instructions execute in any order that does not violate data dependencies.
    • Note that this technique is independent of both pipelining and superscalar issue.
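To make "any order that does not violate data dependencies" concrete, here is a minimal scheduling sketch. It is a toy model under stated assumptions, not a description of any real processor: functional units are unlimited, register renaming has removed false dependencies so only read-after-write dependencies matter, and the in-order variant differs only in that an instruction may not start before its predecessor. The `Instr` type, the register names, and the latencies are all hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class Instr(NamedTuple):
    name: str
    dest: str                # register written
    srcs: Tuple[str, ...]    # registers read
    latency: int = 1         # cycles until the result is available


def schedule(program: List[Instr], in_order: bool) -> List[int]:
    """Return the start cycle of each instruction."""
    last_writer: Dict[str, int] = {}  # register -> index of latest producer
    start: List[int] = []
    for i, ins in enumerate(program):
        ready = 0
        # Wait for every source operand produced by an earlier instruction.
        for reg in ins.srcs:
            if reg in last_writer:
                p = last_writer[reg]
                ready = max(ready, start[p] + program[p].latency)
        # In-order issue: never start before the previous instruction.
        if in_order and i > 0:
            ready = max(ready, start[i - 1])
        start.append(ready)
        last_writer[ins.dest] = i
    return start


PROGRAM = [
    Instr("load", "r1", (), latency=3),
    Instr("add", "r2", ("r1",)),            # depends on the load
    Instr("mul", "r3", ("r4", "r5"), 2),    # independent of the load
    Instr("sub", "r6", ("r3", "r4")),       # depends on the mul
]

if __name__ == "__main__":
    print("out-of-order:", schedule(PROGRAM, in_order=False))  # [0, 3, 0, 2]
    print("in-order:    ", schedule(PROGRAM, in_order=True))   # [0, 3, 3, 5]
```

In this example the independent `mul` and its consumer `sub` run while the `load` is still outstanding, so the out-of-order schedule finishes earlier than the in-order one.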

Why Multiprocessor Systems?

  • Single-core microprocessor performance increases are beginning to slow [1] due to:
    • Increasing power consumption (>100 W)
    • Increasing heat dissipation
    • Diminishing performance gains from ILP & TLP
  • As a result, manufacturers are turning to a multi-core microprocessor approach
    • Multiple smaller, energy-efficient processing cores are integrated onto a single chip
    • Improves overall performance by performing more work concurrently
    • The latencies associated with chip-to-chip communication disappear, and shared data structures are much less of a problem.

Centralized architecture

  • Disadvantages of centralized architectures such as SMT and superscalar processors are:
    • Area increases quadratically with the core’s complexity.
    • Increase in cycle time due to interconnect delays: wire delays dominate the critical-path delay of the CPU. It is possible to build simpler clusters, but this results in a deeper pipeline and a higher branch misprediction penalty.
    • Design verification cost is high, due to the complexity of a single large processor.
    • Large demand on the memory system.
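The quadratic area claim can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that "complexity" is measured by issue width and that area equals a constant `k` times the square of the issue width. Neither `k` nor the widths come from the slides.

```python
def core_area(issue_width: int, k: float = 1.0) -> float:
    """Area of one core, assuming area grows with the square of issue width."""
    return k * issue_width ** 2


def chip_area(n_cores: int, issue_width: int, k: float = 1.0) -> float:
    """Total area of n identical cores (interconnect and caches ignored)."""
    return n_cores * core_area(issue_width, k)


if __name__ == "__main__":
    # Both configurations offer 8 issue slots per cycle in total.
    print("one 8-issue core:  ", chip_area(1, 8))   # 64.0 area units
    print("four 2-issue cores:", chip_area(4, 2))   # 16.0 area units
```

Under this assumed model, four narrow cores provide the same total issue slots as one wide core in a quarter of the area, provided the workload supplies enough threads to keep them busy.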

Case for single chip multiprocessors

  • Advances in the field of integrated circuit processing.
    • Gate density (more transistors per chip)
    • Cost of wires
  • Large uniprocessors are no longer scaling in performance, because only a limited amount of parallelism can be extracted from a typical instruction stream using conventional superscalar instruction issue techniques.

CMP Architectures

  • Two general types of multi-core or chip multiprocessor (CMP) architectures
    • Homogeneous CMPs – all processing elements (PEs) are the same
    • Heterogeneous CMPs – composed of different PEs
  • Homogeneous dual-core processors for PCs are now available from all major manufacturers
  • Heterogeneous CMPs are available in the form of multiprocessor systems-on-chips (MPSoCs)

Multicore processor

  • A multi-core processor is a single computing component with two or more independent processing units called cores, which read and execute program instructions.
  • The instructions are ordinary CPU instructions (such as add, move data, and branch) but the single processor can run multiple instructions on separate cores at the same time, increasing overall speed for programs amenable to parallel computing.
  • Manufacturers typically integrate the cores onto a single integrated circuit die (known as a chip multiprocessor or CMP) or onto multiple dies in a single chip package.
  • The microprocessors currently used in almost all personal computers are multi-core.

http://csp2.epgpbooks.inflibnet.ac.in/chapter/thread-level-parallelism-smt-and-cmp/
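How much a program "amenable to parallel computing" gains from extra cores is commonly estimated with Amdahl's law, which the slides do not state explicitly. The sketch below is a minimal version that assumes the parallel part scales perfectly and ignores synchronization and memory contention. The function name and sample fractions are illustrative.

```python
def amdahl_speedup(parallel_fraction: float, n_cores: int) -> float:
    """Speedup over one core when a fraction of the work runs in parallel.

    speedup = 1 / ((1 - p) + p / n)
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


if __name__ == "__main__":
    for p in (0.5, 0.9, 0.99):
        row = ", ".join(f"{n} cores: {amdahl_speedup(p, n):.2f}x" for n in (2, 4, 8))
        print(f"parallel fraction {p}: {row}")
```

Even with many cores, the serial fraction bounds the speedup at 1 / (1 - p), so a program that is only half parallel can never run more than twice as fast.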

[Figure: Single-core computer]

[Figure: Single-core CPU chip, showing the single core]