Computer Architecture Wrap-Up, Lecture Slide - Computer Science, Slides of Computer Architecture and Organization

Pipe Design, Modern High-Performance Processors, Exceptions, Maintain Exception, Ordering Exception, Handling Logic, Side Effect in Pipeline, processors, Control logic, Performance metrics, CPI for PIPE, Standard Fetch Timing, Modified Fetch Timing, Execution Unit, ICore Operation

2010/2011

Uploaded on 10/08/2011

Randal E. Bryant
Carnegie Mellon University
CS:APP2e
CS:APP Chapter 4
Computer Architecture
Wrap-Up
http://csapp.cs.cmu.edu

Overview

Wrap-Up of PIPE Design
• Exceptional conditions
• Performance analysis
• Fetch stage design

Modern High-Performance Processors
• Out-of-order execution

Exception Examples

Detect in Fetch Stage

    jmp $-1                     # Invalid jump target
    .byte 0xFF                  # Invalid instruction code
    halt                        # Halt instruction

Detect in Memory Stage

    irmovl $100,%eax
    rmmovl %eax,0x10000(%eax)   # Invalid address

Exceptions in Pipeline Processor

    # demo-exc1.ys
    irmovl $100,%eax
    rmmovl %eax,0x10000(%eax)   # Invalid address
    nop
    .byte 0xFF                  # Invalid instruction code

Pipeline timing (clock cycles 1 to 5):

    Instruction                          1  2  3  4  5
    0x000: irmovl $100,%eax              F  D  E  M  W
    0x006: rmmovl %eax,0x10000(%eax)        F  D  E  M    Exception detected (memory stage, cycle 5)
    0x00c: nop                                 F  D  E
    0x00d: .byte 0xFF                             F  D    Exception detected (fetch stage, cycle 4)

Desired Behavior
• rmmovl should cause exception
• Following instructions should have no effect on processor state
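The stage occupied by each instruction in a given cycle follows directly from its position in the program. The sketch below is only an illustration: it assumes one instruction is fetched per cycle with no stalls or bubbles, and the names `stage_at` and `diagram` are hypothetical helpers, not part of the slides.

```python
STAGES = "FDEMW"  # fetch, decode, execute, memory, write back


def stage_at(instr_index, cycle):
    """Stage of the instruction_index-th instruction (0-based) in a 1-based cycle.

    Assumes an ideal pipeline: no stalls, no bubbles.
    Returns None when the instruction is not in the pipeline in that cycle.
    """
    pos = cycle - 1 - instr_index
    return STAGES[pos] if 0 <= pos < len(STAGES) else None


def diagram(instrs, cycles):
    """Render one text row per instruction, one column per cycle."""
    rows = []
    for i, text in enumerate(instrs):
        cells = [stage_at(i, c) or "." for c in range(1, cycles + 1)]
        rows.append(f"{text:<34}" + " ".join(cells))
    return "\n".join(rows)


if __name__ == "__main__":
    program = [
        "0x000: irmovl $100,%eax",
        "0x006: rmmovl %eax,0x10000(%eax)",
        "0x00c: nop",
        "0x00d: .byte 0xFF",
    ]
    print(diagram(program, 5))
```

In cycle 4 the invalid byte is in fetch while the rmmovl is only in execute, so the later instruction's exception is detected first. That is why the ordering of exceptions has to be maintained explicitly.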

Maintaining Exception Ordering

• Add status field to pipeline registers
• Fetch stage sets to either "AOK," "ADR" (when bad fetch address), "HLT" (halt instruction) or "INS" (illegal instruction)
• Decode & execute pass values through
• Memory either passes through or sets to "ADR"
• Exception triggered only when instruction hits write back

Pipeline register fields:

    F: predPC
    D: stat, icode, ifun, rA, rB, valC, valP
    E: stat, icode, ifun, valC, valA, valB, dstE, dstM, srcA, srcB
    M: stat, icode, Cnd, valE, valA, dstE, dstM
    W: stat, icode, valE, valM, dstE, dstM

Exception Handling Logic

Fetch Stage

    # Determine status code for fetched instruction
    int f_stat = [
        imem_error : SADR;
        !instr_valid : SINS;
        f_icode == IHALT : SHLT;
        1 : SAOK;
    ];

Memory Stage (uses the dmem_error signal)

    # Update the status
    int m_stat = [
        dmem_error : SADR;
        1 : M_stat;
    ];

Writeback Stage

    int Stat = [
        # SBUB in earlier stages indicates bubble
        W_stat == SBUB : SAOK;
        1 : W_stat;
    ];
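The three HCL case expressions can be mirrored in ordinary code to see how a status value travels from fetch to write back. This is a minimal sketch: the status names come from the slides, but representing them as strings and passing the signals as function arguments are assumptions made for illustration.

```python
# Status codes named in the slides; the string values are an assumption.
SAOK, SADR, SINS, SHLT, SBUB = "AOK", "ADR", "INS", "HLT", "BUB"


def f_stat(imem_error, instr_valid, is_halt):
    """Fetch stage: first matching case wins, as in an HCL case expression."""
    if imem_error:
        return SADR
    if not instr_valid:
        return SINS
    if is_halt:
        return SHLT
    return SAOK


def m_stat(dmem_error, M_stat):
    """Memory stage: override with ADR on a data memory error, else pass through."""
    return SADR if dmem_error else M_stat


def Stat(W_stat):
    """Write-back stage: a bubble reports as normal operation."""
    return SAOK if W_stat == SBUB else W_stat


if __name__ == "__main__":
    # rmmovl with an invalid data address: fine in fetch, flagged in memory.
    s = f_stat(imem_error=False, instr_valid=True, is_halt=False)
    s = m_stat(dmem_error=True, M_stat=s)
    print(Stat(s))
```

Because the status is only acted on in write back, an exception detected early in a later instruction cannot overtake one detected late in an earlier instruction.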

Avoiding Side Effects

Presence of Exception Should Disable State Update
• Invalid instructions are converted to pipeline bubbles
    • Except they keep a stat value indicating the exception status
• Data memory will not write to invalid address
• Prevent invalid update of condition codes
    • Detect exception in memory stage
    • Disable condition code setting in execute
    • Must happen in same clock cycle
• Handling exception in final stages
    • When an exception is detected in the memory stage
        » Start injecting bubbles into memory stage on next cycle
    • When an exception is detected in the write-back stage
        » Stall excepting instruction
• Included in HCL code

Control Logic for State Changes

Setting Condition Codes

    # Should the condition codes be updated?
    bool set_cc = E_icode == IOPL &&
        # State changes only during normal operation
        !m_stat in { SADR, SINS, SHLT } && !W_stat in { SADR, SINS, SHLT };

Stage Control
• Also controls updating of memory

    # Start injecting bubbles as soon as exception passes through memory stage
    bool M_bubble = m_stat in { SADR, SINS, SHLT } || W_stat in { SADR, SINS, SHLT };

    # Stall pipeline register W when exception encountered
    bool W_stall = W_stat in { SADR, SINS, SHLT };
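The same three control signals can be written as plain boolean functions. This is a sketch under assumptions: the status and icode values are modeled as strings, and the set name `EXCEPTION_STATS` is introduced here only for illustration.

```python
# Exception status codes from the slides, modeled as strings (an assumption).
EXCEPTION_STATS = {"ADR", "INS", "HLT"}


def set_cc(E_icode, m_stat, W_stat):
    """Update condition codes only for IOPL and only during normal operation."""
    return (
        E_icode == "IOPL"
        and m_stat not in EXCEPTION_STATS
        and W_stat not in EXCEPTION_STATS
    )


def M_bubble(m_stat, W_stat):
    """Inject bubbles into pipeline register M once an exception is in M or W."""
    return m_stat in EXCEPTION_STATS or W_stat in EXCEPTION_STATS


def W_stall(W_stat):
    """Hold the excepting instruction in pipeline register W."""
    return W_stat in EXCEPTION_STATS


if __name__ == "__main__":
    # An addl in execute directly behind an rmmovl that faults in memory.
    print(set_cc("IOPL", "ADR", "AOK"))   # condition codes must not change
    print(M_bubble("ADR", "AOK"))
    print(W_stall("AOK"))
```

Note that `set_cc` reads `m_stat`, the value computed in the current cycle, which matches the requirement that the exception be detected and the condition code update disabled in the same clock cycle.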

Performance Metrics

Clock rate
• Measured in gigahertz
• Function of stage partitioning and circuit design
    • Keep amount of work per stage small

Rate at which instructions are executed
• CPI: cycles per instruction
• On average, how many clock cycles does each instruction require?
• Function of pipeline design and benchmark programs
    • E.g., how frequently are branches mispredicted?

CPI for PIPE

CPI ≈ 1.0
• Fetch instruction each clock cycle
• Effectively process new instruction almost every cycle
    • Although each individual instruction has latency of 5 cycles

CPI > 1.0
• Sometimes must stall or cancel branches

Computing CPI
• C clock cycles
• I instructions executed to completion
• B bubbles injected (C = I + B)

CPI = C/I = (I+B)/I = 1.0 + B/I

• Factor B/I represents average penalty due to bubbles
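As a quick check of the formula, the sketch below computes CPI from instruction and bubble counts. The function name `cpi` and the counts in the example are made up for illustration; they are not measurements from the slides.

```python
def cpi(instructions, bubbles):
    """CPI = C / I with C = I + B, which equals 1.0 + B / I."""
    if instructions <= 0:
        raise ValueError("need at least one completed instruction")
    cycles = instructions + bubbles
    return cycles / instructions


if __name__ == "__main__":
    # Hypothetical run: 100 instructions completed, 25 bubbles injected.
    print(cpi(100, 25))
```

With no bubbles the result is exactly 1.0, and every injected bubble adds 1/I to the average.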

Fetch Logic Revisited

During Fetch Cycle
1. Select PC
2. Read bytes from instruction memory
3. Examine icode to determine instruction length
4. Increment PC

Timing
• Steps 2 & 4 require significant amount of time

Standard Fetch Timing

• Must perform everything in sequence
• Can't compute incremented PC until we know how much to increment it by

(Timing diagram: Select PC, then Mem. Read, then need_regids, need_valC, then Increment, all in sequence within 1 clock cycle.)
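The dependency can be made concrete: the increment amount is only known once need_regids and need_valC have been derived from the fetched bytes. The sketch below assumes the Y86 encoding used by the demo program (1 byte for icode/ifun, 1 optional register byte, 4 optional constant bytes); the function name `sequential_next_pc` is hypothetical.

```python
def sequential_next_pc(pc, need_regids, need_valC):
    """Incremented PC for the standard (fully sequential) fetch timing.

    The length depends on signals that come out of the memory read,
    so this addition cannot start until the read has finished.
    """
    length = 1 + (1 if need_regids else 0) + (4 if need_valC else 0)
    return pc + length


if __name__ == "__main__":
    # irmovl at 0x000 has a register byte and a constant: next PC is 0x006.
    print(hex(sequential_next_pc(0x000, True, True)))
    # nop at 0x00c has neither: next PC is 0x00d.
    print(hex(sequential_next_pc(0x00C, False, False)))
```

These results agree with the instruction addresses shown in the demo-exc1.ys listing.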

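Modified Fetch Timing

29-Bit Incrementer
• Acts as soon as PC selected
• Output not needed until final MUX
• Works in parallel with memory read

(Timing diagram: after Select PC, the path Mem. Read, then need_regids, need_valC, then 3-bit add runs in parallel with the Incrementer; a final MUX follows. The resulting 1 clock cycle is drawn against the standard cycle for comparison.)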
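One way to read this arrangement is sketched below: the upper 29 bits of the PC are incremented speculatively while memory is being read, the lower 3 bits are added to the instruction length once it is known, and the carry out of that 3-bit add drives the final MUX. This is an interpretation of the diagram labels, assuming a 32-bit PC and instruction lengths of 1 to 6 bytes; the function name `split_increment` is hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK29 = 0x1FFFFFFF


def split_increment(pc, length):
    """Compute (pc + length) mod 2**32 with a 29-bit incrementer and a 3-bit add."""
    if not 1 <= length <= 6:
        raise ValueError("instruction length must be 1..6 bytes")
    hi = (pc >> 3) & MASK29            # upper 29 bits of the PC
    lo = pc & 0x7                      # lower 3 bits of the PC
    hi_plus_1 = (hi + 1) & MASK29      # incrementer: can start as soon as PC is selected
    total = lo + length                # 3-bit add: needs the instruction length
    carry = total >> 3                 # 0 or 1, since total is at most 7 + 6 = 13
    new_hi = hi_plus_1 if carry else hi  # final MUX selects on the carry
    return ((new_hi << 3) | (total & 0x7)) & MASK32


if __name__ == "__main__":
    print(hex(split_increment(0x006, 6)))
```

Only the short 3-bit add and the MUX remain on the path after the memory read, which is the point of moving the wide incrementer off that path.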

More Realistic Fetch Logic

Fetch Box
• Integrated into instruction cache
• Fetches entire cache block (16 or 32 bytes)
• Selects current instruction from current block
• Works ahead to fetch next block
    • As it reaches end of current block
    • At branch target

(Figure: fetch box. Labels: Instruction Cache, Current Block, Next Block, Current Instruction (Byte 0, Bytes 1-), Instr. Length, Fetch Control, Other PC Controls.)
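A small model shows why the fetch box works ahead near the end of a block: an instruction that starts close to the block boundary can spill into the next block. The sketch assumes 16-byte blocks (one of the two sizes mentioned above) and instruction lengths of at most 6 bytes; `blocks_needed` is a hypothetical helper, not something defined in the slides.

```python
BLOCK_SIZE = 16  # bytes; the slides mention 16 or 32


def blocks_needed(pc, length, block_size=BLOCK_SIZE):
    """Indices of the cache blocks that hold the bytes of one instruction."""
    first = pc // block_size
    last = (pc + length - 1) // block_size
    return list(range(first, last + 1))


if __name__ == "__main__":
    # A 1-byte instruction at 0x00c sits entirely in block 0.
    print(blocks_needed(0x00C, 1))
    # A 6-byte instruction at 0x00d straddles blocks 0 and 1.
    print(blocks_needed(0x00D, 6))
```

Since an instruction is much shorter than a block, at most two blocks are ever involved, which matches the current block and next block shown in the figure.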