Combining Datapath: R type and Memory in Complete Implementation | CMSC 311, Study notes of Computer Architecture and Organization

Material Type: Notes; Class: COMPUTER ORGNIZATN; Subject: Computer Science; University: University of Maryland; Term: Unknown 1989;

Uploaded on 02/13/2009

koofers-user-d8g

Complete implementation

  • MIPS-lite
      • arithmetic/logical: add, sub, and, or, slt
      • memory access: lw, sw
      • branch/jump: beq, j
  • Combine datapaths for
      • instruction fetch (Fig. 5.5)
      • R-type operations (Fig. 5.7)
      • Load and store (Fig. 5.9)
      • Branch (Fig. 5.10)
      • Jump (to be added)
  • Add control signals
      • Version 1: execute each instruction in 1 clock cycle
      • Version 2: execute each instruction in multiple (shorter) clock cycles


Combining datapaths: R-type and memory

Version 1: execute each instruction in 1 clock cycle
  • No datapath resource can be used more than once in a single instruction
      • If needed more than once, it must be duplicated
      • Separate instruction and data memories
  • Some resources can be shared between different instruction types
      • Need to have multiple inputs
      • Control signal to select which to use: multiplexor

Arithmetic-logical (R-type) and memory access (load/store) are similar in some ways. Differences:
  • Second operand input to ALU:
      • Register contents for R-type
      • Sign-extended immediate value (offset) for load/store
  • Value stored in destination register:
      • ALU output for R-type
      • Data memory value for load

To combine datapaths:
  • Select source of second ALU operand
  • Select source of data to write to register
  • Use 2 MUXes with control inputs
      • ALUSrc for ALU
      • MemtoReg for register write
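The two MUXes can be sketched in a few lines of Python. This is only an illustration of the selection logic, not a hardware description; the function names are hypothetical, and the 0/1 polarity follows the control signal definitions given later in these notes.

```python
def mux(select, input0, input1):
    """2-to-1 multiplexor: input0 when select is 0, input1 when select is 1."""
    return input1 if select else input0


def second_alu_operand(alu_src, read_data_2, sign_extended_imm):
    # ALUSrc = 0: register contents (R-type)
    # ALUSrc = 1: sign-extended immediate offset (load/store)
    return mux(alu_src, read_data_2, sign_extended_imm)


def register_write_data(mem_to_reg, alu_result, memory_read_data):
    # MemtoReg = 0: ALU output (R-type)
    # MemtoReg = 1: value read from data memory (load)
    return mux(mem_to_reg, alu_result, memory_read_data)
```

With these two selections in place, the same ALU and register file can serve both R-type and load/store instructions.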

Combining datapaths: instruction fetch

Can add instruction fetch: Fig. 5.

A separate adder is needed so that both can happen in the same clock cycle:
  • increment the PC
  • perform the ALU operation

Combining datapaths: branch

Add branch: Fig. 5.

  • Additional MUX uses PCSrc input to select
      • incremented PC, or
      • (PC + immediate) from second adder
  • MUX output goes to update PC
  • Need to keep separate adder to compute branch address

Control unit

New features: Fig. 5.
  • Instruction bit numbers for register numbers, opcode, function
  • MUX to select destination register
      • RegDst: selects $rd or $rt to write data
  • ALU control: uses function code and ALUOp to generate ALU operation selection
      • What is ALUOp? A 2-bit code generated by main control (stay tuned)

Note that the values of RegDst, ALUSrc, and PCSrc are reversed in this diagram. The version in the current printing of the text is correct.
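As a rough sketch of the second level of control, the ALU control can be modeled as a lookup on ALUOp and the function code. The function codes are the standard MIPS encodings for these five instructions. The ALUOp encoding (00 for load/store, 01 for beq, 10 for R-type) is an assumption here, since these notes have not defined it yet, and the function name and the use of operation names instead of control bits are illustrative choices.

```python
# funct field (instruction bits 5-0) for the MIPS-lite R-type instructions
FUNCT_TO_OPERATION = {
    0b100000: "add",
    0b100010: "sub",
    0b100100: "and",
    0b100101: "or",
    0b101010: "slt",
}


def alu_control(alu_op, funct):
    """Pick the ALU operation from the 2-bit ALUOp and the 6-bit function code."""
    if alu_op == 0b00:   # assumed encoding: load/store need an address addition
        return "add"
    if alu_op == 0b01:   # assumed encoding: beq compares by subtracting
        return "sub"
    if alu_op == 0b10:   # assumed encoding: R-type, defer to the function code
        return FUNCT_TO_OPERATION[funct]
    raise ValueError(f"unsupported ALUOp: {alu_op:02b}")
```

Note that the function code is ignored unless ALUOp says the instruction is R-type, which is what lets the main control stay small.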

Control unit: main

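[Diagram: the opcode is the input to the main control, whose outputs go to the registers, memory, MUXes, and (as ALUOp) the ALU control; the ALU control also takes the function field.]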

Main control unit generates
  • Write signals for registers and memory
  • Control input for MUXes
  • ALUOp input (2 bits)

Multiple levels of control: reduces size of main control unit

Control unit: main

Main control signals
  • ALUOp: 2 bits based on opcode, used as input by ALU control
  • RegDst: selects instruction bits 20-16 ($rt) or 15-11 ($rd) as the destination register to write data
  • RegWrite: enables writing of destination register
  • ALUSrc: selects second input of ALU
      • 0: Read data 2 from register file
      • 1: sign-extended immediate from instruction
  • PCSrc: selects input to update PC
      • 0: PC + 4 from adder
      • 1: PC + offset from branch target calculation (other adder)
  • MemRead: data is read from data memory
  • MemWrite: data is written to data memory
  • MemtoReg: selects data to send to register file to write
      • 0: ALU result
      • 1: data read from memory

Can set all of these based only on opcode, except PCSrc
  • PCSrc should be set when the instruction is beq AND the ALU output is 0
  • Control unit generates Branch signal, which is ANDed with ALU Zero output
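One way to picture the main control is as a table indexed by opcode. The sketch below uses the standard MIPS opcodes for R-type, lw, sw, and beq. The R-type and lw rows agree with the settings used in the datapath walkthroughs in these notes; the sw and beq rows, the don't-care entries (None), and the ALUOp values are filled in as assumptions for illustration.

```python
OPCODE_R_TYPE = 0b000000
OPCODE_LW = 0b100011
OPCODE_SW = 0b101011
OPCODE_BEQ = 0b000100

X = None  # don't care: the signal's value does not affect this instruction

CONTROL_TABLE = {
    OPCODE_R_TYPE: dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
                        MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),
    OPCODE_LW:     dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
                        MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),
    OPCODE_SW:     dict(RegDst=X, ALUSrc=1, MemtoReg=X, RegWrite=0,
                        MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),
    OPCODE_BEQ:    dict(RegDst=X, ALUSrc=0, MemtoReg=X, RegWrite=0,
                        MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),
}


def main_control(opcode):
    """Look up the control signals, which depend only on the opcode."""
    return CONTROL_TABLE[opcode]


def pc_src(branch, alu_zero):
    """PCSrc is the Branch signal ANDed with the ALU Zero output."""
    return branch & alu_zero
```

PCSrc is the only signal computed outside the table, because it depends on a value (Zero) that is not known until the ALU has compared the two registers.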

Datapath with control

How to set control bits? Fig. 5.
  • Could use Boolean equations or truth tables, but depends only on opcodes

Datapath: R-type

Fig. 5.

  1. Fetch instruction and increment PC
  2. Obtain operands from register file, based on source register numbers
  3. Perform ALU operation, using ALU control to select, ALUSrc = 0
  4. Select output from ALU using MemtoReg = 0
  5. Write back to destination register (RegWrite = 1, RegDst = 1 for $rd)

Note that this entire path is combinational, but the values are generated in the order shown.
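Steps 2 through 5 can be traced with a small Python sketch. It assumes the standard MIPS R-type field layout (rs in bits 25-21, rt in bits 20-16, rd in bits 15-11, function code in bits 5-0) and models the register file as a list of 32 integers. The names are hypothetical, and the sketch does not model $zero being hard-wired to 0.

```python
def to_signed_32(value):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


ALU_OPERATIONS = {
    0b100000: lambda a, b: a + b,                                  # add
    0b100010: lambda a, b: a - b,                                  # sub
    0b100100: lambda a, b: a & b,                                  # and
    0b100101: lambda a, b: a | b,                                  # or
    0b101010: lambda a, b: int(to_signed_32(a) < to_signed_32(b)), # slt
}


def execute_r_type(instruction, registers):
    """Steps 2-5 of the R-type path; the fetch (step 1) is assumed done."""
    rs = (instruction >> 21) & 0x1F   # first source register
    rt = (instruction >> 16) & 0x1F   # second source register
    rd = (instruction >> 11) & 0x1F   # destination (RegDst = 1)
    funct = instruction & 0x3F
    operand_a = registers[rs]         # Read data 1
    operand_b = registers[rt]         # Read data 2, chosen by ALUSrc = 0
    alu_result = ALU_OPERATIONS[funct](operand_a, operand_b) & 0xFFFFFFFF
    registers[rd] = alu_result        # MemtoReg = 0, RegWrite = 1
    return alu_result
```

For example, with register 8 holding 5 and register 9 holding 7, the word `(8 << 21) | (9 << 16) | (10 << 11) | 0b100000` (an add) leaves 12 in register 10.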


Datapath: memory access (load)

Fig. 5.

  1. Fetch instruction and increment PC
  2. Obtain base register operand (Read data 1) from register file
  3. Perform addition of register value with sign-extended immediate operand in ALU, using ALU control to select operation, ALUSrc = 1 to select immediate
  4. Use ALU result as address for data memory
  5. Use MemtoReg = 1 to select Read data and write back to destination register

Controls: RegWrite = 1, RegDst = 0 for $rt

How would this have to change for a store instruction?
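The load path can be traced the same way. This sketch assumes the standard MIPS I-type layout (base register rs in bits 25-21, destination rt in bits 20-16, 16-bit offset in bits 15-0) and models data memory as a dictionary from byte address to word, which is a simplification chosen for illustration. The names are hypothetical.

```python
def sign_extend_16(value):
    """Sign-extend a 16-bit field to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def execute_load(instruction, registers, data_memory):
    """Steps 2-5 of the load path; the fetch (step 1) is assumed done."""
    rs = (instruction >> 21) & 0x1F               # base register
    rt = (instruction >> 16) & 0x1F               # destination (RegDst = 0)
    offset = sign_extend_16(instruction & 0xFFFF)
    base = registers[rs]                          # Read data 1
    address = (base + offset) & 0xFFFFFFFF        # ALU add, ALUSrc = 1
    read_data = data_memory[address]              # MemRead = 1
    registers[rt] = read_data                     # MemtoReg = 1, RegWrite = 1
    return address
```

For example, `lw $8, -4($9)` with register 9 holding 104 reads the word stored at address 100.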


Datapath: branch

Fig. 5.

  1. Fetch instruction and increment PC
  2. Read 2 registers from register file for comparison
  3. ALU subtracts data values, using ALU control to select operation and ALUSrc = 0
  4. Generate branch address: add (PC + 4) to sign-extended offset, shifted left by 2
      • why shifted by 2?
      • what control signal values?
  5. Use Zero output from ALU (and Branch control) to determine which result to use to update PC
      • If equal, use branch address
      • else use incremented PC

Note that this is all happening simultaneously in 1 clock cycle (adder separate from ALU)
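Steps 4 and 5 amount to a small calculation followed by a selection. The sketch below is one way to write it in Python; the function names are hypothetical, and `branch` and `zero` stand for the Branch control signal and the ALU Zero output.

```python
def sign_extend_16(value):
    """Sign-extend a 16-bit field to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def next_pc(pc, offset_field, branch, zero):
    """Choose between the incremented PC and the branch address."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF                   # first adder
    shifted_offset = sign_extend_16(offset_field) << 2  # word offset to byte offset
    branch_address = (pc_plus_4 + shifted_offset) & 0xFFFFFFFF  # second adder
    take_branch = branch and zero                       # PCSrc = Branch AND Zero
    return branch_address if take_branch else pc_plus_4
```

Both candidate values are always computed; the MUX controlled by PCSrc only decides which one is written back to the PC.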


Datapath: jump

Jump is still missing. Recall format:

  000010    target
  b 31-26   b 25-0

Compare to branch: computes target address differently

  PC <- PC[31-28] :: IR[25-0] :: 00

Update the PC by using:

  • upper 4 bits of the program counter
  • 26 bits of the target (lower 26 bits of instruction register)
  • two 0's (creates a 32-bit address)

Jump is unconditional.
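The concatenation can be written with masks and shifts. This is a minimal sketch that follows the description above, taking the upper 4 bits from the `pc` value passed in; the function name is hypothetical.

```python
def jump_target(pc, instruction):
    """Build the 32-bit jump address: PC[31-28] :: IR[25-0] :: 00."""
    upper_bits = pc & 0xF0000000          # upper 4 bits of the program counter
    target = instruction & 0x03FFFFFF     # lower 26 bits of the instruction
    return upper_bits | (target << 2)     # shifting left by 2 appends the two 0's
```

The pieces add up to 4 + 26 + 2 = 32 bits, so no part of the address is left undefined.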

Datapath: critical path

Why don't we use the single-cycle datapath? Too slow!
  • Clock cycle time must be as long as the longest path: the load instruction

Fig. 5.

5 functional units: PC, instruction memory, register file, ALU, data memory, plus 2 MUXes


Datapath: performance

Single-cycle performance

Suppose operation times are:
  • Memory: 2 nanoseconds
  • ALU/adders: 2 ns
  • Register file (read or write): 1 ns
  • Assume MUXes, control units, PC access, sign-extend have no delay

How long must the clock cycle be?

Instruction  Inst  Reg   ALU  Data  Reg    Total  Distribution  Total x
type         mem   read  op   mem   write  (ns)                 distribution
R-type       2     1     2    0     1      6      44%           2.64
load         2     1     2    2     1      8      24%           1.92
store        2     1     2    2     0      7      12%           0.84
branch       2     1     2    0     0      5      18%           0.90
jump         2     0     0    0     0      2      2%            0.04

With single-cycle, the clock period must be 8 ns, the time of the slowest instruction (load).

But not all instructions are loads! Suppose the distribution of instruction types is as shown. If we could vary the instruction time, the average would be:

  (6 * 44%) + (8 * 24%) + (7 * 12%) + (5 * 18%) + (2 * 2%) = 6.34 ns, or about 6.3 ns

This is about 27% faster (8 / 6.3 ≈ 1.27).
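The arithmetic can be checked with a short Python script. The delays and the instruction mix are the ones given above; the dictionary layout and names are just one convenient way to organize them.

```python
# Delay of each functional unit in nanoseconds (from the assumptions above)
DELAY_NS = {"inst_mem": 2, "reg_read": 1, "alu": 2, "data_mem": 2, "reg_write": 1}

# Functional units on the path of each instruction type
UNITS_USED = {
    "R-type": ["inst_mem", "reg_read", "alu", "reg_write"],
    "load":   ["inst_mem", "reg_read", "alu", "data_mem", "reg_write"],
    "store":  ["inst_mem", "reg_read", "alu", "data_mem"],
    "branch": ["inst_mem", "reg_read", "alu"],
    "jump":   ["inst_mem"],
}

# Instruction mix in percent (integers keep the arithmetic exact)
SHARE_PERCENT = {"R-type": 44, "load": 24, "store": 12, "branch": 18, "jump": 2}

total_ns = {name: sum(DELAY_NS[unit] for unit in units)
            for name, units in UNITS_USED.items()}

# Single-cycle: every instruction takes as long as the slowest one
clock_period_ns = max(total_ns.values())

# Hypothetical variable-length clock: weight each total by its share
average_ns = sum(total_ns[name] * SHARE_PERCENT[name] for name in total_ns) / 100

speedup = clock_period_ns / average_ns

print(total_ns)
print(clock_period_ns, average_ns, round(speedup, 2))
```

The exact ratio 8 / 6.34 rounds to 1.26; rounding the average to 6.3 ns first gives the 1.27 quoted above.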