Unit 6- Microprocessor Design, Summaries of Design


2021/2022

Uploaded on 09/27/2022

ELECTRICAL AND COMPUTER ENGINEERING DEPARTMENT, OAKLAND UNIVERSITY
ECE-3710: Computer Hardware Design Winter 2019
Instructor: Daniel Llamocca
Unit 6- Microprocessor Design
INTRODUCTION
▪ Abstraction Layers in Computer Systems Design: Transistor Circuits → Logic Gates → Register Transfers → Microarchitecture → Instruction Set Architecture → Operating Systems → Programming Languages → Algorithm
▪ Typical devices used to implement digital systems (the systems themselves can be described in a hardware-description language):
✓ ASICs, FPGAs: For dedicated hardware implementation. They require highly specialized design.
✓ General-Purpose Microprocessors, Microcontrollers (e.g., programmed in embedded C). They require software development.
✓ Specialized uPs: PDSPs (programmable digital signal processors). They require specialized software development.
▪ ASICs or uPs? Performance vs. flexibility. ASIC design has a high development cost, and ASICs are not reprogrammable.
▪ FPGAs: Intermediate option between ASICs and uPs. Not commonly used for processor implementation. Operating frequencies can be relatively low compared to uPs, but FPGAs can achieve higher performance for specific tasks. They are reconfigurable.
▪ PSoCs (Programmable System-on-Chip): They integrate reconfigurable logic (like an FPGA), a hard-wired microprocessor, and peripherals. With proper software/hardware co-design, high-performance solutions can be attained.
COMPUTER HARDWARE ORGANIZATION
General-purpose Digital Computer: Usually called ‘Computer’. It is a digital system that can follow a stored sequence of instructions, called a program, that operates on data.
✓ The user can specify and modify the program and/or the data according to their specific needs.
✓ As a result of this flexibility, general-purpose digital computers can perform a variety of information-processing tasks, ranging over a very wide spectrum of applications.
✓ The digital computer is thus a highly general and very flexible digital system.
Computer Specification: It is the description of the computer's appearance to a programmer at the lowest level: the Instruction Set Architecture (ISA). From the ISA, a high-level description of the hardware to implement the computer (i.e., the computer architecture) is formulated.
Computer: Processor + I/O + Memory
✓ Memory: It stores programs as well as input, output, and intermediate data.
✓ Central Processing Unit (CPU): It sequentially executes the instructions in memory (the program) by performing arithmetic and other data-processing operations.
✓ I/O Units: A digital computer can accommodate many different input and output devices, e.g.: DVD drives, USB flash drives, printers, LCDs, keyboards.
[Figure: computer organization. Memory (data / instructions), Input Units, and Output Units connect over the ADDRESS/DATA BUS to the PROCESSOR (CPU), which contains the Control Unit, ALU, and Registers.]

CENTRAL PROCESSING UNIT (CPU)

▪ Also called Processor. It consists of a Datapath and a Control Unit.
✓ Datapath:
  • Register File (set of registers): It holds data and memory address values during the execution of an instruction.
  • Arithmetic Logic Unit (ALU): Shared operation unit that performs arithmetic (e.g., addition, subtraction, division) and bit-wise logic (e.g., AND, OR) operations.
✓ Control Unit: It controls operations performed on the Datapath and other components (e.g., memory). It interprets the instructions and executes them. Instructions are read from memory. To execute a particular instruction, this unit asserts specific signals at certain times to control the registers, ALU, memories, and ancillary logic. A Control Unit usually includes:
  • Program Counter (PC): During program execution, it provides the address of the instruction being executed. It can increment the address as well as change the sequence of operations using decisions based on status information.
  • Instruction Decoder (ID): It reads the instructions and generates control signals for the datapath and other components. It is usually implemented as a combinational circuit (single-cycle computers) or as a large Finite State Machine (FSM) with ancillary logic (multi-cycle computers).

▪ Complex CPU: Multiple control units and datapaths.

Harvard vs. Von Neumann
▪ Harvard:
✓ Separate instruction memory and data memory.
✓ Operands usually placed in registers in the CPU: register-to-register architecture.
▪ Von Neumann:
✓ One memory for both instructions and data.
✓ Operands placed in an accumulator register or in the instruction memory: register-memory architecture.

GENERIC CPU MODEL

▪ The figure depicts a generic model for a CPU with typical components. The Control Unit includes the Program Counter (PC) and the Instruction Decoder (ID). The Datapath includes a Register File and an ALU. Instruction and Data Memories are usually included. A specific CPU might not have all the components or connections, or it might include more components.
▪ Program Counter (PC): It has a branch control mechanism to increment the PC, assign an arbitrary value (jump/branch), or apply an address offset. The jump/branch address and the offset are latched from the instruction itself or from the Datapath. In the figure, the instruction register (IR) supplies the offset, while the Datapath generates the jump address. It can be the other way around, or the PC might not include an offset or jump address at all.
▪ Instruction Decoder (ID): It generates control bits (orange-colored signals in the figure) for the Datapath, PC, and Data Memory.
▪ Instruction Memory (IM): It supplies the instructions to be executed. Its output is called the Instruction Register (IR).
▪ This CPU requires extra circuitry that: i) enables the PC (E_PC, sclr_PC), ii) controls Instruction Memory (IM) loading, and iii) enables the Instruction Decoder.
▪ Model Parameters:
✓ H: Memory address word size.
✓ K: Address offset size.
✓ N: Data word size.
✓ NI: Instruction word size.
✓ M: Bits to address the Register File (2^M registers).
✓ K ≤ H ≤ N.

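[Figure: generic CPU model. CONTROL UNIT: PC with Branch Control (inputs JMP, OFFSET, E, sclr; enables E_PC, sclr_PC; H-bit address), Instruction Memory (IM_WE, IM_DI; NI-bit output IR), and Instruction Decoder (outputs CONTROL_WORD, FS, and x_ctrl to the I/O interfaces; inputs Z, V, N, C). DATAPATH: Register File (2^M registers) and ALU (FS; flags Z, V, N, C). Data Memory: DM_WE, DM_DI, AD (H bits), DI, DO (N bits). The offset into the PC is K bits wide.]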


OPCODE (IR[7..5])   Instruction         Operation
000                 MOV DR, SR          DR ← SR
001                 LOADI DR, DATA      DR ← DATA, DATA = IR[3..0]
010                 ADD DR, SR          DR ← DR + SR
011                 ADDI DR, DATA       DR ← DR + DATA, DATA = IR[3..0]
100                 SR0 DR, SR          DR ← 0&SR[3..1]
101                 IN DR               DR ← IN
110                 OUT DR              OUT ← DR
111                 JNZ DR, ADDRESS     PC ← PC + 1 if DR = 0
                                        PC ← IR[3..0] if DR ≠ 0

  • ADDRESS = IR[3..0]

▪ opcode: IR[7..5]. This is the operation code of an instruction. This group of bits specifies an operation (such as add, subtract, shift, complement in the ALU). If it has m bits, there can be up to 2^m distinct instructions.
▪ Immediate Data: IR[3..0]. This is called an immediate operand since it is immediately available in the instruction.
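The field layout above can be sketched in a few lines of Python. This is only an illustration, not part of the original design files: the function name `decode` and the dictionary keys are hypothetical, and the positions of DR (IR[4]) and SR (IR[3]) are read off the instruction encodings listed in these notes (e.g., IN R1 = 1011XXXX, ADD R0,R1 = 01001XXX).

```python
# Mnemonics indexed by the 3-bit opcode IR[7..5], as in the table above.
MNEMONICS = ["MOV", "LOADI", "ADD", "ADDI", "SR0", "IN", "OUT", "JNZ"]

def decode(ir):
    """Split an 8-bit instruction word into its fields (hypothetical helper)."""
    opcode = (ir >> 5) & 0b111   # IR[7..5]
    dr = (ir >> 4) & 0b1         # IR[4]: destination register (R0 or R1)
    sr = (ir >> 3) & 0b1         # IR[3]: source register, two-register formats only
    imm = ir & 0b1111            # IR[3..0]: immediate DATA or jump ADDRESS
    return {"mnemonic": MNEMONICS[opcode], "dr": dr, "sr": sr, "imm": imm}

print(decode(0b00110101))  # LOADI R1, 5
print(decode(0b01001000))  # ADD R0, R1
```

Which of `sr` and `imm` is meaningful depends on the instruction; the sketch returns both and leaves the choice to the caller.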

INSTRUCTION DECODER:
▪ This component is in charge of issuing control signals for the proper execution of instructions. The inputs to this circuit are the Instruction Register (IR) and the Z flag. The outputs are all the control signals: M1, M2, M3, M4, M5, M6, L_R0, L_R1, L_OP. Note that the Function Select (FS) input of the ALU is taken directly from IR[7..5].
▪ Also, if stop_ID=0, all the outputs are ‘0’. This is useful to pause execution of a program (PC and Datapath are not updated).
▪ This is a combinational circuit. The I/O relationship depends on how each instruction is defined.

✓ IN DR: DR grabs the contents from the input
IN R0: 1010XXXX → M1 ← 1, L_R0 ← 1, M6 ← 0
IN R1: 1011XXXX → M1 ← 1, L_R1 ← 1, M6 ← 0

✓ OUT DR: Places the contents of DR on the output register
OUT R0: 1100XXXX → M2 ← 0, L_OP ← 1, M6 ← 0
OUT R1: 1101XXXX → M2 ← 1, L_OP ← 1, M6 ← 0

✓ LOADI DR, DATA: Copies immediate DATA onto DR
LOADI R0, DATA: 0010d3d2d1d0 → M5 ← 1, M3 ← 1, M1 ← 0, L_R0 ← 1, M6 ← 0
LOADI R1, DATA: 0011d3d2d1d0 → M5 ← 1, M3 ← 1, M1 ← 0, L_R1 ← 1, M6 ← 0

✓ ADD DR, SR: Adds SR and DR, and copies the result onto DR
ADD R0,R0: 01000XXX → M4 ← 0, M5 ← 0, M2 ← 0, M3 ← 1, M1 ← 0, L_R0 ← 1, M6 ← 0
ADD R0,R1: 01001XXX → M4 ← 0, M5 ← 0, M2 ← 1, M3 ← 1, M1 ← 0, L_R0 ← 1, M6 ← 0
ADD R1,R0: 01010XXX → M4 ← 0, M5 ← 0, M2 ← 1, M3 ← 1, M1 ← 0, L_R1 ← 1, M6 ← 0
ADD R1,R1: 01011XXX → M4 ← 1, M5 ← 0, M2 ← 1, M3 ← 1, M1 ← 0, L_R1 ← 1, M6 ← 0

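[Figure: instruction word formats. IN: 1 0 1 | DR | X X X X. OUT: 1 1 0 | DR | X X X X. LOADI: 0 0 1 | DR | d3 d2 d1 d0. ADD: 0 1 0 | DR | SR | X X X. The DR and SR fields are 1 bit each (0: R0, 1: R1).]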


✓ ADDI DR, DATA: Adds immediate DATA and DR, and copies the result onto DR
ADDI R0, DATA: 0110d3d2d1d0 → M2 ← 0, M5 ← 1, M3 ← 1, M1 ← 0, L_R0 ← 1, M6 ← 0
ADDI R1, DATA: 0111d3d2d1d0 → M2 ← 1, M5 ← 1, M3 ← 1, M1 ← 0, L_R1 ← 1, M6 ← 0

✓ MOV DR, SR: Copies the contents of SR onto DR
MOV R0, R0: 00000XXX → M2 ← 0, M3 ← 0, M1 ← 0, L_R0 ← 1, M6 ← 0
MOV R1, R1: 00011XXX → M2 ← 1, M3 ← 0, M1 ← 0, L_R1 ← 1, M6 ← 0
MOV R0, R1: 00001XXX → M2 ← 1, M3 ← 0, M1 ← 0, L_R0 ← 1, M6 ← 0
MOV R1, R0: 00010XXX → M2 ← 0, M3 ← 0, M1 ← 0, L_R1 ← 1, M6 ← 0
"MOV R0,R0" and "MOV R1,R1" can be used as NOP instructions.

✓ SR0 DR, SR: Shifts (to the right) the contents of SR and places the result onto DR
SR0 R0,R0: 10000XXX → M4 ← 0, M5 ← 0, M2 ← 0, M3 ← 1, M1 ← 0, L_R0 ← 1, M6 ← 0
SR0 R0,R1: 10001XXX → M4 ← 0, M5 ← 0, M2 ← 1, M3 ← 1, M1 ← 0, L_R0 ← 1, M6 ← 0
SR0 R1,R0: 10010XXX → M4 ← 0, M5 ← 0, M2 ← 1, M3 ← 1, M1 ← 0, L_R1 ← 1, M6 ← 0
SR0 R1,R1: 10011XXX → M4 ← 1, M5 ← 0, M2 ← 1, M3 ← 1, M1 ← 0, L_R1 ← 1, M6 ← 0

✓ JNZ DR, ADDRESS: Jumps to a certain instruction if DR ≠ 0. This is how computers implement loops.
JNZ R0, ADDRESS: 1110a3a2a1a0 → M2 ← 0, M6 ← 0 if z = 1, M6 ← 1 if z = 0
JNZ R1, ADDRESS: 1111a3a2a1a0 → M2 ← 1, M6 ← 0 if z = 1, M6 ← 1 if z = 0

  • M6 = 0 → PC ← PC + 1; M6 = 1 → PC ← IR[3..0]

ARITHMETIC LOGIC UNIT
▪ With the 3-bit input selector FS, the operations performed here are very simple. For 4-bit inputs A and B and a 4-bit output F: F=A when FS=000, 001; F=A+B when FS=010, 011; F=sr(A) when FS=100; and F=B when FS=111. The output Z=1 if the result F is all 0’s, except when FS=101, 110 (since these are the IN and OUT instructions).
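As a rough behavioral sketch of this ALU (not the hardware description itself), the FS cases can be written as a small Python function. Three points are assumptions made here because the notes do not specify them: the sum is truncated to 4 bits (carry discarded), sr(A) shifts a 0 into the MSB (matching DR ← 0&SR[3..1]), and for FS=101, 110 the function simply returns F=0 with Z=0. The name `alu4` is hypothetical.

```python
def alu4(fs, a, b):
    """Return (F, Z) for 4-bit inputs a, b and the 3-bit function select fs."""
    a &= 0xF
    b &= 0xF
    if fs in (0b000, 0b001):      # MOV, LOADI: pass A through
        f = a
    elif fs in (0b010, 0b011):    # ADD, ADDI: assumed to wrap to 4 bits
        f = (a + b) & 0xF
    elif fs == 0b100:             # SR0: shift right, 0 enters at the MSB
        f = a >> 1
    elif fs == 0b111:             # JNZ: pass B through so Z reflects the register
        f = b
    else:                         # 101 (IN), 110 (OUT): Z is not generated
        return 0, 0               # assumption: F is unspecified, return zeros
    return f, int(f == 0)

print(alu4(0b010, 9, 8))       # 9 + 8 = 17, which wraps to 1
print(alu4(0b100, 0b1011, 0))  # 1011 shifted right is 0101
```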

Example:
▪ Write an assembly program for a counter from 1 to 5: 1, 2, 3, 4, 5, 1, 2, 3, …. The count must be shown on the output register (OUT).

    start: loadi R0,1       R0 ← 1
           out R0           OUT = 1
           addi R0,1        R0 ← R0 + 1 = 2
           out R0           OUT = 2
           addi R0,1        R0 ← R0 + 1 = 3
           out R0           OUT = 3
           addi R0,1        R0 ← R0 + 1 = 4
           out R0           OUT = 4
           addi R0,1        R0 ← R0 + 1 = 5
           out R0           OUT = 5
           jnz R0, start
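To check the program by hand, it can help to step through it with a tiny interpreter. The sketch below models only the four instructions the program uses, at the behavioral level (no control signals); the tuple encoding and the names `PROGRAM` and `run` are hypothetical choices for illustration.

```python
# Each entry is (mnemonic, DR, optional immediate or address).
PROGRAM = [
    ("loadi", 0, 1),                 # start: address 0
    ("out", 0), ("addi", 0, 1),
    ("out", 0), ("addi", 0, 1),
    ("out", 0), ("addi", 0, 1),
    ("out", 0), ("addi", 0, 1),
    ("out", 0),
    ("jnz", 0, 0),                   # jump to address 0 (start) if R0 != 0
]

def run(program, steps):
    """Execute `steps` instructions and return the values written to OUT."""
    regs = [0, 0]                    # R0, R1 (4 bits each)
    pc = 0
    outputs = []
    for _ in range(steps):
        op, dr, *rest = program[pc]
        next_pc = (pc + 1) & 0xF     # 4-bit program counter
        if op == "loadi":
            regs[dr] = rest[0] & 0xF
        elif op == "addi":
            regs[dr] = (regs[dr] + rest[0]) & 0xF
        elif op == "out":
            outputs.append(regs[dr])
        elif op == "jnz" and regs[dr] != 0:
            next_pc = rest[0]
        pc = next_pc
    return outputs

print(run(PROGRAM, 22))  # → [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
```

The program has 11 instructions, so 22 steps cover exactly two passes through the loop.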

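[Figure: instruction word formats. ADDI: 0 1 1 | DR | d3 d2 d1 d0. SR0: 1 0 0 | DR | SR | X X X. MOV: 0 0 0 | DR | SR | X X X. JNZ: 1 1 1 | DR | a3 a2 a1 a0. The DR and SR fields are 1 bit each (0: R0, 1: R1).]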


SINGLE-CYCLE HARDWIRED CONTROL – A SIMPLE COMPUTER

▪ Here, we provide a more formal description of a microprocessor (using the generic CPU model); the figure includes the Instruction Load Control component.
▪ Parameters: NI=16, N=16, K=H=6, M=3 (8 registers). The Function Select (FS) of the ALU has 4 bits. The Constant Input (CI) of the Datapath has N=16 bits, where CI[2..0]=IR[2..0] and CI[15..3]="00...00".
▪ Instruction Load Control: It does not control loading of data into Data Memory, though it could be extended to handle that.

PROGRAM COUNTER (PC) :

▪ This generic Program Counter accepts a Jump Address (JA, driven by the Datapath's Address Out, AO) and an Offset Address (OFFSET).
▪ Note that PC and JA are unsigned H-bit addresses, while OFFSET can be an unsigned or signed K-bit value (K ≤ H).
▪ In the figure, we use a signed offset. As a result, we zero-extend PC (0&PC) and add it to the (sign-extended) OFFSET, resulting in H+1 bits. We keep only the H least significant bits and treat the result as unsigned. This means that if the result falls outside [0, 2^H − 1], it wraps around (e.g., for H=6: 111110 + 000011 = 1 000001, kept as 000001; 000001 − 000011 = 000001 + 111101 = 111110).
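The wraparound arithmetic can be checked with a short Python model. This is a sketch under stated assumptions: the names `sign_extend` and `next_pc` are hypothetical, JS is assumed to take priority over OS, and OS=0 is assumed to select a plain increment by 1. Only the two worked examples above come from the notes.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def next_pc(pc, offset=0, ja=0, os=0, js=0, h=6, k=6):
    """Next PC value: jump (JS=1), offset branch (OS=1), or increment."""
    mask = (1 << h) - 1              # keep only the H least significant bits
    if js:
        return ja & mask
    step = sign_extend(offset, k) if os else 1
    return (pc + step) & mask        # masking implements the wraparound

print(format(next_pc(0b111110, offset=0b000011, os=1), "06b"))  # → 000001
print(format(next_pc(0b000001, offset=0b111101, os=1), "06b"))  # → 111110
```

Masking with 2^H − 1 plays the role of dropping the (H+1)-th bit of the adder output.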

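[Figure: PROGRAM COUNTER (PC). Inputs: E (E_PC), sclr (sclr_PC), OFFSET (K bits, K ≤ H), JA (H bits), and the selects OS and JS. The zero-extended value 0&PC (H+1 bits) feeds an adder; multiplexers controlled by OS and JS choose the next PC value (H bits).]

[Figure: DATAPATH. Register File (2^M registers) with address inputs DR, SA, SB (M bits each) and write enable RW; BUS_A and BUS_B (N bits each); multiplexer MB selects between Bus B and the Constant Input (CI); ALU with FS and flags Z, V, N, C; multiplexer MD selects between the ALU result and Data In (DI); outputs Address Out (AO) and Data Out (DO).]

[Figure: simple computer. CONTROL UNIT (PC, Instruction Memory, Instruction Decoder), DATAPATH (Register File with 2^3 registers, ALU), Data Memory, and the INST_LOAD CONTROL block (signals start, step, L_in, L_ex, D_ex, we_ex, IM_WE, IM_DI, E_PC, sclr_PC, stop_ID, isbranch). NI=16, N=16, H=6, K=6. The control word is DR, SA, SB, MB, MD, RW, MW, OS, JS, plus the 4-bit FS. The PC offset is IR[8..6]&IR[2..0]; CI takes IR[2..0]; DM_AD = AO[5..0]. AO is 16 bits wide, but only the 6 LSBs are used.]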


DATAPATH:
▪ A generic datapath includes a Register File and an ALU (see previous figure). A Register File includes 2^M registers, so we need M bits to address all of these registers.
▪ Register File: The figure below depicts a Register File with M=2, resulting in 2^2 = 4 registers. Note how in this particular implementation, we use 2 data buses (Bus A and Bus B). Other implementations only use one data bus. We also include the connections to the ALU and to the Datapath inputs and outputs.

▪ Arithmetic Logic Unit: The FS has 4 bits, and the following table lists all the possible operations. The input data (A, B) and output data (Y) are represented as signed numbers. Here, the flags Z, V, N, C are generated.

FS     Operation                          Function                         Flag bits       Unit
0000   Y <= A                             Transfer A                       N, Z            Arithmetic
0001   Y <= A + 1                         Increment A                      V, C, N, Z      Arithmetic
0010   Y <= A + B                         Add A and B with cin=0           V, C, N, Z      Arithmetic
0011   Y <= A + B + 1                     Add A and B with cin=1           V, C, N, Z      Arithmetic
0100   Y <= A - B - 1 = A + not(B)        Subtract B from A with bin=1     V, C, N, Z      Arithmetic
0101   Y <= A - B = A + not(B) + 1        Subtract B from A with bin=0     V, C, N, Z      Arithmetic
0110   Y <= A - 1                         Decrement A                      V, C, N, Z      Arithmetic
0111   Y <= B                             Transfer B                       N, Z            Arithmetic
1000   Y <= A OR B                        Bit-wise OR                      N, Z            Logic
1001   Y <= A AND B                       Bit-wise AND                     N, Z            Logic
1010   Y <= A XOR B                       Bit-wise XOR                     N, Z            Logic
1011   Y <= not A                         Complement A                     N, Z            Logic
1100   Y <= not B                         Complement B                     N, Z            Logic
1101   Y <= sr B                          Right-shift B                    N, Z            Logic
1110   Y <= sl B                          Left-shift B                     N, Z            Logic
1111   Y <= 0                             Transfer 0                       None affected   Logic

✓ In this particular implementation, the carry out (C) from a previous operation is not an input to the ALU. Instead, we have to use a specific instruction that adds the carry in (or borrow in) to an operation when desired.
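The table can be mirrored by a behavioral Python model, which is sometimes handy for generating test vectors. This is a sketch, not the course's implementation. Its assumptions: C is the raw carry out of the adder (so for subtraction it is the complement of a borrow), V is two's-complement overflow, shifts are logical, and the function returns only the flags that the table lists for each operation. The name `alu16` is hypothetical.

```python
N_BITS = 16
MASK = (1 << N_BITS) - 1
SIGN = 1 << (N_BITS - 1)

def alu16(fs, a, b):
    """Return (Y, flags); flags holds only the flags the operation generates."""
    a &= MASK
    b &= MASK
    # Adder-based operations: fs -> (second adder operand, carry in)
    adder_ops = {
        0b0001: (0, 1),            # A + 1
        0b0010: (b, 0),            # A + B
        0b0011: (b, 1),            # A + B + 1
        0b0100: (~b & MASK, 0),    # A + not(B)     = A - B - 1
        0b0101: (~b & MASK, 1),    # A + not(B) + 1 = A - B
        0b0110: (MASK, 0),         # A + 11...1     = A - 1
    }
    if fs in adder_ops:
        x, cin = adder_ops[fs]
        total = a + x + cin
        y = total & MASK
        # Overflow: both adder operands share a sign that the result lacks.
        v = int(bool(~(a ^ x) & (a ^ y) & SIGN))
        return y, {"V": v, "C": total >> N_BITS,
                   "N": y >> (N_BITS - 1), "Z": int(y == 0)}
    if fs == 0b1111:               # Transfer 0: no flags affected
        return 0, {}
    other_ops = {
        0b0000: a, 0b0111: b,
        0b1000: a | b, 0b1001: a & b, 0b1010: a ^ b,
        0b1011: ~a & MASK, 0b1100: ~b & MASK,
        0b1101: b >> 1,            # assumed logical right shift
        0b1110: (b << 1) & MASK,
    }
    y = other_ops[fs]
    return y, {"N": y >> (N_BITS - 1), "Z": int(y == 0)}

print(alu16(0b0101, 5, 5))       # 5 - 5: Y = 0 with Z = 1
print(alu16(0b0001, 0x7FFF, 0))  # incrementing the largest positive value sets V
```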

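[Figure: REGISTER FILE (M=2). Four registers R0 to R3, each with an enable E generated by a decoder from the Destination Register (DR) address and gated by RW. Select A (SA) and Select B (SB), M bits each, choose the registers driving BUS A and BUS B (N bits each). Multiplexer MB selects between Bus B and Constant In (CI) for ALU input B; BUS A drives ALU input A and Address Out (AO); the selected B value drives Data Out (DO). Multiplexer MD selects between the ALU output Y and Data In (DI) as the value written back. The ALU takes FS and generates Z, V, N, C.]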


▪ The table provides the instruction specification, i.e., a description of the operation performed by each instruction, including the status bits affected by the instruction. We include a limited number of instructions; the designer can always add more instructions that are supported by the Datapath and Control Unit.

Instruction          Opcode    Mnemonic   Format        Description                   PC                               Status Bits
Move A               0000000   MOVA       RD, RA        R[DR] ← R[SA]                 PC ← PC+1                        N, Z
Increment            0000001   INC        RD, RA        R[DR] ← R[SA] + 1             PC ← PC+1                        N, Z, C, V
Add                  0000010   ADD        RD, RA, RB    R[DR] ← R[SA] + R[SB]         PC ← PC+1                        N, Z, C, V
Subtract             0000101   SUB        RD, RA, RB    R[DR] ← R[SA] - R[SB]         PC ← PC+1                        N, Z, C, V
Decrement            0000110   DEC        RD, RA        R[DR] ← R[SA] - 1             PC ← PC+1                        N, Z, C, V
AND                  0001000   AND        RD, RA, RB    R[DR] ← R[SA] AND R[SB]       PC ← PC+1                        N, Z
OR                   0001001   OR         RD, RA, RB    R[DR] ← R[SA] OR R[SB]        PC ← PC+1                        N, Z
Exclusive OR         0001010   XOR        RD, RA, RB    R[DR] ← R[SA] XOR R[SB]       PC ← PC+1                        N, Z
NOT                  0001011   NOT        RD, RA        R[DR] ← not (R[SA])           PC ← PC+1                        N, Z
Move B               0001100   MOVB       RD, RB        R[DR] ← R[SB]                 PC ← PC+1                        N, Z
Shift Right          0001101   SHR        RD, RB        R[DR] ← sr R[SB]              PC ← PC+1                        N, Z
Shift Left           0001110   SHL        RD, RB        R[DR] ← sl R[SB]              PC ← PC+1                        N, Z
Load Immediate       1001100   LDI        RD, OP        R[DR] ← OP                    PC ← PC+1                        N, Z
Add Immediate        1000010   ADI        RD, RA, OP    R[DR] ← R[SA] + OP            PC ← PC+1                        N, Z
Load                 0010000   LD         RD, RA        R[DR] ← M[R[SA]]              PC ← PC+1                        -
Store                0100000   ST         RA, RB        M[R[SA]] ← R[SB]              PC ← PC+1                        -
Branch on Zero       1100000   BRZ        RA, AD        If R[SA] ≠ 0                  PC ← PC+1                        N, Z
                                                        If R[SA] = 0                  PC ← PC+AD
Branch on Negative   1100001   BRN        RA, AD        If R[SA] ≥ 0                  PC ← PC+1                        N, Z
                                                        If R[SA] < 0                  PC ← PC+AD
Jump                 1110000   JMP        RA                                          PC ← R[SA]                       -

▪ Other ISAs do not generate status bits when transfers on Bus B are occurring.
▪ Note that the branch instructions generate N, Z because they require Bus A to be transferred in order to evaluate R[SA], which might assert N or Z. The Jump instruction does not affect the status bits.
▪ Some considerations regarding the notation of the Instruction Description:
✓ R[DR]: This refers to the register whose number is DR. Example: if DR=2 → R2.
✓ M[R[SA]]: This refers to the memory address given by the value of the register with number SA, e.g.: if SA=3 → M[R3].
▪ The following table shows an example with instructions in memory and a detailed description of them:

Address   Memory Contents     Other Fields         Assembly Instruction   Operation                     Comments
011001    0000101001010011    DR:1, SA:2, SB:3     SUB R1,R2,R3           R1 ← R2 – R3
100011    0100000000100101    SA:4, SB:5           ST R4,R5               M[R4] ← R5                    DR unused
101101    1000010010111110    DR:2, SA:7, OP:6     ADI R2,R7,6            R2 ← R7 + 6
110111    1100000101110100    AD:-20, SA:6         BRZ R6,-20             If R[SA] = 0: PC ← PC-20      -20 = 101100
111110    0010000101010000    DR:5, SA:2           LD R5,R2               R5 ← M[R2]                    SB unused
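The "Other Fields" column can be reproduced by slicing the 16-bit word. The sketch below assumes the layout implied by the encodings in these notes: a 7-bit opcode in IR[15..9], DR in IR[8..6], SA in IR[5..3], SB (or the immediate OP) in IR[2..0], and the signed 6-bit branch offset AD formed as IR[8..6]&IR[2..0]. The function name `fields` is hypothetical.

```python
def fields(ir):
    """Split a 16-bit instruction word into opcode, DR, SA, SB/OP and AD."""
    opcode = (ir >> 9) & 0x7F     # IR[15..9]
    dr = (ir >> 6) & 0b111        # IR[8..6]
    sa = (ir >> 3) & 0b111        # IR[5..3]
    sb = ir & 0b111               # IR[2..0], also the 3-bit immediate OP
    ad = (dr << 3) | sb           # IR[8..6]&IR[2..0]
    if ad & 0b100000:             # interpret AD as a signed 6-bit value
        ad -= 1 << 6
    return {"opcode": opcode, "DR": dr, "SA": sa, "SB": sb, "AD": ad}

print(fields(0b0000101001010011))  # SUB R1,R2,R3: DR=1, SA=2, SB=3
print(fields(0b1100000101110100))  # BRZ R6,-20: SA=6, AD=-20
```

As with the assembly formats, not every field is meaningful for every instruction: AD matters only for branches, and DR is unused by ST.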

Example:
▪ The following assembly program implements a counter from 2 to 13: 2, 3, …, 13, 2, 3, … As we cannot use 11 as a 3-bit immediate operand, we first load 7 on R1 and then add 4.
  * 2 to 13 → 11 downto 0 (R0 holds the count, while R1 counts the remaining steps down to 0).
▪ We use ‘---’ to indicate the fields that are unused. This means we can assign any value to them.

Address   Instruction Memory      Assembly Program           Operation
000000    1001100 011 --- 100     start: LDI R3,4            R3 ← 4
000001    1001100 000 --- 010            LDI R0,2            R0 ← 2
000010    1001100 001 --- 111            LDI R1,7            R1 ← 7
000011    1000010 001 001 100            ADI R1,R1,4         R1 ← R1+4
000100    1000010 000 000 001     loop:  ADI R0,R0,1         R0 ← R0+1
000101    0000110 001 001 ---            DEC R1,R1           R1 ← R1-1
000110    1100000 111 001 011            BRZ R1,-5
000111    1110000 --- 011 ---            JMP R3
001000    0000000 000 000 000                                R0 ← R0 (this is a NOP operation)
...
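One way to convince yourself that the program counts 2, 3, …, 13 and then restarts is to trace it with a small behavioral interpreter. The sketch below is illustrative only: it models just the five instructions used, takes the branch offset relative to the address of the BRZ instruction itself (PC ← PC + AD), and records R0 each time it is written. The tuple layout and the names `PROGRAM` and `run` are hypothetical.

```python
# Each entry is (mnemonic, DR, SA, immediate or offset); None marks unused fields.
PROGRAM = [
    ("LDI", 3, None, 4),      # 000000 start: R3 <- 4 (address of loop)
    ("LDI", 0, None, 2),      # 000001        R0 <- 2
    ("LDI", 1, None, 7),      # 000010        R1 <- 7
    ("ADI", 1, 1, 4),         # 000011        R1 <- R1 + 4 = 11
    ("ADI", 0, 0, 1),         # 000100 loop:  R0 <- R0 + 1
    ("DEC", 1, 1, None),      # 000101        R1 <- R1 - 1
    ("BRZ", None, 1, -5),     # 000110        if R1 = 0: PC <- PC - 5
    ("JMP", None, 3, None),   # 000111        PC <- R3
]

def run(program, steps):
    """Execute `steps` instructions; return every value written to R0."""
    regs = [0] * 8
    pc = 0
    trace = []
    for _ in range(steps):
        op, dr, sa, arg = program[pc]
        next_pc = (pc + 1) % 64            # H = 6 address bits
        if op == "LDI":
            regs[dr] = arg
        elif op == "ADI":
            regs[dr] = (regs[sa] + arg) & 0xFFFF
        elif op == "DEC":
            regs[dr] = (regs[sa] - 1) & 0xFFFF
        elif op == "BRZ" and regs[sa] == 0:
            next_pc = (pc + arg) % 64
        elif op == "JMP":
            next_pc = regs[sa] % 64
        if dr == 0:                        # the instruction wrote R0
            trace.append(regs[0])
        pc = next_pc
    return trace

print(run(PROGRAM, 48))  # → [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 2]
```

When R1 reaches 0, the branch lands on address 000001, which reloads R0 with 2 and rebuilds R1 = 11 before re-entering the loop.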


INSTRUCTION DECODER
▪ The inputs to this circuit are the Instruction Register (IR) and the V, C, N, Z flags. The outputs are all the control signals: DR, SA, SB, MB, MD, RW, MW, OS, JS, FS. In this implementation, the V, C, N, Z flags are only considered when branching.
▪ Also, if stop_ID=0, all the outputs are ‘0’. This is useful to pause execution of a program (PC and Datapath are not updated).
▪ This is a combinational circuit. The I/O relationship depends on how each instruction is defined. We provide the output signals for some instructions (‘-’ and ‘---’ mark don't-care or unused values):

Instruction     Instruction Register   V C N Z     RW   DR    SA    SB    MB   MD   FS     MW   OS      JS
MOVA R1,R2      0000000001010---       -           1    001   010   ---   -    0    0000   0    0       0
MOVA R7,R0      0000000111000---       -           1    111   000   ---   -    0    0000   0    0       0
MOVB R0,R3      0001100000---011       -           1    000   ---   011   0    0    0111   0    0       0
MOVB R6,R6      0001100110---110       -           1    110   ---   110   0    0    0111   0    0       0
ADD R3,R2,R1    0000010011010001       -           1    011   010   001   0    0    0010   0    0       0
ADD R6,R0,R0    0000010110000000       -           1    110   000   000   0    0    0010   0    0       0
XOR R6,R1,R3    0001010110001011       -           1    110   001   011   0    0    1010   0    0       0
XOR R5,R4,R5    0001010101100101       -           1    101   100   101   0    0    1010   0    0       0
LDI R7,3        1001100111---011       -           1    111   ---   ---   1    0    0111   0    0       0
LDI R5,4        1001100101---100       -           1    101   ---   ---   1    0    0111   0    0       0
ADI R0,R1,7     1000010000001111       -           1    000   001   ---   1    0    0010   0    0       0
ADI R2,R6,3     1000010010110011       -           1    010   110   ---   1    0    0010   0    0       0
LD R3,R7        0010000011111---       -           1    011   111   ---   -    1    1111   0    0       0
ST R1,R5        0100000---001101       -           0    ---   001   101   0    -    1111   1    0       0
BRZ R4,-5       1100000111100011       Z = 0 / 1   0    ---   100   ---   -    -    0000   0    0 / 1   0
BRN R3,12       1100001001011100       N = 0 / 1   0    ---   011   ---   -    -    0000   0    0 / 1   0
JMP R5          1110000---101---       -           0    ---   101   ---   -    -    1111   0    -       1

✓ For the branch rows, the paired values go together: when the tested flag is 0, OS = 0 (PC ← PC+1); when the tested flag is 1, OS = 1 (PC ← PC+AD).

▪ Branch instructions (BRN, BRZ): These instructions might affect the N and Z bits. Depending on how they affect these flag bits, we either branch or increment the PC.
▪ JMP, LD, ST: They use FS=1111 since in this case the V, C, N, Z flags are unaffected.
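Because the decoder is combinational, it can be modeled as a pure function from (IR, flags) to a control word. The sketch below covers only a handful of the instructions and is an illustration, not the original design. Its assumptions: don't-care outputs are driven to 0, OS simply follows the tested flag on branches, and the opcodes are those of the instruction specification table (BRZ = 1100000, BRN = 1100001). The names `OPCODES` and `control_word` are hypothetical.

```python
# Per-opcode control signals; signals not listed keep their default of 0.
OPCODES = {
    0b0000010: dict(RW=1, FS=0b0010),           # ADD
    0b1001100: dict(RW=1, MB=1, FS=0b0111),     # LDI
    0b0010000: dict(RW=1, MD=1, FS=0b1111),     # LD
    0b0100000: dict(MW=1, FS=0b1111),           # ST
    0b1100000: dict(FS=0b0000, OS="Z"),         # BRZ: OS follows the Z flag
    0b1100001: dict(FS=0b0000, OS="N"),         # BRN: OS follows the N flag
    0b1110000: dict(JS=1, FS=0b1111),           # JMP
}

def control_word(ir, n=0, z=0, stop_id=1):
    """Combinational decode of a 16-bit IR into control signals."""
    cw = dict(RW=0, DR=0, SA=0, SB=0, MB=0, MD=0, FS=0, MW=0, OS=0, JS=0)
    if stop_id == 0:                  # paused: every output is forced to 0
        return cw
    cw.update(DR=(ir >> 6) & 7, SA=(ir >> 3) & 7, SB=ir & 7)
    cw.update(OPCODES[ir >> 9])       # raises KeyError for unmodeled opcodes
    if cw["OS"] in ("Z", "N"):        # resolve the flag-dependent select
        cw["OS"] = z if cw["OS"] == "Z" else n
    return cw

print(control_word(0b0000010011010001))        # ADD R3,R2,R1
print(control_word(0b1100000111100011, z=1))   # BRZ R4,-5 with Z = 1: OS = 1
```

In hardware, the Z and N inputs come from the ALU in the same clock cycle, which is why the branch instructions route R[SA] through the ALU with FS=0000.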

Memory latency (IM, DM)
▪ Memory implemented as an array of registers:
✓ Writing: Data takes 1 clock cycle to be written.
✓ Reading: Output data appears as soon as the address is ready.
▪ Memory implemented using BRAMs (assuming no extra output register):
✓ Writing: Data takes 1 clock cycle to be written.
✓ Reading: Output data takes 1 clock cycle to appear after the address is presented (that is, the address is sampled on the clock edge).
▪ Other memory technologies (SRAMs, DDR RAMs, etc.):
✓ Writing/Reading: It might take many cycles for data to be written or to appear on the output.

Single-Cycle Computer Shortcomings:
▪ ALU operations that might require more than one cycle to execute (e.g., multiplication, division) cannot be executed, or they would require a large combinational delay.
▪ The clock period has a lower limit set by the long worst-case delay path. Pipelining the datapath is required to reduce the combinational delay between registers, and this in turn requires multiple-cycle control.