Start: X:
ECE4680 Datapath.1 2002-4-
ECE
Computer Organization and Architecture
Designing a Single Cycle Datapath
Processor Design: How to Implement MIPS
Simplicity favors regularity
Before we go any further, let’s step back for a second and take a look at the big picture.
All computers consist of five components: (1) input and (2) output devices, (3) the memory system,
and the (4) control and (5) datapath of the processor.
Today’s lecture covers the datapath design.
In the next lecture, I will show you how to design the processor’s control unit.
+1 = 5 min. (X:45)
ECE4680 Datapath.2 2002-4-
The Big Picture: Where are We Now?
° The Five Classic Components of a Computer
° Today’s Topic: Datapath Design
- What is data?
- What is datapath?
[Figure: the five components. Processor (Control and Datapath), Memory, Input, Output]
One of the most important things you need to know before you start designing a processor is what
the instructions look like.
Or, in more technical terms, you need to know the instruction format. One good thing about the MIPS
instruction set is that it is very simple.
First of all, all MIPS instructions are 32 bits long and there are only three instruction formats:
(a) R-type, (b) I-type, and (c) J-type.
The different fields of the R-type instructions are:
(a) OP specifies the operation of the instruction.
(b) Rs, Rt, and Rd are the source and destination register specifiers.
(c) Shamt specifies the amount you need to shift for the shift instructions.
(d) Funct selects the variant of the operation specified in the “op” field.
For the I-type instruction, bits 0 to 15 are used as an immediate field. I will show you how this
immediate field is used differently by different instructions.
Finally for the J-type instruction, bits 0 to 25 become the target address of the jump.
+3 = 10 min. (X:50)
ECE4680 Datapath.4 2002-4-
The MIPS Instruction Formats
° All MIPS instructions are 32 bits long. The three instruction formats:
° The different fields are:
- op: operation of the instruction
- rs, rt, rd: the source and destination register specifiers
- shamt: shift amount
- funct: selects the variant of the operation in the “op” field
- address / immediate: address offset or immediate value
- target address: target address of the jump instruction
R-type: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
I-type: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
J-type: op (6 bits) | target address (26 bits)
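To make the field layout concrete, here is a minimal Python sketch that slices a 32-bit instruction word into the fields listed above. It assumes the fields are packed in the order drawn, with op in the most significant bits. The names `decode` and `encode_r` are hypothetical helpers, and the example field values are arbitrary.

```python
def decode(word):
    """Split a 32-bit instruction word into every field of the three formats."""
    return {
        "op": (word >> 26) & 0x3F,      # 6 bits, all formats
        "rs": (word >> 21) & 0x1F,      # 5 bits, R-type and I-type
        "rt": (word >> 16) & 0x1F,      # 5 bits, R-type and I-type
        "rd": (word >> 11) & 0x1F,      # 5 bits, R-type only
        "shamt": (word >> 6) & 0x1F,    # 5 bits, R-type only
        "funct": word & 0x3F,           # 6 bits, R-type only
        "imm16": word & 0xFFFF,         # bits 0 to 15, I-type only
        "target": word & 0x3FFFFFF,     # bits 0 to 25, J-type only
    }


def encode_r(op, rs, rt, rd, shamt, funct):
    """Hypothetical helper: pack R-type fields into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# Arbitrary example values, chosen only to exercise each field.
word = encode_r(op=0, rs=1, rt=2, rd=3, shamt=0, funct=32)
fields = decode(word)
print(fields["rs"], fields["rt"], fields["rd"], fields["funct"])
```

Which of the decoded fields is meaningful depends on the format: the hardware always extracts all of them, and the control decides which ones to use.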
In today’s lecture, I will show you how to implement the following subset of MIPS instructions: add,
subtract, or immediate, load, store, branch, and the jump instruction.
The Add and Subtract instructions use the R format. The Op and Funct fields together
specify all the different kinds of add and subtract instructions.
Rs and Rt specify the source registers, and the Rd field specifies the destination register.
The Or immediate instruction uses the I format. It only uses one source register, Rs. The other
operand comes from the immediate field. The Rt field is used to specify the destination register.
Both the load and store instructions use the I format and both add the contents of register Rs and
the immediate field together to form the memory address.
The difference is that the load instruction will load the data from memory into Rt while the store
instruction will store the data in Rt into the memory.
The branch on equal instruction also uses the I format. Here Rs and Rt are used to specify the
registers we need to compare.
If these two registers are equal, we will branch to a location specified by the immediate field.
Finally, the jump instruction uses the J format and always causes the program to jump to a
memory location specified in the address field.
I know I went over this rather quickly and you may have missed something. But don’t worry, this is
just an overview. You will keep seeing these (point to the format) all day today.
+3 = 13 min. (X:53)
ECE4680 Datapath.5 2002-4-
The MIPS Subset
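° ADD and subtract (R format)
- add rd, rs, rt
- sub rd, rs, rt
° OR Immediate (I format):
- ori rt, rs, imm
° LOAD and STORE (I format)
- lw rt, rs, imm
- sw rt, rs, imm
° BRANCH (I format):
- beq rs, rt, imm
° JUMP (J format):
- j target
R-type: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
I-type: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
J-type: op (6 bits) | target address (26 bits)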
Remember, we will be using a clocking methodology where all storage elements are clocked by
the same clock edge.
Consequently, our cycle time will be the sum of:
(a) The Clock-to-Q (or latch propagation) time of the input registers.
(b) The longest delay path through the combinational logic block.
(c) The setup time of the output register.
(d) And finally the clock skew.
In order to avoid a hold time violation, you have to make sure the inequality on the slide is fulfilled:
the latch propagation time plus the shortest delay path, minus the clock skew, must exceed the hold time.
+2 = 18 min. (X:58)
ECE4680 Datapath.7 2002-4-
Clocking Methodology
° All storage elements are clocked by the same clock edge
- Edge-triggered: all stored values are updated on a clock edge
° Cycle Time = Latch Prop + Longest Delay Path + Setup + Clock Skew
° (Latch Prop + Shortest Delay Path - Clock Skew) > Hold Time
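[Figure: timing diagram. Clk waveform with a Setup and Hold window around each clock edge; the data is "Don't Care" outside those windows]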
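The two timing relations on this slide can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical and the delay numbers (in picoseconds) are made up for illustration, not taken from any real technology.

```python
def min_cycle_time(latch_prop, longest_path, setup, skew):
    """Cycle Time = Latch Prop + Longest Delay Path + Setup + Clock Skew."""
    return latch_prop + longest_path + setup + skew


def hold_ok(latch_prop, shortest_path, skew, hold):
    """(Latch Prop + Shortest Delay Path - Clock Skew) > Hold Time."""
    return (latch_prop + shortest_path - skew) > hold


# Hypothetical delays in picoseconds.
cycle = min_cycle_time(latch_prop=100, longest_path=2000, setup=50, skew=30)
safe = hold_ok(latch_prop=100, shortest_path=40, skew=30, hold=60)
unsafe = hold_ok(latch_prop=100, shortest_path=0, skew=30, hold=80)
print(cycle, safe, unsafe)
```

Note that the clock skew hurts in both directions: it is added to the cycle time, and it is subtracted from the margin available for the hold check.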
Now with the clocking methodology back in your mind, we can think about what the critical path of
our “abstract” datapath may look like.
One thing to keep in mind about the Register File and Ideal Memory (points to both Instruction and
Data) is that the Clock input is a factor ONLY during the write operation.
For a read operation, the CLK input is not a factor. The register file and the ideal memory behave as
if they are combinational logic.
That is, you apply an address to the input, then after a certain delay, which we call the access time,
the output is valid.
We will come back to these points (point to the “behave” bullets) later in this lecture.
But for now, let’s look at this “abstract” datapath’s critical path, which occurs when the datapath
tries to execute the Load instruction.
The time it takes to execute the load instruction is the sum of:
(a) The PC’s clock-to-Q time.
(b) The instruction memory access time.
(c) The time it takes to read the register file.
(d) The ALU delay in calculating the Data Memory Address.
(e) The time it takes to read the Data Memory.
(f) And finally, the setup time for the register file and clock skew.
+3 = 21 (Y:01)
ECE4680 Datapath.8 2002-4-
An Abstract View of the Critical Path
° Register file and ideal memory:
- The CLK input is a factor ONLY during write operation
- During read operation, behave as combinational logic:
- Address valid => Output valid after “access time.”
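[Figure: abstract datapath. PC (clocked) drives the Instruction Address of the Ideal Instruction Memory; the Instruction supplies Rd, Rs, Rt (5 bits each) and Imm (16 bits); these feed the register file of 32 32-bit registers (Rw, Ra, Rb; clocked), then the ALU, then the Ideal Data Memory (Data Address, Data In, DataOut; clocked)]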
Critical Path (Load Operation) = PC’s prop time + Instruction Memory’s Access Time + Register File’s Access Time + ALU to Perform a 32-bit Add + Data Memory Access Time + Setup Time for Register File Write + Clock Skew
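As a worked example of the sum above, here is a small Python sketch that adds up the terms of the load critical path. Every delay value is hypothetical (in picoseconds), and the dictionary keys are illustrative names only.

```python
# Hypothetical component delays in picoseconds, one entry per term of the sum.
load_path = {
    "pc_prop_time": 100,
    "instruction_memory_access": 2000,
    "register_file_access": 1000,
    "alu_32bit_add": 1500,
    "data_memory_access": 2000,
    "register_file_setup": 50,
    "clock_skew": 30,
}

# In a single cycle datapath the clock period must cover the whole path.
critical_path = sum(load_path.values())
print(critical_path)
```

Because every instruction runs in one cycle, this load path sets the cycle time even for instructions that never touch the data memory.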
Here is an example. In terms of Register Transfer Language, this is what the Add instruction needs
to do.
First, you need to fetch the instruction from memory.
Then you perform the actual add operation.
And finally, you need to update the program counter to point to the next instruction.
+1 = 28 min. (Y:08)
ECE4680 Datapath.10 2002-4-
What is RTL: The ADD Instruction
° add rd, rs, rt
- mem[PC] Fetch the instruction from memory
- R[rd] <- R[rs] + R[rt] The ADD operation
- PC <- PC + 4 Calculate the next instruction’s address
Register Transfer Language
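The three RTL steps for add map almost line for line onto code. The sketch below is an assumption-laden model: the machine state is a plain dictionary, and the instruction memory holds already-decoded tuples rather than 32-bit words, purely to keep the example short. The name `step_add` is hypothetical.

```python
MASK32 = 0xFFFFFFFF  # keep every value within 32 bits


def step_add(state):
    """Execute one add instruction following the three RTL steps."""
    # mem[PC]: fetch the instruction from memory
    name, rd, rs, rt = state["imem"][state["PC"]]
    if name != "add":
        raise ValueError("this sketch only handles add")
    # R[rd] <- R[rs] + R[rt]: the ADD operation
    regs = state["R"]
    regs[rd] = (regs[rs] + regs[rt]) & MASK32
    # PC <- PC + 4: calculate the next instruction's address
    state["PC"] = (state["PC"] + 4) & MASK32


state = {"PC": 0, "R": [0] * 32, "imem": {0: ("add", 3, 1, 2)}}
state["R"][1] = 5
state["R"][2] = 7
step_add(state)
print(state["R"][3], state["PC"])
```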
Here is another example.
The load instruction also starts off by fetching the instruction from Instruction Memory.
Then you calculate the memory address, use the address to fetch the data from memory
(Mem(Addr)), and then load the data into the register.
Finally, you need to update the PC to point to the next sequential instruction.
+1 = 29 min (Y:09)
ECE4680 Datapath.11 2002-4-
What is RTL: The Load Instruction
° lw rt, rs, imm
- mem[PC] Fetch the instruction from memory
- Addr <- R[rs] + SignExt(imm16) Calculate the memory address
- R[rt] <- Mem[Addr] Load the data into the register
- PC <- PC + 4 Calculate the next instruction’s address
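The only new ingredient in the load RTL is SignExt(imm16). Here is a minimal Python sketch of the address calculation and the load itself. The function names are hypothetical, memory is modeled as a dictionary keyed by byte address, and the instruction fetch step is left out because it is the same as for add.

```python
MASK32 = 0xFFFFFFFF


def sign_ext16(imm16):
    """Interpret a 16-bit field as a signed value (bit 15 is the sign bit)."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def execute_lw(regs, mem, pc, rt, rs, imm16):
    """Apply the lw RTL steps after the fetch; returns the next PC."""
    addr = (regs[rs] + sign_ext16(imm16)) & MASK32  # Addr <- R[rs] + SignExt(imm16)
    regs[rt] = mem[addr]                            # R[rt] <- Mem[Addr]
    return (pc + 4) & MASK32                        # PC <- PC + 4


regs = [0] * 32
regs[1] = 0x100
mem = {0xFC: 42}
# imm16 = 0xFFFC encodes -4, so the address is 0x100 - 4 = 0xFC.
next_pc = execute_lw(regs, mem, pc=0, rt=2, rs=1, imm16=0xFFFC)
print(regs[2], next_pc)
```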
As far as storage elements are concerned, we will need an N-bit register that is similar to the
D flip-flop I showed you in class.
The significant difference here is that the register will have a Write Enable input.
That is, the content of the register will NOT be updated if Write Enable is zero.
The content is updated at the clock tick ONLY if the Write Enable signal is set to 1.
+1 = 31 min. (Y:11)
ECE4680 Datapath.13 2002-4-
Storage Element: Register (p.B22-B25)
° Register
- Similar to the D Flip Flop except
- N-bit input and output
- Write Enable input
- Write Enable:
- 0: Data Out will not change
- 1: Data Out will become Data In
- Array of logical elements (see register file on next 2 slides)
[Figure: N-bit register with inputs Data In (N bits), Write Enable, and Clk, and output Data Out (N bits)]
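The Write Enable behavior can be modeled in a few lines. This is a sketch only: the class name `Register` and the method `clock_tick` are assumptions made for illustration, with one call to `clock_tick` standing for one active clock edge.

```python
class Register:
    """N-bit register with a Write Enable input."""

    def __init__(self, n_bits, value=0):
        self.mask = (1 << n_bits) - 1
        self.data_out = value & self.mask

    def clock_tick(self, data_in, write_enable):
        """Model one clock edge."""
        if write_enable:
            # Write Enable = 1: Data Out becomes Data In
            self.data_out = data_in & self.mask
        # Write Enable = 0: Data Out does not change


reg = Register(n_bits=8)
reg.clock_tick(data_in=0x5A, write_enable=0)
before = reg.data_out          # still the initial value
reg.clock_tick(data_in=0x5A, write_enable=1)
after = reg.data_out           # now holds Data In
print(before, after)
```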
We will also need a register file that consists of 32 32-bit registers with two output busses (busA
and busB) and one input bus.
The register specifiers Ra and Rb select the registers to put on busA and busB respectively.
When Write Enable is 1, the register specifier Rw selects the register to be written via busW.
In our simplified version of the register file, the write operation occurs at the clock tick.
Keep in mind that the clock input is a factor ONLY during the write operation.
During read operation, the register file behaves as a combinational logic block.
That is, if you put a valid value on Ra, then busA will become valid after the register file’s access
time.
Similarly, if you put a valid value on Rb, busB will become valid after the register file’s access time.
In both cases (Ra and Rb), the clock input is not a factor.
+2 = 33 min. (Y:13)
ECE4680 Datapath.14 2002-4-
Storage Element: Register File
° Register File consists of 32 registers:
- Two 32-bit output busses: busA and busB
- One 32-bit input bus: busW
° Register is selected by:
- RA selects the register to put on busA
- RB selects the register to put on busB
- RW selects the register to be written via busW when Write Enable is 1
° Clock input (CLK)
- The CLK input is a factor ONLY during write operation
- During read operation, behaves as a combinational logic block:
- RA or RB valid => busA or busB valid after “access time.”
[Figure: register file of 32 32-bit registers with inputs RW, RA, RB, busW, Write Enable, and Clk, and outputs busA and busB]
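Here is a minimal Python sketch of this register file. The class and method names are hypothetical. Reads are ordinary function calls with no clock involved, which mirrors the combinational read behavior, while writes happen only inside `clock_tick`. The sketch does not model access time, and it treats all 32 registers alike.

```python
class RegisterFile:
    """32 registers of 32 bits, two read busses and one write bus."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        """Combinational read: returns (busA, busB) for specifiers RA and RB."""
        return self.regs[ra], self.regs[rb]

    def clock_tick(self, rw, bus_w, write_enable):
        """Model one clock edge: write busW into register RW if enabled."""
        if write_enable:
            self.regs[rw] = bus_w & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_tick(rw=5, bus_w=123, write_enable=1)
rf.clock_tick(rw=6, bus_w=456, write_enable=0)   # ignored: Write Enable is 0
bus_a, bus_b = rf.read(ra=5, rb=6)
print(bus_a, bus_b)
```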
The last storage element you will need for the datapath is the idealized memory to store your data
and instructions.
This idealized memory block has just one input bus (DataIn) and one output bus (DataOut).
When Write Enable is 0, the address selects the memory word to put on the Data Out bus.
When Write Enable is 1, the address selects the memory word to be written via the DataIn bus at
the next clock tick.
Once again, the clock input is a factor ONLY during the write operation.
During read operation, it behaves as a combinational logic block.
That is, if you put a valid value on the address lines, the output bus DataOut will become valid after
the access time of the memory.
+2 = 35 min. (Y:15)
ECE4680 Datapath.16 2002-4-
Storage Element: Idealized Memory
° Memory (idealized)
- One input bus: Data In
- One output bus: Data Out
° Memory word is selected by:
- Write Enable = 0: address selects the memory word to put on Data Out
- Write Enable = 1: address selects the memory word to be written via the Data In bus
° Clock input (CLK)
- The CLK input is a factor ONLY during write operation
- During read operation, behaves as a combinational logic block:
- Address valid => Data Out valid after “access time.”
[Figure: idealized memory with inputs Address, Data In, Write Enable, and Clk, and output DataOut]
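The idealized memory follows the same pattern as the register file. The sketch below is an assumption for illustration only: the class name `IdealMemory` is hypothetical, storage is a dictionary keyed by address, and locations that were never written read as zero.

```python
class IdealMemory:
    """Idealized memory: one Data In bus, one Data Out bus."""

    def __init__(self):
        self.words = {}

    def read(self, address):
        """Combinational read: the address selects the word on Data Out."""
        return self.words.get(address, 0)   # unwritten words read as 0 (assumption)

    def clock_tick(self, address, data_in, write_enable):
        """Model one clock edge: write Data In at the address if enabled."""
        if write_enable:
            self.words[address] = data_in & 0xFFFFFFFF


dmem = IdealMemory()
dmem.clock_tick(address=0x40, data_in=99, write_enable=1)
dmem.clock_tick(address=0x44, data_in=77, write_enable=0)   # no write
print(dmem.read(0x40), dmem.read(0x44))
```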
Now let’s take a look at the first major component of the datapath: the instruction fetch unit.
The common RTL operations for all instructions are:
(a) Fetch the instruction using the Program Counter (PC) at the beginning of an
instruction’s execution (PC -> Instruction Memory -> Instruction Word).
(b) Then at the end of the instruction’s execution, you need to update the
Program Counter (PC -> Next Address Logic -> PC).
More specifically, you need to increment the PC by 4 if you are executing sequential code.
For Branch and Jump instructions, you need to update the program counter to “something else”
other than plus 4.
I will show you what is inside this Next Address Logic block when we talk about the Branch and
Jump instructions.
For now, let’s focus our attention on the Add and Subtract instructions.
+2 = 37 min. (Y:17)
ECE4680 Datapath.17 2002-4-
Overview of the Instruction Fetch Unit (Fig. 5.5)
° The common RTL operations
- Fetch the Instruction: mem[PC]
- Update the program counter:
- Sequential Code: PC <- PC + 4
- Branch and Jump PC <- “something else”
[Figure: instruction fetch unit. PC (clocked) supplies the Address to the Instruction Memory, which outputs the Instruction Word; the Next Address Logic feeds the PC]
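For sequential code, the fetch unit reduces to a simple loop. The Python sketch below covers only the PC + 4 case; the branch and jump cases are deliberately left out because the Next Address Logic has not been defined yet. The function names and the instruction words are made up for illustration.

```python
MASK32 = 0xFFFFFFFF


def fetch(imem, pc):
    """mem[PC]: read the instruction word at the current PC."""
    return imem[pc]


def next_address_sequential(pc):
    """Sequential code only: PC <- PC + 4."""
    return (pc + 4) & MASK32


# Placeholder instruction words, not real encodings.
imem = {0: 0x11111111, 4: 0x22222222, 8: 0x33333333}
pc = 0
trace = []
for _ in range(3):
    trace.append((pc, fetch(imem, pc)))
    pc = next_address_sequential(pc)
print(trace, pc)
```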
The Subtract instruction is also an R-type instruction.
Here we need to subtract the contents of the register specified by Rt from the contents of the
register specified by the Rs field (Rs and Rt of the format diagram).
And then write the results back to the register specified by the Rd field.
+1 = 39 min. (Y:19)
ECE4680 Datapath.19 2002-4-
RTL: The Subtract Instruction
° sub rd, rs, rt
- mem[PC] Fetch the instruction from memory
- R[rd] <- R[rs] - R[rt] The actual operation
- PC <- PC + 4 Calculate the next instruction’s address
R-type: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
And here is the datapath that can do the trick.
First of all, we connect the register file’s Ra, Rb, and Rw inputs to the Rs, Rt, and Rd fields of the
instruction bus, respectively (points to the format diagram).
Then we need to connect busA and busB of the register file to the ALU.
Finally, we need to connect the output of the ALU to the input bus of the register file.
Conceptually, this is how it works.
The instruction bus coming out of the Instruction memory will set the Ra and Rb to the register
specifiers Rs and Rt.
This causes the register file to put the value of register Rs onto busA and the value of register Rt
onto busB, respectively.
By setting the ALUctr appropriately, the ALU will perform either the Add or the Subtract for us.
The result is then fed back to the register file where the register specifier Rw should already be set
to the instruction bus’s Rd field.
Since the control, which we will design in our next lecture, should have already set the RegWr
signal to 1, the result will be written back to the register file at the next clock tick (points to the Clk
input).
+3 = 42 min. (Y:22)
ECE4680 Datapath.20 2002-4-
Datapath for Register-Register Operations
° R[rd] <- R[rs] op R[rt] Example: add rd, rs, rt
- Ra, Rb, and Rw come from the instruction’s rs, rt, and rd fields
- ALUctr and RegWr: control logic after decoding the instruction
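[Figure: datapath for register-register operations. The instruction’s Rd, Rs, and Rt fields drive Rw, Ra, and Rb of the register file (32 32-bit registers, clocked, with RegWr); busA and busB feed the ALU, which is controlled by ALUctr; the ALU Result returns to the register file on busW]
R-type: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)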
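Putting the pieces together, one clock cycle of this register-register datapath can be sketched in Python as follows. This is an illustration under assumptions: ALUctr is modeled as the strings "add" and "sub" because the real control encoding is not defined until the control lecture, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def alu(bus_a, bus_b, alu_ctr):
    """ALU restricted to the two operations used here."""
    if alu_ctr == "add":
        return (bus_a + bus_b) & MASK32
    if alu_ctr == "sub":
        return (bus_a - bus_b) & MASK32
    raise ValueError("unsupported ALUctr value")


def r_type_cycle(regs, rs, rt, rd, alu_ctr, reg_wr):
    """One cycle of R[rd] <- R[rs] op R[rt]."""
    bus_a, bus_b = regs[rs], regs[rt]      # Ra = rs, Rb = rt (combinational read)
    result = alu(bus_a, bus_b, alu_ctr)    # ALU computes the result
    if reg_wr:                             # written at the clock tick if RegWr = 1
        regs[rd] = result                  # Rw = rd, busW = Result
    return result


regs = [0] * 32
regs[1] = 7
regs[2] = 10
r_type_cycle(regs, rs=1, rt=2, rd=3, alu_ctr="add", reg_wr=1)
r_type_cycle(regs, rs=1, rt=2, rd=4, alu_ctr="sub", reg_wr=1)
print(regs[3], hex(regs[4]))
```

The subtract result wraps around to a 32-bit two's complement value, which is why the model masks every ALU output to 32 bits.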