NC State University ECE 406 Spring 2008
ECE Department Design of Complex Digital Systems W. Rhett Davis
Project #2: Cache Design for the LC3 Microcontroller

1. Introduction

The intent of this lab is to improve the LC3 microcontroller you have designed in Project #1. Specifically, you need to create a unified cache. The detailed specification of the cache is described below.

2. Learning Objectives

  • Complete a cache design that includes a single level-1 cache for both data and instructions.
  • Implement complex message communication mechanisms using Responsive Handshaking.

3. Project Report

You are expected to turn in a report after the end of this project. Follow the project report format given on the Laboratories page on the course web-site. Be sure to include all items listed in that report format for full credit.

4. Wolfware Submission

You also need to submit your Verilog code electronically through Wolfware as proj2.v. This file should contain five modules called CacheController, ProcInterface, MemInterface, ValidArray, and CacheData, as described below. It may use the `include directive to include other files, if you wish, but they must also be submitted with Wolfware. Your code must successfully execute with the test-bench provided on the course web-site (called proj2test.v) and give the correct expected output as listed in the test-bench. The memory file (proj2.dat) is also given, along with the expected output of the instruction set simulator (proj2.out). This program is taken from the example given in class. In addition, a second program, which will not be provided, will be used to test your code.

Note also that the LC-3 CPU has been provided for you, in the file lc3.v. In addition, two of the UnifiedCache blocks are provided in the file proj2test.v, including the CacheRAM and the top-level UnifiedCache module (described below).

Finally, please also ensure that the instance name of your CacheRAM (inside the CacheData module) is “cram”. This will aid us in grading the project.

5. Lab Design: Implementation of a unified cache

In actual designs, caches are often kept quite small so that their access times are reduced. In this part of the lab, you will implement a realistic cache. The cache specifications are as follows:

  1. The cache hierarchy is 2-level. There are only a level-1 cache and the off-chip memory.
  2. The cache block size is 4 words, so the block offset is 2 bits wide.
  3. The cache is a direct-mapped cache.
  4. The size of the cache is 64 (2^6) words. Therefore, the cache holds at most 16 valid data blocks.
  5. The write policies that you need to implement are write-through and write-allocate. As a result, whenever a write-hit happens, the word is written to both the memory and cache. Whenever a write-miss happens, the corresponding block is fetched onto the chip.
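The address fields implied by these specifications can be checked with a short calculation. The following is a minimal Python sketch (a behavioral aid only, not the Verilog you must submit); the constant and function names are made up for illustration. It assumes the 16-bit LC-3 word address and derives a 2-bit offset, a 4-bit index, and a 10-bit tag.

```python
BLOCK_WORDS = 4    # words per cache block (specification 2)
CACHE_WORDS = 64   # total cache capacity in words (specification 4)
ADDR_BITS = 16     # LC-3 addresses are 16 bits wide

NUM_BLOCKS = CACHE_WORDS // BLOCK_WORDS            # 16 blocks
OFFSET_BITS = (BLOCK_WORDS - 1).bit_length()       # 2 bits
INDEX_BITS = (NUM_BLOCKS - 1).bit_length()         # 4 bits
TAG_BITS = ADDR_BITS - INDEX_BITS - OFFSET_BITS    # 10 bits


def split_address(addr):
    """Return (tag, index, offset) for a 16-bit word address."""
    addr &= (1 << ADDR_BITS) - 1
    offset = addr & (BLOCK_WORDS - 1)                  # addr[1:0]
    index = (addr >> OFFSET_BITS) & (NUM_BLOCKS - 1)   # addr[5:2]
    tag = addr >> (OFFSET_BITS + INDEX_BITS)           # addr[15:6]
    return tag, index, offset
```

For example, address 0x3004 splits into tag 0xC0, index 1, and offset 0.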

5.1 Data transfer specifications

The LC-3 CPU communicates only with the cache. The cache is responsible for transferring data to and from both the off-chip memory and the LC-3 CPU. As a result, whenever the required data or instruction is not in the cache, the cache controller needs to load it from the memory. The LC-3 CPU is not aware of the detailed structure of the cache. Figure 1 illustrates the interface between the LC-3 CPU, the cache, and the off-chip memory. Note that the LC-3 CPU is unchanged from Project #1, except for the addition of a signal called macc (Memory Access). The macc signal is "1" whenever the LC-3 CPU is in the Fetch, Read Memory, Write Memory, or Read Indirect Address states.

Figure 1. Data transfer interface
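The intended behavior of the cache, as seen by the CPU, can be sketched as a small software model. The Python class below is an illustrative assumption, not part of the provided files: it ignores timing and handshaking entirely and models only the direct-mapped lookup, write-through, and write-allocate policies. The class and method names are hypothetical.

```python
class UnifiedCacheModel:
    """Behavioral model only: direct-mapped, write-through, write-allocate."""

    def __init__(self, memory):
        self.memory = memory                  # list of 16-bit words (off-chip memory)
        self.valid = [False] * 16             # one valid bit per block
        self.tags = [0] * 16                  # 10-bit tag per block
        self.blocks = [[0] * 4 for _ in range(16)]

    def _lookup(self, addr):
        tag, index, offset = addr >> 6, (addr >> 2) & 0xF, addr & 0x3
        hit = self.valid[index] and self.tags[index] == tag
        return tag, index, offset, hit

    def _fill(self, addr):
        """Fetch the whole 4-word block containing addr from memory."""
        tag, index, _, _ = self._lookup(addr)
        base = addr & ~0x3                    # first word of the block
        self.blocks[index] = self.memory[base:base + 4]
        self.tags[index] = tag
        self.valid[index] = True

    def read(self, addr):
        """Return (word, hit)."""
        _, index, offset, hit = self._lookup(addr)
        if not hit:
            self._fill(addr)
        return self.blocks[index][offset], hit

    def write(self, addr, value):
        """Write one word; return True on a write-hit."""
        _, index, offset, hit = self._lookup(addr)
        self.memory[addr] = value & 0xFFFF    # write-through: memory is always updated
        if hit:
            self.blocks[index][offset] = value & 0xFFFF
        else:
            self._fill(addr)                  # write-allocate: fetch the updated block
        return hit
```

A model like this can be useful for predicting which accesses in a test program should hit or miss before comparing against simulation output.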

The off-chip memory interface signals are the following:

  • Read Request (rrqst)
  • Read Ready (rrdy)
  • Read Data Ready (rdrdy)
  • Read Data Accept (rdacpt)
  • Write Request (wrqst)
  • Write Accept (wacpt)

The protocols under read-hit, read-miss, write-hit, and write-miss are described below.

  1. Read-hit: the target data should be fetched in one clock cycle
  2. Read-miss:

Figure 2. Signal waveforms for a Read-Miss (traces shown: address and data words, Read Request, Read Ready, Read Data Ready, Read Data Accept)

Figure 5. UnifiedCache Schematic

5.2.1 CacheController

This module is a finite state machine that controls the state of the cache. The state transition diagram is given in Figure 6. The state sequences for the 4 possible accesses are:

  a) read-hit: 0
  b) read-miss: 0 1 2 3 2 3 2 3 2 3 8 0
  c) write-hit: 0 4 5 6 7 8 0
  d) write-miss: 0 4 5 6 7 2 3 2 3 2 3 2 3 8 0
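The four sequences share a common structure, which the following Python sketch makes explicit. It assumes (this is an interpretation, not something the handout states) that each repetition of states 2 and 3 transfers one word of the 4-word block, and that states 4 through 7 perform the write-through to memory. The function name is hypothetical.

```python
WORDS_PER_BLOCK = 4


def state_sequence(is_read, is_hit):
    """Return the list of controller states visited for one access."""
    refill = [2, 3] * WORDS_PER_BLOCK    # assumed: one 2-3 pair per word fetched
    if is_read:
        if is_hit:
            return [0]                   # read-hit never leaves state 0
        return [0, 1] + refill + [8, 0]
    write_through = [0, 4, 5, 6, 7]      # assumed: write the word to memory
    if is_hit:
        return write_through + [8, 0]
    return write_through + refill + [8, 0]
```

Note that the length of a sequence is not a cycle count, since a state may be held for several cycles while it waits for a handshake signal.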

It takes the following control signals as inputs:

  • clock – the system clock shared with the CPU datapath
  • reset – the system reset shared by the entire design including CPU datapath and off-chip memory
  • macc – memory access enable signal from the CPU datapath, with 1 indicating a memory access is required (from the LC-3 CPU)
  • rd – 1 indicates a read access and 0 indicates a write access (from the LC-3 CPU)
  • miss – whether the current memory access is a miss (1) or hit (0) (determined by the CacheData block)
  • rrdy – read ready (from the off-chip memory)
  • rdrdy – read data ready (from the off-chip memory)
  • wacpt – write accept (from the off-chip memory)

The outputs of this module are the current state (state) and a counter (count) that keeps track of the number of words that have been loaded.

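Figure 6: FSM transition graph of the cache controller (states 0 through 8, with branches labeled Read-hit, Read-miss, and Write; transitions depend on reset, macc, hit or miss, rrdy, rdrdy, wacpt, and whether the block read is complete)

5.2.2 ProcInterface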

This block communicates with the LC-3 CPU, setting dout[15:0] to the correct 16-bit word based on the two least-significant address bits and the 64 bits of blockdata from the cache. A register stores dout[15:0], and this value should be updated when there is a read-hit or when a read-miss operation completes. This block should also generate the complete signal when any cache operation completes. A suggested schematic for the ProcInterface block is given in Figure 7.
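The word selection can be sketched in Python as follows. The handout does not state which 16-bit slice of the 64-bit block corresponds to which offset, so this sketch assumes that offset 0 maps to bits [15:0] and offset 3 maps to bits [63:48]; check this against Figure 7 and the provided test-bench before relying on it. The function name is hypothetical.

```python
def select_word(blockdata, addr):
    """Pick the 16-bit word addressed by addr[1:0] out of a 64-bit block.

    Assumption: offset 0 is stored in blockdata[15:0], offset 3 in [63:48].
    """
    offset = addr & 0x3
    return (blockdata >> (16 * offset)) & 0xFFFF
```

For example, with blockdata equal to 0xDDDDCCCCBBBBAAAA, address 0x3002 selects 0xCCCC under this assumption.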


Figure 8: ValidArray Schematic (labeled blocks: validarray, decoder)

5.2.5 CacheData

The CacheData block has an instance of the CacheRAM and takes care of reading and writing to the CacheRAM. The CacheRAM is a 16x74 asynchronous memory that has been provided for you in the file proj2test.v. The address input of the CacheRAM is connected to the current cache index (bits [5:2] of the address from the LC-3 CPU).
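One way to picture a 74-bit CacheRAM entry is as a 10-bit tag stored above 64 bits of block data, which is the layout suggested by the bit ranges labeled in Figure 9. The Python sketch below packs and unpacks such an entry and shows one plausible definition of the miss signal (an access misses unless the entry is valid and its tag matches address bits [15:6]). The function names and the exact miss condition are assumptions for illustration.

```python
TAG_SHIFT = 64
TAG_MASK = 0x3FF                 # 10-bit tag
BLOCK_MASK = (1 << 64) - 1       # 64 bits of block data


def pack_entry(tag, blockdata):
    """Build a 74-bit entry: tag in bits [73:64], data in bits [63:0]."""
    return ((tag & TAG_MASK) << TAG_SHIFT) | (blockdata & BLOCK_MASK)


def unpack_entry(entry):
    """Return (tag, blockdata) from a 74-bit entry."""
    return (entry >> TAG_SHIFT) & TAG_MASK, entry & BLOCK_MASK


def is_miss(entry, valid, addr):
    """Assumed miss logic: hit only if valid and the stored tag matches addr[15:6]."""
    stored_tag, _ = unpack_entry(entry)
    addr_tag = (addr >> 6) & TAG_MASK
    return not (valid and stored_tag == addr_tag)
```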

A suggested schematic for the CacheData block is given in Figure 9. It contains the CacheRAM and four 16-bit block-registers that are used to hold the data to be written into the CacheRAM. The block has the following inputs:

  • clock – The global system clock.
  • state[3:0] – The current CacheController state.
  • count[1:0] – The offset count from the CacheController.
  • rd – The read signal from the LC-3 CPU.
  • valid – The current valid-bit from the ValidArray.
  • addr[15:0] – The current address from the LC-3 CPU.
  • din[15:0] – Data-In from the LC-3 CPU. This value should be written into one of the block-registers at the end of state 4, depending on the value of the offset (bits [1:0] of the address from the LC-3 CPU).
  • offdata[15:0] – Data from the Off-chip Memory. This value should be written into one of the block-registers at the end of state 2, depending on the count from the CacheController.
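The two block-register update rules for din and offdata can be sketched as a small Python model. This is a simplification under stated assumptions: it models only those two rules, it does not model any other data source that the blocksel multiplexers in Figure 9 may select, and it assumes register i holds bits [16i+15:16i] of the block. The class and method names are hypothetical.

```python
class BlockRegisters:
    """Four 16-bit registers holding the block to be written to the CacheRAM."""

    def __init__(self):
        self.regs = [0, 0, 0, 0]

    def clock_edge(self, state, count, addr, din, offdata):
        """Apply the update that happens at the end of the given state."""
        if state == 4:
            # CPU write data goes to the register chosen by addr[1:0].
            self.regs[addr & 0x3] = din & 0xFFFF
        elif state == 2:
            # Off-chip data goes to the register chosen by the word count.
            self.regs[count & 0x3] = offdata & 0xFFFF
        # In all other states the registers hold their values in this model.

    def blockdata(self):
        """Concatenate the registers (assumed: register 0 is bits [15:0])."""
        value = 0
        for i, word in enumerate(self.regs):
            value |= word << (16 * i)
        return value
```

Stepping this model through state 2 with count going from 0 to 3 reproduces the block fill of a read-miss.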

Figure 9: CacheData Schematic (shows the CacheRAM with addr[3:0] driven by address bits [5:2] and a 74-bit data word holding tag[9:0] in bits [73:64], taken from address bits [15:6], and the block data in bits [63:0]; four 16-bit BlockReg instances for bits [63:48], [47:32], [31:16], and [15:0], selected by blocksel0[1:0] through blocksel3[1:0] and fed from offdata[15:0] and din[15:0]; a control-signal generator with inputs state[3:0], count[1:0], rd, valid, and addr[15:0] and outputs blocksel0 through blocksel3, ramrd, and miss; and the blockdata[63:0] output)