








Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Material Type: Notes; Class: Computer Organization; Subject: Computer Science; University: Columbus State University; Term: Unknown 2005;
Typology: Study notes
1 / 14
This page cannot be seen from the preview
Don't miss anything!









We begin with a number of views of computer memory and comment on their use. The simplest view of memory is that presented at the ISA (Instruction Set Architecture) level. At this level, memory is a monolithic addressable unit. At this level, the memory is a repository for data and instructions, with no internal structure apparent. For some very primitive computers, this is the actual structure. In this view, the CPU issues addresses and control signals. It receives instructions and data from the memory and writes data back to the memory. This is the view that suffices for many high–level language programmers. In no modern architecture does the CPU write instructions to the main memory.
In a course such as this, we want to investigate the internal memory structures that allow for more efficient and secure operations. The logical view for this course is a three–level view with cache memory, main memory, and virtual memory. The primary memory is backed by a “DASD” (Direct Access Storage Device), an external high–capacity device. While “DASD” is a name for a device that meets certain specifications, the standard disk drive is the only device currently in use that “fits the bill”. Thus DASD = Disk. This is the view we shall take when we analyze cache memory.
This lecture covers two related subjects: Virtual Memory and Cache Memory. In each case, we have a fast primary memory backed by a bigger secondary memory. The “actors” in the two cases are as follows: Technology Primary Memory Secondary Memory Block Cache Memory SRAM Cache DRAM Main Memory Cache Line Virtual Memory DRAM Main Memory Disk Memory Page Access Time TP (Primary Time) TS (Secondary Time) Effective Access Time: TE = h TP + (1 – h ) TS, where h (the primary hit rate) is the fraction of memory accesses satisfied by the primary memory; 0.0 h 1.0. This formula does extend to multi–level caches. For example a two–level cache has TE = h 1 T 1 + (1 – h 1 ) h 2 T 2 + (1 – h 1 ) (1 – h 2 ) TS. NOTATION WARNING: In some contexts, the DRAM main memory is called “primary memory”. I never use that terminology when discussing multi–level memory.
Suppose a single cache fronting a main memory, which has 80 nanosecond access time. Suppose the cache memory has access time 10 nanoseconds. If the hit rate is 90%, then TE = 0.9 10.0 + (1 – 0.9) 80. = 0.9 10.0 + 0.1 80.0 = 9.0 + 8.0 = 17.0 nsec. If the hit rate is 99%, then TE = 0.99 10.0 + (1 – 0.99) 80. = 0.99 10.0 + 0.01 80.0 = 9.9 + 0.8 = 10.7 nsec. Suppose a L1 cache with T 1 = 4 nanoseconds and h 1 = 0. Suppose a L2 cache with T 2 = 10 nanoseconds and h 2 = 0. This is defined to be the number of hits on references that are a miss at L1. Suppose a main memory with TS = 80. TE = h 1 T 1 + (1 – h 1 ) h 2 T 2 + (1 – h 1 ) (1 – h 2 ) TS. = 0.90 4.0 + 0.1 0.99 10.0 + 0.1 0.01 80. = 0.90 4.0 + 0.1 9.9 + 0.1 0. = 3.6 + 0.99 + 0.08 = 4.67 nanoseconds. Note that with these hit rates, only 0.1 0.01 = 0.001 = 0.1% of the memory references are handled by the much slower main memory.
Virtual memory allows the program to have a logical address space much larger than the computers physical address space. It maps logical addresses onto physical addresses and moves “pages” of memory between disk and main memory to keep the program running. An address space is the range of addresses, considered as unsigned integers, that can be generated. An N–bit address can access 2 N items, with addresses 0 … 2 N
16–bit address 2 16 items 0 to 65535 20–bit address 2 20 items 0 to 1,048, 32–bit address 2 32 items 0 to 4,294,967, In all modern applications, the physical address space is no larger than the logical address space. It is often somewhat smaller than the logical address space. As examples, we use a number of machines with 32–bit logical address spaces. Machine Physical Memory Logical Address Space VAX–11/780 16 MB 4 GB (4, 096 MB) Pentium (2004) 128 MB 4 GB Desktop Pentium 512 MB 4 GB Server Pentium 4 GB 4 GB NOTE: The MAR structure usually allows the two address spaces to be equal.
A small fast expensive memory is backed by a large, slow, cheap memory. Memory references are first made to the smaller memory.
Associative memory is “content addressable” memory. The contents of the memory are searched in one memory cycle. Consider an array of 256 entries, indexed from 0 to 255 (or 0x0 to 0xFF). Suppose that we are searching the memory for entry 0xAB712. Normal memory would be searched using a standard search algorithm, as learned in beginning programming classes. If the memory is unordered, it would take on average 128 searches to find an item. If the memory is ordered, binary search would find it in 8 searches. Associative memory would find the item in one search. Think of the control circuitry as “broadcasting” the data value (here oxAB712) to all memory cells at the same time. If one of the memory cells has the value, it raises a Boolean flag and the item is found. We do not consider duplicate entries in the associative memory. This can be handled by some rather straightforward circuitry, but is not done in associative caches.
We now focus on cache memory, returning to virtual memory only at the end. Primary memory = Cache Memory (assumed to be one level) Secondary memory = Main DRAM Assume a number of cache lines, each holding 16 bytes. Assume a 24–bit address. The simplest arrangement is an associative cache. It is also the hardest to implement. Divide the 24–bit address into two parts: a 20–bit tag and a 4–bit offset. Bits 23 – 4 3 – 0 Fields Tag Offset A cache line in this arrangement would have the following format. D bit V Bit Tag 16 indexed entries 0 1 0xAB712 M[0xAB7120] … M[0xAB712F] The placement of the 16 byte block of memory into the cache would be determined by a cache line replacement policy. The policy would probably be as follows:
An N–way set–associative cache uses direct mapping, but allows a set of N memory blocks to be stored in the line. This allows some of the flexibility of a fully associative cache, without the complexity of a large associative memory for searching the cache. Suppose a 2–way set–associative implementation of the same cache memory. Again assume 256 cache lines, each holding 16 bytes. Assume a 24–bit address. Recall that 256 = 2 8 , so that we need eight bits to select the cache line. Consider addresses 0xCD4128 and 0xAB7129. Each would be stored in cache line 0x12. Set 0 of this cache line would have one block, and set 1 would have the other. Entry 0 Entry 1 D V Tag Contents D V Tag Contents 1 1 0xCD4 M[0xCD4120] to M[0xCD412F] 0 1 0xAB7 M[0xAB7120] to M[0xAB712F]
Suppose we want to support 32–bit logical addresses in a system in which physical memory is 24–bit addressable. We can follow the primary / secondary memory strategy seen in cache memory. We shall see this again, when we study virtual memory in a later lecture. For now, we just note that the address structure of the disk determines the structure of virtual memory. Each disk stores data in blocks of 512 bytes, called sectors. In some older disks, it is not possible to address each sector directly. This is due to the limitations of older file organization schemes, such as FAT–16. FAT–16 used a 16–bit addressing scheme for disk access. Thus 2 16 sectors could be addressed. Since each sector contained 2 9 bytes, the maximum disk size under “pure FAT–16” is 2 25 bytes = 2 5 2 20 bytes = 32 MB. To allow for larger disks, it was decided that a cluster of 2 K sectors would be the smallest addressable unit. Thus one would get clusters of 1,024 bytes, 2,048 bytes, etc. Virtual memory transfers data in units of clusters, the size of which is system dependent.