Understanding Computer Memory: A Single System with Primary, Secondary, and Cache Memory (Study notes, Computer Architecture and Organization)

An overview of computer memory, focusing on the main levels of memory (primary and secondary) and introducing cache memory. The text also discusses virtual memory systems and cache memory management, as well as the classification of memory into random access, sequential access, and associative memory. It further explains the differences between read-write memory (RAM) and read-only memory (ROM), and the concept of a linear view of memory.

Typology: Study notes

Pre 2010

Uploaded on 08/04/2009

koofers-user-xz6
koofers-user-xz6 🇺🇸

Memory
We now discuss the memory subsystem of a computer. Computer memory is organized into
a number of levels, the main levels being primary and secondary memory.
In this course, we shall view computer memory as a single system. At this point, however,
we consider all of the levels of memory. We first mention that some authors discuss tertiary
memory in addition to secondary memory.
Consider a three-level computer memory and, for the moment, disregard cache memory.
The distinguishing feature of main memory is that it can be directly accessed by the CPU.
Secondary memory can be accessed only through Input/Output operations. The main
difference between disk and magnetic tape memory is that a tape has to be located by a
human operator and physically mounted – a matter of minutes rather than the milliseconds
required to access a disk drive.
We should note that disks and magnetic tapes are also Input/Output devices and will be
studied as a part of our coverage of I/O architecture. At this point, we focus on the use of
these I/O devices as a part of the memory structure.
We now consider memory systems by combining disk and magnetic tape memory as
secondary memory and introducing the cache memory.
The study of a virtual memory system focuses on the management of main and secondary
memory by the operating system as a single large address space. This is covered in courses
on Operating Systems. The study of cache memory focuses on the management of a high
speed cache placed between the main memory and the CPU. This is the topic of an advanced
course in computer architecture.
Both virtual memory and cache memory depend on two locality properties of computer
programs: temporal locality and spatial locality. These are summarized by noting that when
a specific address is accessed by a program, that address (temporal locality) and addresses
close to it (spatial locality) will very likely be accessed in the near future.
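
To make the locality argument concrete, here is a minimal sketch of a toy direct-mapped cache. The cache organization, its sizes (`num_lines`, `words_per_line`) and the two access patterns are assumptions chosen only for illustration; they are not part of the course's sample computer.

```python
import random

def hit_rate(addresses, num_lines=64, words_per_line=8):
    """Fraction of accesses served by a toy direct-mapped cache (hypothetical sizes)."""
    lines = [None] * num_lines           # block number currently held by each cache line
    hits = 0
    for addr in addresses:
        block = addr // words_per_line   # neighbouring addresses share a block
        index = block % num_lines        # each block can live in exactly one line
        if lines[index] == block:
            hits += 1                    # the block was loaded by an earlier access
        else:
            lines[index] = block         # miss: load the whole block
    return hits / len(addresses)

# A program walking through consecutive addresses (strong locality).
sequential = list(range(4096))

# Addresses scattered over a large address space (little locality).
rng = random.Random(0)
scattered = [rng.randrange(1 << 20) for _ in range(4096)]

print(hit_rate(sequential))   # 0.875: one miss per 8-word block, then 7 hits
print(hit_rate(scattered))    # close to 0: almost every access misses
```

The sequential walk misses once per block and then hits on the remaining words of that block, which is the behaviour that makes a small cache worthwhile.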

For the purpose of this course, we consider only main memory and view it as being directly accessed by the CPU. This gives rise to a simpler model that can be elaborated later. Note that this is the view of memory as seen by the assembly language programmer. A main memory with an attached cache can be modeled as a faster main memory.

Memory can be classified into three major types: random access memory, sequential access memory, and associative memory.

Magnetic tape is a good example of sequential access memory and displays its main characteristic: to access a specific item, the program must first access every item that precedes it in some sequential order.

Random access memory is defined by the characteristic that any addressable item can be accessed in a time that is independent of its address. Main memory is a good example of random access memory.

Disk memory shares some characteristics of both random access and sequential access memory. It is not random access memory, because the time to access a block does depend on the block's address; but it is not a sequential access device either, because a block can be accessed by moving directly to the appropriate cylinder and track.

Random access memory involves access by address. Associative memory involves access by content; i.e., a bit pattern is sent to all words in the memory and a signal is raised to indicate each word that contains the pattern. Associative memory can be considered as memory that can be searched in one step rather than by examination of each memory word, one at a time. In time complexity notation, a linear search takes O(N) time, a binary search takes O(log N) time, and an associative memory search takes O(1) time. We shall examine associative memory in more detail and actually design a small sample memory.
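
The three search costs can be illustrated with a short sketch that counts steps. The function names and the step-counting convention are assumptions made for this example; in particular, `associative_search` only models the idea that every word compares itself with the pattern at the same time, since software has to loop where the hardware works in parallel.

```python
def linear_search(words, pattern):
    """Examine the words one at a time; returns (index, probes)."""
    for probes, word in enumerate(words, start=1):
        if word == pattern:
            return probes - 1, probes
    return None, len(words)

def binary_search(sorted_words, pattern):
    """Halve the search range on every probe; the words must be sorted."""
    lo, hi, probes = 0, len(sorted_words) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_words[mid] == pattern:
            return mid, probes
        if sorted_words[mid] < pattern:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, probes

def associative_search(words, pattern):
    """Model of access by content: one match flag per word, counted as one step."""
    match_flags = [int(word == pattern) for word in words]
    return match_flags, 1

words = list(range(0, 2048, 2))                  # 1024 sorted words
print(linear_search(words, 2046)[1])             # 1024 probes: O(N)
print(binary_search(words, 2046)[1])             # at most 11 probes: O(log N)
print(associative_search(words, 2046)[1])        # 1 step: O(1)
```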
Random access memory has two main variants: Read Write Memory (called RAM) and Read Only Memory (called ROM). The usage of the term "RAM" for the type of random access memory that might well be called "RWM" has a long history and will be continued in this course. The basic reason is probably that the terms "RAM" and "ROM" can easily be pronounced; try pronouncing "RWM". Keep in mind that both RAM and ROM are random access memory.

Of course, there is no such thing as a pure Read-Only memory; at some time it must be possible to put data in the memory by writing to it, otherwise there will be no data in the memory to be read. The term "Read-Only" usually refers to the method for access by the CPU. All variants of ROM share the feature that their contents cannot be changed by normal CPU write operations. All variants of RAM (really Read-Write Memory) share the feature that their contents can be changed by normal CPU write operations. We shall discuss several variants of ROM when we return to the topic after discussion of general memory issues.

Registers associated with the memory system

All memory types, both RAM and ROM, can be characterized by two registers and a number of control signals. Consider a memory of 2^N words, each having M bits. Then

  1. the MAR (Memory Address Register) is an N-bit register used to specify the memory address;
  2. the MBR (Memory Buffer Register) is an M-bit register used to hold data to be written to the memory or just read from the memory. This register is also called the MDR (Memory Data Register).

In our sample computer, both the MAR and MBR are 16-bit registers. This choice is perhaps a bit unfortunate in that it can cause some confusion about the purpose of certain registers, but that is the design we study. This author is trying to evolve a better design.

We specify the control signals to the memory unit by recalling what we need the unit to do. First consider RAM (Read Write Memory). From the viewpoint of the CPU there are three tasks for the memory:

  1. CPU reads data from the memory. Memory contents are not changed.
  2. CPU writes data to the memory. Memory contents are updated.
  3. CPU does not access the memory. Memory contents are not changed.

We need two control signals to specify the three options for a RAM unit. One standard set is

  1. Select: the memory unit is selected.
  2. R/W: if 0 the CPU writes to memory, if 1 the CPU reads from memory.

We can use a truth table to specify the actions for a RAM. Note that when Select = 0, nothing is happening to the memory. It is not being accessed by the CPU and the contents do not change. When Select = 1, the memory is active and something happens.

Consider now a ROM (Read Only Memory). From the viewpoint of the CPU there are only two tasks for the memory:

  1. CPU reads data from the memory.
  2. CPU does not access the memory.

We need only one control signal to specify these two options. The natural choice is the Select control signal, as the R/W signal does not make sense if the memory cannot be written by the CPU.
The truth table for the ROM should be obvious.

  Select   Action
  0        CPU is not accessing the memory.
  1        CPU reads data from the memory.

The truth table for the RAM is as follows.

  Select   R/W   Action
  0        0     Memory contents are not changed.
  0        1     Memory contents are not changed.
  1        0     CPU writes data to the memory.
  1        1     CPU reads data from the memory.
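
The two truth tables can be expressed as a small behavioural model. This is a sketch under stated assumptions: the class names `RAM` and `ROM`, the `cycle` method and the returned strings are hypothetical, chosen to mirror the tables rather than any real hardware interface.

```python
class RAM:
    """Behavioural model of a 2**N-word, M-bit RAM with MAR, MBR, Select and R/W."""

    def __init__(self, address_bits, word_bits):
        self.addr_mask = (1 << address_bits) - 1
        self.word_mask = (1 << word_bits) - 1
        self.cells = [0] * (1 << address_bits)
        self.mar = 0   # Memory Address Register
        self.mbr = 0   # Memory Buffer Register

    def cycle(self, select, read_write):
        if select == 0:                        # rows (0,0) and (0,1) of the table
            return "Memory contents are not changed."
        address = self.mar & self.addr_mask
        if read_write == 1:                    # row (1,1): read
            self.mbr = self.cells[address]
            return "CPU reads data from the memory."
        self.cells[address] = self.mbr & self.word_mask   # row (1,0): write
        return "CPU writes data to the memory."


class ROM:
    """Same registers, but only a Select signal; contents are fixed at construction."""

    def __init__(self, contents):
        self.cells = list(contents)
        self.mar = 0
        self.mbr = 0

    def cycle(self, select):
        if select == 0:
            return "CPU is not accessing the memory."
        self.mbr = self.cells[self.mar % len(self.cells)]
        return "CPU reads data from the memory."


ram = RAM(address_bits=4, word_bits=8)
ram.mar, ram.mbr = 3, 0xAB
print(ram.cycle(select=1, read_write=0))   # write 0xAB to word 3
ram.mbr = 0
print(ram.cycle(select=0, read_write=1))   # not selected: MBR stays 0
print(ram.cycle(select=1, read_write=1))   # read word 3 back into the MBR
print(hex(ram.mbr))                        # 0xab
```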

The Linear View of Memory

The linear view of memory is a way to think logically about memory that has the advantage of being rather simple. It is not the way that memory is actually organized, but it is a start in understanding the true organization.

Consider a memory in which an M-bit word is the smallest addressable unit. For simplicity, we assume that the memory contains N = 2^K words and that the address space is also N = 2^K. The memory can be viewed as a one-dimensional array, declared something like

  Memory : Array [0 .. (N - 1)] of M-bit word.

Here is a diagram illustrating this memory. Notice that the memory select control signal (Select) is connected to the Enable input of the Decoder. When Select = 0, none of the memory words are selected. When Select = 1, the word with address indicated by the code in the MAR is selected.

For K = 16, we have a 16-to-65536 decoder. If this could be built, it would be about the size of a car battery and consume a few hundred watts. This is not a physical model. After we study the construction of memory cells from D flip-flops and JK flip-flops, we shall show a way to make two 8-to-256 decoders function as a 16-to-65536 decoder. We might say that 8-to-256 decoders stretch the technology, but they are not ridiculous.

Memory Technologies

We now discuss technologies used to store binary information. The first topic is to make a list of requirements for devices used to implement binary memory.

  1. Two well defined and distinct states.
  2. The device must be able to switch states reliably.
  3. The probability of a spontaneous state transition must be extremely low.
  4. State switching must be as fast as possible.
  5. The device must be small and cheap so that large capacity memories are practical.

There are a number of memory technologies that were developed in the last half of the twentieth century. Most of these are now obsolete. There are three that are worth mention:

  1. Core Memory (now obsolete)
  2. Static RAM
  3. Dynamic RAM

Basic Memory Cell

We now examine two designs of a memory cell based on JK flip-flops. The first circuit is basically a copy of the figure in the textbook.

There is only one problem with the circuit at left: the outputs of the cells must be combined using a large OR gate. In general, there are too many memory cells to be connected in this fashion. The circuit at right uses a tri-state buffer on its output. This allows the outputs of any number of memory cells to be connected directly. Designs using this circuit are simpler.

In this course, we shall assume that all memory cells have tri-state buffers on the output. The symbol at the left will be used to represent a simple memory cell; remember that it stores a single bit: 0 or 1. All memory can be viewed as being constructed from these memory cells. From our viewpoint, a DRAM cell is functionally equivalent to one of these memory cells, so we will not pay much more attention to DRAM.

We now look at two ways to build a small memory of four 3-bit words. The first method appears simpler, but the second method scales well to larger memories and is preferred.

The main difficulty with the first design is the address decoder. A 4-word memory requires only a 2-to-4 decoder, easily an MSI chip. Consider, however, a 1024-word memory, which would require a 10-to-1024 decoder, probably not a feasible design. A 64K-word memory would require a 16-to-65536 decoder, which is ridiculous to contemplate.

A more feasible way to create a 2^N by M bit memory is to use M single-bit memory chips, each holding 2^N addressable bits. For example, a 64K-byte memory would use 8 memory chips, each with 65536 single bits. What makes this preferable is the addressing scheme made possible by use of a 2^N by 1 chip: a sort of 2-dimensional addressing involving the use of two decoders. For example, a 65536 by 1 memory could be addressed using two 8-to-256 decoders, which probably can be built with current technology.
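
The decoder and the 2-dimensional addressing scheme can be sketched as follows. The function names are hypothetical, and the choice of the high 8 address bits for the row decoder and the low 8 bits for the column decoder is an assumption made for illustration; the text does not say which half drives which decoder.

```python
def decoder(code, n_inputs, enable=1):
    """n-to-2**n decoder: a one-hot list of outputs, all 0 when Enable = 0."""
    return [int(enable == 1 and line == code) for line in range(1 << n_inputs)]

def split_address(address):
    """Split a 16-bit address into two 8-bit halves (assumed: high = row, low = column)."""
    return (address >> 8) & 0xFF, address & 0xFF

def cell_selected(address, row, col, enable=1):
    """A cell of the 65536 x 1 chip is selected when its row AND column lines are both 1."""
    r, c = split_address(address)
    return decoder(r, 8, enable)[row] & decoder(c, 8, enable)[col]

print(decoder(2, 2))                        # [0, 0, 1, 0]: the 2-to-4 decoder of the small memory
print(decoder(2, 2, enable=0))              # [0, 0, 0, 0]: Select = 0, no word selected
print(split_address(0x1234))                # (18, 52), i.e. row 0x12, column 0x34
print(cell_selected(0x1234, 0x12, 0x34))    # 1
print(cell_selected(0x1234, 0x12, 0x35))    # 0
```

Two 8-to-256 decoders drive 256 + 256 = 512 select lines, compared with the 65536 output lines of a single 16-to-65536 decoder.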

Here is the memory, with the MBR bits labeled Y2 Y1 Y0.

As a final step in the consideration of the use of single-bit chips to create a main memory, we consider a memory of 32 3-bit words created from six of the 16 x 1 memory chips. The 5-bit address A4A3A2A1A0 selects 32 words with addresses in the range 0 through 31 inclusive. When A4 = 0, the four lower order bits A3A2A1A0 select words with address in the range 0 through 15 inclusive. When A4 = 1, the four lower order bits A3A2A1A0 select words with address in the range 16 through 31 inclusive.

This design can be considered as having two address banks, one set of three chips for the upper 16 addresses and one set of three chips for the lower 16 addresses. The enable input is passed to AND gates that enable each of the two address banks to allow for the possibility of addressing neither bank.

Another memory design, interleaved memory, uses the lower bits to select the memory bank. In this design the four bits A4A3A2A1 would be passed to each of the memory banks and the low bit A0 used to select the bank. The advantage of this design is to assign adjacent memory words to different memory banks; this can result in a speed advantage in accessing memory.
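
The two bank-selection schemes differ only in which address bit picks the bank. A minimal sketch for the 32-word memory follows; the function names are hypothetical, and each returns a (bank, word-within-bank) pair.

```python
def high_order_bank(address):
    """A4 selects the bank; A3A2A1A0 select the word within the bank."""
    return (address >> 4) & 1, address & 0xF

def interleaved_bank(address):
    """A0 selects the bank; A4A3A2A1 select the word within the bank."""
    return address & 1, (address >> 1) & 0xF

for address in range(4):
    print(address, high_order_bank(address), interleaved_bank(address))
# 0 (0, 0) (0, 0)
# 1 (0, 1) (1, 0)
# 2 (0, 2) (0, 1)
# 3 (0, 3) (1, 1)
```

With high-order selection, addresses 0 through 3 all fall in bank 0; with interleaving, consecutive addresses alternate between the two banks.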

Memory Timings

We now discuss two time measures of memory performance: access time and cycle time. These two times are used to characterize all of random access memory, both RAM and ROM.

Memory access time is the time required for the memory to access the data; more specifically, it is the time between the instant that the memory address is stable in the MAR and the instant that the data are available in the MBR.

Memory cycle time is the minimum time between two independent memory accesses. It should be clear that the cycle time is at least as great as the access time, because the memory cannot process an independent access while it is in the process of placing data in the MBR.

The cycle time is strictly greater than the access time in those memories in which the read operation is destructive. Both the now-obsolete magnetic core memory and modern DRAM have the destructive read-out property. Recall that DRAM uses an electrical device called a capacitor to store the bit. The capacitor can store a charge of electricity: no charge indicates a logic 0 and a positive charge indicates a logic 1. The process of reading the DRAM cell is that of detecting the charge on the capacitor: if a charge is detected the cell was storing a logic 1, otherwise a logic 0. The problem is that the process of detecting the charge also depletes the charge, so the memory chip must rewrite the cell after it is read in order to restore its contents.

For memory technologies with destructive read-out, the cycle time is approximately the sum of the access time and the rewrite time. For other technologies, the two times are about the same.

Read-Only Memory

We now study ROM and a number of its variants: PROM and EPROM. Again, we are caught in a nomenclature trap: the three big classes of ROM are ROM, PROM, and EPROM. ROM, as a technology as opposed to a class of memory, refers to memory in which the contents are fixed at the time of manufacture. This is also called mask-programmed ROM.
To understand this type of ROM, we need to understand two simple electrical circuit elements: the diode and the fuse. The diode, drawn as an arrow (the left figure in the diagram), restricts electrical current to flow in one direction; actually it offers very low resistance in one direction and very high resistance in the other direction. The fuse is exactly what one would expect: when "blown" it will not conduct electricity and when not blown it conducts very well.

To clarify the design of ROM we show a two-bit ROM storing the binary bits 10. The horizontal line is the input line; it carries a voltage (logic 1) when selected. The left cell of the two shows a diode connecting the input line to the output line; the output line gets a voltage and represents a logic 1. The right cell of the two shows both a diode and a blown fuse. There is no electrical connection between the input line and this output line, so the voltage will not be asserted and the output is taken as a logic 0.
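
The two-bit ROM can be modelled by recording, for each output line, whether the diode path from the input line is still connected. The function name and the list-of-booleans representation are assumptions made for this sketch.

```python
def read_rom_word(select, path_connected):
    """Output bit i is 1 only if the word is selected and cell i still conducts."""
    return [int(select == 1 and connected) for connected in path_connected]

# Left cell: diode only (conducts). Right cell: diode plus blown fuse (open circuit).
two_bit_rom = [True, False]

print(read_rom_word(1, two_bit_rom))   # [1, 0]: the stored bits 10
print(read_rom_word(0, two_bit_rom))   # [0, 0]: input line not selected, no voltage
```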

Suppose more than one of the words contains a match. The solution to this is to have some sort of priority arbitration scheme, called a multiple match resolver. Here is one circuit for a four word multiple match resolver (MMR).

This variant of the MMR gives priority to X0 over X1, X2, and X3; X1 priority over X2 and X3; and X2 priority over X3. When X0 = 1 we have Y0 = 1 and Y1 = Y2 = Y3 = 0 without regard to X1, X2, or X3. If X0 = 0 and X1 = 1, then Y1 = 1 and Y2 = Y3 = 0 without regard to either X2 or X3. The story for X2 is similar. Note that this is just one of many possible priority arbitration schemes. In many applications of associative memory, the circuit is designed to avoid multiple matches.

The word select register can be used in conjunction with the MMR. The word select register contains one bit for each word in the associative memory; it is set to 1 if the word is a candidate for matching and 0 if not. Suppose we have a four word AM and are searching for a specific pattern. Suppose we use the above MMR scheme and that Y1 = 1 is the first match. At that point we know that word 1 matches and that words 2 and 3 might match. Bit 1 of the word select register is set to 0 and the match is tried again. This time, the match on word 1 will be ignored and any match on word 2 or 3 will be discovered.

Consider the following example with data register D3D2D1D0 = 0011 and match register M3M2M1M0 = 1100, indicating that only bits 3 and 2 are to be matched. The temp register indicates that words 1, 4, and 5 match the pattern. The word select register indicates that matches are sought only in words 1, 3, and 4. The results register is the logical AND of the temp and WS registers, showing matches in words 1 and 4. These two numbers are passed to the MMR for resolution.

This is all we shall do on the study of memory.
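
The priority scheme and the use of the match and word select registers can be sketched in a few lines. The function names and the four stored words below are hypothetical; the data and match register values are the ones from the example (0011 and 1100), and the MMR implements Y_i = X_i AND NOT (X_0 OR ... OR X_(i-1)).

```python
def multiple_match_resolver(x):
    """Pass through only the first 1 in x; lower-numbered words have priority."""
    y, seen = [], 0
    for bit in x:
        y.append(bit & (1 - seen))   # Y_i = X_i AND NOT(any earlier X)
        seen |= bit
    return y

def masked_match(word, data, mask):
    """True when word agrees with data in every bit position where mask is 1."""
    return ((word ^ data) & mask) == 0

def search(words, data, mask, word_select):
    temp = [int(masked_match(w, data, mask)) for w in words]   # temp register
    results = [t & s for t, s in zip(temp, word_select)]       # temp AND WS
    return multiple_match_resolver(results)

words = [0b0100, 0b0001, 0b0010, 0b1000]    # hypothetical contents of a 4-word AM
data, mask = 0b0011, 0b1100                 # match only bits 3 and 2 (both must be 0)

print(search(words, data, mask, [1, 1, 1, 1]))   # [0, 1, 0, 0]: word 1 is the first match
print(search(words, data, mask, [1, 0, 1, 1]))   # [0, 0, 1, 0]: word 1 masked, word 2 found
```

Clearing bit 1 of the word select register and searching again is how the remaining matches are discovered one at a time.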