Understanding Memory Hierarchy and Cache Memory in Computing, Study Guides, Projects, Research of Technology

This study guide covers the memory hierarchy in computing systems, with a focus on cache memory. It describes the main types of computer memory, RAM and ROM, and their variants (SRAM, DRAM, PROM, EPROM, EEPROM, and flash memory). It also introduces locality of reference and shows how cache memory exploits it to improve memory access times, then explains how cache memory works, its levels, and its mapping schemes.

Typology: Study Guides, Projects, Research

2021/2022

Uploaded on 09/12/2022

juliant


TYPES OF MEMORY

Q1: Why are there so many different types of computer memory?

  • To keep pace with improvements in CPU speed; otherwise, memory becomes a bottleneck.
  • To support cache memory: a small, high-speed (and thus high-cost) type of memory that serves as a buffer for frequently accessed data.
  • There are only two basic types of memory: RAM (random access memory) and ROM (read-only memory). RAM stores the programs and data that the computer needs when executing programs, but RAM is volatile and loses this information once the power is turned off.
  • Two general types of chips are used to build the bulk of RAM in today’s computers: SRAM and DRAM (static and dynamic random access memory). Dynamic RAM is constructed of tiny capacitors that leak electricity, so DRAM requires a recharge every few milliseconds to maintain its data. Static RAM, in contrast, holds its contents as long as power is available. SRAM consists of circuits similar to D flip-flops.
  • SRAM is faster and much more expensive than DRAM; however, designers use DRAM because it is much denser (it can store many bits per chip), uses less power, and generates less heat than SRAM. For these reasons, both technologies are often used in combination: DRAM for main memory and SRAM for cache.
  • ROM (read-only memory) stores critical information necessary to operate the system, such as the program needed to boot the computer. ROM is not volatile and always retains its data. This type of memory is also used in embedded systems or any system where the programming does not need to change.
  • There are five basic types of ROM: ROM, PROM, EPROM, EEPROM, and flash memory. PROM (programmable read-only memory) is a variation on ROM that can be programmed by the user with the appropriate equipment. Whereas ROMs are hardwired, PROMs have fuses that can be blown to program the chip. Once programmed, the data and instructions in a PROM cannot be changed.
  • EPROM (erasable PROM) is programmable with the added advantage of being reprogrammable (erasing an EPROM requires a special tool that emits ultraviolet light). To reprogram an EPROM, the entire chip must first be erased.
  • EEPROM (electrically erasable PROM) removes many of the disadvantages of EPROM: no special tools are required for erasure (it is performed by applying an electric field), and portions of the chip can be erased one byte at a time.
  • Flash memory is essentially EEPROM with the added benefit that data can be written or erased in blocks, removing the one-byte-at-a-time limitation. This makes flash memory faster than EEPROM.

THE MEMORY HIERARCHY

  • One of the most important considerations in understanding the performance capabilities of a modern processor is the memory hierarchy.
  • Today’s computer systems use a combination of memory types to provide the best performance at the best cost. This approach is called hierarchical memory.
  • By using a hierarchy of memories, each with different access speeds and storage capacities, a computer system can exhibit performance above what would be possible without a combination of the various types. The base types that normally constitute the hierarchical memory system include registers, cache, main memory, and secondary memory.
  • Today’s computers each have a small amount of very high-speed memory, called a cache, where data from frequently used memory locations may be temporarily stored. This cache is connected to a much larger main memory, which is typically a medium-speed memory. This memory is complemented by a very large secondary memory, composed of a hard disk and various removable media.
  • We classify memory based on its “distance” from the processor, with distance measured by the number of machine cycles required for access. The closer memory is to the processor, the faster it should be. As memory gets further from the main processor, we can afford longer access times. Thus, slower technologies are used for these memories, and faster technologies are used for memories closer to the CPU.
  • The following terminology is used when discussing the memory hierarchy:
      - Hit: The requested data resides in a given level of memory (typically, we are concerned with the hit rate only for upper levels of memory).
      - Miss: The requested data is not found in the given level of memory.
      - Hit rate: The percentage of memory accesses found in a given level of memory.
      - Miss rate: The percentage of memory accesses not found in a given level of memory. Note: Miss Rate = 1 - Hit Rate.
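The hit rate and miss rate definitions can be checked with a few lines of Python. This is only a minimal sketch: the function name `hit_and_miss_rate` and the sample counts are illustrative assumptions, not values from this guide.

```python
def hit_and_miss_rate(hits, accesses):
    """Return (hit_rate, miss_rate) as fractions of all accesses to one memory level."""
    if accesses <= 0:
        raise ValueError("accesses must be positive")
    hit_rate = hits / accesses
    miss_rate = 1 - hit_rate  # Miss Rate = 1 - Hit Rate
    return hit_rate, miss_rate


# Hypothetical counts: 950 of 1000 accesses were found in this level.
hit_rate, miss_rate = hit_and_miss_rate(950, 1000)
print(f"hit rate = {hit_rate:.1%}, miss rate = {miss_rate:.1%}")
# prints: hit rate = 95.0%, miss rate = 5.0%
```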

Locality of Reference

Processors tend to access memory in a very patterned way. If memory location X is accessed at time t, there is a high probability that memory location X + 1 will also be accessed in the near future. This clustering of memory references into groups is an example of locality of reference.

When a miss is processed, instead of simply transferring the requested data to a higher level, the entire block containing the data is transferred. Because of locality of reference, it is likely that the additional data in the block will be needed in the near future, and if so, this data can be loaded quickly from the faster memory.

There are three basic forms of locality:

  • Temporal locality: Recently accessed items tend to be accessed again in the near future.
  • Spatial locality: Accesses tend to be clustered in the address space (for example, as in arrays or loops).
  • Sequential locality: Instructions tend to be accessed sequentially.

The locality principle provides the opportunity for a system to use a small amount of very fast memory to effectively accelerate the majority of memory accesses.
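The effect of locality can be illustrated with a toy simulation. The sketch below is not a model of real hardware: it assumes 8-word blocks, a cache that holds only 4 blocks, and a least-recently-used replacement rule, none of which come from this guide. The function name `count_hits` is likewise hypothetical.

```python
import random


def count_hits(addresses, block_size=8, num_blocks=4):
    """Count hits in a tiny cache that keeps the most recently used blocks."""
    cached = []  # block numbers currently held, least recently used first
    hits = 0
    for addr in addresses:
        block = addr // block_size  # block that contains this address
        if block in cached:
            hits += 1
            cached.remove(block)  # will be re-appended as most recently used
        elif len(cached) == num_blocks:
            cached.pop(0)  # cache full: evict the least recently used block
        cached.append(block)
    return hits


# Sequential scan: after each miss, the next 7 words are already in the block.
sequential = list(range(1024))

# Addresses scattered over a large address space show little locality.
rng = random.Random(0)
scattered = [rng.randrange(1 << 20) for _ in range(1024)]

print("sequential hits:", count_hits(sequential), "of", len(sequential))
print("scattered hits: ", count_hits(scattered), "of", len(scattered))
```

With these assumptions, the sequential scan hits on 896 of its 1024 accesses (7 of every 8 words), while the scattered pattern gains almost nothing from the cache.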

CACHE MEMORY

  • A very fast computer processor is constantly reading information from memory, which means it often has to wait for the information to arrive, because the memory access times are slower than the processor speed.
  • Cache memory addresses this problem by storing data that has been accessed, and data that might be accessed, by the CPU in a faster, closer memory.
  • Cache memory in a computer differs from real-life analogies in one important way: the computer has no way to know, a priori, what data is most likely to be accessed, so it uses the locality principle and transfers an entire block from main memory into cache whenever it has to make a main memory access.
  • The size of cache memory can vary enormously. A typical personal computer’s level 2 (L2) cache is 256K or 512K. Level 1 (L1) cache is smaller, typically 8K or 16K. L1 cache resides on the processor, whereas L2 cache resides between the CPU and main memory. L1 cache is, therefore, faster than L2 cache.

  • The purpose of cache is to speed up memory accesses by storing recently used data closer to the CPU, instead of storing it in main memory. Although cache is not as large as main memory, it is considerably faster. Whereas main memory is typically composed of DRAM with, say, a 60 ns access time, cache is typically composed of SRAM, providing faster access with a much shorter cycle time than DRAM (a typical cache access time is 10 ns).
  • Cache is not accessed by address; it is accessed by content. For this reason, cache is sometimes called content-addressable memory (CAM). Under most cache mapping schemes, the cache entries must be checked or searched to see whether the requested value is stored in cache.
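One common way to combine the 10 ns cache and 60 ns main memory figures quoted above is an effective access time, weighted by the hit rate. This guide does not state that formula; the sketch below assumes the simplest model, in which a hit costs only the cache access time and a miss costs only the main memory access time. The function name and the sample hit rates are illustrative.

```python
def effective_access_time(hit_rate, cache_ns=10.0, main_ns=60.0):
    """Weighted average access time: hits cost cache_ns, misses cost main_ns."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ns + (1.0 - hit_rate) * main_ns


# Hypothetical hit rates, to show how quickly the average approaches 10 ns.
for h in (0.50, 0.90, 0.99):
    print(f"hit rate {h:.0%}: {effective_access_time(h):.1f} ns")
# prints:
# hit rate 50%: 35.0 ns
# hit rate 90%: 15.0 ns
# hit rate 99%: 10.5 ns
```

Under this model, even a small cache pays off as long as locality keeps the hit rate high.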

Cache Mapping Schemes

  • To simplify the process of locating the desired data, various cache mapping algorithms are used.
  • When accessing data or instructions, the CPU first generates a main memory address. If the data has been copied to cache, the address of the data in cache is not the same as the main memory address.
  • How does the CPU locate data when it has been copied into cache? The CPU uses a specific mapping scheme that “converts” the main memory address into a cache location.
  • This address conversion is done by giving special significance to the bits in the main memory address. We first divide the bits into distinct groups we call fields. Depending on the mapping scheme, we may have two or three fields.
  • The mapping scheme determines where the data is placed when it is originally copied into cache and also provides a method for the CPU to find previously copied data when searching cache.
  • Main memory and cache are both divided into blocks of the same size. When a memory address is generated, cache is searched first to see if the required word exists there. When the requested word is not found in cache, the entire main memory block in which the word resides is loaded into cache; one missed word often results in several found words.
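To see why one missed word often results in several found words, it helps to work out which words travel together in a block. The following sketch assumes 8-word blocks and word-addressed memory; the helper names `block_number` and `words_in_block` are hypothetical.

```python
BLOCK_SIZE = 8  # words per block (an assumed value for illustration)


def block_number(address):
    """Main memory block that contains the given word address."""
    return address // BLOCK_SIZE


def words_in_block(block):
    """All word addresses that are loaded together with this block."""
    start = block * BLOCK_SIZE
    return list(range(start, start + BLOCK_SIZE))


# A miss on word 21 brings in all of block 2, so words 16..23 become hits.
missed = 21
blk = block_number(missed)
print(blk, words_in_block(blk))
# prints: 2 [16, 17, 18, 19, 20, 21, 22, 23]
```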

So, how do we use fields in the main memory address?

Direct Mapping of Main Memory Blocks to Cache Blocks

If main memory blocks 0 and 10 both map to cache block 0, how does the CPU know which block actually resides in cache block 0 at any given time?

The answer is that each block copied to cache is stored together with a tag that identifies it (see the figure).
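Direct mapping is usually described as a modular rule: main memory block X goes to cache block X mod N, where N is the number of cache blocks. The sketch below assumes N = 10, which is consistent with blocks 0 and 10 sharing cache block 0 but is not stated in this guide. The quotient X // N is used here to play the role of the tag; the function names are hypothetical.

```python
NUM_CACHE_BLOCKS = 10  # assumed so that memory blocks 0 and 10 collide


def cache_block_for(memory_block, num_cache_blocks=NUM_CACHE_BLOCKS):
    """Cache block that a main memory block is placed in (X mod N)."""
    return memory_block % num_cache_blocks


def tag_for(memory_block, num_cache_blocks=NUM_CACHE_BLOCKS):
    """Value that tells apart the memory blocks sharing one cache block."""
    return memory_block // num_cache_blocks


for mb in (0, 10, 20, 3, 13):
    print(f"memory block {mb:2d} -> cache block {cache_block_for(mb)}, tag {tag_for(mb)}")
```

Memory blocks 0, 10, and 20 all land in cache block 0, but their tags (0, 1, and 2) differ, which is how the stored tag tells the CPU which of them is currently resident.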

In the figure, two cache blocks are valid. Block 0 contains multiple words from main memory, identified using the tag “00000000”. Block 1 contains words identified using the tag “11110101”. The other two cache blocks are not valid. To perform direct mapping, the binary main memory address is partitioned into three fields: tag, block, and word (shown in the figure).

The size of each field depends on the physical characteristics of main memory and cache. The word field uniquely identifies a word within a specific block. The same is true of the block field: it must select a unique block of cache. The tag field is whatever is left over. When a block of main memory is copied to cache, this tag is stored with the block and uniquely identifies it.
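The role of the valid bit and the stored tag can be sketched as a tiny direct-mapped cache simulation. The field widths (a 3-bit word field and a 4-bit block field) and the class name `DirectMappedCache` are assumptions chosen for illustration; the sketch tracks only hits and misses and does not store any data.

```python
WORD_BITS = 3   # assumed: 8 words per block
BLOCK_BITS = 4  # assumed: 16 cache blocks


class DirectMappedCache:
    def __init__(self):
        n = 1 << BLOCK_BITS
        self.valid = [False] * n  # one valid bit per cache block
        self.tags = [0] * n       # tag stored alongside each cached block

    def access(self, address):
        """Return True on a hit; on a miss, load the block and return False."""
        block = (address >> WORD_BITS) & ((1 << BLOCK_BITS) - 1)
        tag = address >> (WORD_BITS + BLOCK_BITS)
        if self.valid[block] and self.tags[block] == tag:
            return True
        # Miss: simulate copying the block in and recording its tag.
        self.valid[block] = True
        self.tags[block] = tag
        return False


cache = DirectMappedCache()
print(cache.access(0))    # False: cache block 0 is not yet valid
print(cache.access(5))    # True: word 5 lies in the block just loaded
print(cache.access(128))  # False: same cache block 0, but a different tag
print(cache.access(0))    # False: the earlier block was replaced
```

The last two accesses show the cost of direct mapping: two memory blocks that share a cache block evict each other even when the rest of the cache is empty.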

Consider the following example: Assume memory consists of 2^14 words, cache has 16 blocks, and each block has 8 words. From this we determine that memory has 2^14 / 2^3 = 2^11 blocks. We know that each main memory address requires 14 bits. Of this 14-bit address, the rightmost 3 bits form the word field (we need 3 bits to uniquely identify one of 8 words in a block). We need 4 bits to select a specific block in cache, so the block field consists of the middle 4 bits. The remaining 7 bits make up the tag field: 2^11 / 2^4 = 2^7 = 128 main memory blocks map to each cache block, and 7 bits distinguish among them. The resulting layout, from left to right, is tag (7 bits), block (4 bits), word (3 bits), as illustrated in the figure.
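The field sizes in this example can be recomputed, and a sample address split into its fields, with the short sketch below. The helper names `bits_for` and `split_address` and the sample address are illustrative assumptions.

```python
def bits_for(n):
    """Number of bits needed to select one of n items (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    return n.bit_length() - 1


def split_address(address, block_bits, word_bits):
    """Split a main memory address into (tag, block, word) fields."""
    word = address & ((1 << word_bits) - 1)
    block = (address >> word_bits) & ((1 << block_bits) - 1)
    tag = address >> (word_bits + block_bits)
    return tag, block, word


address_bits = bits_for(2**14)  # memory of 2^14 words
word_bits = bits_for(8)         # 8 words per block
block_bits = bits_for(16)       # 16 cache blocks
tag_bits = address_bits - block_bits - word_bits
print(tag_bits, block_bits, word_bits)
# prints: 7 4 3

# Arbitrary sample address, written as tag_block_word for readability.
sample = 0b0000101_1010_011
print(split_address(sample, block_bits, word_bits))
# prints: (5, 10, 3)
```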