A Closer Look at Instruction Set Architectures - Lecture Notes | CPSC 2105, Study notes of Computer Architecture and Organization

Material Type: Notes; Class: Computer Organization; Subject: Computer Science; University: Columbus State University; Term: Unknown 2005;

Uploaded on 08/04/2009

Chapter 5: A Closer Look at Instruction Set Architectures

Here we look at a few design decisions for the instruction set of a computer. Some basic issues for consideration today:

  • Basic instruction set design
  • Fixed–length vs. variable–length instructions
  • Word addressing in a byte–addressable architecture
  • Big–Endian vs. Little–Endian designs
  • Stack machines, accumulator machines, and multi–register machines
  • Modern design principles

NOTE: We shall not discuss expanding opcodes, covered in Section 5.2.5 of the textbook. The discussion is overly complex.

Design Considerations: Instruction Size

The MARIE has a fixed instruction size: 16 bits, divided as follows:

  Bits      15 – 12    11 – 0
  Content   Opcode     Operand address (if used)

Instructions of a fixed size are easier to decode. Moreover, they facilitate prefetching of instructions, in which one instruction is fetched while the previous one is executing.

Instructions with variable size make more efficient use of memory. When memory was costly, this was an important design consideration. For example, the VAX–11/780 had a number of addition instructions:

  • Add register to register
  • Add memory to register
  • Add memory to memory

Suppose an 8–bit opcode, 16 registers (so that 4 bits identify each register), and 32–bit addressing.

Instruction lengths:

  Register to register   8 + 4 + 4 = 16 bits (two bytes)
  Memory to register     8 + 32 + 4 = 44 bits (six bytes, rounding up to whole bytes)
  Memory to memory       8 + 32 + 32 = 72 bits (nine bytes)

With fixed–length instructions, each instruction would have to be at least 9 bytes long, the size of the longest form.
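
The field split and the length arithmetic above can be checked with a short Python sketch. The function names `decode_marie` and `instruction_bytes` are made up for illustration; the field widths are the ones assumed in the text.

```python
from math import ceil

def decode_marie(word):
    """Split a 16-bit MARIE instruction into (opcode, operand address)."""
    opcode = (word >> 12) & 0xF   # bits 15-12
    address = word & 0x0FFF       # bits 11-0
    return opcode, address

def instruction_bytes(*field_bits):
    """Add up the field widths in bits and round up to whole bytes."""
    return ceil(sum(field_bits) / 8)

# Field widths assumed in the text: 8-bit opcode, 4-bit register, 32-bit address.
OPCODE, REG, ADDR = 8, 4, 32

lengths = {
    "register to register": instruction_bytes(OPCODE, REG, REG),
    "memory to register": instruction_bytes(OPCODE, ADDR, REG),
    "memory to memory": instruction_bytes(OPCODE, ADDR, ADDR),
}

# A fixed-length encoding must be as long as the longest form.
fixed_length = max(lengths.values())

for form, size in lengths.items():
    print(f"{form}: {size} bytes")
print(f"fixed length would need {fixed_length} bytes")
```

A fixed–length design pays the nine–byte cost on every instruction, including the register–to–register form that needs only two.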

Word Addressing vs. Byte Addressing

Each architecture devotes a number of bits to addressing memory. The MARIE has a 12–bit memory address.

Question: What is the size of an addressable unit?

In byte–addressable machines, each byte is individually addressable. In word–addressable machines, bytes are not individually addressable.

Advantages of word–addressable designs:

  • The CPU can address a larger memory. In the MARIE it is 2^12 words (not 2^12 bytes).
  • Word addressing is more natural for a number of problems, such as numerical simulations that use a large amount of real–number arithmetic. The CDC–6600 used 60–bit addressable words to store real numbers.

Advantages of byte–addressable designs:

  • Direct access to the bytes for easy manipulation. This is good for any string–oriented processing, such as editing and message passing.

Question: How are multi–byte entities (words, longwords, etc.) handled?
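
To make the "larger memory" point concrete, here is a small Python sketch comparing how much memory a 12–bit address reaches under each scheme. The helper name `addressable_bytes` is hypothetical; the 16–bit (two–byte) word size is the MARIE's.

```python
def addressable_bytes(address_bits, unit_bytes):
    """Total memory reachable: number of addresses times bytes per addressable unit."""
    return (2 ** address_bits) * unit_bytes

# MARIE: 12-bit addresses, each naming a 16-bit (2-byte) word.
word_addressed = addressable_bytes(12, 2)

# The same 12 address bits on a byte-addressable machine.
byte_addressed = addressable_bytes(12, 1)

print(word_addressed, byte_addressed)
```

With the same number of address bits, the word–addressed design reaches twice as many bytes here, because each address names two bytes.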

Word Addressing in a Byte Addressable Machine

Each 8–bit byte has a distinct address.

  • A 16–bit word at address Z contains bytes at addresses Z and Z + 1.
  • A 32–bit word at address Z contains bytes at addresses Z, Z + 1, Z + 2, and Z + 3.

Note that computer architecture refers to addresses, rather than variables. In a high–level programming language we use the term “variable” to indicate the contents of a specific memory address.

Consider the statement Y = X:

  1. Go to the memory address associated with variable X.
  2. Get the contents.
  3. Copy the contents into the address associated with variable Y.
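
The three steps for Y = X can be sketched in Python as a copy between addresses. The symbol table, the addresses 0x100 and 0x104, and the 4–byte word size are all assumptions chosen for illustration.

```python
WORD_BYTES = 4                       # assumed word size: 32 bits
memory = bytearray(0x200)            # a small byte-addressable memory
symbols = {"X": 0x100, "Y": 0x104}   # hypothetical variable-to-address mapping

def word_addresses(z, size=WORD_BYTES):
    """Byte addresses occupied by a word stored at address z."""
    return list(range(z, z + size))

def assign(dst, src):
    """Y = X: find X's address, get the contents, copy them to Y's address."""
    s, d = symbols[src], symbols[dst]
    memory[d:d + WORD_BYTES] = memory[s:s + WORD_BYTES]

memory[0x100:0x104] = bytes([0x10, 0x20, 0x30, 0x40])   # give X some contents
assign("Y", "X")
print([hex(a) for a in word_addresses(symbols["Y"])])
```

At this level there are no variables at all: the assignment is only a transfer of bytes from one set of addresses to another.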

Big–Endian vs. Little–Endian Addressing

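Consider the 32–bit value 0x01020304 stored at address Z. The two conventions place its four bytes in opposite orders:

  Address   Big–Endian   Little–Endian
  Z         01           04
  Z + 1     02           03
  Z + 2     03           02
  Z + 3     04           01

Big–endian stores the most significant byte at the lowest address; little–endian stores the least significant byte there. Note that, within the byte, the hexadecimal digits are never reversed.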
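
The byte layout in the table can be reproduced with Python's built–in `int.to_bytes`. This is only an illustrative sketch; the value 0x01020304 is the one implied by the table's byte contents.

```python
value = 0x01020304

# Lay the 32-bit value out as four bytes in each byte order.
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

print("Address  Big  Little")
for offset in range(4):
    # Each byte prints as two hex digits; the digits within a byte are never swapped.
    print(f"Z + {offset}    {big[offset]:02X}   {little[offset]:02X}")
```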

Words and Longwords in Byte–Addressable Memory: Example

You are given the following memory map for a byte–addressable memory. All values are given in hexadecimal.

  Address    FC   FD   FE   FF   100  101  102  103  104
  Contents   99   88   66   66   10   20   30   40   55

Each addressable unit contains one byte, expressed as two hexadecimal digits. A 32–bit longword occupies four bytes.

Question: What is the 32–bit longword associated with address 0x100?

Answer: It occupies addresses 0x100, 0x101, 0x102, and 0x103.

  Big–Endian: The value, in hexadecimal, is 0x1020 3040.
  Little–Endian: The value, in hexadecimal, is 0x4030 2010.

Note: In little–endian, we read the bytes backwards. We do not read the hexadecimal digits backwards.
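
The answer can be checked against the memory map with a short Python sketch. The dictionary `memory` and the helper `read` are illustrative names, not part of the notes; `int.from_bytes` is a standard Python built–in.

```python
# The memory map from the example; addresses and contents are in hexadecimal.
memory = {
    0xFC: 0x99, 0xFD: 0x88, 0xFE: 0x66, 0xFF: 0x66,
    0x100: 0x10, 0x101: 0x20, 0x102: 0x30, 0x103: 0x40, 0x104: 0x55,
}

def read(address, size, byteorder):
    """Read `size` consecutive bytes starting at `address` and combine them."""
    raw = bytes(memory[address + i] for i in range(size))
    return int.from_bytes(raw, byteorder)

print(hex(read(0x100, 4, "big")))
print(hex(read(0x100, 4, "little")))
```

The same four bytes are read in both cases; only the significance assigned to each address changes.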

Words and Longwords in Byte–Addressable Memory: Example

You are given the following memory map for a byte–addressable memory.

  Address    FC   FD   FE   FF   100  101  102  103  104
  Contents   99   88   66   66   10   20   30   40   55

Question: What is the 16–bit word associated with address 0x100?

Answer: It occupies addresses 0x100 and 0x101.

  Big–Endian: The value, in hexadecimal, is 0x1020.
  In decimal: 1×16^3 + 0×16^2 + 2×16^1 + 0×16^0 = 4096 + 32 = 4128.

  Little–Endian: The value, in hexadecimal, is 0x2010.
  In decimal: 2×16^3 + 0×16^2 + 1×16^1 + 0×16^0 = 8192 + 16 = 8208.

One Classification of Architectures

How do we handle the operands? Consider a simple addition, specifically C = A + B.

Stack architecture: in this design, all operands are found on a stack. These machines have good code density (make good use of memory), but have problems with access. Typical instructions would include:

  Push X    // Push the value at address X onto the top of stack
  Pop Y     // Pop the top of stack into address Y
  Add       // Pop the top two values, add them, & push the result

Program implementation of C = A + B:

  Push A
  Push B
  Add
  Pop C
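
The behavior of these three instructions can be sketched as a tiny interpreter in Python. This is a minimal illustration under assumed conventions (memory as a dictionary keyed by variable name, instructions as strings), not a model of any real stack machine.

```python
def run_stack_program(program, memory):
    """Interpret Push/Pop/Add instructions against a name-to-value memory."""
    stack = []
    for instruction in program:
        op, *operand = instruction.split()
        if op == "Push":
            stack.append(memory[operand[0]])   # push the value at address X
        elif op == "Pop":
            memory[operand[0]] = stack.pop()   # pop top of stack into address Y
        elif op == "Add":
            right = stack.pop()                # pop the top two values,
            left = stack.pop()
            stack.append(left + right)         # add them, and push the result
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return memory

memory = {"A": 2, "B": 3, "C": 0}   # sample values, chosen for illustration
run_stack_program(["Push A", "Push B", "Add", "Pop C"], memory)
print(memory["C"])
```

Note that Add names no operands at all, which is where the good code density comes from.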

General Purpose Register Architectures

These have a number of general purpose registers, normally identified by number. The number of registers is often a power of 2: 8, 16, or 32 being common. (The Intel architecture is different: its four main general purpose registers are named EAX, EBX, ECX, and EDX, for historical reasons.)

The names of the registers often follow an assembly language notation designed to differentiate register names from variable names. An architecture with eight general purpose registers might name them %R0, %R1, …, %R7. The prefix “%” here indicates to the assembler that we are referring to a register, not to a variable that has a name such as R0. The latter name would be poor coding practice.

Designers might choose to have register %R0 identically set to 0. Having this constant register considerably simplifies a number of circuits in the CPU control unit. We shall return to this %R0 ≡ 0 when discussing addressing modes.
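
One way to picture a constant %R0 is a register file in which register 0 always reads as 0. The Python sketch below assumes that writes to %R0 are silently discarded; the notes do not say how a design would treat such writes, so that behavior and the class name `RegisterFile` are assumptions.

```python
class RegisterFile:
    """Eight general purpose registers, with register 0 fixed at 0."""

    def __init__(self, count=8):
        self._values = [0] * count

    def read(self, number):
        return self._values[number]

    def write(self, number, value):
        # Assumption: a write to %R0 has no effect, so it stays 0.
        if number != 0:
            self._values[number] = value

regs = RegisterFile()
regs.write(0, 99)   # ignored
regs.write(3, 7)
print(regs.read(0), regs.read(3))
```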

General Purpose Registers with Load–Store

A Load–Store architecture is one with a number of general purpose registers in which the only memory references are:

  1. Loading a register from memory
  2. Storing a register to memory

The realization of our programming statement C = A + B might be something like

  Load %R1, A           // Load memory location A contents into register 1
  Load %R2, B           // Load register 2 from memory location B
  Add %R3, %R1, %R2     // Add contents of registers %R1 and %R2
                        // Place results into register %R3
  Store %R3, C          // Store register 3 into memory location C
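
The four–instruction sequence can be traced with a small Python sketch. The tuple encoding of instructions and the function name `run_load_store` are assumptions made for illustration; only Load and Store touch memory, and Add works on registers alone.

```python
def run_load_store(program, memory, register_count=8):
    """Interpret Load/Store/Add; returns the final register contents."""
    regs = [0] * register_count
    for op, *args in program:
        if op == "Load":
            reg, address = args
            regs[reg] = memory[address]          # memory -> register
        elif op == "Store":
            reg, address = args
            memory[address] = regs[reg]          # register -> memory
        elif op == "Add":
            dst, src1, src2 = args
            regs[dst] = regs[src1] + regs[src2]  # registers only, no memory access
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs

memory = {"A": 2, "B": 3, "C": 0}   # sample values, chosen for illustration
program = [
    ("Load", 1, "A"),
    ("Load", 2, "B"),
    ("Add", 3, 1, 2),
    ("Store", 3, "C"),
]
registers = run_load_store(program, memory)
print(memory["C"])
```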

General Purpose Registers: Memory–Memory

In this, there are no restrictions on the location of operands. Our instruction C = A + B might be encoded simply as

  Add C, A, B

The VAX series supported this mode. The VAX had at least three different addition instructions for each data length:

  • Add register to register
  • Add memory to register
  • Add memory to memory

There were these three for each of the following data types:

  • 8–bit bytes, 16–bit integers, and 32–bit long integers
  • 32–bit floating point numbers and 64–bit floating point numbers

Here we see at least 15 different instructions (3 operand forms × 5 data types) that perform addition. This is complex.
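
The count of 15 is the product of the operand forms and the data types, which a few lines of Python can enumerate. The labels below are descriptive strings taken from the text, not actual VAX mnemonics.

```python
from itertools import product

OPERAND_FORMS = ["register to register", "memory to register", "memory to memory"]
DATA_TYPES = [
    "8-bit byte",
    "16-bit integer",
    "32-bit long integer",
    "32-bit floating point",
    "64-bit floating point",
]

# One addition instruction for every (data type, operand form) pair.
add_variants = [
    f"Add {data_type}, {form}"
    for data_type, form in product(DATA_TYPES, OPERAND_FORMS)
]

print(len(add_variants))
```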

Modern Design Principles: Basic Assumptions

Some assumptions that drive current design practice include:

  1. The fact that most programs are written in high–level compiled languages. Provision of a large general purpose register set greatly facilitates compiler design.
  2. The fact that current CPU clock cycle times (0.25 – 0.50 nanoseconds) are much shorter than memory access times.
  3. The fact that a simpler instruction set implies a smaller control unit, thus freeing chip area for more registers and on–chip cache.
  4. The fact that execution is more efficient when a two–level cache system is implemented on–chip. We have a split L1 cache (with an I–Cache for instructions and a D–Cache for data) and an L2 cache.
  5. The fact that memory is so cheap that it is a commodity item.

The Fetch–Execute Cycle (Again)

This cycle is the logical basis of all stored program computers. Instructions are stored in memory as machine language. Instructions are fetched from memory and then executed.

The common fetch cycle can be expressed in the following control sequence.

  MAR ← PC     // The PC contains the address of the instruction; copy it into the MAR.
  READ         // Read memory at the address in the MAR; the instruction arrives in the MBR.
  IR ← MBR     // Copy the instruction from the MBR into the IR.

This cycle is described in many different ways, most of which serve to highlight additional steps required to execute the instruction. Examples of additional steps are: Decode the Instruction, Fetch the Arguments, Store the Result, etc.

A stored program computer is often called a “von Neumann Machine” after one of the originators of the EDVAC. The necessity of fetching every instruction from memory in this Fetch–Execute cycle slows the computer; this is often called the “von Neumann bottleneck”.
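
The three register transfers of the fetch cycle can be mirrored one for one in a Python sketch. The class `CPU` and the sample instruction words are hypothetical. The PC increment at the end is an added assumption: it is not part of the control sequence given in the notes, but without it a second fetch would return the same instruction.

```python
class CPU:
    """Holds the registers used by the fetch cycle."""

    def __init__(self, memory):
        self.memory = memory   # list of instruction words, indexed by address
        self.pc = 0            # program counter
        self.mar = 0           # memory address register
        self.mbr = 0           # memory buffer register
        self.ir = 0            # instruction register

    def fetch(self):
        self.mar = self.pc                 # MAR <- PC
        self.mbr = self.memory[self.mar]   # READ: memory[MAR] arrives in the MBR
        self.ir = self.mbr                 # IR <- MBR
        self.pc += 1                       # assumption: advance to the next instruction
        return self.ir

cpu = CPU([0x1004, 0x3005, 0x7000])   # arbitrary sample words
print(hex(cpu.fetch()))
```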

Avoiding the Bottleneck

In the simple stored program machine, the following loop is executed.

  Fetch the next instruction
  Loop Until Stop
    Execute the instruction
    Fetch the next instruction
  End Loop

The first attempt to break out of this endless cycle was “instruction prefetch”: fetch the next instruction at the same time the current one is executing. As we can easily see, this concept can be extended.
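
The loop can be written out in Python as a minimal sketch. The two–instruction set (ADD with a constant, and STOP) and the accumulator are invented for illustration. Python runs the two steps in the loop body one after the other; with instruction prefetch, the hardware would overlap them.

```python
def run(program):
    """Fetch-execute loop over a list of (opcode, operand...) tuples."""
    accumulator = 0
    pc = 0

    current = program[pc]              # Fetch the next instruction
    pc += 1
    while current[0] != "STOP":        # Loop Until Stop
        if current[0] == "ADD":        # Execute the instruction
            accumulator += current[1]
        else:
            raise ValueError(f"unknown opcode: {current[0]}")
        current = program[pc]          # Fetch the next instruction
        pc += 1                        # (prefetch would overlap this with execution)
    return accumulator

result = run([("ADD", 2), ("ADD", 3), ("STOP",)])
print(result)
```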