MIPS - High Performance Computing - Lecture Slides, Slides of Computer Science

Some concepts of High Performance Computing are Addressing Modes, Program Execution, Basic Computer Organization, Control Hazard Solutions, Least Recently Used, Memory Hierarchy Progression. Main points of this lecture are: MIPS, Function, Main Memory, Program, Instructions, Statically Allocated, Stack Allocated, Heap Allocated, Conditional Branch Instructions, Floating Point.

Typology: Slides

2012/2013

Uploaded on 04/28/2013

dewaan

High Performance Computing
Lecture 9

MIPS 1 Function Call

C source:
    void A() { … B(5); … }
    void B (int x) { int a, b; … return; }

Caller A:
    ADDI R1, R0, 5
    ADDI R29, R29, 4
    SW 0(R29), R
    JAL B

Callee B:
B:  ADDI R29, R29, 4
    SW 0(R29), R
    ADDI R29, R29, 8
    …
    SUBI R29, R29, 16
    LW R31, 8(R29)
    JR R

Function Call/Return Stack, in the order pushed: 5 (int x), Return address, Local int a, Local int b
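
The call sequence above can be traced with a small C model. This is only a sketch under stated assumptions: the stack grows toward higher addresses (as the ADDI pushes suggest), the two stored values are the argument and the return address (the register operands are cut off in the slide), and the names Machine, push_word, call_B and return_from_B are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

/* Hypothetical model: R29 is a byte offset into an upward-growing stack. */
typedef struct {
    uint32_t r29;                 /* stack pointer */
    uint32_t r31;                 /* return address register */
    uint32_t stack[STACK_WORDS];  /* word-sized stack slots */
} Machine;

/* ADDI R29, R29, 4 followed by a store to 0(R29). */
void push_word(Machine *m, uint32_t value) {
    m->r29 += 4;
    assert(m->r29 / 4 < STACK_WORDS);
    m->stack[m->r29 / 4] = value;
}

/* Caller pushes the argument, JAL sets R31 (assumed), callee pushes the
   return address and reserves 8 bytes for the locals a and b. */
void call_B(Machine *m, uint32_t arg, uint32_t return_address) {
    push_word(m, arg);            /* 5 (int x) */
    m->r31 = return_address;      /* what JAL is assumed to leave in R31 */
    push_word(m, m->r31);         /* Return address */
    m->r29 += 8;                  /* ADDI R29, R29, 8 */
    assert(m->r29 / 4 < STACK_WORDS);
}

/* SUBI R29, R29, 16 ; LW R31, 8(R29) ; then jump to the loaded address. */
uint32_t return_from_B(Machine *m) {
    assert(m->r29 >= 16);
    m->r29 -= 16;
    m->r31 = m->stack[(m->r29 + 8) / 4];
    return m->r31;
}
```

After call_B the stack pointer has advanced by 16 bytes (argument, return address, two locals). The SUBI by 16 undoes all of that in one step, which is why the return address is then found at offset 8 from the restored R29.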

MIPS 1 Instruction Set

1. Conditional branch instructions

2. Floating point

MIPS 1 Branch Instructions

What about a condition like R1 >= R2?

You could use the BGEZ instruction.
Idea: Rewrite the condition as (R1 - R2) >= 0

    SUB R3, R1, R2      / R3 ← R1 - R2
    BGEZ R3, target     / if R3 >= 0 goto target

Problem: Possibility of overflow

Conditional Branch
    Mnemonics: BEQ, BNE, BGEZ, BLEZ, BLTZ, BGTZ
    Example: BLTZ R2, -16
    Meaning: If R2 < 0, PC ← PC + 4 - 16
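
The overflow problem can be made concrete with a short C sketch. It assumes 32-bit registers and a subtraction that wraps around rather than trapping; the function names are hypothetical. The subtraction is done on unsigned values because signed overflow is undefined in C, and the conversion back to a signed value assumes the usual two's complement behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 32-bit register subtraction with wraparound (an assumption of this sketch). */
int32_t wrapping_sub(int32_t r1, int32_t r2) {
    return (int32_t)((uint32_t)r1 - (uint32_t)r2);
}

/* The slide's idea: SUB R3, R1, R2 then BGEZ R3, target. */
bool branch_taken_via_sub(int32_t r1, int32_t r2) {
    return wrapping_sub(r1, r2) >= 0;
}

/* The condition that was actually intended. */
bool branch_wanted(int32_t r1, int32_t r2) {
    return r1 >= r2;
}

/* Target of a taken branch, following the slide's PC <- PC + 4 + offset. */
uint32_t branch_target(uint32_t pc, int32_t offset) {
    return pc + 4u + (uint32_t)offset;
}
```

For small operands the two tests agree. With R1 at the largest positive value and R2 = -1, the true difference does not fit in 32 bits, the wrapped result is negative, and the branch is not taken even though R1 >= R2 holds.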

MIPS 1 Floating Point

• Assume that there is a separate floating point register file
    • 32 32b floating point registers F0-F
    • A double (64b floating point value) occupies 2 registers
    • Even-odd pair, such as F0,F
    • Addressed as F
• Additional instructions
    • Loads: LF (load float), LD (load double)
    • Arithmetic: ADDF (add float), ADDD (add double)
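
How a double can occupy an even-odd register pair is sketched below in C. The slide does not say which half of the value goes into which register, so the low-half-in-even-register layout is an assumption, as are the names FpRegFile, write_double and read_double. The sketch also assumes that double is 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_FP_REGS 32

/* Hypothetical FP register file: 32 registers of 32 bits each. */
typedef struct {
    uint32_t f[NUM_FP_REGS];
} FpRegFile;

/* Store a 64-bit double in the pair (n, n+1), addressed by the even n. */
void write_double(FpRegFile *rf, int n, double value) {
    uint64_t bits;
    assert(n >= 0 && n % 2 == 0 && n + 1 < NUM_FP_REGS);
    memcpy(&bits, &value, sizeof bits);
    rf->f[n] = (uint32_t)(bits & 0xFFFFFFFFu);  /* low half (assumed) */
    rf->f[n + 1] = (uint32_t)(bits >> 32);      /* high half (assumed) */
}

/* Reassemble the double from the same register pair. */
double read_double(const FpRegFile *rf, int n) {
    uint64_t bits;
    double value;
    assert(n >= 0 && n % 2 == 0 && n + 1 < NUM_FP_REGS);
    bits = ((uint64_t)rf->f[n + 1] << 32) | rf->f[n];
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Restricting n to even numbers is what makes a single register name enough to identify the whole pair.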

Rationale for Separate FP Register File?

Figure (two alternatives): a single file of 64 Registers with ALU, FP Adder and FP Multiplier, versus 32 Integer Registers plus 32 FP Registers with ALU, FP Adder and FP Multiplier.

Basic Computer Organization

Figure (block diagram), labels: CPU, Control, ALU, Registers, MMU, Cache, Main Memory, Bus, I/O, I/O, I/O.

Steps in Instruction Processing

  1. Fetch instruction from Main Memory to CPU
    • Get instruction whose address is in PC from memory into IR
    • Increment PC
  2. Decode the instruction
    • Understand instruction, addressing modes, etc
    • Calculate memory addresses and fetch operands
  3. Execute the required operation
    • Do the required operation
  4. Write the result of the instruction
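
The four steps above can be sketched as a loop in C. The three-operation instruction format (OP_ADDI, OP_ADD, OP_HALT), the Instr and Cpu types and the run function are all hypothetical, invented for illustration. The sketch also assumes that R0 always reads as zero, as ADDI R1, R0, 5 in the function call slide suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical three-operation ISA used only for this sketch. */
typedef enum { OP_ADDI, OP_ADD, OP_HALT } Opcode;

typedef struct {
    Opcode op;
    int rd, rs;      /* destination and first source register */
    int32_t rt_imm;  /* second source register number, or immediate */
} Instr;

typedef struct {
    uint32_t pc;     /* index of the next instruction */
    Instr ir;        /* instruction register */
    int32_t r[32];   /* general purpose registers */
} Cpu;

/* Runs until OP_HALT or the end of memory; returns instructions completed. */
int run(Cpu *cpu, const Instr *memory, uint32_t n_instr) {
    int executed = 0;
    while (cpu->pc < n_instr) {
        /* 1. Fetch: instruction addressed by PC goes into IR; increment PC. */
        cpu->ir = memory[cpu->pc];
        cpu->pc += 1;

        /* 2. Decode: identify the operation and fetch the operands. */
        Opcode op = cpu->ir.op;
        if (op == OP_HALT) break;
        assert(cpu->ir.rd >= 0 && cpu->ir.rd < 32);
        assert(cpu->ir.rs >= 0 && cpu->ir.rs < 32);
        int32_t a = cpu->r[cpu->ir.rs];
        int32_t b = (op == OP_ADD) ? cpu->r[cpu->ir.rt_imm] : cpu->ir.rt_imm;

        /* 3. Execute the required operation (both operations are additions). */
        int32_t result = a + b;

        /* 4. Write the result; R0 is kept at zero (assumption). */
        if (cpu->ir.rd != 0) cpu->r[cpu->ir.rd] = result;
        executed++;
    }
    return executed;
}
```

Here each step finishes before the next begins; nothing is overlapped in time.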

Timeline of events (CISC)

PC to memory → Instruction in IR → PC++; Decode → Op1 address calculation → Op1 fetched → Op2 address calculation → Op2 fetched → Op done → Write result

Processor/Memory speed disparity: ~2 orders of magnitude

Timeline of events (RISC)

PC to memory → Instruction in IR → PC++; Decode → Op1 address calculation → Op1 fetched → Op2 fetched → Op done → Write result

We will assume that …

  1. Activity is overlapped in time where possible
    • PC increment and instruction fetch from memory?
    • Instruction decode and effective address calculation
  2. Load-store ISA: the only instructions that take operands from memory are loads & stores
  3. Main memory delays are not typically seen by the processor
    • Otherwise the timeline is dominated by them
    • There is some hardware mechanism through which most memory access requests can be satisfied at processor speeds (cache memory)
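
Why a cache lets the processor mostly ignore main memory delays can be illustrated with a simple weighted average. This is a sketch, not part of the slides: the function name and the model (a hit costs cache_cycles, a miss costs memory_cycles) are assumptions, and any numbers plugged in are illustrative.

```c
#include <assert.h>

/* Average cycles per memory access when a fraction hit_ratio of requests
   is served by the cache and the rest go to main memory. */
double average_access_cycles(double hit_ratio, double cache_cycles,
                             double memory_cycles) {
    assert(hit_ratio >= 0.0 && hit_ratio <= 1.0);
    return hit_ratio * cache_cycles + (1.0 - hit_ratio) * memory_cycles;
}
```

For example, with a 1-cycle cache, a 100-cycle main memory (about two orders of magnitude slower) and 95% of requests hitting the cache, average_access_cycles(0.95, 1.0, 100.0) gives roughly 6 cycles instead of 100.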