CS 61C Fall 2011 Final Cheat Sheet, Study notes of Data Communication Systems and Computer Networks

A cheat sheet for the CS 61C course in Fall 2011. It covers topics such as incrementing memory addresses, calling conventions, memory access time, parallelism, CPU design, and hazards. It also includes examples and definitions for various concepts related to the course.

Typology: Study notes

2010/2011

Uploaded on 05/11/2023

lana23



CS 61C Fall 2011 Kenny Do Final cheat sheet

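  • Increment memory addresses by multiples of 4: memory is byte-addressed, and lw and sw require word-aligned addresses
  • When going from C to MIPS, always use addu, addiu, and subu, since C does not trap on integer overflow
  • When saving stuff onto the stack, first addi a negative immediate to $sp to make room (the stack grows downward)

    - Stack frame includes return instruction address, parameters, space for local variables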

  • Calling conventions

    - Callee saves and restores $s0-$s7 and $sp
    - Save $ra if the callee makes a nested function call
    - Caller saves $a0-$a3 and $t0-$t9 if it needs them after the call

  • Average Memory Access Time = L1 Hit Time + L1 Miss Rate * L1 Miss Penalty

    - L1 Miss Penalty = L2 Hit Time + L2 Miss Rate * L2 Miss Penalty

  • Doubling associativity (with cache size and block size fixed) decreases the size of the Index by 1 bit and increases the size of the Tag by 1 bit
  • Big Endian vs Little Endian determines BYTE order, not the bit order within bytes

    - Big endian stores the most significant byte first (at the lowest address)

  • 2 GHz clock frequency = 500 picosec clock period
  • Variables declared outside of any function are in static storage
  • *bigArray[4] (an array of 4 pointers) uses 4 * 4 bytes with 4-byte pointers, but bigTriple[3] uses 3 * the size of one bigTriple element
  • Seek time is the time it takes to move the disk head from one track to another
  • Either updating other caches on a write or invalidating their copies on a write maintains cache coherence during writes
  • 2^(# offset bits) = block size (in bytes)
  • Put starting arrow in FSM diagrams
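The AMAT and cache-geometry bullets above can be checked with a small Python sketch. The function names, the 32-bit address width, and the example cache sizes are assumptions chosen for illustration; they do not come from the course material.

```python
def amat(l1_hit, l1_miss_rate, l2_hit, l2_miss_rate, l2_miss_penalty):
    """Two-level AMAT: the L1 miss penalty is the access time seen at L2."""
    l1_miss_penalty = l2_hit + l2_miss_rate * l2_miss_penalty
    return l1_hit + l1_miss_rate * l1_miss_penalty


def log2_exact(n):
    """log2 of a power of two, using integers only."""
    assert n > 0 and n & (n - 1) == 0, "expected a power of two"
    return n.bit_length() - 1


def cache_fields(address_bits, cache_bytes, block_bytes, ways):
    """Return (tag_bits, index_bits, offset_bits) for a set-associative cache."""
    offset_bits = log2_exact(block_bytes)           # 2^(offset bits) = block size
    num_sets = cache_bytes // (block_bytes * ways)  # more ways means fewer sets
    index_bits = log2_exact(num_sets)
    tag_bits = address_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits
```

For a hypothetical 32 KiB cache with 64-byte blocks and 32-bit addresses, `cache_fields(32, 32 * 1024, 64, 1)` gives `(17, 9, 6)` and the 2-way version gives `(18, 8, 6)`: one bit moves from the Index to the Tag.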

1 Data Level Parallelism

  • Flynn Taxonomy

    - {Single, Multiple} Instruction {Single, Multiple} Data Stream

  • 8 XMM registers are 128 bits wide
  • Example SSE intrinsics on __m128d data type

    - _mm_{load, store, loadu, storeu, load1, add, mul}_pd

2 Thread Level Parallelism

  • All multicore CPUs are Shared Memory Multiprocessors

    - Single address space shared by all cores
    - Coordination/communication through shared variables in memory
      ∗ Shared data coordinated via synchronization primitives (locks)

  • MOESI protocol for each block in cache:

    - Modified = up-to-date data, changed (dirty), no other cache has a copy, OK to write, memory out-of-date
    - Owner = up-to-date data, other caches may have a copy (they must be in Shared state)
      ∗ Only cache that supplies data on read instead of going to memory
    - Exclusive = up-to-date data, no other cache has a copy, OK to write, memory up-to-date
      ∗ Avoids writing to memory if block replaced
      ∗ Supplies data on read instead of going to memory
    - Shared = up-to-date data, other caches may have a copy
    - Invalid = not in cache

  • Memory in multi-threaded programs

    - All threads can access globally shared memory, but each thread also has private data

  • Main bottleneck of SMP is the memory system
  • Data race is when two memory accesses from different threads go to the same location, at least one is a write, and they occur one after another
  • Locks create a critical section where only one thread operates at a time
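As a minimal sketch of the last two bullets, the Python code below has several threads increment one shared counter. The read-modify-write `counter += 1` is the critical section, and the lock ensures only one thread executes it at a time. The function name and thread counts are illustrative assumptions.

```python
import threading


def run_counter(num_threads, increments):
    """Increment a shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            with lock:        # critical section: one thread at a time
                counter += 1  # read-modify-write on shared data

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Without the lock, two threads could both read the same old value and one update would be lost, which is the data race described above.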

5 Boolean algebra

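  • + means OR, · means AND, $\overline{x}$ means NOT
  • Laws of boolean algebra

    - Complementarity
      ∗ $x \cdot \overline{x} = 0$
      ∗ $x + \overline{x} = 1$
    - Laws of 0's and 1's
      ∗ $x \cdot 0 = 0$
      ∗ $x + 1 = 1$
    - Identities
      ∗ $x \cdot 1 = x$
      ∗ $x + 0 = x$
    - Idempotent law
      ∗ $x \cdot x = x$
      ∗ $x + x = x$
    - Commutativity, associativity, and distribution also apply
    - Uniting theorem
      ∗ $(x + y) \cdot x = x$
    - Uniting theorem 2
      ∗ $(\overline{x} + y) \cdot x = x \cdot y$
    - DeMorgan's Law
      ∗ $\overline{x \cdot y} = \overline{x} + \overline{y}$
      ∗ $\overline{x + y} = \overline{x} \cdot \overline{y}$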

  • Truth table <-> boolean sum-of-products <-> gate diagram -> truth table
  • overflow = $c_n$ XOR $c_{n-1}$ for left-most adder

    - $c_n$ is carry out, $c_{n-1}$ is carry in

  • Mux

    - A mux with 4 select bits chooses among 16 data inputs, so its truth table has $2^{20}$ rows (4 + 16 = 20 input bits)
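The laws and the overflow rule in this section can be checked by brute force. The Python sketch below tries every 0/1 input for a few of the laws and models an n-bit ripple-carry adder that reports overflow as carry-out XOR carry-in of the most significant bit. Function names and the bit width are illustrative assumptions.

```python
from itertools import product


def boolean_laws_hold():
    """Exhaustively check several of the laws over all 0/1 inputs."""
    for x, y in product((0, 1), repeat=2):
        not_x, not_y = 1 - x, 1 - y
        if (x & not_x) != 0 or (x | not_x) != 1:  # complementarity
            return False
        if ((x | y) & x) != x:                    # uniting theorem
            return False
        if ((not_x | y) & x) != (x & y):          # uniting theorem 2
            return False
        if (1 - (x & y)) != (not_x | not_y):      # DeMorgan
            return False
        if (1 - (x | y)) != (not_x & not_y):      # DeMorgan
            return False
    return True


def add_with_overflow(a, b, n):
    """Ripple-carry add of two n-bit patterns; returns (sum_bits, overflow)."""
    carry = 0
    carry_into_msb = 0
    result = 0
    for i in range(n):
        a_bit = (a >> i) & 1
        b_bit = (b >> i) & 1
        if i == n - 1:
            carry_into_msb = carry                # c_{n-1}
        sum_bit = a_bit ^ b_bit ^ carry
        carry = (a_bit & b_bit) | (carry & (a_bit ^ b_bit))
        result |= sum_bit << i
    return result, carry ^ carry_into_msb         # c_n XOR c_{n-1}
```

With 4-bit two's complement values, 7 + 1 sets the overflow flag, while -1 + -1 (bit pattern `0b1111` twice) does not.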

6 CPU Design

  • Stages of the datapath
    1. Instruction fetch
       (a) Also where we increment PC
    2. Instruction decode
       (a) Read opcode and read data from necessary registers
    3. ALU (Execute)
       (a) Includes calculating the address of memory for lw and sw
    4. Memory access
       (a) Only lw and sw do anything at this stage
       (b) Expected to be fast because of cache, otherwise multicycle stall
    5. Register write
       (a) Idle for stores, branches, and jumps
  • Load uses all 5 stages
  • Registers between each stage to hold intermediate data and control signals
  • Always include incrementing the PC in Register Transfer Language, and always start by fetching the instruction

    - Ex: {op, rs, rt, Imm16} ← MEM[PC]

  • Critical path is longest path through logic and determines length of clock period
  • Control signals

    - nPCsel = 0 (next PC is PC + 4), 1 (branch), X (jump)
    - Jump = 1 (is a jump), 0
    - ExtOp = zero, sign
    - ALUsrc = 0 (regB), 1 (immed)
    - ALUctr = ADD, SUB, OR
    - MemWr = 1 (write memory), 0
    - MemToReg = 0 (ALU output goes to reg), 1 (Mem output goes)
    - RegDst = 0 (rt), 1 (rd)
    - RegWr = 1 (write register), 0

  • Pipelining rate limited by slowest pipeline stage
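A rough Python sketch of the critical-path and pipelining bullets: a single-cycle datapath needs a clock period covering all stages, while a pipelined one is limited by its slowest stage. The stage latencies are made-up numbers, and the overhead of the pipeline registers is ignored as a simplifying assumption.

```python
def clock_periods(stage_latencies_ps):
    """Return (single_cycle_period, pipelined_period) in picoseconds."""
    single_cycle = sum(stage_latencies_ps)  # critical path crosses every stage
    pipelined = max(stage_latencies_ps)     # slowest stage sets the clock
    return single_cycle, pipelined


# Hypothetical latencies for fetch, decode, execute, memory, write-back.
STAGES_PS = [200, 100, 200, 200, 100]
SINGLE_PS, PIPELINED_PS = clock_periods(STAGES_PS)
IDEAL_SPEEDUP = SINGLE_PS / PIPELINED_PS
```

With these hypothetical latencies the single-cycle period is 800 ps and the pipelined period is 200 ps, so the ideal throughput gain is 4x rather than 5x because the stages are unbalanced.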

  • Page table

    - Cols are valid?, access rights, physical page address
    - Row index == virtual address' page number
    - Physical address is PPN, offset

  • Translation Lookaside Buffer is cache of page table

    - VPN is split into TLB tag and index
    - Cols are TLB tag, PPN, dirty? (== need to write to disk when replaced), ref (to calculate LRU replacement)
    - Row index == index from VPN
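The translation path described above can be sketched in Python as follows. The 4 KiB page size, the 4-entry direct-mapped TLB, and the dictionary layouts are assumptions made for illustration. The dirty and ref columns are left out to keep the sketch short.

```python
PAGE_OFFSET_BITS = 12  # assumption: 4 KiB pages
TLB_INDEX_BITS = 2     # assumption: 4-entry direct-mapped TLB


def translate(vaddr, tlb, page_table):
    """Return (physical_address, tlb_hit) for a virtual address."""
    vpn = vaddr >> PAGE_OFFSET_BITS
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    index = vpn & ((1 << TLB_INDEX_BITS) - 1)  # row index from the VPN
    tag = vpn >> TLB_INDEX_BITS                # remaining VPN bits

    entry = tlb.get(index)
    if entry is not None and entry["tag"] == tag:
        ppn, hit = entry["ppn"], True
    else:
        pte = page_table.get(vpn)              # row index == VPN
        if pte is None or not pte["valid"]:
            raise LookupError("page fault")
        ppn, hit = pte["ppn"], False
        tlb[index] = {"tag": tag, "ppn": ppn}  # refill the TLB
    return (ppn << PAGE_OFFSET_BITS) | offset, hit
```

The first access to a page misses in the TLB and walks the page table; a second access to the same page hits in the TLB and skips the walk.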

8 I/O

  • Memory Mapped Input/Output dedicates portion of address space to communication with devices

    - Polling: processor reads from control register in loop until it is ready, then writes/reads data register, which resets ready bit of control register

  • Exceptions arise within CPU

    - PC of offending instr is saved in Exception Program Counter, cause is saved in Cause register, and the processor jumps to exception handler code at 0x8000 0180
      ∗ After exception, pipeline flushed, handler executed, then instr executed from scratch or program terminated
      ∗ Precise exceptions (actually used)
        · Earliest exception-causing instr is handled first
      ∗ Imprecise exceptions
        · Pipeline stopped, software handler works out cause and what to do

  • Interrupts from external IO controller

    - Asynchronous, but does not prevent any instruction from completion

  • Switched network has higher bandwidth than shared network
  • Mean Time to Failure
  • RAID 3 = Sequential bytes on different drives, dedicated parity drive
  • RAID 5 = rotated parity, faster small writes
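To illustrate what the parity drive buys, here is a small Python sketch that assumes the usual XOR parity scheme: the parity block is the byte-wise XOR of the data blocks, so any single lost block can be rebuilt from the survivors. The function names and sample bytes are hypothetical.

```python
from functools import reduce


def parity(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity_block):
    """Rebuild the single missing data block from the survivors and parity."""
    return parity(list(surviving_blocks) + [parity_block])
```

XOR-ing the surviving blocks with the parity block cancels everything except the missing block, which is why one failed drive can be tolerated.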

9 Amdahl's Law

$$f(n) = \frac{1}{(1 - P) + \frac{P}{N}}$$

  • P is fraction of the code that is parallelizable (0 to 1)
  • N is number of cores used
  • f(n) is amount of speedup code gains
  • Assumptions

    - No contention for shared resources (ideal!)
    - No per-thread overhead (ideal!)
    - No pipelining
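Amdahl's Law is easy to evaluate directly. The Python sketch below takes P as a fraction between 0 and 1 and N as the core count; the function name is an illustrative choice.

```python
def amdahl_speedup(p, n):
    """Speedup when a fraction p of the work is spread over n cores."""
    if not 0.0 <= p <= 1.0 or n < 1:
        raise ValueError("need 0 <= p <= 1 and n >= 1")
    return 1.0 / ((1.0 - p) + p / n)
```

For example, with half the code parallelizable and 2 cores the speedup is 1 / (0.5 + 0.25), about 1.33. Even with unlimited cores the speedup can never exceed 1 / (1 − P).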