






































































These notes cover memory, data, integers, floats, machine code, C, Java, x86 assembly, procedures, stacks, arrays, structs, caches, virtual memory, and more. They also discuss the processor-memory bottleneck and how caches address it, using plots and diagrams to explain the concepts. They are a useful resource for computer science students studying memory and caches in computer systems.







































































C:
    car *c = malloc(sizeof(car));
    c->miles = 100;
    c->gals = 17;
    float mpg = get_mpg(c);
    free(c);

Java:
    Car c = new Car();
    c.setMiles(100);
    c.setGals(17);
    float mpg = c.getMPG();

Assembly language:
    get_mpg:
        pushq %rbp
        movq %rsp, %rbp
        ...
        popq %rbp
        ret

Machine code:
    0111010000011000
    100011010000010000000010
    1000100111000010
    110000011111101000011111
Topics: Memory & data; Integers & floats; Machine code & C; x86 assembly; Procedures & stacks; Arrays & structs; Memory & caches; Processes; Virtual memory; Memory allocation; Java vs. C
    int array[SIZE];
    int A = 0;
    for (int i = 0; i < 200000; ++i) {
        for (int j = 0; j < SIZE; ++j) {
            A += array[j];
        }
    }
• Cache basics
• Principle of locality
• Memory hierarchies
• Cache organization
• Program optimizations that consider caches
• CPU and registers: processor performance doubled about every 18 months. Core 2 Duo: can process at least 256 bytes/cycle.
• Bus to main memory: bus bandwidth evolved much more slowly. Core 2 Duo: bandwidth 2 bytes/cycle, latency 100 cycles.
• Problem: lots of waiting on memory
• English definition: a hidden storage space for provisions, weapons, and/or treasures
• CSE definition: computer memory with short access time used for the storage of frequently or recently used instructions or data (i-cache and d-cache)
  - more generally, used to optimize data transfers between system elements with different characteristics (network interface cache, I/O cache, etc.)
[Figure: a small cache holding blocks 8, 9, and 14, drawn above a larger memory]
• Cache: smaller, faster, more expensive memory; caches a subset of the blocks (a.k.a. lines)
• Data is copied in block-sized transfer units
• Memory: larger, slower, cheaper memory, viewed as partitioned into "blocks" or "lines"
[Figure: the same cache and memory. Request: 12. Block 12 is not among the cached blocks shown (8, 9, 14).]
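The request in the figure can be traced with a few lines of C. This is only a sketch under stated assumptions: the names `cached_blocks`, `cache_lookup`, and `cache_access` are hypothetical, the cache is taken to hold just the three block numbers that survive in the figure text, and the slot chosen on a miss (`block % CACHE_SLOTS`) is an arbitrary stand-in, since the slides do not say which placement or replacement policy is used.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy cache: it records only WHICH block numbers are cached. */
#define CACHE_SLOTS 3

/* Block numbers taken from the figure text (an assumption about its content). */
static int cached_blocks[CACHE_SLOTS] = {8, 9, 14};

/* Return true on a hit (the block is already cached), false on a miss. */
bool cache_lookup(int block)
{
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        if (cached_blocks[i] == block)
            return true;
    }
    return false;
}

/* Access a block (block numbers are assumed non-negative).
 * On a miss, the block is copied in by overwriting one slot; the victim
 * slot chosen here is arbitrary and not taken from the slides. */
bool cache_access(int block)
{
    if (cache_lookup(block))
        return true;                             /* hit */
    cached_blocks[block % CACHE_SLOTS] = block;  /* miss: fetch and place */
    return false;
}
```

With these assumptions, the first `cache_access(12)` misses and brings block 12 into the cache, so a second `cache_access(12)` hits.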
• Locality: Programs tend to use data and instructions with addresses near or equal to those they have used recently
• Temporal locality:
  - Recently referenced items are likely to be referenced again in the near future
• Spatial locality:
  - Items with nearby addresses tend to be referenced close together in time
  - How do caches take advantage of this?
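Both kinds of locality can be seen in a single short loop. The function below is a minimal sketch; the name `sum_vector` and its signature are chosen here for illustration and do not come from the slides.

```c
#include <stddef.h>

/* Sum the n ints stored contiguously starting at a. */
int sum_vector(const int *a, size_t n)
{
    int sum = 0;                    /* temporal locality: sum is reused every iteration */
    for (size_t i = 0; i < n; i++)  /* temporal locality: the loop's instructions repeat */
        sum += a[i];                /* spatial locality: a[i] and a[i+1] are adjacent */
    return sum;
}
```

Because data is copied into the cache in block-sized units, fetching `a[i]` typically brings its neighbours along with it, so the next few iterations are likely to find their elements already cached.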
    int sum_array_rows(int a[M][N])
    {
        int i, j, sum = 0;

        for (i = 0; i < M; i++)
            for (j = 0; j < N; j++)
                sum += a[i][j];
        return sum;
    }

Layout in memory:
    a[0][0] a[0][1] a[0][2] a[0][3] a[1][0] a[1][1] a[1][2] a[1][3] a[2][0] a[2][1] a[2][2] a[2][3]

Order accessed:
     1: a[0][0]   2: a[0][1]   3: a[0][2]   4: a[0][3]
     5: a[1][0]   6: a[1][1]   7: a[1][2]   8: a[1][3]
     9: a[2][0]  10: a[2][1]  11: a[2][2]  12: a[2][3]

Access pattern: stride-1
    int sum_array_cols(int a[M][N])
    {
        int i, j, sum = 0;

        for (j = 0; j < N; j++)
            for (i = 0; i < M; i++)
                sum += a[i][j];
        return sum;
    }

Layout in memory:
    a[0][0] a[0][1] a[0][2] a[0][3] a[1][0] a[1][1] a[1][2] a[1][3] a[2][0] a[2][1] a[2][2] a[2][3]

Order accessed:
     1: a[0][0]   2: a[1][0]   3: a[2][0]
     4: a[0][1]   5: a[1][1]   6: a[2][1]
     7: a[0][2]   8: a[1][2]   9: a[2][2]
    10: a[0][3]  11: a[1][3]  12: a[2][3]

Access pattern: stride-N
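The difference between the two traversals comes down to how far apart consecutive accesses are in memory. The sketch below computes that distance from C's row-major layout rule. The helper names are hypothetical, and `M` and `N` are set to 3 and 4 only to match the 3 x 4 layout listed above.

```c
#include <stddef.h>

#define M 3   /* rows, matching the layout listed above (assumed sizes) */
#define N 4   /* columns */

/* Byte offset of a[i][j] from the start of a row-major int a[M][N]. */
size_t byte_offset(size_t i, size_t j)
{
    return (i * N + j) * sizeof(int);
}

/* Row-wise traversal: the inner loop steps j, so consecutive accesses
 * are one element apart. */
size_t row_order_stride(void)
{
    return byte_offset(0, 1) - byte_offset(0, 0);
}

/* Column-wise traversal: the inner loop steps i, so consecutive accesses
 * are a whole row (N elements) apart. */
size_t col_order_stride(void)
{
    return byte_offset(1, 0) - byte_offset(0, 0);
}
```

Here `row_order_stride()` equals `sizeof(int)` and `col_order_stride()` equals `N * sizeof(int)`. Once a row is longer than a cache block, each step of the column-wise inner loop lands in a different block, so it gets little benefit from spatial locality.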
    int sum_array_3d(int a[M][N][N])
    {
        int i, j, k, sum = 0;

        for (i = 0; i < N; i++)
            for (j = 0; j < N; j++)
                for (k = 0; k < M; k++)
                    sum += a[k][i][j];
        return sum;
    }

• What is wrong with this code?
• How can it be fixed?
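One possible answer, offered as a sketch rather than as the slides' own solution: the innermost loop varies `k`, the first index, so consecutive accesses are N*N elements apart. Reordering the loops so that the last index varies fastest gives a stride-1 pattern and the same sum. The name `sum_array_3d_fixed` and the values of `M` and `N` are assumptions made so the example compiles.

```c
#define M 2   /* hypothetical sizes, chosen only so the sketch compiles */
#define N 3

/* Same sum as sum_array_3d, but the loops are nested in index order
 * (k, then i, then j), so the last index varies fastest and memory is
 * visited with stride 1. */
int sum_array_3d_fixed(int a[M][N][N])
{
    int i, j, k, sum = 0;

    for (k = 0; k < M; k++)
        for (i = 0; i < N; i++)
            for (j = 0; j < N; j++)
                sum += a[k][i][j];
    return sum;
}
```

The loop bounds are unchanged; only the nesting order differs, which is why the result is the same while the access pattern becomes sequential.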