High Performance Computing Lecture 30: Array Operations and Cache Memory, Slides of Computer Science

This document from docsity.com covers examples of array operations in high performance computing, including array merging, daxpy, and 2-d matrix sum. The slides also discuss cache memory and cache hits and misses in array accesses, and compare the C and FORTRAN storage orders for multi-dimensional arrays.

Typology: Slides

2012/2013

Uploaded on 04/28/2013

dewaan

High Performance Computing

Lecture 30

Example 2 with Array Merging

What if we re-declare the arrays as

struct {double A, B;} array[2048];
for (i=0; i<2048; i++)
    sum += array[i].A * array[i].B;

Hit ratio: 75%

Example 4: 2-d Matrix Sum

double A[1024][1024], B[1024][1024];
for (j=0; j<1024; j++)
    for (i=0; i<1024; i++)
        B[i][j] = A[i][j] + B[i][j];

• Reference Sequence:
  load A[0,0], load B[0,0], store B[0,0], load A[1,0], load B[1,0], store B[1,0]

• Question: In what order are the elements of a multidimensional array stored in memory?

Storage of Multi-dimensional Arrays

• Row major order
  • Example: for a 2-dimensional array, the elements of the first row of the array are followed by those of the 2nd row, the 3rd row, and so on
  • This is what is used in C

• Column major order
  • A 2-dimensional array is stored column by column in memory
  • Used in FORTRAN

Example 4 with Loop Interchange

double A[1024][1024], B[1024][1024];
/* original order: j outer, i inner */
for (j=0; j<1024; j++)
    for (i=0; i<1024; i++)
        B[i][j] = A[i][j] + B[i][j];

Example 4 with Loop Interchange

double A[1024][1024], B[1024][1024];
/* interchanged order: i outer, j inner */
for (i=0; i<1024; i++)
    for (j=0; j<1024; j++)
        B[i][j] = A[i][j] + B[i][j];

• Reference Sequence:
  load A[0,0], load B[0,0], store B[0,0], load A[0,1], load B[0,1], store B[0,1]

• Hit ratio: 83.3%