C Programming: Array and Multi-Level Array Access, Alignment and Byte Ordering, Study notes of Computer Architecture and Organization

An in-depth exploration of arrays, multi-level arrays, alignment, and byte ordering in C programming. Topics include array declaration, memory allocation, element access, nested arrays, and handling of different data types. The notes also cover alignment for efficient memory access and why byte ordering matters when binary data is exchanged between systems.

Typology: Study notes

Pre 2010

Uploaded on 08/31/2009

koofers-user-jy0
Machine-Level Programming IV:
Structured Data
CSCE 230J
Computer Organization
Dr. Steve Goddard
http://cse.unl.edu/~goddard/Courses/CSCE230J
Giving credit where credit is due

Most of the slides for this lecture are based on slides created by Drs. Bryant and O’Hallaron, Carnegie Mellon University. I have modified them and added new slides.

Topics

• Arrays
• Structs
• Unions

Basic Data Types

Integral

• Stored and operated on in general registers
• Signed vs. unsigned depends on the instructions used

    Intel         GAS   Bytes   C
    byte          b     1       [unsigned] char
    word          w     2       [unsigned] short
    double word   l     4       [unsigned] int

Floating Point

• Stored and operated on in floating-point registers

    Intel         GAS   Bytes   C
    Single        s     4       float
    Double        l     8       double
    Extended      t     10/12   long double

Array Example

Notes

• Declaration "zip_dig cmu" is equivalent to "int cmu[5]"
• The example arrays were allocated in successive 20-byte blocks
  - Not guaranteed to happen in general

    typedef int zip_dig[5];

    zip_dig cmu = { 1, 5, 2, 1, 3 };
    zip_dig mit = { 0, 2, 1, 3, 9 };
    zip_dig ucb = { 9, 4, 7, 2, 0 };

Memory layout (each element with its address; the last address is the end of the array):

    zip_dig cmu:    1    5    2    1    3
                   16   20   24   28   32   36
    zip_dig mit:    0    2    1    3    9
                   36   40   44   48   52   56
    zip_dig ucb:    9    4    7    2    0
                   56   60   64   68   72   76
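The two notes above can be checked directly in C. The following is a minimal sketch that assumes only the typedef and initializers from the slide; the helper names `zip_dig_bytes`, `element_offset`, and `check_zip_dig` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef int zip_dig[5];

zip_dig cmu = { 1, 5, 2, 1, 3 };
zip_dig mit = { 0, 2, 1, 3, 9 };
zip_dig ucb = { 9, 4, 7, 2, 0 };

/* Hypothetical helper: bytes occupied by one zip_dig (five ints back to back). */
size_t zip_dig_bytes(void)
{
    return sizeof(zip_dig);
}

/* Hypothetical helper: byte distance from element 0 to element i of one array. */
ptrdiff_t element_offset(const int *z, int i)
{
    return (const char *)&z[i] - (const char *)&z[0];
}

/* "zip_dig cmu" behaves like "int cmu[5]": same size, contiguous elements. */
void check_zip_dig(void)
{
    assert(sizeof(cmu) == sizeof(int[5]));
    assert(element_offset(cmu, 4) == 4 * (ptrdiff_t)sizeof(int));
}
```

The 20-byte figure on the slide corresponds to `sizeof(int) == 4`. The sketch deliberately does not assert that `cmu`, `mit`, and `ucb` are adjacent in memory, since the slide notes that this is not guaranteed.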

Array Accessing Example

    int get_digit(zip_dig z, int dig)
    {
        return z[dig];
    }

Memory Reference Code

    # %edx = z
    # %eax = dig
    movl (%edx,%eax,4),%eax   # z[dig]

Computation

• Register %edx contains the starting address of the array
• Register %eax contains the array index
• Desired digit is at 4*%eax + %edx
• Use memory reference (%edx,%eax,4)
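The address computation described above can also be written out by hand in C. This is a sketch, not the compiler's output: `get_digit_by_address` and `check_get_digit` are hypothetical names, and the scale factor is written as `sizeof(int)`, which equals the slide's 4 on IA32.

```c
#include <assert.h>
#include <stddef.h>

typedef int zip_dig[5];

/* Same function as on the slide. */
int get_digit(zip_dig z, int dig)
{
    return z[dig];
}

/* Hypothetical variant: form the address base + sizeof(int)*dig by hand,
   which is what the operand (%edx,%eax,4) computes when sizeof(int) is 4. */
int get_digit_by_address(zip_dig z, int dig)
{
    const char *base = (const char *)z;
    const char *addr = base + (size_t)dig * sizeof(int);
    return *(const int *)addr;
}

/* Both forms must agree for every valid index. */
void check_get_digit(void)
{
    zip_dig cmu = { 1, 5, 2, 1, 3 };
    for (int dig = 0; dig < 5; dig++)
        assert(get_digit(cmu, dig) == get_digit_by_address(cmu, dig));
}
```

Only valid indices (0 to 4) are used here; neither version checks bounds.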

Referencing Examples

Code Does Not Do Any Bounds Checking!

    Reference   Address             Value   Guaranteed?
    mit[3]      36 + 4*3  = 48      3       Yes
    mit[5]      36 + 4*5  = 56      9       No
    mit[-1]     36 + 4*-1 = 32      3       No
    cmu[15]     16 + 4*15 = 76      ??      No

• Out-of-range behavior is implementation-dependent (in standard C it is undefined behavior)
  - No guaranteed relative allocation of different arrays

Layout assumed in the table:

    cmu (addresses 16 to 36):   1 5 2 1 3
    mit (addresses 36 to 56):   0 2 1 3 9
    ucb (addresses 56 to 76):   9 4 7 2 0

Array Loop Example

Original Source

    int zd2int(zip_dig z)
    {
        int i;
        int zi = 0;
        for (i = 0; i < 5; i++) {
            zi = 10 * zi + z[i];
        }
        return zi;
    }

Transformed Version

    int zd2int(zip_dig z)
    {
        int zi = 0;
        int *zend = z + 4;
        do {
            zi = 10 * zi + *z;
            z++;
        } while (z <= zend);
        return zi;
    }

• As generated by GCC
• Eliminate loop variable i
• Convert array code to pointer code
• Express in do-while form
  - No need to test at entrance

Nested Array Allocation

Declaration

    T A[R][C];

• Array of data type T
• R rows, C columns
• Type T element requires K bytes

Array Size

• R * C * K bytes

Arrangement

• Row-major ordering

    A[0][0]     • • •   A[0][C-1]
       •                    •
       •                    •
    A[R-1][0]   • • •   A[R-1][C-1]

Memory layout of int A[R][C] (4*R*C bytes in total):

    A[0][0] ... A[0][C-1] | A[1][0] ... A[1][C-1] | ... | A[R-1][0] ... A[R-1][C-1]
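The size formula and the row-major arrangement can be verified with a short program. This is a sketch under assumptions: the dimensions R = 3 and C = 4 are arbitrary, T is taken to be `int`, and the helper names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

enum { R = 3, C = 4 };   /* arbitrary sizes chosen for the example */

int A[R][C];

/* Total size in bytes: R * C * K with K = sizeof(int). */
size_t nested_array_bytes(void)
{
    return sizeof(A);
}

/* Row-major order: one past the last element of row i is the first
   element of row i + 1 (valid for 0 <= i < R - 1). */
int next_row_follows(int i)
{
    return &A[i][C - 1] + 1 == &A[i + 1][0];
}

void check_row_major(void)
{
    assert(nested_array_bytes() == (size_t)R * C * sizeof(int));
    for (int i = 0; i < R - 1; i++)
        assert(next_row_follows(i));
}
```

Unlike the separately declared `zip_dig` arrays, the rows of a single nested array are guaranteed to be contiguous, which is why the adjacency check is safe here.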

Nested Array Row Access

Row Vectors

• A[i] is an array of C elements
• Each element is of type T
• Starting address is A + i * C * K

Memory layout of int A[R][C]:

    Row      Elements                      Starting address
    A[0]     A[0][0]   ... A[0][C-1]       A
    A[i]     A[i][0]   ... A[i][C-1]       A + i*C*4
    A[R-1]   A[R-1][0] ... A[R-1][C-1]     A + (R-1)*C*4

Nested Array Row Access Code

Row Vector

• pgh[index] is an array of 5 ints
• Starting address is pgh + 20*index

    int *get_pgh_zip(int index)
    {
        return pgh[index];
    }

Code

• Computes and returns the address
• Computed as pgh + 4*(index + 4*index)

    # %eax = index
    leal (%eax,%eax,4),%eax   # 5 * index
    leal pgh(,%eax,4),%eax    # pgh + (20 * index)

Nested Array Element Access

Array Elements

• A[i][j] is an element of type T
• Address is A + (i * C + j) * K

Memory layout of int A[R][C]:

    Row      Elements                      Starting address
    A[0]     A[0][0]   ... A[0][C-1]       A
    A[i]     A[i][0]   ... A[i][C-1]       A + i*C*4
    A[R-1]   A[R-1][0] ... A[R-1][C-1]     A + (R-1)*C*4

    Element A[i][j] is at A + (i*C + j)*4
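The row and element address formulas can be compared against what the compiler actually does. The sketch below assumes `int` elements and arbitrary dimensions R = 3, C = 4; `row_offset`, `element_offset`, and `check_nested_offsets` are hypothetical helpers that measure byte offsets from the start of A.

```c
#include <assert.h>
#include <stddef.h>

enum { R = 3, C = 4 };   /* arbitrary sizes chosen for the example */

int A[R][C];

/* Measured byte offset of row A[i] from the start of A. */
size_t row_offset(int i)
{
    return (size_t)((const char *)A[i] - (const char *)A);
}

/* Measured byte offset of element A[i][j] from the start of A. */
size_t element_offset(int i, int j)
{
    return (size_t)((const char *)&A[i][j] - (const char *)A);
}

/* Compare the measured offsets with i*C*K and (i*C + j)*K. */
void check_nested_offsets(void)
{
    const size_t K = sizeof(int);
    for (int i = 0; i < R; i++) {
        assert(row_offset(i) == (size_t)i * C * K);
        for (int j = 0; j < C; j++)
            assert(element_offset(i, j) == ((size_t)i * C + (size_t)j) * K);
    }
}
```

With K = 4 these are the same expressions as A + i*C*4 and A + (i*C + j)*4 in the layout above.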

Multi-Level Array Example

• Variable univ denotes an array of 3 elements
• Each element is a pointer
  - 4 bytes
• Each pointer points to an array of ints

    zip_dig cmu = { 1, 5, 2, 1, 3 };
    zip_dig mit = { 0, 2, 1, 3, 9 };
    zip_dig ucb = { 9, 4, 7, 2, 0 };

    #define UCOUNT 3
    int *univ[UCOUNT] = {mit, cmu, ucb};

Memory layout (univ[0] points to mit, univ[1] to cmu, univ[2] to ucb):

    cmu:    1    5    2    1    3
           16   20   24   28   32   36
    mit:    0    2    1    3    9
           36   40   44   48   52   56
    ucb:    9    4    7    2    0
           56   60   64   68   72   76

Element Access in Multi-Level Array

    int get_univ_digit(int index, int dig)
    {
        return univ[index][dig];
    }

Computation

• Element access: Mem[Mem[univ+4*index]+4*dig]
• Must do two memory reads
  - First get pointer to row array
  - Then access element within array

    # %ecx = index
    # %eax = dig
    leal 0(,%ecx,4),%edx      # 4*index
    movl univ(%edx),%edx      # Mem[univ+4*index]
    movl (%edx,%eax,4),%eax   # Mem[...+4*dig]
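The two memory reads can be made explicit in C by splitting the access into two statements. This is an illustrative sketch that reuses the declarations from the slides; `get_univ_digit_two_reads` and `check_univ_access` are hypothetical names.

```c
#include <assert.h>

typedef int zip_dig[5];

zip_dig cmu = { 1, 5, 2, 1, 3 };
zip_dig mit = { 0, 2, 1, 3, 9 };
zip_dig ucb = { 9, 4, 7, 2, 0 };

#define UCOUNT 3
int *univ[UCOUNT] = { mit, cmu, ucb };

/* Same function as on the slide. */
int get_univ_digit(int index, int dig)
{
    return univ[index][dig];
}

/* Hypothetical variant with the two memory reads written separately. */
int get_univ_digit_two_reads(int index, int dig)
{
    int *row = univ[index];   /* first read: fetch the row pointer */
    return row[dig];          /* second read: fetch the element    */
}

/* Both forms must agree for every valid index pair. */
void check_univ_access(void)
{
    for (int index = 0; index < UCOUNT; index++)
        for (int dig = 0; dig < 5; dig++)
            assert(get_univ_digit(index, dig) ==
                   get_univ_digit_two_reads(index, dig));
}
```

Note that `univ` stores only pointers, so its size is `UCOUNT * sizeof(int *)` regardless of how long the rows are; the slide's 4-byte pointer size is specific to IA32.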

Array Element Accesses

• Similar C references
• Different address computation

Nested Array

    int get_pgh_digit(int index, int dig)
    {
        return pgh[index][dig];
    }

• Element at Mem[pgh+20*index+4*dig]

Multi-Level Array

    int get_univ_digit(int index, int dig)
    {
        return univ[index][dig];
    }

• Element at Mem[Mem[univ+4*index]+4*dig]

Memory layout of pgh (rows start at 76, 96, 116, and 136; the array ends at 156):

    76:    1 5 2 0 6
    96:    1 5 2 1 3
    116:   1 5 2 1 7
    136:   1 5 2 2 1

Memory layout of univ and the arrays it points to:

    univ (addresses 160, 164, 168):   36   16   56

    cmu (addresses 16 to 36):   1 5 2 1 3
    mit (addresses 36 to 56):   0 2 1 3 9
    ucb (addresses 56 to 76):   9 4 7 2 0

Strange Referencing Examples

    Reference     Address            Value   Guaranteed?
    univ[2][3]    56 + 4*3  = 68     2       Yes
    univ[1][5]    16 + 4*5  = 36     0       No
    univ[2][-1]   56 + 4*-1 = 52     9       No
    univ[3][-1]   ??                 ??      No
    univ[1][12]   16 + 4*12 = 64     7       No

• Code does not do any bounds checking
• Ordering of elements in different arrays is not guaranteed

Layout assumed in the table (univ[0] = mit, univ[1] = cmu, univ[2] = ucb):

    cmu (addresses 16 to 36):   1 5 2 1 3
    mit (addresses 36 to 56):   0 2 1 3 9
    ucb (addresses 56 to 76):   9 4 7 2 0

Dynamic Array Multiplication

    /* Compute element i,k of variable matrix product */
    int var_prod_ele(int *a, int *b, int i, int k, int n)
    {
        int j;
        int result = 0;
        for (j = 0; j < n; j++)
            result += a[i*n+j] * b[j*n+k];
        return result;
    }

Without Optimizations

• Multiplies
  - 2 for subscripts
  - 1 for data
• Adds
  - 4 for array indexing
  - 1 for loop index
  - 1 for data

Access pattern: row (i,*) of A is traversed row-wise; column (*,k) of B is traversed column-wise.

Optimizing Dynamic Array Mult.

Optimizations

• Performed when the optimization level is set to -O

Code Motion

• Expression i*n can be computed outside the loop

Strength Reduction

• Incrementing j has the effect of incrementing j*n+k by n

Performance

• Compiler can optimize regular access patterns

Original function body:

    {
        int j;
        int result = 0;
        for (j = 0; j < n; j++)
            result += a[i*n+j] * b[j*n+k];
        return result;
    }

Optimized function body:

    {
        int j;
        int result = 0;
        int iTn = i*n;      /* code motion: i*n computed once */
        int jTnPk = k;      /* strength reduction: tracks j*n+k */
        for (j = 0; j < n; j++) {
            result += a[iTn+j] * b[jTnPk];
            jTnPk += n;
        }
        return result;
    }

Structures

Concept

• Contiguously-allocated region of memory
• Refer to members within structure by names
• Members may be of different types

    struct rec {
        int i;
        int a[3];
        int *p;
    };

Memory Layout

    Member:   i    a    p
    Offset:   0    4    16   (structure ends at 20)

Accessing Structure Member

    void set_i(struct rec *r, int val)
    {
        r->i = val;
    }

Assembly

    # %eax = val
    # %edx = r
    movl %eax,(%edx)   # Mem[r] = val

Generating Pointer to Struct. Member

    struct rec {
        int i;
        int a[3];
        int *p;
    };

    Member:   i    a    p
    Offset:   0    4    16

    r points to offset 0; &r->a[idx] is at r + 4 + 4*idx

Generating Pointer to Array Element

• Offset of each structure member is determined at compile time

    int *find_a(struct rec *r, int idx)
    {
        return &r->a[idx];
    }

    # %ecx = idx
    # %edx = r
    leal 0(,%ecx,4),%eax     # 4*idx
    leal 4(%eax,%edx),%eax   # r+4*idx+4
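Because member offsets are fixed at compile time, the address r + 4 + 4*idx can be reproduced with `offsetof`. The following is a minimal sketch; `find_a_by_offset` and `check_find_a` are hypothetical names, and the constant 4 from the slide appears here as `offsetof(struct rec, a)` and `sizeof(int)`.

```c
#include <assert.h>
#include <stddef.h>

struct rec {
    int i;
    int a[3];
    int *p;
};

/* Same function as on the slide. */
int *find_a(struct rec *r, int idx)
{
    return &r->a[idx];
}

/* Hypothetical variant: r + offsetof(a) + sizeof(int)*idx, computed by hand. */
int *find_a_by_offset(struct rec *r, int idx)
{
    char *base = (char *)r;
    return (int *)(base + offsetof(struct rec, a) + (size_t)idx * sizeof(int));
}

/* Both forms must yield the same pointer for every valid index. */
void check_find_a(void)
{
    struct rec r = { 0, { 0, 0, 0 }, NULL };
    for (int idx = 0; idx < 3; idx++)
        assert(find_a(&r, idx) == find_a_by_offset(&r, idx));
}
```

The offset of `p` and the total size depend on the pointer size, so the sketch does not assert the slide's values of 16 and 20 for them.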

Specific Cases of Alignment

Size of Primitive Data Type:

• 1 byte (e.g., char)
  - no restrictions on address
• 2 bytes (e.g., short)
  - lowest 1 bit of address must be 0 (binary)
• 4 bytes (e.g., int, float, char *, etc.)
  - lowest 2 bits of address must be 00 (binary)
• 8 bytes (e.g., double)
  - Windows (and most other OSs and instruction sets): lowest 3 bits of address must be 000 (binary)
  - Linux: lowest 2 bits of address must be 00 (binary), i.e., treated the same as a 4-byte primitive data type
• 12 bytes (long double)
  - Linux: lowest 2 bits of address must be 00 (binary), i.e., treated the same as a 4-byte primitive data type
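The "lowest bits must be zero" rule can be tested with a bit mask. The sketch below is an illustration that assumes a C11 compiler (for `alignof`); `is_aligned` and `check_primitive_alignment` are hypothetical helpers. It asks the compiler for each type's alignment instead of hard-coding the values listed above, since those are specific to the platform the slides describe.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Nonzero when the low bits of p are all zero for a power-of-two alignment.
   For alignment 4 this tests the lowest 2 bits, for 8 the lowest 3 bits. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Every object the compiler places must satisfy its type's alignment. */
void check_primitive_alignment(void)
{
    char c;
    short s;
    int i;
    double d;

    assert(is_aligned(&c, alignof(char)));
    assert(is_aligned(&s, alignof(short)));
    assert(is_aligned(&i, alignof(int)));
    assert(is_aligned(&d, alignof(double)));
}
```

Printing `alignof(double)` on a given system shows which of the two 8-byte rules it follows.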

Satisfying Alignment with Structures

    struct S1 {
        char c;
        int i[2];
        double v;
    } *p;

Offsets Within Structure

• Must satisfy element's alignment requirement

Overall Structure Placement

• Each structure has alignment requirement K
  - Largest alignment of any element
• Initial address and structure length must be multiples of K

Example (under Windows):

• K = 8, due to double element

    Member:   c     i[0]   i[1]   v
    Offset:   p+0   p+4    p+8    p+16   (structure ends at p+24)

    p+0 is a multiple of 8, p+4 a multiple of 4, p+16 a multiple of 8, and p+24 a multiple of 8

Linux vs. Windows

    struct S1 {
        char c;
        int i[2];
        double v;
    } *p;

Windows (including Cygwin):

• K = 8, due to double element

    Member:   c     i[0]   i[1]   v
    Offset:   p+0   p+4    p+8    p+16   (structure ends at p+24)

    p+0, p+16, and p+24 are multiples of 8; p+4 is a multiple of 4

Linux:

• K = 4; double treated like a 4-byte data type

    Member:   c     i[0]   i[1]   v
    Offset:   p+0   p+4    p+8    p+12   (structure ends at p+20)

    p+0, p+4, p+12, and p+20 are multiples of 4
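Which of the two layouts a given compiler produces can be inspected with `offsetof` and `alignof`. This is a sketch under assumptions: it requires C11, the helper names are invented, and it asserts only the rules that hold on any platform rather than the specific offsets 12 or 16. As far as I know, 64-bit Linux aligns `double` to 8 bytes, so a modern x86-64 build would typically match the "Windows" column above.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

struct S1 {
    char c;
    int i[2];
    double v;
};

/* The rules from the slides, stated without platform-specific numbers. */
void check_s1_layout(void)
{
    /* Each member offset is a multiple of that member's alignment. */
    assert(offsetof(struct S1, i) % alignof(int) == 0);
    assert(offsetof(struct S1, v) % alignof(double) == 0);
    /* K is at least the largest member alignment. */
    assert(alignof(struct S1) >= alignof(double));
    /* The structure length is a multiple of K. */
    assert(sizeof(struct S1) % alignof(struct S1) == 0);
}

/* Hypothetical helper: bytes of padding the compiler inserted. */
size_t s1_padding_bytes(void)
{
    return sizeof(struct S1) - (sizeof(char) + 2 * sizeof(int) + sizeof(double));
}
```

For the two layouts shown above, `s1_padding_bytes()` would be 7 (K = 8, size 24) or 3 (K = 4, size 20).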

Overall Alignment Requirement

    struct S2 {
        double x;
        int i[2];
        char c;
    } *p;

    Member:   x     i[0]   i[1]   c
    Offset:   p+0   p+8    p+12   p+16   (structure ends at p+24 on Windows, p+20 on Linux)

p must be a multiple of:

• 8 for Windows
• 4 for Linux

    struct S3 {
        float x[2];
        int i[2];
        char c;
    } *p;

    Member:   x[0]   x[1]   i[0]   i[1]   c
    Offset:   p+0    p+4    p+8    p+12   p+16   (structure ends at p+20)

p must be a multiple of 4 (in either OS)

Accessing Element within Array

    struct S6 {
        short i;
        float v;
        short j;
    } a[10];

• Compute offset to start of structure
  - Compute 12*i as 4*(i+2*i)
• Access element according to its offset within structure
  - Offset by 8
  - Assembler gives displacement as a + 8
    » Linker must set actual value

    short get_j(int idx)
    {
        return a[idx].j;
    }

    # %eax = idx
    leal (%eax,%eax,2),%eax    # 3*idx
    movswl a+8(,%eax,4),%eax

Layout of the array:

    a[0] starts at a+0, ..., a[i] starts at a+12i
    Within a[i]: a[i].i at a+12i, a[i].v at a+12i+4, a[i].j at a+12i+8

Satisfying Alignment within Structure

    struct S6 {
        short i;
        float v;
        short j;
    } a[10];

Achieving Alignment

• Starting address of structure array must be a multiple of the worst-case alignment for any element
  - a must be a multiple of 4
• Offset of element within structure must be a multiple of the element's alignment requirement
  - v's offset of 4 is a multiple of 4
• Overall size of structure must be a multiple of the worst-case alignment for any element
  - Structure padded with unused space to be 12 bytes

Layout of the array:

    a[0] starts at a+0, ..., a[i] starts at a+12i (a multiple of 4)
    a[i].v is at a+12i+4 (a multiple of 4)
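The 12-byte element size and the a + 12*i + 8 address of member j can be confirmed with `sizeof` and `offsetof`. The sketch assumes a 2-byte `short` and a 4-byte, 4-byte-aligned `float`, which is what the slides describe and what mainstream x86 compilers use; `j_offset` and `check_s6_layout` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

struct S6 {
    short i;
    float v;
    short j;
};

struct S6 a[10];

/* Same function as on the slide. */
short get_j(int idx)
{
    return a[idx].j;
}

/* Hypothetical helper: byte offset of a[idx].j from the start of a,
   i.e. sizeof(struct S6)*idx + offsetof(struct S6, j). */
size_t j_offset(int idx)
{
    return (size_t)idx * sizeof(struct S6) + offsetof(struct S6, j);
}

/* Assumes 2-byte short and 4-byte float with 4-byte alignment. */
void check_s6_layout(void)
{
    assert(offsetof(struct S6, v) == 4);   /* 2 bytes of padding after i */
    assert(offsetof(struct S6, j) == 8);
    assert(sizeof(struct S6) == 12);       /* 2 bytes of padding after j */
    assert((const char *)&a[2].j - (const char *)a == (ptrdiff_t)j_offset(2));
}
```

Reordering the members as `float v; short i; short j;` would remove the padding under the same assumptions, giving an 8-byte structure.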

Union Allocation

Principles

• Overlay union elements
• Allocate according to largest element
• Can only use one field at a time

    union U1 {
        char c;
        int i[2];
        double v;
    } *up;

    c, i[0], and v all start at up+0; i[1] is at up+4; the union ends at up+8

    struct S1 {
        char c;
        int i[2];
        double v;
    } *sp;

    Member:   c      i[0]   i[1]   v
    Offset:   sp+0   sp+4   sp+8   sp+16   (structure ends at sp+24)

    (Windows alignment)
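The overlay principle can be checked by comparing member addresses and sizes. This is a minimal sketch using the two declarations from the slide; `check_union_layout` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

union U1 {
    char c;
    int i[2];
    double v;
};

struct S1 {
    char c;
    int i[2];
    double v;
};

/* Every union member starts at the same address as the union itself,
   and the union is at least as large as its largest member. */
void check_union_layout(void)
{
    union U1 u;

    assert((void *)&u.c == (void *)&u);
    assert((void *)&u.i == (void *)&u);
    assert((void *)&u.v == (void *)&u);
    assert(sizeof(union U1) >= sizeof(double));
    assert(sizeof(union U1) >= sizeof(int[2]));
}
```

The struct with the same members needs room for all of them at once, so it is larger than the union (24 versus 8 bytes in the layout above).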

Using Union to Access Bit Patterns

    typedef union {
        float f;
        unsigned u;
    } bit_float_t;

    Both u and f occupy bytes 0 to 4 of the union.

    float bit2float(unsigned u)
    {
        bit_float_t arg;
        arg.u = u;
        return arg.f;
    }

    unsigned float2bit(float f)
    {
        bit_float_t arg;
        arg.f = f;
        return arg.u;
    }

• Get direct access to bit representation of float
• bit2float generates float with given bit pattern
  - NOT the same as (float) u
• float2bit generates bit pattern from float
  - NOT the same as (unsigned) f
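A concrete value makes the difference between reinterpreting bits and casting visible. The sketch below repeats the slide's two functions so that it is self-contained and assumes a 32-bit `unsigned` and IEEE 754 single-precision `float`; `check_bit_patterns` is a hypothetical name.

```c
#include <assert.h>

typedef union {
    float f;
    unsigned u;
} bit_float_t;

/* The example only makes sense if both members have the same size. */
_Static_assert(sizeof(float) == sizeof(unsigned), "float and unsigned differ in size");

float bit2float(unsigned u)
{
    bit_float_t arg;
    arg.u = u;
    return arg.f;
}

unsigned float2bit(float f)
{
    bit_float_t arg;
    arg.f = f;
    return arg.u;
}

/* Assumes IEEE 754 single precision, where 1.0f is encoded as 0x3F800000. */
void check_bit_patterns(void)
{
    assert(bit2float(0x3F800000u) == 1.0f);   /* same bits, read as float */
    assert((float)0x3F800000u != 1.0f);       /* cast converts the number */
    assert(float2bit(2.5f) == 0x40200000u);   /* bit pattern of 2.5f      */
    assert((unsigned)2.5f == 2u);             /* cast truncates to 2      */
}
```

The cast `(float)0x3F800000u` yields the number 1065353216.0, whereas `bit2float` keeps the 32 bits unchanged and only changes how they are interpreted.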