Understanding Pointers and Arrays in C: WSU CEG 320/520 Comp. Org. & Assembly, Study notes of Computer Architecture and Organization

An in-depth exploration of pointers and arrays in the C programming language, using examples and code snippets from Wright State University's CEG 320/520 Computer Organization and Assembly course. Students will learn about pointer declarations, pointer arithmetic, multi-dimensional arrays, passing arrays as arguments, and common pitfalls related to arrays and pointers.

Typology: Study notes

Pre 2010

Uploaded on 08/19/2009

koofers-user-m2j
Chapter 16
Why do we need Pointers?
Call by Value vs. Call by Reference in detail
Implementing Arrays
Buffer Overflow / The “Stack Hack”
Wright State University, College of Engineering
Dr. Doom, Computer Science & Engineering
CEG 320/520 Comp. Org. & Assembly
A problem with parameter passing via stack
Consider the following function that's designed to
swap the values of its arguments.
void Swap(int firstVal, int secondVal) {
    int tempVal = firstVal;
    firstVal = secondVal;
    secondVal = tempVal;
}

int main() {
    int valueA = 3, valueB = 4;
    Swap(valueA, valueB);   /* swaps only Swap's local copies */
    return 0;
}
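Executing the Swap Function
[Figure: run-time stack before and after Swap executes; R6 marks the top of the stack.]
Before call:
    firstVal    3    (Swap)
    secondVal   4    (Swap)
    valueB      4    (main)
    valueA      3    (main)
After call:
    tempVal     3    (Swap)
    firstVal    4    (Swap)
    secondVal   3    (Swap)
    valueB      4    (main)
    valueA      3    (main)
These values changed (firstVal, secondVal)...
...but these did not (valueB, valueA).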
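The stack picture above can be checked with a small experiment. This is a minimal sketch: Swap is the function from the slide with one extra printf added, and caller_values_unchanged is a hypothetical helper name introduced only for this illustration.

```c
#include <stdio.h>

/* Same logic as Swap on the slide: the parameters are private copies. */
void Swap(int firstVal, int secondVal)
{
    int tempVal = firstVal;
    firstVal = secondVal;
    secondVal = tempVal;
    printf("inside Swap: firstVal = %d, secondVal = %d\n", firstVal, secondVal);
}

/* Hypothetical helper: returns 1 if the caller's variables kept their values. */
int caller_values_unchanged(void)
{
    int valueA = 3, valueB = 4;

    Swap(valueA, valueB);   /* only the copies in Swap's record are exchanged */
    printf("back in caller: valueA = %d, valueB = %d\n", valueA, valueB);
    return valueA == 3 && valueB == 4;
}
```

Calling caller_values_unchanged() from main should show the copies exchanged inside Swap while valueA and valueB keep their original values.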
Pointers and Arrays
Functions such as the swap example need to be able to access variables
stored in memory locations outside of their own activation record
A function’s activation record defines its “scope”
We've seen examples of how to do this in Assembly.
Pointer
Address of a variable in memory
Allows us to indirectly access variables
In other words, we can talk about its address rather than its value
Array (still a pointer!)
An area of allocated memory with values arranged sequentially
Expression a[4] refers to the 5th element of the array a
The array variable is a pointer to the base of the array
Base + offset
Thus the first element is at offset 0
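The base + offset view can be written out directly. The following is a small sketch, not part of the original slides; element_at and indexing_matches_offset are hypothetical names chosen for the illustration.

```c
#include <stddef.h>

/* Hypothetical helper: fetches element `offset` using explicit
   base + offset arithmetic instead of the [] operator. */
int element_at(const int *base, size_t offset)
{
    return *(base + offset);   /* identical to base[offset] */
}

/* Returns 1 when indexing and base + offset agree for a sample array. */
int indexing_matches_offset(void)
{
    int a[5] = {10, 20, 30, 40, 50};

    return a[4] == element_at(a, 4)   /* 5th element, offset 4 */
        && a[0] == *a                 /* first element, offset 0 */
        && &a[4] == a + 4;            /* same address either way */
}
```

Because the offset is measured from the base, offset 0 names the first element, which is why C arrays start at index 0.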
Pointers in C
C lets us talk about and manipulate addresses as “pointer variables”
But first, let's refresh the somewhat confusing bits.
&: The “address-of” or “reference” operator
This operator does one thing.
It returns the address of the variable which follows
#include <stdio.h>
int main() {
    int x = 0;
    printf("Address of x ");
    printf("= %p\n", (void *)&x);   /* %p expects a void * argument */
    return 0;
}
Output: Address of x = 0x0065FDF4 (the exact value and its formatting vary by platform)
Pointers in C
How do we store addresses? Pointer variables!
Although all pointers in C hold the same kind of value (an address), they are also typed by
the compiler so that the data to which they refer can be appropriately interpreted.
A pointer in C is always a pointer to a particular data type: int*, double*,
char*, etc.
Declaration
int *p; /* p is a pointer to an int */
Operators
*p -- returns the value pointed to by p (indirect address / dereference op)
&z -- returns the address of variable z (address-of operator)
Important point of common confusion!
* means “a pointer variable” when used in a declaration
* means “access the information that this address points to” elsewhere
What does *3 mean?
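The two roles of * can be seen side by side in a few lines. This is an illustrative sketch; deref_demo is a hypothetical function name. An expression such as *3 is rejected by the compiler, because the operand of * must be a pointer and 3 is an int.

```c
/* Hypothetical demo function; returns the final value of z. */
int deref_demo(void)
{
    int z = 5;
    int *p;        /* declaration: p is a pointer to an int */

    p = &z;        /* address-of: p now holds the address of z */
    *p = *p + 2;   /* dereference: read z through p, add 2, write it back */

    return z;      /* z was modified indirectly and is now 7 */
}
```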
Check for understanding

#include <stdio.h>

int main() {
    int x = 12;
    int *ptr = &x;

    printf("Address of x:\t %p\n", (void *)ptr);
    printf("Address of x:\t %p\n", (void *)&x);
    printf("Address of ptr:\t %p\n", (void *)&ptr);
    printf("Value of x:\t %d\n", *ptr);
    return 0;
}

Output (addresses and their formatting are platform-dependent):
Address of x: 0x0065FDF
Address of x: 0x65fdf
Address of ptr: 0x65fdf
Value of x: 12

Check for understanding

#include <stdio.h>

int main() {
    int x[10] = {0,1,2,3,4,5,6,7,8,9};

    printf("Address of x[0]:\t %p\n", (void *)&x[0]);
    printf("Address of x:\t %p\n", (void *)x);
    printf("Value of x[0]:\t %d\n", x[0]);
    printf("Value of x[0]:\t %d\n", *x);
    return 0;
}

Output (addresses and their formatting are platform-dependent):
Address of x[0]: 0x0065FDD
Address of x: 0x0065FDD
Value of x[0]: 0
Value of x[0]: 0

Example

int i;
int *ptr;

i = 4;            /* store the value 4 into the memory location associated with i */
ptr = &i;         /* store the address of i into the memory location associated with ptr */
*ptr = *ptr + 1;  /* read the contents of memory at the address stored in ptr,
                     then store the result into memory at the address stored in ptr */

Example: LC-3 Code

; i is 1st local (offset 0), ptr is 2nd (offset -1)
; i = 4;
AND R0, R0, #0    ; clear R0
ADD R0, R0, #4    ; put 4 in R0
STR R0, R5, #0    ; store in i
; ptr = &i;
ADD R0, R5, #0    ; R0 = R5 + 0 (addr of i)
STR R0, R5, #-1   ; store in ptr
; *ptr = *ptr + 1;
LDR R0, R5, #-1   ; R0 = ptr
LDR R1, R0, #0    ; load contents (*ptr)
ADD R1, R1, #1    ; add one
STR R1, R0, #0    ; store result where R0 points

Name    Type    Offset    Scope
i       int     0         main
ptr     int*    -1        main

[Figure: activation record of main with slots ptr, i, fp, pc, ret; the legible entries include xEFFC and xxxx placeholders, with register arrows marking the frame.]

Call by reference

 Passing a pointer as an argument allows the function to read/change memory outside its activation record.

  • But not the pointer itself!
  • If you wanted to change the pointer itself, what would you need to do?

void NewSwap(int *firstVal, int *secondVal)
{
    int tempVal = *firstVal;
    *firstVal = *secondVal;
    *secondVal = tempVal;
}

Arguments are integer pointers. Caller passes addresses of variables that it wants function to change.
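One possible answer to the question about changing the pointer itself is to pass the address of the pointer. The sketch below repeats NewSwap from the slide and adds Retarget, a hypothetical helper that is not part of the original material.

```c
/* NewSwap as on the slide: swaps the two ints the arguments point to. */
void NewSwap(int *firstVal, int *secondVal)
{
    int tempVal = *firstVal;
    *firstVal = *secondVal;
    *secondVal = tempVal;
}

/* Hypothetical helper: to change a caller's pointer, pass its address,
   so the parameter becomes a pointer to a pointer. */
void Retarget(int **ptr, int *newTarget)
{
    *ptr = newTarget;   /* writes into the caller's pointer variable */
}
```

A caller would write NewSwap(&valueA, &valueB) to exchange two ints, and Retarget(&p, &valueB) to make its own pointer p refer to valueB.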

Passing Pointers to a Function

main() wants to swap the values of valueA and valueB

NewSwap(&valueA, &valueB);

LC-3 Code for main:

ADD R0, R5, #-1   ; addr of valueB
ADD R6, R6, #-1   ; push
STR R0, R6, #0
ADD R0, R5, #0    ; addr of valueA
ADD R6, R6, #-1   ; push
STR R0, R6, #0

[Figure: run-time stack showing firstVal, secondVal, valueB, and valueA; the pointer arguments hold addresses of main's variables (values such as xEFFA and xEFFD appear in the diagram).]

Arrays are Pointers

char word[10];

char *cptr;

cptr = word; /* points to word[0] */

 Note that you CAN change cptr, but you CANNOT change word.

  • What is the difference between them?

 Each line below gives three equivalent expressions:

  • cptr           word           &word[0]
  • (cptr + n)     (word + n)     &word[n]
  • *cptr          *word          word[0]
  • *(cptr + n)    *(word + n)    word[n]
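The equivalences above can be verified mechanically, and sizeof exposes one difference between the array and the pointer. This is a sketch for illustration; the function names and the sample string "pointers" are assumptions that do not appear in the slides.

```c
#include <stddef.h>

/* Returns 1 if the listed equivalences hold at offset n (0 <= n < 10). */
int equivalences_hold(size_t n)
{
    char word[10] = "pointers";   /* 8 characters plus terminating zero */
    char *cptr = word;            /* points to word[0] */

    return cptr == &word[0]
        && (cptr + n) == &word[n]
        && *cptr == word[0]
        && *(cptr + n) == word[n];
}

/* One difference: sizeof reports the whole array for word,
   but only the size of the pointer variable for cptr. */
size_t array_bytes(void)   { char word[10]; return sizeof word; }
size_t pointer_bytes(void) { char *cptr = 0; return sizeof cptr; }
```

array_bytes() is always 10, while pointer_bytes() depends on the platform's pointer size. The name word is bound to its storage and cannot be assigned to, whereas cptr is an ordinary variable that can be changed.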

Multi-dimensional arrays

 How do you lay out a multi-dimensional array in one-dimensional memory?

  • Array layout is critical for correctly passing arrays between programs written in different languages.
  • It is also important for performance when traversing an array because accessing array elements that are contiguous in memory is usually faster than accessing elements which are not, due to caching.

 Row-major order

  • In row-major storage, a multidimensional array in linear memory is accessed such that rows are stored one after the other.
  • { {1, 2, 3} , {4, 5, 6} } is stored 1, 2, 3, 4, 5, 6
  • offset = row * NUMCOLS + column

 Column-major order

 Row-major order is used in C, C++, Java, and most modern languages.

 Column-major order is used in Fortran and Matlab.
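The row-major formula can be checked against a real two-dimensional array. The sketch below is an assumption-laden illustration: the 2 x 3 dimensions come from the { {1, 2, 3}, {4, 5, 6} } example, and the function names are hypothetical.

```c
#include <string.h>

#define NUMROWS 2
#define NUMCOLS 3

/* Row-major offset of element (row, column). */
int row_major_offset(int row, int column)
{
    return row * NUMCOLS + column;
}

/* Returns 1 if every grid[row][column] sits at the computed offset
   once the array's bytes are copied into a flat buffer. */
int layout_is_row_major(void)
{
    int grid[NUMROWS][NUMCOLS] = { {1, 2, 3}, {4, 5, 6} };
    int flat[NUMROWS * NUMCOLS];
    int row, column;

    memcpy(flat, grid, sizeof grid);   /* byte-for-byte copy of the layout */

    for (row = 0; row < NUMROWS; row++)
        for (column = 0; column < NUMCOLS; column++)
            if (flat[row_major_offset(row, column)] != grid[row][column])
                return 0;
    return 1;
}
```

For example, grid[1][2] holds 6 and lands at offset 1 * 3 + 2 = 5, the last slot of the flat sequence 1, 2, 3, 4, 5, 6.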


Passing Arrays as Arguments

 C passes arrays by reference

  • the address of the array (i.e., of the first element) is written to the function's activation record
  • otherwise, would have to copy each element

main()
{
    int numbers[MAX_NUMS];
    …
    mean = Average(numbers);
    …
}

int Average(int inputValues[MAX_NUMS])
{
    …
    for (index = 0; index < MAX_NUMS; index++)
        sum = sum + inputValues[index];
    return (sum / MAX_NUMS);
}

Note: Size of static array must be known at compile time, so MAX_NUMS must be a constant
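Since only the base address reaches the callee, the array length does not travel with it. One common way to handle this, sketched here as a variation on the slide's Average, is to pass the element count explicitly; the count parameter and the ZeroFirst helper are additions for illustration only.

```c
#include <stddef.h>

/* The parameter is really a pointer to the first element,
   so the length is passed separately. */
int Average(const int inputValues[], size_t count)
{
    int sum = 0;
    size_t index;

    if (count == 0)
        return 0;               /* avoid dividing by zero */

    for (index = 0; index < count; index++)
        sum = sum + inputValues[index];
    return sum / (int)count;
}

/* Because only an address is passed, the callee can modify the caller's array. */
void ZeroFirst(int values[])
{
    values[0] = 0;
}
```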

A String is an Array of Characters

 Allocate space for a string just like any other array:

 char outputString[16];

 Space for string must contain room for terminating zero.

 Special syntax for initializing a string:

 char outputString[16] = "Result = ";

 …which is the same as:

 outputString[0] = 'R';

outputString[1] = 'e';

outputString[2] = 's';

/* indices 3 through 8 hold 'u', 'l', 't', ' ', '=', ' ' */

outputString[9] = '\0';
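The effect of the string initializer can be confirmed with a short check. This is a sketch; string_layout_ok is a hypothetical name used only for this illustration.

```c
#include <string.h>

/* Returns 1 if the initializer left the bytes described above. */
int string_layout_ok(void)
{
    char outputString[16] = "Result = ";

    return outputString[0] == 'R'
        && outputString[8] == ' '
        && outputString[9] == '\0'        /* terminating zero */
        && strlen(outputString) == 9      /* length excludes the zero */
        && sizeof outputString == 16;     /* storage includes unused bytes */
}
```

The nine visible characters occupy indices 0 through 8, so the terminating zero sits at index 9 and the remaining bytes of the 16-byte array are unused.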

I/O with Strings

 Printf and scanf use "%s" format character for string

 Printf -- print characters up to terminating zero

printf("%s", outputString);

 Scanf -- read characters until whitespace, store result in string, and

terminate with zero

scanf("%s", inputString);


Common Pitfalls with Arrays in C

 Overrun array limits

  • There is no checking at run-time or compile-time to see whether reference is within array bounds.

    int array[10];
    int i;
    for (i = 0; i <= 11; i++)
        array[i] = 0;

     What will happen?
  • Think about the activation record!

 Declaration with variable size

  • Size of array must be known at compile time (in C89; C99 added variable-length arrays, which C11 made optional).

    void SomeFunction(int num_elements) {
        int temp[num_elements];
        …
    }
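Both pitfalls have conventional remedies, sketched below under stated assumptions: ARRAY_LEN, clear_array, and make_zeroed are hypothetical names, and heap allocation with calloc is one common substitute for a variable-size local array rather than the approach the slides prescribe.

```c
#include <stdlib.h>

#define ARRAY_LEN 10

/* Keeps the loop inside the array: valid indices are 0 .. ARRAY_LEN - 1. */
void clear_array(int array[ARRAY_LEN])
{
    int i;

    for (i = 0; i < ARRAY_LEN; i++)
        array[i] = 0;
}

/* Hypothetical helper: when the size is known only at run time, allocate on
   the heap. Returns NULL on failure; the caller must free() the result. */
int *make_zeroed(size_t num_elements)
{
    return calloc(num_elements, sizeof(int));
}
```

Deriving the loop bound from the same constant that sizes the array keeps the two from drifting apart, which is how the i <= 11 overrun above would typically be avoided.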

Pointer Arithmetic

 Address calculations depend on size of elements

  • In our LC-3 code, we've been assuming one word per element.
     e.g., to find element 4 (a[4]), we add 4 to the base address
  • It's ok, because we've only shown code for int and char, both of which take up one word.
  • If double, we'd have to add 8 to find the address of element 4 (two words per element).

 C does size calculations under the covers, depending on size of item being pointed to:

double x[10];     /* allocates 20 words (2 per element) */
double *y = x;
*(y + 3) = 13;    /* same as x[3] -- base address plus 6 */
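The scaling that C applies can be observed by measuring the distance in bytes between y and y + n. This is a sketch for a byte-addressed machine rather than the word-addressed LC-3; byte_step and same_element are hypothetical names.

```c
#include <stddef.h>

/* Byte distance between y and y + n for doubles (0 <= n <= 10). */
size_t byte_step(size_t n)
{
    double x[10];
    double *y = x;

    return (size_t)((char *)(y + n) - (char *)y);
}

/* Returns 1 if *(y + 3) and x[3] name the same element. */
int same_element(void)
{
    double x[10] = {0};
    double *y = x;

    *(y + 3) = 13;
    return x[3] == 13 && (y + 3) == &x[3];
}
```

byte_step(3) equals 3 * sizeof(double): the compiler multiplies the offset by the element size, just as the LC-3 version adds 6 words for element 3 of a two-word type.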

How important is understanding memory?

 C does not enforce memory safety

  • Many ways to access memory illegally
     Accessing an array out of bounds
     Bad pointer arithmetic manipulations

 Some memory bugs come into play only rarely

  • When manipulating large files/strings, etc.
     Accessing an array out of bounds
     Bad pointer arithmetic manipulations

 Crash errors: Program accesses illegal memory (SEG Fault)

  • OK for user programs, not so good for OS programs

 Non-crash errors: Strange glitches, “magic” results

 Intentional exploits: These bugs are repeatable and can be exploited to cause great harm

Consider: How can memory be exploited?

void read_array()
{
    int array[6];
    int index;
    int hex;

    index = 0;
    do {
        hex = getHexInput();
        array[index] = hex;
        index++;
    } while (hex != 0);   /* repeat until hex == 0 */
    // code to manipulate data
}

Consider: Prologue

LC-3 prologue for read_array():

        .ORIG x3000
READ_AR ADD R6, R6, #-1   ; push return
        ADD R6, R6, #-1   ; push ret link
        STR R7, R6, #0
        ADD R6, R6, #-1   ; push frame ptr
        STR R5, R6, #0
        ADD R5, R6, #-1   ; set frame ptr
        ADD R6, R6, #-6   ; int array[6] (#0 to #-5)
        ADD R6, R6, #-1   ; int index (#-6)
        ADD R6, R6, #-1   ; int hex (#-7)
        ADD R6, R6, #-1   ; Callee save R0 (#-8)
        STR R0, R5, #-8
        ADD R6, R6, #-1   ; Callee save R1 (#-9)
        STR R1, R5, #-9
        AND R0, R0, #0    ; index = 0
        STR R0, R5, #-6
        …

Consider: Body

LC-3 body for read_array():

        …
NEXT    TRAP x40          ; hex = getHexInput()
        STR R0, R5, #-7
                          ; array[index] = hex
        LDR R0, R5, #-6   ; R0 <- index
        ADD R1, R5, #-5   ; R1 <- &array[0]
        ADD R1, R1, R0    ; R1 <- &array[index]
        LDR R0, R5, #-7   ; R0 <- hex
        STR R0, R1, #0    ; array[index] = hex
        LDR R1, R5, #-6   ; index++
        ADD R1, R1, #1
        STR R1, R5, #-6
        LDR R0, R5, #-7   ; until (hex == 0)
        BRnp NEXT
        …

Consider: Epilogue

LC-3 epilogue for read_array():

        …
        LDR R1, R5, #-9   ; Callee restore R1
        ADD R6, R6, #1
        LDR R0, R5, #-8   ; Callee restore R0
        ADD R6, R6, #1
        ADD R6, R6, #8    ; Pop locals
        LDR R5, R6, #0    ; pop frame ptr
        ADD R6, R6, #1
        LDR R7, R6, #0    ; pop ret link
        ADD R6, R6, #1
        RET
        .END
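The loop in read_array never compares index against the size of array, so a long enough input keeps writing past array[5], toward the saved frame pointer and return link that the prologue pushed. A bounds-checked variant might look like the sketch below. The names read_array_bounded, ARRAY_LEN, and endless_input are hypothetical, and the function-pointer parameter stands in for getHexInput(), whose definition is not shown in the source.

```c
#define ARRAY_LEN 6

/* Hypothetical variant of read_array: `next` stands in for getHexInput().
   Stores at most ARRAY_LEN values and returns how many were stored. */
int read_array_bounded(int array[ARRAY_LEN], int (*next)(void))
{
    int index = 0;
    int hex;

    do {
        hex = next();
        array[index] = hex;
        index++;
    } while (hex != 0 && index < ARRAY_LEN);   /* stop before overrunning */

    return index;
}

/* Stand-in input source for illustration: yields 1, 2, 3, ... and never 0. */
int input_counter = 0;
int endless_input(void)
{
    return ++input_counter;
}
```

Even with an input source that never produces the terminating 0, the loop stops after six stores, so nothing beyond the array is overwritten.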

Exploits Based on Buffer Overflows

 Buffer overflow bugs allow remote machines to execute arbitrary code on victim machines.

 Internet worm

  • Early versions of the finger server (fingerd) used gets() to read the argument sent by the client:
     finger [email protected]
  • Worm attacked fingerd server by sending phony argument:
     finger “exploit-code padding new-return-address”
     exploit code: executed a root shell on the victim machine with a direct TCP connection to the attacker.

 IM War

  • AOL exploited existing buffer overflow bug in AIM clients
  • exploit code: returned 4-byte signature (the bytes at some location in the AIM client) to server.
  • When Microsoft changed code to match signature, AOL changed signature location.

Code Red Worm

 History

  • June 18, 2001. Microsoft announces buffer overflow vulnerability in IIS Internet server
  • July 19, 2001. Over 250,000 machines infected by new virus in 9 hours
  • White House must change its IP address. Pentagon shut down public WWW servers for day

 When We Set Up CS:APP Web Site

  • Received strings of form
    GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN....NNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN%u9090%u6858%ucbd3%u7801%u9090%u 858%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u9090%u8190%u00c3%u 0003%u8b00%u531b%u53ff%u0078%u0000%u00=a HTTP/1.0" 400 325 "-" "-"

Code Red Exploit Code

  • Starts 100 threads running
  • Spread self
     Generate random IP addresses & send attack string
     Between 1st & 19th of month
  • Attack www.whitehouse.gov
     Send 98,304 packets; sleep for 4-1/ hours; repeat
     Denial of service attack
     Between 21st & 27th of month
  • Deface server’s home page
     After waiting 2 hours

iPhone hack

 Exploits stack overflow of jpeg display dll (which runs as root)

 What does this mean about the safety of viewing images, in general?

 The safety of using library code, in general?

 What are your ethical responsibilities?

Avoiding Overflow Vulnerability

 Use Library Routines that Limit String Lengths

  • Use “%ns” NOT “%s”
     Why %3 in the code below?
  • Do NOT use cin in C++

/* Echo Line */
void echo()
{
    char buf[4];   /* Way too small! */
    scanf("%3s", buf);
    printf("%s", buf);
}
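The width in “%3s” leaves one byte of the 4-byte buffer for the terminating zero. The sketch below applies the same limit with sscanf so the behavior can be checked without typing input; read_word and BUF_LEN are hypothetical names introduced for the illustration.

```c
#include <stdio.h>

#define BUF_LEN 4

/* Copies at most BUF_LEN - 1 non-whitespace characters from `line` into buf.
   Returns 1 if a word was read, 0 otherwise. */
int read_word(const char *line, char buf[BUF_LEN])
{
    buf[0] = '\0';                           /* defined contents on failure */
    return sscanf(line, "%3s", buf) == 1;    /* 3 == BUF_LEN - 1 */
}
```

Given the input "overflow", only "ove" is stored, followed by the terminating zero, so the write never leaves the buffer. The width is a literal inside the format string, so it has to be kept in step with the buffer size by hand.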