Understanding Pointers and Arrays in C: WSU CEG 320/520 Comp. Org. & Assembly
Chapter 16
Why do we need Pointers?
Call by Value vs. Call by Reference in detail
Implementing Arrays
Buffer Overflow / The “Stack Hack”
Wright State University, College of Engineering. Dr. Doom, Computer Science & Engineering. CEG 320/520 Comp. Org. & Assembly
A problem with parameter passing via stack
Consider the following function that's designed to
swap the values of its arguments.
void Swap(int firstVal, int secondVal) {
    int tempVal = firstVal;
    firstVal = secondVal;
    secondVal = tempVal;
}
int main() {
    int valueA = 3, valueB = 4;
    …
    Swap(valueA, valueB);
    …
}
Executing the Swap Function
[Figure: run-time stack before and after the call. In Swap's activation record (tempVal, firstVal, secondVal) these values changed... but in main's activation record (valueB, valueA) these did not.]
Pointers and Arrays
Functions such as the swap example need to be able to access variables
stored in memory locations outside of their own activation record
- A function’s activation record defines its “scope”
- We've seen examples of how to do this in Assembly.
Pointer
- Address of a variable in memory
- Allows us to indirectly access variables; in other words, we can talk about a variable's address rather than its value
Array (still a pointer!)
- An area of allocated memory with values arranged sequentially
- Expression a[4] refers to the 5th element of the array a
- The array variable is a pointer to the base of the array
- Base + offset
- Thus… the first element has index 0
Pointers in C
C lets us talk about and manipulate addresses as “pointer variables”
But first, let's refresh the somewhat confusing bits.
&: The “address-of” or “reference” operator
- This operator does one thing.
- It returns the address of the variable which follows
#include <stdio.h>
int main() {
    int x = 0;
    printf("Address of x ");
    /* %p expects a void *; its exact output format is implementation-defined */
    printf("= 0x%p \n", (void *)&x);
    return 0;
}
Output: Address of x = 0x0065FDF
Pointers in C
How do we store addresses? Pointer variables!
- Although all pointers in C are exactly the same type (address) they are also typed by the compiler so that the data to which they refer can be appropriately interpreted.
- A pointer in C is always a pointer to a particular data type: int, double, char*, etc.
Declaration
- int *p; /* p is a pointer to an int */
Operators
- *p -- returns the value pointed to by p (indirect address / dereference op)
- &z -- returns the address of variable z (address-of operator)
Important point of common confusion!
- * means “a pointer variable” when used in a declaration
- * means “access the information that this address points to” elsewhere
- What does *3 mean?
Check for understanding
#include <stdio.h>
int main() {
    int x = 12;
    int *ptr = &x;
    /* %p with a void * is the portable way to print an address;
       %x is kept for comparison only and is not portable for pointers */
    printf("Address of x:\t 0x%p\n", (void *)ptr);
    printf("Address of x:\t 0x%x\n", &x);
    printf("Address of ptr:\t 0x%x\n", &ptr);
    printf("Value of x:\t %d\n", *ptr);
    return 0;
}
Output:
Address of x: 0x0065FDF
Address of x: 0x65fdf
Address of ptr: 0x65fdf
Value of x: 12
Check for understanding
#include <stdio.h>
int main() {
    int x[10] = {0,1,2,3,4,5,6,7,8,9};
    printf("Address of x[0]:\t 0x%p\n", (void *)&x[0]);
    printf("Address of x:\t 0x%p\n", (void *)x);
    printf("Value of x[0]:\t %d\n", x[0]);
    printf("Value of x[0]:\t %d\n", *x);
    return 0;
}
Output:
Address of x[0]: 0x0065FDD
Address of x: 0x0065FDD
Value of x[0]: 0
Value of x[0]: 0
Example
int i;
int *ptr;
i = 4;            /* store the value 4 into the memory location associated with i */
ptr = &i;         /* store the address of i into the memory location associated with ptr */
*ptr = *ptr + 1;  /* read the contents of memory at the address stored in ptr,
                     store the result into memory at the address stored in ptr */
Example: LC-3 Code
; i is 1st local (offset 0), ptr is 2nd (offset -1)
; i = 4;
AND R0, R0, #0   ; clear R0
ADD R0, R0, #4   ; put 4 in R0
STR R0, R5, #0   ; store in i
; ptr = &i;
ADD R0, R5, #0   ; R0 = R5 + 0 (addr of i)
STR R0, R5, #-1  ; store in ptr
; *ptr = *ptr + 1;
LDR R0, R5, #-1  ; R0 = ptr
LDR R1, R0, #0   ; load contents (*ptr)
ADD R1, R1, #1   ; add one
STR R1, R0, #0   ; store result where R0 points

Name  Type  Offset  Scope
i     int    0      main
ptr   int*  -1      main

[Figure: main's activation record on the run-time stack, showing ptr, i, fp, pc and ret; ptr holds the address of i.]
Call by reference
Passing a pointer as an argument allows the function to read/change
memory outside its activation record.
- But not the pointer itself!
- If you wanted to change the pointer itself, what would you need to do?
void NewSwap(int *firstVal, int *secondVal)
{
    int tempVal = *firstVal;
    *firstVal = *secondVal;
    *secondVal = tempVal;
}
Arguments are integer pointers.
Caller passes addresses of variables that it wants function to change.
Passing Pointers to a Function
main() wants to swap the values of valueA and valueB
NewSwap(&valueA, &valueB);
LC-3 Code for main:
ADD R0, R5, #-1  ; addr of valueB
ADD R6, R6, #-1  ; push
STR R0, R6, #0
ADD R0, R5, #0   ; addr of valueA
ADD R6, R6, #-1  ; push
STR R0, R6, #0
[Figure: run-time stack during the call; NewSwap's parameters firstVal and secondVal hold the addresses of main's valueA and valueB.]
Arrays are Pointers
char word[10];
char *cptr;
cptr = word; /* points to word[0] */
Note that you CAN change cptr, but you CANNOT change word.
- What is the difference between them?
Each line below gives three equivalent expressions:
- cptr          word          &word[0]
- (cptr + n)    (word + n)    &word[n]
- *cptr         *word         word[0]
- *(cptr + n)   *(word + n)   word[n]
Multi-dimensional arrays
How do you lay out a multi-dimensional array in one-dimensional
memory?
- Array layout is critical for correctly passing arrays between programs written in different languages.
- It is also important for performance when traversing an array because accessing array elements that are contiguous in memory is usually faster than accessing elements which are not, due to caching.
Row-major order
- In row-major storage, a multidimensional array is laid out in linear memory so that rows are stored one after the other.
- { {1, 2, 3} , {4, 5, 6} } is stored 1, 2, 3, 4, 5, 6
- offset = row * NUMCOLS + column
Column-major order
- Columns are stored one after the other: { {1, 2, 3} , {4, 5, 6} } is stored 1, 4, 2, 5, 3, 6
Row-major order is used in C, C++, Java, and most modern languages.
Column-major order is used in Fortran and Matlab.
Passing Arrays as Arguments
C passes arrays by reference
- the address of the array (i.e., of the first element) is written to the function's activation record
- otherwise, would have to copy each element
int main() {
    int numbers[MAX_NUMS];
    …
    mean = Average(numbers);
    …
}
int Average(int inputValues[MAX_NUMS]) {
    …
    for (index = 0; index < MAX_NUMS; index++)
        sum = sum + inputValues[index];
    return (sum / MAX_NUMS);
}
Note: Size of static array must be known at compile time, so MAX_NUMS must be a constant
A String is an Array of Characters
Allocate space for a string just like any other array:
char outputString[16];
Space for string must contain room for terminating zero.
Special syntax for initializing a string:
char outputString[16] = "Result = ";
…which is the same as:
outputString[0] = 'R';
outputString[1] = 'e';
outputString[2] = 's';
…
outputString[9] = '\0';
I/O with Strings
Printf and scanf use "%s" format character for string
Printf -- print characters up to terminating zero
printf("%s", outputString);
Scanf -- read characters until whitespace, store result in string, and
terminate with zero
scanf("%s", inputString);
Common Pitfalls with Arrays in C
Overrun array limits
- There is no checking at run-time or compile-time to see whether a reference is within array bounds.
    int array[10];
    int i;
    for (i = 0; i <= 11; i++)
        array[i] = 0;
  What will happen?
- Think about the activation record!
Declaration with variable size
- Size of array must be known at compile time in C89. (C99 added variable-length arrays, which C11 made an optional feature.)
    void SomeFunction(int num_elements) {
        int temp[num_elements];
        …
    }
Pointer Arithmetic
Address calculations depend on size of elements
- In our LC-3 code, we've been assuming one word per element. e.g., to find element 4 (a[4]), we add 4 to the base address
- That's OK, because we've only shown code for int and char, both of which take up one word.
- If double (two words per element), we'd have to add 8 to find the address of element 4.
C does size calculations under the covers,
depending on size of item being pointed to:
double x[10];     /* allocates 20 words (2 per element) */
double *y = x;
*(y + 3) = 13;    /* same as x[3] -- base address plus 6 */
How important is understanding memory?
C does not enforce memory safety
- Many ways to access memory illegally
  - Accessing an array out of bounds
  - Bad pointer arithmetic manipulations
Some memory bugs come into play only rarely
- When manipulating large files/strings, etc.
  - Accessing an array out of bounds
  - Bad pointer arithmetic manipulations
Crash errors: Program accesses illegal memory (SEG Fault)
- OK for user programs, not so good for OS programs
Non-crash errors: Strange glitches, “magic” results
Intentional exploits: These bugs are repeatable and can be exploited to
cause great harm
Consider: How can memory be exploited?
void read_array() {
    int array[6];
    int index;
    int hex;
    index = 0;
    do {
        hex = getHexInput();
        array[index] = hex;
        index++;
    } while (hex != 0);   /* repeat until hex == 0 */
    // code to manipulate data
}
Consider: Prologue
        .ORIG x3000
READ_AR ADD R6, R6, #-1   ; push return
        ADD R6, R6, #-1   ; push ret link
        STR R7, R6, #0
        ADD R6, R6, #-1   ; push frame ptr
        STR R5, R6, #0
        ADD R5, R6, #-1   ; set frame ptr
        ADD R6, R6, #-6   ; int array[6] (#0 to #-5)
        ADD R6, R6, #-1   ; int index (#-6)
        ADD R6, R6, #-1   ; int hex (#-7)
        ADD R6, R6, #-1   ; Callee save R0 (#-8)
        STR R0, R5, #-8
        ADD R6, R6, #-1   ; Callee save R1 (#-9)
        STR R1, R5, #-9
        AND R0, R0, #0    ; index = 0
        STR R0, R5, #-6   ; index is at offset #-6
        …
Consider: Body
        …
NEXT    TRAP x40          ; hex = getHexInput()
        STR R0, R5, #-7
                          ; array[index] = hex
        LDR R0, R5, #-6   ; R0 <- index
        ADD R1, R5, #-5   ; R1 <- &array[0]
        ADD R1, R1, R0    ; R1 <- &array[index]
        LDR R0, R5, #-7   ; R0 <- hex
        STR R0, R1, #0    ; array[index] = hex
        LDR R1, R5, #-6   ; index++
        ADD R1, R1, #1
        STR R1, R5, #-6
        LDR R0, R5, #-7   ; until (hex == 0)
        BRnp NEXT
        …
Consider: Epilogue
        …
        LDR R1, R5, #-9   ; Callee restore R1
        ADD R6, R6, #1
        LDR R0, R5, #-8   ; Callee restore R0
        ADD R6, R6, #1
        ADD R6, R6, #8    ; Pop locals
        LDR R5, R6, #0    ; pop frame ptr
        ADD R6, R6, #1
        LDR R7, R6, #0    ; pop ret link
        ADD R6, R6, #1
        RET
        .END
Exploits Based on Buffer Overflows
Buffer overflow bugs allow remote machines to execute arbitrary code
on victim machines.
Internet worm
- Early versions of the finger server (fingerd) used gets() to read the argument sent by the client: finger [email protected]
- Worm attacked fingerd server by sending phony argument: finger “exploit-code padding new-return-address”
- exploit code: executed a root shell on the victim machine with a direct TCP connection to the attacker.
IM War
- AOL exploited existing buffer overflow bug in AIM clients
- exploit code: returned 4-byte signature (the bytes at some location in the AIM client) to server.
- When Microsoft changed code to match signature, AOL changed signature location.
Code Red Worm
History
- June 18, 2001. Microsoft announces buffer overflow vulnerability in IIS Internet server
- July 19, 2001. over 250,000 machines infected by new virus in 9 hours
- White House must change its IP address. Pentagon shut down public WWW servers for a day
When We Set Up CS:APP Web Site
- Received strings of form
  GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN....NNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN%u9090%u6858%ucbd3%u7801%u9090%u 858%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u9090%u8190%u00c3%u 0003%u8b00%u531b%u53ff%u0078%u0000%u00=a HTTP/1.0" 400 325 "-" "-"
Code Red Exploit Code
- Starts 100 threads running
- Spread self
  - Generate random IP addresses & send attack string
  - Between 1st & 19th of month
- Attack www.whitehouse.gov
  - Send 98,304 packets; sleep for 4-1/ hours; repeat (denial of service attack)
  - Between 21st & 27th of month
- Deface server’s home page
  - After waiting 2 hours
iPhone hack
Exploits stack overflow of jpeg display dll (which runs as root)
What does this mean about the safety of viewing images, in general?
The safety of using library code, in general?
What are your ethical responsibilities?
Avoiding Overflow Vulnerability
/* Echo Line */
void echo() {
    char buf[4];   /* Way too small! */
    scanf("%3s", buf);
    printf("%s", buf);
}
Use Library Routines that Limit String Lengths
- Use “%ns” NOT “%s”. Why %3 above?
- Do NOT use cin >> with a fixed-size char array in C++