CS 3214 Spring 2010 Midterm Exam - Compiling and Optimization - Prof. Godmar Volker Back, Exams of Computer Science

A CS 3214 Spring 2010 midterm exam focusing on how programs are compiled for the IA32 architecture. The exam covers topics such as compiler optimizations, assembly code, and function pointers. It includes questions on the compilation of a buggy C program, the analysis of assembly code produced by gcc, and the comparison of switch statements and if/else statements.

Uploaded on 10/26/2010

CS 3214 Sample Midterm (Spring 2010)
Solutions are shown in this style. This exam was given in Spring 2010.
1. Executing Programs on IA32 (30 pts)
The following questions relate to how programs are compiled for IA32.
a) (8 pts) Consider the following buggy program contained in a file sum.c
#include <stdio.h>

int
sum(int a, int b)
{
    int s = a + b;
    // return statement is missing
}

int
main()
{
    printf("%d\n", sum(1, 2));
}
i. (4 pts) When the program is compiled with ‘gcc -o sum sum.c’ and run,
it will output ‘3’. Explain why 3 is output!
Since no optimization level is specified, the compiler will emit code for all
statements it sees. This includes the computation of ‘s’ as the sum of ‘a’ and ‘b’.
‘s’ happens to be computed in register %eax, so it coincidentally becomes the
return value of sum.
ii. (4 pts) When the program is compiled with ‘gcc -O2 -o sum sum.c’
and run, it outputs a number such as -1074516556. Which analysis or
optimization on the part of the compiler causes the generation of code
that leads to this different result?
The compiler determines that ‘s’ is not used and thus does not emit code to
compute it; %eax in this case contains whichever value it had from the last time it
was used. Or, if this function is inlined, the compiler may completely remove the
function call since none of its computed values is used, then pass an uninitialized
value to printf() (essentially, whatever is in memory at the address where printf
expects its second argument).
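For comparison, a minimal sketch of the corrected function is shown here. The only change is the added return statement, which makes the result well defined at every optimization level. Compiling the original with ‘gcc -Wall’ reports the missing return through the -Wreturn-type warning.

```c
/* Corrected version of the exam's sum(): the result is returned explicitly,
 * so it no longer depends on what happens to be left in %eax. */
int sum(int a, int b)
{
    int s = a + b;
    return s;   /* the statement that the buggy version omits */
}
```

With this version, printf("%d\n", sum(1, 2)) prints 3 with and without -O2.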
b) (12 pts) Consider the following assembly code, which was produced by
gcc for a function ‘t’. The left column shows the result when compiling with
optimizations at level 1 (-O1).

IA32 code, compiled with -O1        C code

t:                                  void t(struct node *node)
    pushl %ebp                      {
    movl %esp, %ebp                     if (node->left)
    pushl %ebx                              t(node->left);
    subl $4, %esp                       visit(node);
    movl 8(%ebp), %ebx                  if (node->right)
    movl (%ebx), %eax                       t(node->right);
    testl %eax, %eax                }
    je .L2
    movl %eax, (%esp)
    call t
.L2:
    movl %ebx, (%esp)
    call visit
    movl 4(%ebx), %eax
    testl %eax, %eax
    je .L6
    movl %eax, (%esp)
    call t
.L6:
    addl $4, %esp
    popl %ebx
    popl %ebp
    ret

Provide a C version of function t()!

Hint: t() accepts a pointer to this struct:

struct node {
    struct node *left;
    struct node *right;
};
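The traversal order of t() can be checked with a small harness. The sketch below is an illustration only: the exam does not define visit(), so the version here, together with the names visited, nvisited and MAX_VISITS, is a hypothetical stand-in that records the order in which nodes are visited.

```c
#include <stddef.h>

struct node {
    struct node *left;
    struct node *right;
};

/* Hypothetical bookkeeping, not part of the exam: remembers visit order. */
#define MAX_VISITS 16
struct node *visited[MAX_VISITS];
size_t nvisited;

/* Hypothetical visit(): appends the node to the visited[] log. */
void visit(struct node *node)
{
    if (nvisited < MAX_VISITS)
        visited[nvisited++] = node;
}

/* In-order traversal: left subtree, the node itself, then right subtree. */
void t(struct node *node)
{
    if (node->left)
        t(node->left);
    visit(node);
    if (node->right)
        t(node->right);
}
```

For a root with two leaf children, calling t() on the root logs the left child, then the root, then the right child.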

c) (5 pts) Consider the following program:

#include <stdio.h>

int mystery_function(int arg, ...)
{
    int *p = &arg;
    int r = 1;

    while (*p)
        r = r * *p++;

    leal (%edx,%eax,4), %eax
    movl 8(%ebp), %edx
    movl (%eax,%edx), %eax
    popl %ebp
    ret

access2:
    pushl %ebp
    movl %esp, %ebp
    imull $120, 12(%ebp), %eax
    addl 8(%ebp), %eax
    popl %ebp
    ret

int * access2(int A[][30], int x)
{
    return A[x];
}

access3:
    pushl %ebp
    movl %esp, %ebp
    movl 12(%ebp), %edx
    movl 8(%ebp), %eax
    movl (%eax,%edx,4), %edx
    movl 16(%ebp), %eax
    movl (%edx,%eax,4), %eax
    popl %ebp
    ret

int access3(int *A[], int x, int y)
{
    return A[x][y];
}

Aside: the outermost dimension doesn’t matter, so you could have written int

A[anything][30], it doesn’t change what code is generated.
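The constant 120 in access2 is the size of one row, 30 * sizeof(int), assuming a 4-byte int as on IA32. The sketch below illustrates this address arithmetic; row_offset_bytes is a hypothetical helper added for the illustration and is not part of the exam.

```c
#include <assert.h>
#include <stddef.h>

/* Same signatures as the exam's solutions. */
int *access2(int A[][30], int x)
{
    return A[x];          /* address of row x: A + x * 30 * sizeof(int) */
}

int access3(int *A[], int x, int y)
{
    return A[x][y];       /* two loads: the row pointer A[x], then element y */
}

/* Hypothetical helper: distance in bytes from the start of A to row x. */
ptrdiff_t row_offset_bytes(int A[][30], int x)
{
    assert(x >= 0);
    return (char *)access2(A, x) - (char *)A;
}
```

With a 4-byte int, row_offset_bytes(A, 2) is 240, matching the multiplication by 120 in the assembly. The number of rows never enters the computation, which is why the outermost dimension can be left out.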

b) (15 pts) Consider the following piece of code:

#include <stdio.h>

struct alpha {
    int x;
    char a;
    int z;
    char b;
    char c;
    char d;
};

int main()
{
    int data[] = {8, 49, 0, 7, 4, 53, 6, 1, 0};
    struct alpha *ap = (struct alpha *) data;
    printf("%d %c %d, %d %c %d\n", ap[0].x, ap[0].a, ap[0].z,
           ap[1].x, ap[1].a, ap[1].z);
}

i. (5 pts) Draw a picture of how the array “data” is laid out in

memory. On the same picture, indicate where all the elements of

ap[0] and ap[1] are stored.

data (bytes)    struct fields
 8 0 0 0        ap[0].x
49 0 0 0        ap[0].a
 0 0 0 0        ap[0].z
 7 0 0 0        ap[0].b .c .d
 4 0 0 0        ap[1].x
53 0 0 0        ap[1].a
 6 0 0 0        ap[1].z
 1 0 0 0        ap[1].b .c .d
 0 0 0 0

ii. (3 pts) What is the output of the above program?

Hint: ASCII code for character ‘0’ is 48.
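The output can be worked out from the layout in part i. The sketch below reproduces it under the assumption of a little-endian machine with a 4-byte int, such as IA32. format_alpha_output is a hypothetical helper, and it uses memcpy in place of the exam's pointer cast so that the example does not depend on aliasing behavior.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct alpha {
    int x;
    char a;   /* followed by 3 bytes of padding so that z is 4-byte aligned */
    int z;
    char b;
    char c;
    char d;   /* followed by 1 byte of padding; sizeof is 16 on IA32 */
};

/* Hypothetical helper: writes the exam program's output into out. */
int format_alpha_output(char *out, size_t n)
{
    int data[] = {8, 49, 0, 7, 4, 53, 6, 1, 0};
    struct alpha ap[2];

    assert(sizeof ap <= sizeof data);
    memcpy(ap, data, sizeof ap);   /* same bytes the cast would expose */
    return snprintf(out, n, "%d %c %d, %d %c %d\n",
                    ap[0].x, ap[0].a, ap[0].z,
                    ap[1].x, ap[1].a, ap[1].z);
}
```

Under these assumptions the buffer holds "8 1 0, 4 5 6": the values 49 and 53 are printed with %c as the characters ‘1’ and ‘5’, since ‘0’ is 48.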

iii. (3 pts) How would you rearrange the elements of alpha so as to

minimize the amount of memory used to store it? Provide C code!

Any arrangement that groups the ‘char’ fields adjacent to each other

works, such as:

struct alpha { int x; int z; char a; char b; char c; char d; };

The size of the struct is reduced from 16 to 12 bytes.

iv. (4 pts) What will be the program output with your new version of

alpha?

For the rearrangement above, the output is “8 \0 49, 7 5 4”, where \0 stands for a zero byte.

(Note that outputting a zero byte value (‘\0’) using %c will not produce any

output, nor advance the terminal’s cursor. You can see the 0 byte by

piping the output to the ‘od’ command.)
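The effect of the rearrangement can be checked in the same way. This sketch again assumes a little-endian machine with a 4-byte int. The names alpha_packed and format_packed_output are hypothetical, and memcpy again stands in for the pointer cast.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Rearranged struct from part iii: the four chars share one 4-byte slot. */
struct alpha_packed {
    int x;
    int z;
    char a;
    char b;
    char c;
    char d;
};

/* Hypothetical helper: writes the program's output into out and returns
 * its length; the text contains an embedded zero byte. */
int format_packed_output(char *out, size_t n)
{
    int data[] = {8, 49, 0, 7, 4, 53, 6, 1, 0};
    struct alpha_packed ap[2];

    assert(sizeof ap <= sizeof data);
    memcpy(ap, data, sizeof ap);
    return snprintf(out, n, "%d %c %d, %d %c %d\n",
                    ap[0].x, ap[0].a, ap[0].z,
                    ap[1].x, ap[1].a, ap[1].z);
}
```

Here ap[0].a is the first byte of data[2], which is zero, and ap[1] starts at data[3], so ap[1].x is 7, ap[1].z is 4, and ap[1].a is 53, the character ‘5’. Because the zero byte is embedded in the text, the result has to be compared with memcmp rather than strcmp.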

3. Buffer Overflows (28 pts)

In project 2, we explored how the lack of buffer bounds checks can allow a

malicious attacker to exploit a program vulnerability and execute dangerous

code. In this question, we will explore several techniques used to deter such

buffer overflow attacks.

a) (4 points) A proposed technique involves programming the memory

protection hardware to disallow execution of instructions fetched from

stack memory. Explain the rationale for this technique!

func_caller (good_func); }

In this code, main calls func_caller with a pointer good_func,

which then invokes this function using the passed pointer. If gets()

reads less than 10 bytes, the output of the program is “This is the

good function”. Based on your answer to part ii), discuss if the use

of a canary word could prevent an attacker from crafting an exploit

string that would cause the invocation of bad_func instead!

No, it would not, since the indirect function call occurs before the ‘ret’ instruction is

reached.

(If your answer to b) included the assumption that the compiler generates code

that checks the canary before each indirect function call as well as before

returning, the answer would be yes, because ‘func_ptr’ is located above the

canary on the stack.)
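The ordering argument can be illustrated without a real stack overflow. The sketch below is a simulation under stated assumptions: struct fake_frame is a hypothetical stand-in for the part of func_caller's frame that holds the buffer and ‘func_ptr’, the text returned by bad_func is invented for the illustration, and the copy is bounded to the struct so that the example stays well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef const char *(*func_t)(void);

const char *good_func(void) { return "This is the good function"; }
const char *bad_func(void)  { return "This is the bad function"; }  /* invented text */

/* Hypothetical model of the frame: the pointer sits above the buffer. */
struct fake_frame {
    char buf[10];
    func_t func_ptr;
};

/* Copies len input bytes starting at buf, then makes the indirect call.
 * Input longer than buf runs on into func_ptr, as an unchecked gets() would. */
const char *simulate_overflow(const unsigned char *input, size_t len)
{
    struct fake_frame frame;

    memset(&frame, 0, sizeof frame);
    frame.func_ptr = good_func;
    assert(len <= sizeof frame);     /* keep the simulation inside the struct */
    memcpy(&frame, input, len);
    return frame.func_ptr();         /* happens before any return-time canary check */
}
```

A short input leaves func_ptr untouched, so good_func runs. An input of sizeof(struct fake_frame) bytes that carries the address of bad_func at offsetof(struct fake_frame, func_ptr) redirects the call, and a canary that is only checked before ‘ret’ would not have been consulted yet.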

c) (4 points) Could the vulnerability described in part b) be avoided using

address space randomization, which is a technique that places a

program’s stack at randomly chosen addresses, which differ from run to

run?

The attack described in part b) iii) cannot, because it does not rely on knowing

the address of the current stack frame – unlike a conventional buffer overflow

attack in which the return address is overwritten to point to exploit code located

on the stack.

However, a generic stack overflow attack (such as the one you implemented in

Project 2) can often be prevented.

d) Assume that a program contains the following bug:

if (some rare error condition occurs) {
    int errorcode = 31;
    char errormsg[] = "Some Rare Error Occurred\n";
    printf("Error: %d: %x", errorcode, errormsg);
}

The correct format string would have been “Error: %d: %s”.

i. (4 pts) Is this bug exploitable, i.e., could an attacker use it to

achieve the execution of code under their control? Justify your

answer!

No, this bug does not allow the injection of code controlled by an attacker.

ii. (4 pts) Is this bug security-relevant, i.e., could it aid or amplify a

possible attack? Justify your answer!

Yes, if an attacker can trigger the ‘rare error condition,’ they will learn the address

of ‘errormsg[0]’, which is an address within the current stack frame. This

knowledge can help craft an exploit if there is a (separate) buffer overflow

vulnerability.
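The difference between the buggy and the corrected format string can be sketched as follows. The function names format_buggy and format_fixed are hypothetical, and %p is used in place of the exam's %x because %x is not a valid conversion for a pointer argument. The point is the same: the buggy call prints where the message lives instead of what it says.

```c
#include <assert.h>
#include <stdio.h>

/* Mirrors the bug: prints the address of the message. */
int format_buggy(char *out, size_t n, const char *errormsg)
{
    assert(out != NULL && errormsg != NULL);
    return snprintf(out, n, "Error: %d: %p", 31, (void *)errormsg);
}

/* Corrected format string: prints the message text itself. */
int format_fixed(char *out, size_t n, const char *errormsg)
{
    assert(out != NULL && errormsg != NULL);
    return snprintf(out, n, "Error: %d: %s", 31, errormsg);
}
```

When errormsg is a local array, as in the exam's code, the address printed by the buggy version is a stack address, which is the information an attacker could use.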

4. Optimization (12 pts)

Consider the following 2 versions of a function that computes the square root of a

number using Newton’s algorithm. The square root is written to the location

pointed to by ‘root’, and the number of iterations is returned. Both versions are

nearly identical, except that the second version uses a local variable ‘x’ to hold

the intermediate values of the to-be-computed square root.

int square_root_newton(double n, double *root)
{
    double xn = 1.0;
    int niter = 0;

    do {
        *root = xn;
        xn = *root - (*root * *root - n) / (2 * *root);
        niter++;
    } while (fabs(xn - *root) > 1e-6);

    return niter;
}

int square_root_newton2(double n, double *root)
{
    double x, xn = 1.0;
    int niter = 0;

    do {
        x = xn;
        xn = x - (x * x - n) / (2 * x);
        niter++;
    } while (fabs(xn - x) > 1e-6);
    *root = x;
    return niter;
}

Assume that ‘fabs’ is implemented as a built-in that maps to a single instruction.

Will one version result in faster code than the other when compiled with an

optimizing compiler? If so, state which one. In either case, justify your answer!
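Before comparing speed, it helps to confirm that the two versions compute the same thing. The sketch below restates both so that they can be run side by side. As an assumption made only for this sketch, a local helper abs_d replaces fabs so that the code builds without the math library. The one difference to look at is that the first version reads and writes *root inside the loop, while the second works on the local x and stores to *root once after the loop. How much of the first version's memory traffic a compiler can remove depends on what it can prove about the pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for fabs (an assumption of this sketch). */
static double abs_d(double v) { return v < 0 ? -v : v; }

/* Version 1: every use of the current estimate goes through *root. */
int square_root_newton(double n, double *root)
{
    double xn = 1.0;
    int niter = 0;

    assert(n > 0.0 && root != NULL);
    do {
        *root = xn;
        xn = *root - (*root * *root - n) / (2 * *root);
        niter++;
    } while (abs_d(xn - *root) > 1e-6);
    return niter;
}

/* Version 2: the estimate lives in the local x; one store at the end. */
int square_root_newton2(double n, double *root)
{
    double x, xn = 1.0;
    int niter = 0;

    assert(n > 0.0 && root != NULL);
    do {
        x = xn;
        xn = x - (x * x - n) / (2 * x);
        niter++;
    } while (abs_d(xn - x) > 1e-6);
    *root = x;
    return niter;
}
```

Calling both with the same n gives the same iteration count and the same value in *root, so any difference between them is one of generated code, not of results.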