CS 3214 Spring 2010 Midterm Exam - Compiling and Optimization - Prof. Godmar Volker Back
Sample Midterm (Spring 2010)
Solutions are shown in this style. This exam was given in Spring 2010.
1. Executing Programs on IA32 (30 pts)
The following questions relate to how programs are compiled for IA32.
a) (8 pts) Consider the following buggy program contained in a file sum.c
#include <stdio.h>
int sum(int a, int b)
{
    int s = a + b;
    // return statement is missing
}

int main()
{
    printf("%d\n", sum(1, 2));
}
i. (4 pts) When the program is compiled with ‘gcc -o sum sum.c’ and run,
it will output ‘3’. Explain why 3 is output!
Since no optimization level is specified, the compiler will emit code for all
statements it sees. This includes the computation of ‘s’ as the sum of ‘a’ and ‘b’.
‘s’ happens to be computed in register %eax, so it coincidentally becomes the
return value of sum.
ii. (4 pts) When the program is compiled with ‘gcc -O2 -o sum sum.c’
and run, it outputs a number such as -1074516556. Which analysis or
optimization on the part of the compiler causes the generation of code
that leads to this different result?
The compiler determines that ‘s’ is not used and thus does not emit code to
compute it; %eax in this case contains whichever value it had from the last time it
was used. Or, if this function is inlined, the compiler may completely remove the
function call since none of its computed values is used, then pass an uninitialized
value to printf() (essentially, whatever is in memory at the address where printf
expects its second argument).
b) (12 pts) Consider the following assembly code, which was produced by
gcc for a function ‘t’. The left column shows the result when compiling with
optimizations at level 1 (-O1).
IA32 code, compiled with -O1        C code

t:                                  void t(struct node *node)
    pushl %ebp                      {
    movl %esp, %ebp                     if (node->left)
    pushl %ebx                              t(node->left);
    subl $4, %esp                       visit(node);
    movl 8(%ebp), %ebx                  if (node->right)
    movl (%ebx), %eax                       t(node->right);
    testl %eax, %eax                }
    je .L
    movl %eax, (%esp)
    call t
.L2:
    movl %ebx, (%esp)
    call visit
    movl 4(%ebx), %eax
    testl %eax, %eax
    je .L
    movl %eax, (%esp)
    call t
.L6:
    addl $4, %esp
    popl %ebx
    popl %ebp
    ret
Provide a C version of function t()!
Hint: t() accepts a pointer to this struct:
struct node { struct node *left; struct node *right; };
c) (5 pts) Consider the following program:
#include <stdio.h>
int mystery_function(int arg, ...)
{
    int * p = &arg;
    int r = 1;
    while (*p)
        r = r * *p++;
    leal (%edx,%eax,4), %eax
    movl 8(%ebp), %edx
    movl (%eax,%edx), %eax
    popl %ebp
    ret
access2:
    pushl %ebp
    movl %esp, %ebp
    imull $120, 12(%ebp), %eax
    addl 8(%ebp), %eax
    popl %ebp
    ret

int * access2(int A[][30], int x)
{
    return A[x];
}
access3:
    pushl %ebp
    movl %esp, %ebp
    movl 12(%ebp), %edx
    movl 8(%ebp), %eax
    movl (%eax,%edx,4), %edx
    movl 16(%ebp), %eax
    movl (%edx,%eax,4), %eax
    popl %ebp
    ret

int access3(int *A[], int x, int y)
{
    return A[x][y];
}
Aside: the outermost dimension doesn’t matter, so you could have written int
A[anything][30], it doesn’t change what code is generated.
b) (15 pts) Consider the following piece of code:
#include <stdio.h>

struct alpha {
    int x;
    char a;
    int z;
    char b;
    char c;
    char d;
};

int main()
{
    int data[] = {8, 49, 0, 7, 4, 53, 6, 1, 0};
    struct alpha* ap = (struct alpha *) data;
    printf ("%d %c %d, %d %c %d\n", ap[0].x, ap[0].a, ap[0].z,
            ap[1].x, ap[1].a, ap[1].z);
}
i. (5 pts) Draw a picture of how the array “data” is laid out in
memory. On the same picture, indicate where all the elements of
ap[0] and ap[1] are stored.
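element    bytes (lowest address first)    overlaid by
data[0]     8 0 0 0                        ap[0].x
data[1]    49 0 0 0                        ap[0].a (first byte), then 3 bytes of padding
data[2]     0 0 0 0                        ap[0].z
data[3]     7 0 0 0                        ap[0].b, ap[0].c, ap[0].d, then 1 byte of padding
data[4]     4 0 0 0                        ap[1].x
data[5]    53 0 0 0                        ap[1].a (first byte), then 3 bytes of padding
data[6]     6 0 0 0                        ap[1].z
data[7]     1 0 0 0                        ap[1].b, ap[1].c, ap[1].d, then 1 byte of padding
data[8]     0 0 0 0                        (beyond ap[1])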
ii. (3 pts) What is the output of the above program?
Hint: ASCII code for character ‘0’ is 48.
iii. (3 pts) How would you rearrange the elements of alpha so as to
minimize the amount of memory used to store it? Provide C code!
Any arrangement that groups the ‘char’ fields adjacent to each other
works, such as:
struct alpha { int x; int z; char a; char b; char c; char d; };
The size of the struct is reduced from 16 to 12.
iv. (4 pts) What will be the program output with your new version of
alpha?
For above code rearrangement,
(Note that outputting a zero byte value (‘\0’) using %c will not produce any
output, nor advance the terminal’s cursor. You can see the 0 byte by
piping the output to the ‘od’ command.)
3. Buffer Overflows (28 pts)
In project 2, we explored how the lack of buffer bounds checks can allow a
malicious attacker to exploit a program vulnerability and execute dangerous
code. In this question, we will explore several techniques used to deter such
buffer overflow attacks.
a) (4 points) A proposed technique involves programming the memory
protection hardware to disallow execution of instructions fetched from
stack memory. Explain the rationale for this technique!
func_caller (good_func); }
In this code, main calls func_caller with a pointer good_func,
which then invokes this function using the passed pointer. If gets()
reads fewer than 10 bytes, the output of the program is “This is the
good function”. Based on your answer to part ii), discuss if the use
of a canary word could prevent an attacker from crafting an exploit
string that would cause the invocation of bad_func instead!
No, it would not, since the indirect function call occurs before the ‘ret’ instruction is
reached.
(If your answer to b) included the assumption that the compiler generates code
that checks the canary before each indirect function call as well as before
returning, the answer would be yes, because ‘func_ptr’ is located above the
canary on the stack.)
c) (4 points) Could the vulnerability described in part b) be avoided using
address space randomization, which is a technique that places a
program’s stack at randomly chosen addresses, which differ from run to
run?
The attack described in part b) iii) cannot, because it does not rely on knowing
the address of the current stack frame – unlike a conventional buffer overflow
attack in which the return address is overwritten to point to exploit code located
on the stack.
However, a generic stack overflow attack (such as the one you implemented in
Project 2) can often be prevented.
d) Assume that a program contains the following bug:
if (some rare error condition occurs) {
    int errorcode = 31;
    char errormsg[] = "Some Rare Error Occurred\n";
    printf("Error: %d: %x", errorcode, errormsg);
}
The correct format string would have been “Error: %d: %s”.
i. (4 pts) Is this bug exploitable, i.e., could an attacker use it to
achieve the execution of code under their control? Justify your
answer!
No, this bug does not allow the injection of code controlled by an attacker.
ii. (4 pts) Is this bug security-relevant, i.e., could it aid or amplify a
possible attack? Justify your answer!
Yes, if an attacker can trigger the ‘rare error condition,’ they will learn the address
of ‘errormsg[0]’, which is an address within the current stack frame. This
knowledge can help craft an exploit if there is a (separate) buffer overflow
vulnerability.
4. Optimization (12 pts)
Consider the following 2 versions of a function that computes the square root of a
number using Newton’s algorithm. The square root is written to the pointer
referred to by ‘root’, and the number of iterations is returned. Both versions are
nearly identical, except that the second version uses a local variable ‘x’ to hold
the intermediate values of the to-be-computed square root.
int square_root_newton(double n, double *root)
{
    double xn = 1.0;
    int niter = 0;

    do {
        *root = xn;
        xn = *root - (*root * *root - n) / (2 * *root);
        niter++;
    } while (fabs(xn - *root) > 1e-6);

    return niter;
}
int square_root_newton2(double n, double *root)
{
    double x, xn = 1.0;
    int niter = 0;

    do {
        x = xn;
        xn = x - (x * x - n) / (2 * x);
        niter++;
    } while (fabs(xn - x) > 1e-6);

    *root = x;
    return niter;
}
Assume that ‘fabs’ is implemented as a built-in that maps to a single instruction.
Will one version result in faster code than the other when compiled with an
optimizing compiler? If so, state which one. In either case, justify your answer!