CS 112: Supplementary Notes
Tracing programs
David Wolber
1. Introduction
Programming computers is difficult, even using the high-level programming
languages (Pascal, Smalltalk, C, C++, Java) available today. Part of the difficulty
is understanding the effect of each programming language statement or
command. Most textbooks provide abstract interpretations of the commands.
Though these are helpful, it is also illuminating to study a more concrete
interpretation of programming statements. This chapter provides a method for
this study that allows a student to trace the actual effects on the computer’s
hardware (memory, disk, display device) of each programming language
statement.
One difficulty in tracing programs is that there are several levels of
abstraction between the statements in a high-level programming language, and
what really occurs during program execution. One must have at least a
rudimentary understanding of system tools called translators that convert high-
level programming language statements into low-level code that a computer can
understand, and eventually into actual effects on the computer hardware. There
are two main types of translators: compilers and interpreters. A compiler
generates assembly code that is just one level more abstract than what a
computer understands. An assembler is used to convert the assembly code into
the 0’s and 1’s that make up what is called machine code.
high-level code --[compiler]--> assembly code --[assembler]--> machine code
Discussing the effects of high-level program statements can become
unnecessarily complex if we use the compiler/assembler model of translation. A
conceptually simpler translation scheme is used by tools called interpreters. An
interpreter (in some sense) translates programming language statements directly
Note that the variable declarations have no immediate effect on main
memory: the table above is internal to the interpreter. When the assignment
statements below the declarations are translated, the interpreter uses the table
to store values in the correct place.
x = 3 causes 3 to be stored at address 0.
y = 9.9 causes 9.9 to be stored at address 4.
z[1] = x+4 causes 3+4=7 to be stored at address 12+(1*4)=16.
So, given our assumptions, an answer to this initial question might be the
following, where memory is the rectangle labeled Value:
Address   Value
0:        3
4:        9.9
8:
12:
16:       7
20:
24:
One way to show both the data allocation internal to the interpreter and the
state of main memory is with a table such as the following:
Address   Allocated To   Value
0         x              3
4         y              9.9
8
12        z
16                       7
20
24
This is the type of table that will be used throughout the rest of this chapter
to trace programs.
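The interpreter's internal table can be modeled in a few lines. The following is a minimal sketch, not the interpreter's actual implementation: the names AllocationTable, declare, element, and example_table are hypothetical, and the sizes (4 bytes for x, 8 bytes for y, a 3-element int array for z) are assumptions chosen to match the addresses in the table above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical model of the interpreter's internal allocation table:
// each declared name maps to the address where its storage begins.
struct AllocationTable {
    std::map<std::string, std::size_t> address_of;
    std::size_t next_free = 0;

    // Reserve `bytes` bytes for `name` and return the starting address.
    std::size_t declare(const std::string& name, std::size_t bytes) {
        assert(bytes > 0);
        std::size_t addr = next_free;
        address_of[name] = addr;
        next_free += bytes;
        return addr;
    }

    // Address of element `index` of an array with `elem_size`-byte elements.
    std::size_t element(const std::string& name, std::size_t index,
                        std::size_t elem_size) const {
        return address_of.at(name) + index * elem_size;
    }
};

// Rebuild the table from the example: x at 0, y at 4, z at 12.
AllocationTable example_table() {
    AllocationTable t;
    t.declare("x", 4);   // assumed: a 4-byte int
    t.declare("y", 8);   // assumed: 8 bytes, since z starts at address 12
    t.declare("z", 12);  // assumed: an array of three 4-byte ints
    return t;
}
```

With these assumptions, example_table().element("z", 1, 4) yields 16, the address where z[1] = x+4 stores 7.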
2. Tracing Functions
To describe how an interpreter processes function calls, we must explain
how memory is divided. The interpreter breaks down memory into two parts,
one called the run-time stack and one called the heap (the heap will be
discussed later). The run-time stack is the part of memory where the data
necessary to execute each function is stored. When a function is called, an entity
called an activation record is pushed on the stack. The activation record contains
storage space for the return value, the object, the parameters, and the local
variables of a function. When the function completes, its activation record is
popped off the stack.
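The push and pop behavior can be sketched as a small simulation. This is only an illustrative model under stated assumptions: RunTimeStack, push_record, pop_record, traced_call, and trace_main are hypothetical names, every slot is assumed to be 4 bytes, and each record is laid out as return value, then parameters, then locals.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the run-time stack as an array of 4-byte slots.
// Slot i corresponds to address 4*i in a trace table.
struct RunTimeStack {
    std::vector<int> slots;
    std::vector<std::size_t> frame_starts;  // first slot of each record

    // Push an activation record of `slot_count` slots; return its base address.
    std::size_t push_record(std::size_t slot_count) {
        frame_starts.push_back(slots.size());
        slots.resize(slots.size() + slot_count, 0);
        return frame_starts.back() * 4;
    }

    // Pop the most recent activation record, releasing its slots.
    void pop_record() {
        assert(!frame_starts.empty());
        slots.resize(frame_starts.back());
        frame_starts.pop_back();
    }

    int& at(std::size_t address) { return slots.at(address / 4); }
    std::size_t top_address() const { return slots.size() * 4; }
};

// Trace a call to a function with parameters x, y and one local z.
int traced_call(RunTimeStack& s, int arg_x, int arg_y) {
    std::size_t base = s.push_record(4);   // return value, x, y, z
    s.at(base + 4) = arg_x;                // parameter x
    s.at(base + 8) = arg_y;                // parameter y
    s.at(base + 4) += 1;                   // x = x + 1
    s.at(base + 12) = s.at(base + 4);      // z = x
    s.at(base + 8) = 5 + s.at(base + 12);  // y = 5 + z
    s.at(base) = s.at(base + 8);           // return y
    int result = s.at(base);
    s.pop_record();                        // the record disappears on return
    return result;
}

// main's record holds one local, x; returns the final value at address 0.
int trace_main() {
    RunTimeStack s;
    std::size_t base = s.push_record(1);
    s.at(base) = 9;
    int result = traced_call(s, s.at(base), 9);
    s.at(base) = result;
    return s.at(base);
}
```

Calling trace_main() leaves only main's record on the stack, with the callee's return value copied into x.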
For instance, consider the following code:
int fop(int x, int y)
{
    int z;
    x = x+1;
    z = x;
    y = 5+z;
    return y;
}
int main()
{
    int x;
    x = 9;
    x = fop(x,9);
}
The following is a trace of the above program:
Address   Allocated To       Value
0         x                  9 15     (Main's activation record)
4         fop return value   15       (Fop's activation record)
8         x                  9 10
12        y                  9 15
16        z                  10
20
The interpreter first sets up the activation record for main, which consists of
just 4 bytes to store the local variable x. x is then set to the value 9. Then, the
function call to fop is made. An activation record is pushed on the stack. Within
the activation record, space is first allocated for the return value of the function,
then the parameters ( x and y ), and finally the local variable z. Finally, the
When execution of the function begins, all references to y use one level of
indirection. For example, in the statement:
z = x + y;
x is evaluated as its value, 8, but y is evaluated indirectly by following the
pointer stored as its value. Since the "value" of y is 0, the interpreter takes the
value of address 0 as the value of y. When this statement is executed, the value
at address 0 is 9, so y is evaluated as 9, and z is set to 8+9 = 17.
Similarly, when the statement
y = x + z;
is executed, x + z is evaluated normally as 8+17=25. This value is not stored as
the value of y at address 16, however; the value of y stays fixed at 0. Instead,
indirection is used and 25 is stored at location 0.
An alternative way to view what occurs is by illustrating the value of the by-
reference parameter as a pointer:
Address   Allocated To       Value
0         a                  9 25     (Main's activation record)
4         b                  17
8         foo return value   17       (Foo's activation record)
12        x                  7 8
16        &y                 (arrow to address 0)
20        z                  17
24
This form can make hand-tracing easier, as it helps one to remember to
follow the pointer when a reference to the variable is made. However, one
should always be aware that a pointer is just a number corresponding to some
address in memory.
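The same behavior can be observed in running C++ code. The sketch below is a reconstruction, not the original example program: only the statements z = x + y and y = x + z come from the text, while the statement x = x + 1, the return of z, and the calling code are assumptions inferred from the values in the trace. The helper caller_value_after_call is a hypothetical name.

```cpp
#include <cassert>

// Assumed reconstruction: x is passed by value, y by reference.
int foo(int x, int& y) {
    int z;
    x = x + 1;   // changes only foo's own copy of x
    z = x + y;   // y is read through the reference (one level of indirection)
    y = x + z;   // the store lands in the caller's variable, not in foo's record
    return z;
}

// Returns the caller's variable after the call, so the effect of the
// by-reference store can be observed.
int caller_value_after_call() {
    int a = 9;
    int b = foo(7, a);
    assert(b == 17);
    return a;
}
```

Here a plays the role of the variable at address 0: it starts at 9 and holds 25 after the call, while b receives the return value 17.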
It should be noted that in Java, basic types like int and double can't be sent
as "by-reference" parameters. Objects are sent by-reference, however, as will
be seen in a later section.
4. Tracing Structs
Tracing struct-type variables is straightforward. When allocating space for a
struct, the interpreter adds up the sizes of the fields of the struct; this
sum is the amount of space needed for the entire struct.
For instance, consider the following program:
struct Composite
{
    int field1[4];
    int field2;
};
int main()
{
    Composite c;
    c.field2=8;
}
The following is a valid trace:
Address   Allocated To    Value
0         c.field1[0]
4         c.field1[1]
8         c.field1[2]
12        c.field1[3]
16        c.field2        8
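The "sum of the field sizes" rule can be checked against a real compiler. This is a small sketch under an assumption: for a struct made only of int fields, common compilers add no padding, though the language does not promise this for structs in general. The helper names field2_offset, composite_size, and check_layout are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct Composite {
    int field1[4];
    int field2;
};

// Byte offset of field2 from the start of a Composite.
std::size_t field2_offset() { return offsetof(Composite, field2); }

// Total size of a Composite, in bytes.
std::size_t composite_size() { return sizeof(Composite); }

// Sanity check of the sum-of-fields rule for this particular struct.
void check_layout() {
    assert(composite_size() == 4 * sizeof(int) + sizeof(int));
}
```

With 4-byte ints, field2_offset() is 16 and composite_size() is 20, matching the trace.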
5. Tracing Pointer Variables
An interpreter handles pass-by-reference parameters by sending addresses
to the activation records of functions, as shown previously. A programmer can
also define pointer variables explicitly. Often, such variables are used to
dynamically allocate memory space during the execution of a program. Such
dynamic allocation allows a program to make use of only the memory it needs,
when it needs it.
As stated earlier, the memory of a computer is conceptually split into two
parts, the run-time stack and the heap. The run-time stack is used to store
In the example, x and c are allocated on the stack normally. p_x and p_c
are also allocated on the stack; each is given four bytes to hold an
address. No data is allocated on the heap for these declarations; heap space is
allocated only by a call to new. The new calls follow the declarations.
The first,
p_c = new Composite;
allocates the size of the struct Composite, which is 20 bytes, on the top of the
heap. The return value of the call to new is the address on the heap where the
allocation was made, in this case address 1000. Thus, 1000 is the value of the
pointer variable p_c. The next statement,
p_x = new int(2);
allocates 4 bytes on the heap at the next available place, in this case address
1020. Thus, 1020 is the value of the pointer variable p_x.
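The way new hands out heap addresses in these traces can be modeled as a simple counter. This is a sketch of the tracing convention only, not of how a real allocator works: TraceHeap, allocate, first_allocation, and second_allocation are hypothetical names, and the starting address 1000 is the assumption used in the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the heap used in the traces: each allocation
// receives the next free address, starting at 1000.
struct TraceHeap {
    std::size_t next_free = 1000;

    // Model of `new`: reserve `bytes` bytes and return their starting address.
    std::size_t allocate(std::size_t bytes) {
        assert(bytes > 0);
        std::size_t addr = next_free;
        next_free += bytes;
        return addr;
    }
};

// Address given to the Composite (20 bytes) in the example.
std::size_t first_allocation() {
    TraceHeap h;
    return h.allocate(20);
}

// Address given to the int that is allocated after the Composite.
std::size_t second_allocation() {
    TraceHeap h;
    h.allocate(20);        // the Composite
    return h.allocate(4);  // the int, assumed 4 bytes
}
```

The values returned are what get stored in the pointer variables p_c and p_x on the stack.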
6. Pointer Arithmetic and the & Operator
Consider the trace of the program when the final two statements below,
p_c = &c; and p_x++;, are added:
int main()
{
    int x;
    Composite c;        // the struct defined above
    int *p_x;
    Composite *p_c;
    p_c = new Composite;
    p_x = new int;
    *p_x = 5;
    (*p_c).field1[1] = 4;
    p_c = &c;
    p_x++;
}
Stack (Main's activation record)
Address   Allocated To        Value
0         x
4         c.field1[0]
...
20        c.field2
24        p_x                 1020 1024
28        p_c                 1000 4
Heap
Address   Allocated To        Value
1000      (*p_c).field1[0]
1004                          4
1008
1012
1016      (*p_c).field2
1020      *p_x                5
1024
The first statement,
p_c = &c;
is an example of the address operator &. The pointer p_c is assigned the
address of the variable c , which in this case is 4. After this statement, p_c no
longer points to the space allocated on the heap, but instead points to the
statically allocated Composite on the stack.
The second statement,
p_x++;
is an example of pointer arithmetic. When a pointer is incremented, it is
incremented not by one byte, but by 1*(the size of what it points to). In this case,
p_x points to an integer, so the variable p_x is incremented by 1*4 = 4 bytes.
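Both the address operator and the scaling rule for pointer increments can be checked directly. The following is a minimal sketch; the function template bytes_per_increment is a hypothetical helper, and the Composite struct is repeated so the example is self-contained.

```cpp
#include <cassert>
#include <cstddef>

struct Composite {
    int field1[4];
    int field2;
};

// Number of bytes a pointer of type T* moves when it is incremented once.
template <typename T>
std::size_t bytes_per_increment() {
    T items[2];
    T* p = &items[0];   // address operator: p holds the address of items[0]
    T* q = p;
    q++;                // pointer arithmetic: advance by one T, not one byte
    assert(q == &items[1]);
    // Measure the distance in bytes by viewing both addresses as char*.
    return static_cast<std::size_t>(reinterpret_cast<char*>(q) -
                                    reinterpret_cast<char*>(p));
}
```

bytes_per_increment<int>() equals sizeof(int), and bytes_per_increment<Composite>() equals sizeof(Composite), so the step size depends on the pointed-to type.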
Similarly, the statement
p_c = p_c + 2;
        data = x*2;
    }
}
Stack
Address   Allocated To   Value
0         foo            -> 1000      (Main's activation record)
4         foo            -> 1000      (meth's activation record)
8         x              4
Heap
Address   Allocated To   Value
1000      foo.data       8
1004      foo.data
There are a number of things to notice about this trace. First, all objects are
references in Java, so foo is treated like a pointer in C++. Its value is null to
begin with, but after the call to "new Foo" its value is some address in the heap
(we're assuming 1000 in this example). Second, the activation record of meth
shows two parameters, the object itself (foo) and the formal parameter x. We will
assume that the "object" parameter always appears first in the activation record.
It will always be a pointer, as objects cannot be sent by-value in Java, as they
can be in C++. Finally, note that foo has two data members, both of which
appear in slots at the location in the heap where foo was allocated. As far as
tracing is concerned, an object's data members are treated just like fields of a
struct: space is allocated in order for each field. Note that no space is allocated
for the methods of a class.
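A rough C++ analogue makes these points concrete. This is a sketch under stated assumptions, not the Java program being traced: the class body is hypothetical, the member name other is a placeholder for the second data member, and data_after_meth is an invented helper. In C++ the object parameter appears as the hidden this pointer.

```cpp
#include <cassert>

// Hypothetical class with two data members; names are placeholders.
class Foo {
public:
    int data = 0;
    int other = 0;
    // The object itself arrives as the hidden `this` pointer.
    void meth(int x) { this->data = x * 2; }
};

// Allocate a Foo on the heap, call meth through the pointer, report data.
int data_after_meth(int x) {
    Foo* foo = nullptr;   // like a Java reference, initially null
    foo = new Foo;        // now holds some heap address
    foo->meth(x);
    int result = foo->data;
    delete foo;           // C++ needs an explicit delete; Java does not
    assert(result == x * 2);
    return result;
}
```

On common compilers sizeof(Foo) equals the size of its two int members, which is consistent with no space being allocated per object for methods.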