Jump Instructions-Microprocessor and Assembly Language Programming-Lecture Notes, Study notes of Microprocessor and Assembly Language Programming

This lecture handout was provided at Quaid-i-Azam University for the Microprocessor and Assembly Language Programming course by Prof. Saleem Raza. Its main points are: Language, Program, Instruction, Jump, Algorithms, Sorting, Manipulation, Diversion, Assembly

Typology: Study notes

2011/2012

Uploaded on 08/04/2012

saqqi

So far we have accumulated the very basic tools of assembly language
programming. A very important weapon in our arsenal is the conditional
jump instruction. Over the last two chapters we used these tools to write
two very useful algorithms, sorting and multiplication. The multiplication
algorithm is useful even though the 8088 instruction set has a MUL
instruction that can multiply 8-bit and 16-bit operands. This is because
our algorithm is extensible: it is not limited to 16 bits and can do 32-bit
or 64-bit multiplication with minor changes.

Both of these algorithms will be used a number of times in any program of
reasonable size and complexity. An application does not multiply at a
single point in code; it multiplies at a number of places. If multiplication
or sorting is needed at 100 places in code, copying it 100 times is a
totally infeasible solution. Maintaining such code is an impossible task.

The straightforward solution to this problem, using the concepts we have
learned so far, is to write the code at one place with a label and jump to
this label whenever we need to sort. But there is a problem with this
logic: after sorting is complete, how will the processor know where to go
back? The immediate answer is to jump back to a label following the jump to
bubble sort. But we have jumped to bubble sort from 100 places in code. To
which of the 100 positions should we jump back? We could jump back to the
first invocation, but a jump has a single fixed target. How will the second
invocation work? The second jump to bubble sort will never get control back
at the line following it.

Instructions are tied to one another forming an execution thread, just like
a knitted thread where pieces of cotton of different sizes are twisted
together to form a thread. This thread of execution is our program. The jump
instruction breaks this thread permanently, making a permanent diversion,
like a turn on a highway. The conditional jump selects one of two possible
directions, like a right or left turn on a road. So there is no concept of
returning.

However there are roundabouts on roads as well, which take us back to where
we started after we have traveled around the boundary of the round. This is
the concept of a temporary diversion. Two or more permanent diversions can
take us back to where we started, just as two or more road turns can take us
back to the starting point, but they are still permanent diversions in their
nature.

We need some way to implement the concept of temporary diversion in
assembly language. We want to create a roundabout of bubble sort and another
roundabout of our multiplication algorithm, so that we can enter the
roundabout whenever we need it and return to wherever we left from after
completing the round.

The key point in the above discussion is returning to where we left from,
like a loop in a knitted thread. The diversion should be temporary and not
permanent. The code of bubble sort is written at one place, multiply at
another, and we temporarily divert to that place, thus avoiding a repetition
of code at 100 places.

CALL and RET

In every processor, instructions are available to divert temporarily and to
divert permanently. The instructions for permanent diversion in the 8088 are
the jump instructions, while the instruction for temporary diversion is the
CALL instruction. The word call must be familiar to readers from subroutine
calls in higher level languages. The CALL instruction allows temporary
diversion and therefore reusability of code. Now we can place the code for
bubble sort at one place and reuse it again and again. This was not possible
with permanent diversion. Actually the 8088 permanent diversion mechanism
can be tricked into achieving temporary diversion, but not without getting
into a lot of trouble. The key idea in doing it this way is to use the form
of the jump instruction that takes a register as its argument. So it is not
impossible, but this is not the way it is done.
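The register-jump trick mentioned above can be sketched as follows. This is only an illustration in the NASM syntax and DOS .COM layout used by the other examples; the routine name double, the labels back1 and back2, and the choice of DX to carry the return address are assumptions made for this sketch and are not from the original text.

```nasm
; emulating a temporary diversion with JMP only (illustrative sketch)
[org 0x0100]
        mov  ax, 5              ; value to be doubled
        mov  dx, back1          ; remember where to come back
        jmp  double             ; permanent diversion to the routine
back1:  mov  dx, back2          ; the second use needs its own return label
        jmp  double
back2:  mov  ax, 0x4c00         ; terminate program
        int  0x21

double: shl  ax, 1              ; multiply ax by two
        jmp  dx                 ; jump to the address held in dx
```

Every caller has to load DX by hand and invent a fresh label, and DX stays tied up for as long as the routine runs. This is the kind of trouble the text refers to.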

The natural way to do this is to use the CALL instruction followed by a
label, just as JMP is followed by a label. Execution will divert to the code
following the label. Up to this point the operation is similar to that of
the JMP instruction. When the subroutine completes we need to return, and
the RET instruction is used for this purpose. The word return implies that
we are to go back to where we came from, so no explicit destination is
needed. Therefore RET takes no arguments and transfers control back to the
instruction following the CALL that took us into this subroutine. The actual
technical process that informs RET where to return will be discussed later,
after we have discussed the system stack.

CALL takes a label as its argument and execution starts from that label,
until the RET instruction is encountered and takes execution back to the
instruction following the CALL. The two instructions are commonly used as a
pair; technically, however, they are independent in their operation. RET
works regardless of CALL and CALL works regardless of RET. If you CALL a
subroutine, the processor will not complain if there is no RET present, and
similarly if you RET without being called it won’t complain. It is a logical
pair and is used as a pair in all decent code. However sometimes we play
tricks with the processor and use CALL or RET alone. This will become clear
when we need to play such tricks in later chapters.
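As a minimal sketch of the pair in use, the small program below calls one routine from two places. The routine name double and the values are illustrative assumptions; the program layout follows the other examples in these notes.

```nasm
; one subroutine reused from two places with CALL and RET (illustrative sketch)
[org 0x0100]
        mov  ax, 5
        call double             ; ax = 10, execution resumes at the next line
        call double             ; ax = 20, execution resumes at the next line
        mov  ax, 0x4c00         ; terminate program
        int  0x21

double: shl  ax, 1              ; multiply ax by two
        ret                     ; return to the instruction after the CALL
```

Neither CALL needs a return label of its own, and the routine does not need to know who called it.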

[Figure labels: Program, Bubble Sort, Swap]

takes two bytes. Left shifting has been used to multiply by two.

Base+index+offset addressing has been used. BX holds the start of

array, SI the offset into it and an offset of 2 when the next element is

to be read. BX can be directly changed but then a separate counter

would be needed, as SI is directly compared with CX in our case.

The code starting from the start label is our main program

analogous to the main in the C language. BX and CX hold our

parameters for the bubblesort subroutine and the CALL is made to

invoke the subroutine.

Inside the debugger we observe the same unsigned data that we are so used to
by now. The number 0103, which is the start of our data, is passed to the
subroutine via BX, and the number 000A, which is the number of elements in
our data, via CX. If we step over the CALL instruction we see our data
sorted in a single step and we are at the termination instructions. The
processor has jumped to the bubblesort routine, executed it to completion,
and returned from it, but the process was hidden by the step over command.
If however we trace into the CALL instruction, we land at the first
instruction of our routine. At the end of the routine, when the RET
instruction is executed, we immediately land back at our termination
instructions, to be precise at the instruction following the CALL.

Also observe that with the CALL instruction SP is decremented by two from
FFFE to FFFC, and the stack window shows 0150 at its top. As the RET is
executed SP is restored and the 0150 is removed from the stack. Match it
with the address of the instruction following the CALL, which is 0150 as
well. The 0150 removed from the stack by the RET instruction has been loaded
into the IP register, thereby resuming execution from address 0150. CALL
placed the return address on the stack for the RET instruction. The stack is
used automatically by the CALL and RET instructions. The stack will be
explained in detail later; the idea, however, is that the one who is
departing stores the address to return to at a known place. This is the
place through which CALL and RET coordinate. How this place is actually used
by the CALL and RET instructions will be described after the stack is
discussed.

After emphasizing reusability so much, it is time for another example

which uses the same bubblesort routine on two different arrays of different

sizes.

Example

; bubble sort subroutine called twice
[org 0x0100]
            jmp  start

data:       dw   60, 55, 45, 50, 40, 35, 25, 30, 10, 0
data2:      dw   328, 329, 898, 8923, 8293, 2345, 10, 877, 355, 98
            dw   888, 533, 2000, 1020, 30, 200, 761, 167, 90, 5
swap:       db   0

bubblesort: dec  cx                 ; last element not compared
            shl  cx, 1              ; turn into byte count

mainloop:   mov  si, 0              ; initialize array index to zero
            mov  byte [swap], 0     ; reset swap flag to no swaps

innerloop:  mov  ax, [bx+si]        ; load number in ax
            cmp  ax, [bx+si+2]      ; compare with next number
            jbe  noswap             ; no swap if already in order

            mov  dx, [bx+si+2]      ; load second element in dx
            mov  [bx+si], dx        ; store second number in first
            mov  [bx+si+2], ax      ; store first number in second
            mov  byte [swap], 1     ; flag that a swap has been done

noswap:     add  si, 2              ; advance si to next index
            cmp  si, cx             ; are we at last index
            jne  innerloop          ; if not compare next two

            cmp  byte [swap], 1     ; check if a swap has been done
            je   mainloop           ; if yes make another pass

            ret                     ; go back to where we came from

start:      mov  bx, data           ; send start of array in bx
            mov  cx, 10             ; send count of elements in cx
            call bubblesort         ; call our subroutine

            mov  bx, data2          ; send start of array in bx
            mov  cx, 20             ; send count of elements in cx
            call bubblesort         ; call our subroutine again

            mov  ax, 0x4c00         ; terminate program
            int  0x21

There are two different data arrays declared. One of 10 elements and

the other of 20 elements. The second array is declared on two lines,

where the second line is continuation of the first. No additional label

is needed since they are situated consecutively in memory.

The other change is in the main where the bubblesort subroutine is

called twice, once on the first array and once on the second.

Inside the debugger observe that on stepping over the first call the first
array is sorted, and on stepping over the second call the second array is
sorted. If however we step in, SP is decremented and the stack holds 0178,
which is the address of the instruction following the call. The RET consumes
that 0178 and restores SP. The next CALL places 0181 on the stack and SP is
again decremented. The RET consumes this number and execution resumes from
the instruction at 0181. This is the coordinated function of CALL and RET
using the stack.

In both of the above examples there is a shortcoming. The subroutine that
sorts the elements destroys the registers AX, CX, DX, and SI. That means the
caller of this routine has to make sure that it does not hold any important
data in these registers before calling the function, because after the call
has returned the registers will contain data that is meaningless to the
caller. In a program containing thousands of subroutines, expecting the
caller to remember the set of modified registers for each subroutine is
unrealistic and unreasonable. Also registers are limited in number, and
restricting the caller in its use of registers makes the caller’s job very
tough. This shortcoming will be removed using the very important system
stack.

STACK

A stack is a data structure that behaves in a first in last out manner. It
can contain many elements and there is only one way in and out of the
container. When an element is inserted it sits on top of all other elements,
and when an element is removed the one sitting on top of all others is
removed first. To visualize the structure consider a test tube and put some
balls in it. The second ball will come above the first and the third will
come above the second. When a ball is taken out only the one at the top can
be removed. The operation of placing an element on top of the stack is
called pushing the element, and the operation of removing an element from
the top of the stack is called popping the element. The last thing pushed is
popped out first; this is the last in first out behavior.
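The last in first out order can be seen with the 8088 PUSH and POP instructions. The sketch below is only an illustration; the values and the registers chosen to receive them are assumptions made for this example.

```nasm
; last in first out behavior of the stack (illustrative sketch)
[org 0x0100]
        mov  ax, 1
        push ax                 ; stack, top first: 1
        mov  ax, 2
        push ax                 ; stack, top first: 2 1
        mov  ax, 3
        push ax                 ; stack, top first: 3 2 1
        pop  bx                 ; bx = 3, the last value pushed
        pop  cx                 ; cx = 2
        pop  dx                 ; dx = 1, the first value pushed
        mov  ax, 0x4c00         ; terminate program
        int  0x21
```

The values go through AX because the 8088 has no form of PUSH that takes an immediate operand.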

We can peek at any ball inside the test tube but we cannot remove it

without removing every ball on top of it. Similarly we can read any element

from the stack but cannot remove it without removing everything above it.

The stack operations of pushing and popping only work at the top of the

The corresponding instruction RETF will pop the offset in the instruction

pointer followed by popping the segment in the code segment register.

Apart from CALL and RET, the operations that use the stack are PUSH and

POP. Two other operations that will be discussed later are INT and IRET.

Regarding the stack, the operation of PUSH is similar to CALL however with

a register other than the instruction pointer. For example “push ax” will push

the current value of the AX register on the stack. The operation of PUSH is

shown below.

SP ← SP - 2

[SP] ← AX

The operation of POP is the reverse of this. A copy of the element at the top

of the stack is made in the operand, and the top of the stack is incremented

afterwards. The operation of “pop ax” is shown below.

AX ← [SP]

SP ← SP + 2

Making corresponding PUSH and POP operations is the responsibility of the
programmer. If “push ax” is followed by “pop dx”, effectively copying the
value of the AX register into the DX register, the processor won’t complain.
Whether this sequence is logically correct or not has to be ensured by the
programmer. For example when PUSH and POP are used to save and restore
registers on the stack, the order must be correct so that the saved value of
AX is reloaded into the AX register and not into any other register. For
this the order of POP operations needs to be the reverse of the order of
PUSH operations.
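The three situations described above can be sketched in one short program. The register values are arbitrary and chosen only for illustration.

```nasm
; order of PUSH and POP operations (illustrative sketch)
[org 0x0100]
        mov  ax, 0x1111
        mov  bx, 0x2222

        push ax                 ; save ax
        push bx                 ; save bx
        mov  ax, 0              ; the registers can be changed freely here
        mov  bx, 0
        pop  bx                 ; reverse order: bx = 0x2222
        pop  ax                 ; ax = 0x1111, both values restored

        push ax
        pop  dx                 ; dx = 0x1111, a copy of ax through the stack

        push ax
        push bx
        pop  ax                 ; same order as the pushes: ax = 0x2222
        pop  bx                 ; bx = 0x1111, the two values are exchanged

        mov  ax, 0x4c00         ; terminate program
        int  0x21
```

The processor accepts all three sequences. Only the first restores the registers, which is why the reverse order matters when saving and restoring.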

Now we consider another example that is similar to the previous examples,

however the code to swap the two elements has been extracted into another

subroutine, so that the formation of stack can be observed during nested

subroutine calls.

Example

; bubble sort subroutine using swap subroutine
[org 0x0100]
            jmp  start

data:       dw   60, 55, 45, 50, 40, 35, 25, 30, 10, 0
data2:      dw   328, 329, 898, 8923, 8293, 2345, 10, 877, 355, 98
            dw   888, 533, 2000, 1020, 30, 200, 761, 167, 90, 5
swapflag:   db   0

swap:       mov  ax, [bx+si]        ; load first number in ax
            xchg ax, [bx+si+2]      ; exchange with second number
            mov  [bx+si], ax        ; store second number in first
            ret                     ; go back to where we came from

bubblesort: dec  cx                 ; last element not compared
            shl  cx, 1              ; turn into byte count

mainloop:   mov  si, 0              ; initialize array index to zero
            mov  byte [swapflag], 0 ; reset swap flag to no swaps

innerloop:  mov  ax, [bx+si]        ; load number in ax
            cmp  ax, [bx+si+2]      ; compare with next number
            jbe  noswap             ; no swap if already in order

            call swap               ; swaps two elements
            mov  byte [swapflag], 1 ; flag that a swap has been done

noswap:     add  si, 2              ; advance si to next index
            cmp  si, cx             ; are we at last index
            jne  innerloop          ; if not compare next two

            cmp  byte [swapflag], 1 ; check if a swap has been done
            je   mainloop           ; if yes make another pass
            ret                     ; go back to where we came from

start:      mov  bx, data           ; send start of array in bx
            mov  cx, 10             ; send count of elements in cx
            call bubblesort         ; call our subroutine

            mov  bx, data2          ; send start of array in bx
            mov  cx, 20             ; send count of elements in cx
            call bubblesort         ; call our subroutine again

            mov  ax, 0x4c00         ; terminate program
            int  0x21

A new instruction XCHG has been introduced. The instruction

swaps its source and its destination operands however at most one

of the operands could be in memory, so the other has to be loaded in

a register. The instruction has reduced the code size by one

instruction.

The RET at the end of swap makes it a subroutine.

Inside the debugger observe the use of stack by CALL and RET

instructions, especially the nested CALL.
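The use of XCHG with one memory operand can be seen in isolation in the following sketch. The variable names first and second are hypothetical and used only for this illustration.

```nasm
; exchanging two words in memory with XCHG (illustrative sketch)
[org 0x0100]
        jmp  start

first:  dw   10
second: dw   20

start:  mov  ax, [first]        ; ax = 10
        xchg ax, [second]       ; ax = 20 and second = 10
        mov  [first], ax        ; first = 20
        mov  ax, 0x4c00         ; terminate program
        int  0x21
```

Without XCHG a second register would be needed to hold one of the values while the other is stored.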

SAVING AND RESTORING REGISTERS

The subroutines we wrote till now have been destroying certain registers

and our calling code has been carefully written to not use those registers.

However this cannot be remembered for a good number of subroutines.

Therefore our subroutines need to implement some mechanism of retaining

the callers’ value of any registers used.

The trick is to use the PUSH and POP operations and save the callers’

value on the stack and recover it from there on return. Our swap subroutine

destroyed the AX register while the bubblesort subroutine destroyed AX, CX,

and SI. BX was not modified in the subroutine. It had the same value at

entry and at exit; it was only used by the subroutine. Our next example

improves on the previous version by saving and restoring any registers that it

will modify using the PUSH and POP operations.
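The pattern can be sketched in isolation as follows. The routine name sumto, the variable result, and the values are assumptions made for this illustration and are not part of the bubble sort program.

```nasm
; preserving registers inside a subroutine (illustrative sketch)
[org 0x0100]
        mov  ax, 0x1234         ; caller's value in ax
        mov  cx, 7              ; caller's value in cx
        call sumto              ; uses ax and cx internally
                                ; here ax is 0x1234 and cx is 7 again
        mov  ax, 0x4c00         ; terminate program
        int  0x21

sumto:  push ax                 ; save caller's ax
        push cx                 ; save caller's cx
        mov  ax, 0
        mov  cx, 10
next:   add  ax, cx             ; work that changes both registers
        dec  cx
        jnz  next
        mov  [result], ax       ; keep the sum 10+9+...+1 = 55 in memory
        pop  cx                 ; restore in reverse order of the pushes
        pop  ax
        ret

result: dw   0
```

The caller no longer has to know which registers the routine uses internally.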

Example

; bubble sort and swap subroutines saving and restoring registers
[org 0x0100]
            jmp  start

data:       dw   60, 55, 45, 50, 40, 35, 25, 30, 10, 0
data2:      dw   328, 329, 898, 8923, 8293, 2345, 10, 877, 355, 98
            dw   888, 533, 2000, 1020, 30, 200, 761, 167, 90, 5
swapflag:   db   0

swap:       push ax                 ; save old value of ax

            mov  ax, [bx+si]        ; load first number in ax
            xchg ax, [bx+si+2]      ; exchange with second number
            mov  [bx+si], ax        ; store second number in first

            pop  ax                 ; restore old value of ax
            ret                     ; go back to where we came from

bubblesort: push ax                 ; save old value of ax
            push cx                 ; save old value of cx
            push si                 ; save old value of si

            dec  cx                 ; last element not compared
            shl  cx, 1              ; turn into byte count

mainloop:   mov  si, 0              ; initialize array index to zero
            mov  byte [swapflag], 0 ; reset swap flag to no swaps

innerloop:  mov  ax, [bx+si]        ; load number in ax
            cmp  ax, [bx+si+2]      ; compare with next number
            jbe  noswap             ; no swap if already in order

The out-of-line procedure is the temporary diversion, the concept of the
roundabout that we discussed. Near calls are also called intra-segment
calls, while far calls are called inter-segment calls. There are also
versions that are called indirect calls; however they will be discussed
later when they are used.

RET

RET (Return) transfers control from a procedure back to the instruction

following the CALL that activated the procedure. RET pops the word at the

top of the stack (pointed to by register SP) into the instruction pointer and

increments SP by two. If RETF (inter segment RET) is used the word at the

top of the stack is popped into the IP register and SP is incremented by two.

The word at the new top of stack is popped into the CS register, and SP is

again incremented by two. If an optional pop value has been specified, RET

adds that value to SP. This feature may be used to discard parameters

pushed onto the stack before the execution of the CALL instruction.
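The effect of the optional pop value can be sketched with a routine that does nothing except return. The label routine and the two parameter values are illustrative assumptions.

```nasm
; RET with a pop value (illustrative sketch)
[org 0x0100]
        mov  ax, 3
        push ax                 ; first parameter, sp decreases by two
        mov  ax, 4
        push ax                 ; second parameter, sp decreases by two
        call routine            ; return address pushed, sp decreases by two
                                ; on return sp has the value it had before the pushes
        mov  ax, 0x4c00         ; terminate program
        int  0x21

routine:
        ret  4                  ; pop return address into ip, then add 4 to sp
```

With a plain RET the two parameter words would still be on the stack after the return.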

PARAMETER PASSING THROUGH STACK

Due to the limited number of registers, parameter passing by registers is
constrained in two ways. The maximum number of parameters a subroutine can
receive is seven, when all the general registers are used. Also, the
subroutines themselves are limited in their use of registers, and this
limitation increases when the subroutine has to make a nested call, thereby
using certain registers for its parameters. Because of this, parameter
passing by registers is neither expandable nor generalizable. However it is
the fastest mechanism available for passing parameters and is used where
speed is important.

Considering the stack as an alternative, we observe that whatever data is
placed there stays there, across function calls as well. For example the
bubble sort subroutine needs an array address and the count of elements. If
we place both of these on the stack and call the subroutine afterwards, they
will stay there. The subroutine is invoked with its return address on top of
the stack and its parameters beneath it.

To access the arguments from the stack, the immediate idea that strikes is
to pop them off the stack. With the information we have so far, this is the
only possibility. However the first thing popped off the stack would be the
return address and not the arguments. This is because the arguments were
pushed on the stack first and the subroutine was called afterwards. The
arguments cannot be popped without first popping the return address. If a
heavy thing falls on someone’s leg, the heavy thing is removed first; the
leg is not pulled out from under it. The same is the case with our
parameters, on which the return address has fallen.

To handle this using PUSH and POP, we must first pop the return address into
a register, then pop the operands, and then push the return address back on
the stack so that RET will function normally. However so much effort does
not seem worth the price. Processor designers should have provided a logical
and neat way to perform this operation. They did provide a way, and in fact
we will do this without introducing any new instruction.
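The awkward PUSH and POP approach just described can be sketched as follows. The routine name addtwo and the use of DX to hold the return address are assumptions made for this illustration.

```nasm
; reaching parameters with POP only, the awkward way (illustrative sketch)
[org 0x0100]
        mov  ax, 3
        push ax                 ; first parameter
        mov  ax, 4
        push ax                 ; second parameter
        call addtwo             ; ax = 7 on return
        mov  ax, 0x4c00         ; terminate program
        int  0x21

addtwo: pop  dx                 ; the return address comes off first
        pop  ax                 ; second parameter (4)
        pop  bx                 ; first parameter (3)
        add  ax, bx             ; ax = 3 + 4
        push dx                 ; put the return address back for RET
        ret
```

The routine works, but it ties up a register for the return address and can read each parameter only once, in a fixed order.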

Recall that the default segment association of the BP register is the stack
segment, and that the reason for this association had been deferred until
now. The reason is to peek inside the stack using the BP register and read
the parameters without removing them and without touching the stack pointer.
The stack pointer cannot be used for this purpose, as it cannot be used in
an effective address. It is used automatically as a pointer and cannot be
used explicitly. Also the stack pointer is a dynamic pointer and sometimes
changes in the background without telling us. It is just that whenever we
touch it, it is where we expect it to be. The base pointer is provided as a
replacement for the stack pointer so that we can peek inside the stack
without modifying the structure of the stack.

When the bubble sort subroutine is called, the stack pointer is pointing to

the return address. Two bytes below it is the second parameter and four

bytes below is the first parameter. The stack pointer is a reference point to

these parameters. If the value of SP is captured in BP, then the return

address is located at [bp+0], the second parameter is at [bp+2], and the first

parameter is at [bp+4]. This is because SP and BP both had the same value

and they both defaulted to the same segment, the stack segment.
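The offsets described above can be tried with a small routine that captures SP in BP as its first instruction. The routine name addtwo and the parameter values are illustrative assumptions; note that this sketch overwrites the caller's BP.

```nasm
; reading parameters through bp without popping them (illustrative sketch)
[org 0x0100]
        mov  ax, 3
        push ax                 ; first parameter
        mov  ax, 4
        push ax                 ; second parameter
        call addtwo             ; ax = 7 on return
        mov  ax, 0x4c00         ; terminate program
        int  0x21

addtwo: mov  bp, sp             ; snapshot: [bp] holds the return address
        mov  ax, [bp+4]         ; first parameter (3)
        add  ax, [bp+2]         ; second parameter (4)
        ret  4                  ; return and discard the two parameters
```

The parameters stay on the stack while they are read, so they can be read in any order and as many times as needed.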

This copying of SP into BP is like taking a snapshot or like freezing the

stack at that moment. Even if more pushes are made on the stack

decrementing the stack pointer, our reference point will not change. The

parameters will still be accessible at the same offsets from the base pointer.

If however the stack pointer increments beyond the base pointer, the

references will become invalid. The base pointer will act as the datum point

to access our parameters. However we have destroyed the original value of

BP in the process, and this will cause problems in nested calls where both

the outer and the inner subroutines need to access their own parameters.

The outer subroutine will have its base pointer destroyed after the call and

will be unable to access its parameters.

To solve both of these problems, we arrive at the standard way of accessing
parameters on the stack. The first two instructions of any subroutine
accessing its parameters from the stack are given below.

            push bp
            mov  bp, sp

As a result our datum point has shifted by a word. Now the old value of BP
will be contained in [bp] and the return address will be at [bp+2]. The
second parameter will be at [bp+4] while the first one will be at [bp+6]. We
give an example of the bubble sort subroutine using this standard way of
passing arguments through the stack.

Example

; bubble sort subroutine taking parameters from stack
[org 0x0100]
            jmp  start

data:       dw   60, 55, 45, 50, 40, 35, 25, 30, 10, 0
data2:      dw   328, 329, 898, 8923, 8293, 2345, 10, 877, 355, 98
            dw   888, 533, 2000, 1020, 30, 200, 761, 167, 90, 5
swapflag:   db   0

bubblesort: push bp                 ; save old value of bp
            mov  bp, sp             ; make bp our reference point
            push ax                 ; save old value of ax
            push bx                 ; save old value of bx
            push cx                 ; save old value of cx
            push si                 ; save old value of si

            mov  bx, [bp+6]         ; load start of array in bx
            mov  cx, [bp+4]         ; load count of elements in cx
            dec  cx                 ; last element not compared
            shl  cx, 1              ; turn into byte count

mainloop:   mov  si, 0              ; initialize array index to zero
            mov  byte [swapflag], 0 ; reset swap flag to no swaps

innerloop:  mov  ax, [bx+si]        ; load number in ax
            cmp  ax, [bx+si+2]      ; compare with next number
            jbe  noswap             ; no swap if already in order

            xchg ax, [bx+si+2]      ; exchange ax with second number
            mov  [bx+si], ax        ; store second number in first
            mov  byte [swapflag], 1 ; flag that a swap has been done

noswap:     add  si, 2              ; advance si to next index
            cmp  si, cx             ; are we at last index
            jne  innerloop          ; if not compare next two

useless. Also observe that RET n has discarded the arguments rather than
popping them, as they were no longer of any use to either the caller or the
callee.

The strong argument in favour of callee-cleared stacks is that the arguments
were placed on the stack for the subroutine and the caller did not need them
for itself, so the subroutine is responsible for removing them. Removing the
arguments is important: if the stack is not cleared or is only partially
cleared, it will eventually become full, SP will reach 0 and thereafter wrap
around, producing unexpected results. This is called stack overflow.
Therefore clearing anything placed on the stack is very important.
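The two ways of clearing the parameters can be placed side by side in a short sketch. The routine names calleeclears and callerclears are hypothetical, and both routines are left empty so that only the stack handling is visible.

```nasm
; callee-cleared and caller-cleared parameters (illustrative sketch)
[org 0x0100]
        mov  ax, 3
        push ax
        mov  ax, 4
        push ax
        call calleeclears       ; ret 4 inside removes both parameters

        mov  ax, 3
        push ax
        mov  ax, 4
        push ax
        call callerclears       ; plain ret leaves the parameters on the stack
        add  sp, 4              ; so the caller discards them here

        mov  ax, 0x4c00         ; terminate program
        int  0x21

calleeclears:
        ret  4                  ; return and discard four bytes of parameters

callerclears:
        ret                     ; return only, parameters remain
```

In both cases SP ends up where it was before the pushes. If the add sp, 4 were left out, every such call would leak four bytes of stack.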

LOCAL VARIABLES

Another important role of the stack is in the creation of local variables,
which are only needed while the subroutine is in execution and not
afterwards. They should not take permanent space like global variables.
Local variables should be created when the subroutine is called and
discarded afterwards, so that the space used by them can be reused for the
local variables of another subroutine. They only have meaning inside the
subroutine and no meaning outside it.

The most convenient place to store these variables is the stack. We need

some special manipulation of the stack for this task. We need to produce a

gap in the stack for our variables. This is explained with the help of the

swapflag in the bubble sort example.

The swapflag that we declared as a global variable occupying space
permanently is only needed by the bubble sort subroutine and should be a
local variable. Actually the variable was introduced with the intent of
making it a local variable at this point. The stack pointer will be
decremented by an extra two bytes, thereby producing a gap in which a word
can reside. This gap will be used for our temporary, local, or automatic
variable, whatever we choose to call it. We can decrement SP as much as we
want to produce the desired space; however the decrement must be by an even
number, as the unit of stack operation is a word. In our case we need just
one word. The most convenient position for this gap is immediately after
saving the value of SP in BP, so that the same base pointer can be used to
access the local variables as well, this time using negative offsets. The
standard way to start a subroutine that needs to access parameters and has
local variables is as under.

            push bp
            mov  bp, sp
            sub  sp, 2

The gap could have been created with a dummy push, but the subtraction makes
it clear that the value pushed is not important and that the gap will be
used for our local variable. Also a gap of any size can be created in a
single instruction with subtraction. The parameters can still be accessed at
bp+4 and bp+6, and the swapflag can be accessed at bp-2. The subtraction
from SP was done after taking the snapshot; therefore BP is above the
parameters but below the local variables. The parameters are therefore
accessed using positive offsets from BP and the local variables using
negative offsets.
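As a sketch of a larger gap, the routine below reserves two local words with a single subtraction. The routine name sumdiff, the parameter values, and the use made of the locals are assumptions chosen only to show the offsets.

```nasm
; two parameters and two local words on the stack (illustrative sketch)
[org 0x0100]
        mov  ax, 7
        push ax                 ; first parameter
        mov  ax, 3
        push ax                 ; second parameter
        call sumdiff            ; ax = 14 on return
        mov  ax, 0x4c00         ; terminate program
        int  0x21

sumdiff: push bp                ; save old value of bp
        mov  bp, sp             ; make bp our reference point
        sub  sp, 4              ; gap for two local words
        mov  ax, [bp+6]         ; first parameter (7)
        add  ax, [bp+4]         ; plus second parameter (3)
        mov  [bp-2], ax         ; first local = 10
        mov  ax, [bp+6]
        sub  ax, [bp+4]
        mov  [bp-4], ax         ; second local = 4
        mov  ax, [bp-2]
        add  ax, [bp-4]         ; ax = 10 + 4
        mov  sp, bp             ; remove the gap
        pop  bp                 ; restore old value of bp
        ret  4                  ; go back and remove two parameters
```

The parameters sit at positive offsets and the locals at negative offsets from the same BP, whatever the size of the gap.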

We modify the bubble sort subroutine to use a local variable to store the
swap flag. The swap flag remembers whether a swap has been done in a
particular iteration of bubble sort.

Example

; bubble sort subroutine using a local variable
[org 0x0100]
            jmp  start

data:       dw   60, 55, 45, 50, 40, 35, 25, 30, 10, 0
data2:      dw   328, 329, 898, 8923, 8293, 2345, 10, 877, 355, 98
            dw   888, 533, 2000, 1020, 30, 200, 761, 167, 90, 5

bubblesort: push bp                 ; save old value of bp
            mov  bp, sp             ; make bp our reference point
            sub  sp, 2              ; make two byte space on stack
            push ax                 ; save old value of ax
            push bx                 ; save old value of bx
            push cx                 ; save old value of cx
            push si                 ; save old value of si

            mov  bx, [bp+6]         ; load start of array in bx
            mov  cx, [bp+4]         ; load count of elements in cx
            dec  cx                 ; last element not compared
            shl  cx, 1              ; turn into byte count

mainloop:   mov  si, 0              ; initialize array index to zero
            mov  word [bp-2], 0     ; reset swap flag to no swaps

innerloop:  mov  ax, [bx+si]        ; load number in ax
            cmp  ax, [bx+si+2]      ; compare with next number
            jbe  noswap             ; no swap if already in order

            xchg ax, [bx+si+2]      ; exchange ax with second number
            mov  [bx+si], ax        ; store second number in first
            mov  word [bp-2], 1     ; flag that a swap has been done

noswap:     add  si, 2              ; advance si to next index
            cmp  si, cx             ; are we at last index
            jne  innerloop          ; if not compare next two

            cmp  word [bp-2], 1     ; check if a swap has been done
            je   mainloop           ; if yes make another pass

            pop  si                 ; restore old value of si
            pop  cx                 ; restore old value of cx
            pop  bx                 ; restore old value of bx
            pop  ax                 ; restore old value of ax
            mov  sp, bp             ; remove space created on stack
            pop  bp                 ; restore old value of bp
            ret  4                  ; go back and remove two params

start:      mov  ax, data
            push ax                 ; place start of array on stack
            mov  ax, 10
            push ax                 ; place element count on stack
            call bubblesort         ; call our subroutine

            mov  ax, data2
            push ax                 ; place start of second array on stack
            mov  ax, 20
            push ax                 ; place element count on stack
            call bubblesort         ; call our subroutine again

            mov  ax, 0x4c00         ; terminate program
            int  0x21

A word gap has been created for swap flag. This is equivalent to a

dummy push. The registers are pushed above this gap.

The swapflag is accessed with [bp-2]. The parameters are accessed

in the same manner as the last examples.

We are removing the hole that we created. The hole is removed by restoring
SP to the value it had at the time of the snapshot, that is, the value it
had before the local variable was created. This can be replaced with
“add sp, 2”; however the form used in the code is preferred, since it does
not require remembering how much space was allocated for local variables at
the start. After this operation SP points to the old value of BP, from where
we can proceed as usual.

We needed memory to store the swap flag. The fact that it is in the stack

segment or the data segment doesn’t bother us. This will just change the

addressing scheme.