STOS, SCAS and Other Instructions-Microprocessor and Assembly Language Programming-Lecture Notes, Study notes of Microprocessor and Assembly Language Programming

This lecture handout was provided at Quaid-i-Azam University for Microprocessor and Assembly Language Programming course by Prof. Saleem Raza. Its main points are: Microprocessor, Stos, Scas, Data, Instruction, Register, Lods, Movs, Cmps, Prefix

Typology: Study notes

2011/2012

Uploaded on 08/04/2012

So far only very simple instructions of the 8088 microprocessor have been
introduced. In this chapter we discuss somewhat more powerful instructions
that can process blocks of data in one go. They are called block processing
or string instructions. This is an appropriate place to discuss them, as we
have just introduced a block of memory: the video memory. To the processor,
video memory is simply a block of memory starting at a special address. For
example, the clear screen operation initializes this whole block to 0x0720,
a space character with the normal attribute.
There are just five block processing instructions in the 8088. In their
primitive form, the instructions operate on a single cell of memory at a
time. However, a special prefix called REP repeats the instruction in
hardware, allowing it to operate on a number of data elements in one
instruction. This is not like a software loop; the repetition is built into
the processor. The five instructions are STOS, LODS, CMPS, SCAS, and MOVS,
called store string, load string, compare string, scan string, and move
string respectively. MOVS is the instruction that allows memory to memory
moves, as was discussed in the exceptions to the memory to memory movement
rules. String instructions are complex instructions in that a single
instruction performs a number of tasks, and with the REP prefix a single
instruction performs the work of a complete loop. This considerably speeds
up operations on large blocks of memory. The reduction in code size and the
improvement in speed are the two reasons why these instructions were
included in the 8088 processor.
These instructions have a number of things in common. Firstly, they all work
on a block of data, and SI and DI are used to access memory. SI and DI are
named source index and destination index because of their role in string
instructions. Whenever a string instruction needs a memory source, DS:SI
holds the pointer to it. A segment override can change the association from
DS, but the default is DS. Whenever a string instruction needs a memory
destination, ES:DI holds the pointer to it. No override is possible in this
case. Whenever a byte register is needed, AL holds the value; whenever a
word register is needed, AX holds the value. For example, STOS stores a
register in memory, so AL or AX is the register used and ES:DI points to the
destination. The LODS instruction loads from memory into a register, so the
source is pointed to by DS:SI and the register used is AL or AX.
String instructions work on a block of data, and a block has a start and an
end. The instructions can work from the start towards the end or from the
end towards the start. Both directions are needed, as otherwise certain
operations with overlapping blocks become impossible. This problem is
discussed in detail later. The direction of movement is controlled by the
Direction Flag (DF) in the flags register. If this flag is cleared, the
direction is from lower addresses towards higher addresses; if it is set,
the direction is from higher addresses towards lower addresses. With DF
cleared the string instruction is said to be in auto-increment mode, and
with DF set it is in auto-decrement mode. There are two instructions to
clear and set the direction flag.

                cld                     ; clear direction flag
                std                     ; set direction flag

Every string instruction has two variants: a byte variant and a word
variant. For example, the two variants of STOS are STOSB and STOSW.
Similarly, the variants of the other string instructions are obtained by
appending a B or a W to the instruction name. The operation of each of the
string instructions and each of the repetition prefixes is discussed below.

STOS

STOS transfers a byte or word from register AL or AX to the string element
addressed by ES:DI and updates DI to point to the next location. STOS is
often used to clear a block of memory or fill it with a constant.

The implied source is always AL or AX. If DF is clear, DI is incremented by
one or two depending on whether STOSB or STOSW is used. If DF is set, DI is
decremented by one or two in the same way. If REP is used before this
instruction, the process is repeated CX times. CX is called the counter
register because of the special treatment given to it by the LOOP and JCXZ
instructions and the REP set of prefixes. So if REP is used with STOS, a
whole block of memory is filled with a constant value. REP always decrements
CX, like the LOOP instruction, and this cannot be changed with the direction
flag. It is also independent of whether the byte or the word variant is
used: CX always decrements by one, so it holds the count of repetitions and
not the count of bytes.
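As an illustration of REP with STOS, the following is a minimal sketch that fills the text screen with blanks. It assumes the 80x25 colour text mode with video memory at segment 0xB800 and a COM file running under DOS; it is one possible arrangement and not necessarily the clear screen routine used elsewhere in these notes.

```nasm
; sketch: clear the 80x25 text screen with a single repeated STOSW
[org 0x0100]
                mov  ax, 0xb800         ; segment of text-mode video memory
                mov  es, ax             ; STOS always writes through ES:DI
                xor  di, di             ; start at offset 0, the top left cell
                mov  ax, 0x0720         ; space (0x20) with normal attribute (0x07)
                mov  cx, 2000           ; 80 * 25 = 2000 words, not 4000 bytes
                cld                     ; auto-increment: DI moves up by 2 per STOSW
                rep  stosw              ; store AX at ES:DI, CX times

                mov  ax, 0x4c00         ; DOS service 4Ch: terminate program
                int  0x21
```

Note that CX is loaded with the number of words, since REP counts repetitions and each STOSW writes two bytes.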

LODS

LODS transfers a byte or word from the source location DS:SI to AL or AX
and updates SI to point to the next location. LODS is generally used in a
loop and not with the REP prefix, since each repetition overwrites the value
previously loaded and only the last value of the block would remain in the
register.
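A small sketch of LODS inside an ordinary loop is given here. The labels values and total and the data itself are hypothetical, chosen only for illustration; the point is that each loaded value is consumed before the next LODSB replaces it.

```nasm
; sketch: add up the bytes of a small array with LODSB in a LOOP
[org 0x0100]
                jmp  start

values:         db   3, 5, 7, 9         ; hypothetical data
total:          dw   0                  ; receives the sum of the four bytes

start:          mov  si, values         ; DS:SI -> first byte of the array
                mov  cx, 4              ; number of bytes to process
                xor  bx, bx             ; running sum
                xor  ah, ah             ; keep AH zero so AX equals the byte in AL
                cld                     ; auto-increment mode
next:           lodsb                   ; AL = byte at DS:SI, then SI = SI + 1
                add  bx, ax             ; use the value before it is overwritten
                loop next               ; CX is used by LOOP, not by LODSB
                mov  [total], bx

                mov  ax, 0x4c00         ; terminate program
                int  0x21
```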

SCAS

SCAS compares a source byte or word in register AL or AX with the
destination string element addressed by ES:DI and updates the flags. DI is
updated to point to the next location. SCAS is often used to locate an
equality or inequality in a string through the use of an appropriate prefix.

SCAS is a bit different from the other instructions. It is more like the
CMP instruction in that it subtracts its operands to set the flags and
discards the result. The prefixes REPE (repeat while equal) and REPNE
(repeat while not equal) are used with this instruction. It is typically
used to locate the byte held in AL in a block of memory. Depending on the
prefix, the repetition stops at the first equality or at the first
inequality; both forms have uses. For example, this instruction can be used
to search for a 0 in a null terminated string to calculate the length of the
string. In this form REPNE is used, so that the scan repeats while the null
has not been found.
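The search form of SCAS can be sketched as follows. The labels buffer and found and the character searched for are assumptions made for this illustration. The sketch also shows two details worth remembering: ES must address the block being scanned, and DI has already moved past the matching byte when the repetition stops.

```nasm
; sketch: locate the first 'w' in a buffer with REPNE SCASB
[org 0x0100]
                jmp  start

buffer:         db   'hello world'      ; hypothetical block to search, 11 bytes
found:          dw   0xffff             ; index of the match, 0xffff if absent

start:          push ds
                pop  es                 ; SCAS reads through ES:DI, so ES = DS
                mov  di, buffer         ; ES:DI -> first byte of the block
                mov  cx, 11             ; number of bytes to examine
                mov  al, 'w'            ; byte to look for
                cld                     ; auto-increment mode
                repne scasb             ; repeat while not equal and CX not zero
                jne  done               ; ZF clear: block exhausted, no match
                dec  di                 ; step back onto the matching byte
                sub  di, buffer         ; convert the address into an index
                mov  [found], di        ; index 6 for this buffer
done:           mov  ax, 0x4c00         ; terminate program
                int  0x21
```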

MOVS

MOVS transfers a byte or word from the source location DS:SI to the
destination ES:DI and updates SI and DI to point to the next locations.
MOVS is used to move a block of memory. The DF is important in the case of
overlapping blocks. When the source and destination blocks overlap and the
source starts at a lower address than the destination, the copy must run
from the end of the blocks towards the start (DF set); if the destination
starts at a lower address than the source, the copy must run from the start
towards the end (DF clear). Both of these copy operations could not be
performed properly if the direction flag were not provided. If the source
starts lower than the destination and the copy runs towards higher
addresses, source data that has not yet been copied is overwritten. If
instead the copy runs towards lower addresses, the portion of the source
that is overwritten is the one that has already been copied.
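The overlapping case can be tried with a small sketch. The buffer contents and the shift of 3 bytes are hypothetical; the source occupies the first 8 bytes and the destination starts 3 bytes higher, so the two blocks overlap and the copy is run in auto-decrement mode.

```nasm
; sketch: move an 8 byte block 3 bytes towards higher addresses
[org 0x0100]
                jmp  start

buffer:         db   'abcdefgh', 0, 0, 0 ; hypothetical data plus 3 spare bytes

start:          push ds
                pop  es                 ; MOVS writes through ES:DI
                mov  si, buffer+7       ; last byte of the source ('h')
                mov  di, buffer+10      ; last byte of the destination
                mov  cx, 8              ; bytes to move
                std                     ; source is lower: copy from the end back
                rep  movsb              ; buffer now holds 'abcabcdefgh'
                cld                     ; return to the usual auto-increment mode

                mov  ax, 0x4c00         ; terminate program
                int  0x21
```

If the same move were done with CLD, starting from buffer and buffer+3, the first three bytes would be copied over 'def' before those were read, and the buffer would end up holding 'abcabcabcab'.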

                pop  cx
                pop  ax
                pop  es
                ret

start:          call clrscr             ; call clrscr subroutine

                mov  ax, 0x4c00         ; terminate program
                int  0x21

013     A space efficient way to zero a 16-bit register is to XOR it with
        itself. Remember that exclusive or results in a zero whenever the
        bits at the source and at the destination are the same. This
        instruction takes just two bytes, compared to “mov di, 0” which
        takes three. This is a standard way to zero a 16-bit register.

Inside the debugger the operation of the string instruction can be
monitored. The trace into command can be used to follow every repetition of
the string instruction. However, the screen will not appear cleared inside
the debugger, as the debugger overwrites it with its own display. CX
decrements with every iteration and DI increments by 2. The first access is
made at B800:0000, the second at B800:0002, and so on. A complex and
inefficient loop is replaced with a single simple instruction that performs
the same operation many times faster.

LODS EXAMPLE – STRING PRINTING

The use of LODS with the REP prefix is not meaningful, as only the last
value loaded will remain in the register. It is normally used in a loop,
paired with a STOS instruction, to do some block processing. We use LODS to
pick up the data, do the processing, and then use STOS to put it back or
store it at some other place. For example, in string printing we use LODS to
read a character of the string, attach the attribute byte to it, and use
STOS to write it to video memory.

The following example prints a string using string instructions.

Example

; hello world printing using string instructions
[org 0x0100]
                jmp  start

message:        db   'hello world'      ; string to be printed
length:         dw   11                 ; length of string

;;;;; COPY LINES 005-024 FROM EXAMPLE 7.1 (clrscr) ;;;;;

; subroutine to print a string
; takes the x position, y position, attribute, address of string and
; its length as parameters
printstr:       push bp
                mov  bp, sp
                push es
                push ax
                push cx
                push si
                push di

                mov  ax, 0xb800
                mov  es, ax             ; point es to video base
                mov  al, 80             ; load al with columns per row
                mul  byte [bp+10]       ; multiply with y position
                add  ax, [bp+12]        ; add x position
                shl  ax, 1              ; turn into byte offset
                mov  di, ax             ; point di to required location
                mov  si, [bp+6]         ; point si to string
                mov  cx, [bp+4]         ; load length of string in cx
                mov  ah, [bp+8]         ; load attribute in ah

                cld                     ; auto increment mode
nextchar:       lodsb                   ; load next char in al
                stosw                   ; print char/attribute pair
                loop nextchar           ; repeat for the whole string

                pop  di
                pop  si
                pop  cx
                pop  ax
                pop  es
                pop  bp
                ret  10

start:          call clrscr             ; call the clrscr subroutine

                mov  ax, 30
                push ax                 ; push x position
                mov  ax, 20
                push ax                 ; push y position
                mov  ax, 1              ; blue on black attribute
                push ax                 ; push attribute
                mov  ax, message
                push ax                 ; push address of message
                push word [length]      ; push message length
                call printstr           ; call the printstr subroutine

                mov  ax, 0x4c00         ; terminate program
                int  0x21

051     Both operations are in auto increment mode.

052-053 DS is automatically initialized to our segment. ES points to video
        memory. SI points to the address of our string. DI points to the
        screen location. AH holds the attribute. Whenever we read a
        character from the string into AL, the attribute byte is implicitly
        attached and the pair is present in AX. The same effect could not
        be achieved with a REP prefix, as REP would repeat LODS and then
        start repeating STOS, but we need to alternate them.

054     CX holds the length of the string. Therefore LOOP repeats for each
        character of the string.

Inside the debugger we observe how LODS and STOS alternate and that CX is
used only by the LOOP instruction. In the original code there were four
instructions inside the loop; now there are only two. This is how string
instructions help in reducing code size.

SCAS EXAMPLE – STRING LENGTH

Many higher level languages do not explicitly store the string length;
rather, they use a null character, a character with an ASCII code of zero,
to signal the end of a string. In assembly language programs it is also
easier to store a zero at the end of the string than to count its length by
hand, which is tedious and error-prone for longer strings. So we delegate
length calculation to the processor and modify our string printing
subroutine to take a null terminated string and no length. We use SCASB with
REPNE and a zero in AL to find a zero byte in the string. In CX we load the
maximum possible count, 0xFFFF, which allows a scan of almost 64K bytes;
actual strings will be much smaller. An important point regarding SCAS and
CMPS is that when they stop due to equality or inequality, the index
registers have already been updated. Therefore, when SCAS stops, DI points
one past the null character.

Example

; hello world printing with a null terminated string
[org 0x0100]

LES and LDS Instructions

Since the string instructions need their source and destination in the form
of a segment offset pair, there are two special instructions that load a
segment register and a general purpose register from two consecutive words
in memory. LES loads ES while LDS loads DS. Both instructions take two
operands: the general purpose register to be loaded and the memory location
from which to load the pair. The major application of these instructions is
when a subroutine receives a segment offset pair as an argument and the pair
is to be loaded into a segment register and an offset register. According to
the Intel rules of significance, the word at the higher address is loaded
into the segment register while the word at the lower address is loaded into
the offset register. When passing such a pair as parameters, the segment
should be pushed first so that it ends up at the higher address, and the
offset should be pushed afterwards. The instruction is given the lower
address of the pair. For example, “lds si, [bp+4]” will load SI from BP+4
and DS from BP+6.
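The layout of the pair in memory can be seen in a short sketch. Here the pair is stored in a variable rather than on the stack; the label farptr and the character written are hypothetical, and the colour text mode at segment 0xB800 is assumed.

```nasm
; sketch: load a segment offset pair stored in memory with LES
[org 0x0100]
                jmp  start

farptr:         dw   0x0000             ; offset word at the lower address
                dw   0xb800             ; segment word at the higher address

start:          les  di, [farptr]       ; DI = word at farptr, ES = word at farptr+2
                mov  ax, 0x0741         ; letter 'A' (0x41), normal attribute (0x07)
                cld                     ; auto-increment mode
                stosw                   ; write the pair at ES:DI = B800:0000

                mov  ax, 0x4c00         ; terminate program
                int  0x21
```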

LES AND LDS EXAMPLE

We modify the string length calculation subroutine to take the segment and
offset of the string and use the LES instruction to load that segment offset
pair into ES and DI.

Example

; hello world printing with length calculation subroutine
[org 0x0100]
                jmp  start

message:        db   'hello world', 0   ; null terminated string

;;;;; COPY LINES 005-024 FROM EXAMPLE 7.1 (clrscr) ;;;;;

; subroutine to calculate the length of a string
; takes the segment and offset of a string as parameters
strlen:         push bp
                mov  bp, sp
                push es
                push cx
                push di

                les  di, [bp+4]         ; point es:di to string
                mov  cx, 0xffff         ; load maximum number in cx
                xor  al, al             ; load a zero in al
                repne scasb             ; find zero in the string
                mov  ax, 0xffff         ; load maximum number in ax
                sub  ax, cx             ; find change in cx
                dec  ax                 ; exclude null from length

                pop  di
                pop  cx
                pop  es
                pop  bp
                ret  4

; subroutine to print a string
; takes the x position, y position, attribute, and address of a null
; terminated string as parameters
printstr:       push bp
                mov  bp, sp
                push es
                push ax
                push cx
                push si
                push di

                push ds                 ; push segment of string
                mov  ax, [bp+4]
                push ax                 ; push offset of string
                call strlen             ; calculate string length

                cmp  ax, 0              ; is the string empty
                jz   exit               ; no printing if string is empty
                mov  cx, ax             ; save length in cx

                mov  ax, 0xb800
                mov  es, ax             ; point es to video base
                mov  al, 80             ; load al with columns per row
                mul  byte [bp+8]        ; multiply with y position
                add  ax, [bp+10]        ; add x position
                shl  ax, 1              ; turn into byte offset
                mov  di, ax             ; point di to required location
                mov  si, [bp+4]         ; point si to string
                mov  ah, [bp+6]         ; load attribute in ah

                cld                     ; auto increment mode
nextchar:       lodsb                   ; load next char in al
                stosw                   ; print char/attribute pair
                loop nextchar           ; repeat for the whole string

exit:           pop  di
                pop  si
                pop  cx
                pop  ax
                pop  es
                pop  bp
                ret  8

start:          call clrscr             ; call the clrscr subroutine

                mov  ax, 30
                push ax                 ; push x position
                mov  ax, 20
                push ax                 ; push y position
                mov  ax, 0x71           ; blue on white attribute
                push ax                 ; push attribute
                mov  ax, message
                push ax                 ; push address of message
                call printstr           ; call the printstr subroutine

                mov  ax, 0x4c00         ; terminate program
                int  0x21

036     The LES instruction is used to load the DI register from BP+4 and
        the ES register from BP+6.

065     The convention to return a value from a subroutine is to use the AX
        register. That is why AX is not saved and restored in the
        subroutine.

Inside the debugger observe that the segment register is pushed first,
followed by the offset. The higher address FFE6 contains the segment and the
lower address FFE4 contains the offset. This is because we have a
decrementing stack. Then observe the loading of ES and DI from the stack.

MOVS EXAMPLE – SCREEN SCROLLING

MOVS has the two forms MOVSB and MOVSW. REP allows the instruction to be
repeated CX times, allowing blocks of memory to be copied. We will use it to
copy within the video screen.

Scrolling is the process in which all the lines on the screen move one or
more lines towards the top or towards the bottom, and the new lines that
appear at the opposite edge are cleared. Scrolling is a process to which
string movement is naturally applicable. REP with MOVS uses the full power
of the processor to do the scrolling in minimum time.

In this example we want to scroll a variable number of lines, given as an
argument. Therefore we have to calculate the source address, which is 160
times the number of lines to scroll, since each line of 80 characters
occupies 160 bytes. The destination address is 0, which is the top left of
the screen. The lines that scroll off the top are discarded, so the source
pointer is placed just after them. An equal number of lines at the bottom
are then cleared, as their contents have already been copied further up.
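The calculation described above can be sketched as a subroutine that scrolls the screen up. This is an illustrative arrangement under stated assumptions (80x25 colour text mode at segment 0xB800, a line count between 0 and 25, and the hypothetical name scrollup); it is not claimed to be the listing originally given at this point in the notes.

```nasm
; sketch: scroll the 80x25 text screen up by a given number of lines
[org 0x0100]
                jmp  start

; takes the number of lines to scroll (0 to 25) as parameter
scrollup:       push bp
                mov  bp, sp
                push ax
                push cx
                push si
                push di
                push es
                push ds

                mov  ax, 80             ; characters per row
                mul  byte [bp+4]        ; ax = positions scrolled off the top
                push ax                 ; the same count is cleared later
                mov  cx, 2000           ; total screen positions
                sub  cx, ax             ; words that stay on screen and must move
                shl  ax, 1              ; 2 bytes per position: source byte offset
                mov  si, ax             ; source starts after the discarded lines
                xor  di, di             ; destination is the top left of the screen
                mov  ax, 0xb800
                mov  es, ax             ; destination segment is video memory
                mov  ds, ax             ; source segment is video memory too
                cld                     ; destination is lower: copy forwards
                rep  movsw              ; move the surviving lines up
                mov  ax, 0x0720         ; space in normal attribute
                pop  cx                 ; count of positions to clear
                rep  stosw              ; DI already points at the first vacated cell

                pop  ds
                pop  es
                pop  di
                pop  si
                pop  cx
                pop  ax
                pop  bp
                ret  2

start:          mov  ax, 3              ; hypothetical choice: scroll three lines
                push ax                 ; push number of lines to scroll
                call scrollup

                mov  ax, 0x4c00         ; terminate program
                int  0x21
```

Because the destination lies below the source, auto-increment mode is safe here: every word is read before the copy reaches and overwrites it.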

Example

; scroll down the screen
[org 0x0100]
                jmp  start

; subroutine to scroll down the screen
; takes the number of lines to scroll as parameter
scrolldown:     push bp
                mov  bp, sp
                push ax
                push cx
                push si
                push di
                push es
                push ds

                mov  ax, 80             ; load chars per row in ax
                mul  byte [bp+4]        ; calculate positions to scroll
                push ax                 ; save position count for later use
                mov  cx, 2000           ; number of screen locations
                sub  cx, ax             ; count of words to move
                shl  ax, 1              ; convert to byte offset
                mov  si, 3998           ; last location on the screen
                sub  si, ax             ; load source position in si
                mov  ax, 0xb800
                mov  es, ax             ; point es to video base
                mov  ds, ax             ; point ds to video base
                mov  di, 3998           ; point di to lower right column
                std                     ; set auto decrement mode
                rep  movsw              ; scroll down
                mov  ax, 0x0720         ; space in normal attribute
                pop  cx                 ; count of positions to clear
                rep  stosw              ; clear the scrolled space

                pop  ds
                pop  es
                pop  di
                pop  si
                pop  cx
                pop  ax
                pop  bp
                ret  2

start:          mov  ax,
                push ax                 ; push number of lines to scroll
                call scrolldown         ; call scroll down subroutine

                mov  ax, 0x4c00         ; terminate program
                int  0x21

CMPS EXAMPLE – STRING COMPARISON

For the last string instruction, we take string comparison as an example.
The subroutine takes two segment offset pairs containing the addresses of
two null terminated strings. It returns 0 if the strings are different and 1
if they are the same. The AX register is used to hold the return value.
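Before the full subroutine, the core of the comparison can be sketched on its own. This simplified sketch assumes both blocks are in the program's own segment and have a known, equal length, so no length calculation or segment offset parameters are involved; the labels first, second, and same are hypothetical.

```nasm
; sketch: compare two equal-length byte blocks with REPE CMPSB
[org 0x0100]
                jmp  start

first:          db   'hello world'      ; hypothetical test data, 11 bytes each
second:         db   'hello WORLD'
same:           dw   0                  ; result: 1 if identical, 0 if different

start:          push ds
                pop  es                 ; CMPS compares DS:SI against ES:DI
                mov  si, first
                mov  di, second
                mov  cx, 11             ; number of bytes to compare
                cld                     ; auto-increment mode
                repe cmpsb              ; repeat while equal and CX not zero
                jne  done               ; stopped on a mismatch: result stays 0
                mov  word [same], 1     ; all 11 pairs were equal
done:           mov  ax, 0x4c00         ; terminate program
                int  0x21
```

With the data shown the two blocks differ at the letter 'w', so the repetition stops early and the result remains 0.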

Example

; comparing null terminated strings
[org 0x0100]
                jmp  start

msg1:           db   'hello world', 0
msg2:           db   'hello WORLD', 0
msg3:           db   'hello world', 0

;;;;; COPY LINES 028-050 FROM EXAMPLE 7.4 (strlen) ;;;;;

; subroutine to compare two strings
; takes segment and offset pairs of two strings to compare