Algorithms and Multiplication-Microprocessor and Assembly Language Programming-Lecture Notes, Study notes of Microprocessor and Assembly Language Programming

This lecture handout was provided at Quaid-i-Azam University for Microprocessor and Assembly Language Programming course by Prof. Saleem Raza. Its main points are: Repertoire, Algorithm, Manipulations, instructions, Processor, Binary, Formulation, Set, Complex, Task

Typology: Study notes

2011/2012

Uploaded on 08/04/2012

With the important capability of decision making in our repertoire, we move
on to the discussion of an algorithm that will help us uncover an
important set of instructions in our processor used for bit manipulations.
Multiplication is a common process that we were trained to do
in early schooling. Remember multiplying by a digit, then putting a cross
and multiplying by the next digit, then putting two crosses, and so on,
and summing the intermediate results at the end. It is a very familiar process,
but we never saw it as an algorithm, and we need to see it as an
algorithm to convey it to the processor.
To highlight the important points of the algorithm, we revise it on two 4bit
binary numbers. The numbers are 1101, i.e. 13, and 0101, i.e. 5. The answer
should be 65, or 01000001 in binary. Observe that the answer is twice as
long as the multiplier and the multiplicand. The multiplication is shown in
the following figure.
        1101   = 13
        0101   = 5
       -----
        1101
       0000x
      1101xx
     0000xxx
    --------
    01000001   = 65
We take the first (right most) digit of the multiplier and multiply it with the
multiplicand. As the digit is one the answer is the multiplicand itself, so we
place the multiplicand below the bar. Before multiplying with the next digit a
cross is placed at the right most place on the next line and the result is
placed shifted one digit left. However, since the digit is zero, the result is zero.
The next digit is one, and multiplying with it gives 1101. We put two
crosses on the next line at the right most positions and place the result there,
shifted two places to the left. The fourth digit is zero, so the answer 0000 is
placed with three crosses to its right.
Observe the beauty of the binary base: no real multiplication is needed at
the digit level. If the digit is 0 the answer is 0, and if the digit is 1 the answer
is the multiplicand itself. Also observe that for every next digit in the
multiplier the answer is written shifted one more place to the left: no shifting
for the first digit, once for the second, twice for the third and thrice for the
fourth one. Adding all the intermediate answers, the result is 01000001 = 65
as desired. Crosses are treated as zero in this addition.
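The same sum can be written out with each shifted row expressed as the multiplicand times a power of two. This is only a restatement of the figure above, with one term per multiplier bit, taken from right to left:

```latex
\[
\begin{aligned}
1101_2 \times 0101_2
  &= 1\cdot(13\cdot 2^0) + 0\cdot(13\cdot 2^1) + 1\cdot(13\cdot 2^2) + 0\cdot(13\cdot 2^3) \\
  &= 13 + 0 + 52 + 0 \\
  &= 65 = 01000001_2
\end{aligned}
\]
```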
Before formulating the algorithm for this problem, we need some more
instructions: one that can shift a number, so that we can use it for
shifting our multiplicand, and some way to check the bits of the multiplier
one by one.
The set of shifting and rotation instructions is one of the most useful sets in
any processor’s instruction set. They simplify really complex tasks to a very
neat and concise algorithm. The following shifting and rotation operations are available in our processor.

Shift Logical Right (SHR)

The shift logical right operation inserts a zero from the left and moves every bit one position to the right and copies the rightmost bit in the carry flag. Imagine that there is a pipe filled to capacity with eight balls. The pipe is open from both ends and there is a basket at the right end to hold anything dropping from there. The operation of shift logical right is to force a white ball from the left end. The operation is depicted in the following illustration.

White balls represent zero bits while black balls represent one bits. Sixteen bit shifting is done the same way with a pipe of double capacity.

Shift Logical Left (SHL) / Shift Arithmetic Left (SAL)

The shift logical left operation is the exact opposite of shift logical right. In this operation the zero bit is inserted from the right and every bit moves one position to its left with the most significant bit dropping into the carry flag. Shift arithmetic left is just another name for shift logical left. The operation is again exemplified with the following illustration of ball and pipes.

Shift Arithmetic Right (SAR)

A signed number holds the sign in its most significant bit. If this bit was one a logical right shifting will change the sign of this number because of insertion of a zero from the left. The sign of a signed number should not change because of shifting. The operation of shift arithmetic right is therefore to shift every bit one place to the right with a copy of the most significant bit left at the most significant place. The bit dropped from the right is caught in the carry basket. The sign bit is retained in this operation. The operation is further illustrated below.

The left shifting operation is basically multiplication by two while the right shifting operation is division by two. The left shift operation is equivalent to multiplication except when an important bit is dropped from the left. For an unsigned number the dropped bit lands in the carry flag and can be checked with JC. For a signed number a single left shift sets the overflow flag if the sign bit has changed, and this condition can be checked with JO. For division by two of a signed number, logical right shifting will give a wrong answer for a negative number as the zero inserted from the left will change its sign. To retain the sign bit and still effectively divide by two, the shift arithmetic right instruction must be used on signed numbers.

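[Illustrations: the byte 10110100 under each shift, C being the carry flag]

SHL / SAL:   C <- 1 0 1 1 0 1 0 0 <- 0

SHR:         0 -> 1 0 1 1 0 1 0 0 -> C

SAR:         1 0 1 1 0 1 0 0 -> C   (the most significant bit keeps a copy of itself)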
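To watch the three shifts in the debugger, a minimal sketch along the following lines can be used. The test value 0xB4 (the byte 10110100) and the choice of registers are assumptions made for this illustration. The program follows the same COM file layout as the other listings in these notes.

```nasm
; compare shr, sar and shl on the same byte
[org 0x0100]
              mov  al, 0xB4           ; 10110100b = 180 unsigned, -76 signed
              mov  bl, al             ; copy for the arithmetic shift
              mov  cl, al             ; copy for the left shift

              shr  al, 1              ; al = 0x5A = 90 (180/2), CF = 0
              sar  bl, 1              ; bl = 0xDA = -38 (-76/2), CF = 0
              shl  cl, 1              ; cl = 0x68, CF = 1, OF = 1 (sign changed)

              mov  ax, 0x4c00         ; terminate program
              int  0x21
```

After SHR the negative value -76 has become the positive value 90, while SAR gives -38 as expected. The left shift drops a one bit into the carry flag, so the value left in CL (104) is not twice the original value.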

In the multiplication algorithm discussed above we revised the way we multiplied numbers in lower classes, and gave an example of that method on binary numbers. We make a simple modification to the traditional algorithm before we proceed to formulate it in assembly language. In the traditional algorithm we calculate all intermediate answers and then sum them to get the final answer. If we instead add every intermediate answer to an accumulated result as soon as it is calculated, the result will be the same in the end, except that we do not have to remember a lot of intermediate answers during the whole multiplication. The multiplication with the new algorithm is shown below.

       1101 = 13        Accumulated Result
       0101 = 5
      -----             0 (Initial Value)
       1101 = 13        0 + 13 = 13
      0000x = 0         13 + 0 = 13
     1101xx = 52        13 + 52 = 65
    0000xxx = 0         65 + 0 = 65 (Answer)

We try to identify the steps of our algorithm. First we set the result to zero. Then we check the right most bit of the multiplier. If it is one we add the multiplicand to the result, and if it is zero we perform no addition. We left shift the multiplicand before the next bit of the multiplier is tested. The left shifting of the multiplicand is performed regardless of the value of the multiplier’s right most bit, just like the crosses in traditional multiplication are always placed to mark the ones, tens, hundreds, etc. places. Then we check the next bit and if it is one we add the shifted value of the multiplicand to the result. We repeat for as many digits as there are in the multiplier, 4 in our example. Formulating the steps of the algorithm we get:

• Shift the multiplier to the right.
• If CF=1 add the multiplicand to the result.
• Shift the multiplicand to the left.
• Repeat the algorithm 4 times.

For an 8bit multiplication the algorithm will be repeated 8 times and for a sixteen bit multiplication it will be repeated 16 times, whatever the size of the multiplier is.

The algorithm uses the fact that shifting right forces the right most bit to drop in the carry flag. If we test the carry flag using JC we are effectively testing the right most bit of the multiplier. Another shifting will cause the next bit to drop in the next iteration and so on. So our task of checking bits one by one is satisfied using the shift operation. There are many other methods to do this bit testing as well; however, we exemplify one of the methods in this example.

In the first iteration there is no shifting, just like there is no cross in traditional multiplication in the first pass. Therefore we placed the left shifting of the multiplicand after the addition step. However, the right shifting of the multiplier must come before the addition, as the addition step’s execution depends upon its result.

We introduce an assembly language program to perform this 4bit multiplication. The algorithm is extensible to more bits but there are a few complications, which are left to be discussed later. For now we do a 4bit multiplication to keep the algorithm simple.

    ; 4bit multiplication algorithm
    [org 0x100]
                  jmp  start

    multiplicand: db   13                  ; 4bit multiplicand (8bit space)
    multiplier:   db   5                   ; 4bit multiplier
    result:       db   0                   ; 8bit result

    start:        mov  cl, 4               ; initialize bit count to four
                  mov  bl, [multiplicand]  ; load multiplicand in bl
                  mov  dl, [multiplier]    ; load multiplier in dl

    checkbit:     shr  dl, 1               ; move right most bit in carry
                  jnc  skip                ; skip addition if bit is zero

                  add  [result], bl        ; accumulate result

    skip:         shl  bl, 1               ; shift multiplicand left
                  dec  cl                  ; decrement bit count
                  jnz  checkbit            ; repeat if bits left

                  mov  ax, 0x4c00          ; terminate program
                  int  0x21

The numbers to be multiplied are constants for now. The multiplication is four bit so the answer is stored in 8bit space. If the operands were 8bit the answer would be 16bit and if the operands were 16bit the answer would be 32bit. Since eight bits fit in a byte we have used 4bit multiplication as our first example.

Since adding zero changes nothing, we skip the addition step if the rightmost bit of the multiplier is zero. If the jump is not taken the shifted value of the multiplicand is added to the result.

The multiplicand is left shifted in every iteration regardless of the multiplier bit.

DEC is a new instruction but its operation should be immediately understandable with the knowledge gained till now. It simply subtracts one from its single operand.

The JNZ instruction causes the algorithm to repeat as long as any bits of the multiplier are left.

Inside the debugger observe the working of the SHR and SHL instructions. The SHR instruction is effectively dividing its operand by two and the remainder is stored in the carry flag from where we test it. The SHL instruction is multiplying its operand by two so that it is added at one place more towards the left in the result.

We performed a 4bit multiplication to explain the algorithm; however, the real advantage of the computer shows when we ask it to multiply large numbers, numbers whose multiplication takes real time. If we have an 8bit number we can do the multiplication in word registers, but are we limited to word operations? What if we want to multiply 32bit or even larger numbers? We are certainly not limited. Assembly language only provides us the basic building blocks. Whether we build a plaza out of these blocks, or a building, or a classic piece of architecture depends only upon our imagination. With our logic we can extend these algorithms as much as we want. Our next example will be multiplication of 16bit numbers to produce a 32bit answer. However, for a 32bit answer we need a way to shift a 32bit number and a way to add 32bit numbers. We cannot depend on 16bit shifting, as we have 16 significant bits in our multiplicand and shifting any bit towards the left may drop a valuable bit, causing a totally wrong result. A valuable bit means any bit that is one. Dropping a zero bit doesn’t cause any difference. So we place the 16bit number in 32bit space with the upper 16 bits zeroed so that the sixteen shift operations don’t cause any valuable bit to drop. Even though the numbers were 16bit we need 32bit operations to multiply correctly. To clarify this necessity, we take the example of the number 40000, or 9C40 in hexadecimal. In binary it is represented as 1001110001000000. To multiply

operand. However ADC has three operands. The third implied operand is the carry flag. The ADC instruction is specifically placed for extending the capability of ADD. Numbers of any size can be added using a proper combination of ADD and ADC. All basic building blocks are provided for the assembly language programmer, and the programmer can extend its capabilities as much as needed by using these fine instructions in appropriate combinations.

Further clarifying the operation of ADC, consider an instruction “ADC AX, BX.” Normal addition would have just added BX to AX, however ADC first adds the carry flag to AX and then adds BX to AX. Therefore the last carry is also included in the result.

The algorithm should be apparent by now. The lower halves of the two numbers to be added are first added with a normal addition. For the upper halves a normal addition would lose track of a possible carry from the lower halves and the answer would be wrong. If a carry was generated it should go to the upper half. Therefore the upper halves are added with an addition with carry instruction. Since one operand must be in a register, AX is used to read the lower and upper halves of the source one by one. The destination is directly updated. The set of instructions goes here.

    dest:         dd   40000
    src:          dd   80000

                  mov  ax, [src]
                  add  word [dest], ax
                  mov  ax, [src+2]
                  adc  word [dest+2], ax

To further extend it more additions with carry will be used. However the carry from the last addition will be wasted as there will always be a size limit where the results and the numbers are stored. This carry will remain in the carry flag to be tested for a possible overflow.

For subtraction the same logic will be used, and just like addition with carry there is an instruction to subtract with borrow called SBB. Borrow in the name means the carry flag and is used just for clarity. Or we can say that the carry flag holds the carry for addition instructions and the borrow for subtraction instructions. Also the carry is generated at the 17th bit and the borrow is also taken from the 17th bit. Also there is no single instruction that needs borrow and carry in their independent meanings at the same time. Therefore it is logical to use the same flag for both tasks.

We extend subtraction with a very similar algorithm. The lower halves must be subtracted normally while the upper halves must be subtracted with a subtract with borrow instruction so that if the lower halves needed a borrow, a one is subtracted from the upper halves. The algorithm is as under.

    dest:         dd   40000
    src:          dd   80000

                  mov  ax, [src]
                  sub  word [dest], ax
                  mov  ax, [src+2]
                  sbb  word [dest+2], ax
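With the values 40000 and 80000 the lower halves happen to produce no carry, so the effect of ADC and SBB is not visible. The following sketch uses operands chosen so that a carry and a borrow do cross the word boundary. The labels sum, diff and one and their values are assumptions made for this illustration only.

```nasm
; 32bit addition and subtraction where a carry/borrow crosses the halves
[org 0x0100]
              jmp  start

sum:          dd   0x0001FFFF         ; becomes 0x00020000 after adding one
diff:         dd   0x00020000         ; becomes 0x0001FFFF after subtracting one
one:          dd   1

start:        mov  ax, [one]
              add  word [sum], ax     ; 0xFFFF + 1 = 0x0000, CF = 1
              mov  ax, [one+2]        ; mov does not change the flags
              adc  word [sum+2], ax   ; 0x0001 + 0 + CF = 0x0002

              mov  ax, [one]
              sub  word [diff], ax    ; 0x0000 - 1 = 0xFFFF, CF = 1 (borrow)
              mov  ax, [one+2]
              sbb  word [diff+2], ax  ; 0x0002 - 0 - CF = 0x0001

              mov  ax, 0x4c00         ; terminate program
              int  0x21
```

The technique depends on MOV leaving the flags untouched, so the carry produced by ADD or SUB is still in the carry flag when ADC or SBB executes.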

Extended Multiplication

We use extended shifting and extended addition to formulate our algorithm to do extended multiplication. The multiplier is still stored in 16bits since we only need to check its bits one by one. The multiplicand however cannot be stored in 16bits otherwise on left shifting its significant bits might get lost. Therefore it has to be stored in 32bits and the shifting and addition used to accumulate the result must be 32bits as well.

    ; 16bit multiplication
    [org 0x0100]
                  jmp  start

    multiplicand: dd   1300                ; 16bit multiplicand 32bit space
    multiplier:   dw   500                 ; 16bit multiplier
    result:       dd   0                   ; 32bit result

    start:        mov  cl, 16              ; initialize bit count to 16
                  mov  dx, [multiplier]    ; load multiplier in dx

    checkbit:     shr  dx, 1               ; move right most bit in carry
                  jnc  skip                ; skip addition if bit is zero

                  mov  ax, [multiplicand]
                  add  [result], ax        ; add less significant word
                  mov  ax, [multiplicand+2]
                  adc  [result+2], ax      ; add more significant word

    skip:         shl  word [multiplicand], 1
                  rcl  word [multiplicand+2], 1 ; shift multiplicand left
                  dec  cl                  ; decrement bit count
                  jnz  checkbit            ; repeat if bits left

                  mov  ax, 0x4c00          ; terminate program
                  int  0x21

The multiplicand and the result are stored in 32bit space while the multiplier is stored as a word. The multiplier is loaded in DX where it will be shifted bit by bit. It can be directly shifted in memory as well. The multiplicand is added to the result using extended 32bit addition. The multiplicand is shifted left as a 32bit number using extended shifting operation.

The multiplicand will occupy the space from 0103-0106, the multiplier will occupy space from 0107-0108 and the result will occupy the space from 0109-010C. Inside the debugger observe the changes in these memory locations during the course of the algorithm. The extended shifting and addition operations provide the same effect as would be provided if there were 32bit addition and shifting operations available in the instruction set. At the end of the algorithm the result memory locations contain the value 0009EB10 which is 650000 in decimal, the desired answer. Also observe that the number 00000514 which is 1300 in decimal, our multiplicand, has become 05140000 after being left shifted 16 times. Our extended shifting has given the same result as if a 32bit number is left shifted 16 times as a unit. There are many other important applications of the shifting and rotation operations in addition to this example of the multiplication algorithm. More examples will come in coming chapters.
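As a check on the hexadecimal value of the result, the product can be worked out by hand from the set bits of the multiplier, which is exactly the set of additions the program performs:

```latex
\[
\begin{aligned}
500 &= 111110100_2 = 2^2 + 2^4 + 2^5 + 2^6 + 2^7 + 2^8 \\
1300 \times 500 &= 1300\,(4 + 16 + 32 + 64 + 128 + 256) = 650000 \\
\mathtt{0009EB10}_{16} &= 9\cdot 16^4 + 14\cdot 16^3 + 11\cdot 16^2 + 1\cdot 16 + 0 \\
  &= 589824 + 57344 + 2816 + 16 = 650000
\end{aligned}
\]
```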

The 8088 processor provides us with a few logical operations that operate at the bit level. The logical operations are the same as discussed in computer logic design; however our perspective will be a little different. The four basic operations are AND, OR, XOR, and NOT. The important thing about these operations is that they are bitwise. This means that if “and ax, bx” instruction is given, then the operation of AND is applied on corresponding bits of AX and BX. There are 16 AND operations as a result; one for every bit of AX. Bit 0 of AX will be set if both its original value and Bit 0 of BX are set, bit 1 will be set if both its original value and Bit 1 of BX are set, and so on for the remaining bits. These operations are conducted in parallel on the sixteen bits. Similarly the operations of other logical operations are bitwise as well.
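The bitwise nature of these operations is easiest to see on concrete values in the debugger. The following is a minimal sketch. The operand values 0x5A3C and 0x0FF0 are arbitrary assumptions chosen so that each hexadecimal digit shows a different case.

```nasm
; the four basic logical operations applied bit by bit
[org 0x0100]
              mov  bx, 0x0FF0         ; second operand for and, or, xor

              mov  ax, 0x5A3C
              and  ax, bx             ; ax = 0x0A30, set only where both are set

              mov  ax, 0x5A3C
              or   ax, bx             ; ax = 0x5FFC, set where either is set

              mov  ax, 0x5A3C
              xor  ax, bx             ; ax = 0x55CC, set where the bits differ

              mov  ax, 0x5A3C
              not  ax                 ; ax = 0xA5C3, every bit inverted

              mov  ax, 0x4c00         ; terminate program
              int  0x21
```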

Selective Bit Inversion

XOR can also be used as a masking operation to invert selective bits. The bits in the mask are cleared at positions which are to retain their values, and are set at positions which are to be inverted. For example, to invert the lower nibble of the destination operand, the operand should be applied with a mask of 0F in the source. The upper nibble will retain its value and the lower nibble will be inverted as a result. Compare this with NOT, which inverts everything. XOR on the other hand allows inverting selective bits.
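A short sketch of the lower nibble example follows. The starting value 0xA5 is an assumption chosen for illustration, and the mask 0x0F is the one named above.

```nasm
; invert only the lower nibble with xor, compared with not
[org 0x0100]
              mov  al, 0xA5           ; 10100101b
              xor  al, 0x0F           ; al = 0xAA (10101010b), upper nibble kept
              xor  al, 0x0F           ; al = 0xA5 again, same mask restores it

              mov  bl, 0xA5
              not  bl                 ; bl = 0x5A, all eight bits inverted

              mov  ax, 0x4c00         ; terminate program
              int  0x21
```

Applying the same mask twice restores the original value, which is a convenient property when a bit has to be toggled back and forth.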

Selective Bit Testing

AND can be used to check whether particular bits of a number are set or not. Previously we used shifting and JC to test bits one by one. Now we introduce another way to test bits, which is more powerful in the sense that any bit can be tested anytime and not necessarily in order. AND can be applied on a destination with a 1-bit in the desired position and a source, which is to be checked. If the destination is zero as a result, which can be checked with a JZ instruction, the bit at the desired position in the source was clear. However the AND operation destroys the destination mask, which might be needed later as well. Therefore Intel provided us with another instruction analogous to CMP, which is non-destructive subtraction. This is the TEST instruction and is a non-destructive AND operation. It doesn’t change the destination and only sets the flags according to the AND operation. By checking the flags, we can see if the desired bit was set or cleared. We change our multiplication algorithm to use selective bit testing instead of checking bits one by one using the shifting operations.

    ; 16bit multiplication using test for bit testing
    [org 0x0100]
                  jmp  start

    multiplicand: dd   1300                ; 16bit multiplicand 32bit space
    multiplier:   dw   500                 ; 16bit multiplier
    result:       dd   0                   ; 32bit result

    start:        mov  cl, 16              ; initialize bit count to 16
                  mov  bx, 1               ; initialize bit mask

    checkbit:     test bx, [multiplier]    ; test right most bit
                  jz   skip                ; skip addition if bit is zero

                  mov  ax, [multiplicand]
                  add  [result], ax        ; add less significant word
                  mov  ax, [multiplicand+2]
                  adc  [result+2], ax      ; add more significant word

    skip:         shl  word [multiplicand], 1
                  rcl  word [multiplicand+2], 1 ; shift multiplicand left
                  shl  bx, 1               ; shift mask towards next bit
                  dec  cl                  ; decrement bit count
                  jnz  checkbit            ; repeat if bits left

                  mov  ax, 0x4c00          ; terminate program
                  int  0x21

The TEST instruction is used for bit testing. BX holds the mask, and it is shifted left in every iteration, as our concerned bit is then the next bit. We can do without counting in this example: we can stop as soon as our mask in BX becomes zero. These are the small tricks that assembly allows us to do and optimize our code as a result.
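One possible way to drop the counter is sketched below. It is a variant written for illustration, not a listing from the handout. It relies on SHL setting the zero flag when its result is zero, so the mask shift has to be the last flag changing instruction before JNZ.

```nasm
; 16bit multiplication, loop ends when the mask is shifted out
[org 0x0100]
              jmp  start

multiplicand: dd   1300               ; 16bit multiplicand 32bit space
multiplier:   dw   500                ; 16bit multiplier
result:       dd   0                  ; 32bit result

start:        mov  bx, 1              ; initialize bit mask

checkbit:     test bx, [multiplier]   ; test the bit selected by the mask
              jz   skip               ; skip addition if bit is zero

              mov  ax, [multiplicand]
              add  [result], ax       ; add less significant word
              mov  ax, [multiplicand+2]
              adc  [result+2], ax     ; add more significant word

skip:         shl  word [multiplicand], 1
              rcl  word [multiplicand+2], 1 ; shift multiplicand left
              shl  bx, 1              ; ZF = 1 once the mask bit drops out
              jnz  checkbit           ; repeat while the mask is non-zero

              mov  ax, 0x4c00         ; terminate program
              int  0x21
```

The mask starts at bit 0 and becomes zero on the sixteenth shift, so the loop still runs exactly 16 times while CL stays free for other use.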