SRT Division Algorithm and the Pentium Division Bug - Prof. Gabriel Loh | Study notes Computer Science

Multiplication, Division

Prof. Loh

CS3220 - Processor Design - Fall 2008

November 3, 2008

1 Multiplication

Multiplication by hand in base 10 involves multiplying one number by each digit of the second number and then adding all of the results together. In base 2, multiplication by a single bit is simplified by the fact that a bit has only two values (both of which are trivial to multiply by!). Multiplying an n-bit number by a single bit simply involves n AND gates, as illustrated in Figure 1. To multiply two n-bit numbers together, we simply need to perform n different n×1-bit multiplications in parallel, shift the partial results properly, and add them all together. This is illustrated in Figure 2. The total gate delay is O(1) for the 1-bit multiplies, zero for shifting (each shift is by a constant amount, so only wires are involved), and O(log(n + lg n)) ≈ O(log n) gate delays for adding together n different O(n)-bit numbers (if, for example, a tree of carry-save adders is used). Notice that the final output of an n-bit by n-bit multiply is 2n bits wide.
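The structure described above can be modeled in software. The following is a minimal Python sketch, not the lecture's own code: the function names are hypothetical, Python integers stand in for bit vectors, and the reduction loop is only one simple way to arrange carry-save adders into a tree. The final `sum` stands in for the fast carry-propagate adder that a hardware design would need at the end.

```python
def partial_products(x, y, n):
    """Return the n shifted partial products of two n-bit unsigned integers."""
    products = []
    for i in range(n):
        y_i = (y >> i) & 1
        # n AND gates: every bit of x is ANDed with the single bit y_i.
        row = x if y_i else 0
        # Shifting by a constant amount costs only wiring in hardware.
        products.append(row << i)
    return products


def carry_save_add(a, b, c):
    """Reduce three numbers to a sum word and a carry word (a + b + c is preserved)."""
    sum_word = a ^ b ^ c
    carry_word = ((a & b) | (a & c) | (b & c)) << 1
    return sum_word, carry_word


def multiply(x, y, n):
    """Multiply two n-bit unsigned integers; the result fits in 2n bits."""
    terms = partial_products(x, y, n)
    # Each pass compresses every group of three terms into two.
    while len(terms) > 2:
        reduced = []
        for k in range(0, len(terms) - 2, 3):
            reduced.extend(carry_save_add(terms[k], terms[k + 1], terms[k + 2]))
        # Terms that did not fit into a group of three pass through unchanged.
        reduced.extend(terms[len(terms) - len(terms) % 3:])
        terms = reduced
    # Stand-in for the final carry-propagate addition.
    return sum(terms)
```

Each pass of the `while` loop shrinks the number of terms by roughly a third, which is where the logarithmic number of adder levels comes from.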

There are other ways to perform multiplication by using repeated iterative steps. The naive approach is to have a single n×1-bit multiplier and, on each cycle, generate one additional partial product. After n such steps, all of the partial products will have been generated. In parallel, the partial products can be added as they are generated with an accumulator. This uses considerably less hardware, but takes much longer to complete the calculation (O(n) cycles).
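The iterative scheme can be sketched in a few lines of Python. This is an illustrative model only: the function name and the returned cycle count are assumptions added here, and each loop iteration stands for one clock cycle of the hardware.

```python
def iterative_multiply(x, y, n):
    """Shift-and-add multiply of two n-bit unsigned integers.

    Returns (product, cycles), where cycles counts the loop iterations.
    """
    accumulator = 0
    cycles = 0
    for i in range(n):
        y_i = (y >> i) & 1
        # One n-by-1-bit multiply per cycle, shifted into position.
        partial = (x if y_i else 0) << i
        # The accumulator adds each partial product as it is generated.
        accumulator += partial
        cycles += 1
    return accumulator, cycles
```

For example, `iterative_multiply(13, 11, 4)` takes four iterations, one per bit of the multiplier, regardless of how many of those bits are zero.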

In either the iterative addition of partial products or the use of a Wallace tree (a tree of carry-save adders), the number of partial products largely determines how fast the multiplication can be performed. To reduce the number of partial products, the Booth algorithm can be used. This is a simple trick that involves recoding the binary number using 0’s, 1’s and -1’s. For example, the number $0011110_2$ is the same as $01000\bar{1}0_2$, where $\bar{1}$ means -1. Any partial product that corresponds to a digit equal to zero can be skipped. This doesn’t help much when a tree of adders is used, but can save many iterations when an iterative method is used. To perform the encoding, start from the least significant bit. Each time a block of zeros ends and a block of ones starts, a $\bar{1}$ is written down. Each time a block of ones ends and a block of zeros starts, a 1 is written down. If neither condition holds, a zero is written. The following is an example:

$$00011000011111\;(0) \leftarrow \text{implicit zero}$$

$$0010\bar{1}00010000\bar{1}$$
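The recoding rule can be checked with a short Python sketch. It is an illustration under stated assumptions rather than the lecture's own procedure: the function names are hypothetical, digits are returned least significant first with -1 written as the integer `-1`, and one extra top digit is appended (an addition made here) so that unsigned inputs whose most significant bit is 1 are still represented correctly.

```python
def booth_recode(y, n):
    """Recode an n-bit unsigned integer into digits from {-1, 0, 1}.

    Digits are returned least significant first.
    """
    digits = []
    prev = 0  # the implicit zero to the right of the least significant bit
    for i in range(n):
        bit = (y >> i) & 1
        # zeros end, ones start: prev=0, bit=1 -> -1
        # ones end, zeros start: prev=1, bit=0 -> +1
        # otherwise                            ->  0
        digits.append(prev - bit)
        prev = bit
    # Assumption: an extra digit closes a block of ones that reaches the top bit.
    digits.append(prev)
    return digits


def booth_multiply(x, y, n):
    """Multiply x by the n-bit unsigned y, skipping zero digits.

    Returns (product, steps), where steps counts the nonzero digits used.
    """
    product = 0
    steps = 0
    for i, digit in enumerate(booth_recode(y, n)):
        if digit == 0:
            continue  # this partial product is skipped entirely
        product += digit * (x << i)
        steps += 1
    return product, steps
```

With the multiplier $0011110_2$ (30), `booth_multiply(5, 30, 7)` needs only two add or subtract steps instead of four. The saving depends on the bit pattern: a multiplier with long runs of ones benefits most, while an alternating pattern such as 0101 produces more nonzero digits after recoding than before.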

Figure 1: An n-bit by 1-bit multiply is achieved by using AND gates. Each input bit x_j (j = 0, ..., n−1) is ANDed with y_i to produce the output bit x_j·y_i.

