THE ART OF COMPUTER PROGRAMMING
FASCICLE 1
MMIX

DONALD E. KNUTH
Stanford University

ADDISON-WESLEY

Internet page http://www-cs-faculty.stanford.edu/~knuth/taocp.html contains current information about this book and related books. See also http://www-cs-faculty.stanford.edu/~knuth/mmix.html for downloadable software, and http://mmixmasters.sourceforge.net for general news about MMIX.

Copyright © 1999 by Addison-Wesley

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form, or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior consent of the publisher, except that the official electronic file may be used to print single copies for personal (not commercial) use.

Zeroth printing (revision 15), 15 February 2004

PREFACE

Fascicle Number One is about MMIX, the long-promised replacement for MIX. Decades have passed since the MIX computer was designed, and computer architecture has moved toward a rather different style of machine. Therefore I decided in 1990 to replace MIX with a new computer that would contain even less saturated fat than its predecessor.

Exercise 1.3.1-25 in the first three editions of Volume 1 spoke of an extended MIX called MixMaster, which was upward compatible with the old version. But MixMaster itself has long been hopelessly obsolete. It allowed for several gigabytes of memory, but one couldn't even use it with ASCII code to print lowercase letters. And ouch, its standard subroutine calling convention was irrevocably based on self-modifying instructions! Decimal arithmetic and self-modifying code were popular in 1962, but they sure have disappeared quickly as machines have gotten bigger and faster. Fortunately the new RISC machines have a very appealing structure, so I've had a chance to design a new computer that is not only up to date but also fun.

Many readers are no doubt thinking, "Why does Knuth replace MIX by another machine instead of just sticking to a high-level programming language? Hardly anybody uses assemblers these days." Such people are entitled to their opinions, and they need not bother reading the machine-language parts of my books. But the reasons for machine language that I gave in the preface to Volume 1, written in the early 1960s, remain valid today:

• One of the principal goals of my books is to show how high-level constructions are actually implemented in machines, not simply to show how they are applied. I explain coroutine linkage, tree structures, random number generation, high-precision arithmetic, radix conversion, packing of data, combinatorial searching, recursion, etc., from the ground up.
• The programs needed in my books are generally so short that their main points can be grasped easily.
• People who are more than casually interested in computers should have at least some idea of what the underlying hardware is like. Otherwise the programs they write will be pretty weird.
• Machine language is necessary in any case, as output of some of the software that I describe.
• Expressing basic methods like algorithms for sorting and searching in machine language makes it possible to carry out meaningful studies of the effects of cache and RAM size and other hardware characteristics (memory speed, pipelining, multiple issue, lookaside buffers, the size of cache blocks, etc.) when comparing different schemes.

Moreover, if I did use a high-level language, what language should it be? In the 1960s I would probably have chosen Algol W; in the 1970s, I would then have had to rewrite my books using Pascal; in the 1980s, I would surely have changed everything to C; in the 1990s, I would have had to switch to C++ and then probably to Java. In the 2000s, yet another language will no doubt be de rigueur.
I cannot afford the time to rewrite my books as languages go in and out of fashion; languages aren't the point of my books, the point is rather what you can do in your favorite language. My books focus on timeless truths.

Therefore I will continue to use English as the high-level language in The Art of Computer Programming, and I will continue to use a low-level language to indicate how machines actually compute. Readers who only want to see algorithms that are already packaged in a plug-in way, using a trendy language, should buy other people's books.

The good news is that programming for MMIX is pleasant and simple. This fascicle presents

1) a programmer's introduction to the machine (replacing Section 1.3.1 of Volume 1);
2) the MMIX assembly language (replacing Section 1.3.2);
3) new material on subroutines, coroutines, and interpretive routines (replacing Sections 1.4.1, 1.4.2, and 1.4.3).

Of course, MIX appears in many places throughout Volumes 1–3, and dozens of programs need to be rewritten for MMIX. Readers who would like to help with this conversion process are encouraged to join the MMIXmasters, a happy group of volunteers based at mmixmasters.sourceforge.net.

I am extremely grateful to all the people who helped me with the design of MMIX. In particular, John Hennessy and Richard L. Sites deserve special thanks for their active participation and substantial contributions. Thanks also to Vladimir Ivanović for volunteering to be the MMIX grandmaster/webmaster.

Stanford, California
D. E. K.

You can, if you want, rewrite forever.
    NEIL SIMON, Rewrites: A Memoir (1996)

BASIC CONCEPTS

1.3. MMIX

In many places throughout this book we will have occasion to refer to a computer's internal machine language.
The machine we use is a mythical computer called "MMIX." MMIX (pronounced EM-micks) is very much like nearly every general-purpose computer designed since 1985, except that it is, perhaps, nicer. The language of MMIX is powerful enough to allow brief programs to be written for most algorithms, yet simple enough so that its operations are easily learned.

The reader is urged to study this section carefully, since MMIX language appears in so many parts of this book. There should be no hesitation about learning a machine language; indeed, the author once found it not uncommon to be writing programs in a half dozen different machine languages during the same week! Everyone with more than a casual interest in computers will probably get to know at least one machine language sooner or later. Machine language helps programmers understand what really goes on inside their computers. And once one machine language has been learned, the characteristics of another are easy to assimilate. Computer science is largely concerned with an understanding of how low-level details make it possible to achieve high-level goals.

Software for running MMIX programs on almost any real computer can be downloaded from the website for this book (see page ii). The complete source code for the author's MMIX routines appears in the book MMIXware [Lecture Notes in Computer Science 1750 (1999)]; that book will be called "the MMIXware document" in the following pages.

1.3.1'. Description of MMIX

MMIX is a polyunsaturated, 100% natural computer. Like most machines, it has an identifying number, the 2009. This number was found by taking 14 actual computers very similar to MMIX and on which MMIX could easily be simulated, then averaging their numbers with equal weight:

    (Cray I + IBM 801 + RISC II + Clipper C300 + AMD 29K + Motorola 88K
      + IBM 601 + Intel i960 + Alpha 21164 + POWER 2 + MIPS R4000
      + Hitachi SuperH4 + StrongARM 110 + Sparc 64)/14 = 28126/14 = 2009.    (1)

The same number may also be obtained in a simpler way by taking Roman numerals.

Bits and bytes. MMIX works with patterns of 0s and 1s, commonly called binary digits or bits, and it usually deals with 64 bits at a time. For example, the 64-bit quantity

    1001111000110111011110011011100101111111010010100111110000010110    (2)

is a typical pattern that the machine might encounter. Long patterns like this can be expressed more conveniently if we group the bits four at a time and use hexadecimal digits to represent each group. The sixteen hexadecimal digits are

    0 = 0000,   4 = 0100,   8 = 1000,   c = 1100,
    1 = 0001,   5 = 0101,   9 = 1001,   d = 1101,
    2 = 0010,   6 = 0110,   a = 1010,   e = 1110,
    3 = 0011,   7 = 0111,   b = 1011,   f = 1111.    (3)

We shall always use a distinctive typeface for hexadecimal digits, as shown here, so that they won't be confused with the decimal digits 0–9; and we will usually also put the symbol # just before a hexadecimal number, to make the distinction even clearer. For example, (2) becomes

    #9e3779b97f4a7c16    (4)

in hexadecimalese. Uppercase digits ABCDEF are often used instead of abcdef, because #9E3779B97F4A7C16 looks better than #9e3779b97f4a7c16 in some contexts; there is no difference in meaning.

A sequence of eight bits, or two hexadecimal digits, is commonly called a byte. Most computers now consider bytes to be their basic, individually addressable units of information; we will see that an MMIX program can refer to as many as 2^64 bytes, each with its own address from #0000000000000000 to #ffffffffffffffff. Letters, digits, and punctuation marks of languages like English are often represented with one byte per character, using the American Standard Code for Information Interchange (ASCII). For example, the ASCII equivalent of MMIX is #4d4d4958. ASCII is actually a 7-bit code with control characters #00–#1f, printing characters #20–#7e, and a "delete" character #7f [see CACM 8 (1965), 207–214; 11 (1968), 849–852; 12 (1969), 166–178].
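The grouping of bits into hexadecimal digits, and the one-byte-per-character ASCII encoding, can be checked mechanically. The following Python sketch is only an illustration and is not part of MMIX; the helper names bits_to_hex and ascii_hex are hypothetical, chosen here for the example.

```python
# Illustrative sketch only; the helper names are assumptions, not MMIX notation.

HEX_DIGITS = "0123456789abcdef"

# The 64-bit pattern (2) from the text.
PATTERN = "1001111000110111011110011011100101111111010010100111110000010110"


def bits_to_hex(bits: str) -> str:
    """Group a bit string four bits at a time and map each group to a hex digit."""
    if len(bits) % 4 != 0:
        raise ValueError("bit string length must be a multiple of 4")
    return "".join(HEX_DIGITS[int(bits[i:i + 4], 2)] for i in range(0, len(bits), 4))


def ascii_hex(text: str) -> str:
    """Return the ASCII bytes of text as hexadecimal digits, one byte per character."""
    return text.encode("ascii").hex()


if __name__ == "__main__":
    print("#" + bits_to_hex(PATTERN))  # pattern (2) in hexadecimal, as in (4)
    print("#" + ascii_hex("MMIX"))     # the ASCII equivalent of MMIX
```

Running the sketch prints the two hexadecimal strings, which can be compared by eye with (4) and with the ASCII equivalent of MMIX quoted above.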
ASCII was extended during the 1980s to an international standard 8-bit code known as Latin-1 or ISO 8859-1, thereby encoding accented letters: pâté is #70e274e9.

"Of the 256th squadron?"
"Of the fighting 256th Squadron," Yossarian replied.
"That's two to the fighting eighth power."
    JOSEPH HELLER, Catch-22 (1961)

A 16-bit code that supports nearly every modern language became an international standard during the 1990s. This code, known as Unicode or ISO/IEC 10646 UCS-2, includes not only Greek letters like Σ and σ (#03a3 and #03c3), Cyrillic letters like Щ and щ (#0429 and #0449), Armenian letters like Շ and շ (#0547 and #0577), Hebrew letters like ש (#05e9), Arabic letters like ش (#0634), and Indian letters like श (#0936) or শ (#09b6) or ଶ (#0b36) or ஷ (#0bb7), etc., but also tens of thousands of East Asian ideographs such as the Chinese character for mathematics and computing, 算 (#7b97). It even has special codes for Roman numerals: MMIX = #216f216f21602169. Ordinary ASCII or Latin-1 characters are represented by simply giving them a leading byte of zero: pâté is #007000e2007400e9, à l'Unicode.

Fig. 13. The MMIX computer, as seen by a programmer, has 256 general-purpose registers and 32 special-purpose registers, together with 2^64 bytes of virtual memory. Each register holds 64 bits of data.

[…] significant lg t bits of x when referring to M_t[x]. For completeness, we also write M_1[x] = M[x], and we define M[x] = M[x mod 2^64] when x < 0 or x ≥ 2^64.

The 32 special registers of MMIX are called rA, rB, ..., rZ, rBB, rTT, rWW, rXX, rYY, and rZZ. Like their general-purpose cousins, they each hold an octabyte. Their uses will be explained later; for example, we will see that rA controls arithmetic interrupts while rR holds the remainder after division.

Instructions.
MMIX's memory contains instructions as well as data. An instruction or "command" is a tetrabyte whose four bytes are conventionally called OP, X, Y, and Z. OP is the operation code (or "opcode," for short); X, Y, and Z specify the operands. For example, #20010203 is an instruction with OP = #20, X = #01, Y = #02, and Z = #03, and it means "Set $1 to the sum of $2 and $3." The operand bytes are always regarded as unsigned integers.

Each of the 256 possible opcodes has a symbolic form that is easy to remember. For example, opcode #20 is ADD. We will deal almost exclusively with symbolic opcodes; the numeric equivalents can be found, if needed, in Table 1 below, and also in the endpapers of this book.

The X, Y, and Z bytes also have symbolic representations, consistent with the assembly language that we will discuss in Section 1.3.2. For example, the instruction #20010203 is conventionally written 'ADD $1,$2,$3', and the addition instruction in general is written 'ADD $X,$Y,$Z'. Most instructions have three operands, but some of them have only two, and a few have only one. When there are two operands, the first is X and the second is the two-byte quantity YZ; the symbolic notation then has only one comma. For example, the instruction 'INCL $X,YZ' increases register $X by the amount YZ. When there is only one operand, it is the unsigned three-byte number XYZ, and the symbolic notation has no comma at all. For example, we will see that 'JMP @+4*XYZ' tells MMIX to find its next instruction by skipping ahead XYZ tetrabytes; the instruction 'JMP @+1000000' has the hexadecimal form #f003d090, because JMP = #f0 and 250000 = #03d090.

We will describe each MMIX instruction both informally and formally. For example, the informal meaning of 'ADD $X,$Y,$Z' is "Set $X to the sum of $Y and $Z"; the formal definition is 's($X) ← s($Y) + s($Z)'. Here s(x) denotes the signed integer corresponding to the bit pattern x, according to the conventions of two's complement notation.
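The instruction layout and the function s(x) described above can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not an MMIX tool: the function names decode, encode_forward_jmp, and s are hypothetical, and the JMP encoder handles only forward jumps with opcode #f0, the one case the text spells out.

```python
# Illustrative sketch only; function names are assumptions, not MMIX notation.


def decode(tetra: int) -> tuple:
    """Split a 32-bit instruction into its four bytes (OP, X, Y, Z)."""
    if not 0 <= tetra < 1 << 32:
        raise ValueError("an instruction is exactly one tetrabyte")
    return (tetra >> 24) & 0xFF, (tetra >> 16) & 0xFF, (tetra >> 8) & 0xFF, tetra & 0xFF


def encode_forward_jmp(byte_offset: int) -> int:
    """Encode 'JMP @+byte_offset' for a forward jump: OP = #f0, XYZ = offset in tetrabytes."""
    if byte_offset < 0 or byte_offset % 4 != 0 or byte_offset // 4 >= 1 << 24:
        raise ValueError("offset must be a nonnegative multiple of 4 that fits in XYZ")
    return (0xF0 << 24) | (byte_offset // 4)


def s(x: int, bits: int = 64) -> int:
    """Signed (two's complement) value of the bits-wide pattern x."""
    if not 0 <= x < 1 << bits:
        raise ValueError("pattern does not fit in the given width")
    return x - (1 << bits) if x >> (bits - 1) else x


if __name__ == "__main__":
    op, x, y, z = decode(0x20010203)
    print(f"OP=#{op:02x} X=#{x:02x} Y=#{y:02x} Z=#{z:02x}")
    print(f"#{encode_forward_jmp(1000000):08x}")
    print(s(0xFFFFFFFFFFFFFFFF))
```

The width parameter of s is a convenience of the sketch, so that the same function can interpret bytes, wydes, and tetrabytes as well as octabytes.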
An assignment like 's(x) ← N' means that x is to be set to the bit pattern for which s(x) = N. (Such an assignment causes integer overflow if N is too large or too small to fit in x. For example, an ADD will overflow if s($Y) + s($Z) is less than −2^63 or greater than 2^63 − 1. When we're discussing an instruction informally, we will often gloss over the possibility of overflow; the formal definition, however, will make everything precise. In general the assignment s(x) ← N sets x to the binary representation of N mod 2^n, where n is the number of bits in x, and it signals overflow if N < −2^(n−1) or N ≥ 2^(n−1); see exercise 5.)

Loading and storing. Although MMIX has 256 different opcodes, we will see that they fall into a few easily learned categories. Let's start with the instructions that transfer information between the registers and the memory.

Each of the following instructions has a memory address A obtained by adding $Y to $Z. Formally,

    A = (u($Y) + u($Z)) mod 2^64    (5)

is the sum of the unsigned integers represented by $Y and $Z, reduced to a 64-bit number by ignoring any carry that occurs at the left when those two integers are added. In this formula the notation u(x) is analogous to s(x), but it considers x to be an unsigned binary number.

• LDB $X,$Y,$Z (load byte): s($X) ← s(M_1[A]).
• LDW $X,$Y,$Z (load wyde): s($X) ← s(M_2[A]).
• LDT $X,$Y,$Z (load tetra): s($X) ← s(M_4[A]).
• LDO $X,$Y,$Z (load octa): s($X) ← s(M_8[A]).

These instructions bring data from memory into register $X, changing the data if necessary from a signed byte, wyde, or tetrabyte to a signed octabyte of the same value. For example, suppose the octabyte M_8[1002] = M_8[1000] is

    M[1000]M[1001] ... M[1007] = #0123456789abcdef.    (6)

Then if $2 = 1000 and $3 = 2, we have A = 1002, and

    LDB $1,$2,$3 sets $1 ← #0000000000000045;
    LDW $1,$2,$3 sets $1 ← #0000000000004567;
    LDT $1,$2,$3 sets $1 ← #0000000001234567;
    LDO $1,$2,$3 sets $1 ← #0123456789abcdef.

[…]

• STBU $X,$Y,$Z (store byte unsigned): u(M_1[A]) ← u($X) mod 2^8.
• STWU $X,$Y,$Z (store wyde unsigned): u(M_2[A]) ← u($X) mod 2^16.
• STTU $X,$Y,$Z (store tetra unsigned): u(M_4[A]) ← u($X) mod 2^32.
• STOU $X,$Y,$Z (store octa unsigned): u(M_8[A]) ← u($X).

These instructions have exactly the same effect on memory as their signed counterparts STB, STW, STT, and STO, but overflow never occurs.

• STHT $X,$Y,$Z (store high tetra): u(M_4[A]) ← ⌊u($X)/2^32⌋.
  The left half of register $X is stored in memory tetrabyte M_4[A].
• STCO X,$Y,$Z (store constant octabyte): u(M_8[A]) ← X.
  A constant between 0 and 255 is stored in memory octabyte M_8[A].

Arithmetic operators. Most of MMIX's operations take place strictly between registers. We might as well begin our study of the register-to-register operations by considering addition, subtraction, multiplication, and division, because computers are supposed to be able to compute.

• ADD $X,$Y,$Z (add): s($X) ← s($Y) + s($Z).
• SUB $X,$Y,$Z (subtract): s($X) ← s($Y) − s($Z).
• MUL $X,$Y,$Z (multiply): s($X) ← s($Y) × s($Z).
• DIV $X,$Y,$Z (divide): s($X) ← ⌊s($Y)/s($Z)⌋ [$Z ≠ 0], and s(rR) ← s($Y) mod s($Z).

Sums, differences, and products need no further discussion. The DIV command forms the quotient and remainder as defined in Section 1.2.4; the remainder goes into the special remainder register rR, where it can be examined by using the instruction GET $X,rR described below. If the divisor $Z is zero, DIV sets $X ← 0 and rR ← $Y (see Eq. 1.2.4-(1)); an "integer divide check" also occurs.

• ADDU $X,$Y,$Z (add unsigned): u($X) ← (u($Y) + u($Z)) mod 2^64.
• SUBU $X,$Y,$Z (subtract unsigned): u($X) ← (u($Y) − u($Z)) mod 2^64.
• MULU $X,$Y,$Z (multiply unsigned): u(rH $X) ← u($Y) × u($Z).
• DIVU $X,$Y,$Z (divide unsigned): u($X) ← ⌊u(rD $Y)/u($Z)⌋, u(rR) ← u(rD $Y) mod u($Z), if u($Z) > u(rD); otherwise $X ← rD, rR ← $Y.

Arithmetic on unsigned numbers never causes overflow. A full 16-byte product is formed by the MULU command, and the upper half goes into the special himult register rH.
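The formal definitions of DIV and MULU can be tried out with ordinary integers. The Python sketch below is a rough illustration, not a simulator: the names div and mulu are hypothetical, registers are modeled as plain Python integers, and neither overflow nor the "integer divide check" is modeled. It assumes that the quotient is the floor shown in the DIV definition and that the remainder is whatever is left after that floor, which is how Python's // and % behave.

```python
# Illustrative sketch only; names are assumptions. Registers are modeled as Python ints.

MASK64 = (1 << 64) - 1


def div(y: int, z: int) -> tuple:
    """Signed DIV on integer values: return (quotient for $X, remainder for rR).

    Uses floor division, so the remainder takes the sign of the divisor.
    A zero divisor gives (0, y), as the text describes; the accompanying
    "integer divide check" is not modeled here, nor is overflow.
    """
    if z == 0:
        return 0, y
    return y // z, y % z


def mulu(y: int, z: int) -> tuple:
    """Unsigned MULU on 64-bit patterns: return (rH, $X), the two halves of the product."""
    product = (y & MASK64) * (z & MASK64)
    return product >> 64, product & MASK64


if __name__ == "__main__":
    print(div(-7, 2))   # quotient is rounded toward minus infinity
    print(div(5, 0))    # zero divisor: quotient 0, remainder is the dividend
    rh, x = mulu(MASK64, 3)
    print(f"rH=#{rh:016x} $X=#{x:016x}")
```

Returning the pair (rH, $X) from mulu mirrors the notation u(rH $X), in which the two registers together hold one 128-bit result.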
For example, when the unsigned number #9e3779b97f4a7c16 in (2) and (4) above is multiplied by itself we get

    rH ← #61c8864680b583ea,   $X ← #1bb32095ccdd51e4.    (7)

In this case the value of rH has turned out to be exactly 2^64 minus the original number #9e3779b97f4a7c16; this is not a coincidence! The reason is that (2) actually gives the first 64 bits of the binary representation of the golden ratio φ^−1 = φ − 1, if we place a binary radix point at the left. (See Table 2 in Appendix A.) Squaring gives us an approximation to the binary representation of φ^−2 = 1 − φ^−1, with the radix point now at the left of rH.

Division with DIVU yields the 8-byte quotient and remainder of a 16-byte dividend with respect to an 8-byte divisor. The upper half of the dividend appears in the special dividend register rD, which is zero at the beginning of a program; this register can be set to any desired value with the command PUT rD,$Z described below. If rD is greater than or equal to the divisor, DIVU $X,$Y,$Z simply sets $X ← rD and rR ← $Y. (This always occurs when $Z is zero.) But DIVU never causes an integer divide check.

The ADDU instruction computes a memory address A, according to definition (5); therefore, as discussed earlier, we sometimes give ADDU the alternative name LDA. The following related commands also help with address calculation.

• 2ADDU $X,$Y,$Z (times 2 and add unsigned): u($X) ← (u($Y) × 2 + u($Z)) mod 2^64.
• 4ADDU $X,$Y,$Z (times 4 and add unsigned): u($X) ← (u($Y) × 4 + u($Z)) mod 2^64.
• 8ADDU $X,$Y,$Z (times 8 and add unsigned): u($X) ← (u($Y) × 8 + u($Z)) mod 2^64.
• 16ADDU $X,$Y,$Z (times 16 and add unsigned): u($X) ← (u($Y) × 16 + u($Z)) mod 2^64.

It is faster to execute the command 2ADDU $X,$Y,$Y than to multiply by 3, if overflow is not an issue.

• NEG $X,Y,$Z (negate): s($X) ← Y − s($Z).
• NEGU $X,Y,$Z (negate unsigned): u($X) ← (Y − u($Z)) mod 2^64.

In these commands Y is simply an unsigned constant, not a register number (just as X was an unsigned constant in the STCO instruction).
Usually Y is zero, in which case we can write simply NEG $X,$Z or NEGU $X,$Z.

• SL $X,$Y,$Z (shift left): s($X) ← s($Y) × 2^u($Z).
• SLU $X,$Y,$Z (shift left unsigned): u($X) ← (u($Y) × 2^u($Z)) mod 2^64.
• SR $X,$Y,$Z (shift right): s($X) ← ⌊s($Y)/2^u($Z)⌋.
• SRU $X,$Y,$Z (shift right unsigned): u($X) ← ⌊u($Y)/2^u($Z)⌋.

SL and SLU both produce the same result in $X, but SL might overflow while SLU never does. SR extends the sign when shifting right, but SRU shifts zeros in from the left. Therefore SR and SRU produce the same result in $X if and only if $Y is nonnegative or $Z is zero. The SL and SR instructions are much faster than MUL and DIV by powers of 2. An SLU instruction is much faster than MULU by a power of 2, although it does not affect rH as MULU does. An SRU instruction is much faster than DIVU by a power of 2, although it is not affected by rD. The notation y ≪ z is often used to denote the result of shifting a binary value y to the left by z bits; similarly, y ≫ z denotes shifting to the right.

• CMP $X,$Y,$Z (compare): s($X) ← [s($Y) > s($Z)] − [s($Y) < s($Z)].
• CMPU $X,$Y,$Z (compare unsigned): s($X) ← [u($Y) > u($Z)] − [u($Y) < u($Z)].

These instructions each set $X to either −1, 0, or 1, depending on whether register $Y is less than, equal to, or greater than register $Z.

[…]

• MUX $X,$Y,$Z (bitwise multiplex): v($X) ← (v($Y) ∧ v(rM)) ∨ (v($Z) ∧ v̄(rM)).
  The MUX operation combines two bit vectors by looking at the special multiplex mask register rM, choosing bits of $Y where rM is 1 and bits of $Z where rM is 0.
• SADD $X,$Y,$Z (sideways add): s($X) ← s(Σ(v($Y) ∧ v̄($Z))).
  The SADD operation counts the number of bit positions in which register $Y has a 1 while register $Z has a 0.

Bytewise operations. Similarly, we can regard an octabyte x as a vector b(x) of eight individual bytes, each of which is an integer between 0 and 255; or we can think of it as a vector w(x) of four individual wydes, or a vector t(x) of two unsigned tetras. The following operations deal with all components at once.

• BDIF $X,$Y,$Z (byte difference): b($X) ← b($Y) ∸ b($Z).
• WDIF $X,$Y,$Z (wyde difference): w($X) ← w($Y) ∸ w($Z).
• TDIF $X,$Y,$Z (tetra difference): t($X) ← t($Y) ∸ t($Z).
• ODIF $X,$Y,$Z (octa difference): u($X) ← u($Y) ∸ u($Z).

Here ∸ denotes the operation of saturating subtraction,

    y ∸ z = max(0, y − z).    (9)

These operations have important applications to text processing, as well as to computer graphics (when the bytes or wydes represent pixel values). Exercises 27–30 discuss some of their basic properties.

We can also regard an octabyte as an 8 × 8 Boolean matrix, that is, as an 8 × 8 array of 0s and 1s. Let m(x) be the matrix whose rows from top to bottom are the bytes of x from left to right; and let m^T(x) be the transposed matrix, whose columns are the bytes of x. For example, if x = #9e3779b97f4a7c16 is the octabyte (2), we have

            1 0 0 1 1 1 1 0                1 0 0 1 0 0 0 0
            0 0 1 1 0 1 1 1                0 0 1 0 1 1 1 0
            0 1 1 1 1 0 0 1                0 1 1 1 1 0 1 0
    m(x) =  1 0 1 1 1 0 0 1 ,   m^T(x) =   1 1 1 1 1 0 1 1 .    (10)
            0 1 1 1 1 1 1 1                1 0 1 1 1 1 1 0
            0 1 0 0 1 0 1 0                1 1 0 0 1 0 1 1
            0 1 1 1 1 1 0 0                1 1 0 0 1 1 0 1
            0 0 0 1 0 1 1 0                0 1 1 1 1 0 0 0

This interpretation of octabytes suggests two operations that are quite familiar to mathematicians, but we will pause a moment to define them from scratch. If A is an m × n matrix and B is an n × s matrix, and if ∘ and • are binary operations, the generalized matrix product A ∘• B is the m × s matrix C defined by

    C_ij = (A_i1 • B_1j) ∘ (A_i2 • B_2j) ∘ ··· ∘ (A_in • B_nj)    (11)

for 1 ≤ i ≤ m and 1 ≤ j ≤ s.

[…]

(The symbol @ denotes the place where we're "at.")

[…]

• BP $X,RA (branch if positive): if s($X) > 0, set @ ← RA.
• BOD $X,RA (branch if odd): if s($X) mod 2 = 1, set @ ← RA.
• BNN $X,RA (branch if nonnegative): if s($X) ≥ 0, set @ ← RA.
• BNZ $X,RA (branch if nonzero): if $X ≠ 0, set @ ← RA.
• BNP $X,RA (branch if nonpositive): if s($X) ≤ 0, set @ ← RA.
• BEV $X,RA (branch if even): if s($X) mod 2 = 0, set @ ← RA.
A branch instruction is a conditional jump that depends on the contents of register $X. The range of destination addresses RA is more limited than it was with JMP, because only two bytes are available to express the relative offset; but still we can branch to any tetrabyte between @ − 2^18 and @ + 2^18 − 4.

• PBN $X,RA (probable branch if negative): if s($X) < 0, set @ ← RA.
• PBZ $X,RA (probable branch if zero): if $X = 0, set @ ← RA.
• PBP $X,RA (probable branch if positive): if s($X) > 0, set @ ← RA.
• PBOD $X,RA (probable branch if odd): if s($X) mod 2 = 1, set @ ← RA.
• PBNN $X,RA (probable branch if nonnegative): if s($X) ≥ 0, set @ ← RA.