









Arithmetic coding is a data compression technique that encodes data (the data string) by creating a code string which represents a fractional value on the number line between 0 and 1. The coding algorithm is symbolwise recursive; i.e., it operates upon and encodes (decodes) one data symbol per iteration or recursion. On each recursion, the algorithm successively partitions an interval of the number line between 0 and 1, and retains one of the partitions as the new interval. Thus, the algorithm successively deals with smaller intervals, and the code string, viewed as a magnitude, lies in each of the nested intervals. The data string is recovered by using magnitude comparisons on the code string to recreate how the encoder must have successively partitioned and retained each nested subinterval. Arithmetic coding differs considerably from the more familiar compression coding techniques, such as prefix (Huffman) codes.










An Introduction to Arithmetic Coding
Arithmetic coding maps a string of data (source) symbols to a code string in such a way that the original data can be recovered from the code string. The encoding and decoding
One recursion of the algorithm handles one data symbol.
the property of treating the code string as a magnitude. For a brief history of the development of arithmetic coding, refer to Appendix 1.
The notion of compression systems captures the idea that data may be transformed into something which is encoded, then transmitted to a destination, then transformed back into the original data. Any data compression approach, whether employing arithmetic coding, Huffman codes, or any other cod-
The code itself can be independent of the model. Some systems which compress waveforms (e.g., digitized speech) may predict the next value and encode the error. In this model the error and not the actual data is encoded. Typically, at the encoder side of a compression system, the data to be compressed feed a model unit. The model determines 1) the event(s) to be encoded, and 2) the estimate of the relative
frequency (probability) of the events. The encoder accepts the event and some indication of its relative frequency and generates the code string.
A simple model is the memoryless model, where the data symbols themselves are encoded according to a single code. Another model is the first-order Markov model, which uses
Consider, for example, compressing English sentences. If the data symbol (in this case, a letter) “q” is the previous letter, we would expect the next letter to be “u.” The first-order
expectation for each symbol (or in the example, each letter), depending on the context. The context is, in a sense, a state governed by the past sequence of symbols. The purpose of a
for encoding (decoding) the next symbol.
Corresponding to the symbols are statistics. To simplify the discussion, consider a single-context model, i.e., the memoryless model. Data compression results from encoding the more-frequent symbols with short code-string length increases, and encoding the less-frequent events with long code length in-
denote the length (in bits) of the code-string increase associated
© Copyright 1984 by International Business Machines Corporation. Copying in printed form for private use is permitted without payment of royalty provided that (1) each reproduction is done without alteration and (2) the Journal reference and IBM copyright notice are included on the first page. The title and abstract, but no other portions, of this paper may be copied or distributed royalty free without further permission by computer-based and other information-service systems. Permission to republish any other portion of this paper must be obtained from the Editor.
IBM J. RES. DEVELOP. VOL. 28 NO. 2 MARCH 1984
GLEN G. LANGDON, JR.
The encoder accepts the events to be encoded and generates the code string.
Table 1
Symbol  Codeword     Probability p  Cumulative
        (in binary)                 probability P
a       0            .100           .000
b       10           .010           .100
c       110          .001           .110
d       111          .001           .111
data string is obtained by replacing each data symbol with its associated length and summing the lengths:
$\sum_i c_i \ell_i$
values of $c_i$), the given code will almost surely fail to achieve
long length $\ell$) lead to a code string which may have more bits than the original data. For compression it is imperative to closely approximate the relative frequency of the more-fre-
value, the best we can compress (according to our given model)
the base 2 and the unit of length is the bit. Knowing the ideal
the given data string and memoryless model by replacing each
and summing the lengths.
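The summation described above can be sketched in a few lines. This is a minimal illustration, not taken from the paper: it assumes the ideal length of a symbol of probability p is −log2 p bits (the standard information-theoretic value), estimates p from the relative frequencies in the string itself, and uses a hypothetical eight-symbol string and a hypothetical function name `ideal_code_length`.

```python
from collections import Counter
from math import log2


def ideal_code_length(data: str) -> float:
    """Ideal code-string length in bits under a memoryless model.

    Assumption: each symbol contributes -log2(p) bits, where p is the
    symbol's relative frequency in the data string itself.
    """
    counts = Counter(data)
    total = len(data)
    # Replace each data symbol by its ideal length and sum the lengths.
    return sum(-log2(counts[sym] / total) for sym in data)


# Hypothetical data string with relative frequencies 1/2, 1/4, 1/8, 1/8.
bits = ideal_code_length("aaaabbcd")
print(bits)
```

With these frequencies the ideal lengths are 1, 2, 3, and 3 bits, which coincide with the integer codeword lengths of a prefix code for the same frequencies.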
Let us now review the components of a compression system: the model structure for contexts and events, the statistics unit for estimation of the event statistics, and the encoder.
In practice, the model is a finite-state machine which operates successively on each data symbol and determines the current event to be encoded and its context (i.e., which relative frequency distribution applies to the current event). Often, each event is the data symbol itself, but the structure can define other events from which the data string could be
as the runlength of a succession of repeated symbols, i.e., the number of times the current symbol repeats itself.
The estimation method computes the relative frequency dis-
man codes, the event statistics are predetermined by the length of the event’s codeword.
The notions of model structure and statistics are important because they completely determine the available compression. Consider applications where the compression model is com-
string statistics. Due to the flexibility of arithmetic coding, for such applications the “compression problem” is equivalent to the “modeling problem.”
We now list some properties for which arithmetic coding is amply suited.
property: Events are decoded in the same order as they are
cult.
We desire no more than a small storage buffer at the encoder. Once events are encoded, we do not want the encoding of subsequent events to alter what has already been generated.
The encoding algorithm should be capable of accepting successive events from different probability distributions. Arithmetic coding has this capability. Moreover, the code acts directly on the probabilities, and can adapt “on the fly” to changing statistics. Traditional Huffman codes require the design of a different codeword set for different statistics.
We progress to a very simple arithmetic code by first using a prefix (Huffman) code as an example. Our purpose is to introduce the basic notions of arithmetic codes in a very simple setting.
Consider a four-symbol alphabet, for which the relative frequencies 1/2, 1/4, 1/8, and 1/8 call for respective codeword lengths
to relative frequency, and use the code of Table 1. The probability column has the binary fraction associated with the probability corresponding to the assigned length.
property (no codeword is the prefix of another). Decoding is
Figure 3 Subdivision of unit interval for arithmetic code of Table 2.
Table 2 Arithmetic code example.
Symbol  Cumulative     Symbol         Length
        probability P  probability p
d       .000           .001           3
b       .001           .010           2
a       .011           .100           1
c       .111           .001           3
W of the current interval and the cumulative probability P,
New C = Current C + (A × $P_i$). For example, after encoding “a a,” the current code point C
the current interval, and the factor on the right is the cumulative probability P for symbol “b”; see the “Cumulative probability” column of Table 1.
The width A of the current interval is the product of the
new interval width is New A = Current A × $p_i$,
ing “a a b,” the interval width is (.1) × (.1) × (.01), which is .0001.
In summary, we can systematically calculate the next inter-
interval, given the probability p and cumulative probability P
ing, and this particular version is characteristic of the class of FIFO arithmetic codes which use the symbol probabilities directly.
The Huffman code of Table 1 corresponds to a special integer-length arithmetic code. With arithmetic codes we can rearrange the symbols and forsake the notion of a k-bit codeword for a symbol corresponding to a probability of $2^{-k}$. We
retain the important technique of the double recursion. Consider the arrangement of Table 2. The “codeword” corre-
bols in the ordering.
The subdivision of the unit interval for Table 2, and for the
have the prefix property of Huffman codes. Compare Figs. 2
conform with the new ordering in Table 2.
reinforces the double recursion operations, where the new values become the current values for the next recursion. It is helpful to understand the arithmetic provided here, using the
[.011, .111), as follows:
(Current code point plus current width A times P.) A: New interval width A = 1 × (.1) = .1. (Current width A times probability p.)
× P added to the old code point C, the augend.
C: New code point = .011 + .1 × (.011) = .1001.
  .011   (current code point)
+ .0011  (current width A times P, or augend)
  .1001  (new code point)
(Current width A times probability p.)
Now the remaining interval is one-fourth the width of the unit interval.
C: New code point = .1001 + .01 × (.001) = .10011.
  .1001   (current code point C)
  .10011  (new code point)
(Current width A times probability p.)
A : New interval width A = .01 X (.01) = .0001.
[.10011, .10101).
C: New code point = .10011 + .0001 × (.111) = .1010011.
  .10011  (current code point)
(Current width A times probability p.)
The encoding of the fourth symbol exemplifies a small problem, called the carry-over problem. After encoding symbols
string using Huffman coding would not change. However, in this arithmetic code, the encoding of symbol “c” changed the value of the third code-string bit. (The first three bits changed
because we are basically adding quantities to the code string. We discuss carry-over control later on in the paper.
[.1010011, .1010100). If we were to terminate the code string at this point (no more data symbols to handle), any value
would serve to identify the interval.
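The double recursion used in this example can be reproduced with exact rational arithmetic. The following is a minimal sketch, not the paper's implementation: the function names `encode` and `to_binary` are hypothetical, the Table 2 values are written out in full in the ordering d, b, a, c, and unlimited precision is assumed (the precision and carry-over issues are deliberately ignored here).

```python
from fractions import Fraction

# Table 2: symbol -> (cumulative probability P, symbol probability p).
TABLE_2 = {
    "d": (Fraction(0), Fraction(1, 8)),     # P = .000, p = .001
    "b": (Fraction(1, 8), Fraction(1, 4)),  # P = .001, p = .010
    "a": (Fraction(3, 8), Fraction(1, 2)),  # P = .011, p = .100
    "c": (Fraction(7, 8), Fraction(1, 8)),  # P = .111, p = .001
}


def encode(data, table):
    """Return (C, A): the code point and width of the final interval."""
    C, A = Fraction(0), Fraction(1)
    for sym in data:
        P, p = table[sym]
        C = C + A * P  # New C = Current C + (A x P)
        A = A * p      # New A = Current A x p
    return C, A


def to_binary(x, bits):
    """Render a fraction in [0, 1) as a binary string with `bits` digits."""
    digits = []
    for _ in range(bits):
        x *= 2
        digit = int(x)
        digits.append(str(digit))
        x -= digit
    return "." + "".join(digits)


C, A = encode("aabc", TABLE_2)
print(to_binary(C, 7), to_binary(C + A, 7))  # .1010011 .1010100
```

The printed pair is the final interval [C, C + A) of the worked example; any value inside it identifies the data string "a a b c" to a decoder that knows the string length.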
Let us overview the example. In our creation of code string .1010011, we in effect added properly scaled cumulative probabilities P, called augends, to the code string. For the width recursion on A, the interval widths are, fortuitously, negative integral powers of two, which can be represented as floating point numbers with one bit of precision. Multiplication by a negative integral power of two may be performed by a shift
sum of augends, which displays the scaling by a right shift:
  .011
  .0011
  .00001
  .0000111
  --------
  .1010011
Let us retain code string .1010011 and decode it. Basically, the code string tells the decoder what the encoder did. In a sense, the decoder recursively “undoes” the encoder’s recursion. If, for the first data symbol, the encoder had encoded a “b,” then (referring to the cumulative probability P column of Table 2), the code-string value would be at least .001 but
.1010011 lies in [.011, .111), which is a’s subinterval. We can summarize this step as follows.
determine the interval in which it lies. Decode the symbol corresponding to that interval.
Since the second subinterval code point was obtained at the encoder by adding something to .011, we can prepare to decode the second symbol by subtracting .011 from the code
augend value of the code point for the decoded symbol.
Also, since the values for the second subinterval were adjusted by multiplying by .1 in the encoder A recursion, we can “undo” that multiplication by multiplying the remaining value of the code string by 2. Our code string is now .100011. In summary, we have Step 3.
value A.
Now we can decode the second symbol from the adjusted code string .100011 by dealing directly with the values in
symbol, because the adjusted code string is greater than .011
we obtain
the encoder, so the rescaled code string is obtained by doubling
The third symbol is decoded as follows.
symbol, we see that .01011 is equal to or greater than ,
by .01, which is undone by rescaling with a 2-bit shift:
.00111 becomes .111.
sufficient. The fourth symbol is decoded as “c,” whose code point corresponds to the remaining code string.
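The three decoding steps (compare, subtract the augend, rescale) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the paper's procedure verbatim: the function name `decode` is hypothetical, the number of symbols is assumed to be known to the decoder, exact fractions stand in for the shifted code string, and rescaling is written as a division by p, which for the widths of Table 2 is the same as the left shift described above.

```python
from fractions import Fraction

# Table 2: symbol -> (cumulative probability P, symbol probability p).
TABLE_2 = {
    "d": (Fraction(0), Fraction(1, 8)),
    "b": (Fraction(1, 8), Fraction(1, 4)),
    "a": (Fraction(3, 8), Fraction(1, 2)),
    "c": (Fraction(7, 8), Fraction(1, 8)),
}


def decode(code, table, count):
    """Decode `count` symbols from a code-string value in [0, 1)."""
    out = []
    for _ in range(count):
        # Step 1: magnitude comparison; find the subinterval [P, P + p)
        # of the unit interval in which the adjusted code string lies.
        for sym, (P, p) in table.items():
            if P <= code < P + p:
                break
        else:
            raise ValueError("code value outside the unit interval")
        out.append(sym)
        # Step 2: subtract the augend P of the decoded symbol.
        # Step 3: rescale, undoing the encoder's multiplication by p.
        code = (code - P) / p
    return "".join(out)


print(decode(Fraction(83, 128), TABLE_2, 4))  # .1010011 -> aabc
```

The intermediate values of `code` are .100011, .01011, and .111, the same adjusted code strings that appear in the walkthrough.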
in.” If MIN is true (T), the event to be encoded is the more
the less probable. The decoder result is binary variable MOUT,
Similarly, at the decoder side, output value MOUT is true (T) only when the decoded event is the more probable.
data usually represent bits from the real world. Here, we leave to a statistics unit the determination of event values T or F.
Consider, for example, a black and white image of two-valued pels (picture elements) which has a primary white background. For these data we associate the instances of a white pel value to the “more probable” value (T) and a black pel value into the “less probable” value (F). The statistics unit would thus have an internal variable, MVAL, indicating that
a black background, the mapping of values black and white
a more complex model, if the same black and white image had areas of white background interspersed with neighborhoods of black, the mapping of pel values black/white to event values F and T could change dynamically in accordance with the context (neighborhood) of the pel location. In a black context, the black pel would be value T, whereas in the context of a white neighborhood the black pel would be value F.
The statistics unit must determine the additional informa-
F. The BAC coder requires us to estimate the relative ratio of
through 12, to indicate the relative frequency of value F. In a crude sense, we select one of 12 “codes” for each event to be encoded or decoded. By approximating to 12 skew values, instead of using a continuum of values, the maximum loss in coding efficiency is less than 4 percent of the original file size
loss at higher skew numbers is even less; see [2].
In what follows, our concern is how to code binary events after the relative frequencies have been estimated.
appears in the BAC algorithm as a recursion on variables C (for code point) and A (for available space). The BAC algo-
The BAC coder successively splits the width or size of the
vals. The left subinterval is associated with F and the right subinterval with T. Variables C and A jointly describe the current interval as, respectively, the leftmost point and the width. As with the initial code space, the current interval is closed on the left and open on the right: [C, C + A).
In the BAC, not all interval widths are integral negative powers of two. For example, where p of event F is 1/4, the other
problem. We solve the problem by representing space A with a floating point number to a fixed precision. We introduce variable E for the exponent, which controls the “data handling” and “shifting” aspect of the algorithm. We represent variable A in floating point with the most significant bit of A in position E from the left. Thus the leading 1-bit of the binary representation of A has value $2^{-E}$. For example, if A =
width is determined by a multiplication. In the simple BAC algorithm, the smaller width is determined by the value SK, as in Eq. (1), which follows. The other width is the difference between the current width and the smaller width, as in Eq. (2), which follows. No multiplication is needed.
as follows. If SK is 1, the interval is split nearly in half, and if SK is 12, the interval is split with a very small subinterval
interval in a way which corresponds to the relative frequency of each event. Let W(F) and W(T) be respectively the subinterval widths assigned to F and to T. Specifically,
$W(F) = 2^{-(E+SK)}$, (1)
with the remainder of the interval width A assigned to T:
$W(T) = A - W(F)$. (2)
We can summarize the handling of an event (value T or F) in the BAC algorithm in three steps. The first and second steps correspond to the A and C recursions described earlier. The third step is a “data handling” or scaling step which we have ignored in the earlier examples. Let s denote the string of data symbols already encoded, and let notation C(s), A(s), and E(s)
ing the encoding of the data string. Now, after handling the
respectively denoted C(s,T), A(s,T), and E(s,T).
Table 3 Example encoding: refining the interval.
Event    MIN      SK      E              W(F)       C           A
         (value)  (skew)  (A's lead 0s)  (F width)  (least pt)  (interval width)
Initial  -        -       0              -          0.000000    1.000000
1        T        3       0              .001       0.001000    0.111000
2        T        1       1              .01        0.011000    0.101000
3        F        1       1              .01        0.011000    0.010000
4        T        1       2              .001       0.100000    0.001000
Figure 5 Interval splitting: subdivision for Event 1, Table 3.
Step 1 Given skew SK and E (the leading 0s of A), subdivide
describe the new interval:
If T: $C(s,T) = C(s) + W(F)$ and $A(s,T) = W(T)$. (3a)
If F: $C(s,F) = C(s)$ and $A(s,F) = W(F)$. (3b)
of E
If T: If $A(s,T) < 2^{-E(s)}$, then $E(s,T) = E(s) + 1$; otherwise $E(s,T) = E(s)$.
If F: $E(s,F) = E(s) + SK$.
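The three steps can be collected into one function. The sketch below is illustrative only: the name `bac_step` is hypothetical, exact fractions replace the fixed-precision registers, the smaller width is taken as 2^-(E+SK) with the remainder assigned to T, and the event string T, T, F, T is run under assumed skews 3, 1, 1, 1.

```python
from fractions import Fraction


def bac_step(C, A, E, event, SK):
    """Handle one binary event; return the new (C, A, E)."""
    # Step 1: subdivide the current interval without a multiplication.
    W_F = Fraction(1, 2 ** (E + SK))  # smaller width, Eq. (1)
    W_T = A - W_F                     # remainder of the interval
    # Step 2: keep the right subinterval for T, the left one for F.
    # Step 3: update E, the number of leading 0s of A.
    if event == "T":
        C_new, A_new = C + W_F, W_T
        E_new = E + 1 if A_new < Fraction(1, 2 ** E) else E
    else:
        C_new, A_new = C, W_F
        E_new = E + SK
    return C_new, A_new, E_new


# Event string T, T, F, T with assumed skews 3, 1, 1, 1.
C, A, E = Fraction(0), Fraction(1), 0
rows = []
for event, SK in [("T", 3), ("T", 1), ("F", 1), ("T", 1)]:
    C, A, E = bac_step(C, A, E, event, SK)
    rows.append((C, A, E))

for row in rows:
    print(row)
```

After the four events the interval is [1/2, 5/8), i.e., C = .100 and A = .001 in binary. Only shifts (powers of two), one addition, and one subtraction are needed per event.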
We continue the discussion by an example, where we encode the four-event string T, T, F, T under respective skews
following description accompanies this table.
Relative to Step 2, Eq. (3), the subdivision point is C + W(F) or 0 + .001 = .001. Since the binary value is T and the relative frequency of the T event is equal to or greater than 1/2, we keep the larger (rightmost) subinterval. Referring to Fig. 5, we see
now C(T) = 0.001 and A(T) = W(T) = 0.111. For Step 3, we
The subdivision point of the current interval is C + W(F), or
interval [.011,1) of width .101. The smaller width W(F) is
C + W(F), or subdivision point .101. See Fig. 6(a). Referring
now keep the left side of the subdivision. By keeping the F
becomes 2. The resulting interval is shown in Fig. 6(b).
Arithmetic codes generate the code string by adding a sum-
the current code string and possibly shifting the result. The
long string of 1s from the coding process. An addition could propagate a carry into the long string of 1s, changing the
In this section we show how the arithmetic using A and C can be separated from the carry-over handling and data-
Table 4 Example encoding with normalization.
Event    MIN  SK  Q    C      A      Normalization
Initial  -    -   -    0.000  1.000  -
1        T    3   0    0.010  1.110  Yes
2        T    1   0    0.110  1.010  No
3        F    1   00   1.100  1.000  F-shift of SK
4        T    1   010  0.000  1.000  Yes
$Q,C \leftarrow Q,C + 2^{-SK}$, (4a)
with the double recursion. If the result in A is less than 1.000,
pair, and shift left A. Let “shl” denote shift left one bit, and “shl$^2$” denote a shift left of two bits, etc. If A is less than 1.000, then
$A \leftarrow \mathrm{shl}\ A,0$. (5b)
In the above, “,0” denotes “0-fill” (the vacated positions are filled with 0s).
action to perform is relatively simple:
$Q,C \leftarrow \mathrm{shl}^{SK}\ Q,C,0$, (6a)
$A \leftarrow 1.0$. (6b)
MIN and SK values of that step. The first row is the initialization.
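The register operations of Eqs. (4) through (6) can be simulated with small integers. This is a sketch under stated assumptions, not the paper's hardware description: C and A are assumed to be four-bit registers with three fractional bits (so 1.000 is the integer 8), Q is modeled as a Python list of bits, skews are limited to 3 by that register width, and the helper names `add_carry`, `shift_left`, and `encode_event` are hypothetical.

```python
ONE = 0b1000    # the register value 1.000 (three fractional bits assumed)
MASK = 0b1111   # four-bit C register


def add_carry(q):
    """Propagate a carry-over from C into the bits already shifted into Q."""
    i = len(q) - 1
    while i >= 0 and q[i] == 1:
        q[i] = 0
        i -= 1
    if i < 0:
        raise OverflowError("carry ran past the start of the code string")
    q[i] = 1


def shift_left(q, c):
    """Shift the pair Q,C left one bit with 0-fill; return the new C."""
    q.append((c >> 3) & 1)  # the top bit of C moves into Q
    return (c << 1) & MASK


def encode_event(q, c, a, event, sk):
    """Encode one event; Q is updated in place, new (C, A) are returned."""
    if event == "T":
        c += ONE >> sk          # Eq. (4a): add 2^-SK to Q,C
        a -= ONE >> sk          # Eq. (4b): subtract 2^-SK from A
        if c > MASK:            # carry out of the C register
            c &= MASK
            add_carry(q)
        if a < ONE:             # Eq. (5): normalize with a one-bit shift
            c = shift_left(q, c)
            a <<= 1
    else:
        for _ in range(sk):     # Eq. (6a): shift Q,C left SK bits
            c = shift_left(q, c)
        a = ONE                 # Eq. (6b): reinitialize A to 1.000
    return c, a


q, c, a = [], 0, ONE
trace = []
for event, sk in [("T", 3), ("T", 1), ("F", 1), ("T", 1)]:
    c, a = encode_event(q, c, a, event, sk)
    trace.append(("".join(map(str, q)), format(c, "04b"), format(a, "04b")))

print(trace)
```

Each trace entry holds Q, C, and A after one event, with C and A printed as four bits (the binary point follows the first bit). The fourth event produces the carry-over from C into Q that turns Q = 00 into 01 before the normalization shift appends a 0.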
an SK of 3. The arithmetic result for Eq. (4a) is C = 0.000 +
A is 1.110.
  0.010  (old C)
+ .1     ($2^{-1}$)
  0.110  (new C)
Equation (4b) gives 1.010 for A:
  1.110  (old A)
- .1     ($-2^{-1}$)
  1.010  (new A)
Since the register A result is greater than 1.0, the normalization
encoding an F is Eq. (6). The value F is encoded by shifting
so Q is 00 and C is 1.100. Equation (6b) reinitializes A to 1.000.
the C register. This carry propagates to Q by activating encoder output signal ADD+1, and this carry-over operation converts
is needed. Q now becomes 010. The value of code string is
Arithmetic coding ensures that no future value of C can exceed the current value of C + A. Consequently, once a carry-over has propagated into a given code-string position in Q, no other carry-over will reach the same code-string position. In the above sample, the second bit of the code string received a carry-over. The algorithm ensures that this same bit position (second from the beginning) cannot receive another carry-over during the succeeding encoding operations. This obser-
a bit-stuff permits the block with the 16 1s to be transmitted. At the decoder side, if the decoder encounters 16 1s in a row, the decoder buffer removes and examines the stuffed bit. If the stuffed bit value is 1, the carry is propagated inside the decoder.
the Q FIFO store. However, after the last event has been coded, we remain with interval [C, C + A). If we know the length of the data string, then we know when we have decoded the last event. Therefore, any code string whose magnitude lies in [C, C + A) decodes to the original data string. In the
truncation process leaves “01” as the code string, with the
takes to decode four data bits.
(smallest value in current interval) and strictly less than C + A = 010000 + 0001000 = 010100 suffices. Our shortest
The decoding part of the BAC algorithm is shown in Figure
and the contents of CBUF are transferred into C. Register A
are now 0.110 and 1.110.
subtract 0.100 from C (which is now 0.010). The result CBUF
now 0.100 and A is 1.000.
ization shift of C and A. A is now 1.000 and decoding is complete.
Note that column A and the Normalization columns of
register contents always follow the same sequence of values for the decode phase as for the encode phase.
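The decode phase can be simulated in the same style. The sketch below makes several assumptions that are not spelled out in the surviving text: C and A are four-bit registers with three fractional bits, the trial subtraction result is held in a variable named `cbuf`, a non-negative result is taken to mean T, code-string bits beyond the end are 0-filled, and the decoder is handed the skew of each event (in practice its statistics unit would supply them). The function name `decode` is hypothetical.

```python
ONE = 0b1000  # the register value 1.000 (three fractional bits assumed)


def decode(bits, skews):
    """Decode one binary event per skew from a list of code-string bits."""
    stream = iter(bits)

    def next_bit():
        return next(stream, 0)  # 0-fill once the code string is exhausted

    # Initialize C with the first four code-string bits; A starts at 1.000.
    c = 0
    for _ in range(4):
        c = (c << 1) | next_bit()
    a = ONE

    events = []
    for sk in skews:
        cbuf = c - (ONE >> sk)  # trial subtraction of 2^-SK
        if cbuf >= 0:
            events.append("T")
            c = cbuf
            a -= ONE >> sk
            if a < ONE:         # normalization shift of C and A
                c = (c << 1) | next_bit()
                a <<= 1
        else:
            events.append("F")
            for _ in range(sk): # F-shift of SK bits
                c = (c << 1) | next_bit()
            a = ONE
    return events


print(decode([0, 1, 0, 0, 0, 0, 0], [3, 1, 1, 1]))  # ['T', 'T', 'F', 'T']
print(decode([0, 1], [3, 1, 1, 1]))                 # the truncated string "01"
```

Because A is updated by the same subtractions and shifts as in the encoder, it passes through the same sequence of values (1.110, 1.010, 1.000, 1.000) in both phases. The truncated code string "01" decodes to the same four events because the missing bits are read as 0s.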
We can apply the code-string-tree representation of the coding operations to arithmetic codes. However, unlike prefix codes, in arithmetic coding the encoding of a given symbol may result in a code space described by continuations of more than one leaf of the tree. We illustrate the point by showing the
tree of Figure 10.
The smallest subinterval at the current depth is a single leaf.
the value of A can range from 1.000 to 1.110. At the same current depth where 2" is one leaf, the subset of code-string
Figure 9 Flowchart for BAC algorithm decoder.
Figure 10 Code-string tree for Event 1, Table 4: (a) Initial tree. (b) Following encoding of Event 1.
Table 5 Example decoding.
Event    SK  C      A      CBUF    MOUT  Normalization
Initial  -   0.100  1.000  -       -     -
1        3   0.110  1.110  0.011   T     Yes
2        1   0.010  1.010  0.010   T     No
3        1   0.100  1.000  -1.110  F     F-shift of SK
4        1   0.000  1.000  0.000   T     Yes
arithmetic code, solved the precision problem by truncating
of the leaves at the current depth of the code-string tree,
maps to a continuation of such leaves, and there is some
They do this by ensuring that C(s,k + 1) = C(s,k) + A(s,k).
Decodability
Codes are not uniquely decodable if two data strings map into the same code space. In arithmetic codes, if the subdivision of the interval yields an overlap, more than one data string can map to the overlapped subinterval. Overlap oc-
of data string s may map to the subinterval whose least point is C(s + 1). We define string s + 1 to be the data string of the same length as s which is next in the lexical ordering. If s is
is thus $C(s,n) + A(s,n) \le C(s+1)$, (9)
arithmetic codes which do not explicitly use value A(s,n), it
[C(s + 1), C(s,n) + A(s,n)).
s and s + 1 if $C(s+1) - C(s) > A(s,1) + \cdots + A(s,n)$. (10)
A gap does not affect decodability; it simply means that the code space is not fully utilized.
P-based arithmetic codes
For P-based arithmetic codes, the code space is represented as a number A which is subdivided by a multiplication in proportion to the relative frequencies of the symbols.
The decodability criterion for P-based codes is given by
$A(s) \ge A(s,1) + A(s,2) + \cdots + A(s,n)$. (11)
If this equation is met with equality for all s, then the algorithm leaves no gaps.
L-based arithmetic codes
The L-based arithmetic codes represent the width of the code space A(s) as a value $2^{-[Y(s)+X(s)]}$, where Y(s) is an integer and
Y(s) + X(s). Here, Y(s) corresponds to E(s) of the example of Table 3. [When the code string is terminated following the encoding of the last symbol, the code-string length of C(s) is
is determined as the product
C(s,k) = C(s) + D(s,k).
In Eq. (12), multiplication by $2^{-Y(s)}$ is simply a shift.
Corresponding to the relative frequency estimates, $p_k$, are
$L(s,k) = Y(s) + X(s) + \ell_k$, where again L(s) is broken into integer part Y(s,k) and fraction
decodability criterion for L-based codes if A(s,k) is as defined
For k < n define $A(s,k) = B(s,k+1) - B(s,k)$,
$A(s,n) = B(s,n,n) + B(s,n,n,n) + \cdots$. (13)
Applications
using as a context the value of neighboring pels already encoded. This work introduced the notion of the binary coding parameter called skew. A binary code is particularly useful for black and white images because the contexts employed for successive pels may have different relative frequencies. Traditional run-length codes such as Golomb’s [4] are only designed for a particular relative frequency for the repeating symbol. In [3], a method to dynamically adapt to the pel statistics is described. Particularly simple adaptation techniques exist for determining the skew number.
Arithmetic coding can also be applied to file compression
adaptively. A linearized binary tree can be used to store the skew numbers required to encode the decomposed 8-bit byte as eight encodings, each of a binary event.
Arithmetic codes have been employed to design codes for a
e.g. {0,1}, where not all strings in {0,1}* are allowed. The Manchester code, where clocking and data information are
studied the constrained channel, defined the constraints in terms of allowed transitions in a state table, and determined the probabilities for the allowed transitions which are needed
strings to channel strings is analogous to the decompression operation, and recovering the data string from the channel
be representable, the encoding operation can have overlaps but cannot have a gap. Some interesting L-based codes for constrained channels are derived in [13].
constrained-channel codes, contributed a code which subdivides the code space according to the given probabilities. However, the subdivision method is quite crude compared to Pasco’s [5].
Arithmetic codes can achieve compression as close to the ideal as desired, for given statistics. In addition, the P-based FIFO arithmetic codes which accept statistics directly facilitate dynamic adaptation to the statistics in one pass of the data [3]. A good binary code is important, as Shannon and others have noted, because other alphabets can be converted to binary form by decomposition.
Of overriding importance to compression now is the modeling of the data. Rissanen and Langdon [14] have studied a framework for the encoding of data strings and have assigned a cost to a model based on the coding parameters required. Different modeling approaches may be compared. They showed that blocking to form larger alphabets results in the same model entropy at the same cost in coding parameters as a symbolwise approach. In general, the compression system designer seeks ways to give up a small percentage of the ideal compression in order to simplify the implementation. The existence of an efficient coding technique now places the emphasis on efficient context selection and parameter-reduction techniques [14].
Most of the author’s work in this field was done jointly with J. Rissanen, and this debt is obvious. I also owe a debt to Joan Mitchell, who has made several contributions to arithmetic coding [15]. If this paper is more accessible to the general
reader, it is due to Joan’s gracious and patient encouragement. I thank Gerry Goertzel for his insights in explaining the operation of arithmetic coding. I have also benefited from the encouragement of Janet Kelly, Nigel Martin, Stephen Todd, Ron Arps, and Murali Varanasi.
The first step toward arithmetic coding was taken by Shannon [11], who observed in a 1948 paper that messages N symbols long could be encoded by first sorting the messages in order of their probabilities and then sending the cumulative probability of the preceding messages in the ordering. The code
comparison. The next step was taken by Peter Elias in an unpublished result; Abramson [16] described Elias’ improve-
that Shannon’s scheme worked without sorting the messages, and that the cumulative probability of a message of N symbols could be recursively calculated from individual symbol prob-
As the message increased in length the arithmetic involved
units for these codes, the time to encode each symbol is increased linearly with the length of the code string.
Meanwhile, another approach to coding was having a similar problem with precision. In 1972, Schalkwijk [18] studied coding from the standpoint of providing an index to the encoded string within a set of possible strings. As symbols were added to the string, the index increased in size. This is a
was the first symbol decoded. Cover [19] made improvements
codes suffered from the same precision problem.
Both Shannon’s code and the Schalkwijk-Cover code can be viewed as a mapping of strings to a number, forming two
precision problem by suitable approximations in designing a LIFO arithmetic code. Code strings of any length could be generated with a fixed calculation time per data symbol using fixed-precision arithmetic.
earlier, which controlled the precision problem by essentially
code string was kept in computer memory until the last symbol was encoded. This strategy allowed a carry-over to be propagated over a long carry chain. Pasco [5] also conjectured on the family of arithmetic codes based on their mechanization.