






An introduction to error correcting codes, covering the concept of distance in $\mathbb{Z}_2^n$, the definition of error correcting codes, and the process of decoding. It also covers Gaussian elimination for matrices with entries in $\mathbb{Z}_2$ and its application to the Hamming code, including how to reduce matrices to echelon form and how to find the solution to a system of linear equations using the reduced matrix.







The identification number schemes we discussed in the previous chapter give us the ability
to determine if an error has been made in recording or transmitting information. However,
they are limited in two ways. First, each allows detection of an error in just one digit,
except for some special types of errors, such as interchanging digits. Second, they provide
no way to recover the intended information. By making use of more sophisticated ideas and
mathematical concepts, we will study methods of encoding and transmitting information that
allow us to both detect and correct errors. These so-called error correcting codes are used
in many places, from transmitting photographs from planetary probes to playing compact
discs and DVD movies.
To discuss error correcting codes, we need first to set the context and define some terms. We
work throughout in binary; that is, we will work over $\mathbb{Z}_2$. To simplify notation, we will write
the two elements of $\mathbb{Z}_2$ as 0 and 1 instead of using residue class notation such as $\overline{0}$ and $\overline{1}$.
If n is a positive integer, then the set $\mathbb{Z}_2^n$ is the set of all n-tuples of $\mathbb{Z}_2$-entries. Elements of
$\mathbb{Z}_2^n$ are called words, or words of length n. A code of length n is a nonempty subset of $\mathbb{Z}_2^n$.
We will refer to the elements of a code as codewords. For convenience we will write elements
of $\mathbb{Z}_2^n$ either with the usual notation, or as a concatenation of digits. For instance, we will
write (0, 1, 0, 1) and 0101 for the same 4-tuple. We can equip $\mathbb{Z}_2^n$ with an operation of addition
by using point-wise addition. That is, we define
\[ (a_1, \ldots, a_n) + (b_1, \ldots, b_n) = (a_1 + b_1, \ldots, a_n + b_n). \]
Note that, as a consequence of the fact that 0 + 0 = 0 = 1 + 1 in $\mathbb{Z}_2$, we have $a + a = \mathbf{0}$ for
any $a \in \mathbb{Z}_2^n$, where $\mathbf{0}$ is the vector (0, ..., 0) consisting of all zeros.
Example 2.1. The set { 01 , 10 , 11 } is a code of length 2, and { 0000 , 1010 , 0101 , 1111 } is a
code of length 4.
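The point-wise addition above can be tried out directly. The following is a minimal Python sketch, not part of the original notes; the helper names `add` and `parse` are hypothetical, and words are represented as tuples of 0s and 1s.

```python
from itertools import product

def add(v, w):
    """Point-wise sum of two words over Z_2 (hypothetical helper)."""
    return tuple((a + b) % 2 for a, b in zip(v, w))

def parse(s):
    """Turn a digit string such as '0101' into the tuple (0, 1, 0, 1)."""
    return tuple(int(ch) for ch in s)

n = 4
words = list(product((0, 1), repeat=n))   # all 2**n words of length n
zero = (0,) * n

# The length 4 code from Example 2.1, as a subset of the words of length 4.
code = {parse(s) for s in ("0000", "1010", "0101", "1111")}

print(add(parse("1010"), parse("0101")))       # (1, 1, 1, 1)
print(all(add(a, a) == zero for a in words))   # True: a + a = 0 for every word
```

Representing words as tuples keeps them hashable, so a code can be stored as a Python set.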
20 Chapter 2. Error Correcting Codes
Let w = a 1 · · · an be a word of length n. Then the weight of w is the number of digits of
w equal to 1. We denote the weight of w by wt(w). There are some obvious consequences of
this definition. First of all, wt(w) = 0 if and only if w = 0. Second, wt(w) is a nonnegative
integer. A more sophisticated fact about weight is its relation with addition. If $v, w \in \mathbb{Z}_2^n$,
then wt(v + w) ≤ wt(v) + wt(w). To see why this is true, we write xi for the i-th component
of a word x. The weight of x is then given by the equation wt(x) = |{i : 1 ≤ i ≤ n, xi = 1}|.
Using this description of weight, we note that (v + w)i = vi + wi. Therefore, if (v + w)i = 1,
then either vi = 1 or wi = 1 (but not both). Therefore,
{i : 1 ≤ i ≤ n, (v + w)i = 1} ⊆ {i : vi = 1} ∪ {i : wi = 1}.
Since |A ∪ B| ≤ |A| + |B| for any two finite sets A, B, the inclusion above yields wt(v + w) ≤
wt(v) + wt(w), as desired.
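As a sanity check on the inequality, one can test it exhaustively for a small length. This is a sketch under the assumption that words are tuples of 0s and 1s; the names `wt`, `add`, and `overlap` are illustrative. The second check, that the weights differ by exactly twice the number of positions where both words have a 1, is an extra observation not stated in the text.

```python
from itertools import product

def add(v, w):
    """Point-wise sum over Z_2."""
    return tuple((a + b) % 2 for a, b in zip(v, w))

def wt(w):
    """Weight: the number of components equal to 1."""
    return sum(w)

def overlap(v, w):
    """Number of positions where both words have a 1."""
    return sum(a * b for a, b in zip(v, w))

n = 5
words = list(product((0, 1), repeat=n))

subadditive = all(wt(add(v, w)) <= wt(v) + wt(w) for v in words for w in words)
exact = all(wt(add(v, w)) == wt(v) + wt(w) - 2 * overlap(v, w)
            for v in words for w in words)
print(subadditive, exact)   # True True
```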
From the idea of weight we can define the notion of distance on $\mathbb{Z}_2^n$. If v and w are words, then
we set the distance D(v, w) between v and w to be
D(v, w) = wt(v + w).
Alternatively, D(v, w) is equal to the number of positions in which v and w differ. The
function D shares the basic properties of distance in Euclidean space $\mathbb{R}^3$. More precisely, it
satisfies the properties of the following lemma.
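Lemma 2.2. The distance function D defined on $\mathbb{Z}_2^n \times \mathbb{Z}_2^n$ satisfies
1. D(v, v) = 0 for all $v \in \mathbb{Z}_2^n$;
2. for any $v, w \in \mathbb{Z}_2^n$, if D(v, w) = 0, then v = w;
3. D(v, w) = D(w, v) for any $v, w \in \mathbb{Z}_2^n$;
4. (triangle inequality) D(v, w) ≤ D(v, u) + D(u, w) for any $u, v, w \in \mathbb{Z}_2^n$.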
Proof. Since v + v = 0 , we have D(v, v) = wt(v + v) = wt( 0 ) = 0. This proves (1). We note
that 0 is the only word of weight 0. Thus, if D(v, w) = 0, then wt(v + w) = 0, which forces
v + w = 0. However, adding w to both sides yields v = w, and this proves (2). The equality
D(v, w) = D(w, v) is obvious since v +w = w +v. Finally, we prove (4), the only non-obvious
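statement, with a cute argument. Given $u, v, w \in \mathbb{Z}_2^n$, we have, from the definition and the
fact about weight given above,
\begin{align*}
D(v, w) = \operatorname{wt}(v + w) &= \operatorname{wt}((v + u) + (u + w)) \\
&\le \operatorname{wt}(v + u) + \operatorname{wt}(u + w) \\
&= D(v, u) + D(u, w).
\end{align*}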
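The four properties proved above can also be confirmed by brute force for a small length. The sketch below is illustrative only; the function name `dist` and the choice n = 4 are assumptions made for the example.

```python
from itertools import product

def dist(v, w):
    """D(v, w) = wt(v + w): the number of positions where v and w differ."""
    return sum((a + b) % 2 for a, b in zip(v, w))

n = 4
words = list(product((0, 1), repeat=n))

prop1 = all(dist(v, v) == 0 for v in words)
prop2 = all(v == w for v in words for w in words if dist(v, w) == 0)
prop3 = all(dist(v, w) == dist(w, v) for v in words for w in words)
prop4 = all(dist(v, w) <= dist(v, u) + dist(u, w)
            for u in words for v in words for w in words)
print(prop1, prop2, prop3, prop4)   # True True True True
```

An exhaustive check for one value of n is of course not a proof; it only guards against a misreading of the definitions.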
Proof. Let w be a word, and suppose that v is a codeword with D(v, w) ≤ t. We need to
prove that v is the unique closest codeword to w. We do this by proving that D(u, w) > t
for any codeword u ≠ v. Suppose, to the contrary, that u is a codeword with u ≠ v and
D(u, w) ≤ t. Then, by the triangle inequality,
D(u, v) ≤ D(u, w) + D(w, v) ≤ t + t = 2t < d.
This contradicts the definition of d. Thus, v is indeed the unique closest codeword to
w. To finish the proof, we need to prove that C does not correct t + 1 errors. Since the code
has distance d, there are codewords $u_1, u_2$ with $d = D(u_1, u_2)$. By appropriately altering t + 1
components of $u_1$, we can produce a word w with $D(u_1, w) = t + 1$ and $D(w, u_2) = d - (t + 1)$.
We can do this by considering $u_1 + u_2$, a vector with d components equal to 1, and changing
d − (t + 1) of these components to 0, thereby obtaining a word e of weight t + 1. We then set
$w = u_1 + e$. Given w, we have $D(u_1, w) = t + 1$. On the other hand, d − 1 < 2t + 2 by the
definition of t. Thus, d − (t + 1) < t + 2, so $D(w, u_2) = d - (t + 1) \le t + 1$. Thus, $u_1$ is not the
unique closest codeword to w, since $u_2$ is either equally close or closer to w. Therefore, C is
not a (t + 1)-error correcting code.
We need to show that if u is any word of weight ≤ t and v and w are distinct codewords,
then D(v, v + u) < D(w, v + u). To see this, first observe that D(v + u, v) = wt(u),
so that D(w, v + u) + wt(u) = D(w, v + u) + D(v, v + u). The triangle inequality gives
D(w, v + u) + D(v, v + u) ≥ D(w, v) ≥ d (by definition of d). Moreover, d ≥ 2t + 1 ≥ 2 wt(u) + 1,
so that D(w, v + u) + wt(u) ≥ 2 wt(u) + 1, and D(w, v + u) ≥ wt(u) + 1 = D(v, v + u) + 1,
as desired.
Example 2.8. Let C = {00000, 00111, 11100, 11011}. The distance of C is 3, and so C is a
1-error correcting code.
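The claim in Example 2.8 can be checked numerically. The sketch below makes several assumptions: the first codeword is taken to be 00000 (the code has length 5), t is computed as the largest integer with 2t < d, which matches the inequalities used in the proof above, and `decode` is a hypothetical nearest-codeword decoder that reports a tie as `None`.

```python
from itertools import combinations

def dist(v, w):
    """Number of positions in which two digit strings differ."""
    return sum(a != b for a, b in zip(v, w))

def min_distance(code):
    """Smallest distance between two distinct codewords."""
    return min(dist(u, v) for u, v in combinations(code, 2))

def decode(word, code):
    """Return the unique closest codeword, or None if there is a tie."""
    ranked = sorted(code, key=lambda c: dist(c, word))
    if len(ranked) > 1 and dist(ranked[0], word) == dist(ranked[1], word):
        return None
    return ranked[0]

def flip(word, i):
    """Introduce a single error in position i."""
    return word[:i] + ("1" if word[i] == "0" else "0") + word[i + 1:]

C = ["00000", "00111", "11100", "11011"]
d = min_distance(C)
t = (d - 1) // 2          # assumption: t is the largest integer with 2t < d
print(d, t)               # 3 1
print(decode("00101", C)) # 00111

# Every single error in every codeword is corrected.
all_single = all(decode(flip(c, i), C) == c for c in C for i in range(5))
print(all_single)         # True
```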
Example 2.9. Let n be an odd positive integer, and let C = { 0 · · · 0 , 1 · · · 1 } be a code of
length n. If n = 2t + 1, then C is a t-error correcting code since the distance of C is n. Thus,
by making the length of C long enough, we can correct any number of errors that we wish.
However, note that the fraction of components of a word that can be corrected is t/n, and
this is always less than 1/2.
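For the repetition code of Example 2.9, choosing the closest codeword amounts to a majority vote on the received digits. The function name `decode_repetition` below is hypothetical; the sketch assumes words are strings of 0s and 1s of odd length.

```python
def decode_repetition(word):
    """Nearest-codeword decoding for {0...0, 1...1} of odd length: majority vote."""
    n = len(word)
    ones = word.count("1")
    return "1" * n if ones > n - ones else "0" * n

n = 7
t = (n - 1) // 2                      # n = 2t + 1, so t = 3
print(t)                              # 3
print(decode_repetition("1011001"))   # 1111111 (three errors corrected)
print(decode_repetition("0001000"))   # 0000000 (one error corrected)
```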
In this section we discuss the idea of Gaussian elimination for matrices with entries in $\mathbb{Z}_2$.
We do this now precisely because we need to work with matrices with entries in $\mathbb{Z}_2$ in order
to discuss the Hamming code, our first example of an error correcting code.
In linear algebra, if you are given a system of linear equations, then you can write this
system as a single matrix equation AX = b, where A is the matrix of coefficients, and X is
2.2. Gaussian Elimination 23
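the column matrix of variables. For example, the system
2x + 3y − z = 1
x − y + 5z = 2
is equivalent to the matrix equation
\[ \begin{pmatrix} 2 & 3 & -1 \\ 1 & -1 & 5 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}. \]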
The primary matrix-theoretic method for solving such a system is to perform Gaussian
elimination on the augmented matrix, that is, the matrix obtained by appending to the
coefficient matrix one column at the right equal to the column on the right side of the
equation. Recall that Gaussian elimination performs operations on the rows of a matrix in
order to replace the matrix by one in which the solution to the system can be found easily.
There are three such row operations:
1. multiply a row by a nonzero scalar;
2. interchange two rows;
3. add a scalar multiple of one row to another row.
It is likely that in all your work with matrices, the entries of the matrices were real
numbers. However, to perform the row operations, all you need is to be able to add, subtract,
multiply, and divide the entries. In many situations, matrices arise whose entries are not real
numbers. For coding theory we need to work with matrices whose entries lie in $\mathbb{Z}_2 = \{0, 1\}$.
Within this set we can add, subtract, multiply, and divide just as if we had real numbers.
Furthermore, all the theorems of linear algebra have analogues in the setting where entries
lie in $\mathbb{Z}_2$. In fact, we will generalize the idea of linear algebra later on to include many more
sets of scalars. Again, all we need is to be able to perform the four arithmetic operations on
the scalars, and we need properties analogous to those that hold for real number arithmetic.
Recall that the only symbolic difference between $\mathbb{Z}_2$ arithmetic and ordinary arithmetic
of these symbols is that 1 + 1 = 0 in $\mathbb{Z}_2$. Note that the first of the three row operations
listed above is not useful; the only nonzero scalar is 1, and multiplying a row by 1 does not
affect the row, so this operation is not needed. Also, the third operation in the case of $\mathbb{Z}_2$
reduces to adding one row to another.
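To make the arithmetic concrete, here is a small sketch of the operations in Z_2 and of the one row operation that does real work over Z_2. The helper names `add2`, `mul2`, and `add_row` are illustrative, not from the text.

```python
def add2(a, b):
    """Addition in Z_2."""
    return (a + b) % 2

def mul2(a, b):
    """Multiplication in Z_2."""
    return (a * b) % 2

bits = (0, 1)

# On the symbols 0 and 1, addition is XOR and multiplication is AND.
print(all(add2(a, b) == a ^ b and mul2(a, b) == a & b
          for a in bits for b in bits))        # True

# Each element is its own negative, so subtraction coincides with addition.
print(all(add2(a, a) == 0 for a in bits))      # True

def add_row(target, source):
    """Third row operation over Z_2: add one row to another."""
    return [add2(x, y) for x, y in zip(target, source)]

print(add_row([1, 1, 0, 1], [1, 0, 0, 1]))     # [0, 1, 0, 0]
```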
Before working some examples, we recall what it means for a matrix to be in row reduced
echelon form.
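Definition 2.10. A matrix A is in row reduced echelon form if
1. the first nonzero entry of each nonzero row is 1 (a leading 1);
2. every other entry in a column containing a leading 1 is 0;
3. the leading 1 of each nonzero row lies to the right of the leading 1 of the row above it;
4. any rows consisting entirely of zeros are at the bottom of the matrix.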
we can apply the following single row operation.
We now recall why having a matrix in row reduced echelon form will give us the solution
to the corresponding system of equations. The row operations on the augmented matrix
correspond to performing various algebraic manipulations on the equations, such as
interchanging equations. So, the system of equations corresponding to the reduced matrix is
equivalent to the original system; that is, the two systems have exactly the same solutions.
Example 2.14. Consider the system of equations
x = 1
x + y = 1
y + z = 1.
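This system has augmented matrix
\[ \left(\begin{array}{ccc|c} 1 & 0 & 0 & 1 \\ 1 & 1 & 0 & 1 \\ 0 & 1 & 1 & 1 \end{array}\right), \]
and reducing this matrix yields
\[ \left(\begin{array}{ccc|c} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{array}\right). \]
This new matrix corresponds to the system of equations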
x = 1,
y = 0,
z = 1.
Thus, the reduced matrix already displays the solution to the original system.
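The reduction in Example 2.14 can be reproduced with a short routine. The following is a minimal sketch of Gaussian elimination over Z_2, assuming a matrix is given as a list of rows of 0s and 1s; the function name `rref_mod2` is hypothetical, and only the two useful row operations (interchanging rows, adding one row to another) are needed.

```python
def rref_mod2(rows):
    """Row reduce a 0/1 matrix over Z_2; return (reduced matrix, pivot columns)."""
    m = [row[:] for row in rows]          # work on a copy
    pivots = []
    r = 0                                 # index of the next pivot row
    for c in range(len(m[0])):
        # Look for a row at or below r with a 1 in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] == 1), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]   # interchange two rows
        for i in range(len(m)):
            if i != r and m[i][c] == 1:
                # Add row r to row i to clear the entry in column c.
                m[i] = [(a + b) % 2 for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

# Augmented matrix of x = 1, x + y = 1, y + z = 1 (last column is the right side).
augmented = [[1, 0, 0, 1],
             [1, 1, 0, 1],
             [0, 1, 1, 1]]
reduced, pivots = rref_mod2(augmented)
print(reduced)   # [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 1]]
print(pivots)    # [0, 1, 2]
```

The last column of the reduced matrix reads off x = 1, y = 0, z = 1. For an inconsistent system this simple routine could place a pivot in the last column, so a caller would need to check for that case.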
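Example 2.15. The augmented matrix
\[ \left(\begin{array}{ccccc|c} 1 & 1 & 0 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 & 1 & 1 \end{array}\right) \]
corresponds to the system of equations
\begin{align*}
x_1 + x_2 + x_5 &= 0, \\
x_1 + x_3 &= 1, \\
x_2 + x_3 + x_4 + x_5 &= 0, \\
x_2 + x_3 + x_5 &= 1.
\end{align*}
Reducing the matrix yields
\[ \left(\begin{array}{ccccc|c} \mathbf{1} & 0 & 1 & 0 & 0 & 1 \\ 0 & \mathbf{1} & 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & \mathbf{1} & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right), \]
which corresponds to the system of equations
\begin{align*}
x_1 + x_3 &= 1, \\
x_2 + x_3 + x_5 &= 1, \\
x_4 &= 1.
\end{align*}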
We have left the leading ones in boldface in the echelon matrix. These correspond to the
variables $x_1$, $x_2$, and $x_4$. These variables can be solved in terms of the other variables. Thus,
we have the full solution
\begin{align*}
x_1 &= 1 + x_3, \\
x_2 &= 1 + x_3 + x_5, \\
x_4 &= 1, \\
& x_3 \text{ and } x_5 \text{ are arbitrary.}
\end{align*}
We can write out all solutions to this system of equations, since each of $x_3$ and $x_5$ can take
on the two values 0 and 1. This gives us four solutions, which we write as row vectors.
\begin{align*}
(x_1, x_2, x_3, x_4, x_5) &= (1, 1, 0, 1, 0) && (x_3 = 0,\ x_5 = 0) \\
(x_1, x_2, x_3, x_4, x_5) &= (0, 0, 1, 1, 0) && (x_3 = 1,\ x_5 = 0) \\
(x_1, x_2, x_3, x_4, x_5) &= (1, 0, 0, 1, 1) && (x_3 = 0,\ x_5 = 1) \\
(x_1, x_2, x_3, x_4, x_5) &= (0, 1, 1, 1, 1) && (x_3 = 1,\ x_5 = 1).
\end{align*}
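Because there are only 2^5 = 32 candidate vectors, the list of solutions in Example 2.15 can be double checked by brute force. Note that this sketch swaps in exhaustive search in place of Gaussian elimination; the names `A`, `b`, and `satisfies` are illustrative, with `A` and `b` read off from the four equations of the example.

```python
from itertools import product

# Coefficients and right sides of the four equations of Example 2.15.
A = [[1, 1, 0, 0, 1],
     [1, 0, 1, 0, 0],
     [0, 1, 1, 1, 1],
     [0, 1, 1, 0, 1]]
b = [0, 1, 0, 1]

def satisfies(x):
    """True if x solves every equation, with arithmetic done in Z_2."""
    return all(sum(a * xi for a, xi in zip(row, x)) % 2 == rhs
               for row, rhs in zip(A, b))

solutions = [x for x in product((0, 1), repeat=5) if satisfies(x)]
for x in solutions:
    print(x)
# (0, 0, 1, 1, 0)
# (0, 1, 1, 1, 1)
# (1, 0, 0, 1, 1)
# (1, 1, 0, 1, 0)
```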
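Example 2.16. Let H be the Hamming matrix
\[ H = \begin{pmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{pmatrix}. \]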
the dimension of the row space is the number of leading 1’s. Thus, an alternative definition
of the rank of a matrix is that it is equal to the number of leading 1’s in the row reduced
echelon form obtained from the matrix.
The kernel, or nullspace, of a matrix A is the set of all solutions to the homogeneous
equation AX = 0. To help understand this definition, consider the Hamming matrix H of
the previous example.
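Example 2.17. The solution to the homogeneous equation HX = 0 from the previous
example is
\begin{align*}
x_1 &= x_3 + x_5 + x_7, \\
x_2 &= x_3 + x_6 + x_7, \\
x_4 &= x_5 + x_6 + x_7, \\
& x_3, x_5, x_6, x_7 \text{ are arbitrary.}
\end{align*}
For each arbitrary variable we can set it equal to 1 and all other arbitrary variables equal
to 0. The resulting vector will be a solution to HX = 0. If we do this for each arbitrary
variable, we will have a basis for the nullspace. Doing this, we get the four vectors
\[
\begin{pmatrix} 1 \\ 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad
\begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \quad
\begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \quad
\begin{pmatrix} 1 \\ 1 \\ 0 \\ 1 \\ 0 \\ 0 \\ 1 \end{pmatrix}.
\]
These vectors do form a basis for the nullspace of H since the general solution of HX = 0 is
\[
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{pmatrix}
=
\begin{pmatrix} x_3 + x_5 + x_7 \\ x_3 + x_6 + x_7 \\ x_3 \\ x_5 + x_6 + x_7 \\ x_5 \\ x_6 \\ x_7 \end{pmatrix}
= x_3 \begin{pmatrix} 1 \\ 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}
+ x_5 \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \\ 0 \end{pmatrix}
+ x_6 \begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}
+ x_7 \begin{pmatrix} 1 \\ 1 \\ 0 \\ 1 \\ 0 \\ 0 \\ 1 \end{pmatrix}.
\]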
From this equation, we see that every solution is a linear combination of the four specific
solutions written above, and a little work will show that every solution can be written in
a unique way as a linear combination of these vectors. For example, one can check that
(0, 1, 1, 1, 1, 0, 0) is a solution to the system HX = 0, and that to write this vector as a
linear combination of the four given vectors, we must have $x_3 = x_5 = 1$ and $x_6 = x_7 = 0$,
2.3. The Hamming Code 29
and so
\[ (0, 1, 1, 1, 1, 0, 0)^T = (1, 1, 1, 0, 0, 0, 0)^T + (1, 0, 0, 1, 1, 0, 0)^T \]
is a sum of two of the four given vectors, and can be written in no other way in terms of the
four.
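The recipe of Example 2.17 (set one free variable to 1, the others to 0, and solve for the leading variables) can be sketched in code. The matrix `H` below is an assumption: it is reconstructed from the statement that its columns are the base 2 representations of 1 through 7 and from the solution listed in the example. The function names `rref_mod2` and `nullspace_basis` are hypothetical.

```python
def rref_mod2(rows):
    """Row reduce a 0/1 matrix over Z_2; return (reduced matrix, pivot columns)."""
    m = [row[:] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] == 1), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] == 1:
                m[i] = [(a + b) % 2 for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def nullspace_basis(rows):
    """One basis vector per free variable, following the recipe in the text."""
    reduced, pivots = rref_mod2(rows)
    n = len(rows[0])
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        x = [0] * n
        x[f] = 1                       # this free variable is 1, the others 0
        for r, p in enumerate(pivots):
            x[p] = reduced[r][f]       # leading variable read off from its row
        basis.append(tuple(x))
    return basis

# Assumed Hamming matrix: column j is the base 2 representation of j.
H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]

basis = nullspace_basis(H)
for vec in basis:
    print(vec)
# (1, 1, 1, 0, 0, 0, 0)
# (1, 0, 0, 1, 1, 0, 0)
# (0, 1, 0, 1, 0, 1, 0)
# (1, 1, 0, 1, 0, 0, 1)

# The vector from the text is the sum of the first two basis vectors.
total = tuple((a + b) % 2 for a, b in zip(basis[0], basis[1]))
print(total)   # (0, 1, 1, 1, 1, 0, 0)
```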
This example illustrates the following general fact: for a homogeneous system AX = 0,
the number of variables not corresponding to leading 1's is equal to the dimension of
the nullspace of A. Let us call the variables that do correspond to leading 1's leading
variables. If we reduce A, the leading variables can be solved in terms of the other
variables, and these other variables are all arbitrary; we call them free variables. By
mimicking the example above, any solution can be written uniquely as a linear combination
of a set of solutions, one for each free variable. This set of solutions is a basis for the
nullspace of A; therefore, the number of free variables is equal to the dimension of the
nullspace. Every variable is either a leading variable or a free variable, and the number of
variables is the number of columns of the matrix. Since the number of leading variables is
the rank of A, this observation leads to the rank-nullity theorem. The nullity of a matrix A
is the dimension of the nullspace of A.
Theorem 2.18 (Rank-Nullity). Let A be an n × m matrix. Then m is equal to the sum
of the rank of A and the nullity of A.
The point of this theorem is that once you know the rank of A, the nullity of A can be
immediately calculated. The number of solutions to AX = 0 can then be found: over $\mathbb{Z}_2$,
a nullspace of dimension k has exactly $2^k$ elements. In coding theory this will allow us to
determine the number of codewords in a given code.
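As an illustration of the theorem, the sketch below computes the rank of the coefficient matrix of Example 2.15 over Z_2 and compares the predicted number of solutions of AX = 0 with a brute force count. The rank routine is an assumption of this example: it packs each row into an integer and reduces it by XOR against the rows kept so far, which is one of several ways to do elimination over Z_2. The names `rank_mod2` and `homogeneous` are hypothetical.

```python
from itertools import product

def rank_mod2(rows):
    """Rank over Z_2, with each row packed into an integer bitmask."""
    kept = []                             # independent rows found so far
    for row in rows:
        x = int("".join(map(str, row)), 2)
        for k in kept:
            x = min(x, x ^ k)             # clear the leading bit of k if present
        if x:
            kept.append(x)
    return len(kept)

# Coefficient matrix of Example 2.15 (4 rows, 5 columns).
A = [[1, 1, 0, 0, 1],
     [1, 0, 1, 0, 0],
     [0, 1, 1, 1, 1],
     [0, 1, 1, 0, 1]]

m = len(A[0])                  # number of columns
rank = rank_mod2(A)
nullity = m - rank             # rank-nullity
print(rank, nullity)           # 3 2

# Brute force count of solutions to AX = 0 over Z_2.
homogeneous = [x for x in product((0, 1), repeat=m)
               if all(sum(a * xi for a, xi in zip(row, x)) % 2 == 0 for row in A)]
print(len(homogeneous), 2 ** nullity)   # 4 4
```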
The Hamming code, discovered independently by Hamming and Golay, was the first example
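of an error correcting code. Let
\[ H = \begin{pmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{pmatrix} \]
be the Hamming matrix, described in Example 2.16 above. Note that the columns of this
matrix give the base 2 representation of the integers 1-7. The Hamming code C of length
7 is the nullspace of H. More precisely,
\[ C = \{ v \in K^7 : Hv^T = 0 \}. \]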
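Since there are only 2^7 = 128 words of length 7, the Hamming code can be listed by testing the condition Hv^T = 0 directly. This is a brute force sketch rather than the method of the text; the matrix `H` is the assumed Hamming matrix whose column j is the base 2 representation of j, and the minimum distance printed at the end is a computed observation, not a claim made in this excerpt.

```python
from itertools import combinations, product

# Assumed Hamming matrix: column j is the base 2 representation of j.
H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]

def in_code(v):
    """True if H v^T = 0 over Z_2."""
    return all(sum(h * x for h, x in zip(row, v)) % 2 == 0 for row in H)

def dist(v, w):
    """Number of positions in which v and w differ."""
    return sum(a != b for a, b in zip(v, w))

C = [v for v in product((0, 1), repeat=7) if in_code(v)]
print(len(C))     # 16, which is 2**4 (rank 3, nullity 4)

d = min(dist(u, v) for u, v in combinations(C, 2))
print(d)          # 3
```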