Error Correcting Codes: Basic Notions and Gaussian Elimination, Assignments of Algebra

An introduction to error correcting codes, discussing the concept of distance in $\mathbb{Z}_2^n$, the definition of error correcting codes, and the process of decoding. It also covers Gaussian elimination for matrices with entries in $\mathbb{Z}_2$ and its application to the Hamming code, including how to reduce matrices to echelon form and how to find the solution to a system of linear equations using the reduced matrix.

Typology: Assignments

Pre 2010

Uploaded on 08/09/2009

koofers-user-jze
Chapter 2

Error Correcting Codes

The identification number schemes we discussed in the previous chapter give us the ability to determine if an error has been made in recording or transmitting information. However, they are limited in two ways. First, each allows detection of an error in just one digit, except for some special types of errors, such as interchanging digits. Second, they provide no way to recover the intended information. By making use of more sophisticated ideas and mathematical concepts, we will study methods of encoding and transmitting information that allow us to both detect and correct errors. These so-called error correcting codes are used in many places, from transmitting photographs from planetary probes to playing compact discs and DVD movies.

2.1 Basic Notions

To discuss error correcting codes, we need first to set the context and define some terms. We work throughout in binary; that is, we will work over $\mathbb{Z}_2$. To simplify notation, we will write the two elements of $\mathbb{Z}_2$ as 0 and 1 instead of as $\overline{0}$ and $\overline{1}$. If $n$ is a positive integer, then the set $\mathbb{Z}_2^n$ is the set of all $n$-tuples of $\mathbb{Z}_2$-entries. Elements of $\mathbb{Z}_2^n$ are called words, or words of length $n$. A code of length $n$ is a nonempty subset of $\mathbb{Z}_2^n$. We will refer to elements of a code as codewords. For convenience we will write elements of $\mathbb{Z}_2^n$ either with the usual notation, or as a concatenation of digits. For instance, we will write $(0, 1, 0, 1)$ and $0101$ for the same 4-tuple. We can equip $\mathbb{Z}_2^n$ with an operation of addition by using point-wise addition. That is, we define

\[ (a_1, \ldots, a_n) + (b_1, \ldots, b_n) = (a_1 + b_1, \ldots, a_n + b_n). \]

Note that, as a consequence of the facts that $0 + 0 = 0 = 1 + 1$ in $\mathbb{Z}_2$, we have $a + a = \mathbf{0}$ for any $a \in \mathbb{Z}_2^n$, where $\mathbf{0}$ is the vector $(0, \ldots, 0)$ consisting of all zeros.

Example 2.1. The set $\{01, 10, 11\}$ is a code of length 2, and $\{0000, 1010, 0101, 1111\}$ is a code of length 4.
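The point-wise addition just defined can be tried out directly. The following is a minimal Python sketch, not part of the original text; the helper names `add_words` and `parse_word` are hypothetical, and words are modeled as tuples of the integers 0 and 1.

```python
# Words of length n over Z_2 are modeled as tuples of 0/1 integers (an assumption
# made for this sketch; the text itself only fixes the mathematical definition).
def add_words(a, b):
    """Point-wise sum in Z_2^n: add componentwise and reduce modulo 2."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return tuple((x + y) % 2 for x, y in zip(a, b))

def parse_word(s):
    """Turn a digit string such as '0101' into the tuple (0, 1, 0, 1)."""
    return tuple(int(ch) for ch in s)

a = parse_word("0101")
b = parse_word("1100")
print(add_words(a, b))   # (1, 0, 0, 1)
print(add_words(a, a))   # (0, 0, 0, 0), since a + a = 0 for every word
```

Representing a word as a tuple keeps it hashable, so a code can be stored as an ordinary Python set of words.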

Let $w = a_1 \cdots a_n$ be a word of length $n$. Then the weight of $w$ is the number of digits of $w$ equal to 1. We denote the weight of $w$ by $\mathrm{wt}(w)$. There are some obvious consequences of this definition. First of all, $\mathrm{wt}(w) = 0$ if and only if $w = \mathbf{0}$. Second, $\mathrm{wt}(w)$ is a nonnegative integer. A more sophisticated fact about weight is its relation with addition. If $v, w \in \mathbb{Z}_2^n$, then $\mathrm{wt}(v + w) \le \mathrm{wt}(v) + \mathrm{wt}(w)$. To see why this is true, we write $x_i$ for the $i$-th component of a word $x$. The weight of $x$ is then given by the equation $\mathrm{wt}(x) = |\{i : 1 \le i \le n,\ x_i = 1\}|$. Using this description of weight, we note that $(v + w)_i = v_i + w_i$. Therefore, if $(v + w)_i = 1$, then either $v_i = 1$ or $w_i = 1$ (but not both). Therefore,

\[ \{i : 1 \le i \le n,\ (v + w)_i = 1\} \subseteq \{i : v_i = 1\} \cup \{i : w_i = 1\}. \]

Since $|A \cup B| \le |A| + |B|$ for any two finite sets $A, B$, the inclusion above yields $\mathrm{wt}(v + w) \le \mathrm{wt}(v) + \mathrm{wt}(w)$, as desired.
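As a sanity check on the inequality, it can be verified exhaustively for a small length. This is an illustrative Python sketch only; the function names and the choice $n = 4$ are assumptions made here, and a finite check is of course no substitute for the argument above.

```python
from itertools import product

def add_words(v, w):
    """Point-wise sum of two words over Z_2."""
    return tuple((x + y) % 2 for x, y in zip(v, w))

def wt(w):
    """Weight of a word: the number of components equal to 1."""
    return sum(w)

n = 4  # small length chosen for illustration
words = list(product((0, 1), repeat=n))

# Check wt(v + w) <= wt(v) + wt(w) for every pair of words of length n.
subadditive = all(wt(add_words(v, w)) <= wt(v) + wt(w)
                  for v in words for w in words)
print(subadditive)  # True
```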

From the idea of weight we can define the notion of distance on $\mathbb{Z}_2^n$. If $v, w$ are words, then we set the distance $D(v, w)$ between $v$ and $w$ to be

\[ D(v, w) = \mathrm{wt}(v + w). \]

Alternatively, $D(v, w)$ is equal to the number of positions in which $v$ and $w$ differ. The function $D$ shares the basic properties of distance in Euclidean space $\mathbb{R}^3$. More precisely, it satisfies the properties of the following lemma.

Lemma 2.2. The distance function $D$ defined on $\mathbb{Z}_2^n \times \mathbb{Z}_2^n$ satisfies

  1. $D(v, v) = 0$ for all $v \in \mathbb{Z}_2^n$;
  2. for any $v, w \in \mathbb{Z}_2^n$, if $D(v, w) = 0$, then $v = w$;
  3. $D(v, w) = D(w, v)$ for any $v, w \in \mathbb{Z}_2^n$;
  4. triangle inequality: $D(v, w) \le D(v, u) + D(u, w)$ for any $u, v, w \in \mathbb{Z}_2^n$.

Proof. Since $v + v = \mathbf{0}$, we have $D(v, v) = \mathrm{wt}(v + v) = \mathrm{wt}(\mathbf{0}) = 0$. This proves (1). We note that $\mathbf{0}$ is the only word of weight 0. Thus, if $D(v, w) = 0$, then $\mathrm{wt}(v + w) = 0$, which forces $v + w = \mathbf{0}$. Adding $w$ to both sides yields $v = w$, and this proves (2). The equality $D(v, w) = D(w, v)$ is obvious since $v + w = w + v$. Finally, we prove (4), the only non-obvious statement, with a cute argument. Given $u, v, w \in \mathbb{Z}_2^n$, we have $(v + u) + (u + w) = v + w$ because $u + u = \mathbf{0}$. Hence, from the definition and the fact about weight given above,

\begin{align*}
D(v, w) = \mathrm{wt}(v + w) &= \mathrm{wt}((v + u) + (u + w)) \\
&\le \mathrm{wt}(v + u) + \mathrm{wt}(u + w) \\
&= D(v, u) + D(u, w).
\end{align*}
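The four properties of the lemma can also be checked by brute force for short words. The sketch below is an illustration in Python under the assumption that words are tuples of 0s and 1s; the name `distance` and the length 3 are choices made here, not taken from the text.

```python
from itertools import product

def distance(v, w):
    """D(v, w) = wt(v + w): the number of positions where v and w differ."""
    return sum((x + y) % 2 for x, y in zip(v, w))

words = list(product((0, 1), repeat=3))  # all words of length 3

# Properties (1) and (2): D(v, w) = 0 exactly when v = w.
identity = all((distance(v, w) == 0) == (v == w) for v in words for w in words)
# Property (3): symmetry.
symmetry = all(distance(v, w) == distance(w, v) for v in words for w in words)
# Property (4): triangle inequality.
triangle = all(distance(v, w) <= distance(v, u) + distance(u, w)
               for u in words for v in words for w in words)

print(identity, symmetry, triangle)  # True True True
```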

22 Chapter 2. Error Correcting Codes

Proof. Let $w$ be a word, and suppose that $v$ is a codeword with $D(v, w) \le t$. We need to prove that $v$ is the unique closest codeword to $w$. We do this by proving that $D(u, w) > t$ for any codeword $u \ne v$. If not, suppose that $u$ is a codeword with $u \ne v$ and $D(u, w) \le t$. Then, by the triangle inequality,

\[ D(u, v) \le D(u, w) + D(w, v) \le t + t = 2t < d. \]

This is a contradiction to the definition of $d$. Thus, $v$ is indeed the unique closest codeword to $w$. To finish the proof, we need to prove that $C$ does not correct $t + 1$ errors. Since the code has distance $d$, there are codewords $u_1, u_2$ with $d = D(u_1, u_2)$. By altering appropriately $t + 1$ components of $u_1$, we can produce a word $w$ with $D(u_1, w) = t + 1$ and $D(w, u_2) = d - (t + 1)$. We can do this by considering $u_1 + u_2$, a vector with $d$ components equal to 1, and changing $d - (t + 1)$ of these components to 0, thereby obtaining a word $e$. We then set $w = u_1 + e$. Given $w$, we have $D(u_1, w) = t + 1$. Moreover, $d - 1 < 2t + 2$ by definition of $t$, so $d - (t + 1) < t + 2$, and hence $D(w, u_2) = d - (t + 1) \le t + 1$. Thus, $u_1$ is not the unique closest codeword to $w$, since $u_2$ is either equally close or closer to $w$. Therefore, $C$ is not a $(t + 1)$-error correcting code.

We need to show that if $u$ is any word of weight $\le t$ and $v$ and $w$ are distinct codewords, then $D(v, v + u) < D(w, v + u)$. To see this, first observe that $D(v + u, v) = \mathrm{wt}(u)$, so that $D(w, v + u) + \mathrm{wt}(u) = D(w, v + u) + D(v, v + u)$. The triangle inequality gives $D(w, v + u) + D(v, v + u) \ge D(w, v) \ge d$ (by definition of $d$). Moreover, $d \ge 2t + 1 \ge 2\,\mathrm{wt}(u) + 1$, so that $D(w, v + u) + \mathrm{wt}(u) \ge 2\,\mathrm{wt}(u) + 1$, and $D(w, v + u) \ge \mathrm{wt}(u) + 1 = D(v, v + u) + 1$, as desired.

Example 2.8. Let $C = \{00000, 00111, 11100, 11011\}$. The distance of $C$ is 3, and so $C$ is a 1-error correcting code.
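The distance of this code, and the effect of decoding to the closest codeword, can be computed directly. The following Python sketch rests on two assumptions that should be read with care: the first codeword is taken to be 00000 (all codewords must have length 5), and $t$ is taken to be $\lfloor (d-1)/2 \rfloor$, which is consistent with the inequalities used in the proof above. The helper `decode` is a hypothetical name.

```python
from itertools import combinations

def distance(v, w):
    """Number of positions in which the two digit strings differ."""
    return sum(a != b for a, b in zip(v, w))

# Codewords of Example 2.8; the first word is assumed to be 00000.
C = ["00000", "00111", "11100", "11011"]

d = min(distance(u, v) for u, v in combinations(C, 2))
t = (d - 1) // 2   # assumed definition of t, see the lead-in
print(d, t)        # 3 1

def decode(word, code):
    """Closest-codeword decoding; returns None if the closest codeword is not unique."""
    ranked = sorted(code, key=lambda c: distance(c, word))
    if len(ranked) > 1 and distance(ranked[0], word) == distance(ranked[1], word):
        return None
    return ranked[0]

# 00111 with its first digit flipped is decoded back to 00111.
print(decode("10111", C))  # 00111
```

Returning `None` on a tie is a design choice of this sketch: it makes visible the situation in which a received word has no unique closest codeword.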

Example 2.9. Let $n$ be an odd positive integer, and let $C = \{0 \cdots 0, 1 \cdots 1\}$ be a code of length $n$. If $n = 2t + 1$, then $C$ is a $t$-error correcting code since the distance of $C$ is $n$. Thus, by making the length of $C$ large enough, we can correct any number of errors that we wish. However, note that the fraction of components of a word that can be corrected is $t/n$, and this is always less than $1/2$.
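For this code, choosing the closest codeword amounts to a majority vote over the received digits. A minimal Python sketch follows; the function names and the choice $n = 7$, $t = 3$ are illustrative assumptions.

```python
def encode_repetition(bit, n):
    """Codeword of the length-n code {0...0, 1...1} for a single bit."""
    return (bit,) * n

def decode_repetition(word):
    """Closest-codeword decoding by majority vote; the length must be odd."""
    n = len(word)
    if n % 2 == 0:
        raise ValueError("length must be odd")
    return 1 if sum(word) > n // 2 else 0

n, t = 7, 3                          # n = 2t + 1
sent = encode_repetition(1, n)       # (1, 1, 1, 1, 1, 1, 1)
received = (0, 0, 0, 1, 1, 1, 1)     # t = 3 components were flipped
print(decode_repetition(received))   # 1
```

With four flipped components instead of three, the majority would change and the decoder would return the wrong bit, which matches the bound $t/n < 1/2$.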

2.2 Gaussian Elimination

In this section we discuss the idea of Gaussian elimination for matrices with entries in $\mathbb{Z}_2$. We do this now precisely because we need to work with matrices with entries in $\mathbb{Z}_2$ in order to discuss the Hamming code, our first example of an error correcting code.

In linear algebra, if you are given a system of linear equations, then you can write this system as a single matrix equation $AX = b$, where $A$ is the matrix of coefficients, and $X$ is the column matrix of variables. For example, the system

\begin{align*}
2x + 3y - z &= 1 \\
x - y + 5z &= 2
\end{align*}

is equivalent to the matrix equation

\[ \begin{pmatrix} 2 & 3 & -1 \\ 1 & -1 & 5 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}. \]

The primary matrix-theoretic method for solving such a system is to perform Gaussian elimination on the augmented matrix, that is, the matrix obtained by appending to the coefficient matrix one column on the right equal to the column on the right side of the equation. Recall that Gaussian elimination performs operations on the rows of a matrix in order to replace the matrix by one in which the solution to the system can be found easily. There are three such row operations:

  • multiply or divide a row by a nonzero scalar,
  • interchange two rows,
  • add a multiple of one row to another row.

It is likely that in all your work with matrices, the entries of the matrices were real numbers. However, to perform the row operations, all you need is to be able to add, subtract, multiply, and divide the entries. In many situations, matrices arise whose entries are not real numbers. For coding theory we need to work with matrices whose entries lie in $\mathbb{Z}_2 = \{0, 1\}$. Within this set we can add, subtract, multiply, and divide just as if we had real numbers. Furthermore, all the theorems of linear algebra have analogues in the setting where entries lie in $\mathbb{Z}_2$. In fact, we will generalize the idea of linear algebra later on to include many more sets of scalars. Again, all we need is to be able to perform the four arithmetic operations on the scalars, and we need properties analogous to those that hold for real number arithmetic.

Recall that the only symbolic difference between $\mathbb{Z}_2$ arithmetic and ordinary arithmetic of these symbols is that $1 + 1 = 0$ in $\mathbb{Z}_2$. Note that the first of the three row operations listed above is not useful; multiplying a row by 1 does not affect the row, so it is an operation that is not needed. Also, the third operation in the case of $\mathbb{Z}_2$ reduces to adding one row to another.
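On a computer, arithmetic in $\mathbb{Z}_2$ is often expressed with bit operations: addition is exclusive or and multiplication is logical and. Since $-1 = 1$ in $\mathbb{Z}_2$, subtraction coincides with addition. The Python sketch below illustrates this together with the one remaining nontrivial row operation; the function names are hypothetical.

```python
def add_mod2(a, b):
    """Addition in Z_2 is exclusive or, so 1 + 1 = 0."""
    return a ^ b

def mul_mod2(a, b):
    """Multiplication in Z_2 is logical and."""
    return a & b

def add_row(target, source):
    """Row operation over Z_2: replace `target` by target + source."""
    return [add_mod2(x, y) for x, y in zip(target, source)]

print(add_mod2(1, 1))                        # 0
print(mul_mod2(1, 1))                        # 1
print(add_row([1, 1, 0, 1], [0, 1, 1, 1]))   # [1, 0, 1, 0]
```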

Before working some examples, we recall what it means for a matrix to be in row reduced

echelon form.

Definition 2.10. A matrix $A$ is in row reduced echelon form if

  1. the first nonzero entry of any row is 1 (this entry is called a leading 1);
  2. if a column contains a leading 1, then all other entries of the column are 0;

2.2. Gaussian Elimination 25

we can apply the following single row operation.

We now recall why having a matrix in row reduced echelon form will give us the solution to the corresponding system of equations. The row operations on the augmented matrix correspond to performing various algebraic manipulations on the equations, such as interchanging equations. So, the system of equations corresponding to the reduced matrix is equivalent to the original system; that is, the two systems have exactly the same solutions.

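Example 2.14. Consider the system of equations

\begin{align*}
x &= 1 \\
x + y &= 1 \\
y + z &= 1.
\end{align*}

This system has augmented matrix

\[ \left( \begin{array}{ccc|c} 1 & 0 & 0 & 1 \\ 1 & 1 & 0 & 1 \\ 0 & 1 & 1 & 1 \end{array} \right), \]

and reducing this matrix yields

\[ \left( \begin{array}{ccc|c} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{array} \right). \]

This new matrix corresponds to the system of equations

\begin{align*}
x &= 1, \\
y &= 0, \\
z &= 1.
\end{align*}

Thus, we already have the solution to the original system.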
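The reduction in this example can be reproduced mechanically. The following is a minimal Python sketch of Gaussian elimination over $\mathbb{Z}_2$, using only row interchanges and adding one row to another. The function name `rref_mod2` and the list-of-lists representation are assumptions of this sketch, and the augmented matrix is written out from the system $x = 1$, $x + y = 1$, $y + z = 1$.

```python
def rref_mod2(matrix):
    """Return the row reduced echelon form of a 0/1 matrix over Z_2."""
    rows = [list(r) for r in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    pivot_row = 0
    for col in range(n_cols):
        # Look for a row at or below pivot_row with a 1 in this column.
        hit = next((r for r in range(pivot_row, n_rows) if rows[r][col] == 1), None)
        if hit is None:
            continue
        # Interchange rows so the leading 1 sits in pivot_row.
        rows[pivot_row], rows[hit] = rows[hit], rows[pivot_row]
        for r in range(n_rows):
            if r != pivot_row and rows[r][col] == 1:
                # Add the pivot row to clear every other 1 in this column.
                rows[r] = [(a + b) % 2 for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return rows

# Augmented matrix of the system x = 1, x + y = 1, y + z = 1.
augmented = [
    [1, 0, 0, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 1],
]
print(rref_mod2(augmented))
# [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 1]]
```

The last column of the result reads off $x = 1$, $y = 0$, $z = 1$.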

Example 2.15. The augmented matrix

\[ \left( \begin{array}{ccccc|c} 1 & 1 & 0 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 & 1 & 1 \end{array} \right) \]

corresponds to the system of equations

\begin{align*}
x_1 + x_2 + x_5 &= 0, \\
x_1 + x_3 &= 1, \\
x_2 + x_3 + x_4 + x_5 &= 0, \\
x_2 + x_3 + x_5 &= 1.
\end{align*}

Reducing the matrix yields

\[ \left( \begin{array}{ccccc|c} \mathbf{1} & 0 & 1 & 0 & 0 & 1 \\ 0 & \mathbf{1} & 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & \mathbf{1} & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array} \right), \]

which corresponds to the system of equations

\begin{align*}
x_1 + x_3 &= 1, \\
x_2 + x_3 + x_5 &= 1, \\
x_4 &= 1.
\end{align*}

We have left the leading ones in boldface in the echelon matrix. These correspond to the variables $x_1$, $x_2$, and $x_4$. These variables can be solved in terms of the other variables. Thus, we have the full solution

\begin{align*}
x_1 &= 1 + x_3, \\
x_2 &= 1 + x_3 + x_5, \\
x_4 &= 1,
\end{align*}

where $x_3$ and $x_5$ are arbitrary.

We can write out all solutions to this system of equations, since each of $x_3$ and $x_5$ can take on the two values 0 and 1. This gives us four solutions, which we write as row vectors.

\begin{align*}
(x_1, x_2, x_3, x_4, x_5) &= (1, 1, 0, 1, 0) \quad (x_3 = 0,\ x_5 = 0), \\
(x_1, x_2, x_3, x_4, x_5) &= (0, 0, 1, 1, 0) \quad (x_3 = 1,\ x_5 = 0), \\
(x_1, x_2, x_3, x_4, x_5) &= (1, 0, 0, 1, 1) \quad (x_3 = 0,\ x_5 = 1), \\
(x_1, x_2, x_3, x_4, x_5) &= (0, 1, 1, 1, 1) \quad (x_3 = 1,\ x_5 = 1).
\end{align*}
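Because there are only $2^5 = 32$ candidate vectors, the list of solutions can be confirmed by trying every one against the original four equations. The Python sketch below does this; the encoding of each equation as a pair (variable indices, right-hand side) is a convention chosen here for illustration.

```python
from itertools import product

# Each equation of the system as (indices of the variables that appear, right-hand side).
equations = [
    ((1, 2, 5), 0),       # x1 + x2 + x5 = 0
    ((1, 3), 1),          # x1 + x3 = 1
    ((2, 3, 4, 5), 0),    # x2 + x3 + x4 + x5 = 0
    ((2, 3, 5), 1),       # x2 + x3 + x5 = 1
]

def satisfies(x, equations):
    """x is a tuple (x1, ..., x5); check every equation modulo 2."""
    return all(sum(x[i - 1] for i in idx) % 2 == rhs for idx, rhs in equations)

solutions = [x for x in product((0, 1), repeat=5) if satisfies(x, equations)]
for x in solutions:
    print(x)
# (0, 0, 1, 1, 0)
# (0, 1, 1, 1, 1)
# (1, 0, 0, 1, 1)
# (1, 1, 0, 1, 0)
```

Brute force is only practical for a handful of variables; the reduced matrix gives the same answer without enumerating anything.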

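Example 2.16. Let $H$ be the Hamming matrix

\[ H = \begin{pmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{pmatrix}. \]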

28 Chapter 2. Error Correcting Codes

the dimension of the row space is the number of leading 1’s. Thus, an alternative definition

of the rank of a matrix is that it is equal to the number of leading 1’s in the row reduced

echelon form obtained from the matrix.

The kernel, or nullspace, of a matrix $A$ is the set of all solutions to the homogeneous equation $AX = \mathbf{0}$. To help understand this definition, consider the Hamming matrix $H$ of the previous example.

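Example 2.17. The solution to the homogeneous equation $HX = \mathbf{0}$ from the previous example is

\begin{align*}
x_1 &= x_3 + x_5 + x_7, \\
x_2 &= x_3 + x_6 + x_7, \\
x_4 &= x_5 + x_6 + x_7,
\end{align*}

where $x_3, x_5, x_6, x_7$ are arbitrary.

For each arbitrary variable we can set it equal to 1 and all other arbitrary variables equal to 0. The resulting vector will be a solution to $HX = \mathbf{0}$. If we do this for each arbitrary variable, we will have a basis for the nullspace. Doing this, we get the four vectors

\[ \begin{pmatrix} 1 \\ 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 1 \\ 1 \\ 0 \\ 1 \\ 0 \\ 0 \\ 1 \end{pmatrix}. \]

These vectors do form a basis for the nullspace of $H$ since the general solution of $HX = \mathbf{0}$ is

\[ \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{pmatrix} = \begin{pmatrix} x_3 + x_5 + x_7 \\ x_3 + x_6 + x_7 \\ x_3 \\ x_5 + x_6 + x_7 \\ x_5 \\ x_6 \\ x_7 \end{pmatrix} = x_3 \begin{pmatrix} 1 \\ 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} + x_5 \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \\ 0 \end{pmatrix} + x_6 \begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \\ 0 \\ 1 \\ 0 \end{pmatrix} + x_7 \begin{pmatrix} 1 \\ 1 \\ 0 \\ 1 \\ 0 \\ 0 \\ 1 \end{pmatrix}. \]

From this equation, we see that every solution is a linear combination of the four specific solutions written above, and a little work will show that every solution can be written in a unique way as a linear combination of these vectors. For example, one can check that $(0, 1, 1, 1, 1, 0, 0)$ is a solution to the system $HX = \mathbf{0}$, and that to write this vector as a linear combination of the four given vectors, we must have $x_3 = x_5 = 1$ and $x_6 = x_7 = 0$, and so

\[ \begin{pmatrix} 0 \\ 1 \\ 1 \\ 1 \\ 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \\ 0 \end{pmatrix} \]

is a sum of two of the four given vectors, and can be written in no other way in terms of the four.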
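The recipe of this example, one basis vector per arbitrary variable, is easy to carry out in code. In the Python sketch below, $H$ is assumed to be the matrix whose $j$-th column is the base 2 representation of $j$ with the most significant digit in the top row; that row order is an assumption, and the helper names are hypothetical.

```python
# Hamming matrix: column j is the base 2 representation of j (row order assumed).
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def in_nullspace(x):
    """Check HX = 0 over Z_2 for the word x = (x1, ..., x7)."""
    return all(sum(h * xi for h, xi in zip(row, x)) % 2 == 0 for row in H)

def solution(x3, x5, x6, x7):
    """Solution of HX = 0 determined by the arbitrary variables x3, x5, x6, x7."""
    x1 = (x3 + x5 + x7) % 2
    x2 = (x3 + x6 + x7) % 2
    x4 = (x5 + x6 + x7) % 2
    return (x1, x2, x3, x4, x5, x6, x7)

# One basis vector per arbitrary variable: set it to 1 and the others to 0.
basis = [solution(1, 0, 0, 0), solution(0, 1, 0, 0),
         solution(0, 0, 1, 0), solution(0, 0, 0, 1)]
print(basis)
print(all(in_nullspace(b) for b in basis))  # True

def add_words(v, w):
    """Point-wise sum of two words over Z_2."""
    return tuple((a + b) % 2 for a, b in zip(v, w))

# The solution with x3 = x5 = 1 and x6 = x7 = 0 is the sum of the first two basis vectors.
print(add_words(basis[0], basis[1]))  # (0, 1, 1, 1, 1, 0, 0)
```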

This example indicates the following general fact: for a homogeneous system $AX = \mathbf{0}$, the number of variables not corresponding to leading 1's is equal to the dimension of the nullspace of $A$. Let us call the variables that do correspond to leading 1's leading variables. If we reduce $A$, the leading variables can be solved in terms of the other variables, and these other variables are all arbitrary; we call them free variables. By mimicking the example above, any solution can be written uniquely in terms of a set of solutions, one for each free variable. This set of solutions is a basis for the nullspace of $A$; therefore, the number of free variables is equal to the dimension of the nullspace. Every variable is then either a leading variable or a free variable. The number of variables is the number of columns of the matrix. This observation leads to the rank-nullity theorem. The nullity of a matrix $A$ is the dimension of the nullspace of $A$.

Theorem 2.18 (Rank-Nullity). Let A be an n × m matrix. Then m is equal to the sum

of the rank of A and the nullity of A.

The point of this theorem is that once you know the rank of $A$, the nullity of $A$ can be immediately calculated. The number of solutions to $AX = \mathbf{0}$ can then be found: over $\mathbb{Z}_2$, a nullspace of dimension $k$ contains exactly $2^k$ vectors. In coding theory this will allow us to determine the number of codewords in a given code.
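To see the theorem at work on a concrete matrix, the sketch below computes the rank over $\mathbb{Z}_2$ by elimination, derives the nullity from the number of columns, and compares $2^{\text{nullity}}$ with a brute-force count of the solutions of $AX = \mathbf{0}$. It is written in Python for illustration only; the matrix $A$ is the coefficient matrix of the system in Example 2.15, and `rank_mod2` is a hypothetical helper name.

```python
from itertools import product

def rank_mod2(matrix):
    """Rank over Z_2: the number of leading 1's found during elimination."""
    rows = [list(r) for r in matrix]
    rank = 0
    for col in range(len(rows[0])):
        hit = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if hit is None:
            continue
        rows[rank], rows[hit] = rows[hit], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                rows[r] = [(a + b) % 2 for a, b in zip(rows[r], rows[rank])]
        rank += 1
        if rank == len(rows):
            break
    return rank

# Coefficient matrix of the system in Example 2.15 (4 rows, m = 5 columns).
A = [
    [1, 1, 0, 0, 1],
    [1, 0, 1, 0, 0],
    [0, 1, 1, 1, 1],
    [0, 1, 1, 0, 1],
]

m = len(A[0])
rank = rank_mod2(A)
nullity = m - rank   # rank-nullity theorem

# Brute-force count of the solutions of AX = 0.
count = sum(1 for x in product((0, 1), repeat=m)
            if all(sum(a * xi for a, xi in zip(row, x)) % 2 == 0 for row in A))

print(rank, nullity, count)  # 3 2 4
```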

2.3 The Hamming Code

The Hamming code, discovered independently by Hamming and Golay, was the first example of an error correcting code. Let

\[ H = \begin{pmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{pmatrix} \]

be the Hamming matrix, described in Example 2.16 above. Note that the columns of this matrix give the base 2 representation of the integers 1 through 7. The Hamming code $C$ of length 7 is the nullspace of $H$. More precisely,

\[ C = \{ v \in K^7 : Hv^T = \mathbf{0} \}. \]
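Since there are only $2^7 = 128$ words of length 7, the code can be listed by testing each word against $H$. The Python sketch below assumes that the entries are taken in $\mathbb{Z}_2$ and that $H$ has the base 2 representations of 1 through 7 as its columns, with the most significant digit in the top row; the helper name `apply_H` is hypothetical.

```python
from itertools import product

# Hamming matrix: column j is the base 2 representation of j (row order assumed).
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def apply_H(v):
    """Compute H v^T over Z_2 and return it as a tuple of three bits."""
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)

# The code is the set of words that H sends to the zero vector.
code = [v for v in product((0, 1), repeat=7) if apply_H(v) == (0, 0, 0)]
print(len(code))  # 16

# Smallest weight of a nonzero codeword.
min_weight = min(sum(v) for v in code if any(v))
print(min_weight)  # 3
```

The count of 16 agrees with the rank-nullity theorem: $H$ has rank 3 and 7 columns, so its nullspace has dimension 4 and contains $2^4$ words.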