Error-Correcting Codes: Definitions, Properties, and Existential Bounds | Study notes Number Theory

CS225: Pseudorandomness Prof. Salil Vadhan

Lecture 14: Error-Correcting Codes

April 3, 2007

Based on scribe notes by Sasha Schwartz and Adi Akavia.

1 Basic Definitions

The field of coding theory is motivated by the problem of communicating reliably over noisy channels, where the data sent over the channel may come out corrupted on the other end, but we nevertheless want the receiver to be able to correct the errors and recover the original message. There is a vast literature studying aspects of this problem from the perspectives of electrical engineering (communications and information theory), computer science (algorithms and complexity), and mathematics (combinatorics and algebra). In this course, we are interested in codes as ‘pseudorandom objects,’ ones that are intimately related to the other objects we are studying. In particular, we will see how to use ideas from coding theory to construct the condensers and unbalanced expanders that we assumed in the previous lectures (for our construction of extractors).

The approach to communicating over a noisy channel is to restrict the data we send to be from a certain set of strings that can be easily disambiguated (even after being corrupted).

Definition 1 A $q$-ary code is a set $C \subseteq \Sigma^{\hat{n}}$, where $\Sigma$ is an alphabet of size $q$. Elements of $C$ are called codewords. Some key parameters:

• $\hat{n}$ is the block length.
• $n = \log_2 |C|$ is the message length.
• $\rho = n/(\hat{n} \cdot \log |\Sigma|)$ is the (relative) rate of the code.

An encoding function for $C$ is an injective mapping $\mathrm{Enc} : \{0,1\}^n \to C$ (for $n$ a positive integer). Given such an encoding function, we view the strings in $\{0,1\}^n$ as messages. The code is explicit if Enc is computable in polynomial time.
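As a concrete check of the parameters in Definition 1, the following minimal sketch computes the block length, message length, and rate of a small code. The function name `code_parameters`, the representation of codewords as Python strings, and the choice of the binary 3-fold repetition code as the example are illustrative assumptions, not part of the lecture. The logarithm in the rate is taken base 2 here so that it matches the definition of the message length.

```python
import math

def code_parameters(code, alphabet):
    """Return (block length, message length, rate) of a code.

    `code` is a collection of equal-length strings over `alphabet`.
    Hypothetical helper for illustration only.
    """
    codewords = set(code)
    if not codewords:
        raise ValueError("the code must contain at least one codeword")
    if len(alphabet) < 2:
        raise ValueError("the alphabet must have at least two symbols")

    # Block length: the common length of all codewords.
    block_length = len(next(iter(codewords)))
    if any(len(c) != block_length for c in codewords):
        raise ValueError("all codewords must have the same block length")
    if any(sym not in alphabet for c in codewords for sym in c):
        raise ValueError("codewords must use symbols from the alphabet")

    # Message length n = log2 |C| (need not be an integer in general).
    message_length = math.log2(len(codewords))

    # Rate = n / (block length * log2 |alphabet|); base 2 is an assumption.
    rate = message_length / (block_length * math.log2(len(alphabet)))
    return block_length, message_length, rate

# Binary 3-fold repetition code: one message bit is sent three times.
repetition = {"000", "111"}
n_hat, n, rho = code_parameters(repetition, alphabet={"0", "1"})
print(n_hat, n, rho)
```

For this repetition code, the block length is 3, the message length is 1, and the rate is 1/3: each transmitted symbol carries one third of a message bit.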

Note that every code $C$ whose message length is an integer has an encoding function Enc. We view $C$ and Enc as being essentially the same object (with Enc merely providing a ‘labelling’ of codewords), with the former being useful for studying the combinatorics of codes and the latter for algorithmic purposes. Our notation differs from the standard notation in coding theory in several ways. Typically in coding theory, the input alphabet is taken to be the same as the output alphabet (rather than $\{0,1\}$ and $\Sigma$, respectively), the block length is denoted $n$, and the message length (over $\Sigma$) is denoted $k$ and is referred to as the rate.
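The remark that every code with integer message length has an encoding function can be illustrated by listing the codewords in some fixed order and assigning the i-th message to the i-th codeword. The sketch below does this with a lookup table. The name `make_encoder`, the use of sorted order as the labelling, and the even-weight example code are assumptions made for illustration. Because the table has $2^n$ entries, this shows only that an encoding function exists; it does not make the code explicit in the polynomial-time sense.

```python
from itertools import product

def make_encoder(code):
    """Build an injective map Enc: {0,1}^n -> C for a code with |C| = 2^n.

    Returns (enc, n). The labelling (sorted order) is an arbitrary choice.
    Hypothetical helper for illustration only.
    """
    codewords = sorted(set(code))
    size = len(codewords)
    if size < 2:
        raise ValueError("need at least two codewords so that n is positive")

    # n = log2 |C| must be an integer, i.e. |C| must be a power of two.
    n = size.bit_length() - 1
    if (1 << n) != size:
        raise ValueError("|C| must be a power of two")

    # Enumerate all n-bit messages in lexicographic order and pair them
    # with the sorted codewords; distinct messages get distinct codewords.
    messages = ["".join(bits) for bits in product("01", repeat=n)]
    table = dict(zip(messages, codewords))

    def enc(message):
        return table[message]

    return enc, n

# Example: the binary code of length 3 consisting of even-weight strings.
even_weight = {"000", "011", "101", "110"}
enc, n = make_encoder(even_weight)
for m in ["00", "01", "10", "11"]:
    print(m, "->", enc(m))
```

Any other bijection between messages and codewords would serve equally well, which is the sense in which Enc merely provides a labelling of the codewords.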

So far, we haven’t talked at all about the error-correcting properties of codes. Here we need to specify two things: the model of errors (as introduced by the noisy channel) and the notion of a successful recovery.
