



Prof. Daniel A. Spielman, Engineering, Parity, The Gaussian Distribution, The Gaussian and Erasure Channels, The Parity Product Code, bit error rate, BER, Heuristic Decoding of the Parity Product Code, Confidence Intervals, Lab exercise, Yale, MIT




18.413: Error-Correcting Codes Lab February 19, 2004
Lecturer: Daniel A. Spielman
One of the most natural distributions is the Gaussian distribution. A Gaussian random variable with mean $\mu$ and standard deviation $\sigma$ has probability density function
$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}.$$
For those who haven’t learned about probability density functions before, this means that the probability that $x$ lies between $x_0$ and $x_1$ is
$$\int_{x=x_0}^{x_1} p(x)\,dx.$$
You can verify that
$$\int_{x=-\infty}^{\infty} p(x)\,dx = 1.$$
The standard deviation tells you how far from the mean $x$ is likely to lie. In particular, one can prove that
$$P[x - \mu > k\sigma] < \frac{e^{-k^2/2}}{\sqrt{2\pi}\,k}$$
and that this bound is very close to tight for $k \ge 2$. For $\sigma = 1$ and $\mu = 0$, this bound indicates that the probability that the Gaussian lies between $-2$ and $2$ is at least 0.946, whereas the actual probability is about 0.954.
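As a quick numerical check of the tail bound, here is a minimal sketch in Python (not the MATLAB used elsewhere in these notes). The function names `tail_bound` and `tail_exact` are my own, and the exact tail is computed from the complementary error function in Python's standard `math` module.

```python
import math

def tail_bound(k):
    """Upper bound e^{-k^2/2} / (sqrt(2*pi) * k) on P[x - mu > k*sigma]."""
    return math.exp(-k * k / 2) / (math.sqrt(2 * math.pi) * k)

def tail_exact(k):
    """Exact upper tail of a standard Gaussian, using erfc."""
    return 0.5 * math.erfc(k / math.sqrt(2))

# Compare the bound with the exact tail for a few values of k.
for k in (1, 2, 3, 4):
    print(k, tail_bound(k), tail_exact(k))

# Two-sided version for k = 2, sigma = 1, mu = 0:
# lower bound on P[-2 < x < 2] versus the exact probability.
print(1 - 2 * tail_bound(2), 1 - 2 * tail_exact(2))
```

The gap between the two columns shrinks as k grows, which is what "close to tight" means here.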
If $g$ is a Gaussian random variable of mean 0 and standard deviation 1, then $\sigma g + \mu$ is a Gaussian random variable of mean $\mu$ and standard deviation $\sigma$. Often, you will see a Gaussian described by its variance. Its variance is the square of its standard deviation.
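The scaling rule can be checked empirically. The following is a small Python sketch, with the helper name `gaussian_samples` and the values of mu and sigma chosen only for illustration: it draws standard Gaussians, applies sigma*g + mu, and prints the sample mean, standard deviation, and variance.

```python
import random
import statistics

def gaussian_samples(mu, sigma, count, seed=0):
    """Draw standard Gaussians g and return sigma * g + mu for each."""
    rng = random.Random(seed)
    return [sigma * rng.gauss(0.0, 1.0) + mu for _ in range(count)]

samples = gaussian_samples(mu=3.0, sigma=2.0, count=20000)
# The sample mean should be near mu, the standard deviation near sigma,
# and the variance near sigma squared.
print(statistics.fmean(samples),
      statistics.pstdev(samples),
      statistics.pvariance(samples))
```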
Gaussian random variables arise as the limit of binomial random variables. That is, if we let $x_1, \ldots, x_n$ be independent random variables uniformly distributed in $\{1, -1\}$, and let $X = \sum_i x_i/\sqrt{n}$, then as $n$ grows large, $X$ approaches the Gaussian distribution. We will now sketch a proof of this.
In particular, we consider the probability that $X = c$, that is, that $\sum_i x_i = c\sqrt{n}$. From counting, we can determine that this is
$$P\left[\sum_i x_i = c\sqrt{n}\right] = 2^{-n}\binom{n}{n/2 - c\sqrt{n}/2}.$$
We will show that this is approximately equal to
$$\int_{g=c-1/\sqrt{n}}^{c+1/\sqrt{n}} p(g)\,dg \approx \frac{2p(c)}{\sqrt{n}} = \sqrt{\frac{2}{\pi n}}\, e^{-c^2/2}.$$
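Applying Stirling’s formula, we find that
$$\begin{aligned}
2^{-n}\binom{n}{n/2-k} &= 2^{-n}\,\frac{n!}{(n/2-k)!\,(n/2+k)!} \\
&\approx 2^{-n}\,\frac{\left(\frac{n}{e}\right)^{n}}{\left(\frac{n/2-k}{e}\right)^{n/2-k}\left(\frac{n/2+k}{e}\right)^{n/2+k}} \cdot \frac{\sqrt{2\pi n}}{\sqrt{2\pi(n/2-k)}\,\sqrt{2\pi(n/2+k)}} \\
&= \frac{n^{n}}{(n-2k)^{n/2-k}\,(n+2k)^{n/2+k}} \cdot \frac{\sqrt{2\pi n}}{\sqrt{2\pi(n/2-k)}\,\sqrt{2\pi(n/2+k)}} \\
&= \frac{n^{n-2k}}{(n^2-4k^2)^{n/2-k}} \cdot \frac{n^{2k}}{(n+2k)^{2k}} \cdot \frac{\sqrt{2\pi n}}{\sqrt{2\pi(n/2-k)}\,\sqrt{2\pi(n/2+k)}}.
\end{aligned}$$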
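We now evaluate each of the three terms in this product individually, substituting in $k = (c/2)\sqrt{n}$. First, we find
$$\frac{n^{n-2k}}{(n^2-4k^2)^{n/2-k}} = \frac{n^{n-c\sqrt{n}}}{(n^2-c^2 n)^{n/2-(c/2)\sqrt{n}}} = \left(\frac{1}{1-c^2/n}\right)^{n/2-(c/2)\sqrt{n}} \approx \left(1+\frac{c^2}{n}\right)^{n/2-(c/2)\sqrt{n}} \approx e^{c^2/2},$$
as $(1+1/m)^m \approx e$ for large $m$. To evaluate the other term, we compute
$$\frac{n^{2k}}{(n+2k)^{2k}} = \frac{n^{c\sqrt{n}}}{(n+c\sqrt{n})^{c\sqrt{n}}} = \left(\frac{1}{1+c/\sqrt{n}}\right)^{c\sqrt{n}} \approx \left(1-\frac{c}{\sqrt{n}}\right)^{c\sqrt{n}} \approx e^{-c^2},$$
as $(1-1/m)^m \approx e^{-1}$ for large $m$. Finally, we find
$$\frac{\sqrt{2\pi n}}{\sqrt{2\pi(n/2-k)}\,\sqrt{2\pi(n/2+k)}} \approx \sqrt{\frac{2}{\pi n}}.$$
Multiplying the three estimates gives $e^{c^2/2} \cdot e^{-c^2} \cdot \sqrt{2/(\pi n)} = \sqrt{2/(\pi n)}\, e^{-c^2/2}$, as claimed.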
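To see how good the approximation is for finite n, here is a minimal Python sketch comparing the exact binomial probability 2^{-n} C(n, n/2 - k) with the Gaussian estimate sqrt(2/(pi n)) e^{-c^2/2}, where c = 2k/sqrt(n). The function names are my own, and the sketch assumes that n is even and k is an integer.

```python
import math

def exact_prob(n, k):
    """2^{-n} * C(n, n/2 - k): probability that n uniform +-1 variables sum to 2k."""
    return math.comb(n, n // 2 - k) / 2 ** n

def gaussian_approx(n, k):
    """sqrt(2 / (pi n)) * exp(-c^2 / 2) with c = 2k / sqrt(n)."""
    c = 2 * k / math.sqrt(n)
    return math.sqrt(2 / (math.pi * n)) * math.exp(-c * c / 2)

# Print both values and their ratio for a few (n, k) pairs.
for n, k in [(100, 5), (1000, 10), (10000, 50), (10000, 100)]:
    exact, approx = exact_prob(n, k), gaussian_approx(n, k)
    print(n, k, exact, approx, exact / approx)
```

The ratio in the last column should approach 1 as n grows with c held fixed.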
This is how we interpret the output of the Gaussian Channel.
To quickly compare these channels, we note that the capacity of the $\mathrm{BEC}_p$ is $1 - p$, which is exactly what you would get if you asked for a retransmit of every lost message. The capacity of the Gaussian channel with standard deviation $\sigma$ is
$$\frac{1}{2}\log_2\left(1 + \frac{1}{\sigma^2}\right).$$
Warning: I’ve lied here a little bit: this is actually the capacity if we are allowed to use arbitrary input alphabets, subject to an average power constraint. The capacity is a little bit lower if we restrict ourselves to $\{-1, 1\}$ inputs. There is no nice closed form for the capacity when we restrict to $\{-1, 1\}$ inputs, but you could compute it empirically using the techniques from Small Project
For comparison with the BSC, we note that we obtain capacity 1/2 when $\sigma = 1$. If we were to naively round the received value $y$, treating it as 1 if $y > 0$ and $-1$ otherwise, we would obtain a $\mathrm{BSC}_{0.1587}$. However, the capacity of the $\mathrm{BSC}_{0.16}$ is 0.366. To get capacity 1/2, we need the $\mathrm{BSC}_{0.1101}$. So you lose a lot if you throw out the extra information provided by the channel.
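These numbers can be reproduced with a few lines of Python. This is a sketch under two assumptions that these notes use but do not spell out: the capacity of a BSC with crossover probability p is 1 - H(p), where H is the binary entropy function, and rounding turns the Gaussian channel into a BSC whose crossover probability is the chance that the noise exceeds 1 in magnitude in the wrong direction. The function names are my own.

```python
import math

def gaussian_capacity(sigma):
    """Capacity (1/2) log2(1 + 1/sigma^2) under an average power constraint."""
    return 0.5 * math.log2(1 + 1 / sigma ** 2)

def binary_entropy(p):
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity 1 - H(p) of the binary symmetric channel."""
    return 1 - binary_entropy(p)

def rounding_crossover(sigma):
    """P[noise < -1] for Gaussian noise of standard deviation sigma."""
    return 0.5 * math.erfc(1 / (sigma * math.sqrt(2)))

print(gaussian_capacity(1.0))      # Gaussian channel at sigma = 1
p = rounding_crossover(1.0)
print(p, bsc_capacity(p))          # BSC obtained by rounding, and its capacity
print(bsc_capacity(0.11))          # a BSC with capacity close to 1/2
```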
Note that the Gaussian Channel on $\{1, -1\}$ inputs is a symmetric channel. That means that we can view it as a probability distribution over BSC channels, with a crossover probability $p \le 1/2$ occurring with density $e^{-(p-1)^2/2\sigma^2}$.
This means that you all know a way to experimentally compute the capacity of this channel.
When describing codes, we have usually used the bits $\{0, 1\}$, whereas I’m using 1 and −1 over the Gaussian Channel. The standard translation is to identify binary 0 with Real 1 and binary 1 with Real −1.
In Small Project 2, we will consider the following code. It will have 4 input bits, $w_1, w_2, w_3, w_4$. The first 4 bits of the codeword $x_1, \ldots, x_4$ will be set to these. The remaining bits will be set by the following rules:
$$\begin{aligned}
x_5 &= x_1 \oplus x_2 \\
x_6 &= x_3 \oplus x_4 \\
x_7 &= x_1 \oplus x_3 \\
x_8 &= x_2 \oplus x_4 \\
x_9 &= x_7 \oplus x_8
\end{aligned}$$
We note that all these relations also imply
$$x_9 = x_5 \oplus x_6.$$
We usually understand this code by writing the bits in a matrix, like this
$$\begin{pmatrix} x_1 & x_2 & x_5 \\ x_3 & x_4 & x_6 \\ x_7 & x_8 & x_9 \end{pmatrix},$$
and observing that each row and column must have parity 0. This is why we call it a product code: each row and each column looks like a code (we’ll see more of these later).
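The encoding rules and the row and column parity property can be written out directly. Here is a minimal Python sketch; the function names `encode`, `as_matrix`, and `parities` are my own, and bits are represented as the integers 0 and 1.

```python
import itertools

def encode(w):
    """Encode 4 message bits into the 9-bit parity product codeword."""
    x1, x2, x3, x4 = w
    x5 = x1 ^ x2
    x6 = x3 ^ x4
    x7 = x1 ^ x3
    x8 = x2 ^ x4
    x9 = x7 ^ x8
    return [x1, x2, x3, x4, x5, x6, x7, x8, x9]

def as_matrix(x):
    """Arrange the codeword as rows (x1 x2 x5), (x3 x4 x6), (x7 x8 x9)."""
    x1, x2, x3, x4, x5, x6, x7, x8, x9 = x
    return [[x1, x2, x5], [x3, x4, x6], [x7, x8, x9]]

def parities(m):
    """Return the parity of each row and of each column of a 3x3 bit matrix."""
    rows = [r[0] ^ r[1] ^ r[2] for r in m]
    cols = [m[0][j] ^ m[1][j] ^ m[2][j] for j in range(3)]
    return rows, cols

# Every one of the 16 codewords has all row and column parities equal to 0.
for w in itertools.product((0, 1), repeat=4):
    print(w, as_matrix(encode(w)), parities(as_matrix(encode(w))))
```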
If we flip one bit in a codeword of this code, then the parity constraints on the row and column containing the flipped bit will be violated, indicating where the error is. If two bits are flipped, you might be able to detect that an error has occurred, but won’t be able to figure out where it is. That is, if you only have bits and not any confidence information.
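Here is a small Python sketch of this hard-decision procedure: find the violated rows and columns, and flip the bit at their intersection when exactly one row and one column are violated. The names `GRID`, `violated`, and `correct_single_error` are my own. In this sketch, a pair of flipped bits always leaves some parity violated, but never exactly one row and one column, so no position is identified.

```python
def encode(w):
    """Parity product code: 4 message bits -> 9 codeword bits."""
    x1, x2, x3, x4 = w
    x5, x6, x7, x8 = x1 ^ x2, x3 ^ x4, x1 ^ x3, x2 ^ x4
    return [x1, x2, x3, x4, x5, x6, x7, x8, x7 ^ x8]

# Zero-based positions of x1..x9 in the 3x3 layout
# (x1 x2 x5), (x3 x4 x6), (x7 x8 x9).
GRID = [[0, 1, 4], [2, 3, 5], [6, 7, 8]]

def violated(y):
    """Return the lists of row and column indices whose parity is 1."""
    rows = [i for i, r in enumerate(GRID) if y[r[0]] ^ y[r[1]] ^ y[r[2]]]
    cols = [j for j in range(3)
            if y[GRID[0][j]] ^ y[GRID[1][j]] ^ y[GRID[2][j]]]
    return rows, cols

def correct_single_error(y):
    """Flip the bit at the unique violated row/column intersection, if any."""
    rows, cols = violated(y)
    y = list(y)
    if len(rows) == 1 and len(cols) == 1:
        y[GRID[rows[0]][cols[0]]] ^= 1
    return y, rows, cols

x = encode([1, 0, 1, 1])
y = list(x)
y[3] ^= 1                          # flip x4
print(correct_single_error(y)[0] == x)
```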
If, in addition to the bits, you have information for each bit indicating how confident you are that the bit is 0 or 1 (that is, if you have the full information provided by the channel), then you can decode in one of the ways discussed in the last lecture: maximum likelihood decoding to minimize either word error or bit error. In Small Project 2, we will minimize the bit error rate, denoted BER.
I’ll now define the bit error rate (BER) precisely. It depends upon the coding scheme, the channel, and the decoding scheme. Let $w_1, \ldots, w_k$ be the bits in the message to be sent. We assume that they are encoded as $x_1, \ldots, x_n$, and received by the decoder as $y_1, \ldots, y_n$. We then let $z_1, \ldots, z_k$ be the outputs of the decoder, which we now force to output a 0 or 1 for each bit. The bit error rate is then
$$\mathrm{E}\left[\frac{1}{k}\sum_i P[w_i \ne z_i]\right].$$
Empirically, you can compute this by averaging
$$\frac{1}{k}\sum_i [w_i \ne z_i],$$
where $[w_i \ne z_i]$ is 1 if $w_i \ne z_i$ and 0 otherwise,
over many trials.
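The following Python sketch shows what such an empirical estimate might look like for the parity product code over the Gaussian channel. It is only an illustration under stated assumptions: messages are uniformly random, binary 0 is sent as +1 and binary 1 as -1, and the decoder enumerates all 16 codewords and picks, for each message bit, the value with the larger posterior probability. All function names are my own.

```python
import itertools
import math
import random

def encode(w):
    """Parity product code: 4 message bits -> 9 codeword bits."""
    x1, x2, x3, x4 = w
    x5, x6, x7, x8 = x1 ^ x2, x3 ^ x4, x1 ^ x3, x2 ^ x4
    return [x1, x2, x3, x4, x5, x6, x7, x8, x7 ^ x8]

CODEWORDS = [encode(w) for w in itertools.product((0, 1), repeat=4)]

def transmit(x, sigma, rng):
    """Map binary 0 -> +1 and binary 1 -> -1, then add Gaussian noise."""
    return [(1 - 2 * b) + rng.gauss(0.0, sigma) for b in x]

def decode_bitwise(y, sigma):
    """For each message bit, output the value with larger posterior probability."""
    dists = [sum((yi - (1 - 2 * b)) ** 2 for yi, b in zip(y, c))
             for c in CODEWORDS]
    best = min(dists)  # subtracted so that exp does not underflow
    weights = [math.exp(-(d - best) / (2 * sigma ** 2)) for d in dists]
    total = sum(weights)
    z = []
    for i in range(4):
        p1 = sum(wt for wt, c in zip(weights, CODEWORDS) if c[i] == 1)
        z.append(1 if p1 > total - p1 else 0)
    return z

def estimate_ber(sigma, trials, seed=0):
    """Average (1/k) * sum_i [w_i != z_i] over the given number of trials."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        w = [rng.randint(0, 1) for _ in range(4)]
        z = decode_bitwise(transmit(encode(w), sigma, rng), sigma)
        total += sum(wi != zi for wi, zi in zip(w, z)) / 4
    return total / trials

print(estimate_ber(sigma=1.0, trials=2000))
```

Enumerating all codewords is only feasible because this code has 16 of them.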
I will now describe a heuristic decoding algorithm for the code we will examine on Small Project 2. This algorithm will be less accurate and take longer to run than the ideal algorithm. However, for larger codes the natural extension of this algorithm will be practical and the ideal algorithm will be impractical. For now, I will assume that we are just trying to compute $x_1$.
Here is the algorithm:
Let’s say that we are trying to estimate the BER, but the BER is quite small. You might want to know how many trials you need to run to estimate the BER reasonably.
Rather than fixing in advance how many trials you will run, you can run until you see some fixed number of errors. For crude data, just to get the general order of magnitude, 10 observations would be reasonable. If you want to get a digit of accuracy, then you need 100 observations. The formula given in the previous section will give you a reasonable confidence interval.
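The stopping rule itself is easy to sketch. The Python example below uses a deliberately simple stand-in for the real experiment: a single uncoded bit sent as +1 over the Gaussian channel and rounded at the receiver, so that the true error rate is known. The names `trial_has_error` and `run_until_errors` are my own, and the confidence-interval formula is not reproduced here.

```python
import math
import random

def trial_has_error(sigma, rng):
    """Send binary 0 as +1; an error occurs if the rounded output is not +1."""
    return 1.0 + rng.gauss(0.0, sigma) <= 0

def run_until_errors(sigma, target_errors, seed=0):
    """Run trials until target_errors errors are seen; return (estimate, trials)."""
    rng = random.Random(seed)
    errors = trials = 0
    while errors < target_errors:
        trials += 1
        errors += trial_has_error(sigma, rng)
    return errors / trials, trials

estimate, trials = run_until_errors(sigma=0.5, target_errors=100)
true_rate = 0.5 * math.erfc(1 / (0.5 * math.sqrt(2)))  # P[noise < -1]
print(estimate, trials, true_rate)
```

With this rule the number of trials adapts to the error rate: a small error rate automatically gets more trials.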
To make many plots appear on the same set of axes, type hold on after you plot.
Here is an example of how to plot a confidence interval on some imaginary data. In this case, I’ve made the confidence interval go from y(i) - sig to y(i) + sig, just to show you how the graphics should look.
    x = [1:10];
    y = x.^(1.5);
    plot(x,y)
    hold on
    % mark the data points on the same axes
    plot(x,y,'o')
    % mark the two ends of the interval around the point (x(i), y(i))
    i = 4;
    sig = 3;
    plot([x(i),x(i)],[y(i)-sig,y(i)+sig],'+')