



Material Type: Exam; Class: Information Theory; Subject: Electrical and Computer Engr; University: University of Illinois - Urbana-Champaign; Term: Fall 2004;




Date Assigned: 17 November 2004. Date Due: 1 December 2004 in class. Instructions: I expect you to work on the problems by yourself. You can refer to any textbook (or the technical literature in general), but not confer with any person.
(a) How many bits are needed to specify a selection of k objects from n objects? (n and k are assumed to be known and the selection of k objects is unordered.) (b) Either prove that the following code is uniquely decodable or give an ambiguous concatenated sequence of codewords:
c_0 = 101, c_1 = 0011, c_2 = 1001, c_3 = 1110, c_4 = 00001, c_5 = 11001, c_6 = 11100, c_7 = 010100
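One mechanical way to explore part (b) is the Sardinas-Patterson test, which repeatedly collects "dangling suffixes" and declares a code ambiguous as soon as a dangling suffix is itself a codeword. The following is a minimal sketch of that test; the function names are illustrative and not part of the assignment, and a written proof or an explicit ambiguous sequence is still what the problem asks for.

```python
def dangling_suffixes(prefixes, words):
    """Return every non-empty suffix left over when a string in
    `prefixes` is a proper prefix of a string in `words`."""
    out = set()
    for a in prefixes:
        for b in words:
            if a != b and b.startswith(a):
                out.add(b[len(a):])
    return out


def is_uniquely_decodable(code):
    """Sardinas-Patterson test for a finite set of codeword strings."""
    code = set(code)
    # First generation: suffixes from one codeword being a prefix of another.
    current = dangling_suffixes(code, code)
    seen = set()
    while current:
        # A dangling suffix that is itself a codeword means two different
        # parsings of the same string exist.
        if current & code:
            return False
        seen |= current
        nxt = dangling_suffixes(code, current) | dangling_suffixes(current, code)
        # Only suffixes not examined before can change the outcome.
        current = nxt - seen
    return True


exam_code = ["101", "0011", "1001", "1110", "00001", "11001", "11100", "010100"]
print(is_uniquely_decodable(exam_code))
```

The loop terminates because every dangling suffix is a suffix of some codeword, so only finitely many can ever appear.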
(c) Consider the memoryless AWGN channel
y = x + z
where z is a zero-mean Gaussian random variable with variance σ^2. The transmit signal x has an average power constraint of P. With no other constraints on the input, the capacity of the channel is
$$C = \frac{1}{2} \log_2\left(1 + \frac{P}{\sigma^2}\right).$$
Now suppose x is restricted to the binary alphabet
i. Find an expression for the capacity of this restricted channel and denote it by Ĉ. (You may not be able to find a closed-form expression for Ĉ, but you should be able to identify the optimal input distribution exactly.)
ii. When is Ĉ close to C: for small signal-to-noise ratios (defined as the ratio P/σ^2) or for large ones?
Commentary: This justifies the usual engineering practice of using simple binary modulation on the AWGN channel in a certain SNR regime.
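To get a numerical feel for part ii, the sketch below compares the unconstrained capacity with the mutual information achieved by a binary input. It assumes, since the alphabet is not visible in the text above, that the two input levels are +sqrt(P) and -sqrt(P) and that they are used with equal probability. The mutual information is written as 1 - E[log2(1 + exp(-L))], where L is the log-likelihood ratio of the output given that +sqrt(P) was sent, and the expectation is evaluated with a simple trapezoid rule. All function names are illustrative.

```python
import math


def gaussian_capacity(snr):
    """Unconstrained AWGN capacity in bits per channel use."""
    return 0.5 * math.log2(1.0 + snr)


def softplus(t):
    """Numerically stable ln(1 + e^t)."""
    return t + math.log1p(math.exp(-t)) if t > 0 else math.log1p(math.exp(t))


def binary_input_rate(snr, z_max=10.0, steps=4000):
    """Mutual information (bits) for equiprobable inputs +/- sqrt(P).

    Assumption: antipodal alphabet with uniform input, snr = P / sigma^2.
    """
    root = math.sqrt(snr)
    h = 2.0 * z_max / steps
    total = 0.0
    for i in range(steps + 1):
        z = -z_max + i * h                  # standard normal noise sample
        llr = 2.0 * snr + 2.0 * root * z    # log-likelihood ratio given +sqrt(P)
        pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * pdf * softplus(-llr) / math.log(2.0)
    return 1.0 - total * h


for snr in (0.01, 0.1, 1.0, 10.0, 100.0):
    print(snr, gaussian_capacity(snr), binary_input_rate(snr))
```

Printing the two columns side by side over a range of SNR values shows in which regime the binary restriction costs little.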
The frequency of the nth most frequent word is taken to be proportional to 1/n for 1 ≤ n ≤ 12367, and zero for n > 12367. (This remarkable 1/n law is known as Zipf’s law, and applies to the word frequencies of many languages [4].) If we assume that English is generated by picking words at random according to this distribution, what is the entropy of English (per word)? You might need a computer to help you arrive at the answer.
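Since the problem invites a computer, here is a minimal sketch of the calculation. It assumes the word probabilities are proportional to 1/n on 1 ≤ n ≤ 12367 and normalizes them explicitly, so the normalizing constant is computed rather than taken from the text. The function name is illustrative.

```python
import math


def zipf_entropy(n_max=12367):
    """Entropy in bits of p_n proportional to 1/n for n = 1..n_max.

    Returns (entropy, normalizing_constant), where p_n = constant / n.
    """
    weights = [1.0 / n for n in range(1, n_max + 1)]
    total = sum(weights)
    entropy = -sum((w / total) * math.log2(w / total) for w in weights)
    return entropy, 1.0 / total


entropy_bits, constant = zipf_entropy()
print(f"normalizing constant: {constant:.6f}")
print(f"entropy per word:     {entropy_bits:.3f} bits")
```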
The poisoned glass. ‘Mathematicians are curious birds’, the police commissioner said to his wife. ‘You see, we had all those partly filled glasses lined up in rows on a table in the hotel kitchen. Only one contained poison, and we wanted to know which one before searching that glass for fingerprints. Our lab could test the liquid in each glass, but the tests take time and money, so we wanted to make as few of them as possible by simultaneously testing mixtures of small samples from groups of glasses. The university sent over a mathematics professor to help us. He counted the glasses, smiled and said: ‘ “Pick any glass you want, Commissioner. We’ll test it first.” ‘ “But won’t that waste a test?” I asked. ‘ “No,” he said, “it’s part of the best procedure. We can test one glass first. It doesn’t matter which one.” ’ ‘How many glasses were there to start with?’ the commissioner’s wife asked. ‘I don’t remember. Somewhere between 100 and 200.’ What was the exact number of glasses?
(a) Solve this puzzle.
(b) Now, explain why the professor was in fact wrong and the commissioner was right. What is the optimal procedure for identifying the one poisoned glass? You will have to make precise the notion of optimality you are considering.
(c) What is the expected waste relative to this optimum if one followed the professor’s strategy?
Hint: How is this problem related to Huffman coding?
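Following the hint, one possible way to experiment is to treat each test as one binary digit of a prefix code over the glasses and compute the expected codeword length of a Huffman code. The sketch below assumes that every glass is equally likely to be the poisoned one and that "expected number of tests" is the figure of merit; both are modelling assumptions, and the function name is illustrative. It uses the fact that the expected length of a Huffman code equals the sum of the weights of all merged nodes.

```python
import heapq


def huffman_expected_length(probs):
    """Expected codeword length (in binary digits) of a Huffman code."""
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        # Each merge adds one digit to every leaf below the new node.
        total += a + b
        heapq.heappush(heap, a + b)
    return total


def expected_tests_uniform(n):
    """Expected number of binary tests for n equally likely glasses."""
    return huffman_expected_length([1.0 / n] * n)


for n in (4, 5, 8, 9):
    print(n, expected_tests_uniform(n))
```

Evaluating `expected_tests_uniform` over the range of candidate glass counts and comparing it with the cost of any fixed testing strategy is one way to quantify the waste asked about in part (c).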
(a) Consider a probability distribution over n, p := {p_n}. Define the average duration per symbol to be
$$L(p) = \sum_n p_n l_n.$$
The entropy per symbol is defined to be
$$H(p) = \sum_n p_n \log_2 \frac{1}{p_n}.$$
Show that the largest rate of information transfer in bits per unit time is equal to
$$\sup_p \frac{H(p)}{L(p)}. \tag{1}$$
(b) Suppose l_n = n. Solve the optimization problem in (1) explicitly. What is the largest rate of information transfer in this case? Not what one would call mind-numbing speed, but hey, you cannot beat the price.
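A quick numerical experiment can be used to sanity-check an analytical answer to part (b). The sketch below evaluates the ratio H(p)/L(p) for l_n = n over one particular family of distributions, the geometric family p_n proportional to r^(n-1), truncated at a large n. Restricting attention to this family is an assumption made for illustration; it does not by itself prove that the supremum is attained there. Function names are illustrative.

```python
import math


def rate(p, durations):
    """H(p) / L(p): bits per unit time for symbol probabilities p."""
    entropy = -sum(x * math.log2(x) for x in p if x > 0)
    avg_duration = sum(x * l for x, l in zip(p, durations))
    return entropy / avg_duration


def geometric(r, n_max=400):
    """Truncated, renormalized geometric distribution on 1..n_max."""
    raw = [(1.0 - r) * r ** (n - 1) for n in range(1, n_max + 1)]
    total = sum(raw)
    return [x / total for x in raw]


durations = list(range(1, 401))  # l_n = n
for r in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(r, rate(geometric(r), durations))
```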
Commentary: Here the information is contained only in the sequence of number of phone rings. But, if we think about it, we can also pack information in the timing between the successive phone calls. This means, we choose not to redial instantly but delay it with the purpose of sending information in the duration between the rings. But there is typically some (random) lag between when you dial a number and the time the first ring begins at the destination. This is a noisy timing channel. The capacity of such a channel is derived in [1]; this work won the Information Theory Society Best Paper Award in 1998.
Figure 1: The Z channel (input x, output y).
P (y = 0|x = 0) = 1; P (y = 0|x = 1) = q; P (y = 1|x = 0) = 0; P (y = 1|x = 1) = 1 − q.
(a) Show that the capacity of the Z channel is at least 0.5(1 − q). Hint: The capacity of the erasure channel with erasure probability q is 1 − q. Can you convert two uses of the Z channel into one use of an erasure channel?
(b) Show that the optimal input distribution (p*_0, p*_1) is given by
$$p_1^* = \frac{1}{(1 - q)\left(1 + 2^{H(q)/(1 - q)}\right)},$$
where H(q) = −q log_2 q − (1 − q) log_2 (1 − q) and p*_0 = 1 − p*_1.
(c) What happens to p*_1 if the noise level q is very close to 1?
(d) Argue that p*_1 is less than 0.5 for all values of q.
(e) Why do you think that p*_1 is always less than 0.5? One could argue that it is good to favor the 0 input, since it is transmitted without error, and also argue that it is good to favor the 1 input, since it often gives rise to the highly prized 1 output, which allows certain identification of the input! Try to make a convincing argument.
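The closed form in part (b) can be checked numerically before attempting a proof. The sketch below writes the mutual information of the Z channel as I(p_1) = H(p_1(1 − q)) − p_1 H(q), where H is the binary entropy function, and maximizes it over p_1 by ternary search, which is valid because I is concave in p_1. It then compares the maximizer with the expression p*_1 = 1 / ((1 − q)(1 + 2^(H(q)/(1 − q)))). Function names are illustrative.

```python
import math


def binary_entropy(x):
    """H(x) in bits, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def z_mutual_information(p1, q):
    """I(X;Y) for the Z channel with P(x = 1) = p1 and P(y = 0 | x = 1) = q."""
    return binary_entropy(p1 * (1.0 - q)) - p1 * binary_entropy(q)


def numeric_optimum(q, iterations=200):
    """Maximize the concave function I(p1) on [0, 1] by ternary search."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if z_mutual_information(m1, q) < z_mutual_information(m2, q):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


def closed_form_optimum(q):
    """Candidate closed form for the optimal P(x = 1), valid for 0 <= q < 1."""
    return 1.0 / ((1.0 - q) * (1.0 + 2.0 ** (binary_entropy(q) / (1.0 - q))))


for q in (0.0, 0.15, 0.5, 0.9):
    print(q, numeric_optimum(q), closed_form_optimum(q))
```

Evaluating `z_mutual_information(closed_form_optimum(q), q)` also gives the capacity, which can be compared with the lower bound 0.5(1 − q) from part (a).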
be either 0 or 1 and thus information cannot be written on them. Not only do these faults waste memory (if there are a total of k stuck-at faults, then the useful memory size is only n − k bits), they also cause trouble when trying to read the information: the device reading the bits from the memory has no way to tell the useful memory locations from the faulty ones. However, the stuck-at locations are known ahead of time to the device writing the information. It is possible that one can be smart about writing the information bits in the available n − k locations. A remarkable result in [3] shows that one can be very smart in writing (and reading) the information bits: the capacity of the memory is n − k bits even though the memory reading device has no idea of the k faulty bit positions. (c) A continuous-alphabet version of this problem with AWGN also admits a similarly striking result [2]: Consider the memoryless channel
y[m] = x[m] + s[m] + z[m]
where the additive noise z[m] is a zero-mean Gaussian random variable with variance σ^2. The additive interference s[m] is known non-causally to the transmitter but unknown to the receiver (the receiver knows only that s[m] is generated as an i.i.d. zero-mean Gaussian sequence with variance σ_s^2). As usual, there is a transmit power constraint of P. Let us denote the capacity of this channel by C, measured in bits per channel use.
i. Show that
$$C \ge \frac{1}{2} \log_2\left(1 + \frac{P}{\sigma_s^2 + \sigma^2}\right).$$
ii. Show that
$$C \le \frac{1}{2} \log_2\left(1 + \frac{P}{\sigma^2}\right).$$
iii. Reading exercise (see footnote 2): The capacity of the channel is actually equal to the upper bound above; i.e., the capacity of the channel is unchanged even if the receiver is also made aware of the entire interference sequence.
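To see how far apart the bounds in parts i and ii can be, the sketch below simply evaluates both expressions for a few interference powers. The function and parameter names are illustrative; `var_s` stands for the interference variance σ_s^2 and `var_z` for the noise variance σ^2.

```python
import math


def interference_channel_bounds(power, var_s, var_z):
    """Return (lower, upper) bounds on C in bits per channel use.

    lower: interference treated as extra Gaussian noise (part i)
    upper: interference absent or known to the receiver (part ii)
    """
    lower = 0.5 * math.log2(1.0 + power / (var_s + var_z))
    upper = 0.5 * math.log2(1.0 + power / var_z)
    return lower, upper


for var_s in (0.0, 1.0, 10.0, 100.0):
    low, up = interference_channel_bounds(power=3.0, var_s=var_s, var_z=1.0)
    print(f"var_s={var_s:6.1f}  lower={low:.4f}  upper={up:.4f}")
```

The gap grows with the interference power, which is what makes the result in [2] striking.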
[1] V. Anantharam and S. Verdú, “Bits through Queues”, IEEE Transactions on Information Theory, Vol. 42, No. 1, pp. 4-18, January 1996.
[2] M. H. M. Costa, “Writing on dirty paper”, IEEE Transactions on Information Theory, Vol. 29, pp. 439-441, May 1983.
[3] A. El Gamal and C. Heegard, “On the Capacity of Computer Memory with Defects”, IEEE Transactions on Information Theory, Vol. 29, No. 5, pp. 731-739, September 1983.
[4] G. K. Zipf, Human Behavior and the Principle of Least Effort, Addison-Wesley, 1949.
Footnote 2: No need to turn this in.