Quantifying Information - Computation Structures - Lecture Slides, Slides of Computer Fundamentals

The main points are: Quantifying Information, Encoding, Fixed-Length Encodings, Encoding Numbers, Signed Integers, Data Compression, Variable-Length Encodings, Error Detection and Correction, Hamming Distance

Typology: Slides

2012/2013

Uploaded on 04/18/2013

palmoni
L01 - Basics of Information 9
6.004 – Spring 2009 2/3/09
Quantifying Information
(Claude Shannon, 1948)
Suppose you’re faced with N equally probable choices, and I give you a fact that narrows it down to M choices. Then I’ve given you
$\log_2(N/M)$ bits of information
Examples:
information in one coin flip: $\log_2(2/1) = 1$ bit
roll of 2 dice: $\log_2(36/1) \approx 5.2$ bits
outcome of a Red Sox game: 1 bit (well, actually, are both outcomes equally probable?)
Information is measured in bits (binary digits) = number of 0/1’s required to encode choice(s)
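The rule above can be checked numerically. This is a minimal sketch; the function name `information_bits` and the playing-card example are illustrative assumptions, not part of the original slides.

```python
import math


def information_bits(n_choices: int, m_remaining: int = 1) -> float:
    """Bits of information gained when N equally probable choices narrow to M."""
    if not 0 < m_remaining <= n_choices:
        raise ValueError("need 0 < M <= N")
    return math.log2(n_choices / m_remaining)


print(information_bits(2))              # one coin flip
print(round(information_bits(36), 1))   # roll of 2 dice
print(information_bits(52, 13))         # assumed example: learning a card's suit
```

The last call narrows 52 cards down to the 13 of one suit, so it yields log2(4) bits.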
Encoding
Encoding describes the process of
assigning representations to information
Choosing an appropriate and efficient encoding is a
real engineering challenge
Impacts design at many levels
- Mechanism (devices, # of components used)
- Efficiency (bits used)
- Reliability (noise)
- Security (encryption)
Next lecture: encoding a bit.
What about longer messages?
Fixed-length encodings
If all choices are equally likely (or we have no reason to expect otherwise), then a fixed-length code is often used. Such a code will use at least enough bits to represent the information content.
ex. ~86 English characters = {A-Z (26), a-z (26), 0-9 (10), punctuation (11), math (9), financial (4)}
$\log_2(86) \approx 6.426 \le 7$ bits: 7-bit ASCII (American Standard Code for Information Interchange)
ex. 10 decimal digits = {0,1,2,3,4,5,6,7,8,9}
$\log_2(10) \approx 3.322 \le 4$ bits: 4-bit BCD (binary coded decimal)
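The bit counts above follow from rounding the information content up to a whole number of bits. This is a small sketch; `fixed_length_bits` and `to_bcd` are hypothetical helper names chosen for illustration.

```python
def fixed_length_bits(n_choices: int) -> int:
    """Smallest whole number of bits that can label n_choices distinct values."""
    if n_choices < 1:
        raise ValueError("need at least one choice")
    # (n - 1).bit_length() equals ceil(log2(n)) without floating-point rounding.
    return (n_choices - 1).bit_length()


def to_bcd(decimal_digits: str) -> str:
    """Encode each decimal digit as its own 4-bit group (binary coded decimal)."""
    return " ".join(format(int(d), "04b") for d in decimal_digits)


print(fixed_length_bits(86))  # characters in the ~86-symbol set
print(fixed_length_bits(10))  # decimal digits
print(to_bcd("409"))
```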
Encoding numbers
It is straightforward to encode positive integers as a sequence of bits. Each bit is assigned a weight. Ordered from right to left, these weights are increasing powers of 2. The value of an n-bit number encoded in this fashion is given by the following formula:

$$v = \sum_{i=0}^{n-1} 2^i b_i$$

Example, with bit weights $2^{11}, 2^{10}, \ldots, 2^1, 2^0$ from left to right:

011111010000 = 2000 (base 10)

Oftentimes we will find it convenient to cluster groups of bits together for a more compact notation. Two popular groupings are clusters of 3 bits and 4 bits.

Octal - base 8: 011 111 010 000 = 03720
000 - 0    100 - 4
001 - 1    101 - 5
010 - 2    110 - 6
011 - 3    111 - 7

Hexadecimal - base 16: 0111 1101 0000 = 0x7d0
0000 - 0    1000 - 8
0001 - 1    1001 - 9
0010 - 2    1010 - a
0011 - 3    1011 - b
0100 - 4    1100 - c
0101 - 5    1101 - d
0110 - 6    1110 - e
0111 - 7    1111 - f
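As a quick check of the weighted-sum formula and the two groupings, here is a short sketch. `bits_to_value` is a hypothetical helper name; note that Python writes octal literals with a `0o` prefix, where the slide uses the C-style leading `0`.

```python
def bits_to_value(bits: str) -> int:
    """Sum 2**i * b_i over all bits, where b_0 is the rightmost bit."""
    return sum((2 ** i) * int(b) for i, b in enumerate(reversed(bits)))


value = bits_to_value("011111010000")
print(value)       # 2000
print(oct(value))  # 0o3720
print(hex(value))  # 0x7d0
```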

Signed integers: 2’s complement

[Figure: an N-bit word whose bit positions carry the weights $-2^{N-1}, 2^{N-2}, \ldots, 2^3, 2^2, 2^1, 2^0$; the leftmost bit is the “sign bit”.]

Range: $-2^{N-1}$ to $2^{N-1} - 1$

8-bit 2’s complement example:

$$11010110 = -2^7 + 2^6 + 2^4 + 2^2 + 2^1 = -128 + 64 + 16 + 4 + 2 = -42$$

If we use a two’s complement representation for signed integers, the same binary addition mod $2^n$ procedure will work for adding positive and negative numbers (don’t need separate subtraction rules). The same procedure will also handle unsigned numbers!

By moving the implicit location of the “decimal” point, we can represent fractions too:

$$1101.0110 = -2^3 + 2^2 + 2^0 + 2^{-2} + 2^{-3} = -8 + 4 + 1 + 0.25 + 0.125 = -2.625$$
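The following sketch illustrates two's complement encoding, decoding, and addition mod 2^n. The helper names (`to_twos_complement`, `from_twos_complement`, `add_mod`) and the operand 50 are assumptions made for the example.

```python
def to_twos_complement(value: int, n_bits: int) -> str:
    """Encode a signed integer as an n-bit two's complement bit string."""
    lo, hi = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {n_bits} bits")
    return format(value & ((1 << n_bits) - 1), f"0{n_bits}b")


def from_twos_complement(bits: str) -> int:
    """Decode: the leftmost bit carries weight -2**(n-1) instead of +2**(n-1)."""
    unsigned = int(bits, 2)
    return unsigned - (1 << len(bits)) if bits[0] == "1" else unsigned


def add_mod(a_bits: str, b_bits: str) -> str:
    """Plain binary addition mod 2**n; the carry out of the top bit is dropped."""
    n = len(a_bits)
    return format((int(a_bits, 2) + int(b_bits, 2)) % (1 << n), f"0{n}b")


a = to_twos_complement(-42, 8)
b = to_twos_complement(50, 8)
print(a, from_twos_complement(a))
print(from_twos_complement(add_mod(a, b)))  # -42 + 50
# Reading the same 8 bits with the point after the fourth bit (1101.0110):
print(from_twos_complement("11010110") / 2 ** 4)
```

The same `add_mod` routine serves signed and unsigned operands; only the interpretation of the result differs.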

When choices aren’t equally probable

When the choices have different probabilities ($p_i$), you get more information when learning of an unlikely choice than when learning of a likely choice.

Information from choice$_i$ = $\log_2(1/p_i)$ bits

Average information from a choice = $\sum_i p_i \log_2(1/p_i)$

Example:

choice$_i$   $p_i$    $\log_2(1/p_i)$
“A”        1/3    1.58 bits
“B”        1/2    1 bit
“C”        1/12   3.58 bits
“D”        1/12   3.58 bits

Average information = (.333)(1.58) + (.5)(1) + (2)(.083)(3.58) = 1.626 bits

Can we find an encoding where transmitting 1000 choices is close to 1626 bits on the average? Using two bits for each choice = 2000 bits.
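The table values can be reproduced with a few lines of code. This is a sketch only; the names `PROBS`, `information`, and `average_information` are illustrative.

```python
import math

# Probabilities from the example table.
PROBS = {"A": 1 / 3, "B": 1 / 2, "C": 1 / 12, "D": 1 / 12}


def information(p: float) -> float:
    """Bits of information from learning a choice that has probability p."""
    return -math.log2(p)  # same as log2(1/p)


def average_information(probs: dict) -> float:
    """Probability-weighted average of the per-choice information."""
    return sum(p * information(p) for p in probs.values())


for choice, p in PROBS.items():
    print(choice, round(information(p), 2))
print(round(average_information(PROBS), 3))
```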

Variable-length encodings

(David Huffman, MIT 1950)

Use shorter bit sequences for high probability choices, longer sequences for less probable choices.

choice$_i$   $p_i$    encoding
“A”        1/3    11
“B”        1/2    0
“C”        1/12   100
“D”        1/12   101

[Figure: Huffman decoding tree. From the root, branch 0 reaches B and branch 1 reaches a subtree; there, branch 1 reaches A and branch 0 reaches a subtree whose branches 0 and 1 reach C and D.]

Example: 010011011101 decodes as B C A B A D.

Average encoded length = (.333)(2) + (.5)(1) + (2)(.083)(3) = 1.666 bits

Transmitting 1000 choices takes an average of 1666 bits… better but not optimal.

To get a more efficient encoding (closer to information content) we need to encode sequences of choices, not just each choice individually. This is the approach taken by most file compression algorithms…
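A sketch of encoding and decoding with the code table above follows. It relies on the code being prefix-free (no code word is the start of another), so a bit-by-bit scan can emit a symbol as soon as a code word matches. The function names are assumptions for illustration.

```python
CODE = {"A": "11", "B": "0", "C": "100", "D": "101"}
PROBS = {"A": 1 / 3, "B": 1 / 2, "C": 1 / 12, "D": 1 / 12}


def encode(message: str) -> str:
    """Concatenate the code word for each symbol."""
    return "".join(CODE[symbol] for symbol in message)


def decode(bits: str) -> str:
    """Scan left to right, emitting a symbol whenever a full code word is seen."""
    inverse = {word: symbol for symbol, word in CODE.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            symbols.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a code word")
    return "".join(symbols)


def average_length() -> float:
    """Expected number of bits per choice under PROBS."""
    return sum(PROBS[s] * len(CODE[s]) for s in CODE)


print(decode("010011011101"))
print(round(average_length(), 3))
```

Walking the decoding tree from the figure is equivalent to this dictionary lookup: each bit moves one level down, and reaching a leaf resets the walk to the root.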

Data Compression

Key: re-encoding to remove redundant information: match data rate to actual information content.

“Outside of a dog, a book is man’s best friend. Inside of a dog, it’s too dark to read…”
-Groucho Marx

A84b!*m9@+M(p

Ideal: No redundant info. Only unpredictable bits transmitted. Result appears random!

LOSSLESS: can ‘uncompress’, get back original.

Figure by MIT OpenCourseWare.
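To see lossless compression in action, the sketch below uses Python's standard `zlib` module. Repeating the sentence 20 times is an assumption made here to give the compressor obvious redundancy to remove; the slides do not name any particular algorithm.

```python
import zlib

# Assumed test input: a highly redundant message built by repetition.
text = ("Outside of a dog, a book is man's best friend. " * 20).encode("utf-8")

packed = zlib.compress(text)
restored = zlib.decompress(packed)

print(len(text), len(packed))  # original size vs. compressed size in bytes
print(restored == text)        # lossless: the original comes back exactly
```

The compressed bytes look far less structured than the input, which is the "result appears random" effect described above.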

Hamming Distance

(Richard Hamming, 1950)

HAMMING DISTANCE: The number of digit positions in which the corresponding digits of two encodings of the same length are different.

[Figure: the code words “heads” and “tails” joined by a single-bit error.]

The Hamming distance between a valid binary code word and the same code word with a single-bit error is 1.

The problem with our simple encoding is that the two valid code words (“0” and “1”) also have a Hamming distance of 1. So a single-bit error changes a valid code word into another valid code word…
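The definition translates directly into code. This is a minimal sketch in which bit strings are represented as Python `str` values, a choice made for readability.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length encodings differ."""
    if len(a) != len(b):
        raise ValueError("encodings must have the same length")
    return sum(x != y for x, y in zip(a, b))


print(hamming_distance("0", "1"))        # the two one-bit code words
print(hamming_distance("1011", "1001"))  # a word and a single-bit corruption of it
```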

Error Detection

What we need is an encoding where a single-bit error doesn’t produce another valid code word.

[Figure: the “heads” and “tails” code words with a parity bit added; a single-bit error now produces an invalid word.]

We can add single-bit error detection to any length code word by adding a parity bit chosen to guarantee the Hamming distance between any two valid code words is at least 2. In the diagram above, we’re using “even parity” where the added bit is chosen to make the total number of 1’s in the code word even.

Can we correct detected errors? Not yet…

If D is the minimum Hamming distance between code words, we can detect up to (D-1)-bit errors.
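Even parity can be sketched in a few lines. The helper names `add_even_parity` and `parity_ok` are hypothetical, and placing the parity bit at the end of the word is an assumption; any fixed position works.

```python
def add_even_parity(word: str) -> str:
    """Append one bit so that the total number of 1's is even."""
    return word + str(word.count("1") % 2)


def parity_ok(codeword: str) -> bool:
    """A received word is valid only if it contains an even number of 1's."""
    return codeword.count("1") % 2 == 0


sent = add_even_parity("1011")
print(sent, parity_ok(sent))

# Flip one bit to simulate a single-bit error; the check now fails.
corrupted = sent[:2] + ("1" if sent[2] == "0" else "0") + sent[3:]
print(corrupted, parity_ok(corrupted))
```

Parity only signals that something went wrong; it cannot say which bit flipped, and a double-bit error passes the check unnoticed.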

Error Correction

[Figure: the eight 3-bit words 000, 001, 010, 011, 100, 101, 110, 111, with the valid “heads” and “tails” code words at Hamming distance 3 and the words reachable by a single-bit error marked.]

By increasing the Hamming distance between valid code words to 3, we guarantee that the sets of words produced by single-bit errors don’t overlap. So if we detect an error, we can perform error correction since we can tell what the valid code was before the error happened.

  • Can we safely detect double-bit errors while correcting 1-bit errors?
  • Do we always need to triple the number of bits?

If D is the minimum Hamming distance between code words, we can correct up to $\lfloor (D-1)/2 \rfloor$-bit errors.
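Correction by nearest valid code word can be sketched as follows. The assignment of 000 to "heads" and 111 to "tails" is an assumption made for this example (the figure labels did not survive extraction), as are the function names.

```python
# Assumed code: two valid words at Hamming distance 3.
CODEWORDS = {"000": "heads", "111": "tails"}


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def correct(received: str) -> str:
    """Return the valid code word closest to the received word."""
    return min(CODEWORDS, key=lambda word: hamming_distance(word, received))


def max_correctable(d: int) -> int:
    """Number of bit errors correctable with minimum distance d: floor((d-1)/2)."""
    return (d - 1) // 2


print(correct("010"), CODEWORDS[correct("010")])  # one flipped bit of 000
print(correct("110"), CODEWORDS[correct("110")])  # one flipped bit of 111
print(max_correctable(3))
```

With this code a double-bit error is silently "corrected" to the wrong word, which is why the first question in the list above matters.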

The right choice of codes can solve hard problems

Reed-Solomon (1960)
First construct a polynomial from the data symbols to be transmitted and then send an over-sampled plot of the polynomial instead of the original symbols themselves: spread the information out so it can be recovered from a subset of the transmitted symbols.
Particularly good at correcting bursts of erasures (symbols known to be incorrect).
Used by CD, DVD, DAT, satellite broadcasts, etc.

Viterbi (1967)
A dynamic programming algorithm for finding the most likely sequence of hidden states that result in a sequence of observed events, especially in the context of hidden Markov models.
Good choice when soft-decision information is available from the demodulator.
Used by QAM modulation schemes (e.g., CDMA, GSM, cable modems), disk drive electronics (PRML).

Summary

Information resolves uncertainty

Choices equally probable:
- N choices down to M → $\log_2(N/M)$ bits of information
- use fixed-length encodings
- encoding numbers: 2’s complement signed integers

Choices not equally probable:
- choice$_i$ with probability $p_i$ → $\log_2(1/p_i)$ bits of information
- average number of bits = $\sum_i p_i \log_2(1/p_i)$
- use variable-length encodings

To detect D-bit errors: Hamming distance > D

To correct D-bit errors: Hamming distance > 2D

Next time:
- encoding information electrically
- the digital abstraction
- combinational devices

Hand in Information Sheets!