Understanding Digital Information: Analog vs. Digital, Encoding, Values, and Compression, Slides of Artificial Intelligence

An introduction to the concepts of analog and digital information representation, bit encoding, representing negative values, and data compression. It covers the basics of mercury thermometers, digitization, binary representations, signed-magnitude representation, ten's complement, and data compression techniques like keyword encoding, run-length encoding, and Huffman encoding.

Typology: Slides

2012/2013

Uploaded on 04/24/2013

banani

CSCI 100

Think Like Computers

Lecture 2

Fall 2008

Linux

  • ls [file-or-dir]
  • ls -l [file-or-dir]
  • cd directorypath
  • mkdir dir
  • rmdir dir
  • cp … (copy)
  • mv … (move, rename)
  • rm … (remove)
  • Path:
    • dir/subdir/subsubdir
  • Special file/dir:
    • . and ..
    • ~, ~username
  • Wildcards
    • *
    • ?
  • Help:
    • man …

Remote Login

  • Use secure shell (ssh) to connect to csgateway.clarku.edu
  • Use secure ftp (sftp) to transfer files (download/upload)

Homework

  • Create your own homepage
    • Create the directory ~/public_html
    • Create the file ~/public_html/index.html
    • Change the permissions so that they are accessible to others
      • chmod
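
The homework steps can also be scripted. The sketch below uses Python's standard library in place of typing mkdir and chmod by hand. The function name make_homepage, the placeholder page content, and the permission modes 755 and 644 are assumptions; the slide names chmod but not the exact modes.

```python
from pathlib import Path

def make_homepage(home: Path) -> Path:
    """Create home/public_html/index.html and make both readable by others."""
    site = home / "public_html"
    site.mkdir(exist_ok=True)        # like: mkdir ~/public_html
    page = site / "index.html"
    if not page.exists():
        page.write_text("<html><body>Hello</body></html>\n")
    site.chmod(0o755)                # assumed mode: others may list and enter
    page.chmod(0o644)                # assumed mode: others may read
    return page
```

Calling make_homepage(Path.home()) would apply the same steps to your own account.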

Data and Computers

  • Computers are multimedia devices, dealing with a vast array of information categories. Computers store, present, and help us modify:
    • Numbers
    • Text
    • Audio
    • Images and graphics
    • Video
    • And other things …

Data & Information

  • Information can be represented in one of two ways: analog or digital.

  • Analog data: A continuous representation, analogous to the actual information it represents.
  • Digital data: A discrete representation, breaking the information up into separate elements.
  • A mercury thermometer is an analog device. The mercury rises in a continuous flow in the tube in direct proportion to the temperature.

Analog and Digital Information

Figure 3. A mercury thermometer continually rises in direct proportion to the temperature

Analog and Digital Information

  • Computers cannot work well with analog information. So we digitize information by breaking it into pieces and representing those pieces separately.
  • Why do we use binary? Modern computers are designed to use and manage binary values because the devices that store and manage the data are far less expensive and far more reliable if they only have to represent one of two possible values.
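
As a rough illustration of digitizing, the sketch below samples a continuously varying temperature at a few instants and rounds each sample to the nearest whole degree. The temperature curve, the sample times, and the one-degree step are all made up for illustration.

```python
import math

def temperature(t: float) -> float:
    """A made-up continuous temperature curve (degrees) at time t (hours)."""
    return 20.0 + 5.0 * math.sin(t)

def digitize(signal, times, step=1.0):
    """Sample the signal at the given times and round to multiples of step."""
    return [round(signal(t) / step) * step for t in times]

samples = digitize(temperature, [0.0, 0.5, 1.0, 1.5, 2.0])
print(samples)
```

The continuous curve has a value at every instant; the digitized version keeps only a handful of separate readings.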

Binary Representations

  • One bit can be either 0 or 1. Therefore, one bit can represent only two things.
  • To represent more than two things, we need multiple bits. Two bits can represent four things because there are four combinations of 0 and 1 that can be made from two bits: 00, 01, 10, 11.

Binary Representations

  • If we want to represent more than four things, we need more than two bits. Three bits can represent eight things because there are eight combinations of 0 and 1 that can be made from three bits.

Binary Representations

Figure 3.4 Bit combinations

Binary Representations

  • In general, n bits can represent 2^n things because there are 2^n combinations of 0 and 1 that can be made from n bits. Note that every time we increase the number of bits by 1, we double the number of things we can represent.
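
The doubling can be checked by listing the combinations directly. This is a minimal sketch; the helper name bit_patterns is an assumption, not something from the slides.

```python
from itertools import product

def bit_patterns(n: int) -> list[str]:
    """Return every string of n bits, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n)]

for n in range(1, 5):
    patterns = bit_patterns(n)
    print(n, len(patterns), patterns[:4])
```

Each extra bit doubles the length of the list, since every existing pattern can be followed by either a 0 or a 1.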

Representing Negative Values

  • Here is a formula that you can use to compute the negative representation
  • This representation of negative numbers is called the ten’s complement.
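
For k decimal digits, ten's complement is commonly defined as Negative(I) = 10^k - I. The sketch below assumes that definition with k = 2, so the values 00 to 99 are shared between positive and negative numbers; the function names are illustrative.

```python
def tens_complement(i: int, k: int = 2) -> int:
    """Representation of -i using k decimal digits: 10**k - i."""
    return 10 ** k - i

def add(a: int, b: int, k: int = 2) -> int:
    """Add two k-digit representations, discarding any carry out of digit k."""
    return (a + b) % 10 ** k

neg3 = tens_complement(3)    # 97 stands for -3
print(neg3)                  # 97
print(add(5, neg3))          # 2, i.e. 5 + (-3)
```

Subtraction becomes ordinary addition followed by dropping the carry, which is the property that carries over to two's complement.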

Representing Negative Values

Two's Complement: To make it easier to look at long binary numbers, we make the number line vertical.

Representing Negative Values

  • Addition and subtraction are accomplished the same way as in ten's complement arithmetic:
      -127    10000001
    +    1    00000001
      -126    10000010
  • Notice that with this representation, the leftmost bit in a negative number is always a 1.
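
A minimal sketch of eight-bit two's complement, assuming the usual range of -128 to 127. The function names are illustrative. It reproduces the bit patterns for -127 and -126 used in the addition example.

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Bit string for value in two's complement with the given width."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value does not fit in the given number of bits")
    return format(value & ((1 << bits) - 1), f"0{bits}b")

def from_twos_complement(pattern: str) -> int:
    """Signed value of a two's complement bit string."""
    raw = int(pattern, 2)
    # A leading 1 marks a negative number: subtract 2**width.
    return raw - (1 << len(pattern)) if pattern[0] == "1" else raw

print(to_twos_complement(-127))            # 10000001
print(to_twos_complement(-126))            # 10000010
print(from_twos_complement("10000010"))    # -126
```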

Number Overflow

  • Overflow occurs when the value that we compute cannot fit into the number of bits we have allocated for the result. For example, if each value is stored using eight bits, adding 127 to 3 overflows:
        01111111
      + 00000011
        10000010
    The eight-bit result 10000010 represents -126 in two's complement, not the expected 130.
  • Overflow is a classic example of the type of problems we encounter by mapping an infinite world onto a finite machine.
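
The 127 + 3 case can be reproduced with a small sketch that keeps only the low eight bits of a sum and reports whether the stored value differs from the true one. The function name add_8bit is an assumption, and the sketch assumes two's complement storage.

```python
def add_8bit(a: int, b: int) -> tuple[int, bool]:
    """Add two signed 8-bit values; return (stored result, overflowed?)."""
    true_sum = a + b
    raw = true_sum & 0xFF                        # keep only the low eight bits
    stored = raw - 256 if raw >= 128 else raw    # reinterpret as signed
    return stored, stored != true_sum

print(add_8bit(127, 3))     # (-126, True)
print(add_8bit(100, 27))    # (127, False)
```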

Representing Real Numbers

  • Any idea?

Representing Text

  • To represent a text document in digital form, we need to be able to represent every possible character that may appear.
  • There is a finite number of characters to represent, so the general approach is to list them all and assign each a binary string.
  • A character set is a list of characters and the codes used to represent each one.
  • By agreeing to use a particular character set, computer manufacturers have made the processing of text data easier.

The ASCII Character Set

  • ASCII stands for American Standard Code for Information Interchange. The ASCII character set originally used seven bits to represent each character, allowing for 128 unique characters.
  • Later, extended versions of ASCII used all eight bits, which allows for 256 characters.
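
As a small illustration, Python's built-in ord gives the numeric code of a character, which can then be written as seven bits. The helper name ascii_bits is an assumption.

```python
def ascii_bits(ch: str) -> str:
    """Seven-bit ASCII code of a single character, as a bit string."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    return format(code, "07b")

for ch in "A", "a", " ", "X":
    print(repr(ch), ord(ch), ascii_bits(ch))
```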

The Unicode Character Set

  • The extended version of the ASCII character set is not enough for international use.
  • The Unicode character set, as originally designed, uses 16 bits per character. Therefore, it can represent 2^16, or over 65 thousand, characters.
  • Unicode was designed to be a superset of ASCII. That is, the first 256 characters in the Unicode character set correspond exactly to the extended ASCII character set.
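
The superset relationship can be checked in a few lines. The sketch assumes that "extended ASCII" here means the 8-bit ISO 8859-1 (Latin-1) set; the helper name code_point_bits is illustrative.

```python
def code_point_bits(ch: str) -> str:
    """16-bit pattern for a character whose code fits in 16 bits."""
    code = ord(ch)
    if code > 0xFFFF:
        raise ValueError("needs more than 16 bits")
    return format(code, "016b")

for ch in ["A", "\u00e9", "\u20ac"]:    # A, e with acute accent, euro sign
    print(ascii(ch), ord(ch), code_point_bits(ch))

# Decode each of the 256 Latin-1 byte values and compare with its Unicode code.
latin1_matches = all(ord(bytes([i]).decode("latin-1")) == i for i in range(256))
print(latin1_matches)
```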

The Unicode Character Set

Figure 3.6 A few characters in the Unicode character set

Data Compression

  • Data compression: Reduction in the amount of space needed to store a piece of data.
  • Compression ratio: The size of the compressed data divided by the size of the original data.
  • A data compression technique can be
    • lossless, which means the data can be retrieved without any loss of the original information, or
    • lossy, which means some information may be lost in the process of compaction.
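
To make the ratio concrete, the sketch below compresses a highly repetitive byte string with Python's zlib module and divides the sizes. zlib is a general-purpose lossless compressor chosen here for convenience; it is not one of the techniques named in these slides, and the sample data is made up.

```python
import zlib

original = b"AAAAAAAAAABBBBBBBBBB" * 50
compressed = zlib.compress(original)
ratio = len(compressed) / len(original)
print(len(original), len(compressed), round(ratio, 3))

# Lossless: decompressing gives back exactly the original bytes.
assert zlib.decompress(compressed) == original
```

A ratio well below 1 means the compressed form is much smaller than the original.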

Text Compression

  • It is important that we find ways to store and transmit text efficiently, which means we must find ways to compress text.
    • keyword encoding
    • run-length encoding
    • Huffman encoding
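
As one illustration, here is a minimal run-length encoding sketch. The exact scheme is an assumption: a run of four or more identical characters is replaced by a flag character, the repeated character, and a single-digit count, and the input is assumed never to contain the flag itself.

```python
from itertools import groupby

FLAG = "*"  # assumed marker; the input must not contain it

def rle_encode(text: str) -> str:
    out = []
    for ch, group in groupby(text):
        n = len(list(group))
        while n >= 4:                  # runs of 4+ become FLAG, char, count
            chunk = min(n, 9)          # count is kept to a single digit
            out.append(f"{FLAG}{ch}{chunk}")
            n -= chunk
        out.append(ch * n)             # short runs are copied unchanged
    return "".join(out)

def rle_decode(code: str) -> str:
    out, i = [], 0
    while i < len(code):
        if code[i] == FLAG:            # FLAG, char, count: expand the run
            out.append(code[i + 1] * int(code[i + 2]))
            i += 3
        else:
            out.append(code[i])
            i += 1
    return "".join(out)

print(rle_encode("AAAAAAABBBCCCCCCCCCCCC"))    # *A7BBB*C9CCC
```

Short runs are left alone because replacing three characters with three characters saves nothing.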

Huffman Encoding

  • Why should the character “X”, which is seldom used in text, take up the same number of bits as the blank, which is used very frequently? Huffman codes use variable-length bit strings to represent each character.
  • A few characters may be represented by five bits, and another few by six bits, and yet another few by seven bits, and so forth.

Huffman Encoding

  • If we use only a few bits to represent characters that appear often and reserve longer bit strings for characters that don’t appear often, the overall size of the document being represented is smaller.

Huffman Encoding

  • For example

Huffman Encoding

  • DOORBELL would be encoded in binary as
  • If we used a fixed-size bit string to represent each character (say, 8 bits), then the binary form of the original string would be 64 bits. The Huffman encoding for that string is 25 bits long, giving a compression ratio of 25/64, or approximately 0.39.
  • An important characteristic of any Huffman encoding is that no bit string used to represent a character is the prefix of any other bit string used to represent a character.
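
The prefix property is what lets a decoder read the bits left to right without separators. The sketch below encodes and decodes with a fixed code table; it does not build a Huffman tree. The table is an assumption, chosen to be prefix-free and to reproduce the 25-bit length quoted above for DOORBELL.

```python
# Assumed code table for illustration; no code is a prefix of another.
CODES = {"A": "00", "E": "01", "L": "100", "O": "110",
         "R": "111", "B": "1010", "D": "1011"}

def encode(text: str) -> str:
    return "".join(CODES[ch] for ch in text)

def decode(bits: str) -> str:
    lookup = {code: ch for ch, code in CODES.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:      # prefix property: the first match is final
            out.append(lookup[current])
            current = ""
    if current:
        raise ValueError("leftover bits do not form a code")
    return "".join(out)

bits = encode("DOORBELL")
print(bits, len(bits), len(bits) / (8 * len("DOORBELL")))
```

Frequent letters such as E get two bits while rarer ones such as B and D get four, which is where the saving comes from.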

Exercise

  • Encode REAL
  • Decode 11111010010001111