Computers and Scientific Thinking - Lecture Slides
Data Representation
Analog vs. Digital
- there are two ways data can be stored electronically
  1. analog signals represent data in a way that is analogous to real life
     - signals can vary continuously across an infinite range of values
     - e.g., frequencies on an old-fashioned radio with a dial
  2. digital signals utilize only a finite set of values
     - e.g., frequencies on a modern radio with digital display
- the major tradeoff between analog and digital is variability vs. reproducibility
- analog allows for a (potentially) infinite number of unique signals, but they are harder to reproduce
- good for storing data that is highly variable but does not need to be reproduced exactly
- digital signals limit the number of representable signals, but they are easily remembered and reproduced
- good for storing data when reproducibility is paramount
Decimal to Binary
- algorithm for converting from decimal (D) to binary (B):
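A common way to carry out this conversion is repeated division by 2: the remainders, read from last to first, are the bits of B. The Python sketch below illustrates that approach and may differ in detail from the slide's own algorithm. The function names are hypothetical, and only non-negative integers are handled.

```python
def decimal_to_binary(d):
    """Convert a non-negative integer D to a string of bits B."""
    if d < 0:
        raise ValueError("this sketch handles non-negative integers only")
    if d == 0:
        return "0"
    bits = ""
    while d > 0:
        bits = str(d % 2) + bits  # the remainder is the next bit, filled in right to left
        d //= 2                   # integer division drops the bit just recorded
    return bits


def binary_to_decimal(b):
    """Convert a string of bits B back to the integer D (the reverse direction)."""
    total = 0
    for bit in b:
        total = total * 2 + int(bit)  # shift left one place, then add the new bit
    return total


print(decimal_to_binary(19))       # 10011
print(binary_to_decimal("10011"))  # 19
```

Reading the example by hand: 19 = 16 + 2 + 1, so the bits for 16, 2 and 1 are set, giving 10011.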
Representing Integers
- when an integer value must be saved on a computer, its binary equivalent can be encoded as a bit pattern and stored digitally
- usually, a fixed size (e.g., 32 bits) is used for each integer so that the computer knows where one integer ends and another begins
- the initial bit in each pattern acts as the sign bit (0 = zero or positive, 1 = negative)
- negative numbers are represented in two’s complement notation
- among the negative numbers, the "largest" bit pattern (all 1s) corresponds to the smallest absolute value (-1)
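The fixed-width encoding described above can be sketched in Python as follows. This is an illustration, not any particular machine's implementation. The helper names are hypothetical, and the 32-bit default is taken from the example size mentioned above.

```python
BITS = 32  # fixed width, following the "e.g., 32 bits" example


def to_bit_pattern(n, bits=BITS):
    """Encode integer n as a two's complement bit pattern of the given width."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    if not lowest <= n <= highest:
        raise OverflowError(f"{n} does not fit in {bits} bits")
    # Masking maps a negative n to 2**bits + n, which is its two's complement pattern.
    return format(n & ((1 << bits) - 1), f"0{bits}b")


def from_bit_pattern(pattern):
    """Decode a two's complement bit pattern back to an integer."""
    value = int(pattern, 2)
    if pattern[0] == "1":            # sign bit set: the value is negative
        value -= 1 << len(pattern)
    return value


print(to_bit_pattern(5, 8))          # 00000101
print(to_bit_pattern(-5, 8))         # 11111011
print(to_bit_pattern(-1, 8))         # 11111111
print(from_bit_pattern("11111011"))  # -5
```

The 8-bit width in the printed examples is only there to keep the patterns short. With the default width, -1 is a run of 32 ones.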
Representing Characters
- characters have no natural correspondence to binary numbers
- computer scientists devised an arbitrary system for representing characters as bit patterns
- ASCII (American Standard Code for Information Interchange) maps each character to a specific bit pattern (a 7-bit code, usually stored in an 8-bit byte)
- note that all digits are contiguous, as are all lower-case and all upper-case letters
  '0' < '1' < … < '9'
  'A' < 'B' < … < 'Z'
  'a' < 'b' < … < 'z'
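The contiguity of the codes can be checked directly in Python, where the built-in `ord` returns a character's code and `chr` goes the other way. The two helper functions are hypothetical examples of why contiguous codes are convenient.

```python
# Print the first and last code in each contiguous range.
for first, last in [("0", "9"), ("A", "Z"), ("a", "z")]:
    print(first, ord(first), last, ord(last))
# 0 48 9 57
# A 65 Z 90
# a 97 z 122


def digit_value(ch):
    """Numeric value of a digit character, using the fact that '0'..'9' are contiguous."""
    if len(ch) != 1 or not "0" <= ch <= "9":
        raise ValueError("expected a single digit character")
    return ord(ch) - ord("0")


def to_upper(ch):
    """Upper-case a lower-case letter by shifting its code by a fixed offset."""
    if len(ch) != 1 or not "a" <= ch <= "z":
        raise ValueError("expected a single lower-case letter")
    return chr(ord(ch) - ord("a") + ord("A"))


print(digit_value("7"))  # 7
print(to_upper("q"))     # Q
```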
Representing Text
- strings can be represented as sequences of ASCII codes, one for each character in the string
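As a small illustration, the sketch below turns a string into one 8-bit pattern per character and back again. The function names are hypothetical. Python's `str.encode("ascii")` is used to obtain the codes and raises an error for characters outside ASCII.

```python
def to_ascii_bits(text):
    """Return one 8-bit pattern per character of an ASCII string."""
    return [format(code, "08b") for code in text.encode("ascii")]


def from_ascii_bits(patterns):
    """Rebuild the string from a list of 8-bit patterns."""
    return bytes(int(p, 2) for p in patterns).decode("ascii")


print(to_ascii_bits("Hi"))                        # ['01001000', '01101001']
print(from_ascii_bits(["01001000", "01101001"]))  # Hi
```

Here 'H' has code 72 (64 + 8) and 'i' has code 105 (64 + 32 + 8 + 1), which gives the two patterns shown.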
- specific programs may store additional information along with the ASCII codes
  - e.g., programming languages will often store the number of characters along with the ASCII codes
  - e.g., word processing programs will insert special character symbols to denote formatting (analogous to HTML tags in a Web page)
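One way to store the number of characters alongside the codes is a length prefix. The sketch below is an assumption-laden illustration: the single-byte prefix (limiting strings to 255 characters) and the function names are chosen for simplicity and do not describe any specific language's layout.

```python
def store_with_length(text):
    """Store a string as one length byte followed by its ASCII codes."""
    codes = text.encode("ascii")
    if len(codes) > 255:
        raise ValueError("a one-byte length prefix allows at most 255 characters")
    return bytes([len(codes)]) + codes


def load_with_length(data):
    """Read the length byte, then exactly that many ASCII codes."""
    length = data[0]
    return data[1:1 + length].decode("ascii")


stored = store_with_length("cat")
print(list(stored))                           # [3, 99, 97, 116]
print(load_with_length(stored + b"ignored"))  # cat
```

Because the length is stored up front, the reader knows where the string ends even when other data follows it in memory.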
Representing Sounds (cont.)
- when analog recordings are repeatedly duplicated, small errors that were originally unnoticed begin to propagate
- digital recordings can be reproduced exactly without any deterioration in sound quality
- analog waveforms must be converted to a sequence of discrete values
  - digital sampling is the process in which the amplitude of a wave is measured at regular intervals, and stored as discrete measurements
- frequent measurements must be taken to ensure high quality (e.g., 44,100 readings per second for a CD)
  - this results in massive amounts of storage
  - techniques are used to compress the data and reduce file sizes (e.g., MP3; WAV files, by contrast, usually store the samples uncompressed)
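Digital sampling and its storage cost can be sketched as follows. The 44,100 readings per second come from the CD example above. The 16 bits per reading and 2 channels are the usual CD values but are assumptions here, as are the function names and the pure sine-wave input.

```python
import math

SAMPLE_RATE = 44_100  # readings per second, as in the CD example


def sample_sine(freq_hz, duration_s, sample_rate=SAMPLE_RATE, bits=16):
    """Measure a sine wave at regular intervals and store each reading as an integer."""
    peak = 2 ** (bits - 1) - 1                 # largest amplitude that fits in the given bits
    count = int(round(duration_s * sample_rate))
    return [
        round(peak * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(count)
    ]


def storage_bytes(duration_s, sample_rate=SAMPLE_RATE, bits=16, channels=2):
    """Uncompressed size of a recording of the given length."""
    return int(round(duration_s * sample_rate)) * (bits // 8) * channels


samples = sample_sine(440, 0.01)   # 10 milliseconds of a 440 Hz tone
print(len(samples))                # 441
print(storage_bytes(60))           # 10584000
```

Under these assumptions one minute of uncompressed audio takes roughly 10.6 million bytes, which is why compression matters.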
Representing Images
- EXAMPLE: representing images
- images are stored using a variety of formats and compression techniques
- the simplest representation is a bitmap
- bitmaps partition an image into a grid of picture elements, called pixels, and then convert each pixel into a bit pattern
- resolution refers to the sharpness or clarity of an image
  - bitmaps that are divided into smaller pixels will yield higher resolution images
  - in the accompanying example, the left image is stored using 96 pixels per square inch, and the right image is stored using 48 pixels per square inch
  - the left image appears sharper, but has twice the storage requirements
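A minimal sketch of a bitmap, assuming a black-and-white image with one bit per pixel (real formats typically use more bits per pixel to encode color). The tiny image, the '#' and '.' notation, and the function names are all hypothetical.

```python
IMAGE = [
    "..##..",
    ".#..#.",
    ".#..#.",
    "..##..",
]


def to_bitmap(rows):
    """Convert rows of '#' (dark) and '.' (light) pixels into rows of bits."""
    return ["".join("1" if ch == "#" else "0" for ch in row) for row in rows]


def storage_bits(pixels_per_sq_inch, area_sq_inches, bits_per_pixel=1):
    """Storage needed for an image of the given area at the given pixel density."""
    return pixels_per_sq_inch * area_sq_inches * bits_per_pixel


for row in to_bitmap(IMAGE):
    print(row)
# 001100
# 010010
# 010010
# 001100

print(storage_bits(96, 4))  # 384
print(storage_bits(48, 4))  # 192
```

For the same area, 96 pixels per square inch needs twice as many bits as 48 pixels per square inch, matching the storage comparison above.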
Distinguishing Data Types
- how does a computer know what type of value is stored in a particular piece of memory?
- short answer: it doesn't
- when a program stores data in memory, it must store additional information as to what type of data the bit pattern represents
- thus, the same bit pattern might represent different values in different contexts
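To make this concrete, the sketch below reads the same four bytes three different ways using Python's standard `struct` module. The particular bytes and the big-endian byte order are arbitrary choices for illustration.

```python
import struct

raw = b"ABCD"  # four bytes: 01000001 01000010 01000011 01000100

as_int = struct.unpack(">i", raw)[0]    # read as a 32-bit two's complement integer
as_float = struct.unpack(">f", raw)[0]  # read as a 32-bit floating-point number
as_text = raw.decode("ascii")           # read as four ASCII characters

print(as_int)              # 1094861636
print(round(as_float, 2))  # 12.14
print(as_text)             # ABCD
```

Nothing in the bytes themselves says which reading is correct; the program that stored them has to remember.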