Memory & Storage: Lesson 6
Starter
A computer's memory and storage hold only binary 1s and 0s
- How might it be possible to store letters with only binary?
Learning Outcomes:
- Understand the use of binary codes to represent characters
- Understand the term ‘character set’
- Explain the relationship between the number of bits per character in a character set and the number of characters that can be represented using:
- ASCII
- Extended ASCII
- Unicode
Storing Characters
Representing characters in binary
- Every character on the keyboard is represented by a binary value
- Uppercase letters (capitals) have different values from lowercase characters
- Punctuation symbols have their own character codes
How many characters are there on a standard keyboard?
How many bits would be required to represent this many combinations?
Characters in binary
- A keyboard needs to contain
- 26 lowercase letters
- 26 uppercase letters
- 10 numbers
- (around) 36 other characters
- There are around 98 unique characters that are available on a keyboard
- 6 bits give 64 different combinations – this isn’t enough
- 7 bits give 128 different combinations which can represent 128 different characters
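The bit counts above can be checked with a short Python sketch. The function name `bits_needed` is an illustrative choice and does not come from the lesson.

```python
def bits_needed(n_characters):
    """Return the smallest number of bits whose combinations cover n_characters."""
    bits = 0
    # Keep adding a bit until 2 ** bits is at least the number of characters.
    while 2 ** bits < n_characters:
        bits += 1
    return bits


print(2 ** 6)           # combinations available with 6 bits
print(2 ** 7)           # combinations available with 7 bits
print(bits_needed(98))  # bits needed for about 98 keyboard characters
```

Each extra bit doubles the number of combinations, which is why moving from 6 to 7 bits is enough to cover the keyboard.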
ASCII & Unicode?
- What is ASCII?
- What is Unicode?
- What is the difference between the two?
The ASCII character set
- ASCII (American Standard Code for Information Interchange) has become the standard code, used worldwide
- It was originally developed in the 1960s for representing the English alphabet
- It encodes 128 characters into 7-bit binary codes
- Characters include the digits 0 to 9, uppercase and lowercase letters A-Z and a-z, punctuation symbols and the space character
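As a small illustration, Python's built-in `ord` returns a character's code, and `format` can show it as a 7-bit binary pattern. The helper name `ascii_bits` is made up for this sketch.

```python
def ascii_bits(character):
    """Return the 7-bit binary pattern of an ASCII character as a string."""
    code = ord(character)      # numeric character code, e.g. 65 for "A"
    return format(code, "07b")  # pad with leading zeros to 7 bits


for character in ["A", "a", "0", " "]:
    print(repr(character), ord(character), ascii_bits(character))
```

Running it shows that "A" and "a" have different codes, and that the space character has a code of its own.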
ASCII: Problems
- Standard ASCII uses only the first 128 of the 256 values available in an 8-bit byte.
- Other languages, such as German, French, Finnish, Irish and Icelandic, use the other 128 values for their own special characters.
- This is called the extended ASCII character set.
7-bit and 8-bit ASCII
- Numerous different codes for representing characters have been created, but ASCII is commonly used on PCs
- Originally only seven bits were used, but now an eighth bit is used, allowing for many more characters such as © and ®
- How many different characters can be encoded using 7 bits, 8 bits or 16 bits?
Using the eighth bit
- Sometimes it is useful to be able to type special characters like á, à, ®
- Here are the codes for some of them:
© Alt+0169
® Alt+0174
á Alt+0225
à Alt+0224
â Alt+0226
ä Alt+0228
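The codes listed here can be explored in Python with `chr`. This sketch assumes that these particular Alt codes match the characters' Latin-1 (and Unicode) code values, which is the case for the six characters shown; the dictionary name `alt_codes` is illustrative.

```python
# Alt code number -> character, as listed on the slide
alt_codes = {169: "©", 174: "®", 225: "á", 224: "à", 226: "â", 228: "ä"}

for code, character in alt_codes.items():
    # chr() turns a code value into its character;
    # "08b" shows the value as an 8-bit binary pattern.
    print(code, chr(code), format(code, "08b"), chr(code) == character)
```

Every code is 128 or higher, so each binary pattern starts with a 1: these characters only fit once the eighth bit is used.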
Extended ASCII
- There is no single agreed-upon standard for extended ASCII code sets.
- There are many different extended ASCII sets, as various countries wanted to use the upper 128 values for their own alphabets. Example:
- In the 'Latin 1 Western European' set, 11111100 is the German u-with-umlaut (ü).
- In other extended ASCII sets, such as 'Latin 2 Central European', which covers languages such as Croatian, Serbian and Czech, an 8-bit value from this upper range may correspond to an entirely different character.
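The idea that one 8-bit value can stand for different characters in different extended sets can be tried out in Python. The byte value 10100011 used below is an illustrative choice, not one taken from the slide, and `decode_byte` is a made-up helper name. Python's codec names `iso-8859-1` and `iso-8859-2` correspond to Latin 1 and Latin 2.

```python
def decode_byte(value, encoding):
    """Interpret a single byte value using the named character set."""
    return bytes([value]).decode(encoding)


value = 0b10100011  # one 8-bit pattern, above the 7-bit ASCII range

latin1_char = decode_byte(value, "iso-8859-1")  # Latin 1 Western European
latin2_char = decode_byte(value, "iso-8859-2")  # Latin 2 Central European

print(latin1_char, latin2_char, latin1_char == latin2_char)
```

The same byte comes out as a different character in each set, so a file has to be read with the same set it was written with.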
Programming with text & numbers
- The ASCII code for ‘7’ is 011 0111
- The binary code for the number 7 is 0000 0111
- When you write a program in Python, for example, you have to specify whether a variable is text or an integer
- You cannot do arithmetic with characters
- If the character represents a number, it must first be converted to an integer before any arithmetic can be carried out
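A minimal Python sketch of the points above; the variable names are illustrative.

```python
digit_char = "7"  # text: the character '7'

# The character code is not the same as the number it looks like.
print(ord(digit_char))                 # the ASCII code of the character
print(format(ord(digit_char), "07b"))  # that code as 7 bits

# Adding a number to a character is not allowed in Python.
try:
    digit_char + 1
except TypeError as error:
    print("TypeError:", error)

# Convert the character to an integer before doing arithmetic.
number = int(digit_char)
print(format(number, "08b"))  # the integer 7 as 8 bits
print(number + 1)             # arithmetic now works

# With two pieces of text, + joins them instead of adding.
print(digit_char + "1")
```

The last line shows why the conversion matters: joining ‘7’ and ‘1’ as text gives ‘71’, not 8.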
Using different alphabets
- To represent other characters for different languages, a new code allowing for many more characters is needed
- Unicode was developed to use 16 bits, giving 65 536 possible combinations
- The 32-bit version gives 4 294 967 296 (over 4 billion) possible combinations
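These totals follow from the rule that n bits give 2 to the power n combinations. A short sketch, with `combinations` as an illustrative function name:

```python
def combinations(bits):
    """Return how many different values a code of this many bits can hold."""
    return 2 ** bits


for bits in (7, 8, 16, 32):
    print(bits, "bits:", combinations(bits), "combinations")
```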
Unicode
- In Japanese, konnichiwa is used as a greeting meaning “good day”
- In Unicode this is written as three 16-bit characters
- How many bytes does the English ‘good day’ require in ASCII?
- How many bytes does the Japanese require in Unicode?
Smiling face with sunglasses emoji, Unicode: 1F60E
Unicode
- ‘good day’ requires 8 bytes to store
- 今日は requires 6 bytes to store (3 characters × 2 bytes)
- Unicode is also used to store emoji
- ‘e’ is Japanese for picture
- ‘moji’ is Japanese for character or alphabet
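The byte counts can be checked in Python by encoding each piece of text and measuring the result. The lesson does not name a specific Unicode encoding, so this sketch assumes UTF-16 (big-endian, without a byte-order mark) as the 16-bit form and UTF-32 as the 32-bit form; the variable names are illustrative.

```python
english = "good day"
japanese = "今日は"
emoji = chr(0x1F60E)  # smiling face with sunglasses

ascii_bytes = len(english.encode("ascii"))         # 1 byte per character
japanese_bytes = len(japanese.encode("utf-16-be"))  # 2 bytes per character
emoji_utf16 = len(emoji.encode("utf-16-be"))        # needs two 16-bit units
emoji_utf32 = len(emoji.encode("utf-32-be"))        # one 32-bit unit

print(ascii_bytes, japanese_bytes, emoji_utf16, emoji_utf32)
```

The emoji's code, 1F60E, is larger than 65 535, so it does not fit in a single 16-bit value. Under these assumptions it takes 4 bytes in either encoding.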