Memory & Storage: Lesson 6 Storing Characters, Slides of Computer science

These slides explain how computers store characters using binary codes and character sets. They cover the ASCII, extended ASCII, and Unicode character sets, and the relationship between the number of bits per character and the number of characters that can be represented. They also discuss the limitations of ASCII and the need for extended character sets to represent special characters in different languages, and include programming examples and tasks for students to complete.

Typology: Slides

2021/2022

Available from 10/26/2022

UKComputerScienceGuides

Memory & Storage: Lesson 6

Starter

A computer's memory and storage hold only binary 1s and 0s

  • How might it be possible to store letters with only binary?

Learning Outcomes:

  • Understand the use of binary codes to represent characters
  • Understand the term ‘character set’
  • Explain the relationship between the number of bits per character in a character set, and the number of characters that can be represented using:
    • ASCII
    • Extended ASCII
    • Unicode

Storing Characters

Representing characters in binary

  • Every character on the keyboard is represented by a binary value
    • Uppercase letters (capitals) have different values from lowercase characters
    • Punctuation symbols have their own character codes
  • How many characters are there on a standard keyboard?
  • How many bits would be required to represent this many combinations?

Characters in binary

  • A keyboard needs to contain
    • 26 lowercase letters
    • 26 uppercase letters
    • 10 digits (0 to 9)
    • (around) 36 other characters
  • This gives around 98 unique characters that are available on a keyboard
    • 6 bits give 64 different combinations – this isn’t enough
    • 7 bits give 128 different combinations which can represent 128 different characters
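The arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the function names `combinations` and `bits_needed` are made up for illustration, and the count of 98 is the slide's approximate figure.

```python
def combinations(bits: int) -> int:
    """Number of different values that `bits` binary digits can represent."""
    return 2 ** bits


def bits_needed(characters: int) -> int:
    """Smallest number of bits that gives at least `characters` combinations."""
    # (n - 1).bit_length() is the bit count of the largest code, n - 1.
    return (characters - 1).bit_length()


# The slide's approximate keyboard count: letters, digits and other symbols.
keyboard_characters = 26 + 26 + 10 + 36

for bits in (6, 7):
    print(bits, "bits give", combinations(bits), "combinations")

print(keyboard_characters, "characters need", bits_needed(keyboard_characters), "bits")
```

Each extra bit doubles the number of combinations, which is why moving from 6 to 7 bits is enough to cover the keyboard.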

ASCII & Unicode?

  • What is ASCII?
  • What is Unicode?
  • What is the difference between the two?

The ASCII character set

  • ASCII (American Standard Code for Information Interchange) has become the standard code, used worldwide
    • It was originally developed in the 1960s for representing the English alphabet
    • It encodes 128 characters into 7-bit binary codes
  • Characters include the digits 0 to 9, uppercase and lowercase letters A-Z and a-z, punctuation symbols and the space character
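To see the 7-bit codes for a few characters, Python's built-in `ord` and `chr` can be used. This is a small sketch; the helper name `ascii_bits` is an assumption made for this example.

```python
def ascii_bits(character: str) -> str:
    """Return the 7-bit ASCII pattern for a single character."""
    code = ord(character)  # character -> numeric code
    if code > 127:
        raise ValueError(f"{character!r} is not in the 7-bit ASCII set")
    return format(code, "07b")  # numeric code -> 7 binary digits


for character in ("A", "a", "0", " "):
    print(repr(character), ord(character), ascii_bits(character))

# chr goes the other way: numeric code -> character
print(chr(65))
```

Note that uppercase "A" and lowercase "a" have different codes, and that the space character has a code of its own.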

ASCII: Problems

  • Standard ASCII uses only the first 128 of the 256 values that an 8-bit byte can hold.
  • Other languages such as German, French, Finnish, Irish and Icelandic take advantage of the other 128 values to include their own special characters.
  • This is called the extended ASCII character set.

7- and 8-bit ASCII

  • Numerous different codes for representing characters have been created, but ASCII is commonly used on PCs
  • Originally only seven bits were used, but now an eighth bit is used, allowing for many more characters such as © and ®
    • How many different characters can be encoded using 7 bits, 8 bits or 16 bits?

Using the eighth bit

  • Sometimes it is useful to be able to type special characters like á, à, ®
  • Here are the codes for some of them:
    • © Alt+0169
    • ® Alt+0174
    • á Alt+0225
    • à Alt+0224
    • â Alt+0226
    • ä Alt+0228
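The codes above are all larger than 127, so each one needs the eighth bit. The sketch below decodes each number as a single byte. It assumes the Alt codes refer to the Windows code page 1252 character set, which the slide does not state. The names `SLIDE_CODES` and `decode_code` are made up for this example.

```python
# Codes listed on the slide, written as plain integers (Alt+0169 -> 169).
SLIDE_CODES = {169: "©", 174: "®", 225: "á", 224: "à", 226: "â", 228: "ä"}


def decode_code(number: int, encoding: str = "cp1252") -> str:
    """Treat `number` as one byte and decode it with the given character set."""
    return bytes([number]).decode(encoding)


for number, expected in SLIDE_CODES.items():
    character = decode_code(number)
    # The 8-digit pattern starts with 1 for every code above 127.
    print(f"Alt+0{number}  {number:08b}  {character}  matches slide: {character == expected}")
```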

Extended ASCII

  • There is no single agreed-upon standard for extended ASCII character sets.
  • There are many different extended ASCII sets, as various countries wanted to use the upper 128 values for their own alphabets. Example:
    • In the 'Latin 1 Western European' set, 11011100 is the German uppercase U-with-umlaut (Ü); the lowercase ü is 11111100.
    • In other extended ASCII sets, such as 'Latin 2 Central European', which covers languages such as Croatian, Serbian, and Czech, many of the upper 128 values correspond to entirely different characters.
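The effect of having several extended sets can be tried out in Python by decoding the same byte with different character sets. This is a sketch under some assumptions: it treats 'Latin 1' as ISO 8859-1 and 'Latin 2' as ISO 8859-2, and it adds the ISO 8859-5 Cyrillic set purely as an extra illustration. The helper name `decode_byte` is made up for this example.

```python
def decode_byte(value: int, encoding: str) -> str:
    """Decode one 8-bit value using the named character set."""
    return bytes([value]).decode(encoding)


# Python codec names for three extended 8-bit sets.
CHARACTER_SETS = ("latin_1", "iso8859_2", "iso8859_5")

# 0b11011100 is the value from the slide; 0b11010101 is a second test value.
for value in (0b11011100, 0b11010101):
    for name in CHARACTER_SETS:
        print(f"{value:08b} in {name}: {decode_byte(value, name)}")
```

The same bit pattern prints as different characters depending on which set is chosen, so a file written with one extended set can look wrong when it is read with another.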

Programming with text & numbers

  • The ASCII code for the character ‘7’ is 011 0111
    • The binary code for the number 7 is 0000 0111
  • When you write a program, in Python for example, you have to be clear whether a value is text or an integer
    • You cannot do arithmetic with characters
    • If a character represents a digit, it must first be converted to an integer before any arithmetic can be carried out
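The difference between the character ‘7’ and the number 7 can be shown in a short Python sketch. The variable names are chosen only for this example.

```python
character = "7"  # text: stored as a character code
number = 7       # integer: stored as a binary number

ascii_pattern = format(ord(character), "07b")  # 7-bit code for the character
number_pattern = format(number, "08b")         # 8-bit pattern for the integer
print(ascii_pattern, number_pattern)

# Joining text with + makes longer text; it does not add.
print(character + character)

# Mixing text and integers in arithmetic is an error in Python.
try:
    character + number
except TypeError as error:
    print("TypeError:", error)

# Convert the character to an integer first, then do the arithmetic.
total = int(character) + number
print(total)
```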

Using different alphabets

  • To represent other characters for different languages, a new code allowing for many more characters is needed
    • Unicode was originally developed to use 16 bits, giving 65 536 possible combinations
    • The 32-bit version gives 4 294 967 296 (over 4 billion) possible combinations

Unicode

  • In Japanese, konnichiwa is used as a greeting meaning “good day”
  • In Unicode this is written as three 16-bit characters
    • How many bytes does the English ‘good day’ require in ASCII?
    • How many bytes does the Japanese require in Unicode?

Smiling face with sunglasses (😎), Unicode: 1F60E

Unicode

  • ‘good day’ requires 8 bytes to store in ASCII (8 characters, including the space, x 1 byte)
  • 今日は requires 6 bytes to store (3 characters x 2 bytes)
  • Unicode is also used to store emoji
    • ‘e’ is Japanese for picture
    • ‘moji’ is Japanese for character or alphabet
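These byte counts can be checked in Python by encoding each piece of text and measuring the result. This is a sketch under an assumption: the slide's "16-bit Unicode" is taken to mean the UTF-16 encoding, written here as `utf-16-be` so that no byte-order mark is added. The helper name `stored_size` is made up for this example.

```python
def stored_size(text: str, encoding: str) -> int:
    """Number of bytes needed to store `text` in the given encoding."""
    return len(text.encode(encoding))


english = "good day"
japanese = "今日は"
emoji = chr(0x1F60E)  # smiling face with sunglasses

print("English in ASCII:", stored_size(english, "ascii"), "bytes")
print("Japanese in UTF-16:", stored_size(japanese, "utf-16-be"), "bytes")

# The emoji's code is larger than 65 535, so one 16-bit unit is not enough.
print("Emoji in UTF-16:", stored_size(emoji, "utf-16-be"), "bytes")
print("Emoji in UTF-32:", stored_size(emoji, "utf-32-be"), "bytes")
```

Other Unicode encodings give different sizes for the same text. For example, `stored_size(japanese, "utf-8")` is 9, because UTF-8 uses three bytes for each of these characters.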