








































Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
An overview of character and string processing in java. It covers the use of character constants and special characters, the character class and its methods, and the handling of strings as complex objects. The document also includes examples of implementing methods like isdigit(), randomletter(), and touppercase().
Typology: Study notes
1 / 48
This page cannot be seen from the preview
Don't miss anything!









































Characters and Strings
Spring 2009
I (^) Reading: Textbook, Sections 8.2–8. I (^) Lab: Character and String processing I (^) Attendance
Surely you don’t think that numbers are as important as words. —King Azaz to the Mathemagician; Norton Juster, The Phantom Tollbooth, 1961
I (^) The first widely adopted character encoding was ASCII — American Standard Code for Information Interchange.
I (^) With only 256 possible characters (the number of bit combinations in a byte), the ASCII system proved inadequate to represent the many alphabets in use throughout the world.
I (^) It has therefore been superseded by Unicode, which allows for a much larger number of characters.
The first 128 characters — written in Octal or Base 8 base 0 1 2 3 4 5 6 7 000 \ 000 \ 001 \ 002 \ 003 \ 004 \ 005 \ 006 \ 007 010 \b \t \n \ 011 \f \r \ 016 \ 017 020 \ 020 \ 021 \ 022 \ 023 \ 024 \ 025 \ 026 \ 027 030 \ 030 \ 031 \ 032 \ 033 \ 034 \ 035 \ 036 \ 037 040 space! " # $ % & ’ 050 ( ) * + , -. / 060 0 1 2 3 4 5 6 7 070 8 9 : ; < = >? 100 @ A B C D E F G 110 H I J K L M N O 120 P Q R S T U V W 130 X Y Z [ \ ] ^ _ 140 ‘ a b c d e f g 150 h i j k l m n o 160 p q r s t u v w 170 x y z { | } ∼ \
Two properties of the Unicode table are worth special notice:
I (^) The character codes for the digits are consecutive
I (^) The letters in the alphabet are divided into two ranges: one for the uppercase letters and one for the lowercase letters.
Within each range of alphabetic letters, the Unicode values are consecutive.
I (^) Most of the characters in the Unicode table are familiar ones that appear on the keyboard.
I (^) These characters are called printing (or printable) characters.
I (^) The table also includes several special characters that are typically used to control formatting.
I (^) Special characters are indicated in the Unicode table by an ”escape” sequence, which consists of a backslash followed by a character or sequence of digits.
I (^) The Character class is defined in the java.lang package and is therefore available in any Java program without an import statement.
I (^) The Character class provides several useful methods for manipulating char values.
I (^) Character methods are static — they belong to the class, rather than to an object of the class.
I (^) Hence, these methods aren’t sent to a Character class object, but can be used with chars — similar to Math class methods and how they work on numeric types.
I (^) It is good programming practice to use these library methods rather than writing our own.
I (^) They are standard — programmers recognize and know what they mean.
I (^) Library methods have been tested by millions of client programmers, so it is reasonable to expect that they are correct.
I (^) Implementations in the Character class are able to convert characters in alphabets other than our Roman (English) alphabet.
I (^) Library methods are typically more efficient than methods we write ourselves.
static char toLowerCase(char ch) Returns a copy of ch converted to its lowercase equivalent, if any. Otherwise, the value of ch is returned unchanged.
static char toUpperCase(char ch) Returns a copy of ch converted to its uppercase equivalent, if any. Otherwise, the value of ch is returned unchanged.
Neither of these methods modifies the char argument.
I (^) The fact that characters have underlying integer representations allows us to use them in arithmetic expressions.
I (^) For example, if you evaluate the expression ’A’ + 1, Java will convert the character ’A’ into the integer 65 and then add 1 to get 66, which is the character code for ’B’
I (^) As an example, the following method returns a randomly chosen uppercase letter—why is the (char) in the return statement? public char randomLetter() { return (char) rgen.nextInt(’A’, ’Z’); }
I (^) We wish to implement a method, toHexDigit(), that takes an integer and returns the corresponding hexadecimal (base 16) digit as a character:
public char toHexDigit(int n) { if (0 <= n && n <= 9) { return (char) (’0’ + n); } else if (10 <= n && n <= 15) { return (char) (’A’ + (n - 10)); } else { return ’?’; } }
I (^) For most applications, this abstract view of strings is precisely the right one — however, on the inside, strings are surprisingly complicated objects.
I (^) Java supports a high–level view of stings by making String a class whose methods hide the underlying complexity.
I (^) Java defines many useful methods that operate on the String class.
I (^) Before trying to use those methods individually, it is important to understand how those methods work at a more general level.
I (^) The String class uses the receiver syntax when we call a String method — unlike the Character class static methods which take chars as arguments — in Java, we send a message to a String object.