












An introduction to sets, functions, and relations. It covers the basics of sets, including the definition of elements and subsets, as well as the concepts of reflexivity, symmetry, and transitivity. The document also explains the concept of functions, including the domain and range, and discusses the properties of one-to-one, onto, and bijective functions. Additionally, it introduces the concept of relations and their graphs.
Typology: Schemes and Mind Maps













We understand a “set” to be any collection M of certain distinct objects of our thought or intuition (called the “elements” of M ) into a whole. (Georg Cantor, 1895)
In mathematics you don’t understand things. You just get used to them. (Attributed to John von Neumann)
In this chapter, we define sets, functions, and relations and discuss some of their general properties. This material can be referred back to as needed in the subsequent chapters.
A set is a collection of objects, called the elements or members of the set. The objects could be anything (planets, squirrels, characters in Shakespeare’s plays, or other sets) but for us they will be mathematical objects such as numbers, or sets of numbers. We write x ∈ X if x is an element of the set X and x ∉ X if x is not an element of X.
If the definition of a “set” as a “collection” seems circular, that’s because it is. Conceiving of many objects as a single whole is a basic intuition that cannot be analyzed further, and the notions of “set” and “membership” are primitive ones. These notions can be made mathematically precise by introducing a system of axioms for sets and membership that agrees with our intuition and proving other set-theoretic properties from the axioms.
The most commonly used axioms for sets are the ZFC axioms, named somewhat inconsistently after two of their founders (Zermelo and Fraenkel) and one of their axioms (the Axiom of Choice). We won’t state these axioms here; instead, we use “naive” set theory, based on the intuitive properties of sets. Nevertheless, all the set-theory arguments we use can be rigorously formalized within the ZFC system.
2 1. Sets and Functions
Sets are determined entirely by their elements. Thus, the sets X, Y are equal, written X = Y , if
x ∈ X if and only if x ∈ Y.
It is convenient to define the empty set, denoted by ∅, as the set with no elements. (Since sets are determined by their elements, there is only one set with no elements!) If X ≠ ∅, meaning that X has at least one element, then we say that X is non-empty.
We can define a finite set by listing its elements (between curly brackets). For example,
X = {2, 3, 5, 7, 11}
is a set with five elements. The order in which the elements are listed or repetitions of the same element are irrelevant. Alternatively, we can define X as the set whose elements are the first five prime numbers. It doesn’t matter how we specify the elements of X, only that they are the same.
Infinite sets can’t be defined by explicitly listing all of their elements. Nevertheless, we will adopt a realist (or “platonist”) approach towards arbitrary infinite sets and regard them as well-defined totalities. In constructive mathematics and computer science, one may be interested only in sets that can be defined by a rule or algorithm (for example, the set of all prime numbers) rather than by infinitely many arbitrary specifications, and there are some mathematicians who consider infinite sets to be meaningless without some way of constructing them. Similar issues arise with the notion of arbitrary subsets, functions, and relations.
1.1.1. Numbers. The infinite sets we use are derived from the natural and real numbers, about which we have a direct intuitive understanding.
Our understanding of the natural numbers 1, 2, 3, ... derives from counting. We denote the set of natural numbers by
N = {1, 2, 3, ...}.
We define N so that it starts at 1. In set theory and logic, the natural numbers are defined to start at zero, but we denote this set by N_0 = {0, 1, 2, ...}. Historically, the number 0 was a later addition to the number system, primarily by Indian mathematicians in the 5th century AD. The ancient Greek mathematicians, such as Euclid, defined a number as a multiplicity and didn’t consider 1 to be a number either.
Our understanding of the real numbers derives from durations of time and lengths in space. We think of the real line, or continuum, as being composed of an (uncountably) infinite number of points, each of which corresponds to a real number, and denote the set of real numbers by R. There are philosophical questions, going back at least to Zeno’s paradoxes, about whether the continuum can be represented as a set of points, and a number of mathematicians have disputed this assumption or introduced alternative models of the continuum. There are, however, no known inconsistencies in treating R as a set of points, and since Cantor’s work it has been the dominant point of view in mathematics because of its precision, power, and simplicity.
complements, by listing finitely many elements. Some infinite subsets, such as the set of primes or the set of squares, can be defined by giving a definite rule for membership. We imagine that a general subset A ⊂ N is “defined” by going through the elements of N one by one and deciding for each n ∈ N whether n ∈ A or n ∉ A.
If X is a set and P is a property of elements of X, we denote the subset of X consisting of elements with the property P by {x ∈ X : P (x)}.
Example 1.3. The set {n ∈ N : n = k^2 for some k ∈ N}
is the set of perfect squares {1, 4, 9, 16, 25, ...}. The set
{x ∈ R : 0 < x < 1 }
is the open interval (0, 1).
1.1.3. Set operations. The intersection A ∩ B of two sets A, B is the set of all elements that belong to both A and B; that is
x ∈ A ∩ B if and only if x ∈ A and x ∈ B.
Two sets A, B are said to be disjoint if A ∩ B = ∅; that is, if A and B have no elements in common.
The union A ∪ B is the set of all elements that belong to A or B; that is
x ∈ A ∪ B if and only if x ∈ A or x ∈ B.
Note that we always use ‘or’ in an inclusive sense, so that x ∈ A ∪ B if x is an element of A or B, or both A and B. (Thus, A ∩ B ⊂ A ∪ B.)
The set-difference of two sets B and A is the set of elements of B that do not belong to A,
B \ A = {x ∈ B : x ∉ A}.
If we consider sets that are subsets of a fixed set X that is understood from the context, then we write A^c = X \ A to denote the complement of A ⊂ X in X. Note that (A^c)^c = A.
Example 1.4. If
A = {2, 3, 5, 7, 11}, B = {1, 3, 5, 7, 9, 11}
then
A ∩ B = {3, 5, 7, 11}, A ∪ B = {1, 2, 3, 5, 7, 9, 11}.
Thus, A ∩ B consists of the natural numbers between 1 and 11 that are both prime and odd, while A ∪ B consists of the numbers that are either prime or odd (or both). The set differences of these sets are
B \ A = {1, 9}, A \ B = {2}.
Thus, B \ A is the set of odd numbers between 1 and 11 that are not prime, and A \ B is the set of prime numbers that are not odd.
1.2. Functions 5
These set operations may be represented by Venn diagrams, which can be used to visualize their properties. In particular, if A, B ⊂ X, we have De Morgan’s laws:
(A ∪ B)^c = A^c ∩ B^c, (A ∩ B)^c = A^c ∪ B^c.
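The set operations above can be tried out directly with Python's built-in `set` type. The following is a minimal sketch using the sets A and B of Example 1.4; the universe X = {1, ..., 11} and the helper `complement` are assumptions introduced only to illustrate De Morgan's laws.

```python
# Sets from Example 1.4.
A = {2, 3, 5, 7, 11}
B = {1, 3, 5, 7, 9, 11}

intersection = A & B   # elements that belong to both A and B
union = A | B          # elements that belong to A or B (inclusive "or")
b_minus_a = B - A      # elements of B that do not belong to A
a_minus_b = A - B      # elements of A that do not belong to B

# Assumed universe for complements: X = {1, 2, ..., 11}.
X = set(range(1, 12))


def complement(S):
    """Complement of S in the fixed universe X (illustrative helper)."""
    return X - S


# De Morgan's laws for this particular choice of A, B, and X.
de_morgan_union = complement(A | B) == complement(A) & complement(B)
de_morgan_intersection = complement(A & B) == complement(A) | complement(B)

print(sorted(intersection), sorted(union))
print(sorted(b_minus_a), sorted(a_minus_b))
print(de_morgan_union, de_morgan_intersection)
```

A check on one example is of course not a proof; it only confirms the identities for these particular sets.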
The definitions of union and intersection extend to larger collections of sets in a natural way.
Definition 1.5. Let C be a collection of sets. Then the union of C is ⋃ C = {x : x ∈ X for some X ∈ C} ,
and the intersection of C is ⋂ C = {x : x ∈ X for every X ∈ C}.
If C = {A, B}, then this definition reduces to our previous one for A ∪ B and A ∩ B.
The Cartesian product X × Y of sets X, Y is the set of all ordered pairs (x, y) with x ∈ X and y ∈ Y. If X = Y, we often write X × X = X^2. Two ordered pairs (x_1, y_1), (x_2, y_2) in X × Y are equal if and only if x_1 = x_2 and y_1 = y_2. Thus, (x, y) ≠ (y, x) unless x = y. This contrasts with sets where {x, y} = {y, x}.
Example 1.6. If X = {1, 2, 3} and Y = {4, 5} then
X × Y = {(1, 4), (1, 5), (2, 4), (2, 5), (3, 4), (3, 5)}.
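As a small illustration, Example 1.6 can be reproduced with `itertools.product` from the Python standard library, representing ordered pairs as tuples. This is only a sketch for finite sets; the variable names are chosen here for illustration.

```python
from itertools import product

X = {1, 2, 3}
Y = {4, 5}

# X × Y as a set of ordered pairs (tuples).
cartesian = set(product(X, Y))

# Ordered pairs depend on order, whereas sets do not.
pairs_differ = (1, 4) != (4, 1)
sets_agree = {1, 4} == {4, 1}

print(sorted(cartesian))
print(pairs_differ, sets_agree)
```

For finite sets, the number of pairs is the product of the sizes of X and Y, here 3 · 2 = 6.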
Example 1.7. The Cartesian product of R with itself is the Cartesian plane R^2 consisting of all points with coordinates (x, y) where x, y ∈ R.
The Cartesian product of finitely many sets is defined analogously.
Definition 1.8. The Cartesian product of n sets X_1, X_2, ..., X_n is the set of ordered n-tuples,
X_1 × X_2 × · · · × X_n = {(x_1, x_2, ..., x_n) : x_i ∈ X_i for i = 1, 2, ..., n},
where (x_1, x_2, ..., x_n) = (y_1, y_2, ..., y_n) if and only if x_i = y_i for every i = 1, 2, ..., n.
A function f : X → Y between sets X, Y assigns to each x ∈ X a unique element f(x) ∈ Y. Functions are also called maps, mappings, or transformations. The set X on which f is defined is called the domain of f and the set Y in which it takes its values is called the codomain. We write f : x ↦ f(x) to indicate that f is the function that maps x to f(x).
Example 1.9. The identity function id_X : X → X on a set X is the function id_X : x ↦ x that maps every element to itself.
Example 1.10. Let A ⊂ X. The characteristic (or indicator) function of A,
χ_A : X → {0, 1},
1.3. Composition and inverses of functions 7
A function is one-to-one if it maps distinct elements of X to distinct elements of Y ; that is, if
x_1, x_2 ∈ X and x_1 ≠ x_2 implies that f(x_1) ≠ f(x_2).
An onto function is also called a surjection, a one-to-one function an injection, and a one-to-one, onto function a bijection.
Example 1.14. The function f : A → B defined in Example 1.11 is one-to-one but not onto, since 5 ∉ ran f, while the function g : B → A is onto but not one-to-one, since g(5) = g(7).
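For functions between finite sets, these properties can be checked mechanically. The sketch below represents a function as a Python dictionary from domain elements to values. The functions `f` and `g` are hypothetical examples chosen for illustration (they are not the functions of Example 1.11), and the helper names are assumptions of this sketch.

```python
def is_one_to_one(func):
    """True if distinct domain elements have distinct values."""
    values = list(func.values())
    return len(values) == len(set(values))


def is_onto(func, codomain):
    """True if every element of the codomain is a value of func."""
    return set(func.values()) == set(codomain)


def is_bijection(func, codomain):
    """True if func is both one-to-one and onto."""
    return is_one_to_one(func) and is_onto(func, codomain)


# Hypothetical finite sets and functions, for illustration only.
S = {1, 2, 3}
T = {"a", "b", "c", "d"}

f = {1: "a", 2: "b", 3: "c"}           # f : S -> T
g = {"a": 1, "b": 2, "c": 3, "d": 3}   # g : T -> S

print(is_one_to_one(f), is_onto(f, T))  # f misses "d", so it is not onto
print(is_one_to_one(g), is_onto(g, S))  # g("c") == g("d"), so not one-to-one
```

Note that whether a function is onto depends on the stated codomain, which is why `is_onto` takes it as a separate argument.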
The successive application of mappings leads to the notion of the composition of functions.
Definition 1.15. The composition of functions f : X → Y and g : Y → Z is the function g ◦ f : X → Z defined by
(g ◦ f )(x) = g (f (x)).
The order of application of the functions in a composition is crucial and is read from right to left. The composition g ◦ f can only be defined if the domain of g includes the range of f, and the existence of g ◦ f does not imply that f ◦ g even makes sense.
Example 1.16. Let X be the set of students in a class and f : X → N the function that maps a student to her age. Let g : N → N be the function that adds up the digits in a number, e.g., g(1729) = 19. If x ∈ X is 23 years old, then (g ◦ f)(x) = 5, but (f ◦ g)(x) makes no sense, since students in the class are not natural numbers.
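Example 1.16 can be mirrored in a few lines of Python. This is a sketch only: the class roster `ages` and the student names are invented for illustration, and `compose` is a helper defined here rather than a standard-library function.

```python
def compose(g, f):
    """Return the composition g ∘ f, which applies f first and then g."""
    return lambda x: g(f(x))


def digit_sum(n):
    """Add up the decimal digits of a natural number n."""
    return sum(int(d) for d in str(n))


# Hypothetical class roster: maps a student to her age.
ages = {"Ana": 23, "Bea": 19}


def age(student):
    return ages[student]


g_after_f = compose(digit_sum, age)

print(digit_sum(1729))   # 1 + 7 + 2 + 9
print(g_after_f("Ana"))  # digit sum of 23

# The other order, compose(age, digit_sum), is not meaningful here:
# digit_sum expects a number, so applying it to a student fails.
```

The right-to-left reading shows up in the argument order: `compose(digit_sum, age)` applies `age` first.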
Even if both g ◦ f and f ◦ g are defined, they are, in general, different functions.
Example 1.17. If f : A → B and g : B → A are the functions in Example 1.11, then g ◦ f : A → A is given by
(g ◦ f)(2) = 2, (g ◦ f)(3) = 3, (g ◦ f)(5) = 11, (g ◦ f)(7) = 7, (g ◦ f)(11) = 5,
and f ◦ g : B → B is given by
(f ◦ g)(1) = 1, (f ◦ g)(3) = 3, (f ◦ g)(5) = 7, (f ◦ g)(7) = 7, (f ◦ g)(9) = 11, (f ◦ g)(11) = 9.
A one-to-one, onto function f : X → Y has an inverse f^{-1} : Y → X defined by f^{-1}(y) = x if and only if f(x) = y.
Equivalently, f^{-1} ◦ f = id_X and f ◦ f^{-1} = id_Y. A value f^{-1}(y) is defined for every y ∈ Y since f is onto, and it is unique since f is one-to-one. If f : X → Y is one-to-one but not onto, then one can still define an inverse function f^{-1} : ran f → X whose domain is the range of f.
The use of the notation f^{-1} to denote the inverse function should not be confused with its use to denote the reciprocal function; it should be clear from the context which meaning is intended.
Example 1.18. If f : R → R is the function f(x) = x^3, which is one-to-one and onto, then the inverse function f^{-1} : R → R is given by
f^{-1}(x) = x^{1/3}.
On the other hand, the reciprocal function g = 1/f is given by
g(x) = 1/x^3, g : R \ {0} → R.
The reciprocal function is not defined at x = 0 where f (x) = 0.
If f : X → Y and A ⊂ X, then we let f (A) = {y ∈ Y : y = f (x) for some x ∈ A}
denote the set of values of f on points in A. Similarly, if B ⊂ Y , we let
f^{-1}(B) = {x ∈ X : f(x) ∈ B}
denote the set of points in X whose values belong to B. Note that f^{-1}(B) makes sense as a set even if the inverse function f^{-1} : Y → X does not exist.
Example 1.19. Define f : R → R by f(x) = x^2. If A = (−2, 2), then f(A) = [0, 4). If B = (0, 4), then
f^{-1}(B) = (−2, 0) ∪ (0, 2).
If C = (−4, 0), then f^{-1}(C) = ∅.
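Images and inverse images are easy to compute when the domain is finite. The sketch below uses the squaring function on the integers from −3 to 3 as a finite stand-in for Example 1.19; the finite domain and the helper names `image` and `preimage` are assumptions of this illustration, not part of the text.

```python
def image(f, A):
    """f(A): the set of values of f on points of A."""
    return {f(x) for x in A}


def preimage(f, domain, B):
    """f^{-1}(B): the points of the domain whose values belong to B."""
    return {x for x in domain if f(x) in B}


def square(x):
    return x * x


# Finite stand-in for the real line: the integers -3, ..., 3.
X = set(range(-3, 4))

img = image(square, {-1, 0, 1})
pre = preimage(square, X, {1, 4})
pre_empty = preimage(square, X, {-4, -1})

print(sorted(img))        # values of x^2 on {-1, 0, 1}
print(sorted(pre))        # points whose square is 1 or 4
print(sorted(pre_empty))  # no square is negative
```

As in the text, `preimage` is well defined even though squaring has no inverse function on this domain.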
Finally, we introduce operations on a set.
Definition 1.20. A binary operation on a set X is a function f : X × X → X.
We think of f as “combining” two elements of X to give another element of X. One can also consider higher-order operations, such as ternary operations f : X × X × X → X, but we will only use binary operations.
Example 1.21. Addition a : N × N → N and multiplication m : N × N → N are binary operations on N where
a(x, y) = x + y, m(x, y) = xy.
We say that a set X is indexed by a set I, or X is an indexed set, if there is an onto function f : I → X. We then write
X = {x_i : i ∈ I}
where x_i = f(i). For example,
{1, 4, 9, 16, ...} = {n^2 : n ∈ N}.
The set X itself is the range of the indexing function f, and it doesn’t depend on how we index it. If f isn’t one-to-one, then some elements are repeated, but this doesn’t affect the definition of the set X. For example,
{−1, 1} = {(−1)^n : n ∈ N} = {(−1)^{n+1} : n ∈ N}.
Conversely, if x ∈ ⋃_{j∈J} f^{-1}(Y_j), then x ∈ f^{-1}(Y_j) for some j ∈ J, so f(x) ∈ Y_j and f(x) ∈ ⋃_{j∈J} Y_j, meaning that x ∈ f^{-1}(⋃_{j∈J} Y_j). It follows that
⋃_{j∈J} f^{-1}(Y_j) ⊂ f^{-1}(⋃_{j∈J} Y_j),
which proves that the sets are equal.
If y ∈ f(⋂_{i∈I} X_i), then there exists x ∈ ⋂_{i∈I} X_i such that f(x) = y. Then x ∈ X_i and y ∈ f(X_i) for every i ∈ I, meaning that y ∈ ⋂_{i∈I} f(X_i). It follows that
f(⋂_{i∈I} X_i) ⊂ ⋂_{i∈I} f(X_i).
The only case in which we don’t always have equality is for the image of an intersection, and we may get strict inclusion here if f is not one-to-one.
Example 1.25. Define f : R → R by f(x) = x^2. Let A = (−1, 0) and B = (0, 1). Then A ∩ B = ∅ and f(A ∩ B) = ∅, but f(A) = f(B) = (0, 1), so f(A) ∩ f(B) = (0, 1) ≠ f(A ∩ B).
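The same phenomenon already appears for finite sets. The following sketch uses the finite sets {−2, −1} and {1, 2} as an assumed analogue of the intervals in Example 1.25, and compares squaring (not one-to-one) with cubing (one-to-one); the helper `image` is defined here for illustration.

```python
def image(f, A):
    """f(A): the set of values of f on points of A."""
    return {f(x) for x in A}


def square(x):
    return x * x


def cube(x):
    return x ** 3


# Finite analogue of the intervals (-1, 0) and (0, 1).
A = {-2, -1}
B = {1, 2}

# Squaring is not one-to-one, so the inclusion is strict.
lhs_square = image(square, A & B)                   # f(A ∩ B)
rhs_square = image(square, A) & image(square, B)    # f(A) ∩ f(B)

# Cubing is one-to-one, and for these sets the two sides agree.
lhs_cube = image(cube, A & B)
rhs_cube = image(cube, A) & image(cube, B)

print(lhs_square, rhs_square)
print(lhs_cube == rhs_cube)
```

In both cases the left-hand side is contained in the right-hand side, as the inclusion proved above requires.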
Next, we generalize the Cartesian product of finitely many sets to the product of possibly infinitely many sets.
Definition 1.26. Let C = {X_i : i ∈ I} be an indexed collection of sets X_i. The Cartesian product of C is the set of functions that assign to each index i ∈ I an element x_i ∈ X_i. That is,
∏_{i∈I} X_i = {f : I → ⋃_{i∈I} X_i : f(i) ∈ X_i for every i ∈ I}.
For example, if I = {1, 2, ..., n}, then f defines an ordered n-tuple of elements (x_1, x_2, ..., x_n) with x_i = f(i) ∈ X_i, so this definition is equivalent to our previous one.
If X_i = X for every i ∈ I, then ∏_{i∈I} X_i is simply the set of functions from I to X, and we also write it as
X^I = {f : I → X}.
We can think of this set as the set of ordered I-tuples of elements of X.
Example 1.27. A sequence of real numbers (x_1, x_2, x_3, ..., x_n, ...) ∈ R^N is a function f : N → R. We study sequences and their convergence properties in Chapter 3.
Example 1.28. Let 2 = {0, 1} be a set with two elements. Then a subset A ⊂ I can be identified with its characteristic function χ_A : I → 2 by: i ∈ A if and only if χ_A(i) = 1. Thus, A ↦ χ_A is a one-to-one map from P(I) onto 2^I.
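For a finite index set, the identification in Example 1.28 can be checked exhaustively. In the sketch below the index set `I` is a hypothetical three-element list, a function I → 2 is stored as a tuple of 0s and 1s in the order of `I`, and the helper names are assumptions of this illustration.

```python
from itertools import product

# Hypothetical finite index set, listed in a fixed order.
I = ["a", "b", "c"]


def char_function(A):
    """Characteristic function of A ⊂ I, stored as a tuple of 0/1 values."""
    return tuple(1 if i in A else 0 for i in I)


def subset_of(chi):
    """The subset of I on which the 0/1-valued function chi equals 1."""
    return frozenset(i for i, bit in zip(I, chi) if bit == 1)


# All functions I -> {0, 1}, i.e. the set 2^I.
all_functions = list(product((0, 1), repeat=len(I)))

# Each function gives a different subset, and every subset arises this way.
subsets = {subset_of(chi) for chi in all_functions}
round_trip_ok = all(char_function(subset_of(chi)) == chi
                    for chi in all_functions)

print(len(all_functions), len(subsets), round_trip_ok)
```

With three indices there are 2^3 = 8 functions and 8 subsets, which is the origin of the notation 2^I.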
Before giving another example, we introduce some convenient notation.
1.5. Relations 11
Definition 1.29. Let
Σ = {(s_1, s_2, s_3, ..., s_k, ...) : s_k = 0, 1}
denote the set of all binary sequences; that is, sequences whose terms are either 0 or 1.
Example 1.30. Let 2 = {0, 1}. Then Σ = 2^N, where we identify a sequence (s_1, s_2, ..., s_k, ...) with the function f : N → 2 such that s_k = f(k). We can also identify Σ and 2^N with P(N) as in Example 1.28. For example, the sequence (1, 0, 1, 0, 1, ...) of alternating ones and zeros corresponds to the function f : N → 2 defined by
f(k) = 1 if k is odd, and f(k) = 0 if k is even,
and to the set {1, 3, 5, 7, ...} ⊂ N of odd natural numbers.
A binary relation R on sets X and Y is a definite relation between elements of X and elements of Y. We write xRy if x ∈ X and y ∈ Y are related. One can also define relations on more than two sets, but we shall consider only binary relations and refer to them simply as relations. If X = Y , then we call R a relation on X.
Example 1.31. Suppose that S is a set of students enrolled in a university and B is a set of books in a library. We might define a relation R on S and B by:
s ∈ S has read b ∈ B.
In that case, sRb if and only if s has read b. Another, probably inequivalent, relation is: s ∈ S has checked b ∈ B out of the library.
When used informally, relations may be ambiguous (did s read b if she only read the first page?), but in mathematical usage we always require that relations are definite, meaning that one and only one of the statements “these elements are related” or “these elements are not related” is true.
The graph G_R of a relation R on X and Y is the subset of X × Y defined by G_R = {(x, y) ∈ X × Y : xRy}.
This graph contains all of the information about which elements are related. Conversely, any subset G ⊂ X × Y defines a relation R by: xRy if and only if (x, y) ∈ G. Thus, a relation on X and Y may be (and often is) defined as a subset of X × Y. As for sets, it doesn’t matter how a relation is defined, only what elements are related.
A function f : X → Y determines a relation F on X and Y by: xFy if and only if y = f(x). Thus, functions are a special case of relations. The graph G_R of a general relation differs from the graph G_F of a function in two ways: there may be elements x ∈ X such that (x, y) ∉ G_R for any y ∈ Y, and there may be x ∈ X such that (x, y) ∈ G_R for many y ∈ Y.
For example, in the case of the relation R in Example 1.31, there may be some students who haven’t read any books, and there may be other students who have
The definition of an equivalence relation differs from the definition of an order only by changing antisymmetry to symmetry, but order relations and equivalence relations have completely different properties.
Definition 1.36. An equivalence relation ∼ on a set X is a binary relation on X such that for every x, y, z ∈ X:
(a) x ∼ x (reflexivity);
(b) if x ∼ y then y ∼ x (symmetry);
(c) if x ∼ y and y ∼ z then x ∼ z (transitivity).
For each x ∈ X, the set of elements equivalent to x, [x/ ∼] = {y ∈ X : x ∼ y} ,
is called the equivalence class of x with respect to ∼. When the equivalence relation is understood, we write the equivalence class [x/ ∼] simply as [x]. The set of equivalence classes of an equivalence relation ∼ on a set X is denoted by X/ ∼. Note that each element of X/ ∼ is a subset of X, so X/ ∼ is a subset of the power set P(X) of X.
The following theorem is the basic result about equivalence relations. It says that an equivalence relation on a set partitions the set into disjoint equivalence classes.
Theorem 1.37. Let ∼ be an equivalence relation on a set X. Every equivalence class is non-empty, and X is the disjoint union of the equivalence classes of ∼.
Proof. If x ∈ X, then the reflexivity of ∼ implies that x ∈ [x]. Therefore every equivalence class is non-empty and the union of the equivalence classes is X.
To prove that the union is disjoint, we show that for every x, y ∈ X either [x] ∩ [y] = ∅ (if x ≁ y) or [x] = [y] (if x ∼ y).
Suppose that [x] ∩ [y] ≠ ∅. Let z ∈ [x] ∩ [y] be an element in both equivalence classes. If x_1 ∈ [x], then x_1 ∼ z and z ∼ y, so x_1 ∼ y by the transitivity of ∼, and therefore x_1 ∈ [y]. It follows that [x] ⊂ [y]. A similar argument applied to y_1 ∈ [y] implies that [y] ⊂ [x], and therefore [x] = [y]. In particular, y ∈ [x], so x ∼ y. On the other hand, if [x] ∩ [y] = ∅, then y ∉ [x] since y ∈ [y], so x ≁ y.
There is a natural projection π : X → X/ ∼, given by π(x) = [x], that maps each element of X to the equivalence class that contains it. Conversely, we can index the collection of equivalence classes
X/ ∼ = {[a] : a ∈ A}
by a subset A of X which contains exactly one element from each equivalence class. It is important to recognize, however, that such an indexing involves an arbitrary choice of a representative element from each equivalence class, and it is better to think in terms of the collection of equivalence classes, rather than a subset of elements.
Example 1.38. The equivalence classes of N relative to the equivalence relation m ∼ n if m ≡ n (mod 3) are given by
I_0 = {3, 6, 9, ...}, I_1 = {1, 4, 7, ...}, I_2 = {2, 5, 8, ...}.
The projection π : N → {I_0, I_1, I_2} maps a number to its equivalence class, e.g., π(101) = I_2. We can choose {1, 2, 3} as a set of representative elements, in which case
I_0 = [3], I_1 = [1], I_2 = [2],
but any other set A ⊂ N of three numbers with remainders 0, 1, 2 (mod 3) will do. For example, if we choose A = {7, 15, 101}, then
I_0 = [15], I_1 = [7], I_2 = [101].
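On a finite set, the partition into equivalence classes can be computed directly. The sketch below applies the relation of Example 1.38 to the finite set {1, ..., 12}, an assumed truncation of N; the function `equivalence_classes` is an illustrative helper that relies on the supplied relation really being an equivalence relation.

```python
def equivalence_classes(X, related):
    """Partition the finite collection X into classes of `related`.

    Assumes `related` is an equivalence relation, so it suffices to
    compare each element with one representative of each class.
    """
    classes = []
    for x in X:
        for cls in classes:
            representative = next(iter(cls))
            if related(x, representative):
                cls.add(x)
                break
        else:
            # x is not related to any existing class: start a new one.
            classes.append({x})
    return classes


def congruent_mod_3(m, n):
    return (m - n) % 3 == 0


X = range(1, 13)  # finite truncation of N, assumed for illustration
classes = equivalence_classes(X, congruent_mod_3)

print([sorted(c) for c in classes])

# The classes are non-empty, pairwise disjoint, and their union is X.
union_is_X = set().union(*classes) == set(X)
sizes_add_up = sum(len(c) for c in classes) == len(X)
print(union_is_X, sizes_add_up)
```

The representative picked from each class is arbitrary, which reflects the remark above that indexing the classes by representatives involves a choice.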
One way to show that two sets have the same “size” is to pair off their elements. For example, if we can match up every left shoe in a closet with a right shoe, with no right shoes left over, then we know that we have the same number of left and right shoes. That is, we have the same number of left and right shoes if there is a one-to-one, onto map f : L → R, or one-to-one correspondence, from the set L of left shoes to the set R of right shoes.
We refer to the “size” of a set as measured by one-to-one correspondences as its cardinality. This notion enables us to compare the cardinality of both finite and infinite sets. In particular, we can use it to distinguish between “smaller” countably infinite sets, such as the integers or rational numbers, and “larger” uncountably infinite sets, such as the real numbers.
Definition 1.39. Two sets X, Y have equal cardinality, written X ≈ Y, if there is a one-to-one, onto map f : X → Y. The cardinality of X is less than or equal to the cardinality of Y, written X ≲ Y, if there is a one-to-one (but not necessarily onto) map g : X → Y.
If X ≈ Y , then we also say that X, Y have the same cardinality. We don’t define the notion of a “cardinal number” here, only the relation between sets of “equal cardinality.”
Note that ≈ is an equivalence relation on any collection of sets. In particular, it is transitive because if X ≈ Y and Y ≈ Z, then there are one-to-one and onto maps f : X → Y and g : Y → Z, so g ◦ f : X → Z is one-to-one and onto, and X ≈ Z. We may therefore divide any collection of sets into equivalence classes of sets with equal cardinality.
It follows immediately from the definition that ≲ is reflexive and transitive. Furthermore, as stated in the following Schröder-Bernstein theorem, if X ≲ Y and Y ≲ X, then X ≈ Y. This result allows us to prove that two sets have equal cardinality by constructing one-to-one maps that need not be onto. The statement of the theorem is intuitively obvious but the proof, while elementary, is surprisingly involved and can be omitted without loss of continuity. (We will only use the theorem once, in the proof of Theorem 5.67.)
Theorem 1.40 (* Schröder-Bernstein). If X, Y are sets such that there are one-to-one maps f : X → Y and g : Y → X, then there is a one-to-one, onto map h : X → Y.
We can use the cardinality relation to describe the “size” of a set by comparing it with standard sets.
Definition 1.41. A set X is:
(1) Finite if it is the empty set or X ≈ {1, 2, ..., n} for some n ∈ N;
(2) Countably infinite (or denumerable) if X ≈ N;
(3) Infinite if it is not finite;
(4) Countable if it is finite or countably infinite;
(5) Uncountable if it is not countable.
We’ll take for granted some intuitively obvious facts which follow from the definitions. For example, a finite, non-empty set is in one-to-one correspondence with { 1 , 2 ,... , n} for a unique natural number n ∈ N (the number of elements in the set), a countably infinite set is not finite, and a subset of a countable set is countable.
According to Definition 1.41, we may divide sets into disjoint classes of finite, countably infinite, and uncountable sets. We also distinguish between finite and infinite sets, and countable and uncountable sets. We will show below, in Theorem 2.19, that the set of real numbers is uncountable, and we refer to its cardinality as the cardinality of the continuum.
Definition 1.42. A set X has the cardinality of the continuum if X ≈ R.
One has to be careful in extrapolating properties of finite sets to infinite sets.
Example 1.43. The set of squares
S = {1, 4, 9, 16, ..., n^2, ...}
is countably infinite since f : N → S defined by f(n) = n^2 is one-to-one and onto. It may appear surprising at first that the set N can be in one-to-one correspondence with an apparently “smaller” proper subset S, since this doesn’t happen for finite sets. In fact, assuming the axiom of choice, one can show that a set is infinite if and only if it has the same cardinality as a proper subset. Dedekind (1888) used this property to give a definition of infinite sets that did not depend on the natural numbers N.
Next, we prove some results about countable sets. The following proposition states a useful necessary and sufficient condition for a set to be countable.
Proposition 1.44. A non-empty set X is countable if and only if there is an onto map f : N → X.
Proof. If X is countably infinite, then there is a one-to-one, onto map f : N → X. If X is finite and non-empty, then for some n ∈ N there is a one-to-one, onto map g : {1, 2, ..., n} → X. Choose any x ∈ X and define the onto map f : N → X by
f(k) = g(k) if k = 1, 2, ..., n, and f(k) = x if k = n + 1, n + 2, ....
1.6. Countable and uncountable sets 17
Conversely, suppose that such an onto map exists. We define a one-to-one, onto map g recursively by omitting repeated values of f. Explicitly, let g(1) = f(1). Suppose that n ≥ 1 and we have chosen n distinct g-values g(1), g(2), ..., g(n). Let
A_n = {k ∈ N : f(k) ≠ g(j) for every j = 1, 2, ..., n}
denote the set of natural numbers whose f-values are not already included among the g-values. If A_n = ∅, then g : {1, 2, ..., n} → X is one-to-one and onto, and X is finite. Otherwise, let k_n = min A_n, and define g(n + 1) = f(k_n), which is distinct from all of the previous g-values. Either this process terminates, and X is finite, or we go through all the f-values and obtain a one-to-one, onto map g : N → X, and X is countably infinite.
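The recursive construction of g amounts to running through f(1), f(2), ... and skipping values that have already appeared. A minimal Python sketch of this idea follows; the generator name and the two sample maps are assumptions made for illustration, and the values of f are assumed to be hashable so that they can be stored in a set.

```python
from itertools import islice


def enumerate_without_repeats(f):
    """Yield g(1), g(2), ... by omitting repeated values of f(1), f(2), ...."""
    seen = set()
    k = 1
    while True:
        value = f(k)
        if value not in seen:
            # f(k) is new, so k plays the role of min A_n in the proof.
            seen.add(value)
            yield value
        k += 1


# Sample onto map N -> N with repeats: 1, 1, 2, 2, 3, 3, ...
def doubled(k):
    return (k + 1) // 2


# Sample onto map N -> {0, 1, 2}: 1, 2, 0, 1, 2, 0, ...
def mod_three(k):
    return k % 3


first_five = list(islice(enumerate_without_repeats(doubled), 5))
finite_case = list(islice(enumerate_without_repeats(mod_three), 3))

print(first_five)
print(finite_case)
```

When the range of f is finite, the generator never yields more values than the range has elements, so asking for more would loop forever; this corresponds to the case A_n = ∅ in the proof, which a program cannot detect by inspecting finitely many values of f.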
If X is a countable set, then we refer to an onto function f : N → X as an enumeration of X, and write X = {x_n : n ∈ N}, where x_n = f(n).
Proposition 1.45. The Cartesian product N × N is countably infinite.
Proof. Define a linear order ≺ on ordered pairs of natural numbers as follows:
(m, n) ≺ (m′, n′) if either m + n < m′ + n′ or m + n = m′ + n′ and n < n′.
That is, we arrange N × N in a table
(1, 1) (1, 2) (1, 3) (1, 4) ...
(2, 1) (2, 2) (2, 3) (2, 4) ...
(3, 1) (3, 2) (3, 3) (3, 4) ...
(4, 1) (4, 2) (4, 3) (4, 4) ...
...
and list it along successive diagonals from bottom-left to top-right as
(1, 1), (2, 1), (1, 2), (3, 1), (2, 2), (1, 3), (4, 1), (3, 2), (2, 3), (1, 4),....
We define f : N → N × N by setting f (n) equal to the nth pair in this order; for example, f (7) = (4, 1). Then f is one-to-one and onto, so N × N is countably infinite.
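The diagonal listing can be generated explicitly. The following sketch enumerates N × N in the order ≺, first by increasing m + n and then by increasing n; the generator name is chosen here for illustration.

```python
from itertools import islice


def diagonal_pairs():
    """Yield the pairs (m, n) of N x N along successive diagonals.

    Pairs are ordered by increasing m + n, and by increasing n within
    each diagonal, matching the order defined in the proof.
    """
    total = 2  # smallest possible value of m + n
    while True:
        for n in range(1, total):
            yield (total - n, n)
        total += 1


first_ten = list(islice(diagonal_pairs(), 10))
seventh = first_ten[6]  # f(7), since Python lists are indexed from 0

print(first_ten)
print(seventh)
```

Every pair (m, n) lies on the diagonal m + n = s, which contains only s − 1 pairs, so each pair is reached after finitely many steps and no pair is listed twice.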
Theorem 1.46. A countable union of countable sets is countable.
Proof. Let {X_n : n ∈ N} be a countable collection of countable sets. From Proposition 1.44, there is an onto map f_n : N → X_n. We define
g : N × N → ⋃_{n∈N} X_n
by g(n, k) = f_n(k). Then g is also onto. From Proposition 1.45, there is a one-to-one, onto map h : N → N × N, and it follows that
g ◦ h : N → ⋃_{n∈N} X_n
is onto, so Proposition 1.44 implies that the union of the X_n is countable.
the cardinality of P(P(N)), and so on. Thus, there are many other uncountable cardinalities apart from the cardinality of the continuum.
Cantor (1878) raised the question of whether or not there are any sets whose cardinality lies strictly between that of N and P(N). The statement that there are no such sets is called the continuum hypothesis, which may be formulated as follows.
Hypothesis 1.49 (Continuum). If C ⊂ P(N) is infinite, then either C ≈ N or C ≈ P(N).
The work of Gödel (1940) and Cohen (1963) established the remarkable result that the continuum hypothesis cannot be proved or disproved from the standard axioms of set theory (assuming, as we believe to be the case, that these axioms are consistent). This result illustrates a fundamental and unavoidable incompleteness in the ability of any finite system of axioms to capture the properties of any mathematical structure that is rich enough to include the natural numbers.