Probability for Statisticians, Schemes and Mind Maps of Probability and Statistics


Typology: Schemes and Mind Maps

2022/2023

Uploaded on 05/11/2023

STAT/MATH 521, 522, 523
Advanced Probability
Instructor 2012-2013: Jon Wellner
Text
Probability
for Statisticians
by
Galen R. Shorack
Department of Statistics
University of Washington
Seattle, WA 98195

My grandparents' generation

  • Peter Shorack and Anna Miliči (immigrants from Miliči, near Karlovatz) (They married in 1890, and came to the US in 1898.)
      Nicholas died by age 10
      Annie to Ivan Harrington (Archie (wed Ellen; 5) + 9)
      William to Kate (William Jr. (wed Virginia; 2))
      Amelia to Charlie Lord
      Theodore James Shorack (my father)
      Jenny to Godfrey Knight (5)
  • Frank Blaha Jr. (Chicago 3/24/1893, cooper) and Marcella Nekola (immigrant from the Prague area as a child)
      Marcella Barbara Blaha (my mother)
      Marie (1904) to Carlos Halstead (Carlos, Gilbert, Christine)
      George (1906) (my father's dear friend) to Carmen Jirik
      Julia (1908) to Lyle Dinnell
      Nan (1911) (first child born on the Effie, Minnesota homestead)
      Rose (1913)
      Helen (1916)
      Carol (twin) (1918) to Ashley Morse (Leigh (wed Kent; 2), Laurel)
      Don (twin) (1918) to Jean Dora

Frank J. Blaha Sr. (1850, lumber mill and railroad) wed Rose Hřda (1852) 1/30/1872. They had immigrated separately to the US in their twenties; he from the Prague area. (Frank Jr. (my mother's father, homesteader), James (married Aunt Anna Nekola, printer), Joseph, Agnes.) Frank Jr. was an inept farmer, but he enjoyed his books and raised educated daughters.

Thomas Nekola (wagon maker in the Prague area) married Mary Tomǎsek. (Barbara, Anna (married Uncle James), Marcella (my mother's mother), Albert, Pete, Frances)

Peter Shorack (an only child) seems to have been "on the move" when he arrived in Miliči, but his origin is unclear. Was he fleeing a purge in the east (he said) or Austro–Hungarian conscription (his wife said)? His parentage is unknown. He died from alcohol when my father was nine.

Anna Miliči was the fourth of five children of Maximo and Martha Miliči. Maximo (appropriately 6'9") came to the US, but fled home when two attackers did not survive. Later, his neighbors there banded together and killed him with pitchforks. Anna visited Miliči with her children when my father was five, but had to leave her children there for one year. She had hidden enough to get herself home, thus thwarting Peter's efforts to strand his family. A hard woman in most ways, she used her gun to run off robbers (when linguistic Peter was running a railroad gang) and poachers (after she was alone on the homestead).

My father worked in logging camps as a young kid; his mother and older brother would not allow him to go to school. He hopped a freight when he was bigger, but "hammer toes" allowed him to negate his mistake of an army enlistment. Back home, he trained religiously as a boxer and a fighter. He thus "survived" his older brother, boxed the county fairs with George, defended his interests in my mother, won two professional fights (but lost two teeth), and fought (with some success) for his full winter logging earnings (each spring the same companies would go bankrupt, leaving their debts unpaid). WW II construction work on the Al-Can Highway and in the Aleutian Islands gave him the nest egg to get us out of that country. With boxing and the gym for entertainment, he lived entirely off a whiskey allotment sold late in each month; every full paycheck came home, and he learned carpentry. On to Oregon! After ten years he was building his own houses on speculation, in spite of the raw financial fear common to so many of that depression generation. But that gave his sons jobs to go to college, and he sent his daughter. He took incredible pride in even the smallest of the accomplishments of any of us. Part of him died with my brother, flying cover on a pilot pickup.

My mother provided the stability in our family, not an easy task. She provided the planning, tried to challenge us, and watched for opportunities to expand our horizons. A shy woman, she defended the value of her son's life by following the anti-Vietnam circuit and challenging all speakers. Hers was the quiet consistency that I better appreciated after having a family of my own.

Preface

Chapters 1–5 and Appendix B provide the mathematical foundation for the rest of the text. Then Chapters 6–7 hone some tools geared to probability theory. Appendix A provides a brief introduction to elementary probability theory that could be useful for some mathematics students. (The appendices begin on page 425.)

The classical weak law of large numbers (WLLN) and strong law of large numbers (SLLN) as presented in Sections 8.2–8.4 are particularly complete, and they also emphasize the important role played by the behavior of the maximal summand. Presentation of good inequalities is emphasized in the entire text, and this chapter is a good example. Also, there is an (optional) extension of the WLLN in Appendix C that focuses on the behavior of the sample variance, even in very general situations. It will be appealed to in the optional Section 10.5 and Chapter 11.

The classical central limit theorem (CLT) and its Lindeberg and Liapunov and Berry–Esseen generalizations are presented in Chapter 10 using the characteristic function (chf) methods introduced in Chapter 9. Conditions for both the weak bootstrap and the strong bootstrap are also developed in Chapter 10, as is a universal bootstrap CLT based on light trimming of the sample. This approach emphasizes a statistical perspective. Gamma and Edgeworth approximations appear at the end of Chapter 11.

Both distribution functions (dfs $F(\cdot)$) and quantile functions (qfs $K(\cdot) \equiv F^{-1}(\cdot)$) are emphasized throughout (quantile functions are important to statisticians). In Chapter 6 much general information about both dfs and qfs and the Winsorized variance is developed. The text includes presentations showing how to exploit the inverse transformation $X \equiv K(\xi)$ with $\xi \cong \mathrm{Uniform}(0, 1)$.

In particular, Appendix C inequalities relating the qf and the Winsorized variance to some empirical process results of Chapter 12 were used in the first edition to treat trimmed means and L-statistics, rank and permutation tests, and sampling from finite populations.

I have learned much through my association with David Mason, and I would like to acknowledge that here. Especially (in the context of this text), Theorem 12.4. is a beautiful improvement on Theorem 12.10.3, in that it still has the potential for necessary and sufficient results. I really admire the work of Mason and his colleagues. It was while working with David that some of my present interests developed. In particular, a useful companion to Theorem 12.10.3 is knowledge of quantile functions. Section 7.6 and Sections C.2–C.X present what I have compiled and produced on that topic while working on various applications, partially with David.

Jon Wellner has taught from several versions of this text. In particular, he typed an earlier version and thus gave me a major critical boost. That head start is what turned my thoughts to writing a text for publication. Section 14.2 and the Hoffmann–Jørgensen inequalities came from him. He has also formulated a number of exercises, suggested various improvements, offered good suggestions and references regarding predictable processes, and pointed out some difficulties. My thanks to Jon for all of these contributions. (Obviously, whatever problems may remain lie with me.)

Contents

Preface
Use of This Text
Definition of Symbols

Chapter 1. Measures

  1. Basic Properties of Measures
  2. Construction and Extension of Measures
  3. Lebesgue–Stieltjes Measures

Chapter 2. Measurable Functions and Convergence

  1. Mappings and σ-Fields
  2. Measurable Functions
  3. Convergence
  4. Probability, RVs, and Convergence in Law
  5. Discussion of Sub σ-Fields

Chapter 3. Integration

  1. The Lebesgue Integral
  2. Fundamental Properties of Integrals
  3. Evaluating and Differentiating Integrals
  4. Inequalities
  5. Modes of Convergence

Chapter 4. Derivatives via Signed Measures

  1. Decomposition of Signed Measures
  2. The Radon–Nikodym Theorem
  3. Lebesgue's Theorem
  4. The Fundamental Theorem of Calculus

Chapter 5. Measures and Processes on Products

  1. Finite-Dimensional Product Spaces
  2. Random Vectors on $(\Omega, \mathcal{A}, P)$
  3. Countably Infinite Product Probability Spaces
  4. Random Elements and Processes on $(\Omega, \mathcal{A}, P)$

Chapter 14. Convergence in Law on Metric Spaces

  1. Convergence in Distribution on Metric Spaces
  2. Metrics for Convergence in Distribution

Chapter 15. Asymptotics Via Empirical Processes

  1. Introduction
  2. Trimmed and Winsorized Means
  3. Linear Rank Statistics and Finite Sampling
  4. L-Statistics

Appendix A. Special Distributions

  1. Elementary Probability
  2. Distribution Theory for Statistics
  3. Linear Algebra Applications
  4. The Multivariate Normal Distribution

Appendix B. General Topology and Hilbert Space

  1. General Topology
  2. Metric Spaces
  3. Hilbert Space

Appendix C. More on the WLLN and CLT

  1. General Moment Estimation
  2. Slowly Varying Partial Variance when $\sigma^2 = \infty$
  3. Specific Tail Relationships
  4. Regularly Varying Functions
  5. Some Winsorized Variance Comparisons
  6. Inequalities for Winsorized Quantile Functions

Appendix D. LLN and CLT Summaries

  1. Introduction
  2. LLN's for $\bar{X}_n$
  3. CLT's for $\bar{X}_n$

References
Index

Use of This Text

The University of Washington is on the quarter system, so my description will reflect this fact. My thoughts are offered as a potential guide to an instructor. They certainly do not comprise an essential recipe.

The reader will note that the exercises are interspersed with the text. It is important to read all of the exercises as they are encountered, as most of them contain some worthwhile contribution to the story.

Chapters 1–5 provide the measure-theoretic background that is necessary for the rest of the text. Many of our students have had at least some kind of an undergraduate exposure to part of this subject. Still, it is important that I present the key parts of this material rather carefully. I feel it is useful for all of them.

Chapter 1 (measures; 5 lectures) Emphasized in my presentation are generators, the monotone property of measures, the Carathéodory extension theorem, completions, the approximation lemma, and the correspondence theorem. Presenting the correspondence theorem carefully is important, as this allows one the luxury of merely highlighting some proofs in Chapter 5. [The minimal monotone class theorem of Section 1.1, claim 8 of the Carathéodory extension theorem proof, and most of what follows the approximation lemma in Section 1.2 would never be presented in my lectures.] {I always assign Exercises 1.1.1 (generators), 1.2.1 (completions), and 1.2.3 (the approximation lemma). Other exercises are assigned, but they vary each time.}

Chapter 2 (measurable functions and convergence; 4 lectures) I present most of Sections 2.1, 2.2, and 2.3. Highlights are preservation of σ-fields, measurability of both common functions and limits of simple functions, induced measures, convergence and divergence sets (especially), and relating $\to_\mu$ to $\to_{a.s.}$ (especially, reducing the first to the second by going to subsequences). I then assign Section 2.4 as outside reading and Section 2.5 for exploring. [I never lecture on either Section 2.4 or 2.5.] {I always assign Exercises 2.2.1 (specific σ-fields), 2.3. (concerning $\to_{a.e.}$), 2.3.3 (a substantial proof), and 2.4.1 (Slutsky's theorem).}

Chapter 3 (integration; 7 lectures) This is an important chapter. I present all of Sections 3.1 and 3.2 carefully, but Section 3.3 is left as reading, and some of the Section 3.4 inequalities ($C_r$, Hölder, Liapunov, Markov, and Jensen) are done carefully. I do Section 3.5 carefully as far as Vitali's theorem, and then assign the rest as outside reading. {I always assign Exercises 3.2.1–3.2.2 (only the zero function), 3.3.3 (differentiating under the integral sign), 3.5.1 (substantial theory), and 3.5.7 (the Scheffé theorem).}

Chapter 4 (Radon–Nikodym; 2 lectures) I present ideas from Section 4.1, sketch the Jordan–Hahn decomposition proof, and then give the proofs of the Lebesgue decomposition, the Radon–Nikodym theorem, and the change of variable theorem. These final two topics are highlighted. The fundamental theorem of calculus of Section 4.4 is briefly discussed. [I would never present any of Section 4.3.] {I always assign Exercises 4.2.1 (manipulating Radon–Nikodym derivatives), 4.2.7 (mathematically substantial), and 4.4.1, 4.4.2, and 4.4. (so that the students must do some outside reading in Section 4.4 on their own).}

practices the important $O_p(\cdot)$ and $o_p(\cdot)$ notation), 8.4.4 (the substantial result of Marcinkiewicz and Zygmund), 8.4.7 (random sample size), and at least one of the alternative SLLN proofs contained in 8.4.8, 8.4.9, and 8.4.10.}

At this point at the beginning of the winter quarter the instructor will have his/her own opinions about what to cover. I devote the winter quarter to the weak law of large numbers (WLLN), an introduction to the law of the iterated logarithm (LIL), and various central limit theorems (CLTs). That is, the second term treats material from Chapters 8–10, with others optional. I will now outline my choices.

Chapter 8 (LLNs, inequalities, LIL, and series; 6 lectures) My lectures cover Section 8.3 (symmetrization inequalities and Lévy's inequality for the WLLN, and the Ottaviani–Skorokhod inequality for series), Feller's WLLN from Section 8.4, the Glivenko–Cantelli theorem from Section 8.5, the LIL for normal rvs in Proposition 8.6.1, the strong Markov property of Theorem 8.7.1, and the two series Theorem 8.8.2. [I do not lecture from any of Sections 8.9, 9.10, or 8.11 at this time.] {I always assign Exercise 8.6.1 (Mills' ratio).}

Chapter 9 (characteristic functions (chfs); 8 lectures) Sections 9.1 and 9.2 contain classic results that relate to deriving convergence in distribution from convergence of various classes of integrals. I also cover Sections 9.3–9.8. {I always assign Exercises 9.3.1 and 9.3.3(a) (deriving specific chfs) and 9.6.1 (Taylor series expansions of the chf).}

Chapter 10 (CLTs via chfs; 6 lectures) The classical CLT, the Poisson limit theorem, and the multivariate CLT make a nice lecture. The chisquare goodness of fit example and/or the median example (of Section 10.3) make a lecture of illustrations. Chf proofs of the usual CLTs are given in Section 10.2 (Section 9.5 on Esseen's lemma could have been left until now). Other examples from Section 10.2 or 10.3 could now be chosen, and Example 10.3. (weighted sums of iid rvs) is my first choice. [The chisquare goodness of fit example could motivate a student to read from Sections A.3 and A.4.]

At this stage I still have at least 7 optional lectures at the end of the winter quarter and about 12 more at the start of the spring quarter. In my final 16 lectures of the spring quarter I feel it appropriate to consider Brownian motion in Chapter 12 and then martingales in Chapter 13 (in a fashion to be described below). Let me first describe some possibilities for the optional lectures, assuming that the above core was covered.

Chapter 10 (bootstrap) Both Sections 10.8 and 10.9 on the bootstrap require only a discussion of Section 10.??.

Chapter 19 (convergence in distribution) Convergence in distribution on the line is presented in Chapter 10. [This is extended to metric spaces in Chapter 14, but I do not lecture from it.]

Chapter 10 (domain of normal attraction of the normal df) The converse of the CLT in Theorem 10.6.1 requires the Giné–Zinn symmetrization inequality and the Khinchine inequality of Section 8.3 and the Paley–Zygmund inequality of Section 3.4.

Chapters 7, 10 and 11 (domain of attraction of the Normal df) Combining Sections 6.6, C.1–C.4, the Section 8.3 subsection on maximal inequalities of another ilk, and Sections 10.5–10.6 makes a nice unit. Lévy's asymptotic normality condition (ANC) of (10.6.3) for a rv $X$ has some prominence. In Section B. purely geometric methods plus Cauchy–Schwarz are used to derive a multitude of equivalent conditions. In the process, quantile functions are carefully studied. In Section 10.1 the ANC is seen to be equivalent to a result akin to a WLLN for the rv $X^2$, and so in this context many additional equivalent conditions are again derived. Thus when one comes to the general CLT in Sections 10.5 and 10.6, one already knows a great deal about the ANC.

Chapter 11 (infinitely divisible and stable laws) First, Section 11.1 (infinitely divisible laws) is independent of the rest, including Section 11.2 (stable laws). The theorem stated in Section 11.4 (domain of attraction of stable laws) would require methods of Section B.4 to prove, but the interesting exercises are accessible without this.

Chapter 11 (higher-order approximations) The local limit theorem in Section 10.4 can be done immediately for continuous dfs, but it also requires Section 9.8 for discrete dfs. The expansions given in Sections 11.5 (Gamma approximation) and 11.6 (Edgeworth approximation) also require Exercise 9.6.7.

Assorted topics suitable for individual reading Possibilities include Section 13.8 (counting process martingales), and Section 13. (martingale CLTs). Section 15.1 on trimmed means and Section 15.2 on R-statistics (including a finite sampling CLT) are both fun; both require some discussion of Section C.6.

The primary topics for the spring quarter are Chapter 12 (Brownian motion and elementary empirical processes) and Chapter 13 (martingales).

Chapter 12 (Brownian motion; 6 lectures) I discuss Section 12.1, sketch the proof of Section 12.2 and carefully apply that result in Section 12.3, and treat Section 12.4 carefully (as I believe that at some point a lecture should be devoted to a few of the more subtle difficulties regarding measurability). I am a bit cavalier regarding Section 12.5 (strong Markov property), but I apply it carefully in Sections 12.6, 12.7, and 12.8. I assign Section 12.9 as outside reading. [I do not lecture on Theorem 12.8.2.] {I always assign Exercises 12.1.2 (on $(C, \mathcal{C})$), 12.3.1 (various transforms of Brownian motion), 12.3.3 (integrals of normal processes), 12.4.1 (properties of stopping times), 12.7.3(a) (related to embedding a rv in Brownian motion), and 12.8.2 (the LIL via embedding).}

At this point let me describe three additional optional topics that could now be pursued, based on the previous lectures from Chapter 12.

Chapter 12 (elementary empirical processes) Uniform empirical and quantile processes are considered in Section 12.10. Straightforward applications to linear rank statistics and two-sample test of fit are included. One could either lecture from Section 12.12 (directly) or 12.11 (with a preliminary lecture from Sections 10.10–10.11), or leave these for assigned reading.

Chapter 11 (martingales; 10 lectures) I cover most of the first seven sections. {I always assign Exercises 11.1.4 (a counting process martingale), 11.3.2 (a proof for continuous time mgs), 11.3.7, and 11.3.9 (on $L_r$-convergence).}

Chapter 1

Measures

1 Basic Properties of Measures

Motivation 1.1 (The Lebesgue integral) The Riemann integral of a continuous function $f$ (we will restrict attention to $f(x) \ge 0$ on $a \le x \le b$ for convenience) is formed by subdividing the domain of $f$, forming approximating sums, and passing to the limit. Thus the $m$th Riemann sum for $\int_a^b f(x)\,dx$ is defined as

$$RS_m \equiv \sum_{i=1}^{m} f(x^*_{mi})\,[x_{mi} - x_{m,i-1}], \tag{1}$$

where $a \equiv x_{m0} < x_{m1} < \cdots < x_{mm} \equiv b$ (with $x_{m,i-1} \le x^*_{mi} \le x_{mi}$ for all $i$) satisfy $\mathrm{mesh}_m \equiv \max[x_{mi} - x_{m,i-1}] \to 0$. Note that $x_{mi} - x_{m,i-1}$ is the measure (or length) of the interval $[x_{m,i-1}, x_{mi}]$, while $f(x^*_{mi})$ approximates the values of $f(x)$ for all $x_{m,i-1} \le x \le x_{mi}$ (at least it does if $f$ is continuous on $[a, b]$). Within the class $C^+$ of all nonnegative continuous functions, this definition works reasonably well. But it has one major shortcoming. The conclusion $\int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx$ is one we often wish to make if $f_n$ "converges" to $f$. However, even when all $f_n$ are in $C^+$ and $f(x) \equiv \lim f_n(x)$ actually exists, it need not be that $f$ is in $C^+$ (and thus $\int_a^b f(x)\,dx$ may not even be well defined) or that $\int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx$ (even when it is well defined).

A different approach is needed. (Note Figure 1.1.) The Lebesgue integral of a nonnegative function is formed by subdividing the range. Thus the $m$th Lebesgue sum for $\int_a^b f(x)\,dx$ is defined as

$$LS_m \equiv \sum_{k=1}^{m 2^m} \frac{k-1}{2^m} \times \mathrm{measure}\Big(\Big\{x : \frac{k-1}{2^m} \le f(x) < \frac{k}{2^m}\Big\}\Big),$$

and $\int_a^b f(x)\,dx$ is defined to be the limit of the $LS_m$ sums as $m \to \infty$. For what class $\mathcal{M}$ of functions $f$ can this approach succeed? The members $f$ of the class $\mathcal{M}$ will need to be such that the measure (or length) of all sets of the form $\{x : \frac{k-1}{2^m} \le f(x) < \frac{k}{2^m}$
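The contrast between subdividing the domain and subdividing the range can be tried numerically. The following is only a rough sketch, not part of the text: the function names `riemann_sum` and `lebesgue_sum`, the test function $f(x) = x^2$ on $[0, 1]$, the left-endpoint choice of $x^*_{mi}$, and the `grid` parameter are all assumptions made for illustration. In particular, the "measure" of each level set is merely approximated by counting midpoints of a fine grid that land in it.

```python
import math

def riemann_sum(f, a, b, m):
    """m-th Riemann sum on an equally spaced partition (left endpoints as x*_mi)."""
    h = (b - a) / m
    return sum(f(a + i * h) * h for i in range(m))

def lebesgue_sum(f, a, b, m, grid=20_000):
    """m-th Lebesgue sum for a nonnegative f: subdivide the range at levels k / 2^m.

    The length of {x : (k-1)/2^m <= f(x) < k/2^m} is approximated by the
    number of grid midpoints falling in that set times the grid spacing.
    """
    h = (b - a) / grid
    total = 0.0
    for i in range(grid):
        level = math.floor(f(a + (i + 0.5) * h) * 2**m)  # this is k - 1
        if level < m * 2**m:  # LS_m ignores the part of the range at or above m
            total += (level / 2**m) * h
    return total

def f(x):
    return x * x

rs = riemann_sum(f, 0.0, 1.0, 1000)
ls = lebesgue_sum(f, 0.0, 1.0, 10)
print(rs, ls)  # both should be close to 1/3
```

Because each level $(k-1)/2^m$ sits below $f$ on its level set, the Lebesgue sums approach the integral from below as $m$ grows.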

Definition 1.1 (Set theory) Consider a nonvoid class $\mathcal{A}$ of subsets $A$ of a nonvoid set $\Omega$. (For us, $\Omega$ will be the sample space of an experiment.)

(a) Let $A^c$ denote the complement of $A$, let $A \cup B$ denote the union of $A$ and $B$, let $A \cap B$ and $AB$ both denote the intersection, let $A \setminus B \equiv AB^c$ denote the set difference, let $A \triangle B \equiv (A^cB \cup AB^c)$ denote the symmetric difference, and let $\emptyset$ denote the empty set. The class of all subsets of $\Omega$ will be denoted by $2^\Omega$. Sets $A$ and $B$ are called disjoint if $AB = \emptyset$, and sequences of sets $A_n$ or classes of sets $A_t$ are called disjoint if all pairs are disjoint. Writing $A + B$ or $\sum_1^\infty A_n$ will also denote a union, but will imply the disjointness of the sets in the union. As usual, $A \subset B$ denotes that $A$ is a subset of $B$. We call a sequence $A_n$ increasing (and we will nearly always denote this fact by writing $A_n \nearrow$) when $A_n \subset A_{n+1}$ for all $n \ge 1$. We call the sequence decreasing (denoted by $A_n \searrow$) when $A_n \supset A_{n+1}$ for all $n \ge 1$. We call the sequence monotone if it is either increasing or decreasing. Let $\omega$ denote a generic element of $\Omega$. We will use $1_A(\cdot)$ to denote the indicator function of $A$, which equals 1 or 0 at $\omega$ according as $\omega \in A$ or $\omega \notin A$.

(b) $\mathcal{A}$ will be called a field if it is closed under complements and unions. (That is, $A$ and $B$ in $\mathcal{A}$ requires that $A^c$ and $A \cup B$ be in $\mathcal{A}$.) [Note that both $\Omega$ and $\emptyset$ are necessarily in $\mathcal{A}$, as $\mathcal{A}$ was assumed to be nonvoid, with $\Omega = A \cup A^c$ and $\emptyset = \Omega^c$.]

(c) $\mathcal{A}$ will be called a σ-field if it is closed under complements and countable unions. (That is, $A, A_1, A_2, \ldots$ in $\mathcal{A}$ requires that $A^c$ and $\cup_1^\infty A_n$ be in $\mathcal{A}$.)

(d) $\mathcal{A}$ will be called a monotone class provided it contains $\cup_1^\infty A_n$ for all increasing sequences $A_n$ in $\mathcal{A}$ and contains $\cap_1^\infty A_n$ for all decreasing sequences $A_n$ in $\mathcal{A}$.

(e) $(\Omega, \mathcal{A})$ will be called a measurable space provided $\mathcal{A}$ is a σ-field of subsets of $\Omega$.

(f) $\mathcal{A}$ will be called a π-system provided $AB$ is in $\mathcal{A}$ for all $A$ and $B$ in $\mathcal{A}$; and $\mathcal{A}$ will be called a $\bar{\pi}$-system when $\Omega$ in $\mathcal{A}$ is also guaranteed.

If $\mathcal{A}$ is a field (or a σ-field), then it is closed under intersections (under countable intersections), since $AB = (A^c \cup B^c)^c$ (since $\cap_1^\infty A_n = (\cup_1^\infty A_n^c)^c$). Likewise, we could have used "intersection" instead of "union" in our definitions by making use of $A \cup B = (A^c \cap B^c)^c$ and $\cup_1^\infty A_n = (\cap_1^\infty A_n^c)^c$. (This used De Morgan's laws.)
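On a finite $\Omega$ these closure properties can be checked by brute force (and there a field is automatically a σ-field). The sketch below is illustrative only; the helper names `is_field` and `is_pi_system` and the two example classes are assumptions, not notation from the text.

```python
def is_field(omega, cls):
    """True if the nonvoid class cls is closed under complements and unions."""
    cls = set(cls)
    if not cls:
        return False
    closed_under_complements = all(omega - a in cls for a in cls)
    closed_under_unions = all(a | b in cls for a in cls for b in cls)
    return closed_under_complements and closed_under_unions

def is_pi_system(cls):
    """True if AB is in cls for all A and B in cls."""
    cls = set(cls)
    return all(a & b in cls for a in cls for b in cls)

omega = frozenset({1, 2, 3, 4})
field = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), omega}
not_field = {frozenset(), frozenset({1}), omega}  # lacks the complement {2, 3, 4}

# De Morgan: AB = (A^c ∪ B^c)^c, so a field is closed under intersections.
de_morgan_ok = all(
    a & b == omega - ((omega - a) | (omega - b)) for a in field for b in field
)

print(is_field(omega, field), is_field(omega, not_field))  # True False
print(is_pi_system(field), is_pi_system(not_field))        # True True
```

The second class shows that a π-system need not be a field: it is closed under intersections but not under complements.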

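Proposition 1.1 (Closure under intersections)

(a) Arbitrary intersections of fields, σ-fields, or monotone classes are fields, σ-fields, or monotone classes, respectively. [For example, $\mathcal{F} \equiv \cap\{\mathcal{F}_\alpha : \mathcal{F}_\alpha \text{ is a field under consideration}\}$ is a field.]

(b) There is a minimal field, σ-field, or monotone class generated by (or, containing) any specified class $\mathcal{C}$ of subsets of $\Omega$. Call $\mathcal{C}$ the generators. For example,

$$\sigma[\mathcal{C}] \equiv \cap\{\mathcal{F}_\alpha : \mathcal{F}_\alpha \text{ is a σ-field of subsets of } \Omega \text{ for which } \mathcal{C} \subset \mathcal{F}_\alpha\} \tag{4}$$

is the minimal σ-field generated by $\mathcal{C}$ (that is, containing $\mathcal{C}$).

(c) A collection $\mathcal{A}$ of subsets of $\Omega$ is a σ-field if and only if it is both a field and a monotone class.

Proof. (c) (⇐) $\cup_1^\infty A_n = \cup_1^\infty (\cup_1^n A_k) \equiv \cup_1^\infty B_n \in \mathcal{A}$, since the $B_n \equiv \cup_1^n A_k$ are in $\mathcal{A}$ (as $\mathcal{A}$ is a field) and are $\nearrow$ (and $\mathcal{A}$ is a monotone class). Everything else is even more trivial. □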
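For a finite $\Omega$ the minimal σ-field $\sigma[\mathcal{C}]$ can also be built constructively, by repeatedly adding complements and unions until nothing new appears. This is a sketch under that finiteness assumption; it is not the intersection construction in (4), although on a finite $\Omega$ the two give the same class. The function name `generate_sigma_field` and the example generators are hypothetical.

```python
def generate_sigma_field(omega, generators):
    """Smallest class containing the generators that is closed under
    complements and unions; on a finite omega this equals sigma[C]."""
    omega = frozenset(omega)
    sets = {frozenset(), omega} | {frozenset(g) for g in generators}
    while True:
        new = {omega - a for a in sets} | {a | b for a in sets for b in sets}
        if new <= sets:  # nothing new: the class is closed
            return sets
        sets |= new

omega = {1, 2, 3, 4}
sigma1 = generate_sigma_field(omega, [{1}, {2}])
sigma2 = generate_sigma_field(omega, [{1}, {1, 2}])

print(len(sigma1))       # 8: all unions of the atoms {1}, {2}, {3, 4}
print(sigma1 == sigma2)  # True: each generating class lies in the other's sigma-field
```

The second comparison is one concrete instance where two different generating classes yield the same σ-field.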

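Exercise 1.1 (Generators) Let $\mathcal{C}_1$ and $\mathcal{C}_2$ denote two collections of subsets of the set $\Omega$. If $\mathcal{C}_2 \subset \sigma[\mathcal{C}_1]$ and $\mathcal{C}_1 \subset \sigma[\mathcal{C}_2]$, then $\sigma[\mathcal{C}_1] = \sigma[\mathcal{C}_2]$. Prove this fact.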

Definition 1.2 (Measures and events) Consider a measurable space $(\Omega, \mathcal{A})$ and a set function $\mu : \mathcal{A} \to [0, \infty]$ (that is, $\mu(A) \ge 0$ for each $A \in \mathcal{A}$) having $\mu(\emptyset) = 0$.

(a) Now $\mathcal{A}$ is a σ-field, and if $\mu$ is countably additive (abbreviated c.a.) in that

$$\mu\Big(\sum_{n=1}^{\infty} A_n\Big) = \sum_{n=1}^{\infty} \mu(A_n) \quad \text{for all disjoint sequences } A_n \text{ in } \mathcal{A}, \tag{5}$$

then $\mu$ is called a measure (or, equivalently, a countably additive measure) on $(\Omega, \mathcal{A})$. The triple $(\Omega, \mathcal{A}, \mu)$ is then called a measure space. We call $\mu$ finite if $\mu(\Omega) < \infty$. We call $\mu$ σ-finite if there exists a measurable decomposition of $\Omega$ as $\Omega = \sum_1^\infty \Omega_n$ with $\Omega_n \in \mathcal{A}$ and $\mu(\Omega_n) < \infty$ for all $n$. The sets $A$ in the σ-field $\mathcal{A}$ are called events.

[Even if $\mathcal{A}$ is not a σ-field, we will still call $\mu$ a measure on $(\Omega, \mathcal{A})$, when (5) holds for all sequences $A_n \in \mathcal{A}$ for which $\sum_1^\infty A_n$ is in $\mathcal{A}$. We will not, however, use the term "measure space" to describe such a triple. We will consider below measures on fields, on certain $\bar{\pi}$-systems, and on some other collections of sets. A useful property of a collection of sets is that along with any sets $A_1, \ldots, A_k$ it also includes all sets of the type $B_k \equiv A_k A_{k-1}^c \cdots A_2^c A_1^c$; then $\bigcup_1^n A_k = \sum_1^n B_k$ is easier to work with.]

(b) Of less interest, call $\mu$ a finitely additive measure (abbreviated f.a.) on $(\Omega, \mathcal{A})$ if

$$\mu\Big(\sum_1^n A_k\Big) = \sum_1^n \mu(A_k) \tag{6}$$

for all disjoint sequences $A_k$ in $\mathcal{A}$ for which $\sum_1^n A_k$ is also in $\mathcal{A}$.
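The remark that $\bigcup_1^n A_k = \sum_1^n B_k$ with $B_k \equiv A_k A_{k-1}^c \cdots A_1^c$ can be illustrated on a small example. This is a sketch only: the helper `disjointify`, the use of counting measure as $\mu$, and the particular sets are assumptions chosen for illustration.

```python
def disjointify(sets):
    """Return B_k = A_k A_{k-1}^c ... A_1^c for k = 1, ..., n.

    The B_k are pairwise disjoint and their union equals the union of the A_k.
    """
    seen = frozenset()
    out = []
    for a in sets:
        a = frozenset(a)
        out.append(a - seen)  # remove everything already covered by A_1, ..., A_{k-1}
        seen |= a
    return out

def counting_measure(a):
    """Counting measure: mu(A) is the number of points in A."""
    return len(a)

A = [{1, 2, 3}, {3, 4}, {1, 5}, {6}]
B = disjointify(A)
union = frozenset().union(*A)

# Finite additivity (6) applies to the disjoint B_k, not to the overlapping A_k.
print(sum(counting_measure(b) for b in B), counting_measure(union))  # 6 6
print(sum(counting_measure(a) for a in A))                           # 8
```

Summing over the original overlapping sets overcounts, which is why passing to the disjoint $B_k$ is convenient.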

Definition 1.3 (Outer measures) Consider a set function $\mu^* : 2^\Omega \to [0, \infty]$.

(a) Suppose that $\mu^*$ also satisfies the following three properties.

Null: $\mu^*(\emptyset) = 0$.

Monotone: $\mu^*(A) \le \mu^*(B)$ for all $A \subset B$.

Countable subadditivity: $\mu^*(\bigcup_1^\infty A_n) \le \sum_1^\infty \mu^*(A_n)$ for all sequences $A_n$.

Then $\mu^*$ is called an outer measure.

(b) An arbitrary subset $A$ of $\Omega$ is called $\mu^*$-measurable if

$$\mu^*(T) = \mu^*(TA) + \mu^*(TA^c) \quad \text{for all subsets } T \subset \Omega. \tag{7}$$

Sets $T$ used in this capacity are called test sets.

(c) We let $\mathcal{A}^*$ denote the class of all $\mu^*$-measurable sets, that is,

$$\mathcal{A}^* \equiv \{A \in 2^\Omega : A \text{ is } \mu^*\text{-measurable}\}. \tag{8}$$

[Note that $A \in \mathcal{A}^*$ if and only if $\mu^*(T) \ge \mu^*(TA) + \mu^*(TA^c)$ for all $T \subset \Omega$, since the other inequality is trivial by the subadditivity of $\mu^*$.]
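The test-set condition (7) can be checked exhaustively when $\Omega$ is finite. The sketch below uses a standard three-point toy example that is not taken from the text: $\mu^*(\emptyset) = 0$, $\mu^*(\Omega) = 2$, and $\mu^*(A) = 1$ otherwise. The helper names are hypothetical, and on a finite $\Omega$ countable subadditivity reduces to the pairwise check used here.

```python
from itertools import combinations

def powerset(omega):
    """All subsets of a finite omega, as frozensets."""
    items = sorted(omega)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_outer_measure(omega, mu_star):
    """Check the null, monotone, and (pairwise) subadditive properties."""
    subsets = powerset(omega)
    null = mu_star(frozenset()) == 0
    monotone = all(mu_star(a) <= mu_star(b)
                   for a in subsets for b in subsets if a <= b)
    subadditive = all(mu_star(a | b) <= mu_star(a) + mu_star(b)
                      for a in subsets for b in subsets)
    return null and monotone and subadditive

def measurable_sets(omega, mu_star):
    """All A with mu*(T) = mu*(TA) + mu*(TA^c) for every test set T."""
    subsets = powerset(omega)
    return {a for a in subsets
            if all(mu_star(t) == mu_star(t & a) + mu_star(t - a) for t in subsets)}

OMEGA = frozenset({1, 2, 3})

def mu_star(a):
    """Toy outer measure: 0 on the empty set, 2 on OMEGA, 1 on everything else."""
    if not a:
        return 0
    return 2 if a == OMEGA else 1

print(is_outer_measure(OMEGA, mu_star))  # True
print(measurable_sets(OMEGA, mu_star))   # only the empty set and OMEGA
```

For this toy $\mu^*$ the test set $T = \{1, 2\}$ already rules out $A = \{1\}$, since $1 \ne 1 + 1$; with counting measure in place of `mu_star`, every subset passes.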

Motivation 1.2 (Measure) In this paragraph we will consider only one possible measure $\mu$, namely the Lebesgue-measure generalization of length. Let $\mathcal{C}_I$ denote the set of all intervals of the types $(a, b]$, $(-\infty, b]$, and $(a, +\infty)$ on the real line $R$, and for each of these intervals $I$ we assign a measure value $\mu(I)$ equal to its length, thus $b - a$, $\infty$, $\infty$ in the three respective cases. All is well until we manipulate the sets