
*Probability and
Stochastic Processes*

**Features of this Text**

**Who will benefit from using this text?**

This text can be used in junior, senior, or graduate level courses in probability, stochastic processes, random signal processing, and queuing theory. The mathematical exposition will appeal to students and practitioners in many areas. The examples, quizzes, and problems are typical of those encountered by practicing electrical and computer engineers. Professionals in the telecommunications and wireless industry will find it particularly useful.

**What’s New?**

This text has been expanded greatly with new material:

• Matlab examples and problems give students hands-on access to theory and applications. Every chapter includes guidance on how to use Matlab to perform calculations and simulations relevant to the subject of the chapter.

• A new chapter on **Random Vectors**

• Expanded and enhanced coverage of **Random Signal Processing**

• Streamlined exposition of **Markov Chains** and **Queuing Theory** provides quicker access to theories of greatest practical importance

**Notable Features**

**The Friendly Approach**
The friendly and accessible writing style gives students an intuitive feeling for the formal mathematics.

**Quizzes and Homework Problems**
An extensive collection of in-chapter Quizzes provides check points for readers to gauge their understanding. Hundreds of end-of-chapter problems are clearly marked as to their degree of difficulty from beginner to expert.

**Website for Students** http://www.wiley.com/college/yates
Available for download: All Matlab m-files in the text, the *Quiz Solutions Manual*

**Instructor Support**

Instructors should register at the Instructor Companion Site (ISC) at Wiley in order to obtain supplements. The ISC can be reached by accessing the text’s companion web page http://www.wiley.com/college/yates

• Unparalleled in its offerings, this Second Edition provides a web-based interface for instructors to create customized solutions documents that output in PDF or PostScript.

• Extensive PowerPoint slides are available.

*Probability and Stochastic Processes*

*A Friendly Introduction for Electrical and Computer Engineers*

Second Edition

**Roy D. Yates**
*Rutgers, The State University of New Jersey*

**David J. Goodman**
*Polytechnic University*

*JOHN WILEY & SONS, INC.*

EXECUTIVE EDITOR Bill Zobrist

MARKETING MANAGER Jennifer Powers

PRODUCTION EDITOR Ken Santor

COVER DESIGNER Dawn Stanley

This book was set in Times Roman by the authors using LaTeX and printed and bound by Malloy, Inc. The cover was printed by Lehigh Press.

About the cover: The cover shows a cutaway view of a bivariate Gaussian probability density function. The bell-shaped cross-sections show that the marginal densities are Gaussian.

This book is printed on acid-free paper.

Copyright © 2005 John Wiley & Sons, Inc. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978)750-8400, fax (978)750-4470. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030 (201)748-6011, fax (201)748-6008, E-Mail: PERMREQ@WILEY.COM. To order books or for customer service call 1-800-CALL WILEY (225-5945).

ISBN 0-471-27214-0

WIE 0-471-45259-9

Printed in the United States of America


*To our children,*
*Tony, Brett, and Zachary Yates*

*Leila and Alissa Goodman*

*Preface*

**What’s new in the second edition?**

We are happy to introduce you to the second edition of our textbook. Students and instructors using the first edition have responded favorably to the “friendly” approach that couples engineering intuition to mathematical principles. They are especially pleased with the abundance of exercises in the form of “examples,” “quizzes,” and “problems,” many of them very simple. The exercises help students absorb the new material in each chapter and gauge their grasp of it.

Aiming for basic insight, the first edition avoided exercises that require complex computation. Although most of the original exercises have evident engineering relevance, they are considerably simpler than their real-world counterparts. This second edition adds a large set of Matlab programs offering students hands-on experience with simulations and calculations. Matlab bridges the gap between the computationally simple exercises and the more complex tasks encountered by engineering professionals. The Matlab section at the end of each chapter presents programs we have written and also guides students to write their own programs.

Retaining the friendly character of the first edition, we have incorporated into this edition the suggestions of many instructors and students. In addition to the Matlab programs, new material includes a presentation of multiple random variables in vector notation. This format makes the math easier to grasp and provides a convenient stepping stone to the chapter on stochastic processes, which in turn leads to an expanded treatment of the application of probability theory to digital signal processing.

**Why did we write the book?**

When we started teaching the course *Probability and Stochastic Processes* to Rutgers undergraduates in 1991, we never dreamed we would write a textbook on the subject. Our bookshelves contain more than twenty probability texts, many of them directed at electrical and computer engineering students. We respect most of them. However, we have yet to find one that works well for Rutgers students. We discovered to our surprise that the majority of our students have a hard time learning the subject. Beyond meeting degree requirements, the main motivation of most of our students is to learn how to solve practical problems. For the majority, the mathematical logic of probability theory is, in itself, of minor interest. What the students want most is an intuitive grasp of the basic concepts and lots of practice working on applications.

The students told us that the textbooks we assigned, for all their mathematical elegance, didn’t meet their needs. To help them, we distributed copies of our lecture notes, which gradually grew into this book. We also responded to students who find that although much of the material appears deceptively simple, it takes a lot of careful thought and practice to use the mathematics correctly. Even when the formulas are simple, knowing which ones to use is difficult. This is a reversal from some mathematics courses, where the equations are given and the solutions are hard to obtain.

**What is distinctive about this book?**

• The entire text adheres to a single model that begins with an experiment consisting of a procedure and observations.

• The mathematical logic is apparent to readers. Every fact is identified clearly as a definition, an axiom, or a theorem. There is an explanation, in simple English, of the intuition behind every concept when it first appears in the text.

• The mathematics of discrete random variables are introduced separately from the mathematics of continuous random variables.

• Stochastic processes and statistical inference fit comfortably within the unifying model of the text.

• An abundance of exercises puts the theory to use. New ideas are augmented with detailed solutions of numerical examples. Each section concludes with a simple quiz to help students gauge their grasp of the new material. The book’s Web site contains complete solutions of all of the quizzes.

• Each problem at the end of a chapter is labeled with a reference to a section in the chapter and a degree of difficulty ranging from “easy” to “experts only.”

• There is considerable support on the World Wide Web for students and instructors, including Matlab files and problem solutions.

**How is the book organized?**

We estimate that the material in this book represents about 150% of a one-semester undergraduate course. We suppose that most instructors will spend about two-thirds of a semester covering the material in the first five chapters. The remainder of a course will be devoted to about half of the material in the final seven chapters, with the selection depending on the preferences of the instructor and the needs of the students. Rutgers electrical and computer engineering students take this course in the first semester of junior year. The following semester they use much of the material in *Principles of Communications*.

We have also covered the entire book in one semester in an entry-level graduate course that places more emphasis on mathematical derivations and proofs than does the undergraduate course. Although most of the early material in the book is familiar in advance to many graduate students, the course as a whole brings our diverse graduate student population up to a shared level of competence.

The first five chapters carry the core material that is common to practically all introductory engineering courses in probability theory. Chapter 1 examines probability models defined on abstract sets. It introduces the set theory notation used throughout the book and states the three axioms of probability and several theorems that follow directly from the axioms. It defines conditional probability, the Law of Total Probability, Bayes’ theorem, and independence. The chapter concludes by presenting combinatorial principles and formulas that are used later in the book.

[Figure: flow chart. Chapters 1–5 (1. Experiments, Models, and Probabilities; 2. Discrete Random Variables; 3. Continuous Random Variables; 4. Pairs of Random Variables; 5. Random Vectors) lead to four boxes: Chapters 6–7 (Sums of Random Variables; Parameter Estimation using the Sample Mean), Chapters 8–9 (Hypothesis Testing; Estimation of a Random Variable), Chapters 10 and 12 (Stochastic Processes; Markov Chains), and Chapters 10–11 (Stochastic Processes; Random Signal Processing).]

**A road map for the text.**

The second and third chapters address individual discrete and continuous random variables, respectively. They introduce probability mass functions and probability density functions, expected values, derived random variables, and random variables conditioned on events. Chapter 4 covers pairs of random variables including joint probability functions, conditional probability functions, correlation, and covariance. Chapter 5 extends these concepts to multiple random variables, with an emphasis on vector notation. In studying Chapters 1–5, students encounter many of the same ideas three times in the contexts of abstract events, discrete random variables, and continuous random variables. We find this repetition to be helpful pedagogically. The flow chart shows the relationship of the subsequent material to the fundamentals in the first five chapters. Armed with the fundamentals, students can move next to any of three subsequent chapters.

Chapter 6 teaches students how to work with sums of random variables. For the most part it deals with independent random variables and derives probability models using convolution integrals and moment generating functions. A presentation of the central limit theorem precedes examples of Gaussian approximations to sums of random variables. This material flows into Chapter 7, which defines the sample mean and teaches students how to use measurement data to formulate probability models.

Chapters 8 and 9 present practical applications of the theory developed in the first five chapters. Chapter 8 introduces Bayesian hypothesis testing, the foundation of many signal detection techniques created by electrical and computer engineers. Chapter 9 presents techniques for using observations of random variables to estimate other random variables. Some of these techniques appear again in Chapter 11 in the context of random signal processing.

Many instructors may wish to move from Chapter 5 to Chapter 10, which introduces the basic concepts of stochastic processes with the emphasis on wide sense stationary processes. It provides tools for working on practical applications in the last two chapters. Chapter 11 introduces several topics related to random signal processing including: linear filters operating on continuous-time and discrete-time stochastic processes; linear estimation and linear prediction of stochastic processes; and frequency domain analysis based on power spectral density functions. Chapter 12 introduces Markov chains and their practical applications.

The text includes several hundred homework problems, organized to assist both instructors and students. The problem numbers refer to sections within a chapter. For example, Problem 3.4.5 requires material from Section 3.4 but not from later sections. Each problem also has a label that reflects our estimate of degree of difficulty. Skiers will recognize the following symbols:

● Easy   ■ Moderate   ◆ Difficult   ◆◆ Experts Only.

Every ski area emphasizes that these designations are relative to the trails at that area. Similarly, the difficulty of our problems is relative to the other problems in this text.

**Further Reading**

Libraries and bookstores contain an endless collection of textbooks at all levels covering the topics presented in this textbook. We know of two in comic book format [GS93, Pos01]. The reference list on page 511 is a brief sampling of books that can add breadth or depth to the material in this text. Most books on probability, statistics, stochastic processes, and random signal processing contain expositions of the basic principles of probability and random variables, covered in Chapters 1–4. In advanced texts, these expositions serve mainly to establish notation for more specialized topics. [LG93] and [Pee00] share our focus on electrical and computer engineering applications. [Dra67], [Ros02], and [BT02] introduce the fundamentals of probability and random variables to a general audience of students with a calculus background. [Bil95] is more advanced mathematically. It presents probability as a branch of measure theory. [MR94] and [SM95] introduce probability theory in the context of data analysis. [Sig02] and [HL01] are beginners’ introductions to Matlab. [Ber96] is in a class by itself. It presents the concepts of probability from a historical perspective, focusing on the lives and contributions of mathematicians and others who stimulated major advances in probability and statistics and their application to various fields including psychology, economics, government policy, and risk management.

The summaries at the end of Chapters 5–12 refer to books that supplement the specialized material in those chapters.


**Acknowledgments**

We are grateful for assistance and suggestions from many sources including our students at Rutgers and Polytechnic Universities, instructors who adopted the first edition, reviewers, and the Wiley team.

At Wiley, we are pleased to acknowledge the continuous encouragement and enthusiasm of our executive editor, Bill Zobrist, and the highly skilled support of marketing manager Jennifer Powers, senior production editor Ken Santor, and cover designer Dawn Stanley.

We also convey special thanks to Ivan Seskar of WINLAB at Rutgers University for exercising his magic to make the WINLAB computers particularly hospitable to the electronic versions of the book and to the supporting material on the World Wide Web.

The organization and content of the second edition have benefited considerably from the input of many faculty colleagues including Alhussein Abouzeid at Rensselaer Polytechnic Institute, Krishna Arora at Florida State University, Frank Candocia at Florida International University, Robin Carr at Drexel University, Keith Chugg at USC, Charles Doering at University of Michigan, Roger Green at North Dakota State University, Witold Krzymien at University of Alberta, Edl Schamiloglu at University of New Mexico, Arthur David Snider at University of South Florida, Junshan Zhang at Arizona State University, and colleagues Narayan Mandayam, Leo Razumov, Christopher Rose, Predrag Spasojević, and Wade Trappe at Rutgers.

Unique among our teaching assistants, Dave Famolari took the course as an undergraduate. Later as a teaching assistant, he did an excellent job writing homework solutions with a tutorial flavor. Other graduate students who provided valuable feedback and suggestions on the first edition include Ricki Abboudi, Zheng Cai, Pi-Chun Chen, Sorabh Gupta, Vahe Hagopian, Amar Mahboob, Ivana Maric, David Pandian, Mohammad Saquib, Sennur Ulukus, and Aylin Yener.

The first edition also benefited from reviews and suggestions conveyed to the publisher by D.L. Clark at California State Polytechnic University at Pomona, Mark Clements at Georgia Tech, Gustavo de Veciana at the University of Texas at Austin, Fred Fontaine at Cooper Union, Rob Frohne at Walla Walla College, Chris Genovese at Carnegie Mellon, Simon Haykin at McMaster, and Ratnesh Kumar at the University of Kentucky.

Finally, we acknowledge with respect and gratitude the inspiration and guidance of our teachers and mentors who conveyed to us when we were students the importance and elegance of probability theory. We cite in particular Alvin Drake and Robert Gallager of MIT and the late Colin Cherry of Imperial College of Science and Technology.

**A Message to Students from the Authors**

A lot of students find it hard to do well in this course. We think there are a few reasons for this difficulty. One reason is that some people find the concepts hard to use and understand. Many of them are successful in other courses but find the ideas of probability difficult to grasp. Usually these students recognize that learning probability theory is a struggle, and most of them work hard enough to do well. However, they find themselves putting in more effort than in other courses to achieve similar results.

Other people have the opposite problem. The work looks easy to them, and they understand everything they hear in class and read in the book. There are good reasons for assuming this is easy material. There are very few basic concepts to absorb. The terminology (like the word *probability*), in most cases, contains familiar words. With a few exceptions, the mathematical manipulations are not complex. You can go a long way solving problems with a four-function calculator.

For many people, this apparent simplicity is dangerously misleading because it is very tricky to apply the math to specific problems. A few of you will see things clearly enough to do everything right the first time. However, most people who do well in probability need to practice with a lot of examples to get comfortable with the work and to really understand what the subject is about. Students in this course end up like elementary school children who do well with multiplication tables and long division but bomb out on “word problems.” The hard part is figuring out what to do with the numbers, not actually doing it. Most of the work in this course is that way, and the only way to do well is to practice a lot. Taking the midterm and final is similar to running in a five-mile race. Most people can do it in a respectable time, provided they train for it. Some people look at the runners who do it and say, “I’m as strong as they are. I’ll just go out there and join in.” Without the training, most of them are exhausted and walking after a mile or two.

So, our advice to students is, if this looks really weird to you, keep working at it. You will probably catch on. If it looks really simple, don’t get too complacent. It may be harder than you think. Get into the habit of doing the quizzes and problems, and if you don’t answer all the quiz questions correctly, go over them until you understand each one.

We can’t resist commenting on the role of probability and stochastic processes in our careers. The theoretical material covered in this book has helped both of us devise new communication techniques and improve the operation of practical systems. We hope you find the subject intrinsically interesting. If you master the basic ideas, you will have many opportunities to apply them in other courses and throughout your career.

We have worked hard to produce a text that will be useful to a large population of students and instructors. We welcome comments, criticism, and suggestions. Feel free to send us e-mail at *ryates@winlab.rutgers.edu* or *dgoodman@poly.edu*. In addition, the Website, http://www.wiley.com/college/yates, provides a variety of supplemental materials, including the Matlab code used to produce the examples in the text.

Roy D. Yates
*Rutgers, The State University of New Jersey*

David J. Goodman
*Polytechnic University*

*March 29, 2004*

*Contents*

Features of this Text
Preface

**1 Experiments, Models, and Probabilities**
Getting Started with Probability
1.1 Set Theory
1.2 Applying Set Theory to Probability
1.3 Probability Axioms
1.4 Some Consequences of the Axioms
1.5 Conditional Probability
1.6 Independence
1.7 Sequential Experiments and Tree Diagrams
1.8 Counting Methods
1.9 Independent Trials
1.10 Reliability Problems
1.11 Matlab
Chapter Summary
Problems

**2 Discrete Random Variables**
2.1 Definitions
2.2 Probability Mass Function
2.3 Families of Discrete Random Variables
2.4 Cumulative Distribution Function (CDF)
2.5 Averages
2.6 Functions of a Random Variable
2.7 Expected Value of a Derived Random Variable
2.8 Variance and Standard Deviation
2.9 Conditional Probability Mass Function
2.10 Matlab
Chapter Summary
Problems

**3 Continuous Random Variables**
3.1 The Cumulative Distribution Function
3.2 Probability Density Function
3.3 Expected Values
3.4 Families of Continuous Random Variables
3.5 Gaussian Random Variables
3.6 Delta Functions, Mixed Random Variables
3.7 Probability Models of Derived Random Variables
3.8 Conditioning a Continuous Random Variable
3.9 Matlab
Chapter Summary
Problems

**4 Pairs of Random Variables**
4.1 Joint Cumulative Distribution Function
4.2 Joint Probability Mass Function
4.3 Marginal PMF
4.4 Joint Probability Density Function
4.5 Marginal PDF
4.6 Functions of Two Random Variables
4.7 Expected Values
4.8 Conditioning by an Event
4.9 Conditioning by a Random Variable
4.10 Independent Random Variables
4.11 Bivariate Gaussian Random Variables
4.12 Matlab
Chapter Summary
Problems

**5 Random Vectors**
5.1 Probability Models of N Random Variables
5.2 Vector Notation
5.3 Marginal Probability Functions
5.4 Independence of Random Variables and Random Vectors
5.5 Functions of Random Vectors
5.6 Expected Value Vector and Correlation Matrix
5.7 Gaussian Random Vectors
5.8 Matlab
Chapter Summary
Problems

**6 Sums of Random Variables**
6.1 Expected Values of Sums
6.2 PDF of the Sum of Two Random Variables
6.3 Moment Generating Functions
6.4 MGF of the Sum of Independent Random Variables
6.5 Random Sums of Independent Random Variables
6.6 Central Limit Theorem
6.7 Applications of the Central Limit Theorem
6.8 The Chernoff Bound
6.9 Matlab
Chapter Summary
Problems

**7 Parameter Estimation Using the Sample Mean**
7.1 Sample Mean: Expected Value and Variance
7.2 Deviation of a Random Variable from the Expected Value
7.3 Point Estimates of Model Parameters
7.4 Confidence Intervals
7.5 Matlab
Chapter Summary
Problems

**8 Hypothesis Testing**
8.1 Significance Testing
8.2 Binary Hypothesis Testing
8.3 Multiple Hypothesis Test
8.4 Matlab
Chapter Summary
Problems

**9 Estimation of a Random Variable**
9.1 Optimum Estimation Given Another Random Variable
9.2 Linear Estimation of *X* given *Y*
9.3 MAP and ML Estimation
9.4 Linear Estimation of Random Variables from Random Vectors
9.5 Matlab
Chapter Summary
Problems

**10 Stochastic Processes**
10.1 Definitions and Examples
10.2 Types of Stochastic Processes
10.3 Random Variables from Random Processes
10.4 Independent, Identically Distributed Random Sequences
10.5 The Poisson Process
10.6 Properties of the Poisson Process
10.7 The Brownian Motion Process
10.8 Expected Value and Correlation
10.9 Stationary Processes
10.10 Wide Sense Stationary Stochastic Processes
10.11 Cross-Correlation
10.12 Gaussian Processes
10.13 Matlab
Chapter Summary
Problems

**11 Random Signal Processing**
11.1 Linear Filtering of a Continuous-Time Stochastic Process
11.2 Linear Filtering of a Random Sequence
11.3 Discrete-Time Linear Filtering: Vectors and Matrices
11.4 Discrete-Time Linear Estimation and Prediction Filters
11.5 Power Spectral Density of a Continuous-Time Process
11.6 Power Spectral Density of a Random Sequence
11.7 Cross Spectral Density
11.8 Frequency Domain Filter Relationships
11.9 Linear Estimation of Continuous-Time Stochastic Processes
11.10 Matlab
Chapter Summary
Problems

**12 Markov Chains**
12.1 Discrete-Time Markov Chains
12.2 Discrete-Time Markov Chain Dynamics
12.3 Limiting State Probabilities for a Finite Markov Chain
12.4 State Classification
12.5 Limit Theorems For Irreducible Finite Markov Chains
12.6 Periodic States and Multiple Communicating Classes
12.7 Countably Infinite Chains: State Classification
12.8 Countably Infinite Chains: Stationary Probabilities
12.9 Continuous-Time Markov Chains
12.10 Birth-Death Processes and Queueing Systems
12.11 Matlab
Chapter Summary
Problems

**Appendix A Families of Random Variables**
A.1 Discrete Random Variables
A.2 Continuous Random Variables

**Appendix B A Few Math Facts**

**References**

**Index**

*1 Experiments, Models, and Probabilities*

**Getting Started with Probability**

You have read the “Message to Students” in the Preface. Now you can begin. The title of this book is *Probability and Stochastic Processes*. We say and hear and read the word *probability* and its relatives (*possible, probable, probably*) in many contexts. Within the realm of applied mathematics, the meaning of *probability* is a question that has occupied mathematicians, philosophers, scientists, and social scientists for hundreds of years.

Everyone accepts that the probability of an event is a number between 0 and 1. Some people interpret probability as a physical property (like mass or volume or temperature) that can be measured. This is tempting when we talk about the probability that a coin flip will come up heads. This probability is closely related to the nature of the coin. Fiddling around with the coin can alter the probability of heads.

Another interpretation of probability relates to the knowledge that we have about something. We might assign a low probability to the truth of the statement, *It is raining now in Phoenix, Arizona*, because we know that Phoenix is in the desert. However, our knowledge changes if we learn that it was raining an hour ago in Phoenix. This knowledge would cause us to assign a higher probability to the truth of the statement, *It is raining now in Phoenix*.

Both views are useful when we apply probability theory to practical problems. Whichever view we take, we will rely on the abstract mathematics of probability, which consists of definitions, axioms, and inferences (theorems) that follow from the axioms. While the structure of the subject conforms to principles of pure logic, the terminology is not entirely abstract. Instead, it reflects the practical origins of probability theory, which was developed to describe phenomena that cannot be predicted with certainty. The point of view is different from the one we took when we started studying physics. There we said that if we do the same thing in the same way over and over again – send a space shuttle into orbit, for example – the result will always be the same. To predict the result, we have to take account of all relevant facts.

The mathematics of probability begins when the situation is so complex that we just can’t replicate everything important exactly – like when we fabricate and test an integrated circuit. In this case, repetitions of the same procedure yield different results. The situation is not totally chaotic, however. While each outcome may be unpredictable, there are consistent patterns to be observed when we repeat the procedure a large number of times. Understanding these patterns helps engineers establish test procedures to ensure that a factory meets quality objectives. In this repeatable procedure (making and testing a chip) with unpredictable outcomes (the quality of individual chips), the *probability* is a number between 0 and 1 that states the proportion of times we expect a certain thing to happen, such as the proportion of chips that pass a test.
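The idea that a probability states a long-run proportion can be illustrated with a small simulation. The sketch below is in Python rather than the Matlab used in the book's own examples, and it rests on assumptions that are not from the text: the function name `pass_fraction`, the pass probability of 0.9, and the model of each chip test as an independent random draw are all hypothetical choices made only for illustration.

```python
import random


def pass_fraction(num_chips, pass_probability, seed=0):
    """Simulate testing num_chips chips and return the fraction that pass.

    Each chip is modeled as passing independently with the given
    probability (an illustrative assumption, not a model from the text).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    passed = sum(1 for _ in range(num_chips) if rng.random() < pass_probability)
    return passed / num_chips


if __name__ == "__main__":
    # Individual outcomes are unpredictable, but the observed proportion
    # tends to settle near the assumed probability as the count grows.
    for n in (10, 1_000, 100_000):
        print(n, pass_fraction(n, pass_probability=0.9))
```

With only a handful of chips the observed fraction can stray well away from 0.9; with many chips it typically lands close to it, which is the consistent pattern the paragraph above describes.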

As an introduction to probability and stochastic processes, this book serves three purposes:

• It introduces students to the logic of probability theory.

• It helps students develop intuition into how the theory applies to practical situations.

• It teaches students how to apply probability theory to solving engineering problems.

To exhibit the logic of the subject, we show clearly in the text three categories of theoretical material: definitions, axioms, and theorems. Definitions establish the logic of probability theory, while axioms are facts that we accept without proof. Theorems are consequences that follow logically from definitions and axioms. Each theorem has a proof that refers to definitions, axioms, and other theorems. Although there are dozens of definitions and theorems, there are only three axioms of probability theory. These three axioms are the foundation on which the entire subject rests. To meet our goal of presenting the logic of the subject, we could set out the material as dozens of definitions followed by three axioms followed by dozens of theorems. Each theorem would be accompanied by a complete proof.

While rigorous, this approach would completely fail to meet our second aim of conveying the intuition necessary to work on practical problems. To address this goal, we augment the purely mathematical material with a large number of examples of practical phenomena that can be analyzed by means of probability theory. We also interleave definitions and theorems, presenting some theorems with complete proofs, others with partial proofs, and omitting some proofs altogether. We find that most engineering students study probability with the aim of using it to solve practical problems, and we cater mostly to this goal. We also encourage students to take an interest in the logic of the subject – it is very elegant – and we feel that the material presented will be sufficient to enable these students to fill in the gaps we have left in the proofs.

Therefore, as you read this book you will find a progression of definitions, axioms, theorems, more definitions, and more theorems, all interleaved with examples and comments designed to contribute to your understanding of the theory. We also include brief quizzes that you should try to solve as you read the book. Each one will help you decide whether you have grasped the material presented just before the quiz. The problems at the end of each chapter give you more practice applying the material introduced in the chapter. They vary considerably in their level of difficulty. Some of them take you more deeply into the subject than the examples and quizzes do.

**1.1 Set Theory**

The mathematical basis of probability is the theory of sets. Most people who study probability have already encountered set theory and are familiar with such terms as *set*, *element*, *union*, *intersection*, and *complement*. For them, the following paragraphs will review material already learned and introduce the notation and terminology we use here. For people who have no prior acquaintance with sets, this material introduces basic definitions and the properties of sets that are important in the study of probability.

A *set* is a collection of things. We use capital letters to denote sets. The things that together make up the set are *elements*. When we use mathematical notation to refer to set elements, we usually use small letters. Thus we can have a set *A* with elements *x*, *y*, and *z*. The symbol ∈ denotes set inclusion. Thus *x* ∈ *A* means “*x* is an element of set *A*.” The symbol ∉ is the opposite of ∈. Thus *c* ∉ *A* means “*c* is not an element of set *A*.”

It is essential when working with sets to have a definition of each set. The definition allows someone to consider anything conceivable and determine whether that thing is an element of the set. There are many ways to define a set. One way is simply to name the elements:

*A* = {Rutgers University, Polytechnic University, the planet Mercury}. (1.1)

Note that in stating the definition, we write the name of the set on one side of = and the definition in curly brackets { } on the other side of =.

It follows that “the planet closest to the Sun ∈ *A*” is a true statement. It is also true that “Bill Clinton ∉ *A*.” Another way of writing the set is to give a rule for testing something to determine whether it is a member of the set:

*B* = {all Rutgers juniors who weigh more than 170 pounds}. (1.2)

In engineering, we frequently use mathematical rules for generating all of the elements of the set:

$C = \{x^2 \mid x = 1, 2, 3, 4, 5\}.$ (1.3)

This notation tells us to form a set by performing the operation to the left of the vertical bar, |, on the numbers to the right of the bar. Therefore,

*C* = {1, 4, 9, 16, 25}. (1.4)

Some sets have an infinite number of elements. For example,

$D = \{x^2 \mid x = 1, 2, 3, \ldots\}.$ (1.5)

The dots tell us to continue the sequence to the left of the dots. Since there is no number to the right of the dots, we continue the sequence indefinitely, forming an infinite set containing all perfect squares except 0. The definition of *D* implies that 144 ∈ *D* and 10 ∉ *D*.
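The set-building rules in Equations (1.3) and (1.5) can be mirrored in a few lines of code. The following is a minimal Python sketch, offered as an illustration rather than as part of the text's Matlab material; the function name `in_D` is our own. Because *D* is infinite, the sketch represents it by a membership test instead of a list of elements.

```python
from math import isqrt

# Set C from Equation (1.3): apply the rule x^2 to x = 1, ..., 5.
C = {x**2 for x in range(1, 6)}

def in_D(n):
    """Membership test for D in Equation (1.5).

    D is infinite, so we cannot enumerate it. Instead we check whether
    n is the square of some positive integer.
    """
    return isinstance(n, int) and n >= 1 and isqrt(n) ** 2 == n

print(sorted(C))   # the elements listed in Equation (1.4)
print(in_D(144))   # 144 = 12^2, so 144 is an element of D
print(in_D(10))    # 10 is not a perfect square, so 10 is not in D
```

Defining a set by a test, as `in_D` does, corresponds to the "rule" style of definition used for the set *B* in Equation (1.2).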

In addition to set inclusion, we also have the notion of a *subset*, which describes a relationship between two sets. By definition, *A* is a subset of *B* if every member of *A* is also a member of *B*. We use the symbol ⊂ to denote subset. Thus *A* ⊂ *B* is mathematical notation for the statement “the set *A* is a subset of the set *B*.” Using the definitions of sets *C* and *D* in Equations (1.3) and (1.5), we observe that *C* ⊂ *D*. If

*I* = {all positive integers, negative integers, and 0}, (1.6)

it follows that *C* ⊂ *I* and *D* ⊂ *I*.


The definition of set equality,

*A* = *B*, (1.7)

is

*A* = *B* if and only if *B* ⊂ *A* and *A* ⊂ *B*.

This is the mathematical way of stating that *A* and *B* are identical if and only if every element of *A* is an element of *B* and every element of *B* is an element of *A*. This definition implies that a set is unaffected by the order of the elements in a definition. For example, {0, 17, 46} = {17, 0, 46} = {46, 0, 17} are all the same set.
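The subset and equality definitions translate directly into code. In the Python sketch below, `D_trunc` is a hypothetical finite stand-in for the infinite set *D*, and `sets_equal` is our own helper name. Note that the symbol ⊂ as used here allows the two sets to be equal, so it corresponds to Python's `<=` operator on sets rather than the strict `<`.

```python
# C from Equation (1.3) and a finite truncation of D from Equation (1.5).
C = {x**2 for x in range(1, 6)}
D_trunc = {x**2 for x in range(1, 101)}   # assumption: squares of 1..100 only

def sets_equal(X, Y):
    """Set equality as defined in the text: X is a subset of Y and Y is a subset of X."""
    return X <= Y and Y <= X

print(C <= D_trunc)                           # True: every element of C is in D_trunc
print(D_trunc <= C)                           # False: 36 is in D_trunc but not in C
print(sets_equal({0, 17, 46}, {46, 0, 17}))   # True: order of listing is irrelevant
```

Python's built-in `==` on sets gives the same answer as `sets_equal`; the helper is written out only to follow the two-part definition.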

To work with sets mathematically it is necessary to define a *universal set*. This is the set of all things that we could possibly consider in a given context. In any study, all set operations relate to the universal set for that study. The members of the universal set include all of the elements of all of the sets in the study. We will use the letter *S* to denote the universal set. For example, the universal set for *A* could be *S* = {all universities in New Jersey, all planets}.
The universal set for *C* could be *S* = *I* = {. . . , −2, −1, 0, 1, 2, . . .}, the set of all integers from Equation (1.6). By definition, every set is a subset of the universal set. That is, for any set *X*, *X* ⊂ *S*.

The *null set*, which is also important, may seem like it is not a set at all. By definition it has no elements. The notation for the null set is *φ*. By definition *φ* is a subset of every set. For any set *A*, *φ* ⊂ *A*.

It is customary to refer to Venn diagrams to display relationships among sets. By convention, the region enclosed by the large rectangle is the universal set *S*. Closed surfaces within this rectangle denote sets. A Venn diagram depicting the relationship *A* ⊂ *B* is

[Venn diagram: region *A* drawn inside region *B*]

When we do set algebra, we form new sets from existing sets. There are three operations
for doing this: *union*, *intersection*, and *complement*. Union and intersection combine two
existing sets to produce a third set. The complement operation forms a new set from one
existing set. The notation and definitions are

[Venn diagram: sets *A* and *B*, with *A* ∪ *B* shaded]

The *union* of sets *A* and *B* is the set of all elements that are either in *A* or in *B*, or in both. The union of *A* and *B* is denoted by *A* ∪ *B*. In this Venn diagram, *A* ∪ *B* is the complete shaded area. Formally, the definition states

*x* ∈ *A* ∪ *B* if and only if *x* ∈ *A* or *x* ∈ *B*.

The set operation union corresponds to the logical “or” operation.


[Venn diagram: sets *A* and *B*]

The *intersection* of two sets *A* and *B* is the set of all elements which are contained both in *A* and *B*. The intersection is denoted by *A* ∩ *B*. Another notation for intersection is *AB*. Formally, the definition is

*x* ∈ *A* ∩ *B* if and only if *x* ∈ *A* and *x* ∈ *B*.

The set operation intersection corresponds to the logical “and” function.

[Venn diagram: set $A$ and its complement $A^c$]

The *complement* of a set $A$, denoted by $A^c$, is the set of all elements in $S$ that are not in $A$. The complement of $S$ is the null set $\phi$. Formally,

$x \in A^c$ if and only if $x \notin A$.

[Venn diagram: the set $A - B$]

A fourth set operation is called the *difference*. It is a combination of intersection and complement. The *difference* between $A$ and $B$ is a set $A - B$ that contains all elements of $A$ that are *not* elements of $B$. Formally,

$x \in A - B$ if and only if $x \in A$ and $x \notin B$.

Note that $A - B = A \cap B^c$ and $A^c = S - A$.
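The four operations can be tried out on small finite sets. The Python sketch below uses a hypothetical universal set of the integers 1 through 10 and two arbitrary sets `A` and `B`; none of these values come from the text. Python has no built-in complement because it has no notion of a universal set, so the helper `complement` (our own name) takes the universal set as an argument.

```python
# Hypothetical universal set and two example sets (values chosen for illustration).
S = set(range(1, 11))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

def complement(X, universe):
    """All elements of the universal set that are not in X."""
    return universe - X

union = A | B            # elements in A or in B, or in both
intersection = A & B     # elements in both A and B
difference = A - B       # elements of A that are not elements of B

print(sorted(union))
print(sorted(intersection))
print(sorted(complement(A, S)))
print(sorted(difference))

# The two identities noted in the text:
print(difference == A & complement(B, S))   # A - B equals A intersected with B^c
print(complement(S, S) == set())            # the complement of S is the null set
```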

In working with probability we will frequently refer to two important properties of collections of sets. Here are the definitions.

[Venn diagram: sets $A$ and $B$]

A collection of sets $A_1, \ldots, A_n$ is *mutually exclusive* if and only if

$A_i \cap A_j = \phi, \quad i \neq j.$ (1.8)

When there are only two sets in the collection, we say that these sets are *disjoint*. Formally, $A$ and $B$ are disjoint if and only if $A \cap B = \phi$.

[Venn diagram: sets $A_1$, $A_2$, and $A_3$]

A collection of sets $A_1, \ldots, A_n$ is *collectively exhaustive* if and only if

$A_1 \cup A_2 \cup \cdots \cup A_n = S.$ (1.9)

In the definition of *collectively exhaustive*, we used the somewhat cumbersome notation $A_1 \cup A_2 \cup \cdots \cup A_n$ for the union of $n$ sets. Just as $\sum_{i=1}^{n} x_i$ is a shorthand for $x_1 + x_2 + \cdots + x_n$, we will use a shorthand for unions and intersections of $n$ sets:

$\bigcup_{i=1}^{n} A_i = A_1 \cup A_2 \cup \cdots \cup A_n,$ (1.10)

$\bigcap_{i=1}^{n} A_i = A_1 \cap A_2 \cap \cdots \cap A_n.$ (1.11)
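The conditions in Equations (1.8) and (1.9) can be checked mechanically for finite sets. The following Python sketch is one way to do so; the function names and the example collections are our own assumptions. The call `set().union(*sets)` plays the role of the $n$-fold union in Equation (1.10).

```python
from itertools import combinations

def mutually_exclusive(sets):
    """Equation (1.8): every pair of distinct sets has an empty intersection."""
    return all(not (X & Y) for X, Y in combinations(sets, 2))

def collectively_exhaustive(sets, universe):
    """Equation (1.9): the union of all the sets equals the universal set."""
    return set().union(*sets) == universe

# Hypothetical universal set and two example collections.
S = {1, 2, 3, 4, 5, 6}
partition = [{1, 2}, {3, 4}, {5, 6}]
overlapping = [{1, 2, 3}, {3, 4, 5, 6}]

print(mutually_exclusive(partition), collectively_exhaustive(partition, S))
print(mutually_exclusive(overlapping), collectively_exhaustive(overlapping, S))
```

The first collection satisfies both properties. The second covers all of `S` but is not mutually exclusive, because the element 3 appears in both sets.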

From the definition of set operations, we can derive many important relationships between sets and other sets derived from them. One example is

*A* − *B* ⊂ *A*. (1.12)

To prove that this is true, it is necessary to show that if *x* ∈ *A* − *B*, then it is also true that *x* ∈ *A*. A proof that two sets are equal, for example, *X* = *Y*, requires two separate proofs: *X* ⊂ *Y* and *Y* ⊂ *X*. As we see in the following theorem, this can be complicated to show.

**Theorem 1.1** *De Morgan’s law relates all three basic operations:*

$(A \cup B)^c = A^c \cap B^c.$

**Proof** There are two parts to the proof:

• To show $(A \cup B)^c \subset A^c \cap B^c$, suppose $x \in (A \cup B)^c$. That implies $x \notin A \cup B$. Hence, $x \notin A$ and $x \notin B$, which together imply $x \in A^c$ and $x \in B^c$. That is, $x \in A^c \cap B^c$.
• To show $A^c \cap B^c \subset (A \cup B)^c$, suppose $x \in A^c \cap B^c$. In this case, $x \in A^c$ and $x \in B^c$. Equivalently, $x \notin A$ and $x \notin B$ so that $x \notin A \cup B$. Hence, $x \in (A \cup B)^c$.
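As a sanity check on Theorem 1.1, we can test the identity on every pair of subsets of a small universal set. The Python sketch below does this for a hypothetical four-element universe; the helper names are our own. An exhaustive check of this kind confirms the law only for the chosen universe and is not a substitute for the proof.

```python
from itertools import combinations

# Hypothetical universal set; any small finite set would do.
S = frozenset(range(4))

def subsets(universe):
    """Generate every subset of a finite universe, from the null set up to the universe."""
    items = sorted(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def de_morgan_holds(A, B, universe):
    """Compare (A union B)^c with A^c intersected with B^c, complements taken in universe."""
    left = universe - (A | B)
    right = (universe - A) & (universe - B)
    return left == right

all_hold = all(de_morgan_holds(A, B, S) for A in subsets(S) for B in subsets(S))
print(all_hold)   # True: the identity holds for all 16 x 16 pairs of subsets
```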

*Quiz 1.1*

*A pizza at Gerlanda’s is either regular (R) or Tuscan (T). In addition, each slice may have mushrooms (M) or onions (O) as described by the Venn diagram at right. For the sets specified below, shade the corresponding region of the Venn diagram.*

[Venn diagram: regions *M*, *O*, and *T*]

(1) $R$
(2) $M \cup O$
(3) $M \cap O$
(4) $R \cup M$
(5) $R \cap M$
(6) $T^c - M$

**1.2 Applying Set Theory to Probability**

The mathematics we study is a branch of measure theory. Probability is a number that describes a set. The higher the number, the more probability there is. In this sense probability is like a quantity that measures a physical phenomenon; for example, a weight or a temperature. However, it is not necessary to think about probability in physical terms. We can do all the math abstractly, just as we defined sets and set operations in the previous paragraphs without any reference to physical phenomena.

Fortunately for engineers, the language of probability (including the word *probability* itself) makes us think of things that we experience. The basic model is a repeatable *experiment*. An experiment consists of a *procedure* and *observations*. There is uncertainty in what will be observed; otherwise, performing the experiment would be unnecessary. Some examples of experiments include

1. Flip a coin. Did it land with heads or tails facing up?

2. Walk to a bus stop. How long do you wait for the arrival of a bus?

3. Give a lecture. How many students are seated in the fourth row?

4. Transmit one of a collection of waveforms over a channel. What waveform arrives at the receiver?

5. Transmit one of a collection of waveforms over a channel. Which waveform does the receiver identify as the transmitted waveform?

For the most part, we will analyze *models *of actual physical experiments. We create
models because real experiments generally are too complicated to analyze. For example,
to describe *all *of the factors affecting your waiting time at a bus stop, you may consider

• The time of day. (Is it rush hour?)
• The speed of each car that passed by while you waited.
• The weight, horsepower, and gear ratios of each kind of bus used by the bus company.
• The psychological profile and work schedule of each bus driver. (Some drivers drive faster than others.)
• The status of all road construction within 100 miles of the bus stop.

It should be apparent that it would be difficult to analyze the effect of each of these factors
on the likelihood that you will wait less than five minutes for a bus. Consequently, it is
necessary to study a *model *of the experiment that captures the important part of the actual
physical experiment. Since we will focus on the model of the experiment almost exclusively,
we often will use the word *experiment *to refer to the model of an experiment.

**Example 1.1 **An experiment consists of the following procedure, observation, and model:

• Procedure: Flip a coin and let it land on a table.
• Observation: Observe which side (head or tail) faces you after the coin lands.
• Model: Heads and tails are equally likely. The result of each flip is unrelated to the results of previous flips.
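The model in Example 1.1 is simple enough to simulate. The Python sketch below is one possible rendering, with function names and the seed value chosen by us: each flip is drawn from a pseudorandom generator with heads and tails equally likely, and successive draws are treated as unrelated. It illustrates the model, not the physical coin.

```python
import random

def flip_coin(rng):
    """One flip under the model: heads ('H') and tails ('T') are equally likely."""
    return "H" if rng.random() < 0.5 else "T"

def fraction_heads(n_flips, seed=0):
    """Repeat the experiment n_flips times and return the fraction of heads.

    The seed is an arbitrary choice that makes the run reproducible.
    """
    rng = random.Random(seed)
    flips = [flip_coin(rng) for _ in range(n_flips)]
    return flips.count("H") / n_flips

print(fraction_heads(10_000))   # expected to be close to 0.5 under this model
```

Changing the observation, for example recording the number of flips until the first head, would define a different experiment even though the procedure is the same.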

As we have said, an experiment consists of both a procedure and observations. It is important to understand that two experiments with the same procedure but with different observations are different experiments. For example, consider these two experiments: