Probability and Statistics - De Groot (study notes)

Type: Study notes

Before 2010

Shared on 25/08/2010

fernanda-ribeiro-21

Probability and Statistics
Second Edition

Morris H. DeGroot
Carnegie-Mellon University

ADDISON-WESLEY PUBLISHING COMPANY
Reading, Massachusetts / Menlo Park, California / Don Mills, Ontario / Wokingham, England / Amsterdam / Sydney / Singapore / Tokyo / Mexico City / Bogota / Santiago / San Juan

This book is in the Addison-Wesley Series in Statistics.
Frederick Mosteller, Consulting Editor

Library of Congress Cataloging in Publication Data
DeGroot, Morris H.
Probability and statistics.
Bibliography: p.
Includes index.
1. Probabilities. 2. Mathematical statistics. I. Title.
QA273.D35 1984 519.2 84-6269
ISBN 0-201-11366-X

Reprinted with corrections, September 1989

Copyright © 1975, 1986 by Addison-Wesley Publishing Company, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America. Published simultaneously in Canada.

14 15 16 17 18 19 20 MA 95 94 93

Preface

[...] problem, choosing the best, utility and preferences among gambles, and the Borel-Kolmogorov paradox. These topics are treated in a completely elementary fashion, but they can be omitted without loss of continuity if time is limited. Sections of the book that can be so omitted are indicated, in the traditional way, by asterisks in the Contents and in the text.

The last five chapters of the book are devoted to statistical inference. The coverage here is modern in outlook. Both classical and Bayesian statistical methods are developed in an integrated presentation. No single school of thought is treated in a dogmatic fashion. My goal is to equip the student with the theory and methodology that have proved to be useful in the past and promise to be useful in the future.
These chapters contain a comprehensive but elementary survey of estimation, testing hypotheses, nonparametric methods, multiple regression, and the analysis of variance. The strengths and weaknesses and the advantages and disadvantages of such basic concepts as maximum likelihood estimation, Bayesian decision procedures, unbiased estimation, confidence intervals, and levels of significance are discussed from a contemporary viewpoint. Special features of these chapters include discussions of prior and posterior distributions, sufficient statistics, Fisher information, the delta method, the Bayesian analysis of samples from a normal distribution, unbiased tests, multidecision problems, tests of goodness-of-fit, contingency tables, Simpson's paradox, inferences about the median and other quantiles, robust estimation and trimmed means, confidence bands for a regression line, and the regression fallacy. If time does not permit complete coverage of the contents of these chapters, any of the following sections can be omitted without loss of continuity: 7.6, 7.8, 8.3, 9.6, 9.7, 9.8, 9.9, and 9.10.

In summary, the main changes in this second edition are new sections or subsections on statistical swindles, choosing the best, the Borel-Kolmogorov paradox, the correction for continuity, the delta method, unbiased tests, Simpson's paradox, confidence bands for a regression line, and the regression fallacy, as well as a new section of supplementary exercises at the end of each chapter. The material introducing random variables and their distributions has been thoroughly revised, and minor changes, additions, and deletions have been made throughout the text.

Although a computer can be a valuable adjunct in a course in probability and statistics such as this one, none of the exercises in this book requires access to a computer or a knowledge of programming. For this reason, the use of this book is not tied to a computer in any way.
Instructors are urged, however, to use computers in the course as much as is feasible. A small calculator is a helpful aid for solving some of the numerical exercises in the second half of the book.

One further point about the style in which the book is written should be emphasized. The pronoun “he” is used throughout the book in reference to a person who is confronted with a statistical problem. This usage certainly does not mean that only males calculate probabilities and make decisions, or that only males can be statisticians. The word “he” is used quite literally as defined in Webster's Third New International Dictionary to mean “that one whose sex is unknown or immaterial.” The field of statistics should certainly be as accessible to women as it is to men. It should certainly be as accessible to members of minority groups as it is to the majority. It is my sincere hope that this book will help create among all groups an awareness and appreciation of probability and statistics as an interesting, lively, and important branch of science.

I am indebted to the readers, instructors, and colleagues whose comments have strengthened this edition. Marion Reynolds, Jr. of Virginia Polytechnic Institute and James Stapleton of Michigan State University reviewed the manuscript for the publisher and made many valuable suggestions. I am grateful to the Literary Executor of the late Sir Ronald A. Fisher, F.R.S., to Dr. Frank Yates, F.R.S., and the Longman Group Ltd., London, for permission to adapt Table III of their book Statistical Tables for Biological, Agricultural and Medical Research (6th Edition, 1974).

The field of statistics has grown and changed since I wrote a Preface for the first edition of this book in November, 1974, and so have I. The influence on my life and work of those who made that first edition possible remains vivid and undiminished; but with growth and change have come new influences as well, both personal and professional.
The love, warmth, and support of my family and friends, old and new, have sustained and stimulated me, and enabled me to write a book that I believe reflects contemporary probability and statistics.

Pittsburgh, Pennsylvania
October 1985
M.H.D.

Contents

[...]
  *2.5 Choosing the Best
  2.6 Supplementary Exercises

3 Random Variables and Distributions
  3.1 Random Variables and Discrete Distributions
  3.2 Continuous Distributions
  3.3 The Distribution Function
  3.4 Bivariate Distributions
  3.5 Marginal Distributions
  3.6 Conditional Distributions
  3.7 Multivariate Distributions
  3.8 Functions of a Random Variable
  3.9 Functions of Two or More Random Variables
  *3.10 The Borel-Kolmogorov Paradox
  3.11 Supplementary Exercises

4 Expectation
  4.1 The Expectation of a Random Variable
  4.2 Properties of Expectations
  4.3 Variance
  4.4 Moments
  4.5 The Mean and the Median
  4.6 Covariance and Correlation
  4.7 Conditional Expectation
  4.8 The Sample Mean
  *4.9 Utility
  4.10 Supplementary Exercises

5 Special Distributions
  5.1 Introduction
  5.2 The Bernoulli and Binomial Distributions
  5.3 The Hypergeometric Distribution
  5.4 The Poisson Distribution
  5.5 The Negative Binomial Distribution
  5.6 The Normal Distribution
  5.7 The Central Limit Theorem
  5.8 The Correction for Continuity
  5.9 The Gamma Distribution
  5.10 The Beta Distribution
  5.11 The Multinomial Distribution
  5.12 The Bivariate Normal Distribution
  5.13 Supplementary Exercises

6 Estimation
  6.1 Statistical Inference
  6.2 Prior and Posterior Distributions
  6.3 Conjugate Prior Distributions
  6.4 Bayes Estimators
  6.5 Maximum Likelihood Estimators
  6.6 Properties of Maximum Likelihood Estimators
  6.7 Sufficient Statistics
  6.8 Jointly Sufficient Statistics
  6.9 Improving an Estimator
  6.10 Supplementary Exercises

7 Sampling Distributions of Estimators
  7.1 The Sampling Distribution of a Statistic
  7.2 The Chi-Square Distribution
  7.3 Joint Distribution of the Sample Mean and Sample Variance
  7.4 The t Distribution
  7.5 Confidence Intervals
  *7.6 Bayesian Analysis of Samples from a Normal Distribution
  7.7 Unbiased Estimators
  *7.8 Fisher Information
  7.9 Supplementary Exercises

8 Testing Hypotheses
  8.1 Problems of Testing Hypotheses
  8.2 Testing Simple Hypotheses
  *8.3 Multidecision Problems
[...]
  Random Digits
  Poisson Probabilities
  The Standard Normal Distribution Function
  The $\chi^2$ Distribution
  The t Distribution
  0.95 Quantile of the F Distribution
  0.975 Quantile of the F Distribution

Answers to Even-Numbered Exercises
Index

1 Introduction to Probability

1.1. THE HISTORY OF PROBABILITY

The concepts of chance and uncertainty are as old as civilization itself. People have always had to cope with uncertainty about the weather, their food supply, and other aspects of their environment, and have strived to reduce this uncertainty and its effects. Even the idea of gambling has a long history. By about the year 3500 B.C., games of chance played with bone objects that could be considered precursors of dice were apparently highly developed in Egypt and elsewhere. Cubical dice with markings virtually identical to those on modern dice have been found [...]. We know that gambling with dice has been popular ever since that time and played an important part in the early development of probability theory.

It is generally believed that the mathematical theory of probability was started by the French mathematicians Blaise Pascal (1623-1662) and Pierre Fermat (1601-1665) when they succeeded in deriving exact probabilities for certain gambling problems involving dice.
Some of the problems that they solved had been outstanding for about 300 years. However, numerical probabilities of various dice combinations had been calculated previously by Girolamo Cardano (1501-1576) and by Galileo Galilei (1564-1642).

The theory of probability has been developed steadily since the seventeenth century and has been widely applied in diverse fields of study. Today, probability theory is an important tool in most areas of engineering, science, and management. Many research workers are actively engaged in the discovery and establishment of new applications of probability in fields such as medicine, meteorology, photography from spaceships, marketing, earthquake prediction, human behavior, [...]

1.2. INTERPRETATIONS OF PROBABILITY

[...] under similar conditions. For example, the probability of obtaining a head when a coin is tossed is considered to be 1/2 because the relative frequency of heads should be approximately 1/2 when the coin is tossed a large number of times under similar conditions. In other words, it is assumed that the proportion of [...]

Of course, the conditions mentioned in this example are too vague to serve as the basis for a scientific definition of probability. First, a “large number” of tosses of the coin is specified, but there is no definite indication of an actual number that would be considered large enough. Second, it is stated that the coin should be tossed each time “under similar conditions,” but these conditions are not described precisely. The conditions under which the coin is tossed must not be completely identical for each toss because the outcomes would then be the same, and there would be either all heads or all tails. In fact, a skilled person can toss a coin into the air repeatedly and catch it in such a way that a head is obtained on almost every toss. Hence, the tosses must not be completely controlled but must have some “random” features.
Furthermore, it is stated that the relative frequency of heads should be “approximately 1/2,” but no limit is specified for the permissible variation from 1/2. If a coin were tossed 1,000,000 times, we would not expect to obtain exactly 500,000 heads. Indeed, we would be extremely surprised if we obtained exactly 500,000 heads. On the other hand, neither would we expect the number of heads to be very far from 500,000. It would be desirable to be able to make a precise statement of the likelihoods of the different possible numbers of heads, but these likelihoods would of necessity depend on the very concept of probability that we are trying to define.

Another shortcoming of the frequency interpretation of probability is that it applies only to a problem in which there can be, at least in principle, a large number of similar repetitions of a certain process. Many important problems are not of this type. For example, the frequency interpretation of probability cannot be applied directly to the probability that a specific acquaintance will get married within the next two years or to the probability that a particular medical research project will lead to the development of a new treatment for a certain disease within a specified period of time.

The Classical Interpretation of Probability

The classical interpretation of probability is based on the concept of equally likely outcomes. For example, when a coin is tossed, there are two possible outcomes: a head or a tail. If it may be assumed that these outcomes are equally likely to occur, then they must have the same probability. Since the sum of the probabilities must be 1, both the probability of a head and the probability of a tail must be 1/2. More generally, if the outcome of some process must be one of n different outcomes, and if these n outcomes are equally likely to occur, then the probability of each outcome is 1/n.
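
The two interpretations described so far can be compared numerically. The following Python sketch is an illustration only and is not part of the original text; the function names, the seed, and the use of a pseudorandom generator are all assumptions. It computes the relative frequency of heads for several numbers of simulated tosses and the classical value 1/n for n equally likely outcomes. Note that the simulation itself presupposes a generator in which heads has probability 1/2, which echoes the circularity pointed out above.

```python
import random
from fractions import Fraction


def relative_frequency_of_heads(num_tosses, seed=0):
    """Simulate num_tosses tosses and return the proportion of heads.

    Assumption: each simulated toss gives a head when a pseudorandom
    number in [0, 1) is below 0.5. The seed is fixed for repeatability.
    """
    rng = random.Random(seed)
    heads = sum(1 for _ in range(num_tosses) if rng.random() < 0.5)
    return heads / num_tosses


def classical_probability(num_outcomes):
    """Probability of each outcome when num_outcomes are equally likely."""
    return Fraction(1, num_outcomes)


if __name__ == "__main__":
    # Frequency interpretation: the proportion of heads for growing n.
    for n in (10, 1000, 100000):
        print(n, relative_frequency_of_heads(n))

    # Classical interpretation: 1/n for a coin (n = 2) and a die (n = 6).
    print(classical_probability(2), classical_probability(6))
```

The printed proportions will generally differ from 1/2, and no run of this kind says how close to 1/2 is close enough, which is the difficulty the text raises about the phrase “approximately 1/2.”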
Two basic difficulties arise when an attempt is made to develop a formal definition of probability from the classical interpretation. First, the concept of equally likely outcomes is essentially based on the concept of probability that we are trying to define. The statement that two possible outcomes are equally likely to occur is the same as the statement that two outcomes have the same probability. Second, no systematic method is given for assigning probabilities to outcomes that are not assumed to be equally likely. When a coin is tossed, or a well-balanced die is rolled, or a card is chosen from a well-shuffled deck of cards, the different possible outcomes can usually be regarded as equally likely because of the nature of the process. However, when the problem is to guess whether an acquaintance will get married or whether a research project will be successful, the possible outcomes would not typically be considered to be equally likely, and a different method is needed for assigning probabilities to these outcomes.

The Subjective Interpretation of Probability

According to the subjective, or personal, interpretation of probability, the probability that a person assigns to a possible outcome of some process represents his own judgment of the likelihood that the outcome will be obtained. This judgment will be based on that person's beliefs and information about the process. Another person, who may have different beliefs or different information, may assign a different probability to the same outcome. For this reason, it is appropriate to speak of a certain person's subjective probability of an outcome, rather than to speak of the true probability of that outcome.

As an illustration of this interpretation, suppose that a coin is to be tossed once. A person with no special information about the coin or the way in which it is tossed might regard a head and a tail to be equally likely outcomes. That person would then assign a subjective probability of 1/2 to the possibility of obtaining a head.
The person who is actually tossing the coin, however, might feel that a head is much more likely to be obtained than a tail. In order that this person may be able to assign subjective probabilities to the outcomes, he must express the strength of his belief in numerical terms. Suppose, for example, that he regards the likelihood of obtaining a head to be the same as the likelihood of obtaining a red card when one card is chosen from a well-shuffled deck containing four red cards and one black card. Since the person would assign a probability of 4/5 to the possibility of obtaining a red card, he should also assign a probability of 4/5 to the possibility of obtaining a head when the coin is tossed.

This subjective interpretation of probability can be formalized. In general, if a person's judgments of the relative likelihoods of various combinations of outcomes satisfy certain conditions of consistency, then it can be shown that his [...]

1. In an experiment in which a coin is to be tossed 10 times, the experimenter might want to determine the probability that at least 4 heads will be obtained.
2. In an experiment in which a sample of 1000 transistors is to be selected from a large shipment of similar items and each selected item is to be inspected, a person might want to determine the probability that not more than one of the selected transistors will be defective.
3. In an experiment in which the air temperature at a certain location is to be observed every day at noon for 90 successive days, a person might want to determine the probability that the average temperature during this period will be less than some specified value.
4. From information relating to the life of Thomas Jefferson, a certain person might want to determine the probability that Jefferson was born in the year 1741.
5. In evaluating an industrial research and development project at a certain time, a person might want to determine the probability that the project will result in the successful development of a new product within a specified number of months.

It can be seen from these examples that the possible outcomes of an experiment may be either random or nonrandom, in accordance with the usual meanings of those terms. The interesting feature of an experiment is that each of its possible outcomes can be specified before the experiment is performed, and probabilities can be assigned to various combinations of outcomes that are of interest.

The Mathematical Theory of Probability

As was explained in Section 1.2, there is controversy in regard to the proper meaning and interpretation of some of the probabilities that are assigned to the outcomes of many experiments. However, once probabilities have been assigned to some simple outcomes in an experiment, there is complete agreement among all authorities that the mathematical theory of probability provides the appropriate methodology for the further study of these probabilities. Almost all work in the mathematical theory of probability, from the most elementary textbooks to the most advanced research, has been related to the following two problems: (i) methods for determining the probabilities of certain events from the specified probabilities of each possible outcome of an experiment and (ii) methods for revising the probabilities of events when additional relevant information is obtained.

These methods are based on standard mathematical techniques. The purpose of the first five chapters of this book is to present these techniques which, together, form the mathematical theory of probability.

1.4. SET THEORY

The Sample Space

The collection of all possible outcomes of an experiment is called the sample space of the experiment.
In other words, the sample space of an experiment can be thought of as a set, or collection, of different possible outcomes; and each outcome can be thought of as a point, or an element, in the sample space. Because of this interpretation, the language and concepts of set theory provide a natural context for the development of probability theory. The basic ideas and notation of set theory will now be reviewed.

Relations of Set Theory

Let $S$ denote the sample space of some experiment. Then any possible outcome $s$ of the experiment is said to be a member of the space $S$, or to belong to the space $S$. The statement that $s$ is a member of $S$ is denoted symbolically by the relation $s \in S$.

When an experiment has been performed and we say that some event has occurred, we mean that the outcome of the experiment satisfied certain conditions which specified that event. In other words, some outcomes in the space $S$ signify that the event occurred, and all other outcomes in $S$ signify that the event did not occur. In accordance with this interpretation, any event can be regarded as a certain subset of possible outcomes in the space $S$.

For example, when a six-sided die is rolled, the sample space can be regarded as containing the six numbers 1, 2, 3, 4, 5, 6. Symbolically, we write

$S = \{1, 2, 3, 4, 5, 6\}$.

The event $A$ that an even number is obtained is defined by the subset $A = \{2, 4, 6\}$. The event $B$ that a number greater than 2 is obtained is defined by the subset $B = \{3, 4, 5, 6\}$.

It is said that an event $A$ is contained in another event $B$ if every outcome that belongs to the subset defining the event $A$ also belongs to the subset defining the event $B$. This relation between two events is expressed symbolically by the relation $A \subset B$. The relation $A \subset B$ is also expressed by saying that $A$ is a subset of $B$. Equivalently, if $A \subset B$, we may say that $B$ contains $A$ and may write $B \supset A$.

[...]

The union of $n$ events $A_1, \ldots, A_n$ is defined to be the event that contains all outcomes which belong to at least one of these $n$ events. The notation for this union is either $A_1 \cup A_2 \cup \cdots \cup A_n$ or $\bigcup_{i=1}^{n} A_i$. Similarly, the notation for the union of an infinite sequence of events $A_1, A_2, \ldots$ is $\bigcup_{i=1}^{\infty} A_i$. The notation for the union of an arbitrary collection of events $A_i$, where the values of the subscript $i$ belong to some index set $I$, is $\bigcup_{i \in I} A_i$.

The union of three events $A$, $B$, and $C$ can be calculated either directly from the definition of $A \cup B \cup C$ or by first evaluating the union of any two of the events and then forming the union of this combination of events and the third event. In other words, the following associative relations are satisfied:

$A \cup B \cup C = (A \cup B) \cup C = A \cup (B \cup C)$.

Intersections. If $A$ and $B$ are any two events, the intersection of $A$ and $B$ is defined to be the event that contains all outcomes which belong both to $A$ and to $B$. The notation for the intersection of $A$ and $B$ is $A \cap B$. The event $A \cap B$ is sketched in a Venn diagram in Fig. 1.2. It is often convenient to denote the intersection of $A$ and $B$ by the symbol $AB$ instead of $A \cap B$, and we shall use these two types of notation interchangeably.

Figure 1.2 The event $A \cap B$.

For any events $A$ and $B$, the intersection has the following properties:

$A \cap B = B \cap A$, $A \cap A = A$, $A \cap \emptyset = \emptyset$, $A \cap S = A$.

Furthermore, if $A \subset B$, then $A \cap B = A$.

The intersection of $n$ events $A_1, \ldots, A_n$ is defined to be the event that contains the outcomes which are common to all these $n$ events. The notation for this intersection is $A_1 \cap A_2 \cap \cdots \cap A_n$, or $\bigcap_{i=1}^{n} A_i$, or $A_1 A_2 \cdots A_n$. Similar notations are used for the intersection of an infinite sequence of events or for the intersection of an arbitrary collection of events.

For any three events $A$, $B$, and $C$, the following associative relations are satisfied:

$A \cap B \cap C = (A \cap B) \cap C = A \cap (B \cap C)$.

Figure 1.3 The event $A^c$.

Complements. The complement of an event $A$ is defined to be the event that contains all outcomes in the sample space $S$ which do not belong to $A$. The notation for the complement of $A$ is $A^c$.
The event $A^c$ is sketched in Fig. 1.3. For any event $A$, the complement has the following properties:

$(A^c)^c = A$, $\emptyset^c = S$, $S^c = \emptyset$, $A \cup A^c = S$, $A \cap A^c = \emptyset$.

Disjoint Events. It is said that two events $A$ and $B$ are disjoint, or mutually exclusive, if $A$ and $B$ have no outcomes in common. It follows that $A$ and $B$ are disjoint if and only if $A \cap B = \emptyset$. It is said that the events in an arbitrary collection of events are disjoint if no two events in the collection have any outcomes in common.

As an illustration of these concepts, a Venn diagram for three events $A_1$, $A_2$, and $A_3$ is presented in Fig. 1.4. This diagram indicates that the various intersections of $A_1$, $A_2$, and $A_3$ and their complements will partition the sample space $S$ into eight disjoint subsets.

Example 1: Tossing a Coin. Suppose that a coin is tossed three times. Then the sample space $S$ contains the following eight possible outcomes $s_1, \ldots, s_8$:

$s_1$: HHH, $s_2$: THH, $s_3$: HTH, $s_4$: HHT, $s_5$: HTT, $s_6$: THT, $s_7$: TTH, $s_8$: TTT.
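
The sample space of Example 1 and the set relations reviewed in this section can be checked mechanically. The following Python sketch is an illustration only and is not part of the original text. The two events it defines (at least two heads, and a head on the first toss) are hypothetical choices made for this sketch, as are all of the variable and function names.

```python
from itertools import product

# Sample space S for three tosses of a coin: all strings of length 3
# over the letters H and T, giving the eight outcomes of Example 1.
sample_space = {"".join(tosses) for tosses in product("HT", repeat=3)}

# Two hypothetical events, each a subset of the sample space.
at_least_two_heads = {s for s in sample_space if s.count("H") >= 2}
first_toss_head = {s for s in sample_space if s[0] == "H"}


def complement(event, space=sample_space):
    """Return the outcomes in the sample space that are not in the event."""
    return space - event


# Union and intersection of the two events.
union = at_least_two_heads | first_toss_head
intersection = at_least_two_heads & first_toss_head

if __name__ == "__main__":
    print(sorted(sample_space))
    print(sorted(union))
    print(sorted(intersection))

    # Properties of complements stated in the text, for A = at_least_two_heads.
    A = at_least_two_heads
    print(complement(complement(A)) == A)      # (A^c)^c = A
    print(A | complement(A) == sample_space)   # A union A^c = S
    print(A & complement(A) == set())          # A and A^c are disjoint
```

Python's built-in set type is used because its operators `|`, `&`, and `-` correspond directly to union, intersection, and set difference, so each relation in the text can be written as a single comparison.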