Statistical Inference
Second Edition

George Casella
University of Florida

Roger L. Berger
North Carolina State University

Duxbury Advanced Series
Duxbury / Thomson Learning
Australia • Canada • Mexico • Singapore • Spain • United Kingdom • United States

To Anne and Vicki

Duxbury titles of related interest

Daniel, Applied Nonparametric Statistics, 2nd
Derr, Statistical Consulting: A Guide to Effective Communication
Durrett, Probability: Theory and Examples, 2nd
Graybill, Theory and Application of the Linear Model
Johnson, Applied Multivariate Methods for Data Analysts
Kuehl, Design of Experiments: Statistical Principles of Research Design and Analysis, 2nd
Larsen, Marx, & Cooil, Statistics for Applied Problem Solving and Decision Making
Lohr, Sampling: Design and Analysis
Lunneborg, Data Analysis by Resampling: Concepts and Applications
Minh, Applied Probability Models
Minitab Inc., MINITAB™ Student Version 12 for Windows
Myers, Classical and Modern Regression with Applications, 2nd
Newton & Harvill, StatConcepts: A Visual Tour of Statistical Ideas
Ramsey & Schafer, The Statistical Sleuth, 2nd
SAS Institute Inc., JMP-IN: Statistical Discovery Software
Savage, INSIGHT: Business Analysis Software for Microsoft® Excel
Scheaffer, Mendenhall, & Ott, Elementary Survey Sampling, 5th
Shapiro, Modeling the Supply Chain
Winston, Simulation Modeling Using

To order copies contact your local bookstore or call 1-800-354-9706. For more information contact Duxbury Press at 511 Forest Lodge Road, Pacific Grove, CA 93950, or go to: www.duxbury.com

Preface to the Second Edition

[...] a new section on p-values. In Chapter 9 we now put more emphasis on pivoting (having realized that “guaranteeing an interval” was merely “pivoting the cdf”).
Also, the material that was in Chapter 10 of the first edition (decision theory) has been reduced, and small sections on loss function optimality of point estimation, hypothesis testing, and interval estimation have been added to the appropriate chapters.

Chapter 10 is entirely new and attempts to lay out the fundamentals of large sample inference, including the delta method, consistency and asymptotic normality, bootstrapping, robust estimators, score tests, etc. Chapter 11 is classic oneway ANOVA and linear regression (which was covered in two different chapters in the first edition). Unfortunately, coverage of randomized block designs has been eliminated for space reasons. Chapter 12 covers regression with errors-in-variables and contains new material on robust and logistic regression.

After teaching from the first edition for a number of years, we know (approximately) what can be covered in a one-year course. From the second edition, it should be possible to cover the following in one year:

Chapter 1: Sections 1-7
Chapter 2: Sections 1-3
Chapter 3: Sections 1-6
Chapter 4: Sections 1-7
Chapter 5: Sections 1-6
Chapter 6: Sections 1-3
Chapter 7: Sections 1-3
Chapter 8: Sections 1-3
Chapter 9: Sections 1-3
Chapter 10: Sections 1, 3, 4

Classes that begin the course with some probability background can cover more material from the later chapters.

Finally, it is almost impossible to thank all of the people who have contributed in some way to making the second edition a reality (and help us correct the mistakes in the first edition). To all of our students, friends, and colleagues who took the time to send us a note or an e-mail, we thank you. A number of people made key suggestions that led to substantial changes in presentation. Sometimes these suggestions were just short notes or comments, and some were longer reviews. Some were so long ago that their authors may have forgotten, but we haven't.
So thanks to Arthur Cohen, Sir David Cox, Steve Samuels, Rob Strawderman and Tom Wehrly. We also owe much to Jay Beder, who has sent us numerous comments and suggestions over the years and possibly knows the first edition better than we do, and to Michael Perlman and his class, who are sending comments and corrections even as we write this.

This book has seen a number of editors. We thank Alex Kugashev, who in the mid-1990s first suggested doing a second edition, and our editor, Carolyn Crockett, who constantly encouraged us. Perhaps the one person (other than us) who is most responsible for this book is our first editor, John Kimmel, who encouraged, published, and marketed the first edition. Thanks, John.

George Casella
Roger L. Berger

Preface to the First Edition

When someone discovers that you are writing a textbook, one (or both) of two questions will be asked. The first is “Why are you writing a book?” and the second is “How is your book different from what's out there?” The first question is fairly easy to answer. You are writing a book because you are not entirely satisfied with the available texts. The second question is harder to answer. The answer can't be put in a few sentences so, in order not to bore your audience (who may be asking the question only out of politeness), you try to say something quick and witty. It usually doesn't work.

The purpose of this book is to build theoretical statistics (as different from mathematical statistics) from the first principles of probability theory. Logical development, proofs, ideas, themes, etc., evolve through statistical arguments. Thus, starting from the basics of probability, we develop the theory of statistical inference using techniques, definitions, and concepts that are statistical and are natural extensions and consequences of previous concepts. When this endeavor was started, we were not sure how well it would work. The final judgment of our success is, of course, left to the reader.
The book is intended for first-year graduate students majoring in statistics or in a field where a statistics concentration is desirable. The prerequisite is one year of calculus. (Some familiarity with matrix manipulations would be useful, but is not essential.) The book can be used for a two-semester, or three-quarter, introductory course in statistics.

The first four chapters cover basics of probability theory and introduce many fundamentals that are later necessary. Chapters 5 and 6 are the first statistical chapters. Chapter 5 is transitional (between probability and statistics) and can be the starting point for a course in statistical theory for students with some probability background. Chapter 6 is somewhat unique, detailing three statistical principles (sufficiency, likelihood, and invariance) and showing how these principles are important in modeling data. Not all instructors will cover this chapter in detail, although we strongly recommend spending some time here. In particular, the likelihood and invariance principles are treated in detail. Along with the sufficiency principle, these principles, and the thinking behind them, are fundamental to total statistical understanding.

Chapters 7-9 represent the central core of statistical inference, estimation (point and interval) and hypothesis testing. A major feature of these chapters is the division into methods of finding appropriate statistical techniques and methods of evaluating these techniques. Finding and evaluating are of interest to both the theorist and the

[...]

Chapter   Sections
9         9.1, 9.2.1, 9.2.2, 9.2.4, 9.3.1, 9.4
11        11.1, 11.2
12        12.1, 12.2

If time permits, there can be some discussion (with little emphasis on details) of the material in Sections 4.4, 5.5, and 6.1.2, 6.1.3, 6.1.4. The material in Sections 11.3 and 12.3 may also be considered.

The exercises have been gathered from many sources and are quite plentiful.
We feel that, perhaps, the only way to master this material is through practice, and thus we have included much opportunity to do so. The exercises are as varied as we could make them, and many of them illustrate points that are either new or complementary to the material in the text. Some exercises are even taken from research papers. (It makes you feel old when you can include exercises based on papers that were new research during your own student days!) Although the exercises are not subdivided like the chapters, their ordering roughly follows that of the chapter. (Subdivisions often give too many hints.) Furthermore, the exercises become (again, roughly) more challenging as their numbers become higher.

As this is an introductory book with a relatively broad scope, the topics are not covered in great depth. However, we felt some obligation to guide the reader one step further in the topics that may be of interest. Thus, we have included many references, pointing to the path to deeper understanding of any particular topic. (The Encyclopedia of Statistical Sciences, edited by Kotz, Johnson, and Read, provides a fine introduction to many topics.)

To write this book, we have drawn on both our past teachings and current work. We have also drawn on many people, to whom we are extremely grateful. We thank our colleagues at Cornell, North Carolina State, and Purdue, in particular Jim Berger, Larry Brown, Sir David Cox, Ziding Feng, Janet Johnson, Leon Gleser, Costas Goutis, Dave Lansky, George McCabe, Chuck McCulloch, Myra Samuels, Steve Schwager, and Shayle Searle, who have given their time and expertise in reading parts of this manuscript, offered assistance, and taken part in many conversations leading to constructive suggestions. We also thank Shanti Gupta for his hospitality, and the library at Purdue, which was essential.
We are grateful for the detailed reading and helpful suggestions of Shayle Searle and of our reviewers, both anonymous and nonanonymous (Jim Albert, Dan Coster, and Tom Wehrly). We also thank David Moore and George McCabe for allowing us to use their tables, and Steve Hirdt for supplying us with data. Since this book was written by two people who, for most of the time, were at least 600 miles apart, we lastly thank Bitnet for making this entire thing possible.

George Casella
Roger L. Berger

“We have got to the deductions and the inferences,” said Lestrade, winking at me. “I find it hard enough to tackle facts, Holmes, without flying away after theories and fancies.”
Inspector Lestrade to Sherlock Holmes
The Boscombe Valley Mystery

Contents

[...]

3.6 Inequalities and Identities
    3.6.1 Probability Inequalities
    3.6.2 Identities
3.7 Exercises
3.8 Miscellanea

4 Multiple Random Variables
4.1 Joint and Marginal Distributions
4.2 Conditional Distributions and Independence
4.3 Bivariate Transformations
4.4 Hierarchical Models and Mixture Distributions
4.5 Covariance and Correlation
4.6 Multivariate Distributions
4.7 Inequalities
    4.7.1 Numerical Inequalities
    4.7.2 Functional Inequalities
4.8 Exercises
4.9 Miscellanea

5 Properties of a Random Sample
5.1 Basic Concepts of Random Samples
5.2 Sums of Random Variables from a Random Sample
5.3 Sampling from the Normal Distribution
    5.3.1 Properties of the Sample Mean and Variance
    5.3.2 The Derived Distributions: Student's t and Snedecor's F
5.4 Order Statistics
5.5 Convergence Concepts
    5.5.1 Convergence in Probability
    5.5.2 Almost Sure Convergence
    5.5.3 Convergence in Distribution
    5.5.4 The Delta Method
5.6 Generating a Random Sample
    5.6.1 Direct Methods
    5.6.2 Indirect Methods
    5.6.3 The Accept/Reject Algorithm
5.7 Exercises
5.8 Miscellanea

6 Principles of Data Reduction
6.1 Introduction
6.2 The Sufficiency Principle
    6.2.1 Sufficient Statistics
    6.2.2 Minimal Sufficient Statistics
    6.2.3 Ancillary Statistics
    6.2.4 Sufficient, Ancillary, and Complete Statistics
6.3 The Likelihood Principle
    6.3.1 The Likelihood Function
    6.3.2 The Formal Likelihood Principle
6.4 The Equivariance Principle
6.5 Exercises
6.6 Miscellanea

7 Point Estimation
7.1 Introduction
7.2 Methods of Finding Estimators
    7.2.1 Method of Moments
    7.2.2 Maximum Likelihood Estimators
    7.2.3 Bayes Estimators
    7.2.4 The EM Algorithm
7.3 Methods of Evaluating Estimators
    7.3.1 Mean Squared Error
    7.3.2 Best Unbiased Estimators
    7.3.3 Sufficiency and Unbiasedness
    7.3.4 Loss Function Optimality
7.4 Exercises
7.5 Miscellanea

8 Hypothesis Testing
8.1 Introduction
8.2 Methods of Finding Tests
    8.2.1 Likelihood Ratio Tests
    8.2.2 Bayesian Tests
    8.2.3 Union-Intersection and Intersection-Union Tests
8.3 Methods of Evaluating Tests
    8.3.1 Error Probabilities and the Power Function
    8.3.2 Most Powerful Tests
    8.3.3 Sizes of Union-Intersection and Intersection-Union Tests
    8.3.4 p-Values
    8.3.5 Loss Function Optimality
8.4 Exercises
8.5 Miscellanea

9 Interval Estimation
9.1 Introduction
9.2 Methods of Finding Interval Estimators
    9.2.1 Inverting a Test Statistic
    9.2.2 Pivotal Quantities
    9.2.3 Pivoting the CDF
    9.2.4 Bayesian Intervals

[...]

12 Regression Models
12.1 Introduction
12.2 Regression with Errors in Variables
    12.2.1 Functional and Structural Relationships
    12.2.2 A Least Squares Solution
    12.2.3 Maximum Likelihood Estimation
    12.2.4 Confidence Sets
12.3 Logistic Regression
    12.3.1 The Model
    12.3.2 Estimation
12.4 Robust Regression
12.5 Exercises
12.6 Miscellanea

Appendix: Computer Algebra
Table of Common Distributions
References
Author Index
Subject Index

List of Tables

7.3.1 8.3.1
9.2.1 9.2.2 10.1.1 10.2.1 10.3.1 10.4.1 10.4.2 11.2.1 11.3.1 11.3.2 12.3.1 12.4.1

Number of arrangements
Values of the joint pmf f(x,y)
Three estimators for a binomial p
Counts of leukemia cases
Two types of errors in hypothesis testing
Location-scale pivots
Sterne's acceptance region and confidence set
Three 90% normal confidence intervals
Bootstrap and Delta Method variances
Median/mean asymptotic relative efficiencies
Huber estimators
Huber estimator asymptotic relative efficiencies, k = 1.5
Poisson LRT statistic
Power of robust tests
Confidence coefficient for a pivotal interval
Confidence coefficients for intervals based on Huber's M-estimator
ANOVA table for oneway classification
Data pictured in Figure 11.3.1
ANOVA table for simple linear regression
Challenger data
Potoroo data
Regression M-estimator asymptotic relative efficiencies

List of Figures

7.3.1 Binomial MSE comparison
7.3.2 Risk functions for variance estimators
8.2.1 LRT statistic
8.3.1 Power functions for Example 8.3.2
8.3.2 Power functions for Example 8.3.3
8.3.3 Power functions for three tests in Example 8.3.19
8.3.4 Risk function for test in Example 8.3.31
9.2.1 Confidence interval-acceptance region relationship
9.2.2 Acceptance region and confidence interval for Example 9.2.3
9.2.3 Credible and confidence intervals from Example 9.2.16
9.2.4 Credible probabilities of the intervals from Example 9.2.16
9.2.5 Coverage probabilities of the intervals from Example 9.2.16
9.3.1 Three interval estimators from Example 9.2.16
10.1.1 Asymptotic relative efficiency for gamma mean estimators
10.3.1 Poisson LRT histogram
10.4.1 LRT intervals for a binomial proportion
10.4.2 Coverage probabilities for nominal .9 binomial confidence procedures
11.3.1 Vertical distances that are measured by RSS
11.3.2 Geometric description of the BLUE
11.3.3 Scheffé bands, t interval, and Bonferroni intervals
12.2.1 Distance minimized by orthogonal least squares
12.2.2 Three regression lines
12.2.3 Creasy-Williams F statistic
12.3.1 Challenger data logistic curve
12.4.1 Least squares, LAD, and M-estimate fits

List of Examples

1.1.3 Event operations
1.2.2 Sigma algebra-I
1.2.3 Sigma algebra-II
1.2.5 Defining probabilities-I
1.2.7 Defining probabilities-II
1.2.10 Bonferroni's Inequality
1.2.12 Lottery-I
1.2.13 Tournament
1.2.15 Lottery-II
1.2.18 Poker
1.2.19 Sampling with replacement
1.2.20 Calculating an average
1.3.1 Four aces
1.3.3 Continuation of Example 1.3.1
1.3.4 Three prisoners
1.3.6 Coding
1.3.8 Chevalier de Meré
1.3.10 Tossing two dice
1.3.11 Letters
1.3.13 Three coin tosses-I
1.4.2 Random variables
1.4.3 Three coin tosses-II
1.4.4 Distribution of a random variable
1.5.2 Tossing three coins
1.5.4 Tossing for a head
1.5.5 Continuous cdf
1.5.6 Cdf with jumps
1.5.9 Identically distributed random variables
1.6.2 Geometric probabilities
1.6.4 Logistic probabilities
2.1.1 Binomial transformation
2.1.2 Uniform transformation
2.1.4 Uniform-exponential relationship-I
2.1.6 Inverted gamma pdf