




























































































Statistical Inference Text





























































































Everything existing in the universe is the fruit of chance. Democritus, the 5th Century BC
Statistics is a discipline that provides a methodology for making inferences from real random data about the parameters of the probabilistic models that are believed to generate such data. The position of statistics in relation to real-world data and the corresponding mathematical models of probability theory is presented in the following diagram. The following is a list of a few of the many phenomena to which randomness is attributed.
1.2 Motivating Example
Let X denote the number of particles that will be emitted from a radioactive source in the next one minute period. We know that X will turn out to be equal to one of the non-negative integers but, apart from that, we know nothing about which of the possible values are more or less likely to occur. The quantity X is said to be a random variable. Suppose we are told that the random variable X has a Poisson distribution with parameter θ = 2. Then, if x is some non-negative integer, we know that the probability that the random variable X takes the value x is given by the formula
$$P(X = x) = \frac{\theta^{x} \exp(-\theta)}{x!}$$
where θ = 2. So, for instance, the probability that X takes the value x = 4 is
$$P(X = 4) = \frac{2^{4} \exp(-2)}{4!} = 0.0902.$$
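The value 0.0902 can be checked numerically. The following is a minimal sketch in Python; the function name `poisson_pmf` is an illustrative choice, not something defined in these notes.

```python
import math


def poisson_pmf(x: int, theta: float) -> float:
    """P(X = x) for a Poisson random variable with parameter theta."""
    return theta ** x * math.exp(-theta) / math.factorial(x)


# Probability of observing exactly 4 particles when theta = 2.
p4 = poisson_pmf(4, 2.0)
print(round(p4, 4))  # 0.0902

# Sanity check: the probabilities over x = 0, 1, ..., 49 add up to
# (numerically) one, as they must for a probability model.
total = sum(poisson_pmf(x, 2.0) for x in range(50))
print(total)
```

The same function can be evaluated at any other non-negative integer x to tabulate the whole probability model for θ = 2.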
We have here a probability model for the random variable X. Note that we are using upper case letters for random variables and lower case letters for the values taken by random variables. We shall persist with this convention throughout the course.
Let us still assume that the random variable X has a Poisson distribution with parameter θ but where θ is some unspecified positive number. Then, if x is some non-negative integer, we know that the probability that the random variable X takes the value x is given by the formula
$$P(X = x \mid \theta) = \frac{\theta^{x} \exp(-\theta)}{x!}, \tag{1.1}$$
for θ ∈ R+. However, we cannot calculate probabilities such as the probability that X takes the value x = 4 without knowing the value of θ. Suppose that, in order to learn something about the value of θ, we decide to measure the value of X for each of the next five one-minute time periods. Let us use the notation X_1 to denote the number of particles emitted in the first period, X_2 to denote the number emitted in the second period, and so forth. We shall end up with data consisting of a random vector X = (X_1, X_2, ..., X_5). Consider x = (x_1, x_2, x_3, x_4, x_5) = (2, 1, 0, 3, 4). Then x is a possible value for the random vector X. We know that the probability that X_1 takes the value x_1 = 2 is given by the formula
$$P(X_1 = 2 \mid \theta) = \frac{\theta^{2} \exp(-\theta)}{2!}$$
and similarly that the probability that X_2 takes the value x_2 = 1 is given by
$$P(X_2 = 1 \mid \theta) = \frac{\theta \exp(-\theta)}{1!}$$
and so on. However, what about the probability that X takes the value x? In order for this probability to be specified we need to know something about the joint distribution of the random variables X_1, X_2, ..., X_5. A simple assumption to make is that the random variables X_1, X_2, ..., X_5 are mutually independent. (Note that this assumption may not be correct since X_2 may tend to be more similar to X_1 than it would be to X_5.) However, with this assumption we can say that the probability that X takes the value x
Figure 1.2: Probability mass function for the Poisson model with θ = 2. (Axes: number of particles vs. probability.)
that such a model is correct. However, we have arbitrarily set θ = 2 and this is more questionable. How can we know that it is a correct value of the parameter? Let us analyze this issue in detail. If x is some non-negative integer, we know that the probability that the random variable X takes the value x is given by the formula
$$P(X = x \mid \theta) = \frac{\theta^{x} e^{-\theta}}{x!},$$
for θ > 0. But without knowing the true value of θ, we cannot calculate probabilities such as the probability that X takes the value x = 1. Suppose that, in order to learn something about the value of θ, an experiment is performed and a value of X = 5 is recorded. Let us take a look at the probability mass function for θ = 2 in Figure 1.2. What is the probability that X takes the value 2? Do we like what we see? Why? Would you bet on 1 or 2 in the next experiment?
We certainly have some serious doubts about our choice of θ = 2, which was arbitrary anyway. One can consider, for example, θ = 7 as an alternative to θ = 2. Here are graphs of the pmf for the two cases. Which of the two choices do we like?
Figure 1.3: The probability mass function for the Poisson model with θ = 2 vs. the one with θ = 7. (Axes: number of particles vs. probability.)
Since it was more probable to get X = 5 under the assumption θ = 7 than when θ = 2, we say θ = 7 is more likely to produce X = 5 than θ = 2. Based on this observation we can develop a general strategy for choosing θ.
Let us summarize our position. So far we know (or assume) that the radioactive emission follows a Poisson model with some unknown θ > 0, and the value x = 5 has been observed once. Our goal is to somehow utilize this knowledge. First, we note that the Poisson model is in fact a function not only of x but also of θ
$$p(x \mid \theta) = \frac{\theta^{x} e^{-\theta}}{x!}.$$
Let us plug in the observed x = 5, so that we get a function of θ that is called the likelihood function
$$l(\theta) = \frac{\theta^{5} e^{-\theta}}{5!}.$$
The graph of this function is presented in the next figure. Can you locate on this graph the values of the probabilities that were used to choose θ = 7 over θ = 2? What value of θ appears to be the most preferable if the same argument is extended to all possible values of θ? We observe that the value θ = 5 is the most likely to produce the value x = 5. With our likelihood approach we have used the data x = 5 and the Poisson model to make an inference: an example of statistical inference.
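The comparison of θ = 2 with θ = 7, and the search over all values of θ, can be reproduced numerically. The sketch below is only an illustration: the function name `likelihood` and the grid of candidate values from 0.01 to 15 in steps of 0.01 are assumptions made here, and the likelihood is taken to be the Poisson probability of x = 5 viewed as a function of θ.

```python
import math


def likelihood(theta: float, x: int = 5) -> float:
    """Poisson probability of the observed x, viewed as a function of theta."""
    return theta ** x * math.exp(-theta) / math.factorial(x)


# The two candidate values discussed in the text.
l2 = likelihood(2.0)
l7 = likelihood(7.0)
print(round(l2, 3), round(l7, 3))  # theta = 7 gives the larger value

# Extend the same argument to a whole grid of candidate values
# (hypothetical grid: 0.01, 0.02, ..., 15.00).
grid = [k / 100 for k in range(1, 1501)]
theta_hat = max(grid, key=likelihood)
print(theta_hat)  # 5.0
```

A grid search is used here only because it mirrors reading the maximum off the graph; it does not replace the mathematical argument.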
Exercise 1. For the general Poisson model
$$p(x \mid \theta) = l(\theta \mid x) = \frac{\theta^{x} e^{-\theta}}{x!},$$
Give a mathematical argument for your claims.
Suppose that we perform another measurement of the number of emitted particles. Let us use the notation X_1 to denote the number of particles emitted in the first period and X_2 to denote the number emitted in the second period. We shall end up with data consisting of a random vector X = (X_1, X_2). The second measurement yielded x_2 = 2, so that x = (x_1, x_2) = (5, 2). We know that the probability that X_1 takes the value x_1 = 5 is given by the formula
$$P(X_1 = 5 \mid \theta) = \frac{\theta^{5} e^{-\theta}}{5!}$$
and similarly that the probability that X_2 takes the value x_2 = 2 is given by
$$P(X_2 = 2 \mid \theta) = \frac{\theta^{2} e^{-\theta}}{2!}.$$
However, what about the probability that X takes the value x = (5, 2)? In order for this probability to be specified we need to know something about the joint distribution of the random variables X_1, X_2. A simple assumption to make is that the random variables X_1, X_2 are mutually independent. In such a case the probability that X takes the value x = (x_1, x_2) is given by
$$P(X = (x_1, x_2) \mid \theta) = \frac{\theta^{x_1} e^{-\theta}}{x_1!} \cdot \frac{\theta^{x_2} e^{-\theta}}{x_2!} = \frac{e^{-2\theta}\, \theta^{x_1 + x_2}}{x_1!\, x_2!}.$$
After a little algebra we easily find the likelihood function of observing X = (5, 2) as
$$l(\theta \mid (5, 2)) = \frac{e^{-2\theta}\, \theta^{7}}{240},$$
and its graph is presented in Figure 1.5 in comparison with the previous likelihood for a single observation.
Figure 1.5: Likelihood of observing (5, 2) (top) vs. the one of observing 5 (bottom). (Axes: theta vs. likelihood.)
Two important effects of adding extra information should be noted
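As a numerical check of the likelihood for the two observations (5, 2), the sketch below multiplies the two Poisson probabilities (which is valid only under the independence assumption made above) and compares the product with the closed form e^{-2θ} θ^7 / 240. The function names and the grid of θ values are illustrative assumptions.

```python
import math


def poisson_pmf(x: int, theta: float) -> float:
    """P(X = x) for a Poisson random variable with parameter theta."""
    return theta ** x * math.exp(-theta) / math.factorial(x)


def joint_likelihood(theta: float, data) -> float:
    """Product of Poisson probabilities, assuming independent observations."""
    return math.prod(poisson_pmf(x, theta) for x in data)


def closed_form(theta: float) -> float:
    """The expression exp(-2*theta) * theta**7 / 240 for the data (5, 2)."""
    return math.exp(-2 * theta) * theta ** 7 / 240


data = (5, 2)

# The product of the two pmfs agrees with the closed form.
for theta in (1.0, 3.5, 7.0):
    assert math.isclose(joint_likelihood(theta, data), closed_form(theta))

# Locate the maximum on a hypothetical grid 0.01, 0.02, ..., 15.00.
grid = [k / 100 for k in range(1, 1501)]
theta_hat = max(grid, key=lambda t: joint_likelihood(t, data))
print(theta_hat)  # 3.5
```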
1.3 Likelihood and theory of statistics
The strategy of making statistical inference based on the likelihood function as described above is a recurrent theme in mathematical statistics and thus in our lectures. Using mathematical arguments we will compare various strategies for inferring the parameters, and we will often demonstrate that the likelihood-based methods are optimal. Likelihood will also show its strength as a criterion for deciding between various claims about the parameters of the model, which is the leading theme of hypothesis testing.
In modern times, the role of computers in statistical methodology has increased. New computationally intensive methods of data exploration have become one of the central areas of modern statistics. Even there, methods that refer to likelihood play a dominant role, in particular in Bayesian methodology.
Despite this extensive penetration of statistical methodology by likelihood techniques, statistics can by no means be reduced to the analysis of likelihood. In every area of statistics there are important aspects that require reaching beyond likelihood; in many cases, likelihood is not even a focus of study and development. The purpose of this course is to present both the importance of the likelihood approach across statistics and topics for which likelihood plays a secondary role, if any.
1.4 Computationally intensive methods of statistics
The second part of our presentation of modern statistical inference is devoted to computationally intensive statistical methods. The area of data exploration is rapidly growing in importance due to
In this introduction we give two examples that illustrate the power of modern computers and computing software both in the analysis of statistical models and in performing actual statistical inference. We start with analyzing the performance of a statistical procedure using random sample generation.
Randomness can be used to study properties of a mathematical model. The model itself may be probabilistic or not, but here we focus on the probabilistic ones. Essentially, the Monte Carlo method is based on repeated simulation of random samples corresponding to the model and on observing the behavior of the objects of interest. An example of the Monte Carlo method is approximating the area of a circle by randomly tossing points (typically computer generated) onto a sheet of paper on which the circle is drawn. The percentage of points that fall inside the circle represents (approximately) the percentage of the area covered by the circle, as illustrated in Figure 1.6.
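The circle example can be sketched in a few lines of Python. This is an illustration under stated assumptions: the "paper" is taken to be the square that just encloses the circle, and the function name, number of points, and seed are arbitrary choices made here.

```python
import math
import random


def estimate_circle_area(radius: float, n_points: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the area of a circle of the given radius.

    Points are tossed uniformly onto the enclosing square
    [-radius, radius] x [-radius, radius] (an assumed choice of 'paper').
    """
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-radius, radius)
        if x * x + y * y <= radius * radius:
            inside += 1
    square_area = (2 * radius) ** 2
    # The fraction of points inside approximates the fraction of area covered.
    return square_area * inside / n_points


area = estimate_circle_area(radius=1.0, n_points=100_000)
print(area, math.pi)  # the estimate should be close to pi for radius 1
```

Increasing `n_points` reduces the random error of the estimate, at the cost of more computation.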
Exercise 4. Write R code that would explore the area of an ellipsoid using the Monte Carlo method.
Below we present an application of the Monte Carlo approach to studying fitting methods for the Poisson model.
Deciding for Poisson model
Recall that the Poisson model is given by
$$P(X = x \mid \theta) = \frac{\theta^{x} e^{-\theta}}{x!}.$$
It is relatively easy to demonstrate that the mean value of this distribution is equal to θ and that the variance is also equal to θ.
Exercise 5. Present a formal argument showing that for a Poisson random variable X with parameter θ, EX = θ and VarX = θ.
Thus for a sample of observations x = (x_1, ..., x_n) it is reasonable to consider
Figure 1.7: Monte Carlo results comparing estimation of θ = 4 by the sample mean (left, "Histogram of means") vs. estimation using the sample variance (right, "Histogram of vars"). (Vertical axes: frequency.)
estimates performs better. The resulting histograms of the values of the estimators are presented in Figure 1.8. It is quite clear from the graphs that the estimator based on the mean is better than the one based on the variance.
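A comparison of this kind can be sketched as a small simulation. Everything specific below is an assumption made for illustration: the sample size n = 20, the 2000 replications, the seed, and the use of Knuth's multiplication method to draw Poisson values with the standard library only. Only the value θ = 4 comes from the figure caption.

```python
import math
import random
import statistics


def sample_poisson(theta: float, rng: random.Random) -> int:
    """Draw one Poisson(theta) value using Knuth's multiplication method."""
    limit = math.exp(-theta)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate(theta: float = 4.0, n: int = 20, replications: int = 2000, seed: int = 1):
    """Return the sample means and sample variances of repeated Poisson samples."""
    rng = random.Random(seed)
    means, variances = [], []
    for _ in range(replications):
        sample = [sample_poisson(theta, rng) for _ in range(n)]
        means.append(statistics.mean(sample))
        variances.append(statistics.variance(sample))
    return means, variances


means, variances = simulate()

# Both estimators are centred near theta = 4, but their spreads differ.
print("mean-based:    ", statistics.mean(means), statistics.stdev(means))
print("variance-based:", statistics.mean(variances), statistics.stdev(variances))
```

Histograms of `means` and `variances` would give pictures of the same kind as those described above; the printed standard deviations summarize the difference in spread numerically.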
Bootstrap (resampling) methods are one example of Monte Carlo based statistical analysis. The methodology can be summarized as follows
that produced the original statistical data.
This way randomness is used to analyze statistical samples that, by the way, are also a result of randomness. An example illustrating the approach is presented next.
Estimating nitrate ion concentration
Nitrate ion concentration measurements in a certain chemical lab have been collected, and their results are given in the following table. The goal is to estimate, based on these values, the actual nitrate ion concentration.
0.51  0.51  0.51  0.50  0.51  0.49  0.52  0.53  0.50  0.
0.51  0.52  0.53  0.48  0.49  0.50  0.52  0.49  0.49  0.
0.49  0.48  0.46  0.49  0.49  0.48  0.49  0.49  0.51  0.
0.51  0.51  0.51  0.48  0.50  0.47  0.50  0.51  0.49  0.
0.51  0.50  0.50  0.53  0.52  0.52  0.50  0.50  0.51  0.
Table 1.1: Results of 50 determinations of nitrate ion concentration in μg per ml.
The overall mean of all observations is 0.4998. It is natural to ask what the error of this determination of the nitrate concentration is. If we repeated our experiment of collecting 50 samples of nitrate concentrations many times, we would see the range of the error that is made. However, that would be a waste of resources and not a viable method at all. Instead we resample 'new' data from our data, use the new samples so obtained to assess the error, and compare the obtained means (bootstrap means) with the original one. The differences between these represent the bootstrap "estimation" errors; their distribution is viewed as a good representation of the distribution of the true error. In Figure ??, we see the bootstrap counterpart of the distribution of the estimation error. Based on this we can safely say that the nitrate concentration is 49.99 ± 0.005.
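The resampling idea can be sketched as follows. This is a minimal illustration of a nonparametric bootstrap of the mean, not necessarily the exact procedure behind the figure. It uses only the 45 entries of Table 1.1 that are fully legible here (the five entries that appear only as `0.` are left out), so the sample mean it computes differs slightly from the 0.4998 quoted for all 50 values. The function name, the 2000 resamples, the seed, and the 95% level are assumptions.

```python
import random
import statistics

# The 45 fully legible entries of Table 1.1 (assumption: truncated ones omitted).
legible = [
    0.51, 0.51, 0.51, 0.50, 0.51, 0.49, 0.52, 0.53, 0.50,
    0.51, 0.52, 0.53, 0.48, 0.49, 0.50, 0.52, 0.49, 0.49,
    0.49, 0.48, 0.46, 0.49, 0.49, 0.48, 0.49, 0.49, 0.51,
    0.51, 0.51, 0.51, 0.48, 0.50, 0.47, 0.50, 0.51, 0.49,
    0.51, 0.50, 0.50, 0.53, 0.52, 0.52, 0.50, 0.50, 0.51,
]


def bootstrap_errors(data, n_resamples: int = 2000, seed: int = 0):
    """Return the sample mean and the sorted bootstrap 'estimation' errors."""
    rng = random.Random(seed)
    original_mean = statistics.fmean(data)
    errors = []
    for _ in range(n_resamples):
        # Resample with replacement, keeping the original sample size.
        resample = rng.choices(data, k=len(data))
        errors.append(statistics.fmean(resample) - original_mean)
    return original_mean, sorted(errors)


mean, errors = bootstrap_errors(legible)

# Central 95% of the bootstrap errors (percentile-style summary).
lower = errors[int(0.025 * len(errors))]
upper = errors[int(0.975 * len(errors)) - 1]
print(mean, lower, upper)
```

A histogram of `errors` plays the role of the bootstrap counterpart of the distribution of the estimation error.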
Exercise 6. Consider a sample of daily number of buyers in a furniture store
8, 5, 2, 3, 1, 3, 9, 5, 5, 2, 3, 3, 8, 4, 7, 11, 7, 5, 12, 5
Consider the two estimators of θ for a Poisson distribution as discussed in the previous section. Describe formally the procedure (in steps) of obtaining a bootstrap confidence