Statistics II, Business Administration Notes

Course: Statistics II, Instructor: , Degree: Administració i Direcció d'Empreses - Anglès, University: UAB

Type: Notes

2011/2012

Uploaded on 18/09/2012

Notes on Statistics II
Xavier Vilà
Universitat Autònoma de Barcelona
Year 2012-2013

Attribution-Noncommercial-Share Alike 3.0 Spain

You are free:

  • to Share: to copy, distribute, display, and perform the work
  • to Remix: to make derivative works

Under the following conditions:

  • Attribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).
  • Noncommercial. You may not use this work for commercial purposes.
  • Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.

For any reuse or distribution, you must make clear to others the license terms of this work. Any of the above conditions can be waived if you get permission from the copyright holder. Nothing in this license impairs or restricts the author's moral rights. Copyright © 1998-2012 Xavier Vilà. This is a human-readable summary of the Legal Code (the full license) available in http://creativecommons.org

  • 1 Introduction to Inferential Statistics
    • 1.1 Inferential Statistics: Definition and Inference Methods
    • 1.2 Definition and properties of Simple Random Sampling
      • 1.2.1 Simple Random Sampling (SRS)
      • 1.2.2 Systematic Sampling
      • 1.2.3 Stratified Sampling
      • 1.2.4 Step by step sampling
    • 1.3 Distribution of the main sample statistics: mean, variance and proportion
      • 1.3.1 Sample Mean
      • 1.3.2 Sample Variance
      • 1.3.3 Sample Proportion
    • 1.4 Central Limit Theorem
  • 2 Estimation
    • 2.1 Objective of statistical estimation
    • 2.2 Definition and characteristics of estimators
    • 2.3 Properties of estimators: bias, efficiency and consistency
      • 2.3.1 Bias
      • 2.3.2 Efficiency
        • 2.3.2.1 Unbiased Estimators
        • 2.3.2.2 Biased Estimators
      • 2.3.3 Consistency
        • 2.3.3.1 Asymptotically unbiased estimators
        • 2.3.3.2 Consistent Estimators
    • 2.4 Methods of point estimation: maximum likelihood and method of moments
      • 2.4.1 Maximum Likelihood estimation
      • 2.4.2 Method of moments
    • 2.5 Methods of interval estimation
      • 2.5.1 Confidence Interval for the mean
        • 2.5.1.1 Case I: Normal Population (or large sample) and σ^2 known
        • 2.5.1.2 Case II: Normal Population (or large sample) and σ^2 unknown
      • 2.5.2 Confidence Interval for the variance
      • 2.5.3 Confidence Interval for the proportion
  • 3 Parametric Hypothesis Testing
    • 3.1 Concept of parametric test: null hypothesis and alternative hypothesis
    • 3.2 Test statistic and error types
    • 3.3 Tests on the population mean, population variance and population proportion
      • 3.3.1 Hypothesis Test for the Population Mean (μ)
      • 3.3.2 Hypothesis Test for the Population Variance (σ^2)
      • 3.3.3 Hypothesis Test for the Population Proportion (π)
    • 3.4 Test of differences
      • 3.4.1 Test for the Difference of Means
      • 3.4.2 Test for the Difference of Variances
      • 3.4.3 Test for the Difference of Proportions
    • 3.5 Analysis of Variance
      • 3.5.1 Basic Framework
      • 3.5.2 Test
      • 3.5.3 Test statistic
      • 3.5.4 Distribution of the test statistic
      • 3.5.5 Test conclusion
      • 3.5.6 The ANOVA table
  • 4 Goodness-of-fit and analysis of the relationship between variables
    • 4.1 Tests of goodness-of-fit
      • 4.1.1 χ^2 test of goodness-of-fit for discrete variables
      • 4.1.2 Kolmogorov-Smirnov test of goodness-of-fit for continuous variables
    • 4.2 Types of relationship between variables
    • 4.3 Test of independence between qualitative variables
    • 4.4 Analysis of the correlation between quantitative variables: correlation coefficient and hypothesis test
  • 5 Introduction to the regression model
    • 5.1 Objective of the model
    • 5.2 Hypothesis of the model specification
    • 5.3 Estimation by Ordinary Least Squares (OLS) and its properties
      • 5.3.1 Data in differences with respect to the mean
      • 5.3.2 The OLS estimators
      • 5.3.3 Estimation of the variance of the error term
      • 5.3.4 Properties of the OLS estimators
        • 5.3.4.1 Properties of βˆ
        • 5.3.4.2 Properties of βˆ
    • 5.4 Model testing
      • 5.4.1 Confidence intervals for β 1 and β
      • 5.4.2 Hypothesis testing for β 1 and β
    • 5.5 Coefficient of the goodness-of-fit and relationship between the correlation and the regression analysis
      • 5.5.1 The Coefficient of determination (or Goodness-of-fit)
      • 5.5.2 Relationship between the correlation analysis and the regression analysis
    • 5.6 Forecasting
      • 5.6.1 Point estimation of Y_{n+1}
      • 5.6.2 Estimation by intervals of Y_{n+1}
  • A Statistical Tables
    • A.1 The Standard Normal Distribution
    • A.2 The t-student Distribution
    • A.3 The χ^2 Distribution
    • A.4 The Snedecor's F Distribution
    • A.5 Values for the Kolmogorov-Smirnov test

Chapter 1

Introduction to Inferential Statistics

Think of a researcher who seeks to explain some fact in the real world. For instance, imagine Newton trying to explain why apples fall. As a more familiar example, imagine an economist trying to explain why unemployment exists. Usually, the task of a researcher consists of three parts:

  1. Observe the world in order to determine the problem to study and gather information about it
  2. Think about the problem
  3. Produce an explanation or Theory for the problem.

Statistics becomes extremely important for the first of these three items.^1

It is clear that, in order to study a "real problem", the researcher must observe the "real" world. Nevertheless, it is also clear that no researcher can observe the whole reality. Newton cannot observe all the falling apples, nor can an economist interview the whole population of a country to determine the unemployment rate. It is thus necessary to somehow "summarize" the reality, but this task has to be done so that such a "summary" closely fits the reality being studied. Then, and only then, the conclusions drawn from the "summary" can be applied reliably to the whole population.

Statistics (more precisely, statistical inference) is a collection of techniques by means of which we can draw conclusions about a reality from the study of a summary of that reality.

Hereafter we will study in detail how this is done.

(^1) Very often the researcher does not start by gathering information using statistical techniques. On the contrary, in many cases his initial activity consists of detecting general patterns of behavior for a given fact. From here, researchers are able to build up an abstract theory in order to explain the phenomenon under study. This is, for example, Newton's way, and also the way Economic Theory works. Once this "abstract theory" is logically constructed, statistical techniques are often used to check whether such a theory fits the reality, as we will see in Chapter 5.

Chapter one explains how the reality is rigorously summarized and what are the main features of the results obtained in this process.

Chapter two deals with the first approach on how to draw conclusions about some real issues based on what we observe in the summary.

Chapters three and four introduce more sophisticated techniques to make inferences about the reality using some of the more elemental results seen in Chapter two.

Finally, Chapter five introduces linear regression analysis, a technique widely used in economic analysis (and other sciences) to study the relationship between variables.

It is worth saying that a clear understanding of the topics in Chapter one is important in order to easily understand what the other chapters deal with, and also to get a global idea of the whole process of statistical inference.

It is important to understand that statistics is based on probabilistic techniques. Hence, any statistical conclusion drawn from this kind of summary will not be true for sure when applied to the whole reality, but only with a certain probability. For instance, when an electoral survey is conducted it is clear that its results will not exactly coincide with the results in the final election. Nevertheless, if the survey is "well done", that is, if the summary of the reality (which in this case is the set of people interviewed) closely represents the whole reality (which in this case is the whole population that has the right to vote), then the survey result will be close to the final results with a high probability.

In the sections below we will see which are the basic ingredients of any statistical analysis and its probabilistic features.

1.1 Inferential Statistics: Definition and Inference Methods

Statistical inference is mainly built upon four main concepts, which will be defined and described below. These concepts are closely related to each other and it is very important to clearly understand each of them and not to mistake one for another.

Population: the set of elements that are the object of study. The goal is to draw some conclusion regarding some specific feature of this population.

Example 1.1.1 All the apples in the world. The feature at study is whether apples fall down or not.

Example 1.1.2 Labor force in the European Union. The feature at study is whether workers are unemployed or not.

Example 1.1.3 Production of Intel chips in a given day. The feature at study is whether chips are faulty or not.

  1. From this statistic, using some statistical inference technique that we will see in other chapters, some conclusions are drawn regarding the unknown population parameter that represents the feature of the population that is to be studied.

This process can be represented as in Figure 1.

[Figure 1.1: The process of Statistical Inference. Diagram labels: Population, Parameter (unknown), Sampling, Sample, Statistic (known), Statistical Inference.]

We can now provide a better definition for Statistics (or Statistical Inference).

Definition 1.1.13 Statistical Inference is a subject whose main objective is to draw conclusions regarding a population through the study of one sample by means of probabilistic techniques.

1.2 Definition and properties of Simple Random Sampling

We will see next what a sample is, that is, how a sample can be selected out of a population. Since we want to study this sample to produce conclusions about the population, it cannot be selected arbitrarily. In this sense, there exist rigorous techniques specially tailored for this purpose. In what follows, the more basic techniques will be introduced, while more sophisticated analyses are out of the scope of these notes. The following definition introduces the idea of sampling.

Definition 1.2.1 Sampling is a systematic technique to select a sample out of a population in such a way that it is representative of that population.

Here, the keyword is representative. Indeed, if we want our sample to be used in order to produce "reliable" conclusions regarding the original population, we had better have a sample that closely resembles (in its structure) the original population. For instance, if we want to conduct an electoral survey and we only interview people living in a "rich" neighborhood, then it is clear that their answers will not be representative of the whole population.

There are different types of sampling techniques, depending on the specific features of the study at hand. The most important are:

1.2.1 Simple Random Sampling (SRS).

This is the "most random" of all the sampling methods, and throughout these notes we will normally assume that samples are obtained using this technique. Its main feature is that all elements in the population have the same probability of being selected to be included in the sample. In other words, the sample is constructed completely at random. If we think for a moment of all the possible different samples of a given size that can be selected from a given population, simple random sampling means that each of these samples has the same probability of being selected as "the sample", i.e., they are equally likely.

Example 1.2.2 Consider a population consisting of only 4 elements

Population = {A, B, C, D}

If, for instance, we want to draw a sample of size 2, there are 6 possible samples

Sample 1   Sample 2   Sample 3   Sample 4   Sample 5   Sample 6
{A, B}     {A, C}     {A, D}     {B, C}     {B, D}     {C, D}

Table 1.1: Possible Samples

Hence, in a Simple Random Sampling, each of these samples has the same probability of being selected, 1/6 in this case. Analogously, we may also say that each of the four elements in the Population has the same probability of being drawn to enter the selected sample. Indeed, since each of the elements belongs to exactly 3 of the possible samples and each possible sample has probability 1/6 of being the selected sample, then the probability for any of the elements in the Population of entering the selected sample is 1/6 + 1/6 + 1/6 = 1/2.

This probability (1/2) can also be understood as each element in the Population having probability 1/4 of being the first element to enter the sample and probability (3/4)(1/3) = 1/4 of being selected as the second element (it is not selected in the first place with probability 3/4 and, given that, it is selected second with probability 1/3), which yields a total probability of 1/4 + 1/4 = 1/2 of being one of the two elements in the sample.
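The counting argument in Example 1.2.2 can be checked by brute force. The following is a minimal sketch in Python (the variable names are illustrative, not part of the notes): it lists every possible sample of size 2, gives each the same probability, and adds up the probabilities of the samples that contain each element.

```python
from fractions import Fraction
from itertools import combinations

population = ["A", "B", "C", "D"]
sample_size = 2

# Every unordered sample of the given size; under SRS all are equally likely.
samples = list(combinations(population, sample_size))
sample_probability = Fraction(1, len(samples))

# Probability that each element enters the selected sample:
# add the probabilities of the samples that contain it.
inclusion_probability = {
    element: sum(sample_probability for s in samples if element in s)
    for element in population
}

print(samples)
print(sample_probability)
print(inclusion_probability)
```

Exact fractions are used instead of floating-point numbers so that the results can be compared directly with the 1/6 and 1/2 obtained in the text.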

1.2.2 Systematic Sampling.

Systematic Sampling is a variant of SRS. It is useful when the population to sample is not "static", but changes often. The following example shows how this method works.

Example 1.2.3 Consider a factory that manufactures Intel "chips". The managers want to study how many of these chips turn out to be faulty every day. The factory has a "chain" process so that once the "chip" has been assembled, it automatically enters the packaging process and then moves to the warehouse. Let us suppose that the factory produces 100 "chips" a day, and that a sample of size 5 is going to be selected. It is clear that the managers cannot wait until the end of the day, then stop all processes, randomly select 5 chips, and start

Example 1.2.5 We want to conduct a survey to know the situation of the public schools in Catalonia. Since this is a very delicate topic, we must travel to each of the schools that have been picked to belong to the sample and interview the Director. In this context, a SRS might very well select a sample composed of schools disseminated all over the territory, which would imply a high level of travel expenditure. To avoid this, we can do the following:

  1. Perform a SRS within all the "comarques" in Catalonia, so that 10 "comarques" are selected to visit.
  2. Within each of the 10 selected "comarques", perform another SRS to select 20 towns to visit. Hence, we will have a total of 200 towns to visit.
  3. Finally, within each of the selected towns perform one more SRS to select one public school to visit.

In this way, we have selected 200 public schools to visit in Catalonia with travel costs lower than using a SRS. The problem, though, is that the sample obtained will be less representative.
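The three steps of Example 1.2.5 can be sketched in Python as nested simple random samples. This is only an illustration: the function name `multistage_sample` and the `territory` data are hypothetical placeholders (they are not real figures for Catalonia), chosen so that the example is self-contained.

```python
import random

def multistage_sample(territory, n_regions, n_towns, rng):
    """Three-stage SRS: regions, then towns within each region, then one school per town.

    `territory` maps region name -> {town name -> list of school names}.
    """
    selected = []
    for region in rng.sample(sorted(territory), n_regions):
        towns = territory[region]
        for town in rng.sample(sorted(towns), n_towns):
            school = rng.choice(towns[town])
            selected.append((region, town, school))
    return selected

# Hypothetical placeholder data: 40 regions, 25 towns each, 3 schools per town.
territory = {
    f"comarca_{c}": {
        f"town_{c}_{t}": [f"school_{c}_{t}_{s}" for s in range(3)]
        for t in range(25)
    }
    for c in range(40)
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
schools = multistage_sample(territory, n_regions=10, n_towns=20, rng=rng)
print(len(schools))
```

All the selected schools fall within only 10 regions, which is what keeps travel costs down and also what makes the sample less representative than a single SRS over all schools.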

In some cases the sample is obtained without any randomness at all. For instance, if we want to test a new drug against malaria, we cannot just randomly select "subjects" and force them to take the drug. In cases like this, a call for volunteers is made. These techniques are usually much less representative than a random technique. Furthermore, since there are no random components in the sample, we cannot use probabilistic tools to study the sample and, therefore, statistical inference techniques cannot be properly applied.

In what follows, we will always assume (implicitly) that the sample at hand has been obtained by means of a SRS.

1.3 Distribution of the main sample statistics: mean, variance and proportion

Once the sample is obtained (we will always assume by means of a SRS), the process of working with it and drawing conclusions begins.

In this sense, the main task is now to obtain a statistic, one of the main concepts in statistical inference. We will use it to obtain conclusions regarding the unknown population parameter that is of interest to us.

The definition that follows will remind us what a statistic is (as introduced in the previous section). Then, the concept of estimate is defined. Although these two concepts are very similar and closely related, it is very important to notice that they are not the same thing.

Definition 1.3.1 A statistic (or estimator)^4 is a formula that uses the values in the sample at hand (observations) in order to produce an approximation to the true value of an unknown population parameter.

Definition 1.3.2 An estimate (or estimation) is the particular value of an estimator that is obtained from a particular sample.

Hence, a statistic is not a number but a formula, while an estimate is the number that is obtained when the formula (the estimator) is applied to the observations of the specific sample that we have at hand.

At this point, it becomes crucial to understand that, given that the sample is obtained by means of a random technique, the statistic will produce different estimates with different probabilities (depending on the specific sample that is finally "selected" at random). To put it more formally, a statistic is a Random Variable, that is, a variable that takes different values with different probabilities. In this sense, an estimate is a specific realization of this random variable. The following example aims to clarify this idea.

Example 1.3.3 We want to know the average number of cars per family in a given population. To keep the example simple, we will assume that the population is very small, only 4 families,

Population = {A, B, C, D}

Let us now assume that family A owns one car, families B and C have 2 cars each, and family D has 4.^5

For our study, we want to obtain a random sample of size 2. We can then compute the average number of cars in the sample and use it to infer some conclusion regarding the true average in the population. Hence, the sample mean (or just mean, for short) will play the role of statistic in this example, and we can use it to draw conclusions on the true population parameter that is of interest to us: the average number of cars per family in the whole population, that is, the population mean.

Table 1.3 summarizes:

  1. The 6 possible samples that can be the result of a random sampling process on this population,
  2. for each of the possible samples, the probability of being selected (all of them will have the same probability as we are assuming SRS)
  3. the estimate value that would result from applying the sample average formula to the corresponding sample
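The three items listed above can be computed directly from the data of Example 1.3.3. The following Python sketch (variable names are illustrative) enumerates the possible samples, assigns each the same probability, applies the sample mean formula to each one, and then collects the distribution of the statistic.

```python
from fractions import Fraction
from itertools import combinations

cars = {"A": 1, "B": 2, "C": 2, "D": 4}  # cars owned by each family (from the example)
n = 2

samples = list(combinations(sorted(cars), n))
prob = Fraction(1, len(samples))  # SRS: every sample is equally likely

# One estimate (value of the sample mean) per possible sample.
estimates = {s: Fraction(sum(cars[f] for f in s), n) for s in samples}

# Distribution of the statistic: each distinct estimate with its probability.
distribution = {}
for value in estimates.values():
    distribution[value] = distribution.get(value, 0) + prob

population_mean = Fraction(sum(cars.values()), len(cars))
expected_estimate = sum(value * p for value, p in distribution.items())

for s, value in estimates.items():
    print(s, value, prob)
print(distribution)
print(population_mean, expected_estimate)
```

The last line compares the population mean with the expectation of the sample mean over all possible samples; in this small example the two coincide, which illustrates why the sample mean is a reasonable statistic for the population mean.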

we have seen in the previous example. Indeed, in that example we have seen that the population is distributed so that there is 1 element with 1 car, 2 elements with 2 cars, and 1 element with 4 cars. Therefore, if we pick the sample element xi at random from this population, we will have that:

$$
p(x_i = a) =
\begin{cases}
1/4 & \text{if } a = 1 \\
1/2 & \text{if } a = 2 \\
1/4 & \text{if } a = 4 \\
0 & \text{otherwise}
\end{cases}
$$

This is, in this case, the distribution of the population. Figure 1.2 shows it.

[Figure 1.2: Population distribution in example 1.3. Horizontal axis: x (0 to 4); vertical axis: p.]

In general,^a we will assume that the Sample has been obtained by means of a SRS from a population distributed according to a Normal Distribution with some Population Mean μ and some Population Variance σ^2.

(^a) There are special cases that we will discuss in due time.

What does it mean? Easy, it means that for any two numbers a and b, we have that for any element in our sample xi,

$$
p(a \le x_i \le b) = p(a - \mu \le x_i - \mu \le b - \mu) = p\left(\frac{a - \mu}{\sigma} \le \frac{x_i - \mu}{\sigma} \le \frac{b - \mu}{\sigma}\right) = p\left(\frac{a - \mu}{\sigma} \le Z \le \frac{b - \mu}{\sigma}\right)
$$

where Z represents the Standard Normal distribution, usually denoted by N(0, 1), whose associated probabilities are found in tables. Graphically, Figure 1.3 shows it.

[Figure 1.3: Normal Distribution. The area under the curve between a and b, around the mean μ, represents p(a<x<b).]
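As a numerical illustration of the standardization above, the following Python sketch computes p(a ≤ x ≤ b) through the Standard Normal Z, using `statistics.NormalDist` from the standard library in place of printed tables. The values of μ, σ, a and b are illustrative assumptions, not taken from the notes.

```python
from statistics import NormalDist

def prob_between(a, b, mu, sigma):
    """p(a <= x <= b) for x ~ N(mu, sigma^2), computed through the standard Normal Z."""
    z = NormalDist()  # N(0, 1)
    return z.cdf((b - mu) / sigma) - z.cdf((a - mu) / sigma)

# Illustrative values only (not from the notes).
mu, sigma = 10.0, 2.0
a, b = 8.0, 13.0

via_z = prob_between(a, b, mu, sigma)

# Same probability computed directly on N(mu, sigma^2), as a cross-check.
direct = NormalDist(mu, sigma).cdf(b) - NormalDist(mu, sigma).cdf(a)

print(round(via_z, 4), round(direct, 4))
```

Both routes give the same probability, which is the point of the standardization: a single table for N(0, 1) is enough for any Normal population.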

We turn next to the study of the distributions of the three main statistics. These, as we have discussed above, will depend on the distribution of the population from which we obtain the sample. For each case, we will also be interested in knowing what is the expectation and the variance of these statistics.

1.3.1 Sample Mean

Sample mean, denoted by X̄, is the statistic that is obtained from the sample using the formula:

$$
\bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i
$$

It is normally used to infer conclusions regarding the true value of the Population mean μ. Its distribution depends on the characteristics of both the population and the sample.

  1. If the population is Normal, that is, Xi ∼ N(μ, σ^2) ∀i, then we have that

$$
\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)
$$

because the sample mean is a linear combination of Normal random variables.

  2. If the population is not Normal but the sample is big enough, then:

$$
\frac{\bar{X} - \mu}{\sqrt{\sigma^2 / n}} \sim N(0, 1) \quad \text{(approx.)}
$$

because of the Central Limit Theorem that will be introduced later on.

  1. If the population is not Normal, then the distribution of the sample variance is unknown in general, even for large samples.

Since we only know the distribution of the sample variance when the population is Normal, we will use the fact that in that case its distribution is χ^2_{n−1} to find the expectation and variance easily. In this sense, we know that for any χ^2 variable we have that E(χ^2_{n−1}) = n − 1 and V(χ^2_{n−1}) = 2(n − 1). Hence, we will assume that the sample has been obtained from a Normal population with population mean μ and population variance σ^2. That is, xi ∼ N(μ, σ^2) for any element xi in the sample. Hence:

$$
\frac{(n - 1) S^2}{\sigma^2} \sim \chi^2_{n-1}
$$

and therefore

$$
E\left(\frac{(n - 1) S^2}{\sigma^2}\right) = n - 1 \;\Rightarrow\; \frac{(n - 1)}{\sigma^2} E(S^2) = n - 1 \;\Rightarrow\; E(S^2) = \sigma^2
$$

$$
V\left(\frac{(n - 1) S^2}{\sigma^2}\right) = 2(n - 1) \;\Rightarrow\; \frac{(n - 1)^2}{(\sigma^2)^2} V(S^2) = 2(n - 1) \;\Rightarrow\; V(S^2) = \frac{2 \sigma^4}{n - 1}
$$
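The two results E(S^2) = σ^2 and V(S^2) = 2σ^4/(n − 1) can be checked approximately by simulation. The sketch below is only an illustration under assumed values (μ = 0, σ^2 = 4, n = 5, 20,000 replications, a fixed seed); it uses `statistics.variance`, which divides by n − 1, as the sample variance S^2.

```python
import random
from statistics import fmean, variance

rng = random.Random(12345)        # fixed seed so the sketch is reproducible
mu, sigma2, n = 0.0, 4.0, 5       # illustrative values, not from the notes
replications = 20_000

# Draw many Normal samples of size n and record the sample variance S^2 of each.
s2_values = [
    variance([rng.gauss(mu, sigma2 ** 0.5) for _ in range(n)])
    for _ in range(replications)
]

mean_s2 = fmean(s2_values)        # to be compared with sigma^2
var_s2 = variance(s2_values)      # to be compared with 2 * sigma^4 / (n - 1)

print(mean_s2, sigma2)
print(var_s2, 2 * sigma2 ** 2 / (n - 1))
```

The simulated values will not match the theoretical ones exactly, but they should be close, and closer as the number of replications grows.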

1.3.3 Sample Proportion

Sample proportion is a special case. It is used when we are interested in knowing the true proportion of elements in a population that have a given characteristic. For instance, it might be of interest to know the proportion of smokers among the second year students in this school (in this case, the characteristic that is of interest is "whether a student smokes or not"), or the proportion of faulty Intel chips in a day (in this case, the characteristic of interest is "whether a chip is faulty or not").

Sample proportion, denoted by π̂, is the statistic that is obtained from the sample using the formula:

$$
\hat{\pi} = \frac{\sum_{i=1}^{n} x_i}{n}
$$

where xi = 1 if the i-th element in the sample has the characteristic that we are studying and xi = 0 if it does not.

Sample proportion is normally used to infer conclusions regarding the true population proportion π. In this case, the population is never Normal since each observation xi comes from a Bernoulli random variable. Indeed, let us assume that we are looking at a population of 100 individuals out of which 45 are smokers. That is, the true population proportion is 45% or π = 0.45. Imagine that from this population we want to obtain a sample of size 10. It is clear that for any element xi of the sample we will have that:

$$
p(x_i = 1) = 0.45
$$

$$
p(x_i = 0) = 0.55
$$

Hence, we see that each element xi in the sample follows a Bernoulli distribution with parameter π (where π is the true and unknown population proportion).

It can be shown then that the sum of the observations, ∑_{i=1}^{n} xi, is a Binomial random variable with parameters n and π, so that π̂ = ∑_{i=1}^{n} xi / n is a Binomial random variable divided by n. Also, given that when samples are large a Binomial distribution can be approximated by a Normal distribution, we can conclude that, in general:

  1. If the sample is large enough (nπ(1 − π) > 5), then (approx.):

$$
\hat{\pi} \sim N\left(\pi, \frac{\pi (1 - \pi)}{n}\right)
$$

This approximation is better the closer π is to 0.5 and the larger the sample is.

  2. If the sample is not large, then the approximation is poor.

With regard to the expectation and variance of the sample proportion, we have:

$$
E(\hat{\pi}) = \pi
$$

$$
V(\hat{\pi}) = \frac{\pi (1 - \pi)}{n}
$$
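The expectation and variance of the sample proportion can be verified from the exact Binomial probabilities. The sketch below uses the smokers example (n = 10, π = 0.45) and assumes, as a simplification, that the observations are independent Bernoulli draws, so that the number of smokers in the sample is Binomial(n, π); the variable names are illustrative.

```python
from math import comb

n, pi = 10, 0.45  # sample size and population proportion from the smokers example

# P(pi_hat = k/n) when the number of successes in the sample is Binomial(n, pi).
distribution = {
    k / n: comb(n, k) * pi**k * (1 - pi) ** (n - k)
    for k in range(n + 1)
}

expectation = sum(value * p for value, p in distribution.items())
variance = sum((value - expectation) ** 2 * p for value, p in distribution.items())

print(expectation, pi)
print(variance, pi * (1 - pi) / n)

# Rule of thumb from the text: the Normal approximation needs n * pi * (1 - pi) > 5.
# Here the product is about 2.475, so a sample of size 10 is not "large enough".
print(n * pi * (1 - pi) > 5)
```

With these values the rule nπ(1 − π) > 5 is not satisfied, so the exact Binomial probabilities computed here are preferable to the Normal approximation for a sample this small.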

1.4 Central Limit Theorem

This theorem presents a mathematical fact that is of the highest importance in statistical inference. Basically, the theorem states that the sum of a large number of independent random variables with identical distribution, whatever that distribution is, approximately follows a Normal distribution.

From a practical point of view this result allows us to work with the sample mean as if it were a Normal random variable even when the population from which we obtain the sample does not follow a Normal distribution. Moreover, the larger the sample the better the approximation.

Formally,

Theorem 1.4.1 Let X_1, X_2, ..., X_n be a sequence of independent random variables with identical distribution, expectation μ and variance σ^2. Then, when n is large enough, the random variable

$$
\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i
$$

follows, approximately, a Normal distribution with

$$
\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma^2_{\bar{X}} = \frac{\sigma^2}{n}
$$
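A small simulation can illustrate the theorem. The sketch below is an illustration under assumed settings, not part of the notes: the population is Exponential with rate 1 (clearly non-Normal, with μ = 1 and σ^2 = 1), the sample size is n = 50, and 10,000 samples are drawn with a fixed seed. It compares the mean and variance of the simulated sample means with μ and σ^2/n, and the share of standardized means within ±1.96 with the Standard Normal value.

```python
import random
from statistics import NormalDist, fmean, pvariance

rng = random.Random(2024)         # fixed seed so the sketch is reproducible
n, replications = 50, 10_000

# Exponential population with rate 1: non-Normal, mu = 1 and sigma^2 = 1.
mu, sigma2 = 1.0, 1.0
means = [
    fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(replications)
]

# Standardize each sample mean as (X_bar - mu) / sqrt(sigma^2 / n).
z_values = [(m - mu) / (sigma2 / n) ** 0.5 for m in means]

# Share of standardized means within +-1.96, compared with the Normal value.
inside = sum(abs(z) <= 1.96 for z in z_values) / replications
normal_share = NormalDist().cdf(1.96) - NormalDist().cdf(-1.96)

print(fmean(means), mu)
print(pvariance(means), sigma2 / n)
print(inside, normal_share)
```

Repeating the experiment with a smaller n (for instance 5) should show a visibly worse agreement, in line with the remark that the larger the sample, the better the approximation.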