





































































Integrating Factor, Notations, Random samples, Normal distribution, Maximum likelihood estimation, Parameter estimation, Accuracy of the estimate, Polynomial regression






































































Peter Donnelly,
Department of Statistics and The Wellcome Trust Centre for Human Genetics
Lecture notes and problem sheets will be available from the Mathematical Institute’s website.
We will be concerned with the mathematical framework for making inferences from data. The tools of probability provide the backdrop, allowing us to quantify the uncertainties involved.
Examples
Data: $x_1, x_2, \ldots, x_n$, the heights of $n$ randomly chosen girls.
An obvious estimate is
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.$$
How precise is our estimate?
Notation
We usually denote observations by lower case letters: $x_1, x_2, \ldots, x_n$.
Regard these as observed values of random variables (rv's), for which we usually use upper case: $X_1, X_2, \ldots, X_n$.
We often write $x$ (respectively $X$) for the collection $x_1, x_2, \ldots, x_n$ (respectively $X_1, X_2, \ldots, X_n$).
In different settings, it is convenient to think of $x_i$ as the observed value of $X_i$, or as a possible value that $X_i$ can take.
For example, if $X_i$ is a Poisson random variable with mean $\lambda$, then
$$P(X_i = x_i) = \frac{e^{-\lambda} \lambda^{x_i}}{x_i!}$$
for $x_i = 0, 1, 2, \ldots$.
Definition 1 A random sample of size $n$ is a set of random variables $X_1, X_2, \ldots, X_n$ which are independent and identically distributed (i.i.d.).
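Examples
If $X_1, X_2, \ldots, X_n$ is a random sample from a Poisson distribution with mean $\lambda$, the joint p.m.f. is
$$f(x) = P(X_1 = x_1, \ldots, X_n = x_n) = \prod_{i=1}^{n} P(X_i = x_i) = \frac{e^{-\lambda} \lambda^{x_1}}{x_1!} \cdot \frac{e^{-\lambda} \lambda^{x_2}}{x_2!} \cdots \frac{e^{-\lambda} \lambda^{x_n}}{x_n!} = \frac{e^{-n\lambda} \lambda^{\sum_{i=1}^{n} x_i}}{\prod_{i=1}^{n} x_i!},$$
where the second equality follows from the independence of the $X_i$.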
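As a quick numerical check of the Poisson joint p.m.f., the sketch below computes it twice: once as a product of the marginal probabilities (using independence) and once from the closed form $e^{-n\lambda} \lambda^{\sum x_i} / \prod x_i!$. The function names, the counts in `xs` and the value of `lam` are all hypothetical and chosen only for illustration.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def joint_pmf_product(xs, lam):
    """Joint p.m.f. of an i.i.d. Poisson sample as a product of marginals."""
    prob = 1.0
    for x in xs:
        prob *= poisson_pmf(x, lam)
    return prob

def joint_pmf_closed_form(xs, lam):
    """Joint p.m.f. via exp(-n*lam) * lam**sum(xs) / prod(x_i!)."""
    n = len(xs)
    denom = math.prod(math.factorial(x) for x in xs)
    return math.exp(-n * lam) * lam ** sum(xs) / denom

xs = [2, 0, 3, 1]   # hypothetical observed counts
lam = 1.5           # hypothetical parameter value

by_product = joint_pmf_product(xs, lam)
by_formula = joint_pmf_closed_form(xs, lam)
print(by_product, by_formula)
```

The two values should agree up to floating-point rounding, which is the content of the final equality in the display.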
In probability questions we would usually assume that the parameters $\lambda$ and $\mu$ from our previous examples are known.
In many settings they will not be known, and we wish to estimate them from data. Two key questions of interest are:
Definition 2 Let $X_1, X_2, \ldots, X_n$ be a random sample. The sample mean is defined as
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i.$$
The sample variance is defined as
$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2.$$
The sample standard deviation is $S$ ($= \sqrt{S^2}$).
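A minimal sketch of how the sample mean, sample variance and sample standard deviation are computed from observed values, with the $n - 1$ divisor from the definition. The function names and the `heights` data are hypothetical, made up for illustration.

```python
import math

def sample_mean(xs):
    """Sample mean: (1/n) * sum of the observations."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sample variance with divisor n - 1."""
    n = len(xs)
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def sample_sd(xs):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(xs))

heights = [150.0, 152.0, 148.0, 155.0, 145.0]  # hypothetical data, in cm

xbar = sample_mean(heights)
s2 = sample_variance(heights)
s = sample_sd(heights)
print(xbar, s2, s)
```

The standard library's `statistics.mean`, `statistics.variance` and `statistics.stdev` use the same definitions and can serve as a cross-check.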
Notes
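Definition 3 Recall that $X$ has a normal distribution with mean $\mu$ and variance $\sigma^2$, written $X \sim N(\mu, \sigma^2)$, if the p.d.f. of $X$ is
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2 \right)$$
for $-\infty < x < \infty$.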
If $\mu = 0$ and $\sigma = 1$, then $X$ is said to have a standard normal distribution, and we write $X \sim N(0, 1)$.
Important Result
If $X \sim N(\mu, \sigma^2)$ and $Z = (X - \mu)/\sigma$, then $Z \sim N(0, 1)$.
The cumulative distribution function (c.d.f.) of a standard normal random variable is:
$$\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-u^2/2} \, du.$$
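The integral defining $\Phi$ has no closed form, but it can be evaluated through the error function using the identity $\Phi(x) = \tfrac{1}{2}\left(1 + \operatorname{erf}(x/\sqrt{2})\right)$. The sketch below combines this with the standardisation $Z = (X - \mu)/\sigma$ to get probabilities for a general normal variable. The function names are hypothetical; `math.erf` is part of the Python standard library.

```python
import math

def standard_normal_cdf(x):
    """Phi(x), computed as (1 + erf(x / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), by standardising to Z ~ N(0, 1)."""
    z = (x - mu) / sigma
    return standard_normal_cdf(z)

# By symmetry of the standard normal density, Phi(0) is one half.
print(standard_normal_cdf(0.0))
# A probability for a non-standard normal (hypothetical mu and sigma).
print(normal_cdf(12.0, mu=10.0, sigma=2.0))
```

Here `normal_cdf(12.0, 10.0, 2.0)` equals `standard_normal_cdf(1.0)`, since $(12 - 10)/2 = 1$.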
Example 1. continued
Suppose $n = 62$ and $x_1, x_2, \ldots, x_n$ are 62 time intervals between major earthquakes. Assume $X_1, X_2, \ldots, X_n$ are exponential random variables with mean $\mu$.
How does one estimate the unknown $\mu$? Intuition suggests estimating $\mu$ by $\bar{x}$. But is this a good idea? Are there general principles we can use to choose estimators?
In general, suppose $X_1, X_2, \ldots, X_n$ is a random sample from a distribution with p.d.f. (or p.m.f.) $f(x; \theta)$. If we regard the parameter $\theta$ as unknown, we need to estimate it using $x_1, x_2, \ldots, x_n$.
Definition 4 Given observations $x_1, x_2, \ldots, x_n$ and unknown parameter $\theta$, the likelihood of $\theta$ is the function
$$L(\theta) = \prod_{i=1}^{n} f(x_i; \theta). \qquad (1)$$
That is, $L$ is the joint density (or mass) function, but regarded as a function of $\theta$, for fixed $x_1, x_2, \ldots, x_n$. The likelihood $L(\theta)$ is the probability (or probability density) of observing $x = (x_1, x_2, \ldots, x_n)$ if the unknown parameter is $\theta$.
The log-likelihood is $l(\theta) = \log L(\theta)$ (the logarithm is to the base $e$).
The maximum likelihood estimate $\hat{\theta}(x)$ is the value of $\theta$ that maximizes $L(\theta)$.
$\hat{\theta}(X)$ is the maximum likelihood estimator (m.l.e.).
Example 1 again
In this case the parameter of interest is μ.
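$$L(\mu) = \prod_{i=1}^{n} \frac{1}{\mu} e^{-x_i/\mu} = \frac{1}{\mu^n} \exp\left( -\frac{1}{\mu} \sum_{i=1}^{n} x_i \right),$$
and so
$$l(\mu) = -n \log \mu - \frac{\sum_{i=1}^{n} x_i}{\mu}.$$
Then
$$\frac{dl}{d\mu} = -\frac{n}{\mu} + \frac{\sum_{i=1}^{n} x_i}{\mu^2}$$
and
$$\frac{dl}{d\mu} = 0 \;\Rightarrow\; \mu = \bar{x}$$
(which is a maximum).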
Therefore, the maximum likelihood estimate of $\mu$ is $\bar{x}$.
The maximum likelihood estimator is $\bar{X}$.
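The calculus result for the exponential example can be checked numerically. The sketch below evaluates the log-likelihood $l(\mu) = -n \log \mu - \sum x_i / \mu$ on a grid of candidate values and confirms that the maximiser sits at the sample mean, up to the grid spacing. The data in `xs` are hypothetical (they are not the 62 earthquake intervals, which are not reproduced here), and the grid range and spacing are arbitrary choices.

```python
import math

def exp_log_likelihood(mu, xs):
    """l(mu) = -n*log(mu) - sum(xs)/mu for an exponential sample with mean mu."""
    return -len(xs) * math.log(mu) - sum(xs) / mu

xs = [3.1, 0.4, 1.7, 5.2, 2.6]   # hypothetical waiting times
xbar = sum(xs) / len(xs)          # closed-form maximum likelihood estimate

# Crude grid search over mu = 0.01, 0.02, ..., 20.00 (assumed range).
grid = [0.01 * k for k in range(1, 2001)]
mu_hat_grid = max(grid, key=lambda mu: exp_log_likelihood(mu, xs))

print(xbar, mu_hat_grid)
```

A grid search is used only because it is transparent; in practice a numerical optimiser would be the usual tool when no closed form is available.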
Example
Consider a random variable $X$ with a Bernoulli distribution with parameter $p$ (this is the same as a Binomial$(1, p)$).
The probability mass function of $X$ is
$$f(x; p) = \begin{cases} p^x (1 - p)^{1 - x} & x = 0, 1, \\ 0 & \text{otherwise.} \end{cases}$$
Suppose $X_1, X_2, \ldots, X_n$ is a random sample. Then, the likelihood is
$$L(p) = \prod_{i=1}^{n} p^{x_i} (1 - p)^{1 - x_i} = p^r (1 - p)^{n - r},$$
where $r = \sum_{i=1}^{n} x_i$.
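The text above stops at the likelihood. As an illustration of where it leads, the sketch below works with the log-likelihood $l(p) = r \log p + (n - r) \log(1 - p)$ and compares a grid maximiser with $r/n$, which is the value obtained by setting $dl/dp = 0$ (a standard calculation not shown in the notes above). The 0/1 data in `xs` and the function name are hypothetical.

```python
import math

def bernoulli_log_likelihood(p, xs):
    """l(p) = r*log(p) + (n - r)*log(1 - p), where r is the number of ones."""
    n = len(xs)
    r = sum(xs)
    return r * math.log(p) + (n - r) * math.log(1.0 - p)

xs = [1, 0, 1, 1, 0, 1, 0, 1]   # hypothetical Bernoulli observations
n = len(xs)
r = sum(xs)
p_hat_closed = r / n             # candidate from solving dl/dp = 0

# Grid over the open interval (0, 1); the endpoints are excluded because
# log(0) is undefined.
grid = [k / 1000 for k in range(1, 1000)]
p_hat_grid = max(grid, key=lambda p: bernoulli_log_likelihood(p, xs))

print(p_hat_closed, p_hat_grid)
```

Working on the log scale avoids multiplying many small probabilities together, which matters for larger $n$.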
Example
Suppose we take a random sample of individuals from a population, and test their genetic type at a particular chromosomal location (called a “locus” in genetics). At this particular position, each chromosome in the population will have one of two possible variants, which we denote by A and a. Since each individual has two chromosomes (we receive one from each of our parents), the type of a particular individual could be one of three so-called genotypes, AA, Aa, or aa, depending on whether they have 2, 1, or 0 copies of the A variant. (Note that order is not relevant, so there is no distinction between Aa and aA.)
There is a simple result, called the Hardy-Weinberg law, which states that under plausible assumptions, the genotypes AA, Aa and aa will occur with probabilities $p_1 = \theta^2$, $p_2 = 2\theta(1 - \theta)$ and $p_3 = (1 - \theta)^2$ respectively, for some $0 \le \theta \le 1$.
Now suppose the random sample of $n$ individuals contains:
$x_1$ of type AA; $x_2$ of type Aa; $x_3$ of type aa;
where $\sum_{i=1}^{3} x_i = n$.
Then the likelihood $L(\theta)$ is the probability that we observe $(x_1, x_2, x_3)$ if we assign individuals to genotypes with probabilities $(p_1, p_2, p_3)$. That is,
$$L(\theta) = \frac{n!}{x_1! \, x_2! \, x_3!} \, p_1^{x_1} p_2^{x_2} p_3^{x_3}.$$
This is a multinomial distribution (the generalization of the binomial distribution in the setting when there are more than two possible outcomes).
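To make the Hardy-Weinberg likelihood concrete, the sketch below evaluates the multinomial likelihood for given genotype counts and locates its maximum on a grid. Substituting the Hardy-Weinberg probabilities gives $L(\theta) \propto \theta^{2x_1 + x_2} (1 - \theta)^{x_2 + 2x_3}$, whose maximiser is $(2x_1 + x_2)/(2n)$; this closed form is a standard calculation, not something stated in the text above. The genotype counts and function names are hypothetical.

```python
import math

def hw_probs(theta):
    """Hardy-Weinberg genotype probabilities (AA, Aa, aa) for a given theta."""
    return theta ** 2, 2 * theta * (1 - theta), (1 - theta) ** 2

def hw_likelihood(theta, x1, x2, x3):
    """Multinomial likelihood of theta given genotype counts x1, x2, x3."""
    n = x1 + x2 + x3
    coef = math.factorial(n) // (
        math.factorial(x1) * math.factorial(x2) * math.factorial(x3)
    )
    p1, p2, p3 = hw_probs(theta)
    return coef * p1 ** x1 * p2 ** x2 * p3 ** x3

x1, x2, x3 = 3, 5, 2              # hypothetical counts of AA, Aa, aa
n = x1 + x2 + x3

# Closed-form maximiser: the proportion of A variants among the 2n chromosomes.
theta_closed = (2 * x1 + x2) / (2 * n)

grid = [k / 1000 for k in range(1, 1000)]
theta_grid = max(grid, key=lambda t: hw_likelihood(t, x1, x2, x3))

print(theta_closed, theta_grid)
```

The closed form has a natural reading: each AA individual contributes two copies of A and each Aa individual one, out of $2n$ chromosomes in total.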