Lecture 3: Probability Distributions in Statistical Data Analysis by G. Cowan, Slides of Computational and Statistical Data Analysis

Part of Lecture 3 from the 'Statistical Data Analysis' series by G. Cowan. It covers various probability distributions, their properties, and their applications in high energy physics (HEP): the binomial, multinomial, Poisson, uniform, exponential, Gaussian, chi-square, Cauchy, Landau, beta, gamma, and Student's t distributions. It also explains the central limit theorem and its significance.

Typology: Slides

2011/2012

Uploaded on 03/08/2012

G. Cowan Lectures on Statistical Data Analysis Lecture 3 page 1
Statistical Data Analysis: Lecture 3
1 Probability, Bayes’ theorem, random variables, pdfs
2 Functions of r.v.s, expectation values, error propagation
3 Catalogue of pdfs
4 The Monte Carlo method
5 Statistical tests: general concepts
6 Test statistics, multivariate methods
7 Goodness-of-fit tests
8 Parameter estimation, maximum likelihood
9 More maximum likelihood
10 Method of least squares
11 Interval estimation, setting limits
12 Nuisance parameters, systematic uncertainties
13 Examples of Bayesian approach
14 tba
15 tba

Some distributions

Distribution/pdf    Example use in HEP
Binomial            Branching ratio
Multinomial         Histogram with fixed N
Poisson             Number of events found
Uniform             Monte Carlo method
Exponential         Decay time
Gaussian            Measurement error
Chi-square          Goodness-of-fit
Cauchy              Mass of resonance
Landau              Ionization energy loss
Beta                Prior pdf for efficiency
Gamma               Sum of exponential variables
Student's t         Resolution function with adjustable tails

Binomial distribution (2)

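The binomial distribution is therefore

$$f(n; N, p) = \frac{N!}{n!\,(N-n)!}\, p^{n} (1-p)^{N-n},$$

where n is the random variable and N, p are the parameters.

For the expectation value and variance we find:

$$E[n] = Np, \qquad V[n] = Np(1-p).$$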
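The mean and variance of the binomial distribution can be checked by summing over the probabilities directly. The following is a minimal sketch; the function names and the values N = 20, p = 0.3 are illustrative choices, not taken from the lecture.

```python
from math import comb

def binomial_pmf(n: int, N: int, p: float) -> float:
    """Probability of n successes in N independent trials, each with success probability p."""
    return comb(N, n) * p**n * (1.0 - p) ** (N - n)

def binomial_moments(N: int, p: float) -> tuple[float, float]:
    """Mean and variance obtained by summing over the pmf, without using the closed forms."""
    mean = sum(n * binomial_pmf(n, N, p) for n in range(N + 1))
    var = sum((n - mean) ** 2 * binomial_pmf(n, N, p) for n in range(N + 1))
    return mean, var

# Illustrative parameter values (assumed, not from the slides).
N, p = 20, 0.3
mean, var = binomial_moments(N, p)
print(mean, N * p)            # summed mean next to N p
print(var, N * p * (1 - p))   # summed variance next to N p (1 - p)
```

The two numbers on each printed line should agree up to floating-point rounding.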

Binomial distribution (3)

[Figure: binomial distribution for several values of the parameters.]

Example: observe N decays of W±, the number n of which are W → μν is a binomial r.v., p = branching ratio.

Multinomial distribution (2)

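Now consider outcome i as 'success' and all others as 'failure'. Then all $n_i$ are individually binomial with parameters N, $p_i$, so for all i

$$E[n_i] = N p_i, \qquad V[n_i] = N p_i (1 - p_i).$$

One can also find the covariance to be

$$V_{ij} = N p_i (\delta_{ij} - p_j).$$

Example: $\vec{n} = (n_1, \ldots, n_m)$ represents a histogram with m bins, N total entries, all entries independent.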
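The negative covariance between bin contents, cov[n_i, n_j] = -N p_i p_j for i ≠ j, can be verified exactly for a small histogram by summing over every possible outcome. This is a brute-force sketch with assumed values (N = 5 entries, three bins); the helper names are hypothetical.

```python
from itertools import product
from math import factorial

def multinomial_pmf(counts, probs):
    """P(n_1, ..., n_m) for N = sum(counts) independent entries with bin probabilities probs."""
    coeff = factorial(sum(counts))
    for n_i in counts:
        coeff //= factorial(n_i)  # the division is exact at every step
    prob = float(coeff)
    for n_i, p_i in zip(counts, probs):
        prob *= p_i ** n_i
    return prob

def exact_covariance(N, probs, i, j):
    """cov[n_i, n_j] from a sum over all outcomes with n_1 + ... + n_m = N."""
    e_i = e_j = e_ij = 0.0
    for counts in product(range(N + 1), repeat=len(probs)):
        if sum(counts) != N:
            continue
        w = multinomial_pmf(counts, probs)
        e_i += counts[i] * w
        e_j += counts[j] * w
        e_ij += counts[i] * counts[j] * w
    return e_ij - e_i * e_j

# Illustrative values (assumed, not from the slides).
N, probs = 5, (0.2, 0.3, 0.5)
print(exact_covariance(N, probs, 0, 1), -N * probs[0] * probs[1])
print(exact_covariance(N, probs, 0, 0), N * probs[0] * (1 - probs[0]))
```

The enumeration grows quickly with N and m, so it is only practical as a check on small cases.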

Poisson distribution

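Consider binomial n in the limit

$$N \to \infty, \qquad p \to 0, \qquad E[n] = Np \to \nu.$$

→ n follows the Poisson distribution:

$$f(n; \nu) = \frac{\nu^{n}}{n!}\, e^{-\nu} \quad (n \ge 0), \qquad E[n] = \nu, \qquad V[n] = \nu.$$

Example: number of scattering events n with cross section σ found for a fixed integrated luminosity, with $\nu = \sigma \int L \, dt$.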
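The Poisson limit of the binomial distribution can be seen numerically by holding ν = Np fixed while N grows. The sketch below compares the two probability functions; the choice ν = 2 and the cutoff n_max = 20 are assumptions made for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(n, N, p):
    """Binomial probability of n successes in N trials (zero if n > N)."""
    if n > N:
        return 0.0
    return comb(N, n) * p**n * (1.0 - p) ** (N - n)

def poisson_pmf(n, nu):
    """Poisson probability of n events for mean nu."""
    return nu**n * exp(-nu) / factorial(n)

def max_abs_difference(N, nu, n_max=20):
    """Largest pointwise gap between binomial(N, p = nu/N) and Poisson(nu) for n = 0..n_max."""
    p = nu / N
    return max(abs(binomial_pmf(n, N, p) - poisson_pmf(n, nu)) for n in range(n_max + 1))

# nu = 2 is an illustrative value; the gap should shrink as N increases.
for N in (10, 100, 1000, 10000):
    print(N, max_abs_difference(N, nu=2.0))
```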

Exponential distribution

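The exponential pdf for the continuous r.v. x is defined by:

$$f(x; \xi) = \begin{cases} \dfrac{1}{\xi}\, e^{-x/\xi} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \qquad E[x] = \xi, \qquad V[x] = \xi^{2}.$$

Example: proper decay time t of an unstable particle,

$$f(t; \tau) = \frac{1}{\tau}\, e^{-t/\tau} \qquad (\tau = \text{mean lifetime}).$$

Lack of memory (unique to exponential):

$$f(t - t_0 \mid t \ge t_0) = f(t).$$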
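The lack-of-memory property can be illustrated with simulated decay times: among particles that have already survived to t0, the fraction surviving a further time s matches the fraction of all particles surviving s. This is a sketch using inverse-transform sampling; the values of τ, t0, s, the sample size and the seed are arbitrary assumptions.

```python
import math
import random

def sample_decay_time(tau, rng):
    """Inverse-transform sample: t = -tau * ln(1 - u) with u uniform in [0, 1)."""
    return -tau * math.log(1.0 - rng.random())

def survival_fractions(tau, t0, s, n_samples=200_000, seed=1):
    """Return (P(t > s), P(t > t0 + s | t > t0)) estimated from simulated decay times."""
    rng = random.Random(seed)
    times = [sample_decay_time(tau, rng) for _ in range(n_samples)]
    survivors = [t for t in times if t > t0]
    frac_unconditional = sum(t > s for t in times) / len(times)
    frac_conditional = sum(t > t0 + s for t in survivors) / len(survivors)
    return frac_unconditional, frac_conditional

# Illustrative values (assumed): tau = 2, already survived t0 = 1.5, further time s = 1.
uncond, cond = survival_fractions(tau=2.0, t0=1.5, s=1.0)
print(uncond, cond, math.exp(-1.0 / 2.0))  # both estimates should be near exp(-s/tau)
```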

Gaussian distribution

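The Gaussian (normal) pdf for a continuous r.v. x is defined by:

$$f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right), \qquad E[x] = \mu, \qquad V[x] = \sigma^{2}.$$

Special case: $\mu = 0$, $\sigma^{2} = 1$ ('standard Gaussian'):

$$\varphi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^{2}/2}, \qquad \Phi(x) = \int_{-\infty}^{x} \varphi(x')\, dx'.$$

(N.B. often $\mu$, $\sigma^{2}$ denote mean, variance of any r.v., not only Gaussian.)

If y ~ Gaussian with $\mu$, $\sigma^{2}$, then $x = (y - \mu)/\sigma$ follows $\varphi(x)$.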
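Standardizing a Gaussian variable, x = (y − μ)/σ, lets every Gaussian probability be computed from the standard Gaussian cumulative distribution. A minimal sketch using the error function from the standard library follows; the function names and the values μ = 2, σ = 0.5 are illustrative.

```python
import math

def gaussian_pdf(y, mu, sigma):
    """Gaussian pdf with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)

def standard_gaussian_pdf(x):
    """Standard Gaussian pdf (mu = 0, sigma = 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def standard_gaussian_cdf(x):
    """Cumulative standard Gaussian, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def interval_probability(a, b, mu, sigma):
    """P(a < y < b) for Gaussian y, computed after standardizing both limits."""
    return standard_gaussian_cdf((b - mu) / sigma) - standard_gaussian_cdf((a - mu) / sigma)

mu, sigma = 2.0, 0.5  # illustrative values (assumed)
y = 3.1
# The pdf of y equals the standard pdf of x = (y - mu)/sigma divided by sigma.
print(gaussian_pdf(y, mu, sigma), standard_gaussian_pdf((y - mu) / sigma) / sigma)
# Probability content of the one-sigma and two-sigma intervals.
print(interval_probability(mu - sigma, mu + sigma, mu, sigma))
print(interval_probability(mu - 2 * sigma, mu + 2 * sigma, mu, sigma))
```

The factor 1/σ in the first comparison is the Jacobian of the change of variable from y to x.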

Central Limit Theorem (2)

The CLT can be proved using characteristic functions (Fourier transforms), see, e.g., SDA Chapter 10.

Good example: velocity component v_x of air molecules.

OK example: total deflection due to multiple Coulomb scattering. (Rare large angle deflections give non-Gaussian tail.)

Bad example: energy loss of charged particle traversing thin gas layer. (Rare collisions make up large fraction of energy loss, cf. Landau pdf.)

For finite n, the theorem is approximately valid to the extent that the fluctuation of the sum is not dominated by one (or few) terms.

Beware of measurement errors with non-Gaussian tails.
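The approach to Gaussian behaviour for finite n can be illustrated by summing n uniform random numbers, standardizing the sum, and counting how often it falls within one standard deviation of its mean. This is a simulation sketch; the uniform terms, sample size and seed are assumptions chosen for illustration, not an example from the lecture.

```python
import math
import random

def standardized_sum_of_uniforms(n, rng):
    """Sum n uniform(0, 1) numbers, then subtract the mean n/2 and divide by sqrt(n/12)."""
    total = sum(rng.random() for _ in range(n))
    return (total - 0.5 * n) / math.sqrt(n / 12.0)

def fraction_within_one_sigma(n, n_samples=100_000, seed=7):
    """Estimated probability that the standardized sum lies in (-1, 1)."""
    rng = random.Random(seed)
    inside = sum(abs(standardized_sum_of_uniforms(n, rng)) < 1.0 for _ in range(n_samples))
    return inside / n_samples

gaussian_value = math.erf(1.0 / math.sqrt(2.0))  # one-sigma content of a Gaussian
for n in (1, 2, 12):
    print(n, fraction_within_one_sigma(n), gaussian_value)
```

For a single uniform term the fraction is 1/√3, well below the Gaussian value; with a dozen terms, none of which dominates the sum, the estimate moves close to it.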

Multivariate Gaussian distribution

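Multivariate Gaussian pdf for the vector $\vec{x} = (x_1, \ldots, x_n)$:

$$f(\vec{x}; \vec{\mu}, V) = \frac{1}{(2\pi)^{n/2} |V|^{1/2}} \exp\!\left[-\tfrac{1}{2} (\vec{x} - \vec{\mu})^{T} V^{-1} (\vec{x} - \vec{\mu})\right]$$

$\vec{x}$, $\vec{\mu}$ are column vectors, $\vec{x}^{T}$, $\vec{\mu}^{T}$ are transpose (row) vectors,

$$E[x_i] = \mu_i, \qquad \mathrm{cov}[x_i, x_j] = V_{ij}.$$

For n = 2 this is

$$f(x_1, x_2; \mu_1, \mu_2, \sigma_1, \sigma_2, \rho) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1-\rho^{2}}} \exp\!\left\{-\frac{1}{2(1-\rho^{2})} \left[\left(\frac{x_1-\mu_1}{\sigma_1}\right)^{2} + \left(\frac{x_2-\mu_2}{\sigma_2}\right)^{2} - 2\rho \left(\frac{x_1-\mu_1}{\sigma_1}\right)\!\left(\frac{x_2-\mu_2}{\sigma_2}\right)\right]\right\}$$

where $\rho = \mathrm{cov}[x_1, x_2]/(\sigma_1 \sigma_2)$ is the correlation coefficient.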
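For n = 2 the pdf can be coded directly in terms of the means, standard deviations and correlation coefficient ρ. The sketch below does this and checks that for ρ = 0 it factorizes into a product of two one-dimensional Gaussians. The function names and numerical values are illustrative assumptions.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """One-dimensional Gaussian pdf."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)

def bivariate_gaussian_pdf(x1, x2, mu1, mu2, sigma1, sigma2, rho):
    """Two-dimensional Gaussian pdf with correlation coefficient rho (|rho| < 1)."""
    z1 = (x1 - mu1) / sigma1
    z2 = (x2 - mu2) / sigma2
    one_minus_rho2 = 1.0 - rho * rho
    norm = 1.0 / (2.0 * math.pi * sigma1 * sigma2 * math.sqrt(one_minus_rho2))
    q = (z1 * z1 + z2 * z2 - 2.0 * rho * z1 * z2) / one_minus_rho2
    return norm * math.exp(-0.5 * q)

# Illustrative values (assumed).
mu1, mu2, sigma1, sigma2 = 0.0, 1.0, 1.0, 2.0
x1, x2 = 0.3, -1.2

# With rho = 0 the joint pdf is the product of the two marginal pdfs.
print(bivariate_gaussian_pdf(x1, x2, mu1, mu2, sigma1, sigma2, 0.0),
      gaussian_pdf(x1, mu1, sigma1) * gaussian_pdf(x2, mu2, sigma2))

# With rho != 0 the value at the same point changes.
print(bivariate_gaussian_pdf(x1, x2, mu1, mu2, sigma1, sigma2, 0.8))
```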

Cauchy (Breit-Wigner) distribution

The Breit-Wigner pdf for the continuous r.v. x is defined by

$$f(x; \Gamma, x_0) = \frac{1}{\pi}\, \frac{\Gamma/2}{\Gamma^{2}/4 + (x - x_0)^{2}}$$

($\Gamma = 2$, $x_0 = 0$ is the Cauchy pdf.)

E[x] not well defined, V[x] → ∞.

$x_0$ = mode (most probable value)

Γ = full width at half maximum

Example: mass of resonance particle, e.g. ρ, K*, φ⁰, ...

Γ = decay rate (inverse of mean lifetime)
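The interpretation of Γ as the full width at half maximum can be checked by evaluating the Breit-Wigner pdf at x0 and at x0 ± Γ/2. A short sketch follows; the values x0 = 1.0 and Γ = 0.2 are arbitrary and do not correspond to any particular resonance.

```python
import math

def breit_wigner_pdf(x, gamma, x0):
    """Breit-Wigner pdf with mode x0 and full width at half maximum gamma."""
    return (1.0 / math.pi) * (gamma / 2.0) / (gamma**2 / 4.0 + (x - x0) ** 2)

x0, gamma = 1.0, 0.2  # illustrative values (assumed)
peak = breit_wigner_pdf(x0, gamma, x0)

# At x0 +- gamma/2 the pdf falls to half of its peak value.
print(breit_wigner_pdf(x0 - gamma / 2.0, gamma, x0) / peak)
print(breit_wigner_pdf(x0 + gamma / 2.0, gamma, x0) / peak)

# gamma = 2, x0 = 0 reduces to the Cauchy pdf 1 / (pi * (1 + x^2)).
x = 0.7
print(breit_wigner_pdf(x, 2.0, 0.0), 1.0 / (math.pi * (1.0 + x * x)))
```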

Landau distribution

For a charged particle with β = v/c traversing a layer of matter of thickness d, the energy loss Δ follows the Landau pdf.

L. Landau, J. Phys. USSR 8 (1944) 201; see also W. Allison and J. Cobb, Ann. Rev. Nucl. Part. Sci. 30 (1980) 253.

Beta distribution

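$$f(x; \alpha, \beta) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, x^{\alpha-1} (1-x)^{\beta-1}, \qquad E[x] = \frac{\alpha}{\alpha+\beta}, \qquad V[x] = \frac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)}.$$

Often used to represent pdf of continuous r.v. nonzero only between finite limits.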
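As a quick numerical check, the beta pdf on [0, 1] can be integrated to confirm its normalization and mean. The sketch below assumes the standard form with shape parameters α and β and uses a simple midpoint rule; the values α = 3, β = 5 and the helper names are illustrative.

```python
import math

def beta_pdf(x, alpha, beta):
    """Beta pdf on (0, 1) with shape parameters alpha and beta; zero elsewhere."""
    if not 0.0 < x < 1.0:
        return 0.0
    norm = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return norm * x ** (alpha - 1.0) * (1.0 - x) ** (beta - 1.0)

def midpoint_integral(func, lo, hi, n_steps=20_000):
    """Midpoint-rule estimate of the integral of func over [lo, hi]."""
    width = (hi - lo) / n_steps
    return width * sum(func(lo + (k + 0.5) * width) for k in range(n_steps))

alpha, beta = 3.0, 5.0  # illustrative values (assumed)
area = midpoint_integral(lambda x: beta_pdf(x, alpha, beta), 0.0, 1.0)
mean = midpoint_integral(lambda x: x * beta_pdf(x, alpha, beta), 0.0, 1.0)
print(area)                           # should be close to 1
print(mean, alpha / (alpha + beta))   # numerical mean next to alpha / (alpha + beta)
```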

Gamma distribution

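$$f(x; \alpha, \beta) = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}}\, x^{\alpha-1} e^{-x/\beta}, \qquad E[x] = \alpha\beta, \qquad V[x] = \alpha\beta^{2}.$$

Often used to represent pdf of continuous r.v. nonzero only in [0, ∞).

Also, e.g., the sum of n exponential r.v.s, or the time until the nth event in a Poisson process, follows a Gamma pdf.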
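The connection between the Gamma pdf and sums of exponential variables can be checked by simulation: the sum of n independent exponential r.v.s, each with mean ξ, should have mean nξ and variance nξ². This is a consistency check on the first two moments only, not a full comparison of the distributions. The values n = 4, ξ = 0.5, the sample size and the seed are assumptions.

```python
import random
import statistics

def sum_of_exponentials(n, xi, rng):
    """Sum of n independent exponential variables, each with mean xi."""
    # random.expovariate takes the rate, i.e. 1 / mean.
    return sum(rng.expovariate(1.0 / xi) for _ in range(n))

def simulated_moments(n, xi, n_samples=100_000, seed=3):
    """Sample mean and variance of the sum of n exponentials."""
    rng = random.Random(seed)
    sums = [sum_of_exponentials(n, xi, rng) for _ in range(n_samples)]
    return statistics.fmean(sums), statistics.pvariance(sums)

n, xi = 4, 0.5  # illustrative values (assumed)
mean, var = simulated_moments(n, xi)
print(mean, n * xi)        # sample mean next to n * xi
print(var, n * xi ** 2)    # sample variance next to n * xi^2
```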