Statistical vs Systematic Errors-Computing and Statistical Data Analysis-Lecture 13 Slides-Physics, Slides of Computational and Statistical Data Analysis

Bayes’ theorem tells how our beliefs should be updated in light of the data. Keywords: Statistical vs Systematic Errors, Nuisance Parameters, Profile Likelihood, Bayesian, Bayes’ Theorem, Markov Chain Monte Carlo, Metropolis-Hastings Algorithm, Cousins-Highland Method, Integrated Likelihoods, Statistical Data Analysis. Lecture slides, Glen Cowan, Physics Department, University of London, United Kingdom.

Typology: Slides

2011/2012

Uploaded on 03/08/2012

G. Cowan
Lectures on Statistical Data Analysis
Lecture 13 page 1
Statistical Data Analysis: Lecture 13
1 Probability, Bayes’ theorem
2 Random variables and probability densities
3 Expectation values, error propagation
4 Catalogue of pdfs
5 The Monte Carlo method
6 Statistical tests: general concepts
7 Test statistics, multivariate methods
8 Goodness-of-fit tests
9 Parameter estimation, maximum likelihood
10 More maximum likelihood
11 Method of least squares
12 Interval estimation, setting limits
13 Nuisance parameters, systematic uncertainties
14 Examples of Bayesian approach

Statistical vs. systematic errors

Statistical errors: How much would the result fluctuate upon repetition of the measurement? This implies some set of assumptions to define the probability of the outcome of the measurement.

Systematic errors: What is the uncertainty in my result due to uncertainty in my assumptions, e.g., model (theoretical) uncertainty or modelling of the measurement apparatus? These sources of error do not vary upon repetition of the measurement. They often result from uncertain values of, e.g., calibration constants, efficiencies, etc.

Nuisance parameters

Suppose the outcome of the experiment is some set of data values $x$ (here shorthand for, e.g., $x_1, \ldots, x_n$).

We want to determine a parameter $\theta$ (could be a vector of parameters $\theta_1, \ldots, \theta_n$).

The probability law for the data $x$ depends on $\theta$: $L(x | \theta)$ (the likelihood function). E.g., maximize $L$ to find the estimator $\hat{\theta}$.

Now suppose, however, that the vector of parameters $\theta = (\theta_1, \ldots, \theta_n)$ contains some components that are of interest and others that are not of interest. Those that are not of interest are called nuisance parameters.

Example #1: fitting a straight line

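Data: $(x_i, y_i, \sigma_i)$, $i = 1, \ldots, n$.

Model: the measured $y_i$ are independent and Gaussian, $y_i \sim N(\mu(x_i), \sigma_i^2)$ with $\mu(x; \theta_0, \theta_1) = \theta_0 + \theta_1 x$; assume $x_i$ and $\sigma_i$ known.

Goal: estimate $\theta_0$ (don’t care about $\theta_1$).

Case #2: both $\theta_0$ and $\theta_1$ unknown

Correlation between $\hat{\theta}_0$ and $\hat{\theta}_1$ causes the errors to increase. The standard deviations are obtained from the tangent lines to the contour $\chi^2 = \chi^2_{\min} + 1$.

Case #3: we have a measurement $t_1$ of $\theta_1$

The information on $\theta_1$ improves the accuracy of $\hat{\theta}_0$.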
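
The effect of the extra measurement can be checked numerically. The following is a minimal sketch, not taken from the slides: it assumes a least-squares fit of the straight line, with the measurement of θ1 entering as an added Gaussian constraint term (θ1 − t1)²/σ_t1² in χ². Because the model is linear in the parameters, the covariance matrix of the estimators is the inverse of half the χ² Hessian and does not depend on the measured y values. The function name `sigma_theta0` and the numerical inputs are hypothetical.

```python
import math

def sigma_theta0(x, sigma, sigma_t1=None):
    """Standard deviation of the least-squares estimator of theta0
    for the model y = theta0 + theta1 * x.

    sigma_t1=None  : theta0 and theta1 both free (Case #2).
    sigma_t1=value : chi^2 gets an extra term (theta1 - t1)^2 / sigma_t1^2,
                     an assumed form for the measurement of theta1 (Case #3).
    """
    # Elements of half the Hessian of chi^2 (the inverse covariance matrix).
    s1 = sum(1.0 / s ** 2 for s in sigma)
    sx = sum(xi / s ** 2 for xi, s in zip(x, sigma))
    sxx = sum(xi ** 2 / s ** 2 for xi, s in zip(x, sigma))
    if sigma_t1 is not None:
        sxx += 1.0 / sigma_t1 ** 2
    det = s1 * sxx - sx ** 2
    # (0, 0) element of the inverse of [[s1, sx], [sx, sxx]].
    return math.sqrt(sxx / det)

# Hypothetical measurement points and errors, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
sigma = [0.5] * len(x)

sigma_free = sigma_theta0(x, sigma)
sigma_constrained = sigma_theta0(x, sigma, sigma_t1=0.1)
print("both free:      ", sigma_free)
print("with constraint:", sigma_constrained)
```

With these inputs the constrained error on the intercept comes out smaller than the unconstrained one, which is the behaviour described for Case #3.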

The Bayesian approach

In Bayesian statistics we can associate a probability with a hypothesis, e.g., a parameter value $\theta$.

Interpret the probability of $\theta$ as ‘degree of belief’ (subjective).

Need to start with a ‘prior pdf’ $\pi(\theta)$; this reflects the degree of belief about $\theta$ before doing the experiment.

Our experiment has data $x$ → likelihood function $L(x | \theta)$.

Bayes’ theorem tells how our beliefs should be updated in light of the data $x$:

$$p(\theta | x) = \frac{L(x | \theta)\, \pi(\theta)}{\int L(x | \theta')\, \pi(\theta')\, d\theta'}$$

The posterior pdf $p(\theta | x)$ contains all our knowledge about $\theta$.

Case #4: Bayesian method

We need to associate prior probabilities with $\theta_0$ and $\theta_1$, e.g., a prior for $\theta_1$ based on the previous measurement $t_1$, and a prior for $\theta_0$ that reflects ‘prior ignorance’ and is in any case much broader than the likelihood.

Putting this into Bayes’ theorem gives:

posterior ∝ likelihood × prior
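
For a two-parameter problem the product likelihood × prior can simply be evaluated on a grid and summed over the nuisance parameter. The sketch below is an illustration under stated assumptions, not the slides' own calculation: the prior for θ0 is taken to be flat, the prior for θ1 is taken to be Gaussian centred on t1, and the data values, errors and grid ranges are all hypothetical.

```python
import math

# Hypothetical inputs, for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [1.5, 2.0, 2.5, 3.0, 3.5]   # lies exactly on y = 1 + 0.5 * x
SIGMA = 0.5                      # known error on each y_i
T1, SIGMA_T1 = 0.5, 0.1          # assumed earlier measurement of theta1

def posterior_unnormalised(t0, t1):
    """Likelihood times prior, up to a constant factor.
    Assumptions: flat prior in theta0, Gaussian prior in theta1."""
    chi2 = sum(((y - t0 - t1 * x) / SIGMA) ** 2 for x, y in zip(X, Y))
    prior1 = math.exp(-0.5 * ((t1 - T1) / SIGMA_T1) ** 2)
    return math.exp(-0.5 * chi2) * prior1

def marginal_theta0(n0=200, n1=200, lo0=-1.5, hi0=3.5, lo1=0.0, hi1=1.0):
    """Marginal posterior of theta0: midpoint-rule integral over theta1."""
    d0 = (hi0 - lo0) / n0
    d1 = (hi1 - lo1) / n1
    grid0 = [lo0 + (i + 0.5) * d0 for i in range(n0)]
    grid1 = [lo1 + (j + 0.5) * d1 for j in range(n1)]
    dens = [sum(posterior_unnormalised(t0, t1) for t1 in grid1) * d1
            for t0 in grid0]
    norm = sum(dens) * d0
    return grid0, [d / norm for d in dens], d0

grid0, pdf0, d0 = marginal_theta0()
mean0 = sum(t * p for t, p in zip(grid0, pdf0)) * d0
std0 = math.sqrt(sum((t - mean0) ** 2 * p for t, p in zip(grid0, pdf0)) * d0)
print("posterior mean of theta0:", mean0)
print("posterior std of theta0: ", std0)
```

A grid like this is only practical in a few dimensions, since the number of points grows exponentially with the number of parameters.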

Digression: marginalization with MCMC

Bayesian computations involve integrals like

$$p(\theta_0 | x) = \int p(\theta_0, \theta_1 | x)\, d\theta_1 ,$$

often of high dimensionality and impossible in closed form, and also impossible with ‘normal’ acceptance-rejection Monte Carlo.

Markov Chain Monte Carlo (MCMC) has revolutionized Bayesian computation. Google for ‘MCMC’, ‘Metropolis’, ‘Bayesian computation’, ...

MCMC generates a correlated sequence of random numbers: it cannot be used for many applications, e.g., detector MC, and the effective statistical error is greater than the naive estimate for $n$ independent values.

Basic idea: sample the full multidimensional posterior $p(\theta | x)$, then look, e.g., only at the distribution of the parameters of interest.

MCMC basics: Metropolis-Hastings algorithm

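Goal: given an $n$-dimensional pdf $p(\vec{\theta})$, generate a sequence of points $\vec{\theta}_1, \vec{\theta}_2, \vec{\theta}_3, \ldots$

  1. Start at some point $\vec{\theta}_0$.
  2. Generate a proposed point $\vec{\theta}$ from a proposal density $q(\vec{\theta}; \vec{\theta}_0)$, e.g., a Gaussian centred about $\vec{\theta}_0$.
  3. Form the Hastings test ratio $\alpha = \min\left[1, \frac{p(\vec{\theta})\, q(\vec{\theta}_0; \vec{\theta})}{p(\vec{\theta}_0)\, q(\vec{\theta}; \vec{\theta}_0)}\right]$.
  4. Generate $u \sim \mathrm{Uniform}[0, 1]$.
  5. If $u \le \alpha$, move to the proposed point: $\vec{\theta}_1 = \vec{\theta}$; else the old point is repeated: $\vec{\theta}_1 = \vec{\theta}_0$.
  6. Iterate.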
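
The steps above can be sketched in a few lines. This is a minimal illustration, not the lecture's own code: it assumes a symmetric Gaussian proposal (so the proposal densities cancel in the Hastings ratio), works with the logarithm of the target to avoid underflow, and uses as target a straight-line posterior with a flat prior in θ0 and a Gaussian prior in θ1. The data, step widths, seed, chain length and burn-in length are all hypothetical choices.

```python
import math
import random

# Hypothetical inputs, for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [1.5, 2.0, 2.5, 3.0, 3.5]   # lies exactly on y = 1 + 0.5 * x
SIGMA = 0.5
T1, SIGMA_T1 = 0.5, 0.1          # assumed earlier measurement of theta1

def log_posterior(theta):
    """log p(theta0, theta1 | x) up to a constant.
    Assumptions: flat prior in theta0, Gaussian prior in theta1."""
    t0, t1 = theta
    chi2 = sum(((y - t0 - t1 * x) / SIGMA) ** 2 for x, y in zip(X, Y))
    chi2 += ((t1 - T1) / SIGMA_T1) ** 2
    return -0.5 * chi2

def metropolis_hastings(log_p, start, step, n_steps, rng):
    """Random-walk sampler with a symmetric Gaussian proposal,
    for which q cancels in the Hastings test ratio."""
    current = list(start)
    lp_current = log_p(current)
    chain = []
    for _ in range(n_steps):
        proposal = [c + rng.gauss(0.0, s) for c, s in zip(current, step)]
        lp_proposal = log_p(proposal)
        alpha = math.exp(min(0.0, lp_proposal - lp_current))
        if rng.random() < alpha:
            current, lp_current = proposal, lp_proposal
        # Otherwise the old point is repeated in the chain.
        chain.append(tuple(current))
    return chain

rng = random.Random(12345)
chain = metropolis_hastings(log_posterior, (0.0, 0.0), (0.3, 0.08), 21000, rng)
samples = chain[1000:]               # discard an assumed burn-in period

# Marginalization: look only at the theta0 coordinate of the sampled points.
theta0 = [p[0] for p in samples]
mean0 = sum(theta0) / len(theta0)
std0 = math.sqrt(sum((t - mean0) ** 2 for t in theta0) / len(theta0))
print("posterior mean of theta0:", mean0)
print("posterior std of theta0: ", std0)
```

Marginalizing over the nuisance parameter costs nothing here: the θ1 coordinate of each sampled point is simply ignored.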

Metropolis-Hastings caveats

Actually one can only prove that the sequence of points follows the desired pdf in the limit where it runs forever.

There may be a “burn-in” period where the sequence does not initially follow the desired pdf.

Unfortunately there are few useful theorems to tell us when the sequence has converged. Look at trace plots and the autocorrelation. Check the result with a different proposal density. If you think it’s converged, try it again starting from 10 different initial points and see if you find the same result.
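
Two of these checks are easy to automate. The sketch below is an illustration under assumptions, not a convergence proof: it uses a one-dimensional standard Gaussian as a stand-in target, runs a random-walk Metropolis chain from 10 different starting points, compares the resulting means, and estimates the autocorrelation at two lags. The function names, step size, seed and burn-in length are hypothetical.

```python
import math
import random

def log_target(x):
    """Stand-in target: standard Gaussian (assumption, for illustration)."""
    return -0.5 * x * x

def metropolis_1d(start, step, n_steps, rng):
    """Random-walk Metropolis chain with a symmetric Gaussian proposal."""
    x, lp = start, log_target(start)
    chain = []
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)
        lp_y = log_target(y)
        if rng.random() < math.exp(min(0.0, lp_y - lp)):
            x, lp = y, lp_y
        chain.append(x)
    return chain

def autocorrelation(chain, lag):
    """Sample autocorrelation coefficient of the chain at the given lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((c - mean) ** 2 for c in chain) / n
    cov = sum((chain[i] - mean) * (chain[i + lag] - mean)
              for i in range(n - lag)) / n
    return cov / var

rng = random.Random(2024)
starts = [-10.0 + 20.0 * k / 9 for k in range(10)]   # 10 initial points
burn_in = 500                                         # assumed burn-in length

means = []
last_chain = []
for s in starts:
    last_chain = metropolis_1d(s, 1.0, 5500, rng)[burn_in:]
    means.append(sum(last_chain) / len(last_chain))

spread = max(means) - min(means)
rho1 = autocorrelation(last_chain, 1)
rho50 = autocorrelation(last_chain, 50)
print("spread of means over starting points:", spread)
print("autocorrelation at lag 1 and lag 50: ", rho1, rho50)
```

A spread of means that is small compared with the width of the target, together with an autocorrelation that dies away at large lag, is consistent with convergence but does not establish it.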

Example: posterior pdf from MCMC

Sample the posterior pdf from the previous example with MCMC. Summarize the pdf of the parameter of interest with, e.g., mean, median, standard deviation, etc.

Although the numerical values of the answer here are the same as in the frequentist case, the interpretation is different (sometimes unimportant?).

Sensitivity to prior

Vary $\pi(\theta)$ to explore how extreme your prior beliefs would have to be to justify various conclusions (sensitivity analysis). Try an exponential with different mean values... Try different functional forms...

Example #2: Poisson data with background

Count $n$ events, e.g., in a fixed time or integrated luminosity.

$s$ = expected number of signal events
$b$ = expected number of background events

$n \sim \mathrm{Poisson}(s + b)$:

$$P(n; s, b) = \frac{(s + b)^n}{n!} e^{-(s + b)}$$

Sometimes $b$ is known, other times it is in some way uncertain.

Goal: measure or place limits on $s$, taking into consideration the uncertainty in $b$.
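
One way to fold the uncertainty in b into a limit on s is to average the likelihood over a prior for b and then integrate the resulting posterior for s. The sketch below is an illustration under assumptions that are not stated in the text above: a flat prior for s ≥ 0, a Gaussian prior for b truncated to b > 0, simple midpoint-rule integration, and a 95% credibility level. The function names, grid sizes and numerical inputs are hypothetical.

```python
import math

def poisson_prob(n, mu):
    """Poisson probability of n events for mean mu > 0."""
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def marginal_likelihood(n, s, b_mean, b_sigma, nb=100):
    """Likelihood for s with b averaged over its prior.
    Assumption: Gaussian prior for b, truncated to b > 0.
    b_sigma = 0 means b is known exactly."""
    if b_sigma == 0.0:
        return poisson_prob(n, s + b_mean)
    lo = max(1e-9, b_mean - 5.0 * b_sigma)
    hi = b_mean + 5.0 * b_sigma
    db = (hi - lo) / nb
    num = 0.0
    den = 0.0
    for j in range(nb):
        b = lo + (j + 0.5) * db
        w = math.exp(-0.5 * ((b - b_mean) / b_sigma) ** 2)
        num += w * poisson_prob(n, s + b)
        den += w
    return num / den

def upper_limit(n, b_mean, b_sigma, cl=0.95, s_max=30.0, ns=1500):
    """Upper limit on s from the posterior with a flat prior for s >= 0
    (assumption), found by accumulating the posterior on a grid."""
    ds = s_max / ns
    post = [marginal_likelihood(n, (i + 0.5) * ds, b_mean, b_sigma)
            for i in range(ns)]
    total = sum(post)
    running = 0.0
    for i, p in enumerate(post):
        running += p
        if running >= cl * total:
            return (i + 1) * ds
    return s_max

limit_known_b = upper_limit(5, 3.0, 0.0)
limit_uncertain_b = upper_limit(5, 3.0, 1.0)
print("95% upper limit on s, b known:    ", limit_known_b)
print("95% upper limit on s, b uncertain:", limit_uncertain_b)
```

The flat prior for s and the truncated Gaussian for b are choices made for this sketch; trying other priors here is a direct way to carry out the kind of sensitivity analysis described earlier.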