Prepare for your exams
Get points
Guidelines and tips
Sell on Docsity
Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Earn points to download

Earn points by helping other students or get them with a premium plan

Guidelines and tips

Sell on Docsity

Docsity AI

Log in Sign up

Prepare for your exams

Study with the several resources on Docsity

Find documents

Prepare for your exams with the study notes shared by other students like you on Docsity

Search for your university

Find the specific documents for your university's exams

Docsity AINEW

Summarize your documents, ask them questions, convert them into quizzes and concept maps

Explore questions

Clear up your doubts by reading the answers to questions asked by your fellow students

Earn points to download

Earn points by helping other students or get them with a premium plan

Share documents

20 Points

For each uploaded document

Answer questions

5 Points

For each given answer (max 1 per day)

All the ways to get free points

Get points immediately

Choose a premium plan with all the points you need

Study Opportunities

Choose your next study program

Get in touch with the best universities in the world. Search through thousands of universities and official partners

Community

Ask the community

Ask the community for help and clear up your study doubts

Free resources

Our save-the-student-ebooks!

Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors

Understanding Gaussian Distributions & Error Propagation in Physics, Schemes and Mind Maps of Physics

Physics

An in-depth exploration of covariance and correlation in physics, focusing on the Gaussian distribution and its applications in understanding errors and uncertainties. Topics covered include the definition of covariance, the importance of Gaussian distributions, and the interpretation of multidimensional Gaussians. The document also discusses the relationship between free parameters and the covariance matrix, as well as the differences between maximum likelihood estimation and least squares methods.

Typology: Schemes and Mind Maps

2021/2022

Uploaded on 09/12/2022

christina 🇺🇸

4.6

(23)

393 documents

1 / 55

This page cannot be seen from the preview

Don't miss anything!

Physics 509 9

Covariance & Correlation

The covariance between two variables is defined by:

cov x , y= x

x

 y

y

=xy  x  y

This is the most useful thing they never tell you in most lab

courses! Note that cov(x,x)=V(x).

The correlation coefficient is a unitless version of the same

thing:

=cov x , y 



x



y

If x and y are independent variables (P(x,y) = P(x)P(y)), then

cov x , y=dx dy P x , y xy  dx dy P x , y xdx dy P x , y y

=dx P xxdy P yy  dx P xxdy P yy= 0

Partial preview of the text

Download Understanding Gaussian Distributions & Error Propagation in Physics and more Schemes and Mind Maps Physics in PDF only on Docsity!

Physics 509

Covariance & Correlation

The covariance between two variables is defined by:

cov

x , y

=

x

x^

y

y^

=

xy

^

x

y

This is the most useful thing they never tell you in most labcourses! Note that cov(x,x)=V(x).The correlation coefficient is a unitless version of the samething:

=

cov

x , y x

y

If x and y are independent variables

(P(x,y) = P(x)P(y))

, then

cov

x , y

=

dx dy P

x , y

xy

dx dy P

x , y

x

dx dy P

x , y

y

=

dx P

x

dy P

y

dx P

x

dy P

y

=

0

Physics 509

More on Covariance

Correlationcoefficients for somesimulated data sets.Note the bottomright---whileindependentvariables must havezero correlation, thereverse is not true!Correlation isimportant because itis part of the errorpropagationequation, as we'llsee.

Physics 509

Gaussian Distributions

By far the most useful distribution is the Gaussian (normal)

distribution:

P

x

2

e

1 2

x^

(^2)

68.27% of area within

±

1

σ

95.45% of area within

±

2

σ

99.73% of area within

±

3

σ

Mean =

μ

, Variance=

σ

2

Note that width scales with

σ

.

Area out on tails is important---uselookup tables or cumulativedistribution function.In plot to left, red area (>

σ

) is

2.3%. 90% of area within

±

σ

95% of area within

±

σ

99% of area within

±

σ

Physics 509

Why are Gaussian distributions so critical?

They occur very commonly---the reason is that the average ofseveral independent random variables often approaches aGaussian distribution in the limit of large N.

Nice mathematical properties---infinitely differentiable,symmetric. Sum or difference of two Gaussian variables isalways itself Gaussian in its distribution.

Many complicated formulas simplify to linear algebra, oreven simpler, if all variables have Gaussian distributions.

Gaussian distribution is often used as a shorthand fordiscussing probabilities. A ì5 sigma resultî means a resultwith a chance probability that is the same as the tail area ofa unit Gaussian:

^5

dt P

t

This way of speaking is used even for non-Gaussiandistributions!

Physics 509

Review of covariances of joint PDFs

Consider some multidimensional PDF

p(x

1

x

)n

. We define the

covariance between any two variables by:

cov

x

i^

, x

j

d

x p

x

^

x

i^

x

i^

x

j

x

j

The set of all possible covariances defines a covariance matrix,

often denoted by V

. The diagonal elements of Vij

ij^

are the

variances of the individual variables, while the off-diagonalelements are related to the correlation coefficients:

V

ij

[^

2 1

12

1

2

1

n^

1

n

21

1

n^

2 2

2n

2

n

1

n^

n

2

n^

2 n

]

Physics 509

Properties of covariance matrices

Covariance matrices always:

are symmetric and square

are invertible (very important requirement!) The most common use of a covariance matrix is to invert it then

use it to calculate a

χ

2

i

j

y

i^

f^

x

i^

V

ij

1

y

j

f^

x

j

If the covariances are zero, then

V

=ij

σ ij

(^2) i , and this reduces to:

2

i

^

y

i^

f^

^

x

i^

2

2 i

W

a

r

n

in

g

:^

d

o

N

O

T

u

s

e

t

h

e

s

im

p

li

f

ie

d

f

o

r

m

u

la

i

f

d

a

t

a

p

o

in

t

s

a

r

e

c

o

r

re

la

t

e

d

Physics 509

Taylor Series expansion

Consider a Taylor series expansion of the logarithm of the

PDF around its maximum at (

x

0 ,y

0

log

P

x , y

=

P

0

A

^

x

0

B

^

y

0

C

x

0 2

D

y

0 2

2

E

x

0

y

0 ...

Since we are expanding around the peak, then the first

derivatives must equal zero, so A=B=0. The remainingterms can be written in matrix form:

log

P

x , y

P

0

x ,

y

C

E

D

x

y

In order for (

x

0 ,y

0 ) to be a maximum of the PDF (and not a

minimum or saddle point), the above matrix must bepositive definite, and therefore invertible.

Physics 509

Taylor Series expansion

Let me now suggestively denote the inverse of the above

matrix by V

. It's a positive definite matrix with threeij

parameters. In fact, I might as well call these parameters σ

,x σ

, andy

Exponentiating, we see that around its peak the PDF can be

approximated by a multidimensional Gaussian. The fullformula, including normalization, is

log

P

x , y

P

0

x ,

y

C

E

D

x

y

P

x , y

=

1

2

x^

y

1

2

exp

{

1

2

1

2

[ ^

x

0

x^

(^2)

^

y

0

y

(^2)

2

^

x

0

x^

y

0

y

^ ] }

This is a good approximation as long as higher order terms inTaylor series are small.

Physics 509

P(x|y)^ P

x , y

=

1

2

x^

y

1

2

exp

{

1

2

1

2

[ ^

x

0

x^

(^2)

^

y

0

y

(^2)

2

^

x

0

x^

y

0

y

^ ] }

Note: if you view y as a fixed parameter, then the PDF P(x|y) is aGaussian with width of:

x^

2

and a mean value of

x

0

^

x

y

0

(It makes sense that the width of P(x|y) is always narrower thanthe width of the marginalized PDF P(x) (integrated over y). Ifyou know the actual value of y, you have additional informationand so a tighter constraint on x.

Physics 509

=2x

=1y

ρ

Red ellipse:contour withargument ofexponentialset to equal-1/2 Blue ellipse:contourcontaining68% of 2Dprobabilitycontent.

Physics 509

Probability content inside a contour ellipse^ For a 1D Gaussian exp(-x

2

σ

2 ), the ±

σ

limits occur when the

argument of the exponent equals -1/2. For a Gaussian there's a68% chance of the measurement falling within around the mean. But for a 2D Gaussian this is not the case. Easiest to see this for

the simple case of

σ

=x

σ

=1:y

dx dy

exp

[^

x

2

y

2

=]

r ^0 0

dr

exp

[^

r

2

= ]

Evaluating this integral and solving gives r

0 2 =2.3. So 68% of

probability content is contained within a radius of

σ

We call this the 2D contour. Note that it's bigger than the 1Dversion---if you pick points inside the 68% contour and plottheir x coordinates, they'll span a wider range than thosepicked from the 68% contour of the 1D marginalized PDF!

Physics 509

=2x

=1y

ρ

Red ellipse:contour withargument ofexponentialset to equal-1/2 Blue ellipse:contourcontaining68% ofprobabilitycontent.

Physics 509

Two marginalization procedures

Normal marginalization procedure: integrate over nuisance variables:

P

x

dy P

x , y

Alternate marginalization procedure: maximize the likelihood as a function of

the nuisance variables, and return the result:

P

x

max

y

P

x , y

(It is not necessarily the case that the resulting PDF is normalized.) I can prove for Gaussian distributions that these two marginalization

procedures are equivalent, but cannot prove it for the general case (In factthey give different results). Bayesians always follow the first prescription. Frequentists most often use

the second. Sometimes it will be computationally easier to apply one, sometimes the

other, even for PDFs that are approximately Gaussian.

Physics 509

Maximum likelihood estimators By far the most useful estimator is the maximum likelihoodmethod. Given your data set x

1

... x

N

and a set of unknown

parameters

α

, calculate the likelihood function

L

^

x

1

x

N

^ i

1 N

P

x

i

It's more common (and easier) to calculate -ln

L

instead:

ln

L

x

1

x

N

^ i =

1 N

ln

P

x

i

The maximum likelihood estimator is that value of

α

which

maximizes L as a function of

α

. It can be found by minimizing

-ln

L

over the unknown parameters.

Understanding Gaussian Distributions & Error Propagation in Physics, Schemes and Mind Maps of Physics

Related documents

Partial preview of the text

Download Understanding Gaussian Distributions & Error Propagation in Physics and more Schemes and Mind Maps Physics in PDF only on Docsity!

Covariance & Correlation

More on Covariance

Correlationcoefficients for somesimulated data sets.Note the bottomright---whileindependentvariables must havezero correlation, thereverse is not true!Correlation isimportant because itis part of the errorpropagationequation, as we'llsee.

Gaussian Distributions

P

Why are Gaussian distributions so critical?

Nice mathematical properties---infinitely differentiable,symmetric. Sum or difference of two Gaussian variables isalways itself Gaussian in its distribution.

Many complicated formulas simplify to linear algebra, oreven simpler, if all variables have Gaussian distributions.

Gaussian distribution is often used as a shorthand fordiscussing probabilities. A ì5 sigma resultî means a resultwith a chance probability that is the same as the tail area ofa unit Gaussian:

This way of speaking is used even for non-Gaussiandistributions!

Review of covariances of joint PDFs

^

V

[^

]

Properties of covariance matrices

V

V

σ ij

^

^

W

a

r

n

in

g

:^

d

o

N

O

T

u

s

e

t

h

e

s

im

p

li

f

ie

d

f

o

r

m

u

la

i

f

d

a

t

a

p

o

in

t

s

a

r

e

c

o

r

re

la

t

e

d

Taylor Series expansion

Taylor Series expansion

^

^

^

^ i