









Part of a lecture series on statistical data analysis by G. Cowan. It focuses on the Monte Carlo method, a numerical technique for calculating probabilities and related quantities using sequences of random numbers. It covers the basics of the Monte Carlo method, random number generators, and the transformation and acceptance-rejection methods. It also discusses the use of Monte Carlo event generators and detector simulations in physics.










G. Cowan Lectures on Statistical Data Analysis Lecture 2 page 1
1 Probability, Bayes’ theorem, random variables, pdfs
2 Functions of r.v.s, expectation values, error propagation
3 Catalogue of pdfs
4 The Monte Carlo method
5 Statistical tests: general concepts
6 Test statistics, multivariate methods
7 Goodness-of-fit tests
8 Parameter estimation, maximum likelihood
9 More maximum likelihood
10 Method of least squares
11 Interval estimation, setting limits
12 Nuisance parameters, systematic uncertainties
13 Examples of Bayesian approach
14 tba
15 tba
What it is: a numerical technique for calculating probabilities and related quantities using sequences of random numbers.
The usual steps:
(1) Generate sequence $r_1, r_2, \ldots, r_m$ uniform in [0, 1].
(2) Use this to produce another sequence $x_1, x_2, \ldots, x_n$ distributed according to some pdf $f(x)$ in which we’re interested ($x$ can be a vector).
(3) Use the $x$ values to estimate some property of $f(x)$, e.g., fraction of $x$ values with $a < x < b$ gives $\int_a^b f(x)\,dx$.
→ MC calculation = integration (at least formally)
MC generated values = ‘simulated data’ → use for testing statistical procedures
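The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the pdf $f(x) = e^{-x}$, the interval (0.5, 1.5), the seed, and the helper name `mc_fraction` are all assumptions chosen so that the exact integral is known for comparison.

```python
import math
import random

def mc_fraction(sample, a, b, n, rng):
    """Estimate P(a < x < b) as the fraction of n generated x values inside (a, b)."""
    hits = sum(1 for _ in range(n) if a < sample(rng) < b)
    return hits / n

rng = random.Random(12345)   # fixed seed so the run is reproducible

# Assumed pdf for illustration: f(x) = exp(-x), x >= 0, sampled by the standard library.
estimate = mc_fraction(lambda r: r.expovariate(1.0), 0.5, 1.5, 100_000, rng)

# Exact value of the integral of exp(-x) from 0.5 to 1.5, for comparison.
exact = math.exp(-0.5) - math.exp(-1.5)
print(f"MC estimate = {estimate:.4f}, exact = {exact:.4f}")
```

The statistical error of the estimate falls as $1/\sqrt{n}$, so each extra digit of precision costs roughly a factor of 100 in generated values.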
The sequence is (unfortunately) periodic! Example (see Brandt Ch 4): $a = 3$, $m = 7$, $n_0$
← sequence repeats
Choose $a$, $m$ to obtain a long period (maximum = $m - 1$); $m$ is usually close to the largest integer that can be represented in the computer. Only use a subset of a single period of the sequence.
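The periodicity is easy to see numerically. The sketch below assumes the generator under discussion is a multiplicative linear congruential generator, $n_{i+1} = (a\,n_i) \bmod m$, and takes the seed $n_0 = 1$ as an illustrative choice (the seed value is an assumption); the function name `mlcg` is hypothetical.

```python
from itertools import islice

def mlcg(a, m, seed):
    """Yield n_1, n_2, ... from the recurrence n_{i+1} = (a * n_i) mod m."""
    n = seed
    while True:
        n = (a * n) % m
        yield n

# a = 3 and m = 7 as in the example; seed n_0 = 1 is an assumed value.
seq = list(islice(mlcg(a=3, m=7, seed=1), 8))
print(seq)  # [3, 2, 6, 4, 5, 1, 3, 2]
```

After six values the sequence returns to its starting point, which is the maximum period $m - 1 = 6$ for this modulus.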
The $r_i$ are in [0, 1] but are they ‘random’? Choose $a$, $m$ so that the $r_i$ pass various tests of randomness: uniform distribution in [0, 1], all values independent (no correlations between pairs). E.g. L’Ecuyer, Commun. ACM 31 (1988) 742 suggests $a = 40692$, $m = 2147483399$.
Far better generators are available, e.g. TRandom3, based on the Mersenne twister algorithm, period = $2^{19937} - 1$ (a “Mersenne prime”). See F. James, Comp. Phys. Comm. 60 (1990) 111; Brandt Ch. 4.
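Two of the simplest randomness checks, the sample mean and the correlation between successive values, can be sketched as follows. The code assumes a multiplicative linear congruential generator with $r_i = n_i / m$ and the L’Ecuyer constants quoted above; the seed and the helper names are illustrative. Python's standard `random` module, which is based on the Mersenne Twister, is included for comparison. Passing these two checks is necessary but far from sufficient for a good generator.

```python
import random

A, M = 40692, 2147483399   # multiplier and modulus quoted from L'Ecuyer (1988)

def mlcg_uniform(seed, count):
    """Return count values r_i = n_i / M from n_{i+1} = (A * n_i) mod M."""
    n, out = seed, []
    for _ in range(count):
        n = (A * n) % M
        out.append(n / M)
    return out

def mean_and_lag1_corr(r):
    """Sample mean and correlation coefficient between successive values."""
    k = len(r) - 1
    mean = sum(r) / len(r)
    mx = sum(r[:-1]) / k
    my = sum(r[1:]) / k
    cov = sum((x - mx) * (y - my) for x, y in zip(r[:-1], r[1:])) / k
    vx = sum((x - mx) ** 2 for x in r[:-1]) / k
    vy = sum((y - my) ** 2 for y in r[1:]) / k
    return mean, cov / (vx * vy) ** 0.5

lcg_values = mlcg_uniform(seed=12345, count=100_000)   # seed is an arbitrary choice
mt = random.Random(12345)                               # Mersenne Twister generator
mt_values = [mt.random() for _ in range(100_000)]

for name, values in (("MLCG", lcg_values), ("Mersenne Twister", mt_values)):
    mean, corr = mean_and_lag1_corr(values)
    print(f"{name}: mean = {mean:.4f}, lag-1 correlation = {corr:+.4f}")
```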
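Exponential pdf: $f(x; \xi) = \frac{1}{\xi} e^{-x/\xi}$ ($x \ge 0$). Set $\int_0^x \frac{1}{\xi} e^{-x'/\xi}\,dx' = r$ and solve for $x(r)$. → $x(r) = -\xi \ln(1 - r)$ ($x(r) = -\xi \ln r$ works too.)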
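As a minimal sketch of the transformation method, take the exponential pdf $f(x; \xi) = \frac{1}{\xi} e^{-x/\xi}$ for $x \ge 0$. Setting its cumulative distribution equal to a uniform $r$ and inverting gives $x(r) = -\xi \ln(1 - r)$. The value $\xi = 2$, the seed, and the function name are assumptions made for illustration.

```python
import math
import random

def exponential_by_transformation(xi, rng):
    """Return x = -xi * ln(1 - r) for r uniform in [0, 1)."""
    r = rng.random()
    return -xi * math.log(1.0 - r)   # 1 - r lies in (0, 1], so the log is finite

rng = random.Random(2024)
xi = 2.0                              # illustrative mean of the exponential
xs = [exponential_by_transformation(xi, rng) for _ in range(100_000)]
sample_mean = sum(xs) / len(xs)
print(f"sample mean = {sample_mean:.3f} (expected about {xi})")
```

Using $1 - r$ rather than $r$ avoids taking the logarithm of zero, because `random()` can return 0.0 but never 1.0.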
Enclose the pdf in a box:
(1) Generate a random number $x$, uniform in $[x_{\min}, x_{\max}]$, i.e. $x = x_{\min} + r_1 (x_{\max} - x_{\min})$, where $r_1$ is uniform in [0, 1].
(2) Generate a 2nd independent random number $u$ uniformly distributed between 0 and $f_{\max}$, i.e. $u = r_2 f_{\max}$.
(3) If $u < f(x)$, then accept $x$. If not, reject $x$ and repeat.
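The box method can be sketched as below, with $x = x_{\min} + r_1 (x_{\max} - x_{\min})$ and $u = r_2 f_{\max}$ for independent uniform $r_1$, $r_2$. The pdf $f(x) = \frac{3}{8}(1 + x^2)$ on $[-1, 1]$ is an assumption chosen for illustration because its normalisation and maximum are easy to verify by hand.

```python
import random

def accept_reject(f, x_min, x_max, f_max, rng):
    """Return (x, trials): one accepted x and the number of trials it took."""
    trials = 0
    while True:
        trials += 1
        x = x_min + rng.random() * (x_max - x_min)   # step (1): uniform in the box width
        u = rng.random() * f_max                     # step (2): uniform in the box height
        if u < f(x):                                 # step (3): accept if under the curve
            return x, trials

def f(x):
    """Illustrative pdf on [-1, 1]; integrates to 1 and has maximum 0.75 at x = +-1."""
    return 0.375 * (1.0 + x * x)

rng = random.Random(7)
results = [accept_reject(f, -1.0, 1.0, 0.75, rng) for _ in range(50_000)]
xs = [x for x, _ in results]
efficiency = len(results) / sum(t for _, t in results)
print(f"acceptance fraction = {efficiency:.3f}")
```

The box has area $2 \times 0.75 = 1.5$ and the pdf has area 1, so about two thirds of the trial points should be accepted.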
The fraction of accepted points is equal to the fraction of the box’s area under the curve. For very peaked distributions, this may be very low and thus the algorithm may be slow.
Improve by enclosing the pdf $f(x)$ in a curve $C h(x)$ that conforms to $f(x)$ more closely, where $h(x)$ is a pdf from which we can generate random values and $C$ is a constant. Generate points uniformly over $C h(x)$. If the point is below $f(x)$, accept $x$.
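A sketch of the improved method, under assumed choices of $f$ and $h$: the target is the half-normal pdf $f(x) = \sqrt{2/\pi}\, e^{-x^2/2}$ on $x \ge 0$, and the envelope uses the exponential pdf $h(x) = e^{-x}$ with $C = \sqrt{2e/\pi}$, the smallest constant for which $C h(x) \ge f(x)$ everywhere. These functions are illustrative and not taken from the lecture.

```python
import math
import random

C = math.sqrt(2.0 * math.e / math.pi)   # smallest C with C*h(x) >= f(x) for all x >= 0

def f(x):
    """Target pdf (assumed for illustration): half-normal on x >= 0."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * x * x)

def h(x):
    """Envelope pdf: exponential with unit mean."""
    return math.exp(-x)

def sample_f(rng):
    """Return (x, trials) with x distributed according to f."""
    trials = 0
    while True:
        trials += 1
        x = -math.log(1.0 - rng.random())   # x distributed according to h
        u = rng.random() * C * h(x)         # uniform height under the envelope at x
        if u < f(x):                        # point lies below f(x): accept
            return x, trials

rng = random.Random(31)
results = [sample_f(rng) for _ in range(50_000)]
xs = [x for x, _ in results]
efficiency = len(results) / sum(t for _, t in results)
print(f"mean = {sum(xs) / len(xs):.3f}, acceptance fraction = {efficiency:.3f}")
```

Since both $f$ and $h$ are normalised, the acceptance fraction is $1/C \approx 0.76$. A fixed box would not work at all here, because the range of $x$ is unbounded.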
Simple example: $e^+ e^- \to \mu^+ \mu^-$
Less simple: ‘event generators’ for a variety of reactions: $e^+ e^-$ ...
A Monte Carlo detector simulation takes as input the particle list and momenta from the generator.
Simulates detector response: multiple Coulomb scattering (generate scattering angle), particle decays (generate lifetime), ionization energy loss, electromagnetic and hadronic showers, production of signals, electronics response, ...
Output = simulated raw data → input to reconstruction software: track finding, fitting, etc.
Predict what you should see at ‘detector level’ given a certain hypothesis for ‘generator level’. Compare with the real data.
Estimate ‘efficiencies’ = # events found / # events generated.
Programming package: GEANT
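The efficiency estimate can be illustrated with a toy model. Everything in this sketch is a hypothetical stand-in and has nothing to do with the GEANT interface: the ‘generator’ produces exponential decay times, and the ‘detector’ applies Gaussian smearing followed by a selection cut. The resolution, cut value, and function names are assumptions.

```python
import math
import random

def generate_lifetimes(n, tau, rng):
    """Toy 'generator level': decay times from an exponential with mean tau."""
    return [-tau * math.log(1.0 - rng.random()) for _ in range(n)]

def simulate_detector(t_true, resolution, t_cut, rng):
    """Toy 'detector level': Gaussian smearing, then a cut. Returns measured time or None."""
    t_meas = rng.gauss(t_true, resolution)
    return t_meas if t_meas > t_cut else None

rng = random.Random(99)
generated = generate_lifetimes(50_000, tau=1.0, rng=rng)

# Pass every generated event through the toy detector and keep those that survive the cut.
measured = (simulate_detector(t, 0.2, 0.5, rng) for t in generated)
found = [t for t in measured if t is not None]

efficiency = len(found) / len(generated)   # '# events found / # events generated'
print(f"efficiency = {len(found)} / {len(generated)} = {efficiency:.3f}")
```

In a real analysis the same ratio is computed from the full detector simulation and reconstruction chain, and is then used to correct the observed event counts.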
We’ve now seen the Monte Carlo method: calculations based on sequences of random numbers, used to simulate particle collisions, detector response. So far, we’ve mainly been talking about probability. But suppose now we are faced with experimental data. We want to infer something about the (probabilistic) processes that produced the data. This is statistics, the main subject of the following lectures.
In 1955 the RAND Corporation published a book of random numbers generated with an “electronic roulette wheel”, based on random-frequency electronic pulses. You can download all 1,000,000 of them (and buy the book) from www.rand.org.