











Roger Barlow Manchester University, UK and Stanford University, USA
An account is given of the methods of working of Experimental High Energy Particle Physics, from the viewpoint of statisticians and others unfamiliar with the field. Current statistical problems, techniques, and hot topics are introduced and discussed.
Particle Physics emerged as a discipline in its own right half a century ago. It pioneered ‘big science’; experiments are performed at accelerators of increasing energy and complexity by collaborations of many physicists from many institutes. It has evolved a research methodology within which statistics is of great importance, although it has done so without strong links to the statistics community – a fault that this conference exists to remedy. Thus although a statistician will be familiar with the research methods and statistical issues arising in, say, agricultural field trials or clinical testing, they may be interested in a brief description of how particle physicists do research, and the statistical issues that arise.
Particle physics is also known as High Energy Physics^1 and the names are sometimes merged to give High Energy Particle Physics. Whatever it is called, its field of study is all the ‘Elementary’ Particles that have been discovered:
To this long list must be added the corresponding list of antiparticles. However this is not all: the domain of particle physics also includes all the particles that have not yet been discovered – some of which never will be discovered:
(^1) The terms are almost equivalent; strictly the phrase ‘High Energy’ means ‘above the threshold for pion production’, i.e., the energy at which a collision between two protons can produce three outgoing particles.
This list of proposed particles is limited only by the imagination of the theorists who propose them – which is no limitation at all. For each species of particle we want to establish:
To study some phenomenon X, which could be any of the above, a particle physics experiment goes through the following stages:
a large number of ordinary protons (perhaps as hydrogen in water) in suitably low-background conditions deep underground and waits to observe any decays.
For important studies dedicated experiments (even accelerators) are built. For many more, the experimenter utilizes data taken with an experiment designed primarily for another purpose but also favorable for X. An example is the study of charm mesons at BABAR, Belle and CLEO, for which the primary purpose is B physics.
A detector is built (or an existing detector is utilized). ‘Events’ – interactions or decays – are observed by a whole range of detectors (tracking detectors like drift chambers and silicon detectors, calorimeters that measure deposited energy). Fast logic and/or online computers select the events that look promising, and these are recorded: the phrase ‘written to tape’ is used even though today the recording medium is generally disk storage.
The electronic signals are combined and interpreted: points are joined to form tracks, and measurement of their curvature in a magnetic field gives the particle momentum. A calorimeter may give the energy, a Cherenkov counter the velocity. From this emerges a reconstructed ‘event’ as a list of the particles produced, their kinematic quantities (energies and directions) and possibly their identity (as pions or kaons or electrons, etc.).
Knowing the pattern one is looking for, one can then select the events that contain the phenomenon being studied.
A key point is that this selection (and also the electronic selection described above) is not going to be perfect. There will always be a selection efficiency which is less than 100%.
There is also a chance that some of the events that look like X and survive the selection and the cuts are actually from some other process. There will be a background which is greater than zero. Statistical techniques are obviously important for the treatment and understanding of efficiency and background.
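The way efficiency and background enter a result can be made concrete with a small sketch. It assumes the simplest possible case, a counting experiment in which the expected background and the selection efficiency are already known exactly; the function name and the numbers are invented for illustration and are not taken from any real analysis.

```python
import math

def estimate_signal(n_observed, n_background, efficiency):
    """Estimate the number of X events produced, before selection.

    Assumes (for illustration only) that the background expectation and
    the efficiency are known exactly, so the only uncertainty is the
    Poisson fluctuation on the observed count.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    n_signal = (n_observed - n_background) / efficiency
    # Poisson error on the observed count, scaled by the same efficiency.
    error = math.sqrt(n_observed) / efficiency
    return n_signal, error

# Hypothetical numbers: 150 events pass the cuts, 30 are expected from
# background, and 40% of genuine X events survive the selection.
n_signal, error = estimate_signal(150, 30.0, 0.40)
print(f"produced signal events: {n_signal:.0f} +/- {error:.0f}")
```

In a real analysis the background and efficiency carry their own uncertainties, which is where much of the statistical work lies.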
[Figure 1 plots: two panels, B^0 → D π (top) and B^0 → D* π (bottom), each showing Events / 2.5 MeV/c^2 against mES (GeV/c^2), with the background contribution marked.]
Figure 1: Examples of analysis: B^0 decay to Dπ and D∗π
Relevant quantities, sensitive to X, are formed from the kinematic variables of the particles detected and measured. These are typically displayed in a histogram, or histograms. (Joint two-dimensional plots are also common. Sometimes, but rarely, the data at this stage is a single number.)
These distributions are then compared with the theoretical predictions, of which there may be several. One will be the predicted distribution if X is not present. Another may be the prediction if X is present in the amount, and with the properties, predicted by an expected theory such as the ‘Standard Model’ [1] of Particle Physics. There may also be predictions obtained within the framework of a particular model, but with one or more parameters adjusted to fit the data.
An example of such a result is shown in Figure 1 (taken from [2]). In the top plot the phenomenon X is the decay of the B^0 meson to Dπ, in the lower plot the decay to D∗π. The distributions show the invariant mass, which is the quantity given by
$$M^2 c^4 = \left( \sum_i E_i \right)^2 - \left( \sum_i \vec{p}_i c \right)^2$$
where the sums run over the two final-state particles. If the two observed particles do indeed come from the decay of a B^0 particle then this quantity should be 5.28 GeV/c^2, though this is smeared out by experimental resolution. The plots show the predictions of a theory in which this decay does not occur (and all events are background) and also a prediction in which the decay is produced, with a normalization adjusted to give the best fit to the data. The result of this fit gives the number of signal events, from which the branching ratio can be obtained (though in fact that was not done in this example).
If that looks trivial, a harder example is the decay B → π^0 π^0, taken from [3] and shown in Figure 2. (To be fair, things are not quite as bad as this one-dimensional plot implies.)
In this confrontation of theory with experiment, one can then ask: is there any evidence for X or is the null
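The invariant mass of a pair of reconstructed particles follows directly from their energies and momenta. The sketch below works in units where c = 1 (energies in GeV, momenta in GeV/c); the function name is illustrative, and the example uses two massless back-to-back particles purely to keep the arithmetic transparent, not because the real decay products are massless.

```python
import math

def invariant_mass(particles):
    """Invariant mass of a set of particles.

    Each particle is a tuple (E, px, py, pz) with E in GeV and momentum
    components in GeV/c, so the result is in GeV/c^2.
    """
    e_sum = sum(p[0] for p in particles)
    px, py, pz = (sum(p[k] for p in particles) for k in (1, 2, 3))
    m2 = e_sum**2 - (px**2 + py**2 + pz**2)
    # Measurement errors can push m2 slightly negative; clamp at zero.
    return math.sqrt(max(m2, 0.0))

# Hypothetical event: a parent at rest decaying to two massless particles.
pair = [(2.64, 0.0, 0.0, 2.64), (2.64, 0.0, 0.0, -2.64)]
print(f"invariant mass = {invariant_mass(pair):.2f} GeV/c^2")
```

Filling a histogram with this quantity for every selected candidate gives a plot of the kind shown in Figure 1.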
Having seen that particle physics makes little use of some statistical tools, we take a more detailed look at the ones it does utilize.
Theoretical distributions for the quantities being studied are predicted by quantum mechanics – perhaps with a few unknown parameters – and are often beautiful and simple. Angular distributions may be flat, or described by a few trigonometric terms; masses often follow a Cauchy function (which the particle physicists call the Breit-Wigner), time distributions may be exponential, or exponential with a sinusoidal oscillation.
These beautiful and simple forms are generally modified by unbeautiful and complicated effects (higher-order calculations in perturbation theory, or the fragmentation of quarks into other particles). Furthermore the measurement and reconstruction process that the detector does for the particles is not completely accurate or completely efficient.
The translation from knowing the distributions in principle to knowing them in practice is done by Monte Carlo simulation. Particles are generated according to the original simple distributions, and then put through repeated random processes to describe the theoretical complications and then the passage of particles through the detector, including probabilities for colliding with nuclei in the beam pipe, slipping through cracks in the acceptance, or other eventualities. A complete software representation of all the experimental hardware has to be coded. The effects of the particles on the detector elements are simulated and the information used to reconstruct the kinematic quantities using the same programs that are run on the real data. This provides the full theoretical distribution function that the data is predicted to follow, albeit as a histogram rather than a smooth curve.
These programs are large and slow to run. Significant resources (both people and machines) are put into them. The generation of ‘Monte Carlo data’ is a significant issue for all experiments. Cases are known where data has been taken and analyzed but results delayed because of lack of the correct Monte Carlo data [4].
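A drastically reduced version of this chain can be written in a few lines. The sketch below assumes a single quantity (a mass), a Breit-Wigner shape for the underlying theory, a flat random inefficiency and a Gaussian resolution; real simulations model the full detector, and every name and number here is invented for illustration.

```python
import math
import random

def simulate_mass_histogram(n_events, m0, gamma, resolution, efficiency,
                            lo, hi, n_bins, seed=1):
    """Toy detector simulation for a resonance mass peak.

    Generates masses from a Breit-Wigner (Cauchy) shape with peak m0 and
    full width gamma, drops events at random to mimic inefficiency,
    smears the survivors with a Gaussian resolution, and histograms the
    'reconstructed' values.
    """
    rng = random.Random(seed)
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for _ in range(n_events):
        # Inverse-CDF sampling of a Cauchy distribution.
        true_mass = m0 + 0.5 * gamma * math.tan(math.pi * (rng.random() - 0.5))
        if rng.random() > efficiency:
            continue  # event lost to the imperfect selection
        measured = rng.gauss(true_mass, resolution)
        if lo <= measured < hi:
            # min() guards against rounding at the upper histogram edge.
            counts[min(int((measured - lo) / width), n_bins - 1)] += 1
    return counts

histogram = simulate_mass_histogram(20000, m0=1.0, gamma=0.05,
                                    resolution=0.02, efficiency=0.8,
                                    lo=0.8, hi=1.2, n_bins=40)
print(histogram)
```

The output is the predicted distribution as a histogram, which is the form in which it is then compared with the data.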
Having the parametrized theoretical description of the distribution means the likelihood function is always known, and it assumes an overwhelmingly important position. Writing this function – where the x_i are the data and θ the unknown parameter(s) –
$$L(x_1, x_2 \ldots x_N | \theta) = \prod_i P(x_i | \theta), \qquad \ln L = \sum_i \ln P(x_i | \theta)$$
the form P(x|θ) is totally known, and L (or ln L) follows.
Figure 3: The log likelihood as a function of a parameter
Having the likelihood function, the Maximum Likelihood estimator is then easy to implement, and is very widely used. Even estimators like least-squares are, at least by some, ‘justified’ as being derivable from Maximum Likelihood. Its (asymptotic) efficiency, and its invariance properties are desirable and useful.
In some cases the ML estimate leads to an algebraic solution but in general, and in complex analyses, the physicist just maps out ln L for their data set as a function of θ and reads off the ML estimator from the peak, as can be done in Figure 3. This also produces an interval estimate as part of the minimization process. Following the value of ln L until it falls off by 1/2 from its maximum gives the 68% central confidence interval. Strictly speaking this is valid only for large N, but this restriction is generally disregarded. Perhaps we should not be so cavalier about doing so.
Maximum Likelihood methods can also be used for functions with several parameters, as illustrated in Figure 4. Confidence regions are mapped out by reading off the likelihood contours. This is done in many analyses and the MINUIT program [5] is widely used in exploring the likelihood and parameter space.
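Mapping out ln L and reading off the peak and the points where it has fallen by 1/2 can be sketched as follows. The example assumes an exponential decay-time distribution with a single parameter, the lifetime, and uses a plain grid scan rather than a minimizer such as MINUIT; the function names, the grid and the generated data are all illustrative.

```python
import math
import random

def log_likelihood(times, tau):
    """ln L for decay times following P(t|tau) = exp(-t/tau) / tau."""
    return -len(times) * math.log(tau) - sum(times) / tau

def scan_likelihood(times, tau_lo, tau_hi, n_steps=2000):
    """Map out ln L on a grid; return (estimate, lower, upper).

    The estimate is the grid point with the largest ln L, and the
    interval is the range of grid points where ln L is within 1/2 of
    that maximum. This assumes ln L has a single peak inside the grid.
    """
    step = (tau_hi - tau_lo) / n_steps
    grid = [tau_lo + i * step for i in range(n_steps + 1)]
    values = [log_likelihood(times, tau) for tau in grid]
    best = max(values)
    inside = [tau for tau, v in zip(grid, values) if v >= best - 0.5]
    return grid[values.index(best)], inside[0], inside[-1]

rng = random.Random(42)
data = [rng.expovariate(1.0 / 2.0) for _ in range(200)]  # true tau = 2.0
tau_hat, lower, upper = scan_likelihood(data, 1.0, 4.0)
print(f"tau = {tau_hat:.3f}  (68% interval {lower:.3f} to {upper:.3f})")
```

For this particular model the ML estimate is simply the sample mean, which gives a convenient cross-check on the scan.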
Fitting the parametrized curve to the experimental data is done by several techniques.
$(y_i - f(x_i|\theta))^2 / n$ has the advantage that the minimization can be done by differentiating and solving the normal equation, which is especially simple if f is linear in θ. However the use of the observed number rather than the predicted number in the denominator
Figure 4: Contours of ln L in two dimensions
is recognized to lead to bias (downward fluctuations get an undue weight) and this cannot safely be used if n is small. (Actually in many cases what happens is that one of the bins has n = 0, and the physicist gets divide-by-zero messages and then starts to worry.)
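The downward bias is easy to exhibit in the simplest case. The sketch below assumes the fitted curve is a constant level f = θ in every bin, for which the minimizing value can be written in closed form for each choice of denominator; the function name and the bin contents are invented for illustration.

```python
import math

def fit_constant(counts):
    """Fit a flat level to histogram bin counts in three ways.

    Returns a dict with the closed-form minimizers:
      'observed'  - chi2 with the observed n_i in the denominator
      'predicted' - chi2 with the predicted level in the denominator
      'mean'      - the arithmetic mean (the Poisson ML estimate)
    """
    if any(n <= 0 for n in counts):
        # The divide-by-zero case: an empty bin has no usable error.
        raise ZeroDivisionError("observed-count chi2 undefined for empty bins")
    k = len(counts)
    return {
        "observed": k / sum(1.0 / n for n in counts),            # harmonic mean
        "predicted": math.sqrt(sum(n * n for n in counts) / k),  # rms
        "mean": sum(counts) / k,
    }

result = fit_constant([8, 12, 9, 11, 10])
print(result)
```

With the observed count in the denominator the answer is the harmonic mean of the bin contents, which can never exceed the arithmetic mean: bins that fluctuate down are given more weight than they deserve.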
Having found a fit, one has to judge whether to believe it. Whether the question is ‘Does the curve really describe the data?’ or ‘Do the data really fit the curve?’ depends on one’s point of view. The likelihood value does not contain the answer to this question. This appears counter-intuitive and many people have wrestled (unsuccessfully) to produce ways that the likelihood can be used to say something about the quality of the fit.
The $\chi^2 = \sum_i \left( \frac{y_i - f(x_i|\theta)}{\sigma_i} \right)^2$ certainly does give a goodness of fit number. It is heavily used for GoF and 2-sample tests: researchers may quote χ^2 or χ^2/ND or the probability of exceeding this χ^2.
Alternative measures of goodness of fit have never really caught on. The Kolmogorov-Smirnov test is occasionally used – generally misleadingly, in my opinion. This is a totally robust test but pays the price for that by being weak. If you know anything about the data, e.g., that the numerical value of the parameter means something, then a more powerful test should be available. The KS test is being used to certify that distributions are in agreement when a more powerful approach would show up a difference.
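The three numbers usually quoted (χ^2, χ^2/ND and the probability of exceeding χ^2) can be computed with nothing beyond the standard library. This is a sketch only: the function names are illustrative, the probability uses a plain series expansion of the incomplete gamma function that is adequate for modest χ^2 values, and the data points are invented.

```python
import math

def chi_squared(y, f, sigma):
    """Sum of squared, error-normalized residuals between data and fit."""
    return sum(((yi - fi) / si) ** 2 for yi, fi, si in zip(y, f, sigma))

def chi2_probability(chi2, ndof):
    """Probability of exceeding chi2 for ndof degrees of freedom.

    Uses the series for the regularized lower incomplete gamma function;
    adequate for the modest chi2 values of a sketch like this.
    """
    a, x = 0.5 * ndof, 0.5 * chi2
    if x <= 0.0:
        return 1.0
    term = total = 1.0 / a
    n = 1
    while term > 1e-16 * total:
        term *= x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(a * math.log(x) - x - math.lgamma(a))
    return max(0.0, 1.0 - lower)

# Hypothetical data: four points compared with a fixed prediction of 10.
y = [12.0, 9.0, 11.0, 8.0]
f = [10.0] * 4
sigma = [math.sqrt(10.0)] * 4
chi2 = chi_squared(y, f, sigma)
ndof = len(y)  # no parameters were fitted in this example
print(f"chi2 = {chi2:.2f}, chi2/ND = {chi2 / ndof:.2f}, "
      f"P = {chi2_probability(chi2, ndof):.3f}")
```

When parameters have been adjusted in the fit, ND is the number of points minus the number of fitted parameters.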
The ‘Toy Monte Carlo’ has emerged as a technique made possible by modern computing resources. Having obtained a result, it may be hard or impossible to obtain significance levels or confidence regions in the traditional analytic way, for instance if the likelihood function one is studying does not even plausibly resemble a distorted parabola, but instead some shape with multiple maxima.
As an alternative approach, starting with an estimate $\hat{\theta}_{\rm exp}$ from the data, say $\{x_1 \ldots x_N\}$, how can one establish a confidence region? Consider any particular θ. Use the known L(x|θ) to generate a set of N values of x – an “experiment”. Use this in your estimator (whatever that is) to find a corresponding $\hat{\theta}$. Repeating many times gives the probability that this θ will give an estimate below (or above) the experimental one. This is just what the Neyman construction uses. To find a particular confidence region one has to explore the parameter space until one finds the limits one wants.
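The procedure can be sketched for a one-parameter problem. The example assumes exponentially distributed data and the sample mean as the estimator, and it explores the parameter space with a crude upward scan to find a one-sided upper limit; the function names, step size and numbers of toys are arbitrary choices made for illustration.

```python
import random

def estimator(sample):
    """Example estimator: the sample mean (ML estimate of a lifetime)."""
    return sum(sample) / len(sample)

def fraction_below(theta, theta_exp, n_events, n_toys, rng):
    """Fraction of toy experiments, generated with true value theta,
    whose estimate falls below the experimental estimate theta_exp."""
    below = 0
    for _ in range(n_toys):
        toy = [rng.expovariate(1.0 / theta) for _ in range(n_events)]
        if estimator(toy) < theta_exp:
            below += 1
    return below / n_toys

def upper_limit(theta_exp, n_events, cl=0.95, n_toys=500, seed=7):
    """Scan theta upward until at most a fraction (1 - cl) of toys give
    an estimate below theta_exp: a one-sided Neyman upper limit."""
    rng = random.Random(seed)
    theta = theta_exp
    step = 0.02 * theta_exp
    while fraction_below(theta, theta_exp, n_events, n_toys, rng) > 1.0 - cl:
        theta += step
    return theta

# Hypothetical result: an estimate of 2.0 from 50 events.
limit = upper_limit(2.0, 50)
print(f"95% upper limit on theta: {limit:.2f}")
```

The limit inherits the statistical noise of the toys, so in practice the number of toys is increased until the answer is stable.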
Having explained the basic and generally agreed techniques used, there are a number of topics where advances are being made, or which are the subject of heated discussion and argument, or both.
The religious war which has been waged over the past few years has now cooled – although some isolated zealots remain on both sides. The ‘frequentists’ have come to accept that the use of Bayesian techniques can be illuminating and helpful, and sometimes provide more useful information than a frequentist confidence level, especially for measurements of bounded parameters (e.g., masses). The ‘Bayesians’
presented in section 1.2 often continues
The mistake in method is that the experimenter stops looking for bugs when they have agreement, not when they honestly believe that all (substantial) biases are accounted for. To guard against this the data can be ‘blinded’. There are two techniques used, covering two types of situation
In the early days of particle physics, the 50s and 60s, a typical experiment would get handfuls of events – a few hundred if lucky – from painstaking analysis of bubble chamber pictures. Statistical errors were thus ∼ 10% and were so large that the effect of systematic uncertainties was generally small.
In the 70s and 80s, the development of counter experiments led to event samples in the tens of thousands. Statistical errors were now at the per cent level, and systematic errors began to be more important.
The current generation of experiments – the Z factory at LEP, the B factories, Deep Inelastic Scattering at HERA – deal with millions of events. Statistical errors are at the level of ∼ 0.1% and we have learned how to talk about ‘parts per mille’.
Systematic errors (uncertainties in factors systematically applied in the analysis) can no longer be fudged. The word ‘conservative’ has been grossly overused in this context. It sounds safe and reassuring; in practice it is usually a sign of laziness or cowardice. The experimenter perhaps cannot be bothered to evaluate an uncertainty and makes a guess, and then inflates that guess to cover the possibility that they’ll be caught out, and calls it a ‘conservative’ estimate of the systematic error.
Particle physicists also confuse the evaluation of systematic errors with overall consistency checks. There is bad practice being spread to and between graduate students. They will identify all the calibration constants and parameters that contribute to the final result and vary those by their appropriate error, and fold the resultant variation into the systematic error. This is correct procedure. But they will also vary quantities like cut values, which should not in principle affect the result, by some arbitrary amount and then solemnly fold those resulting variations into the systematic error. This is nonsense. Looking at what happens when you change a cut value is a good and sensible thing: a (say) looser cut will give a higher efficiency and a higher background and thus more observed events, but after correcting for the new efficiency and background the result should be compatible with the original. This is a useful check that one understands what’s going on and that the analysis is consistent. But it does not feed into a numerical uncertainty.
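The correct part of that procedure, varying each calibration constant by its own error and folding the variations together, can be sketched as follows. The sketch assumes the constants are uncorrelated and that the result responds roughly linearly to each, so the shifts can be added in quadrature; the analysis function, constant names and values are hypothetical.

```python
import math

def systematic_error(analysis, nominal, errors):
    """Quadrature sum of the shifts in the result when each calibration
    constant is moved, one at a time, by its own uncertainty.

    analysis : function mapping a dict of constants to the final result
    nominal  : dict of central values
    errors   : dict of one-standard-deviation uncertainties
    """
    central = analysis(nominal)
    total = 0.0
    for name, sigma in errors.items():
        up = dict(nominal, **{name: nominal[name] + sigma})
        down = dict(nominal, **{name: nominal[name] - sigma})
        # Symmetrized shift; assumes the response is close to linear.
        shift = 0.5 * (analysis(up) - analysis(down))
        total += shift ** 2
    return central, math.sqrt(total)

# Hypothetical analysis: a corrected yield depending on two constants.
def corrected_yield(c):
    return (150.0 - c["background"]) / c["efficiency"]

central, syst = systematic_error(
    corrected_yield,
    nominal={"background": 30.0, "efficiency": 0.40},
    errors={"background": 3.0, "efficiency": 0.02},
)
print(f"result = {central:.1f} +/- {syst:.1f} (syst)")
```

Cut values are deliberately absent from the `errors` dictionary: changing a cut is a consistency check to be inspected, not a term in this sum.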
Measurements of the properties of particles in events are made with finite resolution, so the plots of these quantities, and functions of these quantities, are ‘smeared out’. Events move between histogram bins. Sharp peaks become broad, edges become slopes. The recovery of the original sharp distribution from the observed one is known as ‘unfolding’. This is an alternative use of the Monte Carlo simulation process: rather than compare the data with a theoretical prediction smeared by Monte Carlo simulation, one compares the original theory with the de-smeared data. Clearly this is preferable, if it can be done, as the unfolding process depends only on the experiment and not on the original theory, and so once unfolded the data can be compared with any prediction.
It looks to be a simple problem: given an original distribution as a histogram, the probability of migration from any bin i to any bin j, $P_{ji}$, can be estimated from a Monte Carlo sample (this includes the probability that it may not be accepted: $\sum_j P_{ji} \le 1$). The matrix is inverted, and then applied to the data histogram to give the reconstructed original.
Unfortunately it is not at all simple [13]. In the matrix inversion the errors on the $P_{ji}$ from finite statistics have devastating consequences and produce unrealistic results. There is a lot of activity in handling this in a sensible way, and in investigating other approaches, such as Maximum Entropy techniques.
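A two-bin example is enough to show how badly matrix inversion behaves. The sketch assumes a migration matrix in which 40% of events move to the neighbouring bin, and an estimate of that matrix which is wrong by only 0.05 in each element, as might happen with a small Monte Carlo sample; all the numbers are invented.

```python
def fold(migration, truth):
    """Apply a 2x2 migration matrix: observed_j = sum_i P[j][i] * truth_i."""
    return [sum(migration[j][i] * truth[i] for i in range(2)) for j in range(2)]

def unfold(migration, observed):
    """Unfold by inverting the 2x2 migration matrix explicitly."""
    (a, b), (c, d) = migration
    det = a * d - b * c
    inverse = [[d / det, -b / det], [-c / det, a / det]]
    return fold(inverse, observed)

# Hypothetical true migration matrix, and a slightly wrong estimate of it.
P_true = [[0.60, 0.40],
          [0.40, 0.60]]
P_estimated = [[0.55, 0.45],
               [0.45, 0.55]]

truth = [150.0, 50.0]
observed = fold(P_true, truth)
print("observed:                 ", observed)
print("unfolded with true matrix:", unfold(P_true, observed))
print("unfolded with estimate:   ", unfold(P_estimated, observed))
```

With the exact matrix the original histogram is recovered, but the small error in the estimated matrix turns a true content of roughly (150, 50) into roughly (200, 0). The effect grows rapidly with the number of bins and the amount of smearing.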
The combination of compatible measurements with different errors is straightforward. However results are sometimes incompatible, or marginally compatible. But something must be done with the results, as the community needs a way of using the combined number. Indeed it is the responsibility of the Particle Data Group [11] to combine measurements and form ‘world average’ results in a meaningful way.
There is also a problem in combining limits. If two experiments report 95% confidence level upper limits of, say, 0.012 and 0.013, how can one combine these two measurements? This question was put forcefully by the Higgs searches at the end of the LEP run. The four experiments reported results separately compatible and possibly marginally suggestive of a signal from a Higgs boson of mass around 114 GeV/c^2. Did four possibles make a probable? The answer to that statistics question determined whether or not LEP would run another year, at a cost of millions not only in power bills but in its impact on the construction schedule for the LHC. The CERN management decided that the answer in this case was ‘no’. History will be their judge.
In combining experiments the likelihood function contains much more information than a simple limit, or value and error. There is a suggestion that these should be routinely published, and we are probably going to see that happening a lot in the future.
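The straightforward case, compatible measurements with different errors, is the familiar inverse-variance weighted average. The sketch below assumes independent measurements with Gaussian errors and also returns a χ^2 that indicates how compatible the inputs are; the function name and the numbers are illustrative.

```python
import math

def combine(values, errors):
    """Inverse-variance weighted average of independent measurements.

    Returns (mean, error, chi2), where chi2 (with len(values) - 1 degrees
    of freedom) measures how compatible the inputs are with each other.
    """
    weights = [1.0 / e**2 for e in errors]
    mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    error = 1.0 / math.sqrt(sum(weights))
    chi2 = sum(w * (v - mean) ** 2 for w, v in zip(weights, values))
    return mean, error, chi2

# Hypothetical measurements of the same quantity by three experiments.
mean, error, chi2 = combine([10.2, 9.7, 10.5], [0.3, 0.4, 0.6])
print(f"average = {mean:.2f} +/- {error:.2f}, chi2 = {chi2:.2f} for 2 ND")
```

A large χ^2 is the signal that the inputs are incompatible and that this simple recipe is no longer enough. When full likelihood functions are available, the corresponding step is to add the ln L curves of the experiments before reading off the estimate and interval.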
The classification of events (usually ‘signal’ and ‘background’) and particles (pion, kaon... ) by means of a cut on a discriminator variable is a basic hypothesis testing problem. However there may be several variables, each containing useful information, and the best choice will be made by combining these in some way.
The Fisher Discriminant has been re-discovered as a technique which is good if the means of distributions differ between the two samples. The Neural Network (feed-forward ‘perceptron’ configuration) has become a standard item in the toolbox which can handle more general differences, and there are many developments going on in this area.
The use of cuts is deeply engrained. In many cases it is simple and appropriate. However in cases where there are no clean boundaries it may be better to consider all events, weighting them according to their signal-like or background-like nature.
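For two discriminating variables the Fisher Discriminant can be written out by hand. The sketch below assumes each event is a pair (x, y) and uses the standard construction, the inverse of the within-class scatter matrix applied to the difference of the sample means; the function name and the tiny training samples are invented for illustration.

```python
def fisher_direction(signal, background):
    """Fisher discriminant weights for two-variable events.

    signal, background : lists of (x, y) pairs
    Returns w such that t = w[0]*x + w[1]*y separates the two samples.
    """
    def mean(sample):
        n = len(sample)
        return [sum(p[k] for p in sample) / n for k in (0, 1)]

    def scatter(sample, mu):
        s = [[0.0, 0.0], [0.0, 0.0]]
        for p in sample:
            d = (p[0] - mu[0], p[1] - mu[1])
            for i in (0, 1):
                for j in (0, 1):
                    s[i][j] += d[i] * d[j]
        return s

    mu_s, mu_b = mean(signal), mean(background)
    s_s, s_b = scatter(signal, mu_s), scatter(background, mu_b)
    # Within-class scatter matrix W and its explicit 2x2 inverse.
    a, b = s_s[0][0] + s_b[0][0], s_s[0][1] + s_b[0][1]
    c, d = s_s[1][0] + s_b[1][0], s_s[1][1] + s_b[1][1]
    det = a * d - b * c
    diff = (mu_s[0] - mu_b[0], mu_s[1] - mu_b[1])
    return [(d * diff[0] - b * diff[1]) / det,
            (-c * diff[0] + a * diff[1]) / det]

# Hypothetical training samples that differ only in the mean of x.
signal = [(2, 0), (3, 1), (2, 1), (3, 0)]
background = [(0, 0), (1, 1), (0, 1), (1, 0)]
w = fisher_direction(signal, background)
print("weights:", w)
```

The single variable t = w[0]*x + w[1]*y can then be cut on, or used to weight events, in place of separate cuts on x and y.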
I have given several talks on ‘Statistics for Particle Physicists’ but ‘Particle Physics for Statisticians’ has been a new and interesting experience. This has been a very broad view. Particular topics will be considered in detail in the subsequent talks in this conference, in both plenary and parallel sessions. Hopefully the
account here will provide you with a map which will help you place them in context.
The author gratefully acknowledges the support of the Fulbright Foundation.
[1] S. L. Glashow, ‘Partial Symmetries of Weak Interactions’, Nucl. Phys. B22 579 (1961); S. Weinberg, ‘A Model of Leptons’, Phys. Rev. Lett. 19 1264 (1967); A. Salam, ‘Weak and Electromagnetic Interactions’, Proc. 8th Nobel Symposium, Svartholm 307 (1968).
[2] BABAR Collaboration, ‘Measurement of time-dependent CP asymmetries in B^0 → D(∗)±π∓ decays and constraints on sin(2β + γ)’, SLAC-PUB-100155, 2003. To be published in Phys. Rev. Lett.
[3] BABAR Collaboration, ‘Observation of the decay B^0 → π^0 π^0’, SLAC-PUB-100092, 2003. To be published in Phys. Rev. Lett.
[4] Details withheld to prevent embarrassment of those involved.
[5] F. James, ‘MINUIT: Function Minimization and Error Analysis Reference Manual’, http://wwinfo.cern.ch/asdoc/minuit/minmain.html
[6] J. Orear, ‘Notes on Statistics for Physicists’, University of California report UCRL-8417 (1958) and Cornell report CLNS 82/511 (1982).
[7] A.G. Frodesen et al., ‘Probability and Statistics in Particle Physics’, Universitetsforlaget Bergen-Oslo-Tromso (1979).
[8] V.L. Fitch et al., Phys. Rev. Lett. 13 (1964) 138.
[9] Proc. Workshop on Confidence Limits 17- January 2000, Ed. F. James, L. Lyons and Y. Perrin. CERN yellow report 2000-005 (2000) http://user.web.cern.ch/user/Index/library.html; Fermilab Workshop in Confidence Limits 27- March 2000. http://conferences/fnal.gov/c12k/
[10] G.J. Feldman and R.D. Cousins Phys. Rev. D (1998) 37731111.
[11] K. Hagiwara et al., ‘The Review of Particle Properties’, Phys. Rev. D66 (2002) 010001.
[12] See e.g., P. Harrison ‘Blind Analysis’ p 278, Proc. Conf. on Advanced Statistical Techniques in Par-