How to Claim a Discovery

W.A. Rolke and A.M. López

University of Puerto Rico - Mayaguez

We describe a statistical hypothesis test for the presence of a signal. The test allows the researcher to fix the

signal location and/or width a priori, or perform a search to find the signal region that maximizes the signal.

The background rate and/or distribution can be known or might be estimated from the data. Cuts can be used

to bring out the signal.

1. INTRODUCTION

Setting limits for new particles or decay modes has

been an active research area for many years. In high

energy physics it received renewed interest with the

unified method by Feldman and Cousins [1]. Giunti

[2] and Roe and Woodroofe [3] gave variations of the

unified method, trying to resolve an apparent anomaly

when there are fewer events in the signal region than

expected. They all discuss the problem of setting lim-

its for the case of a known background rate. The

case of an unknown background rate was discussed in

a conference talk by Feldman [4] and a method for

handling this case was developed by Rolke and López

[5]. Little work has been done though on the ques-

tion of claiming a discovery. This problem could be

handled by finding a confidence interval and claiming

a discovery if the lower limit is positive. Instead, the

question of a discovery should be addressed separately, by

performing a hypothesis test with the null hypothesis

H0: "There is no signal present". Rejecting this

hypothesis will then lead to a claim for a new discovery.

In carrying out a hypothesis test one needs to decide

on the type I error probability α, the probability of

falsely rejecting the null hypothesis. This is of course

the major mistake to be guarded against,

namely that of falsely claiming a discovery.

In practice a hypothesis test is often carried out

by finding the p-value. This is the probability that

an identical experiment will yield a result as extreme

(with respect to the null hypothesis) or even more so

given that the null hypothesis is true. Then if p < α

we reject H0; otherwise we fail to do so. For the test

discussed here it is not possible to compute the p-value

analytically, and therefore we will find the p-value via

Monte Carlo.

Maybe the most important decision in carrying out

a hypothesis test is the choice of α, or what we might

call the discovery threshold. As we shall see, this de-

cision is made much easier by the method described

here because we will need only one threshold, regard-

less of how the analysis was done. What a proper

discovery threshold should be in high energy physics

is a question outside the scope of this paper, although

we might suggest α = 0.001 (roughly equivalent to

3σ). Sinervo [6] argues for a much stricter standard

of 5σ, or α = 2.9 × 10⁻⁷. We believe that such extreme

values were used in the past because it was felt that

the calculated p-values were biased downward by the

analysis process, and a small α was needed in order

to compensate for any unwittingly introduced biases.

If we were to trust that our p-value is in fact correct,

a 1 in 1000 error rate should be acceptable.
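
The correspondence between a threshold quoted in σ and the error probability α can be checked with a few lines of code. The sketch below assumes the one-sided Gaussian tail convention, which is consistent with the 5σ value quoted above; the function name alpha_from_sigma is ours and purely illustrative.

```python
import math

def alpha_from_sigma(n_sigma):
    """One-sided Gaussian tail probability P(Z >= n_sigma).

    Assumption: the sigma-to-alpha conversion uses the upper tail of a
    standard normal distribution (one-sided convention).
    """
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

for n_sigma in (3, 5):
    print(n_sigma, alpha_from_sigma(n_sigma))
```

Under this convention 3σ corresponds to an α slightly above 0.001, which is why the text calls the two only roughly equivalent.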

A general introduction to hypothesis testing with

applications to high energy physics is given in Sinervo

[6]. A classic reference for the theory of hypothesis

testing is Lehmann [7].

2. THE SIGNAL TEST

Our test uses T = x − b or T = x − y/τ as the

test statistic, depending on whether the background

rate b is assumed to be known or not. Here x is the

number of observations in the signal region, y is the

number of observations in the background region and

τ is the probability that a background event falls into

the background region divided by the probability that

it falls into the signal region. Therefore y/τ is the

estimated background in the signal region and x − y/τ

is an estimate for the signal rate λ. T is the maximum

likelihood estimator of λ, and it is the quantity used in

Feldman and Cousins [1], except that it is not set to 0 when

x − y/τ is negative. Truncating at 0 is not necessary here because

a negative value of x − y/τ will clearly lead to a failure

to reject H0.
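
A Monte Carlo p-value for the test statistic T can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: counts are taken to be Poisson, and when b is unknown the null distribution is simulated with the pooled plug-in estimate b0 = (x + y)/(1 + τ). That plug-in choice, the function names and the default simulation sizes are all ours.

```python
import math
import random

def poisson_draw(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method.

    Suitable for small to moderate means only (exp(-mean) must not underflow).
    """
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def p_value_known_b(x_obs, b, n_sim=100_000, seed=1):
    """Monte Carlo p-value for T = x - b with known background rate b."""
    rng = random.Random(seed)
    t_obs = x_obs - b
    # Fraction of null experiments with T at least as large as observed.
    hits = sum(poisson_draw(rng, b) - b >= t_obs for _ in range(n_sim))
    return hits / n_sim

def p_value_estimated_b(x_obs, y_obs, tau, n_sim=100_000, seed=1):
    """Monte Carlo p-value for T = x - y/tau with estimated background."""
    rng = random.Random(seed)
    t_obs = x_obs - y_obs / tau
    # Assumption: pooled estimate of b under H0, used as a plug-in value.
    b0 = (x_obs + y_obs) / (1.0 + tau)
    hits = 0
    for _ in range(n_sim):
        x = poisson_draw(rng, b0)          # signal-region count under H0
        y = poisson_draw(rng, tau * b0)    # background-region count under H0
        if x - y / tau >= t_obs:
            hits += 1
    return hits / n_sim

print(p_value_known_b(6, 3.0))
print(p_value_estimated_b(10, 30, 10.0))
```

With a known b the statistic T is a shifted Poisson count, so the first function can be checked against the exact Poisson tail probability; the simulation becomes essential once the background is estimated or the signal region is searched for.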

Other choices for the test statistic are of course pos-

sible. For example, a measure for the size of a signal

that is often used in high energy physics is S/√b. Un-

der the null hypothesis this statistic is approximately

Gaussian, at least if there is sufficient data. Unfor-

tunately the approximation is not sufficiently good

in the extreme tails where a new discovery is made,

leading to p-values that are much smaller than is war-

ranted. Even when using Monte Carlo to compute the

true p-value, this test statistic can be shown to be in-

ferior to the one proposed in our method because it

has consistently lower power, that is, its probability of

detecting a real signal is smaller.
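
The size of the tail problem with the Gaussian approximation can be seen in a small numerical check. The sketch below assumes a known background rate, Poisson counts and S = x − b; the numbers b = 3 and x = 10 are illustrative only and do not come from the paper.

```python
import math

def poisson_tail(x_obs, b):
    """Exact P(X >= x_obs) for X ~ Poisson(b)."""
    cdf = sum(math.exp(-b) * b**k / math.factorial(k) for k in range(x_obs))
    return 1.0 - cdf

def gaussian_tail(z):
    """One-sided upper tail P(Z >= z) of a standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

b, x_obs = 3.0, 10                  # illustrative values (assumption)
z = (x_obs - b) / math.sqrt(b)      # S / sqrt(b) with S = x - b
print("Gaussian approximation:", gaussian_tail(z))
print("Exact Poisson tail:    ", poisson_tail(x_obs, b))
```

For these values the Gaussian approximation gives a p-value well below the exact Poisson tail probability, which is the direction of error described above.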

In order to find the p-value of the test we need to

know the null distribution. In the simplest case of

a known background rate and everything else fixed

PHYSTAT2003, SLAC, Stanford, California, September 8-11, 2003
