Linear Decision Boundaries-Computing and Statistical Data Analysis-Lecture 06 Slides-Physics, Slides of Computational and Statistical Data Analysis

Queen Mary, University of London (QMUL)Computational and Statistical Data Analysis

Feed-forward net: values of a node depend only on earlier layers, usually only on previous layer (“network architecture”). Linear Decision Boundaries, Nonlinear Transformation, Inputs, Nonlinear Test Statistics, Neural Networks, Multi Layer Perceptron, Probability Density Estimation, PDE, Kernel Based, Correlation, Independence, Decorrelation, Naive Bayes, Decision Trees, Boosting, AdaBoost, Overtraining, Statistical Data Analysis, Lecture Slides, Glen Cowan, Physics Department, University of Lon

Typology: Slides

2011/2012

Uploaded on 03/08/2012

leyllin 🇬🇧


G. Cowan, Lectures on Statistical Data Analysis

Statistical Data Analysis: Lecture 6

1 Probability, Bayes’ theorem, random variables, pdfs

2 Functions of r.v.s, expectation values, error propagation

3 Catalogue of pdfs

4 The Monte Carlo method

5 Statistical tests: general concepts

6 Test statistics, multivariate methods

7 Goodness-of-fit tests

8 Parameter estimation, maximum likelihood

9 More maximum likelihood

10 Method of least squares

11 Interval estimation, setting limits

12 Nuisance parameters, systematic uncertainties

13 Examples of Bayesian approach

14 tba


Nonlinear test statistics

The optimal decision boundary may not be a hyperplane → nonlinear test statistic.

[Figure: nonlinear decision boundary separating the acceptance region from the rest of the input-variable space.]

Multivariate statistical methods are a Big Industry: Neural Networks, Support Vector Machines, Kernel density methods, ...

Particle Physics can benefit from progress in Machine Learning.

Introduction to neural networks

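Used in neurobiology, pattern recognition, financial forecasting, ... Here, neural nets are just a type of test statistic.

Suppose we take $t(\mathbf{x})$ to have the form

$$t(\mathbf{x}) = s\left(a_0 + \sum_{i=1}^{n} a_i x_i\right),$$

where $s$ is the logistic sigmoid,

$$s(u) = \frac{1}{1 + e^{-u}}.$$

This is called the single-layer perceptron. Since $s(\cdot)$ is monotonic, it is equivalent to a linear $t(\mathbf{x})$.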
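The equivalence between the single-layer perceptron and a linear test statistic can be checked numerically. The sketch below assumes the usual form t(x) = s(a0 + a · x) with a logistic sigmoid s; the weights `a0`, `a` and the random events are hypothetical values chosen only for illustration.

```python
import numpy as np

def sigmoid(u):
    """Logistic sigmoid s(u) = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + np.exp(-u))

def perceptron(x, a0, a):
    """Single-layer perceptron t(x) = s(a0 + a . x), evaluated for each row of x."""
    return sigmoid(a0 + x @ a)

# Hypothetical weights and events, chosen only for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))
a0, a = 0.3, np.array([1.5, -0.7])

linear = a0 + x @ a          # linear test statistic
t = perceptron(x, a0, a)     # perceptron output, always in (0, 1)

# s is monotonic, so ranking events by t is the same as ranking them by
# the linear statistic: any cut on t corresponds to a cut on `linear`.
same_order = np.array_equal(np.argsort(linear), np.argsort(t))
```

Because the ordering of events is unchanged, a cut on the perceptron output selects the same events as the corresponding cut on the linear statistic.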

Neural network discussion

Easy to generalize to an arbitrary number of layers.

Feed-forward net: the value of a node depends only on earlier layers, usually only on the previous layer (“network architecture”).

More nodes → neural net gets closer to the optimal $t(\mathbf{x})$, but more parameters need to be determined.

Parameters are usually determined by minimizing an error function,

$$\mathcal{E} = E_0\left[(t - t^{(0)})^2\right] + E_1\left[(t - t^{(1)})^2\right],$$

where $t^{(0)}$, $t^{(1)}$ are target values, e.g., 0 and 1 for the logistic sigmoid, and $E_0$, $E_1$ denote expectation values under the two hypotheses. Expectation values are replaced by averages over training data (e.g. MC).

In general training can be difficult; standard software is available.
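As a rough illustration of determining the parameters by minimizing an error function, the sketch below fits a single-layer perceptron with plain gradient descent. It assumes a squared-error function with targets 0 and 1, class-wise averages over training events in place of expectation values, and toy Gaussian samples standing in for MC; the learning rate and iteration count are arbitrary choices, not taken from the lecture.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def error(params, x, target):
    """Class-wise mean squared deviation of t(x) from the targets 0 and 1."""
    t = sigmoid(params[0] + x @ params[1:])
    e0 = np.mean((t[target == 0] - 0.0) ** 2)
    e1 = np.mean((t[target == 1] - 1.0) ** 2)
    return e0 + e1

def gradient(params, x, target):
    """Gradient of error() with respect to (a0, a1, ..., an)."""
    t = sigmoid(params[0] + x @ params[1:])
    n0, n1 = np.sum(target == 0), np.sum(target == 1)
    w = np.where(target == 0, 1.0 / n0, 1.0 / n1)   # turns sums into class averages
    d = 2.0 * w * (t - target) * t * (1.0 - t)      # derivative w.r.t. the sigmoid argument
    return np.concatenate(([d.sum()], d @ x))

# Toy training sample (an assumption): two Gaussian blobs in two dimensions.
rng = np.random.default_rng(1)
x = np.vstack([rng.normal(-1.0, 1.0, size=(500, 2)),
               rng.normal(+1.0, 1.0, size=(500, 2))])
target = np.concatenate([np.zeros(500), np.ones(500)])

params = np.zeros(3)
initial_error = error(params, x, target)
for _ in range(200):
    params -= 0.5 * gradient(params, x, target)     # fixed step size, chosen by hand
final_error = error(params, x, target)
```

With all parameters at zero the output is 0.5 for every event, so the initial error is 0.5; the descent steps reduce it from there. Real applications typically rely on standard software with more robust minimizers.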

Neural network example from LEP II

Signal: $e^+e^- \to W^+W^-$ (often 4 well separated hadron jets)

Background: $e^+e^- \to q\bar{q}gg$ (4 less well separated hadron jets)

Input variables are based on jet structure, event shape, ...; none by itself gives much separation. The neural network output does better.

(Garrido, Juste and Martinez, ALEPH 96-144)

Probability Density Estimation (PDE) techniques

See e.g. K. Cranmer, Kernel Estimation in High Energy Physics, CPC 136 (2001) 198; hep-ex/0011057; T. Carli and B. Koblitz, A multi-variate discrimination technique based on range-searching, NIM A 501 (2003) 576; hep-ex/

Construct non-parametric estimators of the pdfs and use these to construct the likelihood ratio. (An n-dimensional histogram is a brute force example of this.)

More clever estimation techniques can get this to work for (somewhat) higher dimension.
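The brute-force histogram approach can be sketched in one dimension as follows. The toy Gaussian samples, the binning, the small regularization constant `eps`, and the choice of putting the signal estimate in the numerator are all assumptions made for illustration.

```python
import numpy as np

# Toy training samples (assumed): signal and background in one variable.
rng = np.random.default_rng(3)
sig = rng.normal(+1.0, 1.0, size=5000)
bkg = rng.normal(-1.0, 1.0, size=5000)

# Histogram estimates of the two pdfs on a common binning.
edges = np.linspace(-5.0, 5.0, 41)
f_sig, _ = np.histogram(sig, bins=edges, density=True)
f_bkg, _ = np.histogram(bkg, bins=edges, density=True)

def likelihood_ratio(x, eps=1e-12):
    """Ratio of the histogram pdf estimates in the bin containing each x."""
    # np.digitize returns i with edges[i-1] <= x < edges[i]; shift to a bin index
    # and clip so that values outside the range use the first or last bin.
    idx = np.clip(np.digitize(x, edges) - 1, 0, len(edges) - 2)
    return (f_sig[idx] + eps) / (f_bkg[idx] + eps)

t_values = likelihood_ratio(np.array([-2.0, 2.0]))
```

The number of bins grows as a power of the dimension, which is why the histogram version becomes impractical quickly and smoother estimators are preferred.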

Kernel-based PDE (KDE, Parzen window)

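Consider $d$ dimensions and $N$ training events, $\mathbf{x}_1, \ldots, \mathbf{x}_N$; estimate $f(\mathbf{x})$ with

$$\hat{f}(\mathbf{x}) = \frac{1}{N h^d} \sum_{i=1}^{N} K\left(\frac{\mathbf{x} - \mathbf{x}_i}{h}\right).$$

Use e.g. a Gaussian kernel:

$$K(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}} e^{-|\mathbf{x}|^2/2},$$

where $h$ is the kernel bandwidth (smoothing parameter).

Need to sum $N$ terms to evaluate the function (slow); faster algorithms only count events in the vicinity of $\mathbf{x}$ ($k$-nearest neighbor, range search).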
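A direct implementation of a Gaussian kernel estimate makes the cost of the sum over N events explicit. The sketch below assumes the standard form, a sum of Gaussian kernels of bandwidth h centred on the training events and normalized by N h^d; the toy sample and the bandwidth value are illustrative choices.

```python
import numpy as np

def gaussian_kde(x, train, h):
    """Kernel estimate of f at points x (shape (M, d)) from training events (shape (N, d))."""
    n, d = train.shape
    # Scaled differences (x - x_i) / h for every evaluation point and training event.
    u = (x[:, None, :] - train[None, :, :]) / h
    # Gaussian kernel K(u) = exp(-|u|^2 / 2) / (2 pi)^(d/2).
    k = np.exp(-0.5 * np.sum(u ** 2, axis=2)) / (2.0 * np.pi) ** (d / 2.0)
    # Sum over the N training events, normalized by N * h^d.
    return k.sum(axis=1) / (n * h ** d)

# Toy sample (assumed): 2000 events from a one-dimensional standard normal.
rng = np.random.default_rng(2)
train = rng.normal(size=(2000, 1))
grid = np.linspace(-6.0, 6.0, 601)[:, None]

f_hat = gaussian_kde(grid, train, h=0.3)
integral = np.sum(f_hat) * (grid[1, 0] - grid[0, 0])   # should be close to 1
```

Each evaluation point touches all N training events, so the cost scales as M × N; this is the slowness that nearest-neighbor and range-search methods avoid.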

Decision trees

Out of all the input variables, find the one for which a single cut gives the best improvement in signal purity:

$$P = \frac{\sum_{\text{signal}} w_i}{\sum_{\text{signal}} w_i + \sum_{\text{background}} w_i},$$

where $w_i$ is the weight of the $i$th event.

The resulting nodes are classified as either signal or background. Iterate until a stop criterion is reached, based on e.g. purity or minimum number of events in a node. The set of cuts defines the decision boundary.

Example by MiniBooNE experiment, B. Roe et al., NIM 543 (2005) 577.
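The effect of a single cut on the weighted signal purity can be illustrated with a handful of events. The sketch assumes purity is the sum of signal weights divided by the sum of all weights in the node; the event values, weights and the cut at 0.5 are hypothetical.

```python
import numpy as np

def purity(weights, is_signal):
    """Weighted signal purity: sum of signal weights over the sum of all weights."""
    return np.sum(weights[is_signal]) / np.sum(weights)

# Hypothetical node with five weighted events and one input variable x.
weights = np.array([1.0, 2.0, 0.5, 1.5, 1.0])
is_signal = np.array([True, True, False, False, True])
x = np.array([0.2, 0.9, 0.4, 0.1, 0.8])

cut = 0.5
right = x > cut   # events passing the cut

p_parent = purity(weights, is_signal)                   # 4.0 / 6.0
p_right = purity(weights[right], is_signal[right])      # 3.0 / 3.0
p_left = purity(weights[~right], is_signal[~right])     # 1.0 / 3.0
```

Here the cut produces one daughter node that is pure signal and one that is mostly background, so both daughters are easier to classify than the parent.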

Finding the best single cut

The level of separation within a node can, e.g., be quantified by the Gini coefficient, calculated from the (s or b) purity $P$ as:

$$G = P(1 - P).$$

For a cut that splits a set of events $a$ into subsets $b$ and $c$, one can quantify the improvement in separation by the change in weighted Gini coefficients:

$$\Delta = W_a G_a - W_b G_b - W_c G_c,$$

where, e.g., $W_a = \sum_{i \in a} w_i$.

Choose e.g. the cut that maximizes $\Delta$; a variant of this scheme can use, instead of Gini, e.g. the misclassification rate:

$$\varepsilon = 1 - \max(P, 1 - P).$$
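A scan for the best single cut on one variable might look like the sketch below. It assumes the common definitions G = p(1 − p) for a node with purity p and Δ = W_a G_a − W_b G_b − W_c G_c for a split of a into b and c, with W the sum of event weights; the helper names and the choice of candidate cuts at midpoints between sorted values are illustrative.

```python
import numpy as np

def weighted_gini(weights, is_signal):
    """W * p * (1 - p) for a set of events: W is the sum of weights, p the signal purity."""
    w = np.sum(weights)
    if w == 0.0:
        return 0.0
    p = np.sum(weights[is_signal]) / w
    return w * p * (1.0 - p)

def best_cut(x, weights, is_signal):
    """Scan midpoints between sorted values of x; return the (cut, delta) with largest delta."""
    parent = weighted_gini(weights, is_signal)
    values = np.unique(x)
    best = (None, -np.inf)
    for cut in 0.5 * (values[:-1] + values[1:]):
        below = x < cut
        delta = (parent
                 - weighted_gini(weights[below], is_signal[below])
                 - weighted_gini(weights[~below], is_signal[~below]))
        if delta > best[1]:
            best = (cut, delta)
    return best

# Hypothetical node: background at low x, signal at high x, unit weights.
x = np.array([0.1, 0.2, 0.3, 0.6, 0.7, 0.9])
is_signal = np.array([False, False, False, True, True, True])
weights = np.ones(6)

cut, delta = best_cut(x, weights, is_signal)
```

For this sample the parent has weighted Gini 6 × 0.5 × 0.5 = 1.5, and the cut between 0.3 and 0.6 yields two pure subsets, so Δ reaches its maximum of 1.5 there. In a full tree the same scan would be repeated over every input variable and then recursively in each daughter node.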
