Statistical Tools in Collider Experiments. Multivariate analysis in high energy physics


Pauli Lectures - 06/02/2012
Nicolas Chanon - ETH Zürich

Main goals of these lessons
- Understand what multivariate analyses are
- See how they are used in high energy physics
- Answer the questions: what is a neural network? A boosted decision tree? Which multivariate methods are currently used in HEP?
- Become familiar with the problems related to the training and application of multivariate methods
- Be aware of the systematic uncertainties related to multivariate techniques
- Be able to understand the results of new physics searches at the Tevatron or LHC in the form in which they are usually presented, and how they were produced

Introductory comments
- In these lectures, examples will mainly be taken from Higgs boson searches at the LHC
- The focus is on multivariate methods commonly used in the high energy physics community
- Theory will be addressed as a tool for practical usage

Exercises
- The proposed exercises will follow the progress of the lectures
- Problem inspired by Higgs searches in the H → γγ channel at the LHC
- Goal: be able to estimate the sensitivity of a search for a small peak over a huge background, using multivariate methods
- 3 exercises:
  - Setting up the ROOT and TMVA environment, TMVA basics
  - Using an MVA method inside the analysis
  - Estimation of the analysis sensitivity

Outline
1. Introduction
2. Multivariate methods
3. Optimization of MVA methods
4. Application of MVA methods in HEP
5. Understanding Tevatron and LHC results

Lecture 1. Introduction

Content of this lecture
- Introduction
  - Experimental problems in high energy physics
  - The problem: how to distinguish signal from background?
- Multivariate analyses examples in HEP
  - At the Tevatron
  - At the LHC
- Presentation of commonly used multivariate methods

Searching for rare signals
- Higgs and new physics cross-sections are small...
[Figure: production cross-sections, with examples of background to H → ZZ searches; annotation: 5 orders of magnitude]

Over huge backgrounds
To achieve a discovery, a huge background reduction rate is needed.
- Example of H → γγ: typically 9 orders of magnitude under the QCD jets background
- Reducible background: jet-jet, photon-jet
  - Jets can be mis-identified as photons => can be suppressed by tight photon identification criteria
- Irreducible background: photon-photon
  - Non-resonant diphoton continuum => can be discriminated using kinematic properties
[Figure: cross-sections at the LHC (14 TeV); annotation: other NP?]

With a given detector (here, CMS)
[Figure: CMS detector; labels not recoverable]

Experimental issues
Experimental challenges:
- Detector calibration
- Identification of the tracks / energy deposits in the sub-detectors
- Particle reconstruction
- Particle identification
- Finding the vertex of the hard interaction among all pile-up vertices
- Discriminating the signal process against all other background processes
- ...
Multivariate methods can help with all of these.
[Figure: collision with 20 pile-up events recorded with the ATLAS detector]

Multivariate analysis: Definitions
Multivariate analysis:
- Set of statistical analysis methods that simultaneously analyze multiple measurements (variables) on the object studied
- Variables can be dependent or correlated in various ways
Classification / regression:
- Classification: discriminant analysis to separate classes of events, given already known results on a training sample
- Regression: analysis that provides an estimate of an output variable, taking into account the correlations of the input variables
Statistical learning:
- Supervised learning: the multivariate method is trained on a sample where the result is known (e.g. Monte-Carlo simulation of signal and background)
- Unsupervised learning: no prior knowledge is required. The algorithm clusters events in an optimal way

Event classification
- Focus here on supervised learning for classification
- Use case in particle physics: signal/background discrimination
- Assume we have two populations (signal and background) and two variables
- How to decorrelate, and what decision boundary (on X1 and X2) to choose, to decide whether an event is signal or background?

Event classification
- Possible solutions: rectangular cuts, Fisher, non-linear contour
[Figure: three panels showing the decision boundary for rectangular cuts, linear (Fisher), and non-linear]

Multivariate analyses in HEP
- Signal/background discrimination:
  - Object reconstruction: discriminate against instrumental background (electronic noise...)
  - Object identification: e.g. electron or bottom quark identification, to improve the rejection of other objects that resemble them (e.g. jets)
  - Discriminating a physics process against physics backgrounds. Many examples, e.g. single top against W+jets, H → WW against WW background...
- Improving the energy measurement via regression. This narrows the reconstructed mass peak and improves the resolution.
- Estimating the sensitivity of the analysis:
  - Sensitivity to signal exclusion or discovery: likelihood of the data to be consistent with the background-only or the signal+background hypothesis
  - Combination of many channels => exclusion limits or discoveries

MVA examples in HEP: Tevatron
Single top discovery (Phys. Rev. Lett. 98, 181802)
- When published, very controversial
- 36 boosted decision trees used to discriminate signal from background
- First measurement of the single top cross-section, today well established
[Figure: (a), (b) single top production diagrams; event yield versus tb+tqb decision tree output in e+jets events with 1 tag, for H_T < 175 GeV and H_T > 300 GeV; event yield versus M(W,b) [GeV]]

MVA examples in HEP: Tevatron
ZH → llbb searches at CDF (PRL 105, 251802 (2010))
- b-jet energy estimated with a regression neural network, to improve the dijet mass resolution
- b-tagging with neural networks, used to compute the final limits
[Figure: dijet mass (GeV/c²) for pretag data, total background and ZH signal, before and after the correction; projection of the NN output for data, Z + heavy flavour, Z + light flavour, diboson, misidentified Z and tt, and ZH signal; expected and observed 95% C.L. upper limit / SM versus M_H (GeV/c²) with ±1σ and ±2σ bands]

MVA examples in HEP: Tevatron
Photon identification at D0 and applications (arXiv:1002.4917v3)
- Neural network for photon identification, based on calorimeter energy deposits and track variables in an isolation cone around the photon
- Used to identify photons and measure the diphoton+X cross-section
[Figure: DØ, 4.2 fb⁻¹. Fraction of events versus neural network output O_NN for Z → llγ data (l = e, µ), photon MC and jet MC; dσ/dM_γγ (pb/GeV) versus M_γγ (GeV) for data compared with RESBOS, DIPHOX and PYTHIA, with PDF and scale uncertainties, and the ratio to RESBOS]

MVA examples in HEP: Tevatron
H → γγ searches at D0 (DØ Note 6177-CONF)
- Identify photons with the neural network (reduces the fake photon processes)
- Boosted decision tree with kinematic variables to improve the sensitivity against the diphoton continuum (+30%)
- The BDT includes the invariant mass of the diphoton system as input
[Figure: DØ preliminary, 8.2 fb⁻¹. Events versus MVA output for data, background and signal (M_H = 120 GeV, × 50); 95% C.L. limits on σ × BR(γγ) relative to the SM prediction as a function of the Higgs mass, showing the observed limit and the expected limit with ±1 and ±2 s.d. bands]

MVA examples in HEP: LHC
H → WW → llνν searches in CMS (CMS-PAS-HIG-11-024)
- 3 channels: 0-jet, 1-jet, 2-jet
- Electron identification with a multivariate technique: 50% more background rejection for the same signal efficiency
- Boosted decision tree in the 0-jet and 1-jet channels: kinematic variables
- Limits improved by using the BDT
[Figure: CMS preliminary, L = 4.6 fb⁻¹. BDT output for data, m_H = 130 signal, WW, W+jets, Z+jets, top and WZ/ZZ; 95% CL limit on σ/σ_SM versus Higgs mass [GeV] for the cut-based and the BDT-based H → WW analyses, with median expected, expected ±1σ, expected ±2σ and observed limits]

MVA examples in HEP: LHC
H → bb searches in CMS (CMS-PAS-HIG-11-031)
- Searches for VH, H → bb
- 5 channels: W → eν, μν; Z → ee, μμ; Z → νν
- B-tagging selection on a likelihood discriminant (track impact parameter + secondary vertex information)
- Boosted decision trees for the kinematics
[Figure: CMS Preliminary, √s = 7 TeV, L = 4.7 fb⁻¹. Events versus BDT output in the W(µν)H(bb) channel for data, WH, VV, W + bb, W + udscg, Z + bb, Z + udscg, single top, tt and QCD, with MC uncertainty. Figure 4: Expected and observed 95% CL combined upper limits on the ratio of VHbb production for the BDT (left) and M(jj) (right) analyses. The median expected limit and the 1- and 2-σ bands are obtained with the LHC CLs method as implemented in RooStats, as are the observed limits at each mass point.]

MVA examples in HEP: LHC
H → γγ searches in CMS (CMS-PAS-HIG-11-030)
- Hard interaction vertex identified with a BDT using diphoton kinematics and track variables
- Photon energy estimated with a BDT regression from geometry and energy deposit variables (% improvement on the limit)
[Figure: CMS preliminary, √s = 7 TeV, L = 4.76 fb⁻¹. Events / (1 GeV/c²) versus m_γγ (GeV/c²), all categories combined, for data, the background model with ±1σ and ±2σ bands, and a 5×SM signal with m_H = 120 GeV; 95% CL limit on σ(H → γγ)/σ(H → γγ)_SM versus m_H (GeV/c²), with the observed CLs limit, the median expected CLs limit and its ±1σ and ±2σ bands]

MVA examples in HEP: LHC
Combination of all channels in CMS (CMS-PAS-HIG-11-032)
- The combination can be seen as a grand multivariate analysis
- Limits are set with the CLs method
- Exclusion at 95% confidence level: 127-600 GeV
[Figure: CMS Preliminary, √s = 7 TeV, L_int = 4.6-4.7 fb⁻¹. 95% CL limit on σ/σ_SM versus Higgs boson mass (GeV/c²): per-channel limits for H → bb (4.7 fb⁻¹), H → ττ (4.6 fb⁻¹), H → γγ (4.7 fb⁻¹), H → WW (4.6 fb⁻¹), H → ZZ (4.7 fb⁻¹) and their combination; combined observed limit with expected ±1σ and ±2σ bands]

Plenty of multivariate methods...
Examples of MVA methods:
- Rectangular cut optimization
- Fisher
- Likelihood
- Neural network
- Decision tree
- Support Vector Machine
- ...
Characteristics:
- Level of complexity and transparency
- Performance in terms of background rejection
- Way of dealing with non-linear correlations
- Speed of training
- Robustness while increasing the number of input variables
- Robustness against overtraining

Rectangular cuts
- Simplest multivariate method, very intuitive
- All HEP analyses use rectangular cuts, not always completely optimized
- Define the signal region: a1 < x1 < a2, b1 < x2 < b2, ...
Rectangular cut optimization:
- Grid search, Monte-Carlo sampling
- Genetic algorithm
- Simulated annealing
Characteristics:
- Difficult to discriminate signal from background in the presence of non-linear correlations
- Optimization is difficult to handle with a high number of variables
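The grid search option can be sketched in a few lines. This is only a toy illustration: it assumes a single variable, a Gaussian signal over a flat background, and s/sqrt(s+b) as the figure of merit. None of these choices come from the lecture, and `optimize_window` is a hypothetical helper name.

```python
import math
import random

def significance(s, b):
    """Assumed figure of merit s / sqrt(s + b); 0 for an empty selection."""
    return s / math.sqrt(s + b) if s + b > 0 else 0.0

def optimize_window(signal, background, edges):
    """Grid search for the window a1 < x < a2 that maximizes the figure of merit."""
    best = (0.0, None, None)  # (figure of merit, a1, a2)
    for i, lo in enumerate(edges):
        for hi in edges[i + 1:]:
            s = sum(lo < x < hi for x in signal)
            b = sum(lo < x < hi for x in background)
            fom = significance(s, b)
            if fom > best[0]:
                best = (fom, lo, hi)
    return best

rng = random.Random(1)
signal = [rng.gauss(0.0, 1.0) for _ in range(1000)]           # toy peaked signal
background = [rng.uniform(-10.0, 10.0) for _ in range(1000)]  # toy flat background
edges = [-10.0 + 0.5 * k for k in range(41)]                  # candidate cut values

best_fom, a1, a2 = optimize_window(signal, background, edges)
print(f"best window: {a1} < x < {a2}, s/sqrt(s+b) = {best_fom:.2f}")
```

The number of windows to test grows quickly with the number of variables, which is why the slide lists Monte-Carlo sampling, genetic algorithms and simulated annealing as alternatives.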

Fisher discriminant
Fisher method:
- Cut on a linear combination of the input variables: y < a·x1 + b·x2
- This corresponds to a hyperplane in the space of the input variables
- Very efficient if the correlations are linear
- Again, difficult to handle non-linear correlations
- More easily trained than rectangular cuts
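As an illustration of how the coefficients a and b can be obtained, the sketch below uses the textbook Fisher construction, w = W⁻¹(μ_S − μ_B), with W the sum of the signal and background covariance matrices. The toy samples and all function names are assumptions made for this example; this is not the TMVA implementation.

```python
import random

def mean(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def covariance(points, mu):
    """2x2 covariance matrix of a sample, returned as (c11, c12, c22)."""
    n = len(points)
    c11 = sum((p[0] - mu[0]) ** 2 for p in points) / n
    c12 = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points) / n
    c22 = sum((p[1] - mu[1]) ** 2 for p in points) / n
    return c11, c12, c22

def fisher_weights(signal, background):
    """Fisher coefficients w = W^-1 (mu_S - mu_B), W = within-class covariance."""
    mu_s, mu_b = mean(signal), mean(background)
    s11, s12, s22 = covariance(signal, mu_s)
    b11, b12, b22 = covariance(background, mu_b)
    w11, w12, w22 = s11 + b11, s12 + b12, s22 + b22
    det = w11 * w22 - w12 * w12
    d1, d2 = mu_s[0] - mu_b[0], mu_s[1] - mu_b[1]
    # Explicit inverse of the symmetric 2x2 matrix W, applied to the mean difference.
    return [(w22 * d1 - w12 * d2) / det, (w11 * d2 - w12 * d1) / det]

def correlated_sample(rng, n, center):
    """Toy events with two linearly correlated Gaussian variables around `center`."""
    events = []
    for _ in range(n):
        u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        events.append((center[0] + u, center[1] + 0.8 * u + 0.6 * v))
    return events

rng = random.Random(2)
signal = correlated_sample(rng, 2000, (1.0, 0.0))
background = correlated_sample(rng, 2000, (0.0, 1.0))

a, b = fisher_weights(signal, background)
y_sig = [a * x1 + b * x2 for x1, x2 in signal]
y_bkg = [a * x1 + b * x2 for x1, x2 in background]
print(f"y = {a:.2f}*x1 + {b:.2f}*x2")
```

A single cut on y then defines the hyperplane. Because the two toy variables are linearly correlated, the combination separates the samples better than a cut on either variable alone.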

Likelihood estimator
- The likelihood ratio is defined by
  $$y_L(i) = \frac{L_S(i)}{L_S(i) + L_B(i)}$$
  where
  $$L_{S(B)}(i) = \prod_{k=1}^{n_{\mathrm{var}}} p_{S(B),k}\big(x_k(i)\big)$$
  is the product of the probability density functions of each variable.
- Optimal when there is no correlation between the variables
- This likelihood method does not take the correlations into account and is therefore sub-optimal in the presence of correlations
- Refinements exist to take the correlations into account
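The definition above translates almost directly into code. The sketch below assumes that the one-dimensional PDFs are estimated with simple normalized histograms on toy Gaussian samples; the binning, the small floor that avoids empty bins, and all names are choices made for this illustration only.

```python
import random

def histogram_pdf(values, lo, hi, nbins):
    """Normalized histogram used as a 1D PDF estimate (small floor avoids zeros)."""
    counts = [1e-3] * nbins
    width = (hi - lo) / nbins
    for v in values:
        k = min(max(int((v - lo) / width), 0), nbins - 1)
        counts[k] += 1.0
    total = sum(counts)

    def pdf(x):
        k = min(max(int((x - lo) / width), 0), nbins - 1)
        return counts[k] / (total * width)

    return pdf

def likelihood(event, pdfs):
    """L(i): product over the variables k of p_k(x_k(i))."""
    result = 1.0
    for x, pdf in zip(event, pdfs):
        result *= pdf(x)
    return result

def likelihood_ratio(event, pdfs_s, pdfs_b):
    """y_L(i) = L_S(i) / (L_S(i) + L_B(i)), between 0 and 1."""
    ls, lb = likelihood(event, pdfs_s), likelihood(event, pdfs_b)
    return ls / (ls + lb)

rng = random.Random(3)
signal = [(rng.gauss(1.0, 1.0), rng.gauss(1.0, 1.0)) for _ in range(5000)]
background = [(rng.gauss(-1.0, 1.0), rng.gauss(-1.0, 1.0)) for _ in range(5000)]

# One PDF per variable and per class, built from the training samples.
pdfs_s = [histogram_pdf([e[k] for e in signal], -5.0, 5.0, 40) for k in range(2)]
pdfs_b = [histogram_pdf([e[k] for e in background], -5.0, 5.0, 40) for k in range(2)]

y_signal_like = likelihood_ratio((1.0, 1.0), pdfs_s, pdfs_b)
y_background_like = likelihood_ratio((-1.0, -1.0), pdfs_s, pdfs_b)
print(y_signal_like, y_background_like)
```

The toy variables are generated independently, which is the case where this estimator is optimal. With correlated inputs the product of one-dimensional PDFs no longer describes the true density.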

Neural network
- Most commonly used: the multi-layer perceptron
- Composed of neurons taking as input a linear combination of the previous neuron outputs
- An activation function (usually tanh) transforms the linear combination
- The weights of each neuron are found during the training phase by minimizing the error on the neural network output
- Neural networks are universal approximators: they take advantage of correlations
- Quite stable against overtraining and against an increasing number of variables
[Figure: multi-layer perceptron with an input layer (x1 to x4 plus a bias node), one hidden layer (five neurons plus a bias node) and an output layer (y_ANN), connected by weights w]
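The forward pass of such a network is short enough to write out. In the sketch below the weights are set by hand (they are hypothetical and not the result of any training) so that a network with two hidden neurons separates an XOR-like configuration, which no single linear cut can do. The mean squared error shown is the quantity a training procedure would minimize.

```python
import math

def layer(inputs, weights, biases):
    """One layer: each neuron applies tanh to a weighted sum of its inputs plus a bias."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def mlp_output(x, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: input layer -> one hidden layer -> single output neuron."""
    hidden = layer(x, hidden_w, hidden_b)
    return layer(hidden, [out_w], [out_b])[0]

def mean_squared_error(events, targets, params):
    """Error on the network output, the quantity minimized during training."""
    return sum((mlp_output(x, *params) - t) ** 2
               for x, t in zip(events, targets)) / len(events)

# Hand-picked (hypothetical) weights: the first hidden neuron responds to
# "x1 or x2", the second to "x1 and x2"; the output combines them.
params = (
    [[5.0, 5.0], [5.0, 5.0]],  # hidden-layer weights
    [-2.5, -7.5],              # hidden-layer biases
    [2.0, -2.0],               # output weights
    -2.0,                      # output bias
)

events = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
targets = [-1.0, 1.0, 1.0, -1.0]  # +1 = signal, -1 = background

outputs = [mlp_output(x, *params) for x in events]
error = mean_squared_error(events, targets, params)
print(outputs, error)
```

In practice the weights are not chosen by hand but adjusted iteratively, for example by gradient descent on this error.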

Decision tree
- A decision tree is a binary tree: a sequence of cuts paving the phase-space of the input variables
- Repeated yes/no decisions on single variables are taken for an event until a stop criterion is fulfilled
- Trained to maximize the purity of signal nodes (or the impurity of background nodes)
- Decision trees are extremely sensitive to the training samples, and therefore to overtraining
- To stabilize their performance, one uses different techniques:
  - Boosting
  - Bagging
  - Random forests
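The training of a single node can be sketched as a scan over all variables and thresholds. The example below assumes the Gini index p(1 − p), with p the signal purity, as the separation criterion; this is one common choice, not necessarily the one used in the analyses shown earlier. `best_split` and the toy events are hypothetical.

```python
def gini(n_sig, n_bkg):
    """Gini index p(1 - p) with purity p = S / (S + B); zero for a pure node."""
    n = n_sig + n_bkg
    if n == 0:
        return 0.0
    p = n_sig / n
    return p * (1.0 - p)

def best_split(events, labels):
    """Scan every variable and threshold; keep the cut with the largest impurity decrease."""
    n = len(events)
    n_sig = sum(labels)
    parent = n * gini(n_sig, n - n_sig)
    best = (0.0, None, None)  # (impurity decrease, variable index, threshold)
    for var in range(len(events[0])):
        for threshold in sorted({e[var] for e in events}):
            left = [lab for e, lab in zip(events, labels) if e[var] < threshold]
            left_n, left_sig = len(left), sum(left)
            right_n, right_sig = n - left_n, n_sig - left_sig
            children = (left_n * gini(left_sig, left_n - left_sig)
                        + right_n * gini(right_sig, right_n - right_sig))
            gain = parent - children
            if gain > best[0]:
                best = (gain, var, threshold)
    return best

# Toy events (x1, x2); label 1 = signal, 0 = background. Only x2 separates them.
events = [(0.2, 1.5), (0.9, 1.7), (0.4, 2.1), (0.7, 1.9),
          (0.3, 0.4), (0.8, 0.2), (0.5, 0.9), (0.6, 0.6)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

gain, var, threshold = best_split(events, labels)
print(f"cut on x{var + 1} at {threshold}: impurity decrease {gain:.2f}")
```

A full tree applies the same search recursively to the two daughter nodes until the stop criterion is reached. Boosting, bagging and random forests then combine many such trees.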

Support Vector Machine
- Idea: build a hyperplane that separates signal and background vectors (events) using only a subset of all training vectors (the support vectors)
- The position of the hyperplane is found by maximizing the margin between it and the support vectors
- Non-linear problems are handled by mapping to a higher-dimensional space with a non-linear transformation, using kernel functions such as the Gaussian basis
- SVM can be competitive with NN and BDT but is often less discriminant: data are often non-separable, which makes the result sensitive to all the SVM parameters
- In some cases this method performs very well

Training and application
Training / test samples
- For all multivariate methods, two samples are used:
  - Training sample
  - Test sample
- This is mandatory to check that the training has converged to a solution that does not depend on the statistical fluctuations of the training sample
- Generally speaking, an MVA should be applied (or tested) on events where the response is not known
- Training is time-consuming, especially when increasing the number of variables (and depending on the method)
- Application is usually faster: it only evaluates the MVA output from a stored set of weights
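The need for an independent test sample can be shown with a deliberately overtrained toy classifier. The sketch below assumes a one-variable problem and a 1-nearest-neighbour response, chosen only because it memorizes its training sample perfectly; it is not one of the methods discussed in the lecture.

```python
import random

def nearest_neighbour_label(x, train_x, train_y):
    """Toy MVA response: label of the closest training event."""
    k = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[k]

def error_rate(xs, ys, train_x, train_y):
    """Fraction of events classified incorrectly."""
    wrong = sum(nearest_neighbour_label(x, train_x, train_y) != y
                for x, y in zip(xs, ys))
    return wrong / len(xs)

rng = random.Random(4)
# One overlapping variable: signal ~ Gauss(+0.5, 1), background ~ Gauss(-0.5, 1).
events = [(rng.gauss(0.5, 1.0), 1) for _ in range(400)]
events += [(rng.gauss(-0.5, 1.0), 0) for _ in range(400)]

# Shuffle, then split into statistically independent training and test halves.
rng.shuffle(events)
train, test = events[:400], events[400:]
train_x = [x for x, _ in train]
train_y = [y for _, y in train]
test_x = [x for x, _ in test]
test_y = [y for _, y in test]

train_error = error_rate(train_x, train_y, train_x, train_y)
test_error = error_rate(test_x, test_y, train_x, train_y)
print(f"training error: {train_error:.3f}, test error: {test_error:.3f}")
```

The error measured on the training sample is far too optimistic here, because the classifier has learned the statistical fluctuations of that sample. Only the test sample gives a realistic estimate of the performance.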

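Which method to choose?
From the TMVA manual: table rating each MVA method against a set of criteria (the ratings themselves are not preserved in this transcript).
- MVA methods (columns): Cuts, Likelihood, PDE-RS / k-NN, PDE-Foam, H-Matrix, Fisher / LD, MLP, BDT, RuleFit, SVM
- Criteria (rows):
  - Performance: no or linear correlations; nonlinear correlations
  - Speed: training; response
  - Robustness: overtraining; weak variables
  - Curse of dimensionality
  - Transparency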