Classifying LEP Data with Support Vector Algorithms

P. Vannerem (a), K.-R. Müller (b), B. Schölkopf (b), A. Smola (b), S. Söldner-Rembold (a)

(a) Fakultät für Physik, Hermann-Herder-Str. 3, D-79104 Freiburg, Germany
(b) GMD-FIRST, Rudower Chaussee 5, D-12489 Berlin, Germany

hep-ex/9905027, 17 May 1999

Abstract

We have studied the application of different classification algorithms in the analysis of simulated high energy physics data. Whereas Neural Network algorithms have become a standard tool for data analysis, the performance of other classifiers such as Support Vector Machines has not yet been tested in this environment. We chose two different problems to compare the performance of a Support Vector Machine and a Neural Net trained with back-propagation: tagging events of the type $e^+e^- \to c\bar{c}$ and the identification of muons produced in multihadronic $e^+e^-$ annihilation events.

1 Classification algorithms

Artificial Neural Networks (ANN) are a useful tool to solve multi-dimensional classification problems in high energy physics, in cases where one-dimensional cut techniques are not sufficient. They are used both as hard-coded chips for very fast low-level pattern recognition in on-line triggering [1,2] and as a statistical tool for particle and event classification in offline data analysis. In offline data analysis, a Monte Carlo simulation of the physics process and the detector response is necessary to train an ANN by supervised learning. ANN algorithms have been applied successfully in classification problems such as gluon-jet tagging [3] and b-quark tagging [4]. The ANN classifiers [5] constructed in this paper have sigmoid nodes. Design and tuning issues were solved by applying practical experience rules [6].

To be submitted to the Proceedings of AIHENP99, Crete, April 1999. Preprint submitted to Elsevier Preprint, 17 May 1999.

Fig. 1. A Support Vector Machine maps the input vectors of two different classes (a) into a higher-dimensional space (b) and determines the separating hyperplane with maximum margin in that space. For non-separable problems additional slack variables are introduced. (Panels: a) input space, b) feature space, with the margin and a slack variable indicated.)

This paper presents the application of a recently proposed machine-learning method, called Support Vector Machines (SVM) [7,8], to high energy physics data. The underlying idea is to map the patterns, i.e., the n-dimensional vectors x of n input variables, from the input space to a higher-dimensional feature space with a non-linear transformation (Fig. 1). Gaussian radial basis functions are used as a kernel for the mapping. After this mapping the problem becomes linearly separable by hyperplanes. The hyperplane which maximises the margin is defined by the support vectors, which are the patterns lying closest to the hyperplane. This hyperplane, which is determined with a training set, is expected to ensure an optimal separation of the different classes in the data. In many problems a complete linear separation of the patterns is not possible, and additional slack variables for patterns not lying on the correct side of the hyperplane are therefore introduced.

The training of an SVM [9,10] is a convex optimisation problem. This guarantees that the global minimum can be found, which is not the case when minimising the error function for an ANN with back-propagation. The CPU time needed to find the hyperplane scales approximately with the cube of the number of patterns.

2 Data sets and the OPAL experiment

Two distinctly different problems were chosen for comparing ANN and SVM classifiers: charm tagging and muon identification. The first problem is to classify (tag) $e^+e^- \to q\bar{q}$ events according to the flavour of the produced quarks, separating c-quark events from light-quark (uds) and b-quark events.
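As an aside on the SVM training described in Section 1, the sketch below shows a soft-margin SVM with a Gaussian radial basis function kernel on a toy problem that is not linearly separable in the input space. It is not the training procedure used for this paper: it assumes a simplified dual problem without a bias term, solved by plain coordinate ascent, and the names `rbf_kernel`, `train_svm` and `decision_value`, as well as the values of `gamma`, `c` and `sweeps`, are illustrative choices only.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian radial basis function kernel k(x, z) = exp(-gamma * |x - z|^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def train_svm(patterns, labels, c=10.0, gamma=1.0, sweeps=200):
    """Soft-margin SVM dual without bias term, solved by coordinate ascent.

    Maximises sum_i alpha_i - 0.5 * sum_ij alpha_i alpha_j y_i y_j k(x_i, x_j)
    subject to 0 <= alpha_i <= c. The upper bound c is the form the slack
    variables take in the dual problem. The objective is concave, so the
    ascent cannot get trapped in a local optimum.
    """
    n = len(patterns)
    gram = [[rbf_kernel(patterns[i], patterns[j], gamma) for j in range(n)]
            for i in range(n)]
    alpha = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            f_i = sum(alpha[j] * labels[j] * gram[i][j] for j in range(n))
            gradient = 1.0 - labels[i] * f_i
            # Exact maximisation along coordinate i, clipped to the box [0, c].
            alpha[i] = min(c, max(0.0, alpha[i] + gradient / gram[i][i]))
    return alpha


def decision_value(x, patterns, labels, alpha, gamma=1.0):
    """SVM output: kernel expansion over the patterns with alpha > 0."""
    return sum(a * y * rbf_kernel(p, x, gamma)
               for a, y, p in zip(alpha, labels, patterns) if a > 0.0)


# XOR-like toy data (hypothetical): not linearly separable in the input space.
toy_patterns = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
toy_labels = [-1, -1, 1, 1]
toy_alpha = train_svm(toy_patterns, toy_labels)
toy_outputs = [decision_value(p, toy_patterns, toy_labels, toy_alpha)
               for p in toy_patterns]
```

In this toy case every pattern ends up as a support vector. On realistic data sets most coefficients are expected to stay at zero, and a pattern is assigned to a class by cutting on the value returned by `decision_value`.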
Flavour tagging is necessary for precision measurements of electroweak parameters of the Standard Model. The events are divided into two hemispheres by a plane perpendicular to the thrust axis. The flavour tag is applied separately to both hemispheres, which contain the jets from the two produced quarks.

Fig. 2. The distribution of the reconstructed decay length L is an input distribution to the flavour tag. (Axes: dn/n versus decay length L.)

For a signal (s) with background (bg) the efficiency $\varepsilon$ is defined as $\varepsilon = N^{s}_{\mathrm{tag}} / N^{s}_{\mathrm{tot}}$, where $N^{s}_{\mathrm{tag}}$ is the number of correctly tagged hemispheres and $N^{s}_{\mathrm{tot}}$ is the number of all signal hemispheres in the sample. The purity is given by $N^{s}_{\mathrm{tag}} / (N^{s}_{\mathrm{tag}} + N^{\mathrm{bg}}_{\mathrm{tag}})$, with $N^{\mathrm{bg}}_{\mathrm{tag}}$ being the number of tagged background hemispheres.

Due to the high mass ($\simeq 5$ GeV) and long lifetime ($\simeq 1.5$ ps) of b hadrons, hemispheres containing b-quarks can be tagged using an ANN with typical efficiencies of 25% and purities of about 92% [11]. However, for the lighter charm quark, the measured fragmentation properties and secondary vertex quantities are very similar in charm events and uds events (Fig. 2). High-purity charm tags with low efficiency are possible using D mesons or leptons with high transverse momentum from semi-leptonic decays. Applying an ANN or SVM charm tag to kinematic variables defined for all charm events is expected to increase the charm tagging efficiency at the cost of lower purities.

The second problem is the identification of muons which are produced in the fragmentation of $e^+e^- \to q\bar{q}$ events. Muons are usually not absorbed in the calorimeters. They are measured in muon chambers, which form the outer layer of a typical collider detector. A signal in these chambers which is matched to a track in the central tracking chamber is already a good muon discriminator.

The OPAL detector at LEP has been extensively described elsewhere [12,13].
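The efficiency and purity definitions given earlier in this section reduce to two ratios of counts. The following minimal sketch spells them out; the function names and the hemisphere counts are hypothetical, with the counts chosen only to reproduce the typical b-tag values of 25% efficiency and 92% purity quoted above.

```python
def efficiency(n_sig_tagged, n_sig_total):
    """Fraction of all signal hemispheres that are tagged."""
    return n_sig_tagged / n_sig_total


def purity(n_sig_tagged, n_bg_tagged):
    """Fraction of tagged hemispheres that are signal."""
    return n_sig_tagged / (n_sig_tagged + n_bg_tagged)


# Hypothetical counts, for illustration only.
n_sig_total = 9200   # all signal hemispheres in the sample
n_sig_tagged = 2300  # correctly tagged signal hemispheres
n_bg_tagged = 200    # background hemispheres passing the tag

example_efficiency = efficiency(n_sig_tagged, n_sig_total)
example_purity = purity(n_sig_tagged, n_bg_tagged)
```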

The event generator JETSET 7.4 [14] is used to simulate $e^+e^-$ annihilation events ($e^+e^- \to q\bar{q}$), including the fragmentation of quarks into hadrons measured in the detector. The fragmentation model has to be tuned empirically to match kinematic distributions as measured with the detector [15]. The response of the detector is also simulated. A data set of simulated $e^+e^-$ collisions at a centre-of-mass energy $\sqrt{s} = m_{Z^0}$ is used for the charm identification problem. A second Monte Carlo data set at $\sqrt{s} = 189$ GeV has been used for the muon identification problem.

3 Problem 1: Charm quark tagging

The Monte Carlo events had to fulfil preselection cuts which ensure that an event is well reconstructed in the detector. These cuts result in a first overall reduction of the reconstruction efficiency. The input variables for the machine-learning algorithms were chosen from a larger data set of 27 variables, containing various jet shape variables, e.g. Fox-Wolfram moments and eigenvalues of the sphericity tensor, plus several secondary vertex variables and lepton information. A jet finding algorithm [16] clusters the tracks and the calorimeter clusters into jets. Only the highest-energy jet per hemisphere is used. The variables containing information about high transverse momentum leptons were removed in order to avoid a high correlation of this charm tag with the charm tag using leptons.

The 14 variables with the largest influence on the performance of an ANN (27-27-1 architecture), trained to classify charm versus not-charm, were picked from the larger set. The variable selection method used is equivalent to a method which selects the variables with the largest connecting weights between the input and the first layer. This method has been shown to perform a fairly good variable selection [17]. It would be interesting to try Hessian-based selection methods in comparison [6]. An ANN classifier with a 14-14-1 architecture was found to have the best performance.
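The weight-based variable selection mentioned above can be sketched as follows. The text only states that the variables with the largest connecting weights between the input and the first layer are kept; summing the absolute weights over all first-layer nodes is an assumption made here for illustration, and the function name and the toy weight matrix are hypothetical.

```python
def rank_inputs_by_weight(first_layer_weights, n_keep):
    """Return the indices of the n_keep inputs with the largest summed |weight|.

    first_layer_weights[h][i] is the weight connecting input i to node h
    of the first layer.
    """
    n_inputs = len(first_layer_weights[0])
    scores = [sum(abs(row[i]) for row in first_layer_weights)
              for i in range(n_inputs)]
    ranked = sorted(range(n_inputs), key=lambda i: scores[i], reverse=True)
    return ranked[:n_keep]


# Toy network with 4 inputs and 3 first-layer nodes (hypothetical weights).
toy_weights = [
    [0.9, -0.1, 0.3, 0.0],
    [-0.7, 0.2, 0.4, 0.1],
    [0.5, 0.0, -0.6, 0.1],
]
selected = rank_inputs_by_weight(toy_weights, n_keep=2)
```

For the charm tag, the analogous call would keep 14 of the 27 inputs of the trained 27-27-1 network.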
More complex architectures did not improve the classification.

At generator level five different quark types are distinguished. Due to their very similar decay and jet properties, Monte Carlo events coming from u, d and s quarks are put into one single class (uds). The data set consists of $10^5$ hemispheres per uds, c and b class. However, the efficiencies and purities are calculated assuming a mixture of quark flavours according to the Standard Model prediction. This set is divided into training, validation and test sets of equal size. The learning machines were trained on equal numbers of events from the three classes. The supervision during the learning phase consisted of a charm versus not-charm (udsb) label, thus distinguishing only two classes.

The outputs of both learning machines are shown in Fig. 3. The two classes c and udsb are separated by requiring a certain value for the output. This defines the efficiency and purity of the tagged sample. The purity as a function of the efficiency $\varepsilon$ for the two charm tags is shown in Fig. 3 with the statistical errors. The performance of the SVM is comparable to the performance of the ANN, with a slightly higher purity for the ANN at larger efficiencies $\varepsilon$.

Fig. 3. The output distributions of a) the ANN and b) the SVM. c) Purity versus efficiency for both charm tags. Statistical errors are shown. (Axes: dn/n versus NN output in a) and versus SVM output in b); purity versus efficiency in c), panel title "Charm Tagging", with curves for the support vector machine and the back-prop neural net.)

4 Problem 2: Muon identification

The muon candidates are preselected by requiring a minimum track momentum of 2 GeV and by choosing the best match in the event between the muon chamber track segment and the extrapolated central detector track. Ten discriminating variables containing muon matching, hadron calorimeter and central detector information on the specific energy loss, $\mathrm{d}E/\mathrm{d}x$, of charged particles were chosen from a larger set of variables. The Monte Carlo data set consists of $3 \times 10^4$ muons and $3 \times 10^4$ fake muon candidates. This set was divided into training, validation and test sets of equal size.

After training, the tag is defined by requiring a certain value for the output of the learning machines. The resulting purity as a function of efficiency for muon identification is shown in Fig. 4. For high efficiency the performance of the SVM is very similar to that of the ANN.
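Both tags are defined by a cut on the classifier output, and the purity-versus-efficiency curves follow from scanning that cut. The sketch below illustrates the scan under stated assumptions: the function name, the toy outputs and the per-event weights are hypothetical. The weights stand in for a reweighting of the equal-sized Monte Carlo classes to a physical signal-to-background mixture; they are not the values used in the analysis.

```python
def purity_efficiency_curve(outputs, is_signal, weights, cuts):
    """Return a list of (cut, efficiency, purity) for events with output > cut."""
    total_signal = sum(w for w, s in zip(weights, is_signal) if s)
    curve = []
    for cut in cuts:
        sig = sum(w for o, s, w in zip(outputs, is_signal, weights)
                  if s and o > cut)
        bg = sum(w for o, s, w in zip(outputs, is_signal, weights)
                 if not s and o > cut)
        if sig + bg == 0.0:
            continue  # purity is undefined when nothing is tagged
        curve.append((cut, sig / total_signal, sig / (sig + bg)))
    return curve


# Toy classifier outputs (hypothetical): three signal and three background events.
toy_outputs = [0.9, 0.6, 0.2, 0.7, 0.1, -0.4]
toy_is_signal = [True, True, True, False, False, False]
# Background events carry twice the weight of signal events in this toy mixture.
toy_weights = [1.0, 1.0, 1.0, 2.0, 2.0, 2.0]

toy_curve = purity_efficiency_curve(toy_outputs, toy_is_signal, toy_weights,
                                    cuts=[0.0, 0.5, 0.8])
```

Tightening the cut moves along the curve from high efficiency and low purity towards low efficiency and high purity, which is the trade-off plotted in Figs. 3 and 4.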

Fig. 4. The separation performance on identifying muons shown for a Neural Net trained with back-propagation and a Support Vector Machine. Statistical errors are shown. (Axes: purity versus efficiency; panel title "Muon Identification", with curves for the support vector machine and the back-prop neural net.)

5 Conclusion

We have compared the performance of Support Vector Machines and Artificial Neural Networks in the classification of two distinctly different problems in high energy collider physics: charm tagging and muon identification. The constructed SVM and ANN classifiers give consistent results for efficiencies and purities of tagged samples.

Acknowledgements

This work has been supported by the Deutsche Forschungsgemeinschaft with the grants SO 404/1-1, JA 379/5-2, JA 379/7-1. We also would like to thank Frank Fiedler and Andreas Sittler of DESY for their help with the preparation of the OPAL data. We thank Gunnar Rätsch and Sebastian Mika of GMD-FIRST for helping to analyse the data.

References

[1] For a review see: B. Denby, Proceedings of AIHENP 1993, ed. D. Perret-Gallix, World Scientific (1993).
[2] C. Kiesling et al. (Principles and Hardware), S. Udluft et al. (Training and Control) and L. Janauschek et al. (Physics Results), Proceedings of AIHENP 1999, eds. G. Athanasiu, D. Perret-Gallix, Elsevier North-Holland (1999).
[3] L. Lönnblad, C. Peterson and T. Rögnvaldsson, Phys. Rev. Lett. 65 (1990) 1321.
[4] P. Mazzanti and R. Odorico, Z. Phys. C59 (1993) 273.
[5] C. Peterson, T. Rögnvaldsson and L. Lönnblad, Comp. Phys. Comm. 81 (1994) 185.
[6] G.B. Orr and K.-R. Müller (eds.), Neural Networks: Tricks of the Trade, LNCS 1524, Springer, Heidelberg (1998).
[7] V. Vapnik, The Nature of Statistical Learning Theory, Springer Verlag, New York (1995).
[8] C. Cortes and V. Vapnik, Machine Learning 20 (1995) 273.
[9] B. Schölkopf, C.J.C. Burges and A.J. Smola (eds.), Advances in Kernel Methods - Support Vector Learning, MIT Press, Cambridge, MA (1999).
[10] For more information on SVM, consult http://svm.first.gmd.de
[11] OPAL Collaboration, G. Abbiendi et al., Eur. Phys. J. C8 (1999) 217.
[12] OPAL Collaboration, K. Ackerstaff et al., Z. Phys. C74 (1997) 1.
[13] P.P. Allport et al., Nucl. Instr. and Methods A346 (1994) 476.
[14] T. Sjöstrand, Comp. Phys. Comm. 82 (1994) 74.
[15] OPAL Collaboration, G. Alexander et al., Z. Phys. C69 (1996) 543.
[16] OPAL Collaboration, R. Akers et al., Z. Phys. C63 (1994) 197.
[17] J. Proriol, Proceedings of AIHENP 1995, eds. B. Denby and D. Perret-Gallix, World Scientific (1995).