Boosted Top Tagging with Deep Neural Networks


Jannicke Pearkes, University of British Columbia, Engineering Physics
Wojtek Fedorko, Alison Lister, Colin Gay
Inter-Experimental Machine Learning Workshop, March 22nd, 2017

Overview
- Introduction
- Method
  - Monte Carlo samples
  - Network architecture & training
- Results
  - Preprocessing
  - pT dependence
  - Pileup dependence
  - Learning what is being learnt
- Next steps

Introduction
[Figure: top-quark decay products (W, b) at low top pT and, after a boost, at high top pT. Image: Emily Thompson]
- Train a deep neural network to discriminate between jets originating from top quarks and those originating from QCD background

Monte Carlo Samples
- Signal: Z′ → tt̄
- Background: dijet
- Generated with PYTHIA v8.219, NNPDF23 LO AS 0130 QED PDF
- DELPHES v3.4.0 using the default CMS card
- Jets clustered using DELPHES energy-flow objects
- Anti-kT jets selected with R = […]
- Trimming performed with the kT algorithm and R = 0.2, pT frac = 5%
- Signal jets are selected where a truth top decays hadronically within ΔR = 0.75 of a large-radius jet
- Jets are required to have |η| ≤ 2.0
- Jets are subsampled to be flat in pT and signal-matched in η
- Jets with pT between 600 and 2500 GeV are considered
- ~4 million signal jets and ~4 million background jets
- Sample divided into 80%, 10%, 10% for training, validation and testing
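
The flat-in-pT subsampling and the 80/10/10 split listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `flatten_in_pt` and `split_80_10_10`, the number of pT bins, the random seeds and the toy spectrum are all hypothetical, and the matching in η is left out.

```python
import random

def flatten_in_pt(pts, lo=600.0, hi=2500.0, n_bins=19, seed=0):
    """Return indices of a subsample that is approximately flat in pT.

    Jets are binned in pT and every bin is cut down to the size of the
    least-populated bin (bin count and seed are illustrative choices).
    """
    rng = random.Random(seed)
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for i, pt in enumerate(pts):
        if lo <= pt < hi:
            bins[min(int((pt - lo) / width), n_bins - 1)].append(i)
    target = min(len(b) for b in bins)
    keep = []
    for b in bins:
        keep.extend(rng.sample(b, target))
    rng.shuffle(keep)
    return keep

def split_80_10_10(indices):
    """Split an already shuffled index list into train, validation and test."""
    n_train = int(0.8 * len(indices))
    n_val = int(0.1 * len(indices))
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])

# Toy, steeply falling pT spectrum standing in for the generated jets.
toy_rng = random.Random(1)
toy_pts = [600.0 + toy_rng.expovariate(1 / 400.0) for _ in range(20000)]
train, val, test = split_80_10_10(flatten_in_pt(toy_pts))
```

Cutting every bin to the size of the emptiest one discards many low-pT jets, which is one simple way to obtain a flat spectrum; per-jet weights would be an alternative.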

Examples of Jet Images
[Figure: six jet images. Signal jets with pT = 781, 1480 and 2358 GeV; background jets with pT = 702, 1370 and 2376 GeV. Axes: translated pseudorapidity vs. translated azimuthal angle; colour scale: jet pT per pixel [GeV]]
- Jet images are typically very sparse: roughly 5-10% pixel activation on average if using a 0.1 x 0.1 grid [1]
[1] L. de Oliveira, M. Kagan, L. Mackey, B. Nachman, and A. Schwartzman, Jet-images -- deep learning edition, JHEP 07 (2016) 069, arXiv:1511.05190 [hep-ph].

Neural Network Inputs
- Use a sequence of jet constituents rather than an image
- Advantages:
  - No loss of information due to pixelization in an image
  - Inputs are more information dense
  - Using 120 constituents, average activation is 30%-50%

Training and Network Architecture
Network type: Fully connected
Number of layers: 5, [300, 150, 50, 10, 5, 1]
Number of free parameters: 41,323
Activation function: Rectified linear units, sigmoid on output
Optimizer: Adam
Loss: Binary cross-entropy
Early stopping: Patience of 5
- Implemented with Keras
- Initially planned on using an LSTM, but ended up using a fully connected network
- Performance of the LSTM and the fully connected network was very similar, but the fully connected networks were much faster to train (~10 times), which allowed for faster experimentation with preprocessing techniques and network architectures
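
The loss and the stopping rule named in the table can be written out without any framework. The sketch below is only an illustration of binary cross-entropy and of early stopping with a patience of 5; the talk used Keras, and the function names, the clipping constant `eps` and the example loss values here are assumptions.

```python
import math

def binary_cross_entropy(labels, probs, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

def early_stopping_epoch(val_losses, patience=5):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs. If that never happens,
    the last epoch is returned.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1

# Hypothetical validation-loss history: best value at epoch 2.
history = [0.70, 0.60, 0.50, 0.51, 0.52, 0.53, 0.54, 0.55, 0.40]
stop_epoch = early_stopping_epoch(history, patience=5)
```

In this made-up history the loss would have improved again at the last epoch, which shows the trade-off behind the patience setting: a small value saves training time but can stop before a late improvement.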

Preprocessing

Preprocessing
- Large-radius (R = […]) jets are trimmed using R = 0.2 subjets found with the kT algorithm and pT frac = 5%
- Order subjets by subjet pT, and jet constituents by constituent pT within each subjet
- We use only the 120 highest-pT jet constituents
- Perform preprocessing using domain knowledge about the physics at hand
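
One way to read the ordering and truncation steps above is sketched here. It is an interpretation rather than the authors' implementation: the function name `build_input`, the zero padding of short jets, the use of the scalar pT sum as the subjet pT, and the definition of "activation" as the filled fraction of the 120 slots are all assumptions.

```python
def build_input(subjets, n_constituents=120):
    """Flatten a trimmed jet into a fixed-length list of (pT, eta, phi) values.

    `subjets` is a list of subjets, each a list of (pT, eta, phi) tuples.
    Returns the flat input vector and the fraction of filled slots.
    """
    # Tag each constituent with its subjet index, keep the hardest ones.
    tagged = [(c, s) for s, subjet in enumerate(subjets) for c in subjet]
    tagged.sort(key=lambda t: t[0][0], reverse=True)
    tagged = tagged[:n_constituents]

    # Subjet pT approximated by the scalar sum of its constituents' pT.
    subjet_pt = [sum(c[0] for c in subjet) for subjet in subjets]

    # Order by subjet pT (descending), then by constituent pT (descending).
    tagged.sort(key=lambda t: (-subjet_pt[t[1]], t[1], -t[0][0]))

    flat = []
    for (pt, eta, phi), _ in tagged:
        flat.extend([pt, eta, phi])
    n_real = len(tagged)
    flat.extend([0.0] * (3 * (n_constituents - n_real)))  # zero padding
    return flat, n_real / n_constituents

# Toy jet with two subjets of two constituents each.
toy_jet = [[(10.0, 0.1, 0.2), (50.0, 0.0, 0.0)],
           [(200.0, 0.5, 0.5), (5.0, 0.4, 0.6)]]
vector, activation = build_input(toy_jet, n_constituents=5)
```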

No Preprocessing
[Figure: background rejection vs. top tagging efficiency, jet pT = 600-2500 GeV; curve: trimming only]
Trimming only: AUC = 0.83, R(ε = 50%) = 8.85, R(ε = 80%) = 3.36
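
The figures of merit quoted on the performance slides can be computed from classifier scores as sketched below. The slides do not define them, so this assumes the usual conventions: AUC is the area under the ROC curve, and R(ε) is the inverse background efficiency at the score cut that keeps a fraction ε of the signal. The function names and the toy scores are hypothetical.

```python
import bisect

def auc(sig_scores, bkg_scores):
    """Area under the ROC curve.

    Computed as the probability that a random signal jet scores higher
    than a random background jet, with ties counted as one half.
    """
    bkg_sorted = sorted(bkg_scores)
    total = 0.0
    for s in sig_scores:
        below = bisect.bisect_left(bkg_sorted, s)
        ties = bisect.bisect_right(bkg_sorted, s) - below
        total += below + 0.5 * ties
    return total / (len(sig_scores) * len(bkg_scores))

def rejection_at_efficiency(sig_scores, bkg_scores, efficiency):
    """Background rejection 1/eps_bkg at the requested signal efficiency."""
    ranked = sorted(sig_scores, reverse=True)
    n_pass = max(1, round(efficiency * len(ranked)))
    threshold = ranked[n_pass - 1]
    n_bkg_pass = sum(1 for s in bkg_scores if s >= threshold)
    if n_bkg_pass == 0:
        return float("inf")  # no background survives the cut
    return len(bkg_scores) / n_bkg_pass

# Toy scores: higher means more top-like.
toy_sig = [0.9, 0.8, 0.7, 0.6]
toy_bkg = [0.65, 0.3, 0.2, 0.1]
toy_auc = auc(toy_sig, toy_bkg)
toy_rejection = rejection_at_efficiency(toy_sig, toy_bkg, 1.0)
```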

Scale
- Scale the pT of all jet constituents by a common factor to ensure that the constituent pT is approximately between 0 and 1
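
A minimal sketch of the scaling step follows. The slide does not state the common factor, so the value 2500.0 used in the example (the upper edge of the jet pT range in GeV) is purely a placeholder, and `scale_pt` is a hypothetical name.

```python
def scale_pt(constituents, factor):
    """Divide every constituent pT by the same factor; eta and phi are unchanged."""
    return [(pt / factor, eta, phi) for pt, eta, phi in constituents]

# Placeholder factor: the slide does not give the value that was used.
scaled = scale_pt([(1250.0, 0.1, 0.2), (250.0, -0.3, 0.4)], factor=2500.0)
```

Using one factor for all jets, rather than normalising each jet separately, keeps the relative pT of different jets visible to the network.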

Scale
[Figure: background rejection vs. top tagging efficiency, jet pT = 600-2500 GeV; curves: trimming only, scale]
Scaling: AUC = 0.900, R(ε = 50%) = 21.3, R(ε = 80%) = 6.02

Translate
- Center the jet about the highest-pT subjet in the (η, φ) plane
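
The translation can be sketched as a shift of every constituent by the position of the leading subjet. Wrapping the azimuthal difference into [-π, π) is an assumption added here because φ is periodic; the slide does not mention it, and `translate` is a hypothetical name.

```python
import math

def translate(constituents, subjet_eta, subjet_phi):
    """Shift constituents so the given subjet axis sits at (eta, phi) = (0, 0)."""
    shifted = []
    for pt, eta, phi in constituents:
        # Wrap the azimuthal difference into [-pi, pi).
        dphi = (phi - subjet_phi + math.pi) % (2.0 * math.pi) - math.pi
        shifted.append((pt, eta - subjet_eta, dphi))
    return shifted

# A constituent just across the phi = +-pi boundary from the leading subjet.
centred = translate([(10.0, 1.0, 3.1), (20.0, 1.2, -3.1)], 1.2, -3.1)
```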

Translate
[Figure: background rejection vs. top tagging efficiency, jet pT = 600-2500 GeV; curves: trimming only, scale, translation]
Translation: AUC = 0.924, R(ε = 50%) = 33.2, R(ε = 80%) = 8.48

Rotate
- Designed a method of rotation that preserves the jet mass
- Transform (pT, η, φ) into (px, py, pz)
- Rotate so that the second-highest-pT subjet is aligned with the negative y-axis
- Transform (px, py, pz) back to (pT, η, φ)
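
The rotation steps can be sketched as below. This is a reconstruction under assumptions, not the authors' formula: it assumes the jet has already been translated so the leading subjet lies along the x-axis, treats constituents as massless, and rotates about the x-axis by an angle chosen so that the second subjet ends up at η = 0 with negative py. All function names are hypothetical.

```python
import math

def to_cartesian(pt, eta, phi):
    """(pT, eta, phi) -> (px, py, pz)."""
    return pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)

def to_cylindrical(px, py, pz):
    """(px, py, pz) -> (pT, eta, phi); empty (padded) entries stay at zero."""
    pt = math.hypot(px, py)
    if pt == 0.0:
        return 0.0, 0.0, 0.0
    return pt, math.asinh(pz / pt), math.atan2(py, px)

def rotate(constituents, second_subjet):
    """Rotate about the x-axis so `second_subjet` lies along negative y."""
    _, py2, pz2 = to_cartesian(*second_subjet)
    theta = math.pi - math.atan2(pz2, py2)
    c, s = math.cos(theta), math.sin(theta)
    rotated = []
    for pt, eta, phi in constituents:
        px, py, pz = to_cartesian(pt, eta, phi)
        rotated.append(to_cylindrical(px, py * c - pz * s, py * s + pz * c))
    return rotated

def jet_mass(constituents):
    """Invariant mass of the summed constituents, assumed massless."""
    e = sx = sy = sz = 0.0
    for c in constituents:
        px, py, pz = to_cartesian(*c)
        e += math.sqrt(px * px + py * py + pz * pz)
        sx, sy, sz = sx + px, sy + py, sz + pz
    return math.sqrt(max(e * e - sx * sx - sy * sy - sz * sz, 0.0))

# Toy jet: leading subjet on the x-axis, second subjet at (0.3, 0.4).
toy_jet = [(300.0, 0.0, 0.0), (150.0, 0.3, 0.4), (80.0, -0.2, -0.5)]
toy_rotated = rotate(toy_jet, second_subjet=(150.0, 0.3, 0.4))
```

Because this is a genuine rotation of the three-momenta, every |p| and every opening angle is unchanged, so the jet mass is preserved; a rotation performed directly in the (η, φ) plane would not have this property in general.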

Rotate
[Figure: background rejection vs. top tagging efficiency, jet pT = 600-2500 GeV; curves: trimming only, scale, translation, rotation]
Rotation: AUC = 0.932, R(ε = 50%) = 42.3, R(ε = 80%) = 9.57

Flip
- The third subjet is not constrained, but can be moved to the right half of the plane
- Flip the jet if the average pT is in the left half of the plane
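
A possible reading of the flip step is sketched here. Two assumptions are made that the slide does not spell out: the horizontal axis of the plane is taken to be η (the second subjet having been placed on the vertical axis by the rotation), and "average pT in the left half" is interpreted as a negative pT-weighted mean η. The name `flip` is hypothetical.

```python
def flip(constituents):
    """Mirror the jet in eta if its pT-weighted mean eta is negative."""
    total_pt = sum(pt for pt, _, _ in constituents)
    if total_pt == 0.0:
        return list(constituents)
    mean_eta = sum(pt * eta for pt, eta, _ in constituents) / total_pt
    if mean_eta < 0.0:
        return [(pt, -eta, phi) for pt, eta, phi in constituents]
    return list(constituents)

# Toy jet whose pT sits mostly at negative eta, so it gets mirrored.
flipped = flip([(10.0, -0.3, 0.1), (5.0, 0.2, -0.1)])
```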

Flip
[Figure: background rejection vs. top tagging efficiency, jet pT = 600-2500 GeV; curves: trimming only, scale, translation, rotation, flip]
Flip: AUC = 0.933, R(ε = 50%) = 44.3, R(ε = 80%) = 9.75

Performance on Truth vs Reconstructed Jets

Performance after preprocessing
[Figure: background rejection vs. top tagging efficiency, jet pT = 600-2500 GeV; curves: DNN (truth), τ32 (truth), DNN (reco), τ32 (reco)]

Performance at 50% overall Signal Efficiency
[Figure: two panels, truth jets and reconstructed jets; signal efficiency and background rejection vs. jet pT [GeV], 600-2500 GeV]
Truth jets: AUC = 0.947, R(ε = 50%) = 66, R(ε = 80%) = 13
Reconstructed jets: AUC = 0.933, R(ε = 50%) = 44, R(ε = 80%) = 9.7
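
The pT-dependent curves can be obtained by fixing a single score cut at 50% overall signal efficiency and then evaluating it in bins of jet pT. The sketch below assumes that procedure and the usual definition of rejection as inverse background efficiency; the function names, bin edges and toy jets are hypothetical.

```python
def working_point_threshold(sig_scores, efficiency=0.5):
    """Score cut that keeps the requested fraction of all signal jets."""
    ranked = sorted(sig_scores, reverse=True)
    return ranked[max(1, round(efficiency * len(ranked))) - 1]

def binned_performance(sig, bkg, threshold, edges):
    """Signal efficiency and background rejection per pT bin.

    `sig` and `bkg` are lists of (pT, score); `edges` are the bin edges.
    Returns one (efficiency, rejection) pair per bin.
    """
    results = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        s = [score for pt, score in sig if lo <= pt < hi]
        b = [score for pt, score in bkg if lo <= pt < hi]
        n_sig_pass = sum(1 for score in s if score >= threshold)
        n_bkg_pass = sum(1 for score in b if score >= threshold)
        eff = n_sig_pass / len(s) if s else float("nan")
        rej = len(b) / n_bkg_pass if n_bkg_pass else float("inf")
        results.append((eff, rej))
    return results

# Toy jets as (pT in GeV, classifier score).
toy_sig = [(700.0, 0.9), (700.0, 0.4), (1500.0, 0.8), (1500.0, 0.7)]
toy_bkg = [(700.0, 0.85), (700.0, 0.1), (1500.0, 0.2), (1500.0, 0.3)]
cut = working_point_threshold([score for _, score in toy_sig], 0.5)
per_bin = binned_performance(toy_sig, toy_bkg, cut, [600.0, 1000.0, 2500.0])
```

With a single global cut the efficiency is only 50% on average, so any slope of the per-bin efficiency with pT shows directly how the tagger response depends on jet pT.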

Pileup

Performance at different levels of pileup
[Figure: background rejection vs. top tagging efficiency, jet pT = 600-2500 GeV; curves: no pileup, pileup = 23, pileup = 50]
- Extremely stable performance with respect to pileup

Performance at different levels of pileup
[Figure: signal efficiency and background rejection vs. jet pT [GeV], 600-2500 GeV; curves: no pileup, pileup = 23, pileup = 50]
- pT dependence also stable with respect to pileup

Learning what is being learnt

Jet Mass
[Figure: left panel, background jets: DNN output vs. jet mass [GeV], colour scale P(jet mass | DNN output); right panel: jet mass [GeV] distributions for signal and background. Flat pT distribution, 600 < jet pT < 2500 GeV]
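
A conditional distribution such as P(jet mass | DNN output) can be built from a two-dimensional histogram in which every DNN-output row is normalised to unit sum. The sketch below assumes that this is what the colour scale shows; the function name, the binning and the toy values are hypothetical.

```python
import bisect

def conditional_histogram(masses, outputs, mass_edges, n_output_bins=5):
    """Estimate P(mass bin | DNN output bin).

    Rows correspond to equal-width DNN output bins on [0, 1]; each row is
    normalised so that its mass bins sum to 1 (empty rows stay at 0).
    """
    n_mass_bins = len(mass_edges) - 1
    counts = [[0] * n_mass_bins for _ in range(n_output_bins)]
    for mass, out in zip(masses, outputs):
        i = min(int(out * n_output_bins), n_output_bins - 1)
        j = bisect.bisect_right(mass_edges, mass) - 1
        if 0 <= j < n_mass_bins:  # ignore masses outside the plotted range
            counts[i][j] += 1
    hist = []
    for row in counts:
        total = sum(row)
        hist.append([c / total if total else 0.0 for c in row])
    return hist

# Toy background jets: (jet mass in GeV, DNN output).
toy_masses = [50.0, 150.0, 160.0, 450.0]
toy_outputs = [0.10, 0.90, 0.95, 0.15]
toy_hist = conditional_histogram(toy_masses, toy_outputs,
                                 [0.0, 100.0, 200.0, 300.0, 400.0, 500.0],
                                 n_output_bins=2)
```

Normalising per output row removes the effect of how many jets fall at each output value, so a ridge in the plot near the top mass at high DNN output would indicate that the network has learnt a mass-like feature from background jets.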


Next Steps
- Short term:
  - We plan to revisit LSTMs
  - Thorough Bayesian hyper-parameter optimization
- Longer term:
  - Both top and W tagging with deep neural networks are now reasonably well established on Monte Carlo
  - But does it work on data?
  - Start working towards evaluating the performance of these techniques on data
  - Investigate effects of systematics and strategies for mitigating the impact of systematics

Thank you!

W-tagging performance on truth
QCD-Aware Recursive Neural Networks for Jet Physics. Louppe, Cho, Becot, Cranmer. https://arxiv.org/abs/1702.00748

Zooming
Parton Shower Uncertainties in Jet Substructure Analyses with Deep Neural Networks. Barnard, Dawe, Dolan, Rajcic. https://arxiv.org/pdf/1609.00607v2.pdf

Performance when trained and tested on different levels of pileup
[Figure: three panels, NN trained on µ = 0, µ = 23 and µ = 50; each shows signal efficiency and background rejection vs. jet pT [GeV], 600-2500 GeV, when tested on µ = 0, µ = 23 and µ = 50]
- Examined how a neural network trained at one pileup level performs on another level of pileup
- NN seems relatively robust to changes in pileup expected at the LHC in the next few years


τ32
[Figure: left panel, background jets: DNN output vs. τ32 (wta), colour scale P(τ32 wta | DNN output); right panel: τ32 distributions for signal and background. Flat pT distribution, 600 < jet pT < 2500 GeV]