Boosted hadronic object identification using jet substructure in ATLAS Run-2

Boosted hadronic object identification using jet substructure in ATLAS Run-2 Emma Winkels on behalf of the ATLAS collaboration HEPMAD18

Outline Jets and jet substructure Top and W tagging H bb tagging Mass-decorrelated taggers Summary 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 1

Jets 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 2

What is a jet? Jets are objects constructed from the energy deposits left by collimated sprays of particles. Attempt to group inputs from common sources together.

What is a jet? Jets are objects constructed from the energy deposits left by collimated sprays of particles. Attempt to group inputs from common sources together. Hard scattering

What is a jet? Parton shower Jets are objects constructed from the energy deposits left by collimated sprays of particles. Attempt to group inputs from common sources together. Hard scattering

Parton shower What is a jet? Jets are objects constructed from the energy deposits left by collimated sprays of particles. Attempt to group inputs from common sources together. Hard scattering Hadronization

11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 7

Inputs to jets ATLAS uses calorimeter objects and tracks as jet inputs. Calorimeter measures energy of particles. Starts with topological clustering of calorimeter cells. The ATLAS detector Muon Neutrino Proton Neutron Muon spectrometer Seed cells, E > 4σ above noise Hadronic calorimeter Growth cells, E > 2σ above noise Boundary cells Final topoclusters Inner detector Photon Electron Electromagnetic calorimeter LHC beampipe 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 8

How to define a jet There is no unique way to define a jet Different jet algorithms to cluster energy constituents into a jet: Anti-k ' : cluster hard (high-p ' ) and close (small R) energy deposits first k ' : cluster soft (low-p ' ) and close Cambridge/Aachen: cluster close 12 constituents = large E = small E R 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 9

Boosted jets Different jet radii (R) for different purposes: Small jets for quarks and gluons Large jets for hadronic decays of W, Z, H, top.. θ 2m/pT Increasing pt 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 10

Jet substructure Access the inner structure of large jets* Jet substructure variables are some function of Number of constituents Energy of the constituents Angular separation of the constituents ( R) Jet substructure helps us in jet tagging R 12 constituents = large E = small E * We use jet grooming to cut away the soft parts of jets, see 1510.05821 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 11

Jet tagging Identify the particle that produced the jet. Used in broad range of physics analyses. Analyses looking at boosted top/higgs/w: distinguish these large jets from quark/gluon jets Analyses with b-hadrons or c-hadrons in the final state: heavy flavour tagging (b-, c-quark jets) 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 12

Top/W tagging ATLAS-CONF-2017-064 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 13

Commissioning of a tagger ATLAS process from idea to tagger used for physics analyses: Use MC to choose tagger Check data/mc Measure efficiencies in data & MC Calculate uncertainties on efficiencies 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 14

Use MC to choose tagger Two-variable tagging Simple cut-based tagging works well: W: m 1234 + D 7 Top: m 1234 + τ 97 m 1234 : Combined mass, combines calorimeter clusters with tracks to give more stable mass performance at high-p '. D 7 : Energy correlation ratio, distinguishes between one-prong and two-prong jets. τ 97 : N-subjettiness, distinguishes between two-prong and three-prong jets. 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 15

Use MC to choose tagger Two-variable tagging Simple cut-based tagging works well: W: m 1234 + D 7 Top: m 1234 + τ 97 m 1234 : Combined mass, combines calorimeter clusters with tracks to give more stable mass performance at high-p '. D 7 : Energy correlation ratio, distinguishes between one-prong and two-prong jets. τ 97 : N-subjettiness, distinguishes between two-prong and three-prong jets. W tagging performance ATLAS-CONF-2017-064 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 16

Use MC to choose tagger Machine learning taggers - BDT Machine learning techniques allow for the use of multiple variables Boosted decision tree (BDT): sequentially adding variables improves classification over two-variable tagging W tagging Top tagging bkg ) Relative background rejection (1/ rel 60 50 40 30 20 10 0 D 2 comb m p T ATLAS Simulation Preliminary s = 13 TeV, BDT W Tagging Trimmed anti-k t R = 1.0 jets rel sig p true 21 KtDR τ T comb m = 50% = [200,2000] GeV 3 a > 40 GeV, η true < 2.0 1 τ P FW R 2 A 2 C 2 τ z cut d 12 3 e bkg ) Relative background rejection (1/ rel 9 8 7 6 5 4 3 2 1 0 τ 32 comb m 23 d ATLAS Simulation Preliminary s = 13 TeV, BDT Top Tagging Trimmed anti-k t R = 1.0 jets rel sig p D 2 true T comb m W Q = 80% = [350,2000] GeV > 40 GeV, η true < 2.0 d 12 21 τ 3 e 2 τ p T 2 C 1 τ τ 3 ATLAS-CONF-2017-064 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 17

Use MC to choose tagger Machine learning taggers - DNN ATLAS evaluated TopoDNN* top tagger. Uses deep neural network (DNN) with topocluster jet constituents as inputs. Performance is better with lowlevel inputs than standard machine learning taggers. Background rejection (1 / bkg ) 4 10 3 10 2 10 10 DNN top BDT top Shower Deconstruction 2-var optimised tagger TopoDNN ATLAS-CONF-2017-064 ATLAS Simulation Preliminary s = 13 TeV Trimmed anti-k t R = 1.0 jets η true < 2.0 true p = [1500, 2000] GeV T Top tagging HEPTopTagger v1, m comb > 60 GeV τ 32 TopoDNN: Low level inputs (four momenta) * More details on TopoDNN: 1704.02124. DNN: High level inputs (constructed variables) 1 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Signal efficiency ( sig ) 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 18

Measurements in data Use MC to choose tagger Check data/mc Events / 5 GeV 10000 ATLAS Preliminary Data 2015+2016-1 s = 13 TeV, 36.1 fb tt (top) Trimmed anti-k tt (W ) t R=1.0 jets tt (other) 8000 R(large-R jet, b-jet) > 1.0 Single Top (W ) p > 200 GeV T Single Top (other) W + jets 6000 VV, Z + jets, multijet Total uncert. Stat. uncert. tt modelling uncert. 4000 W enriched sample Events / 5 GeV 2500 Data 2015+2016 tt (top) tt (W ) tt (other) 2000 Single Top (W ) Single Top (other) W + jets 1500 VV, Z + jets, multijet Total uncert. Stat. uncert. tt modelling uncert. 1000 Top enriched sample ATLAS Preliminary -1 s = 13 TeV, 36.1 fb Trimmed anti-k t R=1.0 jets R(large-R jet, b-jet) < 1.0 p > 350 GeV T 2000 500 Data/Pred. 1.5 1 0.5 Data/Pred. 1.5 1 0.5 60 80 100 120 140 160 180 200 220 240 60 80 100 120 140 160 180 200 220 240 comb Leading large-r jet m [GeV] ATLAS-CONF-2017-064 60 80 100 120 140 160 180 200 220 240 comb Leading large-r jet m [GeV] 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 19

Use MC to choose tagger Check data/mc Measure efficiencies in data & MC Calculate uncertainties on efficiencies W/top tagging efficiency Need to measure efficiency in data and get uncertainty on this efficiency*. Full ATLAS 2015-2016 dataset of 36.1 36.7 fb CD Measure top/w tagging efficiency in tt lepton+jets samples. Measure multijet rejection in dijet and γ +jets samples. * Also done in V+jets: ATLAS-CONF-2018-016 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 20

Use MC to choose tagger Check data/mc Measure efficiencies in data & MC Calculate uncertainties on efficiencies W/top tagging efficiency vs. pile-up Pile-up is the resulting signal in the detector from other interactions besides the hard scatter we want to look at. Expressed as the mean number of interactions per bunch crossing. 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 21

Use MC to choose tagger Check data/mc Measure efficiencies in data & MC Calculate uncertainties on efficiencies W/top tagging efficiency vs. pile-up pre-fit ϵ IJ = SQOOTU L MNOPQR SQOOTU L VLMNOPQR MNOPQR PWS SQOOTU ϵ XYZY = SQOOTU L [NSSTU MNOPQR SQOOTU PWS SQOOTU L [NSSTU MNOPQR VL [NSSTU MNOPQR post-fit Signal efficiency ( sig ) Data/MC 0.6 ATLAS Preliminary -1 Data 2015+2016 0.5 0.4 0.3 0.2 1.5 0.5 1 s = 13 TeV, 36.1 fb lepton+jets selection Trimmed anti-k t R=1.0 jets W tagger ( = 50%): m comb sig + D 2 p > 200 GeV T W tagger PowhegPythia6 Total uncert. 10 20 30 40 10 20 30 40 Mean number of interactions per bunch crossing µ 10 20 30 40 ATLAS-CONF-2017-064 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 22 Signal efficiency ( sig ) Data/MC 1.5 1 0.5 1.2 0.8 1 ATLAS Preliminary s = 13 TeV, 36.1 fb lepton+jets selection Trimmed anti-k t R=1.0 jets Top tagger ( sig = 80%): TopoDNN p > 450 GeV T TopoDNN tagger -1 Data 2015+2016 PowhegPythia6 Total uncert. 10 20 30 40 Mean number of interactions per bunch crossing µ

Use MC to choose tagger Check data/mc Measure efficiencies in data & MC Calculate uncertainties on efficiencies Multijet rejection vs. pile-up Background rejection (1/ bkg ) 160 140 120 100 80 60 40 20 W tagger ATLAS Preliminary -1 s = 13 TeV, 36.7 fb Trimmed anti-k t R=1.0 jets Multijet Selection W tagger ( sig = 50%): comb m + D 2 Data 2015+2016 Pythia8 Herwig++ Stat. uncert. Total uncert. Background rejection (1/ bkg ) 50 45 40 35 30 25 20 15 10 5 TopoDNN tagger ATLAS Preliminary Data 2015+2016-1 s = 13 TeV, 36.7 fb Pythia8 Trimmed anti-k Herwig++ t R=1.0 jets Stat. uncert. Multijet Selection Total uncert. Top tagger ( sig = 80%): TopoDNN Data/Pred. 2 1.5 1 0.5 0 5 10 15 20 25 30 35 40 Mean number of interactions per bunch crossing µ Data/Pred. ATLAS-CONF-2017-064 25 10 15 20 25 30 35 40 1.5 1 0.5 0 5 10 15 20 25 30 35 40 Mean number of interactions per bunch crossing µ 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 23

H bb tagging ATL-PHYS-PUB-2017-010 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 24

Overview b H BOOST b b] H BOOST b] b H b] Current nominal tagger identifies b-jets with multivariate algorithm on anti-k ' R=0.2 track-jets. Loses efficiency at high-p ' due to b- jets merging. 3 new subjet reconstruction techniques to mitigate this loss. 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 25

1 Variable radius track jets Subjets with dynamic radius parameter: ATL-PHYS-PUB-2017-010 R^ p ' = ` a b with low (R cde ) and high (R cfg ) cutoff. Scans performed over ρ, R cde, R cfg to find optimal values 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 26

2 Exclusive-k ' Recluster trimmed large-r jet calorimeter constituents with k ' algorithm and stop when 2 jets are obtained. Splits the large-r jet into two parts, each of which is expected to contain one b-hadron. 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 27

3 Centre-of-mass Boost jet calorimeter clusters to the centre-of-mass frame of the large-r jet (jet rest frame) and reconstruct subjets. Tracks for b-tagging are also boosted to the centre-of-mass frame. 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 28

Results New methods show large improvement over nominal tagger for p ' > 1000 GeV. New methods Nominal ATL-PHYS-PUB-2017-010 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 29

Mass decorrelation ATL-PHYS-PUB-2018-014 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 30

Mass decorrelated taggers Jet substructure variables are correlated with jet mass. When you put many of them into a multivariate analysis the correlation gets very strong. Sculpting of the multijet background -> resembles the resonance jet mass distribution Depopulates side-band regions Aim to decorrelate jet substructure classifiers from jet mass ATL-PHYS-PUB-2017-004 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 31

Designed decorrelated taggers (DDT) τ 7D variable distinguishes 1-prong from 2-prong jets Has a linear relationship to jet scaling variable ρ kkl for masses > 80 GeV ATL-PHYS-PUB-2018-014 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 32

Designed decorrelated taggers (DDT) τ 7D variable distinguishes 1-prong from 2-prong jets Has a linear relationship to jet scaling variable ρ kkl for masses > 80 GeV ATL-PHYS-PUB-2018-014 τ 7D after DDT transformation Standard τ 7D 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 33

Summary 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 34

Summary Top/W tagging: Machine learning taggers perform better than 2-variable taggers Machine learning tagger with low-level inputs (TopoDNN) performs the best for top tagging Signal efficiency & background rejection in data are well modelled by MC H bb tagging: 3 new subjet reconstruction techniques to overcome loss of efficiency at high p ' due to b-jet merging Large improvement over nominal tagger for p ' > 1000 GeV Mass decorrelated taggers: Various approaches studied with promising results 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 35

Fin. 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 36

Back-up 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 37

The ATLAS coordinate system η = ln tan θ 2 R = η 7 + φ 7 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 38

Granularity of ATLAS calorimeter The hadronic calorimeter is coarser than the EM calorimeter. If we want to use all calorimeter layers we are limited by coarsest layer (R cde ~0.2). But we can also ignore coarse layers and use only very fine layers. EM granularity: η φ 0.025 π/128 Hadronic granularity: η φ 0.1 π/32 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 39

Jet trimming 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 40

Combined mass Track-assisted mass: m lx = m ZyY1z { }QRW { S~Q} ATLAS-CONF-2016-035 in which m ZyY1z and p l ZyY1z are the invariant mass and pt calculated from tracks associated with the large-r trimmed calorimeter jet and p l 1Y 2 is the pt of the original trimmed large-r jet. Calorimeter mass: m fƒ = E d d 7 p d d 7 m 1234 = a m 1Y 2 + b m lx with a = ˆ}QRW Š ˆ}QRW Š Vˆ and b = ˆ Š Š ˆ}QRW Š Š Vˆ where σ 1Y 2 and σ lx are the calorimeter-based jet mass resolution function and the track-assisted mass resolution function respectively. 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 41

JSS variables Energy correlation ratio: D 7 = Œ (Œ Š ) where e 7 and e 9 are the 2- and 3-prong energy correlation functions which are sensitive to the 2- and 3-prong structure in a jet. N-subjettiness: τ 97 = τ 9 /τ 7 where τ L = 1 p d l,z min ( R Dz, R 7z,, R Lz ), d = p lz R z R is radius parameter of the jet, p lz is transverse momentum of constituent k, and δr z is the distance between the subjet i and the constituent k. The N-subjettiness variable τ L expresses how well a jet can be described as containing N subjets. Jet mass calculated from constituents: m 7 = E d d 7 p 7 d d z 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 42

W tagging with ML methods Background rejection (1 / bkg ) 4 10 3 10 2 10 10 DNN W BDT W 2-var optimised tagger, m comb D 2 [60, 100] GeV Low pt ATLAS Simulation Preliminary s = 13 TeV Trimmed anti-k t R = 1.0 jets η true < 2.0 true p = [200, 500] GeV T W tagging Background rejection (1 / bkg ) 4 10 3 10 2 10 10 DNN W BDT W 2-var optimised tagger, m comb D 2 [60, 100] GeV High pt ATLAS Simulation Preliminary s = 13 TeV Trimmed anti-k t R = 1.0 jets η true < 2.0 true p = [1000, 1500] GeV T W tagging 1 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Signal efficiency ( sig ) 1 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Signal efficiency ( sig ) 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 43

TopoDNN Uses p l, η, φ of 10 leading topoclusters in trimmed large-r jet. Normalized units 0.16 0.14 0.12 ATLAS Simulation Preliminary s = 13 TeV Pythia 8 Pythia 8 Multijet Z Cluster 0 Cluster 1 Cluster 2 Cluster 9 0.1 0.08 0.06 Trimmed anti-k t R=1.0 jets Light Quark Jet Sample : p > 450 GeV T Top Quark Jet Sample: p > 150 GeV T 0.04 0.02 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 fraction of cluster 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 44 p T

Top/W tagging in data 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 45

W/top tagging efficiency Need to measure efficiency in data and get uncertainty on this efficiency. Measure signal-like events in data to fit large-r jet mass for events passing/failing a tagger Passing tag Failing tag Events / 10 GeV 1200 ATLAS Preliminary -1 Data 2015+2016 s = 13 TeV, 36.1 fb tt signal 1000 non-tt background tt background 800 600 400 200 lepton+jets selection Trimmed anti-k t R=1.0 jets Top tagger ( = 80%): m comb sig + τ 32 Tag pass 350 GeV < p < 400 GeV T Events / 10 GeV 800 600 400 200 ATLAS Preliminary s = 13 TeV, 36.1 fb -1 Data 2015+2016 tt signal non-tt background tt background lepton+jets selection Trimmed anti-k t R=1.0 jets Top tagger ( = 80%): m comb sig + τ 32 Tag fail 350 GeV < p < 400 GeV T Data/MC 1.50 100 200 300 400 1 0.5 0 100 200 300 400 m comb [GeV] Data/MC 1.50 100 200 300 400 1 0.5 0 100 200 300 400 m comb [GeV] 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 46

W/top tagging efficiency vs. jet pt pre-fit ϵ IJ = SQOOTU L MNOPQR SQOOTU L VLMNOPQR MNOPQR PWS SQOOTU ϵ XYZY = SQOOTU L [NSSTU MNOPQR SQOOTU PWS SQOOTU L [NSSTU MNOPQR VL [NSSTU MNOPQR post-fit Signal efficiency ( sig ) Data/MC 0.6 ATLAS Preliminary -1 Data 2015+2016 0.5 0.4 0.3 0.2 W tagger s = 13 TeV, 36.1 fb lepton+jets selection Trimmed anti-k t R=1.0 jets W tagger ( = 50%): m comb sig + D 2 PowhegPythia6 Total uncert. 1.5 200 300 400 500 600 0.5 1 200 300 400 500 600 Leading large-r jet p [GeV] T Signal efficiency ( sig ) Data/MC 1.5 1 0.5 1.5 1 0.5 ATLAS Preliminary TopoDNN tagger s = 13 TeV, 36.1 fb lepton+jets selection Trimmed anti-k t R=1.0 jets Top tagger ( sig = 80%): TopoDNN -1 Data 2015+2016 PowhegPythia6 Total uncert. 500 600 700 800 900 1000 500 600 700 800 900 1000 Leading large-r jet p [GeV] T 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 47

Multijet rejection vs. jet pt Background rejection (1/ bkg ) 160 140 120 100 80 60 40 W tagger ATLAS Preliminary -1 s = 13 TeV, 36.7 fb Trimmed anti-k t R=1.0 jets Multijet Selection W tagger ( sig = 50%): comb m + D 2 Data 2015+2016 Pythia8 Herwig++ Stat. uncert. Total uncert. Background rejection (1/ bkg ) 40 35 30 25 20 15 10 TopoDNN tagger ATLAS Preliminary Data 2015+2016-1 s = 13 TeV, 36.7 fb Pythia8 Trimmed anti-k Herwig++ t R=1.0 jets Stat. uncert. Multijet Selection Total uncert. Top tagger ( sig = 80%): TopoDNN 20 5 Data/Pred. 2500 1000 1500 2000 2500 1.5 1 0.5 0 500 1000 1500 2000 2500 Data/Pred. 2 1.5 1 0.5 0 500 1000 1500 2000 2500 Leading large-r Jet p T [GeV] Leading large-r Jet p T [GeV] 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 48

Variable radius track jets 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 49

Results H->bb tagging techniques 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 50

H->bb tagging results 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 51

Designed decorrelated taggers Linear fit on the relationship between τ 7D and rho. Transformation is: τ 7D ' = τ 7D a ρ ' 1.5 a is the slope of the fit in the plot. DDT transformation removes the linear correlation of τ 7D with ρ. Since ρ has info on kinematics of the jet (m and pt), the DDT transform yields a JSS discriminant which is decorrelated from the jet mass. 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 52

Mass decorrelation 11/09/2018 Emma Winkels ewinkels@cern.ch HEPMAD18 53