CMS Physics Analysis Summary


Available on the CERN CDS information server

CMS PAS JME3

Contact: cms-pog-conveners-jetmet@cern.ch /3/3

Study of Jet Substructure in pp Collisions at 7 TeV in CMS

The CMS Collaboration

Abstract

Many physics models of electroweak symmetry breaking and solutions to the hierarchy problem predict the existence of heavy particles that often decay hadronically. If these new particles are boosted relative to their mass, the decay products of a hadronic decay can fall within a single jet. This summary explores the search for and identification of these hadronically decaying boosted objects by examining the substructure of the jets that are postulated to originate from them. Examples include top quarks, W/Z bosons, Higgs bosons, and many others. Two algorithms have been proposed, one dedicated to reconstructing boosted top quarks and the other to identifying boosted W bosons. Both techniques are applied to data recorded by CMS in 2010 at 7 TeV, and comparisons to various Monte Carlo generators and underlying event tunes are presented. A parametrization of the rate at which generic QCD jets are mistakenly identified by each algorithm is obtained from data. This parametrization allows background estimations for physics measurements and searches to be done using data-driven techniques.

1 Introduction

Many physics models of electroweak symmetry breaking and solutions to the hierarchy problem predict the existence of heavy particles. If these new particles are sufficiently massive, their decay products will be highly boosted in the lab frame. Decay products of a highly boosted object tend to be collimated and merged into a single reconstructed jet. The merged jets can be dissected to find substructure representing each parton from the hadronic decay. Several algorithms to resolve this substructure have been proposed, with a recent comprehensive summary in Reference [1]. In this note two strategies designed for this purpose are considered. One is the Johns Hopkins University top tagging algorithm developed by Kaplan et al. [2]. We also examine the University of Washington pruning algorithm developed by S. Ellis et al. [3], with which we target hadronic W decays.

Both tagging approaches attempt to find substructure by looking at the clustering sequence on a jet-by-jet basis. The top tagging algorithm seeks the three partons coming from a hadronically decaying boosted top. Jets are reconstructed using a sequential clustering algorithm, and the clustering sequence is then reversed to find subjets. The algorithm completes successfully only when the mass of the jet is close to the top quark mass, at least three subjets are found, and a W candidate is formed by combining two of the subjets found.

The pruning algorithm is a general approach to finding substructure for any boosted object in a jet (top, W/Z, Higgs, etc.). Jets are again reconstructed using a sequential clustering algorithm; in this case, however, the clustering sequence is repeated with an added set of requirements to be met before the jet is finally built. In each step of the repeated clustering sequence, constituents that are soft and at large angles are excluded from the merging procedure. The final set of pruned jets can then be dissected by stepping back in the clustering sequence until N subjets are found.
This bottom-up approach differs in strategy from the top-down approach used in the top tagger. In this study pruning is used to tag boosted W bosons decaying hadronically by requiring N = 2.

The top tagging and W tagging algorithms were optimized using the CMS simulation of QCD background and various potential signal samples. Techniques to measure mistag rates from data are explored throughout the note, and comparisons of the results with several Monte Carlo simulations are presented.

2 The CMS Detector

The data sample was collected by CMS in 2010 at $\sqrt{s} = 7$ TeV and corresponds to an integrated luminosity of 36 pb$^{-1}$. The CMS detector [4] is a general-purpose device. It has many features suited for reconstruction of energetic jets, in particular a finely segmented electromagnetic calorimeter, a hadronic calorimeter and a tracking detector. Charged particles are reconstructed by the inner tracker, immersed in a 3.8 T axial magnetic field; the inner tracker consists of three layers and two endcap disks of pixel sensors, and ten barrel layers and twelve endcap disks of silicon strips. This arrangement results in full azimuthal coverage within $|\eta| < 2.5$, where $\eta$ is the pseudorapidity, defined as $\eta = -\ln\tan(\theta/2)$. CMS uses a polar coordinate system, with the z axis coinciding with the axis of symmetry of the CMS detector and oriented in the counterclockwise proton direction; $\theta$ is the polar angle defined with respect to the positive z axis.

A lead-tungstate crystal electromagnetic calorimeter (ECAL) and a brass-scintillator hadronic calorimeter (HCAL) surround the tracking volume and allow photon, electron and jet reconstruction up to $|\eta| = 3$. The ECAL and HCAL cells are grouped into towers projecting radially outward from the interaction region. In the central region ($|\eta| < 1.74$) the towers have dimensions $\Delta\eta = \Delta\phi = 0.087$; at higher $|\eta|$, the $\eta$ and $\phi$ widths increase. ECAL and HCAL cell energies above the noise suppression thresholds are combined within each tower to define the calorimeter tower energy, and the towers are further combined into clusters, which are then identified as jets. For an improved jet reconstruction, the tracking and calorimeter information is combined in an algorithm called particle-flow [5], which is described below.

3 Data Sets

The goal of this study is to examine the performance of jet substructure algorithms in background-enriched (generic QCD) environments, and hence a generic dijet selection is performed. The data are collected with inclusive jet triggers with uncorrected jet energies. Due to the increasing instantaneous luminosity, there is no single inclusive jet trigger via which the data were collected without prescales, and so uncorrected jet $p_T$ triggers with thresholds of 30, 50, 70, 100, and 140 GeV/c are required. The lowest threshold unprescaled trigger is required for each event.

Four different generators and tunes are used to simulate the dijet data. We compare PYTHIA 6 tunes Z2 and D6T, PYTHIA 8 tune 1, and HERWIG++ tune 23 [6][7]. All samples except for PYTHIA 8 use an artificial flat $\hat{p}_T$ spectrum. The PYTHIA 8 sample is produced in bins of $\hat{p}_T$.

In addition to the Monte Carlo simulation of QCD dijet production, a MC sample of continuum QCD production of $t\bar{t}$ pairs is also considered, as well as seven samples of Z' decaying to $t\bar{t}$ pairs with resonance masses between 750 and 4000 GeV/c$^2$.

4 Jet Reconstruction and Event Selection

Event data are reconstructed using the particle-flow reconstruction algorithm [5], which attempts to reconstruct all stable particles in an event by combining information from all subdetectors.
The algorithm categorizes all particles into five types: muons, electrons, photons, charged hadrons and neutral hadrons. The resulting particle flow candidates are passed to each jet clustering algorithm to create particle flow jets.

The particle flow candidates are clustered using the Cambridge-Aachen (CA) clustering algorithm [8, 9]. The CA clustering sequence is determined only by the distance between clusters and is not weighted by their momentum, as is done for the $k_T$ and anti-$k_T$ algorithms. A distance parameter of size $R = \sqrt{(\Delta y)^2 + (\Delta\phi)^2} = 0.8$ is used. In previous studies based on simulation [], the CA algorithm was found to be more efficient (for the same mistag rate) than $k_T$ or anti-$k_T$ at finding hard subjets when the clustering sequence is reversed [].

The jet four-momenta are corrected [] for nonlinearities in $\eta$ and $p_T$ with simulated data. An extra correction is applied to the data to account for a residual nonlinearity that is not observed in the simulation. No pileup corrections are applied. Constituents of the jet (i.e. subjets) are not corrected, and algorithmic procedures are done on uncorrected energies. As no official corrections are available for CA R = 0.8 jets, the corrections for anti-$k_T$ R = 0.5 jets are used, and simulation studies confirm that they work well for the jet momenta considered here.

The following preselection is applied:

- The event must have a good primary vertex ($|z_{\mathrm{Primary\ Vertex}}| < 24$ cm, $N_{DOF} > 3$).

- The event must contain at least two jets, and the two highest-$p_T$ jets must satisfy:
  - $p_T > 200$ GeV/c (W tagging) or $p_T > 250$ GeV/c (top tagging),
  - $|\eta| < 2.5$,
  - Loose PF Jet ID [] is applied.
- The two leading jets must satisfy $|\phi_1 - \phi_2| > 2.1$, where $\phi$ is the azimuthal angle.
- Beam background events are removed using the following requirement: in events with at least 10 tracks, a minimum of 25% of these tracks must be high-purity tracks.

5 The Top Tagging and W Tagging Algorithms

Both the top tagging and W tagging algorithms make use of sequential recombination algorithms such as the Cambridge-Aachen algorithm [8, 9]. In this section the key steps of both approaches are described, as well as the variables used in pruning and selection of subjets.

5.1 Top Tagging Algorithm

Cambridge-Aachen R = 0.8 jets, as described in Section 4, are used as inputs to the top tagging algorithm. The input CA jets are hereafter referred to as the hard jets. The algorithm has two steps: the primary decomposition, in which the algorithm attempts to split the hard jet into two subclusters, and the secondary decomposition, in which the algorithm attempts to split the clusters found by the primary decomposition. The decomposition procedure is as follows:

1. The pairwise clustering sequence which was used to form the jet is examined in reverse order to find two subclusters.

2. Continue to the next step if the two subclusters satisfy $\sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} > 0.4 - A \cdot p_T^C$, where $p_T^C$ is the transverse momentum of the object fed to the decomposition stage, and the slope parameter $A = 0.0004$ was optimized using simulated events. If this selection is not satisfied the subclusters are too close and the decomposition fails. The primary decomposition uses the jet $p_T$ ($p_T^C = p_T^{jet}$), while the secondary decomposition uses the $p_T$ of the clusters found by the primary decomposition ($p_T^C = p_T^A$ or $p_T^C = p_T^B$).

3. If the two subclusters satisfy the momentum fraction criterion $p_T^{cluster} > \delta_p \times p_T^{hard\ jet}$, then the decomposition succeeds. The $p_T$ requirement on the cluster serves to remove low-$p_T$ clusters from consideration. The default value of $\delta_p = 0.05$ was found to be optimal based on simulation studies.

4. If only one of the subclusters satisfies the criterion $p_T^{cluster} > \delta_p \times p_T^{hard\ jet}$, then the decomposition process is repeated on the passed cluster, ignoring the constituents from the failed cluster. This decomposition is repeated until both clusters pass, both clusters fail, or the cluster consists of a single constituent.

5. If, after this iterative process, there is no cluster with $p_T^{cluster} > \delta_p \times p_T^{hard\ jet}$, or the cluster is a single constituent, the decomposition fails.

The primary decomposition recursively declusters the hard jet until it finds two subclusters (define them as A and B) which are well separated and contain a significant fraction of the hard jet's momentum. If the primary decomposition fails then this jet has one subjet (the original jet). If the primary decomposition succeeds, then the secondary decomposition is applied to clusters A and B. If cluster A and cluster B cannot be further decomposed then they become the jet's only two subjets. If both are successfully decomposed then the jet has four subjets. If the secondary decomposition succeeds on one cluster and fails on the other, then this jet has three subjets. At least three subjets are required.

The following variables, defined for each jet passing the algorithm, are used to tag top jets:

- Jet Mass $m_{jet}$: the mass of the four-vector sum of the constituents of the hard jet.
- Number of Subjets $N_{subjets}$: the number of subjets found by the algorithm.
- Minimum Pairwise Mass $m_{min}$: the three highest-$p_T$ subjets are taken pairwise, and each pair's invariant mass is calculated via $m_{ij} = \sqrt{(E_i + E_j)^2 - (\vec{p}_i + \vec{p}_j)^2}$. $m_{min}$ is the mass of the pair with the lowest invariant mass ($m_{min} = \min[m_{12}, m_{13}, m_{23}]$). This variable is not defined for jets with fewer than three subjets.

Jets that have mass close to the top mass, at least three subjets, and minimum pairwise mass close to the W mass are tagged as top jets. A loose top tag is defined via the following requirements:

$100 < m_{jet} < 250$ GeV/c$^2$ (1)

$N_{subjets} \geq 3$ (2)

$m_{min} > 50$ GeV/c$^2$ (3)

These requirements are representative of those that can be used to tag top jets. In an analysis the requirements will be optimized to maximize the ratio of the number of signal events to the square root of the number of background events.

5.2 W Tagging Algorithm

The W tagging algorithm takes a different strategy than the top tagger to groom jets. Instead of removing soft clusters during the reversal of the clustering sequence, as is done by the top tagger, the removal of soft clusters is built into the clustering sequence itself through the process of pruning. In detail, the algorithm is as follows.
The clustering sequence is rerun for each jet, and at every step of the clustering, when clusters i and j are being merged to a single cluster p, two additional requirements are made (where $m_J$ stands for the mass of the jet in the first clustering sequence):

$z_{ij} \equiv \frac{\min(p_{T,i}, p_{T,j})}{p_{T,p}} > z_{cut}$ (4)

$\Delta R_{ij} < D_{cut} \equiv \alpha \cdot \frac{m_J}{p_T^J}$ (5)

If the $(i + j \to p)$ step fails to satisfy these two requirements, i and j are not merged and instead the softer of the two clusters is removed. The $z_{ij}$ requirement ensures that soft particles are discarded, and the $\Delta R$ requirement ensures that wide-angle particles are discarded. The resulting jet is referred to as the pruned jet. The choices for $z_{cut}$ and $\alpha$ in the above equations are 0.1 and 0.5, respectively, which are the default values used by the authors of the algorithm in Reference [3].
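As an illustration only, the check made at each recombination step of the pruning procedure can be sketched in Python. The function name and argument layout are hypothetical and do not come from any CMS software. The defaults for `z_cut` and `alpha` are assumed values, and the veto is taken to fire only when a merging is both soft and wide-angle, which is one reading of the two requirements above (it matches the statement in the introduction that constituents which are soft and at large angles are excluded).

```python
def merge_is_vetoed(pt_i, pt_j, pt_merged, delta_r_ij, m_jet, pt_jet,
                    z_cut=0.1, alpha=0.5):
    """Return True if the i + j -> p recombination should be pruned.

    pt_i, pt_j  : transverse momenta of the two clusters being merged
    pt_merged   : transverse momentum of the would-be merged cluster p
    delta_r_ij  : angular distance between clusters i and j
    m_jet, pt_jet : mass and pT of the jet from the first clustering pass
    z_cut, alpha  : pruning parameters (defaults are assumptions)
    """
    z_ij = min(pt_i, pt_j) / pt_merged   # momentum sharing of the merge
    d_cut = alpha * m_jet / pt_jet       # angular scale set by the jet itself
    too_soft = z_ij < z_cut
    too_wide = delta_r_ij > d_cut
    # Assumed interpretation: veto only soft AND wide-angle recombinations.
    return too_soft and too_wide
```

When a merge is vetoed, the softer of the two clusters would be dropped and the clustering would continue with the remaining one; that bookkeeping is left out of this sketch.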

Jet pruning can be used to identify W jets by applying the following selection, which exploits the variables used in Ref. [3]:

- Require two subjets in the pruning algorithm.
- Require the total pruned jet mass to satisfy 60 GeV/c$^2$ < $m_{jet}$ < 100 GeV/c$^2$.
- Subjet $p_T$ is used to sort the two subjets. By looking at the last clustering iteration of the pruned jet, the mass drop of the hardest subjet (whose mass is hereafter referred to as $m_1$) is computed and is required to satisfy $m_1 / m_{jet} = \mu < 0.4$.

The mass drop requirement ensures that the mass of the jet is roughly evenly spread between two or more subjets. The $p_T$ of the two subjets are hereafter referred to as $p_{T,1}$ and $p_{T,2}$. A subjet $p_T$ asymmetry is defined as

$y = \frac{\min(p_{T,1}^2, p_{T,2}^2)\,(\Delta R)^2}{m_{jet}^2}$

and can be required to be $y > y_{cut}$, although at the present time this particular selection is not applied. The $p_T$ asymmetry ensures that the less energetic subjet still carries more energy than a fraction of the mass of the whole W candidate system. Both criteria are designed to select W candidates in which the daughter subjets are similar in energy and mass.

6 Mistag Rate from Data

This section describes the measurement of the mistag rate, or the rate at which jets not originating from boosted top quarks or boosted W bosons are mistakenly tagged. For both the top tagging and W tagging algorithms, the mistag rate is measured from dijet data. In the prototypical new physics search ($t\bar{t}$ resonances), the data sample for the analysis using the top tagging algorithm is the dijet sample itself. Hence, to derive a statistically disjoint control sample with which to measure the mistag rate, an anti-tag-and-probe method is employed (the signal region is tag-plus-tag, so a tag veto achieves the statistical independence required).

The top mistag measurement procedure is as follows:

- For each preselected event, one of the two leading jets is selected randomly.
- A sample of non-top jets is formed by selecting the events for which this randomly selected jet is not tagged (anti-tagged). A jet is anti-tagged if it fails any of the requirements listed above.
- The jet opposite to the anti-tagged jet is the probe jet.
- The mistag rate is measured by counting the number of probe jets that are tagged and dividing this by the total number of probe jets.

The mistag rate for top tagging as measured from data is shown in Figure 1a. Four different Monte Carlo generators and tunes are shown for comparison.

The performance capabilities of the top tagging algorithm on the Monte Carlo simulation are summarized in Figure 2a. The probability to wrongly identify a generic QCD jet as a top jet (mistag rate) is shown on the y-axis, while the efficiency to identify a true top quark is shown on the x-axis. The denominator of the efficiency calculation is defined to be the number of $p_T$-leading jets in events which pass the preselection criteria (as defined in Section 4). The numerator is the number of these jets which are tagged. The mistag rate and the efficiency, as functions of jet $p_T$, are shown in Figure 1a and 1c respectively.
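The anti-tag-and-probe counting described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis code: all function names and the jet record layout (a dict with a `mass` entry and a list of `(E, px, py, pz)` subjet four-vectors) are hypothetical, and the loose-tag thresholds are passed as keyword arguments whose defaults are assumed values for the working point of Section 5.1.

```python
import math
import random


def pair_mass(a, b):
    """Invariant mass of two four-vectors given as (E, px, py, pz)."""
    e = a[0] + b[0]
    px, py, pz = a[1] + b[1], a[2] + b[2], a[3] + b[3]
    m2 = e * e - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))  # clamp small negative rounding errors


def min_pairwise_mass(subjets):
    """Smallest pair mass among the three highest-pT subjets, else None."""
    if len(subjets) < 3:
        return None
    lead = sorted(subjets, key=lambda s: math.hypot(s[1], s[2]),
                  reverse=True)[:3]
    return min(pair_mass(lead[i], lead[j])
               for i, j in ((0, 1), (0, 2), (1, 2)))


def is_top_tagged(jet, m_lo=100.0, m_hi=250.0, n_min=3, m_min_cut=50.0):
    """Loose top tag; the default thresholds are assumptions."""
    m_min = min_pairwise_mass(jet["subjets"])
    return (m_lo < jet["mass"] < m_hi
            and len(jet["subjets"]) >= n_min
            and m_min is not None
            and m_min > m_min_cut)


def top_mistag_rate(dijet_events, rng):
    """dijet_events: iterable of (jet, jet) pairs of leading jets."""
    n_probe = 0
    n_tagged = 0
    for jets in dijet_events:
        pick = rng.randrange(2)          # choose one leading jet at random
        if is_top_tagged(jets[pick]):
            continue                     # keep only anti-tagged events
        n_probe += 1                     # the opposite jet is the probe
        if is_top_tagged(jets[1 - pick]):
            n_tagged += 1
    return n_tagged / n_probe if n_probe else float("nan")
```

Passing a seeded `random.Random` instance as `rng` makes the random choice of jet reproducible. Because events in which the randomly chosen jet is tagged are discarded, the probe sample stays disjoint from a tag-plus-tag signal region.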

The W tagging mistag rate is also computed from dijets. The signal region for the W tagging algorithm in envisioned $t\bar{t}$ resonance searches is a multijet sample, with more than two jets. The dijet sample is therefore a strictly statistically disjoint control region, and the anti-tag-and-probe method is unnecessary. The procedure then becomes:

- For each preselected event, one of the two leading jets is selected randomly as the tag jet.
- The mistag rate is measured by counting the number of away-side probe jets that are tagged and dividing this by the total number of probe jets.

The mistag rate for W tagging as measured from data is shown in Figure 1b. The simulated mistag rate is shown for comparison.

Figure 1: Mistag rate for the top tagging and W tagging algorithms measured in data and simulation, as a function of jet $p_T$ (GeV/c). The bands are Monte Carlo statistical error, and the error bars are statistical error from data. Also shown are simulated efficiencies for top and W tagging, calculated using seven Z' → $t\bar{t}$ samples with resonance masses between 750 and 4000 GeV/c$^2$. (a) Top tagging mistag rate as measured using the anti-tag-and-probe procedure (35.97 pb$^{-1}$ at $\sqrt{s} = 7$ TeV). (b) W tagging mistag rate based on the pruning algorithm using a randomly selected probe jet (34.7 pb$^{-1}$ at $\sqrt{s} = 7$ TeV). (c) Simulated top tagging efficiency as a function of jet $p_T$. (d) Simulated W tagging efficiency as a function of jet $p_T$.
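The W tagging selection of Section 5.2 and a $p_T$-binned version of the mistag counting can be sketched in Python in the same spirit. Everything named here is hypothetical: the function names, the `(pt, tagged)` record layout and the bin edges are illustrative, and the default mass window and mass drop cut are assumed values rather than a statement of the analysis configuration.

```python
from bisect import bisect_right


def is_w_tagged(pruned_mass, n_subjets, m_subjet1,
                m_lo=60.0, m_hi=100.0, mu_max=0.4):
    """W tag on a pruned jet; default cut values are assumptions.

    pruned_mass : mass of the pruned jet (GeV/c^2)
    n_subjets   : number of subjets found by the pruning algorithm
    m_subjet1   : mass of the hardest subjet (GeV/c^2)
    """
    if n_subjets != 2 or pruned_mass <= 0.0:
        return False
    mu = m_subjet1 / pruned_mass        # mass drop of the hardest subjet
    return m_lo < pruned_mass < m_hi and mu < mu_max


def mistag_rate_vs_pt(probes, bin_edges):
    """Tagged fraction of probe jets in each pT bin.

    probes    : iterable of (pt, tagged) pairs, one per probe jet
    bin_edges : increasing list of bin boundaries in GeV/c
    Returns one rate per bin, or None for an empty bin.
    """
    n_bins = len(bin_edges) - 1
    n_all = [0] * n_bins
    n_tag = [0] * n_bins
    for pt, tagged in probes:
        i = bisect_right(bin_edges, pt) - 1   # bin containing pt
        if 0 <= i < n_bins:                   # skip under/overflow
            n_all[i] += 1
            n_tag[i] += int(bool(tagged))
    return [t / a if a else None for t, a in zip(n_tag, n_all)]
```

Returning `None` for empty bins, rather than zero, keeps a bin with no probe jets distinguishable from a bin with a measured mistag rate of zero.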

Figure 2: Simulated top and W tagging mistag rate as a function of efficiency. The curves correspond to different jet $p_T$ bins. (a) Top tagging: efficiency is calculated with seven Z' → $t\bar{t}$ simulated samples with resonance masses between 750 and 4000 GeV/c$^2$. Mistag rate is calculated with a QCD dijet sample generated with PYTHIA 6 Tune Z2. The $p_T$ spectrum has been reweighted such that it is flat within each $p_T$ bin. (b) W tagging: efficiency is obtained from the same Z' Monte Carlo samples, and the mistag rate is derived from a QCD dijet sample generated with HERWIG.

The overall performance of the W tagging algorithm is summarized in Figure 2b, obtained from the Monte Carlo simulation. The probability to wrongly identify a generic QCD jet as a W jet (mistag rate) is shown on the y-axis, while the efficiency to identify a true W jet is shown on the x-axis. The mistag rate and the efficiency, as functions of jet $p_T$, are shown in Figure 1b and 1d respectively.

7 Commissioning of the Top Tagging and W Tagging Algorithms

The performance of the top tagging and W tagging algorithms is now examined. The following two sections focus on the top tagging algorithm and on the W tagging based on the pruning algorithm, respectively. In both cases, distributions of the algorithm variables from the QCD Monte Carlo samples are compared to the 36 pb$^{-1}$ of LHC collision data from the 2010 run.

7.1 Commissioning of the Top Tagging Algorithm

The top tagging algorithm constructs three variables ($m_{jet}$, $m_{min}$, and $N_{subjets}$) which can be used to identify jets originating from boosted top quarks. Simulated distributions of $m_{jet}$ and $m_{min}$ versus jet $p_T$ demonstrate the identification power of these variables (Fig. 3). The jet mass distribution of jets from high-$p_T$ top quarks peaks at the top mass.
The minimum pairwise invariant mass among the three hardest subjets, $m_{min}$, has a peak at the W mass for signal jets, but peaks at smaller values for background jets (Figures 3c and 3d). Figure 4 shows that these variables offer significant discrimination between signal and background jets.

Figure 3: Two-dimensional distributions of reconstructed $m_{jet}$ and $m_{min}$ versus jet $p_T$: (a) $m_{jet}$ for a generated Z' → $t\bar{t}$ sample, (b) $m_{jet}$ for a generated QCD dijet sample, (c) $m_{min}$ for a generated Z' → $t\bar{t}$ sample, (d) $m_{min}$ for a generated QCD dijet sample. The histograms are normalized to unity in each $p_T$ bin. Seven Z' → $t\bar{t}$ simulated samples are used with resonance masses between 750 and 4000 GeV/c$^2$. The QCD dijet sample was generated with PYTHIA 6 Tune Z2.

Data to simulation comparisons of top tagging variables are shown in Figure 5. The jet mass and number of subjets distributions are highly sensitive to the generator and tune. For example, the number of jets predicted by simulation for a jet with mass $m_{jet}$ = 7 GeV/c$^2$ is 5% larger for PYTHIA 8 and 3% lower for PYTHIA 6 than what is measured in data (Fig. 6). The distributions predicted by HERWIG++ tune 23 most closely resemble what is measured from data. Using the same example, the number of jets predicted by HERWIG++ for jets with mass $m_{jet}$ = 7 GeV/c$^2$ is only 3% lower than what is measured in data, as seen in Fig. 6.

Figure 7 shows the top tagging variables after all the selection requirements are applied except on the variable shown. The jet mass distribution is less sensitive to the generator and tune after selection criteria are applied, but the predicted $N_{subjets}$ distribution is still very sensitive.

The effect of pileup on top tagging variables can be examined by taking the ratio of the number of events with one primary vertex to the number of events with more than one primary vertex. The jet mass distribution shifts to higher mass values in events with more than one primary vertex (Figure 8a). Similarly, events with more than one primary vertex are more likely to have multiple subjets (Figure 8b). Minimum pairwise mass is less susceptible to pileup (Figure 8c).

Figure 4: Comparison of top tagging variables for a Z' → $t\bar{t}$ simulation and a PYTHIA 6 Tune Z2 QCD simulation. Seven Z' → $t\bar{t}$ simulated samples are used with resonance masses between 750 and 4000 GeV/c$^2$. A flat jet $p_T$ spectrum on the interval $p_T \in$ [45, ] GeV/c is used. (a) Jet mass ($m_{jet}$). (b) Number of subjets ($N_{subjets}$). (c) Minimum pairwise mass ($m_{min}$).

Figure 5: Comparison of data and simulation for top tagging variables (35.97 pb$^{-1}$ at $\sqrt{s} = 7$ TeV). The simulation has been normalized to the same number of events as in data. (a) Jet mass ($m_{jet}$). (b) Number of subjets ($N_{subjets}$). (c) Minimum pairwise mass ($m_{min}$).

7.2 Commissioning of the Data for W Tagging

Figure 9 shows data to simulation comparisons for the W tagging algorithm: the jet mass, mass drop ($\mu$), subjet asymmetry, and $\Delta R$ between the two subjets, for the leading jet in the event. Overall the agreement between the data and the simulation is striking, particularly between the data and HERWIG++ tune 23. This highlights some dependence of the jet substructure on the underlying event tune chosen.

Of particular note is the jet mass distribution shown in Figure 9a. The peaking structure at low mass can be predicted from NLO perturbation theory, due to a soft collinear divergence. The actual behavior in nature then turns over and drops toward zero.
The shoulder of the data and the simulation around 60-80 GeV/c² is an artifact of finite-size effects (see, for instance, Ref. [14] for a theoretical description of the jet mass from NLO perturbation theory). Figure 10 shows the W tagging variables after all the selection requirements are applied except on the variable shown. The overall agreement between data and simulation is still quite good. Figure 11 shows the normalized ratio of events with one good vertex to events with more than one good vertex. There is considerable agreement between the events with one good vertex and those with more than one. This is in slight contrast to the top tagging algorithm, where the pileup plays a larger role. The difference arises because the pruning algorithm dynamically attempts to remove pileup, which is mostly soft and not correlated in angle with the primary clustering. As is seen in the data, this procedure seems effective, at least in the pileup regime exhibited by the current data samples. To examine the effect of the selection on a typical signal, Figure 12 shows comparisons of the W tagging variables in simulation for Z′ → tt̄ and for QCD. There is excellent separation between the background (QCD) and the signal (Z′).

Figure 6: Percent difference of simulation as compared to data. (a) Jet mass (m_jet). (b) Number of subjets (N_subjets). (c) Minimum pairwise mass (m_min).

Figure 7: Top tagging variables after all the selection requirements are applied except on the variable shown. The simulation has been normalized to the same number of events as in data. (a) Jet mass (m_jet). (b) Number of subjets with the m_jet requirement only (N_subjets). (c) Minimum pairwise mass (m_min).
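As a rough illustration of the W tagging variables discussed in this section, the Python sketch below computes the mass drop, the ΔR between the two subjets, and a subjet asymmetry for a pruned jet with two subjets. The mass drop follows the definition given in the caption of Figure 9 (mass of the highest-pT subjet divided by the mass of the full jet). The asymmetry formula, y = min(pT1², pT2²) ΔR² / m_jet², is a commonly used definition and is an assumption here, since this section does not spell it out. The `Subjet` class and all function names are hypothetical and are not part of any CMS software.

```python
import math
from dataclasses import dataclass


@dataclass
class Subjet:
    """Hypothetical container for one subjet of a pruned jet."""
    pt: float    # transverse momentum in GeV/c
    eta: float   # pseudorapidity
    phi: float   # azimuthal angle in radians
    mass: float  # invariant mass in GeV/c^2


def delta_phi(phi1: float, phi2: float) -> float:
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    d = (phi1 - phi2) % (2.0 * math.pi)  # now in [0, 2*pi)
    return d - 2.0 * math.pi if d > math.pi else d


def delta_r(a: Subjet, b: Subjet) -> float:
    """Angular distance sqrt(d_eta^2 + d_phi^2) between two subjets."""
    return math.hypot(a.eta - b.eta, delta_phi(a.phi, b.phi))


def w_tag_variables(jet_mass: float, sj1: Subjet, sj2: Subjet) -> dict:
    """Return mass drop, delta R and asymmetry for a two-subjet jet.

    The asymmetry definition is an assumption (see the text above).
    """
    if jet_mass <= 0.0:
        raise ValueError("jet mass must be positive")
    # Mass drop uses the subjet with the highest transverse momentum.
    leading = sj1 if sj1.pt >= sj2.pt else sj2
    dr = delta_r(sj1, sj2)
    return {
        "mass_drop": leading.mass / jet_mass,
        "delta_r": dr,
        "asymmetry": min(sj1.pt, sj2.pt) ** 2 * dr ** 2 / jet_mass ** 2,
    }
```

A call such as `w_tag_variables(80.0, Subjet(200.0, 0.0, 0.0, 30.0), Subjet(100.0, 0.3, 0.4, 10.0))` returns the three quantities in a dictionary; selection thresholds on them are deliberately left out because they are not quoted in this section.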
8 Conclusion

The search for jet substructure has been explored using CMS data. Two algorithms dedicated to this purpose were studied, one to identify boosted tops and another to identify boosted W bosons. Techniques to measure the mistag rate of each algorithm directly from data were presented. The mistag rate is measured in data and compared to simulation, using different generators and underlying event tunes. The agreement between the simulation and the data is generally good, with some simulations and tunes doing better than others. The best agreement is observed with the HERWIG++ generator. There is also some pileup dependence observed in the top tagging algorithm, but this is smaller in effect than the dependence on the underlying event tune. Since previous studies of top and W tagging have necessarily focused on Monte Carlo simulation, our comparisons of the performance between data and Monte Carlo lend substantial credibility to the algorithms used to identify jet substructure. This in turn opens new opportunities for their deployment in searches for new physics at CMS and elsewhere.

Acknowledgments

This work was supported in part by the DOE under Task TeV of contract DE-FG-96ER4956 during the Workshop on Jet Substructure at the University of Washington. Moreover we would like to thank the Bundesministerium für Bildung und Forschung for its support.

Figure 8: Normalized ratio of events with one good vertex to events with more than one good vertex. (a) Jet mass (m_jet). (b) Number of subjets (N_subjets). (c) Minimum pairwise mass (m_min).

Figure 9: Leading pruned jet mass, mass drop (ratio of the mass of the highest-pT subjet to that of the full jet), subjet ΔR, and asymmetry. Two different PYTHIA tunes, Z2 and D6T, and HERWIG++ are compared. Plots are normalized to unity. (a) Jet mass. (b) Mass drop. (c) ΔR between subjets. (d) Subjet pT asymmetry.

Figure 10: W tagging variables after all the selection requirements are applied except on the variable shown. Both data and simulation are normalized to unity. (a) Jet mass. (b) Mass drop.

Figure 11: Normalized ratio of events with one good vertex to events with more than one good vertex. (a) Jet mass. (b) Mass drop. (c) ΔR between subjets. (d) Subjet pT asymmetry.
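The normalized one-vertex to multi-vertex ratio described in the caption above can be sketched in a few lines of Python. This is a minimal illustration that assumes each binned distribution is first normalized to unit area and the two are then divided bin by bin; the exact normalization used for the figures is not stated in this section, and the function names are hypothetical.

```python
from typing import List, Optional, Sequence


def normalize(counts: Sequence[float]) -> List[float]:
    """Scale a binned distribution so that its bin contents sum to one."""
    total = float(sum(counts))
    if total <= 0.0:
        raise ValueError("histogram has no entries")
    return [c / total for c in counts]


def normalized_ratio(
    one_vertex: Sequence[float], multi_vertex: Sequence[float]
) -> List[Optional[float]]:
    """Bin-by-bin ratio of two unit-normalized distributions.

    Bins that are empty in the denominator are returned as None
    instead of dividing by zero.
    """
    if len(one_vertex) != len(multi_vertex):
        raise ValueError("histograms must have the same binning")
    num = normalize(one_vertex)
    den = normalize(multi_vertex)
    return [n / d if d > 0.0 else None for n, d in zip(num, den)]
```

A ratio close to one in every bin would indicate that the shape of the variable is insensitive to the number of reconstructed vertices, which is how the pileup dependence is judged in the text.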

Figure 12: Comparison of W tagging variables for a Z′ → tt̄ simulation and a QCD simulation. (a) Jet mass. (b) Mass drop. (c) ΔR between subjets. (d) Subjet pT asymmetry.

References

[1] A. Abdesselam et al., Boosted objects: a probe of beyond the Standard Model physics, arXiv:1012.5412.
[2] D. E. Kaplan, K. Rehermann, M. D. Schwartz et al., Top Tagging: A Method for Identifying Boosted Hadronically Decaying Top Quarks, Phys. Rev. Lett. 101 (2008) 142001. doi:10.1103/PhysRevLett.101.142001.
[3] S. D. Ellis, C. K. Vermilion, and J. R. Walsh, Phys. Rev. D80 (2009).
[4] CMS Collaboration, The CMS experiment at the CERN LHC, JINST 3 (2008) S08004. doi:10.1088/1748-0221/3/08/S08004.
[5] CMS Collaboration, Particle-Flow Event Reconstruction in CMS and Performance for Jets, Taus, and MET, CMS PAS PFT-09-001 (2009).
[6] T. Sjostrand et al., High-energy physics event generation with PYTHIA 6.1, Comput. Phys. Commun. 135 (2001) 238-259, arXiv:hep-ph/0010017. doi:10.1016/S0010-4655(00)00236-8.
[7] M. Bahr et al., Herwig++ Physics and Manual, Eur. Phys. J. C58 (2008) 639-707, arXiv:0803.0883. doi:10.1140/epjc/s10052-008-0798-9.
[8] M. Wobisch and T. Wengler, Hadronization corrections to jet cross sections in deep-inelastic scattering, arXiv:hep-ph/9907280.
[9] Y. L. Dokshitzer, G. D. Leder, S. Moretti et al., Better Jet Clustering Algorithms, JHEP 08 (1997) 001, arXiv:hep-ph/9707323.
[10] CMS Collaboration, CMS PAS JME-09 (2009).
[11] CMS Collaboration, Jet Energy Corrections determination at 7 TeV, CMS PAS JME-10-010 (2010).
[12] CMS Collaboration, Jet Performance in pp Collisions at √s = 7 TeV, CMS PAS JME-10-003 (2010).
[13] J. Butterworth, A. Davison, M. Rubin et al., Jet substructure as a new Higgs search channel at the LHC, Phys. Rev. Lett. 100 (2008).
[14] S. D. Ellis, J. Huston, K. Hatakeyama et al., Jets in hadron-hadron collisions, Prog. Part. Nucl. Phys. 60 (2008) 484-551, arXiv:0712.2447. doi:10.1016/j.ppnp.2007.12.002.