Tuning the simulated response of the CMS detector to b-jets using Machine learning algorithms

Similar documents
Studies of Higgs hadronic decay channels at CLIC. IMPRS Young Scientist Workshop at Ringberg Castle 18 th July 2014 Marco Szalay

Determination of the Top- Quark Mass from the m lb Distribution in Dileptonic Top-Pair Events at s= 8 TeV

Top Quark Production at the LHC. Masato Aoki (Nagoya University, Japan) For the ATLAS, CMS Collaborations

How to find a Higgs boson. Jonathan Hays QMUL 12 th October 2012

Christian Wiel. Reconstruction and Identification of Semi-Leptonic Di-Tau Decays in Boosted Topologies at ATLAS

Jet reconstruction in LHCb searching for Higgs-like particles

Radiative decays: The photon polarization in b s γ penguin transitions

La ricerca dell Higgs Standard Model a CDF

First Look at H WW * lepton + h Analysis. University of Delhi. Ajay Kumar, Arun Kumar, Dr. Kirti Ranjan

Statistical Tools in Collider Experiments. Multivariate analysis in high energy physics

Particle Identification at LHCb. IML Workshop. April 10, 2018

High Pt Top Quark Mass Reconstruction in CMS

Recent ATLAS measurements in Higgs to diboson channels

Top production measurements using the ATLAS detector at the LHC

The search for standard model Higgs boson in ZH l + l b b channel at DØ

Top quark pair cross section measurements at the Tevatron experiments and ATLAS. Flera Rizatdinova (Oklahoma State University)

Highlights of top-quark measurements in hadronic nal states at ATLAS

Ernest Aguiló. York University

First studies towards top-quark pair. dilepton channel at s = 13 TeV with the CMS detector

Searches for BSM Physics in Rare B-Decays in ATLAS

Light Charged Higgs Discovery Potential of CMS in. A. Nikitenko, M. Hashemi, Imperial College, IPM & Sharif Univ. of Tech.

Measurement of t-channel single top quark production in pp collisions

Resolved Top Tagger in CMS

Top Quark Mass Reconstruction from High Pt Jets at LHC

Physics with Tau Lepton Final States in ATLAS. Felix Friedrich on behalf of the ATLAS Collaboration

Performance of muon and tau identification at ATLAS

George Bakas For the NTUA CMS Group

How to Measure Top Quark Mass with CMS Detector??? Ijaz Ahmed Comsats Institute of Information Technology, Islamabad

Evidence for tth production at ATLAS

Update on H ZZ 2l2ν Analysis

Dark Photon Search at BaBar

Multivariate Analysis for Setting the Limit with the ATLAS Detector

Maria Fiascaris (University of Oxford) ATLAS UK Meeting, Durham 10/01/08

Observation of Higgs boson decay to bottom quarks with CMS (and ATLAS) data

Georges Aad For the ATLAS and CMS Collaboration CPPM, Aix-Marseille Université, CNRS/IN2P3, Marseille, France

Search for high mass diphoton resonances at CMS

Boosted hadronic object identification using jet substructure in ATLAS Run-2

Flavour tagging developments for the LHCb experiment

Feasibility of a cross-section measurement for J/ψ->ee with the ATLAS detector

Higgs Boson at the CMS experiment

Improvements on the Higgs searches in the high mass region at CDF

Particle ID in ILD. Masakazu Kurata, KEK Calorimeter Workshop IAS program 01/19/2018

CMS Performance Note

Single Top t -Channel

Higgs searches in TauTau final LHC. Simone Gennai (CERN/INFN Milano-Bicocca) Mini-workshop Higgs Search at LHC Frascati, March 28th, 2012

Highlights of top quark measurements in hadronic final states at ATLAS

Recent Results on Top Physics at CMS

Single top: prospects at LHC

Early physics with Atlas at LHC

The Tevatron s Search for High Mass Higgs Bosons

Reconstruction of tau leptons and prospects for SUSY in ATLAS. Carolin Zendler University of Bonn for the ATLAS collaboration

Events with High P T Leptons and Missing P T and Anomalous Top at HERA

CEA-Saclay, IRFU/SPP, bât. 141, Gif sur Yvette Cedex, France On Behalf of the CDF and DO Collaborations.

First two sided limit on BR(B s μ + μ - ) Matthew Herndon, University of Wisconsin Madison SUSY M. Herndon, SUSY

Search for the very rare decays B s/d + - at LHCb

Search for Higgs to tt at CMS

Measurements of R K and R K* at LHCb

Search for tt(h bb) using the ATLAS detector at 8 TeV

Single-Top at the Tevatron

Double Parton Scattering in CMS. Deniz SUNAR CERCI Adiyaman University On behalf of the CMS Collaboration Low-x th June 2017 Bari, Italy

B-Tagging in ATLAS: expected performance and and its calibration in data

Recent results on search for new physics at BaBar

Measurements of BB Angular Correlations based on Secondary Vertex Reconstruction at sqrt(s) = 7 TeV in CMS

Jet substructure, top tagging & b-tagging at high pt Pierre-Antoine Delsart and Jeremy Andrea

Search for the Higgs Boson In HWW and HZZ final states with CMS detector. Dmytro Kovalskyi (UCSB)

Searches for exotica at LHCb

Top Physics in Hadron Collisions

Inclusive top pair production at Tevatron and LHC in electron/muon final states

Statistical Tools in Collider Experiments. Multivariate analysis in high energy physics

LFV in Tau Decays: Results and Prospects at the LHC. Tau th International Workshop on Tau Lepton Physics Beijing September 22th, 2016

Missing Transverse Energy. Performance with CMS

Tau Lepton Identification at the ATLAS Experiment

Observation of the rare B 0 s µ + µ decay

Search for Fermionic Higgs Boson Decays in pp Collisions at ATLAS and CMS

Physics at Tevatron. Koji Sato KEK Theory Meeting 2005 Particle Physics Phenomenology March 3, Contents

Track Reconstruction in Hadronic Tau Decays at the ATLAS Detector

Status of the LHCb experiment and minimum bias physics

Searches for electroweak production of supersymmetric gauginos and sleptons with the ATLAS detector

Jet Reconstruction and Energy Scale Determination in ATLAS

Tau Performance. ATLAS UK Physics Meeting. Alan Phillips University of Cambridge. ( on behalf of the Tau Working Group)

Prospects for b-tagging with ATLAS and tracking results with cosmics

The Compact Muon Solenoid Experiment. Conference Report. Mailing address: CMS CERN, CH-1211 GENEVA 23, Switzerland

Higgs search in WW * and ZZ *

Boosted Top Tagging with Deep Neural Networks

Measurement of Фs, ΔΓs and Lifetime in Bs J/ψ Φ at ATLAS and CMS

ATLAS Measurements of Boosted Objects

Hadronic Moments in Semileptonic B Decays from CDFII

Searches for BSM Physics in Events with Top Quarks (CMS)

Search for the SM Higgs boson in the decay channel H ZZ ( * ) 4l in ATLAS

Atlas Physics studies

Boosted Top Resonance Searches at CMS

Inclusive BB cross section at 8 TeV

October 4, :33 ws-rv9x6 Book Title main page 1. Chapter 1. Measurement of Minimum Bias Observables with ATLAS

Latest results on the SM Higgs boson in the WW decay channel using the ATLAS detector

Search for WZ lνbb at CDF: Proving ground of the Hunt for the

Search for a SM Higgs-like boson decaying into WW lvqq' using exclusive bins of jet multiplicity in pp collision at s = 8TeV

STANDARD MODEL AND HIGGS BOSON

Study of Diboson Physics with the ATLAS Detector at LHC

Benjamin Carron Mai 3rd, 2004

Standard Model Higgs Searches at the Tevatron (Low Mass : M H ~140 GeV/c 2 )

Transcription:

Tuning the simulated response of the CMS detector to b-jets using Machine learning algorithms ETH Zürich On behalf of the CMS Collaboration SPS Annual Meeting 2018 31st August 2018 EPFL, Lausanne 1

Motivation bjets semi-leptonic decays wider than light jets Neutrino escapes detection! reco pt gen pt Reconstruction of X bb difficult Energy correction (gen pt / reco pt) o/p of NN-based bjet energy regression* Uses 42 jet variables as inputs In general, MC variable distribution data variable distribution *see talk of N.Chernyavskaya Mis-modelling of detector. Mis-modelling of jet fragmentation. Good data/mc matching is required! 2

f(y) Direct Quantile Morphing Corrections by matching cumulative distributions! Requirements: 1. Continuous, differentiable, non-constant pdf 2. Pure control sample toy simulation data Disadvantages: 1. Correction not differential in kinematics and pile-up. p.d.f y Goal: Differential corrections! f(y) Integrate F(y) toy simulation corrected data simulation data 2 Inverse c.d.f 1 toy 3 c.d.f Match Quantiles y y 3

Quantile Regression Morphing Differential corrections by matching conditional cumulative distributions Direct Quantile Morphing : Match quantiles cdf Quantile Regression Morphing: conditional Match quantiles at fixed x cdf F(yi x) Corrected MC Corrected MC In practice: Discretize cdf estimate discrete quantiles linearly interpolate to get cdf Train grad-bdt to minimize quantile loss function...scikit-learn package 4

Dependent variables: Regression i/p variables Independent variables: Kinematics and pile-up Quantile levels: [19 levels] τ = [ 0.05, 0.10, 0.15,.,0.90, 0.95] Quantile Regression Morphing cdf(y) Direct quantile morphing F(y) toy At fixed x simulation data 2 1 3 c.d.f y y Correction 5

Limitations of Quantile Morphing: pdf should be continuous and differentiable! Solution: Stochastic Morphing Train a binary classifier and get prediction of probabilities of the MC event to be in peak and tail for MC and data. Move the MC event from peak to tail if peak Continuous tail Or move MC event from tail to peak if Then perform quantile morphing for the tail! 6

Limitations of Quantile Morphing: Pure sample of bjets required! Partial solution: bjets from leptonic eμ decay channel of ttbar. (To avoid non-bjets from hadronic decay) Did we avoid background?.. NO! Presence of ISR jets bjet ISR jet bjet Solution 2: A statistical tool named splot* *arxiv:physics/0402083 7

Basic idea of splot technique: Reweight data set in unbiased way such that signal-like events get higher weight than background-like events. No b-tagging variables, only kinematics BDT classifier response* TMVA (ROOT) package Discriminate between bjets and non-bjets based on their correlation with leptons Done for all pt ranked jets: (i.e. leading jet, sub-leading jet and other jets) 8

Compute sweights for both data and MC Parameters for sweights are obtained using Likelihood fit to BDT 9

Weight dataset by sweights to get signal distribution! Charged energy fraction in HCAL Neutral energy fraction in ECAL We compute splots of MC to study the bias produced by splot technique. 10

Results (1) Charged energy fraction in HCAL Differential in η? Inter-quartile range Median 11

Results (2) Stochastic corrections: Differential in pt? Secondary vertex mass Inter-quartile range Median 12

Results (3) Effect on jet energy corrections wrt to jet pt due to data/mc corrections for NN regression is <2% for 1σ deviation Effect on resolution is <2%. 1σ 1σ For mean, the effect is ~ 0.1% Regression does not introduce data/mc discrepancies on energy scale. *Upto the current status of analysis. 13

Summary Motivation: Match data/mc bjet distribution for better reconstruction of X bb. Differential correction: Quantile Regression Morphing and Stochastic Morphing bjet sample used: ttbar leptonic decay (with ISR background) Pure sample of bjet: splot technique The corrections derived are jet-by-jet and largely analysis independent! splot Technique Quantile Regression Morphing Jet electromagnetic energy fraction in ring ΔR = 0.4-0.3 14

Back-up 15

Training Stage: Train a grad-boosted decision tree to minimize the quantile loss function to estimate conditional quantile function for data and MC for each quantile value τ. From sklearn.ensemble import GradientBoostingRegressor 16

For each MC variable yi to be corrected 1. Compute and 2. Run binary search of yi on cdf(y) Application Stage: Differential corrections At fixed x. and. 3. Use linear interpolation to estimate cdf of data and MC. 4. Match corresponding quantiles on both cdfs to get corrections. y Correction 17

Compute sweights for both data and MC Obtained through likelihood fit Signal pdf Background pdf MC Truth info In practice: RooStats::Splot Class Likelihood fit sweights Fit over data We compute sweights for MC as well so as to study the bias produced by the splot technique 18

Limitations of splot technique: BDT should be uncorrelated with variables to be unfolded! Since the discriminatory variable BDT depends on kinematic variables especially pt We have pt binned the sample to make BDT conditionally independent of jet pt 30-40, 40-50, 50-60, 60-70, 70-80, 80-100, 100-120, 120+ GeV Further, in our analysis, events with only positive sweights are taken. For minimizing the loss function quantile regression 19

Back-Up: Migrating events from peak to tail 20

Back-up: Migrating events from tail to peak 21

Back-up: Correlations while migrating events from peak to tail Remove the one which has to be corrected! 22

Back-up: Correlation plots for secondary vertex variables MC Corrected MC data MC - data Corrected MC - data Jet pt: 65-110 GeV 23