Search for a SM Higgs-like boson decaying into WW lvqq' using exclusive bins of jet multiplicity in pp collision at s = 8TeV PhD Student: Supervisor: Alessio Ghezzi Seminario di dottorato 25/09/2014
Introduction & Motivation Search for a SM Higgs-state decaying into WW lvqq' in the high mass region (600 GeV 1 TeV) Why? - Indirect test of the particle discovered @ 125 GeV: is it the SM Higgs or not? - Search for new resonances (looking for a bump in the WW invariant mass spectrum) WW final state: Highest BR at high mass Interesting topology future analysis of WW scattering Semi-leptonic channel: high σ x BR at high mass 2
Topology Two main production mechanisms: gluon fusion (ggh) vector boson fusion (VBF) very clear signature W W lv: lv: High HighpT ptlepton lepton++high highmet MET W W qq': qq': High Highboost boost quarks quarksreconstructed reconstructedas asaa merged merged jet jet Signature: Signature: WW WW {l {l + + MET} MET} + + Merged Merged Jet Jet (+2 (+2 tag tag jets) jets) Strategy: Look for a bump in the WW invariant mass spectrum (pz of the neutrino reconstructed imposing mlv = mw) 3
Categories H WW lvj(jj) restarted after previous inclusive analysis (CMS- HIG-13-008) Goal: split the analysis in categories of jet multiplicities in order to have a dedicated VBF search and improve the sensitivity by means of combination CMS-HIG-13-008 CMS-HIG-13-008 Two categories: - 0-1 jet bin: maximum one jet (pt>30 GeV) in addition to the merged jet - 2-jet bin: at least two jets (pt>30 GeV) in addition to the merged jet (dedicated VBF search) 4
Main Backgrounds W+jets: The dominant bkg in the 0-1 jet bin category shape and normalization exctracted from data Single top, VV: Small contribution, from MC TTbar: ~ Same contribution of W+jets in the 2 jet bin category From MC, but normalization corrected from data 5
Event Selection Leptons: pt μ (e) > 30 (35) GeV; veto presence of 2nd μ or e with pt>10 (20) GeV Missing transverse energy: MET μ (e) > 50 (70) GeV Boosted Regime: ptjet and ptlv > 200 GeV ΔRlep,jet>π/2 Topological back-to-back angular cuts: ΔφETmiss, jet>2.0 ΔφWlep,jet>2.0 Selection on the merged jet: Jet substructure techniques to identify single jets containing decay products of hadronic W 1) Grooming techniques: remove soft radiation and pile-up from the jet (e.g: pruning ) Pruned jet mass [65,105] GeV signal region ( Pruned jet mass [40,65] [105,130] GeV sideband ) 2) N-subjettiness: τ2/τ1 < 0.5 (more details in next talk) 6
MC-Data Comparison at preselection level: 0-1 jet cat. 7
MC-Data Comparison at preselection level: 2 jet cat. 8
2 jet cat: Additional Selection VBF jets- two powerful observables to discriminate signal vs. background: Mjj, Δηjj VBF selection: Δηjj>3.0 && Mjj>250 GeV (signal eff: ~80%) To reduce ttbar bkg: - Veto presence of b-jets - Leptonic top Mass Veto: Veto the invariant mass of the leptonic W and the closest jet (in ΔR) Mleptop > 200 GeV - Hadronic top Mass Veto: Veto the invariant mass of the hadronic W and the closest jet (in ΔR) Mhadtop > 200 GeV 9
Background extraction
t-tbar estimation (2 jet cat.) 2 jet cat: the relative fraction of the background from ttbar is much larger than in the 0-1 jet cat Step 1: Isolate a top-enriched sample after applying b-tag Check the validity of the MC Step 2: If 1) ok, take an overall normalization scale factor from the top-enriched region to the signal one Six different control regions with different levels of selections, approaching the signal B-tag = At least one b-tag jet region, have been selected: D) B-tag + Top mass tag A) B-tag C) B-tag + Top mass veto F) B-tag + VBF cuts + top mass tag B) B-tag + VBF cuts E) B-tag + VBF cuts + top mass veto 11
t-tbar estimation (2 jet cat.) good agreement between data/mc at every level of cuts (plots of other regions in backup) (A) B-tag only (C) B-tag + top mass veto the use of the MC for the ttbar background is validated (B) B-tag + VBF cuts Chi²/d.o.f. are good (~1) 12
W+jets Bkg: Normalization Procedure Fit on data in the sideband Ttbar, VV, Single top: Normalization and shape in Mj fixed from MC extraction of to the signal region side band 0-1 jet bin W+jets normalization eg io n si gn al r si de ba n d W+jets normalization Extrapolation of 2 jet bin 13
Mww shape: W+jets Estimation (1) Procedure: 1) Extraction of the sideband shape: fit on data, subtracting the other backgrounds (fixed after fit on MC) 14
Mww shape: W+jets Estimation (2) 0-1 jet cat 2 jet cat Procedure: 2) Extrapolation factor: α= F W+Jets ( SignalRegion ) F W+Jets ( Sideband ) MC MC 15
Mww shape: W+jets Estimation (3) X = Procedure: 3) Extrapolation to the signal region: FW+jets (signal region) = α FW+jets (sideband) 16
Final Mww shape in signal region 0-1 jet bin (muons) 0-1 jet bin (electrons) 2 jet bin (electrons + muons) 17
Signal Modelling - ggh : Crystal-Ball shape - VBF : double CB+exp shape (able to account for large interference effects) MH = 600 GeV ggh VBF ggh VBF MH = 800 GeV 18
Systematics Theoretical Uncertainties: Experimental Uncertainties: QCD scale and PDF uncertainties for the production of the SM Higgs Luminosity: 2.6% Interference between signal and SM bkg: - ggh : 10% - vbfh : 10 % Jet bin uncertainty Cross section uncertainties for minor backgrounds: - VV: 20% - Single top: 30 % Single lepton trigger, scale, reco efficiency: ~2% B-tagging efficiency: 2-3 % Jet energy scale: jet energy varied up/down effect on the normalization (~5-10%) Jet resolution: jet energy resolution varied up/down effect on the normalization (~5-10%) tt uncertainty: uncertainty on the SF calculated in the b-veto region (~20%) W-tagging SF uncertainty (9%) W+jets normalization: from Mj fit (~30%) 19
Results
Standard Model: Results Old analysis (CMS-HIG-13-008) New analysis (CMS-HIG-14-008) Improvement especially at high masses, where the VBF contribution is more important No significant excesses observed: exclude at 1.1 (3.3) times the SM Higgs cross section for a mass of 600 (1000) GeV 21
Standard Model: Combined p-value Local excesses of 2.64 and 2.56 σ observed at 700 and 800 GeV 22
BSM Interpretation Search for an electroweak singlet scalar mixed with the 125 GeV Higgs boson Unitarization: c² + c'² = 1 where c (c') is the coupling of the low (high) mass state Signal strenght and width of the high mass state: ' μ =C '2 ( 1 BRnew ) μ SM '2 C Γ= Γ SM ( 1 BR new ) ' (BRnew = BR of the high-mass resonance to non-sm like decay modes) 23
BSM Interpretation: Results Observed limits at the 95% CL in the [C'², BRnew] plane for different mh Solid (dashed) line: observed exclusion limits corresponding to 3 (4) times σth MH = 600 GeV MH = 700 GeV MH = 800 GeV NB: regions where Γ > ΓSM are not considered in the analysis (white areas) 24
Conclusions Search for a SM-like Higgs state decaying into WW lvj(jj) in the mass range [600-1000] GeV has been performed Jet substructure techniques have been used in the analysis important for searches in the high mass region in the future No significant excesses have been observed Expected sensitivity to exclude a SM Higgs-like boson of 1.1 (3.3) times the SM cross section at 600 (1000) GeV Local excesses of 2.64 and 2.56 σ are observed at 700 and 800 GeV Electroweak singlet analysis has been performed, setting upper limits on the signal strenght Results to be included in the high mass combination paper CMS-HIG-13-031 (ongoing) 25
Backup
Documentation PAS: HIG-14-008 (CADI LINK) Hypernews: (LINK)) Analysis Note: AN-13-414 (LINK) Review (Twiki):(LINK) Pre-approval talk:(link)) luca Brianza Luca 27
Selection on the merged jet JME-13-006 JME-13-006 Grooming techniques: remove soft radiation and pile-up from the jet (e.g: pruning ) Pruned jet mass [65,105] GeV signal region ( Pruned jet mass [40,65] [105,130] GeV sideband ) JME-13-006 JME-13-006 N-subjettiness τn : τn tends to be zero when the jet is consistent with N subjets N-subjettiness τ2/τ1 < 0.5 28
2-jet bin: additional selection VBF jets: two powerful observables to discriminate signal vs. background Δηjj, Mjj In the TMVA optimization, VBF = signal VBF selection: (ggh is considered as background) Working point: Δηjj>3.0 && Mjj>250 GeV (signal eff: ~80%) Not the optimal choice, but: Can recover significance from ggh events Tighter cuts lack of statistics in the sideband (Need it for the data-driven method) 29
Signal and background yields 0-1 jet bin (mu): process process ggh600 ggh600 vbfh600 vbfh600 WJets WJets TTbar TTbar Rate Rate 119.56 119.56 18.69 18.69 1428.08 1428.08 STop STop VV VV 331.91 331.91 84.41 84.41 235.30 235.30 0-1 jet bin (el): process process ggh600 ggh600 vbfh600 vbfh600 WJets WJets TTbar TTbar Rate Rate 103.09 103.09 16.39 16.39 STop STop VV VV 1398.98 1398.98 276.32 276.32 61.43 61.43 216.19 216.19 (expectation after all analysis cuts, Mww [550,1500] GeV) 2 jet bin: process process rate rate ggh600 ggh600 vbfh600 vbfh600 11.35 22.96 11.35 22.96 WJets WJets TTbar TTbar STop STop VV VV WW WWewk ewk 38.50 43.88 6.54 5.92 5.38 38.50 43.88 6.54 5.92 5.38 30
TTbar estimation (2 jet bin) Scale Factor to normalize Ttbar on data: ~ independent from the cut Chi²/d.o.f. are good (~1) the use of the MC for the ttbar background is validated luca Brianza Luca 31
Standard Model: Combined Limit PAS PAS PAS PAS No significant excesses observed: exclude at 1.1 (3.3) times the SM Higgs cross section for a mass of 600 (1000) GeV luca Brianza Luca 32
BSM interpretation: results Observed (expected) limits at the 95% CL in the [mh, C'²] plane for different BRnew Solid (dashed) line: observed (expected) exclusion limits corresponding to 3 times σth PAS PAS luca Brianza Luca 33
Systematics: background PAS PAS luca Brianza Luca 34
Systematics: signal PAS PAS luca Brianza Luca 35
Interference corrections ggh : included corrections from a previous calculation (see here, here) VBF: no existing calculation for interference effect between qq H WW and qq WW explicit calculation of this effect: 1) Generate three new samples (with Phantom), e.g mh=800: - (S+B+I)800 (coupling c'²) -B - (S+B+I)125 (coupling 1-c'²) 2) Calculation of the reweighting factor: (S+I)800 = (S+B+I)800 B (S+I)125 = (S+B+I)125 B Definition: - S800: heavy-higgs state - I800: interference between S800 and qq WW - B: background (qq WW) - S125: low-mass state negligible, ~0 - I125: interference between S125 and qq WW S(corr) = (S+I)800 + (S+I)125 = S800 + I800 + I125 (S125 ~0) Reweighting factor for the signal: w= S800 +I 800 +I 125 S 800 36
Calculation of the interference (e.g: MH = 800 GeV) (S+I)800 -(I)125 (S) / + = w= S800 +I 800 +I 125 S 800 Set of interference reweights derived for all C'², BRnew 37
Bias study in Mww (signal region) Using the datacards generated with two models Exp, Pow and the combination tool Generate toys with Pow model (the generation is done leaving to float only the normalizations, while the systematic uncertainties are fixed to their central values) Fit toys with the Exp model for backgrounds (B-only or S+B) The pull is calculated as: i i μ μ true pull= fit μunc In the following plots, the mean of the signal strength pull is reported with its uncertainty Different cases analyzed: with and without signal injection 38
Bias Study Results 0-1 jet bin 2 jet bin 39
Impact of the bkg model choice in Mj fit TO+Exp Pow Law Rat Three models considered in the study: Turn-on+Exp Turn-on+Pow Ratio of power laws Generate 1000 toys according to the three models Fit the datasets, and check the biases when crossing functions between generation and fitting model are used choose the fitting function with the least bias wrt the others choose the largest as additional uncertainty 40
Impact of the bkg model choice in Mj fit Results (2 jet bin): Check in the 0-1 jet bin: Results (0-1 jet bin): Best choice: T.O + Exp Gen func Fit func Bias ErfExp ErfExp 2% ErfPow ErfExp -9% User1 ErfExp 5% additional uncertainty in the W+jets normalization 41
Hadronic W matching study Hadronic W = CA8 Jet with highest Pt Is it a good choice? Study matching between CA8 reco jets and hadronic W Criteria of matching: - Matching in ΔR : - Matching in Pt: ΔR CA8,Whad <0. 3 ΔPt CA8,Whad <0. 25 Pt Whad Matching of the hardest CA8 jet Matching of the 2nd or 3th jet, no matching for the hardest No matching Events with only 1 jet 95.0 % - 5.0 % Events with only 2 jets 86.1 % 7.1 % 6.8 % Events with only 3 74.0 % 9.5 % 11.2 % jets In the >70-80 % of cases the hardest CA8 jet coincides with the hadronic W 42
VBF jets Study of S/B ratio in the region 65Gev < Mj < 105 Gev after simple VBF cuts nine differents cuts, varying Δη and Mjj Example: Δη > 2.5 and Mjj > 250 GeV Signal Signal T W+Jets All Bkg (qqh) (ggh) (Top+tt) S/ Top S/ W+Jets S/B (all bkg) Max pt 5.7 5.3 274.8 159.1 442.2 0.040 0.069 0.025 Max Δη 5.9 7.0 548.0 225.6 785.0 0.024 0.057 0.016 Max Mjj 5.9 7.0 538.7 221.8 771.6 0.024 0.058 0.017 (Sample used: MH=800 mu) The choice of jet pair with max pt leads to the best S/B ratio 43
W+jets: exclusive sample sampl N jet (exclusive W+jets) N jet 44
Jet substructure: grooming Problem: Jet mass is widely affected by soft QCD and pile-up contribution bad signal / background discrimination Solution: Jet grooming remove soft radiation and pile-up from the jet Different techniques: filtering, trimming, pruning 45
Jet substructure: N-subjettiness Look 'inside' the merged jet : it is possible to identify two subjets? W jet N-subjettiness τn τn tends to be zero when the jet is consistent with N subjets QCD jet Using the τ2 / τ1 ratio: τ2 / τ1 ~ 0 : Jet consistent with 2 subjets (W jet) τ2 / τ1 ~ 1 : Jet consistent with 1 subjet (QCD jet) 46
N-subjettiness: optimization EXO 12-021 47
W-tagger Scale Fac 48
TTbar estimation (2 jet bin) (D) B-tag only + top mass tag (F) B-tag + VBF cuts + top mass tag 49
Functional forms used in the Mj fit W+Jets: T-tbar: 1 +Erf ( ( x a ) / b ) 2 F ErfExp ( x ) =e cx x m ( F ErfExp2Gaus ( x ) =e c0 x x m ( VV: F 4Gaus ( x ) =e 1 2σ 2 1 x m ( +e 2σ 2 2 2 )2 x m ( +e 2σ 2 1 3 )2 x m ( S.top: F ErfExpGaus sp x +e 1+Erf ( ( x a ) / b ) +c1 e 2 ( 4 2 )2 2σ 2 2 )2 2σ 2 2 ( 0 x m +c 2 e x m ( x ) =e c )2 2σ 2 1 1+Erf ( ( x a ) / b ) +c1 e 2 )2 1 1 )2 2σ 2 1 50
Mj fit W+Jet s V V TTba r S.top 51
Functional model in Mww start from simple functions and increase the number of degrees of freedom: exponentials power laws polynomials Chebychev base with 2, 3 or 4 free parameters check at what level the fit is good enough (F-test recommended from combination): RSSi residual sum of squares N number of data point problematic tails
Mww fit TTba r S.top V V 53
MlvJ shape in the sideband 0-1 jet bin (muons) 0-1 jet bin (electrons) luca Brianza Luca 2 jet bin (electrons + muons) 54
Bias study in Mww (signal region) Autobias : Using the datacards and the combination tool Generate toys with Exp model (the generation is done leaving to float only the normalizations, while the systematic uncertainties are fixed to their central values) Fit toys with the Exp model (B-only or S+B) For each bkg component, the normalization pull is calculated as: In the follwing plots, the mean of the pull for every bkg (and also mu) is reported with its uncertainty Different cases analyzed, with and without signal injection 55
Bias Study Autobias SM: (example: 0-1 jet bin) Signal mu Total bkg W+jets Ttbar Single top VV WW EWK Bias is always under control 56
Pvalue 57
Impact of the systematics on the exp limit Legend: - Black line: Limit with all the nuisances+shape syst - Red line: Limit with no syst. - Blue line: Limit with only shape syst - Empty dot: Limit with shape syst + only this nuisance - Full dot: Limit with all the systematics except this one luca Brianza Luca 58
Pull plots of the post-fit parameters (on 1000 toys) 0-1 jet bin pull= nuis postfit nuis prefit σ prefit Yellow band = σ prefit Dot bars = mean of the post-fit uncertainty for that parameters 2 jet bin No large variations of the parameters from their pre-fit values luca Brianza Luca 59
Interference: Example of extrapolation (mh=650) Corrected Signal scan vs. C'² (BRnew=0) Corrected Signal scan vs. Brnew (c'²=0.5 C'²=0.5, BRnew=0 (difficult to see, but they match..) Scaling in C'², BRnew is still preserved 60