Single Top Quark Production at CDF Jeannine Wagner Kuhr on behalf of the Collabo ration o Universität Karlsruhe ar n i m e s N CER 8 0 0 2..4 2 2 1 Photo: Reidar Hahn
Overview Tevatron and CDF experiment Motivation Ingredients for single top analyses Multivariate analyses Statistical analysis Results from CDF and comparison with results from DO 2
Tevatron Accelerator Delivered luminosity 3.8 fb 1 (3.1 fb 1 on tape) Record luminosity: ~3.2 1032cm 2s 1 Single top quark analyses: Lint = (1.9 2.2) fb 1 3
CDF Detector Multi purpose detector Muon chamber p Silicon detector Very good momentum reso lution Precise vertex reconstruction p y z x Drift chamber (COT) calorimeter 1.4 T solenoid magnet Installation of the silicon detector 4
Why are Top Quarks Interesting? The top quark is very heavy Why? Mt >> Mb > Mc >> Ms, mt 173 GeV/c2 173 Mt of same order as EW symmetry breaking scale special top dynamics? Top mass allows together with W mass for predictions on Higgs mass Top quark decays before it hadronizes No top hadrons, no spectroscopy But: quasi free particle (spin measurement possible) 5 orders of magnitude between quark masses 5
Knowledge about Top Quark 1995: Discovery of top Rel. top mass uncertainty 0.9% pair production (Tevatron combination) Strong interaction (QCD) Tevatron: 85 % 15 % SM NLO prediction: σ = 6.7 ± 0.8 pb M. Cacciari et al., JHEP 0404, 068 (2004) Well established prod. mechanism Measurement of top mass and other properties So far, no hint for a non SM top quark 6
Top Quark EW Production Production of single top quarks: Electroweak interaction This talk: Evidence since 2007 (DO) Update of CDF analyses 7
Single Top Quark Production t channel s channel LHC: σs = (11 ± 1) pb σt = (247 ± 10) pb σa = (56 ± 6) pb SM NLO predictions (Tevatron): σs = (0.88 ± 0.11) pb σt = (1.98 ± 0.25) pb B.W. Harris et al., Phys. Rev. D 66, 054024 (2002), Z. Sullivan, Phys. Rev. D 70, 114012 (2004), N. Kidonakis, Phys. Rev. D74, 114012 (2006) associated production negligible (σa ~ 0.3 pb) σs+t = 2.9 ± 0.4 pb σs+t 0.4 σtop pair 8
Single Top Quarks Why Interesting? Production of single top quarks: Test of SM prediction σ 2 V single top tb measurement of Vtb Test of b quark structure function single top t channel (DGLAP evolution) Search for non SM contributions (W' or H+) Technical motivation WH, H bb (Same final state, σwh 0.1 σsingle top ) 9
CKM Matrix Element Vtb Cabbibo Kobayashi Maskawa matrix: Rotation of down type quark mass eigenstates into weak eigenstates directly measured ratio now well measured from Bs mixing Direct measurement only indirectly only via single top known quark production Are unitarity relations valid? e.g.: Hints for the existence of a 4th generation? 10
1.25 σt (pb) Single Top Quarks Predictions Single top quark production: s and t channel rates are differently sensitive to new physics σs (pb) T. Tait and C. P. Yuan, Phys.Rev.D63:014018 (2001) 11
Analysis Roadmap Signal Model Detector Acceptance Event Selection Multivariate Matrix Element Boosted Decision Tree Extraction of Background Model Analyses Neural Networks Likelihood Function Result Ensemble cross section tests (template fit) b tagging 12
Signal Model t channel s channel LO diagrams Most important NLO diagram W b fusion Important features: Polarization of top quarks, W g fusion process t channel W g fusion Modeling: MADEVENT + PYTHIA for showering W b and W g fusion processes generated separately and matched in b pt to match ZTOP NLO calc. ZTOP: PRD66,054024 (2002); MADEVENT: JHEP 0709:028 (2007) 13
Event Topology s channel W b fusion t channel W g fusion Leptonic W decay Signature: Exactly one isolated high pt e or μ, missing ET, 2 jets, 1b jet 14
Background Processes Apply a Z veto Njets: 2 or 3 QCD Apply a QCD veto W+LF (mistags) tt Z/Dib Wbb Pie chart: 2 jets bkg Wc Wcc Require at least one jet with a secondary vertex 15
Event Selection 1 Lepton (e or μ) High pt lepton triggers, MET MET+jets trigger (ME, BDT analyses) pt > 20 GeV/c, η < 2.0 Missing E T μ (MET) b jet 2 MET > 25 GeV b jet 1 2 or 3 jets (hadron level) ET > 20 GeV, η < 2.8 At least one b tagged jet secondary vertex tag Z veto and QCD veto W+2jets W+3jets s channel 42 ± 6 14 ± 2 t channel 62 ± 9 18 ± 3 Total pred. 103 ± 15 22 ± 5 16
Acceptance Gain for Muons Standard CDF muon coverage Extended Muon coverage + MET+jet trigger muon φ (rad) triggers ηη η Gain from missing ET+jets triggered events: ~30% gain in muon acceptance ~12% gain in total signal acceptance (with electrons) 17
b Tagging at CDF Silicon detectors Decay length Lxy : Impact para meter d0 resolution Lxy for high pt tracks ~18 µm τβ 1.5 ps cτ 450 µm Typical Lxy in CDF: O(mm) Tight b tag: Lxy / σ(lxy) > 7.5 ε(b) 45% ε(c) 9% ε(q) 1% 18
B Tagging Tool Flavor Separator Identified sec. vertex tags have a significant charm and mistag contamination Combination of jet and track variables in NN: Vertex mass, decay length, track multiplicity,.. powerful discriminant Mistags / W+charm... beauty Gain in sensitivity for single top analyses: 15 20 % Very powerful at separating out residual mistags and charm from b tagged sample 19
Background Estimation Strategy Diboson, Z+jets and W+HF jets: top pair production: W+jets normalization Determined with MC from data, HF fraction from ALPGEN MC (calibrated in W+1jet events with b tag) Mistags (W+LF jets) and QCD: Determined from data 20
QCD Background Signal Region MC models of inclusive jet+met production not precise enough Apply QCD veto to suppress QCD bkg cuts on MET, MT,W, ϕ(jet, MET), ϕ(l,jet) No b tag Events Electrons QCD W+jets Use data samples to model kinematic: Anti Electrons : Electron candidates which pass all but two of the selection requirements Electrons Jet Electrons : Multijet events where one jet fakes the electron Perform fits to MET distribution Uncertainty on QCD rate: ± 40% Shape uncertainty on HF content QCD W+jets b tagged 21
Background Estimation W+LF jets ALPGEN W+LF used to predict kine Negative tag: Lxy / σ(lxy) < 7.5 matic distributions Determine tagging rate for W+LF in inclusive multijet events using events with negative Lxy: Parametrize rate of negatively tagged events as a function of ntrks,jet, ET,jet, ηjet, nprimary vertices, ET(event), zprimary Apply asymmetry correction to obtain positive tagging rate Apply tagging rates to W+LF MC 22
W+jets: Heavy Flavor Calibration Use ALPGEN predictions of kinematic distribution but mistrust ALPGEN rates events Use W+1 jet b tagged data control sample to estimate the HF fractions Three parameter fit to bottom/charm/light tem KIT flavor separator output plates of KIT flavor sepa Apply correction factor rator distribution for c and b: Cross check: displaced vertex mass KHF = 1.4 ± 0.4 23
Candidate Events W+2jets Uncertainty of back ground estimation larger than signal } Total pred. bkg pred. single top Total pred. Observation fnon b-bkg 50 % S:B ratio: W+2jets: 1 : 14.5 W+3jets: 1 : 34 (dominant bkg: top pair prod., 45 %) W+3jets 1492 ± 269 755 ± 91 103 ± 15 22 ± 5 1595 ± 269 777 ± 91 1535 712 Counting experiment impossible Multivariate analyses W+2 jets: fnon b-bkg 50 % Use KIT flavor separator in multivariate analyses 24
Search Strategy Full selected data set split in subsets of different purity W + 2 jets 1 tag W + 3 jets 2 tags 1 tag 2 tags Multivariate methods Matrix elements Boosted decision tree Combined search t + s channel = one single top signal σ ratio is fixed to SM value important for discovery Neural networks Likelihood function Separate search σ ratio is not fixed to SM value Sensitive to new physics processes Old LF (1.5/fb), NN (1.0/fb) results; New s channel meas. (optimized sel., LF) 25
Matrix Element Analysis Idea: Compute event probability for signal and background hypotheses x: lepton and jet 4 vectors LO differential cross section PDF's Use full kinematic information of event, integrate over unknown or poorly W+2jets, measured quantities Calculate probability densities P for s and t channel and main backgrounds Nb tags 1 Background (Wbb, Wcc, Wc+jets, top pair) Discriminant: Transfer function: Parton (y) Rec. Object (x) Signal KIT flavor sepa rator output 26
Boosted Decision Tree Idea: Effective extension of a cut based analysis Use many input variables (20) Non discriminating variables are xi>c1 automatically ignored, but don't degrade the performance Optimize series of binary cuts xj>c2 with training sample xj>c3 Calculate for each leaf purity p = s/(s+b) Sort events by output purity Create series of boosted trees purity by reweighting based on value of misclassification 27
Neural Network Analysis Idea: Combine many variables into one more powerful discriminant Network: NeuroBayes (3 layered feed forward net combined with automated preprocessing of input variables) Exploits correlation between variables Realistic mixture of backgrounds Use 11 18 variables: Mlνb, Q η, KIT flavor separator,... 28
Likelihood Function Idea: Combine many variables into one more powerful discriminant HT Binned likelihood function (LEP technique) Correlations between variables not taken advantage of Use 7 10 variables Norm. values Bkg: Wbb,top pair, Wcc/Wc, mistags 29
Separation Variable Mlνb NN analysis C.P. Yuan Phys. Rev. D41, 42 (1990) No experimental resolution effects taken into account In reality signal peak degraded by: MET and jet resolution wrong jet assignment wrong p choice z,ν 30
Separation Variable Q η proton t channel (2 u valence p direction proton t channel quarks) anti proton Likelihood function analysis anti proton anti p direction (1 d valence quark) NN analysis 31
Validation of MVA Neural Networks Pretag W+2jets, t channel NN W + 1 Jet, b tagged, W + 3 Jets, b tagged, top pair NN σtop pair 7.5 ± 0.8 (stat.) pb fit value in good Wbb NN agreement with CDF average 32
Statistical Analysis Cross section: Bayesian treatment (flat prior in σs+t) Binned likelihood fit including all rate Significance: Definition of p value Significance: median and shape syst. uncertainties Modified frequentist approach Exp. Perform pseudo experiments (PE) with p value and without SM single top 2ln(Q) = Fluctuate all syst. rate and shape uncertain ties in PE Binned likelihood fit for each PE (fit W+LF and W+HF norm. parameter) 2ln(Q), Q = Lreduced( SM σs+t)/lreduced(σs+t=0) p value defined via likelihood ratio is the most powerful cri teria to distinguish 2 hypo theses, Neyman Pearson Lemma 33
Systematic Uncertainties Syst. Uncertainty Rate Jet Energy Scale Initial state radiation Final state radiation Parton Distribution Function MC Generator Event Detection Efficiency Luminosity NN Flavor Separator Mistag model Q2 scale in ALPGEN MC Input variable mismodeling 0...16% 0...11% 0...15% 2...3% 1...5% 0...9% 6% Wbb+Wcc normalization Wc normalization Mistag normalization Top pair normalization & mtop 30% 30% 17...29% 23% Shape Different uncertainties for each subset Shape uncertainty example Jet energy scale, single top 34
Results of CDF CDF 35
Measured Cross Sections Matrix element method s t =2.2 0.8 0.7 pb Neural networks s t =2.0 0.9 0.8 pb Boosted decision tree 0.8 s t =1.9 0.7 pb Likelihood function s t =1.8 0.9 0.8 pb SM prediction: σs+t 2.9 ± 0.4 pb Analyses are consistent with each other 36
Significance Matrix element method Neural networks Likelihood function Exp. p value: 0.0003% (4.5σ ) Exp. p value: 0.0005% (4.4σ ) Exp. p value: 0.03% (3.4σ ) Obs. p value: 0.03% (3.4 σ) Obs. p value: 0.06% (3.2 σ) Obs. p value: 2.5% (2.0 σ) Boosted decision tree Exp. p value: 0.0002% (4.6σ ) Obs. p value: 0.28% (2.8 σ) 37
Combination Super Discriminant Combine ME, NN, LF discriminants as inputs to a neural net (NEAT) Neuro Evolution of Augmenting Topologies: designed to optimize the discovery significance Exp. p value: 0.00002% (5.1σ ) Obs. p value: 0.0094% (3.7 σ) Cross section: 0.7 0.7 s t =2.2 pb SM prediction: σs+t 2.9 ± 0.4 pb σs+t lower than SM prediction but consistent with SM prediction 38
Combination Checks Original BLUE: Use BestLinearUnbiasedEstimator to estimate: L. Lyons, NIM A270, 110 (1988) Compatibility between three analyses and compatibility with SM Correlation between single analyses Correlation Compatibility: Perform PEs generated assuming a certain Determine χ2 of combination for each PE ME single top cross section LF ME: 59% Compatibility = fraction of PE with a larger χ2 than in data Discrepancy Self consis Compatibility tency: 87 % with SM: 15 % (σs+t = 2 pb) (σs+t = 2.9 pb) 1.1 σ LF 61% (ME NN), 74% (LF NN) 39
Vtb Measurement Assuming SM production: Pure V A and CP conserving interaction Vtb 2 Vtd 2 + Vts 2 or B(t Wb) ~ 100% Super discriminant: Vtb = 0.88 ± 0.14 (exp.) ± 0.07 (theory) Vtb > 0.66 (95% C.L.) Measurement consistent with SM prediction, Vtb 1 40
Separate s Channel Search No constraint of SM cross section ratio Use double b tagged events with 2 secondary vertices 1 sec. vertex and a loose b tag (calc. from d0 of tracks assigned to jet) No QCD veto applied (QCD bkg small, acceptance gain) Use only W+2jet events Use Likelihood function as multi variate analysis s channel: (t channel fixed s =0.6 0.9 0.6 pb to SM, treated as bkg) Measured σs consistent with SM prediction, σs = (0.88 ± 0.11) pb 41
Separate s and t Channel Search Measure both channels simultaneously (no constraint on the cross section ratio) Likelihood function Neural networks New results coming soon s-channel SM prediction t-channel : lower than SM prediction 42
Summary of CDF Results Combined search: 0.7 Cross section: s t =2.2 0.7 pb Exp. p value: 5.1σ Obs.: 3.7 σ Vtb = 0.88 ± 0.16 Separate search: σs SM prediction σt : lower than SM prediction We are here 43
Results from DO DO 44
Measured Cross Sections Measured using a Bayesian binned likelihood calculation Matrix element method 1.6 1.4 s t =4.8 pb Bayesian neural network 1.6 1.4 s t =4.4 pb Boosted decision tree 1.4 1.4 s t =4.9 pb PRL 98, 18102 (2007), PRD article available at arxiv: 0803.0739 [hep ex] SM prediction: σs+t 2.9 ± 0.4 pb Measurements are con sistent with each other 45
Significance Determined using ensemble tests without signal contribution Matrix element method Expected p value: Bayesian neural network Expected p value: 3% (1.9 ) 1.6% (2.2 ) Observed p value: Observed p value: 0.081% (3.2 ) 0.08% (3.1 ) Boosted decision tree Expected p value: 1.9% (2.1 ) Observed p value: 0.035% (3.4 ) 46
Combination and Vtb Combination: Combine results with BLUE Correlations: 59% (ME BNN) 66% (BNN BDT) Cross section: s t =4.7±1.3 pb Exp. p value: 2.3, Obs: 3.6 Boosted decision tree analysis used to determine Vtb : 0.25 0.21 V tb =1.31 σs+t higher than SM prediction but consistent with SM prediction pb Vtb consistent with SM pred. 47
Separate Search No constraint of SM cross section ratio Boosted decision tree analysis Measure only one channel and assume SM value for non measured channel (treated as CDF: background) NN, 1.0 fb 1 LF, 1.5 fb 1 s =1.0 0.9 0.9 pb t =4.2 1.8 1.4 pb Measure both channels simultaneously s =0.9 pb t =3.8 pb s-channel SM prediction t-channel 2 SM prediction 48
Summary Evidence for single top quark production Cross section: s t =2.2 0.7 0.7 pb Exp. p value: 5.1σ Obs.: 3.7 σ σs+t = 4.7 ± 1.3 pb DO: Exp. p value: 2.3σ Obs.: 3.6 σ?? Vtb CKM matrix element: Vtb = 0.88 ± 0.14 DO: Vtb = 1.31 +0.25 0.21 LHC?? Separate search: σs SM prediction σt lower than SM prediction DO: σt 2 SM prediction 49