GAN Applications in High Energy Particle Physics

Similar documents
Learning Particle Physics by Example:

Quark/Gluon Discrimination with Jet-Images and Deep Learning

Image Processing, Computer Vision, and Deep Learning: new approaches to the analysis and physics interpretation of LHC events

Boosted Top Tagging with Deep Neural Networks

Classifaction of gg gh against qg qh

Pre-Processing and Re-Weighting Jet Images with Different Substructure Variables

Top Tagging with Lorentz Boost Networks and Simulation of Electromagnetic Showers with a Wasserstein GAN

Jet Energy Calibration. Beate Heinemann University of Liverpool

Dark matter searches and prospects at the ATLAS experiment

Physics with Tau Lepton Final States in ATLAS. Felix Friedrich on behalf of the ATLAS Collaboration

The BEH-Mechanism in the SM. m f = λ f. Coupling of Higgs boson to other particles fixed by particle mass and vev

Jet Reconstruction and Energy Scale Determination in ATLAS

GANs, GANs everywhere

W/Z + jets and W/Z + heavy flavor production at the LHC

Distinguishing quark and gluon jets at the LHC

Timing for pileup mitigation in forward jets (& more) F. Rubbo, A. Schwartzman, 9/4/2015

End-to-End Event Classification of High-Energy Physics Data

arxiv: v1 [hep-ex] 24 Oct 2017

Upgrade of ATLAS and CMS for High Luminosity LHC: Detector performance and Physics potential

Jets in the 21st Century

THE MULTIPLICIY JUMP! FINDING B S IN MULTI-TEV JETS W/O TRACKS

Performance of muon and tau identification at ATLAS

Hà γγ in the VBF production mode and trigger photon studies using the ATLAS detector at the LHC

Measurement of the Higgs Couplings by Means of an Exclusive Analysis of its Diphoton decay

Physics at Hadron Colliders Part II

How and Why to go Beyond the Discovery of the Higgs Boson

Machine learning in ALICE

Performance of the CMS electromagnetic calorimeter during the LHC Run II

HL-LHC Physics with CMS Paolo Giacomelli (INFN Bologna) Plenary ECFA meeting Friday, November 23rd, 2012

Boosted hadronic object identification using jet substructure in ATLAS Run-2

QCD Jets at the LHC. Leonard Apanasevich University of Illinois at Chicago. on behalf of the ATLAS and CMS collaborations

Measurement of photon production cross sections also in association with jets with the ATLAS detector

QCD Jets: Rise of the Machines

Atlas Status and Perspectives

Machine Learning just because it is Great Fun

Measurements of the Higgs Boson at the LHC and Tevatron

Identification of the Higgs boson produced in association with top quark pairs in proton-proton

How and Why to go Beyond the Discovery of the Higgs Boson

Future prospects for the measurement of direct photons at the LHC

ATLAS: Status and First Results

Jet Substructure. Adam Davison. University College London

ATLAS jet and missing energy reconstruction, calibration and performance in LHC Run-2

Jet Results in pp and Pb-Pb Collisions at ALICE

Deep generative models for fast shower simulation in ATLAS

Physics at Hadron Colliders

Recent QCD results from ATLAS

Testing QCD at the LHC and the Implications of HERA DIS 2004

Triggering on Long-lived neutral particles in ATLAS

Physics prospects at HL-LHC with the ATLAS detector. Lydia ICONOMIDOU-FAYARD LAL-Univ. Paris Sud-CNRS/IN2P3-Paris Saclay

Jet physics in ATLAS. Paolo Francavilla. IFAE-Barcelona. Summer Institute LNF , QCD, Heavy Flavours and Higgs physics

Statistical Tools in Collider Experiments. Multivariate analysis in high energy physics

Upgrade of ATLAS Electron and Photon Triggers and Performance for LHC Run2

QCD at the Tevatron: The Production of Jets & Photons plus Jets

Tests of QCD Using Jets at CMS. Salim CERCI Adiyaman University On behalf of the CMS Collaboration IPM /10/2017

Nikos Varelas. University of Illinois at Chicago. CTEQ Collaboration Meeting Northwestern November 20, Nikos Varelas. CTEQ Meeting Nov 20, 2009

MC(4BSM) Overview. Kentarou Mawatari,

Search for Fermionic Higgs Boson Decays in pp Collisions at ATLAS and CMS

Atlas results on diffraction

Heavy MSSM Higgses at the LHC

Prospect for WW Study with Early ATLAS Data

Double Parton Scattering in CMS. Deniz SUNAR CERCI Adiyaman University On behalf of the CMS Collaboration Low-x th June 2017 Bari, Italy

PoS(DIS 2010)190. Diboson production at CMS

Thesis. Wei Tang. 1 Abstract 3. 3 Experiment Background The Large Hadron Collider The ATLAS Detector... 4

Search for the Higgs boson in fermionic channels using the CMS detector

Evidence for tth production at ATLAS

ATLAS NOTE. August 25, Electron Identification Studies for the Level 1 Trigger Upgrade. Abstract

bb and TopTagging in ATLAS

Particle ID in ILD. Masakazu Kurata, KEK Calorimeter Workshop IAS program 01/19/2018

ATLAS Discovery Potential of the Standard Model Higgs Boson

Measurement of Jet Energy Scale and Resolution at ATLAS and CMS at s = 8 TeV

Mono-X, Associate Production, and Dijet searches at the LHC

Measurement of EW production! of Z+2j at the LHC

QCD dijet analyses at DØ

QCD and jets physics at the LHC with CMS during the first year of data taking. Pavel Demin UCL/FYNU Louvain-la-Neuve

ATLAS Calorimetry (Geant)

Review of LHCb results on MPI, soft QCD and diffraction

arxiv: v1 [hep-ex] 21 Aug 2011

Multivariate techniques for identifying diffractive interactions at the LHC

Benchmarking the SiD. Tim Barklow SLAC Sep 27, 2005

Jet tagging with ATLAS for discoveries in Run II

From the Discovery of the Higgs Boson to the Search for Dark Matter -New results from the LHC-

QCD Studies at LHC with the Atlas detector

PDF Studies at LHCb! Simone Bifani. On behalf of the LHCb collaboration. University of Birmingham (UK)

Precision QCD at the Tevatron. Markus Wobisch, Fermilab for the CDF and DØ Collaborations

Vector boson scattering and fusion results from ATLAS and CMS. Qiang Li Peking University, Beijing, China

Measurement of jet production in association with a Z boson at the LHC & Jet energy correction & calibration at HLT in CMS

Application of the Tau Identification Capability of CMS in the Detection of Associated Production of MSSM Heavy Neutral Higgs Bosons Souvik Das

Higgs couplings and mass measurements with ATLAS. Krisztian Peters CERN On behalf of the ATLAS Collaboration

Jet Substructure In ATLAS

Photon and neutral meson production in pp and PbPb collisions at ALICE

Muon reconstruction performance in ATLAS at Run-2

Characterization of Jet Charge at the LHC

Study of violation of CP invariance in VBF H γγ with the ATLAS detector using Optimal Observables. Florian Kiss

Searches for electroweak production of supersymmetric gauginos and sleptons with the ATLAS detector

Search for the SM Higgs Boson in H γγ with ATLAS

ATLAS Jet Physics Results and Jet Substructure in 2010 Data from the LHC

Top properties and ttv LHC

Results from D0: dijet angular distributions, dijet mass cross section and dijet azimuthal decorrelations

Hard And Soft QCD Physics In ATLAS

Higgs Boson in Lepton Decay Modes at the CMS Experiment

Transcription:

75.2355 7.5927 GAN Applications in High Energy Particle Physics Benjamin Nachman Lawrence Berkeley National Laboratory with collaborators Michela Paganini and Luke de Oliveira Outline: DNN with HEP images LAGAN CaloGAN

Simulation at the LHC 2 DEFAULT Inspired by Sherpa. paper - can you spot the differences? Spanning -2 m up to m can take O(min/event)

Simulation at the LHC 3 This is only possible because of factorization (Markov Property): given the physics at one energy (~/length) scale, the physics at the next one is independent of what came before. DEFAULT Spanning -2 m up to m can take O(min/event)

Part I: Hard-scatter 4 We begin with a model and ME generators. L = 4 F µ F µ + i /D + i y ij j + h.c. + D µ 2 V ( ) +??? See this paper for adapting a ME to HPC See this paper for ME integration with GNNs DEFAULT ************************************************************ * * * W E L C O M E to * * M A D G R A P H 5 _ a M C @ N L O * * * * * * * * * * * * * * * * * * * * 5 * * * * * * * * * * * * * * * * * ************************************************************ Standard is automated NLO or LO + matched For many cases, this is slow but not limiting (yet)

Part II: Fragmentation 5 Fragmentation uses MCMC; standard is leading-log. DEFAULT Not a limiting factor in terms of computing time.

Part III: Material Interactions 6 State-of-the-art for material interactions is Geant4. Includes electromagnetic and hadronic physics with a variety of lists for increasing/decreasing accuracy (at the cost of time) DEFAULT This accounts for O() fraction of all HEP competing resources!

Part IV: Digitization 7 It is important to mention that after Geant4, each experiment has custom code for digitization this can also be slow; but is usually faster than G4 and reconstruction DEFAULT deposited charge e MIP Frequency Preamplifier output MIP 2 MIPs Energy Loss de/dx ToT = 2 4 MHz clock analog threshold Time

Part IV: Digitization 8 It is important to mention that after Geant4, each experiment has custom code for digitization N.B. calorimeter energy deposits this can also be slow; but is usually factorize (sum of the deposits is faster than G4 and reconstruction the deposit of the sum) but DEFAULT deposited charge e digitization (w/ noise) does not! MIP Frequency Preamplifier output MIP 2 MIPs Energy Loss de/dx ToT = 2 4 MHz clock analog threshold Time

9 Goal: replace (or augment) simulation steps with a faster, powerful generator based on state-of-the-art machine learning techniques This work: attack the most important part: Calorimeter Simulation

Why should you care? N.B. ALL jet substructure analyses in ATLAS are forced to use full simulation as current fast sim. is not good enough. Standard Model Production Cross Section Measurements Status: July 27 σ [pb] 6 5 4 3 2 2 total (x2) inelastic incl. dijets p T > 25 GeV p T > 25 GeV p T > GeV n j n j 2 n j 3 n j n j n j 2n j n j n j 3n j 2 n j 2 n j 3 n j 4 n j 3 n j 4 n j 5 n j 4 n j 5 n j 6 n j 5 n j 6 n j 7 n j n n j 7 j 6 n j 7 total n j 4 n j 5 n j 6 n j 7 n j 8 ATLAS Preliminary Theory Run,2 s =7,8,3TeV ~3 billion events at the HL-LHC t-chan Wt s-chan Zt WW WZ ZZ WW WW WZ ZZ WZ ZZ total ggf H WW H ττ VBF H WW H γγ H ZZ 4l W γ Zγ LHC pp s =7TeV Data 4.5 4.9 fb LHC pp s =8TeV Data 2.3 fb LHC pp s =3TeV Data.8 36. fb 3 W ± W ± WZ pp Jets R=.4 γ W Z t t t tot. VV tot. γγ H WVVγ t tw tot. t tz tot. t tγ Wjj EWK Zjj EWK WW Excl. tot. ZγγWγγ WWγ ZγjjVVjj EWK EWK

Why should you care? N.B. ALL jet substructure analyses in ATLAS are forced to use full simulation as current fast sim. is not good enough. Standard Model Production Cross Section Measurements Status: July 27 σ [pb] 6 total (x2) inelastic incl. If we dijets 5 don t do something, the HL-LHC LHC pp s =8TeV p T > 25 GeV 4 n j won t be possible. If we do something LHC pp s =3TeV n p T > 25 GeV n j j ~3 billion events at the HL-LHC 3 total Data.8 36. fb p T > GeV n j 2n j WW now, t-chan n j n j WW 2 we can save O($ million/year). WW total n j 3n j 2 Wt n n j 2 j 2 WZ WZ n j 3 ZZ WZ ggf n j 4 H WW n j 3 n j 3 n j 4 ZZ W γ ZZ n j 4 n j 5 s-chan n j 5 n j 4 n j 6 H ττ n j 5 Zt Zγ n j 6 n n j 5 j 7 VBF n j 6 H WW 2 n j 7 n n j 7 j 6 n j 7 n j 8 ATLAS Preliminary Theory Run,2 s =7,8,3TeV H γγ H ZZ 4l LHC pp s =7TeV Data 4.5 4.9 fb Data 2.3 fb 3 W ± W ± WZ pp Jets R=.4 γ W Z t t t tot. VV tot. γγ H WVVγ t tw tot. t tz tot. t tγ Wjj EWK Zjj EWK WW Excl. tot. ZγγWγγ WWγ ZγjjVVjj EWK EWK

2 Jets First step: instead of studying the detailed structure of calorimeter showers, we consider Jet images

The Jet Image 3 Jet Image: A two-dimensional fixed representation of the radiation pattern inside a jet proton-proton collision into/ out-of page jet image jet [Translated] Azimuthal Angle (φ).5 -.5 Boosted W qq Not unlike a 2D calorimeter! 2 [GeV] Pixel p T jet - not to scale - -.5.5 [Translated] Pseudorapidity (η) jet image..and nothing like a natural image!

[Translated] Azimuthal Angle (φ) Why re-showered images? with Pythia 8, m = 25 GeV.5-3. Can directly visualize physics.8.6.4 [Translated] Azimuthal Angle (φ) -.5 p p -.5 -.5.5 [Translated] Pseudorapidity (η) p p bb, 8 re-showered with Pythia 8, m - -2-7 bb -9 - - - (will mention other fixed representations later) -3-6 -8 Normalized Pixel Energy -9 - - -2 - -3-7 there is information encoded in the -8 -.5 p p bb re-showered with Pythia 8, m singlet qq = 25 GeV [Translated] Azimuthal Angle (φ),8.5 = 25 GeV 4 extensive -4 image processing literature -5-4 -5-6 physical distance between pixels Normalized Pixel Energy -.5 [Translated] Azimuthal Angle (φ) -.5 -.5 - p p 8 bb re-showered with Pythia 8, m 8 p p 8 = 25 GeV and we can benefit from the bb octet qq 8 re-showered with Pythia 8, m = 25 GeV - -.5.5 [Translated] Pseudorapidity (η)

Why jets? P 5 ~X =(, ) p A jet is defined by a clustering algorithm (=unsupervised learning) BUT these clusters also have physical meaning e.g. can be calculated in perturbation theory p ~.8.6.4.2.8.6.4.2 2 See also Bayesian Model-based Clustering 2 3 ~µ 3 2 ~µ 2 5 4 3 2 2 3 4 5 ~µ a great testing ground to bridge state-of-the-art ML techniques with physically meaningful/ interpretable algorithms Some recent attempts to even cluster jets using modern ML tools. L. Mackey, BPN, A. Schwartzman, C. Stansbury, 59.226 Louppe et. al, 72.748

Pre-processing and Special Relativity 6 Pre-processing is an important aspect of 24 image < p /GeV < 26 GeV, 65 recognition < mass/gev < 95 [Translated] Azimuthal Angle (φ) However, some steps can - Pythia 8, W' WZ, s = 3 TeV -2 24 < p /GeV < 26 GeV, 65 < mass/gev < 95 damage the physics T information -3 3-4 2 content of a jet image -5 -.5-6 [Translated] Azimuthal Angle (φ).5 - -.5.5-7 - I won t discuss this in detail here, -5 but -.5-6 I bring it up so you are aware of -7 it! -8 gle (φ).5 - Pythia 8, W' WZ, T [Translated] Pseudorapidity (η) Pythia - 8, QCD dijets, 24 < p /GeV < 26 GeV, 65 < mass/gev < 95 T - -.5.5 s = 3 TeV [Translated] Pseudorapidity (η) s = 3 TeV 3 2-8 -2-9 -3-4 -9 3 2 [GeV] Pixel p T [GeV] Pixel p [GeV] T T gle [Translated] (φ) Azimuthal Angle [Translated] (φ) Azimuthal Angle [Translated] (φ) Azimuthal Angle [Translated] (φ) Azimuthal Angle (φ).5 -.5.5 - -.5.5 - -.5.5 - Pythia 8, W' WZ, 24 < p /GeV < 26 GeV, 65 < mass/gev < 95 T Pythia 8, W' WZ, 24 < p /GeV < 26 GeV, 65 < mass/gev < 95 T Pythia 8, W' WZ, 24 < p /GeV < 26 GeV, 65 < mass/gev < 95 T s = 3 TeV - -.5.5 [Translated] Pseudorapidity (η) Pythia 8, QCD dijets, 24 < p /GeV < 26 GeV, 65 < mass/gev < 95 T s = 3 TeV s = 3 TeV - -.5.5 [Translated] Pseudorapidity (η) Pythia 8, QCD dijets, 24 < p /GeV < 26 GeV, 65 < mass/gev < 95 T - -.5.5 [Translated] Pseudorapidity (η) s = 3 TeV s = 3 TeV 3 2 2 3-3 2-4 - -7-2 -8 3 2 - -2 3-3 -5-6 [GeV] -4-5 Pixel p -6 - -7-2 -8 [GeV] -9 Pixel p -3-9 3-4 2-5 [GeV] Pixel p T T [GeV] T T

Convolution age Max-Pool Convolution Max-Pool Modern Deep NN s for Classification Flatten 7 L. de Oliviera, M. Kagan, L. Mackey, BPN, A. Schwartzman 5.59 Convolutions Convolved Feature LayersLayers Convolved Feature Max-Pooling W WZ event P. Baldi et al. 63.9349 (W-tagging) J. Barnard et al. 69.67 (W-tagging) Repeat Subsequent P. Komiske et al. 62.55 (q/g-tagging) developments: G. Kasieczka et al. 7.8784 (top-tagging)

Modern Deep NN s for Classification 8 /(Background Efficiency) 5 25 < p /GeV < 3 GeV, 65 < mass/gev < 95 T Pythia 8, s = 3 TeV DNNs mass+τ 2 mass+ R τ 2 + R MaxOut Convnet Convnet-norm Random 5 Universally, DNNs out-perform physically by how much motivated features and what they are learning is.2.4.6.8 Signal Efficiency active R&D!

Modern Deep NN s for Regression 9 P. Komiske, E. Metodiev, BPN, M. Schwartz 77.xxxx Every pp collision comes with O(-) other collisions we don t care about (pileup) Pileup is a strange sort of noise because we can measure ~2/3 of it (charged pileup) See BOOST7 talk

Modern Deep NN s for Regression 2 P. Komiske, E. Metodiev, BPN, M. Schwartz 77.xxxx Leading vertex charged f o n o i t a c i! l s p r p e t l a fi l l a r a u n t o a i t N u l o v n o c Inputs to NN Total neutral {z filters 2 } Leading vertex neutral See BOOST7 talk b ea m Pileup charged

Modern Deep NN s for Regression 2 P. Komiske, E. Metodiev, BPN, M. Schwartz 77.xxxx Pileup Mitigation with Machine Learning See BOOST7 talk

And now: Modern Deep NN s for Generation 22 Generative Adversarial Networks (GAN): A two-network game where one maps noise to images and one classifies images as fake or real. GAN noise G D {real,fake} When D is maximally confused, G will be a good generator Pythia D Physics-based simulator

QCD ACGAN 23 {W,QCD} W GAN Add an auxiliary classification task G D W {real,fake} noise QCD D Pythia W = signal QCD = background

Locally Connected Layers Due to the structure of the problem, we do not have translation invariance. 24 Classification studies found fully connected networks outperformed CNNs However, convolutional-like architectures are still useful to e.g. reduce parameters

Locally Connected Layers 25 Locally connected layers use filters on small patches (CNN is then a special case with weight sharing)

Locally Aware GAN (LAGAN) 26 W - QCD (GAN) Unlike `natural images, we have physically meaningful D manifolds (here, jet mass) W - QCD (Pythia)

+ More Layers for Generation What about multiple layers with non-uniform granularity and a causal relationship? Not jet images per se, but the technology is more general than jets! φ z η 27

Calorimeter Simulation 28 η direction [mm] 2 5 5 5 - Geant4, Pb Absorber, lar Gap, GeV e 3 25 2 5 Local Energy Deposit [MeV] We take as our model a 3- layer LAr calorimeter, inspired by the ATLAS barrel EM calorimeter 5 5 2 2 5 5 5 5 2 Depth from Calorimeter Center [mm] A single event may have O( 3 ) of particles showering in the calorimeter - too cumbersome to do all at once (now) η direction [mm] 2 5 5 - Geant4, Pb Absorber, lar Gap, GeV e 9 8 7 6 5 Cell Energy [MeV] 5 4 We exploit factorization of 5 3 2 energy depositions 2 2 5 5 5 5 2 Depth from Calorimeter Center [mm]

Generator Network for CaloGAN 29 One jet image per calo layer One network per particle type; input particle energy use layer i as input to layer i+ ReLU to encourage sparsity

Discriminator Network for CaloGAN 3 help avoid mode collapse

Average Images 3 Geant4 CaloGAN

Overtraining 32 not memorizing A key challenge in training GANs is the diversity of generated images. This does not seem to be a problem for CaloGAN. no mode collapse

Energy per layer 33 Pions deposit much less energy in the first layers; leave the calorimeter with significant energy N.B. can always add these (and others) explicitly to the training

Depth of the shower 34 Depth-weighted total energy ld

Lateral spread 35 The much larger variation in the pion showers is a challenge for the network. These moments and others are useful for classification; we have also tested this as a metric (NN on 3D images)

Shower Energy 36 Beyond our training sample!

Timing 37 Generation Method Hardware Batch Size milliseconds/shower GEANT4 CPU N/A 772 3. CPU 5. 28 2.9 24 2.3 CALOGAN 4.5 4 3.68 GPU 28.2 52.4 24.2

Where to next 38 Add angle in addition to energy; hadronic calorimeter Non-uniform geometry as a function of h η φ z Integration within experiments (ATLAS and possibly others?) and collaboration with other efforts (e.g. GeantV)

Conclusions 39 Convolutions (Jet) image-based NN classification, regression, and generation are powerful tools for fully exploiting the physics program at the LHC Beyond our training sample! WZ eventis to The key tow robustness study what is being learned; this may even help us to learn something new!

Code and documentation 4 All of our training samples are public as is our generation, training, and plotting code: https://github.com/hep-lbdl you can find more documentation about the LAGAN and CaloGAN on the arxiv: 75.2355 7.5927

Workshop Advertisement 4 link

Backup

allows for an imageed in pixels in (η, φ) greyscale analogue. HEP 2 (25) 8], omputer vision.. We h image, as is often ntensities. meaning th discriminant Learning about Visualizing Learning Learning Below, we have the learned convolutional filters (left) and the difference in between the average signal and background image after applying the learned convolutional filters (right). This novel difference-visualization technique helps understand what the network learns. Jet images afford a lot of natural visualization 25 < p /GeV < 26 GeV,.59a< τ <.6, 79 < mass/gev < 8 tinguish between s = 3 TeV, Pythia 8, QCD. 5 2 correlation image T 4 3 2 - -.5-2 T ge nal d.5 Raw image (differences) CNN filters signal p - background p [GeV] [Translated] Azimuthal Angle (φ) T 43 2D Convolutions to Jet Images -3 - most activating images -4 - -.5.5-5 [Translated] Pseudorapidity (η) We show th potential of into Compu towards inc as a community, we have also developed many techniques 24 < pt/gev < 26 GeV,.9 < τ 2 <.2, 79 < mass/gev < 8 s = 3 TeV, Pythia 8 s = 3 TeV, Pythia 8..8.6.4.8.6 joint distributions mass+maxout τ2+maxout R+MaxOut MaxOut Convnet-norm.4.6.8 2 Pythia 8, 5 τ2.2 Convnet 5 MaxOut MaxOut-norm MaxOut (weighted).6 24 < pt/gev < 26 GeV,.9 < τ 2 <.2, 79 < mass/gev < 8 Pythia 8, re-weight Net + high level.4 s = 3 TeV mass+τ2.4.2.2 Convnet Random.2 5 25 < pt/gev < 3 GeV,.2 < τ 2 <.8, 65 < mass/gev < 95 /(Background Efficiency) QCD, /(Background Efficiency) Pr(τ2 DNN Output) DNN Output T /(Background Efficiency) 25 < p /GeV < 3 GeV, 65 < mass/gev < 95 Signal Efficiency.2.4.6 mass τ2 5 R redact MaxOut Convnet Random 5 5.8 s = 3 TeV 2.8 Signal Efficiency More detail in my DS@HEP5 talk.2.4.6.8 Signal Efficiency

Locally Aware GAN (LAGAN) 44