The Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC)
Thursday, 12 February 2015
$ whoami
- Lukasz (Luke) Kreczko, Particle Physicist
- Computing Research Assistant at the University of Bristol
- My work involves:
  - Programming & project management (aka physics analysis)
  - SysAdmin, DevOps & user support
  - Outreach: among others, this talk
This talk includes
- A (very) short introduction to particle physics
- An overview of the LHC and the CMS experiment
- Our data problem and our evolving solution
What is Particle Physics?
In a nutshell: particle physics is the study of the smallest matter and anti-matter particles and the interactions between them.
How small is small?
- 10^30 m: Observable Universe (The Top?)
- 10^20 m: Galaxy clusters
- 10^10 m: Solar system
- 1 m: You are here
- 10^-10 m: Atoms
- 10^-20 m: Standard Model
- 10^-30 m: Unknown?
- 10^-40 m: Planck length (The Bottom?)
Why are we doing this?
- Our business is fundamental physics: we are trying to figure out how our universe works
Where does mass come from?
- What is the origin of mass?
- We are a step closer with the Higgs boson! Discovered in 2012
- Francois Englert & Peter W. Higgs: Nobel Prize in Physics 2013
[Figure: CMS diphoton mass spectrum. x-axis: m_γγ (GeV); y-axis: S/(S+B) weighted events / 1.5 GeV; √s = 7 TeV, L = 5.1 fb^-1 and √s = 8 TeV, L = 5.3 fb^-1; data with S+B fit, B fit component, ±1σ and ±2σ bands; inset: unweighted events / 1.5 GeV]
What is Dark Matter?
- What is 96% of the universe made of? We only see 4%! What is Dark Matter and Dark Energy?
- Pie chart: dark energy 73%, dark matter 23%, intergalactic gas 3.6%, stars etc. 0.4%
Where has the anti-matter gone?
- At the Big Bang, matter and anti-matter were produced in equal quantities: why do we exist?
- Matter and anti-matter should have annihilated each other shortly after
- But there is lots of matter and almost no anti-matter in the universe!
- What is the state of matter just after the Big Bang?
What we know so far: The Standard Model
- Describes elementary particles and the interactions between them
- So far we know 6 quarks, 6 leptons and 4 force carriers + their anti-particles
*Discovered in 2012!
The Standard Model
- Normal matter consists of only the first generation (figure labels: proton, neutron)
- Muons: 1 per cm^2 per minute from cosmic rays at sea level
- Neutrinos: 7*10^10 particles per cm^2 per second from the sun
  - Pass almost undisturbed through matter
  - Can oscillate into each other (discovered in 2001)
  - Photo: Borexino experiment in Gran Sasso
- Photons (light): carriers of the electro-magnetic force, holding electrons within atoms
- Z and W bosons: carriers of the weak force (radioactive beta decays)
- Gluons: carriers of the strong force, holding the atomic nucleus together
- Top quark: newest observed member of the quarks (1995)
  - Highest mass (by a huge margin), comparable to a gold atom
  - Very short lifetime, ~10^-25 s: decays before it can interact with other matter!
  - My subject of study
- All of this is not stable and has to be produced in particle collisions!
The Large Hadron Collider
- Mankind's biggest machine (27 km circumference)
- Figure label: 4.3 km
- The world's most powerful "microscope": allows the measurement of very small distances (~10^-20 m)
- The world's fastest "race track": protons go around the LHC ~10,000 times per second
- Cardiff-Geneva: 150 times per second
- A "time machine": recreates conditions as they were nanoseconds after the Big Bang
- Collisions are 100,000 times hotter than the centre of the sun
- And more dense than neutron stars!
- Colder than deep space: (super) liquid helium at 1.9 K (-271 °C) is used to cool the LHC's superconducting magnets
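Two of the numbers above can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes the protons travel at essentially the speed of light and uses the rounded 27 km circumference from the slide, so the result is an estimate rather than the machine's exact revolution frequency. The function names are made up for this example.

```python
# Back-of-the-envelope check of the "race track" and temperature figures.
# Assumptions (not from the slides): protons move at essentially the speed
# of light, and the ring length is the rounded 27 km quoted in the talk.

SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre
RING_CIRCUMFERENCE_M = 27_000         # rounded value from the slide


def laps_per_second(circumference_m: float,
                    speed_m_per_s: float = SPEED_OF_LIGHT_M_PER_S) -> float:
    """Number of times a particle at the given speed circles the ring each second."""
    return speed_m_per_s / circumference_m


def kelvin_to_celsius(kelvin: float) -> float:
    """Convert a temperature from kelvin to degrees Celsius."""
    return kelvin - 273.15


if __name__ == "__main__":
    print(f"laps per second: {laps_per_second(RING_CIRCUMFERENCE_M):,.0f}")
    print(f"1.9 K in Celsius: {kelvin_to_celsius(1.9):.2f}")
```

With these inputs the lap count comes out a little above 10,000 per second, in line with the "~10,000 times per second" on the slide.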
A complex of accelerators
The CMS Experiment: CMS detector
Built like an onion around the collision point
- Total weight: 14,000 tonnes
- Overall diameter: 15.0 m
- Overall length: 28.7 m
- Magnetic field: 3.8 T
Components:
- Steel return yoke: 12,500 tonnes
- Silicon trackers: pixel (100x150 μm), ~16 m^2, ~66M channels; microstrips (80x180 μm), ~200 m^2, ~9.6M channels
- Superconducting solenoid: niobium titanium coil carrying ~18,000 A
- Muon chambers: barrel with 250 drift tube and 480 resistive plate chambers; endcaps with 468 cathode strip and 432 resistive plate chambers
- Preshower: silicon strips, ~16 m^2, ~137,000 channels
- Forward calorimeter: steel + quartz fibres, ~2,000 channels
- Crystal electromagnetic calorimeter (ECAL): ~76,000 scintillating PbWO4 crystals
- Hadron calorimeter (HCAL): brass + plastic scintillator, ~7,000 channels
The CMS Experiment
- Charged particles leave a track in the tracker
- Electrons and photons leave all of their energy in the electro-magnetic calorimeter
- Protons and neutrons (and other hadrons) leave most of their energy in the hadron calorimeter
- Muons travel through the whole detector and leave a track
- Neutrinos can't be detected directly: through conservation of energy and momentum they are identified as missing energy
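The "missing energy" idea can be sketched in code. In the plane transverse to the beam the total momentum should add up to zero, so whatever is needed to balance the visible particles is attributed to undetected ones such as neutrinos. The example below is a simplified illustration, not the CMS reconstruction: the function name and the input values are hypothetical, and each particle is reduced to a transverse momentum and an azimuthal angle.

```python
import math
from typing import Sequence, Tuple


def missing_transverse_momentum(
    visible: Sequence[Tuple[float, float]],
) -> Tuple[float, float]:
    """Return (magnitude in GeV, azimuthal angle in radians) of the imbalance.

    `visible` holds one (pT in GeV, phi in radians) pair per detected particle.
    The missing momentum is the negative vector sum of the visible momenta.
    """
    particles = list(visible)
    sum_px = sum(pt * math.cos(phi) for pt, phi in particles)
    sum_py = sum(pt * math.sin(phi) for pt, phi in particles)
    missing_px, missing_py = -sum_px, -sum_py
    return math.hypot(missing_px, missing_py), math.atan2(missing_py, missing_px)


if __name__ == "__main__":
    # Hypothetical event: two back-to-back particles with unequal momenta.
    hypothetical_event = [(50.0, 0.0), (30.0, math.pi)]
    magnitude, phi = missing_transverse_momentum(hypothetical_event)
    print(f"missing pT = {magnitude:.1f} GeV at phi = {phi:.2f} rad")
```

A perfectly balanced event gives a missing momentum of (numerically) zero; any sizeable imbalance hints at particles that left no signal.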
The CMS Experiment: like a big digital camera
- > 76 million detector channels
- 200 m^2 of silicon detector (tracker)
- 40 million pictures (events) per second
- ~1 MB of data per event
- 3 microseconds data buffer
The CMS Experiment
- Decision to store/dump data comes from hardware trigger (custom FPGAs)
- 100,000 events per second to computer farm (software trigger)
- 1000 events per second to storage (tape/disk)
- From detector to disk: 40 MHz -> 100 kHz -> 1 kHz (while trying to keep the interesting events)
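To get a feeling for what the trigger chain means in data volume, the rates can be multiplied by the ~1 MB event size quoted earlier. This is a rough sketch that assumes decimal units (1 MB = 10^6 bytes) and a constant event size at every stage; the stage names and the `summarise` helper are invented for this example.

```python
# Rough data-rate arithmetic for the trigger chain 40 MHz -> 100 kHz -> 1 kHz.
# Assumption: every event is ~1 MB (decimal megabytes) at every stage.

EVENT_SIZE_BYTES = 1_000_000

TRIGGER_STAGES_HZ = [
    ("collisions seen by the detector", 40_000_000),
    ("after hardware trigger", 100_000),
    ("after software trigger", 1_000),
]


def summarise(stages, event_size_bytes):
    """Return (name, rate in Hz, reduction vs previous stage, bytes per second)."""
    rows = []
    previous_rate = None
    for name, rate_hz in stages:
        reduction = None if previous_rate is None else previous_rate / rate_hz
        rows.append((name, rate_hz, reduction, rate_hz * event_size_bytes))
        previous_rate = rate_hz
    return rows


if __name__ == "__main__":
    for name, rate_hz, reduction, bytes_per_s in summarise(TRIGGER_STAGES_HZ, EVENT_SIZE_BYTES):
        factor = "-" if reduction is None else f"1 in {reduction:,.0f}"
        print(f"{name:32s} {rate_hz:>12,} Hz  kept: {factor:>10s}  {bytes_per_s / 1e9:>10,.0f} GB/s")
```

Under these assumptions the hardware trigger keeps 1 event in 400 and the software trigger 1 in 100, taking the stream from about 40 TB/s down to about 1 GB/s.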
The data
(Photo: CERN computing centre)
- The data is stored in data centres like these on both tape (backup) and disk (usage)
- Multiple copies ensure availability and fault tolerance
- The data is segmented into data sets depending on trigger decision (electron trigger fired -> electron data set)
- To understand the data we need simulation. Simulated data is segmented by physics process
Analysing a year of data
- CMS records 10,000 terabytes of data every year (around 70 years of full HD movies): 5000 x 2 TB
- + a similar amount of simulation (usually more)
- To analyse this on a single computer would take 64,000 years!
- Solution: more computers
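The numbers on this slide fit together with simple arithmetic. The sketch below assumes decimal units, perfect parallel scaling (twice the cores, half the time, which real workloads never quite reach) and the ~1 GB/s storage rate implied by 1000 events per second at ~1 MB each; the helper names are invented for this example.

```python
import math


def disks_needed(total_tb: float, disk_size_tb: float) -> int:
    """Number of disks of a given size needed to hold the data once."""
    return math.ceil(total_tb / disk_size_tb)


def cores_for_deadline(single_computer_years: float, deadline_years: float) -> int:
    """Cores needed to finish within the deadline, assuming perfect scaling."""
    return math.ceil(single_computer_years / deadline_years)


def recording_time_days(total_tb: float, rate_gb_per_s: float) -> float:
    """Days of continuous recording needed to write total_tb at the given rate."""
    seconds = total_tb * 1000 / rate_gb_per_s  # 1 TB = 1000 GB (decimal units)
    return seconds / 86_400


if __name__ == "__main__":
    print("2 TB disks for one year of data:", disks_needed(10_000, 2))
    print("cores to analyse it within one year:", cores_for_deadline(64_000, 1))
    print(f"days of recording at 1 GB/s: {recording_time_days(10_000, 1):.0f}")
```

So 10,000 TB corresponds to roughly 116 days of continuous recording at 1 GB/s, and finishing the analysis within a year would need on the order of 64,000 cores.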
The beginning of the grid
- 1984: LHC project proposed
- 1994: LHC project approved
- Deciding the LHC's computing model
- The conclusion: analyse data where it is located
The Grid
[Diagram of grid sites, with CERN and these role labels:]
- Tape/disk + reconstruction (CERN)
- Tape/disk + reconstruction + simulation
- Disk + simulation + user analysis
- (Disk) + user analysis
All grid sites use Scientific Linux 5 and 6
The Grid: global distributed computing
- On a normal day, the grid provides 100,000 CPU days executing 1 million jobs
- At Bristol we have ~630 TB disk space and 948 cores, connected via 10 Gbit/s to the grid
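A couple of quick estimates follow from these figures. The sketch below is hedged: it treats the daily numbers as averages, uses decimal units, and ignores protocol overhead and competing traffic on the network link, so the transfer time is a lower bound at best. The function names are made up for this example.

```python
def average_job_cpu_hours(cpu_days_per_day: float, jobs_per_day: float) -> float:
    """Average CPU time per job, in hours."""
    return cpu_days_per_day * 24 / jobs_per_day


def transfer_time_days(volume_tb: float, link_gbit_per_s: float) -> float:
    """Days needed to move volume_tb over a fully used link (no overhead)."""
    bits = volume_tb * 1e12 * 8
    seconds = bits / (link_gbit_per_s * 1e9)
    return seconds / 86_400


if __name__ == "__main__":
    print(f"average job: {average_job_cpu_hours(100_000, 1_000_000):.1f} CPU hours")
    print(f"filling 630 TB over 10 Gbit/s: {transfer_time_days(630, 10):.1f} days")
```

An average job therefore uses about 2.4 CPU hours, and filling the Bristol disks once over the 10 Gbit/s link would take just under six days.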
Data on the grid
- 140 PB
- > 200 PB of transfers
Data preparation
The CMS Software
- The CMS Software (CMSSW) is open source: https://github.com/cms-sw/cmssw
- Contains around 3.6M source lines of code (SLOC)
- The entire software stack includes 125 external packages like ROOT (http://root.cern.ch) or Geant4 (http://geant4.cern.ch)
- Runs on x86 and ARM devices under Linux and OS X
- Available on all grid sites via CVMFS (http://cernvm.cern.ch/portal/filesystem)
The data: a structured mess
- This is low intensity! Later this year we expect 40 times this per collision!
The data: a much nicer picture
[Event display: Run 163583, Event 26579562, m(f) = 1.2 TeV/c^2]
- Jet: p_T = 84.1 GeV/c, η = 2.24
- Jet: p_T = 89.0 GeV/c, η = 2.14
- Jet: p_T = 85.3 GeV/c, η = 2.02
- Jet: p_T = 90.5 GeV/c, η = 1.40
- Muon: p_T = 71.5 GeV/c, η = 0.82
- Missing E_T: 22.3 GeV
Annotations:
- Jet: a spray of particles going in a common direction
- Muon: the heavy partner of the electron
- Other low energy particles
- Missing E_T: energy and momentum imbalance
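The η values in the event display are pseudorapidities. The slides do not define the quantity; the sketch below uses the standard collider convention η = -ln tan(θ/2), where θ is the angle between the particle and the beam axis, so η = 0 is perpendicular to the beam and large |η| is close to it. The function names are chosen for this example.

```python
import math


def polar_angle_deg(eta: float) -> float:
    """Angle to the beam axis, in degrees, for a given pseudorapidity."""
    return math.degrees(2.0 * math.atan(math.exp(-eta)))


def pseudorapidity(theta_deg: float) -> float:
    """Pseudorapidity for a given angle to the beam axis (0 < theta < 180)."""
    return -math.log(math.tan(math.radians(theta_deg) / 2.0))


if __name__ == "__main__":
    # eta values taken from the event display above
    for label, eta in [("muon", 0.82), ("jet", 1.40), ("jet", 2.24)]:
        print(f"{label}: eta = {eta:.2f} -> {polar_angle_deg(eta):.1f} degrees from the beam")
```

Read this way, the muon sits fairly centrally in the detector, while the jets with η above 2 point much closer to the beam line.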
The goal: extend our knowledge
- Billions of events + simulation
- That's the famous Higgs boson
[Figures: event display (Run 163583, Event 26579562) and the CMS diphoton mass spectrum, m_γγ (GeV) against S/(S+B) weighted events / 1.5 GeV; √s = 7 TeV, L = 5.1 fb^-1 and √s = 8 TeV, L = 5.3 fb^-1; data with S+B fit, B fit component, ±1σ and ±2σ bands; inset: unweighted events / 1.5 GeV]
The long shutdown
- Since the end of 2012 the LHC has been in shutdown
- Extensive maintenance was needed to get ready for 13 TeV operation (compared to 8 TeV in 2012)
- Reprocessing of existing data: better detector knowledge etc.
- 364 papers published on these data (as of Jan 2015)
- Lots of time to think about what we can do better
Using the WAN
Deciding the LHC's computing model [quotation needed]
- WANs today are fast and reliable
- Most sites connected with > 10 Gbit/s
- A few sites have lots of cores but little storage
- The conclusion: bring data to where CPU cycles are available
- Done via Xrootd (http://xrootd.org/)
The logical next step
- Dynamic Data Placement:
  - Monitor the data sample popularity
  - Delete unused samples (leave 1 copy on tape)
  - Copy popular samples to more sites
- Self-regulated system deployed last year
  - Frees data manager resources
  - Fast reaction to bottlenecks or space filling up
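The three rules of dynamic data placement can be illustrated with a toy policy. This is a minimal sketch and not the system CMS deployed: the `Sample` class, the `rebalance` function, the popularity threshold and the replica limit are all hypothetical, and a real system would also have to consider disk quotas, transfer costs and samples that are currently in use.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Sample:
    """Hypothetical record for one data sample."""
    name: str
    accesses_last_week: int                       # popularity measure (assumed)
    disk_sites: Set[str] = field(default_factory=set)
    on_tape: bool = True                          # one archival copy stays on tape


def rebalance(samples: List[Sample], all_sites: List[str],
              popular_threshold: int = 100,
              max_replicas: int = 3) -> List[Tuple[str, str, str]]:
    """Return a list of (action, sample name, site) and update the samples."""
    actions = []
    for sample in samples:
        if sample.accesses_last_week == 0 and sample.on_tape:
            # Unused: free all disk copies, the tape copy remains.
            for site in sorted(sample.disk_sites):
                actions.append(("delete", sample.name, site))
            sample.disk_sites.clear()
        elif sample.accesses_last_week >= popular_threshold:
            # Popular: add replicas at sites that do not yet hold the sample.
            candidates = sorted(set(all_sites) - sample.disk_sites)
            missing = max(0, max_replicas - len(sample.disk_sites))
            for site in candidates[:missing]:
                actions.append(("copy", sample.name, site))
                sample.disk_sites.add(site)
    return actions


if __name__ == "__main__":
    sites = ["site_a", "site_b", "site_c", "site_d"]
    samples = [
        Sample("unused_sample", 0, {"site_a", "site_b"}),
        Sample("popular_sample", 500, {"site_a"}),
    ]
    for action in rebalance(samples, sites):
        print(action)
```

Running such a policy periodically is what makes the system self-regulating: unused samples give up disk space without manual intervention, and popular ones spread to more sites.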
Other preparations
- Software: big effort on multicore to improve data reconstruction
  - Together with algorithm improvements back on track
- Middleware: more use of temporary resources, e.g. clouds
  - Using OpenStack to build up a site on demand
  - Looking at Docker (https://github.com/cmssw/cms-sw.github.io/blob/master/docker.md)
- The grid is busy:
  - First sets of simulation for this year are finished
  - The final set (to be used with data) is starting soon
Summary
- The LHC and the CMS experiment are large man-made machines to measure the smallest known (anti-)matter
- The data storage and analysis challenge has been met with the LHC worldwide grid
  - Made past discoveries possible but is still evolving
  - Data is shipped on demand to available computing resources
  - Data popularity is used to distribute data across sites
- The LHC is about to start collisions again in May/June
- We are ready for the new energy frontier!
Any Questions?