How to Win With Poisson Data: Imaging Gamma-Rays A. Connors, Eureka Scientific, CHASC, SAMSI06 SaFDe 1
All-Sky Map of > 1 GeV Photons: 9 Years of CGRO/EGRET 2
3
Model of Diffuse Emission along Galactic Plane: Strong, Moskalenko, Reimer (GALPROP) 4
What is that excess glow around the Milky Way? (Dixon, Hartman, Kolaczyk, et al: Poisson-tailored Haar Wavelet Thresholding) 5
How We Began (SAMSI06 SaFDe): Dixon, Hartman, Kolaczyk, et al., New Astronomy 3 (1998) 539: "The immediate question arises as to the statistical significance of this feature. Though we are able to make rigorous statements about the coefficient-wise and level-wise FDR, similar quantification of object-wise significance (e.g., 'this blob is significant at the n sigma level') are difficult." 6
How to Win With Low-Count Poisson Data: OVERVIEW
1/ Motivation
2/ Opinion/Discussion: Four Old Rules + One New
3/ Framework: A Single Pixel
4/ Worked Example: All-Sky > 1 GeV Gamma-Rays 7
FOUR RULES:
RESPECT THE DATA (i.e. respect the underlying distribution)
  * No "binning up" - you lose information and you don't need to (see Scargle etc.)
  * No (or minimal) filtering or pre-processing
  * No subtracting ("model out")
  * Cut the cuts
DATA EXPLORATION/VISUALIZATION - DIFFERENT THAN INFERENCE
  * And both are useful
LIKELIHOOD-BASED
  * IF you want to know uncertainties
  * BUT beware model incompleteness
RESPECT YOUR UNKNOWNS
  * i.e. respect your uncertain background and calibration 'constants' (EffArea, etc.)
  * They have a distribution, too.
NEW ONE: What About Goodness of Fit?
  * Try testing for "goodness of fit" and residuals by using a flexible model (e.g. non- or semi-parametric) 8
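As a rough illustration of the "respect your unknowns" rule for one pixel, the sketch below averages the Poisson likelihood over draws of an uncertain background instead of fixing the background at its nominal value. The gamma distribution, the 30% relative uncertainty, and all function names are assumptions made for this example; they are not taken from the talk.

```python
import math
import random

def poisson_likelihood(n, mu):
    """Poisson probability of observing n counts when mu counts are expected."""
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def marginal_likelihood(s, n, background_draws):
    """Average the likelihood of source intensity s over background draws."""
    total = sum(poisson_likelihood(n, s + b) for b in background_draws)
    return total / len(background_draws)

rng = random.Random(0)

# Assumed background: mean of 10 counts with 30% relative uncertainty,
# described by a gamma distribution (mean = shape * scale).
mean_b, rel_sigma = 10.0, 0.3
shape = 1.0 / rel_sigma ** 2
scale = mean_b / shape
draws = [rng.gammavariate(shape, scale) for _ in range(5000)]

n_observed = 2
s_grid = [0.5 * i for i in range(41)]  # source intensity from 0 to 20 counts
fixed = [poisson_likelihood(n_observed, s + mean_b) for s in s_grid]
marginal = [marginal_likelihood(s, n_observed, draws) for s in s_grid]
```

With 2 counts observed against a nominal background of 10, the marginal likelihood at zero source is larger than the fixed-background one, because lower background values remain plausible once the background is allowed its own distribution.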
FRAMEWORK: A SINGLE BIN [Figure: two panels of y1 versus x1, with x1 from 0 to 20. Left: Poisson distribution, mu = 0.2. Right: Poisson distribution, mu = 10.] 9
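The two Poisson distributions named on this slide (mu = 0.2 and mu = 10) can be tabulated with a few lines of Python. This is a minimal sketch using only the standard library; the function and variable names are chosen here for illustration.

```python
import math

def poisson_pmf(k, mu):
    """Probability of exactly k counts when mu counts are expected."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

counts = list(range(21))                             # 0 to 20 counts
pmf_low = [poisson_pmf(k, 0.2) for k in counts]      # low-count regime
pmf_high = [poisson_pmf(k, 10.0) for k in counts]    # higher-count regime

for k in counts:
    print(f"{k:2d}  {pmf_low[k]:.4f}  {pmf_high[k]:.4f}")
```

At mu = 0.2 most of the probability sits at zero counts and the distribution is strongly asymmetric, which is why Gaussian approximations are a poor fit in the low-count regime.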
1. For a Single Pixel:
1.1/ Examples: bkg = 0.2 cts, measured 2 cts; bkg = 10 cts, measured 2 cts
  Likelihood distribution: tails or evidence
  Look at likelihood of unknown flux s
  Use your favorite summaries: marginalize; CL/CR; etc.
1.2/ Additionally, bkg counts/effarea can all vary
1.3/ Assumed SOLVED (i.e. there are good procedures). 10
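The two single-pixel examples (2 counts measured on a background of 0.2, and 2 counts on a background of 10) can be worked through numerically. The sketch below evaluates the Poisson likelihood of the unknown source intensity s on a grid and normalizes it, which amounts to assuming a flat prior on s >= 0. That prior, the grid limits, and the 95% level are assumptions of this example, not choices stated in the talk.

```python
import math

def log_likelihood(s, n, b):
    """Log Poisson likelihood of n counts with expected counts s + b."""
    mu = s + b
    return n * math.log(mu) - mu - math.lgamma(n + 1)

def posterior_on_grid(n, b, s_max=30.0, steps=3000):
    """Normalized likelihood of s on [0, s_max] (flat prior assumed)."""
    ds = s_max / steps
    grid = [i * ds for i in range(steps + 1)]
    weights = [math.exp(log_likelihood(s, n, b)) for s in grid]
    norm = sum(weights) * ds
    return grid, [w / norm for w in weights]

def upper_limit(grid, density, level=0.95):
    """Smallest grid value of s whose cumulative probability reaches level."""
    ds = grid[1] - grid[0]
    cumulative = 0.0
    for s, p in zip(grid, density):
        cumulative += p * ds
        if cumulative >= level:
            return s
    return grid[-1]

grid_a, post_a = posterior_on_grid(n=2, b=0.2)
grid_b, post_b = posterior_on_grid(n=2, b=10.0)

mode_a = grid_a[post_a.index(max(post_a))]
mode_b = grid_b[post_b.index(max(post_b))]
limit_a = upper_limit(grid_a, post_a)
limit_b = upper_limit(grid_b, post_b)
```

For the first case the likelihood peaks at s = n - b = 1.8. For the second, the measured counts fall below the expected background, so the likelihood decreases over the whole range s >= 0, the mode sits at zero, and the natural summary is an upper limit.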
WORKED EXAMPLE: All-Sky >1 GeV Gamma-Rays: Galactic Diffuse [Figure: all-sky map, color scale from 2 to 18] 11
WORKED EXAMPLE: All-Sky >1 GeV Gamma-Rays: Galactic Diffuse [Figure: all-sky map, color scale from 50000 to 250000] 12
Background Known from Physics [Figure: map, color scale from 2E-07 to 1E-06] 13
On Top Of One-Pixel Procedure, add: SHAPE
Like Dixon, Hartman, Kolaczyk et al., use a very flexible non-physics model
We use a multiscale (Haar wavelet-like) model
In addition: in a full likelihood framework (also: instrument smearing, i.e. PSF, included)
Work in progress - Esch et al. EMC2
Re-work by D. van Dyk, A. Roy, SAMSI06 CHASC
Mistakes: probably mine 14
FLEXIBLE MODEL: EMC2 [Figure: map, color scale from 0.0001 to 0.0009]
Four-way splits
Smoothing parameters for flux ratios fit
Hyperparameter (for smoothing kappas) set via MC 15
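The "four-way splits" idea can be illustrated on a small image: the image is re-expressed as a total intensity plus, at each scale, the fraction of every parent pixel that goes to each of its four children. The sketch below shows only this deterministic reparameterization for a square image whose side is a power of two. It is not the EMC2 code, it does not include the smoothing prior on the split fractions or the PSF, and all names are hypothetical.

```python
def coarsen(image):
    """Sum each 2x2 block, halving the image size along both axes."""
    half = len(image) // 2
    return [[image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]
             for c in range(half)] for r in range(half)]

def four_way_splits(image):
    """Return the total and the split fractions per level, coarse to fine."""
    levels = []
    current = image
    while len(current) > 1:
        parent = coarsen(current)
        fractions = []
        for r in range(len(parent)):
            row = []
            for c in range(len(parent)):
                children = (current[2 * r][2 * c], current[2 * r][2 * c + 1],
                            current[2 * r + 1][2 * c], current[2 * r + 1][2 * c + 1])
                total = parent[r][c]
                # An empty parent carries no information; use an even split.
                row.append(tuple(x / total if total > 0 else 0.25 for x in children))
            fractions.append(row)
        levels.append(fractions)
        current = parent
    levels.reverse()
    return current[0][0], levels

def rebuild(total, levels):
    """Invert four_way_splits: push the total down through the fractions."""
    current = [[float(total)]]
    for fractions in levels:
        size = 2 * len(current)
        finer = [[0.0] * size for _ in range(size)]
        for r in range(len(current)):
            for c in range(len(current)):
                f = fractions[r][c]
                finer[2 * r][2 * c] = current[r][c] * f[0]
                finer[2 * r][2 * c + 1] = current[r][c] * f[1]
                finer[2 * r + 1][2 * c] = current[r][c] * f[2]
                finer[2 * r + 1][2 * c + 1] = current[r][c] * f[3]
        current = finer
    return current

# Hypothetical 4x4 image of intensities.
image = [[1, 0, 2, 3],
         [0, 1, 4, 1],
         [0, 0, 5, 2],
         [0, 0, 1, 0]]
total, levels = four_way_splits(image)
restored = rebuild(total, levels)
```

In a multiscale model of this kind, the smoothing acts on the split fractions: pulling them toward an even four-way split yields a smoother image at that scale, while the total is left free.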
DOES THIS WORK? Yes. AS HYPOTHESIS TEST? Qualified yes. AS RESIDUALS? Qualified yes. Simulations follow. 16
Simulation 1: GALPROP Gamma-rays from Galactic Diffuse Gas:
Cosmic rays impinging on gas clouds (gas, particles measured separately)
Cosmic rays impinging on photon field
[Figure: map, color scale from 2E-07 to 1E-06] 17
Simulation 1: GALPROP Gamma-rays from Galactic Diffuse Gas Convolved with Exposure Map: 18
Simulation 1: GALPROP Gamma-rays from Galactic Diffuse Gas * EGRET Exposure. Simulated Poisson Counts: [Figure: map, color scale from 0 to 100] 19
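The simulation step on this slide (model intensity times exposure, then Poisson counts) can be sketched as follows. The small arrays are invented for illustration and are not GALPROP or EGRET values; the units are assumed to be chosen so that intensity times exposure gives expected counts per pixel. The sampler is Knuth's multiplication method, a standard-library stand-in that is adequate for modest expected counts.

```python
import math
import random

def sample_poisson(mean, rng):
    """Draw one Poisson variate by Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

# Hypothetical model intensity and exposure maps (3 rows by 4 columns).
model_intensity = [[2e-7, 4e-7, 8e-7, 1e-6],
                   [2e-7, 6e-7, 1e-6, 8e-7],
                   [2e-7, 2e-7, 4e-7, 4e-7]]
exposure = [[5e7, 5e7, 1e8, 1e8],
            [5e7, 1e8, 1e8, 1e8],
            [0.0, 5e7, 5e7, 5e7]]   # one pixel was never observed

rng = random.Random(42)

# Expected counts per pixel: model intensity multiplied by exposure.
expected = [[m * e for m, e in zip(m_row, e_row)]
            for m_row, e_row in zip(model_intensity, exposure)]

# One simulated data set: an independent Poisson draw in every pixel.
counts = [[sample_poisson(mu, rng) for mu in row] for row in expected]
```

Repeating the last line with fresh draws gives an ensemble of simulated data sets from the same model, which is what the results slides run the procedure on.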
* Physical Model is Background; it is also the Best Fit
* Multiscale Model is Residual
* If the Residual is SIGNIFICANT, then the model is not a good enough fit.
* Shows the residual at the same time
* Because it is in a likelihood framework, can say something about uncertainties. 20
Results of Procedure for Simulated Data = Model [Figure panels: Mode, Mean, Sigma, Skew] 21
Results of Procedure for Simulated Data = Model [Figure: histogram of log10(expecttotalcount); x axis log10(expecttotalcount) from -2.5 to 1.0, y axis Frequency from 0 to 300] 22
Simulation II: GALPROP + New Component [Figure: two maps, color scales from 2E-07 to 1E-06 and from 0 to 10] 23
Simulation II: GALPROP Gamma-rays from Galactic Diffuse Gas * EGRET Exposure. Simulated Poisson Counts: [Figure: map, color scale from 10 to 100] 24
Results of Procedure for Simulated Data With Bright Unknown Component [Figure panels: Mode, Mean, Sigma, Skew] 25
Results of Procedure for Simulated Data With Bright Unknown Component: See DS9 Movie 26
Results of Procedure for Simulated Data With Unknown [Figure: histograms of log10(expecttotalcount); y axis Frequency] 27
Results of Procedure for Simulated Data With Unknown (new component ~8x fainter than previous example) [Figure] 28
Results of Procedure for Simulated Data = Model [Figure: histogram of log10(expecttotalcount); x axis log10(expecttotalcount) from -2.5 to 1.0, y axis Frequency from 0 to 300] 29
Results of Procedure for Simulated Data With Faint Unknown Component [Figure panels: Mode, Mean, Sigma, Skew] 30
Conclusion:
Many related methods not mentioned here, e.g. John Rice's SAMSI Intro Workshop talk, HC, etc.
Many challenges in making it work more automatically
See me for All-Sky Challenge Data (see Alex Young for Solar Data)
EVEN SO - Very encouraging. 31