Multiscale Autocorrelation Function: a new approach to anisotropy studies

Manlio De Domenico (1,2), H. Lyberis (3,4)

(1) Laboratory for Complex Systems, Scuola Superiore di Catania, Catania (Italy)
(2) Istituto Nazionale di Fisica Nucleare, Sez. di Catania, Catania (Italy)
(3) CNRS/IN2P3 - IPN Orsay, Paris (France)
(4) Dipartimento di Fisica, Università di Torino, Torino (Italy)

CRIS, Catania, 17 Sep 2010

Take Home Message

A new method for anisotropy signal detection in the arrival direction distribution of particles (nuclei, ν, γ, ...):

1. It depends on one parameter, i.e. the clustering scale Θ
2. It is based on information theory and extreme value theory
3. It is unbiased against the null hypothesis of isotropy
4. It provides high discrimination power against the alternative hypothesis of anisotropy
5. It is semi-analytical
6. It requires a few minutes to analyze (and penalize) data sets of up to $10^4$ objects

Main Ref.: M.D.D., A. Insolia, H. Lyberis, M. Scuderi, arXiv:1001.1666

Multiscale Autocorrelation Function I

Divide the observed sky into equal-area boxes. The number of boxes defines the angular scale Θ of the analysis.

Procedure:
- $\psi_i(\Theta)$: density of events falling in the box $B_i$
- $\bar{\psi}_i(\Theta)$: expected density of events falling in the box $B_i$ from an isotropic distribution

Define the global deviation from isotropy as the Kullback-Leibler divergence (1951):

$$A(\Theta) = D_{KL}(\psi \,\|\, \bar{\psi}) = \sum_i \psi_i(\Theta) \log \frac{\psi_i(\Theta)}{\bar{\psi}_i(\Theta)}$$

We define the Multiscale Autocorrelation Function (MAF) as

$$s(\Theta) = \frac{|A_{data}(\Theta) - \langle A_{iso}(\Theta) \rangle|}{\sigma_{A_{iso}}(\Theta)}$$
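
The two definitions on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the box densities are taken as already normalized to sum to 1, empty boxes contribute nothing to the sum, the deviation is taken in absolute value (consistent with the half-normal distribution reported for s(Θ)), and the function names `kl_divergence` and `maf` are hypothetical, not taken from the authors' code.

```python
import math
import statistics


def kl_divergence(psi, psi_bar):
    """D_KL(psi || psi_bar) = sum_i psi_i * log(psi_i / psi_bar_i).

    psi and psi_bar are per-box densities, each normalized to sum to 1.
    Boxes with psi_i == 0 contribute nothing (usual 0*log(0) = 0 convention).
    """
    return sum(p * math.log(p / q) for p, q in zip(psi, psi_bar) if p > 0)


def maf(a_data, a_iso_samples):
    """s = |A_data - <A_iso>| / sigma_A_iso, with <A_iso> and sigma_A_iso
    estimated from a list of A values computed on isotropic Monte Carlo skies."""
    mean_iso = statistics.fmean(a_iso_samples)
    sigma_iso = statistics.stdev(a_iso_samples)
    return abs(a_data - mean_iso) / sigma_iso


# Toy example (made-up numbers): 4 equal-area boxes, uniform isotropic expectation.
psi_bar = [0.25, 0.25, 0.25, 0.25]
psi_data = [0.40, 0.30, 0.20, 0.10]
a_data = kl_divergence(psi_data, psi_bar)
```

In a real analysis the isotropic expectation would also carry the detector exposure, so `psi_bar` would generally not be uniform.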

Multiscale Autocorrelation Function II

- $H_0$: null hypothesis of an underlying isotropic distribution
- $H_1$: alternative hypothesis, $H_0$ is false

For any scale Θ in the parameter space $\mathcal{P}$, estimate the chance probability of obtaining $s_{iso}(\Theta') \ge s_{data}(\Theta)$ from isotropic Monte Carlo skies:

$$p(\Theta) = \Pr\left( s_{iso}(\Theta') \ge s_{data}(\Theta) \mid H_0, \forall \Theta' \in \mathcal{P} \right)$$

The condition over all $\Theta' \in \mathcal{P}$ takes into account the penalization for the scan over Θ.

Multiscale Autocorrelation Function III

A fixed grid may cut existing clusters, causing loss of information or reducing the signal.

To avoid this, at any scale Θ each point is extended into 8 exposure-weighted points whose distance from the original one is Θ/2 (dynamical binning).

MAF uses the density of the extended points.
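
One possible way to build the extended points is sketched below. The slide does not say how the 8 points are arranged around the original direction, so the even spacing in bearing used here is an assumption, and the exposure weighting is left out entirely. The function names are hypothetical; all angles are in radians.

```python
import math


def extended_points(lon, lat, theta, n_points=8):
    """Return n_points (lon, lat) pairs at angular distance theta/2 from (lon, lat).

    Assumption: the points are evenly spaced in bearing around the original
    direction. Exposure weights are not computed here.
    """
    d = theta / 2.0
    points = []
    for k in range(n_points):
        bearing = 2.0 * math.pi * k / n_points
        # Standard great-circle "destination point" formulas.
        sin_lat2 = (math.sin(lat) * math.cos(d)
                    + math.cos(lat) * math.sin(d) * math.cos(bearing))
        sin_lat2 = max(-1.0, min(1.0, sin_lat2))
        lat2 = math.asin(sin_lat2)
        lon2 = lon + math.atan2(
            math.sin(bearing) * math.sin(d) * math.cos(lat),
            math.cos(d) - math.sin(lat) * sin_lat2,
        )
        points.append((lon2, lat2))
    return points


def angular_separation(p, q):
    """Great-circle angle between two (lon, lat) points, in radians."""
    (lon1, lat1), (lon2, lat2) = p, q
    cos_sep = (math.sin(lat1) * math.sin(lat2)
               + math.cos(lat1) * math.cos(lat2) * math.cos(lon1 - lon2))
    return math.acos(max(-1.0, min(1.0, cos_sep)))


# Example: one event at (30 deg, -20 deg), analysis scale of 10 degrees.
origin = (math.radians(30.0), math.radians(-20.0))
pts = extended_points(origin[0], origin[1], math.radians(10.0))
```

Each returned point lies Θ/2 away from the original event, so a cluster that straddles a box boundary still contributes density to the neighbouring boxes.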

Understanding MAF

Our numerical studies show that dynamical binning by means of extended points recovers the correct information on the amount of clustering in the data.

The scale Θ at which the chance probability is minimum is the significant clustering scale: the scale at which the number of points exceeds the number expected by chance, regardless of any particular configuration of points, e.g. doublets or triplets.

Example: 3 skies of 60 events each. 20% of the events come from a single source, with 3 different smearing angles ρ = 4°, 10°, 25°; the remaining 80% are isotropic.

MAF: Statistical Features Under $H_0$ (Isotropy) I

Numerical experiments under the null hypothesis. In general, the scale Θ at which the minimum chance probability occurs is reported.

Theorem: under $H_0$, all p-values

$$\min\{p(\Theta)\} = \min_{\Theta \in \mathcal{P}} \Pr\left( s_{iso}(\Theta') \ge s_{data}(\Theta) \mid H_0, \forall \Theta' \in \mathcal{P} \right)$$

corresponding to isotropic skies should be equally likely.

The distribution of $\min\{p(\Theta)\}$ is flat, as expected, regardless of the data set size: the method is unbiased against $H_0$.

MAF: Statistical Features Under $H_0$ (Isotropy) II

For each $\Theta \in \mathcal{P}$, we investigate the density of

$$s(\Theta) = \frac{|A_{data}(\Theta) - \langle A_{iso}(\Theta) \rangle|}{\sigma_{A_{iso}}(\Theta)}$$

The distribution of $s(\Theta)$ is half-normal for all $\Theta \in \mathcal{P}$:

$$G_{1/2}[s(\Theta)] = \frac{2}{\sqrt{2\pi}} e^{-s^2(\Theta)/2}$$

MAF: Statistical Features Under $H_0$ (Isotropy) III

We investigate the density of $\max\{s(\Theta)\}$, used for penalizing p-values.

Fit results: $\mu = 1.737 \pm 0.001$, $\sigma = 0.464 \pm 0.001$, $\chi^2/\mathrm{ndf} \approx 10^{-4}$

The density is the generalized Gumbel distribution (Fisher-Tippett Type I from extreme value theory):

$$g(z) = \frac{1}{\sigma} \exp\left[ -z - e^{-z} \right], \qquad z = \frac{\max\{s(\Theta)\} - \mu}{\sigma}$$

It does NOT depend on the data set size!

MAF: Statistical Features Under $H_0$ (Isotropy) IV

Summary

1. MAF is unbiased against the null hypothesis
2. The penalization procedure can be performed analytically:

$$p(\max\{s(\Theta)\}) = 1 - \exp\left[ -\exp\left( -\frac{\max\{s(\Theta)\} - \mu}{\sigma} \right) \right]$$

We find excellent agreement between the estimation through Monte Carlo realizations and the analytical estimation. The analytical computation is 15 times faster.
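
The analytical penalization amounts to evaluating the Gumbel survival function at the observed maximum of s(Θ). A minimal sketch follows, using as defaults the fitted values µ = 1.737 and σ = 0.464 quoted in the slides; the function name `penalized_p_value` is hypothetical.

```python
import math

# Gumbel location and scale from the fit quoted in the slides.
MU = 1.737
SIGMA = 0.464


def penalized_p_value(max_s, mu=MU, sigma=SIGMA):
    """p(max s) = 1 - exp(-exp(-(max_s - mu) / sigma)).

    expm1 is used so that very small p-values do not lose precision
    to the subtraction 1 - (something close to 1).
    """
    z = (max_s - mu) / sigma
    return -math.expm1(-math.exp(-z))


# Example: penalized chance probability for an observed maximum of 4.0.
p_example = penalized_p_value(4.0)
```

Because the fitted parameters do not depend on the data set size, the same two numbers can be reused for any sky, which is where the saving in CPU time with respect to a full Monte Carlo penalization comes from.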

Generating Anisotropic Skies I

Ultra High-Energy Cosmic Rays (UHECR)

Detecting anisotropy of UHECR is important for understanding creation and propagation mechanisms, for indirectly investigating extra-galactic magnetic fields, ...

We test MAF against anisotropic mock maps of UHECR built according to some physical constraints:

1. Reference catalog of candidate sources: Active Galactic Nuclei (AGN) with known redshift z < 0.047 (about 200 Mpc) from the Palermo SWIFT-BAT hard X-ray catalogue [Cusumano, G. et al., Astron. Astrop. 510 (2010)]
2. Number of events proportional to the source flux Φ and to $z^{-2}$
3. Magnetic deflections
4. Isotropic background contamination
5. Distribution of events weighted by the exposure of world-wide surface detectors

Why AGN & SWIFT-BAT?

AGN are candidate sources:
- G.R. Farrar and P.L. Biermann, PRL 81 (1998) 3579
- P.G. Tinyakov and I.I. Tkachev, JETP Lett. 74 (2001) 445
- V. Berezinsky et al., astro-ph/0210095 (2002)
- D.F. Torres et al., ApJ 595 (2003)
- P. Auger Coll., Science 318 (2007) 938
- P. Auger Coll., Astrop. Phys. 29 (2008) 188
- I. Zaw, G.R. Farrar, J.E. Greene, ApJ (2009) 696
- P. Auger Coll., In Press (2010), arXiv:1009.1855

SWIFT-BAT provides the most complete and uniform all-sky hard X-ray survey to date.

[Figure: top: VCV; bottom: SWIFT-BAT]

Surface Detectors

Surface detectors (SD) detect Extensive Air Showers (EAS) of particles produced by UHECR, by means of a large array of individual stations.

Full-time operating and fully efficient SD do not observe the sky uniformly. The effective detection area depends on the relative exposure

$$\omega(\delta) \propto \cos\phi_0 \cos\delta \sin\alpha_m + \alpha_m \sin\phi_0 \sin\delta \qquad (1)$$

where $\phi_0$ is the detector latitude and

$$\alpha_m = \begin{cases} 0 & \xi > 1 \\ \pi & \xi < -1 \\ \cos^{-1}\xi & \text{otherwise} \end{cases} \qquad \text{with} \qquad \xi \equiv \frac{\cos\theta_{max} - \sin\phi_0 \sin\delta}{\cos\phi_0 \cos\delta} \qquad (2)$$

[P. Sommers, Astrop. Phys. 14, 271 (2001)]
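
Eqs. (1) and (2) translate directly into code. The sketch below returns the relative exposure up to its normalization constant, with all angles in radians; here δ is read as the declination and θ_max as the maximum accepted zenith angle, as in the cited Sommers reference. The function name and the guard for a vanishing denominator are additions made for illustration.

```python
import math


def relative_exposure(dec, lat0, theta_max):
    """Relative exposure omega(dec), up to a constant factor, following Eqs. (1)-(2).

    dec: declination, lat0: detector latitude, theta_max: maximum zenith angle.
    All angles are in radians.
    """
    num = math.cos(theta_max) - math.sin(lat0) * math.sin(dec)
    den = math.cos(lat0) * math.cos(dec)
    # Guard (added here): treat an exactly vanishing denominator as xi = +/- infinity.
    xi = math.copysign(math.inf, num) if den == 0.0 else num / den

    if xi > 1.0:
        alpha_m = 0.0          # declination never visible
    elif xi < -1.0:
        alpha_m = math.pi      # declination always visible
    else:
        alpha_m = math.acos(xi)

    return (math.cos(lat0) * math.cos(dec) * math.sin(alpha_m)
            + alpha_m * math.sin(lat0) * math.sin(dec))


# Example: a detector at latitude 35.20 deg S accepting zenith angles up to 60 deg.
lat_south = math.radians(-35.20)
zenith_cut = math.radians(60.0)
omega_equator = relative_exposure(0.0, lat_south, zenith_cut)
```

For this example the exposure vanishes for declinations north of about 24.8°, i.e. latitude plus maximum zenith angle, and is largest towards the southern celestial pole.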

World-Wide Surface/Hybrid EAS Detectors Map

World-Wide Surface/Hybrid EAS Detectors Info

Experiment   φ_0        θ_max   Exp. (10^16 m^2 s sr)   λ       #Ev.
Volcano R.   35.15° N   70°     0.2                     1.000   6
Yakutsk      61.60° N   60°     1.8                     0.625   20
H. Park      53.97° N   74°     -                       1.000   7
AGASA        35.78° N   45°     4.0                     0.750   29
SUGAR        30.43° S   70°     5.3                     0.500   13
P. Auger     35.20° S   60°     28.4                    1.200   27

[M. Kachelrieß and D. Semikoz, Astrop. Phys. 26, 10 (2006)]
[V. Berezinsky, Nucl. Phys. B - Proc. Supp. 188, 227 (2009)]

Current Data I: UHECR

102 UHECR with rescaled energy $E \ge 4.0 \times 10^{19}$ eV from world-wide SD

Current Data II: Catalog + UHECR

- Nearby AGN within 200 Mpc (z < 0.047) and UHECR
- Catalog + UHECR, with flux-weighted catalog

Hypothesis Testing: Statistical Errors

                Test accepts H_0       Test rejects H_0
H_0 is true     OK: 1 - α (CL)         α: Type I Error
H_1 is true     β: Type II Error       OK: 1 - β (Power)

Generating Anisotropic Skies II

Simulation Setup

The anisotropic mock map of $10^3$ skies is generated according to:

1. Events are Gaussian-distributed (on the sphere) within 3° around the sources
2. 30% of events are isotropically distributed
3. 70% of events are distributed according to our constraints
4. Power is estimated for different values of α

Simulated Mock Map

[Figure panels: Mock Map ($10^3$ skies of 102 events); Constraints]

MAF: Statistical Features Under $H_1$ (Anisotropy)

Power $1 - \beta$: the probability to (correctly) reject $H_0$ when it is, in fact, false (or the probability to accept $H_1$ when it is, in fact, true).
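
In a simulation study, the power at a given significance level α can be estimated as the fraction of anisotropic mock skies for which the test rejects $H_0$. The sketch below assumes that each mock sky has already been reduced to one penalized p-value and that rejection means p ≤ α; both the rejection convention and the function name are assumptions made for illustration.

```python
def estimate_power(p_values_h1, alpha):
    """Estimate the power 1 - beta at significance level alpha.

    p_values_h1: penalized p-values, one per mock sky generated under H_1.
    Returns the fraction of skies with p <= alpha, i.e. correct rejections of H_0.
    """
    if not p_values_h1:
        raise ValueError("need at least one p-value")
    rejected = sum(1 for p in p_values_h1 if p <= alpha)
    return rejected / len(p_values_h1)


# Toy example with made-up p-values from four anisotropic skies.
toy_power = estimate_power([0.001, 0.02, 0.5, 0.009], alpha=0.01)
```

Applying the same function to p-values from isotropic skies gives the observed Type I error rate, which should stay close to α for an unbiased test.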

Summary and conclusions

Multiscale Autocorrelation Function: Summary

- New technique that does not depend on particular configurations of points (e.g. doublets, triplets); physical information: the significant clustering scale
- Physical and astrophysical constraints can easily be taken into account (by weighting densities, ...)

1. Unbiased method against $H_0$ / high discrimination power on anisotropy signal
2. Semi-analytical: drastically reduces CPU time

We thank the P. Auger Collaboration for fruitful discussions and O. Deligny for his invaluable help.

Main Ref.: M.D.D., A. Insolia, H. Lyberis, M. Scuderi, arXiv:1001.1666

Backup Slides

CPU Time

[Figure: CPU time required by MAF analysis]

Power Estimation

[Figure: most significant clustering scale Θ in power analysis (α = 1%)]

Kullback-Leibler Divergence

Let $P$ and $Q$ be two probability distributions, with densities $p(x)$ and $q(x)$, respectively. The Kullback-Leibler (KL) divergence is a measure quantifying the error in approximating the density $p(x)$ by means of $q(x)$, and it is defined as

$$D_{KL}(p \,\|\, q) = \int p(x) \log \frac{p(x)}{q(x)} \, dx \qquad (3)$$

Let $\hat{P}$ be the empirical distribution of random outcomes $x_i$ ($i = 1, 2, ..., n$) of the true distribution $P$, putting the probability $n^{-1}$ on each outcome as

$$\hat{p}(x) = \frac{1}{n} \sum_{i=1}^{n} \delta(x - x_i) \qquad (4)$$

and let $Q_\Theta$ be the statistical model for the data, depending on the unknown parameter Θ. It follows that

$$D_{KL}(\hat{p} \,\|\, q_\Theta) = -H(\hat{p}) - \int \hat{p}(x) \log q(x|\Theta) \, dx \qquad (5)$$

where $H(\hat{p})$ is the information entropy of $\hat{p}$, not depending on Θ, whereas $\hat{p}$ and $q_\Theta = q(x|\Theta)$ are the corresponding densities of $\hat{P}$ and $Q_\Theta$, respectively. Putting Eq. (4) in the right-hand side of Eq. (5):

$$D_{KL}(\hat{p} \,\|\, q_\Theta) = -H(\hat{p}) - \frac{1}{n} \sum_{i=1}^{n} \log q(x_i|\Theta) = -H(\hat{p}) - \frac{1}{n} L_q(\Theta|x) \qquad (6)$$

where $L_q(\Theta|x)$ is the log-likelihood of the statistical model. It directly follows that

$$\arg\min_\Theta D_{KL}(\hat{p} \,\|\, q_\Theta) = \arg\max_\Theta \frac{1}{n} L_q(\Theta|x) \qquad (7)$$

where the function $\arg\min$ ($\arg\max$) $f(\Theta)$ retrieves the value of Θ at which the function $f(\Theta)$ attains its minimum (maximum). Hence, another way to obtain the maximum likelihood estimate is to minimize the KL divergence.
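
The equivalence between minimizing the KL divergence and maximizing the likelihood can be checked numerically on a toy problem. The sketch below uses a discrete model (a biased coin with parameter Θ) so that the entropy of the empirical distribution is well defined; the data, the grid of Θ values and all names are made up for illustration.

```python
import math

# Toy data: 10 coin tosses, 7 ones and 3 zeros.
data = [1, 1, 1, 0, 1, 1, 0, 1, 0, 1]
n = len(data)

# Empirical distribution: probability 1/n on each outcome, grouped by value.
p_hat = {x: data.count(x) / n for x in set(data)}
entropy = -sum(p * math.log(p) for p in p_hat.values())


def q(x, theta):
    """Statistical model: Pr(x = 1) = theta, Pr(x = 0) = 1 - theta."""
    return theta if x == 1 else 1.0 - theta


def kl(theta):
    """D_KL(p_hat || q_theta) for the discrete model."""
    return sum(p * math.log(p / q(x, theta)) for x, p in p_hat.items())


def log_likelihood(theta):
    """L_q(theta | x) = sum_i log q(x_i | theta)."""
    return sum(math.log(q(x, theta)) for x in data)


# Scan a grid of theta values and compare the two optimizers.
grid = [i / 100 for i in range(1, 100)]
theta_kl = min(grid, key=kl)
theta_ml = max(grid, key=log_likelihood)
```

Both scans select the same grid point, the observed fraction of ones, because the two objective functions differ only by the Θ-independent entropy term and the factor $-1/n$.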

Generalized Extreme Value Distribution

Extreme value theory is the research area dealing with the statistical analysis of the extremal values of a stochastic variable. Let $x_i$ ($i = 1, 2, ..., n$) be i.i.d. random outcomes of a distribution $F$. If $M_n = \max\{x_1, x_2, ..., x_n\}$, the probability that the maximum $M_n$ does not exceed $x$ is

$$\Pr(M_n \le x) = \Pr(x_1 \le x, x_2 \le x, ..., x_n \le x) = F^n(x)$$

It can be shown that the limiting distribution of $F^n(x)$ is degenerate, so $M_n$ should be normalized. If there exist sequences of real constants $a_n > 0$ and $b_n$ such that the normalized maximum, for which

$$\Pr\left( \frac{M_n - b_n}{a_n} \le x \right) = F^n(a_n x + b_n),$$

has a non-degenerate limiting distribution, then

$$\lim_{n \to \infty} F^n(a_n x + b_n) = G(x) \qquad (8)$$

The function $G(x)$ is the generalized extreme value (GEV) or Fisher-Tippett distribution

$$G(z) = \begin{cases} \exp\left( -e^{-z} \right) & \xi = 0 \\ \exp\left[ -(1 - \xi z)^{1/\xi} \right] & \xi \ne 0 \end{cases}, \qquad z = \frac{x - \mu}{\sigma} \qquad (9)$$

defined for $1 - \xi z > 0$ if $\xi \ne 0$ and for $z \in \mathbb{R}$ if $\xi = 0$, where µ, σ and ξ are the location, scale and shape parameters, respectively. The Gumbel distribution is related to the distribution of maxima and it is retrieved for $\xi = 0$. The corresponding probability density $g(x)$ is easily obtained from $G$ as

$$g(x) = \frac{1}{\sigma} \exp\left[ -\frac{x - \mu}{\sigma} - \exp\left( -\frac{x - \mu}{\sigma} \right) \right] \qquad (10)$$

Relation to the mean $\tilde{\mu}$ and to the standard deviation $\tilde{\sigma}$ of the distribution ($\gamma = 0.577215...$ is the Euler constant):

$$\tilde{\mu} = \mu + \gamma \sigma, \qquad \tilde{\sigma}^2 = \frac{\pi^2}{6} \sigma^2 \qquad (11)$$
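
The relations in Eq. (11) can be checked with a short simulation. The sketch below draws Gumbel variates by inverting the cumulative distribution $G(x) = \exp[-\exp(-(x - \mu)/\sigma)]$ and compares the sample mean and standard deviation with the analytical values. The location and scale are the fitted values quoted earlier in the slides; the sample size, seed and function names are arbitrary choices made for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def gumbel_mean_std(mu, sigma):
    """Mean and standard deviation of a Gumbel distribution, Eq. (11)."""
    mean = mu + EULER_GAMMA * sigma
    std = math.pi / math.sqrt(6.0) * sigma
    return mean, std


def sample_gumbel(mu, sigma, rng):
    """Inverse-transform sampling: x = mu - sigma * log(-log(u)), u uniform in (0, 1)."""
    u = max(rng.random(), 1e-300)  # avoid log(0) in the (very unlikely) case u == 0
    return mu - sigma * math.log(-math.log(u))


rng = random.Random(12345)
mu, sigma = 1.737, 0.464
samples = [sample_gumbel(mu, sigma, rng) for _ in range(100_000)]

mean_th, std_th = gumbel_mean_std(mu, sigma)
mean_mc = math.fsum(samples) / len(samples)
std_mc = math.sqrt(math.fsum((x - mean_mc) ** 2 for x in samples) / len(samples))
```

With 100,000 samples the Monte Carlo estimates are expected to agree with the analytical mean and standard deviation to within a few parts in a thousand.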