Algorithms for Higher Order Spatial Statistics

Similar documents
Higher Order Statistics in Cosmology

Cosmic Statistics of Statistics: N-point Correlations

Big Bang, Big Iron: CMB Data Analysis at the Petascale and Beyond

Data analysis of massive data sets a Planck example

(Gaussian) Random Fields

Extreme value sta-s-cs of smooth Gaussian random fields in cosmology

An Estimator for statistical anisotropy from the CMB. CMB bispectrum

Information in Galaxy Surveys Beyond the Power Spectrum

Wavelets Applied to CMB Analysis

Fisher Matrix Analysis of the Weak Lensing Spectrum

Correlations between the Cosmic Microwave Background and Infrared Galaxies

Anisotropy in the CMB

Cosmology II: The thermal history of the Universe

Active Learning for Fitting Simulation Models to Observational Data

Galaxies 626. Lecture 3: From the CMBR to the first star

Classifying Galaxy Morphology using Machine Learning

Cosmological N-Body Simulations and Galaxy Surveys

What forms AGN Jets? Magnetic fields are ferociously twisted in the disk.

Lecture 37 Cosmology [not on exam] January 16b, 2014

Cosmic Evolution: A Scientific Creation Myth Like No Other

Galaxy Bias from the Bispectrum: A New Method

Cosmic Microwave Background Introduction

Multi-GPU Simulations of the Infinite Universe

ASTR Data on a surface 2D

Measures of Cosmic Structure

AST4320: LECTURE 10 M. DIJKSTRA

The Millennium Simulation: cosmic evolution in a supercomputer. Simon White Max Planck Institute for Astrophysics

The Tools of Cosmology. Andrew Zentner The University of Pittsburgh

Moment of beginning of space-time about 13.7 billion years ago. The time at which all the material and energy in the expanding Universe was coincident

Physics 661. Particle Physics Phenomenology. October 2, Physics 661, lecture 2

Weak gravitational lensing of CMB

Science with large imaging surveys

POWER SPECTRUM ESTIMATION FOR J PAS DATA

2. What are the largest objects that could have formed so far? 3. How do the cosmological parameters influence structure formation?

Communication Theory II

Lecture 09. The Cosmic Microwave Background. Part II Features of the Angular Power Spectrum

Statistical Applications in the Astronomy Literature II Jogesh Babu. Center for Astrostatistics PennState University, USA

Cosmology. Chapter 18. Cosmology. Observations of the Universe. Observations of the Universe. Motion of Galaxies. Cosmology

Astronomy 113. Dr. Joseph E. Pesce, Ph.D. The Big Bang & Matter. Olber s Paradox. Cosmology. Olber s Paradox. Assumptions 4/20/18

Astronomy 113. Dr. Joseph E. Pesce, Ph.D Joseph E. Pesce, Ph.D.

Cosmology. Clusters of galaxies. Redshift. Late 1920 s: Hubble plots distances versus velocities of galaxies. λ λ. redshift =

AST5220 lecture 2 An introduction to the CMB power spectrum. Hans Kristian Eriksen

Astr 102: Introduction to Astronomy. Lecture 16: Cosmic Microwave Background and other evidence for the Big Bang

Cosmic Inflation Lecture 16 - Monday Mar 10

Large-scale structure as a probe of dark energy. David Parkinson University of Sussex, UK

(Astronomy for Dummies) remark : apparently I spent more than 1 hr giving this lecture

Chapter 18. Cosmology. Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

N-body Simulations. On GPU Clusters

What do we really know about Dark Energy?

Statistics of the Universe: Exa-calculations and Cosmology's Data Deluge

FOUR-YEAR COBE 1 DMR COSMIC MICROWAVE BACKGROUND OBSERVATIONS: MAPS AND BASIC RESULTS

The oldest science? One of the most rapidly evolving fields of modern research. Driven by observations and instruments

Can one reconstruct masked CMB sky?

AST5220 lecture 2 An introduction to the CMB power spectrum. Hans Kristian Eriksen

Cosmology with high (z>1) redshift galaxy surveys

Formation of z~6 Quasars from Hierarchical Galaxy Mergers

arxiv:astro-ph/ v1 15 Aug 2005

Astronomy: The Big Picture. Outline. What does Hubble s Law mean?

Derivation of parametric equations of the expansion of a closed universe with torsion. Prastik Mohanraj

El Universo en Expansion. Juan García-Bellido Inst. Física Teórica UAM Benasque, 12 Julio 2004

arxiv:astro-ph/ v1 11 Oct 2002

The Cosmic Microwave Background

Preliminaries. Growth of Structure. Today s measured power spectrum, P(k) Simple 1-D example of today s P(k) Growth in roughness: δρ/ρ. !(r) =!!

4.3 The accelerating universe and the distant future

Astronomy 210 Final. Astronomy: The Big Picture. Outline

MIT Exploring Black Holes

Cosmology & CMB. Set5: Data Analysis. Davide Maino

Lecture #24: Plan. Cosmology. Expansion of the Universe Olber s Paradox Birth of our Universe

New Multiscale Methods for 2D and 3D Astronomical Data Set

The Cosmological Principle

MASAHIDE YAMAGUCHI. Quantum generation of density perturbations in the early Universe. (Tokyo Institute of Technology)

Cosmic Microwave Background

arxiv: v1 [astro-ph.co] 3 Apr 2019

PoS(ICRC2017)765. Towards a 3D analysis in Cherenkov γ-ray astronomy

Cosmic Microwave Background

Multi-Field Inflation with a Curved Field-Space Metric

N-body Simulations and Dark energy

The Growth of Structure Read [CO 30.2] The Simplest Picture of Galaxy Formation and Why It Fails (chapter title from Longair, Galaxy Formation )

The cosmic background radiation II: The WMAP results. Alexander Schmah

Perturbation Theory. power spectrum from high-z galaxy. Donghui Jeong (Univ. of Texas at Austin)

Nucleosíntesis primordial

Exam 3 Astronomy 100, Section 3. Some Equations You Might Need

the evidence that the size of the observable Universe is changing;

X t = a t + r t, (7.1)

Galaxy Formation Seminar 2: Cosmological Structure Formation as Initial Conditions for Galaxy Formation. Prof. Eric Gawiser

The Expanding Universe

3. It is expanding: the galaxies are moving apart, accelerating slightly The mystery of Dark Energy

Hunting for Primordial Non-Gaussianity. Eiichiro Komatsu (Department of Astronomy, UT Austin) Seminar, IPMU, June 13, 2008

Inflation. Week 9. ASTR/PHYS 4080: Introduction to Cosmology

Astronomy 102: Stars and Galaxies Review Exam 3

n=0 l (cos θ) (3) C l a lm 2 (4)

Cosmology ASTR 2120 Sarazin. Hubble Ultra-Deep Field

Announcements. Homework. Set 8now open. due late at night Friday, Dec 10 (3AM Saturday Nov. 11) Set 7 answers on course web site.

Data Analysis Methods

COMPUTATIONAL STATISTICS IN ASTRONOMY: NOW AND SOON

Present and future redshift survey David Schlegel, Berkeley Lab

Signal Model vs. Observed γ-ray Sky

Statistical techniques for data analysis in Cosmology

Growth of structure in an expanding universe The Jeans length Dark matter Large scale structure simulations. Large scale structure

Shape Measurement: An introduction to KSB

Transcription:

Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy University of Hawaii Future of AstroComputing Conference, SDSC, Dec 16-17

Outline Introduction 1 Introduction 2

Random Fields Introduction Definition A random field is a spatial field with an associated probability measure: P(A)DA. Random fields are abundant in Cosmology. The cosmic microwave background fluctuations constitute a random field on a sphere. Other examples: Dark Matter Distribution, Galaxy Distribution, etc. Astronomers measure particular realization of a random field (ergodicity helps but we cannot avoid cosmic errors )

Definitions Introduction The ensemble average A corresponds to a functional integral over the probability measure. Physical meaning: average over independent realizations. Ergodicity: (we hope) ensemble average can be replaced with spatial averaging. Symmetries: translation and rotation invariance Joint Moments F (N) (x 1,..., x N ) = T (x 1 ),..., T (x N )

Connected Moments These are the most frequently used spatial statistics Typically we use fluctuation fields δ = T / T 1 Connected moments are defined recursively δ 1,..., δ N c = δ 1,..., δ N P δ 1......δ i c... δ j... δ k c... With these the N-point correlation functions are ξ (N) (1,..., N) = δ 1,..., δ N c

Gaussian vs. Non-Gaussian distributions These two have the same two-point correlation function or P(k) These have the same two-point correlation function!

Basic Objects These are N-point correlation functions. Special Cases Two-point functions δ 1 δ 2 Three-point functions δ 1 δ 2 δ 3 Cumulants δr N c = S N δr 2 N 1 Cumulant Correlators Conditional Cumulants δ1 NδM 2 c δ(0)δr N c In the above δ R stands for the fluctuation field smoothed on scale R (different R s could be used for each δ s). Host of alternative statistics exist: e.g. Minkowski functions, void probability, minimal spanning trees, phase correlations, etc.

Complexities Combinatorial explosion of terms N-point quantities have a large configuration space: measurement, visualization, and interpretation become complex. e.g, already for CMB three-point, the total number of bins scales as M 3/2 CPU intensive measurement: M N scaling for N-point statistics of M objects. Theoretical estimation Estimating reliable covariance matrices

Algorithmic Scaling and Moore s Law Computational resources grow exponentially (Astronomical) data acquisition driven by the same technology Data grow with the same exponent Corrolary Any algorithm with a scaling worse then linear will become impossible soon Symmetries, hierarchical structures (kd-trees), MC, computational geometry, approximate methods

Example: Algorithm for 3pt Other algorithms use symmetries θ 1 θ 2 θ 3 θθθ 4 4

Algorithm for 3pt Cont d Naively N 3 calculations to find all triplets in the map: overwhelming (millions of CPU years for WMAP) Regrid CMB sky around each point according to the resolution Use hierarchical algorithm for regridding: N log N Correlate rings using FFT s (total speed: 2 minutes/cross-corr) The final scaling depends on resolution N(log N + N θ N α log N α + N α N θ (N θ + 1)/4)/2 With another cos transform one and a double Hankel transform one can get the bispectrum In WMAP-I: 168 possible cross correlations, about 1.6 million bins altogether. How to interpret such massive measurements?

3pt in WMAP Introduction

Recent Challanges Processors becoming multicore (CPU and GPU) To take advantage of Moore s law: parallelization Disk sizes growing exponentially, but not the IO speed Data size can become so large that reading might dominate processing Not enough to just consider scaling

Alternative view of the algorithm: lossy compression θ 1 θ 2 θ 3 θθθ 4 4

Compression Introduction Compression can increase processing speed simply by the need of reading less data The full compressed data set can be sent to all nodes This enables parallelization in multicore or MapReduce framework For any algorithm specific (lossy compression) is needed

Introduction Another pixellization as lossy compression I. Szapudi Algorithms for Higher Order Statistics

Introduction Fast algorithm for calculating 3pt functions with N log N scaling instead of N 3 Approximate algorithm with a specific lossy compression phase Scaling with resolution and not with data elements Compression in the algorithm enables multicore or MapReduce style parallelization With a different compression we have done approximate likelihood analysis for CMB (Granett,PhD thesis)