Learning an astronomical catalog of the visible universe through scalable Bayesian inference
1 Learning an astronomical catalog of the visible universe through scalable Bayesian inference Jeffrey Regier with the DESI Collaboration Department of Electrical Engineering and Computer Science UC Berkeley December 9, 2016
2 The DESI Collaboration Jeff Regier Ryan Giordano Steve Howard Jon McAuliffe Michael Jordan Andy Miller Ryan Adams Jarrett Revels Andreas Jensen Kiran Pamnany Debbie Bard Rollin Thomas Dustin Lang David Schlegel Prabhat 2 / 31
3 An astronomical image An image from the Sloan Digital Sky Survey covering roughly one quarter square degree of the sky. 3 / 31
4 Faint light sources Most light sources are near the detection limit. 4 / 31
5 Outline 1. our graphical model for astronomical images (Celeste) 2. scaling approximate posterior inference to catalog the visible universe 3. model extensions 5 / 31
6 The Celeste graphical model 6 / 31
7 Scientific color priors 7 / 31
8 Galaxies: light-density model
The light density for galaxy s is modeled as a mixture of two extremal galaxy prototypes:
h_s(w) = θ_s h_{s1}(w) + (1 − θ_s) h_{s0}(w).
Each prototype (i = 0 or i = 1) is a mixture of bivariate normal distributions:
h_{si}(w) = Σ_{j=1}^{J} η_{ij} φ(w; µ_s, ν_{ij} Q_s).
The shared covariance matrix Q_s accounts for the scale σ_s, rotation ϕ_s, and axis ratio ρ_s.
[Figures: an elliptical galaxy (θ_s = 0) and a spiral galaxy (θ_s = 1).]
8 / 31
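As a concrete illustration of the two-prototype mixture, here is a small Python sketch (the talk's implementation is in Julia; the weights η, scalings ν, and shape parameters below are invented for illustration):

```python
import numpy as np
from scipy.stats import multivariate_normal

def shape_covariance(sigma, phi, rho):
    """Shared covariance Q_s from scale sigma, rotation phi, axis ratio rho."""
    R = np.array([[np.cos(phi), -np.sin(phi)],
                  [np.sin(phi),  np.cos(phi)]])
    D = np.diag([sigma ** 2, (rho * sigma) ** 2])
    return R @ D @ R.T

def galaxy_light_density(w, mu, theta, eta, nu, Q):
    """h_s(w) = theta*h_s1(w) + (1-theta)*h_s0(w); each prototype i is
    sum_j eta[i][j] * phi(w; mu, nu[i][j] * Q)."""
    h = [sum(e * multivariate_normal.pdf(w, mean=mu, cov=v * Q)
             for e, v in zip(eta[i], nu[i]))
         for i in (0, 1)]
    return theta * h[1] + (1 - theta) * h[0]

Q = shape_covariance(sigma=2.0, phi=0.3, rho=0.5)
eta = [[0.6, 0.4], [0.7, 0.3]]  # invented mixture weights (each row sums to 1)
nu = [[1.0, 4.0], [0.5, 2.0]]   # invented per-component covariance scalings
val = galaxy_light_density(np.zeros(2), np.zeros(2), theta=0.8, eta=eta, nu=nu, Q=Q)
```

The density peaks at the galaxy's center µ_s and decays with distance, as a mixture of concentric Gaussians must.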
9 Idealized sky view
The brightness at sky position w is
G_b(w) = Σ_{s=1}^{S} ℓ_{sb} g_s(w),
where
g_s(w) = 1{µ_s = w} if a_s = 0 ("star"), and g_s(w) = h_s(w) if a_s = 1 ("galaxy").
9 / 31
10 Astronomical images
Images differ from the idealized sky view due to
1. pixelation and point spread:
f_{nbm}(w) = Σ_{k=1}^{K} ᾱ_{nbk} φ(w_m; w + ξ_{nbk}, τ_{nbk}),
G_{nbm} = (G_b ∗ f_{nbm})
2. background radiation and calibration:
F_{nbm} = ι_{nb} [ε_{nb} + G_{nbm}]
3. finite exposure duration:
x_{nbm} | (a_s, r_s, c_s)_{s=1}^{S} ∼ Poisson(F_{nbm})
10 / 31
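The three image-formation steps above can be sketched as follows; this is an illustrative simplification for a single image of a single star, with all numeric values invented:

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

# A 21x21 pixel grid; the rows of `pixels` play the role of the w_m.
H = W = 21
pixels = np.stack(np.meshgrid(np.arange(H), np.arange(W), indexing="ij"),
                  axis=-1).reshape(-1, 2).astype(float)

# 1. Point spread: a K = 2 component Gaussian mixture
#    (weights alpha, offsets xi, covariances tau -- all invented).
alpha = [0.7, 0.3]
xi = [np.zeros(2), np.array([0.5, -0.5])]
tau = [1.5 * np.eye(2), 4.0 * np.eye(2)]

mu = np.array([10.0, 10.0])  # the star's position
ell = 5000.0                 # the star's brightness (expected photons)
psf = sum(a * multivariate_normal.pdf(pixels, mean=mu + x_k, cov=t)
          for a, x_k, t in zip(alpha, xi, tau))

# 2. Background radiation (eps) and calibration (iota).
iota, eps = 1.0, 10.0
F = iota * (eps + ell * psf)  # expected counts per pixel

# 3. Finite exposure: observed counts are Poisson with mean F.
x = rng.poisson(F)
```

Every pixel receives at least the background level, and the counts concentrate around the star's position.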
11 Intractable posterior
Let Θ = (a_s, r_s, c_s)_{s=1}^{S}. The posterior on Θ is intractable because of coupling between the sources:
p(Θ | x) = p(x | Θ) p(Θ) / p(x),
where
p(x) = ∫ p(x | Θ) p(Θ) dΘ = ∫ [ ∏_{n=1}^{N} ∏_{b=1}^{B} ∏_{m=1}^{M} p(x_{nbm} | Θ) ] p(Θ) dΘ.
11 / 31
12 Variational inference Variational inference approximates the exact posterior p with a simpler distribution q Q. 12 / 31
13 Variational inference for Celeste
An approximating distribution that factorizes across light sources (a structured mean-field assumption) makes most expectations tractable:
q(Θ) = ∏_{s=1}^{S} q(Θ_s).
The delta method for moments approximates the remaining expectations. Existing catalogs provide good initial settings for the variational parameters. Light sources are unlikely to contribute photons to distant pixels. The model contains an auxiliary variable indicating the mixture component that generated each source's colors. Newton's method converges in tens of iterations.
13 / 31
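The delta method for moments mentioned above can be checked on a one-dimensional toy case where the exact expectation is known in closed form (f(θ) = exp(θ) under a Gaussian q; this is an illustration, not one of Celeste's actual expectations):

```python
import math

# q(theta) = N(m, v); approximate E_q[f(theta)] by f(m) + 0.5 * f''(m) * v.
m, v = 0.1, 0.05
exact = math.exp(m + v / 2)                   # lognormal mean: E[exp(theta)]
approx = math.exp(m) + 0.5 * math.exp(m) * v  # delta method, using f'' = f
rel_err = abs(approx - exact) / exact
```

For small variance v the second-order expansion is accurate to within a fraction of a percent, which is why it suffices for the few expectations that lack closed forms.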
14 Validation from Stripe 82
A table compares the average error of Photo and Celeste on: position; missed galaxies (out of 421); missed stars (out of 421); colors u-g, g-r, r-i, and i-z; brightness; profile; axis ratio; scale; and angle. Lower is better. Highlighted scores are more than 2 standard deviations better. [Most numeric entries were lost in transcription; the surviving entries are 23/421 missed galaxies and 10/421 missed stars.]
14 / 31
15 Scaling inference to the visible universe
16 The setting
Big data:
- Sloan Digital Sky Survey: 55 TB of images; hundreds of millions of stars and galaxies
- Large Synoptic Survey Telescope (2019): 15 TB of images nightly
Fancy hardware:
- Cori supercomputer, Phase 1: 1,630 nodes, each with 32 cores
- Cori supercomputer, Phase 2: 9,300 nodes, each with 272 hardware threads; 30 teraflops; 1.5 PB array of SSDs
16 / 31
17 Julia programming language high-level syntax as fast as C++ (when necessary) a single language for hotspots and the rest multi-threading (experimental), not just multi-processing 17 / 31
18 Fast serial optimization algorithm
- analytic expectations (the delta method for moments covers the one remaining expectation)
- block coordinate ascent
- Newton steps rather than L-BFGS or a first-order method
- manually coded gradients and Hessians; 3x overhead from computing exact Hessians
18 / 31
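A minimal sketch of block coordinate optimization with Newton steps and hand-coded derivatives, on an invented two-block quadratic objective (written as minimization, equivalent to ascent on the negated objective; this is not the Celeste objective):

```python
import numpy as np

def block_newton(grad_hess, blocks, iters=20):
    """Cycle over parameter blocks, taking a full Newton step on each
    block while holding the other blocks fixed."""
    x = {name: v.copy() for name, v in blocks.items()}
    for _ in range(iters):
        for name in x:
            g, Hm = grad_hess(x, name)                  # hand-coded gradient, Hessian
            x[name] = x[name] - np.linalg.solve(Hm, g)  # Newton step on this block
    return x

# Invented objective: f(a, b) = a'Aa/2 - a'b + b'Bb/2 - c'a,
# minimized at a = 2c/3, b = c/3 when A = B = 2I.
A, B = 2 * np.eye(3), 2 * np.eye(3)
c = np.array([1.0, 2.0, 3.0])

def grad_hess(x, name):
    if name == "a":
        return A @ x["a"] - x["b"] - c, A
    return B @ x["b"] - x["a"], B

x = block_newton(grad_hess, {"a": np.zeros(3), "b": np.zeros(3)})
```

On a quadratic, each block update is exact given the other block, so the cycle contracts geometrically toward the joint minimizer, mirroring the "tens of iterations" behavior claimed on the earlier slide.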
19 Parallelism among nodes A region of the sky, shown twice, divided into 25 overlapping boxes: A1,...,A9 and B1,...,B16. Each box corresponds to a task: to optimize all the light sources within its boundaries. 19 / 31
20 Parallelism among threads Light sources that do not overlap may be updated concurrently. 20 / 31
21 Parallelism among threads
Light sources that do not overlap may be updated concurrently.
[1] Pan, Xinghao, et al. CYCLADES: Conflict-free Asynchronous Machine Learning. NIPS 2016.
20 / 31
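The conflict-free update strategy can be sketched as greedy batching of sources whose bounding boxes do not overlap (a simplification with invented box coordinates; Celeste's actual scheduling follows the CYCLADES approach rather than this greedy pass):

```python
def overlaps(a, b):
    """Axis-aligned bounding boxes given as (x0, y0, x1, y1)."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def conflict_free_batches(boxes):
    """Greedily partition source indices into batches such that no two
    sources in the same batch overlap; each batch can then be updated
    by parallel threads without conflicts."""
    batches = []
    for i, box in enumerate(boxes):
        for batch in batches:
            if all(not overlaps(box, boxes[j]) for j in batch):
                batch.append(i)
                break
        else:
            batches.append([i])
    return batches

# Four invented sources: 0 and 1 overlap; 2 and 3 are isolated from 0.
boxes = [(0, 0, 2, 2), (1, 1, 3, 3), (5, 5, 6, 6), (2.5, 0, 4, 1)]
batches = conflict_free_batches(boxes)
```

Sources 0, 2, and 3 land in one batch and the conflicting source 1 in another, so two rounds of parallel updates suffice for this example.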
22 Weak and strong scaling
Celeste throughput in light sources/second. We observe perfect scaling up to 64 nodes. Then we are limited by interconnect bandwidth.
21 / 31
23 Hero run: catalog of the Sloan Digital Sky Survey
512 nodes × 32 cores/node = 16,384 cores
16,384 cores × 16 hours ≈ 250,000 core hours
input: 55 TB of astronomical images
output: catalog of 250 million light sources
22 / 31
24 A deep generative model for galaxies
25 A deep generative model for galaxies
Example: z_n = [0.1, 0.5, 0.2, 0.1]. [Images of the decoder outputs f_µ(z_n) and f_σ(z_n) are not reproduced in the transcription.]
The generative process:
z_n ∼ N(0, I)
x_n | z_n ∼ N(f_µ(z_n), f_σ(z_n))
24 / 31
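A hedged sketch of this generative process, with an invented (untrained, randomly initialized) two-layer decoder standing in for the networks f_µ and f_σ:

```python
import numpy as np

rng = np.random.default_rng(1)
d_z, d_hidden, d_x = 4, 32, 64  # latent, hidden, and pixel dimensions (invented)

# Random decoder weights -- a stand-in, not the trained network from the talk.
W1 = rng.normal(scale=0.1, size=(d_hidden, d_z))
W_mu = rng.normal(scale=0.1, size=(d_x, d_hidden))
W_sig = rng.normal(scale=0.1, size=(d_x, d_hidden))

def decoder(z):
    """Map a latent code z to the mean and (softplus-positive) std of x."""
    h = np.tanh(W1 @ z)
    return W_mu @ h, np.log1p(np.exp(W_sig @ h))

z = rng.normal(size=d_z)      # z_n ~ N(0, I)
mu, sigma = decoder(z)
x = rng.normal(mu, sigma)     # x_n | z_n ~ N(f_mu(z_n), f_sigma(z_n))
```

The softplus output keeps the per-pixel standard deviation strictly positive, which the Gaussian likelihood requires.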
26 Autoencoder architecture 25 / 31
27 Sample fits
Each row corresponds to a different example from a test set. The left column shows the input x. The center column shows the output f_µ(z) for a z sampled from N(g_µ(x), g_σ(x)). The right column shows the output f_σ(z) for the same z.
26 / 31
28 Second-order stochastic variational inference
29 Second-order Stochastic Variational Inference
Require: ω is the initial vector of variational parameters; δ ∈ (δ_min, δ_max) is the initial trust-region radius; γ > 1 is the trust-region expansion factor; and η_1 ∈ (0, 1) and η_2 > 0 are constants.
1: for i ← 1 to M do
2:   Sample e_1, ..., e_N iid from base distribution ε.
3:   g ← ∇_ν L̂(ν; e_1, ..., e_N) |_{ν=ω}
4:   H ← ∇²_ν L̂(ν; e_1, ..., e_N) |_{ν=ω}
5:   ν* ← argmax_ν {gᵀν + νᵀHν : ‖ν‖ ≤ δ};  ω* ← ω + ν*   ▷ non-convex quadratic optimization
6:   β ← gᵀν* + ν*ᵀHν*   ▷ the expected improvement
7:   Sample e_1, ..., e_N iid from base distribution ε.
8:   α ← L̂(ω*; e_1, ..., e_N) − L̂(ω; e_1, ..., e_N)   ▷ the observed improvement
9:   if α/β > η_1 and ‖g‖ ≥ η_2 δ then
10:    ω ← ω*
11:    δ ← min(γδ, δ_max)
12:  else
13:    δ ← δ/γ
14:  if δ < δ_min or i = M then return ω
28 / 31
30 Second-order Stochastic Variational Inference: 68x fewer iterations than ADVI/SGD.
28 / 31
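The trust-region loop above can be sketched in Python on a toy one-dimensional objective (the real method operates on Celeste's variational parameters; here the subproblem on line 5 is solved by clipping the Newton step, which suffices because H is negative definite, and the quadratic model includes the conventional ½ factor):

```python
import numpy as np

rng = np.random.default_rng(0)

def L_hat(w, s):
    """Monte Carlo estimate of L(w) = E_s[-(w - s)^2], s ~ N(2, 0.1^2)."""
    return np.mean(-(w - s) ** 2)

def grad_hess(w, s):
    """Gradient and Hessian of the Monte Carlo estimate at w."""
    return np.mean(-2.0 * (w - s)), -2.0

def trust_region_svi(w, delta, delta_min=1e-6, delta_max=1.0,
                     gamma=2.0, eta1=0.1, eta2=0.0, M=100, N=64):
    for _ in range(M):
        s = rng.normal(2.0, 0.1, size=N)           # fresh minibatch of draws
        g, Hm = grad_hess(w, s)
        step = np.clip(-g / Hm, -delta, delta)     # trust-region Newton step
        beta = g * step + 0.5 * Hm * step ** 2     # expected improvement
        s2 = rng.normal(2.0, 0.1, size=N)          # resample for the check
        alpha = L_hat(w + step, s2) - L_hat(w, s2) # observed improvement
        if beta > 0 and alpha / beta > eta1 and abs(g) >= eta2 * delta:
            w, delta = w + step, min(gamma * delta, delta_max)  # accept, expand
        else:
            delta = delta / gamma                  # reject, shrink
        if delta < delta_min:
            break
    return w

w_opt = trust_region_svi(w=0.0, delta=0.5)  # the maximizer of L is w = 2
```

The expected-vs-observed improvement ratio plays the classic trust-region role: steps whose Monte Carlo gains match the quadratic model expand the radius, while noisy or overreaching steps shrink it until the iterate settles near the optimum.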
31 Case study: empirical convergence rates 29 / 31
32 Open questions
How do we more accurately approximate the posterior distribution?
- normalizing flows without an encoder network
- hybrid VI/MCMC
- linear response variational Bayes (LRVB)
How do we model a spatially varying point spread function (PSF)? Variational autoencoders fit independent PSFs well. But it isn't easy to account for dependence among the PSFs of nearby astronomical objects.
How can we easily account for the details we don't yet account for?
- cosmic rays, airplanes, and satellites
- camera saturation (censoring) and imaging artifacts
- terrestrial vs. extraterrestrial background radiation
- transient events: supernovae, exoplanets, and near-Earth asteroids
- additional imaging datasets, spectrographic datasets, calibration datasets
30 / 31
33 Thank you! 31 / 31
More information