Andy Casey Astrophysics; Statistics


1 Science in the era of big Gaia data Andy Casey (Astrophysics; Statistics) andycasey astrowizicist astrowizici.st

2 Science in the era of big Gaia data - The Gaia mission: all about Gaia. What makes data big?

3 Science in the era of big Gaia data - The Gaia mission: all about Gaia. What makes data big? - Pedagogy of data analysis when you have lots of data: examples of how pedagogy drives decisions in big and small data analysis (data-driven methods, non-parametric models)

4 Science in the era of big Gaia data - The Gaia mission: all about Gaia. What makes data big? - Pedagogy of data analysis when you have lots of data: examples of how pedagogy drives decisions in big and small data analysis (data-driven methods, non-parametric models) - Tools & resources for data analysis: pick the right tool for the job

5 Science in the era of big Gaia data - The Gaia mission: all about Gaia. What makes data big? - Pedagogy of data analysis when you have lots of data: examples of how pedagogy drives decisions in big and small data analysis (data-driven methods, non-parametric models) - Tools & resources for data analysis: pick the right tool for the job - Unsolicited advice for staying ahead of the data wave

6

7 having data is no longer currency in astronomy

8 having data is no longer currency in astronomy having good ideas and the ability to effortlessly use data is currency

9 having data is no longer currency in astronomy having good ideas and the ability to effortlessly use data is currency This talk is about making you rich

10 The Gaia satellite The Billion Star Surveyor (tm) One billion stars for one billion Euros An astrometric mission designed to measure the position, parallax, brightness, and proper motions for more than one billion stars.

12 The Gaia satellite The Billion Star Surveyor (tm) One billion stars for one billion Euros For up to 1.7 billion sources: positions; proper motions; radial velocities (and scatter); parallax; photometry (G, BP, RP); colours (G-BP, G-RP, BP-RP); dust along the line of sight; stellar effective temperatures; stellar radii; stellar masses; stellar luminosities; astrometric excess noise (more than a single-star solution); orbital solutions for solar system objects; variable stars (including light curves of new kinds of objects). Credit: Erik Tollerud

13 Source count completeness Gaia observes everything. Stars, galaxies, quasars, asteroids, et cetera.

14 Photometric performance Kepler-precision photometry, but for one billion stars

15 Astrometric performance (It is very good)

16 Astrometric performance Note: Hipparcos and Gaia Data Release 1

17 Proper motion performance A proper motion precision of 0.4 mas/yr for a G ~ 18 star at 30 kpc corresponds to approx. 2 km/s precision at 100,000 light years away

18 You are here.

19 Credit: S. BRUNIER/ESO/ESA

20

21 Gaia Data Release 2 This was the first real data release, and it contained just averaged values.

22 Gaia Data Release 2 This was the first real data release, and it contained just averaged values. are we at big data yet?

23 Gaia Data Release 5 The flood is coming. This is what we need to deal with (easily): 128 trillion position measurements, 380 trillion brightness measurements, 1 billion medium-resolution spectra, 100 billion low-resolution spectra, and 1 petabyte of reduced data products for science.

24 Gaia Data Release 5 The flood is coming. This is what we need to deal with (easily): 128 trillion position measurements, 380 trillion brightness measurements, 1 billion medium-resolution spectra, 100 billion low-resolution spectra, and 1 petabyte of reduced data products for science. are we at big data yet?

25 Gaia Data Release 5 The flood is coming. This is what we need to deal with (easily): 128 trillion position measurements, 380 trillion brightness measurements, 1 billion medium-resolution spectra, 100 billion low-resolution spectra, and 1 petabyte of reduced data products for science. are we at big data yet? Rule of thumb: if you can load it into RAM, then you are not at big data.

26 Five pedagogical questions to ask yourself to keep you out of scientific and data analysis cul-de-sacs 1. Do you have small data, or do you have big data? 2. What is the simplest, dumbest model you can think of? 3. What assumptions are you making? 4. What is the utility of the model? 5. What can you afford?

27 1. Do you have small data, or do you have big data? If you can't load it into RAM, you have options (in increasing order of difficulty): Do you need to load all the data at once? Memory-mapped arrays: store data on external hard drives and treat it (really carefully) as memory. Can you subsample the data and get a comparable result? Can you use statistics of the data to get a comparable result? Can you simplify the data you use and get a comparable result (e.g., ignore covariances)? Can you recast your problem as a map-reduce problem?
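The memory-mapped option above can be sketched in a few lines of numpy. This is a minimal illustration with a made-up file name and toy array sizes, not a recipe for real Gaia tables:

```python
import numpy as np

# Hypothetical stand-in for a large on-disk photometry table: write a
# small raw float64 file so the sketch is self-contained.
data = np.random.default_rng(42).normal(size=(1_000, 3))
data.tofile("photometry.bin")

# Treat the on-disk array as if it were in memory; only the slices you
# actually touch are read from disk.
mmap = np.memmap("photometry.bin", dtype=np.float64, mode="r", shape=(1_000, 3))

# Process in chunks so peak RAM stays bounded, accumulating sufficient
# statistics (running sums for a mean) instead of loading everything.
chunk, total, count = 100, np.zeros(3), 0
for start in range(0, mmap.shape[0], chunk):
    block = np.asarray(mmap[start:start + chunk])  # reads only this slice
    total += block.sum(axis=0)
    count += block.shape[0]

print(total / count)  # column means, computed without holding all rows in RAM
```

The same accumulate-sufficient-statistics pattern is what makes the "use statistics of the data" option work, too.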

28 2. What is the simplest, dumbest model you can think of? Always start with the simplest model you can think of, even if you know it is dumb and will not give you great results. For example: 1. Linear regression (for fitting data): a design matrix can have nonlinear entries, but you are still doing linear regression! 2. k-means (for clustering): use k-means++ for initialisation, always. 3. Logistic regression (for classification). Don't change this model until you have answered all five questions! When complicated models aren't working correctly, always ask what the simplest, dumbest thing is that you could test to check your intuition.
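The first point above, that a design matrix with nonlinear entries is still linear regression, can be shown in a short sketch (toy data with made-up coefficients):

```python
import numpy as np

# The model y = a + b*x + c*x**2 is linear *in the parameters* (a, b, c),
# so ordinary least squares applies even though x enters nonlinearly.
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 200)
y = 0.5 + 2.0 * x - 1.5 * x**2 + rng.normal(0, 0.05, size=x.size)

# Design matrix with nonlinear entries: columns [1, x, x**2].
A = np.vander(x, 3, increasing=True)

# Solve the linear least-squares problem A @ theta ~ y.
theta, *_ = np.linalg.lstsq(A, y, rcond=None)
print(theta)  # close to the true coefficients [0.5, 2.0, -1.5]
```

You could add columns of sines, logs, or basis splines and the fitting step would not change at all.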

29 3. What assumptions are you making? You have made an infinite number of assumptions. What are the most important assumptions? (Seriously: write them down.) Do you assume that your data are drawn from a straight line? Do you assume the data points are independent? Do you assume the noise in the data is normally distributed? Do you assume that you have the correct objective function? Do you assume that you have optimised to the global minimum? Do you assume that you have used an appropriate optimisation algorithm? Do you assume that the noise estimates you have are correct? Do you assume that we do not live in a simulation? (Would it matter?)

30 4. What is the utility of the model? All models are wrong, some are useful. Even a dumb model can tell you a lot about what you should do next. If you have a dumb model but you parameterise your model errors, then the model errors (or residuals from the data) will inform you where your model is failing. Do the underlying physical models make good predictions? Under what conditions will this model fail? (models should fail loudly!) Do you need a point estimate of your model parameters, or do you need a posterior probability distribution over data? Does this model give a point estimate that you can use for other purposes?

31 5. What can you afford? Sometimes a point estimate of the parameters of a very simple model is good enough to answer the question you have. Sometimes you will need to sample a posterior probability distribution of a complicated model. Or worse: calculate the fully marginalised likelihood (FML; a.k.a. "the evidence"). What can you afford? (etc.) Answers to these questions will (in a very practical sense) help drive your model complexity.

32 Example: data-driven models For when the data are better than the models.

33 Hierarchical data-driven models of stellar properties Hierarchical, complex model. Analytic integrals to marginalise parameters: tractable(-ish)! Use joint information between stars to denoise properties of the sample. (Leistedt et al.; Anderson et al.)

35 Hierarchical data-driven models of stellar properties 1. Do you have small data, or do you have big data? Small. 2. What is the simplest, dumbest model you can think of? Gaussian mixture model. 3. What assumptions are you making? Independence among stars. Many others. 4. What is the utility of the model? Most parallaxes are noisy. This model improves them. 5. What can you afford? Posterior distributions over data, but only through analytic marginalisation.
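The "simplest, dumbest model" named above, a Gaussian mixture model, takes a few lines with scikit-learn. This is a toy two-population sample with invented numbers; the model in the papers is hierarchical and far richer:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy stand-in: two "stellar populations" in a (colour, magnitude) plane,
# with made-up centres and scatter.
rng = np.random.default_rng(1)
pop_a = rng.normal([0.8, 4.5], 0.1, size=(500, 2))
pop_b = rng.normal([1.4, 0.5], 0.2, size=(500, 2))
X = np.vstack([pop_a, pop_b])

# Fit a two-component Gaussian mixture by expectation-maximisation.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

# Posterior membership probabilities for each star in each component.
resp = gmm.predict_proba(X)
print(gmm.means_)  # recovers the two population centres
```

Even this dumb version already gives per-star membership probabilities, which is the quantity the fancier hierarchical model refines.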

36 Example: non-parametric models Terribly named, because they really have an infinite number of parameters.

37 Non-parametric model for binary star inference 1. Do you have small data, or do you have big data? Big. We ded. 2. What is the simplest, dumbest model you can think of? Mixture of two components. 3. What assumptions are you making? Some stars with similar colours and luminosity will be single stars. 4. What is the utility of the model? Point estimates of binary probability for two billion stars. 5. What can you afford? Posterior distributions over data, but only if we get clever.

38 Non-parametric model for binary star inference

39 Non-parametric model for binary star inference For each star, fit a mixture model (normal and log-normal) to all observables of stars in our ball: radial velocity variance, astrometric noise, photometric variability, being bluer/redder than expected, and template systematics. Calculate p(single | data) for the star of interest, then move on to the next star.
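The two-component idea can be sketched with made-up component parameters (a half-normal stand-in for the single-star scatter and a log-normal for the binary tail; the real model fits these per ball of neighbouring stars, over several observables at once):

```python
import numpy as np
from scipy import stats

# Illustrative mixture for one observable, radial-velocity variance.
# All parameters below are invented for the sketch.
single = stats.halfnorm(scale=0.5)        # narrow single-star component
binary = stats.lognorm(s=1.0, scale=5.0)  # broad log-normal binary tail
w_single = 0.7                            # assumed mixture weight

def p_single(v):
    """Posterior probability that a star with RV variance v is single."""
    num = w_single * single.pdf(v)
    den = num + (1 - w_single) * binary.pdf(v)
    return num / den

print(p_single(0.2))   # small variance: almost certainly single
print(p_single(20.0))  # large variance: almost certainly a binary
```

The full model multiplies likelihoods like this across all of the observables listed above before reading off p(single | data).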

40 Non-parametric model for binary star inference [Figure: radial velocity variance (km/s) against apparent g, bp, and rp fluxes]

41 Non-parametric model for binary star inference In practice we might want to sample the mixture parameters for every star. Can we afford it? Hell no! We can barely optimise it! But we may be able to analytically marginalise out parameters that we don't care about.

42 Non-parametric model for binary star inference ~210 million parameter model for brighter stars, about 1B parameter model for all stars. Converted a big data problem to a small data problem that is embarrassingly parallel, and one where we might be able to analytically marginalise out many hyper-nuisance-parameters.
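The "embarrassingly parallel" structure described above is just a map over independent stars followed by a trivial reduce. A toy sketch (hypothetical per-star function; serial here, but the map step could be handed to concurrent.futures or Hadoop unchanged, because no star's fit depends on another's):

```python
from functools import reduce

# Toy catalogue: each star carries only the local data its own fit needs.
stars = [{"id": i, "rv_variance": 0.1 * i} for i in range(10)]

def fit_one_star(star):
    # Stand-in for the real per-star mixture fit; returns (id, p_single).
    return star["id"], 1.0 / (1.0 + star["rv_variance"])

# Map: one independent fit per star.
results = dict(map(fit_one_star, stars))

# Reduce: combine per-star outputs (here, just count them).
n_processed = reduce(lambda acc, _: acc + 1, results, 0)
print(n_processed)  # 10
```

Structuring the analysis this way is exactly what converts the big-data problem into many small-data problems.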

43 Non-parametric model for binary star inference [Figure: inferred binary probability as a function of radial velocity excess and the orbital quantities K/P and eccentricity]

44 Non-parametric model for binary star inference [Figure: binary fraction across the colour-magnitude diagram (absolute G magnitude against bp-rp)] Now we can do a population study of binary stars that is 10^5 times larger than anything we could do before.

45 Why not just turn on the Machine Learning (tm)? As physicists we are often interested in the mechanisms that produced the data. That is, we want a generative model for the data. Neural networks are universal function approximators (we've known that literally for decades), but they will not give you a generative model for the data that is interpretable. This applies to most ML methods. Sometimes that's OK. Sometimes you don't care about interpretability, or how the data were generated. But often we do care, and we can afford an interpretable model, but we (incorrectly) opt to use Machine Learning.

46 Why not just turn on the Machine Learning (tm)? Consider a problem where there are: Lots of high quality data. It's hard to model those data, and/or the existing models do not make good predictions ("the data are better than the models"). We just want answers. We don't care why.

47 Why not just turn on the Machine Learning (tm)? Turn on the ML! Create some training set of well-known objects. Train a Convolutional Neural Network (CNN) to estimate the intrinsic (or latent) properties of some objects, given an image (or spectrum) of the object. You responsibly run cross-validation (or drop-out) to convince yourself things work. You run the test step. Your CNN has identified an object with properties that defy everything we thought we knew about astrophysics! (But in many other ways, it is similar enough to objects in the training set, so we have some reason to trust it)

48 (Get it? Convolutional Neural Network.) Models that lack interpretability can really suck.

49 When should I turn on the Machine Learning (tm)? Can you write a generative model for the data (that evaluates in less than a Hubble time)? Don't use machine learning. Forward model the data. Do you care about model interpretability, or interpreting the results that you get? Don't use machine learning. Forward model the data. Do you want a posterior probability distribution over data? Don't use machine learning. Forward model the data. Do you need to retain some semblance of probability over data? Don't use machine learning. Forward model the data. Do you want to classify or estimate things, or make decisions, and you don't care about the physics? Hell yeah! Turn the Machine Learning up to 11!

50 Even when you turn on the Machine Learning (tm), the rules still apply! What is the simplest, dumbest model you can think of? Start with that. From Google on "Scalable and accurate deep learning with electronic health records" (Nature): regularised logistic regression performed essentially just as well as deep neural networks (mortality C.I. vs 0.94 to 0.96). There is a huge cost, complexity, and interpretability difference between those models.
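The baseline that slide recommends, regularised logistic regression, takes a few lines with scikit-learn. The data here are a synthetic stand-in (the health records in the study are not public), and the coefficients are made up:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic binary-outcome data: labels driven by a linear signal in a
# few of the features, plus noise.
rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 5))
y = (X @ np.array([1.5, -2.0, 0.0, 0.5, 0.0]) + rng.normal(0, 0.5, 1000)) > 0

# L2-regularised logistic regression; C controls regularisation strength
# (smaller C means stronger regularisation).
clf = LogisticRegression(C=1.0).fit(X, y)
print(clf.score(X, y))  # accuracy of the simple, interpretable baseline
```

If a deep network cannot clearly beat this on held-out data, the extra cost and opacity are hard to justify.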

51 Standard tools for data analysis Linear algebra: go back to basics; keep your linear (matrix) algebra sharp. Python (3): astropy, numpy, scipy, scikit-learn, TensorFlow (not just for ML). Positives: good glue; human-readable, machine-executable; transferable skill. Negatives: only a little bit slow. Stan: probabilistic programming language. When to use: if you have a model that doesn't have bespoke parts (e.g., no models at grid points, or functions that are not differentiable). When not to use: when your model contains bespoke parts, or if statements (kinda). Fortran/C: betterise your code by speeding up the slowest parts; you can call Fortran or C functions directly from Python. PostgreSQL: learn it; write scripts to ingest data; you will thank yourself later. Hadoop: if you have a map-reduce job, use Hadoop. Transferable skill.

52 Resources Statistics: Information theory, inference and learning algorithms; Sokal's notes; Probabilistic Programming and Bayesian Methods for Hackers; Bayesian Data Analysis; Hamiltonian Monte Carlo. Version control: oh shit git. Machine Learning: Talking Machines; Which ML algorithm is for me?; Matrix calculus you need for deep learning; You should understand backpropagation; Machine Learning 101 (Google Engineers). Code: astropy, tensorflow, stan, scikit-learn, fortran from python. Probabilistic graphical models: an introduction. Linear algebra: immersive linear algebra.

53 Unsolicited advice for staying ahead of the data wave 1. Create a GitHub or BitBucket account and use it. Push daily. Push good code. Push bad code. Push grant proposals. Push paper drafts. Push. Push. Push. 2. Read arxiv: and do all the exercises. 3. Be familiar with tools (machine learning, optimisation algorithms, linear algebra) and know how to choose the right tool. It's hard. 4. Think about whether you can map-reduce your data analysis problem. If you can, learn Hadoop as part of that project. 5. Start with the simplest model for data analysis. But for fun, think about how to fit a line to one petabyte of data.

54 Gaia Sprints Not traditional scientific meetings. The aim is to bring together people who want to exploit Gaia data on short timescales. We do everything in the open. Open data. Open science. No invited participants; everyone applies to attend (incl. the SOC, the Gaia principal investigator, etc.). "Best scientific experience of my life." "Most important week of my year." Next Sprint: 2019, Santa Barbara. gaia.lol

55 Conclusions The data are only going to get bigger. Those who can't swim will drown. Those who can swim will swim in it.

56 Conclusions The data are only going to get bigger. Those who can't swim will drown. Those who can swim will swim in it. Remember to ask yourself: 1. Do you have small data, or do you have big data? 2. What is the simplest, dumbest model you can think of? 3. What assumptions are you making? 4. What is the utility of the model? 5. What can you afford?


More information

COMP90051 Statistical Machine Learning

COMP90051 Statistical Machine Learning COMP90051 Statistical Machine Learning Semester 2, 2017 Lecturer: Trevor Cohn 2. Statistical Schools Adapted from slides by Ben Rubinstein Statistical Schools of Thought Remainder of lecture is to provide

More information

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make

More information

Bayesian Linear Regression [DRAFT - In Progress]

Bayesian Linear Regression [DRAFT - In Progress] Bayesian Linear Regression [DRAFT - In Progress] David S. Rosenberg Abstract Here we develop some basics of Bayesian linear regression. Most of the calculations for this document come from the basic theory

More information

A thorough derivation of back-propagation for people who really want to understand it by: Mike Gashler, September 2010

A thorough derivation of back-propagation for people who really want to understand it by: Mike Gashler, September 2010 A thorough derivation of back-propagation for people who really want to understand it by: Mike Gashler, September 2010 Define the problem: Suppose we have a 5-layer feed-forward neural network. (I intentionally

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 11 Project

More information

STA 414/2104: Lecture 8

STA 414/2104: Lecture 8 STA 414/2104: Lecture 8 6-7 March 2017: Continuous Latent Variable Models, Neural networks With thanks to Russ Salakhutdinov, Jimmy Ba and others Outline Continuous latent variable models Background PCA

More information

CPSC 340: Machine Learning and Data Mining. More PCA Fall 2017

CPSC 340: Machine Learning and Data Mining. More PCA Fall 2017 CPSC 340: Machine Learning and Data Mining More PCA Fall 2017 Admin Assignment 4: Due Friday of next week. No class Monday due to holiday. There will be tutorials next week on MAP/PCA (except Monday).

More information

Bayesian Analysis for Natural Language Processing Lecture 2

Bayesian Analysis for Natural Language Processing Lecture 2 Bayesian Analysis for Natural Language Processing Lecture 2 Shay Cohen February 4, 2013 Administrativia The class has a mailing list: coms-e6998-11@cs.columbia.edu Need two volunteers for leading a discussion

More information

! p. 1. Observations. 1.1 Parameters

! p. 1. Observations. 1.1 Parameters 1 Observations 11 Parameters - Distance d : measured by triangulation (parallax method), or the amount that the star has dimmed (if it s the same type of star as the Sun ) - Brightness or flux f : energy

More information

Lecture - 24 Radial Basis Function Networks: Cover s Theorem

Lecture - 24 Radial Basis Function Networks: Cover s Theorem Neural Network and Applications Prof. S. Sengupta Department of Electronic and Electrical Communication Engineering Indian Institute of Technology, Kharagpur Lecture - 24 Radial Basis Function Networks:

More information

CSC321 Lecture 20: Reversible and Autoregressive Models

CSC321 Lecture 20: Reversible and Autoregressive Models CSC321 Lecture 20: Reversible and Autoregressive Models Roger Grosse Roger Grosse CSC321 Lecture 20: Reversible and Autoregressive Models 1 / 23 Overview Four modern approaches to generative modeling:

More information

Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi

Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi Lecture - 41 Pulse Code Modulation (PCM) So, if you remember we have been talking

More information

EM Algorithm & High Dimensional Data

EM Algorithm & High Dimensional Data EM Algorithm & High Dimensional Data Nuno Vasconcelos (Ken Kreutz-Delgado) UCSD Gaussian EM Algorithm For the Gaussian mixture model, we have Expectation Step (E-Step): Maximization Step (M-Step): 2 EM

More information

A1101, Lab 11: Galaxies and Rotation Lab Worksheet

A1101, Lab 11: Galaxies and Rotation Lab Worksheet Student Name: Lab Partner Name: Lab TA Name: Part 1: Classifying Galaxies A1101, Lab 11: Galaxies and Rotation Lab Worksheet In the 1930s, Edwin Hubble defined what is still the most influential system

More information

Lecture: Gaussian Process Regression. STAT 6474 Instructor: Hongxiao Zhu

Lecture: Gaussian Process Regression. STAT 6474 Instructor: Hongxiao Zhu Lecture: Gaussian Process Regression STAT 6474 Instructor: Hongxiao Zhu Motivation Reference: Marc Deisenroth s tutorial on Robot Learning. 2 Fast Learning for Autonomous Robots with Gaussian Processes

More information

Please bring the task to your first physics lesson and hand it to the teacher.

Please bring the task to your first physics lesson and hand it to the teacher. Pre-enrolment task for 2014 entry Physics Why do I need to complete a pre-enrolment task? This bridging pack serves a number of purposes. It gives you practice in some of the important skills you will

More information

Astronomy 421. Lecture 8: Binary stars

Astronomy 421. Lecture 8: Binary stars Astronomy 421 Lecture 8: Binary stars 1 Key concepts: Binary types How to use binaries to determine stellar parameters The mass-luminosity relation 2 Binary stars So far, we ve looked at the basic physics

More information

Recent Advances in Bayesian Inference Techniques

Recent Advances in Bayesian Inference Techniques Recent Advances in Bayesian Inference Techniques Christopher M. Bishop Microsoft Research, Cambridge, U.K. research.microsoft.com/~cmbishop SIAM Conference on Data Mining, April 2004 Abstract Bayesian

More information

ECE 521. Lecture 11 (not on midterm material) 13 February K-means clustering, Dimensionality reduction

ECE 521. Lecture 11 (not on midterm material) 13 February K-means clustering, Dimensionality reduction ECE 521 Lecture 11 (not on midterm material) 13 February 2017 K-means clustering, Dimensionality reduction With thanks to Ruslan Salakhutdinov for an earlier version of the slides Overview K-means clustering

More information

Statistical Modeling. Prof. William H. Press CAM 397: Introduction to Mathematical Modeling 11/3/08 11/5/08

Statistical Modeling. Prof. William H. Press CAM 397: Introduction to Mathematical Modeling 11/3/08 11/5/08 Statistical Modeling Prof. William H. Press CAM 397: Introduction to Mathematical Modeling 11/3/08 11/5/08 What is a statistical model as distinct from other kinds of models? Models take inputs, turn some

More information

f rot (Hz) L x (max)(erg s 1 )

f rot (Hz) L x (max)(erg s 1 ) How Strongly Correlated are Two Quantities? Having spent much of the previous two lectures warning about the dangers of assuming uncorrelated uncertainties, we will now address the issue of correlations

More information

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling 10-708: Probabilistic Graphical Models 10-708, Spring 2014 27 : Distributed Monte Carlo Markov Chain Lecturer: Eric P. Xing Scribes: Pengtao Xie, Khoa Luu In this scribe, we are going to review the Parallel

More information

Observational Parameters

Observational Parameters Observational Parameters Classical cosmology reduces the universe to a few basic parameters. Modern cosmology adds a few more, but the fundamental idea is still the same: the fate and geometry of the universe

More information

Neural Networks for Machine Learning. Lecture 11a Hopfield Nets

Neural Networks for Machine Learning. Lecture 11a Hopfield Nets Neural Networks for Machine Learning Lecture 11a Hopfield Nets Geoffrey Hinton Nitish Srivastava, Kevin Swersky Tijmen Tieleman Abdel-rahman Mohamed Hopfield Nets A Hopfield net is composed of binary threshold

More information

STA 414/2104: Machine Learning

STA 414/2104: Machine Learning STA 414/2104: Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistics! rsalakhu@cs.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 9 Sequential Data So far

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 7 Approximate

More information

Based on slides by Richard Zemel

Based on slides by Richard Zemel CSC 412/2506 Winter 2018 Probabilistic Learning and Reasoning Lecture 3: Directed Graphical Models and Latent Variables Based on slides by Richard Zemel Learning outcomes What aspects of a model can we

More information

Lecture 7: Con3nuous Latent Variable Models

Lecture 7: Con3nuous Latent Variable Models CSC2515 Fall 2015 Introduc3on to Machine Learning Lecture 7: Con3nuous Latent Variable Models All lecture slides will be available as.pdf on the course website: http://www.cs.toronto.edu/~urtasun/courses/csc2515/

More information

Lines of Hydrogen. Most prominent lines in many astronomical objects: Balmer lines of hydrogen

Lines of Hydrogen. Most prominent lines in many astronomical objects: Balmer lines of hydrogen The Family of Stars Lines of Hydrogen Most prominent lines in many astronomical objects: Balmer lines of hydrogen The Balmer Thermometer Balmer line strength is sensitive to temperature: Most hydrogen

More information

Selected Questions from Minute Papers. Outline - March 2, Stellar Properties. Stellar Properties Recap. Stellar properties recap

Selected Questions from Minute Papers. Outline - March 2, Stellar Properties. Stellar Properties Recap. Stellar properties recap Black Holes: Selected Questions from Minute Papers Will all the material in the Milky Way eventually be sucked into the BH at the center? Does the star that gives up mass to a BH eventually get pulled

More information

Ordinary Least Squares Linear Regression

Ordinary Least Squares Linear Regression Ordinary Least Squares Linear Regression Ryan P. Adams COS 324 Elements of Machine Learning Princeton University Linear regression is one of the simplest and most fundamental modeling ideas in statistics

More information

DD Advanced Machine Learning

DD Advanced Machine Learning Modelling Carl Henrik {chek}@csc.kth.se Royal Institute of Technology November 4, 2015 Who do I think you are? Mathematically competent linear algebra multivariate calculus Ok programmers Able to extend

More information

COMS 4721: Machine Learning for Data Science Lecture 10, 2/21/2017

COMS 4721: Machine Learning for Data Science Lecture 10, 2/21/2017 COMS 4721: Machine Learning for Data Science Lecture 10, 2/21/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University FEATURE EXPANSIONS FEATURE EXPANSIONS

More information

Artificial Neural Networks 2

Artificial Neural Networks 2 CSC2515 Machine Learning Sam Roweis Artificial Neural s 2 We saw neural nets for classification. Same idea for regression. ANNs are just adaptive basis regression machines of the form: y k = j w kj σ(b

More information

Exploratory Factor Analysis and Principal Component Analysis

Exploratory Factor Analysis and Principal Component Analysis Exploratory Factor Analysis and Principal Component Analysis Today s Topics: What are EFA and PCA for? Planning a factor analytic study Analysis steps: Extraction methods How many factors Rotation and

More information

Astronomical Study: A Multi-Perspective Approach

Astronomical Study: A Multi-Perspective Approach Astronomical Study: A Multi-Perspective Approach Overview of Stars Motion Distances Physical Properties Spectral Properties Magnitudes Luminosity class Spectral trends Binary stars and getting masses Stellar

More information

Gaia Status & Early Releases Plan

Gaia Status & Early Releases Plan Gaia Status & Early Releases Plan F. Mignard Univ. Nice Sophia-Antipolis & Observatory of the Côte de Azur Gaia launch: 20 November 2013 The big news @ 08:57:30 UTC 2 Gaia: a many-sided mission Driven

More information

Detecting the Unexpected

Detecting the Unexpected Detecting the Unexpected Discovery in the Era of Astronomically Big Data Insights from Space Telescope Science Institute s first Big Data conference Josh Peek, Chair Sarah Kendrew, C0-Chair SOC: Erik Tollerud,

More information

Classification and Regression Trees

Classification and Regression Trees Classification and Regression Trees Ryan P Adams So far, we have primarily examined linear classifiers and regressors, and considered several different ways to train them When we ve found the linearity

More information

Midterm sample questions

Midterm sample questions Midterm sample questions CS 585, Brendan O Connor and David Belanger October 12, 2014 1 Topics on the midterm Language concepts Translation issues: word order, multiword translations Human evaluation Parts

More information

VBM683 Machine Learning

VBM683 Machine Learning VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra Bias is the algorithm's tendency to consistently learn the wrong thing by not taking into account all the information in the data

More information

Introduction to Algebra: The First Week

Introduction to Algebra: The First Week Introduction to Algebra: The First Week Background: According to the thermostat on the wall, the temperature in the classroom right now is 72 degrees Fahrenheit. I want to write to my friend in Europe,

More information

Active Galaxies and Galactic Structure Lecture 22 April 18th

Active Galaxies and Galactic Structure Lecture 22 April 18th Active Galaxies and Galactic Structure Lecture 22 April 18th FINAL Wednesday 5/9/2018 6-8 pm 100 questions, with ~20-30% based on material covered since test 3. Do not miss the final! Extra Credit: Thursday

More information

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 10

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 10 EECS 70 Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 10 Introduction to Basic Discrete Probability In the last note we considered the probabilistic experiment where we flipped

More information

CS 179: LECTURE 16 MODEL COMPLEXITY, REGULARIZATION, AND CONVOLUTIONAL NETS

CS 179: LECTURE 16 MODEL COMPLEXITY, REGULARIZATION, AND CONVOLUTIONAL NETS CS 179: LECTURE 16 MODEL COMPLEXITY, REGULARIZATION, AND CONVOLUTIONAL NETS LAST TIME Intro to cudnn Deep neural nets using cublas and cudnn TODAY Building a better model for image classification Overfitting

More information

Gaussian Quiz. Preamble to The Humble Gaussian Distribution. David MacKay 1

Gaussian Quiz. Preamble to The Humble Gaussian Distribution. David MacKay 1 Preamble to The Humble Gaussian Distribution. David MacKay Gaussian Quiz H y y y 3. Assuming that the variables y, y, y 3 in this belief network have a joint Gaussian distribution, which of the following

More information

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature

More information

18.9 SUPPORT VECTOR MACHINES

18.9 SUPPORT VECTOR MACHINES 744 Chapter 8. Learning from Examples is the fact that each regression problem will be easier to solve, because it involves only the examples with nonzero weight the examples whose kernels overlap the

More information