Applied Machine Learning for Design Optimization in Cosmology, Neuroscience, and Drug Discovery
- Shannon Harmon
- 6 years ago
Transcription
1 Applied Machine Learning for Design Optimization in Cosmology, Neuroscience, and Drug Discovery. Barnabas Poczos, Machine Learning Department, Carnegie Mellon University. Machine Learning Technologies and their Applications for Scientific and Engineering Domains Workshop, NASA Langley Research Center, August 16, 2016
2 Goal: Create a Scientific Assistant. ML & Stat, Scientist. Automate discoveries: find anomalies, interesting events, scientific laws. Experiment, Collect Data, Analysis: recommend experiments to run, recommend what to collect, recommend how to analyze.
3 Machine learning applications: computer vision, robotics, EEG, fMRI, MEG, astronomy, drug discovery, AI in games, neuroscience, turbulence, ML in agriculture, microarray.
4 Machine Learning on Complex Objects
5 Traditional Machine Learning: observations, feature vectors, training data of feature vectors, ML algorithm (classification, regression, clustering, etc.)
6 Complex Data is Everywhere: finance, neuroscience, diffusion weighted imaging, cosmology, images.
7 Distributional Data: Manchester United 07/08 (Owen Hargreaves, Rio Ferdinand, Cristiano Ronaldo)
8 ML on Distributions: healthy or sick? Medical tests: blood pressure, heart rate, temperature, blood sample. Standard machine learning: feature vector, classifier, healthy or sick. ML on sets/distributions: set of feature vectors, classifier, healthy or sick. What happens if we repeat the medical tests?
9 Distribution Regression / Classification. Training pairs: $(P_1, Y_1 = 1)$, $(P_2, Y_2 = 0)$, $(P_3, Y_3 = 1)$, ..., $(P_m, Y_m = 0)$; query: $P_{m+1}$ with $Y_{m+1} = ?$. Differences compared to standard methods on vectors: the inputs are distributions (density functions), not vectors. We don't know these distributions; only sample sets are available (error-in-variables model).
10 Distribution Classification. Solution: use an RKHS-based SVM! Calculate the Gram matrix. Dual form of SVM: Problems:
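To make the Gram-matrix step concrete, here is a minimal sketch that builds a Gram matrix over sample sets. It assumes a mean-embedding set kernel (the average Gaussian kernel value over all cross pairs), which is one common choice and not necessarily the kernel used in the talk; the function names, the bandwidth, and the toy data are all hypothetical.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two points given as equal-length tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def set_kernel(sample_p, sample_q, bandwidth=1.0):
    """Average pairwise kernel value: inner product of empirical mean embeddings."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in sample_p for y in sample_q)
    return total / (len(sample_p) * len(sample_q))

def gram_matrix(sample_sets, bandwidth=1.0):
    """Symmetric Gram matrix with one row and one column per sample set."""
    m = len(sample_sets)
    gram = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i, m):
            gram[i][j] = gram[j][i] = set_kernel(sample_sets[i], sample_sets[j], bandwidth)
    return gram

# Three toy sample sets of 2-D points (hypothetical data); sets may differ in size.
sample_sets = [
    [(0.0, 0.0), (0.1, -0.2), (-0.1, 0.1)],
    [(0.2, 0.1), (0.0, 0.3)],
    [(3.0, 3.0), (2.8, 3.1), (3.2, 2.9), (3.1, 3.3)],
]
gram = gram_matrix(sample_sets)
```

The resulting matrix can be handed to any SVM solver that accepts a precomputed kernel, so the classifier never needs the unknown densities themselves.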
11 Distances / Divergences between Distributions. Rényi divergence estimation: estimate the divergence from samples, without density estimation.
12 The Estimator
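The estimator formula did not survive transcription. As an illustration of how a divergence can be estimated without an explicit density estimate, the sketch below uses a k-nearest-neighbour plug-in form for the Rényi-α divergence, $D_\alpha(p\|q) = \frac{1}{\alpha-1}\log\int p^\alpha q^{1-\alpha}$, on 1-D samples. It is an assumption that this matches the slide; the function names, the choice α = 0.8, k = 5, and the Gaussian toy data are hypothetical.

```python
import math
import random

def kth_nn_distance(x, points, k):
    """Distance from x to its k-th nearest neighbour among points (1-D)."""
    return sorted(abs(x - p) for p in points)[k - 1]

def renyi_divergence_knn(xs, ys, alpha=0.8, k=5):
    """k-NN plug-in estimate of the Renyi-alpha divergence D(p || q) for 1-D samples."""
    n, m, d = len(xs), len(ys), 1
    # Bias-correction constant for the k-NN distance ratio raised to the power 1 - alpha.
    correction = math.gamma(k) ** 2 / (math.gamma(k - alpha + 1) * math.gamma(k + alpha - 1))
    total = 0.0
    for i, x in enumerate(xs):
        rho = kth_nn_distance(x, xs[:i] + xs[i + 1:], k)  # within xs, excluding x itself
        nu = kth_nn_distance(x, ys, k)                    # from x into ys
        total += ((n - 1) * rho ** d / (m * nu ** d)) ** (1.0 - alpha)
    integral = correction * total / n  # estimates the integral of p^alpha * q^(1 - alpha)
    return math.log(integral) / (alpha - 1.0)

rng = random.Random(0)
p_sample = [rng.gauss(0.0, 1.0) for _ in range(300)]
same_sample = [rng.gauss(0.0, 1.0) for _ in range(300)]
shifted_sample = [rng.gauss(2.0, 1.0) for _ in range(300)]

d_same = renyi_divergence_knn(p_sample, same_sample)        # both samples from N(0, 1)
d_shifted = renyi_divergence_knn(p_sample, shifted_sample)  # N(0, 1) versus N(2, 1)
```

Only nearest-neighbour distances enter the computation; the ratio of distances stands in for the ratio of the two unknown densities at each sample point.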
13 ML on Distributions. Dealing with complex objects: break into smaller parts and represent the input as a set of smaller parts; treat the set elements as sample points from some unknown distribution; do ML on these unknown distributions represented by sets.
14 Sport Events Classification [Li and Fei-Fei, 2007]: badminton, bocce, croquet, polo, climbing, rowing, sailing, snowboard. 8 categories, 1040 images, each represented by 295 to dim points. Best published: 86.7% (Zhang et al., CVPR 2011). NPR: 87.1% (2-fold CV, 16 runs).
15 Detecting Anomalous Images: 50 highway images, 5 anomalies. 2-dimensional sample set representation of images (128-dim SIFT → 2-dim). Anomaly score: divergences between the distributions of these sample sets.
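A divergence-based anomaly score can be sketched as follows. This is an assumption-laden illustration: it swaps in the energy distance as the divergence between sample sets (the talk refers to divergence estimators without naming one on this slide), and it scores each set by its average distance to its closest other sets. All names and the toy 2-D data are hypothetical.

```python
import math

def mean_pairwise_distance(set_a, set_b):
    """Average Euclidean distance over all pairs (a, b)."""
    total = sum(math.dist(a, b) for a in set_a for b in set_b)
    return total / (len(set_a) * len(set_b))

def energy_distance(set_a, set_b):
    """Energy distance between two sample sets; zero when the sets coincide."""
    return (2.0 * mean_pairwise_distance(set_a, set_b)
            - mean_pairwise_distance(set_a, set_a)
            - mean_pairwise_distance(set_b, set_b))

def anomaly_scores(sample_sets, n_neighbours=2):
    """Score each set by its mean divergence to its closest other sets."""
    scores = []
    for i, current in enumerate(sample_sets):
        others = sorted(energy_distance(current, other)
                        for j, other in enumerate(sample_sets) if j != i)
        scores.append(sum(others[:n_neighbours]) / n_neighbours)
    return scores

# Hypothetical 2-D sample sets: three similar "images" and one outlier.
sample_sets = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    [(0.1, 0.0), (1.0, 0.1), (0.0, 0.9)],
    [(0.0, 0.1), (0.9, 0.0), (0.1, 1.0)],
    [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0)],
]
scores = anomaly_scores(sample_sets)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
```

A set whose distribution is far from every other set receives a high score even when each of its individual points looks ordinary.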
16 Cosmology Applications
17 Find new scientific laws in physics. Goal: estimate the dynamical mass of galaxy clusters. Importance: galaxy clusters are the largest gravitationally bound systems in the Universe. Dynamical mass measurements are important for understanding the behavior of dark matter and normal matter. Difficulty: we can only measure the velocities of galaxies, not the mass of their cluster. Physicists estimate dynamical cluster mass from a single velocity dispersion. Our method: estimate the cluster mass from the whole distribution of velocities rather than just a single velocity dispersion.
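The difference between a single velocity dispersion and the whole velocity distribution can be seen on a toy example. The sketch below uses two made-up velocity samples that share the same dispersion but have different shapes; the numbers, bin edges, and function names are hypothetical and carry no physical meaning.

```python
import math
import statistics

# Two hypothetical sets of line-of-sight galaxy velocities (arbitrary units).
cluster_a = [-1.0, 1.0, -1.0, 1.0]
cluster_b = [-math.sqrt(2.0), 0.0, 0.0, math.sqrt(2.0)]

def velocity_dispersion(velocities):
    """Single-number summary: population standard deviation of the velocities."""
    return statistics.pstdev(velocities)

def histogram_features(velocities, edges):
    """Fraction of galaxies per velocity bin: a crude picture of the whole distribution."""
    counts = [0] * (len(edges) - 1)
    for v in velocities:
        for b in range(len(edges) - 1):
            if edges[b] <= v < edges[b + 1]:
                counts[b] += 1
                break
    return [c / len(velocities) for c in counts]

edges = [-2.0, -0.5, 0.5, 2.0]
dispersion_a = velocity_dispersion(cluster_a)
dispersion_b = velocity_dispersion(cluster_b)
features_a = histogram_features(cluster_a, edges)  # no galaxies near zero velocity
features_b = histogram_features(cluster_b, edges)  # half the galaxies near zero velocity
```

Both samples have the same dispersion, so a dispersion-only estimator cannot tell them apart, while any method that sees the full distribution can.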
18 Find new scientific laws in physics. Michelle Ntampaka et al., A Machine Learning Approach for Dynamical Mass Measurements of Galaxy Clusters, ApJ 2015
19 Find interesting Galaxy Clusters. Sloan Digital Sky Survey (SDSS) continuum spectrum: 505 galaxy clusters (10-50 galaxies in each), 7530 galaxies. What are the most anomalous galaxy clusters? The most anomalous galaxy cluster contains mostly star-forming blue galaxies, irregular galaxies. B. Póczos, L. Xiong & J. Schneider, UAI. Image labels: blue galaxy, red galaxy. Credits: ESA, NASA
20 Find the parameters of the Universe. Given a distribution of particles, our goal is to predict the parameters of the simulated universe.
21 Active Learning & Design Optimization
22 Recommend experiments to find the true parameters of the universe. True parameters, real universe, noisy observations; hypothetical parameters, mathematical model, simulated observations; hypothesis test. g( ): surrogate function over the parameter space. Computational problem: how to search the parameter space? Solution: learn a surrogate function and make experiment decisions using it. Image credits: NASA / ESA
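The idea of letting a surrogate decide which experiment to run next can be sketched in a few lines. Everything here is an assumption for illustration: the surrogate is a simple inverse-distance-weighted interpolant (the talk does not say which surrogate model is used), the exploration bonus and its weight are arbitrary, and `simulate_mismatch` is a hypothetical stand-in for an expensive simulation compared against observations.

```python
import math

def simulate_mismatch(theta):
    """Hypothetical stand-in for an expensive simulation-versus-observation mismatch."""
    return (theta - 0.3) ** 2 + 0.05 * math.sin(20 * theta)

def surrogate(theta, observed):
    """Inverse-distance-weighted average of observed mismatches (a cheap surrogate)."""
    num = den = 0.0
    for t, y in observed:
        d = abs(theta - t)
        if d == 0.0:
            return y
        w = 1.0 / d ** 2
        num += w * y
        den += w
    return num / den

def next_experiment(observed, candidates, explore=0.5):
    """Pick the untried candidate with the lowest surrogate value minus an exploration bonus."""
    def score(theta):
        nearest = min(abs(theta - t) for t, _ in observed)
        return surrogate(theta, observed) - explore * nearest
    untried = [c for c in candidates if all(c != t for t, _ in observed)]
    return min(untried, key=score)

candidates = [i / 100 for i in range(101)]          # 1-D parameter space on a grid
observed = [(t, simulate_mismatch(t)) for t in (0.0, 0.5, 1.0)]
initial_best = min(y for _, y in observed)
for _ in range(10):                                 # ten recommended experiments
    theta = next_experiment(observed, candidates)
    observed.append((theta, simulate_mismatch(theta)))
best_theta, best_value = min(observed, key=lambda pair: pair[1])
```

Each expensive evaluation is spent where the cheap surrogate suggests a low mismatch or where little is known yet, instead of sweeping the whole parameter space.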
23 Recommend experiments for drug discovery. Parameter space: parameters of drugs (compounds, quantities, etc.). Drug effects on the lab mouse. Expensive observations ($, time): real observations, simulated observations, expert predictions (bias, variance, fidelity, cost). Observations: blood samples, camera images, EEG, etc. g( ): surrogate function to optimize.
24 Learning Relationships from Simulations. Goal: predict the number of galaxies in a halo from a half dozen dark matter halo parameters (#particles in a halo, velocity dispersion, max circular velocity, half-mass radius, ...). Data: Millennium simulation, 395,832 halos. Method: support vector regression. Plot: predicted number of galaxies vs. true number of galaxies [Xiaoying Xu, 2012]
25 The Galaxy Zoo challenge. Crowdsourcing project: users are asked to describe the morphology of galaxies based on images. They are asked questions such as "How rounded is the galaxy?" and "Does it have a central bulge?" 37 different categories in a decision tree. Training set: JPG images of galaxies. Test set: JPG images of galaxies. Image resolution: 424x424 color JPEG images.
26 Willett et al.
27
28 The Large Synoptic Survey Telescope. Big data questions: 15 terabytes of data every night.
29 Other Examples in Physics
30 ML to Help Understand Turbulence
31 Turbulence Data Classification. Simulated fluid flow through time (JHU Turbulence Research Group, Alex Szalay). Goal: find interesting events, patterns, phenomena (vortices!). 11 positive, 20 negative examples. Results: leave-one-out cross-validation: 97%. Velocity distributions: positive (vortex), negative, negative.
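Leave-one-out cross-validation holds out each example in turn, trains on the rest, and reports the fraction of held-out examples classified correctly. With 31 examples, 30 correct predictions would give about 96.8%, which rounds to the reported 97%, although the slide does not state the count. The sketch below shows the procedure with a 1-nearest-neighbour classifier on made-up 1-D features; the classifier, the data, and the names are hypothetical and not the method from the talk.

```python
def leave_one_out_accuracy(features, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on 1-D features."""
    correct = 0
    for i, (x, y) in enumerate(zip(features, labels)):
        # Hold out example i and predict its label from the nearest remaining example.
        nearest = min((j for j in range(len(features)) if j != i),
                      key=lambda j: abs(features[j] - x))
        correct += labels[nearest] == y
    return correct / len(features)

# Hypothetical 1-D summary feature per flow sample; 1 = vortex, 0 = no vortex.
features = [0.9, 1.0, 1.2, 2.9, 3.0, 3.1, 1.15]
labels = [1, 1, 1, 0, 0, 0, 0]
accuracy = leave_one_out_accuracy(features, labels)  # 5 of 7 held-out examples correct
```

With so few labelled examples, leave-one-out uses nearly all of the data for training in every fold, which is why it is a natural choice for a 31-example data set.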
32 Find Interesting Phenomena in Turbulence Data. Anomaly detection: anomaly scores.
33 Finding Vortices. Classification probabilities.
34 Fusion power plants
35 Neuroscience
36 ML in Action: Neuroimaging. MEG/fMRI mind-reading contest; MRI lie detector; decoding thoughts from brain scans; rob a bank.
37 FuSSO = Functional Shrinkage and Selection Operator (Functional Lasso)
38 Function-to-Real regression
39 Many Functions-to-Real regression. Similarly, one may consider a mapping that takes in multiple functions:
40 Sparse Functions-to-Real regression. When the number of functional input covariates is very large, a sparse model that depends on only a few of the functional covariates may be preferred. Goal: find a sparse set of functional input covariates to predict a real-valued response.
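Two ingredients commonly used for this kind of sparse functional regression can be sketched briefly: projecting each input function onto an orthonormal basis, and shrinking whole groups of basis coefficients so that entire functional covariates drop out. The sketch assumes a cosine basis on [0, 1] and a group soft-thresholding step (the proximal operator of a group-lasso penalty); the slides do not spell out these details, and the names and numbers are hypothetical.

```python
import math

def cosine_basis_coefficients(values, n_basis):
    """Project a function sampled at midpoints of a uniform grid on [0, 1] onto a cosine basis."""
    n = len(values)
    grid = [(i + 0.5) / n for i in range(n)]
    coefficients = []
    for m in range(n_basis):
        scale = 1.0 if m == 0 else math.sqrt(2.0)
        basis = [scale * math.cos(math.pi * m * t) for t in grid]
        coefficients.append(sum(v * b for v, b in zip(values, basis)) / n)  # midpoint rule
    return coefficients

def group_soft_threshold(group, threshold):
    """Shrink a coefficient group toward zero; groups with small norm vanish entirely."""
    norm = math.sqrt(sum(c * c for c in group))
    if norm <= threshold:
        return [0.0] * len(group)
    return [(1.0 - threshold / norm) * c for c in group]

# One coefficient group per functional covariate (hypothetical values).
groups = [[3.0, 4.0], [0.3, 0.4]]
shrunk = [group_soft_threshold(g, threshold=1.0) for g in groups]
selected = [j for j, g in enumerate(shrunk) if any(c != 0.0 for c in g)]

# Coefficients of the constant function f(t) = 1 sampled at 50 midpoints.
constant_coefficients = cosine_basis_coefficients([1.0] * 50, n_basis=3)
```

Because the penalty acts on the norm of each group rather than on single coefficients, a covariate is either kept with all of its basis coefficients or removed as a whole, which is what makes the selected set of functions sparse.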
41 FuSSO Example Applications. Finance. Inputs: time series of several product prices in the past. Output: price of a particular product in the near future. Figure labels: A's prices, J's prices, K's prices, future price.
42 FuSSO Applications in Neuroimaging. Inputs: functions at each voxel (e.g. orientation distribution functions). Output: the age of the subject. Figure labels: voxels, ODFs, age. Image credit:
43 Results: Neuroimaging dataset. Dataset with over 25K functions per subject for 89 total subjects (18 to 60 years old). Orientation distribution functions (ODF) at white matter voxels. Goal: predict the subject's age, given ODFs. We compared to LASSO with peak ODF (quantitative anisotropy, QA) values, a finite-dimensional non-functional data set. Figure labels: ages, example voxel ODF. Image Sources:
44 Results: Neuroimaging dataset. Results table: Method (FuSSO (ODFs), LASSO (QAs), Mean Predict) vs. MSE. Figures: absolute errors per subject; selected voxels. Mean error: 8.3 years. Naïve approach error: 12.5 years.
45 Agriculture
46 Agriculture: recommend experiments (which plants to cross) to sorghum breeders.
47 CMU Robot. 1 PB data
48
49 Name, Range, RMSE error: Leaf angle* (4.35%); Leaf radiation angle* (3.60%); Leaf length* (2.49%); Leaf width [max] (7.48%); Leaf width [average] (7.02%); Leaf area* (6.08%)
50 Grapes datasets
51 Take me home! ML on complex objects: ML on distributions, Lasso on functions. Active learning and design optimization. Applications: cosmology, drug design, agriculture, neuroscience.
52 Thanks for your attention! If interested, please contact me! GHC
53 Linear Functional Regression. One Real Vector vs. Functional Covariate. Real vector covariate: $Y_i = \langle X_i, w \rangle + \epsilon_i$, with $X_i, w \in \mathbb{R}^d$ and $\langle X_i, w \rangle = \sum_{j=1}^{d} X_{ij} w_j$. Functional covariate: $Y_i = \langle f^{(i)}, g \rangle + \epsilon_i$, with $f^{(i)}, g \in L_2(\Psi)$ and $\langle f^{(i)}, g \rangle = \int_{\Psi} f^{(i)}(t)\, g(t)\, dt$.
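The two inner products on this slide differ only in replacing a sum over coordinates with an integral over the domain. A minimal numerical sketch, assuming the domain is the interval [0, 1] and using a midpoint rule for the integral (the functions, weights, and names below are hypothetical):

```python
def l2_inner_product(f, g, lower=0.0, upper=1.0, n_steps=1000):
    """Midpoint-rule approximation of the integral of f(t) * g(t) over [lower, upper]."""
    width = (upper - lower) / n_steps
    midpoints = (lower + (i + 0.5) * width for i in range(n_steps))
    return width * sum(f(t) * g(t) for t in midpoints)

def vector_inner_product(x, w):
    """Finite-dimensional counterpart: sum over coordinates."""
    return sum(x_j * w_j for x_j, w_j in zip(x, w))

# Noise-free predictions for one hypothetical covariate of each kind.
functional_prediction = l2_inner_product(lambda t: t, lambda t: 2.0 * t)  # integral of 2 t^2 on [0, 1]
vector_prediction = vector_inner_product([1.0, 2.0, 3.0], [0.5, 0.0, -1.0])
```

In the functional case the weight $g$ is itself a function, so fitting the model means estimating a function rather than a finite vector of coefficients.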
Homework on Properties of Galaxies in the Hubble Deep Field Name: Due: Friday, April 8 30 points Prof. Rieke & TA Melissa Halford You are going to work with some famous astronomical data in this homework.
More informationFrom the Big Bang to Big Data. Ofer Lahav (UCL)
From the Big Bang to Big Data Ofer Lahav (UCL) 1 Outline What is Big Data? What does it mean to computer scientists vs physicists? The Alan Turing Institute Machine learning examples from Astronomy The
More informationHomework #7: Properties of Galaxies in the Hubble Deep Field Name: Due: Friday, October points Profs. Rieke
Homework #7: Properties of Galaxies in the Hubble Deep Field Name: Due: Friday, October 31 30 points Profs. Rieke You are going to work with some famous astronomical data in this homework. The image data
More informationTales from fmri Learning from limited labeled data. Gae l Varoquaux
Tales from fmri Learning from limited labeled data Gae l Varoquaux fmri data p 100 000 voxels per map Heavily correlated + structured noise Low SNR: 5% 13 db Brain response maps (activation) n Hundreds,
More informationThe Perceptron. Volker Tresp Summer 2016
The Perceptron Volker Tresp Summer 2016 1 Elements in Learning Tasks Collection, cleaning and preprocessing of training data Definition of a class of learning models. Often defined by the free model parameters
More informationBe able to define the following terms and answer basic questions about them:
CS440/ECE448 Section Q Fall 2017 Final Review Be able to define the following terms and answer basic questions about them: Probability o Random variables, axioms of probability o Joint, marginal, conditional
More informationMachine Learning: Evaluation
Machine Learning: Evaluation Information Systems and Machine Learning Lab (ISMLL) University of Hildesheim Wintersemester 2007 / 2008 Comparison of Algorithms Comparison of Algorithms Is algorithm A better
More informationWarm up: risk prediction with logistic regression
Warm up: risk prediction with logistic regression Boss gives you a bunch of data on loans defaulting or not: {(x i,y i )} n i= x i 2 R d, y i 2 {, } You model the data as: P (Y = y x, w) = + exp( yw T
More informationLearning From Data Lecture 15 Reflecting on Our Path - Epilogue to Part I
Learning From Data Lecture 15 Reflecting on Our Path - Epilogue to Part I What We Did The Machine Learning Zoo Moving Forward M Magdon-Ismail CSCI 4100/6100 recap: Three Learning Principles Scientist 2
More informationOutline Challenges of Massive Data Combining approaches Application: Event Detection for Astronomical Data Conclusion. Abstract
Abstract The analysis of extremely large, complex datasets is becoming an increasingly important task in the analysis of scientific data. This trend is especially prevalent in astronomy, as large-scale
More informationVBM683 Machine Learning
VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra Bias is the algorithm's tendency to consistently learn the wrong thing by not taking into account all the information in the data
More informationMachine Learning 2007: Slides 1. Instructor: Tim van Erven Website: erven/teaching/0708/ml/
Machine 2007: Slides 1 Instructor: Tim van Erven (Tim.van.Erven@cwi.nl) Website: www.cwi.nl/ erven/teaching/0708/ml/ September 6, 2007, updated: September 13, 2007 1 / 37 Overview The Most Important Supervised
More information