An Introduction to Statistics and Machine Learning for Quantitative Biology. Anirvan Sengupta, Dept. of Physics and Astronomy, Rutgers University
2 Why Do We Care? Necessity in today's labs. Principled approach: defending conclusions, comparing methods, ... Insights for biological information processing?
3 What Can We Expect to Do Today? Central recurring themes. Orientation and connections. Resources.
5 Data Deluge in Biology. http://biomedicalcomputationreview.org http://ugene.net/learn.html https://sites.stanford.edu/baccuslab/ http://mmtg.fel.cvut.cz/mapsim/
6 Then and Now. Then: data = numbers, questions = >, <, !=. Now: data = 1-10 TB, questions = tests. Ronald Fisher (on the right) during tea time at Rothamsted Experimental Station (1920s), http://dissertationreviews.org/archives/724. New York Genome Center, http://blogs.scientificamerican.com/
7 How Much to Learn from Data? https://shapeofdata.wordpress.com/2013/03/26/general-regression-and-over-fitting/
8 Model Selection: Bias-Variance Tradeoff. Underfitting vs. overfitting. http://scott.fortmann-roe.com/docs/biasvariance.html
9 Probability Distributions. Discrete distributions: binomial, Poisson, ... Binomial: $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$. Poisson: the limit of $n$ large, $p$ small with $np = \lambda$ fixed. Continuous distributions: normal, chi-squared, Student's t, F, ... $P(a \le X \le b) = \int_a^b p(x \mid \mu, \sigma^2)\,dx$, with $p(x \mid \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$. http://statistics.wikidot.com
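Not part of the slides: a minimal numerical sketch of the distributions above, assuming numpy and scipy are available; the parameter values are made up for illustration.

```python
import numpy as np
from scipy import stats

n, p = 1000, 0.002                        # n large, p small, np = 2 fixed
k = np.arange(8)
binom_pmf = stats.binom.pmf(k, n, p)      # binomial P(X = k)
pois_pmf = stats.poisson.pmf(k, n * p)    # Poisson limit with rate np
print(np.max(np.abs(binom_pmf - pois_pmf)))   # small: Poisson approximates the binomial

mu, sigma = 0.0, 1.0
a, b = -1.0, 1.0
# P(a <= X <= b) for the normal distribution, via its CDF
print(stats.norm.cdf(b, mu, sigma) - stats.norm.cdf(a, mu, sigma))   # ~0.683
```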
10 Distributions Derived from the Standard Normal Distribution ($\mu = 0$, $\sigma^2 = 1$). Chi-squared distribution: if $z_1, \dots, z_k \overset{\text{iid}}{\sim} N(0,1)$, then $Q = \sum_{i=1}^{k} z_i^2 \sim \chi^2(k)$. Student's t distribution: if $z \sim N(0,1)$ and $Q \sim \chi^2(k)$, then $T = z/(Q/k)^{1/2} \sim t(k)$. F distribution: if $Q_1 \sim \chi^2(k_1)$ and $Q_2 \sim \chi^2(k_2)$, then $\frac{Q_1/k_1}{Q_2/k_2} \sim F(k_1, k_2)$.
11 Estimation. $p(x \mid \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$. How to find $\mu$, $\sigma^2$ from observations?
12 Candidate Estimators. $\hat\mu(x) = \frac{\sum_i x_i}{n} = \bar{x}$, $\hat\sigma^2(x) = \frac{\sum_i (x_i - \bar{x})^2}{n-1} = s^2$.
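For concreteness, a small illustrative check of these two estimators on a synthetic sample (the sample itself is made up; not from the slides).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=100)   # hypothetical observations

mu_hat = x.mean()          # sample mean, estimate of mu
s2 = x.var(ddof=1)         # sample variance with the n-1 denominator, estimate of sigma^2
print(mu_hat, s2)
```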
13 Hypothesis Testing. $p(x \mid \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$. How to decide from observations whether the hypothesis $H_1: \mu > 6$ is true, as opposed to $H_0: \mu = 6$?
14 Test Statistic. $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$ under $H_0$; $T = \frac{\bar{x} - \mu}{s/\sqrt{n}} \sim t(n-1)$ under $H_0$.
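A hedged sketch of the one-sample t-test for the $H_0: \mu = 6$ example above, using scipy; the data are synthetic and the one-sided p-value is derived from the two-sided result by hand.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(loc=6.5, scale=1.0, size=20)    # made-up observations

t_stat, p_two_sided = stats.ttest_1samp(x, popmean=6.0)
# one-sided p-value for H1: mu > 6
p_one_sided = p_two_sided / 2 if t_stat > 0 else 1 - p_two_sided / 2
print(t_stat, p_one_sided)

# The same statistic by hand: T = (xbar - mu0) / (s / sqrt(n)) ~ t(n-1) under H0
T = (x.mean() - 6.0) / (x.std(ddof=1) / np.sqrt(len(x)))
print(T)
```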
15 Type I and Type II Error. If $H_0$ is true: accepting $H_0$ is correct, rejecting $H_0$ is a Type I error (probability $\alpha$). If $H_0$ is false: accepting $H_0$ is a Type II error (probability $\beta$), rejecting $H_0$ is correct.
16 Regression and Goodness of Fit. http://randomanalyses.blogspot.com/2011/12/basics-of-regression-and-model-fitting.html
17 F-test for Linear Regression. $\sum_i (y_i - \bar{y})^2 = \sum_i (\hat{y}_i - \bar{y})^2 + \sum_i (y_i - \hat{y}_i)^2$, i.e., SST = SSM + SSE: the (corrected) sum of squares total, the (corrected) sum of squares for the model, and the sum of squares for error. $F = \frac{SSM/(p-1)}{SSE/(n-p)}$. Related measure: the coefficient of determination $R^2 = \frac{SSM}{SST}$.
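An illustrative computation of SST, SSM, SSE, $F$ and $R^2$ for a simple straight-line fit; the data are synthetic and $p$ counts the fitted parameters (slope and intercept here). This is a sketch, not part of the slides.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 30)
y = 1.5 * x + 2.0 + rng.normal(scale=2.0, size=x.size)

slope, intercept = np.polyfit(x, y, 1)     # least-squares line
y_hat = slope * x + intercept

n, p = len(y), 2
SST = np.sum((y - y.mean()) ** 2)
SSM = np.sum((y_hat - y.mean()) ** 2)
SSE = np.sum((y - y_hat) ** 2)

F = (SSM / (p - 1)) / (SSE / (n - p))
R2 = SSM / SST
p_value = stats.f.sf(F, p - 1, n - p)      # upper tail of the F(p-1, n-p) distribution
print(F, R2, p_value)
```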
18 Likelihood. $L(\theta \mid x) = P(x \mid \theta)$; e.g. $L(\mu, \sigma^2 \mid x_1, \dots, x_n) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left[-\frac{(x_i-\mu)^2}{2\sigma^2}\right]$.
19 Likelihood Surface. The peak gets sharper and sharper with more data. http://reliawiki.org/index.php/appendix:_maximum_likelihood_estimation_example
20 Example: Normal Distribution. $L(\mu, \sigma^2 \mid x_1, \dots, x_n) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left[-\frac{(x_i-\mu)^2}{2\sigma^2}\right]$. Maximizing the likelihood gives $\hat\mu(x) = \frac{\sum_i x_i}{n} = \bar{x}$ and $\hat\sigma^2(x) = \frac{\sum_i (x_i - \bar{x})^2}{n}$.
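A sketch (not from the slides) that verifies the closed-form MLE numerically by minimizing the negative log-likelihood with scipy; the data are synthetic, and the optimizer works in $\log\sigma$ to keep $\sigma$ positive.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
x = rng.normal(loc=2.0, scale=1.5, size=200)

def neg_log_likelihood(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    # -log L up to an additive constant: n log(sigma) + sum (x - mu)^2 / (2 sigma^2)
    return x.size * np.log(sigma) + 0.5 * np.sum((x - mu) ** 2) / sigma ** 2

res = minimize(neg_log_likelihood, x0=[0.0, 0.0])
mu_mle, sigma2_mle = res.x[0], np.exp(res.x[1]) ** 2
print(mu_mle, sigma2_mle)              # ~ x.mean(), ~ x.var(ddof=0)  (n denominator)
print(x.mean(), x.var(ddof=0))
```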
21 Limited Noisy Data with Small Signal. Imagine we had to estimate only the mean $\mu$: $L(\mu \mid x_1, \dots, x_n) = \prod_{i=1}^{n} \frac{1}{\sigma_0\sqrt{2\pi}} \exp\!\left[-\frac{(x_i-\mu)^2}{2\sigma_0^2}\right]$. Should we use the MLE $\hat\mu(x) = \frac{\sum_i x_i}{n} = \bar{x}$? What if we choose, instead, $\hat\mu_0(x) = 0$? Then $E[(\hat\mu(x)-\mu)^2] = \frac{\sigma_0^2}{n}$ (error from noise, i.e., variance), while $E[(\hat\mu_0(x)-\mu)^2] = \mu^2$ (error from bias). The trivial estimator is better when $|\mu| < \frac{\sigma_0}{\sqrt{n}}$!
22 Regularize to Shrink the Estimate. For estimating only the mean $\mu$, maximize $\exp\!\left[-\frac{\mu^2}{2\lambda^2}\right] \prod_{i=1}^{n} \frac{1}{\sigma_0\sqrt{2\pi}} \exp\!\left[-\frac{(x_i-\mu)^2}{2\sigma_0^2}\right]$. We use the estimator $\hat\mu(x) = \frac{\sum_i x_i}{n + \sigma_0^2/\lambda^2}$. When $\lambda \ll \frac{\sigma_0}{\sqrt{n}}$ we get $\hat\mu(x) \approx 0$!
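A small simulation of the bias-variance argument above (illustrative only; the values of $\sigma_0$, $\lambda$, $\mu$ and $n$ are arbitrary choices in the small-signal regime $|\mu| < \sigma_0/\sqrt{n}$).

```python
import numpy as np

rng = np.random.default_rng(4)
sigma0, lam, mu_true, n = 1.0, 0.1, 0.05, 10

def mse(estimator, n_trials=20000):
    x = rng.normal(mu_true, sigma0, size=(n_trials, n))
    return np.mean((estimator(x) - mu_true) ** 2)

mle = lambda x: x.mean(axis=1)                                  # mu_hat = xbar
shrunk = lambda x: x.sum(axis=1) / (n + sigma0**2 / lam**2)     # regularized estimator
print(mse(mle), mse(shrunk))    # the shrunken estimator has smaller MSE in this regime
```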
23 Regularization via the Bayesian Framework. $P(\theta \mid x) = \frac{P(x, \theta)}{P(x)} = \frac{P(x \mid \theta)\, P_0(\theta)}{P(x)}$, where $P(x) = \int d\theta'\, P(x \mid \theta')\, P_0(\theta')$. The prior $P_0(\theta)$ incorporates additional constraints on the parameters.
24 Maximum A Posteriori Estimation. $\hat\theta = \arg\max_\theta P(\theta \mid x) = \arg\max_\theta \frac{P(x \mid \theta)\, P_0(\theta)}{P(x)}$. The maximum likelihood estimate corresponds to a uniform prior, $P_0(\theta) = \text{const}$. One could also sample $\theta$ from the posterior distribution $P(\theta \mid x)$ to get a sense of the range of possibilities. Sometimes averaging over $\theta$ is a better idea than maximization.
25 Likelihood Improvement with Higher Complexity. The maximized likelihood of the more complex model is at least as large as that of the simpler one: $L_2 \ge L_1$.
26 Evidence. $P(x) = \int d\theta'\, P(x, \theta') = \int d\theta'\, P(x \mid \theta')\, P_0(\theta')$ (likelihood times prior). Since $\int dx\, P(x) = 1$ for every model, less constrained models spread their probability over more possible datasets and therefore have lower evidence for any particular dataset.
27 An Example: Tied Means. Compare the evidence of a model in which all observations share a single mean, $\int d\mu\, \frac{1}{\lambda\sqrt{2\pi}} \exp\!\left[-\frac{\mu^2}{2\lambda^2}\right] \prod_{i=1}^{n} \frac{1}{\sigma_0\sqrt{2\pi}} \exp\!\left[-\frac{(x_i-\mu)^2}{2\sigma_0^2}\right]$, with that of a model in which each observation has its own mean, $\prod_{i=1}^{n} \int d\mu_i\, \frac{1}{\sigma_0\sqrt{2\pi}} \exp\!\left[-\frac{(x_i-\mu_i)^2}{2\sigma_0^2}\right] \frac{1}{\lambda\sqrt{2\pi}} \exp\!\left[-\frac{\mu_i^2}{2\lambda^2}\right]$.
28 Bayesian Hypothesis Testing: Going Beyond p-values. $\frac{\Pr(\text{Disease} \mid \text{test positive})}{\Pr(\text{No Disease} \mid \text{test positive})} = \frac{\Pr(\text{test positive} \mid \text{Disease})}{\Pr(\text{test positive} \mid \text{No Disease})} \cdot \frac{P(\text{Disease})}{P(\text{No Disease})}$
29 Bayesian Hypothesis Testing. $P(H \mid x) = \frac{P(H) \int d\theta'\, P(x \mid \theta')\, P_0(\theta' \mid H)}{P(x)}$, where $P(x) = \sum_{H'} P(H') \int d\theta'\, P(x \mid \theta')\, P_0(\theta' \mid H')$. Hence $\frac{P(H_0 \mid x)}{P(H_1 \mid x)} = \frac{P(H_0)}{P(H_1)} \cdot \frac{\int d\theta'\, P(x \mid \theta')\, P_0(\theta' \mid H_0)}{\int d\theta'\, P(x \mid \theta')\, P_0(\theta' \mid H_1)}$. Posterior odds = prior odds × Bayes factor.
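A hedged sketch of a Bayes factor for the normal-mean setting used earlier, with $H_0: \mu = 0$ versus $H_1: \mu \sim N(0, \lambda^2)$; the evidence under $H_1$ is obtained by numerical integration over $\mu$, and the data, $\sigma_0$ and $\lambda$ are illustrative choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x = rng.normal(loc=0.3, scale=1.0, size=25)
sigma0, lam = 1.0, 1.0

def log_likelihood(mu):
    mu = np.atleast_1d(mu)
    return np.sum(stats.norm.logpdf(x[None, :], loc=mu[:, None], scale=sigma0), axis=1)

# Evidence under H0 (no free parameter): the likelihood at mu = 0
log_ev_H0 = log_likelihood(0.0)[0]

# Evidence under H1: integrate likelihood x prior over a grid of mu
mu_grid = np.linspace(-5, 5, 2001)
ratio = np.exp(log_likelihood(mu_grid) - log_ev_H0)        # likelihood relative to mu = 0
integrand = ratio * stats.norm.pdf(mu_grid, 0.0, lam)       # times the prior under H1
bayes_factor_H1_vs_H0 = integrand.sum() * (mu_grid[1] - mu_grid[0])
print(bayes_factor_H1_vs_H0)   # >1 favours H1; multiply by the prior odds for posterior odds
```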
30 More Complex Tasks in Data Analysis. Supervised: classification, regression, ... Unsupervised: clustering, discovering latent factors, network structures, ...
31 Clustering Gene Expression
32 Latent Variable Models: Clustering and Mixture Models. $p(x \mid \lambda) = \sum_i w_i\, g(x \mid \mu_i, \Sigma_i)$, with $g(x \mid \mu_i, \Sigma_i) = \frac{1}{(2\pi)^{n/2} |\Sigma_i|^{1/2}} \exp\!\left\{-\frac{1}{2}(x-\mu_i)' \Sigma_i^{-1} (x-\mu_i)\right\}$. https://en.wikipedia.org/wiki/Cluster_analysis. Challenge: learning the parameters $w_i$, $\mu_i$, $\Sigma_i$ (expectation maximization, sampling, ...).
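A minimal sketch of fitting such a Gaussian mixture by expectation maximization with scikit-learn; the two-dimensional synthetic data stand in for expression profiles, which is an assumption made purely for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(6)
X = np.vstack([rng.normal(0, 1, size=(100, 2)),
               rng.normal(4, 1, size=(100, 2))])       # two synthetic clusters

gm = GaussianMixture(n_components=2, covariance_type='full', random_state=0).fit(X)
print(gm.weights_)          # w_i
print(gm.means_)            # mu_i
labels = gm.predict(X)      # hard assignments from the posterior responsibilities
```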
33 Latent Variable Models: Hidden Markov Models. $P(o, s) = P(s)\,P(o \mid s) = \pi_{s_0} b_{s_0}(o_0)\, a_{s_0,s_1} b_{s_1}(o_1)\, a_{s_1,s_2} b_{s_2}(o_2)\, a_{s_2,s_3} \dots$ http://artint.info/html/artint_161.html http://compbio.pbworks.com/
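To make the factorized form concrete, here is a sketch of the forward algorithm computing $P(o) = \sum_s P(o, s)$ for a small discrete HMM; the transition matrix $a$, emission matrix $b$, initial distribution $\pi$ and observation sequence are all made up.

```python
import numpy as np

pi = np.array([0.6, 0.4])                   # initial state probabilities pi_{s0}
a = np.array([[0.7, 0.3],                   # a[s, s'] = P(s_{t+1} = s' | s_t = s)
              [0.2, 0.8]])
b = np.array([[0.9, 0.1],                   # b[s, o] = P(o_t = o | s_t = s)
              [0.3, 0.7]])
obs = [0, 1, 1, 0]                          # observed symbol sequence

alpha = pi * b[:, obs[0]]                   # alpha_0(s) = pi_s * b_s(o_0)
for o in obs[1:]:
    alpha = (alpha @ a) * b[:, o]           # recursion: sum over the previous state
print(alpha.sum())                          # total probability of the observations, P(o)
```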
34 Diffusive HMM for Single Molecule Phenomena Beausang et al, Biophys J, 2007
35 Generation vs. Prediction. So far we have dealt with explicit probability models of the data. What if we have no clue about a precise probability model?
36 Rosenblatt's Perceptron. http://sebastianraschka.com/articles/2015_singlelayer_neurons.html
37 Learning the Weights. $w_{t+1} = w_t + \eta\,(y_i - \hat{y}_i)\,x_i$. Good learning: the decision surface does not depend too much on the particular subset of training data. Test by cross-validation.
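A minimal perceptron training loop implementing the update rule above; the separable toy data, labels in {0, 1}, learning rate and epoch count are all illustrative assumptions, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)            # linearly separable toy labels
X = np.hstack([X, np.ones((100, 1))])              # append 1 so the bias is part of w

w, eta = np.zeros(3), 0.1
for epoch in range(20):
    for xi, yi in zip(X, y):
        y_pred = int(xi @ w > 0)
        w += eta * (yi - y_pred) * xi               # w_{t+1} = w_t + eta (y_i - y_hat_i) x_i
print(np.mean((X @ w > 0).astype(int) == y))        # training accuracy
```

In practice one would report accuracy on held-out folds (cross-validation) rather than on the training set, as the slide suggests.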
38 Linearly Non-separable Cases
39 Parametrizing Complex Decision Boundaries. Shallow methods: SVM, boosting, ... Deep methods: layered neural nets.
40 Classification: Kernel Methods/SVM. $\phi(x) = \sum_i \alpha_i y_i K(x, x_i)$
41 SVM Kernels. Linear: $K(x, y) = x \cdot y$. Polynomial: $K(x, y) = (x \cdot y + 1)^p$. Radial basis function: $K(x, y) = \exp\!\left(-\frac{(x-y)^2}{\sigma^2}\right)$.
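A hedged example of the three kernels via scikit-learn's SVC on a toy, radially separable dataset; the mapping of the slide's $p$ and $\sigma$ onto the `degree` and `gamma` parameters is the library's convention, and the data are made up.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(8)
X = rng.normal(size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1).astype(int)     # not linearly separable

for clf in [SVC(kernel='linear'),
            SVC(kernel='poly', degree=3),
            SVC(kernel='rbf', gamma=1.0)]:            # gamma plays the role of 1/sigma^2
    clf.fit(X, y)
    print(clf.kernel, clf.score(X, y))                # the RBF kernel fits this data best
```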
42 Classifying Regulatory Sequences Djordjevic, Sengupta and Shraiman, Genome Res, 2003
43 Back to Probability Models: Restricted Boltzmann Machine. A restricted Boltzmann machine (RBM) with visible units $x$ and hidden units $h$ defines $P(x, h) = \exp(-E(x, h))/Z$, with energy $E(x, h) = -\sum_{ia} x_i W_{ia} h_a$ and partition function $Z = \sum_{\{x, h\}} \exp(-E(x, h))$. Smolensky, 1986; Hinton and Salakhutdinov, Science, 2006.
44 MNIST Data and RBM. https://corpocrat.com/2014/10/17/machine-learning-using-restricted-boltzmann-machines/
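A minimal sketch of training an RBM on binary data with scikit-learn's BernoulliRBM; random binary vectors stand in for binarized MNIST pixels here, and the hyperparameters are placeholder choices.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

rng = np.random.default_rng(9)
X = (rng.random((500, 64)) > 0.5).astype(float)    # placeholder for binarized images

rbm = BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=10, random_state=0)
rbm.fit(X)
print(rbm.components_.shape)     # learned weight matrix W (hidden x visible)
H = rbm.transform(X)             # P(h_a = 1 | x) for each sample, the hidden representation
```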
45 Stacking RBMs into Deep Boltzmann Machines (DBM): individual RBMs over the layers $x$, $h^1$, $h^2$ are stacked to form a DBM.
46 Multilayer Artificial Neural Nets / Deep Learning (ANN 2.0)
47 Forget Probabilities: Feedforward Deep Nets. $h^{(1)}_{M_1 \times 1} = \sigma(W^{(1)}_{M_1 \times N}\, x_{N \times 1} + b_1)$, $h^{(2)}_{M_2 \times 1} = \sigma(W^{(2)}_{M_2 \times M_1}\, h^{(1)}_{M_1 \times 1} + b_2)$, ..., $h^{(d)}_{M_d \times 1} = \sigma(W^{(d)}_{M_d \times M_{d-1}}\, h^{(d-1)}_{M_{d-1} \times 1} + b_d)$.
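A sketch of this layered map in numpy: each layer applies a nonlinearity $\sigma$ to an affine transform of the previous layer. The random weights, layer widths and the choice of the logistic function for $\sigma$ are assumptions for illustration; a trained network would obtain its weights by backpropagation.

```python
import numpy as np

rng = np.random.default_rng(10)
sigma = lambda z: 1.0 / (1.0 + np.exp(-z))     # logistic nonlinearity

N, sizes = 10, [32, 16, 1]                     # input dimension and layer widths M1, ..., Md
x = rng.normal(size=N)

h, in_dim = x, N
for M in sizes:
    W = rng.normal(scale=0.1, size=(M, in_dim))   # W^(l), shape M_l x M_{l-1}
    b = np.zeros(M)
    h = sigma(W @ h + b)                          # h^(l) = sigma(W^(l) h^(l-1) + b_l)
    in_dim = M
print(h)                                          # output of the deepest layer
```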
48 Applications to Biology. Angermueller et al, Mol Syst Bio, 2016, adapted from Alipanahi et al, Nat Biotechnol, 2015.
49 Summary. Complexity of models and overfitting. Significance tests. Cross-validation.
50 Tools
51 Books