Deep Learning: A Quick Overview


1 Deep Learning: A Quick Overview
Seungjin Choi
Department of Computer Science and Engineering
Pohang University of Science and Technology
77 Cheongam-ro, Nam-gu, Pohang 37673, Korea
seungjin@postech.ac.kr
October 16

2 Machine Learning?
(Figure: environments and data sources feed a model; learning and inference connect them.)
Machine Learning: a scientific discipline that is concerned with the design and development of algorithms that allow computers to learn from empirical data (sensor data or databases) and to make predictions.
Descriptive or predictive learning.

3 Models in ML
Generative model: joint distribution $p(v, h) = p(v \mid h)\, p(h)$.
Discriminative model: directly model $p(h \mid v)$.
Undirected model: energy-based model,
$p(v, h) = \frac{\exp\{-E(v, h)\}}{\sum_{v', h'} \exp\{-E(v', h')\}}$.

4 Generative vs Energy-based Models
Generative model: $p(v, h) = p(v \mid h)\, p(h)$.
- Generation from the model is easy.
- Inference can be hard.
- Learning is easy after inference.
Energy-based model: $p(v, h) = \frac{\exp\{-E(v, h)\}}{\sum_{v', h'} \exp\{-E(v', h')\}}$.
- Generation from the model is hard.
- Inference can be easy.
- Learning is relatively hard?

5 Neural Networks History
- Warren McCulloch and Walter Pitts (1943): McCulloch-Pitts model
- Donald Hebb (1949): The Organization of Behavior (Hebbian learning)
- Frank Rosenblatt (1957): Perceptron
- Bernard Widrow and Marcian Hoff (1959): ADALINE
- John Hopfield (1982): Hopfield net
- Erkki Oja (1982): Oja's rule
- David E. Rumelhart, Geoffrey E. Hinton, and R. J. Williams (1986): Error backpropagation algorithm for MLPs
- George Cybenko (1989): Universal approximation theorem
- Radford Neal (1996): Bayesian learning for neural networks

6 Recent Time Line of Deep Learning
- Kunihiko Fukushima (1980): Neocognitron
- Paul Smolensky (1986): Harmonium
- Geoffrey E. Hinton and Terry Sejnowski (1986): Boltzmann machines
- Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner (1998): Convolutional neural networks
- Geoffrey E. Hinton and his colleagues (mid-2000s): RBM pre-training
- Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh (2006): Deep belief networks
- Ruslan Salakhutdinov and Geoffrey E. Hinton (2009): Deep Boltzmann machines

7 Success and Issues
Why successful?
- Pre-training: restricted Boltzmann machine (RBM), auto-encoder, nonnegative matrix factorization (NMF)
- Training: dropout
- Rectified linear units: no vanishing gradient, sparse activation (see the sketch after this slide)
- Thanks to plenty of data and computing power!
Issues
- Distributions: exponential family harmonium, deep exponential families
- Hyperparameter tuning: Bayesian optimization
- Scalability: squeezing deep nets, training and inference
- Multi-modal extensions: dual-wing harmonium, restricted deep belief nets, multi-modal stacked auto-encoders, and multi-modal deep Boltzmann machines
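A minimal NumPy sketch (not from the slides) of the two ReLU properties mentioned above: the gradient does not vanish for large pre-activations on the active side, and negative pre-activations are zeroed out, giving sparse activations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Gradients at pre-activations far from zero: the sigmoid saturates,
# while the ReLU derivative stays exactly 1 on the active side.
x = np.array([-10.0, -1.0, 1.0, 10.0])

sigmoid_grad = sigmoid(x) * (1.0 - sigmoid(x))   # near 0 for |x| large
relu_grad = (x > 0).astype(float)                # 0 or 1, never shrinks

print("sigmoid grad:", sigmoid_grad)
print("relu grad:   ", relu_grad)
print("sparse ReLU output:", np.maximum(0.0, x))  # negatives are zeroed out
```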

8 Restricted Boltzmann Machines

9 Harmonium (or Restricted Boltzmann Machine)
(Figure: bipartite graph with hidden units $h_1, \ldots, h_K$ and visible units $v_1, v_2, \ldots, v_D$.)
Harmonium (Smolensky, 1986), aka RBM (Hinton and Sejnowski, 1986).
Undirected model which allows only inter-layer connections (bipartite graph).
An energy-based probabilistic model defines a probability distribution through an energy function, associating an energy with each configuration of the variables of interest:
$p(v) = \sum_h p(v, h) = \sum_h \frac{1}{Z} e^{-E(v, h)}$.
Learning corresponds to modifying the energy function so that its shape has desirable properties (maximum likelihood estimation).

10 RBM: $v \in \{0, 1\}^D$ and $h \in \{0, 1\}^K$
Energy is given by
$E(v, h) = -b^\top v - c^\top h - v^\top W h$.
Conditional probabilities are calculated as:
$p(h_j = 1 \mid v) = \sigma\Big(c_j + \sum_{i=1}^{D} W_{ij} v_i\Big)$, $\quad p(v_i = 1 \mid h) = \sigma\Big(b_i + \sum_{j=1}^{K} W_{ij} h_j\Big)$.

Gaussian-Bernoulli RBM: $v \in \mathbb{R}^D$ and $h \in \{0, 1\}^K$
Energy is given by
$E(v, h) = \sum_{i=1}^{D} \frac{(v_i - b_i)^2}{2\sigma_i^2} - \sum_{j=1}^{K} c_j h_j - \sum_{i=1}^{D} \sum_{j=1}^{K} \frac{v_i}{\sigma_i} W_{ij} h_j$.
Conditional probabilities are calculated as:
$p(h_j = 1 \mid v) = \sigma\Big(c_j + \sum_{i=1}^{D} \frac{W_{ij} v_i}{\sigma_i}\Big)$,
$p(v_i \mid h) = \mathcal{N}\Big(v_i \,\Big|\, b_i + \sigma_i \sum_{j=1}^{K} W_{ij} h_j,\ \sigma_i^2\Big)$.
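The binary-RBM conditionals translate directly into code. The following NumPy sketch (sizes and initialization are illustrative assumptions, not from the slides) samples $h$ given $v$ and $v$ given $h$.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

D, K = 6, 4                              # illustrative sizes
W = 0.01 * rng.standard_normal((D, K))   # weights W[i, j]
b = np.zeros(D)                          # visible biases b_i
c = np.zeros(K)                          # hidden biases c_j

def sample_h_given_v(v):
    """p(h_j = 1 | v) = sigma(c_j + sum_i W_ij v_i), then a Bernoulli draw."""
    p = sigmoid(c + v @ W)
    return (rng.random(K) < p).astype(float), p

def sample_v_given_h(h):
    """p(v_i = 1 | h) = sigma(b_i + sum_j W_ij h_j), then a Bernoulli draw."""
    p = sigmoid(b + W @ h)
    return (rng.random(D) < p).astype(float), p
```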

11 Gibbs Sampling in an RBM with Binary Units
$v^{(1)} \sim p(v)$, $\quad h^{(1)} \sim p(h \mid v^{(1)})$,
$v^{(2)} \sim p(v \mid h^{(1)})$, $\quad h^{(2)} \sim p(h \mid v^{(2)})$,
$\ldots$
$v^{(k+1)} \sim p(v \mid h^{(k)})$ (after $k$ Gibbs steps).
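Reusing the two samplers from the block above, the alternating chain is a few lines (a sketch that deliberately continues the previous code block).

```python
def gibbs_chain(v, k):
    """Run k alternating Gibbs steps v -> h -> v and return the final v."""
    for _ in range(k):
        h, _ = sample_h_given_v(v)
        v, _ = sample_v_given_h(h)
    return v
```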

12 Contrastive Divergence Learning
The average log-likelihood gradient is approximated by $k$ Gibbs steps,
$\frac{\partial \log p(v)}{\partial \theta} \approx \Big\langle -\frac{\partial E(v, h)}{\partial \theta} \Big\rangle_{\hat p} - \Big\langle -\frac{\partial E(v, h)}{\partial \theta} \Big\rangle_{p^{(k)}(v, h)}$,
where $p^{(k)}(v, h)$ is the joint distribution determined by $k$ Gibbs steps,
$\hat p(v) = \frac{1}{N} \sum_{n=1}^{N} \delta(v - v_n)$, and expectations under $\hat p$ pair each $v_n$ with $p(h \mid v_n)$.
Gradient ascent learning leads to
$W \leftarrow W + \eta \Big( \langle v h^\top \rangle_{\hat p} - \langle v h^\top \rangle_{p^{(k)}(v, h)} \Big)$,
$b \leftarrow b + \eta \Big( \frac{1}{N} \sum_{n=1}^{N} v_n - \langle v \rangle_{p^{(k)}(v, h)} \Big)$,
$c \leftarrow c + \eta \Big( \langle h \rangle_{\hat p} - \langle h \rangle_{p^{(k)}(v, h)} \Big)$.
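Putting the pieces together, a CD-k update for a single training vector might look as follows (a sketch building on the two blocks above; the learning rate and the use of hidden probabilities rather than samples for the statistics are common choices, not prescriptions from the slides).

```python
def cd_k_update(v_data, k=1, eta=0.1):
    """One contrastive-divergence update of W, b, c for one training vector."""
    global W, b, c
    _, ph_data = sample_h_given_v(v_data)      # positive-phase statistics
    v_model = gibbs_chain(v_data.copy(), k)    # negative sample after k steps
    _, ph_model = sample_h_given_v(v_model)

    W += eta * (np.outer(v_data, ph_data) - np.outer(v_model, ph_model))
    b += eta * (v_data - v_model)
    c += eta * (ph_data - ph_model)
```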

13 Exponential Family Harmonium
Max Welling, Michal Rosen-Zvi, Geoffrey E. Hinton (2004).
Allows both observed and hidden variables to follow exponential family distributions.
Assume independent exponential family distributions for the observed and hidden variables, and couple them in the log-domain by introducing a quadratic interaction term; this yields a random field:
$p(v, h) \propto \exp\Big\{ \sum_{i,a} \xi_{ia} f_{ia}(v_i) + \sum_{j,b} \lambda_{jb} g_{jb}(h_j) + \sum_{i,a,j,b} W_{ia,jb}\, f_{ia}(v_i)\, g_{jb}(h_j) \Big\}$,
where $\{f_{ia}(v_i), g_{jb}(h_j)\}$ are sufficient statistics and $\{\xi_{ia}, \lambda_{jb}\}$ are canonical parameters.
Apply the standard contrastive divergence learning.

14 Auto-Encoders

15 Auto-Encoder
Encoding: $h = f(Wv + b)$; decoding: $\hat v = g(Uh + c)$.
Linear activations and tied weights $U = W^\top$: equivalent to PCA.
Trained by backpropagation: minimize $\sum_v J(v, \hat v)$.
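A tiny tied-weight autoencoder with one backpropagation step on the squared-error loss (a self-contained sketch; the sizes, learning rate, sigmoid encoder, and linear decoder are illustrative assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

D, K = 8, 3                              # illustrative sizes
W = 0.1 * rng.standard_normal((K, D))    # encoder weights; decoder uses W.T
b = np.zeros(K)                          # encoder bias
c = np.zeros(D)                          # decoder bias

def autoencode(v):
    h = sigmoid(W @ v + b)   # encoding h = f(Wv + b)
    v_hat = W.T @ h + c      # decoding with tied weights U = W^T, linear g
    return h, v_hat

def backprop_step(v, eta=0.1):
    """One gradient step on J(v, v_hat) = ||v - v_hat||^2 / 2."""
    global W, b, c
    h, v_hat = autoencode(v)
    dv = v_hat - v                    # dJ / d v_hat
    dh = (W @ dv) * h * (1.0 - h)     # backprop through W^T and the sigmoid
    W -= eta * (np.outer(dh, v) + np.outer(h, dv))  # tied weights: two terms
    b -= eta * dh
    c -= eta * dv
```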

16 Denoising Auto-Encoder
(Figures: a plain autoencoder reconstructing X, and a denoising autoencoder reconstructing X from a corrupted input.)
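One common corruption for a denoising autoencoder is masking noise: zero a random fraction of the inputs and train the network to reconstruct the clean vector (a sketch; the corruption level is an assumption).

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt(v, level=0.3):
    """Masking noise: randomly zero a fraction `level` of the inputs.
    The DAE is then trained to reconstruct the clean v from corrupt(v)."""
    mask = (rng.random(v.shape) >= level).astype(float)
    return v * mask
```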

17 Stacked Denoising Auto-Encoder for Face Pose Normalization (Kang and Choi, 2013)
(Figures: pre-training on frontal faces; training another DAE mapping non-frontal faces to frontal faces.)

18 (Figure: 10 examples of corrupted face images from the Georgia Tech face database (left), their pose-normalized versions obtained by the proposed method (middle), and ground-truth frontal face images (right).)

19 Variational Auto-Encoder (Kingma and Welling, 2014)
Model parameters $\theta$ and variational parameters $\phi$ are learned by maximizing a variational lower bound on the marginal likelihood (stochastic gradient variational Bayes).
Decoding MLP and encoding MLP:
probabilistic decoder $p_\theta(x \mid z)$, probabilistic encoder $q_\phi(z \mid x)$.
For instance,
$p(x \mid z) = \mathcal{N}(x \mid \mu, \Sigma)$, with
$h = \sigma(W_1 z + b_1)$, $\quad \mu = W_2 h + b_2$, $\quad \log \operatorname{diag}(\Sigma) = W_3 h + b_3$.
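The decoder on the slide and the reparameterization trick at the heart of stochastic gradient variational Bayes can be sketched as follows (parameter names W1..W3, b1..b3 mirror the slide; everything else is an illustrative assumption).

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode(z, theta):
    """Decoder MLP from the slide: p_theta(x | z) = N(x | mu, diag(Sigma))."""
    h = sigmoid(theta["W1"] @ z + theta["b1"])
    mu = theta["W2"] @ h + theta["b2"]
    log_var = theta["W3"] @ h + theta["b3"]      # log diag(Sigma)
    return mu, log_var

def reparameterize(mu, log_var):
    """Draw z = mu + sigma * eps, keeping the sample differentiable in (mu, log_var)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps
```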

20 (Figure from Kingma and Welling, 2014.)

21 Dropout

22 Dropout
(Figure taken from Srivastava et al., JMLR 2014: a standard network and a thinned network after dropping units.)
Form a vector of independent Bernoulli random variables $z^{(l)}$, where $z_i^{(l)} \sim \mathrm{Bern}(p)$.
The feedforward operations are:
$y^{(l+1)} = \sigma\Big( W^{(l+1)} \big( y^{(l)} \odot z^{(l)} \big) + b^{(l+1)} \Big)$.

23 (Figure taken from Srivastava et al., JMLR 2014: at test time the full network is used with the weights scaled down.)
Dropout approximates the effect of averaging the predictions of all these thinned networks by simply using a single unthinned network that has smaller weights.
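A single dropout layer in both modes (a sketch; the sigmoid nonlinearity follows the slides, everything else is an illustrative assumption). At test time the mask is dropped and the weights are scaled by $p$, which is exactly the "single unthinned network with smaller weights" above.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dropout_layer(y, W, b, p=0.5, train=True):
    """y^{l+1} = sigma(W (y^l * z) + b) with z ~ Bern(p) at train time;
    at test time, the full layer is used with weights scaled by p."""
    if train:
        z = (rng.random(y.shape) < p).astype(float)  # Bernoulli(p) keep-mask
        return sigmoid(W @ (y * z) + b)
    return sigmoid((p * W) @ y + b)
```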

24 Gaussian Dropout (Wang and Manning, 2013)
$v_i^{(l+1)} = \sum_j W_{ij}^{(l+1)} y_j^{(l)} z_j^{(l)} + b_i^{(l+1)}$, $\quad y_i^{(l+1)} = \sigma\big( v_i^{(l+1)} \big)$.
Approximate $v_i^{(l+1)}$ as Gaussian, $\tilde v_i^{(l+1)} \sim \mathcal{N}\big( \mu_i^{(l+1)}, (\sigma_i^{(l+1)})^2 \big)$, and sample $v_i^{(l+1)}$ directly from this Gaussian distribution:
$\tilde v_i^{(l+1)} = \mu_i^{(l+1)} + \sigma_i^{(l+1)} \epsilon$, $\quad \epsilon \sim \mathcal{N}(0, 1)$,
$\mu_i^{(l+1)} = \mathbb{E}\big[ \tilde v_i^{(l+1)} \big] = \sum_j p_j W_{ij}^{(l+1)} y_j^{(l)} + b_i^{(l+1)}$,
$\big( \sigma_i^{(l+1)} \big)^2 = \operatorname{var}\big( \tilde v_i^{(l+1)} \big) = \sum_j p_j (1 - p_j) \big( W_{ij}^{(l+1)} y_j^{(l)} \big)^2$.
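Moment matching makes the sampling direct (a sketch continuing the dropout block above; `p` is assumed to be the vector of keep probabilities $p_j$).

```python
def gaussian_dropout_layer(y, W, b, p):
    """Sample the pre-activation from N(mu, sigma^2) instead of masking.
    Reuses sigmoid() and rng from the standard-dropout sketch above."""
    mu = W @ (p * y) + b                           # mu_i = sum_j p_j W_ij y_j + b_i
    var = (W ** 2) @ (p * (1.0 - p) * y ** 2)      # var_i = sum_j p_j (1-p_j) (W_ij y_j)^2
    eps = rng.standard_normal(mu.shape)
    return sigmoid(mu + np.sqrt(var) * eps)        # y^{l+1} = sigma(v)
```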

25 Standard dropout
- Sample $z_i^{(l)} \sim \mathrm{Bern}(p_i)$.
- Compute activations: $v^{(l+1)} = W^{(l+1)} \big( y^{(l)} \odot z^{(l)} \big) + b^{(l+1)}$, $\quad y^{(l+1)} = \sigma\big( v^{(l+1)} \big)$.
Gaussian dropout
- Directly sample $v_i^{(l+1)} \sim \mathcal{N}\big( \mu_i^{(l+1)}, (\sigma_i^{(l+1)})^2 \big)$.
- Compute activations: $y^{(l+1)} = \sigma\big( v^{(l+1)} \big)$.
Gaussian dropout ignores the dependencies between the $\{ v_i^{(l+1)} \}$ that are present in standard dropout. Nonetheless, good empirical results were reported by Wang and Manning (2013).

26 Bayesian Learning for Feedforward Neural Networks

27 Bayesian Learning for Neural Networks (Neal, 1996)
Question: does it make sense to use a network with an infinite number of hidden units, training it by maximizing the likelihood of a finite amount of data? The network will overfit the data and its generalization performance will be poor. However, the idea of selecting the network size depending on the amount of training data makes little sense to a Bayesian.
Radford M. Neal (1996) showed that:
- It is sensible to consider a limit where the number of hidden units in a net tends to infinity, and good predictions can be obtained from such models using the Bayesian machinery.
- For fixed hyperparameters, a large class of neural network models converge to a Gaussian process prior over functions in the limit of an infinite number of hidden units.

28 Computing with Infinite Networks (Williams, 1996)
Neal proved that an infinite neural net will converge to a Gaussian process, but did not give the covariance function needed to actually specify the particular Gaussian process.
Williams showed that for certain weight priors and transfer functions in the neural network model, the covariance function which describes the behavior of the corresponding Gaussian process can be calculated analytically.
This allows predictions to be made using neural networks with an infinite number of hidden units in time $O(n^3)$, where $n$ is the number of training examples.
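As a concrete instance (quoted from the literature as an illustration rather than taken from the slides, so treat the exact form as an assumption): for the error-function transfer and a zero-mean Gaussian weight prior with covariance $\Sigma$, the analytic covariance takes an arcsine form, where $\tilde{x} = (1, x_1, \ldots, x_d)^\top$ is the bias-augmented input.

```latex
K_{\mathrm{erf}}(x, x') = \frac{2}{\pi}
  \sin^{-1}\!\left(
    \frac{2\,\tilde{x}^{\top}\Sigma\,\tilde{x}'}
         {\sqrt{\left(1 + 2\,\tilde{x}^{\top}\Sigma\,\tilde{x}\right)
                \left(1 + 2\,\tilde{x}'^{\top}\Sigma\,\tilde{x}'\right)}}
  \right)
```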

29 Multi-view Extension of Deep Learning is EASY!

30 Multi-Task Learning (Caruana, 1997)

31 (Figure.)

32 Multi-View Learning: Mobile Examples

33 Multi-View Learning: Recommendation Systems

34 Multi-View Learning: Image Annotation

35 Dual-Wing Harmonium (E. P. Xing, R. Yan, and A. G. Hauptmann, 2005)
(Figure: hidden units $h$ connected to two wings, $x$ and $y$.)
For each word (word-count profile):
$p(x_i \mid h) = \mathrm{Poisson}\Big( x_i \,\Big|\, \exp\Big\{ b_i + \sum_j W_{ij} h_j \Big\} \Big)$.
For each image (color-histogram representation):
$p(y_k \mid h) = \mathcal{N}\Big( y_k \,\Big|\, \sigma^2 \Big( c_k + \sum_j U_{kj} h_j \Big),\ \sigma^2 \Big)$.
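Sampling both wings given the hidden units is direct (a sketch; the dimensions and parameter values are illustrative assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)

Dx, Dy, K = 5, 4, 3                        # illustrative sizes
W = 0.01 * rng.standard_normal((Dx, K))    # word-wing weights
U = 0.01 * rng.standard_normal((Dy, K))    # image-wing weights
b, c = np.zeros(Dx), np.zeros(Dy)
sigma2 = 1.0

def sample_views_given_h(h):
    """x_i ~ Poisson(exp(b_i + W_i h)); y_k ~ N(sigma^2 (c_k + U_k h), sigma^2)."""
    x = rng.poisson(np.exp(b + W @ h))                      # word counts
    y = rng.normal(sigma2 * (c + U @ h), np.sqrt(sigma2))   # color histogram
    return x, y
```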

36 (Figure.)

37 Multi-View Harmonium (Y. Kang and S. Choi, 2011)
(Figure: (a) a multi-view RBM and (b) a dual-wing harmonium, each with hidden units $h$, views $x$ and $y$, and latent variables $t$.)

38 Multi-Modal Autoencoders (J. Ngiam et al., 2011)
(Figure: two modality-specific networks joined by a shared representation.)

39 Restricted Deep Belief Net (Kang and Choi, 2011)

40 Deep Multiview Hashing (Kang, Kim, and Choi, 2012)
(Figure: panels (a)-(c).)

41 Experiments: Similarity Search Results
(Figure: query images and the retrieved images.)

42 Deep Belief Net vs Deep Boltzmann Machine
(Figure: a deep belief net, with an undirected top layer $p(h^{(2)}, h^{(3)})$ and directed factors $p(h^{(1)} \mid h^{(2)})$ and $p(v \mid h^{(1)})$, next to a deep Boltzmann machine with undirected connections between $v$, $h^{(1)}$, $h^{(2)}$, and $h^{(3)}$.)

43 Question and Discussion
