UVA CS 6316 Fall 2015 Graduate: Machine Learning
Lecture 5: Non-Linear Regression Models
Dr. Yanjun Qi
University of Virginia, Department of Computer Science

Where are we? Five major sections of this course:
- Regression (supervised)
- Classification (supervised)
- Unsupervised models
- Learning theory
- Graphical models
Today: Regression (supervised)
- Four ways to train / perform optimization for linear regression models:
  - Normal equation
  - Gradient descent (GD)
  - Stochastic GD
  - Newton's method
- Supervised regression models:
  - Linear regression (LR)
  - LR with non-linear basis functions
  - Locally weighted LR
  - LR with regularizations

Today:
- Machine learning method in a nutshell
- Regression models beyond linear:
  - LR with non-linear basis functions
  - Locally weighted linear regression
  - Regression trees and multilinear interpolation (later)
Traditional programming: Data + Program -> Computer -> Output
Machine learning: Data + Output -> Computer -> Program

Machine learning in a nutshell: Task; Representation; Score function; Search/Optimization; Models, Parameters.
ML grew out of work in AI: optimize a performance criterion using example data or past experience, aiming to generalize to unseen data.
(1) Multivariate linear regression
- Task: regression
- Representation: y = weighted linear sum of the x's
- Score function: least squares
- Search/Optimization: linear algebra / GD / SGD
- Models, parameters: regression coefficients

\hat{y} = f(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2

Today:
- Machine learning method in a nutshell
- Regression models beyond linear:
  - LR with non-linear basis functions
  - Locally weighted linear regression
  - Regression trees and multilinear interpolation (later)
- Linear regression model with regularizations:
  - Ridge regression
  - Lasso regression
LR with non-linear basis functions
LR does not mean we can only deal with linear relationships:

y = \theta_0 + \sum_{j=1}^{m} \theta_j \phi_j(x) = \theta^T \phi(x)

We are free to design (non-linear) features (e.g., basis-function derived) under LR, where the \phi_j(x) are fixed basis functions (and we also define \phi_0(x) = 1).
E.g., polynomial regression: \phi(x) := (1, x, x^2, x^3)^T

e.g. (1) polynomial regression
Introduce basis functions; the least-squares solution is

\theta^* = (\Phi^T \Phi)^{-1} \Phi^T y

(from Dr. Nando de Freitas's tutorial slide)
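To make the basis-expansion recipe concrete, here is a minimal sketch (mine, not from the slides) of polynomial regression fitted with the normal equation above; the helper names `poly_features` and `fit_normal_equation` and the toy data are illustrative assumptions.

```python
import numpy as np

def poly_features(x, degree):
    """Map a 1D input vector to the polynomial basis (1, x, x^2, ..., x^degree)."""
    return np.vstack([x**j for j in range(degree + 1)]).T  # shape (n, degree+1)

def fit_normal_equation(Phi, y):
    """Least-squares solution theta* = (Phi^T Phi)^{-1} Phi^T y.
    lstsq is used rather than an explicit inverse for numerical stability."""
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return theta

# Toy data: a noisy cubic
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
y = 0.5 - 2.0 * x + 1.5 * x**3 + 0.1 * rng.standard_normal(x.shape)

Phi = poly_features(x, degree=3)
theta = fit_normal_equation(Phi, y)
y_hat = Phi @ theta  # predictions: still linear in theta, non-linear in x
```

Note that only the feature map changed; the solver is the ordinary linear least-squares one, which is exactly the slide's point.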
e.g. (1) polynomial regression
KEY: if the bases are given, the problem of learning the parameters is still linear.

Many possible basis functions
There are many basis functions, e.g.:
- Polynomial: \phi_j(x) = x^{j-1}
- Radial basis functions: \phi_j(x) = \exp\left(-\frac{(x - \mu_j)^2}{2 s^2}\right)
- Sigmoidal: \phi_j(x) = \sigma\left(\frac{x - \mu_j}{s}\right)
- Splines, Fourier, wavelets, etc.
e.g. (2) LR with radial basis functions
E.g., LR with RBF regression:

\hat{y} = \theta_0 + \sum_{j=1}^{m} \theta_j \phi_j(x) = \phi(x)^T \theta

\phi(x) := \left(1, K_{\lambda_1}(x, r_1), K_{\lambda_2}(x, r_2), K_{\lambda_3}(x, r_3), K_{\lambda_4}(x, r_4)\right)^T

\theta^* = (\Phi^T \Phi)^{-1} \Phi^T y

RBF = radial basis function: a function which depends only on the radial distance from a centre point.
Gaussian RBF: as the distance from the centre r increases, the output of the RBF decreases:

K_\lambda(x, r) = \exp\left(-\frac{(x - r)^2}{2 \lambda^2}\right)

(1D case and 2D case shown in figures.)
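As an illustration (my own sketch, not the lecture's code), the RBF feature map with predefined centres and widths plugs straight into the same normal-equation fit; the helper `rbf_features` and the chosen centres, widths, and target are assumptions.

```python
import numpy as np

def gaussian_rbf(x, r, lam):
    """Gaussian RBF K_lambda(x, r) = exp(-(x - r)^2 / (2 lambda^2))."""
    return np.exp(-(x - r) ** 2 / (2.0 * lam ** 2))

def rbf_features(x, centres, widths):
    """Feature map phi(x) = (1, K_{l1}(x, r1), ..., K_{lm}(x, rm))."""
    cols = [np.ones_like(x)]
    cols += [gaussian_rbf(x, r, lam) for r, lam in zip(centres, widths)]
    return np.vstack(cols).T  # shape (n, m+1)

# Four predefined centres and widths, echoing the slide's example
centres = np.array([-0.75, -0.25, 0.25, 0.75])
widths = np.array([0.3, 0.3, 0.3, 0.3])

x = np.linspace(-1, 1, 50)
y = np.sin(3 * x)                                    # toy target
Phi = rbf_features(x, centres, widths)
theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)      # theta* = (Phi^T Phi)^{-1} Phi^T y
y_hat = Phi @ theta
```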
e.g. another linear regression with 1D RBF basis functions (4 predefined centres and widths):

\phi(x) := \left(1, K_{\lambda_1}(x, r_1), K_{\lambda_2}(x, r_2), K_{\lambda_3}(x, r_3), K_{\lambda_4}(x, r_4)\right)^T

\theta^* = (\Phi^T \Phi)^{-1} \Phi^T y

e.g. an LR with 1D RBFs (3 predefined centres and widths)
(Figures: the 1D RBF bases, and the fitted curve after training.)
e.g. 2D good and bad RBFs
(Figures: a good 2D RBF; two bad 2D RBFs.)

Two main issues:
- Learning the parameter \theta: almost the same as LR, just change X to \phi(x), a linear combination of basis functions (which can themselves be non-linear).
- How to choose the model order, e.g. the polynomial degree for polynomial regression.
Issue: overfitting and underfitting
- Underfit: y = \theta_0 + \theta_1 x
- Looks good: y = \theta_0 + \theta_1 x + \theta_2 x^2
- Overfit: y = \sum_{j=0}^{5} \theta_j x^j

Generalisation: learn a function/hypothesis from past data in order to explain, predict, model or control new data examples. Use K-fold cross validation to choose the model order (a code sketch follows after the summary below).

(2) Multivariate linear regression with basis expansion
- Task: regression
- Representation: y = weighted linear sum of (basis expansion of x)
- Score function: least squares
- Search/Optimization: linear algebra
- Models, parameters: regression coefficients

\hat{y} = \theta_0 + \sum_{j=1}^{m} \theta_j \phi_j(x) = \phi(x)^T \theta
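Here is a minimal sketch (my addition, not from the slides) of choosing the polynomial degree by K-fold cross validation; the helper names and toy data are illustrative assumptions.

```python
import numpy as np

def poly_features(x, degree):
    """Polynomial basis (1, x, ..., x^degree) for a 1D input vector."""
    return np.vstack([x**j for j in range(degree + 1)]).T

def kfold_cv_mse(x, y, degree, k=5, seed=0):
    """Average held-out MSE of a degree-`degree` polynomial fit over k folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    errs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)          # all indices not in this fold
        theta, *_ = np.linalg.lstsq(poly_features(x[train], degree),
                                    y[train], rcond=None)
        errs.append(np.mean((poly_features(x[fold], degree) @ theta - y[fold]) ** 2))
    return float(np.mean(errs))

# Toy data, then pick the degree with the lowest cross-validated error
rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 60)
y = 0.5 - 2.0 * x + 1.5 * x**3 + 0.1 * rng.standard_normal(x.shape)
best_degree = min(range(1, 10), key=lambda d: kfold_cv_mse(x, y, d))
```

Too low a degree underfits (high error on every fold); too high a degree overfits (low training error but high held-out error), so the cross-validated error curve is what guides the choice.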
Today:
- Machine learning method in a nutshell
- Regression models beyond linear:
  - LR with non-linear basis functions
  - Locally weighted linear regression (next)
  - Regression trees and multilinear interpolation (later)

Locally weighted linear regression
The algorithm: instead of minimizing

J(\theta) = \frac{1}{2} \sum_{i=1}^{n} (x_i^T \theta - y_i)^2

we now fit \theta to minimize

J(\theta) = \frac{1}{2} \sum_{i=1}^{n} w_i (x_i^T \theta - y_i)^2

Where do the w_i's come from?

w_i = K(x_i, x_0) = \exp\left(-\frac{(x_i - x_0)^2}{2 \tau^2}\right)

where x_0 is the query point for which we'd like to know the corresponding y. Essentially, we put higher weights on (the errors from) training examples that are close to the query point x_0 than on those that are further away from it.
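A minimal sketch (my own, assuming a design matrix X that already includes an intercept column) of the weighted fit at one query point; `lwr_predict` is a hypothetical name, and the weighted least-squares objective is solved by rescaling rows by sqrt(w_i).

```python
import numpy as np

def lwr_predict(X, y, x0, tau=0.5):
    """Locally weighted linear regression prediction at query point x0.

    Weights w_i = exp(-||x_i - x0||^2 / (2 tau^2)); multiplying each row of X
    and entry of y by sqrt(w_i) turns the weighted least-squares objective
    into an ordinary one.
    """
    w = np.exp(-np.sum((X - x0) ** 2, axis=1) / (2.0 * tau ** 2))
    sw = np.sqrt(w)
    theta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return x0 @ theta  # the local linear fit, evaluated at the query point

# Example usage on 1D data with an intercept column
n = 100
xs = np.linspace(0, 10, n)
ys = np.sin(xs) + 0.1 * np.random.default_rng(2).standard_normal(n)
X = np.column_stack([np.ones(n), xs])
pred = lwr_predict(X, ys, x0=np.array([1.0, 5.0]), tau=0.8)
```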
Locally weighted regression
aka locally linear regression, LOESS.
A single linear function x -> y could represent only the neighbour region of x_0 well; use the RBF function K_\lambda(x_i, x_0) to pick out / emphasize the neighbour region of x_0.

Locally weighted linear regression (figure)
Locally weighted linear regression
Separate weighted least squares at each target point x_0:

\min_{\alpha(x_0), \beta(x_0)} \sum_{i=1}^{N} K_\lambda(x_i, x_0) \left[ y_i - \alpha(x_0) - \beta(x_0) x_i \right]^2

\hat{f}(x_0) = \hat{\alpha}(x_0) + \hat{\beta}(x_0) x_0

LEARNING of locally weighted linear regression: a separate weighted least squares problem is solved at each target point x_0.
Locally weighted linear regression, e.g. for the case of only one feature variable
Separate weighted least squares at each target point x_0:

\min_{\alpha(x_0), \beta(x_0)} \sum_{i=1}^{N} K_\lambda(x_i, x_0) \left[ y_i - \alpha(x_0) - \beta(x_0) x_i \right]^2

\hat{f}(x_0) = \hat{\alpha}(x_0) + \hat{\beta}(x_0) x_0

Here b(x)^T = (1, x); B is the N x 2 regression matrix with i-th row b(x_i)^T; and W(x_0) = \mathrm{diag}\left(K_\lambda(x_i, x_0)\right), i = 1, \dots, N, is the N x N diagonal weight matrix. Then:

LWR: \hat{f}(x_0) = b(x_0)^T \left(B^T W(x_0) B\right)^{-1} B^T W(x_0) y
versus
LR: \hat{f}(x_q) = x_q^T \theta^* = x_q^T (X^T X)^{-1} X^T y

More: local weighted polynomial regression
Local polynomial fits of any degree d:

\min_{\alpha(x_0), \beta_j(x_0),\, j=1,\dots,d} \sum_{i=1}^{N} K_\lambda(x_i, x_0) \left[ y_i - \alpha(x_0) - \sum_{j=1}^{d} \beta_j(x_0) x_i^j \right]^2

\hat{f}(x_0) = \hat{\alpha}(x_0) + \sum_{j=1}^{d} \hat{\beta}_j(x_0) x_0^j

(Figure: blue = true function, green = estimated.)
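The LWR closed form above translates directly to code; here is a minimal sketch (mine, not from the slides) for the one-feature case, with `lwr_closed_form` a hypothetical name.

```python
import numpy as np

def lwr_closed_form(x, y, x0, lam=0.5):
    """f_hat(x0) = b(x0)^T (B^T W(x0) B)^{-1} B^T W(x0) y for 1D inputs."""
    B = np.column_stack([np.ones_like(x), x])        # i-th row is b(x_i)^T = (1, x_i)
    w = np.exp(-(x - x0) ** 2 / (2.0 * lam ** 2))    # diagonal of W(x0)
    BtW = B.T * w                                    # B^T W(x0) without forming the N x N matrix
    theta_local = np.linalg.solve(BtW @ B, BtW @ y)  # (alpha(x0), beta(x0))
    return np.array([1.0, x0]) @ theta_local

# Evaluate the local fit over a grid of query points
x = np.linspace(0, 10, 100)
y = np.sin(x) + 0.1 * np.random.default_rng(1).standard_normal(100)
f_hat = np.array([lwr_closed_form(x, y, q) for q in np.linspace(0, 10, 50)])
```

Note how the weight matrix W(x_0), and hence the whole 2 x 2 system, must be rebuilt for every query point; the contrast with plain LR's single precomputed theta* is the next slide's parametric vs. non-parametric point.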
Parametric vs. non-parametric
Locally weighted linear regression is a non-parametric algorithm. The (unweighted) linear regression algorithm that we saw earlier is known as a parametric learning algorithm because it has a fixed, finite number of parameters (the \theta), which are fit to the data; once we've fit the \theta and stored them away, we no longer need to keep the training data around to make future predictions. In contrast, to make predictions using locally weighted linear regression, we need to keep the entire training set around. The term "non-parametric" refers (roughly) to the fact that the amount of knowledge we need to keep in order to represent the hypothesis grows linearly with the size of the training set.

(3) Locally weighted / kernel linear regression
- Task: regression
- Representation: y = weighted linear sum of the x's
- Score function: weighted least squares
- Search/Optimization: linear algebra
- Models, parameters: local regression coefficients (conditioned on each test point)

\min_{\alpha(x_0), \beta(x_0)} \sum_{i=1}^{N} K_\lambda(x_i, x_0) \left[ y_i - \alpha(x_0) - \beta(x_0) x_i \right]^2

\hat{f}(x_0) = \hat{\alpha}(x_0) + \hat{\beta}(x_0) x_0
Today recap:
- Machine learning method in a nutshell
- Regression models beyond linear:
  - LR with non-linear basis functions
  - Locally weighted linear regression
  - Regression trees and multilinear interpolation (later)

Probabilistic interpretation of linear regression (LATER)
Let us assume that the target variable and the inputs are related by the equation

y_i = \theta^T x_i + \varepsilon_i

where \varepsilon_i is an error term of unmodeled effects or random noise. Now assume that \varepsilon_i follows a Gaussian N(0, \sigma^2); then we have

p(y_i \mid x_i; \theta) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(y_i - \theta^T x_i)^2}{2\sigma^2}\right)

By the independence (among samples) assumption:

L(\theta) = \prod_{i=1}^{n} p(y_i \mid x_i; \theta) = \left(\frac{1}{\sqrt{2\pi}\,\sigma}\right)^n \exp\left(-\frac{\sum_{i=1}^{n} (y_i - \theta^T x_i)^2}{2\sigma^2}\right)

Many more variations of LR follow from this perspective, e.g. binomial/Poisson (LATER).
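One step the slides leave implicit (my addition, not in the original): taking the log of L(\theta) shows that maximizing this likelihood over \theta is exactly least-squares minimization.

```latex
% Log-likelihood of the Gaussian noise model
\log L(\theta)
  = n \log\frac{1}{\sqrt{2\pi}\,\sigma}
  - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\bigl(y_i - \theta^T x_i\bigr)^2
```

Since the first term does not depend on \theta, maximizing \log L(\theta) is equivalent to minimizing \frac{1}{2}\sum_i (y_i - \theta^T x_i)^2, i.e. the least-squares cost J(\theta) used earlier.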
References
Thanks to Prof. Nando de Freitas for allowing me to reuse some of his tutorial slides.