Machine Learning Assignment-1


University of Utah, School of Computing
Machine Learning Assignment-1
Chandramouli, Shridharan; Singla, Sumedha
September 10

1 Linear Regression

a) Create a Matlab function that draws a random number from the univariate normal distribution N(m, σ²) for any m, σ ∈ R. How do you test whether you have done this correctly?

Ans: Goal: To create a function rand(n, m, sd) which takes 3 parameters as input, where n is the number of random numbers to be generated, m is the required mean of the normal distribution and sd is the required standard deviation of the normal distribution, and which returns an array of n random numbers.

Approach 1

Step 1. Generate n uniformly distributed random numbers.

Step 2. Transform the uniformly distributed numbers to a standard normal distribution using the Box-Muller transformation.

Box-Muller transformation: Suppose U_1 and U_2 are independent random variables that are uniformly distributed in the interval (0, 1].

Then, according to the Box-Muller transformation, the corresponding independent random variables with a standard normal distribution, Z_0 and Z_1, are given by

Z_0 = sqrt(−2 ln U_1) cos(2π U_2)
Z_1 = sqrt(−2 ln U_1) sin(2π U_2)

Step 3. Scale the resulting standard normal samples to a normal distribution with mean m and standard deviation sd:

ND = m + sd · SND    (1.1)

where ND is the normally distributed sample and SND is the standard normal sample.

To test this function we plot a histogram for n = 1000. It comes out as a bell curve which is symmetric about the mean m. Please check figure 1.1 for the curve.

Approach 2

By using the central limit theorem, we know that the mean of uniformly distributed random numbers is approximately normally distributed. To create a uniform random number generator, we use a pseudo-random generator of the form

x_{i+1} = a · x_i mod b

We chose the values a = 40692, x_0 = …, and b = …. In order to get a normally distributed curve, we draw a matrix of random numbers from the pseudo-random number generator and take the arithmetic mean of the values in each row, giving n random numbers. The seed value for the PRNG is made to persist, thereby giving distinct numbers on each run. We then scale the results to the specified mean and variance as per equation (1.1). The histogram of the values generated using this method is shown in figure 1.2.

b) Choose an arbitrary non-linear univariate mathematical function f : R → R. Make your function complex enough such that it is different from the answer of other groups with a high likelihood.
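A minimal Matlab sketch of the two generators described above (the function names, the column-vector output, and the use of Matlab's built-in rand in place of the hand-rolled pseudo-random generator of Approach 2 are illustrative assumptions; the assignment's actual rand(n, m, sd) code is not reproduced in the text):

function nd = normrnd_bm(n, m, sd)
% Approach 1: draw n samples from N(m, sd^2) via the Box-Muller transformation.
    u1 = rand(n, 1);                               % U1 ~ Uniform(0, 1)
    u2 = rand(n, 1);                               % U2 ~ Uniform(0, 1)
    snd = sqrt(-2 .* log(u1)) .* cos(2 * pi * u2); % standard normal samples
    nd = m + sd .* snd;                            % ND = m + sd * SND, eq. (1.1)
end

function nd = normrnd_clt(n, m, sd)
% Approach 2: average k uniform draws per sample (central limit theorem),
% then standardise and rescale to mean m and standard deviation sd.
    k = 12;                                        % uniform draws averaged per sample
    rowmeans = mean(rand(n, k), 2);                % approximately N(1/2, 1/(12k))
    snd = (rowmeans - 0.5) ./ sqrt(1 / (12 * k));  % standardise
    nd = m + sd .* snd;
end

Calling hist(normrnd_bm(1000, 0, 3), 30) should then produce the bell-shaped, symmetric histogram described for figure 1.1.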

Figure 1.1: Histogram of random numbers generated using the Box-Muller transformation
Figure 1.2: Histogram of random numbers generated using the central limit theorem

Ans: For this exercise, we are asked to choose a univariate function f : R → R. We decided to use the asteroid curve, which is defined as

y = ±( a^(2/3) − x^(2/3) )^(3/2)

where a ∈ R, a > 0, and −a ≤ x ≤ a.

The above function is interesting for our purpose of linear regression, as y is defined at two points for each value of x, one positive and the other negative. We believe such a function will serve as a diagnostic tool for our problem of linear regression, as we can use this function in two ways, viz., considering only the positive values of y or considering both the positive and negative values of y.

c) Generate random data (x_i, y_i) such that y_i = f(x_i) + ξ_i, ξ_i ~ N(0, σ²), for a range of x-values and a value of σ chosen such that the data is interesting. Plot the data.

Approach

As mentioned above, the asteroid curve can be plotted either by considering both the negative and positive signs of y or by using just one of the signs. We wrote a Matlab function generate_data.m which generates the y values for a range of user-specified x values; a sketch follows the figure captions below. The plots of the curve for various parameters are shown in figures 1.3, 1.4, 1.5 and 1.6:

Figure 1.3: Plot of positive asteroid without added noise
Figure 1.4: Plot of positive asteroid with noise N(0, 3)
Figure 1.5: Plot of full asteroid without added noise
Figure 1.6: Plot of full asteroid with noise N(0, 3)
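A minimal sketch of what generate_data.m could look like under the description above (the argument names and order, and the call to the hypothetical Box-Muller generator normrnd_bm from part (a), are assumptions; the actual file is not reproduced in the text):

function [x, y] = generate_data(a, step, noise_mean, noise_sd, full_curve)
% Sample the asteroid curve y = +/-(a^(2/3) - x^(2/3))^(3/2) on [-a, a]
% and add Gaussian noise from the generator of part (a).
    x = (-a:step:a)';
    ypos = (a^(2/3) - abs(x).^(2/3)).^(3/2);  % abs() keeps x^(2/3) real for x < 0
    if full_curve
        x = [x; x];                           % each x appears once per branch
        y = [ypos; -ypos];                    % positive and negative branches
    else
        y = ypos;                             % positive branch only
    end
    y = y + normrnd_bm(numel(y), noise_mean, noise_sd);
end

A call such as [x, y] = generate_data(30, 0.1, 0, 3, true) roughly matches the a = 30, step 0.1, σ = 3 setting used for the figures above.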

The above figures were drawn using a = 30, for every −a ≤ x ≤ a in intervals of 0.1. Altering the value of a does not change the shape of the curve, but acts as a scaling factor, increasing the size of the curve as well as the range of values over which x is defined. We used a moderate value of σ = 3, as too much noise changes the shape of the curve, while too little does not reflect a real-life scenario.

d) Use linear least-squares to fit a polynomial curve through your generated data and compute the mean-squared error of your estimate. Give your derivation. Vary the number of data points you use and the degree of the polynomial. Plot the results. Describe in words what you see.

Approach

Let us assume that the generated data satisfy the polynomial equation

y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + … + a_m x^m

where m is the degree of the polynomial, i.e. Y = A^t X, where A^t = [a_0 a_1 a_2 … a_m] and X^t = [1 x x^2 … x^m].

The values of the coefficients A will be determined by fitting the polynomial to the training data (the noisy data generated from the asteroid curve). This can be done by minimizing an error function that measures the misfit between the function y(x, A), for any given value of A, and the training set data points. Let us take our error function to be the mean square error function, given (up to the constant factor 1/n) by

E(A) = Σ_i (Y_i − A^t X_i)^2
     = Σ_i (Y_i − A^t X_i)(Y_i − A^t X_i)^t
     = Σ_i (Y_i − A^t X_i)(Y_i^t − X_i^t A)
     = Σ_i (Y_i Y_i^t − A^t X_i Y_i^t − Y_i X_i^t A + A^t X_i X_i^t A)
     = Σ_i (Y_i Y_i^t − 2 Y_i X_i^t A + A^t X_i X_i^t A)
     = A^t (Σ_i X_i X_i^t) A − 2 (Σ_i Y_i X_i^t) A + Σ_i Y_i Y_i^t

Taking the trace,

tr(E(A)) = tr( A^t (Σ_i X_i X_i^t) A ) − tr( 2 (Σ_i Y_i X_i^t) A ) + tr( Σ_i Y_i Y_i^t )

By the cyclic property of the trace this can also be written as

tr(E(A)) = tr( A (Σ_i X_i X_i^t) A^t ) − tr( 2 (Σ_i Y_i X_i^t) A ) + tr( Σ_i Y_i Y_i^t )

Taking the derivative, and using ∂tr(AB)/∂A = B^T and ∂tr(C A B A^T)/∂A = C^T A B^T + C A B,

∂tr(E(A))/∂A = ∂/∂A tr( A (Σ_i X_i X_i^t) A^t ) − ∂/∂A tr( 2 (Σ_i Y_i X_i^t) A ) + 0

The minimum value of E(A) is at the point where ∂tr(E(A))/∂A = 0:

0 = A (Σ_i X_i X_i^t)^t + A (Σ_i X_i X_i^t) − 2 (Σ_i Y_i X_i^t)

A (Σ_i X_i X_i^t) = Σ_i Y_i X_i^t

so, as per the linear mean square solution, the value of A is

A = ( Σ_i Y_i X_i^t ) ( Σ_i X_i X_i^t )^{-1}

Fitting a polynomial to the asteroid curve

We performed experiments on linear regression on both the positive asteroid and the full asteroid curves. For all the experiments, the data were generated by choosing the function parameter a = 30; x was sampled at all points in [−a, a] at intervals of 0.1, and Gaussian noise was added to the sample set using the Gaussian random number generator defined above. We then implemented the above estimator in Matlab. The function generate_data.m takes as arguments the value of a, the sampling frequency, the mean and the variance for the random number generator, and a flag which specifies whether we require the positive or the full asteroid curve; it returns a set of noisy observations of the asteroid curve, where the noise component is drawn from the normally distributed random number generator. The resulting data were then fed to fit_polynomial.m, which takes the data points from generate_data.m and the polynomial degree and computes the value of A which minimizes the mean squared error on the data. We then plotted the predicted values of y against −a ≤ x ≤ a in intervals of 0.1 to get an idea of how good the estimates are.
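A minimal sketch of fit_polynomial.m consistent with the closed form A = (Σ_i Y_i X_i^t)(Σ_i X_i X_i^t)^{-1} derived above (the function name matches the text, but the body and the returned mean squared error are illustrative assumptions):

function [A, mse] = fit_polynomial(x, y, degree)
% Least-squares fit of a polynomial of the given degree to the data (x, y),
% using A = (sum_i y_i X_i^t) * inv(sum_i X_i X_i^t).
    n = numel(x);
    V = zeros(n, degree + 1);
    for d = 0:degree
        V(:, d + 1) = x(:).^d;          % X_i^t = [1 x_i x_i^2 ... x_i^m]
    end
    A = (y(:)' * V) / (V' * V);         % row vector of coefficients [a_0 ... a_m]
    yhat = V * A';                      % predicted values
    mse = mean((y(:) - yhat).^2);       % mean squared error of the fit
end

The same fit could also be obtained with Matlab's built-in polyfit; the explicit form above simply mirrors the derivation.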

Positive Asteroid

For the positive asteroid, the data was generated using a noise variance of 3 and a mean of 0. The data set contained 601 pairs (x, y) corresponding to the values of the curve from −a to a. We then tried varying the degree of the polynomial from n = 2 to n = 10; a sketch of the sweep is given below. The plots of the original data and of the data according to the predicted model are shown below.

Figure 1.7: Plot of original data with noise
Figure 1.8: Plot of predicted values taking n = 2
Figure 1.9: Plot of predicted values taking n = 3

From the figures, it is clear that choosing a smaller value of the degree forms the best fit for the data. On increasing the degree of the polynomial to n = 10, the model tries to overfit the data, and many of the local minima and maxima of the fitted curve correspond to the noise deviations. A lower-degree polynomial of n = 3 or n = 4 provides the best fit for the curve in our experiments.
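A short sketch of the degree sweep described above, combining the hypothetical generate_data and fit_polynomial sketches from earlier (the exact script and plotting commands used for figures 1.7 to 1.11 are not reproduced in the text):

% Sweep polynomial degrees on the positive asteroid data and report the MSE.
[x, y] = generate_data(30, 0.1, 0, 3, false);      % positive branch only, a = 30
for degree = 2:10
    [A, mse] = fit_polynomial(x, y, degree);
    fprintf('degree %2d: training MSE = %.3f\n', degree, mse);
    % plot(x, y, '.', x, polyval(fliplr(A), x));   % optional: visualise the fit
end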

Figure 1.10: Plot of predicted values taking n = 4
Figure 1.11: Plot of predicted values taking n = 10

Full Asteroid

For the full asteroid, the data was generated using a noise variance of 3 and a mean of 0, just as was done for the positive asteroid. The data set contained 1201 pairs (x, y) corresponding to the values of the curve from −a to a. We then tried varying the degree of the polynomial in the range n = 2 to n = 20. The plots of the original data and of the data according to the predicted model are shown below.

Figure 1.12: Plot of original data with noise

As we can see from figures 1.13 to 1.18, the linear model fails to model the full asteroid curve effectively, unlike the positive asteroid curve from the previous section. We find that while a lower polynomial degree such as n = 2 or n = 4 avoids overfitting, it fails to provide a suitable prediction for x > 0. On the other hand, polynomials of higher order simply oscillate between positive and negative values and do not provide a good model for the full asteroid. This example therefore shows the limitations of a linear model in making predictions of values which follow a complex non-linear function.

Figure 1.13: Plot of predicted values taking n = 2
Figure 1.14: Plot of predicted values taking n = 4
Figure 1.15: Plot of predicted values taking n = 6
Figure 1.16: Plot of predicted values taking n = 10
Figure 1.17: Plot of predicted values taking n = 15
Figure 1.18: Plot of predicted values taking n = 20

While it provides a very good approximation for the positive asteroid, the linear model fails to model the full asteroid well.

2 Linear Multivariate Regression

The attached file data.txt has two numbers d and n on the first line; d is the dimension of the data, and n is the number of data points. Then, n lines of d numbers each follow, constituting a set of vectors {x_t}_{t=1..n}, with x_t ∈ R^d. The data {x_t}_{t=1..n} is hypothesized to be generated by the model

x_{t+1} = A x_t + m_t,    m_t ~ N(0, Σ)

where A ∈ R^{d×d}, m_t ∈ R^d and Σ ∈ R^{d×d}.

a) Compute best estimates for A and Σ. Give your derivation.

Sol: To calculate the values of A and Σ, let us assume we have data (X, Y), where n is the number of data points. The given hypothesis is

X_{t+1} = A X_t + M_t

For the sake of convenience, let us replace X_{t+1} by Y. The equation now becomes

Y = A X_t + M_t,    where M_t ~ N(0, Σ)

This can be re-written as Y ~ N(A X_t, Σ), which represents a normal distribution with mean A X_t. In order to get the values of A and Σ, let us consider the maximum likelihood estimate under this model. The general likelihood of a point x under a Gaussian with mean x̂ is

P(x) = 1 / ( (2π)^{d/2} |Σ|^{1/2} ) · exp( −(1/2) (x − x̂)^T Σ^{-1} (x − x̂) )

so for our model the likelihood of the data is

l(A, Σ) = Π_{i=1..n} 1 / ( (2π)^{d/2} |Σ|^{1/2} ) · exp( −(1/2) (y_i − A x_i)^T Σ^{-1} (y_i − A x_i) )

We need to find the values of A and Σ which maximize the above function. Therefore, let us take the natural logarithm of both sides, take the derivative, and equate it to zero.

ll(A, Σ) = ln Π_i [ 1 / ( (2π)^{d/2} |Σ|^{1/2} ) · exp( −(1/2) (y_i − A x_i)^T Σ^{-1} (y_i − A x_i) ) ]
         = −(nd/2) log(2π) + (n/2) log(|Σ^{-1}|) − (1/2) Σ_i (y_i − A x_i)^T Σ^{-1} (y_i − A x_i)

The −(nd/2) log(2π) term is a constant; let us denote it by c. On further simplification we get

ll(A, Σ) = c + (n/2) log(|Σ^{-1}|) − (1/2) Σ_i (y_i − A x_i)^T Σ^{-1} (y_i − A x_i)

Taking the trace (each quadratic form is a scalar, so it equals its own trace):

ll(A, Σ) = c + (n/2) log(|Σ^{-1}|) − (1/2) Σ_i tr( (y_i − A x_i)^T Σ^{-1} (y_i − A x_i) )
         = c + (n/2) log(|Σ^{-1}|) − (1/2) Σ_i tr( Σ^{-1} (y_i − A x_i)(y_i − A x_i)^T )

The second equality uses tr(ABC) = tr(BCA) = tr(CAB). Expanding the last term, we get

ll(A, Σ) = c + (n/2) log(|Σ^{-1}|) − (1/2) Σ_i tr( Σ^{-1} ( y_i y_i^T − y_i x_i^T A^T − A x_i y_i^T + A x_i x_i^T A^T ) )
         = c + (n/2) log(|Σ^{-1}|) − (1/2) tr( Σ^{-1} Σ_i y_i y_i^T ) + (1/2) tr( Σ^{-1} Σ_i y_i x_i^T A^T ) + (1/2) tr( Σ^{-1} Σ_i A x_i y_i^T ) − (1/2) tr( Σ^{-1} Σ_i A x_i x_i^T A^T )

Here tr( Σ^{-1} Σ_i y_i x_i^T A^T ) = tr( Σ^{-1} Σ_i A x_i y_i^T ), since tr(A) = tr(A^T) and tr(ABC) = tr(BCA) = tr(CAB). Therefore, by simplification we get

ll(A, Σ) = c + (n/2) log(|Σ^{-1}|) − (1/2) tr( Σ^{-1} Σ_i y_i y_i^T ) + tr( Σ^{-1} Σ_i y_i x_i^T A^T ) − (1/2) tr( Σ^{-1} Σ_i A x_i x_i^T A^T )

In order to find the values of A and Σ, let us take the derivatives and equate them to zero. Using ∂tr(AB)/∂A = B^T and ∂tr(C A B A^T)/∂A = C^T A B^T + C A B,

∂ll(A, Σ)/∂A = 0 = Σ_i y_i x_i^T − (1/2) ( A Σ_i (x_i x_i^T)^T + A Σ_i x_i x_i^T )

Since x_i x_i^T is symmetric, the two terms combine, giving

A Σ_i x_i x_i^T = Σ_i y_i x_i^T

A = ( Σ_i y_i x_i^T ) ( Σ_i x_i x_i^T )^{-1}

and the maximum likelihood estimate of the noise covariance is

Σ = (1/n) Σ_i ( y_i − A x_i )( y_i − A x_i )^T
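A minimal Matlab sketch of how these closed-form estimates could be computed from data.txt (the file-reading details and variable names are assumptions; only the two formulas above come from the derivation):

% Estimate A and Sigma from data.txt using the closed forms derived above.
raw = dlmread('data.txt');           % assumed layout: first line d and n, then n rows of d numbers
d = raw(1, 1);  n = raw(1, 2);
X = raw(2:n+1, 1:d)';                % columns x_1 ... x_n, each in R^d

Xt = X(:, 1:end-1);                  % x_t       (current states)
Y  = X(:, 2:end);                    % x_{t+1}   (next states)

A = (Y * Xt') / (Xt * Xt');          % A = (sum_i y_i x_i^t)(sum_i x_i x_i^t)^{-1}
R = Y - A * Xt;                      % residuals m_t
Sigma = (R * R') / size(R, 2);       % Sigma = (1/n) sum_i (y_i - A x_i)(y_i - A x_i)^t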

Putting the values of x and y from data.txt and solving for A and Σ, we get

A = [ … ]

Σ = [ … ]

b) Based on the results, do you have a guess of what the model above actually models?

The system models a time-dependent process, where the predicted output of an observation or experiment depends on the current state of the system and a random event which acts as noise. One example of such a system could be a motion-planning robot, where the position x_{t+1} at time t + 1 is derived from its position at time t, which corresponds to the state of the robot at the current time. The noise component can be considered as the effect of the robot's environment, which prevents the robot from following a perfect and expected trajectory. Such effects could be wind changes, terrain changes or any other naturally occurring events, which are hard to model and predict, and are therefore taken as a random number, simplifying our system. Considering the motion planner as the input, the d-dimensional feature vector x would be some spatial position of the object or robot represented in d components.

3 Linear Multivariate Regression with Inputs

The given hypothesis is

X_{t+1} = A X_t + B U_t + M_t

For the sake of convenience, let us replace X_{t+1} by Y. The equation now becomes

Y = A X_t + B U_t + M_t,    where M_t ~ N(0, Σ)

This can be re-written as Y ~ N(A X_t + B U_t, Σ), which represents a normal distribution with mean A X_t + B U_t. In order to get the values of A, B and Σ, let us consider the maximum likelihood estimate under this model. As before, the general Gaussian likelihood is

P(x) = 1 / ( (2π)^{d/2} |Σ|^{1/2} ) · exp( −(1/2) (x − x̂)^T Σ^{-1} (x − x̂) )

so for our model the likelihood of the data is

l(A, B, Σ) = Π_{i=1..n} 1 / ( (2π)^{d/2} |Σ|^{1/2} ) · exp( −(1/2) (y_i − A x_i − B u_i)^T Σ^{-1} (y_i − A x_i − B u_i) )

We need to find the values of A, B and Σ which maximize the above function. Therefore, let us take the natural logarithm of both sides, take the derivative, and equate it to zero.

ll(A, B, Σ) = ln Π_i [ 1 / ( (2π)^{d/2} |Σ|^{1/2} ) · exp( −(1/2) (y_i − A x_i − B u_i)^T Σ^{-1} (y_i − A x_i − B u_i) ) ]
            = −(nd/2) log(2π) + (n/2) log(|Σ^{-1}|) − (1/2) Σ_i (y_i − A x_i − B u_i)^T Σ^{-1} (y_i − A x_i − B u_i)

The −(nd/2) log(2π) term is a constant; let us denote it by c. On further simplification we get

ll(A, B, Σ) = c + (n/2) log(|Σ^{-1}|) − (1/2) Σ_i (y_i − A x_i − B u_i)^T Σ^{-1} (y_i − A x_i − B u_i)

Taking the trace:

ll(A, B, Σ) = c + (n/2) log(|Σ^{-1}|) − (1/2) Σ_i tr( (y_i − A x_i − B u_i)^T Σ^{-1} (y_i − A x_i − B u_i) )
            = c + (n/2) log(|Σ^{-1}|) − (1/2) Σ_i tr( Σ^{-1} (y_i − A x_i − B u_i)(y_i − A x_i − B u_i)^T )

using tr(ABC) = tr(BCA) = tr(CAB). Expanding the last term, we get

ll(A, B, Σ) = c + (n/2) log(|Σ^{-1}|) − (1/2) Σ_i tr( Σ^{-1} ( y_i y_i^T − y_i x_i^T A^T − y_i u_i^T B^T − A x_i y_i^T + A x_i x_i^T A^T + B u_i u_i^T B^T + A x_i u_i^T B^T − B u_i y_i^T + B u_i x_i^T A^T ) )

            = c + (n/2) log(|Σ^{-1}|) − (1/2) tr( Σ^{-1} Σ_i y_i y_i^T ) + (1/2) tr( Σ^{-1} Σ_i y_i x_i^T A^T ) + (1/2) tr( Σ^{-1} Σ_i y_i u_i^T B^T ) + (1/2) tr( Σ^{-1} Σ_i A x_i y_i^T ) − (1/2) tr( Σ^{-1} Σ_i A x_i x_i^T A^T ) − (1/2) tr( Σ^{-1} Σ_i B u_i u_i^T B^T ) − (1/2) tr( Σ^{-1} Σ_i A x_i u_i^T B^T ) + (1/2) tr( Σ^{-1} Σ_i B u_i y_i^T ) − (1/2) tr( Σ^{-1} Σ_i B u_i x_i^T A^T )

Here tr( Σ^{-1} Σ_i y_i x_i^T A^T ) = tr( Σ^{-1} Σ_i A x_i y_i^T ), since tr(A) = tr(A^T) and tr(ABC) = tr(BCA) = tr(CAB). Similarly, tr( Σ^{-1} Σ_i y_i u_i^T B^T ) = tr( Σ^{-1} Σ_i B u_i y_i^T ) and (1/2) tr( Σ^{-1} Σ_i A x_i u_i^T B^T ) = (1/2) tr( Σ^{-1} Σ_i B u_i x_i^T A^T ). Therefore, by simplification we get

ll(A, B, Σ) = c + (n/2) log(|Σ^{-1}|) − (1/2) tr( Σ^{-1} Σ_i y_i y_i^T ) + tr( Σ^{-1} Σ_i y_i x_i^T A^T ) + tr( Σ^{-1} Σ_i y_i u_i^T B^T ) − (1/2) tr( Σ^{-1} Σ_i A x_i x_i^T A^T ) − (1/2) tr( Σ^{-1} Σ_i B u_i u_i^T B^T ) − tr( Σ^{-1} Σ_i A x_i u_i^T B^T )

In order to find the values of A, B and Σ, let us take the derivatives and equate them to zero. Using ∂tr(AB)/∂A = B^T and ∂tr(C A B A^T)/∂A = C^T A B^T + C A B,

∂ll(A, B, Σ)/∂A = 0 = Σ_i y_i x_i^T − (1/2) ( A Σ_i x_i x_i^T + A (Σ_i x_i x_i^T)^T ) − B Σ_i u_i x_i^T

⇒ A Σ_i x_i x_i^T = Σ_i y_i x_i^T − B Σ_i u_i x_i^T

⇒ A = ( Σ_i y_i x_i^T − B Σ_i u_i x_i^T ) ( Σ_i x_i x_i^T )^{-1}

Similarly, deriving for B,

∂ll(A, B, Σ)/∂B = 0 = Σ_i y_i u_i^T − (1/2) ( B Σ_i u_i u_i^T + B (Σ_i u_i u_i^T)^T ) − A Σ_i x_i u_i^T

⇒ B Σ_i u_i u_i^T = Σ_i y_i u_i^T − A Σ_i x_i u_i^T

⇒ B = ( Σ_i y_i u_i^T − A Σ_i x_i u_i^T ) ( Σ_i u_i u_i^T )^{-1}

We now have two equations in terms of A and B, which we can solve simultaneously to obtain the exact values of A and B. Here Y, U and X denote the matrices whose columns are the vectors y_i, u_i and x_i over all data points, so that, for example, Y X^T = Σ_i y_i x_i^T. Substituting one equation into the other gives

A = [ (Y X^T)(X X^T)^{-1} − (Y U^T)(U U^T)^{-1}(U X^T)(X X^T)^{-1} ] [ I − (X U^T)(U U^T)^{-1}(U X^T)(X X^T)^{-1} ]^{-1}

B = [ (Y U^T)(U U^T)^{-1} − (Y X^T)(X X^T)^{-1}(X U^T)(U U^T)^{-1} ] [ I − (U X^T)(X X^T)^{-1}(X U^T)(U U^T)^{-1} ]^{-1}

and the maximum likelihood estimate of the noise covariance is

Σ = (1/n) Σ_i ( y_i − A x_i − B u_i )( y_i − A x_i − B u_i )^T
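A minimal Matlab sketch of computing A, B and Σ from data2.txt using the expressions above (the layout of data2.txt, and hence how the states X and control inputs U are loaded, is an assumption not shown in the text; only the closed-form expressions themselves come from the derivation):

% X = ...; U = ...;    % load the d x n state matrix and the control-input matrix from data2.txt
Xt = X(:, 1:end-1);  Ut = U(:, 1:end-1);  Y = X(:, 2:end);

Sxx = Xt * Xt';  Suu = Ut * Ut';     % sum_i x_i x_i^t, sum_i u_i u_i^t
Sxu = Xt * Ut';  Sux = Ut * Xt';     % sum_i x_i u_i^t, sum_i u_i x_i^t
Syx = Y * Xt';   Syu = Y * Ut';      % sum_i y_i x_i^t, sum_i y_i u_i^t

% Closed forms obtained by substituting one normal equation into the other:
A = (Syx / Sxx - (Syu / Suu) * (Sux / Sxx)) / (eye(size(Sxx)) - (Sxu / Suu) * (Sux / Sxx));
B = (Syu / Suu - (Syx / Sxx) * (Sxu / Suu)) / (eye(size(Suu)) - (Sux / Sxx) * (Sxu / Suu));

R = Y - A * Xt - B * Ut;             % residuals m_t
Sigma = (R * R') / size(R, 2);       % MLE of the noise covariance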

Putting the values of x, u and y from data2.txt and solving for A, B and Σ, we get

A = [ … ]

B = [ … ]

Σ = [ … ]

b) Based on the results, do you have a guess of what the model above actually models?

This system also models a time-dependent process, like the one in question 2. In the case of this system, however, the additional u_t vector can be considered as a control input which, along with the current state of the system and the noise component, is used to predict the new position of the system at time t + 1. Considering the motion-planning robot, where the position x_{t+1} at time t + 1 is derived from its position at time t, which corresponds to the state of the robot at the current time, the additional u_t vector could be the user input given to the robot. Such an input may have different effects on the robot depending on the terrain and other environmental factors, and this equation models such a system. The noise component can once again be considered as the effect of the robot's environment, which prevents the robot from following a perfect and expected trajectory. Such effects could be wind changes, terrain changes or any other naturally occurring events, which are hard to model and predict, and are therefore taken as a random number, simplifying our system.
