Machine Learning Assignment-1
University of Utah, School of Computing
Chandramouli, Shridhara
Singla, Sumedha
September 10

1 Linear Regression

a) Create a Matlab function that draws a random number from the univariate normal distribution $N(m, \sigma^2)$ for any $m, \sigma \in \mathbb{R}$. How do you test whether you have done this correctly?

Ans:

Goal: To create a function that takes three parameters (n, m, sd) as input and returns an array of n random numbers, where:
n is the number of random numbers to be generated,
m is the required mean of the normal distribution,
sd is the required standard deviation of the normal distribution (Step 3 below multiplies by sd, so the variance of the result is sd^2).

Approach 1

Step 1. Find uniformly distributed random numbers.

Step 2. Transform the uniformly distributed numbers to the standard normal distribution using the Box-Muller transformation.

Box-Muller transformation: Suppose $U_1$ and $U_2$ are independent random variables that are uniformly distributed in the interval (0, 1]. Then, according to the Box-Muller transformation, $Z_0$ and $Z_1$ below are independent random variables with a standard normal distribution:

$$Z_0 = \sqrt{-2 \ln U_1}\,\cos(2\pi U_2)$$

$$Z_1 = \sqrt{-2 \ln U_1}\,\sin(2\pi U_2)$$

Step 3. Scale the resulting standard normal samples to a normal distribution with mean m and standard deviation sd:

$$ND = m + sd \cdot snd$$

where ND is the normal distribution and snd is the standard normal distribution.

To test this function we plot a histogram for n = 1000. It comes out as a bell curve that is symmetric about the mean m. See figure 1.1 for the curve.

Approach 2

By the central limit theorem, the mean of many independent uniformly distributed random numbers is approximately normally distributed. To create a uniform random number generator, we use a pseudo-random generator of the form

$$x_{i+1} = a\,x_i \bmod b$$

We chose a = 40692; the values of $x_0$ and b are missing from this transcription. To get a normally distributed curve, we draw a matrix of random numbers from the pseudo-random number generator and take the arithmetic mean of the values in each row, which gives one random number per row. The seed values for the PRNG are made to persist, so each run produces distinct numbers. We then scale the results to the specified mean and variance as per equation (1.1). The histogram of the values generated using this method is shown in figure 1.2.

Figure 1.1: Histogram of random numbers generated using the Box-Muller transformation
Figure 1.2: Histogram of random numbers generated using the central limit theorem

b) Choose an arbitrary non-linear univariate mathematical function $f : \mathbb{R} \to \mathbb{R}$. Make your function complex enough such that it is different from the answer of other groups with a high likelihood.

Ans: For this exercise, we are asked to choose a univariate function $f : \mathbb{R} \to \mathbb{R}$. We decided to use the astroid curve, which is defined as

$$y = \pm\left(a^{2/3} - x^{2/3}\right)^{3/2}$$

where $a \in \mathbb{R}_{>0}$ and $-a \le x \le a$ (for negative x, $x^{2/3}$ is read as $(x^2)^{1/3}$).

This curve is interesting for our purpose of linear regression because y takes two values for each value of x, one positive and the other negative. We believe it will serve as a diagnostic tool for our linear regression problem, as we can use it in two ways: considering only the positive values of y, or considering both the positive and negative values of y.

c) Generate random data $(x_i, y_i)$ such that $y_i = f(x_i) + \xi_i$, $\xi_i \sim N(0, \sigma^2)$, for a range of x-values and a value of $\sigma$ chosen such that the data is interesting. Plot the data.

Approach

As mentioned above, the astroid curve can be plotted either by considering both the negative and positive signs of y or by using just one of the signs. We wrote a Matlab function, generate_data.m, which generates the y values from a range of user-specified x values. Plots of the curve for various parameters are shown in figures 1.3, 1.4, 1.5 and 1.6.

Figure 1.3: Plot of positive astroid without noise
Figure 1.4: Plot of positive astroid with added noise N(0, 3)
Figure 1.5: Plot of full astroid without noise
Figure 1.6: Plot of full astroid with added noise N(0, 3)
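The Box-Muller sampler from part (a) and the noisy astroid data from part (c) can be sketched in a few lines. The report's own code is Matlab and is not shown, so the Python below is only an illustrative stand-in: the names box_muller, astroid and generate_data (the last borrowed from the file name generate_data.m), their signatures and their default arguments are assumptions. The sketch reads sd as a standard deviation and evaluates x^(2/3) as |x|^(2/3). For the full curve it samples both branches at every x, which gives 1202 points rather than the 1201 the report mentions later; how the authors combined the two branches is not stated.

```python
import math
import random


def box_muller(n, m=0.0, sd=1.0, rng=random):
    """Draw n samples from N(m, sd^2) using the Box-Muller transform."""
    samples = []
    while len(samples) < n:
        u1 = 1.0 - rng.random()  # map [0, 1) to (0, 1] so log(u1) is finite
        u2 = rng.random()
        r = math.sqrt(-2.0 * math.log(u1))
        z0 = r * math.cos(2.0 * math.pi * u2)
        z1 = r * math.sin(2.0 * math.pi * u2)
        # Scale the standard normal pair to mean m and standard deviation sd.
        samples.extend((m + sd * z0, m + sd * z1))
    return samples[:n]


def astroid(x, a):
    """Positive branch y = (a^(2/3) - |x|^(2/3))^(3/2) for -a <= x <= a."""
    inner = a ** (2.0 / 3.0) - abs(x) ** (2.0 / 3.0)
    return max(inner, 0.0) ** 1.5  # clamp rounding errors near |x| = a


def generate_data(a=30.0, step=0.1, sd=3.0, full=False, rng=random):
    """Noisy astroid samples; full=True adds the negative branch as well."""
    count = int(round(2.0 * a / step)) + 1
    xs = [-a + i * step for i in range(count)]
    signs = (1.0, -1.0) if full else (1.0,)
    points = [(x, s * astroid(x, a)) for s in signs for x in xs]
    noise = box_muller(len(points), 0.0, sd, rng)
    return [(x, y + e) for (x, y), e in zip(points, noise)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    data = generate_data(rng=rng)
    print(len(data), data[300])
```

A quick numerical check that complements the histogram test in the text is to compare the sample mean and variance of a large draw with m and sd^2.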
Figures 1.3 to 1.6 were drawn using a = 30 for every $-a \le x \le a$ in intervals of 0.1. Altering the value of a does not change the shape of the curve; it acts as a scaling factor, increasing the size of the curve as well as the range of values over which x is defined. We used a moderate value of $\sigma = 3$, as too much noise changes the shape of the curve, while too little does not reflect a real-life scenario.

d) Use linear least-squares to fit a polynomial curve through your generated data and compute the mean-squared error of your estimate. Give your derivation. Vary the number of data points you use and the degree of the polynomial. Plot the results. Describe in words what you see.

Approach

Let us assume that the generated data satisfy the polynomial equation

$$y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \dots + a_m x^m$$

where m is the degree of the polynomial, that is,

$$Y = A^T X, \qquad A^T = [a_0\ a_1\ a_2\ \dots\ a_m], \qquad X^T = [1\ x\ x^2\ \dots\ x^m]$$

The values of the coefficients A are determined by fitting the polynomial to the training data (the noisy data generated from the astroid curve). This can be done by minimizing an error function that measures the misfit between the function y(x, A), for any given value of A, and the training set data points. We take the squared error as our error function (the mean squared error is this sum divided by n and has the same minimizer):

$$E(A) = \sum_{i=1}^{n} \left\| Y_i - A^T X_i \right\|^2$$

$$E(A) = \sum_{i=1}^{n} (Y_i - A^T X_i)(Y_i - A^T X_i)^T$$

$$E(A) = \sum_{i=1}^{n} (Y_i - A^T X_i)(Y_i^T - X_i^T A)$$

$$E(A) = \sum_{i=1}^{n} \left( Y_i Y_i^T - A^T X_i Y_i^T - Y_i X_i^T A + A^T X_i X_i^T A \right)$$

Each term is a scalar, so $A^T X_i Y_i^T = Y_i X_i^T A$ and

$$E(A) = \sum_{i=1}^{n} \left( Y_i Y_i^T - 2 Y_i X_i^T A + A^T X_i X_i^T A \right)$$

$$E(A) = A^T \Big( \sum_{i=1}^{n} X_i X_i^T \Big) A - 2 \Big( \sum_{i=1}^{n} Y_i X_i^T \Big) A + \sum_{i=1}^{n} Y_i Y_i^T$$

Taking the trace:

$$\mathrm{tr}(E(A)) = \mathrm{tr}\Big( A^T \Big( \sum_{i=1}^{n} X_i X_i^T \Big) A \Big) - \mathrm{tr}\Big( 2 \Big( \sum_{i=1}^{n} Y_i X_i^T \Big) A \Big) + \mathrm{tr}\Big( \sum_{i=1}^{n} Y_i Y_i^T \Big)$$

Taking the derivative with respect to A, using $\partial\,\mathrm{tr}(A^T C A)/\partial A = (C + C^T) A$ and $\partial\,\mathrm{tr}(B A)/\partial A = B^T$:

$$\frac{\partial\,\mathrm{tr}(E(A))}{\partial A} = \Big( \sum_{i=1}^{n} X_i X_i^T \Big)^T A + \Big( \sum_{i=1}^{n} X_i X_i^T \Big) A - 2 \Big( \sum_{i=1}^{n} Y_i X_i^T \Big)^T + 0$$

The minimum value of E(A) is at the point where this derivative is zero. Since $\sum_i X_i X_i^T$ is symmetric,

$$\Big( \sum_{i=1}^{n} X_i X_i^T \Big) A = \sum_{i=1}^{n} X_i Y_i^T$$

The linear least-squares value of A is therefore

$$A = \Big( \sum_{i=1}^{n} X_i X_i^T \Big)^{-1} \sum_{i=1}^{n} X_i Y_i^T, \qquad \text{equivalently} \qquad A^T = \Big( \sum_{i=1}^{n} Y_i X_i^T \Big) \Big( \sum_{i=1}^{n} X_i X_i^T \Big)^{-1}$$

Fitting a polynomial to the astroid curve

We performed experiments on linear regression with both the positive astroid and the full astroid curves. For all the experiments, the data were generated with the function parameter a = 30, x was sampled over $[-a, a]$ at intervals of 0.1, and Gaussian noise was added to the samples using the Gaussian random number generator defined above.

The data generation is implemented in Matlab as generate_data.m. The function takes as arguments the value of a, the sampling frequency, the mean and the variance for the random number generator, and a flag which specifies whether we require a positive or a full astroid curve. It returns a set of noisy observations of the astroid curve, where the noise component is drawn from a normally distributed random number generator.

The resulting data was then fed to fit_polynomial.m, which implements the above least-squares estimator: it takes the data points from generate_data.m and the polynomial degree and computes the value of A that minimizes the mean squared distance to the data. We then plotted the predicted values of y against $-a \le x \le a$ in intervals of 0.1 to get an idea of how good the estimates are.
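The closed-form estimate derived above translates almost directly into code. The sketch below is not the authors' fit_polynomial.m; it is a hypothetical Python/NumPy counterpart (the function names, signatures and the helper mean_squared_error are assumptions) that builds the two sums from the derivation and solves the resulting linear system instead of forming an explicit inverse.

```python
import numpy as np


def fit_polynomial(x, y, degree):
    """Least-squares coefficients [a_0, ..., a_m] for y ~ sum_k a_k x^k."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Column i of X is X_i = [1, x_i, x_i^2, ..., x_i^m]^T.
    X = np.vander(x, degree + 1, increasing=True).T
    sxx = X @ X.T  # sum_i X_i X_i^T
    sxy = X @ y    # sum_i X_i Y_i
    # Solve (sum X_i X_i^T) A = sum X_i Y_i rather than inverting sxx.
    return np.linalg.solve(sxx, sxy)


def mean_squared_error(x, y, coeffs):
    """Mean squared error of the fitted polynomial on the points (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    predictions = np.vander(x, len(coeffs), increasing=True) @ coeffs
    return float(np.mean((y - predictions) ** 2))


if __name__ == "__main__":
    # Made-up test polynomial, used only to check that the fit recovers it.
    x = np.linspace(-1.0, 1.0, 50)
    y = 1.0 + 2.0 * x - 3.0 * x ** 2
    coeffs = fit_polynomial(x, y, degree=2)
    print(coeffs, mean_squared_error(x, y, coeffs))
```

One practical caveat: with x in [-30, 30] and degrees around 10 or more, the matrix of summed powers becomes badly conditioned, so rescaling x to [-1, 1] before fitting (or using an orthogonal-decomposition solver such as numpy.linalg.lstsq) is a sensible precaution.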
Positive Astroid

For the positive astroid, the data was generated using a noise variance of 3 and a mean of 0. The data set contained 601 pairs (x, y) corresponding to the values of the curve from -a to a. We then tried varying the degree of the polynomial from n = 2 to n = 10. The plots of the predicted model and of the original data are shown below.

Figure 1.7: Plot of original data with noise
Figure 1.8: Plot of predicted values taking n = 2
Figure 1.9: Plot of predicted values taking n = 3

From the figures, it is clear that a smaller degree gives the best fit for the data. On increasing the degree of the polynomial to n = 10, the model tries to overfit the data, and many of the local minima and maxima of the fitted curve correspond to the noise deviations. The lower-degree polynomials with n = 3 or n = 4 provide the best fit for the curve in our experiments.
Figure 1.10: Plot of predicted values taking n = 4
Figure 1.11: Plot of predicted values taking n = 10

Full Astroid

For the full astroid, the data was generated using a noise variance of 3 and a mean of 0, just as was done for the positive astroid. The data set contained 1201 pairs (x, y) corresponding to the values of the curve from -a to a. We then tried varying the degree of the polynomial in the range n = 2 to n = 20. The plots of the predicted model and of the original data are shown below.

Figure 1.12: Plot of original data with noise

As we can see from figures 1.13 to 1.18, the linear model fails to model the full astroid curve effectively, unlike the positive astroid curve from the previous section. We find that while a lower degree of the polynomial (such as n = 2 and n = 4) avoids overfitting, it fails to provide a suitable prediction for x > 0. On the other hand, polynomials of higher order simply oscillate between positive and negative values and do not provide a good model for the full astroid. This example therefore shows the limitations of a linear model in making predictions of values which follow a complex non-linear function.
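One way to see why no polynomial degree can succeed here is to look at the noise-free full curve. The sketch below rests on an assumption that the report does not confirm, namely that both branches are sampled at the same x values. Under that assumption every x carries the targets +y and -y, their contributions to the normal equations cancel, and the least-squares polynomial is the zero polynomial for any degree, so the error stays at the mean of y^2. The function name fit_full_astroid and its arguments are hypothetical.

```python
import numpy as np


def fit_full_astroid(degree, a=30.0, points=601):
    """Fit a polynomial to the noise-free full astroid; return (coeffs, mse)."""
    x = np.linspace(-a, a, points)
    upper = np.maximum(a ** (2 / 3) - np.abs(x) ** (2 / 3), 0.0) ** 1.5
    t = np.concatenate([x, x]) / a       # rescaled to [-1, 1] for conditioning
    y = np.concatenate([upper, -upper])  # both branches at every x
    X = np.vander(t, degree + 1, increasing=True)
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    mse = float(np.mean((y - X @ coeffs) ** 2))
    return coeffs, mse


if __name__ == "__main__":
    for degree in (2, 4, 10):
        coeffs, mse = fit_full_astroid(degree)
        # Largest coefficient magnitude and the resulting error per degree.
        print(degree, float(np.max(np.abs(coeffs))), mse)
```

With noise added, the fitted coefficients are no longer exactly zero, but whatever structure they pick up comes from the noise rather than from the curve.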
Figure 1.13: Plot of predicted values taking n = 2
Figure 1.14: Plot of predicted values taking n = 4
Figure 1.15: Plot of predicted values taking n = 6
Figure 1.16: Plot of predicted values taking n = 10
Figure 1.17: Plot of predicted values taking n = 15
Figure 1.18: Plot of predicted values taking n = 20

While it provides a very good approximation for the positive astroid, the linear model fails to model the full astroid well.

2 Linear Multivariate Regression

The attached file data.txt has two numbers d and n on the first line; d is the dimension of the data, and n is the number of data points. Then n lines of d numbers each follow, which constitute a set of vectors $\{x_t\}_{t=1}^{n}$ with $x_t \in \mathbb{R}^d$. The data $\{x_t\}_{t=1}^{n}$ is hypothesized to be generated by a model:

$$x_{t+1} = A x_t + m_t, \qquad m_t \sim N(0, \Sigma)$$

where $A \in \mathbb{R}^{d \times d}$, $m_t \in \mathbb{R}^d$ and $\Sigma \in \mathbb{R}^{d \times d}$.

a) Compute best estimates for A and $\Sigma$. Give your derivation.

Sol: To calculate the values of A and $\Sigma$, let us assume we have n pairs of data (X, Y), where n is the number of data points. The given hypothesis is

$$X_{t+1} = A X_t + M_t$$
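For the sake of convenience, let us replace $X_{t+1}$ by Y. The equation now becomes

$$Y = A X_t + M_t, \qquad M_t \sim N(0, \Sigma)$$

This can be rewritten as

$$Y \sim N(A X, \Sigma)$$

which represents a normal distribution with mean $A X_t$. In order to get the values of A and $\Sigma$, let us consider the maximum likelihood estimate (MLE). The general likelihood function is

$$P(x) = \frac{1}{\sqrt{(2\pi)^d\,|\Sigma|}}\; e^{-\frac{1}{2}(x - \hat{x})^T \Sigma^{-1} (x - \hat{x})}$$

For our model,

$$l(A, \Sigma) = \prod_{i=1}^{n} \frac{1}{\sqrt{(2\pi)^d\,|\Sigma|}}\; e^{-\frac{1}{2}(y_i - A x_i)^T \Sigma^{-1} (y_i - A x_i)}$$

We need to find the values of A and $\Sigma$ which maximize the above function. Therefore, let us take the natural logarithm of both sides, then take the derivative and equate it to zero.

$$\ln l(A, \Sigma) = \sum_{i=1}^{n} \ln\left( \frac{1}{\sqrt{(2\pi)^d\,|\Sigma|}}\; e^{-\frac{1}{2}(y_i - A x_i)^T \Sigma^{-1} (y_i - A x_i)} \right)$$

$$\ln l(A, \Sigma) = n \log\left( (2\pi)^{-d/2} \right) + \frac{n}{2} \log\left|\Sigma^{-1}\right| - \frac{1}{2} \sum_{i=1}^{n} (y_i - A x_i)^T \Sigma^{-1} (y_i - A x_i)$$

The term $n \log\left((2\pi)^{-d/2}\right)$ is a constant; let us denote it by c. On further simplification, we get

$$\ln l(A, \Sigma) = c + \frac{n}{2} \log\left|\Sigma^{-1}\right| - \frac{1}{2} \sum_{i=1}^{n} (y_i - A x_i)^T \Sigma^{-1} (y_i - A x_i)$$

Taking the trace:

$$\ln l(A, \Sigma) = c + \frac{n}{2} \log\left|\Sigma^{-1}\right| - \frac{1}{2} \sum_{i=1}^{n} \mathrm{tr}\left( (y_i - A x_i)^T \Sigma^{-1} (y_i - A x_i) \right)$$

$$= c + \frac{n}{2} \log\left|\Sigma^{-1}\right| - \frac{1}{2} \sum_{i=1}^{n} \mathrm{tr}\left( \Sigma^{-1} (y_i - A x_i)(y_i - A x_i)^T \right)$$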
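The last step uses the cyclic property tr(ABC) = tr(BCA) = tr(CAB). Expanding the last term, we get

$$= c + \frac{n}{2} \log\left|\Sigma^{-1}\right| - \frac{1}{2} \sum_{i=1}^{n} \mathrm{tr}\left( \Sigma^{-1} \left( y_i y_i^T - y_i x_i^T A^T - A x_i y_i^T + A x_i x_i^T A^T \right) \right)$$

$$= c + \frac{n}{2} \log\left|\Sigma^{-1}\right| - \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} \sum_{i} y_i y_i^T \Big) + \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} \sum_{i} y_i x_i^T A^T \Big) + \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} A \sum_{i} x_i y_i^T \Big) - \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} A \sum_{i} x_i x_i^T A^T \Big)$$

Here $\mathrm{tr}\left( \Sigma^{-1} y_i x_i^T A^T \right) = \mathrm{tr}\left( \Sigma^{-1} A x_i y_i^T \right)$, since $\mathrm{tr}(A) = \mathrm{tr}(A^T)$, $\Sigma^{-1}$ is symmetric, and tr(ABC) = tr(BCA) = tr(CAB). Therefore, by simplification we get

$$\ln l(A, \Sigma) = c + \frac{n}{2} \log\left|\Sigma^{-1}\right| - \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} \sum_{i} y_i y_i^T \Big) + \mathrm{tr}\Big( \Sigma^{-1} \sum_{i} y_i x_i^T A^T \Big) - \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} A \sum_{i} x_i x_i^T A^T \Big)$$

In order to find the values of A and $\Sigma$, let us take the derivatives and equate them to zero.

$$\frac{\partial \ln l(A, \Sigma)}{\partial A} = 0 = \Sigma^{-1} \sum_{i} y_i x_i^T - \frac{1}{2} \Sigma^{-1} \Big( A \sum_{i} x_i x_i^T + A \sum_{i} x_i x_i^T \Big)$$

Here we used $\frac{\partial\,\mathrm{tr}(AB)}{\partial A} = B^T$ and $\frac{\partial\,\mathrm{tr}(C A B A^T)}{\partial A} = C^T A B^T + C A B$. It follows that

$$A \sum_{i} x_i x_i^T = \sum_{i} y_i x_i^T \quad\Longrightarrow\quad A = \Big( \sum_{i} y_i x_i^T \Big) \Big( \sum_{i} x_i x_i^T \Big)^{-1}$$

Setting the derivative with respect to $\Sigma^{-1}$ to zero, using $\partial \log|\Sigma^{-1}| / \partial \Sigma^{-1} = \Sigma$, gives

$$\Sigma = \frac{1}{n} \sum_{i=1}^{n} (y_i - A x_i)(y_i - A x_i)^T$$

Putting in the values of x and y from data.txt and solving for A and $\Sigma$, we get

A = [numerical entries lost in this transcription]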
$\Sigma$ = [numerical entries lost in this transcription]

b) Based on the results, do you have a guess of what the model above actually models?

The system models a time-dependent system, where the predicted output of an observation or experiment depends on the current state of the system and on a random event which acts as noise. One example of such a system is a motion-planning robot, where the position $x_{t+1}$ at time t + 1 is derived from its position at time t, which corresponds to the state of the robot at the current time. The noise component can be considered as the effects of the robot's environment, which prevent the robot from following a perfect, expected trajectory. Such effects could be wind changes, terrain changes or any other naturally occurring events which are hard to model and predict and are therefore taken as a random number, simplifying our system. Considering the motion planner as the input, the d-dimensional feature vector x would be some spatial position of the object or robot represented in d components.

3 Linear Multivariate Regression with Inputs

The given hypothesis is

$$X_{t+1} = A X_t + B U_t + M_t$$

For the sake of convenience, let us replace $X_{t+1}$ by Y. The equation now becomes

$$Y = A X_t + B U_t + M_t, \qquad M_t \sim N(0, \Sigma)$$

This can be rewritten as

$$Y \sim N(A X + B U, \Sigma)$$

which represents a normal distribution with mean $A X + B U$. In order to get the values of A, B and $\Sigma$, let us consider the maximum likelihood estimate (MLE). The general likelihood function is

$$P(x) = \frac{1}{\sqrt{(2\pi)^d\,|\Sigma|}}\; e^{-\frac{1}{2}(x - \hat{x})^T \Sigma^{-1} (x - \hat{x})}$$

For our model,

$$l(A, B, \Sigma) = \prod_{i=1}^{n} \frac{1}{\sqrt{(2\pi)^d\,|\Sigma|}}\; e^{-\frac{1}{2}(y_i - A x_i - B u_i)^T \Sigma^{-1} (y_i - A x_i - B u_i)}$$
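We need to find the values of A, B and $\Sigma$ which maximize the above function. Therefore, let us take the natural logarithm of both sides, then take the derivative and equate it to zero.

$$\ln l(A, B, \Sigma) = \sum_{i=1}^{n} \ln\left( \frac{1}{\sqrt{(2\pi)^d\,|\Sigma|}}\; e^{-\frac{1}{2}(y_i - A x_i - B u_i)^T \Sigma^{-1} (y_i - A x_i - B u_i)} \right)$$

$$\ln l(A, B, \Sigma) = n \log\left( (2\pi)^{-d/2} \right) + \frac{n}{2} \log\left|\Sigma^{-1}\right| - \frac{1}{2} \sum_{i=1}^{n} (y_i - A x_i - B u_i)^T \Sigma^{-1} (y_i - A x_i - B u_i)$$

The term $n \log\left((2\pi)^{-d/2}\right)$ is a constant; let us denote it by c. On further simplification, we get

$$\ln l(A, B, \Sigma) = c + \frac{n}{2} \log\left|\Sigma^{-1}\right| - \frac{1}{2} \sum_{i=1}^{n} (y_i - A x_i - B u_i)^T \Sigma^{-1} (y_i - A x_i - B u_i)$$

Taking the trace, and using tr(ABC) = tr(BCA) = tr(CAB):

$$\ln l(A, B, \Sigma) = c + \frac{n}{2} \log\left|\Sigma^{-1}\right| - \frac{1}{2} \sum_{i=1}^{n} \mathrm{tr}\left( (y_i - A x_i - B u_i)^T \Sigma^{-1} (y_i - A x_i - B u_i) \right)$$

$$= c + \frac{n}{2} \log\left|\Sigma^{-1}\right| - \frac{1}{2} \sum_{i=1}^{n} \mathrm{tr}\left( \Sigma^{-1} (y_i - A x_i - B u_i)(y_i - A x_i - B u_i)^T \right)$$

Expanding the last term, we get

$$= c + \frac{n}{2} \log\left|\Sigma^{-1}\right| - \frac{1}{2} \sum_{i=1}^{n} \mathrm{tr}\Big( \Sigma^{-1} \big( y_i y_i^T - y_i x_i^T A^T - y_i u_i^T B^T - A x_i y_i^T + A x_i x_i^T A^T + B u_i u_i^T B^T + A x_i u_i^T B^T - B u_i y_i^T + B u_i x_i^T A^T \big) \Big)$$

$$= c + \frac{n}{2} \log\left|\Sigma^{-1}\right| - \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} \sum_{i} y_i y_i^T \Big) + \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} \sum_{i} y_i x_i^T A^T \Big) + \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} \sum_{i} y_i u_i^T B^T \Big) + \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} A \sum_{i} x_i y_i^T \Big) - \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} A \sum_{i} x_i x_i^T A^T \Big) - \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} B \sum_{i} u_i u_i^T B^T \Big) - \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} A \sum_{i} x_i u_i^T B^T \Big) + \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} B \sum_{i} u_i y_i^T \Big) - \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} B \sum_{i} u_i x_i^T A^T \Big)$$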
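Here $\mathrm{tr}\left( \Sigma^{-1} y_i x_i^T A^T \right) = \mathrm{tr}\left( \Sigma^{-1} A x_i y_i^T \right)$, since $\mathrm{tr}(A) = \mathrm{tr}(A^T)$ and tr(ABC) = tr(BCA) = tr(CAB). Similarly, $\mathrm{tr}\left( \Sigma^{-1} y_i u_i^T B^T \right) = \mathrm{tr}\left( \Sigma^{-1} B u_i y_i^T \right)$ and $\frac{1}{2} \mathrm{tr}\left( \Sigma^{-1} A x_i u_i^T B^T \right) = \frac{1}{2} \mathrm{tr}\left( \Sigma^{-1} B u_i x_i^T A^T \right)$. Therefore, by simplification we get:

$$\ln l(A, B, \Sigma) = c + \frac{n}{2} \log\left|\Sigma^{-1}\right| - \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} \sum_{i} y_i y_i^T \Big) + \mathrm{tr}\Big( \Sigma^{-1} \sum_{i} y_i x_i^T A^T \Big) + \mathrm{tr}\Big( \Sigma^{-1} \sum_{i} y_i u_i^T B^T \Big) - \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} A \sum_{i} x_i x_i^T A^T \Big) - \frac{1}{2} \mathrm{tr}\Big( \Sigma^{-1} B \sum_{i} u_i u_i^T B^T \Big) - \mathrm{tr}\Big( \Sigma^{-1} A \sum_{i} x_i u_i^T B^T \Big)$$

In order to find the values of A, B and $\Sigma$, let us take the derivatives and equate them to zero.

$$\frac{\partial \ln l(A, B, \Sigma)}{\partial A} = 0 = \Sigma^{-1} \sum_{i} y_i x_i^T - \frac{1}{2} \Sigma^{-1} \Big( A \sum_{i} x_i x_i^T + A \sum_{i} x_i x_i^T \Big) - \Sigma^{-1} B \sum_{i} u_i x_i^T$$

Here we used $\frac{\partial\,\mathrm{tr}(AB)}{\partial A} = B^T$ and $\frac{\partial\,\mathrm{tr}(C A B A^T)}{\partial A} = C^T A B^T + C A B$. It follows that

$$A \sum_{i} x_i x_i^T = \sum_{i} y_i x_i^T - B \sum_{i} u_i x_i^T \quad\Longrightarrow\quad A = \Big( \sum_{i} y_i x_i^T - B \sum_{i} u_i x_i^T \Big) \Big( \sum_{i} x_i x_i^T \Big)^{-1}$$

Similarly, deriving for B:

$$\frac{\partial \ln l(A, B, \Sigma)}{\partial B} = 0 = \Sigma^{-1} \sum_{i} y_i u_i^T - \frac{1}{2} \Sigma^{-1} \Big( B \sum_{i} u_i u_i^T + B \sum_{i} u_i u_i^T \Big) - \Sigma^{-1} A \sum_{i} x_i u_i^T$$

$$B \sum_{i} u_i u_i^T = \sum_{i} y_i u_i^T - A \sum_{i} x_i u_i^T$$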
$$\Longrightarrow\quad B = \Big( \sum_{i} y_i u_i^T - A \sum_{i} x_i u_i^T \Big) \Big( \sum_{i} u_i u_i^T \Big)^{-1}$$

We have two equations in terms of A and B, and we can solve them to obtain the exact values of A and B. Here Y, U and X denote the matrices whose columns are the vectors $y_i$, $u_i$ and $x_i$ of all the data points, so that $Y X^T = \sum_i y_i x_i^T$, $X X^T = \sum_i x_i x_i^T$, and so on.

$$A = \left( Y X^T (X X^T)^{-1} - Y U^T (U U^T)^{-1} U X^T (X X^T)^{-1} \right) \left( I - X U^T (U U^T)^{-1} U X^T (X X^T)^{-1} \right)^{-1}$$

$$B = \left( Y U^T (U U^T)^{-1} - Y X^T (X X^T)^{-1} X U^T (U U^T)^{-1} \right) \left( I - U X^T (X X^T)^{-1} X U^T (U U^T)^{-1} \right)^{-1}$$

$$\Sigma = \frac{1}{n} \sum_{i=1}^{n} (y_i - A x_i - B u_i)(y_i - A x_i - B u_i)^T$$

Putting in the values of x and y from data2.txt and solving for A, B and $\Sigma$, we get

A = [numerical entries lost in this transcription]
$\Sigma$ = [numerical entries lost in this transcription]
B = [numerical entries lost in this transcription]

b) Based on the results, do you have a guess of what the model above actually models?

This system also models a time-dependent system, like question 2. In this case, however, the additional vector $u_t$ can be considered as the control input to the system which, along with the current state of the system and the noise component, can be used to predict the new position of the system at time t + 1. Considering the motion-planning robot, where the position $x_{t+1}$ at time t + 1 is derived from its position at time t, which corresponds to the state of the robot at the current time, the additional vector $u_t$ could be the user input given to the robot. Such an input may have different effects on the robot depending on the terrain and other environmental factors, and this model captures such a system. The noise component can once again be considered as the effects of the robot's environment, which prevent the robot from following a perfect, expected trajectory. Such effects could be wind changes, terrain changes or any other naturally occurring events which are hard to model and predict and are therefore taken as a random number, simplifying our system.
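The estimators of sections 2 and 3 can be checked numerically. The sketch below is a hypothetical Python/NumPy helper, not the authors' code, and the name estimate_dynamics and its signature are assumptions. Instead of evaluating the closed-form expressions for A and B separately, it stacks $x_t$ and $u_t$ into one regressor vector and solves for [A B] in a single step, which satisfies the same pair of stationarity equations. The matrices in the demo are made up for the check and are unrelated to data.txt or data2.txt.

```python
import numpy as np


def estimate_dynamics(X, Y, U=None):
    """Maximum-likelihood estimates for Y = A X (+ B U) + noise, noise ~ N(0, Sigma).

    X and Y are d x n and U (optional) is k x n; column t holds x_t, x_{t+1}, u_t.
    Returns (A, B, Sigma); B is None when no inputs are given.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    d, n = X.shape
    Z = X if U is None else np.vstack([X, np.asarray(U, dtype=float)])
    # [A B] (Z Z^T) = Y Z^T; Z Z^T is symmetric, so solve the transposed system.
    AB = np.linalg.solve(Z @ Z.T, Z @ Y.T).T
    residual = Y - AB @ Z
    sigma = residual @ residual.T / n  # (1/n) sum of residual outer products
    return AB[:, :d], (None if U is None else AB[:, d:]), sigma


# Synthetic check with made-up system matrices.
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])
B_true = np.array([[1.0], [0.5]])
n = 500
U = rng.normal(size=(1, n))
noise = 0.01 * rng.normal(size=(2, n))
states = np.zeros((2, n + 1))
for t in range(n):
    states[:, t + 1] = A_true @ states[:, t] + B_true @ U[:, t] + noise[:, t]

A_hat, B_hat, sigma_hat = estimate_dynamics(states[:, :-1], states[:, 1:], U)
print(np.round(A_hat, 3))
print(np.round(B_hat, 3))
```

Calling the function without U covers the model of section 2, where it reduces to $A = (\sum_i y_i x_i^T)(\sum_i x_i x_i^T)^{-1}$.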
More informationSolutions to Odd Numbered End of Chapter Exercises: Chapter 4
Itroductio to Ecoometrics (3 rd Updated Editio) by James H. Stock ad Mark W. Watso Solutios to Odd Numbered Ed of Chapter Exercises: Chapter 4 (This versio July 2, 24) Stock/Watso - Itroductio to Ecoometrics
More informationFACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures
FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals
More informationRandom Signals and Noise Winter Semester 2017 Problem Set 12 Wiener Filter Continuation
Radom Sigals ad Noise Witer Semester 7 Problem Set Wieer Filter Cotiuatio Problem (Sprig, Exam A) Give is the sigal W t, which is a Gaussia white oise with expectatio zero ad power spectral desity fuctio
More informationMachine Learning Brett Bernstein
Machie Learig Brett Berstei Week Lecture: Cocept Check Exercises Starred problems are optioal. Statistical Learig Theory. Suppose A = Y = R ad X is some other set. Furthermore, assume P X Y is a discrete
More informationDepartment of Civil Engineering-I.I.T. Delhi CEL 899: Environmental Risk Assessment HW5 Solution
Departmet of Civil Egieerig-I.I.T. Delhi CEL 899: Evirometal Risk Assessmet HW5 Solutio Note: Assume missig data (if ay) ad metio the same. Q. Suppose X has a ormal distributio defied as N (mea=5, variace=
More information1 Approximating Integrals using Taylor Polynomials
Seughee Ye Ma 8: Week 7 Nov Week 7 Summary This week, we will lear how we ca approximate itegrals usig Taylor series ad umerical methods. Topics Page Approximatig Itegrals usig Taylor Polyomials. Defiitios................................................
More information(X i X)(Y i Y ) = 1 n
L I N E A R R E G R E S S I O N 10 I Chapter 6 we discussed the cocepts of covariace ad correlatio two ways of measurig the extet to which two radom variables, X ad Y were related to each other. I may
More informationResponse Variable denoted by y it is the variable that is to be predicted measure of the outcome of an experiment also called the dependent variable
Statistics Chapter 4 Correlatio ad Regressio If we have two (or more) variables we are usually iterested i the relatioship betwee the variables. Associatio betwee Variables Two variables are associated
More information6. Sufficient, Complete, and Ancillary Statistics
Sufficiet, Complete ad Acillary Statistics http://www.math.uah.edu/stat/poit/sufficiet.xhtml 1 of 7 7/16/2009 6:13 AM Virtual Laboratories > 7. Poit Estimatio > 1 2 3 4 5 6 6. Sufficiet, Complete, ad Acillary
More informationECON 3150/4150, Spring term Lecture 3
Itroductio Fidig the best fit by regressio Residuals ad R-sq Regressio ad causality Summary ad ext step ECON 3150/4150, Sprig term 2014. Lecture 3 Ragar Nymoe Uiversity of Oslo 21 Jauary 2014 1 / 30 Itroductio
More informationEXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY
EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 016 MODULE : Statistical Iferece Time allowed: Three hours Cadidates should aswer FIVE questios. All questios carry equal marks. The umber
More informationThe Maximum-Likelihood Decoding Performance of Error-Correcting Codes
The Maximum-Lielihood Decodig Performace of Error-Correctig Codes Hery D. Pfister ECE Departmet Texas A&M Uiversity August 27th, 2007 (rev. 0) November 2st, 203 (rev. ) Performace of Codes. Notatio X,
More informationExercise 4.3 Use the Continuity Theorem to prove the Cramér-Wold Theorem, Theorem. (1) φ a X(1).
Assigmet 7 Exercise 4.3 Use the Cotiuity Theorem to prove the Cramér-Wold Theorem, Theorem 4.12. Hit: a X d a X implies that φ a X (1) φ a X(1). Sketch of solutio: As we poited out i class, the oly tricky
More information4. Hypothesis testing (Hotelling s T 2 -statistic)
4. Hypothesis testig (Hotellig s T -statistic) Cosider the test of hypothesis H 0 : = 0 H A = 6= 0 4. The Uio-Itersectio Priciple W accept the hypothesis H 0 as valid if ad oly if H 0 (a) : a T = a T 0
More informationMachine Learning for Data Science (CS 4786)
Machie Learig for Data Sciece CS 4786) Lecture & 3: Pricipal Compoet Aalysis The text i black outlies high level ideas. The text i blue provides simple mathematical details to derive or get to the algorithm
More informationTopic 18: Composite Hypotheses
Toc 18: November, 211 Simple hypotheses limit us to a decisio betwee oe of two possible states of ature. This limitatio does ot allow us, uder the procedures of hypothesis testig to address the basic questio:
More informationLecture 7: Properties of Random Samples
Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ
More informationMassachusetts Institute of Technology
Massachusetts Istitute of Techology 6.867 Machie Learig, Fall 6 Problem Set : Solutios. (a) (5 poits) From the lecture otes (Eq 4, Lecture 5), the optimal parameter values for liear regressio give the
More informationPractice Problems: Taylor and Maclaurin Series
Practice Problems: Taylor ad Maclauri Series Aswers. a) Start by takig derivatives util a patter develops that lets you to write a geeral formula for the -th derivative. Do t simplify as you go, because
More informationNonlinear regression
oliear regressio How to aalyse data? How to aalyse data? Plot! How to aalyse data? Plot! Huma brai is oe the most powerfull computatioall tools Works differetly tha a computer What if data have o liear
More informationSTATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. Comments:
Recall: STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS Commets:. So far we have estimates of the parameters! 0 ad!, but have o idea how good these estimates are. Assumptio: E(Y x)! 0 +! x (liear coditioal
More informationA widely used display of protein shapes is based on the coordinates of the alpha carbons - - C α
Nice plottig of proteis: I A widely used display of protei shapes is based o the coordiates of the alpha carbos - - C α -s. The coordiates of the C α -s are coected by a cotiuous curve that roughly follows
More informationMathematical Notation Math Introduction to Applied Statistics
Mathematical Notatio Math 113 - Itroductio to Applied Statistics Name : Use Word or WordPerfect to recreate the followig documets. Each article is worth 10 poits ad ca be prited ad give to the istructor
More informationSTAT 350 Handout 19 Sampling Distribution, Central Limit Theorem (6.6)
STAT 350 Hadout 9 Samplig Distributio, Cetral Limit Theorem (6.6) A radom sample is a sequece of radom variables X, X 2,, X that are idepedet ad idetically distributed. o This property is ofte abbreviated
More informationCS284A: Representations and Algorithms in Molecular Biology
CS284A: Represetatios ad Algorithms i Molecular Biology Scribe Notes o Lectures 3 & 4: Motif Discovery via Eumeratio & Motif Represetatio Usig Positio Weight Matrix Joshua Gervi Based o presetatios by
More information1 Review and Overview
DRAFT a fial versio will be posted shortly CS229T/STATS231: Statistical Learig Theory Lecturer: Tegyu Ma Lecture #3 Scribe: Migda Qiao October 1, 2013 1 Review ad Overview I the first half of this course,
More informationLecture 3. Properties of Summary Statistics: Sampling Distribution
Lecture 3 Properties of Summary Statistics: Samplig Distributio Mai Theme How ca we use math to justify that our umerical summaries from the sample are good summaries of the populatio? Lecture Summary
More informationIIT JAM Mathematical Statistics (MS) 2006 SECTION A
IIT JAM Mathematical Statistics (MS) 6 SECTION A. If a > for ad lim a / L >, the which of the followig series is ot coverget? (a) (b) (c) (d) (d) = = a = a = a a + / a lim a a / + = lim a / a / + = lim
More informationMachine Learning Theory Tübingen University, WS 2016/2017 Lecture 12
Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture Tolstikhi Ilya Abstract I this lecture we derive risk bouds for kerel methods. We will start by showig that Soft Margi kerel SVM correspods to miimizig
More informationWe will conclude the chapter with the study a few methods and techniques which are useful
Chapter : Coordiate geometry: I this chapter we will lear about the mai priciples of graphig i a dimesioal (D) Cartesia system of coordiates. We will focus o drawig lies ad the characteristics of the graphs
More informationUniversity of California, Los Angeles Department of Statistics. Practice problems - simple regression 2 - solutions
Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 00C Istructor: Nicolas Christou EXERCISE Aswer the followig questios: Practice problems - simple regressio - solutios a Suppose y,
More informationMA Advanced Econometrics: Properties of Least Squares Estimators
MA Advaced Ecoometrics: Properties of Least Squares Estimators Karl Whela School of Ecoomics, UCD February 5, 20 Karl Whela UCD Least Squares Estimators February 5, 20 / 5 Part I Least Squares: Some Fiite-Sample
More informationResampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.
Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator
More informationApproximations and more PMFs and PDFs
Approximatios ad more PMFs ad PDFs Saad Meimeh 1 Approximatio of biomial with Poisso Cosider the biomial distributio ( b(k,,p = p k (1 p k, k λ: k Assume that is large, ad p is small, but p λ at the limit.
More informationGeometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT
OCTOBER 7, 2016 LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT Geometry of LS We ca thik of y ad the colums of X as members of the -dimesioal Euclidea space R Oe ca
More informationElementary Statistics
Elemetary Statistics M. Ghamsary, Ph.D. Sprig 004 Chap 0 Descriptive Statistics Raw Data: Whe data are collected i origial form, they are called raw data. The followig are the scores o the first test of
More informationECE 901 Lecture 12: Complexity Regularization and the Squared Loss
ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality
More informationCS322: Network Analysis. Problem Set 2 - Fall 2009
Due October 9 009 i class CS3: Network Aalysis Problem Set - Fall 009 If you have ay questios regardig the problems set, sed a email to the course assistats: simlac@staford.edu ad peleato@staford.edu.
More informationInfinite Sequences and Series
Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet
More information