Statistics and Data Analysis in MATLAB Kendrick Kay, February 28, Lecture 4: Model fitting


Statistics and Data Analysis in MATLAB
Kendrick Kay, kendrick.kay@wustl.edu
February 28, 2014
Lecture 4: Model fitting

1. The basics

- Suppose that we have a set of data and suppose that we have selected the type of model to apply to the data (see Lecture 3). Our task now is to fit the model to the data, that is, to adjust the free parameters of the model such that the model describes the data as well as possible.

- Before proceeding, we must first establish a metric for the goodness-of-fit of the model. A common metric is squared error, that is, the sum of the squares of the differences between the data and the model fit:

$$\text{squared error} = \sum_{i=1}^{n} (d_i - m_i)^2$$

where $n$ is the number of data points, $d_i$ is the ith data point, and $m_i$ is the model fit for the ith data point. We will see the motivation for squared error later in this lecture.

- Given the metric of squared error, our job is to determine the specific set of parameter values that minimize squared error. The solution to this problem depends on the type of model that we are trying to fit.

2. The case of linear models

- Model fitting in the case of linear (and linearized) models can be given a nice geometric interpretation. We can view the data as a single point in $n$-dimensional space and we can view the regressors as vectors in this space emanating from the origin. Potential model fits are given by points in the subspace spanned by the vectors. (The subspace consists of all points that can be expressed as a weighted sum of the regressors.) The model fit that minimizes squared error is the point that lies closest in a Euclidean sense to the data. (This is because the Euclidean distance between two points is a simple monotonic transformation, namely the square root, of the sum of the squares of the differences in the two points' coordinates.) The residuals of the model fit can be viewed as a vector that starts at the model fit and ends at the data.

- With these geometric insights in mind, we can now derive the solution to our fitting problem.
Recall that linear models can be expressed as

$$\mathbf{y} = \mathbf{X}\mathbf{w} + \mathbf{n}$$

where $\mathbf{y}$ is a set of data points ($n \times 1$), $\mathbf{X}$ is a set of regressors ($n \times p$), $\mathbf{w}$ is a set of weights ($p \times 1$), and $\mathbf{n}$ is a set of residuals ($n \times 1$). Let $\hat{\mathbf{w}}_{OLS}$ denote the set of weights that provide the optimal model fit; these weights are called the ordinary least-squares (OLS) estimate. At the optimal model fit, the residuals must be orthogonal to each of the regressors. (If the residuals were correlated with a given regressor, then a better model fit could be obtained by moving in the direction of that regressor.) This orthogonality condition implies that the dot product between each regressor and the residuals must equal zero:

$$\mathbf{X}^T(\mathbf{y} - \mathbf{X}\hat{\mathbf{w}}_{OLS}) = \mathbf{0}$$

where $\mathbf{0}$ is a vector of zeros ($p \times 1$, one entry per regressor). Expanding and solving, we obtain:

$$\mathbf{X}^T\mathbf{y} - \mathbf{X}^T\mathbf{X}\hat{\mathbf{w}}_{OLS} = \mathbf{0}$$

and

$$\hat{\mathbf{w}}_{OLS} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$$

where $\mathbf{A}^{-1}$ indicates the matrix inverse of $\mathbf{A}$. Thus, we see that the set of weights that minimize squared error can be computed using an analytic expression involving simple matrix operations.

- Note that if there are more regressors than data points ($p > n$), the matrix $\mathbf{X}^T\mathbf{X}$ (the correlation matrix of the regressors) is not invertible, and there is no unique OLS solution. Intuitively, the idea is that if there are more regressors than data points, then there are infinitely many solutions, all of which achieve zero error (assuming the regressors span the full $n$-dimensional space).

3. The case of nonlinear models

- Given that nonlinear models encompass a broad diversity of models, there is little hope of writing down a single expression that will solve the fitting problem for an arbitrary nonlinear model.

- To fit nonlinear models, we cast the problem as a search problem in which we have a parameter space and a cost function, and our job is to search through the space to find the point that minimizes the cost function. Since we are using the cost function of squared error, we can think of our job as trying to find the minimum point on an error surface. If there is only one parameter, the error surface is a function defined on one dimension (i.e. a curvy line); if there are two parameters, the error surface is a function defined on two dimensions (i.e. a bumpy sheet); etc.

- To search through the parameter space, the usual approach is to use local, iterative optimization algorithms that start at some point in the space (the initial seed), look at the error surface in a small neighborhood around that point, move in some direction in an attempt to reduce the error, and then repeat this process until improvements are sufficiently small (e.g. until the improvement is less than some small number). There are a variety of optimization algorithms, and they basically vary with respect to how exactly they make use of first-order derivative information (gradients) and second-order derivative information (curvature). The Levenberg-Marquardt algorithm is a popular and effective algorithm and is implemented in the MATLAB Optimization Toolbox.
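The local, iterative search described above can be illustrated with a minimal sketch. It is written in Python rather than MATLAB, and it uses a deliberately simple step-halving search, not Levenberg-Marquardt. The one-parameter model `exp(-k * x)`, the data, the seed, and the names `model`, `squared_error`, and `local_search` are all hypothetical choices made for illustration.

```python
import math

def model(k, x):
    """Hypothetical one-parameter nonlinear model: exponential decay with rate k."""
    return [math.exp(-k * x_i) for x_i in x]

def squared_error(k, x, d):
    """Sum of squared differences between the data d and the model fit."""
    return sum((d_i - m_i) ** 2 for d_i, m_i in zip(d, model(k, x)))

def local_search(x, d, seed, step=0.5, min_step=1e-8):
    """Start at the seed, look at the two neighbors one step away, move if the
    error decreases, and otherwise shrink the neighborhood by halving the step."""
    k = seed
    err = squared_error(k, x, d)
    while step > min_step:
        candidates = [k - step, k + step]
        errs = [squared_error(c, x, d) for c in candidates]
        best_err = min(errs)
        if best_err < err:
            k = candidates[errs.index(best_err)]
            err = best_err
        else:
            step /= 2
    return k, err

# Noise-free toy data generated with k = 0.7, so the error surface has its minimum there.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
d = [math.exp(-0.7 * x_i) for x_i in x]

k_hat, final_err = local_search(x, d, seed=0.1)
print(k_hat, final_err)
```

With a single parameter the error surface is the "curvy line" mentioned above, and for this particular model it has a single minimum, so the search is expected to end near the generating value of `k`. On a bumpier surface the same procedure could stop at a local minimum.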
- As a simple example of an optimization method, let us consider how to perform gradient descent for a linear model (reusing the earlier example $\mathbf{y} = \mathbf{X}\mathbf{w} + \mathbf{n}$). Assuming the error metric is squared error, the derivative of the error surface with respect to the jth weight is

$$\frac{\partial\,\text{error}}{\partial w_j} = \frac{\partial}{\partial w_j}\left((\mathbf{y} - \mathbf{X}\mathbf{w})^T(\mathbf{y} - \mathbf{X}\mathbf{w})\right) = \frac{\partial}{\partial w_j}\sum_{i=1}^{n}(y_i - X_i\mathbf{w})^2 = \sum_{i=1}^{n} 2\,(y_i - X_i\mathbf{w})(-X_{i,j})$$

where $w_j$ is the jth element of $\mathbf{w}$, $y_i$ is the ith element of $\mathbf{y}$, $X_i$ is the ith row of $\mathbf{X}$, and $X_{i,j}$ is the (i,j)th element of $\mathbf{X}$. Collecting the derivatives for different weights into a vector, we obtain the gradient of the error surface ($p \times 1$):

$$\nabla\,\text{error} = -2\,\mathbf{X}^T(\mathbf{y} - \mathbf{X}\mathbf{w})$$

So, what this tells us is that given a set of weights $\mathbf{w}$, we know how to compute the amount by which the error will change if we were to tweak any of the weights (e.g., if we were to increment the first weight by 0.01, then the error will change by approximately 0.01 times the first element of the gradient vector). This suggests a simple algorithm: first, set all the weights to some initial value (e.g. all zeros); then, compute the gradient and update the weights by subtracting some small fraction of the gradient; and repeat the weight-updating process until the error stops decreasing. We will see that this algorithm can be implemented in MATLAB quite easily.
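The gradient descent procedure just described can be sketched as follows. This is a Python sketch rather than the MATLAB implementation the lecture refers to; the toy data, the step size of 0.01, the stopping tolerance, and all function names are assumptions made for illustration. As a check, the result is compared with the closed-form OLS estimate from section 2, written out by hand for the two-regressor case.

```python
def matvec(X, w):
    """Return the matrix-vector product Xw as a list."""
    return [sum(x_ij * w_j for x_ij, w_j in zip(row, w)) for row in X]

def squared_error(X, y, w):
    """Sum of squared residuals, (y - Xw)^T (y - Xw)."""
    return sum((y_i - f_i) ** 2 for y_i, f_i in zip(y, matvec(X, w)))

def gradient(X, y, w):
    """Gradient of squared error with respect to w: -2 X^T (y - Xw)."""
    residuals = [y_i - f_i for y_i, f_i in zip(y, matvec(X, w))]
    return [-2.0 * sum(row[j] * r for row, r in zip(X, residuals))
            for j in range(len(w))]

def gradient_descent(X, y, step=0.01, tol=1e-12, max_iter=100000):
    """Start at all zeros and subtract a small fraction of the gradient
    until the error stops decreasing by more than tol."""
    w = [0.0] * len(X[0])
    err = squared_error(X, y, w)
    for _ in range(max_iter):
        g = gradient(X, y, w)
        w_new = [w_j - step * g_j for w_j, g_j in zip(w, g)]
        err_new = squared_error(X, y, w_new)
        if err - err_new < tol:
            break
        w, err = w_new, err_new
    return w

def ols_two_regressors(X, y):
    """Closed-form (X^T X)^{-1} X^T y, written out for exactly two regressors."""
    a = sum(r[0] * r[0] for r in X)
    b = sum(r[0] * r[1] for r in X)
    c = sum(r[1] * r[1] for r in X)
    u = sum(r[0] * y_i for r, y_i in zip(X, y))
    v = sum(r[1] * y_i for r, y_i in zip(X, y))
    det = a * c - b * b
    return [(c * u - b * v) / det, (a * v - b * u) / det]

# Hypothetical data: a constant regressor and a linear regressor.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
X = [[1.0, x_i] for x_i in x]
y = [2.1, 4.9, 8.2, 10.8, 14.1]

w_gd = gradient_descent(X, y)
w_ols = ols_two_regressors(X, y)
print(w_gd, w_ols)
```

The step size matters: if it is too large for the data at hand, the error increases on the first update and the loop above simply stops, so in practice the fraction of the gradient that is subtracted has to be tuned to the problem.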

- A potential problem with local, iterative optimization is local minima, that is, locations on the error surface that are the minimum within some local range but which are not the absolute minimum that can be achieved (which is known as the global minimum).

- For linear models, the error surface is shaped like a bowl and there are no local minima. The lack of local minima makes parameter search easy: as long as an algorithm can adjust parameters to reduce the error, we will eventually get to the optimal solution. Geometrically, there is a nice intuition for why linear models have no local minima: assuming there are at least as many data points as regressors, there is exactly one point in the subspace spanned by the regressors that is closest to the data, and as parameter values deviate from this point, the distance from the data grows monotonically.

- For nonlinear models, the error surface may be "bumpy" with local minima. Because of local minima, the solution found by an algorithm may not be the best possible solution.

- The severity of the problem of local minima depends on the nature of the data, the nature of the model, and the specific optimization algorithm used, so it is difficult to make any general statements. Strategies for dealing with local minima include (1) starting with different initial seeds and selecting the best resulting model, (2) exhaustively sampling the parameter space, and (3) using alternative optimization techniques such as genetic algorithms.

4. The motivation for squared error

- Given a probability distribution, we can quantify the probability, or likelihood, of a set of data. Moreover, for different probability distributions, we can ask which probability distribution maximizes the likelihood of the data. This procedure is known as maximum likelihood estimation and provides a means for choosing from amongst different models. For example, given a set of data, out of all of the possible Gaussian distributions, the one that maximizes the likelihood of the data has a mean and standard deviation equal to the mean and standard deviation of the data.
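The Gaussian example above can be checked numerically with a short Python sketch. The data values and the function name `gaussian_log_likelihood` are hypothetical. One detail worth stating: the standard deviation that maximizes the likelihood is the version that divides by n (the population standard deviation), not the n - 1 version.

```python
import math
import statistics

def gaussian_log_likelihood(data, mu, sigma):
    """Log-likelihood of i.i.d. data under a Gaussian with mean mu and std sigma."""
    return sum(
        -math.log(sigma * math.sqrt(2 * math.pi)) - (d - mu) ** 2 / (2 * sigma ** 2)
        for d in data
    )

data = [4.1, 5.3, 4.8, 6.0, 5.2, 4.6]  # hypothetical measurements

mu_hat = statistics.mean(data)
sigma_hat = statistics.pstdev(data)  # divides by n, matching the ML estimate

best = gaussian_log_likelihood(data, mu_hat, sigma_hat)

# Log-likelihoods for Gaussians whose parameters are nudged away from the estimates.
offsets = (-0.2, 0.0, 0.2)
perturbed = [
    gaussian_log_likelihood(data, mu_hat + dm, sigma_hat + ds)
    for dm in offsets
    for ds in offsets
    if (dm, ds) != (0.0, 0.0)
]
print(best, max(perturbed))
```

Every perturbed Gaussian should score a lower log-likelihood than the one built from the data's own mean and standard deviation.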
- Let us apply maximum likelihood estimation to the case of regression models. Suppose that the true underlying probability distribution for each data point is a Gaussian whose mean is equal to the model prediction and whose standard deviation is some fixed value. In other words, suppose that the data are generated by the model plus independent, identically distributed (i.i.d.) Gaussian noise. Then, the likelihood of a given set of data can be written as follows:

$$\text{likelihood}(d \mid m) = \prod_{i=1}^{n} p(d_i \mid m) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(d_i - m_i)^2}{2\sigma^2}}$$

where $d$ represents the data, $m$ represents the model, $n$ is the number of data points, $d_i$ is the ith data point, $\sigma$ is the standard deviation of the Gaussian noise, and $m_i$ is the model prediction for the ith data point. The model estimate, $\hat{m}$, that we want is the one that maximizes the likelihood of the data:

$$\hat{m} = \arg\max_m \left(\text{likelihood}(d \mid m)\right)$$

Because the logarithm is a monotonic function, the desired model estimate is also given by

$$\hat{m} = \arg\max_m \left(\text{log-likelihood}(d \mid m)\right)$$

which in turn is equivalent to

$$\hat{m} = \arg\min_m \left(\text{negative-log-likelihood}(d \mid m)\right)$$

Now, let's substitute in the likelihood expression:

$$\hat{m} = \arg\min_m \left(-\log \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(d_i - m_i)^2}{2\sigma^2}}\right)$$

Simplifying, we obtain

$$\hat{m} = \arg\min_m \sum_{i=1}^{n} \left(-\log\frac{1}{\sigma\sqrt{2\pi}} + \frac{(d_i - m_i)^2}{2\sigma^2}\right)$$

We can drop the first term since it has no dependence on $m$:

$$\hat{m} = \arg\min_m \sum_{i=1}^{n} \frac{(d_i - m_i)^2}{2\sigma^2}$$

We can drop the denominator since it has no dependence on $m$:

$$\hat{m} = \arg\min_m \sum_{i=1}^{n} (d_i - m_i)^2$$

Finally, we can rewrite this more simply:

$$\hat{m} = \arg\min_m \left(\text{squared error}\right)$$

Thus, we see that to maximize the likelihood of the data, we should choose the model that minimizes squared error. This shows that there is good motivation to use squared error, namely, that assuming i.i.d. Gaussian noise, the maximum likelihood estimate of the parameters of a model is the set of parameters that minimizes squared error.

5. An alternative error metric: absolute error

- The assumption that the noise is Gaussian may be inaccurate for two reasons. One, the measurement noise may be non-Gaussian. Two, the model being applied may not be the correct model (e.g., fitting a linear model when the true effect is quadratic). This has the consequence that unmodeled effects may be subsumed in the noise and may cause the noise to be non-Gaussian. Thus, although squared error has a nice motivation, we might desire to use a different error metric.

- One alternative metric is the sum of the absolute values of the differences between the data and the model fit:

$$\text{absolute error} = \sum_{i=1}^{n} |d_i - m_i|$$

where $n$ is the number of data points, $d_i$ is the ith data point, and $m_i$ is the model fit for the ith data point. This choice of error metric is useful as it reduces the impact of outliers (which can be roughly defined as unreasonably extreme data points). For a theoretical motivation, it can be shown that the parameter estimate that minimizes absolute error is the maximum likelihood estimate under the assumption of Laplacian noise. (The Laplace distribution is just like the Gaussian distribution except that an exponential is taken of the absolute difference from the mean instead of the squared difference from the mean.)
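The Laplacian claim can be sketched with the same steps used in the Gaussian derivation of section 4. The derivation below is not part of the original notes; it assumes i.i.d. Laplace noise with a fixed scale parameter $b$ (a symbol introduced here for illustration) centered on the model prediction.

```latex
\begin{align*}
p(d_i \mid m) &= \frac{1}{2b}\, e^{-\frac{|d_i - m_i|}{b}} \\
\hat{m} &= \arg\min_m \left( -\log \prod_{i=1}^{n} \frac{1}{2b}\, e^{-\frac{|d_i - m_i|}{b}} \right) \\
        &= \arg\min_m \left( n \log(2b) + \frac{1}{b} \sum_{i=1}^{n} |d_i - m_i| \right) \\
        &= \arg\min_m \sum_{i=1}^{n} |d_i - m_i| \\
        &= \arg\min_m \left( \text{absolute error} \right)
\end{align*}
```

As in the Gaussian case, the term $n \log(2b)$ and the factor $1/b$ are dropped because neither depends on $m$.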
- The difference between squared error and absolute error can be framed in terms of the difference between the mean and the median: the mean of a set of data points is the number that minimizes squared error (that is, the sum of the squares of the differences between the number and each of the data points), whereas the median of a set of data points is the number that minimizes absolute error (that is, the sum of the absolute differences between the number and each of the data points). Thus, we can view absolute error as an error metric that is potentially more robust than squared error. Note, however, there are some disadvantages of absolute error: first, there is no analytic solution for linear models, and second, error surfaces quantifying absolute error may be less well-behaved (e.g. more local minima) than error surfaces quantifying squared error.
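The mean/median relationship can be checked with a brute-force search over candidate values. This is a Python sketch; the data set (with 100.0 playing the role of an outlier), the grid of candidates, and the function names are hypothetical.

```python
import statistics

def squared_error(c, data):
    """Sum of squared differences between the number c and each data point."""
    return sum((d - c) ** 2 for d in data)

def absolute_error(c, data):
    """Sum of absolute differences between the number c and each data point."""
    return sum(abs(d - c) for d in data)

data = [2.0, 3.0, 4.0, 5.0, 100.0]  # hypothetical data; 100.0 acts as an outlier

# Candidate summary values from 0.0 to 100.0 in steps of 0.1.
candidates = [k / 10 for k in range(0, 1001)]

best_sq = min(candidates, key=lambda c: squared_error(c, data))
best_abs = min(candidates, key=lambda c: absolute_error(c, data))

print(best_sq, statistics.mean(data))     # minimizer of squared error vs. the mean
print(best_abs, statistics.median(data))  # minimizer of absolute error vs. the median
```

The outlier drags the squared-error minimizer (the mean) far above the bulk of the data, while the absolute-error minimizer (the median) stays among the typical values, which is the sense in which absolute error is more robust.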

Lecture 19. Curve fitting I. 1 Introduction. 2 Fitting a constant to measured data

Lecture 19. Curve fitting I. 1 Introduction. 2 Fitting a constant to measured data Lecture 9 Curve fittig I Itroductio Suppose we are preseted with eight poits of easured data (x i, y j ). As show i Fig. o the left, we could represet the uderlyig fuctio of which these data are saples

More information

A string of not-so-obvious statements about correlation in the data. (This refers to the mechanical calculation of correlation in the data.

A string of not-so-obvious statements about correlation in the data. (This refers to the mechanical calculation of correlation in the data. STAT-UB.003 NOTES for Wedesday 0.MAY.0 We will use the file JulieApartet.tw. We ll give the regressio of Price o SqFt, show residual versus fitted plot, save residuals ad fitted. Give plot of (Resid, Price,

More information

ECE 901 Lecture 4: Estimation of Lipschitz smooth functions

ECE 901 Lecture 4: Estimation of Lipschitz smooth functions ECE 9 Lecture 4: Estiatio of Lipschitz sooth fuctios R. Nowak 5/7/29 Cosider the followig settig. Let Y f (X) + W, where X is a rado variable (r.v.) o X [, ], W is a r.v. o Y R, idepedet of X ad satisfyig

More information

Lecture 11. Solution of Nonlinear Equations - III

Lecture 11. Solution of Nonlinear Equations - III Eiciecy o a ethod Lecture Solutio o Noliear Equatios - III The eiciecy ide o a iterative ethod is deied by / E r r: rate o covergece o the ethod : total uber o uctios ad derivative evaluatios at each step

More information

Chapter 2. Asymptotic Notation

Chapter 2. Asymptotic Notation Asyptotic Notatio 3 Chapter Asyptotic Notatio Goal : To siplify the aalysis of ruig tie by gettig rid of details which ay be affected by specific ipleetatio ad hardware. [1] The Big Oh (O-Notatio) : It

More information

1 The Primal and Dual of an Optimization Problem

1 The Primal and Dual of an Optimization Problem CS 189 Itroductio to Machie Learig Fall 2017 Note 18 Previously, i our ivestigatio of SVMs, we forulated a costraied optiizatio proble that we ca solve to fid the optial paraeters for our hyperplae decisio

More information

Contents Two Sample t Tests Two Sample t Tests

Contents Two Sample t Tests Two Sample t Tests Cotets 3.5.3 Two Saple t Tests................................... 3.5.3 Two Saple t Tests Setup: Two Saples We ow focus o a sceario where we have two idepedet saples fro possibly differet populatios. Our

More information

Linear Regression Demystified

Linear Regression Demystified Liear Regressio Demystified Liear regressio is a importat subject i statistics. I elemetary statistics courses, formulae related to liear regressio are ofte stated without derivatio. This ote iteds to

More information

Lecture Outline. 2 Separating Hyperplanes. 3 Banach Mazur Distance An Algorithmist s Toolkit October 22, 2009

Lecture Outline. 2 Separating Hyperplanes. 3 Banach Mazur Distance An Algorithmist s Toolkit October 22, 2009 18.409 A Algorithist s Toolkit October, 009 Lecture 1 Lecturer: Joatha Keler Scribes: Alex Levi (009) 1 Outlie Today we ll go over soe of the details fro last class ad ake precise ay details that were

More information

Mixture models (cont d)

Mixture models (cont d) 6.867 Machie learig, lecture 5 (Jaakkola) Lecture topics: Differet types of ixture odels (cot d) Estiatig ixtures: the EM algorith Mixture odels (cot d) Basic ixture odel Mixture odels try to capture ad

More information

) is a square matrix with the property that for any m n matrix A, the product AI equals A. The identity matrix has a ii

) is a square matrix with the property that for any m n matrix A, the product AI equals A. The identity matrix has a ii square atrix is oe that has the sae uber of rows as colus; that is, a atrix. he idetity atrix (deoted by I, I, or [] I ) is a square atrix with the property that for ay atrix, the product I equals. he

More information

We have also learned that, thanks to the Central Limit Theorem and the Law of Large Numbers,

We have also learned that, thanks to the Central Limit Theorem and the Law of Large Numbers, Cofidece Itervals III What we kow so far: We have see how to set cofidece itervals for the ea, or expected value, of a oral probability distributio, both whe the variace is kow (usig the stadard oral,

More information

On Modeling On Minimum Description Length Modeling. M-closed

On Modeling On Minimum Description Length Modeling. M-closed O Modelig O Miiu Descriptio Legth Modelig M M-closed M-ope Do you believe that the data geeratig echais really is i your odel class M? 7 73 Miiu Descriptio Legth Priciple o-m-closed predictive iferece

More information

X. Perturbation Theory

X. Perturbation Theory X. Perturbatio Theory I perturbatio theory, oe deals with a ailtoia that is coposed Ĥ that is typically exactly solvable of two pieces: a referece part ad a perturbatio ( Ĥ ) that is assued to be sall.

More information

19.1 The dictionary problem

19.1 The dictionary problem CS125 Lecture 19 Fall 2016 19.1 The dictioary proble Cosider the followig data structural proble, usually called the dictioary proble. We have a set of ites. Each ite is a (key, value pair. Keys are i

More information

5.6 Binomial Multi-section Matching Transformer

5.6 Binomial Multi-section Matching Transformer 4/14/21 5_6 Bioial Multisectio Matchig Trasforers 1/1 5.6 Bioial Multi-sectio Matchig Trasforer Readig Assiget: pp. 246-25 Oe way to axiize badwidth is to costruct a ultisectio Γ f that is axially flat.

More information

(s)h(s) = K( s + 8 ) = 5 and one finite zero is located at z 1

(s)h(s) = K( s + 8 ) = 5 and one finite zero is located at z 1 ROOT LOCUS TECHNIQUE 93 should be desiged differetly to eet differet specificatios depedig o its area of applicatio. We have observed i Sectio 6.4 of Chapter 6, how the variatio of a sigle paraeter like

More information

A PROBABILITY PROBLEM

A PROBABILITY PROBLEM A PROBABILITY PROBLEM A big superarket chai has the followig policy: For every Euros you sped per buy, you ear oe poit (suppose, e.g., that = 3; i this case, if you sped 8.45 Euros, you get two poits,

More information

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y

More information

Name Period ALGEBRA II Chapter 1B and 2A Notes Solving Inequalities and Absolute Value / Numbers and Functions

Name Period ALGEBRA II Chapter 1B and 2A Notes Solving Inequalities and Absolute Value / Numbers and Functions Nae Period ALGEBRA II Chapter B ad A Notes Solvig Iequalities ad Absolute Value / Nubers ad Fuctios SECTION.6 Itroductio to Solvig Equatios Objectives: Write ad solve a liear equatio i oe variable. Solve

More information

Define a Markov chain on {1,..., 6} with transition probability matrix P =

Define a Markov chain on {1,..., 6} with transition probability matrix P = Pla Group Work 0. The title says it all Next Tie: MCMC ad Geeral-state Markov Chais Midter Exa: Tuesday 8 March i class Hoework 4 due Thursday Uless otherwise oted, let X be a irreducible, aperiodic Markov

More information

Note that the argument inside the second square root is always positive since R L > Z 0. The series reactance can be found as

Note that the argument inside the second square root is always positive since R L > Z 0. The series reactance can be found as Ipedace Matchig Ipedace Matchig Itroductio Ipedace atchig is the process to atch the load to a trasissio lie by a atchig etwork, as depicted i Fig Recall that the reflectios are eliiated uder the atched

More information

42 Dependence and Bases

42 Dependence and Bases 42 Depedece ad Bases The spa s(a) of a subset A i vector space V is a subspace of V. This spa ay be the whole vector space V (we say the A spas V). I this paragraph we study subsets A of V which spa V

More information

Factor Analysis. Lecture 10: Factor Analysis and Principal Component Analysis. Sam Roweis

Factor Analysis. Lecture 10: Factor Analysis and Principal Component Analysis. Sam Roweis Lecture 10: Factor Aalysis ad Pricipal Compoet Aalysis Sam Roweis February 9, 2004 Whe we assume that the subspace is liear ad that the uderlyig latet variable has a Gaussia distributio we get a model

More information

Nonlinear regression

Nonlinear regression oliear regressio How to aalyse data? How to aalyse data? Plot! How to aalyse data? Plot! Huma brai is oe the most powerfull computatioall tools Works differetly tha a computer What if data have o liear

More information

AVERAGE MARKS SCALING

AVERAGE MARKS SCALING TERTIARY INSTITUTIONS SERVICE CENTRE Level 1, 100 Royal Street East Perth, Wester Australia 6004 Telephoe (08) 9318 8000 Facsiile (08) 95 7050 http://wwwtisceduau/ 1 Itroductio AVERAGE MARKS SCALING I

More information

Optimal Estimator for a Sample Set with Response Error. Ed Stanek

Optimal Estimator for a Sample Set with Response Error. Ed Stanek Optial Estiator for a Saple Set wit Respose Error Ed Staek Itroductio We develop a optial estiator siilar to te FP estiator wit respose error tat was cosidered i c08ed63doc Te first 6 pages of tis docuet

More information

Probabilistic Analysis of Rectilinear Steiner Trees

Probabilistic Analysis of Rectilinear Steiner Trees Probabilistic Aalysis of Rectiliear Steier Trees Chuhog Che Departet of Electrical ad Coputer Egieerig Uiversity of Widsor, Otario, Caada, N9B 3P4 E-ail: cche@uwidsor.ca Abstract Steier tree is a fudaetal

More information

Binomial transform of products

Binomial transform of products Jauary 02 207 Bioial trasfor of products Khristo N Boyadzhiev Departet of Matheatics ad Statistics Ohio Norther Uiversity Ada OH 4580 USA -boyadzhiev@ouedu Abstract Give the bioial trasfors { b } ad {

More information

The Binomial Multi-Section Transformer

The Binomial Multi-Section Transformer 4/15/2010 The Bioial Multisectio Matchig Trasforer preset.doc 1/24 The Bioial Multi-Sectio Trasforer Recall that a ulti-sectio atchig etwork ca be described usig the theory of sall reflectios as: where:

More information

September 2012 C1 Note. C1 Notes (Edexcel) Copyright - For AS, A2 notes and IGCSE / GCSE worksheets 1

September 2012 C1 Note. C1 Notes (Edexcel) Copyright   - For AS, A2 notes and IGCSE / GCSE worksheets 1 September 0 s (Edecel) Copyright www.pgmaths.co.uk - For AS, A otes ad IGCSE / GCSE worksheets September 0 Copyright www.pgmaths.co.uk - For AS, A otes ad IGCSE / GCSE worksheets September 0 Copyright

More information

Statistics for Applications Fall Problem Set 7

Statistics for Applications Fall Problem Set 7 18.650. Statistics for Applicatios Fall 016. Proble Set 7 Due Friday, Oct. 8 at 1 oo Proble 1 QQ-plots Recall that the Laplace distributio with paraeter λ > 0 is the cotiuous probaλ bility easure with

More information

5.6 Binomial Multi-section Matching Transformer

5.6 Binomial Multi-section Matching Transformer 4/14/2010 5_6 Bioial Multisectio Matchig Trasforers 1/1 5.6 Bioial Multi-sectio Matchig Trasforer Readig Assiget: pp. 246-250 Oe way to axiize badwidth is to costruct a ultisectio Γ f that is axially flat.

More information

Formula List for College Algebra Sullivan 10 th ed. DO NOT WRITE ON THIS COPY.

Formula List for College Algebra Sullivan 10 th ed. DO NOT WRITE ON THIS COPY. Forula List for College Algera Sulliva 10 th ed. DO NOT WRITE ON THIS COPY. Itercepts: Lear how to fid the x ad y itercepts. Syetry: Lear how test for syetry with respect to the x-axis, y-axis ad origi.

More information

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would

More information

Integrals of Functions of Several Variables

Integrals of Functions of Several Variables Itegrals of Fuctios of Several Variables We ofte resort to itegratios i order to deterie the exact value I of soe quatity which we are uable to evaluate by perforig a fiite uber of additio or ultiplicatio

More information

Problem Cosider the curve give parametrically as x = si t ad y = + cos t for» t» ß: (a) Describe the path this traverses: Where does it start (whe t =

Problem Cosider the curve give parametrically as x = si t ad y = + cos t for» t» ß: (a) Describe the path this traverses: Where does it start (whe t = Mathematics Summer Wilso Fial Exam August 8, ANSWERS Problem 1 (a) Fid the solutio to y +x y = e x x that satisfies y() = 5 : This is already i the form we used for a first order liear differetial equatio,

More information

6.867 Machine learning, lecture 13 (Jaakkola)

6.867 Machine learning, lecture 13 (Jaakkola) Lecture topics: Boostig, argi, ad gradiet descet copleity of classifiers, geeralizatio Boostig Last tie we arrived at a boostig algorith for sequetially creatig a eseble of base classifiers. Our base classifiers

More information

PARTIAL DIFFERENTIAL EQUATIONS SEPARATION OF VARIABLES

PARTIAL DIFFERENTIAL EQUATIONS SEPARATION OF VARIABLES Diola Bagayoko (0 PARTAL DFFERENTAL EQUATONS SEPARATON OF ARABLES. troductio As discussed i previous lectures, partial differetial equatios arise whe the depedet variale, i.e., the fuctio, varies with

More information

(A sequence also can be thought of as the list of function values attained for a function f :ℵ X, where f (n) = x n for n 1.) x 1 x N +k x N +4 x 3

(A sequence also can be thought of as the list of function values attained for a function f :ℵ X, where f (n) = x n for n 1.) x 1 x N +k x N +4 x 3 MATH 337 Sequeces Dr. Neal, WKU Let X be a metric space with distace fuctio d. We shall defie the geeral cocept of sequece ad limit i a metric space, the apply the results i particular to some special

More information

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would

More information

Algebra of Least Squares

Algebra of Least Squares October 19, 2018 Algebra of Least Squares Geometry of Least Squares Recall that out data is like a table [Y X] where Y collects observatios o the depedet variable Y ad X collects observatios o the k-dimesioal

More information

Supplementary Material

Supplementary Material Suppleetary Material Wezhuo Ya a0096049@us.edu.s Departet of Mechaical Eieeri, Natioal Uiversity of Siapore, Siapore 117576 Hua Xu pexuh@us.edu.s Departet of Mechaical Eieeri, Natioal Uiversity of Siapore,

More information

Axis Aligned Ellipsoid

Axis Aligned Ellipsoid Machie Learig for Data Sciece CS 4786) Lecture 6,7 & 8: Ellipsoidal Clusterig, Gaussia Mixture Models ad Geeral Mixture Models The text i black outlies high level ideas. The text i blue provides simple

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Statistics 511 Additional Materials

Statistics 511 Additional Materials Cofidece Itervals o mu Statistics 511 Additioal Materials This topic officially moves us from probability to statistics. We begi to discuss makig ifereces about the populatio. Oe way to differetiate probability

More information

Machine Learning for Data Science (CS 4786)

Machine Learning for Data Science (CS 4786) Machie Learig for Data Sciece CS 4786) Lecture & 3: Pricipal Compoet Aalysis The text i black outlies high level ideas. The text i blue provides simple mathematical details to derive or get to the algorithm

More information

Economics 241B Relation to Method of Moments and Maximum Likelihood OLSE as a Maximum Likelihood Estimator

Economics 241B Relation to Method of Moments and Maximum Likelihood OLSE as a Maximum Likelihood Estimator Ecoomics 24B Relatio to Method of Momets ad Maximum Likelihood OLSE as a Maximum Likelihood Estimator Uder Assumptio 5 we have speci ed the distributio of the error, so we ca estimate the model parameters

More information

Geometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT

Geometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT OCTOBER 7, 2016 LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT Geometry of LS We ca thik of y ad the colums of X as members of the -dimesioal Euclidea space R Oe ca

More information

Automated Proofs for Some Stirling Number Identities

Automated Proofs for Some Stirling Number Identities Autoated Proofs for Soe Stirlig Nuber Idetities Mauel Kauers ad Carste Scheider Research Istitute for Sybolic Coputatio Johaes Kepler Uiversity Altebergerstraße 69 A4040 Liz, Austria Subitted: Sep 1, 2007;

More information

18.S34 (FALL, 2007) GREATEST INTEGER PROBLEMS. n + n + 1 = 4n + 2.

18.S34 (FALL, 2007) GREATEST INTEGER PROBLEMS. n + n + 1 = 4n + 2. 18.S34 (FALL, 007) GREATEST INTEGER PROBLEMS Note: We use the otatio x for the greatest iteger x, eve if the origial source used the older otatio [x]. 1. (48P) If is a positive iteger, prove that + + 1

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Aalysis ad Statistical Methods Statistics 651 http://www.stat.tau.edu/~suhasii/teachig.htl Suhasii Subba Rao Exaple The itroge cotet of three differet clover plats is give below. 3DOK1 3DOK5 3DOK7

More information

Introduction to Machine Learning DIS10

Introduction to Machine Learning DIS10 CS 189 Fall 017 Itroductio to Machie Learig DIS10 1 Fu with Lagrage Multipliers (a) Miimize the fuctio such that f (x,y) = x + y x + y = 3. Solutio: The Lagragia is: L(x,y,λ) = x + y + λ(x + y 3) Takig

More information

Orthogonal transformations
