SEPARABLE LEAST SQUARES, VARIABLE PROJECTION, AND THE GAUSS-NEWTON ALGORITHM


M. R. OSBORNE

For Gene, who introduced me to this problem more than thirty years ago.

Mathematical Sciences Institute, Australian National University, ACT 0200, Australia

Key words. nonlinear least squares, scoring, Newton's method, expected Hessian, Kaufman's modification, rate of convergence, random errors, law of large numbers, consistency, large data sets, maximum likelihood

AMS subject classifications. 65K99

Abstract. A regression problem is separable if the model can be represented as a linear combination of functions which have a nonlinear parametric dependence. The Gauss-Newton algorithm is a method for minimizing the residual sum of squares in such problems. It is known to be effective both when residuals are small, and when measurement errors are additive and the data set is large. The large data set result, that the iteration asymptotes to a second order rate as the data set size becomes unbounded, is sketched here. Variable projection is a technique introduced by Golub and Pereyra for reducing the separable estimation problem to one of minimizing a sum of squares in the nonlinear parameters only. The application of Gauss-Newton to minimize this sum of squares (the RGN algorithm) is known to be effective in small residual problems. The main result presented is that the RGN algorithm shares the good convergence rate behaviour of the Gauss-Newton algorithm on large data sets even though the errors are no longer additive. A modification of the RGN algorithm due to Kaufman, which aims to reduce its computational cost, is shown to produce iterates which are almost identical to those of the Gauss-Newton algorithm on the original problem. Aspects of the question of which algorithm is preferable are discussed briefly, and an example is used to illustrate the importance of the large data set behaviour.

1. Introduction. The Gauss-Newton algorithm is a modification of Newton's method for minimization developed for the particular case when the objective function can be written as a sum of squares. It has a cost advantage in that it avoids the calculation of second derivative terms in estimating the Hessian. Other advantages possessed by the modified algorithm are that its Hessian estimate is generically positive definite, and that it actually has better transformation invariance properties than those possessed by the original algorithm. It has the disadvantage that it has a generic first order rate of convergence. This can make the method unsuitable except in two important cases:

1. The case of small residuals. This occurs when the individual terms in the sum of squares can be made small simultaneously, so that the associated nonlinear system is consistent or nearly so.

2. The case of large data sets. An important application of the Gauss-Newton algorithm is to parameter estimation problems in data analysis. Nonlinear least squares problems occur in maximizing likelihoods based on the normal distribution. Here Gauss-Newton is a special case of the Fisher scoring algorithm [6]. In appropriate circumstances it asymptotes to a second order convergence rate as the number n of independent observations in the data set becomes unbounded.

The large data set problem is emphasised here. This seeks to estimate the true parameter vector θ* ∈ R^p by solving the optimization problem

    min_θ F_n(θ, ε),    (1.1)

where

    F_n(θ, ε) = (1/2n) ||f(θ, ε)||²,    (1.2)

f : R^p → R^n is a vector of smooth enough functions f_i(θ, ε), i = 1, 2, ..., n, ∇f has full column rank p in the region of parameter space of interest, and ε ∈ R^n ~ N(0, σ²I) plays the role of observational error. The norm is assumed to be the Euclidean vector norm unless otherwise specified. It is assumed that the measurement process that generated the data set can be conceptualised for n arbitrarily large, and that the estimation problem is consistent in the sense that there exists a sequence {θ_n} of local minimisers of (1.1) such that

    θ_n → θ*  a.s.,  n → ∞.

Here the mode of convergence is almost sure convergence. A good reference on asymptotic methods in statistics is [12].

Remark 1.1. A key point is that the errors are assumed to enter the model additively. That is, the f_i, i = 1, 2, ..., n, have the functional form

    f_i(θ, ε) = y_i − μ_i(θ),    (1.3)

where, corresponding to the case of observations made on a signal in the presence of noise,

    y_i = μ_i(θ*) + ε_i.    (1.4)

Thus differentiation of f_i removes the random component. Also, F_n is directly proportional to the problem log likelihood, and the property of consistency becomes a consequence of the other assumptions.

In a number of special cases there is additional structure in f, so it becomes a legitimate question to ask if this can be used to advantage. A nonlinear regression model is called separable if the problem residuals b can be represented in the form

    b_i(α, θ, ε) = y_i − Σ_{j=1}^m φ_ij(θ) α_j,  i = 1, 2, ..., n.    (1.5)

Here the model has the form of a linear combination, expressed by α ∈ R^m, of nonlinear functions φ_ij(θ), θ ∈ R^p. The modified notation

    f_i(θ, ε) → b_i(α, θ, ε),  μ_i → Σ_{j=1}^m φ_ij(θ) α_j,

is used here to make this structure explicit. It is assumed that the problem functions are φ_ij(θ) = φ_j(t_i, θ), j = 1, 2, ..., m, where the t_i, i = 1, 2, ..., n, are sample points where observations on the underlying signal are made. There is no restriction in assuming t_i ∈ [0, 1]. One source of examples is provided by general solutions of the m-th order linear ordinary differential equation with fundamental solutions given by the φ_i(t, θ). In [1] a systematic procedure (variable projection) is introduced for reducing the estimation problem to a nonlinear least squares problem in the nonlinear parameters only. A recent survey of developments and applications of variable projection is [2]. To introduce the technique let Φ(θ) : R^m → R^n, n > m, be the matrix with components φ_ij. The rank assumption in the problem formulation now requires [∇_θ(Φα)  Φ] to have full column rank m + p.
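To make the separable structure (1.5) concrete, here is a minimal sketch (an illustration added to this transcription, not part of the paper) of the model matrix for exponential fitting, one of the example classes taken up in Section 5, where φ_j(t, θ) = e^{−θ_j t}; Python with numpy and the names used are assumptions of the sketch:

    import numpy as np

    def Phi_exp(theta, t):
        # Model matrix [phi_ij] = [phi_j(t_i, theta)] = [exp(-theta_j t_i)] of (1.5);
        # column j is the j-th nonlinear basis function evaluated at the sample points.
        return np.exp(-np.outer(t, theta))          # n x m

    # For fixed theta the fitted signal mu = Phi(theta) @ alpha is linear in alpha.
    t = np.linspace(0.0, 1.0, 100)                  # sample points t_i in [0, 1]
    mu = Phi_exp(np.array([1.0, 5.0]), t) @ np.array([2.0, -1.0])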

Also let P(θ) : R^n → R^n be the orthogonal projection matrix defined by

    P(θ) Φ(θ) = 0.    (1.6)

Here P has the explicit representation

    P = I − Φ (Φ^T Φ)^{-1} Φ^T.    (1.7)

Then

    F_n = (1/2n) { ||P y||² + ||(I − P) b||² }.    (1.8)

The first term on the right of this equation is independent of α, and the second can be reduced to zero by setting

    α = α̂(θ) = (Φ^T Φ)^{-1} Φ^T y.    (1.9)

Thus an equivalent formulation of (1.1) in the separable problem is

    min_θ (1/2n) ||P(θ) y||²,    (1.10)

which is a sum of squares in the nonlinear parameters only, so that, at least formally, the Gauss-Newton algorithm can be applied. However, now the random errors do not enter additively but are coupled with the nonlinear parameters in setting up the objective function.

The plan of the paper is as follows. The large data set rate of convergence analysis appropriate to the Gauss-Newton method in the case of additive errors is summarized in the next section. The third section shows why this analysis cannot immediately be extended to the RGN algorithm. Here the rather harder work needed to arrive at similar conclusions is summarised. Most implementations of the variable projection method use a modification due to Kaufman [4] which serves to reduce the amount of computation needed in the RGN algorithm. This modified algorithm also shows the favourable large data set rates despite being developed using an explicit small residual argument. However, it is actually closer to the additive Gauss-Newton method than is the full RGN algorithm. A brief discussion of which form of algorithm is appropriate in particular circumstances is given in the final section. This is complemented by an example of a classic data fitting problem which is used to illustrate the importance of the large sample convergence rate.
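The reduction (1.7)-(1.10) is straightforward to realise numerically. The following sketch (an illustration, not from the paper; the callable Phi is an assumed interface) evaluates the reduced objective and α̂(θ) together without forming P explicitly:

    import numpy as np

    def vp_objective(theta, y, Phi):
        # Phi: callable returning the n x m matrix Phi(theta), full column rank.
        A = Phi(theta)
        # Linear subproblem (1.9); lstsq uses an orthogonal factorization
        # rather than forming (Phi^T Phi)^{-1} explicitly.
        alpha_hat = np.linalg.lstsq(A, y, rcond=None)[0]
        r = y - A @ alpha_hat                       # r = P(theta) y
        return 0.5 * np.dot(r, r) / len(y), alpha_hat   # value of (1.10), alpha_hat

Minimizing the first return value over θ alone is the variable projection strategy; the linear parameters then follow from (1.9).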

2. Large data set convergence rate analysis. The basic iterative step in Newton's method for minimizing F_n defined in (1.1) is

    θ_{i+1} = θ_i − J_n(θ_i)^{-1} ∇F_n(θ_i)^T,    (2.1)

    J_n(θ) = ∇²F_n(θ).    (2.2)

In the case of additive errors the scoring/Gauss-Newton method replaces the Hessian with an approximation which is constructed as follows. The true Hessian is

    J_n = (1/n) {∇f}^T {∇f} + (1/n) Σ_{i=1}^n f_i ∇²f_i.    (2.3)

The stochastic component enters only through (1.4), so taking expectations gives

    E{J_n} = I_n + (1/n) Σ_{i=1}^n (μ_i(θ*) − μ_i(θ)) ∇²f_i,    (2.4)

where

    I_n = (1/n) {∇f}^T {∇f}.    (2.5)

The Gauss-Newton method replaces J_n with I_n in (2.1). The key point to notice is

    I_n(θ*) = E{J_n(θ*)}.    (2.6)

Several points can be made here:

1. It follows from the special form of (2.5) that the Gauss-Newton correction θ_{i+1} − θ_i solves the linear least squares problem

    min_t || y − μ − ∇μ t ||².    (2.7)

2. It is an important result, conditional on an appropriate experimental setup, that I_n is generically a bounded, positive definite matrix for all n large enough [6]. A similar result is sketched in Lemma 3.2.

3. The use of the form of the expectation which holds at the true parameter values is a characteristic simplification of the scoring algorithm and is available for more general likelihoods [7]. Here it leads to the same result as ignoring the small residual terms in (2.3).

The full-step Gauss-Newton method has the form of a fixed point iteration:

    θ_{i+1} = Q(θ_i),  Q(θ) = θ − I_n(θ)^{-1} ∇F_n(θ)^T.    (2.8)

The condition for θ* to be an attractive fixed point is

    ϖ(∇Q(θ*)) < 1,    (2.9)

where ϖ denotes the spectral radius of the variational matrix ∇Q. This quantity determines the first order convergence multiplier of the Gauss-Newton algorithm. The key to the good large sample behaviour is the result

    ϖ(∇Q(θ_n)) → 0  a.s.,  n → ∞,    (2.10)

which shows that the algorithm tends to a second order convergent process as n → ∞.
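For orientation, the full-step iteration (2.8), with each correction obtained from the linear least squares problem (2.7), can be sketched as follows for the additive model (1.4) (an illustration, not from the paper; mu and jac are assumed user-supplied callables for μ(θ) and ∇μ(θ), and safeguards such as line searches are omitted):

    import numpy as np

    def gauss_newton(theta, y, mu, jac, tol=1e-10, maxit=100):
        # Full-step Gauss-Newton for min (1/2n) ||y - mu(theta)||^2.
        for _ in range(maxit):
            r = y - mu(theta)                       # residual f = y - mu(theta)
            J = jac(theta)                          # n x p Jacobian of mu
            t = np.linalg.lstsq(J, r, rcond=None)[0]    # correction solving (2.7)
            theta = theta + t
            if np.linalg.norm(t) <= tol * (1.0 + np.linalg.norm(theta)):
                break
        return theta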

The derivation of this result will now be outlined. As ∇F_n(θ_n) = 0, it follows that

    ∇Q(θ_n) = I_p − I_n(θ_n)^{-1} ∇²F_n(θ_n).

Now define W_n : R^p → R^p by

    W_n(θ) = I_n(θ)^{-1} { I_n(θ) − ∇²F_n(θ) }.    (2.11)

Then

    W_n(θ_n) = ∇Q(θ_n) = W_n(θ*) + O(||θ_n − θ*||),    (2.12)

by consistency. By (2.6),

    W_n(θ*) = −I_n(θ*)^{-1} { ∇²F_n(θ*) − E{∇²F_n(θ*)} }.    (2.13)

It has been noted that I_n is bounded, positive definite. Also, a factor 1/n is implicit in the second term on the right hand side of (2.13), and the components of n∇²F_n are sums of independent random variables. Thus it follows by an application of the a.s. law of large numbers [12] that W_n(θ*) → 0 component-wise as n → ∞. An immediate consequence is that

    ϖ(W_n(θ*)) → 0  a.s.,  n → ∞.    (2.14)

The desired convergence rate result (2.10) now follows from (2.12). Note that the property of consistency that derives from the maximum likelihood connection is an essential component of the argument. Note also that this is not a completely straightforward application of the law of large numbers, because a sequence of sets of observation points {t_i, i = 1, 2, ..., n} is involved. For this case see [3].

3. Rate estimation for separable problems. Variable projection leads to the nonlinear least squares problem (1.10), where

    f(θ, ε) = P(θ) y,    (3.1)

    F_n(θ, ε) = (1/2n) y^T P(θ) y.    (3.2)

Implementation of the Gauss-Newton algorithm (the RGN algorithm) has been discussed in detail in [11]. It uses an approximate Hessian computed from (2.5) and requires derivatives of P. The derivative of P in the direction defined by t ∈ R^p is

    P'[t] = −P Φ'[t] Φ⁺ − (Φ⁺)^T Φ'[t]^T P    (3.3)
          = A(θ, t) + A(θ, t)^T,    (3.4)

where A ∈ R^{n×n}, the matrix directional derivative (dΦ/dθ)[t] is written Φ'[t] to emphasise both the linear dependence on t and that t is held fixed in this operation, explicit dependence on both θ and n is understood, and Φ⁺ denotes the generalised inverse of Φ. Note that

    Φ⁺ P = Φ⁺ − Φ⁺ Φ Φ⁺ = 0,

so the two components of P'[t] in (3.4) are orthogonal. Define matrices K, L : R^p → R^n by

    A(θ, t) y = K(θ, y) t,    (3.5)

    A(θ, t)^T y = L(θ, y) t.    (3.6)

Then the RGN correction solves

    min_t || P y + (K + L) t ||²,    (3.7)

where

    L^T K = 0    (3.8)

as a consequence of the orthogonality noted above.

Remark 3.1. Kaufman [4] has examined these terms in more detail. We have

    t^T K^T K t = y^T A^T A y = O(||α||²),

    t^T L^T L t = y^T A A^T y = O(||P y||²).

If the orthogonality noted above is used, then the second term in the design matrix in (3.7) corresponds to a small residual term when ||P y||² is relatively small, and can be ignored. The resulting correction solves

    min_t || P y + K t ||².    (3.9)

This modification was suggested by Kaufman. It can be implemented with less computational cost, and it is favoured for this reason. Numerical experience is reported to be very satisfactory [2].
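A single step of the Kaufman correction can be sketched as follows (an illustration, not from the paper). By (3.5), with α̂ = Φ⁺y, the k-th column of K is −P (∂Φ/∂θ_k) α̂; the interfaces Phi and dPhi are assumptions of the sketch:

    import numpy as np

    def kaufman_step(theta, y, Phi, dPhi):
        # Phi: n x m matrix Phi(theta); dPhi: n x m x p array of dPhi/dtheta_k.
        A = Phi(theta)
        Q = np.linalg.qr(A)[0]                      # orthonormal basis of range(Phi)
        proj = lambda v: v - Q @ (Q.T @ v)          # apply P = I - Q Q^T
        alpha_hat = np.linalg.lstsq(A, y, rcond=None)[0]    # Phi^+ y, eq. (1.9)
        D = dPhi(theta)
        # Columns of K are -P (dPhi/dtheta_k) alpha_hat; compare (3.5).
        K = np.column_stack([-proj(D[:, :, k] @ alpha_hat)
                             for k in range(D.shape[2])])
        dtheta = np.linalg.lstsq(K, -proj(y), rcond=None)[0]    # solves (3.9)
        return theta + dtheta, alpha_hat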

The terms in the sum of squares in the reduced problem (1.10) are

    f_i = Σ_{j=1}^n P_ij y_j,  i = 1, 2, ..., n.    (3.10)

Now, because the noise ε is coupled with the nonlinear parameters and so does not disappear under differentiation, I_n is quadratic in the noise contributions. An immediate consequence is that

    I_n(θ*) ≠ E{ (1/n) ∇f^T ∇f }.    (3.11)

Thus it is not possible to repeat exactly the rate of convergence calculation of the previous section. Instead it is convenient to rewrite equation (2.11):

    W_n = −{ (1/n) ∇f^T ∇f }^{-1} { (1/n) Σ_{i=1}^n f_i ∇²f_i },    (3.12)

where the right hand side is evaluated at θ_n. The property of consistency is unchanged, so the asymptotic convergence rate is again determined by ϖ(W_n(θ*)). We now examine this expression in more detail.

Lemma 3.2.

    (1/n) Φ^T Φ → G(θ),  n → ∞,    (3.13)

where

    G_ij = ∫₀¹ φ_i(t, θ) φ_j(t, θ) ϱ(t) dt,  1 ≤ i, j ≤ m,

and the density ϱ is determined by the asymptotic properties of the method for generating the sample points t_i, i = 1, 2, ..., n, for large n. The Gram matrix G is bounded and generically positive definite. Let Π = I − P. Then

    Π_ij = (1/n) φ_i^T G^{-1} φ_j + o(1/n),    (3.14)

where

    φ_i = [ φ_1(t_i)  φ_2(t_i)  ⋯  φ_m(t_i) ]^T.

This gives an O(1/n) component-wise estimate which applies also to derivatives of both P and Π with respect to θ.

Proof. The result (3.13) is discussed in detail in [6]. It follows from

    (1/n) (Φ^T Φ)_ij = (1/n) Σ_{k=1}^n φ_i(t_k) φ_j(t_k) = G_ij + O(1/n)

by interpreting the sum as a quadrature formula. Positive definiteness is a consequence of the problem rank assumption. To derive (3.14) note that

    Π = Φ (Φ^T Φ)^{-1} Φ^T = (1/n) Φ G^{-1} Φ^T + o(1/n).

The starting point for determining the asymptotics of the convergence rate of the RGN algorithm as n → ∞ is the computation of the expectations of the numerator and denominator matrices in (3.12). The expectation of the denominator is bounded and generically positive definite. The expectation of the numerator is O(1/n) as n → ∞. This suggests strongly that the spectral radius of ∇Q → 0, n → ∞, a result of essentially similar strength to that obtained for the additive error case. To complete the proof requires showing that both numerator and denominator terms converge to their expectations with probability 1. Consider first the denominator term.

Lemma 3.3. Fix θ = θ*. Then

    E{ (1/n) ∇f^T ∇f } = σ² M_1 + M_2,    (3.15)

where M_1 = O(1/n), and M_2 tends to a limit which is a bounded, positive definite matrix when the problem rank assumption is satisfied. In detail, these matrices are

    M_1 = (1/n) Σ_{i=1}^n Σ_{j=1}^n ∇P_ij^T ∇P_ij,    (3.16)

    M_2 = (1/n) Σ_{j=1}^n ∇μ_j^T ∇μ_j − (1/n) Σ_{j=1}^n Σ_{k=1}^n ∇μ_j^T ∇μ_k Π_jk.    (3.17)

Proof. Set

    (1/n) ∇f^T ∇f = (1/n) Σ_{i=1}^n ∇f_i^T ∇f_i,  ∇f_i = Σ_{j=1}^n y_j ∇P_ij.    (3.18)

To calculate the expectation note that it follows from equation (1.4) that

    E{ y_j y_k } = σ² δ_jk + μ_j μ_k,    (3.19)

where μ_j = e_j^T Φ(θ*) α*. It follows that

    E{ (1/n) ∇f^T ∇f } = (σ²/n) Σ_i Σ_j ∇P_ij^T ∇P_ij + (1/n) Σ_i Σ_j Σ_k μ_j μ_k ∇P_ij^T ∇P_ik = σ² M_1 + M_2.

To show M_1 → 0 is a counting exercise: M_1 consists of the sum of n² terms, each of which is a p × p matrix of O(1) gradient terms divided by n³, as a consequence of Lemma 3.2. M_2 can be simplified somewhat by noting that Σ_{j=1}^n P_ij μ_j = 0 identically in θ, by (1.6), so that

    Σ_{j=1}^n ∇P_ij μ_j = −Σ_{j=1}^n P_ij ∇μ_j.

This gives, using the symmetry of P = I − Π,

    (1/n) Σ_i Σ_j Σ_k μ_j μ_k ∇P_ij^T ∇P_ik = (1/n) Σ_j Σ_k ∇μ_j^T ∇μ_k P_jk    (3.20)
        = (1/n) Σ_j ∇μ_j^T ∇μ_j − (1/n) Σ_j Σ_k ∇μ_j^T ∇μ_k Π_jk.

Boundedness of M_2 as n → ∞ now follows using the estimates for the size of the Π_ij computed in Lemma 3.2. To show that M_2 is positive definite, note that it follows from (3.20) that

    t^T M_2 t = (1/n) (∇μ[t])^T { I − Π } (∇μ[t]) ≥ 0.

As n → ∞, this expression can vanish only if there is a direction t ∈ R^p such that ∇μ[t] = γ μ for some γ ≠ 0. This requirement is contrary to the Gauss-Newton rank assumption that [∇_θ(Φα)  Φ] has full rank m + p.

Lemma 3.4. The numerator in the expression (3.12) defining W_n is

    (1/n) Σ_{i=1}^n f_i ∇²f_i = (1/n) Σ_i Σ_j Σ_k y_j y_k P_ij ∇²P_ik.    (3.21)

Let M_3 = E{ (1/n) Σ_i f_i ∇²f_i }; then

    M_3 = (σ²/n) Σ_{i=1}^n { ∇²P_ii − Σ_{j=1}^n Π_ij ∇²P_ij },    (3.22)

and M_3 → 0, n → ∞.

Proof. This is similar to that of Lemma 3.3. The new point is that the contribution to M_3 from the signal terms μ_j in the expectation (3.19) is

    (1/n) Σ_i Σ_j Σ_k μ_j μ_k P_ij ∇²P_ik = 0,

by summing over j keeping i and k fixed. The previous counting argument can be used again to give the estimate M_3 = O(1/n), n → ∞.

The final step required is to show that the numerator and denominator terms in (3.12) approach their expectations as n → ∞. Only the case of the denominator is considered here.

Lemma 3.5.

    (1/n) ∇f^T ∇f − M_2 → 0  a.s.,  n → ∞.

Proof. The basic quantities are:

    (1/n) ∇f^T ∇f = (1/n) Σ_{i=1}^n ∇f_i^T ∇f_i = (1/n) Σ_i Σ_j Σ_k { μ_j μ_k + μ_j ε_k + μ_k ε_j + ε_j ε_k } ∇P_ij^T ∇P_ik.

The first of the three terms in this last expansion is M_2. Thus the result requires showing that the remaining terms tend to 0. Let

    π_i = Σ_{j=1}^n ε_j ∇P_ij^T,  π_i ∈ R^p.

As, by Lemma 3.2, the components of ∇P_ij are O(1/n), it follows by applications of the law of large numbers that π_i → 0 a.s., n → ∞, componentwise. Specifically, given δ > 0, there is an n_0 such that, for each i, ||π_i|| < δ for n > n_0 with probability 1. Consider the third term. Let

    S_n = (1/n) Σ_i Σ_j Σ_k ε_j ε_k ∇P_ij^T ∇P_ik = (1/n) Σ_{i=1}^n π_i π_i^T.

Then, in the maximum norm, with probability 1 for n > n_0,

    ||S_n|| ≤ p δ²,

showing that the third sum tends to 0 almost surely. A similar argument applies to the second term, which proves to be O(δ). These results can now be put together to give the desired convergence result.

Theorem 3.6.

    W_n(θ*) → 0  a.s.,  n → ∞.

Proof. The idea is to write each component term Ω in (3.12) in the form

    Ω = E{Ω} + (Ω − E{Ω}),

and then to appeal to the asymptotic convergence results established in the preceding lemmas.

Remark 3.7. This result, when combined with consistency, suffices to establish the analogue of (2.10) in this case. The asymptotic convergence rate of the RGN algorithm can be expected to be similar to that of the full Gauss-Newton method. While the numerator expectation in the Gauss-Newton method is 0, and that in the RGN algorithm is O(1/n) by Lemma 3.4, these are both smaller than the discrepancies Ω − E{Ω} between their full expressions and their expectations. Thus it is these discrepancy terms that are critical in determining the convergence rates. Here these correspond to law of large numbers rates, for which a scale of O(1/n^{1/2}) is appropriate.

4. The Kaufman modification. As the RGN algorithm possesses similar convergence rate properties to Gauss-Newton in large sample problems, and, as the Kaufman modification is favoured in implementation, it is of interest to ask if it too shares the same good large sample convergence rate properties. Fortunately the answer is in the affirmative. This result can be proved in the same way as the main lemmas in the previous section. The calculation is similar to the preceding and is relegated to the Appendix. In this section the close connection between the modified algorithm and the full Gauss-Newton method is explored. That both can be implemented with the same amount of work is shown in [11].

First note that equation (2.7) for the Gauss-Newton correction here becomes

    min_{δα, δθ} || y − Φα − [ Φ  ∇_θ(Φα) ] [ δα ; δθ ] ||².    (4.1)

Introducing the variable projection matrix P permits this to be written:

    min_{δθ} || P y − P Φ'[δθ] α ||² + min_{δα} || (I − P)(y − Φ'[δθ] α) − Φ(α + δα) ||².    (4.2)

Comparison with (3.3) shows that the first minimization is just

    min_{δθ} || P y + K δθ ||².    (4.3)

Thus, given α, the Kaufman search direction computed using (3.9) is exactly the Gauss-Newton correction for the nonlinear parameters. If α is set using (1.9), then the second minimization gives

    δα = −Φ⁺ Φ'[δθ] Φ⁺ y,    (4.4)

while the increment in the linear parameters arising from the Kaufman correction is

    α̂(θ + δθ) − α̂(θ) = ∇_θ(Φ⁺ y)[δθ] + O(||δθ||²).

Note this increment is not computed as part of the algorithm.

To examine (4.4) in more detail we have

    dΦ⁺[t] = −(Φ^T Φ)^{-1} ( Φ'[t]^T Φ + Φ^T Φ'[t] ) (Φ^T Φ)^{-1} Φ^T + (Φ^T Φ)^{-1} Φ'[t]^T
           = (Φ^T Φ)^{-1} Φ'[t]^T ( I − Φ Φ⁺ ) − Φ⁺ Φ'[t] Φ⁺
           = (Φ^T Φ)^{-1} Φ'[t]^T P − Φ⁺ Φ'[t] Φ⁺.

The second term in this last equation occurs in (4.4). Thus, setting δθ = δ t with ||t|| = 1,

    δα − ∇_θ(Φ⁺ y)[δθ] = −δ (Φ^T Φ)^{-1} Φ'[t]^T P y + O(δ²)
        = −δ G^{-1} { (1/n) Φ'[t]^T P (Φ(θ*) − Φ(θ)) α* + (1/n) Φ'[t]^T P ε } + O(δ²).

The magnitude of this resulting expression can be shown to be small almost surely compared with δ when n is large enough, using the law of large numbers and consistency as before. The proximity of the increments in the linear parameters, plus the identity of the calculation of the nonlinear parameter increments, demonstrates the close alignment between the Kaufman and Gauss-Newton algorithms. The small residual result is discussed in [11].
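The derivative of the generalised inverse used above, which goes back to [1], is easy to check numerically. The following sketch (an illustration, not from the paper) compares the identity dΦ⁺[t] = (Φ^TΦ)^{-1} Φ'[t]^T P − Φ⁺ Φ'[t] Φ⁺ with a central finite difference on a random one-parameter family of full rank matrices:

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 20, 3
    B, C = rng.standard_normal((n, m)), rng.standard_normal((n, m))
    Phi = lambda s: B + s * C                       # one-parameter family; dPhi = C

    h = 1e-6                                        # central finite difference step
    dpinv_fd = (np.linalg.pinv(Phi(h)) - np.linalg.pinv(Phi(-h))) / (2.0 * h)

    A = Phi(0.0)
    Ap = np.linalg.pinv(A)
    P = np.eye(n) - A @ Ap                          # projector onto range(Phi) complement
    dpinv = -Ap @ C @ Ap + np.linalg.solve(A.T @ A, C.T) @ P
    print(np.abs(dpinv - dpinv_fd).max())           # small; agreement is O(h^2)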

5. Discussion. It has been shown that both of the variants of the Gauss-Newton algorithm considered possess similar convergence properties in large data set problems. However, that does not help resolve the question of the method of choice in any particular application. There is agreement that the Kaufman modification of the RGN algorithm has an advantage in being cheaper to compute, but it is not less expensive than the full Gauss-Newton algorithm [11]. Thus a choice between variable projection and Gauss-Newton must depend on other factors. These include flexibility, ease of use, and global behaviour. Flexibility tends to favour the full Gauss-Newton method, because it can be applied directly to solve a range of maximum likelihood problems [7], so it has strong claims to be provided as a general purpose procedure. Ease of use is just about a draw. While Gauss-Newton requires starting values for both α and θ, given θ the obvious approach is to compute α by solving the linear least squares problem. Selecting between the methods on some a priori prediction of effectiveness appears much harder. It is argued in [2] that variable projection can take fewer iterations in important cases. There are two significant points to be made here.

1. Nonlinear approximation families need not be closed. Especially if the data is inadequate, the iterates generated by the full Gauss-Newton method may tend to a function in the closure of the family. In this case some parameter values will tend to ∞, and divergence is the correct answer. The nonlinear parameters can be bounded, so it is possible for variable projection to yield a well determined answer. However, it still needs to be interpreted correctly. An example involving the Gauss-Newton method is discussed in [7].

2. There is some evidence that strategies which eliminate the linear parameters in separable models can be spectacularly effective in exponential fitting problems with small numbers of variables [5], [9]. Similar behaviour has not been observed for rational fitting [8], which is also a separable regression problem. It seems there is something else going on in the exponential fitting case, as ill-conditioning of the computation of the linear parameters directly affects both the conditioning of the linear parameter correction in Gauss-Newton and the accuracy of the calculation of P in variable projection in both these classes of problems. It should be noted that maximum likelihood is not the way to estimate frequencies, which are just the nonlinear parameters in a closely related problem [10]. Some possible directions for developing modified algorithms are considered in [3].

[Figure 5.1: No convergence: fit after 50 iterations, case σ = 4, n = 64.]

The importance of large sample behaviour, and the need for appropriate instrumentation for data collection, are consequences of the result that maximum likelihood parameter estimates have the property that √n (θ_n − θ*) is asymptotically normally distributed [12]. The effect of sample size on the convergence rate of the Gauss-Newton method is illustrated in Table 5.1 for an estimation problem involving fitting three Gaussian peaks plus an exponential background term. Such problems are common in scientific data analysis and are well enough conditioned if the peaks are reasonably distinct. In such cases it is relatively easy to set adequate initial parameter estimates. Here the chosen model is

    μ(x, t) = 5e^{−10t} + 8e^{−(t − ⋯)²/⋯} + ⋯ e^{−(t − ⋯)²/⋯} + ⋯ e^{−(t − ⋯)²/⋯}.

Initial conditions are chosen such that there are random errors of up to 50% in the background parameters and peak heights, 2.5% in peak locations, and 25% in peak width parameters. Numbers of iterations are reported for an error process corresponding to a particular sequence of independent, normally distributed random numbers, standard deviations σ = 1, 2, 4, and equispaced sample points n = 64, 256, 1024, 4096, 16384.
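A runnable version of this kind of test problem can be sketched as follows (an illustration, not from the paper; since the peak locations, widths, and two of the heights did not survive above, the values marked hypothetical below are placeholders chosen only to make the sketch complete):

    import numpy as np

    def peaks_model(x, t):
        # x = [b0, b1, h1, c1, w1, h2, c2, w2, h3, c3, w3]
        out = x[0] * np.exp(-x[1] * t)              # exponential background
        for h, c, w in zip(x[2::3], x[3::3], x[4::3]):
            out = out + h * np.exp(-((t - c) / w) ** 2)     # one Gaussian peak
        return out

    t = np.linspace(0.0, 1.0, 256)                  # n = 256 equispaced points
    x_true = np.array([5.0, 10.0,                   # background 5 e^{-10 t}
                       8.0, 0.25, 0.05,             # first peak height 8; locations,
                       6.0, 0.50, 0.05,             # widths and remaining heights
                       7.0, 0.75, 0.05])            # are hypothetical placeholders
    y = peaks_model(x_true, t) + 4.0 * np.random.default_rng(1).standard_normal(t.size)   # sigma = 4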

[Table 5.1: Iteration counts for peak fitting with exponential background; σ = 1, 2, 4 and n = 64, 256, 1024, 4096, 16384.]

The most sensitive parameters prove to be those determining the exponential background, and they trigger the lack of convergence that occurred when σ = 4, n = 64. The apparent superior convergence behaviour in the n = 64 case over the n = 256 case for the smaller σ values can be explained by the sequence of random numbers generated producing more favourable residual values in the former case. The sequence used here corresponds to the first quarter of the sequence for n = 256. Plots of the fits obtained for σ = 4, n = 64 and σ = 4, n = 256 are given in Figure 5.1 and Figure 5.2 respectively. The difficulty with the background estimation in the former shows up in the sharp kink in the fitted (red) curve near t = 0. This figure gives the result after 50 iterations, when x_1 = 269 and x_2 = 327, so divergence of the background parameters is evident. However, the rest of the signal is being picked up pretty well. The quality of the signal representation suggests possible non-compactness, but the diverging parameters mix linear and nonlinear, making interpretation of the cancellation occurring difficult. A similar phenomenon is discussed in [7]. This involves linear parameters only, and it is easier to see what is going on. The problem is attributed to lack of adequate parameter information in the given data. The green curves give the fit obtained using the initial parameter values and are the same in both cases. These curves manage to hide the middle peak fairly well, so the overall fits obtained are quite satisfactory. The problem would be harder if the number of peaks was not known a priori.

6. Appendix. The variational matrix whose spectral radius evaluated at θ_n determines the convergence rate of the Kaufman iteration is

    ∇Q = I − { (1/n) K^T K }^{-1} ∇²F_n = −{ (1/n) K^T K }^{-1} { (1/n) Σ_{i=1}^n f_i ∇²f_i + (1/n) L^T L }.    (6.1)

It is possible here to draw on work already done to establish the key convergence rate result (2.10). Lemmas 3.3 and 3.5 describe the convergence behaviour of I_n = (1/n){ K^T K + L^T L } as n → ∞. Here it proves to be possible to separate out the properties of the individual terms by making use of the orthogonality of K and L, once it has been shown that

    (1/n) L(θ*, ε)^T L(θ*, ε) − E{ (1/n) L(θ*, ε)^T L(θ*, ε) } → 0  a.s.,  n → ∞.

This calculation can proceed as follows.

[Figure 5.2: Fit obtained: case σ = 4, n = 256.]

Let t ∈ R^p. Then

    E{ (1/n) t^T L^T L t } = (1/n) E{ ε^T P Φ'[t] Φ⁺ (Φ⁺)^T Φ'[t]^T P ε }
        = (1/n) E{ ε^T P Φ'[t] (Φ^T Φ)^{-1} Φ'[t]^T P ε }
        = (1/n²) trace{ Φ'[t] G^{-1} Φ'[t]^T P E{ε ε^T} P } + smaller terms
        = (σ²/n²) trace{ Φ'[t] G^{-1} Φ'[t]^T (I − Π) } + smaller terms.

This last expression breaks into two terms, one involving the unit matrix and the other involving the projection Π. Both lead to terms of the same order. The unit matrix term gives

    (σ²/n²) trace{ Φ'[t] G^{-1} Φ'[t]^T } = (σ²/n²) Σ_{i=1}^n t^T Ψ_i^T G^{-1} Ψ_i t,

where

    (Ψ_i)_jk = ∂φ_ij/∂θ_k,  Ψ_i ∈ R^{m×p}.

It follows that

    (σ²/n²) Σ_{i=1}^n Ψ_i^T G^{-1} Ψ_i = O(1/n),  n → ∞.

To complete the story, note that the conclusion of Lemma 3.5 can be written

    (1/n) { K^T K + L^T L } − E{ (1/n) ( K^T K + L^T L ) } → 0  a.s.,  n → ∞.

If (1/n) K^T K is bounded, positive definite, then, using the orthogonality (3.8) together with the almost sure convergence of (1/n) L^T L to its expectation established above, the L^T L contributions can be cancelled on both sides to give

    (1/n) K^T K − E{ (1/n) K^T K } → 0  a.s.,  n → ∞;

that is, (1/n) K^T K also tends almost surely to its expectation, provided it is bounded, positive definite for n large enough. Note first that the linear parameters cannot upset boundedness:

    α̂ = (Φ^T Φ)^{-1} Φ^T y = α* + (1/n) G^{-1} ( I + O(1/n) ) Φ^T ε = α* + δα,  δα = o(1) a.s.,    (6.2)

where α* is the true vector of linear parameters. Positive definiteness follows from

    (1/n) t^T K^T K t = (1/n) α̂^T Φ'[t]^T P Φ'[t] α̂ = (1/n) || P Φ'[t] α̂ ||² ≥ 0.

Equality can hold only if there is a t such that Φ'[t] α̂ = γ Φ α̂. This condition was met also in Lemma 3.3.

REFERENCES

[1] G. Golub and V. Pereyra, The differentiation of pseudo-inverses and nonlinear least squares problems whose variables separate, SIAM J. Numer. Anal., 10 (1973), pp. 413-432.
[2] G. Golub and V. Pereyra, Separable nonlinear least squares: the variable projection method and its applications, Inverse Problems, 19 (2003), pp. R1-R26.
[3] M. Kahn, M. Mackisack, M. Osborne, and G. Smyth, On the consistency of Prony's method and related algorithms, J. Comput. Graph. Statist., 1 (1992), pp. 329-349.
[4] L. Kaufman, Variable projection method for solving separable nonlinear least squares problems, BIT, 15 (1975), pp. 49-57.
[5] M. Osborne, Some special nonlinear least squares problems, SIAM J. Numer. Anal., 12 (1975), pp. 571-592.
[6] M. Osborne, Fisher's method of scoring, Internat. Statist. Rev., 60 (1992), pp. 99-117.
[7] M. Osborne, Least squares methods in maximum likelihood problems, Optim. Methods Softw., 21 (2006), pp. 943-959.
[8] M. Osborne and G. Smyth, A modified Prony algorithm for fitting functions defined by difference equations, SIAM J. Sci. and Statist. Comput., 12 (1991), pp. 362-382.
[9] M. Osborne and G. Smyth, A modified Prony algorithm for exponential fitting, SIAM J. Sci. Comput., 16 (1995), pp. 119-138.
[10] B. Quinn and E. Hannan, The Estimation and Tracking of Frequency, Cambridge University Press, Cambridge, United Kingdom, 2001.
[11] A. Ruhe and P. Wedin, Algorithms for separable nonlinear least squares problems, SIAM Rev., 22 (1980), pp. 318-337.
[12] K. Sen and J. Singer, Large Sample Methods in Statistics, Chapman and Hall, New York, 1993.
[13] W. Stout, Almost Sure Convergence, Academic Press, New York, 1974.


Electronic Transactions on Numerical Analysis, Volume 28, pp. 1-15, 2007. Copyright 2007, Kent State University. ISSN 1068-9613.
