ONLINE BAYESIAN KERNEL REGRESSION FROM NONLINEAR MAPPING OF OBSERVATIONS
Matthieu Geist 1,2, Olivier Pietquin 1 and Gabriel Fricout 2

1 SUPELEC, IMS Research Group, Metz, France
2 ArcelorMittal Research, MCE Department, Mazères-lès-Metz, France

ABSTRACT

In a large number of applications, engineers have to estimate a function linked to the state of a dynamic system. To do so, a sequence of samples drawn from this unknown function is observed while the system transits from state to state, and the problem is to generalize these observations to unvisited states. Several solutions can be envisioned, among which regressing a family of parameterized functions so as to fit the observed samples at best. However, classical methods cannot handle the case where the actual samples are not directly observable and only a nonlinear mapping of them is available, which happens when a special sensor has to be used or when solving the Bellman equation in order to control the system. This paper introduces a method based on Bayesian filtering and kernel machines designed to solve this tricky problem. First experimental results are promising.

1. INTRODUCTION

In a large number of applications, engineers have to estimate a function linked to the state of a dynamic system. This function generally qualifies the system (e.g. the magnitude of the electromagnetic field in a WiFi network according to the power emitted by each antenna of the network). To do so, a sequence of samples drawn from this unknown function is observed while the system transits from state to state, and the problem is to generalize these observations to unvisited states. Several solutions can be envisioned, among which regressing a family of parameterized functions so as to fit the observed samples at best. This is a well-known problem addressed by many standard methods in the literature, such as kernel machines [1] or neural networks [2]. Yet, several problems are not usually handled by standard techniques.
For instance, the actual samples are sometimes not directly observable and only a nonlinear mapping of them is available. This is the case when a special sensor has to be used (e.g. measuring a temperature using a spectrometer or a thermocouple). This is also the case when solving the Bellman equation in a Markovian decision process with unknown deterministic transitions [3]. This is important for (asynchronous and online) dynamic programming, and more generally for control theory. Another problem is to handle online regression, that is to incrementally improve the regression results as new samples are observed, by recursively updating previously computed parameters. In this paper we propose a method, based on the Bayesian filtering paradigm and kernel machines, for online regression from nonlinear mappings of observations. We have chosen to describe a quite general formulation of the problem in order to appeal to a broader audience. Indeed, the technique introduced below handles nonlinearities well, in a derivative-free way, and it should be useful in other fields.

1.1. Problem statement

Throughout the rest of this paper, x will denote a column vector and x a scalar. Let x = (x^1, x^2) ∈ X = X^1 × X^2, where X^1 (resp. X^2) is a compact set of R^n (resp. R^m). Let t : X^1 × X^2 → X^1 be a nonlinear transformation (the transitions in the case of a dynamic system) which will be observed. Let g be a nonlinear mapping such that g : f ∈ R^X → g_f ∈ R^(X × X^1). The aim here is to approximate sequentially the nonlinear function f : x ∈ X → f(x) = f(x^1, x^2) ∈ R from samples (x_k, t_k = t(x_k^1, x_k^2), y_k = g_f(x_k^1, x_k^2, t_k))_k by a function f̂_θ(x) = f̂_θ(x^1, x^2) parameterized by the vector θ. Here the output is scalar; however, the proposed method can be straightforwardly extended to vector outputs. Note that, since any mapping g : f ∈ R^X → g_f ∈ R^X can be cast into the above form through g_f : x ∈ X → g_f(x, t(x)), this problem statement is quite general. Thus the work presented in this paper can also be considered with g : f ∈ R^X → g_f ∈ R^X (g being known analytically), this case being more specific.
The interest of this particular formulation is that only a part of the nonlinearities has to be known analytically (the mapping g), whereas the other part can merely be observed (the function t).
1.2. Outline

A kernel-based regression is used, namely the approximation is of the form f̂_θ(x) = Σ_{i=1}^p α_i K(x, x_i), where K is a kernel, that is a continuous, symmetric and positive semi-definite function. The parameter vector θ contains the weights (α_i)_i, and possibly the centers (x_i)_i and some parameters of the kernel (e.g. the variance for Gaussian kernels). These methods rely on the Mercer theorem [4], which states that each kernel is a dot product in a higher dimensional space. More precisely, for each kernel K there exists a mapping φ : X → F (F being called the feature space) such that, for all x, y ∈ X, K(x, y) = ⟨φ(x), φ(y)⟩. Thus, any linear regression algorithm which only uses dot products can be cast by this kernel trick into a nonlinear one, by implicitly mapping the original space X to a higher dimensional one. Many approaches to kernel regression can be found in the literature, the most classical being the Support Vector Machines (SVM) framework [4]. There are quite fewer Bayesian approaches; nonetheless the reader can refer to [5] or [6] for interesting examples. To our knowledge, none of them is designed to handle the regression problem described in this paper. As a preprocessing step, a prior on the kernel to be used is put (e.g. a Gaussian kernel with a specific width), and a dictionary method [7] is used to compute an approximate basis of φ(X). This gives a fairly sparse and noise-independent initialisation for the number of kernels to be used and their centers. Following [8], the regression problem is cast into the state-space representation

θ_{k+1} = θ_k + v_k
y_k = g_{f̂_{θ_k}}(x_k^1, x_k^2, t_k) + n_k    (1)

where v_k is an artificial process noise and n_k is an observation noise. A Sigma Point Kalman Filter (SPKF) [9] is used to sequentially estimate the parameters, as the samples (x_k, t(x_k), y_k)_k are observed. In the following sections, the SPKF approach and the dictionary method will first be briefly presented. The proposed algorithm, which is based on the two aforementioned methods, will then be exhibited, and the first experimental results will be shown.
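To make the form of this estimator concrete, here is a minimal numeric sketch of such a kernel expansion (our own illustration, not the authors' implementation; all parameter values are made up), assuming a Gaussian kernel:

```python
import numpy as np

def gaussian_kernel(x, c, sigma):
    """Gaussian kernel K(x, c) = exp(-||x - c||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((np.asarray(x) - np.asarray(c)) ** 2) / (2.0 * sigma ** 2))

def f_hat(x, alphas, centers, sigma):
    """Kernel expansion f_hat(x) = sum_i alpha_i K(x, x_i)."""
    return sum(a * gaussian_kernel(x, c, sigma) for a, c in zip(alphas, centers))

# Two kernels with hypothetical weights and centers.
alphas = [1.0, 2.0]
centers = [np.array([0.0]), np.array([1.0])]
value = f_hat(np.array([0.0]), alphas, centers, sigma=1.0)  # 1 + 2*exp(-1/2)
```

In the full algorithm below, the weights, the centers and the widths are all part of the estimated parameter vector θ.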
Eventually, future works will be sketched.

2. BACKGROUND

2.1. Bayesian Filtering Paradigm

The problem of Bayesian filtering can be expressed in its state-space formulation:

s_{k+1} = ψ_k(s_k, v_k)
y_k = h_k(s_k, n_k)    (2)

The objective is to sequentially infer E[s_k | y_{1:k}], the expectation of the hidden state s_k given the sequence of observations up to time k, y_{1:k}, knowing that the state evolution is driven by the possibly nonlinear mapping ψ_k and the process noise v_k. The observation y_k, through the possibly nonlinear mapping h_k, is a function of the state s_k, corrupted by an observation noise n_k. If the mappings are linear and if the noises v_k and n_k are additive Gaussian noises, the optimal solution is given by the Kalman filter [10]: the quantities of interest are random variables, and inference (that is, prediction of these quantities and correction of them given a new observation) is done online by propagating sufficient statistics through linear transformations. If one of these two hypotheses does not hold, approximate solutions exist. See [11] for a survey on Bayesian filtering. The Sigma Point framework [9] is a nonlinear extension of Kalman filtering (random noises are still Gaussian). The basic idea is that it is easier to approximate a probability distribution than an arbitrary nonlinear function. Recall that in the Kalman filter framework, basically, linear transformations are applied to sufficient statistics. In the Extended Kalman Filter (EKF) approach, nonlinear mappings are linearized. In the Sigma Point approach, a set of so-called sigma points is deterministically computed using the estimates of the mean and covariance of the random variable of interest. These points are representative of the current distribution.
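The sigma-point idea can be illustrated with a generic unscented transform (a toy sketch of ours, not the SR-CDKF variant used later; the weighting scheme with parameter kappa is one standard choice):

```python
import numpy as np

def sigma_points(mean, cov, kappa=1.0):
    """Deterministically pick 2n+1 points representing N(mean, cov)."""
    n = len(mean)
    S = np.linalg.cholesky((n + kappa) * cov)  # matrix square root of the scaled covariance
    pts = [mean] + [mean + S[:, i] for i in range(n)] + [mean - S[:, i] for i in range(n)]
    weights = np.array([kappa / (n + kappa)] + [1.0 / (2.0 * (n + kappa))] * (2 * n))
    return np.array(pts), weights

def unscented_mean_cov(h, mean, cov):
    """Propagate N(mean, cov) through a nonlinear mapping h via sigma points."""
    pts, w = sigma_points(mean, cov)
    ys = np.array([h(p) for p in pts])
    y_mean = np.sum(w[:, None] * ys, axis=0)
    diffs = ys - y_mean
    y_cov = sum(wi * np.outer(d, d) for wi, d in zip(w, diffs))
    return y_mean, y_cov

# Sanity check: for a linear mapping the transform is exact.
A = np.array([[2.0, 0.0], [0.0, 3.0]])
ym, yP = unscented_mean_cov(lambda s: A @ s, np.zeros(2), np.eye(2))
# ym ≈ 0, yP ≈ A A^T = diag(4, 9)
```

No derivative of h is needed anywhere, which is the point: the distribution, not the function, is approximated.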
Then, instead of computing a linear (or linearized) mapping of the distribution of interest, one calculates a nonlinear mapping of these sigma points, and uses them to compute the sufficient statistics of interest for the prediction and correction equations, that is to approximate the following distributions:

p(s_k | y_{1:k-1}) = ∫_S p(s_k | s_{k-1}) p(s_{k-1} | y_{1:k-1}) ds_{k-1}

p(s_k | y_{1:k}) = p(y_k | s_k) p(s_k | y_{1:k-1}) / ∫_S p(y_k | s_k) p(s_k | y_{1:k-1}) ds_k

Note that the SPKF and classical Kalman equations are very similar. The major change is how the sufficient statistics are computed (directly for Kalman, through the sigma point computation for SPKF). Table 1 sketches an SPKF update in the case of additive noise, based on the state-space formulation, and using the standard Kalman notations: s̄_{k|k-1} denotes a prediction, s̄_{k|k} an estimate (or correction), P_{s_k,y_k} a covariance matrix, n̄_k a mean, and k is the discrete time index. The reader can refer to [9] for details.

Table 1. SPKF Update

inputs: s̄_{k-1|k-1}, P_{s_{k-1|k-1}}
outputs: s̄_{k|k}, P_{s_{k|k}}

Sigma point computation:
    compute deterministically the sigma point set S_{k-1|k-1} from s̄_{k-1|k-1} and P_{s_{k-1|k-1}}
Prediction step:
    compute S_{k|k-1} from ψ_k(S_{k-1|k-1}, v̄_k) and the process noise covariance
    compute s̄_{k|k-1} and P_{s_{k|k-1}} from S_{k|k-1}
Correction step:
    observe y_k
    Y_{k|k-1} = h_k(S_{k|k-1}, n̄_k)
    compute ȳ_{k|k-1}, P_{y_{k|k-1}} and P_{s_{k|k-1},y_{k|k-1}} from S_{k|k-1}, Y_{k|k-1} and the observation noise covariance
    K_k = P_{s_{k|k-1},y_{k|k-1}} P_{y_{k|k-1}}^{-1}
    s̄_{k|k} = s̄_{k|k-1} + K_k (y_k − ȳ_{k|k-1})
    P_{s_{k|k}} = P_{s_{k|k-1}} − K_k P_{y_{k|k-1}} K_k^T

2.2. Dictionary method

As said in Section 1, the kernel trick corresponds to a dot product in a higher dimensional space F, associated with a mapping φ. Observing that, although F is a (very) high dimensional space, φ(X) can be a quite smaller embedding, the objective is to find a set of p points in X such that φ(X) ≈ Span{φ(x̃_1), ..., φ(x̃_p)}. This procedure is iterative. Suppose that samples x_1, x_2, ... are sequentially observed. At time k, a dictionary D_{k-1} = (x̃_j)_{j=1}^{m_{k-1}} ⊂ (x_j)_{j=1}^{k-1} of m_{k-1} elements is available, where by construction the feature vectors φ(x̃_j) are approximately linearly independent in F. The sample x_k is then observed, and it is added to the dictionary if φ(x_k) is linearly independent of D_{k-1}. To test this, weights a = (a_1, ..., a_{m_{k-1}})^T have to be computed so as to verify

δ_k = min_{a ∈ R^{m_{k-1}}} ‖ Σ_{j=1}^{m_{k-1}} a_j φ(x̃_j) − φ(x_k) ‖²    (3)

Formally, if δ_k = 0 then the feature vectors are linearly dependent, otherwise they are not. Practically, an approximate dependence is allowed: δ_k is compared to a predefined threshold ν determining the quality of the approximation (and consequently the sparsity of the dictionary). Thus the feature vectors are considered as approximately linearly dependent if δ_k ≤ ν. By using the kernel trick and the bilinearity of the dot product, equation (3) can be rewritten as

δ_k = min_a { a^T K̃_{k-1} a − 2 a^T k̃_{k-1}(x_k) + K(x_k, x_k) }    (4)

where (K̃_{k-1})_{i,j} = K(x̃_i, x̃_j) is an m_{k-1} × m_{k-1} matrix and (k̃_{k-1}(x))_i = K(x, x̃_i) is an m_{k-1} × 1 vector. If δ_k > ν, x_k = x̃_{m_k} is added to the dictionary, otherwise it is not. Equation (4) admits the analytical solution a_k = K̃_{k-1}^{-1} k̃_{k-1}(x_k).
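A direct transcription of this sparsification test (equations (3)-(4)) in plain NumPy; the variable names and test values are ours:

```python
import numpy as np

def gauss(x, y, sigma=1.0):
    """Gaussian kernel."""
    return np.exp(-np.sum((np.asarray(x) - np.asarray(y)) ** 2) / (2.0 * sigma ** 2))

def build_dictionary(samples, nu, kernel=gauss):
    """Keep a sample only if its feature vector is not approximately
    linearly dependent on the current dictionary (delta_k > nu)."""
    dico = [samples[0]]
    for x in samples[1:]:
        K = np.array([[kernel(xi, xj) for xj in dico] for xi in dico])
        kvec = np.array([kernel(x, xi) for xi in dico])
        a = np.linalg.solve(K, kvec)        # analytical minimiser of eq. (4)
        delta = kernel(x, x) - kvec @ a     # residual delta_k
        if delta > nu:
            dico.append(x)
    return dico

rng = np.random.default_rng(0)
samples = list(rng.uniform(-10.0, 10.0, size=(200, 1)))
dico = build_dictionary(samples, nu=0.1)    # a sparse subset of the 200 samples
```

The paper relies on the cheaper recursive construction of [7] based on the partitioned matrix inversion formula; the full linear solve above is only for readability.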
Moreover, there exists a computationally efficient algorithm which uses the partitioned matrix inversion formula to construct this dictionary; see [7] for details. Thus, by choosing a prior on the kernel to be used, and by applying this algorithm to a set of N points randomly sampled from X, a sparse set of good candidates for the kernel regression problem is obtained. This method is theoretically well founded, easy to implement, computationally efficient, and it depends neither on the kernel nor on the topology of the space. Note that, despite the fact that this algorithm is naturally online, the dictionary cannot be built (straightforwardly) while estimating the parameters, since the parameters of the chosen kernels (such as the mean and deviation for Gaussian kernels) are parameterized as well. Observe that only bounds on X have to be known in order to compute this dictionary, and not the samples used for regression.

3. PROPOSED ALGORITHM

Recall that the objective here is to sequentially approximate a nonlinear function with a set of kernel functions, as samples (x_k, t_k = t(x_k^1, x_k^2), y_k) become available. This parameter estimation problem can be expressed as the state-space problem (1). Note that here f does not depend on time, but we think that this approach can easily be extended to nonstationary function approximation. In this paper the approximation is of the form

f̂_θ(x) = Σ_{i=1}^p α_i K_{σ_i^1}(x^1, x_i^1) K_{σ_i^2}(x^2, x_i^2),    (5)

with θ = [(α_i)_{i=1}^p, ((x_i^1)^T)_{i=1}^p, (σ_i^1)_{i=1}^p, ((x_i^2)^T)_{i=1}^p, (σ_i^2)_{i=1}^p]^T

and K_{σ_i^j}(x^j, x_i^j) = exp(−‖x^j − x_i^j‖² / (2 (σ_i^j)²)), j = 1, 2.

Note that K_σ(x, x_i) = K_{(σ^1,σ^2)}((x^1, x^2), (x_i^1, x_i^2)) = K_{σ^1}(x^1, x_i^1) K_{σ^2}(x^2, x_i^2) is a product of kernels, and thus it is a kernel. The optimal number of kernels and a good initialisation of the parameters have to be determined first. To tackle this initial difficulty, the dictionary method is used in a preprocessing step. A prior σ_0 = (σ_0^1, σ_0^2) on the Gaussian widths is first put (the same for each kernel). Then a set of N random points is sampled uniformly from X.
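That the product of two kernels is again a kernel can be checked numerically on a Gram matrix, which must stay symmetric positive semi-definite; a small sketch (our own construction, with arbitrary widths):

```python
import numpy as np

def k_gauss(a, b, sigma):
    return np.exp(-(a - b) ** 2 / (2.0 * sigma ** 2))

def k_prod(x, y, s1, s2):
    """Product kernel K((x^1,x^2),(y^1,y^2)) = K_s1(x^1,y^1) * K_s2(x^2,y^2)."""
    return k_gauss(x[0], y[0], s1) * k_gauss(x[1], y[1], s2)

rng = np.random.default_rng(1)
X = rng.uniform(-10.0, 10.0, size=(20, 2))
G = np.array([[k_prod(xi, xj, 4.4, 4.4) for xj in X] for xi in X])
sym = np.allclose(G, G.T)                   # symmetry
psd = np.linalg.eigvalsh(G).min() > -1e-10  # positive semi-definiteness (up to roundoff)
```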
Finally, these samples are used to compute the dictionary. A set of p points D = {x_1, ..., x_p} is thus obtained such that φ(X) ≈ Span{φ_{σ_0}(x_1), ..., φ_{σ_0}(x_p)}, where φ_{σ_0} is the mapping corresponding to K_{σ_0}. Note that even though this sparsification procedure is offline, the proposed algorithm (the regression part) is online. Moreover, no training sample is required for this preprocessing step, but only a classical prior which is anyway required for the Bayesian filter (σ_0), one sparsification parameter ν and bounds on X. A SPKF is then used to estimate the parameters. As for any Bayesian approach, an initial prior on the parameter distribution has to be put. The initial parameter vector follows
a Gaussian law, that is θ_0 ∼ N(θ̄_0, Σ_{θ_0}), where

θ̄_0 = [α_0, ..., α_0, D^1, σ_0^1, ..., σ_0^1, D^2, σ_0^2, ..., σ_0^2]^T,
Σ_{θ_0} = diag(σ_{α_0}², ..., σ_{μ_0^1}², ..., σ_{σ_0^1}², ..., σ_{μ_0^2}², ..., σ_{σ_0^2}², ...).

In these equations, α_0 is the prior mean on the kernel weights, D^j = [(x_1^j)^T, ..., (x_p^j)^T] is derived from the dictionary computed in the preprocessing step, σ_0^j is the prior mean on the kernel deviations, and σ_{α_0}², σ_{μ_0^j}² (a row vector with one variance per component of x_i^j) and σ_{σ_0^j}² are respectively the prior variances on the kernel weights, centers and deviations. All these parameters (except the dictionary) have to be set up beforehand. Let q = |θ| = (n + m + 3) p. Note that θ̄_0 ∈ R^q and Σ_{θ_0} ∈ R^{q×q}. A prior on the noises has to be put too: more precisely, v_0 ∼ N(0, R_{v_0}) where R_{v_0} = σ_{v_0}² I_q, I_q being the identity matrix, and n ∼ N(0, R_n) where R_n = σ_n². Note that here the observation noise is also structural: in other words, it models the ability of the parameterization f̂_θ to approximate the true function of interest f. Then an SPKF update (which updates the sufficient statistics of the parameters) is simply applied at each time step, as a new training sample (x_k, t_k, y_k) becomes available. A specific form of SPKF is used, the so-called Square-Root Central Difference Kalman Filter (SR-CDKF) in its parameter estimation form. This implementation uses the fact that, for this parameter estimation problem, the evolution equation is linear and the noise is additive. This approach is computationally cheaper: the complexity per iteration is O(|θ|²), whereas it is O(|θ|³) in the general case. See [9] for details. A last issue is the choice of the artificial process noise. Formally, since the target function is stationary, there is no process noise. But the parameter space has to be explored in order to find a good solution, that is, an artificial process noise must be introduced. Choosing this noise is still an open research problem. Following [9], a Robbins-Monro stochastic approximation scheme for estimating the innovation is used.
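The layout of θ̄_0 and Σ_{θ_0} can be sketched as follows (a hypothetical helper of ours; the dimensions follow the text, q = (n + m + 3) p):

```python
import numpy as np

def build_prior(dico1, dico2, alpha0, sig1_0, sig2_0, var0=0.1):
    """Stack the prior mean [weights, centers on X^1, widths on X^1,
    centers on X^2, widths on X^2] and a diagonal prior covariance."""
    p, n = dico1.shape
    _, m = dico2.shape
    theta0 = np.concatenate([
        np.full(p, alpha0),   # prior kernel weights alpha_0
        dico1.ravel(),        # centers on X^1, from the dictionary
        np.full(p, sig1_0),   # prior widths sigma_0^1
        dico2.ravel(),        # centers on X^2, from the dictionary
        np.full(p, sig2_0),   # prior widths sigma_0^2
    ])
    assert theta0.size == (n + m + 3) * p   # q = (n + m + 3) p
    return theta0, var0 * np.eye(theta0.size)

# p = 4 dictionary points, n = m = 1 (illustrative values).
theta0, Sigma0 = build_prior(np.zeros((4, 1)), np.zeros((4, 1)),
                             alpha0=0.0, sig1_0=4.4, sig2_0=4.4)
# theta0.size == (1 + 1 + 3) * 4 == 20
```

The single common variance var0 is a simplification: the text allows distinct prior variances for weights, centers and widths.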
That is, the process noise covariance is set to

R_{v_k} = (1 − α) R_{v_{k-1}} + α K_k (y_k − g_{f̂_{θ_{k-1}}}(x_k, t_k))² K_k^T

Here α is a forgetting factor set by the user, and K_k is the Kalman gain obtained during the SR-CDKF update. This approach is summarized in Table 2.

Table 2. Proposed algorithm

inputs: ν, N, α_0, σ_0^j, σ_{α_0}, σ_{σ_0^j}, σ_{μ_0^j}, σ_{v_0}, σ_{n_0}
outputs: θ̄, Σ_θ

Compute the dictionary:
    for i ∈ {1, ..., N}, x_i ∼ U_X
    set X = {x_1, ..., x_N}
    D = Compute-Dictionary(X, ν, σ_0)
Initialization:
    initialize θ̄_0, Σ_{θ_0}, R_{n_0}, R_{v_0}
for k = 1, 2, ... do
    observe (x_k, t_k, y_k)
    SR-CDKF update:
        [θ̄_k, Σ_{θ_k}, K_k] = SR-CDKF-Update(θ̄_{k-1}, Σ_{θ_{k-1}}, x_k, t_k, y_k, R_{v_{k-1}}, R_n)
    Artificial process noise update:
        R_{v_k} = (1 − α) R_{v_{k-1}} + α K_k (y_k − g_{f̂_{θ_{k-1}}}(x_k, t_k))² K_k^T
end for

4. PRELIMINARY RESULTS

In this section the results of our preliminary experiments are presented. The proposed artificial problem is quite arbitrary; it has been chosen for its interesting nonlinearities so as to emphasize the potential benefits of the proposed approach. More results focusing on the application of this method to reinforcement learning are presented in [12]. Let X = [−10, 10]² and let f be a 2d nonlinear cardinal sine:

f(x^1, x^2) = sin(x^1)/x^1 + x^1 x^2 / 100

Let the observed nonlinear transformation be

t(x^1, x^2) = 10 tanh(x^1/7) + sin(x^2)

Recall that this analytical expression is only used to generate the observations in this experiment; it is not used by the regression algorithm. Let the nonlinear mapping g be

g_f(x^1, x^2, t(x^1, x^2)) = f(x^1, x^2) − γ max_{z ∈ X^2} f(t(x^1, x^2), z)

with γ ∈ [0, 1) a predefined constant. This is a specific form of the Bellman equation [3], with the transformation t being a deterministic transition function. Recall that this equation is of special interest in optimal control theory.
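Generating the training triples (x_k, t_k, y_k) for this benchmark can be sketched as follows; the max over z is approximated on a finite grid, and since the exact constants of f and t had to be reconstructed from a damaged source, treat them as illustrative:

```python
import numpy as np

GAMMA = 0.9
Z_GRID = np.linspace(-10.0, 10.0, 201)  # crude discretisation of X^2 for the max

def f(x1, x2):
    """2d nonlinear cardinal sine; np.sinc(u/pi) = sin(u)/u."""
    return np.sinc(x1 / np.pi) + x1 * x2 / 100.0

def t(x1, x2):
    """Observed nonlinear transformation."""
    return 10.0 * np.tanh(x1 / 7.0) + np.sin(x2)

def sample(rng, noise_std=0.05):
    """Draw x uniformly on X and build the Bellman-like observation y."""
    x1, x2 = rng.uniform(-10.0, 10.0, size=2)
    tk = t(x1, x2)
    y = f(x1, x2) - GAMMA * np.max(f(tk, Z_GRID)) + rng.normal(0.0, noise_std)
    return (x1, x2), tk, y

rng = np.random.default_rng(0)
(x1, x2), tk, y = sample(rng)
```

Only (x, t(x), y) would be fed to the filter; the analytical forms stay hidden from it.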
The associated state-space formulation is thus

θ_{k+1} = θ_k + v_k
y_k = f̂_{θ_k}(x_k^1, x_k^2) − γ max_{z ∈ X^2} f̂_{θ_k}(t(x_k^1, x_k^2), z) + n_k    (6)

4.1. Problem statement and settings

At each time step k the filter observes x_k ∼ U_X (x_k uniformly sampled from X), the transformation t_k = t(x_k^1, x_k^2) and y_k = f(x_k) − γ max_{z ∈ X^2} f(t_k, z). Once again, the regressor does not have to know the function t analytically, it just has to observe it. The function f is shown on Fig. 1(a) and its nonlinear mapping on Fig. 1(b). For these experiments the algorithm parameters were set to N = 1000,
Fig. 1. The 2-dimensional nonlinear cardinal sine: (a) the function itself; (b) the nonlinear mapping observed by the regressor.

σ_0^j = 4.4, ν = 0.1, α_0 = 0, σ_n = 0.05, α = 0.7, and all the variances (σ_{α_0}², σ_{μ_0^j}², σ_{σ_0^j}², σ_{v_0}²) to 0.1, for j = 1, 2. The factor γ was set to 0.9. Note that these empirical parameters were not finely tuned.

4.2. Quality of regression

Because of the special form of g_f, any bias added to f still gives the same observations, and other invariances may exist: the root mean square error (RMSE) between f and f̂ should not be used to measure the quality of the regression. Instead, the nonlinear mapping of f is computed from its estimate f̂ and is then used to calculate the RMSE. The quality of regression is thus measured with the following RMSE:

( ∫_X (g_f(x, t(x)) − g_{f̂_θ}(x, t(x)))² dx )^{1/2}

As it is computed from f̂, it really measures the quality of the regression, and as it uses the associated nonlinear mapping, it does not take the bias into account. Practically, it is computed on 10^4 equally spaced points. The function g_f is what is observed (Fig. 1(b)) and it is used by the SPKF to approximate the function f (Fig. 1(a)) by f̂_θ (Fig. 2(a)). We compute g_{f̂_θ} from f̂_θ (Fig. 2(b)) and use it to compute the RMSE. As the proposed algorithm is stochastic, the results have been averaged over 100 runs. Fig. 3 shows an errorbar plot, that is the mean ± standard deviation of the RMSE. The average number of kernels was ± .

Fig. 2. (a) Approximation f̂_θ of f; (b) the nonlinear mapping calculated from f̂_θ(x).

These results can be compared with the RMSE directly computed from f̂, that is ( ∫_X (f(x) − f̂_θ(x))² dx )^{1/2}, which is illustrated in Table 3. There is more variance and bias in these results because of the possible invariances of the considered nonlinear mapping. However, one can observe on Fig. 2(a) that a good approximation is obtained.

Table 3. RMSE (mean ± deviation) of g_{f̂_θ} and f̂_θ as a function of the number of samples.

Thus the RMSE computed from g_f is a better quality measure. As far as we know, no other method handling such a problem has been published so far, so comparison with other approaches is difficult. However, the order of magnitude of the RMSE obtained from g_f is comparable with state-of-the-art regression approaches when the function of interest is directly observed. This is demonstrated in Table 4. Here the problem is to regress a linear 2d cardinal sine f(x^1, x^2) = sin(x^1)/x^1 + x^2/100. For the proposed algorithm, the nonlinear mapping is the same as considered before. We compare RMSE results (10^3 training points) to
Kernel Recursive Least Squares (KRLS) and Support Vector Regression (SVR). For the proposed approach, the approximation is computed from the nonlinear mapping of observations. For KRLS and SVR, it is computed from direct observations. Notice that SVR is a batch method, that is, it needs the whole set of samples in advance. The target of the regressor is indeed g_f, and we obtain the same order of magnitude (even though many more nonlinearities are introduced for our method and the representation is sparser). The RMSE on f̂ is slightly higher for this contribution, but this can mostly be explained by the invariances (such as the bias invariance) induced by the nonlinear mapping.

Fig. 3. RMSE (g_{f̂_θ}) averaged over 100 runs.

Table 4. Comparative results (KRLS and SVR results are reproduced from [7]). The first columns give the test-error RMSE (from f̂ and g_{f̂} for the proposed algorithm, from f̂ only for the two others), and the last column the percentage of dictionary/support vectors kept from the training sample.

Method        | f̂ | g_{f̂} | %DVs/SVs
Proposed Alg. |    |        | %
KRLS          |    |        | %
SVR           |    |        | %

5. CONCLUSIONS AND FUTURE WORKS

An approach allowing to regress a function of interest f from observations obtained through a nonlinear mapping of it has been proposed. The regression is online, kernel-based, nonlinear and Bayesian. A preprocessing step allows to achieve sparsity of the representation. This method has proven to be effective on an artificial problem and reaches good performance although many more nonlinearities are introduced. This approach is planned to be extended to stochastic transformation functions (the observed part of the nonlinearities) in order to solve a more general form of the Bellman equation. The adaptive artificial process noise is planned to be extended, and an adaptive observation noise is under study. Finally, the approximation f̂_θ, being a mapping of the random variable θ, is a random variable itself. An uncertainty information can thus be derived from it, and it is planned to be linked to the quality of regression. Yet the uncertainty reduction occurring when acquiring more samples (thus more information) could be quantified.

6. REFERENCES

[1] B. Scholkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, MIT Press, 2002.
[2] C. M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, New York, USA, 1995.
[3] R. Bellman, Dynamic Programming, Dover Publications, sixth edition.
[4] V. N. Vapnik, Statistical Learning Theory, John Wiley & Sons, Inc., 1998.
[5] C. M. Bishop and M. E. Tipping, "Bayesian Regression and Classification," in Advances in Learning Theory: Methods, Models and Applications, vol. 190, IOS Press, NATO Science Series III: Computer and Systems Sciences, 2003.
[6] J. Vermaak, S. J. Godsill, and A. Doucet, "Sequential Bayesian Kernel Regression," in Advances in Neural Information Processing Systems, MIT Press.
[7] Y. Engel, S. Mannor, and R. Meir, "The Kernel Recursive Least Squares Algorithm," IEEE Transactions on Signal Processing, vol. 52, no. 8, 2004.
[8] E. A. Wan and R. van der Merwe, "The Unscented Kalman Filter for Nonlinear Estimation," in Adaptive Systems for Signal Processing, Communications, and Control Symposium, 2000.
[9] R. van der Merwe and E. A. Wan, "Sigma-Point Kalman Filters for Probabilistic Inference in Dynamic State-Space Models," in Workshop on Advances in Machine Learning, Montreal, Canada.
[10] R. E. Kalman, "A New Approach to Linear Filtering and Prediction Problems," Transactions of the ASME - Journal of Basic Engineering, vol. 82, Series D, 1960.
[11] Z. Chen, "Bayesian Filtering: From Kalman Filters to Particle Filters, and Beyond," Tech. Rep., Adaptive Systems Lab, McMaster University.
[12] M. Geist, O. Pietquin, and G. Fricout, "Bayesian Reward Filtering," in European Workshop on Reinforcement Learning (EWRL 2008), Lille, France, 2008, Lecture Notes in Artificial Intelligence, Springer Verlag, to appear.
More informationANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)
Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of
More informationADVANCED MACHINE LEARNING ADVANCED MACHINE LEARNING
1 ADVANCED ACHINE LEARNING ADVANCED ACHINE LEARNING Non-lnear regresson technques 2 ADVANCED ACHINE LEARNING Regresson: Prncple N ap N-dm. nput x to a contnuous output y. Learn a functon of the type: N
More information1 Motivation and Introduction
Instructor: Dr. Volkan Cevher EXPECTATION PROPAGATION September 30, 2008 Rce Unversty STAT 63 / ELEC 633: Graphcal Models Scrbes: Ahmad Beram Andrew Waters Matthew Nokleby Index terms: Approxmate nference,
More informationHidden Markov Models
CM229S: Machne Learnng for Bonformatcs Lecture 12-05/05/2016 Hdden Markov Models Lecturer: Srram Sankararaman Scrbe: Akshay Dattatray Shnde Edted by: TBD 1 Introducton For a drected graph G we can wrte
More informationMACHINE APPLIED MACHINE LEARNING LEARNING. Gaussian Mixture Regression
11 MACHINE APPLIED MACHINE LEARNING LEARNING MACHINE LEARNING Gaussan Mture Regresson 22 MACHINE APPLIED MACHINE LEARNING LEARNING Bref summary of last week s lecture 33 MACHINE APPLIED MACHINE LEARNING
More informationHidden Markov Models
Hdden Markov Models Namrata Vaswan, Iowa State Unversty Aprl 24, 204 Hdden Markov Model Defntons and Examples Defntons:. A hdden Markov model (HMM) refers to a set of hdden states X 0, X,..., X t,...,
More informationEcon107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)
I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes
More informationLOW BIAS INTEGRATED PATH ESTIMATORS. James M. Calvin
Proceedngs of the 007 Wnter Smulaton Conference S G Henderson, B Bller, M-H Hseh, J Shortle, J D Tew, and R R Barton, eds LOW BIAS INTEGRATED PATH ESTIMATORS James M Calvn Department of Computer Scence
More informationWeek 5: Neural Networks
Week 5: Neural Networks Instructor: Sergey Levne Neural Networks Summary In the prevous lecture, we saw how we can construct neural networks by extendng logstc regresson. Neural networks consst of multple
More informationThe exam is closed book, closed notes except your one-page cheat sheet.
CS 89 Fall 206 Introducton to Machne Learnng Fnal Do not open the exam before you are nstructed to do so The exam s closed book, closed notes except your one-page cheat sheet Usage of electronc devces
More informationNUMERICAL DIFFERENTIATION
NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.65/15.070J Fall 013 Lecture 1 10/1/013 Martngale Concentraton Inequaltes and Applcatons Content. 1. Exponental concentraton for martngales wth bounded ncrements.
More informationFeature Selection: Part 1
CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?
More informationThe Expectation-Maximization Algorithm
The Expectaton-Maxmaton Algorthm Charles Elan elan@cs.ucsd.edu November 16, 2007 Ths chapter explans the EM algorthm at multple levels of generalty. Secton 1 gves the standard hgh-level verson of the algorthm.
More informationOther NN Models. Reinforcement learning (RL) Probabilistic neural networks
Other NN Models Renforcement learnng (RL) Probablstc neural networks Support vector machne (SVM) Renforcement learnng g( (RL) Basc deas: Supervsed dlearnng: (delta rule, BP) Samples (x, f(x)) to learn
More informationGrover s Algorithm + Quantum Zeno Effect + Vaidman
Grover s Algorthm + Quantum Zeno Effect + Vadman CS 294-2 Bomb 10/12/04 Fall 2004 Lecture 11 Grover s algorthm Recall that Grover s algorthm for searchng over a space of sze wors as follows: consder the
More informationKernels in Support Vector Machines. Based on lectures of Martin Law, University of Michigan
Kernels n Support Vector Machnes Based on lectures of Martn Law, Unversty of Mchgan Non Lnear separable problems AND OR NOT() The XOR problem cannot be solved wth a perceptron. XOR Per Lug Martell - Systems
More information4DVAR, according to the name, is a four-dimensional variational method.
4D-Varatonal Data Assmlaton (4D-Var) 4DVAR, accordng to the name, s a four-dmensonal varatonal method. 4D-Var s actually a drect generalzaton of 3D-Var to handle observatons that are dstrbuted n tme. The
More informationNotes on Frequency Estimation in Data Streams
Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to
More informationEnsemble Methods: Boosting
Ensemble Methods: Boostng Ncholas Ruozz Unversty of Texas at Dallas Based on the sldes of Vbhav Gogate and Rob Schapre Last Tme Varance reducton va baggng Generate new tranng data sets by samplng wth replacement
More informationOn a direct solver for linear least squares problems
ISSN 2066-6594 Ann. Acad. Rom. Sc. Ser. Math. Appl. Vol. 8, No. 2/2016 On a drect solver for lnear least squares problems Constantn Popa Abstract The Null Space (NS) algorthm s a drect solver for lnear
More informationChapter 11: Simple Linear Regression and Correlation
Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests
More informationUncertainty as the Overlap of Alternate Conditional Distributions
Uncertanty as the Overlap of Alternate Condtonal Dstrbutons Olena Babak and Clayton V. Deutsch Centre for Computatonal Geostatstcs Department of Cvl & Envronmental Engneerng Unversty of Alberta An mportant
More information4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA
4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected
More information2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification
E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton
More informationLinear Classification, SVMs and Nearest Neighbors
1 CSE 473 Lecture 25 (Chapter 18) Lnear Classfcaton, SVMs and Nearest Neghbors CSE AI faculty + Chrs Bshop, Dan Klen, Stuart Russell, Andrew Moore Motvaton: Face Detecton How do we buld a classfer to dstngush
More informationWhich Separator? Spring 1
Whch Separator? 6.034 - Sprng 1 Whch Separator? Mamze the margn to closest ponts 6.034 - Sprng Whch Separator? Mamze the margn to closest ponts 6.034 - Sprng 3 Margn of a pont " # y (w $ + b) proportonal
More informationP R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering /
Theory and Applcatons of Pattern Recognton 003, Rob Polkar, Rowan Unversty, Glassboro, NJ Lecture 4 Bayes Classfcaton Rule Dept. of Electrcal and Computer Engneerng 0909.40.0 / 0909.504.04 Theory & Applcatons
More informationCIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M
CIS56: achne Learnng Lecture 3 (Sept 6, 003) Preparaton help: Xaoyng Huang Lnear Regresson Lnear regresson can be represented by a functonal form: f(; θ) = θ 0 0 +θ + + θ = θ = 0 ote: 0 s a dummy attrbute
More informationLectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix
Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could
More informationLogistic Regression. CAP 5610: Machine Learning Instructor: Guo-Jun QI
Logstc Regresson CAP 561: achne Learnng Instructor: Guo-Jun QI Bayes Classfer: A Generatve model odel the posteror dstrbuton P(Y X) Estmate class-condtonal dstrbuton P(X Y) for each Y Estmate pror dstrbuton
More informationOn an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1
On an Extenson of Stochastc Approxmaton EM Algorthm for Incomplete Data Problems Vahd Tadayon Abstract: The Stochastc Approxmaton EM (SAEM algorthm, a varant stochastc approxmaton of EM, s a versatle tool
More informationSparse Gaussian Processes Using Backward Elimination
Sparse Gaussan Processes Usng Backward Elmnaton Lefeng Bo, Lng Wang, and Lcheng Jao Insttute of Intellgent Informaton Processng and Natonal Key Laboratory for Radar Sgnal Processng, Xdan Unversty, X an
More informationGaussian Mixture Models
Lab Gaussan Mxture Models Lab Objectve: Understand the formulaton of Gaussan Mxture Models (GMMs) and how to estmate GMM parameters. You ve already seen GMMs as the observaton dstrbuton n certan contnuous
More informationRBF Neural Network Model Training by Unscented Kalman Filter and Its Application in Mechanical Fault Diagnosis
Appled Mechancs and Materals Submtted: 24-6-2 ISSN: 662-7482, Vols. 62-65, pp 2383-2386 Accepted: 24-6- do:.428/www.scentfc.net/amm.62-65.2383 Onlne: 24-8- 24 rans ech Publcatons, Swtzerland RBF Neural
More informationConjugacy and the Exponential Family
CS281B/Stat241B: Advanced Topcs n Learnng & Decson Makng Conjugacy and the Exponental Famly Lecturer: Mchael I. Jordan Scrbes: Bran Mlch 1 Conjugacy In the prevous lecture, we saw conjugate prors for the
More informationNeural networks. Nuno Vasconcelos ECE Department, UCSD
Neural networs Nuno Vasconcelos ECE Department, UCSD Classfcaton a classfcaton problem has two types of varables e.g. X - vector of observatons (features) n the world Y - state (class) of the world x X
More informationReport on Image warping
Report on Image warpng Xuan Ne, Dec. 20, 2004 Ths document summarzed the algorthms of our mage warpng soluton for further study, and there s a detaled descrpton about the mplementaton of these algorthms.
More informationFinite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin
Fnte Mxture Models and Expectaton Maxmzaton Most sldes are from: Dr. Maro Fgueredo, Dr. Anl Jan and Dr. Rong Jn Recall: The Supervsed Learnng Problem Gven a set of n samples X {(x, y )},,,n Chapter 3 of
More informationSuppose that there s a measured wndow of data fff k () ; :::; ff k g of a sze w, measured dscretely wth varable dscretzaton step. It s convenent to pl
RECURSIVE SPLINE INTERPOLATION METHOD FOR REAL TIME ENGINE CONTROL APPLICATIONS A. Stotsky Volvo Car Corporaton Engne Desgn and Development Dept. 97542, HA1N, SE- 405 31 Gothenburg Sweden. Emal: astotsky@volvocars.com
More informationCSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography
CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve
More informationCHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE
CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng
More informationNatural Language Processing and Information Retrieval
Natural Language Processng and Informaton Retreval Support Vector Machnes Alessandro Moschtt Department of nformaton and communcaton technology Unversty of Trento Emal: moschtt@ds.untn.t Summary Support
More informationLINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity
LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have
More informationSolving Nonlinear Differential Equations by a Neural Network Method
Solvng Nonlnear Dfferental Equatons by a Neural Network Method Luce P. Aarts and Peter Van der Veer Delft Unversty of Technology, Faculty of Cvlengneerng and Geoscences, Secton of Cvlengneerng Informatcs,
More informationC4B Machine Learning Answers II. = σ(z) (1 σ(z)) 1 1 e z. e z = σ(1 σ) (1 + e z )
C4B Machne Learnng Answers II.(a) Show that for the logstc sgmod functon dσ(z) dz = σ(z) ( σ(z)) A. Zsserman, Hlary Term 20 Start from the defnton of σ(z) Note that Then σ(z) = σ = dσ(z) dz = + e z e z
More informationParametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010
Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton
More informationVQ widely used in coding speech, image, and video
at Scalar quantzers are specal cases of vector quantzers (VQ): they are constraned to look at one sample at a tme (memoryless) VQ does not have such constrant better RD perfomance expected Source codng
More informationClassification as a Regression Problem
Target varable y C C, C,, ; Classfcaton as a Regresson Problem { }, 3 L C K To treat classfcaton as a regresson problem we should transform the target y nto numercal values; The choce of numercal class
More information3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X
Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number
More informationNatural Images, Gaussian Mixtures and Dead Leaves Supplementary Material
Natural Images, Gaussan Mxtures and Dead Leaves Supplementary Materal Danel Zoran Interdscplnary Center for Neural Computaton Hebrew Unversty of Jerusalem Israel http://www.cs.huj.ac.l/ danez Yar Wess
More information2 STATISTICALLY OPTIMAL TRAINING DATA 2.1 A CRITERION OF OPTIMALITY We revew the crteron of statstcally optmal tranng data (Fukumzu et al., 1994). We
Advances n Neural Informaton Processng Systems 8 Actve Learnng n Multlayer Perceptrons Kenj Fukumzu Informaton and Communcaton R&D Center, Rcoh Co., Ltd. 3-2-3, Shn-yokohama, Yokohama, 222 Japan E-mal:
More informationLecture 12: Discrete Laplacian
Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly
More informationProbability Theory (revisited)
Probablty Theory (revsted) Summary Probablty v.s. plausblty Random varables Smulaton of Random Experments Challenge The alarm of a shop rang. Soon afterwards, a man was seen runnng n the street, persecuted
More informationMaximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models
ECO 452 -- OE 4: Probt and Logt Models ECO 452 -- OE 4 Maxmum Lkelhood Estmaton of Bnary Dependent Varables Models: Probt and Logt hs note demonstrates how to formulate bnary dependent varables models
More informationA Hybrid Variational Iteration Method for Blasius Equation
Avalable at http://pvamu.edu/aam Appl. Appl. Math. ISSN: 1932-9466 Vol. 10, Issue 1 (June 2015), pp. 223-229 Applcatons and Appled Mathematcs: An Internatonal Journal (AAM) A Hybrd Varatonal Iteraton Method
More informationMaximum Likelihood Estimation (MLE)
Maxmum Lkelhood Estmaton (MLE) Ken Kreutz-Delgado (Nuno Vasconcelos) ECE 175A Wnter 01 UCSD Statstcal Learnng Goal: Gven a relatonshp between a feature vector x and a vector y, and d data samples (x,y
More informationCS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016
CS 29-128: Algorthms and Uncertanty Lecture 17 Date: October 26, 2016 Instructor: Nkhl Bansal Scrbe: Mchael Denns 1 Introducton In ths lecture we wll be lookng nto the secretary problem, and an nterestng
More informationMATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2)
1/16 MATH 829: Introducton to Data Mnng and Analyss The EM algorthm (part 2) Domnque Gullot Departments of Mathematcal Scences Unversty of Delaware Aprl 20, 2016 Recall 2/16 We are gven ndependent observatons
More informationA new Approach for Solving Linear Ordinary Differential Equations
, ISSN 974-57X (Onlne), ISSN 974-5718 (Prnt), Vol. ; Issue No. 1; Year 14, Copyrght 13-14 by CESER PUBLICATIONS A new Approach for Solvng Lnear Ordnary Dfferental Equatons Fawz Abdelwahd Department of
More informationECE559VV Project Report
ECE559VV Project Report (Supplementary Notes Loc Xuan Bu I. MAX SUM-RATE SCHEDULING: THE UPLINK CASE We have seen (n the presentaton that, for downlnk (broadcast channels, the strategy maxmzng the sum-rate
More informationPredictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore
Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.
More informationLecture 4. Instructor: Haipeng Luo
Lecture 4 Instructor: Hapeng Luo In the followng lectures, we focus on the expert problem and study more adaptve algorthms. Although Hedge s proven to be worst-case optmal, one may wonder how well t would
More information