Simultaneous Estimation of the State and Noise Statistics in Linear Dynamical Systems, by Paul D. Abramson, Jr., Electronics Research Center, Cambridge, Mass.


SIMULTANEOUS ESTIMATION OF THE STATE AND NOISE STATISTICS IN LINEAR DYNAMICAL SYSTEMS

by Paul D. Abramson, Jr.

Electronics Research Center
Cambridge, Mass.

NATIONAL AERONAUTICS AND SPACE ADMINISTRATION, WASHINGTON, D. C., MARCH 1970

1. Report No.: NASA TR R-332
4. Title and Subtitle: Simultaneous Estimation of the State and Noise Statistics in Linear Dynamical Systems
7. Author(s): Paul D. Abramson, Jr.
9. Performing Organization Name and Address: Electronics Research Center, Cambridge, Massachusetts
12. Sponsoring Agency Name and Address: National Aeronautics and Space Administration, Washington, D. C.
13. Type of Report and Period Covered: Technical Report
15. Supplementary Notes: Submitted by author to Massachusetts Institute of Technology as thesis for Doctor of Science degree (May 10, 1968)
16. Abstract: An optimal procedure for estimating the state of a linear dynamical system when the statistics of the measurement and process noise are poorly known is developed. The criterion of maximum likelihood is used to obtain an optimal estimate of the state and noise statistics. These estimates are shown to be asymptotically unbiased, efficient, and unique, with the estimation error normally distributed with a known covariance. The resulting equations for the estimates cannot be solved recursively, but an iterative procedure for their solution is presented. Several approximate solutions are presented which reduce the necessary computations in finding the estimates. Some of the approximate solutions allow a real time estimation of the state and noise statistics. Closely related to the estimation problem is the subject of hypothesis testing. Several criteria are developed for testing hypotheses concerning the values of the noise statistics that are used in the computation of the appropriate filter gains in a linear Kalman type state estimator. If the observed measurements are not consistent with the assumptions about the noise statistics, the estimation of the noise statistics should be undertaken using either optimal or suboptimal procedures. Numerical results of a digital computer simulation of the optimal and suboptimal solutions of the estimation problem are presented for a simple but realistic example.
17. Key Words: Statistics (state and noise); Optimal Procedure; Linear Dynamical System
18. Distribution Statement: Unclassified-Unlimited
19. Security Classif. (of this report): Unclassified
20. Security Classif. (of this page): Unclassified
22. Price*: $3.00

* For sale by the Clearinghouse for Federal Scientific and Technical Information, Springfield, Virginia 22151

SIMULTANEOUS ESTIMATION OF THE STATE AND NOISE STATISTICS IN LINEAR DYNAMICAL SYSTEMS*

By Paul D. Abramson, Jr.
Electronics Research Center

SUMMARY

An optimal procedure for estimating the state of a linear dynamical system when the statistics of the measurement and process noise are poorly known is developed. The criterion of maximum likelihood is used to obtain an optimal estimate of the state and noise statistics. These estimates are shown to be asymptotically unbiased, efficient, and unique, with the estimation error normally distributed with a known covariance. The resulting equations for the estimates cannot be solved recursively, but an iterative procedure for their solution is presented. Several approximate solutions are presented which reduce the necessary computations in finding the estimates. Some of the approximate solutions allow a real time estimation of the state and noise statistics.

Closely related to the estimation problem is the subject of hypothesis testing. Several criteria are developed for testing hypotheses concerning the values of the noise statistics that are used in the computation of the appropriate filter gains in a linear Kalman type state estimator. If the observed measurements are not consistent with the assumptions about the noise statistics, the estimation of the noise statistics should be undertaken using either optimal or suboptimal procedures.

Numerical results of a digital computer simulation of the optimal and suboptimal solutions of the estimation problem are presented for a simple but realistic example.

*Submitted to the Department of Aeronautics and Astronautics, Massachusetts Institute of Technology, on May 10, 1968, in partial fulfillment of the requirements for the degree of Doctor of Science.

TABLE OF CONTENTS

Chapter 1 INTRODUCTION
1.1 Statement and Discussion of the Problem
1.2 Historical Background
1.3 Summary of Thesis

Chapter 2 EXPECTATION OPERATORS AND MAXIMUM LIKELIHOOD ESTIMATION
2.1 Introduction
2.2 Conditional and Unconditional Expectation Operators
2.3 Maximum Likelihood State Estimation

Chapter 3 MAXIMUM LIKELIHOOD ESTIMATION OF NOISE COVARIANCE PARAMETERS AND THE SYSTEM STATE
3.1 Introduction
3.2 Summary of Previous Results in Maximum Likelihood Estimation
3.3 Derivation of the Likelihood Function
3.4 Asymptotic Properties of Noise Covariance and System State Maximum Likelihood Estimators
3.5 Selection of the A Priori Noise Covariance Distribution
3.6 Computation of the Estimate
3.7 Computation of the Information Matrix

Chapter 4 SUBOPTIMAL SOLUTIONS OF THE ESTIMATION PROBLEM
4.1 Introduction
4.2 Linearized Maximum Likelihood Solution
4.3 Near Maximum Likelihood Solution
4.4 Explicit Suboptimal Solutions
4.5 Review of Procedures Suggested by Others

Chapter 5 TESTING OF STATISTICAL HYPOTHESES
5.1 Introduction
5.2 Sampling Characteristics and Distributions
5.3 Confidence Intervals
5.4 Tests on the Mean
5.5 Tests on the Variance
5.6 Multidimensional Hypothesis Tests with Time Varying Population Parameters
5.7 Application of Hypothesis Tests to Maximum Likelihood State Estimation

Chapter 6 NUMERICAL RESULTS
6.1 Introduction
6.2 Description of System and Measurement
6.3 Effect of Incorrect Noise Covariance Parameters Upon Maximum Likelihood State Estimation
6.4 Comparison of State and Noise Covariance Estimation Procedures
6.5 Hypothesis Tests Results

Chapter 7 CONCLUSION
7.1 Summary of Results
7.2 Suggestions for Further Study

Appendix B Evaluation of Explicit Estimator Mean Squared Error
References
Biography

LIST OF ILLUSTRATIONS

Figure
3.1 Truncated Normal Distribution
Gamma Distribution with p = ...
Test Power vs ... and a
Test Power vs ... and a for Fixed ...
Test Power vs ... and ... for Fixed a
Test Power vs ... and a for Fixed ...
Test Power vs ... and ... for Fixed a
Covariance vs Estimated R
Covariance vs Estimated Q
Covariance vs Estimated R
Covariance vs Estimated Q
Maximum Likelihood Solution Run ...
Maximum Likelihood Solution Run ...
Maximum Likelihood Solution Run ...
Maximum Likelihood Solution Run ...
Maximum Likelihood Solution Run ...
Maximum Likelihood Solution Run 5 (cont'd.)
Explicit Suboptimal Solution Run ...
Explicit Suboptimal Solution Run ...
Explicit Suboptimal Solution Run ...
Explicit Suboptimal Solution Run ...
Explicit Suboptimal Solution Run ...
Explicit Suboptimal Solution Run ...
6.17 Variance of M. L. E. vs R
6.18 Variance of M. L. E. vs Q
6.19 Conditional Mean of Explicit Suboptimal Estimator vs $R_0$
6.20 Conditional Mean of Explicit Suboptimal Estimator vs $Q_0$
6.21 Conditional Variance of Explicit Suboptimal Estimator about Bias Value

LIST OF TABLES

Table
6.1 Monte Carlo Run 1: Maximum Likelihood and Linearized Maximum Likelihood Solutions
6.2 Monte Carlo Run 2: Maximum Likelihood and Linearized Maximum Likelihood Solutions
6.3 Monte Carlo Run 3: Maximum Likelihood and Linearized Maximum Likelihood Solutions
6.4 Monte Carlo Run 4: Maximum Likelihood and Linearized Maximum Likelihood Solutions
6.5 Monte Carlo Run 5: Maximum Likelihood and Linearized Maximum Likelihood Solutions
6.6 Monte Carlo Run 6: Explicit Suboptimal Solution
6.7 Monte Carlo Run 7: Explicit Suboptimal Solution
6.8 Monte Carlo Run 8: Explicit Suboptimal Solution
6.9 Monte Carlo Run 9: Explicit Suboptimal Solution
6.10 Monte Carlo Run 10: Explicit Suboptimal Solution
6.11 Monte Carlo Run 11: Explicit Suboptimal Solution
6.12 Hypothesis Test Run 1: R and Q Test
6.13 Hypothesis Test Run 2: Noise Bias Test

Chapter 1

INTRODUCTION

1.1 Statement and Discussion of the Problem

Optimal estimation has received considerable attention in recent years in fields such as space navigation, statistical communication theory, and many others that often require the estimation of certain variables that are either not directly measurable or are being measured with instruments that are not sufficiently accurate for an adequate deterministic solution. In essence the procedures aim at reducing the effects of random disturbances associated with these "imperfect" instruments.

In many situations, the estimation procedure consists of no more than averaging repeated measurements of the "same" quantity made with the same or different instruments. In this way, the random errors made in each measurement might "average out," resulting in a higher confidence in the value of the quantity being measured than would be the case if only a single measurement were taken. In this type of operation, the improved confidence in the estimate depends upon the fact that the "same" quantity being measured is truly time invariant.

In more complex situations, the quantity being measured might change from one measurement time to another. Suppose it is known that the voltage across an electrical network decreases exponentially with time. A simple average of repeated measurements of the voltage made at different times would lead to an erroneous estimate. However, if the time constant associated with the exponential decay is known, then each measured voltage can be related to the voltage at any specified time. These computed voltages can then be averaged to obtain an estimate of the voltage at the specified time.

The examples illustrated above represent the simplest case of estimation, in which each measurement carries the same weight so that simple linear averaging of the measurements is performed to obtain the estimate. However, if each measurement has associated with it a different confidence, usually characterized by the variance of the measurement error, then a more complicated estimation scheme must be employed which takes into account the differing accuracies of the measurements. Typical examples of this situation are: 1) when two or more different types of instruments are used to measure the same quantity, or 2) in the case of the previous example, when there is some random characteristic in the exponential function of the voltage being measured. This leads to a reduction in the confidence in relating measurements made at some time distant from the specified time.

Operational or computational procedures involving a consideration of the variances of the various noises in the problem represent the first degree of sophistication in estimation. Various formulations have been advanced which characterize the statistical nature of the problem in some orderly pattern. There are two widely used techniques for optimal estimation when the time variation of the quantity being measured can be described by a linear differential equation and when the measurements are linearly related to the quantity being estimated.

The initial significant work on this problem was by Wiener (Ref. 36), who developed the condition to be satisfied for optimal estimation in the least mean-squared-error sense. This condition is generally referred to as the Wiener-Hopf integral equation. He also developed the solution for the case of a time invariant system with stationary noise processes. This work and further extensions and modifications by others are known as Wiener filters. In the Wiener filter, the measurement information is acknowledged to have a signal and a noise component. The filter, which is usually implemented as a linear analog filter, is designed so that the noise component of the measurement is more heavily attenuated than the signal component, thus allowing extraction of as much information from the measurement as is possible. However, non-time-stationary, transient, or multiple input-output problems are difficult to solve by the Wiener approach.

Kalman (Ref. 16) treated the estimation problem from a different point of view and formulated the equivalent of the Wiener-Hopf integral equation as a vector-matrix differential equation in state space. He developed the solution for a linear system with normally distributed noises as a set of vector-matrix difference equations which are commonly termed the "Kalman filter." Information about the dynamics of the process being measured, statistics of the disturbances involved, and a priori knowledge of the quantities being estimated are included in the formulation of the problem. In the Kalman filter, the estimation proceeds from any chosen starting time and is well suited for situations dominated by a transient mode, such as the launching of a space vehicle. In the steady state, the Kalman filter can be shown to be equivalent to a Wiener filter and thus can be considered as a more general formulation of the estimation problem. Further advantages of the Kalman filter are that the computations are performed recursively, in the time domain, and are readily applicable to nonstationary and multiple input-output systems.

In the standard formulation of the Kalman estimation procedure, allowance is made for a variation of the noise variances with respect to time. However, this variation is assumed to be known prior to the actual filter operation. In an operational situation, the time varying filter gains can be precomputed and stored in the filter to be used in conjunction with the measurement information to obtain the optimal estimate. As an estimation procedure of the first degree of sophistication, i.e., with the consideration of the noise variances, this is indeed a very powerful and generally applicable procedure.

Kalman filtering can be thought of as a method of combining in an optimal fashion all information up to and including the latest measurement to provide an estimate at that time. The proper weighting to apply to the new measurement is determined by the relative "quality" of the new information as compared to the information contained in the estimate before the latest measurement. Poor measurements will receive less weight than good ones. If there is noise driving the system between measurement times, the filter will weight the extrapolated value of the old estimate less than if there were no noise. This is because noise introduces an uncertainty in the state of the system between measurement times. Consequently the estimate will depend less upon old estimates and more upon new measurements. The appropriate measures of the "quality" of the old estimate and the new measurement are respectively the covariance of the old estimation error and the covariance of the new measurement error.

These important points can be clarified by considering the following simple example. Let $x_n$ represent the scalar state of a system at time "n." If the system can be described by a linear differential equation, then the state at time "n" can be related to the state at time "n-1" by the difference equation

$$x_n = \Phi(n,n-1)\,x_{n-1} + \Gamma_n w_n$$

$\Phi(n,n-1)$ is the state transition matrix and extrapolates the state from time n-1 to time n if the effects of $w_n$ are ignored. $\Gamma_n$ is the "forcing function matrix" and $w_n$ is the state "driving noise," which is assumed to be a zero mean uncorrelated normally distributed noise with variance $Q_n$. Let $\hat{x}_{n|n-1}$ represent the estimate of $x_n$ obtained after processing n-1 measurements and let $P_{n|n-1}$ represent the variance of the estimation error after n-1 measurements.

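The measurement at time n is given by

$$z_n = H_n x_n + v_n$$

where $v_n$ is additive noise representing the error in the measurement and $H_n$ is the "observation matrix" which relates the measurement to the state. In this example, $z_n$ is a scalar and $H_n = 1$. It is assumed that $v_n$ is a zero mean uncorrelated normally distributed noise with variance $R_n$. The scalar Kalman filter equation for incorporating this new measurement into the state estimate is given by

$$\hat{x}_{n|n} = \frac{R_n}{P_{n|n-1} + R_n}\,\hat{x}_{n|n-1} + \frac{P_{n|n-1}}{P_{n|n-1} + R_n}\,z_n$$

The variance of the estimation error after incorporation of the new measurement is given by

$$P_{n|n} = \frac{P_{n|n-1}\,R_n}{P_{n|n-1} + R_n}$$

If the state estimate $\hat{x}_{n|n-1}$ is very good compared with the information contained in $z_n$, then

$$P_{n|n-1} \ll R_n$$

and thus

$$\hat{x}_{n|n} \approx \hat{x}_{n|n-1} \quad \text{and} \quad P_{n|n} \approx P_{n|n-1}$$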
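The scalar measurement update can be checked numerically. The following is a minimal sketch, assuming the standard scalar Kalman update with H = 1, in which the weight on the new measurement is P/(P + R) and the updated variance is PR/(P + R). The function name `scalar_update` and the numerical values are illustrative assumptions, not part of the report.

```python
def scalar_update(x_prior, p_prior, z, r):
    """Combine a prior estimate with one scalar measurement (assumes H = 1)."""
    gain = p_prior / (p_prior + r)            # weight given to the new measurement
    x_post = x_prior + gain * (z - x_prior)   # equals (1 - gain) * x_prior + gain * z
    p_post = p_prior * r / (p_prior + r)      # error variance after the update
    return x_post, p_post


# Prior far more accurate than the measurement: the measurement is nearly ignored.
x_good, p_good = scalar_update(x_prior=10.0, p_prior=1e-6, z=12.0, r=1.0)

# Prior far less accurate than the measurement: the estimate moves almost onto z.
x_poor, p_poor = scalar_update(x_prior=10.0, p_prior=1e6, z=12.0, r=1.0)

# Equal confidence: the estimate is the midpoint and the variance is halved.
x_mid, p_mid = scalar_update(x_prior=10.0, p_prior=1.0, z=12.0, r=1.0)

print(x_good, p_good)
print(x_poor, p_poor)
print(x_mid, p_mid)
```

The three calls correspond to a very good prior, a very poor prior, and an intermediate case in which the result is a linear combination of the old estimate and the new measurement.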

In this case, where the old estimate is far more accurate than the measurement, the measurement datum is effectively rejected because it is so noisy that it is virtually useless. Since no new information has been added, the variance of the estimation error remains the same after the measurement.

In the other extreme case, suppose $\hat{x}_{n|n-1}$ is of very poor quality compared with the information contained in $z_n$. Then

$$P_{n|n-1} \gg R_n$$

and thus

$$\hat{x}_{n|n} \approx z_n \quad \text{and} \quad P_{n|n} \approx R_n$$

In this case, the estimate $\hat{x}_{n|n-1}$ is effectively rejected and the estimate $\hat{x}_{n|n}$ is based upon the single measurement $z_n$. In all cases falling between these two extremes, the estimate $\hat{x}_{n|n}$ is a linear combination of the old estimate $\hat{x}_{n|n-1}$ and the new measurement $z_n$.

Before computing the proper weighting factors given above, the variance of the state estimation error before the measurement at time n must be found. This can be done by studying how the actual state changes between time n-1 and time n and how the state estimate changes in this same time interval. Let $\hat{x}_{n-1|n-1}$ be the estimate of the state $x_{n-1}$ after the measurement at time n-1. Since $w_n$ is a zero mean independent random variable, the best estimate of the state at time n based upon the n-1 measurements is given by

$$\hat{x}_{n|n-1} = \Phi(n,n-1)\,\hat{x}_{n-1|n-1}$$

If $P_{n-1|n-1}$ represents the covariance of the estimation error at time n-1, it can be seen that

$$P_{n|n-1} = \Phi^2(n,n-1)\,P_{n-1|n-1} + \Gamma_n^2\,Q_n$$

A large driving noise variance will cause a large increase in the mean squared error in the estimate when it is extrapolated from one measurement time to the next.

The filter equations given above are for the case of a scalar state and measurement. In Chapter 2, the more general case of a vector state and measurement is treated. However, even in more complicated situations, the same interpretation can be applied to the operation of the filter. The primary purpose of the filter is to compute and apply the proper weighting factors so that the new measurement information can be incorporated with an old estimate of the state to provide a combined and improved state estimate.

Precise knowledge of the measurement and driving noise statistics is of fundamental importance in the operation of a Kalman filter. However, in any operational situation, the statistics of the noises that are used in the filter are in fact only estimates or predictions of the statistics of the noises that will actually be encountered. In some cases these estimates might be quite accurate, but in other cases they may be sufficiently in error to adversely affect the filter. One effect of this can be a large discrepancy between the state estimation error covariance matrix as computed within the filter and the "actual" state estimation error covariance. If there is a difference between the computed and actual covariances of the old state estimate, the filter can make an error in computing the weighting for a new measurement. This subject is treated fully in Chapter 2 but it can be understood by considering the following example.

Suppose that it is assumed that there is no noise driving the state when in fact driving noise is present. Then the computed covariance of the state estimation error will generally be smaller than the actual estimation error covariance. This is because the driving noise introduces an error in extrapolating the state estimate from one measurement time to the next which is not accounted for in the computed state estimation error covariance matrix. The filter "thinks" it is doing a better job of estimating the state than is actually the case. If the filter thinks the old state estimate is much better than it actually is, it may assign little weight to new measurement information and thus effectively discard this new information. Of course, this is exactly the wrong thing to do. The old state estimate may be of very poor quality, so that the new measurement information should be weighted quite heavily. However, in its ignorance, the filter fails to do this and as a result the actual estimation error may become very large while the filter "thinks" it is doing a good job of estimating the state.

A similar problem can arise in the case of vector measurements. If the relative quality of the different measurements is not well known, then more weight might be given to a measurement taken with an inaccurate instrument than to a very accurate one. This would lead to a greater estimation error than would be the case if the relative accuracy of the different measurements were known and the proper weighting assigned to each.

A priori estimates of the statistics of the noises can be obtained in several ways. They may be no more than educated guesses as to what noise environment may actually exist. It is often very difficult to predict with accuracy the operating conditions of a complicated and interrelated system, especially in research and development applications when little may be known before an experiment is conducted. Another technique for obtaining the statistics of the noises is the analysis of previous experiments. These experiments may have been conducted in an operational environment or in the controlled environment of a laboratory. In either case, it is rarely possible to have complete confidence in the estimates of the noise statistics, due to the necessarily finite number of experiments that can be performed and possible problems associated with the inability to isolate and distinguish the various effects of the different noises. And there is still a question as to whether the environment will remain constant between the time these estimates of the statistics are obtained and when the estimates are subsequently used in the Kalman filter. Thus in many situations, the assumption that the a priori estimates of the statistics of the measurement and driving noises are good estimates may not be justified.

The primary objective of this work is to develop an optimal estimator of the state that remains optimal when the statistics of these noises are not precisely known a priori. In the process of estimating the state under these conditions, optimal estimates of the measurement and driving noise statistics are also obtained.

In developing optimal estimators for the state and noise statistics, it is not assumed that the statistics of the noises are known precisely a priori. Instead, it is assumed that the uncertainty in knowledge of these statistics has a particular distribution about some a priori value. This is completely analogous to the usual assumption made in Kalman filtering that the initial state of the system is not known precisely, but rather the uncertainty in knowledge of the state can be described by a suitable probability density function. In both cases, it is assumed that the distribution of the uncertainty is known a priori. This represents the second degree of sophistication in estimation procedures. It reduces by one level the necessary specification of the values of the noise statistics. Instead of having to specify their exact values, all that need be specified is the possible distribution these values might have. In fact, it will subsequently be shown that the exact shape of this distribution is relatively unimportant when a large number of measurements has been taken.

The above discussion can be clarified by considering the following simple example. It will be shown that the a priori estimates of the noise statistics can be improved at the same time that the state is being estimated. All measurements contain some information about the noises as well as the state, whether these measurements are taken in the laboratory or in an operational environment. So a procedure can be devised to utilize this information about the noises actually encountered to improve our knowledge of the noise statistics.

Suppose the state that is to be estimated is a time invariant scalar and the measurements of the state are given by

$$z_n = x + v_n$$

where x is the constant state and $v_n$ is a zero mean independent normally distributed measurement noise with time invariant variance R. If a single measurement is taken, the optimal estimate of the state x is given by

$$\hat{x}_{1|1} = z_1$$

and the variance of the state estimation error is given by

$$P_{1|1} = R$$

If repeated measurements are performed, it is easy to show that the optimal estimate of the state after the nth measurement is given by

$$\hat{x}_{n|n} = \frac{1}{n}\sum_{j=1}^{n} z_j$$

and the variance of the state estimation error is

$$P_{n|n} = \frac{R}{n}$$

Thus increasing the number of measurements decreases the variance of the estimation error by the factor (1/n). Note that in this simple example, the measurement noise variance is not needed to define the optimal estimate of the state. This is a consequence of the fact that if the actual measurement noise variance is assumed to be time invariant, and if there is no a priori information about the state, then all measurements are given the same weight, regardless of the actual value of R. However, the variance of the state estimation error does depend upon the actual value of R as given above. In more complicated situations, such as vector measurements or when there is noise driving the state, the optimal state estimate does depend upon the relative sizes of the noise covariances involved. But in this case, only the variance of the state estimation error depends upon R.

If the value of R is unknown, its value can be estimated from the measurements themselves. In the above case, when the true state is time invariant, an estimate of R can be defined by

$$\hat{R} = \frac{1}{n-1}\sum_{j=1}^{n}\left(z_j - \hat{x}_{n|n}\right)^2$$

It is easy to show that such an estimate is an unbiased estimate of the noise variance. The expected value of $\hat{R}$ is given by

$$E(\hat{R}) = \frac{1}{n-1}\sum_{j=1}^{n} E\left[\left(z_j - \hat{x}_{n|n}\right)^2\right]$$

where E( ) represents an average over the ensemble of all possible measurement noises with covariance R. It can be seen that

$$z_j - \hat{x}_{n|n} = v_j - \tilde{x}_{n|n}$$

where

$$\tilde{x}_{n|n} = \hat{x}_{n|n} - x$$

and

$$\tilde{x}_{n|n} = \frac{1}{n}\sum_{k=1}^{n} v_k$$

so

$$\left(z_j - \hat{x}_{n|n}\right)^2 = v_j^2 + \frac{1}{n^2}\sum_{k=1}^{n}\sum_{s=1}^{n} v_k v_s - \frac{2}{n}\sum_{k=1}^{n} v_j v_k$$

and

$$E\left[\left(z_j - \hat{x}_{n|n}\right)^2\right] = R + \frac{1}{n}R - \frac{2}{n}R = \frac{n-1}{n}R$$

In obtaining the above expression, use was made of the independence of the measurement noises at different times. Then

$$E(\hat{R}) = \frac{1}{n-1}\sum_{j=1}^{n}\frac{n-1}{n}R = R$$
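The unbiasedness result can be illustrated with a small Monte Carlo experiment. This is a sketch under stated assumptions: the state estimate is the equal-weight sample mean, the noise variance estimate is the sum of squared deviations about that mean divided by n - 1, and the measurement noise is Gaussian. The function names, seed, and numerical values are hypothetical and chosen only for illustration.

```python
import random


def estimate_state_and_r(z):
    """Return the sample-mean state estimate and the noise variance estimate."""
    n = len(z)
    x_hat = sum(z) / n                                     # equal-weight state estimate
    r_hat = sum((zj - x_hat) ** 2 for zj in z) / (n - 1)   # divisor n - 1 removes the bias
    return x_hat, r_hat


def monte_carlo_mean_r_hat(x_true, r_true, n, trials, seed=0):
    """Average r_hat over many independent simulated sets of n measurements."""
    rng = random.Random(seed)
    sigma = r_true ** 0.5
    total = 0.0
    for _ in range(trials):
        z = [x_true + rng.gauss(0.0, sigma) for _ in range(n)]
        total += estimate_state_and_r(z)[1]
    return total / trials


mean_r_hat = monte_carlo_mean_r_hat(x_true=5.0, r_true=4.0, n=10, trials=20000)
print(mean_r_hat)  # close to r_true = 4.0, up to sampling error
```

Replacing the divisor n - 1 by n in this sketch would make the ensemble average fall below the true variance by the factor (n - 1)/n, which is the term that appears in the derivation above.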

It can be shown that the variance of the $\hat{R}$ estimation error is given by

$$E\left[(\hat{R} - R)^2\right] = \frac{2R^2}{n-1}$$

Thus as the number of measurements increases, the variance of the noise variance estimation error becomes small and $\hat{R}$ becomes an arbitrarily good estimate of the actual measurement noise variance. With an estimate of R, an estimate of the state estimation error variance can be obtained:

$$\hat{P}_{n|n} = \frac{\hat{R}}{n}$$

As was mentioned before, in most cases some estimate of the measurement noise variance is available before the above measurements are taken. Suppose an estimate of R is obtained from a series of measurements and it differs from an a priori value obtained by some other means. Now the question is which value more accurately represents the variance of the measurement noise, the a priori value or the value obtained from the measurements. The concepts of relative weighting discussed in connection with Kalman state estimation offer a solution to this problem.

There is usually some measure of accuracy associated with the a priori estimate of R. This measure is often the variance of possible deviations of the actual value of R about the a priori estimate. If it is felt that the a priori estimate is highly accurate, the variance about the true value would be small. Conversely, if it is felt that the a priori estimate of R is highly inaccurate, the variance about the true value of R would be large. A combined estimate of the measurement noise variance can be defined by

$$\hat{R}_c = \frac{\sigma_{\hat{R}}^2}{\sigma_{R_0}^2 + \sigma_{\hat{R}}^2}\,R_0 + \frac{\sigma_{R_0}^2}{\sigma_{R_0}^2 + \sigma_{\hat{R}}^2}\,\hat{R}$$

where $R_0$ is the a priori estimate of R, $\hat{R}$ is the estimate obtained from the measurements, $\sigma_{R_0}^2$ is the variance of the true value of R about the a priori estimate, and $\sigma_{\hat{R}}^2$ is the variance of the true value about the estimate $\hat{R}$. $\sigma_{\hat{R}}^2$ is given by

$$\sigma_{\hat{R}}^2 = E\left[(\hat{R} - R)^2\right] = \frac{2R^2}{n-1}$$

In order to compute $\sigma_{\hat{R}}^2$, the true value of R must be known. However, for n moderately large, the approximation can be made

$$\sigma_{\hat{R}}^2 \approx \frac{2\hat{R}^2}{n-1}$$

By analogy to the state estimation problem, a measure of the variance of the combined estimate of the measurement noise variance is given by

$$\sigma_{R_c}^2 = \frac{\sigma_{\hat{R}}^2\,\sigma_{R_0}^2}{\sigma_{\hat{R}}^2 + \sigma_{R_0}^2}$$
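The combination of an a priori noise variance with one estimated from the measurements can be sketched in a few lines. The sketch assumes each estimate is weighted by the variance of the other, that the combined variance is the product of the two variances divided by their sum, and that the variance of the measurement-based estimate is approximated by 2·r_hat²/(n − 1). The function name `combine_r_estimates` and the numbers are illustrative assumptions.

```python
def combine_r_estimates(r0, var_r0, r_hat, n):
    """Blend an a priori noise variance r0 with r_hat estimated from n measurements."""
    var_r_hat = 2.0 * r_hat ** 2 / (n - 1)            # approximation: r_hat stands in for true R
    total = var_r0 + var_r_hat
    r_c = (var_r_hat * r0 + var_r0 * r_hat) / total   # each estimate weighted by the other's variance
    var_r_c = var_r0 * var_r_hat / total              # never larger than either input variance
    return r_c, var_r_c


# Few measurements: r_hat is uncertain, so the result stays near the a priori value.
few = combine_r_estimates(r0=1.0, var_r0=0.01, r_hat=2.0, n=5)

# Many measurements: r_hat is well determined and dominates the result.
many = combine_r_estimates(r0=1.0, var_r0=0.01, r_hat=2.0, n=100001)

# Equal variances: the result is the simple average of the two estimates.
equal = combine_r_estimates(r0=1.0, var_r0=2.0, r_hat=2.0, n=5)

print(few, many, equal)
```

The first two calls show the two limiting behaviors, and the third shows an intermediate case in which the combined value is a linear combination of the two estimates.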

If the a priori estimate $R_0$ is of high accuracy compared with $\hat{R}$, then

$$\sigma_{R_0}^2 \ll \sigma_{\hat{R}}^2$$

and thus

$$\hat{R}_c \approx R_0 \quad \text{and} \quad \sigma_{R_c}^2 \approx \sigma_{R_0}^2$$

If the a priori estimate is of low accuracy compared with $\hat{R}$, then

$$\sigma_{R_0}^2 \gg \sigma_{\hat{R}}^2$$

and thus

$$\hat{R}_c \approx \hat{R} \quad \text{and} \quad \sigma_{R_c}^2 \approx \sigma_{\hat{R}}^2$$

In all cases falling between these two extremes, the estimate $\hat{R}_c$ is a linear combination of the a priori estimate and the estimate obtained from the measurements.

Of course, the situation is not always as simple as in the previous example. The state may be a time-varying vector with additive driving noise. The measurements may be vectors, indicating that several measurement devices of possibly differing accuracies are used to measure the state at any time. In such cases, the problem of simultaneously estimating the state and the noise covariances becomes much more complicated.

29 The resultig equatios for optimal estimates of the state ad oise statistics are geerally coupled oliear equatios that must be solved by some umerical procedure. But the essece of the problem is the same. From the iformatio cotaied i the measuremets take i a operatioal eviromet, improvemets ca be made i the estimates of ot oly the state but also the statistics of the measuremet ad drivig oises. The performace of the state estimator i such a situatio ca be improved compared with the estimator that uses icorrect values of the oise statistics i com- putig the appropriate filter gais. Optimal state estimatio whe the statistics of the measuremet ad drivig oises are poorly kow is but oe class of problems withi the more geeral area of state estimatio i the presece of "modelig errors." I the formulatio of the Kalma filter, it is assumed that the dyamics of the system ca be accurately modeled as a set of liear differetial or differece equatios with precisely kow coefficiets. This is reflected i the value of the state trasitio matrix that is used to extrapolate the state estimate from oe measuremet time to the ext. I fact, the modelig of the system might ivolve approximatios. The umber of state variables that are ecessary to accurately model the system might be so great that the umber of computatios eeded to estimate all of the variables becomes prohibitively large. Ofte the umber of computatios ca be 18

30 1~~~ icludig oly the most sigificat state vari- ables i the filter model. This will reduce the complexity of the filter but ca itroduce additioal errors i the estimatio of the reduced umber of state variables, It may ot be possible to model the system dyamics by ay set, o matter how large, of liear differetial equa- tios. The motio of the state might be described by a set of oliear differetial equatios which ca oly be approximated by a set of liear differetial equatios describig the motio of the system about some omial path, This too ca itroduce errors i the state estimatio that are ot accouted for withi the model, There are other sources of modelig error. The elemets of the state trasitio matrices used withi the filter may ot be accurately kow. The actual measuremets may be a oliear fuctio of the state although it was assumed i the derivatio of the filter equatios that the measuremets are a liear fuctio of the state. These oliearities may ot be highly sigificat but they ca cause additioal state estimatio errors. ll of these "modelig errors," icludig iaccurately kow oise statistics, ca result i a degradatio of the Kalma filter performace. May authors have studied the problem of optimal estiiatio ad cotrol of a liear plat whose parameters may ot be accurately kow. comprehesive list of refereces I this subject would be prohibitively log. For this $aso, the oly works cited here are those that have some 19

31 I bearig o the problem of optimal state estimatio i the presece of modelig errors. Spag (Ref. 34) has studied the problem of optimal cotrol of a liear plat with ukow coefficiets uder the assumptios that there is o measuremet oise ad the statistics of the oise drivig the state are precisely kow. He also assumes that the ucertaity i kowledge of the coefficiets describig the plat have some distributio of values that ca be represeted by a probability desity fuctio of coefficiet values. The optimal cotrol sigal which miimizes a quadratic error measure is obtaied by fidig the coditioal mea of the system trackig error, coditioed upo the actual measuremets of the system but averaged over the distributio of all possible plat coefficiet values. I this way, the error is miimized over the esemble of all possible trials with systems whose parameters vary i a fashio described by the assiged probability desity fuctio. No attempt is made to estimate the actual plat coefficiets. lthough Spag is cocered primarily with optimal cotrol, several of the cocepts he develops have direct applicatio to optimal state estimatio whe the parameters of the system are ukow. Dreick (Ref. 8) has also studied this problem. He also assumes that the ucertaity i the parameters of a liear plat ca be described by a probability desity fuctio whose first two momets are kow. His optimal cotrol sigal miimizes the coditioal mea squared trackig error ad is a fuctio of the measuremets o the 20

system and the first two moments of the parameter distributions. However, using his procedure, there is no way to estimate the values of the unknown system parameters except in a very restricted set of problems.

Magill (Ref. 21) takes an interesting and rather unique approach to the problem of optimal state estimation when certain statistical parameters of the problem are unknown. These parameters, called the parameter vector, are assumed to come from a finite set of values that are known a priori. The optimal estimator is composed of a set of Kalman type state estimators, with each filter using one of the finite number of parameter vectors to compute the proper measurement gains. The outputs of the filters are weighted and added, with the weighting of each filter output being determined by the conditional probability that the parameter vector being used in that filter is the true parameter vector. These conditional probabilities are functions of the measurements and are obtained by relatively simple but nonlinear calculations.

The following works are primarily concerned with obtaining relatively simple and easy-to-use procedures rather than finding an "optimal" solution to the problem. The approaches to the problem are quite different but there is one common feature. This feature is the real time examination of measurement residuals to determine if a Kalman type state estimator is performing as predicted. The measurement residual is defined as the difference between an actual measurement and the predicted measurement, this prediction

being based upon the predicted state at the time of the measurement. If the measurement at time $n$ is given by

$$z_n = H_n x_n + v_n$$

and $\hat{x}_{n|n-1}$ is the estimate of the state $x_n$ before the measurement $z_n$, then the measurement residual is defined by

$$\tilde{z}_n = z_n - H_n \hat{x}_{n|n-1}$$

If there are no modeling errors, it is easy to show that $\tilde{z}_n$ is a zero mean random variable with covariance

$$E(\tilde{z}_n \tilde{z}_n^T) = H_n P_{n|n-1} H_n^T + R_n$$

where $R_n$ is the covariance of the measurement noise $v_n$, and $P_{n|n-1}$ is the covariance of the state estimation error before the $n$th measurement.

Jazwinski (Ref. 15) has suggested introducing into the model of the dynamics of the system a zero mean random driving noise which in some sense can account for the effect of any modeling error. However, the covariance of this noise is not known a priori since it is not known what modeling errors are actually present. Jazwinski proposes a simple and reportedly effective procedure for determining how much "driving noise" to introduce into the model based upon an examination of a single residual at a time. If the squared residual is much larger than predicted by the filter, the

computed covariance of the old state estimate is artificially increased at that time so that the new measurement is weighted more heavily than would be the case if no adjustment was made. In this way possible divergence problems in the filter are minimized because as soon as the residuals become large, indicating that there is an error in the model, the measurements are weighted heavily. This tends to reduce the estimation error to a level consistent with that predicted by the filter. No attempt is made to estimate the value of the covariance of the added driving noise since in fact it does not exist. It was included to account for any unknown modeling errors. Even if the covariance is estimated, such an estimate would have little statistical significance since it would be based upon an examination of a single measurement residual. So Jazwinski's procedure should be viewed as an attempt to reduce the effect of modeling errors on the filter operation rather than an attempt to improve our knowledge of the model.

Dennis (Ref. 5) addresses himself to a more complicated problem, that of estimating the effects of errors in modeling the dynamics of the system as well as estimating the covariances of the measurement and driving noises. Only his procedure for estimating the statistics of the noises is of interest here. Dennis develops expressions for a real time estimator of the measurement and driving noise covariances. The estimates are subsequently used in the computation of the

appropriate weighting gains in a Kalman state estimator. Dennis' solution for the estimation of the noise statistics is suboptimal in the sense that no optimality criterion is used in defining these estimates. The expressions were obtained by an examination of the characteristics of quadratic functions of certain measurement residuals. From this examination a reasonable, if not optimal, estimator is postulated. However, in many useful applications there are several problems associated with the use of this estimator. It is not always possible to estimate all of the unknown elements of the measurement and driving noise covariance matrices. Depending upon the dimension and nature of the measurement, some or all of the elements of the driving noise covariance may not be observable, and as a result, a singular situation is created. There are also certain situations when the estimators may be biased and result in estimates that do not converge to the true values of the noise covariances as the number of measurements becomes large. Dennis does not develop expressions for the evaluation of the quality of the noise covariance estimates. Such measures of quality would be needed if it is desired to incorporate the estimates obtained from the measurements with some a priori estimates to obtain a combined estimate based upon a priori knowledge of the noise covariances and the information contained in the measurements.

Shellenbarger (Ref. 31) is exclusively concerned with estimating the values of the measurement and driving noise covariances so that the proper gains can be computed for

estimating the state. His technique is aimed at finding an approximate solution to this problem and consequently his estimator for these parameters is suboptimal. He bases his estimates of the noise covariance parameters upon an examination of a single measurement residual at a time. If the measurement is of small dimension compared with the number of covariance parameters being estimated, there is no unique solution for all of the noise covariance parameters. In addition to this, there is also a question of a possible bias in the noise covariance estimator.

The work of Smith (Ref. 33) is even more restricted in that he attempts to estimate only the measurement noise covariance, assuming that the dynamical model of the state and the covariance of the driving noise are known precisely. His work results in a suboptimal estimator for the state and measurement noise covariance. Here too there is a question of a possible bias in the noise covariance estimator.

Because of the relevance of noise covariance estimation to this work, a short review of the procedures of Dennis, Shellenbarger, and Smith is included in Chapter 4. Although their procedures are suboptimal and there are problems associated with implementing their estimators in certain cases, it is felt that there are some situations when these estimators provide an adequate solution to the problem of inaccurately known noise statistics. Their procedures are much simpler than the optimal procedures developed in Chapter 3 and provide some insight into the variety of techniques that are available for an approximate solution to the problem.
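As an illustration of the weighted bank of Kalman type estimators attributed above to Magill (Ref. 21), the following is a minimal sketch for a scalar system with unit observation matrix. It is not taken from Ref. 21. The function name `filter_bank_step`, the dictionary layout of each filter, and the use of a Gaussian residual likelihood to update the weights are assumptions made here for illustration.

```python
import math

def filter_bank_step(filters, weights, z, phi=1.0, q=0.0):
    """One measurement step for a bank of scalar Kalman filters (H = 1).

    filters: list of dicts with keys 'x' (estimate), 'p' (covariance) and
             'r' (the measurement noise variance hypothesized by that filter).
             The dicts are updated in place.
    weights: current probability assigned to each hypothesis.
    Returns the updated weights and the weighted sum of the filter outputs.
    """
    likelihoods = []
    for f in filters:
        x_pred = phi * f['x']                 # propagate the estimate
        p_pred = phi * f['p'] * phi + q       # propagate the covariance
        s = p_pred + f['r']                   # residual variance predicted by this filter
        resid = z - x_pred                    # measurement residual
        gain = p_pred / s
        f['x'] = x_pred + gain * resid
        f['p'] = (1.0 - gain) * p_pred
        # Gaussian likelihood of the residual under this hypothesis (assumed form)
        likelihoods.append(math.exp(-0.5 * resid * resid / s)
                           / math.sqrt(2.0 * math.pi * s))
    posterior = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(posterior)
    posterior = [p / total for p in posterior]
    x_combined = sum(w * f['x'] for w, f in zip(posterior, filters))
    return posterior, x_combined
```

In this sketch a hypothesis whose predicted residual variance is far smaller than the scatter actually observed in the residuals loses weight rapidly, which is the nonlinear dependence on the measurements noted in the text.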

Summary of Thesis

As was previously mentioned, the primary objective of this thesis is the development of an optimal estimator of the state and statistics of the measurement and driving noises. However, several other related subjects are also treated.

In Chapter 2, it will be shown that a biased or correlated measurement or driving noise can be estimated using a linear recursive filter identical in form to the usual Kalman filter for estimating the state. This is a consequence of the fact that such a biased or correlated noise is observable in terms of a linear function of the measurements. It will also be shown that an error in the values of the measurement and driving noise covariances used to compute Kalman filter gains does not produce an observable effect in a linear function of the measurements. Therefore, any estimator of these covariances is inherently a nonlinear estimator since a nonlinear function of the measurement is needed in the estimation loop. In the simple example given previously, it was shown that an estimator for the measurement noise variance is a quadratic function of the measurements.

Initially an attempt was made to formulate the problem of noise covariance estimation in terms of minimum variance estimation, but the nonlinearities in the problem immediately produced great analytical difficulties. This is one of the reasons why the criterion of maximum likelihood was chosen to define the optimal estimates of the state and the noise statistics. As the name might imply, maximum likelihood

estimates are the most probable values of the state and statistics for a given set of measurements. The techniques of maximum likelihood result in complicated equations, but the theory of maximum likelihood estimators is sufficiently developed to allow a proper handling of the problem.

The points of maximum likelihood are found by setting the derivatives of a suitable likelihood function to zero and then solving the resulting equations for the unknown parameters. There is a likelihood equation associated with each parameter being estimated. When the noise covariance matrices are assumed to be time invariant, the solution of the likelihood equations for the optimal state estimate is just a Kalman type estimator which uses the optimal estimates of the noise covariances to compute the appropriate filter gains. Unfortunately, there is no general closed form solution of the likelihood equations for these optimal noise covariance estimates. However, an iterative procedure is proposed for the solution of the likelihood equations corresponding to the estimates of the noise covariances. These estimates are shown to be asymptotically unbiased, efficient, consistent, and unique, with the estimation error normally distributed with a known covariance.

In addition to the optimal solution discussed in Chapter 3, several suboptimal solutions of the problem are given in Chapter 4. These solutions can result in a major savings in the computational requirements but they do not have the wide range of applicability of the optimal solution.

Chapter 5 is devoted to a discussion of hypothesis testing. Hypothesis testing is closely related to the estimation problem. Certain criteria are developed for making decisions as to whether observed measurements are consistent with assumptions about the statistics of the measurement and driving noises. However, the tests themselves do not allow a determination of the reasons the measurements fail a particular hypothesis test, but rather indicate that there is some error in the model of the system and/or measurement. The tests can usually be conducted at less computational expense than a more complicated noise covariance estimation procedure, so they can be used to determine if such additional estimation should be conducted.

In Chapter 6, the numerical results of a computer simulation of the theoretical results are presented. The optimal and suboptimal estimators are simulated to study their performance in a simple but realistic situation. The techniques of hypothesis testing are also studied to find the power of certain tests in detecting errors in the values of the noise statistics used within a Kalman filter.
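To make concrete what a consistency check on the measurement residuals might look like, the following sketch sums the squared scalar residuals, each normalized by the variance the filter predicts for it. If the assumed noise statistics are correct and the residuals are independent and normal, this sum is chi-square distributed with mean N and variance 2N for N residuals. The acceptance band of `n_sigma` standard deviations about N is an assumption chosen here for simplicity; it is not one of the criteria developed in Chapter 5, and the function name is hypothetical.

```python
import math

def residual_consistency(residuals, predicted_vars, n_sigma=3.0):
    """Check scalar residuals against the variances predicted by the filter.

    Returns (statistic, consistent). The statistic is the sum of squared
    residuals, each divided by its predicted variance. 'consistent' is True
    when the statistic lies within n_sigma * sqrt(2N) of its expected value N.
    """
    n = len(residuals)
    stat = sum(r * r / s for r, s in zip(residuals, predicted_vars))
    half_width = n_sigma * math.sqrt(2.0 * n)   # assumed acceptance band
    return stat, abs(stat - n) <= half_width
```

A failed check of this kind indicates only that some part of the model is in error; it does not by itself identify which noise statistic is at fault.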

Chapter 2

EXPECTATION OPERATORS AND MAXIMUM LIKELIHOOD ESTIMATION

2.1 Introduction

In this chapter two types of expectation operators are defined and maximum likelihood parameter estimation is discussed. A precise understanding of the expectation operator notation is necessary for subsequent work, so important definitions and results are given here.

The maximum likelihood equations are utilized to establish the notation and results of the familiar linear state estimation problem with and without the use of a priori information about the state. The question of unbiasedness and the covariance of the state estimate in the presence of inaccurately known noise statistics is also discussed.

More general parameter estimation problems and a more detailed examination of the properties of maximum likelihood estimators are treated in Chapter 3.

2.2 Conditional and Unconditional Expectation Operators

Let $x$ and $y$ be random variables (possibly vector valued) with joint probability density function $f(x,y)$ defined over the range $-\infty < x, y < \infty$. The conditional expectation, or mean, of $x$, conditioned upon the value of $y$ is defined by

$$E(x|y) \triangleq \int_{-\infty}^{\infty} x \, f(x|y) \, dx \tag{2.2.1}$$

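where $f(x|y)$ is the conditional probability density function of $x$ given $y$. Define

$$f(y) \triangleq \int_{-\infty}^{\infty} f(x,y) \, dx$$

Applying Bayes' rule,

$$f(x|y) = \frac{f(x,y)}{f(y)} \tag{2.2.2}$$

The unconditional expectation of $x$ is defined by

$$E(x) \triangleq \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x \, f(x,y) \, dx \, dy = \int_{-\infty}^{\infty} E(x|y) \, f(y) \, dy \tag{2.2.3}$$

The first expectation, $E(x|y)$, is the expected value of $x$ if $y$ were fixed at the conditioned value. It is found by averaging over all other random influences with a constant value of $y$. The second expectation, $E(x)$, is the expected value of $x$ which represents an average over the distribution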

of $y$ as well as over all other random influences.

The conditional covariance of $x$ is defined by

$$\mathrm{cov}(x|y) \triangleq E\big((x - E(x|y))(x - E(x|y))^T \,\big|\, y\big) = E(x x^T | y) - E(x|y) \, E(x^T|y) \tag{2.2.4}$$

The unconditional covariance of $x$ is defined by

$$\mathrm{cov}(x) \triangleq E\big((x - E(x))(x - E(x))^T\big) = E(x x^T) - E(x) \, E(x^T) \tag{2.2.5}$$

But

$$E(\mathrm{cov}(x|y)) = E(x x^T) - E\big(E(x|y) \, E(x^T|y)\big)$$

so

$$\mathrm{cov}(x) = E(\mathrm{cov}(x|y)) + \mathrm{cov}(E(x|y)) \tag{2.2.6}$$

Thus the unconditional covariance can always be decomposed into the sum of two components: 1) the average conditional covariance and 2) the covariance of the conditional average.

The use of the conditional and unconditional expectation operators in this work is somewhat unconventional because the random variables $y$ may represent the parameters of the probability density function of $x$. It is not usual to think

of the parameters of a probability density function as themselves being random variables. However, in situations where it is desired to estimate the values of these parameters on the basis of observed values of a random variable $x$, by considering $y$ to be a random variable any a priori information about the value of $y$ can be utilized coherently in forming an a posteriori estimate of the value of $y$.

It may not seem legitimate to regard the value of $y$ as itself being the outcome of a random experiment. Usually it is more natural to regard $y$ simply as a fixed, though unknown, constant which appears as a parameter in the $x$ distribution from which sample values are taken. However, if this approach is used, there is no way to utilize a priori information about $y$ and accordingly the performance of the estimator would be degraded. In the extreme case when no a priori information about $y$ exists, the introduction of the concept of an initial distribution for $y$ would be unjustified and of no practical use. In the other extreme case when it is assumed that the parameters are known precisely a priori, then the probability density function of $y$ would reduce to impulses at the known values of the parameters. However, in such a situation, in the absence of any other random influences on $y$, there would be no need for the entire estimation process since it is assumed that the values of $y$ are known. In all cases falling between these two extremes, by introduction of a realistic if not precisely correct density function for $y$, the realities of the situation can be more closely modeled than by considering that the parameters $y$ are either exactly known or

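completely unknown a priori.

The above discussion can be illustrated by a simple example. Let $x$ be a normal variable with mean $m$ and variance $s$, with conditional probability density function $f(x|m,s)$. Furthermore let $m$ and $s$ be random variables with a joint probability density function $f(m,s)$. For simplicity it is assumed that $s$ and $m$ are independent, so $f(m,s) = f(m) \, f(s)$.

The conditional mean of $x$ is

$$E(x|m,s) = \int_{-\infty}^{\infty} x \, f(x|m,s) \, dx$$

But

$$f(x|m,s) = \frac{1}{(2\pi s)^{1/2}} \, e^{-\frac{1}{2}(x-m)^2/s}$$

so

$$E(x|m,s) = m \quad \text{independent of } s$$

The unconditional mean of $x$ is

$$E(x) = \iint E(x|m,s) \, f(m,s) \, dm \, ds = \int m \, f(m) \, dm = \bar{m}$$

The conditional variance of $x$ is

$$E\big((x-m)^2 \,\big|\, m,s\big) = s$$

The unconditional variance of $x$ is

$$E\big((x - \bar{m})^2\big) = \iint E\big((x - \bar{m})^2 \,\big|\, m,s\big) \, f(m,s) \, dm \, ds$$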

$$= \iint (s + m^2 - \bar{m}^2) \, f(m) \, f(s) \, dm \, ds = \bar{s} + \overline{m^2} - \bar{m}^2$$

where

$$\bar{s} = \int s \, f(s) \, ds, \qquad \overline{m^2} = \int m^2 f(m) \, dm$$

Note that $E((x - \bar{m})^2) \neq E((x - m)^2)$ unless $m = \bar{m}$.

2.3 Maximum Likelihood State Estimation

In this section the theory of maximum likelihood estimation is discussed and applied to the estimation of the state of a linear dynamical system which is driven by white noise and observed by linear noisy measurements. Because of the relative simplicity of the equations for determining the state estimate, much can be said about the performance of the estimator. In more complicated situations, such as estimating the covariance of the measurement and driving noises, evaluation of the estimator behavior is considerably more difficult and requires a more thorough analysis. For this reason the discussion of these situations is deferred until Chapter 3.

Maximum likelihood estimation, as the name might imply, is concerned with finding the maximum of a likelihood function defined as a function of the parameters being estimated and the measurements on the system. Let $Z$ denote the realized values of a set of measurements and $\alpha^T = (\alpha^1, \alpha^2, \ldots, \alpha^m)$ be

the vector of parameters belonging to a set of all possible parameter values $\Omega$. Further, let $f(Z|\alpha)$ denote the conditional probability density function of the measurements $Z$ given the value of the parameter $\alpha$. The likelihood function is then defined by

$$l(\alpha, Z) = f(Z|\alpha) \tag{2.3.1}$$

The principle of maximum likelihood consists of accepting $\hat{\alpha}^T = (\hat{\alpha}^1, \hat{\alpha}^2, \ldots, \hat{\alpha}^m)$ as the estimate of $\alpha^T$, where

$$l(\hat{\alpha}, Z) = \max_{\alpha} \, l(\alpha, Z) \tag{2.3.2}$$

There may be a set of samples for which $\hat{\alpha}$ does not exist. Under suitable regularity conditions on $f(Z|\alpha)$, the frequency of such samples can be shown to be negligible.

In practice it is convenient to work with the natural logarithm of $l(\alpha, Z)$, in which case $\hat{\alpha}$ in (2.3.2) satisfies the equation

$$L(\hat{\alpha}, Z) = \ln l(\hat{\alpha}, Z) = \max_{\alpha} \, L(\alpha, Z) \tag{2.3.3}$$

When the maximum in (2.3.3) is attained at an interior point of $\Omega$, and $L(\alpha, Z)$ is a differentiable function of $\alpha$, then the partial derivatives vanish at that point, so that $\hat{\alpha}$ is a solution of the equation

$$\left. \frac{\partial L(\alpha, Z)}{\partial \alpha} \right|_{\alpha = \hat{\alpha}} = 0 \tag{2.3.4}$$
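As a small worked instance of the likelihood equation (2.3.4), consider N independent zero mean normal samples with unknown variance $s$. Setting the derivative of the log likelihood with respect to $s$ to zero gives the estimate as the average of the squared samples, a quadratic function of the measurements. The sketch below is illustrative only; the function names are chosen here and the zero mean, independent sample model is an assumption of the example.

```python
import math

def log_likelihood(s, z):
    """Log likelihood of variance s for independent zero mean normal samples z."""
    n = len(z)
    return -0.5 * n * math.log(2.0 * math.pi * s) - sum(v * v for v in z) / (2.0 * s)

def ml_variance(z):
    """Root of dL/ds = -N/(2s) + sum(z^2)/(2 s^2) = 0."""
    return sum(v * v for v in z) / len(z)

def numerical_derivative(s, z, h=1e-5):
    """Central difference approximation of dL/ds, used only as a check."""
    return (log_likelihood(s + h, z) - log_likelihood(s - h, z)) / (2.0 * h)
```

Evaluating `numerical_derivative` at `ml_variance(z)` should return a value near zero, and the log likelihood should be lower at neighboring values of $s$, which confirms that the root is an interior maximum for this example.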

Equation (2.3.4) is called the maximum likelihood equation and any solution of it a maximum likelihood estimate. The function $\hat{\alpha}$ defined by (2.3.3) over the sample space of observations $Z$ is called a maximum likelihood estimator.

If a priori information about the parameters being estimated exists and if the a priori uncertainty in knowledge of these parameters can be formulated as an a priori probability density function for $\alpha$, then a slightly different likelihood function can be defined so that this a priori information can be used in an optimal fashion. In such cases, the augmented likelihood function is defined by

$$l(\alpha, Z) = f(\alpha|Z) \tag{2.3.5}$$

where $f(\alpha|Z)$ is the conditional probability density function of the parameters $\alpha$ given the measurements $Z$. By application of Bayes' rule it can be seen that

$$f(\alpha|Z) = \frac{f(Z|\alpha) \, f(\alpha)}{f(Z)}$$

where $f(\alpha)$ is the a priori probability density function of $\alpha$ and $f(Z)$ is the unconditional probability density function of $Z$, found by

$$f(Z) = \int_{\Omega} f(Z|\alpha) \, f(\alpha) \, d\alpha$$

In this case the logarithm of the augmented likelihood

function (2.3.5) is

$$L(\alpha, Z) = \ln f(Z|\alpha) + \ln f(\alpha) - \ln f(Z) \tag{2.3.6}$$

The inclusion of a priori information about $\alpha$ has a tendency to shift the zero points of

$$\frac{\partial L(\alpha, Z)}{\partial \alpha} = \frac{\partial \ln f(Z|\alpha)}{\partial \alpha} + \frac{\partial \ln f(\alpha)}{\partial \alpha} \tag{2.3.7}$$

towards the peak of the a priori parameter density function.

If a priori information about $\alpha$ exists, it is usually preferable to utilize the formulation $f(\alpha|Z)$ since this allows utilization of all information about the value of $\alpha$, both from the a priori information and information derived from the measurements $Z$. However, it should be realized that if the assigned a priori probability density function of the parameters does not accurately represent possible variations in the parameters, the performance of the estimator may in fact be degraded by inclusion of a priori information.

When studying the performance of an estimator, there is some justification for looking first at an estimator which does not utilize a priori information. This allows determination of how effectively a given estimator extracts information from the measurements without considering how this estimate might be incorporated with an a priori estimate to obtain a combined estimate.

In the derivation of the maximum likelihood state estimation equations, it is first assumed that a priori information about the state does exist so that the latter form of the likelihood function is employed. After the solution of

this problem is obtained, the equations for estimating the state without a priori information will be given. Both solutions of the state estimation equations should more correctly be called conditional maximum likelihood estimates because the optimality of such estimates is conditioned upon the assumption that the noise driving the state and corrupting the measurements of the state have a known distribution with precisely known parameters. If this assumption is not valid, then the state estimates are no longer the true maximum likelihood estimates and all guarantees of optimality are lost.

The purpose of this section is to establish certain results and notation which will be needed in later chapters. An excellent reference on the subject of maximum likelihood state estimation is by Rauch (Ref. 26).

Let the linear dynamical system being observed be defined by the recursive relationship

$$x_k = \Phi(k,k-1) \, x_{k-1} + \Gamma_k w_k \quad (\beta \times 1 \text{ vector}) \tag{2.3.8}$$

and the linear noisy observations upon the system at time $k$ be defined by

$$z_k = H_k x_k + v_k \quad (\gamma \times 1 \text{ vector}) \tag{2.3.9}$$

where

    $\Phi(k,k-1)$ is the $\beta \times \beta$ state transition matrix
    $H_k$ is the $\gamma \times \beta$ observation matrix
    $\Gamma_k$ is the $\beta \times \eta$ forcing function matrix
    $w_k$ is the $\eta \times 1$ driving noise vector
    $v_k$ is the $\gamma \times 1$ measurement noise vector

For this derivation it is assumed that $v_k$ and $w_k$ are independent zero mean normal random variables with known covariances $R_k$ and $Q_k$ respectively. Using the notation of Section 2.2,

$$E(v_k) = E(w_k) = 0 \tag{2.3.10}$$

$$E(v_k v_j^T) = R_k \delta_{jk}, \qquad E(w_k w_j^T) = Q_k \delta_{jk}, \qquad E(v_k w_j^T) = 0 \tag{2.3.11}$$

where $\delta_{jk} = 1$ if $j = k$ and is zero otherwise.

The above conditional expectation operators are conditioned upon the assumed values of the means and covariances of the noises as well as their assumed independence.

Given the vector of measurements $Z_n^T = (z_1^T, \ldots, z_n^T)$ and an independent a priori estimate of the initial state, maximum likelihood estimation of the state $x_n$ is based upon finding the particular value of the state which maximizes the conditional probability density function of the state, given all measurements of the state. Implicit in the definition of the likelihood function is that all values of $R_k$ and $Q_k$, $k = 1, \ldots, n$, be known precisely, as well as the covariance of the a priori state distribution, the elements of the state transition matrices, the observation matrices, and the

forcing function matrices. To indicate this dependence of the likelihood function on these parameters, some of the parameters will appear as conditioning variables in the conditional likelihood function. This choice of parameters to thus indicate is motivated by the work of Chapter 3, when the values of certain parameters are to be estimated.

It is convenient to work with the natural logarithm of the likelihood function.

$$L_n = \ln f(x_n | Z_n, R, Q) \tag{2.3.12}$$

where $R$ and $Q$ represent the known sequence of values $R_1, \ldots, R_n$, $Q_1, \ldots, Q_n$ of the measurement and driving noise covariances. The conditional probability density function of the state is found by use of Bayes' rule.

$$f(x_n | Z_n, R, Q) = \frac{f(z_n | x_n, R, Q) \, f(x_n | Z_{n-1}, R, Q)}{f(z_n | Z_{n-1}, R, Q)}$$

On any one trial, the initial state $x_0$ is not a random variable but assumes a certain value. However, this value is not precisely known. To model this uncertainty in the value of the initial state, $x_0$ is assumed to be a random variable

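(over the ensemble of all possible initial conditions) having a normal probability density function $f(x_0)$ with mean $\bar{x}_0$ and covariance about the mean $P_{0|0}$. This distribution is presumed to be known a priori. The a priori state estimate is taken to be the mean of this distribution. Because of the symmetry of $f(x_0)$ about its mean, $\bar{x}_0$ is also the point of maximum probability of the distribution.

$$\hat{x}_{0|0} = \bar{x}_0 = \text{the a priori state estimate}$$

$$P_{0|0} = E\big((x_0 - \bar{x}_0)(x_0 - \bar{x}_0)^T\big)$$

The averaging here is performed over the ensemble of all possible initial conditions and is conditioned upon knowledge of $\bar{x}_0$ and $P_{0|0}$.

Let $\hat{x}_{n|n-1}$ be the maximum likelihood estimate of $x_n$ immediately before the $n$th measurement and let $P_{n|n-1}$ be the conditional covariance of $x_n$ about its conditional mean $\hat{x}_{n|n-1}$.

$$P_{n|n-1} = E\big((x_n - \hat{x}_{n|n-1})(x_n - \hat{x}_{n|n-1})^T \,\big|\, Z_{n-1}, R, Q\big)$$

The averaging here is over the ensemble of all possible measurement and driving noises and initial state conditions, all conditioned upon the values of $R$ and $Q$. It can be shown that before the update at time $n$, the conditional probability density function of $x_n$ is

$$f(x_n | Z_{n-1}, R, Q) = (2\pi)^{-\beta/2} \, |P_{n|n-1}|^{-1/2} \exp\left\{-\tfrac{1}{2}(x_n - \hat{x}_{n|n-1})^T P_{n|n-1}^{-1} (x_n - \hat{x}_{n|n-1})\right\}$$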

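From (2.3.9)

$$z_n = H_n x_n + v_n$$

Since $v_n$ is a normally distributed variable, independent of $x_n$, and $x_n$ is also a normal variable, then $z_n$ is a normally distributed variable with conditional mean

$$E(z_n | Z_{n-1}, R, Q) = H_n \hat{x}_{n|n-1}$$

and conditional covariance

$$\mathrm{cov}(z_n | Z_{n-1}, R, Q) = H_n P_{n|n-1} H_n^T + R_n$$

Therefore the conditional probability density function of $z_n$ is

$$f(z_n | Z_{n-1}, R, Q) = (2\pi)^{-\gamma/2} \, |H_n P_{n|n-1} H_n^T + R_n|^{-1/2} \exp\left\{-\tfrac{1}{2}(z_n - H_n \hat{x}_{n|n-1})^T (H_n P_{n|n-1} H_n^T + R_n)^{-1} (z_n - H_n \hat{x}_{n|n-1})\right\}$$

and from (2.3.12) and (2.3.13)

$$L_n = -\tfrac{1}{2}(x_n - \hat{x}_{n|n-1})^T P_{n|n-1}^{-1} (x_n - \hat{x}_{n|n-1}) - \tfrac{1}{2}(z_n - H_n x_n)^T R_n^{-1} (z_n - H_n x_n) + \text{constant}$$

where "constant" includes all terms that are not functions of $x_n$.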

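The maximum likelihood estimate of $x_n$ is that value of $x_n$ which maximizes $L_n$, or makes

$$\frac{\partial L_n}{\partial x_n} = 0 \tag{2.3.17}$$

It can be seen that

$$\frac{\partial L_n}{\partial x_n} = -(x_n - \hat{x}_{n|n-1})^T P_{n|n-1}^{-1} + (z_n - H_n x_n)^T R_n^{-1} H_n \tag{2.3.18}$$

Then after some manipulation, the solution of (2.3.17) is

$$\hat{x}_{n|n} = \big(P_{n|n-1}^{-1} + H_n^T R_n^{-1} H_n\big)^{-1} \big(P_{n|n-1}^{-1} \hat{x}_{n|n-1} + H_n^T R_n^{-1} z_n\big) \tag{2.3.19}$$

Upon using the matrix inversion lemma (see Appendix A)

$$\hat{x}_{n|n} = \hat{x}_{n|n-1} + A_n (z_n - H_n \hat{x}_{n|n-1}) \tag{2.3.20}$$

where

$$A_n = P_{n|n-1} H_n^T \big(R_n + H_n P_{n|n-1} H_n^T\big)^{-1} \tag{2.3.21}$$

is called the optimum gain to the measurement residual $(z_n - H_n \hat{x}_{n|n-1})$.

The conditional probability density function of $x_n$ after the $n$th measurement can be shown to be

$$f(x_n | Z_n, R, Q) = (2\pi)^{-\beta/2} \, |P_{n|n}|^{-1/2} \exp\left\{-\tfrac{1}{2}(x_n - \hat{x}_{n|n})^T P_{n|n}^{-1} (x_n - \hat{x}_{n|n})\right\} \tag{2.3.22}$$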

where $P_{n|n}$ is the conditional covariance of $x_n$ about $\hat{x}_{n|n}$ after the $n$th measurement. It can be shown that

$$P_{n|n} = (I - A_n H_n) \, P_{n|n-1} \tag{2.3.23}$$

The necessary quantities for computing $\hat{x}_{n|n}$ can be obtained recursively from the estimate at the previous time.

$$\hat{x}_{n|n-1} = \Phi(n,n-1) \, \hat{x}_{n-1|n-1} \tag{2.3.24}$$

$$P_{n|n-1} = \Phi(n,n-1) \, P_{n-1|n-1} \, \Phi^T(n,n-1) + \Gamma_n Q_n \Gamma_n^T \tag{2.3.25}$$

It should be noted that the above recursive state estimation equations are identical to those obtained by Kalman (Ref. 16) using the method of orthogonal projections and Lee (Ref. 20) using the method of weighted least squares. It is also easy to show that the state estimate is that estimate which minimizes the conditional covariance of the state estimation error at each stage of estimation.

If no a priori information about the state is used, the logarithm of the likelihood function is defined by

$$L'_n = \ln f(Z_n | x_n, R, Q) \tag{2.3.26}$$

where $f(Z_n | x_n, R, Q)$ is the joint conditional probability density function of the measurements $Z_n$ given $x_n$, $R$, and $Q$.
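The recursion with a priori information, equations (2.3.20), (2.3.21), (2.3.24) and (2.3.25), can be sketched for a scalar system as follows. The function names are chosen here for illustration, all quantities are scalars, and the covariance update is written in the standard form $(1 - \text{gain} \cdot h) \, p$, which is an assumption about presentation rather than a quotation of the text. The second function evaluates the form (2.3.19) so that the two expressions for the updated estimate can be compared.

```python
def kalman_step(x_est, p_est, z, phi, gamma, h, q, r):
    """One propagate-and-update cycle of a scalar Kalman type estimator."""
    # Propagation of the estimate and its covariance, (2.3.24) and (2.3.25)
    x_pred = phi * x_est
    p_pred = phi * p_est * phi + gamma * q * gamma
    # Gain applied to the measurement residual, (2.3.21)
    gain = p_pred * h / (r + h * p_pred * h)
    # Update of the estimate, (2.3.20), and of the covariance
    x_new = x_pred + gain * (z - h * x_pred)
    p_new = (1.0 - gain * h) * p_pred
    return x_new, p_new, x_pred, p_pred

def update_information_form(x_pred, p_pred, z, h, r):
    """Updated estimate written in the form of (2.3.19), for comparison."""
    return (x_pred / p_pred + h * z / r) / (1.0 / p_pred + h * h / r)
```

For any positive `p_pred` and `r` the two update expressions agree to rounding error, which is the scalar content of the matrix inversion lemma step.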

By application of Bayes' rule

$$f(Z_n | x_n, R, Q) = f(z_n | Z_{n-1}, x_n, R, Q) \, f(Z_{n-1} | x_n, R, Q)$$

By repeated application of Bayes' rule, it can be shown that

$$f(Z_n | x_n, R, Q) = \prod_{k=1}^{n} f(z_k | Z_{k-1}, x_n, R, Q) \tag{2.3.28}$$

It can be anticipated that until a sufficient number of measurements have been taken, the state estimate cannot be defined and there is no unique solution of the likelihood equations. The problem is conveniently broken into two parts, obtaining a minimal data set and the subsequent recursive estimation using the equations previously derived. A minimal data set is defined as the smallest set of measurements that is necessary to completely define the state. That is, for $n <$ some $n_0$, there is no unique solution of the likelihood equations for the state $x_n$.

The derivation of the estimation equations when no a priori information is used is considerably more complicated than the case previously studied when a priori information was used. Only the results of the derivation will be presented here. Fraser (Ref. 10) obtained the same equations given below using the criterion of minimum covariance.

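Prior to obtaining a minimal data set no unique estimate of the state exists so an auxiliary variable must be introduced. Define

$$y_{n|n} = F_{n|n} \, \hat{x}'_{n|n} \tag{2.3.30}$$

$$y_{n|n-1} = F_{n|n-1} \, \hat{x}'_{n|n-1} \tag{2.3.31}$$

where $\hat{x}'_{n|n}$ and $\hat{x}'_{n|n-1}$ are the state estimates obtained without a priori information and $F_{n|n}$ and $F_{n|n-1}$ will be subsequently defined. It can be shown that a unique $y_{n|n}$ and $y_{n|n-1}$ exist at all times, but only if $F_{n|n}$ and $F_{n|n-1}$ are of full rank and possess inverses do unique $\hat{x}'_{n|n}$ and $\hat{x}'_{n|n-1}$ exist.

Recursive equations for $y_{n|n}$, $F_{n|n}$, $y_{n|n-1}$, and $F_{n|n-1}$ can be obtained with initial conditions

$$y_{0|0} = 0, \qquad F_{0|0} = 0$$

Subsequently,

$$y_{n|n-1} = (I - C_n \Gamma_n^T) \, \Phi^{-T}(n,n-1) \, y_{n-1|n-1} \tag{2.3.32}$$

$$y_{n|n} = y_{n|n-1} + H_n^T R_n^{-1} z_n \tag{2.3.33}$$

$$F_{n|n-1} = S_n - S_n \Gamma_n D_n^{-1} \Gamma_n^T S_n \tag{2.3.34}$$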

$$F_{n|n} = F_{n|n-1} + H_n^T R_n^{-1} H_n \tag{2.3.35}$$

where

$$S_n = \Phi^{-T}(n,n-1) \, F_{n-1|n-1} \, \Phi^{-1}(n,n-1)$$

$$D_n = Q_n^{-1} + \Gamma_n^T S_n \Gamma_n$$

$$C_n = S_n \Gamma_n D_n^{-1}$$

It can be shown that $F_{n|n-1}$ and $F_{n|n}$ are equal to the inverse of the state estimation error covariance matrix before and after the $n$th measurement respectively. For $n < n_0$, $F_{n|n}$ is singular, implying that some or all elements of the error covariance matrix are infinite, this in turn implying that some or all of the elements of the state cannot be estimated on the basis of the measurements taken. However, once a minimal data set is obtained, the state estimate $\hat{x}'_{n|n}$ can be obtained from the equation below.

$$\hat{x}'_{n|n} = F_{n|n}^{-1} \, y_{n|n} \tag{2.3.36}$$

Subsequently, the usual state estimation equations (2.3.20) and (2.3.24) can be used with the solution of the minimal data set (2.3.36) used as the initial state estimate and $F_{n|n}^{-1}$ used as the covariance of the initial state estimation error.

The solution of the state estimation problem with no a priori information can be thought of as the limiting case of the solution with a priori information as $P_{0|0}^{-1} \to 0$. In

other words, the covariance of the a priori estimation error distribution becomes arbitrarily large and in the limit becomes infinite. This is equivalent to having no a priori information about the state.

The state estimate obtained using a priori information can be shown to be completely equivalent to a linear combination of the state estimate obtained without use of a priori information and the propagated forward initial state estimate.

$$\hat{x}_{n|n} = P_{n|n} \big(P_{n|0}^{-1} \, \hat{x}_{n|0} + F_{n|n} \, \hat{x}'_{n|n}\big) \tag{2.3.37}$$

where

    $\hat{x}_{n|n}$ is the combined state estimate
    $\hat{x}'_{n|n}$ is the state estimate obtained without a priori information
    $\hat{x}_{n|0}$ is the propagated forward initial state estimate
    $P_{n|0}$ is the covariance of the propagated forward initial state estimation error
    $P_{0|0}$ is the covariance of the a priori state distribution
    $P_{n|n}$ is the covariance of the combined state estimation error
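The combination (2.3.37) can be checked numerically in the simplest case of a scalar state that does not change with time and has no driving noise. Under those assumptions, which are made here only for illustration, the propagated initial estimate and its covariance are just the a priori values, the auxiliary quantities $y$ and $F$ accumulate additively, and the combined covariance satisfies $P^{-1} = P_{0|0}^{-1} + F$. The function names below are hypothetical.

```python
def information_estimate(zs, h, r):
    """Estimate without a priori information for a static scalar state.

    Starts from y = 0 and F = 0 and accumulates the measurement terms.
    The division requires F > 0, that is, a minimal data set must exist.
    """
    y, f = 0.0, 0.0
    for z in zs:
        y += h * z / r
        f += h * h / r
    return y / f, f

def combine(x_prior, p_prior, x_free, f):
    """Combination in the form of (2.3.37), static scalar case (assumed)."""
    p = 1.0 / (1.0 / p_prior + f)
    return p * (x_prior / p_prior + f * x_free), p

def recursive_with_prior(zs, h, r, x_prior, p_prior):
    """Ordinary recursive estimate with a priori information, for comparison."""
    x, p = x_prior, p_prior
    for z in zs:
        gain = p * h / (r + h * p * h)
        x = x + gain * (z - h * x)
        p = (1.0 - gain * h) * p
    return x, p
```

In this restricted case the two routes give the same estimate and covariance, and as `p_prior` grows the combined estimate approaches the estimate obtained without a priori information.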

This result is also equivalent to setting the initial conditions on $y_{0|0}$ and $F_{0|0}$ to $P_{0|0}^{-1} \hat{x}_{0|0}$ and $P_{0|0}^{-1}$ respectively.

It can be shown that in most situations (when the state is completely observable by the measurements and controllable by the driving noise) that as $n \to \infty$, $P_{n|0}^{-1} \to 0$, in which case

$$\hat{x}_{n|n} \to \hat{x}'_{n|n}$$

Thus as would be expected, for large $n$, the effect of any initial state estimate will become arbitrarily small.

If the true values of $R$ and $Q$ are not known precisely, then the measurement information cannot be processed optimally. Let $R^*$ and $Q^*$ represent the assumed value of the sequences $R$ and $Q$, $\hat{x}^*_{n|n}$ represent the state estimate after $n$ measurements using $R^*$ and $Q^*$ to compute the measurement residual gain matrices, and $P^*_{n|n}$ represent the "computed" state covariance matrix. Then

$$\hat{x}^*_{n|n} = \hat{x}^*_{n|n-1} + A^*_n (z_n - H_n \hat{x}^*_{n|n-1}) \tag{2.3.38}$$

$$P^*_{n|n} = (I - A^*_n H_n) \, P^*_{n|n-1} \tag{2.3.39}$$

$$A^*_n = P^*_{n|n-1} H_n^T \big(R^*_n + H_n P^*_{n|n-1} H_n^T\big)^{-1} \tag{2.3.40}$$

$$\hat{x}^*_{n|n-1} = \Phi(n,n-1) \, \hat{x}^*_{n-1|n-1} \tag{2.3.41}$$

$$P^*_{n|n-1} = \Phi(n,n-1) \, P^*_{n-1|n-1} \, \Phi^T(n,n-1) + \Gamma_n Q^*_n \Gamma_n^T \tag{2.3.42}$$

$P^*_{n|n}$ represents the conditional state covariance matrix after the $n$th measurement, conditioned upon the assumption that $R^* = R$ and $Q^* = Q$. If this assumption is not valid, then $P^*_{n|n}$ does not accurately represent the state covariance matrix. It can easily be shown that the actual conditional covariance matrix can be computed recursively using the following equations.

$$P_{n|n} = (I - A^*_n H_n) \, P_{n|n-1} \, (I - A^*_n H_n)^T + A^*_n R_n A^{*T}_n \tag{2.3.43}$$

$$P_{n|n-1} = \Phi(n,n-1) \, P_{n-1|n-1} \, \Phi^T(n,n-1) + \Gamma_n Q_n \Gamma_n^T \tag{2.3.44}$$

$P_{n|n}$ represents the state covariance matrix under the assumptions that $R^*$ and $Q^*$ are used to compute the filter gains (2.3.40) while the true values of the noise covariances are $R$ and $Q$. If the initial state covariance is presumed to be known, then $P^*_{0|0} = P_{0|0}$. Unless $R^* = R$ and $Q^* = Q$, $P^*_{n|n}$ will not be equal to $P_{n|n}$. Depending upon the values of $R^*$, $R$, $Q^*$, $Q$, this deviation can be very significant. Numerical results of a computer simulation of these equations for a particular system are given in Chapter 6.

Because of the linearity of the maximum likelihood equations in the state estimation problem, a strong statement

can be made about the distribution of the estimation error. From the form of the state estimation equations it can be seen that if the initial state distribution is normal as well as the measurement and driving noises, then the state estimate is also a normal random variable. In order to completely specify the distribution of the estimation error, the mean and covariance of the distribution must be determined.

Conventionally, an estimator is said to be unbiased if over an ensemble of trials the expected value of the state estimate is equal to the expected value of the state. Implicit in this definition is averaging over the probability density functions of the measurement and driving noises as well as averaging over the ensemble of all initial conditions of the state. Even if incorrect values of $R$ and $Q$ are used to compute the measurement residual gain matrices, the state estimate remains unbiased in the above sense as long as the measurement gains are fixed numbers and are not random functions of the outcomes of the measurement process.

The conditional expected value of the state estimate (2.3.38) can be computed recursively.

$$E(\hat{x}^*_{n|n}) = E(\hat{x}^*_{n|n-1}) + E\big(A^*_n (z_n - H_n \hat{x}^*_{n|n-1})\big) \tag{2.3.45}$$

Under the assumption that $A^*_n$ is not a random variable under the expectation operator,

$$E(\hat{x}^*_{n|n}) = E(\hat{x}^*_{n|n-1}) - A^*_n H_n \, E(\tilde{x}^*_{n|n-1})$$

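where

$$\tilde{x}^*_{n|n-1} = \hat{x}^*_{n|n-1} - x_n$$

But from (2.3.41)

$$E(\hat{x}^*_{n|n-1}) = \Phi(n,n-1) \, E(\hat{x}^*_{n-1|n-1}) \tag{2.3.46}$$

Since $E(x_n) = \Phi(n,n-1) \, E(x_{n-1})$,

$$E(\tilde{x}^*_{n|n-1}) = \Phi(n,n-1) \, E(\tilde{x}^*_{n-1|n-1})$$

Repeating the above procedure, it can be shown that

$$E(\hat{x}^*_{n|n}) = E(x_n) + \big[(I - A^*_n H_n)\Phi(n,n-1) \cdots (I - A^*_1 H_1)\Phi(1,0)\big] \, E(\tilde{x}^*_{0|0}) \tag{2.3.47}$$

With $E(\tilde{x}^*_{0|0}) = E(\hat{x}_{0|0} - x_0)$ and $\hat{x}_{0|0} = E(x_0)$, then

$$E(\tilde{x}^*_{0|0}) = 0 \quad \text{and} \quad E(\hat{x}^*_{n|n}) = E(x_n) \text{ for all } n \tag{2.3.48}$$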
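The behavior of a filter whose gains are computed from assumed rather than true noise covariances can be sketched for a scalar, time invariant system. The code propagates three quantities side by side: the "computed" covariance based on the assumed values, the actual covariance written in the standard form $(1 - gh)^2 p + g^2 r$ with the true values, and the mean estimation error, which is multiplied by $(1 - gh)\phi$ at each step. The function name and argument names are chosen here for illustration, and the scalar time invariant setting is an assumption of the example.

```python
def mismatch_recursion(n_steps, phi, gamma, h, q_true, r_true,
                       q_assumed, r_assumed, p0, mean_err0=0.0):
    """Propagate computed covariance, actual covariance and mean error.

    Gains are computed from the assumed noise variances only. The actual
    covariance uses the same gains but the true noise variances.
    """
    p_star = p_act = p0
    mean_err = mean_err0
    for _ in range(n_steps):
        p_star_pred = phi * p_star * phi + gamma * q_assumed * gamma
        p_act_pred = phi * p_act * phi + gamma * q_true * gamma
        gain = p_star_pred * h / (r_assumed + h * p_star_pred * h)
        p_star = (1.0 - gain * h) * p_star_pred
        p_act = (1.0 - gain * h) ** 2 * p_act_pred + gain * r_true * gain
        mean_err = (1.0 - gain * h) * phi * mean_err
    return p_star, p_act, mean_err
```

With matching assumed and true variances the two covariances coincide. When the measurement noise variance is underestimated the actual covariance exceeds the computed one, while a zero initial mean error remains zero regardless of the assumed values.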

This result is independent of the values of $R^*$, $R$, $Q^*$, $Q$. The maximum likelihood state estimate remains unbiased for any values of $R^*$ and $Q^*$, but the covariance of such an estimate is a function of these quantities as expressed by (2.3.43) and (2.3.44). Thus it can be seen that over the ensemble of trials with all possible initial conditions, measurement noises, and driving noises, the state estimation error is zero mean normally distributed with covariance $P_{n|n}$ for any $n$.

Now the question is asked: Is the state estimate biased over the ensemble of trials with the same initial conditions? Or in other words, if the initial state were fixed and one averaged the estimate over all measurement and driving noises which might be experienced, would the state estimate be biased? The answer is yes if a priori information about the state is used and the initial state is different from the initial estimate. This can be shown in a fashion analogous to the previous work. Now all conditional expected values are additionally conditioned upon the value of $x_0$, the initial state. Now $E(\tilde{x}^*_{0|0} \mid x_0) = \hat{x}_{0|0} - x_0$

as averaging is not performed over $x_0$. Unless $x_0 = \hat{x}_{0|0}$, then $E(\tilde{x}^*_{n|n} \mid x_0) \neq 0$. The bias of the estimator is due to the use of a priori information in the estimator. If no a priori information is used, it is easy to show that the bias vanishes. However, even if initial information is used, as $n$ becomes large the bias due to initial condition error becomes arbitrarily small. On the average, $x_0 = \hat{x}_{0|0}$ and the estimator is unbiased as shown before. But over the ensemble of all possible trials with the same initial conditions, the estimate is only asymptotically unbiased. However, the distribution of the estimate about this possibly biased value can be shown to be normal for any $n$.

A slightly different definition of unbiasedness is used in Chapter 3 in the discussion of maximum likelihood estimators of more general parameters. There, an estimator $\hat{\alpha}_n$ of the true value of the parameters $\alpha$ is said to be unbiased if

$E(\hat{\alpha}_n \mid \alpha_0) = \alpha_0$

where $\alpha_0$ is the true value of $\alpha$. This definition is really appropriate in situations when no a priori information about the parameters is used so that the parameter estimate is a

function of the measurements alone. However, the asymptotic behavior of the estimator will be shown to be independent of the a priori estimate so that this definition is useful even if a priori information is used in obtaining the estimate.

Using this definition of unbiasedness, the maximum likelihood state estimate is unbiased if

$E(\hat{x}^*_{n|n} \mid x_n) = x_n$

Using a procedure similar to that used to obtain (2.3.47) and (2.3.49), it can be shown that the conditional mean of the estimation error again depends upon the initial estimation error. Unless the initial estimate $\hat{x}_{0|0}$ is exact, the maximum likelihood estimator is biased. But as before, if one looks at the asymptotic behavior of the estimator or studies an estimator which does not use a priori information about the state, then the conditional bias vanishes and the estimator is unbiased.
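The statement that the bias due to an initial condition error becomes arbitrarily small as $n$ grows can be checked with a short computation. The sketch below assumes a scalar, time-invariant system and uses the standard relation that, for fixed gains, the conditional mean of the estimation error is multiplied by $(1 - A_n H)\Phi$ at each step; the function and parameter names are hypothetical.

```python
def conditional_bias(phi, gamma, h, r, q, p0, initial_error, steps):
    """Mean of the estimation error over noise realizations when the initial
    state is held fixed and differs from the initial estimate by initial_error."""
    p = p0
    bias = initial_error
    biases = []
    for _ in range(steps):
        # Ordinary scalar covariance and gain recursion.
        p_pred = phi * p * phi + gamma * q * gamma
        gain = p_pred * h / (h * p_pred * h + r)
        p = (1.0 - gain * h) * p_pred
        # The zero-mean noises drop out of the mean; only the initial error propagates.
        bias = (1.0 - gain * h) * phi * bias
        biases.append(bias)
    return biases


biases = conditional_bias(phi=1.0, gamma=1.0, h=1.0, r=1.0, q=0.1,
                          p0=1.0, initial_error=5.0, steps=40)
```

For this example the magnitude of the bias shrinks at every measurement, which is the asymptotic unbiasedness described above.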

Now the question is asked: What is the effect of possible biases in the measurement and driving noises and what can be done to estimate these biases? In such a situation, the system state is given by the relationship

$x_n = \Phi(n,n-1)\,x_{n-1} + \Gamma_n (w_n + B_w)$ (2.3.51)

where as before $w_n$ is a zero mean random variable with covariance $Q$, with $w_n$ independent of $w_k$ for $k \neq n$. $B_w$ is a constant bias independent of $w_n$ with

$E(B_w B_w^T) = \sigma^2_{B_w}$

These conditional expected values are taken over the ensemble of all possible driving noise bias values. It is usually assumed that over the above mentioned ensemble, $B_w$ is normally distributed with zero mean and covariance $\sigma^2_{B_w}$. The measurement $z_n$ is given by

$z_n = H_n x_n + v_n + B_v$ (2.3.52)

where as before $v_n$ is a zero mean random variable with covariance $R$, with $v_n$ independent of $v_k$ for $k \neq n$. $B_v$ is a constant measurement bias independent of $v_n$ and the driving noise bias $B_w$, with

$E(B_v) = 0$

These conditional expected values are taken over the ensemble of all possible measurement noise bias values. Again it is usually assumed that $B_v$ is normally distributed.

If the state $x_n$ is estimated with the effects of these biases neglected, then the state estimate $\hat{x}^*_{n|n}$ is computed using (2.3.38) and (2.3.41), with the "computed" covariance matrix given by (2.3.39) and (2.3.42). It is assumed that the values of $R$ and $Q$ used to compute these matrices and the measurement residual gains are the correct values. Now however, the state estimate will not be an optimal estimate and $P^*_{n|n}$ will not correspond to the actual state estimation error covariance because of the neglected biases. From (2.3.51) and (2.3.41) it can be seen that

(2.3.53)

Then the actual state estimation error covariance matrix before the measurement at time $n$ is given by

(2.3.54)

So in order to compute $P_{n|n-1}$, the correlation between the driving noise bias $B_w$ and the state estimation error $\tilde{x}^*_{n-1|n-1}$ must be determined. This will be done subsequently. From (2.3.52) and (2.3.38) it can be seen that

$\tilde{x}^*_{n|n} = \tilde{x}^*_{n|n-1} + A^*_n (v_n + B_v - H_n \tilde{x}^*_{n|n-1})$ (2.3.55)

Then the actual state estimation error covariance matrix after the measurement at time $n$ is given by

(2.3.56)

The correlation between $B_v$ and $\tilde{x}^*_{n|n-1}$ must be determined in order to evaluate $P_{n|n}$. Multiplying (2.3.53) by $B_v^T$ and performing the conditional expected value,

(2.3.57)

since it is assumed that $B_v$ is independent of $w_n$ and $B_w$. $E(\tilde{x}^*_{n|n} B_v^T)$ and $E(\tilde{x}^*_{n|n} B_w^T)$ can be computed recursively. Multiplying (2.3.55) by $B_v^T$ and performing the expected value,

(2.3.58)

Multiplying (2.3.55) by $B_w^T$ and performing the expected value,

(2.3.59)

since $B_w$ is assumed to be independent of $v_n$ and $B_v$. But from (2.3.53) it can be seen that

(2.3.60)

So (2.3.59) becomes

$E(\tilde{x}^*_{n|n} B_w^T) = (I - A^*_n H_n)\,\Phi(n,n-1)\,E(\tilde{x}^*_{n-1|n-1} B_w^T) - (I - A^*_n H_n)\,\Gamma_n\,\sigma^2_{B_w}$ (2.3.61)

It is assumed that the initial state estimation error is independent of $B_w$ and $B_v$ so the initial conditions on (2.3.57) and (2.3.61) are

$E(\tilde{x}^*_{0|0} B_v^T) = E(\tilde{x}^*_{0|0} B_w^T) = 0$

Using an analysis similar to that previously given, it can be shown that across the ensemble of all possible initial

state conditions, measurement and driving noises, and measurement and driving noise biases, the state estimate $\hat{x}^*_{n|n}$ is unbiased. However, if the biases are present, the actual state estimation error covariance matrix is no longer accurately represented by $P^*_{n|n}$ but rather by $P_{n|n}$ as given above.

If there is a possibility that biases may be present in the measurement or driving noises, then it is usually preferable to estimate their values so that their effect upon the state estimator is diminished. This can easily be accomplished within the framework of maximum likelihood state estimation already established. Define a new state variable

$s_n^T = (x_n^T, B_w^T, B_v^T)$ (2.3.62)

and a new state transition matrix

(2.3.63)

and a new forcing function matrix

(2.3.64)
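The augmentation can be made concrete with a small sketch. The block layout used below for the augmented transition, forcing, and observation matrices is the standard one implied by (2.3.51), (2.3.52), and the state ordering in (2.3.62); it is an assumption, since the matrices of (2.3.63), (2.3.64), and (2.3.66) are not legible in this copy. The names `psi`, `lam`, and `g` and all helper functions are hypothetical.

```python
def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


def eye(size):
    return [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]


def block(block_rows):
    """Assemble a matrix from a list of rows of equally tall blocks."""
    out = []
    for blocks in block_rows:
        for i in range(len(blocks[0])):
            out.append([value for b in blocks for value in b[i]])
    return out


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def augment_for_biases(phi, gamma, h):
    """Build augmented matrices for s = (x, B_w, B_v); the biases are constant states."""
    n, p, m = len(phi), len(gamma[0]), len(h)
    psi = block([[phi, gamma, zeros(n, m)],
                 [zeros(p, n), eye(p), zeros(p, m)],
                 [zeros(m, n), zeros(m, p), eye(m)]])
    lam = block([[gamma], [zeros(p, p)], [zeros(m, p)]])
    g = block([[h, zeros(m, p), eye(m)]])
    return psi, lam, g


phi = [[1.0, 1.0], [0.0, 1.0]]
gamma = [[0.0], [1.0]]
h = [[1.0, 0.0]]
psi, lam, g = augment_for_biases(phi, gamma, h)

x, b_w, b_v, w, v = [1.0, 2.0], [0.5], [-0.3], [0.2], [0.1]
s_next = [a + b for a, b in zip(matvec(psi, x + b_w + b_v), matvec(lam, w))]
z = matvec(g, s_next)[0] + v[0]
```

One step of the augmented model reproduces the biased model exactly: the first two components of `s_next` equal $\Phi x + \Gamma(w + B_w)$, and `z` equals $H x_n + v + B_v$.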

Then the augmented state $s_n$ obeys the recursive relationship

$s_n = \Psi(n,n-1)\,s_{n-1} + \Lambda_n w_n$ (2.3.65)

Define a new observation matrix

(2.3.66)

Then the measurement $z_n$ is given by

$z_n = G_n s_n + v_n$ (2.3.67)

Now the problem is reduced to exactly the same form as the case when the noises were zero mean except that now the state vector is of increased dimension and includes all possible noise biases. The estimator for the augmented state $s_n$ can be formulated in exactly the same way as before, with the initial estimates of the bias components set to zero. This says that the a priori estimates of the biases should always be zero since, if they were nonzero, they could be removed with the residual uncertainty in the bias values then zero mean. The covariance of the initial augmented state estimation error is given by

$E_{0|0} = \begin{pmatrix} P_{0|0} & 0 & 0 \\ 0 & \sigma^2_{B_w} & 0 \\ 0 & 0 & \sigma^2_{B_v} \end{pmatrix}$

where $P_{0|0}$ is the covariance of the unaugmented state estimate, $\sigma^2_{B_w}$ is the covariance of the driving noise bias, and $\sigma^2_{B_v}$ is the covariance of the measurement noise bias. Thus the augmented state can be estimated using the same form of the equations as for the unaugmented state with the substitutions

$H_n \to G_n$, $\Gamma_n \to \Lambda_n$, $\Phi \to \Psi$, $P_{n|n} \to E_{n|n}$, $\hat{x}_{n|n} \to \hat{s}_{n|n}$

If the true covariances of the random parts of the noises as well as the covariances of the bias parts of the noises are known precisely and used in the filter, then it can be shown that $E_{n|n}$ accurately represents the covariance of the augmented state estimation error, and the filter is optimal in a minimum covariance or maximum likelihood sense.

If instead of the measurement and driving noises having a bias, they have a component which is correlated with past noises, then a slightly different approach must be used. Only a limited type of correlation is easily treated so the

following definitions are made. It is assumed that the state obeys the relationship

$x_n = \Phi(n,n-1)\,x_{n-1} + \Gamma_n (w_n + w^c_n)$ (2.3.68)

where $w_n$ is uncorrelated zero mean noise such that

$E(w_n w_j^T) = Q\,\delta_{nj}$ (2.3.69)

and $w^c_n$ is correlated zero mean noise such that

(2.3.70)

$T_w$ is the "correlation time" of the driving noise. It is also assumed that $w_n$ and $w^c_n$ are mutually uncorrelated so that

$E(w_n w_j^{cT}) = 0$ (2.3.71)

The correlated noise $w^c_n$ can be generated by considering $w^c_n$ to be composed of two parts.

$w^c_n = w^*_n + (e^{-1/T_w})\,w^c_{n-1}$ (2.3.72)

where $w^*_n$ is a zero mean random noise that is independent of all past noises with

(2.3.73)

It is easy to show that the correlated noise defined by (2.3.72) has the proper correlation between the noises at different times as given by (2.3.70).

It is also assumed that the measurement $z_n$ is given by

$z_n = H_n x_n + v_n + v^c_n$ (2.3.74)

where $v_n$ is uncorrelated zero mean noise such that

$E(v_n v_j^T) = R\,\delta_{nj}$ (2.3.75)

and $v^c_n$ is correlated zero mean noise such that

(2.3.76)

$T_v$ is the "correlation time" of the measurement noise. It is again assumed that $v_n$ and $v^c_n$ are mutually uncorrelated with the further assumption that all measurement noises are uncorrelated with all driving noises.

Again it is convenient to define the correlated measurement noise by

$v^c_n = v^*_n + (e^{-1/T_v})\,v^c_{n-1}$ (2.3.77)

where $v^*_n$ is a zero mean random noise that is independent of all past noises with

$E(v^*_n v^{*T}_n) = R^c\,(1 - e^{-2/T_v})$ (2.3.78)
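The first-order recursions (2.3.72) and (2.3.77) can be checked numerically for a scalar noise. The sketch below assumes that the driving term has variance $Q^c(1 - e^{-2/T})$, by analogy with (2.3.78), and that the intended correlation is $Q^c e^{-|k|/T}$ at lag $k$; both are assumptions, since (2.3.70) and (2.3.73) are not legible in this copy. All function names are hypothetical.

```python
import math
import random


def stationary_variance(qc, tau, steps):
    """Iterate var_n = a^2 var_{n-1} + var(w*) starting from var_0 = qc."""
    a = math.exp(-1.0 / tau)
    driving_variance = qc * (1.0 - a * a)
    variance = qc
    for _ in range(steps):
        variance = a * a * variance + driving_variance
    return variance


def lagged_covariance(qc, tau, lag):
    """E(w^c_{n+lag} w^c_n) implied by the recursion: each step multiplies by a."""
    a = math.exp(-1.0 / tau)
    return (a ** lag) * qc


def simulate(qc, tau, steps, seed=0):
    """Draw one sample path of the correlated noise."""
    rng = random.Random(seed)
    a = math.exp(-1.0 / tau)
    sigma_star = math.sqrt(qc * (1.0 - a * a))
    value = rng.gauss(0.0, math.sqrt(qc))
    path = []
    for _ in range(steps):
        value = a * value + rng.gauss(0.0, sigma_star)
        path.append(value)
    return path
```

The variance stays at $Q^c$ for every $n$, and the covariance decays exponentially with the lag, so a very large correlation time gives a nearly constant bias while a very small one gives nearly uncorrelated noise.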

It is easy to show that the correlated measurement noise defined by (2.3.77) has the proper correlation between the noises at different times as given by (2.3.76). It should be noted that when the correlation time of the noises becomes very large, the correlated noises approach constant biases, whereas as the correlation times become small, the noises become uncorrelated.

If it is assumed that the state $x_n$ is estimated neglecting this correlation, the state estimate $\hat{x}^*_{n|n}$ is computed using (2.3.38) and (2.3.41), with the "computed" covariance matrix given by (2.3.39) and (2.3.42). Again $\hat{x}^*_{n|n}$ will not be an optimal estimate and $P^*_{n|n}$ will not correspond to the actual state estimation error covariance matrix because of the neglected correlation in the noises. From (2.3.68) and (2.3.41) it can be seen that

(2.3.79)

Then the actual state estimation error covariance matrix before the measurement at time $n$ is given by

(2.3.80)

In order to compute $P_{n|n-1}$, the correlation between the driving noise $w^c_n$ and the state estimation error $\tilde{x}^*_{n-1|n-1}$ must be

computed. This will be done subsequently. From (2.3.74) and (2.3.38) it can be seen that

$\tilde{x}^*_{n|n} = \tilde{x}^*_{n|n-1} + A^*_n (v_n + v^c_n - H_n \tilde{x}^*_{n|n-1})$ (2.3.81)

Then the actual state estimation error covariance matrix after the measurement at time $n$ is given by

(2.3.82)

The correlation between $v^c_n$ and $\tilde{x}^*_{n|n-1}$ must be computed in order to evaluate $P_{n|n}$. Multiplying (2.3.79) by $v^{cT}_n$ and performing the conditional expected value,

(2.3.83)

since it is assumed that $v^c_n$ is independent of $w^c_n$. But using (2.3.77) plus the independence of $v^*_n$,

$E(\tilde{x}^*_{n-1|n-1} v^{cT}_n) = (e^{-1/T_v})\,E(\tilde{x}^*_{n-1|n-1} v^{cT}_{n-1})$ (2.3.84)

Similarly it can be seen that

$E(\tilde{x}^*_{n-1|n-1} w^{cT}_n) = (e^{-1/T_w})\,E(\tilde{x}^*_{n-1|n-1} w^{cT}_{n-1})$ (2.3.85)

$E(\tilde{x}^*_{n-1|n-1} v^{cT}_{n-1})$ and $E(\tilde{x}^*_{n-1|n-1} w^{cT}_{n-1})$ can be computed recursively. Multiplying (2.3.81) by $v^{cT}_n$ and performing the expected value,

$E(\tilde{x}^*_{n|n} v^{cT}_n) = (I - A^*_n H_n)\,E(\tilde{x}^*_{n|n-1} v^{cT}_n) + A^*_n R^c$
$\qquad = (I - A^*_n H_n)\,\Phi(n,n-1)\,(e^{-1/T_v})\,E(\tilde{x}^*_{n-1|n-1} v^{cT}_{n-1}) + A^*_n R^c$ (2.3.86)

Multiplying (2.3.81) by $w^{cT}_n$ and performing the expected value,

(2.3.87)

It is assumed that the initial state estimation error is uncorrelated with the measurement and driving noises, so the initial conditions on the recursive equations (2.3.86) and (2.3.87) are

$E(\tilde{x}^*_{0|0} v^{cT}_0) = E(\tilde{x}^*_{0|0} w^{cT}_0) = 0$

By analogy with the estimation of possible noise biases, it is possible to estimate the correlated part of the measurement and driving noises.

Define a new state variable

$s_n^T = (x_n^T, w^{cT}_n, v^{cT}_n)$ (2.3.88)

and a new state transition matrix

(2.3.89)

and a new forcing function matrix

(2.3.90)

and a new "driving noise" vector

$u_n^T = (w_n^T, w^{*T}_n, v^{*T}_n)$ (2.3.91)

It can be seen that the new state $s_n$ satisfies the relationship

$s_n = \Psi(n,n-1)\,s_{n-1} + \Lambda_n u_n$ (2.3.92)

and the measurement $z_n$ is given by

$z_n = G_n s_n + v_n$ (2.3.93)

where $G_n$ is the augmented observation matrix.

Now the problem is reduced to exactly the same form as the cases when the noises are uncorrelated except that now the state vector is of increased dimension and includes all possible correlated noises. The estimator for the augmented state $s_n$ can be formulated in exactly the same way as before, with the initial conditions and the covariance of the initial augmented state estimation error specified as in the bias case.

Thus the augmented state can be estimated using the same form of the equations as for the unaugmented state without correlated noises. If the true covariance of the correlated and uncorrelated parts of the noises as well as the proper correlation times are known precisely and used in the filter, then it can be shown that $E_{n|n}$ as computed by the filter accurately

represents the covariance of the augmented state estimation error, and the filter is optimal in a minimum covariance or maximum likelihood sense.

Chapter 3

MAXIMUM LIKELIHOOD ESTIMATION OF NOISE COVARIANCE PARAMETERS AND THE SYSTEM STATE

3.1 Introduction

In Chapter 2 the theory of maximum likelihood estimation was briefly discussed and then applied to the problem of state estimation. The resulting equations were derived under the assumption that the probability density functions of the measurement and driving noises as well as the initial state probability density function are known a priori. It was shown that if the second order statistics of the noises are not known precisely, the state estimation becomes suboptimal. The purpose of this chapter is to utilize the concepts of maximum likelihood to remove the restriction that $R$ and $Q$ be known precisely a priori in order to obtain an optimal state estimate.

In Section 3.2 important definitions are given and a summary of some classical results of maximum likelihood estimation discussed. These results concern the asymptotic properties of maximum likelihood estimators, but they cannot be directly applied to the problem of state and noise covariance estimation.

In Section 3.3 the likelihood functions appropriate for the solution of a set of closely related problems are derived,

all of which concern the estimation of the noise covariance parameters. Section 3.4 is devoted to demonstrating the asymptotic properties of these estimators. The remainder of this chapter concerns the application of the theoretical results to the problem of state and noise covariance estimation.

3.2 Summary of Previous Results in Maximum Likelihood Estimation

Maximum likelihood estimation has been studied by many authors and many useful results have been obtained concerning the properties of maximum likelihood estimators. These results apply directly only to a limited set of problems, when the measurements are independent and identically distributed. However, they provide a base upon which the analysis of more general problems can rest. The purpose of this section is to summarize the important results and definitions which will be needed to extend the analysis to more general problems.

First several important definitions must be made. These definitions apply equally well to any situation when the values of certain parameters are to be estimated on the basis of observations of a random variable which is a function of these parameters. They are not limited to situations when the criterion of maximum likelihood is used to define the estimate.

The estimator of the true value of the parameter $\alpha$ is an observable random variable, say $\hat{\alpha}_n(z_1, \ldots, z_n)$, which is a

function of the sample elements $(z_1, \ldots, z_n)$ and whose distribution is, in some sense, concentrated about the true value of $\alpha$. As in linear estimation, it will be found that the covariance of the estimate is often a reasonable criterion for measuring the concentration. If the realized (observed) value of $\hat{\alpha}_n$ corresponding to a realized (observed) value of $(z_1, \ldots, z_n)$ is used for $\alpha_0$, the true value of $\alpha$, then the random variable $\hat{\alpha}_n$ is called a point estimate or estimator for $\alpha_0$. This use of $\hat{\alpha}_n$ normally would be made, of course, only when the value of $\alpha_0$ is unknown.

If when $\alpha = \alpha_0$, $E(\hat{\alpha}_n \mid \alpha_0) = \alpha_0$, then $\hat{\alpha}_n$ is called an unbiased estimator for $\alpha_0$. This is the last definition of unbiasedness that was used in Chapter 2 in the discussion of maximum likelihood state estimation.

If an estimator $\hat{\alpha}_n$ converges to $\alpha_0$ as $n \to \infty$, it is called a consistent estimator for $\alpha_0$. A necessary condition for $\hat{\alpha}_n$ to be a consistent estimator is that it be unbiased and have a covariance which goes to zero as $n \to \infty$.

If $\hat{\alpha}_n$ is an unbiased estimator for $\alpha_0$ having finite covariance and has the further property that no other unbiased estimator has a smaller covariance than $\hat{\alpha}_n$, it is called an efficient estimator.

The following results of maximum likelihood estimation have been obtained by Rao (Ref. 25), Wilks (Ref. 37), and Deutsch (Ref. 6) after certain assumptions have been made about the nature of the likelihood function.

Let $Z_n^T = (z_1^T, \ldots, z_n^T)$ be a vector of independent identically distributed observations and $\alpha$ be the $m \times 1$

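vector of parameters being estimated. Then the joint conditional probability density function of $Z_n$ can be found by application of Bayes' rule.

$f(Z_n \mid \alpha) = f(z_n \mid Z_{n-1}, \alpha)\, f(Z_{n-1} \mid \alpha)$ (3.2.1)

where $f(z_n \mid Z_{n-1}, \alpha)$ is the conditional probability density function of $z_n$ given $Z_{n-1}$ and $\alpha$. Because of the assumed independence of the $z_i$,

$f(z_n \mid Z_{n-1}, \alpha) = f(z_n \mid \alpha)$ (3.2.2)

By repeated application of Bayes' rule, it can be seen that

$f(Z_n \mid \alpha) = \prod_{i=1}^{n} f(z_i \mid \alpha)$ (3.2.3)

It is assumed that the likelihood function is chosen to be the probability density function (3.2.3), in which case the natural logarithm of the likelihood function has the form

$L_n(Z_n, \alpha) = \sum_{i=1}^{n} \ln f(z_i \mid \alpha)$ (3.2.4)

Then

$\frac{\partial L_n(Z_n,\alpha)}{\partial \alpha} = \sum_{i=1}^{n} \frac{\partial \ln f(z_i \mid \alpha)}{\partial \alpha}$ (3.2.5)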
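As a concrete instance of (3.2.3) through (3.2.5), consider independent zero mean normal samples whose variance $\alpha$ is the single parameter to be estimated. This is an illustrative case chosen here, not one treated in the report; the sample values and function names are hypothetical. Setting the derivative of the log likelihood to zero gives the mean of the squared samples as the estimate.

```python
import math


def log_likelihood(samples, alpha):
    """L_n = sum_i ln f(z_i | alpha) for zero-mean normal samples of variance alpha."""
    return sum(-0.5 * (math.log(2.0 * math.pi * alpha) + z * z / alpha)
               for z in samples)


def score(samples, alpha):
    """Derivative of L_n with respect to alpha (the total measurement score)."""
    return sum(-0.5 / alpha + 0.5 * z * z / (alpha * alpha) for z in samples)


def ml_variance(samples):
    """Closed-form root of the likelihood equation for this simple model."""
    return sum(z * z for z in samples) / len(samples)


samples = [0.3, -1.2, 0.8, 2.1, -0.5, -0.9]
alpha_hat = ml_variance(samples)
```

The score vanishes at `alpha_hat`, and the log likelihood is lower at nearby values of the parameter on either side.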

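As stated in Chapter 2, maximum likelihood estimation is concerned with finding the value of the parameters $\alpha$ such that

$\frac{\partial L_n(Z_n,\alpha)}{\partial \alpha} = 0$ (3.2.6)

For notational convenience, define $f = f(z \mid \alpha)$.

The following assumptions are made about the likelihood function.

1. The derivatives $\frac{\partial \ln f}{\partial \alpha}$, $\frac{\partial^2 \ln f}{\partial \alpha^2}$, $\frac{\partial^3 \ln f}{\partial \alpha^3}$ exist for almost all $z$ in an interval $\Omega$ of $\alpha$.
2. $E\left[\frac{\partial \ln f}{\partial \alpha}^T \frac{\partial \ln f}{\partial \alpha} \,\Big|\, \alpha_0\right]$ is positive definite.
3. For every $\alpha$ in $\Omega$, the third derivatives are bounded in magnitude by a function $M(z)$ with $E[M(z) \mid \alpha_0] < K$ for some $K$ which is independent of $\alpha_0$ and $n$.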

Define

$S(z,\alpha) = \frac{\partial \ln f(z \mid \alpha)}{\partial \alpha}$, the $m \times 1$ single measurement score

$S(Z_n,\alpha) = \frac{\partial L_n(Z_n,\alpha)}{\partial \alpha}$, the $m \times 1$ total measurement score

$J(\alpha_0) = J(\alpha_0,\alpha_0)$, the $m \times m$ single measurement conditional information matrix

$J_n(\alpha_0) = J_n(\alpha_0,\alpha_0)$, the $m \times m$ total measurement conditional information matrix

The following theorems are from Wilks. The proofs will not be repeated here but will be discussed subsequently.

Asymptotic Distribution of the Score

Suppose $(z_1, \ldots, z_n)$ is a sample from the probability density function $f(z \mid \alpha_0)$. Let $f(z \mid \alpha)$ possess finite first derivatives with respect to $\alpha$ in the range $\Omega$. Then if $J(\alpha,\alpha)$ is positive definite for $\alpha$ in $\Omega$, the total measurement score $S(Z_n,\alpha_0)$ is asymptotically distributed for large $n$

as a zero mean normal random variable with covariance $J_n(\alpha_0)$.

Convergence of the Maximum Likelihood Estimator

Suppose $(z_1, \ldots, z_n)$ is a sample from the probability density function $f(z \mid \alpha_0)$ where $f(z \mid \alpha)$ possesses finite first derivatives with respect to $\alpha$ in $\Omega$. Let $S^j(z,\alpha)$, the $j$th component of the vector $S(z,\alpha)$, be a continuous function of $\alpha$ in $\Omega$ for all values of $z$ except possibly for a set of zero probability. Then there exists a sequence of solutions of (3.2.6) which converges almost certainly to $\alpha_0$. If the solution is a unique vector $\hat{\alpha}_n$ for $n \geq$ some $n_0$, the sequence of vectors converges almost certainly to $\alpha_0$ as $n \to \infty$.

Asymptotic Distribution of the Maximum Likelihood Estimator

If $(z_1, \ldots, z_n)$ is a sample from the probability density function $f(z \mid \alpha_0)$ where $f(z \mid \alpha)$ possesses finite first and second derivatives with respect to $\alpha$ in the range $\Omega$, and if the maximum likelihood estimator satisfying (3.2.6) is unique for $n \geq$ some $n_0$, then it is asymptotically normally distributed for large $n$ with mean $\alpha_0$ and covariance $[J_n(\alpha_0)]^{-1}$.

Thus under the assumptions previously given, the maximum likelihood estimator of the parameters $\alpha_0$ is asymptotically unbiased and normally distributed for any value of $\alpha_0$ in the range $\Omega$, with covariance $[J_n(\alpha_0)]^{-1}$.

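Now the distribution of the estimation error over the ensemble of all possible true values of $\alpha_0$ is sought. An analytic expression for the unconditional probability density function of $\hat{\alpha}_n$ cannot be found in most situations. Formally

$f(\hat{\alpha}_n) = \int_\Omega f(\hat{\alpha}_n \mid \alpha_0)\, f(\alpha_0)\, d\alpha_0$

Even if $f(\hat{\alpha}_n \mid \alpha_0)$ is a normal density function, the above integral is usually nonanalytic for any nontrivial $f(\alpha_0)$. However, even if the unconditional distribution of $\hat{\alpha}_n$ is not known, two useful moments of the distribution, the mean and covariance, can be evaluated.

The unconditional mean of the estimate is defined by

$E(\hat{\alpha}_n) = \int_\Omega E(\hat{\alpha}_n \mid \alpha_0)\, f(\alpha_0)\, d\alpha_0 = \int_\Omega \alpha_0\, f(\alpha_0)\, d\alpha_0 = \bar{\alpha}_0$

where $\bar{\alpha}_0$ is the mean of the distribution $f(\alpha_0)$.

The unconditional covariance of the estimate is defined by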

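$\mathrm{cov}(\hat{\alpha}_n) = E[(\hat{\alpha}_n - \bar{\alpha}_0)(\hat{\alpha}_n - \bar{\alpha}_0)^T]$
$\qquad = E\{[(\hat{\alpha}_n - \alpha_0) + (\alpha_0 - \bar{\alpha}_0)][(\hat{\alpha}_n - \alpha_0) + (\alpha_0 - \bar{\alpha}_0)]^T\}$

But the cross terms vanish, since $E(\hat{\alpha}_n - \alpha_0 \mid \alpha_0) = 0$, so

$\mathrm{cov}(\hat{\alpha}_n) = E[(\hat{\alpha}_n - \alpha_0)(\hat{\alpha}_n - \alpha_0)^T] + E[(\alpha_0 - \bar{\alpha}_0)(\alpha_0 - \bar{\alpha}_0)^T]$

But $E[(\hat{\alpha}_n - \alpha_0)(\hat{\alpha}_n - \alpha_0)^T \mid \alpha_0] = J_n^{-1}(\alpha_0)$ and

$E[(\alpha_0 - \bar{\alpha}_0)(\alpha_0 - \bar{\alpha}_0)^T] = \mathrm{cov}(\alpha_0)$, the covariance of the $\alpha_0$ distribution. Then

$\mathrm{cov}(\hat{\alpha}_n) = \overline{J_n^{-1}} + \mathrm{cov}(\alpha_0)$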

$\overline{J_n^{-1}}$ represents the mean square estimation error matrix, which for any nontrivial $f(\alpha_0)$ is nonanalytic. Formally

$\overline{J_n^{-1}} = \int_\Omega J_n^{-1}(\alpha_0)\, f(\alpha_0)\, d\alpha_0$

There are several approximate techniques for evaluating this integral which are discussed in Section 3.7.

3.3 Derivation of the Likelihood Function

In this section several closely related problems are studied and the likelihood function appropriate for the solution of each derived. It will be shown that the asymptotic behavior of the solutions of each problem is the same so that if the asymptotic behavior of any one is found, the results can be applied to the others. The notation and definitions of Section 2.3 are used with the additional assumption that the measurement and driving noise covariance matrices are diagonal and time invariant. The technique of maximum likelihood estimation is not restricted to cases when this assumption is valid, but the estimation problem becomes much more complicated if this assumption is not made. A discussion of the problem when this restriction is not employed is given in Chapter 7.

Estimation of Noise Covariance Parameters with No A Priori Noise Covariance Information

The first problem considered is estimating the diagonal elements of the measurement and driving noise covariance

matrices without the use of a priori information about these quantities. The maximum likelihood estimate of the noise covariance parameters is defined by

(3.3.1)

where $l_n(R,Q,Z_n)$ is the likelihood function which is chosen to be the conditional probability density function

$l_n(R,Q,Z_n) = f(Z_n \mid R,Q)$ (3.3.2)

By application of Bayes' rule

$f(Z_n \mid R,Q) = f(z_n \mid Z_{n-1},R,Q)\, f(Z_{n-1} \mid R,Q)$

Repeating the above procedure to find $f(Z_{n-1} \mid R,Q)$, it can be shown that

$f(Z_n \mid R,Q) = \prod_{i=1}^{n} f(z_i \mid Z_{i-1},R,Q)$ (3.3.3)

where $f(z_i \mid Z_{i-1},R,Q)$ is the conditional probability density function of $z_i$ given $Z_{i-1}$, $R$, and $Q$. Using the results of Section 2.3, it can be shown that $z_i$ is a normally distributed random variable with conditional mean

$E(z_i \mid Z_{i-1},R,Q) = H_i \hat{x}_{i|i-1}$

and conditional covariance

$E(\tilde{z}_i \tilde{z}_i^T \mid Z_{i-1},R,Q) = R + H_i P_{i|i-1} H_i^T$

where

$\tilde{z}_i = z_i - H_i \hat{x}_{i|i-1}$

$\hat{x}_{i|i-1}$ is the maximum likelihood estimate of $x_i$ after $i-1$ measurements using the true values of $R$ and $Q$ to compute the proper filter gains, and $P_{i|i-1}$ is the conditional covariance of $x_i$ about $\hat{x}_{i|i-1}$. It is assumed that a priori information about the state is used in forming the above state estimates so that a unique $\hat{x}_{i|i-1}$ exists for all $i$. Define

$B_i = R + H_i P_{i|i-1} H_i^T$

Then the conditional probability density function of $z_i$ is given, up to a normalizing constant, by

$f(z_i \mid Z_{i-1},R,Q) \propto |B_i|^{-1/2}\, e^{-\frac{1}{2}(\tilde{z}_i^T B_i^{-1} \tilde{z}_i)}$ (3.3.4)

As in Chapter 2, it is convenient to work with the natural logarithm of the likelihood function (3.3.2).

After algebraic manipulation,

$L_n(R,Q,Z_n) = \text{constant} - \frac{1}{2} \sum_{i=1}^{n} \left[ \ln|B_i| + \tilde{z}_i^T B_i^{-1} \tilde{z}_i \right]$ (3.3.5)

where "constant" includes all terms that are not functions of $R$ or $Q$.

It is convenient to introduce an auxiliary variable. $\xi$ is the $(\gamma + \eta) \times 1$ vector of the diagonal elements of $R$ and $Q$. The likelihood equations are obtained by equating the derivatives of $L_n(R,Q,Z_n)$ with respect to $\xi$ to zero. Using the identities of Appendix A, after algebraic manipulation,

(3.3.6)

$\hat{\xi}_n$ is found as the solution of

(3.3.7)

In general there is no closed form solution of (3.3.7) for $\hat{\xi}_n$, so an iterative solution like those described in Section 3.6 must be employed.
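The log likelihood (3.3.5) is straightforward to evaluate for trial values of $R$ and $Q$ by running the filter and accumulating the residual terms. The following is a minimal sketch for a scalar, time-invariant system with the additive constant dropped; the function name, argument names, and the scalar specialization are assumptions made for illustration.

```python
import math


def noise_log_likelihood(measurements, phi, gamma, h, r, q, x0, p0):
    """Evaluate -1/2 * sum(ln B_i + resid_i^2 / B_i) for trial variances r and q."""
    x_est, p_est = x0, p0
    total = 0.0
    for z in measurements:
        # Predict the state and its variance to the time of measurement i.
        x_pred = phi * x_est
        p_pred = phi * p_est * phi + gamma * q * gamma
        # Residual and its variance B_i = R + H P_{i|i-1} H^T.
        b = r + h * p_pred * h
        residual = z - h * x_pred
        total += -0.5 * (math.log(b) + residual * residual / b)
        # Measurement update with the gain implied by the trial r and q.
        gain = p_pred * h / b
        x_est = x_pred + gain * residual
        p_est = (1.0 - gain * h) * p_pred
    return total


value = noise_log_likelihood([2.0], phi=1.0, gamma=1.0, h=1.0,
                             r=1.0, q=0.0, x0=0.0, p0=1.0)
```

An iterative or search procedure would call such a function repeatedly for different trial values of the noise variances and keep the values that give the largest result.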

Estimation of Noise Covariance Parameters with A Priori Noise Covariance Information

In this problem the measurement and driving noise covariance matrices are not known precisely a priori but rather knowledge of them is described by a joint probability density function $f(R,Q)$, where it is assumed that $f(R,Q)$ is known a priori. The maximum likelihood estimate of the noise covariance parameters in this case is defined by

(3.3.8)

where $l'_n(R,Q,Z_n)$ is the augmented likelihood function which is chosen to be the conditional probability density function

$l'_n(R,Q,Z_n) = f(R,Q \mid Z_n)$ (3.3.9)

By application of Bayes' rule

$f(R,Q \mid Z_n) = \frac{f(Z_n \mid R,Q)\, f(R,Q)}{f(Z_n)}$ (3.3.10)

$f(Z_n)$ need not be evaluated as it is not a function of $R$ or $Q$. Formally

$f(Z_n) = \int\!\!\int f(Z_n \mid R,Q)\, f(R,Q)\, dR\, dQ$

All $R$ and $Q$ dependence is integrated out.

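Define

$L'_n(R,Q,Z_n) = \ln l'_n(R,Q,Z_n)$ (3.3.11)

Then it can be seen that

$L'_n(R,Q,Z_n) = L_n(R,Q,Z_n) + \ln f(R,Q) - \ln f(Z_n)$ (3.3.12)

It is assumed that $R$ and $Q$ are independent random variables, in which case

$f(R,Q) = f(R)\, f(Q)$ (3.3.13)

It is further assumed that the diagonal elements of $R$ and $Q$ are mutually independent, so

$f(R) = \prod_{i=1}^{\gamma} f(R_{ii}), \qquad f(Q) = \prod_{i=1}^{\eta} f(Q_{ii})$

Then

$L'_n(R,Q,Z_n) = \text{constant} - \frac{1}{2} \sum_{i=1}^{n} \left[ \ln|B_i| + \tilde{z}_i^T B_i^{-1} \tilde{z}_i \right] + \sum_{i=1}^{\gamma} \ln f(R_{ii}) + \sum_{i=1}^{\eta} \ln f(Q_{ii})$ (3.3.14)

where "constant" includes all terms that are not functions of $R$ and $Q$.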

The derivative of (3.3.14) with respect to $\xi$ is then set to zero and solved for $\hat{\xi}_n$. Again there is no general closed form solution so some iterative procedure must be employed. However, it can be seen that the inclusion of a priori information has a tendency to shift the solution point towards the peak of the a priori distribution of $\xi$.

Estimation of Noise Covariance Parameters and the System State with No A Priori Noise Covariance Information

In this problem the noise covariance parameters and the state are to be estimated simultaneously. No a priori information about the noise covariance parameters is to be used, but as before it is assumed that a priori state information is used. The maximum likelihood estimate of these quantities is defined by

(3.3.15)

where $l_n(R,Q,x_n,Z_n)$ is the likelihood function which is chosen to be the conditional probability density function

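$l_n(R,Q,x_n,Z_n) = f(x_n, Z_n \mid R,Q)$ (3.3.16)

where $f(x_n, Z_n \mid R,Q)$ is the joint conditional probability density function of the state $x_n$ and the measurements $Z_n$ given $R$ and $Q$. By application of Bayes' rule

(3.3.17)

(3.3.18)

The set of parameters to be estimated is now

$\alpha^T = (x_n^T, \xi^T)$

Using (2.3.22) and (3.3.5) it can be seen that

$L_n(R,Q,x_n,Z_n) = \text{constant} - \frac{1}{2} \left[ \ln|P_{n|n}| + \tilde{x}_{n|n}^T P_{n|n}^{-1} \tilde{x}_{n|n} + \sum_{i=1}^{n} \left( \ln|B_i| + \tilde{z}_i^T B_i^{-1} \tilde{z}_i \right) \right]$ (3.3.19)

where

$\tilde{x}_{n|n} = x_n - \hat{x}_{n|n}$

and "constant" includes all terms that are not functions of $x_n$, $R$, or $Q$.

The likelihood equations are obtained by equating the derivatives of $L_n$ with respect to $\alpha$ to zero. Dealing first with finding the state estimate,

$\frac{\partial L_n}{\partial x_n} = -(x_n - \hat{x}_{n|n})^T P_{n|n}^{-1}$ (3.3.20)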

Then the solution of

$\frac{\partial L_n}{\partial x_n} = 0$ (3.3.21)

is clearly

$\hat{x}_n = \hat{x}_{n|n}$ (3.3.22)

This says that the maximum likelihood estimate of the state $x_n$ after $n$ measurements is just the maximum likelihood state estimate which uses the estimates of $R$ and $Q$ to compute the filter gains.

The simultaneous estimates for $\xi$ ($R$ and $Q$) are found as the solutions of

$\frac{\partial L_n}{\partial \xi} = 0$ (3.3.23)

Using the identities of Appendix A, after algebraic manipulation,

(3.3.24)

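Substituting the solution of (3.3.21) into (3.3.24),

(3.3.25)

As before there is no general closed form solution of (3.3.25) for $\hat{\xi}_n$, so some iterative procedure must be employed. However, when there is no driving noise ($Q = 0$) a considerable simplification occurs. By use of Bayes' rule, the likelihood function (3.3.16) can be rewritten in terms of the density of the measurements conditioned upon the state. By repeated application of Bayes' rule, this density can be factored into a product over the individual measurements. When $Q = 0$, it is easy to show that each factor is a normal density in the residual $\tilde{z}_{i|n}$ with covariance $R$,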

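where

$\tilde{z}_{i|n} = z_i - H_i \Phi(i,n)\, x_n$

Then (3.3.19) becomes

$L_n(R,x_n,Z_n) = \text{constant} - \frac{1}{2} \left[ (x_n - \hat{x}_{n|0})^T P_{n|0}^{-1} (x_n - \hat{x}_{n|0}) + \sum_{i=1}^{n} \left( \ln|R| + (z_i - H_i \Phi(i,n) x_n)^T R^{-1} (z_i - H_i \Phi(i,n) x_n) \right) \right]$ (3.3.28)

Define

$F(n) = \sum_{i=1}^{n} \Phi^T(i,n)\, H_i^T R^{-1} H_i\, \Phi(i,n)$

Then after algebraic manipulation, the solution of (3.3.21) for $\hat{x}_{n|n}$ is

$\hat{x}_{n|n} = \left( P_{n|0}^{-1} + F(n) \right)^{-1} \left( P_{n|0}^{-1} \hat{x}_{n|0} + \sum_{i=1}^{n} \Phi^T(i,n)\, H_i^T R^{-1} z_i \right)$ (3.3.30)

Using the identities of Appendix A, it can be shown that

$\frac{\partial L_n}{\partial \xi^j} = -\frac{1}{2} \sum_{i=1}^{n} \mathrm{Tr}\left[ \left( R^{-1} - R^{-1} \tilde{z}_{i|n} \tilde{z}_{i|n}^T R^{-1} \right) \frac{\partial R}{\partial \xi^j} \right]$ (3.3.31)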

The solution of (3.3.25) for $\hat{R}_n$ then becomes

(3.3.32)

A closed form solution of (3.3.30) and (3.3.32) for $\hat{x}_{n|n}$ and $\hat{R}_n$ is not possible except in the trivial case of a scalar measurement and when no a priori information about the state is used. In this case, $P_{n|0}^{-1} = 0$ and (3.3.30) becomes

$\hat{x}_{n|n} = \left[ \sum_{i=1}^{n} \Phi^T(i,n)\, H_i^T H_i\, \Phi(i,n) \right]^{-1} \sum_{i=1}^{n} \Phi^T(i,n)\, H_i^T z_i$ (3.3.33)

From (3.3.33) it can be seen that $\hat{x}_{n|n}$ is not a function of $R$ so that $\hat{x}_{n|n}$ can be computed independently of what value of $\hat{R}_n$ is obtained from (3.3.32). In any other case a numerical solution of (3.3.30) and (3.3.32) must be performed. However, even if a closed form solution is not obtained, the estimation equations in this no driving noise case have a particularly simple form.

Estimation of the Noise Covariance Parameters and the System State with A Priori Noise Covariance Information

In this problem the state and noise covariance parameters are to be simultaneously estimated when a priori information about $R$ and $Q$ is used. The maximum likelihood estimate of these quantities in this case is defined by

(3.3.34)
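The closed form case described above can be sketched for a scalar state and scalar measurement with no driving noise and no a priori state information. The state estimate follows (3.3.33). The variance estimate used below, the mean of the squared residuals, is the root of the scalar form of (3.3.31) and is stated here as an assumption because (3.3.32) is not legible in this copy. The time-invariant transition $\Phi(i,n) = \phi^{\,i-n}$ and all names are hypothetical.

```python
def estimate_state_and_r(measurements, phi, h):
    """Return (x_hat, r_hat) for z_i = h * phi**(i - n) * x_n + v_i, i = 1..n."""
    n = len(measurements)
    # a_i = H * Phi(i, n) maps the state at time n back to the time of measurement i.
    a = [h * phi ** (i - n) for i in range(1, n + 1)]
    # Scalar form of (3.3.33): R cancels, so x_hat does not depend on it.
    x_hat = sum(ai * zi for ai, zi in zip(a, measurements)) / sum(ai * ai for ai in a)
    # Mean squared residual as the measurement noise variance estimate.
    residuals = [zi - ai * x_hat for ai, zi in zip(a, measurements)]
    r_hat = sum(e * e for e in residuals) / n
    return x_hat, r_hat


phi, h, x_true, n = 0.9, 1.5, 2.0, 5
coefficients = [h * phi ** (i - n) for i in range(1, n + 1)]
clean = [c * x_true for c in coefficients]
noisy = [z + e for z, e in zip(clean, [0.1, -0.2, 0.05, 0.15, -0.1])]
x_clean, r_clean = estimate_state_and_r(clean, phi, h)
x_noisy, r_noisy = estimate_state_and_r(noisy, phi, h)
```

With noise-free data the state is recovered exactly and the variance estimate is zero; with noisy data the residuals are orthogonal to the coefficients, which is the least squares condition behind (3.3.33).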

where $l'_n(R,Q,x_n,Z_n)$ is the augmented likelihood function which is chosen to be the conditional probability density function

(3.3.35)

By use of Bayes' rule

(3.3.36)

From (3.3.10), assuming that all the diagonal elements of $R$ and $Q$ are mutually independent, it can be shown that

$L'_n(R,Q,x_n,Z_n) = \ln l'_n(R,Q,x_n,Z_n) = L_n(R,Q,x_n,Z_n) + \sum_{i=1}^{\gamma} \ln f(R_{ii}) + \sum_{i=1}^{\eta} \ln f(Q_{ii})$ (3.3.37)

so

(3.3.38)

where $\frac{\partial L_n(R,Q,x_n,Z_n)}{\partial \alpha}$ is given by (3.3.20) and (3.3.24).

It can be seen that the likelihood equation for the state is unchanged by the inclusion of a priori information about $\xi$

since $f(\xi)$ is not a function of $x_n$. The likelihood equations for the noise covariance parameters are modified by the addition of the term related to the a priori probability density function of the parameters $\xi$.

Several comments should be made about the four problems just discussed. In each problem it was assumed that a priori information about the state was used in forming the state estimates. This assumption greatly simplifies the formulation and solution of the problem while not being unreasonably restrictive. If the initial state estimate is believed to be of poor quality, then setting its covariance to a large positive definite matrix will effectively result in not using the a priori information about the state. The assumption that the initial state uncertainty has a normal distribution is a realistic assumption in most applications.

However, it was felt that a distinction should be made between noise covariance estimators which do or do not use a priori information about these parameters. The derivation of the estimation equations with no a priori noise covariance information is important because an arbitrary selection of an a priori distribution of these quantities does not have to be made. The proper choice of a distribution for the covariance parameters is much less clear than was the case in choosing a distribution of the initial state estimation error. The case of no a priori information could be handled within the framework of the estimator that uses a priori information by setting the covariance of the a priori noise covariance parameter

distribution to a large quantity, but with relatively little additional effort the two cases can be treated separately.

The most physically motivated problem is the last of the four given above, that of maximizing the joint conditional probability density function of the state and noise covariance parameters. The solution of this problem gives the most probable values of the state and noise covariances based upon the measurements and the a priori information. However, as will be seen, the asymptotic behavior of the solution of this problem is most easily obtained in terms of the asymptotic behavior of the simpler problem of estimating the noise covariance parameters alone. This is the primary motivation for separately treating these two problems.

Noise Covariance and Likelihood Estimators

In Section 3.2 the asymptotic properties of a restricted set of maximum likelihood estimators were given, namely that class of estimators for which the measurements were independent and identically distributed. Now the asymptotic properties of four maximum likelihood estimators that do not fit in the above category are sought.

1) noise covariance estimation with no a priori information
2) noise covariance estimation with a priori information
3) noise covariance and system state estimation with no a priori information

4) noise covariance and system state estimation with a priori information

As will be shown, if the asymptotic properties of the first of the above estimators are found, the properties of the other three follow immediately. Therefore, the asymptotic properties of the noise covariance estimator with no a priori information will be found first.

The maximum likelihood estimate of $R$ and $Q$ was defined as the solution of (3.3.7). Define the single measurement score

$S^j(Z_i,\xi) = \frac{1}{2} \mathrm{Tr}\left[ \left( B_i^{-1} - B_i^{-1} \tilde{z}_i \tilde{z}_i^T B_i^{-1} \right) \frac{\partial B_i}{\partial \xi^j} - 2 B_i^{-1} \tilde{z}_i \frac{\partial \hat{x}_{i|i-1}^T}{\partial \xi^j} H_i^T \right]$ (3.4.1)

(3.4.1) differs from the single measurement score of Section 3.2 because it is a function of all measurements up to and including the $i$th measurement. Define the total measurement score

(3.4.4)

$J_i(\xi_0) = J_i(\xi_0,\xi_0)$, the single measurement conditional information matrix

$J_n(\xi_0) = J_n(\xi_0,\xi_0)$, the total measurement conditional information matrix

Then the likelihood equations (3.3.6) become

(3.4.5)

It can be shown that when $\xi = \xi_0$, the true value of the parameters, the measurement residuals $\tilde z_i$ are zero mean normal variables with covariance $B_i$, with the further property that the residuals at different times are independent. Or

$$E[\tilde z_i] = 0, \qquad E[\tilde z_i \tilde z_l^T] = B_i\,\delta_{il}$$

It can also be shown that [...] where

[...] and [...]

Therefore, after algebraic manipulation it can be shown that

$$E\!\left[S^j(z_i,\xi_0)\,S^k(z_l,\xi_0)\right] = \left[\tfrac{1}{2}\,\mathrm{Tr}\!\left(B_i^{-1}\frac{\partial B_i}{\partial \xi^j}B_i^{-1}\frac{\partial B_i}{\partial \xi^k}\right) + \mathrm{Tr}\!\left(B_i^{-1}H_i\,G^{jk}_{i|i-1}H_i^T\right)\right]\delta_{il} \tag{3.4.7}$$

From (3.4.7) it can be seen that $S(z_i,\xi_0)$ is independent of $S(z_l,\xi_0)$ for $i \neq l$. Then it follows immediately that

(3.4.8)

(3.4.7) and (3.4.9) represent respectively the single and total measurement conditional information matrices.

Because of the independence of the measurement residuals when $\xi = \xi_0$ and the other relationships shown above, the asymptotic properties of the maximum likelihood noise covariance estimator can be found relatively easily. These properties are quite similar to those mentioned in Section 3.2 even though

the measurements are not now identically distributed.

Asymptotic Distribution of the Score - Suppose $(z_1,\ldots,z_n)$ is a sample from the probability density function $f(z_i \mid Z_{i-1},\xi_0)$. Let $f(z_i \mid Z_{i-1},\xi)$ possess finite first derivatives with respect to $\xi$ in the range $\Omega$. Then if $J_n(\xi,\xi_0)$ is positive definite for $\xi$ in $\Omega$, $S(Z_n,\xi_0)$ is asymptotically distributed for large $n$ as a zero mean normal random variable with covariance $J_n(\xi_0)$.

Proof: It has already been shown that $S(Z_n,\xi_0)$ is a zero mean random variable with covariance $J_n(\xi_0)$. Now all that remains to show is that $S$ is asymptotically normally distributed. From the definition of $S(Z_n,\xi_0)$,

$$S(Z_n,\xi_0) = \sum_{i=1}^{n} S(z_i,\xi_0)$$

It was shown that $S(z_i,\xi_0)$ was independent of $S(z_l,\xi_0)$ for $i \neq l$. If it is assumed that no term dominates the above sum by having a large value with appreciable probability, then by use of the central limit theorem concerning the sum of independent random variables, the score $S(Z_n,\xi_0)$ can be shown to be asymptotically normally distributed for large $n$.

Consistency of the Maximum Likelihood Estimator - Suppose $(z_1,\ldots,z_n)$ is a sample from the probability density function $f(z_i \mid Z_{i-1},\xi_0)$. Let $f(z_i \mid Z_{i-1},\xi)$ possess finite first derivatives with respect to $\xi$ in $\Omega$. Let $S^j(z_i,\xi)$ be a continuous function of $\xi$ in $\Omega$ for all values of $z_i$

except possibly for a set of zero probability. If [...] as $n \to \infty$, then there exists a sequence of solutions of

$$S^j(Z_n,\xi) = 0 \tag{3.4.10}$$

which converges in probability to $\xi_0$. If for $n \geq$ some $n_0$ the solution is a unique vector $\hat\xi_n$, the sequence of vectors converges in probability to $\xi_0$ as $n \to \infty$.

Proof: Define [...]. Then $\frac{1}{n}S^j(Z_n,\xi)$ is the mean of a sample of size $n$ from a population having mean $\bar s^j(\xi_0,\xi)$ if $\xi_0$ is the true value of $\xi$. From the weak law of large numbers, $\frac{1}{n}S^j$ converges in probability to $\bar s^j(\xi_0,\xi)$. Without loss of generality, define $\Omega'$ to be $(\xi_0-\delta,\ \xi_0+\delta)$ with $\delta > 0$. It can be shown that $\bar s^j(\xi_0,\xi)$ is monotonically decreasing over this interval, and since

[...]

Therefore there exists an $n(\delta,\epsilon)$ so that the probability exceeds $1-\epsilon$ that both of the following inequalities hold for any $n > n(\delta,\epsilon)$ if $\xi_0$ is the true value of $\xi$:

[...]

Since $S^j(z_i,\xi)$ is continuous in $\xi$ over $\Omega$ for all $z_i$ except for a set of probability zero, a similar statement holds for $S(Z_n,\xi)$ in $\Omega'$. Therefore, for any fixed $n > n(\delta,\epsilon)$, [...] for some [...]. This is equivalent to the statement that a sequence of roots of (3.4.10) exists which converges in probability to $\xi_0$. In particular, if (3.4.10) has a unique solution $\hat\xi_n$ for $n = n_0+1, \ldots$, for some integer $n_0$, then the sequence $\hat\xi_n$, $n > n_0$, converges in probability to $\xi_0$.

Asymptotic Distribution of the Maximum Likelihood Estimator - If $(z_1,\ldots,z_n)$ is a sample from the probability density function $f(z_i \mid Z_{i-1},\xi_0)$ where $f(z_i \mid Z_{i-1},\xi)$ possesses finite

first and second derivatives with respect to $\xi$ in the range $\Omega$, and if the maximum likelihood estimator $\hat\xi_n$ satisfying (3.4.10) is unique for $n \geq$ some $n_0$, then it is asymptotically normally distributed for large $n$ with mean $\xi_0$ and covariance $[J_n(\xi_0)]^{-1}$.

Proof: First it will be shown that

(3.4.11)

with large probability. This will then be used to show that $\hat\xi_n$ is an efficient estimator and that the asymptotic distribution of $(\hat\xi_n-\xi_0)$ is normal with zero mean and covariance $[J_n(\xi_0)]^{-1}$.

Since $\hat\xi_n$ satisfies the likelihood equation, then by a Taylor series expansion of $S^j$ at $\xi_0$, [...] where $\tilde\xi_0 = \hat\xi_n - \xi_0$. Here as elsewhere index summation notation is used. If an index appears more than once on the right side of an equation with no comparable index on the left side, a summation over that index is implied. Define

[...]

Then (3.4.12) becomes

$$0 = S^T(Z_n,\xi_0) + C(Z_n,\xi_0)\,\tilde\xi_0$$

Assuming that $C$ is of full rank,

$$\tilde\xi_0 = -\left[C(Z_n,\xi_0)\right]^{-1} S^T(Z_n,\xi_0) \tag{3.4.13}$$

Define

$$b = J_n(\xi_0)\left[C(Z_n,\xi_0)\right]^{-1}$$

Multiplying (3.4.13) by $J_n(\xi_0)$ and rearranging terms,

$$J_n(\xi_0)\,\tilde\xi_0 - S^T(Z_n,\xi_0) = -(b + I)\,S^T(Z_n,\xi_0) \tag{3.4.14}$$

It will now be shown that $b \to -I$ with large probability, in which case the right hand side of (3.4.14) $\to 0$, establishing the desired result. As before, define [...] and [...]. Now define

[...]

Then [...]. Assuming that differentiation with respect to $\xi$ can be taken outside the integral,

$$\frac{\partial}{\partial \xi}(1) = 0$$

Or [...]. But [...]. As $n$ becomes large, by application of the strong law of large numbers, it can be shown that

[...]

Analogous to the assumption made in Section 3.2, it is assumed that the third derivatives of the log-likelihood with respect to the components of $\xi$, scaled by $1/n$, are bounded in magnitude by $K$ with large probability as $n \to \infty$, where $K$ is independent of $\xi$ and $n$. Since $\tilde\xi_0 \to 0$, the product [...] $\to 0$ with large probability. Assuming that [...] for large $n$, where $K_1$ is a positive definite matrix independent of $n$, then [...] and $b \to -I$ with large probability.

Thus it has been shown that

(3.4.15)

It has already been shown that $S(Z_n,\xi_0)$ is asymptotically distributed as a normal random variable with zero mean and covariance $J_n(\xi_0)$. From this and (3.4.15) it can be concluded that $(\hat\xi_n-\xi_0)$ is normally distributed with zero mean and

covariance $[J_n(\xi_0)]^{-1}$ as $n \to \infty$. Wilks has shown that (3.4.15) is a necessary and sufficient condition for stating that $\hat\xi_n$ is an asymptotically efficient estimator for $\xi_0$.

Thus it has been shown that the maximum likelihood estimator for the noise covariance parameters using no a priori information about these parameters is: 1) consistent, 2) asymptotically unbiased, 3) asymptotically normally distributed, and 4) asymptotically efficient. Now the asymptotic properties of the three closely related estimators previously mentioned are sought.

If a priori information about $\xi$ is used, the maximum likelihood estimator was defined to be the solution of (3.3.14). The estimator in the absence of a priori information is the solution of (3.3.6). These two conditions are written as (3.4.16) and (3.4.17) respectively, where $\hat\xi_1$ is the estimator using a priori information and $\hat\xi_2$ is the estimator without using a priori information. Expanding (3.4.16) in a Taylor series about $\hat\xi_2$,

[...] $+ \ldots = 0$

But [...] and [...]. It has been shown that for large $n$, [...]. But [...] so [...].

It has already been assumed that as $n \to \infty$, $[J_n(\xi_0)]^{-1} \to 0$. Now the assumption is made that $\hat\xi_2$ is sufficiently close to $\xi_0$ so that the following approximation is valid:

$$J_n(\hat\xi_2) \approx J_n(\xi_0)$$

It is also assumed that as $n \to \infty$,

(3.4.19)

so that $-J_n(\xi_0)$ dominates in

[...] and then (3.4.18) becomes [...]. The first linear correction to the solution $\hat\xi_2$ due to inclusion of a priori information is then

$$\hat\xi_1 = \hat\xi_2 + \left[J_n(\xi_0)\right]^{-1}\left.\frac{\partial \ln f(\xi)}{\partial \xi}\right|_{\hat\xi_2}^{T}$$

But as $n \to \infty$, $[J_n(\xi_0)]^{-1} \to 0$, so assuming that all elements of $\partial \ln f(\xi)/\partial \xi$ are finite,

$$\hat\xi_1 \to \hat\xi_2 \quad \text{as } n \to \infty$$

Therefore, under a wide set of conditions, the estimator which utilizes a priori information behaves asymptotically as the estimator which does not utilize this a priori information.

If the state and noise covariance parameters are estimated without a priori information about $\xi$, the maximum likelihood estimator was defined to be the solution of (3.3.21) and (3.3.23). Or

(3.4.20)

(3.4.21)

The estimator for $\xi$ alone with no a priori information about $\xi$ was defined to be the solution of (3.3.6). Or

(3.4.22)

where $\hat\xi_1$ is the estimate of $\xi$ found simultaneously with $\hat x_{n|n}$ and $\hat\xi_2$ is the estimate of $\xi$ found independently. It can be seen that [...]. Expanding (3.4.21) in a Taylor series about $\hat\xi_2$,

[...]

It has been shown that for large $n$, [...] where $J_n(\xi_0)$ is the conditional information matrix. Analogous to an assumption previously made, it is assumed that $\hat\xi_2$ is sufficiently close to $\xi_0$ so that

$$J_n(\hat\xi_2) \approx J_n(\xi_0)$$

But [...]. Assuming that as $n \to \infty$, $-J_n(\xi_0)$ dominates in [...],

and then (3.4.24) becomes [...]. The first linear correction to the solution $\hat\xi_2$ due to simultaneously estimating the state is then [...]. But as $n \to \infty$, $[J_n(\xi_0)]^{-1} \to 0$, so assuming that $\left.\mathrm{Tr}\!\left(P_{n|n}^{-1}\,\partial P_{n|n}/\partial \xi\right)\right|_{\hat\xi_2}$ remains finite,

$$\hat\xi_1 \to \hat\xi_2 \quad \text{as } n \to \infty$$

Therefore, the estimator of $\xi$ when the state is also estimated behaves asymptotically as the estimator which does not simultaneously estimate the state. As was shown, the estimator of $\xi$ alone converges to the true value of $\xi$, so that the state estimator which then uses this estimated value of $\xi$ converges to the true maximum likelihood state estimator discussed in Chapter 2.

Using similar arguments, the inclusion of a priori information about $\xi$ in the simultaneous state and noise

covariance parameter estimator does not affect its asymptotic properties.

3.5 Selection of the A Priori Noise Covariance Distribution

The choice of $f(R)$ and $f(Q)$ is somewhat arbitrary as these functions are introduced so that uncertainty in knowledge of $R$ and $Q$ can be properly treated. However, once selected, they can strongly influence the solutions of the likelihood equations. They must be selected to realistically represent possible variations in the values of $R$ and $Q$ while not being mathematically intractable. Caution should be observed in their selection because the simplest and seemingly realistic distributions may be unsuited for use in a maximum likelihood estimator.

Suppose that $f(R)$ or $f(Q)$ is defined to be nonzero only over some finite range of $R$ or $Q$ and is zero outside this range. Then all solutions of the likelihood equations for $R$ and $Q$ must also lie within this range. This can be seen by considering the following example. Let $f(z \mid \xi)$ be the conditional probability density function of a random variable $z$, assumed to be normally distributed with zero mean and variance $\xi$. Let $f(\xi)$ be the a priori probability density function of $\xi$, defined over some finite range:

$f(\xi) \neq 0$ for $\xi_0 < \xi < \xi_1$

$f(\xi) = 0$ otherwise

By application of Bayes' rule,

$$f(\xi \mid z) = \frac{f(z \mid \xi)\,f(\xi)}{f(z)}$$

where

$$f(z) = \int f(z \mid \xi)\,f(\xi)\,d\xi$$

For any finite value of $\xi$, $f(z \mid \xi)$ is zero only at $z = \pm\infty$, and it is assumed that $f(\xi)$ is selected so that $f(z)$ is also zero only at $z = \pm\infty$. Then from the above it can be seen that $f(\xi \mid z)$ is zero outside the range $(\xi_0,\xi_1)$. This says that regardless of the shape of $f(\xi \mid z)$ within the range $(\xi_0,\xi_1)$, there can be no legitimate solutions of

$$\frac{\partial f(\xi \mid z)}{\partial \xi} = 0$$

outside this range. If the range is too small and happens to exclude the true value of $\xi$, the maximum likelihood equations cannot have a valid solution for the true value of $\xi$. So if $f(R)$ and $f(Q)$ are defined only over some finite or semi-infinite range of $R$ or $Q$, this range must be large enough to include all possible true values of $R$ and $Q$.

Since the diagonal elements of $R$ and $Q$ represent variances, it is clear that the a priori probability density functions for these quantities must be zero for all negative values of the diagonal elements. From the preceding discussion it can be seen that all solutions of the likelihood equations for $\hat R^{jj}$ and $\hat Q^{jj}$ must be positive.
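The effect of a finite-range prior can be illustrated with a small numerical sketch. It assumes, purely for illustration, a prior that is flat inside the range, independent zero-mean normal samples with variance $\xi$, and a simple grid search in place of solving the likelihood equation; the function names and numerical values are hypothetical. When the range excludes the true variance, the search can only return a point on the boundary of the range, which is not a root of the likelihood equation.

```python
import math
import random


def log_likelihood(xi, samples):
    """Log of f(z_1, ..., z_n | xi) for independent zero-mean normal samples of variance xi."""
    n = len(samples)
    sum_sq = sum(z * z for z in samples)
    return -0.5 * n * math.log(2.0 * math.pi * xi) - 0.5 * sum_sq / xi


def posterior_maximizer(samples, lo, hi, grid_points=2001):
    """Grid search for the maximum of f(xi | z) with a prior that is flat on [lo, hi].

    Inside the range the flat prior only rescales the likelihood, so the search
    compares likelihood values. Outside the range the posterior is zero, so no
    point outside [lo, hi] is ever considered.
    """
    step = (hi - lo) / (grid_points - 1)
    grid = [lo + k * step for k in range(grid_points)]
    return max(grid, key=lambda xi: log_likelihood(xi, samples))


random.seed(0)
true_variance = 4.0  # assumed value, for illustration only
samples = [random.gauss(0.0, math.sqrt(true_variance)) for _ in range(500)]

# Estimate with no a priori information: the sample mean square.
ml_no_prior = sum(z * z for z in samples) / len(samples)

# Range that contains the true variance: same answer as with no prior.
inside = posterior_maximizer(samples, 1.0, 10.0)

# Range that excludes the true variance: the search is stuck at the upper limit.
excluded = posterior_maximizer(samples, 0.5, 2.0)

print(ml_no_prior, inside, excluded)
```

With the wider range the result agrees with the no-prior estimate to within the grid spacing, while the narrow range pins the result at its upper limit however many measurements are taken.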

Perhaps the simplest possible distribution for $R$ and $Q$ is a rectangular distribution for any diagonal element, denoted by $\xi$.

$$f(\xi) = \frac{1}{\xi_1-\xi_0}, \quad \xi_0 < \xi < \xi_1, \quad \xi_0 > 0 \tag{3.5.1}$$

$f(\xi) = 0$ otherwise

It can be seen that this distribution does not possess finite derivatives with respect to $\xi$ for any value of $\xi$. The derivatives are either zero or infinite. Therefore [...]. This says that if $\xi_0 < \xi < \xi_1$, then the maximum of $f(\xi \mid z)$ occurs at the same point as the maximum of $f(z \mid \xi)$ and that no valid maximum can exist outside the range $(\xi_0,\xi_1)$. The solution for $\xi$ in this case would be identical to the solution obtained by considering that no a priori information about the value of $\xi$ exists, as long as such a solution is within the range $(\xi_0,\xi_1)$.

This is the distribution that would be used, at least in theory, if the only a priori information about $\xi$ is that $\xi$ must be positive. In such a case

$$f(\xi) = \lim_{\xi_0 \to 0,\ \xi_1 \to \infty} \frac{1}{\xi_1-\xi_0}, \quad \xi_0 < \xi < \xi_1$$

$f(\xi) = 0$ otherwise

It should be noted that if a rectangular distribution of $\xi$ is used, then in the absence of any measurements, no unique maximum likelihood estimate of $\xi$ exists. This is a consequence of the fact that all values of $\xi$ within the range $(\xi_0,\xi_1)$ are equally likely to occur, so that there is no preferred value from the viewpoint of maximum likelihood. If another estimation criterion is used, there may be a preferred value. In the case of a minimum variance estimation criterion, the mean of the distribution of $\xi$ would be the minimum variance estimate.

In many situations more may be known about $\xi$ than merely that its value lies in some range with equal probability of occurrence in that range. In such situations a more complex $f(\xi)$ should be assigned. Two possible distributions are given below, a truncated normal distribution and a Gamma distribution.

Truncated Normal Distribution - If $\xi$ has a truncated normal distribution, then its probability density function is given by

$$f(\xi) = K \exp\!\left[-\frac{(\xi-\mu)^2}{2\sigma^2}\right], \quad \xi_0 < \xi < \xi_1 \tag{3.5.2}$$

$f(\xi) = 0$ otherwise

with normalizing constant

$$K = \sqrt{\frac{2}{\pi}}\;\frac{1}{\sigma\left[\operatorname{erf}\!\left(\frac{\xi_1-\mu}{\sqrt{2}\,\sigma}\right) - \operatorname{erf}\!\left(\frac{\xi_0-\mu}{\sqrt{2}\,\sigma}\right)\right]}$$

where

erf( ) is the error function

$\mu$ is the mean of the untruncated distribution

$\sigma^2$ is the variance of the untruncated distribution

Fig. 3.1 Truncated Normal Distribution

The mean of the truncated distribution is

(3.5.3)

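where

$$\rho = \sigma^2 K\left(e^{-s_0^2/2} - e^{-s_1^2/2}\right), \qquad s_0 = \frac{\xi_0-\mu}{\sigma}, \quad s_1 = \frac{\xi_1-\mu}{\sigma}$$

with $K$ the normalizing constant of (3.5.2), and the variance of the truncated distribution is

(3.5.4)

where [...]

Gamma Distribution - If $\xi$ has a Gamma distribution, then its probability density function is given by

(3.5.5)

where $a$ and $\mu$ are parameters of the distribution, and $a > 0$. $\Gamma(a)$ is the Gamma function.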
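For the truncated normal distribution, the effect of truncation on the mean and variance can be checked numerically. The sketch below uses the standard closed-form moments of a truncated normal, written with the standard normal density and distribution function rather than in the notation above, and compares them with a midpoint-rule integration of the density. The parameter values and function names are hypothetical.

```python
import math


def std_normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x):
    """Distribution function of the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def truncated_normal_pdf(xi, mu, sigma, lo, hi):
    """Density of a normal(mu, sigma^2) variable truncated to lo < xi < hi."""
    if not (lo < xi < hi):
        return 0.0
    mass = std_normal_cdf((hi - mu) / sigma) - std_normal_cdf((lo - mu) / sigma)
    return std_normal_pdf((xi - mu) / sigma) / (sigma * mass)


def truncated_normal_moments(mu, sigma, lo, hi):
    """Closed-form mean and variance of the truncated distribution."""
    a, b = (lo - mu) / sigma, (hi - mu) / sigma
    mass = std_normal_cdf(b) - std_normal_cdf(a)
    shift = (std_normal_pdf(a) - std_normal_pdf(b)) / mass
    mean = mu + sigma * shift
    var = sigma ** 2 * (
        1.0 + (a * std_normal_pdf(a) - b * std_normal_pdf(b)) / mass - shift ** 2
    )
    return mean, var


def midpoint_moments(pdf, lo, hi, cells=20000):
    """Total mass, mean, and variance of pdf on (lo, hi) by the midpoint rule."""
    h = (hi - lo) / cells
    m0 = m1 = m2 = 0.0
    for k in range(cells):
        x = lo + (k + 0.5) * h
        w = pdf(x) * h
        m0 += w
        m1 += w * x
        m2 += w * x * x
    mean = m1 / m0
    return m0, mean, m2 / m0 - mean ** 2


# Hypothetical prior for a noise variance: untruncated mean 1, sigma 1, range (0, 4).
mu, sigma, lo, hi = 1.0, 1.0, 0.0, 4.0
mean_cf, var_cf = truncated_normal_moments(mu, sigma, lo, hi)
mass_num, mean_num, var_num = midpoint_moments(
    lambda xi: truncated_normal_pdf(xi, mu, sigma, lo, hi), lo, hi
)
print(mass_num, mean_cf, mean_num, var_cf, var_num)
```

Because the lower truncation point is closer to $\mu$ than the upper one in this example, the mean of the truncated distribution lies above $\mu$ and its variance is smaller than $\sigma^2$.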

Fig. 3.2 Gamma Distribution with $\mu = 1$

The mean of the distribution is

(3.5.6)

and the variance of the distribution is

(3.5.7)

In Chapter 2, the a priori state estimate was defined as the mean of the normal a priori state probability density function. Because of the symmetry of the normal density function, the mean is located at the point of maximum

probability or likelihood. Now the a priori values of $R$ and $Q$ must be defined in terms of parameters of their respective distributions. The Gamma distribution is not symmetric about its mean so that the point of maximum probability occurs at a different point than the mean of the distribution. The same is true for the truncated normal distribution if the points of truncation are not chosen to be equidistant from the mean.

Because the criterion of maximum likelihood is used to define the optimal estimates of the state and noise covariance parameters, it would be consistent to define the a priori estimates of these quantities as the points of maximum likelihood of their respective a priori probability density functions. If $\hat\xi_0^k$ denotes the a priori estimate of $\xi^k$, the $k$th component of $\xi$, then

$$\hat\xi_0^k = \mu^k \quad \text{for the truncated normal distribution}$$

$$\hat\xi_0^k = \frac{(a^k-1)\,\mu^k}{a^k} \quad \text{for the Gamma distribution}$$

Actually, if the parameters of the respective distributions are defined, there is no need to separately define the a priori estimates of $\xi$ when solving the likelihood equations. The solution is a function of the parameters of the distribution, not $\hat\xi_0$. However, in subsequent sections when approximate solutions are discussed, it becomes convenient to introduce the a priori estimates as separate entities, although they will be related to the parameters of their distributions as shown above.
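The points of maximum probability can be located numerically as a check on these definitions. The sketch below is an illustration only: it assumes a Gamma density parameterized by a shape $a$ and a mean $\mu$, i.e. proportional to $\xi^{a-1}e^{-a\xi/\mu}$, which is an assumption of this example, and it uses unnormalized log densities because the normalizing constants do not move the maximum. All function names and parameter values are hypothetical.

```python
import math


def log_truncated_normal(xi, mu, sigma, lo, hi):
    """Unnormalized log density of a normal(mu, sigma^2) truncated to (lo, hi)."""
    if not (lo < xi < hi):
        return -math.inf
    return -0.5 * ((xi - mu) / sigma) ** 2


def log_gamma_density(xi, a, mu):
    """Unnormalized log density of a Gamma variable with shape a and mean mu (assumed form)."""
    if xi <= 0.0:
        return -math.inf
    return (a - 1.0) * math.log(xi) - a * xi / mu


def argmax_on_grid(log_density, lo, hi, points=100001):
    """Return the grid point in [lo, hi] where log_density is largest."""
    step = (hi - lo) / (points - 1)
    return max((lo + k * step for k in range(points)), key=log_density)


# Truncated normal with the untruncated mean inside the range: mode at mu.
mode_tn = argmax_on_grid(lambda xi: log_truncated_normal(xi, 2.0, 1.0, 0.0, 10.0), 0.0, 10.0)

# Gamma with a = 3 and mean 1.5: mode at (a - 1) * mu / a = 1.0, below the mean.
a, mu = 3.0, 1.5
mode_gamma = argmax_on_grid(lambda xi: log_gamma_density(xi, a, mu), 0.0, 10.0)

print(mode_tn, mode_gamma, (a - 1.0) * mu / a)
```

Under this parameterization the Gamma mode falls below the mean for every $a > 1$, which is why the a priori estimate and the mean of the distribution have to be distinguished.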

If a rectangular distribution of $\xi$ is selected, then no point of maximum likelihood of this distribution exists. In this case, the a priori estimate of $\xi$ is defined as the mean of the rectangular distribution. In fact, any point within the nonzero range of the distribution could be selected as the a priori estimate without affecting the solution, but for the sake of uniqueness, the above definition is made.

3.6 Computation of the Estimate

The likelihood equations for estimating the state and noise covariance parameters with and without the use of a priori information have been derived, but in general the equations are so complicated that solutions cannot be obtained in closed form. In this section techniques for a numerical solution of the equations are discussed. For simplicity, only one of the several possible cases is treated, that of simultaneously estimating the state and noise covariance parameters when a priori information is used. The solution of this problem includes all of the features that are necessary for the solution of the others, so that only slight modification of the discussion below is necessary in the other cases.

The solutions of the augmented likelihood equations

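$$\frac{\partial L_n}{\partial \alpha} = 0$$

are sought. A general method of solution would be to assume a trial solution and derive linear equations for small additive corrections. This process can be repeated until the corrections become negligible. If $\alpha_0$ is the trial value of the estimate, then expanding $\partial L_n/\partial \alpha$ in a Taylor series and retaining only the first power of $\tilde\alpha_0 = \hat\alpha - \alpha_0$ leads to

$$0 = \left.\frac{\partial L_n}{\partial \alpha}\right|_{\alpha_0}^{T} + \left.\frac{\partial^2 L_n}{\partial \alpha^2}\right|_{\alpha_0}\tilde\alpha_0 \tag{3.6.1}$$

Assuming that $\partial^2 L_n/\partial \alpha^2$ is nonsingular, the correction to $\alpha_0$ is

$$\tilde\alpha_0 = -\left[\left.\frac{\partial^2 L_n}{\partial \alpha^2}\right|_{\alpha_0}\right]^{-1}\left.\frac{\partial L_n}{\partial \alpha}\right|_{\alpha_0}^{T} \tag{3.6.2}$$

The next trial value is then $\alpha_0 + \tilde\alpha_0$.

Clearly this method has several drawbacks. Computation of $\partial^2 L_n/\partial \alpha^2$ and its inverse is very complicated, and once a stable solution is found, another computation, the conditional information matrix, must be performed before any evaluation of the performance of the estimator can be undertaken. A mechanization introduced by Rao eliminates these drawbacks. It is quite similar to the above method but employs one approximation which greatly reduces the number of computations. For this iterative solution, the approximation is made

$$\left.\frac{\partial^2 L_n}{\partial \alpha^2}\right|_{\alpha_0} \approx -J_n(\alpha_0) \tag{3.6.3}$$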

where $J_n(\alpha_0)$ is the augmented conditional information matrix defined by [...]. Thus the additive correction $\tilde\alpha_0$ becomes

$$\tilde\alpha_0 = \left[J_n(\alpha_0)\right]^{-1}\left.\frac{\partial L_n}{\partial \alpha}\right|_{\alpha_0}^{T} \tag{3.6.4}$$

In large samples with a given $\alpha = \alpha_0$, the difference between $\left.\partial^2 L_n/\partial \alpha^2\right|_{\alpha_0}$ and $-J_n(\alpha_0)$ will be of order $1/n$, so that the above approximation holds to first order of small quantities. When a stable solution of $\alpha$ is obtained, the asymptotic estimation error is zero mean normally distributed with conditional covariance $[J_n(\alpha)]^{-1}$, which is closely approximated by the computed $[J_n(\hat\alpha)]^{-1}$.

In this method the main difficulty is the computation and inversion of the information matrix at each stage of the iteration. In practice this is found to be unnecessary. The information matrix can be kept fixed after some stage and only the score recalculated. At the final stage when stable values are reached, the information matrix can be recomputed at the estimated value to obtain the covariance of the estimation errors.

Whenever an iterative solution to a set of nonlinear equations is proposed, there is always a question of convergence. This question is reasonably well resolved in the case

of the likelihood equations. Deutsch discusses this problem and references several other works on the subject. The results of his discussion are given below.

If $\alpha_0$ is selected as the initial estimate of the solution of the likelihood equations, if $\alpha_j$ is the $j$th iteration value of the estimate, and if $\hat\alpha$ is the "true" maximum likelihood estimate, then the iteration process converges if $|\alpha_j - \hat\alpha|$ decreases as $j$ increases and tends to zero as $j \to \infty$. The iteration process is defined as follows: Let $g(\alpha)$ be a differentiable function which has no zero in the neighborhood of the root $\hat\alpha$ of the likelihood equation. The existence of $\hat\alpha$ is postulated. Define [...] where $L$ is a likelihood function. The general iteration process is then [...]. If $\epsilon_j = |\alpha_j - \hat\alpha|$ is the estimation error at the $j$th iteration, then $g(\alpha)$ must be chosen such that $\epsilon_{j+1} < \epsilon_j$ and $\epsilon_j \to 0$ as $j \to \infty$. This condition assures the convergence of the iteration process

to the value $\hat\alpha$. By using the asymptotic properties of the maximum likelihood estimator for large sample sizes, the two previously given iterative techniques for the computation of the estimate can be shown to be convergent.

3.7 Computation of the Information Matrix

By calculation of the information matrix, the asymptotic covariance of the maximum likelihood estimate can be obtained. Care must be taken to distinguish between $[J_n(\alpha_0)]^{-1}$ and $\overline{[J_n(\alpha_0)]^{-1}}$, the former being the conditional covariance of the estimate for a given value of $\alpha_0$, the latter being the average conditional covariance of the estimate, averaged over the ensemble of all possible true values of $\alpha_0$.

(3.7.1)

(3.7.2)

where $L_n = L_n(\alpha_0, Z_n)$.

$[J_n(\alpha_0)]^{-1}$ is a highly nonlinear function of $\alpha_0$, so the average conditional covariance cannot be explicitly calculated. Fortunately, $\overline{[J_n(\alpha_0)]^{-1}}$ is not needed in finding $\hat\alpha$ but is only used in evaluation of the estimator performance over the ensemble of all possible $\alpha_0$. To find $\overline{[J_n(\alpha_0)]^{-1}}$, some numerical evaluation of (3.7.2) is necessary. From (3.3.37) and (3.3.20),

(3.7.3)

From (3.3.37) and (3.3.24),

(3.7.4)

where

$$\tilde x_{n|n} = x_n - \hat x_{n|n}, \qquad \tilde z_i = z_i - H_i\,\hat x_{i|i-1}$$

Then it follows that

(3.7.5)

Using the same procedures as in obtaining (3.4.9), after algebraic manipulation, it can be shown that

(3.7.6)

where

$$G^{jk}_{i|i-1} = E\!\left[\frac{\partial \hat x_{i|i-1}}{\partial \xi^j}\,\frac{\partial \hat x_{i|i-1}^T}{\partial \xi^k}\right]$$

It can also be shown that

(3.7.7)

If the diagonal elements of $R$ and $Q$ ($\xi$) are mutually independent and are distributed with a truncated normal distribution, then

(3.7.8)

where $\xi^k$ represents the appropriate element of $R$ or $Q$, and $\mu^k$ and $\sigma^k$ are the mean and variance of the corresponding untruncated normal distribution. If the diagonal elements of $R$ and $Q$ are distributed with a Gamma distribution, then

(3.7.9)

where $a^k$ and $\mu^k$ are parameters of the corresponding Gamma distribution.

All of the necessary quantities appearing in (3.7.6) can be computed using recursive relationships.

(3.7.10)

(3.7.11)

$\partial P/\partial R^{jj}$ = [...] (3.7.12)

(3.7.13)

$\partial P/\partial Q^{jj}$ = [...] (3.7.14)

(3.7.17)

where [...] (3.7.18)

The proper initial conditions for these recursive relationships are:

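[...]

$J_n(\alpha_0)$ can be partitioned into submatrices, corresponding to $x$ and $\xi$, [...] where [...]. Then [...] and [...] where [...] and [...].

Neither $\bar P_{n|n}$ nor $\bar W_n$ can be computed analytically. A first order approximation to $\bar P_{n|n}$ and $\bar W_n$ could be computed by expanding $P_{n|n}$ and $W_n$ about $\bar\xi$.

(3.7.19)

(3.7.20)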

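where $\tilde\xi_0 = \xi_0 - \bar\xi$. But

$$E[\xi_0] = \bar\xi$$

and

$$E[\tilde\xi_0\,\tilde\xi_0^T] = \mathrm{cov}(\xi_0)$$

where $\bar\xi$ is the mean of the a priori distribution of true parameter values $\xi_0$ and $\mathrm{cov}(\xi_0)$ is the covariance of this distribution. Then

$$\bar P^{ii}_{n|n} = P^{ii}_{n|n}(\bar\xi) + \tfrac{1}{2}\,\mathrm{Tr}\!\left[\left.\frac{\partial^2 P^{ii}_{n|n}}{\partial \xi^2}\right|_{\bar\xi}\mathrm{cov}(\xi_0)\right]$$

It is obvious that extensive computation is necessary to compute these quantities so that this technique is not particularly attractive. An alternate method of evaluating $\bar P_{n|n}$ and $\bar W_n$ would be to select a sample of $\xi$ chosen from the distribution $f(\xi)$ and then employ the approximations

$$\bar P_{n|n} \approx \frac{1}{K}\sum_{j=1}^{K} P_{n|n}(\xi_j), \qquad \bar W_n \approx \frac{1}{K}\sum_{j=1}^{K} W_n(\xi_j)$$

Of course, the sample size $K$ must be sufficiently large to ensure that this approximation is reasonably good.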

The simplest approximation to make would be

$$\bar P_{n|n} \approx P_{n|n}(\bar\xi), \qquad \bar W_n \approx W_n(\bar\xi)$$

This approximation may be adequate in applications where the range of $\xi$ is limited, but caution should be employed in its use.
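The difference between averaging over a sample of $\xi$ and evaluating at the mean $\bar\xi$ can be seen in a scalar sketch. The example below does not compute $P_{n|n}$ or $W_n$ of this chapter. As a stand-in it uses the variance of a scalar state after one measurement with noise variance $\xi$ and $H = 1$, and it assumes a rectangular a priori distribution. The function names and numerical values are hypothetical.

```python
import random


def updated_variance(p_prior, xi):
    """Variance of a scalar state after one measurement with noise variance xi (H = 1)."""
    return p_prior * xi / (p_prior + xi)


def sample_average(p_prior, lo, hi, k_samples, rng):
    """Average of updated_variance over K draws of xi from a rectangular prior on (lo, hi)."""
    total = 0.0
    for _ in range(k_samples):
        xi = rng.uniform(lo, hi)
        total += updated_variance(p_prior, xi)
    return total / k_samples


def point_approximation(p_prior, lo, hi):
    """Simplest approximation: evaluate at the mean of the rectangular prior."""
    return updated_variance(p_prior, 0.5 * (lo + hi))


rng = random.Random(1)

# Wide range of xi: the two approximations differ noticeably.
wide_avg = sample_average(1.0, 0.1, 10.0, 20000, rng)
wide_point = point_approximation(1.0, 0.1, 10.0)

# Narrow range of xi: the point approximation is adequate.
narrow_avg = sample_average(1.0, 0.9, 1.1, 20000, rng)
narrow_point = point_approximation(1.0, 0.9, 1.1)

print(wide_avg, wide_point, narrow_avg, narrow_point)
```

Because this stand-in is a concave function of $\xi$, evaluating it at the mean overstates the average, and the gap grows with the width of the range. That is the reason for caution when the range of $\xi$ is not limited.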

Chapter 4

SUBOPTIMAL SOLUTIONS OF THE ESTIMATION PROBLEM

4.1 Introduction

An exact or iterative solution of the likelihood equations of Chapter 3 requires extensive computation, as the solution is generally found only after several passes over the measurement data. In many applications such computation is not feasible or a "real time" solution is needed. In such situations, approximate solutions are necessary, either to reduce the required computation and/or to obtain a real time solution of the parameter estimation problem. As would be expected, the quality of the estimator is degraded in such cases, but often the degradation is not serious. However, there are certain special cases when some of the approximate solutions are not unique or are so highly biased that their use is questionable.

This chapter deals with the derivation and evaluation of several suboptimal approximate solutions. Also included is a summary of possible parameter estimators suggested by other authors. The list of approximate solutions is not exhaustive but is meant to illustrate several techniques that are available to obtain an adequate solution of the problem.

4.2 Linearized Maximum Likelihood Solution

The iterative solution of the maximum likelihood equations

of Chapter 3 was based upon successive relinearization of the maximum likelihood equations about trial values of the parameters obtained from the previous iteration, continuing the process until convergence. If the initial trial value of the parameter is "sufficiently close" to the true value, a single correction to the initial estimate based upon a linear approximation to the equations is often adequate for the solution. This single linearization is the basis of the linearized maximum likelihood solution.

As in Chapter 3, the solution of

$$\frac{\partial L_n}{\partial \alpha} = 0$$

is sought. If $\alpha_0$ is the trial (a priori) value of the estimate, then from (3.6.4) the linearized maximum likelihood solution $\hat\alpha_\ell$ is found from the equation

$$\hat\alpha_\ell = \alpha_0 + \left[J_n(\alpha_0)\right]^{-1}\left.\frac{\partial L_n}{\partial \alpha}\right|_{\alpha_0}^{T} \tag{4.2.1}$$

The linearized solution $\hat\alpha_\ell$ can be found as long as $J_n(\alpha_0)$ is of full rank. Both $J_n(\alpha_0)$ and $\left.\partial L_n/\partial \alpha\right|_{\alpha_0}$ can be evaluated in real time since they represent the conditional information matrix and the score evaluated at the a priori estimate of the parameter $\alpha$. The conditional information matrix $J_n(\alpha_0)$ is expressible as a linear combination of the conditional information matrix at the previous time, $J_{n-1}(\alpha_0)$, and a term which represents the additional information about the

parameters contained in the measurement at time $n$. Similarly, the score $\left.\partial L_n/\partial \alpha\right|_{\alpha_0}$ is expressible as a linear combination of the score at the previous time, $\left.\partial L_{n-1}/\partial \alpha\right|_{\alpha_0}$, and a term which is a function of the measurement at time $n$. Thus as the measurements are taken, the conditional information matrix and the score can be computed as running sums, and the linearized solution (4.2.1) can be found in real time.

Because $\partial L_n/\partial \alpha$ is a highly nonlinear function of $\alpha$, there is no simple way to determine when the above linearizing approximation is valid, or more importantly, when the linearized solution is "closer" to the true value than the a priori estimate. Several measures can be used to determine if the linearized solution is closer to the true solution. If the linearized solution is valid, the following inequality should be satisfied,

[...]

If this is not satisfied, another trial value of $\alpha_0$ must be found and the procedure repeated. Evaluation of this measure requires a recomputation of the score and the conditional information matrix at the value $\alpha = \hat\alpha_\ell$, so in this sense the linearized solution is not real time. However, numerical results indicate that this linearized solution converges over a wide range of $\alpha_0$, so that in many applications this check is not necessary.

The asymptotic conditional covariance of the linearized solution is approximately $[J_n(\alpha_0)]^{-1}$. A better approximation

can be obtained if computational capacity allows evaluation of $[J_n(\hat\alpha_\ell)]^{-1}$.

If it is known that there may be a significant error in the a priori estimate of $\alpha$, the use of the linearized technique may be questionable. However, in this situation a combination of an iterative solution plus a linearized solution could be used. Sufficient measurements are taken to obtain a relatively good estimate of $\alpha$ by use of the iterative procedures of Chapter 3. Subsequently, the linearized solution is employed, using the results of the iterative procedure as the point about which to linearize.

A third procedure, sequential relinearization, could also be used. It is quite similar to the linearized solution except that at regular intervals of time, which may encompass several measurement times, the best linearized estimate of $\alpha$ is used to compute subsequent values of the information matrix and the score. At each relinearization, the score must be corrected to account for having used a different value of $\alpha$ in its computation than the newly obtained value. Let $\alpha_1$ be the estimate of $\alpha$ that was obtained at the previous relinearization and used from then until the present in the computation of the score, and let $\alpha_2$ be the current linearized estimate. Expanding the score in a Taylor series about $\alpha_1$,

[...]

Using the approximation [...], the corrected score is given by [...].

As with the linearized solution, this procedure should be used only after a sufficiently accurate estimate of $\alpha$ is obtained, either from the a priori estimate or through use of the iterative procedure.

4.3 Near Maximum Likelihood Solution

By a suitable approximation to (3.3.38) a "near maximum likelihood" solution can be found which reduces the necessary computations considerably. In this solution, the state estimate is defined to be the maximum likelihood estimate which uses the near maximum likelihood estimates of $R$ and $Q$ ($\xi$) to compute the filter gains, and estimates of $\xi$ are found from the solution of the "pseudo" likelihood equations:

[...]

where $L'$ is the "pseudo" likelihood function defined by

(4.3.1)

This equation is obtained from (3.3.38) by retaining only the most significant terms. The savings in computation arise from not having to compute $\partial \hat x_{i|i-1}/\partial \xi$ appearing in the

likelihood function and $G^{jk}_{i|i-1}$ appearing in the expression for the conditional information matrix (3.7.6). $\partial \hat x_{i|i-1}/\partial \xi$ is an array with $\beta \times (\gamma+\nu)$ elements and $G^{jk}_{i|i-1}$ is an array with $\beta^2 \times (\gamma+\nu)^2$ elements. If all of the symmetry properties of $G^{jk}_{i|i-1}$ are utilized, the number of independent elements is $\frac{\beta(\beta+1)}{2} \times \frac{(\gamma+\nu)(\gamma+\nu+1)}{2}$. If the state, driving noise, and measurement are of moderate dimension, the number of computations involved in calculating these quantities can be considerable, so that not having to perform these calculations can result in a significant saving in computer time.

If convergence of (4.3.1) to a unique solution is obtained, the asymptotic distributions of $\hat x_{n|n}$ and $\hat\xi_n$ are approximately normal with conditional covariances [...].

The conditional information matrix $J'(\xi)$ is not the same as the information matrix of Chapter 3 because of the omitted terms in the likelihood function. Here

(4.3.2)

A comparison of (4.3.2) with (3.7.6) will show that the above

information matrix is smaller* than the information matrix of Chapter 3. Thus, as would be expected, the covariance of the parameter estimates will be larger when the pseudo likelihood equations are used than when the full likelihood equations are solved.

Numerical results indicate that the iterative solution of the pseudo likelihood equations when the information matrix (4.3.2) is used as an approximation to the negative gradient of the likelihood equations may present difficulties. This is because in some circumstances $J'$ given above may be nearly singular, and using its inverse in the solution may result in an unstable iterative procedure. However, these same numerical results show that the pseudo likelihood equations do have a unique solution, but it must be found using other techniques in the iteration algorithm, say a fixed step size sweep looking for zeros of the pseudo likelihood equations.

*A positive definite matrix $A$ is said to be smaller than another positive definite matrix $B$ if the matrix $(A-B)$ is negative semi-definite.

In this section, explicit "real time" solutions for the estimates of $R$ and $Q$ are sought. As will be shown, such estimates are approximations to the maximum likelihood solutions and on any given trial may be highly biased. However, if the

a priori estimates of $R$ and $Q$ are sufficiently close to the true values, such estimators will provide reasonable estimates with considerably less computation than the estimators previously discussed. Even if the estimates are biased, they provide useful information. If the estimates differ consistently and significantly from the assumed a priori values, then there is good reason to doubt the accuracy of the a priori values, even though the biased estimates do not necessarily represent better estimates of $R$ and $Q$. In other words, the explicit estimates will indicate if there is a significant error in the a priori values of $R$ and $Q$ even if they do not tell how to correct this error. In this sense their use is related to testing a hypothesis on the values of $R$ and $Q$ as discussed in Chapter 5.

These approximate estimators are obtained as approximate solutions of the pseudo likelihood equations (4.3.1). The last term allows introduction of a distribution function of $R$ and $Q$ so that a priori estimates can be weighted with the estimates derived from the measurements alone. For this approximate solution it is convenient to form estimates of $R$ and $Q$ which are independent of this distribution function, and after such estimates are obtained, then the a priori estimates and their associated covariances are considered in obtaining a combined estimate for $R$ and $Q$. Thus, initially, the solutions of the following equation are sought.

$$\sum_{i=1}^{n} \mathrm{Tr}\!\left(B_i^{*}\,\frac{\partial B_i}{\partial \xi}\right) = 0 \tag{4.4.1}$$

where

$$B_i^{*} = B_i^{-1} - B_i^{-1}\tilde z_i \tilde z_i^T B_i^{-1}$$

Using the results of Appendix A, (4.4.1) becomes

$$\sum_{i=1}^{n}\left[(B_i^{*})^{jj} + \mathrm{Tr}\!\left(B_i^{*}H_i\frac{\partial P_{i|i-1}}{\partial R^{jj}}H_i^T\right)\right] = 0 \tag{4.4.2}$$

$$\sum_{i=1}^{n}\mathrm{Tr}\!\left(B_i^{*}H_i\frac{\partial P_{i|i-1}}{\partial Q^{jj}}H_i^T\right) = 0 \tag{4.4.3}$$

As the equations stand, no explicit solution for estimates of $R$ and $Q$ is possible, so further approximations must be made. When these approximations are made, there is a real question of existence of independent solutions of the resulting equations for the unknown elements of $R$ and $Q$. Even if there are sufficient independent equations, there is no general way to obtain a closed form solution of the nonlinear relationships.

If $R$ or $Q$ is to be estimated separately, there is no difficulty in obtaining a reasonable solution to the problem. Unfortunately the question of simultaneous estimation of these quantities from the above equations is not well resolved. The solutions given below represent separate estimation of $R$ with $Q$ known and estimation of $Q$ with $R$ known. The two solutions can be used, with caution, to simultaneously estimate $R$ and $Q$, realizing beforehand that the resulting estimates are not independent. This dependency can result in biased estimates which fail to distinguish between errors in $R$ and $Q$. However,

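as mentioned previously, some useful information can be derived from such biased estimates.

It can be shown that for many applications

$$H_i\,\frac{\partial P_{i|i-1}}{\partial R^{jj}}\,H_i^{T} \ll 1$$

so that (4.4.2) becomes

$$\sum_{i=1}^{n}\left(B_i^{-1} - B_i^{-1}\,\Delta z_i\,\Delta z_i^{T}\,B_i^{-1}\right)^{jj} = 0 \qquad (4.4.4)$$

But since

$$\hat{x}_{i|i} = \hat{x}_{i|i-1} + P_{i|i} H_i^{T} R^{-1}\left(z_i - H_i \hat{x}_{i|i-1}\right)$$

it can be seen that

$$z_i - H_i \hat{x}_{i|i} = \left(I - H_i P_{i|i} H_i^{T} R^{-1}\right)\left(z_i - H_i \hat{x}_{i|i-1}\right) = R\left(R^{-1} - R^{-1} H_i P_{i|i} H_i^{T} R^{-1}\right)\Delta z_i = R\,B_i^{-1}\,\Delta z_i$$

Defining $\Delta z_i^{\prime} = z_i - H_i \hat{x}_{i|i}$, then

$$B_i^{-1}\,\Delta z_i = R^{-1}\,\Delta z_i^{\prime}$$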

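and, since $B_i^{-1} = R^{-1} - R^{-1} H_i P_{i|i} H_i^{T} R^{-1}$, (4.4.4) becomes

$$\sum_{i=1}^{n}\left[R^{-1} - R^{-1} H_i P_{i|i} H_i^{T} R^{-1} - R^{-1}\,\Delta z_i^{\prime}\,\Delta z_i^{\prime T}\,R^{-1}\right]^{jj} = 0$$

or

$$\sum_{i=1}^{n}\left[R^{-1}\left(R - H_i P_{i|i} H_i^{T} - \Delta z_i^{\prime}\,\Delta z_i^{\prime T}\right)R^{-1}\right]^{jj} = 0 \qquad (4.4.5)$$

It is still not possible to solve (4.4.5) for R, as $P_{i|i}$ and $\Delta z_i^{\prime}$ are highly nonlinear functions of both R and Q. However, if either the a priori values of R and Q or some estimates of these quantities are used to compute $\Delta z_i^{\prime}$ and $P_{i|i}$, then the estimate of R can be defined as

$$\hat{R}_n^{jj} = \frac{1}{n}\sum_{i=1}^{n}\left(\Delta z_i^{\prime *}\,\Delta z_i^{\prime * T} + H_i P_{i|i}^{*} H_i^{T}\right)^{jj} \qquad (4.4.6)$$

where $\Delta z_i^{\prime *}$ and $P_{i|i}^{*}$ are computed as functions of either the a priori estimates of R and Q or some previously obtained estimates. A recursive relationship for $\hat{R}_n^{jj}$ can be obtained if $\Delta z_i^{\prime *}$ and $P_{i|i}^{*}$ are not functions of R or Q.

[Equation (4.4.7) is not legible in the source.]

Equation (4.4.7) is not the only approximate solution that could reasonably be obtained from (4.4.4). Rewriting (4.4.4),

$$\sum_{i=1}^{n}\left[B_i^{-1}\left(B_i - \Delta z_i\,\Delta z_i^{T}\right)B_i^{-1}\right]^{jj} = 0$$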
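The estimator (4.4.6) above can be illustrated with a minimal scalar sketch. Everything here is an assumption made for illustration and is not taken from the report: the system is scalar and time invariant, the function and variable names (`estimate_r_no_feedback`, `simulate`) are hypothetical, and the running mean stands in for the recursive relationship, whose exact printed form is not recoverable. The a priori values `r0` and `q0` are held fixed, so the starred quantities do not depend on the estimate.

```python
import random

def estimate_r_no_feedback(z, phi, gamma, h, r0, q0, x0, p0):
    """Scalar sketch of an R estimate built from post-update residuals.

    r0 and q0 are the a priori noise variances used to run the filter;
    they are never updated here.
    """
    x_hat, p = x0, p0
    r_hat = 0.0
    for n, z_i in enumerate(z, start=1):
        # Time update using the a priori Q.
        x_pred = phi * x_hat
        p_pred = phi * p * phi + gamma * q0 * gamma
        # Measurement update using the a priori R.
        gain = p_pred * h / (h * p_pred * h + r0)
        x_hat = x_pred + gain * (z_i - h * x_pred)
        p = (1.0 - gain * h) * p_pred
        # Post-update residual and its contribution to the sum.
        dz_post = z_i - h * x_hat
        term = dz_post * dz_post + h * p * h
        # Running-mean form of the average over i = 1..n.
        r_hat += (term - r_hat) / n
    return r_hat

def simulate(n, phi, gamma, h, r, q, seed=0):
    """Generate n scalar measurements with true noise variances r and q."""
    rng = random.Random(seed)
    x, z = 0.0, []
    for _ in range(n):
        x = phi * x + gamma * rng.gauss(0.0, q ** 0.5)
        z.append(h * x + rng.gauss(0.0, r ** 0.5))
    return z

z = simulate(20000, phi=0.9, gamma=1.0, h=1.0, r=2.0, q=1.0)
r_hat = estimate_r_no_feedback(z, 0.9, 1.0, 1.0, r0=2.0, q0=1.0, x0=0.0, p0=1.0)
print(round(r_hat, 2))
```

Because each term is a squared residual plus a nonnegative variance, the estimate in this sketch cannot become negative.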

If the estimation process has reached a steady state, that is, $B_i \approx$ constant for all i, then an estimate of R can be defined by

$$\hat{R}_n^{jj} = \frac{1}{n}\sum_{i=1}^{n}\left(\Delta z_i^{*}\,\Delta z_i^{*T} - H_i P_{i|i-1}^{*} H_i^{T}\right)^{jj} \qquad (4.4.9)$$

where $\Delta z_i^{*}$ and $P_{i|i-1}^{*}$ are equal to $\Delta z_i$ and $P_{i|i-1}$ computed as a function of a priori values of R and Q or some past estimates of these quantities. The form of (4.4.9) is not as desirable as (4.4.6) because $\hat{R}$ is not necessarily positive definite. If some of the squared residuals are small compared with $H_i P_{i|i-1}^{*} H_i^{T}$, then some of the terms in the above sum can be negative. If this occurs often, then the resulting estimate of R may have negative diagonal elements. However, the estimator has the advantage of not being a function of the value of R that is used to update $\Delta z_n^{*}$ and $P_{n|n-1}^{*}$ at time n. This can reduce possible bias problems in the feedback estimator discussed later. The estimator of the form (4.4.6) is the one studied further.

Obtaining an explicit estimate of Q is not as straightforward as obtaining the estimate of R. There are many approximate solutions to (4.4.3) for Q, depending upon the nature of the approximations made. The solution given below is but one of several possible solutions, but it is felt that it has the advantage of simplicity and wide applicability.
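Returning briefly to the estimator (4.4.9), the remark about negative terms can be checked with a small scalar sketch. The function name and the numerical values are hypothetical and chosen only for illustration. Each term is a squared pre-update residual minus the predicted measurement variance, so a small residual gives a negative term.

```python
def estimate_r_from_predicted_residuals(dz, p_pred, h):
    """Average of (residual^2 - h * P_pred * h) over all samples (scalar case)."""
    terms = [d * d - h_i * p_i * h_i for d, p_i, h_i in zip(dz, p_pred, h)]
    return sum(terms) / len(terms), terms

# One small and one large residual: the first term is negative.
r_mixed, terms_mixed = estimate_r_from_predicted_residuals(
    [0.1, 3.0], [1.0, 1.0], [1.0, 1.0])

# Only small residuals: the estimate itself becomes negative.
r_small, terms_small = estimate_r_from_predicted_residuals(
    [0.1, 0.2], [1.0, 1.0], [1.0, 1.0])

print(terms_mixed, r_mixed)
print(terms_small, r_small)
```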

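By manipulation of (4.4.3) it can be shown that

$$\sum_{i=1}^{n}\operatorname{Tr}\left(B_i^{*} H_i\,\frac{\partial P_{i|i-1}}{\partial Q^{jj}}\,H_i^{T}\right) = \sum_{i=1}^{n}\operatorname{Tr}\left[P_{i|i-1}^{-1}\left(P_{i|i-1} - P_{i|i} - \Delta x_i\,\Delta x_i^{T}\right)P_{i|i-1}^{-1}\,\frac{\partial P_{i|i-1}}{\partial Q^{jj}}\right]$$

where

$$\Delta x_i = \hat{x}_{i|i} - \hat{x}_{i|i-1} = P_{i|i-1} H_i^{T} B_i^{-1}\,\Delta z_i$$

Define

$$U_i = \Phi(i,i-1)\,P_{i-1|i-1}\,\Phi^{T}(i,i-1)$$

Then (4.4.3) becomes

[equation not legible in the source]

But

$$P_{i|i-1} = U_i + \Gamma_i Q \Gamma_i^{T}$$

so

$$\sum_{i=1}^{n}\left[\Gamma_i^{T} P_{i|i-1}^{-1}\left(\Gamma_i Q \Gamma_i^{T} - \Delta x_i\,\Delta x_i^{T} - P_{i|i} + U_i\right)P_{i|i-1}^{-1}\,\Gamma_i\right]^{jj} = 0 \qquad (4.4.10)$$

Equation (4.4.10) cannot be solved explicitly for Q, so additional approximations are necessary. If it is assumed that $\Gamma_i$ and $P_{i|i-1}$ are approximately constant for all i, then (4.4.10) becomes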

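[equation not legible in the source]

The equation above is satisfied if

$$\sum_{i=1}^{n}\left(\Gamma_i Q \Gamma_i^{T} - \Delta x_i\,\Delta x_i^{T} - P_{i|i} + U_i\right) = 0 \qquad (4.4.11)$$

If $\Gamma_i^{-1}$ does not exist, the generalized inverse of $\Gamma_i$ is to be used. (See Appendix A for discussion of the generalized inverse.) In general the dimension of the driving noise vector is less than or equal to the dimension of the state, in which case $(\Gamma_i^{T}\Gamma_i)^{-1}$ exists and the generalized inverse of $\Gamma_i$ is

$$\Gamma_i^{\#} = \left(\Gamma_i^{T}\Gamma_i\right)^{-1}\Gamma_i^{T}$$

The estimate of Q is defined as

$$\hat{Q}_n^{jj} = \frac{1}{n}\sum_{i=1}^{n}\left[\Gamma_i^{-1}\left(\Delta x_i^{*}\,\Delta x_i^{*T} + P_{i|i}^{*} - U_i^{*}\right)\left(\Gamma_i^{T}\right)^{-1}\right]^{jj}$$

where $\Delta x_i^{*}$, $P_{i|i}^{*}$, and $U_i^{*}$ are computed as functions of the a priori estimates of R and Q or some past estimates. If $\Delta x_i^{*}$, $P_{i|i}^{*}$, and $U_i^{*}$ are not functions of R or Q, a recursive relationship can be obtained.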
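A minimal scalar sketch of the Q estimate may help fix ideas. It is an illustration under stated assumptions, not the report's implementation: the system is scalar and time invariant, so the generalized inverse of Gamma reduces to `1 / gamma`; the names `estimate_q_no_feedback` and `simulate` are hypothetical; and the running mean is used as one possible recursive form. The a priori `r0` and `q0` stay fixed throughout.

```python
import random

def estimate_q_no_feedback(z, phi, gamma, h, r0, q0, x0, p0):
    """Scalar sketch: average of Gamma^-1 (dx^2 + P_upd - U) Gamma^-T."""
    x_hat, p = x0, p0
    gamma_inv = 1.0 / gamma  # scalar stand-in for the generalized inverse
    q_hat = 0.0
    for n, z_i in enumerate(z, start=1):
        # U is the propagated covariance before process noise is added.
        u = phi * p * phi
        x_pred = phi * x_hat
        p_pred = u + gamma * q0 * gamma
        gain = p_pred * h / (h * p_pred * h + r0)
        x_hat = x_pred + gain * (z_i - h * x_pred)
        p = (1.0 - gain * h) * p_pred
        # State correction applied by the measurement update.
        dx = x_hat - x_pred
        term = gamma_inv * (dx * dx + p - u) * gamma_inv
        q_hat += (term - q_hat) / n  # running mean over i = 1..n
    return q_hat

def simulate(n, phi, gamma, h, r, q, seed=0):
    """Generate n scalar measurements with true noise variances r and q."""
    rng = random.Random(seed)
    x, z = 0.0, []
    for _ in range(n):
        x = phi * x + gamma * rng.gauss(0.0, q ** 0.5)
        z.append(h * x + rng.gauss(0.0, r ** 0.5))
    return z

z = simulate(20000, phi=0.9, gamma=1.0, h=1.0, r=2.0, q=1.0)
q_hat = estimate_q_no_feedback(z, 0.9, 1.0, 1.0, r0=2.0, q0=1.0, x0=0.0, p0=1.0)
print(round(q_hat, 2))
```

Unlike the R estimate of the form (4.4.6), individual terms here can be negative, since the propagated covariance is subtracted.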

Two classes of estimators of the form (4.4.7) and (4.4.13) exist, depending upon what use is made of past estimates of R and Q:

1) no feedback estimators
2) feedback estimators

In the no feedback case, a priori values of R and Q are used to compute the quantities denoted by a * in the estimator equations. In the feedback case these quantities are computed as functions of past estimates of R and Q. At each stage the best available estimates of R and Q are used to update the starred quantities. If feedback is employed and the variance estimation process converges to the true values of R and Q, then the state estimate $\hat{x}_{n|n}^{*}$ will converge in most applications to the optimal state estimate that would be obtained if the true values of R and Q were known a priori. However, using this estimation scheme, convergence is not guaranteed. In fact, numerical results indicate that if the a priori values of R and Q are significantly in error, the process will converge but to biased and incorrect estimates of the variance parameters. Techniques for evaluating the performance of the feedback and no feedback estimators are given next.

The two measures which seem appropriate for evaluating the performance of the explicit suboptimal estimators are the mean and mean square error of the estimates of R and Q. In the preceding section, estimators for the diagonal elements

of R and Q were developed, resulting in $(\gamma+\eta)$ estimator equations. The mean square error matrix of such estimates is a $(\gamma+\eta) \times (\gamma+\eta)$ matrix, which includes the mean of all quadratic functions of the errors in each component of the diagonal elements of R and Q. Such a matrix is most difficult to compute, so for the purposes of this development, only the diagonal elements of such a matrix will be considered.

As mentioned in Chapter 2, a distinction must be made between conditional and unconditional expectation operators. The same notation as in that chapter will be used to make this distinction.

First, the performance of the no feedback estimator will be discussed. From (4.4.7), $\hat{R}_n^{jj}$ is

$$\hat{R}_n^{jj} = \frac{1}{n}\sum_{i=1}^{n}\left(\Delta z_i^{\prime *}\,\Delta z_i^{\prime * T} + H_i P_{i|i}^{*} H_i^{T}\right)^{jj}$$

The conditional expected value of $\hat{R}_n^{jj}$ is

$$\mathcal{E}\left(\hat{R}_n^{jj}\right) = \frac{1}{n}\sum_{i=1}^{n}\left[\mathcal{E}\left(\Delta z_i^{\prime *}\,\Delta z_i^{\prime * T}\right) + H_i\,\mathcal{E}\left(P_{i|i}^{*}\right) H_i^{T}\right]^{jj}$$

This conditional expected value is conditioned upon the fact that the a priori estimates $R_0$ and $Q_0$ are used to compute the filter gains while the true values of these covariances are R and Q. Averaging is performed over the ensemble of all driving and measurement noises as well as all possible initial state conditions.
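The conditional mean and mean square error described above can be approximated numerically by averaging over many noise realizations while the true and a priori variances stay fixed. The sketch below is an illustration only and rests on several assumptions that are not in the report: a scalar time-invariant system with hypothetical parameter values, hypothetical function names, and a feedback variant in which only R is fed back while Q stays at its a priori value.

```python
import random

def simulate(n, phi, gamma, h, r, q, rng):
    """Generate n scalar measurements with true noise variances r and q."""
    x, z = 0.0, []
    for _ in range(n):
        x = phi * x + gamma * rng.gauss(0.0, q ** 0.5)
        z.append(h * x + rng.gauss(0.0, r ** 0.5))
    return z

def r_estimate(z, phi, gamma, h, r0, q0, feedback):
    """R estimate from post-update residuals, with or without feedback."""
    x_hat, p, r_used, total = 0.0, 1.0, r0, 0.0
    for n, z_i in enumerate(z, start=1):
        x_pred = phi * x_hat
        p_pred = phi * p * phi + gamma * q0 * gamma
        gain = p_pred * h / (h * p_pred * h + r_used)
        x_hat = x_pred + gain * (z_i - h * x_pred)
        p = (1.0 - gain * h) * p_pred
        dz_post = z_i - h * x_hat
        total += dz_post * dz_post + h * p * h
        if feedback:
            # Feed the latest R estimate back into the gain computation.
            r_used = total / n
    return total / n

def monte_carlo(trials, n, r, q, r0, q0, feedback, seed=1):
    """Sample mean and mean square error of the R estimate over noise draws."""
    rng = random.Random(seed)
    estimates = [
        r_estimate(simulate(n, 0.9, 1.0, 1.0, r, q, rng),
                   0.9, 1.0, 1.0, r0, q0, feedback)
        for _ in range(trials)
    ]
    mean = sum(estimates) / trials
    mse = sum((e - r) ** 2 for e in estimates) / trials
    return mean, mse

mean_matched, mse_matched = monte_carlo(50, 2000, 2.0, 1.0, 2.0, 1.0, False)
mean_wrong, mse_wrong = monte_carlo(50, 2000, 2.0, 1.0, 4.0, 1.0, False)
mean_fb, mse_fb = monte_carlo(20, 500, 2.0, 1.0, 4.0, 1.0, True)
print(mean_matched, mean_wrong, mean_fb)
```

In this sketch the sample mean stays close to the true R when the a priori values are correct and moves away from it when the a priori R is wrong, which is the bias that the analysis in the text goes on to quantify.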

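where

$$\Delta z_n^{\prime *} = z_n - H_n \hat{x}_{n|n}^{*} = v_n - H_n \tilde{x}_{n|n}^{*}$$

$$\mathcal{E}\left(\Delta z_n^{\prime *}\,\Delta z_n^{\prime * T}\right) = R + H_n P_{n|n} H_n^{T} - H_n A_n^{*} R - R A_n^{*T} H_n^{T}$$

where

$$A_n^{*} = P_{n|n-1}^{*} H_n^{T}\left(R_0 + H_n P_{n|n-1}^{*} H_n^{T}\right)^{-1}$$

$$P_{n|n} = \mathcal{E}\left(\tilde{x}_{n|n}^{*}\,\tilde{x}_{n|n}^{*T}\right) \quad \text{(not } P_{n|n}^{*} \text{ unless } R_0 = R \text{ and } Q_0 = Q\text{)}$$

In the no feedback case, $P_{n|n}^{*}$ is not a random variable under the expectation operator, so $\mathcal{E}(P_{n|n}^{*}) = P_{n|n}^{*}$ and

$$\mathcal{E}\left(\hat{R}_n^{jj}\right) = \frac{1}{n}\sum_{i=1}^{n}\left[R + H_i P_{i|i} H_i^{T} + H_i P_{i|i}^{*} H_i^{T} - H_i A_i^{*} R - R A_i^{*T} H_i^{T}\right]^{jj} \qquad (4.4.14)$$

This can be expressed as

$$\mathcal{E}\left(\hat{R}_n^{jj}\right) = R^{jj} + F_n^{jj}$$

where

$$F_n = \frac{1}{n}\sum_{i=1}^{n} F_i, \qquad F_i = H_i P_{i|i} H_i^{T} + H_i P_{i|i}^{*} H_i^{T} - H_i A_i^{*} R - R A_i^{*T} H_i^{T}$$

If $R_0 = R$ and $Q_0 = Q$, then $P_{n|n} = P_{n|n}^{*}$, $A_n^{*} R = P_{n|n}^{*} H_n^{T}$, and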

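from the definition of $F_i$ it can be seen that $F_i = 0$ for all i. Then

$$\mathcal{E}\left(\hat{R}_n^{jj}\right) = R^{jj}$$

If $R_0 \neq R$ or $Q_0 \neq Q$, then $F_i \neq 0$ and $\hat{R}$ is biased, the bias equalling $F_n^{jj}$ above.

The unconditional expected value of $\hat{R}$ follows from the above:

$$E\left(\hat{R}_n^{jj}\right) = E\left(R^{jj}\right) + E\left(F_n^{jj}\right)$$

Here averaging is done over the ensembles mentioned above and also over the ensemble of all possible R and Q. By definition $E(R) = \bar{R}$, where $\bar{R}$ is the mean of the distribution of all possible R values, and

$$E(F_n) = \frac{1}{n}\sum_{i=1}^{n} E(F_i)$$

But

$$E(F_i) = H_i\,E(P_{i|i})\,H_i^{T} + H_i P_{i|i}^{*} H_i^{T} - H_i A_i^{*}\,E(R) - E(R)\,A_i^{*T} H_i^{T}$$

$E(P_{i|i})$ can be computed recursively using (2.3.43) and (2.3.44):

$$E(P_{i|i}) = \left(I - A_i^{*} H_i\right) E(P_{i|i-1}) \left(I - A_i^{*} H_i\right)^{T} + A_i^{*}\,\bar{R}\,A_i^{*T}$$

$$E(P_{i|i-1}) = \Phi(i,i-1)\,E(P_{i-1|i-1})\,\Phi^{T}(i,i-1) + \Gamma_i\,\bar{Q}\,\Gamma_i^{T}$$

where $\bar{Q}$ is the mean of the distribution of all possible Q values.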
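The bias recursion above can be traced numerically in the scalar case. The sketch below is illustrative and makes assumptions that are not in the report: a scalar time-invariant system, hypothetical names and parameter values, and an actual initial error variance equal to the filter's `p0`. Passing the true `r` and `q` gives the conditional bias; passing the distribution means in their place gives the unconditional one, since the recursion is linear in those quantities.

```python
def r_estimator_bias(n, phi, gamma, h, r, q, r0, q0, p0):
    """Average of F_i over n steps for a scalar no-feedback filter."""
    p_star = p0   # covariance computed by the filter from r0 and q0
    p_true = p0   # actual error covariance under r and q (assumed equal at start)
    f_sum = 0.0
    for _ in range(n):
        # Time updates for the filter's covariance and the actual covariance.
        p_star_pred = phi * p_star * phi + gamma * q0 * gamma
        p_true_pred = phi * p_true * phi + gamma * q * gamma
        # The gain is always computed from the a priori values.
        gain = p_star_pred * h / (h * p_star_pred * h + r0)
        p_star = (1.0 - gain * h) * p_star_pred
        # Actual covariance of a filter that uses a possibly wrong gain.
        p_true = (1.0 - gain * h) * p_true_pred * (1.0 - gain * h) + gain * r * gain
        # Scalar form of F_i = H P H' + H P* H' - H A* R - R A*' H'.
        f_sum += h * p_true * h + h * p_star * h - h * gain * r - r * gain * h
    return f_sum / n

bias_matched = r_estimator_bias(2000, 0.9, 1.0, 1.0, r=2.0, q=1.0, r0=2.0, q0=1.0, p0=1.0)
bias_wrong = r_estimator_bias(2000, 0.9, 1.0, 1.0, r=2.0, q=1.0, r0=4.0, q0=1.0, p0=1.0)
print(bias_matched, bias_wrong)
```

With matched a priori values the two covariance recursions coincide and the bias vanishes to rounding error; with the a priori R doubled, the sketch gives a clearly positive bias.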

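Then

$$E(F_i) = H_i \bar{P}_{i|i} H_i^{T} + H_i P_{i|i}^{*} H_i^{T} - H_i A_i^{*} \bar{R} - \bar{R} A_i^{*T} H_i^{T}$$

where $\bar{P}_{i|i} = E(P_{i|i})$, and

$$E\left(\hat{R}_n^{jj}\right) = \bar{R}^{jj} + E\left(F_n^{jj}\right)$$

If the a priori values of R and Q are assumed to be equal to the means of their respective distributions, then $R_0 = \bar{R}$ and $Q_0 = \bar{Q}$, and it can be shown that

$$E(F_i) = 0$$

Then

$$E\left(\hat{R}_n^{jj}\right) = \bar{R}^{jj}$$

Thus $\hat{R}$ is an unbiased estimator of R across the ensemble of all possible R and Q. However, if $R_0 \neq \bar{R}$ or $Q_0 \neq \bar{Q}$, then $E(F_i) \neq 0$ and $\hat{R}$ is biased, the bias equalling $E(F_n^{jj})$.

The measures of error of the estimator are chosen to be the expected squared deviation of the R estimate from the true value, or

$$\mathcal{E}\left[\left(\hat{R}_n^{jj} - R^{jj}\right)^2\right] \quad \text{and} \quad E\left[\left(\hat{R}_n^{jj} - R^{jj}\right)^2\right]$$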

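$\mathcal{E}\left[\left(\hat{R}_n^{jj} - \mathcal{E}(\hat{R}_n^{jj})\right)^2\right]$ can be computed recursively by noting that

[equation not legible in the source]

The diagonal elements of (4.4.7) are squared and the conditional expected value then evaluated. Use is made of the fact that the residuals $\Delta z_i^{\prime *}$ are zero mean normal variables in the no feedback case, and the approximation

[equation not legible in the source]

is used. It can be shown that as the filter approaches optimality ($R_0 \to R$, $Q_0 \to Q$), the above approximation is identically satisfied. Using the above approximation and after extensive algebraic manipulation,

$$\mathcal{E}\left[\left(\hat{R}_n^{jj} - R^{jj}\right)^2\right] = G_n^{jj} + \left(F_n^{jj}\right)^2$$

In this expression, $(F_n^{jj})^2$ is due to the bias of the estimator and $G_n^{jj}$ is due to possible deviations from this bias.

The unconditional mean square estimation error follows from the above.

[equation not legible in the source]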

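Evaluation of $E(G_n^{jj})$ and $E[(F_n^{jj})^2]$ is extremely complicated, so the details of their evaluation are given in Appendix B. Only the results of that evaluation are given here. Under the assumptions that $R_0 = \bar{R}$ and $Q_0 = \bar{Q}$,

[equation (4.4.17) not legible in the source]

where

$P_R$ is a diagonal $\gamma \times \gamma$ matrix whose diagonal elements are $E[(R^{jj} - \bar{R}^{jj})^2]$,

$P_Q$ is a diagonal $\eta \times \eta$ matrix whose diagonal elements are $E[(Q^{jj} - \bar{Q}^{jj})^2]$,

and the remaining quantities are a $\gamma \times \gamma$ matrix, a $\beta \times \beta$ matrix, and a $\gamma \times \eta$ matrix, each defined by a sum over k that is not legible in the source, together with

$$D_k = I - A_k^{*} H_k$$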

where

[definitions not legible in the source]

Evaluation of the no feedback Q estimator is similar to that of the R estimator. From (4.4.13),

[equation not legible in the source]

where

$$\Delta x_n^{*} = \hat{x}_{n|n}^{*} - \hat{x}_{n|n-1}^{*}$$

and

[equation not legible in the source]

The conditional expected value of $\hat{Q}_n^{jj}$ is given by

[equation not legible in the source]

where

[definition of $M_n$ not legible in the source]


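$\mathcal{E}\left[\left(\hat{Q}_n^{jj} - \mathcal{E}(\hat{Q}_n^{jj})\right)^2\right]$ can be computed recursively by noting that

[equation not legible in the source]

The diagonal elements of (4.4.13) are squared and the conditional expected value then evaluated. Use is made of the approximation

[equation not legible in the source]

Using this approximation and after extensive algebraic manipulation,

[equation (4.4.20), the recursion for $J_n^{jj}$, not legible in the source]

where

$$T_n^{*} = \Gamma_n^{-1}\left(P_{n|n}^{*} - U_n^{*}\right)\left(\Gamma_n^{T}\right)^{-1}$$

so

$$\mathcal{E}\left[\left(\hat{Q}_n^{jj} - Q^{jj}\right)^2\right] = J_n^{jj} + \left(M_n^{jj}\right)^2 \qquad (4.4.21)$$

In this expression, $M_n^{jj}$ is due to the bias of the estimator and $J_n^{jj}$ is due to possible deviations from this bias.

The unconditional mean square Q estimation error follows from the above.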

