Explicit Lp-norm estimates of infinitely divisible random vectors in Hilbert spaces with applications


University of Tennessee, Knoxville
Trace: Tennessee Research and Creative Exchange
Doctoral Dissertations, Graduate School, 5-2011

Explicit Lp-norm estimates of infinitely divisible random vectors in Hilbert spaces with applications

Matthew D. Turner, mturne2@utk.edu

Recommended Citation: Turner, Matthew D., "Explicit Lp-norm estimates of infinitely divisible random vectors in Hilbert spaces with applications." PhD diss., University of Tennessee, 2011. http://trace.tennessee.edu/utk_graddiss/135

This Dissertation is brought to you for free and open access by the Graduate School at Trace: Tennessee Research and Creative Exchange. It has been accepted for inclusion in Doctoral Dissertations by an authorized administrator of Trace: Tennessee Research and Creative Exchange. For more information, please contact trace@utk.edu.

To the Graduate Council: I am submitting herewith a dissertation written by Matthew D. Turner entitled "Explicit Lp-norm estimates of infinitely divisible random vectors in Hilbert spaces with applications." I have examined the final electronic copy of this dissertation for form and content and recommend that it be accepted in partial fulfillment of the requirements for the degree of Doctor of Philosophy, with a major in Mathematics. We have read this dissertation and recommend its acceptance: Xia Chen, Jie Xiong, Mary Leitnaker. Original signatures are on file with official student records. Jan Rosinski, Major Professor. Accepted for the Council: Dixie L. Thompson, Vice Provost and Dean of the Graduate School

To the Graduate Council: I am submitting herewith a dissertation written by Matthew D. Turner entitled "Explicit Lp-norm estimates of infinitely divisible random vectors in Hilbert spaces with applications." I have examined the final paper copy of this dissertation for form and content and recommend that it be accepted in partial fulfillment of the requirements for the degree of Doctor of Philosophy, with a major in Mathematics. Jan Rosinski, Major Professor. We have read this dissertation and recommend its acceptance: Xia Chen, Jie Xiong, Mary Leitnaker. Accepted for the Council: Carolyn R. Hodges, Vice Provost and Dean of the Graduate School. Original signatures are on file with official student records.

Explicit Lp-norm estimates of infinitely divisible random vectors in Hilbert spaces with applications

A Dissertation Presented for the Doctor of Philosophy Degree
The University of Tennessee, Knoxville

Matthew D. Turner
May 2011

Copyright © 2011 by Matthew D. Turner. All rights reserved.

Dedication

I dedicate this dissertation to my wife Heather, sons Jackson and Bryson, parents Tony and Kathy, and brother Mark. Their support and encouragement have been my motivation.

Acknowledgments

I would like to express my sincere gratitude to those individuals at the University of Tennessee who have made this dissertation possible. I am most grateful to my advisor, Dr. Jan Rosiński. It has been my privilege and pleasure to work under his guidance. I would also like to thank the members of my committee, Dr. Xia Chen, Dr. Jie Xiong, and Dr. Mary Leitnaker, for their willingness to serve and their critical review of my work. Last, but certainly not least, I would like to thank Dr. William Wade and Mrs. Pam Armentrout for their unending advice and mentoring.

Abstract

I give explicit estimates of the Lp-norm of a mean zero infinitely divisible random vector taking values in a Hilbert space in terms of a certain mixture of the L2- and Lp-norms of the Lévy measure. Using decoupling inequalities, the stochastic integral driven by an infinitely divisible random measure is defined. As a first application utilizing the Lp-norm estimates, computations of Itô isomorphisms for different types of stochastic integrals are given. As a second application, I consider the discrete time signal-observation model in the presence of an alpha-stable noise environment. A formulation is given to compute the optimal linear estimate of the system state.

Contents

1 Infinitely Divisible Distributions 1
1.1 Introduction 1
1.2 Lp-norm of Hilbert space valued infinitely divisible random vectors 4

2 Kalman Filter 21
2.1 Kalman filter theory 21
2.2 Finite L2-norm noise environment 26
2.3 α-stable noise environment 28
2.3.1 Exact 1-dimensional filtering 33
2.3.2 Vehicle tracking 35
2.3.3 Aircraft tracking 37

3 Infinitely Divisible Random Measures 42
3.1 Introduction 42
3.2 Stochastic integration 45
3.2.1 Space of integrands 45
3.2.2 The stochastic integral driven by random measures 52
3.2.3 Examples 66
3.3 Itô isomorphisms 68
3.3.1 Examples 76

4 Summary and Future Directions 86

Bibliography 88

Appendices 91
A Moments of Independent Random Variables and Vectors 92

B Modular Spaces 111
C Selected Prerequisite Analysis Results 115
C.1 Convergence results 115
C.2 Algebras 117

Vita 120

List of Figures

1.1 Explicit constant in the Lp-norm estimate 7
2.1 α-stable Kalman filter for constant velocity 1-dimensional motion 39
2.2 2-D constant velocity model (CV) 40
2.3 2-D coordinated turn model (CT) 41
A.1 Graph and approximations of c_p and d_p 104

List of Algorithms

1 Kalman filter for Gaussian noise 23
2 Kalman filter 25
3 Kalman filter for finite L2-norm noise 29
4 Iteratively reweighted least squares 34
5 Kalman filter for α-stable noise 35
6 Kalman filter for 1-dimensional α-stable noise 36

Chapter 1
Infinitely Divisible Distributions

1.1 Introduction

When producing models of an evolving dynamical system, one is often faced with the challenge of deciding which effects to include in the model and which effects may reasonably be ignored while still accurately determining the state of the system. An alternate approach is to capture these unmodeled effects as random variables or stochastic processes, which are often assumed to be Gaussian in the classical literature. Many researchers have sought extensions to such models by replacing the Gaussian assumption, as there is a need for models capturing observed heavy-tailed data exhibiting high variability and/or long range dependency. Infinitely divisible distributions have often been utilized for such modeling. The advantage of infinitely divisible models is their computability in terms of the Lévy-Khintchine triplet parameterization. Difficulties arise, however, when such distributions have infinite variance, since $L_2$-theory and orthogonality are not applicable. Instead, we seek computation of the $L_p$-norm in terms of the Lévy measure. Infinitely divisible distributions are a broad family of distributions containing many named distributions. For example, the geometric, negative binomial, and Poisson distributions are all discrete distributions in this family. So too are the continuous normal, Cauchy, gamma, F, lognormal, Pareto, Student's t, Weibull, α-stable, and tempered α-stable distributions. The following theorem characterizes infinitely divisible random vectors and will be the primary tool used for investigation

throughout. For $x \in H$, a real Hilbert space, define $[x] \stackrel{\mathrm{def}}{=} x/\max\{\|x\|, 1\}$. Whenever $H = \mathbb{R}$, we have
$$[x] = \begin{cases} x & \text{if } |x| \le 1, \\ \operatorname{sign}(x) & \text{if } |x| > 1. \end{cases}$$

Theorem 1.1.1 (Lévy-Khintchine representation). The characteristic function of an infinitely divisible random vector X taking values in a Hilbert space H can be written as
$$E e^{i\langle u, X\rangle} = \exp\Big\{ i\langle u, b\rangle - \tfrac{1}{2}\langle u, \Sigma u\rangle + \int_H \big(e^{i\langle u, x\rangle} - 1 - i\langle u, [x]\rangle\big)\, Q(dx) \Big\}, \tag{1.1}$$
where $u, b \in H$, $\Sigma$ is a nonnegative symmetric operator on H, and Q is a measure on H such that $Q(\{0\}) = 0$ and $\int_H (\|x\|^2 \wedge 1)\, Q(dx) < \infty$. Moreover, the triplet $(b, \Sigma, Q)$ completely determines the distribution of X, and this triplet is unique.

We call $(b, \Sigma, Q)$ the Lévy-Khintchine triplet of X. When $Q \equiv 0$, X is Gaussian with mean b and covariance operator $\Sigma$, and results are well known. It is the non-Gaussian case $\Sigma \equiv 0$ that is of interest to us in the following work. When studying infinitely divisible distributions and their associated random vectors, the characteristic function will be our primary tool. If we define the exponent of (1.1) by
$$C(u) \stackrel{\mathrm{def}}{=} i\langle u, b\rangle - \tfrac{1}{2}\langle u, \Sigma u\rangle + \int_H \big(e^{i\langle u, x\rangle} - 1 - i\langle u, [x]\rangle\big)\, Q(dx),$$
then C is called the cumulant of X and we have $E e^{i\langle u, X\rangle} = e^{C(u)}$. Moreover, if X is infinitely divisible with Lévy-Khintchine triplet $(b_X, \Sigma_X, Q_X)$ and cumulant $C_X(u)$, Y is infinitely divisible with Lévy-Khintchine triplet $(b_Y, \Sigma_Y, Q_Y)$ and cumulant $C_Y(u)$, and X and Y are independent, then X + Y is also infinitely divisible with cumulant $C_X(u) + C_Y(u)$ and, hence, has Lévy-Khintchine triplet $(b_X + b_Y, \Sigma_X + \Sigma_Y, Q_X + Q_Y)$. As an immediate corollary of the Lévy-Khintchine representation, the family of infinitely divisible random vectors is closed under continuous linear transformations and, in particular, projections of infinitely divisible random vectors are infinitely divisible. More precisely:

Corollary 1.1.2. Let $X \in H$ be an infinitely divisible random vector with Lévy-Khintchine triplet $(b, \Sigma, Q)$. If $F : H \to H_1$ is a continuous linear operator from the Hilbert space H into the Hilbert space $H_1$, then $FX \in H_1$ is also an infinitely divisible random vector with Lévy-Khintchine triplet $(b_F, \Sigma_F, Q_F)$, where
$$b_F \stackrel{\mathrm{def}}{=} Fb + \int_H \big([Fx] - F[x]\big)\, Q(dx), \qquad \Sigma_F \stackrel{\mathrm{def}}{=} F \Sigma F^*,$$
and, for every $B \in \mathcal{B}(H_1)$,
$$Q_F(B) \stackrel{\mathrm{def}}{=} Q\big(\{x \in H : Fx \in B \setminus \{0\}\}\big).$$

Before proving the corollary, we make a few remarks. First, if Q is a symmetric Lévy measure on H, then $Q_F$ is a symmetric Lévy measure on $H_1$. Second, the integrand in the definition of $b_F$ is an odd function. Therefore, if $b = 0$ and Q is symmetric, then $b_F = 0$ also. We point out these facts since the majority of the examples we consider will make one or both of these assumptions.

Proof of Corollary 1.1.2. Let $F : H \to H_1$ be a continuous linear operator and let $u \in H_1$. Then
$$\begin{aligned}
E e^{i\langle u, FX\rangle} &= E e^{i\langle F^* u, X\rangle} \\
&= \exp\Big\{ i\langle F^* u, b\rangle - \tfrac12 \langle F^* u, \Sigma F^* u\rangle + \int_H \big(e^{i\langle F^* u, x\rangle} - 1 - i\langle F^* u, [x]\rangle\big)\, Q(dx) \Big\} \\
&= \exp\Big\{ i\langle u, Fb\rangle - \tfrac12 \langle u, F\Sigma F^* u\rangle + \int_H \big(e^{i\langle u, Fx\rangle} - 1 - i\langle u, F[x]\rangle\big)\, Q(dx) \Big\} \\
&= \exp\Big\{ i\langle u, Fb\rangle + i\int_H \langle u, [Fx] - F[x]\rangle\, Q(dx) - \tfrac12 \langle u, F\Sigma F^* u\rangle + \int_H \big(e^{i\langle u, Fx\rangle} - 1 - i\langle u, [Fx]\rangle\big)\, Q(dx) \Big\} \\
&= \exp\Big\{ i\Big\langle u,\ Fb + \int_H ([Fx] - F[x])\, Q(dx)\Big\rangle - \tfrac12 \langle u, F\Sigma F^* u\rangle + \int_{H_1} \big(e^{i\langle u, x\rangle} - 1 - i\langle u, [x]\rangle\big)\, Q_F(dx) \Big\} \\
&= \exp\Big\{ i\langle u, b_F\rangle - \tfrac12 \langle u, \Sigma_F u\rangle + \int_{H_1} \big(e^{i\langle u, x\rangle} - 1 - i\langle u, [x]\rangle\big)\, Q_F(dx) \Big\}. \qquad\square
\end{aligned}$$
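Corollary 1.1.2 is straightforward to exercise numerically when Q is a finite discrete measure, so that X is compound Poisson plus drift. The following sketch (helper names are mine, not from the text) pushes such a triplet through a linear map F on Euclidean space, where $F^* = F^T$; the symmetric example reproduces the remark above that $b_F = 0$ when $b = 0$ and Q is symmetric.

```python
import numpy as np

def bracket(x):
    """[x] = x / max(||x||, 1), the centering function of (1.1)."""
    return x / max(np.linalg.norm(x), 1.0)

def transform_triplet(b, Sigma, atoms, masses, F):
    """Push (b, Sigma, Q) through a linear map F per Corollary 1.1.2.

    Q is the discrete measure sum_j masses[j] * delta_{atoms[j]}.
    Returns (b_F, Sigma_F, atoms_F, masses_F).
    """
    # b_F = F b + integral of ([Fx] - F[x]) Q(dx)
    correction = sum(m * (bracket(F @ x) - F @ bracket(x))
                     for x, m in zip(atoms, masses))
    b_F = F @ b + correction
    Sigma_F = F @ Sigma @ F.T          # F Sigma F^* with F^* = F^T
    # Q_F is the image measure of Q under F, discarding atoms mapped to 0
    pairs = [(F @ x, m) for x, m in zip(atoms, masses)
             if np.linalg.norm(F @ x) > 0]
    return b_F, Sigma_F, [x for x, _ in pairs], [m for _, m in pairs]

# Example: project a symmetric Q on R^2 onto the first coordinate.
F = np.array([[1.0, 0.0]])
atoms = [np.array([2.0, 1.0]), np.array([-2.0, -1.0])]
masses = [0.5, 0.5]
b_F, Sigma_F, atoms_F, masses_F = transform_triplet(
    np.zeros(2), np.zeros((2, 2)), atoms, masses, F)
print(b_F, atoms_F, masses_F)   # b_F = 0 by symmetry, as remarked above
```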

In practice, the normal distribution is justified in its use by the central limit theorem and is a popular distribution in modeling because of the ease of computations when $L_2$-orthogonality is applicable. Under the assumption of non-Gaussian distributions, it is often not known how the error should be measured. The next section addresses this question for infinitely divisible distributions. In Chapter 2, we will apply this result to obtain the Kalman filter for a discrete time signal-observation model with infinite covariance noise. In Chapter 3, we will define the stochastic integral of a stochastic field driven by an infinitely divisible random measure. Itô isomorphisms will be derived for the stochastic integral.

1.2 $L_p$-norm of Hilbert space valued infinitely divisible random vectors

Let X be a mean zero random vector taking values in a separable Hilbert space H with characteristic function given by (1.1). When X is purely Gaussian ($Q \equiv 0$), the $L_p$-norm of X is controlled by the covariance matrix Σ. In the non-Gaussian case, Marcus and Rosiński (2001) showed that, for $X \in L_1$, the $L_1$-norm of X is controlled by the Lévy measure Q as
$$0.25\, l(Q) \le E\|X\| \le 2.125\, l(Q),$$
where the functional $l = l(Q)$ satisfies
$$\int_H \min\Big\{ \frac{\|x\|^2}{l^2},\ \frac{\|x\|}{l} \Big\}\, Q(dx) = 1.$$
The following theorem generalizes this result to obtain bounds on the $L_p$-norm of X. Assume that X is in $L_p$ for a given $p \ge 1$, $EX = 0$, and that X does not have a Gaussian component. The characteristic function of X can be written as
$$E \exp\big(i\langle u, X\rangle\big) = \exp\Big( \int_H \big(e^{i\langle u, x\rangle} - 1 - i\langle u, [x]\rangle\big)\, Q(dx) \Big).$$
We assume throughout that Q is symmetric and later remark on removing this restriction by standard symmetrization techniques. Since Q is assumed symmetric,

the characteristic function of X is
$$E \exp\big(i\langle u, X\rangle\big) = \exp\Big( \int_H \big(\cos\langle u, x\rangle - 1\big)\, Q(dx) \Big).$$
It is well known that an infinitely divisible random vector X with Lévy measure Q has finite $L_p$-norm if and only if $\int_{\|x\|\ge 1} \|x\|^p\, Q(dx)$ is finite (see e.g. Sato (2002), Corollary 25.8). Therefore the Lévy measure Q satisfies
$$\int_H \Big( \|x\|^2\, 1_{\{\|x\|<1\}} + \|x\|^p\, 1_{\{\|x\|\ge 1\}} \Big)\, Q(dx) < \infty.$$
Let the functional $l = l(Q)$ be given by the solution of
$$\xi(l) \stackrel{\mathrm{def}}{=} \int_H \Big( \frac{\|x\|^2}{l^2}\, 1_{\{\|x\|/l < 1\}} + \frac{\|x\|^p}{l^p}\, 1_{\{\|x\|/l \ge 1\}} \Big)\, Q(dx) = 1. \tag{1.2}$$
We remark that
$$\|x\|^2\, 1_{\{\|x\|<1\}} + \|x\|^p\, 1_{\{\|x\|\ge 1\}} = \begin{cases} \min\{\|x\|^2, \|x\|^p\} & \text{if } 1 \le p \le 2, \\ \max\{\|x\|^2, \|x\|^p\} & \text{if } p > 2. \end{cases}$$
We can view l as a special mixture of the $L_2$-norm and $L_p$-norm of Q. In the case of non-Gaussian infinitely divisible random vectors, the following theorem gives explicit estimates of the $L_p$-norm in terms of the Lévy measure Q.

Theorem 1.2.1. Let $p \ge 1$. Assume that $X \in L_p$ is a mean zero infinitely divisible random vector without Gaussian component, taking values in the Hilbert space H, and that X has symmetric Lévy measure Q. Then
$$0.25\, l \le \|X\|_p \le K_p\, l, \tag{1.3}$$
where
$$K_p \stackrel{\mathrm{def}}{=} \begin{cases} 1 + \big(2(3^p+1)\big)^{1/p}, & \text{if } 1 \le p \le 2,\\ 4^{1/4} + \big(1+\tfrac14\big)^{1/p}, & \text{if } 2 < p \le 3,\\ 4^{1/4} + \big(K_{3,p}(K_{4,p}+1)\big)^{1/p}, & \text{if } 3 < p < 4,\\ 2\cdot 4^{1/4}, & \text{if } p = 4,\\ K_{1,p}(K_{2,p}+1) + \big(K_{3,p}(K_{4,p}+1)\big)^{1/p}, & \text{if } p > 4, \end{cases} \tag{1.4}$$

where $K_{1,p} \stackrel{\mathrm{def}}{=} \big((p+1)^{p+1}/2\big)^{1/p}$, $K_{2,p} \stackrel{\mathrm{def}}{=} 4(1+1/p)^2$, $K_{3,p} \stackrel{\mathrm{def}}{=} 4\cdot 2^{2p/(p+1)}$, $K_{4,p} \stackrel{\mathrm{def}}{=} 2^{2p+1}(4x_0+5)^{p/2}$, and $x_0 \approx 4.7591$ solves $x = e\log(x+1)$.

We remark on important cases for the constant $K_p$. First, it is the $1 \le p < 2$ case that is of most interest to us, as $L_p$-theory must be used when working with models containing infinite covariance noise or random driving terms. It is often challenging, if not impossible, to compute such norms directly. Second, we have very nice constants for estimation of the mean, variance, skewness, and kurtosis. The constant $K_p$ is graphed in Figure 1.1.

In preparation for the proof of Theorem 1.2.1, we follow the lead of Marcus and Rosiński (2001) and decompose X as X = Y + Z, where Y and Z are independent mean zero random vectors with characteristic functions
$$E \exp\big(i\langle u, Y\rangle\big) = \exp\Big( \int_{\|x\|<l} \big(\cos\langle u, x\rangle - 1\big)\, Q(dx) \Big)$$
and
$$E \exp\big(i\langle u, Z\rangle\big) = \exp\Big( \int_{\|x\|\ge l} \big(\cos\langle u, x\rangle - 1\big)\, Q(dx) \Big),$$
respectively. The following four lemmas provide upper and lower bounds for norms of Y and Z and will be used in the proof of Theorem 1.2.1.

Lemma 1.2.2. We have the following upper bounds on norms of Y:
i. If $1 \le p \le 2$, then
$$\|Y\|_p \le \|Y\|_2 = \Big(\int_{\|x\|<l}\|x\|^2\, Q(dx)\Big)^{1/2}. \tag{1.5}$$
ii. If $2 < p \le 4$, then
$$\|Y\|_p \le \|Y\|_4 = \Big(\int_{\|x\|<l}\|x\|^4\, Q(dx) + 3\Big(\int_{\|x\|<l}\|x\|^2\, Q(dx)\Big)^2\Big)^{1/4}. \tag{1.6}$$
iii. If $p > 4$, then
$$\|Y\|_p \le K_{1,p}\big(K_{2,p}\|Y\|_2 + l\big), \tag{1.7}$$

Figure 1.1: Explicit constant $K_p$ in the $L_p$-norm estimate (1.3), plotted for $1 \le p \le 4$ and $4 < p \le 8$.

where $K_{1,p}$ and $K_{2,p}$ are given in Theorem 1.2.1.

Proof. (1.5) and (1.6) were proved by Marcus and Rosiński (2001), Lemma 1.1. Now let $p > 4$. Let $\{Y_t\}_{t\ge 0}$ be a Lévy process such that $Y_1 \stackrel{d}{=} Y$. Since the Lévy measure of Y, and hence of $Y_t$, is supported on $\{\|x\| < l\}$, the sample path $t \mapsto Y_t(\omega)$ a.s. has no jumps of magnitude larger than l on $t \in [0, 1]$. So there exists $\Omega_0 \subset \Omega$ with $P(\Omega_0) = 1$ such that $\|Y_t(\omega) - Y_{t-}(\omega)\| \le l$ for every $\omega \in \Omega_0$ and for every $t \in [0, 1]$. For each $n \in \mathbb{N}$, we may write Y as the sum of n i.i.d. random vectors by
$$Y \stackrel{d}{=} \big(Y_1 - Y_{\frac{n-1}{n}}\big) + \big(Y_{\frac{n-1}{n}} - Y_{\frac{n-2}{n}}\big) + \cdots + Y_{\frac1n} = \sum_{k=1}^{n} \Delta_k Y, \quad\text{where}\quad \Delta_k Y \stackrel{\mathrm{def}}{=} Y_{\frac kn} - Y_{\frac{k-1}{n}}.$$
Fix $\varepsilon > 0$ and $\omega \in \Omega_0$. Since
$$\{t \in [0, 1] : \|Y_t(\omega) - Y_{t-}(\omega)\| \ge l + \varepsilon\} = \emptyset,$$

standard analysis results give that there exists $N = N(\omega)$ so large that for each $n \ge N(\omega)$,
$$\|\Delta_k Y(\omega)\| < l + \varepsilon$$
for every $1 \le k \le n$. For each $n \in \mathbb{N}$, define a new i.i.d. sequence of bounded random vectors $\{Y_{k,n}\}_{k=1}^n$ by
$$Y_{k,n} \stackrel{\mathrm{def}}{=} \Delta_k Y \cdot 1_{\{\|\Delta_k Y\| < l + \varepsilon\}}.$$
For each $\omega \in \Omega_0$, $Y_{k,n}(\omega) = \Delta_k Y(\omega)$ for every $n \ge N(\omega)$. We now have that
$$S_n \stackrel{\mathrm{def}}{=} \sum_{k=1}^{n} Y_{k,n} \to Y \quad\text{a.s.},$$
since $P(\Omega_0) = 1$. Observe that for fixed n, $\{Y_{k,n}\}_{k=1}^n$ is a sequence of symmetric (since Q is assumed symmetric) i.i.d. random vectors bounded by $l + \varepsilon$. By de la Peña and Giné (1999), Theorem 1.2.5, a Hoffmann-Jørgensen type inequality, for every $n \ge N$,
$$\|S_n\|_p \le K_{1,p}\Big(K_{2,p}\|S_n\|_2 + \big\|\max_{1\le k\le n}\|Y_{k,n}\|\big\|_p\Big).$$
But
$$\|S_n\|_2^2 = E\Big\|\sum_{k=1}^n Y_{k,n}\Big\|^2 \le E\Big\|\sum_{k=1}^n \Delta_k Y\Big\|^2 = E\|Y\|^2 \quad\text{and}\quad \|Y_{k,n}\| < l + \varepsilon$$
for every $1 \le k \le n$. Hence, for every $n \ge N$,
$$\|S_n\|_p < K_{1,p}\big(K_{2,p}\|Y\|_2 + l + \varepsilon\big).$$
By Fatou's lemma and the arbitrariness of ε,
$$\|Y\|_p \le K_{1,p}\big(K_{2,p}\|Y\|_2 + l\big). \qquad\square$$
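Lemma 1.2.2(i) is easy to sanity-check by simulation when Q is a finite symmetric measure supported inside $\{\|x\| < l\}$, since Y is then compound Poisson. A minimal sketch under that toy assumption (the measure and all names are illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy symmetric Levy measure on R: atoms +/-0.5 with mass 2 each,
# all jumps below the cutoff l = 1, so Y is compound Poisson.
atoms = np.array([0.5, -0.5])
masses = np.array([2.0, 2.0])
lam = masses.sum()

def sample_Y(size):
    """Sample the small-jump part Y as a compound Poisson sum."""
    n = rng.poisson(lam, size=size)
    return np.array([atoms[rng.choice(2, k, p=masses / lam)].sum() for k in n])

y = sample_Y(200_000)
p = 1.5
lhs = np.mean(np.abs(y) ** p) ** (1 / p)    # Monte Carlo ||Y||_p
rhs = np.sqrt((masses * atoms**2).sum())    # (int |x|^2 Q(dx))^{1/2}
print(lhs, "<=", rhs)                       # per (1.5), lhs stays below rhs
```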

Lemma 1.2.3. We have the following lower bounds on norms of Y:
i. If $1 \le p \le 2$, then
$$E\|Y\|^p \ge \frac{E\|Y\|^2}{\big(l^2 + 3E\|Y\|^2\big)^{(2-p)/2}}. \tag{1.8}$$
ii. If $p > 2$, then
$$\|Y\|_p \ge \|Y\|_2 = \Big(\int_{\|x\|<l}\|x\|^2\, Q(dx)\Big)^{1/2}. \tag{1.9}$$

Proof. Let $1 \le p \le 2$. To show (1.8), Hölder's inequality gives
$$E\|Y\|^2 = E\Big(\|Y\|^{\frac{2p}{4-p}}\,\|Y\|^{\frac{4(2-p)}{4-p}}\Big) \le \big(E\|Y\|^p\big)^{\frac{2}{4-p}}\big(E\|Y\|^4\big)^{\frac{2-p}{4-p}}$$
and hence
$$E\|Y\|^p \ge \frac{\big(E\|Y\|^2\big)^{(4-p)/2}}{\big(E\|Y\|^4\big)^{(2-p)/2}}.$$
Applying (1.6) to the denominator gives
$$E\|Y\|^p \ge \frac{\big(E\|Y\|^2\big)^{(4-p)/2}}{\big(l^2\,E\|Y\|^2 + 3(E\|Y\|^2)^2\big)^{(2-p)/2}} = \frac{E\|Y\|^2}{\big(l^2 + 3E\|Y\|^2\big)^{(2-p)/2}},$$
proving (1.8). This technique is known as Littlewood's approach. (1.9) is immediate by (1.5). $\square$

Lemma 1.2.4. We have the following upper bounds on norms of Z:
i. If $1 \le p \le 2$, then
$$E\|Z\|^p \le c_p \int_{\|x\|\ge l} \|x\|^p\, Q(dx), \tag{1.10}$$
where $c_p = 2(3^p + 1)$. If $H = \mathbb{R}$, the constant may be taken as the $c_p$ given by (A.12) or (A.20) instead.

ii. If $2 < p \le 3$, then
$$E\|Z\|^p \le \int_{\|x\|\ge l}\|x\|^p\, Q(dx) + \frac14\Big(\int_{\|x\|\ge l}\|x\|^2\, Q(dx)\Big)^2. \tag{1.11}$$
iii. Let $\lambda > p/x_0$. If $3 < p < 4$ or if $p > 4$, then
$$E\|Z\|^p \le K_{3,p}\Big(K_{4,p}\Big(\int_{\|x\|\ge l}\|x\|^2\, Q(dx)\Big)^{p/2} + \int_{\|x\|\ge l}\|x\|^p\, Q(dx)\Big), \tag{1.12}$$
where $K_{3,p}$ and $K_{4,p}$ are given in Theorem 1.2.1.
iv. Let $\lambda \le p/x_0$. If $3 < p < 4$ or if $p > 4$, then
$$E\|Z\|^p \le \max\Big\{\Big(1+\frac{8}{\log x_0}\Big)^p,\ \big(6\log x_0\big)^p\Big\}\int_{\|x\|\ge l}\|x\|^p\, Q(dx). \tag{1.13}$$
v. If $p = 4$, then
$$E\|Z\|^4 = \int_{\|x\|\ge l}\|x\|^4\, Q(dx) + 3\Big(\int_{\|x\|\ge l}\|x\|^2\, Q(dx)\Big)^2. \tag{1.14}$$

Proof. First, (1.14) follows exactly as in (1.6) by standard computation from the characteristic function. Next let $\lambda \stackrel{\mathrm{def}}{=} Q(\|x\| \ge l)$ and let $\{W_i\}_{i\in\mathbb{N}}$ be a collection of i.i.d. random vectors in H such that $P(W_i \in A) = \lambda^{-1} Q(A \cap \{\|x\| \ge l\})$. Let N be a Poisson random variable with mean λ, independent of $\{W_i\}_{i\in\mathbb{N}}$. Now Z is a compound Poisson random vector and we have
$$Z \stackrel{d}{=} \sum_{i=1}^{N} W_i. \tag{1.15}$$
Then
$$E\|Z\|^p = \sum_{k=0}^{\infty} E\Big\|\sum_{i=1}^{k} W_i\Big\|^p\, P(N = k). \tag{1.16}$$

First let $1 \le p \le 2$. By Corollary A.6 if $H = \mathbb{R}$, or Theorem A.2 in general, for each $k \in \mathbb{N}$, $E\|\sum_{i=1}^k W_i\|^p$ is bounded above by $c_p \sum_{i=1}^k E\|W_i\|^p = c_p\, k\, E\|W_1\|^p$. Utilizing this in (1.16) gives
$$E\|Z\|^p \le c_p\, E\|W_1\|^p \sum_{k=0}^\infty k\, P(N = k) = c_p\, E\|W_1\|^p\, EN = c_p\, E\|W_1\|^p\, \lambda,$$
since N is a Poisson random variable with mean λ. But $E\|W_1\|^p = \int_{\|x\|\ge l} \|x\|^p\, \lambda^{-1} Q(dx)$ and hence
$$E\|Z\|^p \le c_p \int_{\|x\|\ge l} \|x\|^p\, Q(dx),$$
proving (1.10).

Next, let $2 < p \le 3$. By Theorem A.1,
$$E\Big\|\sum_{i=1}^{k} W_i\Big\|^p \le k\, E\|W_1\|^p + \frac{p-1}{2}\sum_{i=1}^{k}\Big(\sum_{j=1}^{i-1} E\|W_j\|^2\Big)^{(p-2)/2} E\|W_i\|^2 \le k\, E\|W_1\|^p + \frac{k(k-1)}{4}\big(E\|W_1\|^2\big)^2.$$
Again recalling that N is Poisson, substituting into (1.16) gives
$$\begin{aligned}
E\|Z\|^p &\le EN\, E\|W_1\|^p + \frac14\, E(N^2 - N)\, \big(E\|W_1\|^2\big)^2 \\
&= \lambda \int_{\|x\|\ge l}\|x\|^p\, \lambda^{-1}Q(dx) + \frac{\lambda^2}{4}\Big(\int_{\|x\|\ge l}\|x\|^2\, \lambda^{-1}Q(dx)\Big)^2 \\
&= \int_{\|x\|\ge l}\|x\|^p\, Q(dx) + \frac14\Big(\int_{\|x\|\ge l}\|x\|^2\, Q(dx)\Big)^2.
\end{aligned}$$

Finally, let $p > 3$. If $\lambda > p/x_0$, we have by de la Peña and Giné (1999), Theorem 1.2.5, a Hoffmann-Jørgensen type inequality, together with the convexity bound $(a+b)^{p+1} \le 2^p(a^{p+1} + b^{p+1})$ for every $a, b \ge 0$,
$$E\Big\|\sum_{i=1}^{k} W_i\Big\|^p \le K_{3,p}\Big(2^{2p+1}\, k^{p/2}\big(E\|W_1\|^2\big)^{p/2} + k\, E\|W_1\|^p\Big).$$
Substituting into (1.16) gives
$$E\|Z\|^p \le K_{3,p}\Big(2^{2p+1}\, \lambda^{-p/2}\, EN^{p/2}\Big(\int_{\|x\|\ge l}\|x\|^2\, Q(dx)\Big)^{p/2} + \int_{\|x\|\ge l}\|x\|^p\, Q(dx)\Big).$$
To bound $\lambda^{-p/2} EN^{p/2}$, Kwapień and Woyczyński (2009), Proposition 1.7.2, showed that in the case $\lambda > p/x_0$,
$$EN^{p/2} \le (4p + 5\lambda)^{p/2}.$$
Hence
$$\lambda^{-p/2}\, EN^{p/2} \le \big(4p/\lambda + 5\big)^{p/2} \le \big(4x_0 + 5\big)^{p/2}$$

and we have
$$E\|Z\|^p \le K_{3,p}\Big(K_{4,p}\Big(\int_{\|x\|\ge l}\|x\|^2\, Q(dx)\Big)^{p/2} + \int_{\|x\|\ge l}\|x\|^p\, Q(dx)\Big),$$
proving (1.12). Now suppose that $\lambda \le p/x_0$. For each $\omega \in \Omega$, Hölder's inequality gives
$$\Big\|\sum_{i=1}^{k} W_i(\omega)\Big\| \le \sum_{i=1}^{k}\|W_i(\omega)\| \le k^{1-1/p}\Big(\sum_{i=1}^{k}\|W_i(\omega)\|^p\Big)^{1/p}$$
and hence
$$E\Big\|\sum_{i=1}^{k} W_i\Big\|^p \le k^{p-1}\sum_{i=1}^{k} E\|W_i\|^p = k^p\, E\|W_1\|^p.$$
Substituting into (1.16) gives
$$E\|Z\|^p \le E\|W_1\|^p \sum_{k=0}^{\infty} k^p\, P(N = k) = \int_{\|x\|\ge l}\|x\|^p\, \lambda^{-1}Q(dx)\; EN^p. \tag{1.17}$$
To bound $\lambda^{-1} EN^p$, Kwapień and Woyczyński (2009), Proposition 1.7.2, also showed that in the case $\lambda \le p/x_0$,
$$\lambda^{-1} EN^p \le \max\Big\{\Big(1+\frac{8}{\log x_0}\Big)^p,\ \big(6\log x_0\big)^p\Big\}.$$
Combining with (1.17) gives
$$E\|Z\|^p \le \max\Big\{\Big(1+\frac{8}{\log x_0}\Big)^p,\ \big(6\log x_0\big)^p\Big\}\int_{\|x\|\ge l}\|x\|^p\, Q(dx). \qquad\square$$
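The compound Poisson representation (1.15) also makes Lemma 1.2.4 directly checkable by simulation for a toy tail measure. A sketch for the $1 \le p \le 2$ bound (1.10), under the reconstructed constant $c_p = 2(3^p+1)$ used above (measure and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy symmetric tail measure Q restricted to {|x| >= l}, with l = 1:
atoms = np.array([1.5, -1.5, 4.0, -4.0])
masses = np.array([0.3, 0.3, 0.05, 0.05])
lam = masses.sum()
p = 1.5
c_p = 2 * (3**p + 1)    # constant of Lemma 1.2.4(i), as reconstructed

n = rng.poisson(lam, size=100_000)
z = np.array([atoms[rng.choice(4, k, p=masses / lam)].sum() for k in n])
lhs = np.mean(np.abs(z) ** p)                     # Monte Carlo E|Z|^p
rhs = c_p * (masses * np.abs(atoms) ** p).sum()   # c_p * int |x|^p Q(dx)
print(lhs, "<=", rhs)
```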

Lemma 1.2.5. If $p \ge 1$, we have the following lower bound on norms of Z:
$$E\|Z\|^p \ge \frac{1 - e^{-\lambda}}{\lambda}\int_{\|x\|\ge l}\|x\|^p\, Q(dx). \tag{1.18}$$

Proof. Let $p \ge 1$. Since we have assumed that Q is symmetric, Lemma A.7 gives
$$E\Big\|\sum_{i=1}^{k} W_i\Big\|^p \ge E\|W_1\|^p.$$
Substituting into (1.16) gives
$$E\|Z\|^p \ge \sum_{k=1}^{\infty} E\|W_1\|^p\, P(N = k) = \frac{1 - e^{-\lambda}}{\lambda}\int_{\|x\|\ge l}\|x\|^p\, Q(dx). \qquad\square$$

We are now ready to prove the upper bound of Theorem 1.2.1 using Lemma 1.2.2 and Lemma 1.2.4.

Proof of upper bound of Theorem 1.2.1. First assume that $1 \le p \le 2$. From (1.5) and (1.10), we have
$$\begin{aligned}
\|X\|_p &\le \|Y\|_p + \|Z\|_p \le \Big(\int_{\|x\|<l}\|x\|^2\, Q(dx)\Big)^{1/2} + \Big(c_p\int_{\|x\|\ge l}\|x\|^p\, Q(dx)\Big)^{1/p} \\
&= \Big(\int_{\|x\|<l}\frac{\|x\|^2}{l^2}\, Q(dx)\Big)^{1/2} l + c_p^{1/p}\Big(\int_{\|x\|\ge l}\frac{\|x\|^p}{l^p}\, Q(dx)\Big)^{1/p} l. \tag{1.19}
\end{aligned}$$
By definition (1.2) of l,
$$\int_{\|x\|\ge l}\frac{\|x\|^p}{l^p}\, Q(dx) = 1 - \int_{\|x\|<l}\frac{\|x\|^2}{l^2}\, Q(dx).$$
Substituting into (1.19) gives
$$\|X\|_p \le \Big\{\Big(\int_{\|x\|<l}\frac{\|x\|^2}{l^2}\, Q(dx)\Big)^{1/2} + c_p^{1/p}\Big(1 - \int_{\|x\|<l}\frac{\|x\|^2}{l^2}\, Q(dx)\Big)^{1/p}\Big\}\, l.$$

Clearly, by definition (1.2) of l we have
$$\int_{\|x\|<l}\frac{\|x\|^2}{l^2}\, Q(dx) \le 1$$
and hence
$$\|X\|_p \le \max_{0\le a\le 1}\Big(a + c_p^{1/p}\big(1 - a^2\big)^{1/p}\Big)\, l \le \big(1 + c_p^{1/p}\big)\, l.$$
Next, let $2 < p \le 3$. Combining (1.6) and (1.11) gives
$$\begin{aligned}
\|X\|_p &\le \|Y\|_p + \|Z\|_p \\
&\le \Big(\int_{\|x\|<l}\|x\|^4 Q(dx) + 3\Big(\int_{\|x\|<l}\|x\|^2 Q(dx)\Big)^2\Big)^{1/4} + \Big(\int_{\|x\|\ge l}\|x\|^p Q(dx) + \frac14\Big(\int_{\|x\|\ge l}\|x\|^2 Q(dx)\Big)^2\Big)^{1/p} \\
&\le \Big(\int_{\|x\|<l}\frac{\|x\|^2}{l^2} Q(dx) + 3\Big(\int_{\|x\|<l}\frac{\|x\|^2}{l^2} Q(dx)\Big)^2\Big)^{1/4} l + \Big(\int_{\|x\|\ge l}\frac{\|x\|^p}{l^p} Q(dx) + \frac14\Big(\int_{\|x\|\ge l}\frac{\|x\|^2}{l^2} Q(dx)\Big)^2\Big)^{1/p} l \\
&\le (1+3)^{1/4}\, l + \Big(1 + \frac14\Big)^{1/p} l = \Big(4^{1/4} + \big(1 + \tfrac14\big)^{1/p}\Big)\, l.
\end{aligned}$$
Now let $3 < p < 4$. If $\lambda > p/x_0$, (1.12) gives
$$E\|Z\|^p \le K_{3,p}\Big(K_{4,p}\Big(\int_{\|x\|\ge l}\frac{\|x\|^2}{l^2}\, Q(dx)\Big)^{p/2} + \int_{\|x\|\ge l}\frac{\|x\|^p}{l^p}\, Q(dx)\Big)\, l^p \le K_{3,p}\big(K_{4,p} + 1\big)\, l^p,$$

and if $\lambda \le p/x_0$, (1.13) gives
$$E\|Z\|^p \le \max\Big\{\Big(1+\frac{8}{\log x_0}\Big)^p, \big(6\log x_0\big)^p\Big\}\int_{\|x\|\ge l}\frac{\|x\|^p}{l^p}\, Q(dx)\, l^p \le \max\Big\{\Big(1+\frac{8}{\log x_0}\Big)^p, \big(6\log x_0\big)^p\Big\}\, l^p.$$
In either case, we have $E\|Z\|^p \le K_{3,p}(K_{4,p} + 1)\, l^p$. This, along with (1.6), gives
$$\|X\|_p \le \|Y\|_4 + \|Z\|_p \le \Big(4^{1/4} + \big(K_{3,p}(K_{4,p}+1)\big)^{1/p}\Big)\, l.$$
Now let $p = 4$. Combining (1.6) and (1.14) gives
$$\begin{aligned}
\|X\|_4 &\le \|Y\|_4 + \|Z\|_4 \\
&= \Big(\int_{\|x\|<l}\|x\|^4 Q(dx) + 3\Big(\int_{\|x\|<l}\|x\|^2 Q(dx)\Big)^2\Big)^{1/4} + \Big(\int_{\|x\|\ge l}\|x\|^4 Q(dx) + 3\Big(\int_{\|x\|\ge l}\|x\|^2 Q(dx)\Big)^2\Big)^{1/4} \\
&\le \Big(\int_{\|x\|<l}\frac{\|x\|^2}{l^2} Q(dx) + 3\Big(\int_{\|x\|<l}\frac{\|x\|^2}{l^2} Q(dx)\Big)^2\Big)^{1/4} l + \Big(\int_{\|x\|\ge l}\frac{\|x\|^4}{l^4} Q(dx) + 3\Big(\int_{\|x\|\ge l}\frac{\|x\|^4}{l^4} Q(dx)\Big)^2\Big)^{1/4} l \\
&\le (1+3)^{1/4}\, l + (1+3)^{1/4}\, l = 2\cdot 4^{1/4}\, l.
\end{aligned}$$

Finally, let $p > 4$. Combining (1.7) and the bounds on $\|Z\|_p$ from the $3 < p < 4$ case, we have
$$\begin{aligned}
\|X\|_p &\le \|Y\|_p + \|Z\|_p \le K_{1,p}\big(K_{2,p}\|Y\|_2 + l\big) + \big(K_{3,p}(K_{4,p}+1)\big)^{1/p}\, l \\
&= K_{1,p}\Big(K_{2,p}\Big(\int_{\|x\|<l}\frac{\|x\|^2}{l^2}\, Q(dx)\Big)^{1/2} l + l\Big) + \big(K_{3,p}(K_{4,p}+1)\big)^{1/p}\, l \\
&\le \Big(K_{1,p}\big(K_{2,p} + 1\big) + \big(K_{3,p}(K_{4,p}+1)\big)^{1/p}\Big)\, l. \qquad\square
\end{aligned}$$

We are now ready to prove the lower bound of Theorem 1.2.1 using Lemma 1.2.3 and Lemma 1.2.5.

Proof of lower bound of Theorem 1.2.1. By (1.2), either
$$\int_{\|x\|<l}\frac{\|x\|^2}{l^2}\, Q(dx) \ge 0.5 \tag{1.20}$$
or
$$\int_{\|x\|\ge l}\frac{\|x\|^p}{l^p}\, Q(dx) \ge 0.5 \tag{1.21}$$
must be true. Assume (1.20) holds. If $1 \le p \le 2$, Lemma A.7 and (1.8) combine to give
$$E\|X\|^p = E\|Y + Z\|^p \ge E\|Y\|^p \ge \frac{E\|Y\|^2}{\big(l^2 + 3E\|Y\|^2\big)^{(2-p)/2}}.$$
Since the function $t \mapsto t\big(l^2 + 3t\big)^{-(2-p)/2}$ is increasing in t, and $E\|Y\|^2 \ge 0.5\, l^2$ by (1.20),
$$E\|X\|^p \ge \frac{0.5\, l^2}{\big(l^2 + 3\cdot 0.5\, l^2\big)^{(2-p)/2}} = 0.5\Big(\frac25\Big)^{(2-p)/2} l^p \ge \Big(\frac l4\Big)^p,$$
and hence $\|X\|_p \ge 0.25\, l$.

If $p > 2$, then by Lemma A.7 and (1.9),
$$\|X\|_p \ge \|Y\|_p \ge \|Y\|_2 = \Big(\int_{\|x\|<l}\|x\|^2\, Q(dx)\Big)^{1/2} \ge \sqrt{0.5}\, l > 0.25\, l.$$
Now assume (1.21) holds. Since $\|x\|/l \ge 1$ on $\{\|x\| \ge l\}$, we also have $\int_{\|x\|\ge l}(\|x\|/l)^p\, Q(dx) \ge \lambda$, where $\lambda = Q(\|x\| \ge l)$. Averaging this with (1.21) gives
$$\int_{\|x\|\ge l}\frac{\|x\|^p}{l^p}\, Q(dx) \ge \frac{0.5 + \lambda}{2} = \frac{1 + 2\lambda}{4}.$$
We may combine this with the lower bound inequality (1.18) and utilize Lemma A.7 as in the above case to get
$$E\|X\|^p \ge E\|Z\|^p \ge \frac{1 - e^{-\lambda}}{\lambda}\cdot\frac{l^p}{4}\big(1 + 2\lambda\big) \ge \frac{l^p}{4},$$
and hence $\|X\|_p \ge (1/4)^{1/p}\, l \ge 0.25\, l$. In either case, the left hand inequality in (1.3) holds. $\square$

Recall that we have been working under the assumption that Q is symmetric. To remove this restriction, assume that X is a mean zero infinitely divisible random vector in $L_p$ with Lévy measure Q and let $X^s$ be the standard symmetrization of X. The Lévy measure of $X^s$ is given by $Q^s(A) = Q(A) + Q(-A)$ and, if c solves (1.2) for $Q^s$, we have that c also solves
$$\int_H \Big(\frac{\|x\|^2}{c^2}\, 1_{\{\|x\|<c\}} + \frac{\|x\|^p}{c^p}\, 1_{\{\|x\|\ge c\}}\Big)\, Q(dx) = \frac12. \tag{1.22}$$

By Corollary A.8 and Theorem 1.2.1,
$$\frac18\, c \le \frac12\|X^s\|_p \le \|X\|_p \le \|X^s\|_p \le K_p\, c.$$
Now let l solve
$$\int_H \Big(\frac{\|x\|^2}{l^2}\, 1_{\{\|x\|<l\}} + \frac{\|x\|^p}{l^p}\, 1_{\{\|x\|\ge l\}}\Big)\, Q(dx) = 1 \tag{1.23}$$
and let
$$k \stackrel{\mathrm{def}}{=} \begin{cases} 2^{1/p}, & \text{if } 1 \le p \le 2, \\ \sqrt{2}, & \text{if } p > 2. \end{cases}$$
Then $k > 1$ and, if $1 \le p \le 2$, we have
$$\int_H \min\Big\{\frac{\|x\|^2}{(kl)^2}, \frac{\|x\|^p}{(kl)^p}\Big\}\, Q(dx) \le \max\Big\{\frac{1}{k^2}, \frac{1}{k^p}\Big\}\int_H \min\Big\{\frac{\|x\|^2}{l^2}, \frac{\|x\|^p}{l^p}\Big\}\, Q(dx) = \frac{1}{k^p} = \frac12,$$
or, if $p > 2$, we have
$$\int_H \max\Big\{\frac{\|x\|^2}{(kl)^2}, \frac{\|x\|^p}{(kl)^p}\Big\}\, Q(dx) \le \max\Big\{\frac{1}{k^2}, \frac{1}{k^p}\Big\}\int_H \max\Big\{\frac{\|x\|^2}{l^2}, \frac{\|x\|^p}{l^p}\Big\}\, Q(dx) = \frac{1}{k^2} = \frac12.$$
In either case, $c \le kl$, since c solves (1.22). Clearly, $l \le c$ since l solves (1.23). We have proven the following corollary to Theorem 1.2.1:

Corollary 1.2.6. Let $p \ge 1$. Assume that $X \in L_p$ is a mean zero infinitely divisible random vector without Gaussian component, taking values in the Hilbert space H, and that X has Lévy measure Q. Let l be the solution of
$$\xi(l) \stackrel{\mathrm{def}}{=} \int_H \Big(\frac{\|x\|^2}{l^2}\, 1_{\{\|x\|<l\}} + \frac{\|x\|^p}{l^p}\, 1_{\{\|x\|\ge l\}}\Big)\, Q(dx) = 1. \tag{1.24}$$
Then
$$0.125\, l \le \|X\|_p \le \max\{\sqrt2, 2^{1/p}\}\, K_p\, l, \tag{1.25}$$
where $K_p$ is given by (1.4). The last corollary to Theorem 1.2.1 that we present gives a quick estimation of the $L_p$-norm of X in terms of the functional ξ.

Corollary 1.2.7. Under the assumptions of Theorem 1.2.1, if ξ is given by (1.2), then
$$0.25\,\min\big\{\xi(1)^{1/2}, \xi(1)^{1/p}\big\} \le \|X\|_p \le K_p\,\max\big\{\xi(1)^{1/2}, \xi(1)^{1/p}\big\}.$$
Similarly, under the assumptions of Corollary 1.2.6,
$$0.125\,\min\big\{\xi(1)^{1/2}, \xi(1)^{1/p}\big\} \le \|X\|_p \le \max\{\sqrt2, 2^{1/p}\}\, K_p\,\max\big\{\xi(1)^{1/2}, \xi(1)^{1/p}\big\}.$$

Proof. First, suppose $1 \le p \le 2$. If $l < 1$, then
$$l^2 = \int_{\|x\|<l}\|x\|^2\, Q(dx) + l^{2-p}\int_{\|x\|\ge l}\|x\|^p\, Q(dx) \le \int_{\|x\|<1}\|x\|^2\, Q(dx) + \int_{\|x\|\ge 1}\|x\|^p\, Q(dx) = \xi(1)$$
and
$$l^p = l^{p-2}\int_{\|x\|<l}\|x\|^2\, Q(dx) + \int_{\|x\|\ge l}\|x\|^p\, Q(dx) \ge \int_{\|x\|<1}\|x\|^2\, Q(dx) + \int_{\|x\|\ge 1}\|x\|^p\, Q(dx) = \xi(1).$$
If $l \ge 1$, similar arguments give $l^2 \ge \xi(1)$ and $l^p \le \xi(1)$. In either case we have
$$\min\big\{\xi(1)^{1/2}, \xi(1)^{1/p}\big\} \le l \le \max\big\{\xi(1)^{1/2}, \xi(1)^{1/p}\big\}.$$
Similar arguments give the $p > 2$ case. $\square$
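For a concrete Lévy measure, l can also be found numerically: ξ is decreasing in l, so a simple bisection suffices. A minimal sketch, assuming a discrete measure Q given by atoms and masses (helper names are mine):

```python
import numpy as np

def xi(l, p, atoms, masses):
    """The functional xi(l) of (1.2) for Q = sum_j masses[j] * delta_{atoms[j]}."""
    r = np.linalg.norm(np.atleast_2d(atoms), axis=1) / l
    return float(np.sum(np.asarray(masses) * np.where(r < 1.0, r**2, r**p)))

def solve_l(p, atoms, masses, tol=1e-12):
    """Bisection for the unique l > 0 with xi(l) = 1 (xi is decreasing in l)."""
    lo, hi = 1e-8, 1.0
    while xi(hi, p, atoms, masses) > 1.0:   # grow until xi(hi) <= 1
        hi *= 2.0
    while xi(lo, p, atoms, masses) < 1.0:   # shrink until xi(lo) >= 1
        lo *= 0.5
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if xi(mid, p, atoms, masses) > 1.0 else (lo, mid)
    return 0.5 * (lo + hi)

# Example: symmetric Q on R with small jumps +/-0.5 and rare large jumps +/-3.
atoms = [[0.5], [-0.5], [3.0], [-3.0]]
masses = [2.0, 2.0, 0.1, 0.1]
p = 1.5
l = solve_l(p, atoms, masses)
print(0.25 * l, "<= ||X||_p <= K_p *", l)        # bracket from Theorem 1.2.1
x1 = xi(1.0, p, atoms, masses)                   # cruder bracket, Corollary 1.2.7
print(0.25 * min(x1**0.5, x1**(1/p)), max(x1**0.5, x1**(1/p)))
```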

Chapter 2
Kalman Filter

2.1 Kalman filter theory

In his landmark paper, Kalman (1960) considered the discrete time signal-observation model
$$x_k = F_k x_{k-1} + B_k u_k + w_k, \qquad y_k = H_k x_k + v_k,$$
where $x_k$ is the state of an evolving dynamical system at time k, $u_k$ is a deterministic control input to the system, and $y_k$ is a noisy linear observation of $x_k$. The noise terms $\{w_k\}$ and $\{v_k\}$ are assumed to be mean zero Gaussian random vectors with covariance matrices $W_k$ and $V_k$, respectively. In filter theory, the objective is to produce an efficient estimate $\hat{x}_k$ of the unobservable process $x_k$ using the observed values $y_1, y_2, \dots, y_k$, which are known at time k. An efficient estimate is one that minimizes some expected loss of the error $x_k - \hat{x}_k$. In his paper, Kalman (1960) showed that $\hat{x}_k \stackrel{\mathrm{def}}{=} E(x_k \mid y_1, y_2, \dots, y_k)$ minimizes the $L_2$-norm of the error and gave a recursive formulation for computing the estimate $\hat{x}_k$. Under the assumption of normally distributed noise terms, the orthogonal projection $\hat{x}_k$ is an affine transformation of the observations $y_1, y_2, \dots, y_k$. Let $\hat{x}_{k|k-1}$ be the predicted state of the system at time k, given that the observations $y_1, y_2, \dots, y_{k-1}$ are known at time $k-1$. Then, at time k, the observation $y_k$ becomes available and we may update our state estimate. Let $\hat{x}_{k|k}$ be the updated estimate of the system state at time k once the observation $y_k$ has become available.

We denote by $P_{k|k}$ the covariance matrix of the error $x_k - \hat{x}_{k|k}$ and by $P_{k|k-1}$ the covariance matrix of the error $x_k - \hat{x}_{k|k-1}$. The recursively formulated solution given by Kalman (1960) to compute $\hat{x}_k = \hat{x}_{k|k}$ is given in Algorithm 1. The filter $\hat{x}_{k|k}$ is a linear combination of the predicted state $\hat{x}_{k|k-1}$ and the observation $y_k$. The optimal Kalman gain $K_k$ in Algorithm 1 is chosen to minimize the $L_2$-norm of the error $x_k - \hat{x}_{k|k}$ and is given by
$$K_k = P_{k|k-1} H_k^T\big(H_k P_{k|k-1} H_k^T + V_k\big)^{-1}. \tag{2.1}$$
Over the years since this publication, some research has focused on replacing the noise terms by random vectors with heavy-tailed distributions. Gordon et al. (2003, Introduction) argued for the need of models allowing heavy-tailed error estimates, as outlying system state realizations and/or observation measurements have long been known to adversely affect the estimation procedure. In Gordon et al. (2003), the authors assume that the noise terms are power-law distributed and give the Kalman filter in terms of the tail covariance matrices of the noise terms. Stuck (1978) first addressed this model under the assumption that both $x_k$ and $y_k$ are $\mathbb{R}$-valued and each noise sequence $\{w_k\}$ and $\{v_k\}$ consists of α-stable random variables for fixed α. These examples fall under a more general framework in which the noise sequences are assumed to be symmetric infinitely divisible random vectors. In what follows, we establish a general framework to explore the Kalman filter under this assumption on the distributions of the noise sequences and demonstrate in two different examples that a solution can often be obtained or approximated. The first example assumes that each noise term has finite $L_2$-norm, but makes no other assumptions on the distributions. The second example considers the problem for α-stable distributed noise sequences, which was first addressed in dimension 1 by Stuck (1978) and then in Gordon et al. (2003). In each example, a tractable approximate solution is given. Each solution is exact in dimension 1, and the second agrees with the classic Kalman gain (2.1) when α = 2. Before we begin, we should point out that these solutions are only optimal in the linear sense. Kalman (1960) noted that, under the assumption that the noise terms are normally distributed, the orthogonal projection $E(x_k \mid y_1, y_2, \dots, y_k)$ is a linear function of the observations $y_1, y_2, \dots, y_k$. However, upon removing this assumption, this is no longer the case. In general, the $L_2$-orthogonal projection $E(x_k \mid y_1, y_2, \dots, y_k)$ is non-linear, and non-linear filtering theory may give better results. If we are seeking the

Algorithm 1 Kalman filter for Gaussian noise.
1: Initialize: $\hat{x}_0 \stackrel{\mathrm{def}}{=} Ex_0 = 0$; $P_0 = W_0$
2: Predict: $\hat{x}_{k|k-1} \stackrel{\mathrm{def}}{=} F_k\hat{x}_{k-1|k-1} + B_k u_k$ (unbiased estimate); $P_{k|k-1} = F_k P_{k-1|k-1} F_k^T + W_k$
3: Update: $K_k = P_{k|k-1}H_k^T\big(H_k P_{k|k-1}H_k^T + V_k\big)^{-1}$; $\hat{x}_{k|k} \stackrel{\mathrm{def}}{=} \hat{x}_{k|k-1} + K_k\big(y_k - H_k\hat{x}_{k|k-1}\big)$; $P_{k|k} = (I - K_kH_k)P_{k|k-1}$

optimal solution $\hat{x}_k$ minimizing, say, the $L_p$-norm of the error $x_k - \hat{x}_k$, the conventional conditional expected value is no longer even the optimal solution. Instead, it is the conditional $L_p$-expected value $E_p(x_k \mid y_1, y_2, \dots, y_k)$ that minimizes the $L_p$-norm of the error. However, the linear formulations have the desirable property of being easily implemented and are the only estimates we consider. To this end, consider the discrete time signal-observation model
$$x_k = F_k x_{k-1} + B_k u_k + w_k, \qquad y_k = H_k x_k + v_k, \tag{2.2}$$
where $x_k \in \mathbb{R}^d$, $F_k \in \mathbb{R}^{d\times d}$, $u_k \in \mathbb{R}^n$, $B_k \in \mathbb{R}^{d\times n}$, $y_k \in \mathbb{R}^m$, and $H_k \in \mathbb{R}^{m\times d}$. Assume that the system noise terms $\{w_k\}_{k\in\mathbb{N}}$ are independent symmetric $\mathbb{R}^d$-valued random vectors with Lévy-Khintchine triplets $(0, 0, Q_{w,k})$, k = 1, 2, ..., where, for each k, $Q_{w,k}$ is a symmetric Lévy measure on $\mathbb{R}^d$; that the observation noise terms $\{v_k\}_{k\in\mathbb{N}}$ are independent symmetric $\mathbb{R}^m$-valued random vectors with Lévy-Khintchine triplets $(0, 0, Q_{v,k})$, k = 1, 2, ..., where, for each k, $Q_{v,k}$ is a symmetric Lévy measure on $\mathbb{R}^m$; and that $x_0 \in \mathbb{R}^d$ is a symmetric infinitely divisible random vector with Lévy-Khintchine triplet $(0, 0, Q_{w,0})$,

where $Q_{w,0}$ is a symmetric Lévy measure on $\mathbb{R}^d$. Moreover, assume that the sequence of random vectors $\{x_0, w_1, v_1, w_2, v_2, \dots\}$ is mutually independent. Finally, assume that for some fixed $p \ge 1$ we have both
$$\int_{\mathbb{R}^d} \|x\|^p\, 1_{\{\|x\|\ge 1\}}\, Q_{w,k}(dx) < \infty$$
for each k = 0, 1, 2, ..., and
$$\int_{\mathbb{R}^m} \|x\|^p\, 1_{\{\|x\|\ge 1\}}\, Q_{v,k}(dx) < \infty$$
for each k = 1, 2, 3, .... Restricting ourselves to linear estimates, the Kalman filter algorithm is given by Algorithm 2. Let $e_{k|k}$ be the updated estimate error and $e_{k|k-1}$ the predicted estimate error, and observe that $e_0 = x_0$,
$$e_{k|k-1} \stackrel{\mathrm{def}}{=} x_k - \hat{x}_{k|k-1} = F_k x_{k-1} + B_k u_k + w_k - \big(F_k\hat{x}_{k-1|k-1} + B_k u_k\big) = F_k e_{k-1|k-1} + w_k,$$
and
$$\begin{aligned}
e_{k|k} &\stackrel{\mathrm{def}}{=} x_k - \hat{x}_{k|k} = x_k - \hat{x}_{k|k-1} - K_k\big(y_k - H_k\hat{x}_{k|k-1}\big) \\
&= x_k - \hat{x}_{k|k-1} - K_k\big(H_k x_k + v_k\big) + K_k H_k\hat{x}_{k|k-1} \\
&= e_{k|k-1} - K_k H_k\big(x_k - \hat{x}_{k|k-1}\big) - K_k v_k \\
&= \big(I_d - K_k H_k\big)e_{k|k-1} - K_k v_k \\
&= \big(I_d - K_k H_k\big)\big(F_k e_{k-1|k-1} + w_k\big) - K_k v_k.
\end{aligned}$$
First, we remark that $e_{k-1|k-1}$, $w_k$, and $v_k$ are independent. Second, $-K_k v_k$ is a symmetric random vector. These two facts, along with Corollary 1.1.2, imply that the updated error $e_{k|k}$ is an infinitely divisible random vector on $\mathbb{R}^d$ and, since each

Algorithm 2 Kalman filter.
1: Initialize: $\hat{x}_0 \stackrel{\mathrm{def}}{=} Ex_0 = 0$
2: Predict: $\hat{x}_{k|k-1} \stackrel{\mathrm{def}}{=} F_k\hat{x}_{k-1|k-1} + B_k u_k$ (unbiased estimate)
3: Update: $\hat{x}_{k|k} \stackrel{\mathrm{def}}{=} \hat{x}_{k|k-1} + K_k\big(y_k - H_k\hat{x}_{k|k-1}\big)$

Lévy measure $Q_{v,k}$ is symmetric, the Lévy-Khintchine triplets are given by
$$e_0 \sim \big(0, 0, Q_{w,0}\big) \stackrel{\mathrm{def}}{=} \big(0, 0, Q_0\big), \qquad e_{k|k} \sim \Big(0,\ 0,\ (Q_{k-1})_{(I_d - K_kH_k)F_k} + (Q_{w,k})_{I_d - K_kH_k} + (Q_{v,k})_{K_k}\Big) \stackrel{\mathrm{def}}{=} \big(0, 0, Q_k\big), \quad k = 1, 2, \dots. \tag{2.3}$$
We recall from Corollary 1.1.2 that the subscript notation $(Q_{v,k})_{K_k}$ represents a new Lévy measure on $\mathbb{R}^d$ given by
$$(Q_{v,k})_{K_k}(B) \stackrel{\mathrm{def}}{=} Q_{v,k}\big(\{x \in \mathbb{R}^m : K_k x \in B \setminus \{0\}\}\big)$$
for every $B \in \mathcal{B}(\mathbb{R}^d)$. In light of Section 1.2, for every k we may measure the magnitude of the error by $l_k$, where $l_k$ solves
$$\int_{\mathbb{R}^d}\Big(\frac{\|x\|^2}{l_k^2}\, 1_{\{\|x\|/l_k < 1\}} + \frac{\|x\|^p}{l_k^p}\, 1_{\{\|x\|/l_k \ge 1\}}\Big)\, Q_k(dx) = 1. \tag{2.4}$$
The optimal Kalman gain $K_k \in \mathbb{R}^{d\times m}$ is chosen to minimize $l_k$. While no closed form solution exists for such arbitrary Lévy measures, we demonstrate approximate solutions in the following two examples. The first will deal with the case that p = 2 and the Lévy measures are arbitrary. The second example will deal with the symmetric α-stable case. Often, we will need to compute $Q_k$ iteratively, as opposed to recursively as in (2.3). To do so, observe that if Q is a measure on $\mathbb{R}^n$, $G \in \mathbb{R}^{q\times n}$, and $H \in \mathbb{R}^{r\times q}$, then $(Q_G)_H$ is a measure on $\mathbb{R}^r$ and we have, for $B \in \mathcal{B}(\mathbb{R}^r)$,
$$\begin{aligned}
(Q_G)_H(B) &= Q_G\big(\{x \in \mathbb{R}^q : Hx \in B \setminus \{0\}\}\big) \\
&= Q\big(\{x \in \mathbb{R}^n : Gx \in \{x \in \mathbb{R}^q : Hx \in B \setminus \{0\}\} \setminus \{0\}\}\big) \\
&= Q\big(\{x \in \mathbb{R}^n : HGx \in B \setminus \{0\}\}\big) = Q_{HG}(B).
\end{aligned}$$
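Before specializing, it may help to see the classical baseline in code. The following is a minimal sketch of one predict/update cycle of Algorithm 1 with the optimal gain (2.1); variable names are mine, not from the text.

```python
import numpy as np

def kalman_step(x_hat, P, y, F, B, u, H, W, V):
    """One predict/update cycle of Algorithm 1 (Gaussian noise)."""
    # Predict
    x_pred = F @ x_hat + B @ u                    # \hat{x}_{k|k-1}
    P_pred = F @ P @ F.T + W                      # P_{k|k-1}
    # Update with the optimal gain (2.1)
    S = H @ P_pred @ H.T + V                      # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (y - H @ x_pred)         # \hat{x}_{k|k}
    P_new = (np.eye(len(x_hat)) - K @ H) @ P_pred
    return x_new, P_new
```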

Using this rule that $(Q_G)_H = Q_{HG}$, we may derive the following formulation of (2.3):

Theorem 2.1.1. The recursively defined Lévy measure $Q_k$ in (2.3) is
$$Q_k = (Q_{w,0})_{\prod_{i=0}^{k-1}(I_d - K_{k-i}H_{k-i})F_{k-i}} + \sum_{j=1}^{k}(Q_{w,j})_{\prod_{i=0}^{k-j-1}(I_d - K_{k-i}H_{k-i})F_{k-i}\,(I_d - K_jH_j)} + \sum_{j=1}^{k}(Q_{v,j})_{\prod_{i=0}^{k-j-1}(I_d - K_{k-i}H_{k-i})F_{k-i}\,K_j}, \tag{2.5}$$
where the product notation is understood as right multiplication and is equal to the identity matrix when the product is empty.

2.2 Finite $L_2$-norm noise environment

Suppose now that p = 2, so that each noise term $w_k$ and $v_k$ has finite $L_2$-norm. The integrand of (2.4) is no longer piecewise, simplifying computations. Since each $L_2$-norm is finite, the second moments of $w_k$ and $v_k$ are finite and given by
$$W_k \stackrel{\mathrm{def}}{=} \int_{\mathbb{R}^d}\|x\|^2\, Q_{w,k}(dx) \quad\text{and}\quad V_k \stackrel{\mathrm{def}}{=} \int_{\mathbb{R}^m}\|x\|^2\, Q_{v,k}(dx),$$
respectively. Then the initial and updated errors are given by
$$l_0^2 = \int_{\mathbb{R}^d}\|x\|^2\, Q_0(dx) = \int_{\mathbb{R}^d}\|x\|^2\, Q_{w,0}(dx) = W_0$$
and
$$l_k^2 = \int_{\mathbb{R}^d}\|x\|^2\, Q_k(dx) = \int_{\mathbb{R}^d}\|x\|^2\, \Big((Q_{k-1})_{(I_d - K_kH_k)F_k} + (Q_{w,k})_{I_d - K_kH_k} + (Q_{v,k})_{K_k}\Big)(dx)$$

$$= \int_{\mathbb{R}^d}\big\|(I_d - K_kH_k)F_k\, x\big\|^2\, Q_{k-1}(dx) + \int_{\mathbb{R}^d}\big\|(I_d - K_kH_k)x\big\|^2\, Q_{w,k}(dx) + \int_{\mathbb{R}^m}\big\|K_k x\big\|^2\, Q_{v,k}(dx).$$
Instead of minimizing $l_k$, we will minimize an upper bound on $l_k$. Using the subordinate matrix 2-norm induced by the Euclidean vector norm, we can bound the magnitude of the updated error by
$$\begin{aligned}
l_k^2 &\le \big\|(I_d - K_kH_k)F_k\big\|_2^2\int_{\mathbb{R}^d}\|x\|^2\, Q_{k-1}(dx) + \big\|I_d - K_kH_k\big\|_2^2\int_{\mathbb{R}^d}\|x\|^2\, Q_{w,k}(dx) + \big\|K_k\big\|_2^2\int_{\mathbb{R}^m}\|x\|^2\, Q_{v,k}(dx) \\
&= \big\|(I_d - K_kH_k)F_k\big\|_2^2\, l_{k-1}^2 + \big\|I_d - K_kH_k\big\|_2^2\, W_k + \big\|K_k\big\|_2^2\, V_k.
\end{aligned}$$
Let us define
$$\hat{l}_0^2 \stackrel{\mathrm{def}}{=} l_0^2 \quad\text{and}\quad \hat{l}_k^2 \stackrel{\mathrm{def}}{=} \big\|(I_d - K_kH_k)F_k\big\|_2^2\, \hat{l}_{k-1}^2 + \big\|I_d - K_kH_k\big\|_2^2\, W_k + \big\|K_k\big\|_2^2\, V_k. \tag{2.6}$$
The above definitions allow us to iteratively update our error estimates using only the previous error update. Now we must determine an approximating procedure that minimizes $\hat{l}_k$. While the subordinate matrix 2-norm has the desirable property that $\|I\|_2 = 1$, it presents a challenge in minimizing $\hat{l}_k$. For a matrix A, the Frobenius norm
$$\|A\|_F \stackrel{\mathrm{def}}{=} \sqrt{\operatorname{trace}(A^TA)}, \tag{2.7}$$
while larger than the subordinate matrix 2-norm $\|A\|_2$, is easier to compute. To this end, we may bound (2.6) by
$$\hat{l}_k^2 \le \big\|(I_d - K_kH_k)F_k\big\|_F^2\, \hat{l}_{k-1}^2 + \big\|I_d - K_kH_k\big\|_F^2\, W_k + \big\|K_k\big\|_F^2\, V_k. \tag{2.8}$$
The right hand side is now easy to minimize by recognizing it as a multivariate multiple regression minimizing the residual sum of squares of the model
$$\Big[\hat{l}_{k-1}F_k \;\;\; \sqrt{W_k}\, I_d \;\;\; 0_{d\times m}\Big] = K_k\Big[\hat{l}_{k-1}H_kF_k \;\;\; \sqrt{W_k}\, H_k \;\;\; \sqrt{V_k}\, I_m\Big].$$
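This regression can be solved directly. A sketch (my variable names), using the normal-equations solution $\hat{B} = YX^T(XX^T)^{-1}$ that the text recalls next, together with the error recursion (2.6):

```python
import numpy as np

def gain_finite_l2(l_prev, F, H, W, V):
    """Kalman gain for the finite L2-norm environment (Algorithm 3 update).

    Solves Y = K X in the Frobenius norm, with
    Y = [l_prev*F, sqrt(W)*I, 0] and X = [l_prev*H@F, sqrt(W)*H, sqrt(V)*I];
    W and V are the scalar second moments defined above.
    """
    d, m = F.shape[0], H.shape[0]
    Y = np.hstack([l_prev * F, np.sqrt(W) * np.eye(d), np.zeros((d, m))])
    X = np.hstack([l_prev * H @ F, np.sqrt(W) * H, np.sqrt(V) * np.eye(m)])
    return Y @ X.T @ np.linalg.inv(X @ X.T)   # K = Y X^T (X X^T)^{-1}

def lhat_update(l_prev, K, F, H, W, V):
    """Error bound recursion (2.6) with the spectral norm ||.||_2."""
    A = np.eye(F.shape[0]) - K @ H
    s = lambda M: np.linalg.norm(M, 2)        # subordinate matrix 2-norm
    return np.sqrt(s(A @ F)**2 * l_prev**2 + s(A)**2 * W + s(K)**2 * V)
```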

It is well known that for a multiple multivariate linear regression model Y = BX, the least squares estimate of the matrix B is $YX^T(XX^T)^{-1}$. Hence
$$K_k = \Big(\hat{l}^2_{k-1}F_kF_k^TH_k^T + W_kH_k^T\Big)\Big(\hat{l}^2_{k-1}H_kF_kF_k^TH_k^T + W_kH_kH_k^T + V_kI_m\Big)^{-1}.$$
The above solution is exact in dimension 1, since the matrix norms $\|\cdot\|_2$ and $\|\cdot\|_F$ are replaced by $|\cdot|$, and coincides with the classic Kalman filter. The algorithm is summarized in Algorithm 3.

2.3 α-stable noise environment

For the next example, fix $1 < \alpha < 2$ and assume that $x_0$ is known, so that $Q_{w,0} = \delta_0$. Assume that the signal noise sequence has the form $w_k = G\tilde{w}_k$, where $G \in \mathbb{R}^{d\times q}$ and the $\tilde{w}_k$ are $\mathbb{R}^q$-valued rotationally invariant α-stable random vectors with Lévy measures $\tilde{Q}_{w,k}(dx) \stackrel{\mathrm{def}}{=} c^w_k\, \|x\|^{-\alpha-q}\, dx$. By Corollary 1.1.2, the $w_k$ are infinitely divisible $\mathbb{R}^d$-valued random vectors with Lévy-Khintchine triplets $(0, 0, Q_{w,k}) \stackrel{\mathrm{def}}{=} (0, 0, (\tilde{Q}_{w,k})_G)$. Assume the $v_k$ are $\mathbb{R}^m$-valued rotationally invariant α-stable random vectors with Lévy measures $Q_{v,k}(dx) \stackrel{\mathrm{def}}{=} c^v_k\, \|x\|^{-\alpha-m}\, dx$. Before determining the Kalman gain, we will need the following computations in the analysis of this problem. Fix $1 \le p < \alpha$ and let $A \in \mathbb{R}^{d\times d}$. I denote by σ the uniform measure on the unit sphere. Then
$$\begin{aligned}
\int_{\mathbb{R}^d}\frac{\|x\|^2}{l^2}\, 1_{\{\|x\|<l\}}\, (Q_{w,k})_A(dx) &= \frac{1}{l^2}\int_{\mathbb{R}^q}\|AGx\|^2\, 1_{\{\|AGx\|<l\}}\, c^w_k\, \|x\|^{-\alpha-q}\, dx \\
&= \frac{c^w_k}{l^2}\int_0^\infty\int_{S^{q-1}}\|AGu\|^2 r^2\, 1_{\{r < l/\|AGu\|\}}\, r^{-\alpha-1}\, \sigma(du)\, dr \\
&= \frac{c^w_k}{l^\alpha(2-\alpha)}\int_{S^{q-1}}\|AGu\|^\alpha\, \sigma(du), \tag{2.9}
\end{aligned}$$

Algorithm 3 Kalman filter for finite $L_2$-norm noise.
1: Initialize: $\hat{x}_0 = Ex_0 = 0$; $\hat{l}_0^2 = W_0$
2: Predict: $\hat{x}_{k|k-1} = F_k\hat{x}_{k-1|k-1} + B_k u_k$
3: Update: $K_k = \big(\hat{l}^2_{k-1}F_kF_k^TH_k^T + W_kH_k^T\big)\big(\hat{l}^2_{k-1}H_kF_kF_k^TH_k^T + W_kH_kH_k^T + V_kI_m\big)^{-1}$; $\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k\big(y_k - H_k\hat{x}_{k|k-1}\big)$; $\hat{l}^2_k = \|(I_d - K_kH_k)F_k\|_2^2\, \hat{l}^2_{k-1} + \|I_d - K_kH_k\|_2^2\, W_k + \|K_k\|_2^2\, V_k$

and, similarly,
$$\int_{\mathbb{R}^d}\frac{\|x\|^p}{l^p}\, 1_{\{\|x\|\ge l\}}\, (Q_{w,k})_A(dx) = \frac{c^w_k}{l^\alpha(\alpha - p)}\int_{S^{q-1}}\|AGu\|^\alpha\, \sigma(du). \tag{2.10}$$
Also, if $A \in \mathbb{R}^{d\times m}$, then
$$\int_{\mathbb{R}^d}\frac{\|x\|^2}{l^2}\, 1_{\{\|x\|<l\}}\, (Q_{v,k})_A(dx) = \frac{c^v_k}{l^\alpha(2-\alpha)}\int_{S^{m-1}}\|Au\|^\alpha\, \sigma(du) \tag{2.11}$$
and, similarly,
$$\int_{\mathbb{R}^d}\frac{\|x\|^p}{l^p}\, 1_{\{\|x\|\ge l\}}\, (Q_{v,k})_A(dx) = \frac{c^v_k}{l^\alpha(\alpha - p)}\int_{S^{m-1}}\|Au\|^\alpha\, \sigma(du). \tag{2.12}$$
We are now ready to compute the estimated error $l_k$. To compute the first integral in the functional equation (2.4) for $l_k$, we use the iterative formulation and the integral

formulas (2.9) and (2.11) to get
$$\begin{aligned}
\int_{\mathbb{R}^d}\frac{\|x\|^2}{l_k^2}\, 1_{\{\|x\|<l_k\}}\, Q_k(dx) &= \frac{1}{l_k^\alpha(2-\alpha)}\sum_{j=1}^k c^w_j\int_{S^{q-1}}\Big\|\prod_{i=0}^{k-j-1}(I_d - K_{k-i}H_{k-i})F_{k-i}\, (I_d - K_jH_j)\, Gu\Big\|^\alpha\, \sigma(du) \\
&\quad + \frac{1}{l_k^\alpha(2-\alpha)}\sum_{j=1}^k c^v_j\int_{S^{m-1}}\Big\|\prod_{i=0}^{k-j-1}(I_d - K_{k-i}H_{k-i})F_{k-i}\, K_ju\Big\|^\alpha\, \sigma(du)
\end{aligned}$$
(the $Q_{w,0} = \delta_0$ term vanishes) and, similarly, using the integral formulas (2.10) and (2.12), the second integral in the functional equation (2.4) for $l_k$ is the same expression with the factor $1/(2-\alpha)$ replaced by $1/(\alpha - p)$. Since $l_k$ satisfies (2.4), the two computations above combine to give
$$l_k^\alpha = \Big(\frac{1}{2-\alpha} + \frac{1}{\alpha-p}\Big)\Big(\sum_{j=1}^k c^v_j\int_{S^{m-1}}\Big\|\prod_{i=0}^{k-j-1}(I_d - K_{k-i}H_{k-i})F_{k-i}\, K_ju\Big\|^\alpha\, \sigma(du) + \sum_{j=1}^k c^w_j\int_{S^{q-1}}\Big\|\prod_{i=0}^{k-j-1}(I_d - K_{k-i}H_{k-i})F_{k-i}\, (I_d - K_jH_j)Gu\Big\|^\alpha\, \sigma(du)\Big). \tag{2.13}$$
While no closed form solution exists for the $K_k$ minimizing $l_k$ except in the 1-dimensional case, we can get a tractable problem, as we did in the p = 2 example, by minimizing

an upper bound of $l_k$. Define
$$W_k \stackrel{\mathrm{def}}{=} c^w_k\Big(\frac{1}{2-\alpha} + \frac{1}{\alpha-p}\Big)\, \sigma(S^{q-1}) \quad\text{and}\quad V_k \stackrel{\mathrm{def}}{=} c^v_k\Big(\frac{1}{2-\alpha} + \frac{1}{\alpha-p}\Big)\, \sigma(S^{m-1}).$$
Observe that $l_0^\alpha = 0$ and that, splitting off the j = k terms in (2.13) and pulling $(I_d - K_kH_k)F_k$ out of the remaining products,
$$l_k^\alpha \le \big\|(I_d - K_kH_k)F_k\big\|_2^\alpha\, l_{k-1}^\alpha + \big\|(I_d - K_kH_k)G\big\|_2^\alpha\, W_k + \big\|K_k\big\|_2^\alpha\, V_k, \tag{2.14}$$
where, for a matrix A, $\|A\|_2 \stackrel{\mathrm{def}}{=} \max_{\|x\|=1}\|Ax\|$ is the subordinate matrix 2-norm induced by the Euclidean vector norm. As we did in the p = 2 case, we consider
$$\hat{l}_k^\alpha \stackrel{\mathrm{def}}{=} \big\|(I_d - K_kH_k)F_k\big\|_2^\alpha\, \hat{l}_{k-1}^\alpha + \big\|(I_d - K_kH_k)G\big\|_2^\alpha\, W_k + \big\|K_k\big\|_2^\alpha\, V_k \tag{2.15}$$
instead of $l_k$. The above iterative definition allows us to minimize the convenient upper bound $\hat{l}_k$ of $l_k$. As before, using these upper bounds, our error estimates may be updated using only the previous estimated error. Now we must determine an approximating procedure that minimizes $\hat{l}_k$. As we did in the p = 2 case, we will minimize the Frobenius norm $\|\cdot\|_F$ (see (2.7) for the definition)

instead of the subordinate matrix 2-norm $\|\cdot\|_2$. To this end, we may bound (2.15) by
$$\begin{aligned}
\hat{l}_k^\alpha &= \big\|(I_d - K_kH_k)F_k\big\|_2^\alpha\, \hat{l}^\alpha_{k-1} + \big\|(I_d - K_kH_k)G\big\|_2^\alpha\, W_k + \big\|K_k\big\|_2^\alpha\, V_k \\
&\le \big\|(I_d - K_kH_k)F_k\big\|_F^\alpha\, \hat{l}^\alpha_{k-1} + \big\|(I_d - K_kH_k)G\big\|_F^\alpha\, W_k + \big\|K_k\big\|_F^\alpha\, V_k \\
&= \hat{l}^\alpha_{k-1}\, \big\|(I_d - K_kH_k)F_k\big\|_F^{\alpha-2}\, \big\|(I_d - K_kH_k)F_k\big\|_F^2 + W_k\, \big\|(I_d - K_kH_k)G\big\|_F^{\alpha-2}\, \big\|(I_d - K_kH_k)G\big\|_F^2 + V_k\, \big\|K_k\big\|_F^{\alpha-2}\, \big\|K_k\big\|_F^2, \tag{2.16}
\end{aligned}$$
the right hand side now being easier to minimize as follows. Suppose that we have an estimate $K_k^{(t)}$ for $K_k$. Then we may iteratively improve our estimate of $K_k$ by finding the $K_k^{(t+1)}$ minimizing
$$w_1^{(t)}\, \big\|(I_d - K_k^{(t+1)}H_k)F_k\big\|_F^2 + w_2^{(t)}\, \big\|(I_d - K_k^{(t+1)}H_k)G\big\|_F^2 + w_3^{(t)}\, \big\|K_k^{(t+1)}\big\|_F^2, \tag{2.17}$$
where
$$w_1^{(t)} \stackrel{\mathrm{def}}{=} \hat{l}^\alpha_{k-1}\, \big\|(I_d - K_k^{(t)}H_k)F_k\big\|_F^{\alpha-2}, \qquad w_2^{(t)} \stackrel{\mathrm{def}}{=} W_k\, \big\|(I_d - K_k^{(t)}H_k)G\big\|_F^{\alpha-2}, \qquad w_3^{(t)} \stackrel{\mathrm{def}}{=} V_k\, \big\|K_k^{(t)}\big\|_F^{\alpha-2}.$$
We may recognize (2.17) as a multivariate multiple regression minimizing the residual sum of squares of the model
$$\Big[\sqrt{w_1^{(t)}}\, F_k \;\;\; \sqrt{w_2^{(t)}}\, G \;\;\; 0_{d\times m}\Big] = K_k^{(t+1)}\Big[\sqrt{w_1^{(t)}}\, H_kF_k \;\;\; \sqrt{w_2^{(t)}}\, H_kG \;\;\; \sqrt{w_3^{(t)}}\, I_m\Big].$$
It is well known that for a multiple multivariate linear regression model Y = BX, the least squares estimate of the matrix B is $YX^T(XX^T)^{-1}$. Hence
$$K_k^{(t+1)} = \Big(w_1^{(t)}F_kF_k^TH_k^T + w_2^{(t)}GG^TH_k^T\Big)\Big(w_1^{(t)}H_kF_kF_k^TH_k^T + w_2^{(t)}H_kGG^TH_k^T + w_3^{(t)}I_m\Big)^{-1}.$$
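A compact sketch of this iteration (Algorithm 4), with my own variable names and a small guard against vanishing Frobenius norms added for numerical safety:

```python
import numpy as np

def irls_gain(l_prev_alpha, F, G, H, W, V, alpha, tol=1e-10, max_iter=100):
    """Iteratively reweighted least squares for the gain minimizing (2.16).

    l_prev_alpha is \\hat{l}_{k-1}^alpha; W and V are the scalar noise
    functionals defined in the text.
    """
    d, m = F.shape[0], H.shape[0]
    Id, Im = np.eye(d), np.eye(m)
    fro = np.linalg.norm   # default matrix norm is Frobenius

    def solve(w1, w2, w3):
        num = w1 * F @ F.T @ H.T + w2 * G @ G.T @ H.T
        den = w1 * H @ F @ F.T @ H.T + w2 * H @ G @ G.T @ H.T + w3 * Im
        return num @ np.linalg.inv(den)

    K = solve(1.0, 1.0, 1.0)           # initialize with unit weights
    for _ in range(max_iter):
        A = Id - K @ H
        w1 = l_prev_alpha * max(fro(A @ F), 1e-12) ** (alpha - 2)
        w2 = W * max(fro(A @ G), 1e-12) ** (alpha - 2)
        w3 = V * max(fro(K), 1e-12) ** (alpha - 2)
        K_next = solve(w1, w2, w3)
        if fro(K_next - K) < tol:      # stopping rule from the text
            return K_next
        K = K_next
    return K
```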

This approximating technique is known as iteratively reweighted least squares; see, for example, Gentle (2007), "$L_p$ norms and Iteratively Reweighted Least Squares," p. 232, for an overview. Iteratively reweighted least squares approximates the $K_k$ minimizing (2.16) by $K_k = \lim_{t\to\infty} K_k^{(t)}$. The above procedure is easily implemented on a computer and allows us to approximate the optimal Kalman gain $K_k$. We may initialize the algorithm with the least squares solution, where $w_1^{(1)}$, $w_2^{(1)}$, and $w_3^{(1)}$ are taken to be 1, and compute the error as any matrix norm of the difference $K_k^{(t+1)} - K_k^{(t)}$. The iteratively reweighted least squares algorithm is implemented in Algorithm 4 and the Kalman filter is implemented in Algorithm 5.

Algorithm 4 Iteratively reweighted least squares.
1: Initialize $K_k^{(1)}$ to the least squares solution with weights of 1:
$K_k^{(1)} = \big(F_kF_k^TH_k^T + GG^TH_k^T\big)\big(H_kF_kF_k^TH_k^T + H_kGG^TH_k^T + I_m\big)^{-1}$
2: While error > ε and t ≤ maxiterations:
compute $w_1^{(t)} = \hat{l}^\alpha_{k-1}\|(I_d - K_k^{(t)}H_k)F_k\|_F^{\alpha-2}$, $w_2^{(t)} = W_k\|(I_d - K_k^{(t)}H_k)G\|_F^{\alpha-2}$, $w_3^{(t)} = V_k\|K_k^{(t)}\|_F^{\alpha-2}$;
compute $K_k^{(t+1)} = \big(w_1^{(t)}F_kF_k^TH_k^T + w_2^{(t)}GG^TH_k^T\big)\big(w_1^{(t)}H_kF_kF_k^TH_k^T + w_2^{(t)}H_kGG^TH_k^T + w_3^{(t)}I_m\big)^{-1}$;
compute error $= \|K_k^{(t+1)} - K_k^{(t)}\|_F$; increment t.
3: $K_k = K_k^{(t)}$.

Algorithm 5 can become unstable over time because we are not keeping track of the actual errors but rather an upper bound on them, obtained via the matrix norm inequality $\|AB\|_2 \le \|A\|_2\|B\|_2$. At each step we used this inequality, and hence our estimated error $\hat{l}_k$ tends to be much larger than the actual error $l_k$. If we are only tracking the target short term, Algorithm 5 works very well. For long term tracking, however, we may improve the estimation of $x_k$, at the expense of computational efficiency, by keeping track of more of the matrix multiplications in (2.13) instead of approximating the error by (2.14). If we are filtering off-line and computational speed is not a priority, we may use (2.13) for $l_k$ to improve performance. Alternatively, we may perform a statistical analysis to determine how large an overestimate (2.14) tends to be and adjust accordingly.

2.3.1 Exact 1-dimensional filtering

As mentioned above, we can get an exact closed form solution in dimension 1, and we demonstrate this here. If d = m = q = 1, then the inequality (2.14) is in fact an equality, since the matrix norms are replaced by $|\cdot|$, giving
$$l_k^\alpha = |1 - K_kH_k|^\alpha\, |F_k|^\alpha\, l^\alpha_{k-1} + |1 - K_kH_k|^\alpha\, W_k + |K_k|^\alpha\, V_k = |1 - K_kH_k|^\alpha\big(|F_k|^\alpha\, l^\alpha_{k-1} + W_k\big) + |K_k|^\alpha\, V_k,$$

Algorithm 4 Iteratively reweighted least squares. 1: Initialize K 1 k to the least squares solution with weights of 1: K 1 k = F k Fk T HT k + GGT Hk T Hk F k Fk T HT k + H kgg T Hk T + I 1 m 2: While error > ε and t maxiterations Comute w t 1 = ˆl k 1 α I d K t k H α 2 k F k F w t 2 = W k I d K t k H k G α 2 F w t K t 3 = V k k α 2 F Comute K t+1 k = w t 1 F k Fk T HT k + wt 2 GG T Hk T Comute error = Increment t. 3: K k = K t k. K t+1 k w t 1 1 H k F k Fk T HT k + wt 2 H k GG T Hk T + wt 3 I m K t k F where we have assumed without loss of generality that G 1 it may be absorbed into c w k in dimension 1. Here, W k and V k reduce to W k = 2c w k 1 2 α + 1 α and Let us define l α k k 1 V k = 2c v k 1 2 α + 1. α def = F k α l α k 1 + W k. One can show by arguments similar to those used to derive 2.13 that l k k 1 measures the magnitude of the redicted error e k k 1 just as l k measures the magnitude of the udated error e k k. We then have l α k = 1 K k H k α l α k k 1 + K k α V k and may minimize l k by standard calculus. The derivative of l α k is comuted as d l α k d K k = α 1 K k H k α 1 sign 1 K k H k H k l α k k 1 + α K k α 1 sign K k V k. Equating to and solving, we see that 34

Algorithm 5 Kalman filter for α-stable noise.
1: Initialize: $\hat{x}_0 = x_0$; $\hat{l}^\alpha_0 = 0$
2: Predict: $\hat{x}_{k|k-1} = F_k\hat{x}_{k-1|k-1} + B_k u_k$
3: Update: Approximate $K_k$ by iteratively reweighted least squares (Algorithm 4); $\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k\big(y_k - H_k\hat{x}_{k|k-1}\big)$; $\hat{l}^\alpha_k = \|(I_d - K_kH_k)F_k\|_2^\alpha\, \hat{l}^\alpha_{k-1} + \|(I_d - K_kH_k)G\|_2^\alpha\, W_k + \|K_k\|_2^\alpha\, V_k$

$$\operatorname{sign}(K_k) = \operatorname{sign}(1 - K_kH_k)\operatorname{sign}(H_k)$$
and
$$|K_k|\, V_k^{\frac{1}{\alpha-1}} = |1 - K_kH_k|\, |H_k|^{\frac{1}{\alpha-1}}\, l^{\frac{\alpha}{\alpha-1}}_{k|k-1}.$$
Hence,
$$K_k\, V_k^{\frac{1}{\alpha-1}} = (1 - K_kH_k)\operatorname{sign}(H_k)\, |H_k|^{\frac{1}{\alpha-1}}\, l^{\frac{\alpha}{\alpha-1}}_{k|k-1},$$
which is easily solved for $K_k$ to get the optimal Kalman gain
$$K_k = \operatorname{sign}(H_k)\, |H_k|^{\frac{1}{\alpha-1}}\, l^{\frac{\alpha}{\alpha-1}}_{k|k-1}\Big(|H_k|^{\frac{\alpha}{\alpha-1}}\, l^{\frac{\alpha}{\alpha-1}}_{k|k-1} + V_k^{\frac{1}{\alpha-1}}\Big)^{-1}. \tag{2.18}$$
If we take α = 2 in the above equation, we recover exactly the classic Kalman gain (2.1), ignoring the fact that the dispersion $V_k$, which plays a role similar to the variance in the normal distribution, is infinite. The Kalman filter algorithm is implemented in Algorithm 6. As opposed to the higher dimensional solutions of the Kalman filter for finite $L_2$-norm noise and α-stable noise given above, the Kalman gain (2.18) is exact in the sense that it minimizes the error $l_k$, not an upper bound on $l_k$. We next present simulations utilizing these results for the α-stable noise environment.

2.3.2 Vehicle tracking

Suppose we are tracking a vehicle moving in a straight line. The vehicle's position is measured every T seconds, at which time we can change the velocity $u = u_{k+1}$. Then

Algorithm 6 Kalman filter for 1-dimensional α-stable noise.
1: Initialize: $\hat{x}_0 = Ex_0 = 0$; $l^\alpha_0 = W_0$
2: Predict: $\hat{x}_{k|k-1} = F_k\hat{x}_{k-1|k-1} + B_k u_k$; $l^\alpha_{k|k-1} = |F_k|^\alpha\, l^\alpha_{k-1} + W_k$
3: Update: $K_k = \operatorname{sign}(H_k)|H_k|^{1/(\alpha-1)}\, l^{\alpha/(\alpha-1)}_{k|k-1}\big(|H_k|^{\alpha/(\alpha-1)}\, l^{\alpha/(\alpha-1)}_{k|k-1} + V_k^{1/(\alpha-1)}\big)^{-1}$; $\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k\big(y_k - H_k\hat{x}_{k|k-1}\big)$; $l^\alpha_k = |1 - K_kH_k|^\alpha\, l^\alpha_{k|k-1} + |K_k|^\alpha\, V_k$

the position of the vehicle is modeled by $x_k = x_{k-1} + Tu_k$. In actuality, the position of the vehicle at each time is perturbed by circumstances beyond our control (potholes, gusts of wind, etc.). A more realistic model is $x_k = x_{k-1} + Tu_k + w_k$, where $w_k$ is a random noise. At each time increment, we observe the position of the vehicle, which is also contaminated by a random noise. The observation $y_k$ is modeled by $y_k = x_k + v_k$, where $v_k$ is a random noise. Our objective is to efficiently estimate the position of the vehicle at time k. First, we could completely ignore our observation $y_k$ and predict the position of the vehicle to be $\hat{x}_k = \hat{x}_{k-1} + Tu_k$. Or, we could completely ignore the dynamics of the system and predict the position of the vehicle to be the observation $\hat{x}_k = y_k$. In actuality, we would like to use each piece of information: the dynamics of the system and the observation. If we restrict to linear estimates and assume that $\{w_k\}$ and $\{v_k\}$ are independent symmetric α-stable random variables, then we may apply the Kalman filter (Algorithm 6) to estimate the position $\hat{x}_{k|k}$ of the vehicle at time k. Figure 2.1 is a simulation with parameters p = 1, α = 1.4, T = 0.1, and constant velocity $u_k = u = 4$ throughout every time increment. The

dispersion parameter $c^w_k$ of $w_k$ is taken to be small ($c^w_k = 0.1$). This represents that the potholes, gusts of wind, etc. have minimal effect on the position of the vehicle. The dispersion parameter $c^v_k$ of $v_k$ is taken to be large in comparison to $c^w_k$ ($c^v_k = 5$). This parameter represents the known accuracy of the GPS technology. The classic Kalman filter (Algorithm 1) weights the observation too heavily in this case, as it does not expect the extreme tail events that occur under an α-stable distribution. We can see in Figure 2.1 the tail events that occur in the observation noise. Such tail events have essentially zero probability under the Gaussian distribution and are not expected by the classic Kalman filter.

2.3.3 Aircraft tracking

As a last example, we consider two models commonly employed in the tracking of an aircraft. Ignoring altitude, the system state being tracked is $x = (x_1, \dot{x}_1, x_2, \dot{x}_2)$. The system dynamics of a maneuvering aircraft are modeled by the constant velocity (CV) model and the coordinated turn (CT) model (see e.g. Bar-Shalom et al. (2001), Section 11.7 for an overview). The models are
$$x_k = F x_{k-1} + \begin{bmatrix} \frac{T^2}{2} & 0 \\ T & 0 \\ 0 & \frac{T^2}{2} \\ 0 & T \end{bmatrix} w_k,$$
where the system dynamics matrix for the CV model is
$$F \stackrel{\mathrm{def}}{=} \begin{bmatrix} 1 & T & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & T \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
and for the CT model is
$$F \stackrel{\mathrm{def}}{=} \begin{bmatrix} 1 & \frac{\sin\omega T}{\omega} & 0 & -\frac{1-\cos\omega T}{\omega} \\ 0 & \cos\omega T & 0 & -\sin\omega T \\ 0 & \frac{1-\cos\omega T}{\omega} & 1 & \frac{\sin\omega T}{\omega} \\ 0 & \sin\omega T & 0 & \cos\omega T \end{bmatrix}.$$
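These model matrices are simple to assemble in code; a sketch (names mine) building the CV and CT dynamics and the noise gain just displayed:

```python
import numpy as np

def cv_matrix(T):
    """Constant velocity (CV) dynamics for state (x1, x1dot, x2, x2dot)."""
    return np.array([[1, T, 0, 0],
                     [0, 1, 0, 0],
                     [0, 0, 1, T],
                     [0, 0, 0, 1]], dtype=float)

def ct_matrix(T, omega):
    """Coordinated turn (CT) dynamics with known turn rate omega."""
    s, c = np.sin(omega * T), np.cos(omega * T)
    return np.array([[1, s / omega,       0, -(1 - c) / omega],
                     [0, c,               0, -s],
                     [0, (1 - c) / omega, 1, s / omega],
                     [0, s,               0, c]])

def noise_gain(T):
    """Matrix G mapping the 2-D driving noise w_k into the state."""
    return np.array([[T**2 / 2, 0],
                     [T,        0],
                     [0, T**2 / 2],
                     [0,        T]])
```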

In practice, the turn rate ω is unknown. One would need to consider the augmented state vector $x = (x_1, \dot{x}_1, x_2, \dot{x}_2, \omega)$, for which the system model is now non-linear; standard practice is then to approximate by a first order expansion. We assume here, for simulation purposes, that the turn rate ω is constant and known. The signal noise $w_k$ is a 2-dimensional rotationally invariant α-stable random vector. At each time increment, we observe the position of the aircraft, which is also contaminated by a 2-dimensional rotationally invariant α-stable random noise. Then the observation $y_k$ is
$$y_k = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} x_k + v_k.$$
We apply Algorithm 5 to estimate the position of the aircraft by $\hat{x}_{k|k}$. Figure 2.2 and Figure 2.3 are simulations of the CV and CT models, respectively. The parameters were taken as p = 1, α = 1.4, T = 0.1, $c^w_k = 0.1$, and $c^v_k = 3$. As in the vehicle tracking example, the classic Kalman filter can perform poorly when tail events occur. If we mistakenly believe that the noise is normally distributed, then we do not anticipate the extreme tail events experienced in the noisy observation. Therefore, the classic Kalman filter is again weighting the observation too heavily and underperforms the α-stable Kalman filter (Algorithm 5).
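A self-contained simulation in the spirit of the vehicle tracking example can be sketched from Algorithm 6 and the gain (2.18). The α-stable sampler below uses the standard Chambers-Mallows-Stuck construction; its calibration to the Lévy-measure dispersion in the text is an assumption of this sketch, as are all variable names.

```python
import numpy as np

rng = np.random.default_rng(1)

def stable_noise(alpha, c, size):
    """Symmetric alpha-stable samples (Chambers-Mallows-Stuck method).

    The scale c**(1/alpha) is only a stand-in for the text's dispersion
    parameterization; exact calibration is assumed, not derived here.
    """
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)
    e = rng.exponential(1.0, size)
    x = (np.sin(alpha * u) / np.cos(u) ** (1 / alpha)
         * (np.cos((1 - alpha) * u) / e) ** ((1 - alpha) / alpha))
    return c ** (1 / alpha) * x

# 1-D constant-velocity tracking (Section 2.3.2) via Algorithm 6, p = 1.
alpha, p, T, u_vel = 1.4, 1.0, 0.1, 4.0
cw, cv = 0.1, 5.0
mix = 1 / (2 - alpha) + 1 / (alpha - p)
W, V = 2 * cw * mix, 2 * cv * mix

n = 1000
w = stable_noise(alpha, cw, n)
v = stable_noise(alpha, cv, n)
x = np.cumsum(T * u_vel + w)                   # true positions
y = x + v                                      # noisy observations

x_hat, l_a = 0.0, W                            # l_a carries l_k^alpha
e = alpha / (alpha - 1)
for k in range(n):
    x_pred = x_hat + T * u_vel                 # predict (F_k = H_k = 1)
    l_pred = (l_a + W) ** (1 / alpha)          # l_{k|k-1}
    K = l_pred ** e / (l_pred ** e + V ** (1 / (alpha - 1)))  # gain (2.18)
    x_hat = x_pred + K * (y[k] - x_pred)       # update
    l_a = abs(1 - K) ** alpha * (l_a + W) + K ** alpha * V
```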

Figure 2.1: α-stable Kalman filter for constant velocity 1-dimensional motion. (Top panel, "1-dimensional constant velocity motion in an α-stable noise environment": position versus time for the signal x, the α-Kalman estimate, and the Gaussian Kalman estimate. Bottom panel, "Filtering of observation noise": the α-Kalman estimate against the raw observations.)

Figure 2.2: 2-D constant velocity model (CV). (Top panel, "Constant velocity model of 2-dimensional motion in an α-stable noise environment": $x_1$ versus $x_2$ coordinates of the aircraft for the signal x, the α-Kalman estimate, and the Gaussian Kalman estimate. Lower panels, "Filtering of observation noise": the α-Kalman estimate against the observations in each coordinate over time.)

Figure 2.3: 2-D coordinated turn model (CT). (Top panel, "Coordinated turn model of 2-dimensional motion in an α-stable noise environment": $x_1$ versus $x_2$ coordinates of the aircraft for the signal x, the α-Kalman estimate, and the Gaussian Kalman estimate. Lower panels, "Filtering of observation noise": the α-Kalman estimate against the observations in each coordinate over time.)