{η : η = linear combination of 1, Z_1, ..., Z_n}


Branden Andrews
5 years ago

3. Orthogonal projection. Conditional expectation in the wide sense

Let $(X_n)_{n\ge 1}$ be a sequence of random variables with $EX_n^2 = \sigma_n^2$ and $EX_n \equiv 0$. If
$$EX_kX_j = \begin{cases} \sigma_k^2, & k = j,\\ 0, & \text{otherwise},\end{cases}$$
then $(X_n)_{n\ge 1}$ is a sequence with orthogonal elements. For $\sigma_k^2 \equiv 1$, this sequence forms white noise. If random variables with finite second moments are not orthogonal, they can be orthogonalized by a specific orthogonalization procedure. We start with an example.

Example. Let $Y$ and $Z$ be random variables with $EY = 0$, $EZ = 0$ and $EY^2 < \infty$, $EZ^2 < \infty$. We introduce a projection of $Y$ on the pair $(1, Z)$ by taking a linear combination of $1$ and $Z$: $\hat Y = c_1 + c_2 Z$. If the constants $c_1, c_2$ can be chosen such that $Y - \hat Y$ is orthogonal to both $1$ and $Z$, that is, $E(Y - \hat Y) = 0$ and $E(Y - \hat Y)Z = 0$, the projection $\hat Y$ is named the orthogonal projection. Under $EZ^2 > 0$, a direct verification shows that the choice $c_1 = 0$ and $c_2 = \frac{EYZ}{EZ^2}$ gives the orthogonal projection
$$\hat Y = \frac{EYZ}{EZ^2}\, Z. \tag{3.1}$$

Consider now the next setting. Let $Y, Z_1, \dots, Z_n$ be random variables with finite second moments, $EY^2 < \infty$ and $EZ_k^2 < \infty$, $k = 1, \dots, n$, and introduce the linear space generated by $1, Z_1, \dots, Z_n$:
$$M = \{\eta : \eta = \text{linear combination of } 1, Z_1, \dots, Z_n\};$$
if $1, Z_1, \dots, Z_n$ is replaced by the infinite family $1, Z_1, \dots, Z_n, \dots$, then $M$ contains not only linear combinations of $1, Z_1, \dots, Z_n, \dots$ but also their limits in $L^2$-norm. The orthogonal projection $\hat Y$ of $Y$ on $M$ is an element of $M$ such that $E\eta(Y - \hat Y) = 0$ for any $\eta \in M$; symbolically, $Y - \hat Y \perp M$.

3.1. Computation of the orthogonal projection. Set
- $m_Y = EY$;
- $m_{z_k} = EZ_k$, $k = 1, \dots, n$;
- $Z$: the (column) vector with entries $Z_1, \dots, Z_n$;
- $m_Z$: the (column) vector with entries $m_{z_1}, \dots, m_{z_n}$;
- $\mathrm{var}(Y) = E(Y - m_Y)^2$;
- $\mathrm{cov}(Z, Z) = E(Z - m_Z)(Z - m_Z)^T$ ($T$ is the transposition symbol);
- $\mathrm{cov}(Y, Z) = E(Y - m_Y)(Z - m_Z)^T$.
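To see this notation at work in the simplest case, take $n = 1$ and $m_Y = m_Z = 0$, which is the setting of the Example. The short derivation below is only a sketch of the "direct verification" mentioned there; it uses nothing beyond $EY = EZ = 0$, $EZ^2 > 0$ and orthogonality of $Y - \hat Y$ to $1$ and to $Z$.

$$
\begin{aligned}
\hat Y &= c_1 + c_2 Z,\\
E(Y - \hat Y)\cdot 1 &= EY - c_1 - c_2\,EZ = -c_1 = 0 \;\Longrightarrow\; c_1 = 0,\\
E(Y - \hat Y)Z &= EYZ - c_1\,EZ - c_2\,EZ^2 = EYZ - c_2\,EZ^2 = 0 \;\Longrightarrow\; c_2 = \frac{EYZ}{EZ^2} = \frac{\mathrm{cov}(Y, Z)}{\mathrm{cov}(Z, Z)}.
\end{aligned}
$$

So in the scalar, zero-mean case the projection coefficient is the ratio of $\mathrm{cov}(Y, Z)$ to $\mathrm{cov}(Z, Z)$, which is exactly formula (3.1).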

Theorem 3.1. If the matrix $\mathrm{cov}(Z, Z)$ is nonsingular, the orthogonal projection of $Y$ on $M$ exists and is given by
$$\hat Y = m_Y + \mathrm{cov}(Y, Z)\,\mathrm{cov}^{-1}(Z, Z)\,(Z - m_Z). \tag{3.2}$$
Moreover, the variance of the perpendicular $Y - \hat Y$ is
$$E(Y - \hat Y)^2 = \mathrm{var}(Y) - \mathrm{cov}(Y, Z)\,\mathrm{cov}^{-1}(Z, Z)\,\mathrm{cov}^T(Y, Z). \tag{3.3}$$

Proof. With an arbitrary (row) vector $C$, let us introduce the random variable
$$\xi = (Y - m_Y) + C(Z - m_Z). \tag{3.4}$$
We intend to choose $C$ such that $\xi$ is orthogonal to $M$. If this is possible, (3.4) provides the decomposition for $Y$ (here $M^\perp$ is the orthogonal complement to $M$):
$$Y = \underbrace{m_Y - C(Z - m_Z)}_{\in M} + \underbrace{\xi}_{\in M^\perp}. \tag{3.5}$$
Hence, if $C$ is found, we have $\hat Y = m_Y - C(Z - m_Z)$. Now, since every entry of $Z - m_Z$ belongs to $M$ and $\xi \perp M$, with $E(Z - m_Z) = 0$ and $E\xi = 0$, we find $E\xi(Z - m_Z)^T = 0$. The latter and (3.4) provide
$$\mathrm{cov}(Y, Z) + C\,\mathrm{cov}(Z, Z) = 0, \tag{3.6}$$
and, since $\mathrm{cov}(Z, Z)$ is assumed to be a nonsingular matrix, the vector $C = -\mathrm{cov}(Y, Z)\,\mathrm{cov}^{-1}(Z, Z)$ solves (3.6). Hence, the first statement holds true. Notice now that $\xi = Y - \hat Y$ and
$$
\begin{aligned}
E\xi^2 &= \mathrm{var}(Y) - 2\,\mathrm{cov}(Y, Z)\,\mathrm{cov}^{-1}(Z, Z)\,\mathrm{cov}^T(Y, Z) + \mathrm{cov}(Y, Z)\,\mathrm{cov}^{-1}(Z, Z)\,\mathrm{cov}^T(Y, Z)\\
&= \mathrm{var}(Y) - \mathrm{cov}(Y, Z)\,\mathrm{cov}^{-1}(Z, Z)\,\mathrm{cov}^T(Y, Z).
\end{aligned}
$$

Remark 3.2. If the matrix $\mathrm{cov}(Z, Z)$ is singular, the statement of the theorem remains valid with $\mathrm{cov}^{-1}(Z, Z)$ replaced by the Moore-Penrose pseudo-inverse matrix
$$\mathrm{cov}^{+}(Z, Z) = T^T (T T^T)^{-2}\, T,$$
where $T^T T = \mathrm{cov}(Z, Z)$ with $T$ a rectangular matrix of full rank, so that $T T^T$ is a square nonsingular matrix.

3.2. The conditional expectation in the wide sense. The orthogonal projection $\hat Y$ of $Y$ on $M$ is referred to as the conditional expectation in the wide sense given $M$, $\hat Y = \widehat{E}(Y \mid M)$, and plays an important role in linear estimation that is optimal in the mean square sense. We establish below the main properties of $\widehat{E}(Y \mid M)$.
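Before turning to the properties, formulas (3.2) and (3.3) can be checked numerically. The sketch below is only an illustration under stated assumptions: expectations are replaced by sample averages over `N` simulated observations, and the function name `wide_sense_projection`, the simulated data and the seed are all hypothetical choices, not part of the notes.

```python
import numpy as np

def wide_sense_projection(y, Z):
    """Sample analogue of (3.2) and (3.3).

    y : array of shape (N,), observations of Y
    Z : array of shape (N, n), observations of the vector Z (one row per sample)
    Returns the projection values y_hat and the residual variance from (3.3).
    """
    N = len(y)
    yc = y - y.mean()                 # Y - m_Y
    Zc = Z - Z.mean(axis=0)           # Z - m_Z
    cov_ZZ = Zc.T @ Zc / N            # cov(Z, Z), an n x n matrix
    cov_yZ = yc @ Zc / N              # cov(Y, Z), a vector of length n
    # cov(Z, Z) is symmetric, so solving the linear system gives
    # the coefficients cov(Y, Z) cov^{-1}(Z, Z) without forming the inverse.
    coef = np.linalg.solve(cov_ZZ, cov_yZ)
    y_hat = y.mean() + Zc @ coef                  # formula (3.2)
    resid_var = yc @ yc / N - cov_yZ @ coef       # formula (3.3)
    return y_hat, resid_var

# Hypothetical data: three correlated regressors and a noisy linear response.
rng = np.random.default_rng(0)
N = 10_000
Z = rng.normal(size=(N, 3))
Z[:, 2] += 0.5 * Z[:, 0]
y = 1.0 + 2.0 * Z[:, 0] - Z[:, 1] + rng.normal(size=N)

y_hat, resid_var = wide_sense_projection(y, Z)
resid = y - y_hat                     # the perpendicular Y - Y_hat

print("mean of residual:", resid.mean())
print("sample E[(Y - Y_hat) Z_k]:", Z.T @ resid / N)
print("sample E(Y - Y_hat)^2 vs (3.3):", np.mean(resid**2), resid_var)
```

With sample moments used consistently, the residual is orthogonal to `1` and to every column of `Z` up to floating-point error, and its mean square matches the right-hand side of (3.3). Using `np.linalg.solve` rather than an explicit inverse is a design choice for numerical stability; for a singular sample covariance, `np.linalg.pinv` would play the role of the pseudo-inverse in Remark 3.2.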

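1. $E\,\widehat{E}(Y \mid M) = EY$; $\widehat{E}(Y \mid M) = Y$ if $Y \in M$; $\widehat{E}(Y \mid M) = 0$ if $Y \perp M$.
Proof. The random variable $Y - \hat Y$ is orthogonal to $M$, in particular orthogonal to $1$, that is, $E(Y - \hat Y) = 0$. If $Y \perp M$, then by the definition $\widehat{E}(Y \mid M) = 0$.

2. If $c$ is a constant, then $\widehat{E}(cY \mid M) = c\,\widehat{E}(Y \mid M)$.
Proof. It holds true since the operator of orthogonal projection is linear.

3. If $c_1, c_2$ are constants and $Y_1, Y_2$ are random variables possessing second moments, then $\widehat{E}(c_1 Y_1 + c_2 Y_2 \mid M) = c_1\widehat{E}(Y_1 \mid M) + c_2\widehat{E}(Y_2 \mid M)$.
Proof. It holds true since the operator of orthogonal projection is linear.

4. If $M_1$ is a linear subspace of $M_2$, $M_1 \subseteq M_2$, then $\widehat{E}\big(\widehat{E}(Y \mid M_2) \mid M_1\big) = \widehat{E}(Y \mid M_1)$.
Proof. Since both $\widehat{E}\big(\widehat{E}(Y \mid M_2) \mid M_1\big)$ and $\widehat{E}(Y \mid M_1)$ are from $M_1$, it suffices to show that
$$\varphi = \widehat{E}\big(\widehat{E}(Y \mid M_2) \mid M_1\big) - \widehat{E}(Y \mid M_1)$$
is orthogonal to $M_1$. For this purpose, the decomposition $Y = \widehat{E}(Y \mid M_1) + \big(Y - \widehat{E}(Y \mid M_1)\big)$ is used. Then, since $\big(Y - \widehat{E}(Y \mid M_1)\big) \perp M_1$, for any $\psi \in M_1$ we have $E\psi Y = E\psi\widehat{E}(Y \mid M_1)$. On the other hand, since $M_1 \subseteq M_2$, similarly we get $E\psi\widehat{E}\big(\widehat{E}(Y \mid M_2) \mid M_1\big) = E\psi\widehat{E}(Y \mid M_2) = E\psi Y$. Thus, $E\psi\varphi = 0$.

5. Let $M_1$ and $M_2$ be orthogonal linear spaces: $M_1 \perp M_2$. Set $M = M_1 \oplus M_2$. Then for $Y$ with $EY = 0$, $\widehat{E}(Y \mid M) = \widehat{E}(Y \mid M_1) + \widehat{E}(Y \mid M_2)$.
Proof. Notice that $\varphi = Y - \widehat{E}(Y \mid M_1) - \widehat{E}(Y \mid M_2)$ is orthogonal to $M_1$ and to $M_2$, that is, orthogonal to $M_1 \oplus M_2 =: M$, and the result holds true.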

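6. For every $\eta \in M$, $E(Y - \eta)^2 \ge E(Y - \hat Y)^2 = EY^2 - E\hat Y^2$.
Proof. Write
$$
\begin{aligned}
E(Y - \eta)^2 &= E\big((Y - \hat Y) - (\eta - \hat Y)\big)^2\\
&= E(Y - \hat Y)^2 + E(\eta - \hat Y)^2 - 2E(Y - \hat Y)(\eta - \hat Y)\\
&= E(Y - \hat Y)^2 + E(\eta - \hat Y)^2 \ge E(Y - \hat Y)^2,
\end{aligned}
$$
where the cross term vanishes because $\eta - \hat Y \in M$.

7. The Cauchy-Schwarz inequality: $E\big(\widehat{E}(Y \mid M)\big)^2 \le EY^2$.
Proof. Write $0 \le E\big(Y - \widehat{E}(Y \mid M)\big)^2 = EY^2 - E\big(\widehat{E}(Y \mid M)\big)^2$.

3.3. Law of large numbers for a sequence of orthogonal random variables. Let $X_1, X_2, \dots$ be a sequence of orthogonal random variables with $EX_n \equiv 0$ and $EX_n^2 \equiv \sigma_n^2$. Denote
$$a_n = \frac{1}{n}\sum_{k=1}^{n} X_k.$$

Proposition 3.3. Under $\sum_{n=1}^{\infty} \frac{\sigma_n^2}{n^2} < \infty$, it holds that $\lim_{n\to\infty} Ea_n^2 = 0$.

Proof. The result is implied by $Ea_n^2 = \frac{1}{n^2}\sum_{k=1}^{n}\sigma_k^2$ and the Kronecker lemma, which, adapted to the case considered, states:
$$\sum_{n=1}^{\infty}\frac{\sigma_n^2}{n^2} < \infty \;\Longrightarrow\; \lim_{n\to\infty}\frac{1}{n^2}\sum_{k=1}^{n}\sigma_k^2 = 0. \tag{3.7}$$
We give below a sketch of the proof of (3.7). Set $V_0 = 0$, $V_n = \sum_{k=1}^{n}\frac{\sigma_k^2}{k^2}$ and notice that $\sum_{k=1}^{n}\sigma_k^2 = \sum_{k=1}^{n}(V_k - V_{k-1})k^2$. Now, summing by parts we find
$$\sum_{k=1}^{n}\sigma_k^2 = V_n n^2 - \sum_{k=1}^{n} V_{k-1}\big(k^2 - (k-1)^2\big),$$
so that
$$\frac{1}{n^2}\sum_{k=1}^{n}\sigma_k^2 = V_n - \frac{1}{n^2}\sum_{k=1}^{n} V_{k-1}\big(k^2 - (k-1)^2\big).$$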

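The use of the telescopic sum $n^2 = \sum_{k=1}^{n}\big(k^2 - (k-1)^2\big)$ allows us to transform the above equality into
$$\frac{1}{n^2}\sum_{k=1}^{n}\sigma_k^2 = \frac{1}{n^2}\sum_{k=1}^{n}(V_n - V_{k-1})\big(k^2 - (k-1)^2\big)$$
and to evaluate the right-hand side of that equality. For $n > N$, write
$$\frac{1}{n^2}\sum_{k=1}^{n}(V_n - V_{k-1})\big(k^2 - (k-1)^2\big) = \frac{1}{n^2}\sum_{k=1}^{N}(V_n - V_{k-1})\big(k^2 - (k-1)^2\big) + \frac{1}{n^2}\sum_{k=N+1}^{n}(V_n - V_{k-1})\big(k^2 - (k-1)^2\big)$$
and notice that the first summand is bounded above by
$$\frac{N^2}{n^2}V_\infty = \frac{N^2}{n^2}\sum_{k=1}^{\infty}\frac{\sigma_k^2}{k^2} \to 0, \quad n \to \infty,$$
while the second is bounded above by
$$V_\infty - V_N = \sum_{k=N+1}^{\infty}\frac{\sigma_k^2}{k^2} \to 0, \quad N \to \infty.$$

3.4. The martingale in the wide sense. Let $X_1, X_2, \dots$ be a sequence of random variables with $EX_k^2 < \infty$, $k \ge 1$, let $M$ be the linear space generated by $1, X_1, X_2, \dots$ and let $M_n$ be the linear space generated by $1, X_1, X_2, \dots, X_n$. For a random variable $\xi$ with $E\xi^2 < \infty$, set $Y_n = \widehat{E}(\xi \mid M_n)$, $n \ge 1$. For $n > m$, property 4 of the conditional expectation in the wide sense provides
$$\widehat{E}(Y_n \mid M_m) = Y_m.$$
A random sequence with this property, created by the real conditional expectation, is named a martingale. In our setting, the conditional expectation in the wide sense is used, and so the notion of a martingale in the wide sense is appropriate. The aim of this section is to show that $Y_n$, $n \ge 1$, converges in $L^2$-norm to the limit $\widehat{E}(\xi \mid M)$:
$$\mathop{\mathrm{l.i.m.}}_{n\to\infty}\widehat{E}(\xi \mid M_n) = \widehat{E}(\xi \mid M). \tag{3.8}$$

Proof of (3.8). Denote $y_1 = Y_1$, $y_n = Y_n - Y_{n-1}$, $n \ge 2$, and notice that for $n > m$ the random variable $y_n$ is orthogonal to $M_m$, since $Ey_n = 0$ and moreover
$$\widehat{E}(y_n \mid M_m) = \widehat{E}(Y_n - Y_{n-1} \mid M_m) = Y_m - Y_m = 0.$$
Hence, $Y_n = \sum_{k=1}^{n} y_k$ is a sum of orthogonal random variables $y_1, \dots, y_n$ (with zero mean for $k \ge 2$), and thanks to that
$$EY_n^2 = \sum_{k=1}^{n} Ey_k^2.$$
Then, obviously, $EY_n^2$ increases in $n$. On the other hand, since by property 7 of the conditional expectation in the wide sense $EY_n^2 \le E\xi^2$, we have for any $n$ that
$$\sum_{k=1}^{n} Ey_k^2 \le E\xi^2, \quad\text{that is,}\quad \sum_{k=1}^{\infty} Ey_k^2 \le E\xi^2.$$
Then the sequence $Y_n$, $n \ge 1$, converges in $L^2$-norm to $Y_\infty = \sum_{k=1}^{\infty} y_k$, since, by the Cauchy criterion for $L^2$-convergence (stated in an earlier lecture), for $n > m$
$$E(Y_n - Y_m)^2 = E\Big(\sum_{k=m+1}^{n} y_k\Big)^2 = \sum_{k=m+1}^{n} Ey_k^2 \to 0, \quad n, m \to \infty.$$

3.5. Markov process in the wide sense. Assume
$$\widehat{E}(X_n \mid M_{n-1}) = a_n X_{n-1}, \quad n \ge 2, \tag{3.9}$$
where $a_n$ is a sequence of numbers. Set $\varepsilon_n = X_n - a_n X_{n-1}$ and show that $(\varepsilon_n)_{n\ge 2}$ is a sequence of zero mean orthogonal random variables:
$$E\varepsilon_n \equiv 0, \qquad E\varepsilon_n\varepsilon_m = \begin{cases} 0, & n \ne m,\\ EX_n^2 - a_n^2 EX_{n-1}^2 =: b_n^2, & \text{otherwise}.\end{cases} \tag{3.10}$$
In fact, since by property 1 of the conditional expectation $EX_n = E\,\widehat{E}(X_n \mid M_{n-1})$, the first part of (3.10) is valid. By definition, $\varepsilon_n$ is orthogonal to $M_{n-1}$ and hence to every $M_m$ with $m < n$; since $\varepsilon_m \in M_m$, this gives $E\varepsilon_n\varepsilon_m = 0$ for $m < n$. The last property in (3.10) is provided by
$$E(X_n - a_n X_{n-1})^2 = EX_n^2 - a_n^2 EX_{n-1}^2.$$
Thus, the sequence $(X_n)_{n\ge 1}$ is defined by the recursion
$$X_n = a_n X_{n-1} + \varepsilon_n$$
with $X_1$ as the initial value, where $(\varepsilon_n)_{n\ge 2}$ is a sequence of zero mean orthogonal random variables with $E\varepsilon_n^2 = b_n^2$. Random sequences of this type are named Markov processes in the wide sense.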
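The recursion $X_n = a_n X_{n-1} + \varepsilon_n$ and the identity $b_n^2 = EX_n^2 - a_n^2 EX_{n-1}^2$ from (3.10) can be illustrated by simulation. The following is a minimal sketch under assumptions that are not in the notes: the coefficients `a`, the noise scales `b`, the number of paths `N` and the seed are hypothetical, the noise is taken independent and uniform (which is stronger than orthogonality), and expectations are approximated by averages over simulated paths.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000                            # number of simulated paths (assumed)
a = np.array([0.9, 0.5, -0.7, 0.3])    # hypothetical a_2, ..., a_5
b = np.array([1.0, 0.5, 2.0, 1.0])     # hypothetical b_2, ..., b_5

# Row i of X holds the samples of X_{i+1}; row 0 is the initial value X_1.
X = np.empty((len(a) + 1, N))
X[0] = rng.normal(size=N)
for i in range(len(a)):
    # Uniform noise on [-sqrt(3), sqrt(3)] has zero mean and unit variance,
    # so eps has zero mean and variance b[i]**2.
    eps = b[i] * rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=N)
    X[i + 1] = a[i] * X[i] + eps

second_moment = (X**2).mean(axis=1)                  # sample E X_n^2

# Coefficient of the projection of X_n on X_{n-1} alone (all means are zero here).
a_hat = (X[1:] * X[:-1]).mean(axis=1) / second_moment[:-1]

# Sample version of b_n^2 = E X_n^2 - a_n^2 E X_{n-1}^2 from (3.10).
b2_hat = second_moment[1:] - a**2 * second_moment[:-1]

print("a_n   :", a)
print("a_hat :", a_hat)
print("b_n^2 :", b**2)
print("b2_hat:", b2_hat)
```

Up to Monte Carlo error, `a_hat` recovers the coefficients $a_n$ and `b2_hat` recovers $b_n^2$. The check projects only on $X_{n-1}$ rather than on all of $M_{n-1}$; for this simulated model the two projections coincide because $\varepsilon_n$ is independent of the past.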

X n = c n + c n,k Y k, (4.2)

4. Linear filtering. Wiener filter

Assume $(X_n, Y_n)$ is a pair of random sequences in which the first component $X_n$ is referred to as a signal and the second as an observation. The paths of the signal are unobservable