Nonlinear function on function additive model with multiple predictor curves. Xin Qi and Ruiyan Luo. Georgia State University
|
|
- Patience Clark
- 5 years ago
- Views:
Transcription
1 Statistica Sinica: Supplement Nonlinear function on function additive model with multiple predictor curves Xin Qi and Ruiyan Luo Georgia State University Supplementary Material This supplementary material is organized as follows. Section S1 includes additional figures and tables in simulations and application. In Section S, we provides a set of identifibility conditions for the nonlinear function F (x, s, t) in the model (3.) in the main manuscript. Section S3 provides additional computational details and the choice of tuning parameters. Section S4 provides details of calculating the estimation error for F (x, s, t) in Simulation 1. The proofs of theorems and technical lemmas are provided in Sections S5 and S6, respectively. In this supplementary material, all the labels of equations, figures, tables, sections, and so on, are prefixed with S., such as equations: (S.1), (S.),, sections: S.1.1, and so on. The equation/section/ numbers without prefix S. are for those in the main manuscript.
2 Xin Qi and Ruiyan Luo S1 Additional figures and tables S.1.1 Figures 3 sample curves of X(s) s Figure S.1: Thirty sample curves of X(s) in Simulation 1.
3 S1. ADDITIONAL FIGURES AND TABLES 3 samples curves from Y(t) for F function 1 3 samples curves from Y(t) for F function t t 3 samples curves from Y(t) for F function 3 3 samples curves from Y(t) for F function t t Figure S.: Thirty sample curves of Y (t) for each of the four F (x, s, t) with ρ =.7 and σ = 1 in Simulation 1.
4 Xin Qi and Ruiyan Luo Frequencies of selected λ Frequencies of selected κ e 1 1e The setting : F function 1, σ =.1, ρ = Frequencies of selected λ e 1 1e The setting : F function 1, σ =.1, ρ = Frequencies of selected κ e 1 1e The setting : F function 1, σ =.1, ρ =.7 Frequencies of selected λ e 1 1e The setting : F function 1, σ =.1, ρ =.7 Frequencies of selected κ e 1 1e The setting : F function 1, σ = 1, ρ = Frequencies of selected λ e 1 1e The setting : F function 1, σ = 1, ρ = Frequencies of selected κ e 1 1e The setting : F function 1, σ = 1, ρ = e 1 1e The setting : F function 1, σ = 1, ρ =.7 Figure S.3: The frequencies of the selected tuning parameters for the first F (x, s, t) in Simulation 1.
5 S1. ADDITIONAL FIGURES AND TABLES Frequencies of selected λ Frequencies of selected κ 4 6 1e 1 1e The setting : F function, σ =.1, ρ = Frequencies of selected λ e 1 1e The setting : F function, σ =.1, ρ = Frequencies of selected κ e 1 1e The setting : F function, σ =.1, ρ =.7 Frequencies of selected λ e 1 1e The setting : F function, σ =.1, ρ =.7 Frequencies of selected κ e 1 1e The setting : F function, σ = 1, ρ = Frequencies of selected λ e 1 1e The setting : F function, σ = 1, ρ = Frequencies of selected κ e 1 1e The setting : F function, σ = 1, ρ = e 1 1e The setting : F function, σ = 1, ρ =.7 Figure S.4: The frequencies of the selected tuning parameters for the second F (x, s, t) in Simulation 1.
6 Xin Qi and Ruiyan Luo Frequencies of selected λ Frequencies of selected κ e 1 1e The setting : F function 3, σ =.1, ρ = Frequencies of selected λ e 1 1e The setting : F function 3, σ =.1, ρ = Frequencies of selected κ e 1 1e The setting : F function 3, σ =.1, ρ =.7 Frequencies of selected λ e 1 1e The setting : F function 3, σ =.1, ρ =.7 Frequencies of selected κ e 1 1e The setting : F function 3, σ = 1, ρ = Frequencies of selected λ e 1 1e The setting : F function 3, σ = 1, ρ = Frequencies of selected κ e 1 1e The setting : F function 3, σ = 1, ρ = e 1 1e The setting : F function 3, σ = 1, ρ =.7 Figure S.5: The frequencies of the selected tuning parameters for the third F (x, s, t) in Simulation 1.
7 S1. ADDITIONAL FIGURES AND TABLES Frequencies of selected λ Frequencies of selected κ e 1 1e The setting : F function 4, σ =.1, ρ = Frequencies of selected λ e 1 1e The setting : F function 4, σ =.1, ρ = Frequencies of selected κ e 1 1e The setting : F function 4, σ =.1, ρ =.7 Frequencies of selected λ e 1 1e The setting : F function 4, σ =.1, ρ =.7 Frequencies of selected κ e 1 1e The setting : F function 4, σ = 1, ρ = Frequencies of selected λ 1e 1 1e The setting : F function 4, σ = 1, ρ = e 1 1e The setting : F function 4, σ = 1, ρ = Frequencies of selected κ 1e 1 1e The setting : F function 4, σ = 1, ρ =.7 Figure S.6: The frequencies of the selected tuning parameters for the fourth F (x, s, t) in Simulation 1.
8 Xin Qi and Ruiyan Luo NO CO NMHC NOx t t t t C6H6 Temperature Relative Humidity t t t Figure S.7: The 355 sample curves for each of the seven variables in the air quality data.
9 The estimated intercept functions : µ^(t) S1. ADDITIONAL FIGURES AND TABLES t Figure S.8: Estimated intercept function µ(t) for the daily air quality data. The function F^1 (DISC) (x,s,1) The function F^ (DISC) (x,s,1) The function F^3 (DISC) (x,s,1) x s x s x s The function F^4 (DISC) (x,s,1) The function F^5 (DISC) (x,s,1) The function F^6 (DISC) (x,s,1) x s x s x s Figure S.9: Estimated functions (DISC) F j (x, s, t) given t = 1, 1 j 6, for the daily air quality data.
10 Xin Qi and Ruiyan Luo (DISC) F^3 (x,s,6) (DISC) F^3 (x,s,1) (DISC) F^3 (x,s,18) s= s=6 s=1 s=18 s= s= s=6 s=1 s=18 s= s= s=6 s=1 s=18 s= NOx NOx NOx Figure S.1: The curves daily air quality data. F (DISC) 3 (x, s, t) versus x (predictor NO x ) for different values of s and t in the
11 S1. ADDITIONAL FIGURES AND TABLES S.1. Tables Table S.1: The averages (and standard deviations) of the number K of the selected components and running time (in seconds) of 1 replicates for Simulation 1. The running time was obtained on a computer cluster with Linux system and Intel(R) Xeon(R) CPU GHz. Model σ ρ K time (.) (.986).7.15(.36) (.88).3(.17) 3.98(6.49).7.5(.) 3.48(6.366).(.) 1.13(.187).7 1.9(.7) 1.668(.13) 1.8(1.4) (6.895).7.7(1.18) (7.3).54(.5) (.671).7.75(.44) 1.98(.735).43(.6) 8.387(5.461) (1.1) 3.9(6.96).18(.39) 1.916(.537).7.89(.31) 1.981(.688) 4.7(.13) (6.64).7 3.3(1.39) 36.1(7.136)
12 Xin Qi and Ruiyan Luo Table S.: The averages (and standard deviations) of the number K of the selected components and running time (in seconds) of 1 replicates for Simulations and 3. Simulation Simulation 3 γ ρ curve σ ρ K time K time (.) 39.75(1.48) 3.(.) (13.785).7.(.) 4.173(1.4) 3.(.) 68.14(13.88).14(.67) (8.91).48(1.185) 5.554(46.39).7 4.9(1.31) (3.) 5.9(.357) 55.84(45.438).(.) 4.451(1.57) 3.(.) 64.99(1.47).7.1(.1) 41.71(1.851).74(.44) (1.848).(.) (8.976).39(1.54) (4.778).7 4.3(1.66) (9.49) 3.865(1.8) (39.48) (.) 39.8(1.67) 3.(.) 67.99(13.67).7.(.) 4.654(1.468) 3.(.) 69.3(14.186).3(.3) (9.441).4(1.39) 5.579(45.187) (1.31) (3.65) 5.394(.113) 53.69(45.759).(.) 4.176(1.564) 3.(.) 65.98(1.864).7.(.) (1.596).639(.483) 69.13(1.45).1(.1) 11.36(8.75).6(.676) 4.684(4.64).7 4.3(1.7) 116.4(9.6) 4.43(1.98) 4.397(35.73)
13 S. CONDITIONS FOR THE IDENTIFIABILITY OF THE FUNCTION F (X, S, T ) S Conditions for the identifiability of the function F (x, s, t) To ensure the identifiability of F (x, s, t) in model (3.), we impose a set of conditions on F (x, s, t) and the distribution of X(s) in the following Proposition. Proposition S.1. Suppose that the following conditions are satisfied, (1). X(s) is a Gaussian process. The covariance function Σ X (s, s ) of X(s) is continuous and all the eigenvalues of Σ X (s, s ) are positive. (). The true function F (x, s, t) satisfies the condition (3.3) in the manuscript, and the partial derivative x F (x, s, t) is a continuous function on (x, s, t) (, ) [, 1] [a, b]. If there is another function F (x, s, t) satisfying the conditions in (), and F (X(s), s, t)ds = F (X(s), s, t)ds for all t 1, then we have F (x, s, t) = F (x, s, t). Remark 1. Different conditions exist for the identifiability of F (x, s, t). But we note that any set of identifiability conditions cannot be verified in practice. Indeed, a necessary condition for identifiability is that all the eigenvalues of the covariance function of X(s) are positive. Indeed, if this condition is not satisfied, let β(s) be a nonzero eigenfunction corresponding to the zero eigenvalue. Then we have X(s)β(s)ds = and hence the function F (x, s, t) = F (x, s, t)+{x E[X(s)]}β(s) leads to the same model as the true model. But F (x, s, t) satisfies the condition (3.3) and is different from F (x, s, t). Therefore, F (x, s, t) is not identifiable. In practice, we only have a finite number of sample curves and there are
14 Xin Qi and Ruiyan Luo infinitely many eigenvalues for the covariance function of X(s), we cannot determine if all the eigenvalues are positive. Therefore, any identifiability conditions cannot be verified in practice. Proof of Proposition S.1. We prove this proposition by contradiction. Let (Ω, P ) denote the probability space. Suppose that there exists another function F (x, s, t) which is different from F (x, s, t), and satisfies the conditions in () in this proposition and F (X(ω, s), s, t)ds = F (X(ω, s), s, t)ds for all ω Ω and t 1. Then there exists t such that F (x, s, t ) F (x, s, t ). Define G(x, s) = F (x, s, t ) F (x, s, t ). Then G(x, s) and G(X(ω, s), s)ds = F (X(ω, s), s, t )ds F (X(ω, s), s, t )ds =, (S.1) for all ω. Then for any two ω and ω, we have = G(X(ω, s), s)ds G(X(ω, s), s)ds = X(ω,s) X(ω,s) x G(x, s)dxds. (S.) We show that x G(x, s) is a nonzero function. Otherwise, if x G =, then G(x, s) only depends on s, that is, we have F (x, s, t ) F (x, s, t ) = G(x, s) = h(s) for some function h(s). By the condition (3.3), E[ F (X(s), s, t )] = = E[F (X(s), s, t )] for all s 1, so we have = E[G(X(s), s)] = h(s) and hence G(x, s) = which contradicts to the fact G(x, s). Therefore, x G(x, s) is a nonzero function. Because x G(x, s) = x F (x, s, t ) x F (x, s, t ) is nonzero and continuous due to the continuity of both x F (x, s, t ) and x F (x, s, t ), we can find a rectangle region {(x, s) : x 1 x
15 S. CONDITIONS FOR THE IDENTIFIABILITY OF THE FUNCTION F (X, S, T ) x, s 1 s s }, such that x G(x, s) > δ for some positive constant δ. Without loss of generality, we assume that in this rectangle region, x G(x, s) > δ, (if x G(x, s) is negative, the proof is similar). We define two step functions: x 1 +x if < s < s 1 or s < s < 1 f 1 (s) =, x 1 if s 1 s s x 1 +x if < s < s 1 or s < s < 1 f (s) =, (S.3) x if s 1 s s and plot them in the following Figure S (s 1, x ) (s 1, x 1 ) f (s) f 1 (s) (s, x ) (s, x 1 ) s Figure S.11: The plots of f 1 (s) and f (s) defined in (S.3).
16 Xin Qi and Ruiyan Luo Because x G(x, s) > δ in the region {(x, s) : x 1 x x, s 1 s s }, we have = G(f (s), s)ds {x 1 x x,s 1 s s } G(f 1 (s), s)ds = f (s) x G(x, s)dxds f 1 (s) x G(x, s)(x, s)dxds {x 1 x x,s 1 s s } δdxds > η >, (S.4) where η is any positive constant satisfying η < δ(s s 1 )(x x 1 ) and we will choose a specific η in the following. Consider the Karhunen-Loève expansion: X(s) = ξ k φ k (s), (S.5) where {φ k (s), k 1} is the collection of all the eigenfunctions of Σ X (s, s ) and forms an orthonormal basis of the L [, 1] space. ξ k = X(s)φ k(s)ds, k 1, are uncorrelated random variables and V ar(ξ k ) = ν k, where ν 1 ν > are eigenvalues of Σ X (s, s ) and they are all positive by the condition (1) in this proposition. Because we have assumed that X(s) is a Gaussian process, {ξ k, k 1} are independent normal variables. Now let f 1 (s) = a (1) k φ k(s), f (s) = a () k φ k(s), be the expansions of f 1 (s) and f (s) using the orthonormal basis {φ k (s), k 1}. Define an event { A = ω : } { [X(s, ω) f 1 (s)] ds 1 ω : } [X(s, ω) f (s)] ds 1. (S.6) Let N 1 = sup{ X(s, ω) : s 1, ω A } and N = sup s 1 max{f 1 (s), f (s)} N = max{n 1, N }, C 1 = sup x G(x, s). (S.7) x N, s 1
17 S. CONDITIONS FOR THE IDENTIFIABILITY OF THE FUNCTION F (X, S, T ) We choose an integer M large enough such that k=m+1 E[ξ k] = k=m+1 ν k η 3C 1, k=m+1 {a (1) k } < η, 16C 1 k=m+1 {a () k } < η. 16C 1 (S.8) We define two events A 1 = {ω : [ξ k (ω) a (1) k ] < η/(8c 1 M), 1 k M, A = {ω : [ξ k (ω) a () k ] < η/(8c 1 M), 1 k M, k=m+1 k=m+1 ξ k (ω) η/(16c 1 )}, ξ k (ω) η/(16c 1 )}. (S.9) Because {ξ k, k 1} are independent normal variables, P (A 1 ) = where each P P M P ( ( ) [ξ k (ω) a (1) k ] < η/(8c 1 M) P k=m+1 ξ k (ω) η/(16c 1 ) ( ) [ξ k (ω) a (1) k ] < η/(8c 1 M) is positive, 1 k M, and ( k=m+1 ξ k (ω) η/(16c 1 ) 1 E ( ) k=m+1 ξ k η/(16c 1 ) 1 η/(3c 1) η/(16c 1 ) = 1, = 1 ) = 1 P ( k=m+1 E[ξ k ] η/(16c 1 ) k=m+1 ξ k > η/(16c 1 ) = 1 k=m+1 ν k η/(16c 1 ) ) ), where the first inequality in the second line follows from the Markov s inequality, and the first inequality in the last line follows from the first inequality in (S.8). Therefore, P (A 1 ) is positive and similarly, P (A ) is also positive. Pick up an ω A 1 and an ω A. By
18 Xin Qi and Ruiyan Luo (S.9) and (S.8), and noting that {φ k (s), k 1} is an orthonormal basis, we have [X(s, ω) f 1 (s)] ds = M [ξ k (ω) a (1) k ] + M [ξ k (ω) a (1) k ] + M η/(8c 1 M) + [ ξ k (ω)φ k (s) [ k=m+1 k=m+1 [ k=m+1 ξ k (ω) + η + η + η = 3η. 8C 1 16C 1 16C 1 8C 1 Similarly, a (1) k φ k(s)] ds ξ k (ω)φ k (s) a (1) k φ k(s)] ds ξ k (ω)φ k (s)] ds + [X(s, ω) f (s)] ds 3η 8C 1. [a (1) k ] k=m+1 [ k=m+1 a (1) k φ k(s)] ds (S.1) (S.11) Now we choose η small enough such that 3η/(8C 1 ) < 1. Then by the definition of the event A in (S.6), we have ω, ω A, which, together with (S.7), implies that for any (x, s) with s 1 and x between X(s, ω) and f 1 (s), or x between X(s, ω) and f (s), we have x G(x, s) C 1. (S.1) Now by (S.), X(s, ω) = G(X(s, ω), s)ds G(X(s, ω), s)ds = x G(x, s)dxds X(s,ω) f1 (s) f (s) X(s, ω) = x G(x, s)dxds + x G(x, s)dxds + x G(x, s)dxds X(s,ω) f 1 (s) f (s) f (s) X(s,ω) X(s, ω) x G(x, s)dxds x G(x, s)dxds x G(x, s)dxds f 1 (s) f 1 (s) f (s)
19 η η X(s,ω) f 1 (s) x G(x, s)dxds X(s, ω) f 1 (s) C 1 ds X(s, ω) f (s) x G(x, s)dxds X(s, ω) f (s) C 1 ds S3. COMPUTATIONAL ISSUE (by (S.4)) (by (S.1)) η C 1 X(s, ω) f 1 (s) ds C 1 X(s, ω) f (s) ds (by Cauchy-Schwarz ) η 3C 1η 8C 1 3C 1η 8C 1 = η 4, (S.13) where the last inequality follows from (S.1) and (S.11). Now (S.13) implies that η/4, but η is a positive number. So we got a contradiction. Hence, we must have F (x, s, t) = F (x, s, t). The proposition is proved. S3 Computational issue S.3.1 Solving the optimization problem (3.18) We first calculate the first term, the integrated sum of squared residuals, in the objective function of (3.18). 1 n Y(t) v (t)1 n R k v k (t) 1 n {1 n Y(t)}v (t)dt + l=1 1 n { R l R k }v k (t)v l (t)dt. dt = 1 n Y(t) dt + 1 n { R k Y(t)}v k (t)dt + v (t) dt 1 n {1 n R k }v k (t)v (t)dt (S3.14)
20 Xin Qi and Ruiyan Luo From the definition of Σ in (3.14), if j k, 1 n R j R k = 1 n n l=1 [ ] [ ] r l (Ĝj) r(ĝj) r l (Ĝk) r(ĝk) = Σ(Ĝj, Ĝk) = λ Ĝj, Ĝk H, where the last equality follows from the constraints in the optimization problem (3.15). Similarly, if j = k, 1 n R k R k = 1 n n l=1 [ ] [ ] r l (Ĝk) r(ĝk) r l (Ĝk) r(ĝk) = Σ(Ĝk, Ĝk) = 1 λ Ĝk, Ĝk H, Because in our asymptotic theory, the tuning parameter λ goes to zero as n, the terms λ Ĝj, Ĝk H are typically small. Hence, we have the approximation 1 R R n j k, for all k j, and 1 R R n k k 1. Moreover, for any 1 k K, R k 1 n = [ ] n l=1 r l (Ĝj) r(ĝj). Then by (S3.14), we have the following approximation, 1 n + Y(t) v (t)1 n ȳ(t)v (t)dt {v (t) ȳ(t)} dt R k v k (t) dt 1 n 1 n { R k Y(t)}v k (t)dt + ȳ(t) dt + Y(t) dt + v (t) dt v k (t) dt = 1 n {v k (t) ŵ () k (t)} dt (S3.15) Y(t) dt ŵ () k (t) dt, where ŵ () k (t) = R k Y(t)/n. Thus, the estimates µ(t) and φ k (t) of µ(t), φ k (t), 1 k K, can be obtained by solving the following problems separately: [ b b min v (t) Y (t) dt + κ v (t) a a [ b v k (t) 1 n R k Y(t) dt + κ min v k (t) a v ] (t) dt b a, (S3.16) ] v k(t) dt, 1 k K. (S3.17) =
21 S3. COMPUTATIONAL ISSUE We solve (S3.16) and (S3.17) using B-spline basis expansions as in Section 3. of Luo and Qi (17). S.3. Choice of the number of components and tuning parameters We choose the number of components and the two tuning parameters λ and κ simultaneously based on the following cross-validation procedure. Both λ and κ are chosen from the set {1 1, 1 8, 1 6, 1 4, 1, 1}. For the l-th value of λ, 1 l 6, we first determine a maximum number of components, K max,l. The optimal number of components will be chosen from the integers between 1 and K max,l. In Theorem 3, we choose the first few components with relatively large σ k. As σ k can be estimated by ( σ(l) k ) which is the maximum value of the optimization problem (3.15), we set { K max,l = min k > 1 : ( σ (l) ( σ (l) k ).1 1 ) + + ( σ (l) ) k }. (S3.18) Once we have determined all the K max,l, we use the cross-validation method to determine the tuning parameters (λ, κ) and the optimal number of components, K opt, simultaneously. We summarize the details of the procedure in the following algorithm. Algorithm For the l-th value of the tuning parameter λ, 1 l 6, we determine K max,l using the whole data set and (S3.18).. We randomly split the whole data set into five subsets. For each 1 v 5, we use the v-th subset as the v-th validation set and all other observations as the v-th training set. Then for the l-th value for λ and the v-th training set,
22 Xin Qi and Ruiyan Luo (a) we estimate Ĝ(v,l) k (x, s) for all 1 k K max,l. Then for the i-th value of κ, 1 i 6, we estimate µ (v,l,i) (t) and φ (v,l,i) k (t) for all 1 k K max,l. (b) For each K = 1,, K max,l, we use µ (v,l,i) (t), Ĝ (v,l) (x, s) and k (v,l,i) φ k (t) for 1 k K, to obtain the predicted response curves {ŷ (v,l,i,k) j (t), 1 j N v } for the v-th validation set using (3.19) and then calculate the corresponding validation error e v,l,i,k = N v Ny j=1 m=1 (ŷ(v,l,i,k) j (t m ) y (v) j (t m )) δ m /N v, where {y (v) (t), 1 j N v } are the collection of all the response curves in the v-th validation set. j After we repeat (a)-(b) for all 1 v 5 and 1 l 6, we calculate the average validation error, ē l,i,k = 5 v=1 e v,l,i,k/5. 3. Let ē l,i,k = min 1 l 6,1 i 6,1 K Kmax,l ē l,i,k. Then we choose the l -th value for λ, the i -th value for κ, and the optimal number of components is K opt = K. S4 Details of calculating the estimation error for F (x, s, t) in Simulation 1 In Simulation 1, the functions F (x, s, t) do not satisfy the condition (3.3) and may be unidentifiable. So we consider the following centered function F (x, s, t) = F (x, s, t) E[F (X(s), s, t],
23 S4. DETAILS OF CALCULATING THE ESTIMATION ERROR FOR F (X, S, T ) IN SIMULATION 1 and the models in Simulation 1 can be rewritten as Y (t) = µ (t) + F (X(s), s, t)ds + ε(t), where µ (t) = µ(t)+ E[F (X(s), s, t]ds. To calculate the expectation E[F (X(s), s, t], we note that X(s) is a Gaussian process, and for any s, the marginal distribution of X(s) is the standard normal distribution. So E[F (X(s), s, t] = e x / F (x, s, t)dx/ π which is calculated using a Riemann sum in practice. Based on Proposition S.1 in Section S of this supplementary material, we can show the centered functions F (x, s, t) is identifiable. Actually, F (x, s, t) satisfies the condition (3.3), and hence satisfies the condition () in Proposition S.1. It follows from the results in Section of the book Rasmussen and Williams (5) that all the eigenvalues of the covariance function Σ X (s, s ) = e {1 s s } are positive. Hence, the condition (1) in Proposition S.1 is also satisfied. It follows from Proposition S.1 that F (x, s, t) is identifiable. So we will calculate the estimation error for F (x, s, t). For any method considered in Simulation 1, let F (x, s, t) denote the estimate of F (x, s, t). Then we use F (x, s, t) = F (x, s, t) e x / F (x, s, t)dx/ π as the estimate of F (x, s, t). In practice, we have only a finite number of sample predictor curves. The range only covers a finite region of x. So we cannot obtain any information about F (x, s, t) when x is outside of this finite region, although the function F (x, s, t) is defined in the unbounded region (, ) [, 1] [, 1]. In the following Figure S.1, we plot 1 sample curves of X(s) in the training set of one repeat in Simulation 1. The plot
24 Xin Qi and Ruiyan Luo shows that most values of these sample curves fall between -.5 and.5. 1 sample curves of X(s) in the traing set x=.5 x= s Figure S.1: 1 sample curves of X(s) in the training set of one repeat in Simulation 1. Therefore, one can anticipate that the estimate of F (x, s, t) with x outside the interval [.5,.5] may be less accurate for any method considered in Simulation 1. Therefore, we only consider the estimation error of F (x, s, t) in the finite region [.5,.5] [, 1] [, 1]. Specifically, we calculate the following relative mean squared error (RelMSE) for F (x, s, t): RelMSE = [ ] F (x, s, t) F (x, s, t) dxdsdt F (x, s, t) dxdsdt
25 S4. DETAILS OF CALCULATING THE ESTIMATION ERROR FOR F (X, S, T ) IN SIMULATION 1 where the integrals are calculated using Riemann sum over a dense grid in the region [.5,.5] [, 1] [, 1]. We report the averages and standard deviations of the RelMSEs over 1 repeats in Table S.3 for all 16 settings in Simulation 1. Since the output of pffr.pc does not provide estimate of F (x, s, t), we do not include this method in Table S.3. Table S.3: The averages (standard deviations) of RelMSEs of 1 replicates for Simulation 1. Model σ ρ SigComp.nonlin pffr.nonlin SigComp.lin pffr e-4 ( ).6 ( 1e-4 ) e-4 ( e-4 ).11 ( e-4 ).7.11 ( e-4 ).13 ( 4e-4 ) 6e-4 ( e-4 ).778 (.46 ).47 (.11 ).6 ( 9e-4 ).46 (.18 ).51 ( 9e-4 ).7.8 (.83 ).1445 (.45 ).19 (.8 ).384 ( ) 6e-4 ( e-4 ). ( 7e-4 ).8878 (.913 ) ( ).7.6 ( 9e-4 ).35 (.1 ).9418 (.7316 ) ( ).77 (.1 ).59 (.17 ).894 (.383 ) ( ).7.84 (.15 ).14 (.364 ).938 (.4537 ) ( ) 9e-4 ( 1e-4 ).3 (.119 ).367 (.59 ) ( ).7.6 (.14 ).37 (.9 ).3944 (.519 ) ( ).15 (.38 ).8 (.91 ).3865 (.1979 ) ( ) (.71 ).71 (.79 ).3736 (.54 ) 3.77 ( ).15 ( 3e-4 ).374 (.143 ) 1.44 (.5115 ) ( )).7.41 ( 8e-4 ).43 (.193 ).9444 (.37 ) ( ).177 (.5 ).383 (.14 ) (.647 ) ( ) (.13 ).1617 (.54 ).9547 (.84 ) ( )
26 Xin Qi and Ruiyan Luo S5 Proofs of theorems S.5.1 Proof of Theorem 1 For 1 k K, let H k (x, s) be any function satisfying that H k(x(s), s)ds has a finite second moment and ϕ k (t) be any function in L [a, b]. Let Y new (t) = µ(t) + Y 1 (t) = µ(t) + Y (t) = µ(t) + F (X new (s), s, t)ds + ε new G k (X new (s), s)φ k (t)ds H k (X new (s), s)ϕ k (t)ds denote the response function for X new (s), the predicted response function based on the partial sum K G k(x, s)φ k (t), and the predicted response function based on K H k(x, s)ϕ k (t), respectively. The mean squared prediction error for K G k(x, s)φ k (t) is E [ Y new Y 1 ] L (S5.19) 1 =E µ + F (X new (s), s, )ds + ε new µ G k (X new (s), s)φ k ds L 1 =E F (X new (s), s, )ds G k (X new (s), s)φ k ds + E [ ] ε new L L (because X new (s) and ε new (t) are independent) =E F (X(s), s, )ds G k (X(s), s)φ k ds L + E [ ] ε L (S5.)
27 S5. PROOFS OF THEOREMS (because X new (s) and ε new (t) have the same distributions as X(t) and ε(t)) K =E S r k φ k + E [ ε ] L (by the definition of S(t) and (3.6)). L Similarly, the mean squared prediction error for the µ(t) + K H k(x, s)ϕ k (t) is E [ Y new Y ] K L = E S q k ϕ k + E [ ] ε L where q k = H k(x(s), s)ds. A nice property of the KL expansion is that K r kφ k (t) is the best K-dimensional random approximation of S(t) and has the smallest approximation [ S ] error among all K-dimensional random approximation of S(t). So E K r kφ k L [ S ] E K q kϕ k and hence E [ Y new Y 1 ] [ L E Ynew Y ] L. Because L H k (x, s) and ϕ k (t) are arbitrary, the partial sum K G k(x, s)φ k (t) has the smallest prediction error. Moreover, because {r 1, r, } are uncorrelated random variables with mean zero and variance one and {φ 1, φ, } are orthogonal functions with φ k L = σ k, we have E [ S ] [ r k φ k L = E k=k+1 which, together with (S5.19), leads to E [ Y new Y 1 L ] = r k φ k L ] = k=k+1 L k=k+1 σ k + E[ ε L ]. E [ r k φ k L ] = k=k+1 The inequalities in (3.8) in this theorem have been proved. Under Condition 1 in Section σ k,
28 Xin Qi and Ruiyan Luo 3.4 of asymptotic theory, a straightforward calculation leads to the inequalities in (3.9). S.5. Proof of Theorem We only prove the theorem for k = 1. The proof of the theorem for k > 1 is very similar to that for k = 1 and hence is omitted. Here is the outline of the proof for k = 1. We first show that for any G(x, s) satisfying the constraint Σ(G, G) = 1, we have Λ(G, G) σ 1. Then we prove that G 1 (x, s) satisfies Σ(G 1, G 1 ) = 1 and Λ(G 1, G 1 ) = σ 1, which implies that G 1 (x, s) is the solution to (3.11) and the maximum value of (3.11) is σ 1. Before providing the details of the proof, we recall some definitions and notations. Recall that for any K 1, K r(g k)φ k (t) = K r kφ k (t) is the best K-dimensional approximation to S(t), and S(t) = r(g k )φ k (t), where (S5.1) r(g 1 ), r(g ),, are uncorrelated random variables with mean and variance 1, (S5.) and φ k (t) = σ k φk, where φ 1, φ,, are orthonormal eigenfunctions of S(t). (S5.3) Recall the definitions of Λ(G, G) and Σ(G, G): Λ(G, G) = b a E [S(t)r(G)] dt, Σ(G, G) = E [ (r(g)) ]. (S5.4)
29 S5. PROOFS OF THEOREMS Now let G(x, s) be any function satisfying the following constraint in the optimization problem (3.11): 1 = Σ(G, G) = E [ (r(g)) ]. (S5.5) We have Λ(G, G) = b {E [S(t)r(G)]} dt = b a a { E [r(g k )r(g)] φ k (t)} dt (by (S5.1)) = {E [r(g k )r(g)]} φ k L (because φ 1(t),, φ K (t) are orthogonal) = {E [r(g k )r(g)]} σk (because φ k L = σ k by (S5.3)) {E [r(g k )r(g)]} σ1 σ1e [ r(g) ] (by (S5.)) =σ 1, (by (S5.5)). Therefore, the maximum value of the optimization problem (3.11) is less than or equal to σ 1. On the other hand, by (S5.), Σ(G 1, G 1 ) = E [ (r(g 1 )) ] = 1. Therefore, G 1 (x, s) satisfies the constraint in (3.11) for k = 1. Moreover, by (S5.4) and (S5.1), b { b Λ(G 1, G 1 ) = {E [S(t)r(G 1 )]} dt = E [r(g k )r(g 1 )] φ k (t)} dt = a a {E [r(g k )r(g 1 )]} φ k L (because φ 1(t),, φ K (t), are orthogonal to each other)
30 Xin Qi and Ruiyan Luo =σ 1E [ r(g 1 ) ] (because (S5.) and φ k L = σ k ) =σ 1, (by (S5.)). Therefore, G 1 (x, s) is the solution to (3.11) and the maximum value is σ 1. S.5.3 Proof of Theorem 3 For convenience, we first recall some notations in the main manuscript. We use L and H to denote L and Sobolev space in [, 1] [, 1], respectively, and both of them are Hilbert spaces. Given a function G(x, s) defined in [, 1] [, 1], let G L = G(x, s) dxds, G H = G L + xx G L + xs G L + ss G L, be the L norm and the Sobolev norm, respectively. For two functions G(x, s) and G(x, s) defined in [, 1] [, 1], the inner products in the two Hilbert spaces are denoted by G, G L = G(x, s) G(x, s)dxds, G, G H = { G(x, s) G(x, s) + xx G(x, s) xx G(x, s) + xs G(x, s) xs G(x, s) + ss G(x, s) ss G(x, s) } dxds, respectively. Step 1: Show that Σ(, ), Λ(, ), Σ(, ), and Λ(, ) are all bounded bilinear functions in H
31 S5. PROOFS OF THEOREMS For any G(x, s) H, by the definition of r(g) in (3.5), we have r(g) = = = = X(s) X(s) X(s) x x [ G(X(s), s)ds E G(X(s), s)ds] [ 1 x G(x, s)dxds + G(, s)ds E x G(x, s)dxds E x G(x, s) dxds + E [ [ X(s) X(s) xx G(y, s)dy + x G(, s) dxds xx G(y, s) dydxds + xx G(y, s) dydxds + xx G(y, s) dyds + xx G(y, s) dyds + K x x G(, s) ds X(s) x G(x, s)dxds + x G(x, s)dxds] x G(x, s) dxds [ ] = (K + ) { G(y, s) + xx G(y, s) }dy ds (K + ) {G(y, s) + xx G(y, s) }dyds ] x G(, s) dydxds x G(, s) dydxds [ ] { G(y, s) + xx G(y, s) }dy ds dyds G(, s)ds] x G(x, s) dxds (S5.6) (S5.7) (K + ) G H, (S5.8) where the inequality in (S5.6) follows from the interpolation inequality (Lemma 5.4 in Adams and Fournier (3)) and the inequality in (S5.7) follows from the Cauchy-
32 Xin Qi and Ruiyan Luo Schwarz inequality. By (S5.8) and the definition of Σ(, ) in (3.1), for any G(x, s), G(x, s) H, Σ(G, G) = E[r(G)r( G)] (K + ) G H G H. (S5.9) (S5.9) implies that Σ(, ) is a bounded bilinear function in H. By Theorem 1.8 in Rudin (1991), Σ(, ) defines a bounded linear operator which, for simplicity, is still denoted by Σ: for any G(x, s), G(x, s) H, Σ(G, G) = G, Σ G H. Similarly, we can show that Λ(, ), Σ(, ), and Λ(, ) are all bounded bilinear functions in H. They all define bounded operators in H (still denoted by Λ, Σ and Σ): Σ(G, G) = G, Σ G H, Λ(G, G) = G, Λ G H, Λ(G, G) = G, Λ G H. Then the optimization problem (3.11) can be expressed as max G, ΛG H, (S5.3) G H subject to G, ΣG H = 1 and G k, ΣG H =, 1 k k 1. The optimization problem (3.15) can be expressed as max G, ΛG H, G H subject to G, ΣG H + λ G H = 1 and Ĝk, ΣG H + λ Ĝk, G H =, for all 1 k k 1. This problem can be further expressed as max G, ΛG H, (S5.31) G H subject to G, ( Σ + λi)g H = 1 and Ĝk, ( Σ + λi)g H =,
33 S5. PROOFS OF THEOREMS where I is the identity operator in H. Step : Transform (S5.3) and (S5.31) to eigenvalue problems in H. Because Σ and ( Σ + λi) are all positive definite (that is, all their eigenvalues are nonnegative) and symmetric, their symmetric square root operators uniquely exist (Theorem 1.33 in Rudin (1991)). Let Σ 1/ and ( Σ + λi) 1/ denote the symmetric square root operator of Σ and Σ + λi, respectively. Since all the eigenvalues of Σ + λi are positive (and greater than λ), it is invertible and ( Σ + λi) 1/ is the symmetric square root operator of ( Σ + λi) 1. Lemma S.1. There exist two bounded operators, B and H, in H such that Λ = Σ B Σ + H, Λ = Σ BΣ. (S5.3) Let η k = Σ 1/ G k and η k = ( Σ + λi) 1/ Ĝ k. Then the optimization problem (S5.3) can be transformed to max η, Γη H, s.t. η η H H = 1, η k, η H =, for all 1 k k 1, (S5.33) where Γ = Σ 1/ BΣ 1/ and its solutions are η k s. The optimization problem (S5.31) can be transformed to max η, Γη H, s.t. η η H H = 1, η k, η H =, for all 1 k k 1, (S5.34) where Γ = ( Σ + λi) 1/ ( Σ B Σ + H)( Σ + λi) 1/ = ( Σ + λi) 1/ Σ B Σ( Σ + λi) 1/ + ( Σ +
34 Xin Qi and Ruiyan Luo λi) 1/ H( Σ + λi) 1/, and η k s are the solutions. Based on the two transformations, we provide the convergence rate in the next step. Step 3: Provide the convergence rate for 1 n n l=1 F (X l (s), s, )ds F (X l(s), s, )ds. L For simplicity, we assume that E[F (X(s), s, t)] = for all s, t. Similar to (3.7), the signal function can be written as S l (t) = Ŝ l (t) = = F (X l (s), s, t)ds = F (X l (s), s, t)ds = r l (G k )φ k (t), { Ĝ k (X l (s), s)ds } Ĝ k (s)ds φ k (t) (S5.35) {r l (Ĝk) r(ĝk)} φ k (t). (S5.36) By (S5.35) (S5.36), 1 n Ŝl S l L n = 1 n [r l (Ĝk) r(ĝk)] φ k r l (G k )φ k n l=1 l=1 L 3 n [r l (Ĝk) r(ĝk)] φ () k [r l (G k ) r(g k )]φ k n l=1 L + 3 n () [r n l (Ĝk) r(ĝk)][ φ k φ k ] + 3 n r(g n k )φ k l=1 L l=1 L, (S5.37) where () φ k (t) = R k Y(t)/n is the function in the problem (S3.17) to which φ k is the solution. By the definition (3.7), we have Y l (t) = µ(t) + F (X l(s), s, t)ds + ε l (t) =
35 S5. PROOFS OF THEOREMS µ(t)+ { G k(x l (s), s)φ k (t)} ds+ε l (t) = µ(t)+ r l(g k )φ k (t)+ε l (t). Therefore, φ () k (t) = R k Y(t)/n = = = = l=1 n l=1 [ ] r l (Ĝk) r(ĝk) Y l (t)/n n [ ] r l (Ĝk) r(ĝk) [Y l (t) Ȳ (t)]/n n l=1 j=1 1 [ ] r l (Ĝk) r(ĝk) [r l (G j ) r(g j )] φ j (t) + 1 n n Ĝk, ΣG j H φ j (t) + ( ΞĜk)(t), j=1 n l=1 (S5.38) [ ] r l (Ĝk) r(ĝk) [ε l (t) ε(t)] where Ξ is a operator from H to L [, 1] such that for any G J, ( ΞG)(t) = n l=1 [r l(g) r(g)] [ε l (t) ε(t)] /n. By (S5.38) and noting that φ k (t) s are orthogonal to each other and φ k = σ k, the first term on the right hand side of (S5.37) can be expressed as 3 n [r n l (Ĝk) r(ĝk)] φ () k [r l (G k ) r(g k )]φ k l=1 L 6 n [r l (Ĝk) r(ĝk)] Ĝk, n ΣG j H φ j [r l (G k ) r(g k )]φ k l=1 j=1 + 6 n [r l (Ĝk) r(ĝk)] ΞĜk n =6 j=1 l=1 where Ĝ(a) j σ j Ĝ(a) j L G j, Σ(Ĝ(a) j G j ) H + 6 k =1 Ĝk, ΣĜk H Ĝk, Ξ ΞĜ k H, (S5.39) = K Ĝk, ΣG j H Ĝ k and Ξ is the adjoint operator of Ξ. L Lemma S.. For any ɛ >, there exists an event Ω n,ɛ with probability greater than 1 ɛ,
36 Xin Qi and Ruiyan Luo such that in Ω n,ɛ, for any 1 j K, Ĝ(a) j and for any j > K, Ĝ(a) j G j, Σ(Ĝ(a) j G j ) H D 19 G j, Σ(Ĝ(a) j G j ) H 1 + D δ j + δ K ( K + ( K σ k ) σ k n 1/, ) n 1/, where δ k = min 1 j k (σ j σ j+1), and D 19 and D are two constants which do not depend on n. In the rest of the proof, we only consider the event Ω n,ɛ. By Condition 1, we have { } δ j = min 1 l j (σ l σl+1) C j (θ+1), σj Cj θ, σ j Cj θ, ( K ) D 1K (θ+1). (S5.4) and hence σ k Then by (S5.4) and Lemma S., 6 j=1 ( σ j Ĝ(a) j 6D 19 K j=1 G j, Σ(Ĝ(a) j G j ) H σ j [δ j + D 1 K (θ+1) ] + 6D D [(K θ+3 + K (θ+1) )n 1/ + K θ+1 ] j=k+1 σ j [δ K + D 1K (θ+1) ] ) n 1/ + 6 j=k+1 σ j D 3 [K (θ+1) n 1/ + K θ+1 ], (S5.41) where the inequality in the last line is because (θ + 1) > θ + 3 when θ > 1. By the definition (S6.61) of Ω n,ɛ, similar arguments as in the proof of Lemma S. lead to an
37 estimate of the second term on the right hand side of (S5.39): S5. PROOFS OF THEOREMS 6 Ĝk, ΣĜk H Ĝk, Ξ ΞĜ k H D 5 n 1/, k =1 Therefore, by (S5.39), (S5.41) and (S5.4), we have (S5.4) 3 n n [r l (Ĝk) r(ĝk)] φ () k l=1 [r l (G k ) r(g k )]φ k L D 3 [K (θ+1) n 1/ + K θ+1 ] + D 5 n 1/. (S5.43) By similar arguments, we can obtain the upper bound for the second and third terms on the right hand side of (S5.39), 3 n [r n l (Ĝk) r(ĝk)][ φ k l=1 3 n r(g k )φ k D 6 n 1, n l=1 L () φ k ] D 4 K θ+3 n 1/, L (S5.44) (S5.45) Combining (S5.43), (S5.44) and (S5.45), and noting that K = C K n 1/(3θ+) given in the theorem, we have 1 n n Ŝl S l L D 1n (θ 1)/(3θ+). l=1 which is the inequality (3.) in the theorem. (S5.46) Now we prove the inequality (3.1) in the theorem. Because Y new = µ(t) + Y pred = µ(t) + F (X new (s), s, t)ds + ε new (t), F (X new (s), s, t)ds (S5.47)
38 Xin Qi and Ruiyan Luo and ε new (t) is independent of X(s), Y(t) and X new (s), we have E [ Y pred Y new ] L X(s), Y(t) (S5.48) = µ(t) + F (X new (s), s, t)ds µ(t) F (X new (s), s, t)ds + E [ ] ε L. L The first inequality in (3.1) immediately follows from (S5.48). In order to show the first inequality in (3.1), we just need to show µ(t) + F (X new (s), s, t)ds µ(t) F (X new (s), s, t)ds D n (θ 1)/(3θ+), L which can be proved using similar arguments as in the proof of the inequality (S5.46). We skip the details. S6 Proofs of technical lemmas S.6.1 The proof of Lemma S.1 By the definition (3.7), we have Y l (t) = µ(t) + F (X l(s), s, t)ds + ε l (t) = µ(t) + { G k(x l (s), s)φ k (t)} ds + ε l (t) = µ(t) + r l(g k )φ k (t) + ε l (t), which, together
39 S6. PROOFS OF TECHNICAL LEMMAS with the definition (3.14) of Λ(G, G), leads to G, ΛG H = Λ(G, G) = 1 [ 1 n {r n l (G) r(g)} {Y l (t) Y (t)}] dt l=1 = 1 [ 1 ( n ) n {r n l (G) r(g)} {r l (G k ) r(g k )} φ k (t) + {r l (G) r(g)} {ε l (t) ε l (t)}] dt l=1 l=1 = 1 [ 1 n G, n ΣG n k H φ k (t) + {r l (G) r(g)} {ε l (t) ε l (t)}] dt = l=1 G, ΣG k H σ k + G, HG H = G, ( Σ B Σ + H)G H, where B and H are two bounded operators. Therefore, we have Λ = Σ B Σ+ H. Similarly, we can show that Λ = Σ BΣ. S.6. The proof of Lemma S. We use to denote the operator norm. We split the proof into several steps. Step 1: provide an upper bound for Γ Γ. Γ Γ (S6.49) = ( Σ + λi) 1/ Σ B Σ( Σ + λi) 1/ + ( Σ + λi) 1/ H( Σ + λi) 1/ Σ 1/ BΣ 1/ ( Σ + λi) 1/ Σ B Σ( Σ + λi) 1/ Σ 1/ BΣ 1/ + ( Σ + λi) 1/ H( Σ + λi) 1/ ( Σ + λi) 1/ Σ B ( Σ + λi) 1/ Σ Σ 1/ + ( Σ + λi) 1/ Σ Σ 1/ B Σ 1/ + ( Σ + λi) 1/ H( Σ + λi) 1/.
40 Xin Qi and Ruiyan Luo We first estimate ( Σ + λi) 1/ Σ Σ 1/ ( Σ + λi) 1/ Σ (Σ + λi) 1/ Σ + (Σ + λi) 1/ Σ Σ 1/ = f( Σ) f(σ) + g(σ), (S6.5) where f(z) = (z +λ) 1/ z and g(z) = f(z) z 1/. Because both f(z) and g(z) are analytic functions in the domain G = {z = x + iy : x > λ/} of the complex plane, it follows the theory in Section 1.6 of Rudin (1991) that f( Σ) f(σ) = f(z)(zi Σ) 1 dz C 1 +C +C 3 +C 4 f(z)(zi Σ) 1 dz, C 1 +C +C 3 +C 4 (S6.51) where C i, 1 i 4, are four segments in the domain G shown in Figure S.13.
41 S6. PROOFS OF TECHNICAL LEMMAS y λ + i C (M+1)+i C 1 C 3 x λ i C 4 (M+1) i Figure S.13: The contour for the integration in (S6.51) and M is the larger one of the maximum eigenvalues of Σ and Σ. Now we estimate f(z)(zi Σ) 1 dz f(z)(zi Σ) 1 dz = f(z) [(zi C 1 C 1 C Σ) ] 1 (zi Σ) 1 dz 1 f(z) (zi Σ) 1 (zi Σ) 1 dz. (S6.5) C 1 Define = Σ Σ. (S6.53) Then (zi Σ) 1 (zi Σ) 1 = (zi Σ) 1 ( Σ Σ)(zI Σ) 1 = (zi Σ) 1 (zi Σ) 1.
42 Xin Qi and Ruiyan Luo For any z C 1, (zi Σ) 1 (zi Σ) 1 (zi Σ) 1 (zi Σ) 1 z. where the last inequality follows from the fact that (zi Σ) 1 is less than or equal to the largest one of (z µ 1 ) 1, (z µ ) 1,, and (zi Σ) 1 is less than or equal to the largest one of (z µ 1 ) 1, (z µ ) 1,, (see inequality (3) in Section 1.4 in Rudin (1991)). µ k s and µ k s are the eigenvalues of Σ and Σ, and all of them are nonnegative. Therefore, for any z C 1, (z µ k ) 1 s and (z µ k ) 1 s are all less than or equal to z 1. Then by (S6.5), f(z)(zi Σ) 1 dz f(z)(zi Σ) 1 dz C 1 C 1 f(z) z dz D 3 λ 1/, C 1 where D 3 is a constant. By a similar argument, we can obtain f(z)(zi Σ) 1 dz f(z)(zi Σ) 1 dz D 4. C +C 3 +C 4 C +C 3 +C 4 Therefore, by (S6.51), we have ( Σ + λi) 1/ Σ (Σ + λi) 1/ Σ = f( Σ) f(σ) D 5 λ 1/, (S6.54) By the inequality (3) in Section 1.4 in Rudin (1991), we estimate the second term in (S6.5), (Σ + λi) 1/ Σ Σ 1/ = g(σ) sup g(µ k ) sup f(µ k ) µ 1/ k (S6.55) k 1 k 1 λ µ k = sup ( k 1 µk + λ + ) µ k µk + λ λ1/. (S6.5), (S6.54) and (S6.55) lead to ( Σ + λi) 1/ Σ Σ 1/ D 5 λ 1/ + λ 1/. (S6.56)
43 S6. PROOFS OF TECHNICAL LEMMAS By the same argument as in (S6.55), we have (Σ + λi) 1/ sup k 1 By the central limit theorem in Hilbert space, we have E [ ] = E [ Σ ] Σ With similar arguments as above and (S6.58), we have E 1 µk + λ λ 1/. D 1 n, E [ Ξ ] D n. [ ( Σ ] [ + λi) 1/ H( Σ + λi) 1/ D 6 λ 1/ n 1/ + λ 1 n 1], (S6.57) (S6.58) (S6.59) where D 1, D and D 6 are constants and Ξ is the operator defined in the last line of (S5.38). Now by (S6.49), (S6.56) (S6.59), E[ Γ Γ ] E[ ( Σ + λi) 1/ Σ B ( Σ + λi) 1/ Σ Σ 1/ ] + E[ ( Σ + λi) 1/ Σ Σ 1/ B Σ 1/ ] + E[ ( Σ + λi) 1/ H( Σ + λi) 1/ ] D 7 ( λ 1/ n 1/ + λ 1/ + λ 1/ n 1/ + λ 1 n 1) D 8 n 1/4. (S6.6) where we use the condition λ = C λ n 1/. For any ɛ >, define the event Ω n,ɛ = { Γ Γ D 8 ɛ 1 n 1/4 /3, D 1 ɛ 1 n 1/ /3, and Ξ D ɛ 1 n 1/ /3}. (S6.61) Then by (S6.58), (S6.6) and the Markov inequality, the probability of Ω n,ɛ is greater than 1 ɛ. In the rest of the proof of the lemma, we only consider the event Ω n,ɛ. Step 3: provide upper bounds for η k η k, k 1. We first note that the eigenvalues of Γ are the same as the eigenvalues σ 1 σ of the problem (S5.3) (which is equivalent to the problem (3.11) in the main
44 Xin Qi and Ruiyan Luo manuscript). As the inequalities in (5.) in Hall et al. (7), by the inequality (.1) in Bhatia et al. (1983) and (S6.6), we have sup δ k η k η k 8 Γ Γ D 9 n 1/4, k 1 (S6.6) where we recall that δ k = min 1 j k (σ j σ j+1), the last inequality follows from the definition (S6.61) of Ω n,ɛ and D 9 = 8D 8 ɛ 1 /3. Then it follows that sup k 1 δk 1 η k, η k = 1 sup δk η k η k 1 k 1 D 9n 1/ (S6.63) For any 1 k m, let P be the orthogonal projection operator onto the subspace spanned by { η k : k A}, where A is any subset of positive integers but does not include k. Then by the inequality (.1) in Bhatia et al. (1983), we have δ k Pη k Γ Γ D 9 n 1/4. (S6.64) Step 4: provide upper bounds for σ k G k H and σ k Ĝk H, k 1. Because for any k 1, we have G k (x, s) = F (x, s, t)φ k(t)dt/σ k, G k L = σ 4 k = σ k F L. G k (x, s) dxds = σ 4 k F (x, s, t) dtdxds [ F (x, s, t)φ k (t)dt] dxds φ k (t) dt = σ k F (x, s, t) dtdxds Similarly, we have xx G k L σ k xxf L, xs G k L σ k xsf L and ss G k L σ k ssf L. Hence, we have σ k G k H = σ k[ G k L + xxg k L + xsg k L + ssg k L ] D 1, (S6.65) for all k 1, where D 1 = F L + xx F L + xs F L + ss F L.
45 S6. PROOFS OF TECHNICAL LEMMAS To provide the upper bound for σ 1 Ĝ1, we consider the following inequality G 1, ΛG 1 H G 1, ( Σ + λi)g 1 H Ĝ1, ΛĜ1 H Ĝ1, ( Σ + λi)ĝ1 H = Ĝ1, ΛĜ1 H, (S6.66) where the inequality is because Ĝ1 is the solution to (S5.31) and the equality is because of Ĝ1, ( Σ + λi)ĝ1 = 1. By a tedious calculation, it follows from (S6.66) that in the event Ω n,ɛ, we have σ1 Ĝ1 D 11, where D 11 is a constant. For a general k, we can similarly obtain σ k Ĝk D 11, (S6.67) Step 5: prove the inequalities in the lemma. Because η k = Σ 1/ G k and η k = ( Σ + λi) 1/ Ĝ k, we have where q k = Σ 1/ Ĝ k = η k + q k, Σ1/ Gk = η k + q k, (S6.68) [ ( Σ + λi) 1/ Σ 1/] [ Ĝ k and q k = Σ1/ Σ 1/] G k. By similar arguments as in Step, we can show that ( Σ + λi) 1/ Σ 1/ D 1 n 1/4 and Σ 1/ Σ 1/ D 13 n 1/4. By (S6.65) and (S6.67), For any 1 j K, Ĝ(a) j G j, Σ(Ĝ(a) j q k D 14 σ k n 1/4, q k D 15 σ k n 1/4. (S6.69) G j ) = Σ 1/ Ĝ (a) j Ĝj, ΣG j Σ 1/ Ĝ j Σ 1/ G j + By (S6.6), (S6.63), (S6.69), we have Σ 1/ G j = k j, Ĝk, ΣG j Σ 1/ Ĝ k Σ 1/ G j Ĝk, ΣG j Σ 1/ Ĝ k. (S6.7) Ĝj, ΣG j Σ 1/ Ĝ j Σ 1/ G j = η j + q j, η j + q j ( η j + q j ) (η j + q j )
46 Xin Qi and Ruiyan Luo D 16 [δ j + σ 4 j ]n 1/ D 16 δ j n 1/ (S6.71) where the last inequality is because δ j = min 1 l j (σ l σ l+1 ) σ j. Similarly, k j, D 17 [ Ĝk, ΣG j Σ 1/ Ĝ k = k j, η k, η j η k + ( k j, η k + q k, η j + q j ( η k + q k ) ] ( K q k ) + q j D 18 δ j + σ k ) (S6.7) n 1/ where the last inequality follows from (S6.64), (S6.69) and the fact that K k j, η k, η j η k is the orthogonal projection of η j onto the space spanned by { η k : 1 k K, k j}. Combining (S6.7), (S6.71) and (S6.7) leads to Ĝ(a) j G j, Σ(Ĝ(a) j G j ) D 19 δ j + for any 1 l K. Similarly, for any l > K, we have Ĝ(a) j G j, Σ(Ĝ(a) j G j ) 1 + D δ K ( K + ( K σ k ) σ k n 1/, ) n 1/. Bibliography Adams, R. A. and J. J. Fournier (3). Sobolev spaces, Volume 14 of Pure and Applied Mathematics. Academic Press, Amsterdam. Bhatia, R., C. Davis, and A. McIntosh (1983). Perturbation of spectral subspaces and solution of linear operator equations. Linear Algebra and its Applications 5, Hall, P., J. L. Horowitz, et al. (7). Methodology and convergence rates for functional linear regression. The Annals of Statistics 35 (1), Luo, R. and X. Qi (17). Function-on-function linear regression by signal compression. Journal of the American Statistical Association 11 (518),
47 BIBLIOGRAPHY Rasmussen, C. E. and C. K. I. Williams (5). Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning series), Chapter 4. The MIT Pres. Rudin, W. (1991). Functional Analysis. International series in pure and applied mathematics. McGraw-Hill.
arxiv: v2 [math.pr] 27 Oct 2015
A brief note on the Karhunen-Loève expansion Alen Alexanderian arxiv:1509.07526v2 [math.pr] 27 Oct 2015 October 28, 2015 Abstract We provide a detailed derivation of the Karhunen Loève expansion of a stochastic
More informationSPECTRAL PROPERTIES OF THE LAPLACIAN ON BOUNDED DOMAINS
SPECTRAL PROPERTIES OF THE LAPLACIAN ON BOUNDED DOMAINS TSOGTGEREL GANTUMUR Abstract. After establishing discrete spectra for a large class of elliptic operators, we present some fundamental spectral properties
More informationFinite-dimensional spaces. C n is the space of n-tuples x = (x 1,..., x n ) of complex numbers. It is a Hilbert space with the inner product
Chapter 4 Hilbert Spaces 4.1 Inner Product Spaces Inner Product Space. A complex vector space E is called an inner product space (or a pre-hilbert space, or a unitary space) if there is a mapping (, )
More informationStatistical Convergence of Kernel CCA
Statistical Convergence of Kernel CCA Kenji Fukumizu Institute of Statistical Mathematics Tokyo 106-8569 Japan fukumizu@ism.ac.jp Francis R. Bach Centre de Morphologie Mathematique Ecole des Mines de Paris,
More informationGrouped Network Vector Autoregression
Statistica Sinica: Supplement Grouped Networ Vector Autoregression Xuening Zhu 1 and Rui Pan 2 1 Fudan University, 2 Central University of Finance and Economics Supplementary Material We present here the
More informationLecture Notes on PDEs
Lecture Notes on PDEs Alberto Bressan February 26, 2012 1 Elliptic equations Let IR n be a bounded open set Given measurable functions a ij, b i, c : IR, consider the linear, second order differential
More informationLearning Theory of Randomized Kaczmarz Algorithm
Journal of Machine Learning Research 16 015 3341-3365 Submitted 6/14; Revised 4/15; Published 1/15 Junhong Lin Ding-Xuan Zhou Department of Mathematics City University of Hong Kong 83 Tat Chee Avenue Kowloon,
More informationKernel Method: Data Analysis with Positive Definite Kernels
Kernel Method: Data Analysis with Positive Definite Kernels 2. Positive Definite Kernel and Reproducing Kernel Hilbert Space Kenji Fukumizu The Institute of Statistical Mathematics. Graduate University
More informationConvergence of Eigenspaces in Kernel Principal Component Analysis
Convergence of Eigenspaces in Kernel Principal Component Analysis Shixin Wang Advanced machine learning April 19, 2016 Shixin Wang Convergence of Eigenspaces April 19, 2016 1 / 18 Outline 1 Motivation
More informationBasic Operator Theory with KL Expansion
Basic Operator Theory with Karhunen Loéve Expansion Wafa Abu Zarqa, Maryam Abu Shamala, Samiha Al Aga, Khould Al Qitieti, Reem Al Rashedi Department of Mathematical Sciences College of Science United Arab
More informationElements of Positive Definite Kernel and Reproducing Kernel Hilbert Space
Elements of Positive Definite Kernel and Reproducing Kernel Hilbert Space Statistical Inference with Reproducing Kernel Hilbert Space Kenji Fukumizu Institute of Statistical Mathematics, ROIS Department
More information2. Dual space is essential for the concept of gradient which, in turn, leads to the variational analysis of Lagrange multipliers.
Chapter 3 Duality in Banach Space Modern optimization theory largely centers around the interplay of a normed vector space and its corresponding dual. The notion of duality is important for the following
More informationMATH 590: Meshfree Methods
MATH 590: Meshfree Methods Chapter 2 Part 3: Native Space for Positive Definite Kernels Greg Fasshauer Department of Applied Mathematics Illinois Institute of Technology Fall 2014 fasshauer@iit.edu MATH
More information1 Math 241A-B Homework Problem List for F2015 and W2016
1 Math 241A-B Homework Problem List for F2015 W2016 1.1 Homework 1. Due Wednesday, October 7, 2015 Notation 1.1 Let U be any set, g be a positive function on U, Y be a normed space. For any f : U Y let
More informationMath 61CM - Solutions to homework 6
Math 61CM - Solutions to homework 6 Cédric De Groote November 5 th, 2018 Problem 1: (i) Give an example of a metric space X such that not all Cauchy sequences in X are convergent. (ii) Let X be a metric
More informationLinear Algebra Massoud Malek
CSUEB Linear Algebra Massoud Malek Inner Product and Normed Space In all that follows, the n n identity matrix is denoted by I n, the n n zero matrix by Z n, and the zero vector by θ n An inner product
More informationSPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS
SPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS G. RAMESH Contents Introduction 1 1. Bounded Operators 1 1.3. Examples 3 2. Compact Operators 5 2.1. Properties 6 3. The Spectral Theorem 9 3.3. Self-adjoint
More information08a. Operators on Hilbert spaces. 1. Boundedness, continuity, operator norms
(February 24, 2017) 08a. Operators on Hilbert spaces Paul Garrett garrett@math.umn.edu http://www.math.umn.edu/ garrett/ [This document is http://www.math.umn.edu/ garrett/m/real/notes 2016-17/08a-ops
More informationFUNCTIONAL DATA ANALYSIS. Contribution to the. International Handbook (Encyclopedia) of Statistical Sciences. July 28, Hans-Georg Müller 1
FUNCTIONAL DATA ANALYSIS Contribution to the International Handbook (Encyclopedia) of Statistical Sciences July 28, 2009 Hans-Georg Müller 1 Department of Statistics University of California, Davis One
More informationHilbert Spaces. Hilbert space is a vector space with some extra structure. We start with formal (axiomatic) definition of a vector space.
Hilbert Spaces Hilbert space is a vector space with some extra structure. We start with formal (axiomatic) definition of a vector space. Vector Space. Vector space, ν, over the field of complex numbers,
More informationAnalysis Comprehensive Exam Questions Fall 2008
Analysis Comprehensive xam Questions Fall 28. (a) Let R be measurable with finite Lebesgue measure. Suppose that {f n } n N is a bounded sequence in L 2 () and there exists a function f such that f n (x)
More informationITO OLOGY. Thomas Wieting Reed College, Random Processes 2 The Ito Integral 3 Ito Processes 4 Stocastic Differential Equations
ITO OLOGY Thomas Wieting Reed College, 2000 1 Random Processes 2 The Ito Integral 3 Ito Processes 4 Stocastic Differential Equations 1 Random Processes 1 Let (, F,P) be a probability space. By definition,
More information(z 0 ) = lim. = lim. = f. Similarly along a vertical line, we fix x = x 0 and vary y. Setting z = x 0 + iy, we get. = lim. = i f
. Holomorphic Harmonic Functions Basic notation. Considering C as R, with coordinates x y, z = x + iy denotes the stard complex coordinate, in the usual way. Definition.1. Let f : U C be a complex valued
More informationA Note on Hilbertian Elliptically Contoured Distributions
A Note on Hilbertian Elliptically Contoured Distributions Yehua Li Department of Statistics, University of Georgia, Athens, GA 30602, USA Abstract. In this paper, we discuss elliptically contoured distribution
More informationTools from Lebesgue integration
Tools from Lebesgue integration E.P. van den Ban Fall 2005 Introduction In these notes we describe some of the basic tools from the theory of Lebesgue integration. Definitions and results will be given
More informationANALYSIS QUALIFYING EXAM FALL 2017: SOLUTIONS. 1 cos(nx) lim. n 2 x 2. g n (x) = 1 cos(nx) n 2 x 2. x 2.
ANALYSIS QUALIFYING EXAM FALL 27: SOLUTIONS Problem. Determine, with justification, the it cos(nx) n 2 x 2 dx. Solution. For an integer n >, define g n : (, ) R by Also define g : (, ) R by g(x) = g n
More informationADJOINTS, ABSOLUTE VALUES AND POLAR DECOMPOSITIONS
J. OPERATOR THEORY 44(2000), 243 254 c Copyright by Theta, 2000 ADJOINTS, ABSOLUTE VALUES AND POLAR DECOMPOSITIONS DOUGLAS BRIDGES, FRED RICHMAN and PETER SCHUSTER Communicated by William B. Arveson Abstract.
More informationLINEAR ALGEBRA BOOT CAMP WEEK 4: THE SPECTRAL THEOREM
LINEAR ALGEBRA BOOT CAMP WEEK 4: THE SPECTRAL THEOREM Unless otherwise stated, all vector spaces in this worksheet are finite dimensional and the scalar field F is R or C. Definition 1. A linear operator
More information1 The Observability Canonical Form
NONLINEAR OBSERVERS AND SEPARATION PRINCIPLE 1 The Observability Canonical Form In this Chapter we discuss the design of observers for nonlinear systems modelled by equations of the form ẋ = f(x, u) (1)
More informationFunctional Analysis HW #5
Functional Analysis HW #5 Sangchul Lee October 29, 2015 Contents 1 Solutions........................................ 1 1 Solutions Exercise 3.4. Show that C([0, 1]) is not a Hilbert space, that is, there
More informationAnalysis Preliminary Exam Workshop: Hilbert Spaces
Analysis Preliminary Exam Workshop: Hilbert Spaces 1. Hilbert spaces A Hilbert space H is a complete real or complex inner product space. Consider complex Hilbert spaces for definiteness. If (, ) : H H
More information(1) Consider the space S consisting of all continuous real-valued functions on the closed interval [0, 1]. For f, g S, define
Homework, Real Analysis I, Fall, 2010. (1) Consider the space S consisting of all continuous real-valued functions on the closed interval [0, 1]. For f, g S, define ρ(f, g) = 1 0 f(x) g(x) dx. Show that
More information1. If 1, ω, ω 2, -----, ω 9 are the 10 th roots of unity, then (1 + ω) (1 + ω 2 ) (1 + ω 9 ) is A) 1 B) 1 C) 10 D) 0
4 INUTES. If, ω, ω, -----, ω 9 are the th roots of unity, then ( + ω) ( + ω ) ----- ( + ω 9 ) is B) D) 5. i If - i = a + ib, then a =, b = B) a =, b = a =, b = D) a =, b= 3. Find the integral values for
More informationNumerical Solutions to Partial Differential Equations
Numerical Solutions to Partial Differential Equations Zhiping Li LMAM and School of Mathematical Sciences Peking University Nonconformity and the Consistency Error First Strang Lemma Abstract Error Estimate
More informationStability of an abstract wave equation with delay and a Kelvin Voigt damping
Stability of an abstract wave equation with delay and a Kelvin Voigt damping University of Monastir/UPSAY/LMV-UVSQ Joint work with Serge Nicaise and Cristina Pignotti Outline 1 Problem The idea Stability
More informationProjection Theorem 1
Projection Theorem 1 Cauchy-Schwarz Inequality Lemma. (Cauchy-Schwarz Inequality) For all x, y in an inner product space, [ xy, ] x y. Equality holds if and only if x y or y θ. Proof. If y θ, the inequality
More informationMAT 257, Handout 13: December 5-7, 2011.
MAT 257, Handout 13: December 5-7, 2011. The Change of Variables Theorem. In these notes, I try to make more explicit some parts of Spivak s proof of the Change of Variable Theorem, and to supply most
More informationVectors in Function Spaces
Jim Lambers MAT 66 Spring Semester 15-16 Lecture 18 Notes These notes correspond to Section 6.3 in the text. Vectors in Function Spaces We begin with some necessary terminology. A vector space V, also
More informationKernel Methods. Machine Learning A W VO
Kernel Methods Machine Learning A 708.063 07W VO Outline 1. Dual representation 2. The kernel concept 3. Properties of kernels 4. Examples of kernel machines Kernel PCA Support vector regression (Relevance
More informationA SYNOPSIS OF HILBERT SPACE THEORY
A SYNOPSIS OF HILBERT SPACE THEORY Below is a summary of Hilbert space theory that you find in more detail in any book on functional analysis, like the one Akhiezer and Glazman, the one by Kreiszig or
More informationAN EFFECTIVE METRIC ON C(H, K) WITH NORMAL STRUCTURE. Mona Nabiei (Received 23 June, 2015)
NEW ZEALAND JOURNAL OF MATHEMATICS Volume 46 (2016), 53-64 AN EFFECTIVE METRIC ON C(H, K) WITH NORMAL STRUCTURE Mona Nabiei (Received 23 June, 2015) Abstract. This study first defines a new metric with
More informationIntroduction to Signal Spaces
Introduction to Signal Spaces Selin Aviyente Department of Electrical and Computer Engineering Michigan State University January 12, 2010 Motivation Outline 1 Motivation 2 Vector Space 3 Inner Product
More informationReal Analysis, 2nd Edition, G.B.Folland Elements of Functional Analysis
Real Analysis, 2nd Edition, G.B.Folland Chapter 5 Elements of Functional Analysis Yung-Hsiang Huang 5.1 Normed Vector Spaces 1. Note for any x, y X and a, b K, x+y x + y and by ax b y x + b a x. 2. It
More informationGaussian Processes. 1. Basic Notions
Gaussian Processes 1. Basic Notions Let T be a set, and X : {X } T a stochastic process, defined on a suitable probability space (Ω P), that is indexed by T. Definition 1.1. We say that X is a Gaussian
More informationThe following definition is fundamental.
1. Some Basics from Linear Algebra With these notes, I will try and clarify certain topics that I only quickly mention in class. First and foremost, I will assume that you are familiar with many basic
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 09: Stochastic Convergence, Continued
Introduction to Empirical Processes and Semiparametric Inference Lecture 09: Stochastic Convergence, Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and
More informationCHAPTER 6. Representations of compact groups
CHAPTER 6 Representations of compact groups Throughout this chapter, denotes a compact group. 6.1. Examples of compact groups A standard theorem in elementary analysis says that a subset of C m (m a positive
More informationFall TMA4145 Linear Methods. Solutions to exercise set 9. 1 Let X be a Hilbert space and T a bounded linear operator on X.
TMA445 Linear Methods Fall 26 Norwegian University of Science and Technology Department of Mathematical Sciences Solutions to exercise set 9 Let X be a Hilbert space and T a bounded linear operator on
More informationEigenvalues and Eigenfunctions of the Laplacian
The Waterloo Mathematics Review 23 Eigenvalues and Eigenfunctions of the Laplacian Mihai Nica University of Waterloo mcnica@uwaterloo.ca Abstract: The problem of determining the eigenvalues and eigenvectors
More informationarxiv: v1 [math.pr] 22 May 2008
THE LEAST SINGULAR VALUE OF A RANDOM SQUARE MATRIX IS O(n 1/2 ) arxiv:0805.3407v1 [math.pr] 22 May 2008 MARK RUDELSON AND ROMAN VERSHYNIN Abstract. Let A be a matrix whose entries are real i.i.d. centered
More informationYour first day at work MATH 806 (Fall 2015)
Your first day at work MATH 806 (Fall 2015) 1. Let X be a set (with no particular algebraic structure). A function d : X X R is called a metric on X (and then X is called a metric space) when d satisfies
More informationChapter 8 Integral Operators
Chapter 8 Integral Operators In our development of metrics, norms, inner products, and operator theory in Chapters 1 7 we only tangentially considered topics that involved the use of Lebesgue measure,
More information5 Compact linear operators
5 Compact linear operators One of the most important results of Linear Algebra is that for every selfadjoint linear map A on a finite-dimensional space, there exists a basis consisting of eigenvectors.
More informationNormed Vector Spaces and Double Duals
Normed Vector Spaces and Double Duals Mathematics 481/525 In this note we look at a number of infinite-dimensional R-vector spaces that arise in analysis, and we consider their dual and double dual spaces
More informationReal Variables # 10 : Hilbert Spaces II
randon ehring Real Variables # 0 : Hilbert Spaces II Exercise 20 For any sequence {f n } in H with f n = for all n, there exists f H and a subsequence {f nk } such that for all g H, one has lim (f n k,
More informationSpectral analysis for rank one perturbations of diagonal operators in non-archimedean Hilbert space
Comment.Math.Univ.Carolin. 50,32009 385 400 385 Spectral analysis for rank one perturbations of diagonal operators in non-archimedean Hilbert space Toka Diagana, George D. McNeal Abstract. The paper is
More informationconditional cdf, conditional pdf, total probability theorem?
6 Multiple Random Variables 6.0 INTRODUCTION scalar vs. random variable cdf, pdf transformation of a random variable conditional cdf, conditional pdf, total probability theorem expectation of a random
More informationSPECTRAL THEOREM FOR SYMMETRIC OPERATORS WITH COMPACT RESOLVENT
SPECTRAL THEOREM FOR SYMMETRIC OPERATORS WITH COMPACT RESOLVENT Abstract. These are the letcure notes prepared for the workshop on Functional Analysis and Operator Algebras to be held at NIT-Karnataka,
More informationLecture Notes 1: Vector spaces
Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector
More informationLecture Notes on Metric Spaces
Lecture Notes on Metric Spaces Math 117: Summer 2007 John Douglas Moore Our goal of these notes is to explain a few facts regarding metric spaces not included in the first few chapters of the text [1],
More informationOn the bang-bang property of time optimal controls for infinite dimensional linear systems
On the bang-bang property of time optimal controls for infinite dimensional linear systems Marius Tucsnak Université de Lorraine Paris, 6 janvier 2012 Notation and problem statement (I) Notation: X (the
More informationSpectral Theory, with an Introduction to Operator Means. William L. Green
Spectral Theory, with an Introduction to Operator Means William L. Green January 30, 2008 Contents Introduction............................... 1 Hilbert Space.............................. 4 Linear Maps
More informationThe Dirichlet s P rinciple. In this lecture we discuss an alternative formulation of the Dirichlet problem for the Laplace equation:
Oct. 1 The Dirichlet s P rinciple In this lecture we discuss an alternative formulation of the Dirichlet problem for the Laplace equation: 1. Dirichlet s Principle. u = in, u = g on. ( 1 ) If we multiply
More informationWe denote the derivative at x by DF (x) = L. With respect to the standard bases of R n and R m, DF (x) is simply the matrix of partial derivatives,
The derivative Let O be an open subset of R n, and F : O R m a continuous function We say F is differentiable at a point x O, with derivative L, if L : R n R m is a linear transformation such that, for
More informationHomework If the inverse T 1 of a closed linear operator exists, show that T 1 is a closed linear operator.
Homework 3 1 If the inverse T 1 of a closed linear operator exists, show that T 1 is a closed linear operator Solution: Assuming that the inverse of T were defined, then we will have to have that D(T 1
More informationLecture 4: Numerical solution of ordinary differential equations
Lecture 4: Numerical solution of ordinary differential equations Department of Mathematics, ETH Zürich General explicit one-step method: Consistency; Stability; Convergence. High-order methods: Taylor
More informationVector Spaces. Vector space, ν, over the field of complex numbers, C, is a set of elements a, b,..., satisfying the following axioms.
Vector Spaces Vector space, ν, over the field of complex numbers, C, is a set of elements a, b,..., satisfying the following axioms. For each two vectors a, b ν there exists a summation procedure: a +
More informationContents. 0.1 Notation... 3
Contents 0.1 Notation........................................ 3 1 A Short Course on Frame Theory 4 1.1 Examples of Signal Expansions............................ 4 1.2 Signal Expansions in Finite-Dimensional
More informationFunctional Analysis F3/F4/NVP (2005) Homework assignment 3
Functional Analysis F3/F4/NVP (005 Homework assignment 3 All students should solve the following problems: 1. Section 4.8: Problem 8.. Section 4.9: Problem 4. 3. Let T : l l be the operator defined by
More information4 Linear operators and linear functionals
4 Linear operators and linear functionals The next section is devoted to studying linear operators between normed spaces. Definition 4.1. Let V and W be normed spaces over a field F. We say that T : V
More informationINNER PRODUCT SPACE. Definition 1
INNER PRODUCT SPACE Definition 1 Suppose u, v and w are all vectors in vector space V and c is any scalar. An inner product space on the vectors space V is a function that associates with each pair of
More informationOn the Identifiability of the Functional Convolution. Model
On the Identifiability of the Functional Convolution Model Giles Hooker Abstract This report details conditions under which the Functional Convolution Model described in Asencio et al. (2013) can be identified
More informationVector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis.
Vector spaces DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_fall17/index.html Carlos Fernandez-Granda Vector space Consists of: A set V A scalar
More informationLinear Methods for Regression. Lijun Zhang
Linear Methods for Regression Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Linear Regression Models and Least Squares Subset Selection Shrinkage Methods Methods Using Derived
More informationNATIONAL BOARD FOR HIGHER MATHEMATICS. Research Scholarships Screening Test. Saturday, January 20, Time Allowed: 150 Minutes Maximum Marks: 40
NATIONAL BOARD FOR HIGHER MATHEMATICS Research Scholarships Screening Test Saturday, January 2, 218 Time Allowed: 15 Minutes Maximum Marks: 4 Please read, carefully, the instructions that follow. INSTRUCTIONS
More informationANALYSIS QUALIFYING EXAM FALL 2016: SOLUTIONS. = lim. F n
ANALYSIS QUALIFYING EXAM FALL 206: SOLUTIONS Problem. Let m be Lebesgue measure on R. For a subset E R and r (0, ), define E r = { x R: dist(x, E) < r}. Let E R be compact. Prove that m(e) = lim m(e /n).
More informationLecture 3: Review of Linear Algebra
ECE 83 Fall 2 Statistical Signal Processing instructor: R Nowak Lecture 3: Review of Linear Algebra Very often in this course we will represent signals as vectors and operators (eg, filters, transforms,
More information4 Hilbert spaces. The proof of the Hilbert basis theorem is not mathematics, it is theology. Camille Jordan
The proof of the Hilbert basis theorem is not mathematics, it is theology. Camille Jordan Wir müssen wissen, wir werden wissen. David Hilbert We now continue to study a special class of Banach spaces,
More informationFunctional Analysis. Martin Brokate. 1 Normed Spaces 2. 2 Hilbert Spaces The Principle of Uniform Boundedness 32
Functional Analysis Martin Brokate Contents 1 Normed Spaces 2 2 Hilbert Spaces 2 3 The Principle of Uniform Boundedness 32 4 Extension, Reflexivity, Separation 37 5 Compact subsets of C and L p 46 6 Weak
More informationNORMS ON SPACE OF MATRICES
NORMS ON SPACE OF MATRICES. Operator Norms on Space of linear maps Let A be an n n real matrix and x 0 be a vector in R n. We would like to use the Picard iteration method to solve for the following system
More informationTHEOREMS, ETC., FOR MATH 515
THEOREMS, ETC., FOR MATH 515 Proposition 1 (=comment on page 17). If A is an algebra, then any finite union or finite intersection of sets in A is also in A. Proposition 2 (=Proposition 1.1). For every
More informationL. Levaggi A. Tabacco WAVELETS ON THE INTERVAL AND RELATED TOPICS
Rend. Sem. Mat. Univ. Pol. Torino Vol. 57, 1999) L. Levaggi A. Tabacco WAVELETS ON THE INTERVAL AND RELATED TOPICS Abstract. We use an abstract framework to obtain a multilevel decomposition of a variety
More informationFunctional Analysis F3/F4/NVP (2005) Homework assignment 2
Functional Analysis F3/F4/NVP (25) Homework assignment 2 All students should solve the following problems:. Section 2.7: Problem 8. 2. Let x (t) = t 2 e t/2, x 2 (t) = te t/2 and x 3 (t) = e t/2. Orthonormalize
More informationStochastic Design Criteria in Linear Models
AUSTRIAN JOURNAL OF STATISTICS Volume 34 (2005), Number 2, 211 223 Stochastic Design Criteria in Linear Models Alexander Zaigraev N. Copernicus University, Toruń, Poland Abstract: Within the framework
More informationOn Riesz-Fischer sequences and lower frame bounds
On Riesz-Fischer sequences and lower frame bounds P. Casazza, O. Christensen, S. Li, A. Lindner Abstract We investigate the consequences of the lower frame condition and the lower Riesz basis condition
More informationNonlinear error dynamics for cycled data assimilation methods
Nonlinear error dynamics for cycled data assimilation methods A J F Moodey 1, A S Lawless 1,2, P J van Leeuwen 2, R W E Potthast 1,3 1 Department of Mathematics and Statistics, University of Reading, UK.
More informationTake-Home Final Examination of Math 55a (January 17 to January 23, 2004)
Take-Home Final Eamination of Math 55a January 17 to January 3, 004) N.B. For problems which are similar to those on the homework assignments, complete self-contained solutions are required and homework
More informationHILBERT SPACES AND THE RADON-NIKODYM THEOREM. where the bar in the first equation denotes complex conjugation. In either case, for any x V define
HILBERT SPACES AND THE RADON-NIKODYM THEOREM STEVEN P. LALLEY 1. DEFINITIONS Definition 1. A real inner product space is a real vector space V together with a symmetric, bilinear, positive-definite mapping,
More informationFourier Series. 1. Review of Linear Algebra
Fourier Series In this section we give a short introduction to Fourier Analysis. If you are interested in Fourier analysis and would like to know more detail, I highly recommend the following book: Fourier
More informationNormed & Inner Product Vector Spaces
Normed & Inner Product Vector Spaces ECE 174 Introduction to Linear & Nonlinear Optimization Ken Kreutz-Delgado ECE Department, UC San Diego Ken Kreutz-Delgado (UC San Diego) ECE 174 Fall 2016 1 / 27 Normed
More information1 Directional Derivatives and Differentiability
Wednesday, January 18, 2012 1 Directional Derivatives and Differentiability Let E R N, let f : E R and let x 0 E. Given a direction v R N, let L be the line through x 0 in the direction v, that is, L :=
More informationJae Gil Choi and Young Seo Park
Kangweon-Kyungki Math. Jour. 11 (23), No. 1, pp. 17 3 TRANSLATION THEOREM ON FUNCTION SPACE Jae Gil Choi and Young Seo Park Abstract. In this paper, we use a generalized Brownian motion process to define
More informationFunctional Analysis Review
Outline 9.520: Statistical Learning Theory and Applications February 8, 2010 Outline 1 2 3 4 Vector Space Outline A vector space is a set V with binary operations +: V V V and : R V V such that for all
More informationMATH 205C: STATIONARY PHASE LEMMA
MATH 205C: STATIONARY PHASE LEMMA For ω, consider an integral of the form I(ω) = e iωf(x) u(x) dx, where u Cc (R n ) complex valued, with support in a compact set K, and f C (R n ) real valued. Thus, I(ω)
More information2. Review of Linear Algebra
2. Review of Linear Algebra ECE 83, Spring 217 In this course we will represent signals as vectors and operators (e.g., filters, transforms, etc) as matrices. This lecture reviews basic concepts from linear
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 22: Preliminaries for Semiparametric Inference
Introduction to Empirical Processes and Semiparametric Inference Lecture 22: Preliminaries for Semiparametric Inference Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics
More informationOn compact operators
On compact operators Alen Aleanderian * Abstract In this basic note, we consider some basic properties of compact operators. We also consider the spectrum of compact operators on Hilbert spaces. A basic
More informationFUNCTIONAL ANALYSIS-NORMED SPACE
MAT641- MSC Mathematics, MNIT Jaipur FUNCTIONAL ANALYSIS-NORMED SPACE DR. RITU AGARWAL MALAVIYA NATIONAL INSTITUTE OF TECHNOLOGY JAIPUR 1. Normed space Norm generalizes the concept of length in an arbitrary
More informationModeling Multi-Way Functional Data With Weak Separability
Modeling Multi-Way Functional Data With Weak Separability Kehui Chen Department of Statistics University of Pittsburgh, USA @CMStatistics, Seville, Spain December 09, 2016 Outline Introduction. Multi-way
More informationHilbert Space Methods for Reduced-Rank Gaussian Process Regression
Hilbert Space Methods for Reduced-Rank Gaussian Process Regression Arno Solin and Simo Särkkä Aalto University, Finland Workshop on Gaussian Process Approximation Copenhagen, Denmark, May 2015 Solin &
More information