Index Models for Sparsely Sampled Functional Data. Supplementary Material.

Peter Radchenko, Xinghao Qiao, and Gareth M. James

May 21, 2014

1 Proof of Theorem 2

We will follow the general argument in the proof of Theorem 3 in Li et al. (2010). Write $d = (d_1, \dots, d_p)$ and $d_c = (d_{c1}, \dots, d_{cp})$, and let $|d|$ and $|d_c|$ denote the corresponding component sums. Let $G_n(d_c)$ denote the difference between the criterion values for the candidate and the true dimension vectors:
$$G_n(d_c) = \log L(d_c) - \log L(d) + |d_c| \log n / [n h_n^{|d_c|}(d)].$$
Note that the set of candidate dimension vectors is finite. Thus, it is sufficient to show that for each vector $d_c$ not equal to $d$ the function $G_n(d_c)$ is positive with probability tending to one. Note that
$$|d_c| \log n / [n h_n^{|d_c|}(d)] = O\big(\log n \, [n^{-4/(|d_c|+4)} + n^{-4/(|d|+4)}]\big) = o(1).$$
If $d_{cj} < d_j$ for at least one $j$, we can choose a positive constant $c$ such that $\log L(d_c) - \log L(d) > c$ with probability tending to one, due to the lack of fit. It follows that $G_n(d_c) > 0$ with probability tending to one.

Now consider the remaining case, in which $d_{cj} \ge d_j$ for all $j$ and $|d_c| > |d|$. In this case $|d_c| \log n / [n h_n^{|d_c|}(d)] > \log n / [n h_n^{|d_c|}(d_c)]$ for all sufficiently large $n$. On the other hand,
$$\log L(d_c) - \log L(d) = \log\Big(1 + \frac{L(d_c) - L(d)}{L(d)}\Big) = \log\big(1 + O_p(1/[n h_n^{|d_c|}(d_c)])\big) = O_p\big(1/[n h_n^{|d_c|}(d_c)]\big),$$
by the classical results on local linear smoothing. Consequently, with probability tending to one, $G_n(d_c) > (1/2) \log n / [n h_n^{|d_c|}(d_c)] > 0$.
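As a quick check on the orders claimed above, the following sketch works the penalty arithmetic under the assumption, consistent with the rates displayed in the proof but not stated explicitly in this extract, that the bandwidth satisfies $h_n(d) \asymp n^{-1/(|d|+4)}$:
$$\frac{|d_c| \log n}{n h_n^{|d_c|}(d_c)} \asymp |d_c| \log n \cdot n^{-1 + |d_c|/(|d_c|+4)} = |d_c| \log n \cdot n^{-4/(|d_c|+4)} \to 0.$$
Thus the penalty term vanishes, yet it still dominates the $O_p(1/[n h_n^{|d_c|}(d_c)])$ likelihood gain available from overfitting, because the latter lacks the $\log n$ factor.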

2 Proof of Theorems 3 and 4

Theorem 3. For simplicity of exposition we will focus on the case of a single predictor. Due to the additive structure of the SIMFE estimation procedure with respect to the predictors, the extension to the general case presents only notational challenges, while the argument itself remains essentially intact. We will also simplify the notation by omitting the superscript containing the predictor index. For example, we will write $\mu_i$ and $\hat P_i$ instead of $\mu_{ij}$ and $\hat P_{ij}$.

Define $\delta_{kh} = [\log n / (n h^k)]^{1/2}$. Observe the relationship $\|A \hat\beta(t) - \beta(t)\| = \|A[\hat\eta] - [\eta]\|_F$, where $[\hat\eta]$ and $[\eta]$ denote the matrices $[\hat\eta_1 \dots \hat\eta_d]^T$ and $[\eta_1 \dots \eta_d]^T$, respectively, and $\|\cdot\|_F$ stands for the Frobenius matrix norm. Hence, we need to show that there exists an invertible matrix $A$ for which
$$\|A[\hat\eta] - [\eta]\|_F = O_p\big(h_{\mathrm{opt}}^4 + \delta_{d h_{\mathrm{opt}}}^2 + n^{-1/2}\big).$$
In the new notation, $\hat\mu_i = (\hat\mu_{i1}, \dots, \hat\mu_{iq})^T$ and $\hat\eta \circ \hat\mu_i = (\hat\eta_1^T \hat\mu_i, \dots, \hat\eta_d^T \hat\mu_i)^T = \hat P_i$. To be able to conveniently apply existing results, we will slightly modify the trimmed objective function, replacing $\hat P_l$ with $\hat P_l - \hat P_i$ and writing this expression in terms of $\hat\eta$ and $\hat\mu$. We will also use $\Delta\hat\mu$ to denote $\hat\mu_l - \hat\mu_i$. The modified objective function,
$$\frac{1}{n} \sum_{i=1}^n \sum_{l=1}^n I_{n_i} \rho_{\hat\eta i} \big(Y_l - a_i - c_i^T (\hat\eta \circ \Delta\hat\mu)\big)^2 K_h(\hat\eta \circ \Delta\hat\mu),$$
however, corresponds to exactly the same SIMFE estimator as the original trimmed function. Define $\check K_t = I_{n_i} \rho_{\hat\eta i} K_{h_t}(\hat\eta \circ \Delta\hat\mu)$, $\hat\Delta = c_i \otimes \Delta\hat\mu$ and $\Delta = c_i \otimes \Delta\mu$. We will use the superscript $(t, \tau)$ to indicate that the estimator corresponds to the $\tau$-th iteration of the algorithm run with the bandwidth $h_t$. A simple manipulation of formula (23), taking into account the modifications to the objective function, yields
$$\hat\eta^{(t,\tau+1)} = \eta_{r1} + \Big\{\sum_{i,l} \check K_t \hat\Delta^{(t,\tau)} \big(\hat\Delta^{(t,\tau)}\big)^T\Big\}^{-1} \sum_{i,l} \check K_t \hat\Delta^{(t,\tau)} \big\{Y_l - a_i^{(t,\tau)} - \eta_{r1}^T \hat\Delta^{(t,\tau)}\big\}. \qquad (1)$$
Here $\eta_{r1}$ corresponds to an arbitrary rotation of $[\eta]$, and the above formula holds for each such rotation.
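Display (1) is the closed form of a weighted least squares step. As a minimal sketch of this manipulation, with $b$ denoting a generic increment (not notation from the paper): holding the weights $\check K_t$ and the local intercepts $a_i^{(t,\tau)}$ fixed, the update solves
$$\hat\eta^{(t,\tau+1)} - \eta_{r1} = \arg\min_b \sum_{i,l} \check K_t \big(Y_l - a_i^{(t,\tau)} - \eta_{r1}^T \hat\Delta^{(t,\tau)} - b^T \hat\Delta^{(t,\tau)}\big)^2,$$
and setting the gradient with respect to $b$ to zero gives the normal equations $\{\sum_{i,l} \check K_t \hat\Delta^{(t,\tau)} (\hat\Delta^{(t,\tau)})^T\} b = \sum_{i,l} \check K_t \hat\Delta^{(t,\tau)} \{Y_l - a_i^{(t,\tau)} - \eta_{r1}^T \hat\Delta^{(t,\tau)}\}$, whose solution is exactly the increment in display (1).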

Let $M$ denote the number of time point configurations at which the predictor functions are observed. We will only consider configurations that are generated with positive probabilities. Denote by $A_k$, $k = 1, \dots, M$, the index set of the observations corresponding to the $k$-th time point configuration. Note that, using the basis representation for the predictors and the projection functions, equation (10) simplifies to
$$Y_i = m_k(\eta_1^T \mu_i, \dots, \eta_d^T \mu_i) + \varepsilon_i, \qquad i \in A_k, \qquad (2)$$
where $E(\varepsilon_i \mid W_i) = 0$. The last equality also implies $E(\varepsilon_i \mid \mu_i) = 0$. This formulation of the SIMFE model allows us to apply some of the theory developed for the MAVE approach.

We will first focus on the right-hand side of display (1), for sufficiently large $n$, with $\hat\Delta$ replaced by $\Delta$, and rewrite it as $\eta_{r1} + \{\sum_{k=1}^M \Sigma_k\}^{-1} \{\sum_{k=1}^M \tilde\Sigma_k\}$, where
$$\Sigma_k = \sum_{i,l \in A_k} \check K_t \Delta^{(t,\tau)} \big(\Delta^{(t,\tau)}\big)^T \quad\text{and}\quad \tilde\Sigma_k = \sum_{i,l \in A_k} \check K_t \Delta^{(t,\tau)} \big(Y_l - a_i^{(t,\tau)} - \eta_{r1}^T \Delta^{(t,\tau)}\big).$$
We will apply Lemma A.5 in the supplemental material of Xia (2008) to each $\tilde\Sigma_k$, and use a natural generalization of Lemma A.4 to handle $\{\sum_{k=1}^M \Sigma_k\}^{-1}$. For each $k \le M$, write $\pi_k$ for the probability of the $k$-th time point configuration, and let $\mu^{(k)}$ denote $\mu_i$ for some $i$ in $A_k$. Define
$$\Phi_n = n^{-1} \sum_{k=1}^M \frac{|A_k|}{n} \sum_{i \in A_k} \rho\big(f(\eta \circ \mu_i)\big) \big(\nabla m_k(\eta \circ \mu_i) \otimes \nu(\mu_i)\big) \varepsilon_i,$$
$$D_1 = \sum_{k=1}^M \pi_k E\Big[\rho^2\big(f(\eta \circ \mu^{(k)})\big) \big(\nabla m_k(\eta \circ \mu^{(k)}) \otimes \nu(\mu^{(k)})\big) \big(\nabla m_k(\eta \circ \mu^{(k)}) \otimes \nu(\mu^{(k)})\big)^T\Big], \qquad (3)$$
and
$$D_2 = 2 \sum_{k=1}^M \pi_k E\Big[\rho^2\big(f(\eta \circ \mu^{(k)})\big) \nabla m_k(\eta \circ \mu^{(k)})^T \nabla m_k(\eta \circ \mu^{(k)}) \, \omega(\mu^{(k)})\Big]^T, \qquad (4)$$
where $\nu(\mu) = \mu - E(\mu \mid \eta \circ \mu)$ and $\omega(\mu) = E(\mu \mu^T \mid \eta \circ \mu) - E(\mu \mid \eta \circ \mu) E(\mu \mid \eta \circ \mu)^T$. We will use the superscript $+$ to denote the Moore-Penrose inverse. By Lemmas A.4 and A.5 in the supplemental material of Xia (2008), there exists a rotation of $[\eta]$, call it $[\eta_{r2}]$, such that
$$\eta_{r1} + \Big\{\sum_{i,l} \check K_t \Delta^{(t,\tau)} \big(\Delta^{(t,\tau)}\big)^T\Big\}^{-1} \sum_{i,l} \check K_t \Delta^{(t,\tau)} \big\{Y_l - a_i^{(t,\tau)} - \eta_{r1}^T \Delta^{(t,\tau)}\big\} = \eta_{r2} + (I - D_2^+ D_1)^{-1} D_2^+ \Phi_n + O_p\big(h_t^4 + \delta_{d h_t}^2\big), \qquad (5)$$
provided $\hat\eta^{(t,\tau)} - \eta_{r1} = o_p(h_t)$. Note that $\Phi_n = O_p(n^{-1/2})$. Due to assumption A3, the unknown parameters, $\mu_\delta$ and $\sigma^2$, are estimated at the usual parametric rate, $n^{-1/2}$, and hence $\hat\mu = \mu(1 + O_p(n^{-1/2}))$. Consequently, equations (1) and (5) imply
$$\hat\eta^{(t,\tau+1)} - \eta_{r2} = O_p\big(h_t^4 + \delta_{d h_t}^2 + n^{-1/2}\big), \qquad (6)$$
as long as $\hat\eta^{(t,\tau)} - \eta_{r1} = o_p(h_t)$. Note that $\delta_{d h_t}^2 = o(h_t)$, and hence the stochastic bound in display (6) can be written as $o_p(h_t)$. It follows that the final estimator for the bandwidth $h_t$, which is also the initial estimator for the bandwidth $h_{t+1}$, satisfies $\hat\eta^{(t+1,0)} - \eta_r = o_p(h_{t+1})$, for some rotation $\eta_r$ of $[\eta]$. Hence, as long as $\hat\eta^{(0,0)} - \eta = o_p(h_0)$ holds for the initialization estimator, we can establish, by induction, that the final SIMFE estimator satisfies
$$\|[\hat\eta] - [\eta_r]\|_F = O_p\big(h_{\mathrm{opt}}^4 + \delta_{d h_{\mathrm{opt}}}^2 + n^{-1/2}\big), \qquad (7)$$
where $[\eta_r]$ is an appropriate rotation of $[\eta]$.
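For completeness, here is the small calculation behind the claim that $\delta_{d h_t}^2 = o(h_t)$. Assume, consistently with the rates used above though not restated here, that the bandwidths stay above the optimal rate, $h_t \gtrsim n^{-1/(d+4)}$; since $\delta_{dh}^2 / h$ is decreasing in $h$, it suffices to check the boundary case $h_t \asymp n^{-1/(d+4)}$:
$$\frac{\delta_{d h_t}^2}{h_t} = \frac{\log n}{n h_t^{d+1}} \asymp \log n \cdot n^{-1 + (d+1)/(d+4)} = \log n \cdot n^{-3/(d+4)} \to 0.$$
The same arithmetic with $h_{\mathrm{opt}}$ shows that $h_{\mathrm{opt}}^4 \asymp n^{-4/(d+4)}$ and $\delta_{d h_{\mathrm{opt}}}^2 \asymp n^{-4/(d+4)} \log n$, which is how the bound in display (7) produces the rate quoted in the proof of Theorem 4 below.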

The required bound for the initialization estimator follows from Theorem 4 in Li et al. (2010), which establishes that the gOPG estimator converges at the rate $O_p(h_0^2 + \delta_{p h_0}^2 h_0^{-1})$. The last stochastic bound is $o_p(h_0)$ by the definitions of $h_0$ and $\delta_{p h_0}$.

Theorem 4. In the case where $n_i = 1$ for all $i$, we can repeat the proof of Theorem 3, treating $S$ as an additional univariate predictor. The reduced dimension increases from $d$ to $d + 1$. As a result, the convergence rate changes from $O_p(n^{-4/(d+4)} \log n + n^{-1/2})$ to $O_p(n^{-4/(d+5)} \log n + n^{-1/2})$. In the general case, some of the expressions in the proof of Theorem 3 need to be modified. However, because the sequence $\{n_i\}$ is bounded, the stochastic bounds (6) and (7) remain the same as in the case $n_i = 1$.

3 Proof of Theorem 5

Partition the rows of the matrix $(I - \Omega_i S_i)(I - \Omega_i S_i)^T + \sigma^2 \Omega_i \Omega_i^T$ into $p$ groups of adjacent rows, so that the size of the $j$-th group is $q_j$. Partition the columns of the same matrix analogously. This corresponds to a partition of the matrix into $p^2$ blocks, $V_{ijk}$, where $j$ is the group index in the row partition and $k$ is the group index in the column partition. Write $v_{ijk}$ for the vectorized form of the block $V_{ijk}$. Recall the definitions of the matrices $\Sigma_i$ and the vectors $\xi_i$ in Sections 2.4 and 3. Each element of $\xi_i$ has the form $\mathrm{cov}(U_{ijl_1}, U_{ikl_2})$ for some indexes $j, k, l_1, l_2$ that satisfy $p \ge k \ge j \ge 1$, $d_j \ge l_1 \ge 1$, $d_k \ge l_2 \ge 1$, and $l_2 \ge l_1$ whenever $k = j$. Consequently, each element of $\xi_i$ can be written as $\eta_{jl_1}^T V_{ijk} \eta_{kl_2}$. Thus, there exists a collection of vectors $\gamma_{jl_1kl_2}$, with the indexes satisfying the inequalities given above, such that for each $i$ the elements of $\xi_i$ have the form $\gamma_{jl_1kl_2}^T v_{ijk}$. We will denote this representation of the vector $\xi_i$ as $\gamma \circ v_i$, where $\gamma$ is the vector constructed by stacking the vectors $\gamma_{jl_1kl_2}$, and $v_i$ is constructed similarly from the $v_{ijk}$.

Using the derivations in Section 2.4 and the notation in Appendix D, we see that $m_{t_i}(P_i)$ can be written as $\check m(\eta \circ \mu_i, \gamma \circ v_i)$, for some function $\check m$ which does not depend on $i$. We can now follow the argument in the proof of Theorem 3, with some small modifications. Our initialization estimator is still gOPG; however, we use $(\hat\mu_i, \hat v_i)$, instead of $\hat\mu_i$, as the predictor vectors. Here the $\hat v_i$ are constructed analogously to the $v_i$, but with the unknown parameters and $\sigma^2$ replaced by their estimates. We initialize the bandwidth as $n^{-1/(\check p + 4)}$, where $\check p$ is the dimension of $(\hat\mu_i, \hat v_i)$. As in the proof of Theorem 3, the parameters and $\sigma^2$ are estimated at the parametric rate, $n^{-1/2}$, and the aforementioned gOPG estimator of $(\eta, \gamma)$ is consistent.
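To make the dimension bookkeeping below concrete, here is the worked count in the single-predictor case ($p = 1$, $d_1 = d$), using only the index constraints stated above. The admissible pairs are $1 \le l_1 \le l_2 \le d$, so $\xi_i$ has
$$\#\{(l_1, l_2) : 1 \le l_1 \le l_2 \le d\} = d + \binom{d}{2} = \frac{d(d+1)}{2}$$
elements, which is exactly the number of additional indices contributed by $\gamma \circ v_i$ in the dimension count $\check d = d + d(d+1)/2$ used in the next step of the proof.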

We no longer partition the observations by time point configuration, but instead treat $v_i$ as an additional predictor. Equation (2) is replaced with $Y_i = \check m(\eta \circ \mu_i, \gamma \circ v_i) + \varepsilon_i$, and the dimensionality of the corresponding model increases from $d$ to $\check d = d + d(d+1)/2$. The rest of the proof is the same as that of Theorem 3, with the appropriate modifications, such as replacing $\hat\eta$ with $(\hat\eta, \hat\gamma)$ and $\hat\mu_i$ with $(\hat\mu_i, \hat v_i)$. Taking into account the increased dimensionality of the problem, the stochastic bound in display (7) changes to $O_p(h_{\mathrm{opt}}^4 + \delta_{\check d h_{\mathrm{opt}}}^2 + n^{-1/2})$, which simplifies to $O_p(n^{-4/(\check d + 4)} \log n + n^{-1/2})$.
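The final simplification is the same bandwidth arithmetic as before. As a sketch, assuming $h_{\mathrm{opt}} \asymp n^{-1/(\check d + 4)}$ (consistent with the initialization bandwidth convention $n^{-1/(\check p + 4)}$ used above),
$$h_{\mathrm{opt}}^4 \asymp n^{-4/(\check d + 4)} \qquad\text{and}\qquad \delta_{\check d h_{\mathrm{opt}}}^2 = \frac{\log n}{n h_{\mathrm{opt}}^{\check d}} \asymp \log n \cdot n^{-4/(\check d + 4)},$$
so both terms are absorbed into $O_p(n^{-4/(\check d + 4)} \log n)$, and the $n^{-1/2}$ term carries over unchanged.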

References

Li, L., Li, B., and Zhu, L.-X. (2010). Groupwise dimension reduction. Journal of the American Statistical Association 105.

Xia, Y. (2008). A multiple-index model and dimension reduction. Journal of the American Statistical Association 103.
