Subspace Estimation from Incomplete Observations: A High-Dimensional Analysis


Chuang Wang, Yonina C. Eldar, Fellow, IEEE, and Yue M. Lu, Senior Member, IEEE

Abstract: We present a high-dimensional analysis of three popular algorithms, namely, Oja's method, GROUSE and PETRELS, for subspace estimation from streaming and highly incomplete observations. We show that, with proper time scaling, the time-varying principal angles between the true subspace and its estimates given by the algorithms converge weakly to deterministic processes when the ambient dimension $n$ tends to infinity. Moreover, the limiting processes can be exactly characterized as the unique solutions of certain ordinary differential equations (ODEs). A finite-sample bound is also given, showing that the rate of convergence towards such limits is $O(1/\sqrt{n})$. In addition to providing asymptotically exact predictions of the dynamic performance of the algorithms, our high-dimensional analysis yields several insights, including an asymptotic equivalence between Oja's method and GROUSE, and a precise scaling relationship linking the amount of missing data to the signal-to-noise ratio. By analyzing the solutions of the limiting ODEs, we also establish phase transition phenomena associated with the steady-state performance of these techniques.

Index Terms: Subspace tracking, streaming PCA, incomplete data, high-dimensional analysis, scaling limit

I. INTRODUCTION

Subspace estimation is a key task in many signal processing applications. Examples include source localization in array processing, system identification, network monitoring, and image sequence analysis, to name a few. The ubiquity of subspace estimation comes from the fact that a low-rank subspace model can conveniently capture the intrinsic, low-dimensional structures of many large datasets.

In this paper, we consider the problem of estimating and tracking an unknown subspace from streaming measurements with many missing entries. The streaming setting appears in applications (e.g., video surveillance) where high-dimensional data arrive sequentially over time at high rates. It is especially relevant in dynamic scenarios where the underlying subspace to be estimated can be time-varying. Missing data is also a very common issue in practice. Incomplete observations may result from a variety of reasons, such as the limitations of the sensing mechanisms, constraints on power consumption or communication bandwidth, or a deliberate design feature that protects the privacy of individuals by removing partial records.

C. Wang is with the John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, USA (e-mail: chuangwang@g.harvard.edu). Y. C. Eldar is with the Department of EE, Technion, Israel Institute of Technology, Haifa 32000, Israel (e-mail: yonina@ee.technion.ac.il). Y. M. Lu is with the John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, USA (e-mail: yuelu@seas.harvard.edu). The work of C. Wang and Y. M. Lu was supported in part by the US Army Research Office under contract W9NF and in part by the US National Science Foundation under grants CCF-394 and CCF. The work of Y. Eldar was supported in part by the European Union's Horizon 2020 Research and Innovation Program under Grant ERC-COG-BNYQ. Preliminary results of this work were presented at the Signal Processing with Adaptive Sparse Structured Representations (SPARS) workshop in 2017.

GROUSE [1] and PETRELS [2], as well as the classical Oja's method [3], are three popular algorithms for solving the above estimation problem.
They are all streaming algorithms in the sense that they provide instantaneous, on-the-fly updates to their subspace estimates upon the arrival of a data point. The three differ in their update rules: Oja's method and GROUSE perform first-order incremental gradient descent on the Euclidean space and the Grassmannian, respectively, whereas PETRELS can be interpreted as a second-order stochastic gradient descent scheme. These algorithms have been shown to be highly effective in practice, but their performance depends on the careful choice of algorithmic parameters such as the step size (for GROUSE and Oja's method) and the discount parameter (for PETRELS). Various convergence properties of these techniques have been studied in [2], [4]-[7], but a precise analysis of their performance is still an open problem. Moreover, the important question of how the signal-to-noise ratios (SNRs), the amount of missing data, and various other algorithmic parameters affect the estimation performance is not fully understood.

As the main objective of this work, we present a tractable and asymptotically exact analysis of the dynamic performance of Oja's method, GROUSE and PETRELS in the high-dimensional regime. Our contribution is mainly threefold:

1. Precise analysis via scaling limits. We show in Theorem 1 and Theorem 2 that the time-varying trajectories of the estimation errors, measured in terms of the principal angles between the true underlying subspace and the estimates given by the algorithms, converge weakly to deterministic processes as the ambient dimension $n \to \infty$. Moreover, such deterministic limits can be characterized as the unique solutions of certain ordinary differential equations (ODEs). In addition, we provide a finite-size guarantee in Theorem 3, showing that the convergence rate towards such limits is $O(1/\sqrt{n})$. Numerical simulations verify the accuracy of our asymptotic predictions. The main technical tool behind our analysis is the weak convergence theory of stochastic processes (see [8]-[12] for its mathematical foundations and [13]-[15] for its recent applications in related estimation problems).

2. Insights regarding the algorithms. In addition to providing asymptotically exact predictions of the dynamic performance of the three subspace estimation algorithms, our high-

dimensional analysis leads to several valuable insights. First, the result of Theorem 1 implies that, despite their different update rules, Oja's method and GROUSE are asymptotically equivalent, with both converging to the same deterministic process as the dimension increases. Second, the characterization given in Theorem 2 shows that PETRELS can be examined within a common framework that incorporates all three algorithms, with the difference being that PETRELS uses an adaptive scheme to adjust its effective step sizes. Third, our limiting ODEs also reveal an (asymptotically) exact scaling relationship that links the amount of missing data to the SNR. See the discussions in Section IV-A for details.

3. Fundamental limits and phase transitions. Analyzing the limiting ODEs also reveals phase transition phenomena associated with the steady-state performance of these algorithms. Specifically, we provide in Propositions 1 and 2 critical thresholds for setting key algorithm parameters (as a function of the SNR and the subsampling ratio), beyond which the algorithms converge to noninformative estimates that are no better than mere random guesses.

The rest of the paper is organized as follows. We start by presenting in Section II-A the exact problem formulation for subspace estimation with missing data. This is followed by a brief review of the three algorithms to be analyzed in this work. The main results are presented in Section III, where we show that the dynamic performance of Oja's method, GROUSE and PETRELS can be asymptotically characterized by the solutions of certain deterministic systems of ODEs. Numerical experiments are also provided to illustrate and verify our theoretical predictions. To place our asymptotic analysis in proper context, we discuss related work in the literature in Section III-D. We consider various implications and insights drawn from our analysis in Section IV. Due to space limitations, we only present informal derivations of the limiting ODEs and proof sketches in Section V. More technical details and the proofs of all the results presented in this paper can be found in the Supplementary Materials [16].

Notation: Throughout the paper, we use $I_d$ to denote the identity matrix. For any positive semidefinite matrix $M$, its principal square root is written as $(M)^{\frac{1}{2}}$. Depending on the context, $\|\cdot\|$ denotes either the $\ell_2$ norm of a vector or the spectral norm of a matrix. For any $x \in \mathbb{R}$, the floor operation $\lfloor x \rfloor$ gives the largest integer that is smaller than or equal to $x$. Let $\{X_n\}$ be a sequence of random variables in a general probability space. $X_n \xrightarrow{P} X$ means that $X_n$ converges in probability to a random variable $X$, whereas $X_n \xrightarrow{\text{weakly}} X$ means that $X_n$ converges to $X$ weakly (i.e., in law). Finally, $\mathbb{1}_A$ denotes the indicator function for an event $A$.

II. PROBLEM FORMULATION AND OVERVIEW OF ALGORITHMS

A. Observation Model

We consider the problem of estimating a low-rank subspace using partial observations from a data stream. At any discrete time $k$, suppose that a sample vector $s_k \in \mathbb{R}^n$ is generated according to
$$s_k = U c_k + a_k. \qquad (1)$$
Here, $U \in \mathbb{R}^{n \times d}$ is an unknown deterministic matrix whose columns form an orthonormal basis of a $d$-dimensional subspace, and $c_k \in \mathbb{R}^d$ is a random vector representing the expansion coefficients in that subspace. We also assume that the covariance matrix of $c_k$ is
$$\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_d), \qquad (2)$$
where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d$ are some strictly positive numbers.¹ The noise in the observations is modeled by a random vector $a_k \in \mathbb{R}^n$ with zero mean and a covariance matrix equal to $I_n$. Furthermore, $a_k$ is independent of $c_k$.

¹The assumption that the covariance matrix is diagonal can be made without loss of generality, after a rotation of the coordinate system. To see that, suppose $c_k$ has a general covariance matrix $\Sigma$, which is diagonalized as $\Sigma = \Phi \Lambda \Phi^\top$. Here, $\Phi$ is an orthonormal matrix and $\Lambda$ is a diagonal matrix as in (2). The generating model (1) can then be rewritten as $s_k = (U\Phi)(\Phi^\top c_k) + a_k$. Thus, our problem is equivalent to estimating the subspace spanned by $U\Phi$, and $\Lambda$ is the covariance matrix of the new expansion coefficient vector $\Phi^\top c_k$.
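To make the model concrete, the following minimal Python sketch (all function and variable names here are ours, and NumPy is assumed) draws one sample from (1)-(2); the basis $U$ is produced by the generic construction (28) discussed later in Section III-C:

import numpy as np

def make_subspace(n, d, rng):
    # A generic orthonormal basis U, obtained by orthonormalizing an
    # n x d matrix with i.i.d. standard normal entries; cf. (28).
    X = rng.standard_normal((n, d))
    U, _ = np.linalg.qr(X)
    return U

def draw_sample(U, lam, rng):
    # One observation s_k = U c_k + a_k as in (1), with Cov(c_k) = diag(lam)
    # as in (2) and unit-variance noise a_k.
    n, d = U.shape
    c = np.sqrt(lam) * rng.standard_normal(d)
    a = rng.standard_normal(n)
    return U @ c + a

rng = np.random.default_rng(0)
U = make_subspace(n=1000, d=4, rng=rng)
s = draw_sample(U, np.array([5.0, 4.0, 3.0, 2.0]), rng)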
Since $\{\lambda_l\}_{1 \le l \le d}$ in (2) indicate the strength of the subspace component relative to the noise, we shall refer to these parameters as the SNR in our subsequent discussions.

We consider the missing data case, where only a subset of the entries of $s_k$ is available. This observation process can be modeled by a diagonal matrix
$$\Omega_k = \operatorname{diag}(v_{k,1}, v_{k,2}, \ldots, v_{k,n}), \qquad (3)$$
where $v_{k,i} = 1$ if the $i$th component of $s_k$ is observed, and $v_{k,i} = 0$ otherwise. Our actual observation, denoted by $y_k$, may then be written as
$$y_k = \Omega_k s_k. \qquad (4)$$
Given a sequence of incomplete observations $\{y_k, \Omega_k\}$ arriving in a stream, we aim to estimate the subspace spanned by the columns of $U$.
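Continuing the sketch above, the observation model (3)-(4) amounts to keeping each entry independently with a fixed probability; the helper below (ours) returns both the masked vector $y_k$ and the mask itself:

def observe(s, alpha, rng):
    # Independent Bernoulli(alpha) mask per entry, eqs. (3)-(4):
    # y_k = Omega_k s_k, with omega playing the role of diag(Omega_k).
    omega = rng.random(s.shape[0]) < alpha
    return np.where(omega, s, 0.0), omega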

B. Oja's Method

Oja's method [3] is a classical algorithm for estimating low-rank subspaces from streaming samples. It was originally designed for the case where the full sample vectors $s_k$ in (1) are available. Given a collection of $K$ such sample vectors, it is natural to use the following optimization formulation to estimate the unknown subspace:
$$\hat{U} = \arg\min_{X^\top X = I_d} \frac{1}{K} \sum_{k=1}^{K} \min_{w_k} \|s_k - X w_k\|^2 \qquad (5)$$
$$= \arg\max_{X^\top X = I_d} \frac{1}{K} \sum_{k=1}^{K} \operatorname{tr}\big(X^\top s_k s_k^\top X\big), \qquad (6)$$
where the equivalence between (5) and (6) is established by solving the simple quadratic problem $\min_{w_k} \|s_k - X w_k\|^2$ and substituting the solution into (5).

Oja's method is a stochastic projected-gradient algorithm for solving (6). At each step $k$, let $X_k$ denote the current estimate of the subspace. Then, with the arrival of a new sample vector $s_k$, we first update $X_k$ according to
$$\widetilde{X}_k = X_k + \frac{\tau_k}{n} s_k w_k^\top, \qquad (7)$$
where $w_k = X_k^\top s_k$ and $\{\tau_k\}$ is a sequence of positive constants that control the step size (or learning rate) of the algorithm. We note that, up to a scaling constant, $s_k w_k^\top$ in (7) is exactly equal to the gradient of the objective function $\operatorname{tr}(X^\top s_k s_k^\top X)$ in (6) due to the new sample $s_k$. Next, to enforce the orthogonality constraint, we compute
$$X_{k+1} = \widetilde{X}_k \big(\widetilde{X}_k^\top \widetilde{X}_k\big)^{-\frac{1}{2}}, \qquad (8)$$
where $(\cdot)^{\frac{1}{2}}$ stands for the principal square root of a positive semidefinite matrix. In practice, (8) is implemented using the QR decomposition of $\widetilde{X}_k$.

To handle the case of partially observed samples, we can modify Oja's method in two ways. First, we estimate the expansion coefficients $w_k$ in (7) by solving a least-squares problem that takes into account the missing data model:
$$\hat{w}_k = \arg\min_{w \in \mathbb{R}^d} \|y_k - \Omega_k X_k w\|^2, \qquad (9)$$
where $y_k$ is the incomplete sample vector defined in (4), $\Omega_k$ is the corresponding subsampling matrix, and $X_k$ is the current estimate of the subspace. Next, we replace the missing elements in $y_k$ by the corresponding entries in $X_k \hat{w}_k$. This imputation step leads to an estimate of the full vector:
$$\hat{y}_k = y_k + (I_n - \Omega_k) X_k \hat{w}_k. \qquad (10)$$
Replacing the original vectors $s_k$ and $w_k$ in (7) by their estimated counterparts $\hat{y}_k$ and $\hat{w}_k$, we reach the modified Oja's method, the pseudocode of which is summarized in Algorithm 1. Note that, to ensure we have enough observed entries in $y_k$, we first check, with the arrival of a new partially observed vector $y_k$, whether
$$\det(X_k^\top \Omega_k X_k) > \epsilon^d \det(X_k^\top X_k), \qquad (11)$$
where $\epsilon > 0$ is a small positive constant. If this is indeed the case, we do the standard update as described above; otherwise, we simply ignore the new sample vector and do not change the estimate in this step. Note that, under a suitable probabilistic model for the subsampling process (see assumption (A.3) in Section III-C), one can show that (11) is satisfied with high probability as long as $\epsilon < \alpha$, where $\alpha$ denotes the subsampling ratio defined in assumption (A.3).

Algorithm 1 Oja's method [3] with imputation
Require: An initial estimate $X_0$ such that $X_0^\top X_0 = I_d$, a sequence of step-size parameters $\{\tau_k\}$ and a positive constant $\epsilon$.
1: $k := 0$
2: repeat
3:   if $\det(X_k^\top \Omega_k X_k) > \epsilon^d \det(X_k^\top X_k)$ then
4:     $\hat{w}_k := \arg\min_w \|y_k - \Omega_k X_k w\|^2$
5:     $\hat{y}_k := y_k + (I_n - \Omega_k) X_k \hat{w}_k$
6:     $\widetilde{X}_k := X_k + \frac{\tau_k}{n} \hat{y}_k \hat{w}_k^\top$
7:     $X_{k+1} := \widetilde{X}_k (\widetilde{X}_k^\top \widetilde{X}_k)^{-\frac{1}{2}}$
8:   else
9:     $X_{k+1} := X_k$
10:  end if
11:  $k := k + 1$
12: until termination

C. GROUSE

Similar to Oja's method, Grassmannian Rank-One Update Subspace Estimation (GROUSE) [1] is a first-order stochastic gradient descent algorithm for solving (5). The main difference is that GROUSE solves the optimization problem on the Grassmannian, the manifold of all subspaces with a fixed rank. One advantage of this approach is that it avoids the explicit orthogonalization step in (8), allowing the algorithm to achieve even lower computational complexity.

At each step, GROUSE first finds the coefficients $\hat{w}_k$ according to (9). It then computes the reconstruction error vector
$$r_k = y_k - \Omega_k p_k, \qquad (12)$$
where $p_k = X_k \hat{w}_k$. Next, it updates the current estimate $X_k$ on the Grassmannian as
$$X_{k+1} = X_k + \Big[(\cos\theta_k - 1) \frac{p_k}{\|p_k\|} + \sin\theta_k \frac{r_k}{\|r_k\|}\Big] \frac{\hat{w}_k^\top}{\|\hat{w}_k\|},$$
where
$$\theta_k = \frac{\tau_k}{n} \|r_k\| \|p_k\|, \qquad (13)$$
and $\{\tau_k\}$ is a sequence of step-size parameters. The algorithm is summarized in Algorithm 2.

Algorithm 2 GROUSE [1]
Require: An initial estimate $X_0$ such that $X_0^\top X_0 = I_d$, a sequence of step-size parameters $\{\tau_k\}$ and a positive constant $\epsilon$.
1: $k := 0$
2: repeat
3:   if $\det(X_k^\top \Omega_k X_k) > \epsilon^d \det(X_k^\top X_k)$ then
4:     $\hat{w}_k := \arg\min_w \|y_k - \Omega_k X_k w\|^2$
5:     $p_k := X_k \hat{w}_k$
6:     $r_k := y_k - \Omega_k p_k$
7:     $\theta_k := \frac{\tau_k}{n} \|r_k\| \|p_k\|$
8:     $X_{k+1} := X_k + \big[(\cos(\theta_k) - 1) \frac{p_k}{\|p_k\|} + \sin(\theta_k) \frac{r_k}{\|r_k\|}\big] \frac{\hat{w}_k^\top}{\|\hat{w}_k\|}$
9:   else
10:    $X_{k+1} := X_k$
11:  end if
12:  $k := k + 1$
13: until termination
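For concreteness, here is a Python sketch of one iteration of each of the two updates just described. This is our own illustrative code, not the authors' reference implementation: it assumes NumPy, represents $\Omega_k$ by a boolean mask, and omits the determinant test (11) for brevity.

import numpy as np

def oja_step(X, y, omega, tau, n):
    # One step of Oja's method with imputation (Algorithm 1, lines 4-7).
    w, *_ = np.linalg.lstsq(X[omega], y[omega], rcond=None)  # eq. (9)
    y_hat = y.copy()
    y_hat[~omega] = (X @ w)[~omega]                          # imputation, eq. (10)
    X_tilde = X + (tau / n) * np.outer(y_hat, w)             # eq. (7)
    Q, _ = np.linalg.qr(X_tilde)                             # orthonormalization, eq. (8)
    return Q

def grouse_step(X, y, omega, tau, n):
    # One step of GROUSE (Algorithm 2): a rank-one geodesic update on the
    # Grassmannian, with no explicit re-orthogonalization required.
    w, *_ = np.linalg.lstsq(X[omega], y[omega], rcond=None)  # eq. (9)
    p = X @ w
    r = np.zeros_like(y)
    r[omega] = y[omega] - p[omega]                           # eq. (12)
    norm_r, norm_p = np.linalg.norm(r), np.linalg.norm(p)
    if norm_r == 0.0 or norm_p == 0.0:
        return X
    theta = (tau / n) * norm_r * norm_p                      # eq. (13)
    step = (np.cos(theta) - 1.0) * p / norm_p + np.sin(theta) * r / norm_r
    return X + np.outer(step, w / np.linalg.norm(w))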

D. PETRELS

When there is no missing data, an alternative to Oja's method is a classical algorithm called Projection Approximation Subspace Tracking (PAST) [17]. This method estimates the underlying subspace $U$ by solving an exponentially weighted least-squares problem
$$X_{k+1} = \arg\min_{X \in \mathbb{R}^{n \times d}} \sum_{k'=1}^{k} \gamma^{k-k'} \|s_{k'} - X w_{k'}\|^2, \qquad (14)$$
where $w_k = X_k^\top s_k$ and $\gamma \in (0, 1]$ is a discount parameter. The solution of (14) has a simple recursive update rule
$$X_{k+1} = X_k + (s_k - X_k w_k) w_k^\top R_k \qquad (15)$$
$$R_{k+1} = \big(\gamma R_k^{-1} + w_k w_k^\top\big)^{-1}. \qquad (16)$$
Moreover, one can avoid the explicit calculation of the matrix inverse in (16) by using the Woodbury identity and the fact that (16) amounts to a rank-one update.

Parallel Subspace Estimation and Tracking by Recursive Least Squares (PETRELS) [2] extends PAST to the case of partially observed data. The main change is that it estimates the coefficients $w_k$ in (14) using (9). In addition, it provides a parallel subroutine in its calculations so that updates to different coordinates can be computed in a fully parallel fashion. In its most general form, PETRELS needs to maintain and update a different matrix $R_k^i$ for each of the $n$ coordinates. To reduce computational complexity, a simplified version of PETRELS has been provided in [2], using a common $R_k$ for all the coordinates.

In this paper, we focus on this simplified version of PETRELS, which is summarized in Algorithm 3. Note that we introduce an additional parameter $\alpha$ in lines 7 and 8 of the pseudocode. The simplified algorithm given in [2] corresponds to setting $\alpha = 1$. In our analysis, we set $\alpha$ to be equal to the subsampling ratio defined later in (25). Empirically, we find that, with this modification, the performance of the simplified algorithm matches that of the full PETRELS algorithm when the ambient dimension $n$ is large.

Algorithm 3 Simplified PETRELS [2]
Require: An initial estimate of the subspace $X_0$, $R_0 = \frac{\delta}{n} I_d$ for some $\delta > 0$, and positive constants $\gamma$ and $\epsilon$.
1: $k := 0$
2: repeat
3:   if $\det(X_k^\top \Omega_k X_k) > \epsilon^d \det(X_k^\top X_k)$ then
4:     $\hat{w}_k := \arg\min_w \|y_k - \Omega_k X_k w\|^2$
5:     $X_{k+1} := X_k + \Omega_k (y_k - X_k \hat{w}_k) \hat{w}_k^\top R_k$
6:     $v_k := \gamma^{-1} R_k \hat{w}_k$
7:     $\beta_k := 1 + \alpha \hat{w}_k^\top v_k$
8:     $R_{k+1} := \gamma^{-1} R_k - \alpha v_k v_k^\top / \beta_k$
9:   else
10:    $X_{k+1} := X_k$
11:    $R_{k+1} := R_k$
12:  end if
13:  $k := k + 1$
14: until termination

III. MAIN RESULTS: SCALING LIMITS

In this section, we present the main results of this work: a tractable and asymptotically exact analysis of the performance of the three algorithms reviewed in Section II.

A. Performance Metric: Principal Angles

We start by defining the performance metric we will be using in our analysis. Recall the generative model defined in (1). The ground-truth subspace is represented by the matrix $U$, whose column vectors form an orthonormal basis of that subspace. For Algorithms 1, 2, and 3, the estimated subspace at the $k$th step is spanned by an orthogonal matrix
$$\hat{U}_k = X_k \big(X_k^\top X_k\big)^{-\frac{1}{2}}, \qquad (17)$$
where $X_k$ is the $k$th iterate generated by the algorithms. Note that, for Oja's method and GROUSE, $\hat{U}_k = X_k$ as the matrix $X_k$ is already orthogonal, whereas for PETRELS, generally $X_k^\top X_k \ne I_d$ and thus the step in (17) is necessary.

In the special case of $d = 1$ (i.e., rank-one subspaces), both $U$ and $\hat{U}_k$ are unit-norm vectors. The degree to which these vectors are aligned can be measured by their cosine similarity, defined as $U^\top \hat{U}_k$. This concept can be naturally extended to arbitrary $d$. In general, the closeness of two $d$-dimensional subspaces may be quantified by their principal angles [18], [19]. In particular, the cosines of the principal angles are uniquely specified as the singular values of a matrix defined as
$$Q_k^{(n)} = U^\top \hat{U}_k = U^\top X_k \big(X_k^\top X_k\big)^{-\frac{1}{2}}. \qquad (18)$$
In what follows, we shall refer to $Q_k^{(n)}$ as the cosine similarity matrix. Since we will be studying the high-dimensional limit of $Q_k^{(n)}$ as the ambient dimension $n \to \infty$, we use the superscript $(n)$ to make the dependence of $Q_k^{(n)}$ on $n$ explicit.
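The metric itself takes only a few lines of code. The sketch below (ours) computes the cosine similarity matrix (18) and the principal-angle cosines for a general iterate $X_k$; it orthonormalizes via QR, which spans the same subspace as (17) and therefore leaves the singular values unchanged:

import numpy as np

def cosine_similarity_matrix(U, X):
    # Q = U^T X (X^T X)^{-1/2}, eqs. (17)-(18), computed up to a rotation.
    U_hat, _ = np.linalg.qr(X)
    return U.T @ U_hat

def principal_angle_cosines(U, X):
    # The singular values of Q are the cosines of the d principal angles.
    return np.linalg.svd(cosine_similarity_matrix(U, X), compute_uv=False)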
B. The Scaling Limits of Stochastic Processes: Main Ideas

To analyze the performance of Algorithms 1, 2, and 3, our goal boils down to tracking the evolution of the cosine similarity matrix $Q_k^{(n)}$ over time. Thanks to the streaming nature of all three methods, it is easy to see that the dynamics of their estimates $X_k$ can be modeled by homogeneous Markov chains with state space in $\mathbb{R}^{n \times d}$. Thus, being a function of $X_k$ [see (18)], the dynamics of $Q_k^{(n)}$ forms a hidden Markov chain. We then show that, as $n \to \infty$ and with proper time scaling, the family of stochastic processes $\{Q_k^{(n)}\}$ indexed by $n$ converges weakly to a deterministic function of time that is characterized as the unique solution of some ODEs. Such convergence is known in the literature as the scaling limit [10], [12], [15], [20] of stochastic processes. To present our results, we first consider a simple one-dimensional example that illustrates the underlying ideas behind scaling limits. Our main convergence theorems are presented in Section III-C.

Consider a 1-D stochastic process defined by a recursion
$$q_{k+1} = q_k + \frac{\tau}{n} f(q_k) + \frac{1}{n^{(1/2)+\delta}} v_k, \qquad (19)$$
where $f(\cdot)$ is a Lipschitz function, $\tau$ and $\delta$ are two positive constants, $\{v_k\}$ is a sequence of i.i.d. random variables with zero mean and unit variance, and $n > 0$ is a constant introduced to scale the step size and the noise variance. (This particular scaling is chosen here because it mimics the actual scaling that appears in the high-dimensional dynamics of $Q_k^{(n)}$ we shall study.)
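The recursion (19) is easy to simulate. The sketch below (our code) uses $f(q) = -q$, anticipating Example 1 below, and compares the rescaled process with the deterministic limit $q_0 e^{-\tau t}$ derived there:

import numpy as np

def simulate_q(n, tau=1.0, delta=0.25, q0=1.0, T=3.0, seed=0):
    # Run (19) with f(q) = -q for floor(nT) steps and return the trajectory.
    rng = np.random.default_rng(seed)
    q = q0
    traj = [q]
    for _ in range(int(n * T)):
        q += -(tau / n) * q + rng.standard_normal() / n ** (0.5 + delta)
        traj.append(q)
    return traj

for n in (100, 1000, 10000):
    traj = simulate_q(n)
    # q^(n)(t) at t = 1 concentrates around q0 * exp(-tau) as n grows:
    print(n, traj[n], np.exp(-1.0))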

Figure 1. Convergence of the 1-D stochastic process $q^{(n)}(t)$ described in Example 1 to its deterministic scaling limit. Here, we use $\delta = 0.25$.

When $n$ is large, the difference between $q_k$ and $q_{k+1}$ is small. In other words, we will not be able to see macroscopic changes unless we observe the process over a large number of steps. To accelerate the time (by a factor of $n$), we embed $\{q_k\}$ in continuous time by defining a piecewise-constant process
$$q^{(n)}(t) = q_{\lfloor nt \rfloor}, \qquad (20)$$
where $\lfloor \cdot \rfloor$ is the floor function. Here, $t$ is the rescaled (accelerated) time: within $t \in [0, 1]$, the original discrete-time process moves $n$ steps. Due to the scaling of the noise term in (19) (with the noise variance equal to $n^{-1-2\delta}$ for some $\delta > 0$), the rescaled stochastic process $q^{(n)}(t)$ converges to a deterministic limit function as $n \to \infty$. We illustrate this convergence behavior using the following example.

Example 1: Let us consider the special case where $f(q) = -q$. We plot in Figure 1 simulation results of $q^{(n)}(t)$ for several different values of $n$. We see that, as $n$ increases, the rescaled stochastic processes $q^{(n)}(t)$ indeed converge to some limit function (the black line in the figure), which will be denoted by $q(t)$. To prove this convergence, we first expand the recursion (19) (by using the fact that $f(q) = -q$) and get
$$q_k = \Big(1 - \frac{\tau}{n}\Big)^k q_0 + \Delta_k, \qquad (21)$$
where $\Delta_k$ is a zero-mean random variable defined as
$$\Delta_k = \frac{1}{n^{(1/2)+\delta}} \sum_{i=0}^{k-1} \Big(1 - \frac{\tau}{n}\Big)^{k-1-i} v_i.$$
Since $\{v_i\}_{i \ge 0}$ are independent random variables with unit variance,
$$\mathbb{E}\,\Delta_k^2 \le \frac{1 - (1 - \tau/n)^{2k}}{n^{1+2\delta}\big(1 - (1 - \tau/n)^2\big)} = O(n^{-2\delta}).$$
It then follows from (21) that, for any $t > 0$,
$$q^{(n)}(t) = q_{\lfloor nt \rfloor} \xrightarrow{P} \lim_{n \to \infty} \Big(1 - \frac{\tau}{n}\Big)^{\lfloor nt \rfloor} q_0 = q_0 e^{-\tau t}, \qquad (22)$$
where $\xrightarrow{P}$ stands for convergence in probability.

For a general nonlinear function $f(q)$, we can no longer directly simplify the recursion (19) as in (21). However, similar convergence behaviors of $q^{(n)}(t)$ still exist. Moreover, the limit function $q(t)$ can be characterized via an ODE. To see the origin of the ODE, we note that, for any $t > 0$ and $k = \lfloor nt \rfloor$, we may rewrite (19) as
$$\frac{q^{(n)}(t + 1/n) - q^{(n)}(t)}{1/n} = \tau f\big[q^{(n)}(t)\big] + n^{(1/2)-\delta} v_k. \qquad (23)$$
Taking the $n \to \infty$ limit on both sides of (23) and neglecting the zero-mean noise term $n^{(1/2)-\delta} v_k$, we may then write, at least in a nonrigorous way, the following ODE
$$\frac{d}{dt} q(t) = \tau f\big[q(t)\big],$$
which always has a unique solution due to the Lipschitz property of $f(\cdot)$. For instance, the ODE associated with the linear setting in Example 1 is $\frac{d}{dt} q(t) = -\tau q(t)$, whose unique solution $q(t) = q_0 e^{-\tau t}$ is indeed the limit established in (22). A rigorous justification of the above steps can be found in the theory of weak convergence of stochastic processes (see, for example, [12], [20]).

Returning from the above detour, we recall that the central objects of our analysis are the cosine similarity matrices $Q_k^{(n)}$ defined in (18). It turns out that, just like the simple 1-D process $q_k$ in (19), the matrix-valued stochastic processes $Q_k^{(n)}$, after a proper time rescaling $k = \lfloor nt \rfloor$, also converge to a deterministic limit as the ambient dimension $n \to \infty$. This phenomenon is demonstrated in Figure 2, where we plot the cosine similarity $Q_{\lfloor nt \rfloor}^{(n)}$ of GROUSE at $t = 0.5$ for different values of $n$.

Figure 2. Convergence of the cosine similarity $Q_{\lfloor nt \rfloor}^{(n)}$ associated with GROUSE at a fixed rescaled time $t = 0.5$, as $n$ increases. In this experiment, $d = 1$ and thus $Q_{\lfloor nt \rfloor}^{(n)}$ reduces to a scalar. The error bars show the standard deviation of $Q_{\lfloor nt \rfloor}^{(n)}$ over independent trials. In each trial, we randomly generate a subspace $U$, the expansion coefficients $\{c_k\}$ and the noise vectors $\{a_k\}$ as in (1). The red dashed line is the limiting value predicted by our asymptotic characterization, to be given in Theorem 1.
The standard deviations of $Q_{\lfloor nt \rfloor}^{(n)}$ over independent trials, shown as error bars in Figure 2, decrease as $n$ increases. This indicates that the performance of these stochastic algorithms can indeed be characterized by certain deterministic limits when the dimension is high.

C. The Scaling Limits of Oja's, GROUSE and PETRELS

To study the scaling limits of the cosine similarity matrices, we embed the discrete-time process $Q_k^{(n)}$ into a continuous-time process $Q^{(n)}(t)$ via a simple piecewise-constant

interpolation:
$$Q^{(n)}(t) = Q_{\lfloor nt \rfloor}^{(n)}. \qquad (24)$$
The main objective of this work is to establish the high-dimensional limit of $Q^{(n)}(t)$ as $n \to \infty$. Our asymptotic analysis is carried out under the following technical assumptions on the generative model (1) and the observation model (3).

(A.1) The elements of the noise vector $a_k$ are i.i.d. random variables with zero mean, unit variance, and finite higher-order moments;

(A.2) $c_k$ in (1) is a $d$-dimensional random vector with zero mean and a covariance matrix $\Lambda$ as given in (2). Moreover, all the higher-order moments of $c_k$ exist and are finite, and $\{c_k\}$ is independent of $\{a_k\}$;

(A.3) We assume that $\{v_{k,i}\}$ in the observation model (3) is a collection of independent and identically distributed binary random variables such that
$$\mathbb{P}(v_{k,i} = 1) = \alpha, \qquad (25)$$
for some constant $\alpha \in (0, 1)$. Throughout the paper, we refer to $\alpha$ as the subsampling ratio. We shall also assume that the algorithmic parameter $\epsilon$ used in Algorithms 1-3 satisfies the condition that $\epsilon < \alpha$.

(A.4) The subspace matrix $U$ and the initial guess $X_0$ are incoherent in the sense that
$$\sum_{i=1}^{n} \sum_{j=1}^{d} U_{i,j}^4 \le \frac{C}{n} \quad \text{and} \quad \sum_{i=1}^{n} \sum_{j=1}^{d} X_{0,i,j}^4 \le \frac{C}{n}, \qquad (26)$$
where $U_{i,j}$ and $X_{0,i,j}$ denote the $(i,j)$th entries of $U$ and $X_0$, respectively, and $C$ is a generic constant that does not depend on $n$.

(A.5) The initial cosine similarity $Q_0^{(n)}$ converges entrywise and in probability to a deterministic matrix $Q(0)$.

(A.6) For Oja's method and GROUSE, the step-size parameters $\tau_k = \tau(k/n)$, where $\tau(\cdot)$ is a deterministic function such that $\sup_t \tau(t) \le C$ for a generic constant $C$ that does not depend on $n$. For PETRELS, the discount factor
$$\gamma = 1 - \frac{\mu}{n}, \qquad (27)$$
for some constant $\mu > 0$.

Assumption (A.4) requires some further explanation. The condition (26) essentially requires the basis matrix $U$ and the initial guess $X_0$ to be generic. To see this, consider a $U$ that is drawn uniformly at random from the Grassmannian for rank-$d$ subspaces. Such a $U$ can be generated as
$$U = X (X^\top X)^{-1/2}, \qquad (28)$$
where $X$ is an $n \times d$ random matrix whose entries are i.i.d. standard normal random variables. For such a generic choice of $U$, one can show that its entries $U_{i,j} \sim O(1/\sqrt{n})$ and that (26) holds with high probability when $n$ is large.

Theorem 1 (Oja's method and GROUSE): Fix $T > 0$, and let $\{Q^{(n)}(t)\}_{t \in [0,T]}$ be the time-varying cosine similarity matrices associated with either Oja's method or GROUSE over the finite interval $t \in [0, T]$. Under assumptions (A.1)-(A.6), we have
$$\{Q^{(n)}(t)\}_{t \in [0,T]} \xrightarrow{\text{weakly}} Q(t),$$
where $\xrightarrow{\text{weakly}}$ stands for weak convergence and $Q(t)$ is a deterministic matrix-valued process. Moreover, $Q(t)$ is the unique solution of the ODE
$$\frac{d}{dt} Q(t) = F\big(Q(t), \tau(t) I_d\big), \qquad (29)$$
where $F: \mathbb{R}^{d \times d} \times \mathbb{R}^{d \times d} \to \mathbb{R}^{d \times d}$ is a matrix-valued function defined as
$$F(Q, G) = \Big[\alpha \Lambda^2 Q - \tfrac{1}{2} Q G - Q \big(I_d + \tfrac{1}{2} G\big) Q^\top \alpha \Lambda^2 Q\Big] G. \qquad (30)$$
Here $\alpha$ is the subsampling ratio, and $\Lambda$ is the diagonal covariance matrix defined in (2).

In Section V, we present a (nonrigorous) derivation of the limiting ODE (29). Full technical details and a complete proof can be found in the Supplementary Materials [16]. An interesting conclusion of this theorem is that the cosine similarity matrices $Q^{(n)}(t)$ associated with Oja's method and GROUSE converge to the same asymptotic trajectory. We will elaborate on this point in Section IV-A.
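Since (29) is a $d \times d$ system, the limit is cheap to evaluate numerically. The sketch below (our code, not part of the paper) applies a simple forward Euler scheme to (29), with the right-hand side taken from (30) and a constant step size $\tau$ assumed:

import numpy as np

def F(Q, G, alpha, lam):
    # Right-hand side (30), with aL2 standing for alpha * Lambda^2.
    aL2 = alpha * np.diag(np.asarray(lam, dtype=float) ** 2)
    I = np.eye(Q.shape[0])
    return (aL2 @ Q - 0.5 * Q @ G - Q @ (I + 0.5 * G) @ Q.T @ aL2 @ Q) @ G

def solve_oja_grouse_ode(Q0, tau, alpha, lam, T=10.0, dt=1e-3):
    # Forward Euler integration of dQ/dt = F(Q, tau * I), eq. (29).
    Q = Q0.copy()
    G = tau * np.eye(Q0.shape[0])
    for _ in range(int(T / dt)):
        Q = Q + dt * F(Q, G, alpha, lam)
    return Q

For $d = 1$ and an informative step size, the output approaches the closed-form steady state given later in Proposition 1.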
To establish the scaling limits of PETRELS, we need to introduce an auxiliary matrix
$$G_k^{(n)} = n \big(X_k^\top X_k\big)^{\frac{1}{2}} R_k \big(X_k^\top X_k\big)^{\frac{1}{2}}, \qquad (31)$$
where the matrices $R_k$ and $X_k$ are those used in Algorithm 3. Similar to (24), we embed the discrete-time process $G_k^{(n)}$ into a continuous-time process:
$$G^{(n)}(t) = G_{\lfloor nt \rfloor}^{(n)}. \qquad (32)$$
The following theorem, whose proof can be found in the Supplementary Materials [16], characterizes the asymptotic dynamics of PETRELS.

Theorem 2 (PETRELS): For any fixed $T > 0$, let $\{Q^{(n)}(t)\}_{t \in [0,T]}$ be the time-varying cosine similarity matrices associated with PETRELS on the interval $t \in [0, T]$. Let $\{G^{(n)}(t)\}_{t \in [0,T]}$ be the process defined in (32). Under assumptions (A.1)-(A.6) and as $n \to \infty$, we have
$$\{Q^{(n)}(t)\}_{t \in [0,T]} \xrightarrow{\text{weakly}} Q(t) \quad \text{and} \quad \{G^{(n)}(t)\}_{t \in [0,T]} \xrightarrow{\text{weakly}} G(t),$$
where $\{Q(t), G(t)\}$ is the unique solution of the following system of coupled ODEs:
$$\frac{d}{dt} Q(t) = F\big(Q(t), G(t)\big), \qquad (33)$$
$$\frac{d}{dt} G(t) = H\big(Q(t), G(t)\big). \qquad (34)$$
Here, $F$ is the function defined in (30) and $H$ is a function defined by
$$H(Q, G) = G\Big[\mu I_d - G(G + I_d)\big(Q^\top \alpha \Lambda^2 Q + I_d\big)\Big], \qquad (35)$$
where $\mu > 0$ is the constant given in (27).

Theorem 1 and Theorem 2 establish the scaling limits of Oja's method, GROUSE and PETRELS, respectively, as $n \to \infty$.
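The coupled system (33)-(34) can be integrated the same way. The sketch below (ours) reuses the function F from the sketch following Theorem 1 and adds the right-hand side (35):

import numpy as np

def H(Q, G, alpha, lam, mu):
    # Right-hand side (35): H(Q, G) = G [mu I - G (G + I)(Q^T aL2 Q + I)].
    aL2 = alpha * np.diag(np.asarray(lam, dtype=float) ** 2)
    I = np.eye(Q.shape[0])
    return G @ (mu * I - G @ (G + I) @ (Q.T @ aL2 @ Q + I))

def solve_petrels_ode(Q0, G0, alpha, lam, mu, T=10.0, dt=1e-3):
    # Forward Euler integration of the coupled ODEs (33)-(34);
    # F is the function from the sketch following Theorem 1.
    Q, G = Q0.copy(), G0.copy()
    for _ in range(int(T / dt)):
        Q, G = Q + dt * F(Q, G, alpha, lam), G + dt * H(Q, G, alpha, lam, mu)
    return Q, G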

In practice, the dimension $n$ is always finite, and thus the actual trajectories of the performance curves will fluctuate around their asymptotic limits. To bound such fluctuations via a finite-sample analysis, we first need to slightly strengthen assumption (A.5) as follows:

(A.7) Let $Q_0^{(n)}$ be the initial cosine similarity matrices. There exists a fixed matrix $Q(0)$ such that
$$\mathbb{E}\big\|Q_0^{(n)} - Q(0)\big\|_2 \le C n^{-1/2},$$
where $\|\cdot\|_2$ denotes the spectral norm of a matrix and $C > 0$ is a constant that does not depend on $n$.

Theorem 3 (Finite Sample Analysis): Let $Q^{(n)}(t)$ be the time-varying cosine similarity matrices associated with Oja's method, GROUSE, or PETRELS, respectively. Let $Q(t)$ denote the corresponding scaling limit given in (29), (33) or (34). Fix any $T > 0$. Under assumptions (A.1)-(A.4) and (A.6)-(A.7), for any $t \in [0, T]$, we have
$$\mathbb{E}\big\|Q^{(n)}(t) - Q(t)\big\|_2 \le \frac{C(T)}{\sqrt{n}}, \qquad (36)$$
where $C(T)$ is a constant that can depend on the terminal time $T$ but not on $n$.

The above theorem, whose proof can be found in the Supplementary Materials [16], shows that the rate of convergence towards the scaling limits is $O(1/\sqrt{n})$.

Example 2: To demonstrate the accuracy of the asymptotic characterizations given in Theorem 1 and Theorem 2, we compare the actual performance of the algorithms against their theoretical predictions in Figure 3. In our experiments, we generate a random orthogonal matrix $U$ according to (28) with $n = 2{,}000$ and $d = 4$. For Oja's method and GROUSE, we use a constant step size $\tau = 0.5$. For PETRELS, the discount factor is $\gamma = 1 - \mu/n$ with $\mu = 5$, and $R_0 = \frac{\delta}{n} I_d$ with $\delta = 1$. The covariance matrix is set to $\Lambda = \operatorname{diag}\{5, 4, 3, 2\}$ and the subsampling ratio is $\alpha = 0.5$. Figure 3(a) shows the evolution of the cosines of the 4 principal angles between $U$ and the estimates given by Oja's method (shown as crosses) and GROUSE (shown as circles). We compute the theoretical predictions of the principal angles by performing an SVD of the limiting matrices $Q(t)$ specified by the ODE (29). (In fact, this ODE has a simple analytical solution. See Section IV-B for details.) Figure 3(b) shows similar comparisons between PETRELS and its corresponding theoretical predictions. In this case, we solve the limiting ODEs (33) and (34) numerically.

Figure 3. Numerical simulations vs. asymptotic characterizations. (a) Results for Oja's method (crosses) and GROUSE (circles), where the solid lines are the theoretical predictions of the cosines of the 4 principal angles given by the solution of the ODE (29). The markers show the simulation results averaged over independent trials. In each trial, we randomly generate a subspace $U$ as in (28), the expansion coefficients $\{c_k\}$ and the noise vectors $\{a_k\}$. The error bars indicate $\pm 2$ standard deviations. (b) Similar comparisons of numerical simulations and theoretical predictions for PETRELS.

D. Related Work

The problem of estimating and tracking low-rank subspaces has received a lot of attention recently in the signal processing and learning communities. Under the setting of fully observed data, an earlier work [21] studies a block version of Oja's method and provides a sample complexity estimate for the case of $d = 1$. Similar analysis is available for general $d$-dimensional subspaces [22], [23]. The streaming version of Oja's method and its sample complexities have also been extensively studied; see, e.g., [24]-[28]. For the case of incomplete observations, the sample complexity of a block version of Oja's method with missing data is analyzed in [29] under the same generative model as in (1). In [7], the authors provide the sample complexity for learning a low-rank subspace from subsampled data under a nonparametric model much more general than (1): the complete data vectors are assumed to be i.i.d. samples from a general probability distribution on $\mathbb{R}^n$.
In the streaming setting, Oja's method, GROUSE and PETRELS are three popular algorithms for tackling the challenge of subspace learning with partial information. Other interesting approaches include online matrix completion methods [30]-[32]. See [33] for a recent review of relevant literature in this area. Local convergence of GROUSE is given in [4], [5]. Global convergence of GROUSE is established in [6] under the noiseless setting. In general, establishing finite-sample global performance guarantees for GROUSE and other algorithms such as Oja's method and PETRELS in the missing data case is still an open problem.

Unlike most work in the literature that seeks to establish finite-sample performance guarantees for various subspace

estimation algorithms, our results in this paper provide an asymptotically exact characterization of three popular methods in the high-dimensional limit. The main technical tool behind our analysis is the weak convergence of stochastic processes towards their scaling limits, which are characterized by ODEs or stochastic differential equations (see, e.g., [8]-[10], [15]). Using ODEs to analyze stochastic recursive algorithms has a long history [34], [35]. An ODE analysis of an early subspace tracking algorithm was given in [36], and this result was adapted to analyze PETRELS in the non-subsampled case [2]. Our results in this paper differ from previous analyses not only in that they can handle the more challenging case of incomplete observations. In addition, the previous ODE analyses in [2], [36] keep the ambient dimension $n$ fixed and study the asymptotic limit as the step size tends to 0. The resulting ODEs involve $O(n)$ variables. In contrast, our analysis studies the limit as the dimension $n \to \infty$, and the resulting ODEs involve at most $2d^2$ variables, where $d$ is the dimension of the subspace, which, in many practical situations, is a small constant. This low-dimensional characterization makes our limiting results more practical to use, especially when the ambient dimension $n$ is large.

It is important to point out a limitation of our asymptotic analysis: we require the initial estimate $X_0$ to be asymptotically correlated with the true subspace $U$. To see why this is an issue, we note that if the initial cosine similarity matrix $Q(0) = 0$ (i.e., a fully uncorrelated initial estimate), then the ODEs in Theorems 1 and 2 only provide a trivial solution $Q(t) \equiv 0$, yielding no useful information. In practice, a correlated initial estimate can be obtained by performing a PCA on a small batch of samples; it may also be available from additional side information about the true subspace $U$. Therefore, the requirement that $Q(0)$ be invertible is not overly restrictive. Nevertheless, we observe in numerical simulations that, under sufficiently high SNRs, Oja's method, GROUSE and PETRELS can successfully estimate the subspace by starting from random initial guesses that are uncorrelated with $U$. Extending our analysis framework to handle the case of random initial estimates is an important line of future work.

IV. IMPLICATIONS OF HIGH-DIMENSIONAL ANALYSIS

The scaling limits presented in Section III provide asymptotically exact characterizations of the dynamic performance of Oja's method, GROUSE, and PETRELS. In this section, we discuss implications of these results. Analyzing the limiting ODEs also reveals the fundamental limits and phase transition phenomena associated with the steady-state performance of these algorithms.

A. Algorithmic Insights

By examining Theorem 1 and Theorem 2, we draw the following conclusions regarding the three subspace estimation algorithms.

Figure 4. Monte Carlo simulations of the PETRELS algorithm vs. asymptotic predictions obtained by the limiting ODEs given in Theorem 2 for $d = 1$. In this case, the two matrices $Q(t)$ and $G(t)$ reduce to two scalars $Q(t)$ and $G(t)$. The variable $G(t)$ acts as an effective step size, which adaptively adjusts its value according to the change in $Q(t)$. The error bars shown in the figure represent one standard deviation over independent trials. The signal dimension is $n = 10^4$.

1. Connections and differences between the algorithms. Theorem 1 implies that, as $n \to \infty$, Oja's method and GROUSE converge to the same deterministic limit process characterized as the solution of the ODE (29). This result is
somewhat surprising, as the update rules of the two methods (see Algorithm 1 and Algorithm 2) appear to be quite different. Theorem 2 shows that PETRELS is also intricately connected to the other two algorithms. Indeed, the ODE (33) of the cosine similarity matrix $Q(t)$ for PETRELS has exactly the same form as the one for GROUSE and Oja's method shown in (29), except for the fact that the nonadaptive step size $\tau(t) I_d$ in (29) is now replaced by a matrix $G(t)$, itself governed by the ODE (34). Thus, $G(t)$ in PETRELS can be viewed as an adaptive scheme for adjusting the step size.

To investigate how $G(t)$ evolves, we run an experiment for $d = 1$. In this case, the quantities $Q(t)$, $G(t)$ and $\Lambda$ reduce to three scalars, denoted by $Q(t)$, $G(t)$, and $\lambda$, respectively. Figure 4 shows the dynamics of PETRELS as it recovers this 1-D subspace. It shows that $G(t)$ increases initially, which helps to boost the convergence speed. As $Q(t)$ increases (meaning that the estimates become more accurate), however, the effective step size $G(t)$ gradually decreases, in order to help $Q(t)$ reach a higher steady-state value.

2. Subsampling vs. the SNR. The ODEs in Theorems 1 and 2 also reveal an interesting (asymptotic) equivalence between the subsampling ratio $\alpha$ and the SNR as specified by the matrix $\Lambda$. To see this, we observe from the definitions of the two functions $F$ and $H$ in (30) and (35) that $\alpha$ always appears together with $\Lambda$ in the form of the product $\alpha \Lambda^2$. This implies that an observation model with subsampling ratio $\alpha$ and SNR $\Lambda$ will have the same asymptotic performance as a different model with subsampling ratio $\hat{\alpha}$ and SNR $\sqrt{\alpha/\hat{\alpha}}\, \Lambda$. In simpler terms, having missing data is asymptotically equivalent to lowering the SNR in the fully observed setting.

B. Oja's Method and GROUSE: Analytical Solutions and Phase Transitions

Next, we investigate the dynamics of Oja's method and GROUSE by studying the solution of the ODE given in

Theorem 1. To that end, we consider a change of variables by defining
$$P(t) = \big[Q(t) Q^\top(t)\big]^{-1}. \qquad (37)$$
One may deduce from (29) that the evolution of $P(t)$ is also governed by a first-order ODE:
$$\frac{d}{dt} P(t) = A(t) - P(t) B(t) - B(t) P(t), \qquad (38)$$
where
$$A(t) = \tau(t)\big[2 + \tau(t)\big] \alpha \Lambda^2 \qquad (39)$$
$$B(t) = \tau(t)\Big(\alpha \Lambda^2 - \frac{\tau(t)}{2} I_d\Big) \qquad (40)$$
are two diagonal matrices. Thanks to the linearity of (38), it admits an analytical solution
$$P(t) = e^{-\int_0^t B(r)\,dr}\, P(0)\, e^{-\int_0^t B(r)\,dr} + \int_0^t A(s)\, e^{-2\int_s^t B(r)\,dr}\, ds. \qquad (41)$$
Note that the first term on the right-hand side of (41) represents the influence of the initial estimate $P(0) = [Q(0) Q^\top(0)]^{-1}$ on the current state at time $t$. In the special case of the algorithms using a constant step size, i.e., $\tau(t) \equiv \tau > 0$, the solution (41) may be further simplified as
$$P(t) = e^{-tB} P(0) e^{-tB} + Z(t), \qquad (42)$$
where $Z(t) = \operatorname{diag}\{z_1(t), \ldots, z_d(t)\}$ with
$$z_l(t) = \frac{(2 + \tau)\, \alpha \lambda_l^2}{2 \alpha \lambda_l^2 - \tau} \Big(1 - e^{-\tau (2\alpha\lambda_l^2 - \tau)\, t}\Big) \qquad (43)$$
for $1 \le l \le d$. Note that if $2\alpha\lambda_l^2 - \tau = 0$ for some $l$, the above expression for $z_l$ is understood via the convention that $\frac{1}{x}\big(1 - e^{-\tau x t}\big) = \tau t$ at $x = 0$.

The formula (42) reveals a phase transition phenomenon for the steady-state performance of the two algorithms as we change the step-size parameter $\tau$. To see that, we first recall that the eigenvalues of $Q^{(n)}(t)\big(Q^{(n)}(t)\big)^\top$ are exactly equal to the squared cosines of the principal angles $\{\theta_l^{(n)}(t)\}$ between the true subspace $U$ and the estimates given by the algorithms. We say an algorithm generates an asymptotically informative solution if
$$\lim_{t \to \infty} \lim_{n \to \infty} \cos^2\big(\theta_l^{(n)}(t)\big) > 0 \quad \text{for all } 1 \le l \le d, \qquad (44)$$
i.e., the steady-state estimates of the algorithms achieve nontrivial correlations with all the directions of $U$. In contrast, a noninformative solution corresponds to
$$\lim_{t \to \infty} \lim_{n \to \infty} \cos^2\big(\theta_l^{(n)}(t)\big) = 0 \quad \text{for all } 1 \le l \le d, \qquad (45)$$
in which case the steady-state estimates carry no information about $U$. For $d > 1$, one may also have the third situation where only a subset of the directions of $U$ can be recovered (with nontrivial correlations) by the algorithm.

Proposition 1: Let $\theta_l^{(n)}(t)$ denote the $l$th principal angle between the true subspace and the estimate obtained by Oja's method or GROUSE with a constant step size $\tau$. Under the same assumptions as in Theorem 1, we have
$$\lim_{t \to \infty} \lim_{n \to \infty} \cos^2\big(\theta_l^{(n)}(t)\big) = \max\Big\{0,\ \frac{2\alpha\lambda_l^2 - \tau}{\alpha\lambda_l^2 (2 + \tau)}\Big\}, \qquad (46)$$
where $\{\lambda_l\}$ are the SNR parameters defined in (2). It follows that the two algorithms provide asymptotically informative solutions if and only if
$$\tau < 2\alpha \min_l \lambda_l^2. \qquad (47)$$

Proof: Suppose the diagonal matrix $B$ in (40) has $d_1$ positive diagonal entries (with $d_1 \le d$), and $d_2 = d - d_1$ negative or zero entries. Without loss of generality, we may assume that $B$ can be split into a block form $B = \begin{bmatrix} B_1 & 0 \\ 0 & B_2 \end{bmatrix}$ such that $B_1$ only contains the positive diagonal entries, and $B_2$ only contains the nonpositive entries. Accordingly, we split the other two matrices in (42) as $P(0) = \begin{bmatrix} P_{1,1} & P_{1,2} \\ P_{2,1} & P_{2,2} \end{bmatrix}$ and $Z(t) = \begin{bmatrix} Z_1(t) & 0 \\ 0 & Z_2(t) \end{bmatrix}$. Applying the block matrix inverse formula to (42), we get
$$P^{-1}(t) = \begin{bmatrix} W_{1,1}(t) & W_{1,2}(t) \\ W_{2,1}(t) & W_{2,2}(t) \end{bmatrix}, \qquad (48)$$
where
$$W_{1,1}(t) = \Big[e^{-tB_1} P_{1,1} e^{-tB_1} + Z_1(t) - e^{-tB_1} P_{1,2}\, e^{-tB_2} \big(e^{-tB_2} P_{2,2}\, e^{-tB_2} + Z_2\big)^{-1} e^{-tB_2} P_{2,1}\, e^{-tB_1}\Big]^{-1}.$$
It is easy to verify from the definitions of $B$ and $Z$ that
$$\lim_{t \to \infty} W_{1,1}(t) = \operatorname{diag}\Big\{\frac{2\alpha\lambda_1^2 - \tau}{\alpha\lambda_1^2(2+\tau)}, \ldots, \frac{2\alpha\lambda_{d_1}^2 - \tau}{\alpha\lambda_{d_1}^2(2+\tau)}\Big\}. \qquad (49)$$
Similarly, we may verify that
$$\lim_{t \to \infty} W_{1,2}(t) = 0_{d_1 \times d_2} \quad \text{and} \quad \lim_{t \to \infty} W_{2,2}(t) = 0_{d_2 \times d_2}. \qquad (50)$$
Substituting (49) and (50) into (48) and recalling that the eigenvalues of $P^{-1}(t)$ are exactly equal to the squared cosines of the principal angles, we reach (46). Applying the conditions given in (44) and (45) to (46) yields (47).
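In code, the predictions (46) and (47) are one-liners; the following sketch (ours) evaluates the limiting squared cosines and the phase-transition condition:

import numpy as np

def steady_state_cos2(tau, alpha, lam):
    # Eq. (46): limiting squared cosine of the l-th principal angle for
    # Oja's method or GROUSE with a constant step size tau.
    lam2 = np.asarray(lam, dtype=float) ** 2
    return np.maximum(0.0, (2 * alpha * lam2 - tau) / (alpha * lam2 * (2 + tau)))

def is_informative(tau, alpha, lam):
    # Eq. (47): all d directions are recovered iff tau < 2 alpha min_l lambda_l^2.
    return tau < 2 * alpha * np.min(np.asarray(lam, dtype=float) ** 2)

print(steady_state_cos2(0.5, 0.5, [5, 4, 3, 2]))  # one value per direction
print(is_informative(0.5, 0.5, [5, 4, 3, 2]))     # True for these parameters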
C. Steady-State Analysis of PETRELS

The steady-state properties of PETRELS can also be obtained by studying the limiting ODEs given in Theorem 2. The coupling of $Q(t)$ and $G(t)$ in (33) and (34), however, makes the analysis much more challenging. Unlike the case of Oja's method and GROUSE, we are not able to obtain closed-form analytical solutions of the ODEs for PETRELS. In what follows, we restrict our discussion to the special case of $d = 1$. This simplifies the task, as the matrix-valued ODEs (33) and (34) reduce to scalar-valued ones.

It is not hard to verify that, for any solution $\{Q(t), G(t)\}$ with an initial condition $\{Q(0), G(0)\}$, there is a symmetric solution $\{-Q(t), G(t)\}$ for the initial condition $\{-Q(0), G(0)\}$. To remove this redundancy, it is convenient to investigate the dynamics of $Q^2(t)$ and $G(t)$, which satisfy the following ODEs
$$\frac{d}{dt}\big[Q^2(t)\big] = G Q^2 \Big[2\alpha\lambda^2 - G - 2 Q^2 \Big(1 + \frac{G}{2}\Big) \alpha\lambda^2\Big] \qquad (51)$$
$$\frac{d}{dt} G(t) = G\Big[\mu - G(G+1)\big(Q^2 \alpha\lambda^2 + 1\big)\Big]. \qquad (52)$$
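Trajectories of the scalar system (51)-(52), like those shown in Figure 5 below, can be reproduced with a direct Euler integration; a minimal sketch (ours, with $\alpha\lambda^2$ passed as a single parameter):

def petrels_1d_trajectory(q2, g, alpha_lam2, mu, T=50.0, dt=1e-3):
    # Euler integration of the scalar ODEs (51)-(52) from (q2, g).
    for _ in range(int(T / dt)):
        dq2 = g * q2 * (2 * alpha_lam2 - g - 2 * q2 * (1 + 0.5 * g) * alpha_lam2)
        dg = g * (mu - g * (g + 1) * (q2 * alpha_lam2 + 1))
        q2, g = q2 + dt * dq2, g + dt * dg
    return q2, g

# Different initializations flow to the same stable fixed point:
for q2_init in (0.05, 0.5, 0.9):
    print(petrels_1d_trajectory(q2_init, 1.0, alpha_lam2=2.0, mu=5.0))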

Figure 5. Phase portraits of the nonlinear ODEs in Theorem 2, with panels (a) and (b) showing informative solutions and panels (c) and (d) showing noninformative ones. The black curves are trajectories of the solutions $(Q^2(t), G(t))$ of the ODEs starting from different initial values. The green and red curves represent nontrivial solutions of the two stationary equations $\frac{d}{dt} Q^2(t) = 0$ and $\frac{d}{dt} G(t) = 0$. Their intersection point, if it exists, is a stable fixed point of the dynamical system. The fixed points of the top two figures correspond to $Q^2(\infty) > 0$, and thus the steady-state solutions in these two cases are informative. In contrast, the fixed points of the bottom two figures are associated with noninformative steady-state solutions with $Q^2(\infty) = 0$.

Figure 5 visualizes several different solution trajectories of these ODEs as the black curves in the $(Q^2, G)$ plane. These solutions start from different initial conditions at the borders of the figures, and they converge to certain stationary points. The locations of these stationary points depend on the SNR $\lambda$, the subsampling ratio $\alpha$ and the discount parameter $\mu$ used by the algorithm. In Figures 5(a) and 5(b), the stationary points correspond to $Q^2 > 0$, and thus the algorithm generates asymptotically informative solutions according to the definition in (44). In contrast, Figure 5(c) and Figure 5(d) show situations where the steady-state solutions are noninformative.

Proposition 2: Let $d = 1$. Under the same assumptions as in Theorem 2, PETRELS generates an asymptotically informative solution if and only if
$$\mu < \Big(2\alpha\lambda^2 + \frac{1}{2}\Big)^2 - \frac{1}{4}, \qquad (53)$$
where $\mu$ is the parameter defined in (27), $\lambda$ denotes the SNR in (2), and $\alpha$ is the subsampling ratio.

Proof: It follows from Theorem 2 that verifying the conditions (44) and (45) boils down to studying the fixed points of the dynamical system governed by the limiting ODEs (51) and (52). This task is in turn equivalent to setting the left-hand sides of the ODEs to zero and solving the resulting equations. Let $\{Q^*, G^*\}$ be any solution to the equations $\frac{d}{dt} Q^2 = 0$ and $\frac{d}{dt} G = 0$. From the forms of the right-hand sides of (51) and (52), we see that $\{Q^*, G^*\}$ must fall into one of the following three cases:

Case I: $G^* = 0$ and $Q^*$ can take arbitrary values;

Case II: $Q^* = 0$ and $G^*$ is the unique positive solution to
$$G^*(G^* + 1) = \mu; \qquad (54)$$

Case III: $Q^* \ne 0$ and $G^* \ne 0$.

A local stability analysis, deferred to the end of the proof, shows that the fixed points in Case I are always unstable, in the sense that any small perturbation will make the dynamics move away from these fixed points. Thus, we just need to focus on Case II and Case III, with the former corresponding to an uninformative solution and the latter to an informative one. We will show that, under (53), a fixed point in Case III exists and is the unique stable fixed point. That solution disappears when (53) ceases to hold, in which case the solution in Case II becomes the unique stable fixed point.

To see why (53) provides the phase transition boundary, we note that a solution in Case III, if it exists, must satisfy $(Q^*)^2 = f(G^*)$ and $(Q^*)^2 = h(G^*)$, where
$$f(G) = \frac{\alpha\lambda^2 - \frac{G}{2}}{\big(1 + \frac{G}{2}\big)\alpha\lambda^2} \qquad (55)$$
$$h(G) = \frac{\frac{\mu}{G(G+1)} - 1}{\alpha\lambda^2}. \qquad (56)$$
The above two equations are derived from $\frac{d}{dt} Q^2(t) = 0$ and $\frac{d}{dt} G(t) = 0$, respectively. In Figure 5, the functions $f(G)$ and $h(G)$ are plotted as the green and red dashed lines, respectively. It is easy to verify from their definitions that $f(G)$ and $h(G)$ are both monotonically decreasing in the feasible region ($0 \le Q^2 \le 1$ and $G > 0$). Moreover, $0 = f^{-1}(1) < h^{-1}(1)$, where $f^{-1}$ and $h^{-1}$ denote the inverse functions of $f$ and $h$, respectively.
Thus, a solution in Case III exists if $f^{-1}(0) > h^{-1}(0)$, which then leads to (53) after some algebraic manipulations.

Finally, we examine the local stability of the fixed points in Case I and Case II. Note that a fixed point $(Q^*, G^*)$ of the two-dimensional ODEs (51) and (52) is stable if and only if
$$\frac{\partial}{\partial Q^2}\Big[\frac{d}{dt} Q^2(t)\Big]\Big|_{Q=Q^*, G=G^*} < 0 \quad \text{and} \quad \frac{\partial}{\partial G}\Big[\frac{d}{dt} G(t)\Big]\Big|_{Q=Q^*, G=G^*} < 0,$$
where $\frac{d}{dt} Q^2(t)$ and $\frac{d}{dt} G(t)$ are the functions on the right-hand sides of (51) and (52), respectively. It follows that all the Case I fixed points are always unstable, because $\frac{\partial}{\partial G}\big[\frac{d}{dt} G(t)\big]\big|_{G=0} = \mu > 0$. Furthermore, the Case II fixed point is also unstable if (53) holds, because
$$\frac{\partial}{\partial Q^2}\Big[\frac{d}{dt} Q^2(t)\Big]\Big|_{Q=0, G=G^*} = \big(2\alpha\lambda^2 - G^*\big) G^* > 0,$$
where $G^*$ is the value specified in (54). On the other hand, when (53) does not hold, the Case II fixed point becomes stable.

Example 3: Proposition 2 predicts a critical choice of $\mu$ (as a function of the SNR $\lambda$ and the subsampling ratio $\alpha$) that separates informative solutions from noninformative ones. This prediction is confirmed numerically in Figure 6.
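The critical curve in Figure 6 follows directly from (53); in code (ours):

def petrels_informative(mu, alpha_lam2):
    # Eq. (53): for d = 1, PETRELS is asymptotically informative iff
    # mu < (2 alpha lambda^2 + 1/2)^2 - 1/4.
    return mu < (2 * alpha_lam2 + 0.5) ** 2 - 0.25

# For alpha * lambda^2 = 2 the threshold is 4.5^2 - 0.25 = 20:
print(petrels_informative(5.0, 2.0), petrels_informative(25.0, 2.0))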

In our experiments, we set $d = 1$ and $n = 10{,}000$. We then scan the parameter space of $\mu$ and $\alpha\lambda^2$. For each choice of these two parameters on our search grid, we perform independent trials, with each trial using different realizations of $c_k$ and $a_k$ in (1) and a different $U$ drawn uniformly at random from the $n$-dimensional sphere. The grayscale in Figure 6 shows the average value of the squared cosine similarity $Q^2(t)$ at $t = 3$.

Figure 6. The grayscale in the figure visualizes the steady-state squared cosine similarities of PETRELS corresponding to different values of the SNR $\lambda^2$, the subsampling ratio $\alpha$, and the step-size parameter $\mu$. The red curve is the theoretical prediction given in Proposition 2 of a phase transition boundary, below which no informative solution can be achieved by the algorithm. The theoretical prediction matches the numerical results.

V. DERIVATIONS OF THE ODES AND PROOF SKETCHES

In this section, we present a nonrigorous derivation of the limiting ODEs and sketch the main ingredients of our proofs of Theorems 1 and 2. More technical details and the complete proofs can be found in the Supplementary Materials [16].

A. Derivation of the ODE

In what follows, we show how one may derive the limiting ODE in Theorem 1. We focus on GROUSE, but the other two algorithms can be treated similarly. For simplicity, we consider the case in which the subspace dimension is $d = 1$. In this case, the true subspace $U$ in (1) and its estimate $X_k$ given by Algorithm 2 reduce to vectors $u$ and $x_k$, respectively. The covariance matrix $\Lambda$ in (2) also reduces to a scalar $\lambda$. Consequently, the weight vector $\hat{w}_k$ obtained in (9) becomes a scalar $\hat{w}_k = x_k^\top \Omega_k s_k / \|\Omega_k x_k\|^2$.

Our first observation is that the dynamics of GROUSE can be modeled by a Markov chain $(x_k, u_k)$ on $\mathbb{R}^{2n}$, where $u_k \equiv u$ for all $k$. The update rule of this Markov chain is
$$x_{k+1} - x_k = \Big[(\cos(\theta_k) - 1) \frac{p_k}{\|p_k\|} + \sin(\theta_k) \frac{r_k}{\|r_k\|}\Big] \mathbb{1}_{A_k}, \qquad (57)$$
where $A_k = \{\|\Omega_k x_k\|^2 > \epsilon\}$. Here, the indicator function $\mathbb{1}_{A_k}$ encodes the test in line 3 of Algorithm 2. Since we are considering the special case of $d = 1$, the vectors $r_k$ and $p_k$ as originally defined in (12) can be rewritten as
$$r_k = \Omega_k (s_k - p_k) \qquad (58)$$
$$p_k = x_k\, \frac{x_k^\top \Omega_k s_k}{\|\Omega_k x_k\|^2}. \qquad (59)$$
Multiplying both sides of (57) from the left by $u^\top$, we get
$$Q_{k+1} - Q_k = \frac{1}{n}\, g_k, \qquad (60)$$
where
$$g_k = n \Big[(\cos(\theta_k) - 1) \frac{u^\top p_k}{\|p_k\|} + \sin(\theta_k) \frac{u^\top r_k}{\|r_k\|}\Big] \mathbb{1}_{A_k} \qquad (61)$$
specifies the increment of the cosine similarity from $Q_k$ to $Q_{k+1}$. To derive the limiting ODE, we first rewrite (60) as
$$\frac{Q_{k+1} - Q_k}{1/n} = \mathbb{E}_k\, g_k + (g_k - \mathbb{E}_k\, g_k), \qquad (62)$$
where $\mathbb{E}_k$ denotes conditional expectation with respect to all the random elements encountered up to step $k$, i.e., $\{c_j, a_j, \Omega_j\}_{j<k}$ in the generative model (1). One can show that
$$\mathbb{E}_k (g_k - \mathbb{E}_k\, g_k)^2 = O(1) \qquad (63)$$
and
$$\mathbb{E}_k\, g_k = F(Q_k, \tau_k) + O(1/\sqrt{n}), \qquad (64)$$
where $F(\cdot, \cdot)$ is the function defined in (30). Substituting (64) into (62) and omitting the zero-mean difference term $(g_k - \mathbb{E}_k\, g_k)$, we get
$$\frac{Q_{k+1} - Q_k}{1/n} = F(Q_k, \tau_k) + O(1/\sqrt{n}). \qquad (65)$$
Let $Q(t)$ be a continuous-time process defined as in (24), with $t = k/n$ being the rescaled time. In an intuitive but nonrigorous way, we then have $\frac{Q_{k+1} - Q_k}{1/n} \to \frac{d}{dt} Q(t)$ as $n \to \infty$. This then gives us the ODE in (29).

In what follows, we provide some additional details behind the estimate in (64). To simplify our presentation, we first introduce a few variables. Let
$$z_k = \|\Omega_k x_k\|^2, \quad \tilde{z}_k = \frac{1}{n}\|\Omega_k s_k\|^2, \quad \tilde{p}_k = u^\top \Omega_k s_k, \quad \tilde{q}_k = x_k^\top \Omega_k s_k, \quad \tilde{Q}_k = u^\top \Omega_k x_k. \qquad (66)$$
Since $\|u\| = \|x_k\| = 1$, all these variables are $O(1)$ quantities when $n \to \infty$. (See Lemma 5 in the Supplementary Materials.) Given its definition in (13), we rewrite $\theta_k$ used in (57) as
$$\theta_k^2 = \frac{\tau_k^2}{n}\, \frac{\tilde{q}_k^2}{z_k} \Big[\tilde{z}_k - \frac{\tilde{q}_k^2}{n z_k}\Big] = O(1/n).$$
Thus, it is natural to expand the two terms $\cos(\theta_k)$ and $\sin(\theta_k)$ that appear in (57) via a Taylor series expansion, which yields
$$\cos(\theta_k) = 1 - \frac{\tau_k^2 \|r_k\|^2 \|p_k\|^2}{2 n^2} + O(n^{-2}) \qquad (67)$$
$$\sin(\theta_k) = \frac{\tau_k}{n} \|r_k\| \|p_k\| + O(n^{-3/2}).$$
Substituting (67) into (61) gives us
$$g_k = \frac{\tau_k}{z_k} \Big[\tilde{p}_k \tilde{q}_k - \frac{\tilde{q}_k^2}{z_k}\Big(\tilde{Q}_k + \frac{\tau_k}{2}\, \tilde{z}_k\, Q_k\Big)\Big] \mathbb{1}_{\{z_k > \epsilon\}} + O(n^{-1/2}). \qquad (68)$$
A rigorous justification of this step is presented as Lemma 8 in the Supplementary Materials [16].


Online ICA: Understanding Global Dynamics of Nonconvex Optimization via Diffusion Processes Online ICA: Unerstaning Global Dynamics of Nonconvex Optimization via Diffusion Processes Chris Junchi Li Zhaoran Wang Han Liu Department of Operations Research an Financial Engineering, Princeton University

More information

The derivative of a function f(x) is another function, defined in terms of a limiting expression: f(x + δx) f(x)

The derivative of a function f(x) is another function, defined in terms of a limiting expression: f(x + δx) f(x) Y. D. Chong (2016) MH2801: Complex Methos for the Sciences 1. Derivatives The erivative of a function f(x) is another function, efine in terms of a limiting expression: f (x) f (x) lim x δx 0 f(x + δx)

More information

Quantum mechanical approaches to the virial

Quantum mechanical approaches to the virial Quantum mechanical approaches to the virial S.LeBohec Department of Physics an Astronomy, University of Utah, Salt Lae City, UT 84112, USA Date: June 30 th 2015 In this note, we approach the virial from

More information

Euler equations for multiple integrals

Euler equations for multiple integrals Euler equations for multiple integrals January 22, 2013 Contents 1 Reminer of multivariable calculus 2 1.1 Vector ifferentiation......................... 2 1.2 Matrix ifferentiation........................

More information

All s Well That Ends Well: Supplementary Proofs

All s Well That Ends Well: Supplementary Proofs All s Well That Ens Well: Guarantee Resolution of Simultaneous Rigi Boy Impact 1:1 All s Well That Ens Well: Supplementary Proofs This ocument complements the paper All s Well That Ens Well: Guarantee

More information

Diagonalization of Matrices Dr. E. Jacobs

Diagonalization of Matrices Dr. E. Jacobs Diagonalization of Matrices Dr. E. Jacobs One of the very interesting lessons in this course is how certain algebraic techniques can be use to solve ifferential equations. The purpose of these notes is

More information

Sturm-Liouville Theory

Sturm-Liouville Theory LECTURE 5 Sturm-Liouville Theory In the three preceing lectures I emonstrate the utility of Fourier series in solving PDE/BVPs. As we ll now see, Fourier series are just the tip of the iceberg of the theory

More information

A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks

A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks A PAC-Bayesian Approach to Spectrally-Normalize Margin Bouns for Neural Networks Behnam Neyshabur, Srinah Bhojanapalli, Davi McAllester, Nathan Srebro Toyota Technological Institute at Chicago {bneyshabur,

More information

Generalization of the persistent random walk to dimensions greater than 1

Generalization of the persistent random walk to dimensions greater than 1 PHYSICAL REVIEW E VOLUME 58, NUMBER 6 DECEMBER 1998 Generalization of the persistent ranom walk to imensions greater than 1 Marián Boguñá, Josep M. Porrà, an Jaume Masoliver Departament e Física Fonamental,

More information

Generalized Tractability for Multivariate Problems

Generalized Tractability for Multivariate Problems Generalize Tractability for Multivariate Problems Part II: Linear Tensor Prouct Problems, Linear Information, an Unrestricte Tractability Michael Gnewuch Department of Computer Science, University of Kiel,

More information

Admin BACKPROPAGATION. Neural network. Neural network 11/3/16. Assignment 7. Assignment 8 Goals today. David Kauchak CS158 Fall 2016

Admin BACKPROPAGATION. Neural network. Neural network 11/3/16. Assignment 7. Assignment 8 Goals today. David Kauchak CS158 Fall 2016 Amin Assignment 7 Assignment 8 Goals toay BACKPROPAGATION Davi Kauchak CS58 Fall 206 Neural network Neural network inputs inputs some inputs are provie/ entere Iniviual perceptrons/ neurons Neural network

More information

Equilibrium in Queues Under Unknown Service Times and Service Value

Equilibrium in Queues Under Unknown Service Times and Service Value University of Pennsylvania ScholarlyCommons Finance Papers Wharton Faculty Research 1-2014 Equilibrium in Queues Uner Unknown Service Times an Service Value Laurens Debo Senthil K. Veeraraghavan University

More information

Tractability results for weighted Banach spaces of smooth functions

Tractability results for weighted Banach spaces of smooth functions Tractability results for weighte Banach spaces of smooth functions Markus Weimar Mathematisches Institut, Universität Jena Ernst-Abbe-Platz 2, 07740 Jena, Germany email: markus.weimar@uni-jena.e March

More information

Error Floors in LDPC Codes: Fast Simulation, Bounds and Hardware Emulation

Error Floors in LDPC Codes: Fast Simulation, Bounds and Hardware Emulation Error Floors in LDPC Coes: Fast Simulation, Bouns an Harware Emulation Pamela Lee, Lara Dolecek, Zhengya Zhang, Venkat Anantharam, Borivoje Nikolic, an Martin J. Wainwright EECS Department University of

More information

The total derivative. Chapter Lagrangian and Eulerian approaches

The total derivative. Chapter Lagrangian and Eulerian approaches Chapter 5 The total erivative 51 Lagrangian an Eulerian approaches The representation of a flui through scalar or vector fiels means that each physical quantity uner consieration is escribe as a function

More information

LECTURE NOTES ON DVORETZKY S THEOREM

LECTURE NOTES ON DVORETZKY S THEOREM LECTURE NOTES ON DVORETZKY S THEOREM STEVEN HEILMAN Abstract. We present the first half of the paper [S]. In particular, the results below, unless otherwise state, shoul be attribute to G. Schechtman.

More information

Monotonicity for excited random walk in high dimensions

Monotonicity for excited random walk in high dimensions Monotonicity for excite ranom walk in high imensions Remco van er Hofsta Mark Holmes March, 2009 Abstract We prove that the rift θ, β) for excite ranom walk in imension is monotone in the excitement parameter

More information

On combinatorial approaches to compressed sensing

On combinatorial approaches to compressed sensing On combinatorial approaches to compresse sensing Abolreza Abolhosseini Moghaam an Hayer Raha Department of Electrical an Computer Engineering, Michigan State University, East Lansing, MI, U.S. Emails:{abolhos,raha}@msu.eu

More information

Leaving Randomness to Nature: d-dimensional Product Codes through the lens of Generalized-LDPC codes

Leaving Randomness to Nature: d-dimensional Product Codes through the lens of Generalized-LDPC codes Leaving Ranomness to Nature: -Dimensional Prouct Coes through the lens of Generalize-LDPC coes Tavor Baharav, Kannan Ramchanran Dept. of Electrical Engineering an Computer Sciences, U.C. Berkeley {tavorb,

More information

05 The Continuum Limit and the Wave Equation

05 The Continuum Limit and the Wave Equation Utah State University DigitalCommons@USU Founations of Wave Phenomena Physics, Department of 1-1-2004 05 The Continuum Limit an the Wave Equation Charles G. Torre Department of Physics, Utah State University,

More information

On the Surprising Behavior of Distance Metrics in High Dimensional Space

On the Surprising Behavior of Distance Metrics in High Dimensional Space On the Surprising Behavior of Distance Metrics in High Dimensional Space Charu C. Aggarwal, Alexaner Hinneburg 2, an Daniel A. Keim 2 IBM T. J. Watson Research Center Yortown Heights, NY 0598, USA. charu@watson.ibm.com

More information

Hyperbolic Moment Equations Using Quadrature-Based Projection Methods

Hyperbolic Moment Equations Using Quadrature-Based Projection Methods Hyperbolic Moment Equations Using Quarature-Base Projection Methos J. Koellermeier an M. Torrilhon Department of Mathematics, RWTH Aachen University, Aachen, Germany Abstract. Kinetic equations like the

More information

Agmon Kolmogorov Inequalities on l 2 (Z d )

Agmon Kolmogorov Inequalities on l 2 (Z d ) Journal of Mathematics Research; Vol. 6, No. ; 04 ISSN 96-9795 E-ISSN 96-9809 Publishe by Canaian Center of Science an Eucation Agmon Kolmogorov Inequalities on l (Z ) Arman Sahovic Mathematics Department,

More information

θ x = f ( x,t) could be written as

θ x = f ( x,t) could be written as 9. Higher orer PDEs as systems of first-orer PDEs. Hyperbolic systems. For PDEs, as for ODEs, we may reuce the orer by efining new epenent variables. For example, in the case of the wave equation, (1)

More information

Lecture XII. where Φ is called the potential function. Let us introduce spherical coordinates defined through the relations

Lecture XII. where Φ is called the potential function. Let us introduce spherical coordinates defined through the relations Lecture XII Abstract We introuce the Laplace equation in spherical coorinates an apply the metho of separation of variables to solve it. This will generate three linear orinary secon orer ifferential equations:

More information

Multi-View Clustering via Canonical Correlation Analysis

Multi-View Clustering via Canonical Correlation Analysis Keywors: multi-view learning, clustering, canonical correlation analysis Abstract Clustering ata in high-imensions is believe to be a har problem in general. A number of efficient clustering algorithms

More information

12.11 Laplace s Equation in Cylindrical and

12.11 Laplace s Equation in Cylindrical and SEC. 2. Laplace s Equation in Cylinrical an Spherical Coorinates. Potential 593 2. Laplace s Equation in Cylinrical an Spherical Coorinates. Potential One of the most important PDEs in physics an engineering

More information

Dot trajectories in the superposition of random screens: analysis and synthesis

Dot trajectories in the superposition of random screens: analysis and synthesis 1472 J. Opt. Soc. Am. A/ Vol. 21, No. 8/ August 2004 Isaac Amiror Dot trajectories in the superposition of ranom screens: analysis an synthesis Isaac Amiror Laboratoire e Systèmes Périphériques, Ecole

More information

arxiv: v1 [math.mg] 10 Apr 2018

arxiv: v1 [math.mg] 10 Apr 2018 ON THE VOLUME BOUND IN THE DVORETZKY ROGERS LEMMA FERENC FODOR, MÁRTON NASZÓDI, AND TAMÁS ZARNÓCZ arxiv:1804.03444v1 [math.mg] 10 Apr 2018 Abstract. The classical Dvoretzky Rogers lemma provies a eterministic

More information

Robustness and Perturbations of Minimal Bases

Robustness and Perturbations of Minimal Bases Robustness an Perturbations of Minimal Bases Paul Van Dooren an Froilán M Dopico December 9, 2016 Abstract Polynomial minimal bases of rational vector subspaces are a classical concept that plays an important

More information

The Principle of Least Action

The Principle of Least Action Chapter 7. The Principle of Least Action 7.1 Force Methos vs. Energy Methos We have so far stuie two istinct ways of analyzing physics problems: force methos, basically consisting of the application of

More information

Multi-View Clustering via Canonical Correlation Analysis

Multi-View Clustering via Canonical Correlation Analysis Kamalika Chauhuri ITA, UC San Diego, 9500 Gilman Drive, La Jolla, CA Sham M. Kakae Karen Livescu Karthik Sriharan Toyota Technological Institute at Chicago, 6045 S. Kenwoo Ave., Chicago, IL kamalika@soe.ucs.eu

More information

Lower bounds on Locality Sensitive Hashing

Lower bounds on Locality Sensitive Hashing Lower bouns on Locality Sensitive Hashing Rajeev Motwani Assaf Naor Rina Panigrahy Abstract Given a metric space (X, X ), c 1, r > 0, an p, q [0, 1], a istribution over mappings H : X N is calle a (r,

More information

How to Minimize Maximum Regret in Repeated Decision-Making

How to Minimize Maximum Regret in Repeated Decision-Making How to Minimize Maximum Regret in Repeate Decision-Making Karl H. Schlag July 3 2003 Economics Department, European University Institute, Via ella Piazzuola 43, 033 Florence, Italy, Tel: 0039-0-4689, email:

More information

Sharp Thresholds. Zachary Hamaker. March 15, 2010

Sharp Thresholds. Zachary Hamaker. March 15, 2010 Sharp Threshols Zachary Hamaker March 15, 2010 Abstract The Kolmogorov Zero-One law states that for tail events on infinite-imensional probability spaces, the probability must be either zero or one. Behavior

More information

ALGEBRAIC AND ANALYTIC PROPERTIES OF ARITHMETIC FUNCTIONS

ALGEBRAIC AND ANALYTIC PROPERTIES OF ARITHMETIC FUNCTIONS ALGEBRAIC AND ANALYTIC PROPERTIES OF ARITHMETIC FUNCTIONS MARK SCHACHNER Abstract. When consiere as an algebraic space, the set of arithmetic functions equippe with the operations of pointwise aition an

More information

TOEPLITZ AND POSITIVE SEMIDEFINITE COMPLETION PROBLEM FOR CYCLE GRAPH

TOEPLITZ AND POSITIVE SEMIDEFINITE COMPLETION PROBLEM FOR CYCLE GRAPH English NUMERICAL MATHEMATICS Vol14, No1 Series A Journal of Chinese Universities Feb 2005 TOEPLITZ AND POSITIVE SEMIDEFINITE COMPLETION PROBLEM FOR CYCLE GRAPH He Ming( Λ) Michael K Ng(Ξ ) Abstract We

More information

Quantum Mechanics in Three Dimensions

Quantum Mechanics in Three Dimensions Physics 342 Lecture 20 Quantum Mechanics in Three Dimensions Lecture 20 Physics 342 Quantum Mechanics I Monay, March 24th, 2008 We begin our spherical solutions with the simplest possible case zero potential.

More information

Parameter estimation: A new approach to weighting a priori information

Parameter estimation: A new approach to weighting a priori information Parameter estimation: A new approach to weighting a priori information J.L. Mea Department of Mathematics, Boise State University, Boise, ID 83725-555 E-mail: jmea@boisestate.eu Abstract. We propose a

More information

arxiv: v1 [cs.lg] 22 Mar 2014

arxiv: v1 [cs.lg] 22 Mar 2014 CUR lgorithm with Incomplete Matrix Observation Rong Jin an Shenghuo Zhu Dept. of Computer Science an Engineering, Michigan State University, rongjin@msu.eu NEC Laboratories merica, Inc., zsh@nec-labs.com

More information

. Using a multinomial model gives us the following equation for P d. , with respect to same length term sequences.

. Using a multinomial model gives us the following equation for P d. , with respect to same length term sequences. S 63 Lecture 8 2/2/26 Lecturer Lillian Lee Scribes Peter Babinski, Davi Lin Basic Language Moeling Approach I. Special ase of LM-base Approach a. Recap of Formulas an Terms b. Fixing θ? c. About that Multinomial

More information

ensembles When working with density operators, we can use this connection to define a generalized Bloch vector: v x Tr x, v y Tr y

ensembles When working with density operators, we can use this connection to define a generalized Bloch vector: v x Tr x, v y Tr y Ph195a lecture notes, 1/3/01 Density operators for spin- 1 ensembles So far in our iscussion of spin- 1 systems, we have restricte our attention to the case of pure states an Hamiltonian evolution. Toay

More information

Linear First-Order Equations

Linear First-Order Equations 5 Linear First-Orer Equations Linear first-orer ifferential equations make up another important class of ifferential equations that commonly arise in applications an are relatively easy to solve (in theory)

More information

Optimized Schwarz Methods with the Yin-Yang Grid for Shallow Water Equations

Optimized Schwarz Methods with the Yin-Yang Grid for Shallow Water Equations Optimize Schwarz Methos with the Yin-Yang Gri for Shallow Water Equations Abessama Qaouri Recherche en prévision numérique, Atmospheric Science an Technology Directorate, Environment Canaa, Dorval, Québec,

More information

Switching Time Optimization in Discretized Hybrid Dynamical Systems

Switching Time Optimization in Discretized Hybrid Dynamical Systems Switching Time Optimization in Discretize Hybri Dynamical Systems Kathrin Flaßkamp, To Murphey, an Sina Ober-Blöbaum Abstract Switching time optimization (STO) arises in systems that have a finite set

More information

6 General properties of an autonomous system of two first order ODE

6 General properties of an autonomous system of two first order ODE 6 General properties of an autonomous system of two first orer ODE Here we embark on stuying the autonomous system of two first orer ifferential equations of the form ẋ 1 = f 1 (, x 2 ), ẋ 2 = f 2 (, x

More information

TEMPORAL AND TIME-FREQUENCY CORRELATION-BASED BLIND SOURCE SEPARATION METHODS. Yannick DEVILLE

TEMPORAL AND TIME-FREQUENCY CORRELATION-BASED BLIND SOURCE SEPARATION METHODS. Yannick DEVILLE TEMPORAL AND TIME-FREQUENCY CORRELATION-BASED BLIND SOURCE SEPARATION METHODS Yannick DEVILLE Université Paul Sabatier Laboratoire Acoustique, Métrologie, Instrumentation Bât. 3RB2, 8 Route e Narbonne,

More information

TMA 4195 Matematisk modellering Exam Tuesday December 16, :00 13:00 Problems and solution with additional comments

TMA 4195 Matematisk modellering Exam Tuesday December 16, :00 13:00 Problems and solution with additional comments Problem F U L W D g m 3 2 s 2 0 0 0 0 2 kg 0 0 0 0 0 0 Table : Dimension matrix TMA 495 Matematisk moellering Exam Tuesay December 6, 2008 09:00 3:00 Problems an solution with aitional comments The necessary

More information

BEYOND THE CONSTRUCTION OF OPTIMAL SWITCHING SURFACES FOR AUTONOMOUS HYBRID SYSTEMS. Mauro Boccadoro Magnus Egerstedt Paolo Valigi Yorai Wardi

BEYOND THE CONSTRUCTION OF OPTIMAL SWITCHING SURFACES FOR AUTONOMOUS HYBRID SYSTEMS. Mauro Boccadoro Magnus Egerstedt Paolo Valigi Yorai Wardi BEYOND THE CONSTRUCTION OF OPTIMAL SWITCHING SURFACES FOR AUTONOMOUS HYBRID SYSTEMS Mauro Boccaoro Magnus Egerstet Paolo Valigi Yorai Wari {boccaoro,valigi}@iei.unipg.it Dipartimento i Ingegneria Elettronica

More information

Discrete Mathematics

Discrete Mathematics Discrete Mathematics 309 (009) 86 869 Contents lists available at ScienceDirect Discrete Mathematics journal homepage: wwwelseviercom/locate/isc Profile vectors in the lattice of subspaces Dániel Gerbner

More information

Multi-View Clustering via Canonical Correlation Analysis

Multi-View Clustering via Canonical Correlation Analysis Kamalika Chauhuri ITA, UC San Diego, 9500 Gilman Drive, La Jolla, CA Sham M. Kakae Karen Livescu Karthik Sriharan Toyota Technological Institute at Chicago, 6045 S. Kenwoo Ave., Chicago, IL kamalika@soe.ucs.eu

More information

Simultaneous Input and State Estimation with a Delay

Simultaneous Input and State Estimation with a Delay 15 IEEE 5th Annual Conference on Decision an Control (CDC) December 15-18, 15. Osaa, Japan Simultaneous Input an State Estimation with a Delay Sze Zheng Yong a Minghui Zhu b Emilio Frazzoli a Abstract

More information

Assignment 1. g i (x 1,..., x n ) dx i = 0. i=1

Assignment 1. g i (x 1,..., x n ) dx i = 0. i=1 Assignment 1 Golstein 1.4 The equations of motion for the rolling isk are special cases of general linear ifferential equations of constraint of the form g i (x 1,..., x n x i = 0. i=1 A constraint conition

More information

IERCU. Institute of Economic Research, Chuo University 50th Anniversary Special Issues. Discussion Paper No.210

IERCU. Institute of Economic Research, Chuo University 50th Anniversary Special Issues. Discussion Paper No.210 IERCU Institute of Economic Research, Chuo University 50th Anniversary Special Issues Discussion Paper No.210 Discrete an Continuous Dynamics in Nonlinear Monopolies Akio Matsumoto Chuo University Ferenc

More information

arxiv: v2 [math.pr] 27 Nov 2018

arxiv: v2 [math.pr] 27 Nov 2018 Range an spee of rotor wals on trees arxiv:15.57v [math.pr] 7 Nov 1 Wilfrie Huss an Ecaterina Sava-Huss November, 1 Abstract We prove a law of large numbers for the range of rotor wals with ranom initial

More information

The effect of dissipation on solutions of the complex KdV equation

The effect of dissipation on solutions of the complex KdV equation Mathematics an Computers in Simulation 69 (25) 589 599 The effect of issipation on solutions of the complex KV equation Jiahong Wu a,, Juan-Ming Yuan a,b a Department of Mathematics, Oklahoma State University,

More information

A Review of Multiple Try MCMC algorithms for Signal Processing

A Review of Multiple Try MCMC algorithms for Signal Processing A Review of Multiple Try MCMC algorithms for Signal Processing Luca Martino Image Processing Lab., Universitat e València (Spain) Universia Carlos III e Mari, Leganes (Spain) Abstract Many applications

More information

II. First variation of functionals

II. First variation of functionals II. First variation of functionals The erivative of a function being zero is a necessary conition for the etremum of that function in orinary calculus. Let us now tackle the question of the equivalent

More information

FLUCTUATIONS IN THE NUMBER OF POINTS ON SMOOTH PLANE CURVES OVER FINITE FIELDS. 1. Introduction

FLUCTUATIONS IN THE NUMBER OF POINTS ON SMOOTH PLANE CURVES OVER FINITE FIELDS. 1. Introduction FLUCTUATIONS IN THE NUMBER OF POINTS ON SMOOTH PLANE CURVES OVER FINITE FIELDS ALINA BUCUR, CHANTAL DAVID, BROOKE FEIGON, MATILDE LALÍN 1 Introuction In this note, we stuy the fluctuations in the number

More information

Implicit Differentiation

Implicit Differentiation Implicit Differentiation Thus far, the functions we have been concerne with have been efine explicitly. A function is efine explicitly if the output is given irectly in terms of the input. For instance,

More information

KNN Particle Filters for Dynamic Hybrid Bayesian Networks

KNN Particle Filters for Dynamic Hybrid Bayesian Networks KNN Particle Filters for Dynamic Hybri Bayesian Networs H. D. Chen an K. C. Chang Dept. of Systems Engineering an Operations Research George Mason University MS 4A6, 4400 University Dr. Fairfax, VA 22030

More information

Function Spaces. 1 Hilbert Spaces

Function Spaces. 1 Hilbert Spaces Function Spaces A function space is a set of functions F that has some structure. Often a nonparametric regression function or classifier is chosen to lie in some function space, where the assume structure

More information

State observers and recursive filters in classical feedback control theory

State observers and recursive filters in classical feedback control theory State observers an recursive filters in classical feeback control theory State-feeback control example: secon-orer system Consier the riven secon-orer system q q q u x q x q x x x x Here u coul represent

More information

Optimal Variable-Structure Control Tracking of Spacecraft Maneuvers

Optimal Variable-Structure Control Tracking of Spacecraft Maneuvers Optimal Variable-Structure Control racking of Spacecraft Maneuvers John L. Crassiis 1 Srinivas R. Vaali F. Lanis Markley 3 Introuction In recent years, much effort has been evote to the close-loop esign

More information

ON THE OPTIMALITY SYSTEM FOR A 1 D EULER FLOW PROBLEM

ON THE OPTIMALITY SYSTEM FOR A 1 D EULER FLOW PROBLEM ON THE OPTIMALITY SYSTEM FOR A D EULER FLOW PROBLEM Eugene M. Cliff Matthias Heinkenschloss y Ajit R. Shenoy z Interisciplinary Center for Applie Mathematics Virginia Tech Blacksburg, Virginia 46 Abstract

More information

Energy behaviour of the Boris method for charged-particle dynamics

Energy behaviour of the Boris method for charged-particle dynamics Version of 25 April 218 Energy behaviour of the Boris metho for charge-particle ynamics Ernst Hairer 1, Christian Lubich 2 Abstract The Boris algorithm is a wiely use numerical integrator for the motion

More information

The Subtree Size Profile of Plane-oriented Recursive Trees

The Subtree Size Profile of Plane-oriented Recursive Trees The Subtree Size Profile of Plane-oriente Recursive Trees Michael FUCHS Department of Applie Mathematics National Chiao Tung University Hsinchu, 3, Taiwan Email: mfuchs@math.nctu.eu.tw Abstract In this

More information

SINGULAR PERTURBATION AND STATIONARY SOLUTIONS OF PARABOLIC EQUATIONS IN GAUSS-SOBOLEV SPACES

SINGULAR PERTURBATION AND STATIONARY SOLUTIONS OF PARABOLIC EQUATIONS IN GAUSS-SOBOLEV SPACES Communications on Stochastic Analysis Vol. 2, No. 2 (28) 289-36 Serials Publications www.serialspublications.com SINGULAR PERTURBATION AND STATIONARY SOLUTIONS OF PARABOLIC EQUATIONS IN GAUSS-SOBOLEV SPACES

More information

Hyperbolic Systems of Equations Posed on Erroneous Curved Domains

Hyperbolic Systems of Equations Posed on Erroneous Curved Domains Hyperbolic Systems of Equations Pose on Erroneous Curve Domains Jan Norström a, Samira Nikkar b a Department of Mathematics, Computational Mathematics, Linköping University, SE-58 83 Linköping, Sween (

More information

Perturbation Analysis and Optimization of Stochastic Flow Networks

Perturbation Analysis and Optimization of Stochastic Flow Networks IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. XX, NO. Y, MMM 2004 1 Perturbation Analysis an Optimization of Stochastic Flow Networks Gang Sun, Christos G. Cassanras, Yorai Wari, Christos G. Panayiotou,

More information

The Exact Form and General Integrating Factors

The Exact Form and General Integrating Factors 7 The Exact Form an General Integrating Factors In the previous chapters, we ve seen how separable an linear ifferential equations can be solve using methos for converting them to forms that can be easily

More information