Optimal rates of linear convergence of relaxed alternating projections and generalized Douglas-Rachford methods for two subspaces


Optimal rates of linear convergence of relaxed alternating projections and generalized Douglas-Rachford methods for two subspaces

Heinz H. Bauschke, J.Y. Bello Cruz, Tran T.A. Nghia, Hung M. Phan, and Xianfu Wang

November 8, 2015

Abstract. We systematically study the optimal linear convergence rates for several relaxed alternating projection methods and the generalized Douglas-Rachford splitting methods for finding the projection on the intersection of two subspaces. Our analysis is based on a study of the linear convergence rates of the powers of matrices. We show that the optimal linear convergence rate of powers of matrices is attained if and only if all subdominant eigenvalues of the matrix are semisimple. For the convenience of computation, a nonlinear approach to the partially relaxed alternating projection method with at least the same optimal convergence rate is also provided. Numerical experiments validate our convergence analysis.

2010 Mathematics Subject Classification: Primary 65F10, 65K05; Secondary 65F15, 65B05, 15A18, 90C25, 41A25

Keywords: Convergent and semi-convergent matrix, Friedrichs angle, generalized Douglas-Rachford method, linear convergence, principal angle, relaxed alternating projection method.

1 Introduction

The method of alternating projections and the Douglas-Rachford method play important roles in convex optimization; see, e.g., [1, 2, 3, 6, 11, 14, 15, 16, 18, 20]. They are also widely used in differential equations and signal processing [3, 9]. In order to study convergence rates, error bounds have been given for the method of alternating projections in [2, 17, 27], [18, Theorem 9.8] and [2, Section 3.4], and for the Douglas-Rachford method in [5, 16], [30] and [3, Proposition 4]. The purpose of this paper is to give a systematic convergence rate analysis of relaxed alternating projections, partial relaxed alternating projections, and the generalized Douglas-Rachford method for two subspaces in finite dimensional spaces.

Affiliations: Mathematics, University of British Columbia, Kelowna, B.C. V1V 1V7, Canada, heinz.bauschke@ubc.ca. IME, Federal University of Goiás, Goiânia, G.O., Brazil, yunier@impa.br and yunier@ufg.br. Mathematics & Statistics, Oakland University, Rochester, MI 48309, USA, nttran@oakland.edu. Department of Mathematical Sciences, University of Massachusetts Lowell, 265 Riverside St., Olney Hall, Lowell, MA 01854, USA, Hung_Phan@uml.edu. Mathematics, University of British Columbia, Kelowna, B.C. V1V 1V7, Canada, shawn.wang@ubc.ca.

The optimal convergence rates are explicitly given in terms of the relaxation parameters and the sines or cosines of principal angles. Our results extend the work by Demanet and Zhang [16] and the work by Bauschke, Deutsch, Hundal and Park [6]. Our quantification of the optimal convergence rate in terms of relaxation parameters and principal angles sheds light on how to choose parameters in practical applications.

To this end, we need a study of the optimal linear convergence rate of the powers of a real or complex matrix A. Necessary and sufficient conditions for such convergence were first established by Hensel [26] and later by Oldenburger [41]. The convergence and its asymptotic rate play a central role in many well-known algorithms for solving linear systems such as the Jacobi, Gauss-Seidel, and successive over-relaxation methods; see, e.g., [10, 35, 38, 4]. Furthermore, the convergence of the powers A^k in operator norm is linear, and the rate, which is fundamentally different from the asymptotic one mentioned above, is dominated by the second-largest modulus of the eigenvalues of A, denoted γ(A), i.e., the modulus of the subdominant or controlling eigenvalues [8, 40]. Natural questions thus arising are: What is the optimal (smallest) linear convergence rate? And when is γ(A) exactly the optimal linear convergence rate? In general, the optimal linear convergence rate does not exist (see Example 2.11 below). However, many iterative linear methods such as the method of alternating projections (also known as von Neumann's method) [2, 17] and the Douglas-Rachford splitting algorithm [19, 20, 9, 3] do attain their optimal linear rates of convergence in operator norm; see also [5, 16, 27].

The rest of the paper is organized as follows. In Section 2 we study optimal linear convergence rates of matrices. We give a necessary and sufficient condition for the powers A^k to converge linearly with the optimal rate γ(A) via the semisimpleness of the subdominant eigenvalues. The sufficient part is similar to and can be obtained from [43, Theorem 2.9], in which the norm of the matrix power is computed exactly by the power of the spectral radius when all the dominant eigenvalues are nondefective, i.e., semisimple; see also our Remark 2.19 for further details. However, to obtain the necessary part, we develop a systematic analysis of matrix powers, derive in Lemma 2.14 a new bound for their norms, which is sharper than the one in [43, Theorem 2.9], and apply the results to optimal convergence rates of convergent matrices.

The main contribution of the paper is developed in Sections 3, 4 and 5. In Section 3, using the results of Section 2, we analyze the optimal linear convergence rates of the relaxed alternating projection methods [1, 33] and also the generalized Douglas-Rachford splitting methods [16] with parameters for two linear subspaces. To the best of our knowledge, our optimal rates of the relaxed alternating methods established here via principal angles [9] are new in the literature and they significantly improve the rate of the classical alternating method. Furthermore, our optimal rate of the generalized Douglas-Rachford splitting methods extends the similar result obtained implicitly in Demanet-Zhang [16, Section 2.6], without assuming the trivial intersection of the two subspaces as in [16]. In Section 4 we introduce and study a nonlinear map that helps to accelerate significantly the convergence of alternating projection methods. This map also allows us to overcome the difficulty of computing the principal angles used to determine the parameter in the relaxed/partial relaxed alternating methods mentioned above.
In particular, this generalizes one of the results by Bauschke-Deutsch-Hundal-Park [6, Theorem 3.8]. In Section 5, we provide some numerical results to illustrate the convergence theory developed in the earlier sections. The numerical experiments indicate that the relaxed alternating projection and partial relaxed alternating projection methods perform better than the method of alternating projections in general. Finally, we present our conclusions in Section 6.

Notation. Throughout, we denote by C^{n×n} and R^{n×n} the sets of n×n complex matrices and real matrices, respectively. Let A be a matrix in C^{n×n} (or R^{n×n}). The notation A* stands for the adjoint (complex conjugate transpose) matrix of A. The matrix norm used in this paper is the operator norm, i.e., ‖A‖ = max{‖Ax‖ : x ∈ C^n, ‖x‖ ≤ 1}, the induced matrix norm.

We write ker A, ran A, and rank A for the kernel, the range, and the rank of A, respectively. Moreover, Fix A := ker(A − Id) is the set of fixed points of A, where Id is the identity mapping. We say A is nonexpansive if ‖Ax‖ ≤ ‖x‖ for all x ∈ C^n; furthermore, A is firmly nonexpansive if ‖Ax‖² + ‖x − Ax‖² ≤ ‖x‖² for all x ∈ C^n. For any subspace U of R^n, we use P_U for the orthogonal projection operator onto U, dim U for the dimension of U, and U^⊥ for the orthogonal complement of U. We denote by I_n, 0_n, 0_{m×n} the n×n identity matrix, the n×n zero matrix, and the m×n zero matrix, respectively. N is the set of nonnegative integers {0, 1, ...}.

2 Preliminary results: the optimal convergence rate of matrices

In this section we establish conditions under which convergent matrices attain the optimal convergence rate. Our analysis in later sections hinges on these results on matrices. Let us begin with:

2.1 Some definitions and well-known facts about matrices

Definition 2.1 (convergent matrices) Let A ∈ C^{n×n}. We say A is convergent¹ to A^∞ ∈ C^{n×n} if and only if
(1) ‖A^k − A^∞‖ → 0 as k → ∞.
We say A is linearly convergent to A^∞ with rate² µ ∈ [0, 1) if there are some M, N > 0 such that
(2) ‖A^k − A^∞‖ ≤ Mµ^k for all k > N, k ∈ N.
Then µ is called a linear convergence rate of A. When the infimum of all the convergence rates is also a convergence rate, we say this minimum is the optimal linear convergence rate.

For any A ∈ C^{n×n} we denote by σ(A) the spectrum of A, the set of all eigenvalues. The spectral radius [37, Example 7.1.4] of A is defined by
(3) ρ(A) := max{ |λ| : λ ∈ σ(A) }.
The next fact is the classical formula for the spectral radius.

Fact 2.2 (spectral radius formula) ([37]) Let A ∈ C^{n×n}. Then we have
(4) ρ(A) = lim_{k→∞} ‖A^k‖^{1/k}.

With λ ∈ σ(A), recall from [37, page 587] that index(λ) is the smallest positive integer k satisfying rank(A − λ Id)^k = rank(A − λ Id)^{k+1}. Furthermore, we say λ ∈ σ(A) is semisimple if index(λ) = 1; see, e.g., [37, Exercise 7.8.4].

Fact 2.3 For A ∈ C^{n×n}, λ ∈ σ(A) is semisimple if and only if ker(A − λ Id) = ker(A − λ Id)².

¹ In the literature, A is called convergent if the power A^k converges to 0; moreover, A is semi-convergent whenever the limit of A^k exists. To avoid the confusion of these two terminologies, we just say A is convergent in both cases.
² This is significantly different from the asymptotic convergence rate [10, p. 199].
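The quantities introduced so far are easy to check numerically. The following small Python/NumPy sketch (not part of the paper's development; the helper names, tolerances and the test matrix are our own choices) estimates ρ(A), the subdominant modulus γ(A) introduced in Definition 2.10 below, and tests semisimplicity of an eigenvalue via the rank version of Fact 2.3, namely rank(A − λ Id) = rank(A − λ Id)².

```python
# Illustrative sketch only: spectral radius, subdominant modulus, and a
# numerical semisimplicity test based on Fact 2.3 (equal kernels of
# A - lambda*Id and its square, i.e. equal ranks).
import numpy as np

def gamma(A, tol=1e-10):
    """max{|lam| : lam in {0} U sigma(A) \\ {1}} (cf. Definition 2.10)."""
    eigvals = np.linalg.eigvals(A)
    moduli = [abs(lam) for lam in eigvals if abs(lam - 1.0) > tol]
    return max(moduli, default=0.0)

def is_semisimple(A, lam, tol=1e-10):
    """lambda is semisimple iff ker(A - lam Id) = ker((A - lam Id)^2),
    equivalently the two matrices have the same rank."""
    B = A - lam * np.eye(A.shape[0])
    return np.linalg.matrix_rank(B, tol) == np.linalg.matrix_rank(B @ B, tol)

if __name__ == "__main__":
    A = np.array([[1.0, 0.0, 0.0],
                  [0.0, 0.5, 1.0],
                  [0.0, 0.0, 0.5]])   # eigenvalue 1 simple, eigenvalue 0.5 defective
    print("rho(A)   =", max(abs(np.linalg.eigvals(A))))
    print("gamma(A) =", gamma(A))
    print("0.5 semisimple?", is_semisimple(A, 0.5))   # False: index 2
    print("1.0 semisimple?", is_semisimple(A, 1.0))   # True
```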

Proof. Note that λ ∈ σ(A) is semisimple if and only if dim[ker(A − λ Id)] = n − rank(A − λ Id) = n − rank(A − λ Id)² = dim[ker(A − λ Id)²]. Since ker(A − λ Id) ⊆ ker(A − λ Id)², the equality dim[ker(A − λ Id)] = dim[ker(A − λ Id)²] holds if and only if ker(A − λ Id) = ker(A − λ Id)². This verifies the fact.

The following result, taken from [37], gives us a complete characterization of a convergent matrix.

Fact 2.4 (limits of powers) ([37, pages 629 and 630]) For A ∈ C^{n×n}, lim_{k→∞} A^k exists if and only if
(5) ρ(A) < 1, or else
(6) ρ(A) = 1 and λ = 1 is semisimple and it is the only eigenvalue on the unit circle.
When this happens, we have
(7) lim_{k→∞} A^k = A^∞ = the projector onto ker(A − Id) along ran(A − Id).
In particular, when ρ(A) < 1, we have A^∞ = 0.

The proof of the above fact is indeed based on the spectral resolution of A^k stated below.

Fact 2.5 (spectral resolution of A^k) ([37, pages 603 and 629]) For k ∈ N and A ∈ C^{n×n} with σ(A) = {λ_1, λ_2, ..., λ_s} and k_i = index(λ_i), we have
(8) A^k = Σ_{i=1}^s λ_i^k G_i + Σ_{i=1}^s Σ_{j=1}^{k_i−1} \binom{k}{j} λ_i^{k−j} (A − λ_i Id)^j G_i,
where the spectral projectors G_i have the following properties:
(i) G_i is the projector onto ker((A − λ_i Id)^{k_i}) along ran((A − λ_i Id)^{k_i}).
(ii) G_1 + G_2 + ... + G_s = Id.
(iii) G_i G_j = 0 when i ≠ j.
(iv) N_i = (A − λ_i Id)G_i = G_i(A − λ_i Id) is nilpotent of index k_i, i.e., N_i^{k_i} = 0 and N_i^{k_i−1} ≠ 0.
Furthermore, the second sum in (8) disappears when index(λ_i) = 1 for all i = 1, ..., s.

Remark 2.6 Note from Fact 2.5(i) and (iv) that 0 ≠ N_i^{k_i−1} = (A − λ_i Id)^{k_i−1} G_i^{k_i−1} = (A − λ_i Id)^{k_i−1} G_i if k_i > 1.

The limit A^∞ in Fact 2.4 is an oblique projector, not necessarily an orthogonal projector. However, we have:

Corollary 2.7 Suppose that A ∈ C^{n×n} is convergent to A^∞ ∈ C^{n×n}. Then the following hold:
(i) A^∞ = P_{Fix A} if and only if Fix A = Fix A*.
(ii) If A is nonexpansive or normal, then A^∞ = P_{Fix A}.

Proof. It follows from (7) that A^∞ is equal to the projector onto ker(A − Id) along ran(A − Id). Thanks to the equality [37, (5.9.11)], we have ran(A − Id) = ran(A^∞ − Id). If A^∞ = P_{Fix A}, we obtain
ran(A − Id) = ran(A^∞ − Id) = ran(P_{Fix A} − Id) = ran(P_{(Fix A)^⊥}) = (Fix A)^⊥.
It follows that
Fix A = [ (Fix A)^⊥ ]^⊥ = [ran(A − Id)]^⊥ = ker(A* − Id) = Fix A*.
Conversely, if Fix A = Fix A*, we have ker(A − Id) = Fix A = Fix A* = ker(A* − Id) = [ran(A − Id)]^⊥, which implies in turn that the projector onto ker(A − Id) along ran(A − Id) is exactly the orthogonal projection P_{Fix A}. The first part (i) of the corollary is complete.

To justify the second part (ii), suppose in addition that A is nonexpansive. Then Fix A = Fix A* by [6, Lemma 2.1] and thus A is convergent to P_{Fix A}. Moreover, if A is normal, then A − Id is also normal. Hence, for all x ∈ C^n we have
‖(A − Id)x‖² = ⟨(A − Id)*(A − Id)x, x⟩ = ⟨(A − Id)(A − Id)*x, x⟩ = ‖(A − Id)*x‖².
The latter clearly shows that Fix A = Fix A* and thus A^∞ = P_{Fix A}. The proof is complete.

Remark 2.8 (convergence, firm nonexpansiveness and nonexpansiveness) Let A ∈ R^{n×n}. When A is firmly nonexpansive, A is convergent; see, e.g., [3, Example 5.17]. However, the converse implication fails. Indeed, consider, for n ≥ 2,
A = [ 0  n^{−2} ; n  0 ].
Then A is not (firmly) nonexpansive because Ae_1 = n e_2, where e_1 = (1, 0) and e_2 = (0, 1). On the other hand, the characteristic polynomial is λ² − n^{−1}, which has roots ±n^{−1/2}. Thus A is convergent due to Fact 2.4. Moreover, convergence and nonexpansiveness are independent; e.g., A = −Id is nonexpansive but not convergent.

2.2 Asymptotic convergence rates of convergent matrices

We will prove later in this section that whenever A is convergent to A^∞, it is linearly convergent with rates not smaller than ρ(A − A^∞). To develop this idea, let us now consider the case of diagonalizable matrices.

Example 2.9 (diagonalizable case) Suppose that A ∈ C^{n×n} is diagonalizable and that σ(A) = {λ_1, ..., λ_s} with 1 = λ_1 > |λ_2| ≥ |λ_3| ≥ ... ≥ |λ_s|. Note that all eigenvalues λ_1, ..., λ_s are semisimple when A is diagonalizable.

By Fact 2.5 and Fact 2.4, A is convergent to A^∞ and
A^k = A^∞ + λ_2^k G_2 + ... + λ_s^k G_s,
which yields A^k − A^∞ = λ_2^k G_2 + ... + λ_s^k G_s. It follows that
‖A^k − A^∞‖ ≤ |λ_2|^k [ ‖G_2‖ + (|λ_3|/|λ_2|)^k ‖G_3‖ + ... + (|λ_s|/|λ_2|)^k ‖G_s‖ ] ≤ |λ_2|^k ( ‖G_2‖ + ... + ‖G_s‖ ).
Hence A^k → A^∞ with the linear rate |λ_2|. In general, an eigenvalue having the second-largest modulus after 1 is called a subdominant eigenvalue.

Definition 2.10 (subdominant eigenvalues) ([8, 40]) For A ∈ C^{n×n}, we define
(10) γ(A) := max{ |λ| : λ ∈ {0} ∪ σ(A) \ {1} }.
An eigenvalue λ ∈ σ(A) satisfying |λ| = γ(A) is referred to as a subdominant eigenvalue.

When A is not diagonalizable, γ(A) need not be a convergence rate.

Example 2.11 Let us consider the matrix
A = [ 1/2  0 ; 1  1/2 ],
which gives us γ(A) = 1/2. Note also that A is not diagonalizable. Moreover, by induction it is easy to check that
A^k = [ 1/2^k  0 ; k/2^{k−1}  1/2^k ] for all k ∈ N.
Hence we have A^k → A^∞ := 0 as k → ∞. However, observe that
‖A^k − A^∞‖ / γ(A)^k = ‖ [ 1  0 ; 2k  1 ] ‖ ≥ 2k → ∞ as k → ∞.
Hence γ(A) is not a convergence rate. However, observe further that any µ ∈ (1/2, 1) is a convergence rate of A. Thus A does not attain the optimal linear convergence rate.

The result below shows that whenever a matrix A is convergent, it must be linearly convergent with any rate in (γ(A), 1). One may also use the spectral decomposition [34, Proposition 1] to prove it. Here, our proof is a bit different, but it is necessary for our further study in the paper.
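The behaviour in Example 2.11 is easy to observe numerically. The short NumPy sketch below (illustrative only; the matrix is the reconstruction used above) prints ‖A^k − A^∞‖ / γ(A)^k, which grows roughly like 2k instead of staying bounded, so no bound of the form Mγ(A)^k can hold.

```python
# Quick numerical check of Example 2.11 (sketch): for the defective matrix
# A = [[1/2, 0], [1, 1/2]], gamma(A) = 1/2 is NOT a convergence rate, since
# ||A^k - A^inf|| / gamma(A)^k grows linearly in k.
import numpy as np

A = np.array([[0.5, 0.0],
              [1.0, 0.5]])
A_inf = np.zeros((2, 2))          # rho(A) = 1/2 < 1, so A^k -> 0
gam = 0.5

Ak = np.eye(2)
for k in range(1, 61):
    Ak = Ak @ A
    if k % 10 == 0:
        ratio = np.linalg.norm(Ak - A_inf, 2) / gam**k
        print(f"k = {k:3d}   ||A^k - A_inf|| / gamma^k = {ratio:.2f}")
# The printed ratios increase roughly like 2k, so no bound M*gamma(A)^k exists;
# any rate mu in (1/2, 1) does work, in line with Theorem 2.12(i).
```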

Theorem 2.12 (rate of convergence I) Suppose that A ∈ C^{n×n} is convergent to A^∞ ∈ C^{n×n}. Then we have γ(A) = ρ(A − A^∞) < 1 and
(11) (A − A^∞)^k = A^k − A^∞ for all k ≥ 1.
Moreover, the following two assertions are satisfied:
(i) A is linearly convergent with any rate µ ∈ (γ(A), 1).
(ii) If A is linearly convergent with rate µ ∈ [0, 1), then µ ∈ [γ(A), 1).

Proof. First let us justify that γ(A) = ρ(A − A^∞) < 1 and (11) by considering the following two cases taken from Fact 2.4.

Case 1: ρ(A) < 1. In this case we have A^∞ = 0 by (4). It follows that γ(A) = ρ(A) = ρ(A − A^∞) < 1. Note also that (11) is trivial, since A^∞ = 0.

Case 2: ρ(A) = 1, and λ = 1 is semisimple and the only eigenvalue on the unit circle. Suppose that σ(A) \ {1} = {λ_2, ..., λ_s} with 1 > |λ_2| ≥ ... ≥ |λ_s|. The Jordan decomposition [37, page 590] of A allows us to find an invertible matrix P ∈ C^{n×n} and r > 0 such that
(12) A = PJP^{−1} with J being the Jordan form of A, J = diag( I_r, J(λ_2), ..., J(λ_s) ), J(λ_j) = diag( J_1(λ_j), J_2(λ_j), ..., J_{t_j}(λ_j) ),
where each Jordan block J_k(λ_j) has λ_j on its diagonal and ones on its superdiagonal, index(λ_j) = max{ d_{jk} : k = 1, ..., t_j } with d_{jk} the dimension of the block J_k(λ_j), and t_j = dim(ker(A − λ_j Id)). Moreover, it follows from [37, p. 629] that
(13) A^∞ = P diag( I_r, 0_{n−r} ) P^{−1}.
This together with the Jordan decomposition above gives us
(14) A − A^∞ = P diag( 0_r, J(λ_2), ..., J(λ_s) ) P^{−1},
which readily yields ρ(A − A^∞) = max{0, |λ_2|} = γ(A) < 1. Observe further from (12) and (13) that AA^∞ = A^∞A = (A^∞)² = A^∞. For any k ∈ N the latter gives us
(A^k − A^∞)(A − A^∞) = A^{k+1} − A^kA^∞ − A^∞A + (A^∞)² = A^{k+1} − A^∞ − A^∞ + A^∞ = A^{k+1} − A^∞.

By using this expression, we may prove (11) by induction, and this completes the first part of the theorem.

Now to verify (i), pick any µ ∈ (γ(A), 1) = (ρ(A − A^∞), 1). Employing (4) for the operator A − A^∞ allows us to find some N ∈ N such that
‖A^k − A^∞‖ = ‖(A − A^∞)^k‖ ≤ µ^k for all k ≥ N,
which verifies the linear convergence of A with rate µ.

It remains to prove (ii). Suppose that A is convergent to A^∞ with rate µ ∈ [0, 1). Hence there are some M, N > 0 such that ‖A^k − A^∞‖ ≤ Mµ^k for all k > N, k ∈ N. Combining this with the spectral radius formula (4) and (11) gives us
γ(A) = ρ(A − A^∞) = lim_{k→∞} ‖(A − A^∞)^k‖^{1/k} = lim_{k→∞} ‖A^k − A^∞‖^{1/k} ≤ lim_{k→∞} M^{1/k} µ = µ,
which ensures γ(A) ≤ µ and thus completes the proof of the theorem.

Remark 2.13 We observe from (14) that if λ ∈ σ(A) \ {0, 1}, then λ ∈ σ(A − A^∞) and its index does not change.

2.3 The optimal convergence rate of convergent matrices

A natural question arising from the above theorem is in which case γ(A) is the optimal linear convergence rate of A; see our Definition 2.1. By Theorem 2.12, the actual problem is to describe when γ(A) is a convergence rate of A; see also our Example 2.11. Theorem 2.15 below gives us a complete answer to this question. To prepare, we need the following lemma, which enhances the spectral radius formula in Fact 2.2 and is of its own interest. It is worth noting that one may prove it by using the Jordan decomposition [37, pages 590 and 618] with similar complexity.

Lemma 2.14 Let A ∈ C^{n×n} with spectral radius ρ(A) > 0. Define α := max{ index(λ) : λ ∈ σ(A), |λ| = ρ(A) }. We have
(15) 0 < limsup_{k→∞} ‖A^k‖ / ( \binom{k}{α−1} ρ(A)^k ) < ∞.

Proof. For the matrix A, denote the set of distinct eigenvalues in σ(A) by {λ_1, ..., λ_s} with ρ(A) = |λ_1| ≥ |λ_2| ≥ ... ≥ |λ_s| and k_i = index(λ_i), i = 1, ..., s. We get from (8) that
(16) A^k = Σ_{i=1}^s λ_i^k G_i + Σ_{i=1}^s Σ_{j=1}^{k_i−1} \binom{k}{j} λ_i^{k−j} (A − λ_i Id)^j G_i.
Denote
(17) E := {1, ..., s}, F := { l ∈ N : |λ_l| = ρ(A), 1 ≤ l ≤ s }, and S := { i ∈ F : index(λ_i) = α }.

It is clear that ∅ ≠ S ⊆ F. By (16) we have
(18) A^k = Σ_{i∈S} Σ_{j=0}^{k_i−1} \binom{k}{j} λ_i^{k−j} (A − λ_i Id)^j G_i + Σ_{i∈E\S} Σ_{j=0}^{k_i−1} \binom{k}{j} λ_i^{k−j} (A − λ_i Id)^j G_i =: H + K.
Note that
(19) ‖K‖ / ( \binom{k}{α−1} |λ_1|^k ) ≤ Σ_{i∈(E\S)∩F} Σ_{j=0}^{k_i−1} [ \binom{k}{j} |λ_i|^{k−j} / ( \binom{k}{α−1} |λ_1|^k ) ] ‖(A − λ_i Id)^j G_i‖ + Σ_{i∈E\F} Σ_{j=0}^{k_i−1} [ \binom{k}{j} |λ_i|^{k−j} / ( \binom{k}{α−1} |λ_1|^k ) ] ‖(A − λ_i Id)^j G_i‖.
For i ∈ (E\S)∩F and 0 ≤ j ≤ k_i−1, observe from the definition of α in (17) that j ≤ α − 2. It follows that
(20) \binom{k}{j} |λ_i|^{k−j} / ( \binom{k}{α−1} |λ_1|^k ) = [ (α−1)! (k−α+1)! / ( j! (k−j)! ) ] |λ_1|^{−j} ≤ [ (α−1)! / ( j! (k−α+2)^{α−1−j} ) ] |λ_1|^{−j} =: ε_1(k) → 0 as k → ∞.
For i ∈ E\F and 0 ≤ j ≤ k_i−1, we have
(21) \binom{k}{j} |λ_i|^{k−j} / ( \binom{k}{α−1} |λ_1|^k ) ≤ k^j ( |λ_i| / |λ_1| )^{k−j} |λ_1|^{−j} =: ε_2(k) → 0,
since k^j is polynomial in k and ( |λ_i| / |λ_1| )^{k−j} is exponential with |λ_i| / |λ_1| < 1 for i ∉ F. It follows from (19), (20), and (21) that
(22) ‖K‖ / ( \binom{k}{α−1} |λ_1|^k ) ≤ Σ_{i∈(E\S)∩F} Σ_{j=0}^{k_i−1} ε_1(k) ‖(A − λ_i Id)^j G_i‖ + Σ_{i∈E\F} Σ_{j=0}^{k_i−1} ε_2(k) ‖(A − λ_i Id)^j G_i‖ → 0.

Now note from (18) that
(23) H / ( \binom{k}{α−1} |λ_1|^k ) = Σ_{i∈S} Σ_{j=0}^{α−1} [ \binom{k}{j} λ_i^{k−j} / ( \binom{k}{α−1} |λ_1|^k ) ] (A − λ_i Id)^j G_i = Σ_{i∈S} ( λ_i^k / |λ_1|^k ) λ_i^{−(α−1)} (A − λ_i Id)^{α−1} G_i + Σ_{i∈S} Σ_{j=0}^{α−2} [ \binom{k}{j} λ_i^{k−j} / ( \binom{k}{α−1} |λ_1|^k ) ] (A − λ_i Id)^j G_i =: Σ_{i∈S} ( λ_i^k / |λ_1|^k ) λ_i^{−(α−1)} (A − λ_i Id)^{α−1} G_i + H_1.
Furthermore, for i ∈ S and j ≤ α − 2, similarly to (20) we may prove that
\binom{k}{j} |λ_i|^{k−j} / ( \binom{k}{α−1} |λ_1|^k ) =: ε_3(k) → 0 as k → ∞,
which implies in turn that
(24) ‖H_1‖ ≤ Σ_{i∈S} Σ_{j=0}^{α−2} ε_3(k) ‖(A − λ_i Id)^j G_i‖ → 0 as k → ∞.
By dividing (18) by \binom{k}{α−1} |λ_1|^k and letting k → ∞, we get from (18), (22), (23), and (24) that
(25) limsup_{k→∞} ‖A^k‖ / ( \binom{k}{α−1} ρ(A)^k ) = limsup_{k→∞} ‖ Σ_{i∈S} ( λ_i^k / |λ_1|^k ) λ_i^{−(α−1)} (A − λ_i Id)^{α−1} G_i ‖
(26) ≤ Σ_{i∈S} |λ_i|^{−(α−1)} ‖(A − λ_i Id)^{α−1} G_i‖ < ∞,
which verifies the right-hand inequality in (15). Furthermore, since |λ_i / λ_1| = 1 for all i ∈ S, by passing to subsequences we may assume without loss of generality that for each i ∈ S the sequence [ λ_i / |λ_1| ]^k → x_i with |x_i| = 1 as k → ∞. Hence, it follows from (25) that
(27) limsup_{k→∞} ‖A^k‖ / ( \binom{k}{α−1} ρ(A)^k ) ≥ ‖ Σ_{i∈S} x_i λ_i^{−(α−1)} (A − λ_i Id)^{α−1} G_i ‖.
The left-hand inequality in (15) is justified when Σ_{i∈S} x_i λ_i^{−(α−1)} (A − λ_i Id)^{α−1} G_i ≠ 0. By contradiction, suppose that Σ_{i∈S} x_i λ_i^{−(α−1)} (A − λ_i Id)^{α−1} G_i = 0.

By multiplying this equality by G_l with l ∈ S, we get from Fact 2.5(i) and (iii) that x_l λ_l^{−(α−1)} (A − λ_l Id)^{α−1} G_l = 0, which is impossible since x_l ≠ 0, |λ_l| = |λ_1| ≠ 0, and (A − λ_l Id)^{α−1} G_l ≠ 0 by Remark 2.6. The proof is complete.

Theorem 2.15 (rate of convergence II) Let A ∈ C^{n×n} be convergent to A^∞ ∈ C^{n×n}. Then γ(A) is the optimal linear convergence rate of A if and only if all the subdominant eigenvalues are semisimple.

Proof. By Theorem 2.12, we only need to prove that γ(A) is a linear convergence rate of A if and only if λ is semisimple for every eigenvalue λ ∈ σ(A) satisfying |λ| = γ(A). Define α := max{ index(λ) : λ ∈ σ(A), |λ| = γ(A) }. If γ(A) = 0, then by (8) we have A^k = A^∞ for all k ≥ 1. This means that A = A^∞ and A² = A, which ensures that γ(A) = 0 is semisimple and γ(A) = 0 is a convergence rate. Thus the statement of the theorem is trivial in this case. It remains to prove the theorem when γ(A) > 0.

If A is linearly convergent to A^∞ with the rate γ(A), we find M, N such that ‖A^k − A^∞‖ ≤ Mγ(A)^k for all k > N, k ∈ N. Note from Theorem 2.12 that γ(A) = ρ(A − A^∞). Thanks to Remark 2.13, all subdominant eigenvalues of A are eigenvalues of A − A^∞ and their indices do not change. It follows from (11) and Lemma 2.14 applied to A − A^∞ that
0 < limsup_{k→∞} ‖(A − A^∞)^k‖ / ( \binom{k}{α−1} ρ(A − A^∞)^k ) = limsup_{k→∞} ‖A^k − A^∞‖ / ( \binom{k}{α−1} γ(A)^k ) ≤ limsup_{k→∞} M / \binom{k}{α−1},
which yields α = 1 and thus all subdominant eigenvalues are semisimple.

Conversely, if all subdominant eigenvalues are semisimple, we have α = 1. Applying Lemma 2.14 again to A − A^∞ gives us
limsup_{k→∞} ‖A^k − A^∞‖ / γ(A)^k = limsup_{k→∞} ‖(A − A^∞)^k‖ / ( \binom{k}{α−1} ρ(A − A^∞)^k ) < ∞,
which verifies that γ(A) is a convergence rate of A. The proof is complete.

Remark 2.16 It is worth mentioning that Example 2.9 is also a direct consequence of Theorem 2.15, since all the eigenvalues of A are semisimple when A is diagonalizable. Moreover, γ(A) is not a convergence rate in Example 2.11, since 1/2 = γ(A) is not semisimple in that case.

Next let us summarize Fact 2.4, Theorem 2.12, and Theorem 2.15 in the following result, which provides a complete characterization for obtaining the optimal convergence rate.

Theorem 2.17 (optimal convergence rate) Let A ∈ C^{n×n}. Then A is convergent with the optimal linear convergence rate γ(A) if and only if one of the following holds:
(i) ρ(A) < 1 and all λ ∈ σ(A) satisfying |λ| = γ(A) are semisimple.
(ii) ρ(A) = 1, λ = 1 is the only eigenvalue on the unit circle, λ = 1 is semisimple, and all λ ∈ σ(A) satisfying |λ| = γ(A) are semisimple.

Proof. If A is convergent with the optimal linear convergence rate, Theorem 2.12 tells us that γ(A) is the optimal convergence rate. Moreover, (i) and (ii) follow from Fact 2.4 and Theorem 2.15. Conversely, if (i) or (ii) holds, we also get from Fact 2.4 and Theorem 2.15 that A is convergent with the optimal rate γ(A).

Theorem 2.18 Let A ∈ C^{n×n} be convergent to A^∞. Then we have
(28) ‖A^k − A^∞‖ ≤ ‖A − A^∞‖^k, and thus γ(A) ≤ ‖A − A^∞‖.
Furthermore, if A is normal then we have
(29) ‖A^k − A^∞‖ = ‖A − A^∞‖^k,
and γ(A) = ‖A − A^∞‖ is the optimal convergence rate of A.

Proof. First, observe from (11) in Theorem 2.12 that ‖A^k − A^∞‖ = ‖(A − A^∞)^k‖ ≤ ‖A − A^∞‖^k, which clearly ensures (28) and thus γ(A) = ρ(A − A^∞) ≤ ‖A − A^∞‖. To justify the second part, suppose that A is convergent and normal. We claim that A − A^∞ is also normal. This is trivial when A^∞ = 0. It remains to consider the case A^∞ ≠ 0. Since A is normal, we can find a diagonal matrix J = diag(λ_1, λ_2, ..., λ_n) with |λ_1| ≥ ... ≥ |λ_n| and a unitary matrix P such that A = PJP*. Fact 2.4 tells us that 1 ∈ σ(A) and 1 = λ_1 = ... = λ_r > |λ_{r+1}| ≥ ... ≥ |λ_n| for some r ∈ N. It follows that
(30) A^∞ = lim_{k→∞} A^k = P diag( I_r, 0_{n−r} ) P*.
Hence we obtain
A − A^∞ = P diag( 0_r, λ_{r+1}, ..., λ_n ) P*,
which is a normal matrix. The latter formula together with (11) also gives us
‖A^k − A^∞‖ = ‖(A − A^∞)^k‖ = ‖A − A^∞‖^k = |λ_{r+1}|^k = [ρ(A − A^∞)]^k = γ(A)^k,
which ensures (29) and completes the proof of the theorem.

Remark 2.19 (i). Theorems 2.12 and 2.15 can also be deduced from the spectral radius formula and Jordan factorizations; however, we were not able to find these results explicitly in the literature. The asymptotic convergence bound of matrix powers can be found in [43, Theorem 2.9, p. 33]: Let A be of order n and let ε > 0 be given. Then for any norm there exist σ (depending on the norm) and τ_{A,ε} > 0 (depending on A, ε and the norm) such that for all k ≥ 1:
(31) σρ(A)^k ≤ ‖A^k‖ ≤ τ_{A,ε}(ρ(A) + ε)^k.
If the dominant eigenvalues of A are nondefective, we may take ε = 0. Lemma 2.14 gives a better upper bound for ‖A^k‖ when ρ(A) > 0.

The distinguishing feature of our results given here is the complete characterization under which the convergence rate of the matrix power A^k is exactly γ(A), rather than γ(A) + ε for some ε > 0, or just sufficient conditions.

(ii). Assume that lim_{k→∞} A^k exists and ρ(A) = 1. According to Fact 2.4 and the spectral resolution, A = P + Z where P = G_1, P² = P, PZ = ZP = 0 and ρ(Z) = γ(A) < 1; see also [36, Theorem 2.1]. We have A^k = P + Z^k so that A^k − P = Z^k. Therefore, the rate of convergence of A^k to P is exactly the rate of convergence of Z^k to 0. This observation is exactly Theorem 2.12 with a different proof, where A^∞ there is P in this decomposition. It is well known that ρ(Z) is the asymptotic convergence rate, meaning that for every ε > 0 there exist N ∈ N and M_1, M_2 > 0 such that
(∀ k ≥ N) M_1 ρ(Z)^k ≤ ‖Z^k‖ ≤ M_2 (ρ(Z) + ε)^k.
Our result says that there exists M_3 > 0 such that (∀ k ≥ N) ‖Z^k‖ ≤ M_3 ρ(Z)^k if and only if all eigenvalues of Z with magnitude ρ(Z) are semisimple. In other words, when one of the eigenvalues of Z with magnitude ρ(Z) is not semisimple, the convergence rate of Z^k to 0 can be as close to ρ(Z) as one wishes, but not exactly ρ(Z).

(iii). When all subdominant eigenvalues of A are semisimple, all eigenvalues of Z in (ii) with magnitude ρ(Z) are also semisimple and thus nondefective in the sense of Stewart [43]. It follows from [43, Theorem 2.9, p. 33], or (31) above, that there exists τ > 0 such that ‖Z^k‖ ≤ τρ(Z)^k for all k > 1. This together with the spectral resolution mentioned in (ii) tells us that A^k linearly converges to P with the exact rate ρ(Z) = γ(A). The latter conclusion is indeed the sufficient part of Theorem 2.15, obtained by using a different method.

3 Convergence rate analysis of relaxed alternating projection and generalized Douglas-Rachford methods

In this section, using the results of Section 2 and principal angles between two subspaces, we analyze comprehensively the convergence rates of relaxed alternating projections, partial relaxed alternating projections, and generalized Douglas-Rachford methods for two subspaces. We show how to choose the relaxation parameter to obtain the optimal rates of convergence. It turns out that the matrices associated with these iteration procedures do have semisimple subdominant eigenvalues. Throughout the section we suppose that U and V are two subspaces of R^n with 1 ≤ p := dim U ≤ dim V =: q ≤ n−1. Note that the whole section would be trivial if dim U = 0 or dim V = n. Let us recall the principal angles and the Friedrichs angle between U and V, which are crucial for our quantitative analysis of convergence rates.

Definition 3.1 (principal angles) ([9], [37, page 456]) The principal angles θ_k ∈ [0, π/2], k = 1, ..., p, between U and V are defined by
(32) cos θ_k := ⟨u_k, v_k⟩ = max{ ⟨u, v⟩ : u ∈ U, v ∈ V, ‖u‖ = ‖v‖ = 1, ⟨u, u_j⟩ = ⟨v, v_j⟩ = 0, j = 1, ..., k−1 }
with u_0 = v_0 := 0.

It is worth mentioning that the vectors u_k, v_k are not uniquely defined, but the principal angles θ_k are unique with 0 ≤ θ_1 ≤ θ_2 ≤ ... ≤ θ_p ≤ π/2; see [37, page 456].
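In practice the principal angles are usually computed from the singular values of Q_U^T Q_V, where the columns of Q_U and Q_V are orthonormal bases of U and V; this is the same device used in the proof of Proposition 3.4 below. The following NumPy sketch (illustrative only; the helper name, dimensions and random bases are our own choices) returns the angles in increasing order, from which θ_{s+1} with s = dim(U ∩ V) can be read off.

```python
# Sketch: principal angles of Definition 3.1 via the SVD of Q_U^T Q_V (cf. [9]).
import numpy as np

def principal_angles(BU, BV):
    """Columns of BU, BV span U and V (not necessarily orthonormal).
    Returns theta_1 <= ... <= theta_p in radians, p = min(dim U, dim V)."""
    QU, _ = np.linalg.qr(BU)
    QV, _ = np.linalg.qr(BV)
    svals = np.linalg.svd(QU.T @ QV, compute_uv=False)
    svals = np.clip(svals, -1.0, 1.0)          # guard against round-off
    return np.sort(np.arccos(svals))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    BU = rng.standard_normal((6, 2))           # U: 2-dimensional subspace of R^6
    BV = rng.standard_normal((6, 3))           # V: 3-dimensional subspace of R^6
    theta = principal_angles(BU, BV)
    s = int(np.sum(theta < 1e-12))             # s = dim(U cap V), generically 0 here
    print("principal angles:", theta)
    print("cos(theta_{s+1}) =", np.cos(theta[s]))
```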

Definition 3.2 (Friedrichs angle) The cosine of the Friedrichs angle θ_F ∈ (0, π/2] between U and V is
(33) c_F(U, V) := max{ ⟨u, v⟩ : u ∈ U ∩ (U∩V)^⊥, v ∈ V ∩ (U∩V)^⊥, ‖u‖ = ‖v‖ = 1 }.

In the following proposition we show that the Friedrichs angle is exactly the (s+1)-th principal angle θ_{s+1}, where s := dim(U∩V).

Proposition 3.3 (principal angles and Friedrichs angle) Let s := dim(U∩V). Then we have θ_k = 0 for k = 1, ..., s and θ_{s+1} = θ_F > 0.

Proof. Let x_1, ..., x_s be an orthonormal basis of the subspace U∩V. We may choose u_k = v_k = x_k, k = 1, ..., s, in (32). It follows that cos θ_k = ⟨x_k, x_k⟩ = 1 and thus θ_k = 0 for all k = 1, ..., s. Moreover, since span{u_1, ..., u_s} = span{v_1, ..., v_s} = U∩V, we obtain from (32) that
(34) cos θ_{s+1} = max{ ⟨u, v⟩ : u ∈ U, v ∈ V, ‖u‖ = ‖v‖ = 1, u, v ∈ (U∩V)^⊥ }.
This together with (33) tells us that θ_{s+1} = θ_F. The proof is complete.

The following result follows the ideas of [9, 16] to construct the orthogonal projections P_U and P_V in terms of the principal angles.

Proposition 3.4 (principal angles and orthogonal projections) Suppose that p + q < n. Then we may find an orthogonal matrix D ∈ R^{n×n} such that
(35) P_U = D diag( I_p, 0_p, 0_{q−p}, 0_{n−p−q} ) D^T and P_V = D [ C² CS 0 0 ; CS S² 0 0 ; 0 0 I_{q−p} 0 ; 0 0 0 0_{n−p−q} ] D^T,
where C and S are the p×p diagonal matrices defined by
(36) C := diag( cos θ_1, ..., cos θ_p ) and S := diag( sin θ_1, ..., sin θ_p ),
with the principal angles θ_1, ..., θ_p between U and V from Definition 3.1. Consequently, we have
(37) P_U P_V = D [ C² CS 0 0 ; 0 0_p 0 0 ; 0 0 0_{q−p} 0 ; 0 0 0 0_{n−p−q} ] D^T and P_{U^⊥} P_{V^⊥} = D [ 0_p 0 0 0 ; −CS C² 0 0 ; 0 0 0_{q−p} 0 ; 0 0 0 I_{n−p−q} ] D^T.
Furthermore, the orthogonal projection P_{U∩V} is computed by
(38) P_{U∩V} = D diag( I_s, 0_{n−s} ) D^T with s := dim(U∩V).

Proof. Let Q_U ∈ R^{n×p}, Q_{U^⊥} ∈ R^{n×(n−p)} and Q_V ∈ R^{n×q} be three matrices whose columns form orthonormal bases for U, U^⊥ and V, respectively. It follows from [37, page 430] that P_U = Q_U Q_U^T, Id − P_U = P_{U^⊥} = Q_{U^⊥} Q_{U^⊥}^T and P_V = Q_V Q_V^T.

Furthermore, by [9, Theorem 1], the singular value decomposition (SVD) of the p×q matrix Q_U^T Q_V is
(39) Q_U^T Q_V = A C B^T with C = diag( cos θ_1, ..., cos θ_p ) ∈ R^{p×p},
where A ∈ R^{p×p} and B ∈ R^{q×p} satisfy AA^T = A^TA = B^TB = I_p. Since all p columns of B are orthonormal and p ≤ q, we may find a q×(q−p) matrix B' such that B̂ := (B, B') ∈ R^{q×q} is orthogonal. Define further D_1 := Q_U A ∈ R^{n×p}; we have D_1^T D_1 = A^T Q_U^T Q_U A = A^T A = I_p. Note further from (39) that
(40) P_U Q_V = Q_U Q_U^T Q_V = Q_U A C B^T = D_1 C B^T.
Moreover, we get from (39) that
(41) [Q_{U^⊥}^T Q_V]^T [Q_{U^⊥}^T Q_V] = Q_V^T Q_{U^⊥} Q_{U^⊥}^T Q_V = Q_V^T (Id − P_U) Q_V = Id − Q_V^T P_U Q_V = Id − Q_V^T Q_U Q_U^T Q_V = Id − (ACB^T)^T (ACB^T) = Id − B C² B^T = B̂ B̂^T − B C² B^T = B̂ [ I_p − C² 0 ; 0 I_{q−p} ] B̂^T = B̂ [ S² 0 ; 0 I_{q−p} ] B̂^T.
Hence the columns of B̂ are eigenvectors of [Q_{U^⊥}^T Q_V]^T [Q_{U^⊥}^T Q_V]. It follows that the SVD of Q_{U^⊥}^T Q_V has the form
(42) Q_{U^⊥}^T Q_V = A_1 [ S 0 ; 0 I_{q−p} ] B̂^T
for some A_1 ∈ R^{(n−p)×q} with A_1^T A_1 = I_q. Define D_2 := Q_{U^⊥} A_1 ∈ R^{n×q}; we have D_2^T D_2 = A_1^T Q_{U^⊥}^T Q_{U^⊥} A_1 = A_1^T A_1 = I_q. Moreover, it follows from (42) that
(43) (Id − P_U) Q_V = Q_{U^⊥} Q_{U^⊥}^T Q_V = D_2 [ S 0 ; 0 I_{q−p} ] B̂^T.
Note further that D_1^T D_2 = A^T Q_U^T Q_{U^⊥} A_1 = 0, since the columns of Q_U and Q_{U^⊥} are bases of U and U^⊥, respectively. Thus there is an n×(n−p−q) matrix D_3 such that D := (D_1, D_2, D_3) ∈ R^{n×n} is orthogonal. Combining (40) and (43) gives us
(44) Q_V = P_U Q_V + (Id − P_U) Q_V = D_1 ( C 0_{p×(q−p)} ) B̂^T + D_2 [ S 0 ; 0 I_{q−p} ] B̂^T.
Hence we have
(45) P_V = Q_V Q_V^T = [ D_1 ( C 0_{p×(q−p)} ) B̂^T + D_2 [ S 0 ; 0 I_{q−p} ] B̂^T ] [ D_1 ( C 0_{p×(q−p)} ) B̂^T + D_2 [ S 0 ; 0 I_{q−p} ] B̂^T ]^T
= D_1 C² D_1^T + D_1 ( CS 0_{p×(q−p)} ) D_2^T + D_2 ( SC ; 0_{(q−p)×p} ) D_1^T + D_2 [ S² 0 ; 0 I_{q−p} ] D_2^T
= D [ C² CS 0 0 ; CS S² 0 0 ; 0 0 I_{q−p} 0 ; 0 0 0 0_{n−p−q} ] D^T,
which ensures the second part of (35). Note further that D_1 D_1^T = Q_U A (Q_U A)^T = Q_U A A^T Q_U^T = Q_U Q_U^T = P_U.

It follows that P_U = D diag( I_p, 0_p, 0_{q−p}, 0_{n−p−q} ) D^T, which verifies (35). The formulas for P_U P_V and P_{U^⊥} P_{V^⊥} = (Id − P_U)(Id − P_V) in (37) can be derived easily from (35). It remains to establish (38). Observe from (37) and Proposition 3.3 that
(P_U P_V)^k = D [ C^{2k} C^{2k−1}S 0 ; 0 0_p 0 ; 0 0 0_{n−2p} ] D^T → D diag( I_s, 0_{n−s} ) D^T as k → ∞.
Note further that Fix(P_U P_V) = U∩V = Fix(P_V P_U); see, e.g., [6, Lemma 2.4]. Combining this with Corollary 2.7 tells us that P_{U∩V} = P_{Fix(P_U P_V)} = D diag( I_s, 0_{n−s} ) D^T, which establishes (38).

Remark 3.5 When p + q < n, observe from (37), (33), and Proposition 3.3 that γ(P_U P_V) = γ(P_{U^⊥} P_{V^⊥}) = c_F²(U, V). These equalities are also true when p + q ≥ n by applying the trick used in Case 2 of the proof of Theorem 3.6. It follows that c_F(U, V) = c_F(U^⊥, V^⊥) by replacing U, V by U^⊥, V^⊥, respectively. This equality is known as Solmon's formula; see [17, Theorem 16] and also [39, Theorem 3] for different proofs.

3.1 Convergence rate of relaxed alternating projection methods

Throughout this subsection we denote the classical alternating projection mapping by T := P_U P_V, which is well known to be convergent to P_{U∩V} with the linear rate c_F²(U, V) = cos²θ_{s+1}, where s = dim(U∩V); see [17, 27]. We will study some relaxations of this operator and show that a better optimal rate can be obtained. The first kind of relaxed alternating projection mapping we study is defined by
(46) T_µ := (1−µ) Id + µ P_U P_V with µ ∈ R.
It is worth noting that the case µ = 0 is not interesting, since T_0 = Id is the identity map. Let us analyze the convergence of T_µ in the following result, mainly for the case µ ≠ 0. When µ = 1, it recovers the classical result mentioned above.

Theorem 3.6 (relaxed alternating projection) Let θ_{s+1} = θ_F be as in Proposition 3.3 with s = dim(U∩V). Then the mapping T_µ = (1−µ) Id + µ P_U P_V, µ ∈ R, is convergent if and only if µ ∈ [0, 2). Moreover, the following assertions hold:
(i) If µ ∈ (0, 2/(1+sin²θ_{s+1})], then T_µ is convergent to P_{U∩V} with the optimal linear rate γ(T_µ) = 1 − µ sin²θ_{s+1}.
(ii) If µ ∈ (2/(1+sin²θ_{s+1}), 2), then T_µ is convergent to P_{U∩V} with the optimal linear rate γ(T_µ) = µ − 1.
Consequently, when µ ≠ 0, T_µ is convergent to P_{U∩V} with a linear rate smaller than cos²θ_{s+1} if and only if µ ∈ (1, 2−sin²θ_{s+1}). Furthermore, T_µ attains the smallest convergence rate (1−sin²θ_{s+1})/(1+sin²θ_{s+1}) at µ = 2/(1+sin²θ_{s+1}).

Proof. Let us justify the theorem by considering two main cases as follows.

Case 1: p + q < n, where 1 ≤ p = dim U ≤ q = dim V ≤ n−1. By Proposition 3.4, (35) and (37), we may find an orthogonal matrix D such that
(47) T_µ = (1−µ) Id + µ P_U P_V = D [ (1−µ)I_p + µC² µCS 0 ; 0 (1−µ)I_p 0 ; 0 0 (1−µ)I_{n−2p} ] D^T = D [ I_p − µS² µCS 0 ; 0 (1−µ)I_p 0 ; 0 0 (1−µ)I_{n−2p} ] D^T.
It follows that
(48) σ(T_µ) = { 1 − µ sin²θ_k : k = 1, ..., p } ∪ {1−µ}.
Suppose first that T_µ is convergent; we get from Fact 2.4 that ρ(T_µ) ≤ 1 and −1 ∉ σ(T_µ). Thus we have |1−µ| ≤ 1 and 1−µ ≠ −1, which yield 0 ≤ µ < 2. Conversely, suppose that 0 ≤ µ < 2 and observe from Proposition 3.3 that
1 = 1−µ sin²θ_1 = ... = 1−µ sin²θ_s > 1−µ sin²θ_{s+1} ≥ ... ≥ 1−µ sin²θ_p ≥ 1−µ > −1.
If µ = 0 then T_µ = Id is always convergent. If µ > 0 and s = 0, it is clear that 1 ∉ σ(T_µ) by (48). Thus T_µ is convergent by Fact 2.4. If µ > 0 and s > 0, we claim that 1 ∈ σ(T_µ) is semisimple. Indeed, observe from (47) that
(49) ker(T_µ − Id) = D ( ker(−µS²) × {0}^{n−p} ) = D ( R^s × {0}^{n−s} ).
Similarly we also have
(50) ker(T_µ − Id)² = D ( ker(µ²S⁴) × {0}^{n−p} ) = D ( R^s × {0}^{n−s} ).
It follows from (49) and (50) that 1 is a semisimple eigenvalue of T_µ due to Fact 2.3. This tells us that T_µ is convergent by Fact 2.4. Thus T_µ = (1−µ) Id + µ P_U P_V, µ ∈ R, is convergent if and only if µ ∈ [0, 2).

Next let us justify (i) and (ii) under the assumption that µ ∈ (0, 2). We claim first that T_µ is convergent to P_{U∩V}. Indeed, note that
Fix T_µ = ker[µ(P_U P_V − Id)] = ker(P_U P_V − Id) = Fix(P_U P_V) = U∩V.
Furthermore, we have
Fix T_µ^T = ker[µ(P_V P_U − Id)] = ker(P_V P_U − Id) = Fix(P_V P_U) = V∩U,
which yields in turn the equality Fix T_µ = Fix T_µ^T. By Corollary 2.7, the mapping T_µ is convergent to P_{U∩V}. Now we justify the quantitative characterizations in (i) and (ii). Observe from (48) that the subdominant eigenvalue modulus of T_µ is
(51) γ(T_µ) = max{ |1−µ sin²θ_{s+1}|, |1−µ| }.
Note also that
(52) (1−µ sin²θ_{s+1})² − (1−µ)² = µ cos²θ_{s+1} [ 2 − µ(1+sin²θ_{s+1}) ].

Subcase a: cos θ_{s+1} = 0. Then we have θ_{s+1} = ... = θ_p = π/2 and γ(T_µ) = |1−µ|. In this case it is easy to see that CS = 0 and thus T_µ is diagonalizable by (47). Thanks to Example 2.9, T_µ is convergent with optimal rate |1−µ|. Both (i) and (ii) are valid in this case.

Subcase b: cos θ_{s+1} > 0. Let us consider the following three subsubcases.

Subsubcase b1: µ ∈ (0, 2/(sin²θ_{s+1}+1)). Then we have 1−µ sin²θ_{s+1} > |1−µ| by (52) and thus γ(T_µ) = 1−µ sin²θ_{s+1}. Observe that
(53) 1 > a_µ := 1−µ sin²θ_{s+1} > 1 − 2 sin²θ_{s+1}/(1+sin²θ_{s+1}) = (1−sin²θ_{s+1})/(1+sin²θ_{s+1}) ≥ 0.
Hence we have γ(T_µ) = 1−µ sin²θ_{s+1} > 0. Suppose further that θ_{s+1} = ... = θ_k and θ_k ≠ θ_{k+1} for some k ∈ {s+1, ..., p}; we easily check from (47) that
ker(T_µ − a_µ Id) = ker(T_µ − a_µ Id)² = D ( {0}^s × R^{k−s} × {0}^{n−k} ),
which shows that a_µ is semisimple by Fact 2.3. Thanks to Theorem 2.15, T_µ is convergent with the optimal rate a_µ.

Subsubcase b2: µ = 2/(1+sin²θ_{s+1}) > 1. Then we obtain from (51) that
(54) γ(T_µ) = |1−µ sin²θ_{s+1}| = |1−µ| = 1−µ sin²θ_{s+1} = µ−1 = (1−sin²θ_{s+1})/(1+sin²θ_{s+1}).
Similarly to the above subsubcase, a_µ ∈ σ(T_µ) is semisimple. Furthermore, 1−µ ∈ σ(T_µ) is also semisimple. Indeed, observe that
T_µ − (1−µ) Id = D [ µC² µCS 0 ; 0 0_p 0 ; 0 0 0_{n−2p} ] D^T and (T_µ − (1−µ) Id)² = D [ µ²C⁴ µ²C³S 0 ; 0 0_p 0 ; 0 0 0_{n−2p} ] D^T.
By using these two expressions, we may check that ker(T_µ − (1−µ) Id) = ker(T_µ − (1−µ) Id)², which yields that 1−µ is also semisimple by Fact 2.3. By Theorem 2.15 again, we obtain that (1−sin²θ_{s+1})/(1+sin²θ_{s+1}) is the optimal linear convergence rate of T_µ.

Subsubcase b3: µ > 2/(1+sin²θ_{s+1}). It follows from (52) that |1−µ sin²θ_{s+1}| < |1−µ|, and thus we get from (51) that
(55) γ(T_µ) = |1−µ| = µ−1 > 2/(1+sin²θ_{s+1}) − 1 = (1−sin²θ_{s+1})/(1+sin²θ_{s+1}).
Similarly to the above case, 1−µ ∈ σ(T_µ) is semisimple. Thus Theorem 2.15 tells us that µ−1 is the optimal convergence rate of T_µ in this subcase.

Combining Subsubcase b1 and Subsubcase b2 ensures (i), and (ii) is exactly Subsubcase b3. Thus (i) and (ii) are verified.

Let us complete the proof by verifying the last part of the theorem. When µ ∈ (0, 2/(1+sin²θ_{s+1})], we have 1−µ sin²θ_{s+1} < cos²θ_{s+1} if and only if µ > 1, since sin²θ_{s+1} > 0 by Proposition 3.3.

Furthermore, when µ ∈ (2/(1+sin²θ_{s+1}), 2), we have µ−1 < cos²θ_{s+1} if and only if µ < 1 + cos²θ_{s+1} = 2 − sin²θ_{s+1}. Combining these two observations with (i) and (ii) in the theorem tells us that T_µ is convergent to P_{U∩V} with a rate smaller than cos²θ_{s+1} if and only if µ ∈ (1, 2−sin²θ_{s+1}). Moreover, the optimal rate (1−sin²θ_{s+1})/(1+sin²θ_{s+1}) of T_µ is attained at µ = 2/(1+sin²θ_{s+1}) due to (53), (54), and (55).

Case 2: p + q ≥ n. We may find some k ∈ N such that n' := n + k > p + q. Define U' := U × {0_k} ⊆ R^{n'}, V' := V × {0_k} ⊆ R^{n'}, and T'_µ = (1−µ) Id + µ P_{U'} P_{V'}. It is clear that 1 ≤ p = dim U' ≤ dim V' = q and p + q < n'. Observe from Definition 3.1 that the principal angles between U' and V' are the same as the ones between U and V. Moreover, we have P_{U'} = [ P_U 0 ; 0 0_k ], P_{V'} = [ P_V 0 ; 0 0_k ], and thus
(56) T'_µ = [ T_µ 0 ; 0 (1−µ)I_k ].
Since q ≤ n−1, there is some x ∈ R^n \ {0} such that P_V x = 0. It follows that Tx = 0, and thus we have 0 ∈ σ(T) and then 1−µ ∈ σ(T_µ). If T_µ is convergent, Fact 2.4 tells us that −1 < 1−µ ≤ 1, i.e., µ ∈ [0, 2). Conversely, if µ ∈ [0, 2), then T'_µ is convergent due to Case 1. This together with (56) ensures that T_µ is also convergent. Hence T_µ is convergent if and only if µ ∈ [0, 2).

To verify the convergence rates of T_µ, suppose further that µ ∈ (0, 2). We note that σ(T'_µ) = σ(T_µ), which implies in turn that γ(T_µ) = γ(T'_µ). It follows from Case 1 that T'_µ in (56) is convergent to P_{U'∩V'} = [ P_{U∩V} 0 ; 0 0_k ] with the convergence rate γ(T'_µ). This together with (56) yields
‖T_µ^k − P_{U∩V}‖ ≤ ‖(T'_µ)^k − P_{U'∩V'}‖.
Thus γ(T'_µ) = γ(T_µ) is the convergence rate of T_µ, by also Theorem 2.12. The analysis of γ(T'_µ) in (i) and (ii) in Case 1 also allows us to verify (i) and (ii) for γ(T_µ) in Case 2. Hence the proof is complete.

Next we study another kind of relaxation of the map T = P_U P_V, namely
(57) S_µ := P_U ((1−µ) Id + µ P_V) = (1−µ) P_U + µ P_U P_V;
see also [33] for a similar form, which will give us a better optimal rate. Since the proof is similar to the one of Theorem 3.6 above, we only sketch the main steps.

Theorem 3.7 (partial relaxed alternating projection) The map S_µ := P_U ((1−µ) Id + µ P_V) = (1−µ) P_U + µ P_U P_V is convergent if and only if µ ∈ [0, 2/sin²θ_p), with the convention 2/sin²θ_p := +∞ when sin θ_p = 0. Moreover, the following assertions hold:
(i) If µ ∈ (0, 2/(sin²θ_{s+1}+sin²θ_p)], then S_µ is convergent to P_{U∩V} with the optimal linear convergence rate γ(S_µ) = 1 − µ sin²θ_{s+1}.
(ii) If µ ∈ (2/(sin²θ_{s+1}+sin²θ_p), 2/sin²θ_p), then S_µ is convergent to P_{U∩V} with the optimal linear convergence rate γ(S_µ) = µ sin²θ_p − 1.
Consequently, when µ ≠ 0, S_µ is convergent to P_{U∩V} with an optimal linear convergence rate smaller than cos²θ_{s+1} = c_F²(U, V) if and only if µ ∈ (1, (2−sin²θ_{s+1})/sin²θ_p). Furthermore, S_µ attains the smallest linear convergence rate (sin²θ_p − sin²θ_{s+1})/(sin²θ_{s+1}+sin²θ_p) at µ = 2/(sin²θ_{s+1}+sin²θ_p).
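Before the proof, the rates in Theorems 3.6 and 3.7 can be checked empirically. The NumPy sketch below (a small experiment of our own, not the paper's Section 5 tests; random subspaces, so s = 0 and U ∩ V = {0}) iterates T = P_U P_V, the relaxed map T_µ with its best parameter, and the partially relaxed map S_µ with its best parameter, and compares the observed contraction factor with the stated optimal rates.

```python
# Sketch: observed vs. predicted contraction factors for T, T_mu and S_mu.
import numpy as np

def observed_rate(M, x0, burn=40, m=10):
    """Geometric-mean contraction factor of x_{k+1} = M x_k after a burn-in."""
    x = x0.copy()
    for _ in range(burn):
        x = M @ x
    a = np.linalg.norm(x)
    for _ in range(m):
        x = M @ x
    return (np.linalg.norm(x) / a) ** (1.0 / m)

rng = np.random.default_rng(2)
n, p, q = 20, 4, 6
QU, _ = np.linalg.qr(rng.standard_normal((n, p)))
QV, _ = np.linalg.qr(rng.standard_normal((n, q)))
PU, PV = QU @ QU.T, QV @ QV.T
Id = np.eye(n)

theta = np.sort(np.arccos(np.clip(np.linalg.svd(QU.T @ QV, compute_uv=False), -1, 1)))
sF, sP = np.sin(theta[0])**2, np.sin(theta[-1])**2   # sin^2(theta_{s+1}), sin^2(theta_p), s = 0

muT = 2.0 / (1.0 + sF)          # best parameter for T_mu (Theorem 3.6)
muS = 2.0 / (sF + sP)           # best parameter for S_mu (Theorem 3.7)
tests = [
    ("T = P_U P_V",       PU @ PV,                       np.cos(theta[0])**2),
    ("T_mu, optimal mu",  (1-muT)*Id + muT*(PU @ PV),    (1-sF)/(1+sF)),
    ("S_mu, optimal mu",  (1-muS)*PU + muS*(PU @ PV),    (sP-sF)/(sF+sP)),
]
x0 = rng.standard_normal(n)
for name, M, rate in tests:
    print(f"{name:18s} predicted {rate:.4f}   observed {observed_rate(M, x0):.4f}")
```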

Proof. We separate the proof into two main cases as below.

Case 1: p + q < n with 1 ≤ p = dim U ≤ q = dim V ≤ n−1. It follows from (35) and (37) that there is some orthogonal matrix D ∈ R^{n×n} such that
(58) S_µ = D [ (1−µ)I_p + µC² µCS 0 ; 0 0_p 0 ; 0 0 0_{n−2p} ] D^T = D [ I_p − µS² µCS 0 ; 0 0_p 0 ; 0 0 0_{n−2p} ] D^T.
Hence we have
(59) σ(S_µ) = { 1 − µ sin²θ_k : k = 1, ..., p } ∪ {0}.
Suppose that S_µ is convergent; we get from Fact 2.4 that
(60) −1 < 1 − µ sin²θ_p and 1 − µ sin²θ_{s+1} ≤ 1.
Since θ_{s+1} = θ_F ≠ 0 by Proposition 3.3, these give us µ ∈ [0, 2/sin²θ_p). Conversely, suppose that µ ∈ [0, 2/sin²θ_p); we have
(61) 1 = 1−µ sin²θ_1 = ... = 1−µ sin²θ_s ≥ 1−µ sin²θ_{s+1} ≥ ... ≥ 1−µ sin²θ_p > −1.
If µ = 0 then S_µ = P_U is always convergent. If µ > 0 and s = 0, it is clear that 1 ∉ σ(S_µ) by (59). Thanks to Fact 2.4, S_µ is convergent. If µ > 0 and s > 0, similarly to the corresponding part of the proof of Theorem 3.6, 1 ∈ σ(S_µ) is semisimple. Combining (61) with Fact 2.4 gives us that S_µ is convergent. Thus S_µ is convergent if and only if µ ∈ [0, 2/sin²θ_p).

To verify (i) and (ii), assume further that µ ∈ (0, 2/sin²θ_p). Let us claim that S_µ is convergent to P_{U∩V}. Via the explicit form of S_µ in (58), we can easily check that
Fix S_µ = ker(S_µ − Id) = D( R^s × {0}^{n−s} ) = ker(S_µ^T − Id) = Fix S_µ^T.
Note also from (38) that
U∩V = Fix P_{U∩V} = D( R^s × {0}^{n−s} ).
It follows that Fix S_µ = Fix S_µ^T = U∩V. Thanks to Corollary 2.7, S_µ is convergent to P_{U∩V}. Next we justify the quantitative characterizations in (i) and (ii). Observe from (59) and (61) that
(62) γ(S_µ) = max{ |1−µ sin²θ_{s+1}|, |1−µ sin²θ_p| }.
Note also that
(63) (1−µ sin²θ_{s+1})² − (1−µ sin²θ_p)² = µ(sin²θ_p − sin²θ_{s+1}) [ 2 − µ(sin²θ_{s+1} + sin²θ_p) ].

Subcase a: sin θ_p = sin θ_{s+1}, i.e., θ_{s+1} = θ_{s+2} = ... = θ_p. Hence we have σ(S_µ) ⊆ {1, 1−µ sin²θ_{s+1}, 0} and γ(S_µ) = |1−µ sin²θ_{s+1}|. Moreover, it is easy to check that c_µ := 1−µ sin²θ_{s+1} is semisimple by showing that ker(S_µ − c_µ Id) = ker(S_µ − c_µ Id)².

Subcase b: sin θ_p ≠ sin θ_{s+1}, i.e., sin θ_p > sin θ_{s+1}. We continue the proof by taking into account three different cases as follows.

Subsubcase b1: µ ∈ (0, 2/(sin²θ_{s+1}+sin²θ_p)). Then we have from (63) that |1−µ sin²θ_{s+1}| > |1−µ sin²θ_p|, which gives us γ(S_µ) = 1−µ sin²θ_{s+1} by (62). Moreover, note that
(64) c_µ = 1−µ sin²θ_{s+1} > 1 − 2 sin²θ_{s+1}/(sin²θ_{s+1}+sin²θ_p) = (sin²θ_p − sin²θ_{s+1})/(sin²θ_{s+1}+sin²θ_p) > 0.
Thanks to the structure of S_µ in (58), we may check that c_µ is semisimple. Thus c_µ = γ(S_µ) is the optimal linear convergence rate of S_µ by Theorem 2.15.

Subsubcase b2: µ = 2/(sin²θ_{s+1}+sin²θ_p). Thus
(65) γ(S_µ) = |1−µ sin²θ_{s+1}| = |1−µ sin²θ_p| = (sin²θ_p − sin²θ_{s+1})/(sin²θ_{s+1}+sin²θ_p) > 0.
We can check that c_µ = 1−µ sin²θ_{s+1} and d_µ := 1−µ sin²θ_p are semisimple in this case via Fact 2.3. This together with Theorem 2.15 tells us that γ(S_µ) = 1−µ sin²θ_{s+1} = (sin²θ_p − sin²θ_{s+1})/(sin²θ_{s+1}+sin²θ_p) is the optimal linear rate of S_µ.

Subsubcase b3: µ ∈ (2/(sin²θ_{s+1}+sin²θ_p), 2/sin²θ_p). It follows from (63) that |1−µ sin²θ_{s+1}| < |1−µ sin²θ_p|, which yields γ(S_µ) = µ sin²θ_p − 1 by (62). Moreover, observe that
(66) µ sin²θ_p − 1 > 2 sin²θ_p/(sin²θ_{s+1}+sin²θ_p) − 1 = (sin²θ_p − sin²θ_{s+1})/(sin²θ_{s+1}+sin²θ_p) > 0.
We also have that d_µ = 1−µ sin²θ_p is semisimple via Fact 2.3. Thanks to Theorem 2.15, γ(S_µ) = µ sin²θ_p − 1 is the optimal linear convergence rate of S_µ.

Combining Subsubcase b1 and Subsubcase b2 gives us (i). Furthermore, Subsubcase b3 exactly verifies (ii). The last part of the theorem is indeed a direct consequence of (i) and (ii). The proof of the theorem for Case 1 is complete.

Case 2: p + q ≥ n. Then we find some k ∈ N such that n' := n + k > p + q and define U' := U × {0_k} ⊆ R^{n'}, V' := V × {0_k} ⊆ R^{n'}, and S'_µ = (1−µ) P_{U'} + µ P_{U'} P_{V'}. It is clear that 1 ≤ p = dim U' ≤ dim V' = q and p + q < n'. Moreover, we also have
S'_µ = [ S_µ 0 ; 0 0_k ],
which shows that S_µ is convergent if and only if S'_µ is convergent. The rest of the proof is quite similar to the corresponding one in Theorem 3.6.

Remark 3.8 It is clear that the optimal linear rate (sin²θ_p − sin²θ_{s+1})/(sin²θ_{s+1}+sin²θ_p) of S_µ is smaller than the rate (1−sin²θ_{s+1})/(1+sin²θ_{s+1}) of T_µ in Theorem 3.6. Note further from the above theorem that S_2 = P_U R_V with R_V := 2P_V − Id, which is known as the reflection-projection method [7, 11], is convergent to P_{U∩V} if and only if 2 < 2/sin²θ_p, i.e., θ_p < π/2. When this is fulfilled, the optimal linear rate of the reflection-projection method is max{ |1−2 sin²θ_{s+1}|, |1−2 sin²θ_p| } by (62). Besides the definitions of θ_{s+1}, θ_p in Definition 3.1 and Definition 3.2, we may also obtain θ_{s+1}, θ_p from the formulas
(67) cos θ_{s+1} = ‖P_U P_V − P_{U∩V}‖ and sin²θ_p = ‖P_U − P_U P_V‖² = ‖P_U − P_U P_V P_U‖,
which follow from (35), (37), and (38).
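The formulas in (67) give a practical way to obtain the two angles that the optimal parameter of S_µ needs, using only projector products. The NumPy sketch below is illustrative only: the reconstruction of P_{U∩V} from the kernel of 2 Id − P_U − P_V is a device of this snippet (any method producing P_{U∩V} would do), and the function names and the comparison against SVD-based angles are our own choices.

```python
# Sketch of (67): recover theta_{s+1} and theta_p from projector norms.
import numpy as np

def proj_intersection(PU, PV, tol=1e-10):
    """Orthogonal projector onto U cap V, computed here as ker(2I - PU - PV)."""
    n = PU.shape[0]
    w, W = np.linalg.eigh(2*np.eye(n) - PU - PV)     # symmetric PSD matrix
    B = W[:, w < tol]                                # eigenvectors for eigenvalue 0
    return B @ B.T if B.size else np.zeros_like(PU)

def angles_from_projectors(PU, PV):
    PUV = proj_intersection(PU, PV)
    cos_f = np.linalg.norm(PU @ PV - PUV, 2)         # cos(theta_{s+1}) = c_F(U,V)
    sin_p = np.linalg.norm(PU - PU @ PV, 2)          # sin(theta_p)
    return np.arccos(min(cos_f, 1.0)), np.arcsin(min(sin_p, 1.0))

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    QU, _ = np.linalg.qr(rng.standard_normal((8, 3)))
    QV, _ = np.linalg.qr(rng.standard_normal((8, 4)))
    t_first, t_last = angles_from_projectors(QU @ QU.T, QV @ QV.T)
    ref = np.arccos(np.clip(np.linalg.svd(QU.T @ QV, compute_uv=False), -1, 1))
    print("theta_{s+1}:", t_first, " (SVD:", ref.min(), ")")
    print("theta_p    :", t_last,  " (SVD:", ref.max(), ")")
```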

Remark 3.9 (finite termination) From Theorem 3.6, observe that the map T_µ has the linear convergence rate 0, i.e., it always terminates after finitely many iterations, if and only if θ_{s+1} = π/2 and µ = 1. Similarly, we get from Theorem 3.7 that S_µ has the linear convergence rate 0 if and only if µ = 2/(sin²θ_{s+1}+sin²θ_p) and θ_{s+1} = θ_p. The latter condition is clearly satisfied when dim(U∩V) = p−1, in which case µ = 1/sin²θ_{s+1}; e.g., when U and V are two different lines passing through the origin in R², or U is a line in R³ and V is a hyperplane in R³ with U ⊄ V, or U and V are two different hyperplanes in R³, etc.

3.2 Convergence rate of the generalized Douglas-Rachford method

Convergence rates of many specific matrices related to the Douglas-Rachford operator
(68) R := P_U P_V + P_{U^⊥} P_{V^⊥} = (R_U R_V + Id)/2 = (R_{U^⊥} R_{V^⊥} + Id)/2
have been discussed in [16]. One of the particular cases there is the so-called generalized Douglas-Rachford operator R_µ defined by R_µ := (1−µ) Id + µR. The convergence rate of this mapping has been obtained in Demanet-Zhang [16] under the additional condition U∩V = {0}. In the following result we give a complete characterization of the convergence of this map and also show that the condition U∩V = {0} can be relaxed.

Theorem 3.10 (generalized Douglas-Rachford method) The map R_µ is convergent if and only if µ ∈ [0, 2). Moreover, the following assertions hold:
(i) R_µ is normal.
(ii) If µ ∈ (0, 2) then R_µ is convergent to P_{Fix R} = P_{(U∩V) ⊕ (U^⊥∩V^⊥)} with the optimal linear convergence rate
γ(R_µ) = √( µ(2−µ) cos²θ_{s+1} + (1−µ)² ), where s := dim(U∩V).

Proof. As in the proofs of Theorem 3.6 and Theorem 3.7, we consider two major cases.

Case 1: p + q < n. By using the expressions in (37), we easily establish that
(69) R_µ = D [ C²+(1−µ)S² µCS 0 0 ; −µCS C²+(1−µ)S² 0 0 ; 0 0 (1−µ)I_{q−p} 0 ; 0 0 0 I_{n−p−q} ] D^T = D [ I_p−µS² µCS 0 0 ; −µCS I_p−µS² 0 0 ; 0 0 (1−µ)I_{q−p} 0 ; 0 0 0 I_{n−p−q} ] D^T;
see also a similar form in [16, page 14]. It is easy to check that R_µ^T R_µ = R_µ R_µ^T, i.e., R_µ is normal. Thus (i) is satisfied.

We may get from the above form and the block determinant formula, cf. [37, page 475], that
σ(R_µ) = { cos²θ_k + (1−µ) sin²θ_k ± iµ cos θ_k sin θ_k : k = 1, ..., p } ∪ {1} if q = p, and
σ(R_µ) = { cos²θ_k + (1−µ) sin²θ_k ± iµ cos θ_k sin θ_k : k = 1, ..., p } ∪ {1} ∪ {1−µ} if q > p,
where i := √−1. For any k = 1, ..., p, we have
|1−µ sin²θ_k ± iµ cos θ_k sin θ_k|² = (1−µ sin²θ_k)² + µ² cos²θ_k sin²θ_k = [µ cos²θ_k + (1−µ)]² + µ² cos²θ_k (1−cos²θ_k) = µ(2−µ) cos²θ_k + (1−µ)².
Suppose further that R_µ is convergent. Then we get from Fact 2.4 that µ(2−µ) cos²θ_{s+1} + (1−µ)² ≤ 1, which yields µ(2−µ)(1−cos²θ_{s+1}) ≥ 0 and thus µ ∈ [0, 2], since cos²θ_{s+1} < 1. Next let us consider three particular subcases of µ.

Subcase a: µ = 2. Then all eigenvalues of R_µ have magnitude 1. By Fact 2.4, we must have
(70) 1−µ sin²θ_k ± iµ cos θ_k sin θ_k = 1 for all k = 1, ..., p,
which implies in turn that sin θ_{s+1} cos θ_{s+1} = 0 and thus θ_{s+1} = π/2, since sin θ_{s+1} > 0 by Proposition 3.3. It follows that 1−µ sin²θ_{s+1} ± iµ cos θ_{s+1} sin θ_{s+1} = −1, which contradicts (70). Hence when µ = 2, R_µ is not convergent.

Subcase b: µ = 0. It is obvious that R_µ = Id is convergent to Id with rate 0.

Subcase c: 0 < µ < 2. By Proposition 3.3 we have
(71) 1 = µ(2−µ) cos²θ_1 + (1−µ)² = ... = µ(2−µ) cos²θ_s + (1−µ)² > µ(2−µ) cos²θ_{s+1} + (1−µ)² ≥ µ(2−µ) cos²θ_{s+2} + (1−µ)² ≥ ... ≥ µ(2−µ) cos²θ_p + (1−µ)² ≥ (1−µ)².
Since R_µ is normal, it follows from Fact 2.4 and Corollary 2.7 that R_µ is convergent. Hence R_µ is convergent if and only if µ ∈ [0, 2). It remains to verify (ii) in this case. Suppose that µ ∈ (0, 2); we get from the normality of R_µ and Theorem 2.18 that γ(R_µ) = √( µ(2−µ) cos²θ_{s+1} + (1−µ)² ) (by (71)) is the optimal linear convergence rate of R_µ and that R_µ is convergent to P_{Fix R_µ} = P_{Fix R}. Moreover, we have Fix R = (U∩V) ⊕ (U^⊥∩V^⊥) by [5, Proposition 3.6]. This ensures (ii) and thus completes the proof of the theorem for Case 1.

Case 2: p + q ≥ n. Similarly to the proofs of Theorem 3.6 and Theorem 3.7, we find k > 0 such that n' := n + k > p + q. Define further U' := U × {0_k} ⊆ R^{n'}, V' := V × {0_k} ⊆ R^{n'}, and R'_µ = (1−µ) Id + µ[ P_{U'} P_{V'} + P_{(U')^⊥} P_{(V')^⊥} ]. It is easy to verify that
(72) R'_µ = [ R_µ 0 ; 0 I_k ].
Note from Case 1 that R'_µ is normal, and so is R_µ. Moreover, we get from (72) that R_µ is convergent if and only if R'_µ is convergent, with the same rate. The analysis of the convergence of R'_µ in Case 1 justifies all the statements of the theorem in this case. The proof is complete.
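The rate in Theorem 3.10(ii) can be verified directly on small examples, since γ(R_µ) is just the largest eigenvalue modulus away from 1. The following NumPy sketch (illustrative only; dimensions, seed and parameter values are arbitrary choices, and U ∩ V = {0} for random subspaces, so θ_{s+1} is the smallest principal angle) compares the computed γ(R_µ) with √(µ(2−µ)cos²θ_{s+1} + (1−µ)²).

```python
# Sketch: check gamma(R_mu) against the formula of Theorem 3.10(ii).
import numpy as np

rng = np.random.default_rng(4)
n, p, q = 12, 3, 5
QU, _ = np.linalg.qr(rng.standard_normal((n, p)))
QV, _ = np.linalg.qr(rng.standard_normal((n, q)))
PU, PV = QU @ QU.T, QV @ QV.T
Id = np.eye(n)

theta1 = np.arccos(np.clip(np.linalg.svd(QU.T @ QV, compute_uv=False), -1, 1)).min()
R = PU @ PV + (Id - PU) @ (Id - PV)       # Douglas-Rachford operator (68)

for mu in (0.5, 1.0, 1.5):
    Rmu = (1 - mu)*Id + mu*R
    eigs = np.linalg.eigvals(Rmu)
    gam = max(abs(lam) for lam in eigs if abs(lam - 1) > 1e-10)   # subdominant modulus
    pred = np.sqrt(mu*(2 - mu)*np.cos(theta1)**2 + (1 - mu)**2)
    print(f"mu = {mu:.1f}   gamma(R_mu) = {gam:.6f}   predicted = {pred:.6f}")
```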

Remark 3.11 (1). Unlike the relaxed alternating projection methods studied in Theorems 3.6 and 3.7, the convergence rate of the (over- and under-) relaxation of the Douglas-Rachford algorithm is always at least the original one, due to
γ(R_1) = cos θ_{s+1} ≤ √( µ(2−µ) cos²θ_{s+1} + (1−µ)² ) = γ(R_µ) for all µ ∈ [0, 2).
Moreover, it is worth mentioning here that Theorem 3.10 also tells us that R_2 = R_U R_V, which is known as the reflection-reflection method, is never convergent in the case of two nontrivial subspaces with 1 ≤ dim U, dim V ≤ n−1.
(2). For the linear convergence rate of the Douglas-Rachford method in a general Hilbert space, see [5].

4 A nonlinear approach to the alternating projection method

Throughout this section, we also suppose that U and V are two subspaces of R^n with 1 ≤ p = dim U ≤ dim V = q ≤ n−1. From Theorem 3.7, we know that the map S_µ in (57) attains its smallest rate (sin²θ_p − sin²θ_{s+1})/(sin²θ_{s+1}+sin²θ_p) at µ = 2/(sin²θ_{s+1}+sin²θ_p). This rate is smaller than the optimal rates of T_µ and T. However, it is not trivial to determine θ_{s+1} and θ_p in order to construct µ = 2/(sin²θ_{s+1}+sin²θ_p) for S_µ, especially when the dimensions of U and V are large; see Definition 3.1, Definition 3.2, and (67). In this section we introduce a simple nonlinear mapping, by using the idea of a line search [6, 3, 5] for the map S_µ, so that the iterative sequence generated by this nonlinear mapping is linearly convergent to the projection onto U∩V with at least the same optimal rate mentioned above, that is, with a convergence rate at least as fast as the one obtained with the optimal relaxation parameter. One may think of this mapping as the partial relaxed alternating projection with an adaptive parameter µ(x) depending on each iteration. This is a technique employed for other iterative methods; see, e.g., [3, 4, 11, 1, 13, 1].

Definition 4.1 Define the map B_T with T = P_U P_V by
(73) B_T(x) := P_U ((1−µ_x) x + µ_x P_V x) = (1−µ_x) P_U x + µ_x P_U P_V x,
where
(74) µ_x := ⟨P_U x − P_U P_V x, x⟩ / ‖P_U x − P_U P_V x‖² if P_U x − P_U P_V x ≠ 0, and µ_x := 1 if P_U x − P_U P_V x = 0.

Remark 4.2 In [4, 6, 3], an accelerated mapping of T is introduced by using the line search [5] as
(75) A_T(x) := (1−λ_x) x + λ_x P_U P_V x,
where
(76) λ_x := ⟨x − P_U P_V x, x⟩ / ‖x − P_U P_V x‖² if x − P_U P_V x ≠ 0, and λ_x := 1 if x − P_U P_V x = 0.
It is worth noting that µ_x = λ_x and B_T x = A_T x when x ∈ U.
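Since µ_x in (74) is computed from the current iterate alone, the scheme of Definition 4.1 needs no principal angles at all. The NumPy sketch below (illustrative only; dimensions, seed and the side-by-side comparison with plain alternating projections are our own choices, and U ∩ V = {0} for the random subspaces used, so the common target is 0) implements one step of B_T as in (73)-(74).

```python
# Sketch: the adaptive map B_T of Definition 4.1 versus plain alternating projections.
import numpy as np

def B_T(x, PU, PV):
    """One step of the map B_T in (73)-(74)."""
    a = PU @ x
    b = PU @ (PV @ x)
    d = a - b                                    # P_U x - P_U P_V x
    nd2 = d @ d
    mu = (d @ x) / nd2 if nd2 > 0 else 1.0       # adaptive relaxation parameter (74)
    return (1 - mu) * a + mu * b

rng = np.random.default_rng(5)
n, p, q = 30, 5, 8
QU, _ = np.linalg.qr(rng.standard_normal((n, p)))
QV, _ = np.linalg.qr(rng.standard_normal((n, q)))
PU, PV = QU @ QU.T, QV @ QV.T

x_map = x_bt = rng.standard_normal(n)            # here U cap V = {0}, so the target is 0
for k in range(1, 26):
    x_map = PU @ (PV @ x_map)                    # classical alternating projections T = P_U P_V
    x_bt = B_T(x_bt, PU, PV)
    if k % 5 == 0:
        print(f"k = {k:2d}   ||T^k x|| = {np.linalg.norm(x_map):.2e}"
              f"   ||B_T^k x|| = {np.linalg.norm(x_bt):.2e}")
```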


More information

MATH 583A REVIEW SESSION #1

MATH 583A REVIEW SESSION #1 MATH 583A REVIEW SESSION #1 BOJAN DURICKOVIC 1. Vector Spaces Very quick review of the basic linear algebra concepts (see any linear algebra textbook): (finite dimensional) vector space (or linear space),

More information

Math 408 Advanced Linear Algebra

Math 408 Advanced Linear Algebra Math 408 Advanced Linear Algebra Chi-Kwong Li Chapter 4 Hermitian and symmetric matrices Basic properties Theorem Let A M n. The following are equivalent. Remark (a) A is Hermitian, i.e., A = A. (b) x

More information

Notes on the matrix exponential

Notes on the matrix exponential Notes on the matrix exponential Erik Wahlén erik.wahlen@math.lu.se February 14, 212 1 Introduction The purpose of these notes is to describe how one can compute the matrix exponential e A when A is not

More information

Chapter 3 Transformations

Chapter 3 Transformations Chapter 3 Transformations An Introduction to Optimization Spring, 2014 Wei-Ta Chu 1 Linear Transformations A function is called a linear transformation if 1. for every and 2. for every If we fix the bases

More information

j=1 x j p, if 1 p <, x i ξ : x i < ξ} 0 as p.

j=1 x j p, if 1 p <, x i ξ : x i < ξ} 0 as p. LINEAR ALGEBRA Fall 203 The final exam Almost all of the problems solved Exercise Let (V, ) be a normed vector space. Prove x y x y for all x, y V. Everybody knows how to do this! Exercise 2 If V is a

More information

Chapter SSM: Linear Algebra. 5. Find all x such that A x = , so that x 1 = x 2 = 0.

Chapter SSM: Linear Algebra. 5. Find all x such that A x = , so that x 1 = x 2 = 0. Chapter Find all x such that A x : Chapter, so that x x ker(a) { } Find all x such that A x ; note that all x in R satisfy the equation, so that ker(a) R span( e, e ) 5 Find all x such that A x 5 ; x x

More information

Final Review Sheet. B = (1, 1 + 3x, 1 + x 2 ) then 2 + 3x + 6x 2

Final Review Sheet. B = (1, 1 + 3x, 1 + x 2 ) then 2 + 3x + 6x 2 Final Review Sheet The final will cover Sections Chapters 1,2,3 and 4, as well as sections 5.1-5.4, 6.1-6.2 and 7.1-7.3 from chapters 5,6 and 7. This is essentially all material covered this term. Watch

More information

Math 113 Final Exam: Solutions

Math 113 Final Exam: Solutions Math 113 Final Exam: Solutions Thursday, June 11, 2013, 3.30-6.30pm. 1. (25 points total) Let P 2 (R) denote the real vector space of polynomials of degree 2. Consider the following inner product on P

More information

MATHEMATICS 217 NOTES

MATHEMATICS 217 NOTES MATHEMATICS 27 NOTES PART I THE JORDAN CANONICAL FORM The characteristic polynomial of an n n matrix A is the polynomial χ A (λ) = det(λi A), a monic polynomial of degree n; a monic polynomial in the variable

More information

MATH 581D FINAL EXAM Autumn December 12, 2016

MATH 581D FINAL EXAM Autumn December 12, 2016 MATH 58D FINAL EXAM Autumn 206 December 2, 206 NAME: SIGNATURE: Instructions: there are 6 problems on the final. Aim for solving 4 problems, but do as much as you can. Partial credit will be given on all

More information

Math 102, Winter Final Exam Review. Chapter 1. Matrices and Gaussian Elimination

Math 102, Winter Final Exam Review. Chapter 1. Matrices and Gaussian Elimination Math 0, Winter 07 Final Exam Review Chapter. Matrices and Gaussian Elimination { x + x =,. Different forms of a system of linear equations. Example: The x + 4x = 4. [ ] [ ] [ ] vector form (or the column

More information

18.06 Problem Set 8 - Solutions Due Wednesday, 14 November 2007 at 4 pm in

18.06 Problem Set 8 - Solutions Due Wednesday, 14 November 2007 at 4 pm in 806 Problem Set 8 - Solutions Due Wednesday, 4 November 2007 at 4 pm in 2-06 08 03 Problem : 205+5+5+5 Consider the matrix A 02 07 a Check that A is a positive Markov matrix, and find its steady state

More information

Math 4242 Fall 2016 (Darij Grinberg): homework set 8 due: Wed, 14 Dec b a. Here is the algorithm for diagonalizing a matrix we did in class:

Math 4242 Fall 2016 (Darij Grinberg): homework set 8 due: Wed, 14 Dec b a. Here is the algorithm for diagonalizing a matrix we did in class: Math 4242 Fall 206 homework page Math 4242 Fall 206 Darij Grinberg: homework set 8 due: Wed, 4 Dec 206 Exercise Recall that we defined the multiplication of complex numbers by the rule a, b a 2, b 2 =

More information

Nonlinear Programming Algorithms Handout

Nonlinear Programming Algorithms Handout Nonlinear Programming Algorithms Handout Michael C. Ferris Computer Sciences Department University of Wisconsin Madison, Wisconsin 5376 September 9 1 Eigenvalues The eigenvalues of a matrix A C n n are

More information

= W z1 + W z2 and W z1 z 2

= W z1 + W z2 and W z1 z 2 Math 44 Fall 06 homework page Math 44 Fall 06 Darij Grinberg: homework set 8 due: Wed, 4 Dec 06 [Thanks to Hannah Brand for parts of the solutions] Exercise Recall that we defined the multiplication of

More information

Equality: Two matrices A and B are equal, i.e., A = B if A and B have the same order and the entries of A and B are the same.

Equality: Two matrices A and B are equal, i.e., A = B if A and B have the same order and the entries of A and B are the same. Introduction Matrix Operations Matrix: An m n matrix A is an m-by-n array of scalars from a field (for example real numbers) of the form a a a n a a a n A a m a m a mn The order (or size) of A is m n (read

More information

1. General Vector Spaces

1. General Vector Spaces 1.1. Vector space axioms. 1. General Vector Spaces Definition 1.1. Let V be a nonempty set of objects on which the operations of addition and scalar multiplication are defined. By addition we mean a rule

More information

Summary of Week 9 B = then A A =

Summary of Week 9 B = then A A = Summary of Week 9 Finding the square root of a positive operator Last time we saw that positive operators have a unique positive square root We now briefly look at how one would go about calculating the

More information

Review problems for MA 54, Fall 2004.

Review problems for MA 54, Fall 2004. Review problems for MA 54, Fall 2004. Below are the review problems for the final. They are mostly homework problems, or very similar. If you are comfortable doing these problems, you should be fine on

More information

THE MINIMAL POLYNOMIAL AND SOME APPLICATIONS

THE MINIMAL POLYNOMIAL AND SOME APPLICATIONS THE MINIMAL POLYNOMIAL AND SOME APPLICATIONS KEITH CONRAD. Introduction The easiest matrices to compute with are the diagonal ones. The sum and product of diagonal matrices can be computed componentwise

More information

Characterization of half-radial matrices

Characterization of half-radial matrices Characterization of half-radial matrices Iveta Hnětynková, Petr Tichý Faculty of Mathematics and Physics, Charles University, Sokolovská 83, Prague 8, Czech Republic Abstract Numerical radius r(a) is the

More information

08a. Operators on Hilbert spaces. 1. Boundedness, continuity, operator norms

08a. Operators on Hilbert spaces. 1. Boundedness, continuity, operator norms (February 24, 2017) 08a. Operators on Hilbert spaces Paul Garrett garrett@math.umn.edu http://www.math.umn.edu/ garrett/ [This document is http://www.math.umn.edu/ garrett/m/real/notes 2016-17/08a-ops

More information

Math Linear Algebra II. 1. Inner Products and Norms

Math Linear Algebra II. 1. Inner Products and Norms Math 342 - Linear Algebra II Notes 1. Inner Products and Norms One knows from a basic introduction to vectors in R n Math 254 at OSU) that the length of a vector x = x 1 x 2... x n ) T R n, denoted x,

More information

Analysis Preliminary Exam Workshop: Hilbert Spaces

Analysis Preliminary Exam Workshop: Hilbert Spaces Analysis Preliminary Exam Workshop: Hilbert Spaces 1. Hilbert spaces A Hilbert space H is a complete real or complex inner product space. Consider complex Hilbert spaces for definiteness. If (, ) : H H

More information

On the order of the operators in the Douglas Rachford algorithm

On the order of the operators in the Douglas Rachford algorithm On the order of the operators in the Douglas Rachford algorithm Heinz H. Bauschke and Walaa M. Moursi June 11, 2015 Abstract The Douglas Rachford algorithm is a popular method for finding zeros of sums

More information

1 Math 241A-B Homework Problem List for F2015 and W2016

1 Math 241A-B Homework Problem List for F2015 and W2016 1 Math 241A-B Homework Problem List for F2015 W2016 1.1 Homework 1. Due Wednesday, October 7, 2015 Notation 1.1 Let U be any set, g be a positive function on U, Y be a normed space. For any f : U Y let

More information

Lecture 19: Polar and singular value decompositions; generalized eigenspaces; the decomposition theorem (1)

Lecture 19: Polar and singular value decompositions; generalized eigenspaces; the decomposition theorem (1) Lecture 19: Polar and singular value decompositions; generalized eigenspaces; the decomposition theorem (1) Travis Schedler Thurs, Nov 17, 2011 (version: Thurs, Nov 17, 1:00 PM) Goals (2) Polar decomposition

More information

ISOMETRIES OF R n KEITH CONRAD

ISOMETRIES OF R n KEITH CONRAD ISOMETRIES OF R n KEITH CONRAD 1. Introduction An isometry of R n is a function h: R n R n that preserves the distance between vectors: h(v) h(w) = v w for all v and w in R n, where (x 1,..., x n ) = x

More information

MATH 240 Spring, Chapter 1: Linear Equations and Matrices

MATH 240 Spring, Chapter 1: Linear Equations and Matrices MATH 240 Spring, 2006 Chapter Summaries for Kolman / Hill, Elementary Linear Algebra, 8th Ed. Sections 1.1 1.6, 2.1 2.2, 3.2 3.8, 4.3 4.5, 5.1 5.3, 5.5, 6.1 6.5, 7.1 7.2, 7.4 DEFINITIONS Chapter 1: Linear

More information

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra. DS-GA 1002 Lecture notes 0 Fall 2016 Linear Algebra These notes provide a review of basic concepts in linear algebra. 1 Vector spaces You are no doubt familiar with vectors in R 2 or R 3, i.e. [ ] 1.1

More information

Linear Algebra Highlights

Linear Algebra Highlights Linear Algebra Highlights Chapter 1 A linear equation in n variables is of the form a 1 x 1 + a 2 x 2 + + a n x n. We can have m equations in n variables, a system of linear equations, which we want to

More information

Where is matrix multiplication locally open?

Where is matrix multiplication locally open? Linear Algebra and its Applications 517 (2017) 167 176 Contents lists available at ScienceDirect Linear Algebra and its Applications www.elsevier.com/locate/laa Where is matrix multiplication locally open?

More information

Foundations of Matrix Analysis

Foundations of Matrix Analysis 1 Foundations of Matrix Analysis In this chapter we recall the basic elements of linear algebra which will be employed in the remainder of the text For most of the proofs as well as for the details, the

More information

TOEPLITZ OPERATORS. Toeplitz studied infinite matrices with NW-SE diagonals constant. f e C :

TOEPLITZ OPERATORS. Toeplitz studied infinite matrices with NW-SE diagonals constant. f e C : TOEPLITZ OPERATORS EFTON PARK 1. Introduction to Toeplitz Operators Otto Toeplitz lived from 1881-1940 in Goettingen, and it was pretty rough there, so he eventually went to Palestine and eventually contracted

More information

. = V c = V [x]v (5.1) c 1. c k

. = V c = V [x]v (5.1) c 1. c k Chapter 5 Linear Algebra It can be argued that all of linear algebra can be understood using the four fundamental subspaces associated with a matrix Because they form the foundation on which we later work,

More information

Spectral inequalities and equalities involving products of matrices

Spectral inequalities and equalities involving products of matrices Spectral inequalities and equalities involving products of matrices Chi-Kwong Li 1 Department of Mathematics, College of William & Mary, Williamsburg, Virginia 23187 (ckli@math.wm.edu) Yiu-Tung Poon Department

More information

Matrix Theory. A.Holst, V.Ufnarovski

Matrix Theory. A.Holst, V.Ufnarovski Matrix Theory AHolst, VUfnarovski 55 HINTS AND ANSWERS 9 55 Hints and answers There are two different approaches In the first one write A as a block of rows and note that in B = E ij A all rows different

More information

1 Invariant subspaces

1 Invariant subspaces MATH 2040 Linear Algebra II Lecture Notes by Martin Li Lecture 8 Eigenvalues, eigenvectors and invariant subspaces 1 In previous lectures we have studied linear maps T : V W from a vector space V to another

More information

Lecture 19: Polar and singular value decompositions; generalized eigenspaces; the decomposition theorem (1)

Lecture 19: Polar and singular value decompositions; generalized eigenspaces; the decomposition theorem (1) Lecture 19: Polar and singular value decompositions; generalized eigenspaces; the decomposition theorem (1) Travis Schedler Thurs, Nov 17, 2011 (version: Thurs, Nov 17, 1:00 PM) Goals (2) Polar decomposition

More information

Linear Algebra Lecture Notes-II

Linear Algebra Lecture Notes-II Linear Algebra Lecture Notes-II Vikas Bist Department of Mathematics Panjab University, Chandigarh-64 email: bistvikas@gmail.com Last revised on March 5, 8 This text is based on the lectures delivered

More information

REPRESENTATION THEORY WEEK 7

REPRESENTATION THEORY WEEK 7 REPRESENTATION THEORY WEEK 7 1. Characters of L k and S n A character of an irreducible representation of L k is a polynomial function constant on every conjugacy class. Since the set of diagonalizable

More information

LINEAR ALGEBRA KNOWLEDGE SURVEY

LINEAR ALGEBRA KNOWLEDGE SURVEY LINEAR ALGEBRA KNOWLEDGE SURVEY Instructions: This is a Knowledge Survey. For this assignment, I am only interested in your level of confidence about your ability to do the tasks on the following pages.

More information

Chapter 4 Euclid Space

Chapter 4 Euclid Space Chapter 4 Euclid Space Inner Product Spaces Definition.. Let V be a real vector space over IR. A real inner product on V is a real valued function on V V, denoted by (, ), which satisfies () (x, y) = (y,

More information

The value of a problem is not so much coming up with the answer as in the ideas and attempted ideas it forces on the would be solver I.N.

The value of a problem is not so much coming up with the answer as in the ideas and attempted ideas it forces on the would be solver I.N. Math 410 Homework Problems In the following pages you will find all of the homework problems for the semester. Homework should be written out neatly and stapled and turned in at the beginning of class

More information

Notes on nilpotent orbits Computational Theory of Real Reductive Groups Workshop. Eric Sommers

Notes on nilpotent orbits Computational Theory of Real Reductive Groups Workshop. Eric Sommers Notes on nilpotent orbits Computational Theory of Real Reductive Groups Workshop Eric Sommers 17 July 2009 2 Contents 1 Background 5 1.1 Linear algebra......................................... 5 1.1.1

More information

The Jordan Normal Form and its Applications

The Jordan Normal Form and its Applications The and its Applications Jeremy IMPACT Brigham Young University A square matrix A is a linear operator on {R, C} n. A is diagonalizable if and only if it has n linearly independent eigenvectors. What happens

More information

11. Convergence of the power sequence. Convergence of sequences in a normed vector space

11. Convergence of the power sequence. Convergence of sequences in a normed vector space Convergence of sequences in a normed vector space 111 11. Convergence of the power sequence Convergence of sequences in a normed vector space Our discussion of the power sequence A 0,A 1,A 2, of a linear

More information

4.1 Eigenvalues, Eigenvectors, and The Characteristic Polynomial

4.1 Eigenvalues, Eigenvectors, and The Characteristic Polynomial Linear Algebra (part 4): Eigenvalues, Diagonalization, and the Jordan Form (by Evan Dummit, 27, v ) Contents 4 Eigenvalues, Diagonalization, and the Jordan Canonical Form 4 Eigenvalues, Eigenvectors, and

More information

The Fundamental Theorem of Linear Algebra

The Fundamental Theorem of Linear Algebra The Fundamental Theorem of Linear Algebra Nicholas Hoell Contents 1 Prelude: Orthogonal Complements 1 2 The Fundamental Theorem of Linear Algebra 2 2.1 The Diagram........................................

More information

A strongly polynomial algorithm for linear systems having a binary solution

A strongly polynomial algorithm for linear systems having a binary solution A strongly polynomial algorithm for linear systems having a binary solution Sergei Chubanov Institute of Information Systems at the University of Siegen, Germany e-mail: sergei.chubanov@uni-siegen.de 7th

More information

j=1 u 1jv 1j. 1/ 2 Lemma 1. An orthogonal set of vectors must be linearly independent.

j=1 u 1jv 1j. 1/ 2 Lemma 1. An orthogonal set of vectors must be linearly independent. Lecture Notes: Orthogonal and Symmetric Matrices Yufei Tao Department of Computer Science and Engineering Chinese University of Hong Kong taoyf@cse.cuhk.edu.hk Orthogonal Matrix Definition. Let u = [u

More information

Tangent spaces, normals and extrema

Tangent spaces, normals and extrema Chapter 3 Tangent spaces, normals and extrema If S is a surface in 3-space, with a point a S where S looks smooth, i.e., without any fold or cusp or self-crossing, we can intuitively define the tangent

More information

The following definition is fundamental.

The following definition is fundamental. 1. Some Basics from Linear Algebra With these notes, I will try and clarify certain topics that I only quickly mention in class. First and foremost, I will assume that you are familiar with many basic

More information

Conceptual Questions for Review

Conceptual Questions for Review Conceptual Questions for Review Chapter 1 1.1 Which vectors are linear combinations of v = (3, 1) and w = (4, 3)? 1.2 Compare the dot product of v = (3, 1) and w = (4, 3) to the product of their lengths.

More information

Linear Algebra March 16, 2019

Linear Algebra March 16, 2019 Linear Algebra March 16, 2019 2 Contents 0.1 Notation................................ 4 1 Systems of linear equations, and matrices 5 1.1 Systems of linear equations..................... 5 1.2 Augmented

More information

OHSx XM511 Linear Algebra: Solutions to Online True/False Exercises

OHSx XM511 Linear Algebra: Solutions to Online True/False Exercises This document gives the solutions to all of the online exercises for OHSx XM511. The section ( ) numbers refer to the textbook. TYPE I are True/False. Answers are in square brackets [. Lecture 02 ( 1.1)

More information

NONCOMMUTATIVE POLYNOMIAL EQUATIONS. Edward S. Letzter. Introduction

NONCOMMUTATIVE POLYNOMIAL EQUATIONS. Edward S. Letzter. Introduction NONCOMMUTATIVE POLYNOMIAL EQUATIONS Edward S Letzter Introduction My aim in these notes is twofold: First, to briefly review some linear algebra Second, to provide you with some new tools and techniques

More information

The Eigenvalue Problem: Perturbation Theory

The Eigenvalue Problem: Perturbation Theory Jim Lambers MAT 610 Summer Session 2009-10 Lecture 13 Notes These notes correspond to Sections 7.2 and 8.1 in the text. The Eigenvalue Problem: Perturbation Theory The Unsymmetric Eigenvalue Problem Just

More information

Chapter 7. Canonical Forms. 7.1 Eigenvalues and Eigenvectors

Chapter 7. Canonical Forms. 7.1 Eigenvalues and Eigenvectors Chapter 7 Canonical Forms 7.1 Eigenvalues and Eigenvectors Definition 7.1.1. Let V be a vector space over the field F and let T be a linear operator on V. An eigenvalue of T is a scalar λ F such that there

More information

Heinz H. Bauschke and Walaa M. Moursi. December 1, Abstract

Heinz H. Bauschke and Walaa M. Moursi. December 1, Abstract The magnitude of the minimal displacement vector for compositions and convex combinations of firmly nonexpansive mappings arxiv:1712.00487v1 [math.oc] 1 Dec 2017 Heinz H. Bauschke and Walaa M. Moursi December

More information

Linear Algebra and Dirac Notation, Pt. 2

Linear Algebra and Dirac Notation, Pt. 2 Linear Algebra and Dirac Notation, Pt. 2 PHYS 500 - Southern Illinois University February 1, 2017 PHYS 500 - Southern Illinois University Linear Algebra and Dirac Notation, Pt. 2 February 1, 2017 1 / 14

More information

Dot Products, Transposes, and Orthogonal Projections

Dot Products, Transposes, and Orthogonal Projections Dot Products, Transposes, and Orthogonal Projections David Jekel November 13, 2015 Properties of Dot Products Recall that the dot product or standard inner product on R n is given by x y = x 1 y 1 + +

More information

SPRING 2006 PRELIMINARY EXAMINATION SOLUTIONS

SPRING 2006 PRELIMINARY EXAMINATION SOLUTIONS SPRING 006 PRELIMINARY EXAMINATION SOLUTIONS 1A. Let G be the subgroup of the free abelian group Z 4 consisting of all integer vectors (x, y, z, w) such that x + 3y + 5z + 7w = 0. (a) Determine a linearly

More information

Math 312 Final Exam Jerry L. Kazdan May 5, :00 2:00

Math 312 Final Exam Jerry L. Kazdan May 5, :00 2:00 Math 32 Final Exam Jerry L. Kazdan May, 204 2:00 2:00 Directions This exam has three parts. Part A has shorter questions, (6 points each), Part B has 6 True/False questions ( points each), and Part C has

More information

Linear Algebra- Final Exam Review

Linear Algebra- Final Exam Review Linear Algebra- Final Exam Review. Let A be invertible. Show that, if v, v, v 3 are linearly independent vectors, so are Av, Av, Av 3. NOTE: It should be clear from your answer that you know the definition.

More information

Topics in linear algebra

Topics in linear algebra Chapter 6 Topics in linear algebra 6.1 Change of basis I want to remind you of one of the basic ideas in linear algebra: change of basis. Let F be a field, V and W be finite dimensional vector spaces over

More information

Topic 1: Matrix diagonalization

Topic 1: Matrix diagonalization Topic : Matrix diagonalization Review of Matrices and Determinants Definition A matrix is a rectangular array of real numbers a a a m a A = a a m a n a n a nm The matrix is said to be of order n m if it

More information

Math 594. Solutions 5

Math 594. Solutions 5 Math 594. Solutions 5 Book problems 6.1: 7. Prove that subgroups and quotient groups of nilpotent groups are nilpotent (your proof should work for infinite groups). Give an example of a group G which possesses

More information

Answers in blue. If you have questions or spot an error, let me know. 1. Find all matrices that commute with A =. 4 3

Answers in blue. If you have questions or spot an error, let me know. 1. Find all matrices that commute with A =. 4 3 Answers in blue. If you have questions or spot an error, let me know. 3 4. Find all matrices that commute with A =. 4 3 a b If we set B = and set AB = BA, we see that 3a + 4b = 3a 4c, 4a + 3b = 3b 4d,

More information

Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88

Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88 Math Camp 2010 Lecture 4: Linear Algebra Xiao Yu Wang MIT Aug 2010 Xiao Yu Wang (MIT) Math Camp 2010 08/10 1 / 88 Linear Algebra Game Plan Vector Spaces Linear Transformations and Matrices Determinant

More information

Numerical Methods I Eigenvalue Problems

Numerical Methods I Eigenvalue Problems Numerical Methods I Eigenvalue Problems Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 MATH-GA 2011.003 / CSCI-GA 2945.003, Fall 2014 October 2nd, 2014 A. Donev (Courant Institute) Lecture

More information

Spanning and Independence Properties of Finite Frames

Spanning and Independence Properties of Finite Frames Chapter 1 Spanning and Independence Properties of Finite Frames Peter G. Casazza and Darrin Speegle Abstract The fundamental notion of frame theory is redundancy. It is this property which makes frames

More information

MATH 20F: LINEAR ALGEBRA LECTURE B00 (T. KEMP)

MATH 20F: LINEAR ALGEBRA LECTURE B00 (T. KEMP) MATH 20F: LINEAR ALGEBRA LECTURE B00 (T KEMP) Definition 01 If T (x) = Ax is a linear transformation from R n to R m then Nul (T ) = {x R n : T (x) = 0} = Nul (A) Ran (T ) = {Ax R m : x R n } = {b R m

More information

THE CYCLIC DOUGLAS RACHFORD METHOD FOR INCONSISTENT FEASIBILITY PROBLEMS

THE CYCLIC DOUGLAS RACHFORD METHOD FOR INCONSISTENT FEASIBILITY PROBLEMS THE CYCLIC DOUGLAS RACHFORD METHOD FOR INCONSISTENT FEASIBILITY PROBLEMS JONATHAN M. BORWEIN AND MATTHEW K. TAM Abstract. We analyse the behaviour of the newly introduced cyclic Douglas Rachford algorithm

More information

UNIT 6: The singular value decomposition.

UNIT 6: The singular value decomposition. UNIT 6: The singular value decomposition. María Barbero Liñán Universidad Carlos III de Madrid Bachelor in Statistics and Business Mathematical methods II 2011-2012 A square matrix is symmetric if A T

More information

1 Linear Algebra Problems

1 Linear Algebra Problems Linear Algebra Problems. Let A be the conjugate transpose of the complex matrix A; i.e., A = A t : A is said to be Hermitian if A = A; real symmetric if A is real and A t = A; skew-hermitian if A = A and

More information