GLOBALLY CONVERGENT LEVENBERG-MARQUARDT METHOD FOR PHASE RETRIEVAL

Size: px
Start display at page:

Download "GLOBALLY CONVERGENT LEVENBERG-MARQUARDT METHOD FOR PHASE RETRIEVAL"

Transcription

1 GLOBALLY CONVERGENT LEVENBERG-MARQUARDT METHOD FOR PHASE RETRIEVAL CHAO MA, XIN LIU, AND ZAIWEN WEN Abstract. In this paper, we consider a nonlinear least squares odel for the phase retrieval proble. Since the Hessian atrix ay not be positive definite and the Gauss-Newton GN) atrix is singular at any optial solution, we propose a odified Levenberg-Marquardt LM) ethod, where the Hessian is substituted by a suation of the GN atrix and a regularization ter. Siilar to the well-known Wirtinger flow ) algorith under certain assuptions, we start fro an initial point provably close to the set of the global optial solutions. Global linear convergence and local quadratic convergence to the global solution set are proved by estiating the sallest nonzero eigenvalues of the GN atrix, establishing local error bound properties and constructing a odified regularization condition. The coputational cost becoes tractable if a preconditioned conjugate gradient PCG) ethod is applied to solve the LM equation inexactly. Specifically, the pre-conditioner is constructed fro the expectation of the LM coefficient atrix by assuing the independence between the easureents and iteration point. Preliinary nuerical experients show that our algorith is robust and it is often faster than the ethod on both rando exaples and natural iage recovery. Key words. Non-convex optiization, phase retrieval, Levenberg-Marquardt ethod, convergence to global optiu AMS subject classification. 49N30, 49N45, 90C6, 90C30, 94A0. Introduction. One popular forulation of the phase retrieval proble is solving a syste of quadratic equations in the for.) y r = a r, z, r =,,...,, where z C n is the decision variable, a r C n are known sapling vectors, a r, z is the inner product between a r and z in C n, a is the agnitude of a C, and y r R are the observed easureents. This proble arises fro any areas of science and engineering such as X-ray crystallography [5, 35], icroscopy [34], astronoy [9], diffraction and array iaging [8, 0], and optics [43]. It also appears in a few other iportant fields, including acoustics [, 3], blind channel estiation in wireless counications [, 0], interferoetry [3], quantu echanics [, 39] and quantu inforation [6]. Many algoriths have been developed to solve.). One of the ost widely used ethod is the error reduction algorith derived by Gerchberg and Saxton [4] and Fienup [0, ]. This approach has been extended as the hybrid input-output HIO) algorith proposed by Fienup []. Bauschke et. al. established a few connections between the ER and HIO algoriths and classical convex optiization ethods in [4]. Based on these connections, they proposed the hybrid projection-refection HPR) ethod in [5]. Luke further developed in [3] the relaxed averaged alternating reflection RAAR) ethod which can often be ore efficient and reliable than the HIO and HPR ethods. The quadratic syste.) can be forulated as the following nonlinear least squares NLS) proble:.) in fz) = z C n a r z ) y r. School of Matheatical Sciences, Peking University, Beijing, CHINA achao pku@pku.edu.cn). State Key Laboratory of Scientific and Engineering Coputing, Acadey of Matheatics and Systes Science, Chinese Acadey of Sciences, CHINA liuxin@lsec.cc.ac.cn). Research supported in part by NSFC grants 0409, 330 and 93305, and the National Center for Matheatics and Interdisciplinary Sciences, CAS. Beijing International Center for Matheatical Research, Peking University, Beijing, CHINA wenzw@pku.edu.cn). 
Research supported in part by NSFC grant 309 and by the National Basic Research Project under the grant 05CB85600.

2 Wen et. al. introduced an alternating direction ethod of ultipliers ADMM) to solve.) in [44] and showed that the ADMM is usually coparable to any existing ethods for both classical and ptychographic phase retrieval probles. In [40], Yoav Shechtan et. al. proposed a daped Gauss-Newton schee. Other approaches include the difference ap DF) algorith developed by Elser [5] and the so-called saddle-point optiization algoriths developed by Machesini [3]. Netrapalli et. al designed alternating iniization ethods in [37]. Although the ethods entioned above often perfor well nuerically, their convergence to the global optial solutions is not clear, yet. Recently, there are a few iportant progress on achieving the global optiality for solving nonconvex optiization probles. Miniizing a coposite function with nonconvex sparse regularization ter is studied in [46, 30, 7]. Sun and Luo proved in [4] that a firstorder ethod converges to global optiality on a atrix copletion proble. Candes et. al. proposed a so-called Wirtinger flow ) algorith for solving the odel.) in [9]. The algorith is consisted of two parts. An initial point z 0 is obtained fro the leading eigenvector of a certain atrix, and the point is refined by a gradient descent schee in the sense of Wirtinger calculus iteratively. When there are no noise involved in the easureents of.), it is proved in [9] that the initialization step yields an initial point z 0 very close to the set of global optial solutions with a high probability. Then it is showed that the algorith converges to the global iniizer in a global linear rate. Since the coputational cost of the each step of the algorith is cheap, the nuerical results see to be practically useful. In this paper, we propose a odified LM approach for solving the NLS odel.). In fact, nuerical ethods for the general NLS probles in rz), where rz) are the residual functions, have been well studied for decades. The Gauss-Newton GN) ethod calculates a search direction deterined by a so-called GN atrix through the first-order inforation. Global convergence to a stationary point can be guaranteed after cobining certain line search techniques. If the NLS has a zero residual at the global optial solutions, the GN atrix equals to the Hessian at these points, which ensures the quadratic local convergence rate of the GN ethod. However, when the Hessian is singular at the solutions, the GN ethod ay fail. Another widely used approach is the LM ethod [9, 33] by adding a regularization ter to the GN atrix. The regularization paraeter is usually updated adaptively in a fashion siilar to the trust-region schee []. The regularization ter akes the LM ethod to conquer the singularity issue. Yaashita and Fukushia established quadratic convergence for singular probles satisfying certain error bound conditions when the regularization paraeter is chosen to be rz) in [45]. Fan and Yuan [8] provided a ore general analysis and extended the applicable regularization paraeters to be a faily µ k = rz k ) δ with δ [, ]. The readers are referred to [4, 8, 3, 36,, 7] and the reference therein, for other algoriths for NLS, including the structured quasi-newton ethod. Our ain contribution is a practical linearly convergent LM ethod with a provable second-order local convergence rate. Our approach is divided into two stages. The first stage is an initialization procedure exactly the sae as the ethod in [9]. 
The second stage is to update the iterate by an LM ethod where the regularization paraeter is based on the residual nor, i.e., the objective function value fz) in.). Since the Hessian is indefinite and calculating a positive definite correction to the Hessian ay be expensive, it is reasonable to use the LM ethod rather than the odified Newton ethod. By estiating the sallest nonzero eigenvalues of the GN atrix, and establishing local error bound properties and a odified regularization condition, we are able to prove that our approach can achieve a globally linear convergence to the global solution set and attain a locally quadratic convergence rate with high probability. In particular, the region of quadratic convergence is estiated explicitly. In order to reduce the coputational cost, the LM equation is solved

3 inexactly by the PCG ethod. The globally linear convergence to the global solution set is still ensured if the accuracy is proportional to the residual. We further construct a siple practical pre-conditioner using the expectation of the LM coefficient atrix by assuing that the easureents and iteration point are independent. Although the LM coefficient atrix tends to be singular close to the optial solution, the PCG ethod still runs soothly since all iterations are perfored in an invariant subspace. Because the condition nuber of the preconditioned coefficient atrix in this subspace is sall, the nuber of iterations of the PCG ethod can be controlled reasonably sall. Consequently, the total coputational cost becoes at least copetitive to the ethod. Our nuerical experients illustrate that the inexact LM ethod indeed outperfors the ethod on both rando exaples and natural iage recovery. We notice that the authors of [6] show local quadratic convergence rate of the odified LM ethod under certain deterinistic local error bound conditions. However, it is not clear how to verify if the original NLS proble.) satisfies these local error bound conditions, and how to estiate an explicit neighborhood around the solution set where these local error bound holds. The difference is that we can prove the existence of certain local error bound condition in a neighborhood close to the solution set with high probability. Although this theoretically neighborhood ay be quite sall when the diension n is large, our analysis is still eaningful for a second-type algorith. In the rest of this paper, we first give a brief description of the approach and its convergence properties in Section. Our proposed LM approach for the Gaussian odel is introduced in Section 3. The theoretical analysis on the exact LM ethod is presented in Section 4. In Section 5, we establish the convergence of the inexact LM fraework and construct a preconditioner for coputing the LM direction. The algorith is extended to the coded diffraction odel and is analyzed in Section 6. Nuerical experients are reported in Section 7 to deonstrate the effectiveness and efficiency of our LM ethod.. Preliinary... Proble Stateent. We first introduce the Gaussian odel for the choices of the sapling vectors. ASSUMPTION.. A proble is called the Gaussian odel if the saple vectors a r C n N 0, I/) + in 0, I/), where N µ, Σ) denotes a Gaussian distribution with ean µ and covariance Σ. It holds a r 6n for r =,,...,. There is no noise in the observation easureents. Naely, the global iniu of.) is zero. Siilar to the analysis in [9], the event a r 6n holds with probability no less than e.5n in Assuption.. During the theoretical analysis in this paper, we always ake this assuption. Hence, e.5n will always be a ter of the probabilities in the ain theores. Since the decision variable z of.) is coplex, we use the Wirtinger derivatives [38] to calculate the derivatives of the objective function. For any z C n, the coplex conjugate of z is written as z. For ease of notation, we define two augented vectors in bold face as.) z = [ z z ] and z = [ z z Then the objective function of.) can be viewed as a function with respect to the variable z, i.e., fz) = ]. z a r a r) z y r ). 3

4 It follows fro the calculation rules of the Wirtinger derivatives that the gradient is gz) := fz) = a r z ) [ ] a y r a.) r)z r ā r a. r ) z For convenience, we denote fz) := ) a r z y r ar a r)z... The Algorith. We briefly review the algorith in [9] in this subsection. The initial point is constructed fro the eigenvector corresponding to the largest eigenvalue of a atrix Y = y r a r a r. The detailed procedure is outlined in Algorith. It is shown in [9] that the initialization procedure can generate a good approxiation to the set of optial solutions. In fact, let x C n be an optial solution to.) and assue that x is independent of a r. The expectation of Y is EY = xx + x I, whose leading eigenvector is parallel to x. When is sufficiently large, Y is close to its expectation so that the angle r yr between x and the leading eigenvector of Y is sall, and n is close to x r. ar Algorith : Initialization in the ethod Input easureents {a r } and observations {y r } r =,,..., ). Calculate z 0 to be the leading eigenvector of Y = y r a r a r. 3 Noralize z 0 such that z 0 r = n y r r a r. Once an initial point z 0 is obtained, the ethod executes gradient descent steps via Wirtinger derivative using a restricted step size.3) z k+ = z k µ k z 0 : µ k z 0 fz k). The update of the conjugates { z k } is oitted since it is equivalent to the calculation of {z k }. Let x C n be an optial solution to.). For each z C n, the distance between x and z is easured as distz, x) = in z φ [0,π] eiφ x = z + x z x. The next theore shows the property of the initialization Algorith and the global linear convergence of the algorith.3). When the nuber of easureents is sufficiently large, the spectral initialization can produce a good initial point. Consequently, by initiating fro this point, a linear convergence can be achieved with high probability. THEOREM.. Theore 3.3 of [9]) Suppose that Assuption. holds. Let x C n be any solution of.), c 0 n log n, where c 0 is a sufficiently large constant. Then the initial estiate z 0 noralized to have a squared Euclidean nor equal to r y r, obeys.4) distz 0, x) 8 x with probability at least 0e γn 8/n γ is a fixed positive constant). Let {z k } be a sequence generated by.3) starting fro any initial solution z 0 obeying.4) with µ k = µ c /n for all k and soe fixed constant c. Then there is an event of probability at least 3e γn e.5 8/n, such that on this event, we have distz k, x) µ ) k/.5) x

5 3. A Modified LM Method. The algorith is essentially a gradient descent ethod with a restricted step size. Since the odel.) is a NLS proble, it is natural to consider the LM ethod for a faster local convergence rate than the ethod. Using the calculation rules of the Wirtinger derivatives, we obtain the Jacobian and GN atrix of fz): 3.) 3.) Jz) := [ ] a z a, a z a,, a z a a z ā, a z ā,, a, z ā Ψz) := Jz) Jz) = [ a r z a r a r a rz) a r a r a rz) ā r a r a rz ā r a r ]. The LM direction s k is calculated by solving the following linear syste 3.3) Ψ µ k z k s k = gz k ), where µ k 0 and Ψ µ z = Ψz) + µi. Then the iteration schee of the LM algorith is 3.4) z k+ = z k + s k. The role of the paraeter µ k is iportant. It can be updated siilar as the strategies in the classic trust-region type algoriths. For the sake of theoretical analysis, we propose the following updating rules for the Gaussian odel: 3.5) { 70000n nfzk ), if fz µ k = k ) 900n z k ; fzk ), otherwise. Roughly speaking, when the residual is large and the iteration is far away fro the optial solution set, the larger paraeter µ k = 70000n nfz k ) can guarantee a global linear convergence. As long as the residual becoes sall enough, the choice of µ k = fz k ) adapted fro [45, 8] ensures a fast local convergence rate. To further iprove the efficiency of the LM algorith in practice, the equation 3.3) can be solved inexactly after reaching certain criterion, such as 3.6) Ψ µ k z k s k + gz k ) η k gz k ) for soe constant η k 0. With a suitably chosen paraeter η k, a global linear convergence rate of the LM ethod can be guaranteed while a better nuerical perforance than the exact LM ethod can be achieved. The fraework of the exact and inexact LM ethod are unified in Algorith. Algorith : An Modified LM ethod for Phase Retrieval Input: Measureents {a r }, observations {y r }. Set ɛ 0. Construct an initial guess z 0 using Algorith. Set k := 0. 3 while gz k ) ɛ do 4 Copute s k by solving 3.3) with µ k specified in 3.5) until 3.6) is satisfied. 5 Set z k+ by 3.4) and k := k +. 6 Output: z k. Siilar to the ethod, the calculation involving the conjugates of {z k } is not necessary. As we will describe later in Section 5., the LM equation 3.3) can be solved by the PCG ethod which only consists of a series of vector suations and atrix-vector 5

6 ultiplications. It allows us to calculate s k without considering its conjugate. Therefore, the coputational cost and storage are reduced. Nevertheless, for convenience of theoretical analysis, we still deal with atrices in C n n and treat variables in C n. We should ention that the GN and Newton ethods are not used because of singularity issues. Note that the NLS.) adits a zero residual at an optial solution under Assuption.. The GN atrix Ψz) equals to the Hessian at this solution and they are ostly singular. Consequently, Newton and GN ethods cannot be eployed directly. The odified Newton ethod is not practical either because the Hessian is indefinite and it is often intractable to calculate a suitable regularization paraeter. Our odified LM ethod whose paraeter µ k tending to zero conquers the singularity issue and ensures a local quadratic convergence. 4. Analysis of the Exact LM Method. In this section, we analyze the convergence of our LM algorith with η k = 0 in 3.6). The ain result consists of two parts. When fz k ) 900n z k holds, our odified LM algorith can achieve a globally linear convergence with x and guarantees a quadratic convergence rate with high probability. Our ain result on Gaussian odel is stated as follows. THEOREM 4.. Suppose that Assuption. holds. Let x C n be any solution of.) and c 0 n log n, where c 0 is a sufficiently large constant. Let {z k } be a sequence generated by Algorith where the LM equation exactly solved. Then, starting fro any high probability. Otherwise, it iplies distz k, x) 4 n initial solution z 0 satisfying distz 0, x) 8 x, there is an event of probability at least 5e γn 8/n e.5n γ is a fixed positive constant), such that on this event, 4.) where 4.) distz k+, x) < c distz k, x), for all k = 0,,... c := x 4µ k ), if fz k ) 900n z k ; n 9.89 n, otherwise. Furtherore, there exists a sufficiently large integer l satisfying fz l ) < 900n z l. Consequently, it holds for all k l that 4.3) where 4.4) distz k+, x) < c distz k, x), c := n. x The lower bound of the probability of convergence in Theore 4. is of the sae order as that of Theore. although the constant γ is different. When n is sufficiently large, e γn and e.5n becoes negligible copared to the ter /n. Then the probabilities in Theores. and 4. tend to be equal. Since the ethod is onotone, and according to the selection of the paraeter µ k, the coefficient c is uniforly bounded above by x x n 4µ 0 = and tends to n nfz 0) 9.89, which is a constant less than. In this sense, n our linear convergence rate is no worse than the ethod. One advantage of our odified LM ethod is its locally quadratic convergence property. It cannot be derived directly fro the analysis for the deterinistic probles in [8] fro two ain perspectives: i) we adit a ore relaxed region where the local error bound properties hold; ii) the neighborhood of provable quadratic convergence can be estiated specifically. 6

7 4.. Leas for the Proof. Let X C n be the set of optial solutions of.) and the letter x C n be reserved for a solution of.). We first prove Theore 4. in the case x =. In the end, we coplete the proof by showing that the case x can be reduced to the case x =. When z is independent to {a r }, it is easily verified that EΨz) = Φz), where [ ] zz 4.5) Φz) = + z zi zz zz zz + z zi. Although the LM iterates {z k } are not independent to the easureents {a r }, the relationship between Ψz) and Φz) still plays an iportant role in our theoretical analysis. For convenience of notation, we also use Φ µ z = Φz) + µi hereafter. The first lea describes the concentration of the GN atrix at a solution x. LEMMA 4.. For any z C n and δ > 0, there exists a sufficiently large nuber c = cδ). If > cn log n, then Ψz) Φz) δ z holds with probability at least 0e γn 8/n. Lea 4. can be verified in the sae anner of Lea 4.7 in [9]. The next lea is on the saple covariance atrix which can be proved in a siilar fashion. LEMMA 4.3. Assue Ψx) Φx) δ, then I n a r a r δ with probability no less than e γ. On this event, it holds 4.6) δ) u a ru + δ) u, u C n. The next lea reveals the distribution of the eigenvalues of Ψx). LEMMA 4.4. Suppose that x =. Then Φx) has one eigenvalue of 4, one eigenvalue of 0, and all other eigenvalues are. If Ψx) Φx) δ, then the largest eigenvalue of Ψx) is less than 4 + δ. The above lea is straightforward and hence its proof is oitted. Our proof also uses the following lea fro [6]. LEMMA 4.5. Suppose X, X,..., X are i.i.d. real-valued rando variables obeying X r b for soe nonrando b > 0, EX r = 0, and EXr = v. Set σ = axb, v ), then ) PX X y) exp y σ. For any z C n, we define x z to be the vector in X nearest to z, i.e., x z = arg in z x. x X Then, we denote h z = z x z. We now describe a few essential characteristics of Ψz), fz), and gz) near the global solution. The so-called local error bound property is an instinctive 7

8 property of the objective function. Since its proof is different fro that of [9], the detailed analysis is included. The other two properties highly depend on our odified LM ethod. These three properties are the foundation of our analysis. We ephasize that the bold face letters z, u, v, h are the augented vectors defined as.) for z, u, v, h, respectively. LEMMA 4.6. Suppose that Assuption. holds, cn log n where c is sufficiently large, and Ψx) Φx) δ holds with δ = 0.0. Let µ be deterined by 3.5). Then, with probability at least e 3γn, we have the following properties.. Estiate of the sallest nonzero eigenvalues: 4.7) v Ψu)v u v holds for all u, v C n, such that u = v = and Iu v) = 0;. Local error bound property: 4.8) 4 distz, x) fz) 8.04distz, x) ndistz, x) 4, holds for any z satisfying distz, x) 8 ; 3. Regularization condition: 4.9) µz)h Ψ µ z ) gz) 6 h n h gz) holds for any z = x + h, h z 8, and fz) 900n. Proof. ) To prove 4.7), we first prove that for any u, v C n, 4.0) v Ψu)v v Φu)v u v, by eploying Lea 4.5. Then, by the condition Iu v) = 0, we have v Φu)v u v, which copletes the proof. We first consider the case when u and v are fixed. Define X r u, v) = a ru a rv + Rea ru) a r v) ), E r u, v) = EX r u, v), then v Ψu)v = X r u, v), v Φu)v = E r u, v). Let Y r u, v) = E r u, v) X r u, v), we obtain v Φu)v v Ψu)v = Y r u, v). Since Rea ru) a r v) ) a ru a rv, we have X r u, v) 0. In addition, considering u = v =, it is easy to know E r u, v) = u v + + 4Rev u) 8. Hence, we know Y r u, v) 8. Meanwhile, it follows fro the inequalities EEX) X) ) EX ) and X r 4 a ru a rv that EY r u, v) EX r u, v) 6E a ru 4 a rv 4 6 E a ru 8 E a rv 8 = 384. By choosing σ = 384 and y = /, Lea 4.5 iplies Pv Φu)v v Ψu)v 0.5) e

9 Choosing γ to be a sufficiently sall positive nuber, such as 307, we obtain Pv Ψu)v v Φu)v 0.5) e γ. We have verified 4.0) when u and v are a fixed pair of vectors. We next extend the result to any pair of u and v. To achieve this goal, we prove that Y r will not change too uch when the variation of u and v are sall, and use a net on S n S n to coplete the extension. We next define Then for any u, v, v C n, we have gu, v) = Y ru, v) = v Φu)v v Ψu)v. 4.) gu, v) gu, v ) E ru, v) E ru, v ) + X ru, v) X ru, v ) u v u v + 4 Re v u) v u) ) + a ru a r v a rv + Re a r ū) a rv) a rv ) )). For the first two parts of 4.), we have 4.) u v u v = v v ) uu v + v uu v v ) v v, 4.3) For the third part of 4.), we can derive 4.4) a ru a r v a rv Re v u) v u) ) v v. a ru v v ) a r a rv + v a r a rv v ) 6n v v + δ)n v v. A siilar derivation on the fourth part of 4.) gives a ru 4.5) Re a r ū) a rv) a rv ) )) + δ)n v v. Substituting 4.)-4.5) into 4.) yields gu, v) gu, v ) δ)n) v v. Siilarly, for any u, u, v C n, we obtain gu, v) gu, v) δ)n) u u. Hence, for any u, u, v, v C n, it holds gu, v) gu, v ) δ)n) u u + v v ). 9

10 Choose ɛ 48+9+δ)n, such as ɛ = 50n, and let N ɛ be an ɛ-net of S n. Then for any u, v) S n S n, we can find u, v N ɛ N ɛ, satisfying u u + v v 4+96+δ)n. Hence, gu, v) gu, v ). We can choose an N ɛ obeying N ɛ + ɛ )n. Therefore, with probability larger than + ) 4n e γ, ɛ we have for any u, v ) N ɛ N ɛ, gu, v ) 0.5. In this occasion, for any u, v S n, we have which eans gu, v) gu, v ) + gu, v) gu, v ), v Ψu)v v Φu)v = u v + 4Rev u) ) +. This copletes the proof of 4.0). When Iu v) = 0, we have Rev u) ) 0. Therefore, v Ψu)v. In addition, when cn log n and c is sufficiently large, we have + ) 4n e γ = + 500n) 4n n cγn e γ e γ. ɛ This copletes the proof of 4.7). ) We now prove the left hand side of 4.8). Recalling that z = x + h, what we want to prove is that with high probability, 4.6) a r x + h) a rx ) 4 h holds for any h 8. Note that a rx + h) a rx = Rex a r a rh) + a rh. Let h = sy, where s = h R, y C n and y =. Then, it suffices to prove 4.7) Rex a r a ry) + s a ry ), for 0 s 8. We first prove the inequality for a fixed y, then extend the result to any y by using a covering arguent. Since the technique is nearly the sae as what is done in VII.F) of [9], we only suarize the ain steps here. Let X r y, s) := Rex a r a ry) + s a ry ) and Yr y, s) := EX r y, s) X r y, s). Then, by VII.5-7) of [9] and the fact that Ix y) = 0, we can easily calculate: 4.8) EX r y, s) = s + 8sRex y) + 6Rex y) +. 0

11 Using 0 s 8 and X ry, s) 0, we obtain the following estiations Y r y, s) EX r y, s) s + 8s + 8 < 0, EY r y, s) EX r y, s) = s 4 E a ry 8 + 8s 3 E a ry 6 Rex a r a ry) + 4s E a ry 4 Rex a r a ry) +3sE a ry Rex a r a ry) 3 + 6ERex a r a ry) 4 s 4 E a ry 8 + 8s 3 E a rx E a ry 4 + 4s E a rx 4 E a ry +3s E a rx 6 E a ry E a rx 8 E a ry 8 4s s 3 + 9s + 859s < 50. Applying Lea 4.5 with σ = ax50, 0 ) = 50 and y = /4 yields ) P Y r y, s) e γ 4 with γ = /860. It further iplies ) P X r y, s) EX r y, s) e γ. 4 Since EX r y, s) = s + 8sRex y) + 6Rex y) + 8s x y, we have ) 4.9) P X r y, s) 3 e γ. 4 This copletes the proof of 4.9) for a fixed y. In order to extend the result to all y C n, we only need to estiate X r y, s) X r y, s) L y y, for any y, y C n, and find an ɛ-net N ɛ with ɛ /4L). Then, with probability no less than e γ, for all y N ɛ, 4.9) holds for 0 s /8. Under this circustance, for any y C n, we can find a y N ɛ, and have X r y, s) X r y, s) =. X r y, s) X r y, s) This copletes the proof of 4.7) and thus the left side of 4.8). We next prove the right hand side of 4.8). By soe siple calculation, we have f = Reh a r a rx) + a rh ) 4 Reh a r a rx) + a rh 4 4 a rh a rx + a rh 4.

12 Together with the inequalities 4.6), Corollary 7.6 of [9] and VII.9) of [9], we can further obtain Recall the fact δ = 0.0, f can be bounded as f 4 + δ) h δ)n h 4. f 8.04 h n h 4. This copletes the proof of 4.8). 3) Finally, we verify 4.9). The right side of 4.8) iplies h z 900n holds. We notice that 4.0) µh Ψ µ z ) g = h g h Ψ µ z ) Ψ z g. 00 n when fz) Therefore, we estiate the two ters in the right hand side of 4.0), respectively. Siilar to what is done in VII.G) of [9], we obtain 4.) h g 8 h + 000n h g. Let λ i and w i, i =,..., n be the i-th sallest eigenvalue and associated eigenvector of Ψ z, respectively. Suppose that g has the following decoposition g = n c s w s, where c s are coplex nubers. Then, we obtain s= which gives Ψ µ z ) Ψg = n s= λ s λ s + µ c sx s, 4.) h Ψ µ z ) Ψg h Ψ µ z ) Ψg On the other hand, for any y C n, y =, we have λ n λ n + µ h g. y Ψz) Ψx)) y = a r z a rx ) a ry + Re a rz) a rx) ) a ry) ) a r z a rx ) a ry + a rz) a rx) ) a r y ) a ry 4 4.3) a rz a rx + a rz) a rx) ). Siilar to the proof of the right side of 4.8), we can get which gives a r z a rx + a r z) a rx) 6.08 h +.n h 4, 4.4) y Ψz) Ψx)) y.n 6.08 h +.n h n h,

13 where the last inequality uses h /8. Together with Lea 4.4, we obtain λ n n h. Substituting the above eigenvalue evaluation to 4.) and together with µ = 70000n nfz) 35000n n h, we have 4.5) h Ψ µ z ) Ψg n h n h n n h h g n/ n/ n/00 h g n h g 6 h + 300n h g, where the second inequality uses that h 00 a+bz, and n a+cz decreases on z when b < c, and the last inequality uses the relationship h 8. Substituting 4.) and 4.5) into 4.0), we iediately obtain 4.9). This copletes the proof of Lea Proof of Theore 4.. By abuse of the notation, we siply denote z as the current iterate and z + deterined by z + = z Ψ µ z ) gz). Subtracting x fro this equation, we have 4.6) h z + z + x z = h z Ψ µ z ) gz). For the sake of siplicity, we oit the letter z in fz), µz), h z and h z +, and oit the letter z in gz), Ψz) and Φz), when it causes no abiguity. We divide the proof of Theore 4. into two parts. ) We verify 4.) and 4.3) under the condition fz) < z 900n. The updating forula for µ gives µ = fz). Using the left hand side of 4.8) of Lea 4.6, we have h z z f = 900n + h z 5 n, which iplies h 4 n. By the definition of h, we know that Iz h) = 0, which eans z h = 0. On the other hand, it is easy to verify that h z Ψ µ z ) gz) = Ψ µ z ) Ψ µ z h z gz)) = Ψ µ z ) µh + [ h a r a ar a rh r)z ā r a r ) z ] ), and z µh + [ h a r a ar a rh r)z ā r a r ) z 3 ] ) = 0,

14 which further gives I z h + ) h a r a rha r a r)z) = 0. Hence, the eigenvalue estiate 4.7) iplies that the sallest eigenvalue of Ψ restricted in the subspace S := {v Iz v) = 0} is z. Therefore, the largest eigenvalue of Ψ µ z ) restricted in S is. Then, we can obtain z +µ 4.7) 4.8) Denote v := h + Ψµ z ) µh + z µh + + µ h a r a rha r a r)z, we obtain 4.9) h + [ h a r a ar a rh r)z ā r a r ) z ] ) [ h a r a ar a rh r)z ā r a r ) z ] ). z + µ µh + v µ z + µ h + z + µ v. By using the definition of µ and the local error bound condition 4.8), it holds that 4.30) 4.3) z + µ = x h + f h + h + h, µ = f 8.04 h n h.0 h +.4 n h We next estiate v. For any u C n and u =, using 4.6), Corollary 7.6 of [9] and VII.9) of [9], we have u v a rh a rz a ru = a rh a rx + h) a ru a rh 3 a ru + a rh a rx a ru 6n h a rh + + δ) a rh 4 6n + δ) h 3 + 6n + δ) + δ) h. Therefore, the nor of v can be bounded by 4.3) v = v 6n + δ) h 3 + 6n + δ) + δ) h ) = 3.03n h n h. Substituting 4.30)-4.3) into 4.9) yields h + µ h + v 4.0 h +.48 n h n h n h. Using the fact that h = h 7, we further have n h + < n) h < n 9.89 h, n 4

15 which guarantees the inequalities 4.) and 4.3) under the situation that fz k ) < z 900n. ) We next consider the case under the conditions fz) z 900n and h 8. Recalling the inequality 4.8) and 4.9) in Lea 4.6, and the positive definiteness of Ψ z, we obtain h + h Ψ µ z ) g h h Ψ µ z ) g + µ g ) ) h + 8µ µ g 66000n h µ ) h + ) 8µ µ 35000n n h g ) h, 66000n h 8µ which iplies 4.33) h + ) h. 4µ Therefore, we finish the proof for the special case x =. 3) Finally, we consider x. By observing the iteration schee 3.4), it is not difficult to verify that starting fro z0, the kth LM iteration for a proble with a solution x x is x zk+ x = z k x + s k x. Therefore, by the previous proof for the case x =, we have zk+ dist x, dist zk+ x, x ) x < x x dist zk x, x x ) 4 µk x ) < n, which yields 4.) and 4.3). This copletes the proof for all x. ) ) zk dist x, x, x 5. Analysis of the Inexact LM Method. In this section, we first establish the convergence result for the inexact LM fraework, then present a pre-conditioned conjugate gradient PCG) ethod for solving the LM syste inexactly and provide a practically useful choice of the pre-conditioner. 5.. Convergence of the Inexact LM Method. The following theore describes the global linear convergence of the inexact LM ethod. THEOREM 5.. Suppose that Assuption. holds. Let x C n be any solution of.), and c 0 n log n, where c 0 is a sufficiently large constant. Assue that {z k } is a sequence generated by Algorith with the paraeter 5.) η k := { c ).35n µ k x, if fz k) 900n z k ; 9.89 n c x 49.57n x µk gz k ) x, otherwise. 5

16 Then, starting fro any initial solution z 0 satisfying distz 0, x) x /8, there is an event of probability at least 5e γn 8/n e.5n, such that on this event, it holds that 5.) distz k+, x) < + c distz k, x), for all k = 0,,... distz k+, x) < 9.89 n + c x distz k, x), for all k l, x where c, c are defined by 4.) and 4.4), respectively, and l satisfying fz l ) < 900n z l. Proof. We only prove the result when x = and fz k ) 900n z k. The other part can be proved in the sae anner and hence oitted. Let z k+ := z k Ψ µ k z k ) gz k ) be the exactly LM step at the k-th iteration. By using Theore 4., we have 5.3) distz k+, x) < c distz k, x), and 5.4) distz k+, x) distz k+, x) + z k+ z k+ = distz k+, x) + s k + Ψ µ k z k ) gzk ) = distz k+, x) + Ψ µ k z k ) Ψ µ k z k s k + gz k )) distz k+, x) + η µk gz k ). By using Lea 4.3, Lea 4.6 and Cauchy-Schwarz inequality, we obtain gz k ) = a r z k a rx ) a r a rz k 4n zk + δ) a rz a rx ) 5.5).35n h k. Substituting 5.3), 5.5) and the updating forula 5.) into 5.4), we iediately obtain the relationship 5.). Theore 5. tells us that if η k takes the order of fz k ), the inexactly LM ethod guarantees a global linear convergence to a global solution. When η k takes the order of fz k ) 3, the inexactly LM ethod achieves a local quadratic convergence rate. 5.. Solve the LM Equation by PCG. In this subsection, we discuss the PCG ethod for solving the LM equation 3.3). The CG ethods adits a global linear convergence rate which depends on the condition nuber of the coefficient atrix see [4, 7]). However, the linear syste atrix Ψ µ z tends to be singular as the paraeter µ k decreases, which takes place when the iteration is close enough to the solution set. Our recipe is using the PCG ethod with a suitable pre-conditioner M. Therefore, the original linear syste 3.3) is replaced by 5.6) M Ψ µ z s = M gz). 6

17 5.7) Since EΨz) = Φz) if z is independent to {a r }, we suggest to use a pre-conditioner Φ µ z := Φ z + µ z I n and Φ µ z ) gz) as the initial point of the PCG ethod. A siple verification shows that Φ µ z is positive definite and its inverse has an explicit forulation: 5.8) Φ µ z ) = ai n + bzz + c z z, where a = z + µ, b = 3 z + µ)4 z + µ), c = z + µ)µ. Hence, the linear syste Φ µ z ) s = b can be calculated in On) arithetic operations. The reaining task is to analyze the condition nuber of Φ µ z ) Ψ µ z. Siilar to Ψ µ z, Φ µ z is also nearly singular once µ is sall. Therefore, the condition nuber of Φ µ z ) Ψ µ z is likely to be huge. Fortunately, the subspace V := { x x = [ s s ], s C n } is a coon range space of Φ z and Ψ z. It can be easily verified that any iteration z is in V and Φ µ z ) Ψ µ z z V if z V. It is easy to establish the following convergence property of the CG ethod. LEMMA 5.. Assue that A is a positive seidefinite atrix and V is its range space. Denote A µ := A + µi. Let y V be the solution of the linear syste A µ y = b, and {y k } be the sequence generated by the CG ethod fro an initial point y 0 V. Then, for any k, it holds k κv A y k y A µ µ ) y 0 y A µ, κv A µ ) + ) where y A µ = y A µ y) / and κ V A µ ) refers to the restricted condition nuber κ V A µ ) := ax y V, y y A µ y = in y V, y y A µ y. = Lea 5. shows that one only need to evaluate the restricted condition nuber of Φ µ z ) Ψ µ z. Without loss of generality, we assue x =. Let λ be an eigenvalue of Φ µ z ) Ψ µ z, and y λ be the corresponding eigenvector. Firstly, we have 5.9) Φ µ z ) Ψ µ z y λ y λ = λ. Using the relationship 4.7) and 4.4) and the fact that h 8, we obtain Φ µ z ) Ψ µ z y λ y λ = Φ µ z ) )Ψ µ z Φ µ z )y λ z + µ Ψµ z Φ µ z y λ 7 8) + µ Ψ z Ψ x + Ψ x Φ x + Φ x Φ z ) 7 8) + µ 37.09n h + 0.0). 7

18 Assue µ = Kn nfz), then Hence, we have λ 74.8n h ) < 75 + Kn nfz) K n κ V Φ µ z ) Ψ µ z ).03K n K n 75, which eans the condition nuber is close to if either K or n is large. In each PCG iteration, the coputational cost of the gradient evaluation is On), and the cost of the atrix-vector ultiplications for calculating Φ µ z ) s is also On). Lea 5. shows that the upper bound of the nuber of iterations is related to the restricted condition nuber and the distance between the initial guess and the solution set. Since the restricted condition nuber of Φ µ z ) Ψ µ z is sall, the PCG ethod often takes just a few iterations to achieve a good accuracy. Therefore, the coputational cost at a single iteration of our PCG ethod is not too expensive than that of the ethod. 6. Extensions to the Coded Diffraction CD) Model. We ake the following assuption in this section. ASSUMPTION 6.. A proble is called the CD odel if n 6.) y r = xt) d l t)e iπkt/n, r = l, k), 0 k n, l L, t=0 where xt) and d l t) denote the t-th eleent of x and d l, respectively. Assue that L clog n) 4, where c is a sufficiently large nuerical constant, and d l are i.i.d sapled fro a distribution d, which is syetric and satisfies d M and Ed = 0, Ed = 0, E d 4 = E d ). For the CD odel, an initialization via resapled Wirtinger Flow is introduced in [9] as Algorith 3. By conducting a resapled gradient descent steps, this initialization schee can provide a better initial guess than that of Algorith. Algorith 3: Initialization via the resapled ethod Input easureents {a r } and observations {y r } r =,,..., ). Divide the easureents and observations into B + groups of size = /B + ). The easureents and observations in group b are denoted as a b) r and y b) r for b = 0,,..., B. 3 Obtain u 0 by conducting Algorith on group 0. 4 For b = 0 to B, perfor the following update: u b+ = u b µ u 0 5 Set z 0 = u B. z a b+) r 8 ) y b+) r r )z a b+) r a b+)

19 By eploying Algorith 3, the distance between the initial guess z 0 and a solution x can be iproved to 6.) distz 0, x) 8 n x. Then the ethod can achieve distz k, x) 8 n µ 3 ) k/ x. Readers who are interested in the algorith can refer to section V and VII of [9] for the detailed inforation. The odified LM Algorith can be extended to solve the CD odel directly. For the sake of theoretical analysis, the regularization paraeter µ k is updated as 6.3) { 35000n fzk ), if fz µ k = k ) 360n z k ; fzk ), otherwise. If L clog n) 3, then a counterpart of Lea 4. holds with probability at least L+)/n 3. The first equality in Lea 4.3 also holds with probability no less than /n. Finally, we extend Lea 4.6 to the CD odel. LEMMA 6.. Suppose that Assuption 6. holds, Ψx) Φx) δ holds with δ = 0.0 and µ k is updated by 6.3). Then, with probability at least 3/n, we have. Estiate of the sallest nonzero eigenvalue: 6.4) v Ψu)v u v holds for all u, v, z C n, such that Iu v) = 0, and distu, x) /50 n);. Local error bound property: 6.5) 4 5 distz, x) fz) 8.04distz, x) ndistz, x) 4, holds for any z satisfying distz, x) 8 n ; 3. Regularization condition: 6.6) µz)h Ψ µ z ) gz) 6 h n h gz) holds for any z = x + h, h 8 z, and fz) n 360n. Proof. Since the easureents are not independent fro each other in the CD odel, Lea 4.5 cannot be applied. We first prove 6.4). Note that for u, v, z C n satisfying Iu v) = 0, v Ψu)v = v Φu)v + v Ψu) Φu))v u v + u v + 4Re v u) ) v Ψu) Φu) u v v Ψu) Φu). Hence, 6.4) holds if Ψu) Φu) 3 4 u for all u obeying distu, x) 50 n. Because Ψu) Φu) is hoogeneous for u when x is fixed, we assue u = without loss of generality. Then, we have to prove Ψu) Φu) 3 4. It holds that 6.7) Ψu) Φu) Ψu) Ψx) + Ψx) Φx) + Φx) Φu). 9

20 Taking h = u x leads to h /50 n). By using 4.4), we have y Ψu) Ψx)) y.n6.08 h +.n h 4 ) 0.57, for all y C n, y =. Therefore, Ψu) Ψx) 0.57 is an estiation of the first ter of the right side of 6.7). The second ter of 6.7) satisfies Ψx) Φx) δ = 0.0. Siilar to the first ter, the third ter of the right side of 6.7) can be estiated as Hence, we have Φx) Φu) 8 + h ) h ) Ψu) Φu) = This copletes the proof of 6.4). We next prove the left side of 6.5). By following a b) a b, we have fz) = = a r z a rx ) Reh a ra r x ) + a rh ) Reh a ra r x ) Using Corollary 7.5 of [9] and Ih x) = 0, we know Together with a rh 4. Reh a ra r x ) δ h. a rh 4 6n + δ) h 4 and h /8 n), we obtain fz) δ ) 6 + δ) h > h. The right side of 6.5) can be proven in the sae way as 4.8). Hence, the detailed proof is oitted. Finally, we prove 6.6). Siilar to 4.3) and using h 8, we can estiate n the largest eigenvalue of Ψz): λ n nfz). Therefore, using a sae derivation as 4.5), we obtain h Ψ µ z ) Ψg 6 h n h g, which together with 4.) gives 6.6). 0

21 Consequently, both global linear convergence rate and local quadratic rate can be established for the CD odel based on the above leas. THEOREM 6.3. Suppose that Assuption 6. holds. Let x C n be any solution of.) and {z k } be a sequence generated by Algorith where the LM equation exactly solved and µ k is chosen as 6.3). Then, starting fro any initial solution z 0 obeying distz 0, x) x 8 n, there is an event of probability at least L + )/n 3 /n such that on this event, distz k+, x) < c distz k, x), for all k = 0,,... distz k+, x) < c distz k, x), for all k l, where s satisfies fz s ) < 360n z s, and ) x 4µ 6.9) c := k, if fz k ) 360n z k ; n n, otherwise; 6.0) c := n. x Siilar theoretical results on the inexact LM ethod can also be derived. The proof of the theore follows the sae procedure and shares the sae inequalities, although the calculation is different. We oit the for conciseness. 7. Nuerical Experients. In this section, we present soe nuerical results to deonstrate the perforance of the LM ethod using the paraeter µ k = fz k ), and copare it with the ethod in [9]. 7.. Recovery of D signals. We begin our nuerical experients on -D rando signals under Gaussian and CD odel. In order to ake coparison with the ethod, we choose the sae type of signals as that in [9]: Rando low-pass signals, where x is given by x[t] = M/ k= M/ ) X k + iy k )e πik )t )/n, with M = n/8 and X k and Y k are i.i.d. N 0, ). Rando Guassian signals, where x C n is a rando coplex Gaussian vector with i.i.d. entries of the for X[t] = X + iy, with X and Y distributed as N 0, ). In the initialization step, 50 iterations of power ethod are run to calculate the eigenvector needed in Algorith. For the LM ethod, we solve the LM syste accurately and inaccurately, and these two versions are denoted by and, respectively. For the ethod, we set η k = 0 6 in 3.6). For the ethod, we set the axiu iteration nuber of PCG to be ink +, 5), where k is the iteration nuber in Algorith. We stop the LM algorith after 00 iterations. For the ethod, we use the step length µ k = in exp k/τ 0 ), 0.), where τ 0 330, and stop after 500 iterations. Notice that in each iteration, there are at ost 5 PCG iterations. Consequently, the coputational

22 Gaussian odel Coded diffraction odel Success rate /n Gaussian odel a) Success rate for Gaussian signals L Coded diffraction odel Success rate /n b) Success rate for low-pass signals L FIG. 7.. Epirical probability of success based on 00 rando trials. cost of every PCG iteration is about two ties of a iteration. Hence, considering the calculation of gradient in each iterations, the coputational cost of one iteration is no ore than that of iterations. Therefore, excuting 00 iterations is not ore expensive than 500 iterations. In this experient, we set n = 5 and copare the epirical success rate and the CPU tie of the LM and ethods. The epirical probability of success is an average over 00 trials, where in each instance, new rando sapling vectors are generated according to the Gaussian or CD odels. For coded diffraction odel, we use octanary patterns as the asks in [9]. We declare a trial successful if the relative error of the reconstruction distz final, X )/ x falls below 0 5 before the iteration process is stopped. For a successful trial, we define the CPU tie of this trial to be the tie used until the first iteration after which the relative error is saller than 0 5. Figure 7. shows that around 4.5n Gaussian phaseless easureents or 6 octanary patterns are enough for an exact recovery with high probability for all algoriths. For all tested signals and odels, the success rate of the LM ethods rises a little bit earlier than the ethod as the nuber of easureents increases. The three algoriths perfor siilarly in ters of the success rates. Furtherore, this figure shows that solving the LM equations inexactly does not exert significant ipact on success rate of the LM ethod. We next exaine the order of convergence of the and LM ethods. Figure 7. shows the relationship between the relative error in logarith scale and the nuber of iterations for the three algoriths. To better illustrate the perforance of the LM ethods, we only show errors of the first 40 iterations. We can see fro Figure 7. that the ethod does show

23 0 Gaussian odel 0 Coded diffraction odel 4 4 log0relative error) Iteration 0 Gaussian odel a)gaussian signals Iteration 0 Coded diffraction odel 4 4 log0relative error) Iteration b)low-pass signals Iteration FIG. 7.. Relationship between the relative errors and the nuber of iterations. /n = 6 for Gaussian odel and L = 0 for CD odel. Signal Gaussian signal Low-pass signal odel Gaussian Coded diffraction Gaussian Coded diffraction iter CPU iter CPU iter CPU iter CPU s s s s s s s s s s s s TABLE 7. Coputational results on rando exaples quadratic convergence. As it is expected, the ethod shows linear convergence after the first several iterations. However, its convergence rate is uch faster than that of the ethod. Table 7. presents the averaged nuber of iterations of the LM and ethods used to achieve an accuracy of 0 5 under a fixed /n or L. In the table, the statistics of the ethod is approxiately proportional to the logarith of that of the ethod, which shows the quadratic convergence of the ethod. Although the ethod converges linearly, it converges fast and does not take any ore iterations than the accurate algorith. We should point out that Figure 7. is not very fair to the ethod, since the gradient ethod tends to takes a large nuber of iterations. Therefore, we show the relationship between the relative error and CPU tie in Figure 7.3. Fro this figure, we can see that, although the ethod converges quadratically, it consues uch ore CPU tie than the other two ethods because solving the LM equation accurately needs a lot of PCG iter- 3

24 0 Gaussian odel 0 Coded diffraction odel 4 4 log0relative error) CPU tie 0 Gaussian odel a)gaussian signals CPU tie 0 Coded diffraction odel 4 4 log0relative error) CPU tie b)low-pass signals CPU tie FIG Relationship between the relative errors and CPU tie. /n = 6 for Gaussian odel and L = 0 for CD odel. ations. On the other side, the ethod consues the sallest CPU tie. Table 7. also shows the averaged CPU tie it does not include the CPU tie of the initialization step) of the three ethods to ake a successful recovery. We still can see that the ethod is the ost tie-consuing ethod, while the ethod tends to take uch less CPU tie about /3 or less) than the ethod. Obviously, solving the LM equation is expensive although a proising PCG is eployed, and aking a suitable truncation to the PCG ethod can efficiently reduce the coputational cost. 7.. Perforance on natural iage. We next perfor a few nuerical experients on recovering natural iages, siilar to Section IV.C of [9]. The two iages that we use are colored photographs of the Turret of Palace Museu turret ) and the Milky Way Galaxy galaxy ). The colored iages are viewed as n n 3 arrays, where the first two indices encode the pixel location, and the last is the color band. We run the LM and ethods on each of the three RGB iages. We generate L = 0 rando octanary patterns and gather the CD patterns for each color band using these 0 saples. We run 50 iterations of the power ethod in the initialization step. For the ethod, the stopping tolerance of the PCG is set to η k = 0 6. For the ethod, the axiu iteration nuber of PCG is set to 5. For the ethod, the step length is µ k = in exp k/τ 0 ), 0.4), where τ We perfor 5 LM iterations and 300 iterations. The relative error is calculated by x x F / x F where x and x are the recovered and original iage, respectively. The CPU tie is an average of the CPU tie fro three RGB iages. 4

25 FIG Turret of Palace Museu. The iage size is pixels. FIG The Milky Way Galaxy. The iage size is pixels. Figure 7.4 and 7.5 show the iage turret and galaxy recovered by the, respectively. The iages recovered by the other two algoriths are not reported because they are siilar. Table 7. shows the average nuber of iterations and average CPU tie used by the three algoriths to reduce the relative error to 0 5 and 0 0 for the three color band. We can see an obvious advantage of the ethod over the ethod. However, the CPU tie of the ethod is uch larger than the other two ethods. Figure 7.6 shows the relationship between the relative errors and the nuber of iterations. It obviously deonstrates quadratic convergence of the ethod and fast linear convergence of the ethod. In particular, the ethod takes about one iteration to reduce the accuracy fro 0 5 to 0 0. Figure 7.7 shows the relationship between the CPU tie and the relative errors of the three ethods. Due to the inexactness in solving the LM equation, the ethod takes uch less than than the ethod Phase retrieval with noise. We now evaluate the nuerical perforance of the LM ethods when there exists noises in the observation. We add different level of noises to {y r } and explore the relationship between the signal-to-noise rate SNR) of the observation and the ean square error MSE) of the recovered solution. Specifically, SNR and MSE are 5

26 Iage Turret of Palace Musue The Milky Way Galaxy Criterion iter CPU iter CPU iter CPU iter CPU s s s s s s s s s s s s TABLE 7. Coputational results in natural iage recovering. 0 Turret 0 Milky Way 4 4 log0relative error) Iteratioin Iteratioin FIG Relationship between the relative errors and the nuber of iterations for natural iages recovery. calculated by 7.) MSE := dist x, x) i= x, and SNR := a rx 4 w, where x is the output of the LM ethods after 50 iterations or of the ethod after 500 iterations, and w the added noise. The db-scale of MSE and SNR is calculated by 0 log MSE and 0 log SNR, respectively. We construct rando signals with n = 5, set = 6n for Gaussian odel and L = for the CD odel. The SNR is varied fro 0db to 60db. For each case, 00 Monte Carlo trials are repeated. Figure 7.8 shows the results on the change of MSE versus SNR. It shows that both algoriths achieve a siilar order of accuracy. In fact, both algoriths can converge to the sae iniu x with high probability. 8. Conclusion and future work. In this paper, we develop a odified LM ethod via Wirtinger derivative to solve the phase retrieval proble. Starting fro the sae spectral initialization step as the ethod, our ethod converges to the global solution linearly under the sae assuption as the ethod. The convergence rate is further iproved to be quadratic in a predictable neighborhood of the solution. Siilar theoretical analysis holds even if the LM equation is solved inexactly. In particular, a siple yet useful preconditioner is constructed based on the expectation of the LM coefficient atrix by assuing the independence between easureents and the LM iteration. Since the restricted condition nuber of this preconditioned coefficient atrix is sall, it enables a fast convergence of the PCG ethod for solving the LM equation. In our nuerical experients, we verify that the proposed LM ethod indeed converges quadratically in recovering both rando exaples and natural iages if the LM equation is solved sufficiently accurate. Our inexact LM ethod is coparable to the ethod in ters of the success rate and it has advantage in ters of the CPU tie. 6

Globally Convergent Levenberg-Marquardt Method For Phase Retrieval

Globally Convergent Levenberg-Marquardt Method For Phase Retrieval Globally Convergent Levenberg-Marquardt Method For Phase Retrieval Zaiwen Wen Beijing International Center For Mathematical Research Peking University Thanks: Chao Ma, Xin Liu 1/38 Outline 1 Introduction

More information

Sharp Time Data Tradeoffs for Linear Inverse Problems

Sharp Time Data Tradeoffs for Linear Inverse Problems Sharp Tie Data Tradeoffs for Linear Inverse Probles Saet Oyak Benjain Recht Mahdi Soltanolkotabi January 016 Abstract In this paper we characterize sharp tie-data tradeoffs for optiization probles used

More information

Ch 12: Variations on Backpropagation

Ch 12: Variations on Backpropagation Ch 2: Variations on Backpropagation The basic backpropagation algorith is too slow for ost practical applications. It ay take days or weeks of coputer tie. We deonstrate why the backpropagation algorith

More information

Feature Extraction Techniques

Feature Extraction Techniques Feature Extraction Techniques Unsupervised Learning II Feature Extraction Unsupervised ethods can also be used to find features which can be useful for categorization. There are unsupervised ethods that

More information

Chapter 6 1-D Continuous Groups

Chapter 6 1-D Continuous Groups Chapter 6 1-D Continuous Groups Continuous groups consist of group eleents labelled by one or ore continuous variables, say a 1, a 2,, a r, where each variable has a well- defined range. This chapter explores:

More information

Structured signal recovery from quadratic measurements: Breaking sample complexity barriers via nonconvex optimization

Structured signal recovery from quadratic measurements: Breaking sample complexity barriers via nonconvex optimization Structured signal recovery fro quadratic easureents: Breaking saple coplexity barriers via nonconvex optiization Mahdi Soltanolkotabi Ming Hsieh Departent of Electrical Engineering University of Southern

More information

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search
