A Parallel Multisplitting Solution of the Least Squares Problem


NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS
Numer. Linear Algebra Appl., 5 (1998)

R. A. Renaut
Department of Mathematics, Arizona State University, Tempe, AZ, USA

The linear least squares problem, $\min_x \|Ax - b\|_2$, is solved by applying a multisplitting (MS) strategy in which the system matrix is decomposed by columns into $p$ blocks. The $b$ and $x$ vectors are partitioned consistently with the matrix decomposition. The global least squares problem is then replaced by a sequence of local least squares problems which can be solved in parallel by MS. In MS the solutions to the local problems are recombined using weighting matrices to pick out the appropriate components of each subproblem solution. A new two-stage algorithm which optimizes the global update each iteration is also given. For this algorithm the updates are obtained by finding the optimal update with respect to the weights of the recombination. For the least squares problem presented, the global update optimization can also be formulated as a least squares problem of dimension $p$. Theoretical results are presented which prove the convergence of the iterations. Numerical results which detail the iteration behavior relative to subproblem size, convergence criteria and recombination techniques are given. The two-stage MS strategy is shown to be effective for near-separable problems. © 1998 John Wiley & Sons, Ltd.

KEY WORDS: least squares; QR factorization; iterative solvers; parallel algorithms; multisplitting

Correspondence to R. A. Renaut, Department of Mathematics, Arizona State University, Tempe, AZ, USA.
Received 25 June; Revised 1 May 1997

1. Introduction

We consider the solution of the overdetermined system of linear equations

$$Ax = b \qquad (1.1)$$

where $A$ is an $m \times n$ ($m \geq n$) real matrix of rank $n$, $x$ is a vector of length $n$ and $b$ is a vector of length $m$. Direct solutions of this system can be obtained by a least squares algorithm for

$$\min_{x \in \mathbb{R}^n} \|Ax - b\|_2 \qquad (1.2)$$

via the QR factorization of $A$ [9]. Typical methods for computing the QR decomposition use Householder transformations, Givens transformations, or the Gram–Schmidt process. One possible approach to the parallelization of these least squares algorithms therefore involves the determination of parallel algorithms for orthogonal transformations [4]. Another straightforward method symmetrizes (1.1) by forming the normal equations

$$A^TAx = A^Tb \qquad (1.3)$$

This system can be solved directly using Gaussian elimination, or iteratively using any standard method such as conjugate gradients or a Krylov-subspace algorithm [1]. Parallelization of these algorithms can proceed at the matrix operation level, in which appropriate data mapping allows for efficient realizations of matrix–vector update operations [15].

Each of the parallelization strategies mentioned above has the advantage that all convergence characteristics of the serial algorithm are maintained because the serial algorithm itself is not modified. On the other hand, considered separately, each detail of the algorithm may pose conflicting demands for efficient parallelism. Also, the approach is potentially very time intensive because of lack of portability across architectures. The alternative approach considered here is to develop new algorithms which have great potential for parallelism, are essentially architecture-independent and use as much serial expertise as possible. The multisplitting (MS) philosophy introduced by O'Leary and White [14] for the solution of regular systems of equations meets both goals and could be applied at the system level for the solution of the normal equations (1.3). A direct implementation, however, requires the formation of the operator $A^TA$, which is not desirable. Instead we propose least squares MS algorithms for the solution of (1.2). Specifically, MS is an iterative technique which uses domain partitioning to replace a large-scale problem by a set of smaller subproblems, each of which can be solved independently in parallel. The success of MS relies on an appropriate recombination strategy of the subproblem solutions to give the global solution.

This paper presents three least squares (LS) algorithms based on the MS approach: (i) an LS algorithm with a standard MS approach to solution recombination, (ii) an iterative refinement implementation of the LS algorithm, (iii) a two-stage MSLS algorithm which solves a second LS problem to determine the optimal weights at the recombination phase. Theoretical properties of these algorithms are determined and an estimate of their relative parallel computational costs, ignoring communication, is presented. A performance evaluation via numerical implementation is also provided.

The format of the paper is as follows. In Section 2 we review the linear MS algorithm introduced in [14]. The new algorithms designed for the least squares problem are also presented and estimates of their computational costs are provided. A theoretical analysis of the convergence properties of the algorithms is detailed in Section 3. Results of some numerical tests are reported in Section 4. Finally, conclusions and suggestions for future directions of the research are discussed in Section 5.
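As a point of reference for the two solution routes just described, the following sketch (hypothetical data; NumPy assumed) solves a small instance of (1.2) both via the QR factorization of $A$ and via the normal equations (1.3), and confirms that the two solutions agree for a well-conditioned problem:

```python
import numpy as np

# Hypothetical small overdetermined system: A is m x n with full column rank.
rng = np.random.default_rng(0)
m, n = 20, 5
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# Route 1: QR factorization of A, then x = R^{-1} Q^T b.
Q, R = np.linalg.qr(A)              # reduced QR: Q is m x n, R is n x n
x_qr = np.linalg.solve(R, Q.T @ b)

# Route 2: the normal equations (1.3), which form A^T A explicitly.
x_ne = np.linalg.solve(A.T @ A, A.T @ b)

print(np.allclose(x_qr, x_ne))      # True for this well-conditioned example
```

Forming $A^TA$ squares the condition number of the problem, which is one reason the MS algorithms below work with the least squares formulation (1.2) directly.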

2. Description of multisplitting

2.1. Multisplitting for Ax = b

Iterative methods based on a single splitting, $A = M - N$, are well known [16]. Multisplittings generalize the splitting to take advantage of the computational capabilities of parallel computers. A multisplitting of $A$ is defined as follows:

Definition 2.1. Linear multisplitting (LMS). Given a matrix $A \in \mathbb{R}^{n\times n}$ and a collection of matrices $M^{(j)}, N^{(j)}, E^{(j)} \in \mathbb{R}^{n\times n}$, $j = 1{:}p$, satisfying

(i) $A = M^{(j)} - N^{(j)}$ for each $j$, $j = 1{:}p$,
(ii) $M^{(j)}$ is regular, $j = 1{:}p$,
(iii) $E^{(j)}$ is a non-negative diagonal matrix, $j = 1{:}p$, and $\sum_{j=1}^{p} E^{(j)} = I$.

Then the collection of triples $(M^{(j)}, N^{(j)}, E^{(j)})$, $j = 1{:}p$, is called a multisplitting of $A$ and the LMS method is defined by the iteration:

$$x^{k+1} = \sum_{j=1}^{p} E^{(j)}(M^{(j)})^{-1}(N^{(j)}x^k + b), \quad k = 1, \ldots \qquad (2.1)$$

The advantage of this method is that at each iteration there are $p$ independent problems of the kind

$$M^{(j)}y_j^k = N^{(j)}x^k + b, \quad j = 1{:}p \qquad (2.2)$$

where $y_j^k$ represents the solution to the local problem. The work for each equation in (2.2) is assigned to one (or a set of) processor(s) and communication is required only to produce the update given in (2.1). In general, some (most) of the diagonal elements in $E^{(j)}$ are zero and therefore the corresponding components of $y_j^k$ need not be calculated. If the $y_j^k$ are disjoint, $j = 1{:}p$, the method corresponds to block Jacobi and is called non-overlapping. Then the diagonal matrices $E^{(j)}$ have only zero and one entries. For overlapped subdomains the elements in $E^{(j)}$ need not be just zeros and ones, but Frommer and Pohl [7] showed that the benefit of overlap is the inclusion of extra variables in the minimization for the local variables, and that the updated values on the overlapped portion of the domain should not be utilized, i.e., the weights are still zeros or ones.

Before we continue to develop the MS approach for the solution of the least squares problem it is useful to realize that MS can be seen as a domain decomposition algorithm, in which the update given by (2.2) only provides an updated solution for a portion of the domain. Specifically, the variable domain $x \in \mathbb{R}^n$ is partitioned according to $x = (x_1, x_2, \ldots, x_p)^T$, where each subdomain has $x_i \in \mathbb{R}^{n_i}$ and $\sum_{i=1}^{p} n_i = n$, without overlap of subdomains. The solution $y_j^{k+1}$ is then the solution with updated values only on the $j$th portion of the partition. The same idea is applied to define a least squares multisplitting (LSMS) approach.
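The LMS iteration (2.1) is easy to simulate serially. The sketch below runs under illustrative assumptions (a strongly diagonally dominant $A$, non-overlapping blocks, $M^{(j)}$ equal to $A$ on the $j$th diagonal block and the identity elsewhere, and 0/1 weighting matrices $E^{(j)}$), which reproduce the block Jacobi special case described above:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 12, 3
A = rng.standard_normal((n, n)) + 20.0 * np.eye(n)  # strongly diagonally dominant, so regular
b = rng.standard_normal(n)
blocks = np.array_split(np.arange(n), p)             # index sets of the p subdomains

x = np.zeros(n)
for k in range(100):
    x_new = np.zeros(n)
    for idx in blocks:
        M = np.eye(n)                                # M^(j) is the identity off the block ...
        M[np.ix_(idx, idx)] = A[np.ix_(idx, idx)]    # ... and the diagonal block of A on it
        N = M - A
        y = np.linalg.solve(M, N @ x + b)            # one of the p independent solves (2.2)
        x_new[idx] = y[idx]                          # E^(j) keeps only the jth block in (2.1)
    x = x_new

print(np.linalg.norm(A @ x - b))                     # near zero: the LMS iterates converge
```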

2.2. Linear least squares

For (1.2) the matrix $A$ is partitioned into blocks of columns consistently with the decomposition of $x$ into blocks as $A = (A_1, A_2, \ldots, A_p)$, where each $A_i \in \mathbb{R}^{m\times n_i}$. This is not unlike the row projection methods as described, for example, in [3], except that there the decomposition of $A$ is into blocks of rows. The two approaches are not equivalent. With the column decomposition $Ax = \sum_{i=1}^{p} A_ix_i$, (1.2) can be replaced by the subproblems

$$\min_{y_i\in\mathbb{R}^{n_i}} \|A_iy_i - b_i(x)\|_2, \quad 1 \le i \le p$$

where $b_i(x) = b - \sum_{j\neq i} A_jx_j = b - Ax + A_ix_i$. Clearly each of these subproblems is also a linear least squares problem, amenable to solution by QR factorization of the submatrix $A_i$. Equivalently, denote the solution at iteration $k$ by $x^k = (x_1^k, x_2^k, \ldots, x_p^k)$; then the solution at iteration $k+1$ is found from the solution of the local subproblems according to

$$y_i^{k+1} = \arg\min_{y_i\in\mathbb{R}^{n_i}} \|A_iy_i - b_i(x^k)\|_2, \quad 1 \le i \le p \qquad (2.3)$$

The updated local solution to the global problem is given by

$$x^{k+1} = \sum_{i=1}^{p} \alpha_i^{k+1}\tilde x_i^{k+1} \qquad (2.4)$$

where

$$\tilde x_i^{k+1} = (x_1^k, x_2^k, \ldots, x_{i-1}^k, y_i^{k+1}, x_{i+1}^k, \ldots, x_p^k) \qquad (2.5)$$

the non-negative weights satisfy $\sum_{i=1}^{p} \alpha_i^{k+1} = 1$ and the solutions of the subproblems (2.3) are denoted by $y_i^{k+1}$. This update equation is still valid for overlapped domains with the one–zero weighting scheme, except that the notation has to be modified in (2.3) to indicate that (2.3) is solved with respect to a larger block and that $y_i^{k+1}$ in (2.5) is the update restricted to the local domain. For block $i$ of (2.4)

$$x_i^{k+1} = \alpha_i^{k+1}y_i^{k+1} + \sum_{j=1,\,j\neq i}^{p} \alpha_j^{k+1}x_i^k = \alpha_i^{k+1}y_i^{k+1} + (1 - \alpha_i^{k+1})x_i^k \qquad (2.6)$$

This is completely local and can be rewritten as

$$x_i^{k+1} = x_i^k + \alpha_i^{k+1}(y_i^{k+1} - x_i^k) = x_i^k + \alpha_i^{k+1}\delta_i^{k+1} \qquad (2.7)$$

where now $\delta_i^{k+1} = y_i^{k+1} - x_i^k$ is the step taken on partition $i$.
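To make (2.3) and (2.7) concrete, the following sketch (hypothetical data; a serial stand-in for one parallel sweep) forms each local right-hand side $b_i(x) = b - Ax + A_ix_i$, solves the local problem through precomputed QR factors of the $A_i$, and steps each block by (2.7) with $\alpha_i = 1/p$:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, p = 30, 12, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
blocks = np.array_split(np.arange(n), p)
QRs = [np.linalg.qr(A[:, idx]) for idx in blocks]  # factor each column block A_i once

x = np.zeros(n)
alpha = 1.0 / p
r = b - A @ x                                      # fixed during the sweep: all blocks see x^k
for i, idx in enumerate(blocks):
    b_i = r + A[:, idx] @ x[idx]                   # b_i(x) = b - Ax + A_i x_i
    Q_i, R_i = QRs[i]
    y_i = np.linalg.solve(R_i, Q_i.T @ b_i)        # local least squares solve (2.3)
    x[idx] += alpha * (y_i - x[idx])               # update (2.7) with alpha_i = 1/p

# Starting from x = 0 the new residual norm is no larger than ||b||.
print(np.linalg.norm(b), np.linalg.norm(b - A @ x))
```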

The update of $b_i(x^k)$ in (2.3) can then be expressed as

$$b_i(x^{k+1}) = b_i(x^k) - \sum_{j=1,\,j\neq i}^{p} \alpha_j^{k+1}A_j\delta_j^{k+1} = b_i(x^k) - \sum_{j=1,\,j\neq i}^{p} \alpha_j^{k+1}B_j^{k+1} \qquad (2.8)$$

where $B_j^{k+1} = A_j\delta_j^{k+1}$. The overlapped update of $b_i(x^{k+1})$ follows similarly but does require communication. The basic LSMS algorithm using $p$ processes follows:

Algorithm 2.1. LSMS
For all processes $i$, $1 \le i \le p$,
  calculate $Q_iR_i = A_i$,
  initialize $y_i^0 = x_i^0$, $k = 0$, $\alpha_i^0 = 0$, $\alpha_i^k = \alpha = 1/p$, $b_i(x^0) = b$.
  While not converged, $k = k + 1$
    calculate $A_i\delta_i = B_i$            ! matrix–vector update
    communicate $B_i$ to all processors      ! global communication
    update $b_i$ via (2.8)
    find $y_i$ to solve (2.3)                ! solve local least squares
    $\delta_i = y_i - x_i$
    $x_i = x_i + \alpha\delta_i$             ! update $x_i$
    test for convergence locally
    communicate convergence result to all processors
  end while.
End

This algorithm is highly parallel and completely load-balanced when the problem size is the same for each process. For the overlapped case, modifications of the algorithm are necessary, but because of the zero–one weightings used, these modifications are minor. Testing for convergence can be carried out either locally or globally. In the former case it is only necessary to share a logical variable with all the other processes. Otherwise the vector $x$ must be accumulated and a global check performed. Observe that this algorithm is presented as a slave-only model. A master–slave model requires only minor modification.

From (2.8) and defining the residual $r = b - Ax$,

$$r(x^{k+1}) = r(x^k) - \sum_{j=1}^{p} \alpha_j^{k+1}A_j\delta_j^{k+1} \qquad (2.9)$$

it is easy to see that iterative refinement requires only a minor modification of Algorithm 2.1. Specifically, after the first iteration, the update of $b_i$ by (2.8) can be replaced by the update of $r$ from (2.9), and in the update (2.7) we use $\delta_i^{k+1} = y_i^{k+1}$. The communication and computation costs are unchanged, but the local least squares problem (2.3) is replaced by

$$y_i^{k+1} = \arg\min_{y_i\in\mathbb{R}^{n_i}} \|A_iy_i - r(x^k)\|_2$$

for which the right-hand side is now the same for all subproblems. This algorithm is the MS analog of the iterative refinement procedure for least squares introduced by Golub [8]. In light of the investigation by Higham [13], and to give a fair comparison between methods, we have chosen to implement the iterative refinement (LSMSIR) using single precision residuals. Golub and Wilkinson [10] revealed, however, that the procedure is satisfactory only when the true residual vector is sufficiently small. We might expect, therefore, that LSMSIR will not offer improvement compared with LSMS. This is confirmed by the numerical experiments presented in Section 4.
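Iterating the sweep shown earlier gives a serial simulation of Algorithm 2.1. In this sketch (hypothetical, nearly column-orthogonal data, i.e., a near-separable problem in the paper's sense; the communicated updates of the $b_i$ are replaced by recomputing the global residual):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, p = 40, 12, 4
Q0, _ = np.linalg.qr(rng.standard_normal((m, n)))
A = Q0 + 0.1 * rng.standard_normal((m, n)) / np.sqrt(m)  # nearly orthogonal columns
b = rng.standard_normal(m)
blocks = np.array_split(np.arange(n), p)
QRs = [np.linalg.qr(A[:, idx]) for idx in blocks]
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)     # reference solution for the error check

x = np.zeros(n)
alpha = 1.0 / p                                    # fixed weights summing to one
for k in range(500):
    r = b - A @ x                                  # stands in for the communicated b_i updates
    delta = np.zeros(n)
    for i, idx in enumerate(blocks):
        Q_i, R_i = QRs[i]
        y_i = np.linalg.solve(R_i, Q_i.T @ (r + A[:, idx] @ x[idx]))
        delta[idx] = y_i - x[idx]
    x += alpha * delta                             # simultaneous update (2.7) on every block
    if np.linalg.norm(delta) <= 1e-10 * max(1.0, np.linalg.norm(x)):
        break

print(k, np.linalg.norm(x - x_star))               # converges to the least squares solution
```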

Equation (2.7) suggests that convergence might be improved by use of a global update using a line search procedure. In particular, in [18] it was suggested that one choice would be to set $\alpha_i^{k+1} = \alpha = 1$, which amounts to the update $x_i^{k+1} = y_i^{k+1}$. Another suggested improvement employs a one-dimensional line search dependent on $\alpha = \alpha^{k+1}$, or a $p$-dimensional minimization of the residual over the parameters $\{\alpha_i^{k+1},\ 1 \le i \le p\}$. In the latter case this can be formulated as a least squares minimization

$$\min_{\alpha\in\mathbb{R}^p} \|D\alpha - r(x^k)\|_2 \qquad (2.10)$$

where $D \in \mathbb{R}^{m\times p}$ has columns $d_j^{k+1} = A_j\delta_j^{k+1}$. For $p \ll n$ this represents the non-parallel overhead of a least squares solve, requiring the formation of the QR factorization of $D$, but it has the potential to improve the speed of convergence of the iteration.

Algorithm 2.2. Optimal recombination ORLSMS
For all processors $i$, $1 \le i \le p$,
  calculate $Q_iR_i = A_i$,
  initialize $x_i$, $k = 1$,
  calculate $A_ix_i = B_i$,
  form $b_i$ via (2.8) and $r = b - B$, where $B = \sum_{j=1}^{p} B_j$,
  find $y_i$ to solve (2.3), $\delta_i = y_i - x_i$.
  While not converged
    calculate $A_i\delta_i = d_i$
    communicate $d_i$ to all processors
    calculate $Q_DR_D = D$ and solve (2.10) for $\alpha$
    update $x_i = x_i + \alpha_i\delta_i$
    test convergence
    update $B_i = B_i + \alpha_id_i$
    communicate $B_i$ to all processors
    form $b_i$ via (2.8) and $r = b - B$
    find $y_i$ to solve (2.3)
    $\delta_i = y_i - x_i$,
  end while.
End

Note that in this version of the algorithm we have employed a redundant update in which every process solves the outer least squares problem for $\alpha$. This does have the advantage that each process can keep a record of the global update, provided that the initial guess is known to each process.
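A serial sketch of one combined parallel/synchronization step of Algorithm 2.2 (hypothetical data): the local steps $\delta_i$ are computed as before, the columns $d_j = A_j\delta_j$ are gathered into $D$, and the $p$-dimensional problem (2.10) is solved for the weights:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, p = 40, 12, 4
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
blocks = np.array_split(np.arange(n), p)

x = np.zeros(n)
r = b - A @ x

# Parallel stage: local solves (2.3) give the block steps delta_i.
deltas = []
for idx in blocks:
    y_i, *_ = np.linalg.lstsq(A[:, idx], r + A[:, idx] @ x[idx], rcond=None)
    deltas.append(y_i - x[idx])

# Synchronization stage: gather d_j = A_j delta_j and solve (2.10) for the weights.
D = np.column_stack([A[:, idx] @ d for idx, d in zip(blocks, deltas)])
alpha, *_ = np.linalg.lstsq(D, r, rcond=None)      # a p-dimensional least squares problem

for idx, d, a in zip(blocks, deltas, alpha):
    x[idx] += a * d                                # optimally weighted update of each block

print(np.linalg.norm(b - A @ x) <= np.linalg.norm(r))  # True: the residual cannot grow
```

Since $\alpha = 0$ is feasible in (2.10), the recombined residual $r - D\alpha$ can never exceed the previous one; this monotonicity reappears in the convergence analysis of Section 3.2.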

The per iteration communication cost is now two global vector exchanges. Hence, the per iteration communication costs are twice those without the optimal recombination. Observe, also, that the OR algorithm can be modified to act on the IR algorithm. Again a master–slave mode of the algorithm follows in a straightforward manner.

2.3. Computational performance analysis

We have already remarked on the communication costs of the algorithms presented in the previous section. Our intent here is to evaluate the potential parallelism in these MS algorithms, without consideration of communication bandwidths, cache size or other architecture-dependent factors. Therefore we have to define a measure of parallel efficiency. To do this we estimate, to the highest order, the computation costs associated with each algorithm as compared with a direct solve by the least squares solution of the whole system. We assume that the QR factorization is calculated by Householder transformations, which are a little cheaper than Givens rotations. The serial cost of the QR solution of (1.2) is given by

$$C_S = 2n^2(m - n/3) + mn + n^2$$

where the first term is for the determination of $R$, the second for the update of $b$ and the final term for the back substitution to give $x$. The per process cost for Algorithm 2.1 is

$$C_P = 2n_i^2(m - n_i/3) + K(2mn_i + n_i^2)$$

where $K$ is the total number of iterations required to achieve a specified convergence criterion. The first term represents the formation of the QR factorization of $A_i$. Costs to first order in $m$ or $n$ are ignored. When IR is incorporated, the costs are unchanged, provided the residual is calculated to the same precision as the remainder of the operations. Algorithm 2.2 does have a greater basic iteration cost because of the QR factorization of $D$. Hence in this case

$$C_{POR} = 2n_i^2(m - n_i/3) + K(2mn_i + n_i^2 + 2p^2(m - p/3) + mp + p^2)$$

The percentage parallel efficiency achieved is given by

$$E = 100\,\frac{C_S}{pC_P} \qquad (2.11)$$

where $p$ is the number of processes used in the calculation. For the OR algorithms $C_P$ is replaced by $C_{POR}$. Measurements of these efficiencies are given in our presentation of the numerical results. Note that when overlap is introduced into the systems, the formulae are still valid but with $n_i$ replaced by $n_i = n/p + o$, where $o$ determines the amount of the overlap.
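The cost model is straightforward to evaluate. The helper below is a sketch (the factor of 100 reflects that (2.11) is described as a percentage; the example numbers are purely illustrative) for estimating the efficiency of Algorithm 2.1 from the problem size, block size, overlap and iteration count:

```python
def efficiency(m, n, p, K, o=0):
    """Percentage parallel efficiency E of LSMS against a direct QR solve, per (2.11)."""
    n_i = n // p + o                                     # local block size, n_i = n/p + o
    C_S = 2 * n**2 * (m - n / 3) + m * n + n**2          # serial QR solve of (1.2)
    C_P = 2 * n_i**2 * (m - n_i / 3) + K * (2 * m * n_i + n_i**2)  # per-process LSMS cost
    return 100.0 * C_S / (p * C_P)

# E.g. a 1000 x 400 problem on 8 processes converging in 50 iterations, no overlap.
print(efficiency(1000, 400, 8, 50))
```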

3. Convergence

3.1. Convergence of the linear LSMS algorithm

In order to investigate the convergence of Algorithm 2.1 we need to determine the linear iterative scheme satisfied by the global update $x^{k+1}$. We shall assume that the system matrix $A$ in (1.1) has full rank, so that the solution to (1.2) exists and is unique. Because the matrix $A$ is of full column rank, so are the submatrices $A_i$. Therefore the solution of the least squares problem (2.3) exists, is unique and is given by

$$y_i = (A_i^TA_i)^{-1}A_i^Tb - (A_i^TA_i)^{-1}A_i^T\sum_{j=1,\,j\neq i}^{p} A_jx_j^k$$

Hence the $i$th component of the global update (2.4) can be written

$$x_i^{k+1} = \sum_{j=1}^{p} C_{ij}x_j^k + \alpha_i(A_i^TA_i)^{-1}A_i^Tb$$

where

$$C_{ij} = \begin{cases} (1 - \alpha_i)I_{n_i}, & i = j \\ -\alpha_i(A_i^TA_i)^{-1}A_i^TA_j, & i \neq j \end{cases} \qquad (3.1)$$

Here $I_{n_i}$ is the identity matrix of order $n_i$, the matrices $C_{ij}$ are the $ij$ blocks of a matrix $C$, with block size $n_i \times n_j$, and $C \in \mathbb{R}^{n\times n}$, consistent with $x \in \mathbb{R}^n$. Equivalently, (2.4) becomes

$$x^{k+1} = Cx^k + \tilde b \qquad (3.2)$$

where $\tilde b_i = \alpha_i(A_i^TA_i)^{-1}A_i^Tb$. Moreover, $C = C(\alpha)$, so that convergence depends on the parameter $\alpha$. It is easily seen that iterative refinement for Algorithm 2.1 leads exactly to (3.2). The convergence behavior of both algorithms is thus the same. They differ only in implementation and, consequently, in the effects of finite precision arithmetic.

Theorem 3.1. The iterative scheme defined by (3.2) with $\alpha_i = \alpha = 1$ is a block Jacobi iterative scheme for the solution of the normal equations (1.3).

Proof. Set $\alpha = 1$ in (3.1). Then it is clear that the equivalent form of (3.2) is

$$Mx^{k+1} = Nx^k + \hat b \qquad (3.3)$$

where $M$ is a block diagonal matrix with diagonal blocks $A_i^TA_i$, $N$ is given by

$$N_{ij} = \begin{cases} 0, & i = j \\ -A_i^TA_j, & i \neq j \end{cases}$$

and $\hat b = A^Tb$. Therefore (3.2) solves the equation

$$(M - N)x = \hat b$$

and from (3.1), $M - N = A^TA$ and $\hat b = A^Tb$.
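Theorem 3.1 is easy to check numerically. The sketch below (hypothetical data) assembles the block diagonal $M$ and the blocks $-A_i^TA_j$ of $N$ from a column partitioning and confirms that $M - N = A^TA$:

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, p = 25, 9, 3
A = rng.standard_normal((m, n))
G = A.T @ A                                      # the normal equations matrix
blocks = np.array_split(np.arange(n), p)

M = np.zeros((n, n))
N = np.zeros((n, n))
for i, idx in enumerate(blocks):
    M[np.ix_(idx, idx)] = G[np.ix_(idx, idx)]    # diagonal blocks A_i^T A_i
    for j, jdx in enumerate(blocks):
        if i != j:
            N[np.ix_(idx, jdx)] = -G[np.ix_(idx, jdx)]  # off-diagonal blocks -A_i^T A_j

print(np.allclose(M - N, G))                     # True: M - N = A^T A, as in Theorem 3.1
```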

Corollary 3.1. The iterative scheme defined by (3.2) for fixed $\alpha$, $0 < \alpha < 1$, is a relaxed block Jacobi scheme for the solution of the normal equations (1.3).

The condition for the convergence of (3.2) when $\alpha = 1$ is now immediate, because the system matrix $A^TA$ is symmetric and positive definite (SPD) (see Corollary 5.47 in Chapter 7 of [2]).

Theorem 3.2. The iteration defined by (3.2) with $\alpha_i = \alpha = 1$ converges for any initial vector $x^0$ if and only if $M + N$ is positive definite.

Moreover, the Gauss–Seidel implementation of Algorithm 2.1 necessarily converges, because convergence for successive-over-relaxation (SOR) is given by Corollary 5.48 in Chapter 7 of [2].

Theorem 3.3. The block SOR method converges for all $0 < \alpha < 2$.

It is now helpful to introduce the notation $\bar\mu = \rho(M^{-1}N)$, the spectral radius of $M^{-1}N$; $\mu \in \sigma(M^{-1}N)$, an element in the spectrum of $M^{-1}N$; and $\mu_{\min}$, the smallest eigenvalue of $M^{-1}N$.

Lemma 3.1. All the eigenvalues $\mu$ of $M^{-1}N$ satisfy $|\mu| < 1$.

Proof. The system matrix defined by the normal equations is SPD. $M$ is also SPD and the iteration with $\alpha = 1$ is symmetrizable, i.e., there exists a matrix $W$, $\det W \neq 0$, such that $W(I - M^{-1}N)W^{-1}$ is SPD [11]. In this case a choice for $W$ is $W = M^{1/2}$. Therefore, by the corresponding theorem in [11], the eigenvalues of $M^{-1}N$ are real and satisfy $|\mu| < 1$.

Theorem 3.4. The relaxed iteration converges for any positive $\alpha$ satisfying

$$0 < \alpha < \frac{2}{1 - \mu_{\min}}$$

Proof. The result follows from the observation that the iteration matrix of the relaxed block Jacobi iteration is given by

$$H = (1 - \alpha)I + \alpha M^{-1}N$$

Therefore if $\lambda \in \sigma(H)$, we have $\lambda = 1 - \alpha(1 - \mu)$, and $\rho(H) < 1$ if and only if $0 < \alpha < 2/(1 - \mu_{\min})$.

By Theorem 3.2, $\rho(M^{-1}N) < 1$ when $M + N$ is positive definite, and therefore we can conclude $\mu_{\min} > -1$ in the above to give:

Corollary 3.2. The relaxed block Jacobi iteration converges if $M + N$ is positive definite and $0 < \alpha < 1$.
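The bound of Theorem 3.4 can be explored with the same construction. This sketch (hypothetical data, reusing the $M$, $N$ assembly above) computes $\mu_{\min}$ and the spectral radius of the relaxed iteration matrix $H = (1-\alpha)I + \alpha M^{-1}N$ for one $\alpha$ inside and one outside $2/(1-\mu_{\min})$:

```python
import numpy as np

rng = np.random.default_rng(6)
m, n, p = 25, 9, 3
A = rng.standard_normal((m, n))
G = A.T @ A
blocks = np.array_split(np.arange(n), p)

M = np.zeros((n, n))
for idx in blocks:
    M[np.ix_(idx, idx)] = G[np.ix_(idx, idx)]
J = np.linalg.solve(M, M - G)                    # M^{-1} N, since N = M - A^T A

mu_min = np.linalg.eigvals(J).real.min()         # the eigenvalues are real by Lemma 3.1
bound = 2.0 / (1.0 - mu_min)
for alpha in (0.5 * bound, 1.1 * bound):
    H = (1.0 - alpha) * np.eye(n) + alpha * J    # relaxed block Jacobi iteration matrix
    rho = np.abs(np.linalg.eigvals(H)).max()
    print(alpha, rho)                            # rho < 1 inside the bound, rho > 1 outside
```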

Remark 1. The row projection methods [3] use block algorithms to solve the set of equations

$$AA^Ty = b, \qquad x = A^Ty$$

with and without preconditioning. On the other hand, the column decomposition introduced here solves the normal equations.

3.2. Convergence of the ORLSMS algorithm

Unlike Algorithm 2.1, the convergence of Algorithm 2.2 cannot be investigated by the determination of a linear iteration for $x$. Rather, Algorithm 2.2 needs to be seen as a procedure for the minimization of the non-linear function $f(x) = \|Ax - b\|_2^2$. As such, Algorithm 2.2 can be interpreted as a modification of the parallel variable distribution (PVD) algorithm for non-linear functions, $f(x)$, introduced by Ferris and Mangasarian [6]. In the PVD, minimization of $f$ occurs in two stages, a parallel stage and a synchronization stage. The former corresponds to the determination of the parallel solution of the local problem (2.3), but with the additional local update of the non-local variables by a scaling of the search direction for those variables. These search directions are a set of vectors $d^k$ for iterates $x^k$, usually given by $d^k = \nabla f(x^k)/\|\nabla f(x^k)\|$. A version for which these search directions are taken to be zero is denoted by PVD0. In the synchronization stage, $x^{k+1}$ is updated via the minimization of $f$, but now with respect to the weightings for the linear combination of $x^k$ with the local solutions $\tilde x_i^{k+1}$. The minimization is constrained by the requirement that the update $x^{k+1}$ is a strictly convex combination of $x^k$ and the $\tilde x_i^{k+1}$. This ensures $f(x^{k+1}) < f(x^k)$ and hence a decrease in the objective function each iteration. Convergence of the PVD algorithm for $f \in LC_K^1(\mathbb{R}^n)$ is given by Theorem 2.1 in [6]. Here the set $LC_K^1(\mathbb{R}^n)$ is the set of functions with Lipschitz continuous first partial derivatives on $\mathbb{R}^n$ with Lipschitz constant $K$.

Theorem 3.5. For a bounded sequence $\{d^k\}$, either the sequence $\{x^{k+1}\}$ terminates at a stationary point $x^k$, i.e., a point at which $\nabla f(x) = 0$, or each of its accumulation points is stationary and $\lim_{k\to\infty}\nabla f(x^k) = 0$.

The proof of this theorem employs not only the requirement that $f$ has a Lipschitz continuous gradient and that the sequence $\{d^k\}$ is bounded, but also that in the synchronization step $f(x^{k+1}) \le \frac{1}{p}\sum_{l=1}^{p} f(\tilde x_l^{k+1})$, because of the convex update of $x^k$. In Algorithm 2.2 the search directions are zero, and thus ORLSMS is actually a version of PVD0. Furthermore, for $f(x) = \|Ax - b\|_2^2$ we have $\|\nabla f(y) - \nabla f(x)\|_2 = \|2A^TA(y - x)\|_2$ and $f \in LC_K^1(\mathbb{R}^n)$ with $K = 2\rho(A^TA)$. To prove convergence for ORLSMS it therefore only remains to check that the modification of the synchronization stage employed in Algorithm 2.2 satisfies the assumption $f(x^{k+1}) \le \frac{1}{p}\sum_{l=1}^{p} f(\tilde x_l^{k+1})$ used in the proof of Theorem 3.5. But at synchronization in Algorithm 2.2, $x$ is updated by (2.4), for which $\|r(\tilde x_l^{k+1})\|_2^2 = f(\tilde x_l^{k+1}) < f(x^k)$. Therefore, this condition is necessarily satisfied; otherwise $\frac{1}{p}\sum_{l=1}^{p} f(\tilde x_l^{k+1}) < \sum_{l=1}^{p}\alpha_l^{k+1}f(\tilde x_l^{k+1})$ and the minimum is not found, contradicting the minimization of the outer stage. Thus, Theorem 3.5 applies for the ORLSMS algorithm.

Furthermore, the function $f(x) = \|Ax - b\|_2^2$ is strongly convex,

$$f(y) - f(x) - \nabla f(x)^T(y - x) \ge \frac{k}{2}\|y - x\|_2^2, \quad \forall x, y \in \mathbb{R}^n$$

where $k = 2\lambda_{\min}(A^TA) > 0$, since $A$ has full column rank. Therefore, the linear convergence result of Ferris and Mangasarian [6] also applies.

Theorem 3.6. The sequence $\{x^k\}$ defined by the algorithm ORLSMS converges linearly to the unique solution $x_{LS}$ of (1.2), at the linear root rate

$$\|x^k - x_{LS}\| \le \left(\frac{f(x^0) - f(x_{LS})}{\rho(A^TA)}\right)^{1/2}\left(1 - \frac{1}{p}\left(\frac{k}{K_1}\right)^2\right)^{k/2}$$

where $K_1$ is the Lipschitz constant for $\nabla_l f(x_l)$.

On the contrary, however, when we seek to apply Theorem 3.5 to the LSMS algorithm, we do not have an update at the synchronization stage for which it is necessary that $f(x^{k+1}) \le f(x^k)$. In particular, this reduction in the objective function is just the reduction in the residual function, and we see, not unexpectedly, that when we determine the requirement for this decrease we obtain exactly the restriction on $\alpha$ given by Theorem 3.4. In order to force convergence, the implementation used actually updates $x$ either as given by (2.7) or, when this update does not lead to a decrease in the objective function, the update to $x$ is taken as the local solution $\tilde x_i$ which leads to the minimum of $f$ for that iteration. Therefore, the convergence theory for the PVD is useful in this case for determining the weakness of the relaxed splitting, and it immediately suggests the modification required to force convergence.

4. Numerical results

Here, numerical results of tests of the algorithms in Section 2 are presented for three examples. Further results can be found in [12] and [17]. The first example comes from a table-flatness problem and generates a structured matrix. This is one of the structured matrices used by Duff and Reid in [5], and for the case we chose it generates a matrix of size … with … non-zero entries, which is reasonably conditioned, with condition number 105. Results presented for this test case are referred to as results for measured data. To indicate how the algorithms perform for sparse matrices with arbitrary structure, no attempt was made to fit the splitting to the structure of the problem by reordering the unknowns. For dense matrices, we used two matrices of size … with random entries generated according to a normal probability distribution and a uniform probability distribution, respectively. The condition numbers of these matrices were approximately 20 and 398, respectively. The results presented for these matrices are referred to as results for normal and uniform data, respectively. Note that all matrices used in the evaluation were reasonably well conditioned.

Table 1. Comparison of four methods, for tolerance $10^{-TE}$, $TE = 3$ and $TE = 5$. Measured data. (Columns $P$, $O$, $TE$, and $K_R$, $K_E$ for each of the algorithms LSMS, LSMSIR, ORLSMS and ORLSMSIR.)

Table 2. Comparison of four methods, for tolerance $10^{-TE}$, $TE = 3$ and $TE = 5$. Normal data. (Columns as in Table 1.)

A representative selection of the results is given in Tables 1–7. The notation is as follows:

P     number of processors
O     overlap between domains
TE    $10^{-TE}$ is the tolerance
K_R   number of iterations to convergence $10^{-TE}$ in the $l_2$ norm of the relative error
K_E   number of iterations to convergence $10^{-TE}$ in the $l_2$ norm of the relative residual
N     convergence was not achieved to this tolerance after … iterations

Tables 1–3 present a comparison of the four algorithms without overlap at tolerances $10^{-3}$ and $10^{-5}$. Tables 4–7 show how the algorithms perform when overlap is incorporated. All calculations are in single precision.

Table 3. Comparison of four methods, for tolerance $10^{-TE}$, $TE = 3$ and $TE = 5$. Uniform data. (Columns as in Table 1.)

Table 4. Effect of overlap on convergence for ORLSMS and ORLSMSIR, error convergence. Measured data. (Rows ORLSMS and ORLSMSIR against overlap $O$ for each splitting.)

Table 5. Effect of overlap on convergence for ORLSMS and ORLSMSIR, residual convergence. Measured data. (Layout as in Table 4.)

Table 6. Effect of overlap on convergence for ORLSMS and ORLSMSIR, error convergence. Uniform data. (Layout as in Table 4.)

Table 7. Effect of overlap on convergence for ORLSMS and ORLSMSIR, residual convergence. Uniform data. (Layout as in Table 4.)

In summary, the numerical results show:

(i) Convergence for a minimum residual solution is achieved more quickly than for a minimal error solution.
(ii) Iterative refinement does not improve the convergence rates, for either Algorithm 2.1 or 2.2, confirming the observation of Golub and Wilkinson [10].
(iii) Algorithm 2.2 has generally much better convergence properties for a minimum residual solution than Algorithm 2.1, because of the optimal recombination of the local solutions at each iteration.
(iv) Overlap can improve the rate of convergence. The amount of overlap to use is problem dependent. For a dense matrix the cost of the subproblem solution increases with overlap, so that at some point more overlap is no longer beneficial. For separable or near-separable problems the ideal overlap is often immediately clear. In other cases graph-theoretic techniques may be needed to determine optimal groupings of variables.
(v) Overlap does not always reduce the number of outer iterations to convergence of the OR algorithms. A decrease in the objective function is guaranteed each iteration, but the minimization with respect to the weights, $\alpha$, may lead to different subspaces being weighted differently than in the non-overlapped case. Hence, faster convergence is not guaranteed, i.e., the rate of convergence is dependent on the vectors $\alpha^k$.

Figures 1–5 illustrate the results of Tables 1–7, using the estimate of percentage parallel efficiency given by (2.11). In Figures 1–3 the line types are O, +, X and … for the algorithms LSMS, LSMSIR, ORLSMS and ORLSMSIR, respectively. In Figures 4 and 5 the line types …, O, X and + indicate overlap 0, 10, 20 and 30, respectively. Efficiency for the random matrices is less than for the structured cases. Also, because overlap is more costly, the gain in rate of convergence is not recognized in terms of parallel efficiency when overlap is large. Efficiencies greater than 100 indicate speed-up of the split algorithm as compared with a straightforward direct QR solve. The introduction of OR is effective at improving parallel efficiency.

5. Conclusions

The algorithms presented in this paper provide a viable parallel strategy for the solution of the linear least squares problem. In particular, a two-level approach to minimization in which subproblem solutions are obtained independently, but then combined to give an optimal global update, is very successful. The method has been demonstrated to work not only for a sparse test example but also for dense random matrices. We conclude that the approach is of particular value for:

(i) Problems that are inherently near separable. In these cases the optimal local solutions quickly converge to the global optimum and the cost of each subproblem solve is relatively cheap. The method is then viable.
(ii) Large dense problems which are too memory intensive to be solved on a single-processor machine. Although, in this case, parallel efficiency is very low, the algorithm provides an effective solution technique.

Furthermore, these algorithms have the additional advantages, compared with direct parallelization of the serial algorithm, of simplicity, portability and flexibility.

Figure 1. Comparison of parallel efficiency of algorithms for data from Table 1. (Four panels: error and residual convergence at tolerances $10^{-3}$ and $10^{-5}$.)

Figure 2. Comparison of parallel efficiency of algorithms for data from Table 2. (Four panels: error and residual convergence at tolerances $10^{-3}$ and $10^{-5}$.)

Figure 3. Comparison of parallel efficiency of algorithms for data from Table 3. (Four panels: error and residual convergence at tolerances $10^{-3}$ and $10^{-5}$.)

Figure 4. Comparison of parallel efficiency for overlap 0, 10, 20, 30 for data from Tables 4 and 5. (Four panels: Table 4 error with and without IR; Table 5 residual with and without IR.)

Figure 5. Comparison of parallel efficiency for overlap 0, 10, 20, 30 for data from Tables 6 and 7. (Four panels: Table 6 error with and without IR; Table 7 residual with and without IR.)

REFERENCES

1. R. Barrett, M. Berry, T. Chan, J. Demmel, J. Donato, J. Dongarra, V. Eijkhout, R. Pozo, C. Romine and H. van der Vorst. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. SIAM, Philadelphia, 1994.
2. A. Berman and R. J. Plemmons. Nonnegative Matrices in the Mathematical Sciences. Classics in Applied Mathematics, SIAM, Philadelphia, 1994.
3. R. Bramley and A. Sameh. Row projection methods for large nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 13, 1992.
4. E. Chu and A. George. QR factorization of a dense matrix on a hypercube multiprocessor. SIAM J. Sci. Stat. Comput., 11(5), 1990.
5. I. S. Duff and J. K. Reid. A comparison of some methods for the solution of sparse overdetermined systems of linear equations. J. Inst. Maths. Applics., 17, 1976.
6. M. C. Ferris and O. L. Mangasarian. Parallel variable distribution. SIAM J. Optimization, 4(4), 1994.
7. A. Frommer and B. Pohl. A comparison result for multisplittings and waveform relaxation methods. Numer. Linear Algebra Appl., 2, 1995.
8. G. H. Golub. Numerical methods for solving least squares problems. Numer. Math., 7, 1965.
9. G. H. Golub and C. van Loan. Matrix Computations, second edition. Johns Hopkins Press, Baltimore, 1989.
10. G. H. Golub and J. H. Wilkinson. Note on the iterative refinement of least squares solution. Numer. Math., 9, 1966.
11. L. A. Hageman and D. M. Young. Applied Iterative Methods. Academic Press, New York, 1981.
12. Q. He. Parallel multisplittings for nonlinear minimization. Ph.D. thesis, Arizona State University. In preparation.
13. N. J. Higham. Iterative refinement enhances the stability of QR decomposition methods for solving linear equations. BIT, 31, 1991.
14. D. P. O'Leary and R. E. White. Multi-splitting of matrices and parallel solution of linear systems. SIAM J. Alg. Disc. Meth., 6, 1985.
15. J. M. Ortega. Introduction to Parallel and Vector Solution of Linear Systems. Plenum Press, New York and London, 1988.
16. J. M. Ortega and W. C. Rheinboldt. Iterative Solution of Nonlinear Equations in Several Variables. Academic Press, New York, 1970.
17. R. A. Renaut, Q. He and F.-S. Horng. Parallel multisplitting for minimization. In Grand Challenges in Computer Simulation, A. Tentner, editor. High Performance Computing, Society for Computer Simulation.
18. R. A. Renaut and H. D. Mittelmann. Parallel multisplittings for optimization. J. Parallel Alg. and Appl., 7, 17–27, 1995.


More information

A new Approach for Solving Linear Ordinary Differential Equations

A new Approach for Solving Linear Ordinary Differential Equations , ISSN 974-57X (Onlne), ISSN 974-5718 (Prnt), Vol. ; Issue No. 1; Year 14, Copyrght 13-14 by CESER PUBLICATIONS A new Approach for Solvng Lnear Ordnary Dfferental Equatons Fawz Abdelwahd Department of

More information

The Exact Formulation of the Inverse of the Tridiagonal Matrix for Solving the 1D Poisson Equation with the Finite Difference Method

The Exact Formulation of the Inverse of the Tridiagonal Matrix for Solving the 1D Poisson Equation with the Finite Difference Method Journal of Electromagnetc Analyss and Applcatons, 04, 6, 0-08 Publshed Onlne September 04 n ScRes. http://www.scrp.org/journal/jemaa http://dx.do.org/0.46/jemaa.04.6000 The Exact Formulaton of the Inverse

More information

Numerical Heat and Mass Transfer

Numerical Heat and Mass Transfer Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law:

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law: CE304, Sprng 2004 Lecture 4 Introducton to Vapor/Lqud Equlbrum, part 2 Raoult s Law: The smplest model that allows us do VLE calculatons s obtaned when we assume that the vapor phase s an deal gas, and

More information

Outline and Reading. Dynamic Programming. Dynamic Programming revealed. Computing Fibonacci. The General Dynamic Programming Technique

Outline and Reading. Dynamic Programming. Dynamic Programming revealed. Computing Fibonacci. The General Dynamic Programming Technique Outlne and Readng Dynamc Programmng The General Technque ( 5.3.2) -1 Knapsac Problem ( 5.3.3) Matrx Chan-Product ( 5.3.1) Dynamc Programmng verson 1.4 1 Dynamc Programmng verson 1.4 2 Dynamc Programmng

More information

Annexes. EC.1. Cycle-base move illustration. EC.2. Problem Instances

Annexes. EC.1. Cycle-base move illustration. EC.2. Problem Instances ec Annexes Ths Annex frst llustrates a cycle-based move n the dynamc-block generaton tabu search. It then dsplays the characterstcs of the nstance sets, followed by detaled results of the parametercalbraton

More information

Asymptotics of the Solution of a Boundary Value. Problem for One-Characteristic Differential. Equation Degenerating into a Parabolic Equation

Asymptotics of the Solution of a Boundary Value. Problem for One-Characteristic Differential. Equation Degenerating into a Parabolic Equation Nonl. Analyss and Dfferental Equatons, ol., 4, no., 5 - HIKARI Ltd, www.m-har.com http://dx.do.org/.988/nade.4.456 Asymptotcs of the Soluton of a Boundary alue Problem for One-Characterstc Dfferental Equaton

More information

Least squares cubic splines without B-splines S.K. Lucas

Least squares cubic splines without B-splines S.K. Lucas Least squares cubc splnes wthout B-splnes S.K. Lucas School of Mathematcs and Statstcs, Unversty of South Australa, Mawson Lakes SA 595 e-mal: stephen.lucas@unsa.edu.au Submtted to the Gazette of the Australan

More information

ISSN: ISO 9001:2008 Certified International Journal of Engineering and Innovative Technology (IJEIT) Volume 3, Issue 1, July 2013

ISSN: ISO 9001:2008 Certified International Journal of Engineering and Innovative Technology (IJEIT) Volume 3, Issue 1, July 2013 ISSN: 2277-375 Constructon of Trend Free Run Orders for Orthogonal rrays Usng Codes bstract: Sometmes when the expermental runs are carred out n a tme order sequence, the response can depend on the run

More information

VARIATION OF CONSTANT SUM CONSTRAINT FOR INTEGER MODEL WITH NON UNIFORM VARIABLES

VARIATION OF CONSTANT SUM CONSTRAINT FOR INTEGER MODEL WITH NON UNIFORM VARIABLES VARIATION OF CONSTANT SUM CONSTRAINT FOR INTEGER MODEL WITH NON UNIFORM VARIABLES BÂRZĂ, Slvu Faculty of Mathematcs-Informatcs Spru Haret Unversty barza_slvu@yahoo.com Abstract Ths paper wants to contnue

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

Matrix Approximation via Sampling, Subspace Embedding. 1 Solving Linear Systems Using SVD

Matrix Approximation via Sampling, Subspace Embedding. 1 Solving Linear Systems Using SVD Matrx Approxmaton va Samplng, Subspace Embeddng Lecturer: Anup Rao Scrbe: Rashth Sharma, Peng Zhang 0/01/016 1 Solvng Lnear Systems Usng SVD Two applcatons of SVD have been covered so far. Today we loo

More information

Assortment Optimization under MNL

Assortment Optimization under MNL Assortment Optmzaton under MNL Haotan Song Aprl 30, 2017 1 Introducton The assortment optmzaton problem ams to fnd the revenue-maxmzng assortment of products to offer when the prces of products are fxed.

More information

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS Avalable onlne at http://sck.org J. Math. Comput. Sc. 3 (3), No., 6-3 ISSN: 97-537 COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

More information

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg prnceton unv. F 17 cos 521: Advanced Algorthm Desgn Lecture 7: LP Dualty Lecturer: Matt Wenberg Scrbe: LP Dualty s an extremely useful tool for analyzng structural propertes of lnear programs. Whle there

More information

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number

More information

IV. Performance Optimization

IV. Performance Optimization IV. Performance Optmzaton A. Steepest descent algorthm defnton how to set up bounds on learnng rate mnmzaton n a lne (varyng learnng rate) momentum learnng examples B. Newton s method defnton Gauss-Newton

More information

5 The Rational Canonical Form

5 The Rational Canonical Form 5 The Ratonal Canoncal Form Here p s a monc rreducble factor of the mnmum polynomal m T and s not necessarly of degree one Let F p denote the feld constructed earler n the course, consstng of all matrces

More information

ON A DETERMINATION OF THE INITIAL FUNCTIONS FROM THE OBSERVED VALUES OF THE BOUNDARY FUNCTIONS FOR THE SECOND-ORDER HYPERBOLIC EQUATION

ON A DETERMINATION OF THE INITIAL FUNCTIONS FROM THE OBSERVED VALUES OF THE BOUNDARY FUNCTIONS FOR THE SECOND-ORDER HYPERBOLIC EQUATION Advanced Mathematcal Models & Applcatons Vol.3, No.3, 2018, pp.215-222 ON A DETERMINATION OF THE INITIAL FUNCTIONS FROM THE OBSERVED VALUES OF THE BOUNDARY FUNCTIONS FOR THE SECOND-ORDER HYPERBOLIC EUATION

More information

FREQUENCY DISTRIBUTIONS Page 1 of The idea of a frequency distribution for sets of observations will be introduced,

FREQUENCY DISTRIBUTIONS Page 1 of The idea of a frequency distribution for sets of observations will be introduced, FREQUENCY DISTRIBUTIONS Page 1 of 6 I. Introducton 1. The dea of a frequency dstrbuton for sets of observatons wll be ntroduced, together wth some of the mechancs for constructng dstrbutons of data. Then

More information