2 AN OVERVIEW OF THE TENSOR PRODUCT


IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 10, NO. 3, MARCH 1999

1) The choice of data distribution has a large influence on the performance of the synthesized programs, 2) our simple algorithm for selecting the appropriate data distribution size is very effective, and 3) the dynamic programming approach can always reduce the number of passes to access out-of-core data.

The paper is organized as follows: Section 2 introduces tensor products and discusses formulation of block recursive algorithms using tensor products and other matrix operations. In Section 3, we introduce a two-level computation model and present the semantics of data distributions and data access patterns. Section 4 presents an overview of our out-of-core program synthesis framework. In Section 5 and Section 6, we summarize the performance results and show the effectiveness of using various block-cyclic data distributions. Performance results are presented in Section 7. Finally, conclusions are provided in Section 8.

2 AN OVERVIEW OF THE TENSOR PRODUCT

In this section, we illustrate the formulation of block recursive algorithms using tensor products. We begin with some preliminary definitions which are essential for understanding the rest of the paper.

2.1 Preliminaries

The tensor product is useful in expressing the block structure in a matrix. Let $A$ be an $m \times n$ matrix and $B$ be a $p \times q$ matrix. The tensor product $A \otimes B$ is a block matrix obtained by replacing each element $a_{i,j}$ by the matrix $a_{i,j}B$, i.e.,

$$A_{m,n} \otimes B_{p,q} = \begin{bmatrix} a_{0,0}B_{p,q} & \cdots & a_{0,n-1}B_{p,q} \\ \vdots & \ddots & \vdots \\ a_{m-1,0}B_{p,q} & \cdots & a_{m-1,n-1}B_{p,q} \end{bmatrix}.$$

The above tensor product can be factorized as follows:

$$A_{m,n} \otimes B_{p,q} = (A_{m,n} \otimes I_p)(I_n \otimes B_{p,q}) = (I_m \otimes B_{p,q})(A_{m,n} \otimes I_q),$$

where $I_n$ represents the $n \times n$ identity matrix.

A tensor factorization can be used to efficiently compute $Y^{mp}$ obtained by applying $C_{mp,nq} = A_{m,n} \otimes B_{p,q}$ to vector $X^{nq}$, i.e., $Y^{mp} = C_{mp,nq} X^{nq}$. For example, direct application of $C_{mp,nq}$ to $X^{nq}$ requires $mpnq$ scalar operations. However, the following algorithm based on the tensor factorization of $C_{mp,nq}$:

$$Z^{mq} = (A_{m,n} \otimes I_q) X^{nq}, \qquad Y^{mp} = (I_m \otimes B_{p,q}) Z^{mq},$$

requires only $qmn + mpq$ scalar operations.

A tensor product involving an identity matrix can be implemented as parallel operations. For example, consider the application of $(I_m \otimes A_{p,n})$ to $X^{mn}$, i.e.,

$$\begin{bmatrix} A_{p,n} & & & \\ & A_{p,n} & & \\ & & \ddots & \\ & & & A_{p,n} \end{bmatrix} \begin{bmatrix} X^{mn}(0:n-1) \\ X^{mn}(n:2n-1) \\ \vdots \\ X^{mn}((m-1)n:mn-1) \end{bmatrix} = \begin{bmatrix} A_{p,n}\,X^{mn}(0:n-1) \\ A_{p,n}\,X^{mn}(n:2n-1) \\ \vdots \\ A_{p,n}\,X^{mn}((m-1)n:mn-1) \end{bmatrix}.$$

This can be interpreted as $m$ copies of $A_{p,n}$ acting in parallel on $m$ disjoint segments of $X^{mn}$. However, to interpret the application of $(A_{p,n} \otimes I_m)$ to $X^{mn}$ as parallel operations, we need to understand stride permutations (also known as shuffle permutations). The stride permutation $L^{mn}_{n}$ of a vector $X^{mn}$ is a vector $Y^{mn}$, where

$$Y^{mn} = \left[ X^{mn}(0:mn-1:n),\; X^{mn}(1:mn-1:n),\; \ldots,\; X^{mn}(n-1:mn-1:n) \right],$$

i.e., the first $m$ elements of $Y^{mn}$ are $X^{mn}(0:mn-1:n)$, which represents elements of $X^{mn}$ at stride $n$ starting with element 0. The next $m$ elements are elements of $X^{mn}$ at stride $n$, starting with element 1. The stride permutation $L^{mn}_{n}$ can be represented as an $mn \times mn$ transformation. For example, the effect of applying $L^{6}_{2}$ to $X^{6}$ can be expressed in matrix form as follows:

$$L^{6}_{2} X^{6} = \begin{bmatrix} 1&0&0&0&0&0 \\ 0&0&1&0&0&0 \\ 0&0&0&0&1&0 \\ 0&1&0&0&0&0 \\ 0&0&0&1&0&0 \\ 0&0&0&0&0&1 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} = \begin{bmatrix} x_0 \\ x_2 \\ x_4 \\ x_1 \\ x_3 \\ x_5 \end{bmatrix}.$$

Stride permutations can also be defined in terms of a permutation of the tensor product of vector bases. A vector basis $e^{m}_{i}$, $0 \le i < m$, is a column vector of length $m$ with a one at position $i$ and zeros elsewhere. The tensor product of vector bases is called a tensor basis. A tensor basis $e^{m_1}_{i_1} \otimes \cdots \otimes e^{m_t}_{i_t}$ can be linearized into a vector basis $e^{m_1 \cdots m_t}_{i_1 m_2 \cdots m_t + \cdots + i_{t-1} m_t + i_t}$. Equivalently, a vector basis $e^{M}_{i}$ can be factorized into a tensor product of vector bases $e^{m_1}_{i_1} \otimes \cdots \otimes e^{m_t}_{i_t}$, where $M = m_1 \cdots m_t$ and

$$i_k = (i \;\mathrm{div}\; M_{k+1}) \bmod m_k, \qquad M_k = \prod_{i=k}^{t} m_i, \qquad M_{t+1} = 1.$$

For example, a vector basis $e^{8}_{i}$ can be factorized in this way. The stride permutation can now be defined as:

$$L^{mn}_{n}\left(e^{m}_{i} \otimes e^{n}_{j}\right) = e^{n}_{j} \otimes e^{m}_{i}. \qquad (1)$$

This gives the relationship between the indexing of the input and the output vectors. By linearizing the input tensor basis $e^{m}_{i} \otimes e^{n}_{j}$ to $e^{mn}_{in+j}$, we get the indexing function of the input vector to be $in + j$. Similarly, the indexing function of the output vector is obtained by linearizing the output tensor basis to be $jm + i$. Therefore, the effect of applying the stride permutation $L^{mn}_{n}$ to an input vector is that the element at index $in + j$ of the input vector is stored at index $jm + i$ of the output vector.

Using stride permutations, an application of $(A_{p,n} \otimes I_m)$ to $X^{mn}$ can also be interpreted as $m$ parallel applications of
$A_{p,n}$ to disjoint segments of $X^{mn}$ by using the identity $L^{pm}_{m}(A_{p,n} \otimes I_m) = (I_m \otimes A_{p,n}) L^{mn}_{m}$ as follows:

$$L^{pm}_{m} Y^{pm} = (I_m \otimes A_{p,n})\, L^{mn}_{m} X^{mn},$$

i.e.,

$$\begin{bmatrix} Y^{pm}(0:pm-1:m) \\ Y^{pm}(1:pm-1:m) \\ \vdots \\ Y^{pm}(m-1:pm-1:m) \end{bmatrix} = \begin{bmatrix} A_{p,n}\,X^{mn}(0:mn-1:m) \\ A_{p,n}\,X^{mn}(1:mn-1:m) \\ \vdots \\ A_{p,n}\,X^{mn}(m-1:mn-1:m) \end{bmatrix}.$$

However, the inputs for each application of $A_{p,n}$ are accessed at a stride of $m$ and the outputs are also stored at a stride of $m$.

LI ET AL.: SYNTHESIZING EFFICIENT OUT-OF-CORE PROGRAMS FOR BLOCK RECURSIVE ALGORITHMS USING BLOCK-CYCLIC DATA

The properties of tensor products can be used to transform the tensor product representation of an algorithm into another equivalent form, which can take advantage of the parallel operations discussed above. For example, by using the following tensor product factorizations,

$$A_{m,n} \otimes B_{p,q} = (A_{m,n} \otimes I_p)(I_n \otimes B_{p,q}) = (I_m \otimes B_{p,q})(A_{m,n} \otimes I_q),$$

$A \otimes B$ can be implemented by first applying $q$ parallel applications of $A$ and then $m$ parallel applications of $B$.¹ Several other key properties of tensor products are listed below [1]:

1. $A \otimes B \otimes C = A \otimes (B \otimes C) = (A \otimes B) \otimes C$;
2. $(A \otimes B)(C \otimes D) = AC \otimes BD$, assuming that the ordinary multiplications $AC$ and $BD$ are defined;
3. $\prod_{i=0}^{n-1}(I_m \otimes A_i) = I_m \otimes \left(\prod_{i=0}^{n-1} A_i\right)$;
4. $\prod_{i=0}^{n-1}(A_i \otimes I_m) = \left(\prod_{i=0}^{n-1} A_i\right) \otimes I_m$.

Property 2 is also called factor grouping. Properties 3 and 4 follow from Property 2.

¹ We ignore the dimensions of matrices whenever they are clear from the context.

2.2 Tensor Product Formulation of Block Recursive Algorithms

A block recursive algorithm is obtained from a recursive tensor factorization of a computation matrix. For example, FFT algorithms are derived by tensor factorization of the discrete Fourier transform (DFT) matrix. The algorithms obtained from tensor factorization are computationally more efficient than those that directly use the unfactorized matrix. For example, computing the DFT of a vector of size $N$ by directly multiplying it by an $N \times N$ DFT matrix requires $O(N^2)$ operations compared to only $O(N \log N)$ operations using an FFT algorithm. Some other examples of block recursive algorithms are Strassen's matrix multiplication [11], [13], convolution [8], and fast sine/cosine transforms [18]. A tensor product formulation of a block recursive algorithm has the following generic form:

$$\prod_{j=1}^{k} \left( I_{r_j} \otimes A_{v_j} \otimes I_{c_j} \right),$$

where $A_{v_j}$ is a $v_j \times v_j$ square linear transformation, $\prod_{i=1}^{k} F_i$ denotes $F_k \cdots F_1$, and $r_j v_j c_j = r_i v_i c_i$, for $1 \le i, j \le k$. The computation performed at each step $j$ is $U_j = (I_{r_j} \otimes A_{v_j} \otimes I_{c_j}) V_j$. Due to the presence of identity terms, it is easy to express each computation step using parallel operations. However, the task of harnessing this inherent parallelism in each computation step with the goal of minimizing the parallel I/O operations is nontrivial. We next present tensor product formulations of two FFT algorithms which are used as examples in this paper.

2.2.1 Fast Fourier Transform

The tensor product formulations of various FFT algorithms are presented in [1], [18]. These formulations are obtained by different tensor factorizations of the discrete Fourier transform matrix. Although all of these algorithms are computationally equivalent, they have different computational structures and different data access patterns. For example, consider the following tensor product formulation of the radix-2 decimation-in-time Cooley-Tukey FFT:

$$F_{2^n} = \left[ \prod_{i=1}^{n} \left( I_{2^{n-i}} \otimes F_2 \otimes I_{2^{i-1}} \right) \left( I_{2^{n-i}} \otimes T^{2^i}_{2^{i-1}} \right) \right] R_{2^n}, \qquad (3)$$

and

$$R_{2^n} = \prod_{i=1}^{n} \left( I_{2^{i-1}} \otimes L^{2^{n-i+1}}_{2} \right), \qquad F_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}.$$

$T^{2^i}_{2^{i-1}}$ represents a diagonal matrix of constants and $R_{2^n}$ permutes the input sequence to a bit-reversed order. As can be seen from (3), for an FFT on $2^n$ points, there are $n$ steps in the computation after performing the initial bit-reversal permutation. At each step, the data array from the previous step is scaled by multiplying by twiddle factors, $Y = (I_{2^{n-i}} \otimes T^{2^i}_{2^{i-1}}) X_{i-1}$, followed by the butterfly computation $X_i = (I_{2^{n-i}} \otimes F_2 \otimes I_{2^{i-1}}) Y$.

2.2.2 Matrix Transposition

The transposition of a $p \times q$ matrix $M_{p,q}$ can be expressed using a stride permutation $L^{pq}_{q}$ as $(M^{pq})^T = L^{pq}_{q} M^{pq}$, where $M^{pq}$ is the row-major linear representation of $M_{p,q}$. Various matrix transposition algorithms can be expressed using tensor product formulas involving stride permutations [10]. For example, the block matrix transposition algorithm for transposing a $p \times q$ matrix can be described by the following formula:

$$L^{pq}_{q} = \left( I_{q_2} \otimes L^{p_2 q_1}_{q_1} \otimes I_{p_1} \right) \left( L^{p_2 q_2}_{q_2} \otimes I_{p_1 q_1} \right) \left( I_{p_2 q_2} \otimes L^{p_1 q_1}_{q_1} \right) \left( I_{p_2} \otimes L^{p_1 q_2}_{q_2} \otimes I_{q_1} \right),$$

where $p = p_2 p_1$ and $q = q_2 q_1$. The first (rightmost) factor converts the row-major representation of the input matrix to a row-major representation of the input matrix viewed as a $p_2 \times q_2$ block matrix consisting of $p_1 \times q_1$ size blocks. The
second and third factor express transposition of each block and transposition of the block matrix, respectively. The fourth factor is the inverse of the first and it reverts the block row-major representation to row-major representation of the output. The correctness of this representation can be seen by applying the factors to the input basis $s = e^{p_2}_{i_2} \otimes e^{p_1}_{i_1} \otimes e^{q_2}_{j_2} \otimes e^{q_1}_{j_1}$ to get the following sequence of bases,

$$s \to e^{p_2}_{i_2} \otimes e^{q_2}_{j_2} \otimes e^{p_1}_{i_1} \otimes e^{q_1}_{j_1} \to e^{p_2}_{i_2} \otimes e^{q_2}_{j_2} \otimes e^{q_1}_{j_1} \otimes e^{p_1}_{i_1} \to e^{q_2}_{j_2} \otimes e^{p_2}_{i_2} \otimes e^{q_1}_{j_1} \otimes e^{p_1}_{i_1} \to e^{q_2}_{j_2} \otimes e^{q_1}_{j_1} \otimes e^{p_2}_{i_2} \otimes e^{p_1}_{i_1} = t,$$

and noting that $t = L^{pq}_{q} s$. Note that we have used the identity

$$(A_{m,n} \otimes B_{p,q})(e^{n}_{i} \otimes e^{q}_{j}) = A_{m,n} e^{n}_{i} \otimes B_{p,q} e^{q}_{j}.$$

The basis $t$ is called the output basis.

3 PARALLEL I/O MODEL WITH BLOCK-CYCLIC DATA DISTRIBUTIONS

We use a two-level model which is similar to Vitter and Shriver's two-level memory model []. However, in our model the data on disks (called out-of-core data) can be distributed in different (logical) block sizes. The model consists of a processor with an internal random access memory and a set of disks. The storage capacity of each disk is assumed to be infinite. On each disk, the data is organized as physical blocks with fixed size. Four parameters: $N$ (the size of the input), $M$ (the size of the internal memory), $B_d$ (the size of each physical block), and $D$ (the number of disks) are used in this model. We assume that $M < N$, $1 \le B_d \le M$, and $1 \le D \le M/B_d$.

In this model, disk I/O occurs in physical tracks (defined below) of size $B_d D$. The physical blocks which have the same relative positions on each disk constitute a physical track. The physical tracks are numbered contiguously with the outermost track having the lowest address and the innermost track having the highest address. The $i$th physical track is denoted by $T_i$. Fig. 1 shows an example data layout with $B_d = 4$, $D = 4$, and $N = 64$.

Each parallel I/O operation can simultaneously access $D$ physical blocks, one block from each disk. Therefore, parallelism in data access is at two levels: elements in one physical block are transferred concurrently and $D$ physical blocks can be transferred in one I/O operation. In this paper, we use the striped disk access model in which physical blocks in one I/O operation come from the same track, as opposed to the independent I/O model in which blocks can come from different tracks. We use the parallel primitives, parallel_read(i) and parallel_write(i), to denote the read and write to the physical track $T_i$, respectively. We define the measure of I/O performance as the number of parallel I/Os required.

3.1 Block-Cyclic Data Distributions

Block-cyclic distributions have been used for distributing arrays among processors on a multiprocessor system. A block-cyclic distribution partitions an array into equal sized blocks of consecutive elements and then maps them onto the processors in a cyclic manner. If we regard the disks in the above model as processors, then the data organization described above (e.g., in Fig. 1) is exactly a block-cyclic distribution (denoted as cyclic($B_d$)) with the block size $B_d$. Moreover, we can assume that data can be distributed with an arbitrary block size.² Fig. 2 shows the data organization for the same parameters as in Fig. 1, but with a cyclic(8) distribution. Notice that the size of the physical track and the size of the physical block are not changed. However, they contain different records. We will call the $B$ records in a block formed by a cyclic($B$) distribution a logical block. Similarly, the logical blocks which have the same relative positions on each disk constitute a logical track. The $i$th logical track is denoted as $LT_i$. Note that each parallel I/O operation still accesses a physical track, not a logical track. Hence, several parallel I/O operations are needed to access a logical track. For example, to load the logical track $LT_1$ in Fig. 2, two parallel_read operations, parallel_read(2) and parallel_read(3), which, respectively, load the physical tracks $T_2$ and $T_3$, are needed. We next use a simple example to show the advantages of using logical distributions on developing I/O-efficient programs for block recursive algorithms.

² Cormen has called this data organization on disks a banded data layout [3] and studied the performance for a class of permutations and several other basic primitives of the NESL language [1].

Why Logical Data Distributions?

Assume that we want to implement $F_8 \otimes I_8$ on our target model under the parameters given in Fig. 1. Further, we assume that the size of the main memory is half of the size of the inputs. Because we are mainly interested in data access patterns, we ignore the real computations conducted by $F_8$. The only thing we need to remember is that $F_8$ needs eight elements with a stride of eight because of the existence of the identity matrix $I_8$.

We first consider implementing $F_8 \otimes I_8$ on the physical block distribution. From the above discussion, we know that the first $F_8$ needs to be applied to eight elements: 0, 8, 16, 24, 32, 40, 48, and 56. From Fig. 1, we can see that these elements required by the $F_8$ computation are stored on four physical tracks. However, our main memory can hold only two physical tracks, so we cannot simply load all four physical tracks into the main memory and accomplish the computation in one pass of I/O. To get around this memory limitation, we can use two different approaches. First, we load the first physical track, keep the first half of the records in each physical block in that loaded physical track, and throw away the other half of the records. We do this for every other physical track. Then we do the computation for half of the records in the main memory. After finishing computation for half of the records, we write the results out. Then we repeat the above procedure. However, we now keep the other half of the records in the main memory for each loaded track. By doing computation in this way, it is obvious that we need two passes to load out-of-core data.

Another method is to use a logical block distribution. Let the size of a logical block be eight, as shown in Fig. 2. In this case, the eight records required by one $F_8$ are stored on two
physical tracks, either $T_1$ and $T_3$ or $T_2$ and $T_4$. Therefore, if we can first load and perform computation on $T_1$ and $T_3$, followed by loading and performing computation on $T_2$ and $T_4$, then the entire computation can be performed in a single pass. Hence, logical distribution can be used to reduce the number of passes needed to perform the entire computation. However, there are several issues which need to be addressed, such as how to determine the block size of the logical distribution and how to determine the data access patterns. We will discuss these issues in the following sections.

Fig. 1. The data organization for $N = 64$ inputs with $B_d = 4$ and $D = 4$. Each column is a disk. Each box is a physical block. Each row consists of a physical track. The numbers in each box denote the record indices.

For simplicity, we make the following assumptions:

1. The input and the output data are stored in separate sets of disks.
2. All parameters are powers of two.³
3. The block size $B$ of the distribution is a multiple of $B_d$.

³ The results can be easily generalized to all parameters being powers of any positive integer.

3.2 Semantics of Data Distributions and Access Patterns

A block-cyclic distribution can be algebraically represented by a tensor basis by identifying the bases which correspond to the processor index [9]. This approach can be adapted to represent data distributions onto disks in our parallel I/O model by substituting disks for processors. However, due to the existence of physical blocks and physical tracks, the method of using tensor bases to define a block-cyclic distribution for multiprocessors needs to be generalized. This we achieve by further factoring the tensor basis to get bases for the physical block index and the physical track index. We call this factored tensor basis an (out-of-core) data distribution basis, which is defined as follows:

Definition 3.1. Let $B = B_b B_d$. If a vector of length $N$, where $N = GBD$ and $G$ is an integer, is distributed according to the cyclic($B$) distribution on $D$ disks, then its data distribution basis is defined as:

$$\mathcal{D} = e^{G}_{g} \otimes e^{D}_{d} \otimes e^{B_b}_{b_b} \otimes e^{B_d}_{b_d}. \qquad (5)$$

We use $\mathcal{D}_s$ to refer to the $s$th factor (from the left), e.g., $\mathcal{D}_2 = e^{D}_{d}$.

For example, the data distribution basis for Fig. 2 is $e^{2}_{g} \otimes e^{4}_{d} \otimes e^{2}_{b_b} \otimes e^{4}_{b_d}$, where the size of each physical block is four, each logical block contains two physical blocks, there are four disks, and the inputs are stored on two logical tracks. The data distribution basis for Fig. 1 can be written as $e^{4}_{g} \otimes e^{4}_{d} \otimes e^{4}_{b_d}$, where $B_b = 1$.

A selected portion of the distribution basis in (5) can be used to obtain the indexing function needed to denote a particular data unit such as a logical track or a physical track. Let

$$\text{logical-track}(\mathcal{D}) = e^{G}_{g}, \qquad (6)$$

$$\text{physical-track}(\mathcal{D}) = e^{G}_{g} \otimes e^{B_b}_{b_b}. \qquad (7)$$

Then the indexing function for accessing the physical tracks can be obtained by linearizing physical-track($\mathcal{D}$). Similarly, we can have tensor bases which denote the records inside a logical track and a physical track, respectively. These tensor bases are called the logical track-element basis ($e^{D}_{d} \otimes e^{B_b}_{b_b} \otimes e^{B_d}_{b_d}$) and the physical track-element basis ($e^{D}_{d} \otimes e^{B_d}_{b_d}$), respectively. An interesting point to note is that the logical track-element basis can be obtained by deleting the bases corresponding to the logical track index from the data distribution basis $\mathcal{D}$. Similarly, the physical track-element basis can be obtained by deleting the bases corresponding to the physical track index from the data distribution basis $\mathcal{D}$. Formally,

$$\text{logical-track-element}(\mathcal{D}) = \mathcal{D} - \text{logical-track}(\mathcal{D}),$$

and

$$\text{physical-track-element}(\mathcal{D}) = \mathcal{D} - \text{physical-track}(\mathcal{D}),$$

where the basis difference operator, denoted as $-$, is defined as:
Definition 3.2. Let $S$ and $G$ be two tensor bases. Their difference is denoted as $S - G$ and is a tensor basis which is constructed by deleting all of the vector bases in $G$ from $S$.

Fig. 2. The data organization for $N = 64$ inputs with $B_d = 4$, $D = 4$, and $B = 8$. Each column is a disk. The first left shadowed box denotes an example logical block. There are two logical tracks, $LT_0$ and $LT_1$, each of which consists of two physical tracks.

3.3 Tensor Bases for Data Access

For fixed input and output data distribution bases, different orders of instantiating the indices in the indexing function of the data distribution bases (as defined in (5)) correspond to different access patterns for out-of-core data. For example, if we instantiate the indices in the order from right to left (which is what we have used to interpret a tensor basis so far), i.e., $g$ is the slowest and $b_d$ is the fastest changing index, then we actually access data first in the first logical block of the first disk and then access the first logical block in the second disk. After finishing the access to the first logical track sequentially, the second logical track is accessed, and so on. This data access pattern can be better understood by examining the following code, which uses the indices in each vector basis as loop index variables.

    DO g = 0, G-1
      DO d = 0, D-1
        DO b_b = 0, B_b-1
          DO b_d = 0, B_d-1
            read(g*B_b*D*B_d + d*B_b*B_d + b_b*B_d + b_d)
          ENDDO
        ENDDO
      ENDDO
    ENDDO

If we instantiate the index $b_b$ in $e^{B_b}_{b_b}$ after the index $d$ in $e^{D}_{d}$ in (5), then it results in an access pattern where first the data along a physical track is accessed and then the successive physical tracks are accessed. This change in the instantiation order of the indices can be regarded as a permutation of the data distribution basis. We call a permutation⁴ of a data distribution basis a loop basis. For the above example, the loop basis can be denoted as:

$$\mathcal{L} = e^{G}_{g} \otimes e^{B_b}_{b_b} \otimes e^{D}_{d} \otimes e^{B_d}_{b_d}. \qquad (8)$$

⁴ Let $S$ be a tensor basis and $S = \bigotimes_{s=1}^{q} e^{x_s}_{i_s}$. Let $\sigma$ be a permutation on $[1..q]$, then a permutation of $S$ is a tensor basis defined as follows: $\sigma(S) = \bigotimes_{s=1}^{q} e^{x_{\sigma(s)}}_{i_{\sigma(s)}}$.

Together, a data distribution basis and a loop basis specify a data access pattern. To synthesize a program with this data access pattern, every index in a loop basis may correspond to a loop in the generated loop nest. Moreover, the order of the loops in the loop nest is determined by the order of the vector bases in the loop basis. A program which accesses out-of-core data specified by the loop basis denoted by (8) is shown below.

    DO g = 0, G-1
      DO b_b = 0, B_b-1
        DO d = 0, D-1
          DO b_d = 0, B_d-1
            read(g*B_b*D*B_d + d*B_b*B_d + b_b*B_d + b_d)
          ENDDO
        ENDDO
      ENDDO
    ENDDO

Note that in the above program, the indexing function for accessing each record is obtained by linearizing the data distribution basis. The order of loops is specified by the loop basis. In terms of programs, a loop basis can be understood as a notation specifying how to re-order the loop nests and further how to split a loop nest [3].

4 SYNTHESIZING I/O-EFFICIENT PROGRAMS

In this section, we first give an overview of our program synthesis framework. We then describe the structure of the generated program and how the program can be obtained from an augmented tensor basis. In the following section, we describe how to compute the augmented tensor basis to obtain the desired program structure.

4.1 Overview of Program Synthesis

The three major steps in synthesizing efficient parallel I/O programs for a block recursive algorithm are shown in Fig. 3. The first step transforms the input tensor product formula into an efficient form. It uses the target machine parameters and properties of tensor products to obtain the efficient form using either a greedy approach or an approach based on dynamic programming. It also determines the appropriate input and output data distributions for implementing the transformed formula. The second and the third steps are applied to each computational step, which is represented by a tensor product. In the final program, an outermost loop structure is used to construct the program for the overall tensor product formula. More
specifically, the second step decomposes the computation of each tensor product into subcomputations by analyzing data access patterns and exploiting locality and concurrency. The results of these analyses are represented as an augmented tensor basis. The augmented tensor basis consists of the following four components: data distribution bases, loop bases, subcomputations, and memory-loads. These four components are then used by the third step of the code generation algorithm to generate parallel I/O programs.

Fig. 3. Synthesizing parallel out-of-core programs.

Our presentation of the derivation of efficient implementations for the block recursive algorithms is in the reverse order of Fig. 3. We first present a procedure for code generation by using the information contained in the augmented tensor basis. Then we determine efficient implementations for a stride permutation and a simple tensor product with a given data distribution on a given model by determining the corresponding augmented tensor bases. Further, we develop a simple algorithm to determine the data distribution which can result in an efficient implementation. Furthermore, we use a dynamic (or a multistep dynamic) programming algorithm to determine an efficient implementation for the block recursive algorithms. The dynamic programming algorithm will use the properties of tensor products and the performance of each tensor product. The method of estimating the performance for each tensor product will be presented in Section 5.2 and Section 5.3 with the analysis of the second step (determining augmented tensor bases).

4.2 Structure of the Generated Parallel I/O Code

To minimize the number of I/O operations for a synthesized program for a tensor product, we need to exploit locality by reusing the loaded data. This requires decomposing the computation and reorganizing data and data access patterns to maximize data reuse. In the synthesized program, the same subcomputation is performed several times over different data sets. Hence, the loop structure of the synthesized program is constructed as follows: an outer loop nest enclosing three inner loop nests, namely the read loop nest, the computation loop nest, and the write loop nest. The read loop nest loads out-of-core data without overflowing main memory. The computation loop nest performs the subcomputation on a memory-load. The write loop nest writes the output to the disk. The data sets are accessed one track at a time using the parallel primitives, parallel_read and parallel_write.

To reflect the structure of the outer and inner loops described above, we need to separate the input loop basis into three parts: 1) the part specifying memory-loads ($\mathcal{L}_n$), 2) the part specifying the physical tracks in a memory-load ($\mathcal{L}_m$), and 3) the part specifying the records within a track ($\mathcal{L}_e$). Under our striped I/O model, each I/O operation reads or writes one physical track at a time. Hence, in the synthesized program, the loops which correspond to $\mathcal{L}_e$ may not appear explicitly. Formally, we can write the input loop basis as follows:

$$\mathcal{L} = \mathcal{L}_n \otimes \mathcal{L}_m \otimes \mathcal{L}_e, \qquad (9)$$

where we call $\mathcal{L}_n$ a memory basis, since each instantiation of the indices in $\mathcal{L}_n$ corresponds to a memory-load. Similarly, we can separate the output loop basis as follows:

$$\mathcal{L}' = \mathcal{L}'_n \otimes \mathcal{L}'_m \otimes \mathcal{L}'_e. \qquad (10)$$

Moreover, our method of determining loop bases will guarantee that $\mathcal{L}'_n$ is a permutation of $\mathcal{L}_n$. Furthermore, in order to have a common outer loop nest, $\mathcal{L}_n = \mathcal{L}'_n$.

To minimize the parallel I/O operations, it is desirable that the synthesized program makes a single pass over the input data. That is to say, each memory-load should have the following perfect memory-load property: the input data elements of the memory-load can be organized to form a set of tracks consistent with the input data distribution, and the output data elements of the memory-load can be organized to form a set of tracks consistent with the output data distribution. If we can construct perfect memory-loads, then we can synthesize a program which accesses out-of-core data only once (called a one-pass program). However, for some computations, it may not be possible to construct perfect memory-loads. For these computations, the synthesized program keeps only part of the records from a loaded physical track in the main memory and discards the other records. Therefore, in a multiple-pass program the same physical track is loaded several times.

In terms of input and output loop bases, perfect memory-loads can be constructed if $\mathcal{L}_e$ and $\mathcal{L}'_e$ consist of the physical-track-element bases from the input and output data distribution bases, respectively. Hence, initially, we assume that the initial loop bases $\mathcal{L}$ and $\mathcal{L}'$ have the properties that $\mathcal{L}_e$ and $\mathcal{L}'_e$ consist of the physical-track-element bases from the input and the output data distribution bases, respectively. If it turns out that a single pass program cannot be synthesized for the computation, then $\mathcal{L}_e$ (or $\mathcal{L}'_e$) is further factorized into two parts, $\mathcal{L}_1$ and $\mathcal{L}_2$. Further, $\mathcal{L}_2$ is moved out of $\mathcal{L}_e$ and put into $\mathcal{L}_n$. This moved tensor basis is used to determine which portions of a physical block should be kept for the current memory-load. The size of this moved vector basis is equal to the number of times the same physical tracks are loaded.

4.3 Parallel I/O Code Generation

In this section, we first define the augmented tensor basis and then describe the generic code generation routine which uses the augmented tensor basis to generate parallel I/O code.
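The loop structure described in Section 4.2 (an outer loop over memory-loads enclosing a read loop nest, a computation, and a write loop nest) can be sketched in Python as follows. This is a minimal simulation under stated assumptions, not the code produced by the synthesis framework: the striped disk system is modelled as a plain list of physical tracks, `parallel_read` and `parallel_write` are stand-ins for the primitives named in the text, and `butterfly_pairs`, `one_pass_program`, and the choice $M = 32$ are hypothetical names and values introduced only for illustration (the other parameters follow Fig. 1).

```python
def parallel_read(disks, i):
    """Return a copy of physical track T_i (counts as one parallel I/O)."""
    return list(disks[i])


def parallel_write(disks, i, track):
    """Store one full physical track T_i (counts as one parallel I/O)."""
    disks[i] = list(track)


def butterfly_pairs(memory):
    """Hypothetical subcomputation: apply F_2 to each adjacent pair of records."""
    out = []
    for k in range(0, len(memory), 2):
        a, b = memory[k], memory[k + 1]
        out.extend([a + b, a - b])
    return out


def one_pass_program(in_disks, out_disks, tracks_per_load, subcomputation):
    """Outer loop over memory-loads; inner read, compute, and write loop nests."""
    track_size = len(in_disks[0])
    io_count = 0
    for first in range(0, len(in_disks), tracks_per_load):   # one memory-load
        memory = []
        for t in range(first, first + tracks_per_load):      # read loop nest
            memory.extend(parallel_read(in_disks, t))
            io_count += 1
        memory = subcomputation(memory)                      # computation
        for k in range(tracks_per_load):                     # write loop nest
            chunk = memory[k * track_size:(k + 1) * track_size]
            parallel_write(out_disks, first + k, chunk)
            io_count += 1
    return io_count


# Parameters of Fig. 1 (N = 64, B_d = 4, D = 4); M = 32 is an assumed memory size.
N, B_d, D, M = 64, 4, 4, 32
track_size = B_d * D
in_disks = [list(range(t * track_size, (t + 1) * track_size))
            for t in range(N // track_size)]
out_disks = [None] * len(in_disks)
ios = one_pass_program(in_disks, out_disks, M // track_size, butterfly_pairs)
```

Because every memory-load here consists of whole input tracks and produces whole output tracks, each track is read once and written once, so the sketch performs $N/(B_d D)$ reads plus $N/(B_d D)$ writes.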

Fig. 4. Procedure of code generation for a tensor product.

Fig. 5. Code for $I \otimes F \otimes I$, where $X$ is an array of size $M$ and $A = I \otimes F \otimes I$.

An augmented tensor basis for a single-processor multidisk system includes data distribution bases, loop bases, memory-loads, and operations on each memory-load. Moreover, for a tensor product computation, the input and output data may be organized and accessed differently. We therefore need to use the input data distribution basis, output data distribution basis, input loop basis, and output loop basis to denote them, respectively.

Definition 4.3. An augmented tensor basis constitutes the following four components:

1. Data Distribution Basis. Let data be distributed by cyclic($B$) on $D$ disks. Let $B = B_b B_d$ and the number of data elements be $N$, where $N = GBD$. Then the (input or output) data distribution basis has the form:

$$\mathcal{D} = e^{G}_{g} \otimes e^{D}_{d} \otimes e^{B_b}_{b_b} \otimes e^{B_d}_{b_d}. \qquad (11)$$

2. Loop Basis. An (input or output) loop basis has the following generic form,

$$\mathcal{L} = \mathcal{L}_n \otimes \mathcal{L}_m \otimes \mathcal{L}_1, \qquad (12)$$

where $\mathcal{L}_1$ is a subset of $\mathcal{L}_e$, where $\mathcal{L}_e = \mathcal{D}_2 \otimes \mathcal{D}_4$ and $\mathcal{L}_1 = \mathcal{L}_e - \mathcal{L}_2$. As described in Section 4.2, the size of $\mathcal{L}_2$ depends upon the tensor product being implemented. The procedure for determining $\mathcal{L}_2$ is described in the following sections. $\mathcal{L}_m$ consists of the last portions of $\mathcal{D} - \mathcal{L}_1$ such that $|\mathcal{L}_m| = M / |\mathcal{L}_1|$; $\mathcal{L}_n = \mathcal{D} - \mathcal{L}_m - \mathcal{L}_1$.

3. Memory-Load. The records in each memory-load are denoted by $\mathcal{L}_m \otimes \mathcal{L}_1$. More specifically, each memory-load is obtained by an instantiation of the indices in $\mathcal{L}_n$, looping over the indices in $\mathcal{L}_m$ and using $\mathcal{L}_2$ to identify which portions in each loaded physical track should be kept for the current memory-load.

4. Subcomputation. The decomposed computation which will be applied to each memory-load.

Note that the input and the output data distribution bases can be different. Moreover, the input data distribution basis can be obtained by factoring the input basis. The output data distribution basis can be obtained by applying the corresponding tensor product or stride permutation to the input data distribution basis. Using this augmented tensor basis and assuming that the input and output memory bases coincide ($\mathcal{L}_n = \mathcal{L}'_n$, with the prime marking the output loop basis), a generic program can then be obtained as described in Fig. 4.
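The loop-basis component of the definition above can be illustrated with a short Python sketch. It represents a tensor basis as an ordered list of `(index name, dimension)` pairs, linearizes it into a record address, and splits a data distribution basis into $\mathcal{L}_n$, $\mathcal{L}_m$, and $\mathcal{L}_1$ for a given memory size. The function names `linearize`, `split_loop_basis`, and `addresses` are hypothetical, and the sketch assumes that the factor sizes line up exactly with $M$ (it raises an error instead of factorizing a vector basis further). The parameters are those of Fig. 2, with an assumed $M = 32$ and $\mathcal{L}_1 = \mathcal{L}_e$.

```python
from itertools import product
from math import prod


def linearize(basis, idx):
    """Index of the vector basis obtained by linearizing a tensor basis.

    basis: ordered list of (name, dim); idx: dict mapping name -> index value.
    """
    addr = 0
    for name, dim in basis:
        addr = addr * dim + idx[name]
    return addr


def split_loop_basis(dist, l1_names, M):
    """Split a data distribution basis into (L_n, L_m, L_1).

    L_1 holds the factors named in l1_names; L_m takes the last factors of
    the remainder until |L_m| = M / |L_1|; L_n is whatever is left.
    """
    l1 = [f for f in dist if f[0] in l1_names]
    rest = [f for f in dist if f[0] not in l1_names]
    target = M // prod(dim for _, dim in l1)
    lm, size = [], 1
    while size < target and rest:
        factor = rest.pop()
        lm.insert(0, factor)
        size *= factor[1]
    if size != target:
        raise ValueError("factor sizes do not align with M in this sketch")
    return rest, lm, l1


def addresses(dist, loop_basis):
    """Record addresses in the order given by the loop basis (last index fastest)."""
    names = [name for name, _ in loop_basis]
    for values in product(*(range(dim) for _, dim in loop_basis)):
        yield linearize(dist, dict(zip(names, values)))


# Data distribution basis of Fig. 2: G = 2, D = 4, B_b = 2, B_d = 4.
dist = [("g", 2), ("d", 4), ("b_b", 2), ("b_d", 4)]
L_n, L_m, L_1 = split_loop_basis(dist, {"d", "b_d"}, M=32)
loop_basis = L_n + L_m + L_1
order = list(addresses(dist, loop_basis))
first_track = order[:16]
```

With these values the resulting loop basis is $e_g \otimes e_{b_b} \otimes e_d \otimes e_{b_d}$, which matches the loop basis in (8): the records are visited one physical track at a time, and each memory-load covers one logical track.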
Further, Fig. 5 shows an example synthesized program for $I \otimes F \otimes I$. We assume that $M = 16$, D = , B_d = , B = , $F$ is a matrix, and data are distributed in a cyclic manner. It uses $e^{8}_{g} \otimes e_{d} \otimes e_{b_d}$ as both the input and the output distribution bases. The input and the output loop bases are also the same, namely $e_{g_2} \otimes e_{g_1} \otimes e_{d} \otimes e_{b_d}$, where $e_{g_2} \otimes e_{g_1}$ is a factorization of $e^{8}_{g}$. The subcomputation is denoted by $I \otimes F \otimes I$. The memory basis is $e_{g_2}$. The details of how to determine this information are discussed in Section 5.3.
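As a rough illustration of what a computation of the form $I_r \otimes F \otimes I_c$ does to its input, the following Python sketch applies it with a loop nest and checks the result against the explicitly constructed tensor product. It is an in-memory sketch only and is not the program of Fig. 5; the helper names (`kron`, `identity`, `matvec`, `apply_i_f_i`) and the sizes $r = 4$, $c = 4$ with $F = F_2$ are assumptions chosen for illustration.

```python
def kron(A, B):
    """Tensor (Kronecker) product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def identity(n):
    """n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matvec(A, x):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def apply_i_f_i(r, F, c, x):
    """Apply (I_r (x) F (x) I_c) to x without forming the large matrix.

    Each of the r*c applications of F reads its inputs at stride c and
    writes its outputs at the same positions.
    """
    v = len(F)
    y = [0] * len(x)
    for i in range(r):
        for k in range(c):
            idx = [i * v * c + j * c + k for j in range(v)]
            segment = matvec(F, [x[t] for t in idx])
            for t, value in zip(idx, segment):
                y[t] = value
    return y


F2 = [[1, 1], [1, -1]]
r, c = 4, 4                      # assumed sizes, giving a vector of length 32
x = list(range(r * len(F2) * c))
y_loops = apply_i_f_i(r, F2, c, x)
y_dense = matvec(kron(kron(identity(r), F2), identity(c)), x)
```

The loop version touches each record once and never builds the $32 \times 32$ matrix, which is the property the synthesized out-of-core programs rely on when they process one memory-load at a time.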

9 LI ET AL: SYTHESIZIG EFFICIET UT-F-CRE PRGRAMS FR BLCK RECURSIVE ALGRITHMS USIG BLCK-CYCLIC DATA SYTHESIZIG PRGRAMS FR STRIDE PERMUTATIS In this sction, w discuss how to dtrmin an fficint augmntd tnsor basis for strid prmutations using a cyclic B distribution ur goal is to dcompos computations into a squnc of subcomputations prformd on prfct mmory-loads In th cas that prfct mmoryloads cannot b constructd, w minimiz th numbr of tims th data is loadd for ach mmory-load In doing so, w nsur that ach physical track of th output is writtn out only onc W first dvlop an approach to dtrmining th input and output loop bass for th givn distribution cyclic B Basd on ths loop bass and data distribution bass, w dtrmin mmory-loads and oprations on th mmory-loads Following this, a program can b synthsizd by using th procdur prsntd in Sction 3 Th cost of th program can also b dtrmind from th loop bass W summariz our rsults in th following thorm and thn prsnt a constructiv proof which constructs th augmntd tnsor basis Dfinition 51 Lt Y ˆ L PQ Q X, whr PQ ˆ and X and Y ar input and output vctors with lngth, rspctivly Lt X and Y b distributd according to cyclic B and th data distribution bass b dnotd as and, rspctivly Furthr, lt ˆ and ˆ Thn, a program can b synthsizd with B 1 maxf1; dd dj jb d D M g 6 paralll I/ oprations for th strid prmutation Y ˆ L PQ Q X Proof W prsnt an algorithm, as shown in Fig 6, for dtrmining th input and th output loop bass Th algorithm is furthr xplaind in Stp 1 as shown blow In Stp and Stp 3, w show how to construct mmory-loads and oprations for a mmory-load In Stp, w show that I/ costs can b obtaind from this information 1 Dtrmin input and output loop bass W bgin with th following construction for th input and th output loop bass, ˆ ; ˆ ; 13 1 whrwusthconvntionthatapparingonth right hand sid rfrs to th original rprsntation, which is qual to 1 3, and apparing on th lft hand sid rfrs to an updat So dos Furthr, w assum that ˆ, ˆ It is asy to vrify that is 
a permutation of . Therefore, they denote the same records. Thus, if the number of records denoted by | | is less than the size of the main memory, then we can simply take m = and m = . However, the number of the records denoted by | | may exceed the size of the main memory. In that case, we want to construct memory-loads which can be obtained by reading the input data several times while writing the output data only once. In terms of tensor bases, as we discussed in Section 3, this reloading can be achieved by looping over part of the indices in . In other words, we need to factor as and 1 such that the instantiation of the indices in selects which subblocks should be kept for a loaded physical track and the instantiation of the indices in 1 denotes records inside each subblock. Further, | | is equal to the number of times we will reload each physical track. This reloading is achieved by taking m = and moving before m.

Footnote 6: The notation $|S|$ denotes the size of the tensor basis $S$, which is equal to the product of the dimensions of each vector basis in $S$.

[Fig 6: Algorithm for determining input and output loop bases for stride permutations.]

In summary, the input and output loop bases in (13) and (14) are modified as follows: Factor such that m consists of the last factors of the factored tensor basis and the size of m is equal to $M/(B_d D)$. For the input loop basis, let = m, 1 = . Thus, the input and output loop bases can be written as,

 = n m 1 ;
 = n m ;

where n = m 1 and n = m. We further verify the following facts: First, m 1 and m contain the same vector bases, although in a different order [17]. Second, from the previous results, we have that | m 1| = | m| = M. Therefore, the records denoted by them can fit into a memory-load. Third, since | m| > | m| = $M/(D B_d)$, loading | m| physical tracks will overflow the main memory unless some records are discarded from the loaded tracks. The details for determining which records are to be discarded will be discussed in the next step. Fourth, n and n contain the same vector bases. We therefore can set n = n, which will only change the order of writing results onto physical tracks.

[Fig 7: Parallel I/O program for L 36.]

2 Determine memory-load. When | | ≤ M, m = and m = . Therefore, the records denoted by m or m can be used to form a perfect memory-load. However, when this condition is not satisfied, we need to use (15) and (16) as the input and output loop bases, respectively. Because | m 1| = | m| = M, the size of each memory-load can be set to be equal to the size of the main memory. However, as we mentioned before, we need to discard some records from each loaded track to form the memory-load. This can be done by linearizing . Each instantiation of the indices in will give a set of subblocks in a physical track which should be kept.

3 Determine operations for a memory-load. As we mentioned above, for each memory-load, the tensor vectors in the input and output loop bases which denote the records inside a memory-load are the same, but in a different order. In other words, one is a permutation of the other. Because the input and output loop bases are permutations of the input and output data distribution bases, we actually permute a memory-load of data each time. Therefore, each in-memory operation is nothing more than a permutation for a subset of data distribution bases denoted by m 1 and m. Note that when = , 1 = .

4 I/O cost of synthesized programs. It is easy to see that if | | ≤ M, a one-pass program can be synthesized, i.e., the number of parallel I/Os is $\frac{2N}{B_d D}$. When the above condition does not hold, we keep | 1| records for each loaded physical track and load the same physical track | | times. Moreover, since | m| = $M/(D B_d)$, it can be easily determined that | | = $\frac{|\;|\, B_d D}{M}$. Because we write out each record only once, the number of parallel I/O operations is $\left(1 + \frac{|\;|\, B_d D}{M}\right)\frac{N}{B_d D}$. Combining these two cases yields the performance
results presented in the theorem. Further, a program with this performance can be synthesized by using the procedure listed in Fig . □

We now use an example to illustrate the methods of determining augmented tensor bases and synthesizing parallel I/O programs for stride permutations. Assume that we have a stride permutation L 36, which can be interpreted as an 8 matrix transposition. The parameters of the model are defined as follows: D = , B_d = , B_b = , and M = 8. Then the input and output data distribution bases can be written as follows:

 = g d bb bd ; (17)
 = g′ d′ b′b b′d : (18)

Moreover, the output data distribution basis can also be obtained by applying the stride permutation L 36 to the input data distribution basis. In other words, it can be written as:

 = b b bd g d : (19)

Then, following the procedure of the proof of Theorem 5.1, we can first determine the input and output loop bases as follows. We first factor g as g g1. Then, by the algorithm presented in Fig 6, we have:

[Fig 8: Example matrix transposition. (a) Inputs when viewed as an 8 two-dimensional array. (b) Input data distribution on two disks. (c) Load physical tracks T 0, T , in-core permutation, and write to physical tracks T 0, T . (d) Load physical tracks T 1, T 5, in-core permutation, and write to physical tracks T , T 6.]

 = d bd ; = g d ; (20)
 m = g ; m = b d ; n = g 1 bb ; n = g 1 bb : (21)

Further, the records denoted by m or m will be used to form perfect memory-loads. The in-core computation can be determined by finding out the permutation which permutes m to m. This can be easily determined as L 8. Since | | ≤ M, a one-pass program, as shown in Fig 7, can be synthesized by using the information determined above and the code generation algorithm presented in the previous subsection.

The procedure of computing L 36 using the synthesized program is illustrated in Fig 8 and Fig 9. Fig 8 shows the input vector when interpreted as a matrix and its initial data distribution on two disks. It also shows the first two intermediate subtransposition steps. Fig 9 illustrates the following two intermediate steps and the final outputs. Each of the intermediate subtransposition steps reads a block of the matrix, transposes the block in the internal memory, and then writes the block onto disks. For clarity, we assume that the outputs are written on a different set of disks.

6 SYNTHESIZING PROGRAMS FOR TENSOR PRODUCTS

In this section, we first present an algorithm to determine efficient loop bases for a tensor product under a given data distribution cyclic(B). Based on these loop bases and data distribution bases, we can determine memory-loads and the operations on each memory-load. In other words, the augmented tensor basis can be obtained. Therefore, a program can be generated by using the procedure discussed in Section 5.1. We also show that the cost of the synthesized program can be obtained from the algorithm. Since the computation of the tensor product $I_R \otimes A_V \otimes I_C$ does not change the order of the inputs (or it can be computed in-place), we will use the same input and output data distribution bases
for the input and output data and also the same input and output loop bases for programs synthesized in this subsection. Therefore, we will only consider input, input distribution, and input loop bases. We summarize our results as a theorem and then present a constructive proof which constructs the augmented tensor basis.

Before we present the theorem, we first introduce the concept of desired records and discuss several properties of the possible locations in which the desired records may reside on disks. For the tensor product $I_R \otimes A_V \otimes I_C$, the major computational matrix $A_V$ is applied to $V$ input records and these $V$ records have a stride $C$ in the input vector. We call each of

these $V$ records for the first $A_V$ computation a desired record. More specifically, the $V$ desired records can be denoted as $\{x[iC] \mid 0 \le i \le V - 1\}$. Note that all of the other $A_V$ computations will have a similar data access pattern. For example, the second $A_V$ computation is applied to the $V$ inputs beginning from the second record with the same stride $C$.

[Fig 9: Example matrix transposition. (a) Load physical tracks T , T 6, in-core permutation, and write to physical tracks T 1, T 3. (b) Load physical tracks T 3, T 7, in-core permutation, and write to physical tracks T 5, T 7. (c) Outputs.]

We now discuss several properties of the possible locations in which the desired records may reside on disks.

1. The consecutive desired records will first be stored in a logical block, and then the successive desired records will be stored in other logical blocks on other disks. Thus, for example, when $C > B_d$ and $VC < B$, the number of physical tracks which hold the V V desired records is $C/B_d$ rather than $C/(B_d D)$.
2. If the desired records are stored on several disks, then each of these disks will contain the same number of desired records and the desired records on each of these disks are stored in the same relative locations.
3. If the desired records are stored on several logical tracks, then all of the logical tracks which contain the desired records will have the same number of desired records and the desired records in each logical track are stored in the same relative locations.

The correctness of these properties follows from the definition of data distribution, the regular data access pattern of each computational matrix in the input tensor product, and the assumption that all of the parameters in the machine model and the input tensor product are powers of two. For example, the correctness of the first property can be explained as follows. Since $VC < B$, all of the desired records are stored in the first logical block. The distance between the physical blocks which contain the desired records is $C/B_d$. Therefore, the number of physical tracks which hold the $V$ desired records is $C/B_d$. These properties will be used in the proof of the following theorem.

Theorem 6
Let the input data be distributed according to cyclic(B). Let $t$ denote the number of physical tracks where the records for an $A_V$ computation are stored. Then for the tensor product $I_R \otimes A_V \otimes I_C$, where $RVC = N$ and $V \le M$, if $t \le \frac{M}{B_d D}$, a program can be synthesized with $\frac{2N}{B_d D}$ parallel I/O operations; otherwise a program can be synthesized with $\frac{3Nt}{M}$ parallel I/O operations.

The above theorem can also be stated in terms of tensor bases as follows: Let be the input data distribution basis. Let = . Further assume that 1 denotes a subset of and 1 = is moved into the memory basis. Then for the tensor product $I_R \otimes A_V \otimes I_C$, where $RVC = N$ and $V \le M$, if 1 = , a program can be synthesized with $\frac{2N}{B_d D}$ parallel I/O operations; otherwise, a program can be synthesized with $|\;|\,\frac{3N}{B_d D}$ parallel I/O operations. In the following proof of the theorem, we will show how to construct 1 and . We will also prove that $|\;| = \frac{B_d D\, t}{M}$.

Proof.

1 Determine input loop basis. If the desired records for an $A_V$ computation are stored in $t$ physical tracks and $t \le \frac{M}{B_d D}$, then we can simply load the $t$ physical tracks each time and, therefore, a one-pass program can be generated. However, when $t > \frac{M}{B_d D}$, we cannot keep all of the records in $t$ physical tracks in the main memory. We take the following simple approach:

[Fig 10: Algorithm for determining input loop bases and the value of Z for a tensor product.]

We construct $M/V$ sets of desired records by loading each physical track and retaining in the main memory only those records which fall in these sets. Each physical track needs to be reloaded to perform computation on the remaining records. In terms of tensor bases, we need to do nothing more than factor and permute the input data distribution basis to reflect this data access pattern. More specifically, we begin with = and n m = , where has the same initial value as defined in Section 5. For a one-pass program, we factor and permute n m to change the order of accessing physical tracks. However, for a multipass program, we need to factor and permute all of the s, since we need to keep part of the records loaded in the main memory and discard other records. As we discussed before, the part of the records to be kept or discarded can be denoted by a subset of the vector bases in the physical-track-element basis. In order to factor and permute a tensor basis to a desired form, we need to examine the relative values of the parameters in the targeted I/O model, the tensor product, and the size $B$ of the data distribution. We summarize the above ideas as an algorithm in Fig 10, which is further explained as follows:

Initialization. This step initializes the values of n m, , and several temporary variables. For example, $R_b$ denotes the maximum number of the desired records for an $A_V$ computation in a physical block. $R_t$ is the number of the desired records in a physical track. $R_d$ is the number of disks where the desired records for an $A_V$ are stored. $S$ is the distance between two consecutive physical tracks which contain the desired records. Since the stride of two desired records is $C$, $R_b$ can be determined as $\lceil B_d / C \rceil$. The correctness of $R_t$ and $t$ can be similarly verified.

Compute. This step invokes a procedure to compute values such as $R_d$ and $S$. Fig 11a and Fig 11b show the details of how to determine those two values. The correctness of the algorithm in Fig 11a for computing $R_d$ can be proven as follows: When $C \le B_d$, the successive
disks may contain the same number of the desired records if the desired records cannot be stored in one logical block. The number of these successive disks is dependent on the value of $V$. Further, since there are $R_b B_b$ desired records per logical block and $R_b B_b = B/C$ (since $R_b = B_d/C$ in this case), the number of disks which contain the desired records is equal to the smaller of $\frac{V}{B/C}$ and $D$. This results in the first case of the algorithm. Similarly, when $B_d < C \le B$, the successive disks may contain the desired records. Since in this case each logical block contains $B/C$ desired records, the number of disks which contain the desired records is again equal to the smaller of $\frac{V}{B/C}$ and $D$. For the third case, any two disks which contain two consecutive desired records have a stride $C/B$. Therefore, $R_d = \frac{D}{C/B}$. The last case is trivial. Similarly, we can prove the correctness of the algorithm in Fig 11b.

One-pass program. This step determines how to access physical tracks. The idea is straightforward. It determines the decompositions and permutations for n m based on the stride between two consecutive physical tracks which contain the desired records. The result from this step may also be needed for the next step to determine the final loop basis for synthesizing multipass programs.

Multipass program. If the number of physical tracks which hold the records for an $A_V$ computation is larger than the number of physical tracks which the main memory can hold, then a multipass program needs to be synthesized. More specifically, we need to determine which portions of the records in a physical track should be kept for each pass of computation. The basic idea of keeping

records for the current memory-load can be described as follows:

[Fig 11: (a) Algorithm for computing $R_d$ and (b) algorithm for computing $S$.]

First, for each desired record, we want to take $X - 1$ successive records and keep these $X - 1$ records with the corresponding desired record as the current memory-load. One approach to determining $X$ is to take $X$ as large as possible. However, $X$ needs to satisfy the following three conditions. First, $X$ must be less than the gap between any two consecutive desired records in a physical block. Second, $X$ must be less than the size of a physical block. Third, all of the desired records with their $X - 1$ successive records should be able to fit into the main memory, which means that $X R_t t \le M$, or $XV \le M$. These three conditions can be expressed as $X = \min\{C, B_d, \frac{M}{R_t t}\}$. Fig 12 shows an example of how to construct memory-loads by taking portions of the records from a physical block, where we assume that there are four desired records in a physical block, and $C = 2X$. The example can be interpreted as follows. The physical block is first broken into eight subblocks. Then we take the records in the odd-numbered subblocks to construct one memory-load and take the records in the even-numbered subblocks to construct another memory-load. In terms of tensor bases, we first decompose Bb b b as b d3 bd X bd1. Then, we permute the resulting tensor basis as b d bd3 X bd1.

Second, we apply a similar idea for disks. For each disk which contains the desired record, we take $Y - 1$ successive disks and we keep the records at the same relative locations as on the original disk in each successive disk for the current memory-load. We want to take the largest possible value of $Y$ given the condition that the number of the records kept must fit into the main memory. We consider the following two cases. First, $X = \min\{B_d, C\}$. In this case, either all of the records between any two desired records or all of the records in a physical block are chosen to be kept for the current memory-load. However, if all of the records between any two desired records are chosen, all of the records in a physical block will be covered. Thus, it is identical to
the case that all of the records in a physical block are chosen to be kept. Further, $R_d$ disks contain desired records. Therefore, $R_d B_d$ records are chosen from each physical track. In order not to overflow the main memory, we need that $R_d Y B_d t \le M$. Second, $X = \frac{M}{R_t t}$. In this case,

[Fig 12: Constructing portions of memory-loads from a physical block.]

we do not choose all of the records between two desired records. However, since we have already chosen the largest possible value for $X$, the main memory has been filled up in this case. Therefore, we cannot add any more records from successive disks with this approach. In other words, $Y = 1$. An example, which is similar to the example shown in Fig 12, can be constructed for disks. More specifically, if we view the records in a physical block as disks, $X$ as $Y$, and $R_b$ as $R_d$, then we have an example for disks. Further, in terms of tensor bases, we can interpret this idea as follows: We first decompose D D d as R d Rd Y d 3 d Y d1. Then, we permute it as D R d Y d R d d 3 Y d1. The resulting tensor basis allows us to access the odd-numbered subset of disks first and then the even-numbered subset of disks.

We now consider an example which contains both disks and records in physical blocks. More specifically, we consider the example in which data can be represented by combining factored D d and B b b b. Assume that we want to access the records first in the odd-numbered disk subblocks and then in the even-numbered disk subblocks. Further, for each physical block we want to access the records first in the odd-numbered subblocks and then in the even-numbered subblocks. To achieve this data access pattern, we move D R d Y and B d R b X b d from their current locations in D d B b b b to the beginning of D d B b b b. In the algorithm presented in Fig 10, we have denoted D R d Y Bd R d b X b d as . Therefore, to construct each memory-load, we can simply move into n m and put them anywhere in n 7. For the following analysis, we assume that we have found the subsets of , namely 1 and , by the above algorithm. is moved into the memory basis and will generate loop nests for data access. The other portions of the algorithm, which are used for computing the value of $Z$, will be discussed in Step 3.

2 Determine memory-load. For a one-pass program, we can simply factor as n m and take | m| = $\frac{M}{B_d D}$. For a multiple-pass program, we factor 1 to be n m such that | m| = $\frac{M}{|\;1|}$ and all of the vector bases in appear in
n 7. More specifically, the initial n m should be modified to ′n ′m, where ′m contains the last factors of n m and | ′m| = $\frac{M}{|\;1|}$, and ′n contains n m ′m and d. Moreover, for the multiple-pass program, as discussed in Section 5, we use to determine which records should be kept for the current memory-load.

3 Determine operations for a memory-load. The original tensor product can be regarded as $R$ parallel applications of $A_V$ to the inputs with a stride $C$. When data are distributed among disks and loaded in units of physical tracks, the net effect is to possibly reduce the stride of the records which each $A_V$ will access in main memory. The operations on a memory-load have the general form $I_{M/(VZ)} \otimes A_V \otimes I_Z$. However, the value of $Z$ will depend on the relative values of the parameters in the target machine model and the input tensor product. The algorithm presented in Fig 10 can be used to determine this value. The correctness of the value of $Z$ obtained from the algorithm can be proven as follows: For one-pass programs, when $C \le B_d$, we do not change the stride for subcomputations. Therefore, $Z = C$. Otherwise, the stride will be reduced to be equal to the distance between two consecutive desired records in a physical track, which is equal to $\frac{D B_d}{R_t}$. For multipass programs, when $X = C$, we choose all of the records between any two desired records for the current memory-load, so the stride of in-core computation does not change. When $X = \frac{M}{R_t t}$, we reduce the stride of in-core computation from $C$ to $X$. When $X = B_d$, the next desired record is not in the same physical block. Since we keep $Y$ disks as a subset of disks, we reduce the stride from $C$ to $Y B_d$.

4 I/O cost of synthesized programs. For a one-pass program which does not move any vector bases in , the number of parallel I/Os is simply equal to $\frac{2N}{B_d D}$. In other words, the synthesized program is optimal in terms of the number of I/Os. For a multipass program, we need to read the inputs | | times. Therefore, the number of parallel I/O operations is $|\;|\,\frac{3N}{B_d D}$. From the algorithm presented in Fig 10, we can determine that $|\;| = \frac{D B_d\, t}{M}$. We therefore can attain the performance presented in the theorem. The constant 3 can be explained as follows: When we store a physical track,
we need to read that physical track into main memory again, since portions of the records in that physical track have been discarded. By reloading this physical track, we can reassemble the physical track with the updated records and then write it out in parallel. Otherwise, part of the records to be written out in that physical track may not be correct. Further, "reassembling" the physical track needs to use the tensor basis (notice that is equal to ) to put the updated records into the correct locations on the physical track. This is similar to using to take subblocks out from a loaded physical track for the current memory-load.
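As a rough illustration of this section, the sketch below shows (a) the stride-C access pattern of $I_R \otimes A_V \otimes I_C$ applied fully in memory and (b) the parallel I/O count stated in the theorem above ($2N/(B_d D)$ when $t \le M/(B_d D)$, $3Nt/M$ otherwise). The function names `apply_tensor_product` and `tensor_product_io_ops` are hypothetical and not from the paper; the code assumes all parameters are powers of two, as the text does, and it does not model disks or the loop-basis construction of Fig 10.

```python
def apply_tensor_product(A, R, C, x):
    """Apply (I_R (x) A_V (x) I_C) to x entirely in memory (illustrative only).

    A is a V x V matrix given as a list of rows. Each A_V computation reads
    V records that are C apart, i.e. the "desired records" x[base + i*C].
    """
    V = len(A)
    assert len(x) == R * V * C
    y = [0] * len(x)
    for r in range(R):            # I_R: R independent blocks of V*C records
        for c in range(C):        # I_C: C interleaved A_V computations per block
            base = r * V * C + c
            for i in range(V):
                y[base + i * C] = sum(A[i][j] * x[base + j * C] for j in range(V))
    return y


def tensor_product_io_ops(N, M, B_d, D, t):
    """Parallel I/O operations according to the theorem as stated in the text.

    N: input size, M: main memory size, B_d: physical block size,
    D: number of disks, t: physical tracks holding one A_V computation.
    """
    tracks_per_memory_load = M // (B_d * D)
    if t <= tracks_per_memory_load:
        return 2 * N // (B_d * D)   # one pass: read every track once, write once
    return 3 * N * t // M           # multipass: each pass reads, rereads, writes


if __name__ == "__main__":
    F2 = [[1, 1], [1, -1]]
    print(apply_tensor_product(F2, 1, 2, [1, 2, 3, 4]))
    print(tensor_product_io_ops(2**20, 2**10, 2**4, 2**2, 8))
    print(tensor_product_io_ops(2**20, 2**10, 2**4, 2**2, 64))
```

With the illustrative parameters above, a memory-load holds 16 physical tracks, so t = 8 falls in the one-pass case and t = 64 in the multipass case.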

[Fig 13: Algorithm for computing the efficient size of data distributions.]

Now, a program with the performance discussed above can be synthesized by using the procedure listed in Fig . However, to be accurate, when synthesizing a multipass program, we need to incorporate the idea of "reassembling" a physical track into the write-out part of the procedure listed in Fig , which, as we discussed above, is nothing more than using the linearization of to put subblocks in the current memory-load into the correct locations of the reloaded physical track. □

Note that the value of $t$ can be determined at the initialization step. Therefore, the performance of the synthesized program for a tensor product can be determined without generating the whole augmented tensor basis. This result is used in the first phase of transforming tensor product formulas, where we need the performance value for each tensor product to determine efficient transformations.

6.1 Determining Efficient Data Distributions

In the previous sections, we presented approaches for synthesizing efficient I/O programs for a given data distribution. We now present an algorithm to determine a data distribution which optimizes the performance of the synthesized program. The idea of the algorithm is as follows: We begin with the physical track distribution cyclic($B_d$), i.e., initially $B = B_d$. If a one-pass program can be synthesized under this data distribution, then $B_d$ is the desired block size for the data distribution. Otherwise, we double the value of $B$. If the performance of the synthesized program under this distribution improves, we continue this procedure. Otherwise, the algorithm stops and the current block size is the desired size of data distributions. We formalize this idea in Fig 13.

6.2 Transforming Tensor Product Formulas

In this section, we discuss techniques of program synthesis for tensor product formulas. There are several strategies for developing I/O-efficient programs, such as exploiting locality and exploiting parallelism in accessing the data. Similar ideas have been discussed in [15], where they use factor grouping to exploit locality and data rearrangement
to reduce the cost of I/O operations. We have also presented a greedy method which uses factor grouping to improve the performance of block recursive algorithms for Vitter and Shriver's striped two-level memory model with a fixed block size of data distribution [10].

Factor grouping combines contiguous tensor products in a tensor product formula and therefore reduces the number of passes to access secondary storage. Consider the core Cooley-Tukey FFT computation, which does not contain the initial bit-reversal operation and the twiddle factor computation. For $i = 2$ and $i = 3$, we have the tensor products $I_{2^{n-2}} \otimes F_2 \otimes I_2$ and $I_{2^{n-3}} \otimes F_2 \otimes I_{2^2}$, respectively. Assuming that each of these tensor products can be implemented optimally, the number of parallel I/O operations required to implement these two steps individually is $\frac{4N}{D B_d}$. However, they are contiguous tensor products in (). Hence, by using the properties of tensor products, such as Properties 1 and listed in Section 3, they can be combined into one tensor product,

$(I_{2^{n-2}} \otimes F_2 \otimes I_2)(I_{2^{n-3}} \otimes F_2 \otimes I_{2^2})$
$= (I_{2^{n-3}} \otimes I_2 \otimes F_2 \otimes I_2)(I_{2^{n-3}} \otimes F_2 \otimes I_2 \otimes I_2)$
$= (I_{2^{n-3}} \otimes (I_2 \otimes F_2) \otimes I_2)(I_{2^{n-3}} \otimes (F_2 \otimes I_2) \otimes I_2)$
$= I_{2^{n-3}} \otimes (F_2 \otimes F_2) \otimes I_2,$

which may also be implementable optimally by using only $\frac{2N}{D B_d}$ parallel I/O operations.

Data rearrangement uses the properties of tensor products to change data access patterns. For example, the tensor product $I_R \otimes A_V \otimes I_C$ can be transformed into the equivalent form $(I_R \otimes L^{VC}_{V})(I_{RC} \otimes A_V)(I_R \otimes L^{VC}_{C})$. In the best case, the number of parallel I/Os required is $\frac{6N}{D B_d}$ after using this transformation, since at least three passes are needed for the transformed form. Because of the extra passes introduced by this transformation, it is not profitable to use it for our targeted machine model. Further, the first and the last terms in the transformed formula may not be implementable optimally. Therefore, we have not incorporated this transformation into our current optimization procedures.

6.2.1 Minimizing I/O Cost by Dynamic Programming

Since factor grouping (as shown above) and the size of the data distribution (as will be shown in the next section) have a large influence on the performance of synthesized programs, we take the following approach for determining an optimal manner in which a tensor product formula can be
implemented. We use the algorithm for determining the optimal data distribution presented in Fig 13 as a main routine. However, for each cyclic(B) data distribution, we use a dynamic programming algorithm to determine the optimal factor grouping. Hence, we also call this method a multi-step dynamic programming method.

[TABLE 1: Number of I/O Passes for Stride Permutation $L^{PQ}_{Q}$ (D = , B_d = , M = 6, and N = PQ = 08).]
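The factor-grouping step of this method is an interval dynamic program: the cheapest way to implement a contiguous range of tensor factors is either to group the whole range into one tensor product or to split it at some point and solve the two halves independently. The following is a minimal sketch of that recurrence, not the authors' implementation. The callback `group_cost(i, j)` is a hypothetical stand-in for the cost of one grouped tensor product (which the paper obtains from its tensor product theorem); it should return `math.inf` when the grouped operation would not fit in main memory.

```python
import math
from functools import lru_cache


def min_io_passes(num_factors, group_cost):
    """Minimum total cost over all contiguous groupings of factors 0..num_factors-1.

    group_cost(i, j) is a caller-supplied (hypothetical) function giving the
    cost of implementing factors i..j as a single grouped tensor product.
    """

    @lru_cache(maxsize=None)
    def best(i, j):
        if i > j:                       # empty range costs nothing
            return 0
        # Option 1: group all factors i..j together (also covers i == j).
        result = group_cost(i, j)
        # Option 2: split after factor k and solve both sides optimally.
        for k in range(i, j):
            result = min(result, best(i, k) + best(k + 1, j))
        return result

    return best(0, num_factors - 1)


if __name__ == "__main__":
    # Toy cost model: any group of at most two factors costs one pass,
    # larger groups are disallowed.
    def toy_cost(i, j):
        return 1 if j - i + 1 <= 2 else math.inf

    print(min_io_passes(5, toy_cost))
```

In the multi-step method described above, a routine like this would be called once per candidate block size B, with the cost callback reflecting that distribution.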

[TABLE 2: Number of I/O Passes for Stride Permutation $L^{PQ}_{Q}$ (D = 16, B_d = 51, M = , and N = PQ = 50).]

Let $C[i, j]$ be the optimal cost (the minimum number of I/O passes required to access the out-of-core data) for computing the j i tensor factors from the $i$th factor to the $j$th factor in a tensor product formula. Then $C[i, j]$ can be computed as follows:

$C[i, j] = \begin{cases} C_0 & \text{if } i = j \\ \min_{i \le k \le j} \{ C[i, k] + C[k+1, j] \} & \text{if } i \neq j. \end{cases}$

In the above formula, $C_0$ denotes the cost for computing a tensor product. The method of determining the cost of a tensor product has been discussed in Section 5.3. The values of $C_0$ can be computed using the results in Theorem 6 and the algorithm presented in Fig 11a to compute $t$. The special case of $k = j$ needs to be further explained. When $k = j$, we assume that $C[j+1, j] = 0$ and we use $C[i, k]$ to represent the cost of grouping all the tensor product factors from $i$ to $j$ together. Because the grouped tensor product is a simple tensor product, the value of $C[i, k]$ in this case can also be determined by using the results in Theorem 6 and the algorithm presented in Fig 11a to compute $t$. However, in this case, if $k - i > M$, or the size of the grouped operations is larger than the size of the main memory, we do not want to group all of the $k - i$ factors together. We assign a large value, such as $\infty$, to $C[k, j]$ to prevent it from being selected.

7 PERFORMANCE RESULTS OF SYNTHESIZED PROGRAMS

7.1 Matrix Transposition

Given the flexibility of choosing different data distributions, we can synthesize programs with better performance than those obtained using fixed size data distributions for stride permutations. We present a set of experimental results for the number of I/O operations required by the cyclic($B_d$) distribution and the cyclic(B) distribution, where the size $B$ of the distribution varies. These results are summarized in Table 1 and Table 2. From the tables, we can see that the number of passes is not a monotonically increasing or decreasing function. However, it normally decreases and then increases as $B$ is increased. Therefore, it is likely that the algorithm in Fig 13 will find an efficient size of data distributions.

7.2 Tensor Products

The number of I/O passes required by the synthesized programs is summarized in Table 3, Table 4, and Table 5 by going through various cases of $t$. In those tables, $M_t = \frac{M}{B_d D}$ is the maximum number of physical tracks in a memory-load. We can verify that the results presented here are more comprehensive than the results presented in [10]. In most cases, using the approach presented in Section 5.3, we can actually synthesize programs with better performance. For example, when $VC > M$, $M < V D B_d$ and $M > V B_d$, from [10], a program with $\frac{V B_d D}{M}$ passes will be synthesized. However, for these conditions, we have that $C > B_d$ and $VC > M$. If we further assume that $C < B_d D$, then from the results in Table 3 and Table 4, we can synthesize a program with $\frac{VC}{M}$ passes, which is less than $\frac{V B_d D}{M}$.

We now show that by using an appropriate cyclic(B) data distribution, a better performance program can be synthesized for most of the cases. Several typical examples are shown in Table 6. We notice that when we increase $B$, we can reduce the number of passes of data access for most of the cases, and the decrease in the number of passes can be as large as eight times. The values in the table also suggest that we can use the algorithm presented in Fig 13 to find an efficient size of data distributions for a given tensor product. We also notice that for some cases, such as $C \le B_d$, we cannot improve the performance. The reason is that the stride required by $A_V$ is less than the size of the physical block and we cannot reduce it further by redistribution.

7.3 Tensor Product Formulas

We show the effectiveness of the multistep dynamic programming method by comparing the programs synthesized by it with the programs synthesized by the greedy method and the dynamic programming method (applied to a data distribution of fixed size), respectively. The example we use is the core Cooley-Tukey FFT computation. The results for several typical sizes of inputs are shown in Table 7. We find that using dynamic programming for a fixed size cyclic($B_d$) distribution normally cannot improve performance over the greedy method. However, by using the multistep dynamic programming method, we can reduce the number of passes for the synthesized programs by at least 1 if $N$ is very large. Because the input size is large, the performance gain from eliminating even one pass to access out-of-core data is significant.

[TABLE 3: Number of I/O Passes for the Tensor Product $I_R \otimes A_V \otimes I_C$.]
[TABLE 4: Number of I/O Passes for the Tensor Product $I_R \otimes A_V \otimes I_C$.]
[TABLE 5: Number of I/O Passes for the Tensor Product $I_R \otimes A_V \otimes I_C$.]
[TABLE 6: Number of I/O Passes for the Tensor Product $I_R \otimes A_V \otimes I_C$ with Various Data Distributions (D = 16, B_d = 51, M = , and N = RVC).]
[TABLE 7: Number of I/O Passes for the Synthesized Programs Using Greedy, Dynamic Programming (DP), and Multiple-Step Dynamic Programming (MDP) Methods (D = 16, B_d = 51, and M = ).]

8 CONCLUSIONS

We have presented a novel framework for synthesizing out-of-core programs for block recursive algorithms using the algebraic properties of tensor products. We used the striped Vitter and Shriver's two-level memory model as our target machine model. However, instead of using the simpler physical track distribution normally used by this model, we used various block-cyclic distributions supported by High Performance Fortran to organize data on disks. Moreover, we use tensor bases as a tool to capture the semantics of data distributions and data access patterns. We showed that by using the algebraic properties of tensor products, we can decompose computations and arrange data access patterns to generate out-of-core programs automatically. We demonstrated the importance of choosing the appropriate data distribution for efficient out-of-core implementations through a set of experiments. The experimental results also showed that our simple algorithm for choosing the efficient data distribution is very effective. From the observations about the importance of data distributions and factor grouping for tensor products, we proposed a dynamic programming approach to determine the efficient data distribution and the factor grouping. For an example FFT computation, this dynamic programming approach reduced the number of I/O passes by at least one compared to the simpler greedy algorithm.

ACKNOWLEDGMENTS

Supported by US National Science Foundation Grant NSF- IRI ,
Rome Labs Contracts F C-0037,

ARPA/SISTO contracts J-1985, and C-018 under subcontract KI

REFERENCES

[1] G.E. Blelloch, Vector Models for Data-Parallel Computing. The MIT Press, 1990.
[2] A. Choudhary, I. Foster, G. Fox, K. Kennedy, C. Kesselman, C. Koelbel, J. Saltz, and M. Snir, "Languages, Compilers, and Runtime Systems Support for Parallel Input-Output," Technical Report CCSF-39, Scalable I/O Initiative, Caltech Concurrent Supercomputing Facilities, Caltech, 199.
[3] T.H. Cormen, "Virtual Memory for Data-Parallel Computing," PhD thesis, Dept. of Electrical Eng. and Computer Science, Massachusetts Inst. of Technology, 199.
[4] T.H. Cormen and D. Kotz, "Integrating Theory and Practice in Parallel File Systems," Technical Report PCS-TR93-188, Dept. of Math and Computer Science, Dartmouth College, Mar. 1993.
[5] D.L. Dai, S.K.S. Gupta, S.D. Kaushik, J.H. Lu, R.V. Singh, C.-H. Huang, P. Sadayappan, and R.W. Johnson, "EXTENT: A Portable Programming Environment for Designing and Implementing High Performance Block Recursive Algorithms," Proc. Supercomputing '9, pp. 9-58, 199.
[6] J. Eklundh, "A Fast Computer Method for Matrix Transposing," IEEE Trans. Computers, vol. 0, no. 7, pp. 801-803, July 197.
[7] D.G. Feitelson, P.F. Corbett, Y. Hsu, and J.-P. Prost, "Parallel I/O Systems and Interfaces for Parallel Computers," Multiprocessor Systems: Design and Integration, C.-L. Wu, ed., World Scientific, 1995.
[8] J. Granata, M. Conner, and R. Tolimieri, "Recursive Fast Algorithms and the Role of the Tensor Product," IEEE Trans. Signal Processing, vol. 0, no. 1, pp. ,91-,930, Dec. 199.
[9] S.K.S. Gupta, "Synthesizing Communication-Efficient Distributed-Memory Parallel Programs for Block Recursive Algorithms," PhD thesis, The Ohio State Univ., Mar. 1995.
[10] S.K.S. Gupta, Z. Li, and J.H. Reif, "Generating Efficient Programs for Two-Level Memories from Tensor Products," Proc. Seventh IASTED/ISMM Int'l Conf. Parallel and Distributed Computing and Systems, pp. 510-513, Washington DC, Oct. 1995.
[11] C.-H. Huang, J.R. Johnson, and R.W. Johnson, "Generating Parallel Programs from Tensor Product Formulas: A Case Study of Strassen's Matrix Multiplication Algorithm," Proc. Int'l Conf. Parallel Processing 199, pp. 10-108, Aug. 199.
[12] J.R. Johnson, R.W. Johnson, D. Rodriguez, and R. Tolimieri, "A Methodology for Designing, Modifying and Implementing Fourier Transform Algorithms on Various Architectures," Circuits Systems and Signal Processing, vol. 9, no. , pp. 50-500, 1990.
[13] R.W. Johnson, C.-H. Huang, and J.R. Johnson, "Multilinear Algebra and Parallel Programming," J. Supercomputing, vol. 5, pp. 189-18, 1991.
[14] S.D. Kaushik, C.-H. Huang, J.R. Johnson, R.W. Johnson, and P. Sadayappan, "Efficient Transposition Algorithms for Large Matrices," Proc. Supercomputing '93, Nov. 1993.
[15] S.D. Kaushik, C.-H. Huang, R.W. Johnson, and P. Sadayappan, "A Methodology for Generating Efficient Disk-Based Algorithms from Tensor Product Formulas," Proc. Sixth Ann. Workshop Languages and Compilers for Parallel Computing, pp. 358-338, Aug. 1993.
[16] B. Kumar, C.-H. Huang, P. Sadayappan, and R.W. Johnson, "An Algebraic Approach to Cache Memory Characterization for Block Recursive Algorithms," Proc. 199 Int'l Computer Symp., pp. 336-3, 199.
[17] Z. Li, "Computational Model and Program Synthesis for Parallel Out-of-Core Computation," PhD thesis, Duke Univ., 1996.
[18] C.V. Loan, Computational Frameworks for the Fast Fourier Transform. SIAM, 199.
[19] R. Paige, J.H. Reif, and R. Wachter, Parallel Algorithm Derivation and Program Transformation. Kluwer Academic, 1993.
[20] H.S. Stone, "Parallel Processing with the Perfect Shuffle," IEEE Trans. Computers, vol. 0, no. , pp. 153-161, Feb. 1971.
[21] R. Thakur, R. Bordawekar, and A. Choudhary, "Compilation of Out-of-Core Data Parallel Programs for Distributed Memory Machines," Proc. IPPS '9 Workshop Input/Output in Parallel Computer Systems, pp. 5-7, Apr. 199. Also appeared in Computer Architecture News, vol. , no. .
[22] J.S. Vitter and E.A.M. Shriver, "Algorithms for Parallel Memory I: Two-Level Memories," Algorithmica, vol. 1, nos. -3, pp. 110-17, 199.
[23] M. Wolfe, High Performance Compilers for Parallel Computing. Addison-Wesley, 1996.

Zhiyong Li received the BS and MS degrees in computer science and engineering from Huazhong University of Science and Technology, People's Republic of China, in 198 and 1987, respectively, and the PhD degree in computer science from Duke University in 1996. From 1987 to 199, he was an assistant professor in the Department of Computer Science and Engineering at Huazhong University of Science and Technology. He worked at the Performance Lab of Sun Microsystems in 1996 and was one of the main designers for standard Java benchmarks. Since 1997, he has been with the IBM Network Computing Software Division at Research Triangle Park, currently working on Internet electronic commerce. Dr. Li has published more than 0 papers in refereed journals and conferences in the areas of programming languages, parallel and distributed computing, and artificial intelligence. He has applied for three US patents related to Java and object-oriented technologies.

John H. Reif received the BS (magna cum laude) degree in applied math and computer science in 1973 from Tufts University, and the MS and PhD degrees in applied mathematics from Harvard University, in 1975 and 1977, respectively. He is currently a professor in the Department of Computer Science, Duke University, Durham, North Carolina. He has worked for many years on the development and analysis of parallel and randomized algorithms for various fundamental problems, including solutions of large sparse systems, sorting, and graph problems. He is the author of more than 10 publications to date. His research combines theory and practice. Although primarily a theoretical computer scientist, Prof. Reif also has made a number of contributions to practical areas of computer science, including parallel architectures, robotics, data compression, molecular simulations, and optical computing. He has done a number of implementations of sophisticated parallel algorithms, such as parallel nested dissection on massively parallel machines, as well as implementations of parallel data compression algorithms into special purpose chips. He has focused particularly on emerging new areas, such as biomolecular computing. He is director of the Consortium of Biomolecular Computing and Applications, which consists of most of the major US research groups in biomolecular computing. Dr. Reif has recently had two
books publishd on paralll algorithms and implmntations forwhich h was ditorðsynthsis of Paralll Algorithms (Kluwr Acadmic Publishrs, 1993) and Paralll Algorithm Drivation and Program Transformation (co-ditd with R Paig and R Wachtr) Dr Rif is a fllow of th ACM (1996), a fllow of th IEEE (1993), and a fllow of th Institut of Combinatorics (1991) Sandp KS Gupta rcivd th BTch dgr in computr scinc and nginring from th Institut of Tchnology, Banaras Hindu Univrsity, Varanasi, India, 1987, th MTch dgr in computr scinc and nginring from th Indian Institut of Tchnology, Kanpur, 1989, and th MS and PhD dgrs in computr and information scinc from Th hio Stat Univrsity, Columbus, hio, in 1991 and 1995, rspctivly H is currntly an assistant profssor in th Dpartmnt of Computr Scinc at Colorado Stat Univrsity, Colorado Prior to joining Colorado Stat Univrsity, h hld rsarch and taching positions at Duk Univrsity and hio Univrsity His rsarch intrsts includ paralll and distributd computing, compilrs, and mobil computing Dr Gupta is a mmbr of th ACM and th IEEE


More information

LINEAR DELAY DIFFERENTIAL EQUATION WITH A POSITIVE AND A NEGATIVE TERM

LINEAR DELAY DIFFERENTIAL EQUATION WITH A POSITIVE AND A NEGATIVE TERM Elctronic Journal of Diffrntial Equations, Vol. 2003(2003), No. 92, pp. 1 6. ISSN: 1072-6691. URL: http://jd.math.swt.du or http://jd.math.unt.du ftp jd.math.swt.du (login: ftp) LINEAR DELAY DIFFERENTIAL

More information

Some remarks on Kurepa s left factorial

Some remarks on Kurepa s left factorial Som rmarks on Kurpa s lft factorial arxiv:math/0410477v1 [math.nt] 21 Oct 2004 Brnd C. Kllnr Abstract W stablish a connction btwn th subfactorial function S(n) and th lft factorial function of Kurpa K(n).

More information

4. Money cannot be neutral in the short-run the neutrality of money is exclusively a medium run phenomenon.

4. Money cannot be neutral in the short-run the neutrality of money is exclusively a medium run phenomenon. PART I TRUE/FALSE/UNCERTAIN (5 points ach) 1. Lik xpansionary montary policy, xpansionary fiscal policy rturns output in th mdium run to its natural lvl, and incrass prics. Thrfor, fiscal policy is also

More information

Finding low cost TSP and 2-matching solutions using certain half integer subtour vertices

Finding low cost TSP and 2-matching solutions using certain half integer subtour vertices Finding low cost TSP and 2-matching solutions using crtain half intgr subtour vrtics Sylvia Boyd and Robrt Carr Novmbr 996 Introduction Givn th complt graph K n = (V, E) on n nods with dg costs c R E,

More information

Exam 1. It is important that you clearly show your work and mark the final answer clearly, closed book, closed notes, no calculator.

Exam 1. It is important that you clearly show your work and mark the final answer clearly, closed book, closed notes, no calculator. Exam N a m : _ S O L U T I O N P U I D : I n s t r u c t i o n s : It is important that you clarly show your work and mark th final answr clarly, closd book, closd nots, no calculator. T i m : h o u r

More information

Searching Linked Lists. Perfect Skip List. Building a Skip List. Skip List Analysis (1) Assume the list is sorted, but is stored in a linked list.

Searching Linked Lists. Perfect Skip List. Building a Skip List. Skip List Analysis (1) Assume the list is sorted, but is stored in a linked list. 3 3 4 8 6 3 3 4 8 6 3 3 4 8 6 () (d) 3 Sarching Linkd Lists Sarching Linkd Lists Sarching Linkd Lists ssum th list is sortd, but is stord in a linkd list. an w us binary sarch? omparisons? Work? What if

More information

SCHUR S THEOREM REU SUMMER 2005

SCHUR S THEOREM REU SUMMER 2005 SCHUR S THEOREM REU SUMMER 2005 1. Combinatorial aroach Prhas th first rsult in th subjct blongs to I. Schur and dats back to 1916. On of his motivation was to study th local vrsion of th famous quation

More information

Middle East Technical University Department of Mechanical Engineering ME 413 Introduction to Finite Element Analysis

Middle East Technical University Department of Mechanical Engineering ME 413 Introduction to Finite Element Analysis Middl East Tchnical Univrsity Dpartmnt of Mchanical Enginring ME Introduction to Finit Elmnt Analysis Chaptr 5 Two-Dimnsional Formulation Ths nots ar prpard by Dr. Cünyt Srt http://www.m.mtu.du.tr/popl/cunyt

More information

BINOMIAL COEFFICIENTS INVOLVING INFINITE POWERS OF PRIMES

BINOMIAL COEFFICIENTS INVOLVING INFINITE POWERS OF PRIMES BINOMIAL COEFFICIENTS INVOLVING INFINITE POWERS OF PRIMES DONALD M. DAVIS Abstract. If p is a prim (implicit in notation and n a positiv intgr, lt ν(n dnot th xponnt of p in n, and U(n n/p ν(n, th unit

More information

On the irreducibility of some polynomials in two variables

On the irreducibility of some polynomials in two variables ACTA ARITHMETICA LXXXII.3 (1997) On th irrducibility of som polynomials in two variabls by B. Brindza and Á. Pintér (Dbrcn) To th mmory of Paul Erdős Lt f(x) and g(y ) b polynomials with intgral cofficints

More information

Recursive Estimation of Dynamic Time-Varying Demand Models

Recursive Estimation of Dynamic Time-Varying Demand Models Intrnational Confrnc on Computr Systms and chnologis - CompSysch 06 Rcursiv Estimation of Dynamic im-varying Dmand Modls Alxandr Efrmov Abstract: h papr prsnts an implmntation of a st of rcursiv algorithms

More information

A central nucleus. Protons have a positive charge Electrons have a negative charge

A central nucleus. Protons have a positive charge Electrons have a negative charge Atomic Structur Lss than ninty yars ago scintists blivd that atoms wr tiny solid sphrs lik minut snookr balls. Sinc thn it has bn discovrd that atoms ar not compltly solid but hav innr and outr parts.

More information

Linear Non-Gaussian Structural Equation Models

Linear Non-Gaussian Structural Equation Models IMPS 8, Durham, NH Linar Non-Gaussian Structural Equation Modls Shohi Shimizu, Patrik Hoyr and Aapo Hyvarinn Osaka Univrsity, Japan Univrsity of Hlsinki, Finland Abstract Linar Structural Equation Modling

More information