Chapter 6: Folding
Winter 1
Mokhtar Aboelaze

Folding
- The folding transformation is used to systematically determine the control circuits in DSP architectures where multiple algorithm operations are time-multiplexed onto a single functional unit.
- The hardware is reduced by a factor of N, and the computation time is increased by the same factor.
- Folding may lead to a large number of registers, so register minimization techniques are also studied.
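Example
[Figure: example circuit with signals labeled A, B, C, D, E and a "valid for cycles" annotation on the output]

Cycle  A       B    E          C          D
0      a0      b0   a0+b0      -          -
1      a0+b0   c0   a0+b0+c0   a0+b0      -
2      a1      b1   a1+b1      a0+b0+c0   a0+b0+c0
3      a1+b1   c1   a1+b1+c1   a1+b1      -
4      a2      b2   a2+b2      a1+b1+c1   a1+b1+c1
5      a2+b2   c2   a2+b2+c2   a2+b2      -
6      a3      b3   a3+b3      a2+b2+c2   a2+b2+c2

The last column holds a valid result only in cycles 2, 4, 6, and so on.

Folding Transformation
- The objective is to provide a systematic technique for designing control circuits for hardware where several algorithm operations are mapped to the same piece of hardware via time-multiplexing.
- We start with a DFG for the algorithm.
- We need the following definitions.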
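The time-multiplexing in the example table can be sketched in a few lines of Python. This is only an illustrative simulation, assuming a folding factor of N = 2, one adder, and one register that latches the adder result every cycle; the function name `folded_sum` is hypothetical and not part of the original slides.

```python
def folded_sum(a, b, c):
    """Compute y(n) = a(n) + b(n) + c(n) using one adder (folding factor N = 2)."""
    outputs = []
    register = 0  # holds the adder result produced in the previous cycle
    for cycle in range(2 * len(a)):
        n = cycle // 2                      # iteration index
        if cycle % 2 == 0:
            left, right = a[n], b[n]        # even cycle: adder forms a(n) + b(n)
        else:
            left, right = register, c[n]    # odd cycle: adder adds c(n) to the stored sum
        register = left + right             # result is latched at the end of the cycle
        if cycle % 2 == 1:
            outputs.append(register)        # complete sum a(n) + b(n) + c(n)
    return outputs


print(folded_sum([1, 2, 3], [10, 20, 30], [100, 200, 300]))  # [111, 222, 333]
```

One adder does the work of two, at the cost of two cycles per output sample, which is the hardware/time trade-off stated at the start of the chapter.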
Folding Transformation
- U and V are two nodes in the original DFG.
- U and V are connected via an edge e: U → V with w(e) delays.
- The folding factor is N.
- The l-th iteration of node U is performed at time Nl + u.
- The l-th iteration of node V is performed at time Nl + v.
- H_u and H_v are the hardware units on which U and V are performed.
- H_u and H_v are pipelined by P_u and P_v stages, respectively.
Folding Transformation
- The result of the l-th iteration of node U is available at time Nl + u + P_u.
- Since there are w(e) delays between U and V, the result is needed in the (l + w(e))-th iteration of V, which is executed at time N(l + w(e)) + v.
- The number of storage units needed on the folded edge is therefore

  $D_F(U \to V) = [N(l + w(e)) + v] - [Nl + P_u + u] = N w(e) - P_u + v - u$

Folding Transformation: Folding Set
- A folding set is an ordered set of operations executed by the same functional unit.
- Each folding set contains N entries, some of which may be null operations.
- The operation in the j-th position within the folding set is executed in time partition j.
- For example, with the folding set S1 = {A1, φ, A2} for N = 3, A1 is performed during the 0th time partition, written (S1|0), while A2 is done in the 2nd time partition, written (S1|2).
- The folding set is obtained using a scheduling and allocation algorithm.
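Example
- N = 4.
- The folding sets are S1 = {4, 2, 3, 1} for the adder and S2 = {5, 8, 6, 7} for the multiplier.
- Addition takes 1 time unit and multiplication takes 2 time units (1-stage pipelined adder and 2-stage pipelined multiplier).

Example
$D_F(U \to V) = N w(e) - P_u + v - u$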
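As a worked instance of the folding equation, the sketch below evaluates D_F for two edges of this example. The folding sets and pipeline depths are the ones stated above (S1 = {4, 2, 3, 1} on a 1-stage adder, S2 = {5, 8, 6, 7} on a 2-stage multiplier). The edge delays w passed in at the bottom are assumptions, chosen so that the results agree with the delays quoted later in these notes (5 for edge 1 → 8 and 1 for edge 7 → 3); the helper names are hypothetical.

```python
N = 4  # folding factor

FOLDING_SETS = {
    "adder": [4, 2, 3, 1],        # S1
    "multiplier": [5, 8, 6, 7],   # S2
}
PIPELINE_STAGES = {"adder": 1, "multiplier": 2}

# Map every node to its hardware unit and to its position (time partition).
unit_of = {}
position_of = {}
for unit, nodes in FOLDING_SETS.items():
    for pos, node in enumerate(nodes):
        unit_of[node] = unit
        position_of[node] = pos


def folding_delay(src, dst, w):
    """D_F(U -> V) = N*w - P_u + v - u for an edge U -> V carrying w delays."""
    p_u = PIPELINE_STAGES[unit_of[src]]
    return N * w - p_u + position_of[dst] - position_of[src]


# The w values below are assumed for illustration, not read from the slides.
print(folding_delay(1, 8, w=2))  # 4*2 - 1 + 1 - 3 = 5
print(folding_delay(7, 3, w=1))  # 4*1 - 2 + 2 - 3 = 1
```

A negative return value would mean the folded edge is not realizable as it stands, which is what retiming for folding repairs.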
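Example
[Figure: two adders connected by a folded edge, annotated with the arrival time of the data and the delay after which it is used]

Folding Transformation
- What if some of the D_F values are negative? We cannot implement a negative delay.
- Condition for a valid folding: $D_F(U \to V) \ge 0$ for every edge.
- We can use retiming of the original graph to obtain valid D_F values.
- Recall the retiming equation for an edge e: U → V:

  $w_r(e) = w(e) + r(V) - r(U)$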
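Folding Transformation
The delay on an edge in the folded retimed graph is

$D'_F(U \to V) = N w_r(e) - P_u + v - u$
$D'_F(U \to V) = N (w(e) + r(V) - r(U)) - P_u + v - u$
$D'_F(U \to V) = D_F(U \to V) + N r(V) - N r(U)$

Requiring $D'_F(U \to V) \ge 0$ gives

$r(U) - r(V) \le D_F(U \to V) / N$

and, because the retiming values are integers,

$r(U) - r(V) \le \lfloor D_F(U \to V) / N \rfloor$

Folding Transformation
- We can use the techniques in Chapter 4 to solve these inequalities for the retiming values.
- Then we fold the retimed graph, which is a valid transformation.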
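The constraint can be checked mechanically. The sketch below is a minimal illustration on a hypothetical two-node graph (nodes "A" and "B" and their D_F values are made up for the example, not taken from the slides). It relies on Python's `//` operator, which rounds toward negative infinity and therefore matches the floor in the inequality even for negative D_F.

```python
def retiming_bound(d_f, n):
    """Right-hand side of r(U) - r(V) <= floor(D_F / N)."""
    return d_f // n  # floor division, also correct for negative d_f


def retimed_folding_delay(d_f, n, r_u, r_v):
    """D'_F(U -> V) = D_F(U -> V) + N*r(V) - N*r(U)."""
    return d_f + n * r_v - n * r_u


def is_valid_retiming(edges, r, n):
    """edges: list of (U, V, D_F). True if every constraint is satisfied."""
    return all(r[u] - r[v] <= retiming_bound(d_f, n) for u, v, d_f in edges)


# Hypothetical example: one edge has a negative folded delay before retiming.
N = 4
edges = [("A", "B", -3), ("B", "A", 5)]
r = {"A": -1, "B": 0}

print(retiming_bound(-3, N))                 # -1
print(is_valid_retiming(edges, r, N))        # True
print([retimed_folding_delay(d, N, r[u], r[v]) for u, v, d in edges])  # [1, 1]
```

After retiming, both folded delays are non-negative, and their sum around the loop is unchanged (2 before and after), as expected from a retiming.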
Exercise
$r(U) - r(V) \le \lfloor D_F(U \to V) / N \rfloor$
Exercise Solution
- One solution is r1=-1 r2= r3= r4= r5= r6=-1 r7=-1 r8=-1
- Can we reach the above solution with the method we studied in this course? No.
- Another solution is r1 = -1, r2 = 0, r3 = -1, r4 = 0, r5 = -1, r6 = -1, r7 = -2, r8 = -2.
Register Minimization Techniques
The objective is to minimize the number of registers in the implementation of a DSP algorithm.
Topics:
- Lifetime analysis
- Data allocation using forward-backward register allocation
- Register minimization in folded architectures
- Examples

Lifetime Analysis
- A data sample (variable) is alive from the time it is produced until the time it is consumed, after which it is dead.
- During that time, the variable is stored in a register.
- The maximum number of live variables at any time is the minimum number of registers required for the implementation.
- We use the convention that a variable is not alive during the cycle in which it is produced, and is alive during the cycle in which it is consumed.
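[Figure: linear lifetime chart for variables a, b, c over cycles 0 to 7, with the number of live variables in each cycle. The schedule is periodic with a period of N = 6.]

Example: Matrix Transpose

a b c                  a d g
d e f   -> Transpose -> b e h
g h i                  c f i

[Figure: the input stream i h g f e d c b a enters the matrix transposer, which produces the output stream i f c h e b g d a; sample a is first in both streams.]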
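Example
T_zlout is the output time with zero delay, T_diff = T_zlout - T_input, and T_output = T_zlout + 4, where 4 is the smallest latency that makes every lifetime non-negative.

Sample  T_input  T_zlout  T_diff  T_output  Life
a       0        0        0       4         0→4
b       1        3        2       7         1→7
c       2        6        4       10        2→10
d       3        1        -2      5         3→5
e       4        4        0       8         4→8
f       5        7        2       11        5→11
g       6        2        -4      6         6→6
h       7        5        -2      9         7→9
i       8        8        0       12        8→12

[Figure: circular lifetime chart]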
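The lifetime table and the resulting register count can be reproduced with a short script. This is a sketch under the assumptions already used in these notes: samples arrive one per cycle in row-major order, a variable is live from the cycle after it is produced through the cycle it is consumed, and the schedule repeats with period 9 for the 3x3 case. The function names are hypothetical.

```python
from collections import Counter


def transposer_lifetimes(n=3):
    """Return (t_input, t_zlout, t_diff, t_output) for an n x n matrix transposer."""
    size = n * n
    t_input = list(range(size))  # sample k arrives at cycle k (row-major order)
    # The sample at (row, col) leaves at position col*n + row of the output stream.
    t_zlout = [(k % n) * n + (k // n) for k in range(size)]
    t_diff = [out - inp for out, inp in zip(t_zlout, t_input)]
    latency = max(0, -min(t_diff))  # smallest shift making all lifetimes >= 0
    t_output = [z + latency for z in t_zlout]
    return t_input, t_zlout, t_diff, t_output


def min_registers(t_input, t_output, period):
    """Maximum number of live variables per cycle, folded modulo the period."""
    live = Counter()
    for start, end in zip(t_input, t_output):
        for cycle in range(start + 1, end + 1):  # not live when produced
            live[cycle % period] += 1
    return max(live.values(), default=0)


t_in, t_zl, t_df, t_out = transposer_lifetimes(3)
print(t_zl)                           # [0, 3, 6, 1, 4, 7, 2, 5, 8]
print(t_df)                           # [0, 2, 4, -2, 0, 2, -4, -2, 0]
print(t_out)                          # [4, 7, 10, 5, 8, 11, 6, 9, 12]
print(min_registers(t_in, t_out, 9))  # 4
```

The count of 4 agrees with the four registers R1 to R4 used in the allocation table for this example.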
Data Allocation
- Determine the minimum number of registers.
- Input each variable at the time its life starts. If more than one variable is input in the same cycle, use multiple registers such that the variable with the longest lifetime is allocated to the initial register.
- Each variable is allocated in a forward manner until it is dead or reaches the last register.
- Allocation is periodic: the allocation of the current iteration repeats itself after N cycles.
- If a variable reaches the last register and is not dead, allocate it backward (if more than one register is available, choose one that has been allocated backward before), then forward again, and so on.

Allocation table

Cycle  I/P  R1  R2  R3  R4  O/P
0      a    -   -   -   -   -
1      b    a   -   -   -   -
2      c    b   a   -   -   -
3      d    c   b   a   -   -
4      e    d   c   b   a   a
5      f    e   d   c   b   d
6      g    f   e   b   c   g
7      h    c   f   e   b   b
8      i    h   c   f   e   e
9      -    i   h   c   f   h
10     -    -   i   f   c   c
11     -    -   -   i   f   f
12     -    -   -   -   i   i
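[Figure: lifetime chart for variables a, b, c over cycles 0 to 7. The number of live variables per cycle is 0, 1, 2, 2, 2, 2, 2+0, 2+1 = 3. With period N = 6, cycles 6 and 7 overlap cycles 0 and 1 of the next iteration, so three registers are required.]

Cycle  Input  R1  R2  R3  Output
0      a      -   -   -   -
1      b      a   -   -   -
2      -      b   a   -   -
3      -      -   b   a   -
4      c      a   -   b   a
5      -      c   b   -   -
6      -      -   c   b   -
7      -      -   b   c   b, c

Synthesis
The architecture is synthesized from the allocation table of the matrix transposer (columns Cycle, I/P, R1, R2, R3, R4, O/P, cycles 0 to 12): every move of a variable between two columns in consecutive cycles becomes a connection, switched at that cycle.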
Register Minimization for Folded Architectures
1. Perform retiming for folding.
2. Write the folding equations.
3. Use the folding equations to construct a lifetime table.
4. Draw the lifetime chart and determine the minimum number of registers.
5. Perform forward-backward register allocation.
6. Draw the folded architecture.
Example
- Consider the biquad example.
- The result of node U is created at time u + P_u.
- The result of node U is consumed at time u + P_u + max_V {D_F(U → V)}.
- We have already solved this example and obtained the folding equations $D_F(U \to V) = N w(e) - P_u + v - u$.

Biquad Filter

Node  T_input  T_output
1     4        9
2     -        -
3     3        3
4     1        1
5     2        2
6     4        4
7     5        6
8     3        4

Here T_input = u + P_u is the time at which the node produces its data, and T_output = u + P_u + max_V {D_F(U → V)}.
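The lifetime table can be recomputed from the folding sets and the folded edge delays. In the sketch below, the positions and pipeline depths come from the folding sets S1 = {4, 2, 3, 1} (1-stage adder) and S2 = {5, 8, 6, 7} (2-stage multiplier), and the D_F values are the per-edge delays quoted in the edge-by-edge walk-through of these notes. The dictionary and function names are hypothetical.

```python
from collections import Counter

N = 4
POSITION = {4: 0, 2: 1, 3: 2, 1: 3, 5: 0, 8: 1, 6: 2, 7: 3}
STAGES = {1: 1, 2: 1, 3: 1, 4: 1, 5: 2, 6: 2, 7: 2, 8: 2}
D_F = {
    (1, 2): 1, (1, 5): 0, (1, 6): 2, (1, 7): 3, (1, 8): 5,
    (3, 1): 0, (4, 2): 0, (5, 3): 0, (6, 4): 0, (7, 3): 1, (8, 4): 1,
}


def lifetime_table():
    """Map node -> (T_input, T_output); nodes without outgoing edges are skipped."""
    table = {}
    for node in sorted(POSITION):
        outgoing = [d for (src, _dst), d in D_F.items() if src == node]
        if not outgoing:
            continue  # node 2 only drives the output
        t_input = POSITION[node] + STAGES[node]       # u + P_u
        table[node] = (t_input, t_input + max(outgoing))
    return table


def max_live(table, period):
    """Largest number of simultaneously live variables, modulo the period."""
    live = Counter()
    for t_input, t_output in table.values():
        for cycle in range(t_input + 1, t_output + 1):
            live[cycle % period] += 1
    return max(live.values(), default=0)


table = lifetime_table()
print(table[1], table[7], table[8])  # (4, 9) (5, 6) (3, 4)
print(max_live(table, N))            # 2
```

The result of 2 matches the two registers, R1 and R2, that appear in the folded architecture.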
Biquad Filter
[Figure: final folded architecture of the biquad filter]
Explanation
- Now we build the architecture (the final architecture is on the previous slide) by considering every edge in the graph.
- The partial architectures are then combined to make the final one.
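Each of the following steps is accompanied by a figure of the partial architecture showing OUT, the constants a, b, c, d, IN, the registers R1 and R2, and the switch times in braces.

Considering 1 → 2 (adder to adder), delay of 1
- The result of node 1 is produced at time 4l+0 and is needed by node 2 at time 1, a delay of 1.
- This gives an edge from R1 (where the result will be after 1 time unit) to node 2 (the adder), switched at time 4l+1.
- Also, the input is switched at 4l+3 and the output at 4l+2.

1 → 5 (adder to multiplier), delay of 0
- This edge requires a delay of D_F(1 → 5) = 0: the result is created at time 0 and consumed at time 0.
- This gives a path from the output of the adder to the input of the multiplier, switched at 4l+0.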
1 → 6 (adder to multiplier), delay of 2
- Node 1 produces its result at time 0; it is needed after a delay of 2 at the multiplier, at time 2.
- We need switches to move it from the adder to R1 at 0, from R1 to R2 at 1, and from R2 to the multiplier at 2.

1 → 7, delay of 3
- Node 1 produces its result at time 0; it is needed after a delay of 3, at time 3.
- We need switches to go from the adder to R1 at 0, R1 → R2 at 1, R2 → R2 at 2, and R2 → multiplier at 3.
1 → 8, delay of 5
- Node 1 produces its result at time 0; it is needed after a delay of 5 at the multiplier, at time 4l+1.
- Switches: adder → R1 at 0, R1 → R2 at 1, R2 → R2 at 2, R2 → R2 at 3, R2 → R2 at 4 (4l+0), and R2 → multiplier at 5 (4l+1).

3 → 1, delay of 0
- Node 3 produces its result at time 3, and it is needed at time 3 at the adder.
4 → 2, delay of 0
- Node 4 produces its result at time 1, and it is needed at the adder at time 1 with no delay.
- This gives a switch from the output of the adder to the input of the adder, with no delay, at time 1.

5 → 3, delay of 0
- Node 5 produces its result at time 2, and it is needed at the adder with no delay.
- This gives a switch from the output of the multiplier to the input of the adder at time 2.
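6 → 4, delay of 0
- Node 6 produces its result at time 4, i.e. 4l+0, and it is needed with no delay at the adder.
- This gives a switch from the output of the multiplier to the input of the adder at time 0.

7 → 3, delay of 1
- Node 7 produces its result at time 3+2 = 5, i.e. 4l+1, and it is needed after 1 delay at the adder.
- This gives a switch from the multiplier to R1 at time 1, and from R1 to the adder at time 2.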
8 → 4, delay of 1
- Node 8 produces its result at time 1+2 = 3, and it is needed after one delay at the input of the adder.
- This gives a switch from the multiplier to R1 at time 3, and from R1 to the adder at time 0.

[Figure: final architecture with OUT, the constants a, b, c, d, IN, the registers R1 and R2, and all switch time sets]
- Superimposing the switches produces the final architecture.
- Note that the constants a, b, c, d are muxed into the multiplier at times {0, 2, 3, 1}, respectively.
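The switch times listed for the individual edges follow one pattern, which can be written down as a small generator. This is only a sketch of the pattern seen in this biquad walk-through (data enters R1, moves to R2 one cycle later, and is then held in R2 until it is consumed), with all times taken modulo N = 4. It is not a general register allocation algorithm, and the function name is hypothetical.

```python
def switch_schedule(src, dst, t_produced, d_f, n=4):
    """List (from, to, switch time mod n) hops for one folded edge of the biquad example."""
    if d_f == 0:
        return [(src, dst, t_produced % n)]      # direct connection, no register
    hops = [(src, "R1", t_produced % n)]          # first cycle of delay is spent in R1
    if d_f == 1:
        hops.append(("R1", dst, (t_produced + 1) % n))
        return hops
    hops.append(("R1", "R2", (t_produced + 1) % n))
    for k in range(2, d_f):                       # remaining cycles are held in R2
        hops.append(("R2", "R2", (t_produced + k) % n))
    hops.append(("R2", dst, (t_produced + d_f) % n))
    return hops


# Edge 1 -> 8: produced at time 0, delay of 5.
for hop in switch_schedule("adder", "multiplier", 0, 5):
    print(hop)
# Edge 8 -> 4: produced at time 3, delay of 1.
print(switch_schedule("multiplier", "adder", 3, 1))
```

For edge 1 → 8 this reproduces the sequence adder → R1 at 0, R1 → R2 at 1, R2 → R2 at 2, 3 and 0, and R2 → multiplier at 1. Taking the union of the switch times per connection over all edges gives the switch sets of the final architecture.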