Lossy Compression
15-211 Fundamental Data Structures and Algorithms
Peter Lee
February 18, 2003

Announcements
  The homework is available. Due on Monday, March 17, 11:59pm. Get started now!
  Quiz #2: available on Tuesday, Feb. 25. Some questions will be easier if you have some parts of the HW working.
  Read Chapter 8.
  HW is out!

Before we begin

Eliza
  Eliza was one of the first AI programs (J. Weizenbaum, 1966).
  At the time, it impressed people who used it.
  Eliza has been implemented many, many times. GNU Emacs has one: try M-x doctor.

Eliza's impact
  There are many stories of Eliza's impact:
  Some people became so dependent that Weizenbaum eventually had to withdraw its use.
  Some psychiatrists saw Eliza as a way for the profession to handle many more patients. Eliza might be used for most patients, and the human doctor reserved for only the most serious cases.
Eliza's rules
  Eliza is a remarkably simple program. Some sample rules:
    X me Y                  ->  X you Y
    I remember X            ->  Why do you remember X just now?
    My <family-member> is X ->  Who else in your family is X?
    X <family-member> Y     ->  Tell me more about your family
    X                       ->  That is very interesting

Why Eliza?
  The name was chosen for its ability to converse increasingly well.
  The Greek legend of Pygmalion: the misogynist King of Cyprus fell in love with an ivory statue, Galatea. Taking pity, Aphrodite made Galatea come alive. Pygmalion then married Galatea.

Why Eliza? (cont'd)
  George Bernard Shaw wrote a play, Pygmalion, based on the legend. Professor Higgins creates a lady from a low-class cockney flower vendor, Eliza Doolittle.
  First filmed in 1938. Later adapted into the politically correct My Fair Lady.

Wrap-Up on LZW Compression

Byte method LZW
  We start with a trie that contains a root and n children: one child for each possible character, each child labeled 0 ... n.
  We compress as before, by walking down the trie. But, after emitting a code and growing the trie, we must start from the root's child labeled c, where c is the character that caused us to grow the trie.

LZW: Byte method example
  Suppose our entire character set consists only of the four letters {a, b, c, d}.
  Let's consider the compression of a string over this alphabet.
Byte LZW: Compress example
  [Figure sequence: the dictionary trie grows step by step as codes are emitted. The output prefixes that survive in the frames are 1, 10, 103, 1033.]
Byte LZW output
  So, the input compresses to 1033, which again can be given in bit form, just like in the binary method, or compressed again using Huffman.

Byte LZW: Uncompress example
  The uncompress step for byte LZW is the most complicated part of the entire process, but is largely similar to the binary method.
  [Figure sequence: the code sequence 1033 is decoded step by step while the dictionary is rebuilt.]
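The byte method described above can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the function names are hypothetical, the "trie" is represented by a dictionary keyed on strings rather than by explicit nodes, and the input is assumed to use only characters from the given alphabet.

```python
def lzw_compress(text, alphabet):
    """Byte-method LZW: return the list of integer codes for text."""
    # Seed the dictionary with one code per character: 0 .. n-1.
    codes = {ch: i for i, ch in enumerate(alphabet)}
    out = []
    current = ""
    for ch in text:
        if current + ch in codes:
            current += ch                      # keep walking down the trie
        else:
            out.append(codes[current])         # emit the code reached so far
            codes[current + ch] = len(codes)   # grow the trie by one node
            current = ch                       # restart at the root's child for ch
    if current:
        out.append(codes[current])
    return out


def lzw_decompress(codes_in, alphabet):
    """Rebuild the text, reconstructing the same dictionary on the fly."""
    if not codes_in:
        return ""
    entries = list(alphabet)
    previous = entries[codes_in[0]]
    out = [previous]
    for code in codes_in[1:]:
        if code < len(entries):
            entry = entries[code]
        else:
            # The code names the entry that is being created right now.
            entry = previous + previous[0]
        entries.append(previous + entry[0])
        out.append(entry)
        previous = entry
    return "".join(out)


packed = lzw_compress("abababab", "abcd")
restored = lzw_decompress(packed, "abcd")
```

The special case in `lzw_decompress` is what makes uncompression the trickier half: the decoder is always one dictionary entry behind the encoder, so it can receive a code it has not finished defining yet.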
LZW applications
  LZW is an extremely useful lossless method for compressing data.
  LZW is used in the GIF and compressed TIFF standards for image data.
  Unisys holds the patent on LZW, but allows free noncommercial use.

Quiz Break

LZW performance
  Suppose we have a file of N a's:
  1. What would the output look like after LZW compression?
  2. What, roughly, is the size of the output (in big-oh terms)?
  3. How big would the output be if we used Huffman instead?

Lossy Compression
Lossy compression
  Often, we can tolerate some loss of data through the compress/decompress cycle.
  Images, and especially video/audio, can be huge. HDTV bit rate is >1 Gbps!
  Big problem for storage and network.

Techniques
  Lossy compression is based on mathematical transformations:
    Discrete Cosine Transform (DCT): used in the JPEG algorithm
    Wavelet-based image compression: used in MPEG-4
    Many, many others

Image files
  Consider this color image. Two representations:
    One-dimensional array of 3 x width x height bytes
    Three two-dimensional arrays, one for each color component
  This is part of a famous image. (Do you know who? Hint: Splay)
  The image is a 16x16 bitmap image, enlarged.
  [Figures: the 16x16 image, then its red part and its green part. Both axes are labeled 2 to 16.]
  [Figure: the blue part.]

The red image, again
  [Figure: the red part shown as a 16x16 grid of byte values.]
  Byte values (0-255) indicate the intensity of the color at each pixel.

JPEG

JPEG in a nutshell
  Joint Photographic Experts Group.
  Voted as international standard in 1992.
  Works well for both color and grayscale images.
  Many steps in the algorithm, some requiring sophistication in mathematics.
  We'll skip many parts and focus on just the main elements of JPEG.

  Pipeline:
    R, G, B -> RGB to YIQ (optional) -> Y, I, Q
    for each plane (scan):
      for each 8x8 block: DCT -> Quant -> Zig-zag -> DPCM / RLE -> Huffman -> 11010001...
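The optional RGB-to-YIQ step at the front of the pipeline is a fixed linear map applied to every pixel. The slides do not give the matrix; the sketch below assumes the commonly quoted NTSC coefficients rounded to three decimals, and the function names are hypothetical.

```python
def rgb_to_yiq(r, g, b):
    """Convert one pixel from RGB to YIQ (assumed NTSC coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    i = 0.596 * r - 0.274 * g - 0.322 * b   # first chrominance axis
    q = 0.211 * r - 0.523 * g + 0.312 * b   # second chrominance axis
    return y, i, q


def to_planes(pixels):
    """Split a flat list of (r, g, b) pixels into separate Y, I and Q planes."""
    converted = [rgb_to_yiq(r, g, b) for r, g, b in pixels]
    y_plane = [p[0] for p in converted]
    i_plane = [p[1] for p in converted]
    q_plane = [p[2] for p in converted]
    return y_plane, i_plane, q_plane


gray = rgb_to_yiq(100, 100, 100)
planes = to_planes([(255, 0, 0), (0, 0, 0)])
```

A gray pixel has all of its energy in Y and essentially none in I or Q, which is one reason the later stages can afford to treat the chrominance planes more coarsely.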
Linear transform coding
  For video, audio, or images, one key first step of the compression will be to encode values over regions of time or space.
  The basic strategy is to select a set of linear basis functions φ_i that span the space: sin, cos, wavelets, ..., defined at discrete points.

Linear transform coding (cont'd)
  Coefficients: each coefficient is the inner product of the input values with one basis function φ_i.
  In matrix notation: the coefficient vector is the product of a matrix A with the input vector, where A is an n x n matrix and each row defines a basis function.

Cosine transform
  [Figure: the cosine basis functions.]

Discrete Cosine Transform
  DCT separates the image into spectral sub-bands of differing importance.
  With input image A, the output coefficients B(k1, k2) are given by the two-dimensional DCT of A, where N1 and N2 give the image's height and width.
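To make the transform concrete, here is a small sketch of an unnormalized DCT-II, first in one dimension and then applied to the rows and columns of a block. The slide's exact equation and scaling factors are not preserved in the text, so the normalization here is an assumption, and the function names are hypothetical.

```python
import math


def dct_1d(x):
    """Unnormalized DCT-II: project x onto n sampled cosine basis functions."""
    n = len(x)
    return [
        sum(x[j] * math.cos(math.pi * k * (2 * j + 1) / (2 * n)) for j in range(n))
        for k in range(n)
    ]


def dct_2d(block):
    """Apply the 1-D transform to every row, then to every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    # cols is indexed [k2][k1]; transpose back to [k1][k2].
    return [list(r) for r in zip(*cols)]


flat_block = [[10] * 8 for _ in range(8)]
coeffs = dct_2d(flat_block)
```

For a perfectly flat 8x8 block, only the upper-left (zero-frequency) coefficient is nonzero. Smooth image regions behave similarly, which is what the later quantization step exploits.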
Quantization
  The purpose of quantization is to encode an entire region of values into a single value.
  For example, we can simply delete low-order bits: 101101 could be encoded as 1011 or 101.
  When dividing by a power of two, this amounts to deleting whole bits.
  Other division constants give finer control over bit loss.
  JPEG uses a standard quantization table.

JPEG quantization table
  [Figure: the 8x8 quantization table q.]
  Each B(k1, k2) is divided by q(k1, k2).
  The eye is most sensitive to low frequencies (upper-left).

Zig-zag scan
  Purpose is to convert an 8x8 block into a 1x64 vector, with low-frequency coefficients at the front.
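The two steps above can be sketched as follows. The quantization table in the example is a made-up uniform table, not the JPEG standard one (its values are not preserved here), and the function names are hypothetical. Note that Python's `round` rounds exact halves to the nearest even integer, which may differ from the rounding a real codec uses.

```python
def quantize(block, q):
    """Divide each coefficient by the matching table entry and round."""
    return [
        [round(block[i][j] / q[i][j]) for j in range(len(block[i]))]
        for i in range(len(block))
    ]


def zigzag_order(n=8):
    """Visit an n x n block along anti-diagonals, alternating direction."""
    cells = [(i, j) for i in range(n) for j in range(n)]
    # Odd diagonals run top to bottom, even diagonals bottom to top.
    return sorted(
        cells,
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]),
    )


def zigzag(block):
    """Flatten a square block into a vector, low frequencies first."""
    return [block[i][j] for i, j in zigzag_order(len(block))]


# Deleting low-order bits is division by a power of two:
dropped_two = 0b101101 >> 2     # 1011
dropped_three = 0b101101 >> 3   # 101

small = quantize([[16, 7], [3, 0]], [[4, 4], [4, 4]])
order = zigzag_order(8)
```

After quantization most high-frequency entries are zero, and the zig-zag order pushes them to the tail of the vector, where run-length encoding can collapse them.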
Final stages
  The DPCM (differential pulse code modulation) and RLE (run length encoding) steps take advantage of a common characteristic of many images:
    An 8x8 block is often not too different from the previous one.
    Within a block, there are often long sequences of zeros.

Examples
  [Figures: the same image saved as a GIF and as JPEGs at maximum quality, at two intermediate quality settings, and at minimum quality, each labeled with its file size.]
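The DPCM and RLE steps described under "Final stages" can be sketched as below. This is a simplified illustration with hypothetical function names: the real JPEG entropy coder packs these values differently, and the `"EOB"` end-of-block marker is only a stand-in.

```python
def dpcm(values):
    """Replace each value by its difference from the previous one."""
    out = []
    previous = 0
    for v in values:
        out.append(v - previous)
        previous = v
    return out


def run_length_zeros(values):
    """Encode a vector as (zeros_skipped, nonzero_value) pairs."""
    out = []
    zeros = 0
    for v in values:
        if v == 0:
            zeros += 1
        else:
            out.append((zeros, v))
            zeros = 0
    if zeros:
        out.append("EOB")   # trailing zeros collapse into one marker
    return out


block_averages = dpcm([50, 52, 51, 51])
packed_run = run_length_zeros([5, 0, 0, 3, 0, 0, 0])
```

Neighboring blocks with similar brightness turn into small differences, and the long zero tail of a quantized block turns into a single marker. Both outputs are then cheap to Huffman-code.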
SVD

Matrix decomposition
  Suppose A is an m x n matrix, e.g.:

    A = 120 100 120 100
         10  10  10  10
          0   0  70  80
         10 120  10  10

  We can decompose A into three matrices, U, S, and V, such that A = U S V^T.

Decomposition example: singular value decomposition

    U = 0.709  -0.772  -0.32    0.1009
        0.01   -0.000  -0.139  -0.987
        0.300   0.7121 -0.98    0.1113
        0.709   0.18    0.2    -0.01

    S = 38.1   0      0      0
        0      20.1   0      0
        0      0      7.82   0
        0      0      0      0.9919

    V = 0.209  -0.19    0.00   -0.3137
        0.338  -0.1330 -0.71   -0.873
        0.300  -0.17   -0.188   0.8081
        0.09    0.829   0.217  -0.109

  U is orthonormal: U U^T = I
  S is diagonal, with decreasing singular values
  V is orthonormal: V V^T = I

  Such a factoring of a matrix, or decomposition, is called an SVD.
  Exactly how to find U, V, and S is beyond the scope of this course. But you'll find out in your matrix/linear algebra course.
  Note: very important also for graphics/animation algorithms.

So what about compression?
  Let:
    s_i be the i-th singular value in S
    U_i be the i-th column in U
    V_i be the i-th column in V
  Then, another formula for matrix A is

    A = s_1 U_1 V_1^T + s_2 U_2 V_2^T + ... + s_K U_K V_K^T

SVD example
  With A, U, S, and V as above, take s_1 (the first diagonal entry of S), U_1 (the first column of U), and V_1 (the first column of V):

    A1 = s_1 U_1 V_1^T = 11   9 117 112
                         10   9  11  10
                         70   9  72   9
                         19  12  12   1

  This is called the rank-1 approximation.
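The sum-of-outer-products formula can be written directly. The sketch below assumes U, s, and V are already available (computing them is out of scope here, as the slides say); the function names and the tiny 2x2 example are made up for illustration and are not the matrices from the slides.

```python
def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[a * b for b in v] for a in u]


def rank_k_sum(U, s, V, k):
    """Return s_1 U_1 V_1^T + ... + s_k U_k V_k^T using columns of U and V."""
    m, n = len(U), len(V)
    total = [[0.0] * n for _ in range(m)]
    for i in range(k):
        u_i = [U[r][i] for r in range(m)]   # i-th column of U
        v_i = [V[r][i] for r in range(n)]   # i-th column of V
        term = outer(u_i, v_i)
        for r in range(m):
            for c in range(n):
                total[r][c] += s[i] * term[r][c]
    return total


# Hypothetical 2x2 factors: U and V orthonormal, singular values 3 and 1.
U = [[0, 1], [1, 0]]
V = [[1, 0], [0, 1]]
s = [3, 1]
rank1 = rank_k_sum(U, s, V, 1)   # keeps only the dominant term
full = rank_k_sum(U, s, V, 2)    # reproduces U S V^T exactly
```

Truncating the sum after the first term keeps the entry driven by the largest singular value and drops the rest, which is the rank-1 approximation in miniature.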
Let's form the rank-1 sum

    A1 = s_1 U_1 V_1^T

    A1 = 11   9 117 112
         10   9  11  10
         70   9  72   9
         19  12  12   1

  Error matrix A - A1 is: 3 12 0 1 1 0 10 1 2 11 1 2
  Relatively small with the rank-1 approximation.

What do we learn here?
  To compute A1 we only need:
    just one column from U,
    just one column from V, and
    just one singular value.
  And we get a pretty good approximation to the original matrix.
  9 bytes instead of 16. A big savings in storage!

How about the rank-2 approximation?

    A2 = s_1 U_1 V_1^T + s_2 U_2 V_2^T

  We get

    A2 = 122  98 119 100
          10   9  11  10
           2   7   9  81
          17 123  11  19

  Error matrix A - A2 (magnitudes):

    2 2 1 0
    0 1 1 0
    2 3 1 1
    3 3 1 1

Analysis
  To get an idea of how close the approximation is to the original matrix, we can calculate:
    Mean of rank-1 error matrix = 3.8125
    Mean of rank-2 error matrix = 1.3750
  where mean is the average of all the entries.
  We really don't gain much by calculating the rank-2 approximation (why?)

SVD example: observation
  With A, U, S, and V as in the decomposition example:
  The first singular value is significantly larger than the rest.
  The contribution from the rank-1 sum is very significant compared to the sum of all other rank approximations.
  So even if you leave out all other rank sums, you still get a pretty good approximation with just two vectors.
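Both measurements used in the analysis are easy to compute. The sketch below uses hypothetical function names; it takes the error entries as magnitudes, and it counts stored numbers rather than bytes (the slide's "9 bytes instead of 16" assumes one byte per number).

```python
def mean_of_entries(matrix):
    """Average of all entries of a matrix given as a list of rows."""
    flat = [x for row in matrix for x in row]
    return sum(flat) / len(flat)


def stored_numbers(m, n, k):
    """Numbers needed for a rank-k approximation of an m x n matrix:
    k columns of U (m each), k columns of V (n each), k singular values."""
    return k * (m + n + 1)


# The sixteen entries shown for the rank-2 error matrix A - A2.
error_rank2 = [
    [2, 2, 1, 0],
    [0, 1, 1, 0],
    [2, 3, 1, 1],
    [3, 3, 1, 1],
]
rank2_mean = mean_of_entries(error_rank2)
rank1_cost = stored_numbers(4, 4, 1)
rank2_cost = stored_numbers(4, 4, 2)
```

For a 4x4 matrix the rank-2 approximation already needs 18 numbers, more than the 16 in the original, which is part of why the extra accuracy is not worth it at this size.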
Some samples (128x128)
  [Figures, with file sizes as labeled:]
    Original image: 49K
    Rank 1 approx: 311 bytes
    Rank 16 approx: 13K
    Rank 8 approx: 7K

Some size observations
  Note that theoretically the sizes of the compressed images should be

    Rank 1  = 54 + (128 + 128 + 1)*3
              (BMP header + (U_1 + V_1 + s_1) * bytes/pixel)
    Rank 8  = 54 + (128 + 128 + 1)*3*8  = 6K
    Rank 16 = 54 + (128 + 128 + 1)*3*16 = 12K
    Rank 32 = 54 + (128 + 128 + 1)*3*32 = 24K
    Rank 64 = 48K (pretty close to the original)

Matlab code for SVD
  Matlab is a computer algebra system (www.mathworks.com).
  Here is Matlab code that can perform SVD on an image.

    A = imread('c:\temp\rhino', 'bmp');   % drive letter assumed
    N = size(A, 1);
    R = A(:,:,1);   % extract Red matrix
    G = A(:,:,2);   % extract Green matrix
    B = A(:,:,3);   % extract Blue matrix

  Apply SVD to each of the matrices:

    [ur,sr,vr] = svd(double(R));
    [ug,sg,vg] = svd(double(G));
    [ub,sb,vb] = svd(double(B));

Complete Matlab code for SVD (cont'd)

    A = imread('c:\temp\rosemary', 'bmp');   % drive letter assumed
    s = size(A)
    % imagesc(A)
    R = A(:,:,1);
    G = A(:,:,2);
    B = A(:,:,3);
    [ur,sr,vr] = svd(double(R));
    [ug,sg,vg] = svd(double(G));
    [ub,sb,vb] = svd(double(B));

    % initialize matrices to zero matrices
    Rk = zeros(s(1), s(2));
    Gk = zeros(s(1), s(2));
    Bk = zeros(s(1), s(2));

    k = 8;   % k is the desired rank

    % form the rank sums
    for i = 1:k, Rk = Rk + sr(i,i)*ur(:,i)*transpose(vr(:,i)); end
    for i = 1:k, Gk = Gk + sg(i,i)*ug(:,i)*transpose(vg(:,i)); end
    for i = 1:k, Bk = Bk + sb(i,i)*ub(:,i)*transpose(vb(:,i)); end

    % now form the rank-k approximation of A
    Ak = A;
    Ak(:,:,1) = Rk;
    Ak(:,:,2) = Gk;
    Ak(:,:,3) = Bk;

    % now plot the rank-k approximation of the image
    imagesc(Ak)

Matlab outputs
  [Figures: two images, each shown as the original and as a series of low-rank approximations from rank-1 up to rank-8.]
  Original images were approximately 128x128.
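The theoretical sizes in the size observations follow from one formula, sketched here in Python as a quick arithmetic check. The function name is hypothetical, and the sketch assumes what the slide's formula assumes: a 54-byte BMP header and one byte per stored number in each of the three color planes.

```python
BMP_HEADER_BYTES = 54


def theoretical_size(rank, height=128, width=128, channels=3):
    """Bytes for a rank-`rank` approximation of every color plane:
    per rank, one column of U, one column of V, and one singular value."""
    return BMP_HEADER_BYTES + (height + width + 1) * channels * rank


sizes = {rank: theoretical_size(rank) for rank in (1, 8, 16, 32, 64)}
sizes_in_k = {rank: size // 1024 for rank, size in sizes.items()}
```

At rank 64 the estimate reaches about 48K, roughly the size of the uncompressed 128x128 image, so beyond that point the decomposition stores more than the picture it approximates.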
Adaptive rank methods
  All popular image compression programs apply the compression algorithm to subblocks of the image.
  This exploits the uneven characteristics of the original image.
  If parts of the image are less complex than the others, then a smaller number of singular values is needed to obtain a "close" approximation.

Adaptive rank methods (cont'd)
  So instead of picking the same rank for each sub-block, we decide how many singular values to pick from each sub-block by looking at the following:

    Percent of r values = (s_1 + s_2 + ... + s_r) / (s_1 + s_2 + ... + s_k)

  where k is the max number of nonzero singular values of A.

Results of adaptive ranking method
  We applied the adaptive ranking method to Danny Sleator. Here are the results.
  [Figures: the original image and three reconstructions that keep decreasing percentages of the singular values, from 80% down to 10%, each labeled with its file size.]
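The percentage criterion translates into a short selection routine. This is a sketch under the assumption that "pick r" means the smallest r whose partial sum reaches the target fraction. The function name and the sample singular values are hypothetical.

```python
def choose_rank(singular_values, fraction):
    """Smallest r with (s_1 + ... + s_r) / (s_1 + ... + s_k) >= fraction.

    singular_values must be in decreasing order; zeros are ignored.
    """
    nonzero = [s for s in singular_values if s > 0]
    total = sum(nonzero)
    running = 0.0
    for r, s in enumerate(nonzero, start=1):
        running += s
        if running / total >= fraction:
            return r
    return len(nonzero)


# A simple sub-block is dominated by one singular value,
# while a busy one spreads its weight over several.
simple_block = choose_rank([10, 5, 3, 2], 0.5)
busy_block = choose_rank([10, 5, 3, 2], 0.8)
```

Applied block by block, the routine gives smooth regions a low rank and detailed regions a higher one, instead of one fixed rank everywhere.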