Content-Based Shape Retrieval Using Different Shape Descriptors: A Comparative Study
Dengsheng Zhang and Guojun Lu
Gippsland School of Computing and Information Technology
Monash University
Churchill, Victoria 3842
Australia
dengsheng.zhang, guojun.lu@infotech.monash.edu.au
http://www.gscit.monash.edu.au/~dengs/
http://www.gscit.monash.edu.au/~guojunl/

Outline
Motivations
Comparison of Contour Descriptors
  o Fourier Descriptors (FD)
  o Curvature Scale Space Descriptors (CSSD)
  o Retrieval Experiments
  o Discussions
Comparison of Region Descriptors
  o Zernike Moments Descriptors (ZMD)
  o Grid Descriptors (GD)
  o Retrieval Experiments
  o Discussions
Conclusions

Motivations
Content-Based Image Retrieval
Investigate Techniques for Shape Representation
Compare Other Techniques With MPEG-7 Shape Descriptors

FD-I
FD is derived by applying the Fourier transform to a shape signature, such as the centroid distance function $r(t)$.
Signature Derivation
$$r(t) = \left( [x(t) - x_c]^2 + [y(t) - y_c]^2 \right)^{1/2}, \quad t = 0, 1, \ldots, N-1$$
where
$$x_c = \frac{1}{N} \sum_{t=0}^{N-1} x(t), \qquad y_c = \frac{1}{N} \sum_{t=0}^{N-1} y(t)$$
and $(x(t), y(t))$ are the shape boundary coordinates.
Figure 1. Three apple shapes on the top and their centroid distance signatures on the bottom.

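The signature derivation above can be sketched in a few lines of Python. This is a minimal illustration, assuming the boundary is already available as an ordered list of (x, y) points; the function name `centroid_distance` is hypothetical and does not come from the slides.

```python
import math

def centroid_distance(boundary):
    """Return the centroid distance signature r(t) of a closed boundary.

    boundary: ordered list of (x, y) boundary points (assumed input format).
    """
    n = len(boundary)
    # Centroid (x_c, y_c) is the mean of the boundary coordinates.
    xc = sum(x for x, _ in boundary) / n
    yc = sum(y for _, y in boundary) / n
    # r(t) is the Euclidean distance from each boundary point to the centroid.
    return [math.hypot(x - xc, y - yc) for x, y in boundary]

# Usage: the four corners of a square are all equally far from the centroid.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
signature = centroid_distance(square)
```

Because the centroid moves with the shape, translating every boundary point by the same offset leaves the signature unchanged.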
FD-II
FD derivation
  o The discrete Fourier transform of $r(t)$ is then given by
    $$a_n = \frac{1}{N} \sum_{t=0}^{N-1} r(t) \exp\left( \frac{-j 2 \pi n t}{N} \right), \quad n = 0, 1, \ldots, N-1$$
    where $a_n$ are the Fourier transformed coefficients of $r(t)$.
  o Translation Normalization. $a_n$ is translation invariant because $r(t)$ is translation invariant.
  o Rotation Normalization. Ignore the phase information of $a_n$ and retain only the magnitude $|a_n|$.
  o Scaling Normalization. All the other coefficient magnitudes are normalized by $|a_0|$.
  o Number of FDs Used. 5, 10, 15, 30, 60 and 90 FDs are tested for retrieval; results show that 10 FDs are sufficient for shape representation.
Similarity Measurement. Euclidean distance
$$d = \left( \sum_{i=1}^{N} \left| FD_i^Q - FD_i^T \right|^2 \right)^{1/2}$$

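The derivation and matching steps above can be sketched as follows. This is an illustrative version using only the standard library (a direct O(N^2) DFT rather than an FFT); the function names are hypothetical, and taking the coefficients $a_1, \ldots, a_{10}$ as the 10 FDs is an assumption, since the slides do not say which coefficients are kept.

```python
import cmath
import math

def fourier_descriptors(r, num_fd=10):
    """Return num_fd normalized Fourier descriptors of a signature r(t).

    Assumes len(r) > num_fd and a non-degenerate shape (a_0 != 0).
    """
    n_pts = len(r)
    magnitudes = []
    for n in range(num_fd + 1):
        # Direct evaluation of a_n = (1/N) * sum r(t) * exp(-j*2*pi*n*t/N).
        a_n = sum(r[t] * cmath.exp(-2j * math.pi * n * t / n_pts)
                  for t in range(n_pts)) / n_pts
        # Dropping the phase gives rotation (start point) invariance.
        magnitudes.append(abs(a_n))
    # Dividing by |a_0| gives scale invariance.
    return [mag / magnitudes[0] for mag in magnitudes[1:]]

def euclidean(u, v):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

With this sketch, scaling a signature or circularly shifting its starting point should leave the descriptors unchanged up to rounding, which is the behaviour the normalization steps are meant to provide.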
CSSD-I
Curvature Derivation
$$k(t) = \frac{\dot{x}(t)\,\ddot{y}(t) - \ddot{x}(t)\,\dot{y}(t)}{\left( \dot{x}(t)^2 + \dot{y}(t)^2 \right)^{3/2}}$$
Gaussian Smoothing of Curvature
$$x'(t) = x(t) \ast g(t, \sigma), \qquad y'(t) = y(t) \ast g(t, \sigma)$$
CSS Map Computation
CSS Peak Extraction
Figure 2. The CSS contour map and the CSS peaks map of a) apple 1; b) apple 2; c) apple 3.

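A rough sketch of the first two steps, assuming a closed boundary sampled at equal steps and stored as two coordinate lists. The truncated Gaussian kernel (3 sigma), the central finite differences and all function names are illustrative choices, not details taken from the slides. The zero crossings of the curvature, collected over increasing sigma, are what a CSS map is built from.

```python
import math

def gaussian_smooth(values, sigma):
    """Circularly convolve a closed-curve coordinate list with a Gaussian."""
    radius = max(1, int(3 * sigma))  # kernel truncated at about 3 sigma
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in offsets]
    total = sum(weights)
    n = len(values)
    return [sum(w * values[(t + i) % n] for i, w in zip(offsets, weights)) / total
            for t in range(n)]

def curvature(xs, ys, sigma):
    """Curvature k(t) of the boundary after smoothing at scale sigma."""
    x = gaussian_smooth(xs, sigma)
    y = gaussian_smooth(ys, sigma)
    n = len(x)
    k = []
    for t in range(n):
        # Central differences; index t - 1 wraps around for t == 0.
        dx = (x[(t + 1) % n] - x[t - 1]) / 2.0
        dy = (y[(t + 1) % n] - y[t - 1]) / 2.0
        ddx = x[(t + 1) % n] - 2.0 * x[t] + x[t - 1]
        ddy = y[(t + 1) % n] - 2.0 * y[t] + y[t - 1]
        k.append((dx * ddy - ddx * dy) / (dx * dx + dy * dy) ** 1.5)
    return k

def zero_crossings(k):
    """Indices where the curvature changes sign (inflection points)."""
    return [t for t in range(len(k)) if k[t - 1] * k[t] < 0]
```

For a circle the curvature is constant and has no zero crossings, so its CSS map is empty; concavities in a boundary are what produce the arches and peaks shown in Figure 2.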
CSSD-II
Normalization
  o Translation Normalization. CSSD is translation invariant because curvature is translation invariant.
  o Scale Normalization. Normalize all the shapes to a fixed number of boundary points (e.g. 128).
  o Rotation Normalization. Circularly shift the highest peak to the origin of the CSS map.
Similarity measurement
  o The similarity between two shapes A and B is measured by the sum of the peak differences between all the matched peaks plus the peak values of all the unmatched peaks.
  o To increase robustness, four schemes of circular shift matching are applied in order to tolerate variations in the peak heights of potential matching peaks:
      shifting the primary peak (the highest peak) of A to match the primary peak of B (the other peaks of A are shifted accordingly);
      shifting the primary peak of A to match the secondary peak (the second highest CSS peak) of B;
      shifting the secondary peak of A to match the primary peak of B;
      shifting the secondary peak of A to match the secondary peak of B.
  o Mirror shape matching

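The four shift schemes above can be sketched as below. This is a simplified illustration under several assumptions that the slides do not state: each peak is a (position, height) pair with position normalized to [0, 1); a "peak difference" is taken to be the absolute height difference; peaks are paired greedily within a position tolerance `tol`; and mirror matching is left out. All names are hypothetical.

```python
def circular_gap(p, q):
    """Distance between two positions on a circular axis of length 1."""
    d = abs(p - q) % 1.0
    return min(d, 1.0 - d)

def match_cost(peaks_a, peaks_b, shift, tol=0.1):
    """Cost of matching A to B after circularly shifting A by `shift`."""
    shifted = [((pos + shift) % 1.0, h) for pos, h in peaks_a]
    unmatched_b = list(peaks_b)
    cost = 0.0
    # Match the highest peaks of A first.
    for pos, h in sorted(shifted, key=lambda p: -p[1]):
        near = [q for q in unmatched_b if circular_gap(pos, q[0]) <= tol]
        if near:
            best = min(near, key=lambda q: circular_gap(pos, q[0]))
            cost += abs(h - best[1])      # difference of matched peaks
            unmatched_b.remove(best)
        else:
            cost += h                     # unmatched peak of A
    return cost + sum(h for _, h in unmatched_b)  # unmatched peaks of B

def css_distance(peaks_a, peaks_b, tol=0.1):
    """Minimum cost over the primary/secondary peak alignment schemes."""
    if not peaks_a or not peaks_b:
        return sum(h for _, h in peaks_a) + sum(h for _, h in peaks_b)
    top_a = sorted(peaks_a, key=lambda p: -p[1])[:2]
    top_b = sorted(peaks_b, key=lambda p: -p[1])[:2]
    return min(match_cost(peaks_a, peaks_b, pb[0] - pa[0], tol)
               for pa in top_a for pb in top_b)

# Usage: B is A rotated by a quarter of the boundary length.
peaks_a = [(0.1, 5.0), (0.4, 3.0), (0.7, 1.0)]
peaks_b = [(0.35, 5.0), (0.65, 3.0), (0.95, 1.0)]
distance = css_distance(peaks_a, peaks_b)
```

Even in this reduced form, each comparison tries several alignments and pairs peaks one by one, which makes it noticeably more involved than a single Euclidean distance between fixed-length vectors.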
Retrieval Effectiveness
Framework. Java-based indexing and retrieval framework
Platform. Windows platform on Pentium III-866 PC
Database. MPEG-7 contour shape database CE-1 Set A1, Set A2 and Set B. Set A1 and Set A2, each consisting of 420 shapes of 70 classes, are for testing scale invariance and rotation invariance respectively. Set B consists of 1400 shapes for testing overall robustness.
Query Method. All the shapes in Set A1, A2 and B are used as queries.
Evaluation Method. Precision and recall. Precision: the ratio of the number of retrieved relevant shapes to the total number of retrieved shapes. Recall: the ratio of the number of retrieved relevant shapes to the total number of relevant shapes in the whole database. For each query, the precision of the retrieval at each level of recall is obtained. The reported precision is the average precision over all the query retrievals.

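The two measures can be made concrete with a short sketch. It assumes each retrieval returns a ranked list of class labels and reports precision each time a new relevant shape is found; the function name and the label-based input format are illustrative, not part of the framework described above.

```python
def precision_recall(ranked_labels, query_label, total_relevant):
    """Return (recall, precision) pairs, one per relevant shape retrieved.

    ranked_labels: class labels of retrieved shapes, best match first.
    total_relevant: number of shapes of the query's class in the database.
    """
    points = []
    hits = 0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            # recall = relevant retrieved / all relevant in the database
            # precision = relevant retrieved / all retrieved so far
            points.append((hits / total_relevant, hits / rank))
    return points

# Usage: three relevant shapes, found at ranks 1, 3 and 4.
ranking = ["apple", "pear", "apple", "apple", "pear"]
curve = precision_recall(ranking, "apple", total_relevant=3)
```

Averaging the precision values at matching recall levels over all queries gives one precision-recall curve per descriptor.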
Figure 3. Average precision and recall of retrieval using FD and CSSD on a) Set A1; b) Set A2; c) Set B. (Each panel plots precision against recall, with one curve for CSSD and one for FD.)

Computation Efficiency
Platform. Windows platform on Pentium III-866 PC
Database. Set B of MPEG-7 contour shape database
Table 1. The elapsed time of feature extraction and retrieval for 1400 shapes

Shape       Total time of        Average time of      Total time of      Average time of
descriptor  feature extraction   feature extraction   retrieval of       retrieval of
            of 1400 shapes       of each shape        1400 queries       each query
FD          80960 ms             57.8 ms              54150 ms           38.6 ms
CSSD        120629 ms            86.1 ms              317178 ms          226.5 ms

Discussions: Contour-based Shape Descriptors
Feature Domains. FD is obtained from the spectral domain while CSSD is obtained from the spatial domain.
Dimensions. The dimension of the FD feature is constant, while the dimension of the CSSD feature varies for each shape.
Computation Complexity. The computation of CSSD is more complex than that of FD. CSSD requires an extra scale normalization step before extraction, and the extraction itself takes two processes, i.e., CSS map computation and CSS peak extraction.
Online Matching Computation. The online matching of two sets of FDs is simply the Euclidean distance between two feature vectors of 10 dimensions. The online matching of two sets of CSSDs involves at least 8 schemes of circular shift matching, and each scheme involves 6 shifts and the Euclidean distance calculation between two feature vectors of 6 dimensions.

Type of Features Captured. FD captures both global features and local features while CSSD does not capture global features.
Parameters or Thresholds Influence. For FD, the parameter is the number of FDs, which is predictable. For CSSD, the parameters are the number of sampling points, the threshold to eliminate short peaks and the tolerance value for peak position matching, which are determined empirically. In MPEG-7 [ISO00], four more empirical factors are used to implement this technique: the scale factor and the exponential factor for the peak transformation, the threshold to eliminate all the peaks in a shape and the database-dependent value used for peak normalization.
Hierarchical Representation. FD supports hierarchical coarse-to-fine representation while CSSD does not. In order to support hierarchical representation, CSSD has to incorporate global shape features such as eccentricity and circularity, which are unreliable.
Suitability for Efficient Indexing. FD is suitable for organization into an efficient data structure, while CSSD is not, due to its variable dimensions and complex distance calculation.

ZMD-I
Zernike Polynomials:
$$V_{nm}(x, y) = V_{nm}(\rho \cos\theta, \rho \sin\theta) = R_{nm}(\rho) \exp(j m \theta)$$
where
$$R_{nm}(\rho) = \sum_{s=0}^{(n-|m|)/2} (-1)^s \frac{(n-s)!}{s! \left( \frac{n+|m|}{2} - s \right)! \left( \frac{n-|m|}{2} - s \right)!} \, \rho^{n-2s}$$
and $\rho$ is the radius from $(x, y)$ to the shape centroid, $\theta$ is the angle between $\rho$ and the x axis, and $n$ and $m$ are integers subject to $n - |m|$ even and $|m| \le n$. Zernike polynomials are a complete set of complex-valued functions orthogonal over the unit disk, i.e., $x^2 + y^2 \le 1$.
Complex Zernike Moments of Order n with Repetition m
$$A_{nm} = \frac{n+1}{\pi} \sum_{x} \sum_{y} f(x, y) \, V^{*}_{nm}(x, y), \quad x^2 + y^2 \le 1$$

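The two formulas above translate fairly directly into code. The sketch below assumes a square binary image given as a list of rows whose pixel centres are mapped onto [-1, 1] x [-1, 1], with pixels outside the unit disk skipped; that mapping and the function names are illustrative assumptions, and the centroid shift and scaling steps are not included.

```python
import cmath
import math

def radial_poly(n, m, rho):
    """Zernike radial polynomial R_nm(rho); requires n - |m| even, |m| <= n."""
    m = abs(m)
    total = 0.0
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * math.factorial(n - s)
        den = (math.factorial(s)
               * math.factorial((n + m) // 2 - s)
               * math.factorial((n - m) // 2 - s))
        total += num / den * rho ** (n - 2 * s)
    return total

def zernike_moment(image, n, m):
    """Complex Zernike moment A_nm of a square binary image."""
    size = len(image)
    acc = 0j
    for row in range(size):
        for col in range(size):
            if not image[row][col]:
                continue
            # Map pixel centres to [-1, 1]; the image centre maps to (0, 0).
            x = (2 * col - size + 1) / size
            y = (2 * row - size + 1) / size
            rho = math.hypot(x, y)
            if rho > 1.0:
                continue  # outside the unit disk
            theta = math.atan2(y, x)
            # f(x, y) * conj(V_nm) = R_nm(rho) * exp(-j * m * theta)
            acc += radial_poly(n, m, rho) * cmath.exp(-1j * m * theta)
    return (n + 1) / math.pi * acc
```

Rotating the image only multiplies $A_{nm}$ by a unit-magnitude phase factor, so $|A_{nm}|$ stays the same; this is why the magnitude alone can serve as a rotation-invariant feature.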
ZMD-II
Normalization
  o Scale Normalization. Scale the shape into the unit disk.
  o Translation Normalization. Shift the shape centroid to the center of the unit disk.
  o Rotation Normalization. Ignore the phase information of $A_{nm}$ and retain only the magnitude $|A_{nm}|$.
  o Magnitude Normalization. Magnitudes are normalized into (0, 1) by the shape mass.
  o Number of ZMDs Used. 36 ZMDs are used in accordance with MPEG-7 documents.
Similarity Measurement. Euclidean distance
$$d = \left( \sum_{i=1}^{N} \left| ZMD_i^Q - ZMD_i^T \right|^2 \right)^{1/2}$$

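As a small worked check of the descriptor count: enumerating every valid index pair with $m \ge 0$ up to order 10 yields exactly 36 moments. Whether this is the precise set of 36 used here is an assumption; the slides only give the number. The helper names below are hypothetical, and the normalization function simply applies the rotation and magnitude steps to moments that are assumed to be computed elsewhere.

```python
def zernike_indices(max_order):
    """All (n, m) with 0 <= m <= n <= max_order and n - m even."""
    return [(n, m)
            for n in range(max_order + 1)
            for m in range(n + 1)
            if (n - m) % 2 == 0]

def normalize_moments(moments, mass):
    """Keep only magnitudes (rotation) and divide by the shape mass."""
    return [abs(a) / mass for a in moments]

# Usage: orders 0..10 give 1+1+2+2+3+3+4+4+5+5+6 = 36 index pairs.
indices = zernike_indices(10)
count = len(indices)  # 36
```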
GD-I
Grid Overlay. Project the shape onto a grid of fixed size (e.g. 16 × 16 grid cells).
Cell Value Assignment. Assign a value of 1 to a cell if it is covered by the shape and 0 if not.
Acquire Shape Number. A binary sequence is obtained by scanning the grid in left-to-right and top-to-bottom order. The shape numbers of the above two shapes are
001111000 011111111 111111111 111111111 111110011 001100011
and
001100000 011100000 111100000 111100000 011111100 000111000
respectively.

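The three steps above can be sketched as follows, assuming the shape is given as a square binary pixel mask whose side is a multiple of the number of grid cells. Treating a cell as covered when any of its pixels belongs to the shape is an illustrative choice (the slides do not give a coverage threshold), and the function name is hypothetical.

```python
def shape_number(mask, cells):
    """Binary shape number of a square 0/1 mask overlaid with cells x cells."""
    size = len(mask)
    step = size // cells  # pixels per grid cell along each axis
    bits = []
    for gr in range(cells):          # top to bottom
        for gc in range(cells):      # left to right
            covered = any(mask[r][c]
                          for r in range(gr * step, (gr + 1) * step)
                          for c in range(gc * step, (gc + 1) * step))
            bits.append("1" if covered else "0")
    return "".join(bits)

# Usage: a 4 x 4 mask overlaid with a 2 x 2 grid.
mask = [[0, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 1]]
number = shape_number(mask, 2)
```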
GD-II
Normalization
  o Scaling Normalization. Scale the shape into a fixed-size bounding rectangle.
  o Rotation Normalization. Rotate the shape so that its major axis is horizontal.
  o Translation Normalization. Translate the rotated shape into the uppermost part of the bounding rectangle.
Similarity measurement
  o City Block Distance. XOR operation on the two sets of shape numbers.

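For binary shape numbers, the city block distance is simply the number of positions where the two sequences differ, which is what a bitwise XOR counts. A minimal sketch, with a hypothetical function name and the shape numbers held as strings of '0' and '1':

```python
def grid_distance(number_a, number_b):
    """City block distance between two equal-length binary shape numbers."""
    if len(number_a) != len(number_b):
        raise ValueError("shape numbers must have the same length")
    # Counting differing positions is equivalent to summing the XOR bits.
    return sum(bit_a != bit_b for bit_a, bit_b in zip(number_a, number_b))

# Usage: the two shape numbers from the GD-I slide, with spaces removed.
a = "001111000011111111111111111111111111111110011001100011"
b = "001100000011100000111100000111100000011111100000111000"
distance = grid_distance(a, b)
```

The cost of this comparison grows with the number of grid cells, so the descriptor length directly drives the online matching time.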
Retrieval Effectiveness
Framework. Java-based indexing and retrieval framework
Platform. Windows platform on Pentium III-866 PC
Databases. Set B of MPEG-7 contour shape database (CE-1B) and MPEG-7 region shape database (CE-2). The MPEG-7 region shape database consists of 3621 shapes, mainly trademarks.
Query Method. All the 1400 shapes in CE-1B are used as queries; 31 classes of shapes from CE-2 (each class has 21 members) are used as queries.
Evaluation Method. Precision and recall

Figure 4. a) Retrieval performance of ZMD and GD on contour-based shapes; b) Retrieval performance of ZMD and GD on region-based shapes. (Each panel plots precision against recall, with one curve for ZMD and one for GD.)

Computation Efficiency
Platform. Windows platform on Pentium III-866 PC
Database. MPEG-7 region-based shape database CE-2
Table 2. Time of feature extraction and retrieval of CE-2 shapes using region shape descriptors

Shape       Total time of        Average time of      Total time of      Average time of
descriptor  feature extraction   feature extraction   retrieval of       retrieval of
            of 3621 shapes       of each shape        651 queries        each query
ZMD         4325010 ms           1194.4 ms            63854 ms           98 ms
GD          2628034 ms           725.7 ms             729909 ms          1121.2 ms

Discussions: Region-based Shape Descriptors
Applications. Both ZMD and GD are application independent.
Feature domains. ZMD is extracted from the spectral domain while GD is extracted from the spatial domain.
Compactness. The dimension of ZMD is low while that of GD is high.
Robustness. ZMD is more robust to shape variations than GD.
Computation complexity. Both the offline and online computation of GD are more expensive than those of ZMD.
Accuracy. At the same level of recall, the retrieval precision of ZMD is higher than that of GD.
Hierarchical representation. Both ZMD and GD support hierarchical representation. The number of ZMDs can be adjusted to meet hierarchical requirements. For GD, hierarchical representation can be achieved by adjusting the cell size or by combining it with eccentricity and circularity. GMD does not support hierarchical representation because higher geometric moment invariants are difficult to obtain.
Agreement with human intuition. GD is a more intuitive shape representation than ZMD.

Conclusions: Contour-based Shape Descriptors
Drawbacks of CSSD compared with FD
  o CSSD is only robust to local boundary variations; it is not robust in a global sense.
  o The low dimension advantage is offset by its complex matching.
  o The retrieval performance of CSSD is low; the overall precision of similarity-based retrieval (on Set B of the MPEG-7 contour shape database) only achieves 33.6%.
  o CSSD does not support hierarchical representation. In order to support hierarchical representation, it has to incorporate other global shape features.
  o The representation and retrieval performance depend on empirical factors such as the number of sample points on the boundary, the threshold to eliminate short peaks and the tolerance value for peak position matching, etc.
  o CSSD is not suitable for efficient indexing due to the expensive matching and the variation of feature dimensions.
  o Based on these facts, we recommend that FD be included as one of the contour shape descriptors in MPEG-7.

Conclusions: Region-based Shape Descriptors
In terms of low computation complexity, compact representation, robustness and retrieval performance, Zernike moments descriptors (ZMD) are more suitable for region-based shape retrieval than GD.
GD agrees more with human intuition.