Advanced Science and Technology Letters, pp. 164-168
http://dx.doi.org/10.14257/astl.2013

Pop-Click Noise Detection Using Inter-Frame Correlation for Improved Portable Auditory Sensing

Dong Yun Lee, Kwang Myung Jeon, and Hong Kook Kim
School of Information and Communications
Gwangju Institute of Science and Technology (GIST)
{ldy, kmjeon, hongkook}@gist.ac.kr

Abstract. In this paper, a pop-click noise detection method is proposed to improve the quality of audio signals recorded using portable microphones. In order to reduce false alarm and missed detection errors, the proposed method utilizes the second-order difference of an uncorrelated residual signal, followed by adaptive median thresholding. The performance evaluation shows that the proposed method achieves higher detection accuracy, under various signal-to-noise ratio conditions, than a conventional method using the first-order difference of the residual signal.

Keywords: Pop-click noise, linear prediction, difference of residual signal, adaptive median thresholding

1 Introduction

When an audio signal is recorded by portable auditory sensors such as a condenser microphone or a micro-electro-mechanical system (MEMS) microphone, many factors can generate noise. In particular, a pop-click noise is generated by various physical events, such as touching screens, clicking buttons, and so on [1]. Interference by such noises during recording can be highly annoying to most listeners.

Many noise detection methods have been proposed to detect pop-click noises [1-4]. Among them, the techniques using the residual signal [3] and the first-order difference of the residual signal [4] were reported to successfully detect pop-click noises. However, the error rate increased sharply under high signal-to-noise ratio (SNR) conditions [4]. Thus, in order to improve detection accuracy under high SNR conditions, the proposed pop-click noise detection method first utilizes the second-order difference of an uncorrelated residual signal.
This places more emphasis on a pop-click noise, since it has strong energy in the high-frequency band. Next, an adaptive median threshold [5][6] is applied to those values to determine whether or not a pop-click noise is present.

Following this introduction, Section 2 proposes a pop-click noise detection method using the second-order difference of an uncorrelated residual signal. Next, Section 3 evaluates the performance of the proposed method and compares it with that of a conventional method using the first-order difference of the residual signal. Finally, Section 4 concludes the paper.

ISSN: 2287-1233 ASTL
Copyright 2013 SERSC

2 Proposed Pop-Click Noise Detection Method

Fig. 1. Procedure of the proposed pop-click noise detection method.

Fig. 1 shows the overall procedure of the proposed pop-click noise detection method. As shown in the figure, the residual signal of the i-th frame, $e_i(n)$, is first obtained by the linear prediction (LP) technique, defined as

$$e_i(n) = s_i(n) - \sum_{k=1}^{p} a_i(k)\, s_i(n-k) \tag{1}$$

where $a_i(k)$ and $s_i(n)$ are the k-th LP coefficient and the n-th input sample of the i-th frame, respectively. In addition, $p$ is the linear prediction order. Similarly to Eq. (1), the residual signal of the input signal applied to the (i-1)-th linear prediction filter, $e_i^M(n)$, is obtained as

$$e_i^M(n) = s_i(n) - \sum_{k=1}^{p} a_{i-1}(k)\, s_i(n-k) \tag{2}$$

where $a_{i-1}(k)$ is the k-th LP coefficient obtained from the (i-1)-th frame. The difference between $e_i(n)$ and $e_i^M(n)$ corresponds to an uncorrelated residual signal between the i-th and (i-1)-th frames, which is obtained as

$$e_i^D(n) = e_i(n) - e_i^M(n). \tag{3}$$

Next, the absolute value of the second-order difference of the uncorrelated residual signal is obtained as

$$g_i(n) = \left| e_i^D(n-1) - 2\, e_i^D(n) + e_i^D(n+1) \right|. \tag{4}$$
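The chain from Eq. (1) to Eq. (4) can be sketched in Python with NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`lp_coefficients`, `lp_residual`, `second_order_feature`) are hypothetical, the LP coefficients are assumed to be estimated with the autocorrelation method (the paper does not say which estimator was used), samples before the start of a frame are taken as zero, a small diagonal load is added for numerical stability, and the windowing used in the evaluation is omitted.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Estimate a(1..p) with the autocorrelation method (an assumption)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation values r(0..p).
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r(1..p).
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    # Tiny diagonal load (assumption) so silent frames do not give a singular R.
    R += 1e-9 * (r[0] + 1.0) * np.eye(order)
    return np.linalg.solve(R, r[1:])

def lp_residual(frame, a):
    """e(n) = s(n) - sum_k a(k) s(n-k); samples before the frame count as zero."""
    frame = np.asarray(frame, dtype=float)
    pred = np.zeros_like(frame)
    for k, ak in enumerate(a, start=1):
        pred[k:] += ak * frame[:-k]
    return frame - pred

def second_order_feature(cur_frame, prev_frame, order=6):
    """Return g_i(n) of Eq. (4) for the current frame."""
    a_cur = lp_coefficients(cur_frame, order)    # a_i(k)
    a_prev = lp_coefficients(prev_frame, order)  # a_{i-1}(k)
    e = lp_residual(cur_frame, a_cur)            # Eq. (1)
    e_m = lp_residual(cur_frame, a_prev)         # Eq. (2)
    e_d = e - e_m                                # Eq. (3)
    g = np.zeros_like(e_d)
    # Eq. (4); the two end samples are left at zero because n-1 or n+1 is missing.
    g[1:-1] = np.abs(e_d[:-2] - 2.0 * e_d[1:-1] + e_d[2:])
    return g
```

In this sketch, two identical consecutive frames give identical LP coefficients, so $e_i^D(n)$ and therefore $g_i(n)$ vanish. Only inter-frame changes that the previous frame's predictor cannot explain contribute to the feature.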
The value of $g_i(n)$ in Eq. (4) becomes large during noise intervals, so it is used to decide whether or not the i-th frame includes a pop-click noise. The decision is performed by applying adaptive median thresholding to $g_i(n)$, such that

$$N(i) = \begin{cases} 1, & \text{if } g_i(n) > \theta \cdot \mathrm{med}\big(g_i(n-m), \ldots, g_i(n), \ldots, g_i(n+m)\big) \\ 0, & \text{otherwise} \end{cases} \tag{5}$$

where $\mathrm{med}(\cdot)$, $\theta$, and $m$ are a median filter, a scale factor, and the coverage of the median filter, respectively. In Eq. (5), the value of $N(i)$ indicates whether or not a pop-click noise is present. In other words, a pop-click noise is detected, that is, $N(i) = 1$, if the value of the second-order difference of the uncorrelated residual signal is higher than the median threshold. Otherwise, $N(i) = 0$, which means that there is no pop-click noise within the i-th frame.

3 Performance Evaluation

The performance of the proposed method was evaluated using the F-measure [7], and it was compared with that of a conventional method using the first-order difference of the residual signal. For the test, three male and three female voice signals, sampled at 48 kHz, were chosen. Each sample was one minute long with 100 pop-click noises in total. For the analysis, each signal was segmented into consecutive frames by applying a Hanning window with a length of 4,096 samples, where each frame overlapped the previous frame by half. Note that the parameters of the median threshold in Eq. (5) were set to $\theta = 50$ and $m = 30$.

Table 1. Comparison of F-measure values between two pop-click noise detection methods at 15 dB SNR.

| LP order            | 4    | 6    | 8    | 10   |
|---------------------|------|------|------|------|
| Conventional method | 0.74 | 0.84 | 0.81 | 0.81 |
| Proposed method     | 0.85 | 0.95 | 0.89 | 0.86 |

First, in order to select a suitable LP order, the detection accuracy was measured for different LP orders at 15 dB SNR. Table 1 compares the F-measure values of the conventional and proposed methods for four different LP orders. The table shows that the proposed method provided a higher F-measure value than the conventional method for all LP orders. The highest detection accuracy was achieved when the LP order was six.
Thus, the LP order was set to six for the next experiment.

Second, in order to evaluate the detection performance under different SNR conditions, pop-click noises were mixed with the signals at SNRs from 0 to 25 dB in 5 dB increments. A higher SNR means that a pop-click noise has lower power relative to the original signal. Fig. 2 compares the F-measure values of the conventional and proposed methods under the different SNRs. The figure shows that the proposed method had higher F-measure values than the conventional method under all the SNRs.
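The decision rule of Eq. (5) and the F-measure scoring used in this section can be sketched as follows. This is a hedged illustration rather than the evaluation code behind the reported numbers. The names `detect_frame` and `f_measure` are hypothetical. A frame is assumed to be flagged as soon as any sample exceeds its local threshold, since Eq. (5) leaves the quantifier over $n$ implicit. Edge padding at the frame boundaries is an assumption. The F-measure is assumed to be the balanced F1 score computed from per-frame decisions.

```python
import numpy as np

def detect_frame(g, theta=50.0, m=30):
    """Return N(i): 1 if any g(n) exceeds theta times its local median, else 0."""
    g = np.asarray(g, dtype=float)
    # Pad so every sample has a full (2m + 1)-sample median window (assumption).
    padded = np.pad(g, m, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, 2 * m + 1)
    local_median = np.median(windows, axis=1)
    return int(np.any(g > theta * local_median))

def f_measure(detected, truth):
    """Balanced F-measure (F1) from per-frame binary decisions and labels."""
    detected = np.asarray(detected, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(detected & truth)    # noisy frames correctly flagged
    fp = np.sum(detected & ~truth)   # false alarms
    fn = np.sum(~detected & truth)   # missed detections
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return float(2.0 * precision * recall / (precision + recall))
```

Because the threshold scales with the local median of $g_i(n)$, the rule adapts to the background level of each frame. With $\theta = 50$, a sample must exceed fifty times the median of its 61-sample neighbourhood ($m = 30$) to trigger a detection.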
Fig. 2. F-measure values of the conventional and proposed methods under different SNRs ranging from 0 to 25 dB.

4 Conclusion

In this paper, a pop-click noise detection method was proposed for improved portable auditory sensing. The proposed method used the second-order difference of an uncorrelated residual signal, and it achieved 4.43% higher pop-click noise detection accuracy than a conventional method using the first-order difference of the residual signal.

Acknowledgments. This work was supported in part by the NRF grant funded by the government of Korea (MSIP) (No. 2012-010636), and by the MSIP, Korea, under the ITRC support program supervised by the NIPA (NIPA-2013-H0301-13-4005).

References

1. Sadler, B. M.: Detection in correlated impulsive noise using fourth-order cumulants. IEEE Transactions on Signal Processing, 44(11), (1996) pp. 2793-2800.
2. Chandra, C., Moore, M. S., Mitra, S. K.: An efficient method for the removal of impulse noise from speech and audio signals. In: Proceedings of ISCAS, (1998) pp. 206-208.
3. Kauppinen, I.: Methods for detecting impulsive noise in speech and audio signals. In: Proceedings of the 14th International Conference on Digital Signal Processing, (2002) pp. 967-970.
4. Hong, J., Park, J., Han, S., Hahn, M.: Sporadic noise reduction for robust speech recognition in mobile devices. In: Proceedings of IEEE International Conference on Consumer Electronics, (2011) pp. 831-832.
5. Chen, T., Wu, H. R.: Adaptive impulse detection using center-weighted median filters. IEEE Signal Processing Letters, 8(1), (2001) pp. 1-3.
6. Esquef, P. A. A., Biscainho, L. W. P., Diniz, P. S. R., Freeland, F. P.: A double-threshold-based approach to impulsive noise detection in audio signals. In: Proceedings of EUSIPCO, (2000) pp. 2041-2044.
7. Powers, D. M. W.: Evaluation: from precision, recall and F-measure to ROC, informedness, markedness & correlation. Journal of Machine Learning Technologies, 2(1), (2011) pp. 37-63.