Paper Iroducio ~ Modellig he Uceraiy i Recoverig Ariculaio fro Acousics ~ Kori Richod, Sio Kig, ad Paul Taylor Tooi Toda Noveber 6, 2003
Proble Addressed i This Paper Modellig he acousic-o-ariculaory appig Speech producio Acousic speech sigal Ariculaory cofiguraio Iversio appig Acousic-ariculaory daa Saisical odel
Coes Iversio appig Acousic-ariculaory daa MOCHA Mappig wih ulilayer percepro MLP Mappig wih iure desiy ewor MDN Coparig MLP wih MDN
Iversio Mappig Mappig fro acousic speech sigal o ariculaory cofiguraio Ill-posed proble o-uique soluio Applicaios Speech codig Speech raiig Speech recogiio Speech syhesis
MOCHA daabase Mulichael ariculaory daabase MOCHA Quee Margare Uiversiy College Four daa sreas Acousic wavefor 16 Hz, 16 bi Larygograph 16 Hz, 16 bi Elecropalaograph Elecroageic ariculograph EMA 460 Briish TIMIT seeces 40 speaers Available: 2 speaers, ale ad feale
EMA: Elecroageic Ariculograph Saplig he ovee of receiver coils aached o he ariculaors 9 pois 2 referece pois Top lip, boo lip, boo icisor, ogue ip, ogue body, ogue dorsu, velu, bridge of he ose, upper icisor - ad y-coordiaes i he idsagial plae 14 chaels 500 Hz
Saples: 2-D plo 60 50 40 Bridge of he ose ref. 30 y [] 20 Upper icisor ref. Upper lip 10 Togue dorsu Togue body 0 Togue ip -10 Lower lip Velu -20 Lower icisor -30-20 -10 0 10 20 30 40 50 60 70 []
Saples: Tie sequece [] y [] 60 40 20 0-20 10 0-10 -20-30 0 200 400 600 800 1000 1200 Upper lip Velu Togue dorsu Togue body Togue ip Lower icisor Upper lip Togue ip Lower lip Togue dorsu Togue body Lower icisor Lower lip Velu 0 200 400 600 800 1000 1200 Tie [s]
MLP: Mulilayer Percepro Ipu layer: 400 uis 20 fraes of 20 filer ba coefficies Oupu layer: 14 uis 7 - ad y-coodiaes Hidde layer: 38 uis Pruig fro 50 uis Hidde layer Ariculaory oupu Full coecio Full coecio Acousic ipu
Feaures Acousic feaure 20 el-scale filerba coefficies 20 s haig widow, 10s shif Noralizaio: 1/4, 95% ierval [0.0, 1.0] Ariculaory feaure Lesseig he effec of oise caused by easuree error i EMA achie 10s shif Noralizaio: 1/5, 95% ierval [0.1, 0.9] Reovig silece fraes Daa se For raiig: 368 seeces For validaio: 46 seeces For es: 46 seeces
Weigh Opiizaio Error fucio = E y ; w N K y Gradie desce raiig w i+ 1 w i Scaled Cojugae Gradie SCG 2 ; w* = i = w + w = η E w i i w* a he iiu of E Usig o oly firs derivaives bu also secod derivaives
Eperieal Evaluaio Evaluaio easures Roo ea square RMS error Overall disace bewee wo rajecories Correlai coefficie Resuls Siilariy of shape ad sychroy of wo rajecories P. 7, Table 1 Average of RMS error: 1.62 Esiaed rajecories, p. 9, Figure 2
Shorcoigs of MLP Mappig Discoiuiy i esiaed rajecories Ariculaors ove slowly ad soohly Low-pass filerig Chael specific cuoff frequecy Isufficie o odel oe-o-ay appigs Usig coe widows Effecive bu isufficie Soe liiaios of usig he su-of-squares error fucio
MDN: Miure Desiy Newor Oupu: codiioal probabiliy desiy fucio Kerel: Gaussia fucio Weigh Mea Spherical covariace = M p φ α = M j j z z ep ep α α α µ µ z = ep σ σ z = 1 α 11 µ 1 σ 2 α 21 µ 2 σ p = K K K µ 2 /2 2 1 ep 2 1 σ σ π φ
Weigh Opiizaio i MDN Error fucio K-eas based iiializaio Ucodiioal desiy of he arge daa = M N E l φ α, z E π α α = =, 2 z E σ µ π µ = K z E 1, 2 σ µ π σ = = M j j j 1, φ α φ α π
Eperieal Evaluaio Resuls Oupu probabiliy, p. 14, Figure 4, 5 Low variace: accuracy is higher High variace: accuracy is lower Phoeic depede variaces, p. 16, Table 2. Low variace of criical ariculaor Togue ip y [s, z, ] Upper lip [, w, b, p] Ecepioal eaple Velu [,, ] Usig characerisics of variace
Coparig MDN wih MLP MLP Sigle Gaussia probabiliy desiy fucio MDN Mea: varied accordig o ipu oupu of MLP Variace: fied global variace i raiig daa Muliple Gaussia probabiliy desiy fucio Weigh: varied accordig o ipu Mea: varied accordig o ipu Variace: varied accordig o ipu
Copariso wih Mea Lielihood Resuls P. 18, Table. 3 Y-coordiaes is ore iproved ecep for velu Y: 13.3%, X: 4.8% Coparig probabiliy desiy fucios P. 19, Figure 6 Effeciveess of uli desiy for odellig oe-oay appigs
Coclusio Acousic-o-ariculaory appig Ill-posed proble Usig MOCHA daabase Speech wavefor EMA Mappig wih MLP or MDN MDN: ore fleible ad accurae odel Effecive o oe-o-ay appig Variace varied accordig o ipu sigal
Fuure Wor Coiuous rajecory Kala soohig wih variace Usig ariculaory iforaio Cos fucio i cocaeaive speech syhesis Specral esiaio Speech recogiio
Proposed Algorih Esiaed saic feaure Esiaed dyaic feaure 2 1 0-1 0.2 0.1 0-0.1-0.2 0 1000 2000 Tie [s] 1. Esiaig o oly saic bu also dyaic probabiliy Esiaed saic feaure 1.2 0.8 0.4 0-0.4 ML wih boh disribuios Saic ea 0 1000 2000 Tie [s] 2. Feaure geeraio wih ML for saic ad dyaic disribuios