Prepared by Prof. Hui Jiang, CSE6328 --4

CSE6328 3.0 Speech & Language Processing
No.5 Pattern Classification (III) & Pattern Verification

Prof. Hui Jiang
Department of Computer Science and Engineering
York University

Model Parameter Estimation

- Maximum Likelihood (ML) Estimation:
  - ML method: most popular model estimation
  - EM (Expectation-Maximization) algorithm
  - Examples:
    - Univariate Gaussian distribution
    - Multivariate Gaussian distribution
    - Multinomial distribution
    - Gaussian Mixture model
    - Markov chain model: n-gram for language modeling
    - Hidden Markov Model (HMM)
- Discriminative Training: alternative model estimation method
  - Maximum Mutual Information (MMI)
  - Minimum Classification Error (MCE)
  - Large Margin Estimation (LME)
- Bayesian Model Estimation: Bayesian theory
- MDI (Minimum Discrimination Information)

Dept. of CSE, York Univ.
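Discriminative Training (I): Maximum Mutual Information Estimation

- The model is viewed as a noisy data generation channel: class $\omega$ → observation feature $X$.
- Determine the model parameters to maximize the mutual information between $\omega$ and $X$, i.e. to make the relation between $\omega$ and $X$ as close as possible.

$$I(\omega, X) = \sum_{\omega}\sum_{X} p(\omega, X)\,\log\frac{p(\omega, X)}{P(\omega)\,p(X)} = \sum_{\omega}\sum_{X} p(\omega, X)\,\log\frac{p(X\mid\omega)}{p(X)} = \sum_{\omega}\sum_{X} p(\omega, X)\,\log\frac{p(X\mid\omega)}{\sum_{\omega'} P(\omega')\,p(X\mid\omega')}$$

$$\{\Lambda\}_{MMI} = \arg\max_{\Lambda}\; I(\omega, X)$$

Discriminative Training (I): Maximum Mutual Information Estimation (continued)

- Difficulty: the joint distribution $p(\omega, X)$ is unknown.
- Solution: collect a representative training set $(X_1, \omega_1), (X_2, \omega_2), \ldots, (X_T, \omega_T)$ to approximate the joint distribution.

$$\{\Lambda\}_{MMI} = \arg\max_{\Lambda}\; I(\omega, X) \approx \arg\max_{\Lambda} \sum_{t=1}^{T} \log\frac{p(\omega_t, X_t)}{P(\omega_t)\,p(X_t)} = \arg\max_{\Lambda} \sum_{t=1}^{T} \log\frac{p(X_t\mid\omega_t)}{\sum_{\omega'} P(\omega')\,p(X_t\mid\omega')}$$

- Optimization:
  - Iterative gradient-ascent method
  - Growth-transformation method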
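The training-set form of the MMI criterion can be evaluated directly once class models are fixed. The sketch below is only an illustration under assumptions that are not in the slides: each class is a univariate Gaussian with a known variance, the toy data and the function names `gaussian_pdf` and `mmi_objective` are hypothetical, and no optimization is performed. It sums $\log[p(X_t\mid\omega_t) / \sum_j P(\omega_j)\,p(X_t\mid\omega_j)]$ over the labeled samples for two candidate parameter settings.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian N(mean, var) evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def mmi_objective(data, labels, means, variances, priors):
    """Sum over t of log[ p(X_t | w_t) / sum_j P(w_j) p(X_t | w_j) ]."""
    total = 0.0
    for x, label in zip(data, labels):
        numerator = gaussian_pdf(x, means[label], variances[label])
        denominator = sum(
            priors[j] * gaussian_pdf(x, means[j], variances[j])
            for j in range(len(means))
        )
        total += math.log(numerator / denominator)
    return total

# Hypothetical toy training set: two classes on the real line.
data = [-1.2, -0.8, -1.0, 1.1, 0.9, 1.3]
labels = [0, 0, 0, 1, 1, 1]
priors = [0.5, 0.5]
variances = [1.0, 1.0]

# Two candidate settings of the class means (assumed values).
well_separated = mmi_objective(data, labels, [-1.0, 1.0], variances, priors)
overlapping = mmi_objective(data, labels, [-0.1, 0.1], variances, priors)
print(well_separated, overlapping)
```

On this toy set the setting whose class models explain the labels more distinctly scores higher, which is the quantity a gradient-ascent or growth-transformation procedure would try to increase.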
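Discriminative Training (II): Minimum Classification Error Estimation (1)

- In an N-class pattern classification problem, given a set of training data $D = \{X_1, X_2, \ldots, X_T\}$, estimate the model parameters for all classes to minimize the total classification errors in $D$.
  - MCE: minimize empirical classification errors
- Objective function → total classification errors in $D$
  - For each training data $X_t$ (with true class $\omega_i$), define a misclassification measure:

$$d(X_t) = -P(\omega_i)\,p(X_t\mid\Lambda_i) + \max_{i'\neq i} P(\omega_{i'})\,p(X_t\mid\Lambda_{i'})$$

or

$$d(X_t) = -\ln\big[P(\omega_i)\,p(X_t\mid\Lambda_i)\big] + \max_{i'\neq i} \ln\big[P(\omega_{i'})\,p(X_t\mid\Lambda_{i'})\big]$$

  - if $d(X_t) > 0$: incorrect classification for $X_t$ → 1 error
  - if $d(X_t) < 0$: correct classification for $X_t$ → 0 error

Discriminative Training (II): Minimum Classification Error Estimation (2)

- Soft-max: approximate the max by a differentiable function:

$$d(X_t) = -P(\omega_i)\,p(X_t\mid\Lambda_i) + \ln\Big[\sum_{i'\neq i} \exp\big(\eta\,P(\omega_{i'})\,p(X_t\mid\Lambda_{i'})\big)\Big]^{1/\eta}$$

or

$$d(X_t) = -\ln\big[P(\omega_i)\,p(X_t\mid\Lambda_i)\big] + \ln\Big[\sum_{i'\neq i} \exp\big(\eta\,\ln\big[P(\omega_{i'})\,p(X_t\mid\Lambda_{i'})\big]\big)\Big]^{1/\eta}$$

where $\eta > 1$.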
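To see how the soft-max form relates to the hard max, the following sketch computes the log-domain misclassification measure both ways. It assumes the per-class log scores $g_k = \ln[P(\omega_k)\,p(X_t\mid\Lambda_k)]$ have already been computed; the score values and the function names `hard_measure` and `soft_measure` are hypothetical.

```python
import math

def hard_measure(log_scores, true_class):
    """d = -g_i + max over rivals i' != i of g_i'."""
    rivals = [g for k, g in enumerate(log_scores) if k != true_class]
    return -log_scores[true_class] + max(rivals)

def soft_measure(log_scores, true_class, eta):
    """d = -g_i + (1/eta) * ln( sum over rivals of exp(eta * g_i') )."""
    rivals = [g for k, g in enumerate(log_scores) if k != true_class]
    top = max(rivals)  # factor out the largest rival score for numerical stability
    log_sum = top + math.log(sum(math.exp(eta * (g - top)) for g in rivals)) / eta
    return -log_scores[true_class] + log_sum

# Hypothetical log scores for a 3-class problem; the true class is index 0.
scores = [-2.0, -3.5, -2.4]
d_hard = hard_measure(scores, 0)
d_soft_small_eta = soft_measure(scores, 0, 2.0)
d_soft_large_eta = soft_measure(scores, 0, 50.0)
print(d_hard, d_soft_small_eta, d_soft_large_eta)
```

The soft-max value is never below the hard-max value and approaches it as $\eta$ grows, so a larger $\eta$ trades smoothness for a tighter approximation.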
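Discriminative Training (II): Minimum Classification Error Estimation (3)

- The error count for one data $X_t$ is $H(d(X_t))$, where $H(\cdot)$ is the step function.
- Total errors in the training set:

$$Q(\Lambda) = \sum_{t=1}^{T} H\big(d(X_t)\big)$$

- The step function is not differentiable; approximate it by a sigmoid function → smoothed total errors in the training set:

$$Q(\Lambda) \approx Q'(\Lambda) = \sum_{t=1}^{T} l\big(d(X_t)\big)$$

where

$$l(d) = \frac{1}{1 + e^{-a\,d}}$$

and $a > 0$ is a parameter to control its shape.

Discriminative Training (II): Minimum Classification Error Estimation (3, continued)

- MCE estimation of model parameters for all classes:

$$\{\Lambda\}_{MCE} = \arg\min_{\Lambda}\; Q'(\Lambda)$$

- Optimization: no simple solution is available
  - Iterative gradient descent method
  - GPD (generalized probabilistic descent) method

$$\Lambda^{(n+1)} = \Lambda^{(n)} - \varepsilon\,\nabla Q'(\Lambda)\Big|_{\Lambda = \Lambda^{(n)}}$$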
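The MCE/GPD Method

- Find initial model parameters, e.g., ML estimates
- Calculate the gradient of the objective function
- Calculate the value of the gradient based on the current model parameters
- Update model parameters:

$$\Lambda^{(n+1)} = \Lambda^{(n)} - \varepsilon\,\nabla Q'(\Lambda)\Big|_{\Lambda = \Lambda^{(n)}}$$

- Iterate until convergence

How to calculate the gradient?

$$\frac{\partial Q'(\Lambda)}{\partial \Lambda_i} = \sum_{t=1}^{T} \frac{\partial\, l\big(d(X_t)\big)}{\partial \Lambda_i} = \sum_{t=1}^{T} \frac{\partial l}{\partial d}\,\frac{\partial d(X_t)}{\partial \Lambda_i} = \sum_{t=1}^{T} a\; l\big(d(X_t)\big)\,\big[1 - l\big(d(X_t)\big)\big]\,\frac{\partial d(X_t)}{\partial \Lambda_i}$$

The key issue in MCE/GPD is how to set a proper step size experimentally.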
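The steps of the MCE/GPD method can be sketched on a deliberately small problem. The code below is an illustration under several assumptions that are not in the slides: two classes with equal priors, univariate Gaussian class models with a fixed shared variance so that only the means are trained, a hypothetical toy data set, and arbitrarily chosen values for the sigmoid slope `a` and the step size `step`. With two classes there is only one rival, so the max in the misclassification measure needs no soft-max. The gradient uses the chain rule $a\,l\,(1-l)\,\partial d/\partial\mu$.

```python
import math

def sigmoid(d, a):
    """Smoothed error count l(d) = 1 / (1 + exp(-a d))."""
    return 1.0 / (1.0 + math.exp(-a * d))

def log_score(x, mean, var):
    """ln p(x | mean) up to a constant that cancels in d (equal priors, shared var)."""
    return -(x - mean) ** 2 / (2.0 * var)

def smoothed_errors_and_grad(data, labels, means, var, a):
    """Return Q'(means) and its gradient with respect to the two class means."""
    q = 0.0
    grad = [0.0, 0.0]
    for x, i in zip(data, labels):
        j = 1 - i  # the single rival class
        d = -log_score(x, means[i], var) + log_score(x, means[j], var)
        l = sigmoid(d, a)
        q += l
        weight = a * l * (1.0 - l)                 # dl/dd
        grad[i] += weight * (-(x - means[i]) / var)  # dd/dmu_i for the true class
        grad[j] += weight * ((x - means[j]) / var)   # dd/dmu_j for the rival class
    return q, grad

def mce_gpd(data, labels, init_means, var=1.0, a=1.0, step=0.05, iterations=200):
    """Plain batch gradient descent on the smoothed error count Q'."""
    means = list(init_means)
    history = []
    for _ in range(iterations):
        q, grad = smoothed_errors_and_grad(data, labels, means, var, a)
        history.append(q)
        means = [m - step * g for m, g in zip(means, grad)]
    return means, history

# Hypothetical toy data: each class has one sample on the wrong side.
data = [-2.0, -1.5, -1.0, 0.3, 2.0, 1.5, 1.0, -0.2]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Initial parameters: ML estimates of the class means (sample averages).
ml_means = [
    sum(x for x, c in zip(data, labels) if c == k) / labels.count(k)
    for k in (0, 1)
]
final_means, history = mce_gpd(data, labels, ml_means)
print(ml_means, final_means, history[0], history[-1])
```

The list `history` records the objective at every iteration, which is one of the quantities worth monitoring when choosing the step size.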
Overtraining (Overfitting)

- A low classification error rate on the training set does not always lead to a low error rate on a new test set, due to overtraining.

Measuring Performance of MCE

[Figure: curves of the objective function and of the classification error (in %).]

- When to converge: monitor three quantities in MCE/GPD
  - The objective function
  - Error rate on the training set
  - Error rate on the test set
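Large Margin Estimation

[Figure: two classes represented by model $\Lambda_1$ and model $\Lambda_2$, with the separation boundary $F(x\mid\Lambda_1) - F(x\mid\Lambda_2) = 0$.]

Large-Margin Classifier

[Figure: the original separation boundary $F(x\mid\Lambda_1) - F(x\mid\Lambda_2) = 0$ given by the original models, and the new separation boundary of the same form given by the re-estimated models.]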
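How to define separation margin?

- In a 2-class separable problem:
  - For a data token $x$ of class $\Lambda_1$:

$$d(x) = F(x\mid\Lambda_1) - F(x\mid\Lambda_2) > 0$$

  - For a data token $x$ of class $\Lambda_2$:

$$d(x) = F(x\mid\Lambda_2) - F(x\mid\Lambda_1) > 0$$

How to define separation margin? (continued)

- Extend to a multiple-class problem: N classes $\Lambda_1, \Lambda_2, \ldots, \Lambda_N$
  - For a data token $x$ of class $\Lambda_i$:

$$d(x) = F(x\mid\Lambda_i) - \max_{j\neq i} F(x\mid\Lambda_j) = \min_{j\neq i}\big[F(x\mid\Lambda_i) - F(x\mid\Lambda_j)\big]$$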
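The two forms of the multi-class separation margin are equal, which is easy to check numerically. In the sketch below the list `scores` stands for the discriminant values $F(x\mid\Lambda_j)$ of one token under each class model; the values and the function names are hypothetical.

```python
def separation_margin(scores, true_class):
    """d(x) = F(x|Lambda_i) - max over j != i of F(x|Lambda_j)."""
    rivals = [f for j, f in enumerate(scores) if j != true_class]
    return scores[true_class] - max(rivals)

def separation_margin_pairwise(scores, true_class):
    """Equivalent form: min over j != i of [F(x|Lambda_i) - F(x|Lambda_j)]."""
    return min(
        scores[true_class] - f for j, f in enumerate(scores) if j != true_class
    )

# Hypothetical discriminant values of one token under three class models.
scores = [3.0, 1.5, 2.25]
margin_if_class_0 = separation_margin(scores, 0)  # token correctly classified
margin_if_class_2 = separation_margin(scores, 2)  # token misclassified
print(margin_if_class_0, margin_if_class_2)
```

A positive margin means the token is classified correctly, and its size says how far the token is from the nearest competing class.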
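Large Margin Estimation

- An N-class problem: each class is represented by one model, $\Lambda = \{\Lambda_1, \Lambda_2, \ldots, \Lambda_N\}$.
- Given a training set $D$, define a subset called the support token set $S$, based on the initial model, as:

$$S = \{\, x_i \mid x_i \in D \ \text{and}\ 0 \le d(x_i) \le \varepsilon \,\}$$

- Large-Margin Estimation (LME):

$$\hat{\Lambda} = \arg\max_{\Lambda}\; \min_{x_i \in S}\; d(x_i) \quad \text{subject to all } d(x_i) > 0$$

Bayesian Theory

- Bayesian methods view model parameters as random variables having some known prior distribution. (Prior specification)
  - Specify the prior distribution of model parameters $\theta$ as $p(\theta)$.
- Training data $D$ allows us to convert the prior distribution into a posterior distribution. (Bayesian learning)

$$p(\theta\mid D) = \frac{p(\theta)\,p(D\mid\theta)}{p(D)} \propto p(\theta)\,p(D\mid\theta)$$

- We infer or decide everything solely based on the posterior distribution. (Bayesian inference)
  - Model estimation: the MAP (maximum a posteriori) estimation
  - Pattern classification: Bayesian classification
  - Sequential (on-line, incremental) learning
  - Others: prediction, model selection, etc.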
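Bayesian Learning

[Figure: the prior $p(\theta)$, the likelihood $p(D\mid\theta)$ and the posterior $p(\theta\mid D)$ plotted over $\theta$, with $\theta_{MAP}$ and $\theta_{ML}$ marked.]

The MAP estimation of model parameters

- Make a point estimate of $\theta$ based on the posterior distribution:

$$\theta_{MAP} = \arg\max_{\theta}\; p(\theta\mid D) = \arg\max_{\theta}\; p(\theta)\,p(D\mid\theta)$$

- Then $\theta_{MAP}$ is treated as the estimate of the model parameters (just like the ML estimate). Sometimes the EM algorithm is needed to derive it.
- MAP estimation optimally combines prior knowledge with the new information provided by data.
- MAP estimation is used in speech recognition to adapt speech models to a particular speaker, to cope with various accents:
  - Start from a generic speaker-independent speech model → prior
  - Collect a small set of data from a particular speaker
  - The MAP estimate gives a speaker-adaptive model which suits this particular speaker better.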
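Bayesian Classification

- Assume we have N classes $\omega_i$; each class has a class-conditional pdf $p(X\mid\theta_i)$ with parameters $\theta_i$.
- The prior knowledge about $\theta_i$ is included in a prior $p(\theta_i)$.
- For each class we have a training data set $D_i$.
- Problem: classify an unknown data $Y$ into one of the classes.
- The Bayesian classification is done as:

$$\omega_Y = \arg\max_{i}\; p(Y\mid D_i) = \arg\max_{i} \int p(Y\mid\theta_i)\,p(\theta_i\mid D_i)\,d\theta_i$$

where

$$p(\theta_i\mid D_i) = \frac{p(\theta_i)\,p(D_i\mid\theta_i)}{p(D_i)} \propto p(\theta_i)\,p(D_i\mid\theta_i)$$

Recursive Bayes Learning (Sequential Bayesian Learning)

- Bayesian theory provides a framework for on-line learning (a.k.a. incremental learning, adaptive learning).
- When we observe training data one by one, we can dynamically adjust the model to learn incrementally from data.
- Assume we observe the training data set $D^{(n)} = \{X_1, X_2, \ldots, X_n\}$ one by one:

$$p(\theta) \;\to\; p(\theta\mid X_1) \;\to\; p(\theta\mid X_1, X_2) \;\to\; \cdots \;\to\; p(\theta\mid D^{(n)})$$

  Each distribution in this chain is the knowledge about the model at that stage.
- Learning rule: posterior ∝ prior × likelihood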
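The learning rule "posterior ∝ prior × likelihood" can be applied one observation at a time, with each posterior serving as the prior for the next observation. The sketch below illustrates this numerically under assumptions that are not in the slides: the unknown parameter is the mean of a univariate Gaussian with known variance 1, the parameter is discretized on a grid, the starting prior is flat, and the observations are hypothetical. It also checks that the sequential result matches a single batch update on all the data.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian N(mean, var) evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def bayes_update(grid, prior, x, var):
    """One recursive step: posterior(theta) is proportional to prior(theta) * p(x | theta)."""
    unnormalized = [p * gaussian_pdf(x, theta, var) for theta, p in zip(grid, prior)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Discretized parameter space for the unknown mean (an assumption for illustration).
grid = [i / 100.0 for i in range(-300, 301)]
flat_prior = [1.0 / len(grid)] * len(grid)
observations = [0.9, 1.4, 0.7, 1.2]  # hypothetical data

# Sequential learning: the posterior after each sample becomes the next prior.
posterior = flat_prior
for x in observations:
    posterior = bayes_update(grid, posterior, x, 1.0)

# Batch learning on all data at once, for comparison.
batch = []
for theta, p in zip(grid, flat_prior):
    likelihood = 1.0
    for x in observations:
        likelihood *= gaussian_pdf(x, theta, 1.0)
    batch.append(p * likelihood)
norm = sum(batch)
batch = [b / norm for b in batch]

max_gap = max(abs(s - b) for s, b in zip(posterior, batch))
theta_map = grid[posterior.index(max(posterior))]
print(max_gap, theta_map)
```

With a flat prior the posterior mode coincides with the ML estimate (here the sample mean); an informative prior would pull it toward the prior's own mode.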
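How to specify priors

- Noninformative priors
  - In case we don't have enough prior knowledge, just use a flat prior at the beginning.
- Conjugate priors: for computational convenience
  - For some models, if their probability functions are a reproducing density, we can choose the prior in a special form (called a conjugate prior), so that after Bayesian learning the posterior has exactly the same functional form as the prior, except that all parameters are updated.
  - Not every model has a conjugate prior.

Conjugate Prior

- For a univariate Gaussian model with only an unknown mean:

$$p(x\mid\mu) = \mathcal{N}(x\mid\mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\Big[-\frac{(x-\mu)^2}{2\sigma^2}\Big]$$

- If we choose the prior as a Gaussian distribution (the conjugate prior of a Gaussian is a Gaussian):

$$p(\mu) = \mathcal{N}(\mu\mid\mu_0, \sigma_0^2) = \frac{1}{\sqrt{2\pi\sigma_0^2}}\exp\Big[-\frac{(\mu-\mu_0)^2}{2\sigma_0^2}\Big]$$

- After observing a new data $x_1$, the posterior will still be Gaussian:

$$p(\mu\mid x_1) = \mathcal{N}(\mu\mid\mu_1, \sigma_1^2) = \frac{1}{\sqrt{2\pi\sigma_1^2}}\exp\Big[-\frac{(\mu-\mu_1)^2}{2\sigma_1^2}\Big]$$

where

$$\mu_1 = \frac{\sigma_0^2}{\sigma_0^2+\sigma^2}\,x_1 + \frac{\sigma^2}{\sigma_0^2+\sigma^2}\,\mu_0, \qquad \sigma_1^2 = \frac{\sigma_0^2\,\sigma^2}{\sigma_0^2+\sigma^2}$$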
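The sequential MAP Estimate of Gaussian

- For a univariate Gaussian with unknown mean, the MAP estimate of its mean after observing $x_1$:

$$\mu_1 = \frac{\sigma_0^2}{\sigma_0^2+\sigma^2}\,x_1 + \frac{\sigma^2}{\sigma_0^2+\sigma^2}\,\mu_0$$

- After observing the next data $x_2$:

$$\mu_2 = \frac{\sigma_1^2}{\sigma_1^2+\sigma^2}\,x_2 + \frac{\sigma^2}{\sigma_1^2+\sigma^2}\,\mu_1$$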
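The sequential update can be written as a short loop. In the sketch below the prior mean and variance, the known data variance and the observations are hypothetical values, and the function names are illustrative. Each step applies the mean update above together with the conjugate-prior variance update $\sigma_1^2 = \sigma_0^2\sigma^2/(\sigma_0^2+\sigma^2)$, then reuses the result as the prior for the next observation. For comparison, the closed-form batch posterior mean $(\sigma_0^2\sum_k x_k + \sigma^2\mu_0)/(n\sigma_0^2+\sigma^2)$ is computed as well.

```python
def map_update(prior_mean, prior_var, x, noise_var):
    """One conjugate update of a Gaussian prior on the mean, given one sample x."""
    denom = prior_var + noise_var
    post_mean = prior_var / denom * x + noise_var / denom * prior_mean
    post_var = prior_var * noise_var / denom
    return post_mean, post_var

def sequential_map(observations, mu0, var0, noise_var):
    """Process the samples one by one; each posterior becomes the next prior."""
    mean, var = mu0, var0
    for x in observations:
        mean, var = map_update(mean, var, x, noise_var)
    return mean, var

def batch_map_mean(observations, mu0, var0, noise_var):
    """Posterior mean computed from all samples at once, for comparison."""
    n = len(observations)
    return (var0 * sum(observations) + noise_var * mu0) / (n * var0 + noise_var)

# Hypothetical setting: prior N(0, 1) on the mean, known data variance 1.
observations = [2.0, 4.0]
seq_mean, seq_var = sequential_map(observations, 0.0, 1.0, 1.0)
batch_mean = batch_map_mean(observations, 0.0, 1.0, 1.0)
print(seq_mean, seq_var, batch_mean)
```

Because the Gaussian posterior is symmetric and unimodal, its mean is also its mode, so the posterior mean computed here is the MAP estimate.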