Neural Network (Basic Ideas) Hung-yi Lee
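Learning ≈ Looking for a Function. Speech recognition: f(audio) = "你好" ("Hello"). Handwritten recognition: f(image) = the written character. Weather forecast: f(weather today) = "sunny tomorrow". Playing video games: f(positions and number of enemies) = "fire".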
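Framework. x: function input (e.g. an image); ŷ: function output (its label). Model: hypothesis function set $\{f_1, f_2, \ldots\}$. Training: pick the best function $f^*$ from the set using the training data $\{(x^1, \hat{y}^1), (x^2, \hat{y}^2), \ldots\}$. Testing: $y = f^*(x)$.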
Outline. 1. What is the model (function hypothesis set)? 2. What is the best function? 3. How to pick the best function?
Task Considered Today: Classification. Binary classification: only two classes; an input object is mapped to Class A (yes) or Class B (no). Spam filtering: is an e-mail spam or not? Recommendation systems: recommend the product to the customer or not? Malware detection: is the software malicious or not? Stock prediction: will the future value of a stock increase or not?
Task Considered Today: Classification. Binary classification: only two classes (input object → Class A (yes) or Class B (no)). Multi-class classification: more than two classes (input object → Class A, Class B, Class C, ...).
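Multi-class Classification. Handwriting digit classification: input is an image of a digit; classes are "1", "2", ..., "9", "0" (10 classes). Image recognition: input is an image; classes are "dog", "cat", "book", ... (thousands of classes).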
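Multi-class Classification. Real speech recognition is not multi-class classification: the input is an utterance and the classes would be "hi", "how are you", "I am sorry", ..., which cannot be enumerated. The homework is multi-class classification: the input is a frame, and the question is which phoneme the frame belongs to (for example /ε/). The classes are the phonemes.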
1. What is the model?
What is the function we are looking for? Classification: $y = f(x)$, $f: \mathbb{R}^N \to \mathbb{R}^M$. x: the input object to be classified; y: the class. Assume both x and y can be represented as fixed-size vectors: x is a vector with N dimensions, and y is a vector with M dimensions.
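What is the function we are looking for? Handwriting digit classification, $f: \mathbb{R}^N \to \mathbb{R}^M$. x: image; y: class. The image is 16 x 16 pixels; each pixel corresponds to an element in the vector (1 for ink, 0 otherwise), so x has 16 x 16 = 256 dimensions. y has 10 dimensions for digit recognition: "1" or not, "2" or not, "3" or not, ...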
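The encoding described on this slide can be sketched as follows. This is a minimal illustration, assuming a 16 x 16 grid of 0/1 ink values and the class order "1", ..., "9", "0"; the names `flatten_image`, `one_hot` and `CLASSES` are hypothetical and not part of the lecture.

```python
# ASSUMPTION: class order "1", "2", ..., "9", "0"; helper names are illustrative only.
CLASSES = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "0"]

def flatten_image(pixels):
    """Turn a 2-D grid of ink flags into one input vector x (1 for ink, 0 otherwise)."""
    return [1.0 if value else 0.0 for row in pixels for value in row]

def one_hot(label):
    """Target vector with one dimension per class: 1 for the true class, 0 elsewhere."""
    return [1.0 if c == label else 0.0 for c in CLASSES]

# A blank 16 x 16 image with a single inked pixel, labelled "2".
image = [[0] * 16 for _ in range(16)]
image[3][7] = 1
x = flatten_image(image)
y_hat = one_hot("2")
```

Here `x` has one entry per pixel and `y_hat` has one entry per class, which matches the fixed-size vector assumption made for $f: \mathbb{R}^N \to \mathbb{R}^M$.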
1. What is the model? A Layer of Neurons
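Single Neuron, $f: \mathbb{R}^N \to \mathbb{R}$. The inputs $x_1, x_2, \ldots, x_N$ are weighted by $w_1, w_2, \ldots, w_N$ and summed with a bias $b$: $z = w_1 x_1 + w_2 x_2 + \cdots + w_N x_N + b$. The output is $y = \sigma(z)$, where the activation function is the sigmoid $\sigma(z) = \frac{1}{1 + e^{-z}}$.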
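A single neuron of this kind might be written as below. This is only a sketch using the sigmoid activation; the weights, bias and input are made-up numbers chosen so that z comes out as 0.

```python
import math

def sigmoid(z):
    """Sigmoid activation: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

# Illustrative values only: z = 0.5 + 0.2 - 0.7 = 0, so the output is sigmoid(0).
y = neuron([1.0, 0.0, 1.0], w=[0.5, -0.3, 0.2], b=-0.7)
```

The output always lies strictly between 0 and 1, which is why a threshold on y can be read as a yes/no decision.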
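Single Neuron, $f: \mathbb{R}^N \to \mathbb{R}$. A single neuron can only do binary classification; it cannot handle multi-class classification. Its output is thresholded: if $y \ge 0.5$ the input is the target digit, and if $y < 0.5$ it is not.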
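A Layer of Neurons, $f: \mathbb{R}^N \to \mathbb{R}^M$. Handwriting digit classification: classes "1", "2", ..., "9", "0" (10 classes), so the layer has 10 neurons sharing the inputs $x_1, x_2, \ldots, x_N$. $y_1$: "1" or not; $y_2$: "2" or not; $y_3$: "3" or not; ... If $y_2$ is the max, then the image is "2".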
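The "pick the neuron with the largest output" rule can be sketched like this. The weights below are invented for a tiny two-input case and only serve to make the second neuron win; `layer` and `classify` are hypothetical helper names.

```python
import math

# ASSUMPTION: class order "1", ..., "9", "0"; one output neuron per class.
CLASSES = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "0"]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(x, W, b):
    """One output per neuron: row i of W and entry i of b belong to neuron i."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def classify(x, W, b):
    """Return the class whose neuron has the largest output."""
    y = layer(x, W, b)
    best = max(range(len(y)), key=lambda i: y[i])
    return CLASSES[best]

# Toy parameters: every neuron outputs 0.5 except the second one.
W = [[0.0, 0.0] for _ in CLASSES]
W[1] = [2.0, 2.0]
b = [0.0] * len(CLASSES)
predicted = classify([1.0, 1.0], W, b)
```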
1. What is the model? Limitation of Single Layer
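Limitation of Single Layer. One neuron computes $z = w_1 x_1 + w_2 x_2 + b$ and answers yes if $z \ge$ threshold, no if $z <$ threshold. Target outputs:
Input (x_1, x_2) -> Output
(0, 0) -> No
(0, 1) -> Yes
(1, 0) -> Yes
(1, 1) -> No
Can we find weights for which the thresholded z gives these outputs?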
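Limitation of Single Layer. No, we can't: no choice of $w_1$, $w_2$ and $b$ in $z = w_1 x_1 + w_2 x_2 + b$ separates the Yes inputs from the No inputs with a single threshold.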
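Limitation of Single Layer. With $z = w_1 x_1 + w_2 x_2 + b$, the target outputs (No, Yes, Yes, No for inputs (0, 0), (0, 1), (1, 0), (1, 1)) can be written as (NOT AND) AND (OR) of the two inputs.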
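Neural Network. Hidden neurons: one computes NOT AND of $x_1, x_2$ and another computes OR of $x_1, x_2$. The output neuron computes AND of the two hidden outputs, giving $z$.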
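One way to realise the NOT AND / OR / AND construction with sigmoid neurons is sketched below. The specific weights (multiples of 10) are an assumption chosen so that each neuron saturates; they are not the values from the lecture figures.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def xor_network(x1, x2):
    """Two hidden neurons (NOT AND, OR) feeding one output neuron (AND)."""
    not_and = sigmoid(-20.0 * x1 - 20.0 * x2 + 30.0)   # low only when both inputs are 1
    either = sigmoid(20.0 * x1 + 20.0 * x2 - 10.0)     # low only when both inputs are 0
    out = sigmoid(20.0 * not_and + 20.0 * either - 30.0)  # high only when both hidden outputs are high
    return out >= 0.5                                   # threshold at 0.5: True means "Yes"

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
answers = [xor_network(x1, x2) for x1, x2 in inputs]
```

The hidden layer maps the four inputs to new coordinates in which a single threshold on the output neuron is enough.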
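[Figure: outputs of the two hidden neurons for each of the four inputs, and the four inputs re-plotted in the space of hidden-neuron outputs, where the output neuron with weights $w_1$, $w_2$ can separate them. Numeric values are not recoverable from the extracted text.]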
1. What is the model? Neural Network
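Neural Network as Model, $f: \mathbb{R}^N \to \mathbb{R}^M$. The input vector $x = (x_1, x_2, \ldots, x_N)$ enters the input layer, passes through Layer 1, Layer 2, ..., Layer L, and the output layer produces the vector $y = (y_1, \ldots, y_M)$. The layers between input and output are hidden layers. This is a fully connected feedforward network. Deep Neural Network: many hidden layers.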
Notation. Layer $l-1$ has $N_{l-1}$ nodes and Layer $l$ has $N_l$ nodes. Output of a neuron: $a_i^l$ (neuron $i$ at layer $l$). Output of one layer: $a^l$, a vector.
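Notation. $w_{ij}^l$: the weight from neuron $j$ (Layer $l-1$) to neuron $i$ (Layer $l$). The weights from Layer $l-1$ to Layer $l$ form the $N_l \times N_{l-1}$ matrix $W^l = \begin{bmatrix} w_{11}^l & w_{12}^l & \cdots \\ w_{21}^l & w_{22}^l & \cdots \\ \vdots & \vdots & \ddots \end{bmatrix}$.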
Notation. $b_i^l$: bias for neuron $i$ at layer $l$. $b^l$: the vector of biases for all neurons in layer $l$.
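Notation. $z_i^l$: input of the activation function for neuron $i$ at layer $l$. $z^l$: input of the activation function for all the neurons in layer $l$. $z_i^l = w_{i1}^l a_1^{l-1} + w_{i2}^l a_2^{l-1} + \cdots + b_i^l = \sum_{j=1}^{N_{l-1}} w_{ij}^l a_j^{l-1} + b_i^l$.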
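Notation - Summary. $a_i^l$: output of a neuron; $a^l$: output of a layer. $w_{ij}^l$: a weight; $W^l$: a weight matrix. $z_i^l$: input of the activation function; $z^l$: input of the activation function for a layer. $b_i^l$: a bias; $b^l$: a bias vector.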
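Relations between Layer Outputs. Layer $l-1$ ($N_{l-1}$ nodes) outputs $a^{l-1}$; Layer $l$ ($N_l$ nodes) computes $z^l$ and outputs $a^l$.
Relations between Layer Outputs. $z_1^l = w_{11}^l a_1^{l-1} + w_{12}^l a_2^{l-1} + \cdots + b_1^l$, $z_2^l = w_{21}^l a_1^{l-1} + w_{22}^l a_2^{l-1} + \cdots + b_2^l$, ..., $z_i^l = w_{i1}^l a_1^{l-1} + w_{i2}^l a_2^{l-1} + \cdots + b_i^l$; in matrix form, $z^l = W^l a^{l-1} + b^l$.
Relations between Layer Outputs. $a_i^l = \sigma(z_i^l)$ for every neuron, so $a^l = \sigma(z^l)$ with $\sigma$ applied element-wise.
Relations between Layer Outputs. Combining $z^l = W^l a^{l-1} + b^l$ and $a^l = \sigma(z^l)$ gives $a^l = \sigma(W^l a^{l-1} + b^l)$.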
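Function of Neural Network. Input vector $x$; Layer 1 has parameters $W^1, b^1$, Layer 2 has $W^2, b^2$, ..., Layer L has $W^L, b^L$; output vector $y$. Layer by layer: $a^1 = \sigma(W^1 x + b^1)$, $a^2 = \sigma(W^2 a^1 + b^2)$, ..., $y = a^L = \sigma(W^L a^{L-1} + b^L)$.
Function of Neural Network. Written as one function: $y = f(x) = \sigma(W^L \cdots \sigma(W^2 \sigma(W^1 x + b^1) + b^2) \cdots + b^L)$.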
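The nested function above can be sketched as a loop over layers. This is a minimal pure-Python illustration, assuming sigmoid activations in every layer; the 2-3-1 network and its parameter values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(W, b, a_prev):
    """One layer: weighted sums plus biases, then the activation element-wise."""
    z = [sum(w * a for w, a in zip(row, a_prev)) + bi for row, bi in zip(W, b)]
    return [sigmoid(zi) for zi in z]

def network_forward(x, params):
    """params is a list of (W, b) pairs, first layer first; returns the output vector y."""
    a = x
    for W, b in params:
        a = layer_forward(W, b, a)
    return a

# Hypothetical network: 2 inputs, 3 hidden neurons, 1 output.
params = [
    ([[1.0, 1.0], [-1.0, -1.0], [0.0, 0.0]], [0.0, 0.0, 0.0]),
    ([[2.0, -2.0, 0.0]], [0.0]),
]
y = network_forward([1.0, -1.0], params)
```

Each `W` has one row per neuron in its layer and one column per neuron in the previous layer, so the shapes follow the $N_l \times N_{l-1}$ convention.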
2. What is the best function?
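Best Function = Best Parameters. $y = f(x) = \sigma(W^L \cdots \sigma(W^2 \sigma(W^1 x + b^1) + b^2) \cdots + b^L)$ describes a function set, because different parameters $W$ and $b$ lead to different functions. Formal way to define a function set: $f(x; \theta)$ with parameter set $\theta = \{W^1, b^1, W^2, b^2, \ldots, W^L, b^L\}$. Picking the best function $f^*$ means picking the best parameter set $\theta^*$.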
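Cost Function. Define a function $C(\theta)$ for a parameter set $\theta$; $C(\theta)$ evaluates how bad a parameter set is. The best parameter set $\theta^*$ is the one that minimizes $C(\theta)$: $\theta^* = \arg\min_\theta C(\theta)$. $C(\theta)$ is called the cost/loss/error function. If you define the goodness of the parameter set by another function $O(\theta)$, then $O(\theta)$ is called the objective function.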
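Cost Function. Given training data $\{(x^1, \hat{y}^1), \ldots, (x^r, \hat{y}^r), \ldots, (x^R, \hat{y}^R)\}$. Handwriting digit classification: minimize the distance between the network output $f(x^r; \theta)$ and the target $\hat{y}^r$, summed over training examples: $C(\theta) = \frac{1}{R} \sum_r \| f(x^r; \theta) - \hat{y}^r \|$.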
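A cost of this form might be computed as below. The slide only says "distance"; using the Euclidean distance is an assumption of this sketch, and the one-parameter model `scale_model` is a hypothetical stand-in for a network.

```python
import math

def distance(u, v):
    """ASSUMPTION: Euclidean distance between output and target vectors."""
    return math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)))

def cost(f, theta, data):
    """Average distance between f(x; theta) and the target over all training examples."""
    return sum(distance(f(x, theta), y_hat) for x, y_hat in data) / len(data)

def scale_model(x, theta):
    """Hypothetical model with a single parameter: multiply the input by theta."""
    return [theta * x[0]]

data = [([1.0], [2.0]), ([2.0], [4.0])]
good = cost(scale_model, 2.0, data)   # theta = 2 reproduces every target
bad = cost(scale_model, 1.0, data)    # theta = 1 misses by 1 and by 2
```

A lower value means a better parameter set, which is what the arg min in the definition of $\theta^*$ expresses.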
3. How to pick the best function? Gradient Descent
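Statement of Problems. There is a function $C(\theta)$, where $\theta$ represents the parameter set $\theta = \{\theta_1, \theta_2, \theta_3, \ldots\}$. Find $\theta^*$ that minimizes $C(\theta)$. Brute force? Enumerate all possible $\theta$. Calculus? Find $\theta^*$ such that $\left.\frac{\partial C(\theta)}{\partial \theta_1}\right|_{\theta = \theta^*} = 0$, $\left.\frac{\partial C(\theta)}{\partial \theta_2}\right|_{\theta = \theta^*} = 0$, ...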
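Gradient Descent Idea. For simplification, first consider that $\theta$ has only one variable. Drop a ball somewhere on the curve $C(\theta)$ and let it roll downhill. When the ball stops, we have found a local minimum.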
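Gradient Descent Idea. For simplification, first consider that $\theta$ has only one variable. Randomly start at $\theta^0$. Compute $dC(\theta^0)/d\theta$ and update $\theta^1 \leftarrow \theta^0 - \eta \, dC(\theta^0)/d\theta$. Compute $dC(\theta^1)/d\theta$ and update $\theta^2 \leftarrow \theta^1 - \eta \, dC(\theta^1)/d\theta$. And so on. $\eta$ is called the learning rate.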
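The one-variable update rule can be sketched as follows. The quadratic cost $C(\theta) = (\theta - 3)^2$, the starting point and the learning rate are all illustrative assumptions.

```python
def gradient_descent_1d(dC, theta0, eta, steps):
    """Repeat theta <- theta - eta * dC(theta) for a fixed number of steps."""
    theta = theta0
    for _ in range(steps):
        theta = theta - eta * dC(theta)
    return theta

# Illustrative cost C(theta) = (theta - 3)^2 with derivative 2 * (theta - 3).
theta = gradient_descent_1d(lambda t: 2.0 * (t - 3.0), theta0=0.0, eta=0.1, steps=200)
```

With this cost each step shrinks the distance to the minimum by a constant factor, so `theta` ends up very close to 3.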
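Gradient Descent. Suppose that $\theta$ has two variables $\{\theta_1, \theta_2\}$. Randomly start at $\theta^0 = \begin{bmatrix} \theta_1^0 \\ \theta_2^0 \end{bmatrix}$. Compute the gradient of $C(\theta)$ at $\theta^0$: $\nabla C(\theta^0) = \begin{bmatrix} \partial C(\theta^0)/\partial \theta_1 \\ \partial C(\theta^0)/\partial \theta_2 \end{bmatrix}$. Update parameters: $\begin{bmatrix} \theta_1^1 \\ \theta_2^1 \end{bmatrix} = \begin{bmatrix} \theta_1^0 \\ \theta_2^0 \end{bmatrix} - \eta \begin{bmatrix} \partial C(\theta^0)/\partial \theta_1 \\ \partial C(\theta^0)/\partial \theta_2 \end{bmatrix}$, that is, $\theta^1 = \theta^0 - \eta \nabla C(\theta^0)$. Compute the gradient of $C(\theta)$ at $\theta^1$: $\nabla C(\theta^1) = \begin{bmatrix} \partial C(\theta^1)/\partial \theta_1 \\ \partial C(\theta^1)/\partial \theta_2 \end{bmatrix}$, and repeat.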
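Gradient Descent. [Figure: contour plot over $\theta_1$ and $\theta_2$ showing the gradient and the movement at each step.] Start at position $\theta^0$. Compute the gradient at $\theta^0$ and move to $\theta^1 = \theta^0 - \eta \nabla C(\theta^0)$. Compute the gradient at $\theta^1$ and move to $\theta^2 = \theta^1 - \eta \nabla C(\theta^1)$. Continue to $\theta^3$, and so on.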
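The same procedure with two variables can be sketched by treating the parameters and the gradient as lists. The bowl-shaped cost used here is a made-up example with its minimum at (1, -2).

```python
def gradient_descent(grad_C, theta0, eta, steps):
    """Return the whole path theta^0, theta^1, ..., theta^steps."""
    theta = list(theta0)
    path = [tuple(theta)]
    for _ in range(steps):
        g = grad_C(theta)
        theta = [t - eta * gi for t, gi in zip(theta, g)]   # move against the gradient
        path.append(tuple(theta))
    return path

def C(theta):
    """Illustrative cost with minimum at theta = (1, -2)."""
    return (theta[0] - 1.0) ** 2 + 2.0 * (theta[1] + 2.0) ** 2

def grad_C(theta):
    """Partial derivatives of C with respect to theta_1 and theta_2."""
    return [2.0 * (theta[0] - 1.0), 4.0 * (theta[1] + 2.0)]

path = gradient_descent(grad_C, theta0=[-3.0, 3.0], eta=0.1, steps=100)
final = path[-1]
```

Keeping the path makes it easy to plot the movement at each step, as in the contour figure.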
Formal Derivation of Gradient Descent. Suppose that $\theta$ has two variables $\{\theta_1, \theta_2\}$. Given a point on $C(\theta)$, we can easily find the point with the smallest value nearby. How?
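Formal Derivation of Gradient Descent. Taylor series: let $h(x)$ be infinitely differentiable around $x = x_0$. Then $h(x) = \sum_{k=0}^{\infty} \frac{h^{(k)}(x_0)}{k!} (x - x_0)^k = h(x_0) + h'(x_0)(x - x_0) + \frac{h''(x_0)}{2!}(x - x_0)^2 + \cdots$. When $x$ is close to $x_0$: $h(x) \approx h(x_0) + h'(x_0)(x - x_0)$.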
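E.g. Taylor series for $h(x) = \sin(x)$ around $x_0 = \pi/4$: $\sin(x) = \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}}\left(x - \frac{\pi}{4}\right) - \frac{1}{2\sqrt{2}}\left(x - \frac{\pi}{4}\right)^2 + \cdots$. The approximation is good around $\pi/4$.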
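A quick numerical check of the claim that the first-order approximation is good only near $\pi/4$ is sketched below; the two test offsets (0.01 and 1.0) are arbitrary choices.

```python
import math

x0 = math.pi / 4

def taylor_first_order(x):
    """First-order Taylor approximation of sin around x0: h(x0) + h'(x0) * (x - x0)."""
    return math.sin(x0) + math.cos(x0) * (x - x0)

# Approximation error close to x0 and far from x0.
near_error = abs(math.sin(x0 + 0.01) - taylor_first_order(x0 + 0.01))
far_error = abs(math.sin(x0 + 1.0) - taylor_first_order(x0 + 1.0))
```

The error grows roughly with the square of the distance from $x_0$, which is why gradient descent only trusts the linear approximation inside a small neighbourhood.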
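Multivariable Taylor series: $h(x, y) = h(x_0, y_0) + \frac{\partial h(x_0, y_0)}{\partial x}(x - x_0) + \frac{\partial h(x_0, y_0)}{\partial y}(y - y_0)$ + something related to $(x - x_0)^2$ and $(y - y_0)^2$ + ... When $x$ and $y$ are close to $x_0$ and $y_0$: $h(x, y) \approx h(x_0, y_0) + \frac{\partial h(x_0, y_0)}{\partial x}(x - x_0) + \frac{\partial h(x_0, y_0)}{\partial y}(y - y_0)$.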
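Formal Derivation of Gradient Descent. Based on the Taylor series: if the red circle centered at $(a, b)$ is small enough, then inside the red circle $C(\theta) \approx C(a, b) + \frac{\partial C(a, b)}{\partial \theta_1}(\theta_1 - a) + \frac{\partial C(a, b)}{\partial \theta_2}(\theta_2 - b)$. Writing $s = C(a, b)$, $u = \frac{\partial C(a, b)}{\partial \theta_1}$ and $v = \frac{\partial C(a, b)}{\partial \theta_2}$, this is $C(\theta) \approx s + u(\theta_1 - a) + v(\theta_2 - b)$.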
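Formal Derivation of Gradient Descent. Based on the Taylor series: if the red circle is small enough, then inside the red circle $C(\theta) \approx s + u(\theta_1 - a) + v(\theta_2 - b)$. Find $\theta_1$ and $\theta_2$ yielding the smallest value of $C(\theta)$ in the circle: $\begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix} - \eta \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix} - \eta \begin{bmatrix} \partial C(a, b)/\partial \theta_1 \\ \partial C(a, b)/\partial \theta_2 \end{bmatrix}$, where the value of $\eta$ depends on the radius of the circle, $u$ and $v$. This is how gradient descent updates parameters.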
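The step from the linear approximation to the update rule can be spelled out as a short derivation. Here $(a, b)$ is the centre of the circle, $s$ is the cost at the centre, $u$ and $v$ are the two partial derivatives there, and the symbol $d$ for the radius of the circle is an assumption introduced only for this sketch.

```latex
\begin{aligned}
&\text{Let } \Delta\theta_1 = \theta_1 - a,\quad \Delta\theta_2 = \theta_2 - b .\\
&\text{Minimize } s + u\,\Delta\theta_1 + v\,\Delta\theta_2
 \quad\text{subject to}\quad \Delta\theta_1^2 + \Delta\theta_2^2 \le d^2 .\\
&\text{By the Cauchy--Schwarz inequality, }
 u\,\Delta\theta_1 + v\,\Delta\theta_2 \;\ge\; -\sqrt{u^2 + v^2}\; d ,\\
&\text{with equality when }
 \begin{bmatrix}\Delta\theta_1 \\ \Delta\theta_2\end{bmatrix}
 = -\eta \begin{bmatrix} u \\ v \end{bmatrix},
 \qquad \eta = \frac{d}{\sqrt{u^2 + v^2}} .
\end{aligned}
```

So the best point inside the circle lies in the direction opposite to the gradient $(u, v)$, and $\eta$ depends on the radius and on $u$ and $v$, as stated on the slide.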
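Gradient Descent for Neural Network. Starting parameters $\theta^0$ → compute $\nabla C(\theta^0)$ → $\theta^1 = \theta^0 - \eta \nabla C(\theta^0)$ → compute $\nabla C(\theta^1)$ → $\theta^2 = \theta^1 - \eta \nabla C(\theta^1)$ → ... Here $\theta = \{W^1, b^1, W^2, b^2, \ldots, W^L, b^L\}$ and $\nabla C(\theta)$ collects every $\partial C(\theta)/\partial w_{ij}^l$ and $\partial C(\theta)/\partial b_i^l$. There are millions of parameters. To compute the gradients efficiently, we use backpropagation.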
Stuck at local minima? Saddle point. Who is Afraid of Non-Convex Loss Functions? http://videolectures.net/eml07_lecun_wia/ Deep Learning: Theoretical Motivations http://videolectures.net/deeplearning2015_bengio_theoretical_motivations/
3. How to pick the best function? Practical Issues for neural network
Practical Issues for neural network: Parameter Initialization; Learning Rate; Stochastic gradient descent and Mini-batch; Recipe for Learning.
Parameter Initialization. For gradient descent, we need to pick initial parameters $\theta^0$. The initial parameters have some influence on the training. We will come back to this issue in the future. Suggestion today: do not set all the parameters in $\theta^0$ equal; set the parameters in $\theta^0$ randomly.
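Random initialization of one layer might look like the sketch below. Drawing uniformly from a small symmetric interval is an assumption of this example (the slide only says "randomly"), and `init_layer` and `scale` are hypothetical names.

```python
import random

def init_layer(n_out, n_in, rng, scale=0.1):
    """ASSUMPTION: draw every weight and bias uniformly from [-scale, scale]."""
    W = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    b = [rng.uniform(-scale, scale) for _ in range(n_out)]
    return W, b

rng = random.Random(0)        # fixed seed so the sketch is reproducible
W, b = init_layer(3, 4, rng)  # a layer with 4 inputs and 3 neurons
```

Because the values differ from neuron to neuron, the neurons in a layer do not all start out computing the same function.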
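Learning Rate. Set the learning rate $\eta$ carefully. [Figure: cost versus number of parameter updates for learning rates that are very large, large, small, and just right, together with the corresponding paths on the error surface.]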
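Learning Rate. Set the learning rate $\eta$ carefully. Toy example: a single neuron with input $x$, weight $w$ and bias $b$, $z = wx + b$, $y = z$; the target parameters are $w^*$ and $b^*$. Training data (20 examples): x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5]; y = [0.?, 0.4, 0.9, 1.6, ?.?, 2.5, 2.8, 3.5, 3.9, 4.7, 5.?, 5.3, 6.3, 6.5, 6.7, 7.5, 8.?, 8.5, 8.9, 9.5], where "?" marks a digit that is not recoverable from the extracted text.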
Learning Rate, Toy Example. Error surface: $C(w, b)$. [Figure: error surface with the start point and the target marked.]
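Learning Rate, Toy Example. Different learning rates $\eta$. [Figure: gradient descent paths on the error surface for several values of $\eta$ (the values are not recoverable from the extracted text); one setting needs about 3k updates and another about 0.3k updates.]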
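The effect of the learning rate on a toy problem of this kind can be sketched as follows. Everything numeric here is an assumption: the inputs run from 0.0 to 9.5 in steps of 0.5, the targets are the noise-free line y = x, the cost is the mean squared error, and the three learning rates were picked to show slow, faster and diverging behaviour.

```python
xs = [0.5 * k for k in range(20)]   # ASSUMPTION: inputs 0.0, 0.5, ..., 9.5
ys = [1.0 * x for x in xs]          # ASSUMPTION: synthetic noise-free targets y = x

def cost(w, b):
    """Mean squared error of the line y = w * x + b on the toy data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w, b):
    """Partial derivatives of the cost with respect to w and b."""
    dw = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    db = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    return dw, db

def updates_until(eta, tol=1e-3, max_updates=100000, start=(0.0, 1.0)):
    """Number of updates until the cost drops below tol, or None if it never does."""
    w, b = start
    for n in range(max_updates + 1):
        if cost(w, b) < tol:
            return n
        dw, db = gradient(w, b)
        w, b = w - eta * dw, b - eta * db
    return None

slow = updates_until(0.001)                        # small learning rate: many updates
faster = updates_until(0.01)                       # larger learning rate: fewer updates
diverged = updates_until(0.1, max_updates=100)     # too large: the cost blows up
```

On this synthetic problem a ten times larger learning rate needs roughly ten times fewer updates, until the rate becomes so large that the updates overshoot and the cost grows instead of shrinking.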
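Stochastic Gradient Descent and Mini-batch. With $C^r(\theta) = \| f(x^r; \theta) - \hat{y}^r \|$ and $C(\theta) = \frac{1}{R} \sum_r C^r(\theta)$. Gradient descent: $\theta^i = \theta^{i-1} - \eta \nabla C(\theta^{i-1})$, where $\nabla C(\theta^{i-1}) = \frac{1}{R} \sum_r \nabla C^r(\theta^{i-1})$. Stochastic gradient descent: pick an example $x^r$ and update $\theta^i = \theta^{i-1} - \eta \nabla C^r(\theta^{i-1})$. If all examples $x^r$ have equal probabilities to be picked, $E[\nabla C^r(\theta^{i-1})] = \frac{1}{R} \sum_r \nabla C^r(\theta^{i-1})$. Faster! Better!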
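Stochastic Gradient Descent and Mini-batch. When using stochastic gradient descent with training data $\{(x^1, \hat{y}^1), (x^2, \hat{y}^2), \ldots, (x^r, \hat{y}^r), \ldots, (x^R, \hat{y}^R)\}$, starting at $\theta^0$: pick $x^1$: $\theta^1 = \theta^0 - \eta \nabla C^1(\theta^0)$; pick $x^2$: $\theta^2 = \theta^1 - \eta \nabla C^2(\theta^1)$; ...; pick $x^r$: $\theta^r = \theta^{r-1} - \eta \nabla C^r(\theta^{r-1})$; ...; pick $x^R$: $\theta^R = \theta^{R-1} - \eta \nabla C^R(\theta^{R-1})$. What is an epoch? Having seen all the examples once is one epoch. Then pick $x^1$ again: $\theta^{R+1} = \theta^R - \eta \nabla C^1(\theta^R)$.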
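One epoch of stochastic gradient descent, visiting the examples in order as on the slide, might be sketched like this. The scalar model $y = \theta x$, the squared-error per-example cost and the three data points are assumptions made to keep the example small.

```python
def sgd_epoch(theta, data, grad_r, eta):
    """One epoch: one parameter update per training example, in the given order."""
    updates = 0
    for example in data:                       # pick x^1, then x^2, ..., then x^R
        theta = theta - eta * grad_r(theta, example)
        updates += 1
    return theta, updates

def grad_r(theta, example):
    """ASSUMPTION: per-example cost (theta * x - y_hat)^2, so this is its derivative."""
    x, y_hat = example
    return 2.0 * (theta * x - y_hat) * x

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]    # synthetic targets y_hat = 2 * x
theta = 0.0
for epoch in range(20):
    theta, updates = sgd_epoch(theta, data, grad_r, eta=0.05)
```

With R examples there are R updates per epoch, compared with a single update per pass for plain gradient descent.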
Stochastic Gradient Descent and Mini-batch, Toy Example. Gradient descent: see all examples, then update once after seeing all of them. Stochastic gradient descent: see only one example per update; if there are 20 examples, update 20 times in one epoch.
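Stochastic Gradient Descent and Mini-batch. Gradient descent: $\theta^i = \theta^{i-1} - \eta \frac{1}{R} \sum_r \nabla C^r(\theta^{i-1})$. Stochastic gradient descent: pick an example $x^r$, $\theta^i = \theta^{i-1} - \eta \nabla C^r(\theta^{i-1})$. Mini-batch gradient descent: pick $B$ examples as a batch $b$ ($B$ is the batch size) and average the gradient of the examples in the batch: $\theta^i = \theta^{i-1} - \eta \frac{1}{B} \sum_{x^r \in b} \nabla C^r(\theta^{i-1})$. Shuffle your data.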
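Mini-batch gradient descent with shuffling can be sketched as below. The scalar model, the squared-error per-example cost, the synthetic data and the batch size of 2 are assumptions of this example.

```python
import random

def minibatches(data, batch_size, rng):
    """Shuffle a copy of the data and cut it into consecutive batches."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    return [shuffled[i:i + batch_size] for i in range(0, len(shuffled), batch_size)]

def minibatch_epoch(theta, data, grad_r, eta, batch_size, rng):
    """One epoch: one update per batch, using the average gradient over the batch."""
    for batch in minibatches(data, batch_size, rng):
        avg_grad = sum(grad_r(theta, example) for example in batch) / len(batch)
        theta = theta - eta * avg_grad
    return theta

def grad_r(theta, example):
    """ASSUMPTION: per-example cost (theta * x - y_hat)^2."""
    x, y_hat = example
    return 2.0 * (theta * x - y_hat) * x

rng = random.Random(0)
data = [(float(k), 2.0 * k) for k in range(1, 7)]   # synthetic targets y_hat = 2 * x
theta = 0.0
for epoch in range(60):
    theta = minibatch_epoch(theta, data, grad_r, eta=0.01, batch_size=2, rng=rng)
```

Setting `batch_size` to 1 gives stochastic gradient descent, and setting it to the number of examples gives plain gradient descent.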
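Stochastic Gradient Descent and Mini-batch. [Figure: handwriting digit classification results comparing a mini-batch setting (the batch size is not recoverable from the extracted text) with gradient descent.]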
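Stochastic Gradient Descent and Mini-batch. Why is mini-batch faster than stochastic gradient descent? Stochastic gradient descent computes $z^1 = W^1 x$ separately for each example $x$. Mini-batch stacks the examples as the columns of a matrix and computes all the $z^1$ vectors with one matrix product: $[\, z^1 \;\; z^1 \,] = W^1 [\, x \;\; x \,]$. Practically, which one is faster?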
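The equivalence between per-example products and one batched matrix product can be checked with a small sketch. Note that this pure-Python version only demonstrates that the results agree; any speed advantage in practice comes from optimized matrix routines, which are not used here. The matrix and the two examples are made-up numbers.

```python
def matvec(W, x):
    """z = W x for a single example x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(W, X):
    """Z = W X, where each column of X is one example."""
    columns = list(zip(*X))
    return [[sum(w * xi for w, xi in zip(row, col)) for col in columns] for row in W]

W = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]]            # hypothetical 3 x 2 weight matrix
examples = [[1.0, 2.0], [4.0, -2.0]]                 # two examples with 2 dimensions each

X = [list(row) for row in zip(*examples)]            # examples become the columns of X
Z = matmul(W, X)                                     # one product for the whole batch
Z_columns = [list(col) for col in zip(*Z)]           # column k is z for example k
one_by_one = [matvec(W, x) for x in examples]        # one product per example
```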
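Recipe for Learning. Data provided in homework: training data (pairs $x$, $\hat{y}$), part of which serves as validation data, and testing data (inputs $x$). Beyond the homework data there is the real testing data (inputs $x$). The best function $f^*$ is picked with the training data and then produces $y$ for the testing inputs.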
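Recipe for Learning. Data provided in homework: training data (with a validation part) and testing data, followed by real testing. For the testing data you immediately know the accuracy. For the real testing you do not know the accuracy until the deadline (and that is what really counts).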
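Holding out part of the training data for validation can be sketched as follows. The 20% validation fraction and the helper name `split_train_validation` are assumptions of this example, not part of the homework setup.

```python
import random

def split_train_validation(examples, validation_fraction, rng):
    """Shuffle a copy of the examples and hold out a fraction for validation."""
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]

examples = [(float(k), 2.0 * k) for k in range(100)]   # hypothetical (x, y_hat) pairs
train, validation = split_train_validation(examples, 0.2, random.Random(0))
```

Accuracy on the validation part can be measured immediately, which gives an estimate of how the chosen function behaves on data it was not trained on.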
Recipe for Learning. Do I get good results on the training set? If no, modify your training process. Possible causes: your code has a bug; you cannot find a good function (stuck at local minima or saddle points), so change the training strategy; or the model is bad (there is no good function in the hypothesis function set), so you probably need a bigger network.
Recipe for Learning. Do I get good results on the training set? If no, modify your training process. If yes: do I get good results on the validation set? If yes, done. If no, work on preventing overfitting. Your code usually does not have a bug in this situation.
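Recipe for Learning - Overfitting. You pick a best parameter set $\theta^*$. On the training data $\{(x^r, \hat{y}^r)\}$: $\forall r: f(x^r; \theta^*) = \hat{y}^r$. However, on the testing data $\{x^u\}$: $f(x^u; \theta^*) \ne \hat{y}^u$. Training data and testing data have different distributions. [Figure: example training data next to example testing data.]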
Recipe for Learning - Overfitting. Panacea: have more training data. You can do that in a real application, but you can't do that in the homework. We will come back to this issue in the future.
Concluding Remarks. 1. What is the model (function hypothesis set)? Neural Network. 2. What is the best function? Cost Function. 3. How to pick the best function? Gradient Descent, with practical issues: Parameter Initialization; Learning Rate; Stochastic gradient descent and Mini-batch; Recipe for Learning.
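Acknowledgement. Thanks to 余朗祺 for correcting spelling errors on the slides during class. Thanks to 吳柏瑜 for correcting notation errors on the slides. Thanks to Yes Huang for correcting typos on the slides.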