Learning to Process Natural Language in Big Data Environment
1 CCF ADL 2015, Nanchang, Oct 11, 2015. Learning to Process Natural Language in Big Data Environment. Hang Li, Noah's Ark Lab, Huawei Technologies
2 Part 2: Useful Deep Learning Tools
3 Powerful Deep Learning Tools: Neural Word Embedding (unsupervised), Recurrent Neural Networks, Convolutional Neural Networks, Recursive Neural Networks
4 Neural Word Embedding
5 Neural Word Embedding
Motivation: representing words with lower-dimensional (100~) real-valued vectors, in an unsupervised setting, as input to neural networks.
Tool: Word2Vec. Method: SGNS (Skip-Gram with Negative Sampling)
6 Skip-Gram with Negative Sampling (Mikolov et al., 2013)
Input: co-occurrences between words w and contexts c (a word-context matrix M).
Probability model:
P(D=1 | w, c) = 1 / (1 + e^{-w·c})
P(D=0 | w, c) = 1 / (1 + e^{w·c})
7 Skip-Gram with Negative Sampling
Negative sampling: randomly sample unobserved pairs (w, c_N), contributing the expected term E_{c_N ~ P_N}[log σ(−w·c_N)].
Objective of learning:
L = Σ_w Σ_c #(w, c) [ log σ(w·c) + k · E_{c_N ~ P_N} log σ(−w·c_N) ]
Algorithm: stochastic gradient descent
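The objective above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the Word2Vec implementation: `sgns_step` is a hypothetical helper performing one stochastic-gradient update for a single observed (word, context) pair with a given list of negative samples.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_step(W, C, w, c, neg, lr=0.025):
    """One SGD step on the SGNS objective for an observed pair (w, c).

    W: word vectors, C: context vectors (both V x d);
    neg: indices of k negative-sampled contexts drawn from P_N.
    Gradient ascent on log sigma(w.c) + sum over negatives of log sigma(-w.c_N).
    """
    wv = W[w]
    # observed pair: push sigma(w.c) toward 1
    g = 1.0 - sigmoid(wv @ C[c])
    grad_w = g * C[c]
    C[c] += lr * g * wv
    # negative pairs: push sigma(w.c_N) toward 0
    for n in neg:
        gn = -sigmoid(wv @ C[n])
        grad_w += gn * C[n]
        C[n] += lr * gn * wv
    W[w] += lr * grad_w
    return W, C
```

Each update raises σ(w·c) for the observed pair and lowers σ(w·c_N) for the sampled negatives, which is exactly the two-term objective on the slide.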
8 Interpretation as Matrix Factorization (Levy & Goldberg, 2014)
Pointwise Mutual Information matrix M, with entries
PMI(w, c) = log [ P(w, c) / (P(w) P(c)) ]
9 Interpretation as Matrix Factorization
M = W C^T: matrix factorization of M, equivalent to SGNS; the rows of W give the word embeddings.
10 Word Representation: Neural Word Embedding (Mikolov et al., 2013)
PMI matrix M with entries log [ P(w, c) / (P(w) P(c)) ]; factorization M = W C^T; the rows of W are the word embedding, i.e., word2vec.
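Assuming a small co-occurrence count matrix, the factorization view can be illustrated with NumPy: build the (positive) PMI matrix and factor it with a truncated SVD, with the left factor playing the role of W. This is a sketch of the Levy & Goldberg interpretation, not of the word2vec tool itself; the function name is illustrative.

```python
import numpy as np

def ppmi_embeddings(counts, dim):
    """Factor the positive-PMI matrix of a word-context count matrix.

    counts: (V x C) matrix of co-occurrence counts #(w, c).
    Returns W (V x dim), whose rows serve as word embeddings.
    """
    total = counts.sum()
    p_wc = counts / total
    p_w = p_wc.sum(axis=1, keepdims=True)
    p_c = p_wc.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_wc / (p_w * p_c))
    ppmi = np.maximum(pmi, 0.0)      # clip -inf (zero counts) and negative PMI
    ppmi = np.nan_to_num(ppmi)
    U, S, Vt = np.linalg.svd(ppmi, full_matrices=False)
    return U[:, :dim] * S[:dim]      # W; the context factor would be Vt[:dim]
```

Clipping to positive PMI is one common practical choice; SGNS itself corresponds to factorizing a shifted version of the PMI matrix.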
11 Recurrent Neural Network
12 Recurrent Neural Network (RNN) (Mikolov et al., 2010)
[Diagram: an RNN unrolled over the sentence "the cat sat on the mat"]
h_t = f(h_{t−1}, x_t)
13 Simple Recurrent Neural Network
h_t = f(h_{t−1}, x_t) = σ(W_hh h_{t−1} + W_hx x_t + b_h)
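A minimal NumPy sketch of this update, using the sigmoid activation as on the slide (all names are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rnn_step(h_prev, x, W_hh, W_hx, b_h):
    """h_t = sigmoid(W_hh h_{t-1} + W_hx x_t + b_h)."""
    return sigmoid(W_hh @ h_prev + W_hx @ x + b_h)

def rnn_run(xs, W_hh, W_hx, b_h):
    """Run the recurrence over a sequence of input vectors, from h_0 = 0."""
    h = np.zeros(W_hh.shape[0])
    for x in xs:
        h = rnn_step(h, x, W_hh, W_hx, b_h)
    return h
```

The same state vector h is threaded through every position, which is what lets the network handle sequences of any length with a fixed set of parameters.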
14 Long Short-Term Memory (LSTM) (Hochreiter & Schmidhuber, 1997)
Has a memory cell (vector) to memorize previous values; uses an input gate, a forget gate, and an output gate. A gate is an element-wise product with a vector whose values lie in [0, 1].
i_t = σ(W_ih h_{t−1} + W_ix x_t + b_i)   (input gate)
f_t = σ(W_fh h_{t−1} + W_fx x_t + b_f)   (forget gate)
o_t = σ(W_oh h_{t−1} + W_ox x_t + b_o)   (output gate)
g_t = tanh(W_gh h_{t−1} + W_gx x_t + b_g)
c_t = f_t ⊙ c_{t−1} + i_t ⊙ g_t
h_t = o_t ⊙ tanh(c_t)
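The six equations translate directly into code. A sketch of a single LSTM step in NumPy, with the slide's parameter names kept as dictionary keys (the helper and its calling convention are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(h_prev, c_prev, x, p):
    """One LSTM step; p is a dict holding the weight matrices and biases."""
    i = sigmoid(p["W_ih"] @ h_prev + p["W_ix"] @ x + p["b_i"])  # input gate
    f = sigmoid(p["W_fh"] @ h_prev + p["W_fx"] @ x + p["b_f"])  # forget gate
    o = sigmoid(p["W_oh"] @ h_prev + p["W_ox"] @ x + p["b_o"])  # output gate
    g = np.tanh(p["W_gh"] @ h_prev + p["W_gx"] @ x + p["b_g"])  # candidate
    c = f * c_prev + i * g          # memory cell: element-wise gating
    h = o * np.tanh(c)              # hidden state
    return h, c
```

Note that the gates enter only through element-wise products, exactly as described on the slide: each gate value scales one coordinate of the memory or output.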
15 Gated Recurrent Unit (GRU) (Cho et al., 2014)
Has a memory (vector) to memorize previous values; uses a reset gate and an update gate. h_t = g(x_t, h_{t−1}):
r_t = σ(W_rh h_{t−1} + W_rx x_t + b_r)   (reset gate)
z_t = σ(W_zh h_{t−1} + W_zx x_t + b_z)   (update gate)
g_t = tanh(W_gh (r_t ⊙ h_{t−1}) + W_gx x_t + b_g)
h_t = z_t ⊙ h_{t−1} + (1 − z_t) ⊙ g_t
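A corresponding NumPy sketch of one GRU step, following the update h_t = z ⊙ h_{t−1} + (1 − z) ⊙ g as written on the slide (parameter names illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h_prev, x, p):
    """One GRU step; p is a dict holding the weight matrices and biases."""
    r = sigmoid(p["W_rh"] @ h_prev + p["W_rx"] @ x + p["b_r"])   # reset gate
    z = sigmoid(p["W_zh"] @ h_prev + p["W_zx"] @ x + p["b_z"])   # update gate
    g = np.tanh(p["W_gh"] @ (r * h_prev) + p["W_gx"] @ x + p["b_g"])
    return z * h_prev + (1.0 - z) * g
```

Compared with the LSTM, there is no separate memory cell: the hidden state itself is interpolated between its old value and the candidate g, so the GRU needs one fewer state vector and one fewer gate.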
16 Recurrent Neural Network Language Model
Model:
h_t = tanh(W_hh h_{t−1} + W_hx x_{t−1} + b_h)
P(x_t | x_1, …, x_{t−1}) = p_t = softmax(W_ph h_t + b_p)
Objective of learning:
(1/T) Σ_{t=1}^{T} log p̂(x_t | x_1, …, x_{t−1})
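Given the recurrence above, the softmax output layer and the average per-token log-likelihood can be sketched as follows (a sketch, not Mikolov's implementation; all names are illustrative):

```python
import numpy as np

def softmax(a):
    a = a - a.max()               # subtract max for numerical stability
    e = np.exp(a)
    return e / e.sum()

def rnnlm_log_likelihood(token_ids, E, W_hh, W_hx, b_h, W_ph, b_p):
    """Average log p(x_t | x_1..x_{t-1}) of a token sequence.

    E: (V x d) input word embeddings; the output distribution is
    softmax(W_ph h_t + b_p), as in the model above.
    """
    h = np.zeros(W_hh.shape[0])
    total = 0.0
    for prev, cur in zip(token_ids[:-1], token_ids[1:]):
        h = np.tanh(W_hh @ h + W_hx @ E[prev] + b_h)
        p = softmax(W_ph @ h + b_p)
        total += np.log(p[cur])
    return total / (len(token_ids) - 1)
```

Training maximizes this quantity over a corpus; the negative of it is the usual cross-entropy loss, and exp of its negation is the model's perplexity.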
17 Recurrent Neural Network (RNN) (Mikolov et al., 2010)
Operates on a sequence of words; handles variable length; for long dependencies, use LSTM or GRU.
[Diagram: an RNN unrolled over "the cat sat on the mat"; h_t = f(h_{t−1}, x_t)]
18 Convolutional Neural Network
19 Convolutional Neural Network (CNN) (Hu et al., 2014)
[Diagram: convolution over sliding word windows of "the cat sat on the mat" ("the cat sat", "cat sat on", "sat on the", …), followed by max pooling and concatenation into a sentence vector]
20 Example: Image Convolution
[Figure: a filter convolved with a binary image (dark pixel value = 1)] (Slide credit: Leow Wee Kheng)
21 Example: Image Convolution
[Figure: the resulting feature map]
22 Convolution
z_i^{(l,f)} = σ( w^{(l,f)} · ẑ_i^{(l−1)} + b^{(l,f)} ),   f = 1, 2, …, F
z_i^{(l,f)}: output of the type-f neuron for location i in layer l; ẑ_i^{(l−1)}: its input, the concatenation of the outputs for location i from layer l−1; ẑ_i^{(0)}: the concatenated word vectors for location i; σ: the sigmoid function; w^{(l,f)}, b^{(l,f)}: the parameters of type f in layer l.
[Diagram: filter, feature map, convolution]
23 Max Pooling
z_i^{(l,f)} = max( z_{2i−1}^{(l−1,f)}, z_{2i}^{(l−1,f)} )
z_i^{(l,f)}: output of the pooling of type f for location i in layer l; z_{2i−1}^{(l−1,f)} and z_{2i}^{(l−1,f)}: the inputs of the pooling of type f for location i in layer l.
[Diagram: max pooling]
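The two operations can be sketched together: a 1-D convolution over concatenated word-vector windows, then size-2 max pooling, following the equations above (function names and the zero-padding choice are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv_layer(word_vecs, w, b, width=3):
    """z_i = sigmoid(w . [x_i; ...; x_{i+width-1}] + b) for each window i."""
    n = len(word_vecs) - width + 1
    return np.array([
        sigmoid(w @ np.concatenate(word_vecs[i:i + width]) + b)
        for i in range(n)
    ])

def max_pool(z):
    """z_i = max(z_{2i-1}, z_{2i}): halve the map, keep the stronger response."""
    if len(z) % 2:                  # zero-pad odd-length maps
        z = np.append(z, 0.0)
    return z.reshape(-1, 2).max(axis=1)
```

Stacking conv_layer and max_pool repeatedly shrinks a variable-length sentence toward the fixed-size vector that the classifier on the next slide consumes.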
24 Sentence Classification Using Convolutional Neural Network
y = f(x) = softmax( W z^{(L)} + b ),   z^{(L)} = CNN(x)
[Diagram: x → convolution → max pooling → concatenation → z → y]
25 Convolutional Neural Network (CNN) (Hu et al., 2014; Blunsom et al., 2014)
Robust parsing; parameters shared on the same level; fixed length, with zero padding.
[Diagram: convolution and max pooling over "the cat sat on the mat"]
26 Recursive Neural Network
27 Recursive Neural Network (Socher et al., 2013)
[Diagram: a parse tree over "the cat sat on the mat"]
28 Recursive Neural Network
p = tanh( W [c1; c2] + b )
score = U^T p
where c1 and c2 are the child vectors at a tree node and p is the parent vector.
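Composition along a parse tree can be sketched recursively in NumPy. W, b, and U match the slide's notation; the tree encoding (a leaf is a word vector, an internal node a (left, right) pair) and the function itself are illustrative.

```python
import numpy as np

def compose(node, W, b, U):
    """Return (vector, total score) for a binary parse-tree node.

    A leaf is a d-dim word vector; an internal node is a (left, right) pair.
    p = tanh(W [c1; c2] + b); the tree score sums U . p over internal nodes.
    """
    if isinstance(node, np.ndarray):          # leaf: a word vector
        return node, 0.0
    c1, s1 = compose(node[0], W, b, U)
    c2, s2 = compose(node[1], W, b, U)
    p = np.tanh(W @ np.concatenate([c1, c2]) + b)
    return p, s1 + s2 + U @ p
```

Because the same W is applied at every node, a single parameter set composes phrases of any depth, and the accumulated score is exactly the per-node sum used by the learning objective on the next slide.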
29 Learning of Recursive Neural Network
The score of a tree is the sum of the scores of its nodes:
s(x, y) = Σ_{n ∈ nodes(y)} s_n
Max-margin parsing:
L = Σ_i [ max_{z ∈ Z(x_i)} ( s(x_i, z) + Δ(y_i, z) ) − s(x_i, y_i) ]
Z(x): the candidate trees, greedily searched; Δ(y, z): a penalty on incorrect trees.
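The per-example loss can be sketched directly; `max_margin_loss` and its inputs are illustrative, assuming the candidate trees and their penalties have already been scored:

```python
def max_margin_loss(score_gold, candidates):
    """L_i = max_z ( s(x, z) + penalty(y, z) ) - s(x, y).

    candidates: list of (score, penalty) pairs for candidate trees z in Z(x),
    including the gold tree itself with penalty 0.
    """
    best = max(s + d for s, d in candidates)
    return best - score_gold
```

The loss is zero only when the gold tree beats every incorrect candidate by at least that candidate's penalty, which is what pushes the model toward margin-separated parses.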
30 Recursive Neural Network (RNN) (Socher et al., 2013)
Operates on the parse tree of a sentence; learning is based on max-margin parsing.
[Diagram: a parse tree over "the cat sat on the mat"]
31 Learning of Sentence Representation
32 Representation of Word Meaning
dog, cat, puppy, kitten
Using high-dimensional real-valued vectors to represent the meaning of words
33 Representation of Sentence Meaning
New finding: this is possible.
"Mary is loved by John", "Mary loves John", "John loves Mary"
Using high-dimensional real-valued vectors to represent the meaning of sentences
34 Recent Breakthrough in Distributional Linguistics
From words to sentences; compositional; representing syntax, semantics, even pragmatics
35 How Is Learning of Sentence Meaning Possible?
Deep neural networks (complicated non-linear models); big data; task-oriented; error-driven and gradient-based
36 Natural Language Tasks
Classification: assigning a label to a string, s → c
Generation: creating a string s
Matching: matching two strings, (s, s′) → R
Translation: transforming one string to another, s → s′
Structured prediction: mapping a string to a structure, s → s′
37 Natural Language Applications Can Be Formalized as Tasks
Classification: sentiment analysis
Generation: language modeling
Matching: search; question answering
Translation: machine translation; natural language dialogue (single turn); text summarization; paraphrasing
Structured prediction: information extraction; parsing
38 Learning of Representations in Tasks
Classification: s → r → c
Generation: r → s
Matching: (s, s′) → r → R
Translation: s → r → s′
Structured prediction: s → r → s′
(r denotes the learned intermediate representation)
39 Our Observation
Unsupervised word embedding (e.g., Word2Vec) is needed only when there is not enough data for supervised word embedding. Convolutional Neural Networks are suitable for matching tasks. Recurrent Neural Networks are suitable for generation tasks. It has not been observed so far that Recursive Neural Networks work better than the other two models.
40 References
Tomas Mikolov, Martin Karafiát, Lukas Burget, Jan Cernocký, and Sanjeev Khudanpur. Recurrent Neural Network based Language Model. InterSpeech 2010.
Omer Levy, Yoav Goldberg, and Ido Dagan. Improving Distributional Similarity with Lessons Learned from Word Embeddings. TACL 2015.
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. Distributed Representations of Words and Phrases and Their Compositionality. NIPS 2013.
Sepp Hochreiter and Jürgen Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8): 1735-1780, 1997.
Kyunghyun Cho, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. arXiv:1406.1078, 2014.
Baotian Hu, Zhengdong Lu, Hang Li, and Qingcai Chen. Convolutional Neural Network Architectures for Matching Natural Language Sentences. NIPS 2014.
Phil Blunsom, Edward Grefenstette, and Nal Kalchbrenner. A Convolutional Neural Network for Modelling Sentences. ACL 2014.
Richard Socher, John Bauer, Christopher D. Manning, and Andrew Y. Ng. Parsing with Compositional Vector Grammars. ACL 2013.
41 Thank you!
More informationProbabilistic Robotics
Probabilisic Roboics Bayes Filer Implemenaions Gaussian filers Bayes Filer Reminder Predicion bel p u bel d Correcion bel η p z bel Gaussians : ~ π e p N p - Univariae / / : ~ μ μ μ e p Ν p d π Mulivariae
More informationUsing the Kalman filter Extended Kalman filter
Using he Kalman filer Eended Kalman filer Doz. G. Bleser Prof. Sricker Compuer Vision: Objec and People Tracking SA- Ouline Recap: Kalman filer algorihm Using Kalman filers Eended Kalman filer algorihm
More informationGeorey E. Hinton. University oftoronto. Technical Report CRG-TR February 22, Abstract
Parameer Esimaion for Linear Dynamical Sysems Zoubin Ghahramani Georey E. Hinon Deparmen of Compuer Science Universiy oftorono 6 King's College Road Torono, Canada M5S A4 Email: zoubin@cs.orono.edu Technical
More informationGeoMAN: Multi-level Attention Networks for Geo-sensory Time Series Prediction
GeoMAN: Muli-level Aenion Neworks for Geo-sensory Time Series Predicion Yuxuan Liang 1,2, Songyu Ke 3,2, Junbo Zhang 2,4, Xiuwen Yi 4,2, Yu Zheng 2,1,3,4 1 School of Compuer Science and Technology, Xidian
More informationYNU-HPCC at SemEval-2018 Task 2: Multi-ensemble Bi-GRU Model with Attention Mechanism for Multilingual Emoji Prediction
YNU-HPCC a SemEval-2018 Task 2: Muli-ensemble Bi- Model wih Aenion Mechanism for Mulilingual Emoji Predicion Nan Wang, Jin Wang and Xuejie Zhang School of Informaion Science and Engineering Yunnan Universiy
More informationLEARNING HARD ALIGNMENTS WITH VARIATIONAL INFERENCE. Dieterich Lawson, Chung-Cheng Chiu, George Tucker, Colin Raffel, Kevin Swersky, Navdeep Jaitly
LEARNING HARD ALIGNMENTS WITH VARIATIONAL INFERENCE Dieerich Lawson, Chung-Cheng Chiu, George Tucker, Colin Raffel, Kevin Swersky, Navdeep Jaily Google Brain {dieerichl,chungchengc,gj,craffel,kswersky,ndjaily}@google.com
More informationWATER LEVEL TRACKING WITH CONDENSATION ALGORITHM
WATER LEVEL TRACKING WITH CONDENSATION ALGORITHM Shinsuke KOBAYASHI, Shogo MURAMATSU, Hisakazu KIKUCHI, Masahiro IWAHASHI Dep. of Elecrical and Elecronic Eng., Niigaa Universiy, 8050 2-no-cho Igarashi,
More information