Recurrent Neural Networks. deeplearning.ai. Why sequence models?


Shannon Ramsey
6 years ago

1 Recurrent Neural Networks deeplearning.ai Why sequence models?

2 Examples of sequence data
- Speech recognition: "The quick brown fox jumped over the lazy dog."
- Music generation
- Sentiment classification: "There is nothing to like in this movie."
- DNA sequence analysis: AGCCCCTGTGAGGAACTAG → AGCCCCTGTGAGGAACTAG
- Machine translation: "Voulez-vous chanter avec moi?" → "Do you want to sing with me?"
- Video activity recognition: Running
- Named entity recognition: "Yesterday, Harry Potter met Hermione Granger." → "Yesterday, Harry Potter met Hermione Granger."

3 Recurrent Neural Networks deeplearning.ai Notation

4 Motivating example x: Harry Potter and Hermione Granger invented a new spell.

5 Representing words x: Harry Potter and Hermione Granger invented a new spell. $x^{\langle 1 \rangle}, x^{\langle 2 \rangle}, x^{\langle 3 \rangle}, \dots, x^{\langle 9 \rangle}$

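6 Representing words x: Harry Potter and Hermione Granger invented a new spell. $x^{\langle 1 \rangle}, x^{\langle 2 \rangle}, x^{\langle 3 \rangle}, \dots, x^{\langle 9 \rangle}$
Vocabulary indices:
- And = 367
- Invented = 4700
- A = 1
- New = 5976
- Spell = 8376
- Harry = 4075
- Potter = 6830
- Hermione = 4200
- Gran = 4000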
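The word-to-index mapping above is typically turned into one-hot vectors before being fed to the network. The following is a minimal sketch of that step. The dictionary size of 10,000, the treatment of the indices as 1-based positions, and the names `VOCAB_SIZE`, `WORD_INDEX`, and `one_hot` are assumptions made for illustration; only the index values come from the slide.

```python
# Indices copied from the slide; the truncated "Gran" entry is left out.
WORD_INDEX = {
    "a": 1,
    "and": 367,
    "harry": 4075,
    "hermione": 4200,
    "invented": 4700,
    "new": 5976,
    "potter": 6830,
    "spell": 8376,
}

# Assumed dictionary size (not stated on the slide).
VOCAB_SIZE = 10_000


def one_hot(word, word_index=WORD_INDEX, vocab_size=VOCAB_SIZE):
    """Return a list of zeros with a single 1 at the word's position.

    Indices are assumed to be 1-based, so index k maps to list position k - 1.
    """
    vec = [0] * vocab_size
    vec[word_index[word.lower()] - 1] = 1
    return vec


harry_vec = one_hot("Harry")
```

Each word in the sentence then becomes one such vector, giving one input vector per position.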

7 Recurrent Neural Networks deeplearning.ai Recurrent Neural Network Model

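8 Why not a standard network? Inputs $x^{\langle 1 \rangle}, x^{\langle 2 \rangle}, \dots, x^{\langle T_x \rangle}$; outputs $y^{\langle 1 \rangle}, y^{\langle 2 \rangle}, \dots, y^{\langle T_y \rangle}$.
Problems:
- Inputs and outputs can be different lengths in different examples.
- Doesn't share features learned across different positions of text.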

9 Recurrent Neural Networks He said, Teddy Roosevelt was a great President. He said, Teddy bears are on sale!

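10 Forward Propagation [Figure: unrolled RNN with inputs $x^{\langle 1 \rangle}, x^{\langle 2 \rangle}, x^{\langle 3 \rangle}, \dots, x^{\langle T_x \rangle}$, activations $a^{\langle 0 \rangle}, a^{\langle 1 \rangle}, a^{\langle 2 \rangle}, \dots, a^{\langle T_x - 1 \rangle}$, and predictions $\hat{y}^{\langle 1 \rangle}, \hat{y}^{\langle 2 \rangle}, \hat{y}^{\langle 3 \rangle}, \dots, \hat{y}^{\langle T_y \rangle}$]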

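11 Simplified RNN notation
$a^{\langle t \rangle} = g(W_{aa} a^{\langle t-1 \rangle} + W_{ax} x^{\langle t \rangle} + b_a)$
$\hat{y}^{\langle t \rangle} = g(W_{ya} a^{\langle t \rangle} + b_y)$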
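A forward pass through the recurrence on this slide can be sketched in a few lines. This is an illustrative sketch, not the course's implementation: it assumes tanh for the hidden activation and softmax for the output (the slide only writes a generic g), uses small random weights, and relies on plain Python lists to stay dependency-free. All function and variable names are hypothetical.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(s - m) for s in z]
    total = sum(exps)
    return [e / total for e in exps]


def rnn_step(a_prev, x_t, params):
    """One time step: a<t> = tanh(W_aa a<t-1> + W_ax x<t> + b_a),
    y_hat<t> = softmax(W_ya a<t> + b_y)."""
    W_aa, W_ax, b_a, W_ya, b_y = params
    pre = [h + i + b for h, i, b in
           zip(matvec(W_aa, a_prev), matvec(W_ax, x_t), b_a)]
    a_t = [math.tanh(p) for p in pre]
    y_hat = softmax([s + b for s, b in zip(matvec(W_ya, a_t), b_y)])
    return a_t, y_hat


def rnn_forward(xs, a0, params):
    """Run the same step (shared weights) over every input in xs."""
    a, outputs = a0, []
    for x_t in xs:
        a, y_hat = rnn_step(a, x_t, params)
        outputs.append(y_hat)
    return outputs, a


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


# Toy dimensions chosen for illustration only.
rng = random.Random(0)
n_a, n_x, n_y = 4, 3, 3
params = (
    random_matrix(n_a, n_a, rng),  # W_aa
    random_matrix(n_a, n_x, rng),  # W_ax
    [0.0] * n_a,                   # b_a
    random_matrix(n_y, n_a, rng),  # W_ya
    [0.0] * n_y,                   # b_y
)
xs = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # three one-hot inputs
y_hats, a_last = rnn_forward(xs, [0.0] * n_a, params)
```

The same `params` tuple is reused at every step, which is what lets the network share features across positions in the sequence.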

12 Recurrent Neural Networks deeplearning.ai Backpropagation through time

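13 Forward propagation and backpropagation [Figure: unrolled RNN with inputs $x^{\langle 1 \rangle}, x^{\langle 2 \rangle}, x^{\langle 3 \rangle}, \dots, x^{\langle T_x \rangle}$, activations $a^{\langle 0 \rangle}, a^{\langle 1 \rangle}, a^{\langle 2 \rangle}, \dots, a^{\langle T_x - 1 \rangle}$, and predictions $\hat{y}^{\langle 1 \rangle}, \hat{y}^{\langle 2 \rangle}, \hat{y}^{\langle 3 \rangle}, \dots, \hat{y}^{\langle T_y \rangle}$]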

14 Forward propagation and backpropagation $\mathcal{L}^{\langle t \rangle}(\hat{y}^{\langle t \rangle}, y^{\langle t \rangle}) =$ Backpropagation through time

15 Recurrent Neural Networks deeplearning.ai Different types of RNNs

16 Examples of sequence data (repeated from slide 2): speech recognition, music generation, sentiment classification, DNA sequence analysis, machine translation, video activity recognition, named entity recognition.

17 Examples of RNN architectures

18 Examples of RNN architectures

19 Summary of RNN types [Figure: architecture diagrams mapping inputs $x^{\langle 1 \rangle}, \dots, x^{\langle T_x \rangle}$ to predictions $\hat{y}^{\langle 1 \rangle}, \dots, \hat{y}^{\langle T_y \rangle}$: one to one, one to many, many to one, and two many-to-many variants]

20 Recurrent Neural Networks deeplearning.ai Language model and sequence generation

21 What is language modelling? Speech recognition: "The apple and pair salad." vs. "The apple and pear salad."
$P(\text{The apple and pair salad}) =$
$P(\text{The apple and pear salad}) =$

22 Language modelling with an RNN Training set: large corpus of English text. Cats average 15 hours of sleep a day. The Egyptian Mau is a breed of cat. <EOS>

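23 RNN model Cats average 15 hours of sleep a day. <EOS>
$\mathcal{L}(\hat{y}^{\langle t \rangle}, y^{\langle t \rangle}) = -\sum_i y_i^{\langle t \rangle} \log \hat{y}_i^{\langle t \rangle}$
$\mathcal{L} = \sum_t \mathcal{L}^{\langle t \rangle}(\hat{y}^{\langle t \rangle}, y^{\langle t \rangle})$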
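The loss on this slide is the standard cross-entropy, summed first over vocabulary entries and then over time steps. A small sketch, using made-up probabilities and hypothetical function names, shows the arithmetic:

```python
import math


def step_loss(y_hat, y):
    """Cross-entropy at one time step: -sum_i y_i * log(y_hat_i).

    Terms with y_i == 0 contribute nothing and are skipped.
    """
    return -sum(y_i * math.log(p_i) for y_i, p_i in zip(y, y_hat) if y_i > 0)


def sequence_loss(y_hats, ys):
    """Total loss: the per-step losses summed over all time steps."""
    return sum(step_loss(y_hat, y) for y_hat, y in zip(y_hats, ys))


# Illustrative values: two time steps, vocabulary of three words.
y_hats = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25]]  # predicted distributions
ys = [[1, 0, 0], [0, 1, 0]]                      # one-hot targets
loss = sequence_loss(y_hats, ys)                 # equals -log(0.5) - log(0.5)
```

With one-hot targets, each step's loss reduces to the negative log-probability the model assigned to the correct next word.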

24 Recurrent Neural Networks deeplearning.ai Sampling novel sequences

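25 Sampling a sequence from a trained RNN [Figure: unrolled RNN with activations $a^{\langle 0 \rangle}, a^{\langle 1 \rangle}, a^{\langle 2 \rangle}, a^{\langle 3 \rangle}, \dots, a^{\langle T_y \rangle}$ and predictions $\hat{y}^{\langle 1 \rangle}, \hat{y}^{\langle 2 \rangle}, \hat{y}^{\langle 3 \rangle}, \dots, \hat{y}^{\langle T_y \rangle}$; the inputs are $x^{\langle 1 \rangle}$ followed by $y^{\langle 1 \rangle}, y^{\langle 2 \rangle}, \dots, y^{\langle T_y - 1 \rangle}$]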
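The sampling loop in the diagram feeds each sampled word back in as the next input. The sketch below illustrates that loop under several assumptions: the first input is taken to be a zero vector, sampling stops at an end-of-sequence token or a length cap, and `toy_step` is a hypothetical stand-in for a trained RNN step (a fixed lookup of next-word probabilities, not a real model).

```python
import random


def sample_sequence(step_fn, a0, vocab_size, eos_index, max_len, rng):
    """Sample token indices one at a time, feeding each sample back as input."""
    a = a0
    x = [0.0] * vocab_size            # assumed first input: zero vector
    tokens = []
    for _ in range(max_len):
        a, y_hat = step_fn(a, x)
        # Draw one index according to the predicted distribution.
        idx = rng.choices(range(vocab_size), weights=y_hat, k=1)[0]
        tokens.append(idx)
        if idx == eos_index:
            break
        x = [0.0] * vocab_size        # next input: one-hot of the sample
        x[idx] = 1.0
    return tokens


def toy_step(a_prev, x):
    """Hypothetical stand-in for a trained RNN step over a 3-word vocabulary
    ["cats", "sleep", "<EOS>"]; it ignores the hidden state entirely."""
    if x[0] == 1.0:                   # previous word was "cats"
        return a_prev, [0.1, 0.8, 0.1]
    if x[1] == 1.0:                   # previous word was "sleep"
        return a_prev, [0.05, 0.05, 0.9]
    return a_prev, [0.8, 0.1, 0.1]    # start of sequence


tokens = sample_sequence(toy_step, a0=[0.0], vocab_size=3, eos_index=2,
                         max_len=10, rng=random.Random(0))
```

Replacing `toy_step` with a real trained step function would leave the loop itself unchanged.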

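26 Character-level language model Vocabulary = [a, aaron, ..., zulu, <UNK>] [Figure: unrolled RNN with activations $a^{\langle 0 \rangle}, a^{\langle 1 \rangle}, a^{\langle 2 \rangle}, a^{\langle 3 \rangle}, \dots, a^{\langle T_y \rangle}$ and predictions $\hat{y}^{\langle 1 \rangle}, \hat{y}^{\langle 2 \rangle}, \hat{y}^{\langle 3 \rangle}, \dots, \hat{y}^{\langle T_y \rangle}$; the inputs are $x^{\langle 1 \rangle}$ followed by $\hat{y}^{\langle 1 \rangle}, \hat{y}^{\langle 2 \rangle}, \dots, \hat{y}^{\langle T_y - 1 \rangle}$]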

27 Sequence generation
News:
- President enrique peña nieto, announced sench's sulk former coming football langston paring.
- "I was not at all surprised," said hich langston.
- "Concussion epidemic", to be examined.
- The gray football the told some and this has on the uefa icon, should money as.
Shakespeare:
- The mortal moon hath her eclipse in love.
- And subject of this thou art another this fold.
- When besser be my love to me see sabl's.
- For whose are ruse of mine eyes heaves.

28 Recurrent Neural Networks deeplearning.ai Vanishing gradients with RNNs

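29 Vanishing gradients with RNNs [Figure: unrolled RNN with inputs $x^{\langle 1 \rangle}, x^{\langle 2 \rangle}, x^{\langle 3 \rangle}, \dots, x^{\langle T_x \rangle}$, activations $a^{\langle 0 \rangle}, a^{\langle 1 \rangle}, a^{\langle 2 \rangle}, a^{\langle 3 \rangle}, \dots$, and predictions $\hat{y}^{\langle 1 \rangle}, \hat{y}^{\langle 2 \rangle}, \hat{y}^{\langle 3 \rangle}, \dots, \hat{y}^{\langle T_y \rangle}$] Exploding gradients.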
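A toy calculation can make the vanishing and exploding behaviour concrete. The sketch below assumes a deliberately simplified scalar, linear recurrence $a^{\langle t \rangle} = w \, a^{\langle t-1 \rangle}$, for which the derivative of the last activation with respect to the first is $w$ multiplied by itself once per time step. The function name and the values of `w` are illustrative only.

```python
def gradient_through_time(w, steps):
    """Derivative of a<steps> with respect to a<0> for a<t> = w * a<t-1>.

    Backpropagating through each step multiplies the gradient by w once.
    """
    grad = 1.0
    for _ in range(steps):
        grad *= w
    return grad


vanishing = gradient_through_time(0.5, 50)  # |w| < 1: shrinks toward zero
exploding = gradient_through_time(1.5, 50)  # |w| > 1: grows very large
```

In a real RNN the scalar `w` is replaced by a product of Jacobian matrices, but the same repeated-multiplication effect is what makes long-range dependencies hard to learn.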

30 Recurrent Neural Networks deeplearning.ai Gated Recurrent Unit (GRU)

31 RNN unit $a^{\langle t \rangle} = g(W_a [a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_a)$

32 GRU (simplified) The cat, which already ate, was full. [Cho et al., On the properties of neural machine translation: Encoder-decoder approaches] [Chung et al., Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling]

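33 Full GRU
$\tilde{c}^{\langle t \rangle} = \tanh(W_c [\Gamma_r * c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_c)$
$\Gamma_u = \sigma(W_u [c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_u)$
$\Gamma_r = \sigma(W_r [c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_r)$
$c^{\langle t \rangle} = \Gamma_u * \tilde{c}^{\langle t \rangle} + (1 - \Gamma_u) * c^{\langle t-1 \rangle}$
The cat, which ate already, was full.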
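One GRU step, with an update gate and a relevance gate as in the deck's GRU equations, can be sketched as follows. This is an illustrative sketch in plain Python: the parameter dictionary layout, the helper names, and the toy values are assumptions, and `*` in the equations is implemented as element-wise multiplication.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Compute W v + b for a matrix W (list of rows) and vectors v, b."""
    return [sum(w * x for w, x in zip(row, v)) + b_i
            for row, b_i in zip(W, b)]


def gru_step(c_prev, x_t, p):
    """One GRU step; returns the new memory cell c<t>."""
    concat = c_prev + x_t                                   # [c<t-1>, x<t>]
    gamma_u = [sigmoid(z) for z in affine(p["W_u"], concat, p["b_u"])]
    gamma_r = [sigmoid(z) for z in affine(p["W_r"], concat, p["b_r"])]
    gated = [r * c for r, c in zip(gamma_r, c_prev)] + x_t  # [G_r * c<t-1>, x<t>]
    c_tilde = [math.tanh(z) for z in affine(p["W_c"], gated, p["b_c"])]
    # Blend candidate and previous memory with the update gate.
    return [u * ct + (1.0 - u) * c
            for u, ct, c in zip(gamma_u, c_tilde, c_prev)]


def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


# Toy setup: 2 memory units, 1 input feature, all weights zero.
n_c, n_x = 2, 1
p = {
    "W_u": zeros(n_c, n_c + n_x), "b_u": [-20.0] * n_c,  # gate almost closed
    "W_r": zeros(n_c, n_c + n_x), "b_r": [0.0] * n_c,
    "W_c": zeros(n_c, n_c + n_x), "b_c": [0.0] * n_c,
}
c_prev = [0.7, -0.4]
c_kept = gru_step(c_prev, [1.0], p)       # update gate near 0: memory kept

p_open = dict(p, b_u=[20.0] * n_c)        # update gate near 1
c_replaced = gru_step(c_prev, [1.0], p_open)
```

With the update gate near 0 the cell carries its previous value forward almost unchanged, which is how a GRU can remember that "cat" was singular across the intervening clause.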

34 Recurrent Neural Networks deeplearning.ai LSTM (long short term memory) unit

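35 GRU and LSTM
GRU:
$\tilde{c}^{\langle t \rangle} = \tanh(W_c [\Gamma_r * c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_c)$
$\Gamma_u = \sigma(W_u [c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_u)$
$\Gamma_r = \sigma(W_r [c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_r)$
$c^{\langle t \rangle} = \Gamma_u * \tilde{c}^{\langle t \rangle} + (1 - \Gamma_u) * c^{\langle t-1 \rangle}$
$a^{\langle t \rangle} = c^{\langle t \rangle}$
[Hochreiter & Schmidhuber, Long short-term memory]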

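36 LSTM units
GRU:
$\tilde{c}^{\langle t \rangle} = \tanh(W_c [\Gamma_r * c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_c)$
$\Gamma_u = \sigma(W_u [c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_u)$
$\Gamma_r = \sigma(W_r [c^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_r)$
$c^{\langle t \rangle} = \Gamma_u * \tilde{c}^{\langle t \rangle} + (1 - \Gamma_u) * c^{\langle t-1 \rangle}$
$a^{\langle t \rangle} = c^{\langle t \rangle}$
LSTM:
$\tilde{c}^{\langle t \rangle} = \tanh(W_c [a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_c)$
$\Gamma_u = \sigma(W_u [a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_u)$
$\Gamma_f = \sigma(W_f [a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_f)$
$\Gamma_o = \sigma(W_o [a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_o)$
$c^{\langle t \rangle} = \Gamma_u * \tilde{c}^{\langle t \rangle} + \Gamma_f * c^{\langle t-1 \rangle}$
$a^{\langle t \rangle} = \Gamma_o * \tanh(c^{\langle t \rangle})$
[Hochreiter & Schmidhuber, Long short-term memory]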
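The LSTM update with separate update, forget, and output gates can be sketched in the same style. This is an illustrative sketch rather than a reference implementation: the parameter names and toy values are assumptions, and the output follows the standard LSTM formulation, which applies tanh to the memory cell before the output gate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Compute W v + b for a matrix W (list of rows) and vectors v, b."""
    return [sum(w * x for w, x in zip(row, v)) + b_i
            for row, b_i in zip(W, b)]


def lstm_step(a_prev, c_prev, x_t, p):
    """One LSTM step; returns (a<t>, c<t>)."""
    concat = a_prev + x_t                                   # [a<t-1>, x<t>]
    c_tilde = [math.tanh(z) for z in affine(p["W_c"], concat, p["b_c"])]
    gamma_u = [sigmoid(z) for z in affine(p["W_u"], concat, p["b_u"])]
    gamma_f = [sigmoid(z) for z in affine(p["W_f"], concat, p["b_f"])]
    gamma_o = [sigmoid(z) for z in affine(p["W_o"], concat, p["b_o"])]
    # Separate gates decide how much to write and how much to keep.
    c_t = [u * ct + f * c
           for u, ct, f, c in zip(gamma_u, c_tilde, gamma_f, c_prev)]
    a_t = [o * math.tanh(c) for o, c in zip(gamma_o, c_t)]
    return a_t, c_t


def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


# Toy setup: 2 hidden units, 1 input feature, all weights zero.
n_a, n_x = 2, 1
p = {
    "W_c": zeros(n_a, n_a + n_x), "b_c": [0.0] * n_a,
    "W_u": zeros(n_a, n_a + n_x), "b_u": [-20.0] * n_a,  # write almost nothing
    "W_f": zeros(n_a, n_a + n_x), "b_f": [20.0] * n_a,   # keep almost everything
    "W_o": zeros(n_a, n_a + n_x), "b_o": [20.0] * n_a,   # expose the cell
}
a_prev, c_prev = [0.1, 0.2], [0.5, -0.3]
a_t, c_t = lstm_step(a_prev, c_prev, [1.0], p)
```

Unlike the GRU, the LSTM does not tie the write and keep amounts together: the update and forget gates are learned independently.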

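37 LSTM in pictures
$\tilde{c}^{\langle t \rangle} = \tanh(W_c [a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_c)$
$\Gamma_u = \sigma(W_u [a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_u)$
$\Gamma_f = \sigma(W_f [a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_f)$
$\Gamma_o = \sigma(W_o [a^{\langle t-1 \rangle}, x^{\langle t \rangle}] + b_o)$
$c^{\langle t \rangle} = \Gamma_u * \tilde{c}^{\langle t \rangle} + \Gamma_f * c^{\langle t-1 \rangle}$
$a^{\langle t \rangle} = \Gamma_o * \tanh(c^{\langle t \rangle})$
[Figure: a single LSTM cell taking $c^{\langle t-1 \rangle}$, $a^{\langle t-1 \rangle}$, and $x^{\langle t \rangle}$, with forget gate, update gate, tanh, and output gate blocks, producing $c^{\langle t \rangle}$, $a^{\langle t \rangle}$, and a softmax output; below it, three such cells chained over time steps 1, 2, 3, starting from $a^{\langle 0 \rangle}$ and $c^{\langle 0 \rangle}$]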

38 Recurrent Neural Networks deeplearning.ai Bidirectional RNN

39 Getting information from the future He said, "Teddy bears are on sale!" He said, "Teddy Roosevelt was a great President!" [Figure: unidirectional RNN over the sentence "He said, Teddy bears are on sale!" with inputs $x^{\langle 1 \rangle}, \dots, x^{\langle 7 \rangle}$, activations $a^{\langle 0 \rangle}, \dots, a^{\langle 7 \rangle}$, and predictions $\hat{y}^{\langle 1 \rangle}, \dots, \hat{y}^{\langle 7 \rangle}$]

40 Bidirectional RNN (BRNN)

41 Recurrent Neural Networks deeplearning.ai Deep RNNs

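42 Deep RNN example [Figure: three stacked recurrent layers with activations $a^{[l]\langle t \rangle}$ for layers $l = 1, 2, 3$ and time steps $t = 0, \dots, 4$, inputs $x^{\langle 1 \rangle}, \dots, x^{\langle 4 \rangle}$, and outputs $y^{\langle 1 \rangle}, \dots, y^{\langle 4 \rangle}$]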
