GANs. Machine Learning: Jordan Boyd-Graber University of Maryland SLIDES ADAPTED FROM GRAHAM NEUBIG

Problems with Generation
Generative Models Ain't Perfect
- Over-emphasis of common outputs, fuzziness
[Figure: panels labeled Real, MLE, Adversarial. Image credit: Lotter et al. (2015)]
Note: this is probably a good idea if you are doing maximum likelihood!
- Fitting conventional probabilistic models focuses on common inputs
- Can be fuzzy
- Still better for smaller amounts of data or if the true objective is maximum likelihood (ML)
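One common way to see why likelihood-trained models come out fuzzy is a toy case with two sharp modes. The sketch below is an illustration only: the data, the two modes at -1 and +1, and the fixed-variance Gaussian model are all assumptions, not something taken from the slides. Under that model, maximum likelihood reduces to least squares, so the best single prediction is the average of the modes, which looks like none of the real examples.

```python
import random

random.seed(0)

# Hypothetical toy data: "real" outputs cluster tightly around -1 and +1.
data = [random.choice([-1.0, 1.0]) + random.gauss(0.0, 0.05) for _ in range(1000)]

# With a fixed-variance Gaussian model, the maximum-likelihood point
# prediction is the least-squares optimum, i.e. the sample mean.
mle_prediction = sum(data) / len(data)

# How far is that prediction from the closest real example?
nearest_gap = min(abs(x - mle_prediction) for x in data)

print(f"MLE prediction: {mle_prediction:.3f}")
print(f"gap to nearest real sample: {nearest_gap:.3f}")
```

The prediction lands between the two modes, the one-dimensional analogue of a blurry image.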

Adversarial Training
It's time for some game theory
- Create a discriminator that criticizes generated output: is this example real or not?
- Generator is trained to fool the discriminator into saying it's real
- Contrast with encoder / decoder: no fixed representation

Training
GAN Training Method
- Sample a minibatch of real data, x_real
- Sample latent variables z and convert them with the generator to get x_fake
- Predict y with the discriminator on both x_real and x_fake
- Discriminator loss: higher if the discriminator fails its predictions
- Generator loss: higher if the discriminator makes correct predictions
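The training method above can be sketched end to end on a one-dimensional toy problem. Everything concrete here is an assumption made for illustration: the real data distribution N(3, 0.5), the linear generator G(z) = a*z + b, the logistic discriminator D(x) = sigmoid(w*x + c), the hand-derived gradients, and the name `train_step`. It is a minimal sketch of the loop, not a recipe for stable training.

```python
import math
import random


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def mean(values):
    return sum(values) / len(values)


def train_step(params, batch_size=64, lr=0.05, rng=random):
    """One alternating update of discriminator (w, c) and generator (a, b)."""
    a, b, w, c = params["a"], params["b"], params["w"], params["c"]

    # Sample a minibatch of real data (toy assumption: x ~ N(3, 0.5)).
    x_real = [rng.gauss(3.0, 0.5) for _ in range(batch_size)]
    # Sample latent variables z and convert them with the generator.
    z = [rng.gauss(0.0, 1.0) for _ in range(batch_size)]
    x_fake = [a * zi + b for zi in z]

    # Predict with the discriminator.
    d_real = [sigmoid(w * x + c) for x in x_real]
    d_fake = [sigmoid(w * x + c) for x in x_fake]

    # Discriminator loss: higher when it fails to separate real from fake.
    d_loss = (-mean([math.log(p) for p in d_real])
              - mean([math.log(1.0 - p) for p in d_fake]))
    grad_w = (mean([(p - 1.0) * x for p, x in zip(d_real, x_real)])
              + mean([p * x for p, x in zip(d_fake, x_fake)]))
    grad_c = mean([p - 1.0 for p in d_real]) + mean(d_fake)
    w -= lr * grad_w
    c -= lr * grad_c

    # Generator loss (non-saturating form), scored by the updated discriminator:
    # higher when the discriminator correctly rejects the fakes.
    d_fake = [sigmoid(w * x + c) for x in x_fake]
    g_loss = -mean([math.log(p) for p in d_fake])
    grad_a = mean([(p - 1.0) * w * zi for p, zi in zip(d_fake, z)])
    grad_b = mean([(p - 1.0) * w for p in d_fake])
    a -= lr * grad_a
    b -= lr * grad_b

    params.update(a=a, b=b, w=w, c=c)
    return d_loss, g_loss


random.seed(0)
params = {"a": 1.0, "b": 0.0, "w": 0.0, "c": 0.0}
for step in range(200):
    d_loss, g_loss = train_step(params)
print(f"d_loss={d_loss:.3f} g_loss={g_loss:.3f} generator offset b={params['b']:.3f}")
```

Alternating one discriminator update with one generator update is a design choice of this sketch; other schedules are possible.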

Training Equations
Discriminator
$$\ell_D(\theta_D, \theta_G) = -\mathbb{E}_{x \sim P_{\text{data}}}[\log D(x)] - \mathbb{E}_{z}[\log(1 - D(G(z)))]$$
- First term: real data should get a high score
- Second term: fake data should get a low score
Generator
$$\ell_G(\theta_D, \theta_G) = -\ell_D(\theta_D, \theta_G)$$
- If the discriminator is very accurate, it is sometimes better to focus on the non-saturating loss
- Focus on where you can confuse the discriminator
$$\mathbb{E}_{z}[-\log D(G(z))] \qquad (1)$$
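To see why the non-saturating loss helps when the discriminator is very accurate, it can be useful to compare the gradient each generator loss sends back through the discriminator's logit. The sketch below assumes D = sigmoid(s) for a logit s; the helper names and the value s = -6 are illustrative assumptions.

```python
import math


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def grad_zero_sum(s):
    """d/ds of log(1 - sigmoid(s)), the generator's term in l_G = -l_D."""
    return -sigmoid(s)


def grad_non_saturating(s):
    """d/ds of -log(sigmoid(s)), the non-saturating generator loss."""
    return -(1.0 - sigmoid(s))


# Hypothetical logit for a fake sample the discriminator confidently rejects.
s = -6.0
print(f"D(G(z))             = {sigmoid(s):.4f}")
print(f"zero-sum grad       = {grad_zero_sum(s):.4f}")
print(f"non-saturating grad = {grad_non_saturating(s):.4f}")
```

When D(G(z)) is close to zero, the zero-sum gradient nearly vanishes while the non-saturating gradient stays close to -1, so the generator still receives a learning signal.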

Problems with Training
GANs are great, but training is very hard
- Mode collapse: the generator maps all z to a single x (other examples as side information)
- Over-confident discriminator (smoothing)
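The slide only says "smoothing"; one plausible reading is one-sided label smoothing, where the discriminator's target for real examples is lowered from 1.0 to a softer value. The sketch below assumes that reading. The target of 0.9 and the helper `bce` are illustrative choices, not values from the slides.

```python
import math


def bce(p, target):
    """Binary cross-entropy of prediction p against a (possibly soft) target."""
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


REAL_TARGET = 0.9  # assumed smoothing value; hard labels would use 1.0

for p in (0.9, 0.999):
    print(f"D(x_real)={p}: hard-label loss={bce(p, 1.0):.4f}, "
          f"smoothed loss={bce(p, REAL_TARGET):.4f}")
```

With hard labels the loss keeps rewarding higher confidence. With the smoothed target, a prediction of 0.999 is penalized more than 0.9, which discourages an over-confident discriminator.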

Problems with Discrete Data
Problem! Can't Backprop through Sampling
- Sample a minibatch of real data, x_real
- Sample latent variables z and convert them with the generator to get x_fake
- x_fake is discrete! Can't backprop
- Predict y with the discriminator
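A small numerical check can make the backprop problem concrete. The sketch below samples a token from softmax probabilities with the random draw held fixed, then nudges each logit and measures how the sampled token changes. The vocabulary size, logits, draw `u`, and helper names are all hypothetical.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, u):
    """Inverse-CDF sampling: return the index selected by the uniform draw u."""
    cumulative = 0.0
    for index, p in enumerate(softmax(logits)):
        cumulative += p
        if u < cumulative:
            return index
    return len(logits) - 1  # guard against rounding in the last bucket


logits = [1.0, 2.0, 0.5]  # hypothetical generator scores over a 3-word vocabulary
u = 0.5                   # the random draw, held fixed
eps = 1e-4
base = sample_token(logits, u)

# Finite-difference "gradient" of the sampled token with respect to each logit.
grads = []
for i in range(len(logits)):
    bumped = list(logits)
    bumped[i] += eps
    grads.append((sample_token(bumped, u) - base) / eps)

print("sampled token:", base)
print("finite-difference gradients:", grads)
```

Small changes to the logits leave the sampled token unchanged, so the discriminator's feedback on x_fake carries no gradient back to the generator's parameters.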
