Applications of Deep Learning

1 Applications of Deep Learning: AlphaGo, Google Translate, Data Center Optimisation. Robin Sigurdson, Yvonne Krumbeck, Henrik Arnelid. November 23, 2016. Template by Philipp Arndt.

2 Applications of Deep Learning: Introduction. November 23, 2016. FFR141 - Complex Systems Seminar.

3 Introduction: AlphaGo; Google's Neural Machine Translation (GNMT); Deep learning to control data centre cooling.

4 Solving board games

7 Deep neural network. Supervised learning. Reinforcement learning from games of self-play. Monte Carlo simulation. Policy network. Value network.

9 Policy network: classifies promising positions. Value network: calculates estimates of winning.

11 99.8% win rate against other Go programs.

12 Google's Neural Machine Translation (GNMT)

14 Introduction: There are flaws, BUT... September 27, 2016: GNMT announced. Error reduction by 60%. "Bridging the Gap between Human and Machine Translation". How?

15 Overview: Models used so far; GNMT Model; Experiments and Results.

16 Models used so far

18 Models used so far. Phrase-Based Machine Translation (PBMT). Probability tables. Linguistic properties. Neural Machine Translation (NMT).

19 Models used so far. Flaws of NMT: accuracy; speed / computation; robustness; coverage.

20 GNMT Model

21 GNMT Model. How does GNMT handle these problems? Speed / computation; robustness.

23 GNMT Model. Architecture. Parallelism.

25 GNMT Model. Segmentation: WordPiece Model (WPM). Abwasserbehandlungsanlage -> Abwasser + behandlungs + anlage (sewage water + treatment + plant).
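
The split of the compound word on the slide above can be illustrated with a small sketch. It assumes a hand-written toy vocabulary and a greedy longest-match-first rule; the function name `segment` and the vocabulary are illustrative, and the real WordPiece model learns its vocabulary from data, so this is not the GNMT procedure itself.

```python
def segment(word, vocab):
    """Greedy longest-match-first split of one word into sub-word pieces."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate piece until it is found in the vocabulary.
        while end > start and word[start:end].lower() not in vocab:
            end -= 1
        if end == start:
            # No known piece starts here: fall back to a single character.
            end = start + 1
        pieces.append(word[start:end])
        start = end
    return pieces


# Toy vocabulary (an assumption for illustration, not a learned WordPiece vocabulary).
toy_vocab = {"abwasser", "behandlungs", "anlage", "ab", "wasser"}
print(segment("Abwasserbehandlungsanlage", toy_vocab))
```

Because every word can fall back to single characters, a scheme of this kind never needs an "unknown word" token, which is one way sub-word units help with rare words.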

26 Experiments and Results

27 Experiments and Results. Tests on Benchmark Sentence Pairs: Workshop on Machine Translation (WMT) data set; BiLingual Evaluation Understudy (BLEU) metric.
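
BLEU scores a candidate translation by its n-gram overlap with a reference translation. The following is a minimal sketch, assuming a single reference, whitespace tokens and no smoothing; the names `ngrams` and `bleu` are illustrative, and published WMT scores are computed over a whole corpus with standard tooling rather than with this function.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU against one reference, no smoothing."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: punish candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / c)
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat".split()
print(bleu("the cat sat on the mat".split(), reference))
print(bleu("the cat sat on".split(), reference))
```

The second call shows the brevity penalty at work: every n-gram of the short candidate appears in the reference, yet the score drops below 1 because the candidate is shorter than the reference.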

29 Experiments and Results. Human Evaluation.

30 Experiments and Results. Bridging the Gap between Human and Machine Translation.

31 Deep learning to optimize cooling of data centres

32 Deep learning to control data centre cooling. Introduction: facts and definitions. 1 Google search = keeping a lightbulb going for 25 s. searches/s. PUE = Power Usage Effectiveness.
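
For readers unfamiliar with the metric, PUE is conventionally defined as the ratio of the total energy drawn by the facility to the energy delivered to the IT equipment, so a value of 1.0 would mean no overhead at all. The worked number below uses made-up energy figures purely for illustration; they are not taken from the slides.

```latex
\[
\mathrm{PUE} = \frac{E_{\text{total facility}}}{E_{\text{IT equipment}}}
\]
% Hypothetical example: 1000 kWh for IT equipment plus 120 kWh for cooling and other overhead.
\[
\mathrm{PUE} = \frac{1000\,\text{kWh} + 120\,\text{kWh}}{1000\,\text{kWh}} = 1.12
\]
```

Under this definition, reducing cooling energy lowers the numerator only, which is why cooling is a natural target when trying to move PUE towards 1.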

34 Predicting PUE: 99.6% accuracy.

35 Predicting PUE. Difficult to optimize efficiency: non-linear interactions between machines and environment; system's ability to adapt to operational changes; each facility has a unique architecture.

36 Controlling the data centre. Results: 40% reduction in energy usage for cooling.

37 Conclusions: beating human intuition in board games; solving language translation tasks; outperforming human engineering abilities.

38 Thank you for listening!

39 Discussion Questions. What are the limitations of deep learning? Are there tasks for which the technique cannot be applied? Are there areas where deep learning should be used, but isn't? Who is responsible when a machine makes a critical error? For example: who is responsible if an AI or machine causes a train to derail or fails to properly diagnose a patient?

40 References
Y. Wu, M. Schuster, Z. Chen, Q. V. Le, M. Norouzi. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. (2016)
D. Bahdanau, K. H. Cho, Y. Bengio. Neural Machine Translation by Jointly Learning to Align and Translate. (2015)
P. Koehn, F. J. Och, D. Marcu. Statistical Phrase-Based Translation. (2003)
R. Sennrich, B. Haddow, A. Birch. Neural Machine Translation of Rare Words with Subword Units. (2016)
S. Jean, K. Cho, R. Memisevic, Y. Bengio. On Using Very Large Target Vocabulary for Neural Machine Translation. (2015)

41 References
google-cuts-its-giant-electricity-bill-with-deepmind-powered-ai
https: //docs.google.com/a/google.com/viewer?url= internal/assets/machine-learning-applicationsfor-datacenter-optimization-finalv2.pdf


More information

a) b) (Natural Language Processing; NLP) (Deep Learning) Bag of words White House RGB [1] IBM

a) b) (Natural Language Processing; NLP) (Deep Learning) Bag of words White House RGB [1] IBM c 1. (Natural Language Processing; NLP) (Deep Learning) RGB IBM 135 8511 5 6 52 yutat@jp.ibm.com a) b) 2. 1 0 2 1 Bag of words White House 2 [1] 2015 4 Copyright c by ORSJ. Unauthorized reproduction of

More information

TTIC 31230, Fundamentals of Deep Learning, Winter David McAllester. The Fundamental Equations of Deep Learning

TTIC 31230, Fundamentals of Deep Learning, Winter David McAllester. The Fundamental Equations of Deep Learning TTIC 31230, Fundamentals of Deep Learning, Winter 2019 David McAllester The Fundamental Equations of Deep Learning 1 Early History 1943: McCullock and Pitts introduced the linear threshold neuron. 1962:

More information

Introduction to Neural Networks

Introduction to Neural Networks Introduction to Neural Networks Philipp Koehn 4 April 205 Linear Models We used before weighted linear combination of feature values h j and weights λ j score(λ, d i ) = j λ j h j (d i ) Such models can

More information

Be able to define the following terms and answer basic questions about them:

Be able to define the following terms and answer basic questions about them: CS440/ECE448 Section Q Fall 2017 Final Review Be able to define the following terms and answer basic questions about them: Probability o Random variables, axioms of probability o Joint, marginal, conditional

More information

Overview (Fall 2007) Machine Translation Part III. Roadmap for the Next Few Lectures. Phrase-Based Models. Learning phrases from alignments

Overview (Fall 2007) Machine Translation Part III. Roadmap for the Next Few Lectures. Phrase-Based Models. Learning phrases from alignments Overview Learning phrases from alignments 6.864 (Fall 2007) Machine Translation Part III A phrase-based model Decoding in phrase-based models (Thanks to Philipp Koehn for giving me slides from his EACL

More information

If Mathematical Proof is a Game, What are the States and Moves? David McAllester

If Mathematical Proof is a Game, What are the States and Moves? David McAllester If Mathematical Proof is a Game, What are the States and Moves? David McAllester 1 AlphaGo Fan (October 2015) AlphaGo Defeats Fan Hui, European Go Champion. 2 AlphaGo Lee (March 2016) 3 AlphaGo Zero vs.

More information

Information Extraction from Text

Information Extraction from Text Information Extraction from Text Jing Jiang Chapter 2 from Mining Text Data (2012) Presented by Andrew Landgraf, September 13, 2013 1 What is Information Extraction? Goal is to discover structured information

More information

Generating Sequences with Recurrent Neural Networks

Generating Sequences with Recurrent Neural Networks Generating Sequences with Recurrent Neural Networks Alex Graves University of Toronto & Google DeepMind Presented by Zhe Gan, Duke University May 15, 2015 1 / 23 Outline Deep recurrent neural network based

More information

On Some Mathematical Results of Neural Networks

On Some Mathematical Results of Neural Networks On Some Mathematical Results of Neural Networks Dongbin Xiu Department of Mathematics Ohio State University Overview (Short) Introduction of Neural Networks (NNs) Successes Basic mechanism (Incomplete)

More information

The Geometry of Statistical Machine Translation

The Geometry of Statistical Machine Translation The Geometry of Statistical Machine Translation Presented by Rory Waite 16th of December 2015 ntroduction Linear Models Convex Geometry The Minkowski Sum Projected MERT Conclusions ntroduction We provide

More information

INF5820/INF9820 LANGUAGE TECHNOLOGICAL APPLICATIONS. Jan Tore Lønning, Lecture 3, 7 Sep., 2016

INF5820/INF9820 LANGUAGE TECHNOLOGICAL APPLICATIONS. Jan Tore Lønning, Lecture 3, 7 Sep., 2016 1 INF5820/INF9820 LANGUAGE TECHNOLOGICAL APPLICATIONS Jan Tore Lønning, Lecture 3, 7 Sep., 2016 jtl@ifi.uio.no Machine Translation Evaluation 2 1. Automatic MT-evaluation: 1. BLEU 2. Alternatives 3. Evaluation

More information

Making Deep Learning Understandable for Analyzing Sequential Data about Gene Regulation

Making Deep Learning Understandable for Analyzing Sequential Data about Gene Regulation Making Deep Learning Understandable for Analyzing Sequential Data about Gene Regulation Dr. Yanjun Qi Department of Computer Science University of Virginia Tutorial @ ACM BCB-2018 8/29/18 Yanjun Qi / UVA

More information

Sequence Modeling with Neural Networks

Sequence Modeling with Neural Networks Sequence Modeling with Neural Networks Harini Suresh y 0 y 1 y 2 s 0 s 1 s 2... x 0 x 1 x 2 hat is a sequence? This morning I took the dog for a walk. sentence medical signals speech waveform Successes

More information

Investigating Connectivity and Consistency Criteria for Phrase Pair Extraction in Statistical Machine Translation

Investigating Connectivity and Consistency Criteria for Phrase Pair Extraction in Statistical Machine Translation Investigating Connectivity and Consistency Criteria for Phrase Pair Extraction in Statistical Machine Translation Spyros Martzoukos Christophe Costa Florêncio and Christof Monz Intelligent Systems Lab

More information

Today s Lecture. Dropout

Today s Lecture. Dropout Today s Lecture so far we discussed RNNs as encoder and decoder we discussed some architecture variants: RNN vs. GRU vs. LSTM attention mechanisms Machine Translation 1: Advanced Neural Machine Translation

More information

Better Conditional Language Modeling. Chris Dyer

Better Conditional Language Modeling. Chris Dyer Better Conditional Language Modeling Chris Dyer Conditional LMs A conditional language model assigns probabilities to sequences of words, w =(w 1,w 2,...,w`), given some conditioning context, x. As with

More information

Reinforcement Learning as Classification Leveraging Modern Classifiers

Reinforcement Learning as Classification Leveraging Modern Classifiers Reinforcement Learning as Classification Leveraging Modern Classifiers Michail G. Lagoudakis and Ronald Parr Department of Computer Science Duke University Durham, NC 27708 Machine Learning Reductions

More information

Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction

Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction Maha Elbayad 1,2 Laurent Besacier 1 Jakob Verbeek 2 Univ. Grenoble Alpes, CNRS, Grenoble INP, Inria, LIG, LJK,

More information

Enforcing constraints for interpolation and extrapolation in Generative Adversarial Networks

Enforcing constraints for interpolation and extrapolation in Generative Adversarial Networks Enforcing constraints for interpolation and extrapolation in Generative Adversarial Networks Panos Stinis (joint work with T. Hagge, A.M. Tartakovsky and E. Yeung) Pacific Northwest National Laboratory

More information

Natural Language Processing (CSEP 517): Machine Translation

Natural Language Processing (CSEP 517): Machine Translation Natural Language Processing (CSEP 57): Machine Translation Noah Smith c 207 University of Washington nasmith@cs.washington.edu May 5, 207 / 59 To-Do List Online quiz: due Sunday (Jurafsky and Martin, 2008,

More information

Advances in Neural Machine Translation

Advances in Neural Machine Translation Advances in Neural Machine Translation Rico Sennrich, Alexandra Birch, Marcin Junczys-Dowmunt Institute for Language, Cognition and Computation University of Edinburgh November 1 2016 Sennrich, Birch,

More information

Natural Language Processing with Deep Learning CS224N/Ling284

Natural Language Processing with Deep Learning CS224N/Ling284 Natural Language Processing with Deep Learning CS224N/Ling284 Christopher Manning and Richard Socher Lecture 11: Further topics in Neural Machine Translation and Recurrent Models Lecture Plan: Going forwards

More information