Spatial Transformer. Ref: Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, Spatial Transformer Networks, NIPS, 2015


Spatial Transformer Layer. A CNN is not invariant to scaling and rotation. [Figure: scaled/rotated digit images fed to a CNN.] The spatial transformer is itself an NN layer, learned end-to-end together with the rest of the network, and it can transform the input image as well as intermediate feature maps.

Spatial Transformer Layer: how to transform an image/feature map. [Figure: a 3×3 map in layer l-1 connected to a 3×3 map in layer l; the map is translated down by one row.]

A general layer computes
$a^{l}_{nm} = \sum_{i=1}^{3}\sum_{j=1}^{3} w^{l}_{nm,ij}\, a^{l-1}_{ij}$.
If we want the translation above, then $a^{l}_{nm} = a^{l-1}_{(n-1)m}$, i.e. $w^{l}_{nm,ij} = 1$ if $i = n-1$ and $j = m$, and $w^{l}_{nm,ij} = 0$ otherwise.
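Not part of the slides: a minimal NumPy sketch of this weighted-sum layer, with the weights set so that it performs the one-row translation; the function name and the 3×3 size are illustrative.

```python
import numpy as np

def general_layer(a_prev, w):
    """a_l[n, m] = sum_{i,j} w[n, m, i, j] * a_prev[i, j]."""
    return np.einsum('nmij,ij->nm', w, a_prev)

# Weights that translate the 3x3 map down by one row:
# w[n, m, i, j] = 1 if (i, j) == (n - 1, m), else 0.
N = 3
w = np.zeros((N, N, N, N))
for n in range(1, N):          # row 0 has no source row above it
    for m in range(N):
        w[n, m, n - 1, m] = 1.0

a_prev = np.arange(9, dtype=float).reshape(3, 3)
print(general_layer(a_prev, w))   # each output row is the row above it in a_prev
```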

Spatial Transformer Layer: how to transform an image/feature map. [Figure: two copies of the layer l-1 and layer l grids; a small NN controls the connections between the two layers.]

Image Transformation: expansion, compression, translation.

Expansion (×2):
$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \end{bmatrix}$

Compression (×0.5) with translation:
$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 0.5 & 0 \\ 0 & 0.5 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix}$

Image Transformation: rotation by θ.
$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \end{bmatrix}$
(Image source: https://home.gamer.com.tw/creationdetail.php?sn=792585)

Spatial Transformer Layer. Six parameters describe the affine transformation from an index of layer l to an index of layer l-1:
$\begin{bmatrix} x^{l-1} \\ y^{l-1} \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x^{l} \\ y^{l} \end{bmatrix} + \begin{bmatrix} e \\ f \end{bmatrix}$
[Figure: an NN outputs (a, b, c, d, e, f), which determine how the 3×3 map of layer l-1 is connected to the 3×3 map of layer l.]
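Not part of the slides: a small sketch of how the six numbers map an output index of layer l to a source index in layer l-1; the function name is mine and the rotation example just reuses the matrix from the previous slide.

```python
import numpy as np

def affine_index(params, x, y):
    """Map an index (x, y) of layer l to an index in layer l-1
    using the six affine parameters [a, b, c, d, e, f]."""
    a, b, c, d, e, f = params
    A = np.array([[a, b], [c, d]])
    t = np.array([e, f])
    return A @ np.array([x, y]) + t

# Rotation by theta with no translation, as on the rotation slide:
theta = np.pi / 6
rot = [np.cos(theta), -np.sin(theta), np.sin(theta), np.cos(theta), 0.0, 0.0]
print(affine_index(rot, 1.0, 0.0))
```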

Spatial Transformer Layer (concrete example). The NN outputs specific values for the six parameters, e.g. $a = 0$, $b = 1$, $c = 1$, $d = 0$ together with a translation $(e, f)$, which fix which unit of layer l-1 each unit of layer l reads from. [Figure: the two 3×3 maps connected according to these parameters, with the NN supplying $(a, b, c, d, e, f)$.]

Spatial Transformer Layer (concrete example):
$\begin{bmatrix} 1.6 \\ 2.4 \end{bmatrix} = \begin{bmatrix} 0 & 0.5 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 2 \\ 2 \end{bmatrix} + \begin{bmatrix} 0.6 \\ 0.4 \end{bmatrix}$
($a = 0$, $b = 0.5$, $c = 1$, $d = 0$, $e = 0.6$, $f = 0.4$.) What is the problem? The resulting index (1.6, 2.4) is not an integer. If we simply round it to the nearest unit of layer l-1, a small change in the six parameters does not change which unit is selected, so the gradient with respect to them is always zero.

Interpolation. Instead of rounding, interpolate between the four nearest units; the output then changes smoothly with the six affine parameters, so now we can use gradient descent. For the index (1.6, 2.4), whose fractional parts are 0.6 and 0.4:
$a^{l}_{22} = (1-0.4)(1-0.4)\,a^{l-1}_{22} + (1-0.6)(1-0.4)\,a^{l-1}_{12} + (1-0.6)(1-0.6)\,a^{l-1}_{13} + (1-0.4)(1-0.6)\,a^{l-1}_{23}$
[Figure: the sampled position falls between $a^{l-1}_{12}$, $a^{l-1}_{13}$, $a^{l-1}_{22}$, $a^{l-1}_{23}$; each neighbour is weighted by how close it is.]
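Not part of the slides: a NumPy sketch of this bilinear interpolation (assuming 1-based grid indices as in the slide, and a sampled position strictly inside the grid); the function name is mine.

```python
import numpy as np

def bilinear_sample(a_prev, x, y):
    """Sample a 2D map at fractional (row, col) = (x, y), 1-based as in the slide."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))   # e.g. (1, 2) for (1.6, 2.4)
    x1, y1 = x0 + 1, y0 + 1
    dx, dy = x - x0, y - y0                       # fractional parts, e.g. 0.6 and 0.4
    # Weight each of the four neighbours by (1 - distance) in each direction.
    return ((1 - dx) * (1 - dy) * a_prev[x0 - 1, y0 - 1] +
            (1 - dx) * dy       * a_prev[x0 - 1, y1 - 1] +
            dx       * (1 - dy) * a_prev[x1 - 1, y0 - 1] +
            dx       * dy       * a_prev[x1 - 1, y1 - 1])

a_prev = np.arange(1.0, 10.0).reshape(3, 3)       # a 3x3 feature map
print(bilinear_sample(a_prev, 1.6, 2.4))          # matches the weighted sum on the slide
```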

Street View House Numbers (SVHN) experiment: "Single" uses one spatial transformer layer; "Multi" uses multiple spatial transformer layers.

Bird Recognition: spatial transformers applied to fine-grained bird classification, where each transformer learns which region of the image to attend to. [Figure: example bird images with the learned affine parameters $a, b, c, d, e, f$.]

Highway Network & Grid LSTM

Feedforward v.s. Recurrent
1. A feedforward network does not have an input at each step.
2. A feedforward network has different parameters for each layer.

Feedforward: $a^{t} = f^{t}(a^{t-1}) = \sigma(W^{t} a^{t-1} + b^{t})$, where t is the layer index.
Recurrent: $a^{t} = f(a^{t-1}, x^{t}) = \sigma(W^{h} a^{t-1} + W^{i} x^{t} + b^{i})$, where t is the time step (the same weights at every step).

Idea: apply the gated structures of recurrent networks in a feedforward network.
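Not part of the slides: a compact sketch of the two update rules above; the weights are random placeholders and the names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
dim = 4

# Feedforward: a^t = sigma(W^t a^{t-1} + b^t), different W^t, b^t per layer t.
def feedforward_layer(W_t, b_t, a_prev):
    return sigmoid(W_t @ a_prev + b_t)

# Recurrent: a^t = sigma(W^h a^{t-1} + W^i x^t + b), the same weights at every step t.
def recurrent_step(W_h, W_i, b, a_prev, x_t):
    return sigmoid(W_h @ a_prev + W_i @ x_t + b)

a = rng.normal(size=dim)
for t in range(3):                           # three feedforward layers, new weights each layer
    a = feedforward_layer(rng.normal(size=(dim, dim)), np.zeros(dim), a)

W_h, W_i, b = rng.normal(size=(dim, dim)), rng.normal(size=(dim, dim)), np.zeros(dim)
h = np.zeros(dim)
for x_t in rng.normal(size=(3, dim)):        # three time steps, shared weights
    h = recurrent_step(W_h, W_i, b, h, x_t)
```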

From GRU to Highway Network: start from a GRU and remove what a feedforward network does not need. There is no input $x^t$ at each step and no output $y^t$ at each step; $a^{t-1}$ is the output of the (t-1)-th layer and $a^t$ is the output of the t-th layer; the reset gate is removed. [Figure: GRU block with reset gate r, update gate z, and candidate h', next to the reduced block.]

Highway Network:
$h = \sigma(W a^{t-1})$, gate controller $z = \sigma(W' a^{t-1})$, $a^{t} = z \odot a^{t-1} + (1-z) \odot h$.

Residual Network:
$a^{t} = h + a^{t-1}$ (the input is copied and added to the transformed output).

Training Very Deep Networks: https://arxiv.org/pdf/1507.06228v2.pdf
Deep Residual Learning for Image Recognition: http://arxiv.org/abs/1512.03385
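Not part of the slides: a sketch of one highway layer and one residual layer following the formulas above; the weights are random placeholders and the function names are mine.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
dim = 8

def highway_layer(a_prev, W, W_gate):
    h = sigmoid(W @ a_prev)            # candidate h = sigma(W a^{t-1})
    z = sigmoid(W_gate @ a_prev)       # gate controller z = sigma(W' a^{t-1})
    return z * a_prev + (1 - z) * h    # gated mix of the copied input and the candidate

def residual_layer(a_prev, W):
    h = sigmoid(W @ a_prev)            # transformed branch
    return h + a_prev                  # the input is copied and added back

a = rng.normal(size=dim)
a = highway_layer(a, rng.normal(size=(dim, dim)), rng.normal(size=(dim, dim)))
a = residual_layer(a, rng.normal(size=(dim, dim)))
```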

[Figure: networks with different numbers of layers, from input layer to output layer.] The Highway Network automatically determines how many layers are needed!

Highway Network

Grid LSTM: memory for both time and depth. A standard LSTM passes memory c and hidden state h along the time direction only; a Grid LSTM additionally passes a memory/hidden pair (a, b) along the depth direction. [Figure: an LSTM block with input x and (c, h) flowing along time, next to a Grid LSTM block with (c, h) along the time axis and (a, b) along the depth axis.]

[Figure: Grid LSTM blocks tiled over time and depth: each block passes $(c^{t}, h^{t})$ to the next time step and $(a^{l}, b^{l})$ to the next layer.]

Inside a Grid LSTM block. [Figure: the hidden states h (time) and b (depth) together drive the gates $z$, $z^{i}$, $z^{f}$, $z^{o}$; the time memory c is updated to c' and emits h', and the depth memory a is updated to a' and emits b' in the same way.]
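Not part of the slides: a hedged sketch of one 2-D Grid LSTM block in which both dimensions see the concatenated hidden states [h; b] and each dimension applies a standard LSTM update with its own weights (a simplified reading of Kalchbrenner et al., 2015; all names, shapes, and the weight layout are illustrative, not the slide's notation).

```python
import numpy as np

rng = np.random.default_rng(2)
sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
dim = 4

def lstm_transform(H, mem, W):
    """Standard LSTM update driven by the shared hidden vector H."""
    z  = np.tanh(W['z'] @ H)           # block input
    zi = sigmoid(W['i'] @ H)           # input gate
    zf = sigmoid(W['f'] @ H)           # forget gate
    zo = sigmoid(W['o'] @ H)           # output gate
    mem_new = zf * mem + zi * z
    return zo * np.tanh(mem_new), mem_new

def grid_lstm_block(c, h, a, b, W_time, W_depth):
    """One block: (c, h) flow along time, (a, b) flow along depth."""
    H = np.concatenate([h, b])                    # both dimensions see [h; b]
    h_new, c_new = lstm_transform(H, c, W_time)   # update along the time dimension
    b_new, a_new = lstm_transform(H, a, W_depth)  # update along the depth dimension
    return c_new, h_new, a_new, b_new

make_W = lambda: {k: rng.normal(size=(dim, 2 * dim)) for k in 'zifo'}
c, h, a, b = (np.zeros(dim) for _ in range(4))
c, h, a, b = grid_lstm_block(c, h, a, b, make_W(), make_W())
```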

3D Grid LSTM. [Figure: the block extended to three dimensions, with a memory/hidden pair along each of the three dimensions.]

Recursive Structure

Application: Sentiment Analysis.
Recurrent structure: the word vectors $x^1, \dots, x^4$ of the word sequence are consumed one at a time ($h^0 \to h^1 \to h^2 \to h^3 \to \cdots$), and g maps the final state to the sentiment. The recurrent structure is a special case of the recursive structure.
Recursive structure: how to stack the composition function f is already determined (e.g. by a parse), and f's inputs and output have the same dimension; for example $h^1 = f(x^1, x^2)$, $h^2 = f(x^3, x^4)$, $h^3 = f(h^1, h^2)$, and g maps $h^3$ to the sentiment.

Recursive Model. From the word sequence "not very good" we first obtain its syntactic structure (how to parse it is out of the scope of this lecture).

Recursive Model. By composing two meanings, what should the combined meaning be? If the dimension of a word vector is Z, the composition function f takes an input of size 2×Z and outputs a vector of size Z. Following the syntactic structure of "not very good", the meaning of "very good", V("very good"), is computed from V("very") and V("good").

Recursive Model. $V(w_A\, w_B) \neq V(w_A) + V(w_B)$: meaning is not simply additive. Example: "not" is neutral and "good" is positive, yet "not good" is negative.

Another example (Mandarin): 棒 ("great") is positive and 好棒 ("really great") is positive, but 好棒棒 (sarcastic) is negative. So $V(w_A\, w_B) \neq V(w_A) + V(w_B)$: the composition f must be a learned network applied according to the syntactic structure.

Recursive Model. The network f must handle "not good" and "not bad": "not" reverses the sentiment of the other input, so f(V("not"), V("good")) should flip "good" and f(V("not"), V("bad")) should flip "bad".

Likewise for "very good" and "very bad": "very" emphasizes the other input, so f(V("very"), V("good")) strengthens "good" and f(V("very"), V("bad")) strengthens "bad".

Output: a classifier g takes the root representation V("not very good") produced by f and predicts one of 5 sentiment classes (--, -, 0, +, ++). f and g are trained jointly.
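Not part of the slides: a toy sketch of the recursive model, with a shared composition function f (input 2×Z, output Z) applied along the parse of "not very good" and a classifier g on the root; all word vectors and weights here are random placeholders, not trained values.

```python
import numpy as np

rng = np.random.default_rng(3)
Z = 5                                           # dimension of a word vector

W_f = rng.normal(size=(Z, 2 * Z))               # composition f: R^{2Z} -> R^Z
W_g = rng.normal(size=(5, Z))                   # classifier g: R^Z -> 5 sentiment classes

def f(a, b):
    """Compose two meaning vectors into one (shared at every tree node)."""
    return np.tanh(W_f @ np.concatenate([a, b]))

def g(root):
    """Map the root representation to probabilities over (--, -, 0, +, ++)."""
    scores = W_g @ root
    return np.exp(scores) / np.exp(scores).sum()

V = {w: rng.normal(size=Z) for w in ['not', 'very', 'good']}   # toy word vectors
very_good = f(V['very'], V['good'])             # follow the parse: (not (very good))
not_very_good = f(V['not'], very_good)
print(g(not_very_good))                         # probabilities over the 5 classes
```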

Recursive Neural Tensor Network. The plain composition $p = \sigma\!\left(W\begin{bmatrix} a \\ b \end{bmatrix}\right)$ allows little interaction between a and b. The tensor network adds a bilinear term: with $x = \begin{bmatrix} a \\ b \end{bmatrix}$,
$p = \sigma\!\left(x^{\top} W x + W' x\right)$, where $x^{\top} W x = \sum_{i,j} W_{ij}\, x_i x_j$ (one such bilinear form per output dimension, i.e. a tensor).
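Not part of the slides: a sketch of the tensor composition, with one bilinear slice per output dimension (a tensor of shape Z×2Z×2Z) plus the ordinary linear term; names and scaling are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
Z = 4
W_tensor = rng.normal(size=(Z, 2 * Z, 2 * Z)) * 0.1   # one 2Z x 2Z slice per output dim
W_linear = rng.normal(size=(Z, 2 * Z)) * 0.1

def rntn_compose(a, b):
    x = np.concatenate([a, b])                           # x = [a; b]
    bilinear = np.einsum('i,kij,j->k', x, W_tensor, x)   # x^T W_k x for every slice k
    return np.tanh(bilinear + W_linear @ x)              # sigma(x^T W x + W' x)

p = rntn_compose(rng.normal(size=Z), rng.normal(size=Z))
```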

Experiments: 5-class sentiment classification (--, -, 0, +, ++). Demo: http://nlp.stanford.edu:8080/sentiment/rntnDemo.html

Experiments. Socher, Richard, et al. "Recursive deep models for semantic compositionality over a sentiment treebank." Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2013.

Matrix-Vector Recursive Network. Each word is represented by a vector (its inherent meaning) and a matrix (how it changes the meaning of the other words). For "not good", let "not" = (a, A) and "good" = (b, B). The composed vector is $\sigma\!\left(W\begin{bmatrix} Ba \\ Ab \end{bmatrix}\right)$ and the composed matrix is $W_M\begin{bmatrix} A \\ B \end{bmatrix}$. Intuition from the slide: the vector of "not" carries little meaning of its own (close to zero?), while "good" is informative; the matrix of "good" barely changes the other word (close to the identity $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$?), while the matrix of "not" reverses it.
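Not part of the slides: a sketch of the matrix-vector composition, following the MV-RNN formulas of Socher et al. that this slide summarizes; the variable names and random weights are placeholders.

```python
import numpy as np

rng = np.random.default_rng(5)
Z = 3

W   = rng.normal(size=(Z, 2 * Z)) * 0.1   # combines the two modified vectors
W_M = rng.normal(size=(Z, 2 * Z)) * 0.1   # combines the two matrices

def mv_compose(a, A, b, B):
    """(a, A) and (b, B) are the (vector, matrix) pairs of the two children."""
    vec = np.tanh(W @ np.concatenate([B @ a, A @ b]))   # each word modifies the other
    mat = W_M @ np.concatenate([A, B], axis=0)          # stack (2Z x Z) -> (Z x Z)
    return vec, mat

a, b = rng.normal(size=Z), rng.normal(size=Z)
A, B = np.eye(Z), np.eye(Z)                             # e.g. initialise matrices to identity
vec, mat = mv_compose(a, A, b, B)
```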

Tree LSTM. A typical LSTM maps the previous hidden state and memory $(h^{t-1}, m^{t-1})$ plus the input to $(h^{t}, m^{t})$. A Tree LSTM instead follows the tree: a node takes its children's states $(h^{1}, m^{1})$ and $(h^{2}, m^{2})$ and produces the parent's $(h^{3}, m^{3})$.
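Not part of the slides: a hedged sketch of one binary Tree LSTM node in the spirit of Tai et al. (2015), with the two children's hidden states driving the gates and one forget gate per child; weights and names are placeholders.

```python
import numpy as np

rng = np.random.default_rng(6)
sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
dim = 4

W = {k: rng.normal(size=(dim, 2 * dim)) * 0.1 for k in ['u', 'i', 'f1', 'f2', 'o']}

def tree_lstm_node(h1, m1, h2, m2):
    """Combine children (h1, m1) and (h2, m2) into the parent (h3, m3)."""
    hh = np.concatenate([h1, h2])
    u  = np.tanh(W['u'] @ hh)            # candidate update
    i  = sigmoid(W['i'] @ hh)            # input gate
    f1 = sigmoid(W['f1'] @ hh)           # forget gate for child 1's memory
    f2 = sigmoid(W['f2'] @ hh)           # forget gate for child 2's memory
    o  = sigmoid(W['o'] @ hh)            # output gate
    m3 = i * u + f1 * m1 + f2 * m2
    return o * np.tanh(m3), m3

h1, m1, h2, m2 = (rng.normal(size=dim) for _ in range(4))
h3, m3 = tree_lstm_node(h1, m1, h2, m2)
```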

More Applications: sentence relatedness. Two recursive neural networks encode Sentence 1 and Sentence 2, and an NN on top of the two representations predicts their relatedness. Tai, Kai Sheng, Richard Socher, and Christopher D. Manning. "Improved semantic representations from tree-structured long short-term memory networks." arXiv preprint arXiv:1503.00075 (2015).

Batch Normalization

Experimental Results