2017 Fall ECE 692/599: Binary Representation Learning for Large Scale Visual Data
1 2017 Fall ECE 692/599: Binary Representation Learning for Large Scale Visual Data
Liu Liu
Instructor: Dr. Hairong Qi
University of Tennessee, Knoxville
September 21, 2017
2 Overview
1 Introduction
2 Learning Binary Representation via Cross Entropy
3 End-to-end Learning Binary Representation
4 Discriminative Cross-View Hashing
5 Cross-Domain Image Hashing with Adversarial Learning
6 Conclusion
3 Massive Datasets
Modern media have brought massive multimedia datasets:
- Facebook has about 300 million photo uploads per day
- Instagram Stories has over 250 million daily active users
- ImageNet has over 13 million images in over 21k categories
4 Resource Constraints
Resource-constrained environments, e.g., smart camera networks (SCNs):
- often deployed in harsh communication environments
- limited on-board computation and storage resources
- target applications such as distributed object/scene recognition
5 Multi-modality
Social media offer massive volumes of multimedia content:
- textual tags and annotations of images
- detailed descriptions (pathology reports) paired with medical images
- thumbnails and titles of videos
6 Cross-domain
Multi-domain visual content:
- images from different contexts
- largely available unlabeled images
- cross-indexing/retrieval w.r.t. different domains
- non-negligible domain shift
7 Binary Representation
Focus on learning efficient representations for visual content:
- project high-dimensional visual data into a low-dimensional embedding space
- binarize the embedding in Hamming space
Why binary?
- binary representations are computationally efficient (see the sketch below)
- much less storage (compared to floating-point numbers)
- versatile across tasks: retrieval, classification, etc.
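To make the efficiency claim concrete: comparing two binary codes reduces to a single XOR followed by a popcount. A minimal sketch (not from the slides; the code values are made up):

```python
import numpy as np

# Two 64-bit binary codes packed into machine words (made-up values).
a = np.uint64(0xDEADBEEFCAFEBABE)
b = np.uint64(0x0123456789ABCDEF)

# Hamming distance = popcount(a XOR b): one XOR plus a bit count per
# 64 bits, versus a full floating-point distance for real-valued vectors.
dist = bin(int(a ^ b)).count("1")
print(dist)
```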
8 End-to-end Learning
- conventional approach: a feature-generation step followed by binary embedding
- end-to-end approach: learning the binary embedding for visual content together with the features
- usually achieved with deep learning
9 Semantic Similarity vs. Visual Similarity
10 Hash Code
Most common use of binary representation: hash codes (retrieval and indexing).
Pairwise/affinity: multiple ways of using pairwise similarity information
1. $\min_H \big\| S - \frac{1}{q} H H^T \big\|_F^2$ s.t. $H \in [-1, 1]^{n \times q}$
2. $\min_H -\log P(S \mid H)$ s.t. $P(s_{ij} \mid h_i, h_j) = \sigma(\langle h_i, h_j \rangle)^{s_{ij}} \big(1 - \sigma(\langle h_i, h_j \rangle)\big)^{1 - s_{ij}}$
3. Both MAP and (weighted) MLE can be considered
Triplet loss
1. Hinge ranking loss: $\min \max\big(0,\ q/2 + d_H(h_i, h_j) - d_H(h_i, h_k)\big)$, where $(h_i, h_j)$ is a similar pair and $(h_i, h_k)$ a dissimilar pair
2. (Normalized) Discounted Cumulative Gain: $\mathrm{DCG}_p = \sum_{i=1}^{p} \frac{2^{rel_i} - 1}{\log_2(i + 1)}$ s.t. $rel_i \in \{0, 1\}$
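A minimal sketch of the hinge ranking loss above, assuming the standard margin convention (the similar pair should be at least $q/2$ bits closer than the dissimilar pair; the exact sign placement was lost in transcription):

```python
import numpy as np

def hinge_ranking_loss(h_i, h_j, h_k, q):
    """Triplet hinge ranking loss on {-1,+1}^q codes: the similar pair
    (h_i, h_j) should be at least q/2 bits closer than the dissimilar
    pair (h_i, h_k)."""
    d_ij = np.sum(h_i != h_j)  # Hamming distance to the similar code
    d_ik = np.sum(h_i != h_k)  # Hamming distance to the dissimilar code
    return max(0.0, q / 2 + d_ij - d_ik)

q = 8
h_i = np.array([1, -1, 1, 1, -1, -1, 1, -1])
h_j = np.array([1, -1, 1, -1, -1, -1, 1, -1])  # 1 bit away (similar)
h_k = -h_i                                     # 8 bits away (dissimilar)
print(hinge_ranking_loss(h_i, h_j, h_k, q))    # 0.0: ranking satisfied
```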
11 Learning Discriminative Representation
Traditionally, binary representations are learned as hash codes for retrieval, exploiting pairwise similarity.
Problem: the uniqueness of each class is lost when similarity is the only supervision.
Approach: use labels as supervision directly.
12 Learning Binary Representation via Cross Entropy
13 Learning Binary Descriptors via Classification
Given a dataset of N samples $X = \{x_i\}_{i=1}^{N}$, $x_i \in \mathbb{R}^d$.
Goal: learn $B = \{b_i\}_{i=1}^{N} \in \{-1, +1\}^{L \times N}$ via nonlinear hash functions $F: \mathbb{R}^{d \times N} \to \mathbb{R}^{L \times N}$, $L \ll d$: $B = \mathrm{sgn}(F(X))$.
Learn via classification:
$$\min_{W, F} \frac{1}{N} \sum_{i=1}^{N} \mathcal{L}(W^T b_i, y_i) + \Omega(W), \quad b_i = \mathrm{sgn}(F(x_i)) \in \{-1, +1\}^L, \ i = 1, \dots, N \tag{1}$$
where $W \in \mathbb{R}^{L \times C}$ is the linear classifier, $\mathcal{L}$ is a loss function, $Y$ are the ground-truth labels of the training set, and $\Omega$ is the regularizer for the classifier.
14 Cross Entropy
Besides L2 and hinge losses, cross entropy is a common loss function for classification, measuring the probabilistic difference between the ground-truth and predicted distributions (softmax classifier):
$$P(y_i = k \mid b_i; w_k) = \frac{e^{w_k^T b_i}}{\sum_{j=1}^{C} e^{w_j^T b_i}} \tag{2}$$
$$\mathcal{L}_i = -\sum_{k=1}^{C} t_k(y_i) \log P(y_i = k \mid b_i; w_k) \tag{3}$$
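A minimal NumPy rendering of Eqs. (2)-(3), with a max-shift added for numerical stability (the shift is not part of the slide):

```python
import numpy as np

def softmax_cross_entropy(W, b_i, y_i):
    """Eq. (2): softmax over the class scores w_k^T b_i;
    Eq. (3): negative log-likelihood of the true class y_i."""
    scores = W.T @ b_i                          # (C,) class scores
    scores = scores - scores.max()              # stabilize the exponentials
    p = np.exp(scores) / np.exp(scores).sum()   # Eq. (2)
    return -np.log(p[y_i])                      # Eq. (3) with one-hot t_k(y_i)

L, C = 64, 10
rng = np.random.default_rng(0)
W = rng.standard_normal((L, C))
b_i = np.sign(rng.standard_normal(L))           # a {-1,+1}^L binary code
print(softmax_cross_entropy(W, b_i, y_i=3))
```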
15 Continuous Relaxation
Total formulation:
$$\min_{b_i, W, F} \frac{1}{N} \sum_{i=1}^{N} \Big( -\mathbb{1}(y_i)^T W^T b_i + \log \sum_{j=1}^{C} e^{w_j^T b_i} \Big) + \lambda \|W\|_F^2 \quad \text{s.t. } b_i = \mathrm{sgn}(F(x_i)), \ i = 1, \dots, N \tag{4}$$
However, Eq. 4 is difficult to optimize directly, so we relax it to a continuous form:
$$\min_{b_i, W, F} \frac{1}{N} \sum_{i=1}^{N} \Big( -\mathbb{1}(y_i)^T W^T b_i + \log \sum_{j=1}^{C} e^{w_j^T b_i} \Big) + \lambda \|W\|_F^2 + \gamma \sum_{i=1}^{N} \|b_i - F(x_i)\|_2^2 + \rho \|F\|_2^2 \quad \text{s.t. } b_i \in \{-1, +1\}^L, \ i = 1, \dots, N \tag{5}$$
16 Alternating Optimization
Alternately optimize the three sets of parameters.
F step: embedding function optimization
$$F(x) = M^T \phi(x) \tag{6}$$
Fix the binary codes B and the classifier W:
$$\min_M \|B - M^T \phi(X)\|_F^2 + \rho \|M\|_2^2 \quad \text{s.t. } B \in \{-1, +1\}^{L \times N} \tag{7}$$
Regularized least squares:
$$M = \big(\phi(X) \phi(X)^T + \rho I\big)^{-1} \phi(X) B^T \tag{8}$$
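A sketch of the closed-form F step in Eq. (8), assuming $\phi(X)$ is $m \times N$ and $B$ is $L \times N$, so the solution is $m \times L$ (all sizes here are illustrative):

```python
import numpy as np

def solve_M(phi_X, B, rho):
    """Eq. (8): closed-form regularized least squares for M,
    with phi_X of shape (m, N) and B of shape (L, N)."""
    m = phi_X.shape[0]
    A = phi_X @ phi_X.T + rho * np.eye(m)   # rho*I keeps A invertible
    return np.linalg.solve(A, phi_X @ B.T)  # (m, L); solve beats an explicit inverse

m, L, N = 100, 64, 500
rng = np.random.default_rng(0)
phi_X = rng.standard_normal((m, N))
B = np.sign(rng.standard_normal((L, N)))    # fixed binary codes
M = solve_M(phi_X, B, rho=1e-3)
```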
17 Alternating Optimization
W step: classifier optimization
Fix the binary codes B and the embedding function F:
$$\min_W -\frac{1}{N} \sum_{i=1}^{N} \mathbb{1}(y_i)^T \log \frac{e^{w_k^T b_i}}{\sum_{j=1}^{C} e^{w_j^T b_i}} + \lambda \|W\|_F^2 \tag{9}$$
Optimized by gradient descent with momentum:
$$v^{(t)} = \theta v^{(t-1)} + \alpha \frac{\partial \mathcal{L}}{\partial w_k^{(t-1)}}, \qquad w_k^{(t+1)} = w_k^{(t)} - v^{(t)}, \quad k = 1, \dots, C \tag{10}$$
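The momentum update of Eq. (10), sketched on a toy quadratic (the gradient here is a stand-in, not the actual classifier gradient):

```python
import numpy as np

def momentum_step(w_k, v, grad, theta=0.9, alpha=5e-3):
    """Eq. (10): heavy-ball update for one classifier column w_k."""
    v = theta * v + alpha * grad  # accumulate velocity from the gradient
    return w_k - v, v             # step against the accumulated direction

w_k = np.zeros(64)
v = np.zeros(64)
for _ in range(100):
    grad = 2.0 * w_k - 1.0        # stand-in gradient of a toy quadratic
    w_k, v = momentum_step(w_k, v, grad)
print(w_k[:4])                    # approaches the toy minimizer 0.5
```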
18 Alternating Optimization
B step: binary code optimization
Fix W and F:
$$\min_{b_i} \frac{1}{N} \sum_{i=1}^{N} \Big( -\big(\mathbb{1}(y_i)^T W^T + 2\gamma F(x_i)^T\big) b_i + \log \sum_{j=1}^{C} e^{w_j^T b_i} \Big) \quad \text{s.t. } b_i \in \{-1, +1\}^L, \ i = 1, \dots, N \tag{11}$$
where $\log \sum_{j=1}^{C} e^{w_j^T b_i}$ in problem (11) is a Log-Sum-Exp (LSE) function, bounded by
$$\max\{x_1, \dots, x_n\} \le \mathrm{LSE}(x_1, \dots, x_n) \le \max\{x_1, \dots, x_n\} + \log(n) \tag{12}$$
19 Alternating Optimization
As a result, Eq. 11 can be approximated as
$$\min_{b_i} \frac{1}{N} \sum_{i=1}^{N} \Big( -\big(\mathbb{1}(y_i)^T W^T + 2\gamma F(x_i)^T\big) b_i + \max_j \{w_j^T b_i\} \Big) \quad \text{s.t. } b_i \in \{-1, +1\}^L, \ i = 1, \dots, N \tag{13}$$
This is an NP-hard problem; we propose a greedy, sub-optimal solution.
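The slide does not spell out the greedy rule, so the sketch below assumes one plausible reading: a coordinate-descent pass over the bits that flips a bit whenever the flip lowers the max-approximated objective of Eq. (13):

```python
import numpy as np

def greedy_b_step(b, W, f_x, y, gamma):
    """One greedy pass for Eq. (13): flip a bit whenever the flip lowers
    the max-approximated objective (exact optimization over {-1,+1}^L
    is NP-hard, so this is a sub-optimal heuristic)."""
    def objective(code):
        linear = -(W[:, y] + 2.0 * gamma * f_x) @ code  # linear term of Eq. (13)
        return linear + np.max(W.T @ code)              # LSE replaced by max, Eq. (12)
    for l in range(len(b)):
        flipped = b.copy()
        flipped[l] = -flipped[l]
        if objective(flipped) < objective(b):
            b = flipped
    return b

L, C = 16, 4
rng = np.random.default_rng(0)
W = rng.standard_normal((L, C))
b = np.sign(rng.standard_normal(L))
b = greedy_b_step(b, W, f_x=rng.standard_normal(L), y=2, gamma=0.5)
```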
20 Experiment
Datasets: CIFAR-10, the BMW dataset, and the Oxford 17-category flower dataset.
Exp. 1: Classification task

Methods              Testing Accuracy   Training Time (sec)
KSH (5,000 tr)       91.5%              1720
FastHash             92.3%              609
SDH                  92.0%              33.4
CCA-ITQ              91.8%              3.2
ResNet Feature       92.4%              -
CE-Bits (5,000 tr)   92.1%              3.1
CE-Bits              92.4%              22.1

Table: The testing accuracy of different methods on the CIFAR-10 dataset (ResNet features); all binary codes are 64 bits.
21 Experiment

Methods       Testing Accuracy   Training Time (sec)
KSH           87.4%              83.1
FastHash      88.5%              38.0
SDH           87.9%              0.71
CCA-ITQ       88.5%              7.67
VGG Feature   88.8%              -
CE-Bits       88.6%              1.12

Table: The testing accuracy of different methods on the Oxford 17-category flower dataset (VGG features); all binary codes are 64 bits.
22 Experiment

Methods    Testing Accuracy   Training Time (sec)
KSH        93.8%              18.4
FastHash   91.1%              14.8
SDH        95.9%              0.15
CCA-ITQ    92.9%              1.17
SURF       94.7%              -
CE-Bits    97.2%              0.31

Table: The testing accuracy of different methods on the BMW dataset (SURF features); all binary codes are 64 bits.
23 Experiment
Exp. 2: Retrieval task (CIFAR-10, ResNet features; methods compared: CE-Bits, SDH, KSH, CCA-ITQ, FastHash, across code widths)
Figure: Comparison of precision achieved by different methods within a Hamming radius of 2.
Figure: Comparison of mAP achieved by different methods within a Hamming radius of 2.
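For reference, the metric behind these plots, precision within a Hamming radius of 2, can be sketched as follows (random codes and labels stand in for real data):

```python
import numpy as np

def precision_within_radius(query, query_label, db, db_labels, r=2):
    """Precision of items retrieved within Hamming radius r of a query:
    codes are {-1,+1}^L rows, so distance = number of mismatched bits."""
    dist = np.sum(db != query, axis=1)   # Hamming distance to every item
    hits = db_labels[dist <= r]          # labels of the retrieved items
    return float(np.mean(hits == query_label)) if hits.size else 0.0

rng = np.random.default_rng(0)
db = np.sign(rng.standard_normal((1000, 64)))
db_labels = rng.integers(0, 10, size=1000)
print(precision_within_radius(db[0], db_labels[0], db, db_labels))
```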
24 Experiment
Convergence
Figure: The convergence (cost vs. iteration, log scale) of CE-Bits on CIFAR-10 during training with learning rate α = 5e-3. The code width is 64 bits.
25 End-to-end Learning Binary Representation with Direct Binary Embedding
26 Learning with Deep Architectures
Problem formulation:
$$\min_{W, F} \frac{1}{N} \sum_{i=1}^{N} \Big( \mathcal{L}(W^T b_i, y_i) + \lambda \|b_i - F(I_i; \Omega)\|_2^2 \Big) \quad \text{s.t. } b_i = \mathrm{threshold}(F(I_i; \Omega), 0.5) \tag{14}$$
$$F(I; \Omega) = f_{DBE}\big(f_n(\cdots f_2(f_1(I; \omega_1); \omega_2) \cdots; \omega_n); \omega_{DBE}\big) \tag{15}$$
Similar continuous relaxation, with a quantization term that pushes the relaxed activations toward binary values:
$$\min_{W, F} \frac{1}{N} \sum_{i=1}^{N} \Big( \mathcal{L}\big(W^T F(I_i; \Omega), y_i\big) + \lambda \big\| \left| 2F(I_i; \Omega) - \mathbf{1} \right| - \mathbf{1} \big\|^2 \Big) \tag{16}$$
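A small sketch of the binarization and relaxation, assuming the Eq. (16) regularizer is read as pushing $|2F - 1|$ toward 1 (i.e., activations toward $\{0, 1\}$); the exact form of that term was garbled in transcription:

```python
import numpy as np

F_out = np.random.default_rng(0).random((8, 64))  # DBE activations in [0, 1]
b = (F_out > 0.5).astype(np.float32)              # Eq. (14): threshold at 0.5

# Assumed reading of the Eq. (16) regularizer: |2F - 1| equals 1 exactly
# when every activation is 0 or 1, so penalizing its deviation from 1
# pushes the relaxed output toward binary values.
penalty = np.square(np.abs(2.0 * F_out - 1.0) - 1.0).mean()
print(penalty)
```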
27 Direct Binary Embedding
$$Z = f_{DBE}(X) = \tanh\big(\mathrm{ReLU}(\mathrm{BN}(X W_{DBE} + b_{DBE}))\big) \tag{17}$$
Pipeline: image I → DCNN → feature X → linear layer (W_DBE, b_DBE) → BN → ReLU → tanh → Z = F(I; Ω)
The benefit of the DBE layer approximating binary code is three-fold:
1. batch normalization mitigates training with the saturating nonlinearity and potentially promotes a more effective binary representation;
2. the ReLU activation is sparse and learns bit 0 inherently;
3. the tanh activation bounds the ramp of the ReLU activation and learns bit 1 effectively without jeopardizing the sparsity of ReLU.
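A minimal PyTorch sketch of the DBE layer in Eq. (17); the feature width and code length below are arbitrary choices, not values from the slides:

```python
import torch
import torch.nn as nn

class DBELayer(nn.Module):
    """Eq. (17): Z = tanh(ReLU(BN(X W_DBE + b_DBE))). A sketch; the
    feature width d and code length L are free choices here."""
    def __init__(self, d: int, L: int):
        super().__init__()
        self.fc = nn.Linear(d, L)    # X W_DBE + b_DBE
        self.bn = nn.BatchNorm1d(L)  # eases training with the saturating tanh
    def forward(self, x):
        # ReLU zeroes negative activations (learns bit 0); tanh bounds
        # the positive ramp (learns bit 1) without hurting sparsity.
        return torch.tanh(torch.relu(self.bn(self.fc(x))))

z = DBELayer(d=2048, L=64)(torch.randn(16, 2048))  # activations in [0, 1)
code = (z > 0.5).float()                           # binarize by thresholding
```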
28 Classification
Multiclass classification:
$$\min_{W, F} -\frac{1}{N} \sum_{i=1}^{N} \sum_{k=1}^{C} \mathbb{1}(y_i = k) \log \frac{e^{w_k^T F(I_i; \Omega)}}{\sum_{j=1}^{C} e^{w_j^T F(I_i; \Omega)}} \quad \text{s.t. Eq. (15)} \tag{18}$$
Multilabel classification (joint cross entropy): a softmax term averaged over the $c^+$ positive labels of each image, plus a weighted binary cross entropy term with weight $\rho$ on positive labels and balance $\nu$, where $\mathbb{1}_p(y_i)$ is the p-th entry of the multi-hot label vector:
$$\min_{W, F} -\frac{1}{N} \sum_{i=1}^{N} \frac{1}{c^+} \sum_{j \in y_i} \log \frac{e^{w_j^T F(I_i; \Omega)}}{\sum_{p=1}^{C} e^{w_p^T F(I_i; \Omega)}} - \nu \frac{1}{N} \sum_{i=1}^{N} \sum_{p=1}^{C} \Big[ \rho \, \mathbb{1}_p(y_i) \log \frac{e^{w_p^T F(I_i; \Omega)}}{1 + e^{w_p^T F(I_i; \Omega)}} + \big(1 - \mathbb{1}_p(y_i)\big) \log \frac{1}{1 + e^{w_p^T F(I_i; \Omega)}} \Big] \quad \text{s.t. Eq. (15)} \tag{19}$$
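A sketch of the weighted binary cross entropy term of Eq. (19) as reconstructed above; $\rho$ up-weights the sparse positive labels, and the hyperparameter values here are illustrative:

```python
import torch

def weighted_bce(scores, targets, rho=2.0):
    """Weighted binary cross entropy over C labels, the second term of
    Eq. (19) as reconstructed above: positives are up-weighted by rho
    to counter label sparsity (rho and nu are hyperparameters)."""
    p = torch.sigmoid(scores)                     # e^s / (1 + e^s)
    pos = rho * targets * torch.log(p + 1e-12)    # positive-label term
    neg = (1 - targets) * torch.log(1 - p + 1e-12)
    return -(pos + neg).sum(dim=1).mean()

scores = torch.randn(4, 80)                       # W^T F(I; Omega), C = 80
targets = (torch.rand(4, 80) < 0.1).float()       # sparse multilabel truth
print(weighted_bce(scores, targets))
```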
29 Toy Example
MNIST with LeNet
Figure: The histogram of DBE layer activations.
Figure: The convergence (log loss vs. epochs) of the original LeNet and of DBE-LeNet trained on MNIST.
30 Toy Example

Method            LeNet   DBE-LeNet   SDH   FastHash
testing acc (%)

Table: Comparison of testing accuracy on MNIST. Code length for all hashing algorithms is 64 bits. LeNet features (1000-d continuous vectors) are used for SDH and FastHash.

λ                 0   1e-4   1e-3   1e-2   1e-1
testing acc (%)

Table: The impact of the quantization-error coefficient λ.
31 Experiment
Evaluate the proposed DBE layer with a deep residual network (ResNet).
Datasets: CIFAR-10 (50K training, 10K test) and MS COCO (83K training, 40K test).
Exp. 1: Classification

Methods      Testing Accuracy (%)
CCA-ITQ
FastHash
SDH
DLBHC
ResNet
DBE (ours)

Table: The testing accuracy of different methods on the CIFAR-10 dataset. All binary representations have a code length of 64 bits.
32 Experiment
Performance w.r.t. different code lengths:

Code length (bits)
testing acc (%)

Table: Classification accuracy of DBE on the CIFAR-10 dataset across different code lengths.
33 Experiment
Exp. 2: Natural object retrieval and multilabel image retrieval

Method       mAP per code length (bits)
CCA-ITQ
FastHash
SDH
DSH
DSRH
DLBHC
DBE (ours)

Table: Comparison of mean average precision (mAP) on CIFAR-10.
34 Experiment

Method       mAP per code length (bits)
CCA-ITQ
CMFH
CCA-ACQ
DHN
DBE (ours)

Table: Comparison of mean average precision (mAP) on COCO.
35 Experiment
Exp. 3: Multilabel image annotation

Method                              O-P   O-R   O-F1
WARP
DBE-Softmax
DBE-weighted binary cross entropy
DBE-joint cross entropy

Table: Performance comparison on COCO for K = 3. The code length for all DBE methods is 64 bits.
36 THANK YOU
END-TO-END BINARY REPRESENTATION LEARNING VIA DIRECT BINARY EMBEDDING
Liu Liu, Alireza Rahimpour, Ali Taalimi, Hairong Qi
Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville