Multiscale methods for neural image processing. Sohil Shah, Pallabi Ghosh, Larry S. Davis, Tom Goldstein; Hao Li, Soham De, Zheng Xu, Hanan Samet


1 Multiscale methods for neural image processing. Sohil Shah, Pallabi Ghosh, Larry S. Davis, Tom Goldstein; Hao Li, Soham De, Zheng Xu, Hanan Samet

2 A TALK IN TWO ACTS. Part I: Stacked U-nets. The globalization problem for segmentation, and simple ways to solve it. Part II: Quantized neural nets. Putting neural nets on portable devices; binarized training methods, and why they work.

3 PROBLEM SETTING PASCAL Visual Object Classes (VOC) Cityscapes

4 SIMPLE APPROACHES. Conventional classification networks: designed for the classification task; parameter heavy; low-resolution output; minimal exchange of context. Dilated classification networks.

5 WHAT MAKES SEGMENTATION HARD? Field of view problem High-resolution outputs Fine scale localization required Small data

6 TV IN 2D
Discrete gradient: $(\nabla x)_{ij} = (x_{i+1,j} - x_{i,j},\; x_{i,j+1} - x_{i,j})$
Anisotropic TV: $|(\nabla x)_{ij}| = |x_{i+1,j} - x_{i,j}| + |x_{i,j+1} - x_{i,j}|$
Isotropic TV: $\|(\nabla x)_{ij}\| = \sqrt{(x_{i+1,j} - x_{ij})^2 + (x_{i,j+1} - x_{ij})^2}$
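To make the two definitions concrete, here is a minimal NumPy sketch (my illustration, not from the talk) of both variants:

```python
import numpy as np

def total_variation(x, isotropic=True):
    """Discrete TV of a 2D array x, using forward differences."""
    dx = x[1:, :-1] - x[:-1, :-1]   # x_{i+1,j} - x_{i,j}
    dy = x[:-1, 1:] - x[:-1, :-1]   # x_{i,j+1} - x_{i,j}
    if isotropic:
        return np.sqrt(dx**2 + dy**2).sum()
    return (np.abs(dx) + np.abs(dy)).sum()
```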

7 CONVEX SEGMENTATION: FINE SCALE. Solve speed is limited by the CFL condition: in $\min_u\, \mu\,\|\nabla u\| + \|u - f\|$, the solver's stable step size (and hence its speed) depends on the fine-grid discretization of $\nabla u$.

8 CONVEX SEGMENTATION: COARSE SCALE. TV changes the CFL condition: in $\min_u\, \mu\,\|\nabla u\| + \|u - f\|$, a coarser grid admits a larger stable step size.

9 SCALE AFFECTS SPEED! Segmentation using TV: 15 iterations, 280 iterations, 4500 iterations at successively finer scales. Solution: use multi-grid!
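The multi-grid fix: solve cheaply on a coarse grid, then use the result to warm-start the fine-scale solve. A hedged NumPy sketch of the idea, where `solve_tv(f, mu, init)` is a hypothetical stand-in for any iterative TV solver:

```python
import numpy as np

def downsample(f):
    # average 2x2 blocks (assumes even dimensions)
    return 0.25 * (f[0::2, 0::2] + f[1::2, 0::2] + f[0::2, 1::2] + f[1::2, 1::2])

def upsample(u):
    # nearest-neighbor prolongation back to the fine grid
    return np.kron(u, np.ones((2, 2)))

def multigrid_tv(f, mu, solve_tv, levels=3):
    # coarse grids converge fast, so recurse down before solving fine
    if levels == 0 or min(f.shape) < 8:
        return solve_tv(f, mu, init=f)
    u_coarse = multigrid_tv(downsample(f), mu, solve_tv, levels - 1)
    return solve_tv(f, mu, init=upsample(u_coarse))  # warm-started fine solve
```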

10 NEURAL NETS HAVE THE SAME PROBLEMS: far-away points need to talk to each other.

11 CLASSIFICATION NET. Globalize via pooling: the output label vector (ball, car, hat, grumpy cat) contains global info.

12 CLASSIFICATION NET. Globalize via convolution: with a high-res output, the CFL condition becomes the field-of-view problem.

13 FIELD OF VIEW BREAKDOWN (building vs. skyscraper). Example from Zhao et al., Pyramid Scene Parsing Network.

14 GLOBALIZATION METHODS THINGS GET COMPLICATED

15 APPROACHES TO GLOBALIZATION
Dilated/de-conv modules: Chen et al., Semantic image segmentation, 2014; Yu & Koltun, Multi-scale context aggregation, 2015.
Multi-scale feature ensembling: Long et al., Fully convolutional networks for semantic segmentation, 2015; Chen et al., Attention to scale, 2016; Xia et al., Zoom better to see clearer, 2016; Hariharan et al., Hypercolumns for object segmentation, 2015.
Hand-crafted features: Lazebnik, Schmid & Ponce, Beyond bags of features, 2006; Lucchi et al., Are spatial constraints necessary for segmentation, 2011.
Chen et al., DeepLab; Zhao et al., PSPNet.

16 SEGMENTATION NET: standard net + de-conv units. High-resolution output, low localization. L.-C. Chen et al., DeepLab.

17 DEEPLAB: CRF or graph-cuts segmentation on top. L.-C. Chen et al., DeepLab.

18 PSPNET: high memory requirements. Hengshuang Zhao et al., Pyramid Scene Parsing Network.

19 BACK TO BASICS A NO-FRILLS APPROACH

20 MULTI-GRID SOLVERS

21 THE U-NET: a V-cycle for neural nets (pool, pool, ..., deconv, deconv). Ronneberger et al., 2015.
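A minimal PyTorch sketch of that V-cycle (my illustration, not the paper's code; the channel count `c` and the single down/up level are placeholders):

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    def __init__(self, c=16):
        super().__init__()
        self.down = conv_block(c, c)
        self.pool = nn.MaxPool2d(2)                      # coarsen
        self.bottom = conv_block(c, c)
        self.up = nn.ConvTranspose2d(c, c, 2, stride=2)  # "deconv" back up
        self.out = conv_block(2 * c, c)

    def forward(self, x):
        skip = self.down(x)
        y = self.up(self.bottom(self.pool(skip)))
        return self.out(torch.cat([skip, y], dim=1))     # fuse fine + coarse
```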

22 MEDICAL IMAGING: resolution is high, class complexity is low. Ronneberger et al., 2015.

23 NEURAL NET BUILDING BLOCKS: input image → Conv → ReLU → Conv → ReLU

24 RESNET BLOCK: input image → Conv → ReLU → Conv → ReLU → + (identity skip adds the input back)
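As a sketch, the block above in PyTorch (my illustration):

```python
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c, c, 3, padding=1), nn.ReLU(),
            nn.Conv2d(c, c, 3, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return x + self.body(x)   # the "+" in the slide: identity skip
```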

25 MODIFIED U-NET BLOCK: BN → ReLU → Conv units arranged in a V-cycle over scales 1, 2, 4, 8 and back down.

26 BIG IDEA: STACKED U-NETS. Combine the information globalization of U-nets with the power of ResNets. No frills!

27 TRAINING ON REAL DATASETS

28 LIMITED TRAINING DATA. ImageNet: 1,000,000 images; MS-COCO: 90,000; PASCAL VOC: 1,464. Solution: pre-train on big datasets, fine-tune on small ones.

29 STANDARD APPROACH Phase I: Train a classification net ball car hat grumpy cat

30 STANDARD APPROACH Phase II: add fancy stuff on top of the standard net. Roughly doubles the parameters!

31 DILATED SUNETS. Convert a conventional classification network into a dilated one: remove pooling layers and increase the dilation factor of the convolutions. Exact same parameters! See the sketch below.
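A hedged PyTorch sketch of that swap, assuming 3x3 kernels; the helper name `dilate` is mine:

```python
import torch.nn as nn

def dilate(conv: nn.Conv2d, rate: int) -> nn.Conv2d:
    # For a 3x3 kernel, padding = dilation preserves spatial size.
    new = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                    stride=1, padding=rate, dilation=rate,
                    bias=conv.bias is not None)
    new.load_state_dict(conv.state_dict())  # exact same parameters!
    return new
```

Only the stride and dilation change; the weights are copied over verbatim, which is why the parameter count is untouched.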

32 RESULTS

33 METRICS. Classification: Top-1 accuracy, Top-5 accuracy. Segmentation: intersection over union (IoU).
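For concreteness, a minimal sketch of mean IoU over integer label maps (my illustration):

```python
import numpy as np

def miou(pred, gt, num_classes):
    """Mean intersection-over-union over classes present in pred or gt."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)
    return np.mean(ious)
```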

34 RESULTS
Classification performance: ResNet, DenseNet, and SUNet variants compared by Top-1/Top-5 accuracy, depth, and parameter count (numeric table entries did not survive transcription).
Segmentation performance (mIoU): ResNet-101 [218] vs. SUNet variants.
Takeaways: +4.5% mIoU with 7x fewer parameters; higher mIoU without sacrificing classification performance.
Sohil Shah, Pallabi Ghosh, Larry S. Davis and Tom Goldstein, Stacked U-Nets: A No-Frills Approach to Natural Image Segmentation

35 SEGMENTATION RESULTS
Training on train+val (2x data), validation mIoU: ResNet+ASPP, Xception+ASPP+Decoder, SUNet (scores not recovered in transcription).
Performance comparison on PASCAL VOC 2012 validation set, training on train-set (mIoU):
Piecewise (VGG16) [224]: 78.0
LRR+CRF [209]: 77.3
DeepLabv2+CRF [210]: 79.7
Large-Kernel+CRF [220]: 82.2
Deep Layer Cascade [254]: 82.7
Understanding Conv [221]: 83.1
RefineNet [205]: 82.4
RefineNet-ResNet152 [205]: 83.4
PSPNet [217]: 85.4
SUNet: (score not recovered)
Advantages: 1/2 the weights, 1/5 the RAM

36 SAMPLE RESULTS

37 SAMPLE RESULTS

38 SAMPLE RESULTS

39 SAMPLE RESULTS

40 FAILURE CASE

41 FAILURE CASE

42 SUMMARY OF PART I: the CFL condition for convex segmentation; the FOV problem for neural nets; U-net blocks globalize information better; we can compete with SOTA using simple architectures; SUNets use fewer parameters and train fast.

43 QUESTIONS?

44 QUANTIZED NETS

45 DEEP NETS ARE BIG: too big for low-power devices. Canziani et al., An Analysis of Deep Neural Network Models for Practical Applications, arXiv 2016.

46 SOLUTION: QUANTIZED/LOW-PRECISION NETWORKS (weights in ±1). BinaryConnect [Courbariaux, NIPS 15]; BinaryNet [Hubara, NIPS 16]; XNOR-Net [Rastegari, ECCV 16]; DoReFa-Net [Zhou, arXiv 16]; Deep Compression [Han, ICLR 16]. Advantages: FAST and hardware friendly (no multiplications); low storage costs; low power consumption.

47 HOW TO TRAIN QUANTIZED NETS? Minimize $f(w)$.
Non-quantized: stochastic gradient descent, $w^{k+1} = w^k - \alpha\,\nabla f(w^k)$.
Fully quantized: stochastic rounding [Gupta, ICML 15], $w^{k+1} = w^k - Q\big[\alpha\,\nabla f(w^k)\big]$.
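A hedged PyTorch sketch of the two ingredients: an unbiased stochastic-rounding quantizer Q and the fully quantized step (the grid spacing `delta` and the function names are my assumptions):

```python
import torch

def stochastic_round(x, delta=1.0):
    # Unbiased rounding to a grid of spacing delta: E[Q(x)] = x.
    lo = torch.floor(x / delta) * delta
    p = (x - lo) / delta                   # probability of rounding up
    return lo + delta * torch.bernoulli(p)

def sr_step(w, grad, alpha, delta=1.0):
    # Fully quantized update: w never leaves the grid.
    return w - stochastic_round(alpha * grad, delta)
```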

48 EXPERIMENT. Train CNNs (VGG-Net, ResNets, Wide-ResNet) with binary weights on CIFAR-10. (Plot legend: Deterministic Rounding, Stochastic Rounding, Full-Precision.)

49 HOW TO TRAIN QUANTIZED NETS? Minimize $f(w)$.
Fully quantized: stochastic rounding [Gupta, ICML 15], $w^{k+1} = w^k - Q\big[\alpha\,\nabla f(w^k)\big]$.
Semi-quantized: BinaryConnect [Courbariaux, NIPS 15], $w^{k+1} = w^k - \alpha\,\nabla f\big(Q[w^k]\big)$, where $w^k$ is kept floating point.
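By contrast, a hedged sketch of the BinaryConnect step; `grad_fn` is a hypothetical stand-in for backprop through the network at given weights:

```python
import torch

def bc_step(w_float, grad_fn, alpha):
    w_bin = torch.sign(w_float)       # Q[w]: binarize for the forward pass
    g = grad_fn(w_bin)                # gradient evaluated at the binary point
    return w_float - alpha * g        # small updates accumulate in float
```

The design point: quantization happens only inside the gradient evaluation, so tiny gradient steps are not rounded away between iterations.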

50 EXPERIMENT. Train CNNs (VGG-Net, ResNets, Wide-ResNet) with binary weights on CIFAR-10. (Plot legend: Deterministic Rounding, Stochastic Rounding, BinaryConnect, Full-Precision.) The SR method cannot beat BC. Why?

51 BINARIZED NETS GO MAINSTREAM. Hubara et al., Binarized neural nets; Rastegari et al., XNOR-Net; Zhou et al., DoReFa-Net: training low precision; Lin et al., Nets with few multiplications; Lin, Zhao, Pan, Towards accurate binary; Shayer et al., Learning discrete weights; Park, Ahn, Yoo, Weighted entropy quantization; Mishra et al., WRPN: wide reduced-precision nets; Zhu et al., Trained ternary quantization; Umuroglu et al., FINN: a framework; Kim & Smaragdis, Bitwise nets; and tons more.

52 CONVERGENCE UNDER CONVEXITY ASSUMPTIONS

53 CONVERGENCE THEORY FOR STOCHASTIC ROUNDING (fully quantized)
Theorem 2. Assume that $F$ is $\mu$-strongly convex, the learning rates decrease as $\alpha_k \sim 1/(\mu k)$, and $G$ bounds the gradient magnitude. Then
$$\mathbb{E}[F(\bar w_T) - F(w^\star)] \;\le\; \frac{(1+\log(T+1))\,G^2}{2\mu T} \;+\; \frac{\sqrt{d}}{2}\,\Delta\,G,$$
where $\Delta$ is the quantization step. The first term decays with the iteration count $T$; the second is a noise floor that does not.

55 CONVERGENCE THEORY FOR BINARYCONNECT (semi-quantized)
Theorem 1. Assume that $F$ is $\mu$-strongly convex, the learning rates decrease as above, and $G$ bounds the gradient magnitude. Then BinaryConnect obeys the same decaying term, with a noise floor that depends on $L_2$, a Lipschitz constant for the Hessian.

56 But this can't be the whole story. What can we say about these methods on NON-CONVEX problems? The answer has to do with exploration vs. exploitation.

57-60 FLOATING POINT (animation of successive gradient-descent steps). Learning rate = 1; quantized scalar weight; gradient $\nabla f(w)$.

61-63 FLOATING POINT (animation). Shrinking learning rate; histogram of the iterates.

64-66 FULLY QUANTIZED (animation). Shrinking learning rate; histogram of the iterates.

67 WHAT'S WRONG? For floating point and BinaryConnect, shrinking the learning rate moves training from exploration to exploitation. For stochastic rounding, shrinking the learning rate leaves it in exploration: it never starts exploiting.

68 MARKOV CHAIN INTERPRETATION. In weight space, the update $w^{k+1} = w^k - Q\big[\alpha\,\nabla f(w^k)\big]$ hops randomly between discrete states $w_1, w_2, \ldots$

69 MARKOV CHAIN INTERPRETATION. $w^{k+1} = w^k - Q\big[\alpha\,\nabla f(w^k)\big]$. Long-term dynamics are governed by the equilibrium distribution of this chain.

70 MARKOV CHAIN INTERPRETATION equilibrium distribution
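A toy 1D simulation (entirely my illustration) of this Markov chain view, using $f(w) = w^2/2$ with noisy gradients:

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_round(x):
    lo = np.floor(x)
    return lo + (rng.random() < (x - lo))   # round to the integer grid

w, alpha = 3.0, 0.1
samples = []
for _ in range(100_000):
    g = w + rng.normal()                    # noisy gradient of f(w) = w^2/2
    w = w - stochastic_round(alpha * g)     # fully quantized SR update
    samples.append(w)
# A histogram of `samples` approximates the equilibrium distribution; it
# stays spread out rather than collapsing onto the minimizer w = 0.
```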

71 WHAT ABOUT STOCHASTIC ROUNDING? Fully discrete stochastic rounding does not concentrate.
Theorem: if the noise satisfies
(i) tails aren't too fat: $\int_1^\infty p_{x,k}(z)\,dz < C\,\alpha^2$,
(ii) the noise is not concentrated at 0, and
(iii) the SGD iterates stay bounded,
then the iterates of fully quantized SGD do not concentrate on minimizers. But BinaryConnect's iterates do! The assumptions are weak enough for neural nets.

72 ANNEALING PROPERTIES MAKE A DIFFERENCE. Train CNNs (VGG-Net, ResNets, Wide-ResNet) with binary weights on CIFAR-10/100. (Plot legend: concentration/annealing, BinaryConnect, Full-Precision.)

73 SUMMARY OF PART II. Quantized nets reduce net size and energy. Training requires specialized rounding methods. We still don't understand the non-convexities of neural nets!

74 THANKS! Stacked U-Nets: A No-Frills Approach to Natural Image Segmentation. Sohil Shah, Pallabi Ghosh, Larry S. Davis and Tom Goldstein. Towards a Deeper Understanding of Training Quantized Networks. Hao Li, Soham De, Zheng Xu, Christoph Studer, Hanan Samet, Tom Goldstein.
