Towards Accurate Binary Convolutional Neural Network
Transcription
1 Paper: #261 Poster: Pacific Ballroom #101 Towards Accurate Binary Convolutional Neural Network Xiaofan Lin, Cong Zhao and Wei Pan* Photos and videos are either original work or taken from Wikimedia, under Creative Commons license
2 DJI drones use CNNs for many tasks
Challenges for DJI drones:
Large model size: hundreds of megabytes of floating-point weight values.
Expensive computation: billions of floating-point multiply-accumulate operations.
Limited resources for computation and power in DJI drones.
3 Compression of deep neural networks for mobile applications
Synapse and neuron pruning: sparse, irregular computation that is difficult to process efficiently.
Quantization: regular computation, smaller datapaths, fewer bits per weight and activation.
[1] S. Han, H. Mao, and W. J. Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. arXiv preprint.
[2] P. Molchanov, S. Tyree, T. Karras, et al. Pruning convolutional neural networks for resource efficient inference. ICLR.
[3] I. Hubara, M. Courbariaux, D. Soudry, R. El-Yaniv, and Y. Bengio. Quantized neural networks: Training neural networks with low precision weights and activations. arXiv preprint.
[4] S. Zhou, Y. Wu, Z. Ni, X. Zhou, H. Wen, and Y. Zou. DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients. arXiv preprint.
[5] A. G. Howard, M. Zhu, B. Chen, et al. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint.
[6] F. N. Iandola, S. Han, M. W. Moskewicz, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and < 0.5 MB model size. arXiv preprint, 2016.
4 Compression of deep neural networks for mobile applications
Floating point multiplication and accumulation is still the bottleneck!
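5 Binarized neural networks (BNNs)
The extreme case of quantization: binary weights and activations, +1 and -1 (using the sign() function).
Key computation: binary matrix multiplication and accumulation:
$x \cdot y = \mathrm{POPCOUNT}(x \ \mathrm{XNOR}\ y), \quad x_i, y_i \in \{-1, +1\},\ \forall i$
An example: $[1, -1] \cdot [-1, 1]^T$
Floating point operation: $1 \times (-1) + (-1) \times 1$
Bitwise operation: $\mathrm{POPCOUNT}(1 \ \mathrm{XNOR}\ (-1),\ (-1) \ \mathrm{XNOR}\ 1)$
With +1 stored as bit 1 and -1 as bit 0, POPCOUNT counts the positions where the two vectors agree, so the exact dot product of two length-n vectors is $2 \cdot \mathrm{POPCOUNT}(x \ \mathrm{XNOR}\ y) - n$.
XNOR truth table: inputs x and y, output x XNOR y (true exactly when x and y agree).
[7] M. Courbariaux, I. Hubara, D. Soudry, R. El-Yaniv, and Y. Bengio. Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or -1. ICML.
[8] M. Rastegari, V. Ordonez, J. Redmon, and A. Farhadi. XNOR-Net: ImageNet classification using binary convolutional neural networks. ECCV 2016.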
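The XNOR/POPCOUNT identity on this slide can be checked with a few lines of plain Python. This is a minimal sketch, not the authors' implementation: the helper names `pack_bits` and `binary_dot` are hypothetical, and the bit encoding (+1 as bit 1, -1 as bit 0) is an assumption.

```python
def pack_bits(values):
    """Encode a list of +1/-1 values as an int: +1 -> bit 1, -1 -> bit 0 (assumed encoding)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(x, y):
    """Dot product of two {-1,+1} vectors using only XNOR and POPCOUNT."""
    n = len(x)
    mask = (1 << n) - 1                              # keep only the n valid bits
    agree = ~(pack_bits(x) ^ pack_bits(y)) & mask    # XNOR: 1 where the entries agree
    matches = bin(agree).count("1")                  # POPCOUNT
    return 2 * matches - n                           # agreements minus disagreements


# The slide's example: [1, -1] . [-1, 1]^T
x = [1, -1]
y = [-1, 1]
float_result = sum(a * b for a, b in zip(x, y))      # 1*(-1) + (-1)*1
bit_result = binary_dot(x, y)
print(float_result, bit_result)
```

Both computations give the same value for the slide's example, and the bitwise path never performs a multiplication.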
6 Prediction accuracy with BNNs
Competitive on small benchmarks: MNIST (handwritten digits), SVHN (street numbers), CIFAR-10 (objects, 10 classes).
Too much loss on large benchmarks: ImageNet (objects, 1000 classes).

                                        MNIST    SVHN     CIFAR-10  ImageNet
Binary weights & activations            99.04%   97.47%   89.85%    51.2%
Full-precision weights & activations    99.06%   98.31%   92.38%    69.3%
Accuracy loss                           0.02%    0.84%    2.53%     18.1%

[8] M. Rastegari, V. Ordonez, J. Redmon, and A. Farhadi. XNOR-Net: ImageNet classification using binary convolutional neural networks. ECCV 2016.
7 Binary matrix multiplication
Observation: too much accuracy loss using sign() for binarization.
Real-value weights → binary weights: Top-1 accuracy 69.3% → 60.8%
Real-value inputs → binary inputs: Top-1 accuracy 69.3% → 51.2%
Plan: approximate the weights and activations more precisely.
Intuitive example: say we want to approximate a real number x = 1.512.
Base 1: $f_1(x) = \mathrm{sign}(x) \Rightarrow x > 0$
Bases 1 and 2: $f_1(x) = \mathrm{sign}(x),\ f_2(x) = \mathrm{sign}(x - 1) \Rightarrow x > 1$
Bases 1, 2 and 3: $f_1(x) = \mathrm{sign}(x),\ f_2(x) = \mathrm{sign}(x - 1),\ f_3(x) = \mathrm{sign}(x - 2) \Rightarrow 1 < x < 2$
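The intuitive example can be traced in code: each extra shifted sign() narrows the interval that is known to contain x. The sketch below is illustrative only. The function name `bracket` is hypothetical, and the shifts 0, 1, 2 are the ones from the slide.

```python
def sign(v):
    """sign() with sign(0) treated as +1 (an assumption for the boundary case)."""
    return 1 if v >= 0 else -1


def bracket(x, shifts=(0.0, 1.0, 2.0)):
    """Return the binary codes sign(x - s) and the interval they imply for x.

    shifts must be sorted in ascending order.
    """
    codes = [sign(x - s) for s in shifts]
    above = [s for s, c in zip(shifts, codes) if c == 1]    # x lies at or above these
    below = [s for s, c in zip(shifts, codes) if c == -1]   # x lies below these
    lower = max(above) if above else None
    upper = min(below) if below else None
    return codes, lower, upper


codes, lower, upper = bracket(1.512)
print(codes, lower, upper)
```

For x = 1.512 the three codes are +1, +1, -1, which places x between 1 and 2, matching the last line of the slide.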
8 Approximate full precision weights using shift parameters
Construct binary weight bases by shift:
$B_1, B_2, \ldots, B_M \in \{-1, +1\}^{w \times h \times c_{in} \times c_{out}}$
$B_i = F_{u_i}(W) := \mathrm{sign}(W - \mathrm{mean}(W) + u_i \, \mathrm{std}(W))$
The weights inside the sign() function are moved by the shift parameters $u_i$; the shift parameters can be learned.
Approximate full precision weights using binary bases:
$W \approx \alpha_1 B_1 + \alpha_2 B_2 + \cdots + \alpha_M B_M$
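A small NumPy sketch of the weight approximation follows. The base construction uses the sign(W - mean(W) + u_i std(W)) rule from the slide. Fitting the coefficients by least squares is an assumption made here for illustration, since the slide does not say how they are obtained. The shift values and the function names `weight_bases`, `fit_alphas` and `relative_error` are likewise hypothetical.

```python
import numpy as np


def weight_bases(W, shifts):
    """Binary bases B_i = sign(W - mean(W) + u_i * std(W)); sign(0) is mapped to +1."""
    centered = W - W.mean()
    std = W.std()
    return [np.where(centered + u * std >= 0, 1.0, -1.0) for u in shifts]


def fit_alphas(W, bases):
    """Least-squares coefficients so that sum_i alpha_i * B_i is close to W (assumed fitting rule)."""
    B = np.stack([b.ravel() for b in bases], axis=1)    # shape (num_weights, M)
    alphas, *_ = np.linalg.lstsq(B, W.ravel(), rcond=None)
    return alphas


def relative_error(W, shifts):
    """Relative Frobenius-norm error of the binary-base approximation of W."""
    bases = weight_bases(W, shifts)
    alphas = fit_alphas(W, bases)
    W_hat = sum(a * b for a, b in zip(alphas, bases))
    return np.linalg.norm(W - W_hat) / np.linalg.norm(W)


rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3, 4, 8))                 # toy w x h x c_in x c_out filter bank
err_one = relative_error(W, [0.0])                # a single base, plain sign()
err_three = relative_error(W, [-1.0, 0.0, 1.0])   # three shifted bases
print(f"1 base: {err_one:.3f}, 3 bases: {err_three:.3f}")
```

Because the three-base set contains the single base, the least-squares error with three bases cannot exceed the single-base error, which is the "more bases, better approximation" effect in miniature.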
9 Approximate full precision activations using shift parameters
Construct binary activation bases by shift:
1. $R = \mathrm{ReLU}(\mathrm{Input})$
2. $h_v(R) = \mathrm{clip}(R + v, 0, 1)$
3. $H_v(R) := 2\,\mathbb{I}_{h_v(R) \ge 0.5} - 1$
Approximate full precision activations using binary bases:
$A_1, \ldots, A_N = H_{v_1}(R), \ldots, H_{v_N}(R)$
$R \approx \beta_1 A_1 + \beta_2 A_2 + \cdots + \beta_N A_N$
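The three steps for building activation bases can be sketched in NumPy as below. The indicator threshold of 0.5 and the mapping to {-1, +1} are assumptions where the transcription is not fully legible. The shift values and the function name `activation_bases` are hypothetical.

```python
import numpy as np


def activation_bases(inputs, shifts):
    """Return R = ReLU(inputs) and one binary base per shift v."""
    R = np.maximum(inputs, 0.0)                        # step 1: ReLU
    bases = []
    for v in shifts:
        h = np.clip(R + v, 0.0, 1.0)                   # step 2: shift, then clip to [0, 1]
        bases.append(np.where(h >= 0.5, 1.0, -1.0))    # step 3: indicator mapped to {-1, +1}
    return R, bases


inputs = np.array([-0.3, 0.2, 0.6, 1.4])
R, bases = activation_bases(inputs, shifts=[0.0, 0.4, -0.4])
for v, A in zip([0.0, 0.4, -0.4], bases):
    print(v, A)
```

A positive shift makes more entries switch to +1 and a negative shift makes fewer do so, so the set of bases records where each activation sits relative to several thresholds rather than just one.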
13 Parallel & multiple binary convolution
$\mathrm{Conv}(W, R) \approx \mathrm{Conv}\left(\sum_{m=1}^{M} \alpha_m B_m,\ \sum_{n=1}^{N} \beta_n A_n\right) = \sum_{m=1}^{M} \sum_{n=1}^{N} \alpha_m \beta_n \, \mathrm{Conv}(B_m, A_n)$
The sum-sum operation can be computed in parallel.
Binary Conv: $x \cdot y = \mathrm{POPCOUNT}(x \ \mathrm{XNOR}\ y), \quad x_i, y_i \in \{-1, +1\},\ \forall i$
Advantages
1. Bitwise operations
2. More bases, better approximation
3. Parallel computation (sum-sum) is hardware friendly
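The sum-sum decomposition relies only on convolution being linear in each argument, which a short numerical check makes concrete. The sketch below uses 1-D signals and `np.convolve` as a stand-in for a CNN convolution layer. The sizes, the random bases and the names `alpha` and `beta` for the coefficients are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 3, 2                                   # number of weight / activation bases
B = rng.choice([-1.0, 1.0], size=(M, 5))      # binary weight bases (1-D filters)
A = rng.choice([-1.0, 1.0], size=(N, 16))     # binary activation bases (1-D signals)
alpha = rng.normal(size=M)                    # weight coefficients
beta = rng.normal(size=N)                     # activation coefficients

# Left-hand side: convolve the two real-valued approximations directly.
W_approx = np.tensordot(alpha, B, axes=1)     # sum_m alpha_m * B_m, shape (5,)
R_approx = np.tensordot(beta, A, axes=1)      # sum_n beta_n * A_n, shape (16,)
direct = np.convolve(R_approx, W_approx, mode="valid")

# Right-hand side: M * N independent binary convolutions, then a weighted sum.
sum_sum = sum(
    alpha[m] * beta[n] * np.convolve(A[n], B[m], mode="valid")
    for m in range(M)
    for n in range(N)
)

print(np.allclose(direct, sum_sum))
```

Each of the M × N inner convolutions involves only ±1 operands, so it could be computed with XNOR and POPCOUNT, and the inner convolutions do not depend on one another, which is what makes the scheme parallel. Note that `np.convolve` flips the kernel whereas CNN layers usually compute cross-correlation; both are bilinear, so the identity holds either way.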
14 Result
Models on the ImageNet benchmark. Columns: weight (bit), activation (bit), Accuracy (Top-1), Accuracy Loss.
Full-Precision ResNet-18 [full-precision weights and activations]: accuracy …%
BWN [full-precision activation] [8] Rastegari et al. (2016): accuracy …%, accuracy loss 7.5%
DoReFa-Net [1-bit weight and 4-bit activation] [4] Zhou et al. (2016): accuracy …%, accuracy loss 10.1%
XNOR-Net [binary weight and activation] [8] Rastegari et al. (2016): accuracy …%, accuracy loss 18.1%
BNN [binary weight and activation] [7] Courbariaux et al. (2016): accuracy …%, accuracy loss 27.1%
Ours [3 weight bases, 3 activation bases]: accuracy …%, accuracy loss 5.4%
Ours [5 weight bases, 5 activation bases]: accuracy …%, accuracy loss 4.3%
Full-Precision ResNet-34 [full-precision weights and activations]: accuracy …%
Ours [5 weight bases, 5 activation bases]: accuracy …%, accuracy loss 4.9%
15 Future work
1. Applicability to other tasks, e.g., object detection, parsing, face recognition, speech recognition, etc.
2. Hardware acceleration on mobile
3. Software acceleration on cloud
Paper: #261 Poster: Pacific Ballroom #101
Contact:
Thank You!