Channel Pruning and Other Methods for Compressing CNN
|
|
- Phillip Patrick Griffith
- 6 years ago
- Views:
Transcription
1 Channel Pruning and Other Methods for Compressing CNN Yihui He Xi an Jiaotong Univ. October 12, 2017 Yihui He (Xi an Jiaotong Univ.) Channel Pruning and Other Methods for Compressing CNN October 12, / 16
2 Background Recent CNN acceleration works fall into three categories: (1) optimized implementation (e.g., FFT, dictionary) (2) quantization (e.g., BinaryNet) (3) structured simplification (convert CNN structure into compact one) We focuses on the last one. Yihui He (Xi an Jiaotong Univ.) Channel Pruning and Other Methods for Compressing CNN October 12, / 16
3 Background W1 number of channels nonlinear W2 nonlinear W3 (a) (b) (c) (d) Structured simplification methods that accelerate CNNs: (a) a network with 3 conv layers. (b) sparse connection deacti-vates some connections between channels. (c) tensor factorization factorizes a convolutional layer into several pieces. (d) channel pruning reduces number of channels in each layer (focus of this paper). Yihui He (Xi an Jiaotong Univ.) Channel Pruning and Other Methods for Compressing CNN October 12, / 16
4 Sparse Connection S. Han, J. Pool, J. Tran, and W. Dally. Learning both weights and connections for efficient neural network. In Advances in Neural Information Processing Systems, pages , 2015 Yihui He (Xi an Jiaotong Univ.) Channel Pruning and Other Methods for Compressing CNN October 12, / 16
5 Tensor factorization Accelerating very deep convolutional networks for classification and detection. Speeding up convolutional neural networks with low rank expansions. Yihui He (Xi an Jiaotong Univ.) Channel Pruning and Other Methods for Compressing CNN October 12, / 16
6 Our approach A B C W c k h k w nonlinear c n nonlinear We aim to reduce the width of feature map B, while minimizing the reconstruction error on feature map C. Our optimization algorithm performs within the dotted box, which does not involve nonlinearity. This figure illustrates the situation that two channels are pruned for feature map B. Thus corresponding channels of filters W can be removed. Furthermore, even though not directly optimized by our algorithm, the corresponding filters in the previous layer can also be removed (marked by dotted filters). c, n: number of channels for feature maps B and C, k h k w : kernel size. Yihui He (Xi an Jiaotong Univ.) Channel Pruning and Other Methods for Compressing CNN October 12, / 16
7 Optimization Formally, to prune a feature map with c channels, we consider applying n c k h k w convolutional filters W on N c k h k w input volumes X sampled from this feature map, which produces N n output matrix Y. Here, N is the number of samples, n is the number of output channels, and k h, k w are the kernel size. For simple representation, bias term is not included in our formulation. To prune the input channels from c to desired c (0 c c), while minimizing reconstruction error, we formulate our problem as follow: 1 c 2 arg min β,w 2N Y β i X i W i (1) i=1 F subject to β 0 c F is Frobenius norm. X i is N k h k w matrix sliced from ith channel of input volumes X, i = 1,..., c. W i is n k h k w filter weights sliced from ith channel of W. β is coefficient vector of length c for channel selection, and β i is ith entry of β. Notice that, if β i = 0, X i will be no longer useful, which could be safely pruned from feature map. W i could also be removed. Yihui He (Xi an Jiaotong Univ.) Channel Pruning and Other Methods for Compressing CNN October 12, / 16
8 Optimization Solving this l 0 minimization problem in Eqn. 1 is NP-hard. we relax the l 0 to l 1 regularization: arg min β,w 1 2N Y c β i X i W i i=1 subject to β 0 c, i W i F = 1 2 F + λ β 1 λ is a penalty coefficient. By increasing λ, there will be more zero terms in β and one can get higher speed-up ratio. We also add a constrain i W i F = 1 to this formulation, which avoids trivial solution. Now we solve this problem in two folds. First, we fix W, solve β for channel selection. Second, we fix β, solve W to reconstruct error. (2) Yihui He (Xi an Jiaotong Univ.) Channel Pruning and Other Methods for Compressing CNN October 12, / 16
9 Optimization (i) The subproblem of β: In this case, W is fixed. We solve β for channel selection. ˆβ LASSO 1 c 2 (λ) = arg min β 2N Y β i Z i + λ β 1 (3) subject to β 0 c Here Z i = X i W i (size N n). We will ignore ith channels if β i = 0. (ii) The subproblem of W: In this case, β is fixed. We utilize the selected channels to minimize reconstruction error. We can find optimized solution by least squares: arg min W i=1 Y X (W ) 2 Here X = [β 1 X 1 β 2 X 2... β i X i... β c X c ] (size N ck h k w ). W is n ck h k w reshaped W, W = [W 1 W 2... W i... W c ]. After obtained result W, it is reshaped back to W. Then we assign β i β i W i F, W i W i / W i F. Constrain i W i F = 1 satisfies. Yihui He (Xi an Jiaotong Univ.) Channel Pruning and Other Methods for Compressing CNN October 12, / 16 F F (4)
10 Analysis Yihui He (Xi an Jiaotong Univ.) Channel Pruning and Other Methods for Compressing CNN October 12, / 16
11 Analysis 5 4 conv1_1 first k max response ours conv2_1 first k max response ours conv3_1 first k max response ours increase of error (%) conv3_2 first k max response ours conv4_1 first k max response ours conv4_2 first k max response ours increase of error (%) speed-up ratio speed-up ratio speed-up ratio Yihui He (Xi an Jiaotong Univ.) Channel Pruning and Other Methods for Compressing CNN October 12, / 16
12 Analysis Yihui He (Xi an Jiaotong Univ.) Channel Pruning and Other Methods for Compressing CNN October 12, / 16
13 Results Increase of top-5 error (1-view, baseline 89.9%) Solution Jaderberg (Zhang 2015 s impl.) Asym Filter pruning (fine-tuned, our impl.) Ours (without fine-tune) Ours (fine-tuned) Table 1: Accelerating the VGG-16 model using a speedup ratio of 2, 4, or 5 (smaller is better). Yihui He (Xi an Jiaotong Univ.) Channel Pruning and Other Methods for Compressing CNN October 12, / 16
14 Results Increase of top-5 error (1-view, 89.9%) Solution 4 5 Asym. 3D Asym. 3D (fine-tuned) Our 3C Our 3C (fine-tuned) Table 2: Performance of combined methods on the VGG-16 model using a speed-up ratio of 4 or 5. Our 3C solution outperforms previous approaches (smaller is better). Yihui He (Xi an Jiaotong Univ.) Channel Pruning and Other Methods for Compressing CNN October 12, / 16
15 Absolute performance Table 3. GPU acceleration comparison. We measure forward-pass time per image. Our approach generalizes well on GPU (smaller is better). Yihui He (Xi an Jiaotong Univ.) Channel Pruning and Other Methods for Compressing CNN October 12, / 16
16 Acc Detection 2, 4 acceleration for Faster R-CNN detection. mmap is AP at IoU=.50:.05:.95 (primary challenge metric of COCO [32]). Yihui He (Xi an Jiaotong Univ.) Channel Pruning and Other Methods for Compressing CNN October 12, / 16
arxiv: v2 [cs.cv] 21 Aug 2017 Abstract
Channel Pruning for Accelerating Very Deep Neural Networks Yihui He * Xi an Jiaotong University Xi an, 70049, China heyihui@stu.xjtu.edu.cn Xiangyu Zhang Megvii Inc. Beijing, 0090, China zhangxiangyu@megvii.com
More informationPRUNING CONVOLUTIONAL NEURAL NETWORKS. Pavlo Molchanov Stephen Tyree Tero Karras Timo Aila Jan Kautz
PRUNING CONVOLUTIONAL NEURAL NETWORKS Pavlo Molchanov Stephen Tyree Tero Karras Timo Aila Jan Kautz 2017 WHY WE CAN PRUNE CNNS? 2 WHY WE CAN PRUNE CNNS? Optimization failures : Some neurons are "dead":
More informationChannel Pruning for Accelerating Very Deep Neural Networks
Channel Pruning for Accelerating Very Deep Neural Networks Yihui He * Xi an Jiaotong University Xi an, 70049, China heyihui@stu.xjtu.edu.cn Xiangyu Zhang Megvii Inc. Beijing, 0090, China zhangxiangyu@megvii.com
More informationAccelerating Convolutional Neural Networks by Group-wise 2D-filter Pruning
Accelerating Convolutional Neural Networks by Group-wise D-filter Pruning Niange Yu Department of Computer Sicence and Technology Tsinghua University Beijing 0008, China yng@mails.tsinghua.edu.cn Shi Qiu
More informationEfficient DNN Neuron Pruning by Minimizing Layer-wise Nonlinear Reconstruction Error
Efficient DNN Neuron Pruning by Minimizing Layer-wise Nonlinear Reconstruction Error Chunhui Jiang, Guiying Li, Chao Qian, Ke Tang Anhui Province Key Lab of Big Data Analysis and Application, University
More informationTensor networks and deep learning
Tensor networks and deep learning I. Oseledets, A. Cichocki Skoltech, Moscow 26 July 2017 What is a tensor Tensor is d-dimensional array: A(i 1,, i d ) Why tensors Many objects in machine learning can
More informationLocal Affine Approximators for Improving Knowledge Transfer
Local Affine Approximators for Improving Knowledge Transfer Suraj Srinivas & François Fleuret Idiap Research Institute and EPFL {suraj.srinivas, francois.fleuret}@idiap.ch Abstract The Jacobian of a neural
More informationDeep Learning. Convolutional Neural Network (CNNs) Ali Ghodsi. October 30, Slides are partially based on Book in preparation, Deep Learning
Convolutional Neural Network (CNNs) University of Waterloo October 30, 2015 Slides are partially based on Book in preparation, by Bengio, Goodfellow, and Aaron Courville, 2015 Convolutional Networks Convolutional
More informationTYPES OF MODEL COMPRESSION. Soham Saha, MS by Research in CSE, CVIT, IIIT Hyderabad
TYPES OF MODEL COMPRESSION Soham Saha, MS by Research in CSE, CVIT, IIIT Hyderabad 1. Pruning 2. Quantization 3. Architectural Modifications PRUNING WHY PRUNING? Deep Neural Networks have redundant parameters.
More informationAccelerating Very Deep Convolutional Networks for Classification and Detection
Accelerating Very Deep Convolutional Networks for Classification and Detection Xiangyu Zhang, Jianhua Zou, Kaiming He, and Jian Sun arxiv:55.6798v [cs.cv] 6 May 5 Abstract This paper aims to accelerate
More informationDeep Learning: Approximation of Functions by Composition
Deep Learning: Approximation of Functions by Composition Zuowei Shen Department of Mathematics National University of Singapore Outline 1 A brief introduction of approximation theory 2 Deep learning: approximation
More informationCS 179: LECTURE 16 MODEL COMPLEXITY, REGULARIZATION, AND CONVOLUTIONAL NETS
CS 179: LECTURE 16 MODEL COMPLEXITY, REGULARIZATION, AND CONVOLUTIONAL NETS LAST TIME Intro to cudnn Deep neural nets using cublas and cudnn TODAY Building a better model for image classification Overfitting
More informationMini-project in scientific computing
Mini-project in scientific computing Eran Treister Computer Science Department, Ben-Gurion University of the Negev, Israel. March 7, 2018 1 / 30 Scientific computing Involves the solution of large computational
More informationNeural Network Approximation. Low rank, Sparsity, and Quantization Oct. 2017
Neural Network Approximation Low rank, Sparsity, and Quantization zsc@megvii.com Oct. 2017 Motivation Faster Inference Faster Training Latency critical scenarios VR/AR, UGV/UAV Saves time and energy Higher
More information<Special Topics in VLSI> Learning for Deep Neural Networks (Back-propagation)
Learning for Deep Neural Networks (Back-propagation) Outline Summary of Previous Standford Lecture Universal Approximation Theorem Inference vs Training Gradient Descent Back-Propagation
More informationFrequency-Domain Dynamic Pruning for Convolutional Neural Networks
Frequency-Domain Dynamic Pruning for Convolutional Neural Networks Zhenhua Liu 1, Jizheng Xu 2, Xiulian Peng 2, Ruiqin Xiong 1 1 Institute of Digital Media, School of Electronic Engineering and Computer
More informationConvolutional Neural Network Architecture
Convolutional Neural Network Architecture Zhisheng Zhong Feburary 2nd, 2018 Zhisheng Zhong Convolutional Neural Network Architecture Feburary 2nd, 2018 1 / 55 Outline 1 Introduction of Convolution Motivation
More informationLinear Methods for Regression. Lijun Zhang
Linear Methods for Regression Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Linear Regression Models and Least Squares Subset Selection Shrinkage Methods Methods Using Derived
More informationSpatial Transformer Networks
BIL722 - Deep Learning for Computer Vision Spatial Transformer Networks Max Jaderberg Andrew Zisserman Karen Simonyan Koray Kavukcuoglu Contents Introduction to Spatial Transformers Related Works Spatial
More informationIntroduction to Deep Learning CMPT 733. Steven Bergner
Introduction to Deep Learning CMPT 733 Steven Bergner Overview Renaissance of artificial neural networks Representation learning vs feature engineering Background Linear Algebra, Optimization Regularization
More informationCompressed Sensing and Neural Networks
and Jan Vybíral (Charles University & Czech Technical University Prague, Czech Republic) NOMAD Summer Berlin, September 25-29, 2017 1 / 31 Outline Lasso & Introduction Notation Training the network Applications
More informationExploiting Linear Structure Within Convolutional Networks for Efficient Evaluation
Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation Emily Denton, Wojciech Zaremba, Joan Bruna, Yann LeCun and Rob Fergus Dept. of Computer Science, Courant Institute, New
More informationIntroduction to Convolutional Neural Networks 2018 / 02 / 23
Introduction to Convolutional Neural Networks 2018 / 02 / 23 Buzzword: CNN Convolutional neural networks (CNN, ConvNet) is a class of deep, feed-forward (not recurrent) artificial neural networks that
More informationConvolutional neural networks
11-1: Convolutional neural networks Prof. J.C. Kao, UCLA Convolutional neural networks Motivation Biological inspiration Convolution operation Convolutional layer Padding and stride CNN architecture 11-2:
More informationTETRIS: TilE-matching the TRemendous Irregular Sparsity
TETRIS: TilE-matching the TRemendous Irregular Sparsity Yu Ji 1,2,3 Ling Liang 3 Lei Deng 3 Youyang Zhang 1 Youhui Zhang 1,2 Yuan Xie 3 {jiy15,zhang-yy15}@mails.tsinghua.edu.cn,zyh02@tsinghua.edu.cn 1
More informationImproved Bayesian Compression
Improved Bayesian Compression Marco Federici University of Amsterdam marco.federici@student.uva.nl Karen Ullrich University of Amsterdam karen.ullrich@uva.nl Max Welling University of Amsterdam Canadian
More informationAuto-balanced Filter Pruning for Efficient Convolutional Neural Networks
Auto-balanced Filter Pruning for Efficient Convolutional Neural Networks Xiaohan Ding, 1 Guiguang Ding, 1 Jungong Han, 2 Sheng Tang 3 1 School of Software, Tsinghua University, Beijing 100084, China 2
More informationΝεςπο-Ασαυήρ Υπολογιστική Neuro-Fuzzy Computing
Νεςπο-Ασαυήρ Υπολογιστική Neuro-Fuzzy Computing ΗΥ418 Διδάσκων Δημήτριος Κατσαρός @ Τμ. ΗΜΜΥ Πανεπιστήμιο Θεσσαλίαρ Διάλεξη 21η BackProp for CNNs: Do I need to understand it? Why do we have to write the
More informationAnalyzing the Performance of Multilayer Neural Networks for Object Recognition
Analyzing the Performance of Multilayer Neural Networks for Object Recognition Pulkit Agrawal, Ross Girshick, Jitendra Malik {pulkitag,rbg,malik}@eecs.berkeley.edu University of California Berkeley Supplementary
More informationTENSOR LAYERS FOR COMPRESSION OF DEEP LEARNING NETWORKS. Cris Cecka Senior Research Scientist, NVIDIA GTC 2018
TENSOR LAYERS FOR COMPRESSION OF DEEP LEARNING NETWORKS Cris Cecka Senior Research Scientist, NVIDIA GTC 2018 Tensors Computations and the GPU AGENDA Tensor Networks and Decompositions Tensor Layers in
More informationAn Analytical Method to Determine Minimum Per-Layer Precision of Deep Neural Networks
An Analytical Method to Determine Minimum Per-Layer Precision of Deep Neural Networks Charbel Sakr, Naresh Shanbhag Department of Electrical and Computer Engineering University of Illinois at Urbana-Champaign
More informationExploring the Granularity of Sparsity in Convolutional Neural Networks
Exploring the Granularity of Sparsity in Convolutional Neural Networks Anonymous TMCV submission Abstract Sparsity helps reducing the computation complexity of DNNs by skipping the multiplication with
More informationDifferentiable Fine-grained Quantization for Deep Neural Network Compression
Differentiable Fine-grained Quantization for Deep Neural Network Compression Hsin-Pai Cheng hc218@duke.edu Yuanjun Huang University of Science and Technology of China Anhui, China yjhuang@mail.ustc.edu.cn
More informationConvolutional Dictionary Learning and Feature Design
1 Convolutional Dictionary Learning and Feature Design Lawrence Carin Duke University 16 September 214 1 1 Background 2 Convolutional Dictionary Learning 3 Hierarchical, Deep Architecture 4 Convolutional
More informationVery Deep Residual Networks with Maxout for Plant Identification in the Wild Milan Šulc, Dmytro Mishkin, Jiří Matas
Very Deep Residual Networks with Maxout for Plant Identification in the Wild Milan Šulc, Dmytro Mishkin, Jiří Matas Center for Machine Perception Department of Cybernetics Faculty of Electrical Engineering
More informationSparse signals recovered by non-convex penalty in quasi-linear systems
Cui et al. Journal of Inequalities and Applications 018) 018:59 https://doi.org/10.1186/s13660-018-165-8 R E S E A R C H Open Access Sparse signals recovered by non-conve penalty in quasi-linear systems
More informationArtificial Neural Networks D B M G. Data Base and Data Mining Group of Politecnico di Torino. Elena Baralis. Politecnico di Torino
Artificial Neural Networks Data Base and Data Mining Group of Politecnico di Torino Elena Baralis Politecnico di Torino Artificial Neural Networks Inspired to the structure of the human brain Neurons as
More informationarxiv: v1 [cs.cv] 6 Dec 2018
Trained Rank Pruning for Efficient Deep Neural Networks Yuhui Xu 1, Yuxi Li 1, Shuai Zhang 2, Wei Wen 3, Botao Wang 2, Yingyong Qi 2, Yiran Chen 3, Weiyao Lin 1 and Hongkai Xiong 1 arxiv:1812.02402v1 [cs.cv]
More informationA Randomized Approach for Crowdsourcing in the Presence of Multiple Views
A Randomized Approach for Crowdsourcing in the Presence of Multiple Views Presenter: Yao Zhou joint work with: Jingrui He - 1 - Roadmap Motivation Proposed framework: M2VW Experimental results Conclusion
More informationCompressing deep neural networks
From Data to Decisions - M.Sc. Data Science Compressing deep neural networks Challenges and theoretical foundations Presenter: Simone Scardapane University of Exeter, UK Table of contents Introduction
More informationStat542 (F11) Statistical Learning. First consider the scenario where the two classes of points are separable.
Linear SVM (separable case) First consider the scenario where the two classes of points are separable. It s desirable to have the width (called margin) between the two dashed lines to be large, i.e., have
More informationBinary Convolutional Neural Network on RRAM
Binary Convolutional Neural Network on RRAM Tianqi Tang, Lixue Xia, Boxun Li, Yu Wang, Huazhong Yang Dept. of E.E, Tsinghua National Laboratory for Information Science and Technology (TNList) Tsinghua
More informationClassification goals: Make 1 guess about the label (Top-1 error) Make 5 guesses about the label (Top-5 error) No Bounding Box
ImageNet Classification with Deep Convolutional Neural Networks Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton Motivation Classification goals: Make 1 guess about the label (Top-1 error) Make 5 guesses
More informationIN recent years, convolutional neural networks (CNNs) have
SUBMISSION TO IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE Holistic CNN Compression via Low-rank Decomposition with Knowledge Transfer Shaohui Lin, Rongrong Ji, Senior Member, IEEE, Chao
More informationSajid Anwar, Kyuyeon Hwang and Wonyong Sung
Sajid Anwar, Kyuyeon Hwang and Wonyong Sung Department of Electrical and Computer Engineering Seoul National University Seoul, 08826 Korea Email: sajid@dsp.snu.ac.kr, khwang@dsp.snu.ac.kr, wysung@snu.ac.kr
More informationA practical theory for designing very deep convolutional neural networks. Xudong Cao.
A practical theory for designing very deep convolutional neural networks Xudong Cao notcxd@gmail.com Abstract Going deep is essential for deep learning. However it is not easy, there are many ways of going
More informationGlobal Optimality in Matrix and Tensor Factorization, Deep Learning & Beyond
Global Optimality in Matrix and Tensor Factorization, Deep Learning & Beyond Ben Haeffele and René Vidal Center for Imaging Science Mathematical Institute for Data Science Johns Hopkins University This
More informationAsaf Bar Zvi Adi Hayat. Semantic Segmentation
Asaf Bar Zvi Adi Hayat Semantic Segmentation Today s Topics Fully Convolutional Networks (FCN) (CVPR 2015) Conditional Random Fields as Recurrent Neural Networks (ICCV 2015) Gaussian Conditional random
More informationarxiv: v1 [cs.cv] 21 Nov 2016
Training Sparse Neural Networks Suraj Srinivas Akshayvarun Subramanya Indian Institute of Science Indian Institute of Science Bangalore, India Bangalore, India surajsrinivas@grads.cds.iisc.ac.in akshayvarun07@gmail.com
More informationCSC 576: Variants of Sparse Learning
CSC 576: Variants of Sparse Learning Ji Liu Department of Computer Science, University of Rochester October 27, 205 Introduction Our previous note basically suggests using l norm to enforce sparsity in
More informationCS 229 Project Final Report: Reinforcement Learning for Neural Network Architecture Category : Theory & Reinforcement Learning
CS 229 Project Final Report: Reinforcement Learning for Neural Network Architecture Category : Theory & Reinforcement Learning Lei Lei Ruoxuan Xiong December 16, 2017 1 Introduction Deep Neural Network
More informationImproving L-BFGS Initialization for Trust-Region Methods in Deep Learning
Improving L-BFGS Initialization for Trust-Region Methods in Deep Learning Jacob Rafati http://rafati.net jrafatiheravi@ucmerced.edu Ph.D. Candidate, Electrical Engineering and Computer Science University
More informationDeep Convolutional Neural Networks for Pairwise Causality
Deep Convolutional Neural Networks for Pairwise Causality Karamjit Singh, Garima Gupta, Lovekesh Vig, Gautam Shroff, and Puneet Agarwal TCS Research, Delhi Tata Consultancy Services Ltd. {karamjit.singh,
More informationPreserving Privacy in Data Mining using Data Distortion Approach
Preserving Privacy in Data Mining using Data Distortion Approach Mrs. Prachi Karandikar #, Prof. Sachin Deshpande * # M.E. Comp,VIT, Wadala, University of Mumbai * VIT Wadala,University of Mumbai 1. prachiv21@yahoo.co.in
More informationIntroduction to Machine Learning Midterm Exam
10-701 Introduction to Machine Learning Midterm Exam Instructors: Eric Xing, Ziv Bar-Joseph 17 November, 2015 There are 11 questions, for a total of 100 points. This exam is open book, open notes, but
More informationCSCI 315: Artificial Intelligence through Deep Learning
CSCI 35: Artificial Intelligence through Deep Learning W&L Fall Term 27 Prof. Levy Convolutional Networks http://wernerstudio.typepad.com/.a/6ad83549adb53ef53629ccf97c-5wi Convolution: Convolution is
More informationOptimization methods
Optimization methods Optimization-Based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_spring16 Carlos Fernandez-Granda /8/016 Introduction Aim: Overview of optimization methods that Tend to
More informationSparse regression. Optimization-Based Data Analysis. Carlos Fernandez-Granda
Sparse regression Optimization-Based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_spring16 Carlos Fernandez-Granda 3/28/2016 Regression Least-squares regression Example: Global warming Logistic
More informationLeveraging Machine Learning for High-Resolution Restoration of Satellite Imagery
Leveraging Machine Learning for High-Resolution Restoration of Satellite Imagery Daniel L. Pimentel-Alarcón, Ashish Tiwari Georgia State University, Atlanta, GA Douglas A. Hope Hope Scientific Renaissance
More informationExploring the Granularity of Sparsity in Convolutional Neural Networks
Exploring the Granularity of Sparsity in Convolutional Neural Networks Huizi Mao 1, Song Han 1, Jeff Pool 2, Wenshuo Li 3, Xingyu Liu 1, Yu Wang 3, William J. Dally 1,2 1 Stanford University 2 NVIDIA 3
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression
More informationInfo-Greedy Sequential Adaptive Compressed Sensing
Info-Greedy Sequential Adaptive Compressed Sensing Yao Xie Joint work with Gabor Braun and Sebastian Pokutta Georgia Institute of Technology Presented at Allerton Conference 2014 Information sensing for
More informationHarmonic Analysis of Deep Convolutional Neural Networks
Harmonic Analysis of Deep Convolutional Neural Networks Helmut Bőlcskei Department of Information Technology and Electrical Engineering October 2017 joint work with Thomas Wiatowski and Philipp Grohs ImageNet
More informationCSC321 Lecture 5: Multilayer Perceptrons
CSC321 Lecture 5: Multilayer Perceptrons Roger Grosse Roger Grosse CSC321 Lecture 5: Multilayer Perceptrons 1 / 21 Overview Recall the simple neuron-like unit: y output output bias i'th weight w 1 w2 w3
More informationarxiv: v1 [stat.ml] 15 Dec 2017
BT-Nets: Simplifying Deep Neural Networks via Block Term Decomposition Guangxi Li 1, Jinmian Ye 1, Haiqin Yang 2, Di Chen 1, Shuicheng Yan 3, Zenglin Xu 1, 1 SMILE Lab, School of Comp. Sci. and Eng., Univ.
More informationMachine Learning for Computer Vision 8. Neural Networks and Deep Learning. Vladimir Golkov Technical University of Munich Computer Vision Group
Machine Learning for Computer Vision 8. Neural Networks and Deep Learning Vladimir Golkov Technical University of Munich Computer Vision Group INTRODUCTION Nonlinear Coordinate Transformation http://cs.stanford.edu/people/karpathy/convnetjs/
More informationProbabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms
Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms Adrien Todeschini Inria Bordeaux JdS 2014, Rennes Aug. 2014 Joint work with François Caron (Univ. Oxford), Marie
More informationCENG 783. Special topics in. Deep Learning. AlchemyAPI. Week 8. Sinan Kalkan
CENG 783 Special topics in Deep Learning AlchemyAPI Week 8 Sinan Kalkan Loss functions Many correct labels case: Binary prediction for each label, independently: L i = σ j max 0, 1 y ij f j y ij = +1 if
More informationMachine Learning for Signal Processing Neural Networks Continue. Instructor: Bhiksha Raj Slides by Najim Dehak 1 Dec 2016
Machine Learning for Signal Processing Neural Networks Continue Instructor: Bhiksha Raj Slides by Najim Dehak 1 Dec 2016 1 So what are neural networks?? Voice signal N.Net Transcription Image N.Net Text
More informationarxiv: v1 [cs.it] 26 Oct 2018
Outlier Detection using Generative Models with Theoretical Performance Guarantees arxiv:1810.11335v1 [cs.it] 6 Oct 018 Jirong Yi Anh Duc Le Tianming Wang Xiaodong Wu Weiyu Xu October 9, 018 Abstract This
More information1 Sparsity and l 1 relaxation
6.883 Learning with Combinatorial Structure Note for Lecture 2 Author: Chiyuan Zhang Sparsity and l relaxation Last time we talked about sparsity and characterized when an l relaxation could recover the
More informationTwo well known examples. Applications of structured low-rank approximation. Approximate realisation = Model reduction. System realisation
Two well known examples Applications of structured low-rank approximation Ivan Markovsky System realisation Discrete deconvolution School of Electronics and Computer Science University of Southampton The
More informationMIT 9.520/6.860, Fall 2017 Statistical Learning Theory and Applications. Class 19: Data Representation by Design
MIT 9.520/6.860, Fall 2017 Statistical Learning Theory and Applications Class 19: Data Representation by Design What is data representation? Let X be a data-space X M (M) F (M) X A data representation
More informationLayout Hotspot Detection with Feature Tensor Generation and Deep Biased Learning
Layout Hotspot Detection with Feature Tensor Generation and Deep Biased Learning Haoyu Yang 1, Jing Su, Yi Zou, Bei Yu 1, and Evangeline F Y Young 1 1 CSE Department, The Chinese University of Hong Kong,
More informationDictionary Learning Using Tensor Methods
Dictionary Learning Using Tensor Methods Anima Anandkumar U.C. Irvine Joint work with Rong Ge, Majid Janzamin and Furong Huang. Feature learning as cornerstone of ML ML Practice Feature learning as cornerstone
More informationApplication to Hyperspectral Imaging
Compressed Sensing of Low Complexity High Dimensional Data Application to Hyperspectral Imaging Kévin Degraux PhD Student, ICTEAM institute Université catholique de Louvain, Belgium 6 November, 2013 Hyperspectral
More informationConvolutional Neural Networks II. Slides from Dr. Vlad Morariu
Convolutional Neural Networks II Slides from Dr. Vlad Morariu 1 Optimization Example of optimization progress while training a neural network. (Loss over mini-batches goes down over time.) 2 Learning rate
More informationLearning Multiple Tasks with a Sparse Matrix-Normal Penalty
Learning Multiple Tasks with a Sparse Matrix-Normal Penalty Yi Zhang and Jeff Schneider NIPS 2010 Presented by Esther Salazar Duke University March 25, 2011 E. Salazar (Reading group) March 25, 2011 1
More informationBias-free Sparse Regression with Guaranteed Consistency
Bias-free Sparse Regression with Guaranteed Consistency Wotao Yin (UCLA Math) joint with: Stanley Osher, Ming Yan (UCLA) Feng Ruan, Jiechao Xiong, Yuan Yao (Peking U) UC Riverside, STATS Department March
More informationDiscriminative Models
No.5 Discriminative Models Hui Jiang Department of Electrical Engineering and Computer Science Lassonde School of Engineering York University, Toronto, Canada Outline Generative vs. Discriminative models
More informationHigh-Dimensional Statistical Learning: Introduction
Classical Statistics Biological Big Data Supervised and Unsupervised Learning High-Dimensional Statistical Learning: Introduction Ali Shojaie University of Washington http://faculty.washington.edu/ashojaie/
More informationNeural Architectures for Image, Language, and Speech Processing
Neural Architectures for Image, Language, and Speech Processing Karl Stratos June 26, 2018 1 / 31 Overview Feedforward Networks Need for Specialized Architectures Convolutional Neural Networks (CNNs) Recurrent
More informationRegML 2018 Class 8 Deep learning
RegML 2018 Class 8 Deep learning Lorenzo Rosasco UNIGE-MIT-IIT June 18, 2018 Supervised vs unsupervised learning? So far we have been thinking of learning schemes made in two steps f(x) = w, Φ(x) F, x
More informationA tutorial on sparse modeling. Outline:
A tutorial on sparse modeling. Outline: 1. Why? 2. What? 3. How. 4. no really, why? Sparse modeling is a component in many state of the art signal processing and machine learning tasks. image processing
More informationSignal Recovery from Permuted Observations
EE381V Course Project Signal Recovery from Permuted Observations 1 Problem Shanshan Wu (sw33323) May 8th, 2015 We start with the following problem: let s R n be an unknown n-dimensional real-valued signal,
More informationGlobal Optimality in Matrix and Tensor Factorizations, Deep Learning and More
Global Optimality in Matrix and Tensor Factorizations, Deep Learning and More Ben Haeffele and René Vidal Center for Imaging Science Institute for Computational Medicine Learning Deep Image Feature Hierarchies
More informationCourse Notes for EE227C (Spring 2018): Convex Optimization and Approximation
Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Instructor: Moritz Hardt Email: hardt+ee227c@berkeley.edu Graduate Instructor: Max Simchowitz Email: msimchow+ee227c@berkeley.edu
More informationDeep learning on 3D geometries. Hope Yao Design Informatics Lab Department of Mechanical and Aerospace Engineering
Deep learning on 3D geometries Hope Yao Design Informatics Lab Department of Mechanical and Aerospace Engineering Overview Background Methods Numerical Result Future improvements Conclusion Background
More informationProbabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms
Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms François Caron Department of Statistics, Oxford STATLEARN 2014, Paris April 7, 2014 Joint work with Adrien Todeschini,
More informationUNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013
UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013 Exam policy: This exam allows two one-page, two-sided cheat sheets; No other materials. Time: 2 hours. Be sure to write your name and
More information(Feed-Forward) Neural Networks Dr. Hajira Jabeen, Prof. Jens Lehmann
(Feed-Forward) Neural Networks 2016-12-06 Dr. Hajira Jabeen, Prof. Jens Lehmann Outline In the previous lectures we have learned about tensors and factorization methods. RESCAL is a bilinear model for
More informationColor Scheme. swright/pcmi/ M. Figueiredo and S. Wright () Inference and Optimization PCMI, July / 14
Color Scheme www.cs.wisc.edu/ swright/pcmi/ M. Figueiredo and S. Wright () Inference and Optimization PCMI, July 2016 1 / 14 Statistical Inference via Optimization Many problems in statistical inference
More informationarxiv: v1 [cs.lg] 25 Sep 2018
Utilizing Class Information for DNN Representation Shaping Daeyoung Choi and Wonjong Rhee Department of Transdisciplinary Studies Seoul National University Seoul, 08826, South Korea {choid, wrhee}@snu.ac.kr
More informationarxiv: v1 [stat.ml] 27 Jul 2016
Convolutional Neural Networks Analyzed via Convolutional Sparse Coding arxiv:1607.08194v1 [stat.ml] 27 Jul 2016 Vardan Papyan*, Yaniv Romano*, Michael Elad October 17, 2017 Abstract In recent years, deep
More informationDirect Learning: Linear Regression. Donglin Zeng, Department of Biostatistics, University of North Carolina
Direct Learning: Linear Regression Parametric learning We consider the core function in the prediction rule to be a parametric function. The most commonly used function is a linear function: squared loss:
More informationIntroduction to Convolutional Neural Networks (CNNs)
Introduction to Convolutional Neural Networks (CNNs) nojunk@snu.ac.kr http://mipal.snu.ac.kr Department of Transdisciplinary Studies Seoul National University, Korea Jan. 2016 Many slides are from Fei-Fei
More informationA NONCONVEX ADMM ALGORITHM FOR GROUP SPARSITY WITH SPARSE GROUPS. Rick Chartrand and Brendt Wohlberg
A NONCONVEX ADMM ALGORITHM FOR GROUP SPARSITY WITH SPARSE GROUPS Rick Chartrand and Brendt Wohlberg Los Alamos National Laboratory Los Alamos, NM 87545, USA ABSTRACT We present an efficient algorithm for
More informationStructured matrix factorizations. Example: Eigenfaces
Structured matrix factorizations Example: Eigenfaces An extremely large variety of interesting and important problems in machine learning can be formulated as: Given a matrix, find a matrix and a matrix
More informationAgenda. Digit Classification using CNN Digit Classification using SAE Visualization: Class models, filters and saliency 2 DCT
versus 1 Agenda Deep Learning: Motivation Learning: Backpropagation Deep architectures I: Convolutional Neural Networks (CNN) Deep architectures II: Stacked Auto Encoders (SAE) Caffe Deep Learning Toolbox:
More informationRegularization and Variable Selection via the Elastic Net
p. 1/1 Regularization and Variable Selection via the Elastic Net Hui Zou and Trevor Hastie Journal of Royal Statistical Society, B, 2005 Presenter: Minhua Chen, Nov. 07, 2008 p. 2/1 Agenda Introduction
More information