Index

S. Pattanayak, Pro Deep Learning with TensorFlow (2017)
A
Activation functions, neuron/perceptron
  binary threshold activation function,
  linear activation function, 102
  rectified linear unit, 106
  sigmoid activation function,
  SoftMax activation function,
  tanh activation function, 107
AdadeltaOptimizer,
AdagradOptimizer,
AdamOptimizer, 135
Auto-encoders
  architecture, 323
  cases, 324
  combined classification network, class prediction, 326
  denoising auto-encoder implementation, 333
  element-wise activation function, 324
  hidden layer, 323
  KL divergence,
  learning rule of model, 324
  multiple hidden layers, 325
  network, class prediction, 326
  sparse, 328
  unsupervised ANN, 322

B
Backpropagation, 109
  convolution layer,
  for gradient computation
    cost derivative, 116
    cost function, 112
    cross-entropy cost, SoftMax activation layer, 115
    forward pass and backward pass, 114
    hidden layer unit, 110
    independent sigmoid output units, 111
    multi-layer neural network, 113
    neural networks, 114
    partial derivative,
    partial derivative, cost function,
    propagating error, 109
    sigmoid activation functions, 114
    SoftMax function, 114
    SoftMax output layer, 114
  pooling layer,
Backpropagation through time (BPTT), 256
Batch normalization,
Bayesian inference
  Bernoulli distribution, 282
  likelihood function, 286
  likelihood function plot, 284
  posterior distribution, 281
  posterior probability distribution, 281, 283
  prior, 283
  prior probability distribution, 283, 285
Bayesian networks, 38
Bayes rule, 38
Bernoulli distribution,
Bidirectional RNN,
Binary threshold activation function,
Binomial distribution, 49
Block Gibbs sampling, 305
Boltzmann distribution,

C
Calculus, 23
  convex function,
  convex set,
  differentiation,
  gradient of function,
  Hessian matrix of function, 25
  local and global minima,
  maxima and minima of functions, 26
    for univariate function,
  multivariate convex and non-convex functions,
  non-convex function, 31
  positive semi-definite and definite, 29
  successive partial derivatives, 25
  Taylor series, 34
Central Limit theorem, 53
Collaborative filtering
  contrastive divergence, 315
  derived probabilities, 317
  description, 313
  energy configuration, 317
  joint configuration, 316
  matrix factorization method, 313
  probability of hidden unit, 316
  RBMs, 314
  restricted Boltzmann view, user,
Continuous bag of words (CBOW)
  hidden-layer embedding, 230
  hidden layer vector, 229, 231
  SoftMax output probability, 231
  TensorFlow implementation, 234
  word embeddings,
Contrastive divergence, 315
Convolutional neural networks (CNNs), 153
  architectures, 206
    AlexNet,
    LeNet,
    ResNet,
    VGG16,
  components, 179
    convolution layer,
    input layer, 180
    pooling layer, 182
  convolution operation, 153
    2D convolution of image,
    1D convolution of signal,
    LTI/LSI systems,
    signals in one dimension,
  digit recognition on MNIST dataset,
  dropout layers and regularization,
  elements, 153
  image-processing filters, 169
    Gaussian filter, 173
    gradient-based filters,
    identity transform,
    Mean filter,
    Median filter,
    Sobel edge-detection filter,
  for solving real-world problems,
  translational equivariance,
    pooling,
  weight sharing, 187
Cross-correlation, 180
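As an aside to the activation-function entries above: a minimal NumPy sketch of the indexed functions (binary threshold, ReLU, sigmoid, SoftMax, tanh). It is illustrative only, not the book's code; the linear activation is simply the identity.

```python
# Minimal NumPy sketches of the indexed activation functions;
# illustrative only, not the book's own code.
import numpy as np

def sigmoid(x):
    # Squashes input to (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Squashes input to (-1, 1)
    return np.tanh(x)

def relu(x):
    # Rectified linear unit: element-wise max(0, x)
    return np.maximum(0.0, x)

def binary_threshold(x):
    # Fires 1 if the input exceeds 0, else 0
    return (x > 0).astype(float)

def softmax(x):
    # Normalizes scores to a probability distribution;
    # shifting by max(x) is for numerical stability
    e = np.exp(x - np.max(x))
    return e / e.sum()

print(softmax(np.array([1.0, 2.0, 3.0])))
```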
D
Deep belief networks (DBNs)
  backpropagation, 318
  implementation, 319
  learning algorithm, 318
  MNIST dataset, 318
  RBMs, 317
  ReLU activation functions, 319
  schematic diagram, 317, 318
  sigmoid units, 318
Deep learning evolution
  artificial neural networks,
  artificial neuron structure, 90
  biological neuron structure, 89
  perceptron learning algorithms
    activation functions,
    hidden layers, linear,
    backpropagation (see Backpropagation, for gradient computation)
    geometrical interpretation, hyperplane, classes, 93
    limitations, machine-learning domain, 94
    non-linearity,
    rule,
    multi-layer perceptrons network,
    weight parameters vector, 95
  vs. traditional methods,
Denoising auto-encoder, 333

E
Elliptical contours, 123, 125

F
Forget-gate value, 264
Fully convolutional network (FCN)
  architecture, 356
  down- and up-sampling
    max unpooling, 360
    transpose convolution, 361, 363
    unpooling, 359
  output feature maps, network,
  pixel categories, 356
  SoftMax probability, 357

G, H
Gated recurrent unit (GRU),
Gaussian blur, 173
Generative adversarial networks (GANs)
  agents, zero-sum game, 378
  cost function and training,
  generative models, 378
  illustration, 379
  maximin and minimax problem,
  minimax and saddle points,
  neural networks, 378
  TensorFlow implementation, 386
  vanishing gradient, generator, 386
  zero-sum game, 381
Gibbs sampling
  bivariate normal distribution, 305
  block, 305
  burn-in period, 306
  conditional distributions, 305
  generating samples, 306
  Markov Chain Monte Carlo method, 304
  restricted Boltzmann machines,
Global co-occurrence methods, 241
  building word vectors,
  extraction, word embeddings, 242
  statistics and prediction methods, 240
  SVD method, 241
  word combination, 241
  word-embeddings plot, 245
  word-vector embedding matrix, 242
Global minima, 28
GloVe, 245
Gradient clipping, 261
Gradient descent, backpropagation, 236
GradientDescentOptimizer, 130
Graphical processing unit (GPU), 152
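The Gibbs-sampling entries above (bivariate normal distribution, conditional distributions, burn-in period, generating samples) can be illustrated with a short sketch. The zero-mean, unit-variance bivariate normal and the variable names are assumptions made for this example:

```python
# Sketch of Gibbs sampling from a standard bivariate normal with
# correlation rho, alternating draws from the two conditionals and
# discarding a burn-in period. Illustrative assumptions, not the book's code.
import numpy as np

rng = np.random.default_rng(0)
rho, n_samples, burn_in = 0.8, 10000, 1000
x, y = 0.0, 0.0
samples = []
for i in range(n_samples + burn_in):
    # Conditionals of a standard bivariate normal:
    # x | y ~ N(rho*y, 1 - rho^2), and symmetrically for y | x
    x = rng.normal(rho * y, np.sqrt(1.0 - rho**2))
    y = rng.normal(rho * x, np.sqrt(1.0 - rho**2))
    if i >= burn_in:
        samples.append((x, y))

samples = np.array(samples)
print(np.corrcoef(samples.T)[0, 1])  # should be close to rho
```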
I, J
Image classification,
Image segmentation, 345
  binary thresholding method, histogram, 345, 349
  FCN (see Fully convolutional network (FCN))
  K-means clustering, 352
  Otsu's method,
  semantic segmentation, 355
  sliding window approach, 355
  in TensorFlow implementation, semantic segmentation, 365
  U-Net convolutional neural network,
  Watershed algorithm,

K
Karush-Kuhn-Tucker method, 78
K-means algorithm, 352
Kullback-Leibler (KL) divergence
  plot for mean, 327
  sparse auto-encoders,

L
Lagrangian multipliers, 79
Language modeling,
Lasso regularization, 16
Linear activation function, 102
Linear algebra, 2
  determinant of matrix, 12
    interpretation, 13
  eigenvectors,
    characteristic equation of matrix,
    power iteration method,
  identity matrix or operator,
  inverse of matrix, 14
  linear independence of vectors, 9-10
  matrix, 4-5
  matrix operations and manipulations, 5
    addition of two matrices, 6
    matrix working on vector, 8
    product of two matrices, 6
    product of two vectors, 7
    subtraction of two matrices, 6
    transpose of matrix, 7
  norm of vector,
  projection of vector in direction of another vector,
  pseudo-inverse of matrix, 16
  rank of matrix,
  scalar, 4
  tensor, 5
  unit vector in direction of specific vector, 17
  vector, 3-4
Linear shift invariant (LSI) systems,
Linear time invariant (LTI) systems,
Localization network,
Local minima point, 28
Long short-term memory (LSTM)
  architecture, 262
  building blocks and function,
  exploding- and vanishing-gradient problems,
  forget gate, 263
  output gates, 263

M, N
Machine learning, 55
  constrained optimization problem,
  and data science, 2
  dimensionality reduction methods, 79
    principal component analysis,
    singular value decomposition,
  optimization techniques
    contour plot and lines,
    gradient descent, 66
    linear curve,
    for multivariate cost function, gradient descent,
    negative curvature, 75
    Newton's method, 74
    positive curvature,
    steepest descent, 70
    stochastic gradient descent,
  regularization,
    constraint optimization problem,
  supervised learning, 56
    classification,
    hyperplanes and linear classifiers,
    linear regression,
  unsupervised learning, 65
Markov Chain, 288
Markov Chain Monte Carlo (MCMC) methods, 280
  aperiodicity, 289
  area of Pi, 287
  computation of Pi, 287
  detailed balance condition, 289
  implementation, 289
  irreducibility, 289
  Metropolis algorithm
    acceptance probability, 291
    bivariate Gaussian distribution, sampling,
    heuristics, 290
    implementation, 290
    transition probability function, 290, 291
  probability zones, 287
  sampling, 286
  states, gas molecules, 288
  stochastic/random, 288
  transition probability, 288
Matrix factorization method, 313
Maximum likelihood estimate (MLE) technique,
Max unpooling, 360
Momentum-based optimizers,
Monte Carlo method, 287
Multi-layer Perceptron (MLP), 99
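The MCMC entries above reference a Monte Carlo computation of Pi (area of Pi, probability zones). A minimal sketch of that classic estimate, assuming the usual unit-square setup:

```python
# Monte Carlo estimate of Pi: the fraction of uniform points in the
# unit square that fall inside the quarter circle approximates Pi/4.
import numpy as np

rng = np.random.default_rng(42)
n = 1_000_000
pts = rng.uniform(0.0, 1.0, size=(n, 2))
inside = (pts ** 2).sum(axis=1) <= 1.0  # x^2 + y^2 <= 1
print(4.0 * inside.mean())              # approx 3.1416
```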
O
Object detection
  Fast R-CNN network, 377
  R-CNN network,
  sliding-window technique, 375
  task, 375
Otsu's method,
Overfitting, 84

P, Q
PCA and ZCA whitening
  advantage,
  illustration,
  pixels, 340
  spatial structure, 341
  techniques, 340
  whitening transform, 341
Perceptron, 92
Points of inflection, 26
Principal component analysis, 279
  See also PCA and ZCA whitening
Probability, 34
  Bayes rule, 38
  chain rule, 37
  conditional independence of events, 38
  correlation coefficient, 44
  covariance, 44
  distribution
    Bernoulli distribution,
    binomial distribution, 49
    multivariate normal distribution, 48
    normal distribution,
    Poisson distribution, 50
    uniform distribution,
  expectation of random variable, 39
  hypothesis testing and p-value,
  independence of events, 37
  likelihood function, 51
  MLE,
  mutually exclusive events, 37
  probability density function (pdf), 39
  probability mass function (pmf), 38
  skewness and kurtosis, 40, 42
  unions, intersection, and conditional,
  variance of random variable,

R
Rectified linear unit (ReLU) activation function, 106
Recurrent neural networks (RNNs)
  architectural principle, 252
  bidirectional RNN,
  BPTT, 256
  component,
  embeddings layer, 252
  folded and unfolded structure, 252
  GRU,
  language modeling,
  LSTM,
  MNIST digit identification,
  TensorFlow
    Alice in Wonderland, 273
    implementation, LSTM,
    input tensor shape,
    LSTM network, 265
    next-word prediction and sentence completion, 268
  traditional language models, 255
  vanishing and exploding gradient problem
    gradient clipping, 261
    LSTMs,
    memory-to-memory weight connection matrix and ReLU units, 261
    sigmoid function, 259
    temporal components, 259
Restricted Boltzmann machines (RBMs)
  block Gibbs sampling, 305
  collaborative filtering
    binary visible unit, 315
    contrastive divergence, 315
    hidden units, 317
    joint configuration, 316
    Netflix Challenge, 314
    probability of hidden unit, 316
    schematic diagram,
    matrix factorization method, 313
    SoftMax function, 315
    three-way energy configuration, 317
  conditional probability distribution, 296
  contrastive divergence,
  DBNs (see Deep belief networks (DBNs))
  deep networks, 294
  discrete variables, 297
  Gibbs sampling,
  graphical probabilistic model, 295
  implementation, MNIST dataset, 309
  joint configuration, 295
  joint probability distribution, 295, 298
  machine learning algorithms, 294
  partition function Z, 295
  sigmoid function, 299
  symmetrical undirected network, 299
  training, 299
  visible and hidden layers architecture, 294
Ridge regression, 86
Ridge regularization, 16
RMSprop,
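For the RBM entries above (probability of hidden unit, block Gibbs sampling as used in contrastive divergence), a small NumPy sketch follows; the layer sizes, weights, and names are illustrative assumptions, not the book's implementation:

```python
# Sketch of RBM conditionals and one block Gibbs step, as used in
# contrastive divergence. All shapes and names are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_visible, n_hidden = 6, 3
W = rng.normal(0, 0.1, size=(n_visible, n_hidden))  # weights
b = np.zeros(n_visible)                             # visible biases
c = np.zeros(n_hidden)                              # hidden biases

v = rng.integers(0, 2, size=n_visible).astype(float)

# P(h_j = 1 | v) = sigmoid(c_j + v . W[:, j])
p_h = sigmoid(c + v @ W)
h = (rng.uniform(size=n_hidden) < p_h).astype(float)

# Block Gibbs step back to the visible layer: P(v_i = 1 | h)
p_v = sigmoid(b + W @ h)
v_sample = (rng.uniform(size=n_visible) < p_v).astype(float)
print(p_h, p_v)
```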
S
Saddle points, 127, 129
Semantic segmentation, 355
  in TensorFlow, FCN network, 365
Sigmoid activation function,
Singular value decomposition (SVD), 313, 340
Skip-gram models, 236
  TensorFlow implementation, 240
  word embedding,
Sliding window approach, 355
SoftMax activation function,
Sparse auto-encoders
  hidden layer output, 329
  hidden layer sigmoid activations, 328
  hidden structures, input data, 328
  implementation, TensorFlow, 329
Stochastic gradient descent (SGD), 71, 127
Supremum norm, 15

T
Tanh activation function, 107
Taylor series expansion, 34
TensorFlow
  commands
    define/check Tensor shape, 120
    explicit evaluation, 120
    InteractiveSession() command,
    invoke session and display, variable, 121
    NumPy array to Tensor conversion, 122
    placeholders and feed dictionary, 122
    TensorFlow and NumPy library, 119
    TensorFlow constants, 120
    TensorFlow variable, random initial values, 121
    tf.Session(), 121
    variables, 121
    variable state update, 122
  deep-learning packages, 118
  features, deep-learning frameworks,
  gradient-descent optimization methods
    elliptical contours, 123, 125
    non-convexity of cost functions, 126
    saddle points, 127, 129
  installation, 119
  linear regression
    actual house price vs. predicted house price, 146
    cost plot over epochs, 145
    implementation, 143
  meta graph definition, 390
  mini-batch stochastic gradient descent, rate, 129
  models deployment, production,
  multi-class classification, SoftMax function
    full-batch gradient descent, 146
    stochastic gradient descent, 149
  optimizers
    AdadeltaOptimizer,
    AdagradOptimizer,
    AdamOptimizer, 135
    batch size, 138
    epochs, 138
    GradientDescentOptimizer,
    MomentumOptimizer and Nesterov algorithm,
    number of batches, 138
    RMSprop,
  XOR implementation
    computation graph,
    hidden layers, 138
    linear activation functions, hidden layer, 142
Traditional language models, 255
Transfer learning, 211
  with Google InceptionV3, 216
  guidelines, 212
  with pre-trained VGG16, 221
Transpose convolution, 361, 363
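The TensorFlow command entries above (constants, variables, placeholders and feed dictionary, variable state update, tf.Session()) belong to the 1.x session workflow the book targets. A minimal sketch, assuming the TensorFlow 1.x API:

```python
# Minimal TensorFlow 1.x session workflow; assumes the 1.x API the book
# targets (this will not run unmodified on TensorFlow 2.x).
import tensorflow as tf

a = tf.constant(3.0)                      # TensorFlow constant
x = tf.placeholder(tf.float32)            # placeholder fed at run time
w = tf.Variable(tf.random_normal([1]))    # variable with random initial value
update = tf.assign(w, w + 1.0)            # variable state update

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(a))                            # explicit evaluation
    print(sess.run(x + a, feed_dict={x: 1.0}))    # feed dictionary
    sess.run(update)                              # mutate the variable's state
    print(sess.run(w))
```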
U
U-Net architecture, 364
Unpooling, 359

V
Vector representation of words, 227
Vector space model (VSM), 227

W, X, Y
Watershed algorithm,
Word-embeddings plot, 245
Word-embedding vector,
Word2Vec
  CBOW method (see Continuous bag of words (CBOW))
  global co-occurrence methods, 240
  GloVe, 245
  skip-gram models,
  TensorFlow implementation, CBOW, 231
  word analogy,
  word vectors, 249
Word-vector embeddings matrix, 242

Z
Zero-sum game,
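The Word2Vec entries above include word analogy over learned vectors. A toy sketch using made-up embeddings (the vectors and vocabulary here are placeholders, not trained values):

```python
# Word-analogy arithmetic ("king - man + woman ~ queen") via cosine
# similarity over a toy embedding table; vectors are fabricated placeholders.
import numpy as np

emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "man":   np.array([0.5, 0.2, 0.1]),
    "woman": np.array([0.5, 0.2, 0.9]),
    "queen": np.array([0.9, 0.8, 0.9]),
}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

target = emb["king"] - emb["man"] + emb["woman"]
best = max((w for w in emb if w not in ("king", "man", "woman")),
           key=lambda w: cosine(emb[w], target))
print(best)  # "queen" for this toy data
```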