TensorFlow: A Framework for Scalable Machine Learning


Outline. You probably want to know: What is TensorFlow? Why did we create TensorFlow? How does TensorFlow work? Example: Linear Regression. Example: Convolutional Neural Network. Distributed TensorFlow.

A fast, flexible, and scalable open-source machine learning library. One system for research and production. Runs on CPU, GPU, TPU, and mobile. Apache 2.0 license.

TensorFlow Handles Complexity: modeling complexity, distributed systems, heterogeneous systems.

Under the Hood

Two core ideas: a multidimensional array (the tensor), and a graph of operations.
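To make the first idea concrete, here is a minimal sketch (assuming the TensorFlow 1.x API used throughout these slides) of tensors of different ranks:

import tensorflow as tf

scalar = tf.constant(3.0)                        # rank 0: a single number
vector = tf.constant([1.0, 2.0, 3.0])            # rank 1: a 1-D tensor
matrix = tf.constant([[1.0, 2.0], [3.0, 4.0]])   # rank 2: a 2-D tensor
print(scalar.shape, vector.shape, matrix.shape)  # (), (3,), (2, 2)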

The TensorFlow Graph. Computation is defined as a graph. The graph is defined in a high-level language (Python), compiled and optimized, and executed (in parts or fully) on the available low-level devices (CPU, GPU, TPU). Nodes represent computations and state; data (tensors) flow along the edges.

Build a graph; then run it.

# a and b are the inputs of the "add" node in the diagram (e.g. placeholders).
c = tf.add(a, b)
session = tf.Session()
value_of_c = session.run(c, feed_dict={a: 1, b: 2})

Any Computation is a TensorFlow Graph. (Graph diagram: examples and weights feed MatMul, biases feed Add, then Relu, then Xent against the labels.)

Any Computation is a TensorFlow Graph, with state. (Same graph, with the weights and biases now held in variables.)

Automatic Differentiation. TensorFlow automatically adds ops which compute gradients of the loss (Xent) with respect to the variables (e.g. biases).
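A minimal sketch of the idea (assuming TensorFlow 1.x; tf.gradients is the low-level call that adds the gradient ops explicitly):

import tensorflow as tf

x = tf.constant(3.0)
y = x * x                       # y = x^2

grad = tf.gradients(y, [x])[0]  # adds ops that compute dy/dx to the graph

with tf.Session() as sess:
    print(sess.run(grad))       # 6.0, since dy/dx = 2x at x = 3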

Any Computation is a TensorFlow Graph, with state. Simple gradient descent: the gradient (grad) of Xent is multiplied (Mul) by the learning rate, and the result is assigned back into the variables (e.g. biases).

Any Computation is a TensorFlow Graph, distributed. The graph can be split across devices, e.g. the biases/Add part on Device A and the Mul/assign part on Device B. Devices: processes, machines, CPUs, GPUs, TPUs, etc.

Send and Receive Nodes, distributed. On every edge that crosses a device boundary, TensorFlow inserts a pair of Send and Recv nodes, so tensors move between Device A and Device B automatically. Devices: processes, machines, CPUs, GPUs, TPUs, etc.

Linear Regression

Linear Regression: y = Wx + b, where y is the result, x is the input, and W and b are the parameters.

What are we trying to do? Mystery equation: y = 0.1 * x + 0.3 + noise. Model: y = W * x + b. Objective: given enough (x, y) samples, figure out the values of W and b.
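For concreteness, here is one way such samples could be generated (a sketch; the names x_in and y_in are chosen to match the feed_dict keys used in the later slides, and NumPy is assumed):

import numpy as np

x_in = np.random.rand(100).astype(np.float32)
noise = np.random.normal(scale=0.01, size=100).astype(np.float32)
y_in = 0.1 * x_in + 0.3 + noise   # the "mystery equation"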

y = Wx + b in TensorFlow
import tensorflow as tf

y = Wx + b in TensorFlow
import tensorflow as tf
x = tf.placeholder(shape=[None], dtype=tf.float32, name='x')

y = Wx + b in TensorFlow
import tensorflow as tf
x = tf.placeholder(shape=[None], dtype=tf.float32, name='x')
W = tf.get_variable(shape=[], name='W')

y = Wx + b in TensorFlow
import tensorflow as tf
x = tf.placeholder(shape=[None], dtype=tf.float32, name='x')
W = tf.get_variable(shape=[], name='W')
b = tf.get_variable(shape=[], name='b')

y = Wx + b in TensorFlow
import tensorflow as tf
x = tf.placeholder(shape=[None], dtype=tf.float32, name='x')
W = tf.get_variable(shape=[], name='W')
b = tf.get_variable(shape=[], name='b')
y = W * x + b

Variables Must be Initialized
init_op = tf.initialize_all_variables()   # collects all variable initializers
sess = tf.Session()                       # makes an execution environment
sess.run(init_op)                         # actually initialize the variables

Running the Computation
x_in = [3]
sess.run(y, feed_dict={x: x_in})          # fetch y, feed x
Only what's used to compute a fetch will be evaluated. All tensors can be fed, but all placeholders must be fed.
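As a small illustration of those two rules (a sketch, not from the slides): any tensor in the graph can be fed, in which case the ops that would normally produce it are skipped:

with tf.Session() as sess:
    # Feed the intermediate tensor y directly; W, b and x are never evaluated,
    # so they do not need to be initialized or fed for this particular run.
    print(sess.run(y + 1.0, feed_dict={y: [2.0]}))   # [3.0]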

Putting it all together
import tensorflow as tf
# Build the graph
x = tf.placeholder(shape=[None], dtype=tf.float32, name='x')
W = tf.get_variable(shape=[], name='W')
b = tf.get_variable(shape=[], name='b')
y = W * x + b
# Prepare execution environment
with tf.Session() as sess:
    # Initialize variables
    sess.run(tf.initialize_all_variables())
    # Run the computation (usually often)
    print(sess.run(y, feed_dict={x: x_in}))

Define a Loss
Given y and the training targets y_label, compute a loss, for instance:
# Create an operation that calculates loss.
loss = tf.reduce_mean(tf.square(y - y_label))

Minimize loss: optimizers
tf.train.AdadeltaOptimizer, tf.train.AdagradOptimizer, tf.train.AdagradDAOptimizer, tf.train.AdamOptimizer, ...
(Figure: the optimizer moves the parameters (weights, biases) toward the minimum of the error function.)

Train
Feed (x, y_label) pairs and adjust W and b to decrease the loss:
W ← W − α (dL/dW)
b ← b − α (dL/db)
TensorFlow computes gradients automatically.
# Create an optimizer; 0.5 is the learning rate.
optimizer = tf.train.GradientDescentOptimizer(0.5)
# Create an operation that minimizes loss.
train = optimizer.minimize(loss)
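To make "computes gradients automatically" concrete, here is roughly what minimize() amounts to for plain gradient descent (a sketch under TF 1.x, not the library's actual implementation):

# Manual equivalent of optimizer.minimize(loss) for this model (sketch).
grad_W, grad_b = tf.gradients(loss, [W, b])
learning_rate = 0.5
train_manual = tf.group(
    tf.assign(W, W - learning_rate * grad_W),
    tf.assign(b, b - learning_rate * grad_b))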

Putting it all together
# Define a loss
loss = tf.reduce_mean(tf.square(y - y_label))
# Create an optimizer
optimizer = tf.train.GradientDescentOptimizer(0.5)
# Op to minimize the loss
train = optimizer.minimize(loss)
with tf.Session() as sess:
    # Initialize variables
    sess.run(tf.initialize_all_variables())
    # Iteratively run the training op
    for i in range(1000):
        sess.run(train, feed_dict={x: x_in, y_label: y_in})

Putting it all together
import tensorflow as tf
# Build the graph
x = tf.placeholder(shape=[None], dtype=tf.float32, name='x')
y_label = tf.placeholder(shape=[None], dtype=tf.float32, name='y_label')  # training targets
W = tf.get_variable(shape=[], name='W')
b = tf.get_variable(shape=[], name='b')
y = W * x + b
# Define a loss
loss = tf.reduce_mean(tf.square(y - y_label))
# Create an optimizer
optimizer = tf.train.GradientDescentOptimizer(0.5)
# Op to minimize the loss
train = optimizer.minimize(loss)
# Prepare environment
with tf.Session() as sess:
    # Initialize variables
    sess.run(tf.initialize_all_variables())
    # Iteratively run the training op
    for i in range(1000):
        sess.run(train, feed_dict={x: x_in, y_label: y_in})
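With the hypothetical x_in / y_in samples sketched earlier, the fitted parameters should approach the mystery values; for example, still inside the session after the loop:

    print(sess.run([W, b]))   # roughly [0.1, 0.3]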

TensorBoard
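TensorBoard is only named on the slide; as one possible illustration (a sketch assuming the TF 1.x summary API; the log directory is arbitrary), the loss above could be logged and then viewed by running tensorboard --logdir=/tmp/linreg_logs:

loss_summary = tf.summary.scalar('loss', loss)
writer = tf.summary.FileWriter('/tmp/linreg_logs')   # arbitrary log directory

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    for i in range(1000):
        _, s = sess.run([train, loss_summary],
                        feed_dict={x: x_in, y_label: y_in})
        writer.add_summary(s, i)                      # one point per training step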

Deep Convolutional Neural Network

Remember linear regression?
# Build the graph
import tensorflow as tf
x = tf.placeholder(shape=[None], dtype=tf.float32, name='x')
W = tf.get_variable(shape=[], name='W')
b = tf.get_variable(shape=[], name='b')
y = W * x + b
loss = tf.reduce_mean(tf.square(y - y_label))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)
...

Convolutional DNN
x = tf.contrib.layers.conv2d(x, kernel_size=[5,5], ...)
x = tf.contrib.layers.max_pool2d(x, kernel_size=[2,2], ...)
x = tf.contrib.layers.conv2d(x, kernel_size=[5,5], ...)
x = tf.contrib.layers.max_pool2d(x, kernel_size=[2,2], ...)
x = tf.contrib.layers.fully_connected(x, activation_fn=tf.nn.relu)
x = tf.contrib.layers.dropout(x, 0.5)
logits = tf.contrib.layers.linear(x)
(Network: x → conv 5x5 (relu) → maxpool 2x2 → conv 5x5 (relu) → maxpool 2x2 → fully_connected (relu) → dropout 0.5 → fully_connected (linear) → logits)
https://github.com/martinwicke/tensorflow-tutorial/blob/master/2_mnist.ipynb
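To connect this back to training (a sketch, not from the slides; the labels placeholder and the choice of optimizer are illustrative), the logits would typically feed a cross-entropy loss:

# Hypothetical labels placeholder holding integer class ids.
labels = tf.placeholder(shape=[None], dtype=tf.int64, name='labels')
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=labels))
train = tf.train.AdamOptimizer(1e-4).minimize(loss)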

Defining Complex Networks. (Diagram: the parameters feed the network, which produces a loss; the gradients, multiplied by the learning rate, are applied back to the parameters.)

Distributed TensorFlow

Data Parallelism (Between-Graph Replication). (Diagram: model replicas each read a shard of the data, send parameter updates Δp to the parameter servers, and receive the current parameters p back.)

Describe a cluster: ClusterSpec
tf.train.ClusterSpec({
    "worker": ["worker0.example.com:2222",
               "worker1.example.com:2222",
               "worker2.example.com:2222"],
    "ps": ["ps0.example.com:2222",
           "ps1.example.com:2222"]})
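Each process in the cluster then starts a server for its own task (a sketch; the job_name and task_index shown are illustrative):

import tensorflow as tf

cluster = tf.train.ClusterSpec({
    "worker": ["worker0.example.com:2222",
               "worker1.example.com:2222",
               "worker2.example.com:2222"],
    "ps": ["ps0.example.com:2222",
           "ps1.example.com:2222"]})

# On worker0.example.com:
server = tf.train.Server(cluster, job_name="worker", task_index=0)
# Parameter-server processes typically just serve the graph and block:
# server.join()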

Share the graph across devices
with tf.device("/job:ps/task:0"):
    weights_1 = tf.Variable(...)
    biases_1 = tf.Variable(...)
with tf.device("/job:ps/task:1"):
    weights_2 = tf.Variable(...)
    biases_2 = tf.Variable(...)
with tf.device("/job:worker/task:7"):
    input, labels = ...
    layer_1 = tf.nn.relu(tf.matmul(input, weights_1) + biases_1)
    logits = tf.nn.relu(tf.matmul(layer_1, weights_2) + biases_2)
    train_op = ...
with tf.Session("grpc://worker7.example.com:2222") as sess:
    for _ in range(10000):
        sess.run(train_op)
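Placing every variable by hand does not scale well; a common alternative (a sketch, not shown on the slides; the variable shapes are illustrative) is tf.train.replica_device_setter, which spreads variables over the ps tasks automatically:

with tf.device(tf.train.replica_device_setter(
        worker_device="/job:worker/task:7", cluster=cluster)):
    # Variables land on the ps jobs (round-robin); other ops stay on the worker.
    weights_1 = tf.get_variable("weights_1", shape=[784, 100])
    biases_1 = tf.get_variable("biases_1", shape=[100])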

Architecture: Python and C++ front ends; Bindings + Compound Ops; the TensorFlow Distributed Execution System; Ops (add, mul, Print, reshape, ...); Kernels (CPU, GPU, TPU, Android, iOS, ...).

Tutorials on tensorflow.org:
Image recognition: https://www.tensorflow.org/tutorials/image_recognition
Word embeddings: https://www.tensorflow.org/tutorials/word2vec
Language modeling: https://www.tensorflow.org/tutorials/recurrent
Translation: https://www.tensorflow.org/tutorials/seq2seq
Udacity course: https://www.udacity.com/course/deep-learning--ud730

Q&A