TensorFlow. Dan Evans

TensorFlow
Presentation references material from https://www.tensorflow.org/get_started/get_started and Data Science from Scratch by Joel Grus, O'Reilly, 2015, Ch. 8
Dan Evans

TensorFlow www.tensorflow.org
An open-source software library for machine learning
A system for building and training neural networks to detect and decipher patterns and correlations, analogous to (but not the same as) human learning and reasoning
Used for both research and production at Google, often replacing its closed-source predecessor, DistBelief
Developed by the Google Brain team for internal Google use, it was released under the Apache 2.0 open-source license in November 2015
https://en.wikipedia.org/wiki/TensorFlow

TensorFlow APIs
The core API is Python (also considered the most flexible)
Additional supported APIs for C++, Java, Go
Community APIs for C#, Haskell, Julia, Ruby, Rust, Scala

TensorFlow Installation
Before TensorFlow installation, install Python:
(1) Windows: Python 3.5.x or 3.6.x (TF GPU support available)
(2) Mac: Python 2.7 or 3.3+
(3) Ubuntu: Python 2.7 or 3.x (TF GPU support available)
Pick an installation method: virtualenv (2,3), native pip (1,2,3), Docker (2,3), Anaconda (Python Data Science platform) (1,2,3)
Follow the simple command-line install instructions on the web
My Mac install used virtualenv

Why TensorFlow? Perceptrons
A perceptron is the simplest neural network
It takes n inputs, computes a weighted sum, and fires if the sum is greater than or equal to 0:
p = w1*i1 + w2*i2 + ... + wn*in + b*i(n+1)
out = 1 if p >= 0 else 0 (known as a step function)
b, the bias, is a normalizing constant which keeps 0 as the threshold; the input to b, i(n+1), is always implicitly 1
p is the dot product of the vectors [w1, w2, ..., wn, b] and [i1, i2, ..., in, 1]
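A minimal plain-Python sketch of this step-function perceptron (no TensorFlow yet; the helper names dot, step, and perceptron_output are illustrative, not from any library):

# Weights include the bias as the last element; inputs get an implicit trailing 1 to match it.
def dot(v, w):
    return sum(v_i * w_i for v_i, w_i in zip(v, w))

def step(p):
    return 1.0 if p >= 0 else 0.0

def perceptron_output(weights, inputs):
    # weights = [w1, ..., wn, b]; inputs = [i1, ..., in]
    return step(dot(weights, inputs + [1.0]))

For example, perceptron_output([2, 2, -3], [1, 1]) returns 1.0; this is the pa perceptron on the next slide.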

Perceptrons(2)
Consider the three perceptrons pa = [2, 2, -3], po = [2, 2, -1], and p~ab = [-2, 2, -1]
pa implements AND, po implements OR, and p~ab implements (not a) and b, firing only when its first input is 0 and its second input is 1
The table shows the dot product for 0 and 1 inputs (the threshold output is 1 wherever the dot product is >= 0):

Input   pa            po            p~ab
[1,1]   2+2-3 = 1     2+2-1 = 3     -2+2-1 = -1
[1,0]   2+0-3 = -1    2+0-1 = 1     -2+0-1 = -3
[0,1]   0+2-3 = -1    0+2-1 = 1      0+2-1 = 1
[0,0]   0+0-3 = -3    0+0-1 = -1     0+0-1 = -1

AND-OR Decision Space
[Figure: the four inputs [0,0], [0,1], [1,0], [1,1] plotted against Input 1 and Input 2, with the AND boundary and the OR boundary drawn as lines separating the inputs each perceptron fires on.]

Training Perceptrons
Start with estimated weights and calculate the results from the training set of inputs
Use the error outputs to re-estimate the weights
Make successive passes until the weights converge to produce the correct output for the training set
A good training algorithm will converge rapidly
Successive passes for pa (a training sketch follows the table):

Pass   Weights          [1,1]            [1,0]             [0,1]             [0,0]
1      [1, 1, 0]        1+1+0 = 2        1+0+0 = 1         0+1+0 = 1         0+0+0 = 0
2      [1.5, 1.5, -1]   1.5+1.5-1 = 2    1.5+0-1 = 0.5     0+1.5-1 = 0.5     0+0-1 = -1
3      [2, 2, -2.5]     2+2-2.5 = 1.5    2+0-2.5 = -0.5    0+2-2.5 = -0.5    0+0-2.5 = -2.5
4      [2.5, 2.5, -3]   2.5+2.5-3 = 2    2.5+0-3 = -0.5    0+2.5-3 = -0.5    0+0-3 = -3
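One common way to make such passes is the classic perceptron learning rule; a plain-Python sketch under that assumption (the slides do not spell out the exact update used for the table above):

def perceptron_output(weights, inputs):   # as in the earlier sketch
    p = sum(w * i for w, i in zip(weights, inputs + [1.0]))
    return 1.0 if p >= 0 else 0.0

def train_perceptron(weights, training_set, rate=0.5, passes=4):
    # training_set is a list of (inputs, expected) pairs
    for _ in range(passes):
        for inputs, expected in training_set:
            error = expected - perceptron_output(weights, inputs)   # -1, 0, or +1
            extended = inputs + [1.0]                               # bias input is always 1
            weights = [w + rate * error * i for w, i in zip(weights, extended)]
    return weights

and_training = [([1, 1], 1.0), ([1, 0], 0.0), ([0, 1], 0.0), ([0, 0], 0.0)]
print(train_perceptron([1.0, 1.0, 0.0], and_training))

The learning rate of 0.5 and the pass count are arbitrary choices for the sketch; starting from [1, 1, 0], this run settles on [0.5, 0.5, -1.0], which fires only for the input [1, 1].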

Layers
More complicated neural networks take the output of one layer of perceptrons as input to the next (hidden) layer
Deep learning uses many-layered neural networks
Consider exclusive-or (xor), which is true if exactly one of its two operands is true
Logically, a xor b = (not (a and b)) and (a or b), i.e. a xor b = (not AND) and (OR)

Layers(2)
[Figure: the two-layer xor graph. Input layer: a and b. Layer 1: a and b feed both pa (AND) and po (OR). Layer 2: the outputs of pa and po feed p~ao, which fires only when pa's output is 0 and po's output is 1; the output of p~ao is a xor b.]
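A plain-Python sketch of this two-layer network, composed from the perceptrons defined earlier:

def perceptron_output(weights, inputs):   # as in the earlier sketch
    p = sum(w * i for w, i in zip(weights, inputs + [1.0]))
    return 1.0 if p >= 0 else 0.0

p_and = [2, 2, -3]    # pa
p_or  = [2, 2, -1]    # po
p_nao = [-2, 2, -1]   # p~ao: fires only when its first input is 0 and its second is 1

def xor(a, b):
    layer1 = [perceptron_output(p_and, [a, b]),
              perceptron_output(p_or,  [a, b])]
    return perceptron_output(p_nao, layer1)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor(a, b))   # prints 0.0, 1.0, 1.0, 0.0 for the four input pairs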

TensorFlow - Tensors
The central unit of data in TensorFlow is the tensor (e.g. the perceptron weights)
A tensor is a set of primitive values shaped into an array of any number of dimensions. A tensor's rank is its number of dimensions [rows, columns, layers, ...]
3                                  # rank 0 tensor; a scalar with shape []
[1., 2., 3.]                       # rank 1 tensor; a vector with shape [3]
[[1., 2., 3.], [4., 5., 6.]]       # rank 2 tensor; a matrix with shape [2,3]
[[[1., 2., 3.]], [[7., 8., 9.]]]   # rank 3 tensor with shape [2,1,3]

A [2,1,3] Tensor
[Figure: the rank 3 tensor above drawn as 2 layers by 1 row by 3 columns, holding the values 1, 2, 3 and 7, 8, 9.]
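Rank and shape can be checked directly; a small sketch using the TF 1.x API used throughout these slides:

import tensorflow as tf

t = tf.constant([[[1., 2., 3.]], [[7., 8., 9.]]])
sess = tf.Session()
print(sess.run(tf.rank(t)))    # 3
print(sess.run(tf.shape(t)))   # [2 1 3]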

Computational Graph
A series of TensorFlow operations arranged into a graph of nodes and edges
Each node takes zero or more tensors as inputs and produces a tensor as an output
A constant node takes no inputs, and outputs a value (tensor) it stores internally
Constant tensors with floating-point values are created with the tf.constant() method

A Two-node Computational Graph
import tensorflow as tf
node1 = tf.constant(3.0, dtype=tf.float32)
node2 = tf.constant(4.0) # also tf.float32 implicitly
print(node1, node2)
The final print statement displays the two zero-dimensional nodes as objects and produces
Tensor("Const:0", shape=(), dtype=float32) Tensor("Const_1:0", shape=(), dtype=float32)

Sessions
To evaluate the nodes, run the computational graph within a session
A session encapsulates the control and state of the TensorFlow runtime
Create a Session object and invoke its run method to evaluate the computational graph's nodes node1 and node2
sess = tf.Session()
print(sess.run([node1, node2]))
When the graph is evaluated, the result is a new [2] tensor: [3.0, 4.0]

Operations
Nodes can be combined using operations, producing a new node
Add the two constant nodes to produce a new node (and a new graph):
node3 = tf.add(node1, node2)
print("node3:", node3)
print("sess.run(node3):", sess.run(node3))
The last two print statements produce
node3: Tensor("Add:0", shape=(), dtype=float32)
sess.run(node3): 7.0

TensorBoard
TensorFlow provides a utility called TensorBoard that can display a picture of the computational graph
TensorBoard visualizes the graph as:
[Figure: the node1/node2/Add graph as rendered by TensorBoard.]
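The slides do not show how the graph is exported; one common approach with the TF 1.x API is a summary FileWriter (the "logs" directory name here is an arbitrary choice for the sketch):

import tensorflow as tf

node1 = tf.constant(3.0)
node2 = tf.constant(4.0)
node3 = tf.add(node1, node2)
sess = tf.Session()
writer = tf.summary.FileWriter("logs", sess.graph)   # write the graph for TensorBoard
writer.close()
# then run:  tensorboard --logdir logs   and open the reported URL in a browser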

Placeholders
The node3 graph always produces a constant result, but a graph can be parameterized to accept external inputs, known as placeholders
a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
adder_node = a + b # + provides a shortcut for tf.add(a, b)
adder_node acts like a function (or a lambda) which takes two input parameters (a and b) and performs an operation on them
The graph can be evaluated multiple times, for example using dictionary literals to define the placeholders by name
print(sess.run(adder_node, {a: 3, b: 4.5})) # a and b are scalar ([]-shaped) tensors
print(sess.run(adder_node, {a: [1, 3], b: [2, 4]})) # a and b are [2] tensors
Resulting output
7.5
[ 3.  7.]

adder_node in TensorBoard
[Figure: the a/b/adder_node graph as rendered by TensorBoard.]

Enhance the Graph
Make a computational graph more complex by adding another operation
add_and_triple = adder_node * 3.
print(sess.run(add_and_triple, {a: 3, b: 4.5}))
This code produces the output 22.5
Note that a and b are parameters to adder_node

Variables
Variables are defined with initial values and types
W = tf.Variable([2, 2, -3], dtype=tf.float32)
W is defined but is not yet set
After all global variables have been defined, get the initialization function and execute it using the run method of the session
init = tf.global_variables_initializer()
sess.run(init)

AND Perceptron
Define the input parameter x
x = tf.placeholder(tf.float32)
Define the and node
and_node = tf.to_float(tf.less_equal(0., tf.reduce_sum(W*x, 1)))
Compute the vector dot product of W and the parameter x (each row of x has the same shape [3] as W)
Compare the results element-wise to zero, producing True or False
Convert True or False to a float 1 or 0

Evaluation
The perceptron requires three inputs, the two operands and the bias input, which is always 1
Run the model with the four cases
print(sess.run(and_node, {x: [[0,0,1], [0,1,1], [1,0,1], [1,1,1]]}))
Resulting output
[ 0.  0.  0.  1.]
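Pulling the last few slides together, a self-contained sketch of the step-function AND perceptron (TF 1.x API; the ordering of the four test cases is an arbitrary choice):

import tensorflow as tf

W = tf.Variable([2, 2, -3], dtype=tf.float32)   # AND weights plus bias
x = tf.placeholder(tf.float32)                  # a batch of [a, b, 1] rows
and_node = tf.to_float(tf.less_equal(0., tf.reduce_sum(W * x, 1)))

sess = tf.Session()
sess.run(tf.global_variables_initializer())
print(sess.run(and_node, {x: [[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]]}))
# prints [ 0.  0.  0.  1.] - it fires only when both operands are 1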

Training
Start with estimated weights and calculate the results from a training set of inputs (inputs with known outputs)
Use the errors (known as the deltas) to determine new weights that will reduce the deltas in the training set
Make successive passes, modifying the weights each time until they converge to produce the correct output for the training set

Training(2)
Training operates in the realm of calculus (continuous functions), where one of the most effective tools for weight convergence is the gradient
If you are standing on the side of a hill (a continuous two-dimensional surface, a function of latitude and longitude), the gradient is the direction of the steepest ascent (or descent) from your position
Taking the gradient of the error function provides a guideline for a guess at the next set of weights

Training - Converting the Step Function
The step function used in the and_node is not continuous and does not have a derivative
Instead, we use the sigmoid (S-shaped) logistic function to give a fuzzy 0 (less than 0.5) or 1 (greater than 0.5)
and_node = tf.sigmoid(tf.reduce_sum(W*x, 1))

Training(3)
Create an error function that computes the sum of the squared errors from each of the training set outputs - this is the function to be minimized during the training
y = tf.placeholder(tf.float32)
diff = tf.reduce_sum(tf.square(and_node - y))
Get a gradient descent optimizer - the parameter is the rate of movement along the gradient for each step
opt = tf.train.GradientDescentOptimizer(0.1)
Get a function from the optimizer that minimizes the error function
train = opt.minimize(diff)

Train the Perceptron
Assign arbitrary values to the weights, then run the training for 200 passes - x is the training set, y is the expected output of each member of the training set
sess.run(tf.assign(W, [1, 1, -1]))
for i in range(200):
    sess.run(train, {x: [[0,0,1],[0,1,1],[1,0,1],[1,1,1]], y: [0, 0, 0, 1]})
Evaluate and display the trained weights
print(sess.run(W))
[ 1.92  1.92 -3.02]   (approximately)
print(sess.run(and_node, {x: [[0,0,1],[0,1,1],[1,0,1],[1,1,1]]}))
[ 0.05  0.25  0.25  0.69]   (approximately)
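The same training run as one self-contained sketch (TF 1.x API; the learning rate and pass count follow the previous slides):

import tensorflow as tf

W = tf.Variable([1., 1., -1.], dtype=tf.float32)
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

and_node = tf.sigmoid(tf.reduce_sum(W * x, 1))   # fuzzy perceptron output
diff = tf.reduce_sum(tf.square(and_node - y))    # sum of squared errors
train = tf.train.GradientDescentOptimizer(0.1).minimize(diff)

inputs = [[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]]
expected = [0., 0., 0., 1.]

sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(200):
    sess.run(train, {x: inputs, y: expected})

print(sess.run(W))                       # weights drift toward roughly [2, 2, -3]
print(sess.run(and_node, {x: inputs}))   # outputs pushed toward [0, 0, 0, 1]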

More Complicated Neural Networks
Each 5x5 digit image can provide 25 simple inputs to 26-dimensional perceptrons (25 weights plus a bias)
Each of the first layer's outputs can provide an input to each of the perceptrons in the next layer
The output might ultimately be an 11-dimensional vector such as [0,0,0,1,0,0,0,0,0,0,0] (e.g. a 3), where the 11th position indicates unclassifiable
There are interesting questions about how many neurons (perceptrons) in a layer are needed and how many layers are useful. Reductions in computational requirements without compromising classification are important.
[Figure: several 5x5 bitmap variations that should all be recognized as a 3.]
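An illustrative sketch of such a network in the TF 1.x API (the hidden-layer size of 10 and the random weight initialization are assumptions; the slides do not fix them):

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 25])      # a batch of flattened 5x5 images
W1 = tf.Variable(tf.random_normal([25, 10]))    # hidden layer: 10 perceptrons, 25 weights each
b1 = tf.Variable(tf.zeros([10]))                # plus a bias per perceptron
hidden = tf.sigmoid(tf.matmul(x, W1) + b1)

W2 = tf.Variable(tf.random_normal([10, 11]))    # output layer: digits 0-9 plus "unclassifiable"
b2 = tf.Variable(tf.zeros([11]))
output = tf.sigmoid(tf.matmul(hidden, W2) + b2)

# Training would proceed as before: a placeholder y holding the expected 11-dimensional
# vectors, a squared-error diff, and a GradientDescentOptimizer minimizing it.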

TensorFlow Conclusions
Provides an extensive platform for machine learning
Provides operations that match the concepts of neural networks
Suppresses the multi-dimensional computational detail in a natural way
Easy to install and use on Windows, Mac, or Linux