MNIST Example
Kailai Xu
September 8, 2017

This note is one of a series on deep learning. The note and its code are based on [1].

The MNIST classification example is a classic illustration of deep learning in practice. In what follows, we assume the reader can infer the corresponding TensorFlow function from each function name.

Algorithm 1 MNIST Classification Algorithm
procedure MNIST classification
    Input:
        $N$: number of training instances
        $S$: instance flat size
        $C$: number of possible classes
        $x \in \mathbb{R}^{N \times S}$: training data
        $y \in \mathbb{R}^{N \times C}$: one-hot labels
        $W \in \mathbb{R}^{S \times C}$: weights
        $b \in \mathbb{R}^{C}$: bias
    $l \leftarrow xW + b$
    $y_{pred} \leftarrow$ argmax(softmax($l$), dimension = 1)
    $y_{true} \leftarrow$ argmax($y$, dimension = 1)
    $c \leftarrow$ softmax_cross_entropy_with_logits($l$, labels = $y$)
    cost $\leftarrow$ reduce_mean($c$)
    optimizer $\leftarrow$ GradientDescentOptimizer(cost)    ▷ Build optimizer.
function Performance measure
    $z \leftarrow$ equal($y_{pred}$, $y_{true}$)
    acc $\leftarrow$ reduce_mean(cast($z$, float32))

The function that triggers training looks as follows.

Algorithm 2 TensorFlow Run
procedure tfrun
    Create a session: s $\leftarrow$ Session()
    Initialize all the variables: s.run(initialize_all_variables)
    for i = 1, 2, ... do
        Generate a new batch x, y
        Create feed dictionary feed_dict
        Run the session: s.run(optimizer, feed_dict)
        Report accuracy    ▷ Performance measure.

Besides the accuracy, other techniques help to visualize the results. For example, we can plot the confusion matrix.

[Figure: confusion matrix]

It is also helpful to plot the weights. The left panel shows the weights W after 10 iterations, the right panel after 100 iterations. A pattern is clearly visible in the weights.

[Figure: weights W after 10 iterations (left) and after 100 iterations (right)]
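To make the confusion matrix mentioned above concrete, here is a minimal sketch in plain Python. It assumes the same convention as the plot labels in the code section (rows are true classes, columns are predicted classes). The helper names `confusion_matrix_counts` and `accuracy_from_confusion` and the toy labels are hypothetical and are not part of the original note.

```python
def confusion_matrix_counts(y_true, y_pred, num_classes):
    """Return a num_classes x num_classes table of counts.

    Entry [t][p] is the number of samples whose true class is t and
    whose predicted class is p, so correct predictions lie on the diagonal.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def accuracy_from_confusion(cm):
    """Accuracy is the diagonal sum divided by the total count."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total


# Hypothetical labels for a 3-class toy problem (not MNIST data).
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 0]
toy_cm = confusion_matrix_counts(toy_true, toy_pred, num_classes=3)
for row in toy_cm:
    print(row)
print("accuracy:", accuracy_from_confusion(toy_cm))
```

Off-diagonal entries show which pairs of classes are confused with each other, which a single accuracy number cannot reveal.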

Code

# This listing targets the TensorFlow 1.x API (tf.placeholder, tf.Session).
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from sklearn.metrics import confusion_matrix

# data import
from tensorflow.examples.tutorials.mnist import input_data
data = input_data.read_data_sets("data/mnist/", one_hot=True)

# visualize functions
def plot_weights():
    # Get the values for the weights from the TensorFlow variable.
    w = session.run(W)

    # Get the lowest and highest values for the weights.
    # This is used to correct the colour intensity across
    # the images so they can be compared with each other.
    w_min = np.min(w)
    w_max = np.max(w)

    # Create figure with 3x4 sub-plots,
    # where the last 2 sub-plots are unused.
    fig, axes = plt.subplots(3, 4)
    fig.subplots_adjust(hspace=0.3, wspace=0.3)

    for i, ax in enumerate(axes.flat):
        # Only use the weights for the first 10 sub-plots.
        if i < 10:
            # Get the weights for the i'th digit and reshape it.
            # Note that w.shape == (img_size_flat, 10)
            image = w[:, i].reshape((28, 28))

            # Set the label for the sub-plot.
            ax.set_xlabel("weights: {0}".format(i))

            # Plot the image.
            ax.imshow(image, vmin=w_min, vmax=w_max, cmap='seismic')

        # Remove ticks from each sub-plot.
        ax.set_xticks([])
        ax.set_yticks([])

    plt.savefig('weights.png')

def print_confusion_matrix():
    # Get the true classifications for the test-set.
    cls_true = np.argmax(data.test.labels, 1)

    # Get the predicted classifications for the test-set.
    # This relies on the global feed_dict holding the test set.
    cls_pred = session.run(y_pred, feed_dict=feed_dict)

    # Get the confusion matrix using sklearn.
    cm = confusion_matrix(y_true=cls_true, y_pred=cls_pred)

    # Print the confusion matrix as text.
    # print(cm)

    # Plot the confusion matrix as an image.
    plt.imshow(cm, interpolation='nearest', cmap=plt.cm.Blues)

    # Make various adjustments to the plot.
    plt.tight_layout()
    # plt.colorbar()
    tick_marks = np.arange(C)
    plt.xticks(tick_marks, range(C))
    plt.yticks(tick_marks, range(C))
    plt.xlabel('predicted')
    plt.ylabel('true')

# construct graph
N = None     # batch size is left unspecified
S = 28*28    # instance flat size
C = 10       # number of classes

x = tf.placeholder(tf.float32, [N, S])
y = tf.placeholder(tf.float32, [N, C])
W = tf.Variable(tf.zeros([S, C]))
b = tf.Variable(tf.zeros([C]))

l = tf.matmul(x, W) + b
# axis replaces the deprecated argument name dimension used in Algorithm 1.
y_pred = tf.argmax(tf.nn.softmax(l), axis=1)
y_true = tf.argmax(y, axis=1)
# TensorFlow 1.x requires logits and labels to be passed by keyword.
c = tf.nn.softmax_cross_entropy_with_logits(logits=l, labels=y)
cost = tf.reduce_mean(c)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.5).minimize(cost)

z = tf.equal(y_true, y_pred)
acc = tf.reduce_mean(tf.cast(z, tf.float32))

# run batches
session = tf.Session()
# tf.global_variables_initializer replaces the deprecated
# tf.initialize_all_variables named in Algorithm 2.
session.run(tf.global_variables_initializer())

for i in range(1000):
    # Train on a mini-batch of 100 images.
    x_batch, y_true_batch = data.train.next_batch(100)
    feed_dict = {
        x: x_batch,
        y: y_true_batch,
    }
    session.run(optimizer, feed_dict=feed_dict)

    # Report the accuracy on the full test set.
    x_batch, y_true_batch = data.test.images, data.test.labels
    feed_dict = {
        x: x_batch,
        y: y_true_batch,
    }
    acc_val = session.run(acc, feed_dict=feed_dict)
    print('accuracy at iter %d: %f' % (i, acc_val))
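As a sanity check on the graph above, the following sketch reproduces the forward pass, the softmax cross-entropy, and one gradient-descent step in plain Python, without TensorFlow. It is an illustration under stated assumptions rather than the note's own method: the toy sizes (S = 4, C = 3), the input `x`, the label `y`, and the helper names are hypothetical, and the update uses the standard gradient of softmax cross-entropy with respect to the logits, softmax(l) - y, on a single sample instead of a mini-batch mean.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, W, b):
    """Logits l = xW + b for one flattened input x of length S."""
    S, C = len(W), len(b)
    return [sum(x[s] * W[s][c] for s in range(S)) + b[c] for c in range(C)]


def cross_entropy(logits, one_hot):
    """Softmax cross-entropy for one sample with a one-hot label."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))


def sgd_step(x, one_hot, W, b, learning_rate):
    """One gradient-descent update on a single sample (modifies W, b in place)."""
    probs = softmax(forward(x, W, b))
    grad_l = [p - t for p, t in zip(probs, one_hot)]  # d(loss)/d(logits)
    for s in range(len(W)):
        for c in range(len(b)):
            W[s][c] -= learning_rate * x[s] * grad_l[c]
    for c in range(len(b)):
        b[c] -= learning_rate * grad_l[c]


# Hypothetical toy problem: S = 4 input values, C = 3 classes.
S, C = 4, 3
x = [0.0, 0.5, 1.0, 0.0]
y = [0.0, 1.0, 0.0]
W = [[0.0] * C for _ in range(S)]  # zero initialization, as in the listing
b = [0.0] * C

loss_before = cross_entropy(forward(x, W, b), y)  # equals ln(C) at zero init
sgd_step(x, y, W, b, learning_rate=0.5)           # learning rate from the listing
loss_after = cross_entropy(forward(x, W, b), y)
print("loss before: %f, after: %f" % (loss_before, loss_after))
```

With zero-initialized weights and bias, every class receives probability 1/C, so the loss before any training is ln C. For MNIST with C = 10 this is ln 10, about 2.30, which is a useful reference value when checking the first reported cost.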

References

[1] TensorFlow-Tutorials/01_Simple_Linear_Model.ipynb at master, Hvass-Labs/TensorFlow-Tutorials. TensorFlow-Tutorials/blob/master/01_Simple_Linear_Model.ipynb. (Accessed on 09/08/2017).
