Midterm for CSC321, Intro to Neural Networks
Winter 2018, night section
Tuesday, March 6, 6:10-7pm

Name:
Student number:

This is a closed-book test. It is marked out of 15 marks. Please answer ALL of the questions. Here is some advice:

- The questions are NOT arranged in order of difficulty, so you should attempt every question.
- Questions that ask you to briefly explain something only require short (1-3 sentence) explanations. Don't write a full page of text. We're just looking for the main idea.
- None of the questions require long derivations. If you find yourself plugging through lots of equations, consider giving less detail or moving on to the next question.
- Many questions have more than one right answer.

Final mark: / 15

1. [2pts] When tuning hyperparameters, it's important to use a validation set.

[1pt] Briefly explain what can go wrong if you tune hyperparameters on the training set.

[1pt] Briefly explain what can go wrong if you tune them on the test set.

2. [1pt] Give the definition of a convex function. I.e., let f be a scalar-valued function that takes a vector x as input. The definition is that f is convex if and only if...
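For reference, one standard way to complete this definition (here $x$ and $y$ are arbitrary points in the domain of $f$):

    $f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y)$ for all $x, y$ and all $\lambda \in [0, 1]$.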

3. [3pts] Consider a layer of a multilayer perceptron with logistic activation function $\sigma$:

    $z_i = \sum_j w_{ij} x_j + b_i$
    $h_i = \sigma(z_i)$

The two layers have dimensions $D_1$ and $D_2$, respectively. Let $M$ denote the mini-batch size. Recall the backprop equations for a multilayer perceptron:

    $\bar{z}_i = \sigma'(z_i)\, \bar{h}_i$
    $\bar{x}_j = \sum_i \bar{z}_i w_{ij}$
    $\bar{w}_{ij} = \bar{z}_i x_j$

Your job is to write NumPy code to implement the backward pass for this layer. Assume the following NumPy arrays have already been computed:

- X, an $M \times D_1$ matrix representing the $x_j$ values for a whole mini-batch. (I.e., each row contains the activations for one training example.) The matrices Z and H are defined analogously.
- W, a $D_2 \times D_1$ matrix representing the weights. I.e., the $(i, j)$ entry of W is $w_{ij}$.
- H_bar, the matrix of error signals $\bar{h}_i$, the same size as H.

Also, assume you are given a function sigma_prime which computes the derivatives of the logistic function, elementwise.

Write NumPy code that computes arrays Z_bar, X_bar, and W_bar, representing error signals for the corresponding variables. W_bar should be the average of the derivatives over the mini-batch. Your code should be vectorized, i.e. do not use for loops. This can be done in three lines, one for each backprop equation.
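One possible vectorized solution is sketched below, assuming the arrays X, Z, W, H_bar and the function sigma_prime are defined exactly as in the question; other equivalent vectorizations are acceptable.

    Z_bar = sigma_prime(Z) * H_bar        # z_bar_i = sigma'(z_i) * h_bar_i, elementwise; shape (M, D2)
    X_bar = Z_bar.dot(W)                  # x_bar_j = sum_i z_bar_i * w_ij; (M, D2) @ (D2, D1) -> (M, D1)
    W_bar = Z_bar.T.dot(X) / X.shape[0]   # w_bar_ij = z_bar_i * x_j, averaged over the mini-batch; (D2, D1)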

4. [1pt] TRUE or FALSE: in logistic regression, if every training example is classified correctly, then the cost is zero. Briefly explain your answer.

5. [1pt] Consider the multilayer perceptron shown on the left. All units use the logistic activation function, and there are no biases. On the right-hand diagram, give the weights for an equivalent network (i.e. one which computes the same function). You don't need to explain your answer.

6. [2pts] Briefly explain two reasons to use automatic differentiation rather than finite differences to compute the gradient for a neural network during training.

7. [1pt] Alice and Barbara are trying to redesign the LeNet conv net architecture to reduce the number of weights. Alice wants to reduce the number of feature maps in the first convolution layer. Barbara wants to reduce the number of hidden units in the last layer before the output. Whose approach is better? Why?

8. [2pts] Recall the neural language model architecture from Assignment 1. Suppose there are 100 words in the dictionary, the embedding dimension is 30, the context length is 3 words, and the hidden layer has 60 units. You don't need to explain your answer, but doing so may help you get partial credit.

(a) [1pt] How many parameters are needed for the embedding layer? (You may assume this layer has no biases.)

(b) [1pt] How many weights and biases are needed for the hidden layer?

(A short counting sketch for this question appears after question 9.)

9. [2pts] Recall that a plateau is a flat region in the cost function. Give two examples of plateaux (plural of plateau) that can occur in neural net training, and briefly explain why they are plateaux.
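As a sanity check for question 8, here is a hypothetical counting sketch in Python. It assumes the Assignment 1 style architecture in which the three context words share a single embedding table and their embeddings are concatenated into the input of the hidden layer; if the architecture differs, the counts change accordingly.

    vocab_size, embed_dim, context_len, hidden_units = 100, 30, 3, 60

    embedding_params = vocab_size * embed_dim                  # one shared 100 x 30 embedding table
    hidden_weights = (context_len * embed_dim) * hidden_units  # concatenated embeddings (90-dim) into 60 hidden units
    hidden_biases = hidden_units                               # one bias per hidden unit
    print(embedding_params)                                    # part (a): parameters in the embedding layer
    print(hidden_weights + hidden_biases)                      # part (b): weights plus biases in the hidden layer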