Homework 2 EM, Mixture Models, PCA, Duality

CMU 10-715: Machine Learning (Fall 2015)
OUT: Oct 5, 2015
DUE: Oct 19, 2015, 10:20 AM

Guidelines

The homework is due at 10:20 am on Monday, October 19. Each student will be given two late days that can be spent on any homeworks, but at most one late day per homework. Once you have used up your late days for the term, late homework submissions will receive no credit.

Submit both a paper copy and an electronic copy through the submission website: https://autolab.cs.cmu.edu/courses/10715-f15. You can sign in using your Andrew credentials. Make sure to edit your account information and choose a nickname/handle; this handle will be used to display your results for any competition-style questions on the class leaderboard.

Some questions will be autograded. Please make sure to carefully follow the submission instructions for these questions.

We recommend that you typeset your solutions using software such as LaTeX. If you write by hand, ensure your handwriting is clear and legible. The TAs will not invest undue effort to decrypt bad handwriting.

Programming guidelines:

Octave: You must write submitted code in Octave. Octave is a free scientific programming language with syntax similar to that of MATLAB. Installation instructions can be found on the Octave website. (You can develop your code in MATLAB if you prefer, but you must test it in Octave before submitting, or it may fail in the autograder.)

Autograding: This problem is autograded using the CMU Autolab system. The code you write will be executed remotely against a suite of tests, and the results used to automatically assign your grade. To make sure your code executes correctly on our servers, avoid using libraries that are not present in the basic Octave install.

Submission instructions: For each programming question you will be given a function signature. You will be asked to write a single Octave function that satisfies the signature. In the code handout linked above, we have provided a single folder containing stubs for each of the functions you need to complete. Do not modify the structure of this directory or rename these files. Complete each of these functions, then compress the directory as a tar file and submit it to Autolab online. You may submit code as many times as you like. When you download the files, you should confirm that the autograder is functioning correctly by compressing and submitting the directory of stubs provided; this should result in a grade of zero for all questions.

SUBMISSION CHECKLIST
- Submission executes on our machines in less than 20 minutes.
- Submission is smaller than 2000K.
- Submission is a .tar file.
- Submission returns matrices of the exact dimensions specified.

1 An EM algorithm for a Mixture of Bernoullis [Eric; 25 pts]

In this section, you will derive an expectation-maximization (EM) algorithm to cluster black and white images. The inputs x^{(i)} can be thought of as vectors of binary values corresponding to black and white pixel values, and the goal is to cluster the images into groups. You will be using a mixture of Bernoullis model to tackle this problem. For the sake of brevity, you do not need to substitute in previously derived expressions in later problems. For example, beyond question 1.1.1 you may use P(x^{(i)} | p^{(k)}) in your answers.

1.1 Mixture of Bernoullis

1. (2pts) Consider a vector of binary random variables, x ∈ {0, 1}^D. Assume each variable x_d is drawn from a Bernoulli(p_d) distribution, so P(x_d = 1) = p_d. Let p ∈ (0, 1)^D be the resulting vector of Bernoulli parameters. Write an expression for P(x | p).

2. (2pts) Now suppose we have a mixture of K Bernoulli distributions: each vector x^{(i)} is drawn from some vector of Bernoulli random variables with parameters p^{(k)}; we will call this Bernoulli(p^{(k)}). Let p = {p^{(1)}, ..., p^{(K)}}. Assume a distribution π(k) over the selection of which set of Bernoulli parameters p^{(k)} is chosen. Write an expression for P(x^{(i)} | p, π).

3. (2pts) Finally, suppose we have inputs X = {x^{(i)}}_{i=1,...,N}. Using the above, write an expression for the log-likelihood of the data X, log P(X | π, p).

1.2 Expectation step

1. (4pts) Now we introduce the latent variables for the EM algorithm. Let z^{(i)} ∈ {0, 1}^K be an indicator vector, such that z_k^{(i)} = 1 if x^{(i)} was drawn from Bernoulli(p^{(k)}), and 0 otherwise. Let Z = {z^{(i)}}_{i=1,...,N}. What is P(z^{(i)} | π)? What is P(x^{(i)} | z^{(i)}, p, π)?

2. (2pts) Using the above two quantities, derive the likelihood of the data and the latent variables, P(Z, X | π, p).

3. (5pts) Let η(z_k^{(i)}) = E[z_k^{(i)} | x^{(i)}, π, p]. Show that

\[
\eta(z_k^{(i)}) = \frac{\pi_k \prod_{d=1}^{D} \big(p_d^{(k)}\big)^{x_d^{(i)}} \big(1 - p_d^{(k)}\big)^{1 - x_d^{(i)}}}{\sum_{j=1}^{K} \pi_j \prod_{d=1}^{D} \big(p_d^{(j)}\big)^{x_d^{(i)}} \big(1 - p_d^{(j)}\big)^{1 - x_d^{(i)}}}
\]

Let p, π be the new parameters that we would like to maximize, and let p', π' be the parameters from the previous iteration (under which η is computed). Use this to derive the following final expression for the E step of the expectation-maximization algorithm:

\[
E\big[\log P(Z, X \mid p, \pi) \mid X, p', \pi'\big]
= \sum_{i=1}^{N} \sum_{k=1}^{K} \eta(z_k^{(i)}) \Big[ \log \pi_k + \sum_{d=1}^{D} \Big( x_d^{(i)} \log p_d^{(k)} + \big(1 - x_d^{(i)}\big) \log\big(1 - p_d^{(k)}\big) \Big) \Big]
\]

1.3 Maximization step

1. (4pts) We need to maximize the above expression with respect to π and p. First, show that the value of p that maximizes the E step is

\[
p^{(k)} = \frac{\sum_{i=1}^{N} \eta(z_k^{(i)})\, x^{(i)}}{N_k},
\qquad \text{where } N_k = \sum_{i=1}^{N} \eta(z_k^{(i)}).
\]

2. (4pts) Show that the value of π that maximizes the E step is

\[
\pi_k = \frac{N_k}{\sum_{k'} N_{k'}}.
\]

The exponential families notation may be useful. Alternatively, you can use Lagrange multipliers (a sketch of that setup follows).
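The closing hint can be made concrete with a short setup sketch. The multiplier name β is our illustrative notation, not the handout's, and the remaining algebra is the exercise: maximize the π-dependent part of the E-step objective subject to Σ_k π_k = 1.

    % Lagrange-multiplier setup for the \pi update (sketch only; \beta is
    % an illustrative name for the multiplier).
    \[
    \mathcal{L}(\pi, \beta)
      = \sum_{i=1}^{N} \sum_{k=1}^{K} \eta(z_k^{(i)}) \log \pi_k
      + \beta \Big( 1 - \sum_{k=1}^{K} \pi_k \Big)
    \]
    % Stationarity, \partial\mathcal{L}/\partial\pi_k = N_k/\pi_k - \beta = 0,
    % combined with the constraint, recovers the stated update for \pi_k.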

2 Clustering Images of Numbers [Eric; 25 pts]

In this section you will use the above algorithm to cluster images of numbers from the MNIST dataset. Each input is a binary vector corresponding to black and white pixels, and is a flattened version of the 28x28 pixel image. We will use the following conventions:

- N is the number of datapoints, D is the dimension of each input, and K is the number of clusters.
- Xs is an N x D matrix of the input data, where row i is the pixel data for picture i.
- p is a K x D matrix of Bernoulli parameters, where row k is the vector of parameters for the kth mixture of Bernoullis.
- mix_p is a K x 1 vector containing the distribution over the various mixtures.
- eta is an N x K matrix containing the results of the E step, so eta[i,k] = η(z_k^{(i)}).
- clusters is an N x 1 vector containing the final cluster labels of the input data. Each label is a number from 1 to K.

2.1 Programming (20pts)

1. Implement the E step of the algorithm within [eta] = Estep(Xs, p, mix_p), saving your calculated values within eta.

2. Implement the M step of the algorithm within [p, mix_p] = Mstep(Xs, model, alpha1, alpha2). The p and mix_p returned by this function should contain the new values that maximize the E step. alpha1 and alpha2 are Dirichlet smoothing parameters (explained below).

3. Implement [clusters] = MoBlabels(Xs, p, mix_p). This function will take in the estimated parameters and return the resulting labels that cluster the data.

4. Some hints and tips:

- The autograder will grade the accuracy of your implementation separately for the E-step (4pts), the M-step (4pts), and the M-step with smoothing (2pts). Finally, it will run your implementation for several iterations on a subset of the MNIST dataset (8pts).
- A small subset of the MNIST dataset is provided for your convenience.
- Use the log operator to make your calculations more numerically stable; in particular, pay attention to the calculation of η(z_k^{(i)}). (A log-space sketch of the E step appears after this section.)
- You will need to avoid zeros in π and p, or else you will take log(0) = −∞. Use Dirichlet prior smoothing with parameters α_1, α_2 when updating these variables:

\[
p^{(k)} = \frac{\sum_{i=1}^{N} \eta(z_k^{(i)})\, x^{(i)} + \alpha_1}{N_k + \alpha_1 D},
\qquad
\pi_k = \frac{N_k + \alpha_2}{\sum_{k'} N_{k'} + \alpha_2 K}.
\]

- Initialize your parameters p by randomly sampling from a Uniform(0, 1) distribution and normalizing each p^{(k)} to have unit length, and set π_k = 1/K.
- Running your implementation on the MNIST dataset with K = 20 clusters and α_1 = α_2 = 10^{-8} for 20 iterations should give you results similar to the following (our reference code runs in approximately 10 seconds):

[Results figure omitted from this transcription.]
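As referenced in the hints above, here is a minimal log-space sketch of the E step. It is illustrative only, not the reference solution; every line beyond the given signature is our own choice.

    function [eta] = Estep(Xs, p, mix_p)
      % log P(x^(i) | p^(k)) for every (i, k) pair, as one matrix product
      logpx = Xs * log(p') + (1 - Xs) * log(1 - p');      % N x K
      logw  = logpx + repmat(log(mix_p'), rows(Xs), 1);   % add log pi_k
      % normalize each row with the log-sum-exp trick to avoid underflow
      m    = max(logw, [], 2);
      logZ = m + log(sum(exp(logw - repmat(m, 1, columns(logw))), 2));
      eta  = exp(logw - repmat(logZ, 1, columns(logw)));
    end

Each row of eta then sums to one, which is a quick correctness check before submitting.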

2.2 Analysis

1. (5pts) For each cluster, reshape the pixels into a 28x28 matrix and print the resulting grayscale images. What do you see? Explain, and include one such image in your writeup. See the Octave documentation page on representing images for help with printing the matrix, or use the provided helper function show_clusters(p, a, b), which will print the mixtures in p in an a x b grid.

2. (2pts) Using your implemented MoBlabels function, cluster the data. Using the true labels given in ytrain, how many unique digits does each cluster typically have? Are there any clusters that picked out exactly one digit?

3 Kernel PCA [Fish; 15 pts]

Principal component analysis (PCA) is used to emphasize variation and is often used to make data easy to explore and visualize. Suppose we have data x_1, x_2, ..., x_N ∈ R^d with zero mean. PCA aims at finding a new coordinate system such that the greatest variance under some projection of the data comes to lie on the first coordinate, the second greatest variance on the second coordinate, and so on. The basis vectors of the new coordinate system are the eigenvectors of the covariance matrix, i.e.,

\[
\frac{1}{N} \sum_{i=1}^{N} x_i x_i^T = \sum_{i=1}^{d} \lambda_i v_i v_i^T, \qquad (1)
\]

where v_1, v_2, ..., v_d are orthogonal (⟨v_i, v_j⟩ = 0 for i ≠ j). We can then plot our data points in the new coordinate system: the position of each data point is derived by projecting x onto the basis vectors v_1, v_2, ..., v_d.
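As a concrete baseline before kernelizing, here is a small Octave sketch of plain PCA exactly as in equation (1); the synthetic data and all names are illustrative.

    N = 500; d = 3;
    X = randn(N, d) * diag([3, 1, 0.1]);    % synthetic, anisotropic data
    X = X - repmat(mean(X), N, 1);          % enforce the zero-mean assumption
    C = (X' * X) / N;                       % the d x d covariance matrix
    [V, L] = eig(C);                        % columns of V are eigenvectors v_i
    [~, order] = sort(diag(L), 'descend');  % sort by variance lambda_i
    V = V(:, order);
    scores = X * V;                         % coordinates in the new system

The kernel trick developed below recovers such coordinates without ever forming the covariance matrix in the transformed space.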

3.1 Kernel PCA

Often we will want to apply a linear or non-linear transformation to the data so that it is projected into a higher-dimensional space. Suppose there is a transformation function φ : R^d → R^l. We map all the data to the new space through this function, obtaining φ(x_1), φ(x_2), ..., φ(x_N). We again want to do PCA on the transformed data in the new space. How can we do this?

1. (2pts) Write out the covariance matrix, C, for φ(x_1), φ(x_2), ..., φ(x_N). (Define φ̄(x) = (1/N) Σ_{j=1}^N φ(x_j).)

2. (2pts) Now we want to find the basis vectors of the new space by solving for the eigenvectors of C. Using the definition of an eigenvector, λv = Cv, explain why v is a linear combination of (φ(x_1) − φ̄(x)), (φ(x_2) − φ̄(x)), ..., (φ(x_N) − φ̄(x)). Since v is a linear combination of these vectors, we can write v = Σ_{i=1}^N α_i (φ(x_i) − φ̄(x)).

3. (2pts) Before starting to derive α and find v, we introduce the kernel function. A kernel function has the form k(x_i, x_j) = ⟨φ(x_i), φ(x_j)⟩. Notice that we do not need to know the function φ to calculate k(x_i, x_j); we can simply define a function of x_i and x_j satisfying k(x_i, x_j) = k(x_j, x_i). A classic example is the radial basis function (RBF) kernel, k(x_i, x_j) = exp(−‖x_i − x_j‖²/(2σ²)). In order to use these kernel functions, we need to get rid of all the φ(x_i) and replace them with the kernel function. A kernel matrix is defined as K ∈ R^{N×N} with K_ij = k(x_i, x_j). Since we are dealing with non-centered data, we first use a modified kernel matrix K̃ with K̃_ij = ⟨φ(x_i) − φ̄(x), φ(x_j) − φ̄(x)⟩. Using the results of questions 1 and 2 and the definition of eigenvectors, show that

\[
N \lambda \tilde{K} \alpha = \tilde{K}^2 \alpha. \qquad (2)
\]

4. (2pts) Show that the solutions α of

\[
N \lambda \alpha = \tilde{K} \alpha \qquad (3)
\]

are also solutions of equation (2). Thus, by solving for the eigenvectors of K̃, we get α.

5. (3pts) Show that

\[
\tilde{K} = (I - ee^T) K (I - ee^T), \qquad (4)
\]

where e is the N-dimensional vector with every entry 1/√N.

6. (2pts) After obtaining α, we get the un-normalized basis vector v. To normalize it, by what factor do you need to multiply v?

7. (2pts) For a data point x, what is its position in the normalized new space? Explain why you can get the new coordinate without explicitly calculating φ(x).
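A quick numerical sanity check of equation (4), offered as a hedged illustration (the RBF kernel and all constants are arbitrary choices): a properly centered kernel matrix is the Gram matrix of mean-centered features, so its row and column sums must be zero.

    N = 50; sigma = 1.0;
    X  = randn(N, 2);                        % arbitrary 2-D inputs
    sq = sum(X.^2, 2);
    D2 = repmat(sq, 1, N) + repmat(sq', N, 1) - 2 * (X * X');
    K  = exp(-D2 / (2 * sigma^2));           % K_ij = k(x_i, x_j), RBF kernel
    P  = eye(N) - ones(N) / N;               % I - ee', with e_i = 1/sqrt(N)
    Kt = P * K * P;                          % centered kernel, equation (4)
    disp(max(abs(sum(Kt, 2))));              % ~0, up to rounding error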

4 Huber function and its application in solving a dual problem [Fish; 35pts]

4.1 Huber function

Define a function

\[
B(x, d) = \min_{\lambda \ge 0} \; \lambda + \frac{x^2}{\lambda + d}, \qquad (5)
\]

where d > 0.

1. (3pts) Show that

\[
B(x, d) =
\begin{cases}
x^2 / d & \text{if } |x| \le d \\
2|x| - d & \text{if } |x| > d
\end{cases}
\qquad (6)
\]

with minimizer λ* = max(0, |x| − d). B is often called a Huber function.

2. (3pts) Is the function B convex in x? (Assume d is a fixed constant.)

3. (3pts) Derive the gradient of the function B at any given point (x, d). (Remember d > 0.)

4. (3pts) Function B is often used as a penalty term in classification or regression problems, which have the form

\[
\min_x \; L(x) + B(x, d), \qquad (7)
\]

where L is the loss function and d is a parameter with positive value. Describe how the parameter d affects the penalty term B at the solution.

4.2 An application of the Huber loss to solving a dual problem

Consider the optimization problem

\[
\min_x \; \sum_{i=1}^{n} \Big( \frac{1}{2} d_i x_i^2 + r_i x_i \Big) \qquad (8)
\]
\[
\text{s.t. } a^T x = 1, \quad x_i \in [-1, 1] \text{ for } i = 1, 2, \ldots, n. \qquad (9)
\]

1. (3pts) Show that the problem has strictly feasible solutions if and only if ‖a‖_1 > 1.

2. (3pts) Re-express x_i ∈ [−1, 1] as x_i² ≤ 1, for i = 1, 2, ..., n. Write down the dual of this problem.

3. (6pts) Write down the KKT conditions for this problem. Do they characterize the optimal solution?

4. (5pts) Show that we can further reduce the dual problem to the one-dimensional convex problem

\[
\min_{\mu} \; \mu + \frac{1}{2} \sum_{i=1}^{n} B(r_i + \mu a_i, d_i), \qquad (10)
\]

where B is the Huber function defined in the previous section.

5. (3pts) Describe an algorithm to solve this dual problem. What is the time complexity of your algorithm?

6. (3pts) How can you recover an optimal primal solution x after solving the dual?
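To make the reduction in (10) concrete, here is a small hedged Octave sketch on a random instance. It is an illustration only (question 5 still asks you to design and analyze your own algorithm), it assumes the closed form (6) for B, and fminbnd is used purely for convenience.

    huber = @(x, d) (abs(x) <= d) .* (x.^2 ./ d) ...
                  + (abs(x) >  d) .* (2 * abs(x) - d);    % closed form (6)
    n = 10;
    a = randn(n, 1); r = randn(n, 1);
    dvec = rand(n, 1) + 0.1;                  % the d_i, kept strictly positive
    g = @(mu) mu + 0.5 * sum(huber(r + mu * a, dvec));    % objective of (10)
    mu_star = fminbnd(g, -100, 100);          % minimize the 1-D convex dual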
