FALL 2018 MATH 4211/6211 Optimization Homework 1


This homework assignment is open to the textbook, reference books, slides, and online resources, excluding any direct solution to the problems (such as a solution manual). Copying others' solutions or programs is strictly prohibited and will result in a grade of 0 for all involved students. Please type your answers in LaTeX and submit a single PDF file on iCollege before the due time. Do not include your name anywhere in the submitted PDF file; instead, name your PDF file hw1_123456789.pdf (replace 123456789 with your own Panther ID number). It is recommended to use notation consistent with the lectures. By default, all vectors are treated as column vectors.

Problem 1. (1 point) Let $x = (x_1, \dots, x_n)^T \in \mathbb{R}^n$. The $p$-norm ($p \ge 1$) of $x$ is defined by $\|x\|_p = \left( \sum_{i=1}^n |x_i|^p \right)^{1/p}$. The standard Euclidean norm is $\|x\|_2$ (often denoted by $\|x\|$ without the subscript 2). Prove the following statements:

- $\|x\| = \sqrt{x^T x}$ for all $x \in \mathbb{R}^n$;
- $\|x\| \le \|x\|_1$ for all $x \in \mathbb{R}^n$.

For MATH 6211, also prove $\|x\|_1 \le \sqrt{n}\, \|x\|$ for all $x \in \mathbb{R}^n$.

Proof. By the definition of the standard Euclidean norm, $\|x\| = \|x\|_2 = \sqrt{\sum_{i=1}^n |x_i|^2}$. On the other hand, the inner product of $x$ with itself is $x^T x = (x_1)^2 + (x_2)^2 + \cdots + (x_n)^2 = \sum_{i=1}^n |x_i|^2$. Hence $\|x\| = \sqrt{x^T x}$.

Proof. Since $\|x\|_1 = \sum_{i=1}^n |x_i|$, we have
$$\|x\|_1^2 = \Big( \sum_{i=1}^n |x_i| \Big)^2 = \sum_{i=1}^n |x_i|^2 + \sum_{i \ne j} |x_i|\, |x_j| \ge \sum_{i=1}^n |x_i|^2 = \|x\|^2.$$
Therefore $\|x\| \le \|x\|_1$. Moreover, since $2|x_i||x_j| \le |x_i|^2 + |x_j|^2$ for any $x_i, x_j \in \mathbb{R}$, we have
$$\sum_{i \ne j} |x_i|\, |x_j| = 2 \sum_{i < j} |x_i|\, |x_j| \le \sum_{i < j} \big( |x_i|^2 + |x_j|^2 \big) = (n-1) \sum_{i=1}^n |x_i|^2,$$
where the last equality holds because each $|x_i|^2$ appears $n-1$ times in the sum. Therefore
$$\|x\|_1^2 = \sum_{i=1}^n |x_i|^2 + \sum_{i \ne j} |x_i|\, |x_j| \le n \sum_{i=1}^n |x_i|^2 = n \|x\|^2,$$
from which it follows that $\|x\|_1 \le \sqrt{n}\, \|x\|$.
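As an optional sanity check (not part of the required proof), the three statements can be tested numerically on random vectors. The NumPy sketch below is illustrative only; the dimension, seed, and sample count are arbitrary choices.

```python
import numpy as np

# Quick numerical sanity check of the norm identities/inequalities proved above,
# using randomly generated vectors (not a substitute for the proof).
rng = np.random.default_rng(0)
n = 5
for _ in range(1000):
    x = rng.standard_normal(n)
    norm2 = np.linalg.norm(x)        # Euclidean norm ||x||
    norm1 = np.linalg.norm(x, 1)     # 1-norm ||x||_1
    assert abs(norm2 - np.sqrt(x @ x)) < 1e-12   # ||x|| = sqrt(x^T x)
    assert norm2 <= norm1 + 1e-12                # ||x|| <= ||x||_1
    assert norm1 <= np.sqrt(n) * norm2 + 1e-12   # ||x||_1 <= sqrt(n) ||x||
print("all norm statements verified on random samples")
```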

Problem 2. (1 point) Let $x = (x_1, \dots, x_n)^T \in \mathbb{R}^n$ and $f: \mathbb{R}^n \to \mathbb{R}$. Recall the following definitions:

- The gradient of $f$ at $x$ is defined by $\nabla f(x) = \big( \frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n} \big)^T \in \mathbb{R}^n$;
- The Hessian of $f$ at $x$ is
$$\nabla^2 f(x) = \begin{bmatrix} \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{bmatrix} \in \mathbb{R}^{n \times n};$$
- The Taylor expansion of $f(y)$ at a given point $x$ up to the second-order term is
$$f(y) = f(x) + \nabla f(x)^T (y - x) + \tfrac{1}{2} (y - x)^T \nabla^2 f(x) (y - x) + o(\|y - x\|^2).$$

Find the gradient and the Hessian of the function $f$ defined below at the point $x = (0, 1)^T \in \mathbb{R}^2$:
$$f(x) = (x_1 - x_2)^4 + x_1^2 - x_2^2 - 2x_1 + 2x_2 + 1.$$
In addition, find the Taylor expansion of $f$ at $x = (0, 1)^T$ up to the second-order term.

We first compute the partial derivatives:
$$\frac{\partial f}{\partial x_1}(x) = 4(x_1 - x_2)^3 + 2x_1 - 2, \qquad \frac{\partial f}{\partial x_2}(x) = -4(x_1 - x_2)^3 - 2x_2 + 2.$$
Therefore the gradient is
$$\nabla f(x) = \Big( \frac{\partial f}{\partial x_1}(x), \frac{\partial f}{\partial x_2}(x) \Big)^T = \begin{bmatrix} 4(x_1 - x_2)^3 + 2x_1 - 2 \\ -4(x_1 - x_2)^3 - 2x_2 + 2 \end{bmatrix}.$$
Plugging in $x = (x_1, x_2)^T = (0, 1)^T$, we obtain $\nabla f((0, 1)^T) = (-6, 4)^T$.

We compute the second-order partial derivatives $\partial^2 f / \partial x_i \partial x_j$ to obtain the Hessian matrix
$$\nabla^2 f(x) = \begin{bmatrix} 12(x_1 - x_2)^2 + 2 & -12(x_1 - x_2)^2 \\ -12(x_1 - x_2)^2 & 12(x_1 - x_2)^2 - 2 \end{bmatrix}.$$
Plugging in $x = (x_1, x_2)^T = (0, 1)^T$, we obtain the Hessian of $f$ at $x = (0, 1)^T$ as
$$\nabla^2 f((0, 1)^T) = \begin{bmatrix} 14 & -12 \\ -12 & 10 \end{bmatrix}.$$
Note that the value of $f$ at $x = (0, 1)^T$ is $f((0, 1)^T) = 3$. The Taylor expansion of $f(y)$ at $x$ is then
$$\begin{aligned}
f(y) &= f(x) + \nabla f(x)^T (y - x) + \tfrac{1}{2} (y - x)^T \nabla^2 f(x) (y - x) + o(\|y - x\|^2) \\
&= 3 + (-6, 4) \begin{pmatrix} y_1 \\ y_2 - 1 \end{pmatrix} + \frac{1}{2} (y_1, y_2 - 1) \begin{bmatrix} 14 & -12 \\ -12 & 10 \end{bmatrix} \begin{pmatrix} y_1 \\ y_2 - 1 \end{pmatrix} + o(|y_1|^2 + |y_2 - 1|^2) \\
&= 7y_1^2 - 12 y_1 y_2 + 5 y_2^2 + 6 y_1 - 6 y_2 + 4 + o(|y_1|^2 + |y_2 - 1|^2).
\end{aligned}$$
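As an optional numerical sanity check (not part of the solution), the gradient and Hessian above can be compared against central finite differences at $x = (0, 1)^T$. The step size $h$ and the stencils below are arbitrary illustrative choices.

```python
import numpy as np

# Illustrative finite-difference check of the gradient and Hessian at x = (0, 1)^T.
def f(x):
    return (x[0] - x[1])**4 + x[0]**2 - x[1]**2 - 2*x[0] + 2*x[1] + 1

x0 = np.array([0.0, 1.0])
h = 1e-4
eye = np.eye(2)

# Central-difference gradient: should be close to (-6, 4)^T.
grad = np.array([(f(x0 + h*eye[i]) - f(x0 - h*eye[i])) / (2*h) for i in range(2)])

# Central-difference Hessian: should be close to [[14, -12], [-12, 10]].
hess = np.array([[(f(x0 + h*eye[i] + h*eye[j]) - f(x0 + h*eye[i] - h*eye[j])
                   - f(x0 - h*eye[i] + h*eye[j]) + f(x0 - h*eye[i] - h*eye[j])) / (4*h*h)
                  for j in range(2)] for i in range(2)])

print(grad)   # approximately [-6.  4.]
print(hess)   # approximately [[ 14. -12.], [-12.  10.]]
```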

Problem 3. (1 point) Let $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$ be given. Define the function $f: \mathbb{R}^n \to \mathbb{R}$ by $f(x) = \|Ax - b\|^2$. Find expressions for the quantities below in terms of $A$, $x$, and $b$:

- $\nabla f(x)$;
- $\nabla^2 f(x)$.

Denote the matrix $A = [a_{ij}] \in \mathbb{R}^{m \times n}$ (i.e., an $m$-by-$n$ matrix with $a_{ij}$ as its $(i, j)$th entry) and $b = (b_1, \dots, b_m)^T \in \mathbb{R}^m$. Also denote $y := Ax - b \in \mathbb{R}^m$, where $y_i = \sum_{j=1}^n a_{ij} x_j - b_i$ for $i = 1, \dots, m$. Note that
$$f(x) = \|Ax - b\|^2 = \|y\|^2 = \sum_{i=1}^m \Big( \sum_{j=1}^n a_{ij} x_j - b_i \Big)^2.$$
Hence $\partial f / \partial x_j = \sum_{i=1}^m 2 a_{ij} \big( \sum_{l=1}^n a_{il} x_l - b_i \big) = \sum_{i=1}^m 2 a_{ij} y_i$, which is the inner product of $2(a_{1j}, \dots, a_{mj})^T$ (2 times the $j$th column of $A$) and $y$. By stacking the partial derivatives $\partial f / \partial x_j$ for $j = 1, \dots, n$, we obtain the gradient
$$\nabla f(x) = 2 A^T y = 2 A^T (Ax - b).$$
We compute the second-order partial derivatives to obtain the $(k, j)$th entry of the Hessian as $\frac{\partial^2 f}{\partial x_k \partial x_j} = \sum_{i=1}^m 2 a_{ij} a_{ik}$ for $k, j = 1, \dots, n$. Therefore the Hessian matrix is
$$\nabla^2 f(x) = \Big[ \frac{\partial^2 f}{\partial x_k \partial x_j} \Big] = 2 \Big[ \sum_{i=1}^m a_{ij} a_{ik} \Big] = 2 A^T A.$$
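Again as an optional check, the closed-form expressions $\nabla f(x) = 2A^T(Ax - b)$ and $\nabla^2 f(x) = 2A^T A$ can be compared with a finite-difference gradient on randomly generated $A$, $b$, and $x$; the sizes and seed below are arbitrary.

```python
import numpy as np

# Illustrative check that grad f(x) = 2 A^T (A x - b) matches finite differences
# (and that the Hessian 2 A^T A is constant in x) on random data.
rng = np.random.default_rng(1)
m, n = 6, 4
A, b, x = rng.standard_normal((m, n)), rng.standard_normal(m), rng.standard_normal(n)

f = lambda v: np.sum((A @ v - b)**2)          # f(x) = ||Ax - b||^2
grad_formula = 2 * A.T @ (A @ x - b)
hess_formula = 2 * A.T @ A

h = 1e-6
grad_fd = np.array([(f(x + h*e) - f(x - h*e)) / (2*h) for e in np.eye(n)])
print(np.allclose(grad_formula, grad_fd, atol=1e-4))  # expect True
print(hess_formula.shape)                             # (n, n), independent of x
```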

Problem 4. (1 point) Show that for any matrix $A \in \mathbb{R}^{m \times n}$ and vector $b \in \mathbb{R}^m$, the set $\{x \in \mathbb{R}^n : Ax = b\}$ is convex.

Proof. Denote $C = \{x \in \mathbb{R}^n : Ax = b\}$. For any $x, y \in C$, we have $Ax = Ay = b$. Hence, for any $\theta \in [0, 1]$,
$$A(\theta x + (1 - \theta) y) = \theta A x + (1 - \theta) A y = \theta b + (1 - \theta) b = b,$$
which means $\theta x + (1 - \theta) y \in C$. By the definition of convex sets, $C$ is convex.
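The following optional sketch illustrates the statement numerically: it builds two distinct solutions of $Ax = b$ (one shifted by a null-space direction of $A$) and verifies that their convex combinations still satisfy the equation. The sizes and seed are arbitrary.

```python
import numpy as np

# Illustrative check: two points satisfying Ax = b, and their convex combinations.
rng = np.random.default_rng(2)
m, n = 3, 5                     # underdetermined, so the solution set contains many points
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)
b = A @ x                       # choose b so that x solves Ax = b

# Build a second solution y = x + v with v in the null space of A (via the SVD).
_, _, Vt = np.linalg.svd(A)
v = Vt[-1]                      # a right singular vector with zero singular value
y = x + v
print(np.allclose(A @ y, b))    # True: y is also a solution

for theta in np.linspace(0, 1, 11):
    z = theta * x + (1 - theta) * y
    assert np.allclose(A @ z, b)   # convex combinations remain in the set
```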

Problem 5. (1 point) Show that the set $\{x \in \mathbb{R}^n : \|x\| \le r\}$ is convex, where $r > 0$ is a given real number.

Proof. Denote $C = \{x \in \mathbb{R}^n : \|x\| \le r\}$. For any $x, y \in C$, we have $\|x\|, \|y\| \le r$. Hence, for any $\theta \in [0, 1]$,
$$\|\theta x + (1 - \theta) y\| \le \|\theta x\| + \|(1 - \theta) y\| = \theta \|x\| + (1 - \theta) \|y\| \le \theta r + (1 - \theta) r = r,$$
where the first inequality uses the triangle inequality for norms. This implies $\theta x + (1 - \theta) y \in C$, and hence $C$ is convex.
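An optional numerical illustration of the same argument: random points are scaled into the ball of radius $r$, and their convex combinations are checked to stay inside. The dimension, radius, and sample count are arbitrary.

```python
import numpy as np

# Illustrative check of the norm-ball argument: random points scaled into
# {x : ||x|| <= r}, then convex combinations tested against the radius r.
rng = np.random.default_rng(4)
n, r = 5, 2.0
for _ in range(1000):
    x, y = rng.standard_normal(n), rng.standard_normal(n)
    x *= r * rng.random() / np.linalg.norm(x)   # scale so that ||x|| <= r
    y *= r * rng.random() / np.linalg.norm(y)   # scale so that ||y|| <= r
    theta = rng.random()
    z = theta * x + (1 - theta) * y
    assert np.linalg.norm(z) <= r + 1e-12       # convex combination stays in the ball
print("ball membership verified on random samples")
```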

Problem 6. (1 point) Let $C \subseteq \mathbb{R}^n$ be a convex set and $f: C \to \mathbb{R}$ be a convex function. Prove that the following statements hold for any $k \ge 2$, $x_1, \dots, x_k \in C$, $\theta_1, \dots, \theta_k \ge 0$, and $\theta_1 + \theta_2 + \cdots + \theta_k = 1$:

- $\theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_k x_k \in C$;
- $f(\theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_k x_k) \le \theta_1 f(x_1) + \theta_2 f(x_2) + \cdots + \theta_k f(x_k)$.

Hint: use induction on $k$.

Proof. If $k = 2$, the statement holds since $C$ is a convex set. Assume the statement holds for $k$ (induction hypothesis). Consider
$$z := \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_k x_k + \theta_{k+1} x_{k+1},$$
where $\sum_{i=1}^{k+1} \theta_i = 1$ and $\theta_i \ge 0$ for $i = 1, \dots, k+1$. If $\theta_{k+1} = 0$, then $z = \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_k x_k \in C$ by the induction hypothesis. If $\theta_{k+1} = 1$, then $\theta_1 = \cdots = \theta_k = 0$ and $z = x_{k+1} \in C$ trivially. If $\theta_{k+1} \in (0, 1)$, then we can write
$$z = \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_k x_k + \theta_{k+1} x_{k+1} = (1 - \theta_{k+1}) \Big( \frac{\theta_1}{1 - \theta_{k+1}} x_1 + \cdots + \frac{\theta_k}{1 - \theta_{k+1}} x_k \Big) + \theta_{k+1} x_{k+1}.$$
Note that $\frac{\theta_i}{1 - \theta_{k+1}} \ge 0$ for all $i = 1, \dots, k$ and
$$\frac{\theta_1}{1 - \theta_{k+1}} + \cdots + \frac{\theta_k}{1 - \theta_{k+1}} = \frac{\theta_1 + \cdots + \theta_k}{1 - \theta_{k+1}} = \frac{1 - \theta_{k+1}}{1 - \theta_{k+1}} = 1,$$
so $\frac{\theta_1}{1 - \theta_{k+1}} x_1 + \cdots + \frac{\theta_k}{1 - \theta_{k+1}} x_k \in C$ by the induction hypothesis. It then follows that $z \in C$, since $z$ is a convex combination of $\frac{\theta_1}{1 - \theta_{k+1}} x_1 + \cdots + \frac{\theta_k}{1 - \theta_{k+1}} x_k$ and $x_{k+1}$, both of which lie in $C$. Therefore the statement holds for $k + 1$, and by induction it holds for all $k \ge 2$.

Proof. If $k = 2$, the statement holds since $f$ is a convex function. Assume the statement holds for $k$ (induction hypothesis). Again consider
$$z := \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_k x_k + \theta_{k+1} x_{k+1},$$
where $\sum_{i=1}^{k+1} \theta_i = 1$ and $\theta_i \ge 0$ for $i = 1, \dots, k+1$. If $\theta_{k+1} = 0$, the statement holds by the induction hypothesis; if $\theta_{k+1} = 1$, it holds trivially since $z = x_{k+1}$. If $\theta_{k+1} \in (0, 1)$, then
$$\begin{aligned}
f(z) &= f(\theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_k x_k + \theta_{k+1} x_{k+1}) \\
&= f\Big( (1 - \theta_{k+1}) \Big( \frac{\theta_1}{1 - \theta_{k+1}} x_1 + \cdots + \frac{\theta_k}{1 - \theta_{k+1}} x_k \Big) + \theta_{k+1} x_{k+1} \Big) \\
&\le (1 - \theta_{k+1}) f\Big( \frac{\theta_1}{1 - \theta_{k+1}} x_1 + \cdots + \frac{\theta_k}{1 - \theta_{k+1}} x_k \Big) + \theta_{k+1} f(x_{k+1}) \\
&\le (1 - \theta_{k+1}) \sum_{i=1}^{k} \frac{\theta_i}{1 - \theta_{k+1}} f(x_i) + \theta_{k+1} f(x_{k+1}) = \sum_{i=1}^{k+1} \theta_i f(x_i),
\end{aligned}$$
where the first inequality is due to the convexity of $f$ and the second to the induction hypothesis. This means the statement holds for $k + 1$ as well. Therefore, by induction, the statement holds for all $k \ge 2$.
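As an optional illustration of the second statement (the generalized Jensen inequality), the sketch below evaluates both sides for random points and weights using the convex function $f(x) = \|x\|^2$; the choice of $f$, the dimensions, and the seed are arbitrary.

```python
import numpy as np

# Illustrative numerical check of the generalized Jensen inequality proved above,
# using the convex function f(x) = ||x||^2 on R^n (any convex f would do).
rng = np.random.default_rng(3)
f = lambda v: np.dot(v, v)

n, k = 4, 7
xs = rng.standard_normal((k, n))            # k points x_1, ..., x_k in R^n
theta = rng.random(k)
theta /= theta.sum()                        # nonnegative weights summing to 1

lhs = f(theta @ xs)                         # f(theta_1 x_1 + ... + theta_k x_k)
rhs = theta @ np.array([f(x) for x in xs])  # theta_1 f(x_1) + ... + theta_k f(x_k)
print(lhs <= rhs + 1e-12)                   # expect True
```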