Introduction to Machine Learning DIS10


CS 189, Fall 2017

1 Fun with Lagrange Multipliers

(a) Minimize the function
$$f(x, y) = x + 2y$$
such that
$$x^2 + y^2 = 3.$$

Solution: The Lagrangian is
$$L(x, y, \lambda) = x + 2y + \lambda(x^2 + y^2 - 3).$$
Taking all of the partial derivatives and setting them to 0, we get this system of equations:
$$2\lambda x = -1$$
$$2\lambda y = -2$$
$$x^2 + y^2 = 3$$
We can infer that $y = 2x$. Plugging this into the constraint, we have
$$x^2 + 4x^2 = 3,$$
which shows that $x = \pm\sqrt{3/5}$. We have two critical points, $(-\sqrt{3/5}, -2\sqrt{3/5})$ and $(\sqrt{3/5}, 2\sqrt{3/5})$. Plugging these into our objective function $f$, we find that the minimizer is the former, with a value of $-\sqrt{15}$.

(b) Minimize the function
$$f(x, y, z) = x^2 - y^2$$
such that
$$x^2 + y^2 + 3z^2 = 1.$$

Solution: The Lagrangian is
$$x^2 - y^2 + \lambda(x^2 + y^2 + 3z^2 - 1).$$
Taking all of the partial derivatives and setting them to 0, we get this system of equations:
$$x = -\lambda x$$
$$y = \lambda y$$
$$0 = \lambda z$$
$$x^2 + y^2 + 3z^2 = 1$$
To solve this, we look at several cases:

Case 1: $\lambda = 0$. This implies that $x = y = 0$, and $z = \pm 1/\sqrt{3}$. We have two critical points: $(0, 0, \pm 1/\sqrt{3})$.

Case 2: $\lambda \neq 0$. Then $z$ must be 0.

Case 2a: $x = 0$. The constraint gives us that $y = \pm 1$. This gives us another two critical points: $(0, \pm 1, 0)$.

Case 2b: $y = 0$. The constraint gives us $x = \pm 1$, giving us another two critical points: $(\pm 1, 0, 0)$.

Plugging in all of our critical points, we find that $(0, \pm 1, 0)$ minimizes our function, with a value of $-1$.
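As a sanity check on both parts, here is a minimal numerical sketch (not part of the original worksheet) using scipy.optimize with equality constraints; the starting points `x0` are arbitrary choices:

```python
import numpy as np
from scipy.optimize import minimize

# Part (a): minimize x + 2y subject to x^2 + y^2 = 3.
cons_a = {"type": "eq", "fun": lambda v: v[0] ** 2 + v[1] ** 2 - 3}
res_a = minimize(lambda v: v[0] + 2 * v[1], x0=[1.0, -1.0], constraints=[cons_a])
print(res_a.x, res_a.fun)  # ~[-0.775, -1.549], objective ~ -sqrt(15) ~ -3.873

# Part (b): minimize x^2 - y^2 subject to x^2 + y^2 + 3z^2 = 1.
cons_b = {"type": "eq", "fun": lambda v: v[0] ** 2 + v[1] ** 2 + 3 * v[2] ** 2 - 1}
res_b = minimize(lambda v: v[0] ** 2 - v[1] ** 2, x0=[0.1, 0.9, 0.1], constraints=[cons_b])
print(res_b.x, res_b.fun)  # ~[0, 1, 0] (or [0, -1, 0]), objective ~ -1
```

With equality constraints supplied, SciPy falls back to an SLSQP solver, which finds the same constrained minimizers as the Lagrange-multiplier analysis above.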

2 Support Vector Machines

(a) We typically frame an SVM problem as trying to maximize the margin. Explain intuitively why a bigger margin will result in a model that will generalize better, or perform better in practice.

Solution: One intuition is that if points are closer to the border, we are less certain about their class. Thus, it would make sense to create a boundary where our certainty is highest about all the training set points.

Another intuition involves thinking about the process that generated the data we are working with. Since it's a noisy process, if we drew a boundary close to one of our training points of some class, it's very possible that a point of the same class will be generated across the boundary, resulting in an incorrect classification. Therefore it makes sense to make the boundary as far away from our training points as possible.

(b) Will moving points which are not support vectors further away from the decision boundary affect the SVM's hinge loss?

Solution: No. The hinge loss is defined as
$$\sum_{i=1}^{N} \max\left(0,\; 1 - y_i (w^T x_i + b)\right).$$
For non-support vectors, the right-hand argument of the max is already negative, and moving the point further away from the boundary will only make it more negative. The max will return a zero regardless. This means that the loss function, and the consequent decision boundary, is entirely determined by the support vectors and nothing else.
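A quick numerical illustration of this claim, as a minimal NumPy sketch (not from the original handout; the separator and points anticipate part (d) below):

```python
import numpy as np

def hinge_loss(w, b, X, y):
    """Total hinge loss: sum_i max(0, 1 - y_i * (w^T x_i + b))."""
    margins = y * (X @ w + b)
    return np.maximum(0.0, 1.0 - margins).sum()

# Separator consistent with part (d): w = [1/2, 1/2], b = -3/2.
w, b = np.array([0.5, 0.5]), -1.5
X = np.array([[1.0, 0.0], [3.0, 2.0], [5.0, 5.0]])
y = np.array([-1.0, 1.0, 1.0])

print(hinge_loss(w, b, X, y))  # 0.0 -- every point is on or outside the margin
X[2] = [50.0, 50.0]            # push the non-support vector much further away
print(hinge_loss(w, b, X, y))  # still 0.0, as claimed
```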

(c) Show that the width of an SVM slab with linearly separable data is $2/\|w\|$.

Solution: The width of the margin is defined by the points that lie on it, also called support vectors. Let's say we have a point $x_0$ which is a support vector, and let $x$ be a point on the separating hyperplane. The distance between $x_0$ and the separating hyperplane can be calculated by projecting the vector starting at the plane and ending at $x_0$ onto the plane's unit normal vector. The equation of the plane is $w^T x + b = 0$. Since $w$ by definition is orthogonal to the hyperplane, we want to project $x_0 - x$ onto the unit vector normal to the hyperplane, $\frac{w}{\|w\|}$:
$$\frac{w^T}{\|w\|}(x_0 - x) = \frac{1}{\|w\|}(w^T x_0 - w^T x) = \frac{1}{\|w\|}(w^T x_0 + b - w^T x - b).$$
Since we set $w^T x_0 + b = 1$ (or $-1$) and, by definition, $w^T x + b = 0$, this quantity just turns into $\frac{1}{\|w\|}$ (or $-\frac{1}{\|w\|}$), so the distance is the absolute value, $\frac{1}{\|w\|}$. Since this margin is half of the slab, we double it to get the full width of $\frac{2}{\|w\|}$.

(d) You are presented with the following set of data (triangle = +1, circle = -1):

[Figure omitted in this transcription; from the solution, the labeled points include circles at $(1, 0)$ and $(0, 1)$ and a triangle at $(3, 2)$.]

Find the equation (by hand) of the hyperplane $w^T x + b = 0$ that would be used by an SVM classifier. Which points are support vectors?

Solution: The hyperplane will pass through the point $(2, 1)$ with a slope of $-1$. The equation of this line is $x_1 + x_2 = 3$. We know from this form that $w_1 = w_2$. We also know that at the support vectors, $w^T x + b = \pm 1$. This gives us the equations:
$$1 w_1 + 0 w_2 + b = -1$$
$$3 w_1 + 2 w_2 + b = 1$$
Solving this system of equations, we get $w = [\frac{1}{2}, \frac{1}{2}]^T$ and $b = -\frac{3}{2}$. The support vectors are $(1, 0)$, $(0, 1)$, and $(3, 2)$.
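A short check of this answer (not in the original worksheet), with points and labels taken from the solution above:

```python
import numpy as np

w, b = np.array([0.5, 0.5]), -1.5

# Support vectors and their labels from the solution above.
points = [((1.0, 0.0), -1), ((0.0, 1.0), -1), ((3.0, 2.0), +1)]
for x, label in points:
    score = w @ np.array(x) + b
    print(x, label, score)  # each satisfies label * (w^T x + b) = 1, i.e. lies on the margin

print(2 / np.linalg.norm(w))  # slab width 2/||w|| = 2*sqrt(2) ~ 2.83
```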

3 Simple SGD updates

Let us consider a simple least squares problem, where we are interested in optimizing the function
$$F(w) = \frac{1}{2n}\|Aw - y\|^2 = \frac{1}{2n}\sum_{i=1}^{n}(a_i^T w - y_i)^2.$$

(a) What is the closed form OLS solution? What is the time complexity of computing this solution in terms of flops?

Solution: The closed form solution is
$$\hat{w} = (A^T A)^{-1} A^T y. \quad (1)$$
This takes time $nd^2 + nd + d^3$ to compute: $nd^2$ to find $A^T A$, since it takes $n$ multiplications to compute each entry of this $d \times d$ matrix; $nd$ to find $A^T y$, since it takes $n$ multiplications to compute each entry of this $d$-vector; and $d^3$ time to invert a matrix via Gaussian elimination.

(b) Write down the gradient descent update. What is the time complexity of computing an $\varepsilon$-optimal solution?

Solution: For gradient descent, we have the update
$$w_{t+1} = w_t - \gamma A^T (A w_t - y)$$
(the constant factor $1/n$ in the gradient of $F$ can be absorbed into the step size $\gamma$). We know from HW that, denoting $e_k = w_k - w^*$ and letting $Q$ denote the condition number of $A^T A$, we have $\|e_k\| \leq \left(\frac{Q-1}{Q+1}\right)\|e_{k-1}\|$. We therefore obtain geometric convergence to the optimum, and the number of iterations is roughly $T \approx Q \log(1/\varepsilon)$ to converge to within $\varepsilon$ of optimum (write this out to see why, using the approximation $1 - x \approx e^{-x}$). Also note that during each iteration, we perform work $nd$, since $A w_t$ takes $nd$ time to compute, and performing the multiplication $A^T(A w_t - y)$ takes $nd$ time as well. So the total cost is $nd \log(1/\varepsilon)$, treating the condition number $Q$ as a constant.
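The two solutions above can be compared directly in a small NumPy sketch (not part of the worksheet; the data is synthetic and illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.normal(size=(n, d))
y = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

# (a) Closed-form OLS solution (solving the normal equations beats forming the inverse).
w_ols = np.linalg.solve(A.T @ A, A.T @ y)

# (b) Gradient descent; any step size below 2/lambda_max(A^T A) converges.
gamma = 1.0 / np.linalg.eigvalsh(A.T @ A).max()
w = np.zeros(d)
for _ in range(2000):
    w = w - gamma * A.T @ (A @ w - y)

print(np.linalg.norm(w - w_ols))  # ~0: GD converges to the OLS solution
```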

(c) Write down the stochastic gradient descent update. What is the time complexity of computing an $\varepsilon$-optimal solution? You may want to quickly go through a derivation here. What happens when $Aw^* = y$? Discuss why you would use any of these methods for your problem.

Solution: Let us derive the SGD convergence rate from first principles. We have the update equation
$$w_{t+1} = w_t - \gamma a_J (a_J^T w_t - y_J),$$
where $J$ is chosen uniformly at random from the set $\{1, 2, 3, \ldots, n\}$. Notice that this makes sense as a noisy gradient: writing $f_i(w) = \frac{1}{2}(a_i^T w - y_i)^2$, so that $F = \frac{1}{n}\sum_i f_i$, we have
$$E_J[a_J(a_J^T w_t - y_J)] = E_J[\nabla f_J(w_t)] = \frac{1}{n}(A^T A w_t - A^T y) = \nabla F(w_t),$$
and so the gradient estimate is unbiased.

Let us now compute the convergence rate. We have
$$\|w_{t+1} - w^*\|^2 = \|w_t - w^*\|^2 + \gamma^2 \|\nabla f_J(w_t)\|^2 - 2\gamma (w_t - w^*)^T \nabla f_J(w_t).$$
Now notice that there are two sources of randomness in the RHS. The iterate $w_t$ is in itself random, since we have chosen random indices up until that point. The index $J$ is also random. Crucially, these two sources of randomness are independent of each other. In particular, we may now take the inner product and compute the expectation over the index $J$ to obtain
$$E_J\left[2\gamma (w_t - w^*)^T \nabla f_J(w_t)\right] = 2\gamma (w_t - w^*)^T E[\nabla f_J(w_t)] = 2\gamma (w_t - w^*)^T \nabla F(w_t) = \frac{2\gamma}{n}(w_t - w^*)^T A^T A (w_t - w^*) = \frac{2\gamma}{n}\|A(w_t - w^*)\|^2 \geq \frac{2\gamma}{n}\lambda_{\min}(A^T A)\|w_t - w^*\|^2,$$
where in the last step, we have used a simple eigenvalue bound; go back and look at HW6 if this is not clear. Letting $m = \lambda_{\min}(A^T A)/n$, we have
$$E_J\left[\|w_{t+1} - w^*\|^2\right] \leq (1 - 2\gamma m)\|w_t - w^*\|^2 + \gamma^2 E_J\|\nabla f_J(w_t)\|^2.$$

We will now make some additional assumptions. First, we assume that $\|a_i\| = 1$. Next, we assume that we will always stay within a region such that the function $F(w) \leq M$ (note that we can do this by evaluating the loss and ensuring that we don't take a step if this condition is violated, or by projection). Consequently, we have
$$E_J\|\nabla f_J(w_t)\|^2 = \frac{1}{n}\sum_{i=1}^{n}\|a_i(a_i^T w_t - y_i)\|^2 = \frac{1}{n}\sum_{i=1}^{n}(a_i^T w_t - y_i)^2 = 2F(w_t) \leq 2M.$$

We are now in a position to complete the analysis. We have
$$E\left[\|w_{t+1} - w^*\|^2\right] \leq (1 - 2\gamma m) E\left[\|w_t - w^*\|^2\right] + 2\gamma^2 M,$$
where we have taken an additional expectation with respect to the randomness up to and including time $t$. Rolling this out (think of induction in reverse), we have
$$E\left[\|w_{t+1} - w^*\|^2\right] \leq (1 - 2\gamma m)^2 E\left[\|w_{t-1} - w^*\|^2\right] + 2\gamma^2 M (1 - 2\gamma m) + 2\gamma^2 M.$$
Do you spot the pattern? We effectively have
$$E\left[\|w_t - w^*\|^2\right] \leq (1 - 2\gamma m)^t E\left[\|w_0 - w^*\|^2\right] + 2\gamma^2 M \sum_{i=0}^{t-1}(1 - 2\gamma m)^i \leq (1 - 2\gamma m)^t E\left[\|w_0 - w^*\|^2\right] + 2\gamma^2 M \sum_{i=0}^{\infty}(1 - 2\gamma m)^i = (1 - 2\gamma m)^t E\left[\|w_0 - w^*\|^2\right] + \frac{\gamma M}{m}.$$

Now if we want the LHS to be less than $\varepsilon$, it suffices to set each of the above terms to be less than $\varepsilon/2$. In particular, we have the relations $\frac{\gamma M}{m} \leq \varepsilon/2$ and $(1 - 2\gamma m)^t E\left[\|w_0 - w^*\|^2\right] \leq \varepsilon/2$. Doing some algebra, we are led to the choices
$$\gamma = \frac{\varepsilon m}{4M} \quad \text{and} \quad t = \frac{1}{\gamma m}\log(D_0/\varepsilon) = \frac{4M}{\varepsilon m^2}\log(D_0/\varepsilon),$$
where $D_0$ denotes our initial distance to optimum. In effect, we converge in $\varepsilon^{-1}\log(1/\varepsilon)$ iterations, and each iteration takes $O(d)$ time (why?). Note also that when $Aw^* = y$, the residuals at the optimum vanish, so the noise level $M$ near $w^*$ goes to zero and the $\gamma M/m$ term stops limiting us.

Let us now compare all three algorithms. Clearly, GD beats OLS provided $nd\log(1/\varepsilon) < nd^2$, which happens when $d > \log(1/\varepsilon)$. Think about what this means! Setting $\varepsilon = 10^{-6}$ (almost optimum), we see that GD wins for any problem in which $d > 20$! Comparing SGD and GD, the quantities are $nd\log(1/\varepsilon)$ versus $\frac{d}{\varepsilon}\log(1/\varepsilon)$. In other words, SGD provides gains in convergence when $n \gtrsim 1/\varepsilon$, i.e., when we have sufficiently many samples. There are also other advantages to SGD that this analysis doesn't quite illustrate; for instance, scalability and generalization ability. Comparing SGD and OLS, we see that SGD wins when $nd^2 > \frac{d}{\varepsilon}\log(1/\varepsilon)$, and so the relevant comparison is between $nd$ and $\frac{1}{\varepsilon}\log(1/\varepsilon)$. SGD again wins for moderately sized problems.
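A minimal sketch of this update on synthetic data (not from the worksheet), with rows normalized so that $\|a_i\| = 1$ as the analysis assumes; the step size and iteration count are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 1000, 5
A = rng.normal(size=(n, d))
A /= np.linalg.norm(A, axis=1, keepdims=True)  # enforce ||a_i|| = 1, as assumed above
y = A @ rng.normal(size=d) + 0.01 * rng.normal(size=n)

w_star = np.linalg.solve(A.T @ A, A.T @ y)     # the OLS optimum, for reference

gamma = 0.1                                    # constant step size
w = np.zeros(d)
for _ in range(20000):
    J = rng.integers(n)                        # uniformly random index
    w = w - gamma * A[J] * (A[J] @ w - y[J])   # O(d) work per iteration

print(np.linalg.norm(w - w_star))  # small: SGD settles into a ball of radius ~sqrt(gamma*M/m)
```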

(d) Write down the SGD update for logistic regression on two classes,
$$F(w) = \frac{1}{n}\sum_{i=1}^{n}\left[ y_i \log\frac{1}{\sigma(w^T x_i)} + (1 - y_i)\log\frac{1}{1 - \sigma(w^T x_i)} \right].$$
Discuss why this is equivalent to minimizing a cross-entropy loss.
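No solution to part (d) appears above, so the following is only a sketch of the standard answer. With labels $y_i \in \{0, 1\}$, each summand of $F$ is the cross-entropy between the label distribution $(y_i, 1 - y_i)$ and the model's $(\sigma(w^T x_i), 1 - \sigma(w^T x_i))$, which is why minimizing $F$ is cross-entropy (equivalently, negative log-likelihood) minimization. The gradient of the $i$-th summand works out to $(\sigma(w^T x_i) - y_i)\,x_i$, giving the SGD update $w_{t+1} = w_t - \gamma(\sigma(w^T x_J) - y_J)\,x_J$ with $J$ uniform. A runnable sketch on synthetic data (`w_true` and all constants are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
n, d = 500, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.5, -2.0, 0.5])                       # illustrative ground truth
y = (rng.random(n) < sigmoid(X @ w_true)).astype(float)   # labels in {0, 1}

gamma = 0.5
w = np.zeros(d)
for _ in range(50000):
    J = rng.integers(n)
    # Gradient of the J-th cross-entropy term: (sigma(w^T x_J) - y_J) * x_J
    w = w - gamma * (sigmoid(X[J] @ w) - y[J]) * X[J]

print(w)        # should be roughly aligned with w_true
print(w_true)
```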
