Introduction to Machine Learning DIS10


CS 189 Fall 2017 Introduction to Machine Learning DIS10

1 Fun with Lagrange Multipliers

(a) Minimize the function $f(x, y) = x + 2y$ such that $x^2 + y^2 = 3$.

Solution: The Lagrangian is:

$$L(x, y, \lambda) = x + 2y + \lambda(x^2 + y^2 - 3)$$

Taking all of the partial derivatives and setting them to 0, we get this system of equations:

$$2\lambda x = -1, \qquad \lambda y = -1, \qquad x^2 + y^2 = 3$$

We can infer that $y = 2x$. Plugging this into the constraint, we have $x^2 + 4x^2 = 3$, which shows that $x = \pm\sqrt{3/5}$. We have two critical points, $\left(-\sqrt{3/5},\, -2\sqrt{3/5}\right)$ and $\left(\sqrt{3/5},\, 2\sqrt{3/5}\right)$. Plugging these into our objective function $f$, we find that the minimizer is the former, with a value of $-5\sqrt{3/5} = -\sqrt{15}$.

(b) Minimize the function $f(x, y, z) = x^2 - y^2$ such that $x^2 + 2y^2 + 3z^2 = 1$.

Solution: The Lagrangian is:

$$L(x, y, z, \lambda) = x^2 - y^2 + \lambda(x^2 + 2y^2 + 3z^2 - 1)$$

Taking all of the partial derivatives and setting them to 0, we get this system of equations:

$$x = -\lambda x, \qquad y = 2\lambda y, \qquad 0 = \lambda z, \qquad x^2 + 2y^2 + 3z^2 = 1$$
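As a numerical sanity check on part (a), the constraint circle can be parametrized and scanned for the smallest objective value. This is only a sketch: the grid size `N` and the variable names are illustrative choices, not part of the worksheet.

```python
import math

def f(x, y):
    """Objective from part (a)."""
    return x + 2 * y

R = math.sqrt(3)   # radius of the constraint circle x^2 + y^2 = 3
N = 200_000        # number of grid points on the circle (illustrative choice)

# Every feasible point can be written as (R cos t, R sin t).
candidates = (
    (f(R * math.cos(t), R * math.sin(t)), R * math.cos(t), R * math.sin(t))
    for t in (2 * math.pi * k / N for k in range(N))
)
best_val, best_x, best_y = min(candidates)

print("grid minimum:", best_val, "at", (best_x, best_y))
print("closed form :", -math.sqrt(15), "at",
      (-math.sqrt(3 / 5), -2 * math.sqrt(3 / 5)))
```

The grid minimum should agree with the critical point found from the Lagrangian up to the grid resolution.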

Since $w$ by definition is orthogonal to the hyperplane, we want to project $x - x'$ onto the unit vector normal to the hyperplane, $\frac{w}{\|w\|}$:

$$\frac{w^\top}{\|w\|}(x - x') = \frac{1}{\|w\|}\left(w^\top x - w^\top x'\right) = \frac{1}{\|w\|}\left(w^\top x + b - w^\top x' - b\right)$$

Since we set $w^\top x + b = 1$ (or $-1$) and, by definition, $w^\top x' + b = 0$, this quantity just turns into $\frac{1}{\|w\|}$, or $-\frac{1}{\|w\|}$, so the distance is the absolute value, $\frac{1}{\|w\|}$. Since the margin is half of the slab, we double it to get the full width of $\frac{2}{\|w\|}$.

(d) You are presented with the following set of data (triangle = +1, circle = -1):

Find the equation (by hand) of the hyperplane $w^\top x + b = 0$ that would be used by an SVM classifier. Which points are support vectors?

Solution: The hyperplane passes through the point $(2, 1)$ with a slope of $-1$. The equation of this line is $x_1 + x_2 = 3$. We know from this form that $w_1 = w_2$. We also know that at the support vectors, $w^\top x + b = \pm 1$. This gives us the equations:

$$1 w_1 + 0 w_2 + b = 1$$
$$3 w_1 + 2 w_2 + b = -1$$

Solving this system of equations, we get $w = \left[-\tfrac{1}{2},\, -\tfrac{1}{2}\right]^\top$ and $b = \tfrac{3}{2}$. The support vectors are $(1, 0)$, $(0, 1)$, and $(3, 2)$.

3 Simple SGD updates

Let us consider a simple least squares problem, where we are interested in optimizing the function

$$F(w) = \frac{1}{2n}\|Aw - y\|_2^2 = \frac{1}{2n}\sum_{i=1}^{n}\left(a_i^\top w - y_i\right)^2.$$

(a) What is the closed form OLS solution? What is the time complexity of computing this solution in terms of flops?
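The SVM solution in part (d) can be checked by plugging the stated support vectors into $w^\top x + b$ and computing the slab width $2/\|w\|$. In this sketch the label attached to each point is an assumption inferred from the two equations in the solution (the figure itself is not reproduced here), and the helper name `decision` is illustrative.

```python
import math

# Hyperplane from the solution of part (d).
w = (-0.5, -0.5)
b = 1.5

# Support vectors with the sign of w^T x + b implied by the solution's
# equations (assumed labels; the original figure is not available here).
support_vectors = {(1, 0): +1, (0, 1): +1, (3, 2): -1}

def decision(x):
    """Evaluate w^T x + b for a 2-D point x."""
    return w[0] * x[0] + w[1] * x[1] + b

values = {x: decision(x) for x in support_vectors}
norm_w = math.hypot(*w)        # Euclidean norm of w
margin_width = 2 / norm_w      # full width of the slab, 2 / ||w||

for x, label in support_vectors.items():
    print(x, "expected", label, "got", values[x])
print("slab width:", margin_width)
```

Each support vector should evaluate to exactly $\pm 1$, and the width should match the distance between the lines $x_1 + x_2 = 1$ and $x_1 + x_2 = 5$.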

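$J$ to obtain

$$\mathbb{E}_J\left[2\gamma (w_t - w^*)^\top \nabla f_J(w_t)\right] = 2\gamma (w_t - w^*)^\top \mathbb{E}\left[\nabla f_J(w_t)\right] = 2\gamma (w_t - w^*)^\top \nabla F(w_t)$$
$$= \frac{2\gamma}{n}(w_t - w^*)^\top A^\top A (w_t - w^*) = \frac{2\gamma}{n}\|A(w_t - w^*)\|_2^2 \ge \frac{2\gamma}{n}\lambda_{\min}(A^\top A)\,\|w_t - w^*\|_2^2,$$

where in the last step, we have used a simple eigenvalue bound; go back and look at HW6 if this is not clear. Letting $m = \frac{2}{n}\lambda_{\min}(A^\top A)$, we have

$$\mathbb{E}_J\left[\|w_{t+1} - w^*\|_2^2\right] \le (1 - \gamma m)\|w_t - w^*\|_2^2 + \gamma^2\, \mathbb{E}_J\|\nabla f_J(w_t)\|_2^2.$$

We will now make some additional assumptions. First, we assume that $\|a_i\|_2 = 1$. Next, we assume that we will always stay within a region such that the function $F(w) \le M$ (note that we can do this by evaluating the loss and ensuring that we don't take a step if this condition is violated, or by projection). Consequently, we have

$$\mathbb{E}_J\|\nabla f_J(w_t)\|_2^2 = \frac{1}{n}\sum_{i=1}^{n}\left\|a_i\left(a_i^\top w_t - y_i\right)\right\|_2^2 = \frac{1}{n}\sum_{i=1}^{n}\left(a_i^\top w_t - y_i\right)^2 = 2F(w_t) \le 2M.$$

We are now in a position to complete the analysis. We have

$$\mathbb{E}\left[\|w_{t+1} - w^*\|_2^2\right] \le (1 - \gamma m)\,\mathbb{E}\left[\|w_t - w^*\|_2^2\right] + 2\gamma^2 M,$$

where we have taken an additional expectation with respect to the randomness up to and including time $t$. Rolling this out (think of induction in reverse), we have

$$\mathbb{E}\left[\|w_{t+1} - w^*\|_2^2\right] \le (1 - \gamma m)^2\,\mathbb{E}\left[\|w_{t-1} - w^*\|_2^2\right] + 2\gamma^2 M (1 - \gamma m) + 2\gamma^2 M.$$

Do you spot the pattern? We effectively have

$$\mathbb{E}\left[\|w_t - w^*\|_2^2\right] \le (1 - \gamma m)^t\,\mathbb{E}\left[\|w_0 - w^*\|_2^2\right] + 2\gamma^2 M \sum_{i=0}^{t-1}(1 - \gamma m)^i$$
$$\le (1 - \gamma m)^t\,\mathbb{E}\left[\|w_0 - w^*\|_2^2\right] + 2\gamma^2 M \sum_{i=0}^{\infty}(1 - \gamma m)^i$$
$$= (1 - \gamma m)^t\,\mathbb{E}\left[\|w_0 - w^*\|_2^2\right] + 2\gamma^2 M \cdot \frac{1}{\gamma m}$$
$$= (1 - \gamma m)^t\,\mathbb{E}\left[\|w_0 - w^*\|_2^2\right] + \frac{2\gamma M}{m}.$$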
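The unrolling step can be illustrated by iterating the one-step recursion with equality and comparing it against the closed-form bound. This is a minimal sketch: the values of `gamma`, `m`, `M`, and `e0` are hypothetical and chosen only so that $0 < \gamma m < 1$; `e` stands in for $\mathbb{E}\|w_t - w^*\|_2^2$.

```python
def unrolled_bound(t, e0, gamma, m, M):
    """Closed-form bound (1 - gamma*m)^t * e0 + 2*gamma*M/m."""
    return (1 - gamma * m) ** t * e0 + 2 * gamma * M / m

# Hypothetical constants for illustration only; they need 0 < gamma*m < 1.
gamma, m, M, e0 = 0.01, 0.5, 2.0, 10.0

e = e0
history = []  # tuples (t, worst-case error after t steps, closed-form bound)
for t in range(2001):
    history.append((t, e, unrolled_bound(t, e0, gamma, m, M)))
    # Worst case of the one-step inequality: treat it as an equality.
    e = (1 - gamma * m) * e + 2 * gamma ** 2 * M

for t, err, bound in history[::500]:
    print(f"t={t:5d}  recursion={err:.6f}  bound={bound:.6f}")
```

The recursion never exceeds the bound, and both settle near the noise floor $2\gamma M / m$, which is why the step size has to shrink with the target accuracy.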

Now if we want the LHS to be less than $\varepsilon$, it suffices to set each of the above terms to be less than $\varepsilon/2$. In particular, we have the relations

$$\frac{2\gamma M}{m} \le \frac{\varepsilon}{2} \qquad \text{and} \qquad (1 - \gamma m)^t\,\mathbb{E}\left[\|w_0 - w^*\|_2^2\right] \le \frac{\varepsilon}{2}.$$

Doing some algebra, we are led to the choices

$$\gamma = \frac{\varepsilon m}{4M}, \qquad \text{and} \qquad t = \frac{1}{\gamma m}\log\left(\frac{2D_0}{\varepsilon}\right) = \frac{4M}{m^2 \varepsilon}\log\left(\frac{2D_0}{\varepsilon}\right),$$

where $D_0$ denotes our initial distance to optimum.

In effect, we converge in $\varepsilon^{-1}\log(1/\varepsilon)$ iterations, and each iteration takes $O(d)$ time (why?).

Let us now compare all three algorithms. Clearly, GD beats OLS provided $nd\log(1/\varepsilon) < nd^2$, which happens when $d > \log(1/\varepsilon)$. Think about what this means! Setting $\varepsilon = 10^{-6}$ (almost optimum), we see that GD wins for any problem in which $d > 20$!

Comparing SGD and GD, the quantities are

$$nd\log(1/\varepsilon) \qquad \text{and} \qquad \frac{d}{\varepsilon}\log(1/\varepsilon).$$

In other words, SGD provides gains in convergence when $n > 1/\varepsilon$, i.e., when we have sufficiently many samples. There are also other advantages to SGD that this analysis doesn't quite illustrate, for instance scalability and generalization ability.

Comparing SGD and OLS, we see that SGD wins when $nd^2 > \frac{d}{\varepsilon}\log(1/\varepsilon)$, and so the relevant comparison is between $nd$ and $1/\varepsilon$. SGD again wins for moderately sized problems.

(d) Write down the SGD update for logistic regression on two classes

$$F(w) = \frac{1}{n}\sum_{i=1}^{n}\left[\, y_i \log\frac{1}{\sigma(w^\top x_i)} + (1 - y_i)\log\frac{1}{1 - \sigma(w^\top x_i)} \right].$$

Discuss why this is equivalent to minimizing a cross-entropy loss.
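The choices of $\gamma$ and $t$ from part (c) can be checked by plugging them back into the bound $(1 - \gamma m)^t D_0 + 2\gamma M / m$. The sketch below uses hypothetical values for `eps`, `m`, `M`, and `D0`, rounds $t$ up to an integer, and assumes $0 < \gamma m < 1$; the function name `sgd_schedule` is illustrative.

```python
import math

def sgd_schedule(eps, m, M, D0):
    """Step size and iteration count suggested by the analysis above."""
    gamma = eps * m / (4 * M)
    t = math.ceil(math.log(2 * D0 / eps) / (gamma * m))
    return gamma, t

# Hypothetical problem constants, for illustration only.
eps, m, M, D0 = 1e-3, 0.5, 2.0, 10.0

gamma, t = sgd_schedule(eps, m, M, D0)
contraction_term = (1 - gamma * m) ** t * D0   # should be at most eps / 2
noise_term = 2 * gamma * M / m                 # equals eps / 2 by construction
bound = contraction_term + noise_term

print("gamma =", gamma, " t =", t)
print("bound =", bound, " target eps =", eps)
```

Halving `eps` roughly doubles $1/\gamma$ and therefore the iteration count, which is the $\varepsilon^{-1}\log(1/\varepsilon)$ scaling used in the comparison of the three algorithms.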
