COS 511: Foundations of Machine Learning
Rob Schapire                                        Lecture #8
Scribe: Monirul I Sharif                            April 10, 2003

1 Review From Last Time

Last time, we were talking about how to model distributions, and we had this setup. Given:

- examples $x_1, \ldots, x_m \in X$, i.i.d. from $D$, which we are trying to model, where $|X| = N$
- features $f_1, \ldots, f_n$, where $f_j : X \to \mathbb{R}$
- the empirical average of each feature,
$$\hat{E}[f_j] = \frac{1}{m} \sum_{i=1}^{m} f_j(x_i)$$
- and the two sets $P$ and $Q$ defined as
$$P = \left\{ p : E_p[f_j] = \hat{E}[f_j] \ \forall j \right\}, \qquad
Q = \left\{ q \text{ of the form } q(x) = \frac{\exp\big(\sum_j \lambda_j f_j(x)\big)}{Z_\lambda} \right\}.$$

We also introduced the duality theorem:

Theorem 1 The following are equivalent:
1. $q^* = \arg\max_{p \in P} H(p)$
2. $q^* = \arg\max_{q \in \bar{Q}} \sum_i \ln q(x_i)$
3. $q^* \in P \cap \bar{Q}$.
Moreover, $q^*$ is uniquely defined by any one of these conditions.

Today we talk about how to find this $q^*$. We will describe a nice simple algorithm, called the Iterative Scaling algorithm, that can be used to find $q^*$.

2 Iterative Scaling Algorithm

2.1 Derivation

We are looking for a sequence of $\lambda$'s that leads to the $\lambda$ minimizing the negative log likelihood, i.e., maximizing the likelihood. Let us introduce some new notation. Let $g_\lambda$ be the linear combination of features for a particular $\lambda$, and let $q_\lambda$ be the associated distribution, where
$$g_\lambda(x) = \sum_j \lambda_j f_j(x) \quad \text{and} \quad q_\lambda(x) = e^{g_\lambda(x)} / Z_\lambda.$$

Now, the log loss is $L(\lambda)$, where
$$L(\lambda) = -\frac{1}{m} \sum_{i=1}^{m} \ln q_\lambda(x_i). \tag{1}$$

We are looking for a sequence of vectors $\lambda_1, \lambda_2, \ldots$, and we want $L$ to go down at every step. So in a single time step, say from $t$ to $t+1$, the change is
$$\Delta L = L(\lambda_{t+1}) - L(\lambda_t), \tag{2}$$
and we want to upper bound it. Let $\lambda' = \lambda_{t+1}$ and $\lambda = \lambda_t$; also let $\alpha$ be such that $\lambda' = \lambda + \alpha$. From (2) we get
$$\Delta L = -\frac{1}{m} \sum_i \ln \frac{e^{g_{\lambda'}(x_i)}}{Z_{\lambda'}} + \frac{1}{m} \sum_i \ln \frac{e^{g_{\lambda}(x_i)}}{Z_{\lambda}}
= -\frac{1}{m} \sum_i \big[ g_{\lambda'}(x_i) - g_\lambda(x_i) \big] + \ln Z_{\lambda'} - \ln Z_\lambda. \tag{3}$$

Now, the first term of (3) (without its sign) is
$$\frac{1}{m} \sum_i \sum_j \alpha_j f_j(x_i) = \sum_j \alpha_j \cdot \frac{1}{m} \sum_i f_j(x_i) = \sum_j \alpha_j \hat{E}[f_j]. \tag{4}$$

Now,
$$\frac{Z_{\lambda'}}{Z_\lambda} = \frac{1}{Z_\lambda} \sum_x \exp\Big( \sum_j \lambda'_j f_j(x) \Big)
= \sum_x \frac{1}{Z_\lambda} \exp\Big( \sum_j \lambda_j f_j(x) \Big) \exp\Big( \sum_j \alpha_j f_j(x) \Big)$$
$$= \sum_x q_\lambda(x) \exp\Big( \sum_j \alpha_j f_j(x) \Big) \qquad \text{(plugging in the value of $q_\lambda$)}$$
$$\le \sum_x q_\lambda(x) \sum_j f_j(x) e^{\alpha_j} \qquad \text{(by convexity of $e^z$, using the assumption that $f_j(x) \ge 0$ and $\sum_j f_j(x) = 1$ for every $x$)}$$
$$= \sum_j e^{\alpha_j} \sum_x q_\lambda(x) f_j(x) = \sum_j e^{\alpha_j} E_{q_\lambda}[f_j]. \tag{5}$$
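The convexity step above is just Jensen's inequality applied to the convex function $e^z$, with the feature values $f_j(x)$ serving as mixture weights. Here is a quick numerical sanity check of that inequality; this is a throwaway illustration, not part of the algorithm, and all names are invented:

```python
import numpy as np

# Check exp(sum_j alpha_j f_j(x)) <= sum_j f_j(x) * e^{alpha_j}
# when f(x) is a probability vector (f_j(x) >= 0, sum_j f_j(x) = 1).
rng = np.random.default_rng(0)
for _ in range(1000):
    f = rng.dirichlet(np.ones(5))    # random nonnegative vector summing to 1
    alpha = rng.normal(size=5)       # an arbitrary update direction
    lhs = np.exp(alpha @ f)          # exp of the weighted average
    rhs = f @ np.exp(alpha)          # weighted average of the exponentials
    assert lhs <= rhs + 1e-12        # Jensen: e^z is convex
```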

Plugging (4) and (5) into (3), we can show that
$$\Delta L \le -\sum_j \alpha_j \hat{E}[f_j] + \ln \Big( \sum_j e^{\alpha_j} E_{q_\lambda}[f_j] \Big).$$

We have derived an upper bound on the change in the loss function. Now, to optimize, we can take the derivative with respect to each $\alpha_j$ and choose the $\alpha$ that makes the bound smallest:
$$-\hat{E}_j + \frac{E_j e^{\alpha_j}}{\sum_{j'} e^{\alpha_{j'}} E_{j'}} = 0, \tag{6}$$
where $\hat{E}_j = \hat{E}[f_j]$ and $E_j = E_{q_\lambda}[f_j]$. The thing to notice is that if we have any solution of this, say $\alpha$, we can add a constant $c$ to every coordinate and get another solution $\alpha'_j = \alpha_j + c$, because the constants cancel. We will choose this constant in such a way that the denominator of the second term equals 1. By using that trick we can set
$$\alpha_j = \ln \frac{\hat{E}_j}{E_j}.$$

2.2 The Algorithm

The algorithm works iteratively by calculating successive $\lambda_t$'s in each round, for $t = 1, 2, \ldots$:
$$\lambda_{t+1,j} = \lambda_{t,j} + \ln \frac{\hat{E}[f_j]}{E_{q_{\lambda_t}}[f_j]}.$$

The above can be written using probability distributions:
$$p_{t+1}(x) = \frac{1}{Z} \, p_t(x) \prod_j \left( \frac{\hat{E}[f_j]}{E_{p_t}[f_j]} \right)^{f_j(x)}.$$

We are trying to find a distribution of this form that converges to $q^*$. Let us try to understand the update intuitively. Assume that $\hat{E}[f_j] > E_{p_t}[f_j]$ (actually, we want them to be equal). The corresponding ratio inside the product will be greater than 1, so the distribution will concentrate more on the points with higher values of $f_j$, and will therefore increase the expected value of $f_j$, bringing it closer to the empirical average.
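To make the update concrete, here is a minimal sketch of iterative scaling on a tiny domain. The feature matrix, sample, and variable names are all invented for illustration; the features are chosen nonnegative and summing to one at each point, matching the assumption used in the convexity step of the derivation.

```python
import numpy as np

# Toy domain X = {0, 1, 2, 3} with n = 2 features: row x of F holds
# (f_1(x), f_2(x)); each row is nonnegative and sums to 1.
F = np.array([[0.9, 0.1],
              [0.6, 0.4],
              [0.3, 0.7],
              [0.1, 0.9]])

sample = np.array([0, 0, 1, 2, 1, 0, 3])   # hypothetical i.i.d. examples
E_hat = F[sample].mean(axis=0)             # empirical averages E^[f_j]

lam = np.zeros(F.shape[1])                 # start at lambda = 0, i.e. uniform q
for t in range(500):
    g = F @ lam                            # g_lambda(x) = sum_j lambda_j f_j(x)
    q = np.exp(g) / np.exp(g).sum()        # q_lambda(x) = e^{g_lambda(x)} / Z_lambda
    E_q = F.T @ q                          # E_{q_lambda}[f_j]
    lam += np.log(E_hat / E_q)             # lambda_{t+1,j} = lambda_{t,j} + ln(E^_j / E_j)

print(E_q, E_hat)   # after many rounds these match: the constraints of P hold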

Let us now prove that this works.

2.3 Proving That the Algorithm Works

Theorem 2 $p_t \to q^*$ as $t \to \infty$.

To prove the above, we are going to use something called an auxiliary function.

Auxiliary function: a function $A : \{\text{probability distributions}\} \to \mathbb{R}$ satisfying the following properties:
(1) $A$ is continuous;
(2) $L(\lambda_{t+1}) - L(\lambda_t) \le A(p_t) \le 0$, i.e., it bounds the change in the loss;
(3) $A(p) = 0 \implies \hat{E}[f_j] = E_p[f_j]$ for all $j$ (i.e., $p \in P$).

Proof: The idea is that we will first show that if an auxiliary function $A$ exists, then $p_t$ converges to $q^*$; we will then show that an auxiliary function does exist.

Our first claim is that $A(p_t) \to 0$ if an auxiliary function exists. As $A$ cannot become positive, $L$ is always decreasing. We also know that $L$ is always nonnegative from its definition in (1), so $L$ converges, and the differences $L(\lambda_{t+1}) - L(\lambda_t)$ have to converge to 0. So $A(p_t)$ is being squeezed between these differences and 0, and will also converge to 0.

Now let $p = \lim_{t} p_t$ (more precisely, the limit of any convergent subsequence of the $p_t$'s). Since every $p_t$ has the exponential form, $p \in \bar{Q}$. Moreover,
$$A(p) = A\big( \lim_t p_t \big) = \lim_t A(p_t) = 0$$
by continuity of $A$, so $p \in P$ by condition (3). So $p \in P \cap \bar{Q}$, which, by the duality theorem, implies that $p = q^*$.

Now we plug our choice of $\alpha_j$ into the upper bound on $\Delta L$: the log term vanishes since $\sum_j e^{\alpha_j} E_j = \sum_j \hat{E}_j = 1$, and we get
$$\Delta L \le -\sum_j \hat{E}_j \ln \frac{\hat{E}_j}{E_j} = -\mathrm{RE}\big( \hat{E} \,\|\, E \big) \doteq A(p_t),$$
that is, we take
$$A(p) = -\mathrm{RE}\big( \hat{E}[f] \,\|\, E_p[f] \big).$$

We have found our auxiliary function, because the required properties are met. First of all, relative entropy is continuous. Second of all, it is nonnegative, and therefore its negative is nonpositive. And finally, the last property: what happens if $A(p) = 0$? That means the relative entropy has to equal 0, and that can only happen when the two distributions are identical (note that $(\hat{E}[f_1], \ldots, \hat{E}[f_n])$ and $(E_p[f_1], \ldots, E_p[f_n])$ are both probability vectors, since the features sum to 1 at every point). That is exactly the third required property of our auxiliary function.
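Since the auxiliary function we found is just a negated relative entropy between two feature-expectation vectors, it gives a concrete quantity one could watch while running iterative scaling: it is negative until the constraints are met and tends to 0 as $p_t \to q^*$. A tiny illustrative helper, not from the notes, assuming strictly positive entries:

```python
import numpy as np

def aux_value(E_hat, E_p):
    """A(p) = -RE(E^[f] || E_p[f]); tends to 0 as p_t -> q*."""
    return -float(np.sum(E_hat * np.log(E_hat / E_p)))

print(aux_value(np.array([0.7, 0.3]), np.array([0.5, 0.5])))  # negative
print(aux_value(np.array([0.7, 0.3]), np.array([0.7, 0.3])))  # exactly 0.0
```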

3 Log Loss in Online Learning

Now let us change gears. We were talking about log loss; we will now look at it in a simpler setting, the online learning model. It has applications in information theory and portfolio management. Applications such as data compression are also connected to log loss, especially in the online learning model.

Now let us see how log loss comes up in the online learning model, i.e., learning with expert advice. We have a bunch of experts. Before, the experts were predicting classifications, but now, as we are trying to learn a distribution, each expert predicts a distribution. The master algorithm combines these predictions and makes a combined prediction. When an example is seen, everybody suffers a loss, namely the log loss on the example that was observed. So the model can be set up in the following way:

for $t = 1, 2, \ldots, T$:
- each expert $i$ predicts a distribution $p_{t,i}$ on $X$
- the master algorithm predicts $q_t$ (also a probability distribution on $X$)
- observe $x_t \in X$ (arbitrary)
- the master suffers loss $-\ln q_t(x_t)$
- expert $i$ suffers loss $-\ln p_{t,i}(x_t)$

So the algorithm observes $x_1, x_2, \ldots, x_{t-1}$ and repeatedly tries to predict the next outcome $x_t$. For example, consider the case that you are listening to someone talking and you want to predict the next word he says. You cannot predict the word exactly, but what you can do is predict a probability distribution over the next word he might say, i.e., how likely it is that each word will follow what has come before.

It must be mentioned that we are not going to suppose that we have a perfect expert. We are also not assuming that the $x_t$'s are random; they are arbitrary. We want to prove that the total or cumulative loss of the learning algorithm is not much worse than the loss of the best expert:
$$\underbrace{-\sum_{t=1}^{T} \ln q_t(x_t)}_{\text{cumulative loss}} \;\le\; \underbrace{\min_i \Big( -\sum_t \ln p_{t,i}(x_t) \Big)}_{\text{cumulative loss of best expert}} + \text{ small amount}. \tag{7}$$

We can view the predictions as conditional distributions:
$$p_{t,i}(x_t) = p_i(x_t \mid x^{t-1}), \qquad q_t(x_t) = q(x_t \mid x^{t-1}),$$
where $x^{t-1}$ means $x_1, \ldots, x_{t-1}$.

Let us talk a little bit about compression. Suppose we want to encode or compress messages that are coming from $X$; think of $X$ as an alphabet. You have at your disposal a set of compression algorithms. Naturally, we want to pick the one which gives the best compression, but we want to do that online. The set of approaches that we are working towards is called Universal Compression in information theory.

3.1 Algorithm for solving this model

Let us talk about an algorithm for solving this model. Here is the motivation behind it. We are going to pretend:
- an expert $i$ was chosen uniformly at random: $\Pr[i] = 1/N$ (this is called a prior)
- the sequence $x_1, x_2, \ldots$ is generated according to the distribution of that expert, i.e., $\Pr[x_t \mid x^{t-1}, i] = p_i(x_t \mid x^{t-1})$

We will compute
$$q(x_t \mid x^{t-1}) = \Pr[x_t \mid x^{t-1}]. \tag{8}$$

For computing the above we must use Bayes' rule. First,
$$\Pr[x_t \mid x^{t-1}] = \sum_i \underbrace{\Pr[i \mid x^{t-1}]}_{\text{called the posterior}} \cdot \underbrace{\Pr[x_t \mid x^{t-1}, i]}_{p_i(x_t \mid x^{t-1})}. \tag{9}$$

From Bayes' rule we know
$$\Pr[B \mid A] = \frac{\Pr[A \mid B] \Pr[B]}{\Pr[A]}.$$

Now we use Bayes' rule on the first factor in (9):
$$\Pr[i \mid x^{t-1}] = \frac{\Pr[x^{t-1} \mid i] \cdot \overbrace{\Pr[i]}^{1/N}}{\Pr[x^{t-1}]}. \tag{10}$$

Also,
$$\Pr[x^{t-1} \mid i] = \Pr[x_1 \mid i] \cdot \Pr[x_2 \mid x_1, i] \cdots \Pr[x_{t-1} \mid x^{t-2}, i]
= p_i(x_1) \cdot p_i(x_2 \mid x_1) \cdots p_i(x_{t-1} \mid x^{t-2})
= \prod_{t'=1}^{t-1} p_i(x_{t'} \mid x^{t'-1}) \doteq w_{t,i}, \tag{11}$$
where the last product defines the weight $w_{t,i}$.

Final simplified computation: using (9), (10) and (11) in (8), the factors $1/N$ and $\Pr[x^{t-1}]$ are the same for every $i$ and cancel in the normalization, so we get:

Compute:
$$q(x_t \mid x^{t-1}) = \frac{\overbrace{\sum_i w_{t,i} \, p_i(x_t \mid x^{t-1})}^{\text{weighted average of the experts' predictions}}}{\underbrace{\sum_i w_{t,i}}_{\text{normalization}}}$$

Update:
$$w_{t+1,i} = w_{t,i} \cdot p_i(x_t \mid x^{t-1}).$$
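Putting the compute and update steps together, here is a minimal sketch of the master algorithm over a two-letter alphabet. The two experts and the observed sequence are invented purely for illustration:

```python
import numpy as np

# Each expert maps the history x^{t-1} to a distribution over X = {0, 1};
# these fixed-prediction experts are hypothetical stand-ins.
experts = [
    lambda hist: np.array([0.9, 0.1]),    # expert 0: expects mostly 0s
    lambda hist: np.array([0.2, 0.8]),    # expert 1: expects mostly 1s
]

xs = [1, 1, 0, 1, 1]                      # an arbitrary observed sequence
w = np.ones(len(experts))                 # w_{1,i} = 1: the uniform prior
master_loss = 0.0
expert_loss = np.zeros(len(experts))

history = []
for x_t in xs:
    preds = np.vstack([e(history) for e in experts])  # row i: p_i(. | x^{t-1})
    q_t = (w @ preds) / w.sum()           # weighted average of the predictions
    master_loss += -np.log(q_t[x_t])      # master's log loss on x_t
    expert_loss += -np.log(preds[:, x_t]) # each expert's log loss on x_t
    w *= preds[:, x_t]                    # w_{t+1,i} = w_{t,i} * p_i(x_t | x^{t-1})
    history.append(x_t)

print(master_loss, expert_loss.min())     # master stays close to the best expert
```

Note that the multiplicative update maintains the product in (11) incrementally, so the posterior never has to be recomputed from scratch.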

3.2 Comparison with previous algorithms

We have seen algorithms like this before. For example, in the weighted majority algorithm, the computation step also calculated a weighted average, and the update rule is almost the same. In the weighted majority update rule we had $\beta^{\text{loss}}$, where the loss was 1 if we made a mistake and 0 if not. In this case we can think of that term as being
$$\beta^{-\ln p_i(x_t \mid x^{t-1})}.$$
If we choose $\beta = 1/e$, then this term becomes
$$(1/e)^{-\ln p_i(x_t \mid x^{t-1})} = e^{\ln p_i(x_t \mid x^{t-1})} = p_i(x_t \mid x^{t-1}).$$
In other words, we can think of this as not having to choose $\beta$ at all: it has a natural value, which is $1/e$. In the next lecture we will analyze this algorithm.
