Lecture 16: Backpropagation Algorithm

COS-511: Learning Theory, Spring 2017
Lecturer: Roi Livni

Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications. They may be distributed outside this class only with the permission of the Instructor.

So far we have discussed convex learning problems. Convex learning problems are of particular interest mainly because they come with strong theoretical guarantees. For example, we can apply the SGD algorithm to obtain desirable learning rates. As it turns out, even though non-convex problems pose formidable challenges in theory, they often tend to solve many interesting problems in practice.

In this lecture we will discuss the task of training neural networks using the Stochastic Gradient Descent algorithm. Even though we cannot guarantee that this algorithm will converge to an optimum, state-of-the-art results are often obtained by it, and it has become a benchmark algorithm for ML.

Neural Networks with smooth activation functions

We recall that given a graph $(V, E)$ and an activation function $\sigma$ we defined $\mathcal{N}_{(V,E),\sigma}$ to be the class of all neural networks implementable by the architecture of $(V, E)$ and activation function $\sigma$ (see lectures 5 and 6). Given a fixed architecture, a target function $f_{\omega,b} \in \mathcal{N}_{(V,E),\sigma}$ is parametrized by a set of weights $\omega : E \to \mathbb{R}$ and biases $b : V \to \mathbb{R}$. The empirical 0-1 loss is given by

$$L_S^{(0,1)}(\omega, b) = \sum_{i=1}^{m} \ell_{0,1}\big(f^{(0,1)}_{\omega,b}(x^{(i)}), y_i\big),$$

where we add the superscript $(0,1)$ to note that we are considering a target function in the class $\mathcal{N}_{(V,E),\sigma_{\mathrm{sgn}}}$.

Of course, the aforementioned problem is non-differentiable (in fact, not continuous), therefore we cannot apply an SGD-like method. We will therefore make two alterations to the architectures considered so far.

First, instead of the activation $\sigma_{\mathrm{sgn}}$ that we considered so far, we will consider a different activation function, namely

$$\sigma(a) = \frac{1}{1 + e^{-a}}.$$

This means that each neuron now returns as output

$$v_i^{(t)}(x) = \sigma\big(\langle \omega_i^{(t)}, v^{(t-1)}(x) \rangle + b_i^{(t)}\big),$$

which is a smooth function of its parameters. In turn, the function $f_{\omega,b}$ becomes smooth in its parameters (since it is a composition of sums of smooth functions).

Remark: Note that we care about smoothness in terms of $\omega$ and $b$. While $f_{\omega,b}$ is a function of $x$, in training we consider the empirical loss as a function of the parameters, and we want to optimize over these.

Second, the target function now does not return 0 or 1 but a real number, therefore we also replace the 0-1 loss with a surrogate convex loss function. For concreteness we let $\ell(a, y) = (a - y)^2$. We now obtain the differentiable empirical problem

$$L_S(\omega, b) = \sum_{i=1}^{m} \ell\big(f_{\omega,b}(x^{(i)}), y_i\big).$$
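To make the smoothed objective concrete, here is a minimal sketch of the forward pass and of $L_S(\omega, b)$ with the sigmoid activation and the squared loss. The nested-list layout of the weights, the tiny two-layer architecture and the sample values are assumptions made purely for illustration; they do not come from the lecture.

```python
import math

def sigmoid(a):
    """Smooth activation: sigma(a) = 1 / (1 + e^{-a})."""
    return 1.0 / (1.0 + math.exp(-a))

def forward(weights, biases, x):
    """Evaluate f_{w,b}(x) for a layered network.

    weights[t][i] is the weight vector of neuron i in layer t+1 and
    biases[t][i] is its bias (hypothetical layout chosen for this sketch).
    """
    v = list(x)
    for W, b in zip(weights, biases):
        v = [sigmoid(sum(w_ij * v_j for w_ij, v_j in zip(w_i, v)) + b_i)
             for w_i, b_i in zip(W, b)]
    return v[0]  # the top layer holds a single output neuron

def empirical_loss(weights, biases, sample):
    """L_S(w, b): sum over the sample of the squared loss (f(x) - y)^2."""
    return sum((forward(weights, biases, x) - y) ** 2 for x, y in sample)

# Illustrative values only.
weights = [[[1.0, -1.0], [0.5, 0.5]], [[2.0, -2.0]]]
biases = [[0.0, 0.0], [0.0]]
sample = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)]
print(empirical_loss(weights, biases, sample))
```

Because every operation in `forward` is a sum, a product or a sigmoid, the printed loss changes smoothly when any single weight or bias is perturbed, which is the property the two alterations are meant to buy.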

To see that these alterations do not cause any loss in expressive power or generalization we prove the following claim.

Claim. Let $(V, E)$ be a fixed feed-forward graph. Then for every sample $S$:

1. $\inf_{\omega,b} L_S(\omega, b) \le \inf_{\omega,b} L_S^{(0,1)}(\omega, b)$.
2. For every $(\omega, b)$: $\sum_{i=1}^{m} \ell_{0,1}\big(\mathrm{sgn}(f_{\omega,b}(x^{(i)})), y_i\big) \le L_S(\omega, b)$.

The first statement shows that we can achieve a solution that is competitive with the loss of the optimal neural network with the 0-1 activation function. The second statement tells us that the 0-1 solution of the optimizer of $L_S$ will also have small 0-1 loss. In other words, by minimizing the differentiable problem, we achieve a solution with small empirical 0-1 loss.

Proof. For the first statement, note that $\lim_{a \to \infty} \sigma(a) = 1$ and $\lim_{a \to -\infty} \sigma(a) = 0$, hence

$$\lim_{c \to \infty} f_{c\cdot\omega,\, c\cdot b} = f^{(0,1)}_{\omega,b},$$

hence

$$\lim_{c \to \infty} L_S(c\cdot\omega, c\cdot b) \le L_S^{(0,1)}(\omega, b),$$

hence the first statement holds. As to the second statement, this follows from the fact that $\ell$ is a surrogate loss function.

Thus we turned the non-smooth problem into a differentiable problem. This means that we can now try to apply a gradient descent method, similar to SGD as we used in convex problems. There are two issues to overcome:

1. Though the loss function might be convex, the ERM problem as a whole, given its dependence on the parameters, is non-convex. We have only shown that SGD converges when the ERM problem is convex in the parameters.
2. To perform SGD we still need to compute the gradient $\nabla f_{\omega,b}$, where the dependence between the parameters may be highly involved.

The first problem turns out to be a real issue, and indeed there is no guarantee that SGD will converge to a global optimum when the problem is essentially non-convex. In fact, even convergence to a local minimum is not guaranteed, though one can show that SGD will converge to a critical point (more accurately, to a point where $\|\nabla f_{\omega,b}\| \le \epsilon$, under certain smoothness assumptions). The problem is generally handled by re-running the algorithm from different initialization points, with the hope that one of the instances will indeed converge to a sufficiently optimal point. However, all hardness results we discussed so far apply: for any method, if the network is expressive enough to represent, for example, intersections of halfspaces, then for some instances the method must fail.

The second point is actually solvable, and we will next see how one can compute the gradient of the loss. This is known as the Backpropagation algorithm, which has become the workhorse of Machine Learning in the past few years.
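The scaling step in the proof of the claim above can be checked numerically. The sketch below, under assumptions chosen only for illustration (a single neuron, the specific weights and test points, and a 0/1 `threshold` function standing in for the limiting activation), measures how far the sigmoid neuron with parameters $(c\cdot\omega, c\cdot b)$ is from the thresholded neuron as $c$ grows. The limit only holds at points whose pre-activation is nonzero, which is true for the points used here.

```python
import math

def sigmoid(a):
    """Numerically stable sigmoid."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def threshold(a):
    """0/1 limit of the sigmoid as its argument is scaled up."""
    return 1.0 if a > 0 else 0.0

def neuron(w, b, x, activation, c=1.0):
    """Output of one neuron whose weights and bias are both scaled by c."""
    return activation(c * (sum(wi * xi for wi, xi in zip(w, x)) + b))

# Illustrative parameters and inputs (pre-activations 1.5, -1.5, 0.5).
w, b = [1.0, -2.0], 0.5
points = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0]]

gaps = []
for c in (1.0, 10.0, 100.0):
    gap = max(abs(neuron(w, b, x, sigmoid, c) - neuron(w, b, x, threshold))
              for x in points)
    gaps.append(gap)
    print(c, gap)
```

The printed gap shrinks as $c$ increases, matching $\lim_{c \to \infty} f_{c\cdot\omega, c\cdot b} = f^{(0,1)}_{\omega,b}$ for this single neuron.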

A Few Remarks on NNs in practice

Before presenting the Backpropagation algorithm, it is worth discussing some simplifications we have considered here over what is often used in practice.

The activation function. We are restricting our attention to a sigmoidal activation function. These have been used in the past, the general intuition being that they are a smoothing of the 0-1 activation function. In reality, training with a sigmoidal function tends to get stuck: when the weights are very large, the derivative starts to behave roughly like that of the 0-1 function, which means it vanishes. One change that was suggested is to use the ReLU activation function

$$\sigma_{\mathrm{relu}}(a) = \max(0, a).$$

Unlike the sigmoidal function, its derivative does not vanish whenever the input is positive. In terms of expressive power, ReLUs can express a sigmoid-like function using $\sigma_{\mathrm{relu}}(a + 1) - \sigma_{\mathrm{relu}}(a)$. So the overall expressivity of the network does not change (as long as we allow twice as many neurons at each layer, which is the same order of neurons).

Regularization. For generalization we rely here on the generalization bound of $O(|E| \log |E|)$. In practice the number of free parameters (weights and biases) tends to be far larger than the number of examples. Therefore some regularization is often employed on the weights (e.g. $\ell_2$ or $\ell_1$ regularization). There have also been other heuristics for regularizing neural networks, such as dropout, where, roughly, during training one zeroes out some weights during the update step.

As we saw in a past lecture, SGD comes with its own generalization guarantees. Generalization bounds for SGD in non-convex optimization have been recently obtained [?], but these are not necessarily for the learning rates used in practice.

The Backpropagation Algorithm

We next discuss the Backpropagation algorithm, which computes $\nabla f_{\omega,b}$ in linear time. To simplify and make notation easier, instead of carrying a bias term, let us assume that each layer $V^{(t)}$ contains a single neuron $v_0^{(t)}$ that always outputs the constant 1. Thus the output of a neuron is given by $\sigma(\langle \omega_i, v^{(t-1)} \rangle)$, and we suppress the bias $b$ as an additional weight $\omega_{i,0}$. We next wish to compute the derivative $\partial f_\omega / \partial \omega_{i,j}$.
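Now suppose neuron $v_i^{(t)}$ computes

$$v_i^{(t)}(x) = \sigma\big(u_i^{(t)}(x)\big), \qquad \text{where} \qquad u_i^{(t)}(x) = \langle \omega_i, v^{(t-1)}(x) \rangle.$$

Then, using a simple chain rule, we obtain that

$$\frac{\partial f_\omega}{\partial \omega_{i,j}} = \frac{\partial f_\omega}{\partial u_i^{(t)}} \cdot \frac{\partial u_i^{(t)}}{\partial \omega_{i,j}} = \frac{\partial f_\omega}{\partial u_i^{(t)}} \cdot v_j^{(t-1)}(x).$$

Thus, to compute the partial derivative with respect to a single weight, we see that it is enough to compute $\partial f_\omega / \partial u_i^{(t)}$.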
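The single-weight chain rule above can be sanity-checked on the simplest case, where $f_\omega$ is the output of one sigmoid neuron, so that $\partial f_\omega / \partial u = \sigma'(u)$. The following is a rough sketch only: the weight values, the inputs and the finite-difference step `eps` are assumptions picked for illustration.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def sigmoid_prime(a):
    """Derivative of the sigmoid: sigma'(a) = sigma(a) * (1 - sigma(a))."""
    s = sigmoid(a)
    return s * (1.0 - s)

def f(w, v_prev):
    """Output of a single neuron: sigma(<w, v_prev>)."""
    return sigmoid(sum(wj * vj for wj, vj in zip(w, v_prev)))

# Illustrative values; the first input plays the role of the constant-1 neuron.
w = [0.3, -0.7, 0.2]
v_prev = [1.0, 0.5, -1.5]

# Chain rule: df/dw_j = (df/du) * v_j, with df/du = sigma'(u) for this neuron.
u = sum(wj * vj for wj, vj in zip(w, v_prev))
analytic = [sigmoid_prime(u) * vj for vj in v_prev]

# Central finite differences as an independent check.
eps = 1e-6
numeric = []
for j in range(len(w)):
    w_plus = list(w)
    w_plus[j] += eps
    w_minus = list(w)
    w_minus[j] -= eps
    numeric.append((f(w_plus, v_prev) - f(w_minus, v_prev)) / (2 * eps))

print(analytic)
print(numeric)
```

The two printed lists should agree to many decimal places, which is what the identity $\partial f_\omega / \partial \omega_{i,j} = (\partial f_\omega / \partial u_i^{(t)}) \cdot v_j^{(t-1)}(x)$ predicts.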

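So we focus on computing $\partial f_\omega / \partial u_i^{(t)}$. Now again suppose $f$ is a function of $u_1^{(t)}, \ldots, u_m^{(t)}$, which are in turn functions of some variable $z$. Then we have by the chain rule:

$$\frac{\partial f}{\partial z} = \sum_{i=1}^{m} \frac{\partial f}{\partial u_i^{(t)}} \cdot \frac{\partial u_i^{(t)}}{\partial z}. \qquad (16.1)$$

Now if $z = u_j^{(t-1)}$ belongs to some neuron in a previous layer, the calculation of $\partial u_i^{(t)} / \partial u_j^{(t-1)}$ is easy for our choice of activation function:

$$u_i^{(t)} = \sum_{j} \omega_{i,j}\, \sigma\big(u_j^{(t-1)}\big) \quad \Longrightarrow \quad \frac{\partial u_i^{(t)}}{\partial u_j^{(t-1)}} = \omega_{i,j}\, \sigma'\big(u_j^{(t-1)}\big).$$

Using Eq. (16.1) for $f = u_k^{(t')}$ we can recursively calculate all partial derivatives $\partial u_k^{(t')} / \partial u_i^{(t)}$ for $t < t'$, which in turn will give us also $\partial f_\omega / \partial \omega_{i,j}$.

The naive approach to calculating the gradient is to calculate inductively all derivatives of the form $\partial u_k^{(t')} / \partial u_i^{(t)}$ for $t < t'$, and then, using Eq. (16.1) with $f = u_l^{(t'+1)}$, to calculate all derivatives $\partial u_l^{(t'+1)} / \partial u_i^{(t)}$. This calculation invokes Eq. (16.1), for each fixed neuron, a number of times proportional to the number of edges $|E|$, therefore the overall calculation time is $O(|V||E|)$. The backpropagation algorithm calculates the derivatives through dynamic programming and reduces the complexity to $O(|V| + |E|)$.

Backpropagation

We next consider an approach to calculating the partial derivatives that takes time $O(|V| + |E|)$.

Algorithm 1 Backpropagation
Input: Graph $G(V, E)$ and parameters $\omega : E \to \mathbb{R}$.
SET $T = \mathrm{depth}(G)$, i.e. $v^{(T)}$ is the output neuron.
SET $m^{(T)} = 1$.
for $t = T, T-1, \ldots, 1$ do    % Start from the top layer and move toward the bottom layer
    for $i = 1, \ldots, |V^{(t)}|$ do    % Go over all neurons at layer $t$
        Neuron $v_i^{(t)}$ receives the messages $m_j^{(t+1)}(v_i^{(t)})$ and sums them up:
            $\delta_i^{(t)} = \sum_{j=1}^{|V^{(t+1)}|} m_j^{(t+1)}(v_i^{(t)})$    (for the output neuron, $\delta^{(T)} = m^{(T)} = 1$),
        and passes a message to each neuron $v_j^{(t-1)}$ at a lower level:
            $m_i^{(t)}(v_j^{(t-1)}) = \delta_i^{(t)}\, \sigma'\big(u_i^{(t)}\big)\, \omega_{i,j}$
    end for
end for

Claim. At each node $v_i^{(t)}$ the value $\delta_i^{(t)}$ is exactly $\partial f_\omega / \partial v_i^{(t)}$. Since $v_i^{(t)} = \sigma(u_i^{(t)})$, this also gives $\partial f_\omega / \partial u_i^{(t)} = \delta_i^{(t)}\, \sigma'(u_i^{(t)})$, and hence $\partial f_\omega / \partial \omega_{i,j} = \delta_i^{(t)}\, \sigma'(u_i^{(t)})\, v_j^{(t-1)}(x)$.

Proof. We prove the statement by induction. The message that the output neuron receives is given by $\partial f_\omega / \partial f_\omega = 1$. Next, for each neuron $v_i^{(t)}$ we have by Eq. (16.1):

$$\frac{\partial f_\omega}{\partial v_i^{(t)}} = \sum_{j=1}^{|V^{(t+1)}|} \frac{\partial f_\omega}{\partial u_j^{(t+1)}} \cdot \frac{\partial u_j^{(t+1)}}{\partial v_i^{(t)}}. \qquad (16.2)$$

By the induction hypothesis, $\partial f_\omega / \partial u_j^{(t+1)} = \delta_j^{(t+1)}\, \sigma'(u_j^{(t+1)})$, and $\partial u_j^{(t+1)} / \partial v_i^{(t)} = \omega_{j,i}$, so each summand is exactly the message $m_j^{(t+1)}(v_i^{(t)})$, which gives the desired result.

In Backpropagation, each neuron performs a number of computations proportional to its degree. Overall, the number of calculations is proportional to twice the number of edges, which gives an overall number of calculations of $O(|V| + |E|)$.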
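The message-passing scheme can be sketched in a few lines and compared against finite differences. This is a minimal illustration under stated assumptions, not the lecture's own implementation: the network is fully connected between consecutive layers, has a single sigmoid output neuron, and omits the constant-1 bias neurons for brevity; the names `forward`, `backprop` and `output`, the weight values and the step `eps` are all hypothetical. The variable `delta[i]` holds the sum of messages received by neuron $i$ of the current layer, i.e. the derivative of the output with respect to that neuron's value.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(W, x):
    """Return (u, v) with v[0] = x and v[t] = sigmoid(u[t]) for t = 1..T."""
    u, v = [None], [list(x)]
    for W_t in W:
        u_t = [sum(w * vj for w, vj in zip(row, v[-1])) for row in W_t]
        u.append(u_t)
        v.append([sigmoid(a) for a in u_t])
    return u, v

def backprop(W, x):
    """Gradient of the output f = v[T][0] w.r.t. every weight W[t-1][i][j]."""
    u, v = forward(W, x)
    T = len(W)
    delta = [1.0]            # message into the output neuron: df/df = 1
    grads = [None] * T
    for t in range(T, 0, -1):            # top layer down to layer 1
        W_t = W[t - 1]
        # df/du_i = delta_i * sigma'(u_i), using sigma'(u) = v * (1 - v)
        du = [d * vi * (1.0 - vi) for d, vi in zip(delta, v[t])]
        # df/dw_ij = df/du_i * v_j^{(t-1)}
        grads[t - 1] = [[du_i * vj for vj in v[t - 1]] for du_i in du]
        # neuron i sends du_i * w_ij down to neuron j, which sums what it gets
        delta = [sum(du[i] * W_t[i][j] for i in range(len(W_t)))
                 for j in range(len(v[t - 1]))]
    return grads

def output(W, x):
    return forward(W, x)[1][-1][0]

# Illustrative network: 2 inputs, 3 hidden neurons, 1 output neuron.
W = [
    [[0.5, -0.3], [0.8, 0.1], [-0.6, 0.4]],
    [[0.2, -0.7, 0.9]],
]
x = [1.0, -2.0]
grads = backprop(W, x)

# Compare with central finite differences, one weight at a time.
eps = 1e-6
max_err = 0.0
for t in range(len(W)):
    for i in range(len(W[t])):
        for j in range(len(W[t][i])):
            old = W[t][i][j]
            W[t][i][j] = old + eps
            f_plus = output(W, x)
            W[t][i][j] = old - eps
            f_minus = output(W, x)
            W[t][i][j] = old
            numeric = (f_plus - f_minus) / (2 * eps)
            max_err = max(max_err, abs(numeric - grads[t][i][j]))
print(max_err)
```

Each weight is read a constant number of times inside `backprop`, in line with the $O(|V| + |E|)$ count, whereas the finite-difference check re-runs a full forward pass per weight and is included only as a correctness test.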

Rademacher Complexity. Examples

Rademacher Complexity. Examples Algorthmc Foudatos of Learg Lecture 3 Rademacher Complexty. Examples Lecturer: Patrck Rebesch Verso: October 16th 018 3.1 Itroducto I the last lecture we troduced the oto of Rademacher complexty ad showed

More information

1 Onto functions and bijections Applications to Counting

1 Onto functions and bijections Applications to Counting 1 Oto fuctos ad bectos Applcatos to Coutg Now we move o to a ew topc. Defto 1.1 (Surecto. A fucto f : A B s sad to be surectve or oto f for each b B there s some a A so that f(a B. What are examples of

More information

Econometric Methods. Review of Estimation

Econometric Methods. Review of Estimation Ecoometrc Methods Revew of Estmato Estmatg the populato mea Radom samplg Pot ad terval estmators Lear estmators Ubased estmators Lear Ubased Estmators (LUEs) Effcecy (mmum varace) ad Best Lear Ubased Estmators

More information

Dimensionality Reduction and Learning

Dimensionality Reduction and Learning CMSC 35900 (Sprg 009) Large Scale Learg Lecture: 3 Dmesoalty Reducto ad Learg Istructors: Sham Kakade ad Greg Shakharovch L Supervsed Methods ad Dmesoalty Reducto The theme of these two lectures s that

More information

Regression and the LMS Algorithm

Regression and the LMS Algorithm CSE 556: Itroducto to Neural Netorks Regresso ad the LMS Algorthm CSE 556: Regresso 1 Problem statemet CSE 556: Regresso Lear regresso th oe varable Gve a set of N pars of data {, d }, appromate d b a

More information

Bayes (Naïve or not) Classifiers: Generative Approach

Bayes (Naïve or not) Classifiers: Generative Approach Logstc regresso Bayes (Naïve or ot) Classfers: Geeratve Approach What do we mea by Geeratve approach: Lear p(y), p(x y) ad the apply bayes rule to compute p(y x) for makg predctos Ths s essetally makg

More information

CSE 5526: Introduction to Neural Networks Linear Regression

CSE 5526: Introduction to Neural Networks Linear Regression CSE 556: Itroducto to Neural Netorks Lear Regresso Part II 1 Problem statemet Part II Problem statemet Part II 3 Lear regresso th oe varable Gve a set of N pars of data , appromate d by a lear fucto

More information

Feature Selection: Part 2. 1 Greedy Algorithms (continued from the last lecture)

Feature Selection: Part 2. 1 Greedy Algorithms (continued from the last lecture) CSE 546: Mache Learg Lecture 6 Feature Selecto: Part 2 Istructor: Sham Kakade Greedy Algorthms (cotued from the last lecture) There are varety of greedy algorthms ad umerous amg covetos for these algorthms.

More information

Introduction to local (nonparametric) density estimation. methods

Introduction to local (nonparametric) density estimation. methods Itroducto to local (oparametrc) desty estmato methods A slecture by Yu Lu for ECE 66 Sprg 014 1. Itroducto Ths slecture troduces two local desty estmato methods whch are Parze desty estmato ad k-earest

More information

CS 2750 Machine Learning. Lecture 8. Linear regression. CS 2750 Machine Learning. Linear regression. is a linear combination of input components x

CS 2750 Machine Learning. Lecture 8. Linear regression. CS 2750 Machine Learning. Linear regression. is a linear combination of input components x CS 75 Mache Learg Lecture 8 Lear regresso Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 75 Mache Learg Lear regresso Fucto f : X Y s a lear combato of put compoets f + + + K d d K k - parameters

More information

Lecture 02: Bounding tail distributions of a random variable

Lecture 02: Bounding tail distributions of a random variable CSCI-B609: A Theorst s Toolkt, Fall 206 Aug 25 Lecture 02: Boudg tal dstrbutos of a radom varable Lecturer: Yua Zhou Scrbe: Yua Xe & Yua Zhou Let us cosder the ubased co flps aga. I.e. let the outcome

More information

Unsupervised Learning and Other Neural Networks

Unsupervised Learning and Other Neural Networks CSE 53 Soft Computg NOT PART OF THE FINAL Usupervsed Learg ad Other Neural Networs Itroducto Mture Destes ad Idetfablty ML Estmates Applcato to Normal Mtures Other Neural Networs Itroducto Prevously, all

More information

Discrete Mathematics and Probability Theory Fall 2016 Seshia and Walrand DIS 10b

Discrete Mathematics and Probability Theory Fall 2016 Seshia and Walrand DIS 10b CS 70 Dscrete Mathematcs ad Probablty Theory Fall 206 Sesha ad Walrad DIS 0b. Wll I Get My Package? Seaky delvery guy of some compay s out delverg packages to customers. Not oly does he had a radom package

More information

Part 4b Asymptotic Results for MRR2 using PRESS. Recall that the PRESS statistic is a special type of cross validation procedure (see Allen (1971))

Part 4b Asymptotic Results for MRR2 using PRESS. Recall that the PRESS statistic is a special type of cross validation procedure (see Allen (1971)) art 4b Asymptotc Results for MRR usg RESS Recall that the RESS statstc s a specal type of cross valdato procedure (see Alle (97)) partcular to the regresso problem ad volves fdg Y $,, the estmate at the

More information

Generalized Linear Regression with Regularization

Generalized Linear Regression with Regularization Geeralze Lear Regresso wth Regularzato Zoya Bylsk March 3, 05 BASIC REGRESSION PROBLEM Note: I the followg otes I wll make explct what s a vector a what s a scalar usg vec t or otato, to avo cofuso betwee

More information

MATH 247/Winter Notes on the adjoint and on normal operators.

MATH 247/Winter Notes on the adjoint and on normal operators. MATH 47/Wter 00 Notes o the adjot ad o ormal operators I these otes, V s a fte dmesoal er product space over, wth gve er * product uv, T, S, T, are lear operators o V U, W are subspaces of V Whe we say

More information

Third handout: On the Gini Index

Third handout: On the Gini Index Thrd hadout: O the dex Corrado, a tala statstca, proposed (, 9, 96) to measure absolute equalt va the mea dfferece whch s defed as ( / ) where refers to the total umber of dvduals socet. Assume that. The

More information

CHAPTER 4 RADICAL EXPRESSIONS

CHAPTER 4 RADICAL EXPRESSIONS 6 CHAPTER RADICAL EXPRESSIONS. The th Root of a Real Number A real umber a s called the th root of a real umber b f Thus, for example: s a square root of sce. s also a square root of sce ( ). s a cube

More information

Chapter 5 Properties of a Random Sample

Chapter 5 Properties of a Random Sample Lecture 6 o BST 63: Statstcal Theory I Ku Zhag, /0/008 Revew for the prevous lecture Cocepts: t-dstrbuto, F-dstrbuto Theorems: Dstrbutos of sample mea ad sample varace, relatoshp betwee sample mea ad sample

More information

For combinatorial problems we might need to generate all permutations, combinations, or subsets of a set.

For combinatorial problems we might need to generate all permutations, combinations, or subsets of a set. Addtoal Decrease ad Coquer Algorthms For combatoral problems we mght eed to geerate all permutatos, combatos, or subsets of a set. Geeratg Permutatos If we have a set f elemets: { a 1, a 2, a 3, a } the

More information

= lim. (x 1 x 2... x n ) 1 n. = log. x i. = M, n

= lim. (x 1 x 2... x n ) 1 n. = log. x i. = M, n .. Soluto of Problem. M s obvously cotuous o ], [ ad ], [. Observe that M x,..., x ) M x,..., x ) )..) We ext show that M s odecreasg o ], [. Of course.) mles that M s odecreasg o ], [ as well. To show

More information

MOLECULAR VIBRATIONS

MOLECULAR VIBRATIONS MOLECULAR VIBRATIONS Here we wsh to vestgate molecular vbratos ad draw a smlarty betwee the theory of molecular vbratos ad Hückel theory. 1. Smple Harmoc Oscllator Recall that the eergy of a oe-dmesoal

More information

UNIT 2 SOLUTION OF ALGEBRAIC AND TRANSCENDENTAL EQUATIONS

UNIT 2 SOLUTION OF ALGEBRAIC AND TRANSCENDENTAL EQUATIONS Numercal Computg -I UNIT SOLUTION OF ALGEBRAIC AND TRANSCENDENTAL EQUATIONS Structure Page Nos..0 Itroducto 6. Objectves 7. Ital Approxmato to a Root 7. Bsecto Method 8.. Error Aalyss 9.4 Regula Fals Method

More information

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model Lecture 7. Cofdece Itervals ad Hypothess Tests the Smple CLR Model I lecture 6 we troduced the Classcal Lear Regresso (CLR) model that s the radom expermet of whch the data Y,,, K, are the outcomes. The

More information

Supervised learning: Linear regression Logistic regression

Supervised learning: Linear regression Logistic regression CS 57 Itroducto to AI Lecture 4 Supervsed learg: Lear regresso Logstc regresso Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 57 Itro to AI Data: D { D D.. D D Supervsed learg d a set of eamples s

More information

Investigation of Partially Conditional RP Model with Response Error. Ed Stanek

Investigation of Partially Conditional RP Model with Response Error. Ed Stanek Partally Codtoal Radom Permutato Model 7- vestgato of Partally Codtoal RP Model wth Respose Error TRODUCTO Ed Staek We explore the predctor that wll result a smple radom sample wth respose error whe a

More information

A tighter lower bound on the circuit size of the hardest Boolean functions

A tighter lower bound on the circuit size of the hardest Boolean functions Electroc Colloquum o Computatoal Complexty, Report No. 86 2011) A tghter lower boud o the crcut sze of the hardest Boolea fuctos Masak Yamamoto Abstract I [IPL2005], Fradse ad Mlterse mproved bouds o the

More information

The Selection Problem - Variable Size Decrease/Conquer (Practice with algorithm analysis)

The Selection Problem - Variable Size Decrease/Conquer (Practice with algorithm analysis) We have covered: Selecto, Iserto, Mergesort, Bubblesort, Heapsort Next: Selecto the Qucksort The Selecto Problem - Varable Sze Decrease/Coquer (Practce wth algorthm aalyss) Cosder the problem of fdg the

More information

PTAS for Bin-Packing

PTAS for Bin-Packing CS 663: Patter Matchg Algorthms Scrbe: Che Jag /9/00. Itroducto PTAS for B-Packg The B-Packg problem s NP-hard. If we use approxmato algorthms, the B-Packg problem could be solved polyomal tme. For example,

More information

THE ROYAL STATISTICAL SOCIETY 2016 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5

THE ROYAL STATISTICAL SOCIETY 2016 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5 THE ROYAL STATISTICAL SOCIETY 06 EAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5 The Socety s provdg these solutos to assst cadtes preparg for the examatos 07. The solutos are teded as learg ads ad should

More information

( ) 2 2. Multi-Layer Refraction Problem Rafael Espericueta, Bakersfield College, November, 2006

( ) 2 2. Multi-Layer Refraction Problem Rafael Espericueta, Bakersfield College, November, 2006 Mult-Layer Refracto Problem Rafael Espercueta, Bakersfeld College, November, 006 Lght travels at dfferet speeds through dfferet meda, but refracts at layer boudares order to traverse the least-tme path.

More information

Chapter 4 (Part 1): Non-Parametric Classification (Sections ) Pattern Classification 4.3) Announcements

Chapter 4 (Part 1): Non-Parametric Classification (Sections ) Pattern Classification 4.3) Announcements Aoucemets No-Parametrc Desty Estmato Techques HW assged Most of ths lecture was o the blacboard. These sldes cover the same materal as preseted DHS Bometrcs CSE 90-a Lecture 7 CSE90a Fall 06 CSE90a Fall

More information

MA/CSSE 473 Day 27. Dynamic programming

MA/CSSE 473 Day 27. Dynamic programming MA/CSSE 473 Day 7 Dyamc Programmg Bomal Coeffcets Warshall's algorthm (Optmal BSTs) Studet questos? Dyamc programmg Used for problems wth recursve solutos ad overlappg subproblems Typcally, we save (memoze)

More information

Summary of the lecture in Biostatistics

Summary of the lecture in Biostatistics Summary of the lecture Bostatstcs Probablty Desty Fucto For a cotuos radom varable, a probablty desty fucto s a fucto such that: 0 dx a b) b a dx A probablty desty fucto provdes a smple descrpto of the

More information

Lecture 9: Tolerant Testing

Lecture 9: Tolerant Testing Lecture 9: Tolerat Testg Dael Kae Scrbe: Sakeerth Rao Aprl 4, 07 Abstract I ths lecture we prove a quas lear lower boud o the umber of samples eeded to do tolerat testg for L dstace. Tolerat Testg We have

More information

ENGI 4421 Propagation of Error Page 8-01

ENGI 4421 Propagation of Error Page 8-01 ENGI 441 Propagato of Error Page 8-01 Propagato of Error [Navd Chapter 3; ot Devore] Ay realstc measuremet procedure cotas error. Ay calculatos based o that measuremet wll therefore also cota a error.

More information

1. A real number x is represented approximately by , and we are told that the relative error is 0.1 %. What is x? Note: There are two answers.

1. A real number x is represented approximately by , and we are told that the relative error is 0.1 %. What is x? Note: There are two answers. PROBLEMS A real umber s represeted appromately by 63, ad we are told that the relatve error s % What s? Note: There are two aswers Ht : Recall that % relatve error s What s the relatve error volved roudg

More information

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. x, where. = y - ˆ " 1

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. x, where. = y - ˆ  1 STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS Recall Assumpto E(Y x) η 0 + η x (lear codtoal mea fucto) Data (x, y ), (x 2, y 2 ),, (x, y ) Least squares estmator ˆ E (Y x) ˆ " 0 + ˆ " x, where ˆ

More information

Point Estimation: definition of estimators

Point Estimation: definition of estimators Pot Estmato: defto of estmators Pot estmator: ay fucto W (X,..., X ) of a data sample. The exercse of pot estmato s to use partcular fuctos of the data order to estmate certa ukow populato parameters.

More information

arxiv: v1 [cs.lg] 22 Feb 2015

arxiv: v1 [cs.lg] 22 Feb 2015 SDCA wthout Dualty Sha Shalev-Shwartz arxv:50.0677v cs.lg Feb 05 Abstract Stochastc Dual Coordate Ascet s a popular method for solvg regularzed loss mmzato for the case of covex losses. I ths paper we

More information

b. There appears to be a positive relationship between X and Y; that is, as X increases, so does Y.

b. There appears to be a positive relationship between X and Y; that is, as X increases, so does Y. .46. a. The frst varable (X) s the frst umber the par ad s plotted o the horzotal axs, whle the secod varable (Y) s the secod umber the par ad s plotted o the vertcal axs. The scatterplot s show the fgure

More information

Recall MLR 5 Homskedasticity error u has the same variance given any values of the explanatory variables Var(u x1,...,xk) = 2 or E(UU ) = 2 I

Recall MLR 5 Homskedasticity error u has the same variance given any values of the explanatory variables Var(u x1,...,xk) = 2 or E(UU ) = 2 I Chapter 8 Heterosedastcty Recall MLR 5 Homsedastcty error u has the same varace gve ay values of the eplaatory varables Varu,..., = or EUU = I Suppose other GM assumptos hold but have heterosedastcty.

More information

ANALYSIS ON THE NATURE OF THE BASIC EQUATIONS IN SYNERGETIC INTER-REPRESENTATION NETWORK

ANALYSIS ON THE NATURE OF THE BASIC EQUATIONS IN SYNERGETIC INTER-REPRESENTATION NETWORK Far East Joural of Appled Mathematcs Volume, Number, 2008, Pages Ths paper s avalable ole at http://www.pphm.com 2008 Pushpa Publshg House ANALYSIS ON THE NATURE OF THE ASI EQUATIONS IN SYNERGETI INTER-REPRESENTATION

More information

We have already referred to a certain reaction, which takes place at high temperature after rich combustion.

We have already referred to a certain reaction, which takes place at high temperature after rich combustion. ME 41 Day 13 Topcs Chemcal Equlbrum - Theory Chemcal Equlbrum Example #1 Equlbrum Costats Chemcal Equlbrum Example #2 Chemcal Equlbrum of Hot Bured Gas 1. Chemcal Equlbrum We have already referred to a

More information

Generative classification models

Generative classification models CS 75 Mache Learg Lecture Geeratve classfcato models Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square Data: D { d, d,.., d} d, Classfcato represets a dscrete class value Goal: lear f : X Y Bar classfcato

More information

F. Inequalities. HKAL Pure Mathematics. 進佳數學團隊 Dr. Herbert Lam 林康榮博士. [Solution] Example Basic properties

F. Inequalities. HKAL Pure Mathematics. 進佳數學團隊 Dr. Herbert Lam 林康榮博士. [Solution] Example Basic properties 進佳數學團隊 Dr. Herbert Lam 林康榮博士 HKAL Pure Mathematcs F. Ieualtes. Basc propertes Theorem Let a, b, c be real umbers. () If a b ad b c, the a c. () If a b ad c 0, the ac bc, but f a b ad c 0, the ac bc. Theorem

More information

Ordinary Least Squares Regression. Simple Regression. Algebra and Assumptions.

Ordinary Least Squares Regression. Simple Regression. Algebra and Assumptions. Ordary Least Squares egresso. Smple egresso. Algebra ad Assumptos. I ths part of the course we are gog to study a techque for aalysg the lear relatoshp betwee two varables Y ad X. We have pars of observatos

More information

Homework 1: Solutions Sid Banerjee Problem 1: (Practice with Asymptotic Notation) ORIE 4520: Stochastics at Scale Fall 2015

Homework 1: Solutions Sid Banerjee Problem 1: (Practice with Asymptotic Notation) ORIE 4520: Stochastics at Scale Fall 2015 Fall 05 Homework : Solutos Problem : (Practce wth Asymptotc Notato) A essetal requremet for uderstadg scalg behavor s comfort wth asymptotc (or bg-o ) otato. I ths problem, you wll prove some basc facts

More information

2. Independence and Bernoulli Trials

2. Independence and Bernoulli Trials . Ideedece ad Beroull Trals Ideedece: Evets ad B are deedet f B B. - It s easy to show that, B deedet mles, B;, B are all deedet ars. For examle, ad so that B or B B B B B φ,.e., ad B are deedet evets.,

More information

Simple Linear Regression

Simple Linear Regression Statstcal Methods I (EST 75) Page 139 Smple Lear Regresso Smple regresso applcatos are used to ft a model descrbg a lear relatoshp betwee two varables. The aspects of least squares regresso ad correlato

More information

Arithmetic Mean and Geometric Mean

Arithmetic Mean and Geometric Mean Acta Mathematca Ntresa Vol, No, p 43 48 ISSN 453-6083 Arthmetc Mea ad Geometrc Mea Mare Varga a * Peter Mchalča b a Departmet of Mathematcs, Faculty of Natural Sceces, Costate the Phlosopher Uversty Ntra,

More information

Investigating Cellular Automata

Investigating Cellular Automata Researcher: Taylor Dupuy Advsor: Aaro Wootto Semester: Fall 4 Ivestgatg Cellular Automata A Overvew of Cellular Automata: Cellular Automata are smple computer programs that geerate rows of black ad whte

More information

1 Review and Overview

1 Review and Overview CS9T/STATS3: Statstcal Learg Teory Lecturer: Tegyu Ma Lecture #7 Scrbe: Bra Zag October 5, 08 Revew ad Overvew We wll frst gve a bref revew of wat as bee covered so far I te frst few lectures, we stated

More information

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution:

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution: Chapter 4 Exercses Samplg Theory Exercse (Smple radom samplg: Let there be two correlated radom varables X ad A sample of sze s draw from a populato by smple radom samplg wthout replacemet The observed

More information

Non-uniform Turán-type problems

Non-uniform Turán-type problems Joural of Combatoral Theory, Seres A 111 2005 106 110 wwwelsevercomlocatecta No-uform Turá-type problems DhruvMubay 1, Y Zhao 2 Departmet of Mathematcs, Statstcs, ad Computer Scece, Uversty of Illos at

More information

X ε ) = 0, or equivalently, lim

X ε ) = 0, or equivalently, lim Revew for the prevous lecture Cocepts: order statstcs Theorems: Dstrbutos of order statstcs Examples: How to get the dstrbuto of order statstcs Chapter 5 Propertes of a Radom Sample Secto 55 Covergece

More information

18.657: Mathematics of Machine Learning

18.657: Mathematics of Machine Learning 8.657: Mathematcs of Mache Learg Lecturer: Phlppe Rgollet Lecture 3 Scrbe: James Hrst Sep. 6, 205.5 Learg wth a fte dctoary Recall from the ed of last lecture our setup: We are workg wth a fte dctoary

More information

Support vector machines

Support vector machines CS 75 Mache Learg Lecture Support vector maches Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 75 Mache Learg Outle Outle: Algorthms for lear decso boudary Support vector maches Mamum marg hyperplae.

More information

Algorithms Theory, Solution for Assignment 2

Algorithms Theory, Solution for Assignment 2 Juor-Prof. Dr. Robert Elsässer, Marco Muñz, Phllp Hedegger WS 2009/200 Algorthms Theory, Soluto for Assgmet 2 http://lak.formatk.u-freburg.de/lak_teachg/ws09_0/algo090.php Exercse 2. - Fast Fourer Trasform

More information

5 Short Proofs of Simplified Stirling s Approximation

5 Short Proofs of Simplified Stirling s Approximation 5 Short Proofs of Smplfed Strlg s Approxmato Ofr Gorodetsky, drtymaths.wordpress.com Jue, 20 0 Itroducto Strlg s approxmato s the followg (somewhat surprsg) approxmato of the factoral,, usg elemetary fuctos:

More information

STK4011 and STK9011 Autumn 2016

STK4011 and STK9011 Autumn 2016 STK4 ad STK9 Autum 6 Pot estmato Covers (most of the followg materal from chapter 7: Secto 7.: pages 3-3 Secto 7..: pages 3-33 Secto 7..: pages 35-3 Secto 7..3: pages 34-35 Secto 7.3.: pages 33-33 Secto

More information

Mu Sequences/Series Solutions National Convention 2014

Mu Sequences/Series Solutions National Convention 2014 Mu Sequeces/Seres Solutos Natoal Coveto 04 C 6 E A 6C A 6 B B 7 A D 7 D C 7 A B 8 A B 8 A C 8 E 4 B 9 B 4 E 9 B 4 C 9 E C 0 A A 0 D B 0 C C Usg basc propertes of arthmetc sequeces, we fd a ad bm m We eed

More information

å 1 13 Practice Final Examination Solutions - = CS109 Dec 5, 2018

å 1 13 Practice Final Examination Solutions - = CS109 Dec 5, 2018 Chrs Pech Fal Practce CS09 Dec 5, 08 Practce Fal Examato Solutos. Aswer: 4/5 8/7. There are multle ways to obta ths aswer; here are two: The frst commo method s to sum over all ossbltes for the rak of

More information

Lecture 12: Multilayer perceptrons II

Lecture 12: Multilayer perceptrons II Lecture : Multlayer perceptros II Bayes dscrmats ad MLPs he role of hdde uts A eample Itroducto to Patter Recoto Rcardo Guterrez-Osua Wrht State Uversty Bayes dscrmats ad MLPs ( As we have see throuhout

More information
