Rademacher Complexity. Examples

Size: px
Start display at page:

Download "Rademacher Complexity. Examples"

Transcription

1 Algorthmc Foudatos of Learg Lecture 3 Rademacher Complexty. Examples Lecturer: Patrck Rebesch Verso: October 16th Itroducto I the last lecture we troduced the oto of Rademacher complexty ad showed that t yelds a upper boud o the expected value of the uform (over the choce of actos/rules devato betwee the expected rsk r ad the emprcal rsk R, amely, where we recall the otato E {r(a R(a} E Rad(L {Z 1,..., Z } a A L := {z Z l(a, z R : a A}. I ths lecture we establsh bouds for Rad(L {z 1,..., z } for ay z 1,..., z Z the settg of regresso. I ervsed learg, the observed examples correspod to pars of pots,.e., Z = (X, Y X Y. The pot X s called feature or covarate, ad the pot Y s ts correspodg label. The set of admssble decsos s a subset of the set fuctos from X to Y,.e., A B := {a : X Y}, ad the loss fucto s of the form l(a, (x, y = φ(a(x, y, for a fucto φ : Y Y R +. The regresso settg s represeted by the choce X = R d for a gve dmeso d, Y = R. We have S = {(X 1, Y 1,..., (X, Y } ad s = {(x 1, y 1,..., (x, y } whch represets a realzato of the trag sample. Let us recall the followg otato: A {x 1,..., x } := {(a(x 1,..., a(x Y : a A}. Proposto 3.1 Let the fucto ŷ φ(ŷ, y be γ-lpschtz for ay y Y. The, for ay (x 1, y 1,..., (x, y X Y, Rad(L {(x 1, y 1,..., (x, y } γ Rad(A {x 1,..., x } Proof: By the cotracto property of Rademacher complexty, Lemma.10, we get 1 Rad(L s = E Ω φ(w x, y = Rad((φ(, y 1,..., φ(, y A {x 1,..., x } w R d : w c γ Rad(A {x 1,..., x }. Below we show how to cotrol the quatty Rad(A {x 1,..., x } for some fucto classes A of terest. 3. Lear predctors l /l costrats I the case of l /l costrats, the Rademacher complexty of lear predctors does ot deped explctly o the dmeso d (the depedece o d s mplct, va the term max x. 3-1

2 3- Lecture 3: Rademacher Complexty. Examples Proposto 3. Let A := {x R d w x : w R d, w 1}. The, for ay x 1..., x R d, Rad(A {x 1,..., x } max x Proof: We have Rad(A {x 1,..., x } = E w R d : w 1 w R d : w 1 Ω w x = E w R d : w 1 w ( Ω x w E Ω x by Cauchy-Schwarz s eq. x y x y E Ω x E Ω x d ( = E Ω x,j j=1 j=1 by Jese s, as x x s cocave d = E (Ω x,j as the Ω s are depedet ad EΩ = 0 = E x max x as Ω = 1. Remark 3.3 Note that as the predctors that we are cosderg are lear,.e., x R d w x, the costrat w 1 the defto of A Proposto 3. s wthout loss of geeralty. I fact, f w c for a gve costat c 0, the we ca rescale w x = ( w w ( w x ad we have the equvalece {x R d w x : w R d, w c} = {x R d w (cx : w R d, w 1}. Proposto 3. stll apples, wth a costat c o the rght-had sde of the boud. 3.3 Lear predctors l 1 /l costrats (l 1 Boostg I the case of l 1 /l costrats, the Rademacher complexty of lear predctors oly depeds logarthmcally o the dmeso d. Proposto 3.4 Let A 1 := {x R d w x : w R d, w 1 1}. The, for ay x 1..., x R d, Rad(A 1 {x 1,..., x } max x log(d

3 Lecture 3: Rademacher Complexty. Examples 3-3 Proof: We have Rad(A 1 {x 1,..., x } = E w R d : w 1 1 w R d : w 1 1 E Ω x Ω w x = E w 1 E Ω x w R d : w 1 1 Let t j := (x 1,j,..., x,j R for ay j 1 : d, ad let T = {t 1,..., t d }. The, Ω x = max Ω x,j = max w ( Ω x by Hölder s equalty x y x 1 y Ω t j, = max t T, Ω t whose expectato looks lke a Rademacher complexty apart from the absolute value (ad the ormalzato by 1/. To remove the absolute value, ote that for ay ω 1,..., ω { 1, 1} we have max t T ω t = max t T T ω t, where we have defed T = { t 1,..., t d }, wth t j = ( x 1,j,..., x,j. Hece, we have ad the proof follows by Massart s lemma as Rad(T T Rad(A 1 {x 1,..., x } Rad(T T, max t T T t log T T max log(d x. Remark 3.5 Note that as the predctors that we are cosderg are lear,.e., x R d w x, the costrat w 1 1 the defto of A 1 s wthout loss of geeralty. I fact, f w 1 c for a gve costat c 0, the we ca rescale w x = ( w w 1 ( w 1 x ad we have the equvalece {x R d w x : w R d, w 1 c} = {x R d w (cx : w R d, w 1 1}. Proposto 3.4 stll apples, wth a costat c o the rght-had sde of the boud. 3.4 Lear predctors smplex/l costrats (Boostg Proposto 3.6 Let d := {w R d : w 1 = 1, w 1,..., w d 0} ad let A := {x R d w x : w d }. The, for ay x 1..., x R d, Rad(A {x 1,..., x } max x log d

4 3-4 Lecture 3: Rademacher Complexty. Examples Proof: We have Rad(A {x 1,..., x } = E Note that for ay vector v = (v 1,..., v d R d we have The, Ω w x = E w ( Ω x. w v = max v j. E w ( Ω x = E max Ω x,j = Rad(T, wth T = {t 1..., t d } wth t j = (x 1,j,..., x,j for ay j {1,..., d}. The proof follows by Massart s lemma as log T Rad(T max t log d max x. t T 3.5 Feed-forward eural etworks Let us defe a feed-forward eural etworks wth actvato fuctos appled elemet-wse to ts uts. A layer l (k : R d k 1 R d k cossts of a coordate-wse composto of a actvato fucto σ (k : R R ad a affe map, amely, l (k (x := σ (k (W (k x + b (k, for a gve teracto matrx W (k ad bas vector b (k. A eural etwork wth depth p (ad p 1 hdde layers s gve by the fucto f p : R d R defed as f (p (x := l (p l (1 (x l (p ( l ( (l (1 (x, wth d 0 = d, d p = 1, σ (r = σ for a gve fucto σ for all r < p, ad σ (p (x = x (.e., the last layer s smply a affe map. The actvato fucto σ s kow to the desg maker, whle the teracto matrces ad the bas vectors are treated as parameters to tue. For stace, a class of eural etworks wth depth p s gve by A (p := {x R d f (p (x : W (k ω, b (k β k}, (3.1 where for a gve matrx M, the l orm s defed as M := max l M j. The Rademacher complexty of a feed-forward eural etwork ca be bouded recursvely by cosderg each layer at a tme. A boud that ca be used for the recurso s gve by the followg proposto, that expresses the Rademacher complextes at the outputs of oe layer terms of the outputs at the prevous layers. Proposto 3.7 Let L be a class of fuctos from R d to R that cludes the zero fucto. Let σ : R R be α-lpschtz ad defe L := {x R d σ( m j=1 w jl j (x + b R : b β, w 1 ω, l 1,..., l m L}. The, for ay x 1,..., x R d, β Rad(L {x 1,..., x } α( + ω Rad(L {x 1,..., x } (3.

5 Lecture 3: Rademacher Complexty. Examples 3-5 If L = L, the β Rad(L {x 1,..., x } α( + ω Rad(L {x 1,..., x } (3.3 Proof: We gve a proof that makes use of may of the property of the Rademacher complexty descrbed the prevous lecture. Let F : = {x R d m w j l j (x R : w 1 ω, l 1,..., l m L}, G : = {x R d b R : b β}. By the cotracto property ad the summato property of Rademacher complextes, we have ( Rad(L {x 1,..., x } α Rad(F {x 1,..., x } + Rad(G {x 1,..., x }. O the oe had, as L cotas the zero fucto we have F {x 1,..., x } = ω cov(l L, where L L = {l l : l L, l L }. I fact, frst of all ote that where Rad(F {x 1,..., x } = Rad(F {x 1,..., x } F := {x R d m w j l j (x R : w 1 = ω, l 1,..., l m L} (ths s because the maxmum over w 1 ω s acheved for the values satsfyg w 1 = ω. The, ote that for ay w R m such that w 1 = 1 we have w l = w (l 0 + w (0 l, :w 0 :w <0 where 0 represets the zero fucto. The rght-had sde s a covex combato of elemets L L. Hece, by the covex hall property of Rademacher complexty we fd Rad(F {x 1,..., x } = ω Rad(cov(L L {x 1,..., x } = ω Rad((L L {x 1,..., x } = ω Rad(L {x 1,..., x } + ω Rad( L {x 1,..., x } = ω Rad(L {x 1,..., x }, where the factor s ot ecessary f L = L. O the other had, Rad(G {x 1,..., x } = E b: b β b Ω E b Ω = β E Ω β, b: b β where the last equalty follows by Jese s equalty, as E Ω E[( Ω ] = usg the depedece of the Ω s ad that Ω = 1. We are ow ready to gve a boud for the full eural etwork. We ca use Proposto 3.7 to ru the recurso, otcg that the last layer volves a lear fucto (whch s 1-Lpschtz. The frst layer requres a dfferet treatmet, ad we ca use Proposto 3.4. Usg Proposto 3.7 we ca establsh the followg boud for the Rademacher complexty of a layered eural etwork.

6 3-6 Lecture 3: Rademacher Complexty. Examples Proposto 3.8 Let σ be λ-lpschtz. Let A (p be defed as 3.1. The, for ay x 1..., x R d, Rad(A (p {x 1,..., x } 1 p 3 (β + ωβλ (ωλ k + ω(ωλ p max x log(d If λ = 1 ad σ s at-symmetrc, amely, σ(x = σ( x, we have Rad(A (p {x 1,..., x } 1 k=0 p (β k=0 ω k + ω p 1 max x log(d Proof: As the last layer of the eural etwork s lear,.e., σ (p (x = x, we ca apply Proposto 3.7 wth α = 1 (as σ (p s 1-Lpschtz oce ad the apply (3. Proposto 3.7 wth α = λ for p tmes. We fd Rad(A (p {x 1,..., x } β ( βλ + ω p 3 (ωλ k + (ωλ p Rad(A 1 {x 1,..., x }. k=0 The result of the frst equalty follows by Proposto 3.4. The secod equalty ca be proved usg the same strategy, usg (3.3 stead of (3..

Feature Selection: Part 2. 1 Greedy Algorithms (continued from the last lecture)

Feature Selection: Part 2. 1 Greedy Algorithms (continued from the last lecture) CSE 546: Mache Learg Lecture 6 Feature Selecto: Part 2 Istructor: Sham Kakade Greedy Algorthms (cotued from the last lecture) There are varety of greedy algorthms ad umerous amg covetos for these algorthms.

More information

18.413: Error Correcting Codes Lab March 2, Lecture 8

18.413: Error Correcting Codes Lab March 2, Lecture 8 18.413: Error Correctg Codes Lab March 2, 2004 Lecturer: Dael A. Spelma Lecture 8 8.1 Vector Spaces A set C {0, 1} s a vector space f for x all C ad y C, x + y C, where we take addto to be compoet wse

More information

Dimensionality Reduction and Learning

Dimensionality Reduction and Learning CMSC 35900 (Sprg 009) Large Scale Learg Lecture: 3 Dmesoalty Reducto ad Learg Istructors: Sham Kakade ad Greg Shakharovch L Supervsed Methods ad Dmesoalty Reducto The theme of these two lectures s that

More information

Investigation of Partially Conditional RP Model with Response Error. Ed Stanek

Investigation of Partially Conditional RP Model with Response Error. Ed Stanek Partally Codtoal Radom Permutato Model 7- vestgato of Partally Codtoal RP Model wth Respose Error TRODUCTO Ed Staek We explore the predctor that wll result a smple radom sample wth respose error whe a

More information

Lecture 9: Tolerant Testing

Lecture 9: Tolerant Testing Lecture 9: Tolerat Testg Dael Kae Scrbe: Sakeerth Rao Aprl 4, 07 Abstract I ths lecture we prove a quas lear lower boud o the umber of samples eeded to do tolerat testg for L dstace. Tolerat Testg We have

More information

9 U-STATISTICS. Eh =(m!) 1 Eh(X (1),..., X (m ) ) i.i.d

9 U-STATISTICS. Eh =(m!) 1 Eh(X (1),..., X (m ) ) i.i.d 9 U-STATISTICS Suppose,,..., are P P..d. wth CDF F. Our goal s to estmate the expectato t (P)=Eh(,,..., m ). Note that ths expectato requres more tha oe cotrast to E, E, or Eh( ). Oe example s E or P((,

More information

Lecture 16: Backpropogation Algorithm Neural Networks with smooth activation functions

Lecture 16: Backpropogation Algorithm Neural Networks with smooth activation functions CO-511: Learg Theory prg 2017 Lecturer: Ro Lv Lecture 16: Bacpropogato Algorthm Dsclamer: These otes have ot bee subected to the usual scruty reserved for formal publcatos. They may be dstrbuted outsde

More information

Lecture 3 Probability review (cont d)

Lecture 3 Probability review (cont d) STATS 00: Itroducto to Statstcal Iferece Autum 06 Lecture 3 Probablty revew (cot d) 3. Jot dstrbutos If radom varables X,..., X k are depedet, the ther dstrbuto may be specfed by specfyg the dvdual dstrbuto

More information

1 Onto functions and bijections Applications to Counting

1 Onto functions and bijections Applications to Counting 1 Oto fuctos ad bectos Applcatos to Coutg Now we move o to a ew topc. Defto 1.1 (Surecto. A fucto f : A B s sad to be surectve or oto f for each b B there s some a A so that f(a B. What are examples of

More information

1 Review and Overview

1 Review and Overview CS9T/STATS3: Statstcal Learg Teory Lecturer: Tegyu Ma Lecture #7 Scrbe: Bra Zag October 5, 08 Revew ad Overvew We wll frst gve a bref revew of wat as bee covered so far I te frst few lectures, we stated

More information

CIS 800/002 The Algorithmic Foundations of Data Privacy October 13, Lecture 9. Database Update Algorithms: Multiplicative Weights

CIS 800/002 The Algorithmic Foundations of Data Privacy October 13, Lecture 9. Database Update Algorithms: Multiplicative Weights CIS 800/002 The Algorthmc Foudatos of Data Prvacy October 13, 2011 Lecturer: Aaro Roth Lecture 9 Scrbe: Aaro Roth Database Update Algorthms: Multplcatve Weghts We ll recall aga) some deftos from last tme:

More information

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model Lecture 7. Cofdece Itervals ad Hypothess Tests the Smple CLR Model I lecture 6 we troduced the Classcal Lear Regresso (CLR) model that s the radom expermet of whch the data Y,,, K, are the outcomes. The

More information

Discrete Mathematics and Probability Theory Fall 2016 Seshia and Walrand DIS 10b

Discrete Mathematics and Probability Theory Fall 2016 Seshia and Walrand DIS 10b CS 70 Dscrete Mathematcs ad Probablty Theory Fall 206 Sesha ad Walrad DIS 0b. Wll I Get My Package? Seaky delvery guy of some compay s out delverg packages to customers. Not oly does he had a radom package

More information

Strong Convergence of Weighted Averaged Approximants of Asymptotically Nonexpansive Mappings in Banach Spaces without Uniform Convexity

Strong Convergence of Weighted Averaged Approximants of Asymptotically Nonexpansive Mappings in Banach Spaces without Uniform Convexity BULLETIN of the MALAYSIAN MATHEMATICAL SCIENCES SOCIETY Bull. Malays. Math. Sc. Soc. () 7 (004), 5 35 Strog Covergece of Weghted Averaged Appromats of Asymptotcally Noepasve Mappgs Baach Spaces wthout

More information

Chapter 4 Multiple Random Variables

Chapter 4 Multiple Random Variables Revew for the prevous lecture: Theorems ad Examples: How to obta the pmf (pdf) of U = g (, Y) ad V = g (, Y) Chapter 4 Multple Radom Varables Chapter 44 Herarchcal Models ad Mxture Dstrbutos Examples:

More information

13. Parametric and Non-Parametric Uncertainties, Radial Basis Functions and Neural Network Approximations

13. Parametric and Non-Parametric Uncertainties, Radial Basis Functions and Neural Network Approximations Lecture 7 3. Parametrc ad No-Parametrc Ucertates, Radal Bass Fuctos ad Neural Network Approxmatos he parameter estmato algorthms descrbed prevous sectos were based o the assumpto that the system ucertates

More information

Bounds on the expected entropy and KL-divergence of sampled multinomial distributions. Brandon C. Roy

Bounds on the expected entropy and KL-divergence of sampled multinomial distributions. Brandon C. Roy Bouds o the expected etropy ad KL-dvergece of sampled multomal dstrbutos Brado C. Roy bcroy@meda.mt.edu Orgal: May 18, 2011 Revsed: Jue 6, 2011 Abstract Iformato theoretc quattes calculated from a sampled

More information

4 Inner Product Spaces

4 Inner Product Spaces 11.MH1 LINEAR ALGEBRA Summary Notes 4 Ier Product Spaces Ier product s the abstracto to geeral vector spaces of the famlar dea of the scalar product of two vectors or 3. I what follows, keep these key

More information

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS Postpoed exam: ECON430 Statstcs Date of exam: Jauary 0, 0 Tme for exam: 09:00 a.m. :00 oo The problem set covers 5 pages Resources allowed: All wrtte ad prted

More information

Lecture 4 Sep 9, 2015

Lecture 4 Sep 9, 2015 CS 388R: Radomzed Algorthms Fall 205 Prof. Erc Prce Lecture 4 Sep 9, 205 Scrbe: Xagru Huag & Chad Voegele Overvew I prevous lectures, we troduced some basc probablty, the Cheroff boud, the coupo collector

More information

Homework 1: Solutions Sid Banerjee Problem 1: (Practice with Asymptotic Notation) ORIE 4520: Stochastics at Scale Fall 2015

Homework 1: Solutions Sid Banerjee Problem 1: (Practice with Asymptotic Notation) ORIE 4520: Stochastics at Scale Fall 2015 Fall 05 Homework : Solutos Problem : (Practce wth Asymptotc Notato) A essetal requremet for uderstadg scalg behavor s comfort wth asymptotc (or bg-o ) otato. I ths problem, you wll prove some basc facts

More information

means the first term, a2 means the term, etc. Infinite Sequences: follow the same pattern forever.

means the first term, a2 means the term, etc. Infinite Sequences: follow the same pattern forever. 9.4 Sequeces ad Seres Pre Calculus 9.4 SEQUENCES AND SERIES Learg Targets:. Wrte the terms of a explctly defed sequece.. Wrte the terms of a recursvely defed sequece. 3. Determe whether a sequece s arthmetc,

More information

PROJECTION PROBLEM FOR REGULAR POLYGONS

PROJECTION PROBLEM FOR REGULAR POLYGONS Joural of Mathematcal Sceces: Advaces ad Applcatos Volume, Number, 008, Pages 95-50 PROJECTION PROBLEM FOR REGULAR POLYGONS College of Scece Bejg Forestry Uversty Bejg 0008 P. R. Cha e-mal: sl@bjfu.edu.c

More information

Chapter 9 Jordan Block Matrices

Chapter 9 Jordan Block Matrices Chapter 9 Jorda Block atrces I ths chapter we wll solve the followg problem. Gve a lear operator T fd a bass R of F such that the matrx R (T) s as smple as possble. f course smple s a matter of taste.

More information

18.657: Mathematics of Machine Learning

18.657: Mathematics of Machine Learning 8.657: Mathematcs of Mache Learg Lecturer: Phlppe Rgollet Lecture 3 Scrbe: James Hrst Sep. 6, 205.5 Learg wth a fte dctoary Recall from the ed of last lecture our setup: We are workg wth a fte dctoary

More information

X X X E[ ] E X E X. is the ()m n where the ( i,)th. j element is the mean of the ( i,)th., then

X X X E[ ] E X E X. is the ()m n where the ( i,)th. j element is the mean of the ( i,)th., then Secto 5 Vectors of Radom Varables Whe workg wth several radom varables,,..., to arrage them vector form x, t s ofte coveet We ca the make use of matrx algebra to help us orgaze ad mapulate large umbers

More information

Functions of Random Variables

Functions of Random Variables Fuctos of Radom Varables Chapter Fve Fuctos of Radom Varables 5. Itroducto A geeral egeerg aalyss model s show Fg. 5.. The model output (respose) cotas the performaces of a system or product, such as weght,

More information

CS286.2 Lecture 4: Dinur s Proof of the PCP Theorem

CS286.2 Lecture 4: Dinur s Proof of the PCP Theorem CS86. Lecture 4: Dur s Proof of the PCP Theorem Scrbe: Thom Bohdaowcz Prevously, we have prove a weak verso of the PCP theorem: NP PCP 1,1/ (r = poly, q = O(1)). Wth ths result we have the desred costat

More information

Class 13,14 June 17, 19, 2015

Class 13,14 June 17, 19, 2015 Class 3,4 Jue 7, 9, 05 Pla for Class3,4:. Samplg dstrbuto of sample mea. The Cetral Lmt Theorem (CLT). Cofdece terval for ukow mea.. Samplg Dstrbuto for Sample mea. Methods used are based o CLT ( Cetral

More information

Lecture 07: Poles and Zeros

Lecture 07: Poles and Zeros Lecture 07: Poles ad Zeros Defto of poles ad zeros The trasfer fucto provdes a bass for determg mportat system respose characterstcs wthout solvg the complete dfferetal equato. As defed, the trasfer fucto

More information

2SLS Estimates ECON In this case, begin with the assumption that E[ i

2SLS Estimates ECON In this case, begin with the assumption that E[ i SLS Estmates ECON 3033 Bll Evas Fall 05 Two-Stage Least Squares (SLS Cosder a stadard lear bvarate regresso model y 0 x. I ths case, beg wth the assumto that E[ x] 0 whch meas that OLS estmates of wll

More information

Some Different Perspectives on Linear Least Squares

Some Different Perspectives on Linear Least Squares Soe Dfferet Perspectves o Lear Least Squares A stadard proble statstcs s to easure a respose or depedet varable, y, at fed values of oe or ore depedet varables. Soetes there ests a deterstc odel y f (,,

More information

ρ < 1 be five real numbers. The

ρ < 1 be five real numbers. The Lecture o BST 63: Statstcal Theory I Ku Zhag, /0/006 Revew for the prevous lecture Deftos: covarace, correlato Examples: How to calculate covarace ad correlato Theorems: propertes of correlato ad covarace

More information

Bayes (Naïve or not) Classifiers: Generative Approach

Bayes (Naïve or not) Classifiers: Generative Approach Logstc regresso Bayes (Naïve or ot) Classfers: Geeratve Approach What do we mea by Geeratve approach: Lear p(y), p(x y) ad the apply bayes rule to compute p(y x) for makg predctos Ths s essetally makg

More information

α1 α2 Simplex and Rectangle Elements Multi-index Notation of polynomials of degree Definition: The set P k will be the set of all functions:

α1 α2 Simplex and Rectangle Elements Multi-index Notation of polynomials of degree Definition: The set P k will be the set of all functions: Smplex ad Rectagle Elemets Mult-dex Notato = (,..., ), o-egatve tegers = = β = ( β,..., β ) the + β = ( + β,..., + β ) + x = x x x x = x x β β + D = D = D D x x x β β Defto: The set P of polyomals of degree

More information

MOLECULAR VIBRATIONS

MOLECULAR VIBRATIONS MOLECULAR VIBRATIONS Here we wsh to vestgate molecular vbratos ad draw a smlarty betwee the theory of molecular vbratos ad Hückel theory. 1. Smple Harmoc Oscllator Recall that the eergy of a oe-dmesoal

More information

AN UPPER BOUND FOR THE PERMANENT VERSUS DETERMINANT PROBLEM BRUNO GRENET

AN UPPER BOUND FOR THE PERMANENT VERSUS DETERMINANT PROBLEM BRUNO GRENET AN UPPER BOUND FOR THE PERMANENT VERSUS DETERMINANT PROBLEM BRUNO GRENET Abstract. The Permaet versus Determat problem s the followg: Gve a matrx X of determates over a feld of characterstc dfferet from

More information

ENGI 4421 Joint Probability Distributions Page Joint Probability Distributions [Navidi sections 2.5 and 2.6; Devore sections

ENGI 4421 Joint Probability Distributions Page Joint Probability Distributions [Navidi sections 2.5 and 2.6; Devore sections ENGI 441 Jot Probablty Dstrbutos Page 7-01 Jot Probablty Dstrbutos [Navd sectos.5 ad.6; Devore sectos 5.1-5.] The jot probablty mass fucto of two dscrete radom quattes, s, P ad p x y x y The margal probablty

More information

The number of observed cases The number of parameters. ith case of the dichotomous dependent variable. the ith case of the jth parameter

The number of observed cases The number of parameters. ith case of the dichotomous dependent variable. the ith case of the jth parameter LOGISTIC REGRESSION Notato Model Logstc regresso regresses a dchotomous depedet varable o a set of depedet varables. Several methods are mplemeted for selectg the depedet varables. The followg otato s

More information

ENGI 4421 Propagation of Error Page 8-01

ENGI 4421 Propagation of Error Page 8-01 ENGI 441 Propagato of Error Page 8-01 Propagato of Error [Navd Chapter 3; ot Devore] Ay realstc measuremet procedure cotas error. Ay calculatos based o that measuremet wll therefore also cota a error.

More information

Econometric Methods. Review of Estimation

Econometric Methods. Review of Estimation Ecoometrc Methods Revew of Estmato Estmatg the populato mea Radom samplg Pot ad terval estmators Lear estmators Ubased estmators Lear Ubased Estmators (LUEs) Effcecy (mmum varace) ad Best Lear Ubased Estmators

More information

The Arithmetic-Geometric mean inequality in an external formula. Yuki Seo. October 23, 2012

The Arithmetic-Geometric mean inequality in an external formula. Yuki Seo. October 23, 2012 Sc. Math. Japocae Vol. 00, No. 0 0000, 000 000 1 The Arthmetc-Geometrc mea equalty a exteral formula Yuk Seo October 23, 2012 Abstract. The classcal Jese equalty ad ts reverse are dscussed by meas of terally

More information

A tighter lower bound on the circuit size of the hardest Boolean functions

A tighter lower bound on the circuit size of the hardest Boolean functions Electroc Colloquum o Computatoal Complexty, Report No. 86 2011) A tghter lower boud o the crcut sze of the hardest Boolea fuctos Masak Yamamoto Abstract I [IPL2005], Fradse ad Mlterse mproved bouds o the

More information

Algorithms Theory, Solution for Assignment 2

Algorithms Theory, Solution for Assignment 2 Juor-Prof. Dr. Robert Elsässer, Marco Muñz, Phllp Hedegger WS 2009/200 Algorthms Theory, Soluto for Assgmet 2 http://lak.formatk.u-freburg.de/lak_teachg/ws09_0/algo090.php Exercse 2. - Fast Fourer Trasform

More information

Introduction to local (nonparametric) density estimation. methods

Introduction to local (nonparametric) density estimation. methods Itroducto to local (oparametrc) desty estmato methods A slecture by Yu Lu for ECE 66 Sprg 014 1. Itroducto Ths slecture troduces two local desty estmato methods whch are Parze desty estmato ad k-earest

More information

13. Artificial Neural Networks for Function Approximation

13. Artificial Neural Networks for Function Approximation Lecture 7 3. Artfcal eural etworks for Fucto Approxmato Motvato. A typcal cotrol desg process starts wth modelg, whch s bascally the process of costructg a mathematcal descrpto (such as a set of ODE-s)

More information

8.1 Hashing Algorithms

8.1 Hashing Algorithms CS787: Advaced Algorthms Scrbe: Mayak Maheshwar, Chrs Hrchs Lecturer: Shuch Chawla Topc: Hashg ad NP-Completeess Date: September 21 2007 Prevously we looked at applcatos of radomzed algorthms, ad bega

More information

An Introduction to. Support Vector Machine

An Introduction to. Support Vector Machine A Itroducto to Support Vector Mache Support Vector Mache (SVM) A classfer derved from statstcal learg theory by Vapk, et al. 99 SVM became famous whe, usg mages as put, t gave accuracy comparable to eural-etwork

More information

Point Estimation: definition of estimators

Point Estimation: definition of estimators Pot Estmato: defto of estmators Pot estmator: ay fucto W (X,..., X ) of a data sample. The exercse of pot estmato s to use partcular fuctos of the data order to estmate certa ukow populato parameters.

More information

Part 4b Asymptotic Results for MRR2 using PRESS. Recall that the PRESS statistic is a special type of cross validation procedure (see Allen (1971))

Part 4b Asymptotic Results for MRR2 using PRESS. Recall that the PRESS statistic is a special type of cross validation procedure (see Allen (1971)) art 4b Asymptotc Results for MRR usg RESS Recall that the RESS statstc s a specal type of cross valdato procedure (see Alle (97)) partcular to the regresso problem ad volves fdg Y $,, the estmate at the

More information

Introduction to Matrices and Matrix Approach to Simple Linear Regression

Introduction to Matrices and Matrix Approach to Simple Linear Regression Itroducto to Matrces ad Matrx Approach to Smple Lear Regresso Matrces Defto: A matrx s a rectagular array of umbers or symbolc elemets I may applcatos, the rows of a matrx wll represet dvduals cases (people,

More information

COV. Violation of constant variance of ε i s but they are still independent. The error term (ε) is said to be heteroscedastic.

COV. Violation of constant variance of ε i s but they are still independent. The error term (ε) is said to be heteroscedastic. c Pogsa Porchawseskul, Faculty of Ecoomcs, Chulalogkor Uversty olato of costat varace of s but they are stll depedet. C,, he error term s sad to be heteroscedastc. c Pogsa Porchawseskul, Faculty of Ecoomcs,

More information

Ideal multigrades with trigonometric coefficients

Ideal multigrades with trigonometric coefficients Ideal multgrades wth trgoometrc coeffcets Zarathustra Brady December 13, 010 1 The problem A (, k) multgrade s defed as a par of dstct sets of tegers such that (a 1,..., a ; b 1,..., b ) a j = =1 for all

More information

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. x, where. = y - ˆ " 1

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. x, where. = y - ˆ  1 STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS Recall Assumpto E(Y x) η 0 + η x (lear codtoal mea fucto) Data (x, y ), (x 2, y 2 ),, (x, y ) Least squares estmator ˆ E (Y x) ˆ " 0 + ˆ " x, where ˆ

More information

Principal Components. Analysis. Basic Intuition. A Method of Self Organized Learning

Principal Components. Analysis. Basic Intuition. A Method of Self Organized Learning Prcpal Compoets Aalss A Method of Self Orgazed Learg Prcpal Compoets Aalss Stadard techque for data reducto statstcal patter matchg ad sgal processg Usupervsed learg: lear from examples wthout a teacher

More information

CS 2750 Machine Learning. Lecture 8. Linear regression. CS 2750 Machine Learning. Linear regression. is a linear combination of input components x

CS 2750 Machine Learning. Lecture 8. Linear regression. CS 2750 Machine Learning. Linear regression. is a linear combination of input components x CS 75 Mache Learg Lecture 8 Lear regresso Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 75 Mache Learg Lear regresso Fucto f : X Y s a lear combato of put compoets f + + + K d d K k - parameters

More information

3. Basic Concepts: Consequences and Properties

3. Basic Concepts: Consequences and Properties : 3. Basc Cocepts: Cosequeces ad Propertes Markku Jutt Overvew More advaced cosequeces ad propertes of the basc cocepts troduced the prevous lecture are derved. Source The materal s maly based o Sectos.6.8

More information

Department of Agricultural Economics. PhD Qualifier Examination. August 2011

Department of Agricultural Economics. PhD Qualifier Examination. August 2011 Departmet of Agrcultural Ecoomcs PhD Qualfer Examato August 0 Istructos: The exam cossts of sx questos You must aswer all questos If you eed a assumpto to complete a questo, state the assumpto clearly

More information

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution:

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution: Chapter 4 Exercses Samplg Theory Exercse (Smple radom samplg: Let there be two correlated radom varables X ad A sample of sze s draw from a populato by smple radom samplg wthout replacemet The observed

More information

ESS Line Fitting

ESS Line Fitting ESS 5 014 17. Le Fttg A very commo problem data aalyss s lookg for relatoshpetwee dfferet parameters ad fttg les or surfaces to data. The smplest example s fttg a straght le ad we wll dscuss that here

More information

1. A real number x is represented approximately by , and we are told that the relative error is 0.1 %. What is x? Note: There are two answers.

1. A real number x is represented approximately by , and we are told that the relative error is 0.1 %. What is x? Note: There are two answers. PROBLEMS A real umber s represeted appromately by 63, ad we are told that the relatve error s % What s? Note: There are two aswers Ht : Recall that % relatve error s What s the relatve error volved roudg

More information

6.867 Machine Learning

6.867 Machine Learning 6.867 Mache Learg Problem set Due Frday, September 9, rectato Please address all questos ad commets about ths problem set to 6.867-staff@a.mt.edu. You do ot eed to use MATLAB for ths problem set though

More information

= lim. (x 1 x 2... x n ) 1 n. = log. x i. = M, n

= lim. (x 1 x 2... x n ) 1 n. = log. x i. = M, n .. Soluto of Problem. M s obvously cotuous o ], [ ad ], [. Observe that M x,..., x ) M x,..., x ) )..) We ext show that M s odecreasg o ], [. Of course.) mles that M s odecreasg o ], [ as well. To show

More information

Supervised learning: Linear regression Logistic regression

Supervised learning: Linear regression Logistic regression CS 57 Itroducto to AI Lecture 4 Supervsed learg: Lear regresso Logstc regresso Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 57 Itro to AI Data: D { D D.. D D Supervsed learg d a set of eamples s

More information

Chapter 14 Logistic Regression Models

Chapter 14 Logistic Regression Models Chapter 4 Logstc Regresso Models I the lear regresso model X β + ε, there are two types of varables explaatory varables X, X,, X k ad study varable y These varables ca be measured o a cotuous scale as

More information

Lecture 12: Multilayer perceptrons II

Lecture 12: Multilayer perceptrons II Lecture : Multlayer perceptros II Bayes dscrmats ad MLPs he role of hdde uts A eample Itroducto to Patter Recoto Rcardo Guterrez-Osua Wrht State Uversty Bayes dscrmats ad MLPs ( As we have see throuhout

More information

ONE GENERALIZED INEQUALITY FOR CONVEX FUNCTIONS ON THE TRIANGLE

ONE GENERALIZED INEQUALITY FOR CONVEX FUNCTIONS ON THE TRIANGLE Joural of Pure ad Appled Mathematcs: Advaces ad Applcatos Volume 4 Number 205 Pages 77-87 Avalable at http://scetfcadvaces.co. DOI: http://.do.org/0.8642/jpamaa_7002534 ONE GENERALIZED INEQUALITY FOR CONVEX

More information

Lecture 02: Bounding tail distributions of a random variable

Lecture 02: Bounding tail distributions of a random variable CSCI-B609: A Theorst s Toolkt, Fall 206 Aug 25 Lecture 02: Boudg tal dstrbutos of a radom varable Lecturer: Yua Zhou Scrbe: Yua Xe & Yua Zhou Let us cosder the ubased co flps aga. I.e. let the outcome

More information

arxiv:math/ v1 [math.gm] 8 Dec 2005

arxiv:math/ v1 [math.gm] 8 Dec 2005 arxv:math/05272v [math.gm] 8 Dec 2005 A GENERALIZATION OF AN INEQUALITY FROM IMO 2005 NIKOLAI NIKOLOV The preset paper was spred by the thrd problem from the IMO 2005. A specal award was gve to Yure Boreko

More information

CSE 5526: Introduction to Neural Networks Linear Regression

CSE 5526: Introduction to Neural Networks Linear Regression CSE 556: Itroducto to Neural Netorks Lear Regresso Part II 1 Problem statemet Part II Problem statemet Part II 3 Lear regresso th oe varable Gve a set of N pars of data , appromate d by a lear fucto

More information

Generalized Linear Regression with Regularization

Generalized Linear Regression with Regularization Geeralze Lear Regresso wth Regularzato Zoya Bylsk March 3, 05 BASIC REGRESSION PROBLEM Note: I the followg otes I wll make explct what s a vector a what s a scalar usg vec t or otato, to avo cofuso betwee

More information

Transforms that are commonly used are separable

Transforms that are commonly used are separable Trasforms s Trasforms that are commoly used are separable Eamples: Two-dmesoal DFT DCT DST adamard We ca the use -D trasforms computg the D separable trasforms: Take -D trasform of the rows > rows ( )

More information

CHAPTER 4 RADICAL EXPRESSIONS

CHAPTER 4 RADICAL EXPRESSIONS 6 CHAPTER RADICAL EXPRESSIONS. The th Root of a Real Number A real umber a s called the th root of a real umber b f Thus, for example: s a square root of sce. s also a square root of sce ( ). s a cube

More information

Non-uniform Turán-type problems

Non-uniform Turán-type problems Joural of Combatoral Theory, Seres A 111 2005 106 110 wwwelsevercomlocatecta No-uform Turá-type problems DhruvMubay 1, Y Zhao 2 Departmet of Mathematcs, Statstcs, ad Computer Scece, Uversty of Illos at

More information

For combinatorial problems we might need to generate all permutations, combinations, or subsets of a set.

For combinatorial problems we might need to generate all permutations, combinations, or subsets of a set. Addtoal Decrease ad Coquer Algorthms For combatoral problems we mght eed to geerate all permutatos, combatos, or subsets of a set. Geeratg Permutatos If we have a set f elemets: { a 1, a 2, a 3, a } the

More information

Ordinary Least Squares Regression. Simple Regression. Algebra and Assumptions.

Ordinary Least Squares Regression. Simple Regression. Algebra and Assumptions. Ordary Least Squares egresso. Smple egresso. Algebra ad Assumptos. I ths part of the course we are gog to study a techque for aalysg the lear relatoshp betwee two varables Y ad X. We have pars of observatos

More information

Summary of the lecture in Biostatistics

Summary of the lecture in Biostatistics Summary of the lecture Bostatstcs Probablty Desty Fucto For a cotuos radom varable, a probablty desty fucto s a fucto such that: 0 dx a b) b a dx A probablty desty fucto provdes a smple descrpto of the

More information

Chapter 4 (Part 1): Non-Parametric Classification (Sections ) Pattern Classification 4.3) Announcements

Chapter 4 (Part 1): Non-Parametric Classification (Sections ) Pattern Classification 4.3) Announcements Aoucemets No-Parametrc Desty Estmato Techques HW assged Most of ths lecture was o the blacboard. These sldes cover the same materal as preseted DHS Bometrcs CSE 90-a Lecture 7 CSE90a Fall 06 CSE90a Fall

More information

Multivariate Transformation of Variables and Maximum Likelihood Estimation

Multivariate Transformation of Variables and Maximum Likelihood Estimation Marquette Uversty Multvarate Trasformato of Varables ad Maxmum Lkelhood Estmato Dael B. Rowe, Ph.D. Assocate Professor Departmet of Mathematcs, Statstcs, ad Computer Scece Copyrght 03 by Marquette Uversty

More information

Special Instructions / Useful Data

Special Instructions / Useful Data JAM 6 Set of all real umbers P A..d. B, p Posso Specal Istructos / Useful Data x,, :,,, x x Probablty of a evet A Idepedetly ad detcally dstrbuted Bomal dstrbuto wth parameters ad p Posso dstrbuto wth

More information

Lattices. Mathematical background

Lattices. Mathematical background Lattces Mathematcal backgroud Lattces : -dmesoal Eucldea space. That s, { T x } x x = (,, ) :,. T T If x= ( x,, x), y = ( y,, y), the xy, = xy (er product of xad y) x = /2 xx, (Eucldea legth or orm of

More information

Logistic regression (continued)

Logistic regression (continued) STAT562 page 138 Logstc regresso (cotued) Suppose we ow cosder more complex models to descrbe the relatoshp betwee a categorcal respose varable (Y) that takes o two (2) possble outcomes ad a set of p explaatory

More information

A unified matrix representation for degree reduction of Bézier curves

A unified matrix representation for degree reduction of Bézier curves Computer Aded Geometrc Desg 21 2004 151 164 wwwelsevercom/locate/cagd A ufed matrx represetato for degree reducto of Bézer curves Hask Suwoo a,,1, Namyog Lee b a Departmet of Mathematcs, Kokuk Uversty,

More information

The Occupancy and Coupon Collector problems

The Occupancy and Coupon Collector problems Chapter 4 The Occupacy ad Coupo Collector problems By Sarel Har-Peled, Jauary 9, 08 4 Prelmares [ Defto 4 Varace ad Stadard Devato For a radom varable X, let V E [ X [ µ X deote the varace of X, where

More information

1 Solution to Problem 6.40

1 Solution to Problem 6.40 1 Soluto to Problem 6.40 (a We wll wrte T τ (X 1,...,X where the X s are..d. wth PDF f(x µ, σ 1 ( x µ σ g, σ where the locato parameter µ s ay real umber ad the scale parameter σ s > 0. Lettg Z X µ σ we

More information

D KL (P Q) := p i ln p i q i

D KL (P Q) := p i ln p i q i Cheroff-Bouds 1 The Geeral Boud Let P 1,, m ) ad Q q 1,, q m ) be two dstrbutos o m elemets, e,, q 0, for 1,, m, ad m 1 m 1 q 1 The Kullback-Lebler dvergece or relatve etroy of P ad Q s defed as m D KL

More information

Chapter 5 Properties of a Random Sample

Chapter 5 Properties of a Random Sample Lecture 6 o BST 63: Statstcal Theory I Ku Zhag, /0/008 Revew for the prevous lecture Cocepts: t-dstrbuto, F-dstrbuto Theorems: Dstrbutos of sample mea ad sample varace, relatoshp betwee sample mea ad sample

More information

THE ROYAL STATISTICAL SOCIETY 2016 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5

THE ROYAL STATISTICAL SOCIETY 2016 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5 THE ROYAL STATISTICAL SOCIETY 06 EAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5 The Socety s provdg these solutos to assst cadtes preparg for the examatos 07. The solutos are teded as learg ads ad should

More information

MA 524 Homework 6 Solutions

MA 524 Homework 6 Solutions MA 524 Homework 6 Solutos. Sce S(, s the umber of ways to partto [] to k oempty blocks, ad c(, s the umber of ways to partto to k oempty blocks ad also the arrage each block to a cycle, we must have S(,

More information

Multiple Choice Test. Chapter Adequacy of Models for Regression

Multiple Choice Test. Chapter Adequacy of Models for Regression Multple Choce Test Chapter 06.0 Adequac of Models for Regresso. For a lear regresso model to be cosdered adequate, the percetage of scaled resduals that eed to be the rage [-,] s greater tha or equal to

More information

The Selection Problem - Variable Size Decrease/Conquer (Practice with algorithm analysis)

The Selection Problem - Variable Size Decrease/Conquer (Practice with algorithm analysis) We have covered: Selecto, Iserto, Mergesort, Bubblesort, Heapsort Next: Selecto the Qucksort The Selecto Problem - Varable Sze Decrease/Coquer (Practce wth algorthm aalyss) Cosder the problem of fdg the

More information

General Method for Calculating Chemical Equilibrium Composition

General Method for Calculating Chemical Equilibrium Composition AE 6766/Setzma Sprg 004 Geeral Metod for Calculatg Cemcal Equlbrum Composto For gve tal codtos (e.g., for gve reactats, coose te speces to be cluded te products. As a example, for combusto of ydroge wt

More information

STK4011 and STK9011 Autumn 2016

STK4011 and STK9011 Autumn 2016 STK4 ad STK9 Autum 6 Pot estmato Covers (most of the followg materal from chapter 7: Secto 7.: pages 3-3 Secto 7..: pages 3-33 Secto 7..: pages 35-3 Secto 7..3: pages 34-35 Secto 7.3.: pages 33-33 Secto

More information

Lecture 1. (Part II) The number of ways of partitioning n distinct objects into k distinct groups containing n 1,

Lecture 1. (Part II) The number of ways of partitioning n distinct objects into k distinct groups containing n 1, Lecture (Part II) Materals Covered Ths Lecture: Chapter 2 (2.6 --- 2.0) The umber of ways of parttog dstct obects to dstct groups cotag, 2,, obects, respectvely, where each obect appears exactly oe group

More information

TESTS BASED ON MAXIMUM LIKELIHOOD

TESTS BASED ON MAXIMUM LIKELIHOOD ESE 5 Toy E. Smth. The Basc Example. TESTS BASED ON MAXIMUM LIKELIHOOD To llustrate the propertes of maxmum lkelhood estmates ad tests, we cosder the smplest possble case of estmatg the mea of the ormal

More information

Unit 9. The Tangent Bundle

Unit 9. The Tangent Bundle Ut 9. The Taget Budle ========================================================================================== ---------- The taget sace of a submafold of R, detfcato of taget vectors wth dervatos at

More information

arxiv: v1 [cs.lg] 22 Feb 2015

arxiv: v1 [cs.lg] 22 Feb 2015 SDCA wthout Dualty Sha Shalev-Shwartz arxv:50.0677v cs.lg Feb 05 Abstract Stochastc Dual Coordate Ascet s a popular method for solvg regularzed loss mmzato for the case of covex losses. I ths paper we

More information

7.0 Equality Contraints: Lagrange Multipliers

7.0 Equality Contraints: Lagrange Multipliers Systes Optzato 7.0 Equalty Cotrats: Lagrage Multplers Cosder the zato of a o-lear fucto subject to equalty costrats: g f() R ( ) 0 ( ) (7.) where the g ( ) are possbly also olear fuctos, ad < otherwse

More information

(b) By independence, the probability that the string 1011 is received correctly is

(b) By independence, the probability that the string 1011 is received correctly is Soluto to Problem 1.31. (a) Let A be the evet that a 0 s trasmtted. Usg the total probablty theorem, the desred probablty s P(A)(1 ɛ ( 0)+ 1 P(A) ) (1 ɛ 1)=p(1 ɛ 0)+(1 p)(1 ɛ 1). (b) By depedece, the probablty

More information

NP!= P. By Liu Ran. Table of Contents. The P versus NP problem is a major unsolved problem in computer

NP!= P. By Liu Ran. Table of Contents. The P versus NP problem is a major unsolved problem in computer NP!= P By Lu Ra Table of Cotets. Itroduce 2. Prelmary theorem 3. Proof 4. Expla 5. Cocluso. Itroduce The P versus NP problem s a major usolved problem computer scece. Iformally, t asks whether a computer

More information