Rademacher Complexity. Examples
Algorithmic Foundations of Learning — Lecture 3
Rademacher Complexity. Examples
Lecturer: Patrick Rebeschini — Version: October 16th

3.1 Introduction

In the last lecture we introduced the notion of Rademacher complexity and showed that it yields an upper bound on the expected value of the uniform (over the choice of actions/rules) deviation between the expected risk $r$ and the empirical risk $R$, namely,

$$\mathbf{E} \sup_{a \in \mathcal{A}} \{ r(a) - R(a) \} \le 2\, \mathbf{E}\, \mathrm{Rad}(\mathcal{L} \circ \{Z_1, \ldots, Z_n\}),$$

where we recall the notation

$$\mathcal{L} := \{ z \in \mathcal{Z} \to \ell(a, z) \in \mathbb{R} : a \in \mathcal{A} \}.$$

In this lecture we establish bounds for $\mathrm{Rad}(\mathcal{L} \circ \{z_1, \ldots, z_n\})$ for any $z_1, \ldots, z_n \in \mathcal{Z}$ in the setting of regression. In supervised learning, the observed examples correspond to pairs of points, i.e., $Z = (X, Y) \in \mathcal{X} \times \mathcal{Y}$. The point $X$ is called a feature or covariate, and the point $Y$ is its corresponding label. The set of admissible decisions is a subset of the set of functions from $\mathcal{X}$ to $\mathcal{Y}$, i.e., $\mathcal{A} \subseteq \mathcal{B} := \{ a : \mathcal{X} \to \mathcal{Y} \}$, and the loss function is of the form $\ell(a, (x, y)) = \varphi(a(x), y)$, for a function $\varphi : \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}_+$. The regression setting is represented by the choice $\mathcal{X} = \mathbb{R}^d$ for a given dimension $d$, and $\mathcal{Y} = \mathbb{R}$. We have $S = \{(X_1, Y_1), \ldots, (X_n, Y_n)\}$, and $s = \{(x_1, y_1), \ldots, (x_n, y_n)\}$ represents a realization of the training sample. Let us recall the following notation:

$$\mathcal{A} \circ \{x_1, \ldots, x_n\} := \{ (a(x_1), \ldots, a(x_n)) \in \mathcal{Y}^n : a \in \mathcal{A} \}.$$

Proposition 3.1 Let the function $\hat{y} \to \varphi(\hat{y}, y)$ be $\gamma$-Lipschitz for any $y \in \mathcal{Y}$. Then, for any $(x_1, y_1), \ldots, (x_n, y_n) \in \mathcal{X} \times \mathcal{Y}$,

$$\mathrm{Rad}(\mathcal{L} \circ \{(x_1, y_1), \ldots, (x_n, y_n)\}) \le \gamma\, \mathrm{Rad}(\mathcal{A} \circ \{x_1, \ldots, x_n\}).$$

Proof: By the contraction property of Rademacher complexity, Lemma 2.10, we get

$$\mathrm{Rad}(\mathcal{L} \circ s) = \mathrm{Rad}\big( (\varphi(\cdot, y_1), \ldots, \varphi(\cdot, y_n)) \circ \mathcal{A} \circ \{x_1, \ldots, x_n\} \big) \le \gamma\, \mathrm{Rad}(\mathcal{A} \circ \{x_1, \ldots, x_n\}).$$

Below we show how to control the quantity $\mathrm{Rad}(\mathcal{A} \circ \{x_1, \ldots, x_n\})$ for some function classes $\mathcal{A}$ of interest.

3.2 Linear predictors, $\ell_2/\ell_2$ constraints

In the case of $\ell_2/\ell_2$ constraints, the Rademacher complexity of linear predictors does not depend explicitly on the dimension $d$ (the dependence on $d$ is implicit, via the term $\max_i \|x_i\|_2$).
Proposition 3.2 Let $\mathcal{A}_2 := \{ x \in \mathbb{R}^d \to w \cdot x : w \in \mathbb{R}^d, \|w\|_2 \le 1 \}$. Then, for any $x_1, \ldots, x_n \in \mathbb{R}^d$,

$$\mathrm{Rad}(\mathcal{A}_2 \circ \{x_1, \ldots, x_n\}) \le \frac{\max_i \|x_i\|_2}{\sqrt{n}}.$$

Proof: We have

$$\mathrm{Rad}(\mathcal{A}_2 \circ \{x_1, \ldots, x_n\}) = \mathbf{E} \sup_{w \in \mathbb{R}^d : \|w\|_2 \le 1} \frac{1}{n} \sum_{i=1}^n \Omega_i\, w \cdot x_i = \mathbf{E} \sup_{w \in \mathbb{R}^d : \|w\|_2 \le 1} \frac{1}{n}\, w \cdot \Big( \sum_{i=1}^n \Omega_i x_i \Big)$$
$$\le \frac{1}{n}\, \mathbf{E} \Big\| \sum_{i=1}^n \Omega_i x_i \Big\|_2 \qquad \text{by Cauchy--Schwarz's inequality, } x \cdot y \le \|x\|_2 \|y\|_2$$
$$\le \frac{1}{n} \sqrt{ \mathbf{E} \Big\| \sum_{i=1}^n \Omega_i x_i \Big\|_2^2 } \qquad \text{by Jensen's inequality, as } x \to \sqrt{x} \text{ is concave}$$
$$= \frac{1}{n} \sqrt{ \sum_{j=1}^d \mathbf{E} \Big( \sum_{i=1}^n \Omega_i x_{i,j} \Big)^2 } = \frac{1}{n} \sqrt{ \sum_{j=1}^d \sum_{i=1}^n x_{i,j}^2 } \qquad \text{as the } \Omega_i\text{'s are independent and } \mathbf{E}\,\Omega_i = 0$$
$$= \frac{1}{n} \sqrt{ \sum_{i=1}^n \|x_i\|_2^2 } \le \frac{\max_i \|x_i\|_2}{\sqrt{n}} \qquad \text{as } \Omega_i^2 = 1.$$

Remark 3.3 Note that as the predictors that we are considering are linear, i.e., $x \in \mathbb{R}^d \to w \cdot x$, the constraint $\|w\|_2 \le 1$ in the definition of $\mathcal{A}_2$ in Proposition 3.2 is without loss of generality. In fact, if $\|w\|_2 \le c$ for a given constant $c \ge 0$, then we can rescale $w \cdot x = \big(\frac{w}{c}\big) \cdot (c x)$ and we have the equivalence

$$\{ x \in \mathbb{R}^d \to w \cdot x : w \in \mathbb{R}^d, \|w\|_2 \le c \} = \{ x \in \mathbb{R}^d \to w \cdot (cx) : w \in \mathbb{R}^d, \|w\|_2 \le 1 \}.$$

Proposition 3.2 still applies, with a constant $c$ on the right-hand side of the bound.
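The bound of Proposition 3.2 is easy to test numerically, since for the $\ell_2$ ball the supremum is attained in closed form: $\sup_{\|w\|_2 \le 1} w \cdot v = \|v\|_2$. The sketch below is not part of the original notes; the data, sample size, and number of Monte Carlo trials are arbitrary illustrative choices. It estimates the empirical Rademacher complexity by averaging over random sign vectors and checks it against $\max_i \|x_i\|_2 / \sqrt{n}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 20
x = rng.normal(size=(n, d))  # arbitrary data points x_1, ..., x_n

# For the l2 ball {w : ||w||_2 <= 1}, the supremum of (1/n) sum_i Omega_i w.x_i
# equals (1/n) ||sum_i Omega_i x_i||_2 (Cauchy-Schwarz, attained at w = v/||v||_2).
trials = 2000
signs = rng.choice([-1.0, 1.0], size=(trials, n))  # rows: sign vectors Omega
rad_hat = (np.linalg.norm(signs @ x, axis=1) / n).mean()

bound = np.linalg.norm(x, axis=1).max() / np.sqrt(n)
print(f"estimated Rad = {rad_hat:.4f} <= bound = {bound:.4f}")
```

The same Monte Carlo template applies to any class for which, for a fixed sign vector, the supremum over the class can be computed exactly.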
3.3 Linear predictors, $\ell_1/\ell_\infty$ constraints ($\ell_1$ Boosting)

In the case of $\ell_1/\ell_\infty$ constraints, the Rademacher complexity of linear predictors only depends logarithmically on the dimension $d$.

Proposition 3.4 Let $\mathcal{A}_1 := \{ x \in \mathbb{R}^d \to w \cdot x : w \in \mathbb{R}^d, \|w\|_1 \le 1 \}$. Then, for any $x_1, \ldots, x_n \in \mathbb{R}^d$,

$$\mathrm{Rad}(\mathcal{A}_1 \circ \{x_1, \ldots, x_n\}) \le \max_i \|x_i\|_\infty \sqrt{\frac{2 \log(2d)}{n}}.$$

Proof: We have

$$\mathrm{Rad}(\mathcal{A}_1 \circ \{x_1, \ldots, x_n\}) = \mathbf{E} \sup_{w \in \mathbb{R}^d : \|w\|_1 \le 1} \frac{1}{n} \sum_{i=1}^n \Omega_i\, w \cdot x_i = \mathbf{E} \sup_{w \in \mathbb{R}^d : \|w\|_1 \le 1} \frac{1}{n}\, w \cdot \Big( \sum_{i=1}^n \Omega_i x_i \Big) \le \frac{1}{n}\, \mathbf{E} \Big\| \sum_{i=1}^n \Omega_i x_i \Big\|_\infty$$

by Hölder's inequality, $x \cdot y \le \|x\|_1 \|y\|_\infty$. Let $t_j := (x_{1,j}, \ldots, x_{n,j}) \in \mathbb{R}^n$ for any $j \in \{1, \ldots, d\}$, and let $T = \{t_1, \ldots, t_d\}$. Then,

$$\Big\| \sum_{i=1}^n \Omega_i x_i \Big\|_\infty = \max_{j} \Big| \sum_{i=1}^n \Omega_i x_{i,j} \Big| = \max_{j} |\Omega \cdot t_j| = \max_{t \in T} |\Omega \cdot t|,$$

whose expectation looks like a Rademacher complexity apart from the absolute value (and the normalization by $1/n$). To remove the absolute value, note that for any $\omega_1, \ldots, \omega_n \in \{-1, 1\}$ we have $\max_{t \in T} |\omega \cdot t| = \max_{t \in T \cup (-T)} \omega \cdot t$, where we have defined $-T = \{-t_1, \ldots, -t_d\}$, with $-t_j = (-x_{1,j}, \ldots, -x_{n,j})$. Hence, we have

$$\mathrm{Rad}(\mathcal{A}_1 \circ \{x_1, \ldots, x_n\}) \le \mathrm{Rad}(T \cup (-T)),$$

and the proof follows by Massart's lemma, as

$$\mathrm{Rad}(T \cup (-T)) \le \max_{t \in T \cup (-T)} \|t\|_2\, \frac{\sqrt{2 \log |T \cup (-T)|}}{n} \le \max_i \|x_i\|_\infty \sqrt{\frac{2 \log(2d)}{n}}.$$

Remark 3.5 Note that as the predictors that we are considering are linear, i.e., $x \in \mathbb{R}^d \to w \cdot x$, the constraint $\|w\|_1 \le 1$ in the definition of $\mathcal{A}_1$ is without loss of generality. In fact, if $\|w\|_1 \le c$ for a given constant $c \ge 0$, then we can rescale $w \cdot x = \big(\frac{w}{c}\big) \cdot (c x)$ and we have the equivalence

$$\{ x \in \mathbb{R}^d \to w \cdot x : w \in \mathbb{R}^d, \|w\|_1 \le c \} = \{ x \in \mathbb{R}^d \to w \cdot (cx) : w \in \mathbb{R}^d, \|w\|_1 \le 1 \}.$$

Proposition 3.4 still applies, with a constant $c$ on the right-hand side of the bound.
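Analogously, for the $\ell_1$ ball the supremum is attained at a signed coordinate vector, $\sup_{\|w\|_1 \le 1} w \cdot v = \|v\|_\infty$, so the bound of Proposition 3.4 can also be checked by Monte Carlo. A minimal sketch, not from the notes, with data and sizes chosen arbitrarily and $d \gg n$ to illustrate the merely logarithmic dependence on the dimension:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 100, 500
x = rng.uniform(-1.0, 1.0, size=(n, d))  # entries in [-1, 1], so max_i ||x_i||_inf <= 1

# For the l1 ball, sup_{||w||_1 <= 1} (1/n) sum_i Omega_i w.x_i
# = (1/n) ||sum_i Omega_i x_i||_inf (Holder, attained at a signed basis vector).
trials = 2000
signs = rng.choice([-1.0, 1.0], size=(trials, n))
rad_hat = (np.abs(signs @ x).max(axis=1) / n).mean()

bound = np.abs(x).max() * np.sqrt(2 * np.log(2 * d) / n)
print(f"estimated Rad = {rad_hat:.4f} <= bound = {bound:.4f}")
```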
3.4 Linear predictors, simplex/$\ell_\infty$ constraints (Boosting)

Proposition 3.6 Let $\Delta_d := \{ w \in \mathbb{R}^d : \|w\|_1 = 1,\ w_1, \ldots, w_d \ge 0 \}$ and let $\mathcal{A}_\Delta := \{ x \in \mathbb{R}^d \to w \cdot x : w \in \Delta_d \}$. Then, for any $x_1, \ldots, x_n \in \mathbb{R}^d$,

$$\mathrm{Rad}(\mathcal{A}_\Delta \circ \{x_1, \ldots, x_n\}) \le \max_i \|x_i\|_\infty \sqrt{\frac{2 \log d}{n}}.$$

Proof: We have

$$\mathrm{Rad}(\mathcal{A}_\Delta \circ \{x_1, \ldots, x_n\}) = \mathbf{E} \sup_{w \in \Delta_d} \frac{1}{n} \sum_{i=1}^n \Omega_i\, w \cdot x_i = \mathbf{E} \sup_{w \in \Delta_d} \frac{1}{n}\, w \cdot \Big( \sum_{i=1}^n \Omega_i x_i \Big).$$

Note that for any vector $v = (v_1, \ldots, v_d) \in \mathbb{R}^d$ we have $\sup_{w \in \Delta_d} w \cdot v = \max_j v_j$. Then,

$$\mathbf{E} \sup_{w \in \Delta_d} \frac{1}{n}\, w \cdot \Big( \sum_{i=1}^n \Omega_i x_i \Big) = \mathbf{E}\, \frac{1}{n} \max_j \sum_{i=1}^n \Omega_i x_{i,j} = \mathrm{Rad}(T),$$

with $T = \{t_1, \ldots, t_d\}$, where $t_j = (x_{1,j}, \ldots, x_{n,j})$ for any $j \in \{1, \ldots, d\}$. The proof follows by Massart's lemma, as

$$\mathrm{Rad}(T) \le \max_{t \in T} \|t\|_2\, \frac{\sqrt{2 \log |T|}}{n} \le \max_i \|x_i\|_\infty \sqrt{\frac{2 \log d}{n}}.$$

3.5 Feed-forward neural networks

Let us define a feed-forward neural network with activation functions applied element-wise to its units. A layer $l^{(k)} : \mathbb{R}^{d_{k-1}} \to \mathbb{R}^{d_k}$ consists of a coordinate-wise composition of an activation function $\sigma^{(k)} : \mathbb{R} \to \mathbb{R}$ and an affine map, namely,

$$l^{(k)}(x) := \sigma^{(k)}(W^{(k)} x + b^{(k)}),$$

for a given interaction matrix $W^{(k)}$ and bias vector $b^{(k)}$. A neural network with depth $p$ (and $p-1$ hidden layers) is given by the function $f^{(p)} : \mathbb{R}^d \to \mathbb{R}$ defined as

$$f^{(p)}(x) := l^{(p)} \circ \cdots \circ l^{(1)}(x) = l^{(p)}\big( \cdots l^{(2)}(l^{(1)}(x)) \big),$$

with $d_0 = d$, $d_p = 1$, $\sigma^{(r)} = \sigma$ for a given function $\sigma$ for all $r < p$, and $\sigma^{(p)}(x) = x$ (i.e., the last layer is simply an affine map). The activation function $\sigma$ is known to the decision maker, while the interaction matrices and the bias vectors are treated as parameters to tune. For instance, a class of neural networks with depth $p$ is given by

$$\mathcal{A}^{(p)} := \{ x \in \mathbb{R}^d \to f^{(p)}(x) : \|W^{(k)}\|_\infty \le \omega,\ \|b^{(k)}\|_\infty \le \beta \ \forall k \}, \qquad (3.1)$$

where for a given matrix $M$, the $\ell_\infty$ norm is defined as $\|M\|_\infty := \max_i \sum_j |M_{i,j}|$.

The Rademacher complexity of a feed-forward neural network can be bounded recursively by considering each layer at a time. A bound that can be used for the recursion is given by the following proposition, which expresses the Rademacher complexity at the outputs of one layer in terms of the outputs at the previous layers.

Proposition 3.7 Let $\mathcal{L}$ be a class of functions from $\mathbb{R}^d$ to $\mathbb{R}$ that includes the zero function. Let $\sigma : \mathbb{R} \to \mathbb{R}$ be $\alpha$-Lipschitz and define

$$\mathcal{L}' := \Big\{ x \in \mathbb{R}^d \to \sigma\Big( \sum_{j=1}^m w_j l_j(x) + b \Big) \in \mathbb{R} : |b| \le \beta,\ \|w\|_1 \le \omega,\ l_1, \ldots, l_m \in \mathcal{L} \Big\}.$$

Then, for any $x_1, \ldots, x_n \in \mathbb{R}^d$,

$$\mathrm{Rad}(\mathcal{L}' \circ \{x_1, \ldots, x_n\}) \le \alpha \Big( \frac{\beta}{\sqrt{n}} + 2\omega\, \mathrm{Rad}(\mathcal{L} \circ \{x_1, \ldots, x_n\}) \Big). \qquad (3.2)$$
If $\mathcal{L} = -\mathcal{L}$, then

$$\mathrm{Rad}(\mathcal{L}' \circ \{x_1, \ldots, x_n\}) \le \alpha \Big( \frac{\beta}{\sqrt{n}} + \omega\, \mathrm{Rad}(\mathcal{L} \circ \{x_1, \ldots, x_n\}) \Big). \qquad (3.3)$$

Proof: We give a proof that makes use of many of the properties of the Rademacher complexity described in the previous lecture. Let

$$\mathcal{F} := \Big\{ x \in \mathbb{R}^d \to \sum_{j=1}^m w_j l_j(x) \in \mathbb{R} : \|w\|_1 \le \omega,\ l_1, \ldots, l_m \in \mathcal{L} \Big\}, \qquad \mathcal{G} := \{ x \in \mathbb{R}^d \to b \in \mathbb{R} : |b| \le \beta \}.$$

By the contraction property and the summation property of Rademacher complexities, we have

$$\mathrm{Rad}(\mathcal{L}' \circ \{x_1, \ldots, x_n\}) \le \alpha \big( \mathrm{Rad}(\mathcal{F} \circ \{x_1, \ldots, x_n\}) + \mathrm{Rad}(\mathcal{G} \circ \{x_1, \ldots, x_n\}) \big).$$

On the one hand, as $\mathcal{L}$ contains the zero function, we have $\mathcal{F} \circ \{x_1, \ldots, x_n\} = \omega\, \mathrm{conv}(\mathcal{L} \cup (-\mathcal{L})) \circ \{x_1, \ldots, x_n\}$, where $-\mathcal{L} = \{ -l : l \in \mathcal{L} \}$. In fact, first of all note that

$$\mathrm{Rad}(\mathcal{F} \circ \{x_1, \ldots, x_n\}) = \mathrm{Rad}(\bar{\mathcal{F}} \circ \{x_1, \ldots, x_n\}),$$

where

$$\bar{\mathcal{F}} := \Big\{ x \in \mathbb{R}^d \to \sum_{j=1}^m w_j l_j(x) \in \mathbb{R} : \|w\|_1 = \omega,\ l_1, \ldots, l_m \in \mathcal{L} \Big\}$$

(this is because the supremum over $\|w\|_1 \le \omega$ is achieved for values satisfying $\|w\|_1 = \omega$). Then, note that for any $w \in \mathbb{R}^m$ such that $\|w\|_1 = 1$ we have

$$\sum_j w_j l_j = \sum_{j : w_j \ge 0} |w_j| (l_j - 0) + \sum_{j : w_j < 0} |w_j| (0 - l_j),$$

where $0$ represents the zero function. The right-hand side is a convex combination of elements in $\mathcal{L} \cup (-\mathcal{L})$. Hence, by the convex hull property of Rademacher complexity we find

$$\mathrm{Rad}(\mathcal{F} \circ \{x_1, \ldots, x_n\}) = \omega\, \mathrm{Rad}(\mathrm{conv}(\mathcal{L} \cup (-\mathcal{L})) \circ \{x_1, \ldots, x_n\}) = \omega\, \mathrm{Rad}((\mathcal{L} \cup (-\mathcal{L})) \circ \{x_1, \ldots, x_n\})$$
$$\le \omega\, \mathrm{Rad}(\mathcal{L} \circ \{x_1, \ldots, x_n\}) + \omega\, \mathrm{Rad}((-\mathcal{L}) \circ \{x_1, \ldots, x_n\}) = 2\omega\, \mathrm{Rad}(\mathcal{L} \circ \{x_1, \ldots, x_n\}),$$

where the factor $2$ is not necessary if $\mathcal{L} = -\mathcal{L}$. On the other hand,

$$\mathrm{Rad}(\mathcal{G} \circ \{x_1, \ldots, x_n\}) = \mathbf{E} \sup_{b : |b| \le \beta} \frac{1}{n} \sum_{i=1}^n \Omega_i\, b \le \beta\, \mathbf{E} \Big| \frac{1}{n} \sum_{i=1}^n \Omega_i \Big| \le \frac{\beta}{\sqrt{n}},$$

where the last inequality follows by Jensen's inequality, as $\mathbf{E} |\sum_i \Omega_i| \le \sqrt{ \mathbf{E} [(\sum_i \Omega_i)^2] } = \sqrt{n}$, using the independence of the $\Omega_i$'s and that $\Omega_i^2 = 1$.

We are now ready to give a bound for the full neural network. We can use Proposition 3.7 to run the recursion, noticing that the last layer involves a linear function (which is 1-Lipschitz). The first layer requires a different treatment, and we can use Proposition 3.4. Using Proposition 3.7 we can establish the following bound for the Rademacher complexity of a layered neural network.
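The bound on the bias class used in the proof admits a quick numerical illustration (not in the original notes; the values of $n$, $\beta$ and the trial count are arbitrary). For $\mathcal{G}$ the supremum over $|b| \le \beta$ is attained at $b = \beta\, \mathrm{sign}(\sum_i \Omega_i)$, so $\mathrm{Rad}(\mathcal{G} \circ \{x_1, \ldots, x_n\}) = (\beta/n)\, \mathbf{E} |\sum_i \Omega_i|$, which the proof bounds by $\beta/\sqrt{n}$ via Jensen's inequality:

```python
import numpy as np

rng = np.random.default_rng(3)
n, beta = 64, 2.0
trials = 5000
signs = rng.choice([-1.0, 1.0], size=(trials, n))

# sup_{|b| <= beta} (1/n) sum_i Omega_i b = (beta/n) |sum_i Omega_i|
rad_G = beta * np.abs(signs.sum(axis=1)).mean() / n
print(f"Rad(G) estimate = {rad_G:.4f} <= beta/sqrt(n) = {beta / np.sqrt(n):.4f}")
```

The gap between the two values reflects the slack in Jensen's inequality: $\mathbf{E}|\sum_i \Omega_i|$ grows like $\sqrt{2n/\pi}$ rather than $\sqrt{n}$.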
Proposition 3.8 Let $\sigma$ be $\lambda$-Lipschitz. Let $\mathcal{A}^{(p)}$ be defined as in (3.1). Then, for any $x_1, \ldots, x_n \in \mathbb{R}^d$,

$$\mathrm{Rad}(\mathcal{A}^{(p)} \circ \{x_1, \ldots, x_n\}) \le \frac{\beta}{\sqrt{n}} \sum_{k=0}^{p-1} (2\omega\lambda)^k + \omega (2\omega\lambda)^{p-1} \max_i \|x_i\|_\infty \sqrt{\frac{2\log(2d)}{n}}.$$

If $\lambda = 1$ and $\sigma$ is anti-symmetric, namely, $\sigma(x) = -\sigma(-x)$, we have

$$\mathrm{Rad}(\mathcal{A}^{(p)} \circ \{x_1, \ldots, x_n\}) \le \frac{\beta}{\sqrt{n}} \sum_{k=0}^{p-1} \omega^k + \omega^p \max_i \|x_i\|_\infty \sqrt{\frac{2\log(2d)}{n}}.$$

Proof: As the last layer of the neural network is linear, i.e., $\sigma^{(p)}(x) = x$, we can apply Proposition 3.7 with $\alpha = 1$ (as $\sigma^{(p)}$ is 1-Lipschitz) once, and then apply (3.2) in Proposition 3.7 with $\alpha = \lambda$ for the $p-2$ hidden layers below it; the first layer is handled by (3.3) with base class $\mathcal{A}_1$, which contains the zero function and satisfies $\mathcal{A}_1 = -\mathcal{A}_1$. Unrolling the recursion, we find

$$\mathrm{Rad}(\mathcal{A}^{(p)} \circ \{x_1, \ldots, x_n\}) \le \frac{\beta}{\sqrt{n}} \sum_{k=0}^{p-1} (2\omega\lambda)^k + \omega (2\omega\lambda)^{p-1}\, \mathrm{Rad}(\mathcal{A}_1 \circ \{x_1, \ldots, x_n\}).$$

The first inequality then follows by Proposition 3.4. The second inequality can be proved using the same strategy, using (3.3) instead of (3.2) at every layer: with $\sigma$ anti-symmetric, each layer class satisfies $\mathcal{L} = -\mathcal{L}$, so the factors of $2$ disappear.
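The unrolling in the proof can be double-checked by iterating the one-layer recursion numerically and comparing against the closed form. The sketch below is not part of the original notes, and all constants are arbitrary illustrative choices; it applies (3.3) at the first layer, (3.2) with $\alpha = \lambda$ at the hidden layers, and $\alpha = 1$ at the output layer:

```python
import numpy as np

# Iterate the one-layer recursion of Proposition 3.7 and compare with the
# closed form stated in Proposition 3.8. All values are arbitrary choices.
n, d = 200, 50
beta, omega, lam = 0.5, 1.2, 0.8  # bias bound, l1 weight bound, Lipschitz constant
p = 4                             # depth (p - 1 hidden layers)
max_x_inf = 1.0                   # stands for max_i ||x_i||_inf

rad_A1 = max_x_inf * np.sqrt(2 * np.log(2 * d) / n)  # Proposition 3.4

# First layer via (3.3): the base class A_1 is symmetric and contains 0.
rad = lam * (beta / np.sqrt(n) + omega * rad_A1)
# Hidden layers 2, ..., p-1 via (3.2) with alpha = lam.
for _ in range(p - 2):
    rad = lam * (beta / np.sqrt(n) + 2 * omega * rad)
# Output layer: affine, so alpha = 1.
rad = beta / np.sqrt(n) + 2 * omega * rad

closed = (beta / np.sqrt(n)) * sum((2 * omega * lam) ** k for k in range(p)) \
    + omega * (2 * omega * lam) ** (p - 1) * rad_A1
print(f"recursion: {rad:.6f}, closed form: {closed:.6f}")
```

Note how the bound degrades geometrically in the depth whenever $2\omega\lambda > 1$, which is the main weakness of this layer-by-layer argument.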
More informationGeneral Method for Calculating Chemical Equilibrium Composition
AE 6766/Setzma Sprg 004 Geeral Metod for Calculatg Cemcal Equlbrum Composto For gve tal codtos (e.g., for gve reactats, coose te speces to be cluded te products. As a example, for combusto of ydroge wt
More informationSTK4011 and STK9011 Autumn 2016
STK4 ad STK9 Autum 6 Pot estmato Covers (most of the followg materal from chapter 7: Secto 7.: pages 3-3 Secto 7..: pages 3-33 Secto 7..: pages 35-3 Secto 7..3: pages 34-35 Secto 7.3.: pages 33-33 Secto
More informationLecture 1. (Part II) The number of ways of partitioning n distinct objects into k distinct groups containing n 1,
Lecture (Part II) Materals Covered Ths Lecture: Chapter 2 (2.6 --- 2.0) The umber of ways of parttog dstct obects to dstct groups cotag, 2,, obects, respectvely, where each obect appears exactly oe group
More informationTESTS BASED ON MAXIMUM LIKELIHOOD
ESE 5 Toy E. Smth. The Basc Example. TESTS BASED ON MAXIMUM LIKELIHOOD To llustrate the propertes of maxmum lkelhood estmates ad tests, we cosder the smplest possble case of estmatg the mea of the ormal
More informationUnit 9. The Tangent Bundle
Ut 9. The Taget Budle ========================================================================================== ---------- The taget sace of a submafold of R, detfcato of taget vectors wth dervatos at
More informationarxiv: v1 [cs.lg] 22 Feb 2015
SDCA wthout Dualty Sha Shalev-Shwartz arxv:50.0677v cs.lg Feb 05 Abstract Stochastc Dual Coordate Ascet s a popular method for solvg regularzed loss mmzato for the case of covex losses. I ths paper we
More information7.0 Equality Contraints: Lagrange Multipliers
Systes Optzato 7.0 Equalty Cotrats: Lagrage Multplers Cosder the zato of a o-lear fucto subject to equalty costrats: g f() R ( ) 0 ( ) (7.) where the g ( ) are possbly also olear fuctos, ad < otherwse
More information(b) By independence, the probability that the string 1011 is received correctly is
Soluto to Problem 1.31. (a) Let A be the evet that a 0 s trasmtted. Usg the total probablty theorem, the desred probablty s P(A)(1 ɛ ( 0)+ 1 P(A) ) (1 ɛ 1)=p(1 ɛ 0)+(1 p)(1 ɛ 1). (b) By depedece, the probablty
More informationNP!= P. By Liu Ran. Table of Contents. The P versus NP problem is a major unsolved problem in computer
NP!= P By Lu Ra Table of Cotets. Itroduce 2. Prelmary theorem 3. Proof 4. Expla 5. Cocluso. Itroduce The P versus NP problem s a major usolved problem computer scece. Iformally, t asks whether a computer
More information