Supplement to Graph Sparsification Approaches for Laplacian Smoothing
|
|
- Hugo Stone
- 5 years ago
- Views:
Transcription
1 Supplemet to Graph Sparsificatio Approaches for Laplacia Smoothig Veerajaeyulu Sadhaala Yu-Xiag Wag Rya J. Tibshirai Machie Learig Departmet Caregie Mello Uiversity Pittsburgh PA 53 Departmet of Statistics Caregie Mello Uiversity Pittsburgh PA 53 This documet cotais proofs supplemetary details ad supplemetary experimets for the paper Graph Sparsificatio Approaches for Laplacia Smoothig. All sectio umbers equatio umbers ad figure umbers i this supplemetary documet are preceded by the letter A to distiguish them from those from the mai paper. A. Proof of Theorem Part a. By optimality of ˆθ for problem 5 y ˆθ + λˆθ T Lˆθ y ˆβ + λ ˆβ T L ˆβ y ˆβ + λ + ɛ ˆβ T L ˆβ where we have used the spectral similarity of L L. Rearragig we see that ˆθ ˆβ y T ˆθ ˆβ + λ + ɛ ˆβ T L ˆβ λˆθ T Lˆθ. Substitutig y = ˆβ + y ˆβ o the right-had side ad agai rearragig ˆθ ˆβ y ˆβ T ˆθ ˆβ + λ + ɛ ˆβ T L ˆβ λˆθ T Lˆθ. Usig y ˆβ = λl ˆβ from the statioarity coditio for ˆθ ˆβ λˆθ T L ˆβ λ ɛ ˆβ T L ˆβ λˆθ T Lˆθ λ L / ˆθ L / ˆβ λ ɛ ˆβ T L ˆβ λˆθ T Lˆθ λ + ɛ L / ˆθ L / ˆβ λ ɛ ˆβ T L ˆβ λˆθ T Lˆθ where we have agai used the spectral similarity of L L. Now we examie two cases for last lie above. If + ɛ L/ ˆθ L / ˆβ the ˆθ ˆβ λ + ɛ ˆβ T L ˆβ λˆθ T Lˆθ. If + ɛ L / ˆθ > L / ˆβ the ˆθ ˆβ λ + ɛˆθ T Lˆθ λ ɛ ˆβT L ˆβ. Puttig these together we get the desired fial boud. Proof of b. Followig the proof strategy for part a except with the regularizatio parameter deoted by
2 λ for problem 5 we have ˆθ ˆβ λˆθ T L ˆβ + + ɛλ λ ˆβ T L ˆβ λ ˆθT Lˆθ λ L / ˆθ L / ˆβ + + ɛλ λ ˆβ T L ˆβ λ ˆθT Lˆθ λ + ɛ L / ˆθ L / ˆβ + + ɛλ λ ˆβ T L ˆβ λ ˆθT Lˆθ = λ + ɛ L / ˆθ L / λ ˆβ λ + ɛ L / ˆθ + + ɛλ λ ˆβ T L ˆβ. }{{} a We ow examie two cases for the first term a o the lie above. If λ /λ + ɛ L / ˆθ L / ˆβ the a 4λ + ɛ/λ ˆβ T L ˆβ; if λ /λ + ɛ L / ˆθ > L / ˆβ the a 0. Therefore ˆθ ˆβ 4λ + ɛ + + ɛλ λ ˆβT L ˆβ. λ A easy calculatio shows that the optimal value of λ makig the leadig factor above as small as possible is λ = λ. Pluggig this i gives the desired result. A. Proof of Theorem Let us first cosider the uivariate logistic fuctio g : R R defied by gx = ax + log + e x for a costat a R. It is clear that g is covex with first ad secod derivatives g x = a + πx g x = πx πx where πx = e x / + e x. If x R the g x δ δ where δ = π R. Thus by strog covexity of g over the iterval [ R R] we have gz gx g xz x or equivaletly by the mootoicity of the iverse lik fuctio π gz gx g xz x δ δ z x for all x z [ R R]. δ δ z x for all πx πz [δ δ]. Now write the logistic loss i 3 as fβ = i= gβ i; y i ; the the above shows that fθ fβ fβ T θ β δ δ θ β wheever πβ i πθ i [δ δ] i =.... A. Hece uder the assumptios of the theorem we may begi with the assertio that fˆθ + λ ˆθT Lˆθ f ˆβ + λ ˆβT L ˆβ by the optimality of ˆθ for its ow problem the rearrage ad use A. to arrive at δ δ ˆβ ˆθ f ˆβ T ˆθ ˆβ + λ ˆβT L ˆβ λ ˆθT Lˆθ f ˆβ T ˆθ ˆβ + λ + ɛ ˆβ T L ˆβ λ ˆθT Lˆθ. Usig f ˆβ = λl ˆβ from the statioarity coditio for ˆβ i we have δ δ ˆβ ˆθ λˆθ T L ˆβ + + ɛλ λ ˆβ T L ˆβ λ ˆθT Lˆθ. The right-had side is precisely of the same form as that aalyzed i parts a ad b of Theorem ad the results follow.
3 A.3 Estimatio Error Bouds ad Stability The followig is a estimatio error boud derived for Laplacia smoothig i regressio where the error rate is show to scale with λ/. Theorem A.. Assume that y Nβ σ I ad deote by... the eigevalues of the graph Laplacia matrix L. Let i 0 {... } ad cosider a choice of tuig parameter λ = Θ i=i 0+ i Dβ where D the graph differece operator so that L = D T D. The the graph Laplacia smoothig estimate ˆβ i with the regressio loss satisfies ˆβ β ullityl = O P + i 0 + i=i 0+ Dβ. i Proof. Let R = rowl the row space of L ad R = ulll the ull space of L. Also let P R be the projectio oto R ad P R be the projectio oto R ad abbreviate x R = P R x x R = P R x. We ca decompose ˆβ β = ˆβ ˆβ R + ˆβ ˆβ R. The first term ˆβ ˆβ R is o the order of ullityl which is the umber of coected compoets i the uderlyig graph G. This cotributes the first term i the error rate of the theorem. Now it suffices to cosider ˆβ ˆβ R. Usig the optimality of ˆβ i y ˆβ + λ ˆβ T L ˆβ y β + λβ T Lβ. After settig y = β + ɛ with ɛ N0 σ I expadig ad rearragig we arrive at the basic iequality ˆβ β R ˆβ β P R ɛ + λβ T Lβ λ ˆβ T L ˆβ. A. Abbreviatig δ = ˆβ β let us write ɛ T P R δ = ɛ T P i0 P R δ + ɛ T I P i0 P R δ where P i0 is the projectio oto the first i 0 eigevectors of L i.e. the eigevectors associated with eigevalues... i0. The first term above satisfies whereas the secod term satisfies ɛ T P i0 P R δ P i0 ɛ δ R = O P i0 δ R ɛ T I P i0 P R δ = ɛ T I P i0 D + Dδ = ɛt I P i0 D + D + T I P i0 ɛ λdδ + λ λ λ Dδ. A.3 I the last lie here we used the iequality a T b a + b. Directly from the sigular value decompositio of D it is easy to verify that D + T I P i0 ɛ = O P. i=i i 0+ 3
4 Pluggig these bouds ito the basic iequality A. we see that ˆβ β R O P i0 ˆβ β R + O P i=i i λ + λ D ˆβ D ˆβ 0 + λ Dβ λ D ˆβ 0+ O P i0 ˆβ β R + O P λ + λ Dβ. i=i i 0+ I the last lie we have agai used a + b a + b. Pluggig i the value of λ as described i the statemet of the theorem yields ˆβ β R O P i0 ˆβ β R + O P i=i i 0+ Dβ. We ca view this as a quadratic equatio of form ax bx c 0 i x = ˆβ β R. As a > 0 the larger of its two roots serves as a boud for x i.e. x b + b + 4ac/a b/a + c/a or x b /a + c/a which meas that ˆβ β R = O P i 0 + Dβ. i This completes the proof. i=i 0+ Remark. Assumig that the umber of coected compoets ullityl stays bouded as grows the optimal i 0 i the theorem would be chose to balace i 0 / with the last term i the rate. This depeds of course o the decay of eigevalues of L; for differet graphs G the eigevalues will decay at differet rates leadig to differet error bouds. But i ay case the result i the theorem reduces to ˆβ β λ = O P Dβ which whe Dβ = O is precisely as claimed i 7 i the mai paper. Our ext result shows that whe the solutio ˆβ is tued as i Theorem A. the achieved pealty is o the same order as that for β. Lemma A.. Uder the same coditios as i Theorem A. if ullityl = O ad we were to choose i 0 = Oλ Dβ the the achieved pealty term satisfies D ˆβ = O P Dβ. Hece i particular if Dβ = O the D ˆβ = O P. Proof. Returig to the proof of Theorem A. cosider replacig the step i A.3 by ɛ T I P i0 P R δ = ɛ T I P i0 D + Dδ = ɛt I P i0 D + D + T I P i0 ɛ 0.5λDδ + λ 0.5λ λ 4 Dδ. Carryig o as i the proof of Theorem A. we arrive at ˆβ β R O P i0 ˆβ β R + O P i=i i 0+ λ + 3λ Dβ λ D ˆβ. Usig ˆβ β R 0 ad rearragig we have D ˆβ i 0 O P λ ˆβ β R + 3 Dβ = O P Dβ where we have used the choice of i 0 ad the correspodig rate of ˆβ β R from Theorem A.. Lastly we show that uder the coditios of the last lemma both ˆβ ad ˆθ the latter beig the solutio of the sparsified problem 5 achieve the same error rate. 4
5 Corollary A.. Uder the same coditios as i Lemma A. assumig that L is a + ɛ approximatio of L the solutio ˆθ of the sparsified problem 5 with the regressio loss ad tuig parameter λ = Θ i=i 0+ i Dβ satisfies just as ˆβ does. ˆθ β λ = O P Dβ Proof. There are two ways to prove this result. The first strategy is simply to ote that ˆθ β ˆθ ˆβ + ˆβ β ad both terms are O P λ Dβ / the first term cotrolled by part b of Theorem i the mai paper i combiatio with Lemma A. ad the secod term hadled by Theorem A.. The secod strategy is to replicate the proof of Theorem A. applied to the problem 5 with L i place of L ad the coclude that the resultig error rate is ot chaged sice all eigevalues of L are withi a multiplicative factor betwee / + ɛ ad + ɛ of those of L. A.4 Proof of Lemma We use Hoeffdig s iequality as a mai tool for provig the results for both the uiform ad kn samplers first treatig the uiform samplig case. For i =... q let e i = u i v i be the ith edge sampled ad deote by X i the term added to x T Lx due to samplig ei. The X i = W q x u x v with probability w e /W for each e = u v E recallig W = e E w e. Note that q i= X i = x T Lx. The variables X X... X q are idepedet ad lie i a iterval of legth W r/q where r = max uv E x u x v mi uv E x u x v. Usig Hoeffdig s iequality P q i= X i EX i > t exp t q i= W r /q = exp t q W r. A.4 As L is a ubiased estimator of L we have q i= EX i = Ex T Lx = x T Lx. Abbreviatig µ = x T Lx ad substitutig t = δµ ito the Hoeffdig boud A.4 we ifer that x T Lx µ δµ A.5 with probability at least exp δ µ q/w r. Now we use the smoothess assumptio 8 o x i.e. Dx µsw mi /W as well as r max x u x v uv E max uv E w uv w mi x u x v = Dx w mi recallig w mi = mi e E w e. Therefore r µs/w ad the result i A.5 holds with probability at least exp δ q/s. To make this probability at least / we eed to choose q s /δ log samples. Lastly to give the result as writte i the lemma we simply substitute δ = ɛ/ + ɛ ad observe that A.5 the implies + ɛ µ xt Lx + ɛ µ + ɛµ. + ɛ 5
6 The argumets for kn samplig are similar. We sample oly from odes with degree greater tha k; call this set U. Whe samplig edges icidet to ode u U for i =... k let e ui deote the ith edge sampled ad let X ui be its cotributio to x T Lx. The X ui = W u k x u x v with probability w e /W u for each v Nu recallig W u = v Nu w uv ad Nu deotes the eighbors of u. The variables X ui u U i =... k are idepedet ad lie i a iterval of legth at most W max r/k where W max = max u V W u ad r is as above i the proof for the uiform samplig part. Usig Hoeffdig s iequality oce agai P k X ui EX ui > t exp u U k i= t Wmax r/k = exp 8t k r u U i= u U W max A.6 Deotig z u = v Nu w uvx u x v for all u V ote that EX ui = z u /k for all u U. Thus ad k u U i= u U i= X ui = x T Lx k EX ui = x T Lx u V \U u V \U This meas that the Hoeffdig boud A.6 settig t = δµ implies the statemet i A.5 with probability exp 8δ µ k/r u U W max exp 8δ µ k/r W max. Boudig r sµ/w from the smoothess coditio 8 as argued i the proof for the uiform samplig case this is further lower bouded by exp 8δ W k/s W max. To make this probability at least / we eed to choose k sw max /W /4δ log. The rest follows as i the uiform samplig case i.e. to get the result as stated i the lemma we take δ = ɛ/ + ɛ. A.5 Proof of Lemma The proof is simple. Deote by L the Laplacia matrix of G H ad L the Laplacia matrix of G H. Also deote the edge weights of G H by w uu w vv ad the edge weights of G H by w uu w vv. Observe x T Lx = w vv x uv x uv + w uu x uv x u v u V G {vv } E H v V H {uu } E G x T Lx = w vv x uv x uv + w uu x uv x u v. z u z u.. u V G {vv } E H v V H {uu } E G Applyig the appropriate spectral sparsificatio bouds to the ier sums gives the result. A.6 Proof of Theorem 3 Let us deote F β = i Ω lx T i β; y i + µ β v. Note that by costructio F is strogly covex with parameter µ > 0 which implies that F θ F β g T θ β + µ θ β A.7 for ay subgradiet g of F at β. Therefore by the optimality of ˆθ for problem 0 F ˆθ + λ ˆθT Lˆθ F ˆβ + λ ˆβT L ˆβ 6
7 MSE Time secods ad after rearragig ad applyig A.7 we have µ ˆβ ˆθ g T ˆθ ˆβ + λ ˆβT L ˆβ λ ˆθT Lˆθ g T ˆθ ˆβ + λ + ɛ ˆβ T L ˆβ λ ˆθT Lˆθ for ay subgradiet g of F at β. As there exists a subgradiet such that g = λl ˆβ from the statioarity coditio for ˆβ i 9 we have µ ˆβ ˆθ λˆθ T L ˆβ + + ɛλ λ ˆβ T L ˆβ λ ˆθT Lˆθ ad the remaider of the aalysis proceeds exactly as i parts a ad b of Theorem. A.7 Gaussia Smoothig with the Google+ Data I this experimet we added i.i.d. N0 5.5 oise to the compoets of β the smooth sigal costructed by a simulated diffusio over the Google+ graph as described i the mai paper. Figure A. shows the results whe cosiderig a Gaussia loss i the Laplacia smoothig problems MSE Sparse kn Uif Dese Sparse kn Uif Dese Time Number of problems Figure A.: MSE ad timig results for a Gaussia smoothig problem with the Google+ data. 7
ECE 901 Lecture 12: Complexity Regularization and the Squared Loss
ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality
More information5.1 Review of Singular Value Decomposition (SVD)
MGMT 69000: Topics i High-dimesioal Data Aalysis Falll 06 Lecture 5: Spectral Clusterig: Overview (cotd) ad Aalysis Lecturer: Jiamig Xu Scribe: Adarsh Barik, Taotao He, September 3, 06 Outlie Review of
More informationOptimally Sparse SVMs
A. Proof of Lemma 3. We here prove a lower boud o the umber of support vectors to achieve geeralizatio bouds of the form which we cosider. Importatly, this result holds ot oly for liear classifiers, but
More informationThe log-behavior of n p(n) and n p(n)/n
Ramauja J. 44 017, 81-99 The log-behavior of p ad p/ William Y.C. Che 1 ad Ke Y. Zheg 1 Ceter for Applied Mathematics Tiaji Uiversity Tiaji 0007, P. R. Chia Ceter for Combiatorics, LPMC Nakai Uivercity
More informationSupplementary Material for Fast Stochastic AUC Maximization with O(1/n)-Convergence Rate
Supplemetary Material for Fast Stochastic AUC Maximizatio with O/-Covergece Rate Migrui Liu Xiaoxua Zhag Zaiyi Che Xiaoyu Wag 3 iabao Yag echical Lemmas ized versio of Hoeffdig s iequality, ote that We
More informationLecture 2. The Lovász Local Lemma
Staford Uiversity Sprig 208 Math 233A: No-costructive methods i combiatorics Istructor: Ja Vodrák Lecture date: Jauary 0, 208 Origial scribe: Apoorva Khare Lecture 2. The Lovász Local Lemma 2. Itroductio
More informationCS / MCS 401 Homework 3 grader solutions
CS / MCS 401 Homework 3 grader solutios assigmet due July 6, 016 writte by Jāis Lazovskis maximum poits: 33 Some questios from CLRS. Questios marked with a asterisk were ot graded. 1 Use the defiitio of
More informationLecture 7: Density Estimation: k-nearest Neighbor and Basis Approach
STAT 425: Itroductio to Noparametric Statistics Witer 28 Lecture 7: Desity Estimatio: k-nearest Neighbor ad Basis Approach Istructor: Ye-Chi Che Referece: Sectio 8.4 of All of Noparametric Statistics.
More informationRademacher Complexity
EECS 598: Statistical Learig Theory, Witer 204 Topic 0 Rademacher Complexity Lecturer: Clayto Scott Scribe: Ya Deg, Kevi Moo Disclaimer: These otes have ot bee subjected to the usual scrutiy reserved for
More informationLinear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d
Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y
More informationSupplementary Materials for Statistical-Computational Phase Transitions in Planted Models: The High-Dimensional Setting
Supplemetary Materials for Statistical-Computatioal Phase Trasitios i Plated Models: The High-Dimesioal Settig Yudog Che The Uiversity of Califoria, Berkeley yudog.che@eecs.berkeley.edu Jiamig Xu Uiversity
More informationSummary and Discussion on Simultaneous Analysis of Lasso and Dantzig Selector
Summary ad Discussio o Simultaeous Aalysis of Lasso ad Datzig Selector STAT732, Sprig 28 Duzhe Wag May 4, 28 Abstract This is a discussio o the work i Bickel, Ritov ad Tsybakov (29). We begi with a short
More informationProblem Set 4 Due Oct, 12
EE226: Radom Processes i Systems Lecturer: Jea C. Walrad Problem Set 4 Due Oct, 12 Fall 06 GSI: Assae Gueye This problem set essetially reviews detectio theory ad hypothesis testig ad some basic otios
More informationConvergence of random variables. (telegram style notes) P.J.C. Spreij
Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space
More informationMath Solutions to homework 6
Math 175 - Solutios to homework 6 Cédric De Groote November 16, 2017 Problem 1 (8.11 i the book): Let K be a compact Hermitia operator o a Hilbert space H ad let the kerel of K be {0}. Show that there
More informationThis exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam.
Probability ad Statistics FS 07 Secod Sessio Exam 09.0.08 Time Limit: 80 Miutes Name: Studet ID: This exam cotais 9 pages (icludig this cover page) ad 0 questios. A Formulae sheet is provided with the
More informationREGRESSION WITH QUADRATIC LOSS
REGRESSION WITH QUADRATIC LOSS MAXIM RAGINSKY Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X, Y ), where, as before, X is a R d
More informationAda Boost, Risk Bounds, Concentration Inequalities. 1 AdaBoost and Estimates of Conditional Probabilities
CS8B/Stat4B Sprig 008) Statistical Learig Theory Lecture: Ada Boost, Risk Bouds, Cocetratio Iequalities Lecturer: Peter Bartlett Scribe: Subhrasu Maji AdaBoost ad Estimates of Coditioal Probabilities We
More informationSupplemental Material: Proofs
Proof to Theorem Supplemetal Material: Proofs Proof. Let be the miimal umber of traiig items to esure a uique solutio θ. First cosider the case. It happes if ad oly if θ ad Rak(A) d, which is a special
More information1+x 1 + α+x. x = 2(α x2 ) 1+x
Math 2030 Homework 6 Solutios # [Problem 5] For coveiece we let α lim sup a ad β lim sup b. Without loss of geerality let us assume that α β. If α the by assumptio β < so i this case α + β. By Theorem
More informationMath 220B Final Exam Solutions March 18, 2002
Math 0B Fial Exam Solutios March 18, 00 1. (1 poits) (a) (6 poits) Fid the Gree s fuctio for the tilted half-plae {(x 1, x ) R : x 1 + x > 0}. For x (x 1, x ), y (y 1, y ), express your Gree s fuctio G(x,
More informationSupplementary Material for Fast Stochastic AUC Maximization with O(1/n)-Convergence Rate
Supplemetary Material for Fast Stochastic AUC Maximizatio with O/-Covergece Rate Migrui Liu Xiaoxua Zhag Zaiyi Che Xiaoyu Wag 3 iabao Yag echical Lemmas ized versio of Hoeffdig s iequality, ote that We
More informationMath 451: Euclidean and Non-Euclidean Geometry MWF 3pm, Gasson 204 Homework 3 Solutions
Math 451: Euclidea ad No-Euclidea Geometry MWF 3pm, Gasso 204 Homework 3 Solutios Exercises from 1.4 ad 1.5 of the otes: 4.3, 4.10, 4.12, 4.14, 4.15, 5.3, 5.4, 5.5 Exercise 4.3. Explai why Hp, q) = {x
More informationTMA4245 Statistics. Corrected 30 May and 4 June Norwegian University of Science and Technology Department of Mathematical Sciences.
Norwegia Uiversity of Sciece ad Techology Departmet of Mathematical Scieces Corrected 3 May ad 4 Jue Solutios TMA445 Statistics Saturday 6 May 9: 3: Problem Sow desity a The probability is.9.5 6x x dx
More informationEfficient GMM LECTURE 12 GMM II
DECEMBER 1 010 LECTURE 1 II Efficiet The estimator depeds o the choice of the weight matrix A. The efficiet estimator is the oe that has the smallest asymptotic variace amog all estimators defied by differet
More informationChapter 3. Strong convergence. 3.1 Definition of almost sure convergence
Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i
More informationProblem Set 2 Solutions
CS271 Radomess & Computatio, Sprig 2018 Problem Set 2 Solutios Poit totals are i the margi; the maximum total umber of poits was 52. 1. Probabilistic method for domiatig sets 6pts Pick a radom subset S
More informationLecture 10 October Minimaxity and least favorable prior sequences
STATS 300A: Theory of Statistics Fall 205 Lecture 0 October 22 Lecturer: Lester Mackey Scribe: Brya He, Rahul Makhijai Warig: These otes may cotai factual ad/or typographic errors. 0. Miimaxity ad least
More informationLecture 12: September 27
36-705: Itermediate Statistics Fall 207 Lecturer: Siva Balakrisha Lecture 2: September 27 Today we will discuss sufficiecy i more detail ad the begi to discuss some geeral strategies for costructig estimators.
More information1 Approximating Integrals using Taylor Polynomials
Seughee Ye Ma 8: Week 7 Nov Week 7 Summary This week, we will lear how we ca approximate itegrals usig Taylor series ad umerical methods. Topics Page Approximatig Itegrals usig Taylor Polyomials. Defiitios................................................
More informationA Risk Comparison of Ordinary Least Squares vs Ridge Regression
Joural of Machie Learig Research 14 (2013) 1505-1511 Submitted 5/12; Revised 3/13; Published 6/13 A Risk Compariso of Ordiary Least Squares vs Ridge Regressio Paramveer S. Dhillo Departmet of Computer
More informationLinear Regression Demystified
Liear Regressio Demystified Liear regressio is a importat subject i statistics. I elemetary statistics courses, formulae related to liear regressio are ofte stated without derivatio. This ote iteds to
More information10-701/ Machine Learning Mid-term Exam Solution
0-70/5-78 Machie Learig Mid-term Exam Solutio Your Name: Your Adrew ID: True or False (Give oe setece explaatio) (20%). (F) For a cotiuous radom variable x ad its probability distributio fuctio p(x), it
More informationSequences and Series of Functions
Chapter 6 Sequeces ad Series of Fuctios 6.1. Covergece of a Sequece of Fuctios Poitwise Covergece. Defiitio 6.1. Let, for each N, fuctio f : A R be defied. If, for each x A, the sequece (f (x)) coverges
More informationMath 128A: Homework 1 Solutions
Math 8A: Homework Solutios Due: Jue. Determie the limits of the followig sequeces as. a) a = +. lim a + = lim =. b) a = + ). c) a = si4 +6) +. lim a = lim = lim + ) [ + ) ] = [ e ] = e 6. Observe that
More informationFundamental Theorem of Algebra. Yvonne Lai March 2010
Fudametal Theorem of Algebra Yvoe Lai March 010 We prove the Fudametal Theorem of Algebra: Fudametal Theorem of Algebra. Let f be a o-costat polyomial with real coefficiets. The f has at least oe complex
More information6.867 Machine learning, lecture 7 (Jaakkola) 1
6.867 Machie learig, lecture 7 (Jaakkola) 1 Lecture topics: Kerel form of liear regressio Kerels, examples, costructio, properties Liear regressio ad kerels Cosider a slightly simpler model where we omit
More informationNotes for Lecture 11
U.C. Berkeley CS78: Computatioal Complexity Hadout N Professor Luca Trevisa 3/4/008 Notes for Lecture Eigevalues, Expasio, ad Radom Walks As usual by ow, let G = (V, E) be a udirected d-regular graph with
More informationStatistical and Mathematical Methods DS-GA 1002 December 8, Sample Final Problems Solutions
Statistical ad Mathematical Methods DS-GA 00 December 8, 05. Short questios Sample Fial Problems Solutios a. Ax b has a solutio if b is i the rage of A. The dimesio of the rage of A is because A has liearly-idepedet
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More information1 Review and Overview
DRAFT a fial versio will be posted shortly CS229T/STATS231: Statistical Learig Theory Lecturer: Tegyu Ma Lecture #3 Scribe: Migda Qiao October 1, 2013 1 Review ad Overview I the first half of this course,
More informationChapter 5. Inequalities. 5.1 The Markov and Chebyshev inequalities
Chapter 5 Iequalities 5.1 The Markov ad Chebyshev iequalities As you have probably see o today s frot page: every perso i the upper teth percetile ears at least 1 times more tha the average salary. I other
More informationDefinition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4.
4. BASES I BAACH SPACES 39 4. BASES I BAACH SPACES Sice a Baach space X is a vector space, it must possess a Hamel, or vector space, basis, i.e., a subset {x γ } γ Γ whose fiite liear spa is all of X ad
More informationSparsification using Regular and Weighted. Graphs
Sparsificatio usig Regular a Weighte 1 Graphs Aly El Gamal ECE Departmet a Cooriate Sciece Laboratory Uiversity of Illiois at Urbaa-Champaig Abstract We review the state of the art results o spectral approximatio
More information18.657: Mathematics of Machine Learning
8.657: Mathematics of Machie Learig Lecturer: Philippe Rigollet Lecture 4 Scribe: Cheg Mao Sep., 05 I this lecture, we cotiue to discuss the effect of oise o the rate of the excess risk E(h) = R(h) R(h
More informationRandomized Algorithms I, Spring 2018, Department of Computer Science, University of Helsinki Homework 1: Solutions (Discussed January 25, 2018)
Radomized Algorithms I, Sprig 08, Departmet of Computer Sciece, Uiversity of Helsiki Homework : Solutios Discussed Jauary 5, 08). Exercise.: Cosider the followig balls-ad-bi game. We start with oe black
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More information( ) (( ) ) ANSWERS TO EXERCISES IN APPENDIX B. Section B.1 VECTORS AND SETS. Exercise B.1-1: Convex sets. are convex, , hence. and. (a) Let.
Joh Riley 8 Jue 03 ANSWERS TO EXERCISES IN APPENDIX B Sectio B VECTORS AND SETS Exercise B-: Covex sets (a) Let 0 x, x X, X, hece 0 x, x X ad 0 x, x X Sice X ad X are covex, x X ad x X The x X X, which
More informationEconometrica Supplementary Material
Ecoometrica Supplemetary Material SUPPLEMENT TO NONPARAMETRIC INSTRUMENTAL VARIABLES ESTIMATION OF A QUANTILE REGRESSION MODEL : MATHEMATICAL APPENDIX (Ecoometrica, Vol. 75, No. 4, July 27, 1191 128) BY
More informationRegression with quadratic loss
Regressio with quadratic loss Maxim Ragisky October 13, 2015 Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X,Y, where, as before,
More informationLecture 12: February 28
10-716: Advaced Machie Learig Sprig 2019 Lecture 12: February 28 Lecturer: Pradeep Ravikumar Scribes: Jacob Tyo, Rishub Jai, Ojash Neopae Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer:
More informationHOMEWORK #10 SOLUTIONS
Math 33 - Aalysis I Sprig 29 HOMEWORK # SOLUTIONS () Prove that the fuctio f(x) = x 3 is (Riema) itegrable o [, ] ad show that x 3 dx = 4. (Without usig formulae for itegratio that you leart i previous
More informationTechnical Proofs for Homogeneity Pursuit
Techical Proofs for Homogeeity Pursuit bstract This is the supplemetal material for the article Homogeeity Pursuit, submitted for publicatio i Joural of the merica Statistical ssociatio. B Proofs B. Proof
More informationMachine Learning Theory Tübingen University, WS 2016/2017 Lecture 12
Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture Tolstikhi Ilya Abstract I this lecture we derive risk bouds for kerel methods. We will start by showig that Soft Margi kerel SVM correspods to miimizig
More information1 Review of Probability & Statistics
1 Review of Probability & Statistics a. I a group of 000 people, it has bee reported that there are: 61 smokers 670 over 5 960 people who imbibe (drik alcohol) 86 smokers who imbibe 90 imbibers over 5
More informationThe minimum value and the L 1 norm of the Dirichlet kernel
The miimum value ad the L orm of the Dirichlet kerel For each positive iteger, defie the fuctio D (θ + ( cos θ + cos θ + + cos θ e iθ + + e iθ + e iθ + e + e iθ + e iθ + + e iθ which we call the (th Dirichlet
More information1 Introduction to reducing variance in Monte Carlo simulations
Copyright c 010 by Karl Sigma 1 Itroductio to reducig variace i Mote Carlo simulatios 11 Review of cofidece itervals for estimatig a mea I statistics, we estimate a ukow mea µ = E(X) of a distributio by
More informationDiscrete Mathematics for CS Spring 2008 David Wagner Note 22
CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig
More informationIt is often useful to approximate complicated functions using simpler ones. We consider the task of approximating a function by a polynomial.
Taylor Polyomials ad Taylor Series It is ofte useful to approximate complicated fuctios usig simpler oes We cosider the task of approximatig a fuctio by a polyomial If f is at least -times differetiable
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationIt is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function.
MATH 532 Measurable Fuctios Dr. Neal, WKU Throughout, let ( X, F, µ) be a measure space ad let (!, F, P ) deote the special case of a probability space. We shall ow begi to study real-valued fuctios defied
More informationMachine Learning Brett Bernstein
Machie Learig Brett Berstei Week 2 Lecture: Cocept Check Exercises Starred problems are optioal. Excess Risk Decompositio 1. Let X = Y = {1, 2,..., 10}, A = {1,..., 10, 11} ad suppose the data distributio
More informationLecture 2: Monte Carlo Simulation
STAT/Q SCI 43: Itroductio to Resamplig ethods Sprig 27 Istructor: Ye-Chi Che Lecture 2: ote Carlo Simulatio 2 ote Carlo Itegratio Assume we wat to evaluate the followig itegratio: e x3 dx What ca we do?
More informationECE534, Spring 2018: Final Exam
ECE534, Srig 2018: Fial Exam Problem 1 Let X N (0, 1) ad Y N (0, 1) be ideedet radom variables. variables V = X + Y ad W = X 2Y. Defie the radom (a) Are V, W joitly Gaussia? Justify your aswer. (b) Comute
More informationFACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures
FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals
More informationSimulation. Two Rule For Inverting A Distribution Function
Simulatio Two Rule For Ivertig A Distributio Fuctio Rule 1. If F(x) = u is costat o a iterval [x 1, x 2 ), the the uiform value u is mapped oto x 2 through the iversio process. Rule 2. If there is a jump
More informationZeros of Polynomials
Math 160 www.timetodare.com 4.5 4.6 Zeros of Polyomials I these sectios we will study polyomials algebraically. Most of our work will be cocered with fidig the solutios of polyomial equatios of ay degree
More information5.1 A mutual information bound based on metric entropy
Chapter 5 Global Fao Method I this chapter, we exted the techiques of Chapter 2.4 o Fao s method the local Fao method) to a more global costructio. I particular, we show that, rather tha costructig a local
More informationHomework Set #3 - Solutions
EE 15 - Applicatios of Covex Optimizatio i Sigal Processig ad Commuicatios Dr. Adre Tkaceko JPL Third Term 11-1 Homework Set #3 - Solutios 1. a) Note that x is closer to x tha to x l i the Euclidea orm
More informationUniversity of Colorado Denver Dept. Math. & Stat. Sciences Applied Analysis Preliminary Exam 13 January 2012, 10:00 am 2:00 pm. Good luck!
Uiversity of Colorado Dever Dept. Math. & Stat. Scieces Applied Aalysis Prelimiary Exam 13 Jauary 01, 10:00 am :00 pm Name: The proctor will let you read the followig coditios before the exam begis, ad
More informationRandom Matrices with Blocks of Intermediate Scale Strongly Correlated Band Matrices
Radom Matrices with Blocks of Itermediate Scale Strogly Correlated Bad Matrices Jiayi Tog Advisor: Dr. Todd Kemp May 30, 07 Departmet of Mathematics Uiversity of Califoria, Sa Diego Cotets Itroductio Notatio
More informationEcon 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara
Poit Estimator Eco 325 Notes o Poit Estimator ad Cofidece Iterval 1 By Hiro Kasahara Parameter, Estimator, ad Estimate The ormal probability desity fuctio is fully characterized by two costats: populatio
More informationNICK DUFRESNE. 1 1 p(x). To determine some formulas for the generating function of the Schröder numbers, r(x) = a(x) =
AN INTRODUCTION TO SCHRÖDER AND UNKNOWN NUMBERS NICK DUFRESNE Abstract. I this article we will itroduce two types of lattice paths, Schröder paths ad Ukow paths. We will examie differet properties of each,
More informationThe standard deviation of the mean
Physics 6C Fall 20 The stadard deviatio of the mea These otes provide some clarificatio o the distictio betwee the stadard deviatio ad the stadard deviatio of the mea.. The sample mea ad variace Cosider
More informationSeunghee Ye Ma 8: Week 5 Oct 28
Week 5 Summary I Sectio, we go over the Mea Value Theorem ad its applicatios. I Sectio 2, we will recap what we have covered so far this term. Topics Page Mea Value Theorem. Applicatios of the Mea Value
More information36-755, Fall 2017 Homework 5 Solution Due Wed Nov 15 by 5:00pm in Jisu s mailbox
Poits: 00+ pts total for the assigmet 36-755, Fall 07 Homework 5 Solutio Due Wed Nov 5 by 5:00pm i Jisu s mailbox We first review some basic relatios with orms ad the sigular value decompositio o matrices
More informationLet us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f.
Lecture 5 Let us give oe more example of MLE. Example 3. The uiform distributio U[0, ] o the iterval [0, ] has p.d.f. { 1 f(x =, 0 x, 0, otherwise The likelihood fuctio ϕ( = f(x i = 1 I(X 1,..., X [0,
More informationAPPENDIX A SMO ALGORITHM
AENDIX A SMO ALGORITHM Sequetial Miimal Optimizatio SMO) is a simple algorithm that ca quickly solve the SVM Q problem without ay extra matrix storage ad without usig time-cosumig umerical Q optimizatio
More informationSince X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain
Assigmet 9 Exercise 5.5 Let X biomial, p, where p 0, 1 is ukow. Obtai cofidece itervals for p i two differet ways: a Sice X / p d N0, p1 p], the variace of the limitig distributio depeds oly o p. Use the
More information4. Hypothesis testing (Hotelling s T 2 -statistic)
4. Hypothesis testig (Hotellig s T -statistic) Cosider the test of hypothesis H 0 : = 0 H A = 6= 0 4. The Uio-Itersectio Priciple W accept the hypothesis H 0 as valid if ad oly if H 0 (a) : a T = a T 0
More informationLecture 3: August 31
36-705: Itermediate Statistics Fall 018 Lecturer: Siva Balakrisha Lecture 3: August 31 This lecture will be mostly a summary of other useful expoetial tail bouds We will ot prove ay of these i lecture,
More informationProduct measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014.
Product measures, Toelli s ad Fubii s theorems For use i MAT3400/4400, autum 2014 Nadia S. Larse Versio of 13 October 2014. 1. Costructio of the product measure The purpose of these otes is to preset the
More informationBasics of Probability Theory (for Theory of Computation courses)
Basics of Probability Theory (for Theory of Computatio courses) Oded Goldreich Departmet of Computer Sciece Weizma Istitute of Sciece Rehovot, Israel. oded.goldreich@weizma.ac.il November 24, 2008 Preface.
More informationRiesz-Fischer Sequences and Lower Frame Bounds
Zeitschrift für Aalysis ud ihre Aweduge Joural for Aalysis ad its Applicatios Volume 1 (00), No., 305 314 Riesz-Fischer Sequeces ad Lower Frame Bouds P. Casazza, O. Christese, S. Li ad A. Lider Abstract.
More informationx iu i E(x u) 0. In order to obtain a consistent estimator of β, we find the instrumental variable z which satisfies E(z u) = 0. z iu i E(z u) = 0.
27 However, β MM is icosistet whe E(x u) 0, i.e., β MM = (X X) X y = β + (X X) X u = β + ( X X ) ( X u ) \ β. Note as follows: X u = x iu i E(x u) 0. I order to obtai a cosistet estimator of β, we fid
More informationQuestions and answers, kernel part
Questios ad aswers, kerel part October 8, 205 Questios. Questio : properties of kerels, PCA, represeter theorem. [2 poits] Let F be a RK defied o some domai X, with feature map φ(x) x X ad reproducig kerel
More informationMath 113, Calculus II Winter 2007 Final Exam Solutions
Math, Calculus II Witer 7 Fial Exam Solutios (5 poits) Use the limit defiitio of the defiite itegral ad the sum formulas to compute x x + dx The check your aswer usig the Evaluatio Theorem Solutio: I this
More informationarxiv: v1 [math.pr] 4 Dec 2013
Squared-Norm Empirical Process i Baach Space arxiv:32005v [mathpr] 4 Dec 203 Vicet Q Vu Departmet of Statistics The Ohio State Uiversity Columbus, OH vqv@statosuedu Abstract Jig Lei Departmet of Statistics
More informationAssignment 5: Solutions
McGill Uiversity Departmet of Mathematics ad Statistics MATH 54 Aalysis, Fall 05 Assigmet 5: Solutios. Let y be a ubouded sequece of positive umbers satisfyig y + > y for all N. Let x be aother sequece
More informationINFINITE SEQUENCES AND SERIES
INFINITE SEQUENCES AND SERIES INFINITE SEQUENCES AND SERIES I geeral, it is difficult to fid the exact sum of a series. We were able to accomplish this for geometric series ad the series /[(+)]. This is
More informationMath 61CM - Solutions to homework 3
Math 6CM - Solutios to homework 3 Cédric De Groote October 2 th, 208 Problem : Let F be a field, m 0 a fixed oegative iteger ad let V = {a 0 + a x + + a m x m a 0,, a m F} be the vector space cosistig
More informationAlternating Series. 1 n 0 2 n n THEOREM 9.14 Alternating Series Test Let a n > 0. The alternating series. 1 n a n.
0_0905.qxd //0 :7 PM Page SECTION 9.5 Alteratig Series Sectio 9.5 Alteratig Series Use the Alteratig Series Test to determie whether a ifiite series coverges. Use the Alteratig Series Remaider to approximate
More informationA Proof of Birkhoff s Ergodic Theorem
A Proof of Birkhoff s Ergodic Theorem Joseph Hora September 2, 205 Itroductio I Fall 203, I was learig the basics of ergodic theory, ad I came across this theorem. Oe of my supervisors, Athoy Quas, showed
More informationMIDTERM 3 CALCULUS 2. Monday, December 3, :15 PM to 6:45 PM. Name PRACTICE EXAM SOLUTIONS
MIDTERM 3 CALCULUS MATH 300 FALL 08 Moday, December 3, 08 5:5 PM to 6:45 PM Name PRACTICE EXAM S Please aswer all of the questios, ad show your work. You must explai your aswers to get credit. You will
More informationFrequentist Inference
Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for
More informationLesson 10: Limits and Continuity
www.scimsacademy.com Lesso 10: Limits ad Cotiuity SCIMS Academy 1 Limit of a fuctio The cocept of limit of a fuctio is cetral to all other cocepts i calculus (like cotiuity, derivative, defiite itegrals
More informationMA131 - Analysis 1. Workbook 3 Sequences II
MA3 - Aalysis Workbook 3 Sequeces II Autum 2004 Cotets 2.8 Coverget Sequeces........................ 2.9 Algebra of Limits......................... 2 2.0 Further Useful Results........................
More information6.3 Testing Series With Positive Terms
6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial
More information7.1 Convergence of sequences of random variables
Chapter 7 Limit theorems Throughout this sectio we will assume a probability space (Ω, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite
More informationComparison Study of Series Approximation. and Convergence between Chebyshev. and Legendre Series
Applied Mathematical Scieces, Vol. 7, 03, o. 6, 3-337 HIKARI Ltd, www.m-hikari.com http://d.doi.org/0.988/ams.03.3430 Compariso Study of Series Approimatio ad Covergece betwee Chebyshev ad Legedre Series
More information