Collaborative Ranking for Local Preferences Supplement


Berk Kapicioglu (YP), David S. Rosenberg (YP), Robert E. Schapire (Princeton University), Tony Jebara (Columbia University)

1 Problem Formulation

Let $U = \{1, \ldots, m\}$ be the set of users, let $V = \{1, \ldots, n\}$ be the set of items, and let $T$ denote the finite set of local time indices. Then, the sample space is defined as
\[
X = \{ (u, C, i, t) \mid u \in U,\; C \subseteq V,\; i \in C,\; t \in T \}. \tag{1}
\]
Let $P[\cdot]$ denote probability, let $C_{-i}$ be the set $C$ excluding element $i$, and let $c \sim U(C)$ mean that $c$ is sampled uniformly from $C$. Then, the local ranking loss associated with hypothesis $g$ is
\[
L_g(u, C, i, t) = P_{c \sim U(C_{-i})}\big[ g(u, i, t) - g(u, c, t) \le 0 \big]. \tag{2}
\]

2 A Bound on the Generalization Error

We assume that the hypothesis class is based on the set of low-rank matrices. Given a low-rank matrix $M$, let $g_M \in F$ be the associated hypothesis, where $g_M(u, i) = M_{u,i}$; in this section we suppress the time index and write samples as $(u, C, i)$. Throughout the paper, we abuse notation and use $g_M$ and $M$ interchangeably. We assume that data is generated with respect to $D$, which is an unknown probability distribution over the sample space $X$, and we let $E$ denote expectation. Then, the generalization error of hypothesis $M$ is $E_{(u,C,i) \sim D}\, L_M(u, C, i)$, which is the quantity we bound below.

We derive the generalization bound in two steps. In the first step, we bound the empirical Rademacher complexity of our loss class, defined below, with respect to samples that contain exactly 2 candidates, and in the second step, we prove the generalization bound with a reduction to the first step.

Lemma 1. Let $m$ be the number of users and let $n$ be the number of items. Define $L_r = \{ L_M \mid M \in \mathbb{R}^{m \times n} \text{ has rank at most } r \}$ as the class of loss functions associated with low-rank matrices. Assume that $S_2 \subseteq X$ is a set of $k$ samples, where each sample contains exactly 2 candidate items; i.e. if $(u, C, i) \in S_2$, then $|C| = 2$. Let $R_{S_2}(L_r)$ denote the Rademacher complexity of $L_r$ with respect to $S_2$. Then,
\[
R_{S_2}(L_r) \le \sqrt{\frac{2\, r(m+n) \ln \frac{16emn^2}{r(m+n)}}{k}}.
\]

Proof. Because each sample in $S_2$ contains exactly 2 candidates, any hypothesis $L_M \in L_r$ applied to a sample in $S_2$ outputs either 0 or 1. Thus, the set of dichotomies that are realized by $L_r$ on $S_2$, called $\Pi_{L_r}(S_2)$, is well-defined. Using Equation (6) from Boucheron et al. [1], we know that
\[
R_{S_2}(L_r) \le \sqrt{\frac{2 \ln |\Pi_{L_r}(S_2)|}{k}}.
\]
Let $X_2 \subseteq X$ be the set of all samples that contain exactly 2 candidates. Since $\Pi_{L_r}(S_2) \subseteq \Pi_{L_r}(X_2)$, it suffices to bound $|\Pi_{L_r}(X_2)|$.

We bound $|\Pi_{L_r}(X_2)|$ by counting the sign configurations of polynomials, using proof techniques influenced by Srebro et al. [4]. Let $(u, \{i, j\}, i) \in X_2$ be a sample and let $M$ be a hypothesis matrix. Because $M$ has rank at most $r$, it can be written as $M = UV^\top$, where $U \in \mathbb{R}^{m \times r}$ and $V \in \mathbb{R}^{n \times r}$. Let $\mathbf{1}[\cdot]$ denote an indicator function that is 1 if and only if its argument is true. Then, the loss on the sample can be rewritten as
\[
L_M(u, \{i, j\}, i)
= \mathbf{1}\big[ M_{u,i} - M_{u,j} \le 0 \big]
= \mathbf{1}\big[ (UV^\top)_{u,i} - (UV^\top)_{u,j} \le 0 \big]
= \mathbf{1}\Big[ \sum_{a=1}^{r} U_{u,a} \big( V_{i,a} - V_{j,a} \big) \le 0 \Big].
\]
Since the cardinality of $X_2$ is at most $mn^2$, putting it all together, it follows that $|\Pi_{L_r}(X_2)|$ is bounded by the number of sign configurations of $mn^2$ polynomials, each of degree at most 2, over $r(m+n)$ variables. Applying Corollary 3 from Srebro et al. [4], we obtain
\[
|\Pi_{L_r}(X_2)| \le \left( \frac{16emn^2}{r(m+n)} \right)^{r(m+n)}.
\]
Taking logarithms and making basic substitutions yield the desired result.
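The generalization bound that follows compares the expected loss with its empirical counterpart on the training set. As a concrete illustration, the Python sketch below evaluates the 0/1 local ranking loss of Equation (2) for a rank-$r$ hypothesis $M = UV^\top$ on a single sample; the helper name and the toy data are illustrative, not code from the paper.

```python
import numpy as np

def local_ranking_loss(U, V, u, C, i):
    """0/1 local ranking loss of Equation (2) for the hypothesis M = U V^T:
    the fraction of candidates c in C excluding i that are scored at least as
    high as the preferred item i (the time index is suppressed, as in Section 2)."""
    scores = V @ U[u]                      # row u of M = U V^T, one score per item
    others = [c for c in C if c != i]
    return float(np.mean([scores[i] - scores[c] <= 0 for c in others]))

# Toy usage: 3 users, 5 items, rank 2; C is a candidate set containing i.
rng = np.random.default_rng(0)
U = rng.normal(size=(3, 2))
V = rng.normal(size=(5, 2))
print(local_ranking_loss(U, V, u=1, C={0, 2, 4}, i=2))
```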

We proceed to prove the more general result via a reduction to Lemma 1.

Theorem 1. Let $m$ be the number of users and let $n$ be the number of items. Assume that $S$ consists of $k$ independently and identically distributed samples chosen from $X$ with respect to a probability distribution $D$. Let $L_M$ be the loss function associated with a matrix $M$, as defined in Equation (2). Then, with probability at least $1 - \delta$, for any matrix $M \in \mathbb{R}^{m \times n}$ with rank at most $r$,
\[
E_{(u,C,i) \sim D}\, L_M(u, C, i)
\;\le\;
E_{(u,C,i) \sim S}\, L_M(u, C, i)
+ \sqrt{\frac{2\, r(m+n) \ln \frac{16emn^2}{r(m+n)}}{k}}
+ \sqrt{\frac{\ln \frac{1}{\delta}}{2k}}. \tag{3}
\]

Proof. We manipulate the definition of Rademacher complexity [1] in order to use the bound given in Lemma 1:
\begin{align*}
R_S(L_r)
&= E_\sigma \Big[ \sup_{L_M \in L_r} \frac{1}{k} \sum_{a=1}^{k} \sigma_a L_M(u_a, C_a, i_a) \Big] \\
&= E_\sigma \Big[ \sup_{L_M \in L_r} \frac{1}{k} \sum_{a=1}^{k} \sigma_a\, E_{j_a \sim U(C_a \setminus \{i_a\})}\, L_M(u_a, \{i_a, j_a\}, i_a) \Big] \\
&= E_\sigma \Big[ \sup_{L_M \in L_r} E_{j_1, \ldots, j_k} \frac{1}{k} \sum_{a=1}^{k} \sigma_a L_M(u_a, \{i_a, j_a\}, i_a) \Big] \\
&\le E_{j_1, \ldots, j_k} E_\sigma \Big[ \sup_{L_M \in L_r} \frac{1}{k} \sum_{a=1}^{k} \sigma_a L_M(u_a, \{i_a, j_a\}, i_a) \Big] \\
&= E_{j_1, \ldots, j_k} \big[ R_{S_2}(L_r) \big]
\;\le\; \sqrt{\frac{2\, r(m+n) \ln \frac{16emn^2}{r(m+n)}}{k}}.
\end{align*}
Plugging this bound into Theorem 3 in Boucheron et al. [1] proves the theorem.

3 Collaborative Local Ranking

Let $h(x) = \max(0, 1 - x)$ be the hinge function, let $M$ be the hypothesis matrix with rank at most $r$, and let $M = UV^\top$, where $U \in \mathbb{R}^{m \times r}$ and $V \in \mathbb{R}^{n \times r}$. Then, we can bound the empirical local ranking loss as
\begin{align*}
E_{(u,C,i) \sim S}\, L_M(u, C, i)
&= \frac{1}{k} \sum_{(u,C,i) \in S} P_{c \sim U(C_{-i})}\big[ M_{u,i} - M_{u,c} \le 0 \big] \\
&= \frac{1}{k} \sum_{(u,C,i) \in S} \frac{1}{|C_{-i}|} \sum_{c \in C_{-i}} \mathbf{1}\big[ (UV^\top)_{u,i} - (UV^\top)_{u,c} \le 0 \big] \\
&\le \frac{1}{k} \sum_{(u,C,i) \in S} \frac{1}{|C_{-i}|} \sum_{c \in C_{-i}} h\big( (UV^\top)_{u,i} - (UV^\top)_{u,c} \big). \tag{4}
\end{align*}

We note that the CLR and the ranking SVM [2] objectives are closely related. If $V$ is fixed and we only need to minimize over $U$, then each row of $V$ acts as a feature vector for the corresponding item, each row of $U$ acts as a separate linear predictor, and the CLR objective decomposes into solving $m$ simultaneous ranking SVM problems. In particular, let $S_u = \{(a, C, i) \in S \mid a = u\}$ be the examples that correspond to user $u$, let $U_{u,\cdot}$ denote row $u$ of $U$, and let $f_{\mathrm{rsvm}}$ denote the objective function of ranking SVM. Then
\[
f_{\mathrm{CLR}}(S; U, V)
= \frac{\lambda}{2} \|U\|_F^2 + \frac{1}{k} \sum_{(u,C,i) \in S} \frac{1}{|C_{-i}|} \sum_{c \in C_{-i}} h\big( (UV^\top)_{u,i} - (UV^\top)_{u,c} \big)
= \sum_{u=1}^{m} \Big( \frac{\lambda}{2} \|U_{u,\cdot}\|^2 + \frac{1}{k} \sum_{(u,C,i) \in S_u} \frac{1}{|C_{-i}|} \sum_{c \in C_{-i}} h\big( U_{u,\cdot} (V_{i,\cdot} - V_{c,\cdot})^\top \big) \Big),
\]
and each summand is, up to the normalization of the empirical average, a ranking SVM objective $f_{\mathrm{rsvm}}(S_u; U_{u,\cdot}, V)$ in which the rows of $V$ play the role of fixed feature vectors.

4 Algorithms

We optimize the CLR objective with alternating minimization (Algorithm 1), solving each convex subproblem with projected stochastic subgradient descent (Algorithms 2 and 3).

4.1 Derivation

Let $(u, C, i) \in S$ be an example; then the corresponding approximate objective function for the subproblem in $V$ is
\[
f_{\mathrm{CLR}}((u, C, i); U, V) = \frac{\lambda}{2} \|V\|_F^2 + \frac{1}{|C_{-i}|} \sum_{c \in C_{-i}} h\big( (UV^\top)_{u,i} - (UV^\top)_{u,c} \big).
\]
We introduce some matrix notation to help us define the approximate subgradients.
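As a concrete rendering of the hinge bound (4) and of the per-example approximate objective above, the following Python sketch evaluates both quantities for the subproblem in which $U$ is held fixed and $V$ is optimized. Function names such as `clr_example_objective` are illustrative, not from the paper, and the code is a sketch of the expressions defined above rather than the authors' implementation.

```python
import numpy as np

def hinge(x):
    """h(x) = max(0, 1 - x)."""
    return np.maximum(0.0, 1.0 - x)

def empirical_hinge_bound(S, U, V):
    """Hinge upper bound of Equation (4) on the empirical local ranking loss,
    averaged over the k training examples in S (a list of (u, C, i) triples)."""
    total = 0.0
    for (u, C, i) in S:
        others = [c for c in C if c != i]
        margins = np.array([U[u] @ (V[i] - V[c]) for c in others])
        total += hinge(margins).mean()
    return total / len(S)

def clr_example_objective(example, U, V, lam):
    """Per-example approximate objective of Section 4.1 for the subproblem in V:
    (lam/2) * ||V||_F^2 plus the averaged hinge terms of one sampled example."""
    u, C, i = example
    others = [c for c in C if c != i]
    margins = np.array([U[u] @ (V[i] - V[c]) for c in others])
    return 0.5 * lam * np.sum(V * V) + hinge(margins).mean()
```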

Given a matrix $M$, let $M_{s,\cdot}$ denote row $s$ of $M$. Define the matrix $\hat{M}^{p,q,z}$, for $p \ne q$, as
\[
\hat{M}^{p,q,z}_{s,\cdot} =
\begin{cases}
M_{z,\cdot} & \text{for } s = p, \\
-M_{z,\cdot} & \text{for } s = q, \\
0 & \text{otherwise},
\end{cases} \tag{5}
\]
and define the matrix $\check{M}^{p,q,z}$ as
\[
\check{M}^{p,q,z}_{s,\cdot} =
\begin{cases}
M_{p,\cdot} - M_{q,\cdot} & \text{for } s = z, \\
0 & \text{otherwise}.
\end{cases} \tag{6}
\]
Let $\mathbf{1}[\cdot]$ denote an indicator function that is 1 if and only if its argument is true. Then, the subgradient of the approximate objective function with respect to $V$ is
\[
\nabla_V f_{\mathrm{CLR}}((u, C, i); U, V)
= \lambda V - \frac{1}{|C_{-i}|} \sum_{c \in C_{-i}} \hat{U}^{i,c,u}\, \mathbf{1}\big[ (UV^\top)_{u,i} - (UV^\top)_{u,c} < 1 \big]. \tag{7}
\]
Setting $\eta_t = \frac{1}{\lambda t}$ as the learning rate at iteration $t$, the approximate subgradient update becomes $V_{t+1} \leftarrow V_t - \eta_t \nabla_V f_{\mathrm{CLR}}((u, C, i); U, V_t)$. After the update, the weights are projected onto a ball with radius $\frac{1}{\sqrt{\lambda}}$. The pseudocode for the outer loop and for optimizing both convex subproblems is depicted in Algorithms 1, 2, and 3. We prove the correctness of the algorithms and bound their running time in the next subsection.

Algorithm 1 Alternating minimization for optimizing the CLR objective
Input: training data $S \subseteq X$, regularization parameter $\lambda > 0$, rank constraint $r$, number of iterations $T$
1: $U_1 \leftarrow$ matrix sampled uniformly at random from $\big[ -\frac{1}{\sqrt{mr\lambda}}, \frac{1}{\sqrt{mr\lambda}} \big]^{m \times r}$
2: $V_1 \leftarrow$ matrix sampled uniformly at random from $\big[ -\frac{1}{\sqrt{nr\lambda}}, \frac{1}{\sqrt{nr\lambda}} \big]^{n \times r}$
3: for all $t$ from 1 to $T$ do
4:   $U_{t+1} \leftarrow \arg\min_U f_{\mathrm{CLR}}(S; U, V_t)$
5:   $V_{t+1} \leftarrow \arg\min_V f_{\mathrm{CLR}}(S; U_{t+1}, V)$
6: return $U_{T+1}, V_{T+1}$

Algorithm 2 Projected stochastic subgradient descent for optimizing $U$
Input: factors $V \in \mathbb{R}^{n \times r}$, training data $S$, regularization parameter $\lambda$, rank constraint $r$, number of iterations $T$
1: $U_1 \leftarrow 0^{m \times r}$
2: for all $t$ from 1 to $T$ do
3:   choose $(u, C, i) \in S$ uniformly at random
4:   $\eta_t \leftarrow \frac{1}{\lambda t}$
5:   $C' \leftarrow \{ c \in C_{-i} \mid (U_t V^\top)_{u,i} - (U_t V^\top)_{u,c} < 1 \}$
6:   $U_{t+1} \leftarrow (1 - \eta_t \lambda)\, U_t + \frac{\eta_t}{|C_{-i}|} \sum_{c \in C'} \check{V}^{i,c,u}$
7:   $U_{t+1} \leftarrow \min\big\{ 1, \frac{1/\sqrt{\lambda}}{\|U_{t+1}\|_F} \big\}\, U_{t+1}$
8: return $U_{T+1}$

Algorithm 3 Projected stochastic subgradient descent for optimizing $V$
Input: factors $U \in \mathbb{R}^{m \times r}$, training data $S$, regularization parameter $\lambda$, rank constraint $r$, number of iterations $T$
1: $V_1 \leftarrow 0^{n \times r}$
2: for all $t$ from 1 to $T$ do
3:   choose $(u, C, i) \in S$ uniformly at random
4:   $\eta_t \leftarrow \frac{1}{\lambda t}$
5:   $C' \leftarrow \{ c \in C_{-i} \mid (U V_t^\top)_{u,i} - (U V_t^\top)_{u,c} < 1 \}$
6:   $V_{t+1} \leftarrow (1 - \eta_t \lambda)\, V_t + \frac{\eta_t}{|C_{-i}|} \sum_{c \in C'} \hat{U}^{i,c,u}$
7:   $V_{t+1} \leftarrow \min\big\{ 1, \frac{1/\sqrt{\lambda}}{\|V_{t+1}\|_F} \big\}\, V_{t+1}$
8: return $V_{T+1}$

4.2 Analysis

The convex subproblems we analyze have the general form
\[
\min_{X \in D} f(X; \ell) = \min_{X \in D}\; \frac{\lambda}{2} \|X\|_F^2 + \frac{1}{k} \sum_{(u,C,i) \in S} \ell\big( X; (u, C, i) \big). \tag{8}
\]
One can obtain the individual subproblems by specifying the domain $D$ and the loss function $\ell$. For example, in the case of Algorithm 2, the corresponding minimization problem is specified by
\[
\min_{X \in \mathbb{R}^{m \times r}} f(X; \ell_V),
\quad \text{where } \ell_V\big( X; (u, C, i) \big) = \frac{1}{|C_{-i}|} \sum_{c \in C_{-i}} h\big( (XV^\top)_{u,i} - (XV^\top)_{u,c} \big), \tag{9}
\]
and in the case of Algorithm 3, it is specified by
\[
\min_{X \in \mathbb{R}^{n \times r}} f(X; \ell_U),
\quad \text{where } \ell_U\big( X; (u, C, i) \big) = \frac{1}{|C_{-i}|} \sum_{c \in C_{-i}} h\big( (UX^\top)_{u,i} - (UX^\top)_{u,c} \big). \tag{10}
\]
Let $U^* = \arg\min_U f(U; \ell_V)$ and $V^* = \arg\min_V f(V; \ell_U)$ denote the solution matrices of Equations (9) and (10), respectively.
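The Python sketch below mirrors the update rule of Algorithm 2 as reconstructed above: a Pegasos-style step size $\eta_t = 1/(\lambda t)$, a subgradient step on one sampled example, and projection onto the Frobenius ball of radius $1/\sqrt{\lambda}$. It is an illustrative sketch under these assumptions, with hypothetical helper names, not the authors' reference implementation; the analogous update for $V$ adds $\pm U_{u,\cdot}$ to rows $i$ and $c$, matching $\hat{U}^{i,c,u}$ in Equation (7).

```python
import numpy as np

def pssd_update_U(S, V, lam, T, rng=None):
    """Projected stochastic subgradient descent for the U-subproblem of
    Equation (9), in the spirit of Algorithm 2."""
    rng = rng if rng is not None else np.random.default_rng(0)
    m = 1 + max(u for (u, C, i) in S)
    U = np.zeros((m, V.shape[1]))
    for t in range(1, T + 1):
        u, C, i = S[rng.integers(len(S))]          # step 3: sample an example
        eta = 1.0 / (lam * t)                       # step 4: learning rate
        others = [c for c in C if c != i]
        # step 5: candidates whose hinge term is active (margin below 1)
        violators = [c for c in others if U[u] @ (V[i] - V[c]) < 1.0]
        # step 6: shrink all of U, then add the averaged loss subgradient to row u
        U *= (1.0 - eta * lam)
        for c in violators:
            U[u] += (eta / len(others)) * (V[i] - V[c])
        # step 7: project onto the ball of radius 1/sqrt(lam)
        radius = 1.0 / np.sqrt(lam)
        norm = np.linalg.norm(U)
        if norm > radius:
            U *= radius / norm
    return U
```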

Also, given a general convex loss $\ell$ and domain $D$, we call $X_D$ an $\epsilon$-accurate solution for the corresponding minimization problem if $f(X_D; \ell) \le \min_{X \in D} f(X; \ell) + \epsilon$.

In the remainder of this subsection, we show that Algorithms 2 and 3 are adaptations of the Pegasos [3] algorithm to the CLR setting. Then, we prove certain properties that are prerequisites for obtaining Pegasos's performance guarantees. In particular, we show that the approximate subgradients computed by Algorithms 2 and 3 are bounded and that the loss functions associated with Equations (9) and (10) are convex. In the end, we plug these properties into a theorem proved by Shalev-Shwartz et al. [3] to show that our algorithms reach an $\epsilon$-accurate solution with respect to their corresponding minimization problems in $\tilde{O}\big( \frac{1}{\lambda \epsilon} \big)$ iterations.

Lemma 2. $\|U^*\|_F \le \frac{1}{\sqrt{\lambda}}$ and $\|V^*\|_F \le \frac{1}{\sqrt{\lambda}}$.

Proof. One can obtain the bounds on the norms of the optimal solutions by examining the dual form of the optimization problems and applying the strong duality theorem. Equations (9) and (10) can both be represented as
\[
\min_{v \in D}\; \frac{\lambda}{2} \|v\|^2 + \sum_{i=1}^{N} e_i\, h\big( f_i(v) \big), \tag{11}
\]
where each $e_i > 0$ is a constant, $h$ is the hinge function, $D$ is a Euclidean space, and each $f_i$ is a linear function. We rewrite Equation (11) as a constrained optimization problem
\[
\min_{v \in D,\; \xi \in \mathbb{R}^N}\; \frac{\lambda}{2} \|v\|^2 + \sum_{i=1}^{N} e_i \xi_i
\quad \text{subject to} \quad \xi_i \ge 1 - f_i(v), \;\; \xi_i \ge 0, \;\; i = 1, \ldots, N. \tag{12}
\]
The Lagrangian of this problem is
\begin{align*}
L(v, \xi, \alpha, \beta)
&= \frac{\lambda}{2} \|v\|^2 + \sum_i e_i \xi_i + \sum_i \alpha_i \big( 1 - f_i(v) - \xi_i \big) - \sum_i \beta_i \xi_i \\
&= \frac{\lambda}{2} \|v\|^2 + \sum_i \xi_i \big( e_i - \alpha_i - \beta_i \big) + \sum_i \alpha_i \big( 1 - f_i(v) \big),
\end{align*}
and its dual function is $g(\alpha, \beta) = \inf_{v, \xi} L(v, \xi, \alpha, \beta)$. Since $L(v, \xi, \alpha, \beta)$ is convex and differentiable with respect to $v$ and $\xi$, the necessary and sufficient conditions for minimizing over $v$ and $\xi$ are
\[
\nabla_v L = 0 \;\Longrightarrow\; v = \frac{1}{\lambda} \sum_i \alpha_i \nabla_v f_i(v),
\qquad
\frac{\partial L}{\partial \xi_i} = 0 \;\Longrightarrow\; e_i - \alpha_i - \beta_i = 0. \tag{13}
\]
We plug these conditions back into the dual function and obtain
\[
g(\alpha, \beta) = \inf_{v, \xi} L(v, \xi, \alpha, \beta)
= \frac{\lambda}{2} \|v\|^2 + \sum_i \alpha_i - \sum_i \alpha_i f_i(v),
\quad \text{with } v = \frac{1}{\lambda} \sum_i \alpha_i \nabla_v f_i(v). \tag{14}
\]
Since each $f_i$ is a linear function, we let $f_i(v) = \langle a_i, v \rangle$, where $a_i$ is a constant vector, so that $\nabla_v f_i(v) = a_i$ and $v = \frac{1}{\lambda} \sum_i \alpha_i a_i$. Then
\[
\sum_i \alpha_i f_i(v) = \Big\langle \sum_i \alpha_i a_i,\, v \Big\rangle = \lambda \|v\|^2. \tag{15}
\]
Simplifying Equation (14) using Equation (15) yields
\[
g(\alpha, \beta) = \sum_i \alpha_i - \frac{\lambda}{2} \|v\|^2 = \sum_i \alpha_i - \frac{1}{2\lambda} \Big\| \sum_i \alpha_i a_i \Big\|^2. \tag{16}
\]
Finally, we combine Equations (13) and (16), and obtain the dual form of Equation (12):
\[
\max_{\alpha}\; \sum_i \alpha_i - \frac{1}{2\lambda} \Big\| \sum_i \alpha_i a_i \Big\|^2
\quad \text{subject to} \quad 0 \le \alpha_i \le e_i, \;\; i = 1, \ldots, N. \tag{17}
\]
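As a quick numerical illustration of the conclusion of Lemma 2, the sketch below minimizes a one-dimensional instance of Equation (11) by grid search, using weights $e_i$ that sum to one as in the proof, and checks that the minimizer stays inside the ball of radius $1/\sqrt{\lambda}$. The instance values are arbitrary and chosen only for illustration.

```python
import numpy as np

# Minimize (lam/2) v^2 + sum_i e_i * max(0, 1 - a_i * v) over scalar v.
lam = 0.3
a = np.array([0.5, -1.2, 2.0, 0.8])
e = np.full(len(a), 1.0 / len(a))           # weights sum to 1, as in the proof

v_grid = np.linspace(-10.0, 10.0, 200001)
hinges = np.maximum(0.0, 1.0 - np.outer(a, v_grid))
objective = 0.5 * lam * v_grid**2 + (e[:, None] * hinges).sum(axis=0)
v_star = v_grid[np.argmin(objective)]

print(abs(v_star), 1.0 / np.sqrt(lam))       # |v*| should not exceed 1/sqrt(lam)
assert abs(v_star) <= 1.0 / np.sqrt(lam) + 1e-6
```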

The primal problem is convex, its constraints are linear, and the domain of its objective is open; thus, Slater's condition holds and strong duality is obtained. Furthermore, the primal problem has differentiable objective and constraint functions, which implies that $(v^*, \xi^*)$ is primal optimal and $(\alpha^*, \beta^*)$ is dual optimal if and only if these points satisfy the Karush-Kuhn-Tucker (KKT) conditions. It follows that
\[
v^* = \frac{1}{\lambda} \sum_i \alpha_i^* a_i. \tag{18}
\]
Note that we defined $e_i = \frac{1}{k\,|C_{-i}|}$, where $\sum_i e_i = 1$, and the constraints of the dual problem imply $0 \le \alpha_i^* \le e_i$; thus, $\sum_i \alpha_i^* \le 1$. Because of strong duality, there is no duality gap, and the primal and dual objectives are equal at the optimum:
\[
\frac{\lambda}{2} \|v^*\|^2 + \sum_i e_i \xi_i^*
= \sum_i \alpha_i^* - \frac{\lambda}{2} \|v^*\|^2 \quad \text{(by (18))}.
\]
Since $\sum_i e_i \xi_i^* \ge 0$, this gives $\lambda \|v^*\|^2 \le \sum_i \alpha_i^* \le 1$, and hence $\|v^*\| \le \frac{1}{\sqrt{\lambda}}$. This proves the lemma.

Given the bounds in Lemma 2, it can be verified that Algorithms 2 and 3 are adaptations of the Pegasos [3] algorithm for optimizing Equations (9) and (10), respectively. It still remains to show that Pegasos's performance guarantees hold in our case.

Lemma 3. In Algorithms 2 and 3, the approximate subgradients have norm at most $\sqrt{\lambda} + \frac{2}{\sqrt{\lambda}}$.

Proof. The approximate subgradient for Algorithm 3 is depicted in Equation (7). Due to the projection step, $\|V\|_F \le \frac{1}{\sqrt{\lambda}}$, and it follows that $\lambda \|V\|_F \le \sqrt{\lambda}$. The term $\hat{U}^{i,c,u}$ is constructed using Equation (5), and it can be verified that $\|\hat{U}^{i,c,u}\|_F \le \sqrt{2}\, \|U\|_F$, where the fixed factor $U$ satisfies $\|U\|_F \le \frac{1}{\sqrt{\lambda}}$ by the initialization and the projection in Algorithm 2. Using the triangle inequality, one can bound Equation (7) by $\sqrt{\lambda} + \frac{\sqrt{2}}{\sqrt{\lambda}}$. A similar argument can be made for the approximate subgradient of Algorithm 2, yielding the slightly higher upper bound given in the lemma statement.

We combine the lemmas to obtain the correctness and running time guarantees for our algorithms.

Lemma 4. Let $\lambda \le \frac{1}{4}$, let $T$ be the total number of iterations of Algorithm 2, and let $U_t$ denote the parameter computed by the algorithm at iteration $t$. Let $\bar{U} = \frac{1}{T} \sum_{t=1}^{T} U_t$ denote the average of the parameters produced by the algorithm. Then, with probability at least $1 - \delta$,
\[
f(\bar{U}; \ell_V) \le f(U^*; \ell_V) + O\!\left( \frac{\ln \frac{T}{\delta}}{\lambda T} \right).
\]
The analogous result holds for Algorithm 3 as well.

Proof. First, for each loss function $\ell_V$ and $\ell_U$, variables are linearly combined, composed with the convex hinge function, and then averaged. All these operations preserve convexity, hence both loss functions are convex. Second, we have argued above that Algorithms 2 and 3 are adaptations of the Pegasos [3] algorithm for optimizing Equations (9) and (10), respectively. Third, in Lemma 3, we proved a bound on the approximate subgradients of both algorithms. Plugging these three results into the corresponding corollary in Shalev-Shwartz et al. [3] yields the statement of the lemma.

The theorem below gives a bound in terms of individual parameters rather than averaged parameters.

Theorem 2. Assume that the conditions and the bound in Lemma 4 hold. Let $t$ be an iteration index selected uniformly at random from $\{1, \ldots, T\}$. Then, with probability at least $\frac{1}{4}$,
\[
f(U_t; \ell_V) \le f(U^*; \ell_V) + O\!\left( \frac{\ln \frac{T}{\delta}}{\lambda T} \right).
\]
The analogous result holds for Algorithm 3 as well.

Proof. The result follows directly from combining Lemma 4 with Lemma 3 in Shalev-Shwartz et al. [3].

Thus, with high probability, our algorithms reach an $\epsilon$-accurate solution in $\tilde{O}\big( \frac{1}{\lambda \epsilon} \big)$ iterations. Since we argued in Subsection 4.1 that the running time of each stochastic update is $O(br)$, where $b$ bounds the size of the candidate sets, it follows that a complete run of projected stochastic subgradient descent takes $\tilde{O}\big( \frac{br}{\lambda \epsilon} \big)$ time, and the running time is independent of the size of the training data.
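Lemma 4 and Theorem 2 suggest two ways of extracting a final solution from a $T$-step run of the stochastic solver: return the averaged parameter $\bar{U}$ or a uniformly random iterate. The short sketch below computes both; the helper name is illustrative and not from the paper.

```python
import numpy as np

def average_and_random_iterate(iterates, rng=None):
    """Given the list [U_1, ..., U_T] produced by the subgradient solver, return
    the averaged parameter of Lemma 4 and a uniformly random iterate as in Theorem 2."""
    rng = rng if rng is not None else np.random.default_rng(0)
    U_bar = np.mean(iterates, axis=0)                # averaged parameter
    U_rand = iterates[rng.integers(len(iterates))]   # uniformly random iterate
    return U_bar, U_rand
```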

References

[1] Stéphane Boucheron, Olivier Bousquet, and Gábor Lugosi. Theory of classification: a survey of some recent advances. ESAIM: Probability and Statistics, 9:323-375, 2005.

[2] Thorsten Joachims. Optimizing search engines using clickthrough data. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '02, pages 133-142, New York, NY, USA, 2002. ACM.

[3] Shai Shalev-Shwartz, Yoram Singer, Nathan Srebro, and Andrew Cotter. Pegasos: primal estimated sub-gradient solver for SVM. Math. Program., 127(1):3-30, March 2011.

[4] Nathan Srebro, Noga Alon, and Tommi Jaakkola. Generalization error bounds for collaborative prediction with low-rank matrices. In NIPS, 2004.
