Collaborative Ranking for Local Preferences Supplement
Berk Kapicioglu (YP), David S. Rosenberg (YP), Robert E. Schapire (Princeton University), Tony Jebara (Columbia University)

1 Problem Formulation

Let $U = \{1, \ldots, m\}$ be the set of users, let $V = \{1, \ldots, n\}$ be the set of items, and let $T = \{1, \ldots, \tau\}$ indicate the local time. Then, the sample space is defined as

$$X = \{(u, C, i, t) \mid u \in U,\ C \subseteq V,\ i \in C,\ t \in T\}. \quad (1)$$

Let $P[\cdot]$ denote probability, let $C_{-i}$ be the set $C$ excluding element $i$, and let $c \sim U(C_{-i})$ mean that $c$ is sampled uniformly from $C_{-i}$. Then, the local ranking loss associated with hypothesis $g$ is

$$L_g(u, C, i, t) = P_{c \sim U(C_{-i})}\left[\, g(u, i, t) - g(u, c, t) \le 0 \,\right]. \quad (2)$$

2 A Bound on the Generalization Error

We assume that the hypothesis class is based on the set of low-rank matrices. Given a low-rank matrix $M$, let $g_M \in F$ be the associated hypothesis, where $g_M(u, i) = M_{u,i}$. Throughout the paper, we abuse notation and use $g_M$ and $M$ interchangeably. We assume that data is generated with respect to $D$, an unknown probability distribution over the sample space $X$, and we let $E$ denote expectation. Then, the generalization error of hypothesis $M$ is $E_{(u,C,i) \sim D}[L_M(u, C, i)]$, which is the quantity we bound below. (The time index $t$ is suppressed in this section to simplify notation.)

We derive the generalization bound in two steps. In the first step, we bound the empirical Rademacher complexity of our loss class, defined below, with respect to samples that contain exactly 2 candidates, and in the second step, we prove the generalization bound with a reduction to the first step.

Lemma 1. Let $m$ be the number of users and let $n$ be the number of items. Define $L_r = \{L_M \mid M \in R^{m \times n} \text{ has rank at most } r\}$ as the class of loss functions associated with low-rank matrices. Assume that $S \subseteq X$ is a set of samples, where each sample contains exactly 2 candidate items; i.e., if $(u, C, i) \in S$, then $|C| = 2$. Let $R_S(L_r)$ denote the empirical Rademacher complexity of $L_r$ with respect to $S$. Then,

$$R_S(L_r) \le \sqrt{\frac{2\, r (m+n) \ln \frac{16 e m n^2}{r(m+n)}}{|S|}}.$$

Proof. Because each sample in $S$ contains exactly 2 candidates, any hypothesis $L_M \in L_r$ applied to a sample in $S$ outputs either 0 or 1. Thus, the set of dichotomies that are realized by $L_r$ on $S$, called $\Pi_{L_r}(S)$, is well-defined. Using Equation (6) from Boucheron et al. [1], we know that

$$R_S(L_r) \le \sqrt{\frac{2 \ln |\Pi_{L_r}(S)|}{|S|}}.$$

Let $X_2 \subseteq X$ be the set of all samples that contain exactly 2 candidates. Since $|\Pi_{L_r}(S)| \le |\Pi_{L_r}(X_2)|$, it suffices to bound $|\Pi_{L_r}(X_2)|$.

We bound $|\Pi_{L_r}(X_2)|$ by counting the sign configurations of polynomials, using proof techniques that are influenced by Srebro et al. [4]. Let $(u, \{i, j\}, i) \in X_2$ be a sample and let $M$ be a hypothesis matrix. Because $M$ has rank at most $r$, it can be written as $M = U V^\top$, where $U \in R^{m \times r}$ and $V \in R^{n \times r}$. Let $\mathbf{1}[\cdot]$ denote the indicator function, which is 1 if and only if its argument is true. Then, the loss on the sample can be rewritten as

$$L_M(u, \{i, j\}, i) = \mathbf{1}\left[ M_{u,i} - M_{u,j} \le 0 \right] = \mathbf{1}\left[ (UV^\top)_{u,i} - (UV^\top)_{u,j} \le 0 \right] = \mathbf{1}\left[ \sum_{a=1}^{r} U_{u,a} \left( V_{i,a} - V_{j,a} \right) \le 0 \right].$$

Since the cardinality of $X_2$ is at most $m n^2$, putting it all together, it follows that $|\Pi_{L_r}(X_2)|$ is bounded by the number of sign configurations of $m n^2$ polynomials, each of degree at most 2, over $r(m+n)$ variables. Applying Corollary 3 from Srebro et al. [4], we obtain

$$|\Pi_{L_r}(X_2)| \le \left( \frac{16 e m n^2}{r(m+n)} \right)^{r(m+n)}.$$

Taking logarithms and making basic substitutions yields the desired result. ∎

We proceed to proving the more general result via a reduction to Lemma 1.

Theorem 1. Let $m$ be the number of users and let $n$ be the number of items. Assume that $S$ consists of independently and identically distributed samples chosen from $X$ with respect to a probability distribution $D$. Let $L_M$ be the loss function associated with a matrix $M$, as defined in Equation (2). Then, with probability at least $1 - \delta$, for any matrix $M \in R^{m \times n}$ with rank at most $r$,

$$E_{(u,C,i) \sim D}\left[ L_M(u, C, i) \right] \le E_{(u,C,i) \sim S}\left[ L_M(u, C, i) \right] + 2 \sqrt{\frac{2\, r (m+n) \ln \frac{16 e m n^2}{r(m+n)}}{|S|}} + \sqrt{\frac{2 \ln \frac{1}{\delta}}{|S|}}. \quad (3)$$

Proof. We will manipulate the definition of Rademacher complexity [1] in order to use the bound given in Lemma 1. Writing $\sigma$ for a vector of independent Rademacher variables, and using the fact that $L_M(u, C, i) = E_{j \sim U(C_{-i})} L_M(u, \{i, j\}, i)$,

$$R_S(L_r) = E_\sigma \left[ \sup_{L_M \in L_r} \frac{1}{|S|} \sum_a \sigma_a L_M(u_a, C_a, i_a) \right] = E_\sigma \left[ \sup_{L_M \in L_r} \frac{1}{|S|} \sum_a \sigma_a\, E_{j_a \sim U(C_a \setminus \{i_a\})} L_M(u_a, \{i_a, j_a\}, i_a) \right]$$
$$\le E_{j_1, \ldots, j_{|S|}} E_\sigma \left[ \sup_{L_M \in L_r} \frac{1}{|S|} \sum_a \sigma_a L_M(u_a, \{i_a, j_a\}, i_a) \right] = E_{j_1, \ldots, j_{|S|}} \left[ R_{S'}(L_r) \right] \le \sqrt{\frac{2\, r (m+n) \ln \frac{16 e m n^2}{r(m+n)}}{|S|}},$$

where the inequality holds because a supremum of an expectation is at most the expectation of the supremum, and $S'$ denotes the sample in which each $C_a$ is replaced by $\{i_a, j_a\}$, so that every sample contains exactly 2 candidates and Lemma 1 applies. Plugging the bound into Theorem 3 in Boucheron et al. [1] proves the theorem. ∎

3 Collaborative Local Ranking

Let $h(x) = \max(0, 1 - x)$ be the hinge function, let $M$ be the hypothesis matrix with rank at most $r$, and let $M = U V^\top$, where $U \in R^{m \times r}$ and $V \in R^{n \times r}$. We can bound the empirical local ranking loss as

$$E_{(u,C,i) \sim S}\left[ L_M(u, C, i) \right] = \frac{1}{|S|} \sum_{(u,C,i) \in S} P_{c \sim U(C_{-i})}\left[ M_{u,i} - M_{u,c} \le 0 \right] = \frac{1}{|S|} \sum_{(u,C,i) \in S} \frac{1}{|C_{-i}|} \sum_{c \in C_{-i}} \mathbf{1}\left[ (UV^\top)_{u,i} - (UV^\top)_{u,c} \le 0 \right]$$
$$\le \frac{1}{|S|} \sum_{(u,C,i) \in S} \frac{1}{|C_{-i}|} \sum_{c \in C_{-i}} h\left( (UV^\top)_{u,i} - (UV^\top)_{u,c} \right). \quad (4)$$

Adding regularization yields the CLR objective

$$f_{CLR}(S; U, V) = \frac{\lambda}{2} \|U\|_F^2 + \frac{1}{|S|} \sum_{(u,C,i) \in S} \frac{1}{|C_{-i}|} \sum_{c \in C_{-i}} h\left( (UV^\top)_{u,i} - (UV^\top)_{u,c} \right).$$

We note that the CLR and the ranking SVM [2] objectives are closely related. If $V$ is fixed and we only need to minimize over $U$, then each row of $V$ acts as a feature vector for the corresponding item, each row of $U$ acts as a separate linear predictor, and the CLR objective decomposes into solving $m$ simultaneous ranking SVM problems. In particular, let $S_u = \{(a, C, i) \in S \mid a = u\}$ be the examples that correspond to user $u$, let $U_u$ denote row $u$ of $U$, and let $f_{rsvm}$ denote the objective function of ranking SVM; then

$$f_{CLR}(S; U, V) = \frac{1}{m} \sum_{u=1}^{m} \left[ \frac{m \lambda}{2} \|U_u\|^2 + \frac{m}{|S|} \sum_{(u,C,i) \in S_u} \frac{1}{|C_{-i}|} \sum_{c \in C_{-i}} h\left( \langle U_u, V_i - V_c \rangle \right) \right] = \frac{1}{m} \sum_{u=1}^{m} f_{rsvm}(S_u; U_u, V),$$

where each bracketed term is, up to rescaling, a ranking SVM objective over the examples of user $u$.

4 Algorithms

Algorithm 1 — Alternating minimization for optimizing the CLR objective.
Input: training data $S \subseteq X$, regularization parameter $\lambda > 0$, rank constraint $r$, number of iterations $T$.
1: $U_0 \leftarrow$ matrix sampled uniformly at random from $[-1/\sqrt{mr},\, 1/\sqrt{mr}]^{m \times r}$
2: $V_0 \leftarrow$ matrix sampled uniformly at random from $[-1/\sqrt{nr},\, 1/\sqrt{nr}]^{n \times r}$
3: for all $t$ from 1 to $T$ do
4:   $U_t \leftarrow \arg\min_U f_{CLR}(S; U, V_{t-1})$
5:   $V_t \leftarrow \arg\min_V f_{CLR}(S; U_t, V)$
6: return $U_T, V_T$

4.1 Derivation

Let $(u, C, i) \in S$ be an example; then the corresponding approximate objective function, written here for the subproblem in $V$ (the subproblem in $U$ is analogous), is

$$f_{CLR}((u, C, i); U, V) = \frac{\lambda}{2} \|V\|_F^2 + \frac{1}{|C_{-i}|} \sum_{c \in C_{-i}} h\left( (UV^\top)_{u,i} - (UV^\top)_{u,c} \right).$$

We introduce various matrix notation to help us define the approximate subgradients. Given a matrix $M$, let $M_z$ denote row $z$ of $M$. Define the matrix $\hat{M}^{p,q,z}$, for $p \ne q$, row-wise as

$$\hat{M}^{p,q,z}_s = \begin{cases} M_z & \text{for } s = p, \\ -M_z & \text{for } s = q, \\ 0 & \text{otherwise}, \end{cases} \quad (5)$$

and define the matrix $\check{M}^{p,q,z}$ as

$$\check{M}^{p,q,z}_s = \begin{cases} M_p - M_q & \text{for } s = z, \\ 0 & \text{otherwise}. \end{cases} \quad (6)$$

Let $\mathbf{1}[\cdot]$ denote the indicator function, which is 1 if and only if its argument is true. Then, the subgradient of the approximate objective function with respect to $V$ is

$$\nabla_V f_{CLR}((u, C, i); U, V) = \lambda V - \frac{1}{|C_{-i}|} \sum_{c \in C_{-i}} \mathbf{1}\left[ (UV^\top)_{u,i} - (UV^\top)_{u,c} < 1 \right] \hat{U}^{i,c,u}. \quad (7)$$

Setting $\eta_t = \frac{1}{\lambda t}$ as the learning rate at iteration $t$, the approximate subgradient update becomes $V_{t+1} \leftarrow V_t - \eta_t \nabla_V f_{CLR}((u, C, i); U, V_t)$. After the update, the weights are projected onto a ball with radius $1/\sqrt{\lambda}$. The pseudocode for optimizing both convex subproblems is depicted in Algorithms 2 and 3. We prove the correctness of the algorithms and bound their running time in the next subsection.

Algorithm 2 — Projected stochastic subgradient descent for optimizing $U$.
Input: factors $V \in R^{n \times r}$, training data $S$, regularization parameter $\lambda$, rank constraint $r$, number of iterations $T$.
1: $U_1 \leftarrow 0^{m \times r}$
2: for all $t$ from 1 to $T$ do
3:   Choose $(u, C, i) \in S$ uniformly at random
4:   $\eta_t \leftarrow \frac{1}{\lambda t}$
5:   $C^+ \leftarrow \{c \in C_{-i} \mid (U_t V^\top)_{u,i} - (U_t V^\top)_{u,c} < 1\}$
6:   $U_{t+\frac{1}{2}} \leftarrow (1 - \eta_t \lambda)\, U_t + \frac{\eta_t}{|C_{-i}|} \sum_{c \in C^+} \check{V}^{i,c,u}$
7:   $U_{t+1} \leftarrow \min\left\{1,\ \frac{1/\sqrt{\lambda}}{\|U_{t+\frac{1}{2}}\|_F}\right\} U_{t+\frac{1}{2}}$
8: return $U_{T+1}$

Algorithm 3 — Projected stochastic subgradient descent for optimizing $V$.
Input: factors $U \in R^{m \times r}$, training data $S$, regularization parameter $\lambda$, rank constraint $r$, number of iterations $T$.
1: $V_1 \leftarrow 0^{n \times r}$
2: for all $t$ from 1 to $T$ do
3:   Choose $(u, C, i) \in S$ uniformly at random
4:   $\eta_t \leftarrow \frac{1}{\lambda t}$
5:   $C^+ \leftarrow \{c \in C_{-i} \mid (U V_t^\top)_{u,i} - (U V_t^\top)_{u,c} < 1\}$
6:   $V_{t+\frac{1}{2}} \leftarrow (1 - \eta_t \lambda)\, V_t + \frac{\eta_t}{|C_{-i}|} \sum_{c \in C^+} \hat{U}^{i,c,u}$
7:   $V_{t+1} \leftarrow \min\left\{1,\ \frac{1/\sqrt{\lambda}}{\|V_{t+\frac{1}{2}}\|_F}\right\} V_{t+\frac{1}{2}}$
8: return $V_{T+1}$

4.2 Analysis

The convex subproblems we analyze have the general form

$$\min_{X \in D} f(X; \ell) = \min_{X \in D} \frac{\lambda}{2} \|X\|_F^2 + \frac{1}{|S|} \sum_{(u,C,i) \in S} \ell(X; (u, C, i)). \quad (8)$$

One can obtain the individual subproblems by specifying the domain $D$ and the loss function $\ell$. For example, in the case of Algorithm 2, the corresponding minimization problem is specified by

$$\min_{X \in R^{m \times r}} f(X; \ell_V), \quad \text{where } \ell_V(X; (u, C, i)) = \frac{1}{|C_{-i}|} \sum_{c \in C_{-i}} h\left( (X V^\top)_{u,i} - (X V^\top)_{u,c} \right), \quad (9)$$

and in the case of Algorithm 3, it is specified by

$$\min_{X \in R^{n \times r}} f(X; \ell_U), \quad \text{where } \ell_U(X; (u, C, i)) = \frac{1}{|C_{-i}|} \sum_{c \in C_{-i}} h\left( (U X^\top)_{u,i} - (U X^\top)_{u,c} \right). \quad (10)$$

Let $U^* = \arg\min_U f(U; \ell_V)$ and $V^* = \arg\min_V f(V; \ell_U)$ denote the solution matrices of Equations (9) and (10), respectively. Also, given a general convex loss $\ell$ and domain $D$, call $\tilde{X} \in D$ an $\epsilon$-accurate solution for the corresponding minimization problem if $f(\tilde{X}; \ell) \le \min_{X \in D} f(X; \ell) + \epsilon$.

In the remainder of this subsection, we show that Algorithms 2 and 3 are adaptations of the Pegasos [3] algorithm to the CLR setting. Then, we prove certain properties that are prerequisites for obtaining Pegasos's performance guarantees. In particular, we show that the approximate subgradients computed by Algorithms 2 and 3 are bounded and that the loss functions associated with Equations (9) and (10) are convex. In the end, we plug these properties into a theorem proved by Shalev-Shwartz et al. [3] to show that our algorithms reach an $\epsilon$-accurate solution with respect to their corresponding minimization problems in $\tilde{O}\!\left(\frac{1}{\lambda \epsilon}\right)$ iterations.

Lemma 2. $\|U^*\|_F \le \frac{1}{\sqrt{\lambda}}$ and $\|V^*\|_F \le \frac{1}{\sqrt{\lambda}}$.

Proof. One can obtain the bounds on the norms of the optimal solutions by examining the dual form of the optimization problems and applying the strong duality theorem. Equations (9) and (10) can both be represented as

$$\min_{v \in D} \frac{\lambda}{2} \|v\|^2 + \sum_{k=1}^{K} e_k\, h(f_k(v)), \quad (11)$$

where each $e_k \ge 0$ is a constant (the averaging weights of the loss, which satisfy $\sum_k e_k = 1$), $h$ is the hinge function, $D$ is a Euclidean space, and each $f_k$ is a linear function. We rewrite Equation (11) as a constrained optimization problem:

$$\min_{v \in D,\ \xi \in R^K} \frac{\lambda}{2} \|v\|^2 + \sum_{k} e_k \xi_k \quad (12)$$
$$\text{subject to } \xi_k \ge 1 - f_k(v), \quad \xi_k \ge 0, \quad k = 1, \ldots, K.$$

The Lagrangian of this problem is

$$L(v, \xi, \alpha, \beta) = \frac{\lambda}{2} \|v\|^2 + \sum_k e_k \xi_k + \sum_k \alpha_k \left( 1 - f_k(v) - \xi_k \right) - \sum_k \beta_k \xi_k = \frac{\lambda}{2} \|v\|^2 + \sum_k \xi_k (e_k - \alpha_k - \beta_k) + \sum_k \alpha_k (1 - f_k(v)),$$

and its dual function is $g(\alpha, \beta) = \inf_{v, \xi} L(v, \xi, \alpha, \beta)$. Since $L(v, \xi, \alpha, \beta)$ is convex and differentiable with respect to $v$ and $\xi$, the necessary and sufficient conditions for minimizing over $v$ and $\xi$ are

$$\nabla_v L = 0 \implies \lambda v = \sum_k \alpha_k \nabla_v f_k(v), \qquad \frac{\partial L}{\partial \xi_k} = 0 \implies e_k = \alpha_k + \beta_k. \quad (13)$$

We plug these conditions back into the dual function and obtain

$$g(\alpha, \beta) = \frac{\lambda}{2} \|v\|^2 + \sum_k \alpha_k - \sum_k \alpha_k f_k(v). \quad (14)$$

Since each $f_k$ is a linear function, we let $f_k(v) = \langle w_k, v \rangle$, where each $w_k$ is a constant vector, so that $\nabla_v f_k(v) = w_k$ and $v = \frac{1}{\lambda} \sum_k \alpha_k w_k$. Then

$$\sum_k \alpha_k f_k(v) = \left\langle \sum_k \alpha_k w_k,\ v \right\rangle = \lambda \|v\|^2. \quad (15)$$

Simplifying Equation (14) using Equation (15) yields

$$g(\alpha, \beta) = \sum_k \alpha_k - \frac{\lambda}{2} \|v\|^2 = \sum_k \alpha_k - \frac{1}{2\lambda} \left\| \sum_k \alpha_k w_k \right\|^2. \quad (16)$$

Finally, we combine Equations (13) and (16) and obtain the dual form of Equation (11):

$$\max_{\alpha} \ \sum_k \alpha_k - \frac{1}{2\lambda} \left\| \sum_k \alpha_k w_k \right\|^2 \quad \text{subject to } 0 \le \alpha_k \le e_k, \quad k = 1, \ldots, K. \quad (17)$$

The primal problem is convex, its constraints are linear, and the domain of its objective is open; thus, Slater's condition holds and strong duality is obtained. Furthermore, the primal problem has differentiable objective and constraint functions, which implies that $(v^*, \xi^*)$ is primal optimal and $(\alpha^*, \beta^*)$ is dual optimal if and only if these points satisfy the Karush-Kuhn-Tucker (KKT) conditions. It follows that

$$v^* = \frac{1}{\lambda} \sum_k \alpha^*_k w_k. \quad (18)$$

Note that the constraints of the dual problem imply $0 \le \alpha^*_k \le e_k$, and we defined the $e_k$ so that $\sum_k e_k = 1$; thus, $\sum_k \alpha^*_k \le 1$. Because of strong duality, there is no duality gap, and the primal and dual objectives are equal at the optimum:

$$\frac{\lambda}{2} \|v^*\|^2 + \sum_k e_k \xi^*_k = \sum_k \alpha^*_k - \frac{\lambda}{2} \|v^*\|^2 \quad \text{(by (18))}.$$

Since $\xi^*_k \ge 0$, it follows that $\lambda \|v^*\|^2 \le \sum_k \alpha^*_k \le 1$, and hence $\|v^*\| \le \frac{1}{\sqrt{\lambda}}$. This proves the lemma. ∎

Given the bounds in Lemma 2, it can be verified that Algorithms 2 and 3 are adaptations of the Pegasos [3] algorithm for optimizing Equations (9) and (10), respectively. It still remains to show that Pegasos's performance guarantees hold in our case.

Lemma 3. In Algorithms 2 and 3, the approximate subgradients have norm at most $\sqrt{\lambda} + \frac{2}{\sqrt{\lambda}}$.

Proof. The approximate subgradient for Algorithm 3 is depicted in Equation (7). Due to the projection step, $\|V\|_F \le \frac{1}{\sqrt{\lambda}}$, and it follows that $\|\lambda V\|_F \le \sqrt{\lambda}$. The term $\hat{U}^{i,c,u}$ is constructed using Equation (5), and it can be verified that $\|\hat{U}^{i,c,u}\|_F \le \sqrt{2}\, \|U\|_F$. Using the triangle inequality, one can bound the norm of Equation (7) by $\sqrt{\lambda} + \frac{\sqrt{2}}{\sqrt{\lambda}}$. A similar argument can be made for the approximate subgradient of Algorithm 2 (where $\|\check{V}^{i,c,u}\|_F \le 2 \|V\|_F$), yielding the slightly higher upper bound given in the lemma statement. ∎

We combine the lemmas to obtain the correctness and running time guarantees for our algorithms.

Lemma 4. Let $\lambda \le \frac{1}{4}$, let $T$ be the total number of iterations of Algorithm 2, and let $U_t$ denote the parameter computed by the algorithm at iteration $t$. Let $\bar{U} = \frac{1}{T} \sum_{t=1}^{T} U_t$ denote the average of the parameters produced by the algorithm. Then, with probability at least $1 - \delta$,

$$f(\bar{U}; \ell_V) - f(U^*; \ell_V) \le O\!\left( \frac{\ln \frac{T}{\delta}}{\lambda T} \right).$$

The analogous result holds for Algorithm 3 as well.

Proof. First, in each loss function $\ell_V$ and $\ell_U$, variables are linearly combined, composed with the convex hinge function, and then averaged. All these operations preserve convexity; hence both loss functions are convex. Second, we have argued above that Algorithms 2 and 3 are adaptations of the Pegasos [3] algorithm for optimizing Equations (9) and (10), respectively. Third, in Lemma 3, we proved a bound on the approximate subgradients of both algorithms. Plugging these three results into the corresponding corollary in Shalev-Shwartz et al. [3] yields the statement of the lemma. ∎

The theorem below gives a bound in terms of individual parameters rather than average parameters.

Theorem 2. Assume that the conditions and the bound in Lemma 4 hold. Let $t$ be an iteration index selected uniformly at random from $\{1, \ldots, T\}$. Then, with probability at least $\frac{1}{2}$,

$$f(U_t; \ell_V) - f(U^*; \ell_V) \le O\!\left( \frac{\ln \frac{T}{\delta}}{\lambda T} \right).$$

The analogous result holds for Algorithm 3 as well.

Proof. The result follows directly from combining Lemma 4 with Lemma 3 in Shalev-Shwartz et al. [3]. ∎

Thus, with high probability, our algorithms reach an $\epsilon$-accurate solution in $\tilde{O}\!\left(\frac{1}{\lambda \epsilon}\right)$ iterations. Since we argued in Subsection 4.1 that the running time of each stochastic update is $O(br)$, where $b$ bounds the number of candidates per example, it follows that a complete run of projected stochastic subgradient descent takes $\tilde{O}\!\left(\frac{br}{\lambda \epsilon}\right)$ time, and the running time is independent of the size of the training data.

References

[1] Stéphane Boucheron, Olivier Bousquet, and Gábor Lugosi. Theory of classification: a survey of some recent advances. ESAIM: Probability and Statistics, 9:323-375, 2005.
[2] Thorsten Joachims. Optimizing search engines using clickthrough data. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '02, pages 133-142, New York, NY, USA, 2002. ACM.
[3] Shai Shalev-Shwartz, Yoram Singer, Nathan Srebro, and Andrew Cotter. Pegasos: primal estimated sub-gradient solver for SVM. Mathematical Programming, 127(1):3-30, March 2011.
[4] Nathan Srebro, Noga Alon, and Tommi Jaakkola. Generalization error bounds for collaborative prediction with low-rank matrices. In Advances in Neural Information Processing Systems (NIPS), 2004.
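As a concrete check of the definitions above, the following sketch evaluates the local ranking loss of Equation (2) (with the time index suppressed, as in Section 2) and its hinge relaxation from Equation (4). This is our illustration, not the authors' code; the function names are ours.

```python
import numpy as np

def local_ranking_loss(M, u, C, i):
    """Local ranking loss L_M(u, C, i): the probability that a candidate c
    drawn uniformly from C \\ {i} scores at least as high as the chosen
    item i under the score matrix M (Equation (2))."""
    others = [c for c in C if c != i]
    return np.mean([float(M[u, i] - M[u, c] <= 0.0) for c in others])

def hinge_surrogate(U, V, u, C, i):
    """Convex upper bound on the loss: the 0/1 comparison is replaced by
    the hinge h(x) = max(0, 1 - x), with M = U V^T (Equation (4))."""
    M = U @ V.T
    others = [c for c in C if c != i]
    return np.mean([max(0.0, 1.0 - (M[u, i] - M[u, c])) for c in others])
```

Since h(x) >= 1[x <= 0] pointwise, the surrogate always dominates the 0/1 loss, which is what makes the relaxation in Section 3 an upper bound.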
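The optimization procedure of Section 4 can also be sketched end-to-end. This is a minimal reading of Algorithms 1-3, not the authors' implementation: the function names are ours, the uniform initialization scale 1/sqrt(mr) is our assumption where the transcription is ambiguous, and the inner solvers return the final iterate.

```python
import numpy as np

def sgd_update_V(U, V, sample, lam, t):
    """One step of Algorithm 3: subgradient step on V with U fixed,
    learning rate eta_t = 1/(lam*t), then projection onto the Frobenius
    ball of radius 1/sqrt(lam) (cf. Lemma 2)."""
    u, C, i = sample
    others = [c for c in C if c != i]
    scores = U[u] @ V.T                      # scores of user u for all items
    active = [c for c in others if scores[i] - scores[c] < 1.0]
    eta = 1.0 / (lam * t)
    V = (1.0 - eta * lam) * V                # shrink step from the regularizer
    for c in active:                         # hinge subgradient, Equation (7):
        V[i] += (eta / len(others)) * U[u]   # row i of U_hat^{i,c,u} is +U_u
        V[c] -= (eta / len(others)) * U[u]   # row c of U_hat^{i,c,u} is -U_u
    radius = 1.0 / np.sqrt(lam)
    norm = np.linalg.norm(V)
    if norm > radius:                        # projection step
        V *= radius / norm
    return V

def sgd_update_U(U, V, sample, lam, t):
    """One step of Algorithm 2: the symmetric update on U with V fixed;
    the direction per active candidate is V_i - V_c, i.e. the nonzero
    row of V_check^{i,c,u} from Equation (6)."""
    u, C, i = sample
    others = [c for c in C if c != i]
    scores = U[u] @ V.T
    active = [c for c in others if scores[i] - scores[c] < 1.0]
    eta = 1.0 / (lam * t)
    U = (1.0 - eta * lam) * U
    for c in active:
        U[u] += (eta / len(others)) * (V[i] - V[c])
    radius = 1.0 / np.sqrt(lam)
    norm = np.linalg.norm(U)
    if norm > radius:
        U *= radius / norm
    return U

def alternating_minimization(S, m, n, r, lam, outer=3, inner=200, seed=0):
    """Algorithm 1: alternate between the two convex subproblems, each
    solved approximately by projected stochastic subgradient descent."""
    rng = np.random.default_rng(seed)
    U = rng.uniform(-1.0, 1.0, (m, r)) / np.sqrt(m * r)
    V = rng.uniform(-1.0, 1.0, (n, r)) / np.sqrt(n * r)
    for _ in range(outer):
        for t in range(1, inner + 1):
            U = sgd_update_U(U, V, S[rng.integers(len(S))], lam, t)
        for t in range(1, inner + 1):
            V = sgd_update_V(U, V, S[rng.integers(len(S))], lam, t)
    return U, V
```

Note that at t = 1 the shrink factor 1 - eta_t * lam equals 0, so the first step of each inner loop discards the current iterate, exactly as in Pegasos with w_1 = 0.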
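Lemma 2 can be sanity-checked numerically on a one-dimensional instance of the generic problem (11), taking K = 1, e_1 = 1, and f_1(v) = v. The grid search below is purely illustrative and the helper names are ours; the minimizer should satisfy |v*| <= 1/sqrt(lambda).

```python
import numpy as np

def primal_objective(v, lam):
    """One-dimensional instance of problem (11):
    (lam/2) v^2 + h(v) with h(x) = max(0, 1 - x)."""
    return 0.5 * lam * v**2 + max(0.0, 1.0 - v)

def minimizer_by_grid(lam):
    """Brute-force minimizer over a fine grid (illustration only)."""
    grid = np.linspace(-10.0, 10.0, 200001)   # step size 1e-4
    vals = [primal_objective(v, lam) for v in grid]
    return grid[int(np.argmin(vals))]
```

For lambda = 4 the objective is 2 v^2 + max(0, 1 - v), minimized at v* = 1/4, comfortably inside the bound 1/sqrt(4) = 1/2; for lambda = 1 the minimizer is v* = 1, so the bound 1/sqrt(1) = 1 is tight.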
More informationStochastic Gradient Descent with Only One Projection
Stochastic Gradient Descent with Only One Projection Mehrdad Mahdavi, ianbao Yang, Rong Jin, Shenghuo Zhu, and Jinfeng Yi Dept. of Computer Science and Engineering, Michigan State University, MI, USA Machine
More informationSVD-free Convex-Concave Approaches for Nuclear Norm Regularization
SV-free Convex-Concave Approaches for Nuclear Norm Regularization Yichi Xiao, Zhe Li, ianbao Yang, Lijun Zhang National Key Laboratory for Novel Software echnology, Nanjing University, Nanjing 003, China
More informationA New Minimum Description Length
A New Minimum Description Length Soosan Beheshti, Munther A. Dahleh Laboratory for Information an Decision Systems Massachusetts Institute of Technology soosan@mit.eu,ahleh@lis.mit.eu Abstract The minimum
More informationLATTICE-BASED D-OPTIMUM DESIGN FOR FOURIER REGRESSION
The Annals of Statistics 1997, Vol. 25, No. 6, 2313 2327 LATTICE-BASED D-OPTIMUM DESIGN FOR FOURIER REGRESSION By Eva Riccomagno, 1 Rainer Schwabe 2 an Henry P. Wynn 1 University of Warwick, Technische
More informationLower Bounds for Local Monotonicity Reconstruction from Transitive-Closure Spanners
Lower Bouns for Local Monotonicity Reconstruction from Transitive-Closure Spanners Arnab Bhattacharyya Elena Grigorescu Mahav Jha Kyomin Jung Sofya Raskhonikova Davi P. Wooruff Abstract Given a irecte
More informationLecture 2: Correlated Topic Model
Probabilistic Moels for Unsupervise Learning Spring 203 Lecture 2: Correlate Topic Moel Inference for Correlate Topic Moel Yuan Yuan First of all, let us make some claims about the parameters an variables
More informationTractability results for weighted Banach spaces of smooth functions
Tractability results for weighte Banach spaces of smooth functions Markus Weimar Mathematisches Institut, Universität Jena Ernst-Abbe-Platz 2, 07740 Jena, Germany email: markus.weimar@uni-jena.e March
More informationHow to Minimize Maximum Regret in Repeated Decision-Making
How to Minimize Maximum Regret in Repeate Decision-Making Karl H. Schlag July 3 2003 Economics Department, European University Institute, Via ella Piazzuola 43, 033 Florence, Italy, Tel: 0039-0-4689, email:
More informationStep 1. Analytic Properties of the Riemann zeta function [2 lectures]
Step. Analytic Properties of the Riemann zeta function [2 lectures] The Riemann zeta function is the infinite sum of terms /, n. For each n, the / is a continuous function of s, i.e. lim s s 0 n = s n,
More informationBeyond the regret minimization barrier: an optimal algorithm for stochastic strongly-convex optimization
JMLR: Workshop and Conference Proceedings vol (2010) 1 16 24th Annual Conference on Learning heory Beyond the regret minimization barrier: an optimal algorithm for stochastic strongly-convex optimization
More informationHyperbolic Moment Equations Using Quadrature-Based Projection Methods
Hyperbolic Moment Equations Using Quarature-Base Projection Methos J. Koellermeier an M. Torrilhon Department of Mathematics, RWTH Aachen University, Aachen, Germany Abstract. Kinetic equations like the
More informationMath 180, Exam 2, Fall 2012 Problem 1 Solution. (a) The derivative is computed using the Chain Rule twice. 1 2 x x
. Fin erivatives of the following functions: (a) f() = tan ( 2 + ) ( ) 2 (b) f() = ln 2 + (c) f() = sin() Solution: Math 80, Eam 2, Fall 202 Problem Solution (a) The erivative is compute using the Chain
More informationExponential asymptotic property of a parallel repairable system with warm standby under common-cause failure
J. Math. Anal. Appl. 341 (28) 457 466 www.elsevier.com/locate/jmaa Exponential asymptotic property of a parallel repairable system with warm stanby uner common-cause failure Zifei Shen, Xiaoxiao Hu, Weifeng
More informationMATH 566, Final Project Alexandra Tcheng,
MATH 566, Final Project Alexanra Tcheng, 60665 The unrestricte partition function pn counts the number of ways a positive integer n can be resse as a sum of positive integers n. For example: p 5, since
More informationMonotonicity for excited random walk in high dimensions
Monotonicity for excite ranom walk in high imensions Remco van er Hofsta Mark Holmes March, 2009 Abstract We prove that the rift θ, β) for excite ranom walk in imension is monotone in the excitement parameter
More informationMath 1B, lecture 8: Integration by parts
Math B, lecture 8: Integration by parts Nathan Pflueger 23 September 2 Introuction Integration by parts, similarly to integration by substitution, reverses a well-known technique of ifferentiation an explores
More informationThe Principle of Least Action
Chapter 7. The Principle of Least Action 7.1 Force Methos vs. Energy Methos We have so far stuie two istinct ways of analyzing physics problems: force methos, basically consisting of the application of
More informationCourse Notes for EE227C (Spring 2018): Convex Optimization and Approximation
Course Notes for EE7C (Spring 08): Convex Optimization and Approximation Instructor: Moritz Hardt Email: hardt+ee7c@berkeley.edu Graduate Instructor: Max Simchowitz Email: msimchow+ee7c@berkeley.edu October
More informationTime-of-Arrival Estimation in Non-Line-Of-Sight Environments
2 Conference on Information Sciences an Systems, The Johns Hopkins University, March 2, 2 Time-of-Arrival Estimation in Non-Line-Of-Sight Environments Sinan Gezici, Hisashi Kobayashi an H. Vincent Poor
More informationOnline Learning with Partial Feedback. 1 Online Mirror Descent with Estimated Gradient
Avance Course in Machine Learning Spring 2010 Online Learning wih Parial Feeback Hanous are joinly prepare by Shie Mannor an Shai Shalev-Shwarz In previous lecures we alke abou he general framework of
More informationChapter 9 Method of Weighted Residuals
Chapter 9 Metho of Weighte Resiuals 9- Introuction Metho of Weighte Resiuals (MWR) is an approimate technique for solving bounary value problems. It utilizes a trial functions satisfying the prescribe
More informationBranch differences and Lambert W
2014 16th International Symposium on Symbolic an Numeric Algorithms for Scientific Computing Branch ifferences an Lambert W D. J. Jeffrey an J. E. Jankowski Department of Applie Mathematics, The University
More informationarxiv: v4 [cs.ds] 7 Mar 2014
Analysis of Agglomerative Clustering Marcel R. Ackermann Johannes Blömer Daniel Kuntze Christian Sohler arxiv:101.697v [cs.ds] 7 Mar 01 Abstract The iameter k-clustering problem is the problem of partitioning
More informationA Weak First Digit Law for a Class of Sequences
International Mathematical Forum, Vol. 11, 2016, no. 15, 67-702 HIKARI Lt, www.m-hikari.com http://x.oi.org/10.1288/imf.2016.6562 A Weak First Digit Law for a Class of Sequences M. A. Nyblom School of
More information