arXiv: v1 [cs.LG] 22 Feb 2015
SDCA without Duality

Shai Shalev-Shwartz
School of Computer Science and Engineering, The Hebrew University, Jerusalem, Israel

Abstract

Stochastic Dual Coordinate Ascent (SDCA) is a popular method for solving regularized loss minimization for the case of convex losses. In this paper we show how a variant of SDCA can be applied for non-convex losses. We prove a linear convergence rate even if individual loss functions are non-convex, as long as the expected loss is convex.

1 Introduction

The following regularized loss minimization problem is associated with many machine learning methods:

\[ \min_{w \in \mathbb{R}^d} P(w) := \frac{1}{n} \sum_{i=1}^{n} \phi_i(w) + \frac{\lambda}{2} \|w\|^2 . \]

One of the most popular methods for solving this problem is Stochastic Dual Coordinate Ascent (SDCA). [8] analyzed this method, and showed that when each $\phi_i$ is $L$-smooth and convex then the convergence rate of SDCA is $\tilde{O}\left((L/\lambda + n)\log(1/\epsilon)\right)$.

As its name indicates, SDCA is derived by considering a dual problem. In this paper, we consider the possibility of applying SDCA to problems in which individual $\phi_i$ are non-convex, e.g., deep learning optimization problems. In many such cases, the dual problem is meaningless. Instead of directly using the dual problem, we describe and analyze a variant of SDCA in which only gradients of $\phi_i$ are being used (similar to Option 5 in the pseudo code of Prox-SDCA given in [6]). Following [3], we show that SDCA is a variant of Stochastic Gradient Descent (SGD), that is, its update is based on an unbiased estimate of the gradient. But, unlike vanilla SGD, for SDCA the variance of the estimation of the gradient tends to zero as we converge to a minimum.

For the case in which each $\phi_i$ is $L$-smooth and convex, we derive the same linear convergence rate of $\tilde{O}\left((L/\lambda + n)\log(1/\epsilon)\right)$ as in [8], but with a simpler, direct, dual-free proof. We also provide a linear convergence rate for the case in which individual $\phi_i$ can be non-convex, as long as the average of the $\phi_i$ is convex. The rate for non-convex losses has a worse dependence on $L/\lambda$, and we leave it open whether a better rate can be obtained for the non-convex case.

Related work: In recent years, many methods for optimizing regularized loss minimization problems have been proposed. For example, SAG [5], SVRG [3], Finito [2], SAGA [1], and S2GD [4]. The best convergence rate is for accelerated SDCA [6]. A systematic study of the convergence rate of the different methods under non-convex losses is left to future work.
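2 SDCA without Duality

We maintain pseudo-dual vectors $\alpha_1, \ldots, \alpha_n$, where each $\alpha_i \in \mathbb{R}^d$.

Dual-Free SDCA$(P, T, \eta, \alpha^{(0)})$

Goal: Minimize $P(w) = \frac{1}{n}\sum_{i=1}^{n} \phi_i(w) + \frac{\lambda}{2}\|w\|^2$
Input: Objective $P$, number of iterations $T$, step size $\eta$ such that $\beta := \eta\lambda n < 1$, initial dual vectors $\alpha^{(0)} = (\alpha_1^{(0)}, \ldots, \alpha_n^{(0)})$
Initialize: $w^{(0)} = \frac{1}{\lambda n}\sum_{i=1}^{n} \alpha_i^{(0)}$
For $t = 1, \ldots, T$:
    Pick $i$ uniformly at random from $[n]$
    Update: $\alpha_i^{(t)} = \alpha_i^{(t-1)} - \eta\lambda n \left( \nabla\phi_i(w^{(t-1)}) + \alpha_i^{(t-1)} \right)$ (all other $\alpha_j$, $j \neq i$, are left unchanged)
    Update: $w^{(t)} = w^{(t-1)} - \eta \left( \nabla\phi_i(w^{(t-1)}) + \alpha_i^{(t-1)} \right)$

Observe that SDCA keeps the primal-dual relation

\[ w^{(t-1)} = \frac{1}{\lambda n} \sum_{i=1}^{n} \alpha_i^{(t-1)} . \]

Observe also that the update of $\alpha$ can be rewritten as

\[ \alpha_i^{(t)} = (1-\beta)\,\alpha_i^{(t-1)} + \beta \left( -\nabla\phi_i(w^{(t-1)}) \right) , \]

namely, the new value of $\alpha_i$ is a convex combination of its old value and the negation of the gradient. Finally, observe that, conditioned on the value of $w^{(t-1)}$ and $\alpha^{(t-1)}$, we have that

\[ \mathbb{E}[w^{(t)}] = w^{(t-1)} - \eta \left( \mathbb{E}\,\nabla\phi_i(w^{(t-1)}) + \mathbb{E}\,\alpha_i^{(t-1)} \right) = w^{(t-1)} - \eta \left( \nabla \frac{1}{n}\sum_{i=1}^{n} \phi_i(w^{(t-1)}) + \lambda w^{(t-1)} \right) = w^{(t-1)} - \eta \nabla P(w^{(t-1)}) . \]

That is, SDCA is in fact an instance of Stochastic Gradient Descent. As we will see in the analysis section below, the advantage of SDCA over a vanilla SGD algorithm is that the variance of the update goes to zero as we converge to an optimum.

3 Analysis

The theorem below provides a linear convergence rate for smooth and convex functions. The rate matches the analysis given in [8], but the analysis is simpler and does not rely on duality.

Theorem 1. Assume that each $\phi_i$ is $L$-smooth and convex, and the algorithm is run with $\eta \le \frac{1}{L + \lambda n}$. Let $w^*$ be the minimizer of $P(w)$ and let $\alpha_i^* = -\nabla\phi_i(w^*)$. Then, for every $t \ge 1$,

\[ \mathbb{E}\left[ \frac{\lambda}{2} \|w^{(t)} - w^*\|^2 + \frac{1}{2Ln} \sum_{i=1}^{n} \|\alpha_i^{(t)} - \alpha_i^*\|^2 \right] \le e^{-\eta\lambda t} \left[ \frac{\lambda}{2} \|w^{(0)} - w^*\|^2 + \frac{1}{2Ln} \sum_{i=1}^{n} \|\alpha_i^{(0)} - \alpha_i^*\|^2 \right] . \]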
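In particular, setting $\eta = \frac{1}{L + \lambda n}$, then after $T \ge \tilde{\Omega}\left( \frac{L}{\lambda} + n \right)$ iterations we will have $\mathbb{E}[P(w^{(T)}) - P(w^*)] \le \epsilon$.

The theorem below provides a linear convergence rate for smooth functions, without assuming that individual $\phi_i$ are convex. We only require that the average of the $\phi_i$ is convex. The dependence on $L/\lambda$ is worse in this case.

Theorem 2. Assume that each $\phi_i$ is $L$-smooth and that the average function, $\frac{1}{n}\sum_{i=1}^{n} \phi_i$, is convex. Let $w^*$ be the minimizer of $P(w)$ and let $\alpha_i^* = -\nabla\phi_i(w^*)$. Then, if we run SDCA with $\eta = \min\left\{ \frac{\lambda}{2L^2}, \frac{1}{2\lambda n} \right\}$, we have that

\[ \mathbb{E}\left[ \frac{\lambda}{2} \|w^{(t)} - w^*\|^2 + \frac{\lambda}{2L^2 n} \sum_{i=1}^{n} \|\alpha_i^{(t)} - \alpha_i^*\|^2 \right] \le e^{-\eta\lambda t} \left[ \frac{\lambda}{2} \|w^{(0)} - w^*\|^2 + \frac{\lambda}{2L^2 n} \sum_{i=1}^{n} \|\alpha_i^{(0)} - \alpha_i^*\|^2 \right] . \]

It follows that whenever $T \ge \tilde{\Omega}\left( \frac{L^2}{\lambda^2} + n \right)$ we have that $\mathbb{E}[P(w^{(T)}) - P(w^*)] \le \epsilon$.

3.1 SDCA as variance-reduced SGD

As we have shown before, SDCA is an instance of SGD, in the sense that the update can be written as $w^{(t)} = w^{(t-1)} - \eta v_t$, with $v_t = \nabla\phi_i(w^{(t-1)}) + \alpha_i^{(t-1)}$ satisfying $\mathbb{E}[v_t] = \nabla P(w^{(t-1)})$.

The advantage of SDCA over a generic SGD is that the variance of the update goes to zero as we converge to the optimum. To see this, observe that

\[ \mathbb{E}\|v_t\|^2 = \mathbb{E}\|\alpha_i^{(t-1)} + \nabla\phi_i(w^{(t-1)})\|^2 = \mathbb{E}\|\alpha_i^{(t-1)} - \alpha_i^* + \alpha_i^* + \nabla\phi_i(w^{(t-1)})\|^2 \le 2\,\mathbb{E}\|\alpha_i^{(t-1)} - \alpha_i^*\|^2 + 2\,\mathbb{E}\|\nabla\phi_i(w^{(t-1)}) + \alpha_i^*\|^2 . \]

Theorem 1 or Theorem 2 tells us that the term $\mathbb{E}\|\alpha_i^{(t-1)} - \alpha_i^*\|^2$ goes to zero as $e^{-\eta\lambda t}$. For the second term, by smoothness of $\phi_i$ we have

\[ \|\nabla\phi_i(w^{(t-1)}) + \alpha_i^*\| = \|\nabla\phi_i(w^{(t-1)}) - \nabla\phi_i(w^*)\| \le L \|w^{(t-1)} - w^*\| , \]

and therefore, using Theorem 1 or Theorem 2 again, the second term also goes to zero as $e^{-\eta\lambda t}$. All in all, when $t \ge \tilde{\Omega}\left( \frac{1}{\eta\lambda} \log(1/\epsilon) \right)$ we will have that $\mathbb{E}\|v_t\|^2 \le \epsilon$.

4 Proofs

Observe that $0 = \nabla P(w^*) = \frac{1}{n}\sum_{i=1}^{n} \nabla\phi_i(w^*) + \lambda w^*$, which implies that

\[ w^* = \frac{1}{\lambda n} \sum_{i=1}^{n} \alpha_i^* . \]

Define $u_i = -\nabla\phi_i(w^{(t-1)})$ and $v_t = -u_i + \alpha_i^{(t-1)}$. We also denote two potentials:

\[ A_t = \frac{1}{n} \sum_{j=1}^{n} \|\alpha_j^{(t)} - \alpha_j^*\|^2 , \qquad B_t = \|w^{(t)} - w^*\|^2 . \]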
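The update rule of Section 2 and the potentials $A_t$, $B_t$ defined above can be monitored numerically. The following is a minimal sketch, not the author's code: it assumes squared losses $\phi_i(w) = \frac{1}{2}(x_i^\top w - y_i)^2$ (which are convex and $\|x_i\|^2$-smooth), synthetic Gaussian data, and $\alpha^{(0)} = 0$. The function name `dual_free_sdca` and all parameter values are illustrative choices.

```python
import numpy as np

def dual_free_sdca(X, y, lam, T, eta, seed=0):
    """Dual-Free SDCA for the assumed losses phi_i(w) = 0.5 * (x_i @ w - y_i)**2."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    assert eta * lam * n < 1, "step size must satisfy beta = eta * lam * n < 1"

    # Closed-form minimizer w* of P; used only to evaluate the potentials.
    w_star = np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)
    alpha_star = -(X @ w_star - y)[:, None] * X   # alpha_i* = -grad phi_i(w*)

    alpha = np.zeros((n, d))                      # alpha^(0) = 0 (assumed start)
    w = alpha.sum(axis=0) / (lam * n)             # w^(0) = (1/(lam n)) sum_i alpha_i^(0)
    history = []
    for _ in range(T):
        i = rng.integers(n)                       # pick i uniformly from {0, ..., n-1}
        v = (X[i] @ w - y[i]) * X[i] + alpha[i]   # v_t = grad phi_i(w) + alpha_i
        alpha[i] -= eta * lam * n * v             # pseudo-dual update
        w -= eta * v                              # primal update
        A = np.mean(np.sum((alpha - alpha_star) ** 2, axis=1))
        B = np.sum((w - w_star) ** 2)
        history.append((A, B, v @ v))             # (A_t, B_t, ||v_t||^2)
    return w, alpha, w_star, np.array(history)

# Small synthetic problem (all values below are illustrative).
data_rng = np.random.default_rng(1)
n, d, lam = 50, 5, 0.1
X = data_rng.normal(size=(n, d))
y = X @ data_rng.normal(size=d) + 0.1 * data_rng.normal(size=n)
L = np.max(np.sum(X ** 2, axis=1))                # smoothness constant of the phi_i
eta = 1.0 / (L + lam * n)                         # step size allowed by Theorem 1
w, alpha, w_star, hist = dual_free_sdca(X, y, lam, T=20000, eta=eta)
print("final A_t, B_t, ||v_t||^2:", hist[-1])
```

In this sketch the potentials are recomputed from scratch at every step, which costs $O(nd)$ per iteration and is only sensible for a small demonstration; the algorithm itself touches a single $\alpha_i$ per step.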
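We will first analyze the evolution of $A_t$ and $B_t$. If on round $t$ we update using element $i$ then $\alpha_i^{(t)} = (1-\beta)\,\alpha_i^{(t-1)} + \beta u_i$, where $\beta = \eta\lambda n$. Using the identity $\|(1-\beta)a + \beta b\|^2 = (1-\beta)\|a\|^2 + \beta\|b\|^2 - \beta(1-\beta)\|a - b\|^2$, it follows that

\[
\begin{aligned}
A_{t-1} - A_t &= -\frac{1}{n}\|\alpha_i^{(t)} - \alpha_i^*\|^2 + \frac{1}{n}\|\alpha_i^{(t-1)} - \alpha_i^*\|^2 \\
&= -\frac{1}{n}\|(1-\beta)(\alpha_i^{(t-1)} - \alpha_i^*) + \beta(u_i - \alpha_i^*)\|^2 + \frac{1}{n}\|\alpha_i^{(t-1)} - \alpha_i^*\|^2 \\
&= \frac{1}{n}\left[ -(1-\beta)\|\alpha_i^{(t-1)} - \alpha_i^*\|^2 - \beta\|u_i - \alpha_i^*\|^2 + \beta(1-\beta)\|\alpha_i^{(t-1)} - u_i\|^2 + \|\alpha_i^{(t-1)} - \alpha_i^*\|^2 \right] \\
&= \frac{\beta}{n}\left[ \|\alpha_i^{(t-1)} - \alpha_i^*\|^2 - \|u_i - \alpha_i^*\|^2 + (1-\beta)\|v_t\|^2 \right] \\
&= \eta\lambda\left[ \|\alpha_i^{(t-1)} - \alpha_i^*\|^2 - \|u_i - \alpha_i^*\|^2 + (1-\beta)\|v_t\|^2 \right] .
\end{aligned}
\]

In addition,

\[ B_{t-1} - B_t = \|w^{(t-1)} - w^*\|^2 - \|w^{(t)} - w^*\|^2 = 2\eta\,(w^{(t-1)} - w^*)^\top v_t - \eta^2 \|v_t\|^2 . \]

The proofs of Theorem 1 and Theorem 2 will follow by studying different combinations of $A_t$ and $B_t$.

4.1 Proof of Theorem 2

Define

\[ C_t = \frac{\lambda}{2L^2} A_t + \frac{\lambda}{2} B_t . \]

Combining the expressions for $A_{t-1} - A_t$ and $B_{t-1} - B_t$ we obtain

\[
\begin{aligned}
C_{t-1} - C_t &= \frac{\eta\lambda^2}{2L^2}\left[ \|\alpha_i^{(t-1)} - \alpha_i^*\|^2 - \|u_i - \alpha_i^*\|^2 + (1-\beta)\|v_t\|^2 \right] + \frac{\lambda}{2}\left[ 2\eta\,(w^{(t-1)} - w^*)^\top v_t - \eta^2\|v_t\|^2 \right] \\
&= \eta\lambda\left[ \frac{\lambda}{2L^2}\left( \|\alpha_i^{(t-1)} - \alpha_i^*\|^2 - \|u_i - \alpha_i^*\|^2 \right) + \left( \frac{\lambda(1-\beta)}{2L^2} - \frac{\eta}{2} \right)\|v_t\|^2 + (w^{(t-1)} - w^*)^\top v_t \right] .
\end{aligned}
\]

The definition of $\eta$ implies that $\eta \le \lambda(1-\beta)/L^2$, so the coefficient of $\|v_t\|^2$ is non-negative. By smoothness of each $\phi_i$ we have $\|u_i - \alpha_i^*\|^2 = \|\nabla\phi_i(w^{(t-1)}) - \nabla\phi_i(w^*)\|^2 \le L^2 \|w^{(t-1)} - w^*\|^2$. Therefore,

\[ C_{t-1} - C_t \ge \eta\lambda\left[ \frac{\lambda}{2L^2}\|\alpha_i^{(t-1)} - \alpha_i^*\|^2 - \frac{\lambda}{2}\|w^{(t-1)} - w^*\|^2 + (w^{(t-1)} - w^*)^\top v_t \right] . \]

Taking expectation of both sides with respect to the choice of $i$, conditioned on $w^{(t-1)}$ and $\alpha^{(t-1)}$, and noting that $\mathbb{E}[v_t] = \nabla P(w^{(t-1)})$, we obtain that

\[ \mathbb{E}[C_{t-1} - C_t] \ge \eta\lambda\left[ \frac{\lambda}{2L^2}\,\mathbb{E}\|\alpha_i^{(t-1)} - \alpha_i^*\|^2 - \frac{\lambda}{2}\|w^{(t-1)} - w^*\|^2 + (w^{(t-1)} - w^*)^\top \nabla P(w^{(t-1)}) \right] . \]

Using the strong convexity of $P$ we have $(w^{(t-1)} - w^*)^\top \nabla P(w^{(t-1)}) \ge P(w^{(t-1)}) - P(w^*) + \frac{\lambda}{2}\|w^{(t-1)} - w^*\|^2$ and $P(w^{(t-1)}) - P(w^*) \ge \frac{\lambda}{2}\|w^{(t-1)} - w^*\|^2$, which together yield $(w^{(t-1)} - w^*)^\top \nabla P(w^{(t-1)}) \ge \lambda\|w^{(t-1)} - w^*\|^2$. Therefore,

\[ \mathbb{E}[C_{t-1} - C_t] \ge \eta\lambda\left[ \frac{\lambda}{2L^2}\,\mathbb{E}\|\alpha_i^{(t-1)} - \alpha_i^*\|^2 + \left( -\frac{\lambda L^2}{2L^2} + \lambda \right)\|w^{(t-1)} - w^*\|^2 \right] = \eta\lambda\, C_{t-1} . \]

It follows that

\[ \mathbb{E}[C_t] \le (1 - \eta\lambda)\, C_{t-1} , \]

and repeating this recursively we end up with

\[ \mathbb{E}[C_t] \le (1 - \eta\lambda)^t C_0 \le e^{-\eta\lambda t} C_0 , \]

which concludes the proof of the first part of Theorem 2. The second part follows by observing that $P$ is $(L+\lambda)$-smooth, which gives $P(w) - P(w^*) \le \frac{L+\lambda}{2}\|w - w^*\|^2$.

4.2 Proof of Theorem 1

In the proof of Theorem 2 we bounded the term $\|u_i - \alpha_i^*\|^2$ by $L^2\|w^{(t-1)} - w^*\|^2$ based on the smoothness of $\phi_i$. We now assume that $\phi_i$ is also convex, which enables us to bound $\|u_i - \alpha_i^*\|^2$ based on the current sub-optimality.

Lemma 1. Assume that each $\phi_i$ is $L$-smooth and convex. Then, for every $w$,

\[ \frac{1}{n}\sum_{i=1}^{n} \|\nabla\phi_i(w) - \nabla\phi_i(w^*)\|^2 \le 2L\left( P(w) - P(w^*) - \frac{\lambda}{2}\|w - w^*\|^2 \right) . \]

Proof. For every $i$, define

\[ g_i(w) = \phi_i(w) - \phi_i(w^*) - \nabla\phi_i(w^*)^\top (w - w^*) . \]

Clearly, since $\phi_i$ is $L$-smooth so is $g_i$. In addition, by convexity of $\phi_i$ we have $g_i(w) \ge 0$ for all $w$. It follows that $g_i$ is non-negative and smooth, and therefore, it is self-bounded (see Section 12.1.3 in [7]):

\[ \|\nabla g_i(w)\|^2 \le 2L\, g_i(w) . \]

Using the definition of $g_i$, we obtain

\[ \|\nabla\phi_i(w) - \nabla\phi_i(w^*)\|^2 = \|\nabla g_i(w)\|^2 \le 2L\, g_i(w) = 2L\left[ \phi_i(w) - \phi_i(w^*) - \nabla\phi_i(w^*)^\top (w - w^*) \right] . \]

Taking expectation over $i$ and observing that $P(w) = \mathbb{E}\,\phi_i(w) + \frac{\lambda}{2}\|w\|^2$ and $0 = \nabla P(w^*) = \mathbb{E}\,\nabla\phi_i(w^*) + \lambda w^*$, we obtain

\[
\begin{aligned}
\mathbb{E}\|\nabla\phi_i(w) - \nabla\phi_i(w^*)\|^2 &\le 2L\left[ P(w) - \frac{\lambda}{2}\|w\|^2 - P(w^*) + \frac{\lambda}{2}\|w^*\|^2 + \lambda\, {w^*}^\top (w - w^*) \right] \\
&= 2L\left[ P(w) - P(w^*) - \frac{\lambda}{2}\|w - w^*\|^2 \right] .
\end{aligned}
\]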
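The inequality of Lemma 1 can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it uses squared losses $\phi_i(w) = \frac{1}{2}(x_i^\top w - y_i)^2$ with $L = \max_i \|x_i\|^2$ and random synthetic data, neither of which comes from the paper. The helper name `lemma1_sides` is hypothetical.

```python
import numpy as np

def lemma1_sides(X, y, lam, w):
    """Return (lhs, rhs) of Lemma 1 for the assumed losses 0.5 * (x_i @ w - y_i)**2."""
    n, d = X.shape
    L = np.max(np.sum(X ** 2, axis=1))            # each phi_i is ||x_i||^2-smooth
    w_star = np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)

    def P(v):
        # Regularized objective: average loss plus (lam / 2) * ||v||^2.
        return 0.5 * np.mean((X @ v - y) ** 2) + 0.5 * lam * (v @ v)

    # For squared loss: grad phi_i(w) - grad phi_i(w*) = (x_i @ (w - w*)) * x_i.
    grad_diff = (X @ (w - w_star))[:, None] * X
    lhs = np.mean(np.sum(grad_diff ** 2, axis=1))
    rhs = 2 * L * (P(w) - P(w_star) - 0.5 * lam * np.sum((w - w_star) ** 2))
    return lhs, rhs

# Evaluate both sides at a few random points (illustrative data).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 4))
y = rng.normal(size=40)
results = [lemma1_sides(X, y, 0.5, rng.normal(size=4)) for _ in range(20)]
print(all(lhs <= rhs + 1e-9 for lhs, rhs in results))
```

For this particular loss the right-hand side reduces to $L \cdot \frac{1}{n}\sum_i (x_i^\top (w - w^*))^2$, so the check amounts to $\|x_i\|^2 \le L$; the small tolerance only absorbs floating-point cancellation in $P(w) - P(w^*)$.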
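We now consider the potential

\[ D_t = \frac{1}{2L} A_t + \frac{\lambda}{2} B_t . \]

Combining the expressions for $A_{t-1} - A_t$ and $B_{t-1} - B_t$ we obtain

\[
\begin{aligned}
D_{t-1} - D_t &= \frac{\eta\lambda}{2L}\left[ \|\alpha_i^{(t-1)} - \alpha_i^*\|^2 - \|u_i - \alpha_i^*\|^2 + (1-\beta)\|v_t\|^2 \right] + \frac{\lambda}{2}\left[ 2\eta\,(w^{(t-1)} - w^*)^\top v_t - \eta^2\|v_t\|^2 \right] \\
&= \eta\lambda\left[ \frac{1}{2L}\left( \|\alpha_i^{(t-1)} - \alpha_i^*\|^2 - \|u_i - \alpha_i^*\|^2 \right) + \left( \frac{1-\beta}{2L} - \frac{\eta}{2} \right)\|v_t\|^2 + (w^{(t-1)} - w^*)^\top v_t \right] \\
&\ge \eta\lambda\left[ \frac{1}{2L}\left( \|\alpha_i^{(t-1)} - \alpha_i^*\|^2 - \|u_i - \alpha_i^*\|^2 \right) + (w^{(t-1)} - w^*)^\top v_t \right] ,
\end{aligned}
\]

where in the last inequality we used the assumption $\eta \le \frac{1}{L + \lambda n}$, which implies $\eta \le \frac{1-\beta}{L}$. Take expectation of the above with respect to the choice of $i$, using Lemma 1, using $\mathbb{E}[v_t] = \nabla P(w^{(t-1)})$, and using convexity of $P$, which yields $P(w^*) - P(w^{(t-1)}) \ge (w^* - w^{(t-1)})^\top \nabla P(w^{(t-1)})$. We obtain

\[
\begin{aligned}
\mathbb{E}[D_{t-1} - D_t] &\ge \eta\lambda\left[ \frac{1}{2L}\,\mathbb{E}\|\alpha_i^{(t-1)} - \alpha_i^*\|^2 - \frac{1}{2L}\,\mathbb{E}\|u_i - \alpha_i^*\|^2 + (w^{(t-1)} - w^*)^\top \mathbb{E}[v_t] \right] \\
&\ge \eta\lambda\left[ \frac{1}{2L}\,\mathbb{E}\|\alpha_i^{(t-1)} - \alpha_i^*\|^2 - \left( P(w^{(t-1)}) - P(w^*) - \frac{\lambda}{2}\|w^{(t-1)} - w^*\|^2 \right) + (w^{(t-1)} - w^*)^\top \nabla P(w^{(t-1)}) \right] \\
&\ge \eta\lambda\left[ \frac{1}{2L}\,\mathbb{E}\|\alpha_i^{(t-1)} - \alpha_i^*\|^2 + \frac{\lambda}{2}\|w^{(t-1)} - w^*\|^2 \right] = \eta\lambda\, D_{t-1} .
\end{aligned}
\]

This gives $\mathbb{E}[D_t] \le (1 - \eta\lambda)\, D_{t-1} \le e^{-\eta\lambda} D_{t-1}$, which, applied recursively, concludes the proof of the first part of Theorem 1. The second part follows by observing that $P$ is $(L+\lambda)$-smooth, which gives $P(w) - P(w^*) \le \frac{L+\lambda}{2}\|w - w^*\|^2$.

References

[1] Aaron Defazio, Francis Bach, and Simon Lacoste-Julien. SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. In Advances in Neural Information Processing Systems, 2014.
[2] Aaron J. Defazio, Tibério S. Caetano, and Justin Domke. Finito: A faster, permutable incremental gradient method for big data problems. arXiv preprint arXiv:1407.2710.
[3] Rie Johnson and Tong Zhang. Accelerating stochastic gradient descent using predictive variance reduction. In Advances in Neural Information Processing Systems, pages 315–323.
[4] Jakub Konečný and Peter Richtárik. Semi-stochastic gradient descent methods. arXiv preprint arXiv:1312.1666, 2013.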
[5] Nicolas Le Roux, Mark Schmidt, and Francis Bach. A stochastic gradient method with an exponential convergence rate for finite training sets. In Advances in Neural Information Processing Systems, 2012.
[6] S. Shalev-Shwartz and T. Zhang. Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization. Mathematical Programming, Series A and B (to appear).
[7] Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press.
[8] Shai Shalev-Shwartz and Tong Zhang. Stochastic dual coordinate ascent methods for regularized loss minimization. Journal of Machine Learning Research, 14, Feb 2013.
More informationNEUMANN ISOPERIMETRIC CONSTANT ESTIMATE FOR CONVEX DOMAINS
NEUMANN ISOPERIMETRIC CONSTANT ESTIMATE FOR CONVEX DOMAINS XIANZHE DAI, GUOFANG WEI, AND ZHENLEI ZHANG Abstract We preset a geometrc ad elemetary proof of the local Neuma sopermetrc equalty o covex domas
More informationAnalysis of Lagrange Interpolation Formula
P IJISET - Iteratoal Joural of Iovatve Scece, Egeerg & Techology, Vol. Issue, December 4. www.jset.com ISS 348 7968 Aalyss of Lagrage Iterpolato Formula Vjay Dahya PDepartmet of MathematcsMaharaja Surajmal
More informationMachine Learning. knowledge acquisition skill refinement. Relation between machine learning and data mining. P. Berka, /18
Mache Learg The feld of mache learg s cocered wth the questo of how to costruct computer programs that automatcally mprove wth eperece. (Mtchell, 1997) Thgs lear whe they chage ther behavor a way that
More informationArithmetic Mean and Geometric Mean
Acta Mathematca Ntresa Vol, No, p 43 48 ISSN 453-6083 Arthmetc Mea ad Geometrc Mea Mare Varga a * Peter Mchalča b a Departmet of Mathematcs, Faculty of Natural Sceces, Costate the Phlosopher Uversty Ntra,
More information4 Inner Product Spaces
11.MH1 LINEAR ALGEBRA Summary Notes 4 Ier Product Spaces Ier product s the abstracto to geeral vector spaces of the famlar dea of the scalar product of two vectors or 3. I what follows, keep these key
More informationThe Occupancy and Coupon Collector problems
Chapter 4 The Occupacy ad Coupo Collector problems By Sarel Har-Peled, Jauary 9, 08 4 Prelmares [ Defto 4 Varace ad Stadard Devato For a radom varable X, let V E [ X [ µ X deote the varace of X, where
More informationStochastic Convex Optimization
Stochastc Covex Optmzato Sha Shalev-Shwartz TTI-Chcago sha@tt-c.org Ohad Shamr The Hebrew Uversty ohadsh@cs.huj.ac.l Natha Srebro TTI-Chcago at@uchcago.edu Karthk Srdhara TTI-Chcago karthk@tt-c.org Abstract
More informationMOLECULAR VIBRATIONS
MOLECULAR VIBRATIONS Here we wsh to vestgate molecular vbratos ad draw a smlarty betwee the theory of molecular vbratos ad Hückel theory. 1. Smple Harmoc Oscllator Recall that the eergy of a oe-dmesoal
More informationA new type of optimization method based on conjugate directions
A ew type of optmzato method based o cojugate drectos Pa X Scece School aj Uversty of echology ad Educato (UE aj Cha e-mal: pax94@sacom Abstract A ew type of optmzato method based o cojugate drectos s
More informationTHE ROYAL STATISTICAL SOCIETY 2016 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5
THE ROYAL STATISTICAL SOCIETY 06 EAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5 The Socety s provdg these solutos to assst cadtes preparg for the examatos 07. The solutos are teded as learg ads ad should
More informationPoint Estimation: definition of estimators
Pot Estmato: defto of estmators Pot estmator: ay fucto W (X,..., X ) of a data sample. The exercse of pot estmato s to use partcular fuctos of the data order to estmate certa ukow populato parameters.
More information