Variational Adaptive-Newton Method

Mohammad Emtiyaz Khan, Wu Lin, Voot Tangkaratt, Zuozhu Liu, Didrik Nielsen
AIP, RIKEN, Tokyo

Abstract

We present a black-box learning method called the Variational Adaptive-Newton (VAN) method. The method is especially suitable for explorative-learning tasks such as active learning and reinforcement learning. Just like Bayesian methods, it estimates a distribution that can be used for exploration, but its computational complexity is similar to that of continuous optimization methods. VAN is a second-order method that unifies existing methods in the distinct fields of continuous optimization, variational inference, and evolution strategies. Our results show that VAN performs well on a wide variety of learning tasks while taking fewer computations than existing approaches. In summary, this work presents a method that has the potential not only to be a general-purpose learning tool, but also to be used to solve difficult problems such as deep reinforcement learning and unsupervised learning.

1 Introduction

Humans learn by sequentially exploring the world. During our lifetimes, we use our past experiences to acquire new ones and then use them to learn even more. How can we design methods that mimic this "explorative" learning? This is an open question in artificial intelligence and machine learning. One such approach is based on the Bayesian posterior distribution, which can be used to summarize past data examples and to choose new ones (Perfors et al., 2011). Bayesian methods have been applied to a wide variety of explorative-learning tasks, e.g., for active learning and Bayesian optimization to select informative data examples (Houlsby et al., 2011; Gal et al., 2017; Brochu et al., 2010; Fishel and Loeb, 2012), and for reinforcement learning to learn through interactions (Wyatt, 1998; Strens, 2000). However, estimating the posterior distribution is computationally demanding, which makes its application to large-scale problems challenging.

On the other hand, non-Bayesian methods, such as those based on continuous optimization, are computationally much cheaper. This is because they only compute a point estimate of the model parameters rather than a distribution over them. However, since there is no distribution associated with the parameters, these methods cannot directly exploit the mechanisms of Bayesian exploration. This raises the following question: how can we design explorative-learning methods that compute a distribution just like Bayesian methods, but at a cost similar to optimization methods?

In this paper, we propose such a method. Our method is a general-purpose black-box optimization method, but it is especially suitable for exploration-based learning. Just like Newton's method, our method is a second-order method, but it also unifies existing methods in the distinct fields of continuous optimization, variational inference, and evolution strategies. Due to this connection, we call our method the Variational Adaptive-Newton (VAN) method. We show results for supervised and unsupervised learning, as well as for active learning and reinforcement learning. Our results suggest that VAN can be used as a general-purpose learning method and has the potential to improve existing approaches for a variety of difficult learning problems.

ZL is from the Singapore University of Technology and Design, Singapore. This work was done during an internship at AIP. A longer version of this paper is available online.

31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.

2 Variational Optimization with Gaussian distributions

We consider learning tasks that can be formulated as a function minimization problem,

    θ* = argmin_θ f(θ),  where θ ∈ R^D.    (1)

A wide variety of problems in supervised, unsupervised, and reinforcement learning fall under this category. However, instead of directly solving problem (1), we take a different approach that follows the Variational Optimization (VO) framework (Staines and Barber, 2012), where we minimize

    min_η E_{q(θ|η)}[f(θ)] := L(η),    (2)

where q is a probability distribution with parameters η. This approach can lead us to the minimizer θ* because L(η) is an upper bound on the minimum value of f, i.e., min_θ f(θ) ≤ E_{q(θ|η)}[f(θ)]. Therefore, minimizing L(η) minimizes f(θ), and when the distribution q puts all its mass on θ*, we recover the minimum. This type of function minimization is commonly used in many areas of stochastic search such as evolution strategies (Hansen and Ostermeier, 2001; Wierstra et al., 2008). In our problem context, this formulation is advantageous because it enables learning via exploration, where exploration is facilitated by the distribution q(θ|η).

In this paper, we use a Gaussian distribution for exploration, i.e., we set q(θ|η) := N(θ|µ, Σ) with η = {µ, Σ}. Problem (2) can then be rewritten as follows:

    min_{µ,Σ} E_{N(θ|µ,Σ)}[f(θ)] := L(µ, Σ).    (3)

A straightforward approach to minimize L(µ, Σ) is to use SGD to optimize µ and Σ, as shown below,

    V-SGD:  µ_{t+1} = µ_t − ρ_t ∇̂_µ L_t,    Σ_{t+1} = Σ_t − ρ_t ∇̂_Σ L_t,    (4)

where ρ_t > 0 is a step size at iteration t, ∇̂ denotes an unbiased stochastic-gradient estimate, and L_t = L(µ_t, Σ_t). We refer to this approach as Variational SGD, or simply V-SGD, to differentiate it from the standard SGD that optimizes f(θ) in the θ space. This approach is simple and can be powerful when used with adaptive-gradient methods that adaptively change the step size, e.g., AdaGrad and RMSprop. However, as pointed out by Wierstra et al. (2008), it has issues, especially when REINFORCE (Williams, 1992) is used to estimate the gradients of f(θ). Wierstra et al. (2008) argue that the V-SGD update becomes increasingly unstable when the covariance is small, while taking very small steps when the covariance is large. To fix these problems, Wierstra et al. (2008) proposed a natural-gradient method. Our method is also a natural-gradient method, but, as we show in the next section, its updates are much simpler and they lead to a second-order method similar to Newton's method.
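As a concrete illustration (not from the paper), the following minimal Python sketch runs the VO objective (3) and the V-SGD update (4) on a one-dimensional sinc function, using the score-function (REINFORCE) estimator mentioned above; the step size, sample count, and initialization are illustrative choices rather than settings used by the authors.

```python
import numpy as np

def f(theta):
    # 1-D non-convex test function, in the spirit of the sinc example in Fig. 1(b)
    return np.sinc(theta)

def v_sgd(mu=3.0, sigma2=1.0, rho=0.05, steps=2000, samples=50, seed=0):
    """V-SGD (Eq. 4): plain SGD on L(mu, sigma^2) = E_q[f(theta)] with q = N(mu, sigma^2),
    using score-function (REINFORCE) estimates of the gradients w.r.t. mu and sigma^2."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        theta = mu + np.sqrt(sigma2) * rng.standard_normal(samples)            # theta ~ q
        fx = f(theta)
        g_mu = np.mean(fx * (theta - mu) / sigma2)                              # E[f * d log q / d mu]
        g_s2 = np.mean(fx * ((theta - mu) ** 2 - sigma2) / (2 * sigma2 ** 2))   # E[f * d log q / d sigma^2]
        mu = mu - rho * g_mu
        sigma2 = max(sigma2 - rho * g_s2, 1e-6)                                 # keep the variance positive
    return mu, sigma2

print(v_sgd())  # in a typical run, mu drifts toward a minimizer of f and sigma2 shrinks as q concentrates
```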

3 Variational Adaptive-Newton Method

We propose a new method to solve (3) by using a mirror-descent algorithm. We show that our algorithm is a second-order method which solves the original problem (1), even though it is designed to solve (3). Due to its similarity to Newton's method, we refer to it as the Variational Adaptive-Newton (VAN) method. Figure 1(b) shows an illustrative example of our method optimizing a one-dimensional non-convex function.

VAN can be obtained by making two small changes to the V-SGD objective. First, note that the V-SGD update in (4) is the solution of the following optimization problem:

    η_{t+1} = argmin_η ⟨η, ∇̂_η L_t⟩ + (1 / (2 ρ_t)) ‖η − η_t‖² = η_t − ρ_t ∇̂_η L_t.    (5)

A simple interpretation of this problem is that, in V-SGD, we choose the next point η along the gradient but contained within a scaled ℓ2-ball centered at the current point η_t. This interpretation enables us to slightly modify V-SGD to obtain VAN.

Our VAN method is based on two modifications to (5). The first modification is to replace the Euclidean distance by a Bregman divergence, which results in the mirror-descent method. Note that for exponential-family distributions, the Bregman divergence corresponds to the Kullback-Leibler (KL) divergence (Raskutti and Mukherjee, 2015). Using the KL divergence as the distance measure has the benefit of yielding natural-gradient updates, which are efficient when optimizing the parameters of a probability distribution (Amari, 1998). The second modification is that we optimize the VO objective with respect to the mean parameterization of the Gaussian distribution, m := {µ, µµᵀ + Σ}, instead of the parameterization η := {µ, Σ}. The two modifications give us the following problem:

    m_{t+1} = argmin_m ⟨m, ∇̂_m L_t⟩ + (1 / β_t) D_KL[q ‖ q_t],    (6)

where q := q(θ|m), q_t := q(θ|m_t), and D_KL[q ‖ q_t] = E_q[log(q / q_t)] denotes the KL divergence. The convergence of this procedure is guaranteed under mild conditions (Ghadimi et al., 2014). The solution to this optimization problem is given by (proof in Appendix A):

    µ_{t+1} = µ_t − β_t Σ_{t+1} ∇̂_µ L_t,    Σ_{t+1}^{-1} = Σ_t^{-1} + 2 β_t ∇̂_Σ L_t.    (7)

The VAN update above corresponds to that of a second-order method. To see this connection, we rewrite the gradients in the update by using the following identities (Opper and Archambeau, 2009):

    ∇_µ E_q[f(θ)] = E_q[∇_θ f(θ)],    ∇_Σ E_q[f(θ)] = (1/2) E_q[∇²_θθ f(θ)].    (8)

Substituting these in (7) and expressing the update in terms of the precision matrix P_t := Σ_t^{-1}, we get the following updates:

    VAN:  µ_{t+1} = µ_t − β_t P_{t+1}^{-1} E_{q_t}[∇_θ f(θ)],    P_{t+1} = P_t + β_t E_{q_t}[∇²_θθ f(θ)],    (9)

where q_t := N(θ|µ_t, Σ_t). Compare this to the update of Newton's method:

    Newton's method:  θ_{t+1} = θ_t − ρ_t [∇²_θθ f(θ_t)]^{-1} ∇_θ f(θ_t).    (10)

While Newton's method scales the gradient by the inverse Hessian to obtain a step direction, VAN scales the averaged gradient by the inverse of the precision matrix P_{t+1}, which contains a weighted sum of past averaged Hessians. Both the averaged gradients and the averaged Hessians are computed at locations sampled from q_t. The VAN update therefore uses second-order information but, unlike Newton's method, this information is averaged over samples from the explorative distribution q and summed over past iterations.

4 Variants of VAN and Their Connections to Existing Methods

A Large-Scale Variant: For problems with a large number of parameters, computing the full Hessian matrix might be infeasible. By using a mean-field approximation q(θ|η) = Π_{d=1}^{D} N(θ_d|µ_d, σ_d²), however, we obtain a method that uses a diagonal approximation of the Hessian instead of the full Hessian matrix in its updates. Denoting the precision parameters by s_d = 1/σ_d², and the vector containing them by s, the diagonal version of VAN, which we call VAN-D, can be written as follows:

    VAN-D:  µ_{t+1} = µ_t − β_t diag(s_{t+1})^{-1} E_{q_t}[∇_θ f(θ)],    s_{t+1} = s_t + β_t E_{q_t}[h(θ)],    (11)

where diag(s) is a diagonal matrix containing the vector s as its diagonal and h(θ) is the diagonal of the Hessian ∇²_θθ f(θ). The diagonal Hessian can be computed using the reparameterization trick.

Connections to AdaGrad: Instead of using the Hessian, we can use a Gauss-Newton variant of VAN-D where we replace the Hessian entries ∇²_{θ_i θ_i} f(θ) by the squared gradients [∇_{θ_i} f(θ)]². The resulting updates are very similar to those of the adaptive-gradient method AdaGrad (Duchi et al., 2011). The main difference is that our method uses averaged gradients instead of the gradient at a single point. VAN can also be seen as a generalization of an adaptive method called AROW (Crammer et al., 2009). AROW uses a mirror-descent algorithm similar to ours, but has been applied only to SVMs.
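To make updates (9) and (11) concrete, here is a minimal Python sketch of the diagonal variant VAN-D on a logistic-regression loss. The loss, step size β, sample count, and initialization are illustrative assumptions rather than the paper's experimental settings, and exact per-sample gradients and diagonal Hessians are used in place of mini-batch stochastic estimates.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_grad_and_diag_hess(theta, X, y):
    """Gradient and diagonal Hessian of f(theta) = sum_i log(1 + exp(-y_i x_i^T theta)), y_i in {-1, +1}."""
    p = sigmoid(X @ theta)                    # P(y = +1 | x_i, theta)
    grad = X.T @ (p - (y + 1) / 2)            # gradient of the logistic loss
    diag_hess = (X ** 2).T @ (p * (1 - p))    # diagonal of sum_i p_i (1 - p_i) x_i x_i^T
    return grad, diag_hess

def van_d(X, y, beta=0.1, iters=200, samples=10, seed=0):
    """VAN-D (Eq. 11): mean-field VAN with mean mu and diagonal precision vector s."""
    rng = np.random.default_rng(seed)
    D = X.shape[1]
    mu, s = np.zeros(D), np.ones(D)           # q_t(theta) = N(mu, diag(1/s))
    for _ in range(iters):
        thetas = mu + rng.standard_normal((samples, D)) / np.sqrt(s)  # draw theta ~ q_t
        g_bar, h_bar = np.zeros(D), np.zeros(D)
        for th in thetas:                     # average gradients and diagonal Hessians over q_t
            g, h = logistic_grad_and_diag_hess(th, X, y)
            g_bar += g / samples
            h_bar += h / samples
        s = s + beta * h_bar                  # s_{t+1} = s_t + beta_t E_{q_t}[h(theta)]
        mu = mu - beta * g_bar / s            # mu_{t+1} = mu_t - beta_t diag(s_{t+1})^{-1} E_{q_t}[grad f]
    return mu, s                              # mu: point estimate; 1/s: per-dimension exploration variance

# Example usage on synthetic data:
# rng = np.random.default_rng(1); X = rng.standard_normal((500, 5))
# w = rng.standard_normal(5); y = np.sign(X @ w + 0.1 * rng.standard_normal(500))
# mu, s = van_d(X, y)
```

Replacing h element-wise by the squared gradient in the precision update turns this sketch into the Gauss-Newton variant whose updates resemble AdaGrad, as noted above.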
Connections to Variational Inference (VI): VI is a special type of VO problem where the function is f(θ) := −log[p(y, θ)/q(θ|η)], with p(y, θ) being the joint distribution and q the variational approximation to the posterior distribution. Our approach is an extension of a recent approach for VI by Khan and Lin (2017) called conjugate-computation variational inference (CVI). VAN is a generalization of CVI to a general function f(θ). A direct consequence of our theoretical result from the previous section is that CVI, just like VAN, is also a second-order method.
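For concreteness (this short derivation is standard and not stated in the extended abstract), plugging this choice of f into the VO objective (2) recovers the usual negative evidence lower bound:

    L(η) = E_{q(θ|η)}[ log q(θ|η) − log p(y, θ) ] = −ELBO(η) = D_KL[ q(θ|η) ‖ p(θ|y) ] − log p(y),

so minimizing L(η) over η is exactly variational inference: the KL term is driven down while log p(y) is a constant.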

Figure 1: (a) Left: Lasso regression on the YearPredictionMSD dataset (N ≈ 515K, D = 90) compared to the Iterative-Ridge method; stochastic VAN (sVAN) with small mini-batches converges faster than the batch methods. Middle: batch VAN versus Newton's method for logistic regression on the USPS 3vs5 dataset (N = 1540, D = 256); VAN converges at the same rate as Newton's method. Right: the same comparison for stochastic VAN and its diagonal version (sVAN-D) with small mini-batches, where we observe convergence similar to AdaGrad. Both of our methods use samples to compute the averaged gradients and Hessians. (b) The top plot shows the function f(θ) = sinc(θ); the second plot shows the VO objective L(µ, σ) = E_q[f(θ)] for a Gaussian q = N(θ|µ, σ²), with red points and arrows marking the iterations of VAN; the bottom plot shows the progression of q, where darker curves indicate later iterations. (c) The top plot shows results for active learning on logistic regression: sVAN with a maximum-entropy acquisition function reaches a good test error much more quickly than sVAN without active learning. The bottom plot shows results for parameter-based exploration in reinforcement learning with the deep deterministic policy gradient method on HalfCheetah (Silver et al., 2014): the sVAN-D variants achieve better rewards than AROW, V-SGD, and SGD.

VAN as a natural-gradient method: VAN performs natural-gradient updates in the space of Gaussian distributions. This can be shown by using a recent result of Raskutti and Mukherjee (2015). Our approach, however, is much simpler than existing methods such as Natural Evolution Strategies (NES) (Wierstra et al., 2008), which apply natural-gradient descent directly in the space of µ and Σ. Unfortunately, this results in an infeasible algorithm since the Fisher information matrix has O(D⁴) parameters. NES requires sophisticated reparameterization to solve this issue. An advantage of VAN over NES is that it naturally has O(D²) parameters and its updates are therefore much simpler.

Results: Our experimental results, shown in Fig. 1(a) and (c), reveal that VAN and its variants perform well for supervised learning with Lasso and logistic regression, active learning for logistic regression, and a reinforcement-learning task with parameter-based exploration. Figure 1(b) shows an example where VAN can avoid local minima through exploration. These results demonstrate the generality of our method and suggest that it can be used to solve difficult problems such as deep reinforcement learning.

References

Amari, S.-I. (1998). Natural gradient works efficiently in learning. Neural Computation, 10(2):251-276.

Brochu, E., Cora, V. M., and De Freitas, N. (2010). A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv preprint arXiv:1012.2599.

Crammer, K., Kulesza, A., and Dredze, M. (2009). Adaptive regularization of weight vectors. In Advances in Neural Information Processing Systems.

Duchi, J., Hazan, E., and Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. The Journal of Machine Learning Research, 12:2121-2159.

Fishel, J. A. and Loeb, G. E. (2012). Bayesian exploration for intelligent identification of textures. Frontiers in Neurorobotics, 6.

Gal, Y., Islam, R., and Ghahramani, Z. (2017). Deep Bayesian active learning with image data. arXiv e-prints.

Ghadimi, S., Lan, G., and Zhang, H. (2014). Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization. Mathematical Programming.

Hansen, N. and Ostermeier, A. (2001). Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 9(2):159-195.

Houlsby, N., Huszár, F., Ghahramani, Z., and Lengyel, M. (2011). Bayesian active learning for classification and preference learning. arXiv e-prints.

Khan, M. E. and Lin, W. (2017). Conjugate-computation variational inference: Converting variational inference in non-conjugate models to inferences in conjugate models. In Artificial Intelligence and Statistics (AISTATS).

Kingma, D. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

Opper, M. and Archambeau, C. (2009). The variational Gaussian approximation revisited. Neural Computation, 21(3):786-792.

Perfors, A., Tenenbaum, J. B., Griffiths, T. L., and Xu, F. (2011). A tutorial introduction to Bayesian models of cognitive development. Cognition, 120(3):302-321.

Raskutti, G. and Mukherjee, S. (2015). The information geometry of mirror descent. IEEE Transactions on Information Theory, 61(3):1451-1457.

Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., and Riedmiller, M. A. (2014). Deterministic policy gradient algorithms. In Proceedings of the 31st International Conference on Machine Learning (ICML), Beijing, China.

Staines, J. and Barber, D. (2012). Variational optimization. arXiv e-prints.

Strens, M. (2000). A Bayesian framework for reinforcement learning. In Proceedings of the Seventeenth International Conference on Machine Learning (ICML).

Wierstra, D., Schaul, T., Peters, J., and Schmidhuber, J. (2008). Natural evolution strategies. In IEEE Congress on Evolutionary Computation (CEC), IEEE World Congress on Computational Intelligence.

Williams, R. J. (1992). Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8(3-4):229-256.

Wyatt, J. (1998). Exploration and inference in learning from reinforcement.

A Derivation of VAN

Denote the mean parameters of q_t(θ) by m_t, which are equal to the expected value of the sufficient statistics φ(θ), i.e., m_t := E_{q_t}[φ(θ)]. The mirror-descent update at iteration t is given by the solution to

    m_{t+1} = argmin_m ⟨m, ∇̂_m L_t⟩ + (1 / β_t) D_KL[q ‖ q_t]    (12)
            = argmin_m E_q[ ⟨φ(θ), ∇̂_m L_t⟩ + (1 / β_t) log(q / q_t) ]
            = argmin_m (1 / β_t) E_q[ log( q / ( q_t exp(−β_t ⟨φ(θ), ∇̂_m L_t⟩) ) ) ]
            = argmin_m (1 / β_t) D_KL[ q ‖ q_t exp(−β_t ⟨φ(θ), ∇̂_m L_t⟩) / Z_t ],    (17)

where Z_t is the normalizing constant of the distribution in the denominator, which is a function of the gradient and the step size. Minimizing this KL divergence gives the update

    q_{t+1}(θ) ∝ q_t(θ) exp( −β_t ⟨φ(θ), ∇̂_m L_t⟩ ).    (18)

Rewriting this, we see that we get an update of the natural parameters λ_t of q_t(θ), i.e.,

    λ_{t+1} = λ_t − β_t ∇̂_m L_t.    (19)

Recalling that the mean parameters of a Gaussian q(θ) = N(θ|µ, Σ) are m^(1) = µ and M^(2) = Σ + µµᵀ, and using the chain rule, we can express the gradient ∇̂_m L_t in terms of µ and Σ:

    ∇_{m^(1)} L = ∇_µ L − 2 [∇_Σ L] µ,    (20)
    ∇_{M^(2)} L = ∇_Σ L.    (21)

Finally, recalling that the natural parameters of a Gaussian q(θ) = N(θ|µ, Σ) are λ^(1) = Σ^{-1} µ and λ^(2) = −(1/2) Σ^{-1}, we can rewrite the VAN updates in terms of µ and Σ:

    Σ_{t+1}^{-1} = Σ_t^{-1} + 2 β_t ∇̂_Σ L_t,    (22)
    µ_{t+1} = Σ_{t+1} [ Σ_t^{-1} µ_t − β_t ( ∇̂_µ L_t − 2 [∇̂_Σ L_t] µ_t ) ]    (23)
            = Σ_{t+1} ( Σ_t^{-1} + 2 β_t ∇̂_Σ L_t ) µ_t − β_t Σ_{t+1} ∇̂_µ L_t    (24)
            = µ_t − β_t Σ_{t+1} ∇̂_µ L_t.    (25)
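As a sanity check of the derivation above (not part of the paper), the following short Python script verifies numerically that a single natural-parameter step (19), combined with the chain rule (20)-(21), reproduces the closed-form updates (22) and (25) for an arbitrary Gaussian and arbitrary symmetric gradient stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
D, beta = 3, 0.05
mu = rng.standard_normal(D)
A = rng.standard_normal((D, D)); Sigma = A @ A.T + np.eye(D)    # a valid (SPD) covariance
g_mu = rng.standard_normal(D)                                   # stand-in for grad_mu L_t
B = rng.standard_normal((D, D)); g_Sig = 0.5 * (B + B.T)        # stand-in for grad_Sigma L_t (symmetric)

P = np.linalg.inv(Sigma)                                        # precision matrix
lam1, lam2 = P @ mu, -0.5 * P                                   # natural parameters of N(mu, Sigma)
g_m1, g_M2 = g_mu - 2 * g_Sig @ mu, g_Sig                       # mean-parameter gradients, Eqs. (20)-(21)

lam1_new, lam2_new = lam1 - beta * g_m1, lam2 - beta * g_M2     # natural-parameter step, Eq. (19)
P_new = -2 * lam2_new
Sigma_new = np.linalg.inv(P_new)
mu_new = Sigma_new @ lam1_new

assert np.allclose(P_new, P + 2 * beta * g_Sig)                 # matches Eq. (22)
assert np.allclose(mu_new, mu - beta * Sigma_new @ g_mu)        # matches Eq. (25)
print("natural-parameter step reproduces the VAN updates")
```

The check only confirms the algebra; in an actual run, β_t must be small enough that the updated precision matrix stays positive definite.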
