Regression, Curve Fitting and Optimisation
- Sylvia Jones
1 Supervised by Elena Zanini STOR-i, University of Lancaster 4 September 2015
2 Outline: Introduction; Root Finding; Simulated Annealing; The Rosenbrock Banana Function.
3 Given a set of data, what is the optimum curve which may be fitted? This question has obvious importance in queries regarding relationships between two or more variables, as well as explaining data quantitatively.
4 If a straight line is needed, we can do the standard trick of using Ordinary Least Squares (OLS). However, there will be situations in which this may not be appropriate.
5 Some Less Trivial Examples [Figure: three plots of y against x.]
6 We observe that the OLS inference arises from an optimisation problem, namely $\arg\min_{b \in \mathbb{R}^p} \lVert Y - Xb \rVert^2$. So it makes sense to think about the problem of optimal curve fitting from the perspective of optimisation.
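To make the link between OLS and optimisation concrete, here is a minimal sketch for the one-predictor case, where the minimiser of the residual sum of squares has a closed form. The function names and the small dataset are made up for illustration and do not come from the slides.

```python
def ols_line(xs, ys):
    """Return (intercept, slope) minimising sum((y - a - b*x)**2)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def rss(xs, ys, intercept, slope):
    """Residual sum of squares: the objective that OLS minimises."""
    return sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, roughly following y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]
a_hat, b_hat = ols_line(xs, ys)
print(a_hat, b_hat, rss(xs, ys, a_hat, b_hat))
```

Perturbing either coefficient away from the fitted values can only increase `rss`, which is the sense in which the fitted line is the argmin.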
7 Optimisation has an obvious analogue in root finding: a minimum of a differentiable function is a root of its derivative. There are several core methods we can use for this: Bisection; Newton-Raphson; Secant; Muller's. All of these (except Newton-Raphson) are derivative-free.
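Two of the derivative-free root finders named above can be sketched in a few lines. This is an illustrative implementation rather than the code used in the project; the tolerances, iteration caps and test function are assumptions.

```python
def bisection(f, lo, hi, tol=1e-10, max_iter=200):
    """Find a root of f in [lo, hi], assuming f changes sign on the interval."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0 or 0.5 * (hi - lo) < tol:
            return mid
        # Keep the half-interval on which the sign change occurs.
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """Secant iteration: Newton-Raphson with the derivative replaced by a finite difference."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:
            break  # flat secant line, cannot take another step
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) < tol:
            return x2
        x0, f0, x1, f1 = x1, f1, x2, f(x2)
    return x1


def g(x):
    return x * x - 2.0


root_bisection = bisection(g, 1.0, 2.0)
root_secant = secant(g, 1.0, 2.0)
print(root_bisection, root_secant)
```

To use either routine for optimisation, one would pass the derivative of the objective as `f`, so that the returned root is a stationary point.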
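8 In higher dimensions, one of the more effective derivative-based methods is the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method. It builds on the Newton iteration, which is adapted to optimisation by changing the iterative equation to $x_{n+1} = x_n - [Hf(x_n)]^{-1} \nabla f(x_n)$, where $Hf$ is the Hessian of $f$; BFGS replaces the Hessian with an approximation built up from successive gradient evaluations.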
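The iterative equation above can be illustrated with a small two-dimensional example. Note that this sketch is the plain Newton iteration with an exact Hessian, not BFGS itself; a BFGS implementation would swap `hess` for an approximation updated at each step. The objective $f(x, y) = e^x - x + y^2$ is an assumption chosen because its gradient and Hessian are easy to write down.

```python
import math


def newton_minimise(grad, hess, start, tol=1e-10, max_iter=50):
    """Iterate x_{n+1} = x_n - H^{-1} grad for a function of two variables."""
    x, y = start
    for _ in range(max_iter):
        g1, g2 = grad(x, y)
        (h11, h12), (h21, h22) = hess(x, y)
        det = h11 * h22 - h12 * h21
        if det == 0:
            raise ZeroDivisionError("singular Hessian")
        # Solve H * step = gradient for the 2x2 case (Cramer's rule).
        s1 = (h22 * g1 - h12 * g2) / det
        s2 = (h11 * g2 - h21 * g1) / det
        x, y = x - s1, y - s2
        if math.hypot(s1, s2) < tol:
            break
    return x, y


# Illustrative objective f(x, y) = exp(x) - x + y**2, minimised at (0, 0).
def grad(x, y):
    return math.exp(x) - 1.0, 2.0 * y


def hess(x, y):
    return (math.exp(x), 0.0), (0.0, 2.0)


x_min, y_min = newton_minimise(grad, hess, (1.0, 1.0))
print(x_min, y_min)
```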
9 The Nelder-Mead Method. Suppose our goal is to minimise the function $f(x)$, where $x \in \mathbb{R}^n$.
10 Start with $n + 1$ test points: $x_1, \dots, x_{n+1}$.
11 Order these points by output value, so that $f(x_1) \le f(x_2) \le \dots \le f(x_{n+1})$. [Figure: a simplex with vertices $x_1$, $x_2$, $x_3$.]
12 We consider several different candidate points (reflection, expansion and contraction of the worst point relative to the centroid of the others), and if these aren't an improvement, then we shrink the simplex.
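The steps on the preceding slides can be put together as follows. This is a simplified sketch rather than the project's implementation: it uses the conventional coefficients (reflection 1, expansion 2, contraction 0.5, shrink 0.5), only an inside contraction, and an axis-aligned initial simplex, all of which are assumptions.

```python
def nelder_mead(f, start, step=0.5, f_tol=1e-10, x_tol=1e-8, max_iter=2000):
    """Minimise f over R^n with a basic Nelder-Mead simplex search."""
    n = len(start)
    # Initial simplex: the start point plus one offset point per coordinate.
    simplex = [list(start)]
    for i in range(n):
        p = list(start)
        p[i] += step
        simplex.append(p)
    values = [f(p) for p in simplex]

    for _ in range(max_iter):
        # Order the vertices so that f(x_1) <= ... <= f(x_{n+1}).
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        size = max(abs(p[j] - simplex[0][j]) for p in simplex[1:] for j in range(n))
        if values[-1] - values[0] < f_tol and size < x_tol:
            break

        # Centroid of every vertex except the worst one.
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]
        worst = simplex[-1]

        def along(t):
            # Point centroid + t * (centroid - worst).
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        f_r = f(reflected)
        if f_r < values[0]:
            # Reflection is the best so far: try going further.
            expanded = along(2.0)
            f_e = f(expanded)
            if f_e < f_r:
                simplex[-1], values[-1] = expanded, f_e
            else:
                simplex[-1], values[-1] = reflected, f_r
        elif f_r < values[-2]:
            simplex[-1], values[-1] = reflected, f_r
        else:
            contracted = along(-0.5)
            f_c = f(contracted)
            if f_c < values[-1]:
                simplex[-1], values[-1] = contracted, f_c
            else:
                # No candidate improved on the worst point: shrink towards the best.
                best = simplex[0]
                simplex = [best] + [
                    [b + 0.5 * (v - b) for b, v in zip(best, p)] for p in simplex[1:]
                ]
                values = [values[0]] + [f(p) for p in simplex[1:]]

    best_index = min(range(n + 1), key=lambda i: values[i])
    return simplex[best_index], values[best_index]


def bowl(p):
    # Illustrative objective with its minimum at (1, -2).
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best_point, best_value = nelder_mead(bowl, [0.0, 0.0])
print(best_point, best_value)
```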
13 How well does this work on the problem? [Figure: three panels plotted on xrange and yrange axes.]
14 Disadvantages of Nelder-Mead: We usually need a reasonable idea of the form of the relationship between the two variables in question to produce a sensible final plot; If the data do not conform well to the true underlying relationship, the procedure can be very costly, and it could arrive at an incorrect answer if the initial conditions are poorly specified.
15 Several alternative methods of optimisation can be used which employ a probabilistic approach. These include: Simulated Annealing; Genetic Algorithms; Ant Colony Optimisation.
16 Simulated Annealing (SA) is modelled on annealing, the physical process of cooling a material in a system with a controlled negative temperature gradient. It can be observed that when a substance such as water cools in such a system, an optimal solid arrangement is obtained.
17 How SA works. To use Simulated Annealing in an optimisation problem, the following need to be well defined: The neighbours of each state, e.g. for a discrete domain, a rearrangement of two adjacent states; The energy of each state; The probability of moving from state $S$ (energy $E$) to state $S'$ (energy $E'$), with states of smaller energy preferred, so $P(E, E', T) > P(E', E, T)$ when $E' < E$.
18 How SA works. In the problem of curve fitting: We shall define a neighbour of the current curve as the current curve plus a small, simple function; The probabilities shall be set as follows, where $E$ is the current energy and $E'$ the energy of the neighbour: If $E < E'$, then $P(E, E', T) \propto \exp\left(\frac{E - E'}{T}\right)$; Else, $P(E, E', T) \propto 1$.
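A minimal sketch of this acceptance rule inside an annealing loop is given below. It is applied to a one-dimensional toy energy rather than to curve fitting, and the geometric cooling schedule, step size, starting temperature and seed are all assumptions made for illustration.

```python
import math
import random


def acceptance_probability(e_current, e_new, temperature):
    """Always accept a move to lower energy; accept a worse move with probability exp((E - E') / T)."""
    if e_new <= e_current:
        return 1.0
    return math.exp((e_current - e_new) / temperature)


def simulated_annealing(energy, start, neighbour, t_start=1.0, cooling=0.995,
                        n_steps=5000, rng=None):
    """Return the lowest-energy state visited and its energy."""
    rng = rng or random.Random(0)
    state, e_state = start, energy(start)
    best, e_best = state, e_state
    temperature = t_start
    for _ in range(n_steps):
        candidate = neighbour(state, rng)
        e_candidate = energy(candidate)
        if rng.random() < acceptance_probability(e_state, e_candidate, temperature):
            state, e_state = candidate, e_candidate
            if e_state < e_best:
                best, e_best = state, e_state
        temperature *= cooling  # geometric cooling (an assumed schedule)
    return best, e_best


def energy(x):
    # Toy energy with its minimum at x = 2.
    return (x - 2.0) ** 2


def neighbour(x, rng):
    # A neighbour is the current state plus a small random perturbation.
    return x + rng.uniform(-0.5, 0.5)


best_x, best_e = simulated_annealing(energy, 0.0, neighbour)
print(best_x, best_e)
```

Changing `t_start` or `cooling` alters how often worse moves are accepted early on, which is one way to see the sensitivity to starting temperature in practice.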
19 How well does this work on the problem? [Figure: two panels plotted on xrange and yrange axes.]
20 Disadvantages of SA: It often requires a high starting temperature to achieve a reasonable result; The model is very sensitive to the starting temperature, and the choice is not obvious; It is very difficult to achieve a fairly accurate solution, as it is difficult to construct well-defined neighbours which enable effective zeroing in on a state in a continuous domain.
21 Suppose we had no intuition at all as to an underlying relationship, such as in the example shown below. [Figure: plot of y against x.]
22 One way of tackling the problem of curve fitting in this instance is to give each point an associated reward function, with shape similar to a hillock.
23 A reward function found to be useful is $f(r) = k_d \, e^{-r^{0.55}}$, where $k_d$ is a constant depending on the datapoint $d$ and $r$ is the Euclidean distance from the datapoint. The constant $k_d$ can be taken to be exponential in a datapoint-specific quantity $D_d$.
24 A total reward function is then constructed by summing all the reward functions, and this can then be optimised through a blind search for the curve that maximises the total reward. [Figure: size of second largest city proper by population size (millions) against size of largest city proper by population size (millions).]
25 Disadvantages of this approach: The model is prone to overfitting; Additional methodology may therefore be needed, such as Cross-Validation or Akaike's Information Criterion; Depending on the initial weighting, the resultant optimal curve can favour the OLS line.
26 All these methods were tried on a series of standard test functions before moving on to a real-life application.
27 There are several functions which are notoriously tricky to optimise numerically. These were used to test the robustness of the algorithms involved. Some examples include: The Rosenbrock Banana Function; Five-Uneven-Peak Trap; Equal Maxima; Uneven Decreasing Maxima.
28 The Rosenbrock Banana Function. This function takes the form $f(x, y) = (a - x)^2 + b(y - x^2)^2$, for some $a$, $b$.
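As a quick check on the formula, the function and its gradient can be written out directly. The slide leaves $a$ and $b$ unspecified; the defaults $a = 1$, $b = 100$ below are a commonly used choice and are an assumption here. For $b > 0$ the global minimum is at $(a, a^2)$, where $f = 0$.

```python
def rosenbrock(x, y, a=1.0, b=100.0):
    """f(x, y) = (a - x)^2 + b * (y - x^2)^2."""
    return (a - x) ** 2 + b * (y - x * x) ** 2


def rosenbrock_gradient(x, y, a=1.0, b=100.0):
    """Partial derivatives of the Rosenbrock function with respect to x and y."""
    df_dx = -2.0 * (a - x) - 4.0 * b * x * (y - x * x)
    df_dy = 2.0 * b * (y - x * x)
    return df_dx, df_dy


print(rosenbrock(1.0, 1.0), rosenbrock_gradient(1.0, 1.0))
print(rosenbrock(0.0, 0.0), rosenbrock_gradient(0.0, 0.0))
```

The minimum sits in a long, curved, nearly flat valley along $y = x^2$, which is what makes the function a demanding test for optimisers.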
29 Extreme Value Theory, A Brief Background. One way of defining extreme events is to define a threshold, and anything exceeding this threshold is classed as extreme. This gives rise to the Generalised Pareto Distribution (GPD), whose likelihood is
$L(\sigma, \xi) = \frac{1}{\sigma^k} \prod_{i=1}^{k} \left(1 + \frac{\xi y_i}{\sigma}\right)^{-\left(1 + \frac{1}{\xi}\right)}, \quad \xi \neq 0,$
$L(\sigma, \xi) = \frac{1}{\sigma^k} \prod_{i=1}^{k} \exp\left(-\frac{y_i}{\sigma}\right), \quad \xi = 0;$
where $k$ is the number of datapoints exceeding the threshold, $\xi$ is the shape and $\sigma$ is the scale.
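In practice the likelihood above is maximised by minimising the negative log-likelihood. The sketch below assumes that the $y_i$ are the excesses over the threshold (observation minus threshold), which the slide does not state explicitly, and returns infinity outside the parameter space so that a derivative-free optimiser such as Nelder-Mead simply rejects those points. The data values are hypothetical.

```python
import math


def gpd_negative_log_likelihood(sigma, xi, excesses):
    """Negative log-likelihood of the GPD with scale sigma and shape xi."""
    if sigma <= 0:
        return math.inf
    k = len(excesses)
    if abs(xi) < 1e-12:
        # Limiting exponential case, xi = 0.
        return k * math.log(sigma) + sum(excesses) / sigma
    total = k * math.log(sigma)
    for y in excesses:
        z = 1.0 + xi * y / sigma
        if z <= 0:
            return math.inf  # observation outside the support for these parameters
        total += (1.0 + 1.0 / xi) * math.log(z)
    return total


# Hypothetical excesses over a threshold.
excesses = [0.3, 1.2, 0.7, 2.5, 0.1]
print(gpd_negative_log_likelihood(1.0, 0.1, excesses))
```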
30 [Figure, left: difference in log of closures against days elapsed. Figure, right: amount of rainfall (mm) against day number.] The left figure corresponds to log-differences of daily closing prices between 1996 and [...]. The right figure shows daily rainfall accumulations in South West England between 1914 and 1962.
31 We use Nelder-Mead to fit the GPD and obtain estimates $\hat{\sigma}$ and $\hat{\xi}$ for each dataset. [Table: columns Dataset, Threshold, $\hat{\sigma}$, $\hat{\xi}$; rows Rain and Dow Jones.] (Candidate thresholds were chosen by observation using the mean residual life plot.) Other procedures, such as Simulated Annealing, proved to be less successful than Nelder-Mead at finding the MLEs.
32 Other Extreme Value Theory Machinery. There are several other things we can consider: An alternative and theoretically equivalent approach would be to use a Poisson Point Process (PPP) model; Sometimes the underlying process is more complicated, and covariates need to be added to the model. The first of these is still relatively straightforward using Nelder-Mead. Introducing covariates, however, is more complex, and will often result in convergence to a local optimum.
33 Conclusions. In general: Nelder-Mead remains a very effective algorithm for blind optimisation; SA should be preferred only if there is a strong intuition for a starting temperature. Pinning down a sensible starting value for the temperature may be a fruitful approach in further work; Computationally, gradient-free methods are preferred.
34 Conclusions. With respect to Extreme Value Theory: Nelder-Mead becomes highly sensitive to initial conditions in the covariate case; Investigating the application of SA and an effective choice of threshold may be of interest.
35 References
- Atkinson, K.E. (1989). An Introduction to Numerical Analysis. (Inference and background on deterministic algorithms.)
- Nocedal, J. & Wright, S.J. (2006). Numerical Optimization. (Higher-dimensional deterministic methods.)
- Reeves, C.R. (1995). Modern Heuristic Techniques for Combinatorial Problems. (Simulated Annealing reference.)
- Coles, S. (2004). An Introduction to Statistical Modeling of Extreme Values.