Constrained Optimization Approaches to Estimation of Structural Models: Comment
1 Constrained Optimization Approaches to Estimation of Structural Models: Comment
Fedor Iskhakov, University of New South Wales; Jinhyuk Lee, Ulsan National Institute of Science and Technology; John Rust, Georgetown University; Kyoungwon Seo, Korea Advanced Institute of Science and Technology; Bertel Schjerning, University of Copenhagen. January 2015
2 Su and Judd (Econometrica, 2012)
3 Death to NFXP? Su and Judd (Econometrica, 2012), Table II (p. 2228): numerical performance of NFXP and MPEC in the Monte Carlo experiments. For each β, the table compares MPEC/AMPL, MPEC/MATLAB, and NFXP on runs converged (out of 1250), CPU time (in sec.), number of major iterations, number of function evaluations, and number of contraction mapping iterations. For each β, five starting points are used for each of the 250 replications; CPU time and the iteration and evaluation counts are averages per run. (Numeric entries not recovered in transcription.)
4 The Nested Fixed Point Algorithm
NFXP solves the unconstrained optimization problem
max_θ L(θ, EV_θ)
Outer loop (hill-climbing algorithm): the likelihood function L(θ, EV_θ) is maximized w.r.t. θ with a quasi-Newton algorithm, usually BHHH, BFGS, or a combination. Each evaluation of L(θ, EV_θ) requires solution of EV_θ.
Inner loop (fixed point algorithm): the implicit function EV_θ defined by EV_θ = Γ_θ(EV_θ) is solved by successive approximations (SA) and Newton-Kantorovich (NK) iterations.
5 Mathematical Programming with Equilibrium Constraints
MPEC solves the constrained optimization problem
max_{θ,EV} L(θ, EV) subject to EV = Γ_θ(EV)
using general-purpose constrained optimization solvers such as KNITRO and CONOPT.
Su and Judd (Ecta 2012) consider two such implementations:
MPEC/AMPL: AMPL formulates the problem and passes it to KNITRO, supplying automatic differentiation (Jacobian and Hessian) and sparsity patterns for the Jacobian and Hessian.
MPEC/MATLAB: the user needs to supply Jacobians, Hessians, and sparsity patterns; Su and Judd do not supply analytical derivatives. ktrlink provides the link between MATLAB and the KNITRO solvers.
6 Zurcher's Bus Engine Replacement Problem
Choice set: each bus comes in for repair once a month and Zurcher chooses between ordinary maintenance (d_t = 0) and overhaul/engine replacement (d_t = 1).
State variables: Harold Zurcher observes
x_t: mileage at time t since last engine overhaul
ε_t = [ε_t(d_t = 0), ε_t(d_t = 1)]: other state variables
Utility function:
u(x_t, d_t, θ_u) + ε_t(d_t) = { −RC − c(0, θ_u) + ε_t(1) if d_t = 1; −c(x_t, θ_u) + ε_t(0) if d_t = 0 }   (1)
State variable process for x_t (mileage since last replacement):
p(x_{t+1} | x_t, d_t, θ_x) = { g(x_{t+1} − 0, θ_x) if d_t = 1; g(x_{t+1} − x_t, θ_x) if d_t = 0 }   (2)
If the engine is replaced, the state of the bus regenerates to x_t = 0.
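On a discretized mileage grid, the transition process (2) becomes an ordinary Markov matrix, with the replacement transition restarting every row from cell 0. A minimal NumPy sketch (illustrative only: the grid size and the increment probabilities standing in for θ_x are invented, not taken from the paper):

```python
import numpy as np

def transition_matrix(n, theta_x):
    """Keep-decision (d=0) mileage transitions on an n-cell grid.

    From cell i the bus moves up j cells with probability theta_x[j];
    probability mass that would pass the last cell is absorbed there.
    """
    P = np.zeros((n, n))
    for i in range(n):
        for j, pj in enumerate(theta_x):
            P[i, min(i + j, n - 1)] += pj
    return P

P = transition_matrix(5, [0.3, 0.5, 0.2])     # hypothetical increment probs
# Replacement (d=1) regenerates to x=0, so every row restarts from cell 0:
P_replace = np.tile(P[0], (5, 1))
```

Every row of both matrices sums to one, and the last grid cell is absorbing under d = 0.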
7 Zurcher's Bus Engine Replacement Problem
Zurcher's problem: maximize the expected sum of current and future discounted utilities, summarized by the Bellman equation:
V_θ(x_t, ε_t) = max_{d ∈ D(x_t)} [u(x_t, d, θ_u) + ε_t(d) + β E V_θ(x_{t+1}, ε_{t+1} | x_t, ε_t, d)]
Under (CI) and (XV) we can integrate out the unobserved state variables:
EV_θ(x, d) = Γ_θ(EV_θ)(x, d) = ∫_y ln ( Σ_{d′ ∈ D(y)} exp[u(y, d′; θ_u) + β EV_θ(y, d′)] ) p(dy | x, d, θ_x)
8 Structural Estimation
Econometric problem: given observations of mileage and replacement decisions for i = 1, ..., n buses observed over T_i time periods, (d_{i,t}, x_{i,t}), t = 1, ..., T_i and i = 1, ..., n, we wish to estimate the parameters θ = {θ_u, θ_x} using MLE.
Likelihood: under assumption (CI) the log-likelihood contribution l_i^f has the particularly simple form
l_i^f(θ) = Σ_{t=2}^{T_i} log P(d_{i,t} | x_{i,t}, θ) + Σ_{t=2}^{T_i} log p(x_{i,t} | x_{i,t−1}, d_{i,t−1}, θ_x) = l_i^1(θ) + l_i^2(θ_x)
where P(d_{i,t} | x_{i,t}, θ) is the choice probability given the observable state variable x_{i,t}.
9 Choice Probabilities and Expected Value Functions
Under the extreme value (XV) assumption, choice probabilities are multinomial logistic:
P(d | x, θ) = exp{u(x, d, θ_u) + β EV_θ(x, d)} / Σ_{j ∈ D(x)} exp{u(x, j, θ_u) + β EV_θ(x, j)}   (3)
The expected value function is the unique fixed point of the contraction mapping Γ_θ, defined by
EV_θ(x, d) = Γ_θ(EV_θ)(x, d) = ∫_y ln ( Σ_{d′ ∈ D(y)} exp[u(y, d′; θ_u) + β EV_θ(y, d′)] ) p(dy | x, d, θ_x)
Γ_θ is a contraction mapping with unique fixed point EV_θ, i.e.
‖Γ_θ(EV) − Γ_θ(W)‖ ≤ β ‖EV − W‖
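On a finite grid the operator Γ_θ is a few lines of code, and the contraction property can be checked numerically. A sketch in Python/NumPy rather than the authors' MATLAB, with invented utilities and transitions standing in for the model primitives:

```python
import numpy as np

beta = 0.95
n = 6
# toy keep-transition matrix: move up 0 or 1 cell with prob 1/2 each
P = np.diag(np.full(n, 0.5)) + np.diag(np.full(n - 1, 0.5), k=1)
P[-1, -1] = 1.0
u_keep = -0.2 * np.arange(n)         # toy -c(x): cost rising with mileage
u_replace = np.full(n, -3.0)         # toy -RC - c(0)

def gamma(ev):
    """Bellman operator: log-sum over keep/replace, then E over next mileage."""
    v_keep = u_keep + beta * ev
    v_repl = u_replace + beta * ev[0]   # replacement resets mileage to x = 0
    m = np.maximum(v_keep, v_repl)      # recentre before exponentiating
    return P @ (m + np.log(np.exp(v_keep - m) + np.exp(v_repl - m)))

# contraction property in the sup norm: ||Γ(EV) - Γ(W)|| <= β ||EV - W||
rng = np.random.default_rng(0)
ev, w = rng.normal(size=n), rng.normal(size=n)
lhs = np.max(np.abs(gamma(ev) - gamma(w)))
rhs = beta * np.max(np.abs(ev - w))
```

The inequality lhs ≤ rhs holds for any pair of candidate value functions, which is what makes successive approximations converge at rate β.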
10 Rust (Econometrica, 1987)
11 Death to NFXP? Su and Judd (Econometrica, 2012), Table II (p. 2228) revisited: numerical performance of NFXP and MPEC in the Monte Carlo experiments. (Numeric entries not recovered in transcription.)
12 How to do CPR
13 NFXP survival kit
Step 1: Read the NFXP manual and print out the NFXP pocket guide
Step 2: Solve for the fixed point using Newton iterations
Step 3: Recenter the Bellman equation
Step 4: Provide analytical gradients of the Bellman operator
Step 5: Provide analytical gradients of the likelihood
Step 6: Use BHHH (outer product of gradients as Hessian approximation)
If the NFXP heartbeat is still weak: read the NFXP pocket guide until help arrives!
14 STEP 1: NFXP documentation
Main references:
Rust (1987): Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold Zurcher, Econometrica
Rust (2000): Nested Fixed Point Algorithm Documentation Manual, Version 6, https://editorialexpress.com/jrust/nfxp.html
15 Nested Fixed Point Algorithm
NFXP Documentation Manual version 6 (Rust 2000, page 18): "Formally, one can view the nested fixed point algorithm as solving the following constrained optimization problem:
max_{θ,EV} L(θ, EV) subject to EV = Γ_θ(EV)   (4)
Since the contraction mapping Γ_θ always has a unique fixed point, the constraint EV = Γ_θ(EV) implies that the fixed point EV_θ is an implicit function of θ. Thus, the constrained optimization problem (4) reduces to the unconstrained optimization problem
max_θ L(θ, EV_θ)   (5)
where EV_θ is the implicit function defined by EV_θ = Γ_θ(EV_θ)."
16 NFXP pocket guide
17 STEP 2: Newton-Kantorovich Iterations
Problem: find the fixed point of the contraction mapping EV = Γ(EV).
Error bound on successive contraction iterations:
‖EV_{k+1} − EV‖ ≤ β ‖EV_k − EV‖
Linear convergence: slow when β is close to 1.
Newton-Kantorovich: solve (I − Γ)(EV_θ) = 0 using Newton's method:
‖EV_{k+1} − EV‖ ≤ A ‖EV_k − EV‖²
Quadratic convergence in a neighborhood of the fixed point EV.
18 STEP 2: Newton-Kantorovich Iterations
Newton-Kantorovich iteration:
EV_{k+1} = EV_k − (I − Γ′)^{−1} (I − Γ)(EV_k)
where I is the identity operator on B and 0 is the zero element of B (i.e. the zero function). The nonlinear operator I − Γ has a Fréchet derivative I − Γ′, which is a bounded linear operator on B with a bounded inverse.
The fixed point (poly-)algorithm:
1 Successive contraction iterations (until EV is in the domain of attraction)
2 Newton-Kantorovich iterations (until convergence)
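The two-stage poly-algorithm can be sketched compactly on a toy grid. This is an illustrative Python/NumPy version, not the authors' MATLAB implementation; the grid size, utilities, and transitions are invented:

```python
import numpy as np

beta = 0.9999                          # discount factor close to 1
n = 6
P = np.diag(np.full(n, 0.5)) + np.diag(np.full(n - 1, 0.5), k=1)
P[-1, -1] = 1.0                        # toy keep-transition matrix
u_keep = -0.2 * np.arange(n)           # toy -c(x)
u_replace = np.full(n, -3.0)           # toy -RC - c(0)

def gamma(ev):
    """Bellman operator: recentred log-sum over keep/replace, then E over x'."""
    v_k = u_keep + beta * ev
    v_r = u_replace + beta * ev[0]     # replacement regenerates to x = 0
    m = np.maximum(v_k, v_r)
    return P @ (m + np.log(np.exp(v_k - m) + np.exp(v_r - m)))

def gamma_prime(ev):
    """Fréchet derivative: beta times the controlled transition matrix."""
    v_k = u_keep + beta * ev
    v_r = u_replace + beta * ev[0]
    p_keep = 1.0 / (1.0 + np.exp(v_r - v_k))   # P(keep | x)
    M = np.diag(p_keep)
    M[:, 0] += 1.0 - p_keep            # replacing jumps to state 0
    return beta * (P @ M)

ev = np.zeros(n)
for _ in range(20):                    # stage 1: successive approximations
    ev = gamma(ev)
for _ in range(15):                    # stage 2: Newton-Kantorovich
    step = np.linalg.solve(np.eye(n) - gamma_prime(ev), ev - gamma(ev))
    ev = ev - step
    if np.max(np.abs(step)) < 1e-10:
        break
resid = np.max(np.abs(ev - gamma(ev)))
```

With β this close to 1, successive approximations alone would need on the order of 10^5 iterations; the NK stage finishes in a handful.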
19 STEP 2: Newton-Kantorovich Iterations
Successive approximations are VERY slow. [Figure: convergence of the fixed point algorithm for β ∈ {0.9, 0.95, 0.99, 0.999, ...}; tolerance plotted against iteration count.]
20 STEP 2: Newton-Kantorovich Iterations, β close to 1
Successive approximations alone are VERY slow. [Solver output: the contraction iterations print the iteration count j, the tolerance tol(j), and the ratio tol(j)/tol(j−1); the subsequent Newton-Kantorovich iterations drive the tolerance to order 10^{−13} and convergence is achieved.]
21 STEP 2: Newton-Kantorovich Iterations, β close to 1
Quadratic convergence! [Solver output: after a few contraction iterations, the Newton-Kantorovich tolerance falls to order 10^{−12} within four iterations and convergence is achieved.]
22 STEP 2: When to switch to Newton-Kantorovich
Observation: tol_k = ‖EV_{k+1} − EV_k‖ ≤ β ‖EV_k − EV_{k−1}‖, so tol_k quickly slows down and declines very slowly for β close to 1. The relative tolerance tol_{k+1}/tol_k approaches β.
When to switch to Newton-Kantorovich? Suppose that EV_0 = EV + k (the initial EV_0 equals the fixed point EV plus an arbitrary constant k). Another successive approximation does not solve this:
tol_0 = ‖EV_0 − Γ(EV_0)‖ = ‖EV + k − Γ(EV + k)‖ = ‖EV + k − (EV + βk)‖ = (1 − β)k
tol_1 = ‖EV_1 − Γ(EV_1)‖ = ‖EV + βk − Γ(EV + βk)‖ = ‖EV + βk − (EV + β²k)‖ = β(1 − β)k
so tol_1/tol_0 = β.
Newton will immediately strip away the irrelevant constant k.
Switch to Newton whenever tol_{k+1}/tol_k is sufficiently close to β.
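The switching rule is easy to demonstrate on a toy scalar contraction that has exactly the "fixed point plus constant" structure above (a sketch; the specific map and tolerances are invented for illustration):

```python
# Toy scalar contraction Γ(ev) = β·ev + c with fixed point ev* = c/(1-β).
# Starting anywhere is ev* plus a constant, so every tolerance ratio equals β.
beta = 0.999
c = 1.0

def gamma(ev):
    return beta * ev + c

ev = 0.0
tol_prev = None
it = 0
while True:
    it += 1
    ev_new = gamma(ev)
    tol = abs(ev_new - ev)
    ev = ev_new
    if tol_prev is not None and tol / tol_prev > beta - 1e-6:
        break                 # ratio has reached β: stop contracting
    tol_prev = tol

# one Newton step on F(ev) = ev - Γ(ev) strips the constant away exactly:
ev = ev - (ev - gamma(ev)) / (1.0 - beta)
```

The loop stops after two contraction steps (the ratio is already β), and a single Newton step lands on ev* = c/(1 − β) = 1000, where successive approximations would have needed thousands of iterations.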
23 STEP 3: Recenter to ensure numerical stability
Logit formulas must be recentered:
P_i = exp(v_i) / Σ_{j ∈ D(y)} exp(v_j) = exp(v_i − V_0) / Σ_{j ∈ D(y)} exp(v_j − V_0)
and the log-sum must be recentered too:
EV_θ = ∫_y ln ( Σ_{j ∈ D(y)} exp(v_j) ) p(dy | x, d, θ_x) = ∫_y [ V_0 + ln ( Σ_{j ∈ D(y)} exp(v_j − V_0) ) ] p(dy | x, d, θ_x)
If V_0 is chosen to be V_0 = max_j v_j, we avoid numerical instability due to overflow/underflow.
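The recentred formulas above are the standard log-sum-exp trick; a minimal sketch showing that they stay finite where the naive versions overflow:

```python
import numpy as np

def logsum(v):
    """Recentred log-sum: V0 + log Σ_j exp(v_j - V0), with V0 = max_j v_j."""
    v0 = np.max(v)
    return v0 + np.log(np.sum(np.exp(v - v0)))

def choice_probs(v):
    """Recentred logit: P_i = exp(v_i - V0) / Σ_j exp(v_j - V0)."""
    e = np.exp(v - np.max(v))
    return e / e.sum()

v = np.array([1000.0, 1001.0])   # naive exp(1000) would overflow to inf
p = choice_probs(v)
ls = logsum(v)
```

Subtracting V_0 leaves every exponent at most zero, so nothing overflows, and the largest alternative always contributes exp(0) = 1, so the log never sees zero.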
24 STEP 4: Fréchet derivative of the Bellman operator
For the NK iteration EV_{k+1} = EV_k − (I − Γ′)^{−1}(I − Γ)(EV_k) we need the Fréchet derivative Γ′.
In terms of its finite-dimensional approximation, Γ′_θ takes the form of an N × N matrix equal to the partial derivatives of the N × 1 vector Γ_θ(EV_θ) with respect to the N × 1 vector EV_θ.
Γ′_θ is simply β times the transition probability matrix for the controlled process {d_t, x_t}: two lines of code in MATLAB.
25 STEP 5: Provide analytical gradients of the likelihood
The gradient is similar to the gradient of the conventional logit:
∂l_i^1(θ)/∂θ = [d_{it} − P(d_{it} | x_{it}, θ)] ∂(v_repl − v_keep)/∂θ
The only thing that differs is the inner derivative of the choice-specific value function, which besides derivatives of current utility also includes ∂EV_θ/∂θ.
By the implicit function theorem we obtain
∂EV_θ/∂θ = [I − Γ′_θ]^{−1} ∂Γ_θ/∂θ
[I − Γ′_θ]^{−1} is a by-product of the N-K algorithm.
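The implicit-function-theorem formula can be verified numerically on a toy grid. A Python/NumPy sketch (invented primitives; RC is the only parameter differentiated, and the analytical derivative is checked against finite differences):

```python
import numpy as np

beta = 0.95
n = 6
P = np.diag(np.full(n, 0.5)) + np.diag(np.full(n - 1, 0.5), k=1)
P[-1, -1] = 1.0                       # toy keep-transition matrix
u_keep = -0.2 * np.arange(n)          # toy -c(x)

def solve_ev(rc, iters=2000):
    """Fixed point EV_θ by successive approximations (β is far from 1 here)."""
    ev = np.zeros(n)
    for _ in range(iters):
        v_k = u_keep + beta * ev
        v_r = -rc + beta * ev[0]      # replacement utility is -RC
        m = np.maximum(v_k, v_r)
        ev = P @ (m + np.log(np.exp(v_k - m) + np.exp(v_r - m)))
    return ev

rc = 3.0
ev = solve_ev(rc)
v_k = u_keep + beta * ev
v_r = -rc + beta * ev[0]
p_r = 1.0 / (1.0 + np.exp(v_k - v_r))            # P(replace | x)
# Fréchet derivative Γ' = β·P·(choice-probability weighting)
gamma_prime = beta * P @ (np.diag(1.0 - p_r) + np.outer(p_r, np.eye(n)[0]))
dgamma_drc = P @ (-p_r)                          # ∂Γ/∂RC holding EV fixed
# implicit function theorem: ∂EV/∂RC = (I - Γ')^{-1} ∂Γ/∂RC
dev_drc = np.linalg.solve(np.eye(n) - gamma_prime, dgamma_drc)

# central finite-difference check
h = 1e-5
fd = (solve_ev(rc + h) - solve_ev(rc - h)) / (2 * h)
```

The analytical derivative and the finite difference agree to several digits, which is the practical test that the gradient code is right.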
26 STEP 6: BHHH
Recall Newton-Raphson:
θ_{g+1} = θ_g − λ (Σ_i H_i(θ_g))^{−1} Σ_i s_i(θ_g)
Berndt, Hall, Hall, and Hausman (1974): use the outer product of the scores as an approximation to the (negative) Hessian:
θ_{g+1} = θ_g + λ (Σ_i s_i s_i′)^{−1} Σ_i s_i
Why is this valid? The information identity:
E[H_i(θ)] = −E[s_i(θ) s_i(θ)′]
(only valid for MLE and CMLE).
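The BHHH update is easiest to see on a plain binary logit, where the per-observation score is available in closed form. A self-contained sketch on simulated data (illustrative, not the bus-engine likelihood; step length λ = 1 throughout):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=(n, 2))
theta_true = np.array([1.0, -0.5])
p = 1.0 / (1.0 + np.exp(-x @ theta_true))
y = (rng.random(n) < p).astype(float)

def scores(theta):
    """Per-observation scores s_i = (y_i - P_i)·x_i of the logit log-likelihood."""
    resid = y - 1.0 / (1.0 + np.exp(-x @ theta))
    return resid[:, None] * x            # n × k matrix of scores

theta = np.zeros(2)
for _ in range(50):
    s = scores(theta)
    g = s.sum(axis=0)                    # gradient Σ_i s_i
    B = s.T @ s                          # BHHH matrix Σ_i s_i s_i'
    step = np.linalg.solve(B, g)
    theta = theta + step                 # move along B^{-1} g (uphill)
    if np.max(np.abs(step)) < 1e-10:
        break
```

Because B is an outer product of scores it is positive definite whenever the scores span the parameter space, so the step is always an ascent direction; at convergence B/n also estimates the information matrix, giving standard errors for free.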
27 STEP 6: BHHH
Sometimes a line search may not help Newton's method on a non-concave likelihood.
Convex region: Newton-Raphson moves in the opposite direction of the gradient; NR moves DOWNHILL (wrong, wrong way). BHHH is still good.
Concave region: Newton-Raphson moves in the same direction as the gradient; NR moves UPHILL.
28 STEP 6: BHHH
Advantages:
Σ_i s_i s_i′ is always positive definite, i.e. BHHH always moves uphill for λ small enough.
It does not rely on second derivatives.
Disadvantages:
It is only a good approximation at the true parameters, for large N, for well-specified models (in principle only valid for MLE).
It is only superlinearly convergent, not quadratic.
We can always use BHHH for the first iterations and then switch to BFGS updates to get an even more accurate approximation to the Hessian matrix as the iterations start to converge.
29 STEP 6: BHHH The road ahead will be long. Our climb will be steep. We may not get there in one year or even in one term. But, America, I have never been more hopeful than I am tonight that we will get there. I promise you, we as a people will get there. (Barack Obama, Nov. 2008)
30 Convergence! *** Convergence Achieved ***
[Solver output: number of iterations: 9; gradient-times-direction, log-likelihood, and parameter estimates with standard errors and t-statistics reported for RC and c. Time to convergence is 0 min and 0.07 seconds.]
31 MPEC versus NFXP-NK: sample size 6,000
[Table: for each β, runs converged (out of 1250), CPU time (in sec.), number of major iterations, function evaluations, Bellman iterations, and N-K iterations for MPEC-Matlab, MPEC-AMPL, and NFXP-NK. Numeric entries not recovered in transcription.]
32 MPEC versus NFXP-NK: sample size 60,000
[Table: for each β, runs converged (out of 1250), CPU time (in sec.), number of major iterations, function evaluations, Bellman iterations, and N-K iterations for MPEC-AMPL and NFXP-NK. Numeric entries not recovered in transcription.]
33 CPU time is linear in sample size
[Figure: CPU time per major iteration (seconds) against sample size (x 10^5) for MPEC-AMPL and NFXP-NK. Fitted lines T_NFXP (R² = 0.991) and T_MPEC (R² = 0.988); slope coefficients not recovered in transcription.]
34 CPU time is linear in sample size
[Figure: total CPU time (seconds) against sample size (x 10^5) for MPEC-AMPL and NFXP-NK. Fitted lines T_NFXP (R² = 0.926) and T_MPEC (R² = 0.554); slope coefficients not recovered in transcription.]
35 Summary of findings
Su and Judd (Econometrica, 2012) used an inefficient version of NFXP that relies solely on the method of successive approximations to solve the fixed point problem. Using the efficient version of NFXP proposed by Rust (1987), we find:
MPEC and NFXP-NK are similar in performance when the sample size is relatively small.
In problems with large sample sizes, NFXP-NK outperforms MPEC by a significant margin.
NFXP does not slow down as β → 1.
It is non-trivial to compute standard errors using MPEC, whereas they are a natural by-product of NFXP.
36 Patient still alive
0-725/36-725: Convex Optimiation Fall 205 Lecturer: Ryan Tibshirani Lecture 7: October 27 Scribes: Brandon Amos, Gines Hidalgo Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These
More informationNonlinear Optimization for Optimal Control
Nonlinear Optimization for Optimal Control Pieter Abbeel UC Berkeley EECS Many slides and figures adapted from Stephen Boyd [optional] Boyd and Vandenberghe, Convex Optimization, Chapters 9 11 [optional]
More informationOptimization. Yuh-Jye Lee. March 21, Data Science and Machine Intelligence Lab National Chiao Tung University 1 / 29
Optimization Yuh-Jye Lee Data Science and Machine Intelligence Lab National Chiao Tung University March 21, 2017 1 / 29 You Have Learned (Unconstrained) Optimization in Your High School Let f (x) = ax
More informationLecture 17: Numerical Optimization October 2014
Lecture 17: Numerical Optimization 36-350 22 October 2014 Agenda Basics of optimization Gradient descent Newton s method Curve-fitting R: optim, nls Reading: Recipes 13.1 and 13.2 in The R Cookbook Optional
More informationEmpirical Evaluation and Estimation of Large-Scale, Nonlinear Economic Models
Empirical Evaluation and Estimation of Large-Scale, Nonlinear Economic Models Michal Andrle, RES The views expressed herein are those of the author and should not be attributed to the International Monetary
More informationLecture 6: Discrete Choice: Qualitative Response
Lecture 6: Instructor: Department of Economics Stanford University 2011 Types of Discrete Choice Models Univariate Models Binary: Linear; Probit; Logit; Arctan, etc. Multinomial: Logit; Nested Logit; GEV;
More informationChapter 4. Unconstrained optimization
Chapter 4. Unconstrained optimization Version: 28-10-2012 Material: (for details see) Chapter 11 in [FKS] (pp.251-276) A reference e.g. L.11.2 refers to the corresponding Lemma in the book [FKS] PDF-file
More informationConstrained Optimization
1 / 22 Constrained Optimization ME598/494 Lecture Max Yi Ren Department of Mechanical Engineering, Arizona State University March 30, 2015 2 / 22 1. Equality constraints only 1.1 Reduced gradient 1.2 Lagrange
More informationStatistical Estimation
Statistical Estimation Use data and a model. The plug-in estimators are based on the simple principle of applying the defining functional to the ECDF. Other methods of estimation: minimize residuals from
More information1-D Optimization. Lab 16. Overview of Line Search Algorithms. Derivative versus Derivative-Free Methods
Lab 16 1-D Optimization Lab Objective: Many high-dimensional optimization algorithms rely on onedimensional optimization methods. In this lab, we implement four line search algorithms for optimizing scalar-valued
More informationA Computational Method for Multidimensional Continuous-choice. Dynamic Problems
A Computational Method for Multidimensional Continuous-choice Dynamic Problems (Preliminary) Xiaolu Zhou School of Economics & Wangyannan Institution for Studies in Economics Xiamen University April 9,
More informationNumerical solutions of nonlinear systems of equations
Numerical solutions of nonlinear systems of equations Tsung-Ming Huang Department of Mathematics National Taiwan Normal University, Taiwan E-mail: min@math.ntnu.edu.tw August 28, 2011 Outline 1 Fixed points
More informationSurvey of NLP Algorithms. L. T. Biegler Chemical Engineering Department Carnegie Mellon University Pittsburgh, PA
Survey of NLP Algorithms L. T. Biegler Chemical Engineering Department Carnegie Mellon University Pittsburgh, PA NLP Algorithms - Outline Problem and Goals KKT Conditions and Variable Classification Handling
More informationHOMEWORK #4: LOGISTIC REGRESSION
HOMEWORK #4: LOGISTIC REGRESSION Probabilistic Learning: Theory and Algorithms CS 274A, Winter 2019 Due: 11am Monday, February 25th, 2019 Submit scan of plots/written responses to Gradebook; submit your
More informationLinear Programming Methods
Chapter 11 Linear Programming Methods 1 In this chapter we consider the linear programming approach to dynamic programming. First, Bellman s equation can be reformulated as a linear program whose solution
More informationexp(v j=) k exp(v k =)
Economics 250c Dynamic Discrete Choice, continued This lecture will continue the presentation of dynamic discrete choice problems with extreme value errors. We will discuss: 1. Ebenstein s model of sex
More informationLinear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52
Statistics for Applications Chapter 10: Generalized Linear Models (GLMs) 1/52 Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Components of a linear model The two
More informationMS&E 318 (CME 338) Large-Scale Numerical Optimization
Stanford University, Management Science & Engineering (and ICME) MS&E 318 (CME 338) Large-Scale Numerical Optimization 1 Origins Instructor: Michael Saunders Spring 2015 Notes 9: Augmented Lagrangian Methods
More informationPenalty and Barrier Methods General classical constrained minimization problem minimize f(x) subject to g(x) 0 h(x) =0 Penalty methods are motivated by the desire to use unconstrained optimization techniques
More informationSYSTEMS OF NONLINEAR EQUATIONS
SYSTEMS OF NONLINEAR EQUATIONS Widely used in the mathematical modeling of real world phenomena. We introduce some numerical methods for their solution. For better intuition, we examine systems of two
More informationDynamic and Stochastic Model of Industry. Class: Work through Pakes, Ostrovsky, and Berry
Dynamic and Stochastic Model of Industry Class: Work through Pakes, Ostrovsky, and Berry Reading: Recommend working through Doraszelski and Pakes handbook chapter Recall model from last class Deterministic
More informationConvex Optimization. Prof. Nati Srebro. Lecture 12: Infeasible-Start Newton s Method Interior Point Methods
Convex Optimization Prof. Nati Srebro Lecture 12: Infeasible-Start Newton s Method Interior Point Methods Equality Constrained Optimization f 0 (x) s. t. A R p n, b R p Using access to: 2 nd order oracle
More informationECONOMETRICS I. Assignment 5 Estimation
ECONOMETRICS I Professor William Greene Phone: 212.998.0876 Office: KMC 7-90 Home page: people.stern.nyu.edu/wgreene Email: wgreene@stern.nyu.edu Course web page: people.stern.nyu.edu/wgreene/econometrics/econometrics.htm
More informationEM Algorithm II. September 11, 2018
EM Algorithm II September 11, 2018 Review EM 1/27 (Y obs, Y mis ) f (y obs, y mis θ), we observe Y obs but not Y mis Complete-data log likelihood: l C (θ Y obs, Y mis ) = log { f (Y obs, Y mis θ) Observed-data
More informationIterative Reweighted Least Squares
Iterative Reweighted Least Squares Sargur. University at Buffalo, State University of ew York USA Topics in Linear Classification using Probabilistic Discriminative Models Generative vs Discriminative
More informationNBER WORKING PAPER SERIES IMPROVING THE NUMERICAL PERFORMANCE OF BLP STATIC AND DYNAMIC DISCRETE CHOICE RANDOM COEFFICIENTS DEMAND ESTIMATION
NBER WORKING PAPER SERIES IMPROVING THE NUMERICAL PERFORMANCE OF BLP STATIC AND DYNAMIC DISCRETE CHOICE RANDOM COEFFICIENTS DEMAND ESTIMATION Jean-Pierre H. Dubé Jeremy T. Fox Che-Lin Su Working Paper
More information