Least squares methods in maximum likelihood problems. M.R.Osborne School of Mathematical Sciences, ANU

Least squares methods in maximum likelihood problems. M.R. Osborne, School of Mathematical Sciences, ANU. September 22.

Abstract: It is well known that the Gauss-Newton algorithm for solving nonlinear least squares problems is a special case of the scoring algorithm for maximizing log likelihoods. What has received less attention is that the computation of the current correction in the scoring algorithm, in both its line search and trust region forms, can be cast as a linear least squares problem. This is an important observation, both because it provides likelihood methods with a general framework which accords with computational orthodoxy, and because it can be seen as underpinning computational procedures which have been developed for particular classes of likelihood problems (for example, generalised linear models). Aspects of this orthodoxy as it affects considerations such as convergence and effectiveness will be reviewed.

1. The point of the paper has changed somewhat, with more emphasis on what we know about Gauss-Newton.
2. Estimation of DEs will not be considered per se. However, the results apply to stable simple shooting estimation problems, and to suitably imbedded multiple shooting formulations - and we know how to do this.
3. Strictly, the results do not apply to (our work on) the simultaneous approach to ODE estimation. That problem is technically a mixed problem, as the limiting multipliers satisfy a stochastic DE.
4. Nothing will be said about optimum observation points. However, the information matrix will be much in evidence, and its determinant is a quantity to maximize as a function of these points.

Start with independent event outcomes $y_t \in R^q$, $t = 1, 2, \dots, n$, associated pdf $g(y_t; \theta_t, t)$ indexed by points $t \in T \subseteq R^l$, and structural information provided by a known parametric model $\theta_t = \eta(t, x)$, where $\theta \in R^s$ and $x \in R^p$. A priori information is the experimental design $T_n$, $|T_n| = n$. For asymptotics require
$$\frac{1}{n}\sum_{t \in T_n} f(t) \rightarrow \int_{S(T)} f(t)\,\rho(t)\,dt.$$
The problem is: given the event outcomes $y_t$, estimate $x$. It is not assumed that the individual components of $y_t$ are independent.

Example. Exponential family:
$$g\left(y; \begin{bmatrix}\theta\\ \phi\end{bmatrix}\right) = c(y, \phi)\exp\left[\left\{y^T\theta - b(\theta)\right\}/a(\phi)\right],$$
$$E\{y\} = \mu(x, t) = \nabla b(\theta)^T, \qquad V\{y\} = a(\phi)\,\nabla^2 b(\theta).$$
Signal in noise model: $\theta = \mu$.
(i) Normal density
$$g = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(y-\mu)^2}{2\sigma^2}\right), \qquad c(y, \phi) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{y^2}{2\sigma^2}\right),$$
$$a(\phi) = \sigma^2, \qquad \theta = \mu, \qquad b(\theta) = \mu^2/2.$$

(ii) Multinomial (discrete) distribution
$$g(n; \omega) = \frac{n!}{\prod_{j=1}^p n_j!}\prod_{j=1}^p \omega_j^{n_j} = \frac{n!}{\prod_{j=1}^p n_j!}\,e^{\sum_{j=1}^p n_j\log\omega_j},$$
where $\sum_{j=1}^p n_j = n$, and the frequencies must satisfy the condition $\sum_{j=1}^p \omega_j = 1$. Eliminating $\omega_p$ gives
$$\sum_{j=1}^p n_j\log\omega_j = \sum_{j=1}^{p-1} n_j\log\frac{\omega_j}{1 - \sum_{i=1}^{p-1}\omega_i} + n\log\left(1 - \sum_{i=1}^{p-1}\omega_i\right).$$
It follows that
$$\theta_j = \log\frac{\omega_j}{1 - \sum_{i=1}^{p-1}\omega_i}, \qquad b(\theta) = n\log\left(1 + \sum_{j=1}^{p-1} e^{\theta_j}\right).$$

Likelihood:
$$G(y; x, T) = \prod_{t \in T} g(y_t; \theta_t, t).$$
Estimation principle: $\widehat{x} = \arg\max_x G(y; x, T)$. Log likelihood:
$$L(y; x, T) = \sum_{t \in T}\log g(y_t; \theta_t, t) = \sum_{t \in T} L_t(y_t; \theta_t, t).$$
Assume: true model $\eta$, parameter vector $x^*$; $x^*$ properly in the interior of the region in which $L$ is well behaved; boundedness of integrals (computing expectations etc.).

Properties:
$$E\{\nabla_x L(y; x, T)\} = 0; \qquad E\{\nabla_x^2 L(y; x, T)\} = -E\{\nabla_x L^T\,\nabla_x L\}.$$
Fisher information
$$I_n = \frac{1}{n}E\{\nabla_x L(y; x, T)^T\,\nabla_x L(y; x, T)\} = \frac{1}{n}V\{\nabla_x L(y; x, T)\}.$$
Maximum likelihood estimates are consistent and asymptotically minimum variance.

Algorithms. Describe the step $h$, $x \rightarrow x + h$.
Newton:
$$J_n = -\frac{1}{n}\nabla_x^2 L(y; x, T_n), \qquad h = J_n^{-1}\,\frac{1}{n}\nabla_x L(y; x, T)^T.$$
Scoring:
$$h = I_n^{-1}\,\frac{1}{n}\nabla_x L(y; x, T)^T.$$
Sample:
$$S_n = \frac{1}{n}\sum_{t \in T_n}\nabla_x L_t(y_t; x, t)^T\,\nabla_x L_t(y_t; x, t), \qquad h = S_n^{-1}\,\frac{1}{n}\nabla_x L(y; x, T)^T.$$
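The three corrections differ only in which matrix multiplies the average score. A minimal numpy sketch, assuming the user supplies the average score vector `g` (playing the role of $(1/n)\nabla_x L^T$), the per-observation score rows `G` (one row per observation, so that $S_n = G^TG/n$), and, for the Newton and scoring variants, the matrices `J_n` and `I_n`; the names are illustrative, not taken from the slides:

```python
import numpy as np

def newton_step(J_n, g):
    # Newton: h = J_n^{-1} g, with J_n = -(1/n) * Hessian of L
    return np.linalg.solve(J_n, g)

def scoring_step(I_n, g):
    # Scoring: h = I_n^{-1} g, I_n the Fisher information
    return np.linalg.solve(I_n, g)

def sample_step(G, g):
    # Sample: S_n = (1/n) sum_t grad L_t^T grad L_t, built from the score rows G
    n = G.shape[0]
    S_n = G.T @ G / n
    return np.linalg.solve(S_n, g)
```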

Equivalence:
$$\lim_{n\to\infty} I_n(x^*) = \lim_{n\to\infty} S_n(x^*) = \lim_{n\to\infty} J_n(x^*) = I, \quad\text{where}\quad I = -\int_{S(T)} E\{\nabla_x^2 L(y; \theta_t, t)\}\,\rho(t)\,dt.$$
Transformation invariance holds for scoring and sample, but not for Newton. Let $u = u(x)$, $W = \partial u/\partial x$; then $h_u = W h_x$:
$$\nabla_x L = \nabla_u L\, W, \qquad I_x = W^T I_u W,$$
$$h_x = \left(W^T I_u W\right)^{-1}\frac{1}{n}W^T\nabla_u L^T = W^{-1} I_u^{-1}\,\frac{1}{n}\nabla_u L^T = W^{-1} h_u.$$

Implementation. This requires:
(1) A method for generating $h$, a direction in which the objective is increasing.
(2) A method for estimating progress. A full step need not be satisfactory; this is especially true of the initial steps. To measure progress introduce a monitor $\Phi(x)$, which needs the same local stationary points as the objective $F$ and to be increasing when $F$ is increasing: $\nabla F\, h \ge 0 \Rightarrow \nabla\Phi\, h \ge 0$.
Two approaches:
(1) Line search: an effective search direction is computed, and the monitor is used to gauge a profitable step in this direction.
(2) Trust region: the step is required to lie in an adaptively defined control region - typically one in which the linearization of $F$ does not depart too far from the true behaviour.

The direction of search is required to be one which increases the objective. This is the case here for both the scoring and sample algorithms:
$$\nabla_x L\, h = \frac{1}{n}\nabla_x L\, I_n^{-1}\nabla_x L^T > 0, \qquad x \ne \widehat{x}.$$
Note this shows that $L$ is a suitable monitor. $\nabla_x L\, h$ is invariant:
$$n\,\nabla_x L\, h_x = \nabla_x L\, I_n^{-1}\nabla_x L^T = \nabla_u L\, W\left(W^T I_n^u W\right)^{-1}W^T\nabla_u L^T = \nabla_u L\,(I_n^u)^{-1}\nabla_u L^T = n\,\nabla_u L\, h_u.$$
Will see there are good ways to compute this.

Measuring effectiveness of the step $x \rightarrow x + \lambda h$.
Goldstein: accept if
$$\rho \le \Psi(\lambda, x, h) \le 1 - \rho, \quad 0 < \rho < .5, \qquad \Psi = \frac{\Phi(x + \lambda h) - \Phi(x)}{\lambda\,\nabla_x\Phi(x)\,h}.$$
Can always choose $\lambda$ to satisfy this test.
Armijo: let $0 < \rho < 1$, and take $\lambda = \rho^k$, where $k$ is the smallest integer such that
$$\Phi(x + \rho^{k-1}h) - \Phi(x) < \Phi(x + \rho^k h) - \Phi(x).$$
Goldstein is fine for proving results, but there is no direct link, in the sense of small step asymptotic equivalence, relating $\lambda$ and $\rho^k$, and it does not tell how to find $\lambda$, while Armijo does. The choice $\lambda = 1$ is favoured for scale invariant, Newton type methods. Also $\Psi \approx .5$ for the scoring and sample methods provided $n$ is large enough; $.5 \ge \rho \ge .1$.
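A minimal sketch of a backtracking rule in the spirit of the step acceptance tests above, using the log likelihood itself as the monitor. The function names (`loglik`, `grad`) and the particular acceptance condition are illustrative assumptions, not the exact rule on the slide:

```python
import numpy as np

def backtracking_step(loglik, grad, x, h, rho=0.25, max_reductions=30):
    """Shrink lambda = rho**k until the increase in the monitor is an
    acceptable fraction of the linear prediction lambda * grad(x) @ h."""
    L0 = loglik(x)
    slope = grad(x) @ h          # should be > 0 for an ascent direction
    lam = 1.0
    for _ in range(max_reductions):
        if loglik(x + lam * h) - L0 >= rho * lam * slope:
            return x + lam * h, lam
        lam *= rho               # reduce the step and try again
    return x, 0.0                # no acceptable step found
```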

Convergence: a nice inequality (Kantorovich) gives
$$\frac{\nabla_x L\, h}{\|\nabla_x L\|\,\|h\|} \ge \frac{1}{\operatorname{cond}(I_n)^{1/2}}.$$
If the region $R_L$ is bounded and $I_n$ is positive definite then an ascent direction exists, and a uniform lower bound $\lambda_R$ exists for the line search step multiplier. Effectiveness means $L$ can be used as a monitor (not true for pure Newton). Goldstein gives
$$\nabla_x L\, h \le \frac{L(x + \lambda h) - L(x)}{\rho\,\lambda_R} \rightarrow 0,$$
since $L(x)$ is increasing and bounded, and this gives convergence. Also
$$h = I_n^{-1}\,\frac{1}{n}\nabla_x L^T \;\Rightarrow\; \frac{\|\nabla_x L\|}{n} \le \|I_n\|\,\|h\|.$$
Putting it all together gives
$$\|\nabla_x L\|^2 \le K\left(L(x + \lambda h) - L(x)\right) \rightarrow 0.$$

What happens if $\inf\{\lambda_i\} = 0$? There must be a sequence $\{\lambda_i\}$ with $\inf\lambda_i = 0$ and
$$\rho > \frac{L(x_i + \lambda_i h_i) - L(x_i)}{\lambda_i\,\nabla_x L(x_i)\,h_i}.$$
Here Taylor expansion plus the mean value theorem gives
$$\frac{\lambda_i}{2(1-\rho)}\,\frac{1}{n}\left\|\nabla_x^2 L(x)\right\| > \sigma_{\min}(I_n),$$
a contradiction as $\lambda_i \rightarrow 0$. Almost a global result! The set of allowed approximations need not be closed:
$$\eta = x(1) + x(2)\exp(-x(3)t), \qquad t = \lim_{n\to\infty}\left(n - n\exp\left(-\tfrac{1}{n}t\right)\right).$$

Least squares. Consider the sample estimate:
$$S_n = \frac{1}{n}\sum_{t \in T_n}\nabla_x L_t^T\,\nabla_x L_t = \frac{1}{n}\mathbf{S}_n^T\mathbf{S}_n, \qquad \mathbf{S}_n = \begin{bmatrix}\vdots\\ \nabla_x L_t\\ \vdots\end{bmatrix}.$$
The correction satisfies the least squares problem
$$\min_h r^T r; \qquad r = \mathbf{S}_n h - e,$$
with $e$ the vector of ones.

Scoring works pretty much the same way:
$$I_n = \frac{1}{n}\sum_{t \in T_n}\nabla_x\eta^T\,E\{\nabla_\eta L_t^T\,\nabla_\eta L_t\}\,\nabla_x\eta.$$
Set $V_t^{-1} = E\{\nabla_\eta L_t^T\,\nabla_\eta L_t\}$; then obtain the least squares problem
$$\min_h r^T r; \qquad r = I_n^L h - b,$$
where
$$I_n^L = \begin{bmatrix}\vdots\\ V_t^{-1/2}\nabla_x\eta\\ \vdots\end{bmatrix}, \qquad b = \begin{bmatrix}\vdots\\ V_t^{1/2}\nabla_\eta L_t^T\\ \vdots\end{bmatrix}.$$

Example: normal distribution.
$$L_t = -\frac{1}{2\sigma^2}\left(y_t - \mu(x, t)\right)^2, \qquad \nabla_x L_t = \frac{1}{\sigma^2}\left(y_t - \mu(x, t)\right)\nabla_x\mu_t,$$
$$-\nabla_x^2 L_t = \frac{1}{\sigma^2}\left(\nabla_x\mu_t^T\nabla_x\mu_t - (y_t - \mu(x, t))\nabla_x^2\mu_t\right), \qquad I_n = \frac{1}{n\sigma^2}\sum_{t \in T_n}\nabla_x\mu_t^T\nabla_x\mu_t.$$
As $\sigma$ cancels, the result is the Gauss-Newton method. In contrast,
$$S_n = \frac{1}{n\sigma^2}\sum_{t \in T_n}\frac{(y_t - \mu(x, t))^2}{\sigma^2}\,\nabla_x\mu_t^T\nabla_x\mu_t.$$
Here the scale does not drop out so agreeably.
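For the normal model the scoring correction is just the Gauss-Newton step, obtained by solving a linear least squares problem whose rows are the model gradients. A small numpy sketch for the exponential model $\mu(t, x) = x(1) + x(2)e^{-x(3)t}$ used in the later examples; the function names are illustrative assumptions:

```python
import numpy as np

def mu(x, t):
    # mu(t, x) = x1 + x2 * exp(-x3 * t)
    return x[0] + x[1] * np.exp(-x[2] * t)

def mu_jacobian(x, t):
    # rows grad_x mu_t, one per observation point t_i
    e = np.exp(-x[2] * t)
    return np.column_stack([np.ones_like(t), e, -x[1] * t * e])

def gauss_newton_step(x, t, y):
    """Scoring / Gauss-Newton correction h from min_h ||A h - r||^2."""
    A = mu_jacobian(x, t)          # plays the role of I_n^L (sigma cancels)
    r = y - mu(x, t)               # residuals, the right hand side b
    h, *_ = np.linalg.lstsq(A, r, rcond=None)
    return h
```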

Computation: factorize
$$I_n^L = \begin{bmatrix}Q_1 & Q_2\end{bmatrix}\begin{bmatrix}U\\ 0\end{bmatrix}, \qquad h = U^{-1}Q_1^T b.$$
More value can be obtained from the factorization:
$$\nabla_x L\, h = \left(b^T I_n^L\right) h = b^T Q\begin{bmatrix}U\\ 0\end{bmatrix}U^{-1}Q_1^T b = \left\|Q_1^T b\right\|^2.$$
This is needed for the Goldstein test. Also it tends to 0 as $\nabla_x L\, h \rightarrow 0$, so it provides a scale invariant quantity for convergence testing. Typically one would scale the columns of $I_n^L$ - see Higham's book.
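A numpy sketch of this computation, factorizing the least squares matrix once and reusing it for both the correction and the scale invariant quantity $\|Q_1^Tb\|^2$; the variable names are illustrative:

```python
import numpy as np

def scoring_correction(A, b):
    """Solve min_h ||A h - b||^2 by a thin QR factorization A = Q1 U.

    Returns the correction h and grad_L_h = ||Q1^T b||^2, which is used
    in the Goldstein test and as a scale invariant convergence measure."""
    Q1, U = np.linalg.qr(A, mode="reduced")
    c1 = Q1.T @ b
    h = np.linalg.solve(U, c1)     # U is upper triangular
    return h, float(c1 @ c1)
```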

Rate of convergence. Consider the unit step scoring iteration in fixed point form:
$$x_{i+1} = F_n(x_i), \qquad F_n(x) = x + I_n(x)^{-1}\,\frac{1}{n}\nabla_x L(x)^T.$$
The condition for convergence is $\varpi\left(F_n'(\widehat{x}_n)\right) < 1$, where $\varpi\left(F_n'(\widehat{x}_n)\right)$ is the spectral radius of the variation $F_n' = \nabla_x F_n$. Then
$$\varpi\left(F_n'(\widehat{x}_n)\right) \rightarrow 0, \quad \text{a.s.}, \quad n \rightarrow \infty.$$
$\varpi\left(F_n'(\widehat{x}_n)\right)$ is a (Newton-like) invariant of the likelihood surface, is a measure of the quality of the modelling, and can be estimated by a modification of the power method.

To calculate $\varpi\left(F_n'(\widehat{x}_n)\right)$ note that $\nabla_x L(\widehat{x}_n) = 0$, thus
$$F_n'(\widehat{x}_n) = I + I_n(\widehat{x}_n)^{-1}\,\frac{1}{n}\nabla_x^2 L(\widehat{x}_n) = I_n(\widehat{x}_n)^{-1}\left(I_n(\widehat{x}_n) + \frac{1}{n}\nabla_x^2 L(\widehat{x}_n)\right).$$
If the right hand side were evaluated at $x^*$ then the result would follow from the strong law of large numbers, which shows that the matrix gets small (hence $\varpi$ gets small) almost surely as $n \rightarrow \infty$. But, by consistency of the estimates,
$$\varpi\left(F_n'(\widehat{x}_n)\right) = \varpi\left(F_n'(x^*)\right) + O\left(\|\widehat{x}_n - x^*\|\right), \quad \text{a.s.},$$
and the desired result follows.
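The slides estimate $\varpi$ by a modification of the power method. As a rough illustration only, a plain (unmodified) power iteration on the matrix $F_n'(\widehat{x}_n) = I_n^{-1}\left(I_n + \frac{1}{n}\nabla_x^2 L\right)$ looks like the sketch below; it assumes the dominant eigenvalue is simple:

```python
import numpy as np

def spectral_radius_estimate(M, iters=50, seed=0):
    """Plain power iteration estimate of the spectral radius of M."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(M.shape[0])
    v /= np.linalg.norm(v)
    rho = 0.0
    for _ in range(iters):
        w = M @ v
        rho = np.linalg.norm(w)    # growth factor approaches |lambda_max|
        v = w / rho
    return rho
```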

Trust region. The idea is to limit the scope of the linear subproblem to a closed control region containing the current estimate:
$$\min_h r^T r; \qquad \|h\|_D^2 \le \gamma, \qquad r = I_n^L h - b, \qquad \|h\|_D^2 = h^T D^2 h, \qquad D > 0 \text{ diagonal}.$$
The necessary conditions show that $h$ satisfies the perturbed scoring equations
$$\left(I_n + \frac{\pi}{n}D^2\right)h = \frac{1}{n}\nabla_x L^T,$$
where $\pi \ge 0$ is the multiplier of the trust region constraint. $\pi$ can be used to control the trust region by controlling the size of $h$. Differentiating with respect to $\pi$ gives
$$\left(I_n + \frac{\pi}{n}D^2\right)\frac{dh}{d\pi} = -\frac{1}{n}D^2 h,$$
$$\frac{d}{d\pi}\left(h^T D^2 h\right) = 2\,\frac{dh^T}{d\pi}D^2 h = -2n\,\frac{dh^T}{d\pi}\left(I_n + \frac{\pi}{n}D^2\right)\frac{dh}{d\pi} < 0 \;\Rightarrow\; \frac{d}{d\pi}\|h\|_D^2 < 0.$$

The classical form of the algorithm goes back to Levenberg. Here two parameters $\alpha, \beta$ are kept, and the basic sequence of operations is:

    count = 1
    do while F(x + h(pi)) < F(x)
        count = count + 1
        pi = alpha * pi
    loop
    x <- x + h(pi)
    if count = 1 then pi = beta * pi

Successful steps will be taken eventually, since
$$h \rightarrow \frac{1}{\pi}D^{-2}\nabla_x L^T, \qquad \pi \rightarrow \infty.$$
Experience suggests the choices of $\alpha, \beta$ are not critical. Typically $\alpha\beta < 1$, so as to approach Newton like methods. The scoring and sample algorithms are not exact Newton methods; both can be regarded as regularised methods because of their generic positive definiteness. It is not obvious that setting $\pi = 0$ will increase the rate of convergence.
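A minimal Python sketch of the $\alpha, \beta$ control loop, assuming a routine `correction(x, pi)` that returns the perturbed scoring step $h(\pi)$ (for instance from the stacked least squares problem on the following slides) and a monitor `F`; all names are illustrative:

```python
def levenberg_iteration(F, correction, x, pi, alpha=2.5, beta=0.1):
    """One outer step of the alpha, beta trust region control."""
    count = 1
    h = correction(x, pi)
    # Increase pi (shrink the step) until the monitor improves.
    while F(x + h) < F(x):
        count += 1
        pi = alpha * pi
        h = correction(x, pi)
    x = x + h
    # If the first trial step succeeded, relax the trust region.
    if count == 1:
        pi = beta * pi
    return x, pi
```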

The basic theorems mirror the line search results.
Convergence: Let $\{x_i\}$ produced by the $\alpha, \beta$ procedure be contained in a compact region $R$ in which $\{\pi_i\}$ is bounded; then $\{L(y; x_i, T_n)\}$ converges, and limit points of $\{x_i\}$ are stationary points of $L$.
Boundedness: If the sequence $\{\pi_i\}$ determined by the $\alpha, \beta$ procedure is unbounded in $R$ then the norm of $\nabla_x^2 L$ is also unbounded.
Remember that the iterations have the potential to become unbounded for smooth sets of nonlinear approximating functions, as closure of these sets of functions cannot be guaranteed without more work.

Computation. The trust region problem is
$$\min_h r^T r; \qquad r = \begin{bmatrix}X_n^L\\ \sqrt{\pi}\,I\end{bmatrix}h - \begin{bmatrix}b\\ 0\end{bmatrix}.$$
Let $X_n^L = Q\begin{bmatrix}U\\ 0\end{bmatrix}$; then $h(\pi)$ can be found by solving the typically much smaller problem
$$\min_h s^T s; \qquad s = \begin{bmatrix}U\\ \sqrt{\pi}\,I\end{bmatrix}h - \begin{bmatrix}c_1\\ 0\end{bmatrix},$$
where $c_1 = Q_1^T b$. Make the further factorization
$$\begin{bmatrix}U\\ \sqrt{\pi}\,I\end{bmatrix} = Q'\begin{bmatrix}U'\\ 0\end{bmatrix}, \qquad (Q')^T\begin{bmatrix}c_1\\ 0\end{bmatrix} = \begin{bmatrix}c_1'\\ c_2'\end{bmatrix},$$
then
$$h(\pi) = (U')^{-1}c_1', \qquad \nabla_x L\, h(\pi) = c_1^T U h(\pi) = \begin{bmatrix}c_1^T & 0\end{bmatrix}\begin{bmatrix}U\\ \sqrt{\pi}\,I\end{bmatrix}\left(U^T U + \pi I\right)^{-1}\begin{bmatrix}U^T & \sqrt{\pi}\,I\end{bmatrix}\begin{bmatrix}c_1\\ 0\end{bmatrix} = \|c_1'\|^2.$$
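A numpy sketch of this two stage computation: factorize the design matrix once, then for each trial $\pi$ solve the small stacked problem; the helper names are illustrative:

```python
import numpy as np

def factor_once(A, b):
    """Initial thin QR of the design matrix; done once per outer iteration."""
    Q1, U = np.linalg.qr(A, mode="reduced")
    return U, Q1.T @ b                      # U and c1 = Q1^T b

def step_for_pi(U, c1, pi):
    """Solve min || [U; sqrt(pi) I] h - [c1; 0] ||^2 for a trial pi."""
    p = U.shape[1]
    M = np.vstack([U, np.sqrt(pi) * np.eye(p)])
    rhs = np.concatenate([c1, np.zeros(p)])
    h, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    grad_L_h = float(c1 @ (U @ h))          # = ||c1'||^2, used in the tests
    return h, grad_L_h
```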

The linear subproblem has a useful invariance property with respect to diagonal scaling. Introduce new variables $u = Tx$ where $T$ is diagonal:
$$T^{-1}\left(\mathbf{S}_x^T\mathbf{S}_x + \pi D^2\right)T^{-1}\,T h_x = T^{-1}\nabla_x L^T.$$
This is equivalent to
$$\left(\mathbf{S}_u^T\mathbf{S}_u + \pi T^{-1}D^2T^{-1}\right)h_u = \nabla_u L^T.$$
Thus if $D_i$ transforms with $x$ then $T_i^{-1}D_i$ transforms in the same way with respect to $u$. This requirement is satisfied by $D_i = \|(\mathbf{S}_n)_i\|$, the norm of the $i$-th column of $\mathbf{S}_n$. This transformation effects a rescaling of the least squares problem. We have
$$h = \left(\mathbf{S}_n^T\mathbf{S}_n + \pi D^2\right)^{-1}\mathbf{S}_n^T e, \qquad Dh = \left(D^{-1}\mathbf{S}_n^T\mathbf{S}_n D^{-1} + \pi I\right)^{-1}D^{-1}\mathbf{S}_n^T e.$$
The effect of this choice is to rescale the columns of $\mathbf{S}_n$ to have unit length. It is often sufficient to set $\pi = 1$ and $D = \operatorname{diag}\{\|(\mathbf{S}_n)_i\|,\ i = 1, 2, \dots, p\}$ initially.

One catch is the initial choice $\pi = \pi_0$. One example is provided by models containing exponential terms such as $e^{-x_k t}$, which should have negative exponents ($x_k > 0$) but which can become positive ($x_k + h_k < 0$) after too large a step. To fix this, introduce a damping factor $\tau$ such that the critical components of $x + \tau h(\pi_0)$ remain nonnegative. Then $\|\tau h(\pi_0)\|$ provides a possible choice for a revised trust region bound $\gamma$. This involves solving the equation $\|h(\pi)\| = \gamma$ for $\pi$, e.g. by Newton's method. $\frac{dh}{d\pi}$ can be found by solving
$$\left(U^T U + \pi I\right)\frac{dh}{d\pi} = -h.$$
This can be written in least squares form:
$$\min_{dh/d\pi} s^T s; \qquad s = \begin{bmatrix}U\\ \pi^{1/2}I\end{bmatrix}\frac{dh}{d\pi} + \begin{bmatrix}U^{-T}h\\ 0\end{bmatrix}.$$
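A sketch of one Newton update for $\pi$ on the equation $\|h(\pi)\| = \gamma$, reusing the triangular factor $U$ and the derivative system above; `step_for_pi` is the illustrative helper sketched earlier:

```python
import numpy as np

def newton_update_for_pi(U, c1, pi, gamma):
    """One Newton step on ||h(pi)|| - gamma = 0."""
    h, _ = step_for_pi(U, c1, pi)            # current correction h(pi)
    p = U.shape[1]
    # Solve (U^T U + pi I) dh/dpi = -h via the stacked least squares form.
    M = np.vstack([U, np.sqrt(pi) * np.eye(p)])
    rhs = -np.concatenate([np.linalg.solve(U.T, h), np.zeros(p)])
    dh, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    norm_h = np.linalg.norm(h)
    dnorm_dpi = (h @ dh) / norm_h             # d||h||/dpi
    return pi - (norm_h - gamma) / dnorm_dpi
```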

Newton's method extrapolates the function as linear. An alternative strategy could be preferable here. It is based on the observation that, if $A = W\Sigma V^T$ is the singular value decomposition of the matrix in the least squares formulation, then
$$h = \sum_{i=1}^p \frac{\sigma_i\,(w_i^T b)}{\sigma_i^2 + \pi}\,v_i$$
is a rational function of $\pi$. To mirror this, set
$$\|h\| \approx \frac{a}{b + (\pi - \pi_0)}.$$
To identify the parameters $a$ and $b$, use the values $\|h(\pi_0)\|$ and $\frac{d}{d\pi}\|h(\pi_0)\|$ - this corresponds to the same information as is used in Newton's method.

To solve for the parameters we have the equations
$$\|h(\pi_0)\| = \frac{a}{b}, \qquad \frac{d}{d\pi}\|h(\pi_0)\| = \frac{h^T\frac{dh}{d\pi}}{\|h(\pi_0)\|} = -\frac{a}{b^2}.$$
The result is
$$b = -\frac{\|h(\pi_0)\|}{\frac{d}{d\pi}\|h(\pi_0)\|}, \qquad a = \|h(\pi_0)\|\,b,$$
giving the correction
$$\pi = \pi_0 - \frac{\|h(\pi_0)\| - \gamma}{\frac{d}{d\pi}\|h(\pi_0)\|}\cdot\frac{\|h(\pi_0)\|}{\gamma}.$$
The effect of the rational extrapolation is just the Newton step modulated by the term $\frac{\|h(\pi_0)\|}{\gamma}$. As this term $\rightarrow 1$, this is also a second order convergent process.
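A small sketch of the modulated update, given the current norm, its derivative with respect to $\pi$, and the target bound $\gamma$; compare it with the plain Newton correction in the commented line:

```python
def rational_pi_update(pi0, h_norm, dh_norm_dpi, gamma):
    """Rational extrapolation: the Newton step for ||h(pi)|| = gamma
    modulated by the factor h_norm / gamma."""
    newton = (h_norm - gamma) / dh_norm_dpi   # plain Newton correction
    return pi0 - newton * (h_norm / gamma)
```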

Simple exponential model. Here the model used is
$$\mu(t, x) = x(1) + x(2)\exp(-x(3)t).$$
The values chosen for the parameters are $x(1) = 1$, $x(2) = 5$, and $x(3) = 10$. Initial values are generated using
$$x(i)_0 = x(i) + (1 + x(i))(.5 - \text{Rnd}),$$
where Rnd indicates a call to a uniform random number generator giving values in $[0, 1]$. This model is not difficult in the sense that the graph has features that depend on each of the parameters. Thus the interest is in the effects of simulated data errors.

Two types of random numbers are used to simulate the experimental data.
Normal data: The data is generated by evaluating $\mu(t, x)$ on a uniform grid with spacing $\Delta = 1/(n+1)$ and then perturbing these values using normally distributed random numbers to give
$$z_i = \mu(t_i, x) + \varepsilon_i, \qquad \varepsilon_i \sim N(0, 2), \qquad i = 1, 2, \dots, n,$$
with $t_i = i\Delta$. The choice of standard deviation was made to make small sample problems ($n = 32$) relatively difficult. The log likelihood is taken as
$$L(x) = -\frac{1}{2}\sum_{i=1}^n\left(z_i - \mu(t_i, x)\right)^2.$$
While the scale is not evident here, it resurfaces in its effects on the generated data.

Poisson data: A Poisson random number generator is used to generate random counts $z_i$ corresponding to $\mu(t_i, x)$ as the mean model. The log likelihood used is
$$L(x) = \sum_{i=1}^n\left\{z_i\log\left(\frac{\mu(t_i, x)}{z_i}\right) + \left(z_i - \mu(t_i, x)\right)\right\}.$$
Note that if $z_i = 0$ then the contribution from the logarithm term to the log likelihood is zero. The rows of the least squares problem design matrix are given by
$$e_i^T A = \left[\frac{1}{\sqrt{s_i}},\; \frac{\exp(-x(3)t_i)}{\sqrt{s_i}},\; \frac{-x(2)\,t_i\exp(-x(3)t_i)}{\sqrt{s_i}}\right], \qquad i = 1, 2, \dots, n,$$
where $s_i = \mu(t_i, x)$. The corresponding components of the right hand side are
$$b_i = \frac{z_i - \mu(t_i, x)}{\sqrt{s_i}}.$$
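A numpy sketch assembling the scoring design matrix and right hand side for the Poisson case of the exponential model; `mu` and `mu_jacobian` are the illustrative helpers from the Gauss-Newton sketch above:

```python
import numpy as np

def poisson_scoring_system(x, t, z):
    """Rows V_i^{-1/2} grad mu_i and rhs (z_i - mu_i) / sqrt(mu_i)
    for the Poisson scoring correction of the exponential model."""
    s = mu(x, t)                        # s_i = mu(t_i, x), the Poisson means
    w = 1.0 / np.sqrt(s)                # V_i^{-1/2} with V_i = mu_i
    A = mu_jacobian(x, t) * w[:, None]  # weighted design rows e_i^T A
    b = (z - s) * w                     # weighted residuals b_i
    return A, b

# The correction is then the least squares solution of A h = b:
# h, *_ = np.linalg.lstsq(A, b, rcond=None)
```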

Numerical experiments comparing the performance of the line search (LS) and trust region (TR) methods are summarised below. For each $n$ the computations were initiated with 10 different seeds for the basic random number generator, and the average number of iterations is reported as a guide to algorithm performance. The parameter settings used are $\alpha = 2.5$, $\beta = .1$ for the trust region method and $\rho = .25$ for the Armijo parameter used in the line search. Experimenting with these values (for example, the choice $\alpha = 1.5$, $\beta = .5$) made very little difference in the trust region results. Convergence is assumed if $\nabla_x L\, h < 1.0\mathrm{e}{-8}$. This corresponds to final values of $\|h\|$ in the range $1\mathrm{e}{-4}$ to $1\mathrm{e}{-6}$.
[Table: average iteration counts for LS and TR on the Normal and Poisson data over a range of $n$; starred entries indicate nonconvergence.]

The starred entries in the table correspond to two cases of nonconvergence. The figure shows (in red) the current estimate after 50 iterations, together with the data and the starting estimate. This gives an illustration that the set of approximations is not closed: the result shows a straight line fit.

Second example - Gaussians with an exponential background:
$$\mu = x(1)\exp(-x(2)t) + x(3)\exp\left(-(t - x(4))^2/x(5)\right) + x(6)\exp\left(-(t - x(7))^2/x(8)\right).$$
Line search case:
$$x(1) = 5,\; x(2) = 10,\; x(3) = 18,\; x(4) = .3333,\; x(5) = .05,\; x(6) = 15,\; x(7) = .6667,\; x(8) = .$$

The results here are much more of a mixed bag. The problems are closely related to the choice of starting values. The figure is produced using peak widths of .01, which makes it easier to see what is going on. In this case the starting values do not see the second peak; both the background and the first peak are picked up well, but the initial conditions miss the second peak.

Third example - trinomial data. Data for a trinomial example ($m = 3$) came from a consulting exercise with CSIRO. It is derived from a study of the effects of a cattle virus on chicken embryos.
[Table: cattle virus data - counts of dead, deformed and normal embryos against $\log_{10}(\text{titre})$.]
The model suggested fits the frequencies explicitly:
$$\omega_1 = \frac{1}{1 + \exp(-\beta_1 - \beta_3\log(t))}, \qquad \omega_2 = \frac{1}{1 + \exp(-\beta_2 - \beta_3\log(t))}, \qquad \omega_3 = 1 - \omega_1 - \omega_2.$$
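A small sketch of the frequency model, useful when developing the algorithm directly in terms of the frequencies as suggested on the next slide; the sign conventions inside the logistic terms are reconstructed assumptions:

```python
import numpy as np

def trinomial_frequencies(beta, t):
    """Fitted frequencies (omega1, omega2, omega3) at titre t."""
    omega1 = 1.0 / (1.0 + np.exp(-beta[0] - beta[2] * np.log(t)))
    omega2 = 1.0 / (1.0 + np.exp(-beta[1] - beta[2] * np.log(t)))
    omega3 = 1.0 - omega1 - omega2
    return omega1, omega2, omega3
```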

It makes sense to develop the algorithm in terms of the frequencies. The numerical results show an impressive rate of convergence for a relatively small data set, which suggests the model analysis is good.
[Table: results of the computations for the trinomial data - iteration count, $L$, $\nabla_x L\, h$, $\beta_1$, $\beta_2$, $\beta_3$.]

In conclusion. Work on the Levenberg algorithm in the 1970s was responsible for at least some of the encouragement for the shift from line search to trust region methods in optimization. It can now be argued that this component of the move had somewhat dubious validity.
1. Use of the expected Hessian is already a regularising step. Improved conditioning derived from the trust region parameter could be illusory if the aim is small values for rapid convergence. If significant values of $\pi$ are required then, in the data analytic context, the modelling could well be suspect.
2. The old papers relied on a small residual argument to explain good convergence rates. Again, in our context this is not satisfactory. It is not completely obvious what the effect of non zero trust region parameters is in any particular case.
3. The trust region algorithm does not scale as well as the line search scoring algorithm.
4. Global convergence results of similar power appear to be available for both approaches.
