STATISTICAL ANALYSIS OF CURVE FITTING IN ERRORS-IN-VARIABLES MODELS
ALI AL-SHARADQAH


STATISTICAL ANALYSIS OF CURVE FITTING IN ERRORS-IN-VARIABLES MODELS

by

ALI AL-SHARADQAH

NIKOLAI CHERNOV, COMMITTEE CHAIR
WEI-SHEN HSIA
CHARLES KATHOLI
IAN KNOWLES
BORIS KUNIN

A DISSERTATION

Submitted to the faculty of the University of Alabama at Birmingham, in partial fulfillment of the requirements of the degree of Doctor of Philosophy

BIRMINGHAM, ALABAMA
2011

STATISTICAL ANALYSIS OF CURVE FITTING IN ERRORS-IN-VARIABLES MODELS

ALI AL-SHARADQAH

APPLIED MATHEMATICS

ABSTRACT

This dissertation is devoted to the problem of fitting geometric curves such as lines, circles, and ellipses to a set of experimental observations both of whose coordinates are contaminated by noise. This kind of regression is known as the Errors-in-Variables (EIV) model, which is quite different from, and much more difficult to solve than, classical regression. This research is motivated by the wide range of EIV applications in computer vision and image processing. We adopt statistical assumptions suitable for these applications and study the statistical properties of two kinds of fits, geometric and algebraic, for line, circle, and ellipse fitting. The main contribution of the dissertation is the proposal of several new fits for both the circle and the ellipse fitting problems. These fits were discovered after we developed an unconventional statistical analysis that allows us to effectively assess EIV parameter estimates. This approach was validated through a series of numerical tests. We theoretically compare the most popular fits for circles and ellipses to each other and show why, and by how much, each fit differs from the others. Our theoretical comparison leads to new fits with superior characteristics that surpass all existing fits both theoretically and experimentally. Another contribution is a discussion of some statistical issues in circle fitting. We prove that the most popular and accurate fits have infinite absolute first moment, while the one with a finite first moment is, paradoxically, the least accurate and has the heaviest bias. We also prove that the geometric fit returns an absolutely continuous estimator.

Keywords: errors-in-variables (EIV) models, conic fitting, circle fitting, geometric estimation, orthogonal least squares

DEDICATION

TO MY BELOVED PARENTS
TO MY WIFE OLA NUSIERAT
TO MY KIDS KAREEM AND JANA

ACKNOWLEDGEMENTS

I would like to express my deepest gratitude to my advisor, Professor Nikolai Chernov. Without his support, I would never have successfully finished my dissertation. Without his guidance and the precious time he gave me, I would never have become an independent researcher with a solid mathematical and statistical background. To the faculty of the U.A.B. mathematics department, in particular R. Weikard, Y. Karpeshina, G. Stolz, and Y. Zeng: I thank you for your continuous support and encouragement. I would also like to express my deep gratitude to my PhD committee members, W. Hsia, C. Katholi, I. Knowles, and B. Kunin. Of course, I am grateful to my parents, my wife, and my family. Last but not least, to my friends Rami Alahmad, Kendrick White, Derar Issa, Mohammad Muzyan, Ali Shati, and Susan Abdoli: thank you all for your love, support, and continuous encouragement.

TABLE OF CONTENTS

ABSTRACT
DEDICATION
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES

CHAPTER 1. INTRODUCTION
    Statistical Model
    Geometric Fitting
    Linear Fitting
    Geometric Fitting for Nonlinear Models
    MLE Versus Geometric Fit
    Statistical Properties of EIV Models
    Scope of the Dissertation
    Algebraic Fits and Weighted Algebraic Fits
    Algebraic Circle Fits
    Algebraic Conic Fits
    Weighted Algebraic Conic Fits
    Organization of the Dissertation

CHAPTER 2. CIRCLE ALGEBRAIC FITS AND THE PROBLEM OF THE MOMENTS
    Algebraic Circle Fits
    Kåsa Fit
    Pratt's Fit
    Taubin's Fit
    Numerical Experiments
    Hyperaccurate Fit
    The Problem of the Moments
    Proofs
    Regularity Condition
    Pratt's Fit
    Taubin's Fit
    Hyper Fit
    Kåsa's Fit

CHAPTER 3. ERROR ANALYSIS: A GENERAL SCHEME
    3.1. Introduction and Notations
    Asymptotic Models
    The Errors
    Consistency Versus Geometric Consistency
    Taylor Expansion
    Example: Geometric Linear Fitting
    Kanatani-Cramér-Rao Lower Bound (KCR)
    Cramér-Rao Lower Bound for General Curve Fitting in EIV Models
    KCR
    Components of MSE
    Classification of Higher Order Terms
    Kanatani's Classification of Higher Order Terms
    Total Mean Square Error
    Covariances
    Assessing the Quality of Estimators
    MSE of β̂
    Adjusted Maximum Likelihood Estimator (AMLE)
    AMLE for Linear Regression
    Numerical Experiments

CHAPTER 4. GEOMETRIC FITTING FOR CIRCLES
    Introduction
    Existence of Densities for the Circle Fit
    Error Analysis of the Geometric Circle Fit
    Mean Square Error of Θ̂
    Classifications of Terms
    MSE of ∆2Θ̂
    cov(∆1Θ̂, ∆3Θ̂)
    Final formula for the MSE
    Numerical Experiments

CHAPTER 5. ERROR ANALYSIS OF ALGEBRAIC CONIC FITS
    Conic Fitting and General Framework
    Statistical Model
    Matrix Perturbation
    Variance of Algebraic Fits
    Bias of Algebraic Fits
    Error Analysis of Popular Algebraic Linear Fits
    Our Proposed Linear Fit
    Error Analysis of Popular Algebraic Circular Fits
    Comparison of Various Circle Fits
    Bias of Pratt's and Taubin's Fits
    Transition Between Parameter Schemes
    Variance and Bias of Algebraic Circle Fits in the Natural Parameters
    Numerical Experiment
    Novel Circle Fits Based on Our Statistical Analysis
    Hyperaccurate Algebraic Fit
    Improved Hyperaccurate Fit
    Adjusted MLE for Circle Fitting
    Experimental Tests
    Error Analysis for Algebraic Ellipse Fits
    Variance and Bias of Algebraic Ellipse Fits
    Comparison of Various Algebraic Conic Fits
    Our Proposed Fits for Conic Fitting
    Experimental Tests
    Summary

CHAPTER 6. NUMERICAL EFFICIENT SCHEMES FOR ELLIPSE FITTING
    Motivation
    Gradient Weighted Algebraic Fit
    Numerical Schemes for Conic Fitting
    Numerical Schemes
    Solving the Variational Equation (6.14) and Its Variants
    Renormalization Scheme
    Reduced Scheme
    Matrix Perturbation and Covariance Matrix
    Variance
    Useful Identities
    Statistical Analysis of Popular Conic Fits
    Bias of Reduced Scheme
    Bias of GRAF
    Error Analysis of Fits Based on the Generalized Eigenvalue Problem
    Bias of Renormalization Scheme
    Proposed Algorithm
    Numerical Experiments

CHAPTER 7. VALIDATION OF THE ERROR ANALYSIS
    Validation of Our Approximative Formulas
    General Criterion
    Linear Regression
    Circular Regression
    Conclusions

REFERENCES

Appendix. Proof of Some Theorems and Lemmas

LIST OF TABLES

3.1 The order of magnitude of the four terms in (3.35)
Mean square error (and its components) for the geometric circle fit (10^4 values are shown). In this test n points are placed (equally spaced) along a semicircle of center (0, 0) and radius R = 1. The noise level is σ =
Mean square error (and its components) for four circle fits (10^4 values are shown). In this test n = 100 points are placed (equally spaced) along a semicircle of radius R = 1 and the noise level is σ =
Mean square error (and its components) for four circle fits (10^4 values are shown). In this test n = 100 points are placed (equally spaced) along a semicircle of radius R = 1 and the noise level is σ =
Mean square error (and its components) for four circle fits (10^6 values are shown). In this test n = points are placed (equally spaced) along a semicircle of radius R = 1 and the noise level is σ =
Mean square error (and its components) for four circle fits (10^6 values are shown). In this test n = 100 points are placed (equally spaced) along a semicircle of radius R = 1 and the noise level is σ =
Mean square error of R̂ (and its components) for four circle fits (10^6 values are shown). In this test n = 20 points are placed (equally spaced) along a semicircle of radius R = 1 and the noise level is σ =
The nonessential bias for both the least squares fit and Taubin's fit

LIST OF FIGURES

1.1 Dot-dash line represents the geometric fit, dashed line represents the classical fit, while the solid line is the true line
Normalized mean square error for various circle fits versus the noise σ
Optional caption for list of figures
RMSE(β̂)/σ for MLE and AMLE versus σ. The horizontal line represents KCR divided by σ
This diagram commutes, i.e., G L = M G
The normalized root mean square error, RMSE(Â)/σ, versus σ
The arc containing the true points
MSE for various circle fits (on the logarithmic scale) versus the sample size n (from 10 to 10^4)
Normalized root mean square error for two algebraic ellipse fits, LSF (green) and Taubin (red), and KCR (green). In this test n = 100 points are placed (equally spaced) along half of an ellipse of center (0, 0), major axis 100, and minor axis 50. The noise level σ varies from 0 to
10^4 values are shown of the normalized root mean square error, i.e. RMSE/σ, and the bias for three algebraic ellipse fits: Taubin fit (solid line), Hyper fit (dots), and Improved Hyper fit (dashes), while KCR is represented by the dash-dot line. 100 points are placed (equally spaced) along half of an ellipse of center (0, 0), major axis 100, and minor axis 50. σ varies from .001 to
5.6 10^4 values are shown of the normalized root mean square error, i.e. RMSE/σ, and the bias for three algebraic ellipse fits: Taubin fit (dash-dot), Hyper fit (dots), and Improved Hyper fit (dashes). 50 points are placed (equally spaced) along a quarter of an ellipse centered at (0, 0) with major axis 100 and minor axis 50. The noise level σ varies from .001 to 0.6. KCR is represented by the solid line
T-Taubin, HF-Hyperfit, IH1-Improved Hyperfit (Ours 1), F-FNS, H-HEIV, R-Reduced, R-Renormalization, IH2-Ours 2, and D_min is the trace of KCR, tr(M)
The p_L (dashed line) and p_Q (solid line) versus σ
The p_L (dashed line) and p_Q (solid line) versus σ; this figure corresponds to the radius estimate
The p_L (dashed line) and p_Q (solid line) versus σ; this figure corresponds to the center estimate

CHAPTER 1

INTRODUCTION

In classical regression one is usually concerned with the relationship between two variables x and y through a functional relation y = g(x). The variable x is called the regressor and the variable y is called the dependent variable. The question then arises immediately: what form could g(x) take? It might be a linear relationship, i.e. g(x) = α + βx, or quadratic, g(x) = αx² + βx + γ, and so on. Thus it is standard to record a set of observations and plot them to depict such a model. The next question is how we can find the best values of the parameter vector Θ = (α, β, ...) such that y ≈ g(x; α, β, ...). As a principle of classical regression, the variable x is considered error-free, while the dependent variable y is contaminated by some error. In other words, suppose n experimental observations (namely m_i = (x_i, y_i)^T, i = 1, ..., n) were recorded; then

y_i = g(x_i; Θ) + ε_i,  i = 1, ..., n,

where each ε_i, for i = 1, ..., n, is a small random error. As a standard assumption, each ε_i is an independently distributed normal random variable with mean 0 and variance σ². If σ² is unknown, it is called a nuisance (or latent) parameter. There are several ways to estimate Θ, but the most appealing one is the Maximum Likelihood Estimator (MLE), which can be found by maximizing the likelihood function. The latter is the product of the probabilities of the observed deviations, i.e.

L(Θ) = (2πσ²)^(−n/2) exp( −(1/(2σ²)) Σ_{i=1}^n (y_i − g(x_i, Θ))² ).

Instead, one can equivalently find Θ̂_ML by minimizing −log L(Θ), i.e.

(1.1)  F_c = Σ (y_i − g(x_i, Θ))² → min.

If g(x) = α + βx, then classical linear regression amounts to minimizing the objective function

(1.2)  F_c = Σ (y_i − (α + βx_i))²,

and hence the MLE is

(1.3)  β̂_c = s_xy / s_xx  and  α̂_c = ȳ − β̂_c x̄,

where we use the standard notation for the sample means, x̄ = (1/n) Σ x_i and ȳ = (1/n) Σ y_i, and for the components of the so-called "scatter matrix":

(1.4)  s_xx = Σ (x_i − x̄)²,  s_yy = Σ (y_i − ȳ)²,  s_xy = Σ (x_i − x̄)(y_i − ȳ).

The classical linear regression was published by Legendre in 1805 and by Gauss in 1809. The estimators α̂_c and β̂_c have excellent statistical properties; they are optimal in all senses. The classical regression is built on the assumption that the regressor variable x is error-free. However, when some regressors have been measured with errors, this standard assumption leads to inconsistent estimates. Their biases persist even in very large samples, and thus the above estimators (α̂_c, β̂_c) lose their appealing features. Regression models in which all variables are subject to errors are known as Errors-in-Variables (EIV) models [61, 17, 67]. The EIV regression problem is quite different from (and far more difficult than) the classical regression. EIV regression, even in the linear case, presents challenging questions and leads to some counterintuitive results.
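The closed-form estimates (1.3) and (1.4) are straightforward to compute. The following short sketch (Python with NumPy; the data are simulated here and all variable names are ours, not the dissertation's) illustrates them:

    import numpy as np

    rng = np.random.default_rng(0)

    # Simulate the classical model: x is error-free, y = alpha + beta*x + noise
    alpha_true, beta_true, sigma = 1.0, 3.0, 0.05
    x = np.linspace(0.0, 1.0, 30)
    y = alpha_true + beta_true * x + sigma * rng.standard_normal(x.size)

    # Sample means and components of the "scatter matrix", equation (1.4)
    xbar, ybar = x.mean(), y.mean()
    s_xx = np.sum((x - xbar) ** 2)
    s_xy = np.sum((x - xbar) * (y - ybar))

    # Classical MLE, equation (1.3)
    beta_hat = s_xy / s_xx
    alpha_hat = ybar - beta_hat * xbar
    print(alpha_hat, beta_hat)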

Fitting straight lines to observed data when both coordinates x and y are corrupted by errors dates back to the 1870s, when Adcock published two papers [1, 2]. He derived formulas for the slope and intercept estimates of the fitted line. His calculations were based on geometric rather than statistical considerations. Statisticians have studied this problem since 1901 [58]. Intensive research was focused on line fitting in the twentieth century because of its applications in economics, the sciences, image processing, and computer vision. By the late 1990s most of the major issues of line fitting appeared resolved. The problem of fitting circles or circular arcs dates back to the 1950s, when British engineers and archaeologists examined stone rings in the British Isles. The arrangement of the stones is either a circle or an ellipse (or, more rarely, a setting of four stones laid on an arc of a circle). They were concerned with whether the ancient people used common units of length or not [10, 30, 72, 73, 74]. In the 1970s, fitting circles to experimental data appeared in microwave engineering. Circular fitting also emerged in medicine: in many diseases, such as lung cancer, detection of nodules (which are either circular or elliptical) is a real problem for the diagnosis of lung cancer (the most common cause of cancer death in the world). This illustrates the importance of contour fitting in medicine. Circular fitting also dominates publications in nuclear physics. In high energy physics millions of particles are born in accelerators; after their circular-shaped trajectories are tracked, their energies can be measured by estimating the radii of those trajectories. As we can see, fitting lines, circles, and ellipses has a variety of applications. However, major applications of fitting circles and other geometric shapes appear in computer vision, image processing, and pattern recognition (it is considered a basic task in these computer sciences). In this dissertation we study the geometric fitting of some geometric contours (lines, circles, and ellipses). Our work is tailored for image processing applications, and as such we adopt standard statistical assumptions that are appropriate for these

applications, and we develop an unconventional approach to study the statistical properties of geometric fitting. More details will be discussed later in the dissertation.

1.1. Statistical Model

Our problem can be formulated as follows. Given n experimental observations (namely m_i = (x_i, y_i)^T, i = 1, ..., n) and a family of curves depending on k parameters θ_1, ..., θ_k, our goal is to estimate the parameter vector Θ = (θ_1, ..., θ_k)^T that corresponds to the best fitting curve. For example, in fitting lines we have two parameters to estimate, (α, β), which describe the model y = α + βx. Similarly, in fitting circles we may want to estimate the coordinates of the circle center (a, b) and the radius R that describe the best fit to the experimental observations. Suppose one is fitting curves defined by an implicit equation

(1.5)  P(x, y; Θ) = 0.

Let Θ̃ = (θ̃_1, ..., θ̃_k)^T denote the true parameter vector corresponding to the true curve. Here we should distinguish between the true points and the observations. We call the point m̃_i = (x̃_i, ỹ_i)^T, of which m_i = (x_i, y_i)^T is an inaccurate measurement and which lies on the true curve, the true point. Mathematically, it satisfies the implicit equation

(1.6)  P(m̃_i; Θ̃) = 0,  i = 1, ..., n.

For example, the true points in linear regression satisfy

(1.7)  ỹ_i = α̃ + β̃ x̃_i,  i = 1, ..., n,

where (α̃, β̃) denote the true parameters. In circular regression they satisfy

(1.8)  (x̃_i − ã)² + (ỹ_i − b̃)² = R̃²,  i = 1, ..., n,

where (ã, b̃, R̃) denote the true (unknown) parameters. Therefore

(1.9)  x̃_i = ã + R̃ cos ϕ_i,  ỹ_i = b̃ + R̃ sin ϕ_i,

where ϕ_1, ..., ϕ_n specify the locations of the true points on the true circle. The angles ϕ_1, ..., ϕ_n are regarded as fixed unknowns and treated as additional (latent) parameters of the model. On the other hand, the observed points m_i are regarded as realizations (observations) of the true points m̃_i. In other words, we assume that each m_i is a random perturbation of a true point m̃_i, i.e. m_i = m̃_i + e_i, where e_i = (δ_i, ε_i)^T is a small random vector. In coordinates,

(1.10)  x_i = x̃_i + δ_i,  y_i = ỹ_i + ε_i,  i = 1, ..., n.

To understand the statistical properties of an estimator Θ̂, we should adopt realistic assumptions about the probability distribution of the observations. The noise vectors e_i = m_i − m̃_i are assumed to be independent and have zero mean. The independence assumption is essential in practice, and it will be a standard assumption in this dissertation, which is devoted to computer vision applications: in edge detection, pattern recognition, and computer vision, detecting (observing) any point m_i does not give any information about other points, and hence the errors are independent. It is also reasonable to assume that the errors e_i have the same covariance matrix for all points, because we use the same algorithm for edge detection. As a common assumption, we assume that δ_i and ε_i are independent and normally distributed with a common variance σ², i.e.

(1.11)  δ_i ~ N(0, σ²),  ε_i ~ N(0, σ²),  i = 1, ..., n.

The true points (x̃_i, ỹ_i) can be thought of as fixed, so they are treated as nuisance parameters. This model is known as the functional model. The functional model is intensively studied in the literature and widely used in the applied community, especially in

computer vision. Therefore, this model is adopted in our work. Others regard the x̃'s and ỹ's as realizations of some underlying random variables; this treatment of the true values is known as the structural model. In the next section we leave statistics for a while and discuss the geometric fit (also called orthogonal distance regression (ODR)). Then we discuss the MLE in the EIV model and the relationship between the two.

1.2. Geometric Fitting

Let us first assume n experimental observations m_i = (x_i, y_i)^T are recorded. If the object of approximation is a functional relation between x and y, y = g(x), then one can obtain the best fitting curve y = g(x) by minimizing the sum of the squares of the vertical distances between g(x_i) and y_i. This kind of least squares is well known as ordinary least squares. Moreover, it coincides with the MLE obtained via classical regression whenever the regressors are error-free (see equation (1.1)). But if the object of approximation is a curve in the xy plane, then it is natural to minimize the sum of the squares of the geometric (orthogonal) distances from the observed points m_i to the true curve P(x, y; Θ) = 0, in other words, to minimize the objective function

(1.12)  F = Σ d_i²,

where d_i is the smallest Euclidean distance between m_i and a point on the curve P. This procedure is called the geometric fit or orthogonal distance regression (ODR). It has been used since the 1870s [1] and is commonly regarded as the best (most accurate) fitting method (under the above assumptions). In the following subsections we briefly discuss geometric fitting for lines, circles, and conics. Linear regression is used in this dissertation to explain our general analysis, which is devoted mainly to circle and conic fitting. Even though we obtain some results in this simple case, these results contribute only marginally to our main results.
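Before turning to specific fits, the following minimal sketch (Python/NumPy; the parameter values are illustrative and not taken from the dissertation) generates data according to the functional model (1.9)-(1.11) of Section 1.1, with a circle as the true curve. Similar configurations (points equally spaced along a semicircle of radius 1) appear in the numerical experiments reported later in the dissertation.

    import numpy as np

    rng = np.random.default_rng(1)

    # True circle parameters and noise level (illustrative values)
    a_true, b_true, R_true, sigma = 0.0, 0.0, 1.0, 0.05

    # Fixed (latent) angles placing the true points on a semicircle, as in (1.9)
    n = 20
    phi = np.linspace(0.0, np.pi, n)
    x_true = a_true + R_true * np.cos(phi)
    y_true = b_true + R_true * np.sin(phi)

    # Observations per (1.10)-(1.11): independent N(0, sigma^2) errors in both coordinates
    x = x_true + sigma * rng.standard_normal(n)
    y = y_true + sigma * rng.standard_normal(n)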

Linear fitting. With simple calculus and geometry, one can easily show that

(1.13)  F(α, β) = Σ_i d_i² = (1/(1 + β²)) Σ_i (y_i − α − βx_i)².

The minimizer of (1.13) is called the geometric linear fit. To find the minimizer of (1.13), we differentiate F(α, β) with respect to α and solve the equation ∂F/∂α = 0, which gives

(1.14)  α̂ = ȳ − β̂ x̄.

Substituting α̂ into (1.13) and working with the centered coordinates x_i − x̄ and y_i − ȳ yields

(1.15)  F(β) = (1/(1 + β²)) Σ ((y_i − ȳ) − β(x_i − x̄))²,

from which

(1.16)  F(β) = (s_yy − 2βs_xy + s_xx β²) / (1 + β²),

which is a function of one variable, and as such its minimum is attained at

(1.17)  β̂ = (s_yy − s_xx + √((s_yy − s_xx)² + 4s_xy²)) / (2s_xy).

Formula (1.17) holds whenever s_xy ≠ 0, which is true almost surely. In the case s_xy = 0 we have F(β) = (1 + β²)^(−1)(s_yy + s_xx β²), and simple algebra leads to the conclusion that β̂ = 0 if s_xx > s_yy, while β̂ = ∞ if s_xx < s_yy. Until the late 1980s all books and papers discussing line fitting dealt with the equation y = α + βx. This model fails if the data points lie on a vertical or even nearly vertical line. An alert reader can see from formula (1.17) that this case leads to a numerically unstable problem whenever s_xy ≈ 0. This led the

EIV community to use another line parametrization, such as

(1.18)  Ax + By + C = 0.

Hence F(α, β) can be expressed in terms of A = (A, B, C)^T as

(1.19)  F(A) = (1/(n(A² + B²))) Σ (Ax_i + By_i + C)²

(note that we intentionally divide the objective by n, which does not change the minimizer). Also note that if A = B = 0, then (1.18) describes the entire plane R² if C = 0 and an empty set if C ≠ 0. Therefore A² + B² > 0 is required to exclude meaningless cases. Furthermore, both the numerator and the denominator are quadratic functions of A; thus if we multiply A by a nonzero scalar, the minimizer of (1.19) remains unchanged. Thus it is natural to impose the geometric constraint

(1.20)  A² + B² = 1,  or equivalently  A^T I_0 A = 1,

where

(1.21)  I_0 = diag(1, 1, 0),

which automatically satisfies the restriction A² + B² > 0. Now the problem becomes minimizing the objective function

(1.22)  Σ (Ax_i + By_i + C)²  subject to  A^T I_0 A = 1.

In fact, the constraint A² + B² = 1 removes the indeterminacy of multiple solutions, but not entirely. So how can we choose the right one? Suppose for a while that the program solving (1.22) gives us A. To obtain the right solution, we should check the sign of C: if C ≥ 0, then we choose A itself; otherwise we pick −A.

But how can we solve equation (1.22)? Since C is unconstrained, it can be eliminated by minimizing F with respect to C. Differentiating F with respect to C and setting ∂F/∂C = 0, one gets C = −Ax̄ − Bȳ. Hence F becomes

F(A') = (1/n) Σ (A(x_i − x̄) + B(y_i − ȳ))²,

where A' = (A, B)^T. To implement linear algebra we define Z_i = (x_i − x̄, y_i − ȳ)^T and M_i = Z_i Z_i^T, and finally we define

M = (1/n) Σ M_i = (1/n) [ s_xx  s_xy
                          s_xy  s_yy ],

with s_xx, s_xy, s_yy as defined in (1.4). Next we write F as F(A') = (A')^T M A'. Minimizing F(A') subject to (1.20) can be done by employing a Lagrange multiplier λ, and hence by solving the variational equation

(1.23)  M A' = λ A'.

Since M is positive definite, solving the variational equation (1.23) gives two nonnegative real eigenvalues, say λ_1 ≤ λ_2. But which eigenvector should we choose? It is the unit eigenvector A'_1 corresponding to λ_1, because F(A'_1) = (A'_1)^T M A'_1 = λ_1 ‖A'_1‖² = λ_1.

Geometric Fitting for Nonlinear Models. In the dissertation we discuss geometric fitting for two kinds of geometric contours, circles and conics. We present them here briefly.
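Before moving on to circles and conics, here is a quick numerical illustration of the linear fitting just described: a minimal sketch (Python/NumPy, with simulated data of our own choosing) that computes the geometric line fit both from the closed-form slope (1.17) with intercept (1.14) and as the unit eigenvector of M corresponding to its smallest eigenvalue, as in (1.23). The two routes describe the same line.

    import numpy as np

    # Simulated noisy points around the true line y = 1 + 3x (illustrative data)
    rng = np.random.default_rng(2)
    x_true = np.linspace(0.0, 1.0, 30)
    y_true = 1.0 + 3.0 * x_true
    x = x_true + 0.05 * rng.standard_normal(30)
    y = y_true + 0.05 * rng.standard_normal(30)

    xbar, ybar = x.mean(), y.mean()
    s_xx = np.sum((x - xbar) ** 2)
    s_yy = np.sum((y - ybar) ** 2)
    s_xy = np.sum((x - xbar) * (y - ybar))

    # Route 1: closed-form slope (1.17) and intercept (1.14)
    beta_hat = (s_yy - s_xx + np.hypot(s_yy - s_xx, 2.0 * s_xy)) / (2.0 * s_xy)
    alpha_hat = ybar - beta_hat * xbar

    # Route 2: unit eigenvector of M for the smallest eigenvalue, equation (1.23)
    M = np.array([[s_xx, s_xy], [s_xy, s_yy]]) / x.size
    eigvals, eigvecs = np.linalg.eigh(M)   # eigenvalues returned in ascending order
    A, B = eigvecs[:, 0]                   # normal vector (A, B) of the fitted line
    C = -A * xbar - B * ybar
    if C < 0:                              # sign convention: keep C >= 0
        A, B, C = -A, -B, -C

    # Both parametrizations give the same line: the slope -A/B equals beta_hat
    print(beta_hat, -A / B)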

Circular fitting. Suppose one wants to fit a circle to the observed points; then d_i in (1.12) stands for the distance from (x_i, y_i) to the circle,

(1.24)  d_i = r_i − R,  with  r_i = √((x_i − a)² + (y_i − b)²),

where (a, b) denotes the center and R the radius of the circle. The geometric circle fit is known as the most accurate circle fit. A major concern with the geometric fit, however, is that the above minimization problem has no closed form solution. All practical algorithms for minimizing F are iterative; some implement general Gauss-Newton [15, 36] or Levenberg-Marquardt [21] schemes, which are standard procedures for minimizing nonlinear least squares problems. Others use circle-specific methods, such as those proposed by Landau [49] and Späth [69]. The performance of iterative algorithms depends heavily on the choice of the initial guess. They often take dozens or hundreds of iterations to converge, and there is always a chance that they will be trapped in a local minimum of F or diverge entirely. These issues were explored in [21].

Conic fitting. The situation in conic fitting is worse. Given n points m_i, i = 1, ..., n, the orthogonal distance d_i between the conic P(x, y; Θ) = 0 and m_i has no simple analytic formula. This distance can be computed by solving a polynomial equation of the 4th degree, which is a numerically unstable procedure; the algorithm returns solutions involving complex numbers [3]. Instead, one can use iterative schemes such as Newton's method or its modifications, such as the Levenberg-Marquardt algorithm (which is the most accurate scheme for solving nonlinear least squares problems). However, this turns out to be extremely expensive and still unreliable. Furthermore, Newton's method requires computing the partial derivatives of d_i, but the latter have no closed form expression either, so some authors use a finite difference approximation, which also turns out to be inaccurate. These observations make using Newton's method and its implementation schemes impractical for ellipse

fitting. Finally, many researchers, such as [53], pointed out that the Levenberg-Marquardt algorithm is very expensive and can take up to 1000 iterations to converge.

1.3. MLE Versus Geometric Fit

Whenever the estimation of parameters becomes an issue, the Maximum Likelihood Estimator (MLE) takes a prominent position. Since we adopt the functional model, the m̃_i are treated as nonrandom (incidental) parameters constrained through the implicit equation (1.6). Assume that the e_i are independent and normally distributed with mean 0 = (0, 0)^T and covariance matrix σ²I_2, where I_2 is the identity matrix of size 2. The likelihood function is

L(Θ, m̃_1, ..., m̃_n) = Π_{i=1}^n f(m_i),  under the constraints P(m̃_i, Θ) = 0 for i = 1, ..., n,

where

f(m_i) = (1/(2πσ²)) exp( −(1/(2σ²)) (m_i − m̃_i)^T (m_i − m̃_i) ),  i = 1, ..., n.

Finding Θ̂, the maximizer of the constrained likelihood function L(Θ, m̃_1, ..., m̃_n), is not an easy task without using the logarithm. Thus the MLE can be obtained by minimizing −log L(Θ, m̃_1, ..., m̃_n), where

−log L(Θ, m̃_1, ..., m̃_n) = Σ_i [ (1/(2σ²)) (m_i − m̃_i)^T (m_i − m̃_i) + log 2πσ² ].

Adding a constant to a function does not change the location of its minimizer. This means that the above problem reduces to minimizing the objective function

(1.25)  F(Θ, m̃_1, ..., m̃_n) = Σ d_i²,

where

(1.26)  d_i² = (m_i − m̃_i)^T (m_i − m̃_i) = (x_i − x̃_i)² + (y_i − ỹ_i)²

is the Mahalanobis distance of m_i from m̃_i. The above minimization problem is quite unclear, since it depends on the unknown true points m̃_i. Let us consider the simplest case, fitting lines to data. In this case ỹ_i = α + βx̃_i, hence F can be written as

(1.27)  F(Θ, m̃_1, ..., m̃_n) = Σ_i [ (x_i − x̃_i)² + (y_i − (α + βx̃_i))² ].

Differentiating F with respect to x̃_i gives

∂F(Θ, m̃_1, ..., m̃_n)/∂x̃_i = −2[ (x_i − x̃_i) + β(y_i − (α + βx̃_i)) ].

To get the MLE of x̃_i we set ∂F(Θ, m̃_1, ..., m̃_n)/∂x̃_i = 0 and obtain

x̂_i = (x_i + β(y_i − α)) / (1 + β²),  and  ŷ_i = α + βx̂_i.

If we substitute x̂_i and ŷ_i into (1.27), we get (1.13). This means that the MLE can be obtained by minimizing the sum of the squares of the geometric (orthogonal) distances from the observed points to the true line. For general nonlinear EIV models the relation between ODR and MLE is much more ambiguous and is barely mentioned in the literature, even though it was established by Chan in 1965 [15]. To present the proof of this fact, we introduce a simple but powerful observation.

Definition 1.1 (Minimization-in-steps Technique). Let A and B be two compact subsets of p- and q-dimensional Euclidean space, respectively. If h is a real-valued continuous function on the Cartesian product A × B in (p + q)-dimensional Euclidean space, then the minimization of h over A × B can be taken by steps, over A and then over B, that is,

min_{A × B} h = min_B ( min_A h ).

In the functional model, estimating the parameters can be formulated as follows. Let u(t, Θ) and v(t, Θ) be two real-valued functions of the k + 1 real variables t, θ_1, ..., θ_k. For each Θ, the point (u(t, Θ), v(t, Θ)) traces a curve in the plane as t varies.

This means that for each (x̃_i, ỹ_i) there exists t̃_i such that

x̃_i = u(t̃_i, Θ̃),  ỹ_i = v(t̃_i, Θ̃),  for i = 1, ..., n.

We see immediately that in circular fitting t̃_i equals the angle ϕ_i given in (1.9). Finally, we assume that the curve represented by the parametrization (u(t, Θ), v(t, Θ)) is smooth.

Theorem 1.1. [15] Assume that the noise e_i = (δ_i, ε_i) satisfies (1.10) and (1.11) for each i = 1, ..., n, and that the primary parameter vector Θ is constrained through the implicit equation (1.6). Then the MLE Θ̂ is attained on the curve that minimizes the sum of squares of the orthogonal distances to the data points.

Proof. According to the above discussion, we can write (1.25) as

(1.28)  F_1(Θ, t̃_1, ..., t̃_n) = Σ_i [ (x_i − u(t̃_i, Θ))² + (y_i − v(t̃_i, Θ))² ].

To minimize the objective function (1.28) we use the minimization-in-steps technique. Accordingly, we minimize (1.28) with respect to the nuisance parameters t̃_1, ..., t̃_n, keeping θ_1, ..., θ_k fixed; then we minimize the resulting function with respect to the primary parameter vector Θ. In other words, the curve P(x, y, Θ) = 0 is kept fixed, and hence the conditional minimum of F(Θ, t̃_1, ..., t̃_n) is attained when each t̂_i, for i = 1, ..., n, corresponds to the point m̂_i = (x̂_i, ŷ_i) that lies on the curve P(x, y, Θ) = 0 and has the shortest distance to the observed point m_i, i.e.

(1.29)  min_{t̃_1, ..., t̃_n} Σ_i [ (x_i − u(t̃_i, Θ))² + (y_i − v(t̃_i, Θ))² ] = Σ_i d_i(Θ)²,

where d_i(Θ) denotes the (geometric) orthogonal distance from m_i to the curve given in (1.5); i.e., the minimum of F is attained when m̂_i is the orthogonal projection of m_i onto the curve P(x, y, Θ) = 0. Note that the existence and uniqueness of the point m̂_i for each i = 1, ..., n are ensured by the smoothness of the u and v that parameterize P(x, y, Θ).

Next we minimize (1.29) with respect to Θ. Thus the minimizer Θ̂ will be the minimizer of the sum of squares of the orthogonal distances to the data points, i.e.

(1.30)  Θ̂ = argmin F_1(Θ, m̃_1, ..., m̃_n) = argmin Σ d_i².

This completes the proof of the theorem.

1.4. Statistical Properties of EIV Models

It is a standard procedure in traditional statistics to examine the statistical properties of any estimator, such as the MLE, after adopting some statistical assumptions. Such assessments are based on standard measures such as efficiency, sufficiency, consistency, and others. These measures are mainly based on the bias, variance, mean square error (MSE), Cramér-Rao lower bound (CRB), and the asymptotic behavior of the estimator as n → ∞. The nature of the problem in EIV models, however, is intractable. In the linear model, the MLE (α̂, β̂) has awkward probability densities. In fact, those densities are not normal and do not belong to any standard family of probability densities. The formulas for their densities are overly complicated, involve doubly infinite series, and it was promptly noted [7, 8] that they were not very useful for practical purposes. In an attempt to understand the nature of the estimates (α̂, β̂), Anderson [7] assumed that δ_i and ε_i are i.i.d. normal random variables and proved that these estimates have infinite absolute first moment: E(|α̂|) = E(|β̂|) = ∞. As a result, the bias and the Mean Squared Error (MSE) are not defined either! These astonishing results led Anderson, Kunitomo, and Sawa [7, 8, 48] to investigate the MLE (α̂, β̂) further. They used Taylor expansions up to terms of order 3 and 4 and approximated the distribution functions of the MLE for the line, and the resulting approximate distributions had finite moments. Those approximations

agreed remarkably well with numerically computed characteristics of the actual estimators, at least in simulated experiments of all typical cases. Anderson and Sawa [8] noted that their approximations were virtually exact. At the same time, they compared β̂ with the slope estimate of the classical regression line, β̂_C, which is optimal when the x_i are error-free. However, the latter is quite inaccurate when both variables x and y are observed with random errors; it is known to be heavily biased [7, 19], even though it has finite moments. To demonstrate this phenomenon we generated 10^6 random samples of n = 30 points on the true line y = 1 + 3x, whose true points are positioned equally spaced on a segment of the true line of length L. The noise level σ was set such that σ/L = 0.05. For each sample we computed both parameter vectors (α̂, β̂) and (α̂_C, β̂_C). Finally, we computed their averages and drew the corresponding lines, as in Figure 1.1. It is clear from the figure how the classical regression estimates underestimate the true values of the parameters.

[Figure 1.1: Dot-dash line represents the geometric fit, dashed line represents the classical fit, while the solid line is the true line y = 1 + 3x.]

Returning to the results of Anderson and his coauthors, their results are summarized as follows. The mean square error of β̂ is infinite, while that of β̂_C is finite.

The estimate β̂ is consistent and asymptotically unbiased, while β̂_C is inconsistent and asymptotically biased (except when β = 0). In other words, paradoxically, the better estimates have theoretically infinite moments. So in practical terms, what does the nonexistence of moments mean anyway? Anderson explained these controversial properties of the MLE (α̂, β̂) by the following example. Suppose Z is a random variable with an infinite moment, which is a mixture of a standard normal random variable X, with weight 1 − p, and a "bad" Cauchy random variable Y, with a small weight p, i.e.

(1.31)  Z = (1 − p)X + pY.

Thus E(|Z|) = ∞, even though p may be very small. This means that Z and X appear virtually indistinguishable when p is small. Consequently, infinite moments occur because the distributions of these estimates have somewhat heavy tails, even though those tails barely affect the practical performance of the MLE, and as such one can use, in this case, a properly constructed approximative distribution. In nonlinear EIV regression, the probability densities of the MLE cannot be obtained explicitly. For example, the MLEs of the geometric parameters of other contours such as circles or ellipses have no closed form, and hence it is not known whether the probability densities of the center (â, b̂) and the radius R̂ exist. Hence the situation here is even more obscure than in the linear model. Furthermore, it was recently discovered [18] that the MLEs â, b̂, R̂ have infinite moments too, i.e. E(|â|) = ∞, E(|b̂|) = ∞, and E(|R̂|) = ∞. On the other hand, the least accurate fit (the Kåsa fit) returns estimates (â, b̂, R̂) that have a finite first moment; see [79]. Thus one faces the same methodological questions as in the linear case. Another difficulty in the EIV setting is studying the statistical property called consistency. In traditional statistics it is natural to seek estimators that converge in probability to the true parameters as n → ∞, i.e. θ̂_n → θ̃ in probability. Such estimators are called consistent estimators. This means that as the sample size

n increases, the distribution of the estimator becomes more and more concentrated near the true value of the parameter. In the EIV functional model the number of (nuisance) parameters grows with n, which makes tracking consistency an impossible task. To conclude, tracking the statistical properties of the estimator by standard approaches appears to be impossible, so an alternative approach is required.

1.5. Scope of the Dissertation

The previous discussion leads to immediate methodological questions:

(Q1) How can we characterize, in practical terms, the accuracy of estimates whose theoretical MSE is infinite (and whose bias is undefined)?

(Q2) Is there any precise meaning to the widely accepted notion that MLEs such as (α̂, β̂) or (â, b̂, R̂) are "best"?

My dissertation is part of a bigger project whose purpose is to revisit some difficulties in EIV regression studies and develop an unconventional approach to their resolution. The main goal of the dissertation is to answer the above questions and to address many other issues in EIV regression that remain largely unresolved. Our approach is tailored for image processing applications, where the number of observed points (pixels) is limited but the noise is small. Our general goal is to develop a simplified error analysis that nonetheless allows us to effectively assess the performance of existing fitting algorithms. In addition, based on our approach one can design algorithms with increased accuracy. Our study of circular and conic regression would not be complete without discussing algebraic fits and weighted algebraic fits. Thus, before closing this chapter we discuss them briefly; they are then discussed in full detail throughout the dissertation.

1.6. Algebraic Fits and Weighted Algebraic Fits

Since it is an iterative and hence expensive method, the geometric fit requires an initial guess, which should be chosen carefully to ensure the convergence of the

algorithm to the minimizer of the objective function. Consequently, alternative methods must be used to obtain accurate and non-iterative estimates. Usually, instead of computing and minimizing geometric distances, one minimizes various algebraic distances; such methods are therefore known as algebraic fits. Such fits are non-iterative, simpler, and faster, but most of them are usually less accurate than the geometric fit. Several approximations have been designed, which take the forms

(1.32)  F_1(Θ) = Σ [P(m_i, Θ)]²  or  F_2(Θ) = Σ [w_i P(m_i, Θ)]²

for some weights w_i that depend on the parameters and the experimental observations.

Algebraic Circle Fits. In the course of circle fitting, P takes the form

(1.33)  A(x² + y²) + Bx + Cy + D = 0.

From now on we denote by A = (A, B, C, D) the vector of all the algebraic circle parameters A, B, C, D. The simplest fit is known as the Kåsa fit [47], which uses the constraint A = 1. The Kåsa fit is common in practice and works very well for a full circle or a long circular arc, but it is heavily biased toward smaller circles when the data are clustered along a short arc. Pratt (and independently Chernov and Ososkov) [23, 60] proposed another fit by choosing w_i carefully; they minimize F_1 subject to B² + C² − 4AD = 1. Their choice leads to a significant reduction of the bias. Another elegant choice of w_i was proposed by Taubin [71]; the latter also works very well for conic fitting. A complete analysis of these fits will be given in full detail throughout the dissertation; a small numerical sketch of the simplest of them, the Kåsa fit, is given at the end of this chapter.

Algebraic Conic Fits. In conic fitting many more non-iterative algebraic fits were designed by Bookstein [13], Fitzgibbon et al. [28, 29], and others. More details of these and other fits will be given in the dissertation. However, all these algebraic fits are not statistically optimal, in the sense that they do not achieve the Kanatani-Cramér-Rao lower bound (KCR) (full details of these concepts and the proofs are

explained in Chapters 4, 5, and 6). Thus other types of conic fits become quite reasonable.

Weighted Algebraic Conic Fits. The geometric fit is the most accurate method, but its disadvantages, described above, have forced scientists to develop iterative methods that converge to statistically optimal estimates. These methods usually take only a few iterations to converge; typically they require 2 to 4 iterations to attain the optimal estimates, which coincide with the geometric fit up to the leading order. The first such estimate is obtained by approximating the geometric fit, so it is known as the approximated maximum likelihood estimate, also called the Gradient Weighted Algebraic Fit (GRAF). The estimate is a minimizer of the objective function that takes the formal expression F_2 given in equation (1.32). Finding the minimizer of this objective function requires solving an equation called the variational equation, which in turn has led scientists to propose several schemes for solving it. The first attempt was in 1997, when Leedan and Meer [50] proposed a scheme to solve the variational equation using the generalized eigenvalue problem; they called their scheme the Heteroscedastic Errors-In-Variables (HEIV) method. Another scheme was proposed in 2000 by Chojnacki et al. [25], who proposed another iterative scheme to solve the variational equation and called it the Fundamental Numerical Scheme (FNS). As a final remark, even though they minimize the same objective function, these estimates take different paths to converge to the minimizer. Another clever technique, used to obtain an optimal solution and reduce the bias of the parameter estimates, was proposed in the middle of the 1990s by Kanatani, who introduced the first- and second-order renormalization schemes, which soon took a prominent place in the study of curve fitting in computer vision. Kanatani developed his method in the early 1990s aiming at the reduction of bias and the minimization of variance in two standard image processing tasks, one of which was fitting ellipses to data and the other the computation of the fundamental matrix.

Kanatani's first publications were complicated and hard to understand. It took the computer vision community almost a decade to fully understand Kanatani's method, when Chojnacki et al. [24] compared it to other popular fitting algorithms and placed all of them within a unified framework. Kanatani's method does not minimize any objective function. Instead, he approximated the variational equation by another equation and then solved the resulting equation using a generalized eigenvalue problem. Another novel scheme is derived in this dissertation to obtain a statistically optimal estimate for conic fitting. Our method not only attains the KCR bound, but also has zero bias up to the second leading order, as we discuss in Chapter 6.

1.7. Organization of the Dissertation

My dissertation is organized as follows. Chapter 2 discusses the problem of infinite moments for the most popular circular algebraic fits, including ours. Chapter 3 presents the general scheme of our error analysis under general statistical assumptions and highlights the procedures for comparing two estimators on the same parameter space based on their statistical efficiency, to the leading order. The general scheme is explained with the simplest EIV model, linear regression. Chapter 4 is devoted to the geometric circle fit. In the first part of the chapter we provide a complete proof of the existence of the probability densities of the geometric fit (i.e. the MLE of (â, b̂, R̂)); the proof is based on Lebesgue measure theory. In the second part we apply our general scheme to the geometric circle fit. Chapter 5 discusses algebraic fits for lines, circles, and ellipses. We put all fits in a unified framework and discuss their error analysis in general, then study each fit separately. Accordingly, we compare the most popular circular algebraic fits together with the geometric fit. Such a comparison leads us to a non-iterative algebraic fit that outperforms the (usually considered unbeatable) geometric fit. The new method is called the Hyperaccurate fit (or Hyperfit for short). We will also

develop a new circle fit, which can be considered a modification of the MLE; thus we call it the Adjusted Maximum Likelihood Estimator (AMLE). Similarly, we compare several algebraic ellipse fits and propose a modified version of Hyperfit for ellipse fitting. The latter method outperforms all algebraic fits, but not the geometric fit. Chapter 6 surveys some numerical schemes for ellipse fitting and discusses their statistical properties through our analysis. This work allows us to present another fit that outperforms all existing numerical schemes for ellipse fitting, together with the geometric fit. Lastly, Chapter 7 is devoted to validating our general theory practically through numerical experiments. The work presented in Chapter 3, and the parts of Chapters 4 and 5 related to the circle fitting problem, were published in the Electronic Journal of Statistics. The contents of Chapter 7 are accepted for publication in the journal Theory of Probability and Mathematical Statistics. Chapter 2 represents a major portion of a third manuscript. Last but not least, the new results for the ellipse fitting problem presented in Chapters 5 and 6 will be submitted soon for publication.
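To close this chapter with something concrete, here is the small numerical sketch of the Kåsa fit promised in Section 1.6: a minimal implementation (Python/NumPy; the function and variable names are ours) of the least squares solution of (1.33) under the constraint A = 1. It is only a sketch, with no safeguards for degenerate data; Chapter 2 discusses this fit and its competitors in full detail.

    import numpy as np

    def kasa_fit(x, y):
        # Kasa fit: with A = 1 in (1.33), minimize sum (x^2 + y^2 + B*x + C*y + D)^2,
        # a linear least squares problem in (B, C, D).
        Z = np.column_stack([x, y, np.ones_like(x)])
        rhs = -(x ** 2 + y ** 2)
        B, C, D = np.linalg.lstsq(Z, rhs, rcond=None)[0]
        a, b = -B / 2.0, -C / 2.0          # estimated circle center
        R = np.sqrt(a ** 2 + b ** 2 - D)   # estimated radius
        return a, b, R

    # Example: noisy points on a semicircle of radius 1 centered at the origin
    rng = np.random.default_rng(3)
    phi = np.linspace(0.0, np.pi, 50)
    x = np.cos(phi) + 0.02 * rng.standard_normal(phi.size)
    y = np.sin(phi) + 0.02 * rng.standard_normal(phi.size)
    print(kasa_fit(x, y))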

CHAPTER 2

CIRCLE ALGEBRAIC FITS AND THE PROBLEM OF THE MOMENTS

In Chapter 1 we briefly discussed the geometric fit for circles (i.e. the MLE under our assumptions), the difficulties and disadvantages of implementing it practically, and the infinite moment problem that the MLE has. We also mentioned the most popular non-iterative algebraic circle fits, such as Kåsa, Pratt, and Taubin, which can also be considered approximations to the MLE. In this chapter we discuss these fits and their practical implementations in detail. We also briefly discuss the fit we proposed in [4] according to our general analysis (see Chapters 3 and 5); we called this fit the Hyperaccurate fit (or Hyper fit for short). Accordingly, we investigate one of the spectacular features of these fits: we prove that each of the Pratt, Taubin, and Hyperaccurate fits returns a center (â, b̂) and radius R̂ whose first absolute moments are infinite.

This chapter is organized as follows. Section 2.1 reviews the most popular algebraic fits. Section 2.2 states our main theorem regarding the nonexistence of the moments of the center and the radius for the Pratt fit, the Taubin fit, and the Hyper fit. Section 2.3 provides a proof of our theorem. Finally, Section 2.4 shows why these estimates possess this property while the Kåsa fit does not.

2.1. Algebraic Circle Fits

We describe the most popular algebraic circle fits here.

Kåsa Fit. The simplest and fastest method was introduced in the 1970s by Delogne [27] and Kåsa [47], and then rediscovered and published independently


Parametric Techniques Lecture 3 Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to

More information

ECO Class 6 Nonparametric Econometrics

ECO Class 6 Nonparametric Econometrics ECO 523 - Class 6 Nonparametric Econometrics Carolina Caetano Contents 1 Nonparametric instrumental variable regression 1 2 Nonparametric Estimation of Average Treatment Effects 3 2.1 Asymptotic results................................

More information

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate

More information

Inverse of a Square Matrix. For an N N square matrix A, the inverse of A, 1

Inverse of a Square Matrix. For an N N square matrix A, the inverse of A, 1 Inverse of a Square Matrix For an N N square matrix A, the inverse of A, 1 A, exists if and only if A is of full rank, i.e., if and only if no column of A is a linear combination 1 of the others. A is

More information

NATIONAL BOARD FOR HIGHER MATHEMATICS. M. A. and M.Sc. Scholarship Test. September 25, Time Allowed: 150 Minutes Maximum Marks: 30

NATIONAL BOARD FOR HIGHER MATHEMATICS. M. A. and M.Sc. Scholarship Test. September 25, Time Allowed: 150 Minutes Maximum Marks: 30 NATIONAL BOARD FOR HIGHER MATHEMATICS M. A. and M.Sc. Scholarship Test September 25, 2010 Time Allowed: 150 Minutes Maximum Marks: 30 Please read, carefully, the instructions on the following page 1 INSTRUCTIONS

More information

In this chapter, we provide an introduction to covariate shift adaptation toward machine learning in a non-stationary environment.

In this chapter, we provide an introduction to covariate shift adaptation toward machine learning in a non-stationary environment. 1 Introduction and Problem Formulation In this chapter, we provide an introduction to covariate shift adaptation toward machine learning in a non-stationary environment. 1.1 Machine Learning under Covariate

More information

Brief Review on Estimation Theory

Brief Review on Estimation Theory Brief Review on Estimation Theory K. Abed-Meraim ENST PARIS, Signal and Image Processing Dept. abed@tsi.enst.fr This presentation is essentially based on the course BASTA by E. Moulines Brief review on

More information

Linear Regression. Junhui Qian. October 27, 2014

Linear Regression. Junhui Qian. October 27, 2014 Linear Regression Junhui Qian October 27, 2014 Outline The Model Estimation Ordinary Least Square Method of Moments Maximum Likelihood Estimation Properties of OLS Estimator Unbiasedness Consistency Efficiency

More information

Introduction to Computer Graphics (Lecture No 07) Ellipse and Other Curves

Introduction to Computer Graphics (Lecture No 07) Ellipse and Other Curves Introduction to Computer Graphics (Lecture No 07) Ellipse and Other Curves 7.1 Ellipse An ellipse is a curve that is the locus of all points in the plane the sum of whose distances r1 and r from two fixed

More information

Tangent spaces, normals and extrema

Tangent spaces, normals and extrema Chapter 3 Tangent spaces, normals and extrema If S is a surface in 3-space, with a point a S where S looks smooth, i.e., without any fold or cusp or self-crossing, we can intuitively define the tangent

More information

The properties of L p -GMM estimators

The properties of L p -GMM estimators The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion

More information

Linear Methods for Prediction

Linear Methods for Prediction Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we

More information

2 Statistical Estimation: Basic Concepts

2 Statistical Estimation: Basic Concepts Technion Israel Institute of Technology, Department of Electrical Engineering Estimation and Identification in Dynamical Systems (048825) Lecture Notes, Fall 2009, Prof. N. Shimkin 2 Statistical Estimation:

More information

Maximum Likelihood Estimation. only training data is available to design a classifier

Maximum Likelihood Estimation. only training data is available to design a classifier Introduction to Pattern Recognition [ Part 5 ] Mahdi Vasighi Introduction Bayesian Decision Theory shows that we could design an optimal classifier if we knew: P( i ) : priors p(x i ) : class-conditional

More information

Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions

Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions International Journal of Control Vol. 00, No. 00, January 2007, 1 10 Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions I-JENG WANG and JAMES C.

More information

13. Nonlinear least squares

13. Nonlinear least squares L. Vandenberghe ECE133A (Fall 2018) 13. Nonlinear least squares definition and examples derivatives and optimality condition Gauss Newton method Levenberg Marquardt method 13.1 Nonlinear least squares

More information

Reproducing Kernel Hilbert Spaces

Reproducing Kernel Hilbert Spaces 9.520: Statistical Learning Theory and Applications February 10th, 2010 Reproducing Kernel Hilbert Spaces Lecturer: Lorenzo Rosasco Scribe: Greg Durrett 1 Introduction In the previous two lectures, we

More information

Review and continuation from last week Properties of MLEs

Review and continuation from last week Properties of MLEs Review and continuation from last week Properties of MLEs As we have mentioned, MLEs have a nice intuitive property, and as we have seen, they have a certain equivariance property. We will see later that

More information

Outline Lecture 2 2(32)

Outline Lecture 2 2(32) Outline Lecture (3), Lecture Linear Regression and Classification it is our firm belief that an understanding of linear models is essential for understanding nonlinear ones Thomas Schön Division of Automatic

More information

ECE 275A Homework 6 Solutions

ECE 275A Homework 6 Solutions ECE 275A Homework 6 Solutions. The notation used in the solutions for the concentration (hyper) ellipsoid problems is defined in the lecture supplement on concentration ellipsoids. Note that θ T Σ θ =

More information

av 1 x 2 + 4y 2 + xy + 4z 2 = 16.

av 1 x 2 + 4y 2 + xy + 4z 2 = 16. 74 85 Eigenanalysis The subject of eigenanalysis seeks to find a coordinate system, in which the solution to an applied problem has a simple expression Therefore, eigenanalysis might be called the method

More information

Linear regression COMS 4771

Linear regression COMS 4771 Linear regression COMS 4771 1. Old Faithful and prediction functions Prediction problem: Old Faithful geyser (Yellowstone) Task: Predict time of next eruption. 1 / 40 Statistical model for time between

More information

A measurement error model approach to small area estimation

A measurement error model approach to small area estimation A measurement error model approach to small area estimation Jae-kwang Kim 1 Spring, 2015 1 Joint work with Seunghwan Park and Seoyoung Kim Ouline Introduction Basic Theory Application to Korean LFS Discussion

More information

MATHEMATICS COMPREHENSIVE EXAM: IN-CLASS COMPONENT

MATHEMATICS COMPREHENSIVE EXAM: IN-CLASS COMPONENT MATHEMATICS COMPREHENSIVE EXAM: IN-CLASS COMPONENT The following is the list of questions for the oral exam. At the same time, these questions represent all topics for the written exam. The procedure for

More information

Lecture Notes 1: Vector spaces

Lecture Notes 1: Vector spaces Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector

More information

Statistics 3858 : Maximum Likelihood Estimators

Statistics 3858 : Maximum Likelihood Estimators Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,

More information

Math 302 Outcome Statements Winter 2013

Math 302 Outcome Statements Winter 2013 Math 302 Outcome Statements Winter 2013 1 Rectangular Space Coordinates; Vectors in the Three-Dimensional Space (a) Cartesian coordinates of a point (b) sphere (c) symmetry about a point, a line, and a

More information

MA 323 Geometric Modelling Course Notes: Day 07 Parabolic Arcs

MA 323 Geometric Modelling Course Notes: Day 07 Parabolic Arcs MA 323 Geometric Modelling Course Notes: Day 07 Parabolic Arcs David L. Finn December 9th, 2004 We now start considering the basic curve elements to be used throughout this course; polynomial curves and

More information

Parametric Techniques

Parametric Techniques Parametric Techniques Jason J. Corso SUNY at Buffalo J. Corso (SUNY at Buffalo) Parametric Techniques 1 / 39 Introduction When covering Bayesian Decision Theory, we assumed the full probabilistic structure

More information

Emanuel E. Zelniker and I. Vaughan L. Clarkson

Emanuel E. Zelniker and I. Vaughan L. Clarkson A GENERALISATION OF THE DELOGNE-KÅSA METHOD FOR FITTING HYPERSPHERES Emanuel E Zelniker and I Vaughan L Clarkson School of Information Technology & Electrical Engineering The University of Queensland Queensland,

More information

Economics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models

Economics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models University of Illinois Fall 2016 Department of Economics Roger Koenker Economics 536 Lecture 7 Introduction to Specification Testing in Dynamic Econometric Models In this lecture I want to briefly describe

More information

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows.

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows. Chapter 5 Two Random Variables In a practical engineering problem, there is almost always causal relationship between different events. Some relationships are determined by physical laws, e.g., voltage

More information

Slope Fields: Graphing Solutions Without the Solutions

Slope Fields: Graphing Solutions Without the Solutions 8 Slope Fields: Graphing Solutions Without the Solutions Up to now, our efforts have been directed mainly towards finding formulas or equations describing solutions to given differential equations. Then,

More information

SKILL BUILDER TEN. Graphs of Linear Equations with Two Variables. If x = 2 then y = = = 7 and (2, 7) is a solution.

SKILL BUILDER TEN. Graphs of Linear Equations with Two Variables. If x = 2 then y = = = 7 and (2, 7) is a solution. SKILL BUILDER TEN Graphs of Linear Equations with Two Variables A first degree equation is called a linear equation, since its graph is a straight line. In a linear equation, each term is a constant or

More information

Max. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes

Max. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes Maximum Likelihood Estimation Econometrics II Department of Economics Universidad Carlos III de Madrid Máster Universitario en Desarrollo y Crecimiento Económico Outline 1 3 4 General Approaches to Parameter

More information

BTRY 4090: Spring 2009 Theory of Statistics

BTRY 4090: Spring 2009 Theory of Statistics BTRY 4090: Spring 2009 Theory of Statistics Guozhang Wang September 25, 2010 1 Review of Probability We begin with a real example of using probability to solve computationally intensive (or infeasible)

More information

Regression Models - Introduction

Regression Models - Introduction Regression Models - Introduction In regression models, two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent variable,

More information

The Derivative. Appendix B. B.1 The Derivative of f. Mappings from IR to IR

The Derivative. Appendix B. B.1 The Derivative of f. Mappings from IR to IR Appendix B The Derivative B.1 The Derivative of f In this chapter, we give a short summary of the derivative. Specifically, we want to compare/contrast how the derivative appears for functions whose domain

More information

Linear Models in Machine Learning

Linear Models in Machine Learning CS540 Intro to AI Linear Models in Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu We briefly go over two linear models frequently used in machine learning: linear regression for, well, regression,

More information

Discriminant analysis and supervised classification

Discriminant analysis and supervised classification Discriminant analysis and supervised classification Angela Montanari 1 Linear discriminant analysis Linear discriminant analysis (LDA) also known as Fisher s linear discriminant analysis or as Canonical

More information

Linear Algebra: Matrix Eigenvalue Problems

Linear Algebra: Matrix Eigenvalue Problems CHAPTER8 Linear Algebra: Matrix Eigenvalue Problems Chapter 8 p1 A matrix eigenvalue problem considers the vector equation (1) Ax = λx. 8.0 Linear Algebra: Matrix Eigenvalue Problems Here A is a given

More information

Incompatibility Paradoxes

Incompatibility Paradoxes Chapter 22 Incompatibility Paradoxes 22.1 Simultaneous Values There is never any difficulty in supposing that a classical mechanical system possesses, at a particular instant of time, precise values of

More information

This pre-publication material is for review purposes only. Any typographical or technical errors will be corrected prior to publication.

This pre-publication material is for review purposes only. Any typographical or technical errors will be corrected prior to publication. This pre-publication material is for review purposes only. Any typographical or technical errors will be corrected prior to publication. Copyright Pearson Canada Inc. All rights reserved. Copyright Pearson

More information

ELEG 5633 Detection and Estimation Minimum Variance Unbiased Estimators (MVUE)

ELEG 5633 Detection and Estimation Minimum Variance Unbiased Estimators (MVUE) 1 ELEG 5633 Detection and Estimation Minimum Variance Unbiased Estimators (MVUE) Jingxian Wu Department of Electrical Engineering University of Arkansas Outline Minimum Variance Unbiased Estimators (MVUE)

More information

Chapter 3. Point Estimation. 3.1 Introduction

Chapter 3. Point Estimation. 3.1 Introduction Chapter 3 Point Estimation Let (Ω, A, P θ ), P θ P = {P θ θ Θ}be probability space, X 1, X 2,..., X n : (Ω, A) (IR k, B k ) random variables (X, B X ) sample space γ : Θ IR k measurable function, i.e.

More information

Algebraic. techniques1

Algebraic. techniques1 techniques Algebraic An electrician, a bank worker, a plumber and so on all have tools of their trade. Without these tools, and a good working knowledge of how to use them, it would be impossible for them

More information

arxiv: v1 [physics.comp-ph] 22 Jul 2010

arxiv: v1 [physics.comp-ph] 22 Jul 2010 Gaussian integration with rescaling of abscissas and weights arxiv:007.38v [physics.comp-ph] 22 Jul 200 A. Odrzywolek M. Smoluchowski Institute of Physics, Jagiellonian University, Cracov, Poland Abstract

More information

CALCULUS: Math 21C, Fall 2010 Final Exam: Solutions. 1. [25 pts] Do the following series converge or diverge? State clearly which test you use.

CALCULUS: Math 21C, Fall 2010 Final Exam: Solutions. 1. [25 pts] Do the following series converge or diverge? State clearly which test you use. CALCULUS: Math 2C, Fall 200 Final Exam: Solutions. [25 pts] Do the following series converge or diverge? State clearly which test you use. (a) (d) n(n + ) ( ) cos n n= n= (e) (b) n= n= [ cos ( ) n n (c)

More information

STAT 730 Chapter 4: Estimation

STAT 730 Chapter 4: Estimation STAT 730 Chapter 4: Estimation Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Analysis 1 / 23 The likelihood We have iid data, at least initially. Each datum

More information

MTH4101 CALCULUS II REVISION NOTES. 1. COMPLEX NUMBERS (Thomas Appendix 7 + lecture notes) ax 2 + bx + c = 0. x = b ± b 2 4ac 2a. i = 1.

MTH4101 CALCULUS II REVISION NOTES. 1. COMPLEX NUMBERS (Thomas Appendix 7 + lecture notes) ax 2 + bx + c = 0. x = b ± b 2 4ac 2a. i = 1. MTH4101 CALCULUS II REVISION NOTES 1. COMPLEX NUMBERS (Thomas Appendix 7 + lecture notes) 1.1 Introduction Types of numbers (natural, integers, rationals, reals) The need to solve quadratic equations:

More information

Machine Learning Basics: Maximum Likelihood Estimation

Machine Learning Basics: Maximum Likelihood Estimation Machine Learning Basics: Maximum Likelihood Estimation Sargur N. srihari@cedar.buffalo.edu This is part of lecture slides on Deep Learning: http://www.cedar.buffalo.edu/~srihari/cse676 1 Topics 1. Learning

More information

Sparse Linear Models (10/7/13)

Sparse Linear Models (10/7/13) STA56: Probabilistic machine learning Sparse Linear Models (0/7/) Lecturer: Barbara Engelhardt Scribes: Jiaji Huang, Xin Jiang, Albert Oh Sparsity Sparsity has been a hot topic in statistics and machine

More information

Introduction to Maximum Likelihood Estimation

Introduction to Maximum Likelihood Estimation Introduction to Maximum Likelihood Estimation Eric Zivot July 26, 2012 The Likelihood Function Let 1 be an iid sample with pdf ( ; ) where is a ( 1) vector of parameters that characterize ( ; ) Example:

More information

Linear & nonlinear classifiers

Linear & nonlinear classifiers Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1396 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1396 1 / 44 Table

More information

6.867 Machine Learning

6.867 Machine Learning 6.867 Machine Learning Problem set 1 Solutions Thursday, September 19 What and how to turn in? Turn in short written answers to the questions explicitly stated, and when requested to explain or prove.

More information

NATIONAL BOARD FOR HIGHER MATHEMATICS. Research Scholarships Screening Test. Saturday, February 2, Time Allowed: Two Hours Maximum Marks: 40

NATIONAL BOARD FOR HIGHER MATHEMATICS. Research Scholarships Screening Test. Saturday, February 2, Time Allowed: Two Hours Maximum Marks: 40 NATIONAL BOARD FOR HIGHER MATHEMATICS Research Scholarships Screening Test Saturday, February 2, 2008 Time Allowed: Two Hours Maximum Marks: 40 Please read, carefully, the instructions on the following

More information

MULTIVARIABLE CALCULUS, LINEAR ALGEBRA, AND DIFFERENTIAL EQUATIONS

MULTIVARIABLE CALCULUS, LINEAR ALGEBRA, AND DIFFERENTIAL EQUATIONS T H I R D E D I T I O N MULTIVARIABLE CALCULUS, LINEAR ALGEBRA, AND DIFFERENTIAL EQUATIONS STANLEY I. GROSSMAN University of Montana and University College London SAUNDERS COLLEGE PUBLISHING HARCOURT BRACE

More information

PHASE RETRIEVAL OF SPARSE SIGNALS FROM MAGNITUDE INFORMATION. A Thesis MELTEM APAYDIN

PHASE RETRIEVAL OF SPARSE SIGNALS FROM MAGNITUDE INFORMATION. A Thesis MELTEM APAYDIN PHASE RETRIEVAL OF SPARSE SIGNALS FROM MAGNITUDE INFORMATION A Thesis by MELTEM APAYDIN Submitted to the Office of Graduate and Professional Studies of Texas A&M University in partial fulfillment of the

More information

Optimization. Escuela de Ingeniería Informática de Oviedo. (Dpto. de Matemáticas-UniOvi) Numerical Computation Optimization 1 / 30

Optimization. Escuela de Ingeniería Informática de Oviedo. (Dpto. de Matemáticas-UniOvi) Numerical Computation Optimization 1 / 30 Optimization Escuela de Ingeniería Informática de Oviedo (Dpto. de Matemáticas-UniOvi) Numerical Computation Optimization 1 / 30 Unconstrained optimization Outline 1 Unconstrained optimization 2 Constrained

More information

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra. DS-GA 1002 Lecture notes 0 Fall 2016 Linear Algebra These notes provide a review of basic concepts in linear algebra. 1 Vector spaces You are no doubt familiar with vectors in R 2 or R 3, i.e. [ ] 1.1

More information

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2.

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2. APPENDIX A Background Mathematics A. Linear Algebra A.. Vector algebra Let x denote the n-dimensional column vector with components 0 x x 2 B C @. A x n Definition 6 (scalar product). The scalar product

More information