A choice of norm in discrete approximation


A choice of norm in discrete approximation

Tomislav Marošević

Abstract. We consider the problem of the choice of norm in discrete approximation. First, we describe the properties of the standard $l_1$, $l_2$ and $l_\infty$ norms and their essential characteristics for use as error criteria in discrete approximation. After that, we mention the possibility of applying the so-called total least squares and the total least $l_p$ norm for finding the best approximation. Finally, we take a look at some nonstandard, visual error criteria for qualitative smoothing.

Key words: discrete approximation, $l_1$ norm, $l_2$ norm, $l_\infty$ norm, total least squares, total least $l_p$ norm, error criteria

The lecture presented at the Mathematical Colloquium in Osijek, organized by the Croatian Mathematical Society – Division Osijek, April 19, 1996.
Faculty of Electrical Engineering, Department of Mathematics, Istarska 3, HR-31 000 Osijek, Croatia, e-mail: tommar@drava.etfos.hr

1. Best discrete approximation problem

The approximation problem for a real, continuous function $f(x)$ on an interval $[a, b]$ is considered in the literature (cf. [?], [?], [?], [?], [?]) from several aspects:

- a choice of a model, i.e. of the form of the corresponding approximating function $F(a, x)$ with an unknown vector of parameters $a = (a_1, a_2, \ldots, a_n)^T$, and setting up a criterion for the quality of approximation, i.e. a distance function $d(f(x), F(a, x))$ in a metric space, or a norm $\|F(a) - f\|$ in a normed linear space;
- the existence of a best approximation;
- the uniqueness of the best approximation;
- the characterization of a solution;
- methods for solving the approximation problem.

Here we consider the question of the choice of norm in the approximation of a function $f$, especially in the case of discrete approximation, when the values of the function are given only at finitely many points $(x_i, f_i)$, $i = 1, \ldots, m$. A criterion for the quality of approximation is usually determined by means of norms.

Definition 1. The $l_p$ norm of a function $f$ given on a finite set of data points $X = \{x_i : i = 1, \ldots, m\}$ is defined by
$$l_p(f) = \|f\|_p = \left( \sum_{i=1}^{m} |f(x_i)|^p \right)^{1/p}, \qquad 1 \le p < \infty.$$
For $p = \infty$, as a limit case, the $l_\infty$ norm is defined by
$$\|f\|_\infty = \max_{i \in \{1, \ldots, m\}} |f(x_i)|.$$

Definition 2. The function $F(a^*, x)$ is said to be the best approximation of the function $f$ in the norm $\|\cdot\|_p$ if
$$\|F(a^*) - f\|_p \le \|F(a) - f\|_p, \qquad \forall a \in P \subseteq \mathbb{R}^n.$$

In that way, the approximation problem is reduced to the minimization of the functional $S : \mathbb{R}^n \to \mathbb{R}$, $S(a_1, a_2, \ldots, a_n) = \|F(a) - f\|_p$ (or, equivalently, of the functional $\|F(a) - f\|_p^p$). In general, the best approximation in the $l_p$ norm is different from the best approximation in the $l_q$ norm ($p \ne q$). The $l_p$ norms can be generalized by introducing weights $w(x_i)$, $i = 1, \ldots, m$. If the approximating function $F(a, x)$ is linear in the parameters $a_j$, $j = 1, \ldots, n$, i.e. if $F(a, x) = x^T a$, then (cf. [?])
$$1 \le p < q \le \infty \implies \min_{a \in \mathbb{R}^n} \|Xa - f\|_q \le \min_{a \in \mathbb{R}^n} \|Xa - f\|_p$$
(where $X$ is the corresponding data matrix and $f$ is the vector of values of the dependent variable).
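As a minimal, self-contained sketch, the $l_p$ norms of Definition 1 (including the limit case $p = \infty$) can be computed directly; the residual values below are illustrative, not taken from the paper:

```python
# Sketch of Definition 1: the l_p norm of a function known only at
# finitely many points x_1, ..., x_m (illustrative values, not from the paper).
def lp_norm(values, p):
    """l_p norm of a finite list of function values; p = float('inf') gives l_inf."""
    if p == float("inf"):
        return max(abs(v) for v in values)
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

residuals = [3.0, -4.0, 0.0, 1.0]
print(lp_norm(residuals, 1))             # 8.0  (sum of absolute values)
print(lp_norm(residuals, 2))             # ~5.099 (Euclidean norm, sqrt(26))
print(lp_norm(residuals, float("inf")))  # 4.0  (maximal absolute value)
```

Note how the values decrease with growing $p$, in line with the monotonicity $\|v\|_q \le \|v\|_p$ for $p < q$ quoted above.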

2. The $l_2$, $l_1$ and $l_\infty$ norms

The three norms most frequently used in practice are:
- the $l_2$ norm (the least squares or Euclidean norm);
- the $l_1$ norm (the least absolute deviations norm);
- the $l_\infty$ norm (the Chebyshev norm).

The important relations among these norms in the discrete case are stated by the following well-known theorem (cf. [?]).

Theorem 1. For every $v \in \mathbb{R}^m$,
$$\|v\|_\infty \le \|v\|_2 \le \|v\|_1 \quad \text{and} \quad \|v\|_1 \le \sqrt{m}\,\|v\|_2 \le m\,\|v\|_\infty.$$

The shapes of the balls $\|v\|_2 = r$, $\|v\|_1 = r$ and $\|v\|_\infty = r$ in the normed linear space $\mathbb{R}^2$ are shown in Figure 1.

Figure 1. The shapes of balls in the $l_2$, $l_1$ and $l_\infty$ norms

The $l_2$ norm. It is traditionally and almost universally used in practical applications of approximation theory (cf. [?], [?], [?]). At the beginning of the 19th century, Legendre (1804) suggested the use of the $l_2$ norm for approximating a solution of an inconsistent system of linear equations (equivalently, for the problem of approximating a function given at a finite number of data points), and a similar problem was considered by Gauss (cf. [?], [?]). The $l_2$ norm is differentiable, and when the approximating function $F(a)$ is linear with respect to the parameters $a = (a_1, \ldots, a_n)^T$, the approximation problem becomes the well-known linear least squares problem (cf. [?]). Since the $l_2$ norm is strictly convex, in this case the best $l_2$ approximation exists, is unique, and depends continuously and smoothly on the function being approximated (cf. [?]). Statistical considerations show that the $l_2$ norm is the most suitable choice for smoothing the data in the case when the additive data errors $\varepsilon_i$, $i = 1, \ldots, m$,

have a normal distribution (i.e. $\varepsilon_i \sim N(0, \sigma^2)$), because then the influence of the errors $\varepsilon_i$ is minimized by the use of the $l_2$ norm (cf. e.g. [?]). Nonlinear least squares problems are also widely analyzed (cf. [?], [?]), and within that framework especially the so-called separable problems (cf. [?], [?], [?]).

The $l_\infty$ norm. One of its characteristics is that it takes into account only those data points at which the maximal error appears. The best approximation in the $l_\infty$ norm is obtained by minimizing the maximal distance. In 1799 Laplace suggested such a criterion (nowadays called the $l_\infty$ norm) for the approximate solving of inconsistent systems of linear equations; Fourier (1824) studied a similar problem. In the second half of the 19th century Chebyshev made the first systematic analysis of this norm (for this reason the $l_\infty$ norm is also called the Chebyshev norm, cf. [?]). In practice this norm is used if the data errors are very small with respect to the approximation error (cf. [?]). The characterization of the best approximation in the $l_\infty$ norm is described by the so-called alternating property, from which an exchange algorithm for obtaining the best $l_\infty$ approximation is derived (cf. [?]). In the discrete case, computing an $l_\infty$ approximation can be expressed as a linear programming problem (cf. [?]). The $l_\infty$ norm is not strictly convex, so best approximations are not necessarily unique.

The $l_1$ norm. As early as the middle of the eighteenth century, R. Boscovich determined a criterion by which the absolute deviations of the data are minimized among all lines passing through the centroid of the data points. In 1789 Laplace gave an algebraic analysis of this problem, which was also considered by Gauss (cf. [?], [?], [?]). In approximation in the $l_1$ norm, all deviations of the error curve, i.e. the errors at the data points, are valued equally, regardless of whether they are close to zero or to extremal values.
This criterion is suitable for use if the errors are subject to outliers or wild points, because the magnitude of large errors does not strongly affect the best approximation (cf. [?]). The theory of $l_1$ approximation on finite sets of data points differs somewhat from that of continuous $l_1$ approximation on an interval; this is in contrast to the situation for $l_2$ and $l_\infty$ approximation, where there is no significant difference between the analyses of the continuous and discrete cases (cf. [?]). Since the $l_1$ norm is not strictly convex, the best linear $l_1$ approximation is not necessarily unique. The problem of linear $l_1$ approximation can be transformed into a linear programming problem. Linear $l_1$ approximation (regression) is widely analyzed (cf. [?], [?], [?], [?]). Nonlinear approximation in the $l_1$ norm is less researched. In the discrete nonlinear case, a best approximation with respect to the $l_1$ norm need not exist, just as for the $l_2$ and $l_\infty$ norms (cf. [?], [?]).
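The robustness of the $l_1$ criterion against outliers can be illustrated on the simplest model, approximation by a constant $F(a, x) \equiv a$: the best $l_2$ constant is the arithmetic mean of the data, while a best $l_1$ constant is a median. A minimal sketch with illustrative data of our own choosing:

```python
# Best constant approximation of data f_1, ..., f_m:
#   l_2: minimizes sum (a - f_i)^2  -> arithmetic mean
#   l_1: minimizes sum |a - f_i|    -> a median (not necessarily unique)
def best_l2_constant(data):
    return sum(data) / len(data)

def best_l1_constant(data):
    s = sorted(data)
    return s[len(s) // 2]  # a median; for even m any point of the middle interval works

clean = [1.0, 1.2, 0.9, 1.1, 1.0]
with_outlier = clean + [100.0]  # one wild point

print(best_l2_constant(with_outlier))  # ~17.53 -- the mean is dragged toward the outlier
print(best_l1_constant(with_outlier))  # 1.1    -- the median barely moves
```

One wild point shifts the $l_2$ solution by an order of magnitude, while the $l_1$ solution stays near the bulk of the data, which is exactly the behaviour described above.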

3. New approaches

a) Total least squares method. If the given data contain additive errors both in the dependent variable ($\varepsilon_i$, $i = 1, \ldots, m$) and in the independent variable ($\delta_i$), then one can consider the orthogonal distances $d_i^2 = \varepsilon_i^2 + \delta_i^2$ and their sum
$$\Phi(a_1, \ldots, a_n, \delta_1, \ldots, \delta_m) = \sum_{i=1}^{m} d_i^2 = \sum_{i=1}^{m} \left[ (F(a, x_i + \delta_i) - f_i)^2 + \delta_i^2 \right],$$
for which one seeks the minimum
$$\min_{(a_1, \ldots, a_n, \delta_1, \ldots, \delta_m)} \Phi(a_1, \ldots, a_n, \delta_1, \ldots, \delta_m).$$
The idea of total least squares can be generalized by introducing a total least $l_p$ norm, i.e. the total $l_p$ approximation problem (cf. [?], [?], [?])
$$\min_{(a_1, \ldots, a_n, \delta_1, \ldots, \delta_m)} \sum_{i=1}^{m} \left[ |F(a, x_i + \delta_i) - f_i|^p + |\delta_i|^p \right].$$

b) Visual error criteria. The use of standard mathematical norms in vector spaces for measuring the distance between the true curve and the estimated curve can be inappropriate from the graphical viewpoint, i.e. with respect to the visual impression of the distances between the curves in the plane. Therefore, ideas about qualitative smoothing have appeared, by which the curves (i.e. functions) are estimated through qualitative features. In such a case, the visual impression of the distances between the curves is taken into account by means of nonstandard error criteria (cf. [?]). The curves are viewed as sets of points in the plane; one considers the distance between the set $G_f$ of points of the true curve $f$ and the set $G_{\hat f}$ of points of the estimated curve $\hat f$. For example, one uses the Hausdorff distance defined by
$$d_H(G_f, G_{\hat f}) = \max\{\sup D(G_f, G_{\hat f}),\ \sup D(G_{\hat f}, G_f)\},$$
where $D(G_f, G_{\hat f}) = \{d((x, y), G_{\hat f}) : (x, y) \in G_f\}$ is the set of distances from the points of the set $G_f$ to the set $G_{\hat f}$, and
$$d((x, y), G) = \inf_{(x', y') \in G} \|(x, y) - (x', y')\|_2$$
denotes the distance from the point $(x, y)$ to the set $G$. Furthermore, one can define various so-called asymmetric and symmetric error criteria.
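For curves given only at finitely many sample points, the Hausdorff distance above reduces to a computation over two finite point sets, with the infimum and suprema becoming minima and maxima. A minimal sketch (the sampled "curves" below are illustrative, not from the paper):

```python
import math

def point_set_dist(p, G):
    """d((x, y), G): Euclidean distance from point p to the finite set G."""
    return min(math.dist(p, q) for q in G)

def hausdorff(Gf, Gg):
    """Hausdorff distance between two finite point sets in the plane."""
    return max(max(point_set_dist(p, Gg) for p in Gf),
               max(point_set_dist(q, Gf) for q in Gg))

# Two sampled "curves": y = 0 and y = 0.5 on the same abscissas in [0, 1].
Gf = [(x / 10.0, 0.0) for x in range(11)]
Gg = [(x / 10.0, 0.5) for x in range(11)]
print(hausdorff(Gf, Gg))  # 0.5 -- the vertical offset between the curves
```

Taking the maximum of both one-sided suprema makes the criterion symmetric; dropping one of them yields an asymmetric criterion of the kind mentioned above.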
Although these criteria seem good from the viewpoint of visual impression, in certain situations the $l_2$ norm has the advantage owing to its important optimality properties (cf. [?]).

References

[1] A. Björck, Numerical Methods for Least Squares Problems, SIAM, Philadelphia, 1996.

[2] P. Bloomfield, W. L. Steiger, Least Absolute Deviations: Theory, Applications and Algorithms, Birkhäuser, Boston, 1983.
[3] I. Barrodale, F. D. K. Roberts, An efficient algorithm for discrete $l_1$ linear approximation with linear constraints, SIAM J. Numer. Anal. 15(1978), 603–611.
[4] I. Barrodale, F. D. K. Roberts, C. R. Hunt, Computing best $l_p$ approximations by functions nonlinear in one parameter, The Computer Journal 13(1970), 382–386.
[5] P. T. Boggs, R. H. Byrd, R. B. Schnabel, A stable and efficient algorithm for nonlinear orthogonal distance regression, SIAM J. Sci. Stat. Comput. 8(1987), 1052–1078.
[6] E. W. Cheney, Introduction to Approximation Theory, Chelsea Publishing Company, New York, 1966.
[7] J. E. Dennis Jnr., Non-linear least squares and equations, in: The State of the Art in Numerical Analysis, Jacobs, Ed., 1977, 269–309.
[8] J. E. Dennis Jnr., R. B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice-Hall Inc., Englewood Cliffs, 1983.
[9] Y. Dodge (Ed.), Statistical Data Analysis Based on the $L_1$-Norm and Related Methods, 1st International Conference on Statistical Data Analysis Based on the $L_1$-Norm and Related Methods, Elsevier, Amsterdam, 1987.
[10] C. F. Gauss (translated by G. W. Stewart), Theory of the Combination of Observations Least Subject to Errors, SIAM, Philadelphia, 1995.
[11] G. H. Golub, C. F. Van Loan, Total least squares, in: Smoothing Techniques for Curve Estimation, Th. Gasser and M. Rosenblatt, Eds., Lecture Notes in Mathematics 757, Springer-Verlag, Berlin, 1979, 69–76.
[12] G. H. Golub, V. Pereyra, The differentiation of pseudo-inverses and nonlinear least squares problems whose variables separate, SIAM J. Numer. Anal. 10(1973), 413–432.
[13] R. Gonin, Numerical algorithms for solving nonlinear $L_p$-norm estimation problems: part I – a first-order gradient algorithm for well-conditioned small residual problems, Commun. Statist. – Simula. 15(1986), 801–813.
[14] J. S. Marron, A. B. Tsybakov, Visual error criteria for qualitative smoothing, Journal of the American Statistical Association 90(430), 1995, 499–507.
[15] M. J. D. Powell, Approximation Theory and Methods, Cambridge University Press, Cambridge, 1981.

[16] J. R. Rice, The Approximation of Functions, Vol. 1: Linear Theory, Addison-Wesley, Reading, 1964.
[17] A. Ruhe, P. A. Wedin, Algorithms for separable nonlinear least squares problems, SIAM Review 22(1980), 318–337.
[18] H. Späth, Mathematical Algorithms for Linear Regression, Academic Press Inc., Boston, 1992.
[19] A. Tarantola, Inverse Problem Theory: Methods for Data Fitting and Model Parameter Estimation, Elsevier, Amsterdam, 1987.
[20] Y. Zi-qiang, On some computation methods of the nonlinear $L_1$ norm regression, J. Num. Method & Comp. Appl., 1994, 18–23.
[21] G. A. Watson, The numerical solution of total $l_p$ approximation problems, in: Numerical Analysis, D. F. Griffiths, Ed., Lecture Notes in Mathematics 1066, Springer-Verlag, Berlin, 1984, 221–238.