Solving Continuous Linear Least-Squares Problems by Iterated Projection

Similar documents
Taylor Series and the Mean Value Theorem of Derivatives

UNIT #6 EXPONENTS, EXPONENTS, AND MORE EXPONENTS REVIEW QUESTIONS

Numerical Differentiation

Continuity and Differentiability

MATH745 Fall MATH745 Fall

Exam 1 Review Solutions

1. Questions (a) through (e) refer to the graph of the function f given below. (A) 0 (B) 1 (C) 2 (D) 4 (E) does not exist

158 Calculus and Structures

Introduction to Derivatives

Chapter 2 Limits and Continuity. Section 2.1 Rates of Change and Limits (pp ) Section Quick Review 2.1

Efficient algorithms for for clone items detection

Differentiation Rules c 2002 Donald Kreider and Dwight Lahr

MAT 145. Type of Calculator Used TI-89 Titanium 100 points Score 100 possible points

Solutions to the Multivariable Calculus and Linear Algebra problems on the Comprehensive Examination of January 31, 2014

HOMEWORK HELP 2 FOR MATH 151

Function Composition and Chain Rules

3.4 Worksheet: Proof of the Chain Rule NAME

Consider a function f we ll specify which assumptions we need to make about it in a minute. Let us reformulate the integral. 1 f(x) dx.

Digital Filter Structures

232 Calculus and Structures

5.1 We will begin this section with the definition of a rational expression. We

Solve exponential equations in one variable using a variety of strategies. LEARN ABOUT the Math. What is the half-life of radon?

Polynomial Interpolation

Continuity and Differentiability Worksheet

MATH1901 Differential Calculus (Advanced)

f a h f a h h lim lim

Computer Derivations of Numerical Differentiation Formulae. Int. J. of Math. Education in Sci. and Tech., V 34, No 2 (March-April 2003), pp

Pre-Calculus Review Preemptive Strike

1watt=1W=1kg m 2 /s 3

Polynomial Interpolation

Symmetry Labeling of Molecular Energies

4. The slope of the line 2x 7y = 8 is (a) 2/7 (b) 7/2 (c) 2 (d) 2/7 (e) None of these.

Chapter 1D - Rational Expressions

Hydraulic validation of the LHC cold mass heat exchanger tube.

The Verlet Algorithm for Molecular Dynamics Simulations

Continuity and Differentiability of the Trigonometric Functions

SECTION 3.2: DERIVATIVE FUNCTIONS and DIFFERENTIABILITY

Material for Difference Quotient

7 Semiparametric Methods and Partially Linear Regression

Quantum Numbers and Rules

Chapter 4: Numerical Methods for Common Mathematical Problems

Combining functions: algebraic methods

Polynomials 3: Powers of x 0 + h

1 Calculus. 1.1 Gradients and the Derivative. Q f(x+h) f(x)

Midterm #1B. x 8 < < x 8 < 11 3 < x < x > x < 5 or 3 2x > 5 2x < 8 2x > 2

The total error in numerical differentiation

These errors are made from replacing an infinite process by finite one.

Numerical Experiments Using MATLAB: Superconvergence of Nonconforming Finite Element Approximation for Second-Order Elliptic Problems

Bob Brown Math 251 Calculus 1 Chapter 3, Section 1 Completed 1 CCBC Dundalk

Precalculus Test 2 Practice Questions Page 1. Note: You can expect other types of questions on the test than the ones presented here!

Exercises for numerical differentiation. Øyvind Ryan

Generic maximum nullity of a graph

How to Find the Derivative of a Function: Calculus 1

1. Which one of the following expressions is not equal to all the others? 1 C. 1 D. 25x. 2. Simplify this expression as much as possible.

Phase space in classical physics

Mathematics 5 Worksheet 11 Geometry, Tangency, and the Derivative

Lab 6 Derivatives and Mutant Bacteria

2.8 The Derivative as a Function

Math 1210 Midterm 1 January 31st, 2014

1 The concept of limits (p.217 p.229, p.242 p.249, p.255 p.256) 1.1 Limits Consider the function determined by the formula 3. x since at this point

5 Ordinary Differential Equations: Finite Difference Methods for Boundary Problems

Derivatives. By: OpenStaxCollege

lecture 26: Richardson extrapolation

Preface. Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

The derivative function

Practice Problem Solutions: Exam 1

MTH 119 Pre Calculus I Essex County College Division of Mathematics Sample Review Questions 1 Created April 17, 2007

5.1 The derivative or the gradient of a curve. Definition and finding the gradient from first principles

Math 1241 Calculus Test 1

4.2 - Richardson Extrapolation

AMS 147 Computational Methods and Applications Lecture 09 Copyright by Hongyun Wang, UCSC. Exact value. Effect of round-off error.

MATH1151 Calculus Test S1 v2a

Lecture 15. Interpolation II. 2 Piecewise polynomial interpolation Hermite splines

Order of Accuracy. ũ h u Ch p, (1)

Recall from our discussion of continuity in lecture a function is continuous at a point x = a if and only if

Differentiation in higher dimensions

LIMITS AND DERIVATIVES CONDITIONS FOR THE EXISTENCE OF A LIMIT

New Streamfunction Approach for Magnetohydrodynamics

Quantum Mechanics Chapter 1.5: An illustration using measurements of particle spin.

Analytic Functions. Differentiable Functions of a Complex Variable

A SHORT INTRODUCTION TO BANACH LATTICES AND

1 Solutions to the in class part

E E B B. over U is a full subcategory of the fiber of C,D over U. Given [B, B ], and h=θ over V, the Cartesian arrow M=f

2.1 THE DEFINITION OF DERIVATIVE

Conductance from Transmission Probability

2.11 That s So Derivative

Parameter Fitted Scheme for Singularly Perturbed Delay Differential Equations

Math Spring 2013 Solutions to Assignment # 3 Completion Date: Wednesday May 15, (1/z) 2 (1/z 1) 2 = lim

SECTION 1.10: DIFFERENCE QUOTIENTS LEARNING OBJECTIVES

MAT244 - Ordinary Di erential Equations - Summer 2016 Assignment 2 Due: July 20, 2016

On One Justification on the Use of Hybrids for the Solution of First Order Initial Value Problems of Ordinary Differential Equations

Lecture XVII. Abstract We introduce the concept of directional derivative of a scalar function and discuss its relation with the gradient operator.

Simulation and verification of a plate heat exchanger with a built-in tap water accumulator

MTH-112 Quiz 1 Name: # :

Gradient Descent etc.

DIGRAPHS FROM POWERS MODULO p

Essential Microeconomics : EQUILIBRIUM AND EFFICIENCY WITH PRODUCTION Key ideas: Walrasian equilibrium, first and second welfare theorems

MATHEMATICS FOR ENGINEERING DIFFERENTIATION TUTORIAL 1 - BASIC DIFFERENTIATION

General Solution of the Stress Potential Function in Lekhnitskii s Elastic Theory for Anisotropic and Piezoelectric Materials

(a) At what number x = a does f have a removable discontinuity? What value f(a) should be assigned to f at x = a in order to make f continuous at a?

Transcription:

Solving Continuous Linear Least-Squares Problems by Iterated Projection by Ral Juengling Department o Computer Science, Portland State University PO Box 75 Portland, OR 977 USA Email: juenglin@cs.pdx.edu Abstract I present a new divide-and-conquer algoritm or solving continuous linear least-squares problems. Te metod is applicable wen te column space o te linear system relating data to model parameters is translation invariant. Te central operation is a matrixvector product, wic maes te metod very easy to implement. Secondly, te structure o te computation suggests a straigtorward parallel implementation. A complexity analysis or sequential implementation sows tat te metod as te same asymptotic complexity as well-nown algoritms or discrete linear least-squares. For illustration we wor out te details or te problem o itting quadratic bivariate polynomials to a piecewise constant unction. Introduction Te linear least-squares problem is to determine te parameter values β or a unction tat minimizes te sum o squared dierences between data values {d i } and unction values {x i ; β} or corresponding points {x i }, m β = argmin β d i x i ; β. i= Te problem is a linear least-squares problem i te model is linear in te parameters, tat is, n x; β = = β x. A classical way o determining β goes as ollows:. Express in matrix orm, β = min d Aβ, β were A= x x n x x x n x x m x m n x m and d is te vector o data values.. Construct and solve te so called normal equations, A T Aβ = A T d. Assuming m > n more data values tan parameters and tat A as ull ran, te solution to may be written in terms o te pseudoinverse A + o A, β = A T A A T d = A + d. 5 Geometrically speaing, d = Aβ is te point in te range o A closest to d in te Euclidean norm. Te pseudoinverse projects d to te solution β, wic may be tougt o as te coordinates o d in te column space o A. In te ollowing we consider te continuous, linear version o problem, β = argmin β dx β x dx. 6 Ω

Section Note tat in te continuous setting we assume tere is a data unction dx, deined over some domain Ω. An example is single-band image data, wic is oten interpreted as a piecewise constant unction over te image domain []. Section introduces te ey ideas o te new metod and explains its prerequisites. Section gives te algoritm and discusses its computational merits. As an example, in Section 5 I wor out te elements o te algoritm or te problem o itting a bivariate quadratic polynomial. Te metod or solving 6 tat I call iterated projection wors or any dimension o Ω, but or te sae o clear exposition I assume Ω as dimension and Ω =Ω =[, ] [, ]. Linear Least-Squares by Iterated Projection Figure. Domain Ω wit coordinate origin at center a. Ω partitioned into our subdomains Ω, Ω, Ω, Ω in clocwise order b, next iner partition c, inest partition at wic te data unction d may be accurately represented over te subdomains d. A local coordinate system is centered on eac subdomain only sown or upper let subdomains in b and c. Recall tat te pseudoinverse in 5 projects te data directly onto a solution vector. Wit iterated projection we also obtain te solution by projection. However, instead o directly projecting te data we project solutions o intermediate least-squares problems. Imagine we partition Ω into our equally sized subdomains Ω,, Ω as sown in Figure b, and tat we are able to solve 6 or eac subdomain, wit solutions β,, β. Can we compute te solution β or Ω rom te our solutions β,, β or Ω,, Ω? As we will see below, te answer is yes, under certain circumstances. Wit problem 6 or subdomain Ω i we mean tat te integral in 6 is over Ω i instead o Ω, and tat te coordinates are wit respect to a coordinate system centered on Ω i c. Figure b. But centering te coordinate systems wit respect to eac Ω i means te our integrals are over te same set o points Ω = [ /, /] [ /, /]. Tese are important aspects o iterated projection: te problems corresponding to te our subdomains are our instances o te same problem only te data term in 6 is dierent, as well as similar to te original problem Ω =Ω instead o Ω =Ω. A ew more bits o notation are necessary to deine te intermediate problems more precisely. We denote te coordinates wit respect to Ω o te our subdomain s coordinate origins by x,, x and te solutions corresponding to te our subdomain problems by β,, β, respectively. In te continuous setting β are te coordinates o a unction closest to d in te space F o unctions over Ω tat is spanned by { Ω } = n. Liewise, te solution β i or subdomain Ω i are te coordinates o a unction closest to x dx + x i, in te space F o unctions over Ω spanned by { Ω } = n. We can use te space F to construct anoter space o unctions over Ω, F = { : Ω R x x+x i F or all x Ω and i= }. Note tat F as dimension n, wereas F as dimension n. Te answer to our question is tat we can construct a solution β or 6 rom te solutions β i or te subdomains i F is a subspace o F, F F. 7

Linear Least-Squares by Iterated Projection. Te Subspace Property Wen does te subspace property 7 old or a set o unctions { }? Wat we must veriy is tat or any set o parameters β tere are parameters β,, β suc tat β x x i = β i, x, x Ω, i=,,. In oter words, relation 7 olds wen te space o unctions over R spanned by { } is translation-invariant. Examples are te spaces o polynomials o any inite order, or te space o unctions spanned by sine-cosine pairs wit same wave number.. Solving te Linear Least-Squares Problem We want to obtain te linear least-squares solution β on Ω = Ω rom te solutions β,, β or Ω s subdomains Ω,, Ω, respectively. As we said above, we obtain β by projecting te solutions β,, β. Let a unction F be parameterized by β R n, β T = β T β T β T T β, were β i corresponds to te part o covering subdomain Ω i, and let F be parameterized by γ R n. Te projector rom F onto F we see maps β to te vector β = argmin γ β i,l l x dx. γ x x i 9 i Ω l Assume, or te moment, we now te projector, a n n-matrix, and call it R. Using R, te solution to problem 6 is β = R β = β R, R, R, R, β β β R,i denotes te obvious n n submatrix o R. Equation expresses te central idea and operation o te iterated projection algoritm: obtain solutions to smaller versions o te linear least-squares problem irst, ten derive te solution to te original problem by a simple matrixvector product. Te solutions to te smaller problems are obtained in te same way, by projecting solutions or sub-subdomains, using a similar projection operator R /. Te resulting algoritm is recursive. Te recursion ends at subdomains over wic te data unction d is an element o te space spanned by { } over tat domain, or wen d may be approximated in tat space wit negligible error Figure d.. Te Projection Operator We now turn to te problem o determining te projection matrix R used in. Tere are dierent ways o deriving R. In tis section I describe one way, and in Section 5 I demonstrate anoter way by example. To begin note tat wen subspace property 7 olds, or every element β in te range o R tere is one vector β or wic 9 is exactly zero. I tis is te case, ten β and β =R β represent te same unction over Ω. We denote P te n n-matrix tat maps β representing a unction in F to te vector β representing te same unction in F, β = P β = P, P, P, P, β. Determining P or a particular set o unctions { } is just anoter way o conirming tat te subspace property 7 olds. P is relevant ere because concrete expressions or it are easy to derive, and we can express R in terms o P below.

Section Te second ingredient we need is te inner product on te parameter space o F tat corresponds to te least-squares Euclidean norm,, g = x gx dx Ω = β x γ l l x dx Ω l Ω Ω Ω n = β T Ω Ω Ω n n γ Ω n Ω n Ω n n = β T Q γ. Q is te Gram matrix o te unctions { }. Inner products corresponding to te subdomainunction spaces are deined in te same way, Ω Ω T Ω, g Ω = β i Ω Ω Ω n Ω n Ω n Ω n n γ i = β T i Q γ i. Ω n n Wit P and Q we can express te least-squares property 9, wic is te deining property o R, witout using integrals R β = argmin γ β i,l l x dx γ x x i i Ω l = argmin γ β i,l l x dx P γ i, x i Ω l = argmin γ β i P γ i T Q β i P γ i i Since te sum in is quadratic in γ we can set its gradient w.r.t γ to zero and solve or γ to ind te minimizer. Te result is te ollowing expression or R, R = P T,i Q P,i P T, Q P T, Q P T, Q P T, Q. 5 i= Computational Caracteristics Te ollowing algoritm Iterated_Projection_LSQ is a very simple instance o iterated projection; it requires tat te data unction is piecewise constant over square regions o a regular partition o Ω into m m = l l regions c. Figure d. Not included in te algoritm description is te construction o te projection matrices R. Tese are assumed to ave been precomputed or all relevant scales prior to calling Iterated_Projection_LSQ. Iterated_Projection_LSQD, Input. D: m m data matrix, : scale o domain Output. β : least-squares solution over domain Ω = [, ] [, ] Steps.. I m = :. β project constant unction dx =D, Section.. else:. partition D, D = D D D D

Computational Caracteristics 5 5. β i Iterated_Projection_LSQD i, /, or i =,, 6. β R β β β β Equation. Computational Complexity o Iterated Projection Algoritm Iterated_Projection_LSQ is ormulated or two-dimensional domains. However, adapting te algoritm to domains o dierent dimensionality is straigtorward and we discuss computational complexity or te general case d dimensions. In te general case D is assumed to ave m d entries and is partitioned into d parts in Step, te projection matrix R is o size n d n, te dept o te recursion l = log m is independent o d. We assume tat analytical expressions or R are available, tus, precomputing te projection matrices means evaluating tose expressions or all l scales. Assuming tat te amount o computation or evaluating eac R -entry is independent o, precomputing te projecting matrices requires Olog m d n operations and Olog m d n space. Computing β in Step. requires On operations per entry o D, or Om d n operations total. Computing β in Step 6. requires a matrix-vector product o O d n. Step 6 is carried out i= l d i = d l d times, resulting in a total operation count o O d logm n or Om d n or Step 6. Steps 5. and 6. togeter may be organized in a loop over all D i. In tat case te amount o space required by iterated projection in addition to te space required or te projection matrices is proportional to n and proportional to te dept o te recursion, Olog m n.. Comparison to Classical Linear Least-Squares Algoritms In discrete linear least-squares problems, te problem domain Ω as been abstracted away and its dimension usually plays no part in a complexity analysis. Instead, te complexity o tose algoritms are expressed in te size o matrix A in, say M n []. Setting M = m d or comparison, te cost o iterated projection is OM n d operations, and O log d M n units o space ignoring space required or entering te problem. A classical algoritm or solving te normal equations is Colesy actorization, Colesy_LSQA, d Input. A: M n model matrix, d: M data vector Output. β : least-squares solution argmin β Aβ d Steps.. Compute B = A T A. Compute y = A T d. Factorize B, B = R T R R upper triangular. Solve R T x= y 5. Solve R β = x Te costs o tese steps are, respectively, OM n, OM n, On, On, and On operations. In most situations were iterated projection is applicable we could carry out Step. and Step. o Colesy_LSQ in advance, and te asymptotic cost would be dominated by Step., OM n. Storing te Colesy actors requires On units o space. Oter classical algoritm s complexity caracteristics are essentielly te same [], Capter, and we conclude tat te asymptotic computational cost o iterated projection is te same as te cost o well-nown algoritms or discrete linear least-squares.

6 Section 5 Discussion I ave presented a new idea, iterated projection, or computing te solution to continuous linear least-squares problems. Te idea is to project te data wic is tougt o as a unction into a sequence o unction spaces o decreasing dimensionality. Te inal projection in tis sequence is into te space in wic te solution is sougt. As discussed in Section., iterated projection is asymptotically as eicient as oter linear least-squares algoritms. However, Iterated_Projection_LSQ is a striingly simple algoritm, consisting essentially o a single operation te matrix-vector product in Step 6 and is arguably easier to accelerate by special ardware. Secondly, te way te computation is organized in Iterated_Projection_LSQ maes it easy to distribute te wor among multiple processors. Classical linear least-squares algoritms and iterated projection do not solve exactly te same problem. Te ormer solve te problem Find β suc tat d Aβ is minimal or β = β, 6 iterated projection solves te problem wit linear in β Find β suc tat dx β x dx is minimal or β = β. 7 Ω In practice problem 7 is oten approximated by a problem o te orm 6, obtained troug sampling d and at selected points {x i } c. to in Section. In suc situations, and wen te subspace property 7 olds, iterated projection is a compelling alternative to discrete linear least-squares algoritms. It may give exact solutions except or rounding errors, and it may require less computation wen d is o appropriate orm e.g., unctions in inite-element spaces; Step in Iterated_Projection_LSQ needs to be adapted accordingly. Anoter interesting aspect o iterated projection is tat, since no linear system needs to be solved, te problems o underdetermined or singular systems do not occur. Tis recommends iterated projection or problems lie least-squares based image reconstruction and segmentation [, ], were small regions oten lead to underdetermined or singular least-squares problems. 5 Appendix Iterated Projection wit Quadratic Polynomials In tis appendix we derive te projection matrix R or problem 7 wit { } given as x = x = x x = x x = x 5 x = x x 6 x = x. At irst we veriy tat te space spanned by,, 6 is translation invariant. Let β represent any suc unction, x; β = 6 = β x, x be te origin o anoter coordinate system, and x ; β denote te representation wit respect to te coordinate system centered at x. Function is te same as i teir values and tat o all teir derivatives are identical at any point. Coose x as tat point and write down te identities to obtain β in terms o β and x. β = β + β x + β x + β x + β5 x x + β 6 x β = β + β x + β 5 x,,,,,, and,, respec- To derive matrix P in we set x to tively, and obtain

Appendix Iterated Projection wit Quadratic Polynomials 7 P, = P, = P, = P, =. Matrix Q in can be seen to be Q = 6 576 6 6 576 6 6 Tese are te ingredients needed to derive R via 5. 9 5. Deriving R by QR Factorization One algoritm or solving least-squares problem is by QR actorization. It consists o computing te reduced QR actorization o A in, A = Q R, by wic we obtain an ortonormal basis or te column space o A. Next, te data vector is expanded in tat ortonormal basis, and, inally, an expansion in te basis o interest te columns o A is obtained by solving a triangular system. I now sow ow to derive R by applying essentially tis algoritm, but in te continuous setting. Te polynomials do not orm an ortogonal basis wit respect to te inner product, oterwise Q in 9 were diagonal. An ortornomal basis or te space spanned by are te Legendre polynomials up to order two. Deined over Ω =[, ] [, ], tese are φ x = φ x = x φ x = x φ x = 5 x φ 5 x = x x φ 6 x = 5 x. Te QR actorization o A= 5 6 over Ω is 5 6 = φ φ φ φ φ 5 φ 6 = Φ C 5 5

Section Te product R β may now be expressed as R β = C Φ T 5 6 dx β i i= Ω i φ, / φ, / φ, 6 / = C φ, / φ, / φ 5, 6 / β i i= φ 6, / φ 6, / φ 6, 6 / = C β, R, R, R, R, ence R = C R, C R, C R, C R,. Evaluating te integrals in R,i we get 9 6 9 6 R, = 9 6 9 6 9 6 9 6 5 6 5 R, = 9 6 9 6 5 6 6 5 6 9 6 5 6 6 9 6 6 5 5 6 5 9 6 9 6 9 6 9 6 R, = 9 6 9 6 5 6 5 R, = 9 6 9 6 5 6 6 9 6 6 5. 6 6 9 6 5 6 5 5 6 5 Multiplying R,i by C we inally arrive at te our components o R, R, = R, = 5 5 6 6 5 9 6 5 5 5 6 6 5 9 6 5 R, = R, = 5 5 6 6 5 9 6 5 5 5 6 6 5. 9 6 5 A Matlab implementation o iterated projection based on tese results is available or download rom tp://tp.cs.pdx.edu/pub/juenglin/iterated_projection/examples.tar. Bibliograpy [] P. J. Besl and R. C. Jain. Segmentation troug variable-order surace itting. IEEE Transactions on Pattern Analysis and Macine Intelligence, :67 9, Marc 9. [] T. Kanungo, B. Dom, W. Niblac, and D. Steele. A ast algoritm or MDL-based multi-band image segmentation. In Proceedings o te Conerence on Computer Vision and Pattern Recognition, pages 69 66, Los Alamitos, CA, USA, 99. IEEE Computer Society Press. [] Y. G. Leclerc. Constructing simple stable descriptions or image partitioning. International Journal o Computer Vision, :7, May 99. [] L. N. Treeten and D. Bau. Numerical Linear Algebra. SIAM, Piladelpia, PA, 997.