Analysis-3 lecture schemes


Analysis-3 lecture schemes (with Homeworks)

Csörgő István

November 2015

These notes were prepared with the support of the 2015 Lecture Notes Competition (Jegyzetpályázat) of the Faculty of Informatics, ELTE.

Contents

1. Lesson 1
   1.1. The Space R^n
   1.2. k-arrays
   1.3. Homeworks
2. Lesson 2
   2.1. Balls in R^n
   2.2. Topology in R^n
   2.3. Homeworks
3. Lesson 3
   3.1. Sequences in R^n
   3.2. Characterization of closed sets with sequences
   3.3. Compact sets
   3.4. Homeworks
4. Lesson 4
   4.1. The limit of functions of type R^n → R^m
   4.2. Limit along a set
   4.3. The continuity of functions of type R^n → R^m
   4.4. Homeworks
5. Lesson 5
   5.1. The derivative of functions of type R^n → R^m
   5.2. Partial Derivatives
   5.3. Homeworks
6. Lesson 6
   6.1. The entries of the derivative matrix
   6.2. Connection between the derivatives and the partial derivatives
   6.3. Directional Derivatives
   6.4. Homeworks
7. Lesson 7
   7.1. Higher order derivatives
   7.2. Taylor's Formula
   7.3. Homeworks
8. Lesson 8
   8.1. Local extreme values: the First Derivative Test
   8.2. Quadratic Forms
   8.3. Homeworks
9. Lesson 9
   9.1. Local extreme values: the Second Derivative Test
   9.2. Homeworks
10. Lesson 10
   10.1. Multiple integrals over intervals
   10.2. Properties of the integral
   10.3. Computation of the integral over intervals
   10.4. Homeworks
11. Lesson 11
   11.1. Integration over bounded sets
   11.2. Computation of the integral over normal regions
   11.3. Homeworks
12. Lesson 12
   12.1. Integral Transformation
   12.2. Double integral in polar coordinates
   12.3. Triple integral in cylindrical coordinates
   12.4. Triple integral in polar coordinates
   12.5. Homeworks

1. Lesson 1

1.1. The Space R^n

In this book $\mathbb{N}$ denotes the set of positive integers and $\mathbb{N}_0$ denotes the set of nonnegative integers: $\mathbb{N} = \{1, 2, 3, \ldots\}$ and $\mathbb{N}_0 = \{0, 1, 2, 3, \ldots\} = \mathbb{N} \cup \{0\}$.

In Linear Algebra we studied vector spaces and their special type, the Euclidean spaces. As you remember, $\mathbb{R}^n$ was an important example of a real Euclidean space, so every definition and theorem concerning Euclidean spaces is valid for $\mathbb{R}^n$.

Why is $\mathbb{R}^n$ important in multivariable analysis? "Multivariable" means that a multivariable function has a finite number of real variables, say $n$ variables. Thus its domain can be regarded as a collection of ordered $n$-tuples and forms a subset of $\mathbb{R}^n$. The Reader may consider how the case $n = 1$ connects to the one-variable analysis studied in the subjects Analysis-1 and Analysis-2.

We briefly review the most important properties of $\mathbb{R}^n$. For a fixed $n \in \mathbb{N}$, $\mathbb{R}^n$ is the set of all ordered $n$-tuples whose terms (components) are in $\mathbb{R}$:
$$\mathbb{R}^n := \{x = (x_1, x_2, \ldots, x_n) \mid x_i \in \mathbb{R}\}.$$
Notice that in the case $n = 2$ the notation $(x, y)$ is often used instead of $(x_1, x_2)$. Similarly, in the case $n = 3$ the notation $(x, y, z)$ may be used instead of $(x_1, x_2, x_3)$.

We have the following operations in $\mathbb{R}^n$. Let $x, y \in \mathbb{R}^n$, $\lambda \in \mathbb{R}$.

Addition: $x + y := (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n)$;

Scalar multiplication: $\lambda x := (\lambda x_1, \lambda x_2, \ldots, \lambda x_n)$;

Scalar product: $\langle x, y \rangle := x_1 y_1 + x_2 y_2 + \ldots + x_n y_n = \sum_{i=1}^{n} x_i y_i$;

Norm (length): $\|x\| := \sqrt{\langle x, x \rangle} = \sqrt{x_1^2 + x_2^2 + \ldots + x_n^2} = \sqrt{\sum_{i=1}^{n} x_i^2}$.

We learnt the properties of the above operations in Linear Algebra, where we proved that $\mathbb{R}^n$ is a real Euclidean space. Consequently it is a normed vector space (see Linear Algebra).
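The operations above are easy to experiment with numerically. The following short Python sketch (NumPy and the concrete vectors are our choices for illustration, not part of the notes) computes the sum, a scalar multiple, the scalar product and the norm for two vectors in $\mathbb{R}^4$.

```python
import numpy as np

# two illustrative vectors in R^4
x = np.array([-1.0, 3.0, 5.0, 2.0])
y = np.array([2.0, -3.0, 1.0, 1.0])

addition = x + y                    # componentwise addition
scalar_multiple = 3 * x             # scalar multiplication
inner = np.dot(x, y)                # scalar product <x, y>
norm_x = np.sqrt(np.dot(x, x))      # norm ||x|| = sqrt(<x, x>)

print(addition, scalar_multiple, inner, norm_x)
```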

We defined the distance in $\mathbb{R}^n$ as follows:
$$d(x, y) := \|x - y\| = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \ldots + (x_n - y_n)^2} = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \qquad (x, y \in \mathbb{R}^n).$$

1.1. Theorem [the properties of the distance] Let $X$ be a normed vector space with norm $\|\cdot\|$, in particular $X = \mathbb{R}^n$ with the above defined norm. Then
1. $d(x, y) \ge 0$ $(x, y \in X)$; furthermore $d(x, y) = 0 \iff x = y$;
2. $d(x, y) = d(y, x)$ $(x, y \in X)$;
3. $d(x, y) \le d(x, z) + d(z, y)$ $(x, y, z \in X)$ (triangle inequality).

Proof. The first and the second statements are obvious from the axioms of the norm. Let us prove the triangle inequality:
$$d(x, y) = \|x - y\| = \|(x - z) + (z - y)\| \le \|x - z\| + \|z - y\| = d(x, z) + d(z, y). \qquad \square$$

1.2. Remark. If a mapping $d : X \times X \to \mathbb{R}$ satisfying the above properties is defined on a nonempty set $X$, then $X$ is called a metric space and the above properties are called the axioms of the metric space. So we have proved that every normed vector space is a metric space with the metric induced by the norm, $d(x, y) = \|x - y\|$.

Many properties of $\mathbb{R}^n$ and of functions defined on $\mathbb{R}^n$ are based on the metric structure of $\mathbb{R}^n$, so we could describe them using the notation of the metric, $d(x, y)$. For simplicity, however, we will use the normed-space structure, so the distance will be denoted by $\|x - y\|$ instead of $d(x, y)$. The Reader may observe that many of the definitions and statements below can be generalized to any metric space.

Let us review some important relations which can be deduced from the Euclidean structure of $\mathbb{R}^n$ (see Linear Algebra):

1. Cauchy's inequality. For any elements $x, y$ of a Euclidean space,
$$|\langle x, y \rangle| \le \|x\| \cdot \|y\|.$$
Equality holds if and only if the vectors $x, y$ are linearly dependent (parallel). In $\mathbb{R}^n$ this is called the Cauchy–Bunyakovsky–Schwarz inequality:
$$(x_1 y_1 + \cdots + x_n y_n)^2 \le (x_1^2 + \cdots + x_n^2)(y_1^2 + \cdots + y_n^2) \qquad (x_i, y_i \in \mathbb{R}),$$
and equality holds if and only if the vectors $(x_1, \ldots, x_n)$ and $(y_1, \ldots, y_n)$ are linearly dependent (parallel).

2. Pythagorean Theorem. If $N \in \mathbb{N}$ and $x_1, \ldots, x_N$ is an orthogonal system in a Euclidean space, in particular in $\mathbb{R}^n$, then
$$\Big\|\sum_{i=1}^{N} x_i\Big\|^2 = \sum_{i=1}^{N} \|x_i\|^2.$$
(The square of the hypotenuse equals the sum of the squares of the perpendicular sides.)

We will use the following consequence of the Pythagorean theorem. If $x_1, \ldots, x_N$ is an orthogonal system in a Euclidean space and $k \in \{1, 2, \ldots, N\}$ is a fixed index, then
$$\Big\|\sum_{i=1}^{N} x_i\Big\|^2 = \sum_{i=1}^{N} \|x_i\|^2 \ge \|x_k\|^2,$$
thus, taking square roots, we get
$$\Big\|\sum_{i=1}^{N} x_i\Big\| \ge \|x_k\|.$$
Here equality holds if and only if $x_i = 0$ for every $i \ne k$. This inequality expresses that the length of the hypotenuse is at least the length of any perpendicular side. Combining this result with the triangle inequality we obtain
$$\|x_k\| \le \Big\|\sum_{i=1}^{N} x_i\Big\| \le \sum_{i=1}^{N} \|x_i\| \qquad (k = 1, \ldots, N). \tag{1.1}$$

Let us apply this result in $\mathbb{R}^n$ as follows. If $e_1, \ldots, e_n$ is the standard (orthonormal) basis $e_1 = (1, 0, \ldots, 0), \ldots, e_n = (0, 0, \ldots, 1)$ in $\mathbb{R}^n$ and $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$, then $x$ can be written as the orthogonal sum
$$x = \sum_{i=1}^{n} x_i e_i.$$
Apply the result (1.1) with $N = n$ and with the vectors $x_1 e_1, \ldots, x_n e_n \in \mathbb{R}^n$. Then we obtain, on the one hand,
$$\|x\| = \Big\|\sum_{i=1}^{n} x_i e_i\Big\| \ge \|x_k e_k\| = |x_k| \cdot \|e_k\| = |x_k| \cdot 1 = |x_k| \qquad (k = 1, \ldots, n),$$
and on the other hand
$$\|x\| = \Big\|\sum_{i=1}^{n} x_i e_i\Big\| \le \sum_{i=1}^{n} \|x_i e_i\| = \sum_{i=1}^{n} |x_i|.$$

Thus we have the following result:
$$|x_k| \le \|x\| \le \sum_{i=1}^{n} |x_i| \qquad (k = 1, \ldots, n). \tag{1.2}$$
Notice that these inequalities can also be deduced in an elementary way:
$$\|x\| = \sqrt{\sum_{i=1}^{n} x_i^2} \ge \sqrt{x_k^2} = |x_k| \qquad (k = 1, \ldots, n)$$
and
$$\|x\| = \sqrt{\sum_{i=1}^{n} x_i^2} \le \sqrt{\Big(\sum_{i=1}^{n} |x_i|\Big)^2} = \sum_{i=1}^{n} |x_i|.$$

1.2. k-arrays

Studying the higher order derivatives of a multivariable function, some basic knowledge about k-arrays will be important.

1.3. Definition Let $n \in \mathbb{N}$ and $k \in \mathbb{N}$. The functions $A : \{1, \ldots, n\}^k \to \mathbb{R}$ are called real k-arrays of size $n \times n \times \ldots \times n$. Their set is denoted by $\mathbb{R}^{n \times n \times \ldots \times n}$ (the number of $n$'s is $k$) or, for brevity, by $\mathbb{R}^{n_k}$. The function value $A(j_1, \ldots, j_k) \in \mathbb{R}$ is called the $(j_1, \ldots, j_k)$-th entry of the k-array and is denoted by $(A)_{j_1, \ldots, j_k}$ or by $a_{j_1, \ldots, j_k}$.

1.4. Remarks.
1. In the definition $\{1, \ldots, n\}^k$ denotes the k-fold Cartesian product of the set $\{1, \ldots, n\}$:
$$\{1, \ldots, n\}^k = \{1, \ldots, n\} \times \ldots \times \{1, \ldots, n\},$$
that is, the set of finite sequences $(j_1, \ldots, j_k)$ of length $k$, where $j_1, \ldots, j_k \in \{1, \ldots, n\}$.
2. The 1-arrays are the vectors in $\mathbb{R}^n$. The 2-arrays are the matrices in $\mathbb{R}^{n \times n}$ and can be represented by a square of size $n \times n$ in the plane.
3. The 3-arrays can be represented by a cube of size $n \times n \times n$ in space. A general entry of a 3-array can be written using three indices as $a_{ijk}$.

4. A general k-array can be represented by a k-dimensional rectangular box whose sides have length $n$. Here the length means the number of entries in the given direction (dimension).
5. $\mathbb{R}^{n_k}$ is a vector space over $\mathbb{R}$ and $\dim \mathbb{R}^{n_k} = n^k$. This follows from the fact that the elements of $\mathbb{R}^{n_k}$ are functions defined on a finite set with $n^k$ elements. So $\mathbb{R}^{n_k}$ is isomorphic with $\mathbb{R}^{n^k}$ as a vector space.

1.5. Definition The k-array $A \in \mathbb{R}^{n_k}$ is called symmetric if for any permutation $p_1, \ldots, p_k$ of the index system $j_1, \ldots, j_k$ we have
$$(A)_{p_1, \ldots, p_k} = (A)_{j_1, \ldots, j_k}.$$
Notice that the permutation here may be a permutation with repetition.

1.6. Remarks. Every 1-array is symmetric. From the point of view of symmetry the interesting case is $k \ge 2$.

The symbol $Ax^k$

In what follows we generalize the one-variable monomial $ax^k$ to $n$ variables.

1.7. Definition Let $A \in \mathbb{R}^{n_k}$ and $x \in \mathbb{R}^n$. Then the symbol $Ax^k$ denotes the real number defined as
$$Ax^k := \sum_{j_1=1}^{n} \sum_{j_2=1}^{n} \cdots \sum_{j_k=1}^{n} a_{j_1, \ldots, j_k}\, x_{j_1} x_{j_2} \cdots x_{j_k} \in \mathbb{R}.$$
The k-fold sum on the right-hand side can be denoted by a single sum symbol in which the indices run independently of each other from 1 to n:
$$Ax^k = \sum_{j_1, \ldots, j_k = 1}^{n} a_{j_1, \ldots, j_k}\, x_{j_1} x_{j_2} \cdots x_{j_k} \in \mathbb{R}.$$
The mapping
$$\mathbb{R}^n \to \mathbb{R}, \qquad x \mapsto Ax^k$$
is called an n-variable homogeneous polynomial of degree k.

1.8. Remarks.
1. Let $n = 1$, $A \in \mathbb{R}^{1_k}$. Denote by $a$ the single entry of $A$, that is $(A)_{1, \ldots, 1} = a$. Then for any $x = (x_1) \in \mathbb{R}^1 = \mathbb{R}$:
$$Ax^k = \sum_{j_1, \ldots, j_k = 1}^{1} (A)_{j_1, \ldots, j_k}\, x_{j_1} x_{j_2} \cdots x_{j_k} = a \cdot \underbrace{x_1 x_1 \cdots x_1}_{k} = a x^k,$$
thus $Ax^k$ is really a generalization of the one-variable monomial $ax^k$.

2. The n-variable homogeneous polynomials of degree 1 are the linear functionals of type $\mathbb{R}^n \to \mathbb{R}$:
$$Ax = \sum_{i=1}^{n} a_i x_i, \qquad \text{where } A = (a_1, \ldots, a_n) \in \mathbb{R}^n,\ x \in \mathbb{R}^n.$$
3. The n-variable homogeneous polynomials of degree 2 are the quadratic forms:
$$Ax^2 = \sum_{i,j=1}^{n} a_{ij} x_i x_j, \qquad \text{where } A = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{pmatrix} \in \mathbb{R}^{n \times n},\ x \in \mathbb{R}^n.$$
4. It is obvious that if $A, B \in \mathbb{R}^{n_k}$ and $\lambda \in \mathbb{R}$, then for any $x \in \mathbb{R}^n$
$$(A + B)x^k = Ax^k + Bx^k, \qquad (\lambda A)x^k = \lambda (Ax^k), \qquad A(\lambda x)^k = \lambda^k (Ax^k).$$

$Ax^k$ with a symmetric k-array. Let us derive another formula for the symbol $Ax^k$, provided that $A \in \mathbb{R}^{n_k}$ is symmetric (in the sense of Definition 1.5). In this case the sum in the definition of $Ax^k$ contains many identical terms; more precisely, the term
$$a_{j_1, \ldots, j_k}\, x_{j_1} x_{j_2} \cdots x_{j_k}$$
equals the term
$$a_{p_1, \ldots, p_k}\, x_{p_1} x_{p_2} \cdots x_{p_k}$$
whenever $p_1, \ldots, p_k$ is a permutation of $j_1, \ldots, j_k$. If the index system $j_1, \ldots, j_k$ contains the index 1 $i_1$ times, the index 2 $i_2$ times, ..., the index n $i_n$ times, where $i_1, i_2, \ldots, i_n \in \mathbb{N}_0$, $i_1 + i_2 + \ldots + i_n = k$, then the number of possible permutations (permutations with repetition) is
$$\frac{k!}{i_1!\, i_2! \cdots i_n!} = \frac{k!}{i!},$$
where the meaning of $i!$ is given in the following definition.

1.9. Definition The n-dimensional vector $i = (i_1, \ldots, i_n)$ is called a multi-index if $i_1, i_2, \ldots, i_n \in \mathbb{N}_0$. Thus the set of multi-indices is $\mathbb{N}_0^n$. If $i \in \mathbb{N}_0^n$ is a multi-index and $x \in \mathbb{R}^n$ is a vector, then the absolute value of $i$, the factorial of $i$ and the power $x^i$ are defined as
$$|i| := i_1 + i_2 + \ldots + i_n, \qquad i! := i_1!\, i_2! \cdots i_n!, \qquad x^i := x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n},$$
respectively.

Using these abbreviations, the terms of the sum in the definition of $Ax^k$ can be grouped into subsets. A subset contains the terms whose indices are permutations of each other; these terms are equal. Such a subset can be described uniquely by a multi-index in the following way. Let $i \in \mathbb{N}_0^n$, $|i| = k$ be a multi-index and associate to this $i$ the non-increasing index system $j(i)$, where
$$j(i) := (j_1, j_2, \ldots, j_k) := (\underbrace{n, \ldots, n}_{i_n \text{ times}}, \underbrace{n-1, \ldots, n-1}_{i_{n-1} \text{ times}}, \ldots, \underbrace{1, \ldots, 1}_{i_1 \text{ times}}).$$
Denote by $\mathrm{Perm}(j(i))$ the set of possible permutations $p = (p_1, \ldots, p_k)$ of the sequence $j(i)$. Thus the identical terms of the above-mentioned subset are
$$a_{p_1, \ldots, p_k}\, x_{p_1} x_{p_2} \cdots x_{p_k}, \qquad p = (p_1, \ldots, p_k) \in \mathrm{Perm}(j(i)).$$
We recall that the number of elements of $\mathrm{Perm}(j(i))$ is $\frac{k!}{i!}$. After these considerations we can write the expression of $Ax^k$ as follows:
$$Ax^k = \sum_{j_1, \ldots, j_k = 1}^{n} a_{j_1, \ldots, j_k}\, x_{j_1} \cdots x_{j_k} = \sum_{\substack{i \in \mathbb{N}_0^n \\ |i| = k}} \; \sum_{p \in \mathrm{Perm}(j(i))} a_{p_1, \ldots, p_k}\, x_{p_1} \cdots x_{p_k} = \sum_{\substack{i \in \mathbb{N}_0^n \\ |i| = k}} \frac{k!}{i!}\, a_{j_1, \ldots, j_k}\, x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n},$$
where $j(i) = (j_1, j_2, \ldots, j_k)$. If we use the notation $a_i := a_{j(i)} = a_{j_1, \ldots, j_k}$, then we obtain the following form (the so-called multi-index form) of $Ax^k$:
$$Ax^k = \sum_{\substack{i \in \mathbb{N}_0^n \\ |i| = k}} \frac{k!}{i!}\, a_i\, x^i. \tag{1.3}$$
Note that the number of terms in this sum equals $\binom{n + k - 1}{k}$ (see: combinations with repetition).

The norm of a k-array. Let us introduce the Euclidean vector norm on the vector space $\mathbb{R}^{n_k}$ (it is isomorphic with $\mathbb{R}^{n^k}$). This norm will be called the Frobenius norm (or Euclidean norm).

1.10. Definition Let $A \in \mathbb{R}^{n_k}$. Its Frobenius norm (or Euclidean norm) is defined as
$$\|A\|_F := \sqrt{\sum_{j_1, \ldots, j_k = 1}^{n} a_{j_1, \ldots, j_k}^2}.$$

1.11. Remark. The Frobenius norm satisfies the axioms of a vector norm. Moreover, $\mathbb{R}^{n_k}$ equipped with the Frobenius norm and $\mathbb{R}^{n^k}$ equipped with the Euclidean norm are isomorphic as normed vector spaces.

The following theorem gives an upper estimate for $Ax^k$.

1.12. Theorem Let $A \in \mathbb{R}^{n_k}$ and $x \in \mathbb{R}^n$. Then
$$|Ax^k| \le \|A\|_F \cdot \|x\|^k.$$

Proof. The proof is based on applying the Cauchy–Bunyakovsky–Schwarz inequality several times. We prove the theorem for the case $k = 2$; the general proof is similar and requires mathematical induction. Suppose that $k = 2$. Then we can use the indices $i$ and $j$ instead of $j_1$ and $j_2$:
$$|Ax^2|^2 = (Ax^2)^2 = \Big(\sum_{i,j=1}^{n} a_{ij} x_i x_j\Big)^2 = \Big(\sum_{i=1}^{n} x_i \sum_{j=1}^{n} a_{ij} x_j\Big)^2 \le \sum_{i=1}^{n} x_i^2 \cdot \sum_{i=1}^{n} \Big(\sum_{j=1}^{n} a_{ij} x_j\Big)^2 \le \|x\|^2 \sum_{i=1}^{n} \Big(\sum_{j=1}^{n} a_{ij}^2 \cdot \sum_{j=1}^{n} x_j^2\Big) = \|x\|^2 \cdot \|A\|_F^2 \cdot \|x\|^2 = \|A\|_F^2\, \|x\|^4.$$
So $|Ax^2|^2 \le \|A\|_F^2\, \|x\|^4$, which implies the statement of the theorem. $\square$

1.3. Homeworks

1. The following vectors are given in $\mathbb{R}^4$:
$$x := (-1, 3, 5, 2), \qquad y := (2, 3, 1, 1).$$
Determine:
a) $x + y$  b) $x - y$  c) $3x$  d) $2x - 5y$  e) $\langle x, y \rangle$  f) $\|x\|$  g) $d(x, y)$

2. Let $A \in \mathbb{R}^{2_3}$ be a symmetric 3-array and $x \in \mathbb{R}^2$. Write $Ax^3$ in the original form (see Definition 1.7) and in the multi-index form (see formula (1.3)), respectively, and check that they are identical.

3. Compute the Frobenius norm of the following 2-arrays (a numerical sketch of such a check follows after this list):
$$\text{a) } A = \begin{pmatrix} 0 & 4 \\ 3 & 1 \end{pmatrix} \qquad \text{b) } B = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 3 & 1 \\ 2 & 0 & 3 \end{pmatrix}$$
c) Compute $Ax^2$ for $x = (1, 2)$, then check the statement of Theorem 1.12.
d) Compute $Bx^2$ for $x = (1, 1, 1)$, then check the statement of Theorem 1.12.

4. a) Compute the Frobenius norm of the following 3-array:
$$A \in \mathbb{R}^{2 \times 2 \times 2}, \qquad a_{ijk} := i + j - k^2 \quad (i, j, k = 1, 2).$$
b) Compute $Ax^3$ for $x = (-2, 1)$, then check the statement of Theorem 1.12.
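The kind of check asked for in Homeworks 3 and 4 can be sketched in Python. The code below (NumPy and the concrete entries of the 3-array are our own choices, not the data of the homework) computes $Ax^3$ from Definition 1.7, the Frobenius norm from Definition 1.10, and verifies the bound of Theorem 1.12.

```python
import numpy as np

n, k = 2, 3
# an illustrative 3-array in R^{2x2x2}: entries 1..8 (our choice)
A = np.arange(1.0, n**k + 1).reshape((n,) * k)
x = np.array([-2.0, 1.0])

# A x^k = sum over all index tuples (j1, j2, j3), cf. Definition 1.7
Axk = np.einsum('ijk,i,j,k->', A, x, x, x)

# Frobenius norm, cf. Definition 1.10
frob = np.sqrt((A ** 2).sum())

# Theorem 1.12: |A x^k| <= ||A||_F * ||x||^k
bound = frob * np.linalg.norm(x) ** k
print(Axk, frob, bound, abs(Axk) <= bound)
```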

2. Lesson 2

2.1. Balls in R^n

2.1. Definition The neighbourhood (or ball, or environment) of the point $a \in \mathbb{R}^n$ with radius $r > 0$ is the set
$$B(a, r) := \{x \in \mathbb{R}^n \mid \|x - a\| < r\}.$$

2.2. Remark. In the case $n = 1$ the ball is the open interval $B(a, r) = (a - r, a + r)$. In the case $n = 2$ the ball is the open circular disk
$$B(a, r) = \{x = (x_1, x_2) \in \mathbb{R}^2 \mid (x_1 - a_1)^2 + (x_2 - a_2)^2 < r^2\}.$$

One can easily prove the following basic properties of neighbourhoods.

2.3. Theorem
1. If $0 < r_1 < r_2$ then $B(a, r_1) \subset B(a, r_2)$.
2. $\bigcap_{r \in \mathbb{R}^+} B(a, r) = \{a\}$.
3. ($T_2$-property, separation) Let $a, b \in \mathbb{R}^n$, $a \ne b$. Then there exist $r_1 > 0$, $r_2 > 0$ such that $B(a, r_1) \cap B(b, r_2) = \emptyset$.

Proof. We prove only the third statement. Let $r_1 = r_2 = r = \frac{\|a - b\|}{3}$. We show that $B(a, r) \cap B(b, r) = \emptyset$. Assume, on the contrary, that there exists $x \in B(a, r) \cap B(b, r)$. Then for such an $x$
$$\|x - a\| < \frac{\|a - b\|}{3} \qquad \text{and} \qquad \|x - b\| < \frac{\|a - b\|}{3}.$$
Using this and the triangle inequality,
$$\|a - b\| = \|a - x + x - b\| \le \|a - x\| + \|x - b\| = \|x - a\| + \|x - b\| < \frac{\|a - b\|}{3} + \frac{\|a - b\|}{3} = \frac{2\|a - b\|}{3},$$
which is a contradiction. $\square$

We briefly add some concepts in connection with neighbourhoods in different dimensions.

2.4. Definition Let $k \in \{1, \ldots, n\}$ and $I := \{i_1, \ldots, i_k\} \subset \{1, \ldots, n\}$. Suppose that $1 \le i_1 < i_2 < \ldots < i_k \le n$ and denote by $i$ the index vector $i = (i_1, i_2, \ldots, i_k)$. The set
$$CHP(i) := \{x \in \mathbb{R}^n \mid x_i = 0 \text{ if } i \notin I\} \subset \mathbb{R}^n$$
is called the i-coordinate (or $(i_1, i_2, \ldots, i_k)$-coordinate) hyperplane. E.g. in $\mathbb{R}^3$ the (1, 2)-coordinate hyperplane is the xy-plane, the (2)-coordinate hyperplane is the y-axis, etc.

Obviously the set $CHP(i)$ is a k-dimensional subspace of $\mathbb{R}^n$ that is isomorphic with $\mathbb{R}^k$ via the mapping
$$\varphi : CHP(i) \to \mathbb{R}^k, \qquad \varphi(x) := (x_{i_1}, \ldots, x_{i_k}).$$
Thus a point $a \in CHP(i)$ has two kinds of neighbourhoods: one in $\mathbb{R}^n$ (n-dimensional, denoted by $B(a, r)$) and one in $\mathbb{R}^k$ (k-dimensional, denoted by $B_i(a, r)$). You can easily prove that
$$B_i(a, r) = B(a, r) \cap CHP(i).$$
If e.g. the point $a$ lies on the xy-plane in $\mathbb{R}^3$, then this connection expresses that the intersection of a ball with a plane is a circular disk.

2.2. Topology in R^n

In the previous section we defined the ball. Using this concept we can define important classes of points with respect to a fixed set.

2.5. Definition Let $\emptyset \ne H \subset \mathbb{R}^n$, $a \in \mathbb{R}^n$. Then
1. $a$ is an interior point of $H$ if $\exists r > 0 : B(a, r) \subset H$.
2. $a$ is an exterior point of $H$ if $\exists r > 0 : B(a, r) \cap H = \emptyset$; in other words, $\exists r > 0 : B(a, r) \subset \overline{H}$. Here $\overline{H}$ denotes the complement of $H$, that is $\overline{H} = \mathbb{R}^n \setminus H$.
3. $a$ is a boundary point of $H$ if $\forall r > 0 : B(a, r) \cap H \ne \emptyset$ and $B(a, r) \cap \overline{H} \ne \emptyset$.

2.6. Remark. Every interior point lies in $H$ and every exterior point lies in $\overline{H}$, but a boundary point may belong either to $H$ or to its complement.

2.7. Definition
1. The set of the interior points of $H$ is called the interior of $H$ and is denoted by $\mathrm{int}\, H$. So
$$\mathrm{int}\, H := \{a \in \mathbb{R}^n \mid \exists r > 0 : B(a, r) \subset H\} \subset H.$$

2. The set of the exterior points of $H$ is called the exterior of $H$ and is denoted by $\mathrm{ext}\, H$. So
$$\mathrm{ext}\, H := \{a \in \mathbb{R}^n \mid \exists r > 0 : B(a, r) \subset \overline{H}\} \subset \overline{H}.$$
3. The set of the boundary points of $H$ is called the boundary of $H$ and is denoted by $\partial H$. So
$$\partial H := \{a \in \mathbb{R}^n \mid \forall r > 0 : B(a, r) \cap H \ne \emptyset \text{ and } B(a, r) \cap \overline{H} \ne \emptyset\} \subset \mathbb{R}^n.$$

2.8. Remark. $\mathbb{R}^n = \mathrm{int}\, H \cup \partial H \cup \mathrm{ext}\, H$, and this is a union of disjoint sets. You can easily see that $\mathrm{int}\, H = H \setminus \partial H$, so we obtain the interior of a set if we subtract its boundary from the set. If we add the boundary to the set, we obtain the closure of the set, as in the following definition.

2.9. Definition The set $H \cup \partial H$ is called the closure of $H$ and is denoted by $\mathrm{clos}\, H$. So
$$\mathrm{clos}\, H := H \cup \partial H.$$
It is obvious that $\mathrm{clos}\, H = \overline{\mathrm{int}\, \overline{H}}$ and $\mathrm{int}\, H = \overline{\mathrm{clos}\, \overline{H}}$. This is based on the simple fact that $\partial H = \partial \overline{H}$.

2.10. Definition Let $H \subset \mathbb{R}^n$. Then
1. $H$ is called an open set $\overset{\text{df}}{\iff}$ $H \cap \partial H = \emptyset$;
2. $H$ is called a closed set $\overset{\text{df}}{\iff}$ $\partial H \subset H$.

2.11. Remarks.
1. $H$ is open if and only if it contains none of its boundary points, and it is closed if and only if it contains all of its boundary points.
2. $\emptyset$ and $\mathbb{R}^n$ are open and closed at the same time. There is no other set in $\mathbb{R}^n$ that is both open and closed.
3. $H$ is open $\iff$ $\overline{H}$ is closed; $H$ is closed $\iff$ $\overline{H}$ is open.
4. $H$ is open $\iff$ $H \subset \mathrm{int}\, H$ $\iff$ $H = \mathrm{int}\, H$.
5. $H$ is closed $\iff$ $\mathrm{clos}\, H \subset H$ $\iff$ $H = \mathrm{clos}\, H$.

2.12. Definition Let $\emptyset \ne H \subset \mathbb{R}^n$. Then $H$ is called bounded if
$$\exists M > 0 \ \forall x \in H : \|x\| \le M.$$

2.3. Homeworks

1. Prove statements 1 and 2 of Theorem 2.3.

2. Let $a \in \mathbb{R}^n$, $r > 0$. Prove that in $\mathbb{R}^n$
a) $B(a, r)$ is an open set;
b) the set $\{x \in \mathbb{R}^n \mid \|x - a\| \le r\}$ (the closed ball) is a closed set;
c) the set $\{x \in \mathbb{R}^n \mid \|x - a\| = r\}$ (the sphere) is a closed set.

3. a) Prove that any ball $B(a, r)$ contains $n$ linearly independent vectors.
b) Prove that for any proper subspace $W \subsetneq \mathbb{R}^n$: $\mathrm{int}\, W = \emptyset$.
c) Prove that for any subspace $W \subset \mathbb{R}^n$: $W$ is a closed set.

4. Determine $\mathrm{int}\, H$, $\partial H$, $\mathrm{ext}\, H$ and $\mathrm{clos}\, H$ if $H \subset \mathbb{R}^2$ and
a) $H = \{(x, y) \in \mathbb{R}^2 \mid 1 \le x < 3,\ 1 \le y < 2\}$;
b) $H = \{(x, y) \in \mathbb{R}^2 \mid x \ge 0,\ x^2 + y^2 < 1\}$.

5. Prove that a set $\emptyset \ne H \subset \mathbb{R}^n$ is bounded if and only if it can be covered by a ball, that is,
$$\exists a \in \mathbb{R}^n \ \exists r > 0 : H \subset B(a, r).$$

3. Lesson 3

3.1. Sequences in R^n

Similarly to number sequences we can define vector sequences in $\mathbb{R}^n$.

3.1. Definition A function $a : \mathbb{N} \to \mathbb{R}^n$ is called a vector sequence in $\mathbb{R}^n$. The function value $a(k) \in \mathbb{R}^n$ assigned to the number $k \in \mathbb{N}$ is called the k-th term of the sequence. If we want to indicate the components of the k-th term, we will rather use the notation $a^{(k)}$ instead of $a(k)$. If we do not want to indicate the components, we may use the usual notation $a_k$ for the k-th term.

3.2. Definition Let $a^{(k)} \in \mathbb{R}^n$ $(k \in \mathbb{N})$ be a vector sequence in $\mathbb{R}^n$. Then
$$a^{(k)} = (a^{(k)}_1, a^{(k)}_2, \ldots, a^{(k)}_n) \in \mathbb{R}^n \qquad (k \in \mathbb{N}).$$
The number sequence $a^{(k)}_i \in \mathbb{R}$ $(k \in \mathbb{N})$ is called the i-th coordinate sequence of $(a^{(k)})$ $(i = 1, \ldots, n)$.

3.3. Definition The sequence $a^{(k)} \in \mathbb{R}^n$ $(k \in \mathbb{N})$ is called bounded if
$$\exists M > 0 \ \forall k \in \mathbb{N} : \|a^{(k)}\| \le M.$$
It is obvious that the vector sequence $(a^{(k)})$ is bounded if and only if the real number sequence $(\|a^{(k)}\|)$ is bounded. Moreover, using the inequalities (1.2), one can prove that a vector sequence is bounded if and only if each of its coordinate sequences is bounded. That is,
$$(a^{(k)}) \text{ is bounded} \iff (a^{(k)}_i) \text{ is bounded} \quad (i = 1, \ldots, n).$$

3.4. Definition The vector sequence $a : \mathbb{N} \to \mathbb{R}^n$ is called convergent if
$$\exists A \in \mathbb{R}^n \ \forall \varepsilon > 0 \ \exists N \in \mathbb{N} \ \forall k \ge N : a^{(k)} \in B(A, \varepsilon).$$
The definition can be written using inequalities as follows:
$$\exists A \in \mathbb{R}^n \ \forall \varepsilon > 0 \ \exists N \in \mathbb{N} \ \forall k \ge N : \|a^{(k)} - A\| < \varepsilon.$$
A vector sequence is called divergent if it is not convergent.

It can be proved (using the $T_2$-property of the neighbourhoods) that the vector $A$ in the above definition is unique. It is called the limit (or limit vector) of the vector sequence $(a^{(k)})$, and it is denoted in one of the following ways:
$$\lim a = A, \qquad \lim a^{(k)} = A, \qquad \lim_{k \to \infty} a^{(k)} = A, \qquad a^{(k)} \to A \ (k \to \infty).$$

3.5. Remark. If $a : \mathbb{N} \to \mathbb{R}^n$ is a vector sequence and $A \in \mathbb{R}^n$, then $\lim_{k \to \infty} a^{(k)} = A$ is equivalent to
$$\forall \varepsilon > 0 \ \exists N \in \mathbb{N} \ \forall k \ge N : a^{(k)} \in B(A, \varepsilon),$$
or, using inequalities,
$$\forall \varepsilon > 0 \ \exists N \in \mathbb{N} \ \forall k \ge N : \|a^{(k)} - A\| < \varepsilon.$$
The number $N$ is called a threshold index for $\varepsilon$.

3.6. Theorem Let $a^{(k)} \in \mathbb{R}^n$ $(k \in \mathbb{N})$ be a vector sequence and $A \in \mathbb{R}^n$ a vector. Then
$$\lim_{k \to \infty} a^{(k)} = A \iff \lim_{k \to \infty} \|a^{(k)} - A\| = 0.$$

Proof. The proof is obvious if we consider that $\|a^{(k)} - A\| = \big|\,\|a^{(k)} - A\| - 0\,\big|$. $\square$

In the following theorem we reduce the convergence of a vector sequence to the convergence of its coordinate sequences.

3.7. Theorem Let $a^{(k)} \in \mathbb{R}^n$ $(k \in \mathbb{N})$ be a vector sequence and $A \in \mathbb{R}^n$ a vector. Then
$$\lim_{k \to \infty} a^{(k)} = A \iff \lim_{k \to \infty} a^{(k)}_i = A_i \quad (i = 1, \ldots, n).$$

Proof. Applying the inequalities (1.2) to the vectors $a^{(k)} - A$ we obtain
$$|a^{(k)}_i - A_i| \le \|a^{(k)} - A\| \le \sum_{i=1}^{n} |a^{(k)}_i - A_i| \qquad (i = 1, \ldots, n),$$
which implies both directions of the statement. $\square$

This reduction to the coordinate sequences makes it possible to prove easily some important basic theorems: the connection between convergence and boundedness, between convergence and the algebraic operations, and also the completeness of $\mathbb{R}^n$.

Similarly to number sequences, one can prove that a convergent sequence is bounded (using the norm instead of the absolute value). The converse is not true: there exist bounded sequences in $\mathbb{R}^n$ that are divergent. For example, if $x \in \mathbb{R}^n \setminus \{0\}$, then the sequence $a_k := (-1)^k x$ $(k \in \mathbb{N})$ is such a sequence.

3.8. Theorem [Bolzano–Weierstrass] Let $a^{(k)} \in \mathbb{R}^n$ $(k \in \mathbb{N})$ be a bounded vector sequence. Then it has a convergent subsequence.

Proof. For simplicity we present the proof in the case $n = 2$; the general case can be proved in the same way. As $(a^{(k)})$ is bounded, its first coordinate sequence $(a^{(k)}_1)$ is a bounded real number sequence. By the Bolzano–Weierstrass theorem in $\mathbb{R}$ (see Analysis-1) it has a convergent subsequence $(a^{(k_m)}_1,\ m \in \mathbb{N})$. The corresponding subsequence $(a^{(k_m)}_2,\ m \in \mathbb{N})$ of the second coordinate sequence $(a^{(k)}_2)$ is also bounded, so applying the Bolzano–Weierstrass theorem in $\mathbb{R}$ once more, it has a convergent subsequence $(a^{(k_{m_s})}_2,\ s \in \mathbb{N})$. Then the vector sequence
$$a^{(k_{m_s})} = \big(a^{(k_{m_s})}_1,\ a^{(k_{m_s})}_2\big) \qquad (s \in \mathbb{N})$$
is obviously convergent. $\square$

3.2. Characterization of closed sets with sequences

The closedness of a set in $\mathbb{R}^n$ can be described with vector sequences.

3.9. Theorem Let $\emptyset \ne H \subset \mathbb{R}^n$. Then $H$ is closed if and only if
$$\forall a_k \in H \ (k \in \mathbb{N}) \text{ convergent sequence} : \lim_{k \to \infty} a_k \in H.$$

Proof. $\Rightarrow$: Let $a_k \in H$ $(k \in \mathbb{N})$ be a convergent sequence and $A := \lim a_k$. We need to prove that $A \in H$. Suppose indirectly that $A \notin H$. Then $A \in \overline{H}$. But $\overline{H}$ is open (because $H$ is closed), therefore
$$\exists \varepsilon > 0 : B(A, \varepsilon) \subset \overline{H}.$$
But for this $\varepsilon$: $\exists N \in \mathbb{N} \ \forall k \ge N : a_k \in B(A, \varepsilon) \subset \overline{H}$. This is a contradiction: $a_k \in H$ and $a_k \in \overline{H}$ cannot hold at the same time.

$\Leftarrow$: Suppose indirectly that $H$ is not closed. This implies that $\overline{H}$ is not open, thus there exists $A \in \overline{H}$ that is not an interior point of $\overline{H}$. This means that
$$\forall r > 0 : B(A, r) \not\subset \overline{H}, \quad \text{that is,} \quad B(A, r) \cap H \ne \emptyset.$$
Applying this fact to the radii $r := \frac{1}{k}$ $(k \in \mathbb{N})$ we obtain
$$\forall k \in \mathbb{N} \ \exists a_k \in B\big(A, \tfrac{1}{k}\big) \cap H.$$
So we have defined a sequence $a_k \in H$ $(k \in \mathbb{N})$. Since $\|a_k - A\| < \frac{1}{k}$ $(k \in \mathbb{N})$, the limit of this sequence is $A$. So by the assumption $A \in H$. But $A$ was chosen from the set $\overline{H}$. This is a contradiction. $\square$

The intuitive content of the above theorem is that it is impossible to leave a closed set via convergence.

3.3. Compact sets

3.10. Definition Let $\emptyset \ne H \subset \mathbb{R}^n$. $H$ is called a compact set if
$$\forall a_k \in H \ (k \in \mathbb{N}) \text{ sequence } \ \exists (a_{k_m},\ m \in \mathbb{N}) \text{ subsequence} : (a_{k_m}) \text{ is convergent and } \lim_{m \to \infty} a_{k_m} \in H.$$
The empty set $\emptyset$ is compact by definition.

Similarly to the case of compact sets in $\mathbb{R}$ (see Analysis-2), the following theorem can be proved; we only need to use the norm instead of the absolute value.

3.11. Theorem Let $H \subset \mathbb{R}^n$. Then $H$ is compact if and only if it is closed and bounded.

3.12. Remark. The theorem is not valid in infinite dimensional normed spaces: every compact set is closed and bounded, but there exist closed and bounded sets that are not compact (see Functional Analysis).

3.4. Homeworks

1. Prove by the definition of convergence that in $\mathbb{R}^2$
$$\lim_{n \to \infty} \left( \frac{n + 1}{2n - 3},\ \frac{3n^2}{n^2 + 5} \right) = \left( \frac{1}{2},\ 3 \right).$$
Determine a threshold index for $\varepsilon = 10^{-3}$. (A numerical sketch follows after this list.)

2. Determine the limit of the following sequence in $\mathbb{R}^3$:
$$a_n = \left( \frac{1}{n},\ \left(1 + \frac{1}{n}\right)^{n},\ \frac{2n - 1}{3n + 7} \right) \qquad (n \in \mathbb{N}).$$

3. Using sequences, prove that the following set is not closed:
$$H = \{(x, y) \in \mathbb{R}^2 \mid 1 \le x < 3,\ 1 \le y < 2\} \subset \mathbb{R}^2.$$

4. Using the definition of compactness, prove that the following sets in $\mathbb{R}^2$ are not compact:
a) $H = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 < 4\}$
b) $H = \{(x, y) \in \mathbb{R}^2 \mid 1 \le x \le 3\}$
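A quick numerical illustration of Homework 1 (a sketch only; it does not replace the $\varepsilon$–$N$ proof asked for): the terms of the sequence are compared with the claimed limit $(1/2, 3)$, and the norm of the difference shrinks towards 0.

```python
import numpy as np

A = np.array([0.5, 3.0])            # claimed limit vector

def a(n: int) -> np.ndarray:
    """n-th term of the sequence from Homework 1."""
    return np.array([(n + 1) / (2 * n - 3), 3 * n**2 / (n**2 + 5)])

for n in (10, 100, 10_000):
    print(n, np.linalg.norm(a(n) - A))   # ||a^(n) - A|| tends to 0
```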

4. Lesson 4

4.1. The limit of functions of type R^n → R^m

As in the one-variable case, the concept of limit expresses where the function values tend if the variable tends to a certain point. The first question is at which points the limit can be considered at all; these are the so-called accumulation points of the domain of the function.

4.1. Definition (accumulation point) Let $\emptyset \ne H \subset \mathbb{R}^n$ and $a \in \mathbb{R}^n$. We say that $a$ is an accumulation point of $H$ if
$$\forall r > 0 : (B(a, r) \setminus \{a\}) \cap H \ne \emptyset.$$
The set of accumulation points of $H$ is denoted by $H'$, that is,
$$H' := \{a \in \mathbb{R}^n \mid a \text{ is an accumulation point of } H\}.$$
The points of the set $H \setminus H'$ are called isolated points of $H$.

4.2. Definition (isolated point) Let $\emptyset \ne H \subset \mathbb{R}^n$ and $a \in \mathbb{R}^n$. We say that $a$ is an isolated point of $H$ if $a \in H$ and
$$\exists r > 0 : (B(a, r) \setminus \{a\}) \cap H = \emptyset.$$

After these preliminaries we define the limit.

4.3. Definition Let $f \in \mathbb{R}^n \to \mathbb{R}^m$, $a \in D_f'$. We say that $f$ has a limit at the point $a$ if
$$\exists A \in \mathbb{R}^m \ \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x \in (B(a, \delta) \setminus \{a\}) \cap D_f : f(x) \in B(A, \varepsilon).$$
Using the $T_2$-property of neighbourhoods it can be proved that the vector $A$ in this definition is unique. This unique $A$ is called the limit of the function $f$ at the point $a$. The notations are:
$$A = \lim_a f, \qquad A = \lim_{x \to a} f(x), \qquad f(x) \to A \ (x \to a).$$

4.4. Remark. Thus the fact $\lim_a f = A$ can be expressed with neighbourhoods:
$$\forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x \in (B(a, \delta) \setminus \{a\}) \cap D_f : f(x) \in B(A, \varepsilon),$$
and with inequalities:
$$\forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x \in D_f,\ 0 < \|x - a\| < \delta : \|f(x) - A\| < \varepsilon.$$

4.5. Examples
1. (constant function) Let $f : \mathbb{R}^n \to \mathbb{R}^m$, $f(x) = c$, where $c \in \mathbb{R}^m$ is a fixed vector. Then for any $a \in \mathbb{R}^n$: $\lim_{x \to a} c = c$, because for any $\varepsilon > 0$ any $\delta > 0$ is suitable:
$$\forall x \in (B(a, \delta) \setminus \{a\}) \cap D_f : f(x) = c \in B(c, \varepsilon).$$
2. (identity function) Let $f : \mathbb{R}^n \to \mathbb{R}^n$, $f(x) := x$, and let $a \in \mathbb{R}^n$. Then $\lim_{x \to a} x = a$, because for any $\varepsilon > 0$ the choice $\delta := \varepsilon$ works:
$$\forall x \in (B(a, \delta) \setminus \{a\}) \cap D_f : f(x) = x \in B(a, \delta) = B(a, \varepsilon).$$
3. (canonical projections) Let $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) := x_i$, where $i \in \{1, \ldots, n\}$ is fixed and $x = (x_1, \ldots, x_n)$. Let $a = (a_1, \ldots, a_n) \in \mathbb{R}^n$. Then $\lim_{x \to a} f(x) = a_i$, because for any $\varepsilon > 0$ the choice $\delta := \varepsilon$ works: if $0 < \|x - a\| < \delta$, then by (1.2)
$$|f(x) - a_i| = |x_i - a_i| = |(x - a)_i| \le \|x - a\| < \delta = \varepsilon.$$
Note that in the case $n = 1$ the projection coincides with the identity.

Similarly to the one-variable case, the Transference Principle can be proved.

4.6. Theorem [Transference Principle for limits] Using the previous notations:
$$\lim_{x \to a} f(x) = A \iff \text{for every allowed sequence } x_k \in D_f \setminus \{a\} \ (k \in \mathbb{N}),\ \lim x_k = a: \ \lim f(x_k) = A.$$
The most important corollaries of the Transference Principle are, as in the one-variable case, the statements about algebraic operations with limits. For $m \ge 2$ we can also speak about limits by coordinates. To formulate this statement we have to define the coordinate functions.

4.7. Definition Let $f \in \mathbb{R}^n \to \mathbb{R}^m$, $f(x) = (f_1(x), \ldots, f_m(x)) \in \mathbb{R}^m$. Then the function $f_i : D_f \to \mathbb{R}$ is called the i-th coordinate function of $f$ $(i = 1, \ldots, m)$. We often use the notation $f = (f_1, \ldots, f_m)$. Clearly, in the case $m = 1$, $f_1 = f$.

Using the Transference Principle and the coordinate-wise convergence of sequences, one can prove the following theorem.

4.8. Theorem [limit by coordinates] Suppose that $m \ge 2$ and let $f \in \mathbb{R}^n \to \mathbb{R}^m$, $a \in D_f'$, $A = (A_1, \ldots, A_m) \in \mathbb{R}^m$. Then
$$\lim_a f = A \iff \lim_a f_i = A_i \quad (i = 1, \ldots, m).$$

4.9. Remark. In short notation our theorem reads
$$\lim_{x \to a} f(x) = \Big( \lim_{x \to a} f_1(x), \ldots, \lim_{x \to a} f_m(x) \Big).$$
Moreover, the existence of one side of the above equality implies the existence of the other side.

4.2. Limit along a set

Sometimes we approach a point in such a way that the variable remains in a fixed given set. In this case we speak about the limit along a set. First we define the restriction of a function.

4.10. Definition Let $f \in A \to B$, $\emptyset \ne C \subset D_f$. The function
$$f|_C : C \to B, \qquad f|_C(x) := f(x)$$
is called the restriction of $f$ to the set $C$.

4.11. Definition (limit along a set) Let $f \in \mathbb{R}^n \to \mathbb{R}^m$, $\emptyset \ne H \subset \mathbb{R}^n$. Suppose that $a \in \mathbb{R}^n$ and $a \in (H \cap D_f)'$. Then the limit along the set $H$ is defined as follows:
$$\lim_{\substack{x \to a \\ x \in H}} f(x) := \lim_{x \to a} f|_{H \cap D_f}(x).$$
Since the definition reduces the limit along a set to the limit of functions, all the statements (Transference Principle, algebraic operations, limit by coordinates) remain valid for limits along a set.

4.12. Remarks.
1. If $H = D_f$ then $\lim\limits_{\substack{x \to a \\ x \in H}} f(x) = \lim\limits_{x \to a} f(x)$.
2. If $n = 1$ then we may obtain the one-sided limits, that is, for $H = (-\infty, a)$:
$$\lim_{\substack{x \to a \\ x \in H}} f(x) = \lim_{x \to a - 0} f(x),$$
and for $H = (a, +\infty)$:
$$\lim_{\substack{x \to a \\ x \in H}} f(x) = \lim_{x \to a + 0} f(x).$$
3. If $\lim\limits_{x \to a} f(x) = A$, then for any set $H$ with $a \in (H \cap D_f)'$ it follows that
$$\lim_{\substack{x \to a \\ x \in H}} f(x) = A.$$
The practically useful corollary of this statement is that if we tend to a point in two different ways and obtain two different results, then the function has no limit at this point (two-way method).
4. If $\lim\limits_{\substack{x \to a \\ x \in H}} f(x) = \lim\limits_{\substack{x \to a \\ x \in K}} f(x) = A$, then
$$\lim_{\substack{x \to a \\ x \in H \cup K}} f(x) = A.$$
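The two-way method of Remark 4.12.3 is easy to see numerically. The sketch below uses the standard example $f(x, y) = \frac{xy}{x^2 + y^2}$ (the function is our choice, not one fixed by the notes at this point) and approaches $(0, 0)$ along the two lines $y = 0$ and $y = x$; the two different values obtained show that the limit at the origin cannot exist.

```python
def f(x: float, y: float) -> float:
    return x * y / (x ** 2 + y ** 2)

# Approach (0, 0) along H = {y = 0} and K = {y = x}.
for t in (0.1, 0.01, 0.001):
    print(f(t, 0.0), f(t, t))   # values tend to 0 along H and to 1/2 along K
```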

4.3. The continuity of functions of type R^n → R^m

We can define continuity similarly to the one-variable case.

4.13. Definition Let $f \in \mathbb{R}^n \to \mathbb{R}^m$, $a \in D_f$. $f$ is continuous at $a$ if
$$\forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x \in B(a, \delta) \cap D_f : f(x) \in B(f(a), \varepsilon).$$
Let us denote the set of functions that are continuous at $a$ by $C(a)$.

From the definition it follows immediately that
- if $a$ is an isolated point of $D_f$, then $f$ is continuous at $a$;
- if $a$ is an accumulation point of $D_f$, then $f$ is continuous at $a$ $\iff$ $\lim_{x \to a} f(x) = f(a)$.

4.14. Definition Let $f \in \mathbb{R}^n \to \mathbb{R}^m$. Then $f$ is continuous if it is continuous at every point of its domain, that is, $\forall a \in D_f : f \in C(a)$.

By the results of Examples 4.5, the constant function, the identity function and the canonical projections are continuous.

4.15. Theorem [Transference Principle for continuity] Using our notations:
$$f \in C(a) \iff \forall x_k \in D_f \ (k \in \mathbb{N}),\ \lim x_k = a : \ \lim f(x_k) = f(a).$$
The proof of this theorem is similar to the one-variable case. Using the Transference Principle one can easily see that
1. $f, g \in C(a)$, $c \in \mathbb{R}$ $\implies$ $f + g,\ fg,\ cf \in C(a)$;
2. $g \in C(a)$, $f \in C(g(a))$ $\implies$ $f \circ g \in C(a)$;
3. $f = (f_1, \ldots, f_m) \in C(a)$ $\iff$ $f_i \in C(a)$ $(i = 1, \ldots, m)$.

Similarly to the one-variable case, one can prove the most important theorems for continuous functions defined on compact sets.

4.16. Theorem [the compactness of the image] Let $f \in \mathbb{R}^n \to \mathbb{R}^m$ be a continuous function and suppose that $D_f$ is compact. Then $R_f$ is compact.

Before stating the following theorem, let us define the extreme values of an n-variable function.

4.17. Definition Let $f \in \mathbb{R}^n \to \mathbb{R}$. The minimum of $f$ is the minimal element of its range (if it exists), that is,
$$\min f := \min R_f = \min \{f(x) \mid x \in D_f\} = \min_{x \in D_f} f(x).$$
The vector $a \in D_f$ is called a place of the minimum if $f(a) = \min f$. Respectively, the maximum of $f$ is the maximal element of its range (if it exists), that is,
$$\max f := \max R_f = \max \{f(x) \mid x \in D_f\} = \max_{x \in D_f} f(x).$$
The vector $a \in D_f$ is called a place of the maximum if $f(a) = \max f$. These numbers are called the absolute (or global) extreme values (absolute (or global) minimum, absolute (or global) maximum) of $f$.

4.18. Theorem [the minimax theorem of Weierstrass] Let $f \in \mathbb{R}^n \to \mathbb{R}$ be a continuous function and let $D_f$ be compact. Then $\min f$ and $\max f$ exist.

Uniform continuity is defined in a similar way as in the one-variable case.

4.19. Definition Let $f \in \mathbb{R}^n \to \mathbb{R}^m$. We say that $f$ is uniformly continuous if
$$\forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x, y \in D_f,\ \|x - y\| < \delta : \|f(x) - f(y)\| < \varepsilon.$$

4.20. Theorem [theorem of Heine] Let $f \in \mathbb{R}^n \to \mathbb{R}^m$ be a continuous function and let $D_f$ be compact. Then $f$ is uniformly continuous.

4.4. Homeworks

1. Determine the following limits if they exist:
a) $\displaystyle\lim_{(x,y) \to (0,0)} xy\,\frac{x^2 - y^2}{x^2 + y^2}$
b) $\displaystyle\lim_{(x,y) \to (0,0)} \frac{3x^2 y + x y^2}{x^2 + 2y^2}$
c) $\displaystyle\lim_{(x,y) \to (0,0)} \frac{x^2 + y^2}{\sqrt{x^2 + y^2 + 4} - 2}$
d) $\displaystyle\lim_{(x,y) \to (0,0)} \frac{xy}{x - y}$

2. Discuss the continuity of the following functions of type $\mathbb{R}^2 \to \mathbb{R}$:
a) $f(x, y) = \begin{cases} \dfrac{x - y}{x + y} & \text{if } x + y \ne 0, \\[4pt] 0 & \text{if } x + y = 0; \end{cases}$
b) $f(x, y) = \begin{cases} \dfrac{x^2 y^2}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0), \\[4pt] 0 & \text{if } (x, y) = (0, 0). \end{cases}$

5. Lesson 5

5.1. The derivative of functions of type R^n → R^m

5.1. Definition Let $f \in \mathbb{R}^n \to \mathbb{R}^m$, $a \in \mathrm{int}\, D_f$. We say that $f$ is differentiable at the point $a$ (denoted by $f \in D(a)$) if
$$\exists A \in \mathbb{R}^{m \times n} : \lim_{h \to 0} \frac{f(a + h) - f(a) - A h}{\|h\|} = 0.$$

5.2. Theorem The matrix $A$ in the above definition is unique.

Proof. Suppose that the matrices $A, B \in \mathbb{R}^{m \times n}$ both satisfy the definition. In this case
$$\lim_{h \to 0} \left( \frac{f(a + h) - f(a) - Ah}{\|h\|} - \frac{f(a + h) - f(a) - Bh}{\|h\|} \right) = 0 - 0 = 0.$$
After calculation we have
$$\lim_{h \to 0} \frac{(B - A) h}{\|h\|} = 0.$$
Let $h \to 0$ along the rays of the unit vectors $e_j = (0, \ldots, 1, \ldots, 0)$ (with the 1 in the j-th place, $j = 1, \ldots, n$), that is, let $h := t e_j$, where $t > 0$, $t \to 0 + 0$. Since $\|t e_j\| = t\, \|e_j\| = t \cdot 1 = t$, we get
$$\lim_{h \to 0} \frac{(B - A) h}{\|h\|} = 0 \implies \lim_{t \to 0+0} \frac{(B - A)(t e_j)}{\|t e_j\|} = 0 \implies \lim_{t \to 0+0} \frac{(B - A)\, t e_j}{t} = 0 \implies (B - A) e_j = 0.$$
Since this holds for every $j = 1, \ldots, n$, we have $B - A = 0$, that is, $B = A$. $\square$

5.3. Definition The matrix $A$ in the above definition is called the derivative (or derivative matrix) of $f$ at the point $a$ and is denoted by $f'(a)$. So $f'(a) := A$.

5.4. Remarks.
1. If $n = m = 1$, then we obtain the definition of the derivative of functions $\mathbb{R} \to \mathbb{R}$ that was studied in Analysis-2.

2. In the case $n = 1$ the derivative can be defined equivalently, as it was presented in Analysis-2:
$$f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}.$$
3. In the case $n \ge 2$ we cannot use the ratio of differences, because division by a vector is undefined.

5.5. Theorem If $f \in D(a)$ then $f \in C(a)$.

Proof.
$$\|f(a + h) - f(a)\| \le \frac{\|f(a + h) - f(a) - f'(a)h\|}{\|h\|} \cdot \|h\| + \|f'(a)h\| \to 0 \qquad (h \to 0).$$
So $\lim_{h \to 0} f(a + h) = f(a)$, which implies $f \in C(a)$. $\square$

In the following theorem we state some differentiation rules without proof.

5.6. Theorem
1. Let $f, g \in \mathbb{R}^n \to \mathbb{R}^m$, $f, g \in D(a)$. Then $f + g \in D(a)$ and
$$(f + g)'(a) = f'(a) + g'(a)$$
(in the sense of matrix addition).
2. Let $f \in \mathbb{R}^n \to \mathbb{R}^m$, $\lambda \in \mathbb{R}$, $f \in D(a)$. Then $\lambda f \in D(a)$ and
$$(\lambda f)'(a) = \lambda\, f'(a)$$
(in the sense of multiplication of a matrix by a scalar).
3. (Chain Rule) Let $g \in \mathbb{R}^n \to \mathbb{R}^m$, $g \in D(a)$, $f \in \mathbb{R}^m \to \mathbb{R}^p$, $f \in D(g(a))$. Then $f \circ g \in D(a)$ and
$$(f \circ g)'(a) = f'(g(a)) \cdot g'(a)$$
(in the sense of matrix multiplication).

5.2. Partial Derivatives

What are the entries of the derivative matrix? This is a natural question; in this section we prepare the answer.

5.7. Definition Let $f \in \mathbb{R}^n \to \mathbb{R}^m$, $a = (a_1, \ldots, a_n) \in \mathrm{int}\, D_f$ and $j \in \{1, \ldots, n\}$. Define the following auxiliary function:
$$g_{a,j}(x) := f(a_1, \ldots, a_{j-1}, x, a_{j+1}, \ldots, a_n) \qquad (x \in B(a_j, r)),$$

where $r$ denotes a radius for which $B(a, r) \subset D_f$. The column matrix $g_{a,j}'(a_j) \in \mathbb{R}^{m \times 1}$ — if it exists — is called the j-th partial derivative (more precisely: the partial derivative with respect to the j-th variable) of the function $f$ at the point $a$. Its notations are:
$$\partial_j f(a), \qquad f'_{x_j}(a), \qquad \left(\frac{\partial f}{\partial x_j}\right)_{x=a}, \qquad \left(\frac{\partial f(x)}{\partial x_j}\right)_{x=a}.$$

5.8. Remarks.
1. Roughly speaking, we can compute the j-th partial derivative by fixing every variable except the j-th one and then differentiating the obtained one-variable function at $a_j$.
2. By the isomorphism between $\mathbb{R}^m$ and $\mathbb{R}^{m \times 1}$ we can regard the partial derivative as a vector in $\mathbb{R}^m$. In particular, in the case $m = 1$ ($f$ is a scalar-valued function), because of the isomorphism between $\mathbb{R}$ and $\mathbb{R}^{1 \times 1}$, we can regard the partial derivative as a number. We will use these representations in what follows.

5.9. Definition Using the notations of the previous definition, let $D \subset \mathbb{R}^n$ denote the set of all points of $D_f$ where the j-th partial derivative exists, and suppose that $D \ne \emptyset$. Then the function
$$\partial_j f : D \to \mathbb{R}^m, \qquad a \mapsto \partial_j f(a)$$
is called the j-th partial derivative function of $f$. Obviously the partial derivative function is of type $\mathbb{R}^n \to \mathbb{R}^m$, and in the case $m = 1$ it is of type $\mathbb{R}^n \to \mathbb{R}$.

5.3. Homeworks

1. Using the definition, prove that the following functions are differentiable at the given point $(a, b)$, and compute the derivatives:
a) $f : \mathbb{R}^2 \to \mathbb{R}$, $f(x, y) = x^3 + xy - 2y$, $(a, b) = (2, 1)$;
b) $f : \mathbb{R}^2 \to \mathbb{R}^2$, $f(x, y) = (x^2 y + 5y,\ x^2 - xy)$, $(a, b) = (2, 1)$.

2. Determine the partial derivatives of the following functions of type $\mathbb{R}^2 \to \mathbb{R}$ (a numerical cross-check of a) is sketched after this list):
a) $f(x, y) = x^2 - 5xy + 3y^2 - 6x + 7y + 8$;
b) $f(x, y) = \arcsin \dfrac{x}{y}$;
c) $f(x, y) = \dfrac{xy}{x + y}$;
d) $f(x, y) = x^3 - 5x^2 y + y^4$;
e) $f(x, y) = e^x \cos y - x \ln y$;
f) $f(x, y) = \operatorname{arctg} \dfrac{1 - x}{1 - y}$;
g) $f(x, y) = \dfrac{e^{2x - 3y}}{2x - 3y}$;
h) $f(x, y) = x \operatorname{tg} x - e^{xy}$.
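Partial derivatives from Definition 5.7 can be cross-checked numerically with a central difference in one variable at a time. The sketch below does this for Homework 2 a); the analytic formulas $\partial_1 f = 2x - 5y - 6$ and $\partial_2 f = -5x + 6y + 7$ follow from the exercise as reconstructed above, so treat them as part of this illustration rather than as the notes' own worked solution.

```python
def f(x: float, y: float) -> float:
    # Homework 2 a) as reconstructed above
    return x**2 - 5*x*y + 3*y**2 - 6*x + 7*y + 8

def partial(f, a, b, j, h=1e-6):
    """Central-difference approximation of the j-th partial derivative at (a, b)."""
    if j == 1:
        return (f(a + h, b) - f(a - h, b)) / (2 * h)
    return (f(a, b + h) - f(a, b - h)) / (2 * h)

a, b = 2.0, 1.0
print(partial(f, a, b, 1), 2*a - 5*b - 6)    # both are approximately -7
print(partial(f, a, b, 2), -5*a + 6*b + 7)   # both are approximately  3
```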

6. Lesson 6

6.1. The entries of the derivative matrix

In the previous lesson we defined the partial derivatives. Using this concept we can determine the entries of the derivative matrix.

First we give another form of the partial derivative. Let $f \in \mathbb{R}^n \to \mathbb{R}^m$, $a = (a_1, \ldots, a_n) \in \mathrm{int}\, D_f$, $j \in \{1, \ldots, n\}$, and let $g_{a,j}$ be the auxiliary function defined in the previous section. Let $F = F_{a,j}$ be the following further auxiliary function:
$$F(t) := f(a + t e_j) \qquad (t \in \mathbb{R},\ a + t e_j \in D_f),$$
where $e_j$ denotes the j-th standard unit vector in $\mathbb{R}^n$. Then
$$F(t) = f(a + t e_j) = f(a_1 + t \cdot 0, \ldots, a_{j-1} + t \cdot 0,\ a_j + t \cdot 1,\ a_{j+1} + t \cdot 0, \ldots, a_n + t \cdot 0) = f(a_1, \ldots, a_{j-1},\ a_j + t,\ a_{j+1}, \ldots, a_n) = g_{a,j}(a_j + t),$$
and $F$ is defined for $|t| < r$, where $r$ denotes a radius for which $B(a, r) \subset D_f$. Using the Chain Rule we deduce that $F'(t) = g_{a,j}'(a_j + t) \cdot 1$, consequently
$$F'(0) = g_{a,j}'(a_j) = \partial_j f(a).$$
After these preliminaries we can discuss the columns of $f'(a)$.

6.1. Theorem Let $f \in \mathbb{R}^n \to \mathbb{R}^m$, $f \in D(a)$, $j \in \{1, \ldots, n\}$. Then $\partial_j f(a)$ exists and is identical with the j-th column of $f'(a)$.

Proof. $f \in D(a)$ implies that $\lim_{h \to 0} \frac{f(a + h) - f(a) - f'(a)h}{\|h\|} = 0$. Apply this relation with the vectors $h = t e_j$, where $e_j$ is the j-th standard unit vector and $t \in \mathbb{R}$, $t \to 0$. Then
$$0 = \lim_{t \to 0} \frac{\|f(a + t e_j) - f(a) - f'(a)(t e_j)\|}{\|t e_j\|} = \lim_{t \to 0} \frac{\|f(a + t e_j) - f(a) - t \cdot (\text{the j-th column of } f'(a))\|}{|t|}.$$
This means, by the definition of the one-variable derivative, that $F \in D(0)$ and that
$$\partial_j f(a) = F'(0) = \text{the j-th column of } f'(a). \qquad \square$$

The following theorem is about the rows of the derivative matrix.

6.2. Theorem Let $f \in \mathbb{R}^n \to \mathbb{R}^m$, $f = (f_1, \ldots, f_m)$ and $a \in \mathrm{int}\, D_f$. Then
$$f \in D(a) \iff f_i \in D(a) \quad (i = 1, \ldots, m).$$
In this case
$$f_i'(a) = \text{the i-th row of } f'(a) \qquad (i = 1, \ldots, m).$$

Proof. Let $A \in \mathbb{R}^{m \times n}$. Then the fact
$$\lim_{h \to 0} \frac{f(a + h) - f(a) - A h}{\|h\|} = 0$$
is equivalent to
$$\lim_{h \to 0} \frac{f_i(a + h) - f_i(a) - (Ah)_i}{\|h\|} = 0 \qquad (i = 1, \ldots, m)$$
(see: limit by coordinates). But $(Ah)_i = (\text{the i-th row of } A) \cdot h$, so the above relation is equivalent to
$$\lim_{h \to 0} \frac{f_i(a + h) - f_i(a) - (\text{the i-th row of } A) \cdot h}{\|h\|} = 0 \qquad (i = 1, \ldots, m).$$
Using these equivalences in both directions, the statement of the theorem follows immediately. $\square$

Using the two previous theorems we obtain
$$(f'(a))_{ij} = \text{the j-th entry of the i-th row} = \partial_j f_i(a).$$

6.3. Remark. The derivative matrix is
$$f'(a) = \begin{pmatrix} \partial_1 f_1(a) & \partial_2 f_1(a) & \ldots & \partial_n f_1(a) \\ \partial_1 f_2(a) & \partial_2 f_2(a) & \ldots & \partial_n f_2(a) \\ \vdots & & & \vdots \\ \partial_1 f_m(a) & \partial_2 f_m(a) & \ldots & \partial_n f_m(a) \end{pmatrix} \in \mathbb{R}^{m \times n}.$$

6.4. Definition Let $f \in \mathbb{R}^n \to \mathbb{R}$, $f \in D(a)$. The vector
$$\mathrm{grad}\, f(a) := \nabla f(a) := (\partial_1 f(a), \partial_2 f(a), \ldots, \partial_n f(a)) \in \mathbb{R}^n$$
is called the gradient vector (or simply: gradient) of $f$ at the point $a$. Obviously the gradient vector $\nabla f(a)$ is the vector representation of the derivative matrix $f'(a) \in \mathbb{R}^{1 \times n}$.

6.2. Connection between the derivatives and the partial derivatives

In the previous section we proved that the differentiability of a function at a point implies the existence of all the partial derivatives at this point. The converse statement is not true, as the following example shows.

6.5. Example Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the following function:
$$f(x, y) := \begin{cases} 1 & \text{if } xy = 0, \\ 0 & \text{if } xy \ne 0. \end{cases}$$
Then $\partial_1 f(0, 0) = \partial_2 f(0, 0) = 0$, but $f \notin C(0, 0)$, so $f \notin D(0, 0)$.

Under some additional assumptions the existence of the partial derivatives does imply differentiability, as stated without proof in the following theorem.

6.6. Theorem Let $f \in \mathbb{R}^n \to \mathbb{R}^m$, $a \in \mathrm{int}\, D_f$. Suppose that
1. $\exists r > 0 \ \forall x \in B(a, r) : \partial_j f(x)$ exists $(j = 1, \ldots, n)$, and
2. $\partial_j f \in C(a)$ $(j = 1, \ldots, n)$.
Then $f \in D(a)$.

6.3. Directional Derivatives

The partial derivatives can be regarded as the derivatives of the function restricted to a line through the point $a$ parallel with one of the coordinate axes. If we take a general direction instead of the directions of the coordinate axes, we obtain the concept of the directional derivative.

6.7. Definition Let $f \in \mathbb{R}^n \to \mathbb{R}^m$, $a \in \mathrm{int}\, D_f$, $e \in \mathbb{R}^n$, $\|e\| = 1$. Let $F = F_{a,e}$ be the following auxiliary function:
$$F(t) := f(a + te) \qquad (t \in \mathbb{R},\ a + te \in D_f).$$
Then the directional derivative of $f$ at the point $a$ along the direction $e$ is defined as
$$\partial_e f(a) := F'(0) = \left( \frac{d}{dt} f(a + te) \right)_{t=0}.$$
In many cases the directional derivative can be computed with the help of the derivative.

6.8. Theorem Let $f \in \mathbb{R}^n \to \mathbb{R}^m$, $f \in D(a)$. Then for any $e \in \mathbb{R}^n$, $\|e\| = 1$:
$$\partial_e f(a) = f'(a) \cdot e$$
in the sense of matrix–vector product.

Proof. Let $g(t) = a + te$ $(t \in \mathbb{R})$. Then $g'(t) = e$, and using the Chain Rule we obtain
$$\partial_e f(a) = \left( \frac{d}{dt} f(a + te) \right)_{t=0} = \big( f'(a + te) \cdot g'(t) \big)_{t=0} = \big( f'(a + te) \cdot e \big)_{t=0} = f'(a) \cdot e. \qquad \square$$

6.9. Remark. The direction is often given not by a unit vector but in some other way (see Homework 3). In these cases we have to determine a unit vector that points in the given direction.

6.4. Homeworks

1. Discuss the differentiability of the function (i.e. determine at which points of its domain it is differentiable, and what the derivatives at these points are):
$$f : \mathbb{R}^2 \to \mathbb{R}, \qquad f(x, y) = \sqrt[3]{xy}.$$

2. Prove that the following function is differentiable, and determine its derivative:
$$f : \mathbb{R}^2 \to \mathbb{R}^2, \qquad f(x, y) = \left( \operatorname{arctg} \frac{x}{x^2 + y^2 + 1},\ \cos(x^3 - 4xy) \right).$$

3. Determine the directional derivatives in the following case (a numerical sketch follows after this list):
$$f : \mathbb{R}^2 \to \mathbb{R}, \qquad f(x, y) = x^3 + xy - 2y^3$$
at the point $P_0(2, 1)$ along the following directions:
a) $v = (-2, 3)$;
b) $\alpha = 330°$;
c) the direction from $A(1, 0)$ to $B(4, 4)$.
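Theorem 6.8 can be checked numerically: the one-variable difference quotient of $t \mapsto f(a + te)$ at $t = 0$ should agree with $\nabla f(a) \cdot e$. The sketch below does this for Homework 3 a) with the exercise as reconstructed above, so the gradient formula $\nabla f = (3x^2 + y,\ x - 6y^2)$ is tied to that reconstruction and should be treated as part of the illustration.

```python
import numpy as np

def f(p: np.ndarray) -> float:
    x, y = p
    return x**3 + x*y - 2*y**3          # Homework 3 as reconstructed

a = np.array([2.0, 1.0])                # the point P0
v = np.array([-2.0, 3.0])
e = v / np.linalg.norm(v)               # unit vector of the given direction

h = 1e-6
numeric = (f(a + h*e) - f(a - h*e)) / (2*h)        # d/dt f(a + t e) at t = 0
grad = np.array([3*a[0]**2 + a[1], a[0] - 6*a[1]**2])
print(numeric, grad @ e)                # both are approximately -38/sqrt(13)
```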

7. Lesson 7

7.1. Higher order derivatives

In this section we study the higher order derivatives of functions of type $\mathbb{R}^n \to \mathbb{R}$. Similarly to the one-variable case, the second order derivative is defined as the derivative of the derivative function.

7.1. Definition Let $f \in \mathbb{R}^n \to \mathbb{R}$, $a \in \mathrm{int}\, D_f$. We say that $f$ is 2 times differentiable at $a$ (notation: $f \in D^2(a)$) if
$$\exists r > 0 \ \forall x \in B(a, r) : f \in D(x) \qquad \text{and} \qquad f' \in D(a).$$
The derivative function $f'$ can be regarded as a vector-valued function with the coordinate functions $\partial_j f$ $(j = 1, \ldots, n)$. So an equivalent definition of 2 times differentiability can be given as follows.

7.2. Definition Let $f \in \mathbb{R}^n \to \mathbb{R}$, $a \in \mathrm{int}\, D_f$. We say that $f$ is 2 times differentiable at $a$ if
$$\exists r > 0 \ \forall x \in B(a, r) : f \in D(x) \qquad \text{and} \qquad \partial_j f \in D(a) \quad (j = 1, \ldots, n).$$
Suppose that $f \in D^2(a)$. Since its derivative function is $f' = (\partial_1 f, \partial_2 f, \ldots, \partial_n f) : \mathbb{R}^n \to \mathbb{R}^n$, its second derivative is the derivative matrix of $f'$ at $a$:
$$f''(a) = (f')'(a) = \begin{pmatrix} \partial_1 \partial_1 f(a) & \partial_2 \partial_1 f(a) & \ldots & \partial_n \partial_1 f(a) \\ \partial_1 \partial_2 f(a) & \partial_2 \partial_2 f(a) & \ldots & \partial_n \partial_2 f(a) \\ \vdots & & & \vdots \\ \partial_1 \partial_n f(a) & \partial_2 \partial_n f(a) & \ldots & \partial_n \partial_n f(a) \end{pmatrix} \in \mathbb{R}^{n \times n}.$$
This matrix is called the Hesse matrix of $f$ at $a$. The ij-th entry of the Hesse matrix is
$$\big( f''(a) \big)_{ij} = \partial_j \partial_i f(a) \qquad (i, j = 1, \ldots, n).$$

7.3. Definition The entries of the Hesse matrix $f''(a)$ are called the second order partial derivatives of $f$ at $a$.

7.4. Remark. We have defined the second order partial derivatives only for functions that are 2 times differentiable at $a$. The concept of the second order partial derivative can be defined in a more general setting, but we will use it only for 2 times differentiable functions.
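As a concrete illustration (the function is our own choice, not an example from the notes), the Hesse matrix of a simple two-variable function can be assembled entry by entry from its second order partial derivatives; the symmetry visible here is explained by the Theorem of Young below:
$$f(x, y) = x^3 y + \ln y \ (y > 0), \qquad \partial_1 f = 3x^2 y, \quad \partial_2 f = x^3 + \frac{1}{y},$$
$$f''(x, y) = \begin{pmatrix} \partial_1 \partial_1 f & \partial_2 \partial_1 f \\ \partial_1 \partial_2 f & \partial_2 \partial_2 f \end{pmatrix} = \begin{pmatrix} 6xy & 3x^2 \\ 3x^2 & -\dfrac{1}{y^2} \end{pmatrix}.$$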

7.5. Theorem [Theorem of Young] If $f \in D^2(a)$, then the Hesse matrix $f''(a)$ is symmetric, that is,
$$\partial_j \partial_i f(a) = \partial_i \partial_j f(a) \qquad (i, j = 1, \ldots, n).$$

7.6. Definition Let $f \in \mathbb{R}^n \to \mathbb{R}$ and suppose that the set
$$D_{f''} := \{x \in \mathrm{int}\, D_f \mid f \in D^2(x)\}$$
is nonempty. Then the function
$$f'' : D_{f''} \to \mathbb{R}^{n \times n}, \qquad x \mapsto f''(x)$$
is called the second derivative function (or simply: the second derivative) of $f$. The functions
$$\partial_j \partial_i f : D_{f''} \to \mathbb{R}, \qquad x \mapsto \partial_j \partial_i f(x) \quad (i, j = 1, \ldots, n)$$
are called the second order partial derivative functions (or simply: the second order partial derivatives) of $f$.

If we want to define the 3 times differentiability of $f$ at a point $a \in \mathrm{int}\, D_f$ (denoted by $f \in D^3(a)$), then we have to suppose that $f''$ is differentiable at $a$. Since the coordinate functions of $f''$ are the second order partial derivatives,
$$f \in D^3(a) \iff f'' \in D(a) \iff \partial_j \partial_i f \in D(a) \quad (i, j = 1, \ldots, n).$$

7.7. Definition Suppose that $f \in D^3(a)$. Then the numbers $\partial_k \partial_j \partial_i f(a)$ $(i, j, k = 1, \ldots, n)$ are called the 3-rd order partial derivatives of $f$ at $a$. The 3-array with entries
$$\big( f'''(a) \big)_{ijk} = \partial_k \partial_j \partial_i f(a) \qquad (i, j, k = 1, \ldots, n)$$
is called the 3-rd order derivative of $f$ at $a$.

In a similar way, recursively, we can define the k-th derivative and the k-th order partial derivatives for $k = 4, 5, \ldots$. Thus these concepts are defined for every $k \in \mathbb{N}$. We agree that the 0-th derivative is the function itself.

Some notations:
- $f \in D^k(a)$: $f$ is $k$ times differentiable at $a$;
- $f^{(k)}(a)$: the k-th derivative of $f$ at $a$;
- $\partial_{j_k} \partial_{j_{k-1}} \cdots \partial_{j_2} \partial_{j_1} f(a)$ $(j_1, \ldots, j_k = 1, \ldots, n)$: the k-th order partial derivatives of $f$ at $a$; their number is $n^k$.

So $f^{(k)}(a) \in \mathbb{R}^{n \times n \times \ldots \times n} = \mathbb{R}^{n_k}$ is a k-array with the entries
$$\big( f^{(k)}(a) \big)_{j_1, \ldots, j_k} = \partial_{j_k} \partial_{j_{k-1}} \cdots \partial_{j_2} \partial_{j_1} f(a) \qquad (j_1, \ldots, j_k = 1, \ldots, n). \tag{7.1}$$
Applying the Theorem of Young several times we obtain the following theorem.

7.8. Theorem If $f \in D^k(a)$, then the k-array $f^{(k)}(a)$ is symmetric in the following sense. Let $j_1, \ldots, j_k \in \{1, \ldots, n\}$ and let the finite sequence $p_1, \ldots, p_k$ be a permutation (possibly a permutation with repetition) of the finite sequence $j_1, \ldots, j_k$. Then
$$\big( f^{(k)}(a) \big)_{p_1, \ldots, p_k} = \big( f^{(k)}(a) \big)_{j_1, \ldots, j_k},$$
that is,
$$\partial_{p_k} \partial_{p_{k-1}} \cdots \partial_{p_2} \partial_{p_1} f(a) = \partial_{j_k} \partial_{j_{k-1}} \cdots \partial_{j_2} \partial_{j_1} f(a).$$

7.9. Definition Let $f \in \mathbb{R}^n \to \mathbb{R}$, $a \in \mathrm{int}\, D_f$. We say that $f$ is $k$ times continuously differentiable at $a$ (notation: $f \in C^k(a)$) if
$$\exists r > 0 \ \forall x \in B(a, r) : f \in D^k(x) \qquad \text{and} \qquad f^{(k)} \in C(a).$$
Naturally, the definition is equivalent to the continuity of the k-th order partial derivative functions at $a$.

7.2. Taylor's Formula

7.10. Definition (Taylor's polynomial) Let $m \in \mathbb{N} \cup \{0\}$, $f \in \mathbb{R}^n \to \mathbb{R}$, $f \in D^m(a)$. The n-variable polynomial
$$T_m(x) := f(a) + \frac{f'(a)(x - a)}{1!} + \frac{f''(a)(x - a)^2}{2!} + \ldots + \frac{f^{(m)}(a)(x - a)^m}{m!} = f(a) + \sum_{k=1}^{m} \frac{f^{(k)}(a)(x - a)^k}{k!} \qquad (x \in \mathbb{R}^n)$$
is called the m-th Taylor polynomial of $f$ with center $a$.

7.11. Remarks.
1. For the meaning of the terms of the above sum we remind the Reader of the definition of the symbol $Ax^k$, where $A$ is a k-array and $x$ is a vector (see Definition 1.7 in Lesson 1).
2. It is obvious that the degree of $T_m$ is at most $m$ and that $T_m(a) = f(a)$.

In what follows we use the abbreviation $h = x - a$. To approximate $f$ with the help of its Taylor polynomials and to prove the Taylor Formula, we need the concept of the line segment in $\mathbb{R}^n$ and a theorem about the k-th derivative of an auxiliary function.

7.12. Definition Let $a \in \mathbb{R}^n$, $h \in \mathbb{R}^n \setminus \{0\}$. The set
$$[a, a + h] := \{a + th \in \mathbb{R}^n \mid 0 \le t \le 1\} \subset \mathbb{R}^n$$
is called a closed line segment in $\mathbb{R}^n$; $a$ is the starting point and $a + h$ is the terminal point of the line segment.

7.13. Theorem Let $f \in \mathbb{R}^n \to \mathbb{R}$, $k \in \mathbb{N} \cup \{0\}$. Let the closed line segment $[a, a + h]$ be a subset of $\mathrm{int}\, D_f$ and suppose that $f$ is $k$ times differentiable at every point of $[a, a + h]$. Let
$$F \in \mathbb{R} \to \mathbb{R}, \qquad F(t) := f(a + th) \quad (t \in \mathbb{R},\ a + th \in D_f).$$
Then $[0, 1] \subset \mathrm{int}\, D_F$, $F$ is $k$ times differentiable at every point of the closed interval $[0, 1]$, and
$$F^{(k)}(t) = f^{(k)}(a + th)\, h^k \qquad (t \in [0, 1]) \tag{7.2}$$
in the sense of the symbol $Ax^k$ (see Definition 1.7 in Lesson 1).

Proof. The precise proof of this formula requires mathematical induction. For simplicity we prove it only for the cases $k = 1$ and $k = 2$; from these two cases one can see the general induction step.

Proof of (7.2) in the case $k = 1$: $F$ is the composition of $f$ and $t \mapsto a + th$. Using the Chain Rule we obtain
$$F'(t) = f'(a + th) \cdot h = \sum_{j=1}^{n} \partial_j f(a + th)\, h_j = f'(a + th)\, h^1.$$

Proof of (7.2) in the case $k = 2$ (using that it is true for $k = 1$): $F'$ is an n-term sum of functions $t \mapsto \partial_j f(a + th)\, h_j$. The j-th term is a scalar multiple of the composition of $\partial_j f$ and $t \mapsto a + th$. Applying the previous result (case $k = 1$) to $\partial_j f$ instead of $f$ we obtain
$$F''(t) = (F'(t))' = \frac{d}{dt} \sum_{j=1}^{n} \partial_j f(a + th)\, h_j = \sum_{j=1}^{n} h_j \frac{d}{dt}\big( \partial_j f(a + th) \big) = \sum_{j=1}^{n} h_j \sum_{i=1}^{n} \partial_i \partial_j f(a + th)\, h_i = \sum_{i,j=1}^{n} \partial_i \partial_j f(a + th)\, h_i h_j = f''(a + th)\, h^2. \qquad \square$$

7.14. Theorem [Taylor's formula] Let $f \in \mathbb{R}^n \to \mathbb{R}$, $a \in D_f$, $h \in \mathbb{R}^n$, $h \ne 0$, $m \in \mathbb{N} \cup \{0\}$. Suppose that $f$ is $m + 1$ times differentiable at every point of the closed line segment
$$[a, a + h] := \{a + th \in \mathbb{R}^n \mid 0 \le t \le 1\}.$$
(Remember that this requires $[a, a + h] \subset \mathrm{int}\, D_f$.) Then there exists $\vartheta \in \mathbb{R}$, $0 < \vartheta < 1$, such that
$$f(a + h) - T_m(a + h) = \frac{f^{(m+1)}(a + \vartheta h)\, h^{m+1}}{(m + 1)!},$$
that is,
$$f(a + h) = f(a) + \sum_{k=1}^{m} \frac{f^{(k)}(a)\, h^k}{k!} + \frac{f^{(m+1)}(a + \vartheta h)\, h^{m+1}}{(m + 1)!}. \tag{7.3}$$

Proof. Let us define the auxiliary function
$$F \in \mathbb{R} \to \mathbb{R}, \qquad F(t) := f(a + th) \quad (t \in \mathbb{R},\ a + th \in D_f).$$
Then by the previous theorem $F$ satisfies the assumptions of the one-variable Taylor Formula (see Analysis-2) with center 0. Applying the one-variable Taylor Formula to approximate $F(1)$ we have: there exists $\vartheta \in (0, 1)$ such that
$$F(1) = \sum_{k=0}^{m} \frac{F^{(k)}(0)}{k!} (1 - 0)^k + \frac{F^{(m+1)}(\vartheta)}{(m + 1)!} (1 - 0)^{m+1} = F(0) + \sum_{k=1}^{m} \frac{F^{(k)}(0)}{k!} + \frac{F^{(m+1)}(\vartheta)}{(m + 1)!}. \tag{7.4}$$
Using $F(1) = f(a + h)$, $F(0) = f(a)$ and the result of the previous theorem for $F^{(k)}$ we have
$$F^{(k)}(0) = f^{(k)}(a + 0 \cdot h)\, h^k = f^{(k)}(a)\, h^k \quad (k = 1, \ldots, m) \qquad \text{and} \qquad F^{(m+1)}(\vartheta) = f^{(m+1)}(a + \vartheta h)\, h^{m+1}.$$
Substituting these into (7.4) we obtain the statement of the theorem. $\square$
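A quick numerical illustration of formula (7.3) with $m = 2$ (the function and the point are our own choices): the second Taylor polynomial $T_2(a + h) = f(a) + \nabla f(a) \cdot h + \frac{1}{2}\, h^{\mathsf T} f''(a)\, h$ approximates $f(a + h)$, and as $h$ is scaled down the error shrinks roughly like $\|h\|^3$, as the remainder term predicts.

```python
import numpy as np

def f(p: np.ndarray) -> float:
    x, y = p
    return np.exp(x) * np.sin(y)        # an illustrative smooth function

def grad(p):
    x, y = p
    return np.array([np.exp(x) * np.sin(y), np.exp(x) * np.cos(y)])

def hesse(p):
    x, y = p
    return np.array([[np.exp(x) * np.sin(y),  np.exp(x) * np.cos(y)],
                     [np.exp(x) * np.cos(y), -np.exp(x) * np.sin(y)]])

a = np.array([0.0, 1.0])
for s in (0.1, 0.01):
    h = s * np.array([1.0, -2.0])
    T2 = f(a) + grad(a) @ h + 0.5 * h @ hesse(a) @ h   # 2nd Taylor polynomial
    print(s, abs(f(a + h) - T2))        # error decreases roughly like s**3
```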