An Introduction to Non-Standard Analysis and its Applications
Kevin O'Neill
March 6, 2014

1 Basic Tools

1.1 A Shortest Possible History of Analysis

When Newton and Leibniz practiced calculus, they used infinitesimals, which were supposed to be like real numbers, yet smaller in magnitude than any positive real number. In the 19th century, mathematicians realized they could not justify the use of infinitesimals according to their sense of rigor, so they began using definitions involving $\epsilon$'s and $\delta$'s. Then, in the early 1960s, the logician Abraham Robinson figured out a way to rigorously define infinitesimals, creating a subject now known as nonstandard analysis.

1.2 The Hyperreals

To motivate our construction of the hyperreals, consider the following sequence:
\[ 1, \tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{4}, \ldots \]
Under the standard definitions of analysis, we say this sequence has limit $0$. But no entry of the sequence is $0$, so rather than defining what it means to take a limit, we want to represent the sequence itself by an infinitesimal. This makes sense because given any positive real number $r$, the sequence eventually becomes less than $r$.

Now, let $\mathbb{R}^{\mathbb{N}}$ be the set of infinite sequences of real numbers, and identifying $r \in \mathbb{R}$ with $(r, r, r, \ldots) \in \mathbb{R}^{\mathbb{N}}$, let us try to turn $\mathbb{R}^{\mathbb{N}}$ into a field that respects the operations of $\mathbb{R}$. Doing so, we will add infinitesimals to the reals (and infinite numbers through division). However, we quickly run into an issue when we compute the product
\[ (1, 0, 1, 0, 1, 0, \ldots) \cdot (0, 1, 0, 1, 0, 1, \ldots) = (0, 0, 0, 0, \ldots). \]
If $\mathbb{R}^{\mathbb{N}}$ is to become a field, then it must have no zero divisors, so one of the two sequences on the left should be equal to zero. The idea now is to set up an equivalence relation on $\mathbb{R}^{\mathbb{N}}$ that is large enough to make a sequence with half nonzero entries equal
to $0$, yet small enough that we may still perform basic operations in a coordinate-wise manner. The introduction of ultrafilters makes this possible.

Definition 1. Let $I$ be a nonempty set. Then a nonempty collection $\mathcal{F} \subseteq \mathcal{P}(I)$ is a filter if the following hold:
1. If $A, B \in \mathcal{F}$, then $A \cap B \in \mathcal{F}$.
2. If $A \in \mathcal{F}$ and $A \subseteq B \subseteq I$, then $B \in \mathcal{F}$.
We say $\mathcal{F}$ is an ultrafilter if it is proper (that is, $\emptyset \notin \mathcal{F}$) and for any $A \subseteq I$, either $A \in \mathcal{F}$ or $A^c \in \mathcal{F}$. Additionally, an ultrafilter is principal if it is of the form $\{A \subseteq I : i \in A\}$ for some $i \in I$, and nonprincipal otherwise.

We will particularly be interested in nonprincipal ultrafilters, though we have introduced the intermediate definitions as well, since those objects do appear elsewhere in mathematics. The idea behind these definitions is that we want to equate two sequences that agree on a large subset of $\mathbb{N}$. To make this an equivalence relation, we require the intersection of two large sets to be large, and obviously any set containing a large set should be large as well. Thus the collection of large subsets should form a filter. One of the two sequences above whose product is $0$ must be zero itself, so we need either the odd or the even numbers to be large, which is why we use an ultrafilter. And lastly, we don't want sequences to be equivalent if and only if they agree on, say, the first coordinate, so we require the ultrafilter to be nonprincipal. Another way of thinking about nonprincipal ultrafilters is as a counterexample to Arrow's Impossibility Theorem for an infinite voting set.

Proposition 2. Every infinite set has a nonprincipal ultrafilter on it.

Proof. Use Zorn's Lemma on the proper filters containing the filter of cofinite sets. A maximal such filter is an ultrafilter, and it is nonprincipal because it contains every cofinite set.

Now we are finally ready to define the hyperreals.

Definition 3. Let $\mathcal{F}$ be a nonprincipal ultrafilter on $\mathbb{N}$. Define the hyperreals to be the set ${}^*\mathbb{R} = \mathbb{R}^{\mathbb{N}} / \sim$, where $(r_n) \sim (s_n)$ if $\{n : r_n = s_n\} \in \mathcal{F}$. We denote the equivalence classes of these sequences by $[r_n]$.

You may notice that we have only specified one nonprincipal ultrafilter out of many and be wondering what happens if we choose a different filter.
It turns out not to matter for our purposes, since we will mostly be transferring results back to $\mathbb{R}$ anyway, but now would be a good time to mention the following fact:

Fun Fact 4. Under the Continuum Hypothesis, all possible constructions of ${}^*\mathbb{R}$ as above are isomorphic as ordered fields.

From here on, we will assume the construction of a single ${}^*\mathbb{R}$. As an exercise, one may check that the operations $+$, $\cdot$, and the relation $<$ are well-defined in ${}^*\mathbb{R}$.
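The requirement in Proposition 2 that the set be infinite can be seen by brute force on a small finite set, where every ultrafilter turns out to be principal. The following Python sketch enumerates all families of subsets of $\{0, 1, 2\}$ and keeps those satisfying Definition 1. The function names (`powerset`, `is_filter`, `is_ultrafilter`, `principal`, `all_ultrafilters`) are hypothetical helpers introduced only for this illustration, and the check assumes filters are nonempty and ultrafilters do not contain the empty set.

```python
from itertools import combinations


def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_filter(family, universe):
    """Nonempty, closed under pairwise intersection, and upward closed."""
    if not family:
        return False
    subsets = powerset(universe)
    for a in family:
        for b in family:
            if (a & b) not in family:
                return False
        for b in subsets:
            if a <= b and b not in family:
                return False
    return True


def is_ultrafilter(family, universe):
    """Proper filter containing exactly one of A and its complement."""
    whole = frozenset(universe)
    if frozenset() in family or not is_filter(family, universe):
        return False
    return all((a in family) != ((whole - a) in family)
               for a in powerset(universe))


def principal(i, universe):
    """The principal ultrafilter {A : i in A}."""
    return frozenset(a for a in powerset(universe) if i in a)


def all_ultrafilters(universe):
    """Brute-force search over every family of subsets (tiny sets only)."""
    subsets = powerset(universe)
    found = []
    for mask in range(1 << len(subsets)):
        family = frozenset(s for k, s in enumerate(subsets)
                           if (mask >> k) & 1)
        if is_ultrafilter(family, universe):
            found.append(family)
    return found
```

Running `all_ultrafilters({0, 1, 2})` and comparing the result with `principal(i, {0, 1, 2})` for each `i` shows that the two collections coincide. The search is exponential in the number of subsets, so it is only meant for very small sets.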
1.3 More *'s

In our study of ${}^*\mathbb{R}$, it will be helpful to be able to repeat the above construction to transfer more objects from the standard setting to the non-standard.

Definition 5. Let $(A_n)$ be a sequence of subsets of $\mathbb{R}$. Then define $[A_n] \subseteq {}^*\mathbb{R}$ by $[r_n] \in [A_n]$ if and only if $\{n : r_n \in A_n\} \in \mathcal{F}$. Any set obtained in this manner is said to be internal. As a special case of this construction, ${}^*A = [A]$, using the constant sequence $A_n = A$. In particular, ${}^*\mathbb{N}$ will be called the hypernaturals. When each of the $A_n$ is finite, we call $[A_n]$ hyperfinite. $[A_n]$ is said to have hyperfinite cardinality $[|A_n|] \in {}^*\mathbb{N}$, where $|A_n|$ is the usual cardinality of $A_n$.

Hypernaturals will be useful in defining integrals as limits of finite sums, which we interpret in the nonstandard setting as hyperfinite sums. Another useful property of the hypernaturals is called Internal Induction, which states that any internal subset of ${}^*\mathbb{N}$ that contains $1$ and is closed under the successor function must be equal to ${}^*\mathbb{N}$. In this article, we will not make use of this principle or frequently refer to internal sets, but the reader may like to know that both are very useful in the nonstandard setting.

Also, it is worth noting that infinite hypernaturals can be useful in physical modelling. Rather than approximate a very large number of particles with continuum many, it may be beneficial to model a hypernatural number of them, since hypernaturals manage to retain certain properties of finite numbers that the continuum doesn't.

Lastly, we will also want to transfer functions and relations from $\mathbb{R}$ to ${}^*\mathbb{R}$. To transfer a function $f : A \to \mathbb{R}$, we let ${}^*f([r_n]) = [f(r_n)]$, where ${}^*f : {}^*A \to {}^*\mathbb{R}$, and say that a relation ${}^*R([r_{1n}], \ldots, [r_{kn}])$ holds if and only if $\{n : R(r_{1n}, \ldots, r_{kn})\} \in \mathcal{F}$.

1.4 Infinitesimal Arithmetic

In this section, we will show that infinitesimals behave in the ways we would intuitively expect them to, as well as make a couple of definitions that will be useful later in this article.

Definition 6. A hyperreal $b$ is an infinitesimal if $|b| < r$ for all $r \in \mathbb{R}^+$.
$b$ is limited if $|b| < M$ for some $M \in \mathbb{R}$.

Example 7. Let $\epsilon \in \mathbb{R}^+$. Then
\[ [1/n] = [(1, 1/2, 1/3, 1/4, \ldots)] < [(\epsilon, \epsilon, \ldots)] = [\epsilon] \]
because there exists $N \in \mathbb{N}$ such that $n > N$ implies $1/n < \epsilon$. Thus, $1/n$ is less than $\epsilon$ on the tail of the sequence. The tail is cofinite, hence a large set in our nonprincipal ultrafilter $\mathcal{F}$, so $[1/n] < [\epsilon]$. Since $\epsilon > 0$ was an arbitrary real, $[1/n]$ is an infinitesimal.

Proposition 8. Define $b \simeq c$ for $b, c \in {}^*\mathbb{R}$ if $b - c$ is infinitesimal. Then $\simeq$ is an equivalence relation. Additionally, every limited hyperreal $b$ is equivalent to some real number, which we call the shadow of $b$, denoted $\mathrm{sh}(b)$.

Proof. The relation is clearly reflexive and symmetric, so suppose $b \simeq c$ and $c \simeq d$, and let $r \in \mathbb{R}^+$. Then
\[ |b - d| \le |b - c| + |c - d| < r/2 + r/2 = r, \]
so $\simeq$ is also transitive, hence an equivalence relation.
If $b$ is a limited hyperreal, then the set $A = \{r \in \mathbb{R} : r < b\}$ is nonempty and bounded above, so by the completeness of $\mathbb{R}$, $A$ has a least upper bound $c \in \mathbb{R}$. We claim $b \simeq c$. Let $\epsilon > 0$ be real. Then, since $c$ is an upper bound of $A$, we have $c + \epsilon \notin A$, so $b \le c + \epsilon$. Also, if $b \le c - \epsilon$, then $c - \epsilon$ would be an upper bound for $A$, contradicting the choice of $c$, so $b > c - \epsilon$, meaning $|b - c| \le \epsilon$. Since $\epsilon$ was arbitrary, $b \simeq c$.

1.5 Transfer Principle

To motivate the Transfer Principle on the level of first-order logic, let us observe the following. In the standard setting, the rationals are dense in the reals, which may be expressed
\[ \forall x, y \in \mathbb{R}\,(x < y \to \exists q \in \mathbb{Q}\,(x < q < y)). \]
If we put a * by each set above, we get the following statement
\[ \forall x, y \in {}^*\mathbb{R}\,(x < y \to \exists q \in {}^*\mathbb{Q}\,(x < q < y)), \]
which is also true. (To see this, consider $[x_n] < [y_n]$, taking $x_n < y_n$ for every $n$ without loss of generality via our ultrafilter construction. By the density of the rationals in the reals, choose $q_n \in \mathbb{Q}$ with $x_n < q_n < y_n$ for each $n$ and take $q = [q_n]$.) It turns out this is not just a coincidence, but holds more generally, provided we take the appropriate setup.

To do this, we work in the setting of the language of a relational structure. Rather than go into a complete description of what this means, let's just say for now that we will use logical sentences constructed from sets, their elements, relations, functions, and logical connectives. Given a sentence $\varphi$, we may take its *-transform ${}^*\varphi$ by replacing a function $f$ with ${}^*f$, replacing a relation $R$ with ${}^*R$, and replacing any bound $P$ occurring in a quantifier $\forall x \in P$ or $\exists x \in P$ by ${}^*P$.

The Transfer Principle states that if we are working over the language of $\mathbb{R}$, then a sentence $\varphi$ is true if and only if ${}^*\varphi$ is true. The proof of this principle is via Łoś's Theorem, which is essentially a proof by induction that builds up the transfer of a sentence from each of its components. We will not prove Łoś's Theorem, but hopefully the reader can get a sense of why it is true from the example above.
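The coordinate-wise choice of $q_n$ in the density argument can be sketched in Python. This is only an illustration under stated assumptions: floats stand in for the real coordinates $x_n$ and $y_n$, a finite list stands in for an infinite sequence, and the names `rational_between` and `witness_sequence` are hypothetical helpers, not notation from the text. The witness is built by the usual Archimedean argument: pick a denominator $d$ with $1/d < y - x$ and take the first multiple of $1/d$ strictly above $x$.

```python
from fractions import Fraction
import math


def rational_between(x, y):
    """Return a Fraction q with x < q < y, assuming x < y."""
    lo, hi = Fraction(x), Fraction(y)    # exact values of the inputs
    if not lo < hi:
        raise ValueError("need x < y")
    d = math.floor(1 / (hi - lo)) + 1    # d > 1/(hi - lo), so 1/d < hi - lo
    # First multiple of 1/d strictly above lo; it is at most lo + 1/d < hi.
    return Fraction(math.floor(lo * d) + 1, d)


def witness_sequence(xs, ys):
    """Choose q_n between x_n and y_n in every coordinate."""
    return [rational_between(x, y) for x, y in zip(xs, ys)]
```

For instance, applying `witness_sequence` to the first few terms of $x_n = \sqrt{2}/n$ and $y_n = \sqrt{3}/n$ gives one rational per coordinate, and the class of that sequence plays the role of the hyperrational $q$ in the argument.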
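To see another example, let us look at the Archimedean property of the reals:
\[ \forall x \in \mathbb{R}^+\,(\exists n \in \mathbb{N})(nx > 1). \]
The transfer of this statement is
\[ \forall x \in {}^*\mathbb{R}^+\,(\exists n \in {}^*\mathbb{N})(nx > {}^*1). \]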
This second statement is true, yet it is not equivalent to the Archimedean property. In fact, ${}^*\mathbb{R}$ is not Archimedean in the standard sense: no finite number of repeated additions of $[1/n]$ will ever bring it above $1$. The lesson here is that while some properties transfer very naturally between $\mathbb{R}$ and ${}^*\mathbb{R}$, there are cases where the non-standard analogue may not be intuitive and/or useful. However, the Transfer Principle still says that it doesn't really matter which setting we work in, the standard or the non-standard. Either is a suitable setting for analysis to be done. Since standard analysis is what people have been using for recent history, we often interpret the Transfer Principle as a tool for proving standard results in a non-standard setting. We will see examples of this in later sections.

Our last remark on the Transfer Principle is that we have only stated it for first-order languages, which means that we may only consider elements as variables, and certain results about sets will not transfer. There is a stronger version of the Transfer Principle which allows us to make some transfers of higher-order objects, but its use still requires a lot of caution.

2 Working in the Non-Standard Setting

2.1 Non-Standard Definitions

Since non-standard analysis is essentially equivalent to standard analysis, we should be able to find non-standard definitions of standard concepts. The following theorem provides an example:

Theorem 9. The function $f : (a, b) \to \mathbb{R}$ is continuous at a point $c \in (a, b)$ if and only if $x \simeq c$ implies ${}^*f(x) \simeq {}^*f(c)$ for all $x \in {}^*\mathbb{R}$.

Proof. First suppose $f$ is continuous at $c$ and let $\epsilon > 0$. Then, there is a positive $\delta \in \mathbb{R}$ such that
\[ (\forall x \in \mathbb{R})(|x - c| < \delta \to |f(x) - f(c)| < \epsilon), \]
which by the Transfer Principle means
\[ (\forall x \in {}^*\mathbb{R})(|x - c| < \delta \to |{}^*f(x) - {}^*f(c)| < \epsilon). \]
If $x \simeq c$, then $|x - c| < \delta$, so $|{}^*f(x) - {}^*f(c)| < \epsilon$. Since $\epsilon > 0$ was arbitrary, ${}^*f(x) \simeq {}^*f(c)$.

Now suppose $x \simeq c$ implies ${}^*f(x) \simeq {}^*f(c)$ and let $\epsilon > 0$. Then, in particular, we may choose $\delta \in {}^*\mathbb{R}^+$ (any positive infinitesimal will do) such that $|x - c| < \delta$ implies ${}^*f(x) \simeq {}^*f(c)$, which in turn implies $|{}^*f(x) - {}^*f(c)| < \epsilon$.
Thus, we have the following statement:
\[ (\exists \delta \in {}^*\mathbb{R}^+)(\forall x \in {}^*\mathbb{R})(|x - c| < \delta \to |{}^*f(x) - {}^*f(c)| < \epsilon), \]
which under the Transfer Principle implies
\[ (\exists \delta \in \mathbb{R}^+)(\forall x \in \mathbb{R})(|x - c| < \delta \to |f(x) - f(c)| < \epsilon), \]
so $f$ is continuous at $c$.

Rather than prove the following equivalences, we simply leave the reader with the definitions.

Definition 10.
1. A real-valued sequence $(s_n)$ converges to $L \in \mathbb{R}$ if $s_N \simeq L$ for all $N \in {}^*\mathbb{N} \setminus \mathbb{N}$.
2. $L \in \mathbb{R}$ is a cluster point of $(s_n)$ if $s_N \simeq L$ for some $N \in {}^*\mathbb{N} \setminus \mathbb{N}$.
3. If $A \subseteq \mathbb{R}$, then $f$ is uniformly continuous on $A$ if $x \simeq y$ implies ${}^*f(x) \simeq {}^*f(y)$ for all $x, y \in {}^*A$.
4. The derivative of $f$ at $x$ is $L$ if $\frac{{}^*f(x + \epsilon) - {}^*f(x)}{\epsilon} \simeq L$ for all nonzero infinitesimals $\epsilon$.

2.2 Integration via Hyperfinite Sums

Suppose $f : [a, b] \to \mathbb{R}$ is an integrable function, in the standard sense. Then, classically, we know that $\int_a^b f\,dx$ may be computed as the limit of sums over partitions. In particular, if we let $\{a = x_0, x_1, x_2, \ldots, x_{n-1}, x_n = b\}$ be a uniform partition of $[a, b]$ and define $S_n = \sum_{k=1}^{n} f(x_k)\,\Delta x$, where $\Delta x = (b - a)/n$, then $\lim_{n \to \infty} S_n = \int_a^b f\,dx$.

We may view $S_n$ as a function from $\mathbb{N}$ to $\mathbb{R}$ (i.e., a sequence), so $S_N$ is defined for any hypernatural $N$. By the non-standard definition of sequence convergence, $S_N \simeq \int_a^b f\,dx$ for all $N \in {}^*\mathbb{N} \setminus \mathbb{N}$, or $\int_a^b f\,dx = \mathrm{sh}(S_N)$ for any such $N$. We give $S_N$ the name hyperfinite sum, as though we were summing over the partition $\{a, a + (b - a)/N, \ldots, b - (b - a)/N\}$. This is just a formality, but writing
\[ S_N = \sum_{i=1}^{N} {}^*f(x_i)\,(b - a)/N \]
in some sense still describes what we are doing, and soon we will use this notation to describe more complicated phenomena.

2.3 Peano's Theorem

To see hyperfinite summation take a greater role, let us now prove Peano's Existence Theorem, a fundamental result establishing the existence of a solution to particular differential equations.

Theorem 11. Let $f : \mathbb{R} \times [0, 1] \to \mathbb{R}$ be a bounded, continuous function. Then, there is a solution to the initial value problem $y'(t) = f(y(t), t)$ with $y(0) = y_0$ for any $y_0 \in \mathbb{R}$.
Proof. Let $N = [N_n]$ be an unlimited hypernatural and let $T = \{0, 1/N, 2/N, \ldots, 1\}$. Define each $Y_n$ inductively on the points of $\{0, 1/N_n, \ldots, 1\}$ as follows:
\[ Y_n(k/N_n) = y_0 + \sum_{i=0}^{k-1} f(Y_n(i/N_n), i/N_n)\,\frac{1}{N_n}. \]
This allows us to define a function $Y$ on $T$ by $Y(K/N) = [Y_n(K_n/N_n)]$, where $K = [K_n]$. As a hyperfinite sum, we write:
\[ Y(K/N) = y_0 + \sum_{i=0}^{K-1} {}^*f(Y(i/N), i/N)\,\frac{1}{N}. \]
Now let $y(t) = \mathrm{sh}(Y(\underline{t}))$, where $\underline{t}$ is the member of $T$ to the immediate left of $t$. Since $f$ is bounded by some $M \in \mathbb{R}$, we have $|Y(t) - Y(s)| \le M|t - s|$ for all $s, t \in T$, so $Y$ is continuous, and therefore, so is $y$. By this continuity, we may write
\[ y(t) \simeq y_0 + \sum_{i=0}^{N\underline{t}} {}^*f({}^*y(i/N), i/N)\,\frac{1}{N}. \]
But the right hand side is infinitely close to $y_0 + \int_0^t f(y(s), s)\,ds$ by our description of integration as a hyperfinite sum, and differentiating both sides of $y(t) = y_0 + \int_0^t f(y(s), s)\,ds$ shows that $y(t)$ is a solution to the initial-value problem.

3 Bigger Applications

3.1 Loeb Measure

In Section 1.3, we defined internal sets, which essentially are those arising from a sequence $(A_n)$ of subsets of $\mathbb{R}$. While we will not prove much here, internal sets turn out to be the nice sets of non-standard analysis and allow us to define measures on certain subsets of ${}^*\mathbb{R}$. Any thorough introduction to the subject will mention internal sets much more frequently. We skip to the following result:

Proposition 12. The collection of internal subsets of ${}^*\mathbb{R}$ forms an algebra (meaning it is closed under finite intersections, finite unions, and complements).

The proof of this is rather straightforward and actually follows from the fact that for general sequences $(A_n)$, $(B_n)$ of subsets of $\mathbb{R}$, we have $[A_n] \cap [B_n] = [A_n \cap B_n]$, $[A_n] \cup [B_n] = [A_n \cup B_n]$, and $[A_n]^c = [A_n^c]$. However, the next result, which relates to the concept of countable saturation, is a little less obvious:

Theorem 13. If the sets $X_n$ are internal and $\bigcup_{n \in \mathbb{N}} X_n$ is internal, then it is equal to $\bigcup_{n \le k} X_n$ for some $k \in \mathbb{N}$.
In particular, this says that the collection of internal subsets of ${}^*\mathbb{R}$ is not a $\sigma$-algebra, meaning it is not closed under countable unions. Those of you familiar with measure theory
will remember that $\sigma$-algebras are the key to creating measures. However, there is a possible fix which actually makes use of this fact to help us construct a measure.

Let $S$ be a hyperfinite set and let $\mathcal{P}_I(S)$ be the collection of internal subsets of $S$. As noted above, $\mathcal{P}_I(S)$ is an algebra, yet not a $\sigma$-algebra. For $A \in \mathcal{P}_I(S)$, let $\mu(A) = |A| / |S|$, where $|\cdot|$ refers to hyperfinite cardinality. Showing that $\mu$ is finitely additive is fairly straightforward, but we claim that $\mu$ is actually countably additive, hence it forms a measure. To see this, recall that a countably additive function $\nu : \mathcal{M} \to {}^*\mathbb{R}$ only has to satisfy $\nu\left(\bigcup_{n \in \mathbb{N}} A_n\right) = \sum_{n=1}^{\infty} \nu(A_n)$ for disjoint $A_n$ when $\bigcup_{n \in \mathbb{N}} A_n \in \mathcal{M}$. But for internal sets, the union is internal only when it reduces to a finite union, and this case is already taken care of by finite additivity.

Next, we modify our function to have real image, since our goal is to construct a measure, which by definition takes on real values. So let
\[ \mu_L(A) = \begin{cases} \mathrm{sh}(\mu(A)) & \text{if } \mu(A) \text{ is limited,} \\ \infty & \text{if } \mu(A) \text{ is unlimited.} \end{cases} \]
This gives us a premeasure on $\mathcal{P}_I(S)$, which under standard constructions gives us a measure on the $\sigma$-algebra of all $\mu_L$-measurable subsets of $S$ (some of which will not be internal). The final result is known as Loeb measure. It can be shown that in special cases, Loeb measure reduces to Lebesgue measure, and that $\mu_L$ is regular in the sense that any $\mu_L$-measurable set may be approximated arbitrarily closely by internal sets.

3.2 Brownian Motion

Following Lindstrøm, we make the following definition:

Definition 14. A one-dimensional Brownian motion is a stochastic process $b : \Omega \times [0, \infty) \to \mathbb{R}$ such that $b(\omega, 0) = 0$ for all $\omega$ and
(i) If $s_1 < t_1 \le s_2 < t_2 \le \ldots \le s_n < t_n$, then the random variables $b(\cdot, t_1) - b(\cdot, s_1), \ldots, b(\cdot, t_n) - b(\cdot, s_n)$ are independent.
(ii) If $s < t$, the random variable $b(\cdot, t) - b(\cdot, s)$ is Gaussian distributed with mean zero and variance $t - s$.
(iii) For almost all $\omega$, the path $t \mapsto b(\omega, t)$ is continuous.
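Condition (ii) can be explored numerically with a finite random walk. If a walker takes $N$ steps per unit of time, each of size $\pm 1/\sqrt{N}$ with equal probability, then its position at time $t$ has mean zero and variance about $t$. The Python sketch below is only a finite stand-in under stated assumptions: an unlimited hypernatural cannot be simulated, so `steps_per_unit` is an ordinary large integer, and the function names and the fixed seed are hypothetical choices made for this illustration.

```python
import math
import random


def walk_position(steps_per_unit, t, rng):
    """Position at time t of a walk with +-1/sqrt(N) steps every 1/N time units."""
    k = int(t * steps_per_unit)               # number of steps taken by time t
    step = 1.0 / math.sqrt(steps_per_unit)    # size of each step
    return step * sum(rng.choice((-1, 1)) for _ in range(k))


def sample_mean_and_variance(steps_per_unit, t, paths, seed=0):
    """Empirical mean and variance of the walk's position at time t."""
    rng = random.Random(seed)                 # seeded for reproducibility
    samples = [walk_position(steps_per_unit, t, rng) for _ in range(paths)]
    mean = sum(samples) / paths
    variance = sum((x - mean) ** 2 for x in samples) / paths
    return mean, variance
```

Calling `sample_mean_and_variance(400, 0.5, 2000)` should return a mean near $0$ and a variance near $0.5$, up to sampling error. The $1/\sqrt{N}$ scaling is what keeps the variance at time $t$ equal to (number of steps) times (step size squared), that is $tN \cdot 1/N = t$.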
Intuitively, Brownian motion can be described as an infinitesimal random walk. This expression has mostly been used in a colloquial manner, yet non-standard analysis allows us to give it rigorous meaning.

Fix an unlimited hypernatural $N$ and let $T = \{0, 1/N, \ldots, \frac{N^2 - 1}{N}, N\}$. Let $\Omega$ be the collection of internal maps from $T$ to $\{-1, 1\}$, where by an internal map, we mean a function of the form $[f_n]$ for some sequence of standard functions $f_n$. This
makes $\Omega$ a set of hyperfinite cardinality $2^{N^2 + 1}$. Now define a probability measure $P$ on $\Omega$ by setting
\[ P(A) = \frac{|A|}{2^{N^2 + 1}} \]
on internal sets $A$. Intuitively, this is like saying that every coin has a $1/2$ chance of coming up heads (think of the finite case). This induces a well-defined probability measure $P$ in the same way as in the previous section. Doing this, it can be shown that if we define $B : \Omega \times T \to {}^*\mathbb{R}$ by
\[ B(\omega, k/N) = \sum_{j=0}^{k-1} \frac{\omega(j/N)}{\sqrt{N}}, \]
then $b : \Omega \times [0, \infty) \to \mathbb{R}$ defined by
\[ b(\omega, t) = \mathrm{sh}(B(\omega, \overline{t})) \]
is a Brownian motion (where $\overline{t}$ is the element of $T$ immediately to the right of $t$).

3.3 An Application to Hilbert Spaces

Shortly after Robinson published his foundations of non-standard analysis, he and Allen R. Bernstein published a paper in which they used non-standard analysis to prove the following result, quoted verbatim:

Theorem 15. Let $T$ be a bounded linear operator on an infinite-dimensional Hilbert space $H$ over the complex numbers and let $p(z) \ne 0$ be a polynomial with complex coefficients such that $p(T)$ is completely continuous (compact). Then $T$ leaves invariant at least one closed linear subspace of $H$ other than $H$ or $\{0\}$.

This was a big deal for two reasons. One, the invariant subspace problem is of great importance in functional analysis. Two, this was a result that had never been proven using standard methods. At this time, mathematicians began to think of non-standard analysis as more of a useful tool than an interesting side note. It is worth noting that shortly after their paper was published, Paul R. Halmos translated the non-standard proof into a standard proof, yet it is still significant that people would first think of the proof in a non-standard setting.

4 References

[1] Bernstein, Allen R. and Robinson, Abraham. Solution of an Invariant Subspace Problem of K. T. Smith and P. R. Halmos. Pacific Journal of Mathematics, Vol. 16, No. 3: 1966.
[2] Cutland, Nigel. Non-Standard Analysis and its Applications. Cambridge University Press, Cambridge: 1988.
[3] Goldblatt, Robert. Lectures on the Hyperreals: An Introduction to Non-Standard Analysis. Springer, New York: 1998.