The Uniformity Principle: A New Tool for Probabilistic Robustness Analysis

B. R. Barmish and C. M. Lagoa
Department of Electrical and Computer Engineering
University of Wisconsin-Madison, Madison, WI 53706

Abstract

This paper provides a new mathematical framework for the analysis of control systems which are operated with admissible values of uncertain parameters that exceed the bounds specified by classical robustness theory. In such a case, it is important to quantify the tradeoffs between risk of performance degradation and increased tolerance of uncertainty. If a large increase in the uncertainty bound can be established, an acceptably small risk may often be justified. Since robustness problem formulations do not include statistical descriptions of the uncertainty, the question arises whether it is possible to provide such assurances in a "distribution-free" manner. In other words, if $\mathcal{F}$ denotes a class of admissible probability distributions for the uncertainty $X$, we seek some worst-case $f^* \in \mathcal{F}$ having the following property: the probability of performance satisfaction under $f^*$ is smaller than the probability under any other $f \in \mathcal{F}$. Said another way, $f^*$ provides the best possible guarantee. Within this new framework, the main result, given in Section 2.2, is called the Uniformity Principle. To apply this theorem, we require the uncertain parameters $X_i$ to be zero-mean independent random variables with known support interval. For each uncertainty $X_i$, the class $\mathcal{F}$ is assumed to consist of density functions $f_i(x_i)$ which are symmetric and non-increasing with respect to $|x_i|$. Then, for a given symmetric target $\mathcal{X}$ for $X$, the Uniformity Principle indicates that the probability that $X$ is in $\mathcal{X}$ is minimized by the uniform distribution for $X$. The interpretation of this result in a feedback control context is given in [1].

1. Introduction

This paper provides the mathematical underpinnings associated with a new approach to robustness problems; see [1] for a full exposition in a control-theoretic framework.
The main result, which we call the Uniformity Principle, facilitates probabilistic assessment of systems in a so-called "distribution-free" manner. That is, without having a detailed statistical description of the uncertain parameters, we want to make a worst-case probabilistic assessment that some performance specification is satisfied. Although it may seem ad hoc to assume a uniform distribution when statistical information about the uncertain parameters is unavailable, the Uniformity Principle justifies such an assumption under rather mild conditions on the "shape" of the underlying density functions. For more general cases than those covered by Theorem 2.2 to follow, it is also proven that a class of truncated uniform distributions leads to the worst case. However, for this more general nonlinear problem class, the results of this paper should be viewed as preliminary because the truncation points for these distributions are not characterized; see Section 3 for further discussion.

The problem setup in the next section is reminiscent of the formulation of Huber [2] in the field of statistics. In contrast to his distributional robustness formulation, however, robustness motivation dictates that almost no a priori information about the "shape" of the underlying density functions should be assumed. In particular, if $X_i$ denotes the $i$-th uncertain parameter, we do not take its density $f_i$ as a contamination of some other known density. Instead, it is only assumed that $f_i(x_i)$ is symmetric and non-increasing with respect to $|x_i|$.

If one interprets the results of this paper in a control framework, there is consistency with the work of [3]-[6]. The key difference between the framework of this paper and these references is simple to explain: In [3]-[5], since sampling of the $X_i$ is involved, a choice of density function for each parameter is required. In this regard, the results in this paper can be interpreted to say that uniform sampling of each $X_i$ is the "right" way to proceed.
That is, probabilities which are estimated using a uniform distribution are "tight" in the sense that they do not underestimate the true risk. In [6], cases are identified for which a worst-case distribution turns out to be impulsive. In contrast with this paper, the results in [6] require an adequate number of uncertain parameters so that the hypotheses associated with the Central Limit Theorem are satisfied. For a problem with fixed input data, it often turns out that the central limit predictor is not justified, but use of the uniform distribution is nevertheless correct. In other words, under the assumptions to follow, the uniform distribution is seen to be the worst case independently of the number of uncertain parameters. Finally, numerical results obtained in [1] support a conclusion reached in [3]-[6]. That is, there are many cases for which one can operate well above the classical robustness margin while keeping the risk of performance violation very small.

2. Formulation and Main Result

In this section, the main objective is to establish the so-called Uniformity Principle. Its proof is given
with the help of a number of preliminary lemmas.

2.1 Preliminary Notation

Throughout this section, $X = (X_1, X_2, \ldots, X_\ell)$ is taken to be a vector of independent zero-mean random variables. For each $X_i$, the support is $\mathrm{supp}\, X_i = [-1, 1]$; hence, the support for $X$ is the unit cube in $\mathbf{R}^\ell$, which is denoted by $B$. Now, consistent with the discussion in Section 1, a density function $f_i(x_i)$ is said to be admissible for $X_i$ if it is symmetric and non-increasing with respect to $|x_i|$. The class of admissible density functions for each $X_i$ is denoted by $\mathcal{F}$. Finally, when $f \in \mathcal{F}$ is written for a joint density function
$$f(x_1, x_2, \ldots, x_\ell) = f_1(x_1) f_2(x_2) \cdots f_\ell(x_\ell),$$
the understanding is that each marginal $f_i$ is in $\mathcal{F}$. The density function for the uniform distribution $u$ is expressed in terms of its marginals as
$$u(x_1, x_2, \ldots, x_\ell) = u_1(x_1) u_2(x_2) \cdots u_\ell(x_\ell).$$
When it needs to be made clear that a density function $f$ is attached to $X$, the notation $X^f$ is used, with $i$-th component $X^{f,i}$.

2.2 The Uniformity Principle (See Section 2.9 for Proof)

Let $\mathcal{X}$ be a closed convex symmetric set in $\mathbf{R}^\ell$ with center $x_0 = 0$. Then it follows that
$$\min_{f \in \mathcal{F}} \mathrm{Prob}\{X^f \in \mathcal{X}\} = \mathrm{Prob}\{X^u \in \mathcal{X}\}.$$

2.3 Notation for Supporting Lemmas

In order to establish the Uniformity Principle, some notation associated with the proof is introduced. First, the standard notation $I_\Omega$ is used to denote the indicator function of a set $\Omega$; i.e., $I_\Omega(\omega) = 1$ for $\omega \in \Omega$ and $I_\Omega(\omega) = 0$ otherwise. In the sequel, $\mu_\ell(\Omega)$ denotes the classical Lebesgue measure of a set $\Omega \subseteq \mathbf{R}^\ell$. When a lower-dimensional volume measure is required, the following convention is used: If $\Omega$ is an $(\ell-1)$-dimensional subset of $\mathbf{R}^\ell$, then $\mu_{\ell-1}(\Omega)$ is the volume of $\Omega$ with respect to $\mathbf{R}^{\ell-1}$.

Now, two subsets of $\mathcal{F}$ are defined. The notation $\mathcal{F}_s$ is used to denote the class of piecewise-constant (simple) density functions $f \in \mathcal{F}$ which, for $\xi > 0$, can be expressed as
$$f(\xi) = \sum_{k=1}^{N} \lambda_k\, I_{[\frac{k-1}{N}, \frac{k}{N})}(\xi)$$
for some positive integer $N$. Note that being a density function forces the $\lambda_k \ge 0$.
Furthermore, the non-increasing condition for membership in $\mathcal{F}$ dictates that
$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N,$$
and the requirement that $f$ integrates to unity forces
$$\sum_{k=1}^{N} \lambda_k = \frac{N}{2}.$$
Now, the class of truncated uniform distributions is denoted by $\mathcal{U}_T$. More precisely, a joint density function
$$u_t(x_1, x_2, \ldots, x_\ell) = u_{t,1}(x_1) u_{t,2}(x_2) \cdots u_{t,\ell}(x_\ell)$$
is in $\mathcal{U}_T$ if there exist positive constants $t_1, t_2, \ldots, t_\ell$ such that
$$u_{t,i}(x_i) = \frac{1}{2 t_i}\, I_{(-t_i, t_i)}(x_i)$$
for $i = 1, 2, \ldots, \ell$. Note that the special case of the uniform distribution corresponds to all $t_i = 1$. Finally, to conclude this subsection, we summarize by noting that
$$u \in \mathcal{U}_T \subset \mathcal{F}_s \subset \mathcal{F}.$$

The objective of the next three lemmas is to establish that $\mathrm{Prob}\{X^f \in \mathcal{X}\}$ can be minimized by restricting our attention to truncated uniform distributions in $\mathcal{U}_T$. To this end, a "first principles" development is provided. Note, however, that a more direct and less elementary route to Lemma 2.6 is possible by expressing densities $f \in \mathcal{F}$ as continuous mixtures of densities in $\mathcal{U}_T$.

2.4 Lemma

Let $\mathcal{X}$ be a closed set in $\mathbf{R}^\ell$. Then
$$\inf_{f \in \mathcal{F}} \mathrm{Prob}\{X^f \in \mathcal{X}\} = \inf_{f_s \in \mathcal{F}_s} \mathrm{Prob}\{X^{f_s} \in \mathcal{X}\}.$$

Proof: In order to prove the lemma, it suffices to show that for any density $f \in \mathcal{F}$ and any $\epsilon > 0$, there exists some $f_s \in \mathcal{F}_s$ such that
$$\mathrm{Prob}\{X^{f_s} \in \mathcal{X}\} \le \mathrm{Prob}\{X^f \in \mathcal{X}\} + \epsilon.$$
This being the case, the conclusion of the lemma follows by taking the infimum on both sides above. The desired density $f_s$ above is now constructed. Indeed, since the $i$-th component $f_i$ of the joint density
$$f(x_1, x_2, \ldots, x_\ell) = f_1(x_1) f_2(x_2) \cdots f_\ell(x_\ell)$$
is integrable and non-increasing, it is Riemann integrable. Hence, with
$$F_i^-(k, N) = \lim_{x_i \uparrow k/N} f_i(x_i)$$
for each positive integer $N$ and $k \in \{1, 2, \ldots, N\}$, each of the lower Riemann sums
$$F_i(N) = \frac{2}{N} \sum_{k=1}^{N} F_i^-(k, N)$$
converges. That is,
$$\lim_{N \to \infty} F_i(N) = \int_{-1}^{1} f_i(\xi)\, d\xi = 1.$$
Hence, there exists a positive integer $N$ large enough so that with
$$F(N) = F_1(N) F_2(N) \cdots F_\ell(N),$$
it follows that
$$0 \le 1 - F(N) < \frac{\epsilon}{2}.$$
For the remainder of the proof, $N$ is held fixed. This defines the partitions of $[-1, 1]$ which enable construction of the desired density function $f_s \in \mathcal{F}_s$.
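Indeed, we claim that for $x_i > 0$ and $f_{s,i}(-x_i) = f_{s,i}(x_i)$, the marginals of $f_s$ defined by
$$f_{s,i}(x_i) = \frac{1}{F_i(N)} \sum_{k=1}^{N} F_i^-(k, N)\, I_{[\frac{k-1}{N}, \frac{k}{N})}(x_i)$$
are in $\mathcal{F}_s$. To prove this claim, it is first noted that for $x_i \ge 0$, the non-increasing property for each $f_i(x_i)$ implies that for $k_2 > k_1$,
$$F_i^-(k_2, N) \le F_i^-(k_1, N).$$
Hence, $f_{s,i} \in \mathcal{F}_s$ for $i = 1, 2, \ldots, \ell$. To complete the proof of the lemma, with $B$ denoting the unit cube, it suffices to show that
$$\mathrm{Prob}\{X^{f_s} \in \mathcal{X}\} = \int_{\mathcal{X} \cap B} f_s\, dx \le \int_{\mathcal{X} \cap B} f\, dx + \epsilon = \mathrm{Prob}\{X^f \in \mathcal{X}\} + \epsilon$$
with the understanding that the integrals above are over subsets of $\mathbf{R}^\ell$. The definition of $N$ is now used in combination with the fact that the approximation to $f$ defined by
$$\hat{f}_{s,i}(x_i) = F_i(N) f_{s,i}(x_i)$$
satisfies $f_i(x_i) \ge \hat{f}_{s,i}(x_i)$ for $x_i \ge 0$. With $\hat{f}$ denoting the product of the $\hat{f}_{s,i}$, so that $\hat{f} = F(N) f_s$ and both $f - \hat{f}$ and $f_s - \hat{f}$ are non-negative, this leads to the chain of inequalities
$$\int_{\mathcal{X} \cap B} (f_s - f)\, dx \le \int_{B} |f - \hat{f}|\, dx + \int_{B} (f_s - \hat{f})\, dx = \int_{B} (f - \hat{f})\, dx + 1 - F(N) = 2(1 - F(N)) < \epsilon.$$

The next lemma involves simple concepts about linear functions and linear inequalities.

2.5 Lemma

Let $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_m) \in \mathbf{R}^m$ and assume that $a_1, a_2, \ldots, a_m$ are real scalars. Then, the problem of minimizing the linear function
$$\Phi(\lambda) = \sum_{i=1}^{m} a_i \lambda_i$$
subject to
$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m \ge 0 \quad \text{and} \quad \lambda_1 + \lambda_2 + \cdots + \lambda_m = 1$$
is solved by one of the $m$ vectors $\lambda^{(1)}, \lambda^{(2)}, \ldots, \lambda^{(m)}$ given by
$$\lambda^{(k)}_i = \begin{cases} \frac{1}{k} & \text{for } i = 1, 2, \ldots, k; \\ 0 & \text{otherwise.} \end{cases}$$

Proof: From the theory of convex analysis (for example, see [8]), it is well known that a linear function on a polytope attains its minimum at an extreme point. Now, it is straightforward to verify that the linear inequality constraints for $\lambda$ generate a polytope whose extreme points are a subset of the list of $\lambda^{(k)}$ above.

2.6 Lemma

Let $\mathcal{X}$ be a closed set in $\mathbf{R}^\ell$. Then
$$\inf_{f \in \mathcal{F}} \mathrm{Prob}\{X^f \in \mathcal{X}\} = \inf_{u_t \in \mathcal{U}_T} \mathrm{Prob}\{X^{u_t} \in \mathcal{X}\}.$$

Proof: In view of Lemma 2.4, it suffices to prove the following: Given any $f_s \in \mathcal{F}_s$, there exists some $u_t \in \mathcal{U}_T$ such that
$$\mathrm{Prob}\{X^{u_t} \in \mathcal{X}\} \le \mathrm{Prob}\{X^{f_s} \in \mathcal{X}\}.$$
Hence $f_s \in \mathcal{F}_s$ is taken as fixed and its density is expressed as
$$f_s(x_1, x_2, \ldots, x_\ell) = f_{s,1}(x_1) f_{s,2}(x_2) \cdots f_{s,\ell}(x_\ell)$$
with piecewise-constant components of the form
$$f_{s,i}(x_i) = \sum_{k=1}^{N_i} \lambda_{ik}\, I_{[\frac{k-1}{N_i}, \frac{k}{N_i})}(x_i)$$
for $x_i \ge 0$, with $\lambda_{ik} \ge 0$ and
$$\sum_{k=1}^{N_i} \lambda_{ik} = \frac{N_i}{2}$$
for $i = 1, 2, \ldots, \ell$. Now, to minimize the probability of inclusion in $\mathcal{X}$, an optimization with respect to the $\lambda_{ik}$ is set up.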
That is, a set of $\lambda_{ik}$ is sought which minimizes the probability that $X^{f_s} \in \mathcal{X}$. To complete the proof of the lemma, it suffices to fix all $\lambda_{ik}$ with index $i > 1$ and, writing $\lambda_k$ for $\lambda_{1k}$, prove that the half-probability
$$\Phi(\lambda) = \int_{\mathcal{X}} \sum_{k=1}^{N_1} \lambda_k\, I_{[\frac{k-1}{N_1}, \frac{k}{N_1})}(x_1)\, f_{s,2}(x_2) \cdots f_{s,\ell}(x_\ell)\, dx$$
is minimized by a set of $\{\lambda_k\}$ corresponding to a truncated uniform distribution in the class $\mathcal{U}_T$. Note that once the result is proven for the first component $f_{s,1}$, it can be repeated for the remaining components $f_{s,2}, f_{s,3}, \ldots, f_{s,\ell}$. The desired minimum above is now established by invoking Lemma 2.5. That is, with scalars
$$a_k = \int_{\mathcal{X}} I_{[\frac{k-1}{N_1}, \frac{k}{N_1})}(x_1)\, f_{s,2}(x_2) \cdots f_{s,\ell}(x_\ell)\, dx$$
for $k = 1, 2, \ldots, N_1$, modulo scaling, the function
$$\Phi(\lambda) = \sum_{k=1}^{N_1} a_k \lambda_k$$
and the constraints on the $\lambda_k$ satisfy the hypotheses of Lemma 2.5.

2.7 Lemma (Brunn-Minkowski; see [7])

Let $A$ and $B$ be compact sets in $\mathbf{R}^k$ and define
$$C_\lambda = \{\lambda a + (1 - \lambda) b : a \in A;\ b \in B\}$$
for $\lambda \in [0, 1]$. Then, with Lebesgue measure
$$\mu_k(C_\lambda) = \int_{C_\lambda} dx,$$
the function
$$h(\lambda) = [\mu_k(C_\lambda)]^{1/k}$$
is concave.

We establish the next lemma via application of the Brunn-Minkowski result above. The reader is reminded about the convention used for $(\ell-1)$-dimensional volume measures in $\mathbf{R}^\ell$; see Section 2.3.
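As an informal sanity check of Lemma 2.7 (not part of the original development), the following Python sketch considers the special case in which $A$ and $B$ are axis-aligned boxes. In that case $\lambda A + (1 - \lambda) B$ is again a box whose side lengths are $\lambda a_i + (1 - \lambda) b_i$, so $h(\lambda)$ is available in closed form and its concavity can be tested on a grid. The function names, the grid size and the side lengths used below are illustrative assumptions only.

```python
import math


def combined_sides(a_sides, b_sides, lam):
    """Side lengths of the box lam*A + (1 - lam)*B for axis-aligned boxes A, B."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(a_sides, b_sides)]


def h(a_sides, b_sides, lam):
    """h(lam) = [volume of lam*A + (1 - lam)*B]^(1/k) for boxes in R^k."""
    sides = combined_sides(a_sides, b_sides, lam)
    return math.prod(sides) ** (1.0 / len(sides))


def is_concave_on_grid(a_sides, b_sides, steps=40, tol=1e-12):
    """Test the midpoint concavity inequality for h on a uniform grid in [0, 1]."""
    grid = [j / steps for j in range(steps + 1)]
    for lam1 in grid:
        for lam2 in grid:
            mid = 0.5 * (lam1 + lam2)
            chord = 0.5 * (h(a_sides, b_sides, lam1) + h(a_sides, b_sides, lam2))
            if h(a_sides, b_sides, mid) < chord - tol:
                return False
    return True


if __name__ == "__main__":
    # Hypothetical side lengths for two boxes in R^3.
    A = [1.0, 4.0, 0.5]
    B = [3.0, 0.2, 2.0]
    print("midpoint concavity holds on grid:", is_concave_on_grid(A, B))
```

A grid test of this kind can only fail to contradict the lemma; it does not prove it. For boxes, concavity of $h$ follows directly from concavity of the geometric mean, which is why this special case is convenient for a quick check.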
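2.8 Lemma

Let $\mathcal{X}$ be a compact convex symmetric set in $\mathbf{R}^\ell$ with center $x_0 = 0$. Define sections
$$\mathcal{X}_i(\xi) = \{x = (x_1, x_2, \ldots, x_\ell) \in \mathcal{X} : x_i = \xi\}$$
for $i = 1, 2, \ldots, \ell$ and assume $\xi$ is real. Then, for $\ell \ge 2$, each of the functions
$$g_i(\xi) = \mu_{\ell-1}(\mathcal{X}_i(\xi))$$
is non-increasing for $\xi \ge 0$ and non-decreasing for $\xi \le 0$.

Proof: The result is established for $i = 1$, and it is noted that an identical proof can be used for the other sections. In addition, only $\xi \ge 0$ is considered because symmetry of $\mathcal{X}$ implies that an identical proof can be used for $\xi \le 0$. Hence, $\xi_2 > \xi_1 \ge 0$ is fixed and we claim that $g_1(\xi_1) \ge g_1(\xi_2)$. To prove this, define the $(\ell-1)$-dimensional sets
$$\tilde{X}_1(\xi) = \{\tilde{x} \in \mathbf{R}^{\ell-1} : (\xi, \tilde{x}) \in \mathcal{X}\}$$
and observe that for any $\xi \ge 0$,
$$\mu_{\ell-1}(\tilde{X}_1(\xi)) = \mu_{\ell-1}(\mathcal{X}_1(\xi)).$$
Now, for $\lambda \in [0, 1]$, let
$$\tilde{X}_\lambda = \lambda \tilde{X}_1(\xi_2) + (1 - \lambda) \tilde{X}_1(-\xi_2),$$
and make the following identifications with Lemma 2.7:
$$A \sim \tilde{X}_1(\xi_2); \quad B \sim \tilde{X}_1(-\xi_2); \quad C_\lambda \sim \tilde{X}_\lambda; \quad h(\lambda) \sim [\mu_{\ell-1}(\tilde{X}_\lambda)]^{\frac{1}{\ell-1}}.$$
Since $\mathcal{X}$ is symmetric with center $x_0 = 0$, then $h(0) = h(1)$. Moreover, by Lemma 2.7, $h(\lambda)$ is concave. Hence, $h(\lambda) \ge h(1)$ for all $\lambda \in [0, 1]$. Now, we fix
$$\lambda = \lambda^* = \frac{\xi_1 + \xi_2}{2 \xi_2}$$
and observe that $\xi_1 = \lambda^* \xi_2 - (1 - \lambda^*) \xi_2$. Furthermore, since $\mathcal{X}$ is convex, the inclusion
$$\mathcal{X}_1(\xi_1) \supseteq \lambda^* \mathcal{X}_1(\xi_2) + (1 - \lambda^*) \mathcal{X}_1(-\xi_2)$$
leads to
$$\tilde{X}_1(\xi_1) \supseteq \tilde{X}_{\lambda^*} = \lambda^* \tilde{X}_1(\xi_2) + (1 - \lambda^*) \tilde{X}_1(-\xi_2)$$
and therefore
$$\mu_{\ell-1}(\tilde{X}_1(\xi_1)) \ge \mu_{\ell-1}(\tilde{X}_{\lambda^*}).$$
Now, monotonicity of $z \mapsto z^{\frac{1}{\ell-1}}$ for $z \ge 0$ implies
$$[\mu_{\ell-1}(\tilde{X}_1(\xi_1))]^{\frac{1}{\ell-1}} \ge [\mu_{\ell-1}(\tilde{X}_{\lambda^*})]^{\frac{1}{\ell-1}} = h(\lambda^*) \ge h(1) = [\mu_{\ell-1}(\tilde{X}_1(\xi_2))]^{\frac{1}{\ell-1}}.$$
Hence,
$$\mu_{\ell-1}(\mathcal{X}_1(\xi_1)) = \mu_{\ell-1}(\tilde{X}_1(\xi_1)) \ge \mu_{\ell-1}(\tilde{X}_1(\xi_2)) = \mu_{\ell-1}(\mathcal{X}_1(\xi_2)).$$
Equivalently, it follows that $g_1(\xi_1) \ge g_1(\xi_2)$.

2.9 Proof of the Uniformity Principle

In view of Lemma 2.6, only truncated uniform distributions $u_t = (u_{t,1}, u_{t,2}, \ldots, u_{t,\ell}) \in \mathcal{U}_T$ are considered. Letting $(-t_i, t_i)$ denote the support of $u_{t,i}$, an optimization problem in the variable $t = (t_1, t_2, \ldots, t_\ell)$ is set up. To prove the Uniformity Principle, it must be shown that the truncation vector $t = t^* = (1, 1, \ldots, 1)$ minimizes $\mathrm{Prob}\{X^{u_t} \in \mathcal{X}\}$ with respect to $t \in B$. To establish that $t^*$ is the minimizer, it suffices to prove that the minimizer has first co-ordinate $t_1 = t_1^* = 1$; if this fact is established, the same argument can be repeated for $t_2, t_3, \ldots, t_\ell$.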
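Hence, $t_1 = \tau$ is treated as a variable and $t_i$ is held fixed for $i = 2, 3, \ldots, \ell$. Now with $t = (\tau, t_2, t_3, \ldots, t_\ell)$, we claim that the function
$$\varphi(\tau) = \mathrm{Prob}\{X^{u_t} \in \mathcal{X}\}$$
is minimized by $\tau = 1$. To prove this claim, observe that
$$\varphi(\tau) = \frac{1}{2^\ell\, \tau\, t_2 t_3 \cdots t_\ell} \int_{\mathcal{X}} I_{(-\tau, \tau)}(x_1)\, I_{(-t_2, t_2)}(x_2) \cdots I_{(-t_\ell, t_\ell)}(x_\ell)\, dx_1 \cdots dx_\ell.$$
In order to complete the proof, the integral above is re-written in a form which makes application of Lemma 2.8 transparent. To this end, we define the symmetric convex set
$$\mathcal{X}_t = \{x \in \mathcal{X} : |x_i| \le t_i \text{ for } i = 2, 3, \ldots, \ell\},$$
the hyperplanes
$$H_\xi = \{(x_1, x_2, \ldots, x_\ell) : x_1 = \xi\}$$
and the function
$$g(\xi) = \frac{1}{2^{\ell-1}\, t_2 t_3 \cdots t_\ell}\, \mu_{\ell-1}(H_\xi \cap \mathcal{X}_t)$$
for $\xi, \tau \in [0, 1]$ and $\ell \ge 2$. For the degenerate case $\ell = 1$, the simpler formula $g(\xi) = I_{\mathcal{X}}(\xi)$ is used. Then, with this notation and the symmetry of $\mathcal{X}_t$, the integral defining $\varphi(\tau)$ reduces to
$$\varphi(\tau) = \begin{cases} g(0) & \text{for } \tau = 0; \\ \frac{1}{\tau} \int_0^\tau g(\xi)\, d\xi & \text{otherwise.} \end{cases}$$
Moreover, for $\ell \ge 2$, Lemma 2.8 guarantees that $g(\xi)$ is non-increasing. For $\ell = 1$, the convexity of $\mathcal{X}$ leads to the same guarantee. It is now clear that $\varphi(\tau)$ is precisely equal to the average value of the function $g(\xi)$ over the interval $[0, \tau]$. Since $g(\xi)$ is non-increasing, this average is minimized when $\tau$ is maximal; i.e., $\tau = 1$ minimizes $\varphi(\tau)$.

3. Conclusions and Further Research

In this section, some extensions and refinements of the theory are presented and new directions for research are discussed. The results in this paper suggest some obvious continuations. Of foremost interest is the following: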
For a large class of nonlinear functions of independent random variables, the worst case corresponds to a truncated uniform distribution in the class $\mathcal{U}_T$. This leads to two problems of immediate interest. First, suppose the support of the $i$-th marginal $u_{t,i}$ of a density in $\mathcal{U}_T$ is denoted by $(-t_i, t_i) \subseteq [-1, 1]$. Then, it would be important to characterize the truncation points $t_i$. If these scalars can be readily computed, it becomes possible to implement effective Monte Carlo solutions to a number of problems which are known to be NP-hard when analyzed in a more traditional framework. It might even turn out that for the scenario in the subsection above, there is a large class of problems for which all $t_i = 1$. In other words, no truncation is required and the uniform distribution emerges again as the worst case. A limited number of experiments to date lead us to conjecture that this may indeed be true for a rich class of multilinear uncertainty structures and targets. If this proves to be the case, then more general versions of the Uniformity Principle (see Section 2) could also be established.

Another direction for future work involves elimination of the symmetry requirement associated with distributions $f \in \mathcal{F}$. To this end, if the definition of $\mathcal{U}_T$ is modified to allow asymmetric truncations, the statement of Lemma 2.6 remains intact. In other words, the worst-case distribution has $i$-th component $u_{t,i}$ which has one constant value over an interval of the form $[-t_i^-, 0]$ and possibly a different value over an interval of the form $[0, t_i^+]$.

It appears also possible to embellish a number of the results in this paper so as to accommodate additional a priori information about the class of density functions $\mathcal{F}$. To illustrate, suppose $\mathcal{F}$ is now the set of all admissible density functions whose decay rate satisfies certain constraints. Then, it is possible to modify the technical results in this paper to take this information into account.
There are many additional open research problems associated with taking more a priori information into account.

To conclude, note that the results in this paper enable us to take a new approach in the analysis of uncertain dynamical systems. Often, one can operate well above the robustness margin given by classical methods and still expect robust stability and/or performance with high confidence levels.

References

[1] Barmish, B. R. and C. M. Lagoa (1995). "The Uniform Distribution: A Rigorous Justification for its Use in Robustness Analysis," Technical Report ECE-95-2, Department of Electrical and Computer Engineering, University of Wisconsin-Madison.
[2] Huber, P. J. (1981). Robust Statistics, John Wiley & Sons, New York.
[3] Stengel, R. F. and L. R. Ray (1991). "Stochastic Robustness of Linear Time-Invariant Systems," IEEE Transactions on Automatic Control, AC-36, pp. 82-87.
[4] Ray, L. R. and R. F. Stengel (1993). "A Monte Carlo Approach to the Analysis of Control Systems Robustness," Automatica, vol. 3, pp. 229-236.
[5] Tempo, R. and E. W. Bai (1995). "Robustness Analysis with Nonlinear Parametric Uncertainty: A Probabilistic Approach," CENS-CNR Report, Politecnico di Torino.
[6] Barmish, B. R. and B. T. Polyak (1996). "A New Approach to Open Robustness Problems Based on Probabilistic Prediction Formulae," submitted to IFAC World Congress, San Francisco, California.
[7] Berger, M. (1987). Geometry, Springer-Verlag, Berlin, New York.
[8] Rockafellar, R. T. (1970). Convex Analysis, Princeton University Press, Princeton.

Acknowledgements

This research was supported by the U.S. National Science Foundation under Grant Number ECS-948709 and by funds of JNICT-Portugal. The authors express their thanks to Professors Rajeev Agrawal and John Gubner for a number of discussions about this work.
In addition, gratitude is expressed to Professors Victor Klee and Carsten Schuett for pointing out relevant references associated with the Brunn-Minkowski theory and to Saeed Asgari for extensive comments about the readability of the manuscript.