Fuzzy Systems, Modeling and Identification

Robert Babuška
Delft University of Technology, Department of Electrical Engineering
Control Laboratory, Mekelweg 4, P.O. Box 53, 26 GA Delft, The Netherlands
tel: +3 5 7857, fax: +3 5 2786679, e-mail: r.babuska@et.tudelft.nl

Summary

This text provides an introduction to the use of fuzzy sets and fuzzy logic for the approximation of functions and modeling of static and dynamic systems. The concept of a fuzzy system is first explained. Afterwards, the motivation and practical relevance of fuzzy modeling are highlighted. Two types of rule-based fuzzy models are described: the linguistic (Mamdani) model and the Takagi-Sugeno model. For each model, the structure of the rules and the inference and defuzzification methods are presented. Fuzzy modeling of dynamic systems is addressed, as well as the methods to construct fuzzy models from knowledge and data (measurements). Illustrative examples are given throughout the text. At the end, homework problems are included. MATLAB programs implementing some of the examples are available from the author. The reader is encouraged to study and possibly modify these examples in order to get a better insight into the methods presented.

Preface

Prerequisites. This text provides an introduction to the use of fuzzy sets and fuzzy logic for the approximation of functions and modeling of static and dynamic systems. It is assumed that the reader has basic knowledge of set and fuzzy set theory (membership functions; operations on fuzzy sets such as union, intersection and complement; fuzzy relations; max-min composition; extension principle), mathematical analysis (univariate and multivariate functions, composition of functions), and linear algebra (systems of linear equations, least-squares solution).

Organization. The material is organized in five sections. In the Introduction, different modeling paradigms are first presented. Then, the concept of a fuzzy system is explained and the motivation and practical relevance of fuzzy modeling are highlighted. Section 2 describes two types of rule-based fuzzy models: the linguistic (Mamdani) model and the Takagi-Sugeno model. For each model, the structure of the rules and the inference and defuzzification methods are presented. At the end of this section, fuzzy modeling of dynamic systems is addressed. In Section 3, methods to construct fuzzy models from knowledge and numerical data are presented. Section 4 reviews some engineering applications of fuzzy modeling, and the concluding Section 5 gives a short summary. Illustrative examples are provided throughout the text, and at the end, homework problems are included. Some of the numerical examples given have been implemented in MATLAB. The code is available from the author on request. The reader is encouraged to study and possibly modify these examples in order to get a better insight into the methods presented. A subject index is provided for quick reference.

Aims. After studying the material, the reader should be able to:

- Characterize a fuzzy system and give some examples of fuzzy systems.
- Define the linguistic (Mamdani) and the Takagi-Sugeno fuzzy model in terms of their structure, inference and defuzzification mechanisms.
- Explain how dynamic systems are represented by fuzzy models, and give examples.
- List the steps and choices in the knowledge-based design of fuzzy models.
- Name and briefly characterize the presented techniques for data-driven acquisition and tuning of fuzzy models.

Further reading. Readers interested in a detailed and fundamental treatment of fuzzy set theory and fuzzy logic can consult research monographs by Dubois and Prade (1980) or Klir and Yuan (1995). Basic, as well as more advanced concepts of fuzzy modeling and control, are presented, for instance, by Pedrycz (1993), Driankov et al. (1993) or Yager and Filev (1994).

Mathematical notation. Throughout the text, the following conventions are used. Lower case characters in italics, such as $x$ or $x_i$, denote scalar variables and elements of vectors. Vectors are printed in bold, i.e., $\mathbf{x}$ denotes a column vector. A row vector is denoted by the transpose operator, e.g., $\mathbf{x}^T$. Upper case bold characters denote matrices, for instance, $\mathbf{X}$ is a matrix. Upper case italic characters such as $A$ denote crisp and fuzzy sets. A linguistic variable (a variable whose values are fuzzy sets) is denoted by $\tilde{x}$. The term crisp is used as an opposite to fuzzy. For instance, a fuzzy number is a normal convex fuzzy set, while a crisp number may be a real or an integer number. A list of the mathematical symbols used is included in Appendix A.

Acknowledgement. I am grateful to Piet Bruijn and Govert Monsees, who read drafts of this text and contributed useful comments.

Contents

1 Introduction
  1.1 Fuzzy systems
  1.2 Practical relevance of fuzzy modeling
2 Rule-Based Fuzzy Models
  2.1 Linguistic fuzzy model
    2.1.1 Relational representation of a linguistic model
    2.1.2 Max-min (Mamdani) inference
    2.1.3 Multivariable systems
    2.1.4 Defuzzification
    2.1.5 Singleton model
  2.2 Takagi-Sugeno model
    2.2.1 Inference mechanism
    2.2.2 TS model as a quasi-linear system
  2.3 Modeling dynamic systems
3 Building Fuzzy Models
  3.1 Structure and parameters
  3.2 Knowledge-based design
  3.3 Data-driven acquisition/tuning of fuzzy models
    3.3.1 Least-squares estimation of consequents
    3.3.2 Template-based modeling
    3.3.3 Neuro-fuzzy modeling
    3.3.4 Fuzzy clustering
4 Overview of Applications
5 Summary and Concluding Remarks
A List of Symbols
Subject index

1 Introduction

Developing mathematical models of real systems is a central topic in many disciplines of engineering and science. Models can be used for simulations, analysis of the system's behavior, better understanding of the underlying mechanisms in the system, design of new processes, or design of controllers.

Traditionally, modeling is seen as a conjunction of a thorough understanding of the system's nature and behavior, and of a suitable mathematical treatment that leads to a usable model. This approach is usually termed white-box (physical, mechanistic, first-principle) modeling. However, the requirement for a good understanding of the physical background of the problem at hand proves to be a severe limiting factor in practice, when complex and poorly understood systems are considered. Difficulties encountered in conventional white-box modeling can arise, for instance, from poor understanding of the underlying phenomena, inaccurate values of various process parameters, or from the complexity of the resulting model. A complete understanding of the underlying mechanisms is virtually impossible for a majority of real systems, and gathering even an acceptable degree of knowledge needed for physical modeling may be a very difficult, time-consuming and expensive, or even impossible, task. Even if the structure of the model is determined, a major problem of obtaining accurate values for the parameters remains. It is the task of system identification to estimate the parameters from data measured on the system. Identification methods are currently developed to a mature level for linear systems only. Most real processes are, however, nonlinear and can be approximated by linear models only locally.

A different approach assumes that the process under study can be approximated by using some sufficiently general black-box structure used as a general function approximator. The modeling problem then reduces to postulating an appropriate structure of the approximator, in order to correctly capture the dynamics and nonlinearity of the system. In black-box modeling, the structure of the model is hardly related to the structure of the real system. The identification problem consists of estimating the parameters of the model. If representative process data are available, black-box models usually can be developed quite easily, without requiring process-specific knowledge. A severe drawback of this approach is that the structure and parameters of these models usually do not have any physical significance. Such models cannot be used for analyzing the system's behavior otherwise than by numerical simulation, cannot be scaled up or down when moving from one process scale to another, and therefore are less useful for industrial practice.

There is a range of modeling techniques that attempt to combine the advantages of the white-box and black-box approaches, such that the known parts of the system are modeled using physical knowledge, and the unknown or less certain parts are approximated in a black-box manner, using process data and black-box modeling structures with suitable approximation properties. These methods are often denoted as hybrid, semi-mechanistic or gray-box modeling.

A common drawback of most standard modeling approaches is that they cannot make effective use of extra information, such as the knowledge and experience of engineers and operators, which is often imprecise and qualitative in its nature. The fact that humans are often able to manage complex tasks under significant uncertainty has stimulated the search for alternative modeling and control paradigms. So-called intelligent modeling and control methodologies, which employ techniques motivated by biological systems and human intelligence to develop models and controllers for dynamic systems, have been introduced.
These techniques explore alternative representation schemes, using, for instance, natural language, rules, semantic networks or qualitative models, and possess formal methods to incorporate extra relevant information. Fuzzy modeling and control are typical examples of techniques that make use of human knowledge and deductive processes. Artificial neural networks, on the other hand, realize learning and adaptation capabilities by imitating the functioning of biological neural systems on a simplified level. The different modeling paradigms are summarized in Tab. 1.

Table 1. Different modeling paradigms.

modeling approach        source of information       method of acquisition        example                     deficiency
-----------------------  --------------------------  ---------------------------  --------------------------  ---------------------------
mechanistic (white-box)  formal knowledge and data   mathematical (Lagrange eq.)  differential equations      cannot use soft knowledge
black-box                data                        optimization (learning)      regression, neural network  cannot use knowledge at all
fuzzy                    various knowledge and data  knowledge-based + learning   rule-based model            curse of dimensionality

1.1 Fuzzy systems

A static or dynamic system which makes use of fuzzy sets or fuzzy logic and of the corresponding mathematical framework is called a fuzzy system. There are a number of ways fuzzy sets can be involved in a system, such as:

- In the description of the system. A system can be defined, for instance, as a collection of if-then rules with fuzzy predicates, or as a fuzzy relation. An example of a fuzzy rule describing the relationship between a heating power and the temperature trend in a room may be:

    If the heating power is high then the temperature will increase fast.

- In the specification of the system's parameters. The system can be defined by an algebraic or differential equation, in which the parameters are fuzzy numbers instead of real numbers. As an example consider the equation $y = \tilde{3} x_1 + \tilde{5} x_2$, where $\tilde{3}$ and $\tilde{5}$ are fuzzy numbers "about three" and "about five", respectively, defined by membership functions. Fuzzy numbers express the uncertainty in the parameter values.

- The input, output and state variables of a system may be fuzzy sets. Fuzzy inputs can be readings from unreliable sensors ("noisy" data), or quantities related to human perception, such as comfort, beauty, etc. Fuzzy systems can process such information, which is not the case with conventional (crisp) systems.

A fuzzy system can simultaneously have several of the above attributes. Table 2 gives an overview of the relationships between fuzzy and crisp system descriptions and variables. In this text we will focus on the last type of systems, i.e., fuzzily described systems with crisp or fuzzy inputs. By means of the extension principle a crisp function can be evaluated for a fuzzy argument (Zadeh, 1975).
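To make the remark about the extension principle concrete, the following minimal Python sketch evaluates a crisp function for a discrete fuzzy argument. It assumes the standard single-argument form of the principle, in which each image point y receives the largest membership degree among all x with f(x) = y. The function name `extend` and the two example fuzzy sets are hypothetical and are not taken from the text.

```python
def extend(f, fuzzy_set):
    """Image of a discrete fuzzy set under a crisp function f.

    fuzzy_set maps domain elements x to membership degrees mu_A(x).
    Each image point y = f(x) receives the largest membership degree
    among all x that are mapped onto it.
    """
    image = {}
    for x, mu in fuzzy_set.items():
        y = f(x)
        image[y] = max(image.get(y, 0.0), mu)
    return image


# Hypothetical fuzzy number "about three" on an integer domain.
about_three = {1: 0.0, 2: 0.5, 3: 1.0, 4: 0.5, 5: 0.0}
squared = extend(lambda x: x * x, about_three)

# Non-injective case: x = -1 and x = 1 are both mapped onto y = 1,
# so the larger of their two membership degrees is kept.
about_zero = {-1: 0.3, 0: 1.0, 1: 0.5}
even_image = extend(lambda x: x * x, about_zero)
```

For a one-to-one function the membership degrees are simply carried over to the image points; the maximum only matters when several domain elements share one image.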

Table 2. Crisp and fuzzy information in systems.

system description  input data   resulting output data  mathematical framework
------------------  -----------  ---------------------  -----------------------------------------
crisp               crisp        crisp                  functional analysis, linear algebra, etc.
crisp               fuzzy        fuzzy                  extension principle
fuzzy               crisp/fuzzy  fuzzy                  fuzzy relational calculus, fuzzy inference

Fuzzy systems can be regarded as a generalization of interval-valued systems, which are in turn a generalization of crisp systems. This is depicted in Fig. 1, which gives an example of a function and its interval and fuzzy forms. The evaluation of the function for crisp, interval and fuzzy data is schematically depicted as well. Note that a function $f: X \to Y$ can be regarded as a subset of the Cartesian product $X \times Y$, i.e., as a relation. The evaluation of the function for a given input proceeds in three steps: 1) extend the given input into the product space $X \times Y$ (vertical dashed lines in Fig. 1), 2) find the intersection of this extension with the relation, 3) project this intersection onto $Y$ (horizontal dashed lines in Fig. 1). This view is independent of the nature of both the function and the data (crisp, interval, fuzzy). Remember this view of function evaluation, as it will help you to understand the use of fuzzy relations for inference in fuzzy modeling.

Figure 1. Evaluation of a crisp, interval and fuzzy function for crisp, interval and fuzzy arguments. (Columns: crisp function, interval function, fuzzy function; rows: crisp argument, interval or fuzzy argument.)

Most common are fuzzy systems defined by means of if-then rules: rule-based fuzzy systems. In the rest of this text we will focus on these systems only. Fuzzy systems can serve different purposes, such as modeling, data analysis, prediction or control. In this text a fuzzy rule-based system is for simplicity called a fuzzy model, regardless of its eventual purpose.
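The three-step view of function evaluation (extend, intersect, project) can be sketched in Python on a small discretized domain. This is only an illustration under stated assumptions: the function `evaluate`, the domains and the relation for f(x) = 2x are hypothetical, and minimum and maximum are assumed for intersection and projection so that the same code covers crisp, interval and fuzzy arguments.

```python
def evaluate(relation, argument, X, Y):
    """Evaluate a relation on X x Y for an argument defined on X.

    relation[(x, y)] and argument[x] are membership degrees in [0, 1];
    crisp and interval data are the special case of degrees 0 and 1.
    """
    # Step 1: extend the argument into the product space X x Y.
    extension = {(x, y): argument.get(x, 0.0) for x in X for y in Y}
    # Step 2: intersect (minimum) the extension with the relation.
    intersection = {xy: min(mu, relation.get(xy, 0.0))
                    for xy, mu in extension.items()}
    # Step 3: project (maximum over x) the intersection onto Y.
    return {y: max(intersection[(x, y)] for x in X) for y in Y}


X = [0, 1, 2, 3]
Y = [0, 1, 2, 3, 4, 5, 6]

# Hypothetical crisp function f(x) = 2 * x, stored as a relation.
f_relation = {(x, 2 * x): 1.0 for x in X}

crisp_result = evaluate(f_relation, {1: 1.0}, X, Y)             # x = 1
interval_result = evaluate(f_relation, {1: 1.0, 2: 1.0}, X, Y)  # x in {1, 2}
fuzzy_result = evaluate(f_relation, {1: 0.5, 2: 1.0, 3: 0.5}, X, Y)
```

Replacing `f_relation` by a relation with intermediate degrees turns the same routine into the evaluation of a fuzzy function, which is the case used for inference later in the text.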

1.2 Practical relevance of fuzzy modeling

Incomplete or vague knowledge about systems. Conventional system theory relies on crisp mathematical models of systems, such as algebraic and differential or difference equations. For some systems, such as electro-mechanical systems, mathematical models can be obtained, because the physical laws governing the systems are well understood. For a large number of practical problems, however, the gathering of an acceptable degree of knowledge needed for physical modeling is a difficult, time-consuming and expensive, or even impossible, task. In the majority of systems, the underlying phenomena are understood only partially and crisp mathematical models cannot be derived or are too complex to be useful. Examples of such systems can be found in the chemical or food industries, biotechnology, ecology, finance, sociology, etc. A significant portion of information about these systems is available as the knowledge of human experts, process operators and designers. This knowledge may be too vague and uncertain to be expressed by mathematical functions. It is, however, often possible to describe the functioning of systems by means of natural language, in the form of if-then rules. Fuzzy rule-based systems can be used as knowledge-based models constructed by using knowledge of experts in the given field of interest (Pedrycz, 1990; Yager and Filev, 1994). From this point of view, fuzzy systems are similar to expert systems studied extensively in symbolic artificial intelligence (Buchanan and Shortliffe, 1984; Patterson, 1990).

Adequate processing of imprecise information. Precise numerical computation with conventional mathematical models only makes sense when the parameters and input data are accurately known. As this is often not the case, a modeling framework is needed which can adequately process not only the given data, but also the associated uncertainty. The stochastic approach is a traditional way of dealing with uncertainty. However, it has been recognized that not all types of uncertainty can be dealt with within the stochastic framework. Various alternative approaches have been proposed (Smets et al., 1988), fuzzy logic and set theory being one of them.

Transparent (gray-box) modeling and identification. Identification of dynamic systems from input-output measurements is an important topic of scientific research with a wide range of practical applications. Many real-world systems are inherently nonlinear and cannot be represented by linear models used in conventional system identification (Ljung, 1987). Recently, there has been a strong focus on the development of methods for the identification of nonlinear systems from measured data. Artificial neural networks and fuzzy models belong to the most popular model structures used. From the input-output view, fuzzy systems are flexible mathematical functions which can approximate other functions or just data (measurements) with a desired accuracy. This property is called general function approximation (Kosko, 1994; Wang, 1994; Zeng and Singh, 1995). Compared to other well-known approximation techniques such as artificial neural networks, fuzzy systems provide a more transparent representation of the system under study, which is mainly due to the possible linguistic interpretation in the form of rules. The logical structure of the rules facilitates the understanding and analysis of the model in a semi-qualitative manner, close to the way humans reason about the real world.

2 Rule-Based Fuzzy Models

In rule-based fuzzy systems, the relationships between variables are represented by means of fuzzy if-then rules of the following general form:

    If antecedent proposition then consequent proposition.

The antecedent proposition is always a fuzzy proposition of the type "$\tilde{x}$ is $A$", where $\tilde{x}$ is a linguistic variable and $A$ is a linguistic constant (term). The proposition's truth value (a real number between zero and one) depends on the degree of match (similarity) between $\tilde{x}$ and $A$. Depending on the form of the consequent, two main types of rule-based fuzzy models are distinguished:

- Linguistic fuzzy model: both the antecedent and the consequent are fuzzy propositions.
- Takagi-Sugeno (TS) fuzzy model: the antecedent is a fuzzy proposition, the consequent is a crisp function.

These two types of fuzzy models are detailed in the subsequent sections.

2.1 Linguistic fuzzy model

The linguistic fuzzy model (Zadeh, 1973; Mamdani, 1977) has been introduced as a way to capture available (semi-)qualitative knowledge in the form of if-then rules:

$$R_i:\ \text{If } \tilde{x} \text{ is } A_i \text{ then } \tilde{y} \text{ is } B_i, \qquad i = 1, 2, \ldots, K. \tag{1}$$

Here $\tilde{x}$ is the input (antecedent) linguistic variable, and $A_i$ are the antecedent linguistic terms (constants). Similarly, $\tilde{y}$ is the output (consequent) linguistic variable and $B_i$ are the consequent linguistic terms. The values of $\tilde{x}$ ($\tilde{y}$) and the linguistic terms $A_i$ ($B_i$) are fuzzy sets defined in the domains of their respective base variables: $\mathbf{x} \in X \subset \mathbb{R}^p$ and $\mathbf{y} \in Y \subset \mathbb{R}^q$. (A base variable is the domain variable in which fuzzy sets are defined.) The membership functions of the antecedent (consequent) fuzzy sets are then the mappings $\mu(\mathbf{x}): X \to [0, 1]$, $\mu(\mathbf{y}): Y \to [0, 1]$. Fuzzy sets $A_i$ define fuzzy regions in the antecedent space, for which the respective consequent propositions hold. The linguistic terms $A_i$ and $B_i$ are usually selected from sets of predefined terms, such as Small, Medium, etc. By denoting these sets by $\mathcal{A}$ and $\mathcal{B}$ respectively, we have $A_i \in \mathcal{A}$ and $B_i \in \mathcal{B}$. The rule base $\mathcal{R} = \{R_i \mid i = 1, 2, \ldots, K\}$ and the sets $\mathcal{A}$ and $\mathcal{B}$ constitute the knowledge base of the linguistic model.

Example 2.1 Consider a simple fuzzy model which qualitatively describes how the heating power of a gas burner depends on the oxygen supply (assuming a constant gas supply). We have a scalar input, the oxygen flow rate ($x$), and a scalar output, the heating power ($y$). Define the set of antecedent linguistic terms: $\mathcal{A} = \{\text{Low}, \text{OK}, \text{High}\}$, and the set of consequent linguistic terms: $\mathcal{B} = \{\text{Low}, \text{High}\}$. The qualitative relationship between the model input and output can be expressed by the following rules:

    R_1: If O2 flow rate is Low then heating power is Low.
    R_2: If O2 flow rate is OK then heating power is High.
    R_3: If O2 flow rate is High then heating power is Low.

The meaning of the linguistic terms is defined by their membership functions, depicted in Fig. 2. The numerical values along the base variables are selected somewhat arbitrarily. Note that no universal meaning of the linguistic terms can be defined. For this example, it will depend on the type and flow rate of the fuel gas, type of burner, etc. Nevertheless, the qualitative relationship expressed by the rules remains valid.

Figure 2. Membership functions. (Left: terms Low, OK and High over the O2 flow rate [m^3/s]; right: terms Low and High over the heating power [W].)

In order to be able to use the linguistic model, we need an algorithm which allows us to compute the output value, given some input value. This algorithm is called the fuzzy inference algorithm (or mechanism). For the linguistic model, the inference mechanism can be derived by using fuzzy relational calculus, as shown in the following section.

2.1.1 Relational representation of a linguistic model

Each rule in (1) can be regarded as a fuzzy relation (fuzzy restriction on the simultaneous occurrences of values $\mathbf{x}$ and $\mathbf{y}$): $R_i: (X \times Y) \to [0, 1]$. This relation can be computed in two basic ways: by using fuzzy conjunctions (Mamdani method) and by using fuzzy implications (fuzzy logic method), see for instance (Driankov et al., 1993). Fuzzy implications are used when the if-then rule (1) is strictly regarded as an implication $A_i \to B_i$, i.e., "A implies B". In classical logic this means that if A holds, B must hold as well for the implication to be true. Nothing can, however, be said about B when A does not hold, and the relationship also cannot be inverted. When using a conjunction, $A \wedge B$, the interpretation of the if-then rules is "it is true that A and B simultaneously hold". This relationship is symmetric and can be inverted. For simplicity, in this text we restrict ourselves to the Mamdani (conjunction) method.
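The relation $R_i$ is computed by the minimum ($\wedge$) operator:

$$R_i = A_i \times B_i, \quad \text{that is,} \quad \mu_{R_i}(\mathbf{x}, \mathbf{y}) = \mu_{A_i}(\mathbf{x}) \wedge \mu_{B_i}(\mathbf{y}). \tag{2}$$

Note that the minimum is computed on the Cartesian product space of $X$ and $Y$, i.e., for all possible pairs of $\mathbf{x}$ and $\mathbf{y}$. The fuzzy relation $R$ representing the entire model (1) is given by the disjunction (union) of the $K$ individual rules' relations $R_i$:

$$R = \bigcup_{i=1}^{K} R_i, \quad \text{that is,} \quad \mu_{R}(\mathbf{x}, \mathbf{y}) = \max_{1 \le i \le K} \left[ \mu_{A_i}(\mathbf{x}) \wedge \mu_{B_i}(\mathbf{y}) \right]. \tag{3}$$

Now the entire rule base is encoded in the fuzzy relation $R$ and the output of the linguistic model can be computed by the relational max-min composition ($\circ$):

$$\tilde{y} = \tilde{x} \circ R. \tag{4}$$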

Example 2.2 Let us compute the fuzzy relation for the linguistic model of Example 2.1. First we discretize the input and output domains, for instance: $X = \{0, 1, 2, 3\}$ and $Y = \{0, 25, 50, 75, 100\}$. The (discrete) membership functions are given in Tab. 3 for the antecedent linguistic terms, and in Tab. 4 for the consequent terms.

Table 3. Antecedent membership functions.

                   domain element
linguistic term    0      1      2      3
Low                1.0    0.6    0.0    0.0
OK                 0.0    0.4    1.0    0.4
High               0.0    0.0    0.1    1.0

Table 4. Consequent membership functions.

                   domain element
linguistic term    0      25     50     75     100
Low                1.0    1.0    0.6    0.0    0.0
High               0.0    0.0    0.3    0.9    1.0

The fuzzy relations $R_i$ corresponding to the individual rules can now be computed by using eq. (2). For rule $R_1$, we have $R_1 = \text{Low} \times \text{Low}$, for rule $R_2$, we obtain $R_2 = \text{OK} \times \text{High}$, and finally for rule $R_3$, $R_3 = \text{High} \times \text{Low}$. The fuzzy relation $R$, which represents the entire rule base, is the union (element-wise maximum) of the relations $R_i$:

$$
R_1 = \begin{bmatrix} 1.0 & 1.0 & 0.6 & 0 & 0 \\ 0.6 & 0.6 & 0.6 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \quad
R_2 = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0.3 & 0.4 & 0.4 \\ 0 & 0 & 0.3 & 0.9 & 1.0 \\ 0 & 0 & 0.3 & 0.4 & 0.4 \end{bmatrix}, \quad
R_3 = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0.1 & 0.1 & 0.1 & 0 & 0 \\ 1.0 & 1.0 & 0.6 & 0 & 0 \end{bmatrix}
$$

$$
R = \begin{bmatrix} 1.0 & 1.0 & 0.6 & 0 & 0 \\ 0.6 & 0.6 & 0.6 & 0.4 & 0.4 \\ 0.1 & 0.1 & 0.3 & 0.9 & 1.0 \\ 1.0 & 1.0 & 0.6 & 0.4 & 0.4 \end{bmatrix}. \tag{5}
$$

Graphical visualization of these steps is given in Fig. 3. In this figure, the relations are computed on a finer discretization by using the membership functions of Fig. 2. This example can be run under MATLAB by calling the script ling. See the file ling.m for details of the implementation.

Now consider an input fuzzy set to the model, $A' = [1, 0.6, 0.3, 0]$, which can be denoted as Somewhat Low flow rate, as it is close to Low but does not equal Low. The result of max-min composition is the fuzzy set $B' = [1, 1, 0.6, 0.4, 0.4]$, which gives the expected approximately Low heating power. For $A' = [0, 0.2, 1, 0.2]$ (approximately OK), we obtain $B' = [0.2, 0.2, 0.3, 0.9, 1]$, i.e., approximately High heating power. Verify these results as an exercise.
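The numbers of Example 2.2 can be checked with a short Python sketch. It is not the author's ling.m script; it is an independent illustration that assumes the membership degrees of Tab. 3 and Tab. 4 and implements eqs. (2) to (4) with plain lists. The function names `rule_relation`, `union` and `compose` are hypothetical.

```python
# Membership degrees on X = {0, 1, 2, 3} (Tab. 3) and
# Y = {0, 25, 50, 75, 100} (Tab. 4).
antecedent = {"Low": [1.0, 0.6, 0.0, 0.0],
              "OK": [0.0, 0.4, 1.0, 0.4],
              "High": [0.0, 0.0, 0.1, 1.0]}
consequent = {"Low": [1.0, 1.0, 0.6, 0.0, 0.0],
              "High": [0.0, 0.0, 0.3, 0.9, 1.0]}
rules = [("Low", "Low"), ("OK", "High"), ("High", "Low")]


def rule_relation(a, b):
    """Eq. (2): minimum over the Cartesian product of A_i and B_i."""
    return [[min(ai, bj) for bj in b] for ai in a]


def union(relations):
    """Eq. (3): element-wise maximum of the rule relations."""
    rows, cols = len(relations[0]), len(relations[0][0])
    return [[max(r[i][j] for r in relations) for j in range(cols)]
            for i in range(rows)]


def compose(a_in, relation):
    """Eq. (4): max-min composition of an input fuzzy set with R."""
    return [max(min(a_in[i], relation[i][j]) for i in range(len(a_in)))
            for j in range(len(relation[0]))]


R = union([rule_relation(antecedent[a], consequent[b]) for a, b in rules])
somewhat_low = compose([1.0, 0.6, 0.3, 0.0], R)
approx_ok = compose([0.0, 0.2, 1.0, 0.2], R)
```

With this discretization, `R` has 4 x 5 entries. A finer grid or more inputs makes the relation grow quickly, which is the motivation for bypassing it in the next section.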

Figure 3. Fuzzy relations $R_1$, $R_2$, $R_3$ corresponding to the individual rules, and the aggregated relation $R$ corresponding to the entire rule base. (Panels: R1 = Low and Low; R2 = OK and High; R3 = High and Low; R = R1 or R2 or R3.)

Because of the relational representation, the linguistic fuzzy model is sometimes called a fuzzy graph. Figure 4 shows the fuzzy graph for our example (contours of $R$, where the shading corresponds to the membership degree). The relational composition (4) can be regarded as a function evaluation on the fuzzy graph, see also Fig. 1.

2.1.2 Max-min (Mamdani) inference

In the previous section, we have seen that a rule base can be represented as a fuzzy relation. The output of a rule-based fuzzy model is then computed by the max-min relational composition. In this section, it will be shown that the relational calculus can be by-passed. This is advantageous, as the discretization of domains and storing of the relation $R$ can be avoided. To show this, suppose an input fuzzy value $\tilde{x} = A'$, for which the output value $B'$ is given by the relational composition:

$$\mu_{B'}(\mathbf{y}) = \max_{X} \left[ \mu_{A'}(\mathbf{x}) \wedge \mu_{R}(\mathbf{x}, \mathbf{y}) \right]. \tag{6}$$

After substituting for $\mu_{R}(\mathbf{x}, \mathbf{y})$ from (3), the following expression is obtained:

$$\mu_{B'}(\mathbf{y}) = \max_{X} \left\{ \mu_{A'}(\mathbf{x}) \wedge \max_{1 \le i \le K} \left[ \mu_{A_i}(\mathbf{x}) \wedge \mu_{B_i}(\mathbf{y}) \right] \right\}. \tag{7}$$

Since the max and min operations are taken over different domains, their order can be changed as follows:

$$\mu_{B'}(\mathbf{y}) = \max_{1 \le i \le K} \left\{ \max_{X} \left[ \mu_{A'}(\mathbf{x}) \wedge \mu_{A_i}(\mathbf{x}) \right] \wedge \mu_{B_i}(\mathbf{y}) \right\}. \tag{8}$$

Figure 4. A fuzzy graph for the linguistic model of Example 2.2. Darker shading corresponds to higher membership degree. The solid line is a possible crisp function representing a similar relationship as the fuzzy model.

Denote $\beta_i = \max_{X} \left[ \mu_{A'}(\mathbf{x}) \wedge \mu_{A_i}(\mathbf{x}) \right]$ the degree of fulfillment of the $i$th rule's antecedent. The output fuzzy set of the linguistic model is thus:

$$\mu_{B'}(\mathbf{y}) = \max_{1 \le i \le K} \left[ \beta_i \wedge \mu_{B_i}(\mathbf{y}) \right], \quad \mathbf{y} \in Y. \tag{9}$$

The entire algorithm, called the max-min or Mamdani inference, is summarized in Algorithm 2.1 and visualized in Fig. 5.

Figure 5. A schematic representation of the Mamdani inference algorithm. (Model: three rules "If $\tilde{x}$ is $A_i$ then $\tilde{y}$ is $B_i$", $i = 1, 2, 3$; data: "$\tilde{x}$ is $A'$"; result: "$\tilde{y}$ is $B'$". Step 1 gives the degrees $\beta_1, \beta_2, \beta_3$, Step 2 the clipped sets $B'_1, B'_2, B'_3$, and Step 3 the aggregated set $B'$.)
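The result (9) can be sketched directly in Python, without forming the relation R. The code below is a minimal illustration under stated assumptions: fuzzy sets are given as lists of membership degrees on a common discrete domain, the function name `mamdani` is hypothetical, and the rule base reuses the membership degrees of Tab. 3 and Tab. 4 from Example 2.2.

```python
def mamdani(a_in, rules):
    """Max-min inference without building the relation R.

    a_in  -- membership degrees of the input fuzzy set A' on X
    rules -- list of (A_i, B_i) pairs of membership-degree lists
    """
    out = None
    for a_i, b_i in rules:
        # Degree of fulfillment: beta_i = max over x of [A'(x) ^ A_i(x)].
        beta = max(min(p, q) for p, q in zip(a_in, a_i))
        # Clipped consequent: beta_i ^ B_i(y) for every y.
        clipped = [min(beta, m) for m in b_i]
        # Aggregation by maximum over the rules, eq. (9).
        if out is None:
            out = clipped
        else:
            out = [max(u, v) for u, v in zip(out, clipped)]
    return out


low_x, ok_x, high_x = [1.0, 0.6, 0.0, 0.0], [0.0, 0.4, 1.0, 0.4], [0.0, 0.0, 0.1, 1.0]
low_y, high_y = [1.0, 1.0, 0.6, 0.0, 0.0], [0.0, 0.0, 0.3, 0.9, 1.0]
rule_base = [(low_x, low_y), (ok_x, high_y), (high_x, low_y)]

b_out = mamdani([1.0, 0.6, 0.3, 0.0], rule_base)
```

Only one number per rule (the degree of fulfillment) has to be stored between the steps, so the memory need no longer depends on the size of the product space X x Y.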

Algorithm 2.1 Mamdani (max-min) inference

1. Compute the degree of fulfillment by: $\beta_i = \max_{X} \left[ \mu_{A'}(\mathbf{x}) \wedge \mu_{A_i}(\mathbf{x}) \right]$, $1 \le i \le K$. Note that for a singleton fuzzy set ($\mu_{A'}(\mathbf{x}) = 1$ for $\mathbf{x} = \mathbf{x}_0$ and $\mu_{A'}(\mathbf{x}) = 0$ otherwise) the equation for $\beta_i$ simplifies to $\beta_i = \mu_{A_i}(\mathbf{x}_0)$.
2. Derive the output fuzzy sets $B'_i$: $\mu_{B'_i}(\mathbf{y}) = \beta_i \wedge \mu_{B_i}(\mathbf{y})$, $\mathbf{y} \in Y$, $1 \le i \le K$.
3. Aggregate the output fuzzy sets $B'_i$: $\mu_{B'}(\mathbf{y}) = \max_{1 \le i \le K} \mu_{B'_i}(\mathbf{y})$, $\mathbf{y} \in Y$.

Example 2.3 Let us take the input fuzzy set $A' = [1, 0.6, 0.3, 0]$ from Example 2.2 and compute the corresponding output fuzzy set by the Mamdani inference method. Step 1 yields the following degrees of fulfillment:

$$\beta_1 = \max_{X} \left[ \mu_{A'}(x) \wedge \mu_{A_1}(x) \right] = \max\left( [1, 0.6, 0.3, 0] \wedge [1, 0.6, 0, 0] \right) = 1.0, \tag{10}$$

$$\beta_2 = \max_{X} \left[ \mu_{A'}(x) \wedge \mu_{A_2}(x) \right] = \max\left( [1, 0.6, 0.3, 0] \wedge [0, 0.4, 1, 0.4] \right) = 0.4, \tag{11}$$

$$\beta_3 = \max_{X} \left[ \mu_{A'}(x) \wedge \mu_{A_3}(x) \right] = \max\left( [1, 0.6, 0.3, 0] \wedge [0, 0, 0.1, 1] \right) = 0.1. \tag{12}$$

In step 2, the individual consequent fuzzy sets are computed:

$$B'_1 = \beta_1 \wedge B_1 = 1.0 \wedge [1, 1, 0.6, 0, 0] = [1, 1, 0.6, 0, 0], \tag{13}$$

$$B'_2 = \beta_2 \wedge B_2 = 0.4 \wedge [0, 0, 0.3, 0.9, 1] = [0, 0, 0.3, 0.4, 0.4], \tag{14}$$

$$B'_3 = \beta_3 \wedge B_3 = 0.1 \wedge [1, 1, 0.6, 0, 0] = [0.1, 0.1, 0.1, 0, 0]. \tag{15}$$

Finally, step 3 gives the overall output fuzzy set:

$$B' = \max_{1 \le i \le K} \mu_{B'_i} = [1, 1, 0.6, 0.4, 0.4],$$

which is identical to the result from Example 2.2. Verify the result for the second input fuzzy set as an exercise.

From a comparison of the number of operations in Examples 2.2 and 2.3, it may seem that the saving with the Mamdani inference method with regard to relational composition is not significant. This is, however, only true for a rough discretization (such as the one used in Example 2.2) and for a small number of inputs (one in this case). Note that the Mamdani inference method does not require any discretization and thus can work with analytically defined membership functions. It also can make use of learning algorithms, as discussed in Section 3.3.3.

2.1.3 Multivariable systems

So far, the linguistic model was presented in a general manner covering both the SISO and MIMO cases. In the MIMO case, all fuzzy sets in the model are defined on vector domains by multivariate membership functions.
It is, however, usually more convenient to write the antecedent and consequent propositions as logical combinations of fuzzy propositions with univariate membership functions. Fuzzy logic operators, such as the conjunction, disjunction and negation (complement), can be used to combine the propositions. Furthermore, a MIMO model can be written as a set of MISO models. Therefore, for the ease of notation, we will write the rules for MISO systems. Most common is the conjunctive form of the antecedent, which is given by:

$$R_i:\ \text{If } x_1 \text{ is } A_{i1} \text{ and } x_2 \text{ is } A_{i2} \text{ and } \ldots \text{ and } x_p \text{ is } A_{ip} \text{ then } y \text{ is } B_i, \qquad i = 1, 2, \ldots, K. \tag{16}$$

Note that the above model is a special case of (1), as the fuzzy set $A_i$ in (1) is obtained as the Cartesian product of fuzzy sets $A_{ij}$: $A_i = A_{i1} \times A_{i2} \times \cdots \times A_{ip}$. Hence, the degree of fulfillment (step 1 of Algorithm 2.1) is given by:

$$\beta_i = \mu_{A_{i1}}(x_1) \wedge \mu_{A_{i2}}(x_2) \wedge \cdots \wedge \mu_{A_{ip}}(x_p), \qquad 1 \le i \le K. \tag{17}$$

Other conjunction operators, such as the product, can be used. A set of rules in the conjunctive antecedent form divides the input domain into a lattice of fuzzy hyperboxes, parallel with the axes. Each of the hyperboxes is a Cartesian product-space intersection of the corresponding univariate fuzzy sets. This is shown in Fig. 6a. The number of rules in the conjunctive form, needed to cover the entire domain, is given by:

$$K = \prod_{i=1}^{p} N_i,$$

where $p$ is the dimension of the input space and $N_i$ is the number of linguistic terms of the $i$th antecedent variable.

Figure 6. Different partitions of the antecedent space, panels (a), (b) and (c), drawn over the axes $x_1$ and $x_2$. Gray areas denote the overlapping regions of the fuzzy sets.

By combining conjunctions, disjunctions and negations, various partitions of the antecedent space can be obtained. The boundaries are, however, restricted to the rectangular grid defined by the fuzzy sets of the individual variables, see Fig. 6b. As an example consider the rule antecedent covering the lower left corner of the antecedent space in this figure:

    If x_1 is not A_13 and x_2 is A_21 then ...

The degree of fulfillment of this rule is computed using the complement and intersection operators:

$$\beta = \left[ 1 - \mu_{A_{13}}(x_1) \right] \wedge \mu_{A_{21}}(x_2). \tag{18}$$

The antecedent form with multivariate membership functions (1) is the most general one, as there is no restriction on the shape of the fuzzy regions.
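Looking back at the conjunctive form, the degree of fulfillment (17), its variant with a negated proposition (18) and the rule count K can be sketched in a few lines of Python. The function names and the numerical membership degrees below are hypothetical and serve only as an illustration; the product is shown as one alternative conjunction operator.

```python
from math import prod


def fulfillment(memberships, conjunction=min):
    """Eq. (17): combine the univariate degrees mu_Aij(x_j) of one rule."""
    result = memberships[0]
    for mu in memberships[1:]:
        result = conjunction(result, mu)
    return result


def number_of_rules(terms_per_input):
    """K = N_1 * N_2 * ... * N_p for a complete conjunctive rule base."""
    return prod(terms_per_input)


# Hypothetical membership degrees for a crisp input (x1, x2).
mu_a13_x1 = 0.25   # degree to which x1 is A_13
mu_a21_x2 = 0.75   # degree to which x2 is A_21

beta_min = fulfillment([0.5, 0.75])                         # minimum
beta_prod = fulfillment([0.5, 0.75], lambda u, v: u * v)    # product
beta_not = fulfillment([1.0 - mu_a13_x1, mu_a21_x2])        # eq. (18)

# Two inputs with three linguistic terms each need 3 * 3 rules.
k_grid = number_of_rules([3, 3])
```

The rule count grows as a product over the inputs, which is why the alternative partitions and decompositions discussed in this section matter for systems with many inputs.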
The boundaries between these regions can be arbitrarily curved and oblique to the axes, as depicted in Fig. 6c. Also the number of fuzzy sets needed to cover the antecedent space may be much smaller than in the previous cases. Hence, for complex multivariable systems, this partition may provide the most effective representation. Note that the fuzzy sets $A_1$ to $A_4$ in Fig. 6c still can be projected onto $X_1$ and $X_2$ to obtain an approximate linguistic interpretation of the regions described.

Another way of reducing the complexity of multivariable fuzzy systems is the decomposition into subsystems with fewer inputs per rule base. The subsystems can be inter-connected in a flat or hierarchical (multi-layer) structure. In such a case, an output of one rule base becomes an input to another rule base, as depicted in Fig. 7. This cascade connection will lead to the reduction of the total number of rules. As an example, suppose five linguistic terms for each input. Using the conjunctive form, each of the two sub-rule bases will have $5^2 = 25$ rules, i.e., 50 rules in total. This is a significant saving compared to a single rule base with three inputs, which would have $5^3 = 125$ rules.

Figure 7. Cascade connection of two rule bases (rule base A feeding rule base B, which produces the output z).

2.1.4 Defuzzification

In many applications, a crisp output $y$ is desired. To obtain a crisp value, the output fuzzy set must be defuzzified. With the Mamdani inference scheme, the center of gravity (COG) defuzzification method is used. This method computes the $y$ coordinate of the center of gravity of the area under the fuzzy set $B'$:

$$y' = \operatorname{cog}(B') = \frac{\sum_{j=1}^{F} \mu_{B'}(y_j)\, y_j}{\sum_{j=1}^{F} \mu_{B'}(y_j)}, \tag{19}$$

where $F$ is the number of elements $y_j$ in $Y$. A continuous domain $Y$ thus must be discretized to be able to compute the center of gravity.

Example 2.4 Consider the output fuzzy set $B' = [0.2, 0.2, 0.3, 0.9, 1]$ from Example 2.2, where the output domain is $Y = [0, 25, 50, 75, 100]$. The defuzzified output obtained by applying formula (19) is:

$$y' = \frac{0.2 \cdot 0 + 0.2 \cdot 25 + 0.3 \cdot 50 + 0.9 \cdot 75 + 1 \cdot 100}{0.2 + 0.2 + 0.3 + 0.9 + 1} = \frac{187.5}{2.6} = 72.12.$$

The heating power of the burner, computed by the fuzzy model, is thus 72.12 W.

2.1.5 Singleton model

A special case of the linguistic fuzzy model is obtained when the consequent fuzzy sets $B_i$ are singleton fuzzy sets. These sets can be represented simply as real numbers $b_i$, yielding the following rules:

$$R_i:\ \text{If } \tilde{x} \text{ is } A_i \text{ then } y = b_i, \qquad i = 1, 2, \ldots, K. \tag{20}$$

The model defined by the rules (20) is called the singleton model. A simplified inference/defuzzification method is usually used with this model:

$$ y = \frac{\sum_{i=1}^{K} \beta_i b_i}{\sum_{i=1}^{K} \beta_i}, \tag{21} $$

where $\beta_i$ is the degree of fulfillment of the $i$th rule antecedent. This defuzzification method is called the fuzzy mean. The singleton fuzzy model belongs to a general class of function approximators, called the basis function expansion (Friedman, 1991), taking the form:

$$ y = \sum_{i=1}^{K} \phi_i(\mathbf{x})\, b_i. \tag{22} $$

Most structures used in nonlinear system identification, such as artificial neural networks, radial basis function networks, or splines, belong to this class of systems. Connections between these types of models have been investigated (Jang and Sun, 1993; Brown and Harris, 1994). In the singleton model, the basis functions $\phi_i(\mathbf{x})$ are given by the (normalized) degrees of fulfillment of the rule antecedents, and the constants $b_i$ are the consequents. Multilinear interpolation between the rule consequents is obtained if the antecedent membership functions are trapezoidal and pairwise overlapping, the membership degrees sum up to one for each domain element, and the product operator is used to represent the logical "and" connective in the rule antecedents. The input-output mapping of the singleton model is then piecewise (multi-)linear, as shown in Fig. 8a.

Figure 8. A singleton model with triangular or trapezoidal membership functions results in a piecewise linear input-output mapping (a), of which a linear mapping is a special case (b).

Clearly, a singleton model can also represent any linear mapping of the form:

$$ y = \mathbf{p}^T \mathbf{x} + q = \sum_{i=1}^{p} p_i x_i + q. \tag{23} $$

In this case, the antecedent membership functions must be triangular. The consequent singletons can be computed by evaluating the desired mapping (23) for the cores $a_{ij}$ of the antecedent fuzzy sets $A_{ij}$:

$$ b_i = \sum_{j=1}^{p} p_j a_{ij} + q. \tag{24} $$

This situation is depicted in Fig. 8b. This property is useful, as the (singleton) fuzzy model can always be initialized such that it mimics a given (perhaps inaccurate) linear model and can later be optimized.

2.2 Takagi-Sugeno model

The linguistic model, introduced in the previous section, describes a given system by means of linguistic if-then rules with fuzzy propositions in the antecedent as well as in the consequent. The Takagi-Sugeno (TS) fuzzy model (Takagi and Sugeno, 1985), on the other hand, uses crisp functions in the consequents. Hence, it can be seen as a combination of linguistic and mathematical regression modeling in the sense that the antecedents describe fuzzy regions in the input space in which the consequent functions are valid. The TS rules have the following form:

$$ R_i:\ \text{If } \mathbf{x} \text{ is } A_i \text{ then } y_i = f_i(\mathbf{x}), \qquad i = 1, 2, \ldots, K. \tag{25} $$

Contrary to the linguistic model, the input $\mathbf{x}$ is a crisp variable (linguistic inputs are in principle possible, but would require the use of the extension principle (Zadeh, 1975) to compute the fuzzy value of $y_i$). The functions $f_i$ are typically of the same structure; only the parameters in each rule are different. Generally, $f_i$ is a vector-valued function, but for ease of notation we will consider a scalar $f_i$ in the sequel. A simple and practically useful parameterization is the affine (linear in parameters) form, yielding the rules:

$$ R_i:\ \text{If } \mathbf{x} \text{ is } A_i \text{ then } y_i = \mathbf{a}_i^T \mathbf{x} + b_i, \qquad i = 1, 2, \ldots, K, \tag{26} $$

where $\mathbf{a}_i$ is a parameter vector and $b_i$ is a scalar offset. This model is called an affine TS model. Note that if $\mathbf{a}_i = \mathbf{0}$ for each $i$, the singleton model (20) is obtained.

2.2.1 Inference mechanism

The inference formula of the TS model is a straightforward extension of the singleton model inference (21):

$$ y = \frac{\sum_{i=1}^{K} \beta_i y_i}{\sum_{i=1}^{K} \beta_i} = \frac{\sum_{i=1}^{K} \beta_i (\mathbf{a}_i^T \mathbf{x} + b_i)}{\sum_{i=1}^{K} \beta_i}. \tag{27} $$

When the antecedent fuzzy sets define distinct but overlapping regions in the antecedent space and the parameters $\mathbf{a}_i$ and $b_i$ correspond to a local linearization of a nonlinear function, the TS model can be regarded as a smoothed piece-wise approximation of that function, see Fig. 9.

2.2.2 TS model as a quasi-linear system

The affine TS model can be regarded as a quasi-linear system (i.e., a linear system with input-dependent parameters). To see this, denote the normalized degree of fulfillment by

$$ \gamma_i(\mathbf{x}) = \frac{\beta_i(\mathbf{x})}{\sum_{j=1}^{K} \beta_j(\mathbf{x})}. \tag{28} $$

Figure 9. Takagi-Sugeno fuzzy model as a smoothed piece-wise linear approximation of a nonlinear function.

Here we write $\gamma_i(\mathbf{x})$ explicitly as a function of $\mathbf{x}$ to stress that the TS model is a quasi-linear model of the following form:

$$ y = \left( \sum_{i=1}^{K} \gamma_i(\mathbf{x})\, \mathbf{a}_i^T \right) \mathbf{x} + \sum_{i=1}^{K} \gamma_i(\mathbf{x})\, b_i = \mathbf{a}^T(\mathbf{x})\, \mathbf{x} + b(\mathbf{x}). \tag{29} $$

The parameters $\mathbf{a}(\mathbf{x})$, $b(\mathbf{x})$ are convex linear combinations of the consequent parameters $\mathbf{a}_i$ and $b_i$, i.e.:

$$ \mathbf{a}(\mathbf{x}) = \sum_{i=1}^{K} \gamma_i(\mathbf{x})\, \mathbf{a}_i, \qquad b(\mathbf{x}) = \sum_{i=1}^{K} \gamma_i(\mathbf{x})\, b_i. \tag{30} $$

In this sense, a TS model can be regarded as a mapping from the antecedent (input) space to a convex region (polytope) in the space of the parameters of a quasi-linear system, as schematically depicted in Fig. 10.

Figure 10. A TS model with affine consequents can be regarded as a mapping from the antecedent space to the space of the consequent parameters.

This property facilitates the analysis of TS models in a framework similar to that of linear systems. Methods have been developed to design controllers with desired closed-loop characteristics (Filev, 1996) and to analyze their stability (Tanaka and Sugeno, 1992; Zhao, 1995; Tanaka et al., 1996).
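The inference formula (27), its singleton special case (21) with the initialization (24), and the quasi-linear form (29)-(30) can be illustrated with a minimal Python sketch for a scalar input. The helper names (`tri_partition`, `ts_output`, `quasi_linear_params`), the cores and the consequent parameters are hypothetical choices made for illustration only; triangular membership functions that sum to one are assumed.

```python
def tri_partition(cores):
    """Triangular fuzzy partition with the given (increasing) cores.

    Returns a function x -> list of membership degrees that sum to one.
    """
    def degrees(x):
        x = min(max(x, cores[0]), cores[-1])  # saturate outside the domain
        mu = [0.0] * len(cores)
        for i in range(len(cores) - 1):
            left, right = cores[i], cores[i + 1]
            if left <= x <= right:
                w = (x - left) / (right - left)
                mu[i], mu[i + 1] = 1.0 - w, w
                break
        return mu
    return degrees


def ts_output(x, degrees, a, b):
    """Affine TS inference, eq. (27), for a scalar input x."""
    beta = degrees(x)
    return sum(bt * (ai * x + bi) for bt, ai, bi in zip(beta, a, b)) / sum(beta)


def quasi_linear_params(x, degrees, a, b):
    """Input-dependent parameters a(x), b(x) of eqs. (28)-(30)."""
    beta = degrees(x)
    gamma = [bt / sum(beta) for bt in beta]
    return (sum(g * ai for g, ai in zip(gamma, a)),
            sum(g * bi for g, bi in zip(gamma, b)))


cores = [0.0, 1.0, 2.0, 4.0]
degrees = tri_partition(cores)

# Singleton model (a_i = 0) initialized from the linear map y = p*x + q, eq. (24).
p, q = 2.0, -1.0
singletons = [p * c + q for c in cores]
y_singleton = ts_output(2.5, degrees, [0.0] * 4, singletons)

# Affine TS model with hypothetical local models; y equals a(x)*x + b(x), eq. (29).
a = [0.5, 1.0, 2.0, 3.0]
b = [0.0, -0.5, -2.5, -6.5]
a_x, b_x = quasi_linear_params(2.5, degrees, a, b)
y_ts = ts_output(2.5, degrees, a, b)
print(y_singleton, y_ts, a_x * 2.5 + b_x)
```

At $x = 2.5$ the singleton model returns $2 \cdot 2.5 - 1 = 4$, i.e., it reproduces the linear mapping between the cores, and the TS output coincides with $a(x)\,x + b(x)$.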

2.3 Modeling dynamic systems

Before discussing dynamic fuzzy models, let us recall that time-invariant dynamic systems are in general modeled by static functions, by using the concept of the system's state. Given the state of a system and given its input, we can determine what the next state will be. In the discrete-time setting we can write

$$ \mathbf{x}(k+1) = f(\mathbf{x}(k), \mathbf{u}(k)), \tag{31} $$

where $\mathbf{x}(k)$ and $\mathbf{u}(k)$ are the state and the input at time $k$, respectively, and $f$ is a static function, called the state-transition function. Fuzzy models of different types can be used to approximate the state-transition function. As the state of a process is often not measured, input-output modeling is usually applied. The most common is the NARX (Nonlinear AutoRegressive with eXogenous input) model:

$$ y(k+1) = f\big(y(k), y(k-1), \ldots, y(k-n_y+1), u(k), u(k-1), \ldots, u(k-n_u+1)\big). \tag{32} $$

Here $y(k), \ldots, y(k-n_y+1)$ and $u(k), \ldots, u(k-n_u+1)$ denote the past model outputs and inputs, respectively, and $n_y$, $n_u$ are integers related to the model order (usually selected by the user). For example, a linguistic fuzzy model of a dynamic system may consist of rules of the following form:

$$ R_i:\ \text{If } y(k) \text{ is } A_{i1} \text{ and } y(k-1) \text{ is } A_{i2} \text{ and } \ldots\ y(k-n+1) \text{ is } A_{in} \text{ and } u(k) \text{ is } B_{i1} \text{ and } u(k-1) \text{ is } B_{i2} \text{ and } \ldots\ u(k-m+1) \text{ is } B_{im} \text{ then } y(k+1) \text{ is } C_i, \tag{33} $$

where $n$ and $m$ play the role of $n_y$ and $n_u$ in (32). In this sense, we can say that the dynamic behavior is taken care of by external dynamic filters added to the fuzzy system, see Fig. 11. In (33), the input dynamic filter is a simple generator of the lagged inputs and outputs, and no output filter is used.

Figure 11. A generic fuzzy system with fuzzification and defuzzification units and external dynamic filters.

Since fuzzy models can approximate any smooth function to any degree of accuracy (Wang, 1992), models of type (33) can approximate any observable and controllable modes of a large class of discrete-time nonlinear systems (Leontaritis and Billings, 1985).
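The "input dynamic filter" of the NARX structure (32) amounts to collecting lagged outputs and inputs into regressor vectors. The following Python sketch shows one possible way to do this; the function name `narx_regressors` and the toy sequences are assumptions made for illustration.

```python
def narx_regressors(y, u, n_y, n_u):
    """Build regressor rows and targets for the NARX structure (32).

    Row k is [y(k), ..., y(k-n_y+1), u(k), ..., u(k-n_u+1)], target y(k+1).
    """
    if len(y) != len(u):
        raise ValueError("y and u must have the same length")
    start = max(n_y, n_u) - 1          # first k with all lags available
    rows, targets = [], []
    for k in range(start, len(y) - 1):
        past_y = [y[k - j] for j in range(n_y)]
        past_u = [u[k - j] for j in range(n_u)]
        rows.append(past_y + past_u)
        targets.append(y[k + 1])
    return rows, targets


# Toy sequences, chosen only so that the lag structure is easy to read off.
y = [1, 2, 3, 4, 5]
u = [10, 20, 30, 40, 50]
rows, targets = narx_regressors(y, u, n_y=2, n_u=1)
print(rows)     # [[2, 1, 20], [3, 2, 30], [4, 3, 40]]
print(targets)  # [3, 4, 5]
```

Each row can then serve as the antecedent (and, for TS models, consequent) input of a static fuzzy model that predicts the corresponding target.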
3 Building Fuzzy Models

Two common sources of information for building fuzzy models are prior knowledge and data (process measurements). The prior knowledge can be of a rather approximate nature (qualitative knowledge, heuristics), which usually originates from experts, i.e., process designers, operators, etc. In this sense, fuzzy models can be regarded as simple fuzzy expert systems (Zimmermann, 1987). For many processes, data are available as records of the process operation, or special identification experiments can be designed to obtain the relevant data. Building fuzzy models from data involves methods based on fuzzy logic and approximate reasoning, but also ideas originating from the fields of neural networks, data analysis and conventional systems identification. The acquisition or tuning of fuzzy models by means of data is usually termed fuzzy identification. Two main approaches to the integration of knowledge and data in a fuzzy model can be distinguished:

1. The expert knowledge expressed in a verbal form is translated into a collection of if-then rules. In this way, a certain model structure is created. Parameters in this structure (membership functions, consequent singletons or parameters) can be fine-tuned using input-output data. The particular tuning algorithms exploit the fact that at the computational level, a fuzzy model can be seen as a layered structure (network), similar to artificial neural networks, to which standard learning algorithms can be applied. This approach is usually termed neuro-fuzzy modeling (Jang, 1993; Jang and Sun, 1993; Pedrycz, 1995).

2. No prior knowledge about the system under study is initially used to formulate the rules, and a fuzzy model is constructed from data. It is expected that the extracted rules and membership functions can provide an a posteriori interpretation of the system's behavior. An expert can confront this information with his own knowledge, can modify the rules, or supply new ones, and can design additional experiments in order to obtain more informative data.

These techniques, of course, can be combined, depending on the particular application.
In the sequel, we describe the main steps and choices in the knowledge-based construction of fuzzy models, and the main techniques to extract or fine-tune fuzzy models by means of data.

3.1 Structure and parameters

With regard to the design of fuzzy (and also other) models, two basic items are distinguished: the structure and the parameters of the model. The structure determines the flexibility of the model in the approximation of (unknown) mappings. The parameters are then tuned (estimated) to fit the data at hand. A model with a rich structure is able to approximate more complicated functions, but, at the same time, has worse generalization properties. Good generalization means that a model fitted to one data set will also perform well on another data set from the same process.

Example 3.1 A well-known example of a general function approximator is a polynomial function: $y = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$. In this case, the structure is the order $n$ of the polynomial and the parameters are the constants $a_0$ to $a_n$. Higher-order polynomials will be able to approximate more complicated functions, but will have worse generalization properties. For instance, a fifth-order polynomial has six parameters and therefore will perfectly fit six data points (a unique analytical solution). However, if the data is corrupted by noise, a large error may occur for new data, as depicted in Fig. 12a. A less complex model, such as a second-order polynomial, will do much better in this case, see Fig. 12b. See the function polynom.m.

Figure 12. Approximation of a sinusoidal function (dashed-dotted line) by two models of different complexity (solid line): (a) fifth-order polynomial, (b) second-order polynomial.

In fuzzy models, structure selection involves the following choices:

Input and output variables. With complex systems, it is not always clear which variables should be used as inputs to the model. In the case of dynamic systems, one also must estimate the order of the system. For the input-output NARX model (32) this means defining the number of output and input lags $n_y$ and $n_u$, respectively. Prior knowledge, insight into the process behavior and the purpose of modeling are the typical sources of information for this choice. Sometimes, automatic data-driven selection can be used to compare different choices in terms of some performance criteria.

Structure of the rules. This choice involves the model type (linguistic, singleton, Takagi-Sugeno) and the antecedent form (refer to Section 2.1.3). Important aspects are the purpose of modeling and the type of available knowledge.

Number and type of membership functions for each variable. This choice determines the level of detail (granularity) of the model. Again, the purpose of modeling and the detail of available knowledge will influence this choice. Automated, data-driven methods can be used to add or remove membership functions from the model.

Type of the inference mechanism, connective operators, defuzzification method. These choices are restricted by the type of fuzzy model (Mamdani, TS). Within these restrictions, however, some freedom remains, e.g., as to the choice of the conjunction operators, etc. To facilitate data-driven optimization of fuzzy models (learning), differentiable operators (product, sum) are often preferred to the standard min and max operators.
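The trade-off between model complexity and generalization described in Example 3.1 can be reproduced with a small Python sketch. It is not the author's polynom.m: the sinusoid $\sin(\pi x/7)$, the six sample points and the fixed alternating perturbation of $\pm 0.15$ (used instead of random noise so that the result is reproducible) are all assumptions made here, and the helpers `solve`, `polyfit` and `polyval` are plain normal-equation code written for this illustration.

```python
import math


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def polyfit(t, y, order):
    """Least-squares polynomial coefficients (lowest power first)."""
    Phi = [[tk ** j for j in range(order + 1)] for tk in t]
    m = order + 1
    A = [[sum(r[i] * r[j] for r in Phi) for j in range(m)] for i in range(m)]
    rhs = [sum(r[i] * yk for r, yk in zip(Phi, y)) for i in range(m)]
    return solve(A, rhs)


def polyval(coef, t):
    return sum(c * t ** j for j, c in enumerate(coef))


def scale(x):
    """Map x in [1, 6] to [-1, 1] to keep the normal equations well conditioned."""
    return (x - 3.5) / 2.5


def true_f(x):
    return math.sin(math.pi * x / 7)


xs = [1, 2, 3, 4, 5, 6]
noise = [0.15 * (-1) ** k for k in range(1, 7)]    # fixed perturbation (assumption)
data = [true_f(x) + e for x, e in zip(xs, noise)]
ts = [scale(x) for x in xs]

fit5 = polyfit(ts, data, 5)   # six parameters: interpolates the six points
fit2 = polyfit(ts, data, 2)   # three parameters: smooths the perturbation


def max_error(coef, points, reference):
    return max(abs(polyval(coef, scale(x)) - r) for x, r in zip(points, reference))


grid = [1 + 0.05 * k for k in range(101)]          # "new data" between the samples
train5 = max_error(fit5, xs, data)
test5 = max_error(fit5, grid, [true_f(x) for x in grid])
test2 = max_error(fit2, grid, [true_f(x) for x in grid])
print(train5, test5, test2)
```

With these assumed data, the fifth-order polynomial matches the six perturbed samples essentially exactly but deviates from the underlying sinusoid between them far more than the second-order fit does, which is the effect sketched in Fig. 12.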
After the structure is fixed, the performance of a fuzzy model can be fine-tuned by adjusting its parameters. Tunable parameters of linguistic models are the parameters of the antecedent and consequent membership functions (which determine their shape and position) and the rules (which determine the mapping between the antecedent and consequent fuzzy regions). Takagi-Sugeno models have parameters in the antecedent membership functions and in the consequent functions ($\mathbf{a}_i$ and $b_i$ for the affine TS model).

3.2 Knowledge-based design

To design a (linguistic) fuzzy model based on available expert knowledge, the following steps can be followed:

1. Select the input and output variables, the structure of the rules and the inference and defuzzification methods.

2. Decide on the number of linguistic terms for each variable and define the corresponding membership functions.

3. Formulate the available knowledge in terms of fuzzy if-then rules.

4. Validate the model (for instance by using data). If the model does not meet the expected performance, iterate on the above design steps.

It should be noted that the success of this method heavily depends on the problem at hand, and on the extent and quality of the available knowledge. For some problems, the knowledge-based design may quickly lead to useful models, while for others it may be a very time-consuming and inefficient procedure (especially the manual fine-tuning of the model parameters). Therefore, it is useful to combine the knowledge-based design with a data-driven tuning of the model parameters. The following sections review several methods for the adjustment of fuzzy model parameters by means of data.

3.3 Data-driven acquisition/tuning of fuzzy models

In this section, we assume that a set of $N$ input-output data pairs $\{(\mathbf{x}_i, y_i) \mid i = 1, 2, \ldots, N\}$ is available. Recall that $\mathbf{x}_i \in \mathbb{R}^p$ are input vectors and $y_i$ are output scalars. Denote by $\mathbf{X} \in \mathbb{R}^{N \times p}$ a matrix having the vectors $\mathbf{x}_k^T$ in its rows, and by $\mathbf{y} \in \mathbb{R}^N$ a vector containing the outputs $y_k$:

$$ \mathbf{X} = [\mathbf{x}_1, \ldots, \mathbf{x}_N]^T, \qquad \mathbf{y} = [y_1, \ldots, y_N]^T. \tag{34} $$

3.3.1 Least-squares estimation of consequents

Note that the defuzzification formulas of the singleton and TS models, equations (21) and (27), respectively, are linear in the consequent parameters $\mathbf{a}_i$, $b_i$. Hence, these parameters can be estimated from the available data by least-squares techniques. Denote by $\boldsymbol{\Gamma}_i \in \mathbb{R}^{N \times N}$ the diagonal matrix having the normalized membership degree $\gamma_i(\mathbf{x}_k)$ of (28) as its $k$th diagonal element. By appending a column of ones to $\mathbf{X}$, the extended matrix $\mathbf{X}_e = [\mathbf{X}, \mathbf{1}]$ is created. Further, denote by $\mathbf{X}'$ the matrix in $\mathbb{R}^{N \times K(p+1)}$ composed of the products of the matrices $\boldsymbol{\Gamma}_i$ and $\mathbf{X}_e$:

$$ \mathbf{X}' = [\boldsymbol{\Gamma}_1 \mathbf{X}_e,\ \boldsymbol{\Gamma}_2 \mathbf{X}_e,\ \ldots,\ \boldsymbol{\Gamma}_K \mathbf{X}_e]. \tag{35} $$

The consequent parameters $\mathbf{a}_i$ and $b_i$ are lumped into a single parameter vector $\boldsymbol{\theta}' \in \mathbb{R}^{K(p+1)}$:

$$ \boldsymbol{\theta}' = [\mathbf{a}_1^T, b_1, \mathbf{a}_2^T, b_2, \ldots, \mathbf{a}_K^T, b_K]^T. \tag{36} $$

Given the data $\mathbf{X}$, $\mathbf{y}$, eq. (27) can now be written in a matrix form, $\mathbf{y} = \mathbf{X}' \boldsymbol{\theta}' + \boldsymbol{\epsilon}$, where $\boldsymbol{\epsilon}$ is the vector of approximation errors. From linear algebra (Strang, 1976) we know that this set of equations can be solved for the parameter $\boldsymbol{\theta}'$ by:

$$ \boldsymbol{\theta}' = \left[ (\mathbf{X}')^T \mathbf{X}' \right]^{-1} (\mathbf{X}')^T \mathbf{y}. \tag{37} $$

This is an optimal least-squares solution which gives the minimal prediction error, and as such is suitable for prediction models. At the same time, however, it may bias the estimates of the consequent parameters as parameters of local models. If an accurate estimate of local model parameters is desired, a weighted least-squares approach applied per rule may be used:

$$ [\mathbf{a}_i^T, b_i]^T = \left[ \mathbf{X}_e^T \boldsymbol{\Gamma}_i \mathbf{X}_e \right]^{-1} \mathbf{X}_e^T \boldsymbol{\Gamma}_i \mathbf{y}. \tag{38} $$

In this case, the consequent parameters of individual rules are estimated independently of each other, and therefore are not biased by the interactions of the rules. By omitting $\mathbf{a}_i$ for all $1 \le i \le K$, equations (37) and (38) directly apply to the singleton model (21).

3.3.2 Template-based modeling

With this approach, the domains of the antecedent variables are simply partitioned into a specified number of equally spaced and shaped membership functions. The rule base is then established to cover all the combinations of the antecedent terms. The consequent parameters are estimated by the least-squares method.

Example 3.2 Consider a nonlinear dynamic system described by a first-order difference equation:

$$ y(k+1) = y(k) + u(k)\, e^{-3|y(k)|}. \tag{39} $$

We use a stepwise input signal to generate with this equation a set of 300 input-output data pairs (see Fig. 14a). Suppose that it is known that the system is first order and that the nonlinearity of the system is only caused by $y$. Then the following TS rule structure can be chosen:

$$ \text{If } y(k) \text{ is } A_i \text{ then } y(k+1) = a_i\, y(k) + b_i\, u(k). \tag{40} $$

Assuming that no further prior knowledge is available, seven equally spaced triangular membership functions, $A_1$ to $A_7$, are defined in the domain of $y(k)$, as shown in Fig. 13a.

The consequent parameters were estimated by the least-squares method as described in Section 3.3.1. Figure 13b gives a plot of the parameters $a_i$, $b_i$ against the cores of the antecedent fuzzy sets $A_i$. Also plotted is the linear interpolation between the parameters (dashed line) and the true system nonlinearity (solid line). The interpolation between $a_i$ and $b_i$ is linear, since the membership functions are piece-wise linear (triangular). One can observe that the dependence of the consequent parameters on the antecedent variable approximates the system's nonlinearity quite accurately, which gives the model a certain transparency. The values of the parameters $\mathbf{a}^T = [1.00, 1.00, 1.00, 0.97, 1.01, 1.00, 1.00]$ and $\mathbf{b}^T = [0.01, 0.05, 0.20, 0.81, 0.20, 0.05, 0.01]$ indicate the strong input nonlinearity and the linear dynamics as in (39). Validation of the model in simulation using a different data set is given in Fig. 14b. This example is implemented in the MATLAB function phdemo.m.
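The estimation step of Example 3.2 can be sketched in Python as follows. This is not the author's phdemo.m. As a simplifying assumption, the data are generated by evaluating (39) on a regular grid of $(y(k), u(k))$ pairs rather than by simulating a stepwise input signal, which guarantees that every membership function is covered; the domain $[-1.5, 1.5]$, the grid and all helper names are likewise assumptions. The regressor matrix follows (35) with $\mathbf{X}_e = [y(k), u(k)]$ (no column of ones, since the rules (40) have no offset), and the parameters are obtained from the normal equations (37).

```python
import math


def tri_partition(cores):
    """Triangular fuzzy partition; returns x -> membership degrees (sum to one)."""
    def degrees(x):
        x = min(max(x, cores[0]), cores[-1])  # saturate outside the domain
        mu = [0.0] * len(cores)
        for i in range(len(cores) - 1):
            left, right = cores[i], cores[i + 1]
            if left <= x <= right:
                w = (x - left) / (right - left)
                mu[i], mu[i + 1] = 1.0 - w, w
                break
        return mu
    return degrees


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def least_squares(Phi, y):
    """Normal-equation solution of eq. (37): theta = (Phi^T Phi)^-1 Phi^T y."""
    m = len(Phi[0])
    A = [[sum(r[i] * r[j] for r in Phi) for j in range(m)] for i in range(m)]
    rhs = [sum(r[i] * yk for r, yk in zip(Phi, y)) for i in range(m)]
    return solve(A, rhs)


K = 7
cores = [-1.5 + 0.5 * i for i in range(K)]   # assumed domain of y(k)
degrees = tri_partition(cores)

Phi, target = [], []
for n in range(61):                          # grid over y(k)
    yk = -1.5 + 0.05 * n
    gamma = degrees(yk)
    for uk in (-1.0, -0.5, 0.5, 1.0):        # grid over u(k)
        row = []
        for g in gamma:                      # [Gamma_i * y, Gamma_i * u] per rule, eq. (35)
            row += [g * yk, g * uk]
        Phi.append(row)
        target.append(yk + uk * math.exp(-3.0 * abs(yk)))   # system (39)

theta = least_squares(Phi, target)           # ordered as in eq. (36)
a, b = theta[0::2], theta[1::2]
print([round(v, 2) for v in a])
print([round(v, 2) for v in b])
```

On this symmetric grid the estimated $a_i$ equal one up to rounding, reflecting the linear dynamics of (39), while the $b_i$ are symmetric about the central rule and largest there, following the shape of $e^{-3|y|}$. The exact values differ from those quoted in Example 3.2 because the data set is different.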