Considering Numerical Error Propagation in Modeling and Regression of Data


Session 2520

Neima Brauner, Mordechai Shacham
Tel-Aviv University / Ben-Gurion University of the Negev

Abstract

The use of user-friendly interactive regression software enables undergraduate engineering students to reach a high level of sophistication in regression, correlation and analysis of data. To interpret the results correctly, students must be familiar with potential causes of poor fits in correlations, and should be able to recognize a poor correlation and improve it if possible. They should also be aware of the practical consequences of using a correlation which has no statistical validity. In this paper, the harmful effects of numerical error propagation (resulting from collinearity among the independent variables) are explained and demonstrated. Simple methods for minimizing such error propagation in polynomial regression are introduced. This material can be presented, for example, as part of a third-year undergraduate mathematical modeling and numerical methods course.

Introduction

Realistic modeling and accurate correlation of experimental data are essential to sound engineering design. Many of the statistical techniques for analyzing the accuracy of correlations have been known for several decades (see, for example, Draper and Smith, 1981; Himmelblau, 1970; Bates and Watts, 1988; and Noggle, 1993). Until recently, however, those techniques were not used to a significant extent in undergraduate engineering education. One of the main reasons is that statistical tests usually yield numbers (variance, standard deviation, correlation coefficient, etc.) whose meaning can easily be misinterpreted if the statistical theory and the assumptions made in developing the tests are not well understood. The emergence of software packages with interactive regression and statistical analysis capabilities (such as POLYMATH, MATLAB, MATHEMATICA and EXCEL), which provide both numerical and graphical output, has changed the situation.
These software packages enable undergraduate engineering students with a moderate statistical background to carry out rigorous regression and statistical analysis of data. Students are able to select the most appropriate correlation model and test its statistical validity using residual and confidence region plots. They can analyze the quality and precision of laboratory data by plotting one independent variable versus the others to detect hidden collinearity that may exist among the variables.

Shacham et al. (1996) described a set of lectures and exercises used to introduce freshman engineering students to the basics of data modeling and analysis using interactive software packages. This material is included in an introductory computing course or as part of an introductory engineering course. The introduction to data modeling and analysis (described by Shacham et al., 1996) includes the following subjects:

1. Basic statistical concepts.
2. Discrimination between real experimental data and smoothed interpolated data.
3. Using residual plots and confidence intervals for selecting the most appropriate model.
4. The dangers of extrapolation, in particular when a non-theory-based model is used.

This introductory material is very helpful to the students for modeling and analyzing their own data. However, they may need more advanced material when dealing with, for example, models containing large numbers of parameters. In this paper, more advanced material related to regression is presented. The discussion includes models which comprise a sum of functions of the same independent variable (as in polynomial regression). Various effects of the interdependency between these functions are described and demonstrated. The material presented is taught to third-year undergraduate Chemical Engineering students at the Ben-Gurion University as part of a mathematical modeling and numerical methods course. The calculations involved in solving the examples presented have been carried out using the POLYMATH 4.0 (Shacham and Cutlip, 1996) and MATLAB (MathWorks, 1992) packages, but other similar packages can be used for this purpose.

Linear Regression with Models Comprising Functions of One Independent Variable

Let us assume that there is a set of N data points of a dependent variable $y_i$ (a measured variable, such as vapor pressure, viscosity or heat capacity) versus an independent variable $x_i$ (a controlled variable, such as temperature, concentration or pressure), $i = 1, 2, \ldots, N$.
A regression model comprising a linear combination of n different functions of the independent variable is considered. Thus, the regressors are $x_1 = f_1(x),\ x_2 = f_2(x),\ \ldots,\ x_n = f_n(x)$. For instance, in polynomial regression $x_1 = x,\ x_2 = x^2,\ \ldots,\ x_n = x^n$. A linear model fitted to the data is of the form:

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_n x_{ni} + \varepsilon_i \qquad (1)$$

where $\beta_0, \beta_1, \ldots, \beta_n$ are the parameters of the model and $\varepsilon_i$ is the measurement error in $y_i$. It is assumed that $\varepsilon_i$ is independently and identically distributed (i.i.d.). The vector of estimated parameters $\hat\beta^T = (\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_n)$ is usually calculated using the least squares error approach, by minimizing the following function:

$$S^2 = \sum_{i=1}^{N} \left[ y_i - \left( \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_n x_{ni} \right) \right]^2 \qquad (2)$$

If the parameters appear in a linear expression (as in eq. (1)), the minimization can be carried out by solving a set of simultaneous linear algebraic equations (the normal equations):

$$X^T X \hat\beta = X^T y \qquad (3)$$

The columns of $X$ are $x_0 = 1, x_1, x_2, \ldots, x_n$, and $X^T X = A$ is the normal matrix.

To check the goodness of fit between the observed $y_i$ and estimated $\hat y_i$ values of the dependent variable, they can be plotted versus $x_i$ (when there is a single independent variable) or versus $i$, the point number (when there are several independent variables). The distance between the observed and estimated values can serve as an indication of the quality of the fit. These distances can be amplified using a residual plot. In the residual plot, the model error (residual) $\hat\varepsilon_i$ is usually plotted versus $y_i$, where:

$$\hat\varepsilon_i = y_i - \hat y_i \qquad (4)$$

A random distribution of the residuals around zero indicates that the model correctly represents the particular set of data. A definite trend or pattern in the residual plot may indicate either a lack of fit of the model, or that the assumption of random error distribution for the dependent variable is incorrect. In some cases (for example, where the value of the dependent variable changes by several orders of magnitude over the range of interest) it is the relative error that is distributed normally. The relative error is defined as:

$$\hat\varepsilon_{ri} = \frac{\hat\varepsilon_i}{y_i} \qquad (5)$$

The appropriate transformation, which results in minimization of the relative error in a regression, is taking the logarithm of both sides of the model equation. It should be emphasized, however, that when the variables are transformed, the residual plot must be constructed using the transformed form of the dependent variable, in order to account for the change in the error distribution introduced by the transformation.

The numerical indicator of the quality of the fit used most frequently is the square of the standard error of the estimate, which represents the sample variance and is given by:

$$s^2 = \frac{\sum_{i=1}^{N} \left( y_i - \hat y_i \right)^2}{N - (n + 1)} \qquad (6)$$
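The steps of eqs. (3), (4) and (6) can be sketched in a few lines of Python with NumPy. This is an illustrative sketch only, not the authors' POLYMATH or MATLAB code. The function name `fit_normal_equations` and the exact quadratic test data are assumptions introduced here.

```python
import numpy as np

def fit_normal_equations(x, y, order):
    """Fit y = b0 + b1*x + ... + b_order*x**order via the normal equations.

    Hypothetical helper: mirrors eq. (3) literally, which is instructive
    but not the most numerically robust way to solve least squares.
    """
    # Columns of X: x^0 = 1, x, x^2, ..., x^order
    X = np.vander(x, order + 1, increasing=True)
    A = X.T @ X                      # normal matrix, A = X^T X
    b = X.T @ y                      # right-hand side, b = X^T y
    beta = np.linalg.solve(A, b)     # eq. (3)
    residuals = y - X @ beta         # eq. (4)
    dof = len(y) - (order + 1)       # N - (n + 1) degrees of freedom
    s2 = float(residuals @ residuals) / dof   # eq. (6)
    return beta, residuals, s2

# Assumed test data: an exact quadratic sampled at 21 points.
x = np.linspace(0.0, 1.0, 21)
y = 1.0 + 2.0 * x + 3.0 * x**2
beta, res, s2 = fit_normal_equations(x, y, 2)
print(beta, s2)
```

Plotting `res` against `y` gives the residual plot described above. For noise-free data from the fitted model, the residuals and the variance should be at round-off level.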

Thus, the sample variance is the sum of squares of errors divided by the degrees of freedom (where the number of parameters, $n+1$, is subtracted from the number of data points, $N$) and is a measure of the variability of the actual $y$ values from the predicted $\hat y$ values. A smaller variance indicates a better fit of the model to the data.

Confidence intervals (in particular the 95% confidence interval) on the parameter values can be very useful indicators of the fit between the model and the data. A better fit and more precise data lead to narrow confidence intervals, while a poor model fit and/or imprecise data result in wide confidence intervals. Furthermore, confidence intervals which are larger (in absolute value) than the respective parameters themselves often indicate that the model contains too many parameters. The confidence interval is defined by:

$$\hat\beta_j - t(\nu, \alpha)\, s \sqrt{a_{jj}} \le \beta_j \le \hat\beta_j + t(\nu, \alpha)\, s \sqrt{a_{jj}} \qquad (7)$$

where $t(\nu, \alpha)$ is the statistical t distribution value corresponding to $\nu$ degrees of freedom ($\nu = N - (n+1)$) and a desired confidence level $\alpha$, $s$ is the standard error of the estimate, and $a_{jj}$ is the $j$th diagonal element of the inverse of the normal matrix. If the number of data points is large enough, the value of $t$ approaches a constant value (for $\nu > 15$, $t \approx 2$ for the 95% confidence interval). Therefore, $t(\nu, \alpha)$ is often omitted. The term $s \sqrt{a_{jj}}$ is called the standard error of the estimate of parameter $\beta_j$.

One of the assumptions of the least squares error approach is that there is no error in the independent variables. This is rarely true, however. Most reports on experimental measurements include an estimated error in the independent variable. If such an estimate is not included, a lower limit on the error can be estimated from the number of decimal digits in which the data are reported. (For example, if the temperature is reported with one digit after the decimal point in K, then the error is at least ±0.05 K.) Thus, the true value of an independent variable can be represented by:

$$x_i = \tilde x_i + \delta x_i \qquad (8)$$

where $\tilde x_i$ is the expected measured value and $\delta x_i$ is the error (or uncertainty) in its value.
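As a rough illustration of eq. (7) and of the reporting-precision rule just described, the following Python sketch computes approximate 95% confidence half-widths using $t \approx 2$, valid for $\nu > 15$ as noted in the text. The helper names `confidence_half_widths` and `reporting_error`, the synthetic straight-line data and the noise level are all assumptions made for this example.

```python
import numpy as np

def confidence_half_widths(x, y, order, t_value=2.0):
    """Return (beta, half_widths, nu) for a polynomial fit, following eq. (7).

    t_value=2.0 is the large-sample approximation for the 95% level;
    this is a hypothetical helper, not code from the paper.
    """
    X = np.vander(x, order + 1, increasing=True)
    A_inv = np.linalg.inv(X.T @ X)        # inverse of the normal matrix
    beta = A_inv @ (X.T @ y)
    nu = len(y) - (order + 1)             # degrees of freedom
    resid = y - X @ beta
    s = np.sqrt(resid @ resid / nu)       # standard error of the estimate
    half_widths = t_value * s * np.sqrt(np.diag(A_inv))
    return beta, half_widths, nu

def reporting_error(decimals):
    """Lower bound on the error implied by the number of reported decimals."""
    return 0.5 * 10.0 ** (-decimals)

# Assumed synthetic data: a straight line with small normally distributed noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 21)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.01, x.size)
beta, hw, nu = confidence_half_widths(x, y, 1)
for j, (b_j, h_j) in enumerate(zip(beta, hw)):
    print(f"beta_{j} = {b_j:.4f} +/- {h_j:.4f}")
print("error bound for one reported decimal:", reporting_error(1))
```

A half-width that exceeds the magnitude of its own parameter estimate is the warning sign mentioned above that the model may contain too many parameters.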
The error in the independent variable is, of course, carried over to its functions. The error in the different functions can be estimated from:

$$\delta x_j \approx \left. \frac{\partial f_j}{\partial x} \right|_{x = \tilde x} \delta x \qquad (9)$$

where $\delta x_j$ is the estimated error in the $j$th function of $x$. The errors in $x$ and its functions are carried over to the normal matrix and to the $X^T y$ term in eq. (3). These errors can propagate during the solution of eq. (3), yielding inaccurate and unstable parameter estimates. Denoting $b = X^T y$, the errors in the calculated parameter values, $\delta\hat\beta$, are bounded by (p. 176 in Dahlquist et al., 1974):

$$\frac{\|\delta\hat\beta\|}{\|\hat\beta + \delta\hat\beta\|} \le \kappa(A)\, \frac{\|\delta A\|}{\|A\|} \qquad (10)$$

where $\kappa(A)$ is the condition number of the normal matrix and $\delta A$ is the matrix of errors in $A$. A similar equation relates the error in $b$, $\delta b$, to the error $\delta\hat\beta$:

$$\frac{\|\delta\hat\beta\|}{\|\hat\beta\|} \le \kappa(A)\, \frac{\|\delta b\|}{\|b\|} \qquad (11)$$

The condition number is the ratio of the largest to the smallest eigenvalue of $A$. If the condition number is large, $\delta A$ and $\delta b$ are amplified considerably in the calculation of $\hat\beta$, yielding a very poor estimate of the true parameter values. Large condition numbers result from linear dependency (collinearity) between the various (supposedly) independent variables. A normal matrix which has a large condition number is called ill-conditioned. The problem of collinearity has been extensively discussed in the literature (see, for example, Shacham and Brauner (1997), Brauner and Shacham (1998)). In the next three sections, the practical results of collinearity, ill-conditioning and numerical error propagation will be demonstrated using a vapor pressure correlation example.

Vapor Pressure Models and Data

There are several models which are extensively used for correlating vapor pressure versus temperature data. Three types of equations will be used in this work. The first is a polynomial:

$$\ln P = \beta_0 + \beta_1 T + \beta_2 T^2 + \cdots + \beta_n T^n \qquad (12)$$

where $T$ is temperature (K), $P$ is pressure (kPa) and $n$ is the order of the polynomial. The precision and stability of polynomial regression can often be increased by transforming the data. The following transformations (which were investigated by Shacham and Brauner (1997)) will be used. Normalization of the temperature:

$$\tau = T / T_{max} \qquad (13)$$

where $T_{max}$ is the largest temperature value in the data set. The w transformation, defined by:

$$w = \frac{T - T_{min}}{T_{max} - T_{min}} \qquad (14)$$

which yields values in the range $0 \le w \le 1$, and the z transformation

$$z = \frac{2T - T_{max} - T_{min}}{T_{max} - T_{min}} \qquad (15)$$

which yields a variable distribution in the range $-1 \le z \le 1$.

The Riedel equation is a four-parameter equation, which is used in several slightly different forms. The following definition is used in this work:

$$\ln \pi = \beta_0 + \frac{\beta_1}{\tau} + \beta_2 \ln \tau + \beta_3 \tau^2 \qquad (16)$$

where $\pi$ is the normalized pressure, $\pi = P / P_{max}$. When only the first two terms of the equation are used, the model reduces to what is known as Clapeyron's equation.

The Wagner (1973) equation is considered the most accurate equation, with the smallest number of constants, for correlating vapor pressure data over a wide range of temperatures (between the triple point and the critical temperatures). While the number of terms and the exponents of the different terms in Wagner's equation may change, the most widely used form of this equation is:

$$\ln P_R = \frac{1}{T_R} \left[ \beta_0 (1 - T_R) + \beta_1 (1 - T_R)^{1.5} + \beta_2 (1 - T_R)^{3} + \beta_3 (1 - T_R)^{6} \right] \qquad (17)$$

where $T_R = T / T_c$ is the reduced temperature, $P_R = P / P_c$ is the reduced pressure, $T_c$ is the critical temperature (K) and $P_c$ is the critical pressure (kPa).

To enable discrimination between possible different sources of inaccuracy in a regression model, exact data generated by Wagner's equation will be used rather than real experimental data. This way, the precision of the data can be controlled by introducing randomly distributed error (noise) to the variables. Table 1 shows the critical constants and the Wagner equation coefficients for toluene, the substance which is used in this study.

Three different data sets were generated using the Wagner equation with the constants given in Table 1. In each set, 21 data points were generated in the vicinity of the normal boiling point. The three sets cover temperature ranges of 100 K, 50 K and 20 K. Minimal and maximal values of $T$, $P$, $\tau$ and $\pi$ for the three data sets are shown in Table 2. In Figure 1, the vapor pressure of toluene in the range of 310 K to 590 K, as calculated from Wagner's equation, is displayed. It can be seen that the vapor pressure changes by more than four orders of magnitude over this temperature range.
The temperature ranges corresponding to the three data sets are also marked on the figure. These data sets will be regressed using the various vapor pressure models.
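The three temperature transformations of eqs. (13), (14) and (15) are simple enough to state directly in code. The Python sketch below is illustrative only. The function names and the evenly spaced 100 K temperature grid are assumptions, and the grid does not reproduce the values of Table 2.

```python
import numpy as np

def normalize(T):
    """Eq. (13): tau = T / T_max."""
    return T / T.max()

def w_transform(T):
    """Eq. (14): maps the data onto the range 0 <= w <= 1."""
    return (T - T.min()) / (T.max() - T.min())

def z_transform(T):
    """Eq. (15): maps the data onto the range -1 <= z <= 1."""
    return (2.0 * T - T.max() - T.min()) / (T.max() - T.min())

# Assumed grid: 21 evenly spaced temperatures spanning 100 K (hypothetical values).
T = np.linspace(333.8, 433.8, 21)
tau, w, z = normalize(T), w_transform(T), z_transform(T)
print("tau range:", tau.min(), tau.max())
print("w range:  ", w.min(), w.max())
print("z range:  ", z.min(), z.max())
```

Note that $z = 2w - 1$, so the two transformations differ only by a shift and a scale. The shift centers the data on zero, which is what separates the even and odd powers of the polynomial.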

Table 1: Critical Properties and the Wagner Equation Constants for Toluene (McGarry (1983)).
Rows: melting point temperature (K); normal boiling point temperature (K); critical temperature (K); critical pressure (kPa); Wagner constants $\beta_0$, $\beta_1$, $\beta_2$, $\beta_3$.
Reference: Weast (1978)

Table 2: Minimal and Maximal Values for the Vapor Pressure Data Sets.
Columns: Data Set 1 (range = 100 K); Data Set 2 (range = 50 K); Data Set 3 (range = 20 K).
Rows: $T_{max}$ (K); $T_{min}$ (K); $\tau_{max}$; $\tau_{min}$; $P_{max}$ (kPa); $P_{min}$ (kPa); $\pi_{max}$; $\pi_{min}$.

Figure 1: Vapor pressure of toluene in the 310 K to 590 K temperature range.

Fitting Polynomials to the Data

An equation in the form of a polynomial is often used for representing the relationship between an independent variable and a dependent variable. The resulting correlation is an empirical model, which often lacks a theoretical basis. Some of the pitfalls of using such a model for correlating vapor pressure data were discussed by Shacham et al. (1996). Recently, theoretical aspects of the stability of parameter estimation as a function of the range and precision of the data and the number of terms in the polynomial have been extensively studied (Shacham and Brauner, 1997). The vapor pressure data can be used to demonstrate the practical results predicted by these theoretical studies.

The model of eq. (12) with data set 1 (range of 100 K) was used in all polynomial regression studies. The calculations were carried out using MATLAB (double precision computation with approximately 16 decimal digits of accuracy), where eq. (3) is explicitly solved for $\hat\beta$.

Table 3 shows the coefficients, variance and condition numbers of the $X^T X$ matrix for polynomials up to the 6th order when the regression is done on $T$ and $P$ without any normalization or transformation. It can be seen that the variance decreases from […] for a linear dependency to […] for the 4th order polynomial of $T$. Increasing the order of the polynomial further causes a sharp increase of the variance, to […] for the 6th order polynomial. The standard errors of the parameters (not shown) indicate a similar trend. They are smaller by about two orders of magnitude than the parameter values for polynomials up to the 4th order. For the 5th order polynomial (and higher orders), the standard errors become larger than the parameter values.

Table 3: Coefficients, Variance and Condition Number in Polynomial Regression of the Vapor Pressure Data.
Columns: polynomial order, 1 to 6.
Rows: $\beta_0$, $\beta_1$, $\beta_2$, $\beta_3$, $\beta_4$, $\beta_5$, $\beta_6$, $s^2$, $\kappa(A)$.

The practical significance of such variations in the variances and the standard errors can be seen in Figures 2 and 3. Figure 2 shows the relative error in the calculated pressure values for the 1st, 2nd and 6th order polynomials versus the experimental pressure. For the 1st order polynomial, there is a clear trend in the error, which reaches a maximum value of about 16%. For the 2nd order polynomial, the error is much smaller and hardly reaches 1%. Errors in the 3rd and 4th order polynomials cannot even be observed on the scale of Figure 2. For the 6th order polynomial, there is again a clear trend in the error, and the maximal error reaches 30%.

The same trend is even more pronounced when looking at the errors in the calculated derivative values (Figure 3). The maximal error in the derivative values using the straight line correlation reaches about 50%. It reduces to 0.1% for the 4th order polynomial, whereas with the 6th order polynomial it increases up to 65%. Thus, using a polynomial of too high an order (often called over-correlation) yields even poorer results in the dependent variable values and its derivative values than using a polynomial of too low an order.

Figure 2: Relative error in the calculated pressure values (polynomial regression).

Figure 3: Relative error in calculated pressure derivatives (polynomial regression).

In trying to determine what causes the poor results obtained with high order polynomials, the condition number of the normal matrix (shown in Table 3) can provide valuable information. The value of the condition number is […] for the 1st order polynomial and it increases to […] for the 6th order polynomial. The MATLAB program, which relies on the condition number in determining the conditioning of the normal matrix, starts issuing the warning message "Matrix is close to singular or badly scaled. Results may be inaccurate" already for the 3rd order polynomial.

Does the 4th order polynomial represent the limit of precision for this correlation? To answer this question, regressions have been carried out using the various transformations of the temperature as defined in eqs. (13), (14) and (15). The results of these regressions are summarized in Figure 4, which shows the variance as a function of the polynomial order for the various transformations. With no transformation or with a simple normalization, a steady decrease of the variance up to the 4th order polynomial is obtained, whereby each additional term in the polynomial reduces the variance by about two orders of magnitude. Starting at the 5th order polynomial, the normal matrix becomes ill-conditioned, which causes a sharp increase in the variance. With the w and z transformations, however, the same rate of decrease of the variance extends to the 6th order polynomial. For the w transformation, the variance starts to increase at the 7th order polynomial, while for the z transformation, the variance reaches a steady minimal level at the 8th order polynomial. Thus, using the z transformation enables reducing the variance of the best fit to […] using the 8th order polynomial, from the minimal value of […] obtained with the 4th order polynomial and the untransformed original data.
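The effect of the z transformation on the conditioning of the normal matrix can be checked with a short experiment. The sketch below uses Python with NumPy rather than the MATLAB runs reported here. The temperature grid is an assumed, evenly spaced stand-in for data set 1, and `normal_matrix_condition` is a hypothetical helper name.

```python
import numpy as np

def normal_matrix_condition(u, order):
    """Condition number of A = X^T X for a polynomial of the given order in u."""
    X = np.vander(u, order + 1, increasing=True)   # columns 1, u, ..., u^order
    return np.linalg.cond(X.T @ X)

# Assumed grid: 21 evenly spaced temperatures over a 100 K range (hypothetical).
T = np.linspace(333.8, 433.8, 21)
z = (2.0 * T - T.max() - T.min()) / (T.max() - T.min())   # eq. (15)

orders = range(1, 7)
cond_raw = [normal_matrix_condition(T, k) for k in orders]
cond_z = [normal_matrix_condition(z, k) for k in orders]
for k, cr, cz in zip(orders, cond_raw, cond_z):
    print(f"order {k}: raw T {cr:.2e}, z-transformed {cz:.2e}")
```

The values printed for untransformed $T$ at the higher orders should be read with caution. Once the true condition number exceeds roughly the reciprocal of the machine precision, the computed value is itself dominated by round-off.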

Figure 4: Variance in polynomial regression using various transformations of the temperature data.

Figure 5: Condition numbers in polynomial regression using various transformations of the temperature data.

Figure 5 shows the logarithm of the condition number as a function of the polynomial order for the various transformations. The logarithm of the condition number increases linearly as the polynomial order increases. (The only exception is the case of no transformation, where numerical error propagation prevents obtaining accurate values for the condition numbers for high order polynomials.) The condition numbers are minimal for the z transformation and increase in the following order: w transformation, normalization and no transformation. The reduction of the condition number corresponding to a particular polynomial order when using the w and z transformations (instead of normalization) is reflected in the lower variance values achieved with these transformations. It is to be noted, however, that the reduction in the condition number when using normalized data instead of the original data has no significant effect on the order of the best-fit polynomial and its variance.

Fitting Clapeyron's and Riedel's Equations to the Data

Clapeyron's equation (the first two terms in eq. (16)) and Riedel's equation are both frequently used for modeling of vapor pressure data. To investigate the effects of the range and precision of the independent variable data, both equations were fitted using the three data sets shown in Table 2. The data were generated using Wagner's equation, but for this study, a random and normally distributed error was introduced into the temperature data. Three cases were tested with the following error levels: $\delta T$ = […], 0.05 and 0.5 K.

Using the highest precision ($\delta T$ = […] K) and the widest temperature range (100 K) data with Riedel's model yields a very accurate correlation. The parameter values obtained are $\beta_0$ = […], $\beta_1$ = […], $\beta_2$ = […], $\beta_3$ = […], the variance is […] and the standard errors of the parameters are smaller by three orders of magnitude than the parameter values. The resultant maximal error in the calculated pressure value is […] kPa. For the same data set, Clapeyron's equation yields a much worse fit. The resultant parameter values are $\beta_0$ = […], $\beta_1$ = […], the variance is […] and the maximal error in the calculated pressure is […] kPa. Thus, the use of the four-parameter Riedel equation adds approximately three more decimal digits to the accuracy of the correlation.
To observe the effects of the range and precision of the temperature data on the accuracy of the correlation, the various data sets were regressed using Clapeyron's and Riedel's models. Figure 6 shows the variance of the fit obtained with Clapeyron's equation as a function of the range, with the error level $\delta T$ as a parameter. It can be seen that for the high precision temperature data ($\delta T$ = […] K), the variance is […] for the 100 K range. It reduces to […] for the 50 K temperature range and further to […] for the 20 K range. This is the trend that would be expected when the variance results from a lack of fit of the model (due to a limited number of parameters). In such a case, the model can better represent the data in a narrower range of temperature and the variance is expected to decrease accordingly. However, the trend is different for the low precision data ($\delta T$ = 0.5 K). For the 100 K range, the variance is […]; it reduces to […] for the 50 K range and increases to […] for the 20 K range. For this low precision data, numerical error propagation (resulting from collinearity) is intensified as the range is reduced, and for the range of 20 K it dominates over the better model fit which could have been achieved in a narrower range.

Figure 7 shows the variance of the Riedel correlation as a function of the temperature range, with the error level $\delta T$ as a parameter. It can be seen that for all the $\delta T$ values studied, reducing the range from 50 K to 20 K effects an increase of the variance. For the low precision data, Riedel's model yields an even larger variance than that of Clapeyron's equation.
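The way collinearity intensifies as the range narrows can be made visible by comparing condition numbers of the Clapeyron and Riedel regressor sets of eq. (16) over the three ranges. The Python sketch below is only an illustration under stated assumptions. The center temperature of 383.8 K (roughly the normal boiling point of toluene), the evenly spaced grids and the helper name `normal_matrix_condition_for` are introduced here and are not taken from the paper's tables.

```python
import numpy as np

def normal_matrix_condition_for(T, model):
    """Condition number of X^T X for the regressors of eq. (16).

    Computed as cond(X)**2, which equals cond(X^T X) in the 2-norm but
    avoids forming the ill-conditioned product explicitly.
    """
    tau = T / T.max()                                   # eq. (13)
    if model == "clapeyron":
        cols = [np.ones_like(tau), 1.0 / tau]
    elif model == "riedel":
        cols = [np.ones_like(tau), 1.0 / tau, np.log(tau), tau**2]
    else:
        raise ValueError(f"unknown model: {model}")
    X = np.column_stack(cols)
    return np.linalg.cond(X) ** 2

T_CENTER = 383.8   # K, assumed center of all three ranges (hypothetical)
results = {}
for span in (100.0, 50.0, 20.0):
    T = np.linspace(T_CENTER - span / 2, T_CENTER + span / 2, 21)
    results[span] = (
        normal_matrix_condition_for(T, "clapeyron"),
        normal_matrix_condition_for(T, "riedel"),
    )
    print(f"range {span:5.0f} K: Clapeyron {results[span][0]:.2e}, "
          f"Riedel {results[span][1]:.2e}")
```

By eqs. (10) and (11), a larger condition number means a given error in the temperature data is amplified more strongly in the parameter estimates, which is consistent with the four-parameter model losing its advantage on narrow-range, low-precision data.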

Figure 6: Change of the variance as a function of range for Clapeyron's equation.

Figure 7: Change of the variance as a function of range for Riedel's equation.

This study clearly demonstrates that the optimal number of terms/parameters in a model is a function of the range and precision of the independent variable data. When reducing the range improves the accuracy of the correlation considerably, the use of a different model (one consisting of more terms/parameters) should be considered. But when range reduction does not improve the accuracy, adding more terms to the model will probably lead to over-correlation. The harmful effects of over-correlation have been demonstrated in the previous section with reference to polynomial regression.

14 Conclusons Numercal error propagaton, resultng from collnearty between varous functons of the same ndependent varable, can severely lmt the accuracy of a correlaton. Because of the complex theory nvolved, demonstratng to undergraduate students the causes and practcal sgnfcance of such naccuraces and dentfyng methods to mprove the accuracy can represent a great challenge. In ths paper, the graphcal and nteractve capabltes of two numercal computaton packages (MATLAB and POLYMATH) have been used to demonstrate these topcs. It was shown that n polynomal regresson, the hghest order of polynomal to be used s often lmted by numercal error propagaton. Dsregardng ths lmt can yeld large errors n the calculated values of the dependent varable and even more severe errors n ts dervatves. The condton number of the normal matrx provdes nformaton regardng the rate of the error propagaton, but ths nformaton can be naccurate and sometmes even msleadng. Collnearty n polynomal regresson can be sgnfcantly reduced by usng data transformatons, n partcular the z-transformaton, whch transform the ndependent varable data to the [-1, +1] range. To determne whether t s collnearty that lmts the hghest order polynomnal and the accuracy of the ft, t s advsable to repeat the regresson usng z-transformaton of the ndependent varable data. If the transformed data allows fttng by a hgher order polynomal, collnearty s a lmtng factor, otherwse other reasons for the low accuracy must be sought. Usng the two-parameter Clapeyron equaton and the four-parameter Redel equaton, t was shown that there s not a sngle model whch s superor n representng a partcular type of data. For correlatng low precson and/or narrow range data of vapor pressure, the Clapeyron equaton s more approprate than the Redel equaton. Whereas, for hgh precson and wde range data, Redel s equaton s more approprate. 
If reducing the range results in a significant reduction of the variance, adding more terms/parameters to the model can improve the accuracy of the correlation. But adding more terms when range reduction does not lead to a better fit can result in over-correlation.

References

1. Bates DM and Watts DG (1988), Nonlinear Regression Analysis and its Application, John Wiley & Sons, New York.
2. Brauner N and Shacham M (1998), The Role of Range and Precision of the Independent Variable in Regression of Data, AIChE J., 44(3).
3. Dahlquist G, Björck Å and Anderson N (1974), Numerical Methods, Prentice-Hall, Englewood Cliffs, N.J.
4. Draper NR and Smith H (1981), Applied Regression Analysis, John Wiley & Sons, New York.
5. Himmelblau DM (1970), Process Analysis by Statistical Methods, John Wiley & Sons, New York.
6. Math Works, Inc. (1992), The Student Edition of MATLAB, Prentice-Hall, Englewood Cliffs, N.J.
7. McGarry J (1983), Correlation and Prediction of the Vapor Pressures of Pure Liquids over Large Pressure Ranges, Ind. Eng. Chem. Process. Des. Dev., 22.
8. Noggle JH (1993), Practical Curve Fitting and Data Analysis, Prentice-Hall, Englewood Cliffs, N.J.

9. Shacham M and Brauner N (1997), Minimizing the Effects of Collinearity in Polynomial Regression, Ind. Eng. Chem. Res., 36(10).
10. Shacham M, Brauner N and Cutlip MB (1996), Replacing Graph Paper with Interactive Software in Modeling and Analysis of Experimental Data, Comp. Appl. Eng. Edu., 4(3).
11. Shacham M and Cutlip MB (1996), POLYMATH 4.0 User's Manual, CACHE Corporation, Austin, TX.
12. Wagner W (1973), New Vapor Pressure Measurements for Argon and Nitrogen and a New Method for Establishing Rational Vapor Pressure Equations, Cryogenics, 13.
13. Weast RC (Ed.) (1975), Handbook of Chemistry and Physics, 56th Ed., CRC Press, Ohio.

NEIMA BRAUNER received her BSc and MSc from the Technion, Israel Institute of Technology, and her PhD from the University of Tel-Aviv. She is currently a Professor in the Fluid Mechanics and Heat Transfer Department. She teaches courses in Mass and Heat Transfer and Process Control. Her main research interests include hydrodynamics and transport phenomena in two-phase systems and applied regression analysis.

MORDECHAI SHACHAM is a Professor in the Chemical Engineering Department at the Ben-Gurion University of the Negev, Beer-Sheva, Israel. He received his BSc and DSc from the Technion, Israel Institute of Technology. His research interests include applied numerical and statistical methods, computer-aided instruction, chemical process simulation, design and optimization.
