A Robust Strategy for Joint Data Reconciliation and Parameter Estimation


Yen Yen Joe 1) 3), David Wang 2), Chi Bun Ching 3), Arthur Tay 1), Weng Khuen Ho 1) and Jose Romagnoli 2) *
1) Dept. of Electrical & Computer Engineering, The National University of Singapore, 10 Kent Ridge Cres., Singapore 119260
2) Dept. of Chemical Engineering, The University of Sydney, NSW 2006, Australia
3) Institute of Chemical and Engineering Sciences, Ayer Rajah Cres., Block 28, Unit #02-08, Singapore 139959

Abstract
In this work, the generalized T (GT) distribution is used to develop a statistically robust joint data reconciliation-parameter estimation (DRPE) strategy. The robustness feature is provided by the GT distribution, which includes the Normal, Laplacian and Cauchy distributions as special cases. We use historical data to first estimate the parameters of the GT distribution, so that the resulting estimator is efficient when the error is in the GT family. The strategy is implemented in a simulation of a practical chemical engineering plant. The results confirm the robustness and efficiency of the estimator.

Keywords: parameter estimation, data reconciliation, error-in-all-variables, robust estimators

1. Introduction
A more efficient approach than the sequential data reconciliation (DR) parameter estimation (PE) that is common in practice is to perform DR and PE jointly (DRPE), such that the resulting reconciled data and model parameters are consistent with respect to both the process model and the DR constraints. DRPE can also be viewed as the error-in-all-variables-measured (EVM) formulation, which is a generalization of conventional PE: in EVM, all measurements are subject to errors, so that the distinction between independent and dependent variables is no longer clear (Romagnoli et al, 2000).
The three main aspects of EVM discussed in the literature are the EVM algorithm (Valko et al, 1987), the optimization strategy (Kim et al, 1990; Tjoa et al, 1991) and the robustness of the EVM estimation (Albuquerque et al, 1996; Arora et al, 2001). In this paper we focus mainly on the robustness of the EVM estimation. Various robust estimation approaches, such as the M-estimators, have been proposed, but most assume, a priori, some form of error distribution which, although robust, may not be representative of the actual distribution. On the other hand, nonparametric methods such as the kernel function (Wang et al, 2003) are free of such assumptions and fully flexible, but are also complex and computationally demanding.
* Author to whom correspondence should be addressed: jose@chem.eng.usyd.edu.au

An alternative is to strike a balance between the simplicity of the parametric approach and the flexibility of the non-parametric approach, i.e. to adopt a specific objective function that covers a wide variety of common distributions. This corresponds to the generalized T (GT) distribution. The parameters of the GT distribution can be estimated a posteriori to ensure its suitability to the data. In this work, we extend the robust DR strategy using the GT distribution (Wang et al, 2003) to incorporate parameter estimation. This results in a statistically robust EVM strategy that is also efficient. The paper is organised as follows. The next section discusses the incorporation of the robustness feature into DRPE within a probabilistic framework. Section 3 describes the DRPE strategy using the GT distribution, which is then applied to a case study of a general purpose chemical engineering plant in Section 4. Finally, Section 5 concludes the paper.

2. The Robustness of the DRPE Estimator
Within a probabilistic framework, by the maximum-likelihood principle, DRPE can be formulated as:

max_{x,u,θ} f(ε) = min_{x,u,θ} −log f(ε) = min_{x,u,θ} ρ(ε)   (2)
s.t. model and bounds

where ε = y − x is the measurement error and f(ε) is the probability density of the error. As the efficiency of the estimator depends on how well f(ε) characterizes the actual error, the estimator can be made robust by reducing the sensitivity of ρ(ε) to large values of ε. This corresponds to the robust M-estimator. The robustness of M-estimators can be explained by the influence function (IF), defined by ψ(ε) = ∂ρ(ε)/∂ε. Essentially, the IF gives a rough measure of how much influence a particular residual has in the estimation (McDonald et al, 1988; Hampel et al, 1986), so it is desirable to have an IF that is bounded for large residuals, in order for them to have limited influence on the estimation.
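To make the boundedness requirement concrete, here is a minimal sketch comparing the IF of the least-squares loss ρ(ε) = ε²/2 with that of the fair function, one of the bounded M-estimators discussed in the literature; the tuning constant c is an illustrative assumption, not a value from this paper:

```python
import numpy as np

# WLS: rho(eps) = eps^2 / 2         ->  psi(eps) = eps (unbounded straight line)
# Fair: rho(eps) = c^2 * (|eps|/c - log(1 + |eps|/c))
#                                   ->  psi(eps) = eps / (1 + |eps|/c), bounded by c

def psi_wls(eps):
    return eps

def psi_fair(eps, c=1.345):  # c chosen only for illustration
    return eps / (1.0 + np.abs(eps) / c)

eps = np.linspace(-50.0, 50.0, 1001)
# The WLS influence grows without bound; the fair psi saturates below c,
# so large residuals cannot dominate the fit.
assert np.abs(psi_wls(eps)).max() == 50.0
assert np.abs(psi_fair(eps)).max() < 1.345
```

The qualitative picture is what matters: any ρ whose derivative keeps growing lets a single gross error dominate the objective, while a bounded ψ caps each residual's leverage.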
It should be pointed out that for the conventional WLS, where ρ(ε) ∝ ε², the IF is a straight line, which is why large residuals have unlimited influence on, and can dominate, the estimation, resulting in biased estimates. Common choices of robust M-estimator, such as the contaminated normal (Tjoa et al, 1991), the combination of Laplace and Normal distributions (Wang et al, 2003), the fair function (Albuquerque et al, 1996) and the redescending estimator (Arora et al, 2001), depend on parameters which either are assigned a priori or have no meaningful association with the error distribution. As a result, the underlying error distribution may not be well characterised and the estimators may not be efficient in the MLE sense (Wang et al, 2003). We therefore propose the GT distribution as a robust estimator for DRPE problems, as the GT has properties that enable it to be robust without sacrificing efficiency. This will be elaborated in the next section.

3. Robust DRPE Using the Generalized T (GT) Distribution
The use of the GT distribution in estimation was first proposed by McDonald et al (1988) due to its flexibility in accommodating various distributional shapes. The density function is given by:

f_GT(ε; σ, p, q) = p / [2 σ q^(1/p) B(1/p, q) (1 + |ε|^p / (q σ^p))^(q + 1/p)],   −∞ < ε < ∞   (3)

where B(·,·) is the beta function. Depending on the values taken by {p, q, σ}, f_GT can take the shape of any distribution within the family defined by the GT distribution. As illustrated in Figure 1, this covers most of the important distributions commonly encountered in practice.

Figure 1: GT distribution tree. Special and limiting cases include the power exponential (Box-Tiao) family (q → ∞), the t-distribution with 2q degrees of freedom (p = 2), the Normal (p = 2, q → ∞), the double exponential/Laplacian (p = 1, q → ∞) and the Cauchy (p = 2, q = 1/2).

Figure 2: Influence functions of the GT distribution for different parameter settings: (a) q = 5, σ = 1, with varying p; (b) p = 2, σ = 1, with varying q.

The IFs of f_GT for different settings of {p, q, σ} are shown in Figure 2. σ only affects the spread of the distribution, while p and q determine its shape. It is seen that the robustness criteria are satisfied, as the IFs are bounded and in fact redescending as the residuals get large. The two properties of the GT demonstrated above, flexibility and robustness, enable us to achieve a robust yet efficient estimator in the MLE sense. This justifies our main motivation for selecting the GT over other M-estimators, which, as mentioned in Section 2, may not be efficient in the MLE sense because they may not characterize the error distribution well. The GT, on the other hand, can take a wide range of distributional shapes depending on its parameter values. When these parameters are estimated from the data, it is able to adapt its shape to the data. We therefore use a set of historical data to estimate the distribution parameters. To ensure robustness, however, care must be taken in estimating the distribution parameters. We see that as p increases (Figure 2a), the IF value for large residuals increases, and as q increases (Figure 2b), the IF becomes less bounded. It is thus necessary to impose bounds on p and q if robustness is to be preserved (in this work, p ≤ 5, q ≤ 50).
We underline that although the bounds exclude a part of the GT family, by estimating p and q from the data, we effectively fit the data with the GT distribution within the parameter bounds. This ensures that the exclusion will have little effect on the asymptotic efficiency of the estimator (McDonald et al, 1988).
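To make the density of Eq. (3) and its influence function concrete, here is a small sketch; the closed form of ψ(ε) = ∂(−log f_GT)/∂ε is our own derivation from the density, and the parameter values are illustrative:

```python
import numpy as np
from scipy.special import beta
from scipy.integrate import quad

def f_gt(eps, sigma, p, q):
    """GT density of Eq. (3)."""
    z = np.abs(eps) ** p / (q * sigma ** p)
    norm = p / (2.0 * sigma * q ** (1.0 / p) * beta(1.0 / p, q))
    return norm * (1.0 + z) ** (-(q + 1.0 / p))

def psi_gt(eps, sigma, p, q):
    """IF: psi(eps) = d(-log f_GT)/d(eps)
       = (p*q + 1) * |eps|**(p-1) * sign(eps) / (q*sigma**p + |eps|**p)."""
    return ((p * q + 1.0) * np.abs(eps) ** (p - 1.0) * np.sign(eps)
            / (q * sigma ** p + np.abs(eps) ** p))

# p = 2, q = 1/2 lies on the Cauchy branch of Figure 1: the density
# integrates to one ...
area, _ = quad(f_gt, -np.inf, np.inf, args=(1.0, 2.0, 0.5))
assert abs(area - 1.0) < 1e-3
# ... and the IF redescends: a residual of 100 has less influence than one of 1.
assert psi_gt(100.0, 1.0, 2.0, 0.5) < psi_gt(1.0, 1.0, 2.0, 0.5)
```

For p > 1 the numerator of ψ grows like |ε|^(p−1) while the denominator grows like |ε|^p, so ψ → 0 as |ε| → ∞, which is exactly the redescending behaviour seen in Figure 2.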

To estimate the GT distribution parameters, a preliminary reconciliation of the historical data is performed to obtain the residuals ε, which are then fed to the maximum likelihood estimator, given by (Wang et al, 2003):

max_{p,q,σ} Σ log f_GT(ε; p, q, σ)   (4)

The estimates of {p, q, σ} are then obtained as the parameters of the GT member from which the data are most likely sampled. In DRPE, some of the measurement variables are non-redundant, which complicates the estimation of the distribution parameters. Since this work deals with steady-state data, we take the median as the estimated value of the non-redundant measurements. Taking the median corresponds to the use of the robust L-estimator (Albuquerque et al, 1996; Hampel et al, 1986); however, it can only be used when the variables have repeated measurements or are known to be constant over the time horizon considered. Where these conditions do not hold, a robust preliminary parameter estimation or DRPE has to be performed to obtain the residuals for the non-redundant measurements. This is more complex and computationally expensive, especially if we would like to update the distribution parameters online, as their estimation may need a number of iterations. Another alternative is to assign fixed values of {p, q, σ} that are sufficiently robust; in this case, efficiency is traded off for convenience.

Figure 3: Case Study Plant Flowsheet

4. Application Case Study
The proposed robust DRPE strategy is applied to a case study of a pilot-scale setting containing two CSTRs, a mixer and a number of heat exchangers (Figure 3). Material feed from the feed tank is heated before being fed to the first reactor and the mixer. The effluent from the first reactor is then mixed with the material feed in the mixer, and then fed to the second reactor. The effluent from the second reactor is, in turn, fed back to the feed tank and the cycle continues.
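The distribution-fitting step of Eq. (4) can be sketched as a bounded maximum-likelihood problem. This is a minimal sketch, not the authors' implementation: the synthetic Laplacian residuals, the starting point and the exact box-constraint values are our assumptions.

```python
import numpy as np
from scipy.special import betaln
from scipy.optimize import minimize

def gt_negloglik(theta, eps):
    """Negative log-likelihood of the GT density (Eqs. (3)-(4))."""
    sigma, p, q = theta
    z = np.abs(eps) ** p / (q * sigma ** p)
    log_norm = (np.log(p) - np.log(2.0 * sigma)
                - np.log(q) / p - betaln(1.0 / p, q))
    return -np.sum(log_norm - (q + 1.0 / p) * np.log1p(z))

# Synthetic preliminary residuals (Laplacian, chosen for the demo).
rng = np.random.default_rng(0)
eps = rng.laplace(scale=1.0, size=2000)

# Box constraints keep the fit inside the robust region (here p <= 5, q <= 50).
res = minimize(gt_negloglik, x0=np.array([1.0, 2.0, 2.0]), args=(eps,),
               method="L-BFGS-B",
               bounds=[(0.05, None), (0.5, 5.0), (0.1, 50.0)])
sigma_hat, p_hat, q_hat = res.x
```

For Laplacian residuals the ideal GT shape is p = 1 with large q, so the fitted p should land near one while q drifts toward its upper bound, mirroring the behaviour reported in Table 2.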
Steady-state analysis of the system structure results in seven redundant equations involving 14 redundant variables. The model parameters estimated are the product of the heat transfer coefficient with the effective heat transfer area of the steam jacket (P1)

and the cooling coil (P2) of the first reactor. For parameter estimation, two more model equations with five non-redundant variables are included. Associated with the pilot-scale plant, a virtual environment has been developed within the Matlab/Simulink framework; it mimics the actual plant behaviour and is used in this paper while the plant is being commissioned. Simulation data are generated with several different distributions: Normal, Laplacian and Cauchy. The non-Normal distributions are considered to represent outliers. A data set with Normal noise and large random shifts as gross errors is also generated. We then perform DRPE using three different methods: the conventional WLS; the contaminated (bivariate) Gaussian distribution with gross error probability p = 0.2 and gross error ratio b = 20 (Tjoa et al, 1991); and the GT distribution with distribution parameters {p, q, σ} estimated from a historical data set (n=1) having a similar distribution to the current data set. The performance criterion used to compare the efficiencies of the different methods is the mean-squared error (MSE):

MSE = (1/(mK)) Σ_{j=1}^{K} Σ_{i=1}^{m} (x̂_{i,j} − x_{i,j})² / σ_i²   (5)

where m is the number of measured variables and K is the number of data sets used for the DRPE (m=19, K=1 in this study); x̂_{i,j} and x_{i,j} are the reconciled estimate and the actual value of the variable, respectively, and σ_i is the standard deviation of the Gaussian noise on sensor i. The MSE results are shown in Figure 4, while Table 1 lists the estimated model parameters P1 and P2. The fact that the bivariate Gaussian and GT methods are more efficient (lower MSE and % discrepancy of the parameter estimates) than WLS for distributions other than Normal, and for Normal noise with gross error, confirms the robustness of the two M-estimators. Compared to the bivariate Gaussian method, the GT method is more efficient for the Laplacian and Cauchy error distributions, which are special cases of the GT distribution.
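The MSE criterion of Eq. (5) is straightforward to evaluate; the sketch below does so on a small synthetic example (the array shapes and values are illustrative assumptions, not the study's data):

```python
import numpy as np

def mse(x_hat, x_true, sigma):
    """Eq. (5): noise-normalized mean-squared error over m variables and
    K data sets; x_hat and x_true have shape (K, m), sigma has shape (m,)."""
    K, m = x_hat.shape
    return np.sum(((x_hat - x_true) / sigma) ** 2) / (m * K)

# Illustrative numbers: K = 2 data sets, m = 3 measured variables.
x_true = np.array([[1.0, 2.0, 3.0],
                   [1.0, 2.0, 3.0]])
x_hat = x_true + np.array([[0.1, -0.2, 0.0],
                           [0.0,  0.1, 0.2]])
sigma = np.array([0.1, 0.1, 0.1])
print(mse(x_hat, x_true, sigma))  # -> 10/6 ≈ 1.6667
```

Normalizing each residual by its sensor's noise standard deviation puts variables of very different magnitudes (temperatures, flows) on a comparable scale before averaging.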
The GT distribution parameters for some variables in the Laplacian and Cauchy noise cases are listed in Table 2. The reader can refer back to the distribution tree in Figure 1 to see that the estimated p and q are close to the ideal p and q for the respective distributions. For example, for Laplacian noise, ideally p = 1 and q → ∞; the estimated p are close to one, while the estimated q are large or close to the upper bound, i.e. q = 50. Figure 5 plots the relative frequency distribution of the noise, the estimated GT density with estimated {p, q, σ}, the bivariate density (p = 0.2, b = 20) and the Normal distribution N(0, σ²), corresponding to the actual noise and the GT, bivariate and WLS estimators, respectively, for a temperature variable with Laplacian noise. It is seen that the GT estimator characterizes the data best, which explains its lowest MSE and % discrepancy for Laplacian noise in Figure 4 and Table 1. The same can be concluded for Cauchy noise.

5. Conclusion
The DRPE based on the GT distribution is robust and efficient, especially when the underlying error distribution is within the GT family. Since the GT family encompasses

a wide variety of important statistical distributions, the GT-based estimator is a very viable choice of estimator considering its simplicity.

Figure 4: Performance comparison for different noise profiles (MSE of the WLS, bivariate and GT estimators under Normal, Normal + gross error, Laplacian and Cauchy noise)

Figure 5: Distribution plots (relative frequency of the noise and the fitted GT, bivariate and WLS densities for a temperature variable with Laplacian noise)

Table 1: Model Parameter Estimates and Their Accuracies

Noise                 Method   P1       P2       %discrepancy P1   %discrepancy P2
Actual value          --       37,500   56,500   --                --
Normal                WLS      39,170   57,747    4.45              2.21
                      Biv.     36,006   58,000   -3.98              2.65
                      GT       37,296   57,769   -0.54              2.25
Laplace               WLS      39,205   57,761    4.55              2.23
                      Biv.     39,107   57,797    4.29              2.30
                      GT       36,545   57,951   -2.55              2.57
Normal + Gross Error  WLS      40,331   57,426    7.55              1.64
                      Biv.     39,245   57,539    4.65              1.84
                      GT       36,032   57,994   -3.91              2.64
Cauchy                WLS      41,342   51,988   10.25             -7.99
                      Biv.     36,001   58,000   -4.00              2.65
                      GT       36,001   58,000   -4.00              2.65

Table 2: GT Distribution Parameter Estimates

           Cauchy noise (ideal: p = 2, q = 0.5)   Laplacian noise (ideal: p = 1, q → ∞)
Variable   p          q                           p          q
T7         1.9387     0.5                         1.464      4.6
T9         1.0        1.1839                      1.135      47.881
T10        1.478      0.76                        0.186      1.3741
T12        1.8465     0.5                         1.3936     19.733
Trx        0.699      0.5                         1.84       50.0
Tmx        0.993      0.5                         1.0        48.5511

References
Albuquerque, J.S., Biegler, L.T., 1996. AIChE J., Vol. 42, No. 10, pp. 2841-2856.
Arora, N., Biegler, L.T., 2001. Comp. Chem. Eng., Vol. 25, pp. 1585-1599.
Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., Stahel, W.A., 1986. Robust Statistics: The Approach Based on Influence Functions, Wiley.
Kim, I.W., Liebman, M.J., Edgar, T.F., 1990. AIChE J., Vol. 36, pp. 985-993.
McDonald, J.B., Newey, W.K., 1988. Partially Adaptive Estimation of Regression Models via the Generalized T Distribution, Econometric Theory, Vol. 4, pp. 428-457.
Romagnoli, J.A., Sanchez, M.C., 2000. Data Processing and Reconciliation for Chemical Process Operations, Academic Press.
Tjoa, I.B., Biegler, L.T., 1991. Comp. Chem. Eng., Vol. 15, No. 10, pp. 679-690.
Valko, P., Vajda, S., 1987. Comp. Chem. Eng., Vol. 11, pp. 37-43.
Wang, D., Romagnoli, J.A., 2003. Ind. Eng. Chem. Res., Vol. 42, No. 13, pp. 3075-3084.