OPTIMAL SENSOR PLACEMENT FOR JOINT PARAMETER AND STATE ESTIMATION PROBLEMS IN LARGE-SCALE DYNAMICAL SYSTEMS WITH APPLICATIONS TO THERMO-MECHANICS
Roland Herzog, Ilka Riedel, Dariusz Uciński

February 7, 2017

Abstract. We consider large-scale dynamical systems in which both the initial state and some parameters are unknown. These unknown quantities must be estimated from partial state observations over a time window. A data assimilation framework is applied for this purpose. Specifically, we focus on large-scale linear systems with multiplicative parameter-state coupling, as they arise in the discretization of parametric linear time-dependent partial differential equations. Another feature of our work is the presence of a quantity of interest different from the unknown parameters, which is to be estimated based on the available data. In this setting, we develop a simplicial decomposition algorithm for an optimal sensor placement and set forth formulae for the efficient evaluation of all required quantities. As a guiding example, we consider a thermo-mechanical PDE system with the temperature constituting the system state and the induced displacement at a certain reference point as the quantity of interest.

Affiliations: Roland Herzog and Ilka Riedel, Technische Universität Chemnitz, Faculty of Mathematics, Professorship Numerical Mathematics (Partial Differential Equations), D Chemnitz, Germany, roland.herzog@mathematik.tu-chemnitz.de, ilka.riedel@mathematik.tu-chemnitz.de; Dariusz Uciński, University of Zielona Góra, Institute of Control and Computation Engineering, ul. Podgórna 50, Zielona Góra, Poland, d.ucinski@issi.uz.zgora.pl
1. INTRODUCTION

In this paper, we consider joint parameter and state estimation problems for large-scale dynamical systems of the form
$$E\, \dot x(t) = A(p)\, x(t) + f(t), \quad t \in [0, t_f], \qquad x(0) = x_0 \in \mathbb{R}^n. \tag{1.1}$$
Here $x(t) \in \mathbb{R}^n$ is the state vector, $E \in \mathbb{R}^{n \times n}$ is a non-singular matrix, $f(t) \in \mathbb{R}^n$ signifies a known forcing input, $p \in \mathbb{R}^q$ stands for a set of system parameters, and $A(p) \in \mathbb{R}^{n \times n}$ is a matrix representing parameter-dependent dynamics. The purpose of our estimation procedure is to infer an estimate $\hat x_0$ of the unknown initial state $x_0$ as well as an estimate $\hat p$ of the unknown parameters $p$ from partial measurements
$$y_j = C_y\, x(t_j) + \eta_j \in \mathbb{R}^m, \quad j = 1, \dots, N \tag{1.2}$$
of the state trajectory, evaluated at sampling instants $t_1, \dots, t_N$ which are fixed in a given time horizon $[0, t_f]$. Here $\eta_j \in \mathbb{R}^m$ denotes measurement noise, which accounts for factors such as measurement errors and inadequacies of the mathematical model (1.1). We adopt a Bayesian setting, which means that some prior information about $x_0$ and $p$ is available while estimating them. Once determined, the estimates $\hat x_0$ and $\hat p$ are supposed to be plugged into the model (1.1) so as to produce an estimate $\hat x(t_f)$ of the terminal state $x(t_f)$ and then finally yield the estimate $\hat z = C_z\, \hat x(t_f)$ of a quantity of interest (QOI) $z$, which depends linearly on the terminal state $x(t_f)$,
$$z = C_z\, x(t_f) \in \mathbb{R}^r, \tag{1.3}$$
where $C_z \in \mathbb{R}^{r \times n}$ is given. In practice, the measurements of the observable quantity $y$ are subject to measurement error. Consequently, the output noise propagates into the estimate of $(x_0, p)$, thereby influencing the estimate of the QOI $z$. The amount of perturbation in $\hat z$ depends on the matrix $C_y$, which encodes which parts of the state trajectory are being observed. It is the purpose of this paper to optimize the measurement matrix $C_y$ in order to minimize the influence of the measurement error on the estimate of the QOI, in a sense to be made precise below.
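To make the setting concrete, the following minimal sketch simulates a small instance of the system (1.1) with implicit Euler, generates one noisy partial observation of type (1.2), and evaluates a QOI of type (1.3). All concrete choices (the decay-type $A(p)$, the forcing, the observed components, the averaging QOI) are illustrative assumptions of ours, not taken from the paper.

```python
import numpy as np

def simulate(E, A, f, x0, t_grid):
    """Integrate E x'(t) = A x(t) + f(t) with implicit Euler (system (1.1))."""
    xs = [x0]
    for k in range(1, len(t_grid)):
        dt = t_grid[k] - t_grid[k - 1]
        # (E - dt*A) x_k = E x_{k-1} + dt * f(t_k)
        xs.append(np.linalg.solve(E - dt * A, E @ xs[-1] + dt * f(t_grid[k])))
    return np.array(xs)

rng = np.random.default_rng(0)
n, m, r = 6, 2, 1
E = np.eye(n)
p = np.array([0.5])                     # a single scalar parameter (illustrative)
A = -p[0] * np.eye(n)                   # A(p): simple parameter-dependent decay
f = lambda t: np.ones(n) * np.sin(t)    # known forcing input
x0 = rng.standard_normal(n)

t_grid = np.linspace(0.0, 1.0, 51)
X = simulate(E, A, f, x0, t_grid)

C_y = np.eye(n)[:m, :]                  # observe the first m state components
sigma = 1e-2
y = X[-1] @ C_y.T + sigma * rng.standard_normal(m)  # one noisy measurement (1.2)

C_z = np.ones((r, n)) / n               # QOI: mean of the terminal state (1.3)
z = C_z @ X[-1]
```

Choosing $C_y$ as rows of the identity, as here, is exactly the sensor-selection structure studied below.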
We envision that the state vector $x(t) \in \mathbb{R}^n$ is high-dimensional and represents a distributed quantity, as for instance in the discretization of time-dependent partial differential equations. It is assumed that the measurement matrix $C_y$ consists of $m$ distinct rows of the $n \times n$ identity matrix. In this setting, the optimization of $C_y$ can be understood as choosing optimal sensor locations. We point out that we consider the sensors to be static here.

NOTATION. Throughout the paper, $\mathbb{R}_+$ and $\mathbb{R}_{++}$ stand for the sets of nonnegative and positive real numbers, respectively. We adopt the convention that all vectors have column form. The set of real $m \times n$ matrices is denoted by $\mathbb{R}^{m \times n}$. We use $\mathrm{sym}^m$ to denote the set of symmetric $m \times m$ matrices, $\mathbb{S}^m_+$ to denote the set of symmetric nonnegative definite $m \times m$ matrices, and $\mathbb{S}^m_{++}$ to denote the set of symmetric positive definite $m \times m$ matrices. The symbol $\mathrm{id}_n$ denotes the $n \times n$ identity matrix. The symbol $\mathbb{1}_n$ denotes a vector whose components are all equal to one. Given two vectors $x$ and $y$ of dimension $n$, $x \odot y$ is an $n$-vector whose $i$-th component is $x_i y_i$ (the componentwise multiplication operator). Finally, the symbol $\mathrm{conv}(\{q_1, \dots, q_l\})$ denotes the convex hull of a set of vectors $q_i$, $i = 1, \dots, l$.

MOTIVATION: THERMO-MECHANICAL PDE SYSTEM

As a motivation to consider sensor placement problems for systems of type (1.1), we mention an application described by a thermo-mechanical PDE system. More details are given in Section 5. Suppose that the temperature $T$ of a machine tool constitutes the state of the system and is governed by the heat equation, endowed with boundary conditions describing the heat flux:
$$\rho\, c_p\, \dot T - \operatorname{div}(\lambda\, \nabla T) = 0, \qquad \lambda\, \partial_n T + \alpha(x)\, (T - T_{\mathrm{ref}}) = r(x, t).$$
The heat transfer coefficient $\alpha(x)$ depends on the spatial position $x$ and subsumes various physical phenomena, such as convective and radiative heat transfer. Its true value is therefore unknown and must be estimated from a time series of temperature measurements. A second unknown is the initial temperature state $T_0(x)$, which arises from previous operation of the machine and cannot be measured directly. The right-hand side $r(x, t)$ represents heat sources acting on the machine tool. A table describing these correspondences with the model (1.1) is provided as Table 5.2. It is not our primary goal to estimate the temperature distribution of the machine at time $t_f$, but rather to estimate the QOI, that is, the displacement of a certain relevant point of the machine structure induced by that temperature.
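As a toy illustration of how such a boundary-value problem leads to a system of type (1.1), the following sketch semi-discretizes a one-dimensional heat equation with a Robin boundary term by finite differences. All sizes and coefficient values are made up for illustration; the scalar `alpha` plays the role of the unknown heat transfer coefficient and enters the system matrix multiplicatively, as $p$ does in $A(p)$.

```python
import numpy as np

def heat_system(n, lam=1.0, rho_cp=1.0, alpha=0.5, h=0.1):
    """Finite-difference semi-discretization of rho*c_p*T_t = (lam*T_x)_x on a rod,
    with a Robin condition lam*T_x + alpha*(T - T_ref) = r at the right end.
    Returns (E, A) such that E T'(t) = A T(t) + (source terms), matching (1.1)."""
    E = rho_cp * np.eye(n)                 # lumped "mass" matrix
    A = np.zeros((n, n))
    for i in range(n):
        if i > 0:                          # coupling to the left neighbor
            A[i, i - 1] += lam / h**2
            A[i, i] -= lam / h**2
        if i < n - 1:                      # coupling to the right neighbor
            A[i, i + 1] += lam / h**2
            A[i, i] -= lam / h**2
    A[-1, -1] -= alpha / h                 # Robin (heat transfer) boundary term
    return E, A
```

The resulting $A$ is symmetric and negative definite (the Robin term removes the constant-temperature kernel of the insulated rod), so the semi-discrete system is dissipative, as expected for heat conduction.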
Notice that thermally induced displacements can be a dominant source of positioning errors in machine tools. It is the precision of the estimation of these displacements that we are concerned with. To increase this precision, we wish to find optimal locations of temperature (state) sensors on the machine's surface.

RELATED WORK AND STRUCTURE OF THE PAPER

Let us put our paper into perspective. In the absence of unknown parameters $p$ in (1.1), the estimation of the terminal state $x(t_f)$ in a dynamical model such as (1.1) from previous measurements of the state is known as a data assimilation problem; see, e.g., (Freitag and Potthast, 2013; Law et al., 2015; Cacuci et al., 2014). Notice that unknown
parameters could easily be incorporated by declaring them as artificial state variables satisfying $\dot p(t) = 0$. We do not follow this approach but prefer to keep $p$ and $x$ separate. Such joint parameter and state estimation problems were considered, for instance, in Kühl et al. (2011); Küpper et al. (2009).

A key design problem in state and/or parameter estimation of distributed parameter systems (DPSs) consists in properly deploying the available measurement sensors. Logically, they should be placed at sites which provide the most valuable information about the estimated quantities. As it is desirable to determine the best sensor positions before the actual data collection, the issue that must primarily be addressed is the appropriate choice of the optimality criterion. As for state estimation, various criteria quantifying observability were employed in deterministic scenarios (El Jai and Pritchard, 1988), whereas in stochastic settings the research focused on minimizing criteria which aggregate the covariance matrix of the estimation error; see (Kubrusly and Malebranche, 1985) for the state of the art in the mid-1980s. Since the Kalman filter, which was the main tool to produce state estimates, was hard to implement in realistic settings due to its prohibitive computational and memory requirements, this line of research was abandoned for nearly two decades; renewed interest in it was then observed in the framework of variational data assimilation (Cacuci et al., 2014) and spatial statistics (Cressie and Wikle, 2011). In turn, sensor location for parameter estimation usually follows the traditional approach of statistical experimental design (Atkinson et al., 2007; Pázman, 1986; Pronzato and Pázman, 2013; Pukelsheim, 2006) and is based on various scalar measures of performance defined on the Fisher information matrix (FIM) associated with the estimated parameters. The inverse of the FIM constitutes the Cramér-Rao bound on the covariance matrix of the estimates.
The approach dates back to the work of Uspenskii and Fedorov (1975), whose ideas were then extended by Rafajłowicz (1981, 1986). A comprehensive overview of this currently very active research area is contained in the monograph (Uciński, 2005). Over the past decade, contributions about sensor location have continued to grow. Results regarding various types of PDEs have been reported, e.g., for reaction-diffusion or convection-diffusion problems (Alonso et al., 2004a; Armaou and Demetriou, 2006; Alonso et al., 2004b; García et al., 2007), as well as for models in fluid dynamics (Mokhasi and Rempfer, 2004; Cohen et al., 2006; Willcox, 2006; Yildirim et al., 2009). By the same token, the problem has been considered in numerous applications, e.g., in environmental and water resource systems (Sun and Sun, 2015), for mechanical deformation problems (Yi et al., 2011; Meo and Zumpano, 2005), as well as in sensor networks (Song et al., 2009).

A great difficulty in the estimation of DPSs arises due to the infinite-dimensional nature of the parameter space. Some theoretical problems, such as the existence of a least-squares estimator, continuous dependence of the estimator on the data, and convergence of approximations, require compactness of the parameter space. If these aspects are not properly addressed, the estimation process may be ill-posed in the sense that noise in the data may give rise to significant errors in the estimate. Therefore, techniques
known as regularization methods have been developed to deal with this ill-posedness, e.g., Tikhonov regularization (Vogel, 2002). They, however, hardly ever consider the statistical aspects of the estimation problem. Alternatively, a Bayesian framework can be employed, which quite naturally makes it possible to take account of prior statistical information about the unknown parameters and/or states. Bayesian methods, unlike asymptotic methods of classical statistics, turn out to be well-suited theoretically and computationally to infinite-dimensional parameter spaces and can handle the above-mentioned theoretical problems well (Fitzpatrick, 1991). Unfortunately, sensor location for Bayesian inference in DPSs (or, in general, estimation combined with regularization) has not been sufficiently considered yet. Recent research, however, points to some breakthroughs in this area, especially in the context of variational data assimilation. Gejadze and Shutyaev (2012) approached the problem of efficiently evaluating the gradient of the A-optimality criterion with respect to the spatial coordinates of the sensors for estimating the initial condition of a one-dimensional Burgers equation with a nonlinear viscous term. To this end, they used a limited-memory approximation of the inverse Hessian of the data assimilation cost function (up to a multiplier, the Hessian is equal to the FIM associated with the coefficients of a finite-dimensional parametrization of the initial state). The cost of the attendant computations is substantially reduced by extensive use of adjoint equations. In turn, the selection of an optimal subset of candidate sensor locations has been studied by Alexanderian et al. (2014) for estimation of the initial state of a three-dimensional advection-diffusion equation. The optimality criterion was the trace of the posterior covariance, implemented in practice through a randomized trace estimator.
Substantial computational savings result from using a randomized SVD to obtain a low-rank surrogate for the prior-preconditioned parameter-to-observable map. Efficiency is additionally increased by specifying the covariance operator of the Gaussian prior as the inverse of an elliptic differential operator, which can be evaluated using fast solvers for elliptic PDEs. A successful attempt to generalize this approach to a parameter estimation problem (i.e., a nonlinear inverse problem) for inferring a coefficient field in a two-dimensional elliptic problem has been made in (Alexanderian et al., 2016). This inspired the formulation we use in our paper.

In an earlier paper (Herzog and Riedel, 2015), we focused on sensor placement problems for thermo-mechanical systems, but in the absence of a dynamical system (1.1). To be precise, the temperature field was estimated directly from instantaneous measurements and in a reduced-order temperature space. This is not possible here, since the heat transfer coefficient $\alpha$ is considered unknown. Notice that an estimation of $\alpha$ is only possible in a time-dependent model.

The particular features of the problem at hand and the novelties in the present paper, compared with previous work on sensor placement, are the following. The presence of the QOI prevents us from directly using the Fisher information matrix (FIM) of the $(x_0, p)$-estimation problem to formulate the objective for the optimal sensor placement problem. Instead, we must use the (approximate) covariance matrix of the QOI estimator, which involves the solution map of a linearized state system. Since we assume the dimension of the QOI to be much lower than the state dimension ($r \ll n$), we employ an adjoint technique to evaluate that covariance matrix efficiently. In order to solve the sensor placement problem, we employ a simplicial decomposition algorithm, which was analyzed in Patriksson (1999) and Bertsekas (2015). To solve the main subproblem, we make use of the classical multiplicative algorithm which goes back to Silvey et al. (1978), but needs to be adapted to the objective at hand. We refer the reader to Torsney (2009); Yu (2010) for a historical overview. Basically, while solving the relaxed convex sensor selection problem (Problem 3.2), we could adapt the approach outlined by Joshi and Boyd (2009), which advocates an interior-point method. As will be shown, however, the implementation of simplicial decomposition is strikingly easy, the algorithm usually runs very fast, and most often the solutions produced by it are rather sparse (i.e., the number of nonzero weights is low). Sparsity may be quite an acute problem as far as relaxed solutions are concerned and usually requires augmenting the criterion by sparsifying penalty functions (Chepuri and Leus, 2015; Alexanderian et al., 2014; Haber et al., 2010, 2008). The linear programming subproblem built into simplicial decomposition seems to successfully retain a moderate number of nonzero weights.

Due to the multiplicative coupling of the parameters $p$ and the state vector $x(t)$ in (1.1), the covariance of the QOI depends not only on the measurement matrix $C_y$ but also on the unknown parameters $p$ themselves (but not on the unknown initial state $x_0$). Often, this feature is addressed in sensor placement or similar experimental design problems by embedding the latter in a robust formulation, where the unknown parameter is confined to an uncertainty set.
This significantly adds to the level of complexity of the problem; see, e.g., (Uciński, 2005; Pronzato and Pázman, 2013) or (Körkel et al., 2004; Diehl et al., 2006; Bock et al., 2007). In this paper, we focus on the sensor placement problem for systems of type (1.1) in the presence of a QOI and therefore content ourselves with a given set-point (nominal value) $p^0$ in the parameter space.

In Section 2, we formulate the data assimilation problem, which is used to jointly estimate the unknown initial state $x_0$ and the parameters $p$. The sensor placement problem is addressed in Section 3, and we propose a simplicial decomposition algorithm for its solution in Section 4. Subsequently, we elaborate on a specific thermo-mechanical system modeling a machine tool, where the temperature constitutes the system state $x(t)$ and the thermo-mechanically induced displacement at a certain reference point (the tool center point, or TCP) serves as the quantity of interest $z$. We seek optimal locations of temperature sensors on the surface of the machine in order to obtain an accurate estimate of the TCP displacement. The details are given in Section 5 and illustrated with numerical results in Section 6.

2. DATA ASSIMILATION PROBLEM

We consider the dynamical system (1.1) with state $x(t) \in \mathbb{R}^n$, unknown initial state $x_0 \in \mathbb{R}^n$, and unknown parameter vector $p \in \mathbb{R}^q$. We assume that measurements (1.2) of
certain parts of the state trajectory are taken at given measurement times $t_j$, $j = 1, \dots, N$ during the time interval $[0, t_f]$ under consideration. The measurements $y_j$ are subject to measurement errors $\eta_j$, $j = 1, \dots, N$, which we assume to be i.i.d. random variables with normal distribution $\mathcal{N}(0, V_y)$, where $V_y = \sigma^2\, \mathrm{id}_m$. This means that the components of each $\eta_j$ are independent zero-mean random variables with the same variance $\sigma^2$, or equivalently, that the measurements from different sensors are independent of one another and that their accuracy is the same.

The unknowns in the model (1.1) are $x_0$ and $p$. However, our prior (background) information consists of their prior estimates $x_0^{\mathrm{bg}}$ and $p^{\mathrm{bg}}$, which are supposed to be realizations of Gaussian random vectors with means $\bar x_0 \in \mathbb{R}^n$ and $\bar p \in \mathbb{R}^q$ and covariance matrices $V_{x_0} \in \mathbb{R}^{n \times n}$ and $V_p \in \mathbb{R}^{q \times q}$, respectively, i.e.,
$$x_0^{\mathrm{bg}} \sim \mathcal{N}(\bar x_0, V_{x_0}) \quad \text{and} \quad p^{\mathrm{bg}} \sim \mathcal{N}(\bar p, V_p).$$
Here $\bar x_0$ and $\bar p$ are unknown and interpreted as the true initial state and the true parameter, respectively. In turn, as for $V_{x_0}$ and $V_p$, we assume that they are known and positive definite, and hence invertible.

As is usually the case in data assimilation problems, the number of unknowns ($n + q$) exceeds the number of measurements ($N m$). Consequently, regularization terms are needed, expressing the above-mentioned prior information about the unknowns. We thus state our data assimilation problem as follows, cf. Cacuci et al. (2014):
$$\min_{x_0 \in \mathbb{R}^n,\; p \in \mathbb{R}^q} J_{\mathrm{DA}}(x_0, p) = \frac12\, \|x_0 - x_0^{\mathrm{bg}}\|^2_{V_{x_0}^{-1}} + \frac12\, \|p - p^{\mathrm{bg}}\|^2_{V_p^{-1}} + \frac12 \sum_{j=1}^{N} \|y_j - C_y\, x(t_j; x_0, p)\|^2_{V_y^{-1}}, \tag{2.1}$$
where $x(t_j; x_0, p)$ is the solution to (1.1) at sampling time $t_j$, evaluated at given $x_0$ and $p$. In order to solve the nonlinear least-squares problem (2.1), one can employ a standard derivative-based method such as the Gauss-Newton or Levenberg-Marquardt algorithms; see for instance (Nocedal and Wright, 2006, Section 10.3). In order to formulate the Jacobian of the model output w.r.t.
the unknowns $(x_0, p)$, we introduce the sensitivities $X_0(t) = \frac{\partial}{\partial x_0} x(t; x_0, p) \in \mathbb{R}^{n \times n}$ and $X_p(t) = \frac{\partial}{\partial p} x(t; x_0, p) \in \mathbb{R}^{n \times q}$ of the state $x(t; x_0, p)$ with respect to the initial state $x_0$ and the parameters $p$. By the implicit function theorem, it follows from (1.1) that $X_0$ is given by the linear system
$$E\, \dot X_0(t) = A(p)\, X_0(t), \quad t \in [0, t_f], \qquad X_0(0) = \mathrm{id}_n, \tag{2.2}$$
and $X_p$ satisfies
$$E\, \dot X_p(t) = A'(p)\, x(t) + A(p)\, X_p(t), \quad t \in [0, t_f], \qquad X_p(0) = 0 \in \mathbb{R}^{n \times q}. \tag{2.3}$$
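For small dense systems, the sensitivity equations (2.2)-(2.3) can be integrated alongside the state with the same time stepping scheme. The following sketch uses implicit Euler (our choice; the paper does not prescribe a scheme) and assumes the partial derivatives `dA[l]` $= \partial A(p)/\partial p_l$ are available as explicit matrices:

```python
import numpy as np

def sensitivities(E, A, dA, x_traj, t_grid):
    """Integrate the sensitivity systems (2.2)-(2.3) by implicit Euler.

    dA is a list of q matrices dA[l] = dA(p)/dp_l, so the source term
    A'(p) x(t) in (2.3) has columns dA[l] @ x(t).  x_traj[k] is x(t_k).
    """
    n, q = E.shape[0], len(dA)
    X0, Xp = [np.eye(n)], [np.zeros((n, q))]   # X_0(0) = id_n, X_p(0) = 0
    for k in range(1, len(t_grid)):
        dt = t_grid[k] - t_grid[k - 1]
        M = E - dt * A                          # implicit Euler system matrix
        X0.append(np.linalg.solve(M, E @ X0[-1]))
        src = np.column_stack([dA[l] @ x_traj[k] for l in range(q)])
        Xp.append(np.linalg.solve(M, E @ Xp[-1] + dt * src))
    return X0, Xp
```

For the scalar test case $\dot x = -p\,x$, the computed sensitivities converge to the exact values $X_0(t) = e^{-pt}$ and $X_p(t) = -t\,x_0\,e^{-pt}$ as the step size decreases.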
Note that, for simplicity of notation, we let $A'(p)\, x(t)$ stand for the Jacobian matrix of the mapping $p \mapsto A(p)\, x(t)$ with respect to $p$ while holding $x(t)$ constant. Using the chain rule (Magnus and Neudecker, 1999, Thm. 12, p. 108), we easily deduce that
$$A'(p)\, x(t) = \frac{\partial}{\partial p} \bigl( A(p)\, x \bigr)\Big|_{x = x(t)} = \bigl( x(t)^\top \otimes \mathrm{id}_n \bigr)\, \frac{\partial \operatorname{vec} A(p)}{\partial p} = \Bigl[ \frac{\partial A(p)}{\partial p_1} \,\cdots\, \frac{\partial A(p)}{\partial p_q} \Bigr] \bigl( \mathrm{id}_q \otimes x(t) \bigr), \tag{2.4}$$
where $\operatorname{vec}$ is the column-stacking operator and $\otimes$ signifies the Kronecker product. Due to the linearity of the output equation (1.2), the sensitivity of the model output to changes in $(x_0, p)$ is given by
$$\frac{\partial\, C_y\, x(t_j; x_0, p)}{\partial (x_0, p)} = C_y \bigl[ X_0(t_j) \;\; X_p(t_j) \bigr] \in \mathbb{R}^{m \times (n+q)}, \quad j = 1, \dots, N. \tag{2.5}$$
The data assimilation problem (2.1) can be written as a weighted least-squares problem of the form
$$\min_{x_0 \in \mathbb{R}^n,\; p \in \mathbb{R}^q} \frac12\, r(x_0, p)^\top H\, r(x_0, p) \tag{2.6}$$
with the residual vector
$$r(x_0, p) = \begin{bmatrix} x_0 - x_0^{\mathrm{bg}} \\ p - p^{\mathrm{bg}} \\ y_1 - C_y\, x(t_1; x_0, p) \\ \vdots \\ y_N - C_y\, x(t_N; x_0, p) \end{bmatrix} \tag{2.7}$$
and the symmetric nonnegative definite weight matrix
$$H = \operatorname{diag}\bigl( V_{x_0}^{-1},\, V_p^{-1},\, \underbrace{V_y^{-1}, \dots, V_y^{-1}}_{N \text{ times}} \bigr) \in \mathbb{R}^{(n+q+Nm) \times (n+q+Nm)}.$$
The Jacobian of the residual can be computed from the sensitivities defined above in the following way:
$$J(x_0, p) = \frac{\partial r(x_0, p)}{\partial (x_0, p)} = \begin{bmatrix} \mathrm{id}_n & 0 \\ 0 & \mathrm{id}_q \\ -C_y\, X_0(t_1) & -C_y\, X_p(t_1) \\ \vdots & \vdots \\ -C_y\, X_0(t_N) & -C_y\, X_p(t_N) \end{bmatrix}. \tag{2.8}$$
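A generic Gauss-Newton iteration for the weighted least-squares form (2.6) can be sketched as follows. Here `residual` and `jacobian` are callables returning $r(\theta)$ and $J(\theta)$ for the stacked unknown $\theta = (x_0, p)$; solving the normal equations by a plain dense solve is an assumption that only fits small problems.

```python
import numpy as np

def gauss_newton(residual, jacobian, theta0, H, max_iter=20, tol=1e-10):
    """Minimize 0.5 * r(theta)^T H r(theta), cf. (2.6), by Gauss-Newton."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        r, J = residual(theta), jacobian(theta)
        grad = J.T @ (H @ r)                        # gradient of the cost
        step = np.linalg.solve(J.T @ H @ J, -grad)  # Gauss-Newton step
        theta = theta + step
        if np.linalg.norm(step) <= tol * (1 + np.linalg.norm(theta)):
            return theta
    return theta
```

For a residual that is affine in $\theta$ (as it would be if the dynamics did not depend on $p$), a single iteration already reproduces the weighted normal-equations solution.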
Notice that, for a large state dimension $n$, the sensitivity trajectory $X_0 : [0, t_f] \to \mathbb{R}^{n \times n}$ will be of a formidable size. Also, since the number of model outputs and measurements, $N m$, is typically smaller than the number of unknowns $n + q$, it is more economical to evaluate the Jacobian using the adjoint technique. We will now show that one single adjoint variable $S : [0, t_f] \to \mathbb{R}^{n \times m}$ is enough to attain this objective.

To this end, consider one typical output estimate $\hat y_j = C_y\, x(t_j; x_0, p)$ of the actual output $y_j$. Adjoin (1.1) to this estimate with an arbitrary time-varying Lagrange multiplier matrix $S_j(t) \in \mathbb{R}^{n \times m}$ as follows:
$$\hat y_j = C_y\, x(t_j; x_0, p) - \int_0^{t_j} S_j(t)^\top \underbrace{\bigl[ E\, \dot x(t; x_0, p) - A(p)\, x(t; x_0, p) - f(t) \bigr]}_{=\,0}\, \mathrm{d}t. \tag{2.9}$$
Let us integrate the $S_j(t)^\top E\, \dot x(t)$ term in (2.9) by parts, yielding
$$\hat y_j = \bigl[ C_y - S_j(t_j)^\top E \bigr] x(t_j) + S_j(0)^\top E\, x(0) + \int_0^{t_j} \bigl[ \dot S_j(t)^\top E + S_j(t)^\top A(p) \bigr] x(t)\, \mathrm{d}t + \int_0^{t_j} S_j(t)^\top f(t)\, \mathrm{d}t. \tag{2.10}$$
Differentiating both sides of (2.10) with respect to $x_0$, we thus get
$$C_y\, X_0(t_j) = \frac{\partial \hat y_j}{\partial x_0} = \bigl[ C_y - S_j(t_j)^\top E \bigr] X_0(t_j) + S_j(0)^\top E\, X_0(0) + \int_0^{t_j} \bigl[ \dot S_j(t)^\top E + S_j(t)^\top A(p) \bigr] X_0(t)\, \mathrm{d}t. \tag{2.11}$$
To avoid having to determine the function $X_0(t)$, we choose the multiplier function $S_j(t)$ so that the coefficients of $X_0(t)$ and $X_0(t_j)$ vanish, i.e., we specify it as the solution to the following backwards-in-time adjoint differential equation:
$$E^\top \dot S_j(t) = -A(p)^\top S_j(t), \quad t \in [0, t_j], \qquad E^\top S_j(t_j) = C_y^\top. \tag{2.12}$$
Equation (2.11) then becomes
$$C_y\, X_0(t_j) = S_j(0)^\top E\, X_0(0) = S_j(0)^\top E. \tag{2.13}$$
In turn, differentiating both sides of (2.10) with respect to $p$, we get
$$C_y\, X_p(t_j) = \frac{\partial \hat y_j}{\partial p} = \bigl[ C_y - S_j(t_j)^\top E \bigr] X_p(t_j) + S_j(0)^\top E\, X_p(0) + \int_0^{t_j} \bigl[ \dot S_j(t)^\top E + S_j(t)^\top A(p) \bigr] X_p(t)\, \mathrm{d}t + \int_0^{t_j} S_j(t)^\top \bigl( A'(p)\, x(t) \bigr)\, \mathrm{d}t. \tag{2.14}$$
But on account of (2.12) and the initial condition in (2.3), this simplifies to
$$C_y\, X_p(t_j) = \int_0^{t_j} S_j(t)^\top \bigl( A'(p)\, x(t) \bigr)\, \mathrm{d}t. \tag{2.15}$$
Consequently, the block row of the Jacobian (2.8) associated with the output at time $t_j$ can be expressed as
$$\bigl[ C_y\, X_0(t_j) \;\; C_y\, X_p(t_j) \bigr] = \Bigl[ S_j(0)^\top E \;\; \int_0^{t_j} S_j(t)^\top \bigl( A'(p)\, x(t) \bigr)\, \mathrm{d}t \Bigr].$$
It is now important to observe that (2.12) is an autonomous system. Therefore, $S_j(t) = S_k(t - t_j + t_k)$ holds whenever both are defined. We conclude that, in place of $N$ different systems of type (2.12), it is enough to consider a single adjoint system for the adjoint state $S : [0, t_f] \to \mathbb{R}^{n \times m}$,
$$E^\top \dot S(t) = -A(p)^\top S(t), \quad t \in [0, t_f], \qquad E^\top S(t_f) = C_y^\top. \tag{2.16}$$
Since $S_j(t) = S(t - t_j + t_f)$ holds, each block row of the Jacobian can be evaluated according to
$$\bigl[ C_y\, X_0(t_j) \;\; C_y\, X_p(t_j) \bigr] = \Bigl[ S(t_f - t_j)^\top E \;\; \int_0^{t_j} S(t - t_j + t_f)^\top \bigl( A'(p)\, x(t) \bigr)\, \mathrm{d}t \Bigr]. \tag{2.17}$$
We provide in Table 3.1 an overview of the quantities required during the solution of the data assimilation problem (2.1) by gradient-based methods.

3. SENSOR PLACEMENT PROBLEM

3.1. COVARIANCE OF THE QOI ESTIMATOR

Having solved the data assimilation problem (2.1), we obtain estimates $\hat x_0$ and $\hat p$ of the sought true values $\bar x_0$ and $\bar p$, respectively. In the sequel, we shall concatenate $x_0$ and $p$ so as to have only one vector of unknown true parameters $\bar\theta = (\bar x_0, \bar p)$ and its estimate $\hat\theta = (\hat x_0, \hat p)$. As was mentioned in the introduction, our main concern is not to estimate the unknown initial state $x_0$ or the parameter vector $p$ directly, but rather to estimate a quantity of interest $z$ depending on the terminal state $x(t_f)$ at time $t_f$,
$$z = C_z\, x(t_f; \bar\theta) \in \mathbb{R}^r \tag{3.1}$$
through
$$\hat z = C_z\, x(t_f; \hat\theta) \in \mathbb{R}^r \tag{3.2}$$
with $r$ small compared with the dimension $n$ of the state variable. To be able to assess the quality of the estimator (3.2), we investigate the expected dispersion of the estimates produced by it, which is quantified by the covariance matrix $\mathrm{Cov}(\hat z)$. Clearly, the QOI $z$ depends on the unknowns $(x_0, p)$ in an indirect way, and its dependence on $p$ is nonlinear. Therefore, obtaining an expression for the covariance matrix of the estimator $\hat z$ is a real challenge. That is why we follow here a standard approach in the literature, cf. Mehra (1974), and resort to the covariance of a linearized estimator, which is obtained by linearizing the parameter-to-QOI map. This approach is backed up by asymptotic considerations; see for instance (Pronzato and Pázman, 2013, Chapter 3). From now on, let $\theta^0 = (x_0^0, p^0)$ denote a given set-point in the parameter space (we may set $\theta^0 = \theta^{\mathrm{bg}} = (x_0^{\mathrm{bg}}, p^{\mathrm{bg}})$), where (3.1) is linearized. An application of the chain rule, applied to (3.1) and (2.2)-(2.3), shows that this linearization is given by the matrix
$$Q = \frac{\partial z}{\partial \theta}\Big|_{\theta = \theta^0} = C_z\, X(t_f; \theta^0) \in \mathbb{R}^{r \times d}, \tag{3.3}$$
where here and subsequently we write, for abbreviation, $d = n + q$ and $X(t; \theta) = [X_0(t; \theta) \;\; X_p(t; \theta)]$. Consequently, the covariance of the linearized QOI estimator is related via
$$\mathrm{Cov}(\hat z) = Q\, \mathrm{Cov}(\hat\theta)\, Q^\top \tag{3.4}$$
to the covariance $\mathrm{Cov}(\hat\theta)$ of the parameter estimator $\hat\theta$. Throughout the paper we assume that the matrix $Q$ has full row rank:
$$\operatorname{rank} Q = r. \tag{3.5}$$
In order to form the matrix $Q$, we exploit the similarity of (3.3) and (2.5) and follow an adjoint approach. To be precise, we solve the additional adjoint system for $S_Q : [0, t_f] \to \mathbb{R}^{n \times r}$,
$$E^\top \dot S_Q(t) = -A(p^0)^\top S_Q(t), \quad t \in [0, t_f], \qquad E^\top S_Q(t_f) = C_z^\top, \tag{3.6}$$
and evaluate
$$Q = \bigl[ C_z\, X_0(t_f; \theta^0) \;\; C_z\, X_p(t_f; \theta^0) \bigr] = \Bigl[ S_Q(0)^\top E \;\; \int_0^{t_f} S_Q(t)^\top \bigl( A'(p^0)\, x(t) \bigr)\, \mathrm{d}t \Bigr]. \tag{3.7}$$
The problem of characterizing and evaluating $\mathrm{Cov}(\hat\theta)$ has been extensively investigated by researchers concerned with variational data assimilation. Gejadze et al.
outlined an approach to obtain approximations to the covariance matrices for the data assimilation
problem in which either the initial state $x_0$ or the parameter vector $p$ are the unknowns; see Gejadze et al. (2008) and Gejadze et al. (2010), respectively, as well as Gejadze et al. (2013); Gejadze and Shutyaev (2012). It is rather straightforward to combine these results in our problem of joint estimation of $x_0$ and $p$, thereby obtaining
$$\mathrm{Cov}(\hat\theta) \approx \Bigl( V_\theta^{-1} + \sum_{j=1}^{N} X(t_j)^\top C_y^\top V_y^{-1} C_y\, X(t_j) \Bigr)^{-1}, \tag{3.8}$$
where $V_\theta = \operatorname{diag}(V_{x_0}, V_p)$ and $X(t_j) = X(t_j; \bar\theta)$. The dependence of the right-hand side on the true vector $\bar\theta$ is not surprising, as it is the rule whenever estimates of the covariance matrices of various estimators are constructed in settings where the outputs depend nonlinearly on the estimated parameters. Clearly, we do not know $\bar\theta$ and, in practice, we approximate it by a preliminary estimate $\theta^0$ (e.g., a logical choice is $\theta^0 = \theta^{\mathrm{bg}}$).

Table 3.1: Overview of quantities for the solution of the data assimilation and the sensor placement problems, and how to evaluate them efficiently.

    Quantity                       defined in   evaluate using   requires
    r(x_0, p), residual            (2.7)        (2.7)            solution x of (1.1)
    J(x_0, p), Jacobian            (2.8)        (2.17)           solution S of (2.16)
    Q, Jacobian of the QOI         (3.3)        (3.7)            solution S_Q of (3.6)

3.2. THE CRITERION TO BE OPTIMIZED

Our optimal design problem consists in determining an $m$-element subset, selected out of a total of $n$ state variables, which would yield the lowest variability in the estimates of the QOI, as measured by the covariance matrix (3.4). In order to express this formally, we define a decision variable, the $n$-dimensional vector $w$, whose component $w_i$ is one if $x_i$ is supposed to be measured and zero if $x_i$ is not going to be measured. In consequence, the observation matrix takes the form
$$C_y(w) = D(\operatorname{diag}(w)), \tag{3.9}$$
where $D$ stands for the operation of forming a submatrix of its matrix argument by deleting all zero rows.
Since we assume that the measurements of the observed state components are independent of one another and taken by equally accurate sensors, i.e., $V_y = \sigma^2\, \mathrm{id}_m$ for some known variance $\sigma^2$, it follows that
$$\mathrm{Cov}(\hat\theta) \approx I(w)^{-1}, \tag{3.10}$$
where
$$I(w) = V_\theta^{-1} + \frac{1}{\sigma^2} \sum_{j=1}^{N} X(t_j)^\top \operatorname{diag}(w)\, X(t_j) = V_\theta^{-1} + \sum_{i=1}^{n} w_i\, \Upsilon_i, \tag{3.11}$$
$$\Upsilon_i = \frac{1}{\sigma^2} \sum_{j=1}^{N} \operatorname{row}_i(X(t_j))^\top \operatorname{row}_i(X(t_j)), \quad i = 1, \dots, n. \tag{3.12}$$
Here $\operatorname{row}_i$ signifies the $i$-th row of its matrix argument. We call $I(w)$ the Bayesian information matrix for $\theta$, cf. Chepuri and Leus (2015). Observe that the positive definiteness of $V_{x_0}$ and $V_p$ implies that of $V_\theta$, and this, in turn, forces $I(w)$ to be positive definite (since the term $\sum_{i=1}^{n} w_i \Upsilon_i$ is nonnegative definite). Consequently, there is no problem with the inversion of $I(w)$.

For the intended search for an optimal $w$, we have to introduce an appropriate optimality criterion. As nonnegative definite matrices can only be partially ordered, instead of directly comparing the covariance matrices for different choices of the output matrix, a scalar performance index $\Psi$ defined on $\mathrm{Cov}(\hat\theta)$ can be used here. Thus, our sensor selection problem can ultimately be expressed as the following optimization problem:

Problem 3.1 (Sensor Selection Problem). Find a vector $w^\star_{\mathrm{bin}} \in \mathbb{R}^n$ to minimize
$$J(w) = \Psi\bigl( Q\, I(w)^{-1} Q^\top \bigr) \tag{3.13}$$
subject to the constraints
$$\mathbb{1}_n^\top w = m, \tag{3.14}$$
$$w_i \in \{0, 1\}, \quad i = 1, \dots, n. \tag{3.15}$$

In the role of $\Psi$, various alphabetical optimality criteria commonly used in experimental design can be considered. Specifically, three possible criteria follow:

(i) $D_Q$-optimality (or generalized D-optimality), which corresponds to $\Psi = \log\det$,
$$J(w) = \log\det\bigl( Q\, I(w)^{-1} Q^\top \bigr), \tag{3.16}$$

(ii) $A_Q$-optimality (or generalized A-optimality), which corresponds to $\Psi = \operatorname{trace}$,
$$J(w) = \operatorname{trace}\bigl( Q\, I(w)^{-1} Q^\top \bigr), \tag{3.17}$$

(iii) $E_Q$-optimality (or generalized E-optimality), which corresponds to $\Psi = \lambda_{\max}$,
$$J(w) = \lambda_{\max}\bigl( Q\, I(w)^{-1} Q^\top \bigr), \tag{3.18}$$
where $\lambda_{\max}$ denotes the maximal eigenvalue of its matrix argument. See (Atkinson et al., 2007, p. 137) or (Silvey, 1980, p. 10) for a justification of this terminology and notation.

Different optimality criteria may produce different solutions to Problem 3.1, but this results from their slightly different interpretations in terms of the uncertainty ellipsoid for the estimates $\hat z$. Roughly speaking, a $D_Q$-optimum design minimizes its volume, an $A_Q$-optimum design suppresses the mean squared length of its axes, and an $E_Q$-optimum design minimizes the length of its largest axis. In what follows, our attention will be focused on the $D_Q$-optimality criterion (3.16). Note that the assumption (3.5) implies
$$\operatorname{rank}\bigl( Q\, I(w)^{-1} Q^\top \bigr) = \operatorname{rank} Q = r, \tag{3.19}$$
see, e.g., the Range Inclusion Lemma in (Pukelsheim, 2006, p. 17), which clearly demonstrates that $Q\, I(w)^{-1} Q^\top$ is always nonsingular.

3.3. RELAXED SENSOR SELECTION PROBLEM

Owing to the combinatorial nature of Problem 3.1, which may make its solution intractable even for small-scale problems, we relax it by replacing the non-convex Boolean constraints $w_i \in \{0, 1\}$ with the convex box constraints $w_i \in [0, 1]$. Thus we get the following convex relaxed sensor selection problem:

Problem 3.2 (Relaxed Sensor Selection Problem). Find a vector $w^\star \in \mathbb{R}^n$ to minimize
$$J(w) = \Psi\bigl( Q\, I(w)^{-1} Q^\top \bigr) = \Psi\Bigl( Q \Bigl( V_\theta^{-1} + \sum_{i=1}^{n} w_i \Upsilon_i \Bigr)^{-1} Q^\top \Bigr) \tag{3.20}$$
subject to the constraints
$$\mathbb{1}_n^\top w = m, \tag{3.21}$$
$$0 \le w_i \le 1, \quad i = 1, \dots, n. \tag{3.22}$$

It goes without saying that the above relaxed problem is not equivalent to the original problem, as some components of the computed optimal solution $w^\star$ may be fractional rather than binary. It is, however, by no means useless, as $J(w^\star)$ constitutes a lower bound on $J(w^\star_{\mathrm{bin}})$ solving Problem 3.1. What is more, by rounding the $m$ largest components of $w^\star$ up to one and the remaining components down to zero, we can produce a suboptimal solution for Problem 3.1. This option is typical for sensor selection problems; see, e.g., Joshi and Boyd (2009).
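For small dense instances, the Bayesian information matrix (3.11) and the three criteria (3.16)-(3.18) are straightforward to evaluate; a sketch in our own (hypothetical) variable names, where `X_list` holds the matrices $X(t_j)$:

```python
import numpy as np

def information_matrix(V_theta_inv, X_list, w, sigma):
    """I(w) = V_theta^{-1} + (1/sigma^2) sum_j X(t_j)^T diag(w) X(t_j), cf. (3.11)."""
    I = V_theta_inv.copy()
    for X in X_list:
        I += (X.T * w) @ X / sigma**2   # X^T diag(w) X without forming diag(w)
    return I

def criterion(Q, I, which="D"):
    """Alphabetical criteria (3.16)-(3.18) applied to Q I(w)^{-1} Q^T."""
    C = Q @ np.linalg.solve(I, Q.T)     # Q I^{-1} Q^T via a linear solve
    if which == "D":
        return np.linalg.slogdet(C)[1]  # log det, criterion (3.16)
    if which == "A":
        return np.trace(C)              # trace, criterion (3.17)
    return np.linalg.eigvalsh(C)[-1]    # largest eigenvalue, criterion (3.18)
```

Note that the rank-one matrices $\Upsilon_i$ of (3.12) never need to be formed explicitly just to evaluate $I(w)$; the first identity in (3.11) is cheaper.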
What is more, solutions to Problem 3.2 can be embedded into a general branch-and-bound scheme to yield a solution w*_bin; see (Uciński and Patan, 2007) for details.

Problem 3.2 possesses a number of notable features which, in theory, should make its solution straightforward. First of all, note that the performance index J(w) is convex over the convex feasible set W defined by the constraints (3.21) and (3.22), being the
intersection of a hyperplane and a hyperbox. The convexity results from the fact that, under the assumption (3.5), the mapping Φ : M ↦ log det( Q M^{-1} Q^T ) is convex on the set of positive-definite matrices in R^{d×d} (Marshall et al., 2011, Theorem 16.F.4, p. 688). What is more, J is differentiable with

∇J(w) = −[ φ_1(w), ..., φ_n(w) ]^T =: −φ(w),   (3.23)

where

φ_i(w) = −trace( Φ'(I(w)) Υ_i ) ∈ R,   (3.24)

and Φ'(X) := dΦ(X)/dX signifies the derivative of Φ with respect to its matrix argument X ∈ R^{d×d}, i.e., the d × d matrix whose (i, j) entry is ∂Φ(X)/∂X_{(j,i)}, cf. (Bernstein, 2005, p. 410). As I ∈ R^{d×d} is positive definite, we have

Φ'(I) = d/dI log det( Q I^{-1} Q^T ) = −I^{-1} Q^T ( Q I^{-1} Q^T )^{-1} Q I^{-1},   (3.25)

cf. (Bernstein, 2005, p. 411). Substituting this into (3.24) and using the cyclic commutativity of the trace of a product of matrices, we get

φ_i(w) = trace( ( Q I(w)^{-1} Q^T )^{-1} Q I(w)^{-1} Υ_i I(w)^{-1} Q^T ) ≥ 0,   i = 1, ..., n.   (3.26)

As the feasible set W is a rather nice convex set, numerous computational methods can potentially be employed for solving Problem 3.2, e.g., the conditional gradient method or a gradient projection method. Unfortunately, if the number n of support points is large, which is a rather common situation in applications, then these algorithms require additional implementation effort in order to avoid unsatisfactory computational times. On the other hand, an extremely simple multiplicative algorithm (Silvey et al., 1978; Yu, 2010) is available to optimize the D_Q-optimality criterion over the canonical simplex. Its idea is reminiscent of the EM algorithm used for maximum likelihood estimation, and a decisive advantage is its ease of implementation. In what follows, it will be shown how this multiplicative algorithm can be built into a very simple and efficient computational scheme which takes account of the additional upper-bound constraint in (3.22). The principal tool in its construction will be simplicial decomposition.
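The formulas above can be exercised numerically. The following hypothetical sketch (random stand-in data, names ours) evaluates the sensitivities φ_i of (3.26) using one linear solve with I(w) and one small r × r solve, and verifies by a finite difference that ∂J/∂w_i = −φ_i(w):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, r = 6, 4, 2

# Random stand-ins for V_theta^{-1}, the atoms Upsilon_i of (3.12), and the map Q.
V_theta_inv = np.eye(d)
rows = rng.standard_normal((n, d))
Upsilon = [np.outer(u, u) for u in rows]
Q = rng.standard_normal((r, d))

def info(w):
    return V_theta_inv + sum(wi * U for wi, U in zip(w, Upsilon))

def J(w):                                     # J(w) = log det(Q I(w)^{-1} Q^T)
    return float(np.linalg.slogdet(Q @ np.linalg.solve(info(w), Q.T))[1])

def phi(w):
    """phi_i(w) of (3.26): one solve with I(w), then one small r x r solve."""
    H = np.linalg.solve(info(w), Q.T)         # H = I(w)^{-1} Q^T      (d x r)
    S = np.linalg.solve(Q @ H, H.T)           # (Q I^{-1} Q^T)^{-1} Q I^{-1}  (r x d)
    return np.array([float(np.trace(S @ U @ H)) for U in Upsilon])

w = np.full(n, 0.5)
eps = 1e-6
e0 = np.zeros(n); e0[0] = eps
fd = (J(w + e0) - J(w - e0)) / (2 * eps)      # finite-difference dJ/dw_0
assert abs(fd + phi(w)[0]) < 1e-4             # matches -phi_0, i.e. grad J = -phi
```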
4. SIMPLICIAL DECOMPOSITION FOR PROBLEM 3.2

4.1 ALGORITHM MODEL

Simplicial decomposition (SD) has proved extremely useful for large-scale pseudoconvex programming problems encountered, e.g., in traffic assignment and other network flow
problems (Patriksson, 1999). In its basic form, it proceeds by alternately solving linear and nonlinear programming subproblems, called the column generation problem (CGP) and the restricted master problem (RMP), respectively. In the RMP, the original problem is relaxed by replacing the original constraint set W with an inner approximation, namely the convex hull of a finite set of feasible solutions. In the CGP, this inner approximation is improved by incorporating a point of the original constraint set that lies furthest along the direction of steepest descent computed at the solution of the RMP. This basic strategy has been discussed and extended in numerous references (Bertsekas, 2015; Patriksson, 1999). A marked characteristic of the SD method is that the sequence of solutions to the RMP tends to a solution of the original problem in such a way that the objective function strictly monotonically approaches its optimal value. The SD algorithm may be viewed as a form of modular nonlinear programming, provided that one has an effective computer code for solving the RMP, as well as access to a code which can take advantage of the linearity of the CGP. One of the aims of this paper is to show that this is the case within the framework of Problem 3.2. What is more, since we deal with the minimization of the convex function J over a bounded polyhedral set W, this will automatically imply the convergence of the resulting SD scheme in a finite number of RMP steps (Bertsekas, 2015).

Tailoring the SD scheme to our needs, we obtain Algorithm 1. In the sequel, its consecutive steps will be discussed in turn.

4.2 CHARACTERIZATION OF THE OPTIMAL DESIGN AND TERMINATION OF ALGORITHM 1

In the original SD setting, the criterion for terminating the iterations is checked only after solving the column generation problem.
The computation is then stopped if the current point w^(k) satisfies the condition of first-order nondecrease of the performance measure over the whole constraint set, i.e.,

max_{w ∈ W} φ(w^(k))^T ( w − w^(k) ) ≤ 0.   (4.8)

The condition (4.4) is less costly in terms of the number of floating-point operations. It results from the following characterization of a vector w* with the property that J(w*) = min_{w ∈ W} J(w).

Theorem 4.1. A vector w* constitutes a global minimum of J over W if, and only if, there exists a number λ such that

φ_i(w*)  ≥ λ  if w*_i = 1,
         = λ  if 0 < w*_i < 1,   (4.9)
         ≤ λ  if w*_i = 0,

for i = 1, ..., n.
Algorithm 1: Algorithm model for solving Problem 3.2 via simplicial decomposition.

Step 0 (Initialization): Guess an initial solution w^(0) ∈ W such that I(w^(0)) is nonsingular. Set I = {1, ..., n}, G^(0) = {w^(0)} and k = 0.

Step 1 (Termination check): Set

I_ub^(k) = { i ∈ I : w_i^(k) = 1 },   (4.1)
I_im^(k) = { i ∈ I : 0 < w_i^(k) < 1 },   (4.2)
I_lb^(k) = { i ∈ I : w_i^(k) = 0 }.   (4.3)

If

φ_i(w^(k))  ≥ λ  if i ∈ I_ub^(k),
            = λ  if i ∈ I_im^(k),   (4.4)
            ≤ λ  if i ∈ I_lb^(k),

for some positive λ, then STOP; w^(k) is optimal.

Step 2 (Solution of the column generation subproblem, CGP): Compute

g^(k+1) = arg max_{w ∈ W} φ(w^(k))^T w.   (4.5)

If g^(k+1) ∈ conv(G^(k)), then STOP. Otherwise, set

G^(k+1) = G^(k) ∪ { g^(k+1) }.   (4.6)

Step 3 (Solution of the restricted master subproblem, RMP): Find

w^(k+1) = arg min_{w ∈ conv(G^(k+1))} Ψ( Q I(w)^{-1} Q^T ),   (4.7)

and purge G^(k+1) of all extreme points with zero weights in the resulting expression of w^(k+1) as a convex combination of elements of G^(k+1). Increment k by one and go back to Step 1.
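Algorithm 1 can be prototyped in a few dozen lines. The sketch below is purely illustrative, not the paper's code: the data are random stand-ins, the RMP is solved with the multiplicative update of Algorithm 2, and, as in the numerical experiments of Section 6, the loop stops once the CGP reproduces a column already contained in G.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, r, m = 12, 6, 2, 4

V_theta_inv = np.eye(d)
rows = rng.standard_normal((n, d))
Upsilon = np.einsum("id,ie->ide", rows, rows)      # hypothetical atoms, cf. (3.12)
Q = rng.standard_normal((r, d))

def info(w):                                        # I(w), cf. (3.11)
    return V_theta_inv + np.einsum("i,ide->de", w, Upsilon)

def J(w):                                           # D_Q criterion (3.16)
    return float(np.linalg.slogdet(Q @ np.linalg.solve(info(w), Q.T))[1])

def phi(w):                                         # sensitivities, cf. (3.26)
    H = np.linalg.solve(info(w), Q.T)
    S = np.linalg.solve(Q @ H, H.T)
    return np.einsum("ad,ide,ea->i", S, Upsilon, H)

def rmp(G, iters=500):
    """RMP over conv(G) via the multiplicative update (Algorithm 2)."""
    Hj = np.stack([info(g) for g in G])             # H_j = I(g^j), cf. (4.20)
    v = np.full(len(G), 1.0 / len(G))
    for _ in range(iters):
        A = np.linalg.solve(np.einsum("j,jde->de", v, Hj), Q.T)
        S = np.linalg.solve(Q @ A, A.T)
        psi = np.einsum("ad,jde,ea->j", S, Hj, A)   # psi_j(v), cf. (4.22)
        v = v * psi / r                             # preserves the simplex
    return v

w = np.full(n, m / n)                               # feasible start, I(w) nonsingular
G = [w.copy()]
J0 = J(w)
for k in range(40):
    g = np.zeros(n)
    g[np.argsort(phi(w))[-m:]] = 1.0                # CGP: m largest sensitivities
    if any(np.array_equal(g, h) for h in G):        # new column already in G: stop
        break
    G.append(g)
    v = rmp(G)
    w = np.asarray(G).T @ v                         # w = sum_j v_j g^j
    G = [h for h, vj in zip(G, v) if vj > 1e-8]     # purge zero-weight columns
```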
The proof of this result proceeds in much the same way as that of Proposition 1 in (Uciński and Patan, 2007).

4.3 SOLUTION OF THE COLUMN GENERATION SUBPROBLEM

In Step 2 of Algorithm 1 we deal with the linear programming problem

maximize c^T w subject to w ∈ W,   (4.10)

where c = φ(w^(k)), in which the feasible region is defined by the 2n bound constraints (3.22) and the single equality constraint (3.21). Making use of this special form of the constraints, we can develop an algorithm to solve this problem which is almost as simple as a closed-form solution. The key idea is the following assertion, which can be demonstrated in much the same way as Theorem 4.1.

Theorem 4.2. A vector g ∈ W constitutes a global solution to the problem (4.10) if, and only if, there exists a scalar ρ such that

c_i  ≥ ρ  if g_i = 1,
     = ρ  if 0 < g_i < 1,   (4.11)
     ≤ ρ  if g_i = 0,

for i = 1, ..., n.

We thus see that, in order to solve (4.10), it is sufficient to pick the m largest components c_i of c, set the corresponding weights g_i to one, and set the remaining weights to zero.

4.4 SOLUTION OF THE RESTRICTED MASTER SUBPROBLEM

Suppose that in the (k + 1)-th iteration of Algorithm 1 we have

G^(k+1) = { g^1, ..., g^l },   (4.12)

possibly with l < k + 1 owing to the built-in mechanism for deleting those points of G^(j), 1 ≤ j ≤ k, which did not contribute to the convex combinations yielding the corresponding iterates w^(j). Step 3 of Algorithm 1 involves minimization of the design criterion (3.20) over

conv( G^(k+1) ) = { ∑_{j=1}^{l} v_j g^j : ∑_{j=1}^{l} v_j = 1, v_j ≥ 0, j = 1, ..., l }.   (4.13)

From the representation of any w ∈ conv( G^(k+1) ) as

w = ∑_{j=1}^{l} v_j g^j,   (4.14)
or, in component-wise form,

w_i = ∑_{j=1}^{l} v_j g^j_i,   i = 1, ..., n,   (4.15)

g^j_i being the i-th component of g^j, it follows that

I(w) = V_θ^{-1} + ∑_{i=1}^{n} w_i Υ_i = ∑_{j=1}^{l} v_j ( V_θ^{-1} + ∑_{i=1}^{n} g^j_i Υ_i ) = ∑_{j=1}^{l} v_j I(g^j).   (4.16)

From this, we see that the RMP can equivalently be formulated as the following problem:

Problem 4.3. Find a vector of weights v ∈ R^l to minimize

P(v) = log det( Q H(v)^{-1} Q^T )   (4.17)

subject to the constraints

1_l^T v = 1,   (4.18)
v_j ≥ 0,   j = 1, ..., l,   (4.19)

where

H(v) = ∑_{j=1}^{l} v_j H_j,   H_j = I(g^j),   j = 1, ..., l.   (4.20)

Basically, since the constraints (4.18) and (4.19) define the probability simplex in R^l, i.e., a very nice convex feasible domain, it is intuitively appealing to determine the optimal weights using a numerical algorithm specialized to convex optimization problems. Note, however, that this formulation has already attracted close attention in optimum experimental design theory, where various characterizations of optimal solutions and efficient computational schemes have been proposed (Atkinson et al., 2007). In particular, in the case of the D_Q-optimality criterion studied here, we can employ the General Equivalence Theorem of (Uciński, 2005, Theorem 3.2, p. 48) to obtain the following conditions for global optimality:

Theorem 4.4. A vector v* constitutes a global solution to Problem 4.3 if, and only if,

ψ_j(v*)  = r  if v*_j > 0,
         ≤ r  if v*_j = 0,   (4.21)

for each j = 1, ..., l, where

ψ_j(v) = trace( ( Q H(v)^{-1} Q^T )^{-1} Q H(v)^{-1} H_j H(v)^{-1} Q^T ),   j = 1, ..., l.   (4.22)
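The fixed-point character of these optimality conditions can be illustrated numerically. In the sketch below (hypothetical data: the H_j are random positive definite stand-ins for (4.20), names ours), iterating the update v ← (1/r) ψ(v) ⊙ v drives ψ_j(v) to r on the support of v, as Theorem 4.4 predicts.

```python
import numpy as np

rng = np.random.default_rng(3)
d, r, l = 5, 2, 4

B = rng.standard_normal((l, d, d))
Hj = np.einsum("jab,jcb->jac", B, B) + np.eye(d)[None]   # random PD stand-ins for H_j
Q = rng.standard_normal((r, d))

def psi(v):
    """psi_j(v) of (4.22)."""
    A = np.linalg.solve(np.einsum("j,jde->de", v, Hj), Q.T)  # H(v)^{-1} Q^T
    S = np.linalg.solve(Q @ A, A.T)                          # (Q H^{-1} Q^T)^{-1} Q H^{-1}
    return np.einsum("ad,jde,ea->j", S, Hj, A)

v = np.full(l, 1.0 / l)                 # Step 0: uniform initial weights
for _ in range(5000):
    p = psi(v)
    if np.max(p / r - 1.0) < 1e-9:      # Step 1: termination check (4.23)
        break
    v = v * p / r                       # Step 2: multiplicative update (4.24)

# Since sum_j v_j psi_j(v) = r holds identically, the update keeps v on the simplex,
# and at convergence psi_j = r wherever v_j > 0 while psi_j <= r elsewhere.
p = psi(v)
```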
A very simple multiplicative algorithm (Yu, 2010) can be adapted to the above RMP; it is summarized in Algorithm 2. Although only its monotonicity, and not its global convergence, can be proven for the D_Q-optimality criterion, cf. (Yu, 2010), in practice it behaves flawlessly. As an alternative, an interior-point method has recently been proposed by Lu and Pong (2013), for which global convergence is guaranteed, but at the cost of a much more complicated implementation.

Algorithm 2: Algorithm model for the restricted master problem.

Step 0 (Initialization): Select a weight vector v^(0) with positive components which sum up to one, e.g., set v^(0) = (1/l) 1_l. Set κ = 0.

Step 1 (Termination check): If

(1/r) ψ(v^(κ)) ≤ 1_l,   (4.23)

then STOP.

Step 2 (Multiplicative update): Evaluate

v^(κ+1) = (1/r) ψ(v^(κ)) ⊙ v^(κ),   (4.24)

where ⊙ denotes the component-wise product. Increment κ by one and go to Step 1.

5. APPLICATION TO A THERMO-MECHANICAL SYSTEM

In this section, we describe in more detail the application of the sensor placement procedure to a certain thermo-mechanical system. More precisely, we consider the temperature evolution T(x, t) of the machine tool column depicted in Figure 5.1. We denote the solid body of the machine column by Ω and its surface by Γ. The temperature evolution is governed by the linear heat equation,

ρ c_p Ṫ − div(λ ∇T) = 0   in Ω × (0, t_f),
λ ∂_n T + α(x) (T − T_ref) = r(x, t)   on Γ × (0, t_f),   (5.1)
T(x, 0) = T_0(x)   in Ω.

The boundary conditions represent a simplified model of the heat transfer occurring at the different parts of the machine's surface. Since the underlying heat transfer mechanism includes both convective and radiative phenomena, the value of the effective coefficient α(x) is considered unknown and also dependent on the spatial position x. We
make here the following ansatz:

α(x) := ∑_{k=1}^{q} α_k χ_k(x),   (5.2)

where each χ_k is an indicator function with values in {0, 1} which selects a certain portion of the machine's surface Γ. Here the surface of the machine is divided into five parts. The value of α is fixed to zero on those two areas where the two heat sources act, which are expressed through the right-hand side r(x, t). The heat sources are assumed to be known; they are described in Section 6, where numerical results are presented. They originate, on the one hand, from an electrical drive mounted on top of the machine column and, on the other hand, from the spindle driving the horizontal movement of the column, see Figure 5.1(c). On the remaining q = 4 parts of the surface, the heat transfer coefficients α_1, ..., α_4 need to be estimated, but some background information α_bg is available. We have 12 W K⁻¹ m⁻² on the vertical surfaces, 10 W K⁻¹ m⁻² and 8 W K⁻¹ m⁻² on the horizontal planes with the outer normal facing upwards and downwards, respectively, and 5 W K⁻¹ m⁻² on all enclosed surfaces, including the inner surfaces of the cavities; see Figure 5.1(c). At those surface parts where the electrical drives are mounted, the heat transfer coefficient α(x) is zero. All symbols occurring in (5.1) are summarized in Table 5.1.

Figure 5.1: Auerbach ACW 630 machine column. (a) Photograph of the machine column. (b) CAD model with the mounting points determining the TCP location. (c) Background values of α_bg.

We now switch to a spatial finite element model of (5.1) with respect to a basis {ϕ_i}, i = 1, ..., n. In our computations, we are using the standard nodal basis composed of piecewise linear, continuous elements on a tetrahedral grid of the geometry depicted in Figure 5.1(b). In a slight abuse of notation, we denote the coefficient vector representing the temperature field T also by T. By converting (5.1) to its weak formulation and restricting it to
Symbol   Meaning                                Value      Units
T        temperature                                       K
r        thermal surface load                              W m⁻²
ρ        density                                           kg m⁻³
c_p      specific heat at constant pressure     500        J kg⁻¹ K⁻¹
λ        thermal conductivity                   46.8       W K⁻¹ m⁻¹
T_ref    ambient temperature                    20         °C
α_bg     background information on α            0 to 12    W K⁻¹ m⁻²
α        heat transfer coefficient              unknown    W K⁻¹ m⁻²
T_0      initial temperature                    unknown    K

Table 5.1: Table of symbols associated with the thermal model.

the finite element space, we arrive at the following semi-discretized version of (5.1):

M Ṫ(t) + K T(t) + ∑_{k=1}^{q} α_k M_k ( T(t) − T_ref ) = r(t),   t ∈ [0, t_f],
T(0) = T_0.   (5.3)

Here M and M_k denote the mass and boundary mass matrices, respectively, and K is the stiffness matrix:

M = ( ρ c_p ∫_Ω ϕ_i ϕ_j dx )_{i,j},   M_k = ( ∫_Γ ϕ_i ϕ_j χ_k ds )_{i,j},   K = ( λ ∫_Ω ∇ϕ_i · ∇ϕ_j dx )_{i,j},

with indices i, j = 1, ..., n. T_ref is a coefficient vector in R^n with identical entries. The right-hand side vector r(t) represents the load vector generated by the given boundary heat sources:

r(t) = ( ∫_Γ r(x, t) ϕ_j ds )_j.

Finally, we recall that the coefficient vector T_0 representing the initial temperature distribution

T_0(x) = ∑_{j=1}^{n} T_{0,j} ϕ_j(x)

is unknown. It is clear that the finite element model (5.3) is of the form (1.1) when the identifications given in Table 5.2 are made.

Our model output y(t) = C_y T(t), which is matched to the temperature measurements during the data assimilation process, is described by the measurement matrix C_y. In the present setting, we wish to use as potential measurement locations all finite element
mesh nodes which are located on the surface of the machine column. Therefore, C_y is composed of all rows of the n × n identity matrix corresponding to the surface degrees of freedom.

The specific form of the adjoint system (2.16) reads

M Ṡ(t) = K S(t) + ∑_{k=1}^{q} α_k M_k S(t),   t ∈ [0, t_f],
M S(t_f) = C_y^T.   (5.4)

Notice that the symmetry of M and K has been used. The block rows of the Jacobian according to (2.17) are

[ S(t_f − t_j)^T M,   ∫_0^{t_j} S(t − t_j + t_f)^T [ M_1 T(t) ⋯ M_4 T(t) ] dt ].   (5.5)

Symbol in (1.1)   Symbol in (5.3)                       Remark
x                 T
p                 α                                     unknown
x_0               T_0                                   unknown
E                 M
A(p)              −K − ∑_{k=1}^{q} α_k M_k
f(t)              r(t) + ∑_{k=1}^{q} α_k M_k T_ref

Table 5.2: Correspondence of symbols in the general dynamical system (1.1) and the finite element model of the heat equation (5.3).

We recall that our emphasis is not on the estimation of the temperature distribution of the machine, but rather on the estimation of the QOI, i.e., the temperature-induced displacement of a certain reference point of the machine structure at time t_f. The overall displacement field is governed by a quasi-static linear elasticity model, since the time scale of the heat equation is unable to generate wave motion in the machine structure. The linear elasticity model is based on the balance of forces,

div σ( ε(u), T(t_f) ) = 0   in Ω.   (5.6)

We employ an additive split of the stress tensor σ into its mechanically and thermally induced parts. (An alternative, equivalent approach would apply such a split to the strains.) Together with the usual homogeneous and isotropic stress-strain relation, we obtain the following constitutive law; see (Boley and Weiner, 1960, Section 1.12), (Eslami et al., 2013, Section 2.8):
σ( ε(u), T(t_f) ) = σ_el(ε(u)) + σ_th(T(t_f)),

σ_el(ε(u)) = E/(1 + ν) ε(u) + Eν/((1 + ν)(1 − 2ν)) trace(ε(u)) id,

σ_th(T(t_f)) = −E/(1 − 2ν) β ( T(t_f) − T_ref ) id_3.

Herein, ε denotes the linearized strain tensor

ε(u) = ½ ( ∇u + ∇u^T ).   (5.7)

The elasticity modulus E and Poisson ratio ν of the cast iron machine column are given. For convenience, all quantities relevant for the displacement model are summarized in Table 5.3.

Symbol   Meaning                                        Units
u        displacement                                   m
σ        stress                                         N m⁻²
ε        strain                                         1
ν        Poisson's ratio                                1
E        modulus of elasticity                          N m⁻²
β        thermal volumetric expansion coefficient       K⁻¹
L        length of the main spindle                     m
l        auxiliary quantity, see Appendix A             m
σ        standard deviation of temperature sensors      K

Table 5.3: Table of symbols associated with the displacement model.

We continue with a specification of the mechanical boundary conditions for the elasticity equations (5.6)–(5.7). The machine column is free to move in the X-direction on the rail by which it connects to the machine bed, see Figure 5.1(a). Movements in the Y- and Z-directions are prohibited. Moreover, the machine column is connected by a spindle nut to the spindle in the machine bed which drives the horizontal movement during operation. This leads to the following mixture of essential and natural boundary conditions for (5.6)–(5.7):

u_2 = 0,  u_3 = 0,  [σ n]_1 = 0   on Γ_rail,
u = 0   on Γ_nut,   (5.8)
σ n = 0   on Γ \ ( Γ_nut ∪ Γ_rail ).

The third boundary condition expresses the absence of boundary loads on the remainder of the surface.
We discretize (5.6)–(5.8) by standard nodal (vector-valued) linear finite elements on the same mesh employed for the discretization of the heat equation (5.1). This leads to a stationary, discrete problem of the following form:

K u + F ( T(t_f) − T_ref ) = 0,   (5.9)

where K now denotes the elasticity stiffness matrix and F is a matrix associated with the thermally induced stress. Clearly, the solution map T(t_f) ↦ u taking the terminal temperature to the induced displacement is affine.

Our quantity of interest z = u(x_TCP) ∈ R^r with r = 3 is the displacement at a certain reference point x_TCP, the tool center point. As TCP, we use the tip of the main spindle (holding the tool) seen on the left of Figure 5.1(a). We consider the main spindle assembly as a rigid body which is thermally insulated from the machine column. Consequently, the TCP displacement is determined by the displacements at the four mounting points x_1, ..., x_4 of the sledge holding the main spindle, see Figure 5.1(b). The dependence u(x_TCP) = N(u(x_1), ..., u(x_4)) is nonlinear, and we refer the reader to (Herzog and Riedel, 2015, Section 3.2) for more details. Here we are only interested in the linearization C_z of the map described by (5.9),

T(t_f) ↦ u ↦ u(x_TCP),

at the constant reference temperature T_ref. By the chain rule, it is evident that

C_z = −N'(0) K^{-1} F ∈ R^{3×n}

holds. The specific form of N'(0) is given in Appendix A. Clearly, it is advantageous to evaluate C_z in an adjoint fashion according to

C_z^T = −F^T K^{-1} N'(0)^T.

This amounts to the solution of only r = 3 adjoint elasticity equations with point sources acting at x_1, ..., x_4. With the matrix C_z available, the output matrix Q can be evaluated by solving the adjoint system (3.6) and applying (3.7).
For the forward system (5.3) under consideration, this amounts to solving

M Ṡ_Q(t) = K S_Q(t) + ∑_{k=1}^{q} α_k M_k S_Q(t),   t ∈ [0, t_f],
M S_Q(t_f) = C_z^T,   (5.10)

for S_Q : [0, t_f] → R^{n×r} and subsequently evaluating

Q = [ S_Q(0)^T M,   ∫_0^{t_f} S_Q(t)^T [ M_1 T(t) ⋯ M_4 T(t) ] dt ].

The symmetry of M and K has been used in these formulas.
6. NUMERICAL RESULTS

In this section we present some numerical results. We focus on the sensor placement problem and its solution by the simplicial decomposition method described in Algorithm 1. The algorithm is applied to the thermo-mechanical system described in Section 5. We therefore assume that the set-point θ^0 = (T_0^0, α^0) is given and no data assimilation problem needs to be solved.

6.1 DESCRIPTION OF PROBLEM DATA

We fix the set-point of the initial temperature state equal to the ambient temperature, i.e., T_0^0(x) ≡ T_ref. The set-point of the heat transfer parameter α^0(x) varies over different parts of the boundary, and it is zero where the heat sources are applied, see (5.2) and Figure 5.1(c). We have chosen typical values for the heat transfer coefficient,

α^0(x) =
  12 W K⁻¹ m⁻²   if x ∈ Γ_vert (vertical surfaces),
  10 W K⁻¹ m⁻²   if x ∈ Γ_up (horizontal surfaces facing up),
   8 W K⁻¹ m⁻²   if x ∈ Γ_down (horizontal surfaces facing down),
   5 W K⁻¹ m⁻²   if x ∈ Γ_inner (enclosed surfaces),
   0 W K⁻¹ m⁻²   if x ∈ Γ_r1 ∪ Γ_r2 (surfaces with heat sources).

The inverse covariance matrices for the initial state and for the parameter were chosen as V_{x_0}^{-1} = M (the finite element mass matrix) and V_p^{-1} = id_4.

The machine column experiences the influence of two heat sources, see Figure 5.1(c). One originates from an electrical drive mounted on the top of the machine column (Γ_r1) and the other one from the spindle driving the horizontal movement of the column (Γ_r2). The heat sources are described by

r(x, t) =
  6700 W m⁻²   if x ∈ Γ_r1 and 0 s ≤ t ≤ 2400 s,
  2700 W m⁻²   if x ∈ Γ_r2 and 0 s ≤ t ≤ 4800 s,
  6700 W m⁻²   if x ∈ Γ_r1 and 4800 s < t ≤ 7200 s,
     0         otherwise.

All calculations are done in the time interval [0 s, 7200 s].
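For illustration, a semi-discrete model of the structure of (5.3) with piecewise-constant-in-time loading of this kind can be advanced by the implicit Euler scheme. The sketch below is a hypothetical stand-in, not the paper's code: M and K are small random symmetric positive definite matrices, the M_k are rank-one surrogates of the boundary mass matrices, and the load is a crude caricature of r(x, t); all names and values here are ours.

```python
import numpy as np

rng = np.random.default_rng(4)
n, q = 20, 4
dt, N = 360.0, 20                                 # 20 steps of 360 s cover [0 s, 7200 s]

def spd(scale):                                   # helper: random SPD matrix
    B = rng.standard_normal((n, n))
    return scale * (B @ B.T + n * np.eye(n))

M = spd(1.0)                                      # stand-in mass matrix
K = spd(0.01)                                     # stand-in stiffness matrix
Mk = np.stack([np.outer(c, c) for c in rng.standard_normal((q, n))])
alpha = np.array([12.0, 10.0, 8.0, 5.0]) * 1e-4   # scaled heat transfer coefficients
T_ref = 20.0 * np.ones(n)

def load(t):                                      # crude caricature of the source term
    return np.ones(n) if t <= 2400.0 else np.zeros(n)

A = K + np.einsum("k,kab->ab", alpha, Mk)         # K + sum_k alpha_k M_k
lhs = M + dt * A                                  # implicit Euler: (M + dt A) T_j = rhs
coupling = np.einsum("k,kab,b->a", alpha, Mk, T_ref)

T = T_ref.copy()                                  # T(0) = T_0 = T_ref
for j in range(1, N + 1):
    rhs = M @ T + dt * (load(j * dt) + coupling)  # M T_{j-1} + dt (r + sum alpha M_k T_ref)
    T = np.linalg.solve(lhs, rhs)
```

Because M is positive definite and A is positive semidefinite, the system matrix M + Δt A is positive definite, so the implicit Euler step is well posed for any Δt.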
The standard deviation σ of the measurements was assumed to be identical for all temperature sensors; cf. Table 5.3.

6.2 DISCRETIZATION

As described in Section 5, we used a finite element model with a standard nodal basis of piecewise linear, continuous elements for the temperature T as well as for the displacement u, on a tetrahedral grid of the geometry depicted in Figure 5.1(b). The size of the mesh can be seen in Table 6.1. All finite element nodes on the boundary are potential
Table 6.1: Size of the finite element mesh (number of mesh nodes, number of mesh cells, and number n of boundary nodes, i.e., potential sensor locations).

sensor positions in the sensor placement problem. In order to compute the required quantities for the sensor placement problem, in particular the Jacobian J and the output matrix Q, we need to solve the time-dependent forward system (5.1), the adjoint system (5.4) for the sensitivities, and the adjoint system (5.10) for the matrix Q. For the forward system we employed the implicit Euler method with time step length Δt = 360 s. The adjoint systems were discretized with the consistent adjoint time stepping scheme. The measurements y(t_j) = C_y T(t_j) were taken at the time instants t_j = j Δt, j = 1, ..., N = 20, which occur during the integration.

6.3 EFFICIENT IMPLEMENTATION

Notice that the sensitivities X(t_j) = [X_0(t_j), X_p(t_j)] ∈ R^{n×(n+q)}, j = 1, ..., N, as well as the matrices Υ_i ∈ R^{(n+q)×(n+q)}, i = 1, ..., n, are dense and would therefore require a large amount of memory to store. Moreover, the assembly of the matrix I(w) appearing in the evaluation of φ_i during the CGP step, see (4.5) and (3.26), and during the RMP step (4.7) of Algorithm 1 would be computationally rather expensive. Here we take advantage of the fact that only the product h = I(w)^{-1} Q^T is needed to compute all required quantities. Instead of forming I(w), we therefore solve I(w) h = Q^T by means of r = 3 calls to a preconditioned conjugate gradient method based on matrix-vector products with I(w), which are much more economical to implement. As preconditioner we use the background precision matrix V_θ^{-1} = diag(V_{x_0}^{-1}, V_p^{-1}) = diag(M, id_4). Similar considerations apply to the evaluation of ψ_j in (4.22).
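The matrix-free strategy just described can be sketched as follows. This is a toy version with random stand-in sensitivities (all names ours): `I_matvec` applies I(w) one sampling instant at a time, a bare-bones PCG routine stands in for the preconditioned conjugate gradient solver, and the preconditioner is the background precision, here simply the identity.

```python
import numpy as np

rng = np.random.default_rng(5)
n, d, r, N = 30, 12, 3, 5             # sensors, dim(theta), dim(QOI), sampling instants
sigma = 0.1                           # hypothetical noise level

X = rng.standard_normal((N, n, d))    # stand-ins for the sensitivities X(t_j)
V_inv = np.eye(d)                     # stand-in for V_theta^{-1} = diag(M, id_4)
Q = rng.standard_normal((r, d))
w = rng.uniform(size=n)

def I_matvec(v):
    """Apply I(w) = V^{-1} + (1/sigma^2) sum_j X_j^T diag(w) X_j without forming it."""
    out = V_inv @ v
    for Xj in X:                      # one cheap product per sampling instant
        out += Xj.T @ (w * (Xj @ v)) / sigma**2
    return out

def pcg(matvec, b, prec_solve, tol=1e-12, maxit=200):
    """Bare-bones preconditioned conjugate gradient method."""
    x = np.zeros_like(b)
    res = b.copy()                    # residual b - A x for x = 0
    z = prec_solve(res)
    p = z.copy()
    rz = res @ z
    for _ in range(maxit):
        Ap = matvec(p)
        a = rz / (p @ Ap)
        x += a * p
        res -= a * Ap
        if np.linalg.norm(res) <= tol * np.linalg.norm(b):
            break
        z = prec_solve(res)
        rz, rz_old = res @ z, rz
        p = z + (rz / rz_old) * p
    return x

prec = lambda s: np.linalg.solve(V_inv, s)        # apply the preconditioner
h = np.column_stack([pcg(I_matvec, Q[i], prec) for i in range(r)])   # h = I(w)^{-1} Q^T
```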
In addition, the computation of φ(w) and ψ(v), see (3.26) and (4.22), as well as the matrix-vector products with the FIM I(w), are executed in parallel with N = 20 threads, where each thread j only uses the sensitivity information X(t_j) for time step j.

6.4 RESULTS AND PERFORMANCE

For practical purposes, the termination criteria for the simplicial decomposition problem (4.4) and for the restricted master problem (4.23) are implemented only up to certain tolerances. In (4.4), a weight w_i is considered zero (one) if it is below 0.05 (above
0.95), and hence i is taken to belong to the set I_lb (I_ub). After solving the RMP, a column g^j is purged if the corresponding v_j is below 0.05. All values of the tolerances as well as the maximal iteration numbers for solving both problems can be found in Table 6.2.

Parameter                                      Value
maximal number of iterations for SDP           40
zero weight in termination check for SDP       0.05
unit weight in termination check for SDP       0.95
tolerance in termination check for SDP         0.01
tolerance for purging columns in RMP           0.05
maximal number of iterations for RMP           30
tolerance in termination check for RMP         0.01

Table 6.2: Parameters used in Algorithm 1.

The sensor placement problem for the thermo-mechanical system described in Section 5 was solved in the setting described in Section 6.1 with a desired number of m = 10 sensors. Algorithm 1 stopped after 6 iterations because the column generated in the CGP was already contained in the previous column set G. In this case, the RMP to be solved would be the same as in the previous step and no further progress could be achieved. The computation took about 2.5 h; further details about the performance of the algorithm are listed in Table 6.3.

time for computation of sensitivities (5.5)    15 min
time for computation of φ(w)                   150 s
average number of RMP steps per SDP step       18
time for RMP step                              70 s
number of SDP steps                            6
average time for SDP step                      1360 s
overall time                                   2.5 h

Table 6.3: Computation times for the application of Algorithm 1.

Figure 6.1 shows the evolution of the distribution of measurement weights over all possible sensor locations, which were all boundary nodes of the FE mesh, for each SDP step. The final solution is reached practically after 4 iterations, which is also reflected in the objective values Ψ( Q I(w)^{-1} Q^T ) in Figure 6.2(b). The optimal sensors are all placed in the vicinity of the two heat sources, see Figure 6.2(a). Since the D_Q-criterion targets the volume of the confidence ellipsoid of the QOI (the TCP displacement), this clustering near the dominant heat inputs is plausible.
Figure 6.1: Evolution of the measurement weights w^(k) during the SDP iterations.

Figure 6.2: Optimal sensors and objective values. (a) Optimal sensor positions (m = 10). (b) Objective values Ψ( Q I(w^(k))^{-1} Q^T ) over the iteration number.
Notes for CS542G (Iterative Solvers for Linear Systems) Robert Bridson November 20, 2007 1 The Basics We re now looking at efficient ways to solve the linear system of equations Ax = b where in this course,
More information2 Nonlinear least squares algorithms
1 Introduction Notes for 2017-05-01 We briefly discussed nonlinear least squares problems in a previous lecture, when we described the historical path leading to trust region methods starting from the
More informationWritten Examination
Division of Scientific Computing Department of Information Technology Uppsala University Optimization Written Examination 202-2-20 Time: 4:00-9:00 Allowed Tools: Pocket Calculator, one A4 paper with notes
More informationMulti-Robotic Systems
CHAPTER 9 Multi-Robotic Systems The topic of multi-robotic systems is quite popular now. It is believed that such systems can have the following benefits: Improved performance ( winning by numbers ) Distributed
More informationSubject: Optimal Control Assignment-1 (Related to Lecture notes 1-10)
Subject: Optimal Control Assignment- (Related to Lecture notes -). Design a oil mug, shown in fig., to hold as much oil possible. The height and radius of the mug should not be more than 6cm. The mug must
More informationNONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition
NONLINEAR CLASSIFICATION AND REGRESSION Nonlinear Classification and Regression: Outline 2 Multi-Layer Perceptrons The Back-Propagation Learning Algorithm Generalized Linear Models Radial Basis Function
More informationSolving the Generalized Poisson Equation Using the Finite-Difference Method (FDM)
Solving the Generalized Poisson Equation Using the Finite-Difference Method (FDM) James R. Nagel September 30, 2009 1 Introduction Numerical simulation is an extremely valuable tool for those who wish
More informationDELFT UNIVERSITY OF TECHNOLOGY
DELFT UNIVERSITY OF TECHNOLOGY REPORT -09 Computational and Sensitivity Aspects of Eigenvalue-Based Methods for the Large-Scale Trust-Region Subproblem Marielba Rojas, Bjørn H. Fotland, and Trond Steihaug
More information14 : Theory of Variational Inference: Inner and Outer Approximation
10-708: Probabilistic Graphical Models 10-708, Spring 2014 14 : Theory of Variational Inference: Inner and Outer Approximation Lecturer: Eric P. Xing Scribes: Yu-Hsin Kuo, Amos Ng 1 Introduction Last lecture
More informationAM 205: lecture 19. Last time: Conditions for optimality Today: Newton s method for optimization, survey of optimization methods
AM 205: lecture 19 Last time: Conditions for optimality Today: Newton s method for optimization, survey of optimization methods Optimality Conditions: Equality Constrained Case As another example of equality
More information5.7 Cramer's Rule 1. Using Determinants to Solve Systems Assumes the system of two equations in two unknowns
5.7 Cramer's Rule 1. Using Determinants to Solve Systems Assumes the system of two equations in two unknowns (1) possesses the solution and provided that.. The numerators and denominators are recognized
More informationAM 205: lecture 19. Last time: Conditions for optimality, Newton s method for optimization Today: survey of optimization methods
AM 205: lecture 19 Last time: Conditions for optimality, Newton s method for optimization Today: survey of optimization methods Quasi-Newton Methods General form of quasi-newton methods: x k+1 = x k α
More informationChapter 2 Finite Element Formulations
Chapter 2 Finite Element Formulations The governing equations for problems solved by the finite element method are typically formulated by partial differential equations in their original form. These are
More informationGENERALIZED CONVEXITY AND OPTIMALITY CONDITIONS IN SCALAR AND VECTOR OPTIMIZATION
Chapter 4 GENERALIZED CONVEXITY AND OPTIMALITY CONDITIONS IN SCALAR AND VECTOR OPTIMIZATION Alberto Cambini Department of Statistics and Applied Mathematics University of Pisa, Via Cosmo Ridolfi 10 56124
More informationFundamentals of Linear Algebra. Marcel B. Finan Arkansas Tech University c All Rights Reserved
Fundamentals of Linear Algebra Marcel B. Finan Arkansas Tech University c All Rights Reserved 2 PREFACE Linear algebra has evolved as a branch of mathematics with wide range of applications to the natural
More informationDefinition 5.1. A vector field v on a manifold M is map M T M such that for all x M, v(x) T x M.
5 Vector fields Last updated: March 12, 2012. 5.1 Definition and general properties We first need to define what a vector field is. Definition 5.1. A vector field v on a manifold M is map M T M such that
More informationGaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008
Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:
More informationProbabilistic Graphical Models
2016 Robert Nowak Probabilistic Graphical Models 1 Introduction We have focused mainly on linear models for signals, in particular the subspace model x = Uθ, where U is a n k matrix and θ R k is a vector
More information5 Handling Constraints
5 Handling Constraints Engineering design optimization problems are very rarely unconstrained. Moreover, the constraints that appear in these problems are typically nonlinear. This motivates our interest
More informationFitting Linear Statistical Models to Data by Least Squares: Introduction
Fitting Linear Statistical Models to Data by Least Squares: Introduction Radu Balan, Brian R. Hunt and C. David Levermore University of Maryland, College Park University of Maryland, College Park, MD Math
More informationThe Simplex Method: An Example
The Simplex Method: An Example Our first step is to introduce one more new variable, which we denote by z. The variable z is define to be equal to 4x 1 +3x 2. Doing this will allow us to have a unified
More information1 Kalman Filter Introduction
1 Kalman Filter Introduction You should first read Chapter 1 of Stochastic models, estimation, and control: Volume 1 by Peter S. Maybec (available here). 1.1 Explanation of Equations (1-3) and (1-4) Equation
More informationNumerical Methods I Solving Nonlinear Equations
Numerical Methods I Solving Nonlinear Equations Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 MATH-GA 2011.003 / CSCI-GA 2945.003, Fall 2014 October 16th, 2014 A. Donev (Courant Institute)
More information(df (ξ ))( v ) = v F : O ξ R. with F : O ξ O
Math 396. Derivative maps, parametric curves, and velocity vectors Let (X, O ) and (X, O) be two C p premanifolds with corners, 1 p, and let F : X X be a C p mapping. Let ξ X be a point and let ξ = F (ξ
More informationLecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016
Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016 1 Entropy Since this course is about entropy maximization,
More informationOptimization and Root Finding. Kurt Hornik
Optimization and Root Finding Kurt Hornik Basics Root finding and unconstrained smooth optimization are closely related: Solving ƒ () = 0 can be accomplished via minimizing ƒ () 2 Slide 2 Basics Root finding
More information11 a 12 a 21 a 11 a 22 a 12 a 21. (C.11) A = The determinant of a product of two matrices is given by AB = A B 1 1 = (C.13) and similarly.
C PROPERTIES OF MATRICES 697 to whether the permutation i 1 i 2 i N is even or odd, respectively Note that I =1 Thus, for a 2 2 matrix, the determinant takes the form A = a 11 a 12 = a a 21 a 11 a 22 a
More informationNumerical Linear Algebra Primer. Ryan Tibshirani Convex Optimization
Numerical Linear Algebra Primer Ryan Tibshirani Convex Optimization 10-725 Consider Last time: proximal Newton method min x g(x) + h(x) where g, h convex, g twice differentiable, and h simple. Proximal
More informationAN ALTERNATING MINIMIZATION ALGORITHM FOR NON-NEGATIVE MATRIX APPROXIMATION
AN ALTERNATING MINIMIZATION ALGORITHM FOR NON-NEGATIVE MATRIX APPROXIMATION JOEL A. TROPP Abstract. Matrix approximation problems with non-negativity constraints arise during the analysis of high-dimensional
More informationLecture 9 Approximations of Laplace s Equation, Finite Element Method. Mathématiques appliquées (MATH0504-1) B. Dewals, C.
Lecture 9 Approximations of Laplace s Equation, Finite Element Method Mathématiques appliquées (MATH54-1) B. Dewals, C. Geuzaine V1.2 23/11/218 1 Learning objectives of this lecture Apply the finite difference
More informationIntroduction - Motivation. Many phenomena (physical, chemical, biological, etc.) are model by differential equations. f f(x + h) f(x) (x) = lim
Introduction - Motivation Many phenomena (physical, chemical, biological, etc.) are model by differential equations. Recall the definition of the derivative of f(x) f f(x + h) f(x) (x) = lim. h 0 h Its
More informationLecture 13: Constrained optimization
2010-12-03 Basic ideas A nonlinearly constrained problem must somehow be converted relaxed into a problem which we can solve (a linear/quadratic or unconstrained problem) We solve a sequence of such problems
More informationNumerical Optimal Control Overview. Moritz Diehl
Numerical Optimal Control Overview Moritz Diehl Simplified Optimal Control Problem in ODE path constraints h(x, u) 0 initial value x0 states x(t) terminal constraint r(x(t )) 0 controls u(t) 0 t T minimize
More informationNumerical Methods for Large-Scale Nonlinear Systems
Numerical Methods for Large-Scale Nonlinear Systems Handouts by Ronald H.W. Hoppe following the monograph P. Deuflhard Newton Methods for Nonlinear Problems Springer, Berlin-Heidelberg-New York, 2004 Num.
More informationLecture 5: Linear models for classification. Logistic regression. Gradient Descent. Second-order methods.
Lecture 5: Linear models for classification. Logistic regression. Gradient Descent. Second-order methods. Linear models for classification Logistic regression Gradient descent and second-order methods
More informationSensitivity and Reliability Analysis of Nonlinear Frame Structures
Sensitivity and Reliability Analysis of Nonlinear Frame Structures Michael H. Scott Associate Professor School of Civil and Construction Engineering Applied Mathematics and Computation Seminar April 8,
More informationSeptember Math Course: First Order Derivative
September Math Course: First Order Derivative Arina Nikandrova Functions Function y = f (x), where x is either be a scalar or a vector of several variables (x,..., x n ), can be thought of as a rule which
More informationThe Bock iteration for the ODE estimation problem
he Bock iteration for the ODE estimation problem M.R.Osborne Contents 1 Introduction 2 2 Introducing the Bock iteration 5 3 he ODE estimation problem 7 4 he Bock iteration for the smoothing problem 12
More informationThe Hilbert Space of Random Variables
The Hilbert Space of Random Variables Electrical Engineering 126 (UC Berkeley) Spring 2018 1 Outline Fix a probability space and consider the set H := {X : X is a real-valued random variable with E[X 2
More informationAn Adaptive Partition-based Approach for Solving Two-stage Stochastic Programs with Fixed Recourse
An Adaptive Partition-based Approach for Solving Two-stage Stochastic Programs with Fixed Recourse Yongjia Song, James Luedtke Virginia Commonwealth University, Richmond, VA, ysong3@vcu.edu University
More informationAn introduction to Mathematical Theory of Control
An introduction to Mathematical Theory of Control Vasile Staicu University of Aveiro UNICA, May 2018 Vasile Staicu (University of Aveiro) An introduction to Mathematical Theory of Control UNICA, May 2018
More informationUnsupervised Learning with Permuted Data
Unsupervised Learning with Permuted Data Sergey Kirshner skirshne@ics.uci.edu Sridevi Parise sparise@ics.uci.edu Padhraic Smyth smyth@ics.uci.edu School of Information and Computer Science, University
More informationEE 367 / CS 448I Computational Imaging and Display Notes: Image Deconvolution (lecture 6)
EE 367 / CS 448I Computational Imaging and Display Notes: Image Deconvolution (lecture 6) Gordon Wetzstein gordon.wetzstein@stanford.edu This document serves as a supplement to the material discussed in
More informationCS 542G: Robustifying Newton, Constraints, Nonlinear Least Squares
CS 542G: Robustifying Newton, Constraints, Nonlinear Least Squares Robert Bridson October 29, 2008 1 Hessian Problems in Newton Last time we fixed one of plain Newton s problems by introducing line search
More informationA Locking-Free MHM Method for Elasticity
Trabalho apresentado no CNMAC, Gramado - RS, 2016. Proceeding Series of the Brazilian Society of Computational and Applied Mathematics A Locking-Free MHM Method for Elasticity Weslley S. Pereira 1 Frédéric
More informationMaximum Likelihood Estimation
Maximum Likelihood Estimation Prof. C. F. Jeff Wu ISyE 8813 Section 1 Motivation What is parameter estimation? A modeler proposes a model M(θ) for explaining some observed phenomenon θ are the parameters
More informationDirectional Field. Xiao-Ming Fu
Directional Field Xiao-Ming Fu Outlines Introduction Discretization Representation Objectives and Constraints Outlines Introduction Discretization Representation Objectives and Constraints Definition Spatially-varying
More informationIPAM Summer School Optimization methods for machine learning. Jorge Nocedal
IPAM Summer School 2012 Tutorial on Optimization methods for machine learning Jorge Nocedal Northwestern University Overview 1. We discuss some characteristics of optimization problems arising in deep
More informationKey words. preconditioned conjugate gradient method, saddle point problems, optimal control of PDEs, control and state constraints, multigrid method
PRECONDITIONED CONJUGATE GRADIENT METHOD FOR OPTIMAL CONTROL PROBLEMS WITH CONTROL AND STATE CONSTRAINTS ROLAND HERZOG AND EKKEHARD SACHS Abstract. Optimality systems and their linearizations arising in
More informationINTRODUCTION TO FINITE ELEMENT METHODS
INTRODUCTION TO FINITE ELEMENT METHODS LONG CHEN Finite element methods are based on the variational formulation of partial differential equations which only need to compute the gradient of a function.
More informationAPPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2.
APPENDIX A Background Mathematics A. Linear Algebra A.. Vector algebra Let x denote the n-dimensional column vector with components 0 x x 2 B C @. A x n Definition 6 (scalar product). The scalar product
More informationPDEs in Image Processing, Tutorials
PDEs in Image Processing, Tutorials Markus Grasmair Vienna, Winter Term 2010 2011 Direct Methods Let X be a topological space and R: X R {+ } some functional. following definitions: The mapping R is lower
More informationGeneralized Finite Element Methods for Three Dimensional Structural Mechanics Problems. C. A. Duarte. I. Babuška and J. T. Oden
Generalized Finite Element Methods for Three Dimensional Structural Mechanics Problems C. A. Duarte COMCO, Inc., 7800 Shoal Creek Blvd. Suite 290E Austin, Texas, 78757, USA I. Babuška and J. T. Oden TICAM,
More informationOutline. 1 Full information estimation. 2 Moving horizon estimation - zero prior weighting. 3 Moving horizon estimation - nonzero prior weighting
Outline Moving Horizon Estimation MHE James B. Rawlings Department of Chemical and Biological Engineering University of Wisconsin Madison SADCO Summer School and Workshop on Optimal and Model Predictive
More informationProbabilistic Graphical Models
Probabilistic Graphical Models Brown University CSCI 2950-P, Spring 2013 Prof. Erik Sudderth Lecture 12: Gaussian Belief Propagation, State Space Models and Kalman Filters Guest Kalman Filter Lecture by
More informationSemidefinite and Second Order Cone Programming Seminar Fall 2012 Project: Robust Optimization and its Application of Robust Portfolio Optimization
Semidefinite and Second Order Cone Programming Seminar Fall 2012 Project: Robust Optimization and its Application of Robust Portfolio Optimization Instructor: Farid Alizadeh Author: Ai Kagawa 12/12/2012
More informationx. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ).
.8.6 µ =, σ = 1 µ = 1, σ = 1 / µ =, σ =.. 3 1 1 3 x Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ ). The Gaussian distribution Probably the most-important distribution in all of statistics
More informationIntroduction to Mobile Robotics Compact Course on Linear Algebra. Wolfram Burgard, Bastian Steder
Introduction to Mobile Robotics Compact Course on Linear Algebra Wolfram Burgard, Bastian Steder Reference Book Thrun, Burgard, and Fox: Probabilistic Robotics Vectors Arrays of numbers Vectors represent
More informationCS281 Section 4: Factor Analysis and PCA
CS81 Section 4: Factor Analysis and PCA Scott Linderman At this point we have seen a variety of machine learning models, with a particular emphasis on models for supervised learning. In particular, we
More informationAdaptive methods for control problems with finite-dimensional control space
Adaptive methods for control problems with finite-dimensional control space Saheed Akindeinde and Daniel Wachsmuth Johann Radon Institute for Computational and Applied Mathematics (RICAM) Austrian Academy
More informationCIV-E1060 Engineering Computation and Simulation Examination, December 12, 2017 / Niiranen
CIV-E16 Engineering Computation and Simulation Examination, December 12, 217 / Niiranen This examination consists of 3 problems rated by the standard scale 1...6. Problem 1 Let us consider a long and tall
More informationTHEODORE VORONOV DIFFERENTIABLE MANIFOLDS. Fall Last updated: November 26, (Under construction.)
4 Vector fields Last updated: November 26, 2009. (Under construction.) 4.1 Tangent vectors as derivations After we have introduced topological notions, we can come back to analysis on manifolds. Let M
More informationSparse Covariance Selection using Semidefinite Programming
Sparse Covariance Selection using Semidefinite Programming A. d Aspremont ORFE, Princeton University Joint work with O. Banerjee, L. El Ghaoui & G. Natsoulis, U.C. Berkeley & Iconix Pharmaceuticals Support
More informationarxiv: v1 [math.na] 7 May 2009
The hypersecant Jacobian approximation for quasi-newton solves of sparse nonlinear systems arxiv:0905.105v1 [math.na] 7 May 009 Abstract Johan Carlsson, John R. Cary Tech-X Corporation, 561 Arapahoe Avenue,
More information1 The linear algebra of linear programs (March 15 and 22, 2015)
1 The linear algebra of linear programs (March 15 and 22, 2015) Many optimization problems can be formulated as linear programs. The main features of a linear program are the following: Variables are real
More informationDimensionality reduction of SDPs through sketching
Technische Universität München Workshop on "Probabilistic techniques and Quantum Information Theory", Institut Henri Poincaré Joint work with Andreas Bluhm arxiv:1707.09863 Semidefinite Programs (SDPs)
More informationIterative Methods for Linear Systems
Iterative Methods for Linear Systems 1. Introduction: Direct solvers versus iterative solvers In many applications we have to solve a linear system Ax = b with A R n n and b R n given. If n is large the
More informationVasil Khalidov & Miles Hansard. C.M. Bishop s PRML: Chapter 5; Neural Networks
C.M. Bishop s PRML: Chapter 5; Neural Networks Introduction The aim is, as before, to find useful decompositions of the target variable; t(x) = y(x, w) + ɛ(x) (3.7) t(x n ) and x n are the observations,
More informationComputer Intensive Methods in Mathematical Statistics
Computer Intensive Methods in Mathematical Statistics Department of mathematics johawes@kth.se Lecture 16 Advanced topics in computational statistics 18 May 2017 Computer Intensive Methods (1) Plan of
More informationCCP Estimation. Robert A. Miller. March Dynamic Discrete Choice. Miller (Dynamic Discrete Choice) cemmap 6 March / 27
CCP Estimation Robert A. Miller Dynamic Discrete Choice March 2018 Miller Dynamic Discrete Choice) cemmap 6 March 2018 1 / 27 Criteria for Evaluating Estimators General principles to apply when assessing
More informationDynamic System Identification using HDMR-Bayesian Technique
Dynamic System Identification using HDMR-Bayesian Technique *Shereena O A 1) and Dr. B N Rao 2) 1), 2) Department of Civil Engineering, IIT Madras, Chennai 600036, Tamil Nadu, India 1) ce14d020@smail.iitm.ac.in
More informationLinear Hyperbolic Systems
Linear Hyperbolic Systems Professor Dr E F Toro Laboratory of Applied Mathematics University of Trento, Italy eleuterio.toro@unitn.it http://www.ing.unitn.it/toro October 8, 2014 1 / 56 We study some basic
More informationLecture Notes: Geometric Considerations in Unconstrained Optimization
Lecture Notes: Geometric Considerations in Unconstrained Optimization James T. Allison February 15, 2006 The primary objectives of this lecture on unconstrained optimization are to: Establish connections
More information