Autoregressive Gaussian processes for structural damage detection

R. Fuentes 1,2, E. J. Cross 1, A. Halfpenny 2, R. J. Barthorpe 1, K. Worden 1

1 University of Sheffield, Dynamics Research Group, Department of Mechanical Engineering, Mappin Street, Sheffield, S1 3JD, UK
ramon.fuentes@sheffield.ac.uk
2 HBM-nCode United Kingdom, Advanced Manufacturing Park, Catcliffe, Rotherham, South Yorkshire, S60 5WG

Proceedings of ISMA2014 including USD2014, Damage Detection and Structural Health Monitoring

Abstract

This paper presents an application of damage detection from nonlinear time series using a Gaussian Process autoregressive model. Gaussian Processes are powerful nonparametric models that can be used for advanced nonlinear regression. Autoregressive models, for their part, have been examined extensively for fault detection in mechanical systems. However, when a mechanical system behaves nonlinearly in its baseline condition, a linear autoregressive model might not be able to distinguish between a baseline system response and a response from the system with damage. This paper is intended as a brief introduction to both data-driven damage detection and Gaussian Processes, and concludes with an example damage detection application to a simulated signal as well as a short review of Gaussian Processes for time series models.

1 Introduction

This paper explores the use of Gaussian Process Regression within a nonlinear autoregressive framework for structural damage detection in mechanical systems. Gaussian Processes are a powerful machine learning tool that can be used for both regression and classification tasks. Structural damage detection for mechanical systems, on the other hand, has been the topic of a growing number of academic publications in the last few years. Gaussian Processes have been studied recently for Structural Health Monitoring (SHM) applications in [1] for monitoring of landing gear loads, and in [2] for monitoring of aircraft loads.
The next section provides a summarised literature review on the subject and an overview of the general concept and challenges involved. The key idea in performing structural damage detection on a mechanical system is to use available data from a structure in operation in order to diagnose the presence of damage. It has been identified [3] that a statistical pattern recognition approach is often desirable in order to identify the presence of damage. When using a machine learning approach to the problem, the general damage identification procedure is divided into two phases. The first is the training phase, which involves using a data set from the healthy structure in order to tune the model. If example data are available from damaged cases, these can also be used in the training phase. The second phase is the prediction phase, which involves using the trained model to infer the health state of the structure. Gaussian Process Regression (GPR) is a relatively new tool for performing nonlinear regression in a Bayesian way. What this means exactly will be explained in more detail in a later section, but what it
means in practice is that, given a set of inputs, the model predicts a probability density for the outputs. Hence, all predictions come with a measure of uncertainty, and it has been demonstrated that this measure of uncertainty in other Bayesian models can be exploited for structural damage identification [4]. The Gaussian Process is also a nonparametric model, which in practice means that it does not rely on parameters in order to make predictions, but instead uses conditional probability relationships to estimate the density of the outputs given the training inputs and outputs. The details of this will be covered in Section 3.

The layout of this paper is as follows. Section 2 provides a brief background on Structural Health Monitoring (SHM) and discusses some of the challenges involved as well as some common and simple models. The reader familiar with SHM may safely skip this. Section 3 provides a conceptually-oriented introduction to Gaussian Process Regression, while Section 4 shows the application of a GPR autoregressive model to damage detection from a nonlinear signal.

2 Structural damage detection background

Data-driven techniques for SHM involve inferring the presence of damage from data collected via transducers on the structure. These transducers could measure strain, displacements or accelerations, as well as any environmental variables that may affect the characteristics of the structure. There is an increasing trend towards using acceleration measurements in vibration-based SHM, due to the practicality of using an accelerometer as a transducer. Recent developments in MEMS (micro-electro-mechanical systems) technology not only allow acceleration measurements to be taken in a more cost-effective manner, but also allow the development of embedded systems which can run algorithms online, so that SHM can be performed in real time for a structure in operation.
The basic premise for damage detection is that damage will change the dynamic characteristics of a structure, and detecting those changes will lead to detecting damage. In the SHM literature, Condition Monitoring (CM) refers to damage diagnosis applied to rotating machinery. This is now a mature subject and it would be safe to say that it has made the transition out of university research into industrial applications [5-7]. One of the defining aspects of CM is that rotating machinery operates mostly in a stationary environment, where the loading amplitudes and frequencies remain constant. The class of problems that CM deals with includes the diagnosis of faulty bearings from shifts in the harmonic frequencies of the response spectrum, or the diagnosis of worn or missing gear teeth from changes in noise signature. The more recent developments in SHM in various industries, including civil, aerospace, automotive, power generation and offshore, are concerned with damage detection, diagnosis and prognosis in structures where the operating conditions are not necessarily stationary, the input excitations are not necessarily known, the structures might not behave in a linear fashion and their dynamics might also change according to the environment they operate in. These conditions call for algorithms that are robust to these changes. This introduction will provide a brief review of these methods. Thorough accounts and literature reviews can be found in [3,8,9].

A well accepted hierarchical structure for damage identification is the Rytter scale [10], which splits the problem into four levels:

1) Detection: Is damage present?
2) Localisation: What is the physical location of the damage?
3) Severity: What is the extent of the damage?
4) Prognosis: What is the remaining useful life of the structure?

It will typically be possible to use machine learning and statistical tools to solve Levels 1 to 3 from analysis of data collected from the structure. Prognosis will require some physics-based model of the system and a model for the evolution of damage after it has been diagnosed, which will typically require data from previous similar cases. The damage detection problem (Level 1 diagnosis) can be solved by statistical and machine learning tools, using data from the undamaged structure alone. This is an unsupervised learning problem, as the mathematical model is built using training data only from the undamaged structure. Supervised learning algorithms, on the other hand, use samples from both the damaged and undamaged cases in order to fit a model to the data, and this will typically require labels for each class of damage. If this kind of information is available, whether from previous cases of damage, Finite Element modelling of the various damage scenarios or experimental testing, then, in principle, supervised learning algorithms can be used to solve the localisation and severity problems. The issue is that it is not always possible, cost effective or straightforward to measure all the possible damage scenarios. This would mean damaging the structure which, for most applications, is obviously prohibitive. In some cases, data collected from actual failures could be available, but this is rare and might not necessarily capture the required parameters. The approach of using physics-based models to generate the damaged cases is also restrictive. Their accuracy can be affected by the lack of known input excitations which, for many real-world structures, are hard to measure or predict. Structural nonlinearities and the general difficulties in creating physics-based models to characterise the structural dynamics of complex structural assemblies are also limiting factors.
A thorough and recent account that compares data-driven against physics-based models can be found in [11]. These limitations make it more practical, in the majority of cases, to use unsupervised learning algorithms and thus solve only a Level 1 problem. Data pre-processing and feature extraction steps are required before using any statistical or pattern recognition algorithms. Significant pre-processing is required to make sure the data is clean of the anomalies inherent in any data acquisition system. Feature extraction refers to transforming the raw data collected by the acquisition system, which will normally be discrete time domain signals, into a more useful format, or features. These features are mathematical transformations of the raw data, typically into a lower dimensional space. It is obviously desirable to construct features that are sensitive to damage. However, a feature that is sensitive to damage will generally also be sensitive to changes in the environment. There is an inherent trade-off between sensitivity to damage and sensitivity to environmental changes in any particular feature [12], where these changes refer to varying input loads and changes in temperature, humidity, or mass in the structure. When dealing with variable loading conditions, it is possible to select features that are normalised against these loads. If loads can be measured, a frequency response function, for example, could be a good feature vector. However, the input loads will not always be available (in particular, for structures in operation). This can sometimes be solved by assuming a statistical distribution for the input loading. If this can be approximated as Gaussian, it is possible to use the standard result from linear systems theory that the output of a linear system excited by a Gaussian input will also be Gaussian distributed. This has allowed for various methods that can deal with this uncertainty in the input excitation.
The effects of environmental variability also have a significant impact on the generalisation capabilities of any novelty detection algorithm. There are numerous techniques available for normalising damage sensitive features against environmental variabilities. These include using Singular Value Decomposition (SVD) [13], factor analysis [14] and Auto-associative neural networks [15]. Principal Component Analysis (PCA) has also been suggested as a means of selecting damage sensitive features that are insensitive to environmental variations [16]. More recently, cointegration has been suggested [17,18] for SHM applications and successfully implemented in the removal of environmental trends of a bridge [19]. It is a method of projecting out components that correspond to long term trends in the data, which is a characteristic of environmental variables such as temperature, humidity and ice build-up.

2.1 Time series models: autoregressive processes

There are numerous methods for modelling a time series, but arguably the simplest such model is an autoregressive (AR) process. This defines any point in the signal as a weighted sum of the previous measured points:

$$ y_t = \sum_{i=1}^{p} a_i \, y_{t-i} \tag{1} $$

where $y_t$ is the $t$-th point of $y$, the signal being modelled, and $a_i$ is the $i$-th AR coefficient. If the system has measured inputs, one can add a moving average term to model the signal as a weighted sum of the previous terms plus the influence of the previous inputs:

$$ y_t = \sum_{i=1}^{p} a_i \, y_{t-i} + \sum_{j=1}^{q} b_j \, x_{t-j} \tag{2} $$

where $x_t$ is the measured input into the system and $b_j$ is the $j$-th Moving Average (MA) coefficient. An AR model of order $p$ with an MA model of order $q$ is usually denoted as ARMA($p$, $q$). It is interesting to note that the discrete time representation of the single-degree-of-freedom (SDOF) linear oscillator,

$$ m\ddot{y} + c\dot{y} + ky = x(t) \tag{3} $$

is in fact an ARMA model. It is possible to show that the second order differential equation for the harmonic oscillator above can be described in discrete time form as

$$ y_t = a_1 y_{t-1} + a_2 y_{t-2} + b_1 x_{t-1} \tag{4} $$

This is an ARMA(2,1) model and there is a direct correspondence between the coefficients and the mass and stiffness of the system. The purpose of showing this here is to highlight that a change in the physical parameters of the system would imply a change in the AR coefficients. There are two damage sensitive features that can be extracted using such a model: the coefficients and the signal reconstruction error. The AR coefficients can be used in further pattern recognition algorithms to detect and classify damage. Although the AR parameters are useful damage sensitive features, there still remains the question of determining the order of such a model. Fitting a model with too high an order will result in a poor ability to generalise, since the model might start fitting the measurement noise. If the model order is too low, the model will not fully capture the dynamics contained in the signal.
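As a minimal sketch of how an AR model yields both kinds of feature, the NumPy code below fits the coefficients of an AR(p) model, $y_t = \sum_i a_i y_{t-i}$, by ordinary least squares and returns the one-step-ahead residuals. The function names (`lagged_matrix`, `fit_ar`, `ar_residuals`) and the sinusoidal test signal are illustrative assumptions rather than anything taken from the paper, and least squares is only one of several ways of estimating AR coefficients.

```python
import numpy as np

def lagged_matrix(y, p):
    """Rows are [y[t-1], ..., y[t-p]] for t = p, ..., len(y) - 1."""
    y = np.asarray(y, dtype=float)
    return np.column_stack([y[p - i : len(y) - i] for i in range(1, p + 1)])

def fit_ar(y, p):
    """Least-squares estimate of the AR(p) coefficients a_1, ..., a_p."""
    y = np.asarray(y, dtype=float)
    coeffs, *_ = np.linalg.lstsq(lagged_matrix(y, p), y[p:], rcond=None)
    return coeffs

def ar_residuals(y, coeffs):
    """One-step-ahead prediction errors of the AR model on the signal y."""
    y = np.asarray(y, dtype=float)
    p = len(coeffs)
    return y[p:] - lagged_matrix(y, p) @ coeffs

# Illustrative signal (an assumption): a noise-free sinusoid, which obeys
# y_t = 2 cos(w) y_{t-1} - y_{t-2} exactly.
w = 0.3
y = np.sin(w * np.arange(200))
a = fit_ar(y, 2)
res = ar_residuals(y, a)
```

For this noise-free sinusoid the fitted AR(2) coefficients are $2\cos\omega$ and $-1$, so a change in the oscillation frequency appears directly as a change in $a_1$, in the same spirit as the oscillator example.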
The right choice of model order and its influence on damage detection has been studied in [20]. The study used three dominant techniques: the Akaike information criterion [21], the partial autocorrelation function and the root-mean-square error. It concluded that they do not all converge to the same solution for model order, but that using the three of them together generally provides a robust guideline for establishing model orders in SHM applications. The AR parameters are useful damage sensitive features on their own, but it is also possible, for example, to use a Mahalanobis squared distance-based outlier analysis to determine when a combination of AR
parameters deviates significantly from the normal condition. A different approach can also be taken using AR parameters. This involves reconstructing every point in the signal using the AR parameters together with the previous signal values, and then taking the residual of this prediction. This error will increase if there are any changes in the system, such as damage (or changes in operational and environmental conditions), which makes it a useful damage sensitive feature as well. This method will be compared against the autoregressive Gaussian Process model in the last section.

3 Gaussian process regression background

The aim of this section is to introduce the ideas behind GP regression at a conceptual level. It is aimed at readers not familiar with concepts such as Bayesian and nonparametric models. It is outside the scope of this paper to discuss the details behind GP regression in full; for a complete discussion of the subject the reader is referred to [22]. Readers familiar with Bayesian methods, nonparametric models and Gaussian Processes may skip to the next section.

3.1 Parametric methods

The main idea behind any regression model is to try to explain a set of output values, given a set of input values. In a parametric model a parametrised function is used to map inputs to outputs. In linear regression, for example, the function is $y = mx + c$, where there are two parameters, the slope $m$ and the intercept $c$. There needs to be a systematic way of arriving at the parameters that best fit the data. This task is called parameter estimation. The next issue is whether the model is even the right choice for the data, which means that a model selection stage is also required. It is outside the scope of this paper to discuss parameter estimation and model selection in detail, but the reader is referred to [23] for further reading. It is necessary to have a means of quantifying the uncertainty of predictions made by a model.
The reason for this is that there will always be errors in the modelling process. To begin with, the training data may be corrupted by noise from the transducers, or there could be bias due to the model not capturing the physical process completely. Even if there is good confidence that the chosen model can approximate the physical process, there will be uncertainty about the parameters of that model. If the parameters of a model were chosen using training data within a certain region, there will be high confidence that the parameters fit the data within that region. Any predictions made outside that region will have a higher uncertainty and there needs to be a systematic way to address this. Bayesian methods are particularly suited to these types of problems.

3.2 Bayesian linear regression

The term Bayesian comes from the use of Bayes' rule of probability

$$ P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \tag{5} $$

where $P(A \mid B)$ denotes the conditional probability of $A$ given $B$. What Bayes' rule does is essentially to reduce the uncertainty about $A$ given information about $B$. In order to do so it uses the original uncertainty about $A$, $P(A)$, which is called a prior since it encodes prior information about it. $P(B \mid A)$ is called the
likelihood, and is the inverse conditional probability. $P(B)$ is called the marginal likelihood. Using these terms, equation (5) can be rewritten:

$$ \text{posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{marginal likelihood}} \tag{6} $$

where the posterior is our updated uncertainty about $A$. This is a powerful idea that has led to the development of a host of models that take into account prior information or degrees of belief about something, and then update it when new evidence becomes available. To see how this is useful for regression, consider the simple linear regression model:

$$ f(\mathbf{x}) = \mathbf{x}^{\top}\mathbf{w}, \qquad y = f(\mathbf{x}) + \varepsilon \tag{7} $$

where $f$ is the function of interest, $\mathbf{x}$ are the inputs to the model, and $y$ are the output observations. The observations differ from the actual function values through the noise $\varepsilon$, which could be considered as white Gaussian with zero mean and a variance of $\sigma_n^2$. Lastly, $\mathbf{w}$ are the weights of the model, which are the parameters that need to be estimated. This model could be solved using a least squares method, which would give point estimates of the weights and of the noise variance. One could instead use Bayes' theorem to derive a posterior distribution over the weights:

$$ p(\mathbf{w} \mid \mathbf{y}, X) = \frac{p(\mathbf{y} \mid X, \mathbf{w})\, p(\mathbf{w})}{p(\mathbf{y} \mid X)} \tag{8} $$

which readily gives a distribution over the parameters of the model. The prior in this case is $p(\mathbf{w})$, which is any distribution that reflects one's initial knowledge about the model parameters before one attempts to identify them. The likelihood term is $p(\mathbf{y} \mid X, \mathbf{w})$, a probability density that reflects the probability of the measured data given a set of parameter values. This term is important as it is often chosen as an objective function in optimisation routines [23]. Lastly, the marginal likelihood term $p(\mathbf{y} \mid X)$, also called the model evidence, provides a normalising constant and represents the probability of the data given a specific model, and so it is often useful for comparing and selecting between competing models. The discussion on Bayesian linear regression can only go to a limited depth here, but the reader is encouraged to consult [23] for a more thorough explanation of the subject.
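To make the posterior over the weights concrete, the sketch below works through the standard closed-form case in which the prior on the weights is a zero-mean isotropic Gaussian and the noise variance is known. Both of these are assumptions made for illustration, since the paper leaves the prior unspecified. The names `blr_posterior` and `blr_predict` and the synthetic straight-line data are likewise hypothetical.

```python
import numpy as np

def blr_posterior(X, y, noise_var, prior_var):
    """Posterior mean and covariance of w for y = X w + noise,
    assuming w ~ N(0, prior_var * I) and noise ~ N(0, noise_var)."""
    d = X.shape[1]
    precision = X.T @ X / noise_var + np.eye(d) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return mean, cov

def blr_predict(x_star, mean, cov, noise_var):
    """Predictive mean and variance of a noisy observation at x_star."""
    return x_star @ mean, x_star @ cov @ x_star + noise_var

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
X = np.column_stack([x, np.ones_like(x)])   # columns: slope term, intercept term
w_true = np.array([2.0, -0.5])              # synthetic "true" weights (assumed)
y = X @ w_true + 0.1 * rng.standard_normal(50)

mean, cov = blr_posterior(X, y, noise_var=0.01, prior_var=10.0)
_, var_inside = blr_predict(np.array([0.0, 1.0]), mean, cov, 0.01)
_, var_outside = blr_predict(np.array([5.0, 1.0]), mean, cov, 0.01)
```

The predictive variance at $x = 5$, well outside the training interval $[-1, 1]$, is larger than at $x = 0$, which is the behaviour described in Section 3.1: predictions made away from the training region carry a higher uncertainty.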
3.3 Nonparametric methods - Gaussian processes

There is another class of machine learning algorithms called nonparametric models. The name comes from the fact that no parameters are used to make predictions. Instead, the training data itself is used to define the relationships between inputs and outputs. One major advantage of this setup is that the type of model does not need to be specified exactly, but is determined from the data. This makes this class of models naturally more flexible and able to model more complex data. It also (partially) removes the model selection step, while providing an uncertainty estimate when making predictions. Gaussian Process Regression is a type of nonparametric model which has been shown to be closely related, and in some cases equivalent, to other models such as Bayesian linear regression, Artificial Neural Networks (ANNs), Support Vector Machines (SVMs) and spline models. This section covers the basics of what a GP regression model consists of and how training and prediction are performed with it, together with some of the associated computational issues and some recent solutions to them.

A Gaussian Process is a generalisation of the Gaussian distribution. Whereas the Gaussian distribution defines a distribution over a finite set of random variables, a Gaussian Process defines a distribution over functions. A key property of this definition, which makes Gaussian Processes usable in practice, is that any finite subset of points from a GP will be jointly Gaussian distributed. This means that it is possible to use the properties and identities of Gaussian distributions to make predictions with a GP. It is worth starting the discussion with the kernel, which is at the heart of how many nonparametric models retain the relationship between inputs and outputs. A kernel (also called the covariance function) encodes the relationship between the output values as a function of the inputs. It also encodes prior information about the process that is being modelled, since the choice of covariance function will essentially determine how smooth this process is. A popular choice of covariance function [22] is the squared exponential:

$$ k(x, x') = \sigma_f^2 \exp\left( -\frac{(x - x')^2}{2\ell^2} \right) \tag{9} $$

which is a function of any two points $x$ and $x'$ of the input space (although note that it defines a covariance between points in the outputs). The squared-exponential kernel is shown in Figure 1. There are two hyperparameters for this covariance function. The length scale $\ell$ controls roughly how far one has to travel in the input space before there is a significant change in the function. The term $\sigma_f^2$ could be thought of as controlling the overall magnitude of the outputs.

Figure 1 Squared-exponential kernel

A Gaussian Process is defined by its covariance function as well as its mean function. Formally, this is written as

$$ f(x) \sim \mathcal{GP}\big( m(x),\, k(x, x') \big) \tag{10} $$

which states that $f(x)$ comes from a Gaussian Process with a mean of $m(x)$ and a covariance function (or kernel) $k(x, x')$, where $x$ and $x'$ are any two points in the input space. For most applications, the mean is typically set to zero, which means that the GP is fully specified by the covariance function [22]. The evaluation of the underlying function for a finite sample of inputs is therefore done by evaluating the covariance function for these inputs, which will result in a covariance matrix. It is possible to sample from a GP, for example, by evaluating the covariance matrix for a finite set of inputs $X_*$ and drawing a random vector from the corresponding Gaussian distribution:

$$ \mathbf{f}_* \sim \mathcal{N}\big( \mathbf{0},\, K(X_*, X_*) \big) \tag{11} $$

Figure 2 a) shows some samples from a GP with a squared-exponential prior (using a length scale of 2 and a variance of 1). One could generate many functions, but the usefulness of GPs for regression comes from the fact that one can generate samples of functions that pass close to the observed training data points. This is done in GPs using a conditional probability framework. If one uses the standard conditional probability relationships for Gaussians, it is possible to arrive at the predictive distribution, where the GP is conditioned on the training inputs $X$ and observations $\mathbf{y}$, so that the mean and the covariance of the distribution are:

$$ \bar{\mathbf{f}}_* = K(X_*, X)\,\big[ K(X, X) + \sigma_n^2 I \big]^{-1} \mathbf{y} \tag{12} $$

$$ \operatorname{cov}(\mathbf{f}_*) = K(X_*, X_*) - K(X_*, X)\,\big[ K(X, X) + \sigma_n^2 I \big]^{-1} K(X, X_*) \tag{13} $$

Note the notation $X_*$, which is taken to mean test input points, not training input points. Equation (12) defines the mean of the predictions, while the diagonal of equation (13) defines their variance. Figure 2 b) shows the mean and the variance of a GP predictive distribution, together with the observations used for the conditioning. Note that this predictive distribution assumes noisy observations through the addition of the term $\sigma_n^2 I$, which is the noise variance multiplied by the identity matrix. On a basic level the GP regression model is trained by simply including the training points in the covariance matrix.
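The sampling and conditioning steps can be sketched in a few lines of NumPy. The code below assumes a zero mean function, one-dimensional inputs, a squared-exponential kernel with fixed hyperparameters, and a predictive mean and variance of the form $K_*[K + \sigma_n^2 I]^{-1}\mathbf{y}$ and $K_{**} - K_*[K + \sigma_n^2 I]^{-1}K_*^{\top}$. The function names, the small `jitter` term added for numerical stability and the training points are illustrative choices, not the authors' implementation.

```python
import numpy as np

def sq_exp_kernel(xa, xb, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d = np.asarray(xa, float)[:, None] - np.asarray(xb, float)[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def sample_prior(x, n_samples, rng, jitter=1e-6, **hyp):
    """Draw n_samples functions from a zero-mean GP prior evaluated at x."""
    K = sq_exp_kernel(x, x, **hyp) + jitter * np.eye(len(x))
    L = np.linalg.cholesky(K)
    return (L @ rng.standard_normal((len(x), n_samples))).T

def gp_predict(x_train, y_train, x_test, noise_var, **hyp):
    """Predictive mean and pointwise variance of the latent function."""
    K = sq_exp_kernel(x_train, x_train, **hyp) + noise_var * np.eye(len(x_train))
    Ks = sq_exp_kernel(x_test, x_train, **hyp)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))  # [K + s2 I]^-1 y
    v = np.linalg.solve(L, Ks.T)
    mean = Ks @ alpha
    var = np.diag(sq_exp_kernel(x_test, x_test, **hyp)) - np.sum(v**2, axis=0)
    return mean, var

rng = np.random.default_rng(1)
x_grid = np.linspace(-5.0, 5.0, 50)
prior_draws = sample_prior(x_grid, 3, rng, length_scale=2.0, signal_var=1.0)

x_train = np.array([-4.0, -2.0, 0.0, 2.0, 4.0])   # hypothetical observations
y_train = np.sin(x_train)
mean, var = gp_predict(x_train, y_train, x_grid, noise_var=1e-4,
                       length_scale=2.0, signal_var=1.0)
```

Using a Cholesky factorisation instead of an explicit matrix inverse is a common numerical choice. The predictive variance is close to zero at the training inputs and returns to the prior variance far away from them.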
However, the covariance function still makes use of hyperparameters, so typically an optimisation routine is required to tune these hyperparameters so that the GP fits the data correctly. A gradient-based optimiser is normally used with the model likelihood as an objective function [22].

3.4 Practicalities of Gaussian process regression

It is worth having a brief discussion about some of the practicalities of implementing a GPR model. The first one is the data size problem, which matters in particular from an engineering perspective and even more so in a time series analysis context, where thousands or millions of data points may need to be analysed. The problem is the matrix inversion required to evaluate the mean and variance of the predictive distribution. The covariance matrix contains an evaluation of the covariance function for all possible pairs of training inputs, and therefore this matrix can potentially be very large. Various methods have been suggested to address this, mostly based on approaches which try to retain informative training points [22]. A good review of recent methods for Gaussian Processes on large data sets is [24]. One method that stands out is the Fully Independent Training Conditional (FITC) method [25], which picks a number of inducing points and computes the covariance matrix for the inducing points only. This provides a low rank plus diagonal approximation to the covariance matrix, which is then easier to invert. The issue with this approach is that a bad selection of inducing points will result in poor predictions, and the effect can be severe if one is not careful. One approach that deals with this problem is presented in [26], where a variational approach is used to learn the appropriate inducing points from the data. This method was used in the examples presented in the next section.
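The low rank plus diagonal structure can be illustrated with a short sketch. The code below builds the FITC-style approximation $Q + \operatorname{diag}(K - Q)$ with $Q = K_{fu} K_{uu}^{-1} K_{uf}$ for a squared-exponential kernel. The inducing inputs are simply placed on a regular grid, which is an assumption made for illustration. The variational method of [26] instead learns them from the data and is not reproduced here. The function names and the `jitter` value are also hypothetical.

```python
import numpy as np

def sq_exp_kernel(xa, xb, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d = np.asarray(xa, float)[:, None] - np.asarray(xb, float)[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def fitc_covariance(x, z, length_scale=1.0, signal_var=1.0, jitter=1e-8):
    """Low rank plus diagonal approximation of K(x, x) from inducing inputs z."""
    Kuu = sq_exp_kernel(z, z, length_scale, signal_var) + jitter * np.eye(len(z))
    Kfu = sq_exp_kernel(x, z, length_scale, signal_var)
    Q = Kfu @ np.linalg.solve(Kuu, Kfu.T)          # rank at most len(z)
    # The diagonal correction restores the exact prior variance at every input.
    return Q + np.diag(signal_var - np.diag(Q))

x = np.linspace(0.0, 10.0, 200)     # 200 training inputs
z = np.linspace(0.0, 10.0, 15)      # 15 inducing inputs on a grid (assumed)
K_exact = sq_exp_kernel(x, x)
K_fitc = fitc_covariance(x, z)
max_error = np.max(np.abs(K_fitc - K_exact))
```

In practice the approximate matrix is never formed explicitly as it is here. Its structure allows the inverse to be applied through the matrix inversion lemma at a cost that scales with the number of inducing points rather than the number of training points.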

Figure 2 a) Samples from a Gaussian Process prior; b) Mean function for a Gaussian Process conditioned on some observations (panel titles: "GP priors" and "GP Posterior"; axes: $f$ against $X$). The mean of the predictive distribution is shown as the continuous blue line, while the two standard deviations region is shown by the shaded region.

4 Gaussian process nonlinear autoregressive model for fault detection

This section discusses the use of a Gaussian Process nonlinear autoregressive model for structural damage detection. This is demonstrated using an artificial example of a nonlinear time series with simulated damage. The idea of an autoregressive model is to predict a signal value at a discrete time point using the previous signal values. These are lagged versions of the signal. If the function used to predict a signal based on its lagged values is linear, then it is a standard linear autoregressive (AR) model as described in equation (1). The idea behind using an AR model for SHM is to use it to characterise the baseline condition of the mechanical system by means of the residual errors between the model predictions and the actual observed values. Any fault in the system will be evidenced as a change in the response, and therefore an increase in the residuals. For systems that exhibit a linear response in the baseline condition, a linear AR model might be sufficient to characterise this. However, if the baseline response is inherently nonlinear, the residuals from the linear model will fail to distinguish between the baseline and the damaged condition. For this reason a nonlinear function is necessary that can predict points in a time series based on some lagged values. In this paper, the use of Gaussian Processes is explored as the nonlinear autoregressive function.
There are in fact several ways this could be done using a Gaussian Process, so a brief review will be presented of some of the developments on modelling time series and dynamical systems using GPs. The general autoregressive model is formally written as:

$$ y_t = f(y_{t-1}, y_{t-2}, \ldots, y_{t-p}) + \varepsilon_t \tag{14} $$

where $y_t$ is the signal measured at a discrete time index $t$, $p$ is the number of lags being considered and $\varepsilon_t$ is the process noise, which could come from sensor noise or other uncertainties, and is assumed to be white Gaussian. In this case, the function $f$ in the demonstration will be a GP, where the lagged signal values are used as inputs and the outputs are all the points in the time series.

4.1 Review of time series and dynamical models using Gaussian process regression

There are two general strategies for modelling the dynamics contained in a measured time series. In the case of the autoregressive framework, the model is implicitly time dependent, since a point in time depends on previous points in time, so the inputs to the model are the lagged signals and the output is the full signal. The alternative is to explicitly establish the dependency between time and the signal, so that the input to the model is time, and the output is the signal. There are advantages and disadvantages to both approaches. In the autoregressive formulation, the performance of the model will be strongly dependent on the sample rate chosen to gather data. In the case of measuring a mechanical system, the engineer needs to have enough knowledge about the dynamics of the system to be able to choose a sampling rate that captures the dynamics relevant to the SHM problem. Also, in the autoregressive formulation it is hard to make contact with the physics of the problem, since it is not a parametric model. In the case of the continuous time-dependent formulation, it is possible to interpolate and extrapolate to areas of the time series where data is missing. This may be useful for modelling problems such as financial forecasting, but not necessarily for SHM. The continuous time formulation will require a periodic covariance function, and this places a penalty on the level of complexity of the dynamics that can be modelled.
In general, the autoregressive formulation will be able to capture more complex dynamics. However, this comes at a computational penalty since the dimension of the input will grow with the number of lags used. A very recent detailed review of these issues, as well as a very thorough comparison of different methods for modelling dynamics in time series using Gaussian Processes, is presented in [27]. A good review of continuous time Gaussian Processes can be found in [28]. Gaussian Processes are intimately linked to Support Vector Machines (SVMs) [23], which are another class of nonparametric models. A study has been presented in [29] which makes use of SVMs in an autoregressive framework to show their usefulness in a damage detection application. The example shown later in this paper will be based on that used in [29]. Possibly the biggest issue with implementing GPs within an autoregressive framework is the fact that the predictive distribution from equations (12) and (13) takes into account noisy observations, but the inputs are assumed to be noise free. This is clearly a problem for an autoregressive formulation, since a point that is an output at time $t$ will become an input to the model at time $t+1$. So, all of the observations will also be inputs. This is an issue of uncertainty propagation from one time point to another, since in the ideal case one would propagate not only a point forward in time, but the point as well as its uncertainty, that is, the whole probability distribution. This is non-trivial even for linear dynamical systems, but for these there are well established and successful algorithms that do this efficiently, such as the Kalman filter [30].
In fact the Kalman filter belongs to a class of estimation algorithms called Bayesian filters [31], which address the general issue of estimating a time series from noisy observations using a model of the process, where this model can be parametric, such as a physical model or a neural network, or nonparametric, such as a Gaussian Process. Unlike the linear case, where the uncertainty can be propagated in closed form when it can be approximated as Gaussian distributed, Bayesian filters that use nonlinear models rely on approximations to the uncertainty propagation problem. A good example of such a filter using Gaussian Processes can be found in [32], where it is applied to tracking a blimp, which has complex flight dynamics. Other interesting formulations using Gaussian Processes for dynamical systems are [27,33,34]. An approximation to the uncertainty propagation for an autoregressive Gaussian Process model is presented in [35,36], with applications to multiple-step-ahead forecasting of a nonlinear time series.

Damage detection example

An example of a damage detection application using a simple autoregressive Gaussian Process is presented here. The problem is the same as in [29] and consists of a nonlinear time series, equation (15) [expression missing from this transcription]. In contrast with [29], no noise is added to the process, and only one damage case is considered here. The powers on the first two sinusoidal terms make the time series nonlinear, and thus difficult for a linear model to characterise. It is useful here because it highlights the difference in predictions between the linear and the GP autoregressive models. The simulated damage is the addition of a signal, equation (16) [expression missing from this transcription]. This is to demonstrate the ability of the autoregressive GP model to characterise the baseline nonlinear signal and detect a change in it, in contrast with the linear AR model, which cannot differentiate between the two. The baseline and the damaged signals are both generated for [range missing from this transcription].

A Gaussian Process model with a squared-exponential covariance function was used as a prior, and trained using the first 600 points. This is in order to test two aspects: the ability of the model to generalise (to make correct predictions for points not present in the training set) and its ability to detect a change in the signal. Only five lags were used as the model input, which is the same number of lags as in [29]. In addition, the approximation for large datasets described in [26], which learns the optimum inducing points, was used. The number of inducing points used for this example was 50. The autoregressive GP model provides a much better fit to the data than its linear AR counterpart, as is clear from Figure 3. As a measure of goodness of fit, the Normalised Mean Squared Error (NMSE) can be used; it is defined in terms of the measured data points and the corresponding predicted data points [defining expression missing from this transcription]. An NMSE of less than one is typically considered a very good fit.
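The NMSE expression itself is not legible in the text above. One commonly used form, assumed here rather than taken from the paper, is NMSE = (100 / (N σ_y²)) Σ (y_i − ŷ_i)², where y_i are the measured points, ŷ_i the predictions, N the number of points and σ_y² the variance of the measured signal. Under this assumed definition a perfect prediction gives 0 and predicting the signal mean everywhere gives 100. A minimal sketch:

```python
import numpy as np

def nmse(y, y_hat):
    """Normalised mean squared error, scaled so that predicting the mean of y gives 100.

    This scaling is an assumption; the paper's own definition is not recoverable here.
    """
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return 100.0 * np.mean((y - y_hat) ** 2) / np.var(y)

y = np.array([1.0, 2.0, 3.0, 4.0])
print(nmse(y, y))                            # perfect prediction
print(nmse(y, np.full_like(y, y.mean())))    # mean-only prediction
```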
The autoregressive GP model achieved an NMSE of 0.93 for the testing set, which included the training set plus 600 more points of previously unseen data (from the baseline condition). The linear AR model achieved an NMSE of [value missing from this transcription] for the testing set, which indicates a very poor fit. The fits for both models are presented in Figure 3, where it is visually evident that the autoregressive GP fits the signal far better than the linear AR model. The residuals are a good indicator of the models' performance, and are shown in Figure 4 for both models. The residuals of the GP model also indicate a good ability to make predictions on unseen data, since the model was trained only on data from zero to six hundred seconds. It is clear from the residuals that the linear autoregressive model cannot distinguish between the baseline signal and the damaged one, while the residuals from the autoregressive GP clearly separate the two cases.
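The workflow of this example (fit on baseline data, then inspect one-step-ahead residuals) can be sketched in NumPy under several loudly stated assumptions. The baseline series is a hypothetical logistic-map signal, not the signal of equation (15); the damage is an arbitrary added sinusoid, not equation (16); two lags are used instead of five; and the GP is a full GP with hand-picked, fixed hyperparameters, with no inducing-point approximation [26] and no use of GPy. It is an illustration of the idea, not a reproduction of the reported results.

```python
import numpy as np

def lag_matrix(y, p):
    """Rows hold p consecutive past values [y[t-p], ..., y[t-1]]; targets are y[t]."""
    X = np.column_stack([y[i:len(y) - p + i] for i in range(p)])
    return X, y[p:]

def sq_exp_kernel(A, B, length_scale=0.3, variance=1.0):
    """Squared-exponential covariance between the rows of A and B."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(d2, 0.0) / length_scale**2)

def fit_gp(X, y, noise=1e-4):
    """Return a one-step-ahead GP mean predictor (fixed hyperparameters, an assumption)."""
    mu = y.mean()
    L = np.linalg.cholesky(sq_exp_kernel(X, X) + noise * np.eye(len(X)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y - mu))
    return lambda X_new: mu + sq_exp_kernel(X_new, X) @ alpha

def fit_linear_ar(X, y):
    """Return a least-squares linear AR predictor with an intercept."""
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda X_new: np.column_stack([X_new, np.ones(len(X_new))]) @ w

# Hypothetical nonlinear baseline (logistic map), NOT the signal of equation (15).
n, p = 900, 2
baseline = np.empty(n)
baseline[0] = 0.3
for t in range(1, n):
    baseline[t] = 3.9 * baseline[t - 1] * (1.0 - baseline[t - 1])

# Simulated damage: an extra sinusoid added from t = 600 onwards (illustrative choice).
damaged = baseline.copy()
damaged[600:] += 0.2 * np.sin(0.3 * np.arange(600, n))

X, y = lag_matrix(damaged, p)
train = slice(0, 300)           # baseline data used for training
unseen = slice(300, 600 - p)    # baseline data not seen in training
dam = slice(600 - p, None)      # targets at t >= 600, where the damage is present

def rms(r):
    return np.sqrt(np.mean(r**2))

results = {}
for name, fit in [("gp", fit_gp), ("linear", fit_linear_ar)]:
    predict = fit(X[train], y[train])
    resid = y - predict(X)      # one-step-ahead residuals over the whole record
    results[name] = (rms(resid[unseen]), rms(resid[dam]))
    print(name, "RMS residual on unseen baseline:", results[name][0],
          "on damaged data:", results[name][1])
```

Comparing the two RMS values for each model gives a rough indication of how strongly its residuals respond to the change; a threshold on the residuals would be needed to turn this into a detector.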

Figure 3: Comparison between predictions of a) the GP autoregressive model and b) the linear autoregressive model (amplitude against time in seconds). The predictions are shown for a subset of data within the training set. The continuous line shows the actual signal, while the dashed lines show the model predictions for one step ahead. Note that no confidence intervals are plotted for the GP predictions.

Figure 4: Comparison of residuals between a) the autoregressive GP model and b) the linear AR model (residual against time in seconds, for the undamaged and damaged cases).

4.3 Conclusions and further work

From the results of the example presented above it can be concluded that Gaussian Process regression is a viable method to use within an autoregressive framework in the context of damage detection. It can characterise a signal from a dynamical system when the baseline condition is not linear, and the residuals produced by the model can clearly be used as damage-sensitive features provided an appropriate threshold is set [37]. The example also demonstrates that recently developed approximate methods for GP regression on large datasets are sufficiently accurate for this type of problem.

There are several aspects that need to be examined further. Although the autoregressive GP can characterise a nonlinear baseline condition, and a change to this condition can be detected through the residuals, the question remains of how to select the correct number of lags for the autoregression. This depends largely on the specific application, but a systematic way of selecting the number of lags, and the sensitivity to this choice in SHM applications, needs to be addressed. This class of model should also be robust to changing environmental and operational conditions, provided sufficient example data is present in the training set. Further investigation with real-world datasets is needed to validate these conjectures. Gaussian Processes have only just started to appear in the SHM literature, and there are many application areas. However, for them to become a successful methodology in engineering, there needs to be a better understanding of their physical interpretation.

Acknowledgements

The authors would like to acknowledge Dr James Hensman for some interesting discussions as well as guidance with respect to Gaussian Processes and the Python GPy package. Acknowledgements also go to the UK Technology Strategy Board for funding part of this project.

References

[1] E. J. Cross, P. Sartor, and P. Southern, Prediction of Landing Gear Loads from Flight Test Data using Gaussian Process Regression, in International Workshop in Structural Health Monitoring, (2013).
[2] R. Fuentes, E. J. Cross, A. Halfpenny, K. Worden, and R. J. Barthorpe, Aircraft Parametric Structural Load Monitoring Using Gaussian Process Regression, in 7th European Workshop on Structural Health Monitoring, (2014).
[3] K. Worden and C. R. Farrar, Structural Health Monitoring: A Machine Learning Perspective. John Wiley & Sons, (2013).
[4] R. Fuentes, A. Halfpenny, E. J. Cross, K. Worden, and R. J. Barthorpe, An Approach to Fault Detection Using a Unified Linear Gaussian Framework, in International Workshop in Structural Health Monitoring, (2013).
[5] R. B. Randall, State of the Art in Monitoring Rotating Machinery Part 1, Journal of Sound and Vibration, vol. 38, no. 3, (2004).
[6] R. B. Randall, State of the Art in Monitoring Rotating Machinery Part 2, Journal of Sound and Vibration, vol. 38, no. 5, (2004).
[7] R. B. Randall, Vibration Based Condition Monitoring - Industrial, Aerospace and Automotive. John Wiley & Sons Ltd, (2011).
[8] S. Doebling, C. R. Farrar, M. B. Prime, and D. Shevitz, Damage Identification and Health Monitoring of Structural and Mechanical Systems from Changes in their Vibration Characteristics: A Literature Review. Los Alamos National Laboratory Report LA MS, (1996).
[9] H. Sohn, C. R. Farrar, and F. M. Hemez, A Review of Structural Health Monitoring Literature, Los Alamos National Laboratory Report LA MS.
[10] A. Rytter, Vibration Based Inspection of Civil Engineering Structures, Ph. D. Dissertation, Aalborg University, (1993).
[11] R. J. Barthorpe, On Model and Data-based Approaches to Structural Health Monitoring, Ph. D. Dissertation, The University of Sheffield, (2010).
[12] K. Worden, C. R. Farrar, G. Manson, and G. Park, The fundamental axioms of structural health monitoring, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, vol. 463, (2007).
[13] C. Ruotolo and C. Surace, Damage Detection Using Singular Value Decomposition, in DAMAS 97: Structural Damage Assessment using Advanced Signal Processing Procedures, (1997).
[14] J. Kullaa, Is Temperature measurement essential in SHM?, in International Workshop in Structural Health Monitoring, (2003).
[15] H. Sohn, K. Worden, and C. R. Farrar, Statistical Damage Classification Under Changing Environmental and Operational Conditions, Journal of Intelligent Material Systems and Structures, vol. 13.
[16] G. Manson, Identifying damage sensitive, environment insensitive features for damage detection, in 3rd International Conference on Identification in Engineering Systems, (2002).
[17] E. J. Cross, K. Worden, and Q. Chen, Cointegration: a novel approach for the removal of environmental trends in structural health monitoring data, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences.
[18] E. J. Cross, On Structural Health Monitoring in Changing Environmental and Operational Conditions, Ph. D. Thesis, The University of Sheffield, (2012).
[19] E. J. Cross, K. Y. Koo, J. M. W. Brownjohn, and K. Worden, Long-term monitoring and data analysis of the Tamar Bridge, Mechanical Systems and Signal Processing, vol. 35, no. 1-2, (Feb. 2013).
[20] E. Figueiredo, J. Figueiras, G. Park, C. R. Farrar, and K. Worden, Influence of the autoregressive model order on damage detection, Computer-Aided Civil and Infrastructure Engineering, vol. 26, (2011).
[21] H. Akaike, A new look at the statistical model identification, IEEE Transactions on Automatic Control, vol. 19, no. 6, (Dec. 1974).
[22] C. E. Rasmussen and C. K. I. Williams, Gaussian processes for machine learning. Cambridge, Massachusetts: The MIT Press, (2006).
[23] C. M. Bishop, Pattern Recognition and Machine Learning, vol. 4, (2006).
[24] J. Hensman, N. Fusi, and N. Lawrence, Gaussian Processes for Big Data, Proceedings of UAI 29, (2013).
[25] A. Naish-Guzman and S. Holden, The Generalized FITC Approximation, in Advances in Neural Information Processing Systems 20 (NIPS), (2007).
[26] M. K. Titsias, Variational Learning of Inducing Variables in Sparse Gaussian Processes, in Twelfth International Conference on Artificial Intelligence and Statistics, (2010).
[27] R. D. Turner, Gaussian Processes for State Space Models and Change Point Detection, Ph. D. Thesis, University of Cambridge, (2011).
[28] S. Roberts, M. Osborne, M. Ebden, S. Reece, N. Gibson, and S. Aigrain, Gaussian processes for time-series modelling, Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences, vol. 371, no. 1984, (2013).
[29] L. Bornn, C. R. Farrar, G. Park, and K. Farinholt, Structural Health Monitoring With Autoregressive Support Vector Machines, Journal of Vibration and Acoustics.
[30] R. E. Kalman, A New Approach to Linear Filtering and Prediction Problems, Transactions of the ASME-Journal of Basic Engineering, vol. 82, no. Series D, (1960).
[31] S. Särkkä, Bayesian filtering and smoothing. Cambridge University Press, (2013).
[32] J. Ko, D. J. Klein, D. Fox, and D. Haehnel, GP-UKF: Unscented Kalman Filters with Gaussian Process prediction and observation models, in IEEE International Conference on Intelligent Robots and Systems, (2007).
[33] R. Turner, M. P. Deisenroth, and C. E. Rasmussen, State-Space Inference and Learning with Gaussian Processes, in International Conference on Artificial Intelligence and Statistics, (2010).
[34] A. C. Damianou, M. K. Titsias, and N. D. Lawrence, Variational Gaussian Process Dynamical Systems, in Advances in Neural Information Processing Systems, (2011).
[35] A. Girard, C. E. Rasmussen, J. Quinonero-Candela, and R. Murray-Smith, Gaussian Process Priors With Uncertain Inputs - Application to Multiple-Step Ahead Time Series Forecasting, in Advances in Neural Information Processing Systems 15 (NIPS), (2002).
[36] A. Girard, Approximate Methods for Propagation of Uncertainty with Gaussian Process Models, Ph. D. Thesis, University of Glasgow, (2004).
[37] K. Worden, G. Manson, and N. R. J. Fieller, Damage Detection Using Outlier Analysis, Journal of Sound and Vibration, vol. 229, no. 3, (Jan. 2000).

PROCEEDINGS OF ISMA2014 INCLUDING USD2014


Reliable Condition Assessment of Structures Using Uncertain or Limited Field Modal Data Reliable Condition Assessment of Structures Using Uncertain or Limited Field Modal Data Mojtaba Dirbaz Mehdi Modares Jamshid Mohammadi 6 th International Workshop on Reliable Engineering Computing 1 Motivation

More information

Non-Factorised Variational Inference in Dynamical Systems

Non-Factorised Variational Inference in Dynamical Systems st Symposium on Advances in Approximate Bayesian Inference, 08 6 Non-Factorised Variational Inference in Dynamical Systems Alessandro D. Ialongo University of Cambridge and Max Planck Institute for Intelligent

More information

CSC2541 Lecture 2 Bayesian Occam s Razor and Gaussian Processes

CSC2541 Lecture 2 Bayesian Occam s Razor and Gaussian Processes CSC2541 Lecture 2 Bayesian Occam s Razor and Gaussian Processes Roger Grosse Roger Grosse CSC2541 Lecture 2 Bayesian Occam s Razor and Gaussian Processes 1 / 55 Adminis-Trivia Did everyone get my e-mail

More information

Linear Regression. Aarti Singh. Machine Learning / Sept 27, 2010

Linear Regression. Aarti Singh. Machine Learning / Sept 27, 2010 Linear Regression Aarti Singh Machine Learning 10-701/15-781 Sept 27, 2010 Discrete to Continuous Labels Classification Sports Science News Anemic cell Healthy cell Regression X = Document Y = Topic X

More information

VIBRATION-BASED DAMAGE DETECTION UNDER CHANGING ENVIRONMENTAL CONDITIONS

VIBRATION-BASED DAMAGE DETECTION UNDER CHANGING ENVIRONMENTAL CONDITIONS VIBRATION-BASED DAMAGE DETECTION UNDER CHANGING ENVIRONMENTAL CONDITIONS A.M. Yan, G. Kerschen, P. De Boe, J.C Golinval University of Liège, Liège, Belgium am.yan@ulg.ac.be g.kerschen@ulg.ac.bet Abstract

More information

Introduction to Gaussian Process

Introduction to Gaussian Process Introduction to Gaussian Process CS 778 Chris Tensmeyer CS 478 INTRODUCTION 1 What Topic? Machine Learning Regression Bayesian ML Bayesian Regression Bayesian Non-parametric Gaussian Process (GP) GP Regression

More information

How to build an automatic statistician

How to build an automatic statistician How to build an automatic statistician James Robert Lloyd 1, David Duvenaud 1, Roger Grosse 2, Joshua Tenenbaum 2, Zoubin Ghahramani 1 1: Department of Engineering, University of Cambridge, UK 2: Massachusetts

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression

More information

ISyE 691 Data mining and analytics

ISyE 691 Data mining and analytics ISyE 691 Data mining and analytics Regression Instructor: Prof. Kaibo Liu Department of Industrial and Systems Engineering UW-Madison Email: kliu8@wisc.edu Office: Room 3017 (Mechanical Engineering Building)

More information

Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function.

Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Bayesian learning: Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Let y be the true label and y be the predicted

More information

A BAYESIAN APPROACH FOR PREDICTING BUILDING COOLING AND HEATING CONSUMPTION

A BAYESIAN APPROACH FOR PREDICTING BUILDING COOLING AND HEATING CONSUMPTION A BAYESIAN APPROACH FOR PREDICTING BUILDING COOLING AND HEATING CONSUMPTION Bin Yan, and Ali M. Malkawi School of Design, University of Pennsylvania, Philadelphia PA 19104, United States ABSTRACT This

More information

STA414/2104. Lecture 11: Gaussian Processes. Department of Statistics

STA414/2104. Lecture 11: Gaussian Processes. Department of Statistics STA414/2104 Lecture 11: Gaussian Processes Department of Statistics www.utstat.utoronto.ca Delivered by Mark Ebden with thanks to Russ Salakhutdinov Outline Gaussian Processes Exam review Course evaluations

More information

Linear Models 1. Isfahan University of Technology Fall Semester, 2014

Linear Models 1. Isfahan University of Technology Fall Semester, 2014 Linear Models 1 Isfahan University of Technology Fall Semester, 2014 References: [1] G. A. F., Seber and A. J. Lee (2003). Linear Regression Analysis (2nd ed.). Hoboken, NJ: Wiley. [2] A. C. Rencher and

More information

Lecture 5: GPs and Streaming regression

Lecture 5: GPs and Streaming regression Lecture 5: GPs and Streaming regression Gaussian Processes Information gain Confidence intervals COMP-652 and ECSE-608, Lecture 5 - September 19, 2017 1 Recall: Non-parametric regression Input space X

More information

Discovery Through Situational Awareness

Discovery Through Situational Awareness Discovery Through Situational Awareness BRETT AMIDAN JIM FOLLUM NICK BETZSOLD TIM YIN (UNIVERSITY OF WYOMING) SHIKHAR PANDEY (WASHINGTON STATE UNIVERSITY) Pacific Northwest National Laboratory February

More information

Multiple damage detection in beams in noisy conditions using complex-wavelet modal curvature by laser measurement

Multiple damage detection in beams in noisy conditions using complex-wavelet modal curvature by laser measurement Multiple damage detection in beams in noisy conditions using complex-wavelet modal curvature by laser measurement W. Xu 1, M. S. Cao 2, M. Radzieński 3, W. Ostachowicz 4 1, 2 Department of Engineering

More information

Approximate Inference Part 1 of 2

Approximate Inference Part 1 of 2 Approximate Inference Part 1 of 2 Tom Minka Microsoft Research, Cambridge, UK Machine Learning Summer School 2009 http://mlg.eng.cam.ac.uk/mlss09/ Bayesian paradigm Consistent use of probability theory

More information

Modelling and Control of Nonlinear Systems using Gaussian Processes with Partial Model Information

Modelling and Control of Nonlinear Systems using Gaussian Processes with Partial Model Information 5st IEEE Conference on Decision and Control December 0-3, 202 Maui, Hawaii, USA Modelling and Control of Nonlinear Systems using Gaussian Processes with Partial Model Information Joseph Hall, Carl Rasmussen

More information

Multilevel Analysis of Continuous AE from Helicopter Gearbox

Multilevel Analysis of Continuous AE from Helicopter Gearbox Multilevel Analysis of Continuous AE from Helicopter Gearbox Milan CHLADA*, Zdenek PREVOROVSKY, Jan HERMANEK, Josef KROFTA Impact and Waves in Solids, Institute of Thermomechanics AS CR, v. v. i.; Prague,

More information

Unsupervised Learning with Permuted Data

Unsupervised Learning with Permuted Data Unsupervised Learning with Permuted Data Sergey Kirshner skirshne@ics.uci.edu Sridevi Parise sparise@ics.uci.edu Padhraic Smyth smyth@ics.uci.edu School of Information and Computer Science, University

More information

Approximate Inference Part 1 of 2

Approximate Inference Part 1 of 2 Approximate Inference Part 1 of 2 Tom Minka Microsoft Research, Cambridge, UK Machine Learning Summer School 2009 http://mlg.eng.cam.ac.uk/mlss09/ 1 Bayesian paradigm Consistent use of probability theory

More information

Learning Non-stationary System Dynamics Online using Gaussian Processes

Learning Non-stationary System Dynamics Online using Gaussian Processes Learning Non-stationary System Dynamics Online using Gaussian Processes Axel Rottmann and Wolfram Burgard Department of Computer Science, University of Freiburg, Germany Abstract. Gaussian processes are

More information

Structural Health Monitoring and Damage Assessment Using Measured FRFs from Multiple Sensors, Part I: The Indicator of Correlation Criteria

Structural Health Monitoring and Damage Assessment Using Measured FRFs from Multiple Sensors, Part I: The Indicator of Correlation Criteria Key Engineering Materials Vols. 245-246 (2003) pp. 131-140 online at http://www.scientific.net Journal 2003 Citation Trans Tech (to Publications, be inserted by Switzerland the publisher) Copyright by

More information

COMS 4771 Regression. Nakul Verma

COMS 4771 Regression. Nakul Verma COMS 4771 Regression Nakul Verma Last time Support Vector Machines Maximum Margin formulation Constrained Optimization Lagrange Duality Theory Convex Optimization SVM dual and Interpretation How get the

More information

Introduction to Gaussian Processes

Introduction to Gaussian Processes Introduction to Gaussian Processes Iain Murray murray@cs.toronto.edu CSC255, Introduction to Machine Learning, Fall 28 Dept. Computer Science, University of Toronto The problem Learn scalar function of

More information

Automated Modal Parameter Estimation For Operational Modal Analysis of Large Systems

Automated Modal Parameter Estimation For Operational Modal Analysis of Large Systems Automated Modal Parameter Estimation For Operational Modal Analysis of Large Systems Palle Andersen Structural Vibration Solutions A/S Niels Jernes Vej 10, DK-9220 Aalborg East, Denmark, pa@svibs.com Rune

More information

ROBOTICS 01PEEQW. Basilio Bona DAUIN Politecnico di Torino

ROBOTICS 01PEEQW. Basilio Bona DAUIN Politecnico di Torino ROBOTICS 01PEEQW Basilio Bona DAUIN Politecnico di Torino Probabilistic Fundamentals in Robotics Gaussian Filters Course Outline Basic mathematical framework Probabilistic models of mobile robots Mobile

More information

Maximum Direction to Geometric Mean Spectral Response Ratios using the Relevance Vector Machine

Maximum Direction to Geometric Mean Spectral Response Ratios using the Relevance Vector Machine Maximum Direction to Geometric Mean Spectral Response Ratios using the Relevance Vector Machine Y. Dak Hazirbaba, J. Tezcan, Q. Cheng Southern Illinois University Carbondale, IL, USA SUMMARY: The 2009

More information

Available online at ScienceDirect. Procedia Engineering 119 (2015 ) 13 18

Available online at   ScienceDirect. Procedia Engineering 119 (2015 ) 13 18 Available online at www.sciencedirect.com ScienceDirect Procedia Engineering 119 (2015 ) 13 18 13th Computer Control for Water Industry Conference, CCWI 2015 Real-time burst detection in water distribution

More information

Course in Data Science

Course in Data Science Course in Data Science About the Course: In this course you will get an introduction to the main tools and ideas which are required for Data Scientist/Business Analyst/Data Analyst. The course gives an

More information

Statistical Techniques in Robotics (16-831, F12) Lecture#20 (Monday November 12) Gaussian Processes

Statistical Techniques in Robotics (16-831, F12) Lecture#20 (Monday November 12) Gaussian Processes Statistical Techniques in Robotics (6-83, F) Lecture# (Monday November ) Gaussian Processes Lecturer: Drew Bagnell Scribe: Venkatraman Narayanan Applications of Gaussian Processes (a) Inverse Kinematics

More information

CS 231A Section 1: Linear Algebra & Probability Review

CS 231A Section 1: Linear Algebra & Probability Review CS 231A Section 1: Linear Algebra & Probability Review 1 Topics Support Vector Machines Boosting Viola-Jones face detector Linear Algebra Review Notation Operations & Properties Matrix Calculus Probability

More information

Smooth Bayesian Kernel Machines

Smooth Bayesian Kernel Machines Smooth Bayesian Kernel Machines Rutger W. ter Borg 1 and Léon J.M. Rothkrantz 2 1 Nuon NV, Applied Research & Technology Spaklerweg 20, 1096 BA Amsterdam, the Netherlands rutger@terborg.net 2 Delft University

More information

The Variational Gaussian Approximation Revisited

The Variational Gaussian Approximation Revisited The Variational Gaussian Approximation Revisited Manfred Opper Cédric Archambeau March 16, 2009 Abstract The variational approximation of posterior distributions by multivariate Gaussians has been much

More information

CSC2515 Winter 2015 Introduction to Machine Learning. Lecture 2: Linear regression

CSC2515 Winter 2015 Introduction to Machine Learning. Lecture 2: Linear regression CSC2515 Winter 2015 Introduction to Machine Learning Lecture 2: Linear regression All lecture slides will be available as.pdf on the course website: http://www.cs.toronto.edu/~urtasun/courses/csc2515/csc2515_winter15.html

More information

Introduction to Statistical Inference

Introduction to Statistical Inference Structural Health Monitoring Using Statistical Pattern Recognition Introduction to Statistical Inference Presented by Charles R. Farrar, Ph.D., P.E. Outline Introduce statistical decision making for Structural

More information

A Data-driven Approach for Remaining Useful Life Prediction of Critical Components

A Data-driven Approach for Remaining Useful Life Prediction of Critical Components GT S3 : Sûreté, Surveillance, Supervision Meeting GdR Modélisation, Analyse et Conduite des Systèmes Dynamiques (MACS) January 28 th, 2014 A Data-driven Approach for Remaining Useful Life Prediction of

More information

Learning Control Under Uncertainty: A Probabilistic Value-Iteration Approach

Learning Control Under Uncertainty: A Probabilistic Value-Iteration Approach Learning Control Under Uncertainty: A Probabilistic Value-Iteration Approach B. Bischoff 1, D. Nguyen-Tuong 1,H.Markert 1 anda.knoll 2 1- Robert Bosch GmbH - Corporate Research Robert-Bosch-Str. 2, 71701

More information

Machine Learning Linear Regression. Prof. Matteo Matteucci

Machine Learning Linear Regression. Prof. Matteo Matteucci Machine Learning Linear Regression Prof. Matteo Matteucci Outline 2 o Simple Linear Regression Model Least Squares Fit Measures of Fit Inference in Regression o Multi Variate Regession Model Least Squares

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Brown University CSCI 1950-F, Spring 2012 Prof. Erik Sudderth Lecture 25: Markov Chain Monte Carlo (MCMC) Course Review and Advanced Topics Many figures courtesy Kevin

More information