Non-Stationary Time-dependent ARMA Random Vibration Modeling, Analysis & SHM with Wind Turbine Applications

Luis David Avendaño-Valencia
Department of Mechanical Engineering and Aeronautics
University of Patras

A thesis submitted for the degree of Doctor of Philosophy
March 3, 6

Examination Committee

Spilios Fassois, Professor (supervisor)
Department of Mechanical Engineering & Aeronautics, University of Patras, Greece

Theodore Philippidis, Professor (advising committee member)
Department of Mechanical Engineering & Aeronautics, University of Patras, Greece

Kostas Berberidis, Professor (advising committee member)
Department of Computer Engineering & Informatics, University of Patras, Greece

John Sakellariou, Lecturer
Department of Mechanical & Aeronautical Engineering, University of Patras, Greece

Nikos Aspragathos, Professor
Department of Mechanical & Aeronautical Engineering, University of Patras, Greece

Biswajit Basu, Professor
Department of Civil, Structural & Environmental Engineering, Trinity College Dublin, Ireland

Apostolos Papageorgiou, Professor
Department of Civil Engineering, University of Patras, Greece

To my parents, for their unconditional support and inspiration to seek the paths of greatness;
To my professors, for guiding me through these difficult paths and encouraging me to give the best of myself;
To all my beloved ones, for sharing this path with me through its ups and downs and making it the best time of my life...

Acknowledgements

Firstly, I would like to acknowledge the Marie Curie 7th Framework Programme of the European Union for its financial support through the Initial Training Network Project SYSWIND (Grant 3835). I also extend a very special acknowledgment to Professor Spilios D. Fassois for his support, guidance and encouragement throughout these five years of research, and to all the professors in the advisory and examination committees for their very useful comments and discussions, which helped to improve the quality of this work.

Furthermore, the collaboration of Mr. Giannis Papadakis of IWECO M.V. S.A. was vital for the acquisition of very valuable experimental data from a wind generator located at the Atavyros Wind Park on the island of Rhodes, which is under his supervision. I also thank Professor Soren K. Nielsen and his research team at Aalborg University, Denmark, who provided me with vital information regarding the dynamics of wind turbines as well as the FAST aeroelastic simulation tool, which was extremely important for obtaining the simulation data utilized in the last chapter of this thesis.

Moreover, there are several people in our research group whom I want to acknowledge: Dr. Minas Spiridonakos, Dr. Fotis Kopsaftopoulos, Mr. Pavlos Michaelides, Mr. Christos Sacharis, Mr. Theodoris Mastrapostolis, Mr. Konstantinos Zoglopitis and Mrs. Toula Katopodi, who welcomed me and introduced me to the group in the first years; Mr. Vangelis Petratos, Ms. Elpida Nassiopoulou, Ms. Ana Gómez and Mr. Dimitris Sotiriou, who accompanied me during the middle years of this project; and Mr. Kyriakos Vamvoudakis, Ms. Andriana Georgantopoulou, Mr. Trifonas Aravanis, Mr. Nikos Spanos, and Mr. Oscar Scussel, with whom I shared the final years of my PhD. I also want to recognize all the personnel of the Department of Mechanical and Aeronautical Engineering, who helped me to navigate through all the complications of the Greek system.

Finally, I want to thank my family and my friends. All of you were my inspiration to do my best, and all of you are very dear to me! I wish you the best. This work is for all of you! And to my special one: without your support and your love, this wouldn't have happened. Thanks for being there with me. I love you!

Abstract

The main topic of this PhD dissertation is the output-only identification of the vibration response of structures with time-dependent dynamics under uncertainty, and its application to the diagnosis of structural damage, referred to as vibration-based Structural Health Monitoring (SHM). This is of relevance to several classes of modern structures, including bridges with moving vehicles, structures with changing geometry, cranes, robotic manipulators, and so on. However, the main application focus of this dissertation is wind turbines, for which the non-stationary vibration response stems from the time-dependent structural dynamics, as well as from the randomness and variability of the wind excitation and of other operational and environmental variables.

The development of effective methods for the identification and analysis of non-stationary vibration response is pertinent to an improved understanding of the dynamic behavior of engineering structures during normal operation, as well as to achieving effective representations of complex non-stationary behavior, which may subsequently be used for SHM. Moreover, effective SHM methods are useful for the early detection of damage, and thus for the reduction of maintenance and repair costs and, more importantly, for safeguarding the structure from catastrophic failure.

Among the available non-stationary modeling methods, Time-dependent ARMA (TARMA) models are characterized by several advantages that make them strong candidates for the problem discussed in this thesis. These include modeling parsimony (the capacity to represent very complex phenomena with a reduced set of parameters), improved tracking ability, improved representation accuracy, and several more. However, these models have several limitations: difficulty in properly modeling non-stationary dynamics with uncertainty, the lack of analytical quantities that can serve to understand with precision the dynamic characteristics of fast-evolving non-stationary processes, and the lack of damage diagnosis methods that can also deal with significant uncertainties in the dynamics. Therefore, this PhD dissertation deals with three issues regarding the application of TARMA models to the main problem stated above: (i) the development of non-stationary vibration response modeling methods able to represent both time-dependent dynamics and the effects of uncertainty, (ii) the construction of enhanced analysis methods that can lead to a better understanding of the fast-evolving non-stationary dynamics featured in the vibration response, and (iii) the development of robust SHM methods that can deal with non-stationary dynamics and uncertainty in the vibration response.

This thesis is divided into six chapters, each devoted to the development of methods aimed at solving specific parts of the problem discussed above. Chapter 2 is devoted to the comparison of stationary and non-stationary analysis methods for the vibration response of an operating wind turbine, where the necessity of non-stationary modeling and the benefit of TARMA representations on this particular identification problem are demonstrated. It also shows the need for several improvements in the modeling and analysis methods, which are addressed in subsequent chapters.

Chapter 3 addresses the problem of finding a Stochastic Parameter Evolution TARMA modeling method with improved flexibility and accuracy. For that purpose, the class of Generalized linear Stochastic Constraint TARMA (GSC-TARMA) models is introduced, which is characterized by Gaussian AR parameter evolutions. Consequently, the postulated GSC-TARMA models are fully linear and Gaussian, thus facilitating the identification process. Moreover, different identification methods, including Maximum Likelihood and Bayesian methods, are analyzed and adapted to the GSC-TARMA model type. The proposed methods are evaluated on several Monte Carlo simulations and on the identification of in-operation wind turbine vibration response.

Chapter 4 is devoted to the development of enhanced TARMA model-based methods for the analysis of non-stationary dynamics. In this sense, the concepts of Harmonic Impulse Response and Harmonic Frequency Response Function (FRF) are adapted to the TARMA case, from which input-output relations in the time-variant system can be derived, and from which the spectral correlation and some types of time-varying spectral densities can be calculated. The derivation of these quantities is important for the improved analysis of the dynamics of fast-evolving non-stationary processes that can be represented with TARMA models. This is shown through the analysis of the dynamic response of simulated LTV models, and through the analysis of the vibration response of an operating wind turbine identified with TARMA models.

Chapters 5 and 6 are devoted to the health monitoring of structures with non-stationary vibration response operating under significant uncertainty. Both types of effects cannot be represented simply with conventional models. Hence, the SHM problem is postulated in terms of a Multiple Model (MM) representation of each structural state. The MM representation is a collection of similar models (of the TARMA type), where each model represents the vibration response of the structure under particular uncertainty conditions. This yields a simple yet effective representation of non-stationary dynamics and uncertainty. Furthermore, two types of damage diagnosis methods (damage detection and damage identification) are defined based on the MM representation. These are fully explained and evaluated in two SHM application examples: the first is based on the simulated vibration response of a time-varying suspension system, where uncertainties are introduced by variability in the parameters of the governing differential equation; the second is based on the simulated vibration response of an operating wind turbine, where uncertainty is introduced by variable wind speed and turbulence. The results demonstrate the robustness of the MM approach, stemming from its simple but effective representation of uncertainty, and its superiority over other SHM methods in terms of diagnostic accuracy.

Contents

Nomenclature

Introduction
  The Problem Statement and its Importance
  State of the Art
    Representation of Non-Stationary Vibration
    Time and Frequency Representation of Non-Stationary TARMA Processes
    Non-Stationary Vibration-Based Structural Health Monitoring
  Proposed Methodology
    Representation of Non-Stationary Vibration by Means of TARMA Models
    Time and Frequency Representation of Non-Stationary TARMA Processes
    Non-Stationary Vibration-Based Structural Health Monitoring
  Main Results
  Organization and Contributions of this Thesis

Stationary and Non-Stationary Random Vibration Modeling and Analysis for an Operating Wind Turbine
  Introduction
  Characteristics of the dynamics of the vibration response of wind turbine structures
  Concise overview of stationary and non-stationary vibration modeling
    Stationary non-parametric modeling
    Stationary parametric modeling
    Non-stationary non-parametric modeling
    Non-stationary parametric modeling
  The experimental set-up and the vibration signal
    The experimental set-up
    The vibration signal modeled
  Stationary modeling
    Non-parametric modeling
    Parametric modeling
    Model-based analysis
  Non-stationary modeling
    Non-parametric modeling
    Parametric modeling
      Smoothness Priors Time-dependent AutoRegressive (SP-TAR) modeling
      Functional Series Time-dependent AutoRegressive (FS-TAR) modeling
      Adaptable Functional Series Time-dependent AutoRegressive (AFS-TAR) modeling
    Model-based analysis
  Discussion
  Conclusions
  Appendix 2.A Smoothness Priors TAR modeling
    Kalman Filter and Smoother
    The Likelihood of the SP-TAR Model and the BIC
  Appendix 2.B Model assessing and validation
  Appendix 2.C Estimation of the Time-Dependent Variance
    Non-parametric estimation of time-dependent variance
    Smoothness priors based estimator of time-dependent variance

Generalized Stochastic Constraint Time-Dependent ARMA Modeling of Non-Stationary Random Signals
  Introduction
  On the class of stochastic parameter evolution TARMA models
  Generalized linear stochastic constraint TARMA models
    Model definition
    Statistical properties of the GSC-TARMA model
      The joint and conditional distributions of the observations and parameters
      The likelihood of the hyperparameters given the complete data
      The marginal likelihood of the hyperparameters
  Identification of a GSC-TARMA model
    Estimation of the parameter trajectories with known hyperparameters and structure
    Estimation of the hyperparameters via Maximum Likelihood methods
      An approach based on the marginal likelihood
      An approach based on the complete data likelihood
    Estimation of the hyperparameters via Bayesian methods
      Estimation via Markov Chain Monte Carlo sampling
      Joint Kalman filter estimation method
    Selection of the best model structure
    Validation of the identified models
  Monte Carlo simulation: SPE-TAR model with time-dependent deterministic trend
    Model definition
    Identification via GSC-TAR, SP-TAR and FS-TAR models
      Presentation and set-up of the identification methods
      Identification results: GSC-TAR and SP-TAR models
      Identification of the variances of SP-TAR models by minimization of the prediction error (conventional approach)
      Identification results: FS-TAR models
      Comparison of identification results obtained via GSC, SP and FS-TAR representations
    Validation of the identified GSC, SP and FS-TAR models
  Monte Carlo simulation: SPE-TAR model
    Model definition
    Identification via GSC-TAR and SP-TAR models
      Presentation and set-up of the identification methods
      Identification results: GSC-TAR and SP-TAR models
      Identification of the variances of SP-TAR models by minimization of the prediction error (conventional approach)
      Comparison of identification results
    Validation of the identified GSC, SP and FS-TAR models
  Monte Carlo simulation: DPE-TAR model
    Model definition
    Identification via GSC-TAR, SP-TAR and FS-TAR models
      Presentation and set-up of the identification methods
      Identification results: GSC-TAR and SP-TAR models
      Identification of the variances of SP-TAR models by minimization of the prediction error (conventional approach)
      Identification results: FS-TAR models
      Comparison of identification results obtained via GSC, SP and FS-TAR representations
    Validation of the identified GSC, SP and FS-TAR models
  Application example: Identification of the vibration response of an operating wind turbine
    Data description
    Model identification
    Validation of the identified models
    Model-based analysis
    Summary of the results
  Discussion of the results
    Comparison of the GSC-TARMA model identification methods
    GSC-TARMA, SP-TARMA and FS-TARMA
  Conclusions
  Appendix 3.A Some examples of models in the class of GSC-TARMA models
    TARMA model with AR parameter evolution
    TARMA model with integrated AR parameter evolution
  Appendix 3.B Demonstration of the statistical properties of the GSC-TARMA model
    The parameter mean
    The joint distribution of the observations and parameter vector
    The PDF of the parameter trajectory
    The conditional observation PDF
  Appendix 3.C Derivation of the EM algorithm for the identification of the GSC-TARMA model
    Derivation of the expected log-likelihood
    Update equation for the parameter innovations mean
    Update equation for the stochastic constraint parameters
    Update equation for the parameter innovations covariance matrix
    Update equations for the innovations variance
  Appendix 3.D Identification based on the parameter trajectories

Time and Frequency Analysis of Non-Stationary Signals by Means of Time-Dependent ARMA Representations
  Introduction
  Time and frequency domain analysis of TARMA models
    Time-domain analysis of discrete linear time-varying models
      Impulse response
      The spreading function and the harmonic impulse response
      Convolution representation and the HIR
      Excitation-response covariance and autocovariance functions
    Frequency-domain analysis of discrete linear time-varying models
      Instantaneous frequency response function
      The frequency response operator and the harmonic FRF
      Spectral correlation
      Parametric time-dependent spectra
    Computational considerations
  Application examples
    Analysis of an FS-TARMA(,) model
      Model definition
      Time and frequency domain representation of the FS-TARMA model
      Analysis of the response of the FS-TARMA model
    Identification of the dynamics of an operational wind turbine
      Data description
      Short description of the identification methods
      Harmonic Transfer Function
      Spectral correlation estimates
      Time-dependent spectra
  Conclusions
  Appendix 4.A Proofs
    Proof for the spreading function in Equation (4..)
    Proof for the Harmonic Impulse Response of Equation (4..)
  Appendix 4.B Time and frequency domain analysis of stationary processes
  Appendix 4.C Time and frequency domain analysis of non-stationary processes
    Non-stationary random processes
    Harmonizable random processes
    Oscillatory random processes
    Reconciling the definitions of time-dependent spectra
  Appendix 4.D Poles and modal decomposition of TARMA models

A Multiple Model Framework for Vibration-Based SHM of Structures with Time-Dependent Dynamics Under Uncertainty
  Introduction
  The Problem and the Multiple Model Based SHM Framework
    Precise problem statement
    Overview of the main ideas
  The Elements of the Multiple Model Based SHM Framework
    Multiple Model representations of time-dependent structural dynamics
      The elementary models
      MM representations
      Construction of an MM representation
    Multiple Model Based SHM methods
      Methods based on the marginal likelihood (MM-ML)
      Methods based on the Kullback-Leibler Divergence (MM-KL)
    Optimization of the Multiple Model based SHM methods
  Illustrative Case Study: Damage diagnosis in a simple active suspension model with mass uncertainty
    The suspension model and problem description
    The vibration response signals
    FS-TAR model based dynamic analysis
    FS-TAR model based Multiple Model representation construction
    Damage detection results
      Results on the conventional damage detection problem
      Results on the modified damage detection problem
      Analysis of the dimensionality of the MM representation on the MM-KL method
      Comparison with the Random Coefficient (RC-FS-TAR) based methods
  Discussion
  Concluding Remarks
  Appendix 5.A Model Validation
  Appendix 5.B Random Coefficient and MM representations
    Illustrative example

Damage and Fault Diagnosis in an Operating Wind Turbine Under Uncertainty via Vibration Response Multiple Model Based Methods
  Introduction
  The Wind Turbine, the Damages/Faults, the Sensors
    Wind turbine description and simulation
    The damage/fault scenarios
    The sensors and the vibration response signals
  Brief Overview of the Multiple Model Damage/Fault Diagnosis Methods
    The elementary models
    Construction of the MM representation
    Damage diagnosis based on MM representations
      MM Marginal Likelihood based methods
      MM Kullback-Leibler based methods
  Modeling and Analysis of the Vibration Response Signals
    LPV-AR and FS-TAR based modeling
    Model-based analysis of the dynamics
  Damage and Fault Diagnosis Results
    Description of the optimization of the damage diagnosis methods via cross-validation
    MM-ML based methods
      Detection results
      Identification results
    MM-KL based methods
      Detection results
      Identification results
  Comparison with a Model Parameter Based Method
    Damage detection results
    Damage identification and level estimation results
  Discussion
  Concluding Remarks
  Appendix 6.A Model validation and comparative assessment
  Appendix 6.B The spectral correlation and the Mélard-Tjøstheim PSD of a TARMA model
  Appendix 6.C Brief overview of the model parameter based method
  Appendix 6.D The non-parametric TFR and PCA based damage diagnosis
    Damage detection results
    Damage identification and level estimation results

Conclusions and Future Work
  General Conclusions
    Problem
    Problem
    Problem
    Summary
  Suggestions for Future Work

List of Figures
List of Tables
References

Important Conventions and Symbols

Bold-face upper/lower case symbols designate matrix/column-vector quantities, respectively.

Matrix transposition is indicated by the superscript $T$.

A functional argument in parentheses designates a function of a real variable; for instance, $x(t)$ is a function of analog time $t \in \mathbb{R}$.

A functional argument in brackets designates a function of an integer variable; for instance, $x[t]$ is a function of normalized discrete time ($t = 1, 2, \ldots$). The conversion from discrete normalized time to analog time is based on $(t-1)T_s$, with $T_s$ standing for the sampling period.

A time instant used as a subscript/superscript pair to a function indicates the set of values of the function from the time indicated in the subscript up to the time indicated in the superscript; for instance, $x_\tau^t = \{x[i],\ i = \tau, \tau+1, \ldots, t\}$.

A hat designates an estimator/estimate of the indicated quantity; for instance, $\hat{\theta}$ is an estimator/estimate of $\theta$.

$B$ stands for the backshift operator, defined such that $B^i x[t] = x[t-i]$.

For simplicity of notation, no distinction is made between a random variable and its value(s).
Main Acronyms

2D-PSD : Two Dimensional Power Spectral Density
2SLS : Two Stage Least Squares (method)
2SSNLS : Two Stage Separable Nonlinear Least Squares (method)
ACF : AutoCovariance Function
AFS-TAR : Adaptable FS-TAR (model)
AFS-TARMA : Adaptable FS-TARMA (model)
AR : AutoRegressive
AUC : Area Under the ROC curve
BIC : Bayesian Information Criterion
DPE : Deterministic Parameter Evolution (method)
EM : Expectation-Maximization (method)
EKF : Extended Kalman Filter
ELS : Extended Least Squares (method)
FDI : Fault Detection and Identification
FPR : False Positive Rate
FS-TAR : Functional Series TAR (model)
FS-TARMA : Functional Series TARMA (model)
GSC-TAR : Generalized Stochastic Constraint TAR (model)
GSC-TARMA : Generalized Stochastic Constraint TARMA (model)
IV : Instantaneous Variance (method)
KF : Kalman Filter
MA : Moving Average
MAP : Maximum A Posteriori (method)
ML : Maximum Likelihood (method)
MS : Multi-Stage (method)
MW : Moving Window (method)
NID : Normally Independently Distributed
OLS : Ordinary Least Squares (method)
PDF : Probability Density Function
PSD : Power Spectral Density
PSO : Particle Swarm Optimization
RELS : Recursive Extended Least Squares (method)
RLS : Recursive Least Squares (method)
RSS : Residual Sum of Squares
ROC : Receiver Operating Characteristic curve
SNLS : Separable Nonlinear Least Squares (method)
SP-TAR : Smoothness Priors TAR (model)
SP-TARMA : Smoothness Priors TARMA (model)
SPWV : Smoothed Pseudo Wigner-Ville (method)
STFT : Short-Time Fourier Transform (method)
SPE : Stochastic Parameter Evolution (method)
TAR : Time-dependent AR (model)
TARMA : Time-dependent ARMA (model)
TPR : True Positive Rate
TV : Time-Varying
UPE : Unstructured Parameter Evolution (method)
WLS : Weighted Least Squares (method)

Chapter 1
Introduction

1.1 The Problem Statement and its Importance

The main topic of this dissertation is the output-only identification of the vibration response of structures with time-dependent dynamics under significant uncertainty (environmental and operational) by means of Time-dependent ARMA (TARMA) and Linear Parameter Varying ARMA (LPV-ARMA) modeling, and its application to the diagnosis (detection and identification) of structural damage, referred to as vibration-based Structural Health Monitoring (SHM). The work carried out in this dissertation follows three main directions, which are also its main objectives:

(i) The effective representation of the non-stationary vibration response of structures with time-dependent dynamics under significant uncertainty by means of TARMA and LPV-ARMA models.

(ii) The construction of enhanced analysis methods that can lead to a better understanding of non-stationary dynamics than the well-known frozen approach, particularly in the case of fast-evolving processes.

(iii) The development of robust damage diagnosis methods capable of coping with the variability that different operational and environmental conditions induce in the vibration response of structures with time-dependent dynamics operating under significant uncertainty.

Addressing the above-mentioned problems is of prime importance for vibration-based output-only identification, analysis, and health monitoring of structures with time-dependent dynamics operating under significant uncertainty, for which the vibration response is non-stationary (Hansen et al., 6b; Poulimenos and Fassois, 6, 9b; Li et al., ; Zhang and Huang, ). Nowadays, mechanical and civil structures with those characteristics are common. Examples include cranes, robotic manipulators, bridges with moving vehicles, structures with variable geometry, and several others. One particularly important example is the operating wind turbine, which is the main subject of application in this dissertation.

The development of proper representations (modeling) and analysis methods for the non-stationary vibration response of these structures is pertinent to an improved understanding of their dynamic behavior under normal operation (Poulimenos and Fassois, 6; Antoni, 9). Furthermore, accurate and compact modeling facilitates the development of robust damage diagnosis methods, which in turn can be used for the detection of incipient damage, thus reducing maintenance and repair costs and safeguarding the structure from catastrophic damage (Sohn, 7; Farrar and Worden, 7; Worden and Manson, 7; Fassois and Sakellariou, 9; Ciang et al., 8; Hameed et al., 9).

1.2 State of the Art

1.2.1 Representation of Non-Stationary Vibration

Although conventional stationary analysis methods may be used for initial analysis and understanding of the basic properties of any type of signal (Kay, 988; Bendat and Piersol, ; Fassois, ), these are not sufficient to reveal the structure of non-stationary dynamics. Consequently, the representation of the non-stationary dynamics in the vibration response of a structure requires specialized methods that can properly represent the evolutionary characteristics of the spectral content (Grenier, 989; Flandrin, 989; Hammond and White, 996; Poulimenos and Fassois, 6; Antoni, 7). For this purpose, it is possible to use either non-parametric or parametric non-stationary modeling methods.

Non-parametric methods are based either on representations of the distribution of power or energy of the non-stationary signal in time-frequency domains, or on modal decompositions. Examples of the former are the spectrogram, the Wigner-Ville and related time-frequency distributions, and the continuous wavelet transform (Cohen, 995; Hammond and White, 996; Flandrin, 999; Staszewski and Robertson, 7; Hlawatsch and Matz, 8; Sejdic et al., 9). Modal representations, in turn, are based either on the Hilbert-Huang transform (also known as Empirical Mode Decomposition) or on discrete wavelet representations (Huang et al., 998; Ditommaso et al., ; Feng et al., 3). Due to their simpler computation and physical interpretation, non-parametric representations have been widely adopted for the modeling of non-stationary vibration (Antoni, 7, 9; Braun and Feldman, ; Feldman, ; Feng et al., 3; Pai, 3; Dziedziech et al., 5b). However, non-parametric representations, and the SHM techniques based on them, are large in size and contain redundant data, while they may also have reduced accuracy, tracking ability and time-frequency resolution in comparison to parametric representations (Poulimenos and Fassois, 6).

Parametric representations, on the other hand, are based on non-stationary stochastic models, including Time-dependent AutoRegressive Moving Average (TARMA) models and related models, as well as Time-Variant Stochastic Subspace models (Poulimenos and Fassois, 6; Niedzwiecki, ; Spiridonakos and Fassois, 4a). In particular, the family of parametric TARMA models includes three main types of models, which are classified according to the form of structure imposed upon the evolution of the time-dependent parameters and innovations variance (Niedzwiecki, ; Poulimenos and Fassois, 6, 9b; Spiridonakos and Fassois, 4b): (i) Unstructured Parameter Evolution (UPE) models; (ii) Stochastic Parameter Evolution (SPE) models; and (iii) Deterministic Parameter Evolution (DPE) models. Each type of TARMA model has its advantages and disadvantages, but it is clear that SPE and DPE TARMA modeling can potentially yield improved performance and tracking of the time-dependent dynamics within a compact representation. The price to be paid is increased complexity and required user expertise.

Most recent works have focused on the development and improvement of TARMA models of the deterministic type (Poulimenos and Fassois, 9b; Spiridonakos et al., ; Spiridonakos and Fassois, 4b,a). In general, DPE-TARMA models may be considered particularly effective for the identification of the vibration response of structures characterized by time-dependent dynamics, where the physical mechanism responsible for the time-dependency is of deterministic type. In those cases, DPE-TARMA models offer very compact and accurate representations. However, when stochastic components are present in the evolutionary characteristics of the dynamics due to the effects of uncertainties, deterministic TARMA models may have difficulty representing the parameter evolution effectively (Beck, ; Behmanesh et al., 5; Green and Worden, 5).
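To make the distinction between deterministic and stochastic parameter evolution concrete, the following Python sketch simulates a second-order time-dependent AR process under each. Everything in it is an illustrative assumption rather than a model taken from this thesis: the sign convention x[t] + a1[t] x[t-1] + a2[t] x[t-2] = e[t], the function name `simulate_tar2`, the two-term basis {1, cos} used for the deterministic case, the random-walk evolution used for the stochastic case, and all numerical values. The clipping of the random walk is only a device to keep the toy example well behaved.

```python
import numpy as np


def simulate_tar2(a1, a2, sigma_e=1.0, seed=0):
    """Simulate x[t] + a1[t]*x[t-1] + a2[t]*x[t-2] = e[t], e[t] ~ N(0, sigma_e^2).

    a1, a2 : arrays of equal length holding the parameter trajectories.
    Samples before the start of the record are taken as zero (assumption).
    """
    rng = np.random.default_rng(seed)
    n = len(a1)
    e = rng.normal(0.0, sigma_e, n)
    x = np.zeros(n)
    for t in range(n):
        x1 = x[t - 1] if t >= 1 else 0.0
        x2 = x[t - 2] if t >= 2 else 0.0
        x[t] = -a1[t] * x1 - a2[t] * x2 + e[t]
    return x


n = 1000
t = np.arange(n)
a2 = np.full(n, 0.9)  # kept constant so that only a1 evolves

# Deterministic parameter evolution: a1[t] is a linear combination of
# known basis functions (here a constant and one cosine, both hypothetical).
basis = np.column_stack([np.ones(n), np.cos(2.0 * np.pi * t / n)])
coeffs = np.array([-1.2, 0.5])
a1_dpe = basis @ coeffs

# Stochastic parameter evolution: a1[t] follows a Gaussian random walk
# started at -1.2, clipped to a band that keeps the frozen model stable.
rng = np.random.default_rng(1)
a1_spe = np.clip(-1.2 + np.cumsum(rng.normal(0.0, 0.005, n)), -1.8, 1.8)

x_dpe = simulate_tar2(a1_dpe, a2, seed=0)
x_spe = simulate_tar2(a1_spe, a2, seed=2)
```

In the deterministic case the whole trajectory of a1[t] is described by the two numbers in `coeffs`, which is the source of the compactness discussed above. In the stochastic case the trajectory differs for every realization of the random walk, so no fixed set of basis coefficients describes it.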
When such uncertainties are present, one important question is how to model their effects effectively within the DPE-TARMA class while avoiding the introduction of extra complexity into the model.

SPE-TARMA models, on the other hand, also have the potential to yield compact, accurate and flexible representations for a wider range of non-stationary time series featuring purely stochastic effects, or a mix of stochastic and deterministic effects, in the temporal evolution of their dynamics, as in the case of the vibration response of operating wind turbines. Smoothness Priors TARMA (SP-TARMA) models (Kitagawa and Gersch 996, Ch. ; Kitagawa, Ch. 3) are the simplest and most widely used models within the SPE-TARMA class. These models feature integrated Gaussian processes for the description of the evolution of the TARMA parameters, which are referred to as smoothness priors constraints. The smoothness priors constraints are useful for the description of non-stationary processes with smooth parameter evolution, but may lack flexibility when the dynamics evolve rapidly, or when there are abrupt changes in the parameters. More elaborate models attempt to improve the flexibility of the model by making use of non-Gaussian densities (Kitagawa and Gersch 996, Ch. ; Kitagawa, Ch. 3) or Markov processes (in a general sense) to define the evolution of the TARMA parameters (Rajan et al., ; Godsill et al., ; Djuric et al., 3; Francq and Gautier, 4; Hsiao, 8). Despite the improved modeling accuracy and flexibility, the problem with these models is that the identification process is more complex, requiring specialized recursive Bayesian estimation methods suited to tracking the non-Gaussian densities (Godsill et al. ; Kitagawa, Ch. 3). In this regard, it is of interest to develop alternative TARMA modeling methodologies that can take advantage of their flexibility, but at the same time offer improved accuracy in the tracking of the non-stationary dynamics.

1.2.2 Time and Frequency Representation of Non-Stationary TARMA Processes

The derivation of the properties describing the correlation between time-frequency components of a non-stationary process, including frequency response functions and the spectral density, has been the subject of extensive research for many years (Martin and Flandrin, 985; Gardner, 986; Flandrin, 989; Grenier, 989; Antoni, 7; Sejdic et al., 9). Several definitions have been provided for the Power Spectral Density (PSD) of non-stationary processes (Gardner, 986; Grenier, 989). On the one hand, the spectral correlation is a frequency-frequency representation that describes the correlations between spectral components of the signal and is useful to unveil the structure of amplitude or frequency modulations (Gardner, 986; Grenier, 989; Giannakis, 999; Antoni, 7). On the other hand, there are time-frequency (evolutionary) representations that describe the evolution of the spectral components of the non-stationary process as a function of time.
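As a minimal illustration of a time-frequency representation obtained from a parametric model, the Python sketch below evaluates the "frozen" spectrum of a time-dependent AR model, that is, the PSD the model would have if its parameters were held fixed at their values at each time instant. The function name `frozen_psd`, the sign convention x[t] + sum_i a_i[t] x[t-i] = e[t], the normalization S = sigma_e^2 / |A|^2 over normalized frequency, and the chirp-like test model are all assumptions made for this example and are not taken from the thesis.

```python
import numpy as np


def frozen_psd(a, sigma2, n_freq=256):
    """Frozen-time PSD of a TAR model (illustrative sketch).

    a      : (N, na) array, a[t, i-1] is the AR parameter a_i[t]
    sigma2 : (N,) array of innovations variances
    Returns normalized frequencies in [0, 0.5] (cycles/sample) and an
    (N, n_freq) array S with S[t, k] = sigma2[t] / |A(f_k, t)|^2, where
    A(f, t) = 1 + sum_i a_i[t] * exp(-j*2*pi*f*i).
    """
    a = np.asarray(a, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    na = a.shape[1]
    freqs = np.linspace(0.0, 0.5, n_freq)
    lags = np.arange(1, na + 1)
    # phase[k, i-1] = exp(-j*2*pi*f_k*i)
    phase = np.exp(-2j * np.pi * np.outer(freqs, lags))
    A = 1.0 + a @ phase.T  # shape (N, n_freq)
    return freqs, sigma2[:, None] / np.abs(A) ** 2


# Hypothetical test model: one resonance whose frequency sweeps from
# 0.1 to 0.2 cycles/sample, with constant pole radius r.
N, r = 200, 0.95
f0 = np.linspace(0.1, 0.2, N)
a = np.column_stack([-2.0 * r * np.cos(2.0 * np.pi * f0), np.full(N, r**2)])
freqs, S = frozen_psd(a, np.ones(N))

# The ridge of S follows the imposed resonance frequency.
f_peak = freqs[S.argmax(axis=1)]
```

Because each row of `S` is computed from the parameters at a single time instant, this representation ignores how quickly the parameters change, which is why it is best suited to slowly evolving dynamics.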
Two broad classes of time-dependent PSD for non-stationary processes can be recognized: the generalized Wigner-Ville spectrum (Type I) and the generalized evolutionary spectrum (Type II) (Grenier, 989; Matz and Hlawatsch, 6). Likewise, time-dependent systems can also be described in terms of frequency-frequency representations, such as the Harmonic Frequency Response Function (Harmonic FRF), or in terms of time-frequency representations, including Parametric Transfer Functions, Instantaneous Transfer Functions, or Frozen Transfer Functions (Sandberg et al., 5; Tohumoglu, 5; Skjoldan and Hansen, 9; Allen et al., b), the latter being the approach most commonly used in practice, but also the one with the poorest representation properties. Currently, parametric time-frequency analysis via TARMA models is almost exclusively carried out in terms of the frozen approach (Grenier, 989; Poulimenos and Fassois, 6), despite the clear benefits offered by other representations, especially the improved tracking of fast-evolving non-stationary dynamics. This is mainly due to the simplicity of the frozen approach, but also to the lack of analytical expressions for calculating, in the case of TARMA models, other quantities with improved theoretical properties, such as the Harmonic FRF, the instantaneous FRF or the spectral correlation.

1.2.3 Non-Stationary Vibration-Based Structural Health Monitoring

In general, the problem of vibration-based SHM may be treated in a statistical time series framework using either non-parametric or parametric time series models, accompanied by statistical decision making schemes (Sohn, 7; Farrar and Worden, 7; Worden and Manson, 7; Fassois and Sakellariou, 9). Despite significant developments in recent years, two prime challenges faced by such methods relate to their operation (i) on structures with time-dependent dynamics, and (ii) under significant environmental and operational uncertainty.
For the first challenge, relating to structures with non-stationary vibration response, non-parametric models (based on non-parametric time-frequency representations) are mainly used for the representation of the vibration response (Staszewski et al., 997; Flandrin, 999; Staszewski and Robertson, 7; Boashash, 3; Peng and Chu, 4; Tang et al., ; Feng et al., 3; Dziedziech et al., 5a), while damage detection and identification are based on recognizing discrepancies between these representations in reference and inspection cases (Feng et al., 3). Alternatively, stochastic parametric modeling is a more powerful tool in this regard, since these models can provide very accurate modeling and tracking of the non-stationary behavior within a very compact representation. Various types of Time-dependent ARMA (TARMA) models, including unstructured, stochastic and deterministic parameter evolution models, have proven to be useful for this purpose (Avendaño-Valencia and Fassois, 3

4a; Poulimenos and Fassois, 6; Spiridonakos and Fassois, 3, 4b). Likewise, the closely related Linear Parameter Varying ARMA (LPV ARMA) models are more suitable whenever the time-dependent dynamics of the vibration response of the structure depend on an external scheduling variable that is also measurable (Bamieh and Giarré, ; Tóth, ). Nonetheless, despite the evident advantages of parametric modeling for SHM of structures with non-stationary response, these models have received very little attention in practice (Poulimenos and Fassois, 4; Spiridonakos and Fassois, 4b).

The second challenge, stemming from the presence of operational and environmental uncertainty, may be tackled by various approaches. If the uncertainty-inducing variables can be quantified (measured), then explicit cause-and-effect modeling may be employed. This leads to methods that build a regression or interpolation model explaining the variability of the damage-sensitive features, or of the parameters of the representation model, as a function of the uncertainty-inducing variables (Hios and Fassois, 4; Kopsaftopoulos and Fassois, 3; Sohn, 7; Worden et al., ). If this is not the case, a first option is to use a method that attempts to separate the effects of uncertainty from those of damage. For this purpose, orthogonal decompositions and related methods are employed, with Principal Component Analysis (PCA) based methods being most common (Deraemaeker et al., 8; Gómez-González and Fassois, 3; Sohn et al., ; Sohn, 7; Yan et al., 5a,b). Alternative methods attempt to model the uncertainty itself. Random Coefficient (RC), probabilistic or similar models describe the effects of uncertainties on the dynamics of the vibration response signal by treating the model parameters (coefficients) as random quantities (Michaelides and Fassois, 3; Mosavi et al., ; Zao and Wang, ).
Normal distribution models are most commonly used, although in practice the parameters may follow more complex distributions. If that is the case, Gaussian mixture (Nair and Kiremidjian, 7; Słoǹski, ), kernel (Fricker et al., ; Khatibinia et al., 3), and fuzzy (Chandrashekhar and Ganguli, 9; Degrauwe et al., 9) models, or interval analysis methods (Muscolino et al., 5) are useful alternatives. The main drawback of conventional RC and interpolating models is that they require large amounts of data in order to construct a reliable representation. Moreover, the data requirements increase sharply with model complexity. In view of the above, one of the challenges still open for the proper SHM of structures with non-stationary response operating under significant uncertainty is to develop a representation method capable of describing both types of effects (non-stationarity and uncertainty) in a simple and compact manner, while at the same time providing effective damage diagnosis capabilities.

1.3 Proposed Methodology

This section explains the methodology postulated in this dissertation to tackle each one of the main objectives discussed in Section 1.2.

1.3.1 Representation of Non-Stationary Vibration by Means of TARMA Models

Improved representation of non-stationary vibration is attempted by two methods: (i) Generalized linear Stochastic Constraint TARMA (GSC TARMA) models; (ii) Linear Parameter Varying ARMA (LPV ARMA) models. Both aim to provide a simple, compact and flexible representation, which is also robust to uncertainties. More specifically, GSC TARMA models are a type of SPE TARMA model for which the parameter trajectories are Gaussian multivariate AR processes. GSC TARMA models have the potential to track the evolution of parameters influenced by deterministic and stochastic effects, while avoiding the complicated selection of a functional basis expansion and its order, which is the main drawback of Functional Series TARMA (FS TARMA) models.
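For orientation, the GSC TARMA structure described above may be written as follows. This is only a sketch in assumed notation: the symbols n_a, n_c, p, M_k, Sigma_v, kappa and the backshift operator B are introduced here for illustration and need not coincide with the notation used in Chapter 3.

```latex
% TARMA(n_a, n_c) model with time-dependent parameters (assumed notation)
y[t] + \sum_{i=1}^{n_a} a_i[t]\, y[t-i] = w[t] + \sum_{i=1}^{n_c} c_i[t]\, w[t-i],
\qquad w[t] \sim \mathcal{N}\bigl(0, \sigma_w^2[t]\bigr)

% Parameter vector collecting the AR and MA parameters at time t
\theta[t] = \bigl[\, a_1[t] \;\cdots\; a_{n_a}[t] \;\; c_1[t] \;\cdots\; c_{n_c}[t] \,\bigr]^{T}

% Generalized linear stochastic constraint:
% the parameter trajectories follow a Gaussian multivariate AR process of order p
\theta[t] = \sum_{k=1}^{p} M_k\, \theta[t-k] + v[t],
\qquad v[t] \sim \mathcal{N}(0, \Sigma_v)

% Smoothness priors special case of order kappa (all AR roots equal to one)
(1 - B)^{\kappa}\, \theta[t] = v[t], \qquad B\,\theta[t] = \theta[t-1]
```

In this form the matrices M_k and the covariance Sigma_v play the role of hyperparameters, and both the observation equation and the constraint are linear and Gaussian.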
Besides, since GSC TARMA models are fully linear Gaussian representations, they are characterized by simpler parameter estimation methods. The identification of GSC TARMA models requires the estimation of the parameter trajectories and hyperparameters, and is tackled via Maximum Likelihood and Bayesian methods. Alternatively, LPV ARMA models are useful to represent non-stationary processes whose time-dependent behavior is determined by an external variable, referred to as the scheduling variable. This property gives the LPV ARMA model the potential to track the evolution of the parameters with improved accuracy, as compared with conventional FS TARMA models. Besides, LPV ARMA models yield compact representations,

similar to the case of FS TARMA models. The estimation of these models is solved via Maximum Likelihood methods.

1.3.2 Time and Frequency Representation of Non-Stationary TARMA Processes

Enhanced methods for the analysis of non-stationary dynamics based on TARMA models are pursued by adapting the concepts of the Harmonic Impulse Response and the Harmonic FRF to the case of TARMA models. The rationale for this selection is that both the Harmonic Impulse Response and the Harmonic FRF possess several desirable properties that can provide insight into the input-output characteristics of the underlying non-stationary process. Besides, they can be used to derive other quantities with desirable properties, such as the spectral correlation and different types of time-dependent spectra with improved characteristics in comparison with the frozen spectrum. The methodology followed to adapt the concepts of the Harmonic Impulse Response and the Harmonic FRF to the case of TARMA models starts by considering a discrete LTV system represented by a difference equation model and defined over a finite analysis period (of which TARMA models are a particular example). Then, the time-varying parameters of this model are assumed to admit a representation by means of the Discrete Fourier Transform (DFT). Subsequently, by using the properties of the DFT, it is possible to derive analytical expressions for the Harmonic Impulse Response and the Harmonic FRF for this type of model.
Based on the obtained expressions, it is also possible to calculate the Spectral Correlation, the Wigner-Ville and the Melard-Tjøsteim time-dependent spectra based on the DFT of the parameters of the discrete LTV system.

1.3.3 Non-Stationary Vibration-Based Structural Health Monitoring

The methodological approach used for SHM of structures with time-dependent dynamics operating under significant uncertainty consists of a framework that includes two entities: non-stationary parametric modeling (via FS TARMA or LPV ARMA models) for representing the time-dependent dynamics of the vibration response of the structure; and a Multiple Model (MM) representation for modeling each health state of the structure under uncertainty. The rationale behind the current formulation stems from the expectation that an MM representation may be best for capturing the different dynamic behaviors of the structure under the influence of uncertainty in a simple and effective form. From this concept, two damage detection and identification methods are derived: one based on a Bayesian formulation of the damage detection and identification in terms of the likelihoods of the individual models, and a second one based on Kullback-Leibler divergences measured between the model of the test signal and the baseline models contained in the MM representation of each health state of the structure.

1.4 Main Results

This Ph.D. dissertation has demonstrated the importance, from both a theoretical and a practical point of view, of non-stationary parametric modeling via TARMA and LPV ARMA models. It demonstrates the capabilities of these models for modeling the non-stationary vibration response of structures with time-dependent dynamics subject to significant levels of uncertainty, as well as their applicability to the analysis of the dynamic characteristics of these non-stationary processes and to SHM problems in such types of structures.
In particular, improved TARMA-related modeling and identification methods are proposed and evaluated in simulation examples and real-life scenarios, which demonstrate the capabilities of the proposed modeling methods. Moreover, the difficulties met in the application of these models to SHM and to the time and frequency domain analysis of non-stationary processes are effectively tackled. The main contributions of this thesis are:

(i) The introduction of the GSC TARMA model class along with a complete framework for the identification of these models based on Maximum Likelihood and Bayesian methods. This includes the following methods:
- A method aiming to optimize the marginal likelihood of the GSC TARMA model solved via a non-linear iterative optimization

- A method aiming to optimize the likelihood of the parameters and hyperparameters of the GSC TARMA model solved via an Expectation Maximization algorithm
- A method aiming to optimize the Joint Posterior PDF of the parameters and hyperparameters of the GSC TARMA model solved via Markov Chain Monte Carlo sampling
- A method aiming to optimize the Joint Posterior PDF of the parameters and hyperparameters of the GSC TARMA model solved via a joint Kalman filter method based on an augmented state space representation

(ii) Derivation of novel analytical expressions of the Harmonic Impulse Response, Harmonic FRF and Instantaneous FRF for the particular case of discrete LTV systems, based on the DFT of the system parameters on a finite analysis interval. Besides:
- The adaptation of these expressions to the particular case of TARMA models
- The adaptation of these expressions for the calculation of the spectral correlation and various types of time-dependent spectra (Wigner-Ville and Melard-Tjøsteim types) for the particular case of TARMA models

(iii) A damage diagnosis methodology based on Multiple Model (MM) representations for the health monitoring of structures with time-dependent vibration response and operating under significant uncertainty. More specifically:
- A methodology for the construction of efficient MM representations
- Two damage diagnosis (detection and identification) methods: one based on a Bayesian formulation of the damage detection and identification in terms of the likelihoods of the individual models, and a second one based on Kullback-Leibler divergences measured between the model of the test signal and the baseline models contained in the MM representation of each health state of the structure.
(iv) The practical evaluation of the proposed methods in simulation examples and real-life scenarios, which demonstrate the effectiveness and improved performance provided by the postulated methods in comparison with similar methodologies of non-stationary modeling.

1.5 Organization and Contributions of this Thesis

This dissertation is divided into the chapters summarized in Table 1.1. The specific topics discussed in each chapter and their respective contributions are described next.

Table 1.1: Thesis chapters.
Chapter   Title
2         Stationary and Non-Stationary Random Vibration Modeling and Analysis for an Operating Wind Turbine
3         Generalized Stochastic Constraint Time-Dependent ARMA Modeling of Non-Stationary Signals
4         Time and Frequency Analysis of Non-Stationary Signals by Means of Time-Dependent ARMA Representations
5         A Multiple Model Framework for Vibration-Based SHM in Non-Stationary and Uncertain Environments: Part I - The Methodology
6         A Multiple Model Framework for Vibration-Based SHM in Non-Stationary and Uncertain Environments: Part II - Application to Damage Diagnosis on an Operating Wind Turbine

Chapter 2: Stationary and Non-Stationary Random Vibration Modeling and Analysis for an Operating Wind Turbine

In this chapter, the problem of stationary and non-stationary random vibration modeling and analysis for an operating wind turbine is considered by using an acceleration vibration signal measured on the tower of a NegMicon NM5/9 fixed speed wind turbine. Three stationary methods, namely Welch spectral estimation, and parametric AR and ARMA modeling, are considered, along with five non-stationary methods, namely non-parametric Wigner-Ville and spectral correlation, and parametric modeling by means of SP-TAR modeling, FS-TAR modeling, and adaptable FS-TAR modeling. The need for non-stationary modeling is mainly dictated by blade rotation and the wind dynamics. The results of the study confirm the cyclo-stationary and broader non-stationary nature of the random vibration, dictating the need for corresponding methods for accurate modeling and analysis. The capabilities and facets of the various methods in terms of modeling and analyzing operating wind turbine random vibration are presented.

Main contributions:
(i) For the first time, a detailed analysis of the vibration response of operational wind turbines based on actual experimental data is provided. Cyclo-stationary, nearly cyclo-stationary, and broader non-stationary components are identified in the vibration response.
(ii) An assessment of different stationary and non-stationary methods is provided in the context of the modeling and analysis of the vibration response of an operating wind turbine. A detailed comparison of the various modeling approaches, the determination of their relative pros and cons, and the formulation of practical strategies and recommendations are presented.
Chapter 3: Generalized Stochastic Constraint Time-Dependent ARMA Modeling of Non-Stationary Signals

This chapter discusses the problem of modeling non-stationary time series featuring deterministic and stochastic patterns in the evolution of their dynamics via Stochastic Parameter Evolution Time-dependent ARMA (SPE TARMA) models. The work focuses on the class of Generalized linear Stochastic Constraint TARMA (GSC TARMA) models, in which the parameter evolution is determined by Gaussian multivariate AR processes, with the well-known Smoothness Priors TARMA models being the special case in which all the roots of the multivariate AR model are equal to one. The GSC TARMA model has the advantage of a simple linear Gaussian form, which facilitates the estimation of the parameter trajectories via conventional Kalman filter methods, while its tracking abilities can potentially be optimized by properly adjusting the model hyperparameters. The methods investigated in this chapter are evaluated and compared with similar methods on several Monte Carlo simulations and on a vibration response of an operating wind turbine, thus demonstrating the workings of the GSC TARMA modeling methods as well as their advantages for the identification of complex non-stationary time series.
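The Kalman filter estimation of parameter trajectories mentioned above can be illustrated with a minimal sketch. It is not the identification method of this thesis: it assumes a pure TAR(2) model, a random-walk parameter evolution (a single unit root, i.e. the simplest smoothness priors constraint), and fixed hyperparameters. The function name `kalman_tar` and the values of `q` and `r` are hypothetical choices made for the example.

```python
import numpy as np

def kalman_tar(y, na, q=1e-4, r=1.0):
    """Track TAR(na) parameters with a Kalman filter.

    Assumed model (illustrative only):
      theta[t] = theta[t-1] + v[t]        (random-walk parameter evolution)
      y[t]     = h[t] @ theta[t] + w[t],  h[t] = -[y[t-1], ..., y[t-na]]
    q and r are hypothetical hyperparameters: the variance of v[t]
    (per parameter) and the variance of the innovations w[t].
    """
    n = len(y)
    theta = np.zeros(na)              # initial parameter estimate
    P = np.eye(na)                    # initial parameter covariance
    Q = q * np.eye(na)
    est = np.zeros((n, na))
    for t in range(na, n):
        h = -y[t - na:t][::-1]        # regressor built from past outputs
        P = P + Q                     # time update (random-walk prediction)
        e = y[t] - h @ theta          # one-step-ahead prediction error
        s = h @ P @ h + r             # prediction error variance
        k = P @ h / s                 # Kalman gain
        theta = theta + k * e         # measurement update of the parameters
        P = P - np.outer(k, h @ P)    # measurement update of the covariance
        est[t] = theta
    return est

# Synthetic TAR(2) signal with one slowly varying parameter
rng = np.random.default_rng(0)
n = 2000
a1 = -1.6 + 0.2 * np.sin(2 * np.pi * np.arange(n) / n)
a2 = np.full(n, 0.9)
y = np.zeros(n)
for t in range(2, n):
    y[t] = -a1[t] * y[t - 1] - a2[t] * y[t - 2] + rng.standard_normal()

est = kalman_tar(y, na=2)
# Mean absolute tracking error of (a1, a2) over the second half of the record
print(np.abs(est[n // 2:] - np.column_stack([a1, a2])[n // 2:]).mean(axis=0))
```

In this sketch the ratio `q / r` controls the trade-off between smoothness and tracking speed, which is the kind of hyperparameter adjustment the paragraph above refers to. A more general stochastic constraint would replace the identity transition with a multivariate AR transition on the parameter vector.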
Main contributions:
(i) A set of methods for the identification of GSC TARMA models, including:
- A method aiming to optimize the marginal likelihood of the GSC TARMA model solved via a non-linear iterative optimization
- A method aiming to optimize the likelihood of the parameters and hyperparameters of the GSC TARMA model solved via an Expectation Maximization algorithm
- A method aiming to optimize the Joint Posterior PDF of the parameters and hyperparameters of the GSC TARMA model solved via Markov Chain Monte Carlo sampling
- A method aiming to optimize the Joint Posterior PDF of the parameters and hyperparameters of the GSC TARMA model solved via a joint Kalman filter method based on an augmented state space representation

(ii) The evaluation of the postulated methods and their comparison with alternative TARMA modeling methods by means of Monte Carlo simulations and on a wind turbine vibration response signal.

Chapter 4: Time and Frequency Analysis of Non-Stationary Signals by Means of Time-Dependent ARMA Representations

The analysis of the time and frequency properties of Time-dependent ARMA and similar parametric time-dependent models has generally been performed in terms of the so-called frozen approach, in which the time-dependent parameters of the model are analyzed as if they corresponded to a stationary system at each time instant. The frozen approach facilitates the analysis of the non-stationary dynamics, but has important limitations in the accurate representation of those dynamics, especially when the model parameters evolve rapidly. In the analysis of Linear Time Periodic (LTP) systems, by contrast, the concepts of Harmonic Impulse Response and Harmonic Frequency Response Function (FRF) are widely used for the analysis of input-output relationships and the understanding of the modulations introduced by the LTP system. To date, those concepts have been analyzed for the case of continuous time LTP systems, and of discrete time LTP systems described by the impulse response function. In this chapter, these concepts are adapted to the particular case of discrete Linear Time-Varying (LTV) difference equation models, of which TARMA models are a special case. The main contribution of this work is the derivation of analytical expressions that associate the discrete Fourier transform of the parameters of the LTV model with its respective Harmonic Impulse Response and Harmonic FRF. These formulations are also used to evaluate the spectral correlation and time-dependent spectra of the Melard-Tjøsteim and Wigner-Ville types for the difference equation LTV model and for the particular case of TARMA models.
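As a point of reference for the frozen approach described above, the following sketch evaluates a frozen spectrum for a pure TAR model: at each time instant the current parameters are treated as those of a stationary AR system. The function name `frozen_psd`, its arguments, and the restriction to the AR part are assumptions made for this illustration.

```python
import numpy as np

def frozen_psd(a, sigma2, n_freq=256):
    """Frozen spectrum of a TAR(na) model (illustrative sketch).

    a      : (N, na) array, a[t, i-1] = a_i[t] in
             y[t] + sum_i a_i[t] y[t-i] = w[t]
    sigma2 : (N,) array with the innovations variance at each time
    Returns the normalized frequencies omega in [0, pi] and an
    (N, n_freq) array S with S[t, k] = sigma2[t] / |A(omega_k, t)|^2.
    """
    a = np.asarray(a, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    na = a.shape[1]
    omega = np.linspace(0.0, np.pi, n_freq)
    # E[k, i-1] = exp(-j * omega_k * i) for lags i = 1..na
    E = np.exp(-1j * np.outer(omega, np.arange(1, na + 1)))
    A = 1.0 + a @ E.T                 # frozen AR polynomial at each time
    return omega, sigma2[:, None] / np.abs(A) ** 2

# Example: a single-parameter model whose parameter drifts over time
N = 100
a = np.linspace(-0.5, -0.9, N)[:, None]
omega, S = frozen_psd(a, np.ones(N))
print(S.shape)
```

Because each row of `S` depends only on the parameters at that instant, the frozen spectrum ignores the rate at which the parameters change, which is the limitation that motivates the harmonic representations developed in this chapter.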
The theoretical and practical results demonstrate the superiority of the postulated parametric TARMA model based analysis framework compared with non-parametric methods and the conventional frozen approach for the analysis of LTV systems.

Main contributions:
(i) A critical overview of the definitions of the time and frequency representation of non-stationary processes and time-dependent systems.
(ii) Derivation of novel analytical expressions of the Harmonic Impulse Response, Harmonic FRF and Instantaneous FRF for the particular case of discrete LTV systems, based on the DFT of the system parameters on a finite analysis interval. Besides:
- The adaptation of these expressions to the particular case of TARMA models
- The adaptation of these expressions for the calculation of the spectral correlation and various types of time-dependent spectra (Wigner-Ville and Melard-Tjøsteim types) for the particular case of TARMA models

Chapter 5: A Multiple Model Framework for Vibration-Based SHM in Non-Stationary and Uncertain Environments: Part I - The Methodology

This chapter focuses on vibration-based SHM methods for structures with time-dependent dynamics under significant environmental and operational uncertainty. The adopted approach utilizes a framework that includes non-stationary parametric time-dependent modeling for representing the time-dependent vibration response dynamics, and a Multiple Model (MM) representation of each individual health state of the structure under uncertainty. The work in this chapter focuses on the methodological aspects of the MM based SHM framework. In particular, detailed definitions of the main theoretical aspects of the MM framework for vibration-based SHM are provided, based on the interpretation of the MM representation as a mixture approximation of a random coefficient model, which facilitates the construction of the MM representation as well as the definition of damage diagnosis (detection and identification) tests.
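A divergence-based diagnosis test of the kind outlined above can be sketched as follows. The sketch rests on strong assumptions that are not taken from the thesis: each model is summarized by a Gaussian distribution over its parameter vector (mean and covariance), and a test signal is assigned to the health state containing the closest baseline model. The names `gaussian_kl` and `nearest_state`, and the toy numbers, are hypothetical.

```python
import numpy as np

def gaussian_kl(mu0, S0, mu1, S1):
    """Kullback-Leibler divergence KL(N(mu0, S0) || N(mu1, S1))."""
    k = len(mu0)
    d = mu1 - mu0
    _, logdet0 = np.linalg.slogdet(S0)
    _, logdet1 = np.linalg.slogdet(S1)
    return 0.5 * (np.trace(np.linalg.solve(S1, S0))
                  + d @ np.linalg.solve(S1, d)
                  - k + logdet1 - logdet0)

def nearest_state(mu, S, baselines):
    """Assign a test model to the health state with the closest baseline.

    baselines: dict mapping a state label to a list of (mean, covariance)
    pairs, one pair per baseline model of that state (assumed layout).
    """
    scores = {state: min(gaussian_kl(mu, S, m, C) for m, C in models)
              for state, models in baselines.items()}
    return min(scores, key=scores.get), scores

# Toy example with two health states and two baseline models each
I2 = 0.01 * np.eye(2)
baselines = {
    "healthy": [(np.array([-1.60, 0.90]), I2), (np.array([-1.65, 0.90]), I2)],
    "damaged": [(np.array([-1.40, 0.85]), I2), (np.array([-1.35, 0.85]), I2)],
}
state, scores = nearest_state(np.array([-1.62, 0.90]), I2, baselines)
print(state)
```

Using several baseline models per state is what lets the representation cover the spread caused by operational uncertainty; taking the minimum divergence is one simple choice of decision rule and other rules are equally possible.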
In addition, a small example featuring a suspension system with time-dependent dynamics, where the uncertainty is introduced by variability in the physical parameters of the dynamic model, is used to show the workings of the postulated methods. The resulting MM based damage diagnosis methods are characterized by

their simplicity in construction and their effective use of limited amounts of data, while providing high accuracy in damage diagnosis.

Main contributions:
(i) A methodology for the construction of the MM representation: this encompasses the problem of identifying the MM representation based on a finite number of baseline vibration responses, and includes the estimation of the parameters of individual LPV ARMA or FS TARMA models via Maximum Likelihood (ML), and the selection of the MM structure and dimensionality, addressed through statistical learning methods.
(ii) Damage diagnosis methods based on identified MM representations: this refers to the problem of detection and identification of damage, which is addressed by a Bayesian method based on the marginal likelihood associated with the MM representation and, alternatively, by a Kullback-Leibler divergence based method aiming at finding models in the MM with similar statistical characteristics.
(iii) A methodology for the optimization of the free parameters of the damage diagnosis methods: these include the structural parameters of the MM representation and of the damage diagnosis tests. The optimization is posed in terms of statistical learning theory and carried out either by Receiver Operating Characteristic (ROC) curve analysis or by optimization of the correct identification rate.

Chapter 6: A Multiple Model Framework for Vibration-Based SHM in Non-Stationary and Uncertain Environments: Part II - Application to Damage Diagnosis on an Operating Wind Turbine

This chapter continues with the evaluation of the MM framework for vibration-based SHM postulated in the previous chapter. In particular, it provides a complete account of the application of that framework to the SHM of an NREL offshore 5 MW wind turbine, simulated by means of the FAST aeroelastic simulation tool, using a single vibration response signal.
In the considered experiment, non-stationary dynamics mainly originate from the different inertial configurations of the structure as the wind turbine blades rotate, while uncertainty is introduced by random variations of the average wind speed of the wind excitation in the range from to m/s. Monte Carlo simulations are performed on six structural states, including the healthy state and five types of damage in the tower, blades and transmission, each with four levels of damage. The complete set-up of the modeling and damage diagnosis methods is presented, and comparisons with similar state-of-the-art methods are provided as well. The results demonstrate consistently good performance of the MM based methods, which vastly improve on the performance achieved by other methods, and are thus indicative of their applicability and effectiveness. Moreover, by means of the postulated MM methodology, it is possible to detect most damage types and specify the level of damage with a single sensor either at the tower or the blades.

Main contributions:
(i) Practical evaluation of the MM framework: including a comparison with similar damage diagnosis methods aiming at representing non-stationary vibration response dynamics and uncertainty.
(ii) A guide for the practical application of the MM methodology: this can be used to apply the MM damage diagnosis methodology to other vibration-based SHM problems in which uncertainty and/or non-stationarity have a significant influence.


Chapter 2

Stationary and Non-Stationary Random Vibration Modeling and Analysis for an Operating Wind Turbine

In this chapter, the problem of stationary and non-stationary random vibration modeling and analysis for an operating wind turbine is considered by using an acceleration vibration signal measured on the tower of a NegMicon NM5/9 fixed speed wind turbine. Three stationary methods, namely Welch spectral estimation, and parametric AR and ARMA modeling, are considered, along with five non-stationary methods, namely non-parametric Wigner-Ville and spectral correlation, and parametric modeling by means of SP-TAR modeling, FS-TAR modeling, and adaptable FS-TAR modeling. The need for non-stationary modeling is mainly dictated by blade rotation and the wind dynamics. The results of the study confirm the cyclo-stationary and broader non-stationary nature of the random vibration, dictating the need for corresponding methods for accurate modeling and analysis. The capabilities and facets of the various methods in terms of modeling and analyzing operating wind turbine random vibration are presented.

2.1 Introduction

Wind energy is an important technology branch experiencing fast growth. The main requirements are higher power yield and less downtime. Therefore, to increase productivity and reduce operating costs, wind energy research aims at developing larger wind turbines located in more challenging environments, such as offshore. Effective control algorithms are necessary to operate the wind turbines with optimal productivity under safe conditions (Barlas and van Kuik, ). Moreover, maintenance and repairs constitute significant expenses for wind farms. The new generation of wind turbines, which are taller and produce more power, are more prone to fatigue or malfunction and are more expensive to repair (Hameed et al., 9; Ciang et al., 8). These new conditions require the adequate identification of the wind turbine dynamics, as this information is useful for improved design, control, and structural health monitoring. The main problems in accurately modeling wind turbine random vibration are due to the complexity of the structure and of the applied loads, and to the uncertain operating environment. The loads applied on the wind turbine can be separated into aerodynamic and gravitational (external) as well as structural (internal), which are related by aeroelastic coupling (Barlas and van Kuik, ). For wind turbines rotating at fixed speed, aerodynamic loads are characterized by periodic stochastic fluctuations (referred to as cyclo-stationary) (Wagner et al., 9; Rokenes and Kogstad, 9; Hansen et al., 6b; Li et al., ). Gravitational loads on the blade system are also periodic in nature and induce corresponding excitation to the rotor/blade system. In turn, these forces interact with the structural modes of other components, such as the tower and the drive train (Barlas and van Kuik, ; Murtagh and Basu, 7; Tcherniak et al., ; Allen et al., b).
For variable speed wind turbines, more complex effects, such as centrifugal stiffening of the blades or variation of the aeroelastic damping, occur (Hansen et al., 6b; Li et al., ). In summary, all these effects render wind turbine vibration stochastically time-dependent, that is, non-stationary. The modeling of the wind turbine random vibration, and consequently of part of the underlying dynamics, requires the use of appropriate techniques that can effectively capture and describe the non-stationary (including cyclo-stationary) nature of the random vibration (Murtagh and Basu, 7; Allen et al., b). In particular, it is vital to be able to track the non-stationary dynamics present in the random vibration, which include important information on the dynamic behavior of fixed and variable speed wind turbines (Antoni, 9; Skjoldan and Hansen, 9). To date, stationary time and frequency domain methods are almost exclusively used for modeling wind turbine random vibration (Hameed et al., 9; Ciang et al., 8; Carne and James, ). These methods may be useful for the analysis of some sub-components of the wind turbine, like gearings, bearings and blades under certain (typically standstill) conditions (Hameed et al., 9; Adams et al., ). The analysis is thus carried out when the turbine is at standstill or when the analyzed component is taken apart from the structure, eliminating the dynamic interactions and the environmental conditions found under normal operation (Yang and Allen, ). Therefore, it is not possible to consider the actual, in-operation dynamics under realistic loading conditions (Tcherniak et al., ; Dolinski and Krawczuk, 9). It is evident that more advanced techniques are necessary for modeling wind turbine vibration and the underlying dynamics under realistic conditions (Yang and Allen, ).
Non-stationary identification based on the time frequency plane has, within the context of damage detection, been recently used in order to cope with the temporal dependence and non-linearity of wind turbine structural dynamics (Dolinski and Krawczuk, 9; Fitzgerald et al., ). Indeed, time-frequency methods have been used to track the time-varying modal damping of wind turbines, with reportedly improved results compared to those obtained with standard stationary frequency domain analysis even when the wind turbine is at standstill (Murtagh and Basu, 7). Nevertheless, despite the improved tracking of the non-stationary dynamics, the use of non-parametric approaches and the overlooking of cyclo-stationary components call for more detailed analysis and improvements. Indeed, as pointed out in (Antoni, 9) the concept of cyclo-stationarity may lead to elegant and powerful solutions when the non-stationarity originates from periodic phenomena, such as those in rotating and reciprocating machines. Assuming that the mechanism responsible for the generation of the vibration is periodic, a family of transformations can be performed such that the cyclic effects present in the vibration may be removed. Such are the

Floquet and Coleman transformations (Skjoldan and Hansen, 9). These map the initial rotating frame of reference, where the vibration is measured, into the inertial frame of reference, where the vibration is stationary. After applying such a transformation to a measured vibration signal, the analysis can be carried out with stationary techniques (Tcherniak et al., ). Afterwards, the inverse transformation may be applied to determine the cyclic features. The Coleman transformation is valid when the cyclic structure of the vibration is due only to the rotation frequency, whereas the Floquet transformation is more general and can describe several harmonics (Skjoldan and Hansen, 9). For this reason, Floquet analysis is used in the general case of unbalanced rotors. This approach, combined with system identification techniques, has proven useful for determining the operational modes from the simulated response of fixed speed wind turbines (Tcherniak et al., ; Allen et al., b,a). In the case of variable speed wind turbines, a technique known as angular resampling can be used to remove the cyclic effects from the vibration response (Villa et al., ). Nonetheless, these approaches generally assume that the dynamics include purely cyclic components and that the excitation forces are white and Gaussian; assumptions that may not always hold, since the conditions are not ideal and wind effects are of a broader non-stationary nature.

Parametric modeling methods are useful for effective random vibration modeling. In (Hansen et al., 6a), stochastic subspace identification techniques are used to estimate the aeroelastic damping of a wind turbine. It is assumed that the vibration signal can be modeled by a linear time-invariant state space model with white Gaussian excitation.
Modal frequencies and damping ratios are extracted from the estimated model's system matrix, but proper estimation of damping requires long time histories and averaging techniques (Tcherniak et al., ). Moreover, the modeling approach neglects the time-varying dynamics. Nevertheless, it demonstrates the usefulness of parametric methods.

Non-stationary time-varying autoregressive (TAR) models have been used in the past for the identification of the dynamics of time-varying structures and systems; see the survey (Poulimenos and Fassois, 6, 9b). Parametric models are quite general and allow cyclo-stationary as well as broader non-stationary phenomena to be effectively modeled. In a previous study by the present authors, a preliminary modeling and analysis of the vibration of a wind turbine under normal operation was carried out using parametric TAR modeling (Avendano-Valencia and Fassois, ), in the form of Smoothness Priors TAR (SP-TAR) or Functional Series TAR (FS-TAR) models. The results demonstrated the usefulness of non-stationary TAR modeling in the context of wind turbine non-stationary random vibration.

In the present study the problem of modeling and analysis of the random vibration acquired on an operating (fixed speed) wind turbine is undertaken in a comprehensive way by employing both stationary and non-stationary methods. For this purpose a random vibration acceleration signal acquired at the tower top of a NEG Micon NM5/9 wind turbine in the fore-aft direction is employed. Special emphasis is placed on the proper modeling of the cyclo-stationary and broader non-stationary characteristics based on parametric non-stationary methods. The main aims and novel contributions of this study may be summarized as:

(i) The application and assessment of stationary non-parametric and parametric stochastic methods for the modeling and analysis of operating wind turbine random vibration.

(ii) The application and assessment of non-stationary stochastic modeling methods, both non-parametric and parametric. Non-parametric time-frequency and cyclic spectral analysis are specifically employed, while parametric modeling based on time-varying autoregressive models is employed for the first time in this context. Moreover, three important classes of such models are used: those with stochastically varying parameters, and those with deterministically varying parameters using fixed or adaptive basis functions.

(iii) The analysis based on the identified models, including the determination of cyclo-stationary, nearly cyclo-stationary, and broader non-stationary components.

(iv) The detailed comparison of the various approaches, the determination of their relative pros and cons, and the formulation of practical strategies and recommendations.

The rest of the paper is organized as follows: A summary of the stationary and non-stationary modeling techniques employed is presented in Section .3. The experimental description along with the signal acquisition

and preprocessing details are presented in Section .4. The random vibration modeling under normal operating conditions using conventional non-parametric and parametric stationary approaches is presented in Section .5, while that based on non-parametric and parametric non-stationary approaches (smoothed Wigner-Ville spectrum, spectral correlation and parametric TAR modeling) is presented in Section .6. The results are compared, analyzed, and discussed in Section .7. The main conclusions of the study are finally summarized in Section .8.

Characteristics of the dynamics of the vibration response of wind turbine structures

The vibration response of operational wind turbines is characterized by complex non-linear and time-dependent dynamics, which are influenced by the constantly varying operating and environmental conditions (Hansen et al., 6b; Skjoldan and Hansen, 9; Barlas and van Kuik, ; Wang et al., ; Allen et al., b; Li et al., ; Zhang and Huang, ; Gebhardt and Roccia, 4). Typically, the dynamics of wind turbines are analyzed via simplified physical models obtained by linearizing around an operating point, namely a constant rotor speed. The assumption of a constant rotor speed is not entirely realistic: even for fixed speed wind turbines the rotor speed can vary by a few percentage points, while in variable speed wind turbines these variations can be even larger (Hansen et al., 5; Muljadi and Butterfield, 999; Muljadi et al., 3). The assumption is nevertheless useful for understanding the dynamics of operating wind turbines. The linearized wind turbine model at a constant rotor speed is characterized by time-periodic behavior, described by the following differential equation (Hansen et al., 6b; Skjoldan and Hansen, 9; Allen et al., b):

$$M(t)\,\ddot{y}(t) + C(t)\,\dot{y}(t) + K(t)\,y(t) = f(t), \qquad M(t),\ C(t),\ K(t):\ \text{time-periodic with period } T \qquad (.)$$
where y(t) is a vector containing the generalized coordinates of the wind turbine (degrees of freedom DOF), and M(t), C(t), and K(t) are the mass, gyroscopic/damping and stiffness matrices, which are of time-periodic nature (i.e. M(t)= M(t+ T), where T is the period of the blade rotation). The time-periodic nature of the wind turbine structural dynamics is associated with the rotation of the blades of the wind turbine, which constantly modifies the gravitational and aerodynamic forces exerted over the blades (Skjoldan and Hansen, 9; Allen et al., b). To clarify the linearized model shown in Equation (.), consider a simplified wind turbine model with three flap-hinged blades and a rigid nacelle that can tilt and yaw on a rigid tower, as in (Skjoldan and Hansen, 9). For this simplified model, the degrees of freedom are: y(t) [ θ (t) θ (t) θ 3 (t) θ x (t) θ z (t) ] (.) where θ i (t), i=,,3 represents the flap-hinge angle of blade i, and θ x (t) and θ z (t) are the tilt and yaw angles of the nacelle, respectively. Furthermore, the respective mass matrix M, gyroscopic/damping matrix C, and stiffness matrix K are given by (Skjoldan and Hansen, 9): J b J b cosφ J b sinφ J b J b cosφ J b sinφ M(t)= J b J b cosφ 3 J b sinφ 3 J b cosφ J b cosφ J b cosφ 3 J x + 3 J b+ J J b sinφ J b sinφ J b sinφ 3 J z + 3 J b+ J (.3a) c ΩJ b sinφ ΩJ b cosψ c ΩJ b sinφ ΩJ b cosψ C(t)= c 3 ΩJ b sinφ 3 ΩJ b cosψ 3 c x 3ΩJ b 3ΩJ b c z (.3b) 4

33 . Stationary and non-stationary random vibration modeling and analysis for an operating wind turbine G + Ω J b G + Ω J b K(t)= G 3 + Ω J b Ω J b cosφ Ω J b cosφ Ω J b cosφ 3 G x Ω J b sinφ Ω J b sinφ Ω J b sinφ 3 G z (.3c) where J = 3m b L s, φ i = Ωt+ π(i )/3, J b, J x and J z are the moments of inertia of the blade about the root, of the nacelle/tower tilt, and of the nacelle/tower yaw, respectively, G b, G x and G z are the stiffness coefficients of the blade, nacelle/tower tilt and nacelle/tower yaw, respectively, c b, c x and c z represent the damping coefficients of the blade, nacelle/tower tilt and nacelle/tower yaw, respectively, m b is the blade mass, and L s is the distance of the tower top to the hub. Observing the structure of the mass matrix, it can be concluded that the components representing the interaction between the blades and the nacelle are characterized by time-dependent quantities, determined by the blade inertia multiplied by a sine/cosine component of the respective flap-hinge angle of the blade. This indicates that as each blade rotates, it transfers loads to the structure differently according to the angle relative to the coordinates of the nacelle. Similar conclusions can be drawn from the gyroscopic/damping and the stiffness matrices. The result of this, is that the structural dynamics of the wind turbine are characterized by cyclo-stationary effects. Time-periodic differential equations such as the one shown in the expression (.) can be solved via Coleman or Floquet transformations, which transform the time-periodic system equations into time-invariant ones (Hansen et al., 6b; Skjoldan and Hansen, 9). Afterwards, the obtained equations can be solved using modal domain representations to reduce the number of DOFs of the system and facilitate the computations (Hansen et al., 6b; Zhang and Huang, ). 
Through this type of analysis it is possible to show that the vibration response of the operational wind turbine can be described as a superposition of several vibration eigen-modes with corresponding eigen-frequencies, damping ratios and deflection shapes. The actual values of the modal frequencies and damping ratios depend on the size and construction of the particular wind turbine, and are a function of the angular speed of the rotor. In practice, only the lowest eigen-frequencies are considered, under the assumption that the excitation forces in combination with the structural damping do not excite eigen-modes associated with higher frequencies. Nonetheless, practical application has demonstrated that this type of approximation is in good agreement with real measurements (Hansen et al., 6b). Thus, following this line of thought, most of the commercially available aeroelastic simulation tools use from 6 up to DOFs to describe the wind turbine vibration response, which are depicted in Figure. (Øye, 996; Jonkman and Buhl, 5), including: 3-4 DOFs for the blades ( flapwise, - edgewise); up to 4 DOFs for the rotor shaft ( for torsion, for the hinges before the first bearing, and for pure rotation); DOF to describe the tilt stiffness of the nacelle; and about 3 DOFs to describe the torsion of the tower and its displacements in the fore-aft and lateral directions. The excitation forces, gathered in the vector f(t) are principally related to the aerodynamic loads applied by the incoming wind over the blades and tower, but also include the input from the control mechanisms of the wind turbine (Hansen et al., 6b; Barlas and van Kuik, ). In the case of wind turbines located offshore it is required to consider as well the wave, tidal and, if required, ice loads over the structure. Most of these excitation forces are stochastic in nature. 
The wind loads are also characterized by cyclo-stationary behavior due to the variable wind profiles encountered by the blades as they sweep along the height of the structure (Hansen et al., 6b; Rokenes and Kogstad, 9; Li et al., ). In turn, these excitation forces interact with the structural modes of other components, such as the tower and the drive train (Murtagh and Basu, 7; Barlas and van Kuik, ; Tcherniak et al., ; Allen et al., b). Therefore, the time-dependent dynamics of the operating wind turbine, together with the effects of the non-stationary excitation forces, make the vibration response of the wind turbine stochastically time-dependent, i.e. non-stationary.

The global structural dynamics of the wind turbine can be analyzed using linearized models at different operating points. As mentioned before, the modal frequencies of wind turbines vary with the speed of the incoming wind and the angular speed of the rotor (Hansen et al., 6b; Hansen, 7; Li et al., ). The variability of the eigen-frequencies leads to complex phenomena, including mode coupling and some instabilities of the aeroelastic type. Such instabilities are found when the wind turbine operates at high wind speeds or rotor angular speeds, and are evidenced by increased levels of vibration due to the reduction of the aeroelastic damping at certain modes of vibration (Chaviaropoulos, ; Lobitz, 4; Riziotis et al., 4; Hansen, 7; Larsen et al., 7). In this sense, the variability of the incoming wind, as well as other effects such as temperature and humidity, largely affect the dynamics of the operational wind turbine. These effects are difficult to measure and to include within the modeling method. As a consequence, they become uncertainties that can harm the effectiveness of the time series model for the purpose of SHM if no special considerations are taken.

Figure .: Main degrees of freedom of a wind turbine (pitch, lateral, fore-aft, yaw, roll, flapwise, azimuth, tilt, edgewise).

.3 Concise overview of stationary and non stationary vibration modeling

A concise overview of the employed stationary and non-stationary stochastic modeling methods is presently provided. Sections .3. and .3. present brief overviews of non-parametric and parametric stationary methods, respectively; the latter are based on time domain autoregressive (AR) and autoregressive moving average (ARMA) representations. Section .3.3 summarizes the non-parametric non-stationary methods used, including the spectrogram, the smoothed Wigner-Ville spectrum (SWVS), and the two dimensional power spectral density (2D-PSD). In Section .3.3., parametric non-stationary methods based on various forms of Time-Varying AutoRegressive (TAR) models are presented. Stochastic Smoothness Priors (SP) and Functional Series (FS) representations are considered for the temporal evolution of the model parameters. Further reading on these topics may be found in Poulimenos & Fassois (Poulimenos and Fassois, 6), Fassois & Sakellariou (Fassois and Sakellariou, 9), Flandrin (Flandrin, 989), Gardner (Gardner, 97, 986) and Antoni (Antoni, 7).

.3. Stationary non parametric modeling

Let $y[t]$ be a zero-mean stationary Gaussian random vibration defined over the discrete time $t \in \mathbb{Z}$.
The signal can be fully characterized by the autocovariance function (ACF), or the latter's Fourier transform, which is the power spectral density (PSD) (Manolakis et al., 5):

$$\gamma_{yy}[\tau] = E\{y[t]\; y[t-\tau]\}, \qquad S_{yy}(\omega) = \mathcal{F}\{\gamma_{yy}[\tau]\} = \sum_{\tau=-\infty}^{\infty} \gamma_{yy}[\tau]\; e^{-j\omega\tau T_s} \qquad (.4)$$

where $E\{\cdot\}$ is the expectation operator, $\mathcal{F}\{\cdot\}$ the Fourier transform operator, $\tau \in \mathbb{Z}$ the time lag, $j$ the imaginary unit, $\omega$ the frequency in rad/s and $T_s$ the sampling period. The PSD is presently estimated using the Welch method (Manolakis et al., 5, p. 7).

.3. Stationary parametric modeling

An autoregressive moving average (ARMA) model for the random vibration $y[t]$ is of the form (Manolakis et al., 5, p. 79):

$$y[t] = \sum_{i=1}^{n_a} a_i\; y[t-i] + \sum_{i=1}^{n_c} c_i\; w[t-i] + w[t], \qquad w[t] \sim \mathrm{NID}(0, \sigma_w^2) \qquad (.5)$$

where $a_i$, $i = 1, \ldots, n_a$ and $c_i$, $i = 1, \ldots, n_c$ are the AR and MA parameters, respectively, $n_a$ and $n_c$ the corresponding AR and MA orders, and $w[t]$ is an innovations (residual) signal, that is a Normally Independently Distributed (NID) time series with zero mean and variance $\sigma_w^2$. The AR model is a special case of the broader ARMA model with $n_c = 0$. Parametric AR/ARMA modeling comprises the selection of proper model orders and the estimation of the model parameters. Model order selection is carried out by a discrete search for the model that minimizes a proper criterion, such as the Residual Sum of Squares over the Series Sum of Squares (RSS/SSS) or the model's Bayesian Information Criterion (BIC) (Fassois and Sakellariou, 9; Manolakis et al., 5).

.3.3 Non stationary non parametric modeling

A non-stationary random vibration is characterized by time-dependent statistical moments. The mean (first moment) and autocovariance (second moment) are then both functions of time, as follows (Poulimenos and Fassois, 6):

$$\mu_y[t] = E\{y[t]\}, \qquad \gamma_{yy}[t_1, t_2] = E\{(y[t_1] - \mu_y[t_1])\,(y[t_2] - \mu_y[t_2])\} \qquad (.6)$$

The autocovariance function can also be thought of as being of local nature, relating values of the signal around time instant $t$.
In that case the selections $t_1 = t - \tau/2$ and $t_2 = t + \tau/2$ are often used in Equation (.6). Another definition of a local autocovariance function uses $t_1 = t$ and $t_2 = t - \tau$, which is more appealing for discrete implementation. The highest value of the instantaneous ACF occurs for $t_1 = t_2 = t$ (or equivalently when $\tau = 0$); this provides the instantaneous variance $\gamma_{yy}[t,t]$ of the signal.

Time-frequency domain representations are available through the Time-Varying PSD (TV-PSD). The spectrogram is one of the most basic and widely used estimators of a TV-PSD (Poulimenos and Fassois, 6) (here in analog time form):

$$S_{spg}(\omega, t) = \left| \int_{-\infty}^{\infty} w(t - t')\; y(t')\; e^{-j\omega t'}\, dt' \right|^2 \qquad (.7)$$

where $w(\cdot)$ is a proper window function. The Wigner-Ville time-frequency (energy) spectrum is defined as the Fourier transform of the local instantaneous ACF with respect to the variable $\tau$ (Cohen, 995, p. 4) (again in analog time form):

$$S_{WV}(\omega, t) = \mathcal{F}\{\gamma(t - \tau/2,\, t + \tau/2)\}_{\tau} = \int_{-\infty}^{\infty} \gamma(t + \tau/2,\, t - \tau/2)\; e^{-j\omega\tau}\, d\tau \qquad (.8)$$

This definition of the TV-PSD has several desirable properties, such as preservation of energy and preservation of the time and frequency marginals. Nonetheless, it suffers from the potential introduction of negative

components as well as interference terms, known as cross-terms (Cohen, 995). To mitigate these problems, a kernel function can be introduced into the definition, such that the negative terms and cross-terms are reduced (Cohen, 995).

In analogy with the stationary case, where the PSD is the Fourier transform of the ACF, and since in the non-stationary case the instantaneous ACF is a function of two variables, a 2D Fourier transform of the ACF with two frequency variables $\alpha$ and $\omega$ may be employed (Antoni, 7) (in analog time form):

$$\Gamma[k,n] = \mathcal{F}\{\gamma(t - \tau/2,\, t + \tau/2)\}_{t,\tau} = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \gamma(t - \tau/2,\, t + \tau/2)\; e^{-j(\alpha t + \omega\tau)}\, d\tau\, dt \qquad (.9)$$

where $\omega$ denotes the conventional spectral frequency (related to the frequencies forming the signal waveform, or first order periodicities) and $\alpha$ denotes the cyclic frequency (representing the repeated cyclic evolution of the waveforms, or higher order periodicities), both measured in radians per time unit. The quantity $\Gamma[k,n]$ is known as the spectral correlation (Gardner, 986; Antoni, 7) and displays the correlation between spectral components of $y[t]$ at frequencies separated by the amount $\alpha$, namely $\omega + \alpha/2$ and $\omega - \alpha/2$ (Gardner, 986). Compared with conventional spectral analysis, the spectral correlation thus displays an additional dimension related to the non-stationarity of the signal. This quantity is also referred to as the 2D power spectrum (Gardner, 97; García et al., ). The spectral correlation is also related to the Wigner-Ville spectrum by the inverse Fourier transform

$$S_{WV}(\omega, t) = \mathcal{F}^{-1}\{\Gamma[k,n]\}_{\alpha}$$

which is carried out over the dimension $\alpha$. The instantaneous ACF, the Wigner-Ville spectrum and the spectral correlation form a triplet whose members are related via the Fourier transform.
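The role of the cyclic frequency $\alpha$ can be illustrated with a deliberately simplified sketch. It does not estimate the full spectral correlation; it only takes the Fourier transform over time of the instantaneous power $y^2[t]$, that is the lag $\tau = 0$ slice of the instantaneous ACF. The test signal (white noise with a periodic envelope), the function name `cyclic_variance_spectrum` and all numerical values are assumptions made for illustration.

```python
import numpy as np

def cyclic_variance_spectrum(y):
    """DFT over time of the instantaneous power y[t]**2 (lag tau = 0 only).

    Returns the cyclic frequency grid (cycles/sample) and the magnitude of the
    lag-zero cyclic autocovariance estimate at each cyclic frequency.
    """
    y = np.asarray(y, dtype=float)
    power = y**2
    power = power - power.mean()           # drop the alpha = 0 (stationary) part
    coeffs = np.fft.rfft(power) / len(y)   # time average of power * exp(-j*2*pi*alpha*t)
    alphas = np.fft.rfftfreq(len(y))       # cyclic frequencies in cycles per sample
    return alphas, np.abs(coeffs)

# Hypothetical test signal: white noise whose envelope repeats 64 times in the record.
rng = np.random.default_rng(0)
n_samples = 4096
cycle_bin = 64
alpha_true = cycle_bin / n_samples
t = np.arange(n_samples)
envelope = 1.0 + 0.8 * np.cos(2 * np.pi * alpha_true * t)
y = envelope * rng.standard_normal(n_samples)

alphas, magnitude = cyclic_variance_spectrum(y)
alpha_hat = alphas[np.argmax(magnitude)]   # dominant cyclic frequency
print(f"true alpha = {alpha_true:.6f}, estimated alpha = {alpha_hat:.6f} cycles/sample")
```

The ordinary PSD of this signal is flat, so the periodicity is invisible to stationary analysis; it only shows up along the cyclic frequency axis, which is the additional dimension the spectral correlation provides.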
The instantaneous ACF is the time representation, describing the linear dependence of the signal $y[t]$ between two different time instants; the Wigner-Ville spectrum describes the evolution of the frequency content of the signal over time; and the spectral correlation describes the correlation of the spectral components of $y[t]$ at frequency $\omega$ with spectral increments $\pm\alpha$. Therefore, the spectral correlation is capable of detecting higher order periodicities hidden in the non-stationary signal. An extended discussion on the definition and estimation of these spectral quantities is provided in (Antoni, 9, 7).

Non-stationary parametric modeling

A non-stationary parametric TAR($n_a$) model, with $n_a$ designating its AutoRegressive (AR) order, is defined as (Poulimenos and Fassois, 6):

$$y[t] = \sum_{i=1}^{n_a} a_i[t]\; y[t-i] + w[t], \qquad w[t] \sim \mathrm{NID}(0, \sigma_w^2[t]) \qquad (.)$$

where $w[t]$ is an unobservable uncorrelated (white) non-stationary innovations (residual) signal characterized by zero mean and time-varying variance $\sigma_w^2[t]$, and $a_i[t]$ are the model's time-varying AR parameters. Unlike the conventional AR models presented in Section .3., the parameters of a TAR model depend upon time, and the innovations signal is itself non-stationary.

A TAR model is fully specified by precisely modeling the form of temporal evolution of its parameters $a_i[t]$. Two particular approaches in this direction are the smoothness priors (SP) approach (resulting in SP-TAR models) and the functional series (FS) approach (resulting in FS-TAR models).

In the SP-TAR approach the temporal evolution of the model parameters is subject to stochastic smoothness constraints. The smoothness priors constraints constitute difference equations of the form (Kitagawa and Gersch, 985):

$$\Delta^{\kappa} a_i[t] = v_i[t] \qquad (.)$$

where $\kappa$ designates the difference equation order, $\Delta^{\kappa} = (1 - B)^{\kappa}$ is the $\kappa$-th order difference operator with $B$ representing the back shift operator, and $v_i[t]$ are zero-mean, serially and mutually uncorrelated Gaussian signals with variance $\sigma_{v_i}^2$.

SP-TAR models are estimated via Kalman filter/smoother based recursions (Poulimenos and Fassois, 6; Kitagawa and Gersch, 985).
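The Kalman recursion behind SP-TAR estimation can be sketched as follows for the simplest case. This is a minimal illustration under stated assumptions, not the estimator used in the study: the smoothness order is fixed at $\kappa = 1$ (random-walk parameters), only the forward filter is run (no backward smoother), the innovations variance is treated as known and constant, the AR sign convention is $y[t] = \sum_i a_i[t]\,y[t-i] + w[t]$, and `var_ratio` is taken to mean $\sigma_v^2/\sigma_w^2$. The function name and the simulated TAR(1) signal are hypothetical.

```python
import numpy as np

def sp_tar_kalman(y, order, var_ratio, sigma_w2=1.0, beta=1e4):
    """Forward Kalman filter for a TAR(order) model with kappa = 1 smoothness priors.

    State equation : a[t] = a[t-1] + v[t],  v[t] ~ N(0, var_ratio * sigma_w2 * I)
    Observation    : y[t] = sum_i a_i[t] * y[t-i] + w[t],  w[t] ~ N(0, sigma_w2)
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    a = np.zeros(order)                      # initial parameter estimate set to zero
    P = beta * np.eye(order)                 # large initial estimation error covariance
    Q = var_ratio * sigma_w2 * np.eye(order)
    a_hist = np.zeros((n, order))
    resid = np.zeros(n)
    for t in range(order, n):
        h = y[t - order:t][::-1]             # regressor [y[t-1], ..., y[t-order]]
        P = P + Q                            # time update under the random-walk prior
        e = y[t] - h @ a                     # one-step-ahead prediction error
        s = h @ P @ h + sigma_w2             # prediction error variance
        k = P @ h / s                        # Kalman gain
        a = a + k * e                        # measurement update of the parameters
        P = P - np.outer(k, h @ P)
        a_hist[t] = a
        resid[t] = e
    return a_hist, resid

# Hypothetical TAR(1) signal with a slowly varying parameter, for illustration only.
rng = np.random.default_rng(1)
n = 4000
a_true = 0.5 + 0.3 * np.sin(2 * np.pi * np.arange(n) / n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = a_true[t] * y[t - 1] + rng.standard_normal()

a_hat, resid = sp_tar_kalman(y, order=1, var_ratio=1e-4)
tracking_error = np.mean(np.abs(a_hat[500:, 0] - a_true[500:]))
print(f"mean absolute tracking error after the transient: {tracking_error:.3f}")
```

A smaller `var_ratio` yields smoother but slower parameter trajectories, while a larger one tracks faster at the cost of noisier estimates; this is the trade-off that the structure selection has to settle.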

The specification of the SP-TAR model structure requires the selection of the model order $n_a$, the smoothness constraint order $\kappa$, and the variance trade-off parameter $\nu$, defined as the ratio between the variance of the innovations $\sigma_w^2$ and $\sigma_{v_i}^2$. The selection of $n_a$, $\kappa$ and $\nu$ is carried out by maximizing the likelihood function or by minimizing the Bayesian Information Criterion (BIC) (Kitagawa and Gersch, 985). Furthermore, to reduce the effect of the unknown initial conditions in the Kalman filtering, the forward filtering/backward smoothing can first be run using the initial values $a_i[0] = 0$, $i = 1, \ldots, n_a$ and $P_a[0] = \beta I_{n_a}$, where $P_a[0]$ is the covariance matrix of the parameter estimation error at time zero, $\beta$ is a large constant, and $I_{n_a}$ the $n_a \times n_a$ identity matrix.

In the FS-TAR approach, the time-dependent AR parameters and innovations (residual) variance are expanded on specific functional subspaces. Each functional subspace consists of a set of linearly independent basis functions selected from a suitable family (such as a polynomial family, a trigonometric family, and so forth). In the present study, the estimation of FS-TAR models is based on two different approaches:

Classical FS-TAR modeling with trigonometric basis functions. The following form of parameter evolution is used, where the fundamental frequency $\alpha$ (in Hz) and its first $p_a/2$ harmonics ($p_a$ assumed to be even) are employed as follows (Avendano-Valencia and Fassois, ) ($t$ designating normalized discrete time):

$$a_i[t] = a_{i,0} + \sum_{k=1}^{p_a/2} \left\{ a_{i,2k-1} \cos[2\pi k \alpha\, t / f_s] + a_{i,2k} \sin[2\pi k \alpha\, t / f_s] \right\} \qquad (.)$$

where $p_a$ designates the basis dimensionality, $a_{i,k}$, $k = 0, 1, \ldots, p_a$ are the coefficients of projection of the model, and $f_s$ (in Hz) is the sampling frequency. The estimation of the FS-TAR model is reduced to the estimation of the projection coefficient vector $\theta = [a_{1,0} \ \ldots \ a_{n_a,p_a}]^T \in \mathbb{R}^{n_a (p_a + 1)}$, which can be carried out using ordinary or weighted least squares (Poulimenos and Fassois, 6). The procedure for selecting the model order $n_a$, the basis functions, and the dimensionality $p_a$ may be based on the minimization of a proper criterion such as the RSS/SSS or the BIC (Poulimenos and Fassois, 6).

Adaptable FS-TAR (AFS-TAR) modeling with trigonometric basis functions. This is a recently introduced method that achieves simultaneous estimation of the basis function characteristics and the corresponding coefficients of projection based on a Separable Nonlinear Least Squares scheme (Spiridonakos and Fassois, 3, 4b). Specifically, a set of $p_a$ decaying trigonometric basis functions is considered, with a priori unknown rates of decay $\rho_k$ and frequencies $\omega_k$ (in rad/s), with the model AR parameters being of the form ($t$ designating normalized discrete time):

$$a_i[t] = a_{i,0}\, \rho_0^{t/f_s} + \sum_{k=1}^{p_a/2} \left\{ a_{i,2k-1}\, \rho_k^{t/f_s} \cos[\omega_k t / f_s] + a_{i,2k}\, \rho_k^{t/f_s} \sin[\omega_k t / f_s] \right\} \qquad (.3)$$

The estimation problem is thus reduced to the determination of the time-invariant parameter vectors $\eta = [\rho_0 \ \ldots \ \rho_{p_a/2} \ \ \omega_1 \ \ldots \ \omega_{p_a/2}]^T \in \mathbb{R}^{p_a + 1}$ and $\theta = [a_{1,0} \ \ldots \ a_{n_a,p_a}]^T \in \mathbb{R}^{n_a (p_a + 1)}$. Due to the non-quadratic dependence of the estimation criterion on the parameter vector $\eta$, an iterative (non-linear) optimization technique is employed. To avoid potential convergence problems, initial parameter values are obtained by a Particle Swarm Optimization (PSO) based algorithm (Perez and Behdinan, 7).

.4 The experimental set up and the vibration signal

The random vibration signal employed in the study was acquired at a wind farm located on the Atavyros Mountain on the island of Rhodes, Greece. The wind farm, owned by IWECO M.V. S.A., is a fully operational power facility consisting of 3 NEG Micon NM5/9, 9 kW Horizontal Axis Wind Turbines. Each turbine, including the generator unit, weighs approximately 5 kg, and produces electricity at 5 Hz and 69 V.
The generator is driven by three stall control blades, each over 5 m long, with a 3.6 m long blade tip operating as an aerodynamic brake. The turbine operates with wind speeds in the 3 to 5 m/s range, with rated power at 4 m/s. The tubular steel towers are approximately 49 m high; the diameter is approximately 3.5 m and .75 m at the base and top, respectively. The towers are built from two sections, each of about 3 5 kg. An internal ladder allows access to the nacelle for maintenance purposes. Three decks are available along the stairway: the first at 3 m over the base, the second at m over the base, and the third at 48 m, right under the yaw mechanism.

.4. The experimental set up

A set of accelerometers is installed along the tower, measuring the vibration response along the fore-aft (x) and lateral (y) directions, as depicted in Figure .. The sensors are placed at four distinct levels:

Sensor set α: Base of the tower, m. (α ) x direction, (α ) y direction.
Sensor set β: First deck, 3 m over the base. (β ) x direction, (β ) y direction.
Sensor set γ: Second deck, m over the base. (γ ) x direction, (γ ) y direction.
Sensor set δ: Third deck, 48 m over the base. (δ ) x direction, (δ ) y direction.

Figure .: Wind turbine geometry and location of the sensors along the tower.

.4. The vibration signal modeled

The signal acquired at position δ along the fore-aft direction (x axis) is presently used. The acquisition and preprocessing of the signal are described in Table ., while a time plot of a portion of it is depicted in Figure .3. The wind speed and the average rotor and generator speeds are monitored via the machine's SCADA system, although they are not used in the present analysis. Notice that the signal length is extended to s for non-parametric modeling, compared to only 3 s used for parametric modeling. This is necessary because non-parametric modeling requires longer data records to achieve a corresponding level of accuracy.

Figure .4 depicts the frequencies associated with blade rotation. Three main frequencies are defined: $\alpha_1 = 2\pi/T_1$, related to the frequency of blade 1 going back to its original position (complete rotation); $\alpha_2 = 2\pi/T_2$, related to the frequency of blade 1 going to the position of blade 3 (240° rotation); and $\alpha_3 = 2\pi/T_3$, related to the frequency of blade 1 going to the position of blade 2 (120° rotation). If the rotor speed is constant, the frequencies $\alpha_2$ and $\alpha_3$ are the second and third harmonics of $\alpha_1$, that is $\alpha_2 = 2\alpha_1$ and $\alpha_3 = 3\alpha_1$.
According to Table ., the rotor frequency is, on average, .37 Hz; therefore, the first cyclic mode frequency is $\alpha_1$ = .37 Hz and the first cyclic period is $T_1$ = .69 s. The other cyclic frequencies and periods are summarized in Table ..
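The arithmetic that links the average rotor speed to the cyclic frequencies and periods can be sketched as below, under the constant rotor speed assumption stated above. The function name `rotor_harmonics` and the rotor speed of 24 rpm are hypothetical and chosen only to make the example self-contained; they are not the measured values.

```python
def rotor_harmonics(rotor_speed_rpm, n_harmonics=6):
    """Return (frequency in Hz, period in s) for the first harmonics of the rotor frequency."""
    alpha_1 = rotor_speed_rpm / 60.0            # rpm -> revolutions per second (Hz)
    return [(k * alpha_1, 1.0 / (k * alpha_1)) for k in range(1, n_harmonics + 1)]

# Hypothetical rotor speed, for illustration only (not taken from the measurements).
harmonics = rotor_harmonics(rotor_speed_rpm=24.0, n_harmonics=3)
for k, (freq_hz, period_s) in enumerate(harmonics, start=1):
    print(f"alpha_{k} = {freq_hz:.3f} Hz, T_{k} = {period_s:.3f} s")
```

With a varying rotor speed the harmonics are no longer exact multiples of a fixed frequency, which is one reason why nearly cyclo-stationary components are considered alongside strictly cyclo-stationary ones.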

Table .: Details of acquisition and pre processing of the wind turbine vibration response signal.

Acquisition
  Sensor type: Piezoelectric accelerometers, voltage sensitivity of .9 mV/g, frequency range .7 - kHz.
  Sensor location: Tower top (48 m), fore-aft direction.
  Initial sampling frequency: .56 kHz.
  Signal length: min 536 samples.
Pre processing
  Filtering: Chebyshev II lowpass filter, order 5, cutoff frequency 8 Hz. Forward and backward application (Matlab command filtfilt).
  Re-sampling: f s = 6 Hz.
  Normalization: Sample trend removed by 3rd order polynomial approximation, normalization of the signal power.
  Analyzed signal length: 3 s 4 8 samples.
Operating conditions
  Average wind speed: 8.6 m/s.
  Average rotor speed: .3 rpm, .37 Hz.
  Average generator speed: 54 rpm, 5.67 Hz.
Important frequencies and periods
  Frequencies [Hz]: β = [ ].
  Periods [s]: T = [ ].

Figure .3: The pre-processed random vibration acceleration signal (normalized acceleration versus time in s): 3 s portion (top); two 3 s portions (bottom).

Figure .4: Motions and average frequencies associated with blade rotation.

.5 Stationary modeling

.5. Non parametric modeling

The non-parametric Welch-based PSD estimate is presented in Figure .5 (details in Table .3).
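For readers who want to see what a Welch-based PSD estimate involves, the sketch below averages Hamming-windowed, overlapping periodograms using NumPy only. It is a simplified stand-in for the MATLAB routine referenced in the study, not a reproduction of it; the function name `welch_psd`, the sampling frequency, the window length, the overlap and the test signal are all assumptions made for illustration.

```python
import numpy as np

def welch_psd(y, fs, n_win, overlap=0.5):
    """Welch PSD estimate: average of Hamming-windowed, overlapping periodograms.

    n_win is assumed to be even, so that the last bin is the Nyquist frequency.
    """
    y = np.asarray(y, dtype=float)
    step = max(1, int(round(n_win * (1.0 - overlap))))
    window = np.hamming(n_win)
    scale = fs * np.sum(window**2)               # normalization to a density (units^2/Hz)
    starts = range(0, len(y) - n_win + 1, step)
    psd = np.zeros(n_win // 2 + 1)
    for s in starts:
        segment = y[s:s + n_win]
        segment = segment - segment.mean()       # remove the local mean of each segment
        psd += np.abs(np.fft.rfft(window * segment))**2 / scale
    psd /= len(starts)
    psd[1:-1] *= 2.0                             # fold negative frequencies (one-sided PSD)
    freqs = np.fft.rfftfreq(n_win, d=1.0 / fs)
    return freqs, psd

# Hypothetical test signal: a 5 Hz sinusoid buried in unit-variance white noise.
rng = np.random.default_rng(2)
fs = 64.0
t = np.arange(8192) / fs
y = np.sin(2 * np.pi * 5.0 * t) + rng.standard_normal(t.size)

freqs, psd = welch_psd(y, fs, n_win=512, overlap=0.5)
f_peak = freqs[np.argmax(psd)]
total_power = np.sum(psd) * (freqs[1] - freqs[0])   # should be close to the signal variance
print(f"peak at {f_peak:.3f} Hz, integrated power = {total_power:.2f}")
```

Longer windows sharpen the frequency resolution but leave fewer segments to average, so the variance of the estimate grows; a high overlap partly compensates by increasing the number of segments.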

Table .: The first six frequencies and periods associated with blade rotation.

                  α1    α2    α3    α4    α5    α6
  Frequency [Hz]
  Period [s]

Table .3: Details on the non-parametric and parametric stationary modeling methods.

  Method: Implementation details
  Welch PSD: Window length Nwin = 6 samples, overlap Nover = 99.5%, Hamming window (MATLAB command hamming(nwin)).
  AR modeling: Model order search: find best BIC for n a = ,..., . Estimation: Burg's method (MATLAB command arburg(y,na)).
  ARMA modeling: Model order search: find best ARMA(n a, n a) model for n a = ,...,64; then find best ARMA(n a, n b) for n b = ,...,n a. Estimation: Iterative search algorithm minimizing a robustified version of the RSS (MATLAB command armax, System Identification Toolbox).

Figure .5: Stationary Welch-based PSD estimate (PSD in dB versus frequency in Hz). Prominent modes are designated by arrows.

.5. Parametric modeling

The structure of the parametric models is selected based on both the RSS/SSS and BIC criteria (Fassois, ); orders to for AR models, and to 64 for ARMA models, are employed. Estimation details for AR and ARMA modeling are summarized in Table .3. The resulting model order selection criteria are shown in Figures .6(a) and .6(b) for AR and ARMA models, respectively. The frequency stabilization plots for AR and ARMA models with increasing orders are shown in Figures .7(a) and .7(b), respectively. The estimated parametric PSDs corresponding to each model order are plotted as an image in the background. Blue colors indicate low power, whereas red colors represent high spectral power. The natural frequencies derived from the parametric estimates are indicated with dots, and their corresponding damping ratios via a gray scale (see rightmost column). The non-parametric estimate of the PSD is plotted at the top.
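The order search and the extraction of natural frequencies and damping ratios described above can be sketched as follows. Several choices here are assumptions rather than the study's procedure: ordinary least squares replaces Burg's method, the BIC is taken in the common form $N \ln(\mathrm{RSS}/N) + n_a \ln N$, the AR sign convention is $y[t] = \sum_i a_i\,y[t-i] + w[t]$, and the simulated AR(2) signal and all function names are hypothetical.

```python
import numpy as np

def fit_ar_least_squares(y, order, start=None):
    """Least-squares AR(order) fit; returns (coefficients, residual sum of squares)."""
    y = np.asarray(y, dtype=float)
    start = order if start is None else start
    phi = np.array([y[t - order:t][::-1] for t in range(start, len(y))])
    target = y[start:]
    a, *_ = np.linalg.lstsq(phi, target, rcond=None)
    rss = float(np.sum((target - phi @ a) ** 2))
    return a, rss

def select_ar_order(y, max_order):
    """Return {order: (RSS/SSS in percent, BIC)}, all fits using the same samples."""
    y = np.asarray(y, dtype=float)
    n_eff = len(y) - max_order
    sss = float(np.sum(y[max_order:] ** 2))
    table = {}
    for order in range(1, max_order + 1):
        _, rss = fit_ar_least_squares(y, order, start=max_order)
        bic = n_eff * np.log(rss / n_eff) + order * np.log(n_eff)
        table[order] = (100.0 * rss / sss, bic)
    return table

def ar_modal_parameters(a, fs):
    """Natural frequencies (Hz) and damping ratios from the complex AR poles."""
    poles = np.roots(np.concatenate(([1.0], -np.asarray(a, dtype=float))))
    poles = poles[poles.imag > 0]                # keep one pole of each conjugate pair
    s = np.log(poles) * fs                       # map discrete poles to continuous time
    return np.abs(s) / (2 * np.pi), -s.real / np.abs(s)

# Hypothetical AR(2) signal, for illustration only.
rng = np.random.default_rng(3)
n = 4000
y = np.zeros(n)
for t in range(2, n):
    y[t] = 1.5 * y[t - 1] - 0.8 * y[t - 2] + rng.standard_normal()

table = select_ar_order(y, max_order=6)
best_order = min(table, key=lambda k: table[k][1])
a_hat, _ = fit_ar_least_squares(y, 2)
fn_true, zeta_true = ar_modal_parameters([1.5, -0.8], fs=1.0)
print(f"BIC-selected order: {best_order}, AR(2) estimate: {np.round(a_hat, 3)}")
```

Repeating the modal extraction for every candidate order and plotting the resulting frequencies against the order gives a frequency stabilization plot of the kind referred to above.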
The BIC and RSS/SSS curves show a fast decrease until model order 6 65 in the AR case, and 5 3 in the ARMA case. For orders higher than these, the criteria present a sustained but slower decrease (see Figure.6(a) and top plot in Figure.6(b)). The stabilization diagrams and PSD convergence plots provide a better picture of the AR and ARMA modeling capabilities for increasing model orders. In the AR case, the stabilization diagram and PSD convergence plots shown in Figure.7(a) demonstrate that the model approximates well the main frequency peaks observed in the non-parametric Welch-based estimate after order 9. In the ARMA case, similar stabilization of the frequency content is achieved after model order 55 (see Figure.7(b)). The AR order is, for all models, selected by taking into account the normalized RSS and BIC curves, along with the stabilization and PSD convergence plots. This yields selected AR orders of n a = 9 and n a = 58 in the pure AR and ARMA modeling cases, respectively. These orders are indicated via arrows in the RSS/SSS and BIC curves and via dashed red lines in the stabilization plots. For the ARMA model, the MA order can be further decreased by evaluating the BIC and normalized RSS for ARMA(n a,n) models, with n =,...,n a. The

resulting curves are shown in the bottom plot of Figure.6(b). It is found that after reducing the MA order to n c = 48 the performance remains more or less the same. The finally selected model structures are thus AR(9) and ARMA(58, 48). A summary of the BIC, normalized RSS, and Samples Per Parameter (SPP) achieved by each final model is presented in Table.4.

Figure.6: Stationary AR and ARMA model order selection via the RSS/SSS and BIC criteria: (a) order n selection for AR(n) models (n =,...,); (b) order n selection for ARMA(n,n) models (n =,...,64) (top), and order n selection for ARMA(n a,n) models (n a = 58, n=,...,58) (bottom).

Table.4: Summary of estimated parametric model structures, their achieved RSS/SSS, BIC, SPP, and validation results for Gaussianity of the residuals (Kolmogorov-Smirnov test) and sign test (applicable only in the case of TAR models). Symbol description: model does not satisfy the validation test; model satisfies the validation test; test not applicable or unavailable for the referenced models.

Columns: Model; Structure; Performance (RSS/SSS [%], BIC, SPP); Validation (KS-test, Sign test).
AR(9): n a =
ARMA(58,48): n a = 58, n c =
SP-TAR(6): n a = 6, κ =
AFS-TAR(6) 3: n a = 6, p a =
SP-TAR(34): n a = 34, κ =
FS-TAR(34): n a = 34, p a =
AFS-TAR(34): n a = 34, p a =
Results from (Avendano-Valencia and Fassois, ).
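The AR part of the order selection procedure described above (Burg estimation for each candidate order, followed by a BIC comparison) can be illustrated with the following sketch. It is a plain NumPy re-implementation written for illustration, not the MATLAB arburg routine used in the thesis. The function names `burg` and `select_ar_order_bic` are hypothetical, and the BIC form N ln(σ²) + n ln(N) is an assumed, commonly used variant that may differ from the exact definition adopted in the thesis.

```python
import numpy as np


def burg(y, order):
    """Burg estimate of an AR model y[t] + a[1] y[t-1] + ... + a[n] y[t-n] = e[t].

    Returns the coefficient vector a (with a[0] = 1) and the innovations variance.
    """
    y = np.asarray(y, dtype=float)
    f = y.copy()              # forward prediction errors
    b = y.copy()              # backward prediction errors
    a = np.array([1.0])
    var = np.mean(y ** 2)
    for _ in range(order):
        ff, bb = f[1:], b[:-1]
        # Reflection coefficient minimizing forward plus backward error power.
        k = -2.0 * np.dot(ff, bb) / (np.dot(ff, ff) + np.dot(bb, bb))
        # Levinson-type update of the AR polynomial.
        a = np.concatenate([a, [0.0]])
        a = a + k * a[::-1]
        f, b = ff + k * bb, bb + k * ff
        var *= 1.0 - k ** 2
    return a, var


def select_ar_order_bic(y, max_order):
    """Return the AR order in 1..max_order with the lowest BIC, plus the BIC curve."""
    n_samples = len(y)
    bic = np.empty(max_order)
    for n in range(1, max_order + 1):
        _, var = burg(y, n)
        # Assumed BIC form: fit term plus a penalty growing with the order.
        bic[n - 1] = n_samples * np.log(var) + n * np.log(n_samples)
    return int(np.argmin(bic)) + 1, bic
```

In practice, and as done in the text, the BIC minimum would be weighed against the RSS/SSS curve and the stabilization diagrams rather than used alone.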

Figure.7: Stationary parametric PSD estimates for increasing model orders and frequency stabilization diagram (the horizontal dashed red line indicates the selected order) compared with the Welch based PSD estimate (on top of each subplot). (a) AR frequency stabilization plot for increasing AR order; (b) ARMA frequency stabilization plot for increasing AR/MA orders (ARMA(n, n) models are used). The value of each estimated damping ratio is indicated according to the gray scale code on the right of each subplot.

The finally estimated models are then validated. Model validation includes formally checking the Gaussianity and uncorrelatedness of the estimated model residuals (Soderstrom and Stoica, 989, Ch. ). Figure.8 depicts

the sample residual ACF, their normal probability plot, and their histogram for both the selected AR and ARMA models. Both models seem to fail the validation tests. A detailed view of the sample ACF suggests that some correlation structure remains at higher lags. This behavior may be due to various limiting factors, including the linearity and stationarity of the employed models.

Figure.8: Validation of the estimated stationary models: (a) AR(9) and (b) ARMA(58,48) models. (Residual normalized sample ACF with significance limits at the 5% level, residual normal probability plot, and residual histogram.)

.5.3 Model based analysis

The obtained non-parametric and parametric stationary PSD estimates are all depicted in Figure.9. The most prominent resonant frequencies according to the non-parametric PSD estimate are indicated by vertical dashed red lines and arrows. Table.5 summarizes the natural frequencies, damping ratios and modal dispersions (indicating the contribution of each mode to the energy of the measured vibration signal (Fassois, )) obtained from the AR(9) and ARMA(58, 48) models.
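The natural frequencies and damping ratios mentioned above are obtained from the poles of the estimated AR polynomial. The following sketch shows one standard way to do this, by mapping each discrete-time pole z to its continuous-time equivalent s = ln(z) f_s. The function name `ar_modes` is hypothetical, and modal dispersions are not computed here.

```python
import numpy as np


def ar_modes(a, fs):
    """Natural frequencies [Hz] and damping ratios from AR coefficients.

    a  : [1, a1, ..., an] of A(z) = 1 + a1 z^-1 + ... + an z^-n
    fs : sampling frequency in Hz
    Only complex-conjugate pole pairs are reported (one value per pair).
    """
    poles = np.roots(a)
    poles = poles[np.imag(poles) > 0]      # keep one pole of each conjugate pair
    s = np.log(poles) * fs                 # continuous-time equivalent poles
    wn = np.abs(s)                         # natural frequencies in rad/s
    zeta = -np.real(s) / wn                # damping ratios
    order = np.argsort(wn)
    return wn[order] / (2.0 * np.pi), zeta[order]
```

For a time-dependent model, the same function can be applied to the parameter vector at each time instant to obtain "frozen" natural frequencies and damping ratios.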

Figure.9: Comparison of the obtained stationary PSD estimates: Welch, AR(9), and ARMA(58,48) based estimates. The main identified modes are indicated by arrows and vertical dashed lines. [Axes: PSD in dB versus frequency [Hz].]

Table.5: Summary of the modes identified by the Welch method and parametric AR, ARMA, SP-TAR and FS-TAR modeling. Time-varying modes obtained from the SP-TAR and FS-TAR models are displayed in terms of mean value and standard deviation.

Columns: Mode; Welch PSD: f n [Hz]; AR(9) model: f n [Hz], ζ n [%], n [%]; ARMA(58,48) model: f n [Hz], ζ n [%], n [%]; SP-TAR(34): f n [Hz], ζ n [%]; FS-TAR(34): f n [Hz], ζ n [%].

.6 Non stationary modeling

.6. Non parametric modeling

The non-parametric modeling is based on the instantaneous autocorrelation function, the smoothed Wigner-Ville spectrum and the spectral correlation, as described in Section.3.3. As in the non-parametric estimate of the PSD, non-parametric modeling requires a long ( s or 9 samples) signal record. Table.6 presents the estimation details, while Figure. depicts the estimate of the instantaneous ACF (Figure.(a)), the smoothed estimate of the Wigner-Ville spectrum (Figure.(b)), and the estimate of the spectral correlation (Figure.(c)). The instantaneous ACF and the smoothed Wigner-Ville spectrum are shown for a time span of 5 s. Both the smoothed Wigner-Ville spectrum and spectral correlation are plotted, with the Welch-based estimate shown on the left. Oscillation periods of the frequency modes and the most prominent resonant frequencies are indicated by arrows.

Table.6: Non-stationary non-parametric estimation method details.

Method: Identification details
Instantaneous ACF: Window length: 4 samples, sample advance, Gaussian window with dispersion parameter 3. Signal length: 9 samples. Local ACF computed via the MATLAB command xcorr.
Smoothed Wigner-Ville: Fourier transform of the Instantaneous ACF with respect to lag.
Spectral Correlation: Fourier transform of the Smoothed Wigner-Ville spectrum with respect to time.

The instantaneous ACF in Figure.(a) demonstrates low frequency power changes and oscillations. The smoothed Wigner-Ville spectrum shows the same frequency modes encountered in the stationary PSD analysis; yet it reveals that some of the modes have a non-stationary nature. In particular, the frequency mode at f = Hz shows a cyclic effect with an apparent period of approximately.9 s (see Figure.(b)) that can be related to the period T 3 =.896 s indicated in Table..
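The chain summarized in the table above (a local ACF computed in a sliding Gaussian window, followed by a Fourier transform over lag) can be sketched as follows. This is an illustrative NumPy version, not the MATLAB xcorr based implementation of the thesis. The function name `smoothed_wv_spectrum`, the Gaussian width (one sixth of the window length), and the hop size are assumptions.

```python
import numpy as np


def smoothed_wv_spectrum(y, fs, win_len, max_lag, hop):
    """Local (instantaneous) ACF in a sliding Gaussian window and its
    Fourier transform over lag, giving a smoothed time-frequency spectrum.

    Returns window-centre times [s], frequencies [Hz], the local ACF
    (windows x lags) and the spectrum (windows x frequencies).
    """
    y = np.asarray(y, dtype=float)
    n = np.arange(win_len) - (win_len - 1) / 2.0
    w = np.exp(-0.5 * (n / (win_len / 6.0)) ** 2)   # assumed Gaussian width
    starts = np.arange(0, y.size - win_len + 1, hop)
    acf = np.empty((starts.size, max_lag + 1))
    for r, s in enumerate(starts):
        seg = w * y[s:s + win_len]
        for tau in range(max_lag + 1):
            # Biased local autocorrelation estimate at lag tau.
            acf[r, tau] = np.dot(seg[:win_len - tau], seg[tau:]) / win_len
    # Even extension over negative lags, then real FFT with respect to lag.
    sym = np.concatenate([acf, acf[:, -2:0:-1]], axis=1)
    spec = np.real(np.fft.rfft(sym, axis=1)) / fs
    freqs = np.fft.rfftfreq(sym.shape[1], d=1.0 / fs)
    times = (starts + win_len / 2.0) / fs
    return times, freqs, acf, spec
```

A spectral correlation estimate in the sense of the table would follow by applying a further Fourier transform to `spec` along the time axis, for instance `np.fft.rfft(spec, axis=0)`.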
The spectral correlation in Figure.(c) shows that the time varying frequency modes are composed of a series of cyclic components. These cyclic effects are strongest for the mode at f = Hz, with a base frequency α =.373 Hz, followed by the frequencies α =[.74,.,.483,.858,.5]. These frequencies turn out to be approximately equal to the first 5 harmonics of the base rotation frequency α =.373. Similar behavior is observed for the mode at f = 5 Hz, featuring only the first harmonic at α =.373 Hz. Furthermore, it can be noticed that the low frequency band from to 4 Hz has continuously decreasing spectral correlation values, which tend to vanish at about.5 Hz on the α axis. This demonstrates that the behavior of this frequency band is non-stationary. Yet, the non-stationarity is not as structured as in the case of the cyclo-stationary modes.

.6. Parametric modeling

.6.. Smoothness Priors Time dependent AutoRegressive (SP TAR) modeling

Model order selection

The selection of the SP-TAR model structure, consisting of model order n a and smoothness constraint order κ, is based on the minimization of the model BIC. The variance trade off parameter ν is optimized by minimizing the model RSS with a Levenberg Marquardt non-linear search method (Coleman and Li, 996). The estimation details may be found in Table.7. The resulting BIC and normalized RSS curves are shown in Figure.. The considered measures demonstrate that, for both smoothness constraint orders κ = and, the BIC begins to settle down after n a = 8, with much better performance of the models with κ =. Figure. shows the TV-PSD and TV modes of the estimated SP-TAR models as they converge for increasing model order (n a = 8,,...,4 and κ = ). For n a > 8 the spectral content remains more or less the same. In accordance with both curves, the model order is selected as n a = 34. The model's normalized RSS, BIC and SPP (samples per parameter) values are provided in Table.4.
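The core of SP-TAR estimation can be illustrated with a forward Kalman filter in which the AR parameters follow a random walk, that is, a first-order smoothness constraint. The sketch below is a simplified illustration under stated assumptions: it uses a single forward pass only (the thesis uses additional backward filtering and smoothing passes), a fixed trade off parameter `nu` (optimized in the thesis), and a hypothetical function name `sp_tar_kalman` with an assumed initial covariance scale `p0`.

```python
import numpy as np


def sp_tar_kalman(y, na, nu, p0=1e4):
    """Forward Kalman filter for a TAR(na) model with random-walk parameters.

    Model: y[t] + sum_i a_i[t] y[t-i] = e[t],   a[t] = a[t-1] + v[t],
    with nu = var(v) / var(e) acting as the smoothness trade off parameter.
    Returns the filtered parameter trajectories (N x na) and the
    one-step-ahead prediction errors.
    """
    y = np.asarray(y, dtype=float)
    n_samples = y.size
    a = np.zeros(na)                       # initial state vector
    P = p0 * np.eye(na)                    # initial (normalized) error covariance
    A = np.zeros((n_samples, na))
    e = np.zeros(n_samples)
    for t in range(na, n_samples):
        phi = -y[t - na:t][::-1]           # regressor: -[y[t-1], ..., y[t-na]]
        P = P + nu * np.eye(na)            # time update for the random walk
        e[t] = y[t] - phi @ a              # one-step-ahead prediction error
        k = P @ phi / (1.0 + phi @ P @ phi)
        a = a + k * e[t]                   # measurement update of the parameters
        P = P - np.outer(k, phi @ P)
        A[t] = a
    A[:na] = A[na]                         # pad the start-up samples
    return A, e
```

Larger values of `nu` let the parameters change faster at the cost of noisier estimates, which is the trade off that the RSS-based optimization of ν addresses.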
Model validation

The estimated model residuals are tested for Gaussianity and uncorrelatedness, using hypothesis tests for Gaussianity and checking their sample ACF, histogram and normal probability plots as explained in (Poulimenos and Fassois, 6; Soderstrom and Stoica, 989). The parameter evolution innovations (residuals) are


similarly tested. For all hypothesis tests a significance level of α = 5% is employed and the results are provided in Table.4.

Figure.: Non-stationary non-parametric identification: (a) Estimate of the instantaneous ACF γ(t, t + τ). (b) Smoothed estimate of the Wigner-Ville TV-PSD S WV (ω,t) with the stationary PSD estimate shown on the left. (c) Estimate of the spectral correlation Γ[k,n] with the stationary PSD estimate shown on the left.

The validation results for the estimated residuals are shown in Figure.3. The residual sample ACF is shown at the top, and the normal probability plot and histogram are shown at the bottom plots. For the SP-TAR(34)


model, it is found that both hypothesis tests validate residual Gaussianity. It may be seen at the bottom plots of Figure.3(a) that the residual distribution is indeed very close to Gaussian. The top plot of Figure.3(a) also demonstrates that the residuals are (almost) uncorrelated. Yet, the parameter evolution innovations (residuals) fail the Gaussianity tests. As can be seen in Figure.3(b), the innovations (residuals) for the 4th parameter evolution are highly correlated, showing a behavior close to that of a random walk. This behavior is encountered for the other parameters as well, implying that the first order smoothness priors constraints are insufficient to fully explain the time evolution of the SP-TAR parameters. Finally, the correlation of the parameter evolution innovations (residuals) and the TAR innovations (residuals) is examined in Figure.3(c), which shows the correlation coefficient matrix of the parameter evolution innovations (residuals) ˆv i [t], i=,..., and the TAR residuals ŵ[t]. The diagonal corresponds to the individual variance of each process ˆv i [t], which is always equal to unity, and the values around the diagonal show the correlation coefficient between ˆv i [t] and ˆv j [t], where i ≠ j. As the matrix is symmetric, only the values under the diagonal are plotted. The correlation matrix shows a repeated pattern in the innovations (residuals), with high values for E{ ˆv i [t] ˆv i+4 [t]}.

Figure.: Non-stationary SP-TAR model structure selection: For models with n a = 8,...,4 and κ =, (a) RSS/SSS curves; (b) BIC curves.

Figure.: Convergence of TV modes and TV-PSD for SP-TAR models with orders n a = 8,...,4 and κ =. The value of each estimated damping ratio is indicated according to the gray scale code on the right of the plot.
The bottom plot of Figure.3(c) shows a 5 s extract of the parameter evolution innovations (residuals) ˆv [t], ˆv 5 [t] and ˆv 9 [t], verifying that they have similar and highly autocorrelated behavior in time.

Table.7: Non-stationary parametric SP-TAR and FS-TAR estimation details.

Method: Estimation details

SP-TAR models
  Model structure selection: Minimization of the BIC for SP-TAR models with n a =,,...,4, κ =, and initial trade off parameter ν = 4.
  Variance trade off parameter optimization: Minimization of the model RSS using the Levenberg Marquardt non-linear search method. Tolerance of the loss function 3, maximum number of iterations 3.
  Kalman filter/smoother: Initial state vector a[ ]=; initial error covariance 6 I na. Three passes over the signal: backward filter for initial parameter estimation and two forward filter/backward smoothing passes.
  Innovations variance: non-parametric estimate via a -sample-long moving window.

FS-TAR models
  Structure selection: Minimization of the BIC for FS-TAR models with n a =,,...,4 and p a =,,...,8.
  Parameter estimation: Minimization of the prediction error with ordinary least squares.
  Basis functions: Fourier basis with frequencies selected equal to the fundamental frequency of blade rotation, α =.37 Hz, and its harmonics.
  Innovations variance: non-parametric estimate via a -sample-long moving window.

AFS-TAR models
  SNLS method: Initial values obtained through PSO, refinement through nonlinear least squares, parameter space constraints:.999 [r a( j),ρ a ].
  PSO: MATLAB pso function (MATLAB Particle Swarm Optimization Toolbox), w=, c =.45, c =.45, population size, stall generation limit, termination rule TolFun = 9.
  Nonlinear Least Squares: MATLAB lsqnonlin function (interior reflective Newton method), termination rules TolFun= 9 and TolX = 9.
  Innovations variance: non-parametric estimate via a -sample-long moving window.
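The FS-TAR estimation summarized in the table above (Fourier basis at the blade rotation frequency and its harmonics, with ordinary least squares estimation of the coefficients of projection) can be sketched as follows. The function names `fourier_basis` and `fs_tar_ols`, and the ordering of the basis functions (constant term first, then cosine and sine pairs per harmonic), are assumptions made for this illustration and need not match the indexing used in the thesis.

```python
import numpy as np


def fourier_basis(n_samples, fs, alpha, n_harm):
    """Basis matrix G (n_samples x (2 n_harm + 1)): constant, then cos/sin pairs."""
    t = np.arange(n_samples) / fs
    cols = [np.ones(n_samples)]
    for k in range(1, n_harm + 1):
        cols.append(np.cos(2.0 * np.pi * k * alpha * t))
        cols.append(np.sin(2.0 * np.pi * k * alpha * t))
    return np.column_stack(cols)


def fs_tar_ols(y, na, G):
    """OLS estimate of an FS-TAR(na) model with a_i[t] = sum_j a_ij G_j[t].

    Model: y[t] + sum_i a_i[t] y[t-i] = e[t].
    Returns the coefficients of projection (na x pa), the time-varying
    parameters (N x na), and the residuals for t >= na.
    """
    y = np.asarray(y, dtype=float)
    n_samples, pa = G.shape
    rows = []
    for t in range(na, n_samples):
        lags = -y[t - na:t][::-1]                    # -[y[t-1], ..., y[t-na]]
        rows.append(np.outer(lags, G[t]).ravel())    # regressors for all (i, j)
    Phi = np.asarray(rows)
    theta, *_ = np.linalg.lstsq(Phi, y[na:], rcond=None)
    coeffs = theta.reshape(na, pa)
    a_t = G @ coeffs.T
    resid = y[na:] - Phi @ theta
    return coeffs, a_t, resid
```

Because the model is linear in the coefficients of projection, the whole estimation reduces to a single least squares problem, which is what makes the FS-TAR approach computationally simple once the basis frequencies are fixed.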
Nonetheless, the TAR innovations (residuals) and the parameter evolution innovations (residuals) are not cross-correlated, as all the cross-correlation coefficients in the last row of the correlation coefficient matrix are almost zero.

.6.. Functional Series Time-dependent AutoRegressive (FS-TAR) modeling

Model structure selection

The selection of the model order n a and the functional basis order p a is based on the BIC. The details on the selection of the model order are provided in Table.7. The BIC and RSS/SSS curves for FS-TAR models with p a = 6 are shown in Figure.4(a), where a minimum BIC is reached for n a 3. The spectral convergence plot in Figure.5 shows that the spectral content settles for n a > 6. In agreement with these results, the model order is selected as n a = 34. The BIC and RSS/SSS are also checked for various functional basis dimensionalities in Figure.4(b). The RSS/SSS decreases with increasing basis dimensionality, but at the same time, the BIC increases due to the increased number of model parameters. Using a compromise between model accuracy and required number of parameters, the basis dimensionality is chosen as p a =. Thus an FS-TAR(34) model is selected, with RSS/SSS, BIC and SPP provided in Table.4. In terms of these criteria this model is better than a model obtained in a previous study (Avendano-Valencia and Fassois, ) (also see Table.4). The estimated projection parameters of the FS-TAR(34) model are shown in Figure.6.

Model validation

The achieved FS-TAR(34) model is validated following the same procedure outlined in the SP-TAR model case. Although Gaussianity is satisfied, the model residuals are not uncorrelated at the 5% risk level (Figure.7(a)).

Adaptable Functional Series Time dependent AutoRegressive (AFS TAR) modeling

Model structure selection

In the case of AFS-TAR modeling the corresponding basis dimensionalities are selected via the procedure presented in (Spiridonakos and Fassois, 4b); also see the summary in Table.7.
It should

be noted that constraints have been imposed on the basis rates of decay: r a( j),ρ a [.999,.] (see Equation (.3)) in order to ensure that the resulting AR parameters do not decay or grow too fast.

Figure.3: Validation of the SP-TAR(3) model: (a) Sample normalized model innovations ACF with the red horizontal lines indicating statistical significance at the 5% risk level, normal probability plot, and histogram for the TAR model residuals; (b) sample normalized ACF, normal probability plot and histogram for the fourth parameter evolution residuals; (c) correlation coefficient matrix for the first twelve parameter evolution residuals and TAR model residuals, plus sample paths for four parameter evolution residuals.

The obtained BIC and RSS/SSS curves for the initial structure selection step pertaining to model order are presented in Figure.8. The bottom plot shows the same values for the selection of the basis dimensionality. In both cases, only the initial PSO optimization for the estimation of the non-linear parameters is performed (Spiridonakos and Fassois, 4b; Perez and Behdinan, 7). In the upper figure the normalized RSS and BIC reduce as the number of parameters increases, till about n a = 3. The lower plot shows a behavior similar to that


of the FS-TAR case, namely, decreasing RSS/SSS for increasing basis dimensionality, but at the same time, an increasing BIC. The AFS-TAR model structure selected is n a = 34 and p a =, based on a trade off between the RSS/SSS and BIC. After model structure selection, the model is refined using a non-linear least squares approach (Spiridonakos and Fassois, 4b). The refined estimated coefficients of projection of the AFS-TAR model are depicted in Figure.9. Additionally, the model structure and its normalized RSS, BIC and SPP are summarized in Table.4. The non-linear parameters corresponding to the basis frequencies are summarized in Table.8.

Figure.4: Non-stationary FS-TAR model order selection. (a) RSS/SSS and BIC for model orders n a = 8,...,4 and p a = 7; (b) RSS/SSS and BIC for FS-TAR(34) model with p a =,,...,

Figure.5: Convergence of TV modes and TV-PSD for FS-TAR models with orders n a = 8,...,4 and p a = 7. The value of each estimated damping ratio is indicated according to the gray scale code on the right of the plot.

Model validation

The estimated AFS-TAR model is validated following the previously mentioned procedure. As in the FS-TAR case, residual Gaussianity is confirmed, but some correlation structure remains in the residuals (Figure.7(b)).

.6.3 Model based analysis

The estimated SP-TAR and FS-TAR model parameters demonstrate the presence of periodicities in the structural dynamics. Welch-based spectral estimation applied on the SP-TAR model parameters allows for the identification

of the presence of cyclic components.

Figure.6: Estimated coefficients of projection of the FS-TAR(34) model for each time varying parameter a i [t] organized per frequency of the basis functions, drawn with the standard deviation of the estimated coefficients of projection (dashed lines around the estimated values): top left: components for α ; top right: components for α ; center left: components for 3α ; center right: components for 4α ; bottom left: components for 5α ; bottom right: DC component.

Figure.7: Validation of the estimated functional series models: (a) FS-TAR(34) model; (b) AFS-TAR(3) model. In each case the sample normalized ACF of the model residuals with significance limits at the 5% level (top) and residual normal probability plot and histogram (bottom) are depicted.

For the FS-TAR parameters it is possible to compute the power of each basis function as:

P_{i,dc} = a_{i,0}^2,    P_{i,k} = a_{i,2k-1}^2 + a_{i,2k}^2

where P_{i,dc} is the power of the DC component, and P_{i,k} is the power of the component with frequency kα. The PSD estimates of the first four SP-TAR parameters and the power of the corresponding FS-TAR parameters are

shown in Figure.. Although not always very evident, it may be observed that the estimated PSDs show frequencies associated with blade rotation (in particular around.37 Hz,.74 Hz, and. Hz); further details are given in (Avendano-Valencia and Fassois, ).

Figure.8: Non-stationary AFS-TAR model structure selection: model order (n a top) and functional dimensionality (p a bottom) selection based on the RSS/SSS and BIC criteria.

Figure.9: Estimated coefficients of projection of the AFS-TAR(34) model for each time varying parameter a i [t] organized per frequency of the basis functions, drawn with the standard deviation of the estimated coefficients of projection (dashed lines around the estimated values): top left: components for α ; top right: components for α ; center left: components for α 3 ; center right: components for α 4 ; bottom left: components for α 5 ; bottom right: DC component.

The frequencies identified in the SP-TAR, FS-TAR and AFS-TAR

parameters and the frequency given by the SCADA system of the wind turbine, together with some of its first harmonics, are summarized in Table.8.

Figure.: Analysis of the estimated first four TAR model parameters and detection of periodic components: PSD estimate of the SP-TAR(34) parameters (continuous blue lines) and power of the coefficients of the FS-TAR(34) and AFS-TAR(34) model parameters for each frequency (green line with circles and red line with squares, respectively).

Table.8: Non-stationary identification: identified cyclic periodicities in the spectral correlation and in the analyzed TAR models compared with the average values measured by the SCADA system. [Columns: Harmonic; Average rotor speed (Hz); Identified hidden periodicities (Hz): Non-parametric, SP-TAR, FS-TAR, AFS-TAR.]

Figure. shows the corresponding frozen TV-PSDs for the obtained SP-TAR and FS-TAR models, as well as the non-parametric estimate obtained via the smoothed Wigner-Ville spectrum as explained in Section.6.. The frozen TV-PSDs are drawn at 4 frequency points for each data point. Figure.(a) shows the smoothed Wigner-Ville spectrum, Figure.(b) the SP-TAR(34) based frozen TV-PSD, Figure.(c) the frozen TV-PSD for the FS-TAR(34) model, and Figure.(d) the frozen TV-PSD for the AFS-TAR(34) model. Figure. shows the frozen TV natural frequencies extracted from the SP-TAR and FS-TAR models plotted over the corresponding frozen TV-PSD estimates. Figure.(a), Figure.(c) and Figure.(e) show the full set of frozen TV natural frequencies for the SP-TAR, FS-TAR and AFS-TAR models, respectively.
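A frozen TV-PSD of the kind discussed above is obtained by evaluating, at each time instant, the PSD of the stationary AR model whose coefficients equal the TAR parameters at that instant. The following sketch uses the common expression S(f, t) = σ²[t] / |1 + Σ a_i[t] exp(-j2πf i/f_s)|². The function name `frozen_tv_psd` and the default number of frequency points are assumptions, and scaling constants such as 1/f_s are omitted.

```python
import numpy as np


def frozen_tv_psd(a_t, var_e, fs, n_freq=256):
    """Frozen-time PSD of a TAR model at every time instant.

    a_t   : (N x na) time-varying AR parameters, model y[t] + sum_i a_i[t] y[t-i] = e[t]
    var_e : scalar or length-N array with the innovations variance
    Returns the frequency grid [Hz] and the PSD as an (N x n_freq) array.
    """
    a_t = np.atleast_2d(np.asarray(a_t, dtype=float))
    n_samples, na = a_t.shape
    freqs = np.linspace(0.0, fs / 2.0, n_freq)
    # E[f, i] = exp(-j 2 pi f i / fs) for lags i = 1, ..., na
    E = np.exp(-2j * np.pi * np.outer(freqs, np.arange(1, na + 1)) / fs)
    A = 1.0 + a_t @ E.T                    # AR polynomial on the unit circle
    var_e = np.broadcast_to(np.asarray(var_e, dtype=float), (n_samples,))
    return freqs, var_e[:, None] / np.abs(A) ** 2
```

Plotting the returned array as an image over time and frequency gives a picture comparable to the parametric panels of the TV-PSD figure, with the caveat that the frozen-time interpretation is only meaningful for slowly varying parameters.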

The non-parametric estimate of the stationary PSD is plotted on the left, with the modes identified by stationary analysis (see Table.5) indicated by arrows. Details for the most prominent TV natural frequencies around 5 Hz and 38 Hz are presented in Figure. for the SP-TAR and FS-TAR models.

.7 Discussion

The obtained results provide an initial insight into the vibration signal characteristics and some of the underlying dynamics under normal operation. In comparison with preliminary results previously obtained by the authors (Avendano-Valencia and Fassois, ), the present models exhibit improved performance in terms of the RSS/SSS and BIC (see Table.4). Besides, the additional analysis of the non-stationary structure of the dynamics has revealed the cyclic behavior, along with broader non-stationary behavior at low frequencies. Furthermore, a comparison between conventional stationary, non-parametric and parametric, modeling has been carried out. Stationary modeling has demonstrated the presence of several frequency modes that are summarized in Table.5 and indicated in Figure.9. The identified frequency at 5 Hz coincides with the 5 Hz average frequency of the high speed shaft measured by the wind turbine SCADA system. The frequency at 5 Hz can be interpreted as the first subharmonic of the high speed shaft frequency. The other frequency modes are attributed to operational modes of the tower, similarly to the discussion in (Hansen, 7; Wang et al., ). Nevertheless, it is difficult to point out the precise origin of these modes without installing a large array of sensors on the wind turbine. Some of the operational modes seem to cluster in specific bandwidths. These are designated by the shaded areas in Figure.9. In particular, the mode at Hz, accompanied by several side lobes with decreasing magnitude, demonstrates the presence of a frequency modulated mode.
The low frequency non-stationary components (Figure.9 and Table.5) may be associated with small variations in the power of the wind excitation as the blades pass through the different wind profiles of a turbulent wind (Rokenes and Kogstad, 9; Hansen et al., 6b; Li et al., ). The non-stationary modeling has also demonstrated the TV behavior and hidden periodicities present in some frequency modes. The instantaneous ACF mainly shows a high-power, very low frequency action of the wind over the structure. This behavior is characterized as TV stochastic, as could be expected from the wind moving through the blades and tower, and it explains the low frequency peaks of the stationary PSD estimates. Additionally, the instantaneous ACF shows a background cyclic activity, demonstrated by the sinusoidal form of the autocorrelation. The estimates of the TV-PSD (smoothed Wigner-Ville spectrum and parametric frozen TV-PSDs in Figure.) show a low (under 5 Hz) frequency component, a mode at -4 Hz, and a cyclo-stationary mode oscillating in the range from 37 to 4 Hz. Comparison with the stationary PSD estimates in Figure. demonstrates that these are not necessarily distinct modes, but, rather, represent the time evolution of a smaller number of actual modes (or just a single one). The frequencies influencing the modes have also been shown to be related to those of blade rotation (both the fundamental frequency and its harmonics; see Table.8). Parametric TAR models yield congruent results that allow the estimation of the periodicities present in the dynamics. The frozen-time natural frequencies shown in Figure. demonstrate the variability of the frequency modes. The temporal evolution of the modes obtained from the SP-TAR and FS-TAR models seems to be somewhat different, but the main features are quite alike, except for the very low frequency mode, which in the SP-TAR case seems to be more damped. Both models achieve similar RSS/SSS and BIC values, with the FS model having the lead (Table.4).
The AFS-TAR models are advantageous for the automatic detection of the cyclic frequencies in the data, in comparison to the classical FS-TAR approach, as it is not necessary to give a priori values for the cyclic frequencies. On the other hand, the SP-TAR approach is more flexible for the detection of cyclic frequency components, as it allows, through the frequency analysis of the parameters, for the identification of cyclic components in the time evolution of the parameters. In summary, three kinds of non-stationary behavior have been identified. First, a non-stationary low frequency component that can be attributed to the wind excitation. Second, two frequencies at 5 Hz and 5 Hz that are associated with the high speed shaft and generator. Third, a group of frequencies where at least one of them displays a clear non-stationary/cyclo-stationary behavior that can be related to the response of the wind tower structure to the cyclic stress exerted by the blade system. These findings are in general agreement with previous studies. As has been suggested in the literature (Allen et al., b), the vibration of an operating wind turbine

exhibits cyclo-stationary effects, which are linked to the rotation speed of the blade system. Nonetheless, the wind also produces broader non-stationary behavior in the low frequency range. The applied smoothness priors and functional series models have proved useful for random vibration signal modeling, as they are capable of describing the temporal changes and unveiling hidden periodicities in the structural modes. The smoothed Wigner-Ville spectrum and the spectral correlation have also been shown to be useful and capable of capturing non-stationary and cyclo-stationary behaviors. Nevertheless, these non-parametric tools are generally not as precise as their parametric counterparts, and need considerably longer data records in order to provide comparable accuracy. Stationary AR and ARMA models have also achieved low RSS/SSS and BIC values, but are incapable of explaining the non-stationary and cyclo-stationary behaviors. The violation of the validation tests also demonstrates the inappropriateness of the stationary approach. The non-stationary and cyclo-stationary information is vital, as it may be used to detect the presence of blade unbalance, problems in the transmission system, or other types of structural problems.

.8 Conclusions

The problem of stationary and non stationary random vibration modeling and analysis for an operating wind turbine has been considered by using an acceleration vibration signal measured on a NegMicon NM5/9 fixed speed wind turbine tower. Three stationary methods, namely Welch spectral estimation, autoregressive, and autoregressive moving average modeling, have been considered, along with five non stationary methods, namely non parametric Wigner-Ville and spectral correlation, and parametric modeling by means of smoothness priors (SP) time-dependent autoregressive modeling, functional series (FS) time-dependent autoregressive modeling, and adaptable functional series (AFS) time-dependent autoregressive modeling.
The main conclusions drawn may be summarized as follows:

(i) The nearly cyclo-stationary and broader non-stationary natures of the vibration signal have been confirmed and accurately modeled. Three main types of dynamical behavior have been confirmed: that due to the wind characteristics, those due to the low and high speed shafts of the transmission system, and strongly cyclo-stationary dynamics associated with blade rotation.

(ii) It has been confirmed that certain modes are significantly affected by blade rotation. Nevertheless, it is found that this influence is rather complex, comprising both harmonics and subharmonics of the blade rotation frequency. A clear picture could be obtained through the non-stationary analysis tools.

(iii) The non-stationary non-parametric tools (smoothed Wigner-Ville spectrum and spectral correlation) proved useful for a first analysis of the non-stationary vibration, but required much longer signals than the parametric analysis in order to yield accurate and reliable results. They are also useful for guiding the application of parametric approaches.

(iv) Parametric non-stationary TAR models have allowed for deeper insight into the non-stationary and cyclo-stationary vibration characteristics, and have proven capable of unveiling hidden periodicities using shorter signals, while also improving the precision and compactness of the obtained models. In addition, they may potentially provide further information for variable speed wind turbines, including variation of the rotation frequency, centrifugal stiffening, and aeroelastic damping.

Figure: The estimated TV PSDs: (a) Non-parametric smoothed Wigner-Ville estimate; (b) SP-TAR(34) based frozen TV-PSD; (c) FS-TAR(34) based frozen TV-PSD; (d) AFS-TAR(34) based frozen TV-PSD.


Figure: The estimated TV frozen natural frequencies with the corresponding frozen PSD estimate as background (axes: Time [s], Frequency [Hz], PSD [dB]): (a) SP-TAR(34) model based natural frequencies (stationary PSD estimate on the left); (b) detail around 5 and 38 Hz based on the SP-TAR(34) model; (c) FS-TAR(34) model based natural frequencies (stationary PSD estimate on the left); (d) detail around 5 and 38 Hz based on the FS-TAR(34) model; (e) AFS-TAR(34) model based natural frequencies (stationary PSD estimate on the left); (f) detail around 5 and 38 Hz based on the AFS-TAR(34) model.

Appendix 2.A Smoothness Priors TAR modeling

Consider a time-varying coefficient AR model (Poulimenos and Fassois, 2006)

$$ x[t] = -\sum_{i=1}^{n_a} a_i[t]\, x[t-i] + e[t], \qquad e[t] \sim \mathrm{NID}(0, \sigma_\varepsilon^2[t]) \qquad (2.A.1) $$

with $t$ designating normalized discrete time, $x[t]$ the non-stationary vibration response signal being modeled, $e[t]$ an unobservable uncorrelated (white) non-stationary innovations (residual) signal characterized by zero mean and time-varying variance $\sigma_\varepsilon^2[t]$, and $a_i[t]$ the model's time-varying AR parameters.

In the SP-TAR approach, it is assumed that the temporal evolution of the model parameters is subject to stochastic smoothness constraints. The smoothness priors constraints constitute difference equations of the following form (Kitagawa and Gersch, 1985)

$$ \nabla^{\kappa} a_i[t] = v_i[t] \qquad (2.A.2) $$

where $\kappa$ designates the difference equation order, $\nabla^{\kappa} = (1-B)^{\kappa}$ is the $\kappa$-th order difference operator with $B$ representing the backshift operator, and $v_i[t] \sim \mathrm{NID}(0, \sigma_{v_i}^2)$ is a zero mean, serially uncorrelated and mutually uncorrelated Gaussian sequence. The SP-TAR model can be written in the following state-space form

$$ z[t] = F\, z[t-1] + G\, w[t] \qquad (2.A.3a) $$
$$ x[t] = h^T[t]\, z[t] + e[t] \qquad (2.A.3b) $$

where

$$ z[t] = \big[\, a_1[t] \;\cdots\; a_{n_a}[t] \;\cdots\; a_1[t-\kappa+1] \;\cdots\; a_{n_a}[t-\kappa+1] \,\big]^T \quad (n_a\kappa \times 1) $$
$$ w[t] = \big[\, w_1[t] \;\cdots\; w_{n_a}[t] \,\big]^T \quad (n_a \times 1) $$
$$ h[t] = \big[\, -x[t-1] \;\cdots\; -x[t-n_a] \;\; 0 \;\cdots\; 0 \,\big]^T \quad (n_a\kappa \times 1) $$

and

$$ \kappa = 1: \quad F = I_{n_a}, \quad G = I_{n_a} $$
$$ \kappa = 2: \quad F = \begin{bmatrix} 2 I_{n_a} & -I_{n_a} \\ I_{n_a} & 0_{n_a} \end{bmatrix}, \quad G = \begin{bmatrix} I_{n_a} \\ 0_{n_a} \end{bmatrix} $$

and so on, where $I_n$ and $0_n$ represent the $n \times n$ dimensional identity and zero matrices, respectively. The process noise vector $w[t]$ and the innovations (residual) sequence $e[t]$ are assumed to be mutually uncorrelated white Gaussian noises, with $w[t] \sim \mathrm{NID}(0, Q)$, $Q = \sigma_w^2 I_{n_a}$. Given the state space representation of the SP-TAR model, the fitting of the model to the data is achieved by maximizing the likelihood computed by the Kalman filter/smoother (Kitagawa and Gersch, 1985).

2.A.1 Kalman Filter and Smoother

Given a set of observations $x^N = [x[1], x[2], \ldots, x[N]]$ and initial conditions $\hat{z}[0|0]$ and $P[0|0]$, the one step ahead predictor $\hat{z}[t|t-1]$ and the filtered estimate $\hat{z}[t|t]$ of the state vector $z[t]$ are given by the Kalman filter algorithm (Anderson and Moore, 1979, Chapter 3).

Time update:

$$ \hat{z}[t|t-1] = F\, \hat{z}[t-1|t-1] \qquad (2.A.4a) $$
$$ P[t|t-1] = F\, P[t-1|t-1]\, F^T + G\, Q\, G^T \qquad (2.A.4b) $$

Observation update:

$$ K[t] = P[t|t-1]\, h[t]\, \big( h^T[t]\, P[t|t-1]\, h[t] + \sigma_\varepsilon^2 \big)^{-1} \qquad (2.A.4c) $$
$$ \hat{z}[t|t] = \hat{z}[t|t-1] + K[t]\, \big( x[t] - h^T[t]\, \hat{z}[t|t-1] \big) \qquad (2.A.4d) $$
$$ P[t|t] = \big( I - K[t]\, h^T[t] \big)\, P[t|t-1] \qquad (2.A.4e) $$

After forward Kalman filtering, it is possible to obtain smoothed estimates of the state vector and error covariance matrix given the entire vector of observations $x^N$ by means of the smoothing algorithm

$$ A[t] = P[t|t]\, F^T\, P^{-1}[t+1|t] \qquad (2.A.5a) $$
$$ \hat{z}[t|N] = \hat{z}[t|t] + A[t]\, \big( \hat{z}[t+1|N] - \hat{z}[t+1|t] \big) \qquad (2.A.5b) $$
$$ P[t|N] = P[t|t] + A[t]\, \big( P[t+1|N] - P[t+1|t] \big)\, A^T[t] \qquad (2.A.5c) $$
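The forward filtering and backward smoothing recursions above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation used in the thesis: the function names `sp_tar_matrices` and `sp_tar_kalman`, the default values of `sigma_w2`, `sigma_e2` and `beta`, the constant innovations variance, and the sign convention $h[t] = [-x[t-1], \ldots, -x[t-n_a], 0, \ldots, 0]^T$ are all assumptions made here for the example.

```python
import numpy as np

def sp_tar_matrices(na, kappa):
    """State transition F and noise coupling G for smoothness order kappa (1 or 2)."""
    I = np.eye(na)
    if kappa == 1:
        return I.copy(), I.copy()
    if kappa == 2:
        Z = np.zeros((na, na))
        return np.block([[2 * I, -I], [I, Z]]), np.vstack([I, Z])
    raise ValueError("this sketch only covers kappa = 1 or 2")

def sp_tar_kalman(x, na, kappa=1, sigma_w2=1e-4, sigma_e2=1.0, beta=1e6):
    """Kalman filter + fixed-interval smoother for an SP-TAR(na) model.

    Returns the smoothed AR parameter trajectories (N x na), the one-step
    prediction errors v[t] and their variances r[t].
    Assumes a constant innovations variance sigma_e2 (a simplification).
    """
    x = np.asarray(x, dtype=float)
    N, n = len(x), na * kappa
    F, G = sp_tar_matrices(na, kappa)
    Q = sigma_w2 * np.eye(na)
    z, P = np.zeros(n), beta * np.eye(n)      # diffuse-like initial conditions
    z_pred, P_pred = np.zeros((N, n)), np.zeros((N, n, n))
    z_filt, P_filt = np.zeros((N, n)), np.zeros((N, n, n))
    v, r = np.zeros(N), np.zeros(N)
    for t in range(N):
        # Time update
        z = F @ z
        P = F @ P @ F.T + G @ Q @ G.T
        z_pred[t], P_pred[t] = z, P
        # Regression vector: past samples (zero before the record starts)
        h = np.zeros(n)
        for i in range(1, na + 1):
            if t - i >= 0:
                h[i - 1] = -x[t - i]
        # Observation update
        r[t] = h @ P @ h + sigma_e2
        v[t] = x[t] - h @ z
        K = P @ h / r[t]
        z = z + K * v[t]
        P = P - np.outer(K, h) @ P
        z_filt[t], P_filt[t] = z, P
    # Backward (fixed-interval) smoothing pass
    z_s, P_s = z_filt.copy(), P_filt.copy()
    for t in range(N - 2, -1, -1):
        A = P_filt[t] @ F.T @ np.linalg.inv(P_pred[t + 1])
        z_s[t] = z_filt[t] + A @ (z_s[t + 1] - z_pred[t + 1])
        P_s[t] = P_filt[t] + A @ (P_s[t + 1] - P_pred[t + 1]) @ A.T
    return z_s[:, :na], v, r
```

A call such as `a_hat, v, r = sp_tar_kalman(x, na=4, kappa=2)` would return one smoothed trajectory per AR parameter; the prediction errors `v` and their variances `r` are kept because they are the quantities needed for the likelihood evaluation.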

In equations (3.4.3) and (3.4.4), $\hat{z}[t|t-1]$ and $P[t|t-1]$ represent the one step ahead (a priori) prediction of the state vector $z[t]$ and its error covariance matrix, respectively. As the initial conditions $\hat{z}[0|0]$ and $P[0|0]$ are usually unknown, their values can be initially set as $\hat{z}[0|0] = [0, \ldots, 0]^T$ and $P[0|0] = \beta\, I_{\kappa n_a}$, with $\beta$ a very large number. This setting is equivalent to estimating the initial values from the entire data set. Subsequently, forward filtering and backward smoothing iterations can be repeated sequentially, using the updated initial values from the previous iteration, until the estimates satisfy a convergence rule.

2.A.2 The Likelihood of the SP-TAR Model and the BIC

The likelihood of the SP-TAR model is defined as (Kitagawa and Gersch, 1985)

$$ L(x^N, \mathcal{M}) = f(x^N | \mathcal{M}) = \prod_{t=1}^{N} f(x[t] \,|\, x^{t-1}, \mathcal{M}) \qquad (2.A.6) $$

where $f(x[1] \,|\, x^{0}, \mathcal{M}) = f(x[1] \,|\, \mathcal{M})$, and $f(x \,|\, \mathcal{M})$ is the probability density function defining the likelihood of $x$ being generated by the model $\mathcal{M}$. The state space representation in Equation (2.A.3) and the Kalman filter in Equation (3.4.3) yield an efficient $O(N)$ computation of the likelihood of the time series model. To begin with, the probability density function of the state $z[t]$ given observations up to $t-1$ is

$$ f(z[t] \,|\, x^{t-1}, \mathcal{M}) = \mathcal{N}\big( \hat{z}[t|t-1],\, P[t|t-1] \big) \qquad (2.A.7) $$

Then, the PDF of the observation $x[t]$ given observations up to $t-1$ and under model $\mathcal{M}$ can be obtained by propagating the PDF in Equation (2.A.7) through the measurement equation (2.A.3b), yielding

$$ f(x[t] \,|\, x^{t-1}, \mathcal{M}) = \mathcal{N}\big( h^T[t]\, \hat{z}[t|t-1],\; h^T[t]\, P[t|t-1]\, h[t] + \sigma_\varepsilon^2 \big) $$
$$ = \frac{1}{\sqrt{2\pi}}\, \big( h^T[t] P[t|t-1] h[t] + \sigma_\varepsilon^2 \big)^{-1/2} \exp\left( -\frac{ \big( x[t] - h^T[t]\, \hat{z}[t|t-1] \big)^2 }{ 2 \big( h^T[t] P[t|t-1] h[t] + \sigma_\varepsilon^2 \big) } \right) $$
$$ = \frac{1}{\sqrt{2\pi\, r[t]}} \exp\left( -\frac{v^2[t]}{2\, r[t]} \right) $$

where $v[t] = x[t] - h^T[t]\, \hat{z}[t|t-1]$ and $r[t] = h^T[t]\, P[t|t-1]\, h[t] + \sigma_\varepsilon^2$. Applying this result in Equation (2.A.6) results in

$$ L(x^N, \mathcal{M}) = \prod_{t=1}^{N} \frac{1}{\sqrt{2\pi\, r[t]}} \exp\left( -\frac{v^2[t]}{2\, r[t]} \right) \qquad (2.A.8) $$

and the log-likelihood function is

$$ \log L(x^N, \mathcal{M}) = \sum_{t=1}^{N} \left( -\frac{1}{2}\log 2\pi - \frac{1}{2}\log r[t] - \frac{v^2[t]}{2\, r[t]} \right) = -\frac{N}{2}\log 2\pi - \frac{1}{2} \sum_{t=1}^{N} \left( \log r[t] + \frac{v^2[t]}{r[t]} \right) \qquad (2.A.9) $$

The BIC is computed as

$$ \mathrm{BIC}(\mathcal{M}) = -2 \log L(\mathcal{M}, x^N) + d\, \log N \qquad (2.A.10) $$

where $d$ is the number of free parameters of the model, which comprise the state vector (size $n_a \kappa$), the innovations variance (dimension 1) and the variance trade-off parameter (dimension 1). Consequently, $d = n_a \kappa + 2$.
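Once the one-step prediction errors $v[t]$ and their variances $r[t]$ are available from the Kalman filter, the Gaussian log-likelihood and the BIC reduce to a few array operations. The sketch below is a minimal illustration under stated assumptions: the function names `sp_tar_loglik` and `sp_tar_bic` are hypothetical, the BIC is taken in the common form $-2\log L + d\log N$, and $d = n_a\kappa + 2$ follows the parameter count given in the text.

```python
import numpy as np

def sp_tar_loglik(v, r):
    """Gaussian log-likelihood from prediction errors v[t] and their variances r[t]."""
    v = np.asarray(v, dtype=float)
    r = np.asarray(r, dtype=float)
    N = v.size
    return -0.5 * N * np.log(2.0 * np.pi) - 0.5 * np.sum(np.log(r) + v**2 / r)

def sp_tar_bic(v, r, na, kappa):
    """BIC assuming the form -2 log L + d log N, with d = na*kappa + 2."""
    d = na * kappa + 2            # state size + innovations variance + trade-off parameter
    N = len(v)
    return -2.0 * sp_tar_loglik(v, r) + d * np.log(N)
```

Since each term only involves scalars, the cost is linear in the record length, which is what makes a comparison of several candidate orders $n_a$ and $\kappa$ affordable.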

Appendix 2.B Model assessing and validation

To assess the performance of the estimated model, the following quantities are typically analyzed:

(i) Residual Sum of Squares (RSS): Given the estimation residuals sequence $e[t|t-1]$, the RSS is computed by means of the following equation

$$ \mathrm{RSS} = \sum_{t=1}^{N} e^2[t|t-1] $$

where $N$ is the signal length. Generally, the RSS is normalized by the Series Sum of Squares (SSS) and a percentage measure is given

$$ \mathrm{SSS} = \sum_{t=1}^{N} y^2[t], \qquad \mathrm{RSS/SSS} = \frac{\mathrm{RSS}}{\mathrm{SSS}} \times 100\,\% $$

(ii) Log-likelihood function: The likelihood function $L(\mathcal{M}, y^N)$ of the model $\mathcal{M}$ given the observations $y[t]$, $t = 1, \ldots, N$ is computed according to each model. For AR and ARMA models, the likelihood function is the variance of the estimation residuals $\sigma_e^2$. For TAR models, the likelihood function depends on the particular model structure (SP-TAR, FS-TAR) (Poulimenos and Fassois, 2006).

(iii) Bayesian Information Criterion (BIC): The BIC is computed as (Stoica and Moses, 1997)

$$ \mathrm{BIC}(\mathcal{M}) = \log L(\mathcal{M}, x^N) + d\, \frac{\log N}{N} \qquad (2.B.1) $$

where $d$ is the number of free parameters of the model.

(iv) Samples Per Parameter (SPP): Evaluates the consistency and parsimony of the estimates. It is evaluated as

$$ \mathrm{SPP} = \frac{N}{d} $$

An SPP value lower than 5 shows that the number of samples is insufficient to achieve statistically consistent parameter estimates.

For validation purposes, the following analysis is carried out on the estimation residuals of the obtained models:

(i) Kolmogorov-Smirnov test (K-test): The K-test compares the values in the series to a standard normal distribution. The null hypothesis is that the series has a standard normal distribution. The alternative hypothesis is that it does not have that distribution.

(ii) Gaussianity hypothesis test for heteroskedastic residuals (sign test): The residual sign test examines whether the pattern of the residual sequence is unusual for a zero mean uncorrelated series. Thus, the number of plus signs $z_+$, the number of minus signs $z_-$ and the number of runs of equal sign $z$ are counted. For large samples, the distribution of $z$ may be approximated by a Gaussian curve. Then, for a level of risk $\alpha$ it can be stated that (Poulimenos and Fassois, 2006)

For $z \le \hat{\mu}$: $Z_l \ge Z_{\alpha} \Rightarrow H_0$ is accepted (the model is valid), $Z_l < Z_{\alpha} \Rightarrow H_1$ is accepted (the model is not valid).
For $z \ge \hat{\mu}$: $Z_u \le Z_{1-\alpha} \Rightarrow H_0$ is accepted (the model is valid), $Z_u > Z_{1-\alpha} \Rightarrow H_1$ is accepted (the model is not valid).

where $Z_{\alpha}$ and $Z_{1-\alpha}$ designate the standard normal distribution $\alpha$ and $1-\alpha$ critical points, and

$$ Z_l = \frac{z - \hat{\mu} + 1/2}{\hat{\sigma}}, \qquad Z_u = \frac{z - \hat{\mu} - 1/2}{\hat{\sigma}} $$
$$ \hat{\mu} = \frac{2\, z_+ z_-}{z_+ + z_-} + 1, \qquad \hat{\sigma}^2 = \frac{2\, z_+ z_- \,(2\, z_+ z_- - z_+ - z_-)}{(z_+ + z_-)^2\,(z_+ + z_- - 1)} $$

(iii) Analysis of the autocorrelation function: Ideally, if the residuals are NID, the ACF should be zero for all non-zero lags. In the sample ACF, lower and upper bounds within which the correlation should lie can be computed according to a certain significance level.

(iv) Normal probability plot, histogram.

Appendix 2.C Estimation of the Time Dependent Variance

The estimation of the time-dependent variance $\sigma_\varepsilon^2[t]$ of the residual sequence $e[t|t-1]$ can be achieved by different approaches. Two approaches are addressed in this appendix: (i) a non-parametric estimate using centered moving averaging and (ii) a smoothness priors based approach of changing power.

2.C.1 Non-parametric estimation of time-dependent variance

Assuming that the residual sequence $e[t|t-1]$ is quasi-stationary within $2M+1$ samples, a non-parametric estimate of $\sigma_\varepsilon^2[t]$ can be computed by the following expression

$$ \hat{\sigma}_e^2[t] = \frac{1}{2M+1} \sum_{i=-M}^{M} e^2[t-i \,|\, t-i-1] \qquad (2.C.1) $$

where $2M+1$ is the size of the averaging window, and $e[t|t-1]$ is the residual sequence or prediction error.

2.C.2 Smoothness priors based estimator of time-dependent variance

The procedure of estimating the changing power of a time series is considered as described in (Kitagawa and Gersch, 1985). Consider a realization of white noise $s[t]$, where $s[t] \sim \mathcal{N}(0, \sigma^2[t])$ with unknown time-varying variance $\sigma^2[t]$. The stochastic process $\xi[t']$ defined by

$$ \xi[t'] = \big( s^2[2t'-1] + s^2[2t'] \big) / 2 \qquad (2.C.2) $$

constitutes an independent sequence of chi-square random variables with two degrees of freedom ($\xi[t'] \sim \chi_2^2$). Then, the transformation

$$ \eta[t'] = \log\big( \xi[t'] + \eta_o \big) \qquad (2.C.3) $$

where $\eta_o = 0.577$ is the Euler constant, leaves the independent random variable $\eta[t']$ with a distribution which is almost normal and with the moments $E\{\eta[t']\} = \log \sigma^2[t']$, $\mathrm{var}\{\eta[t']\} = \pi^2/36$.
That transformation allows the use of a least squares estimate of $\eta[t']$, and hence the approximation of the unknown variance $\sigma^2[t']$. To obtain a smooth estimate of the variance $\sigma^2[t]$, the following smoothness constraint can be set

$$ \nabla^{\kappa} \eta[t'] = w[t'] \qquad (2.C.4) $$

where $w[t'] \sim \mathcal{N}(0, \tau^2)$. Then, as before, this constraint can be cast into state space as

$$ z[t'] = F\, z[t'-1] + G\, w[t'] \qquad (2.C.5a) $$
$$ \eta[t'] = h^T z[t'] + \varepsilon[t'] \qquad (2.C.5b) $$

Assuming that $\kappa = 2$, then

$$ F = \begin{bmatrix} 2 & -1 \\ 1 & 0 \end{bmatrix}, \quad G = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad h = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} w[t'] \\ \varepsilon[t'] \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \tau^2 & 0 \\ 0 & \sigma^2 \end{bmatrix} \right) $$

Applying the Kalman filter and smoother algorithms, the value of $\eta[t'|N]$ can be estimated, and then the variance $\hat{\sigma}^2[2t'-1|N] = \hat{\sigma}^2[2t'|N] = \exp(\eta[t'|N]) - \eta_o$ is obtained.
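For the non-parametric estimator of this appendix (centered moving average of the squared residuals), a short sketch may help fix ideas. The function name `moving_variance` is hypothetical, and the edge handling is an assumption added here and not part of the original expression: near the ends of the record the average is taken over the samples actually available rather than over zero-padded values.

```python
import numpy as np

def moving_variance(e, M):
    """Centered moving-average estimate of a time-dependent residual variance.

    e : residual (prediction error) sequence
    M : half window length, the full window spans 2*M + 1 samples
    """
    e = np.asarray(e, dtype=float)
    if 2 * M + 1 > e.size:
        raise ValueError("window longer than the residual sequence")
    window = np.ones(2 * M + 1)
    # Sum of squared residuals inside the window centered at each t
    num = np.convolve(e**2, window, mode="same")
    # Number of samples actually inside the window (smaller near the edges)
    den = np.convolve(np.ones_like(e), window, mode="same")
    return num / den
```

The window half-length `M` trades bias against variance: a short window follows fast variance changes but gives a noisy estimate, while a long window smooths the estimate at the cost of blurring those changes.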


Chapter 3

Generalized Stochastic Constraint Time-Dependent ARMA Modeling of Non-Stationary Random Signals

In this chapter, the problem of modeling non-stationary time series featuring deterministic and stochastic patterns in the evolution of the dynamics via Stochastic Parameter Evolution Time-dependent ARMA (SPE TARMA) models is discussed. This work particularly focuses on the class of Generalized (linear) Stochastic Constraint TARMA (GSC TARMA) models, in which the parameter evolution is determined by Gaussian multivariate AR processes, with the well-known Smoothness Priors TARMA models being the special case in which all the roots of the multivariate AR model are equal to one. The GSC TARMA model has the advantage of a simple linear Gaussian form, which simplifies the estimation of the parameter trajectories via conventional Kalman filter methods, while offering the potential to optimize its tracking abilities through proper adjustment of the model hyperparameters. General properties and an estimation framework for this type of model are thoroughly analyzed and discussed. The estimation methods are analyzed and compared in two application examples: Monte Carlo tests based on simulated TAR models with stochastic parameter evolutions, and the identification of operational wind turbine vibration response signals. The results show the improved performance of the proposed GSC-TARMA models in terms of tracking of the time-varying dynamics and representation error.

3.1 Introduction

Time-dependent AutoRegressive Moving Average (TARMA) models are among the most powerful and widely used techniques for the parametric identification and analysis of non-stationary signals and time-dependent systems. A TARMA model of a discrete non-stationary signal $y[t]$, $t = 1, \ldots, N$, denoted as TARMA($n_a$, $n_c$) with $n_a$ designating its AutoRegressive (AR) order and $n_c$ designating its Moving Average (MA) order, is defined as (Poulimenos and Fassois, 2006):

$$ y[t] = -\sum_{i=1}^{n_a} a_i[t]\, y[t-i] + \sum_{i=1}^{n_c} c_i[t]\, w[t-i] + w[t], \qquad w[t] \sim \mathrm{NID}(0, \sigma_w^2) \qquad (3.1.1) $$

where $w[t]$ is an unobservable non-stationary Normally Identically Distributed (NID) innovations sequence characterized by zero mean and variance $\sigma_w^2$, and the functions $a_i[t]$, $c_i[t]$ are the model's time-dependent AR and MA parameters, respectively.

Compared to their non-parametric counterparts, non-stationary parametric TARMA methods offer a series of advantages, including representation parsimony (the capability of accurately describing a process with a small number of parameters), improved representation accuracy and improved tracking of time-varying dynamics (Poulimenos and Fassois, 2006, 2009b; Spiridonakos et al., ). Besides, the TARMA family of models is useful for analysis, simulation, control and damage detection/classification in systems with non-stationary response.

The definition of a TARMA model in Equation (3.1.1) is accompanied by the additional definition of the particular form in which the parameters evolve over time, namely the parameter evolution model. The selection of the parameter evolution model directly affects the tracking performance of the method and is the key to obtaining an accurate model. According to the classification provided in (Poulimenos and Fassois, 2006), the parameter evolution model can be defined according to any of the following formalisms:

(i) Unstructured Parameter Evolution (UPE) models (Ljung 1999, Ch. 5; Niedzwieki, Ch. ), for which no particular structure is imposed on the parameter evolution;
(ii) Stochastic Parameter Evolution (SPE) models (Kitagawa and Gersch 1996, Ch., Poulimenos and Fassois 2006), for which a stochastic structure is imposed on the parameter evolution;
(iii) Deterministic Parameter Evolution (DPE) models (Poulimenos and Fassois, 2006, 2009b; Spiridonakos and Fassois, 2014b), for which a deterministic structure is imposed on the parameter evolution.

Therefore, the first decision the analyst has to make when using TARMA models concerns the type of parameter evolution model to use. From a practical viewpoint, the decision must be made in terms of the modeling accuracy (indicating the predictive capability and the tracking ability of the model) and the simplicity of use (which means selecting a parameter evolution model that facilitates the identification process and the subsequent analysis of the results). Certainly, UPE TARMA models are the simplest to use, since very few assumptions have to be made, while very few adjusting parameters and little user expertise are required (Ljung, 1999; Niedzwieki, ). On the other hand, these models may lack accuracy, parsimony and flexibility, and are easily outperformed by other model types, particularly whenever the rate of change of the non-stationary dynamics is relatively high (Poulimenos and Fassois, 2006). In contrast, DPE TARMA models provide very compact and accurate representations of non-stationary processes, particularly for those whose dynamics evolve in a deterministic way (Poulimenos and Fassois, 2006, 2009b). However, the main shortcoming of DPE TARMA models is the identification process, which includes the selection of a representation basis and a respective basis index set that optimize a certain fitness function (Poulimenos and Fassois, 2006, 2009b; Spiridonakos and Fassois, 2014b). These selections are in many cases not trivial, particularly when the structure of the non-stationary process follows complex paths.
Some methodologies to circumvent this problem exist and include the use of adaptable basis functions, or the application of integer optimization schemes (via genetic algorithms or related optimization methods) or regression schemes (Spiridonakos and Fassois, 2014a,b). SPE TARMA models may be located somewhere in the middle between UPE TARMA models and DPE TARMA models. More precisely, the use of stochastic constraints for the parameter evolution gives SPE TARMA models a degree of flexibility and simplicity that DPE TARMA models lack, while providing a certain structure to the evolution of the parameters that may yield improved tracking capabilities and modeling accuracy compared to the UPE TARMA model class. Within the stochastic model class, the Smoothness Priors TARMA (SP-TARMA) models, which feature q-order integrated white noise (and thus non-stationary mean) to model the parameter evolutions (Kitagawa and Gersch, 1996), are by far the most used. More elaborate models utilize more general Markov processes and/or non-Gaussian distributions to represent the parameter evolution, where the exact type of Markov process dictates the actual dynamic characteristics of the parameter evolutions (Francq and Gautier, 2004; Hsiao, 2008; Kitagawa, 1987; Rajan et al., ).

The identification (modeling) of non-stationary signals via SPE TARMA methods requires attention to two related sub-problems:

(i) The estimation of the parameter trajectories of the TARMA model, which is typically tackled by means of recursive Bayesian estimation methods (where Kalman filtering and smoothing are the solutions for the most commonly found linear Gaussian case).
(ii) The identification of the parameters of the associated probability distributions of the Bayesian model, which are referred to as hyperparameters.

Of these two problems, the first one has drawn most of the attention in practical applications. The second one, despite being overlooked in most applications, is of foremost importance for the optimization of the performance of the SPE TARMA methods.

The main subject of this chapter is the identification of SPE TARMA models, and in particular those using stochastic constraints in the form of linear AutoRegressive (AR) processes fed by Normally and Identically Distributed (NID) white Gaussian noise, which shall be referred to as Generalized linear Stochastic Constraint TARMA (GSC TARMA) models. The rationale for the application of this class of SPE model stems from the simplicity and potential improvement of the tracking ability and modeling accuracy of the obtained representation, as compared with other TARMA model classes.
Moreover, such a parameter evolution model also includes the class of SP TARMA models, which utilize integrated AR processes and are the special case when one or several roots of the AR polynomial are equal to one. Besides, the use of Gaussian distributions for the noise terms of the model leads to a fully Gaussian representation, under which the formulation of the estimation of the parameter trajectories and the identification of the model are much simpler than in the case of other non-Gaussian formulations present in the literature (Rasmussen and Williams, 2006, Ch. ), (Shumway and Stoffer,, Ch. 6). The main focus of the present chapter is on Maximum Likelihood and Bayesian identification methods for GSC TARMA models, and its main contributions are:

(i) An overview of Stochastic Parameter Evolution TARMA models and the complete definition of the Generalized linear Stochastic Constraint TARMA model class and its statistical properties.
(ii) Maximum Likelihood and Bayesian inference methods for the identification of the parameter trajectories and hyperparameters of GSC TARMA models.
In particular the following methods are considered:

- A method aiming to optimize the marginal likelihood of the GSC TARMA model, solved via a non-linear iterative optimization.
- A method aiming to optimize the likelihood of the parameters and hyperparameters of the GSC TARMA model, solved via an Expectation Maximization algorithm.
- A method aiming to optimize the Joint Posterior PDF of the parameters and hyperparameters of the GSC TARMA model, solved via Markov Chain Monte Carlo sampling.
- A method aiming to optimize the Joint Posterior PDF of the parameters and hyperparameters of the GSC TARMA model, solved via a joint Kalman filter method based on an augmented state space representation.

(iii) Assessment of the capabilities of the GSC-TARMA models and evaluation of the identification methods via Monte Carlo simulations on different scenarios, including purely deterministic parameter evolution, a mix of stochastic and deterministic parameter evolution, and purely stochastic parameter evolution.
(iv) Assessment of the maximum likelihood and Bayesian GSC TARMA model identification methods and comparison with FS TARMA model identification methods on a real application example on the identification of wind turbine vibration response signals.

The chapter is organized as follows: Section 3.2 provides an overview of stochastic parameter evolution TARMA models; Section 3.3 provides a summary of the statistical properties of GSC TARMA models; Section 3.4 defines the identification problem and provides detailed algorithms for the estimation of the parameter trajectories via Kalman filtering and smoothing, and the estimation of the model hyperparameters by means of Maximum Likelihood and Bayesian inference methods; Sections 3.5, 3.6, and 3.7 provide Monte Carlo simulations demonstrating the capabilities of the methods on the modeling of non-stationary time series with purely stochastic, mixed stochastic and deterministic, and purely deterministic characteristics in the evolutionary dynamics; Section 3.8 provides the evaluation of the methods on the identification of the non-stationary vibration response of an operating wind turbine. Finally, comments and conclusions are provided in Section 3.. Specifics about the estimation methods and other details are given in the Appendices.

3.2 On the class of stochastic parameter evolution TARMA models

Stochastic Parameter Evolution (SPE) TARMA models encompass the family of stochastic models defined on the period of time $t = 1, 2, \ldots, N$ that can be described within the following representation:

$$ \theta[t] = f(\theta[t-1], \ldots, \theta[t-q], \mu) + v[t], \qquad v[t] \sim \mathrm{NID}(0_{n_a}, \Sigma_v) \qquad (3.2.1a) $$
$$ y[t] = \phi^T[t]\, \theta[t] + w[t], \qquad w[t] \sim \mathrm{NID}(0, \sigma_w^2) \qquad (3.2.1b) $$

where $\theta[t]$ is the AR and MA parameter vector and $\phi[t]$ is the regression vector, both of them defined as:

$$ \text{Parameter vector:} \quad \theta[t] = \big[\, a_1[t] \;\cdots\; a_{n_a}[t] \;\; c_1[t] \;\cdots\; c_{n_c}[t] \,\big]^T \quad (n \times 1) $$
$$ \text{Regression vector:} \quad \phi[t] = \big[\, -y[t-1] \;\cdots\; -y[t-n_a] \;\; w[t-1] \;\cdots\; w[t-n_c] \,\big]^T \quad (n \times 1) $$

with $n = n_a + n_c$ being the size of the parameter and regression vectors, and with initial conditions $\theta[0] = 0_n$ and $y[0] = 0$. Notice that in the full TARMA case, the elements $w[t-i]$ contained in the regression vector can be written as $w[t-i] = y[t-i] - \phi^T[t-i]\, \theta[t-i]$, and thus the observation equation becomes a non-linear function of the parameters $\theta[t]$.
An SPE TARMA model is thus defined in terms of two stochastic equations, the first one known as the process equation or parameter evolution equation (Equation (3.2.1a)), and the second one known as the observation equation or TARMA equation (Equation (3.2.1b)). In the parameter evolution equation, $f(\cdot)$ is a function that defines the value of the parameter vector at time $t$ in terms of its previous $q$ values and an unobservable vector stochastic process $v[t] = [\, v_1[t] \;\cdots\; v_n[t] \,]^T$ $(n \times 1)$, referred to as the parameter innovations. The specific form of the function $f(\cdot)$ is determined by the type of stochastic model used to represent the evolution of the parameters. The function $f(\cdot)$ is parametrized by the vector $\mu$, defined as:

$$ \mu = \big[\, \mu_1 \;\; \mu_2 \;\cdots\; \mu_{n_\mu} \,\big]^T \quad (n_\mu \times 1) $$

which is referred to as the stochastic constraint vector. The stochastic constraint vector takes values according to the specific type of function used in the parameter evolution equation. For example, $\mu$ can contain the parameters of an AR model representing the evolution of the TARMA parameters. On the other hand, the TARMA equation is equivalent to Equation (3.1.1) providing the definition of the TARMA model:

$$ y[t] = \phi^T[t]\, \theta[t] + w[t] = -\sum_{i=1}^{n_a} a_i[t]\, y[t-i] + \sum_{i=1}^{n_c} c_i[t]\, w[t-i] + w[t] $$

(Footnote: Notice that Normally and Identically Distributed noise has been used for the definition of the parameter innovations in Equation (3.2.1a). However, other types of stochastic processes can be used, including non-Gaussian, multiplicative noise, and so on.)

The parameter innovations $v[t]$ and signal innovations (or simply the innovations) $w[t]$ are serially uncorrelated and mutually uncorrelated stochastic processes, which are typically defined as zero mean white Gaussian processes, satisfying

$$ E\left\{ \begin{bmatrix} v[t] \\ w[t] \end{bmatrix} \begin{bmatrix} v^T[t-\tau] & w[t-\tau] \end{bmatrix} \right\} = \begin{bmatrix} \Sigma_v & 0 \\ 0 & \sigma_w^2 \end{bmatrix} \delta[\tau] $$

where $E\{\cdot\}$ represents the expectation operator in the joint probabilistic space of $v[t]$ and $w[t]$, and $\delta[t]$ is the Kronecker delta. The quantities $\mu$, $\Sigma_v$ and $\sigma_w^2$, and the initial conditions $y[0]$ and $\theta[0]$, constitute the hyperparameters of the SPE TARMA model, which are collectively grouped in the variable $\mathcal{P} = \{\mu, \Sigma_v, \sigma_w^2, y[0], \theta[0]\}$.

Stochastic Parameter Evolution (SPE) models were initially introduced in (Kitagawa and Gersch, 1985) for the modeling of non-stationary-in-the-covariance time series, based on the solution of Shiller and Akaike for ill-posed regression problems within the Bayesian framework (Shiller, 1973; Akaike, 1980). For such a case, the parameter evolution was defined in terms of Smoothness Priors (SP) that take the form of linear difference equations with unit roots excited by an NID (parameter innovations) process (Kitagawa and Gersch, 1985). Formally expressed, the smoothness prior constraints are defined as follows:

$$ (1-B)^q\, \theta[t] = v[t], \qquad v[t] \sim \mathrm{NID}(0, \Sigma_v), \qquad \Sigma_v = \mathrm{diag}\big( \sigma_{v_1}^2, \ldots, \sigma_{v_n}^2 \big) $$

where $B$ is the time shift operator (with $B\, x[t] = x[t-1]$), and $q \in \mathbb{Z}^+$ is the smoothness constraint order. Thus, for $q = 1$ and $q = 2$, the parameter evolution equation (Equation (3.2.1a)) takes the form

$$ \theta[t] = f(\theta[t-1], \mu) + v[t] = \theta[t-1] + v[t], \quad \text{for } q = 1 $$
$$ \theta[t] = f(\theta[t-1], \theta[t-2], \mu) + v[t] = 2\,\theta[t-1] - \theta[t-2] + v[t], \quad \text{for } q = 2 $$

and so on for higher orders. Notice that the stochastic constraint parameters of the SP-TARMA model correspond to the coefficients of the polynomial $(1-B)^q$. Thus, for stochastic constraint orders $q = 1$ and $q = 2$:

$$ q = 1: \quad 1 - B \;\Rightarrow\; \mu = -1 $$
$$ q = 2: \quad (1-B)^2 = 1 - 2B + B^2 \;\Rightarrow\; \mu = [\, -2 \;\; 1 \,]^T $$

Generalizations of the basic smoothness priors model have also been attempted. For example, in (Tugnait, 1982; Francq and Gautier, 2004) a class of SPE TARMA models is defined for which the parameter evolution is governed by a finite state Markov chain, which is able to model parameter evolutions with abrupt changes or different regimes.
For this type of models, the parameter evolution equation (Equation (3..a)) is defined as follows, θ[t]=f(r t ) θ[t ]+ G(r t ) v[t] where F( ) and G( ) are the parameter transition and noise coupling matrices respectively, v[t] is a vector NID process with covariance matrix Σ v, and r t S = {,,...,n r } is a finite state Markov chain with n r states and transition probabilities: p i j p(r t = j r t = i) and initial probabilities p i p(r = r i ),r i S. It has also been proposed to model the parameter evolution as Markov processes in the general sense (Kitagawa, 987; Gosdill et al., ). Thus θ[t] is a random vector, with conditional PDF p(θ[t] θ[t ]), which is not necessarily Gaussian nor defines a linear relationship between θ[t] and θ[t ]. It is claimed that this parameter evolution model can provide better treatment of abrupt changes over the linear Gaussian smoothness priors (Gosdill et al., ). For example, in (Kitagawa, 987) non-gaussian distributions have been used to reproduce both abrupt and gradual changes of the parameters. Also, in (Miyanaga et al., 986) it is stated that the representation of time-varying parameters of a TARMA model of speech can be approximated by first order smoothness priors with process noise composed of a pseudo-periodical pulse train and a locally stationary Gaussian process. More recently, in (Doucet et al., ; Gosdill et al., 4), specific treatment of the estimation of TARMA models with parameters constrained to be within regions of the parameter space associated with stable observation trajectories, is achieved by means of non-linear stochastic modeling of the parameter evolution. The first problem associated with the identification of an SPE TARMA model is the estimation of the parameter trajectories. The estimation of the parameter trajectories can be posed in terms of the Maximum A Posteriori 5

framework, and is approached via recursive Bayesian filtering and smoothing methods. In the simplest case of SP-TAR models, parameter estimates are obtained via Kalman filtering and smoothing (Kitagawa and Gersch, 1985). In the full SP-TARMA case, the observation equation becomes a non-linear function of the state vector (Poulimenos and Fassois, 2006), and thus it is necessary to perform the estimation by means of the Extended Kalman Filter (EKF) (Anderson and Moore, 1979), or by the use of an Extended Least Squares (ELS)-like approach, where the theoretical innovations are replaced by their prediction errors (Niedzwieki, p. 44). More recently, improved estimation of SP-TARMA models was pursued by means of Unscented Kalman Filters (UKF) or particle filters (Dong et al., 2009; Spiridonakos and Chatzi, 2013), for which the propagation of the probability densities through the non-linear system is approximated by a weighted average of a small set of points propagated through the model equations. UKF-based estimates potentially yield higher accuracy than those derived from the EKF and ELS, with almost the same computational effort. In (Dembo and Zeitouni, 1988), a dual modeling of the SP-TARMA process is postulated, where the evolution of the observations and of the parameters is represented by dual state space representations, in order to facilitate the estimation in the full TARMA case.

The estimation of the parameter trajectories in other types of SPE TARMA models featuring non-linearities or non-Gaussian distributions requires numerical approximation methods for the inference of the evolution of the associated non-linear / non-Gaussian probability densities. However, in some cases it is possible to estimate the time-varying parameters using a KF-like approximation.
Typical methods include particle filters, Markov Chain Monte Carlo methods, and Gaussian approximation filters (Gosdill et al.; Gosdill and Clapp; Robert and Casella, 2004; Hsiao, 2008). Nevertheless, these algorithms tend to be computationally inefficient, especially when the acceptance rate of the Monte Carlo sampler is low (Hsiao, 2008; Robert and Casella, 2004). Therefore, in order to facilitate the parameter estimation process, it may be preferable to preserve a linear Gaussian parameter evolution model instead of a more complex non-linear or non-Gaussian model.

A second problem associated with the identification of SPE TARMA models, and one often overlooked, is the selection of the values of the hyperparameters. In most practical applications, these parameters are defined by the user after several trial and test cycles, until satisfactory results are found. This procedure often yields suboptimal results. In some recent works, an alternative version of the Kalman filter, where the innovations and parameter innovations are normalized, is used in the estimation of SP-TARMA models (Poulimenos and Fassois, 2006; Spiridonakos et al.). This reduces the number of hyperparameters to be adjusted to only one, the variance ratio. The authors later suggest optimizing the value of the variance ratio by minimization of the residual sum of squares.

In other types of stochastic models, the selection of the hyperparameters is approached as an estimation/optimization problem. The estimation of the hyperparameters can then be approached by means of either Maximum Likelihood (ML) or Maximum A Posteriori (MAP) methods. In the ML approach, the hyperparameters are treated as deterministic quantities and the objective consists of maximizing their likelihood given the available data and the unknown parameter trajectories (Kitagawa, Ch. 9; Rasmussen and Williams 2006, Sec. 5.4; Särkä 2013; Shumway and Stoffer 1982; Shumway and Stoffer, pp. 34).
This objective can be formally expressed as:

$$\hat{\mathcal{P}}_{ML} = \arg\max_{\mathcal{P}}\; L(\mathcal{P} \mid y^N, \boldsymbol{\theta}^N) \tag{3.2.2}$$

where $\mathcal{P}$ represents the hyperparameters, $y^N = [\, y[1] \;\; y[2] \;\cdots\; y[N] \,]^T$ and $\boldsymbol{\theta}^N = [\, \boldsymbol{\theta}[1] \;\; \boldsymbol{\theta}[2] \;\cdots\; \boldsymbol{\theta}[N] \,]^T$ represent the observations and the parameter trajectories on the analysis period of length $N$, and $L(\mathcal{P} \mid y^N, \boldsymbol{\theta}^N)$ is the complete data likelihood (of the hyperparameters).

The problem in the implementation of an ML method for this type of problem is that the likelihood function is incomplete, in the sense that the parameter trajectories are unknown and only estimates are available. At the same time, these estimates depend on the values of the hyperparameters. Therefore, it is necessary either to use a marginalized version of the likelihood, where the unknown parameter trajectories are integrated out, or to use parameter estimates to evaluate an approximate value of the likelihood. The optimization problem is then solved using non-linear optimization methods in the first case, and the Expectation Maximization algorithm in the second. Methods of this type have been analyzed for similar types of stochastic models in (Kitagawa, Ch. 9; Rasmussen and Williams 2006, Sec. 5.4; Särkä 2013;

Shumway and Stoffer 1982; Shumway and Stoffer, pp. 34). On the other hand, in the MAP approach the hyperparameters are defined as random variables for which a prior PDF is defined. More specifically:

$$\hat{\mathcal{P}}_{MAP} = \arg\max_{\mathcal{P}}\; p(\boldsymbol{\theta}^N, \mathcal{P} \mid y^N) = \arg\max_{\mathcal{P}}\; p(\boldsymbol{\theta}^N \mid y^N, \mathcal{P})\, p(\mathcal{P}) \tag{3.2.3}$$

where $p(\mathcal{P}, \boldsymbol{\theta}^N \mid y^N)$ is the joint posterior PDF of the parameters and hyperparameters given the observations, and $p(\mathcal{P})$ is a prior PDF for the hyperparameters. The task then consists of performing inference based on the joint posterior PDF of the parameters and hyperparameters given the observations (Berger 1985, Ch. 4; Robert 2007, Ch. 4; Särkä 2013; Wan and Nelson).

3.3 Generalized linear stochastic constraint TARMA models

3.3.1 Model definition

The non-stationary response signal $y[t]$ of a Generalized (linear) Stochastic Constraint TARMA model, GSC-TARMA$(n_a, n_c)_q$, with $n_a$ and $n_c$ designating the respective AR and MA orders and $q$ the stochastic constraint order, is described by the following equation set:

$$\boldsymbol{\theta}[t] = \sum_{k=1}^{q} \mathbf{M}_k\, \boldsymbol{\theta}[t-k] + \mathbf{v}[t], \qquad \mathbf{v}[t] \sim \mathrm{NID}(\mathbf{v}_o, \boldsymbol{\Sigma}_v) \tag{3.3.1a}$$

$$y[t] = \boldsymbol{\phi}^T[t]\, \boldsymbol{\theta}[t] + w[t], \qquad w[t] \sim \mathrm{NID}(0, \sigma_w^2) \tag{3.3.1b}$$

where $\mathbf{M}_k \in \mathbb{R}^{n \times n}$ is the $k$-th stochastic constraint matrix, and $\mathbf{v}[t]$ is an NID process with mean $\mathbf{v}_o \in \mathbb{R}^n$ and covariance matrix $\boldsymbol{\Sigma}_v$. The non-zero mean of the parameter innovations is introduced in order to obtain parameter trajectories with mean (see Appendix 3.B)

$$\boldsymbol{\theta}_o = \left( \mathbf{I}_{n \times n} - \sum_{k=1}^{q} \mathbf{M}_k \right)^{-1} \mathbf{v}_o \tag{3.3.2}$$

provided that all the poles of the multivariate AR process have magnitude less than one. Otherwise, when one or more poles have magnitude equal to one (the case of an integrated multivariate AR process), the parameter innovations mean is zero and $\boldsymbol{\theta}_o \equiv \boldsymbol{\theta}[1]$ is the initial condition of the parameter vector.
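The model definition above can be illustrated with a minimal simulation sketch for the pure AR (GSC-TAR) case. Everything named here is an assumption made for illustration: the function names, the regression vector taken as plain lagged outputs $[y[t-1], \ldots, y[t-n]]$ without sign changes, zero initial signal values, and the parameter mean obtained by taking expectations of the parameter evolution equation, $\boldsymbol{\theta}_o = \sum_k \mathbf{M}_k \boldsymbol{\theta}_o + \mathbf{v}_o$.

```python
import numpy as np

def stationary_mean(M_list, v_o):
    """Mean of theta[t], assuming all poles of the parameter AR process
    lie strictly inside the unit circle: theta_o = (I - sum_k M_k)^{-1} v_o."""
    n = len(v_o)
    return np.linalg.solve(np.eye(n) - sum(M_list), v_o)

def simulate_gsc_tar(M_list, v_o, Sigma_v, sigma_w, N, seed=0):
    """Draw one realization (theta[t], y[t]) of a GSC-TAR model (illustrative)."""
    rng = np.random.default_rng(seed)
    q, n = len(M_list), len(v_o)
    theta_o = stationary_mean(M_list, v_o)
    theta = np.tile(theta_o, (N + q, 1))   # first q rows act as initial conditions
    y = np.zeros(N + n)                    # first n samples are zero initial values
    L = np.linalg.cholesky(Sigma_v)        # to draw v[t] ~ N(v_o, Sigma_v)
    for t in range(N):
        v = v_o + L @ rng.standard_normal(n)
        # parameter evolution: theta[t] = sum_k M_k theta[t-k] + v[t]
        theta[t + q] = sum(M_list[k] @ theta[t + q - 1 - k] for k in range(q)) + v
        # observation: y[t] = phi^T[t] theta[t] + w[t], phi = [y[t-1], ..., y[t-n]]
        phi = y[t:t + n][::-1]
        y[t + n] = phi @ theta[t + q] + sigma_w * rng.standard_normal()
    return theta[q:], y[n:]

# Example: first-order constraint (q = 1) pulling the parameters towards theta_o
theta_target = np.array([1.2, -0.5])
M_list = [0.95 * np.eye(2)]
v_o = (np.eye(2) - M_list[0]) @ theta_target
theta, y = simulate_gsc_tar(M_list, v_o, 1e-5 * np.eye(2), 1.0, N=2000)
```

With these illustrative values the simulated parameter trajectories fluctuate slowly around `theta_target`, which is the behaviour the non-zero innovations mean is meant to produce.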
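The GSC-TARMA model can also be expressed in terms of the state space representation

$$\mathbf{z}[t] = \mathbf{F}(\boldsymbol{\mu})\, \mathbf{z}[t-1] + \mathbf{G}\, \mathbf{v}[t], \qquad \mathbf{v}[t] \sim \mathrm{NID}(\mathbf{v}_o, \boldsymbol{\Sigma}_v) \tag{3.3.3a}$$

$$y[t] = \mathbf{h}^T[t]\, \mathbf{z}[t] + w[t], \qquad w[t] \sim \mathrm{NID}(0, \sigma_w^2) \tag{3.3.3b}$$

where

State transition matrix:

$$\mathbf{F}(\boldsymbol{\mu}) = \begin{bmatrix} \mathbf{M}_1 & \mathbf{M}_2 & \cdots & \mathbf{M}_{q-1} & \mathbf{M}_q \\ \mathbf{I}_{n \times n} & \mathbf{0}_{n \times n} & \cdots & \mathbf{0}_{n \times n} & \mathbf{0}_{n \times n} \\ \mathbf{0}_{n \times n} & \mathbf{I}_{n \times n} & \cdots & \mathbf{0}_{n \times n} & \mathbf{0}_{n \times n} \\ \vdots & & \ddots & & \vdots \\ \mathbf{0}_{n \times n} & \mathbf{0}_{n \times n} & \cdots & \mathbf{I}_{n \times n} & \mathbf{0}_{n \times n} \end{bmatrix}, \qquad \mathbf{F}(\boldsymbol{\mu}) \in \mathbb{R}^{nq \times nq}$$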

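Noise coupling matrix:

$$\mathbf{G} = [\, 1 \;\; \mathbf{0}_{1 \times (q-1)} \,]^T \otimes \mathbf{I}_{n \times n}, \qquad \mathbf{G} \in \mathbb{R}^{nq \times n}$$

Measurement vector:

$$\mathbf{h}[t] = [\, \boldsymbol{\phi}^T[t] \;\; \mathbf{0}_{1 \times (q-1)n} \,]^T, \qquad \mathbf{h}[t] \in \mathbb{R}^{nq}$$

State vector:

$$\mathbf{z}[t] = [\, \boldsymbol{\theta}^T[t] \;\; \boldsymbol{\theta}^T[t-1] \;\cdots\; \boldsymbol{\theta}^T[t-q+1] \,]^T, \qquad \mathbf{z}[t] \in \mathbb{R}^{nq}$$

Parameter innovations vector:

$$\mathbf{v}[t] = [\, v_1[t] \;\; v_2[t] \;\cdots\; v_n[t] \,]^T, \qquad \mathbf{v}[t] \in \mathbb{R}^{n}$$

and $\otimes$ represents the Kronecker product (Golub and van Loan, 1996). For two $2 \times 2$ matrices $\mathbf{A}$ and $\mathbf{B}$, the Kronecker product is

$$\mathbf{A} \otimes \mathbf{B} \triangleq \begin{bmatrix} a_{11} \mathbf{B} & a_{12} \mathbf{B} \\ a_{21} \mathbf{B} & a_{22} \mathbf{B} \end{bmatrix}$$

Notice that the notation explicitly shows the dependency of the state transition matrix on the stochastic constraint vector $\boldsymbol{\mu}$, which for the GSC-TARMA model is defined as:

$$\boldsymbol{\mu} = [\, \mathrm{vec}(\mathbf{M}_1)^T \;\; \mathrm{vec}(\mathbf{M}_2)^T \;\cdots\; \mathrm{vec}(\mathbf{M}_q)^T \,]^T \in \mathbb{R}^{n^2 q}$$

with $\mathrm{vec}(\cdot)$ indicating the vectorization operator, which transforms the matrix $\mathbf{A}$ into the column vector $\mathrm{vec}(\mathbf{A}) \triangleq [\, a_{11} \;\; a_{21} \;\; a_{12} \;\; a_{22} \,]^T$. A more compact form of the GSC-TARMA model is obtained when the stochastic constraint matrices and the parameter innovations covariance are assumed to be diagonal, of the form $\mathbf{M}_k \triangleq \mu_k\, \mathbf{I}_{n \times n}$ and $\boldsymbol{\Sigma}_v = \sigma_v^2\, \mathbf{I}_{n \times n}$.

The stochastic constraint vector, the innovations variance, and the parameter innovations mean vector and covariance matrix, along with the initial conditions ($y[1]$ and $\boldsymbol{\theta}[1]$), are collectively referred to as the GSC-TARMA model hyperparameters, and are noted as

$$\mathcal{P} = \{\boldsymbol{\mu}, \mathbf{v}_o, \sigma_w^2, \boldsymbol{\Sigma}_v, y[1], \boldsymbol{\theta}[1]\} \tag{3.3.4}$$

Once the hyperparameter values and specific realizations of the parameter innovations $\mathbf{v}[t]$ and the signal innovations $w[t]$ for $t = 1, \ldots, N$ are given, it is possible to create a realization of the GSC-TARMA model, including the parameter trajectories $\boldsymbol{\theta}[t]$ and the response signal $y[t]$.

Statistical properties of the GSC-TARMA model

The joint and conditional distributions of the observations and parameters

Under the NID assumption of the parameter and signal innovations, the values of the signal and the state vector at time $t \in [1, 2, \ldots, N]$ are also jointly distributed Gaussian variables that follow the probability distribution (Rasmussen and Williams, 2006, App. A):

$$p\left( y[t], \mathbf{z}[t] \mid \boldsymbol{\phi}[t], \mathbf{z}[t-1], \mathcal{P} \right) = \mathcal{N}\left( \begin{bmatrix} \bar{y}[t|t-1] \\ \bar{\mathbf{z}}[t|t-1] \end{bmatrix}, \begin{bmatrix} \sigma_\varepsilon^2[t] & \boldsymbol{\Sigma}_{y,z}[t] \\ \boldsymbol{\Sigma}_{y,z}^T[t] & \boldsymbol{\Sigma}_{z,z} \end{bmatrix} \right) \tag{3.3.5}$$

where $\mathcal{N}(\cdot, \cdot)$ indicates a multivariate Gaussian distribution with the indicated mean and covariance, and

$$\bar{y}[t|t-1] \triangleq E\left\{ y[t] \mid \boldsymbol{\phi}[t], \mathbf{z}[t-1], \mathcal{P} \right\} = \mathbf{h}^T[t]\, \bar{\mathbf{z}}[t|t-1] \tag{3.3.6a}$$

$$\bar{\mathbf{z}}[t|t-1] \triangleq E\left\{ \mathbf{z}[t] \mid \mathbf{z}[t-1], \mathcal{P} \right\} = \mathbf{F}(\boldsymbol{\mu})\, \mathbf{z}[t-1] + \mathbf{G}\, \mathbf{v}_o \tag{3.3.6b}$$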

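are the signal and parameter vector one-step-ahead predictions, and

$$\sigma_\varepsilon^2[t] \triangleq E\left\{ (y[t] - \bar{y}[t|t-1])^2 \mid \boldsymbol{\phi}[t], \mathbf{z}[t-1], \mathcal{P} \right\} = \sigma_w^2 + \mathbf{h}^T[t]\, \boldsymbol{\Sigma}_{z,z}\, \mathbf{h}[t] \tag{3.3.7a}$$

$$\boldsymbol{\Sigma}_{y,z}[t] \triangleq E\left\{ (y[t] - \bar{y}[t|t-1])\, (\mathbf{z}[t] - \bar{\mathbf{z}}[t|t-1])^T \mid \boldsymbol{\phi}[t], \mathbf{z}[t-1], \mathcal{P} \right\} = \mathbf{h}^T[t]\, \boldsymbol{\Sigma}_{z,z} \tag{3.3.7b}$$

$$\boldsymbol{\Sigma}_{z,z} \triangleq E\left\{ (\mathbf{z}[t] - \bar{\mathbf{z}}[t|t-1])\, (\mathbf{z}[t] - \bar{\mathbf{z}}[t|t-1])^T \mid \mathbf{z}[t-1], \mathcal{P} \right\} = \mathbf{G}\, \boldsymbol{\Sigma}_v\, \mathbf{G}^T \tag{3.3.7c}$$

are the one-step-ahead prediction error variance and covariance matrices, where the expectation operators are defined over the joint probabilistic space of $y[t]$ and $\mathbf{z}[t]$ (see Appendix 3.B for derivations).

Given the joint Gaussian distribution of $y[t]$ and $\mathbf{z}[t]$, it is simple to determine the conditional distributions of $y[t]$ and $\mathbf{z}[t]$ using the identities of Gaussian distributions shown for example in (Rasmussen and Williams, 2006, App. A). More specifically, for the first case, the conditional distribution of $y[t]$ given $\mathbf{z}[t]$ (and $\boldsymbol{\phi}[t], \mathcal{P}$) is

$$p\left( y[t] \mid \mathbf{z}[t], \boldsymbol{\phi}[t], \mathcal{P} \right) = \mathcal{N}\left( \mathbf{h}^T[t]\, \mathbf{z}[t], \sigma_w^2 \right) \tag{3.3.8}$$

The distribution above defines the probability of the signal $y[t]$ given the TARMA parameters at time $t$, the regression vector (containing the previous values of $y[t]$), and the hyperparameters, and describes how the samples of the signal are obtained from a given parameter trajectory and hyperparameter values. On the other hand, the conditional distribution of $\mathbf{z}[t]$ given $y[t]$ (and $\boldsymbol{\phi}[t], \mathcal{P}$) is

$$p\left( \mathbf{z}[t] \mid \mathbf{z}[t-1], y[t], \boldsymbol{\phi}[t], \mathcal{P} \right) = \mathcal{N}\left( \hat{\mathbf{z}}[t|t], \mathbf{P}[t|t] \right) \tag{3.3.9}$$

where

$$\hat{\mathbf{z}}[t|t] = E\left\{ \mathbf{z}[t] \mid \mathbf{z}[t-1], y[t], \boldsymbol{\phi}[t], \mathcal{P} \right\} = \bar{\mathbf{z}}[t|t-1] + \mathbf{K}[t] \left( y[t] - \mathbf{h}^T[t]\, \bar{\mathbf{z}}[t|t-1] \right) \tag{3.3.10a}$$

$$\mathbf{P}[t|t] = E\left\{ (\mathbf{z}[t] - \hat{\mathbf{z}}[t|t])\, (\mathbf{z}[t] - \hat{\mathbf{z}}[t|t])^T \mid \mathbf{z}[t-1], y[t], \boldsymbol{\phi}[t], \mathcal{P} \right\} = \left( \mathbf{I} - \mathbf{K}[t]\, \mathbf{h}^T[t] \right) \boldsymbol{\Sigma}_{z,z} \tag{3.3.10b}$$

$$\mathbf{K}[t] = \boldsymbol{\Sigma}_{y,z}^T[t] \left( \sigma_\varepsilon^2[t] \right)^{-1} = \boldsymbol{\Sigma}_{z,z}\, \mathbf{h}[t] \left( \sigma_w^2 + \mathbf{h}^T[t]\, \boldsymbol{\Sigma}_{z,z}\, \mathbf{h}[t] \right)^{-1} \tag{3.3.10c}$$

and where $\mathbf{K}[t]$ is the Kalman gain. The conditional distribution in Equation (3.3.9) describes the probability of the parameter vector at time $t$, given its previous values, the observed values of the signal, and the hyperparameter values.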
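Notice that the conditional mean $\hat{\mathbf{z}}[t|t]$ is also the value that maximizes the conditional distribution, namely, the Maximum A Posteriori estimate of $\mathbf{z}[t]$ given $y[t]$, $\boldsymbol{\phi}[t]$, $\mathbf{z}[t-1]$ and $\mathcal{P}$, while $\mathbf{P}[t|t]$ is its respective covariance matrix.

The likelihood of the hyperparameters given the complete data

Let $y^N \triangleq \{y[1], y[2], \ldots, y[N]\}$ and $\mathbf{z}^N \triangleq \{\mathbf{z}[1], \mathbf{z}[2], \ldots, \mathbf{z}[N]\}$ represent the complete response signal and state vector trajectories over the period of time $t = 1, \ldots, N$. The joint distribution of $y^N$ and $\mathbf{z}^N$, given the hyperparameters $\mathcal{P}$, is

$$p\left( y^N, \mathbf{z}^N \mid \mathcal{P} \right) = p\left( y^N \mid \mathbf{z}^N, \mathcal{P} \right)\, p\left( \mathbf{z}^N \mid \mathcal{P} \right) \tag{3.3.11}$$

where $p(y^N \mid \mathbf{z}^N, \mathcal{P})$ is referred to as the conditional observation PDF, which is the conditional PDF of the signal $y^N$ given $\mathbf{z}^N$ and $\mathcal{P}$; and $p(\mathbf{z}^N \mid \mathcal{P})$ is the PDF of the parameter trajectory (or parameter PDF) given the hyperparameters $\mathcal{P}$. Both of them are given as follows (see Appendix 3.B.3 and Appendix 3.B.4 or (Anderson and Moore 1979; Gosdill et al.; Shumway and Stoffer, Ch. 6) for the respective derivations):

$$p\left( \mathbf{z}^N \mid \mathcal{P} \right) = \prod_{t=2}^{N} p\left( \mathbf{z}[t] \mid \mathbf{z}[t-1], \mathcal{P} \right) = \prod_{t=2}^{N} \mathcal{N}\left( \bar{\mathbf{z}}[t|t-1], \boldsymbol{\Sigma}_v \right) \tag{3.3.12a}$$

$$p\left( y^N \mid \mathbf{z}^N, \mathcal{P} \right) = \prod_{t=2}^{N} p\left( y[t] \mid \boldsymbol{\phi}[t], \mathbf{z}[t], \mathcal{P} \right) = \prod_{t=2}^{N} \mathcal{N}\left( \mathbf{h}^T[t]\, \mathbf{z}[t], \sigma_w^2 \right) \tag{3.3.12b}$$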

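with initial conditions $\boldsymbol{\phi}[1] \in \mathbb{R}^{n}$ (built from the initial values of $y[t]$ and $w[t]$) and $\mathbf{z}[1] = [\, \boldsymbol{\theta}^T[1] \;\; \mathbf{0}_{1 \times n} \;\cdots\; \mathbf{0}_{1 \times n} \,]^T \in \mathbb{R}^{nq}$.

Figure 3.3.1 provides a graphical description of the evolution of the PDFs described by Equations (3.3.12b) and (3.3.12a) in the GSC-TARMA model. The evolution of the state vector $\mathbf{z}[t]$ is governed by the conditional PDF $p(\mathbf{z}[t] \mid \mathbf{z}[t-1], \mathcal{P})$, while the evolution of the signal $y[t]$ is governed by the conditional PDF $p(y[t] \mid \boldsymbol{\phi}[t], \mathbf{z}[t], \mathcal{P})$ (Gosdill et al.).

[Figure 3.3.1: Graphical description of the evolution of the GSC-TARMA parameters and observed time series and the probability density functions involved. Parameters row: $\mathbf{z}[1], \mathbf{z}[2], \mathbf{z}[3], \ldots, \mathbf{z}[t]$, linked by $p(\mathbf{z}[t] \mid \mathbf{z}[t-1], \mathcal{P})$. Observations row: $y[1], y[2], y[3], \ldots, y[t]$, each generated through $p(y[t] \mid \boldsymbol{\phi}[t], \mathbf{z}[t], \mathcal{P})$.]

Notice that the joint PDF $p(y^N, \mathbf{z}^N \mid \mathcal{P})$ can also be seen as a likelihood function for the hyperparameters given the complete data $y^N$ and $\mathbf{z}^N$. In such a case, the notation $L(\mathcal{P} \mid y^N, \mathbf{z}^N) \triangleq p(y^N, \mathbf{z}^N \mid \mathcal{P})$ shall be used instead to represent this likelihood, which is referred to as the complete data likelihood (Shumway and Stoffer 1982; Shumway and Stoffer, pp. 34). In posterior analysis, the logarithm of $L(\mathcal{P} \mid y^N, \mathbf{z}^N)$ is preferred, which is of the form:

$$\ln L(\mathcal{P} \mid y^N, \mathbf{z}^N) = \ln p(y^N \mid \mathbf{z}^N, \mathcal{P}) + \ln p(\mathbf{z}^N \mid \mathcal{P}) \tag{3.3.13}$$

where computing the logarithm in Equations (3.3.12b) and (3.3.12a) yields:

$$\ln p(y^N \mid \mathbf{z}^N, \mathcal{P}) = K_1 - \frac{N-1}{2} \ln \sigma_w^2 - \frac{1}{2 \sigma_w^2} \sum_{t=2}^{N} \left( y[t] - \mathbf{h}^T[t]\, \mathbf{z}[t] \right)^2$$

$$\ln p(\mathbf{z}^N \mid \mathcal{P}) = K_2 - \frac{N-1}{2} \ln |\boldsymbol{\Sigma}_v| - \frac{1}{2} \sum_{t=2}^{N} \left( \mathbf{z}[t] - \bar{\mathbf{z}}[t|t-1] \right)^T \boldsymbol{\Sigma}_{z,z}^{-1} \left( \mathbf{z}[t] - \bar{\mathbf{z}}[t|t-1] \right)$$

and where $K_1$ and $K_2$ are constants.

The marginal likelihood of the hyperparameters

The marginal PDF of $y[t]$ can be extracted from Equation (3.3.5), and is equal to

$$p(y[t] \mid \boldsymbol{\phi}[t], \mathcal{P}) = \mathcal{N}\left( \bar{y}[t], \sigma_\varepsilon^2 \right) \tag{3.3.14}$$

while the marginal distribution of $y^N$ is

$$p(y^N \mid \mathcal{P}) = \prod_{t=2}^{N} p(y[t] \mid \boldsymbol{\phi}[t], \mathcal{P}) = \prod_{t=2}^{N} \mathcal{N}\left( \bar{y}[t], \sigma_\varepsilon^2[t] \right) \tag{3.3.15}$$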

The quantity $p(y^N \mid \mathcal{P})$, when seen as a function of the hyperparameters, also represents a likelihood, in this case the likelihood obtained by marginalizing the complete data likelihood with respect to the state vector (Kitagawa, Ch. 9; Rasmussen and Williams 2006, Sec. 5.4; Särkä 2013; Shumway and Stoffer, pp. 335). This likelihood is referred to as the marginal likelihood and is noted as $L(\mathcal{P} \mid y^N)$, while the value of its logarithm is

$$\ln L(\mathcal{P} \mid y^N) = K_3 - \frac{1}{2} \sum_{t=2}^{N} \left( \ln \sigma_\varepsilon^2[t] + \frac{w^2[t|t-1]}{\sigma_\varepsilon^2[t]} \right) \tag{3.3.16}$$

with $K_3$ being a constant, and where

$$w[t|t-1] = y[t] - \mathbf{h}^T[t]\, \bar{\mathbf{z}}[t|t-1] \tag{3.3.17}$$

is the one-step-ahead prediction error.

3.4 Identification of a GSC-TARMA model

The identification of a GSC-TARMA model can be stated as the problem of selecting the corresponding model structure $\mathcal{M} = \{n_a, n_c, q\}$, and estimating the AR/MA parameter trajectories $\boldsymbol{\theta}^N$ and hyperparameters $\mathcal{P}$ that best fit a set of available observations $y^N$. Model fitness may be understood in different ways, and in this sense it is possible to derive different fitness measures, which shall be discussed later.

The identification problem for the class of GSC-TARMA models requires the estimation of stochastic and deterministic quantities (the parameter trajectories and the hyperparameters, respectively), as well as a discrete parameter search for the selection of the model structure. Thus, in order to simplify its solution, the identification problem is divided into the following hierarchically related tasks (see also Figure 3.4.1). The term hierarchical is used since in order to solve tasks (ii) and (iii) it is necessary to solve (i), and in order to solve (iii) it is necessary to solve (i) and (ii).

[Figure 3.4.1: Identification loop for a GSC-TARMA model. Inputs: observations $y^N$ and a set of candidate structures. Task (i): estimate $\boldsymbol{\theta}^N$ (Kalman filter and smoother) for given $\mathcal{M}$ and $\mathcal{P}$. Task (ii): evaluate the cost function and adjust $\mathcal{P}$ (MML/EM method) until convergence. Task (iii): draw structural parameters until all $\mathcal{M}$ have been evaluated. Output: $\hat{\boldsymbol{\theta}}^N$, $\hat{\mathcal{P}}_{ML}$, $\mathcal{M}$ of the GSC-TARMA model.]

(i) Estimate the parameter trajectories $\boldsymbol{\theta}^N$ (or equivalently the state vector trajectory $\mathbf{z}^N$) given a sequence of observations $y^N$, and fixed hyperparameters $\mathcal{P}$ and structure $\mathcal{M}$;

(ii) Estimate the values of the hyperparameters $\mathcal{P}$ given the observation trajectory $y^N$ and a structure $\mathcal{M}$ (the parameter trajectories are obtained as by-products);

(iii) Select the model structure $\mathcal{M}$ (the parameter trajectories and hyperparameters are obtained as by-products).

The formulation of each one of these tasks is presented next.

Estimation of the parameter trajectories with known hyperparameters and structure

The estimation of the parameter trajectories consists of computing the values that maximize the a posteriori PDF of the parameter trajectory $\mathbf{z}^N$ given the observations $y^N$, hyperparameters $\mathcal{P}$ and structure $\mathcal{M}$. More specifically, the Maximum A Posteriori (MAP) estimates of the parameter trajectories are obtained by computing the densities $p(\mathbf{z}[t] \mid y^t)$ (filtering) or $p(\mathbf{z}[t] \mid y^N)$ (smoothing), and then finding the values that maximize those probabilities (Doucet et al.; Shumway and Stoffer, Ch. 6; Simon 2006). For the particular case of the GSC-TARMA models, there are two possible scenarios: (i) the GSC-TAR case, where all the densities are Gaussian and the dynamics are linear; and (ii) the full GSC-TARMA case, where the TARMA equation (Equation (3.3.3b)) is a non-linear function of $\mathbf{z}[t]$.

The GSC-TAR case

In the GSC-TAR case all the densities are Gaussian and the state space representation in Equation (3.3.3) is linear. Thus, the computation of the filtering density $p(\mathbf{z}[t] \mid y^t) = \mathcal{N}(\hat{\mathbf{z}}[t|t], \mathbf{P}[t|t])$ can be performed with the Kalman filter, which is summarized as follows (Anderson and Moore 1979, pp. 44; Särkä 2013, Ch. 4; Shumway and Stoffer, Ch. 6):

Kalman filter:

(i) Initialization: For $t = 1$

$$\hat{\mathbf{z}}[1|1] = E\{\mathbf{z}[1]\} \tag{3.4.1a}$$

$$\mathbf{P}[1|1] = E\left\{ (\mathbf{z}[1] - \hat{\mathbf{z}}[1|1])\, (\mathbf{z}[1] - \hat{\mathbf{z}}[1|1])^T \right\} \tag{3.4.1b}$$

(ii) Prediction: Compute $p(\mathbf{z}[t] \mid y^{t-1}) = \mathcal{N}(\hat{\mathbf{z}}[t|t-1], \mathbf{P}[t|t-1])$, where

$$\hat{\mathbf{z}}[t|t-1] = \mathbf{F}(\boldsymbol{\mu})\, \hat{\mathbf{z}}[t-1|t-1] + \mathbf{G}\, \mathbf{v}_o \tag{3.4.2a}$$

$$\mathbf{P}[t|t-1] = \mathbf{F}(\boldsymbol{\mu})\, \mathbf{P}[t-1|t-1]\, \mathbf{F}^T(\boldsymbol{\mu}) + \mathbf{G}\, \boldsymbol{\Sigma}_v\, \mathbf{G}^T \tag{3.4.2b}$$

(iii) Update: After obtaining a new observation $y[t]$, compute $p(\mathbf{z}[t] \mid y^t) = \mathcal{N}(\hat{\mathbf{z}}[t|t], \mathbf{P}[t|t])$, where

$$\hat{\mathbf{z}}[t|t] = \hat{\mathbf{z}}[t|t-1] + \mathbf{K}[t] \left( y[t] - \mathbf{h}^T[t]\, \hat{\mathbf{z}}[t|t-1] \right) \tag{3.4.3a}$$

$$\mathbf{P}[t|t] = \left( \mathbf{I} - \mathbf{K}[t]\, \mathbf{h}^T[t] \right) \mathbf{P}[t|t-1] \tag{3.4.3b}$$

$$\mathbf{K}[t] = \mathbf{P}[t|t-1]\, \mathbf{h}[t] \left( \mathbf{h}^T[t]\, \mathbf{P}[t|t-1]\, \mathbf{h}[t] + \sigma_w^2 \right)^{-1} \tag{3.4.3c}$$

where

- $\hat{\mathbf{z}}[t|t-1] = E\{\mathbf{z}[t] \mid y^{t-1}\}$ is the a priori state estimate;
- $\mathbf{P}[t|t-1] = E\{(\mathbf{z}[t] - \hat{\mathbf{z}}[t|t-1])\, (\mathbf{z}[t] - \hat{\mathbf{z}}[t|t-1])^T \mid y^{t-1}\}$ is the a priori state estimation error covariance matrix;
- $\hat{\mathbf{z}}[t|t] = E\{\mathbf{z}[t] \mid y^{t}\}$ is the a posteriori state estimate;
- $\mathbf{P}[t|t] = E\{(\mathbf{z}[t] - \hat{\mathbf{z}}[t|t])\, (\mathbf{z}[t] - \hat{\mathbf{z}}[t|t])^T \mid y^{t}\}$ is the a posteriori state estimation error covariance matrix;
- $\mathbf{K}[t]$ is the Kalman gain.

If there is no previous knowledge of the initial values of the state and state error covariance matrix, then these values can be set as $\hat{\mathbf{z}}[1|1] = \mathbf{0}_M$ and $\mathbf{P}[1|1] = p\, \mathbf{I}_M$, where $p$ is a large constant.
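The filter recursion above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the implementation used in this work: the function name and arguments are hypothetical, the regression vector is assumed to hold plain lagged outputs $[y[t-1], \ldots, y[t-n_a]]$, and the recursion is started at the first sample for which a full regression vector is available, with a zero state and a large covariance as suggested for the case of no prior knowledge.

```python
import numpy as np

def gsc_tar_kalman_filter(y, na, F, G, v_o, Sigma_v, sigma2_w, p0=1e4):
    """Kalman filter for a GSC-TAR model in state space form (illustrative).

    Returns filtered states, a priori prediction errors and their variances.
    """
    n_state = F.shape[0]
    N = len(y)
    z = np.zeros(n_state)              # no prior knowledge: zero initial state
    P = p0 * np.eye(n_state)           # ... with a large initial covariance
    z_filt = np.zeros((N, n_state))
    pred_err = np.zeros(N)
    pred_var = np.ones(N)
    for t in range(na, N):
        # measurement vector: regression vector padded with zeros up to n*q
        h = np.zeros(n_state)
        h[:na] = y[t - na:t][::-1]     # [y[t-1], ..., y[t-na]] (assumed convention)
        # prediction step
        z_pred = F @ z + G @ v_o
        P_pred = F @ P @ F.T + G @ Sigma_v @ G.T
        # update step
        s2 = h @ P_pred @ h + sigma2_w         # prediction error variance
        K = P_pred @ h / s2                    # Kalman gain
        e = y[t] - h @ z_pred                  # a priori prediction error
        z = z_pred + K * e
        P = P_pred - np.outer(K, h) @ P_pred   # (I - K h^T) P_pred
        z_filt[t], pred_err[t], pred_var[t] = z, e, s2
    return z_filt, pred_err, pred_var

# Example: first-order smoothness priors (F = G = I, v_o = 0) on an AR(2) signal
rng = np.random.default_rng(1)
y = np.zeros(1000)
for t in range(2, len(y)):
    y[t] = 1.2 * y[t - 1] - 0.5 * y[t - 2] + rng.standard_normal()
I2 = np.eye(2)
z_filt, pred_err, pred_var = gsc_tar_kalman_filter(
    y, 2, I2, I2, np.zeros(2), 1e-8 * I2, 1.0)
```

For this stationary test signal the filtered parameters settle close to the generating coefficients; larger values of `Sigma_v` trade estimate variance for tracking ability.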

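Once the filtered estimates are ready, smoothed state estimates can be computed by means of the fixed interval smoother defined by the following set of equations (Anderson and Moore 1979, pp. 44; Särkä 2013, Ch. 4; Shumway and Stoffer, Ch. 6):

Kalman smoother: Compute $p(\mathbf{z}[t] \mid y^N) = \mathcal{N}(\hat{\mathbf{z}}[t|N], \mathbf{P}[t|N])$, where

$$\begin{aligned} \hat{\mathbf{z}}[t|N] &= \hat{\mathbf{z}}[t|t] + \mathbf{A}[t] \left( \hat{\mathbf{z}}[t+1|N] - \hat{\mathbf{z}}[t+1|t] \right) \\ \mathbf{P}[t|N] &= \mathbf{P}[t|t] - \mathbf{A}[t] \left( \mathbf{P}[t+1|t] - \mathbf{P}[t+1|N] \right) \mathbf{A}^T[t] \\ \mathbf{A}[t] &= \mathbf{P}[t|t]\, \mathbf{F}^T(\boldsymbol{\mu})\, \mathbf{P}^{-1}[t+1|t] \end{aligned} \tag{3.4.4}$$

where

- $\hat{\mathbf{z}}[t|N] = E\{\mathbf{z}[t] \mid y^N\}$ is the smoothed state estimate;
- $\mathbf{P}[t|N] = E\{(\mathbf{z}[t] - \hat{\mathbf{z}}[t|N])\, (\mathbf{z}[t] - \hat{\mathbf{z}}[t|N])^T \mid y^N\}$ is the smoothed state estimation error covariance matrix;
- $\mathbf{A}[t]$ is the smoother gain.

The EM algorithm, discussed later in this chapter, also requires smoothed estimates of the lag-one state estimation error covariance matrices, which are obtained by means of the following recursion (Shumway and Stoffer, Ch. 6):

Lag-one Covariance Smoother:

$$\begin{aligned} \mathbf{P}[N, N-1|N] &= \left( \mathbf{I} - \mathbf{K}[N]\, \mathbf{h}^T[N] \right) \mathbf{F}(\boldsymbol{\mu})\, \mathbf{P}[N-1|N-1] \\ \mathbf{P}[t, t-1|N] &= \mathbf{P}[t|t]\, \mathbf{A}^T[t-1] + \mathbf{A}[t] \left( \mathbf{P}[t+1, t|N] - \mathbf{F}(\boldsymbol{\mu})\, \mathbf{P}[t|t] \right) \mathbf{A}^T[t-1] \end{aligned} \tag{3.4.5}$$

where $\mathbf{P}[t, t-1|N] = E\{(\mathbf{z}[t] - \hat{\mathbf{z}}[t|N])\, (\mathbf{z}[t-1] - \hat{\mathbf{z}}[t-1|N])^T \mid y^N\}$ is the smoothed lag-one state estimation error covariance matrix.

Once Kalman filter and smoother estimates of the parameter trajectories are available, it may also be of interest to obtain estimates of the signal. As with the Kalman filter and smoother parameter estimates, three types of estimates can be derived, namely a priori estimates (one-step-ahead predictions) $\hat{y}[t|t-1]$, filtered estimates $\hat{y}[t|t]$, and smoothed estimates $\hat{y}[t|N]$, defined as:

$$\hat{y}[t \mid \hat{\boldsymbol{\theta}}[t|t-1]] \triangleq \mathbf{h}^T[t]\, \hat{\mathbf{z}}[t|t-1] \tag{3.4.6a}$$

$$\hat{y}[t \mid \hat{\boldsymbol{\theta}}[t|t]] \triangleq \mathbf{h}^T[t]\, \hat{\mathbf{z}}[t|t] \tag{3.4.6b}$$

$$\hat{y}[t \mid \hat{\boldsymbol{\theta}}[t|N]] \triangleq \mathbf{h}^T[t]\, \hat{\mathbf{z}}[t|N] \tag{3.4.6c}$$

The GSC-TARMA case

As pointed out before, in the complete GSC-TARMA case the observation equation (3.3.3b) is a non-linear function of $\mathbf{z}[t]$.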
Specifically, the (signal) innovations and the state measurement vector are of the form (Poulimenos and Fassois, 2006):

$$w[t, \mathbf{z}^{t}] = y[t] - \mathbf{h}^T[t, \mathbf{z}^{t-1}]\, \mathbf{z}[t]$$

$$\mathbf{h}[t, \mathbf{z}^{t-1}] = \left[\, y[t-1] \;\cdots\; y[t-n_a] \;\; w[t-1, \mathbf{z}^{t-1}] \;\cdots\; w[t-n_c, \mathbf{z}^{t-n_c}] \;\; \mathbf{0}_{1 \times (q-1)n} \,\right]^T \in \mathbb{R}^{qn}$$

For this type of state space representation, it is possible to use non-linear approximation filters like the Extended KF (EKF), Unscented KF (UKF), Particle Filters (PF) or related methods (Doucet et al.; Simon 2006, Ch. 3; Wan and van der Merwe). However, a simpler yet effective method consists of using an Extended Least Squares (ELS)-like approach, where the theoretical innovations $w[t, \mathbf{z}^{t}]$ are replaced by the one-step-ahead prediction errors $e[t] = y[t] - \mathbf{h}^T[t]\, \hat{\mathbf{z}}[t|t-1]$, which are then treated as measurements (Niedzwieki, p. 63; Poulimenos and Fassois, 2006).
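A minimal sketch of the ELS-like idea is given next, under stated assumptions: first-order smoothness priors ($q=1$, so the state is the parameter vector itself), a regression vector made of plain lagged outputs followed by lagged prediction errors, and a scalar parameter innovations variance. The function name and its arguments are hypothetical and only serve the illustration.

```python
import numpy as np

def els_tarma_filter(y, na, nc, sigma2_v, sigma2_w=1.0, p0=1e3):
    """ELS-like Kalman filter for a TARMA(na, nc) model with random-walk parameters.

    The unknown innovations in the regression vector are replaced by the
    one-step-ahead prediction errors e[t] (illustrative sketch).
    """
    n = na + nc
    N = len(y)
    theta = np.zeros(n)
    P = p0 * np.eye(n)
    e = np.zeros(N)                        # prediction errors standing in for w[t]
    theta_hat = np.zeros((N, n))
    for t in range(max(na, nc), N):
        # pseudo-linear regression vector: lagged outputs and lagged errors
        h = np.concatenate((y[t - na:t][::-1], e[t - nc:t][::-1]))
        P = P + sigma2_v * np.eye(n)       # prediction step for a random walk
        e[t] = y[t] - h @ theta            # one-step-ahead prediction error
        K = P @ h / (h @ P @ h + sigma2_w)
        theta = theta + K * e[t]
        P = P - np.outer(K, h) @ P
        theta_hat[t] = theta
    return theta_hat, e

# Example on a stationary ARMA(1,1) signal: y[t] = 0.7 y[t-1] + w[t] + 0.5 w[t-1]
rng = np.random.default_rng(2)
w = rng.standard_normal(4000)
y = np.zeros(4000)
for t in range(1, len(y)):
    y[t] = 0.7 * y[t - 1] + w[t] + 0.5 * w[t - 1]
theta_hat, e = els_tarma_filter(y, na=1, nc=1, sigma2_v=1e-8)
```

Because past errors enter the regression vector, the observation equation is treated as if it were linear in the parameters, which is what makes the standard Kalman recursion applicable.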

3.4.2 Estimation of the hyperparameters via Maximum Likelihood methods

The estimation of the hyperparameters of the GSC-TARMA model concerns the estimation of the stochastic constraint parameter vector $\boldsymbol{\mu}$, the mean value of the parameter innovations $\mathbf{v}_o$, the initial conditions $y[1]$ and $\boldsymbol{\theta}[1]$, as well as the innovations variance $\sigma_w^2$ and the parameter innovations covariance matrix $\boldsymbol{\Sigma}_v$. The estimation of these values can be carried out by means of either Maximum Likelihood (ML) or Bayesian methods. The remainder of this section is devoted to ML estimation methods, while the next section discusses Bayesian identification methods.

An approach based on the marginal likelihood

The marginal likelihood approach consists of obtaining the hyperparameter values that maximize the marginal likelihood, defined in Equation (3.3.16). More specifically, estimates of the hyperparameters are obtained through the following optimization problem (Kitagawa, Ch. 6; Shumway and Stoffer, Ch. 6):

$$\hat{\mathcal{P}} = \arg\max_{\mathcal{P}}\; \ln L(\mathcal{P} \mid y^N) \tag{3.4.7a}$$

$$\ln L(\mathcal{P} \mid y^N) = K - \frac{1}{2} \sum_{t=2}^{N} \left( \ln \hat{\sigma}_\varepsilon^2[t] + \frac{\hat{w}^2[t|t-1]}{\hat{\sigma}_\varepsilon^2[t]} \right) \tag{3.4.7b}$$

where $K$ is a constant, and $\hat{w}[t|t-1] = y[t] - \hat{y}[t|t-1]$ is the a priori estimation error (see Equation (3.4.6)) with corresponding variance $\hat{\sigma}_\varepsilon^2[t]$, which is obtained from the Kalman filter equations (see Equation (3.4.3)). Specifically, both quantities are defined as (Shumway and Stoffer, Ch. 6):

$$\hat{w}[t|t-1] = y[t] - \mathbf{h}^T[t]\, \hat{\mathbf{z}}[t|t-1] \tag{3.4.8a}$$

$$\hat{\sigma}_\varepsilon^2[t] = \mathbf{h}^T[t]\, \mathbf{P}[t|t-1]\, \mathbf{h}[t] + \sigma_w^2 \tag{3.4.8b}$$

Notice that in comparison with the expression of the marginal likelihood in Equation (3.3.16), here the one-step-ahead prediction error $w[t|t-1]$ and its variance $\sigma_\varepsilon^2[t]$ are replaced by the Kalman filter a priori estimates $\hat{w}[t|t-1]$ and $\hat{\sigma}_\varepsilon^2[t]$, respectively. This substitution is necessary since the actual values are unknown and only the estimates provided by the Kalman filter are available.
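As a hedged illustration of how the marginal log-likelihood can be evaluated from the Kalman filter quantities, the sketch below treats the simplest setting: a smoothness priors TAR model with $q=1$ ($\mathbf{F}=\mathbf{G}=\mathbf{I}$, $\mathbf{v}_o=\mathbf{0}$) and a single scalar parameter innovations variance, compared over a coarse grid. The function name, the grid values and the test signal are assumptions for the example; the additive constant is dropped, and the sum starts at the first sample with a complete regression vector.

```python
import numpy as np

def sp_tar_marginal_loglik(y, na, sigma2_v, sigma2_w, p0=1e4):
    """Marginal log-likelihood (up to a constant) of an SP-TAR(na) model, q = 1."""
    theta = np.zeros(na)
    P = p0 * np.eye(na)
    loglik = 0.0
    for t in range(na, len(y)):
        h = y[t - na:t][::-1]                  # [y[t-1], ..., y[t-na]]
        P = P + sigma2_v * np.eye(na)          # prediction with F = G = I, v_o = 0
        s2 = h @ P @ h + sigma2_w              # a priori prediction error variance
        e = y[t] - h @ theta                   # a priori estimation error
        loglik += -0.5 * (np.log(s2) + e**2 / s2)
        K = P @ h / s2                         # update step
        theta = theta + K * e
        P = P - np.outer(K, h) @ P
    return loglik

# Example: TAR(1) signal with a slowly varying coefficient
rng = np.random.default_rng(3)
N = 2000
a = 0.5 + 0.3 * np.sin(2 * np.pi * np.arange(N) / N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = a[t] * y[t - 1] + rng.standard_normal()

grid = [1e-6, 1e-4, 1e-2, 1.0, 10.0]
scores = [sp_tar_marginal_loglik(y, 1, s, 1.0) for s in grid]
best_sigma2_v = grid[int(np.argmax(scores))]
```

A grid search is used here only to keep the sketch short; in practice the same function could be handed to an iterative optimizer, which is the setting discussed in the text.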
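This type of likelihood optimization, referred to as the Maximum Marginal Likelihood (MML) approach, was initially addressed by (Akaike, 1973, 1980; Gupta and Mehra, 1974; Lim and Oppenheim, 1978) for the estimation of the hyperparameters of linear Gaussian models, and more recently for the estimation of the variances of SP-TAR models (Kitagawa, Ch. 6) and for dynamic linear models (Shumway and Stoffer, Ch. 6).

The MML optimization problem in Equation (3.4.7) requires the use of iterative optimization techniques, for which gradient-based methods may yield improved convergence properties. For the analytical evaluation of the gradient of the marginal likelihood, first consider the vector $\mathbf{p} = [\, p_1 \;\; p_2 \;\cdots\; p_K \,]^T \triangleq \mathrm{vec}(\mathcal{P}) \in \mathbb{R}^K$, which is a vector with the hyperparameters. Then, the gradient of the marginal likelihood with respect to the hyperparameter vector is

$$\frac{\partial \ln L(\mathcal{P} \mid y^N)}{\partial \mathbf{p}} = \left[\, \frac{\partial \ln L(\mathcal{P} \mid y^N)}{\partial p_1} \;\; \frac{\partial \ln L(\mathcal{P} \mid y^N)}{\partial p_2} \;\cdots\; \frac{\partial \ln L(\mathcal{P} \mid y^N)}{\partial p_K} \,\right]^T \tag{3.4.9}$$

$$\frac{\partial \ln L(\mathcal{P} \mid y^N)}{\partial p_k} = \sum_{t=2}^{N} \left( -\frac{1}{2} \left( \frac{1}{\hat{\sigma}_\varepsilon^2[t]} - \frac{\hat{w}^2[t|t-1]}{\hat{\sigma}_\varepsilon^4[t]} \right) \frac{\partial \hat{\sigma}_\varepsilon^2[t]}{\partial p_k} + \frac{\hat{w}[t|t-1]}{\hat{\sigma}_\varepsilon^2[t]}\, \mathbf{h}^T[t]\, \frac{\partial \hat{\mathbf{z}}[t|t-1]}{\partial p_k} \right)$$

for all $k = 1, 2, \ldots, K$. Furthermore, the evaluation of the partial derivative of $\hat{\mathbf{z}}[t|t-1]$ with respect to the $k$-th entry of the hyperparameter vector $p_k$ leads to the recursive equations (Wan and Nelson):

$$\frac{\partial \hat{\mathbf{z}}[t|t-1]}{\partial p_k} = \frac{\partial \mathbf{F}(\boldsymbol{\mu})}{\partial p_k}\, \hat{\mathbf{z}}[t-1|t-1] + \mathbf{F}(\boldsymbol{\mu})\, \frac{\partial \hat{\mathbf{z}}[t-1|t-1]}{\partial p_k} + \mathbf{G}\, \frac{\partial \mathbf{v}_o}{\partial p_k} \tag{3.4.10a}$$

$$\frac{\partial \hat{\mathbf{z}}[t|t]}{\partial p_k} = \left( \mathbf{I} - \mathbf{K}[t]\, \mathbf{h}^T[t] \right) \frac{\partial \hat{\mathbf{z}}[t|t-1]}{\partial p_k} + \frac{\partial \mathbf{K}[t]}{\partial p_k} \left( y[t] - \mathbf{h}^T[t]\, \hat{\mathbf{z}}[t|t-1] \right) \tag{3.4.10b}$$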

for $t = 2, 3, \ldots, N$, with initial value $\partial \hat{\mathbf{z}}[1|1] / \partial p_k = \mathbf{0}_{nq}$. Likewise, the following expressions are obtained for the innovations variance

$$\frac{\partial \hat{\sigma}_\varepsilon^2[t]}{\partial p_k} = \mathbf{h}^T[t]\, \frac{\partial \mathbf{P}[t|t-1]}{\partial p_k}\, \mathbf{h}[t] + \frac{\partial \sigma_w^2}{\partial p_k} \tag{3.4.11a}$$

$$\frac{\partial \mathbf{P}[t|t-1]}{\partial p_k} = \left( \frac{\partial \mathbf{F}(\boldsymbol{\mu})}{\partial p_k}\, \mathbf{P}[t-1|t-1] + \mathbf{F}(\boldsymbol{\mu})\, \frac{\partial \mathbf{P}[t-1|t-1]}{\partial p_k} \right) \mathbf{F}^T(\boldsymbol{\mu}) + \mathbf{F}(\boldsymbol{\mu})\, \mathbf{P}[t-1|t-1]\, \frac{\partial \mathbf{F}^T(\boldsymbol{\mu})}{\partial p_k} + \mathbf{G}\, \frac{\partial \boldsymbol{\Sigma}_v}{\partial p_k}\, \mathbf{G}^T \tag{3.4.11b}$$

$$\frac{\partial \mathbf{P}[t|t]}{\partial p_k} = \left( \mathbf{I} - \mathbf{K}[t]\, \mathbf{h}^T[t] \right) \frac{\partial \mathbf{P}[t|t-1]}{\partial p_k} - \frac{\partial \mathbf{K}[t]}{\partial p_k}\, \mathbf{h}^T[t]\, \mathbf{P}[t|t-1] \tag{3.4.11c}$$

To simplify the expressions of the gradients, the last terms in Equations (3.4.10b) and (3.4.11c) may be dropped by assuming that the Kalman gain is independent of the hyperparameters (Wan and Nelson).

In a practical application, it should be considered whether the increased complexity and computational expense of calculating the recursive derivatives is worth the improvement in the performance of the optimization method. As an alternative, and since the computation of these gradients is evidently not straightforward, it may be more practical to use numerical approximations or gradient-free optimization methods, at the expense of potentially reduced convergence speed.

Another issue to be considered is the selection of initial values, so that the optimization method does not get trapped in local maxima or encounter saddle points, discontinuities and singularities in the parameter space (Akaike, 1973; Gupta and Mehra, 1974). For this reason it is very important to obtain good starting values to ensure convergence to the global maximum. In that sense, the optimization algorithm may be initialized using a relatively simple but robust model, such as the smoothness priors TARMA model, with reasonably small values for the initial innovations variance $\hat{\sigma}_{w(0)}^2$ and for the initial parameter innovations covariance $\hat{\boldsymbol{\Sigma}}_{v(0)} = \hat{\sigma}_{v(0)}^2\, \mathbf{I}$, with $\hat{\sigma}_{v(0)}^2 > 0$.

Finally, it is important to consider as well that this type of method will require a much larger number of iterations when the size of $\mathbf{p}$ is too large.
For that reason, it is recommended to reduce the number of optimized hyperparameters by considering the compact GSC-TARMA representation with a diagonal structure for the stochastic constraint matrices, $\mathbf{M}_k = \mu_k\, \mathbf{I}$, and for the parameter innovations covariance matrix, $\boldsymbol{\Sigma}_v = \sigma_v^2\, \mathbf{I}$.

An approach based on the complete data likelihood

Alternatively, it is possible to obtain maximum likelihood estimates by maximizing the complete data likelihood $\ln L(\mathcal{P} \mid y^N, \mathbf{z}^N)$, defined in Equation (3.3.13). The problem in the application of this method is that the values $\mathbf{z}^N$ are also unknown. To circumvent this problem, the optimization is carried out instead in terms of the function (Shumway and Stoffer, Ch. 6):

$$Q(\mathcal{P} \mid \mathcal{P}_j) = E\left\{ -\ln L(\mathcal{P} \mid y^N, \mathbf{z}^N) \mid y^N, \mathcal{P}_j \right\} \tag{3.4.12}$$

which is the conditional expectation of the negative complete data log-likelihood given the observed signal $y^N$ and a previously given value (or initial guess) of the hyperparameters $\mathcal{P}_j$, and is referred to as the expected log-likelihood. The optimization problem then translates into finding the values of the hyperparameters that minimize the function $Q(\mathcal{P} \mid \mathcal{P}_j)$, or more explicitly

$$\hat{\mathcal{P}} = \arg\min_{\mathcal{P}}\; Q(\mathcal{P} \mid \mathcal{P}_j) \tag{3.4.13}$$

which is solved by means of the Expectation-Maximization (EM) algorithm. The EM algorithm is an iterative optimization method that alternates between computing $Q(\mathcal{P} \mid \mathcal{P}_j)$ in the expectation step, and finding the hyperparameter values that maximize the expected log-likelihood (equivalently, minimize $Q$) in the maximization step (Dempster et al. 1977; Musicus and Lim 1979; Shumway and Stoffer 1982; Roweis and Ghahramani 1999; Weinstein et al. 1994; Shumway and Stoffer, Ch. 6). The form of the expected log-likelihood for a GSC-TARMA model is shown as follows (see the full derivation of this expression in Appendix 3.C),

80 3.4. Identification of a GSC TARMA model (N ) (N ) Q(P P j )=K+ ln Σ z,z + lnσw ( ) + N ( ) +h σ w y[t] h T [t] ẑz[t N] T [t] P[t N] h[t] t= ( ( )) + tr Σ z,z S F(µ) S T S F T (µ)+ F(µ) S F T (µ) + N t= ( ) ( ) T ẑz[t N] F(µ) ẑz[t N] Σ z,z G v o + v T o G T Σ z,z G v o (3.4.4) where tr( ) indicates the trace operator, and S, S and S are(nq nq) sized matrices, defined as: S = S = S = N t= N t= N t= ( ) P[t N]+ẑz[t N] ẑz T [t N] ( ) P[t,t N]+ẑz[t N] ẑz T [t N] ( ) P[t N]+ẑz[t N] ẑz T [t N] (3.4.5a) (3.4.5b) (3.4.5c) After computing the derivatives of Q(P P j ) with respect to each one of the hyperparameters and equating to zero, leads to the following expressions for the updated hyperparameters (see full derivation of each one of the expressions in Appendix 3.C) ( v o( j) = M ( j) = I n [ q k= [ Σ v( j) = N σw( j) = N S []+ ( ) M k( j ) N t= S [,] N t= N N t= v o( j ) ẑz T [t N] q k= ] ˆθ[t N] ) (3.4.6a) S (3.4.6b) ( ) M k( j ) S T [,k]+ S [k,] M T k( j ) + q q k= l= M k( j ) S [k,l] M T l( j ) + v T o( j ) v o( j ) + v T o( j) v o( j ) (3.4.6c) ( y[t] h T [t] ẑz[t N] ) + h T [t] P[t N] h[t] (3.4.6d) where M = [ ] M M M q, P j ={M ( j), v o( j), Σ v( j),σv( j) } represent the hyperparameter values at the j th iteration, and the matrices S and S are divided into smaller equally-sized submatrices of the following form: S [] S [,] S [,] S [,q] S = S [] = S [,] S [,] S [,q] S [q] S [q,] S [q,] S [q,q] ] where S [k] is an (n nq) matrix and S [k,l] is an(n n) matrix. The following simplified form of the update equations is obtained for the compact form GSC TARMA model obtained when the stochastic constraint parameter and parameter innovations covariance matrices are diagonal, 6

81 where v o( j) = ( µ ( j) = + q k= S S [] ( σ v( j) = n (N ) σw( j) = N N t= ) ( µ k( j ) N S [,] q k= N t= 3. GSC-TARMA modeling of non-stationary random signals ˆθ[t N] ) ( µ k( j ) S ) [,k] + q q k= l= µ k( j ) µ l( j ) S [k,l] (3.4.7a) (3.4.7b) ) + n v o( j ) v T o( j ) + n v o( j ) v T o( j) (3.4.7c) ( y[t] h T [t] ẑz[t N] ) + h T [t] P[t N] h[t] (3.4.7d) S [k,l]=tr(s [k,l]), S [k,l]=tr(s [k,l]), S [k,l]=tr(s [k,l])+ v o( j ) ( N t= ˆθ T [t N] ) (3.4.8) The convergence properties of the EM algorithm have been largely discussed in the literature. The following are the most remarkable properties of the EM algorithm (Dempster et al. 977; Shumway and Stoffer 98; Wu 983; Yamaguchi and Watanabe 4; Shumway and Stoffer, Ch. 6): (i) Any sequence of estimates of the EM algorithm are guaranteed to increase the marginal likelihoodl(p y N ), and, if bounded above, the EM algorithm converges to a local maximum of the marginal likelihood. (ii) The rate of convergence of the EM algorithm is proportional to γ, where γ is the largest eigenvalue of the ratio of the information matrix of the complete data likelihood to the information matrix of the marginal likelihood. This means that the convergence rate reduces as the amount of missing data (the size of the z N ) increases in proportion to the size of the known data y N. (iii) The likelihood L(P y N ) might be characterized by several local maxima, so in order to obtain the global maximum, it may be recommendable to use different starting points for the EM sequence. 
In some cases when the convergence speed of the EM algorithm is very low, a non-linear iterative optimization algorithm can be employed to minimize the expected log-likelihood $Q(\mathcal{P} \mid \mathcal{P}_j)$ directly, and in this way speed up the convergence of the method after some initial iterations have been made with the conventional EM algorithm (Minami, 2004).

Estimation of the hyperparameters via Bayesian methods

The basic assumption for all Bayesian identification methods is that the parameters, the hyperparameters and the observations of the GSC-TARMA model are random quantities, all of them related by the joint PDF $p(y^N, \mathbf{z}^N, \mathcal{P})$. Then, the identification problem is based on the quantity $p(\mathbf{z}^N, \mathcal{P} \mid y^N)$, referred to as the joint parameter posterior, from which Maximum A Posteriori (MAP) estimates of the parameters and hyperparameters are obtained by maximizing such PDF (Doucet et al.). Formally speaking, MAP estimates are derived from the following optimization problem:

$$\{\hat{\mathbf{z}}^N, \hat{\mathcal{P}}\} = \arg\max_{\{\mathbf{z}^N, \mathcal{P}\}}\; p\left( \mathbf{z}^N, \mathcal{P} \mid y^N \right) \tag{3.4.19}$$

where, with the help of the Bayes rule, the joint parameter posterior is decomposed as follows:

Joint Parameter Posterior:

$$p\left( \mathbf{z}^N, \mathcal{P} \mid y^N \right) = \frac{p\left( y^N, \mathbf{z}^N, \mathcal{P} \right)}{p(y^N)} = \frac{p\left( y^N \mid \mathbf{z}^N, \mathcal{P} \right)\, p\left( \mathbf{z}^N \mid \mathcal{P} \right)\, p(\mathcal{P})}{p(y^N)} \tag{3.4.20a}$$

Evidence:

$$p(y^N) = \iint p\left( y^N, \mathbf{z}^N, \mathcal{P} \right)\, d\mathbf{z}^N\, d\mathcal{P} \tag{3.4.20b}$$

and where $p(y^N\,|\,z^N, \mathcal{P})$ is the conditional observation PDF of Equation (3.3.b), $p(z^N\,|\,\mathcal{P})$ is the conditional parameter PDF of $z^N$ of Equation (3.3.a), and $p(\mathcal{P})$ is a prior probability for $\mathcal{P}$ (or hyper-prior), which shall be defined according to the specific method used to solve the optimization in Equation (3.4.9).

Optimizing the MAP objective function is difficult, since $z^N$ and $\mathcal{P}$ are nonlinearly and statistically coupled. Two approaches are analyzed next, namely: (i) a method based on Markov Chain Monte Carlo (MCMC) sampling, in which the joint parameter posterior is randomly sampled and MAP estimates of the hyperparameters are then derived from the obtained sample set; and (ii) a joint Kalman filter estimation method, where both $z^N$ and $\mathcal{P}$ are optimized after having (artificially) defined $\mathcal{P}$ as a dynamic variable (Cox 1964; Ljung 1979; Nelson and Stear 1976; Wan and Nelson; Ljung and Soderstrom 1983, Sec..3; Niedzwiecki and Cisowski 1996, Ch. 7).

Estimation via Markov Chain Monte Carlo sampling

This method consists of drawing samples from the joint parameter posterior defined in Equation (3.4.a). Point estimates of the hyperparameter values may then be obtained from sample statistics. For example, MAP estimates are obtained as the sample median value. However, since the evidence is difficult to calculate through the marginalizing integral shown in Equation (3.4.b), the joint parameter posterior is known only up to a normalizing factor, and thus direct sampling is not possible. For that reason, sampling schemes such as the Metropolis-Hastings sampling method must be used (Geyer; Robert 2007, p. 33; Robert and Casella 2004, Sec. 6.).
The Metropolis-Hastings sampling method is summarized as follows:

Metropolis-Hastings sampling method: Given the PDF $\pi(\mathcal{P})$, known up to a normalizing factor, and a conditional density $q(\mathcal{P}'\,|\,\mathcal{P})$, the Metropolis-Hastings algorithm generates the chain $\{\mathcal{P}_m\}$ by
(i) Start with an arbitrary value $\mathcal{P}_0$,
(ii) Update from $\mathcal{P}_m$ to $\mathcal{P}_{m+1}$ for $m = 1, 2, \ldots, M$ by
  (a) Generate $\mathcal{P}' \sim q(\mathcal{P}'\,|\,\mathcal{P}_m)$
  (b) Define $\rho = \min\left\{ \dfrac{\pi(\mathcal{P}')\; q(\mathcal{P}_m\,|\,\mathcal{P}')}{\pi(\mathcal{P}_m)\; q(\mathcal{P}'\,|\,\mathcal{P}_m)},\; 1 \right\}$
  (c) Take $\mathcal{P}_{m+1} = \mathcal{P}'$ with probability $\rho$, and $\mathcal{P}_{m+1} = \mathcal{P}_m$ otherwise
where $\pi(\mathcal{P})$ and $q(\mathcal{P}'\,|\,\mathcal{P})$ are referred to as the target and proposal distributions, respectively.

The method is completed by specifying the form of the target and proposal distributions. The following paragraphs explain how both quantities are calculated for the identification of a GSC-TARMA model.

Target distribution: The target distribution is the un-normalized joint parameter posterior, as in Equation (3.4.a):

$$\pi(\mathcal{P}) \propto p\big(y^N\,|\,z^N, \mathcal{P}\big)\; p\big(z^N\,|\,\mathcal{P}\big)\; p(\mathcal{P}) \qquad (3.4.)$$

where $p(y^N\,|\,z^N, \mathcal{P})$ and $p(z^N\,|\,\mathcal{P})$ are the conditional observation PDF and parameter PDF, defined in Equations (3.3.b) and (3.3.a), respectively. The hyper-priors are not required to be precise representations of the hyperparameters; rough descriptions of their distribution suffice (Berger 1985, Ch. 3; Robert 2007, Ch. 3). Table 3.1 summarizes some exemplary distributions that may be associated with each one of the hyperparameters of the GSC-TARMA model. Since the actual values of $z[t]$ are unknown, they are replaced by the smoothed Kalman filter estimates $\hat{z}[t|N]$, obtained from the Kalman filter recursion shown in Equation (3.4.3).
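The Metropolis-Hastings steps summarized above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the thesis: it assumes a symmetric Gaussian random-walk proposal (so the proposal ratio cancels in ρ), works with the logarithm of the un-normalized target, and uses a hypothetical toy target (`log_target`, an un-normalized Gaussian) in place of the GSC-TARMA joint parameter posterior. The function name `metropolis_hastings`, the step size and the burn-in length are illustrative choices.

```python
import numpy as np

def metropolis_hastings(log_target, p0, n_samples, step, rng):
    """Random-walk Metropolis-Hastings for an un-normalized log-density."""
    p = np.asarray(p0, dtype=float)
    log_p = log_target(p)
    chain = np.empty((n_samples, p.size))
    accepted = 0
    for m in range(n_samples):
        # (a) Gaussian random-walk proposal; symmetric, so q cancels in rho
        cand = p + step * rng.standard_normal(p.size)
        log_cand = log_target(cand)
        # (b)-(c) accept with probability min(1, pi(cand) / pi(p))
        if np.log(rng.uniform()) < log_cand - log_p:
            p, log_p = cand, log_cand
            accepted += 1
        chain[m] = p
    return chain, accepted / n_samples

# Toy target (hypothetical): un-normalized Gaussian centred at [1, -2]
def log_target(p):
    return -0.5 * np.sum((p - np.array([1.0, -2.0])) ** 2)

rng = np.random.default_rng(0)
chain, acc_rate = metropolis_hastings(log_target, [0.0, 0.0], 20000, 1.0, rng)
posterior_mean = chain[5000:].mean(axis=0)  # discard the first samples as burn-in
```

Working in the log domain avoids numerical underflow when the target is a product of many likelihood terms, which is the usual situation with a posterior built from long signals.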

Table 3.1: Some hyper-priors and proposal distributions to be used in the MCMC optimization of the hyperparameters of the GSC-TARMA model.

Hyperparameter | Hyper-prior $\pi(\mathcal{P})$ | Proposal distribution $q(\mathcal{P}\,|\,\mathcal{P}_m)$
Stochastic constraint vector | $\mathcal{N}(0_n, \sigma_\mu^2 I_{n\times n})$ | $\mathcal{N}(\mu_m, \sigma_\mu^2 I_{n\times n})$
Mean parameter vector | $\mathcal{N}(0_n, \sigma_\theta^2 I_{n\times n})$ | $\mathcal{N}(\theta_m, \sigma_\theta^2 I_{n\times n})$
Innovations and parameter innovations variance | $\Gamma(\alpha,\beta)$ | $\Gamma(\alpha,\beta)$

$\sigma_\mu$, $\sigma_\theta$, $\alpha$ and $\beta$ are user-defined parameters. $\Gamma(\alpha,\beta)$ represents a gamma distribution with shape parameter $\alpha$ and scale parameter $\beta$.

Proposal distribution: The proposal distribution determines how new samples are introduced in the algorithm, and also how they are scored via the ratio $\rho$. Two types of distributions are typically used: independent distributions of the form $q(\mathcal{P}')$, which do not depend on the previous values, and random walk distributions, where $\mathcal{P}' = \mathcal{P}_m + \varepsilon_m$ and $\varepsilon_m$ is a perturbation with distribution $g(\varepsilon)$, independent of $\mathcal{P}$, so that $q(\mathcal{P}'\,|\,\mathcal{P}_m) = g(\mathcal{P}' - \mathcal{P}_m)$ (Robert and Casella 2004, Ch. 6). Some exemplary distributions that can be used as proposal distributions for the MCMC identification of the GSC-TARMA model are summarized in Table 3.1.

Joint Kalman filter estimation method

The joint Kalman filter estimation method is based on a reformulation of the state space representation of the GSC-TARMA model in Equation (3..). For this purpose, an augmented state vector $z_J[t]$ is first defined as follows (Anderson and Moore 1979, p. 84; Ljung and Soderstrom 1983, p. 37):

$$z_J[t] = \begin{bmatrix} z^T[t] & \mu^T[t] \end{bmatrix}^T \qquad (3.4.)$$

and then the following modified state space representation is defined based on the augmented state vector:

$$\begin{bmatrix} z[t] \\ \mu[t] \end{bmatrix} = \begin{bmatrix} F(\mu[t-1])\; z[t-1] \\ \mu[t-1] \end{bmatrix} + \begin{bmatrix} G\, v[t] \\ u[t] \end{bmatrix} \qquad (3.4.3a)$$

$$y[t] = h^T[t]\, z[t] + w[t] \qquad (3.4.3b)$$

where $u[t] \sim \mathrm{NID}(0_q, \Sigma_u)$ are the hyperparameter innovations, with covariance matrix

$$E\{u[t]\, u^T[t]\} = \Sigma_u = \varsigma\, I_{q\times q} \qquad (3.4.4)$$

and $\varsigma$ is a user-defined parameter.
The modified state space representation of Equation (3.4.3) may be associated with a joint parameter posterior of the form

$$p\big(\{z_J\}^N\,|\,y^N\big) = p\big(z^N, \mu^N\,|\,y^N\big) \propto p\big(y^N\,|\,z^N, \mu^N\big)\; p\big(z^N, \mu^N\big) \qquad (3.4.5)$$

where $p(y^N\,|\,z^N, \mu^N)$ and $p(z^N, \mu^N)$ are normal distributions that take forms similar to those in Equation (3.3.a) and Equation (3.3.b). Estimating the parameter trajectories and hyperparameters of the GSC-TARMA model through the joint Kalman filter approach amounts to recursive Bayesian estimation of the extended state vector associated with the modified state space representation of Equation (3.4.3). Because this representation is nonlinear in the augmented state, nonlinear approximations of the Kalman filter are required, such as the Extended KF (EKF), the Unscented KF (UKF), Particle Filters (PF) and related methods (Doucet et al.; Wan and van der Merwe; Simon 2006, Ch. 3). The values of the remaining hyperparameters can be updated through successive iterations of the non-linear Kalman filter approximation by means of the following expressions (Avendaño-Valencia and Fassois, 2015d):

$$\hat{\sigma}_w^2 = \frac{1}{N}\sum_{t=1}^{N}\Big[\big(y[t]-h^T[t]\,\hat{z}[t|N]\big)^2 + h^T[t]\,P[t|t]\,h[t]\Big] \qquad (3.4.6a)$$

$$\hat{\theta}_o = \frac{1}{N}\sum_{t=1}^{N}\hat{\theta}[t|t] \qquad (3.4.6b)$$

The convergence properties of this approach are in general not very satisfactory, since the algorithm tends to produce biased estimates and in many cases diverges (Ljung, 1979). The divergence problem can be partially tackled by introducing forgetting factor or annealing techniques, so that the quantities ς, τ are shrunk towards zero as the algorithm iterates (Wan and Nelson). Another option consists of using dual Kalman filtering methods, based on dual state space representations for the parameters and hyperparameters of the GSC-TARMA model, as discussed in (Wan and Nelson, 1997, ). These methods allow for independent treatment of the parameters and hyperparameters and make the convergence of the algorithm easier to control.

Selection of the best model structure

Selecting the structure of a GSC-TARMA model consists of selecting the AR and MA polynomial orders $n_a$ and $n_c$ and the stochastic constraint order $q$. The problem of model order selection is usually addressed by a discrete parameter search on the multidimensional discrete space spanned by $n_a$, $n_c$ and $q$ (Poulimenos and Fassois, 2006). A simplified parameter search strategy for GSC-TARMA models consists of the following three steps (similarly to Poulimenos and Fassois (2006, 2009b)):

(i) With the value of $q$ fixed, estimate GSC-TARMA$(n,n)_q$ models for orders $n$ up to $n_{max}$, where $n_{max}$ is the maximum order to be explored, and select $n_a$ as the order of the best fit model;

(ii) With $n_a$ and $q$ fixed, estimate GSC-TARMA$(n_a,n)_q$ models for orders $n$ up to $n_a$, and select $n_c$ as the order of the best fit model;

(iii) Increase the value of $q$ and return to step (i) if $q$ is lower than the maximum stochastic constraint order $q_{max}$. Otherwise, select $q$ as the stochastic constraint order of the best fit GSC-TARMA$(n_a,n_c)_q$ model.
In the model structure selection process, the fitness of a GSC-TARMA model is evaluated by means of the particular objective function used for optimization. However, for the sake of comparison with other model types it is more useful to evaluate commensurate fitness measures. For that purpose, consider first the filtering and smoothing residuals, defined as follows (see also Equation (3.4.6)):

Filtering residuals:
$$\varepsilon_f[t] \triangleq y[t] - y\big[t\,|\,\hat{\theta}[t|t]\big], \qquad y\big[t\,|\,\hat{\theta}[t|t]\big] \triangleq h^T[t]\,\hat{z}[t|t] \qquad (3.4.7a)$$

Smoothing residuals:
$$\varepsilon_s[t] \triangleq y[t] - y\big[t\,|\,\hat{\theta}[t|N]\big], \qquad y\big[t\,|\,\hat{\theta}[t|N]\big] \triangleq h^T[t]\,\hat{z}[t|N] \qquad (3.4.7b)$$

based on the filtered and smoothed estimates of the parameter vector, $\hat{\theta}[t|t]$ and $\hat{\theta}[t|N]$, respectively. The residuals defined in Equation (3.4.7) are the errors obtained when either the filtering or the smoothing estimates are used to predict the signal, and may be compared directly with the residuals (one-step-ahead prediction errors) obtained with other types of TARMA models (Poulimenos and Fassois 2006, 2009b; Spiridonakos and Fassois 2014b),

$$\varepsilon[t] \triangleq y[t] - y\big[t\,|\,\hat{\theta}[t]\big], \qquad y\big[t\,|\,\hat{\theta}[t]\big] \triangleq \phi^T[t]\,\hat{\theta}[t] \qquad (3.4.8a)$$

where $\hat{\theta}[t]$ are the parameter estimates obtained with the particular TARMA model type. Based on the above, the following fitness measures can be computed:

Mean Squared Error

The Mean Squared Error is a measure of the size of the (filtered or smoothed) estimation residuals, and is defined as follows:

$$\mathrm{MSE} = \frac{1}{N}\sum_{t=1}^{N}\varepsilon^2[t]$$

where $\varepsilon[t]$ represents the estimation residuals, including filtered or smoothed SPE-TARMA model-based residuals. It is also worth mentioning that the commonly used Residual Sum of Squares (RSS) measure corresponds to a non-normalized version of the MSE (Poulimenos and Fassois 2006, 2009b).
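As a rough illustration of how prediction residuals, filtering residuals and the associated MSE and RSS may be computed, the sketch below runs a plain Kalman filter on a TAR model whose parameters follow a random walk. This is a deliberate simplification of the GSC-TARMA stochastic constraint (it corresponds to a smoothness-priors constraint of order one), and the function name `kalman_tar`, the initial covariance and the test signal are assumptions made only for this example. The sign convention follows the regression form $y[t] = \phi^T[t]\,\theta[t] + w[t]$ with $\phi[t] = [-y[t-1], \ldots, -y[t-n_a]]^T$.

```python
import numpy as np

def kalman_tar(y, na, sigma_w2, sigma_v2):
    """Kalman filter for y[t] = phi[t]^T theta[t] + w[t], theta[t] = theta[t-1] + v[t]."""
    N = len(y)
    theta = np.zeros(na)               # initial parameter estimate (assumed zero)
    P = np.eye(na)                     # initial estimation error covariance (assumed)
    pred_res = np.zeros(N)
    filt_res = np.zeros(N)
    theta_filt = np.zeros((N, na))
    for t in range(na, N):
        phi = -y[t - na:t][::-1]           # regression vector [-y[t-1], ..., -y[t-na]]
        P = P + sigma_v2 * np.eye(na)      # time update for random-walk parameters
        e = y[t] - phi @ theta             # one-step-ahead prediction residual
        s = phi @ P @ phi + sigma_w2       # innovation variance
        K = P @ phi / s                    # Kalman gain
        theta = theta + K * e              # measurement update
        P = P - np.outer(K, phi) @ P
        pred_res[t] = e
        filt_res[t] = y[t] - phi @ theta   # filtering residual
        theta_filt[t] = theta
    return theta_filt, pred_res[na:], filt_res[na:]

# Hypothetical test signal: stationary AR(2) with a1 = -1.2, a2 = 0.7
rng = np.random.default_rng(1)
N = 2000
y = np.zeros(N)
for t in range(2, N):
    y[t] = 1.2 * y[t - 1] - 0.7 * y[t - 2] + rng.standard_normal()

theta_filt, pred_res, filt_res = kalman_tar(y, na=2, sigma_w2=1.0, sigma_v2=1e-6)
mse_pred = np.mean(pred_res ** 2)   # prediction MSE
mse_filt = np.mean(filt_res ** 2)   # filtering MSE
rss_filt = np.sum(filt_res ** 2)    # RSS is the non-normalized MSE
```

In this filter each filtering residual equals the prediction residual scaled by $\sigma_w^2/s[t] \le 1$, so the filtering MSE can never exceed the prediction MSE; the gap widens as the parameter innovations variance grows.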

Aggregate Parameter Deviation

Whenever the actual values of the parameters and hyperparameters are available, it is useful to compute the Aggregate Parameter Deviation (APD), which measures the difference between the original parameter trajectories $\theta[t]$ and the smoothed estimates of the parameter trajectories $\hat{\theta}[t|N]$, as (Poulimenos and Fassois, 2006)

$$\mathrm{APD} = \frac{1}{n\,N}\sum_{t=1}^{N}\big(\theta[t]-\hat{\theta}[t|N]\big)^T\big(\theta[t]-\hat{\theta}[t|N]\big)$$

As with the MSE, the APD can also be measured in terms of the filtered parameter estimates $\hat{\theta}[t|t]$.

Validation of the identified models

Validation is carried out in order to determine whether the assumptions made in the definition of the model are satisfied by the identified one (Ljung, 1999, Sec. 6.5; Poulimenos and Fassois, 2006; Soderstrom and Stoica, 1989, Ch. ). These assumptions include the Gaussianity and uncorrelatedness of the estimation residuals. Either the filtered or the smoothed parameter estimates can be used in the evaluation. There are several tools to examine the Gaussianity of the residuals visually, including histograms and normal probability plots. Alternatively, hypothesis tests such as the Kolmogorov-Smirnov test can be used to evaluate the Gaussianity of the residuals quantitatively (Corder and Foreman 2009, Ch. ; Soderstrom and Stoica 1989, Ch. ). Uncorrelatedness can be evaluated by means of plots of the AutoCorrelation Function (ACF) and by whiteness hypothesis tests derived from the ACF estimates, such as the Ljung-Box Q test (Ljung 1999, Sec. 6.5; Soderstrom and Stoica 1989, Ch. ).

3.5 Monte Carlo simulation: SPE-TAR model with time-dependent deterministic trend

3.5.1 Model definition

The main aim of the present evaluation is to analyze the performance of the GSC-TAR model identification methods discussed in this chapter, and to provide a comparison with FS-TAR model based methods.
Footnote: It might be tempting to evaluate as well the Gaussianity, uncorrelatedness and un-cross-correlatedness of the residuals of estimation of the parameter trajectories (i.e. the filtered parameter estimation residuals $\hat{v}[t|t] = z[t] - \hat{z}[t|t]$). Nonetheless, there are two problems with such a test: (i) in most practical scenarios, the actual parameter trajectories $z[t]$ are unavailable; (ii) given the structure of the Kalman filter, the estimates $\hat{v}[t|t]$ and $\hat{v}[t-1|t-1]$ are highly correlated, and consequently a test of uncorrelatedness will certainly fail.

For that purpose, a TAR model with stochastic parameter evolution in combination with a deterministic trend component is studied. To start with, consider the discrete non-stationary signal $y[t]$ defined over the normalized discrete time $t = 1, 2, \ldots, N$, which is governed by the following TAR(2) process

$$y[t] = -a_1[t]\, y[t-1] - a_2[t]\, y[t-2] + w[t] = \phi^T[t]\,\vartheta[t] + w[t], \qquad w[t] \sim \mathrm{NID}(0, \sigma_w^2) \qquad (3.5.1)$$

where $\phi[t] = [\,-y[t-1] \;\; -y[t-2]\,]^T$ and $\vartheta[t] = [\,a_1[t] \;\; a_2[t]\,]^T$ are the regression and parameter vectors, respectively, and $w[t]$ is an NID process with zero mean and variance $\sigma_w^2$. Additionally, the parameter vector $\vartheta[t]$ is governed by the equations

$$\vartheta[t] = \bar{\theta}[t] + \tilde{\theta}[t] \qquad (3.5.2a)$$

$$\bar{\theta}[t] = \begin{bmatrix} \bar{a}_1[t] \\ \bar{a}_2[t] \end{bmatrix} = \begin{bmatrix} -2\rho[t]\cos\omega[t] \\ \rho^2[t] \end{bmatrix} \qquad (3.5.2b)$$

$$\tilde{\theta}[t] = -\mu_1\,\tilde{\theta}[t-1] - \mu_2\,\tilde{\theta}[t-2] + v[t], \qquad v[t] \sim \mathrm{NID}(0_2, \sigma_v^2 I_2) \qquad (3.5.2c)$$

where $\tilde{\theta}[t] \in \mathbb{R}^2$ is the stochastic component of the parameter vector and $\bar{\theta}[t] \in \mathbb{R}^2$ is the deterministic time-dependent trend of the parameter vector, determined by two conjugate (frozen) poles with time-dependent magnitude $\rho[t]$ and time-dependent angle $\omega[t]$, defined as:

$$\rho[t] = \rho_0 + \sum_{k=1}^{2}\rho_k \exp\!\left(-\frac{(t/N-\alpha_k)^2}{\beta_k^2}\right); \qquad \omega[t] = \omega_0 + \sum_{k=1}^{2}\omega_k \exp\!\left(-\frac{(t/N-\alpha_k)^2}{\beta_k^2}\right)$$

with coefficients $\rho_k \in [0,1]$ and $\omega_k \in [0,\pi]$ rad for $k = 0, 1, 2$, and where $\alpha_k \in [0,1]$ is the location and $\beta_k \in \mathbb{R}^+$ the scale parameter of the respective Gaussian functions, for $k = 1, 2$. The stochastic component $\tilde{\theta}[t]$ is governed by an AR model with parameters $\mu = [\,\mu_1 \;\; \mu_2\,]^T$, excited by the NID process $v[t] \in \mathbb{R}^2$ with zero mean and covariance matrix $\Sigma_v = \sigma_v^2 I_2$. Moreover, the processes $w[t]$ and $v[t]$ are serially uncorrelated and mutually uncorrelated. The whole set of hyperparameters that determine the SPE-TAR(2) model defined by Equations (3.5.1) and (3.5.2) is presented in Table 3.2.

Table 3.2: Values of the hyperparameters of the SPE-TAR(2) model with time-dependent deterministic trend described by Equations (3.5.1) and (3.5.2). [Values as extracted; several digits, signs and exponents are missing.]

Parameter | Value
Initial values | y[t] = and θ[t] = for t
Innovations and parameter innovations variances | σw = 3, σv = 7
Stochastic constraint parameters | µ = [ ]^T
Deterministic trend parameters | ρ = .85, ρ = ., ρ = .8; ω = π/4, ω = π/, ω = π/; α = .5, α = .75; β = ., β = .3
Sample length | N =
Number of realizations | M = 5

The evolution of the parameters of the SPE-TAR model defined by Equation (3.5.2) is characterized by the mean value $\bar{\theta}[t]$, which is the same for every realization of the process, while the term $\tilde{\theta}[t]$, defined by an AR process, describes the random variations of the parameter vector $\vartheta[t]$ around the mean value $\bar{\theta}[t]$, which change from realization to realization.
This type of parameter evolution serves as an analogue of a time-dependent system with deterministic dynamics influenced by changing operational or environmental conditions, which modify the evolutionary dynamics at different analysis periods. The identification problem analyzed here is challenging for both GSC-TAR and FS-TAR modeling methods due to the complex nature of the parameter evolution. In the particular case of FS-TAR models, despite the presence of a strong deterministic component in the evolution of the parameters, the randomness introduced in the parameter evolution makes the selection of an optimal functional basis subspace difficult. The aim is therefore to understand the advantages of the GSC-TAR approach in this type of problem.

A Monte Carlo simulation is carried out by generating a set of M = 5 realizations, each one of N = samples, using different realizations of the signal and parameter innovation processes, based on the initial values and the hyperparameter values shown in Table 3.2. Figure 3.5.1 shows a single realization of the process, along with the time-dependent deterministic trend of the parameters and the sample-based 99.6% confidence intervals (equivalent to $\pm 3\sigma_{\theta_i}$) of each one of the parameters. Figure 3.5.2 provides a time-varying frequency domain representation of a single realization of the process via the frozen poles and the frozen time-varying PSD (Poulimenos and Fassois, 2006).

Footnote: The value $\sigma_{\theta_i}$ is the sample-based standard deviation of the parameter trajectories, computed as

$$\sigma_{\theta_i}^2 = \frac{1}{M\,N}\sum_{m=1}^{M}\sum_{t=1}^{N}\big(a_{i,m}[t]-\bar{a}_i[t]\big)^2$$

where $a_{i,m}[t]$ is the $i$-th entry of $\vartheta_m[t]$, the $m$-th realization of the parameter vector.
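The simulation model of this section can be sketched as follows. The code generates one realization of a TAR(2) signal whose parameters are the sum of a deterministic trend (conjugate frozen poles with Gaussian-shaped bumps in magnitude and angle) and an AR(2) stochastic component. All numerical values in the usage part are hypothetical and chosen only so that the frozen poles stay inside the unit circle; they are not the values of Table 3.2. The Gaussian bumps are written as $\exp(-(t/N-\alpha_k)^2/\beta_k^2)$, which is an assumption about the exact scaling, and the function name `simulate_spe_tar2` is illustrative.

```python
import numpy as np

def simulate_spe_tar2(N, rho_par, omega_par, alpha, beta, mu, sigma_w2, sigma_v2, rng):
    """One realization of a TAR(2) process with trend-plus-AR(2) parameter evolution."""
    tn = np.arange(N) / N                                   # normalized time t/N
    bumps = np.exp(-((tn[:, None] - alpha) ** 2) / beta ** 2)  # (N, 2) Gaussian bumps
    rho = rho_par[0] + bumps @ rho_par[1:]                  # pole magnitude rho[t]
    omega = omega_par[0] + bumps @ omega_par[1:]            # pole angle omega[t]
    # deterministic trend: a1 = -2 rho cos(omega), a2 = rho^2
    theta_bar = np.column_stack([-2.0 * rho * np.cos(omega), rho ** 2])
    theta_tilde = np.zeros((N, 2))                          # stochastic component
    y = np.zeros(N)
    for t in range(2, N):
        v = np.sqrt(sigma_v2) * rng.standard_normal(2)
        theta_tilde[t] = -mu[0] * theta_tilde[t - 1] - mu[1] * theta_tilde[t - 2] + v
        a1, a2 = theta_bar[t] + theta_tilde[t]
        w = np.sqrt(sigma_w2) * rng.standard_normal()
        y[t] = -a1 * y[t - 1] - a2 * y[t - 2] + w
    return y, theta_bar, theta_tilde

# Hypothetical settings (assumed for illustration only)
rng = np.random.default_rng(0)
y, theta_bar, theta_tilde = simulate_spe_tar2(
    N=2000,
    rho_par=np.array([0.85, 0.05, 0.05]),
    omega_par=np.array([np.pi / 4, np.pi / 10, np.pi / 8]),
    alpha=np.array([0.25, 0.75]),
    beta=np.array([0.1, 0.1]),
    mu=np.array([-1.6, 0.64]),      # AR(2) with a double pole at 0.8
    sigma_w2=1.0,
    sigma_v2=1e-6,
    rng=rng,
)
```

Repeating the call with different random generators yields the set of realizations of a Monte Carlo study; the trend `theta_bar` is identical across realizations while `theta_tilde` changes.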


[Figure 3.5.1: Single realization of the SPE-TAR(2) model with time-dependent deterministic trend. Top and middle: single realization of the parameter trajectories $a_1[t]$ and $a_2[t]$, along with the time-dependent deterministic trend (parameter means $\bar{a}_1[t]$, $\bar{a}_2[t]$) and the sample-based 99.6% confidence interval. Bottom: single realization of the signal $y[t]$ generated by the SPE-TAR(2) model. Horizontal axis: normalized time.]

[Figure 3.5.2: Frozen poles and time-varying PSD derived from the actual parameters of a single realization of the SPE-TAR(2) model: (a) frozen poles derived from a single realization of the model compared with the corresponding values obtained from the deterministic trend (real vs. imaginary part); (b) frozen time-varying PSD [dB] of a single realization of the process compared with the natural frequency derived from the deterministic trend component $\omega[t]$ (frequency [rad/π] vs. normalized time).]

3.5.2 Identification via GSC-TAR, SP-TAR and FS-TAR models

Presentation and set up of the identification methods

Identification of a single realization of the SPE-TAR(2) process is carried out by means of GSC-TAR, SP-TAR and FS-TAR models. The identification procedure for each one of the model types is described next.

GSC-TAR models

GSC-TAR(2) models with stochastic constraint order $q = \{1, 2, 3\}$, and stochastic constraint matrices and parameter innovations covariance matrix of the form $M_k = \mu_k I_{n_a\times n_a}$ for $k = 1, \ldots, q$ and $\Sigma_v = \sigma_v^2 I_{n_a\times n_a}$, are considered for the identification of the simulated time series. The parameter trajectories and the hyperparameters of the GSC-TAR model are identified with the Markov Chain Monte Carlo method via Metropolis-Hastings Sampling (MHS), the Joint Extended Kalman Filter (JEKF) method, the Maximum Marginal Likelihood (MML) method, and the Expectation-Maximization (EM) method. Details of the implementation of the identification methods are presented in Table 3.3, while a short description is provided next:

MML method: The marginal likelihood $L(\mathcal{P}\,|\,y^N)$ is optimized via a two-stage method. Stage 1: Generalized Pattern Search (GPS) method via MATLAB's Global Optimization Toolbox function patternsearch (Kolda et al., 2006); Stage 2: Simplex Search method implemented via MATLAB's Optimization Toolbox function fminsearch (Lagarias et al., 1998). A two-stage optimization method is adopted in order to cope with potential multiple maxima in the MML method.

EM method: The expected log-likelihood function $Q(\mathcal{P}\,|\,\mathcal{P}_j)$ is optimized in two stages. Stage 1: Conventional EM algorithm; Stage 2: Simplex Search method implemented via MATLAB's Optimization Toolbox function fminsearch (Lagarias et al., 1998). This approach is used to speed up the convergence of the algorithm, which is known to converge slowly (Minami, 2004).
MHS method: Markov Chain Monte Carlo sampling of the hyperparameters via the Metropolis-Hastings algorithm, based on the un-normalized posterior $\pi(\mathcal{P}) = p(y^N\,|\,\mathcal{P})\, p(\mathcal{P}) \propto p(\mathcal{P}\,|\,y^N)$.

JEKF method: JEKF method with backwards smoothing, considering only the optimization of the stochastic constraint parameters. Several passes of the JEKF algorithm over the data are made to ensure convergence.

SP-TAR models

SP-TAR(2) models with stochastic constraint order $q = \{1, 2, 3\}$ are considered for the identification of the simulated time series. The values of the innovations and parameter innovations variances are adjusted by two methods: (i) by optimization of the size of the prediction error (MSE) as in (Poulimenos and Fassois, 2006; Spiridonakos et al.); (ii) by means of the MML method, following the same guidelines as for the case of GSC-TAR models (see Table 3.3).

FS-TAR models

FS-TAR(2) models with two different types of functional basis for the expansion of the parameters of the model, namely a B-spline basis with regular knot positions and a Chebyshev polynomial basis, are considered for the identification of the simulated time series. The projection coefficients of the FS-TAR models are estimated using a Multi-Stage Maximum Likelihood (MS-ML) method (Poulimenos and Fassois, 2006; Spiridonakos and Fassois, 2014b). Due to the complexity of the evolution of the parameter trajectories of the original model, a backwards regression technique is used for the selection of the basis functions used for each model (Spiridonakos and Fassois, 2014b). More specifically, the following procedure is followed:

1. Estimate the projection coefficient vector $\hat{\vartheta}$ and its corresponding covariance matrix $\hat{\Sigma}_\vartheta$ for an FS-TAR model with a large basis order $p_{a,max}$.

2. Find the index of the functional expansion basis associated with the lowest value of the Mahalanobis distance $d_M(0_{n_a p_a}, \hat{\vartheta})$, where:

$$d_M(\vartheta, \hat{\vartheta}) = (\vartheta-\hat{\vartheta})^T\, \hat{\Sigma}_\vartheta^{-1}\, (\vartheta-\hat{\vartheta})$$

3. Compute an FS-TAR model after eliminating from the basis function index vector the index found in the previous step.

4. Repeat steps 2 and 3 after updating the values of $\hat{\vartheta}$ and $\hat{\Sigma}_\vartheta$.

Notice that the previous procedure aims at eliminating the basis functions whose projection coefficients have a distribution that is most likely to contain zero.

Table 3.3: Settings of the various optimization methods used for the identification of the GSC-TAR and SP-TAR models. [Values as extracted; several digits, signs and exponents are missing.]

Initialization (GSC-TAR and SP-TAR models)
Stochastic constraint parameters | GSC-TAR: q = 1: µ = .9; q = 2: µ = [.8, .8]; q = 3: µ = [.7, .43, .79]. SP-TAR: q = 1: µ = ; q = 2: µ = [ , ]; q = 3: µ = [ 3, 3, ]
Parameter innovations variance | σv = 6
Initial parameter vector and innovations variance | Correspond to the parameters and innovations variance of an AR() model fitted to the initial samples of each realization. The values are estimated via MATLAB's function arburg

Maximum Marginal Likelihood (MML) method, GSC-TAR and SP-TAR models
Parameter | Value (Stage 1, Stage 2)
Initial mesh size |
Maximum number of iterations | 4
min |L(P_j | y^N) − L(P_{j−1} | y^N)| | 6, 8
min ‖P_j − P_{j−1}‖ | 6, 8

Expectation-Maximization (EM) method, GSC-TAR models
Parameter | Value (Stage 1, Stage 2)
Maximum number of iterations | 5
min |Q(P_j | P_{j−1}) − Q(P_{j−1} | P_{j−2})| | 4, 8
min ‖P_j − P_{j−1}‖ | 4, 8

Metropolis-Hastings Sampling (MHS) method, GSC-TAR models
Parameter | Value
Prior PDF | p(P) = p(µ) p(σw) p(σv); p(µ) = N(0_q, I_{q×q}); p(σw) = Γ(., .); p(σv) = Γ(4, 4)
Proposal PDF | q(P) = q(µ, µ_j) q(σw) q(σv); q(µ, µ_j) = N(µ_j, . I_{q×q}); q(σw) = Γ(., .); q(σv) = Γ(4, 4)
Number of samples |
Acceptance rate | .4

Joint Extended Kalman Filter (JEKF) method, GSC-TAR models
Parameter | Value
Number of passes |

Multi-Stage Maximum Likelihood (MS-ML) method, FS-TAR models
Parameter | Value
Basis order | B-spline basis: p_amax = [3, 7]; Chebyshev basis: p_amax = 3
Maximum number of iterations | 4
min |L(ϑ_j | y^N) − L(ϑ_{j−1} | y^N)| | 6
min ‖ϑ_j − ϑ_{j−1}‖ | 6
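The selection step of the backwards regression procedure for the FS-TAR basis functions (finding the basis function whose projection coefficients are closest to zero in the Mahalanobis sense) can be sketched as below. This is only an illustration under stated assumptions: the distance is evaluated per basis function, using the sub-vector of coefficients tied to that basis function and the matching block of the estimated covariance matrix; the helper name `least_significant_basis`, the `groups` mapping and the numerical values are hypothetical. The re-estimation of the FS-TAR model after each elimination is not included.

```python
import numpy as np

def least_significant_basis(coef, cov, groups):
    """Return the basis index whose coefficients are closest to zero (Mahalanobis)."""
    dist = {}
    for key, idx in groups.items():
        c = coef[idx]                      # coefficients tied to this basis function
        S = cov[np.ix_(idx, idx)]          # matching covariance block
        dist[key] = float(c @ np.linalg.solve(S, c))  # distance between 0 and c
    return min(dist, key=dist.get), dist

# Hypothetical example: 2 AR parameters expanded on 2 basis functions.
# Coefficient ordering (assumed): [a1/basis0, a1/basis1, a2/basis0, a2/basis1]
coef = np.array([1.0, 0.05, -0.8, 0.02])
cov = 0.01 * np.eye(4)
groups = {0: [0, 2], 1: [1, 3]}

worst, dist = least_significant_basis(coef, cov, groups)
```

In a full backward elimination loop, the basis function returned as `worst` would be removed, the model re-estimated on the reduced basis, and the selection repeated with the updated coefficient vector and covariance matrix.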

Identification results: GSC-TAR and SP-TAR models

Table 3.4 provides a summary of the average values of different performance figures obtained with the identified GSC-TAR and SP-TAR models on the 5 realizations of the Monte Carlo simulation after optimization with the different approaches described in Table 3.3. The performance is measured in terms of the marginal likelihood $L(\mathcal{P}\,|\,y^N)$, the prediction, filtering and smoothing Mean Squared Error (MSE), and the filtering and smoothing Aggregate Parameter Deviation (APD). The performance obtained with the different identification methods for the GSC-TAR model type is approximately the same, with the exception of the JEKF method, which yields the lowest performance. The main reason for this significant drop in performance is that the JEKF method does not include the optimization of the variance of the parameter innovations.

A comparison between the GSC-TAR and SP-TAR models demonstrates that the former yield the best overall performance. The gap between the SP-TAR and GSC-TAR models is most evident in the tracking of the parameter trajectories, measured via the APD, especially at higher values of $q$. For this reason, with SP-TAR models one would be more inclined to select a model with a small value of $q$, while with GSC-TAR models it is possible to select a higher value of $q$ that may provide improved tracking of the parameter trajectories.

Table 3.4: Summary of the average performance figures of the identified GSC-TAR and SP-TAR models for the 5 realizations of the Monte Carlo simulation using different values of the stochastic constraint order with the various optimization methods.
[Table 3.4 layout. Columns: Method; q; ln L(P | y^N); log MSE (prediction, filtering, smoothing); log APD (filtering, smoothing). Rows: GSC-TAR MML; GSC-TAR EM; GSC-TAR MHS; GSC-TAR JEKF; SP-TAR MML. The numerical entries are not present in the extracted text.]

Table 3.5 shows the sample median values of the hyperparameters of the GSC-TAR and SP-TAR models obtained after optimization. For the sake of comparison, the actual values of $\theta_o$ and $\sigma_w$ are also given in the first row of the table. Notice that the values given for the stochastic constraint parameters of the SP-TAR model are not estimated; they are the values fixed by definition for this model type. Also, a detail of the roots of the characteristic polynomial associated with the estimated stochastic constraint vector $\hat{\mu}$ is shown in Figure 3.5.3. The values of $\theta_o$ and $\sigma_w$ estimated through the different identification methods for corresponding orders are quite similar and very close to the actual values. The results of the JEKF method show the largest difference from the actual values. The estimated values of $\mu$ are in general agreement among all the identification methods. By observing the roots associated with the characteristic polynomial of $\mu$, it is evident that at least one root is equal to one, or very close to unity, while the remaining roots have magnitudes over .8, except for those associated with the MHS estimates, which tend to yield lower values. Finally, the values of $\sigma_v$ show the largest variation from method to method. In most cases this difference can be correlated with the stochastic constraint order.

Figure 3.5.4 shows the parameter trajectories of a GSC-TAR(2) model identified with the MML method on a single realization of the process. The plots in the left column show the filtered parameter estimates, while

the plots in the right column show the smoothed counterparts. On each plot, the estimated parameter trajectories are displayed along with their corresponding 99.6% confidence intervals, and are compared with the actual values of the parameter trajectories. Likewise, Figure 3.5.5 shows similar quantities, this time obtained with the SP-TAR(2) model identified with the MML method. The plots show that the filtered estimates of the parameter trajectories of the GSC-TAR(2) model follow the parameter trajectories of the original model more closely than the estimates of the SP-TAR(2) model. On the other hand, the smoothed estimates obtained from the SP-TAR(2) model appear to be over-smoothed, and thus fast changes are not effectively tracked.

Table 3.5: Summary of the average values of the hyperparameters of the identified GSC-TAR and SP-TAR models for the 5 realizations of the Monte Carlo simulation using different values of the stochastic constraint order with the various optimization methods. [Values as extracted; several digits and signs are missing, and the ln σ̂w and ln σ̂v columns are almost entirely absent from the extracted text.]

Method | q | µ̂ | θ̂o
Actual values | | | [.6, .74]
GSC-TAR MML | 1 | [.999] | [.46, .759]
GSC-TAR MML | 2 | [.9, .9] | [.44, .758]
GSC-TAR MML | 3 | [.8, .6, .8] | [.6, .768]
GSC-TAR EM | 1 | [.] | [.5, .76]
GSC-TAR EM | 2 | [.88, .88] | [.88, .84]
GSC-TAR EM | 3 | [.78, .574, .794] | [.67, .795]
GSC-TAR MHS | 1 | [.999] | [.8, .74]
GSC-TAR MHS | 2 | [.886, .887] | [.6, .74]
GSC-TAR MHS | 3 | [.38, .757, .43] | [.6, .74]
GSC-TAR JEKF | 1 | [.9] | [.35, .758]
GSC-TAR JEKF | 2 | [.84, .84] | [.349, .759]
GSC-TAR JEKF | 3 | [.687, .43, .743] | [.349, .758]
SP-TAR MML | 1 | [.] | [.5, .755]
SP-TAR MML | 2 | [., .] | [.57, .679]
SP-TAR MML | 3 | [ 3., 3., .] | [.65, .67]

[Figure 3.5.3: Distribution of the roots (absolute value) of the characteristic polynomials associated with the estimated stochastic constraint parameters $\hat{\mu}$ obtained with each one of the GSC-TAR identification methods (MML, EM, MHS, JEKF) for all the realizations of the Monte Carlo simulations, for each stochastic constraint order.]
It is also evident that the confidence intervals derived from the smoothed estimates are in both cases much tighter than those obtained from the filtered estimates.

Footnote: The confidence intervals are determined by the area in the interior of the intervals described by the time series $\hat{a}_i[t|t] \pm 3\,p_{i,i}^{1/2}[t|t]$ for the case of filtered estimates. Similar values are drawn for smoothed estimates.

Identification of the variances of SP-TAR models by minimization of the prediction error (conventional approach)

In the conventional approach, the considered SPE-TAR model structure is estimated for a grid of values of $\sigma_w^2$ and $\sigma_v^2$, and then the values that minimize the size of the prediction errors (the MSE or RSS) are selected

(Poulimenos and Fassois, 2006; Spiridonakos et al.).

[Figure 3.5.4: Comparison of the parameters of a single realization of the process with their respective estimates derived from the MML GSC-TAR(2) model: (a)-(b) filtered estimates of the parameter vector, $\hat{a}_1[t|t]$ and $\hat{a}_2[t|t]$; (c)-(d) smoothed estimates of the parameter vector, $\hat{a}_1[t|N]$ and $\hat{a}_2[t|N]$. Gray shaded areas indicate 99.6% confidence intervals derived from the filtered and smoothed parameter estimation error covariance matrices $P[t|t]$ and $P[t|N]$. Horizontal axis: normalized time.]

[Figure 3.5.5: Comparison of the parameters of a single realization of the process with their respective estimates derived from the MML SP-TAR(2) model: (a)-(b) filtered estimates of the parameter vector; (c)-(d) smoothed estimates of the parameter vector. Gray shaded areas indicate 99.6% confidence intervals derived from the filtered and smoothed parameter estimation error covariance matrices $P[t|t]$ and $P[t|N]$. Horizontal axis: normalized time.]

For that purpose, the value of the parameter innovations variance of the SP-TAR(2) and GSC-TAR(2) models is set as $\log \sigma_v^2 = \lambda$, where the values of $\lambda$ correspond to 5 regularly sampled values in the interval [, 3]. The remaining values of the hyperparameters of both

model types are assumed known, and are equal to (values as extracted): SP-TAR model: µ = [ , ], θo = [.6, .74], σw = 3; GSC-TAR model: µ = [.9, .9], θo = [.6, .74], σw = 3. GSC-TAR(2) and SP-TAR(2) models are estimated for all the 5 realizations of the Monte Carlo simulation using the considered values of the hyperparameters, and subsequently the MSE and APD (based on filter and smoother estimates) are measured. The marginal likelihood is also measured for reference. The resulting curves are shown in Figure 3.5.6.

[Figure 3.5.6: Conventional identification approach: curves showing the MSE and APD (filter and smoother estimates) and the marginal likelihood $\ln L(\mathcal{P}\,|\,y^N)$ of the SP-TAR and GSC-TAR models for different values of the parameter innovations variance in the logarithmic range $\log \sigma_v^2 \in$ [, 3].]

The obtained results are very revealing. Firstly, the MSE (from both filter and smoother estimates) decreases for increasing values of $\sigma_v$, while at the same time the APD increases. This indicates that improved predictive ability can be obtained at the expense of increased variability in the parameter estimates. Moreover, arbitrarily small filtering and smoothing prediction errors could be obtained by increasing the value of $\sigma_v$. On the other hand, for very low values of $\sigma_v$ the MSE curves associated with the GSC-TAR model stabilize at a certain value, which is equivalent to the MSE obtained with a stationary AR model. The same occurs for SP-TAR models at even smaller values of $\sigma_v$. Therefore, selecting the variances based on the size of the prediction errors is misleading, since it tends to favor models with large variability in the parameter estimates.
On the other hand, the values of σ_v that yield the minimum values of both the marginal likelihood curve and the APD nearly coincide for the SP-TAR model, and coincide exactly for the GSC-TAR counterpart. In conclusion, the results demonstrate that the marginal likelihood is a better criterion for adjusting the variances of SP-TAR and GSC-TAR models than criteria based on the prediction error, such as the MSE or the RSS.

Identification results FS-TAR models

As explained previously, FS-TAR() models using B-spline and Chebyshev polynomial bases are used for the identification of the time series obtained in the Monte Carlo simulation. The performance, measured in terms

of the MSE and the APD, of the FS-TAR() models as the number of rejected basis functions increases in the backward regression scheme is shown in Figure 3.5.7. The lowest MSE is obtained with the FS-TAR() models using the B-spline basis with initial functional basis dimensionality p_amax = 7. On the other hand, the lowest APD is obtained with the Chebyshev polynomial basis with functional basis dimensionality p_amax = 3. Both performance measures tend to deteriorate as the number of rejected basis functions increases, although the drop is small and in some cases the performance improves when the first basis functions are rejected. The best performance (selected by balancing both MSE and APD) is obtained with the following structures:

MS-ML FS-TAR a: FS-TAR() models with B-spline basis subspace of dimensionality p_amax = 3 after rejecting 3 basis functions;
MS-ML FS-TAR b: FS-TAR() models with B-spline basis subspace of dimensionality p_amax = 7 after rejecting 5 basis functions;
MS-ML FS-TAR c: FS-TAR() models with Chebyshev polynomial basis subspace of dimensionality p_amax = 3 after rejecting 3 basis functions.

Histograms displaying the number of times a basis index is rejected from the FS-TAR() models using the B-spline basis (p_amax = {3,7}) and the Chebyshev polynomial basis (p_amax = 3) are shown in Figure 3.5.8. The histograms are shown for the MS-ML FS-TAR a, b and c model structures previously selected. The results in Figure 3.5.8 show that the indices of the rejected basis functions of the FS-TAR models using B-splines are not consistent: different basis functions are rejected at different realizations. In contrast, in the case of FS-TAR models using Chebyshev polynomials, the rejected basis functions are in general those with higher index (with indices, and 3), together with the first and fourth basis functions.
Nonetheless, in both cases it is clear that improved performance is obtained by removing a certain number of basis functions from the FS-TAR representations.

Figure 3.5.7: Selection of the basis indices of the FS-TAR() models using B-spline basis (p_amax = {3,7}) and Chebyshev polynomial basis (p_amax = 3): (a) Mean Squared Error (MSE), and (b) Aggregate Parameter Deviation (APD) for increasing number of rejected basis functions. [Panels: (a) log MSE and (b) log APD versus number of rejected basis functions.]

Comparison of identification results obtained via GSC, SP and FS TAR representations

A comparison of the performance of the identified GSC-TAR(), SP-TAR() and FS-TAR() models with the best structures selected in the previous sections, measured in terms of the MSE and the APD, can be found in Figure 3.5.9. In the case of the GSC-TAR and SP-TAR models, these figures are provided for both filtered and smoothed estimates obtained after optimization with the MML, EM, MHS and JEKF methods using a stochastic constraint order q = for GSC-TAR() models and q = for SP-TAR() models. The results show the improved estimation and tracking ability of the GSC-TAR models in comparison with the SP-TAR and FS-TAR analogues in the presently analyzed problem. As mentioned before, the performance with the MML, EM and MHS methods is very regular in both MSE and APD. The filtered estimates of both GSC-TAR and SP-TAR models yield lower MSE values than the smoothed estimates, whereas the opposite behavior is found

for the APD. Besides, smoothed parameter estimates are always more accurate than the estimates obtained with FS-TAR models, while the filtered estimates are less accurate.

Figure 3.5.8: Selection of the structure of the FS-TAR() model. Histograms indicate the number of times a basis index is rejected from the FS-TAR() model during the identification of the 5 realizations of the Monte Carlo simulation for: (a) MS-ML FS-TAR a; (b) MS-ML FS-TAR b; (c) MS-ML FS-TAR c. [Horizontal axes: indices of rejected basis.]

Figure 3.5.9: Comparison of the overall performance obtained with GSC-TAR(), SP-TAR() and FS-TAR() models in the identification of the simulated time series of the SPE-TAR() model. [Panels: log MSE and log APD for SPE-TAR filtering, SPE-TAR smoothing and FS-TAR; methods: MML, EM, MHS and JEKF GSC-TAR, MML SP-TAR, MS-ML FS-TAR a, b and c.]

Figure 3.5.10, Figure 3.5.11, and Figure 3.5.12 show the estimated parameter trajectories and the respective frozen TV-PSD and natural frequencies obtained on a single realization of the SPE-TAR() process. The parameter trajectories obtained with the GSC-TAR models estimated with the MML, EM and MHS methods follow very similar paths, while those estimated through the JEKF method show larger deviations from the actual parameter trajectories (see Figure 3.5.10(a)). The results obtained with the SP-TAR() model also show accurate tracking of the actual parameter trajectories, although perhaps over-smoothed. Finally, the parameter estimates obtained with the different FS-TAR models also provide accurate tracking, their main problem being the large deviations observed in the initial and final segments of the parameter trajectories.
Similar conclusions can be drawn from the frozen TV-PSDs derived from the estimated parameter trajectories. In general, all models are capable of tracking the time-dependent frequency content in the signal, but the values obtained from the SP-TAR and FS-TAR models appear over-smoothed in comparison with the original ones.
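The frozen TV-PSD and frozen natural frequency used in these comparisons can be obtained directly from a pair of parameter trajectories. The following is a minimal sketch, not the thesis implementation. It assumes the sign convention y[t] + a_1[t] y[t-1] + a_2[t] y[t-2] = w[t], a second-order model, and frequencies expressed in rad/sample; the function names `frozen_psd` and `frozen_natural_frequency` are hypothetical.

```python
import numpy as np


def frozen_psd(a1, a2, sigma_w2, n_freq=256):
    """Frozen PSD of a second-order TAR model, one spectrum per time instant.

    Assumed convention: y[t] + a1[t] y[t-1] + a2[t] y[t-2] = w[t], so that
    S(omega, t) = sigma_w2 / |1 + a1[t] e^{-j omega} + a2[t] e^{-2j omega}|^2.
    Returns omega with shape (n_freq,) and S with shape (n_freq, N).
    """
    a1 = np.asarray(a1, dtype=float)
    a2 = np.asarray(a2, dtype=float)
    omega = np.linspace(0.0, np.pi, n_freq)
    z_inv = np.exp(-1j * omega)[:, None]          # e^{-j omega}, column vector
    A = 1.0 + a1[None, :] * z_inv + a2[None, :] * z_inv**2
    return omega, sigma_w2 / np.abs(A) ** 2


def frozen_natural_frequency(a1, a2):
    """Angle of the upper-half-plane frozen pole of z^2 + a1[t] z + a2[t].

    Returns NaN at the time instants where the frozen poles are real.
    """
    a1 = np.asarray(a1, dtype=float)
    a2 = np.asarray(a2, dtype=float)
    disc = (a1**2 - 4.0 * a2).astype(complex)     # negative for complex poles
    pole = (-a1 + np.sqrt(disc)) / 2.0
    return np.where(pole.imag > 0.0, np.angle(pole), np.nan)
```

Applying both functions to the true and to the estimated trajectories gives the quantities that the frozen TV-PSD figures overlay; over-smoothing of the parameter estimates then shows up as an overly smooth natural frequency curve.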

Figure 3.5.10: Comparison of the original and estimated parameter trajectories obtained from the identified GSC-TAR, SP-TAR and FS-TAR models on a single realization of the Monte Carlo simulation of the SPE-TAR() model. [Panel (a): original, MML, EM, MHS and JEKF GSC-TAR; panel (b): original, MML SP-TAR, MS-ML FS-TAR a, b and c; axes: a_i[t] versus time in samples.]

Figure 3.5.11: Comparison of the frozen TV-PSD and natural frequencies derived from the parameter trajectories of the original and the identified GSC-TAR, SP-TAR and FS-TAR models on a single realization of the Monte Carlo simulation of the SPE-TAR() model. [Panels: original, MML GSC-TAR, EM GSC-TAR, MHS GSC-TAR, JEKF GSC-TAR, MML SP-TAR, MS-ML FS-TAR a, b and c; axes: frequency in rad/π versus time in samples; color scale: PSD in dB.]


Figure 3.5.12: Three-dimensional plots of the frozen TV-PSD and natural frequencies derived from the parameter trajectories of the original and the identified GSC-TAR, SP-TAR and FS-TAR models on a single realization of the Monte Carlo simulation of the SPE-TAR() model.

Validation of the identified GSC, SP and FS TAR models

The estimation residuals of the identified GSC-TAR, SP-TAR and FS-TAR models are tested for Gaussianity and uncorrelatedness, as explained in Section . The Gaussianity of the residuals is evaluated via the Kolmogorov-Smirnov test, while the uncorrelatedness is evaluated using the Ljung-Box Q test (Corder and Foreman 2009, Ch. ; Ljung 1999, Ch. 6; Söderström and Stoica 1989, Ch. ). Both tests are performed at a significance level α = .5. The results in Table 3.6 show that the residuals of all the identified models satisfy both conditions, with the exception of the residuals of the JEKF GSC-TAR method, for which the hypothesis of null correlation is rejected.

Table 3.6: Validation of the Gaussianity and uncorrelatedness of the estimation residuals obtained with the different identification methods for GSC, SP and FS TAR models in all the realizations of the Monte Carlo simulation of the SPE-TAR() process.

Method                 Kolmogorov-Smirnov test    Ljung-Box Q test
MML GSC-TAR()          H.47                       H.958
EM GSC-TAR()           H.48                       H.97
MHS GSC-TAR()          H.58                       H.875
JEKF GSC-TAR()         H.755                      H.
MML SP-TAR()           H.44                       H.973
MS-ML FS-TAR() a       H.43                       H.994
MS-ML FS-TAR() b       H.5                        H.995
MS-ML FS-TAR() c       H.36                       H.996

Results of the tests are provided in terms of the accepted hypothesis and p-value. Significance level of the tests: α = .5.

3.6 Monte Carlo simulation: SPE TAR model

3.6. Model definition

In this simulation analysis, a TAR model with stochastic parameter evolution in combination with a deterministic trend component is studied. To start with, consider the discrete non-stationary signal y[t] defined over the normalized discrete time t =,,, N, which is governed by the following TAR(2) process

\[
y[t] = -a_1[t]\,y[t-1] - a_2[t]\,y[t-2] + w[t] = \phi^T[t]\,\theta[t] + w[t], \qquad w[t] \sim \mathrm{NID}(0, \sigma_w^2) \tag{3.6.1}
\]
where \(\phi[t] = [\,-y[t-1] \;\; -y[t-2]\,]^T\) and \(\theta[t] = [\,a_1[t] \;\; a_2[t]\,]^T\) are the regression and parameter vectors, respectively, and w[t] is an NID process with zero mean and variance \(\sigma_w^2\). The parameter vector θ[t] is governed by the equation

\[
\theta[t] = -\mu_1\,\theta[t-1] - \mu_2\,\theta[t-2] + v[t], \qquad v[t] \sim \mathrm{NID}(v_o, \sigma_v^2 I) \tag{3.6.2a}
\]

with parameters \(\mu = [\,\mu_1 \;\; \mu_2\,]^T\), and excited by the NID process \(v[t] \in \mathbb{R}^2\) with mean \(v_o\) and covariance matrix \(\Sigma_v = \sigma_v^2 I\). Moreover, the processes w[t] and v[t] are serially uncorrelated and mutually uncorrelated. The whole set of hyperparameters that determine the SPE-TAR() model defined by equations (3.6.1) and (3.6.2) is presented in Table 3.7.

A Monte Carlo simulation is carried out by generating a set of M = 5 realizations, each one of them of N = samples, using different realizations for the signal and parameter innovation processes, based on the initial values and the hyperparameter values shown in Table 3.7. Figure 3.6.1 shows a single realization of the process, along with the time-dependent deterministic trend of the parameters and the sample-based 99.6% confidence intervals (equivalent to ±3 σ_θi) of each one of the parameters. Figure 3.6.2 provides a time-varying frequency domain representation of a single realization of the process via the frozen poles and frozen time-varying PSD (Poulimenos and Fassois, 2006).

Table 3.7: Values of the hyperparameters of the SPE-TAR() model with time-dependent deterministic trend described by equations (3.6.1) and (3.6.2).

Parameter                                           Value
Initial values                                      y[t]= and θ[t]= for t
Innovations and parameter innovations variances     σw = 3, σv = 6
Stochastic constraint parameters                    µ = [ ] T
Sample length                                       N =
Number of realizations                              M = 5

Figure 3.6.1: Single realization of the SPE-TAR() model. Top and middle: Single realization of the parameter trajectories a_1[t] and a_2[t], along with the time-dependent deterministic trend and the sample-based 99.6% confidence interval. Bottom: Single realization of the signal generated by the SPE-TAR() model. [Horizontal axis: normalized time.]

The value σ_θi is the sample-based standard deviation of the parameter trajectories, computed as

\[
\sigma_{\theta_i}^2 = \frac{1}{M N} \sum_{m=1}^{M} \sum_{t=1}^{N} \left( a_{i,m}[t] - \bar{a}_i[t] \right)^2
\]

where ϑ_m is the m-th realization of the parameter vector.
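The Monte Carlo set-up and the sample-based standard deviation can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the thesis: the parameter vector is written as a constant level `theta_o` plus a zero-mean deviation that obeys a second-order stochastic constraint (equivalent to a non-zero innovations mean), the sign convention is y[t] = -a_1[t] y[t-1] - a_2[t] y[t-2] + w[t], and all numerical values as well as the function names are hypothetical placeholders rather than the values of Table 3.7.

```python
import numpy as np


def simulate_spe_tar2(n, theta_o, mu, sigma_w2, sigma_v2, rng):
    """One realization of a TAR(2) signal with stochastically evolving parameters.

    Signal:      y[t] = -a1[t] y[t-1] - a2[t] y[t-2] + w[t]
    Parameters:  theta[t] = theta_o + d[t],
                 d[t] = -mu[0] d[t-1] - mu[1] d[t-2] + v[t]   (assumed form)
    """
    theta_o = np.asarray(theta_o, dtype=float)
    d = np.zeros((n, 2))
    y = np.zeros(n)
    w = rng.normal(0.0, np.sqrt(sigma_w2), n)
    v = rng.normal(0.0, np.sqrt(sigma_v2), (n, 2))
    for t in range(2, n):
        d[t] = -mu[0] * d[t - 1] - mu[1] * d[t - 2] + v[t]
        a1, a2 = theta_o + d[t]
        y[t] = -a1 * y[t - 1] - a2 * y[t - 2] + w[t]
    return y, theta_o + d


def monte_carlo(m, n, theta_o, mu, sigma_w2, sigma_v2, seed=0):
    """Generate m independent realizations; returns y (m, n) and theta (m, n, 2)."""
    rng = np.random.default_rng(seed)
    runs = [simulate_spe_tar2(n, theta_o, mu, sigma_w2, sigma_v2, rng)
            for _ in range(m)]
    return np.stack([r[0] for r in runs]), np.stack([r[1] for r in runs])


def sample_band(theta):
    """Ensemble mean trajectory (n, 2) and pooled standard deviation (2,).

    The deviation is pooled over realizations and time about the ensemble
    mean trajectory, which is one reading of the sample-based definition.
    """
    mean_traj = theta.mean(axis=0)
    sigma = np.sqrt(((theta - mean_traj) ** 2).mean(axis=(0, 1)))
    return mean_traj, sigma
```

A band of the form `mean_traj ± 3 * sigma` then plays the role of the sample-based confidence interval drawn around the parameter trajectories.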


Figure 3.6.2: Frozen poles and time-varying PSD derived from the actual parameters of a single realization of the SPE-TAR() model: (a) Frozen poles derived from a single realization of the model compared with the corresponding values obtained from the deterministic trend; (b) Frozen time-varying PSD of a single realization of the process compared with the natural frequency derived from the deterministic trend component ω[t]. [Panel (a): imaginary versus real part; panel (b): frequency in rad/π versus normalized time, PSD in dB.]

Identification via GSC TAR and SP TAR models

Presentation and set up of the identification methods

Identification of a single realization of the SPE-TAR() process is carried out by means of GSC-TAR and SP-TAR models. The identification procedure for each one of the model types follows the same guidelines explained in Section .

Identification results GSC TAR and SP TAR models

Table 3.8 provides a summary of the average values of different performance figures obtained with the identified GSC-TAR and SP-TAR models on the 5 realizations of the Monte Carlo simulation after optimization with the different approaches described in Table 3.3. The performance is measured in terms of the marginal likelihood L(P | y^N), the prediction, filtering and smoothing Mean Squared Error (MSE), and the filtering and smoothing Aggregate Parameter Difference (APD). The performance obtained with the different identification methods for the GSC-TAR model type is approximately the same, with the exception of the JEKF method, which yields the lowest performance. The main reason for this significant drop in performance is that the JEKF method does not include the optimization of the variance of the parameter innovations. A comparison between the GSC-TAR and SP-TAR models demonstrates that the former yield the best overall performance.
The difference in performance between the SP-TAR and GSC-TAR models is more evident in the tracking of the parameter trajectories measured via the APD, especially at higher values of q. For this reason, in SP-TAR models one would be more inclined to select a model with small values of q, while in GSC-TAR models it is possible to select a higher value of q that may provide improved tracking of the parameter trajectories.

Table 3.9 shows the sample median values of the hyperparameters of the GSC-TAR and SP-TAR models obtained after optimization. For the sake of comparison, the actual values of θ_o and σ_w are given in the first row of the table. Notice that the values given for the stochastic constraint parameters of the SP-TAR model are not estimated; they are the values used by definition with this model type. Also, a detail of the roots of the characteristic polynomial associated with the estimated stochastic constraint vector µ̂ is shown in Figure 3.6.3. The values of θ_o and σ_w estimated through the different identification methods for corresponding orders are quite similar and very close to the actual values. The results of the JEKF method show the largest difference from the actual values. The estimated values of µ are in general agreement among all the identification methods. The roots associated with the characteristic polynomial of µ show that at least one root is equal to one, or very close to unity, while the remaining roots have magnitudes

over .8, except for those associated with the MHS estimates, which tend to yield lower values. Finally, the values of σ_v show the largest variation from method to method. In most cases this difference can be correlated with the stochastic constraint order.

Table 3.8: Summary of the average performance figures of the identified GSC-TAR and SP-TAR models for the 5 realizations of the Monte Carlo simulation using different values of the stochastic constraint order with the various optimization methods.
Columns: Method; q; ln L(P | y^N); log MSE (Prediction, Filtering, Smoothing); log APD (Filtering, Smoothing).
Rows: GSC-TAR MML; GSC-TAR EM; GSC-TAR MHS; GSC-TAR JEKF; SP-TAR MML.

Table 3.9: Summary of the average values of the hyperparameters of the identified GSC-TAR and SP-TAR models for the 5 realizations of the Monte Carlo simulation using different values of the stochastic constraint order with the various optimization methods.
Columns: Method; q; µ̂; θ̂_o; ln σ̂_w; ln σ̂_v.
Actual values [.9739,.98] [.95,.9] [.93] [.947,.97] GSC TAR MML [.733,.979] [.95,.99] [.8,.6,.8] [.947,.9] [.] [.955,.864] GSC TAR EM [.88,.88] [.966,.93] [.78,.574,.794] [.946,.9] [.] [.947,.893] GSC TAR MHS [.863,.865] [.947,.893] [.49,.64,.57] [.947,.893] [.97] [.95,.95] GSC TAR JEKF [.8,.8] [.949,.94] [.68,.43,.749] [.95,.95] [.] [.95,.9] SP TAR MML [.,.] [.945,.95] [ 3., 3.,.] [ ,.67]

Figure 3.6.4 shows the parameter trajectories of a GSC-TAR() model identified with the MML method on a single realization of the process. The plots in the left column show the filtered parameter estimates, while the plots in the right column show the smoothed counterparts. On each plot, the estimated parameter trajectories are displayed along with their corresponding 99.6% confidence intervals, and are compared with the actual values of the parameter trajectories.
(Footnote: The confidence intervals are the regions bounded by the time series â_i[t|t] ± 3 p_{i,i}^{1/2}[t|t] in the case of filtered estimates. Analogous bounds are drawn for smoothed estimates.)

Likewise, Figure 3.6.5 shows similar quantities, this time obtained with the SP

TAR() model identified with the MML method. The plots show that the filtered estimates of the parameter trajectories of the GSC-TAR() model follow the parameter trajectories of the original model better than the estimates of the SP-TAR() model. On the other hand, the smoothed estimates obtained from the SP-TAR() model appear to be over-smoothed, and thus fast changes are not effectively tracked. It is also evident that the confidence intervals derived from the smoothed estimates are in both cases much tighter than those obtained from the filtered estimates.

Figure 3.6.3: Distribution of the roots of the characteristic polynomials associated with the estimated stochastic constraint parameters µ̂ obtained with each one of the GSC-TAR identification methods for all the realizations of the Monte Carlo simulations. [Three panels, one per GSC-TAR() model; vertical axis: absolute value of the roots associated with µ; methods: MML, EM, MHS, JEKF.]

Figure 3.6.4: Comparison of the parameters of a single realization of the process with their respective estimates derived from the MML GSC-TAR() model: (a)-(b) filtered estimates of the parameter vector; (c)-(d) smoothed estimates of the parameter vector. Gray shaded areas indicate 99.6% confidence intervals derived from the filtered and smoothed parameter estimation error covariance matrices P[t|t] and P[t|N]. [Horizontal axes: normalized time.]

Identification of the variances of SP TAR models by minimization of the prediction error (conventional approach)

Similar to the test performed in the previous section, this experiment concerns the evaluation of the adjustment of the variances of GSC-TAR() and SP-TAR() models by the minimization of the size of the prediction errors.
For that purpose, the value of the parameter innovations variance is set as log σ_v = λ, where the values of λ

correspond to 5 regularly sampled values in the interval [, 4]. The remaining values of the hyperparameters of both model types are assumed known, and are equal to:

SP-TAR model: µ = [,], θ_o = [.95,.9], σ_w = 3;
GSC-TAR model: µ = [.974,.98], θ_o = [.95,.9], σ_w = 3.

GSC-TAR() and SP-TAR() models are estimated for all the 5 realizations of the Monte Carlo simulation using the considered values of the hyperparameters, and the MSE and APD (based on filter and smoother estimates) are subsequently measured. The marginal likelihood is also measured for reference. The resulting curves are shown in Figure 3.6.6.

Figure 3.6.5: Comparison of the parameters of a single realization of the process with their respective estimates derived from the MML SP-TAR() model: (a)-(b) filtered estimates of the parameter vector; (c)-(d) smoothed estimates of the parameter vector. Gray shaded areas indicate 99.6% confidence intervals derived from the filtered and smoothed parameter estimation error covariance matrices P[t|t] and P[t|N]. [Horizontal axes: normalized time.]

The results obtained in the present case are very similar to the ones obtained in Section 3.5. The tendencies observed in the MSE, APD and marginal likelihood are about the same, the main difference being that GSC-TAR models yield notably improved APD and marginal likelihood values compared to SP-TAR models.
However, the principal conclusion of the experiment is the same, namely that the marginal likelihood is a more robust criterion than the MSE for the adjustment of the hyperparameters, the latter typically leading to improper results.

Comparison of identification results

A comparison of the performance of the identified GSC-TAR(), SP-TAR() and FS-TAR() models with the best structures selected in the previous sections, measured in terms of the MSE and the APD, can be found in Figure 3.6.7. In the case of the GSC-TAR and SP-TAR models, these figures are provided for both filtered and smoothed estimates obtained after optimization with the MML, EM, MHS and JEKF methods using a stochastic constraint order q = for GSC-TAR() models and q = for SP-TAR() models. The results show the improved estimation and tracking ability of the GSC-TAR models in comparison with the SP-TAR and FS-TAR analogues in the presently analyzed problem. As mentioned before, the performance with the MML, EM and MHS methods is very regular in both MSE and APD. The filtered estimates of both GSC-TAR and SP-TAR models yield lower MSE values than the smoothed estimates, whereas the opposite behavior is found for the APD. Besides, smoothed parameter estimates are always more accurate than the estimates obtained with FS-TAR models, while the filtered estimates are less accurate.
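The two selection criteria contrasted in the conventional-approach experiment, prediction error size and marginal likelihood, both come out of a single Kalman filter pass. The sketch below is illustrative only and makes simplifying assumptions that are not taken from the thesis: the parameters follow a plain random walk (a first-order smoothness constraint rather than a general GSC constraint), the model order is two, the sign convention is y[t] = -a_1[t] y[t-1] - a_2[t] y[t-2] + w[t], and the names `kalman_tar2` and `sweep_sigma_v2` are hypothetical.

```python
import numpy as np


def kalman_tar2(y, sigma_w2, sigma_v2, p0=1.0):
    """Kalman filter for y[t] = phi[t]^T theta[t] + w[t], theta[t] = theta[t-1] + v[t].

    Returns the log marginal likelihood (prediction error decomposition),
    the one-step prediction MSE, the filtering MSE and the filtered
    parameter estimates with shape (N, 2).
    """
    n = len(y)
    theta = np.zeros(2)
    P = p0 * np.eye(2)
    loglik = 0.0
    e_pred = np.zeros(n)
    e_filt = np.zeros(n)
    theta_f = np.zeros((n, 2))
    for t in range(2, n):
        phi = -np.array([y[t - 1], y[t - 2]])
        P = P + sigma_v2 * np.eye(2)            # time update (random walk)
        e = y[t] - phi @ theta                  # one-step prediction error
        s = phi @ P @ phi + sigma_w2            # prediction error variance
        k = P @ phi / s                         # Kalman gain
        theta = theta + k * e                   # measurement update
        P = P - np.outer(k, phi @ P)
        loglik += -0.5 * (np.log(2.0 * np.pi * s) + e**2 / s)
        e_pred[t] = e
        e_filt[t] = y[t] - phi @ theta          # filtering (a posteriori) error
        theta_f[t] = theta
    return loglik, np.mean(e_pred[2:] ** 2), np.mean(e_filt[2:] ** 2), theta_f


def sweep_sigma_v2(y, sigma_w2, grid):
    """Evaluate the criteria on a grid of parameter innovations variances.

    Returns an array with columns: log-likelihood, prediction MSE, filtering MSE.
    """
    return np.array([kalman_tar2(y, sigma_w2, s2)[:3] for s2 in grid])
```

In this sketch the filtering error equals the prediction error scaled by `sigma_w2 / s`, so it can be driven toward zero simply by inflating `sigma_v2`, whereas the log-likelihood penalizes the accompanying growth of `s`. This is the mechanism behind the preference for the marginal likelihood over the MSE.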

Figure 3.6.6: Conventional identification approach: Curves showing the (filter and smoother estimates) MSE and APD, and the marginal likelihood ln L(P | y^N) for different values of the parameter innovations variance in the logarithmic range log σ_v ∈ [, 4]. [Panels: log MSE, log APD and ln L(P | y^N) versus log σ_v for the SP-TAR and GSC-TAR filter and smoother.]

Figure 3.6.7: Comparison of the overall performance obtained with GSC-TAR() and SP-TAR() models in the identification of the simulated time series of the SPE-TAR() model. [Panels: log MSE and log APD for SPE-TAR filtering and smoothing; methods: MML, EM, MHS and JEKF GSC-TAR, MML SP-TAR.]

Figure 3.6.8, Figure 3.6.9, and Figure 3.6.10 show the estimated parameter trajectories and the respective frozen TV-PSD and natural frequencies obtained on a single realization of the SPE-TAR() process. The parameter trajectories obtained with the GSC-TAR models estimated with the MML, EM and MHS methods follow very similar paths, while those estimated through the JEKF method show larger deviations from the actual parameter trajectories. The results obtained with the SP-TAR() model also show accurate tracking of the actual parameter trajectories, although perhaps over-smoothed. Similar conclusions can be drawn from the frozen

TV-PSDs derived from the estimated parameter trajectories. In general, all models are capable of tracking the time-dependent frequency content in the signal, but the values obtained from the SP-TAR model appear totally smoothed in comparison with the original ones.

Figure 3.6.8: Comparison of the original and estimated parameter trajectories obtained from the identified GSC-TAR and SP-TAR models on a single realization of the Monte Carlo simulation of the SPE-TAR() model. [Curves: original, MML, EM, MHS and JEKF GSC-TAR, MML SP-TAR; axes: a_i[t] versus time in samples.]

Figure 3.6.9: Comparison of the frozen TV-PSD and natural frequencies derived from the parameter trajectories of the original and the identified GSC-TAR, SP-TAR and FS-TAR models on a single realization of the Monte Carlo simulation of the SPE-TAR() model. [Panels: original, MML GSC-TAR, EM GSC-TAR, MHS GSC-TAR, JEKF GSC-TAR, MML SP-TAR; axes: frequency in rad/π versus time in samples; color scale: PSD in dB.]


Figure 3.6.10: Three-dimensional plots of the frozen TV-PSD and natural frequencies derived from the parameter trajectories of the original and the identified GSC-TAR, SP-TAR and FS-TAR models on a single realization of the Monte Carlo simulation of the SPE-TAR() model.

Validation of the identified GSC, SP and FS TAR models

The estimation residuals of the identified GSC-TAR, SP-TAR and FS-TAR models are tested for Gaussianity and uncorrelatedness, as explained in Section . The Gaussianity of the residuals is evaluated via the Kolmogorov-Smirnov test, while the uncorrelatedness is evaluated using the Ljung-Box Q test (Corder and Foreman 2009, Ch. ; Ljung 1999, Ch. 6; Söderström and Stoica 1989, Ch. ). Both tests are performed at a significance level α = .5. The results in Table 3.10 show that the residuals of all the identified models satisfy both conditions, with the exception of the residuals of the JEKF GSC-TAR method, for which the hypothesis of null correlation is rejected.

Table 3.10: Validation of the Gaussianity and uncorrelatedness of the estimation residuals obtained with the different identification methods for GSC, SP and FS TAR models in all the realizations of the Monte Carlo simulation of the SPE-TAR() process.

Method                 Kolmogorov-Smirnov test    Ljung-Box Q test
MML GSC-TAR()          H.654                      H.4
EM GSC-TAR()           H.747                      H.
MHS GSC-TAR()          H.75                       H.
JEKF GSC-TAR()         H.9                        H.
MML SP-TAR()           H.73                       H.

Results of the tests are provided in terms of the accepted hypothesis and p-value. Significance level of the tests: α = .5.

3.7 Monte Carlo simulation: DPE TAR model

As a complement to the Monte Carlo simulations presented in Section 3.5 and Section 3.6, the Monte Carlo simulation presented in this section features a TAR model with deterministic parameter evolution following a complex pattern. This type of application is challenging for FS-TAR models because of the selection of the functional basis subspace and its dimensionality.
The aim of this simulation is therefore also to assess the benefit of GSC-TARMA modeling in this complex deterministic parameter evolution scenario.

Model definition

Consider the discrete non-stationary signal y[t] defined over the normalized discrete time t =,,,N, which is governed by the following TAR(2) process

\[
y[t] = -a_1[t]\,y[t-1] - a_2[t]\,y[t-2] + w[t] = \phi^T[t]\,\vartheta[t] + w[t], \qquad w[t] \sim \mathrm{NID}(0, \sigma_w^2) \tag{3.7.1}
\]

where \(\phi[t] = [\,-y[t-1] \;\; -y[t-2]\,]^T\) and \(\vartheta[t] = [\,a_1[t] \;\; a_2[t]\,]^T\) are the regression and parameter vectors, respectively, and w[t] is an NID process with zero mean and variance \(\sigma_w^2\). Additionally, the parameter vector \(\vartheta[t] \in \mathbb{R}^2\) is governed by the equations

\[
\vartheta[t] = \begin{bmatrix} a_1[t] \\ a_2[t] \end{bmatrix} = \begin{bmatrix} -2\,\rho[t]\cos\omega[t] \\ \rho^2[t] \end{bmatrix} \tag{3.7.2}
\]

where the deterministic parameter evolution is determined by two conjugate poles with frozen time-dependent magnitude ρ[t] and time-dependent angle ω[t]. The time-dependent magnitude ρ[t] and time-dependent angle ω[t] associated with the time-dependent deterministic parameter trajectories are defined as

\[
\rho[t] = \rho_0 + \sum_{k=1}^{4} \rho_k \exp\!\left( -\frac{(t/N - \alpha_k)^2}{\beta_k} \right), \qquad
\omega[t] = \omega_0 + \sum_{k=1}^{4} \omega_k \exp\!\left( -\frac{(t/N - \alpha_k)^2}{\beta_k} \right)
\]

with coefficients ρ_k ∈ [,] and ω_k ∈ [,π] rad for k =,,,, and where α_k ∈ [,] for k =, is the location and β_k ∈ R+ for k =, the scale parameter of the respective Gaussian functions. The whole set of hyperparameters that represent the DPE-TAR() model defined by equations (3.7.1) and (3.7.2) is presented in Table 3.11.

Table 3.11: Values of the parameters of the DPE-TAR() model with time-dependent deterministic trend described by equations (3.7.1) and (3.7.2).

Parameter                          Value
Initial values                     y[t]= for t
Innovations variance               σw = 3
Deterministic trend parameters     ρ =.84, ρ =., ρ =.5, ρ =., ρ =.5
                                   ω = π/4, ω = π/, ω = π/, ω 3 = π/, ω = π/
                                   α =.5, α =.55, α 3 =.55, α 4 =.78
                                   β =.8, β =., β 3 =.9, β =.8
Sample length                      N =
Number of realizations             M = 5

A Monte Carlo simulation is carried out by generating a set of M = 5 realizations, each one of them of N = samples, using different realizations for the signal innovation processes, based on the initial values and the parameter values shown in Table 3.11. Figure 3.7.1 shows a single realization of the process. Figure 3.7.2 provides a time-varying frequency domain representation of a single realization of the process via the frozen poles and frozen time-varying PSD (Poulimenos and Fassois, 2006).

Identification via GSC TAR, SP TAR and FS TAR models

Presentation and set up of the identification methods

Identification of a single realization of the DPE-TAR() process is carried out by means of GSC-TAR, SP-TAR and FS-TAR models. The identification procedure for each one of the model types is the same as in Section .

Identification results GSC TAR and SP TAR models

Table 3.12 provides a summary of the average values of different performance figures obtained with the identified GSC-TAR and SP-TAR models on the 5 realizations of the Monte Carlo simulation after optimization with the different approaches described in Table 3.3.
The performance is measured in terms of the marginal likelihood L(P | y^N), the prediction, filtering and smoothing Mean Squared Error (MSE), and the filtering and smoothing Aggregate Parameter Difference (APD). The performance obtained with the different identification methods for the GSC-TAR model type is approximately the same, with the exception of the JEKF method, which yields the lowest performance. The main reason for this significant drop in performance is that the JEKF method does not include the optimization of the variance of the parameter innovations. A comparison between the GSC-TAR and SP-TAR models demonstrates that the former yield the best overall performance. The difference in performance between the SP-TAR and GSC-TAR models is more evident in the tracking of the parameter trajectories measured via the APD, especially at higher values of q. For this reason, in SP-TAR models one would be more inclined to select a model with small values of q, while in GSC-TAR models it is possible to select a higher value of q that may provide improved tracking of the parameter trajectories.

Table 3.13 shows the sample median values of the hyperparameters of the GSC-TAR and SP-TAR models obtained after optimization. For the sake of comparison, the actual values of θ_o and σ_w are given in the first row of the table. Notice that the values given for the stochastic constraint parameters of the SP-TAR model are not estimated; they are the values used by definition with this model type. Also, a detail of the roots of the characteristic polynomial associated with the estimated stochastic constraint vector µ̂ is shown in Figure . The values of θ_o and σ_w estimated through the different identification methods for corresponding orders are quite similar and very close to the actual values. The results of the JEKF method show the largest difference from the actual values.
The estimated values of µ are more or less in general agreement among 9

Figure 3.7.: Single realization of the SPE TAR() model with time-dependent deterministic trend. Top and middle: Single realization of the parameter trajectories θ [t] and θ [t], along with the time-dependent deterministic trend and the sample-based 99.6% confidence interval. Bottom: Single realization of the signal generated by the SPE TAR() model. (Axes: normalized time versus θ[t], θ[t] and y[t].)

Figure 3.7.: Frozen poles and time-varying PSD derived from the actual parameters of a single realization of the SPE TAR() model: (a) Frozen poles derived from a single realization of the model compared with the corresponding values obtained from the deterministic trend; (b) Frozen time-varying PSD of a single realization of the process compared with the natural frequency derived from the deterministic trend component ω[t]. (Axes: (a) real versus imaginary part; (b) normalized time versus frequency [rad/π], with PSD in dB.)

all the identification methods. By observing the roots associated with the characteristic polynomial of µ, it is evident that at least one root is equal to one, or very close to unity, while the remaining roots have magnitudes over .8, except for those associated with the MHS estimates, which tend to yield lower values. Finally, the values of σ_v show the largest variation from method to method. In most cases this difference can be correlated with the stochastic constraint order.

Table 3.: Summary of the average performance figures of the identified GSC-TAR and SP-TAR models for the 5 realizations of the Monte Carlo simulation using different values of the stochastic constraint order with the various optimization methods.
Columns: Method; q; ln L(P | y^N); log MSE (Prediction, Filtering, Smoothing); log APD (Filtering, Smoothing).
Rows: GSC-TAR MML; GSC-TAR EM; GSC-TAR MHS; GSC-TAR JEKF; SP-TAR MML.

Table 3.3: Summary of the average values of the hyperparameters of the identified GSC-TAR and SP-TAR models for the 5 realizations of the Monte Carlo simulation using different values of the stochastic constraint order with the various optimization methods.

Method          q   µ̂                  θ̂_o            ln σ̂_w   ln σ̂_v
Actual values                           [.6,.74]       3.
GSC-TAR MML         [.997]              [.65,.76]
                    [.9,.9]             [.86,.769]
                    [.8,.6,.89]         [.354,.793]
GSC-TAR EM          [.]                 [.3,.76]
                    [.88,.88]           [.374,.846]
                    [.78,.574,.794]     [.383,.788]
GSC-TAR MHS         [.]                 [.76,.79]
                    [.93,.93]           [.7,.7]
                    [.479,.,.533]       [.7,.7]
GSC-TAR JEKF        [.9]                [.478,.778]    .9       8.
                    [.83,.83]           [.474,.78]
                    [.685,.43,.745]     [.477,.778]
SP-TAR MML          [.]                 [.96,.735]
                    [.,.]               [.6,.76]
                    [ 3., 3.,.]         [.84,.73]

Identification of the variances of SP-TAR models by minimization of the prediction error (conventional approach)

Similar to the test performed in Section 3.5, this experiment concerns the evaluation of the adjustment of the variances of GSC-TAR() and SP-TAR() models by minimization of the size of the prediction errors.

Figure 3.7.3: Distribution of the roots of the characteristic polynomials associated with the estimated stochastic constraint parameters µ̂ obtained with each one of the GSC-TAR identification methods for all the realizations of the Monte Carlo simulations. (Axis: absolute value of the roots associated with µ; groups: MML, EM, MHS and JEKF estimates for each GSC-TAR() structure.)

For that purpose, the value of the parameter innovations variance is set as log σ_v = λ, where the values of λ correspond to 5 regularly sampled values in the interval [, ]. The remaining hyperparameters of both model types are assumed known and equal to: SP-TAR model: µ = [,], θ_o = [.6,.74], σ_w = 3; GSC-TAR model: µ = [.9,.9], θ_o = [.6,.74], σ_w = 3. GSC-TAR() and SP-TAR() models are estimated for all the 5 realizations of the Monte Carlo simulation using the considered hyperparameter values, and subsequently the MSE and APD (based on filter and smoother estimates) are measured. The marginal likelihood is also measured for reference. The resulting curves are shown in Figure 3.7.4.

Figure 3.7.4: Conventional identification approach: Curves showing the MSE and APD (filter and smoother estimates) and the marginal likelihood ln L(P | y^N) for different values of the parameter innovations variance in the logarithmic range log σ_v ∈ [, 4]. (Panels: log MSE, log APD and ln L(P | y^N) versus log σ_v; legend: SP-TAR filter, SP-TAR smoother, GSC-TAR filter, GSC-TAR smoother.)

The results obtained in the present case are very similar to those obtained in Section 3.5. The trends observed in the MSE, APD and marginal likelihood are about the same, with the main difference being that

GSC-TAR models yield notably improved APD and marginal likelihood values compared to SP-TAR models. However, the principal conclusion of the experiment is the same, namely that the marginal likelihood is a more robust criterion than the MSE for the adjustment of the hyperparameters, the latter typically leading to improper results.

Identification results: FS-TAR models

As explained in Section 3.5, FS-TAR() models using B-spline and Chebyshev polynomial bases are used for the identification of the time series obtained in the Monte Carlo simulation. The performance of the FS-TAR() models, measured in terms of the MSE and the APD, as the number of rejected basis functions increases in the backward regression scheme is shown in Figure 3.7.5. The lowest MSE is obtained with the FS-TAR() models using the B-spline basis with initial functional basis dimensionality p_amax = 7. On the other hand, the lowest APD is obtained with the B-spline basis with initial functional basis dimensionality p_amax = 3. Both performance measures tend to deteriorate as the number of rejected basis functions increases, although the performance drop is small and in some cases the performance improves when the first basis functions are rejected. The best performance (selected by balancing MSE and APD) is obtained with the following structures:

MS-ML FS-TAR a: FS-TAR() models with B-spline basis subspace of dimensionality p_amax = 3 after rejecting basis functions;
MS-ML FS-TAR b: FS-TAR() models with B-spline basis subspace of dimensionality p_amax = 7 after rejecting 4 basis functions;
MS-ML FS-TAR c: FS-TAR() models with Chebyshev polynomial basis subspace of dimensionality p_amax = 3 after rejecting basis functions.

Figure 3.7.5: Selection of the basis indices of the FS-TAR() models using B-spline bases (p_amax = {3,7}) and Chebyshev polynomial basis (p_amax = 3): (a) Mean Squared Error (MSE), and (b) Aggregate Parameter Deviation (APD) for increasing number of rejected basis functions. (Axes: number of rejected basis functions versus (a) log MSE and (b) log APD.)

Histograms displaying the number of times a basis index is rejected from the FS-TAR() models using B-spline bases (p_amax = {3,7}) and the Chebyshev polynomial basis (p_amax = 3) are shown in Figure 3.7.6. The histograms are shown for the MS-ML FS-TAR a, b and c model structures previously selected. The results in Figure 3.7.6 show that the indices of the rejected basis functions of the FS-TAR models using B-spline basis functions with initial basis dimensionality p_amax = 3 are not consistent, indicating that different basis functions are rejected in different realizations. In contrast, for FS-TAR models using Chebyshev polynomials and B-splines with initial basis dimensionality p_amax = 7, the rejected basis functions are in general those with higher index (with indices, and 3), as well as some of the basis functions with low index. Nonetheless, in both cases it is clear that improved performance is obtained by removing a certain number of basis functions from the FS-TAR representations.

Comparison of identification results obtained via GSC, SP and FS TAR representations

A comparison of the performance of the identified GSC-TAR(), SP-TAR() and FS-TAR() models with the best structures selected in the previous sections, measured in terms of the MSE and the APD, can be found in

Figure 3.7.7. In the case of the GSC-TAR and SP-TAR models, these figures are provided for both filtered and smoothed estimates obtained after optimization with the MML, EM, MHS and JEKF methods, using a stochastic constraint order q = for GSC-TAR() models and q = for SP-TAR() models. The results show the improved estimation and tracking ability of the GSC-TAR models in comparison with their SP-TAR and FS-TAR counterparts in the present problem. As mentioned before, the performance with the MML, EM and MHS methods is very consistent in both MSE and APD. The filtered estimates of both GSC-TAR and SP-TAR models yield lower MSE values than the smoothed estimates, whereas the opposite behavior is found for the APD. Besides, smoothed parameter estimates are always more accurate than the estimates obtained with FS-TAR models, while the filtered estimates are less accurate. In comparison with the results presented in Section 3.5 and Section 3.6, the performance difference between the GSC-TAR and FS-TAR models is reduced.

Figure 3.7.6: Selection of the structure of the FS-TAR() model. Histograms indicate the number of times a basis index is rejected from the FS-TAR() model during the identification of the 5 realizations of the Monte Carlo simulation for: (a) MS-ML FS-TAR a; (b) MS-ML FS-TAR b; (c) MS-ML FS-TAR c. (Horizontal axis: indices of rejected basis functions.)

Figure 3.7.7: Comparison of the overall performance obtained with GSC-TAR(), SP-TAR() and FS-TAR() models in the identification of the simulated time series of the DPE TAR() model. (Panels: log MSE and log APD; legend: SPE TAR filtering, SPE TAR smoothing, FS-TAR; models: MML GSC-TAR, EM GSC-TAR, MHS GSC-TAR, JEKF GSC-TAR, MML SP-TAR, MS-ML FS-TAR a, b and c.)

Figure 3.7.8 and Figure 3.7.9 show the estimated parameter trajectories and the respective frozen TV-PSD and natural frequencies obtained on a single realization of the DPE TAR() process. The parameter trajectories obtained with the GSC-TAR models estimated with the MML, EM and MHS methods closely follow the actual parameter evolution, while those estimated through the JEKF method show larger deviations (see Figure 3.7.8(a)). The results obtained with the SP-TAR() model also show accurate tracking of the actual parameter trajectories, although perhaps oversmoothed. Finally, the parameter estimates obtained with the different FS-TAR models also provide accurate tracking, although the main problem may be the large deviations observed in the initial and final segments of the parameter trajectories. Similar conclusions can be drawn from the frozen TV-PSDs derived from the estimated parameter trajectories. In general, all models are capable of tracking the time-dependent frequency content of the signal, but the values obtained from the SP-TAR and FS-TAR models show less accurate tracking performance.

Figure 3.7.8: Comparison of the original and estimated parameter trajectories obtained from the identified GSC-TAR, SP-TAR and FS-TAR models on a single realization of the Monte Carlo simulation of the DPE TAR() model. (Panels: (a) Original, MML GSC-TAR, EM GSC-TAR, MHS GSC-TAR, JEKF GSC-TAR; (b) Original, MML SP-TAR, MS-ML FS-TAR a, b and c; axes: time [samples] versus a[t].)

Figure 3.7.9: Comparison of the frozen TV-PSD and natural frequencies derived from the parameter trajectories of the original and the identified GSC-TAR, SP-TAR and FS-TAR models on a single realization of the Monte Carlo simulation of the DPE TAR() model. (Panels: Original, MML GSC-TAR, EM GSC-TAR, MHS GSC-TAR, JEKF GSC-TAR, MML SP-TAR, MS-ML FS-TAR a, MS-ML FS-TAR b, MS-ML FS-TAR c; axes: time [samples] versus frequency [rad/π]; color scale: PSD [dB].)

Figure 3.7.: Three-dimensional plots of the frozen TV-PSD and natural frequencies derived from the parameter trajectories of the original and the identified GSC-TAR, SP-TAR and FS-TAR models on a single realization of the Monte Carlo simulation of the DPE TAR() model.
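A single realization of a TAR process whose frozen poles follow a deterministic trend, of the kind examined in this section, can be generated along the following lines. This is only a minimal sketch and not the DPE TAR() model of equations (3.7.) and (3.7.): the pole magnitude `rho`, the linear pole-angle trend, the zero initial conditions and the function name `simulate_tar2` are all assumptions chosen for illustration. It uses the fact that a complex-conjugate pole pair ρ e^{±jω} corresponds to the second-order AR polynomial with a1 = −2ρ cos ω and a2 = ρ².

```python
import math
import random

def simulate_tar2(n, rho, omega_of_tau, sigma_w, seed=0):
    """Simulate y[t] + a1[t] y[t-1] + a2[t] y[t-2] = w[t].

    The frozen poles at time t are rho * exp(+/- j * omega), with omega
    given by omega_of_tau evaluated at the normalized time t / n.
    """
    rng = random.Random(seed)
    y_prev1, y_prev2 = 0.0, 0.0        # zero initial conditions (assumption)
    y, a1, a2 = [], [], []
    for t in range(n):
        omega = omega_of_tau(t / n)    # pole angle in rad/sample
        a1_t = -2.0 * rho * math.cos(omega)
        a2_t = rho * rho
        y_t = -a1_t * y_prev1 - a2_t * y_prev2 + rng.gauss(0.0, sigma_w)
        y.append(y_t)
        a1.append(a1_t)
        a2.append(a2_t)
        y_prev1, y_prev2 = y_t, y_prev1
    return y, a1, a2

# Hypothetical trend: pole angle sweeping linearly from pi/4 to pi/2.
trend = lambda tau: math.pi / 4 + (math.pi / 4) * tau
y, a1, a2 = simulate_tar2(n=1000, rho=0.95, omega_of_tau=trend, sigma_w=1.0)
```

Repeating the call with different `seed` values gives independent realizations, which is the basic ingredient of a Monte Carlo study such as the one reported here.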

Validation of the identified GSC, SP and FS TAR models

The estimation residuals of the identified GSC-TAR, SP-TAR and FS-TAR models are tested for Gaussianity and uncorrelatedness, as explained in Section , following the same procedure as in Section 3.5. The results of the validation are shown in Table 3.4.

Table 3.4: Validation of the Gaussianity and uncorrelatedness of the estimation residuals obtained with the different identification methods for GSC, SP and FS TAR models in all the realizations of the Monte Carlo simulation of the DPE TAR() process.

Method                 Estimation residuals
                       Kolmogorov-Smirnov test    Ljung-Box Q test
MML GSC-TAR()          H .8                       H .9
EM GSC-TAR()           H .864                     H .4
MHS GSC-TAR()          H .735                     H .
JEKF GSC-TAR()         H .738                     H .
MML SP-TAR()           H .667                     H .
MS-ML FS-TAR() a       H .83                      H .97
MS-ML FS-TAR() b       H .55                      H .345
MS-ML FS-TAR() c       H .84                      H .9

Results of the tests are provided in terms of the accepted hypothesis and p-value. Significance level of the tests α = .

Application example: Identification of the vibration response of an operating wind turbine

3.8. Data description

The identification of the dynamics of the vibration response of wind turbines is a difficult problem, since it is characterized by complicated non-stationary characteristics stemming from the cyclic effects induced by the rotating blades and the uncertain operational and environmental conditions (Barlas and van Kuik, ; Hansen et al., 6b). This problem has been tackled with different approaches, including non-parametric time-frequency and time-scale representations (Dolinski and Krawczuk, 9; Fitzgerald et al., ), stationary-inducing transforms (Allen et al., b; Tcherniak et al., ), and TARMA modeling (Avendano-Valencia and Fassois,,, 3a,b, 4).
The main aim of this application example is to show the workings of the identification methods and the potential advantages of GSC-TARMA modeling in the identification of complex non-stationary phenomena, such as those featured in the vibration response of operating wind turbines. The acceleration vibration signal used in this analysis was acquired on the tower of a NegMicon NM5/9 wind turbine under normal operation, located at a wind farm on the Atavyros Mountain on the island of Rhodes, Greece. A full description of the signal, including acquisition and pre-processing details, can be found in Chapter . A detailed analysis via stationary and non-stationary identification methods, demonstrating the necessity of non-stationary identification methods for this type of signal, is provided in (Avendano-Valencia and Fassois, 3a,b). It should be noted that, in contrast to previous studies (Avendano-Valencia and Fassois,, ), the low-frequency dynamics (previously associated with wind loads) are presently removed from the signals via 3rd-order polynomial fitting. Table . provides a summary of the vibration signal characteristics.

Model identification

The vibration response signal is identified using FS-TAR, SP-TAR and GSC-TAR models. The identification process is divided into two steps: (i) selection of the structure of SP-TAR and FS-TAR models, including the dimensionality of the basis subspace of FS-TAR models; (ii) optimization of the hyperparameters of SP-TAR and GSC-TAR models via the Maximum Marginal Likelihood (MML), the Expectation-Maximization (EM) and the Metropolis-Hastings Sampling (MHS) identification methods.
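The pre-processing step mentioned in the data description, removal of the low-frequency dynamics by 3rd-order polynomial fitting, can be sketched as an ordinary least-squares polynomial fit that is subtracted from the signal. The sketch below is an assumption about how such a step might be coded, not the procedure actually used for the measured signal: the function name `polyfit_detrend`, the normalization of time to [-1, 1] and the plain Gaussian-elimination solver are illustrative choices, and the test signal is synthetic.

```python
import math

def polyfit_detrend(y, order=3):
    """Subtract a least-squares polynomial trend of the given order from y."""
    n = len(y)
    # Time normalized to [-1, 1] keeps the normal equations well conditioned.
    tau = [2.0 * i / (n - 1) - 1.0 for i in range(n)]
    m = order + 1
    # Normal equations A c = b, with A[j][k] = sum(tau^(j+k)), b[j] = sum(tau^j * y).
    A = [[sum(t ** (j + k) for t in tau) for k in range(m)] for j in range(m)]
    b = [sum((t ** j) * v for t, v in zip(tau, y)) for j in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, m):
            factor = A[r][col] / A[col][col]
            for k in range(col, m):
                A[r][k] -= factor * A[col][k]
            b[r] -= factor * b[col]
    # Back substitution for the polynomial coefficients.
    c = [0.0] * m
    for j in range(m - 1, -1, -1):
        c[j] = (b[j] - sum(A[j][k] * c[k] for k in range(j + 1, m))) / A[j][j]
    trend = [sum(c[j] * t ** j for j in range(m)) for t in tau]
    return [v - tr for v, tr in zip(y, trend)], trend

# Synthetic example: a slow cubic drift plus a faster oscillation (hypothetical values).
n = 200
tau = [2.0 * i / (n - 1) - 1.0 for i in range(n)]
slow = [1.0 + 2.0 * t - 3.0 * t ** 2 + 0.5 * t ** 3 for t in tau]
fast = [math.sin(2.0 * math.pi * 0.2 * i) for i in range(n)]
detrended, trend = polyfit_detrend([s + f for s, f in zip(slow, fast)])
```

A low polynomial order removes only the slow drift, so the faster oscillatory content that the TAR models are meant to describe is left essentially untouched.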

The FS-TAR models use the trigonometric functional basis (Poulimenos and Fassois, 9b):

$$ G_0[t] = 1, \qquad G_{2k-1}[t] = \sin\left(2\pi k f_o \frac{t}{f_s}\right), \qquad G_{2k}[t] = \cos\left(2\pi k f_o \frac{t}{f_s}\right), \qquad k = 1, 2, \ldots, (p_a - 1)/2 \qquad (3.8.) $$

where p_a = 2k + 1 is the functional basis dimensionality, t = 1, 2, ..., N and N is the signal length. The frequency of the first harmonic of the trigonometric basis is f_o = .37 Hz, corresponding to the average period of rotation of the rotor, as shown in Table .. The estimation details and settings of the FS and SPE TAR parameter estimation methods are summarized in Table 3.5.

Table 3.5: Details of the identification methods of FS-TAR, SP-TAR and GSC-TAR models.

FS-TAR
  Basis subspace: Trigonometric basis subspace with base frequency f_o = .37 Hz (average rotor speed in Hertz).
  Estimation: Estimation of the projection parameters with the ordinary least squares (OLS) method. Constant innovations variance estimated by averaging the squared prediction error.

SPE-TAR
  Structure: GSC-TAR models with diagonal stochastic constraint matrix and parameter innovations variance of the form M_k := µ_k I_{n_a × n_a}, for k=,,q, and Σ_v := σ_v I_{n_a × n_a}. Stochastic constraint parameters: GSC-TAR: coefficients of (z − .9)^q; SP-TAR: coefficients of (z − 1)^q.
  Initialization: Parameter innovations variance: σ_v = 6. Innovations variance and initial parameter vector derived from a stationary AR() model estimated via MATLAB's function arburg.
  MML method: Two-stage non-linear optimization method: Stage : Generalized Pattern Search (GPS) via MATLAB's Global Optimization Toolbox function patternsearch (Kolda et al., 6) (initial mesh size, iteration limit, tolerance on the parameters and the objective fcn. 6 ); Stage : Simplex Search via MATLAB's Optimization Toolbox function fminsearch (Lagarias et al., 998) (iteration limit, tolerance on the parameters and the objective fcn. 8 ).
  EM method: Two-stage optimization method: Stage : Conventional EM algorithm (5 iterations); Stage : Simplex Search via MATLAB's Optimization Toolbox function fminsearch (Lagarias et al., 998) (iteration limit, tolerance on the parameters and the objective fcn. 8 ).
  MHS method: Metropolis-Hastings sampling algorithm based on the un-normalized posterior π(P) = p(y^N, z^N | P) p(P) ∝ p(P | y^N). Hyperparameter prior: p(P) = p(µ) p(σ_w) p(σ_v), with p(µ) = N(( )_q, I_{q × q}), p(σ_w) = Γ( 3, 3 ) and p(σ_v) = Γ( 4, 4 ). Proposal PDF: q(P) = q(µ, µ_j) q(σ_w) q(σ_v), with q(µ, µ_j) = N(µ_j, . I_{q × q}), q(σ_w) = Γ( 3, 3 ), and q(σ_v) = Γ( 4, 4 ). Number of simulated samples: ; Acceptance rate: .4

Selection of the structure of FS-TAR and SPE-TAR models

The model order is selected based on the simpler FS-TAR and SP-TAR models. For this purpose, FS-TAR and SP-TAR models are computed for increasing model orders from n_a = up to 4. The FS-TAR models are estimated with p_a = and the basis expansion indices b_a = [ p_a ]^T, and the SP-TAR models with stochastic constraint order q = {,}. The resulting RSS/SSS curves for increasing model orders are shown in Figure 3.8.(a). Also, the corresponding BIC curve of the FS-TAR models is shown in Figure 3.8.(b). According to the resulting RSS/SSS and BIC curves, a TAR model order n_a = 8 is selected (indicated with arrows in Figure 3.8.(a) and (b)). The selection of the structure of the FS-TAR models is completed by determining the optimal dimensionality of the functional subspace. This is carried out by fitting FS-TAR(8) models with increasing p_a from up to . The resulting RSS/SSS and BIC curves are shown in Figure 3.8.(c) and (d). The two curves give contradictory indications, since the RSS/SSS decreases for increasing p_a, whereas the BIC sharply increases.

A compromise between both quantities is adopted by selecting the functional basis dimensionality p_a = 7, which includes the constant basis and the first three pairs of sines and cosines with frequencies f_o, 2f_o and 3f_o.
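The trigonometric functional basis described above (a constant function followed by sine/cosine pairs at multiples of the base frequency) can be generated with a few lines of code. The following is a minimal sketch under stated assumptions: the function names `trig_basis` and `expand_trajectory` are hypothetical, time is indexed from zero for convenience, and the numerical values of `f_o` and `f_s` are placeholders rather than the values of the measured signal. The second function only illustrates the general functional-series idea that each TAR parameter trajectory is a linear combination of the basis functions.

```python
import math

def trig_basis(n_samples, p_a, f_o, f_s):
    """Return [G_0, G_1, ..., G_{p_a-1}], each a list over t = 0..n_samples-1.

    G_0 is constant; G_{2k-1} and G_{2k} are the sine and cosine at k * f_o.
    """
    if p_a % 2 == 0:
        raise ValueError("p_a must be odd: one constant plus sine/cosine pairs")
    basis = [[1.0] * n_samples]
    for k in range(1, (p_a - 1) // 2 + 1):
        arg = [2.0 * math.pi * k * f_o * t / f_s for t in range(n_samples)]
        basis.append([math.sin(x) for x in arg])
        basis.append([math.cos(x) for x in arg])
    return basis

def expand_trajectory(coeffs, basis):
    """Parameter trajectory a[t] = sum_k coeffs[k] * G_k[t]."""
    n = len(basis[0])
    return [sum(c * g[t] for c, g in zip(coeffs, basis)) for t in range(n)]

# Placeholder values (assumptions): base frequency 0.5 Hz, sampling rate 50 Hz.
G = trig_basis(n_samples=1000, p_a=7, f_o=0.5, f_s=50.0)
a_t = expand_trajectory([0.5, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0], G)
```

With p_a = 7 the basis contains the constant term and three harmonic pairs, so any trajectory built from it is exactly periodic with the base period, which is the structural restriction discussed for the FS-TAR estimates.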

Figure 3.8.: Selection of the model structures of the SP-TAR and FS-TAR models: (a) RSS/SSS of SP-TAR and FS-TAR models for increasing model order; (b) BIC of FS-TAR models for increasing model order; (c)-(d) RSS/SSS and BIC of FS-TAR(8) models for increasing basis order. Arrows indicate selected orders/dimensionalities. (Axes: TAR order or functional subspace dimensionality versus RSS/SSS (%) and BIC; legend: SP-TAR q=, SP-TAR q=, FS-TAR; annotations: n_a = 8, p_a = 6.)

Optimization of the hyperparameters of SP-TAR and GSC-TAR models

The hyperparameters of the GSC-TAR models, including µ, θ, σ_w and σ_v, are optimized using the MML, EM and MHS methods. For this purpose, GSC-TAR models are identified given the selected structure n_a = 8 and for q = . The hyperparameters of the SP-TAR models, namely θ, σ_w and σ_v, are also optimized using the MML method. The performance and the hyperparameter values of the optimized FS-TAR, GSC-TAR and SP-TAR models are summarized in Table 3.6. The stochastic constraint parameters shown for the SP-TAR case in Table 3.6 are those corresponding to the definition of the SP-TAR model and are shown for the sake of comparison. The RSS/SSS of the SP-TAR and GSC-TAR models is presented for both filtered and smoothed estimates of the parameter trajectories. The best performance is obtained with the MML method for both GSC-TAR and SP-TAR model types. The performance of the remaining identification methods based on the GSC-TAR model type is lower than that obtained with the MML method, although in all cases it is better than that of the FS-TAR model. The estimated stochastic constraint parameters of the GSC-TAR models are in close agreement. Also, the estimates of the variances of the innovations and the parameter innovations (in the SP-TAR and GSC-TAR cases) are very similar, except for the MHS method, which yields larger values. This may also be the reason why the respective performance figures are larger than for the other methods.

Table 3.6: Performance of the optimized FS, SP and GSC TAR models in the identification of the wind turbine vibration response.

Model                    RSS/SSS [%]             Estimated hyperparameters
                         Filtering   Smoothing   µ            log σ_w   log σ_v
FS-TAR(8) [6]
MML SP-TAR(8) q =                                [.,.]
MML GSC-TAR(8) q =       .9          .           [.8,.8]
EM GSC-TAR(8) q =        .75         .5          [.88,.88]
MHS GSC-TAR(8) q =                               [.79,.79]
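The MHS method used for the hyperparameter optimization rests on the Metropolis-Hastings accept/reject step. The sketch below illustrates that step with a symmetric Gaussian random-walk proposal on a toy one-dimensional target; it is not the sampler used in this work, which, according to Table 3.5, combines a Gaussian proposal for µ with Gamma proposals for the variances. The function name `random_walk_metropolis`, the toy log-posterior, the step size and the chain length are all assumptions made for illustration.

```python
import math
import random

def random_walk_metropolis(log_post, x0, step, n_samples, seed=0):
    """Sample from an un-normalized log-posterior with a Gaussian random walk."""
    rng = random.Random(seed)
    x, lp = x0, log_post(x0)
    chain, n_accepted = [], 0
    for _ in range(n_samples):
        cand = x + rng.gauss(0.0, step)      # symmetric proposal around the current state
        lp_cand = log_post(cand)
        # Accept with probability min(1, posterior ratio); the symmetric
        # proposal density cancels out of the Metropolis-Hastings ratio.
        if rng.random() < math.exp(min(0.0, lp_cand - lp)):
            x, lp = cand, lp_cand
            n_accepted += 1
        chain.append(x)
    return chain, n_accepted / n_samples

# Toy target (hypothetical): a log-variance with a Gaussian-shaped posterior centred at -3.
toy_log_post = lambda lam: -0.5 * ((lam + 3.0) / 0.5) ** 2
chain, acc_rate = random_walk_metropolis(toy_log_post, x0=0.0, step=0.8, n_samples=20000)
burned = chain[2000:]                        # discard the burn-in segment
post_mean = sum(burned) / len(burned)
```

In a hyperparameter identification setting, `log_post` would be replaced by the log-likelihood of the model plus the log-prior of the hyperparameters, and the acceptance rate would be monitored to tune the proposal spread.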

Validation of the identified models

The Gaussianity and uncorrelatedness of the estimation residuals obtained with the estimated FS-TAR, SP-TAR and GSC-TAR models are assessed as a final step in the identification process, as explained in Section . In the present case, the Gaussianity of the estimation residuals is evaluated by means of the Kolmogorov-Smirnov test, while the uncorrelatedness of the first lags of the autocorrelation function is assessed using the Ljung-Box Q test. The results of the hypothesis tests, both carried out at a significance level α = .5, are shown in Table 3.7. The test results for the GSC-TAR and SP-TAR models are shown for both filtered and smoothed parameter estimates. The results of the Kolmogorov-Smirnov test indicate that the Gaussianity of the residuals is satisfied in all cases. Opposite results are obtained with the Ljung-Box Q test, which indicates that the residuals are correlated in all cases.

Table 3.7: Summary of the results of the validation tests of Gaussianity and uncorrelatedness of the prediction residuals of the best performing models of the wind turbine vibration signal. Hypothesis tests evaluated at the significance level α = .5. Results shown with the p-value of the test.

Model                  Kolmogorov-Smirnov test     Ljung-Box Q test
                       Filtering    Smoothing      Filtering   Smoothing
FS-TAR(8) [6]          H .73                       H
MML SP-TAR(8) q=       H .464       H .36          H           H
MML GSC-TAR(8) q=      H .556       H .696         H           H
EM GSC-TAR(8) q=       H .64        H .733         H           H
MHS GSC-TAR(8) q=      H .536       H .33          H           H

Model-based analysis

The non-stationary dynamics characterizing the wind turbine vibration response signal are analyzed by means of parametric frozen TV-PSDs and frozen natural frequencies derived from the estimated FS-TAR, SP-TAR and GSC-TAR models. The parametric frozen TV-PSDs and natural frequencies are computed using the expressions (Poulimenos and Fassois, 6)

$$ S_{yy}(f,t) = \frac{\sigma_w^2}{\left| 1 + \sum_{i=1}^{n_a} a_i[t] \, e^{-j 2 \pi i f / f_s} \right|^2}, \qquad f_{n,i}[t] = \frac{f_s \, \left| \ln \lambda_i[t] \right|}{2\pi} $$

where λ_i[t], i = ,,n_a is the i-th discrete-time frozen pole of the TAR model. The frozen TV-PSDs and frozen natural frequencies derived from the estimated FS-TAR, SP-TAR and GSC-TAR models, and the non-parametric TV-PSD estimate via spectrogram, are depicted in Figure 3.9., while the corresponding three-dimensional surface plots are shown in Figure 3.9.. The TV-PSD estimates shown in Figure 3.9. also include a detail in the frequency range from 34 to 4 Hz. The frozen parametric TV-PSDs are computed for 5 frequency points in the range from to 6 Hz, while the spectrogram is computed using Gaussian windows of 4 samples, with an overlap of samples and aperture parameter α = . The TV-PSD estimates show the same type of pattern in the evolution of the spectral content of the signal, in agreement with the findings of previous works (Avendano-Valencia and Fassois,,, 3a,b). The most notable feature in the spectra is the quasi-periodic pattern seen in the two frequency modes located in the range [34, 4] Hz. This behavior is evident in all the spectral estimates, including the spectrogram, although less clearly there. The results from the FS-TAR model are clearly influenced by the basis functions, and thus the quasi-periodic behavior observed in the other TV-PSD estimates becomes periodic in this case. This is confirmed by the fact that all the GSC-TAR model-based spectral estimates and the spectrogram show that the remaining modes are almost constant, while in the FS-TAR model-based estimates the same modes exhibit larger variations. The GSC-TAR model-based estimates obtained with the MML and EM methods show very consistent results, whereas those obtained with the MHS method appear smoother in comparison. The SP-TAR model-based estimates show larger variations and appear to be inconsistent with those obtained with the remaining

estimators. However, it is evident that in all cases the optimization of the hyperparameters yields a clearer model-based analysis.

3.9 Summary of the results

The main conclusion of this application example is that TAR models are capable of providing very high performance, tracking ability and quality information on the dynamics of the signal that are not achievable with non-parametric modeling methods.

GSC-TAR models

GSC-TAR models are capable of locating with improved precision, and tracking with very high accuracy, the evolution of the modes present in the vibration response of the wind turbine. More importantly, it is clear that the performance and tracking ability of GSC-TAR models (the SP-TAR model being a special case) are sharply improved by the optimization via the MML, EM and MHS methods described in this chapter. The results demonstrate that the performance of the GSC-TAR and SP-TAR model structures is mainly dominated by the values of the variances σ_w and σ_v. However, the adjustment of the stochastic constraint parameters provides the possibility of achieving improved tracking ability.

Individual analysis of the optimization methods:

The MML method provides the best performance, while its set-up is relatively simple. The evaluation of each iteration is also faster because only Kalman filter estimates are necessary, although the cost sharply increases if the partial derivatives of the a priori and filtered state vector estimates with respect to the hyperparameters are computed. For this reason, search methods were used instead of gradient-based optimization methods in the present application. The potential difficulties with local maxima have been tackled by combining global and local search methods.

The EM method also provided very good performance in the estimation of the hyperparameters of the GSC-TAR model. The evaluation of each step of the EM algorithm is relatively simple, and the algorithm converges fast in the initial iterations. The use of non-linear sequential optimization after several iterations is highly recommended to accelerate convergence. As a final note on the EM optimization method, it is important to monitor the update of Σ_v (either in Equation (3.4.6c) or Equation (3.4.7c)) after several iterations, since the negative terms can produce a non-positive-definite update of the covariance matrix.

The MHS method is perhaps the simplest to set up and facilitates the incorporation of user expertise in the optimization (which, however, may also lead to biased estimates). The prediction performance of the obtained model is lower than that of the MML and EM methods, although the tracking properties may be improved.

FS-TAR models

FS-TAR models also provide good performance, slightly lower than that obtained with GSC-TAR models. However, there are two main problems related to their application: (i) The choice of a trigonometric basis function subspace strongly influences the characteristics of the parameter trajectories obtained in the estimated model. In this sense, any random variations of the dynamics over long time periods may be poorly identified. Alternatively, different sets of basis functions could be used, but this may lead to different problems, including higher basis dimensionality. (ii) The performance is reduced in comparison with the GSC-TAR models. Again, this may be related to the selection of the functional basis subspace.

Figure 3.9.: Frozen Time-Varying Power Spectral Densities derived from the identified GSC-TAR, SP-TAR and FS-TAR models and the spectrogram for a period of 5 s and for the frequency range f = [,6] Hz, and detail on the range f = [34,4] Hz. (a) MML GSC-TAR(8) q= model; (b) EM GSC-TAR(8) q= model; (c) MHS GSC-TAR(8) q= model; (d) MML SP-TAR(8) q= model; (e) FS-TAR(8) [6] model; (f) Spectrogram with Gaussian window with aperture parameter α =, N_fft = 4 samples and overlap samples. (Axes: time [s] versus frequency [Hz]; annotation in the detail panels: T_o = .69 s.)

Figure 3.9.: Three-dimensional view of the Frozen Time-Varying Power Spectral Densities derived from the identified GSC-TAR, SP-TAR and FS-TAR models and the spectrogram for a period of 5 s on the frequency range f = [,6] Hz. (a) MML GSC-TAR(8) q= model; (b) EM GSC-TAR(8) q= model; (c) MHS GSC-TAR(8) q= model; (d) MML SP-TAR(8) q= model; (e) FS-TAR(8) [6] model; (f) Spectrogram with Gaussian window with aperture parameter α =, N_fft = 4 samples and overlap samples.
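The frozen TV-PSD and frozen natural frequencies displayed in these figures follow from evaluating, at each time instant, the spectrum and the poles of the AR polynomial formed with the current parameter values. The sketch below shows that computation for a single time instant under stated assumptions: the function names are hypothetical, the pole computation is restricted to a second-order model so that the closed-form quadratic roots can be used, and the pole location and sampling frequency are illustrative values, not those of the wind turbine signal.

```python
import cmath
import math

def frozen_psd(a_t, sigma2_w, f, f_s):
    """sigma_w^2 / |1 + sum_i a_i[t] exp(-j 2 pi i f / f_s)|^2 at one time instant."""
    denom = 1.0 + sum(a * cmath.exp(-2j * math.pi * (i + 1) * f / f_s)
                      for i, a in enumerate(a_t))
    return sigma2_w / abs(denom) ** 2

def frozen_poles_tar2(a1, a2):
    """Roots of z^2 + a1 z + a2 = 0, the frozen poles of a second-order model."""
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    return (-a1 + disc) / 2.0, (-a1 - disc) / 2.0

def natural_frequency(pole, f_s):
    """f_n = f_s * |ln(pole)| / (2 pi), in the units of f_s."""
    return f_s * abs(cmath.log(pole)) / (2.0 * math.pi)

# Illustrative values (assumptions): pole 0.98 * exp(+/- j pi/4), f_s = 100 Hz.
rho, omega, f_s = 0.98, math.pi / 4, 100.0
a1, a2 = -2.0 * rho * math.cos(omega), rho ** 2
poles = frozen_poles_tar2(a1, a2)
f_n = [natural_frequency(p, f_s) for p in poles]
freqs = [0.1 * k for k in range(501)]            # 0 to 50 Hz in 0.1 Hz steps
psd = [frozen_psd([a1, a2], 1.0, f, f_s) for f in freqs]
f_peak = freqs[psd.index(max(psd))]
```

Repeating the evaluation for every time instant of an estimated parameter trajectory gives the time-frequency maps; for a lightly damped pole the spectral peak lies close to the frozen natural frequency, which is why both are overlaid in the plots.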

3. Discussion of the results

The previous two sections have been devoted to evaluating and comparing the performance of the GSC-TAR and GSC-TARMA model identification methods on respective Monte Carlo simulations featuring stochastic parameter evolution TAR and TARMA models with deterministic components. The following conclusions can be drawn from the experiments.

3.. Comparison of the GSC-TARMA model identification methods

Four identification methods have been evaluated: the Maximum Marginal Likelihood (MML) method, the Maximum Likelihood via Expectation Maximization (EM) method, the Markov Chain Monte Carlo sampling method via the Metropolis-Hastings Sampling (MHS) algorithm, and the Joint Extended Kalman Filter (JEKF) method. The performance of the models identified via the different methods is consistent for both the TAR and TARMA model structures. Nonetheless, the evaluation has revealed the strengths and weaknesses of each method, which are summarized next.

MML method. The MML method provides the best overall performance among all methods. Moreover, the evaluation of each iteration of the optimization is simpler, since only a priori and filtered estimates of the parameter vector are required for the computation of the marginal likelihood. However, the implementation becomes more involved when the gradients of the estimated parameter vector with respect to the hyperparameters have to be computed. For that reason, only gradient-free search methods were used in the implementation tested in this work. A two-stage search scheme, combining global and local search methods, proved efficient for the present problem and yielded consistent estimates.

EM method. The EM method provides models with very good predictive ability, with a slightly lower tracking performance in comparison with the MML method. This difference is more noticeable in the full GSC-TARMA case. The implementation of this method requires an extra smoothing step and the calculation of several other quantities at each iteration. Nonetheless, the conceptual simplicity and the robustness of the algorithm make it a very good choice. The main problem, slow convergence, has been successfully tackled by applying a non-linear search algorithm after several iterations of the conventional EM equations.

MHS method. This algorithm is perhaps the simplest among all the optimization methods and also provides models with very high performance. The simplicity of this method stems from the update mechanism, which only requires the estimation of the log-likelihood based on smoothed parameter estimates at each step. Alternative versions can be devised by using the marginal likelihood instead, which only requires the computation of filtered parameter estimates. The selection of the hyperpriors and the proposal PDF may be either a drawback or an advantage. It is a drawback because poor selections may lead to erroneous estimates. On the other hand, it is an advantage because it allows experienced users to introduce prior knowledge about the problem. Non-experienced users are advised to follow the selections used in this work, since these may lead to good results in most cases (see also the results shown in the accompanying paper Avendaño-Valencia and Fassois (5d)).

JEKF method. The JEKF method provides the lowest performance in the tests. The main difficulty with this method is that it does not incorporate a mechanism to directly optimize the variances of the parameter and hyperparameter innovations. A solution may be to combine this method with one of the previous ones, but in that case it is perhaps better to use any of the other methods directly. The main use of this method would be an initial analysis of the data, prior to the application of the remaining methods, which are more time consuming. Also, with an appropriate selection of the parameter and hyperparameter innovation variances, this method becomes the best candidate for applications requiring on-line identification.

3.. GSC-TARMA, SP-TARMA and FS-TARMA

A direct comparison between GSC-TARMA, SP-TARMA and FS-TARMA models is not entirely fair, since all of them can perform very well in the identification of non-stationary time series. Therefore, it must be remarked that the discussion here mostly concerns applications involving non-stationary time series with both stochastic and deterministic effects in the evolution of the dynamics. In such a case, GSC-TARMA models can provide very good performance, as confirmed by the results found in this work. A second scenario where GSC-TARMA models can be very advantageous is when the non-stationary dynamics are characterized by complex deterministic patterns, for which the determination of the most appropriate functional basis subspace and its dimensionality, in the case of models of the FS-TARMA class, is complicated. In such a case, the GSC-TARMA models have the advantage of a simpler structure, while appropriate values of the hyperparameters can be obtained automatically with any of the optimization methods discussed in this work.

Another issue of interest is the choice between the GSC-TARMA and SP-TARMA model structures. The only differentiating characteristic between these two model types is the adjustment of the stochastic constraint parameters, so the issue is whether or not to optimize such parameters. According to the results obtained in this work, the principal quantities driving the performance of stochastic parameter evolution TARMA models are the variances of the innovations and of the parameter innovations. However, the results also show that optimizing the stochastic constraint parameters leads to models with improved performance. Therefore, the suggestion is to optimize these values as well, provided the performance improvement is worth the extra estimation effort.

3. Conclusions

This chapter features the Generalized Linear Stochastic Constraint TARMA model class, for which the evolution of the time-dependent model parameters is defined in terms of Gaussian AR processes. The well known Smoothness Priors TARMA (SP-TARMA) models are the special case in which the parameter evolution model is defined in terms of q-integrated processes (AR(q) processes with q unit roots). The complete modeling potential of GSC-TARMA models is exploited by properly adjusting the model hyperparameters to the actual signal dynamics via adequate optimization methods. In that sense, the main contribution of this work is the formulation of such a problem as a Bayesian inference problem. Two optimization methods have been discussed in this work: one based on Markov Chain Monte Carlo sampling of the joint distribution of the parameters and hyperparameters of the GSC-TARMA model, and one based on an augmented state space representation, which is subsequently used for the estimation of the parameters and hyperparameters through Kalman filter methods. Alternative Maximum Likelihood optimization methods are discussed in the accompanying paper Avendaño-Valencia and Fassois (5d).

The main conclusion of this work is that the overall performance of the GSC-TARMA models (including the special case of SP-TARMA models) can be sharply improved by optimization, which avoids the trial-and-error approach for the selection of the values of the model hyperparameters typically followed when using stochastic parameter evolution models. Moreover, GSC-TARMA models are very good candidates for modeling non-stationary time series characterized by stochastic or complex deterministic patterns in the evolution of the dynamics, where FS-TARMA models present difficulties in the selection of the functional basis subspace and its dimensionality.

The Monte Carlo evaluation of the various GSC-TARMA identification methods, including the Bayesian methods discussed in this work and the alternative Maximum Likelihood methods from the accompanying paper Avendaño-Valencia and Fassois (5d), demonstrates the qualities and capabilities of the GSC-TARMA approach. The proposed modeling approach outperforms the conventional smoothness priors and functional series models in the considered tests, in terms of modeling accuracy and tracking ability. The Monte Carlo tests also demonstrate the consistency of the results and the high performance of the identification methods.

Appendix 3.A Some examples of models in the class of GSC-TARMA models

3.A. TARMA model with AR parameter evolution

A TARMA model with AR parameter evolution, described by the equations

$$\theta[t] = -\sum_{k=1}^{q} \mu_k\, \theta[t-k] + v[t], \qquad v[t] \sim \mathrm{NID}\left(v_o, \sigma_v^2 I_{n \times n}\right) \tag{3.A.a}$$

$$y[t] = \phi^T[t]\, \theta[t] + w[t], \qquad w[t] \sim \mathrm{NID}\left(0, \sigma_w^2\right) \tag{3.A.b}$$

is a GSC-TARMA model with $M_k = \mu_k I$, $\Sigma_v = \sigma_v^2 I$, and hyperparameters $\mathcal{P} = \{\mu, v_o, \sigma_v^2, \sigma_w^2, y[\,], \theta[\,]\}$, where $\mu = [\,\mu_1\ \mu_2\ \cdots\ \mu_q\,]^T$.

3.A. TARMA model with integrated AR parameter evolution

A TARMA model with integrated AR parameter evolution, described by the equations

$$\theta[t] = -\sum_{k=1}^{q} \mu_k\, \theta[t-k] + v[t], \qquad v[t] \sim \mathrm{NID}\left(0_n, \sigma_v^2 I_{n \times n}\right) \tag{3.A.a}$$

$$y[t] = \phi^T[t]\, \theta[t] + w[t], \qquad w[t] \sim \mathrm{NID}\left(0, \sigma_w^2\right) \tag{3.A.b}$$

and with the property that at least one of the roots of the polynomial $1 + \sum_{k=1}^{q} \mu_k B^k$ has magnitude one, is a GSC-TARMA model with $M_k = \mu_k I$, $\Sigma_v = \sigma_v^2 I$, and hyperparameters $\mathcal{P} = \{\mu, \sigma_v^2, \sigma_w^2, y[\,], \theta[\,]\}$, where $\mu = [\,\mu_1\ \mu_2\ \cdots\ \mu_q\,]^T$. Notice that, since the integrated AR model describes a non-stationary trend process, $v[t]$ is taken to have zero mean so as to avoid a redundant definition of the parameter mean.

Appendix 3.B Demonstration of the statistical properties of the GSC-TARMA model

This appendix is devoted to the demonstration of the statistical properties of the GSC-TARMA model that are shown in Section 3.3 and are used throughout the text.

3.B. The parameter mean

Given the parameter evolution constraints in Equation (3.3.), the parameter mean is defined as:

$$E\{\theta[t]\} = E\left\{ -\sum_{k=1}^{q} M_k\, \theta[t-k] + v[t] \right\} = -\sum_{k=1}^{q} M_k\, E\{\theta[t-k]\} + v_o$$

where $E\{\cdot\}$ is the expectation operator defined in the probabilistic space of $\theta[t]$. Assuming that the parameter mean converges to the value $\theta_o$, then

$$\theta_o = -\sum_{k=1}^{q} M_k\, \theta_o + v_o \quad\Longrightarrow\quad \theta_o + \sum_{k=1}^{q} M_k\, \theta_o = v_o$$

which, after factorizing $\theta_o$ and pre-multiplying both sides by the inverse of $I_{n \times n} + \sum_{k=1}^{q} M_k$, yields:

$$\theta_o = \left( I_{n \times n} + \sum_{k=1}^{q} M_k \right)^{-1} v_o \tag{3.B.}$$
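The closed-form parameter mean can be checked numerically. The following is a minimal Python sketch, not part of the original derivation. It assumes the sign convention $\theta[t] = -\sum_k M_k \theta[t-k] + v[t]$ and iterates the recursion for the mean until it settles. The function names and the values of `M` and `v_o` are hypothetical, chosen only so that the recursion is stable.

```python
import numpy as np

def stationary_parameter_mean(M, v_o):
    """Closed form: theta_o = (I + sum_k M_k)^{-1} v_o."""
    n = v_o.shape[0]
    return np.linalg.solve(np.eye(n) + sum(M), v_o)

def iterate_parameter_mean(M, v_o, steps=200):
    """Iterate m[t] = -sum_k M_k m[t-k] + v_o from zero initial conditions."""
    n = v_o.shape[0]
    # history[0] holds m[t-1], history[1] holds m[t-2], and so on
    history = [np.zeros(n) for _ in M]
    for _ in range(steps):
        m_new = v_o - sum(M_k @ m_k for M_k, m_k in zip(M, history))
        history = [m_new] + history[:-1]
    return history[0]

# Hypothetical stable example with q = 2 and n = 2 (diagonal constraint matrices)
M = [np.diag([-0.9, -0.5]), np.diag([0.2, 0.06])]
v_o = np.array([0.3, 0.56])

theta_closed = stationary_parameter_mean(M, v_o)
theta_limit = iterate_parameter_mean(M, v_o)
print(theta_closed, theta_limit)
```

With these values, $I + M_1 + M_2 = \mathrm{diag}(0.3, 0.56)$, so both computations should agree on a mean of one for each parameter.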

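3.B. The joint distribution of the observations and parameter vector

Under the NID assumption for the parameter and signal innovations, and the GSC-TARMA model state space definition in Equation (3.3.3), the values of the signal and the parameter vector at time $t$ are also jointly distributed Gaussian variables that satisfy:

$$\begin{bmatrix} y[t] \\ z[t] \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix} \bar{y}[t|t-1] \\ \bar{z}[t|t-1] \end{bmatrix}, \begin{bmatrix} \sigma_\varepsilon^2[t] & \Sigma_{y,z}[t] \\ \Sigma_{y,z}^T[t] & \Sigma_{z,z} \end{bmatrix} \right)$$

where the elements in the mean vector are

$$\bar{z}[t|t-1] \triangleq E\{ z[t] \mid z[t-1], \mathcal{P} \} = E\{ F(\mu)\, z[t-1] + G\, v[t] \mid z[t-1], \mathcal{P} \} = F(\mu)\, z[t-1] + G\, v_o \tag{3.B.}$$

and

$$\bar{y}[t|t-1] \triangleq E\{ y[t] \mid z[t-1], \phi[t], \mathcal{P} \} = E\{ h^T[t]\, z[t] + w[t] \mid z[t-1], \phi[t], \mathcal{P} \} = h^T[t]\, E\{ z[t] \mid z[t-1], \mathcal{P} \} = h^T[t]\, \bar{z}[t|t-1]$$

where $E\{\cdot\}$ is the (conditional) expectation on the joint probabilistic space of $y[t]$ and $z[t]$. Moreover, the elements of the covariance matrix are

$$\Sigma_{z,z} \triangleq E\left\{ (z[t] - \bar{z}[t|t-1]) (z[t] - \bar{z}[t|t-1])^T \mid z[t-1], \mathcal{P} \right\} = E\left\{ (G\, v[t]) (G\, v[t])^T \mid z[t-1], \mathcal{P} \right\} = G\, E\left\{ v[t]\, v^T[t] \mid z[t-1], \mathcal{P} \right\} G^T = G\, \Sigma_v\, G^T$$

and

$$\begin{aligned}
\Sigma_{y,z}[t] &\triangleq E_c\left\{ (y[t] - \bar{y}[t|t-1]) (z[t] - \bar{z}[t|t-1])^T \right\} \\
&= E_c\left\{ (y[t] - h^T[t]\, \bar{z}[t|t-1]) (z[t] - \bar{z}[t|t-1])^T \right\} \\
&= E_c\left\{ (y[t] - h^T[t]\, z[t] + h^T[t]\, z[t] - h^T[t]\, \bar{z}[t|t-1]) (z[t] - \bar{z}[t|t-1])^T \right\} \\
&= E_c\left\{ (w[t] + h^T[t]\, G\, v[t]) (G\, v[t])^T \right\} \\
&= E_c\left\{ w[t]\, v^T[t]\, G^T + h^T[t]\, G\, v[t]\, v^T[t]\, G^T \right\} \\
&= h^T[t]\, G\, E_c\left\{ v[t]\, v^T[t] \right\} G^T = h^T[t]\, \Sigma_{z,z}
\end{aligned}$$

and

$$\begin{aligned}
\sigma_\varepsilon^2[t] &\triangleq E_c\left\{ (y[t] - \bar{y}[t|t-1])^2 \right\} \\
&= E_c\left\{ (y[t] - h^T[t]\, z[t] + h^T[t]\, z[t] - h^T[t]\, \bar{z}[t|t-1])^2 \right\} \\
&= E_c\left\{ (w[t] + h^T[t]\, G\, v[t]) (w[t] + h^T[t]\, G\, v[t])^T \right\} \\
&= E_c\left\{ w^2[t] \right\} + 2\, E_c\left\{ w[t]\, v^T[t]\, G^T h[t] \right\} + E_c\left\{ h^T[t]\, G\, v[t]\, v^T[t]\, G^T h[t] \right\} \\
&= \sigma_w^2 + h^T[t]\, \Sigma_{z,z}\, h[t]
\end{aligned}$$

where $E_c\{\cdot\}$ indicates the conditional expectation with respect to $z[t-1]$, $\phi[t]$ and $\mathcal{P}$, and where the identities $w[t] = y[t] - h^T[t]\, z[t]$ and $G\, v[t] = z[t] - \bar{z}[t|t-1]$ have been used in the derivations. In the latter identity $v[t]$ is understood as the zero-mean part of the parameter innovations, and the cross terms vanish because $w[t]$ and $v[t]$ are uncorrelated.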

130 3.C. Derivation of the EM algorithm for the identification of the GSC TARMA model 3.B.3 The PDF of the parameter trajectory The parameter evolution equation (Equation (3.3.3a)) is associated with a conditional PDF for z[t] given z[t ] and hyperparametersp, which under the NID assumption for the parameter innovations, takes the form p(z[t] z[t ],P)=N( z[t t ], Σ v ) Let z N {z[], z[],, z[n]} represent a given trajectory of the state vector. The function p(z N P) represents the probability of the trajectory z N given the hyperparametersp. Using the Markov property of Equation (3.3.3a), it is possible to decompose it as follows p(z N P)= p(z[n] z N,P) p(z N,P) = p(z[n] z N,P) p(z[n ] z N,P) p(z N,P) and so on. Thus, after continuing with the decomposition of the joint PDF and by noting that p(z[t] z N,P) = p(z[t] z[t ],P), leads to the following result (Equation (3.3.a)) p(z N P)= N t= p(z[t] z[t ],P)= N t= N( z[t t ], Σ v ) given initial conditions φ[] and z[]= [ θ T [] θ T [ ] θ T [ n a ] ] (Anderson and Moore, 979, Ch. ), Gosdill et al. (). 3.B.4 The conditional observation PDF Likewise, the TARMA equation (Equation (3.3.3b)) is associated with a conditional PDF for y[t] given z[t], φ[t] and P. Under the NID assumption of the signal innovations, takes the form p(y[t] φ[t], z[t],p)=n ( h[t] z[t],σw ) (3.B.3) The quantity p(y N zn,p) is the conditional PDF of the sequence of observations yn given the realization of the parameter vector z N and the hyperparametersp. This PDF can be decomposed as follows: p(y N z N,P)= p(y[n] y N, z N,P) p(y N z N,P) = p(y[n] y N, z N,P) p(y[n ] y N, z N,P) p(y N z N,P) and so on. 
As in the previous case, after noticing that p(y[t] y t, z N,P)= p(y[t] φ[t], z[t],p), leads to the following result (Equation (3.3.b)) p(y N z,p)= N t=n a + p(y[t] φ[t], z[t],p)= N t=n a + N ( h[t] z[t],σw ) Appendix 3.C Derivation of the EM algorithm for the identification of the GSC TARMA model This appendix concerns to the demonstration of the equations utilized in the EM algorithm for the maximum likelihood identification of a GSC TARMA model described in Section The complete data likelihood, which includes the observations y N and parameter trajectories θ N is of the form L(P y N, θ N )= p(y N, θ N P)= p(y N θ N,P) p(θ N P). Then, after substituting equations (3.3.b) and (3.3.a), and taking the negative logarithm, leads to the following expression for the negative log-likelihood: lnl(p y N, z N )=K+ N where K is a constant. ln Σ z,z + N t= + N lnσ w+ σw ( ) T z[t] F(µ) z[t ] Σ N t= (y[t] h T [t] z[t]) z,z ( ) z[t] F(µ) z[t ] (3.C.)

131 3. GSC-TARMA modeling of non-stationary random signals 3.C. Derivation of the expected log-likelihood The EM algorithm consists on the optimization of the expected log-likelihood: Q(P P )=E { lnl(p y N, z N ) y N,P } = Ec { lnl(p y N, z N ) } (3.C.) which is conditioned on the observed data y N and the previous values of the hyperparametersp. From now on the nomenclature E c { } is used as a short form to represent the conditional expectation E { y N,P }. After replacing Equation (3.C.) in Equation (3.C.), yields Q(P P )=E c { lnl(p y N, z N ) } = K+ N + N t= ln Σ z,z + N lnσ w+ σw N { E c (y[t] h T [t] z[t]) } t=}{{} E { ( ) T ) E c z[t] F(µ) z[t ]+ G v o Σ z,z (z[t] } F(µ) z[t ]+ G v o } {{ } E (3.C.3) Solving for E E =E c { (y[t] h T [t] z[t]) } =E c { y [t] y[t] h T [t] z[t]+ h T [t] z[t] z T [t] h[t] } =y [t] y[t] h T [t] E c {z[t]}+ h T { [t] E c z[t] z T [t] } h[t] ( ) =y [t] y[t] h T [t] ẑz[t N]+ h T [t] P[t N]+ẑz[t N] ẑz T [t N] h[t] ( +h = y[t] h [t] ẑz[t N]) T T [t] P[t N] h[t] (3.C.4) where z[t N]=E c {z[t]} and P[t N]=E c { (z[t] Ec {z[t]}) (z[t] E c {z[t]}) T} are the smoothed estimates of the state and state estimation error covariance matrix. Solving for E : { ( ) T ) E =E c z[t] F(µ) z[t ]+ G v o Σ z,z (z[t] } F(µ) z[t ]+ G v o { ) ) =E c z T [t] Σ z,z (z[t] F(µ) z[t ]+ G v o (F(µ) z[t ]) T Σ z,z (z[t] F(µ) z[t ]+ G v o =E c { ( ) } +(G v o ) T Σ z,z z[t] F(µ) z[t ]+ G v o z T [t] Σ z,z z[t] z T [t] Σ z,z F(µ) z[t ]+ z T [t] Σ z,z G v o z T [t ] F T (µ) Σ z,z z[t] + z T [t ] F T (µ) Σ z,z F(µ) z[t ] z T [t ] F T (µ) Σ z,z G v o } + v T o G T Σ z,z z[t] v T o G T Σ z,z F(µ) z[t ]+ v T o G T Σ z,z G v o 3

132 3.C. Derivation of the EM algorithm for the identification of the GSC TARMA model E =E c {tr =tr =tr =tr ( ( ( ( Σ z,z z[t] z T [t] Σ z,z F(µ) z[t ] z T [t]+ Σ z,z G v o z T [t] Σ z,z z[t] z T [t ] F T (µ) + Σ z,z F(µ) z[t ] z T [t ] F T (µ) Σ z,z G v o z T [t ] F T (µ) )} + Σ z,z z[t] v T o G T Σ z,z F(µ) z[t ] v T o G T + Σ z,z G v o v T o G T Σ z,z E c {z[t] z T [t] F(µ) z[t ] z T [t]+ G v o z T [t] z[t] z T [t ] F T (µ) Σ z,z Σ z,z ( ( + F(µ) z[t ] z T [t ] F T (µ) G v o z T [t ] F T (µ) + z[t] v T o G T F(µ) z[t ] v T o G T + G v o v T o G T }) P[t N]+ẑz[t N] ẑz T [t N] F(µ) (P T [t,t N]+ẑz[t N] ẑz T [t N] ) + G v o ẑz T [t N] (P[t,t N]+ẑz[t N] ẑz T [t N]) F T (µ) + F(µ) (P[t N]+ẑz[t N] ẑz T [t N] ) F T (µ) G v o ẑz T [t N] F T (µ) + ẑz[t N] v T o G T F(µ) ẑz[t N] v T o G T + G v o v T o G T )) S [t] F(µ) S T [t] S [t] F T (µ)+ F(µ) S [t ] F T (µ)+ G v o v T o G T + G v o (ẑz[t N] F(µ) ẑz[t N]) T +(ẑz[t N] F(µ) ẑz[t N]) v T o G T )) where S [t] = P[t N]+ẑz[t N] ẑz T [t N], S [t] = P[t N]+ẑz[t N] ẑz T [t N], and S [t] = P[t,t N]+ ẑz[t N] ẑz T [t N]. Introducing both results for E and E into Equation (3.C.3) leads to the following expression Q(P P )=K+ N ln Σ z,z + N lnσw ( ) + N ( ) +h σ w y[t] h T [t] ẑz[t N] T [t] P[t N] h[t] t= ( ( )) + tr Σ z,z S F(µ) S T S F T (µ)+ F(µ) S F T (µ) + N t= ( ) ( ) T ẑz[t N] F(µ) ẑz[t N] Σ z,z G v o + v T o G T Σ z,z G v o (3.C.5) where 4 S = S = S = N t= N t= N t= ( ) P[t N]+ẑz[t N] ẑz T [t N] ( ) P[t,t N]+ẑz[t N] ẑz T [t N] ( ) P[t N]+ẑz[t N] ẑz T [t N] (3.C.6a) (3.C.6b) (3.C.6c)

133 3. GSC-TARMA modeling of non-stationary random signals 3.C. Update equation for the parameter innovations mean The update equation for the mean value of the parameter innovations v o is obtained by computing the derivative of Q(P P ) with respect to v o, equating to zero and solving for v o. To start with, the value of the partial derivative is as follows: ( ) N Q(P P j ) ( ) T ẑz[t N] F(µ) ẑz[t N] Σ z,z G v o + v T o G T Σ z,z G v o Now, since v o = = Equation (3.C.7) reduces to t= N t= v o ( Σ Q(P P j ) = v o So, after solving for v o, yields 3.C.3 ) ( ) T ẑz[t N] F(µ) ẑz[t N] Σ z,z G+v T o G T Σ z,z G z,z G= Σ v I n... n.... = G T Σ z,z G= [ I n n n ] N t= n Σ v ( ( ) T ẑz[t N] F(µ) ẑz[t N] =(N ) v T o Σ ˆv o = ( N v + t= I n + q k= ( ( ˆθ[t N]+ ) ( M k N Σ v n. n I n... n.... = Σ v n Σ v n. n ) + vt o Σ v q M k ˆθ[t ) T k N] Σ v k= N t= ˆθ[t N] Update equation for the stochastic constraint parameters ) ) (3.C.7) = (3.C.8) (3.C.9) Stochastic constraint matrices with full structure The update equation for the state transition matrices F(µ) is obtained by computing the derivative of Q(P P ) with respect to F(µ), equating to zero and solving for F(µ). Thus, the first step is computing the derivative, which is carried out as follows ( ( ( ))) Q(P P ) = F(µ) F(µ) tr Σ z,z S F(µ) S T S F T (µ)+ F(µ) S F T (µ) = + ( N t= ( F(µ) ) ( ) T ẑz[t N] F(µ) ẑz[t N] Σ z,z G v o + v T o G T Σ z,z G v o ) Σ z,z S + Σ z,z F(µ) S + = Σ z,z S + Σ z,z F(µ) S N t= N t= ( ( ) Σ z,z G v o ẑz T [t N] =Σ z,z ( S + F(µ) S (N ) G v o ẑz T o) = Σ z,z G v o ẑz T [t N] ) 5

134 3.C. Derivation of the EM algorithm for the identification of the GSC TARMA model where ẑz o = N N t= ẑz[t N]. Solving for F(µ) yields F(µ)= ( S +(N ) G v o ẑz T o ) S (3.C.) It is possible to solve only for the stochastic constraint matrices M k following the procedure described next: M M M q S [] I I = S []. +. (N ) v o ẑz T o S S [q] S []+(N ) v o ẑz T o (S []+(N ) v o ẑz T o) S S [] = S. = S [] S. (3.C.) S [q] S [q] S where the matrix S is divided into equal-sized sub-matrices S [k] of size n nq. Then, as can be seen in Equation (3.C.), the updated stochastic constraint matrices depend only on components of the upper row. Therefore, the update equation can be reduced to where ˆM = [ ˆM ˆM ˆM q ]. ˆM = ( S []+(N ) v o ẑz T o) S (3.C.) Stochastic constraint matrices with diagonal structure The chain rule for matrix derivation is used in order to derive the update equation for this special case. In this sense, the partial derivative of Q(P P ) with respect to µ k is ( ( ) ) Q(P P ) Q(P P ) F(µ) T =tr (3.C.3) µ k F(µ) µ k where, as demonstrated previously, and Q(P P ) F(µ) = Σ z,z ( S + F(µ) S (N ) G v o ẑz T ) o [ F(µ) µ = µ k µ k µ q µ k (q ) (q ) ] I n = J,k I n = J,k (3.C.4) (3.C.5) where the matrix J j,k is a matrix with zeros everywhere except on the position j, k, where is equal to one. The matrix J j,k has the property that J T j,k = J k, j. This property is extended also to the matrix J j,k Thus, using the chain rule produces the following expression for the partial derivative Q(P P ) µ k =tr ( Σ z,z ( S + F(µ) S (N ) G v o ẑz T ) o ) J k, = tr ( Σ ) ( z,z S J k, + tr Σ ) ( z,z F(µ) S J k, (N ) tr Σ z,z G v o ẑz T ) o J k, (3.C.6) then, after following a lengthy arithmetical procedure, the following expression for the derivative is achieved 6 Q(P P ) µ k = tr ( Σ v S [,k] ) q l= µ l tr ( Σ v S [l,k] ) ( (N ) tr Σ v v o ˆθ T o ) = (3.C.7)

135 3. GSC-TARMA modeling of non-stationary random signals and since a diagonal form for the covariance matrix is also assumed (namely Σ v = σv I n ), then ( Q(P P ) =σ q ( v tr(s [,k]) µ k µ l tr(s [l,k]) (N ) tr v o ˆθ T o) ) = l= [ S [,k] S [,k] S [q,k] ] µ. = S [,k] µ q µ (3.C.8) where S [l,k]=tr(s [l,k]), and S [,k]=tr(s [,k])+(n ) ˆθ T o v o. Thus, the following matrix equation is obtained after completing the system: ˆµ = S S [] (3.C.9) where tr(s [,])+(N ) ˆθ T o v o tr(s [,]) tr(s [,]) tr(s [q,]) S []= tr(s [,])+(N ) ˆθ T o v o., tr(s [,]) tr(s [,]) tr(s [q,]) S = (3.C.) tr(s [,q])+(n ) ˆθ T o v tr(s [,q]) tr(s [,q]) tr(s [q,q]) o 3.C.4 Update equation for the parameter innovations covariance matrix To start with the demonstration, initially the derivative of Q(P P ) with respect to Σ z,z is computed as follows: ( Q(P P j ) Σ = N z,z Σ ln Σ z,z + ( ( )) z,z tr Σ z,z S F(µ) S T S F T (µ)+ F(µ) S F T (µ) N ( ( ) ) ) T ẑz[t N] F(µ) ẑz[t N] Σ z,z G v o + v T o G T Σ z,z G v o + t= = N Σ z,z + + N t= ( S F(µ) S T S F T (µ)+ F(µ) S F T (µ) ( ( ) T ẑz[t N] F(µ) ẑz[t N]) ) G v o + v T o G T G v o = (3.C.) Now, solving for Σ z,z, yields [ ( ˆΣ z,z = ) S F(µ) S T N S F T (µ)+ F(µ) S F T (µ) N ( ( ) T + ẑz[t N] F(µ) ẑz[t N]) ] G v o + v T o G T G v o (3.C.) t= Since the matrix Σ z,z = G Σ v G T, then the result above can be simplified to ) ˆΣ v =v T o v o + ˆv T o v o ( + S [,] N q k= ( ) M k S T [,k]+ S [k,] M T k + q q k= l= M k S [k,l] M T l Moreover, in the case of a diagonal covariance matrix, the variance update equation is of the form ( ) ˆσ v = v o v T o + v o ˆv T o ( + S [,] µ k n n (N ) S ) q q [,k] + µ k µ l S [k,l] q k= k= l= where ˆv o corresponds to the updated value of the mean innovations shown in Equation (3.C.9). ) (3.C.3) (3.C.4) 7

3.C.5 Update equations for the innovations variance

In the case of $\sigma_w^2$, the partial derivative is of the form

$$\frac{\partial Q(\mathcal{P}|\mathcal{P}')}{\partial \sigma_w^2} = \frac{\partial}{\partial \sigma_w^2} \left( \frac{N}{2} \ln \sigma_w^2 + \frac{1}{2\sigma_w^2} \sum_{t=1}^{N} \left( \left( y[t] - h^T[t]\, \hat{z}[t|N] \right)^2 + h^T[t]\, P[t|N]\, h[t] \right) \right) = \frac{N}{2\sigma_w^2} - \frac{1}{2\sigma_w^4} \sum_{t=1}^{N} \left( \left( y[t] - h^T[t]\, \hat{z}[t|N] \right)^2 + h^T[t]\, P[t|N]\, h[t] \right) = 0 \tag{3.C.5}$$

and, after solving for $\sigma_w^2$, yields the update equation

$$\hat{\sigma}_w^2 = \frac{1}{N} \sum_{t=1}^{N} \left( \left( y[t] - h^T[t]\, \hat{z}[t|N] \right)^2 + h^T[t]\, P[t|N]\, h[t] \right) \tag{3.C.6}$$

Appendix 3.D Identification based on the parameter trajectories

This appendix is devoted to the identification and evaluation of the predictive ability of AR and integrated AR models for the parameter trajectories of the SPE-TAR models analyzed in Section 3.5. The parameter trajectories are identified via AR(q) models, with q = {, }. The parameter estimation is carried out using the MATLAB function arburg. The results for the parameter trajectories of the SPE-TAR models are shown in Table 3.8. The results make clear that the obtained AR models have a single unit root, demonstrating the integrated nature of the parameter trajectories in both cases. Moreover, in the case of the AR(3) models, the root with the smallest magnitude is very close to zero, which indicates that it could be dropped from the model. Therefore, an AR() model with a single unit root is adequate for the description of the parameter trajectories.

The prediction error obtained from the identified AR models is compared with the prediction error obtained by using integrated AR (IAR) models with parameters extracted from the polynomial $(1 - z)^q$ (the parameters of a smoothness priors model). The results are also shown in Table 3.8. In comparison with the AR models, the IAR models are characterized by lower predictive ability and increased variability.
Table 3.8: Sample average values of the parameters, associated roots and innovations variance of AR() and AR(3) models of the parameter trajectories of the SPE-TAR model.

SPE-TAR model
Parameter     AR()           AR(3)          IAR()          IAR()
a_1           .956±.34%      .95±.67%
a_2           .956±.34%      .947±.5%
a_3                          .5±.54%
ρ_1           .±.%           .±.%
ρ_2           .956±.34%      .956±.34%
ρ_3                          .5±.3%
log σ_v^2     6.99±.%        6.99±.4%       5.83±3.7%      6.98±.596%
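The unit-root diagnosis described in this appendix can be illustrated with a short Python sketch. It is an illustration only: the thesis uses MATLAB's arburg (Burg's method), whereas this sketch swaps in ordinary least squares. It also writes the AR model with positive coefficients, $x[t] = \sum_i a_i x[t-i] + e[t]$. The synthetic trajectory, the function names and the value 0.5 of the stationary root are all hypothetical.

```python
import numpy as np

def fit_ar_least_squares(x, order):
    """Least-squares fit of x[t] = sum_i a_i x[t-i] + e[t]; returns a_1..a_order."""
    # Column i-1 holds the signal delayed by i samples
    X = np.column_stack([x[order - i: len(x) - i] for i in range(1, order + 1)])
    target = x[order:]
    coeffs, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coeffs

def ar_roots(coeffs):
    """Roots of z^p - a_1 z^(p-1) - ... - a_p."""
    return np.roots(np.concatenate(([1.0], -np.asarray(coeffs))))

# Synthetic integrated trajectory: increments follow a stationary AR(1) with
# root 0.5, so the trajectory itself is AR(2) with roots 1 and 0.5
rng = np.random.default_rng(0)
N = 20000
increments = np.zeros(N)
noise = rng.standard_normal(N)
for t in range(1, N):
    increments[t] = 0.5 * increments[t - 1] + noise[t]
trajectory = np.cumsum(increments)

root_magnitudes = np.sort(np.abs(ar_roots(fit_ar_least_squares(trajectory, 2))))[::-1]
print(root_magnitudes)
```

A largest root magnitude close to one signals the integrated (random-walk-like) nature of the trajectory, which is the property the appendix reads off Table 3.8.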

Chapter 4 Time and Frequency Analysis of Non-Stationary Signals by Means of Time-Dependent ARMA Representations

The analysis of the time and frequency properties of Time-dependent ARMA and similar parametric time-dependent models has generally been performed in terms of the so-called frozen approach, in which the time-dependent parameters of the model are analyzed as if they corresponded to a stationary system at each time instant. The frozen approach facilitates the analysis of the non-stationary dynamics, but has important limitations in accurately representing them, especially when the model parameters evolve rapidly. On the other hand, in the analysis of Linear Time Periodic (LTP) systems, the concepts of Harmonic Impulse Response and Harmonic Frequency Response Function (FRF) are widely used for the analysis of input-output relationships and the understanding of the modulations introduced by the LTP system. To date, those concepts have been analyzed for continuous time LTP systems and for discrete time LTP systems described by the impulse response function. In this chapter, these concepts are adapted to the particular case of discrete Linear Time-Varying (LTV) difference equation models, with TARMA models being a special case. The main contribution of this work is the derivation of analytical expressions that relate the discrete Fourier transform of the parameters of the LTV model to its respective Harmonic Impulse Response and Harmonic FRF. These formulations are also used to evaluate the spectral correlation and the time-dependent spectra of the Melard-Tjøstheim and Wigner-Ville types for the difference equation LTV model and for the particular case of TARMA models. The theoretical and practical results demonstrate the advantages of the postulated parametric TARMA model based analysis framework compared with non-parametric methods and the conventional frozen approach for the analysis of LTV systems.

4. Introduction

Several real-life systems possess time-dependent characteristics that make their dynamic response signals stochastically time-dependent, that is, non-stationary. Strictly speaking, a random signal is called non-stationary if its probability density function is a function of time (Poulimenos and Fassois, 6). Examples of non-stationary signals include the vibration response of mechanical systems and structures with variable geometry and non-stationary excitation, as well as signals of biological origin, speech, or financial time series. Time-dependent AutoRegressive Moving Average (TARMA) models are one of the most powerful and widely used methods for the identification and analysis of non-stationary signals and systems with non-stationary response. A TARMA model of a discrete time non-stationary signal $y[t]$, $t = 1, 2, \ldots, N$, denoted as TARMA($n_a$, $n_c$), with $n_a$ designating its AutoRegressive (AR) order and $n_c$ its Moving Average (MA) order, is defined as (Poulimenos and Fassois, 6):

$$y[t] = -\sum_{i=1}^{n_a} a_i[t]\, y[t-i] + \sum_{i=1}^{n_c} c_i[t]\, w[t-i] + w[t], \qquad w[t] \sim \mathrm{NID}\left(0, \sigma_w^2[t]\right) \tag{4..}$$

where $w[t]$ is an unobservable, uncorrelated (white), non-stationary innovations sequence characterized by zero mean and time-dependent (non-stationary) variance $\sigma_w^2[t]$, and $a_i[t]$, $c_i[t]$ are the model's time-dependent AR and MA parameters, respectively. Unlike conventional stationary ARMA models, the parameters and innovations variance of a TARMA model depend upon time.
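To make the TARMA definition concrete, the following minimal Python sketch generates a realization from given parameter trajectories. It assumes the sign convention $y[t] = -\sum_i a_i[t]\, y[t-i] + \sum_i c_i[t]\, w[t-i] + w[t]$ with zero initial conditions. The function name `simulate_tarma` and all numerical values (a TARMA(2,1) model with slowly moving frozen poles) are hypothetical and serve only as an illustration.

```python
import numpy as np

def simulate_tarma(a, c, w):
    """Simulate y[t] = -sum_i a_i[t] y[t-i] + sum_i c_i[t] w[t-i] + w[t].

    a: (N, na) AR parameter trajectories, c: (N, nc) MA parameter trajectories,
    w: (N,) innovations (already scaled by their time-dependent std. deviation).
    Samples before the start of the record are taken as zero.
    """
    a, c, w = np.asarray(a, float), np.asarray(c, float), np.asarray(w, float)
    N, na, nc = w.shape[0], a.shape[1], c.shape[1]
    y = np.zeros(N)
    for t in range(N):
        ar_part = sum(a[t, i - 1] * y[t - i] for i in range(1, na + 1) if t - i >= 0)
        ma_part = sum(c[t, i - 1] * w[t - i] for i in range(1, nc + 1) if t - i >= 0)
        y[t] = -ar_part + ma_part + w[t]
    return y

# Hypothetical TARMA(2,1): frozen poles of radius 0.9 whose angle drifts in time
N = 1000
t = np.arange(N)
pole_angle = 0.3 * np.pi + 0.1 * np.pi * np.sin(2 * np.pi * t / N)
a = np.column_stack([-2 * 0.9 * np.cos(pole_angle), np.full(N, 0.81)])
c = np.full((N, 1), 0.5)
sigma_w = 1.0 + 0.5 * np.cos(2 * np.pi * t / N)   # time-dependent std. deviation
rng = np.random.default_rng(1)
y = simulate_tarma(a, c, sigma_w * rng.standard_normal(N))
```

Setting the columns of `a` and `c` to constants recovers an ordinary stationary ARMA simulation, which is a convenient sanity check.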
Non-stationary parametric TARMA models offer a series of properties that make them more attractive than their non-parametric counterparts, which include (Poulimenos and Fassois, 6; Avendaño-Valencia and Fassois, 4; Poulimenos and Fassois, 9b): (i) representation parsimony (the capability of accurately describing the non-stationary process with a small number of parameters); (ii) improved representation accuracy and resolution; (iii) improved tracking of time-varying dynamics; and (iv) flexibility in the analysis of the non-stationary dynamics, as parametric methods are capable of directly capturing the underlying dynamics responsible for the non-stationary behavior. Furthermore, TARMA models can be used for dynamic analysis, prediction, control, and detection/classification of damage in time-dependent systems.

The main topic of this chapter is the derivation of physically meaningful properties from an available TARMA model, in particular those describing the time and frequency domain characteristics of the underlying non-stationary process. Among these, the AutoCovariance Function (ACF) is the most recognizable time domain representation of a non-stationary signal, whereas time-dependent spectra are typically considered for the analysis of the spectral content of non-stationary processes. The latter issue, concerning the definition of a spectral density function in the non-stationary case, has given rise to several different concepts that have been studied for many years. The main aim of this line of research is to provide a generalization of the spectral density to the non-stationary case that preserves certain desirable properties that hold in the stationary case.
As a result, several alternative definitions have been proposed, which can be grouped into the following two types (Antoni, 7; Araki et al., 996; Cohen, 995; Flandrin, 989; Gardner, 986; Grenier, 989; Matz and Hlawatsch, 6; Matz et al., 997; Poulimenos and Fassois, 6): (i) time-dependent spectra, which provide a quantity that describes the time-varying (evolutionary) frequency content of the response signal; (ii) frequency-frequency spectra, which provide a generalized description of the process in terms of first and second order frequency components. Time-dependent spectra have been more widely used due to their simpler interpretation as the frequency content of the signal changing with time, although frequency-frequency spectra are very useful for understanding the correlations between spectral components and facilitate the detection of modulations and reverberations in the signal. Examples of time-dependent spectra are Priestley's evolutionary spectrum (Priestley, 965, 97), the generalized Wigner-Ville class of spectra (Flandrin, 989; Martin, 98), and wavelet spectral methods (Newland 993, Ch. 7; Priestley 996).

More recently, Matz and Hlawatsch have gathered all the above definitions of time-dependent spectra into two main classes (Hlawatsch and Matz, 8; Matz and Hlawatsch, 6): the Type I spectra, which gather the generalized Wigner-Ville spectra and similar definitions based on the ACF and similar correlation operators; and the Type II spectra, which gather the generalized evolutionary spectrum and similar definitions based on the impulse response and related system operators. The same authors also demonstrate that both classes of spectra turn out to be equivalent for the class of under-spread non-stationary processes, namely those non-stationary processes inducing small time-frequency shifts (Hlawatsch and Matz, 8).

For the case of Linear Time-Varying (LTV) systems, the Type II spectra are perhaps the simplest to associate, since the methods grouped in this family are based on system operators. Among these, the rational relief of Grenier for the specific case of TARMA models (Grenier, 989), also known as the frozen approach and introduced much earlier in the seminal work of Zadeh for time-dependent continuous time systems (Zadeh, 95), is the method most used in practice for the analysis of LTV systems. Since then, the frozen approximation has been almost exclusively used for the analysis of the time-dependent spectral characteristics of non-stationary systems. However, as pointed out in (Zadeh, 95), the frozen approach is only a first approximation to the actual transfer function of a time-dependent system, valid whenever its time-dependent parameters do not vary appreciably over time. This condition is violated in several practical applications, and for this reason the parametric frozen spectra obtained from rapidly evolving TARMA models can lead to poor results (Zadeh, 95; Grenier, 989). Alternatively, Melard (Melard, 975) and Tjøstheim (Tjøstheim, 976) provide a definition of a parametric time-dependent spectrum based on the impulse response, which turns out to be a special case of the evolutionary spectrum of Priestley. This type of time-dependent spectrum, referred to as the Melard-Tjøstheim Power Spectral Density (MT-PSD), is endowed with nicer theoretical properties than the frozen approach (Grenier, 989), although it is rarely used in practice, in part because its computation is based on a truncated approximation of the respective impulse response.
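The frozen approach discussed above can be sketched in a few lines of Python. The sketch uses the usual frozen-time PSD, in which the TARMA parameters at each instant are treated as those of a stationary ARMA model, $S_F(\omega, t) = \sigma_w^2[t]\, |1 + \sum_i c_i[t] e^{-j\omega i}|^2 / |1 + \sum_i a_i[t] e^{-j\omega i}|^2$. It assumes the sign convention $y[t] + \sum_i a_i[t]\, y[t-i] = w[t] + \sum_i c_i[t]\, w[t-i]$ and a normalized frequency $\omega$ in rad/sample. The function name `frozen_psd` and the numerical values are hypothetical.

```python
import numpy as np

def frozen_psd(a, c, sigma2_w, omega):
    """Frozen-time PSD of a TARMA model on a grid of normalized frequencies.

    a: (N, na) AR trajectories, c: (N, nc) MA trajectories,
    sigma2_w: (N,) innovations variance, omega: (F,) frequencies in rad/sample.
    Returns an (N, F) array with one "frozen" ARMA spectrum per time instant.
    """
    a, c = np.asarray(a, float), np.asarray(c, float)
    sigma2_w, omega = np.asarray(sigma2_w, float), np.asarray(omega, float)
    # Complex exponentials e^{-j omega i} for each lag i = 1..order
    E_a = np.exp(-1j * np.outer(omega, np.arange(1, a.shape[1] + 1)))
    E_c = np.exp(-1j * np.outer(omega, np.arange(1, c.shape[1] + 1)))
    A = 1.0 + a @ E_a.T   # frozen AR polynomial evaluated on the unit circle
    C = 1.0 + c @ E_c.T   # frozen MA polynomial evaluated on the unit circle
    return sigma2_w[:, None] * np.abs(C) ** 2 / np.abs(A) ** 2

# Hypothetical TAR(2) model with poles of radius 0.9 and drifting angle
N = 200
t = np.arange(N)
pole_angle = 0.3 * np.pi + 0.1 * np.pi * np.sin(2 * np.pi * t / N)
a = np.column_stack([-2 * 0.9 * np.cos(pole_angle), np.full(N, 0.81)])
c = np.zeros((N, 1))
omega = np.linspace(0.0, np.pi, 256)
S = frozen_psd(a, c, np.ones(N), omega)
peak_frequency = omega[np.argmax(S, axis=1)]   # ridge that tracks the pole angle
```

Because each row of `S` ignores how fast the parameters are changing, the sketch inherits the limitation noted in the text: it is a first approximation that is reasonable only for slowly evolving parameters.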
More recently, several concepts derived from the theory of cyclo-stationary processes have been applied to the analysis of the time and frequency response of linear periodically time-varying systems (characterized by a cyclo-stationary response) (Allen et al., b; Antoni, 7, 9; Giannakis, 999; Lataire et al., b; Sandberg et al., 5; Skjoldan and Hansen, 9; Tohumoglu, 5; Zadeh, 95). These include the harmonic decomposition of the impulse response and of the instantaneous transfer function (or frequency response function), leading to the concepts of Harmonic Impulse Response (or spreading function) and Harmonic Frequency Response Function (FRF). In particular, the Harmonic FRF is very useful for determining the response of a time-periodic system to a complex exponential excitation, and has been utilized to determine the response of continuous time-dependent systems in (Ertveldt et al., 4; Louarroudi et al., 3; Sandberg et al., 5; Sandberg, 6; Zadeh, 95), and of discrete-time linear difference equation models in (Tohumoglu, 5). Moreover, the Harmonic FRF can be used to calculate other spectral quantities, such as the spectral correlation, as explained for example in (Giannakis, 999). Nonetheless, these concepts have not yet been well exploited for the case of LTV systems, and in particular for TARMA representations.

In this chapter the concepts of Harmonic Impulse Response and Harmonic FRF are analyzed in the particular case of discrete Linear Time-Varying (LTV) difference equation models, of which TARMA models are a special case. In this sense, the main contribution of this work is the derivation of novel analytical expressions that associate the discrete Fourier transform of the parameters of the LTV difference equation model with its respective Harmonic Impulse Response and Harmonic FRF, as well as the adaptation of these expressions to the particular case of TARMA models.
Besides, these formulations are also used to evaluate the spectral correlation and the time-dependent spectra of the Melard-Tjøstheim and Wigner-Ville types for the difference equation LTV model and for the particular case of TARMA models. The introduced framework for the analysis of LTV difference equation systems offers two main advantages: (i) the Melard-Tjøstheim and Wigner-Ville spectra, which are better representations of non-stationary signals than the conventional frozen spectral method, can be calculated directly from the parameters of the difference equation LTV model, with no need for approximations of infinite duration functions like the impulse response and the ACF; (ii) other quantities of interest, like the Harmonic FRF and the spectral correlation, can be computed directly from the parameters of the TARMA model.

Footnote: Underspread non-stationary processes are those for which the correlation between two components of the signal with frequencies $e^{j(\omega+\alpha) t T_s}$ and $e^{j(\omega-\alpha) t T_s}$ is negligible for large values of $\alpha$ (Hlawatsch and Matz, 8).

The proposed TARMA model based analysis method is then exemplified and compared in two application cases: the first consisting of a simple TARMA(,) model characterized by a relatively fast evolution of the parameters in comparison to the sampling rate, which serves as a simple example to demonstrate the main results and compare with its non-parametric counterpart; and the second consisting of the frequency domain analysis of

the vibration response signal obtained during normal operation of an actual wind turbine.

This chapter is organized as follows: Section 4. provides the main contributions of this work, namely expressions for time and frequency domain representations associated with difference equation LTV systems, with particular focus on TARMA models. Section 4.3 provides two application examples to demonstrate the capabilities and strengths of the proposed methods in comparison with the non-parametric and conventional parametric frozen type of estimators. Conclusions and remarks for future research are provided at the end of the chapter.

Time and frequency domain analysis of TARMA models

The analysis presented in this section considers discrete-time random signals $y[t] \in \mathbb{R}$, with $t \in \mathbb{Z}$ denoting the normalized discrete time, sampled with period $T_s$ seconds.

4.. Time-domain analysis of discrete linear time varying models

Consider the discrete Linear Time Varying (LTV) system defined by the time-dependent difference equation:

$$y[t] = -\sum_{i=1}^{n_a} a_i[t]\, y[t-i] + \sum_{i=0}^{n_c} c_i[t]\, x[t-i] \tag{4..3}$$

with input $x[t]$ and output $y[t]$, and with parameters $a_i[t]$, $i = 1,\ldots,n_a$ and $c_i[t]$, $i = 0,1,\ldots,n_c$. The definition of the analysis interval is important for the subsequent analysis, and in particular for the specific type of transforms used to represent the associated impulse response. On one hand, the model parameters may be defined for all values of $t \in \mathbb{Z}$. If that is the case, it is assumed that the model parameters accept the DTFT representation summarized by the following transform pairs:

$$a_i[t] = \mathcal{F}^{-1}\{A_i(\omega)\} = \frac{T_s}{2\pi}\int_{\Omega} A_i(\omega)\, e^{j\omega t T_s}\, d\omega, \qquad A_i(\omega) = \mathcal{F}\{a_i[t]\} = \sum_{t=-\infty}^{\infty} a_i[t]\, e^{-j\omega t T_s} \tag{4..4a}$$

$$c_i[t] = \mathcal{F}^{-1}\{C_i(\omega)\} = \frac{T_s}{2\pi}\int_{\Omega} C_i(\omega)\, e^{j\omega t T_s}\, d\omega, \qquad C_i(\omega) = \mathcal{F}\{c_i[t]\} = \sum_{t=-\infty}^{\infty} c_i[t]\, e^{-j\omega t T_s} \tag{4..4b}$$

where $\Omega = [-\pi/T_s, \pi/T_s]$.
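The difference equation (4..3) can be evaluated sample by sample once the parameter trajectories are available on a finite interval. The following is a minimal sketch in Python with NumPy; the function name `simulate_ltv`, the array layout of the parameters, the zero initial conditions and the first-order example are assumptions made for illustration and are not taken from the text.

```python
import numpy as np

def simulate_ltv(a, c, x):
    """Evaluate y[t] = -sum_i a_i[t] y[t-i] + sum_i c_i[t] x[t-i] (zero initial conditions).

    a : array (n_a, N), rows a_1[t] ... a_na[t]
    c : array (n_c + 1, N), rows c_0[t] ... c_nc[t]
    x : array (N,), excitation
    """
    a = np.atleast_2d(np.asarray(a, dtype=float))
    c = np.atleast_2d(np.asarray(c, dtype=float))
    x = np.asarray(x, dtype=float)
    y = np.zeros(x.size)
    for t in range(x.size):
        acc = 0.0
        for i in range(c.shape[0]):            # input part, lags 0..n_c
            if t - i >= 0:
                acc += c[i, t] * x[t - i]
        for i in range(1, a.shape[0] + 1):     # autoregressive part, lags 1..n_a
            if t - i >= 0:
                acc -= a[i - 1, t] * y[t - i]
        y[t] = acc
    return y

# Hypothetical first-order example: a_1[t] = -0.1 t, c_0[t] = 1, unit impulse at t = 0
N = 5
a1 = -0.1 * np.arange(N)
c0 = np.ones(N)
impulse = np.zeros(N)
impulse[0] = 1.0
y = simulate_ltv(a1, c0, impulse)
```

With a unit impulse at $t = 0$, the returned sequence is the column $h[t, 0]$ of the impulse response discussed later in this section.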
Although in practice it would be rare to know the values of the model parameters on the infinite duration interval $(-\infty, \infty)$, this scenario is useful to derive exact representations. On the other hand, the parameters may be defined on $t \in \mathbb{Z}$ and additionally be $N$-periodic, which is the case of a Linear Time Periodic (LTP) system with cyclo-stationary response.

Footnote: The Discrete Time Fourier Transform (DTFT) pairs for an infinite duration discrete-time signal are represented as (Manolakis et al., 5, p.4), (Oppenheim et al., 999, p.48):

$$y[t] = \frac{T_s}{2\pi}\int_{\Omega} Y(\omega)\, e^{j\omega t T_s}\, d\omega, \qquad Y(\omega) = \sum_{t=-\infty}^{\infty} y[t]\, e^{-j\omega t T_s} \tag{4..}$$

where $\omega = 2\pi f$ is the frequency in radians per second with domain $\Omega = [-\pi/T_s, \pi/T_s]$. Likewise, the Discrete Fourier Series (DFS) pairs for a signal with period $N$ (also equivalent to the Discrete Fourier Transform (DFT) for a signal defined on the period $t = 0,1,\ldots,N-1$) are represented as (Manolakis et al., 5, p.38), (Oppenheim et al., 999, p.54,p.559):

$$y[t] = \frac{1}{N}\sum_{k=0}^{N-1} \breve{y}[k]\, e^{j\omega_o k t T_s}, \qquad \breve{y}[k] = \sum_{t=0}^{N-1} y[t]\, e^{-j\omega_o k t T_s} \tag{4..}$$

where $\omega_o = 2\pi/(N T_s)$ is the Fourier frequency, and $k = 0,1,\ldots,N-1$ is the frequency index.

For the present case, the time-dependent parameters $a_i[t]$ and $c_i[t]$ accept the DTFS, or equivalently the DFT defined in Equation (4..), from which the following transform pairs are defined:

$$a_i[t] = \mathcal{F}^{-1}\{\breve{a}_i[k]\} = \frac{1}{N}\sum_{k=0}^{N-1} \breve{a}_i[k]\, e^{jk\omega_o t T_s}, \qquad \breve{a}_i[k] = \mathcal{F}\{a_i[t]\} = \sum_{t=0}^{N-1} a_i[t]\, e^{-jk\omega_o t T_s} \tag{4..5a}$$

$$c_i[t] = \mathcal{F}^{-1}\{\breve{c}_i[k]\} = \frac{1}{N}\sum_{k=0}^{N-1} \breve{c}_i[k]\, e^{jk\omega_o t T_s}, \qquad \breve{c}_i[k] = \mathcal{F}\{c_i[t]\} = \sum_{t=0}^{N-1} c_i[t]\, e^{-jk\omega_o t T_s} \tag{4..5b}$$

where $\omega_o = 2\pi/(N T_s)$ is the Fourier frequency, $k \in [0,1,\ldots,N-1]$ is the frequency index, and $\breve{a}_i[k]$ and $\breve{c}_i[k]$ are the respective DTFS coefficients of the LTV model parameters. In the subsequent analysis this case is also extended to the analysis on the finite interval $t \in [0, N-1]$. In order to facilitate that analysis, the system may be thought of as $N$-periodic by shifting the values of the parameters, while the analysis is aimed only at the specific interval $t \in [0, N-1]$ (Oppenheim et al., 999, p.559). The motivation for this assumption is clarified during the analysis of the impulse response in the next section.

The remainder of this section is devoted to the derivation of expressions for different time and frequency domain representations for the deterministic system described by Equation (4..3), as well as for the corresponding non-stationary random signal defined by the TARMA model described by Equation (4..). In the latter case, the assumptions made on the DFT of the parameters of the LTV system also hold for the parameters of the TARMA model.

Impulse response

The impulse response $h[t,t_0]$ is formally defined as the response of the system to an impulse applied at time $t_0$. Unlike the time invariant case, the impulse response of a time-varying system is a function of two time variables: $t$, representing the normal time, and $t_0$, representing the impulse application time. Figure 4.. shows an exemplary impulse response of a time-dependent system. If the system is causal, then the impulse response is zero before the impulse is applied, and thus $h[t,t_0] = 0$ for $t < t_0$. The impulse response can also be analyzed on different equivalent domains obtained after simple variable changes.
Therefore, it is also usual to find the impulse response in terms of the normal time $t$ and the elapsed time after the impulse $\tau$, where the change of variables $t = t$ and $t_0 = t - \tau$ has been used. In this case, the impulse response of a causal system is zero for $\tau < 0$. Moreover, the line $t_0 = t$ corresponds to the $t$ axis, while the line $t$ corresponds to the $\tau$ axis (see Figure 4..).

Figure 4..: Impulse response of a causal time-varying system and different domains of analysis.

The impulse response of the LTV system described by Equation (4..3) is obtained by simply replacing $x[t] = \delta[t - t_0]$, where $\delta[t]$ is the Kronecker delta function. This yields (considering infinite duration of the parameters of the LTV system) (Poulimenos and Fassois, 6; Grenier, 989):

$$h[t,t_0] = \begin{cases} 0 & \text{for } t < t_0 \\ -\sum_{i=1}^{n_a} a_i[t]\, h[t-i,t_0] + \sum_{i=0}^{n_c} c_i[t]\, \delta[t - t_0 - i] & \text{for } t \geq t_0 \end{cases} \tag{4..6}$$

with $t, t_0 \in \mathbb{Z}$. Likewise, after applying the change of variables $t_0 = t - \tau$ and $t = t$, the impulse response becomes:

$$h[t,t-\tau] = \begin{cases} 0 & \text{for } \tau < 0 \\ -\sum_{i=1}^{n_a} a_i[t]\, h[t-i,t-\tau] + \sum_{i=0}^{n_c} c_i[t]\, \delta[\tau - i] & \text{for } \tau \geq 0 \end{cases} \tag{4..7}$$

with $t, \tau \in \mathbb{Z}$. An important difficulty encountered when applying the recursive equations shown above to compute the impulse response of the LTV model is that the parameter values are typically available only for a specific period of time, say $t = 0,1,\ldots,N-1$, while the impulse response extends to infinity. Assuming that the parameters are zero outside of the available time period has the effect of truncating the impulse response on the area defined by the intervals $t \in [0,N-1]$ and $t_0 \in [0,N-1]$. While for values of $t$ close to zero this truncation may not have a major effect, when $t$ is close to $N$ the impulse response is clipped while its values are still significant.

Figure 4..: Effect of the finite length of the LTV model parameters on the analysis of its impulse response.

To avoid this problem and facilitate the subsequent analysis, it is assumed that the parameters are $N$-periodic, by shifting the parameter values on the available interval $[0,N-1]$ to the intervals $\ell N + [0,N-1]$, for $\ell \in \mathbb{Z}$. A mild limitation of this assumption is that the dynamic characteristics of the parameter evolutions are expected to be preserved outside the observed interval. Once the model parameters are considered $N$-periodic, the impulse response is also $N$-periodic. This can be easily observed in Equation (4..7), by changing $t$ to $t+N$ and analyzing only the values on $\tau \geq 0$:

$$h[t+N,t+N-\tau] = -\sum_{i=1}^{n_a} a_i[t+N]\, h[t+N-i,t+N-\tau] + \sum_{i=0}^{n_c} c_i[t+N]\, \delta[\tau-i]$$

$$h[t+N,t+N-\tau] = -\sum_{i=1}^{n_a} a_i[t]\, h[t+N-i,t+N-\tau] + \sum_{i=0}^{n_c} c_i[t]\, \delta[\tau-i]$$

where, stemming from the periodicity of the system, $a_i[t] = a_i[t+N]$ and $c_i[t] = c_i[t+N]$.
In such a case, it necessarily holds that $h[t+N,t+N-\tau] = h[t,t-\tau]$, which demonstrates the periodicity of the impulse response on the $t$ axis.

The spreading function and the harmonic impulse response

The spreading function is defined as the DTFT of the impulse response $h[t,t-\tau]$ (Matz and Hlawatsch, 6; Matz et al., 997; Hlawatsch and Matz, 8). Similarly, the impulse response can be reconstructed from the

spreading function by means of the inverse DTFT. These relationships are summarized by the following pair of transforms:

$$\breve{h}(\alpha,\tau] = \mathcal{F}_{t\to\alpha}\{h[t,t-\tau]\} = \sum_{t=-\infty}^{\infty} h[t,t-\tau]\, e^{-j\alpha t T_s} \tag{4..8a}$$

$$h[t,t-\tau] = \mathcal{F}^{-1}_{\alpha\to t}\{\breve{h}(\alpha,\tau]\} = \frac{T_s}{2\pi}\int_{\Omega} \breve{h}(\alpha,\tau]\, e^{j\alpha t T_s}\, d\alpha \tag{4..8b}$$

The spreading function is defined for all the values of $\tau = 0,1,\ldots,\infty$, while $\alpha$ is limited to the continuous range $\alpha \in \Omega = [-\pi/T_s, \pi/T_s]$. When the impulse response is periodic, the concept of Harmonic Impulse Response (HIR) is used instead, under which a Discrete Fourier Series expansion is utilized to represent the impulse response. In this case, the HIR and the impulse response are related as follows (Sandberg et al., 5):

$$\breve{h}^{(N)}[k,\tau] = \mathcal{FS}_{t\to k}\{h[t,t-\tau]\} = \sum_{t=0}^{N-1} h[t,t-\tau]\, e^{-jk\omega_o t T_s} \tag{4..9a}$$

$$h[t,t-\tau] = \mathcal{FS}^{-1}_{k\to t}\{\breve{h}^{(N)}[k,\tau]\} = \frac{1}{N}\sum_{k=0}^{N-1} \breve{h}^{(N)}[k,\tau]\, e^{jk\omega_o t T_s} \tag{4..9b}$$

where the symbol $\breve{h}^{(N)}[k,\tau]$ specifies the HIR of an $N$-periodic system. In this case the HIR is still defined on the infinite size interval $\tau = 0,1,\ldots,\infty$, but takes values only on the discrete frequencies $k\omega_o$. Therefore, the HIR corresponds to a sampled version of the spreading function on the frequency grid $\alpha = k\omega_o$, for $k = 0,1,\ldots,N-1$. The following paragraphs introduce a part of the novel contributions of this work, which correspond to equations that can be used to calculate the spreading function and the HIR of the LTV system of Equation (4..3).

The spreading function of the LTV system defined on $t \in \mathbb{Z}$

Here, the case when the parameters of the LTV system are defined for all $t \in \mathbb{Z}$ is considered. In this case, the spreading function of the LTV system can be derived by replacing the impulse response (Equation (4..7)) in the definition of Equation (4..8a), which leads to the following expression (a demonstration is provided in Appendix 4.A.):

$$\breve{h}(\alpha,\tau] = -\frac{T_s}{2\pi}\sum_{i=1}^{n_a} A_i(\alpha) \circledast_{\Omega} \left( \breve{h}(\alpha,\tau-i]\, e^{-j\alpha i T_s} \right) + \sum_{i=0}^{n_c} C_i(\alpha)\, \delta[\tau-i] \tag{4..}$$

where $\circledast_{\Omega}$ represents the periodic convolution operator on the interval $\Omega$. Although Equation (4..)
provides a form to calculate the spreading function of the LTV system, it is of limited practical applicability, especially due to the convolutions appearing in the AR part of the equation. As shall be shown next, when the impulse response is periodic, it is possible to conveniently arrange the convolution into a vector operation that facilitates the calculation.

Footnote: The relation between the DTFT of an infinite duration signal approximated by a finite duration signal and its respective DFT is clarified in (Oppenheim et al., 999, p. 559).

Footnote: For two periodic functions $f(x)$ and $g(x)$ the periodic convolution operator corresponds to the integral (Oppenheim et al., 999, p. 6):

$$f(\omega) \circledast_{\Omega} g(\omega) \triangleq \int_{\Omega} f(\alpha)\, g(\omega-\alpha)\, d\alpha = \int_{\Omega} f(\omega-\alpha)\, g(\alpha)\, d\alpha$$

The HIR of the LTV system defined on $t = 0,1,\ldots,N-1$

Following a derivation similar to the case of the spreading function, it can be shown (see Appendix 4.A.) that the HIR of the LTV system defined in Equation (4..3) can be computed as follows:

$$\breve{h}^{(N)}[k,\tau] = -\frac{1}{N}\sum_{i=1}^{n_a} \breve{a}_i[k] \circledast_N \left( \breve{h}^{(N)}[k,\tau-i]\, e^{-jk\omega_o T_s i} \right) + \sum_{i=0}^{n_c} \breve{c}_i[k]\, \delta[\tau-i] \tag{4..}$$

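where $\circledast_N$ denotes the $N$-point circular convolution operator (Oppenheim et al., 999, pp ). The evaluation of the expression of the discretized HIR in Equation (4..) appears difficult to implement, although with some algebraic manipulations it can be calculated as the impulse response of a simpler multivariate AR process. To do so, the coefficients of the DFT of the parameters and the discretized HIR are stacked in the $N$-dimensional column vectors:

$$\breve{\mathbf{a}}_i = \begin{bmatrix} \breve{a}_i[0] \\ \breve{a}_i[1] \\ \vdots \\ \breve{a}_i[N-1] \end{bmatrix}, \qquad \breve{\mathbf{c}}_i = \begin{bmatrix} \breve{c}_i[0] \\ \breve{c}_i[1] \\ \vdots \\ \breve{c}_i[N-1] \end{bmatrix}, \qquad \breve{\mathbf{h}}^{(N)}[\tau] = \begin{bmatrix} \breve{h}^{(N)}[0,\tau] \\ \breve{h}^{(N)}[1,\tau] \\ \vdots \\ \breve{h}^{(N)}[N-1,\tau] \end{bmatrix}$$

Then, based on the $N$-periodicity of the DFT, the circular convolution in Equation (4..) can be replaced by a matrix-vector product that leads to the following recursive form of the discretized HIR (see proof in Appendix 4.A.):

$$\breve{\mathbf{h}}^{(N)}[\tau] = -\sum_{i=1}^{n_a} \boldsymbol{\Omega}^i\, \breve{\mathbf{A}}_i\, \breve{\mathbf{h}}^{(N)}[\tau-i] + \sum_{i=0}^{n_c} \breve{\mathbf{c}}_i\, \delta[\tau-i] \tag{4..}$$

where $\breve{\mathbf{A}}_i = \mathrm{circ}(\breve{\mathbf{a}}_i)$ is a circulant matrix that holds the DFT coefficients of $a_i[t]$, defined as:

$$\breve{\mathbf{A}}_i = \begin{bmatrix} \breve{a}_i[0] & \breve{a}_i[N-1] & \cdots & \breve{a}_i[1] \\ \breve{a}_i[1] & \breve{a}_i[0] & \cdots & \breve{a}_i[2] \\ \vdots & \vdots & \ddots & \vdots \\ \breve{a}_i[N-1] & \breve{a}_i[N-2] & \cdots & \breve{a}_i[0] \end{bmatrix}$$

and $\boldsymbol{\Omega} = \exp(-j\mathbf{K}\omega_o T_s)/N$ is the exponential matrix, with $\mathbf{K} = \mathrm{diag}(0,1,\ldots,N-1)$ a diagonal matrix with the frequency indices of the DFT. According to Equation (4..), the HIR of the LTV system takes the form of an impulse response function for a linear time invariant (LTI) $N$-dimensional vector ARMA system, in which all the impulse responses $\breve{h}^{(N)}[k,\tau]$ are interdependent according to the matrices $\boldsymbol{\Omega}^i \breve{\mathbf{A}}_i$.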
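The circular-convolution form of the HIR recursion can be checked numerically against the DFT (over $t$) of the time-domain impulse response of Equation (4..7). The sketch below is one possible arrangement in Python with NumPy, not the author's implementation. The function names, the example parameters and the use of `numpy.fft.fft` (an unnormalised forward DFT) are assumptions. The ordering of the circulant and exponential factors and the explicit division by $N$ are the ones that follow from the circular-convolution expression under that DFT convention.

```python
import numpy as np

def impulse_response(a, c, n_lags):
    """Time-domain reference: g[tau, t] = h[t, t - tau] for N-periodic parameters."""
    a = np.atleast_2d(a)
    c = np.atleast_2d(c)
    N = a.shape[1]
    g = np.zeros((n_lags, N))
    for tau in range(n_lags):
        if tau < c.shape[0]:
            g[tau] += c[tau]                                   # c_tau[t] delta[tau - tau]
        for i in range(1, a.shape[0] + 1):
            if tau - i >= 0:
                g[tau] -= a[i - 1] * np.roll(g[tau - i], i)    # h[t-i, t-tau], t modulo N
    return g

def circ(v):
    """Circulant matrix with first column v: C[k, m] = v[(k - m) mod N]."""
    n = np.arange(len(v))
    return np.asarray(v)[(n[:, None] - n[None, :]) % len(v)]

def hir_recursion(a, c, n_lags):
    """Harmonic impulse response hb[tau, k], computed entirely in the DFT domain."""
    a = np.atleast_2d(a)
    c = np.atleast_2d(c)
    N = a.shape[1]
    A = [circ(np.fft.fft(ai)) for ai in a]       # circulant matrices of DFT coefficients
    C = np.fft.fft(c, axis=1)
    k = np.arange(N)
    hb = np.zeros((n_lags, N), dtype=complex)
    for tau in range(n_lags):
        if tau < C.shape[0]:
            hb[tau] += C[tau]
        for i in range(1, len(A) + 1):
            if tau - i >= 0:
                shift = np.exp(-2j * np.pi * k * i / N)        # e^{-j k w_o T_s i}
                hb[tau] -= A[i - 1] @ (shift * hb[tau - i]) / N
    return hb

# Hypothetical periodic second-order example
N, n_lags = 16, 10
t = np.arange(N)
a = np.vstack([-0.5 - 0.3 * np.cos(2 * np.pi * t / N), 0.2 + 0.1 * np.sin(2 * np.pi * t / N)])
c = np.vstack([np.ones(N), 0.4 * np.cos(2 * np.pi * t / N)])
hb = hir_recursion(a, c, n_lags)
err = np.max(np.abs(hb - np.fft.fft(impulse_response(a, c, n_lags), axis=1)))
```

In this sketch `err` measures the largest discrepancy between the two routes to the HIR and is expected to be at round-off level.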
Furthermore, in contrast with the HIR of a continuous linear time periodic system shown in (Matz et al., 997; Sandberg et al., 5), which is an infinite dimensional function with domain $k \in \mathbb{Z}$, in the discrete-time linear time periodic case presented here the obtained HIR is a finite dimensional representation with $k \in [0,1,\ldots,N-1]$.

Footnote: Let $f[k]$ and $g[k]$ be two $N$-periodic discrete functions. The $N$-point circular convolution corresponds to:

$$f[k] \circledast_N g[k] \triangleq \sum_{m=0}^{N-1} f[\mathrm{mod}(m,N)]\, g[\mathrm{mod}(k-m,N)] = \sum_{m=0}^{N-1} f[\mathrm{mod}(k-m,N)]\, g[\mathrm{mod}(m,N)]$$

where $\mathrm{mod}(n,N)$ indicates the modulo operation, or the remainder after the division $n/N$.

Convolution representation and the HIR

The response $y[t]$ of the LTV system to the excitation $x[t]$ can be computed by means of the convolution operator and the impulse response function, using the Wold-Cramér representation or convolution representation (Poulimenos and Fassois, 6). Using the convolution representation, the response of the non-stationary system is computed as follows:

$$y[t] = \sum_{\tau=0}^{\infty} h[t,t-\tau]\, x[t-\tau] \tag{4..3}$$

It is also possible to substitute the impulse response by its representation based on the HIR in Equation (4..9b), to obtain the following reformulation of the convolution representation in terms of the HIR for an LTV system

analyzed in the period $t = 0,1,\ldots,N-1$ (Lataire et al., a; Sandberg et al., 5):

$$y[t] = \sum_{\tau=0}^{\infty} \left( \frac{1}{N}\sum_{k=0}^{N-1} \breve{h}^{(N)}[k,\tau]\, e^{jk\omega_o t T_s} \right) x[t-\tau] = \frac{1}{N}\sum_{k=0}^{N-1} \left( \breve{h}^{(N)}[k,t] * x[t] \right) e^{jk\omega_o t T_s} \tag{4..4}$$

where $*$ represents the (conventional) discrete convolution operator. The result above indicates that the response of an $N$-periodic LTV system to the excitation $x[t]$ (or of an LTV system analyzed on the period of time $t = 0,1,\ldots,N-1$) can be represented as the finite sum of the responses of $N$ stationary sub-systems $\breve{h}^{(N)}[k,\tau]$ to the same excitation, each one of them multiplied by the corresponding harmonic $e^{jk\omega_o t T_s}$. This relationship is depicted in Figure 4..3.

Figure 4..3: Time-domain representation of an LTV system using the convolution representation: (a) $y[t]$ is represented in terms of the impulse response as the convolution of $x[t]$ with the (time-varying) impulse response function $h[t,t-\tau]$; (b) $y[t]$ on the period $N$ is represented in terms of the discretized HIR as the sum of the outputs of $N$ LTI sub-systems $\breve{h}^{(N)}[k,\tau]$, each multiplied by a modulating function $e^{jk\omega_o t T_s}$. The sub-system $\breve{h}[0,\tau]$ corresponds to the harmonic $k=0$ and is the best LTI approximation of the LTV system.

A remarkable property of Equation (4..4) is that, by using the HIR, it is possible to decompose the response of the time variant system into its stationary and non-stationary components, as:

$$y[t] = y_s[t] + y_{ns}[t] = \frac{1}{N}\left( \breve{h}^{(N)}[0,t] * x[t] + \sum_{k=1}^{N-1} e^{jk\omega_o t T_s} \left( \breve{h}^{(N)}[k,t] * x[t] \right) \right) \tag{4..5}$$

where $y_s[t]$ corresponds to the stationary part of the process, and $y_{ns}[t]$ corresponds to the non-stationary part.
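The decomposition into a stationary and a non-stationary part can be illustrated numerically. The sketch below (Python with NumPy) uses a toy $N$-periodic kernel that is not derived from any model in the text, an excitation taken as zero for $t < 0$, and NumPy's unnormalised DFT to obtain the HIR; all names and values are assumptions made for illustration.

```python
import numpy as np

N, n_lags = 8, 12
t = np.arange(N)
# Toy N-periodic kernel g[tau, t] = h[t, t - tau] (hypothetical, truncated at n_lags)
g = 0.5 ** np.arange(n_lags)[:, None] * (1.0 + 0.3 * np.cos(2 * np.pi * t / N))[None, :]

rng = np.random.default_rng(0)
x = rng.standard_normal(3 * N)          # excitation, taken as zero for t < 0

def ltv_output(g, x):
    """Direct evaluation of y[t] = sum_tau h[t, t - tau] x[t - tau]."""
    n_lags, N = g.shape
    y = np.zeros(len(x))
    for tt in range(len(x)):
        for tau in range(min(n_lags, tt + 1)):
            y[tt] += g[tau, tt % N] * x[tt - tau]
    return y

def harmonic_components(g, x):
    """comp[k, t] = (1/N) e^{j k w_o t T_s} (hb[k, .] * x)[t], one row per LTI sub-system."""
    n_lags, N = g.shape
    hb = np.fft.fft(g, axis=1)                          # HIR, hb[tau, k]
    tt = np.arange(len(x))
    comp = np.empty((N, len(x)), dtype=complex)
    for k in range(N):
        conv = np.convolve(hb[:, k], x)[:len(x)]        # output of LTI sub-system k
        comp[k] = np.exp(2j * np.pi * k * tt / N) * conv / N
    return comp

comp = harmonic_components(g, x)
y_s = comp[0].real                    # stationary part (harmonic k = 0)
y_ns = comp[1:].sum(axis=0).real      # non-stationary part (harmonics k = 1..N-1)
y = ltv_output(g, x)
```

With this normalisation the $k=0$ sub-system has impulse response equal to the average of $h[t,t-\tau]$ over one period of $t$, which is one way to read the statement that it is the best LTI approximation.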
In the same sense, the sub-system $\breve{h}^{(N)}[0,t]$ is the one that generates the stationary part of the process and corresponds to the best LTI approximation of the non-stationary process, while the higher order sub-systems $\breve{h}^{(N)}[k,t]$, for $k = [1,2,\ldots,N-1]$, are responsible for the non-stationary components in the process (Lataire et al., a). A similar decomposition of the response of the LTV system can be obtained with the spreading function. However, instead of a sum of a finite set of sub-systems, a superposition of an infinite number of sub-systems is achieved, via the integral representation:

$$y[t] = \frac{T_s}{2\pi}\int_{\Omega} \left( \sum_{\tau=0}^{\infty} \breve{h}(\alpha,\tau]\, x[t-\tau] \right) e^{j\alpha t T_s}\, d\alpha \tag{4..6}$$

and, as in the previous case, the sub-system $\breve{h}(0,\tau]$ corresponds to the best LTI approximation of the non-stationary process, while the remaining sub-systems are responsible for the non-stationary characteristics of the response.

Excitation-response covariance and autocovariance functions

The statistical dependency between the (zero mean) excitation signal $x[t]$ and the response signal $y[t]$ of a time-dependent system in the time domain $t \in \mathbb{Z}$ is measured by means of the excitation-response covariance (or cross covariance) function, which is defined as:

$$\gamma_{yx}[t,t-\tau] = E\{y[t]\, x[t-\tau]\}$$

Replacing the convolution representation of the response shown in Equation (4..3) in the previous definition, it can be shown that the excitation-response covariance function satisfies (Giannakis, 999):

$$\gamma_{yx}[t,t-\tau] = E\{y[t]\, x[t-\tau]\} = E\left\{ \sum_{\tau_1=0}^{\infty} h[t,t-\tau_1]\, x[t-\tau_1]\, x[t-\tau] \right\}$$

$$\gamma_{yx}[t,t-\tau] = \sum_{\tau_1=0}^{\infty} h[t,t-\tau_1]\, \gamma_{xx}[t-\tau_1,t-\tau], \qquad \tau > 0 \tag{4..7}$$

where $\gamma_{yx}[t,t-\tau] = 0$ otherwise, and $\gamma_{xx}[t,t-\tau]$ represents the excitation's ACF. Following a similar process, it is possible to derive the following expression for the response ACF (Giannakis, 999):

$$\gamma_{yy}[t,t-\tau] = \sum_{\tau_1=0}^{\infty} \sum_{\tau_2=\tau}^{\infty} h[t,t-\tau_1]\, h[t-\tau,t-\tau_2]\, \gamma_{xx}[t-\tau_1,t-\tau_2], \qquad \tau > 0 \tag{4..8}$$

and $\gamma_{yy}[t,t-\tau] = 0$ otherwise.

TARMA model case

In the case of a TARMA model, the excitation is the NID process $w[t] \sim \mathrm{NID}(0,\sigma_w^2[t])$. Then, the ACF of the excitation is $\gamma_{ww}[t,t-\tau] = \sigma_w^2[t]\, \delta[\tau]$, and the two expressions above for the excitation-response covariance and the response ACF reduce to (Poulimenos and Fassois, 6):

$$\gamma_{yw}[t,t-\tau] = \sigma_w^2[t]\, h[t,t-\tau], \qquad \tau > 0 \tag{4..9a}$$

$$\gamma_{yy}[t,t-\tau] = \sum_{\tau_1=0}^{\infty} \sigma_w^2[t]\, h[t,t-\tau_1]\, h[t-\tau,t-\tau_1], \qquad \tau > 0 \tag{4..9b}$$

and $\gamma_{yw}[t,t-\tau] = 0$, $\gamma_{yy}[t,t-\tau] = 0$ otherwise.

4.. Frequency-domain analysis of discrete linear time varying models

4... Instantaneous frequency response function

As in the stationary case, the frequency response of a time-varying system can be defined as the response of the system to a complex exponential excitation $e^{j\omega t T_s}$, normalized by the excitation itself (Zadeh, 95). Computing the discrete Fourier transform of the convolution representation in Equation (4..3), it can be seen that the response of the non-stationary system to a harmonic input $x[t] = e^{j\omega t T_s}$ is of the form (Sandberg et al., 5; Poulimenos and Fassois, 6; Matz et al., 997; Zadeh, 95):

$$H[t,\omega) \triangleq \frac{\text{response of the system to } e^{j\omega t T_s}}{e^{j\omega t T_s}} = \left. \frac{\sum_{\tau=0}^{\infty} h[t,t-\tau]\, x[t-\tau]}{e^{j\omega t T_s}} \right|_{x[t]=e^{j\omega t T_s}} = \frac{\sum_{\tau=0}^{\infty} h[t,t-\tau]\, e^{j\omega(t-\tau)T_s}}{e^{j\omega t T_s}} \tag{4..}$$

in consequence:

$$H[t,\omega) = \sum_{\tau=0}^{\infty} h[t,t-\tau]\, e^{-j\omega\tau T_s} = \mathcal{F}_{\tau\to\omega}\{h[t,t-\tau]\} \tag{4..}$$
where $\omega = 2\pi f \in \Omega$ denotes the conventional frequency (frequencies forming the signal waveforms, or first order periodicities). The frequency response of Equation (4..) is known as the Instantaneous Transfer Function (ITF) (Louarroudi et al., 3), the System Function (Gardner, 986; Lataire et al., b; Tohumoglu, 5), the Parametric Transfer Function (PTF) (Sandberg et al., 5), or the Time-Varying Transfer Function (TVTF) (Jachan et al., 7; Zou and Chon, 4). From now on, the frequency response $H[t,\omega)$ of Equation (4..) shall be referred to as the Instantaneous Frequency Response Function (FRF). The instantaneous FRF is a generalization of the conventional concept of frequency response to the time variant case, and can be interpreted as the steady state response of the time variant system to the harmonic $e^{j\omega t T_s}$

(Zadeh, 95; Sandberg et al., 5). The inverse relation between the instantaneous FRF and the impulse response is given by:

$$h[t,t-\tau] = \mathcal{F}^{-1}_{\omega\to\tau}\{H[t,\omega)\} = \frac{T_s}{2\pi}\int_{\Omega} H[t,\omega)\, e^{j\omega\tau T_s}\, d\omega \tag{4..}$$

The instantaneous FRF of the LTV system in Equation (4..3) is obtained by computing the DTFT of the impulse response over the dimension $\tau$, as shown in Equation (4..), which leads to the following recursive equation:

$$H[t,\omega) = -\sum_{i=1}^{n_a} a_i[t]\, H[t-i,\omega)\, e^{-j\omega T_s i} + \sum_{i=0}^{n_c} c_i[t]\, e^{-j\omega T_s i} \tag{4..3}$$

Notice that no assumptions have been made on the $t$ domain, and consequently Equation (4..3) is valid both for $t \in \mathbb{Z}$ and for the finite analysis interval $t \in [0,1,\ldots,N-1]$. Moreover, if the instantaneous FRF is slowly varying, so that $H[t,\omega) \approx H[t-i,\omega)$, $i = 1,\ldots,n_a$, then Equation (4..3) simplifies into the expression:

$$H_F[t,\omega) = \frac{\sum_{i=0}^{n_c} c_i[t]\, e^{-j\omega T_s i}}{1 + \sum_{i=1}^{n_a} a_i[t]\, e^{-j\omega T_s i}} \tag{4..4}$$

which corresponds to the frozen Frequency Response Function (FRF), obtained by utilizing a sequence of frozen LTI systems (one for each time instant) to represent the time variant system (Poulimenos and Fassois, 6; Grenier, 989; Zadeh, 95).

The frequency response operator and the harmonic FRF

The Frequency Response Operator $\breve{H}(\alpha,\omega)$ is defined as the DTFT of the spreading function with respect to $\tau$, or equivalently as the DTFT of the Instantaneous FRF with respect to time, namely (Araki et al., 996; Yamamoto and Araki, 994):

$$\breve{H}(\alpha,\omega) = \mathcal{F}_{\tau\to\omega}\{\breve{h}(\alpha,\tau]\} = \sum_{\tau=0}^{\infty} \breve{h}(\alpha,\tau]\, e^{-j\omega\tau T_s} \tag{4..5a}$$

$$\breve{H}(\alpha,\omega) = \mathcal{F}_{t\to\alpha}\{H[t,\omega)\} = \sum_{t=-\infty}^{\infty} H[t,\omega)\, e^{-j\alpha t T_s} \tag{4..5b}$$

Similarly, the Harmonic Frequency Response Function (Harmonic FRF), in the case of an $N$-periodic LTV system, is defined as the DTFT of the HIR with respect to $\tau$, or equivalently as the DTFS of the Instantaneous FRF with respect to time (Sandberg et al., 5; Allen et al., b; Lataire et al., b):

$$\breve{H}^{(N)}[k,\omega) = \mathcal{F}_{\tau\to\omega}\{\breve{h}^{(N)}[k,\tau]\} = \sum_{\tau=0}^{\infty} \breve{h}^{(N)}[k,\tau]\, e^{-j\omega\tau T_s} \tag{4..6a}$$

$$\breve{H}^{(N)}[k,\omega) = \mathcal{FS}_{t\to k\omega_o}\{H[t,\omega)\} = \sum_{t=0}^{N-1} H[t,\omega)\, e^{-jk\omega_o t T_s} \tag{4..6b}$$

where in the second equation it is assumed that the Instantaneous FRF is either $N$-periodic or is analyzed only in the interval $t = 0,1,\ldots,N-1$.

Footnote: According to Equation (4..5), the steady state response of the time variant system is reached when the transient terms of the response of each sub-system $\breve{h}[k,\tau]$ fade. See (Zadeh, 95) for more details.

Footnote: The demonstration of Equation (4..3) follows by simple application of Equation (4..) on the impulse response defined in Equation (4..7), which yields:

$$H[t,\omega) = \mathcal{F}_{\tau\to\omega}\left\{ -\sum_{i=1}^{n_a} a_i[t]\, h[t-i,t-\tau] + \sum_{i=0}^{n_c} c_i[t]\, \delta[\tau-i] \right\} = -\sum_{i=1}^{n_a} a_i[t]\, \mathcal{F}_{\tau\to\omega}\{h[t-i,t-\tau]\} + \sum_{i=0}^{n_c} c_i[t]\, \mathcal{F}_{\tau\to\omega}\{\delta[\tau-i]\}$$

Since $\mathcal{F}_{\tau\to\omega}\{h[t-i,t-\tau]\} = H[t-i,\omega)\, e^{-j\omega T_s i}$ and $\mathcal{F}_{\tau\to\omega}\{\delta[\tau-i]\} = e^{-j\omega T_s i}$, Equation (4..3) follows.
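The difference between the recursive instantaneous FRF and its frozen approximation can be explored numerically for an $N$-periodic system. In the sketch below (Python with NumPy), the recursion in $t$ is closed by periodicity and solved as an $N \times N$ linear system at one frequency. This solution strategy, the function names, the default $T_s = 1$ and the example values are assumptions of the sketch and are not taken from the text.

```python
import numpy as np

def instantaneous_frf(a, c, omega, Ts=1.0):
    """H[t, omega), t = 0..N-1, for N-periodic parameters (recursion closed by periodicity)."""
    a = np.atleast_2d(a)
    c = np.atleast_2d(c)
    N = a.shape[1]
    M = np.eye(N, dtype=complex)
    rhs = np.zeros(N, dtype=complex)
    for i in range(1, a.shape[0] + 1):
        P_i = np.roll(np.eye(N), i, axis=0)            # (P_i v)[t] = v[(t - i) mod N]
        M += np.exp(-1j * omega * Ts * i) * (a[i - 1][:, None] * P_i)
    for i in range(c.shape[0]):
        rhs += c[i] * np.exp(-1j * omega * Ts * i)
    return np.linalg.solve(M, rhs)

def frozen_frf(a, c, omega, Ts=1.0):
    """Frozen FRF: one LTI frequency response per time instant."""
    a = np.atleast_2d(a)
    c = np.atleast_2d(c)
    num = sum(c[i] * np.exp(-1j * omega * Ts * i) for i in range(c.shape[0]))
    den = 1.0 + sum(a[i - 1] * np.exp(-1j * omega * Ts * i) for i in range(1, a.shape[0] + 1))
    return num / den

# Hypothetical first-order example with a fast periodic parameter
N = 12
t = np.arange(N)
a1 = -0.5 - 0.3 * np.cos(2 * np.pi * t / N)
c0 = np.ones(N)
omega = 0.7                                            # rad/s with Ts = 1 (assumed)
H = instantaneous_frf(a1, c0, omega)
H_F = frozen_frf(a1, c0, omega)
gap = np.max(np.abs(H - H_F))                          # deviation of the frozen approximation
```

For constant parameters the two functions return the same values, while for time-varying parameters `gap` quantifies how far the frozen approximation is from the recursive solution at the chosen frequency.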

The Harmonic FRF associates the Fourier transform of a frequency shifted version of the input with the Fourier transform of the response of the system. To clarify this, consider the frequency shifted input $x[t]\, e^{jk\omega_o T_s t}$ with DTFT $X(\omega - k\omega_o)$. Then, according to the convolution representation in Equation (4..4), the response of the system $y[t]$ is:

$$y[t] = \frac{1}{N}\sum_{k=0}^{N-1} \left( \breve{h}^{(N)}[k,t]\, e^{jk\omega_o t T_s} \right) * \left( x[t]\, e^{jk\omega_o t T_s} \right), \qquad t \in \mathbb{Z} \tag{4..7}$$

Applying the DTFT on both sides of the equation and using the convolution and frequency shift properties of the DTFT yields:

$$Y(\omega) = \sum_{k=0}^{N-1} \breve{H}^{(N)}[k,\omega - k\omega_o)\, X(\omega - k\omega_o) \tag{4..8}$$

for all $\omega \in \Omega$. Similar relations can be shown for the Frequency Response Operator, where the sums are substituted by integrals in the interval $\Omega$. When the response and excitation signals, $y[t]$ and $x[t]$ respectively, are analyzed on the finite time interval $t = 0,1,\ldots,N-1$, then their respective DFTs:

$$Y[n] = \sum_{t=0}^{N-1} y[t]\, e^{-jn\omega_o t T_s}, \qquad X[n] = \sum_{t=0}^{N-1} x[t]\, e^{-jn\omega_o t T_s}, \qquad n = 0,1,\ldots,N-1 \tag{4..9}$$

correspond to discretized versions of the respective DTFTs on the frequency variable $\omega = n\omega_o$, so that $Y[n] = Y(n\omega_o)$ and $X[n] = X(n\omega_o)$. Then, by defining the vectors:

$$\mathbf{Y} = \begin{bmatrix} Y[0] & Y[1] & Y[2] & \cdots & Y[N-1] \end{bmatrix}^T, \qquad \mathbf{X} = \begin{bmatrix} X[0] & X[1] & X[2] & \cdots & X[N-1] \end{bmatrix}^T$$

the DFT of the response signal in Equation (4..8) can be related with the DFT of the excitation signal by means of the relation:

$$\mathbf{Y} = \breve{\mathbf{H}}^{(N)}\, \mathbf{X} \tag{4..3}$$

where $\breve{\mathbf{H}}^{(N)}$ is an $N \times N$ matrix with the Harmonic FRF coefficients corresponding to each frequency $n$ and $k$, namely

$$\breve{\mathbf{H}}^{(N)} = \left[ \breve{H}^{(N)}[k-n,N-n] \right]_{[k,n]} = \begin{bmatrix} \breve{H}^{(N)}[0,0] & \breve{H}^{(N)}[N-1,N-1] & \breve{H}^{(N)}[N-2,N-2] & \cdots & \breve{H}^{(N)}[1,1] \\ \breve{H}^{(N)}[1,0] & \breve{H}^{(N)}[0,N-1] & \breve{H}^{(N)}[N-1,N-2] & \cdots & \breve{H}^{(N)}[2,1] \\ \breve{H}^{(N)}[2,0] & \breve{H}^{(N)}[1,N-1] & \breve{H}^{(N)}[0,N-2] & \cdots & \breve{H}^{(N)}[3,1] \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \breve{H}^{(N)}[N-1,0] & \breve{H}^{(N)}[N-2,N-1] & \breve{H}^{(N)}[N-3,N-2] & \cdots & \breve{H}^{(N)}[0,1] \end{bmatrix}_{N \times N} \tag{4..3}$$

where $\breve{H}^{(N)}[k,n] = \breve{H}^{(N)}(k\omega_o,n\omega_o)$.
Notice that the periodicity property of the DFT has been used to build the matrix $\breve{\mathbf{H}}^{(N)}$ only with elements in the range $[0,N-1]$, which yields a finite dimension operator, in contrast with the Harmonic FRF of linear time periodic systems defined in continuous time, where the Harmonic FRF matrix is of infinite dimension (Sandberg et al., 5; Lataire et al., b). Equation (4..8) suggests an alternative representation of the response signal to the one given by the convolution representation of Equation (4..4) and depicted in Figure 4..3(b). In the case of Equation (4..8), the response signal is built up from the response of $N$ sub-systems $\breve{g}[k,t] = \breve{h}^{(N)}[k,t]\, e^{jk\omega_o t T_s}$ to frequency shifted versions of the excitation, $x[t]\, e^{jk\omega_o t T_s}$. This alternative representation is depicted in Figure 4..4.

Figure 4..4: Alternative time-domain representation of a non-stationary process using the convolution representation, where $y[t]$ is represented as the sum of the outputs of $N$ stationary sub-systems $\breve{g}[k,\tau] = \breve{h}[k,t]\, e^{jk\omega_o t T_s}$ fed with frequency shifted versions of the excitation $x[t]$. The sub-system $\breve{g}[0,\tau]$, corresponding to the harmonic $k=0$, is the best LTI approximation of the non-stationary system.

The Frequency Response Operator of the LTV system in Equation (4..3)

The Frequency Response Operator for this case is obtained by applying the Fourier transform to the spreading function expression in Equation (4..), which yields:

$$\breve{H}(\alpha,\omega) = -\frac{T_s}{2\pi}\sum_{i=1}^{n_a} A_i(\alpha) \circledast_{\Omega} \left( \breve{H}(\alpha,\omega)\, e^{-j(\omega+\alpha) i T_s} \right) + \sum_{i=0}^{n_c} C_i(\alpha)\, e^{-j\omega i T_s} \tag{4..3}$$

However, as in the case of the spreading function, this expression is of limited practical applicability, due to the infinite dimension of the Frequency Response Operator and the convolution operator in the AR part of the equation. Nonetheless, as shall be shown next, a more useful vector equation is obtained in the $N$-periodic case.

The Harmonic FRF for the LTV system in Equation (4..3) in the $N$-periodic case

The Harmonic FRF for this case can be found by computing the DTFT of $\breve{h}^{(N)}[k,\tau]$ in Equation (4..) with respect to $\tau$, which leads to the following result:

$$\breve{H}^{(N)}[k,\omega) = -\frac{1}{N}\sum_{i=1}^{n_a} \breve{a}_i[k] \circledast_N \left( \breve{H}^{(N)}[k,\omega)\, e^{-jk\omega_o T_s i} \right) e^{-j\omega T_s i} + \sum_{i=0}^{n_c} \breve{c}_i[k]\, e^{-j\omega T_s i} \tag{4..33}$$

After replacing the circular convolution operator by a matrix-vector product, as performed for the derivation of the HIR (Equation (4..)), the following matrix form to compute the Harmonic FRF is obtained:

$$\breve{\mathbf{H}}^{(N)}(\omega) = \left( \mathbf{I} + \sum_{i=1}^{n_a} e^{-j\omega T_s i}\, \boldsymbol{\Omega}^i\, \breve{\mathbf{A}}_i \right)^{-1} \left( \sum_{i=0}^{n_c} \breve{\mathbf{c}}_i\, e^{-j\omega T_s i} \right), \qquad \omega \in \Omega \tag{4..34}$$

where

$$\breve{\mathbf{H}}^{(N)}(\omega) = \begin{bmatrix} \breve{H}^{(N)}[0,\omega) & \breve{H}^{(N)}[1,\omega) & \breve{H}^{(N)}[2,\omega) & \cdots & \breve{H}^{(N)}[(N-1),\omega) \end{bmatrix}^T$$

is a vector containing the transfer function coefficients for each frequency $\omega \in \Omega$.
The same expression for the Harmonic FRF may be obtained by computing the DFT of the instantaneous FRF in Equation (4..) with respect to $t$. Equation (4..33) and Equation (4..34) are also novel contributions introduced by the present work, and can be used to evaluate the Harmonic FRF of the LTV system at any frequency of interest in the continuous interval $\Omega$. Furthermore, the instantaneous FRF, the harmonic impulse response and the impulse response can be retrieved from the Harmonic FRF by means of the inverse DFT. In summary, all the relationships among the impulse response, the harmonic impulse response, the instantaneous FRF and the Harmonic FRF derived for the case of the LTV model in Equation (4..3) are depicted in Figure 4..5.
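A matrix-form evaluation of the Harmonic FRF at a single frequency can be sketched as follows (Python with NumPy). This is an illustration under stated assumptions, not the author's implementation: `numpy.fft.fft` is used as the unnormalised forward DFT, the division by $N$ is applied explicitly, the circulant factor is applied to the left of the diagonal exponential factor (the ordering that follows from the circular-convolution expression under this DFT convention), and the function names, the default $T_s = 1$ and the example values are hypothetical.

```python
import numpy as np

def circ(v):
    """Circulant matrix with first column v: C[k, m] = v[(k - m) mod N]."""
    n = np.arange(len(v))
    return np.asarray(v)[(n[:, None] - n[None, :]) % len(v)]

def harmonic_frf(a, c, omega, Ts=1.0):
    """Vector of Harmonic FRF values Hb[k, omega), k = 0..N-1, at one frequency omega."""
    a = np.atleast_2d(a)
    c = np.atleast_2d(c)
    N = a.shape[1]
    k = np.arange(N)
    lhs = np.eye(N, dtype=complex)
    for i in range(1, a.shape[0] + 1):
        A_i = circ(np.fft.fft(a[i - 1]))                    # circulant of DFT coefficients
        E_i = np.diag(np.exp(-2j * np.pi * k * i / N))      # e^{-j k w_o T_s i}, w_o T_s = 2 pi / N
        lhs += np.exp(-1j * omega * Ts * i) * (A_i @ E_i) / N
    rhs = sum(np.fft.fft(c[i]) * np.exp(-1j * omega * Ts * i) for i in range(c.shape[0]))
    return np.linalg.solve(lhs, rhs)

# Hypothetical first-order periodic example
N = 12
t = np.arange(N)
a1 = -0.5 - 0.3 * np.cos(2 * np.pi * t / N)
c0 = np.ones(N)
omega = 0.7                                                 # rad/s with Ts = 1 (assumed)
Hb = harmonic_frf(a1, c0, omega)
```

With this normalisation, constant parameters give a vector whose only non-zero entry is the $k=0$ one, equal to $N$ times the usual LTI frequency response, so the magnitudes of the remaining entries indicate how strongly the parameter variation couples the harmonics.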

Figure 4..5: Summary of the relationships among the time and frequency domain descriptors derived from a discrete-time LTV system. The diagram links the impulse response $h[t,t-\tau]$ (Equation (4..7)) through $\mathcal{F}_{\tau\to\omega}$ to the Instantaneous Frequency Response Function $H[t,\omega)$ (Equation (4..3)) and its frozen approximation $H_F[t,\omega)$ (Equation (4..4)), and through $\mathcal{F}_{t\to k}$ to the Harmonic Impulse Response $\breve{h}[k,\tau]$ (Equation (4..)) and the Harmonic Frequency Response Function $\breve{H}[k,\omega)$ (Equation (4..34)), which are in turn related by $\mathcal{F}_{\tau\to\omega}$.

Spectral correlation

Based on the finite time DFTs of the excitation and response signals in Equation (4..9), the excitation-response and response spectral correlations are defined as (Giannakis, 999):

$$S_{yx}[k,k-n] = E\{Y[k]\, X^*[k-n]\}, \qquad S_{yy}[k,k-n] = E\{Y[k]\, Y^*[k-n]\}, \qquad k,n \in [0,1,\ldots,N-1] \tag{4..35}$$

In contrast with the spectral correlation of infinite duration signals defined in Antoni (7, 9); Flandrin (989); Gardner (986), with domain $\alpha, \omega \in \Omega$, the discretized spectral correlation of Equation (4..35) for the $N$-sample analysis period is defined on the domain $k,n \in [0,1,\ldots,N-1]$. Moreover, the discretized spectral correlation of Equation (4..35) corresponds to the values of the infinite duration spectral correlation on $\alpha = k\omega_o$ and $\omega = n\omega_o$, namely $S_{yy}[k,k-n] = S_{yy}(k\omega_o,(k-n)\omega_o)$. The value of the excitation-response spectral correlation for the LTV system in Equation (4..3) can be obtained by substituting the value of $Y(n\omega_o) = Y[n]$ given in Equation (4..8) in the definition of the spectral correlation in Equation (4..35), which produces the following result:

$$S_{yx}[k,k-n] = E\left\{ \sum_{m=0}^{N-1} \breve{H}[m,k-m]\, X[k-m]\, X^*[k-n] \right\} = \sum_{m=0}^{N-1} \breve{H}[m,k-m]\, E\{X[k-m]\, X^*[k-n]\} = \sum_{m=0}^{N-1} \breve{H}[m,k-m]\, S_{xx}[k-m,k-n] \tag{4..36}$$

Likewise, the spectral correlation of the response is determined by the relation:

$$S_{yy}[k,k-n] = \sum_{m=0}^{N-1} \sum_{l=0}^{N-1} \breve{H}[m,k-m]\, \breve{H}^*[l,k-n-l]\, S_{xx}[k-m,k-n-l] \tag{4..37}$$

where $S_{xx}[k,k-n]$ is the spectral correlation of the excitation.
Notice the similarity of Equation (4.2.36) and Equation (4.2.37), for the excitation-response spectral correlation and the spectral correlation of the response, with the relations for the excitation-response covariance and ACF in Equation (4..7) and Equation (4..8). Similar expressions are shown in (Giannakis, 1999) for the discrete time case, and in (Gardner, 1986) for the continuous time case. Equations (4.2.36) and (4.2.37) may be used to analyze how the spectral characteristics of the input relate to the spectral characteristics of the output of a system with time dependent dynamics. As in the case of

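the Harmonic FRF, it is also possible to derive a matrix-form equation for the spectral correlation of the output signal based on Equation (4..3), which relates the input and output DFTs. Thus, the excitation-response spectral correlation matrix $\mathbf{S}_{yx}$ is defined as:

$$ \mathbf{S}_{yx} = \begin{bmatrix} S_{yx}[0,0] & S_{yx}[0,1] & \cdots & S_{yx}[0,N-1] \\ S_{yx}[1,0] & S_{yx}[1,1] & \cdots & S_{yx}[1,N-1] \\ \vdots & \vdots & \ddots & \vdots \\ S_{yx}[N-1,0] & S_{yx}[N-1,1] & \cdots & S_{yx}[N-1,N-1] \end{bmatrix} = E\left\{ \begin{bmatrix} Y[0]X^{*}[0] & Y[0]X^{*}[1] & \cdots & Y[0]X^{*}[N-1] \\ Y[1]X^{*}[0] & Y[1]X^{*}[1] & \cdots & Y[1]X^{*}[N-1] \\ \vdots & \vdots & \ddots & \vdots \\ Y[N-1]X^{*}[0] & Y[N-1]X^{*}[1] & \cdots & Y[N-1]X^{*}[N-1] \end{bmatrix} \right\} = E\{\mathbf{Y}\mathbf{X}^{H}\} \tag{4.2.38} $$

where the symbol $*$ indicates the complex conjugate and $H$ indicates the conjugate transpose of a matrix. Moreover, the response correlation matrix $\mathbf{S}_{yy}$ is defined as:

$$ \mathbf{S}_{yy} = \begin{bmatrix} S_{yy}[0,0] & S_{yy}[0,1] & \cdots & S_{yy}[0,N-1] \\ S_{yy}[1,0] & S_{yy}[1,1] & \cdots & S_{yy}[1,N-1] \\ \vdots & \vdots & \ddots & \vdots \\ S_{yy}[N-1,0] & S_{yy}[N-1,1] & \cdots & S_{yy}[N-1,N-1] \end{bmatrix} = E\left\{ \begin{bmatrix} Y[0]Y^{*}[0] & Y[0]Y^{*}[1] & \cdots & Y[0]Y^{*}[N-1] \\ Y[1]Y^{*}[0] & Y[1]Y^{*}[1] & \cdots & Y[1]Y^{*}[N-1] \\ \vdots & \vdots & \ddots & \vdots \\ Y[N-1]Y^{*}[0] & Y[N-1]Y^{*}[1] & \cdots & Y[N-1]Y^{*}[N-1] \end{bmatrix} \right\} = E\{\mathbf{Y}\mathbf{Y}^{H}\} \tag{4.2.39} $$

Then, the following expressions are obtained after replacing the result in Equation (4..3) in the previous equations:

$$ \mathbf{S}_{yx} = E\{(\mathbf{H}\mathbf{X})\,\mathbf{X}^{H}\} = \mathbf{H}\,E\{\mathbf{X}\mathbf{X}^{H}\} = \mathbf{H}\,\mathbf{S}_{xx} \tag{4.2.40a} $$

$$ \mathbf{S}_{yy} = E\{(\mathbf{H}\mathbf{X})\,(\mathbf{H}\mathbf{X})^{H}\} = \mathbf{H}\,E\{\mathbf{X}\mathbf{X}^{H}\}\,\mathbf{H}^{H} = \mathbf{H}\,\mathbf{S}_{xx}\,\mathbf{H}^{H} \tag{4.2.40b} $$

TARMA case

In the TARMA case, the spectral correlation of the NID excitation $w[t] \sim \mathrm{NID}(0,\sigma_w^2[t])$ is:

$$ S_{ww}[k,k-n] = \mathcal{F}_{t\to n\omega_o}\big\{ \mathcal{F}_{\tau\to\omega}\{ \sigma_w^2[t]\,\delta[t-\tau] \} \big\} = \sum_{t=0}^{N-1} \sum_{\tau} \sigma_w^2[t]\,\delta[t-\tau]\, e^{-jn\omega_o t T_s}\, e^{-j\omega\tau T_s} = \sum_{t=0}^{N-1} \sigma_w^2[t]\, e^{-jn\omega_o t T_s} = \breve{\sigma}_w^2[n] \tag{4.2.41} $$

where $\breve{\sigma}_w^2[n]$ is the DFT of the innovations variance $\sigma_w^2[t]$ in the period $t = 0,1,\ldots,N-1$. The spectral correlation matrix of the excitation $\mathbf{S}_{ww}$ has a diagonal structure, with the main diagonal determined by the entries $S_{ww}[k,k] = \breve{\sigma}_w^2[0]$, while the remaining diagonals are $S_{ww}[k,k+n] = \breve{\sigma}_w^2[n]$.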
Then, substituting the previous result into Equation (4.2.36) and Equation (4.2.37) yields the expressions:

$$ S_{yx}[k,k-n] = \sum_{m=0}^{N-1} H[m,k-m]\, \breve{\sigma}_w^2[n-m] \tag{4.2.42a} $$

$$ S_{yy}[k,k-n] = \sum_{m=0}^{N-1} \sum_{l=0}^{N-1} H[m,k-m]\, H^{*}[l,k-n-l]\, \breve{\sigma}_w^2[n+l-m] \tag{4.2.42b} $$

Another special case arises when the exciting innovations are a stationary NID process with variance $\sigma_w^2$. In that case $S_{ww}[k,k-n] = \sigma_w^2\,\delta[n]$, and the excitation-response and response spectral correlations take the form:

$$ S_{yx}[k,k-n] = \sigma_w^2\, H[n,k-n] \tag{4.2.43a} $$

$$ S_{yy}[k,k-n] = \sigma_w^2 \sum_{m=0}^{N-1} H[m,k-m]\, H^{*}[m-n,k-m] \tag{4.2.43b} $$

and in matrix form:

$$ \mathbf{S}_{yx} = \sigma_w^2\, \mathbf{H} \tag{4.2.44a} $$

$$ \mathbf{S}_{yy} = \sigma_w^2\, \mathbf{H}\,\mathbf{H}^{H} \tag{4.2.44b} $$

The equations above generalize the well known expressions relating the PSD of the input and the FRF of an LTI system with the input-output cross PSD and the PSD of the output signal (Manolakis et al., 2005, Ch. 3). In this sense, Equations (4.2.36) and (4.2.37) may be used to analyze the relation between the spectral components of the input and the output of the LTV system of Equation (4..3) defined on the period $t = 0,1,\ldots,N-1$ (including the time periodic case). Similarly, Equation (4.2.42) and Equation (4.2.43) may be used to analyze the spectral content of the non-stationary response of a TARMA model.

Parametric time dependent spectra

Three types of time dependent spectra are analyzed for the case of the TARMA model in Equation (4..):

The frozen spectral approach

The simplest approximation for a parametric time dependent spectrum is the frozen TV PSD, which consists of the squared magnitude of the frozen FRF of Equation (4..4) scaled by the innovations variance. The frozen TV PSD of the TARMA model defined in Equation (4..) is then (Grenier, 1989; Poulimenos and Fassois, 2006):

$$ S_F[t,n] = \sigma_w^2\, \frac{\left| 1 + \sum_{i=1}^{n_c} c_i[t]\, e^{-jn\omega_o T_s i} \right|^2}{\left| 1 + \sum_{i=1}^{n_a} a_i[t]\, e^{-jn\omega_o T_s i} \right|^2} = \sigma_w^2\, |H_F[t,n]|^2 \tag{4.2.45} $$

The frozen TV PSD is the most widely used method for the analysis of the spectral content of a TARMA model, owing to the simplicity of its implementation. Nevertheless, as pointed out in (Zadeh, 1950), the frozen approach may be regarded as a first approximation to the actual system function of a variable network whenever the coefficients of the fundamental equation do not vary appreciably over the width of the impulse response of the system.
This statement is confirmed by the findings in Equation (4..3), where it is evident that the frozen FRF corresponds to an approximation of the instantaneous FRF for slowly varying systems.

The Melard-Tjøstheim spectral approach

The concept of Evolutionary Spectrum, discussed for example in Flandrin (1989); Grenier (1989), is an attempt to generalize the concept of the PSD of the output of an LTI stationary system, which is based on the transform of the system response function. The frozen TV PSD corresponds to a special case of this evolutionary spectrum, where the amplitude modulating function $H[t,\omega)$ defining the oscillatory family is replaced by the frozen FRF $H_F[t,n\omega_o)$. Nonetheless, as discussed in Section 4..., the frozen FRF is an approximation of the instantaneous FRF in Equation (4..3) when $H[t,\omega) \approx H[t-i,\omega)$ for all $i = 1,\ldots,n_a$. In this sense, a more precise evolutionary spectrum is obtained by using the instantaneous FRF, in which case it takes the name of the Melard-Tjøstheim (evolutionary) Power Spectral Density (MT-PSD) (Poulimenos and Fassois, 2006; Grenier, 1989; Rao and Tong, 1974):

$$ S_{MT}[t,\omega) = |H[t,\omega)|^2 = \left| \sum_{\tau=0}^{\infty} h[t,t-\tau]\, e^{-j\omega T_s \tau} \right|^2 = \left| \frac{1}{N} \sum_{k=0}^{N-1} H[k,\omega)\, e^{jk\omega_o T_s t} \right|^2 \tag{4.2.46} $$

The previous equation suggests three ways to compute the MT-PSD. The first one requires evaluating the impulse response and then the squared absolute value of its DTFT, which implies truncating the impulse response at a finite, reasonably high, value of τ. The second one consists of directly computing the instantaneous FRF by means of the recursive expression in Equation (4..3), in which case initial conditions must be defined and the initial transient of the recursion must be accounted for. Finally, the third way consists of evaluating the Harmonic FRF using Equation (4.2.33), computing its inverse DFT to obtain the instantaneous FRF, and plugging the result into Equation (4.2.46). The latter method is preferable, since it is based on the closed form expression of the Harmonic FRF in Equation (4.2.33) and requires only an inverse DFT of the obtained Harmonic FRF at the frequencies of interest.

The Wigner-Ville spectral approach

A final approach to compute the time variant spectrum of the TARMA model in Equation (4..) is by using the definition of the generalized Wigner-Ville spectrum. More specifically, a Rihaczek type of spectrum (Rihaczek PSD, which is part of the family of generalized Wigner-Ville spectra) is obtained by computing the DFT with respect to τ of the parametric estimate of the ACF in Equation (4..9), or by computing the inverse DFT of the spectral correlation in Equation (4.2.37). Therefore, the Rihaczek spectrum of a TARMA model may be computed as follows:

$$ S_R[t,n] = \mathcal{F}_{\tau\to n\omega_o}\{\gamma_{yy}[t,t-\tau]\} = \mathcal{F}^{-1}_{k\to t}\{S_{yy}[k,k-n]\} \tag{4.2.47} $$

4.2.3 Computational considerations

The key quantity in the time frequency analysis framework introduced in this section is the Harmonic FRF.
The Harmonic FRF of the LTV system described by Equation (4..3) is computed in terms of Equation (4.2.34), which itself requires the computation of the inverse of the matrix

$$ \mathbf{D} = \mathbf{I} + \sum_{i=1}^{n_a} e^{-j\omega T_s i}\, \boldsymbol{\Omega}^{i}\, \breve{\mathbf{A}}_i \tag{4.2.48} $$

Therefore, the computational effort required for the evaluation of the Harmonic FRF is dominated by the computation of the inverse of $\mathbf{D}$. As a note, the inverse of $\mathbf{D}$ always exists, given the properties of the circulant matrices $\breve{\mathbf{A}}_i$ and the diagonal matrix $\boldsymbol{\Omega}$ (Golub and van Loan, 1996, Sec. 4.7). However, the computational cost of the inverse of $\mathbf{D}$ depends on its size, which is set directly by the length of the analysis period N. It is therefore highly recommended to analyze the structure of $\breve{a}_i[k]$ before performing any computation, since this avoids excessive computation times and large memory requirements. Two important cases may be considered:

(i) $a_i[t]$ is P-periodic, with a period of P samples: In this case $\breve{a}_i[k]$ is different from zero only when $k = l \cdot P$, with $l = \{0,1,\ldots,\lfloor N/P \rfloor\}$, where $\lfloor\cdot\rfloor$ indicates the nearest integer lower than the argument. Then, the circulant matrix $\breve{\mathbf{A}}_i$ can be constructed only with the entries of $\breve{a}_i[k]$ at $k = l \cdot P$ with $l = \{0,1,\ldots,\lfloor N/P \rfloor\}$. For example, if a i [t] is -periodic, then $\breve{\mathbf{A}}_i$ can be computed only with N/ entries from $\breve{a}_i[k]$, thus diminishing the size of $\mathbf{D}$ from N to N /.

(ii) $a_i[t]$ is non-periodic but slowly varying, so that $f_{a(max)} \ll f_s$: In this case $\breve{a}_i[k]$ is highly concentrated around zero. Then, the circulant matrix $\breve{\mathbf{A}}_i$ can be constructed only with the entries of $\breve{a}_i[k]$ with $k = \{0,1,\ldots,N \lambda f_{a(max)}/f_s\}$, where λ is a positive constant. The constant λ is introduced because the Harmonic FRF spreads over frequencies much higher than $f_{a(max)}$; it is therefore not advisable to compute the Harmonic FRF exactly up to that frequency, but rather up to a larger value, which depends on the dynamics of the LTV system.
The value of λ can be tuned by increasing it gradually and observing the convergence of the obtained Harmonic FRFs.
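The recommendation to inspect the structure of the DFT of each parameter trajectory before assembling D can be sketched numerically. The following is only an illustrative check, not part of the original method: the function name `significant_bins`, the tolerance `rel_tol` and the trajectory values are hypothetical. It lists the DFT bins of a parameter trajectory whose magnitude is non-negligible, which are the only entries needed to build the corresponding circulant matrix.

```python
import numpy as np

def significant_bins(traj, rel_tol=1e-8):
    """Return the DFT bin indices of `traj` with non-negligible magnitude.

    `rel_tol` (hypothetical threshold) is relative to the largest DFT magnitude.
    """
    dft = np.fft.fft(traj)
    mag = np.abs(dft)
    return np.flatnonzero(mag > rel_tol * mag.max())

# Hypothetical setup (illustrative values only): N samples at fs Hz,
# with a single cosine component at the cyclic frequency fc.
N, fs, fc = 4000, 100.0, 0.5
t = np.arange(N)
a = -0.9 + 0.1 * np.cos(2 * np.pi * fc * t / fs)

bins = significant_bins(a)
print(bins)  # bin 0 (mean value) plus the pair of bins at +/- fc
```

In this sketch the analysis period contains an integer number of parameter periods, so the DFT is exactly sparse; for a non-integer number of periods, or for a slowly varying non-periodic trajectory, leakage spreads the energy over neighbouring bins and the threshold has to be chosen accordingly.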

4.3 Application examples

4.3.1 Analysis of an FS-TARMA(2,1) model

Model definition

In this application example, the concepts introduced before for the time and frequency domain analysis of TARMA models are illustrated via a simple FS-TARMA model. Also, the TARMA model based analysis methods are compared with analogous non-parametric estimation methods. To begin with, consider the following FS-TARMA(2,1) model:

$$ y[t] = -a_1[t]\, y[t-1] - a_2[t]\, y[t-2] + c_1[t]\, w[t-1] + w[t], \qquad w[t] \sim \mathrm{NID}(0,\sigma_w^2) \tag{4.3.1} $$

defined on the period of time $t = [0,1,\ldots,N-1]$, with N = 4000 samples and sampled at $f_s = 1/T_s = 100$ Hz (thus $\omega_o = 2\pi f_s/N$). The evolution of the parameters of the FS-TARMA model is periodic, and is defined as follows:

$$ a_i[t] = a_{i,0} + a_{i,1}\cos(\omega_c t T_s) + a_{i,2}\cos(2\omega_c t T_s), \quad i = 1,2 $$

$$ c_1[t] = c_{1,0} + c_{1,1}\cos(\omega_c t T_s) + c_{1,2}\cos(2\omega_c t T_s) $$

where $\omega_c = 2\pi f_c$, and $f_c = 0.5$ Hz is the cyclic frequency of the FS-TARMA model. The values of the innovations variance, the coefficients of the parameter trajectories, the sampling frequency, the cyclic frequency and the analysis period are summarized in Table 4.. The DFT of the parameter trajectories can be derived by simple inspection, and is equal to:

$$ \breve{a}_i[k] = N a_{i,0}\,\delta[k] + \sum_{m=1}^{2} \frac{N a_{i,m}}{2}\,\big(\delta[k-20m] + \delta[k+20m]\big) $$

$$ \breve{c}_1[k] = N c_{1,0}\,\delta[k] + \sum_{m=1}^{2} \frac{N c_{1,m}}{2}\,\big(\delta[k-20m] + \delta[k+20m]\big) $$

The values of the coefficients of the DFT of the parameter trajectories are also shown in Table 4..

Table 4.: Summary of the coefficients of the FS-TARMA model and their corresponding DFT coefficients.
Parameter | Value
Sampling frequency | f s = 100 Hz
Cyclic frequency | f c = 0.5 Hz
Analysis period | N = 4000 samples
Innovations variance | σw =.
AR parameters, time domain | a, =.96, a, =.5, a, =.6; a, =.96, a, =.6, a, =.8
AR parameters, frequency domain | ă []= 384, ă []=ă [3979]=, ă [4]=ă [3959]=3; ă []=3686, ă []=ă [3979]= 3, ă [4]=ă [3959]= 6
MA parameters, time domain | c, =.8, c, =.8, c, =.6
MA parameters, frequency domain | c []=3, c []= c [3979]=36, c [4]= c [3959]=

The parameter trajectories over time, a typical realization of the process and its corresponding stationary PSD estimate via Welch's method are shown in Figure 4.3.1. The Welch PSD estimate is computed based on a single realization consisting of 4000 samples (40 s), and using 4 point Fourier transforms with 5% overlap.

Time and frequency domain representation of the FS-TARMA model

The analysis of the dynamics of the FS-TARMA(2,1) model is initially carried out in terms of the impulse response function, the HTF, the instantaneous FRF and the spreading function. The computation of these quantities, based on the model parameters and their respective DFTs, is described next:


Figure 4.3.1: Single realization of the FS-TARMA(2,1) model: (a) Time dependent AR and MA parameters; (b) Single process realization; (c) Welch estimate of the stationary PSD and spectrogram.

The Impulse Response Function $h[t,t-\tau]$ is evaluated on the time period $t = \{0,1,\ldots,400\}$ samples and the delay period $\tau = \{0,1,\ldots,50\}$ samples, corresponding to the periods $tT_s = [0,4]$ s and $\tau T_s = [0,0.5]$ s respectively, by direct application of Equation (4..7).

The HTF $H[k,f]$ is evaluated on the discrete frequencies $f = k f_s/N$ covering the frequency range $f \in [0, f_s/2]$, and on the cyclic frequencies α = f s l with l ={,,...,99}, by application of Equation (4.2.34) based on the DFT coefficients of the parameters of the FS-TARMA(2,1) model in Table 4..

The Instantaneous FRF and the Spreading Function are obtained by evaluating the respective inverse DFTs of the HTF.

Figure 4.3.2 shows the obtained representations. Notice that the frequencies at values of $k \geq N/2$ are shifted to the respective negative frequencies. The time dependency of the underlying dynamics of the FS-TARMA model is evident in both the impulse response (Figure 4.3.2(a)) and the instantaneous FRF (Figure 4.3.2(c)). Additionally, both the HIR and the HTF (Figure 4.3.2(b) and (d), respectively) demonstrate the effect of the parameter periodicity. A very complex cyclic modulation phenomenon can be observed in both the HIR and the HTF, even though it originates from a very simple parameter evolution model consisting of just two cosine components. Moreover, it can be observed that in both the HTF and the instantaneous FRF of the FS-TARMA model the frequency content is maximally concentrated in the range from to 5 Hz, while the period of 2 s is clearly


visible. The spectral lines in the HTF and in the HIR are located at cyclic frequencies that are multiples of $f_c = 0.5$ Hz. The magnitude of these spectral lines attenuates for increasing values of the cyclic frequency.

Figure 4.3.2: Time and frequency domain representations of the simulated FS-TARMA(2,1) model: (a) impulse response function; (b) spreading function (magnitude); (c) instantaneous FRF (magnitude); (d) harmonic transfer function (magnitude). Time and delay in seconds, frequency indices transformed to Hertz as $f = k f_s/N$, $k = \{0,1,\ldots,N-1\}$.

Figure 4.3.3 shows the three different types of parametric spectra discussed in Section 4...4, namely the frozen, Melard-Tjøstheim and Wigner-Ville spectra, evaluated for two periods of the system (4 s) within the frequency range f = [0, 35] Hz. The frozen spectrum is computed according to Equation (4.2.45), based directly on the parameters of the FS-TARMA model. The Melard-Tjøstheim spectrum is computed according to Equation (4.2.46), based on the instantaneous FRF shown in Figure 4.3.2(c). Finally, the Wigner-Ville spectrum is computed as the inverse DFT of the spectral correlation estimate derived from Equation (4..4).

The time dependent spectra of the FS-TARMA model provide similar information about the evolution of the spectral content of the signal, although they differ in important details. First of all, the three types of time dependent spectra demonstrate the presence of a single mode with evident amplitude and frequency modulation. The instantaneous frequency of the spectral component varies periodically within the range from to 5 Hz, while the largest power is found for lower values of the instantaneous frequency. Moreover, the period of 2 s is clearly visible. In comparison to the spectrogram shown in Figure 4.3.1(c), all the parametric time dependent spectra are characterized by improved localization of the time frequency components.
The frozen spectrum displayed in Figure 4.3.3(a)-(b) clearly shows the presence of a single mode in the signal and demonstrates the instantaneous nature of this type of spectrum. On the other hand, the Melard-Tjøstheim (Figure 4.3.3(c)-(d)) and Wigner-Ville


(Figure 4.3.3(e)-(f)) spectra demonstrate the presence of other lower power oscillations around the main spectral component. As shall be demonstrated in detail later, these oscillations are actually an important component of the response of the LTV system, while they are totally missing in the frozen spectrum.

Figure 4.3.3: Comparison of different parametric time dependent spectra of the FS-TARMA(2,1) model in the frequency range f = [0, 38] Hz: (a)-(b) Frozen spectrum $S_F(t,f)$; (c)-(d) Melard-Tjøstheim spectrum $S_{MT}(t,f)$; (e)-(f) parametric generalized Wigner-Ville spectrum $S_{WV}(t,f)$ (Rihaczek type). Time in seconds, frequency indices transformed to Hertz as $f = n f_s/N$, with $n = 0,1,\ldots,N-1$.

Analysis of the response of the FS-TARMA model

As explained in Section 4..., the instantaneous FRF corresponds to the response of the system to a complex exponential excitation. To demonstrate this, consider the FS-TARMA(2,1) model, with the NID innovations

replaced by the complex exponential excitation $x[t] = e^{j2\pi f t T_s}$, with f ={,,3} Hz. The response of the FS-TARMA model is computed in the time domain directly from Equation (4.3.1) and displayed in the left column of Figure 4.3.4. The instantaneous amplitude of the response, $A_y[t] = (\Re\{y[t]\}^2 + \Im\{y[t]\}^2)^{1/2}$, is shown as well. The right column of Figure 4.3.4 displays the instantaneous amplitude of the response along with the magnitude of the instantaneous FRF and of the frozen FRF evaluated at the frequency of the excitation, namely $|H(t,f)|$ (as shown in Figure 4.3.2(c)) and $|H_F(t,f)|$. Notice that the quantities $A_y[t]$ and $|H[t,f]|$ coincide, whereas $|H_F(t,f)|$ only coincides for the smoothly evolving response with frequency 3 Hz. Additionally, notice that the oscillations found in the instantaneous FRF (also evident in the Melard-Tjøstheim spectrum in Figure 4.3.3(c)-(d) and the Wigner-Ville spectrum in Figure 4.3.3(e)-(f)) originate from analogous oscillations appearing in the response of the FS-TARMA model. This effect is not captured by the frozen FRF and spectrum, since the locally stationary assumption overlooks local interactions between frequency components.

Figure 4.3.4: Time domain analysis of the response of the FS-TARMA(2,1) model to a complex exponential excitation with frequencies f ={,, 3} Hz. Left column, time domain representation of the response of the FS-TARMA model to a complex exponential excitation and its respective amplitude. Right column, amplitude of the responses and magnitude of the instantaneous FRF evaluated at the frequency of the excitation.

Figure 4.3.5 shows the DFT of the response of the FS-TARMA model to the considered complex exponential excitations along with the HTF evaluated at the same frequencies of the excitation.
Again, it is clear that the magnitude of the HTF coincides with the magnitude of the DFT of the response of the FS-TARMA model. In summary, it has been shown that the instantaneous amplitude of the response coincides with the magnitude of the instantaneous FRF evaluated at the frequency of the excitation, therefore confirming that the quantity H[t, f] satisfies the definition given in Equation (4..). Moreover, the DFT of the response of the FS-TARMA model coincides with the HTF evaluated at the frequency of the excitation.
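The coincidence between the amplitude of the response to a complex exponential and the magnitude of the instantaneous FRF can be reproduced with a short numerical experiment. The sketch below is a minimal illustration under stated assumptions: the coefficient values in `coeffs`, the lag truncation `L` and the burn-in length are hypothetical (they are not the values of the example above), and the instantaneous FRF is obtained from a truncated impulse response rather than from the Harmonic FRF.

```python
import numpy as np

def coeffs(t, Ts=0.01, fc=0.5):
    """Hypothetical periodic TARMA(2,1) coefficients (illustrative values)."""
    wc = 2 * np.pi * fc
    a1 = -1.2 + 0.1 * np.cos(wc * t * Ts)
    a2 = 0.64 + 0.0 * t
    c1 = 0.3 + 0.1 * np.cos(wc * t * Ts)
    return a1, a2, c1

def run_model(x, t0):
    """y[t] = -a1[t] y[t-1] - a2[t] y[t-2] + c1[t] x[t-1] + x[t], zero initial state.

    x[n] is the input applied at absolute time t0 + n.
    """
    y = np.zeros(len(x), dtype=complex)
    for n in range(len(x)):
        a1, a2, c1 = coeffs(t0 + n)
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        x1 = x[n - 1] if n >= 1 else 0.0
        y[n] = -a1 * y1 - a2 * y2 + c1 * x1 + x[n]
    return y

def instantaneous_frf(times, f, L=150, Ts=0.01):
    """H[t, w) ~ sum_{tau=0}^{L-1} h[t, t - tau] exp(-j w Ts tau) (truncated at L lags)."""
    w = 2 * np.pi * f
    impulse = np.zeros(L, dtype=complex)
    impulse[0] = 1.0
    g = {}  # g[s][n]: response at time s + n to a unit impulse applied at time s
    H = np.zeros(len(times), dtype=complex)
    for i, t in enumerate(times):
        for tau in range(L):
            s = int(t) - tau
            if s not in g:
                g[s] = run_model(impulse, s)
            H[i] += g[s][tau] * np.exp(-1j * w * Ts * tau)
    return H

def frozen_frf(times, f, Ts=0.01):
    """Frozen FRF: ratio of the MA and AR polynomials with coefficients frozen at t."""
    z = np.exp(-1j * 2 * np.pi * f * Ts)
    a1, a2, c1 = coeffs(np.asarray(times))
    return (1 + c1 * z) / (1 + a1 * z + a2 * z ** 2)

Ts, f = 0.01, 10.0        # hypothetical sampling period and excitation frequency
burn_in, N = 300, 200     # burn-in samples let the start-up transient die out
t_all = np.arange(-burn_in, N)
x = np.exp(1j * 2 * np.pi * f * t_all * Ts)
y = run_model(x, -burn_in)[burn_in:]

times = np.arange(N)
H = instantaneous_frf(times, f)
H_F = frozen_frf(times, f)

# Largest gap between the response amplitude A_y[t] and |H[t, f]|
err = np.max(np.abs(np.abs(y) - np.abs(H)))
print(err)
```

Because the response to $e^{j2\pi f t T_s}$ equals $e^{j2\pi f t T_s} H[t,f]$ once the transient has decayed, the demodulated response `y * np.exp(-1j * 2 * np.pi * f * times * Ts)` should also match `H` in phase, not only in magnitude. The array `H_F` is returned for comparison only; how closely it follows `H` depends on how fast the chosen coefficients vary.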

Figure 4.3.5: Frequency domain analysis of the response of the FS-TARMA(2,1) model to a complex exponential excitation with frequencies f ={,, 3} Hz. Panels (a)-(c) compare the magnitude of the DFT of the response, Y(α), with the magnitude of the HTF as a function of the cyclic frequency α [Hz].

4.3.2 Identification of the dynamics of an operational wind turbine

Data description

The vibration signal used in this application example was acquired on the tower of an operating NegMicon NM52/900 wind turbine located at a wind farm on the Atavyros Mountain of the island of Rhodes, Greece. For a full description of the signal acquisition and pre-processing, the reader is referred to (Avendano-Valencia and Fassois,, 3b). Table. provides a summary of the characteristics of the vibration signal. Figure.3 shows a sample of the vibration signal.

Short description of the identification methods

Different parametric TARMA model based time and frequency domain representations are compared with their respective non-parametric counterparts using a single vibration response signal with length 48 samples. The following methods are considered:

Non-parametric estimate: A non-parametric estimate of the spectral correlation is obtained via the smoothed cyclic periodogram, defined as (Giannakis, 1999; Antoni, 2007):

$$ \hat{S}_{yy}[k,k-n] = \frac{1}{M} \sum_{m=1}^{M} W[k-n]\, I_{yy}^{(m)}[k,k-n]; \qquad I_{yy}^{(m)}[k,k-n] = \frac{1}{N}\, Y_m[k]\, Y_m^{*}[k-n] $$

where $W[n]$ is a frequency domain window with $W[0] = 1$ and $W[0] > W[n]$ for $n \neq 0$, and $Y_m[k]$ is the DFT of the m-th segment of the signal, obtained after dividing the signal into M overlapping segments. In the implementation, the signal is divided in segments of 4 samples with an overlap of 594 samples, and a Hamming spectral window of 4 samples is used. After estimating the spectral correlation, the corresponding time dependent spectral estimate is obtained by computing the inverse DFT of $\hat{S}_{yy}[k,k-n]$.
FS-TAR model based estimates: The FS-TAR model is estimated based on a Fourier basis for the evolution of the AR parameters and the innovations variance, as follows (Avendano-Valencia and Fassois, 3b):

$$ a_i[t] = a_{i,0} + \sum_{k=1}^{p} \big( a_{i,2k-1}\cos[k\alpha_o t T_s] + a_{i,2k}\sin[k\alpha_o t T_s] \big), \qquad \sigma_w^2[t] = s_0 + \sum_{k=1}^{p} \big( s_{2k-1}\cos[k\alpha_o t T_s] + s_{2k}\sin[k\alpha_o t T_s] \big) $$

where f s = 1/T s = 6 Hz is the sampling frequency, and $\alpha_o = 2\pi f_o$ is the basis frequency corresponding to the average rotor speed of f o =.37 Hz. The coefficients of projection of the FS-TAR model are estimated using the Multi-Stage Weighted Least Squares (MS-WLS) method described in (Poulimenos and Fassois, 2006). A complete description of the identification process and the validation of the identified models can be found in (Avendano-Valencia and Fassois,, 3b; Avendaño-Valencia and Fassois, 5d). After optimization, the resulting model structure is FS TAR(8) [7,7]. The parameters of the estimated FS-TAR model are used to compute the spectral correlation as well as the frozen, Wigner-Ville and Melard-Tjøstheim time dependent spectra.

GSC-TAR model based estimates: The time dependent parameters and hyperparameters of the GSC-TAR models are estimated using the maximum marginal likelihood (MML) method with stochastic constraint orders and (Avendaño-Valencia and Fassois, 5d). A complete description of the identification process and the validation of the identified models can be found in (Avendano-Valencia and Fassois,, 3b; Avendaño-Valencia and Fassois, 5d). After optimization, the resulting model structure is GSC TAR(8) with q=. The parameters of the estimated GSC-TAR model are used to compute the spectral correlation as well as the frozen, Wigner-Ville and Melard-Tjøstheim time dependent spectra.
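As a rough illustration of the non-parametric estimate listed above, the spectral correlation matrix can be approximated by averaging the outer products of the segment DFTs. This is a simplified sketch, not the implementation used in this work: it omits the frequency domain smoothing window W, and the function name `cyclic_periodogram`, the segment length, the overlap and the white-noise test signal are all hypothetical choices.

```python
import numpy as np

def cyclic_periodogram(y, seg_len, overlap):
    """Averaged cyclic periodogram (no spectral smoothing window).

    Returns S with S[k, l] approximating E{Y[k] Y*[l]}, where Y is the DFT of a
    segment of length `seg_len`; the entry S[k, k - n] corresponds to S_yy[k, k - n].
    """
    step = seg_len - overlap
    starts = list(range(0, len(y) - seg_len + 1, step))
    S = np.zeros((seg_len, seg_len), dtype=complex)
    for s in starts:
        Y = np.fft.fft(y[s:s + seg_len])
        S += np.outer(Y, Y.conj()) / seg_len   # periodogram of one segment
    return S / len(starts)

# Hypothetical test signal: unit-variance white noise, for which the estimate
# should be close to the identity (stationary case, power only on the diagonal).
rng = np.random.default_rng(0)
y = rng.standard_normal(4096)
S = cyclic_periodogram(y, seg_len=64, overlap=32)
print(np.mean(np.diag(S).real))
```

For a stationary signal the off-diagonal entries average towards zero as the number of segments grows, whereas periodic modulations such as those discussed in this section would show up as power concentrated on diagonals offset by the modulation frequency.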

Figure 4.3.6 shows the parameter trajectories of the identified FS TAR(8) [7,7] and GSC TAR(8) models in both the time and the frequency domain. The frequency domain representation is estimated via the DFT on the complete length of the estimated parameter trajectories. The periodic behavior in the estimates of the FS TAR(8) [7,7] model stems from the choice of the functional expansion basis. On the other hand, the periodicity is not so evident in the estimates obtained with the GSC TAR(8) model. The spectral characteristics of the estimated parameter trajectories differ mostly in the evident peaks found in the DFT of the estimates obtained with the FS TAR(8) [7,7] model. Nonetheless, the amplitude of the frequency components decays at the same rate in both cases. From this analysis, it can be concluded that the HTF can be computed based on a reduced number of points in the DFT of the parameter trajectories. Specifically, the HTF is computed based on the first 5 and last 5 DFT values, corresponding to the frequency range [-5, 5] Hz, since the values of the DFT of the parameter trajectories at higher frequencies are more than 4 orders of magnitude ($10^{-4}$) below the amplitude at f = 0.

Figure 4.3.6: Time and frequency domain representation of the estimates of the first three TAR parameter trajectories obtained with the GSC TAR(8) and FS TAR(8) [7,7] models.

Harmonic Transfer Function

The HTFs obtained with the estimated GSC TAR(8) and FS TAR(8) [7,7] models are shown in Figure 4.3.7. Despite the evident differences seen in the parameter evolutions, the main frequency content in the HTFs is very similar and, more importantly, the blade to blade frequency β =.37 Hz (see Table.) is noticeable in both HTFs.
This effect is more noticeable in the frequency component centered around 38 Hz, where the peaks at multiples of the blade to blade frequency are more pronounced. Those peaks indicate the presence of modulations on the component $e^{j2\pi 39 t T_s}$ of the vibration response of the wind turbine at frequencies f = 39±l β with l = up to at least l =.

Spectral correlation estimates

The spectral correlations obtained with the GSC TAR(8) model and the FS TAR(8) [7,7] model, and the non-parametric estimate via the smoothed cyclic periodogram, are shown in Figure 4.3.8. The different spectral correlation estimates appear to demonstrate the same type of behavior in the signal. Nonetheless, it is clear that both parametric estimates have improved characteristics in terms of quality of estimation and location of frequency components. The modulations induced by the blade rotation frequency and the blade to blade frequency are


clearer in both parametric estimates. In general, it can be observed that the largest portion of the power in the vibration response is concentrated in the frequency components around 38 Hz, while the spectral correlation demonstrates that these components exhibit modulations at frequencies l β with l = up to at least l =, as also evident in the HTFs in Figure 4.3.7.

Figure 4.3.7: Harmonic Transfer Functions derived from: (a)-(b) the estimated FS TAR(8) [7,7] model; (c)-(d) the estimated GSC TAR(8) model.

Time dependent spectra

The time dependent spectrum of the wind turbine vibration response is obtained by the following methods:

(i) Non-parametric estimate via spectrogram (Gaussian window, 5 samples length, 5 samples overlap, aperture parameter 4).

(ii) Wigner-Ville spectrum computed as the inverse DFT of the smoothed cyclic periodogram.

(iii) Melard-Tjøstheim, Rihaczek and frozen spectra derived from the estimated FS TAR(8) [7,7] and GSC TAR(8) models.

Figure 4.3.9 shows the obtained time dependent spectra. The time dependent spectra demonstrate the periodicity in the evolution of the spectral components of the wind turbine vibration response. The periodicity in

the spectral components is mainly influenced by the blade rotation frequency 3β =.6 Hz, as also evident in the estimates of the spectral correlation and the HTF shown in Figure 4.3.8 and Figure 4.3.7, respectively. The modulation induced by the blade to blade frequency is less evident in the time dependent spectra, although it is clear in the spectral correlation and the HTF.

Figure 4.3.8: Spectral correlation of the wind turbine vibration response obtained by means of: (a) estimated FS TAR(8) [7,7] model; (b) estimated GSC-TAR(8) model; (c) non-parametric estimate via smoothed cyclic periodogram.

A short discussion of the different time dependent spectra is provided as follows:

The spectrogram in Figure 4.3.9(a) reveals the presence of modulations in several of the natural frequencies of the wind turbine vibration. However, the precise localization of these components is unclear

and the precise time-dependent behavior of the components is difficult to discern; in addition, there are spurious components that mask the details.

The non-parametric Wigner-Ville spectrum in Figure 4.3.9(b) demonstrates improved localization of frequency components and reduced influence of noise compared to the spectrogram. The frequency modulations apparent in the spectrogram appear as a single mode or a group of modes with varying amplitude.

The parametric time dependent spectra derived from the FS TAR(8) [7,7] model and the GSC TAR(8) model appear cleaner than their non-parametric counterparts. Besides, it is easier to discern each one of the modes in the vibration response signal and to determine their characteristics. In particular, the β and 3β modulations in the mode centered at 38 Hz are clearer than in the non-parametric estimates, especially in the spectra derived from the FS-TAR model.

The frozen spectra (Figure 4.3.9(c) for the FS-TAR case and Figure 4.3.9(d) for the GSC-TAR case) emphasize the local behavior of the time series, but exhibit very large bursts in some parts of the spectra, which can be related to the time instants where the frozen poles move close to the unit circle. It can also be noticed that the frozen spectrum derived from the GSC-TAR model is more sensitive to small changes in the instantaneous values of the parameters.

The Melard-Tjøstheim spectra, shown in Figure 4.3.9(e) for the FS-TAR case and Figure 4.3.9(f) for the GSC-TAR case, are characterized by small oscillations of each one of the frequency modes along the time axis. These oscillations are consistent in both Melard-Tjøstheim spectra and, as clarified in the previous application example, can be associated with the modulations characterizing each one of the modes.
The Rihaczek spectra, shown in Figure 4.3.9(g) for the FS-TAR(8) [7,7] model and Figure 4.3.9(h) for the GSC-TAR(8) model, appear as de-noised versions of the non-parametric Wigner-Ville spectrum (Figure 4.3.9(b)). As in the Mélard-Tjøstheim spectra, the Rihaczek spectra show a consistent behavior on each of the modes and facilitate their location on the time-frequency plane.

4.4 Conclusions

The present work has been devoted to the derivation of time and frequency domain representations of discrete-time linear time-varying systems, and in particular of the family of TARMA models. The main contribution is the derivation of analytic expressions for the Harmonic FRF and the Harmonic Impulse Response of discrete linear time-varying systems based on the DFT of the system parameters. Subsequently, these expressions are used to calculate the spectral correlation and different types of time-dependent spectra associated with TARMA models, including Mélard-Tjøstheim and Wigner-Ville spectra. The proposed methods are analyzed in two application cases: the first consists of a simple FS-TARMA(,) model characterized by a periodic evolution of the parameters, which serves to demonstrate and provide some insight into the main results of this work; the second consists of the analysis of the dynamics of the vibration response of a wind turbine. Both application examples demonstrate the important benefits obtained with the proposed methods in the spectral analysis of complex non-stationary signals, including:

The postulated methods facilitate the analysis of the non-stationary dynamics characterizing a TARMA model and can be easily extended to other types of non-stationary models, including TARX and TARMAX models.
In combination with advanced TARMA identification techniques, the postulated methods become a very powerful tool for the identification of non-stationary processes, offering improved precision and tracking of the non-stationary dynamics in comparison with non-parametric methods. Besides, the proposed approach requires shorter sample lengths, thus producing enhanced estimates from a smaller amount of information.

Figure 4.3.9: Time-dependent spectra of the wind turbine vibration response on two periods of rotation of the blades, obtained via: (a) spectrogram; (b) non-parametric Wigner-Ville spectrum estimate; (c) FS-TAR(8) [7,7] model based frozen spectrum; (d) GSC-TAR(8) model based frozen spectrum; (e) FS-TAR(8) [7,7] model based Mélard-Tjøstheim spectrum; (f) GSC-TAR(8) model based Mélard-Tjøstheim spectrum; (g) FS-TAR(8) [7,7] model based Rihaczek spectrum; (h) GSC-TAR(8) model based Rihaczek spectrum.

In comparison with the typically used frozen approach, the postulated spectral methods provide a closer view of the time-varying power spectrum of the signal, and avoid the sudden bursts and the noisy behavior that characterize the frozen approach.
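The bursts attributed to the frozen approach can be reproduced on a toy model. The following is a minimal sketch and not the code used in this work: it assumes a hypothetical TAR(2) model whose frozen pole radius oscillates between 0.81 and 0.99, unit innovations variance, $T_s = 1$, and the usual frozen-time spectrum $\sigma_w^2 / |1 + \sum_i a_i[t] e^{-j\omega i}|^2$. The function name `frozen_spectrum` and all numerical values are illustrative assumptions.

```python
import numpy as np

def frozen_spectrum(a_t, sigma2, omega):
    """Frozen-time spectrum of a TAR model (sketch, T_s = 1).

    a_t    : array of shape (T, na) holding a_i[t], i = 1..na
    sigma2 : innovations variance (assumed constant here)
    omega  : frequencies in rad/sample
    Returns an array of shape (T, len(omega)).
    """
    na = a_t.shape[1]
    # E[w, i-1] = exp(-j * omega_w * i), i = 1..na
    E = np.exp(-1j * np.outer(omega, np.arange(1, na + 1)))
    A = 1.0 + a_t @ E.T            # frozen AR polynomial at each time instant
    return sigma2 / np.abs(A) ** 2

# Hypothetical TAR(2): complex pole pair at fixed angle, time-varying radius
T = 200
t = np.arange(T)
r = 0.90 + 0.09 * np.cos(2 * np.pi * t / T)     # radius between 0.81 and 0.99
theta = np.pi / 4
a_t = np.column_stack([-2 * r * np.cos(theta), r ** 2])

omega = np.linspace(0, np.pi, 512)
S = frozen_spectrum(a_t, 1.0, omega)
peak = S.max(axis=1)                             # spectral peak at each time instant
```

In this sketch the spectral peak grows by more than two orders of magnitude when the frozen poles approach the unit circle (`peak[0]` versus `peak[T // 2]`), which is the kind of burst discussed above.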

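Appendix 4.A Proofs

4.A.1 Proof for the spreading function in Equation (4..)

Applying the definition of the impulse response from Equation (4..7) to the spreading function in Equation (4..8a) yields:

$$h(\alpha,\tau] = \mathcal{F}_{t\to\alpha}\{h[t,t-\tau]\} = \mathcal{F}_{t\to\alpha}\Big\{ -\sum_{i=1}^{n_a} a_i[t]\, h[t-i,t-\tau] + \sum_{i=0}^{n_c} c_i[t]\, \delta[\tau-i] \Big\} \tag{4.A.1}$$

$$h(\alpha,\tau] = -\sum_{i=1}^{n_a} \mathcal{F}_{t\to\alpha}\{a_i[t]\, h[t-i,t-\tau]\} + \sum_{i=0}^{n_c} \mathcal{F}_{t\to\alpha}\{c_i[t]\}\, \delta[\tau-i] \tag{4.A.2}$$

Using the property of the DTFT of the product of two functions, it can be shown that the Fourier transform of the product $a_i[t]\, h[t-i,t-\tau]$ is (Oppenheim et al., 1999, p. 59):

$$\mathcal{F}_{t\to\alpha}\{a_i[t]\, h[t-i,t-\tau]\} = \frac{T_s}{2\pi} \int_{\Omega} \mathcal{F}_{t\to\beta}\{a_i[t]\}\; \mathcal{F}_{t\to(\alpha-\beta)}\{h[t-i,t-\tau]\}\, d\beta \tag{4.A.3}$$

where

$$\mathcal{F}_{t\to\alpha}\{a_i[t]\} = A_i(\alpha)$$

$$\mathcal{F}_{t\to\alpha}\{h[t-i,t-\tau]\} = \mathcal{F}_{t\to\alpha}\{h[t-i,(t-i)-(\tau-i)]\} = h(\alpha,\tau-i]\, e^{-j\alpha i T_s}$$

Applying the obtained result into Equation (4.A.3) yields:

$$\mathcal{F}_{t\to\alpha}\{a_i[t]\, h[t-i,t-\tau]\} = \frac{T_s}{2\pi} \int_{\Omega} A_i(\beta)\, h(\alpha-\beta,\tau-i]\, e^{-j(\alpha-\beta) i T_s}\, d\beta = \frac{T_s}{2\pi}\, A_i(\alpha) \circledast_{\Omega} \left( h(\alpha,\tau-i]\, e^{-j\alpha i T_s} \right) \tag{4.A.4}$$

where $\circledast_{\Omega}$ indicates the periodic convolution operator on the interval $\Omega$. Then, according to the previous results, the spreading function takes the following form:

$$h(\alpha,\tau] = -\frac{T_s}{2\pi} \sum_{i=1}^{n_a} A_i(\alpha) \circledast_{\Omega} \left( h(\alpha,\tau-i]\, e^{-j\alpha i T_s} \right) + \sum_{i=0}^{n_c} C_i(\alpha)\, \delta[\tau-i] \tag{4.A.5}$$

where $\alpha \in \Omega$ and $\tau \in [0,\infty)$.

4.A.2 Proof for the Harmonic Impulse Response of Equation (4..)

The demonstration is very similar to the case of the spreading function described in the previous section. To begin with, the definition of the HIR in Equation (4..9a) is applied to the impulse response of the LTV system (assuming $N$-periodicity of the parameters), to yield:

$$h^{(N)}[k,\tau] = \mathcal{FS}_{t\to k}\{h[t,t-\tau]\} = \mathcal{FS}_{t\to k}\Big\{ -\sum_{i=1}^{n_a} a_i[t]\, h[t-i,t-\tau] + \sum_{i=0}^{n_c} c_i[t]\, \delta[\tau-i] \Big\} = -\sum_{i=1}^{n_a} \mathcal{FS}_{t\to k}\{a_i[t]\, h[t-i,t-\tau]\} + \sum_{i=0}^{n_c} \mathcal{FS}_{t\to k}\{c_i[t]\}\, \delta[\tau-i] \tag{4.A.6}$$

After applying the properties of the DFS as in the demonstration of the spreading function, it can be shown that:

$$\mathcal{FS}_{t\to k}\{a_i[t]\, h[t-i,t-\tau]\} = \frac{1}{N}\, \breve{a}_i[k] \circledast_{N} \left( h^{(N)}[k,\tau-i]\, e^{-jk\omega_o i T_s} \right)$$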

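where $\circledast_{N}$ indicates the $N$-point circular convolution operator (Oppenheim et al., 1999, pp ). Thus, the HIR becomes:

$$h^{(N)}[k,\tau] = -\frac{1}{N} \sum_{i=1}^{n_a} \breve{a}_i[k] \circledast_{N} \left( h^{(N)}[k,\tau-i]\, e^{-jk\omega_o i T_s} \right) + \sum_{i=0}^{n_c} \breve{c}_i[k]\, \delta[\tau-i] \tag{4.A.7}$$

where $k = 0,1,\ldots,N-1$ and $\tau \in [0,\infty)$. The obtained HIR can be reorganized in vector form, based on the HIR vector:

$$\mathbf{h}^{(N)}[\tau] = \begin{bmatrix} h^{(N)}[0,\tau] & h^{(N)}[1,\tau] & \cdots & h^{(N)}[N-1,\tau] \end{bmatrix}^T$$

Then,

$$\begin{bmatrix} h^{(N)}[0,\tau] \\ h^{(N)}[1,\tau] \\ \vdots \\ h^{(N)}[N-1,\tau] \end{bmatrix} = -\sum_{i=1}^{n_a} \begin{bmatrix} \omega_N^{(0)i} \left( \breve{a}_i[0]\, h^{(N)}[0,\tau-i] + \breve{a}_i[1]\, h^{(N)}[-1,\tau-i] + \cdots + \breve{a}_i[N-1]\, h^{(N)}[-N+1,\tau-i] \right) \\ \omega_N^{(1)i} \left( \breve{a}_i[0]\, h^{(N)}[1,\tau-i] + \breve{a}_i[1]\, h^{(N)}[0,\tau-i] + \cdots + \breve{a}_i[N-1]\, h^{(N)}[-N+2,\tau-i] \right) \\ \vdots \\ \omega_N^{(N-1)i} \left( \breve{a}_i[0]\, h^{(N)}[N-1,\tau-i] + \breve{a}_i[1]\, h^{(N)}[N-2,\tau-i] + \cdots + \breve{a}_i[N-1]\, h^{(N)}[0,\tau-i] \right) \end{bmatrix} + \sum_{i=0}^{n_c} \begin{bmatrix} \breve{c}_i[0] \\ \breve{c}_i[1] \\ \vdots \\ \breve{c}_i[N-1] \end{bmatrix} \delta[\tau-i] \tag{4.A.8}$$

Using the property of periodicity of the Fourier transform coefficients and reordering the values leads to the following recursive form of the HIR:

$$\begin{bmatrix} h^{(N)}[0,\tau] \\ h^{(N)}[1,\tau] \\ \vdots \\ h^{(N)}[N-1,\tau] \end{bmatrix} = -\sum_{i=1}^{n_a} \begin{bmatrix} \omega_N^{(0)i} \left( \breve{a}_i[0]\, h^{(N)}[0,\tau-i] + \breve{a}_i[1]\, h^{(N)}[N-1,\tau-i] + \cdots + \breve{a}_i[N-1]\, h^{(N)}[1,\tau-i] \right) \\ \omega_N^{(1)i} \left( \breve{a}_i[0]\, h^{(N)}[1,\tau-i] + \breve{a}_i[1]\, h^{(N)}[0,\tau-i] + \cdots + \breve{a}_i[N-1]\, h^{(N)}[2,\tau-i] \right) \\ \vdots \\ \omega_N^{(N-1)i} \left( \breve{a}_i[0]\, h^{(N)}[N-1,\tau-i] + \breve{a}_i[1]\, h^{(N)}[N-2,\tau-i] + \cdots + \breve{a}_i[N-1]\, h^{(N)}[0,\tau-i] \right) \end{bmatrix} + \sum_{i=0}^{n_c} \begin{bmatrix} \breve{c}_i[0] \\ \breve{c}_i[1] \\ \vdots \\ \breve{c}_i[N-1] \end{bmatrix} \delta[\tau-i]$$

$$\begin{bmatrix} h^{(N)}[0,\tau] \\ h^{(N)}[1,\tau] \\ \vdots \\ h^{(N)}[N-1,\tau] \end{bmatrix} = -\sum_{i=1}^{n_a} \begin{bmatrix} \breve{a}_i[0]\, \omega_N^{(0)i} & \breve{a}_i[N-1]\, \omega_N^{(0)i} & \cdots & \breve{a}_i[1]\, \omega_N^{(0)i} \\ \breve{a}_i[1]\, \omega_N^{(1)i} & \breve{a}_i[0]\, \omega_N^{(1)i} & \cdots & \breve{a}_i[2]\, \omega_N^{(1)i} \\ \vdots & \vdots & \ddots & \vdots \\ \breve{a}_i[N-1]\, \omega_N^{(N-1)i} & \breve{a}_i[N-2]\, \omega_N^{(N-1)i} & \cdots & \breve{a}_i[0]\, \omega_N^{(N-1)i} \end{bmatrix} \begin{bmatrix} h^{(N)}[0,\tau-i] \\ h^{(N)}[1,\tau-i] \\ \vdots \\ h^{(N)}[N-1,\tau-i] \end{bmatrix} + \sum_{i=0}^{n_c} \begin{bmatrix} \breve{c}_i[0] \\ \breve{c}_i[1] \\ \vdots \\ \breve{c}_i[N-1] \end{bmatrix} \delta[\tau-i]$$

which writes in simple form as

$$\mathbf{h}^{(N)}[\tau] = -\sum_{i=1}^{n_a} \boldsymbol{\Omega}^{i}\, \breve{\mathbf{A}}_i\, \mathbf{h}^{(N)}[\tau-i] + \sum_{i=0}^{n_c} \breve{\mathbf{c}}_i\, \delta[\tau-i] \tag{4.A.9}$$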

where

$$\breve{\mathbf{A}}_i = \begin{bmatrix} \breve{a}_i[0] & \breve{a}_i[N-1] & \cdots & \breve{a}_i[1] \\ \breve{a}_i[1] & \breve{a}_i[0] & \cdots & \breve{a}_i[2] \\ \vdots & \vdots & \ddots & \vdots \\ \breve{a}_i[N-1] & \breve{a}_i[N-2] & \cdots & \breve{a}_i[0] \end{bmatrix}, \qquad \boldsymbol{\Omega} = \mathrm{diag}\left( \omega^{k}/N,\ k = 0,\ldots,N-1 \right)$$

Appendix 4.B Time and frequency domain analysis of stationary processes

Let $y[t]$ be a real-valued random signal defined over the discrete time $t \in \mathbb{Z}$, with $f_s$ being its sampling frequency and with an associated probability defined by the probability density function $p(y[t])$. For simplicity, in the present formalization $y[t]$ refers to both the stochastic process and a realization of the stochastic process.

Stationarity (in the strict sense) is the property of a stochastic process whose probabilistic behavior remains the same throughout time (Flandrin, 1989). This requires that the probability density function associated with the time series has no time dependency, or equivalently, that all the statistical moments of the random process remain unchanged with time. The less restrictive definition of wide-sense stationarity requires only that the first and second statistical moments remain unchanged through time. Then, a time series is wide-sense stationary if the following two conditions hold (Bendat and Piersol, ):

$$\text{Condition 1:}\quad E\{y[t]\} = \mu_y \tag{4.B.1a}$$

$$\text{Condition 2:}\quad E\{(y[t]-\mu_y)\,(y[t-\tau]-\mu_y)\} = \gamma_{yy}[\tau] \tag{4.B.1b}$$

where $t,\tau \in \mathbb{Z}$ denote the normalized discrete time and time delay, respectively, and $E\{\cdot\}$ represents the statistical expectation operator. Condition 1 indicates that the mean of a stationary process is constant, and Condition 2 indicates that the AutoCorrelation Function (ACF) depends only on the time delay $\tau$. In the case of Gaussian processes, stationarity in the strict sense is satisfied if wide-sense stationarity is proven, since this type of process is fully described by its mean and covariance.
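The two conditions above can be probed empirically on a single realization. The sketch below is illustrative only and is not a formal stationarity test: it assumes synthetic Gaussian white noise, a hypothetical slowly modulated counterpart, and helper names (`sample_acf`, `segment_moments`) introduced here for the example. It compares segment-wise variances and computes the biased sample ACF.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Biased sample autocovariance gamma[tau], tau = 0..max_lag."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    n = d.size
    return np.array([np.dot(d[tau:], d[:n - tau]) / n for tau in range(max_lag + 1)])

def segment_moments(y, n_segments):
    """Mean and variance of consecutive, non-overlapping segments."""
    segments = np.array_split(np.asarray(y, dtype=float), n_segments)
    means = np.array([s.mean() for s in segments])
    variances = np.array([s.var() for s in segments])
    return means, variances

rng = np.random.default_rng(0)
n = 20000
t = np.arange(n)
w = rng.standard_normal(n)

y_stationary = w                                          # constant mean and ACF
y_modulated = (1.0 + 0.8 * np.cos(2 * np.pi * t / n)) * w  # variance changes with time

acf = sample_acf(y_stationary, 5)
_, var_stat = segment_moments(y_stationary, 4)
_, var_mod = segment_moments(y_modulated, 4)
```

For the white-noise signal the segment variances stay close to each other and the sample ACF is negligible for non-zero lags, whereas the modulated signal shows segment variances that differ by a large factor, so Condition 2 cannot hold for it.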
ACF and spectrum

A stationary time series can be represented in the frequency domain using the Fourier-Stieltjes transform (Priestley, 1967; Flandrin, 1989; Grenier, 1989):

$$y[t] = \frac{T_s}{2\pi} \int_{\Omega} e^{j\omega t T_s}\, dY(\omega) \tag{4.B.2}$$

where $\omega = 2\pi f$, with $\omega \in \Omega = [-\pi/T_s, \pi/T_s]$, is the frequency in radians per second, $f$ is the frequency in Hertz, $T_s$ is the sampling period, and $dY(\omega)$ is also a complex-valued random signal uniquely determined by $y[t]$. The function $dY(\omega)$ describes the distribution of the energy of the time series $y[t]$ in the frequency component $e^{j\omega t T_s}$.

The representation provided by Equation (4.B.2) has the property that the representation basis (complex exponential basis) is orthogonal. Besides, the spectral increments $dY(\omega_1)$ and $dY(\omega_2)$ are uncorrelated random variables for $\omega_1 \neq \omega_2$ if the process is stationary ($E\{dY(\omega_1)\, dY^*(\omega_2)\} = 0$, $\omega_1 \neq \omega_2$). The converse also applies, namely, if the spectral increments $dY(\omega_1)$ and $dY(\omega_2)$ are uncorrelated, then the signal is stationary (Priestley, 1967; Antoni, 2007). So, for this type of Fourier representation it holds that (Flandrin, 1989; Grenier, 1989):

$$\left\langle e^{j\omega_1 t}, e^{j\omega_2 t} \right\rangle = \frac{1}{T} \sum_{t=1}^{T} \left( e^{j\omega_1 t T_s}\, e^{-j\omega_2 t T_s} \right) = \delta(\omega_1 - \omega_2)$$

$$E\{dY(\omega_1)\, dY^*(\omega_2)\} = \delta(\omega_1 - \omega_2)\, S_{yy}(\omega_1)\, d\omega_1\, d\omega_2$$

where $\langle \cdot, \cdot \rangle$ indicates the inner product, and $\delta(\omega)$ is the delta function. The function $S_{yy}(\omega)$ is known as the Power Spectral Density (PSD) and measures the average distribution of power of the process at each frequency $\omega$. The

Wiener-Khintchine theorem states that the ACF and the PSD are related by the Fourier transform, as follows (Priestley, 1967; Flandrin, 1989):

$$S_{yy}(\omega) = \mathcal{F}\{\gamma_{yy}[\tau]\} = \sum_{\tau=-\infty}^{\infty} \gamma_{yy}[\tau]\, e^{-j\omega\tau T_s} \tag{4.B.3a}$$

$$\gamma_{yy}[\tau] = \mathcal{F}^{-1}\{S_{yy}(\omega)\} = \frac{T_s}{2\pi} \int_{\Omega} S_{yy}(\omega)\, e^{j\omega\tau T_s}\, d\omega \tag{4.B.3b}$$

Since, by definition, the ACF $\gamma_{yy}[\tau]$ is a positive definite and symmetric function, $S_{yy}(\omega)$ is also positive and symmetric. Further properties of the ACF can be found for example in (Bendat and Piersol, , Ch. 5) and (Manolakis et al., 2005, Ch. 3). The advantage of this representation is that the presence of complex exponentials as basis functions provides a useful physical interpretation of the signal in terms of a decomposition of the signal power into oscillatory components with different frequencies.

The main differences between the functions $Y(\omega)$ and $S_{yy}(\omega)$ can be summarized as follows (Priestley, 1967):

(i) $Y(\omega)$ is a complex-valued random process uniquely determined by the particular realization $y[t]$. On the other hand, $S_{yy}(\omega)$ is a deterministic function describing the whole process $\{y[t]\}$.

(ii) $Y(\omega)$ is a decomposition of the signal's energy, while $S_{yy}(\omega)$ is a decomposition of the signal's average power.

(iii) $Y(\omega)$ is in general not differentiable (its derivative will not be finite for some values of $\omega$).

LTI systems with random input

Consider a Linear Time Invariant (LTI) system with a bounded impulse response $h[t]$, with input $x[t]$ and output $y[t]$. The input and output of the LTI system are related by the convolution sum, defined as follows:

$$y[t] = h[t] \star x[t] = \sum_{\tau=-\infty}^{\infty} h[t-\tau]\, x[\tau] = \sum_{\tau=-\infty}^{\infty} h[\tau]\, x[t-\tau] \tag{4.B.4}$$

Provided that the input $x[t]$ is a zero-mean stochastic process, the output $y[t]$ is also a zero-mean stochastic process for which the following relationships are satisfied (Manolakis et al., 2005, Ch.
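3)

$$\gamma_{xy}[\tau] = E\{x[t]\, y[t-\tau]\} = \sum_{t=-\infty}^{\infty} h^*[t-\tau]\, \gamma_{xx}[t] = h^*[-\tau] \star \gamma_{xx}[\tau] \tag{4.B.5a}$$

$$\gamma_{yx}[\tau] = E\{y[t]\, x[t-\tau]\} = \sum_{t=-\infty}^{\infty} h[\tau-t]\, \gamma_{xx}[t] = h[\tau] \star \gamma_{xx}[\tau] \tag{4.B.5b}$$

$$\gamma_{yy}[\tau] = E\{y[t]\, y[t-\tau]\} = \sum_{t_1=-\infty}^{\infty} \sum_{t_2=-\infty}^{\infty} h[t_1]\, h^*[t_2]\, \gamma_{xx}[\tau + t_2 - t_1] = h[\tau] \star h^*[-\tau] \star \gamma_{xx}[\tau] \tag{4.B.5c}$$

Footnote: The demonstration of the Wiener-Khintchine theorem is as follows. Starting from the definition of the ACF, it follows that

$$\gamma_{yy}[\tau] = E\{y[t]\, y[t-\tau]\} = E\left\{ \left( \frac{T_s}{2\pi} \int_{\Omega} e^{j\omega_1 t T_s}\, dY(\omega_1) \right) \left( \frac{T_s}{2\pi} \int_{\Omega} e^{-j\omega_2 (t-\tau) T_s}\, dY^*(\omega_2) \right) \right\} = \frac{T_s^2}{4\pi^2} \int_{\Omega} \int_{\Omega} e^{j\omega_1 t T_s}\, e^{-j\omega_2 (t-\tau) T_s}\, E\{dY(\omega_1)\, dY^*(\omega_2)\}$$

Since $E\{dY(\omega_1)\, dY^*(\omega_2)\} = \delta(\omega_1 - \omega_2)\, S_{yy}(\omega_1)\, d\omega_1\, d\omega_2$, then

$$\gamma_{yy}[\tau] = \frac{T_s^2}{4\pi^2} \int_{\Omega} \int_{\Omega} e^{j\omega_1 t T_s}\, e^{-j\omega_2 (t-\tau) T_s}\, \delta(\omega_1 - \omega_2)\, S_{yy}(\omega_1)\, d\omega_1\, d\omega_2 = \frac{T_s}{2\pi} \int_{\Omega} S_{yy}(\omega_1)\, e^{j\omega_1 t T_s} \left( \frac{T_s}{2\pi} \int_{\Omega} e^{-j\omega_2 (t-\tau) T_s}\, \delta(\omega_1 - \omega_2)\, d\omega_2 \right) d\omega_1$$

$$\gamma_{yy}[\tau] = \frac{T_s}{2\pi} \int_{\Omega} S_{yy}(\omega_1)\, e^{j\omega_1 t T_s}\, e^{-j\omega_1 (t-\tau) T_s}\, d\omega_1 = \frac{T_s}{2\pi} \int_{\Omega} S_{yy}(\omega_1)\, e^{j\omega_1 \tau T_s}\, d\omega_1$$

thus demonstrating the second equation of the Wiener-Khintchine theorem. The first equation follows from the duality of the discrete-time Fourier transform.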

where $\gamma_{xy}[\tau]$ is the Cross-Covariance Function (CCF) between input and output, and $\gamma_{yy}[\tau]$ is the ACF of the output. The CCF and ACF take simpler forms if the system is excited by a Normally and Independently Distributed (NID) process (Manolakis et al., 2005, Ch. 3):

$$\gamma_{xy}[\tau] = \sigma_x^2\, h[-\tau] \tag{4.B.6a}$$

$$\gamma_{yx}[\tau] = \sigma_x^2\, h[\tau] \tag{4.B.6b}$$

$$\gamma_{yy}[\tau] = \sigma_x^2\, h[\tau] \star h[-\tau] \tag{4.B.6c}$$

where $\sigma_x^2$ is the variance of $x[t]$.

The transfer function is defined as the response of the system to an exponential excitation normalized by the excitation itself, namely

$$H(\omega) = \frac{\text{response of the system to } e^{j\omega t T_s}}{e^{j\omega t T_s}} \tag{4.B.7}$$

which, by using the convolution representation, leads to

$$H(\omega) = \left. \frac{h[t] \star w[t]}{w[t]} \right|_{w[t] = e^{j\omega t T_s}} = \frac{\sum_{\tau=-\infty}^{\infty} h[\tau]\, e^{j\omega (t-\tau) T_s}}{e^{j\omega t T_s}} = \sum_{\tau=-\infty}^{\infty} h[\tau]\, e^{-j\omega\tau T_s} \tag{4.B.8}$$

It is easy to see that the transfer function $H(\omega)$ corresponds to the Fourier transform of the impulse response function $h[\tau]$. In fact, the impulse response can be reconstructed from the transfer function by means of the inverse Fourier transform

$$h[\tau] = \frac{T_s}{2\pi} \int_{\Omega} H(\omega)\, e^{j\omega\tau T_s}\, d\omega \tag{4.B.9}$$

The transfer function can be used to relate the Fourier transforms of the input and the output using the relation

$$Y(\omega) = H(\omega)\, X(\omega) \tag{4.B.10}$$

from which it can be demonstrated that the input-output spectrum and the spectrum of the output are of the form (Manolakis et al., 2005, Ch. 3):

$$S_{xy}(\omega) = H^*(\omega)\, S_{xx}(\omega) \tag{4.B.11a}$$

$$S_{yx}(\omega) = H(\omega)\, S_{xx}(\omega) \tag{4.B.11b}$$

$$S_{yy}(\omega) = H(\omega)\, S_{xx}(\omega)\, H^*(\omega) = |H(\omega)|^2\, S_{xx}(\omega) \tag{4.B.11c}$$

ARMA models

Let $y[t]$ be a realization of an AutoRegressive Moving Average (ARMA) model of the form

$$y[t] + \sum_{i=1}^{n_a} a_i\, y[t-i] = \sum_{i=1}^{n_c} c_i\, w[t-i] + w[t], \qquad w[t] \sim \mathrm{NID}(0, \sigma_w^2) \tag{4.B.12}$$

where $a_i$ and $c_i$ are the AR and MA parameters, with corresponding orders $n_a$ and $n_c$, and $w[t]$ are stationary NID innovations with zero mean and variance $\sigma_w^2$. The response of an ARMA model can be characterized by the impulse response and the transfer function.
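The relations for an LTI system driven by an NID input can be checked numerically on an ARMA system. The following is a minimal sketch under stated assumptions: the AR/MA values, the variance, the truncation length and the function name `arma_impulse_response` are hypothetical, and $T_s = 1$ is used so that frequencies are in rad/sample. The impulse response is obtained by driving the ARMA difference equation with a unit impulse; the output ACF is then formed as $\sigma^2\, h[\tau] \star h[-\tau]$, and its Fourier transform is compared with $\sigma^2 |H(\omega)|^2$.

```python
import numpy as np

def arma_impulse_response(a, c, n):
    """Impulse response h[0..n-1] of y[t] + sum_i a_i y[t-i] = w[t] + sum_i c_i w[t-i]."""
    c_full = np.concatenate(([1.0], c))      # c_0 = 1 multiplies w[t]
    h = np.zeros(n)
    for tau in range(n):
        # MA part: contribution of the unit impulse delayed by tau samples
        acc = c_full[tau] if tau < c_full.size else 0.0
        # AR part: feedback from previous values of the response
        for i, a_i in enumerate(a, start=1):
            if tau - i >= 0:
                acc -= a_i * h[tau - i]
        h[tau] = acc
    return h

a = np.array([-1.2, 0.72])   # hypothetical AR parameters (stable, pole radius ~0.85)
c = np.array([0.4])          # hypothetical MA parameter
sigma2 = 2.0                 # hypothetical innovations variance
n = 400                      # truncation length; h has decayed to ~0 by then

h = arma_impulse_response(a, c, n)

# Output ACF for NID input: gamma_yy[tau] = sigma2 * (h[tau] convolved with h[-tau])
gamma = sigma2 * np.correlate(h, h, mode="full")
lags = np.arange(-(n - 1), n)

omega = np.linspace(-np.pi, np.pi, 129)                  # rad/sample (T_s = 1)
H = np.exp(-1j * np.outer(omega, np.arange(n))) @ h      # truncated Fourier sum of h
S_from_H = sigma2 * np.abs(H) ** 2                       # sigma2 * |H(omega)|^2
S_from_acf = (np.exp(-1j * np.outer(omega, lags)) @ gamma).real  # Fourier transform of the ACF
```

Both routes give the same spectrum up to rounding, which is the Wiener-Khintchine relation applied to the output of the filter.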
The impulse response function is defined as the response of a system to an impulse function $\delta[t]$ at time $t = \tau = 0$. The impulse response function $h[\tau]$ of the ARMA model is calculated as:

$$h[\tau] = -\sum_{i=1}^{n_a} a_i\, h[\tau-i] + \sum_{i=0}^{n_c} c_i\, \delta[\tau-i] \tag{4.B.13}$$

where $h[\tau] = 0$ for $\tau < 0$ and $c_0 = 1$. The transfer function of an ARMA model takes the well-known rational form (obtained by taking the Fourier transform of Equation (4.B.13)):

$$H(j\omega) = \frac{1 + \sum_{i=1}^{n_c} c_i\, e^{-j\omega T_s i}}{1 + \sum_{i=1}^{n_a} a_i\, e^{-j\omega T_s i}} \tag{4.B.14}$$

Using Equation (4.B.11), it can be demonstrated that the PSD of an ARMA model can be computed using the relationship (Manolakis et al., 2005, Ch. 3):

$$S_{yy}(\omega) = \sigma_w^2\, |H(j\omega)|^2 = \sigma_w^2\, \frac{\left| 1 + \sum_{i=1}^{n_c} c_i\, e^{-j\omega T_s i} \right|^2}{\left| 1 + \sum_{i=1}^{n_a} a_i\, e^{-j\omega T_s i} \right|^2} \tag{4.B.15}$$

Equations (4..8) and (4.B.15) provide closed-form expressions relating the parameters of an ARMA model to the ACF and the PSD of the stationary process $y[t]$. The remainder of this work aims to provide similar relationships for the case of non-stationary TARMA processes.

Appendix 4.C Time and frequency domain analysis of non-stationary processes

4.C.1 Non-stationary random processes

The term non-stationarity (in the wide sense) refers to the property of a random signal whose first two statistical moments are functions of time. Therefore, a random signal $y[t]$ is called non-stationary (in the wide sense) if at least one of the following conditions holds (Antoni, 2007; Poulimenos and Fassois, 2006):

$$\text{Condition 1 (mean):}\quad \mu_y[t] = E\{y[t]\} = \int_{-\infty}^{\infty} y[t]\, dP(y[t]) \tag{4.C.1a}$$

$$\text{Condition 2 (ACF):}\quad \gamma_{yy}[t_1,t_2] = E\{(y[t_1]-\mu_y[t_1])\,(y[t_2]-\mu_y[t_2])\} = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} (y[t_1]-\mu_y[t_1])\,(y[t_2]-\mu_y[t_2])\, dP(y[t_1],y[t_2]) \tag{4.C.1b}$$

where $P(y[t])$ is the probability distribution function of the random variable $y[t]$, while $P(y[t_1],y[t_2])$ is the joint probability distribution function of $y[t]$ at two time instants $t_1, t_2 \in \mathbb{Z}$. The first condition indicates a time-dependent mean, while the second indicates that the AutoCovariance Function (ACF) depends on the two time variables $t_1$ and $t_2$. Notice that the ACF in the non-stationary case can be found with different notations for the time delay variables.
In general, two different notations may be used for the ACF (Antoni, 2007):

$$\gamma_{yy}[t,t-\tau] = E\{y[t]\, y[t-\tau]\} \qquad \text{(non-symmetric ACF)}$$

$$\gamma_{yy}[t+\tau,t-\tau] = E\{y[t+\tau]\, y[t-\tau]\} \qquad \text{(symmetric ACF)}$$

where in the first case $t_1 = t$ and $t_2 = t-\tau$, while in the second $t_1 = t+\tau$ and $t_2 = t-\tau$. The difference between the two forms is that in the non-symmetric case the covariance is analyzed between the current time instant and another time instant separated by $\tau$ time units, while in the symmetric case the covariance is analyzed between signal components separated by $\tau$ on either side of the current time instant. Nonetheless, both notations are equivalent under a simple change of variables.

The relation between the axes of the $t_1$-$t_2$ plane and the domains of the symmetric and non-symmetric cases is depicted in Figure 4.C.1. In the non-symmetric case, the axes of the $t$-$\tau$ plane are associated with oblique vectors on the $t_1$-$t_2$ plane. Specifically, the $t_2$ axis (where $t = 0$) corresponds to the $\tau$ axis, while the line $t_1 = t_2$ corresponds to the $t$ axis, as can be seen in Figure 4.C.1(a). Moreover, the line $t_1 = t_2$ passes through all the points where $\tau = 0$ and also associates unit increments on $t_1$ and $t_2$.

On the other hand, in the symmetric case shown in Figure 4.C.1(b), the axes of the $t$-$\tau$ plane represent orthogonal vectors on the $t_1$-$t_2$ plane, corresponding to the lines $t_1 = t_2$ and $t_1 = -t_2$, respectively. However, along the line $t_1 = -t_2$, a single unit increment in $\tau$ corresponds to a double increment on the $t_1$-$t_2$ plane. This occurs because the separation between the two signal components $y[t+\tau]$ and $y[t-\tau]$ appearing in the definition of the symmetric ACF is two times $\tau$. This effect is unavoidable in a discrete-time implementation, and has the effect of down-sampling the ACF by a factor of two.
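The two indexing conventions, and the down-sampling by two in the symmetric case, can be illustrated on a two-time ACF stored as a matrix. The sketch below is illustrative only: it assumes a hypothetical stationary ACF $\gamma[\tau] = \rho^{|\tau|}$ (chosen because the result is known in closed form), and the helper names are introduced here for the example.

```python
import numpy as np

rho = 0.9
n = 21
idx = np.arange(n)

# Two-time ACF matrix G[t1, t2]; here a stationary example with gamma[tau] = rho^|tau|
G = rho ** np.abs(idx[:, None] - idx[None, :])

def nonsymmetric_acf(G, t, tau):
    """gamma[t, t - tau], i.e. t1 = t and t2 = t - tau."""
    return G[t, t - tau]

def symmetric_acf(G, t, tau):
    """gamma[t + tau, t - tau], i.e. t1 = t + tau and t2 = t - tau."""
    return G[t + tau, t - tau]

t0 = 10                      # time instant in the middle of the record
taus = np.arange(0, 5)       # small lags, so that all indices stay inside the matrix
nonsym = np.array([nonsymmetric_acf(G, t0, tau) for tau in taus])
sym = np.array([symmetric_acf(G, t0, tau) for tau in taus])
```

Here `nonsym` equals $\rho^{\tau}$ while `sym` equals $\rho^{2\tau}$: a unit step in $\tau$ of the symmetric form spans two samples, so the symmetric ACF only visits the even lags of the underlying correlation.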

Figure 4.C.1: Comparison of the analysis domains of the ACF: (a) Non-symmetric ACF with $t_1 = t$ and $t_2 = t-\tau$. The axes corresponding to the $(t,\tau)$ plane are oblique vectors on the $(t_1,t_2)$ plane, but the unit increments on both planes are commensurate. (b) Symmetric ACF with $t_1 = t+\tau$ and $t_2 = t-\tau$. The axes on the $(t,\tau)$ plane are orthogonal vectors on the $(t_1,t_2)$ plane, but a unit increment on $t$ or $\tau$ corresponds to two unit increments on $(t_1,t_2)$.

This work focuses on Gaussian zero-mean random signals with non-stationary and continually evolving ACF. The zero-mean assumption is adopted because in many applications the mean is zero or constant. In the latter case the mean can be estimated in an initial stage and subsequently subtracted from the signal. The case of a time-dependent mean (also known as deterministic trend) may be treated in a similar fashion using proper techniques, such as curve fitting, high-pass filtering, or special parametric models, such as integrated models with a deterministic trend parameter (Box et al., 1994; Kitagawa and Gersch, 1996, Ch. 8).

So far, non-stationarity has been defined by the absence of a property rather than by a property of its own, which is not sufficient to specify the characteristics of the signals that are considered. In particular, a proper discrimination of the signal structure is necessary to be able to define important properties, such as the spectral representation. In the following part of this work, the two main classes of non-stationary processes, namely harmonizable and oscillatory processes, are defined, along with the particular definitions of time and frequency domain power representations that stem from the respective definitions.

4.C.2
Harmonizable random processes

The class of harmonizable processes is composed of the set of non-stationary processes that can be represented as a sum of complex exponential functions. Therefore, harmonizable processes make use of the following Fourier-Stieltjes spectral representation for the random signal $y[t]$ (Grenier, 1989; Antoni, 2007):

$$y[t] = \frac{T_s}{2\pi} \int_{\Omega} e^{j\omega t T_s}\, dY(\omega) \tag{4.C.2}$$

where $\Omega = [-\pi/T_s, \pi/T_s]$ is the domain of integration, $\omega = 2\pi f \in \Omega$ is the frequency in radians per second (with $f$ being a frequency in Hertz), and $dY(\omega)$ is a strictly increasing positive function (or distribution function), which represents a spectral increment accounting for the weight of the frequency component $e^{j\omega t T_s}$ in the signal.

To understand Equation (4.C.2), consider first that the spectral increment $dY(\omega)$ can be approximated as $\Delta Y(\omega) = Y(\omega + \Delta\omega) - Y(\omega)$ for a small value of $\Delta\omega$. Moreover, it follows that $\Delta Y(\omega) = \tilde{y}(\omega)\, \Delta\omega$, indicating that the spectral increment can be substituted by a function $\tilde{y}(\omega)$ multiplied by the frequency increment $\Delta\omega$. Therefore, Equation (4.C.2) can be approximated as:

$$y[t] \approx \frac{T_s}{2\pi} \sum_{\omega \in \Omega_d} \left( \tilde{y}(\omega)\, \Delta\omega \right) e^{j\omega t T_s}$$

where $\Omega_d$ is a set of discrete frequencies determined by $\Delta\omega$. In this case, it is clear that the spectral increment $\Delta Y(\omega) = \tilde{y}(\omega)\, \Delta\omega$ describes the weight of the frequency component $e^{j\omega t T_s}$ in the signal. The same analysis can be extended to the case when the limit $\Delta\omega \to 0$ applies. Consequently, $\tilde{y}(\omega)$ becomes the derivative of $Y(\omega)$ with respect to $\omega$, namely $y(\omega) = dY(\omega)/d\omega$. Nonetheless, the derivative $y(\omega)$ may not be well defined for all possible values of $\omega$. Therefore, the (infinitesimal) spectral increment $dY(\omega)$ is used instead of $y(\omega)\, d\omega$.

The structure of the ACF may be analyzed based on the Fourier-Stieltjes spectral representation. For this purpose, Equation (4.C.2) is substituted into the definition of the ACF in Equation (4.C.1b), thus obtaining:

$$\gamma_{yy}[t_1,t_2] = E\left\{ \left( \frac{T_s}{2\pi} \int_{\Omega} e^{j\omega_1 t_1 T_s}\, dY(\omega_1) \right) \left( \frac{T_s}{2\pi} \int_{\Omega} e^{j\omega_2 t_2 T_s}\, dY(\omega_2) \right) \right\} = \frac{T_s^2}{4\pi^2} \int_{\Omega} \int_{\Omega} e^{j(\omega_1 t_1 + \omega_2 t_2) T_s}\, E\{dY(\omega_1)\, dY(\omega_2)\} \tag{4.C.3}$$

Again, to understand the quantity $E\{dY(\omega_1)\, dY(\omega_2)\}$, let us substitute the approximate spectral increment $\Delta Y(\omega) = \tilde{y}(\omega)\, \Delta\omega$, to obtain:

$$E\{dY(\omega_1)\, dY(\omega_2)\} \approx E\{\Delta Y(\omega_1)\, \Delta Y(\omega_2)\} = E\{\tilde{y}(\omega_1)\, \tilde{y}(\omega_2)\}\, \Delta\omega_1\, \Delta\omega_2$$

Notice that the quantity $E\{\tilde{y}(\omega_1)\, \tilde{y}(\omega_2)\}$ implies a correlation between the spectral components associated with the function $\tilde{y}(\omega)$ at different values of $\omega$. Furthermore, in the limit when $\Delta\omega$ tends to zero, the approximation shown in the previous equation becomes exact (if the limit exists), thus yielding the expression:

$$E\{dY(\omega_1)\, dY(\omega_2)\} = S_{yy}(\omega_1,\omega_2)\, d\omega_1\, d\omega_2 \tag{4.C.4}$$

where the quantity $S_{yy}(\omega_1,\omega_2)$ is referred to as the spectral correlation (density) or the two-dimensional power spectral density (2D-PSD) function. However, the former expression more clearly elucidates the actual characteristic of this function, which describes the correlation between two spectral components with frequencies $\omega_1$ and $\omega_2$ (Antoni, 2007, 2009; Flandrin, 1989; Gardner, 1986; Giannakis, 1999; Priestley, 1967).
Substituting the result in Equation (4.C.4) into the ACF expression in Equation (4.C.3) leads to the familiar expression:

$$\gamma_{yy}[t_1,t_2] = \frac{T_s^2}{4\pi^2} \int_{\Omega} \int_{\Omega} S_{yy}(\omega_1,\omega_2)\, e^{j(\omega_1 t_1 + \omega_2 t_2) T_s}\, d\omega_1\, d\omega_2 = \mathcal{F}^{-1}_{\omega_1 \to t_1}\left\{ \mathcal{F}^{-1}_{\omega_2 \to t_2}\{S_{yy}(\omega_1,\omega_2)\} \right\} \tag{4.C.5}$$

which indicates that the ACF is the inverse double Fourier transform of the spectral correlation. The reciprocal also holds, namely the spectral correlation is the double discrete-time Fourier transform of the ACF,

$$S_{yy}(\omega_1,\omega_2) = \sum_{t_1=-\infty}^{\infty} \sum_{t_2=-\infty}^{\infty} \gamma_{yy}[t_1,t_2]\, e^{-j(\omega_1 t_1 + \omega_2 t_2) T_s} = \mathcal{F}_{t \to \alpha}\left\{ \mathcal{F}_{\tau \to \omega}\{\gamma_{yy}[t_1,t_2]\} \right\} \tag{4.C.6}$$

According to the above, the spectral correlation $S_{yy}(\omega_1,\omega_2)$ and the ACF $\gamma_{yy}[t_1,t_2]$ form a transform pair, while Equation (4.C.5) and Equation (4.C.6) are the generalization of the Wiener-Khintchine theorem for the case of non-stationary harmonizable processes (Antoni, 2007; Flandrin, 1989; Gardner, 1986; Giannakis, 1999).

As for the case of the ACF, the spectral correlation can be represented under different domain configurations. The transformation between representation domains in time and frequency can be done in terms of the exponents inside the sum in Equation (4.C.6). Firstly, consider the change of variables $t_1 = t$ and $t_2 = t-\tau$ for the non-symmetric ACF, under which the argument of the exponential function becomes:

$$\omega_1 t_1 + \omega_2 t_2 = \omega_1 t + \omega_2 (t-\tau) = (\omega_1 + \omega_2)\, t - \omega_2 \tau$$

which suggests a change of variables in the frequency domain of the form $\omega = -\omega_2$ and $\alpha = \omega_1 + \omega_2$, or $\omega_1 = \omega + \alpha$ and $\omega_2 = -\omega$. Following a similar procedure, the change of variables $t_1 = t+\tau$ and $t_2 = t-\tau$ used in the symmetric ACF leads to the change of variables in the frequency domain $\alpha = \omega_1 + \omega_2$ and $\omega = \omega_1 - \omega_2$, or

equivalently $\omega_1 = \tfrac{1}{2}(\alpha + \omega)$, $\omega_2 = \tfrac{1}{2}(\alpha - \omega)$. This leads to the following parametrizations of the spectral correlation:

$$S_{yy}(\omega+\alpha, -\omega) = \mathcal{F}_{\tau \to \omega}\left\{ \mathcal{F}_{t \to \alpha}\{\gamma[t,t-\tau]\} \right\} \qquad \text{(non-symmetric spectral correlation)}$$

$$S_{yy}\left( \tfrac{1}{2}(\alpha+\omega), \tfrac{1}{2}(\alpha-\omega) \right) = \mathcal{F}_{\tau \to \omega}\left\{ \mathcal{F}_{t \to \alpha}\{\gamma[t+\tau,t-\tau]\} \right\} \qquad \text{(symmetric spectral correlation)}$$

The relationships among the axes in the different parametrizations are analogous to those shown in Figure 4.C.1 for the ACF. Notice also that the effect of the double time steps in the symmetric ACF appears as half frequency steps in the spectral correlation.

Example. The spectral correlation of a stationary process: In a stationary process, the ACF is of the form $\gamma_{yy}[t_1,t_2] = \gamma_{yy}[t_1 - t_2]$. In such a case, according to Equation (4.C.6), the spectral correlation is:

$$S_{yy}(\omega_1,\omega_2) = \sum_{t_1=-\infty}^{\infty} \sum_{t_2=-\infty}^{\infty} \gamma_{yy}[t_1 - t_2]\, e^{-j(\omega_1 t_1 + \omega_2 t_2) T_s}$$

Making the change of variables $t = t_1$ and $\tau = t_1 - t_2$, the argument of the exponential becomes $\omega_1 t + \omega_2 (t-\tau) = (\omega_1 + \omega_2)\, t - \omega_2 \tau$. Thus, making the change of variables $\omega = -\omega_2$ and $\omega_1 + \omega_2 = \alpha$, then

$$S_{yy}(\omega+\alpha, -\omega) = \sum_{t=-\infty}^{\infty} \sum_{\tau=-\infty}^{\infty} \gamma_{yy}[\tau]\, e^{-j\alpha t T_s}\, e^{-j\omega\tau T_s} = \left( \sum_{\tau=-\infty}^{\infty} \gamma_{yy}[\tau]\, e^{-j\omega\tau T_s} \right) \left( \sum_{t=-\infty}^{\infty} e^{-j\alpha t T_s} \right)$$

The first sum in the result may be recognized as the Fourier transform of the (stationary) ACF, which is the (also stationary) Power Spectral Density, namely $S_{yy}(\omega) = \mathcal{F}\{\gamma_{yy}[\tau]\}$. On the other hand, the second sum is equivalent to the Fourier transform of a constant, which becomes the delta function $\delta(\alpha)$. The resulting spectral correlation is then:

$$S_{yy}(\omega+\alpha, -\omega) = S_{yy}(\omega)\, \delta(\alpha) \qquad \text{or} \qquad S_{yy}(\omega_1,\omega_2) = S_{yy}(-\omega_2)\, \delta(\omega_1 + \omega_2)$$

where $\delta(\alpha)$ represents the delta function. The result indicates that the spectral correlation of a stationary process is different from zero only at $\alpha = 0$, or equivalently, only on the line $\omega_1 = -\omega_2$.

Example. The spectral correlation of an amplitude-modulated NID process: Consider the process $y[t] = a[t]\, w[t]$, where $a[t] = a_0 + a_1 \cos \omega_o t T_s$ and $w[t] \sim \mathrm{NID}(0,1)$.
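For this process, it is easy to show that the ACF is of the form:

$$\gamma_{yy}[t_1,t_2] = a^2[t_1]\, \delta[t_1 - t_2], \qquad a^2[t] = a_0^2 + \frac{a_1^2}{2} + 2 a_0 a_1 \cos \omega_o t T_s + \frac{a_1^2}{2} \cos 2\omega_o t T_s$$

while the spectral correlation is found by substituting the above ACF into the definition in Equation (4.C.6), which yields:

$$S_{yy}(\omega_1,\omega_2) = \sum_{t_1=-\infty}^{\infty} \sum_{t_2=-\infty}^{\infty} a^2[t_1]\, \delta[t_1 - t_2]\, e^{-j(\omega_1 t_1 + \omega_2 t_2) T_s} = \sum_{t_1=-\infty}^{\infty} a^2[t_1]\, e^{-j\omega_1 t_1 T_s} \left( \sum_{t_2=-\infty}^{\infty} \delta[t_1 - t_2]\, e^{-j\omega_2 t_2 T_s} \right) = \sum_{t_1=-\infty}^{\infty} a^2[t_1]\, e^{-j(\omega_1 + \omega_2) t_1 T_s} = G_a(\omega_1 + \omega_2)$$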

where

$$G_a(\omega) = \mathcal{F}\{a^2[t]\} = \left( a_0^2 + \frac{a_1^2}{2} \right) \delta(\omega) + a_0 a_1 \left( \delta(\omega - \omega_o) + \delta(\omega + \omega_o) \right) + \frac{a_1^2}{4} \left( \delta(\omega - 2\omega_o) + \delta(\omega + 2\omega_o) \right)$$

In contrast to the previous example, in this case the spectral correlation also takes values on the lines $\omega_1 = -\omega_2 \pm \omega_o$ and $\omega_1 = -\omega_2 \pm 2\omega_o$.

Based on the previous definitions, the class of harmonizable processes is defined formally as the class of non-stationary processes for which the spectral correlation satisfies Loève's condition (Flandrin, 1989):

$$\int_{\Omega} \int_{\Omega} \left| S_{yy}(\omega, \omega - \alpha) \right| d\omega\, d\alpha < \infty$$

which in simple words indicates that a harmonizable process is characterized by an absolutely integrable spectral correlation. Moreover, Loève's condition also implies that the process is of finite power.

For the class of harmonizable signals, the Wigner-Ville spectrum is defined as the Fourier transform of the symmetric ACF with respect to $\tau$, or analogously, as the inverse Fourier transform of the (also symmetric) spectral correlation with respect to $\alpha$. Therefore, the Wigner-Ville spectrum is defined as follows (Antoni, 2007; Gardner, 1986; Martin, 98):

$$S_{WV}[t,\omega) = \mathcal{F}_{\tau \to \omega}\{\gamma_{yy}[t+\tau,t-\tau]\} = \sum_{\tau=-\infty}^{\infty} \gamma_{yy}[t+\tau,t-\tau]\, e^{-j\omega\tau T_s} \tag{4.C.7a}$$

$$S_{WV}[t,\omega) = \mathcal{F}^{-1}_{\alpha \to t}\left\{ S_{yy}\left( \tfrac{1}{2}(\alpha+\omega), \tfrac{1}{2}(\alpha-\omega) \right) \right\} = \frac{T_s}{2\pi} \int_{\Omega} S_{yy}\left( \tfrac{1}{2}(\alpha+\omega), \tfrac{1}{2}(\alpha-\omega) \right) e^{j\alpha t T_s}\, d\alpha \tag{4.C.7b}$$

However, the formulation of the Wigner-Ville spectrum has implementation issues in the discrete-time case due to the half time steps. Therefore, a Wigner-Ville-like spectrum can be defined in terms of the asymmetric ACF formulation given in Equation (4.C.1b), which is known as the Rihaczek spectrum and is defined as follows (Hlawatsch and Matz, 2008):

$$S_{R}[t,\omega) = \mathcal{F}_{\tau \to \omega}\{\gamma_{yy}[t,t-\tau]\} = \sum_{\tau=-\infty}^{\infty} \gamma_{yy}[t,t-\tau]\, e^{-j\omega\tau T_s} \tag{4.C.8a}$$

$$S_{R}[t,\omega) = \mathcal{F}^{-1}_{\alpha \to t}\{S_{yy}(\omega+\alpha, -\omega)\} = \frac{T_s}{2\pi} \int_{\Omega} S_{yy}(\omega+\alpha, -\omega)\, e^{j\alpha t T_s}\, d\alpha \tag{4.C.8b}$$

Both Wigner-Ville and Rihaczek spectra describe the evolution of the frequency components of the signal over time, and are members of the class of generalized Wigner-Ville spectra (Hlawatsch and Matz, 2008; Matz and Hlawatsch, 2006).
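The amplitude-modulated NID example lends itself to a quick numerical check of the Rihaczek spectrum and of the lines of the spectral correlation. The sketch below is illustrative only: it assumes $T_s = 1$, hypothetical values $a_0 = 1$, $a_1 = 0.5$, a record of $N = 64$ samples containing exactly four modulation periods, and a circular (periodic) lag axis so that the Fourier transforms can be replaced by DFTs.

```python
import numpy as np

N, k0 = 64, 4                 # record length and modulation bin (omega_o = 2*pi*k0/N)
a0, a1 = 1.0, 0.5             # hypothetical modulation parameters
t = np.arange(N)
omega_o = 2 * np.pi * k0 / N
a = a0 + a1 * np.cos(omega_o * t)

# Non-symmetric ACF of y[t] = a[t] w[t], w ~ NID(0,1): gamma[t, t - tau] = a^2[t] delta[tau]
gamma = np.zeros((N, N))      # axis 0: time t, axis 1: lag tau (circular)
gamma[:, 0] = a ** 2

# Rihaczek spectrum: DFT over the lag axis, one spectrum per time instant
S_R = np.fft.fft(gamma, axis=1)

# Cyclic content of a^2[t]: normalized DFT over time, playing the role of G_a
lines = np.fft.fft(a ** 2) / N
```

In this sketch the Rihaczek spectrum is flat in frequency and equal to $a^2[t]$ at every time instant, while `lines` is non-zero only at the bins corresponding to $0$, $\pm\omega_o$ and $\pm 2\omega_o$, with weights $a_0^2 + a_1^2/2$, $a_0 a_1$ and $a_1^2/4$, in agreement with $G_a(\omega)$.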
The Wigner-Ville spectrum (and all other types of spectra of the generalized Wigner-Ville family) is characterized by several desirable properties, which are briefly summarized as follows (Cohen, 1995, p.7; Flandrin, 1999, pp.6-3; Martin, 98):

(i) Preservation of marginals: The time and frequency marginals of the Wigner-Ville spectrum are equal to the stationary PSD $S_{yy}(\omega)$ and the instantaneous variance $\sigma_y^2[t]$, respectively:

$$\text{Time marginal:}\quad \sum_{t=-\infty}^{\infty} S_{WV}[t,\omega) = S_{yy}(\omega)$$

$$\text{Frequency marginal:}\quad \frac{T_s}{2\pi} \int_{\Omega} S_{WV}[t,\omega)\, d\omega = \gamma_{yy}[t,t] = \sigma_y^2[t]$$

(ii) Preservation of time and frequency shifts: $S_{WV}[t,\omega)$ preserves the time and frequency shifts, i.e.:

$$\text{Time shift:}\quad y[t - t_0] \;\longrightarrow\; S_{WV}[t - t_0, \omega)$$

$$\text{Frequency shift:}\quad y[t]\, e^{j\omega_m t T_s} \;\longrightarrow\; S_{WV}[t, \omega - \omega_m)$$

(iii) $S_{WV}[t,\omega)$ is always real but can yield negative values (thus violating the actual definition of "density");

(iv) The Wigner-Ville spectrum corresponds to the ensemble average or expected value of the well-known Wigner-Ville distribution, which is of use in the deterministic case (Cohen, 1995, p. 4; Martin, 98).

The ACF, the Wigner-Ville spectrum and the spectral correlation form a triplet related by the discrete Fourier transform. Intuitively, a fourth representation of a harmonizable process can be obtained by evaluating the discrete Fourier transform of the ACF with respect to $t$, which is referred to as the Expected Ambiguity Function (EAF) and is defined as (Hlawatsch and Matz, 2008):

$$\Gamma_{yy}[\tau,\alpha) = \mathcal{F}_{t \to \alpha}\{\gamma_{yy}[t,t-\tau]\} = \sum_{t=-\infty}^{\infty} \gamma_{yy}[t,t-\tau]\, e^{-j\alpha t T_s} \tag{4.C.9a}$$

$$\Gamma_{yy}[\tau,\alpha) = \mathcal{F}^{-1}_{\omega \to \tau}\{S_{yy}(\omega+\alpha, -\omega)\} = \frac{T_s}{2\pi} \int_{\Omega} S_{yy}(\omega+\alpha, -\omega)\, e^{j\omega\tau T_s}\, d\omega \tag{4.C.9b}$$

and describes the average correlation of the components of the signal separated by $\tau$ in time and $\alpha$ in frequency. The EAF is useful to measure the degree of dispersion of the time-frequency components in a harmonizable non-stationary process. In particular, a process is called under-spread if distant components in the $\tau$-$\alpha$ plane have EAF zero or close to zero (indicating that the components $y[t]$ and $y[t-\tau]\, e^{j\alpha t T_s}$ are effectively uncorrelated), and is called over-spread otherwise (Hlawatsch and Matz, 2008). In terms of the EAF, a process is called under-spread if its corresponding EAF is well concentrated close to the origin of the $\tau$-$\alpha$ plane. The concept of "sufficiently distant" or "well concentrated" is rather arbitrary, though it is possible to define actual concentration measures for the EAF to determine the degree of concentration of a non-stationary process; see for example (Hlawatsch and Matz, 2008; Matz and Hlawatsch, 2006).
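The EAF definition can be evaluated directly from a known two-time ACF. The sketch below is illustrative only and makes several assumptions not taken from this work: $T_s = 1$, a hypothetical process $y[t] = a[t]\, x[t]$ with $x[t]$ stationary and $\gamma_{xx}[\tau] = \rho^{|\tau|}$, an $N$-periodic modulation $a[t] = a_0 + a_1 \cos \omega_o t$, non-negative lags only, and a normalized DFT over one period in place of the infinite sum.

```python
import numpy as np

def expected_ambiguity(gamma_t_tau):
    """EAF sketch: normalized DFT over time (axis 0) of gamma[t, t - tau].

    Returns an array indexed as [alpha_bin, tau].
    """
    return np.fft.fft(gamma_t_tau, axis=0) / gamma_t_tau.shape[0]

N, k0, rho = 64, 4, 0.8       # period, modulation bin, correlation decay (hypothetical)
a0, a1 = 1.0, 0.5
t = np.arange(N)
taus = np.arange(0, 9)
a = a0 + a1 * np.cos(2 * np.pi * k0 * t / N)

# gamma[t, t - tau] = a[t] a[t - tau] rho^|tau|; np.roll(a, tau)[t] equals a[t - tau]
gamma = np.stack([a * np.roll(a, tau) * rho ** tau for tau in taus], axis=1)

eaf = expected_ambiguity(gamma)
alpha_support = np.flatnonzero(np.abs(eaf).max(axis=1) > 1e-10)
```

For this hypothetical process the EAF is confined to the frequency-lag bins $0$, $\pm k_0$ and $\pm 2k_0$, and its magnitude at $\alpha = 0$ decays monotonically with $\tau$, so the energy concentrates near the origin of the $\tau$-$\alpha$ plane, as expected for an under-spread process.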
In the cyclo-stationary case, both the spectral correlation and the EAF have the form of line spectra (values different from zero only at discrete frequency values corresponding to integer multiples of the fundamental cycle of the process). In such a case the spectral correlation is referred to as the cyclic spectrum, while the EAF is referred to as the cyclic auto-correlation (Antoni, 2007, 2009).

The ACF, the expected ambiguity function, the Wigner-Ville/Rihaczek spectrum and the spectral correlation are all quantities related by the discrete Fourier transform. Figure 4.C.2 summarizes those relationships. An extended discussion on the definition, analysis and non-parametric estimation of these representations can be found in the references (Antoni, 2007, 2009; Cohen, 1995; Flandrin, 1999; Giannakis, 1999; Hlawatsch and Matz, 2008).

4.C.3 Oscillatory random processes

The class of oscillatory random processes is composed of those processes that can be represented as the sum of an infinite set of amplitude-modulated complex exponential functions. More specifically, the class of oscillatory random processes considers non-stationary signals that can be described in terms of the spectral representation (Flandrin, 1989; Grenier, 1989; Martin and Flandrin, 1985; Priestley, 1967):

$$y[t] = \frac{T_s}{2\pi} \int_{\Lambda} H[t,\lambda)\, e^{j\lambda t T_s}\, dW(\lambda) \tag{4.C.10}$$

where $dW(\lambda) \in \mathbb{C}$ represents a random spectral increment defined on the pseudo-frequency $\lambda$ with domain $\Lambda = [-\pi/T_s, \pi/T_s]$, and $H[t,\lambda)$ is a complex function of time and pseudo-frequency. For the class of oscillatory processes, the spectral increments are orthogonal and then satisfy the relation:

$$E\{dW(\lambda_1)\, dW^*(\lambda_2)\} = \delta(\lambda_1 - \lambda_2)\, S_{ww}(\lambda_1)\, d\lambda_1\, d\lambda_2 \tag{4.C.11}$$

Footnote: Replacing $\gamma_{yy}[t,t-\tau]$ by $E\{y[t]\, y[t-\tau]\}$, Equation (4.C.9) becomes

$$\Gamma_{yy}[\tau,\alpha) = \sum_{t=-\infty}^{\infty} E\{y[t]\, y[t-\tau]\}\, e^{-j\alpha t T_s} = E\left\{ \sum_{t=-\infty}^{\infty} y[t]\, y[t-\tau]\, e^{-j\alpha t T_s} \right\}$$

where it is evident that the EAF measures the average correlation between $y[t]$ and $y[t-\tau]\, e^{j\alpha t T_s}$.

4. T-F Analysis of Non-Stationary Signals via TARMA Representations

[Figure 4.C.: Summary of the time and frequency domain representations of a harmonizable non-stationary process and the relationships among them. The diagram links four quantities: the AutoCorrelation Function $\gamma_{yy}[t,t-\tau]$ (Equation (4.C.b)), the Wigner-Ville/Rihaczek spectra $S_{WV}[t,\omega)$ and $S_R[t,\omega)$ (Equation (4.C.8)), the Expected Ambiguity Function $\Gamma_{yy}[\tau,\alpha)$ (Equation (4.C.9)) and the spectral correlation $S_{yy}(\omega,\omega-\alpha)$ (Equation (4.C.6)). The transform $\mathcal{F}_{\tau\to\omega}$ maps the ACF to the Wigner-Ville/Rihaczek spectra and the EAF to the spectral correlation, while $\mathcal{F}_{t\to\alpha}$ maps the ACF to the EAF and the Wigner-Ville/Rihaczek spectra to the spectral correlation.]

where $S_{ww}(\lambda)$ is a (stationary) PSD function, while the amplitude modulating function $H[t,\lambda)$ satisfies the Fourier-Stieltjes representation

$$H[t,\lambda) = \frac{T_s}{2\pi}\int_{\Omega} e^{j\alpha t T_s}\, dH(\alpha,\lambda)$$

such that $dH(\alpha,\lambda)$ has its maximum value at $\alpha = 0$ for all $\lambda$ (i.e. $dH(\alpha,\lambda)$ is concentrated at the low frequencies).

To understand the spectral representation in Equation (4.C.), consider the approximate spectral increment $\Delta W(\lambda) = w(\lambda)\,\Delta\lambda$, which after substitution into the spectral representation provides the approximation:

$$y[t] \approx \frac{T_s}{2\pi}\sum_{\lambda\in\Lambda_d} \big( H[t,\lambda)\, w(\lambda)\, \Delta\lambda \big)\, e^{j\lambda t T_s}$$

where $\Lambda_d$ is a set of discrete frequencies determined by $\Delta\lambda$. It is then clear that the signal $y[t]$ is approximated as the superposition of a finite set of amplitude-modulated complex exponential components, where the function $H[t,\lambda)\, w(\lambda)$ represents the time-varying weight of the respective spectral component $e^{j\lambda t T_s}$. In the limit $\Delta\lambda\to 0$, the approximation becomes an equality and leads to the representation in Equation (4.C.).

The correlation structure of a signal in the class of oscillatory random processes can be analyzed by substituting the spectral representation of Equation (4.C.)
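in the definition of the ACF in Equation (4.C.b), which yields:

$$\gamma_{yy}[t_1,t_2] = \frac{T_s^2}{4\pi^2}\int_{\Lambda}\int_{\Lambda} H[t_1,\lambda_1)\, H^*[t_2,\lambda_2)\, e^{j(\lambda_1 t_1-\lambda_2 t_2)T_s}\, E\{dW(\lambda_1)\, dW^*(\lambda_2)\}$$

Since the spectral increments are uncorrelated by the definition in Equation (4.C.), the double integral reduces to a single integral of the form:

$$\gamma_{yy}[t_1,t_2] = \frac{T_s^2}{4\pi^2}\int_{\Lambda}\left(\int_{\Lambda} H^*[t_2,\lambda_2)\, e^{-j\lambda_2 t_2 T_s}\, \delta(\lambda_1-\lambda_2)\, d\lambda_2\right) H[t_1,\lambda_1)\, e^{j\lambda_1 t_1 T_s}\, S_{ww}(\lambda_1)\, d\lambda_1$$

$$\phantom{\gamma_{yy}[t_1,t_2]} = \frac{T_s}{2\pi}\int_{\Lambda} H[t_1,\lambda_1)\, H^*[t_2,\lambda_1)\, S_{ww}(\lambda_1)\, e^{j\lambda_1(t_1-t_2)T_s}\, d\lambda_1$$

It then follows that, for the class of oscillatory random processes, the following Karhunen spectral decomposition of the ACF holds (Flandrin, 989; Grenier, 989; Martin and Flandrin, 985; Priestley, 967):

$$\gamma_{yy}[t_1,t_2] = \frac{T_s}{2\pi}\int_{\Lambda} H[t_1,\lambda)\, H^*[t_2,\lambda)\, e^{j\lambda(t_1-t_2)T_s}\, S_{ww}(\lambda)\, d\lambda$$ (4.C.)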

where $\varphi[t,\lambda) = H[t,\lambda)\, e^{j\lambda t T_s} \in \mathbb{C}$ is the representation basis of the spectral decomposition. So, as shown before, this type of Karhunen spectral representation keeps the orthogonality of the spectral increments, but abandons the orthogonality of the representation basis, since it does not necessarily hold that the inner product $\langle \varphi[t,\lambda_1), \varphi[t,\lambda_2) \rangle = \delta(\lambda_1-\lambda_2)$. Nevertheless, when the function $H[t,\lambda)$ varies slowly enough, the Karhunen representation has orthogonal spectral increments $dW(\lambda)$ and is almost orthogonal with respect to the representation basis $H[t,\lambda)\, e^{j\lambda t T_s}$, in the sense that the inner product involves products of (orthogonal) complex exponential functions with slowly varying functions (Grenier, 989).

Based on the representation of Equation (4.C.), the Evolutionary Spectrum is defined with respect to the oscillatory family as follows (Martin and Flandrin, 985; Priestley, 967):

$$S_{ev}[t,\lambda) = |H[t,\lambda)|^2$$ (4.C.3)

and since

$$\gamma[t,t] = \frac{T_s}{2\pi}\int_{\Lambda} |H[t,\lambda)|^2\, d\lambda = \frac{T_s}{2\pi}\int_{\Lambda} S_{ev}[t,\lambda)\, d\lambda$$

the evolutionary spectrum defines the power spectral density of the non-stationary signal $y[t]$ in both the time and frequency domains (Martin and Flandrin, 985). The main properties of the evolutionary spectrum are summarized as follows (Flandrin, 999, Ch. ); Priestley (965):

(i) The pseudo-frequency $\lambda$ approximates a physical frequency $\omega$ (as in $e^{j\omega t T_s}$) when the envelope function $H[t,\lambda)$ varies slowly enough.
(ii) $S_{ev}[t,\lambda)$ reduces to the stationary PSD when $H[t,\lambda) = H(\lambda)$, where $H(\lambda)$ is the Frequency Response Function (FRF) of a corresponding stationary system. In such a case, the PSD is $S_{yy}(\omega) = |H(\omega)|^2$.
(iii) Preservation of marginals: The time and frequency marginals of the evolutionary spectrum are equal to the stationary PSD $S_{yy}(\omega)$ and the instantaneous variance $\sigma_y^2[t]$, respectively:
Time marginal: $\sum_{t=-\infty}^{\infty} S_{ev}[t,\omega) = S_{yy}(\omega)$
Frequency marginal: $\int_{\Omega} S_{ev}[t,\omega)\, d\omega = \gamma_{yy}[t,t] = \sigma_y^2[t]$
(iv) $S_{ev}[t,\lambda)$ is always non-negative.
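The relation between the evolutionary spectrum and the instantaneous variance can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a uniformly modulated process $y[t]=a[t]\,x[t]$, with $x[t]$ a hypothetical stationary MA(1) process driven by unit-variance white noise, so that $H[t,\lambda)=a[t]\,(1+b\,e^{-j\lambda T_s})$ and $S_{ww}(\lambda)=1$. The values of `Ts`, `b` and the envelope `a` are arbitrary choices.

```python
import numpy as np

Ts = 0.1                     # sampling period (hypothetical)
b = 0.6                      # MA(1) coefficient of the stationary part (hypothetical)
t = np.arange(200)
a = 1.0 + 0.8 * np.exp(-((t - 100) / 30.0) ** 2)   # slowly varying envelope

# Pseudo-frequency grid over Lambda = [-pi/Ts, pi/Ts).
lam = np.linspace(-np.pi / Ts, np.pi / Ts, 2048, endpoint=False)
dlam = lam[1] - lam[0]

# Amplitude modulating function H[t, lambda) = a[t] * H0(lambda).
H0 = 1.0 + b * np.exp(-1j * lam * Ts)
H = a[:, None] * H0[None, :]

# Evolutionary spectrum S_ev[t, lambda) = |H[t, lambda)|^2.
S_ev = np.abs(H) ** 2

# Instantaneous variance from the spectrum: (Ts / 2 pi) * integral over Lambda.
var_from_spectrum = Ts / (2 * np.pi) * S_ev.sum(axis=1) * dlam

# Closed-form variance of a[t] * (w[t] + b w[t-1]) with unit-variance w[t].
var_exact = a ** 2 * (1.0 + b ** 2)
print(np.max(np.abs(var_from_spectrum - var_exact)))
```

When `a` is constant, `S_ev` no longer depends on `t` and coincides with the stationary PSD $|H_0(\lambda)|^2$ scaled by $a^2$, in line with property (ii).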
The concept of oscillatory random processes is useful for the definition of parametric spectra. In fact, it can be demonstrated that a time-varying system with impulse response $h[t,t-\tau]$ can be associated with the oscillatory family $\varphi[t,\lambda) = H[t,\lambda)\, e^{j\lambda t T_s}$, where in this case $H[t,\lambda)$ is the Fourier transform of the impulse response with respect to $\tau$, referred to as the Instantaneous Frequency Response Function. The evolutionary spectrum for the present case then becomes (Flandrin, 999, Ch. ); Hlawatsch and Matz (8):

$$S_{ev}[t,\omega) = |H[t,\omega)|^2$$ (4.C.4)

which is a generalization of the relationship between the transfer function and the PSD in the stationary case. More will be said about this type of spectrum in later sections, since its structure is closely related to the response of time-dependent systems.

Example: Interpretation of an amplitude-modulated NID process as an oscillatory process. Consider again the amplitude-modulated NID process defined as $y[t] = a[t]\, w[t]$, where $w[t] \sim \mathrm{NID}(0,1)$. In that case, the stationary NID process $w[t]$ can be represented as:

$$w[t] = \frac{T_s}{2\pi}\int_{\Omega} e^{j\omega t T_s}\, dW(\omega)$$

where $E\{dW(\omega_1)\, dW^*(\omega_2)\} = \delta(\omega_1-\omega_2)\, d\omega_1\, d\omega_2$. Substituting this representation into the definition of $y[t]$ yields:

$$y[t] = \frac{T_s}{2\pi}\, a[t] \int_{\Omega} e^{j\omega t T_s}\, dW(\omega)$$

while the ACF of $y[t]$ becomes

$$\gamma_{yy}[t_1,t_2] = E\left\{ \left( \frac{T_s}{2\pi}\, a[t_1] \int_{\Omega} e^{j\omega_1 t_1 T_s}\, dW(\omega_1) \right) \left( \frac{T_s}{2\pi}\, a[t_2] \int_{\Omega} e^{-j\omega_2 t_2 T_s}\, dW^*(\omega_2) \right) \right\}$$

$$= \frac{T_s^2}{4\pi^2}\, a[t_1]\, a[t_2] \int_{\Omega}\int_{\Omega} e^{j\omega_1 t_1 T_s}\, e^{-j\omega_2 t_2 T_s}\, E\{dW(\omega_1)\, dW^*(\omega_2)\}$$

$$= \frac{T_s^2}{4\pi^2}\, a[t_1]\, a[t_2] \int_{\Omega}\int_{\Omega} e^{j\omega_1 t_1 T_s}\, e^{-j\omega_2 t_2 T_s}\, \delta(\omega_1-\omega_2)\, d\omega_1\, d\omega_2$$

Then, after computing the integrals and performing some algebraic manipulations, it can be shown that the ACF is of the form:

$$\gamma_{yy}[t_1,t_2] = a^2[t_1]\, \delta[t_1-t_2]$$

which is equivalent to the ACF shown in the previous example. Furthermore, it is evident that for the analyzed process the amplitude function $a[t]$ corresponds to the function $H[t,\lambda)$ defining the oscillatory family, while the evolutionary spectrum for this process is $S_{ev}[t,\omega) = a^2[t]$.

4.C.4 Reconciling the definitions of time-dependent spectra

The Wigner-Ville and evolutionary spectra described in the previous sections are the most widely known types of time-dependent spectra; nonetheless, many more definitions exist. Most of these definitions of the time-varying spectrum fall within two main classes, namely the Type I and Type II spectra of Matz and Hlawatsch (Hlawatsch and Matz, 8; Matz and Hlawatsch, 6):

(i) The Type I time-varying spectra, also known as the Generalized Wigner-Ville spectrum (GWVS) (Flandrin, 999; Hlawatsch and Matz, 8), which include the types of spectra derived from the ACF and related functions. This class of spectra is based on representations of the type of Equation (4.C.7), while different parametrizations of the $t$ and $\tau$ variables, and the use and choice of smoothing functions, result in different variations of this type of spectra.
The main examples of time-varying spectra that fall into this class are the spectrogram, the Wigner-Ville spectrum, the Rihaczek spectrum, the Page spectrum, the Levin spectrum, the Physical spectrum, and related spectra (Bendat and Piersol,, p. 54); Hammond and White (996); Matz and Hlawatsch (6); (Newland, 993, p. 8); (Preumont, 994, sec. 8.3).

(ii) The Type II time-varying spectra, or Generalized Evolutionary Spectrum (GES) (Hlawatsch and Matz, 8; Matz and Hlawatsch, 6), which include the types of spectra derived from Fourier transforms of transfer functions and similar system operators (Hlawatsch and Matz, 8; Matz et al., 997; Matz and Hlawatsch, 6), and relate to time-variant spectra of the type defined in Equation (4.C.). Again, for the case of Type II spectra, different choices of the definition of the $t$ and $\tau$ variables, and different smoothing functions, result in different variations of this class of spectra. This type of time-varying spectra groups the evolutionary spectrum, the transitory evolutionary spectrum, and the Weyl spectrum (Flandrin, 999, Ch. ); Hammond and White (996); Matz et al. (997); Priestley (97).

Matz and Hlawatsch demonstrate in (Hlawatsch and Matz, 8; Matz and Hlawatsch, 6) that the two types of generalized spectra turn out to be approximately equivalent when the signal is under-spread. The interested reader is referred to the aforementioned references for details.

Appendix 4.D Poles and modal decomposition of TARMA models

The definition of the poles and zeros, and of modal decompositions, of time-varying systems is still a subject of extensive discussion and research. In general, the estimates of these modal quantities for TARMA models are typically defined in the "frozen" sense (Poulimenos and Fassois, 6), where the poles $\lambda_m$ and zeros $\mu_m$ are computed point-wise in time. Thus the TARMA model's "frozen" modes, with their natural frequencies and damping ratios, may be computed as (Poulimenos and Fassois, 6)

$$\omega_{n_i}[t] = \frac{|\ln \lambda_i[t]|}{T_s}\ \text{(rad/time unit)}, \qquad \zeta_i[t] = -\cos\big(\arg(\ln \lambda_i[t])\big)$$ (4.D.)

with $\lambda_i[t]$ designating the $i$-th discrete-time "frozen" pole. Modal decompositions based on the frozen pole and zero estimates have been analyzed elsewhere, see for example (Prado and West, 997; Aguilar et al., 998). The frozen approach is perhaps the most widely used due to its simplicity, but the derived poles and zeros do not have an explicit temporal structure. More elaborate definitions have been proposed as well: in (Kamen, 988) the poles and zeros are defined in terms of factorizations of operator polynomials, leading to linear recursions for the computation of the time-varying poles and zeros, which bear information about the asymptotic stability of the time-varying system; in (O'Brien and Iglesias, ), the poles and zeros of a state space time-varying system are defined in terms of a transformation matrix that makes the original state transition matrix triangular.

Here, an alternative definition of the non-stationary TARMA model poles and zeros is derived from the previously defined HIR concept. Applying the Z transform to Equation (4..) yields

$$\boldsymbol{H}(z) = -\sum_{i=1}^{n_a} z^{-i}\, \boldsymbol{\Omega}^{i}\, \breve{\boldsymbol{A}}_i\, \boldsymbol{H}(z) + \sum_{i=0}^{n_c} \boldsymbol{c}_i\, z^{-i} \;\Longrightarrow\; \boldsymbol{H}(z) = \left( \boldsymbol{I} + \sum_{i=1}^{n_a} z^{-i}\, \boldsymbol{\Omega}^{i}\, \breve{\boldsymbol{A}}_i \right)^{-1} \left( \sum_{i=0}^{n_c} \boldsymbol{c}_i\, z^{-i} \right)$$ (4.D.)

where $\boldsymbol{H}(z)$ is the vector that stacks the terms $H[k,z]$, and $H[k,z]$ is the Z transform of the $k$-th coefficient of the harmonic impulse response.
Then, it is very simple to see that the poles and zeros of the TARMA model correspond to the roots of the polynomials A(z)= I+ n a i= z i Ω i ĂA i, C(z)= n c i= c i z i (4.D.3) Notice that the definition of the harmonic impulse response of the TARMA case in Equation (4..) can be written in the following state space form f[τ]=a f[τ ]+B δ[τ] h[τ]=c f[τ]+d δ[τ] (4.D.4a) (4.D.4b) where the matrices in the full TARMA case are of the form Ω ĂA Ω ĂA... Ω na ĂA na I c Ω I... A=......, B= ĂA c., C= c Ω ĂA c.... c na Ω na ĂA na c T, D= c where c i = for all i > n c. In the TAR case, the matrices simplify to B =, C = [ I ], and D = c. Thus, based on this state space representation, the poles of the TARMA model can be found as the eigenvalues of the matrix A. Notice that the number of eigenvalues is equal to the dimension of A, which is M = N n a. Then, to each eigenvalue λ m, with m=,...,m, a corresponding (right) eigenvector v m is associated. Using the eigenvalues and eigenvectors, the state matrix can be written asa=v Λ V, where Λ=diag(λ,...,λ M ), and V = [ ] v v M. Applying the decomposition ofain the state space representation leads to ğg[τ]= Λ ğg[τ ]+ B δ[τ] h[τ]=u ğg[τ]+d δ[τ] (4.D.5a) (4.D.5b) 6

181 4. T F Analysis of Non-Stationary Signals via TARMA Representations where ğg[τ]=v f[τ], B=V B, and u, u, u,m u, u, u,m U = C V = u N, u N, u N,M (4.D.6) The autonomous response of the state space representation in Equation (4.D.5) is ğg[τ] = Λ τ = e (log Λ)τ, and consequently, the harmonic impulse response can be written as follows h[τ]=u e (log Λ)τ, = h[k,τ]= M m= u k,m e (logλ m)τ (4.D.7) By extension, computing theztransform, leads to H[k,z]= M m= u k,m λ m z, H[k,n]= M m= u k,m λ m e nω o (4.D.8) where z=e jnω o. The result in Equation (4.D.8) suggests that the Harmonic FRF in the TARMA case, evaluated at the frequency k, is constructed as the sum of M frequency basis with frequency defined by the eigenvalues λ m and amplitude determined by the coefficients u k,m. Computing the inverse Fourier transform of the Harmonic FRF, mapping from k to t, leads to the following expression for the ITF, H[t,n]=F t { M m= u k,m λ m e jnω o } = M m= Ft M {u k,m } λ m e jnω = o m= M u m [t] λ m e jnω = o G m [t,n] m= (4.D.9) where u m [t]=ft {u k,m } is the m-th modulation function andg m [t,n] is the m-th time-modulated transfer function, both corresponding to the m-th eigenvalue of the TARMA model. Equation (4.D.9) suggests that the ITF of a TARMA model is constructed by a set of M modulated stationary transfer functions, where the modulation function u m [t] corresponds to a sum of weighted exponential functions, as follows u m [t]=ft {u k,m }= N N k= u k,m e jkω ot (4.D.) which indicates that the ITF contains a finite number of time-frequency atoms u k,m e jkω ot ( λ m e jnω o ). In the same sense, the impulse response can be obtained by taking again the inverse Fourier transform of the ITF in Equation (4.D.9), mapping from n to τ, to yield h[t,t τ]=f τ { M m= u m [t] λ m e jnω o and the response of the TARMA model can be computed as } = M m= u m [t] e (logλ m)τ (4.D.) y[t]= t τ= h[t,t τ] w[t τ]= M m= ( ) u m [t] e (logλm)t w[t] = M m= u m [t] e (logλ m)t (4.D.) 
The main interpretation of Equations (4.D.) and (4.D.) is that the impulse response and the system response y[t] consist of a weighted sum of complex exponential functions e^{(log λ_m)τ}, each modulated by a corresponding modulating function u_m[t].
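The eigenvalue-based decomposition used in this appendix can be sketched on a small example. The code below is a hedged illustration rather than the harmonic TARMA formulation itself: it assumes a hypothetical stationary scalar AR(2) model in companion state-space form (matrices `A`, `B`, `C`, with zero direct term), built from an assumed mode with natural frequency `wn_true` and damping ratio `zeta_true`. It checks that the impulse response obtained from powers of the state matrix equals the weighted sum of exponentials in the eigenvalues, and that the eigenvalues return the assumed modal quantities through the frozen-pole expressions.

```python
import numpy as np

Ts = 0.01                                   # sampling period (hypothetical)
wn_true, zeta_true = 2 * np.pi * 5.0, 0.05  # assumed mode: 5 Hz, 5% damping

# Discrete-time pole of the assumed mode and the equivalent AR(2) coefficients
# of y[t] + a1 y[t-1] + a2 y[t-2] = w[t].
s = -zeta_true * wn_true + 1j * wn_true * np.sqrt(1.0 - zeta_true ** 2)
lam = np.exp(s * Ts)
a1, a2 = -2.0 * lam.real, abs(lam) ** 2

# Companion state-space form: f[tau] = A f[tau-1] + B delta[tau], h[tau] = C f[tau].
A = np.array([[-a1, -a2], [1.0, 0.0]])
B = np.array([1.0, 0.0])
C = np.array([1.0, 0.0])

# Modal form: A = V diag(eigvals) V^{-1}, U = C V, B_tilde = V^{-1} B.
eigvals, V = np.linalg.eig(A)
U = C @ V
B_tilde = np.linalg.solve(V, B)

# Impulse response as a weighted sum of exponentials in the eigenvalues.
taus = np.arange(50)
h_modal = ((U * B_tilde)[None, :] * eigvals[None, :] ** taus[:, None]).sum(axis=1)

# Reference impulse response from powers of the state matrix.
h_direct = np.array([C @ np.linalg.matrix_power(A, int(k)) @ B for k in taus])

# Natural frequencies (rad/s) and damping ratios from the eigenvalues.
wn = np.abs(np.log(eigvals)) / Ts
zeta = -np.cos(np.angle(np.log(eigvals)))
print(wn / (2 * np.pi), zeta)
```

In the time-dependent case the same computation would be repeated with the state matrix of the harmonic representation, so that each eigenvalue contributes one modulated exponential term.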


Chapter 5

A Multiple Model Framework for Vibration Based SHM of Structures with Time-Dependent Dynamics Under Uncertainty

This work focuses on vibration-based SHM methods for structures with time-dependent dynamics under significant environmental and operational uncertainty. The framework includes non-stationary parametric time-dependent modeling for representing the time-dependent vibration response dynamics, and a Multiple Model (MM) representation of an individual health state of the structure under uncertainty. This work provides detailed definitions of the main theoretical aspects of the MM framework for vibration-based SHM, based on the interpretation of the MM representation as a mixture approximation of a random coefficient model. This interpretation facilitates the construction of the MM representation with a lower number of models, as well as the definition of damage diagnosis tests. A case study featuring a suspension system with time-dependent dynamics, where the uncertainty is introduced by variability in the physical parameters of the dynamic model, is used to show the workings of the postulated methods. The introduced MM based damage diagnosis methods are characterized by their simplicity in construction and their effective use of limited amounts of data, while providing high accuracy in damage diagnosis.

5. Introduction

Vibration-based output-only Structural Health Monitoring (SHM) aims at the detection and identification of damage on structures based on measured vibration response signals (Farrar and Worden, 7; Fassois and Sakellariou, 9; Sohn et al., ; Worden and Manson, 7). Statistical time series methods are among the most commonly used approaches for vibration-based SHM, and consist of (see also Figure 5..): (i) a set of vibration sensors located at different parts of the structure; (ii) a data acquisition and conditioning system that measures the vibration response, translates it into a discrete-time signal, and performs the pre-processing required to optimize the signal quality; (iii) a statistical time-series model that is used to represent the (discretized) vibration response of the structure; and (iv) a statistical decision making scheme, which provides an estimate of the current state of the structure based on the statistical time-series model and the newly acquired vibration response. The application of an SHM method first requires (just once, in an initial baseline phase) the use of vibration signals to construct a statistical model for each considered state of the structure. Then, under normal SHM operation, a fresh vibration response signal is obtained, and a decision is to be made with respect to the current health state (inspection phase).

[Figure 5..: Architecture of a statistical time series method for a vibration-based SHM system. Block diagram: natural excitation forces (unmeasurable) act on the structure (unknown state), followed by (i) the sensors, (ii) data acquisition and preprocessing, (iii) the statistical time series model m, and (iv) statistical decision making, which delivers the estimate of the structural state.]

Despite significant developments in recent years, two prime challenges faced by such methods relate to their operation: (i) on structures with time-dependent dynamics, and (ii) under significant environmental and operational uncertainty.
This is the case for several commonly found modern structures, including bridges with heavy moving vehicles, cranes, wind turbines, robot manipulators, rotating machinery, structures with variable geometry, and many others. For this type of structure the vibration response is stochastically time-dependent, namely non-stationary. Consequently, the success of any health monitoring method on this type of structure hinges on the effective representation of non-stationarity and on an equally powerful decision making method that is robust to uncertainty.

The first challenge, pertaining to structures with time-dependent dynamics, requires the proper modeling of the measured non-stationary random vibration response signals, which are characterized by a time-dependent AutoCovariance Function (ACF) (Avendaño-Valencia and Fassois, 4a; Poulimenos and Fassois, 6). For that purpose, the most popular modeling methods are those based on non-parametric Time-Frequency (TF) or Time-Scale (TS) wavelet power distributions (Boashash, 3; Feng et al., 3; Peng and Chu, 4; Staszewski and Robertson, 7). Alternatively, stochastic parametric modeling is a more powerful tool in this regard, since it is capable of providing very accurate modeling in a very compact representation. In this sense, Time-dependent ARMA (TARMA) models (Avendaño-Valencia and Fassois, 4a; Poulimenos and Fassois, 6; Spiridonakos and Fassois, 3, 4b), and the closely related Linear Parameter Varying ARMA (LPV-ARMA) models (Bamieh and Giarré, ; Tóth, ), provide effective modeling of non-stationary dynamics and have proven effective in SHM applications under low uncertainty. However, when the uncertainty level is significant, these modeling methods are not sufficient to cope with the extra variability, in which case adjustments are necessary. For this problem, which actually corresponds to the second challenge mentioned earlier, various approaches may be applied.
If the variables can be quantified (measured), then explicit "cause and effect" type modeling may be employed. This leads to methods aiming at building a regression or interpolation model that explains the variability of the damage-sensitive features, or of the parameters of the representation model, as a function of measurable uncertainty-inducing variables (Hios and Fassois, 4; Kopsaftopoulos and Fassois, 3; Sohn, 7; Worden et al., ). If this is not the case, a first measure is to use a method attempting to separate the effects of uncertainty from those of damage. For this purpose, orthogonal decompositions and related methods are employed, with Principal Component Analysis (PCA) based methods being most common (Sohn, 7; Deraemaeker et al., 8; Yan et al., 5a).

Alternative methods attempt to directly model uncertainty. Random Coefficient (RC), probabilistic or similar models describe the effects of uncertainties on the dynamics of the vibration response signal by adopting a random character for the model parameters (coefficients) (Michaelides and Fassois, 3; Mosavi et al., ; Zao and Wang, ). Normal distribution models are most commonly used, although in practice the parameters may follow more complex distributions. If that is the case, Gaussian mixture (Nair and Kiremidjian, 7; Słoǹski, ), kernel (Fricker et al., ; Khatibinia et al., 3), or interval analysis methods (Muscolino et al., 5), as well as non-probabilistic fuzzy models (Chandrashekhar and Ganguli, 9; Degrauwe et al., 9), are useful alternatives. The main drawback of RC and interpolating models is that they require large amounts of data in order to construct a reliable representation. Moreover, the data requirements increase sharply as the model complexity increases.

Another possibility is to represent uncertainty by means of a finite set of models (with respective parameter values), referred to as a Multiple Model (MM) representation. In the MM representation, each model is associated with the dynamic characteristics of the vibration response of the structure under particular uncertainty conditions, while the whole MM is assumed to represent the variability of the dynamics under uncertainty.
Earlier uses of MM representations have relied on different types of distance functions to compare the models in the representation with test models for the diagnosis of damage (Zheng and Mita, 7; Carden and Brownjohn, 8; Surace and Worden, ). However, in these applications there is no regard for the actual structure of the MM representation and its optimization. Although MMs provide a simple and flexible representation of uncertainty, they require a careful selection of the models in order to avoid leaving regions of the parameter space poorly represented (Han and Narendra, ). This problem, which is associated with the inherent discretization of the parameter space, becomes more pronounced when the size of the parameter vector increases, since a larger number of models in the MM is then required to fill a larger volume of the parameter space. Figure 5.. provides a brief summary of the common ingredients used in the vibration-based health monitoring of structures with time-dependent dynamics and uncertainty.

The main aim of this work is the development of robust vibration-based SHM methods for structures with time-dependent dynamics under significant environmental and operational uncertainty. Although the main focus is on the output-only case, the theory introduced in this work can be easily adapted to other cases. The adopted approach utilizes a framework that includes two entities: (i) non-stationary parametric time-dependent models (either of the FS-TARMA or LPV-ARMA types) for representing the time-dependent vibration response dynamics, and (ii) a Multiple Model (MM) representation of an individual health state of the structure under uncertainty. The combination of these two entities endows the whole SHM method with the accuracy and compactness characterizing the FS-TARMA and LPV-ARMA models, together with the simplicity and flexibility offered by the MM approach.
Footnote: It must be remarked that in the area of control systems, the term "multiple models" is used in a different context, to refer to a group of models, each one for a different health state of the structure/system (Zhao et al., 5).

The present work can be seen as the outcome of our recent works on the topic (Avendaño-Valencia and Fassois, 4b, 5a,d,b). Its main contribution lies in representing the MM as a mixture approximation of an RC model, in which each model in the MM is associated with a density function (kernel) that approximates the local distribution of the parameters, instead of a discrete parameter value as is typically done (Zheng and Mita, 7; Carden and Brownjohn, 8; Surace and Worden, ). This arrangement facilitates the representation of large volumes in the parameter space with a reduced number of models in the MM representation. Moreover, the interpretation of the MM as a mixture approximation facilitates the construction of simple and effective statistical damage diagnosis tests. A small example featuring a suspension system with time-dependent dynamics, where the uncertainty is introduced by variability in the physical parameters of the dynamic model, is used to show the workings of the postulated methods.

The work is organized as follows: Section 5. provides a precise definition of the problem and an overview of the main ideas used in this work, and Section 5.3 presents the whole MM framework and its elements. A simple

application example involving the detection and identification of a single damage on an active suspension system is presented in Section 5.4. Finally, the main conclusions of this study are summarized in the closing section of the chapter.

[Figure 5..: Common ingredients in vibration-based SHM methods for structures with time-dependent dynamics under uncertainty.

Representation of time-dependent dynamics:
- Time-Frequency, Time-Scale or modal representations (Staszewski et al., 997; Peng and Chu, 4; Staszewski and Robertson, 7; Feng et al., 3; Worden et al., 4).
- Time-dependent ARMA (Poulimenos and Fassois, 6; Spiridonakos and Fassois, 3, 4b; Avendaño-Valencia and Fassois, 4a).
- Linear Parameter Varying ARMA (Bamieh and Giarré, ; Tóth, ; Avendaño-Valencia and Fassois, 5d,a).

Representation of uncertainty:
- Conventional (single) models (Poulimenos and Fassois, 4; Spiridonakos and Fassois, 3; Avendaño-Valencia and Fassois, 4a).
- PCA and related methods (Sohn et al., ; Yan et al., 5a,b; Sohn, 7; Deraemaeker et al., 8; Gómez-González and Fassois, 3).
- Regression or interpolation methods (Worden et al., ; Sohn, 7; Kopsaftopoulos and Fassois, 3; Hios and Fassois, 4).
- Random Coefficient (RC) models (Nair and Kiremidjian, 7; Fricker et al., ; Słoǹski, ; Zao and Wang, ; Michaelides and Fassois, 3).
- Multiple Model (MM) representations (Avendaño-Valencia and Fassois, 4b, 5d,a).

Decision making:
- Hypothesis testing (Fassois and Sakellariou, 9; Spiridonakos and Fassois, 3; Avendaño-Valencia and Fassois, 4a).
- Bayesian decision making (Saito et al., 5; Beck, ; Sankararaman and Mahadevan, 3).
- Distance-based methods (Conforto and D'Alessio, 999; Peng and Chu, 4; Mosavi et al., ; Feng et al., 3).]

5. The Problem and the Multiple Model Based SHM Framework

5.. Precise problem statement

Consider a structure with non-stationary vibration response $y[t]$, with $t = 1, 2, \ldots, N$, sampled with period $T_s$ seconds, which may operate in one of several structural states $v = \{o, a, b, c, \ldots\}$, where $o$ stands for the structure operating in the healthy state, while $a$, $b$, $c$ and so on indicate the structure operating under damage of type A, B, C, and so on, respectively. Moreover, since the structure operates under operational and environmental uncertainty, its vibration response is characterized by common but slightly different dynamics at different intervals of analysis. Examples of operational uncertainty are the amount of traffic moving on a bridge, variable loads in a crane or manipulator, or the demanded power on a wind turbine. On the other hand, environmental uncertainty stems from uncontrollable variables characterizing the environment in which the structure operates, such as temperature, humidity, wind speed and turbulence, and several more. A common characteristic of these variables is that they may be difficult or impossible to measure, and consequently their effect on the structural dynamics cannot be properly accounted for.

In the normal operating cycle of the structure, it is of interest to determine the current health state of the structure, an issue referred to as damage diagnosis. Here, damage diagnosis is considered in two stages: first, to determine the presence of damage in the structure (damage detection), and subsequently, to specify the type of damage that the structure has incurred (damage identification). In turn, the problem of damage detection may be solved either by comparing the test signal with the model of the healthy state (unsupervised damage detection), or by comparing it with models of the healthy and damaged states (supervised damage detection).

5.. Overview of the main ideas

The proposed approach for the previously described SHM problem is to use a non-stationary time series model, represented as $m = \{\theta, \mathcal{S}\}$, with parameter vector $\theta$ and structure $\mathcal{S}$, to represent the vibration response of the structure over short time periods. Although the analysis presented in this work is limited to FS-TARMA and LPV-ARMA models, it may be extended to other types of time series models. Furthermore, a Multiple Model (MM) representation, consisting of a finite set of models $\mathcal{M}_v = \{m_{v,l}\}$, $l = 1, \ldots, L$, is used as a joint description of the different dynamics of the vibration response of the structure under uncertainty.

The SHM method operates either in a baseline phase or in an inspection phase. In the baseline phase, a set of MMs $\mathcal{M}_v$ for the healthy and the various potential damage types ($v = \{o, a, b, c, \ldots\}$) is obtained from a corresponding set of experiments on the structure. In addition, the adjustment parameters of the damage diagnosis tests are optimized to ensure the best performance of the method later, in the inspection phase. Then, during the inspection phase, a test signal $y_u[t]$ is provided and a decision is to be made regarding the current health state of the structure.

5.3 The Elements of the Multiple Model Based SHM Framework

5.3.
Multiple Model representations of time-dependent structural dynamics

The elementary models

A (fully) parametric LPV-ARMA$(n_a,n_c)_{[p_a,p_c,p_s]}$ model, with $n_a$ and $n_c$ designating its AR and MA orders, and $p_a$, $p_c$, $p_s$ its AR, MA and innovations variance functional subspace dimensionalities, is defined as (Poulimenos and Fassois, 6; Spiridonakos and Fassois, 4b; Bamieh and Giarré, ; Tóth, ):

$$y[t] = -\sum_{i=1}^{n_a} a_i(\beta[t])\, y[t-i] + \sum_{i=1}^{n_c} c_i(\beta[t])\, w[t-i] + w[t]; \qquad w[t] \sim \mathrm{NID}\big(0, \sigma_w^2(\beta[t])\big)$$ (5.3.)

where $\beta[t]$ is a (measurable) scheduling variable that determines the behavior of the structure at time $t$, $a_i(\beta[t])$ for $i = 1, \ldots, n_a$ and $c_i(\beta[t])$ for $i = 1, \ldots, n_c$ are the $i$-th AR and MA parameter-varying trajectories, and $w[t]$ is a Normally and Independently Distributed (NID) random innovations sequence with zero mean and variance $\sigma_w^2(\beta[t])$. An FS-TARMA model corresponds to the special case of the LPV-ARMA model in Equation (5.3.) in which the scheduling variable is $\beta[t] = t$. The temporal evolution of the AR part of the LPV-ARMA model is defined as:

$$a_i(\beta[t]) \triangleq \sum_{j=1}^{p_a} a_{i,j}\, G_{b_a(j)}(\beta[t]); \qquad \mathcal{F}_{AR} = \big\{ G_{b_a(1)}(\beta[t]), \ldots, G_{b_a(p_a)}(\beta[t]) \big\}$$ (5.3.)

where $\mathcal{F}_{AR}$ designates a functional subspace, $b_a(j)$ ($j = 1, \ldots, p_a$) are the indices of the specific basis functions that are included in the subspace, and $a_{i,j}$ stand for the coefficients of projection of the AR parameters. Similar definitions are utilized for the MA parameters and the innovations variance, where $c_{i,j}$, $j = 1, \ldots, p_c$ and $s_j$, $j = 1, \ldots, p_s$ represent the coefficients of projection of the MA parameters and of the innovations variance, respectively (Poulimenos and Fassois, 6).

Footnote: An LPV-ARMA$(n_a,n_c)_{[p_a,p_c,p_s]}$ model in the form of Equations (5.3.) and (5.3.) is referred to as a fully parametric LPV-ARMA model. A semi-parametric LPV-ARMA$(n_a,n_c)_{[p_a,p_c]}$ model is obtained when the innovations variance is not expanded on a functional basis.

Thus, the LPV-ARMA model $m = \{\theta, \mathcal{S}\}$ is fully determined by the parameter vector $\theta$ and the structure $\mathcal{S}$, each one of them defined as:

$$\theta = \big[ \vartheta^T \ \ \varsigma^T \big]^T_{(n+p_s)\times 1}, \qquad \mathcal{S} = \{ n_a, n_c, \boldsymbol{b}_a, \boldsymbol{b}_c, \boldsymbol{b}_s \}$$ (5.3.3)

188 5.3. The Elements of the Multiple Model Based SHM Framework respectively, where n=n a p a + n c p c, and: ϑ = [ a, a, a na,p a c, c, c nc,p c ] T n ς = [ s s s ps ] T p s b a = [ b a() b a(pa ) ] T p a b c = [ b c() b c(pc ) ] T p c b s = [ b s() b s(ps ) ] T p s In the approach described above, the functional subspaces are selected from an ordered set of orthogonal basis functions. However, it is possible to select only some particular indices from the basis to achieve more economic representations (i.e. b a = [ b a(3) b a(7) b a(8)...,b a(pa )] T ). These representations are obtained by following model structure optimization schemes that are discussed in (Poulimenos and Fassois, 9b; Spiridonakos and Fassois, 4b). The LPV ARMA model of Equation (5.3.) and (5.3.) can be written in the compact regression type form: y[t]= φ T [t] ϑ + w[t], σ w(β[t])= g T s (β[t]) ς (5.3.4) where the regression vectors φ[t] R n and g s (β[t]) R p s are: y[t ] G ba() (β[t]).. y[t n a ] G ba(pa) (β[t]) φ[t]= w[t ] G bc() (β[t]).. w[t n c ] G bc(pc) (β[t]) n G bs() (β[t]) G bs() (β[t]), g s (β[t])=. G bs(ps) (β[t]) p s (5.3.5) and designates the Kronecker product (Golub and van Loan, 996, p. 8). Under the NID assumption of the innovations sequence, the LPV ARMA model of Equation (5.3.) is associated with a likelihood of the form (Poulimenos and Fassois, 6; Spiridonakos and Fassois, 4b): p(y β,m)= p(y β, θ,s)= p(φ[]) N t= N ( φ T [t] ϑ,σ w(β[t]) ) (5.3.6) where y= [ y[] y[] y[n] ] T N, β =[ β[] β[] β[n] ] T, and N(, ) indicates a normal distribution with the indicated mean and covariance. 
Maximum Likelihood (ML) estimates of the LPV ARMA parameter vector are obtained by solving the following optimization problem (Poulimenos and Fassois, 6, 9b; Spiridonakos and Fassois, 4b):

$$ \hat{\theta} = \arg\max_{\theta}\, \ln p(y|\theta) \tag{5.3.7a} $$

$$ \ln p(y|\theta) = -\frac{N}{2} \ln 2\pi - \frac{1}{2} \sum_{t=1}^{N} \left( \ln \sigma_w^2(\beta[t]) + \frac{e^2[t|\vartheta]}{\sigma_w^2(\beta[t])} \right) \tag{5.3.7b} $$

$$ e[t|\vartheta] = y[t] - \phi^T[t]\, \vartheta \tag{5.3.7c} $$

where $e[t|\vartheta]$ represents the estimation residuals obtained when using the ARMA parameter vector $\vartheta$. In the general case, the ML optimization problem in Equation (5.3.7a) does not lead to closed-form solutions for the parameter estimates, although sub-optimal solutions can be obtained by introducing certain simplifications. The values found by the sub-optimal approach can then be employed as initial values for subsequent refinement through iterative non-linear optimization methods that attempt to solve the non-linear optimization problem directly. Readers interested in parameter estimation methods for FS TARMA models (which can be used for the LPV ARMA case as well) are referred to Poulimenos and Fassois (6); Spiridonakos and Fassois (4b); Poulimenos and Fassois (9b).
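As a rough illustration of the regression form and the Gaussian log-likelihood used in the ML problem, the following minimal Python sketch builds one regression vector by a Kronecker-type product and evaluates the log-likelihood from given residuals and innovations variances. The function names are hypothetical, the minus sign on the AR block is an assumed sign convention, and the initial-condition term $p(\phi[1])$ is ignored.

```python
import math

def regression_vector(y_past, w_past, g_ar, g_ma):
    """Regression vector phi[t]: lagged samples times basis function values.

    y_past = [y[t-1], ..., y[t-na]], w_past = [w[t-1], ..., w[t-nc]];
    g_ar and g_ma hold the AR and MA basis functions evaluated at beta[t].
    The minus sign on the AR block is an assumed sign convention.
    """
    ar_block = [-y * g for y in y_past for g in g_ar]
    ma_block = [w * g for w in w_past for g in g_ma]
    return ar_block + ma_block

def residual(y_t, phi, vartheta):
    """Residual e[t | vartheta] = y[t] - phi[t]^T vartheta."""
    return y_t - sum(p * c for p, c in zip(phi, vartheta))

def log_likelihood(residuals, variances):
    """Gaussian log-likelihood with a time-dependent innovations variance."""
    n = len(residuals)
    acc = sum(math.log(s2) + e * e / s2 for e, s2 in zip(residuals, variances))
    return -0.5 * n * math.log(2.0 * math.pi) - 0.5 * acc
```

An iterative optimizer would call `log_likelihood` repeatedly while updating the parameter vector; that outer loop is omitted here.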

If the innovations are independent and identically distributed, then it can be demonstrated that, as the sample size $N$ tends to infinity, the ML estimates $\hat{\theta}$ converge in distribution to the normal distribution (Ljung, 1999, p.5):

$$ \hat{\theta} \sim \mathcal{N}\big( \bar{\theta},\; \Sigma_{\theta} \big) \tag{5.3.8} $$

where $\bar{\theta}$ is the actual value of the parameter vector, and $\Sigma_{\theta}$ is its covariance matrix, given by the Cramér-Rao lower bound:

$$ \Sigma_{\theta} = I^{-1}(\bar{\theta}), \qquad I(\bar{\theta}) \triangleq -E\left\{ \left. \frac{d^2 \ln p(y|\theta)}{d\theta\, d\theta^T} \right|_{\theta=\bar{\theta}} \right\} \tag{5.3.9} $$

and $I(\theta)$ is the Fisher information matrix, defined as the negative expected Hessian matrix of the log-likelihood with respect to the parameter vector. The above indicates that the ML estimates are asymptotically unbiased and asymptotically attain the lowest variance achievable by an unbiased estimator. More details on the asymptotic analysis of the parameter estimates of LPV ARMA and FS TARMA models can be found in Poulimenos and Fassois (7, 9a).

MM representations

Whenever the level of uncertainty is significant, the characteristics of the vibration response of the structure vary considerably from one analysis interval to another. In consequence, the parameter vectors of the FS TARMA or LPV ARMA models also differ, even for the same structural state. Multiple Model representations attempt to represent this parameter variability by constructing a finite set of models. In contrast to the typical interpretation of the MM representation as a set of models obtained at discrete locations of the parameter space (Alkahe et al., ; Gadsden et al., 3; Zhao et al., 5), the interpretation introduced in this work consists of assuming that the MM is a mixture approximation of a Random Coefficient model.

To start with the postulated definition of the MM, consider first a Random Coefficient (RC) model in which the parameter vector $\theta$ at the structural state $v$ (also referred to as class) follows a distribution model $p(\theta|v)$, referred to as the parameter prior for class $v$ (or class-parameter prior).
Consequently, under class $v$, both $y$ and $\theta$ are jointly distributed random variables described by the joint Probability Density Function (PDF):

$$ p(y, \theta|v) = p(y|\theta, v)\; p(\theta|v) \tag{5.3.10} $$

where $p(y|\theta, v) \equiv p(y|\beta, \theta, S, v)$ is the likelihood in Equation (5.3.6).

Figure 5.3.(a) describes the modeling concept through RC models. In the first instance, a single vibration response signal $y$ is measured from the structure. The likelihood function $p(y|\theta,v)$ associates the observed vibration signal $y$ with a parameter vector $\theta$ when the structure is at state $v$. This parameter vector describes the dynamics of the vibration response within the particular analysis interval, over which the environmental and operational conditions may be assumed constant. At the same time, the class-parameter prior $p(\theta|v)$ describes the whole variability of the parameters associated with the structure at state $v$ under uncertainty. The class-parameter prior and the likelihood lead to a joint description of $y$ and $\theta$ in terms of the joint PDF $p(y, \theta|v)$ defined in Equation (5.3.10).

The selection of the prior distribution is considered the most important aspect in Bayesian analysis and has received a lot of attention in the related literature (Berger, 1985, Ch.3) (Robert and Casella, 4, Ch.3). Many of the methods for selecting the prior distribution start by selecting a distribution model and then fitting the parameters of the distribution to a set of experimental data. The normal distribution model is the most commonly used (Mosavi et al., ; Zao and Wang, ). However, Gaussian mixture (Nair and Kiremidjian, 7; Słoǹski, ), kernel (Fricker et al., ; Khatibinia et al., 3), and interval analysis (Muscolino et al., 5) methods, as well as fuzzy models (in a non-probabilistic framework) (Chandrashekhar and Ganguli, 9; Degrauwe et al., 9), are alternatives for representing the parameter distribution whenever more complex distributions are involved.
In order to simplify the notation, the dependency of the joint PDF and the likelihood on the scheduling variable and the model structure is not explicitly shown. In strict notation, the joint PDF is written as $p(y, \theta|v, \beta, S)$, and the likelihood as $p(y|\theta, v, \beta, S)$.

Figure 5.3.: Representation of the relationship between an observed vibration response signal $y$ and its associated random parameter vector $\theta$. (a) In a random coefficient model, $y$ and $\theta$ are associated through the likelihood function $p(y|\theta,v)$, while $\theta$ is drawn from the class-parameter prior $p(\theta|v)$. (b) In the MM representation, $y$ and $\theta$ are also associated through the likelihood function, but the class-parameter prior is approximated by the superposition of individual models $m_{v,l}=\{\theta_{v,l}, S\}$ describing the parameter distribution at particular environmental and operational conditions.

Random coefficient LPV AR models for SHM have also been explored by the authors in (Avendaño-Valencia and Fassois, 5d). Random coefficient models may be effectively used for the representation of uncertainty; however, they may require large amounts of data during the model identification process, especially for complex distribution models.

Alternatively, the Multiple Model (MM) representation for the class $v$, denoted $\mathcal{M}_v$, consists of a set of $L$ individual models, that is $\mathcal{M}_v = \{m_{v,l}\}$, $l=1,\ldots,L$, with corresponding parameter vectors $\theta_{v,l}$, $l=1,\ldots,L$ and a common structure $S$. Each individual model in the MM describes the vibration response of the structure under specific conditions, while the MM as a whole represents the full variability under uncertainty.
The key approximation then consists of expressing the joint PDF $p(y, \theta|v)$ as that corresponding to the MM, as follows:

$$ p(y, \theta|v) \approx p(y|\theta, v)\; \tilde{p}(\theta|v) \tag{5.3.11a} $$

$$ \tilde{p}(\theta|v) = \sum_{l=1}^{L} \pi_{v,l}\, f(\theta|\theta_{v,l}) \tag{5.3.11b} $$

$$ \sum_{l=1}^{L} \pi_{v,l} = 1; \qquad \int f(\theta|\theta_{v,l})\, d\theta = 1, \;\; \forall\, l \tag{5.3.11c} $$

where $p(y|\theta,v)$ is the likelihood as in Equation (5.3.6), $\tilde{p}(\theta|v)$ is the MM parameter prior for class $v$, $\pi_{v,l}$ are the MM weights, and $f(\theta|\theta_{v,l})$ is the MM kernel function, which is a positive semi-definite function with compact support, centered on the value $\theta_{v,l}$, that integrates to one. The normalizing conditions in Equation (5.3.11c) ensure that the MM parameter prior for class $v$ integrates to 1. Indeed, the integral of the MM prior is

$$ \int \tilde{p}(\theta|v)\, d\theta = \int \sum_{l=1}^{L} \pi_{v,l}\, f(\theta|\theta_{v,l})\, d\theta = \sum_{l=1}^{L} \pi_{v,l} \int f(\theta|\theta_{v,l})\, d\theta $$

Thus, if $\sum_{l=1}^{L} \pi_{v,l} = 1$ and $\int f(\theta|\theta_{v,l})\, d\theta = 1$, then $\int \tilde{p}(\theta|v)\, d\theta = 1$.

Figure 5.3.(b) depicts the concept of modeling uncertainty through an MM representation. The relationship between a realization of the parameter vector and a realization of the vibration response is again established through the likelihood function, but the class-parameter prior $p(\theta|v)$ is approximated by $\tilde{p}(\theta|v)$, constructed as the superposition of several models as shown in Equation (5.3.11b), where each individual model describes the parameter variability under particular environmental and operational conditions.

The MM uses a type of kernel density model or mixture density model for the parameter prior of each class (see (Duda et al.,, Sec. 4.3) and (Hastie et al., 9, Sec. 6.6)), where each model $m_{v,l}$ determines a single kernel of the approximation, and the parameter vectors (kernel centers) and the weights become the design parameters of the model. The model is completed by assigning a type of kernel function to the MM. For the purposes of this work, a Gaussian density function is used as the kernel, defined as follows:

$$ f(\theta|\theta_l) = (2\pi)^{-n/2}\, |\Sigma_l|^{-1/2} \exp\left( -\frac{1}{2} (\theta - \theta_l)^T \Sigma_l^{-1} (\theta - \theta_l) \right); \qquad n = \dim \theta \tag{5.3.12} $$

Nonetheless, other PDF models may also be used for this purpose. Note that the selected kernel model may have specific parameters that in turn become free parameters of the MM representation. A small illustrative example comparing RC models and MM representations is shown in Appendix 5.B.

Construction of an MM representation

Consider a set of $M$ independently drawn vibration response signals obtained from the structure at its state $v$, namely $Y_v = \{ y_{v,1}, y_{v,2}, \ldots, y_{v,M} \}$, where $M \geq L$ (that is, the number of models used in the MM may be lower than the number of training signals). The problem of construction (or identification) of the MM representation for the structural state $v$ then consists of: (i) identification of the individual models; (ii) identification of the MM representation. Each of these subproblems is discussed next.
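The mixture form of the MM parameter prior can be made concrete with a short Python sketch. It evaluates a weighted sum of Gaussian kernels at a given parameter vector; the restriction to diagonal kernel covariances and all function names are assumptions made here purely to keep the example free of matrix libraries.

```python
import math

def gaussian_kernel(theta, center, var_diag):
    """Gaussian kernel density with a diagonal covariance (assumed here)."""
    n = len(theta)
    log_det = sum(math.log(v) for v in var_diag)
    quad = sum((t - c) ** 2 / v for t, c, v in zip(theta, center, var_diag))
    return math.exp(-0.5 * (n * math.log(2.0 * math.pi) + log_det + quad))

def mm_prior(theta, weights, centers, var_diags):
    """MM parameter prior: weighted superposition of the kernels."""
    if min(weights) < 0.0 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to one")
    return sum(w * gaussian_kernel(theta, c, v)
               for w, c, v in zip(weights, centers, var_diags))
```

For example, `mm_prior([0.0], [0.5, 0.5], [[-1.0], [1.0]], [[1.0], [1.0]])` evaluates a two-kernel prior at the midpoint between its centers. The weight check mirrors the normalizing conditions that make the prior integrate to one.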
Identification of the individual models

The identification of the individual models aims at determining the parameter vectors $\theta_{v,l}$ and the common structure $S=\{n_a, n_c, b_a, b_c, b_s\}$ of the LPV ARMA or FS TARMA models. The identification of an LPV ARMA (FS TARMA) model is an intricate optimization procedure in which models corresponding to various candidate structures are estimated, and the one providing the best fit to the available set of vibration responses is selected. Moreover, since the construction of the MM is based on several signals from different conditions, it is of interest to obtain a single common structure that properly captures all the dynamic behaviors of the structure. For this purpose, the selection of the structure of the individual models may be performed using the guidelines in Poulimenos and Fassois (9b); Spiridonakos and Fassois (4b), based on the empirical risk as a performance measure (Schölkopf and Smola,, Ch.3), (Vapnik,, Ch.):

$$ R_{emp}(S) = \frac{1}{M} \sum_{m=1}^{M} \big\| y^{(val)}_{v,m} - \hat{y}^{(val)}_{v,m} \big\|_2^2, \qquad \hat{y}^{(val)}_{v,m} = \big[\, \phi^{(val)}_{v,m}[1] \;\cdots\; \phi^{(val)}_{v,m}[N] \,\big]^T \hat{\vartheta}_{v,m} \tag{5.3.13} $$

where $\|\cdot\|_2$ designates the Euclidean norm (2-norm) of the vector in the argument, $\phi^{(val)}_{v,m}[t]$ is the regression vector constructed with $y^{(val)}_{v,m}[t-1], \ldots, y^{(val)}_{v,m}[t-n_a]$, and $\hat{y}^{(val)}_{v,m}$ is the vector of one-step-ahead predictions evaluated on the validation set employing the ML parameter estimates $\hat{\vartheta}_{v,m}$ obtained from the training set (as in Equation (5.3.7a)), so that:

$$ y_{v,m} = \begin{bmatrix} y^{(tr)}_{v,m} \\ y^{(val)}_{v,m} \end{bmatrix}, \qquad \beta_{v,m} = \begin{bmatrix} \beta^{(tr)}_{v,m} \\ \beta^{(val)}_{v,m} \end{bmatrix} $$

with $\hat{y}$ representing the vector of model-based one-step-ahead predictions $\hat{y}[t|t-1]$, defined as

$$ \hat{y}[t|t-1] = E\big\{ y[t] \,\big|\, y[t-1], \ldots, y[1], \beta[t], \theta \big\} = E\big\{ \phi^T[t]\, \vartheta + w[t] \,\big|\, y[t-1], \ldots, y[1], \beta[t], \theta \big\} = \phi^T[t]\, \vartheta \tag{5.3.14} $$
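The structure selection based on the empirical risk can be sketched in a few lines of Python. The sketch assumes that the one-step-ahead predictions on each validation segment are already available, and that the risk is the mean squared Euclidean norm of the prediction error; the function names and the dictionary of candidate structures are hypothetical.

```python
def one_step_predictions(phis, vartheta):
    """Predictions phi[t]^T vartheta for a list of regression vectors."""
    return [sum(p * c for p, c in zip(phi, vartheta)) for phi in phis]

def empirical_risk(val_signals, val_predictions):
    """Mean over signals of the squared prediction error norm (assumed squared)."""
    total = 0.0
    for y, y_hat in zip(val_signals, val_predictions):
        total += sum((a - b) ** 2 for a, b in zip(y, y_hat))
    return total / len(val_signals)

def select_structure(risk_by_structure):
    """Return the candidate structure label with the smallest empirical risk."""
    return min(risk_by_structure, key=risk_by_structure.get)
```

In use, `empirical_risk` would be evaluated once per candidate structure, with parameters estimated on the training segments only, and `select_structure` applied to the resulting dictionary.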

Model identification in terms of the empirical risk is important for achieving an MM representation that does not over-fit the training data and is capable of generalizing to unseen data. Furthermore, the empirical risk tends to favor models with low complexity, thus enabling the selection of the most compact and efficient representation (Schölkopf and Smola,, Ch.3), (Vapnik,, Ch.).

Identification of the MM representation

The identification of the MM representation requires determining the weights $\pi_{v,l}$, the number of models $L$ of the MM, and other structural parameters of the MM. Here, only the selection of the number of models in the MM is discussed. The adjustment of the weights for the purpose of optimizing the performance of the damage diagnosis methods is discussed in Section 5.3.3.

The selection of the structure of the MM aims to determine which models should be kept in the representation, in order to achieve a representation of reduced dimensionality (the dimensionality being the number of models in the MM). Thus, from the $M$ models obtained from an equal number of vibration response signals given in the baseline phase, the objective is to find a reduced set of $L$ models that will form the MM, after eliminating (or merging) the redundant models, namely those that share information with other models in the set. For this purpose, the Kullback-Leibler (K-L) divergence, defined as (Burnham and Anderson,, p.5):

$$ D_{KL}(m_a, m_b) = \int_{Y} p(y|\theta_a)\, \ln \frac{p(y|\theta_a)}{p(y|\theta_b)}\, dy \tag{5.3.15} $$

can be used as a measure of the information lost when the model $m_b$ is used to represent model $m_a$. Thus, a large K-L divergence indicates that models $m_a$ and $m_b$ represent different dynamics in the vibration response. Likewise, if both models $m_a$ and $m_b$ represent similar dynamic behaviors, then their divergence should be close to zero.
For LPV ARMA and FS TARMA models, the K-L divergence takes the form

$$ D_{KL}(m_a, m_b) = -\frac{N}{2} + \frac{1}{2}\, d_M^2(m_a, m_b) + \frac{1}{2}\, d_s^2(m_a, m_b) \tag{5.3.16} $$

$$ d_M^2(m_a, m_b) = (\vartheta_b - \vartheta_a)^T\, \Sigma_{\theta_b}^{-1}\, (\vartheta_b - \vartheta_a), \qquad d_s^2(m_a, m_b) = \sum_{t=1}^{N} \left( \frac{\sigma_{w(a)}^2(\beta_a[t])}{\sigma_{w(b)}^2(\beta_a[t])} - \ln \sigma_{w(a)}^2(\beta_a[t]) + \ln \sigma_{w(b)}^2(\beta_a[t]) \right) $$

where $\vartheta_i$ is the parameter vector with corresponding covariance matrix $\Sigma_{\theta_i}$, and $\sigma_{w(i)}^2(\beta_a[t])$ is the innovations variance of the model $m_i$, with $i \in \{a, b\}$. The K-L divergence of the LPV AR model shown in Equation (5.3.16) consists of two terms: $d_M^2(m_a, m_b)$, which is the Mahalanobis distance between the predictions of both models and penalizes the model $m_b$ when there are large prediction errors; and $d_s^2(m_a, m_b)$, which depends on the ratio between the innovations variances of both models and penalizes their disagreement. Alternative measures could be used instead, like the Bayes factor formulated in (Avendaño-Valencia and Fassois, 5a).

The reduction of the dimensionality of the representation (number of models) is then attempted by grouping similar models using a clustering method such as L-means, which aims at finding clusters containing models with similar features, together with their corresponding centers, thereby determining the most representative model of each group according to the K-L divergence for a given number of clusters (Duda et al.,, Ch. ), (Hastie et al., 9, Ch. 3). The clustering process consists of ranking the candidate models according to the K-L divergence and then grouping them into $L$ clusters holding models with small distances among themselves. Only the cluster centers are then kept in the MM, so that the dimension of the representation equals the number of clusters.

The K-L divergence between two models $m_1$ and $m_2$ defined by the Gaussian PDFs $\mathcal{N}(\mu_1, \Sigma_1)$ and $\mathcal{N}(\mu_2, \Sigma_2)$ is:

$$ D_{KL}(m_1, m_2) = \frac{1}{2} \left( \mathrm{tr}\big( \Sigma_2^{-1} \Sigma_1 \big) + (\mu_2 - \mu_1)^T \Sigma_2^{-1} (\mu_2 - \mu_1) - N + \ln \frac{\det \Sigma_2}{\det \Sigma_1} \right) $$

where $\mu_i \in \mathbb{R}^{N}$ and $\Sigma_i \in \mathbb{R}^{N \times N}$. Equation (5.3.16) then follows after substituting the corresponding mean and covariance matrices of the Gaussian densities defined by Equation (5.3.6).
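The two-term structure of the K-L divergence between models can be illustrated with a small Python sketch. It assumes that the inverse of the parameter covariance matrix of the second model is supplied by the caller as a nested list, and that both innovations variance sequences are evaluated on the same scheduling variable samples; all names are hypothetical.

```python
import math

def mahalanobis_sq(theta_a, theta_b, precision_b):
    """Quadratic form (theta_b - theta_a)^T P (theta_b - theta_a)."""
    d = [b - a for a, b in zip(theta_a, theta_b)]
    n = len(d)
    return sum(d[i] * precision_b[i][j] * d[j]
               for i in range(n) for j in range(n))

def variance_term(var_a, var_b):
    """Sum over time of ratio minus log of a plus log of b (variances)."""
    return sum(sa / sb - math.log(sa) + math.log(sb)
               for sa, sb in zip(var_a, var_b))

def kl_divergence(theta_a, theta_b, precision_b, var_a, var_b):
    """K-L divergence as minus N/2 plus half of each of the two terms."""
    n_samples = len(var_a)
    return (-0.5 * n_samples
            + 0.5 * mahalanobis_sq(theta_a, theta_b, precision_b)
            + 0.5 * variance_term(var_a, var_b))
```

Two identical models give a divergence of zero, since the variance term then sums to the number of samples and cancels the constant.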

5.3.2 Multiple Model Based SHM methods

Methods based on the marginal likelihood (MM-ML)

The idea of this approach consists of basing the damage diagnosis directly on the test signal $y_u$. The problem of damage diagnosis is then posed in terms of Bayesian decision theory, where the vibration response and the structural state are treated as jointly distributed random variables. Thus, inference is based on the class posterior probability $P(v|y_u)$, which measures the probability that the current structural state is $v$ after having observed the test signal $y_u$. The prior and posterior probabilities are related through the Bayes theorem:

$$ P(v|y_u) = \frac{p(y_u|v)\, P(v)}{p(y_u)} = \frac{p(y_u|v)\, P(v)}{\sum_{v} p(y_u|v)\, P(v)} \tag{5.3.17} $$

where the evidence $p(y_u)$ is the PDF of $y_u$ across all structural states, and the marginal likelihood $p(y_u|v)$ is the marginal of the class joint PDF $p(y_u, \theta|v)$ with respect to all the possible values of $\theta$, defined as:

$$ p(y_u|v) = \int_{\Theta} p(y_u, \theta|v)\, d\theta = \int_{\Theta} p(y_u|\theta)\, p(\theta|v)\, d\theta \tag{5.3.18} $$

where $\Theta \subseteq \mathbb{R}^{n}$ is the space where $\theta$ is defined. According to Bayesian decision theory, the best decision about the structural state is the Bayes decision, which simply requires selecting the class with the highest posterior probability (Berger, 1985, Sec. 4.4). So, the Bayes decision in the present case takes the form:

$$ \hat{v} = \arg\max_{v \in \{o,a,b,c,\ldots\}} P(v|y_u) = \arg\max_{v \in \{o,a,b,c,\ldots\}} p(y_u|v)\, P(v) \tag{5.3.19} $$

where the evidence is disregarded in the last expression, since it is equal for all classes. When only two classes are considered, say $a$ and $b$, the decision can be made in terms of the ratio of the two posterior probabilities $P(b|y_u)/P(a|y_u)$. If this ratio is greater than one (indicating that $P(b|y_u) > P(a|y_u)$), then class $b$ is selected, that is:

$$ \frac{P(b|y_u)}{P(a|y_u)} = \frac{p(y_u|b)}{p(y_u|a)} \cdot \frac{P(b)}{P(a)} > 1 \;\;\Rightarrow\;\; \text{Class } b \text{ is selected}; \qquad \text{otherwise} \;\;\Rightarrow\;\; \text{Class } a \text{ is selected} \tag{5.3.20} $$

The ratio of the priors $P(b)/P(a)$ is a bias term that weighs the decision towards class $a$ or $b$, and is a user-defined parameter that determines the performance of the method.

Clearly, the most important part of this method is the computation of the marginal likelihood $p(y_u|v)$. Two alternatives for solving the marginalizing integral in Equation (5.3.18) with the help of the MM representation of the class joint PDF are analyzed. These are:

1. Finite sum approximation with Gaussian kernels: Substituting the definition of the MM prior probability (Equation (5.3.11b)) into the expression for the marginal probability in Equation (5.3.18) yields:

$$ p(y_u|v) = \int_{\Theta} p(y_u|\theta) \left( \sum_{l=1}^{L} \pi_{v,l}\, f(\theta|\theta_{v,l}) \right) d\theta = \sum_{l=1}^{L} \pi_{v,l} \left( \int_{\Theta} p(y_u|\theta)\, f(\theta|\theta_{v,l})\, d\theta \right) \tag{5.3.21} $$

where $\theta_{v,l}$ and $\pi_{v,l}$ for $l=1,\ldots,L$ are the centers and weights of the MM of the class $v$. Considering that each model $m_{v,l}$ is associated with a Gaussian kernel with mean $\theta_{v,l}$ and covariance $\Sigma_{v,l}$, and since the likelihood is Gaussian as well (as shown in Equation (5.3.6)), it follows that the integral inside Equation (5.3.21) is (Rasmussen and Williams, 6, App.):

$$ \int_{\Theta} p(y_u|\theta)\, f(\theta|\theta_{v,l})\, d\theta = \int_{\Theta} \mathcal{N}\big( \Phi_u^T \theta,\; \Sigma_w \big) \cdot \mathcal{N}\big( \theta_{v,l},\; \Sigma_{v,l} \big)\, d\theta = \mathcal{N}\big( \hat{y}_{v,l},\; \Sigma_{\varepsilon_{v,l}} \big) \tag{5.3.22} $$

where the mean and covariance matrices are defined as:

$$ \hat{y}_{v,l} = \Phi_u^T\, \theta_{v,l}, \qquad \Sigma_{\varepsilon_{v,l}} = \Sigma_w + \Phi_u^T\, \Sigma_{v,l}\, \Phi_u $$

with $\Phi_u = [\, \phi_u[1] \;\cdots\; \phi_u[N] \,]$, $\phi_u[t]$ the regression vector constructed with $y_u[t-1], \ldots, y_u[t-n_a]$, and $\Sigma_w = \mathrm{diag}\big( \sigma_w^2(\beta[t]) \big)$. Then, the marginal probability of the MM becomes:

$$ p(y_u|v) = \sum_{l=1}^{L} \pi_{v,l}\; \mathcal{N}\big( \hat{y}_{v,l},\; \Sigma_{\varepsilon_{v,l}} \big) \tag{5.3.23} $$

and corresponds to a weighted average of the likelihoods of the models in the MM, where the weights $\pi_{v,l}$ determine which models are dominant in the approximation. Moreover, given that the actual values of the parameter vectors $\theta_{v,l}$ and covariance matrices $\Sigma_{\varepsilon_{v,l}}$ are unknown, these are replaced by the respective ML estimates $\hat{\theta}_{v,l}$ and $\hat{\Sigma}_{\varepsilon_{v,l}} = \mathrm{diag}\big( \hat{\sigma}_e^2[t] \big)$, where $\hat{\sigma}_e^2[t]$ are the estimates of the innovations variance obtained from the estimation residuals $e[t|\hat{\vartheta}_{v,l}]$, as shown in Equation (5.3.7c).

A limit case arises when the norm of $\Sigma_{v,l}$ is much lower than the norm of $\Sigma_w$. In that case, the covariance matrix $\Sigma_{\varepsilon_{v,l}}$ is dominated by the term $\Sigma_w$, and thus $\Sigma_{\varepsilon_{v,l}} \approx \Sigma_w$. In consequence, the marginal probability reduces to:

$$ p(y_u|v) = \sum_{l=1}^{L} \pi_{v,l}\; \mathcal{N}\big( \hat{y}_{v,l},\; \Sigma_w \big) \tag{5.3.24} $$

This limit case corresponds to the Bayesian MM SHM approach presented in Avendaño-Valencia and Fassois (5a), and differs from the main case presented above in that all the kernels share the same covariance matrix $\Sigma_{\varepsilon_{v,l}} = \Sigma_w$.

2. Maximum approximation: Another limit case is when the sum in Equation (5.3.23) is dominated by the largest term.
This happens if each model in the MM representation dominates over specific non-intersecting (or only slightly overlapping) regions in the space of $y$, which is facilitated by implementing a structure optimization method like the one discussed under "Identification of the MM representation". In such a case, the marginal probability in Equation (5.3.23) may be approximated by:

$$ p(y_u|v) = \max_{l=1,\ldots,L}\; \pi_{v,l}\; p(y_u|\theta_{v,l}) \tag{5.3.25} $$

which indicates that the marginal probability is approximated by the maximum of the weighted likelihoods in the MM evaluated on the test signal $y_u$.

With the marginal likelihood computed, it only remains to specify the tests for damage detection and damage identification, based on the tests shown in Equations (5.3.19) and (5.3.20).

Damage detection

The damage detection problem consists of determining the presence of damage in the structure based on the measured vibration response signal $y_u$. In this case, two classes are considered: $o$, healthy, and $d$, damaged. Yet, only information from the healthy class is utilized (the reason being that the current damage may not necessarily belong to one of the modeled classes). To cope with this, the posterior $P(d|y_u)$ is replaced by $1 - P(o|y_u)$, as the structure is either healthy or damaged. Substituting into Equation (5.3.20) yields:

$$ \frac{1 - P(o|y_u)}{P(o|y_u)} = \frac{1}{P(o|y_u)} - 1 = \frac{p(y_u)}{p(y_u|o)\, P(o)} - 1 \tag{5.3.26} $$

thus leading to the following test for damage detection:

$$ \frac{p(y_u)}{p(y_u|o)} > \rho_{lim} = 2 P(o) \;\;\Rightarrow\;\; \text{The structure is damaged}; \qquad \text{otherwise} \;\;\Rightarrow\;\; \text{The structure is healthy} \tag{5.3.27} $$

where $\rho_{lim} = 2P(o)$ is a selected damage detection threshold ($\rho_{lim} < 2$). The evidence $p(y_u)$ is obtained from the double marginalization of the joint PDF $p(y_u, \theta, v)$ with respect to all the possible values of $\theta$ and $v$. Since the joint PDF $p(y_u, \theta, v)$ can be decomposed into the product $p(y_u|\theta)\, p(\theta|v)\, P(v)$, it follows that:

$$ p(y_u) = \int_{\Theta} p(y_u|\theta) \left( \sum_{v \in V} p(\theta|v)\, P(v) \right) d\theta = \int_{\Theta} p(y_u|\theta)\, p(\theta)\, d\theta \tag{5.3.28} $$

where $p(\theta)$ is the (class-unconditional) parameter PDF, namely, the PDF of the parameters regardless of the structural state. Since $p(\theta)$ is in general unknown (given the impossibility of knowing all the potential structural states), the integral in Equation (5.3.28) is computed by means of the Laplace method, which uses a Taylor series expansion of $\ln p(y_u, \theta)$ around its maximum to yield an approximate value for the evidence of the form (Berger, 1985, p.65), (MacKay, 3, p.34), (Robert, 7, p.98):

$$ p(y_u) \approx \sqrt{\frac{(2\pi)^n}{\det A}}\;\; p(y_u|\hat{\theta}_u)\; p(\hat{\theta}_u) = C\; p(y_u|\hat{\theta}_u) \tag{5.3.29} $$

where $\hat{\theta}_u$ is the ML estimate of $\theta$, $A$ is the negative of the matrix of second derivatives of $\ln p(y_u, \theta)$ with respect to $\theta$, evaluated at $\hat{\theta}_u$, and $C$ is a constant.

The method described above shall be referred to as the Multiple Model Marginal Likelihood (MM-ML) damage detection method and is summarized in the MM-ML method table. The weights $\pi_{o,l}$ and the threshold $\rho_{lim}$ are the free parameters of the damage detection method, which may be optimized as discussed in Section 5.3.3. On the other hand, the constant $C$ from the Laplace approximation has been neglected in the damage detection test, since it is absorbed by the threshold $\rho_{lim}$. Notice that the application of this method requires calculating the ML estimate of the parameter vector for the test signal, namely $\hat{\theta}_u$, and its respective likelihood value.
This term serves as a normalization factor that facilitates the comparison, although it may be assumed constant to avoid the evaluation of ML parameter estimates during the inspection phase, at the cost of a potential decrease in performance.

Damage identification

Once the presence of damage has been detected, the damage identification problem consists of determining the specific type of damage (class) based on the available vibration response signal $y_u$. This is achieved via the Bayes decision described above (Equation (5.3.19)), using all the available MMs corresponding to damaged structural states. This method shall be referred to as the Multiple Model Marginal Likelihood (MM-ML) damage identification method and is summarized in the MM-ML method table. As in the damage detection case, the weights are the free parameters of the damage identification method and may be optimized as discussed in Section 5.3.3.

Methods based on the Kullback-Leibler Divergence (MM-KL)

In contrast to the MM-ML methods, in the MM Kullback-Leibler (MM-KL) divergence based methods the decision is made in terms of a model $m_u$ of the test signal $y_u$. The concept of the MM-KL methods is to determine whether the model $m_u$ corresponds to any of the models in the MM of the healthy or damaged states (represented by each class MM). The current state of the structure is then assigned to the class of the MM that holds the models with the best correspondence. For that purpose, the parameter vector $\theta_u$, corresponding to the ML estimate of the parameter vector of the model derived from the test signal $y_u$, is associated with the likelihood $p(y_u|\theta_u)$. Subsequently, damage detection and identification are formulated in terms of the likelihood of the test model and those of the kernels of the MM of each class. In this sense, the K-L divergence defined in Equation (5.3.16) can be used to determine whether the likelihood of the test model corresponds to any of the models in the MM, $\mathcal{M}_v$, for all $v \in \{o,a,b,\ldots\}$.

Again, the damage detection and identification subproblems require specific treatment. The solution to each of these subproblems is developed in the sequel.
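The marginal likelihood based tests described above can be sketched in Python as follows. The sketch works with log-likelihoods rather than likelihoods, which is a numerical choice made here and not something stated in the text, and it assumes that the log-likelihood of the test signal under each model of an MM has already been computed. All function names are hypothetical.

```python
import math

def log_marginal_sum(log_liks, weights):
    """Log of the weighted sum of likelihoods, via the log-sum-exp trick."""
    terms = [math.log(w) + ll for w, ll in zip(weights, log_liks) if w > 0.0]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def log_marginal_max(log_liks, weights):
    """Log of the largest weighted likelihood (maximum approximation)."""
    return max(math.log(w) + ll for w, ll in zip(weights, log_liks) if w > 0.0)

def detect_damage(log_lik_test, log_liks_healthy, weights, rho_lim,
                  use_max=False):
    """Flag damage when the likelihood ratio exceeds the threshold rho_lim."""
    marginal = log_marginal_max if use_max else log_marginal_sum
    log_rho = log_lik_test - marginal(log_liks_healthy, weights)
    return log_rho > math.log(rho_lim), log_rho

def identify_damage(log_liks_by_class, weights_by_class):
    """Pick the damage class with the largest marginal likelihood."""
    scores = {v: log_marginal_sum(lls, weights_by_class[v])
              for v, lls in log_liks_by_class.items()}
    return max(scores, key=scores.get)
```

Here `log_lik_test` stands for the log-likelihood of the test signal at its own ML estimate; treating it as a constant would remove the need to estimate a model during inspection, at the possible cost in performance noted in the text.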

Table 5.3.: The Multiple Model Marginal Likelihood-based (MM-ML) method

Damage detection: Given a test signal $y_u$ and the MM of the healthy state $\mathcal{M}_o = \{m_{o,l}\}$, $l=1,\ldots,L$, with $m_{o,l} = \{\hat{\theta}_{o,l}, S\}$ and corresponding weights $\pi_{o,l}$, and a damage detection threshold $\rho_{lim}$:
1. Compute the parameter vector estimate $\hat{\theta}_u$ and its likelihood $p(y_u|\hat{\theta}_u)$ [Eqns. (5.3.7a) and (5.3.7b)];
2. Compute the likelihoods of the individual models in the MM, $p(y_u|\hat{\theta}_{o,l})$ for all $l=1,\ldots,L$ [Eqn. (5.3.7b)];
3. Evaluate the marginal likelihood with one of the following approximations:
   (i) $p(y_u|o) = \sum_{l=1}^{L} \pi_{o,l}\, p(y_u|\hat{\theta}_{o,l})$: Finite sum approximation (MM-ML-sum) [Eqn. (5.3.23)]
   (ii) $p(y_u|o) = \max_{l}\, \pi_{o,l}\, p(y_u|\hat{\theta}_{o,l})$: Maximum approximation (MM-ML-max) [Eqn. (5.3.25)]
4. Damage detection test [Eqn. (5.3.27)]:
   $\rho_d(y_u) = \dfrac{p(y_u|\hat{\theta}_u)}{p(y_u|o)} > \rho_{lim}$: The structure is damaged; otherwise: The structure is healthy
   where $\rho_d(y_u)$ is the statistical quantity for damage detection. The constant in the Laplace approximation $p(y_u) = C\, p(y_u|\hat{\theta}_u)$ is absorbed by the damage detection threshold, and for that reason is not explicitly shown in the damage detection test.

Damage identification: Given a test signal $y_u$ and the MMs of the damaged states $\mathcal{M}_v = \{m_{v,l}\}$ for $l=1,\ldots,L$ and $v \in \{a,b,c,\ldots\}$, with $m_{v,l} = \{\hat{\theta}_{v,l}, S\}$ and corresponding weights $\pi_{v,l}$:
1. Compute the likelihoods of the individual models of each one of the MMs, $p(y_u|\hat{\theta}_{v,l})$ for all $l=1,\ldots,L$ and $v \in \{a,b,c,\ldots\}$ [Eqns. (5.3.7a) and (5.3.7b)];
2. Evaluate the marginal likelihood with one of the following approximations:
   (i) $p(y_u|v) = \sum_{l=1}^{L} \pi_{v,l}\, p(y_u|\hat{\theta}_{v,l})$: Finite sum approximation (MM-ML-sum) [Eqn. (5.3.23)]
   (ii) $p(y_u|v) = \max_{l}\, \pi_{v,l}\, p(y_u|\hat{\theta}_{v,l})$: Maximum approximation (MM-ML-max) [Eqn. (5.3.25)]
3. Damage identification test [Eqn. (5.3.19)]: $\hat{v} = \arg\max_{v \in \{a,b,c,\ldots\}} p(y_u|v)$

Damage detection

In the present context, the problem of damage detection consists of determining whether the likelihood of the test model coincides with the likelihood of any of the models in the MM of the healthy state, $\mathcal{M}_o$. Accordingly, the current inspection signal is assigned to the healthy class if the smallest K-L divergence evaluated over all the models in $\mathcal{M}_o$ is lower than a certain threshold $d_{lim}$. Formally, this is stated as follows:

$$ \min_{l=1,\ldots,L} D_{KL}(m_{o,l}, m_u) < d_{lim} \;\;\Rightarrow\;\; \text{The structure is healthy}; \qquad \text{otherwise} \;\;\Rightarrow\;\; \text{The structure is damaged} \tag{5.3.30} $$

The threshold $d_{lim}$ constitutes a free parameter defined by the user, which may also be optimized in order to maximize the performance of the method (see Section 5.3.3). The MM-KL damage detection method is summarized in the MM-KL method table.
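The minimum-divergence test can be sketched in Python as below, assuming the K-L divergences between the test model and each model of an MM have already been evaluated. The second function extends the same rule to several classes by picking the class with the smallest weighted minimum divergence; the weights, the class labels and the function names are hypothetical placeholders.

```python
def mm_kl_detect(divergences_healthy, d_lim):
    """Return the decision and the smallest divergence to the healthy MM."""
    d_min = min(divergences_healthy)
    decision = "healthy" if d_min < d_lim else "damaged"
    return decision, d_min

def mm_kl_identify(divergences_by_class, lambdas_by_class):
    """Return the class whose MM gives the smallest weighted divergence."""
    best = {}
    for v, divs in divergences_by_class.items():
        best[v] = min(lam * d for lam, d in zip(lambdas_by_class[v], divs))
    return min(best, key=best.get)
```

Raising a weight for one class makes its divergences look larger and so biases the decision away from that class, which is how such weights can be used to tune the classifier.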

197 5. An MM Framework for Vibration-Based SHM in Non-Stationary and Uncertain Environments Pt. I Table 5.3.: The Multiple Model Kullback-Leibler divergence-based (MM-KL) method Damage detection: Given a test signal y u and the MM of the healthy state M o ={m o,l }, with m o,l ={ ˆθ o,l,s} for all l =,...,L, and a damage detection threshold d lim :. Compute the ML estimate ˆθ u and its corresponding covariance matrix Σ θ u Eqns. (5.3.7a) and (5.3.9) ;. Evaluate the Kullback-Leibler divergence between the test model and all the models inm o Eqn. (5.3.6). 3. Damage detection test Eqn. (5.3.3) : min l=,...,l D KL(m u,m o,l ) d lim otherwise The structure is healthy The structure is damaged Damage identification: Given a test signal y u and the MM of the damaged states M v = {m v,l }, with m v,l = { ˆθ v,l,s} and weights λ v,l, for all l =,...,L and v={a,b,c,...}:. Compute the ML estimate ˆθ u and its corresponding covariance matrix Σ θ u Eqns. (5.3.7a) and (5.3.9) ;. Evaluate the Kullback-Leibler divergence between the test model and all the models inm v, v={a,b,c,...} Eqn. (5.3.6). 3. Damage identification test Eqn. (5.3.3) : ˆv = min v {a,b,c,...} ( ) min λ v,l D KL (m u,m v,l ) l=,...,l Damage identification Following a similar reasoning, the damage identification problem can be solved by assigning the current inspection signal to the state whose corresponding model set yields the least of the minimum divergences, this is to say ( ) v= min min λ v,l D KL (m u,m v,l ) (5.3.3) v {a,b,c,...} l=,...,l where the weights λ v,k are introduced to bias the decision towards a certain class, and are free parameters that can be used to optimize the performance of the method (see Section 5.3.3). The MM-KL damage identification method is summarized in Table It is evident the similarity of the MM-KL methods with the nearest neighbors classifier, where the test object is assigned to the most repeated class among the nearest K neighbors (Duda et al.,, Sec. 
4.7), (Hastie et al., 9, Sec. 3.3). In the present context, each model in the MM is treated as a neighbor and the function measuring the distance between neighbors is the K-L divergence. However, it must be pointed out that the nearest neighbors method operates on distances between feature vectors, whereas in the methodology based on K-L divergences, the distances are measured between PDFs. The advantage is that the space spanned by y is approximated more efficiently in terms of the volumes associated with the PDFs (likelihoods) of the models in the MM, instead of single points in the space, as made in the nearest neighbors methods Optimization of the Multiple Model based SHM methods The optimization of the damage diagnosis refers to the problem of adjusting the MM representation and the related damage diagnosis tests in order to maximize the performance in an independent set of data. In this sense, the fixing parameters of the damage diagnosis methods, including the detection thresholds (ρ lim or d lim according to the method), the class prior probabilities P(v), v = {o,a,b,c,...}, the weights (π v,l or λ v,l ), are the parameters controlling the performance of the damage diagnosis methods. While the detection thresholds, prior probabilities and weights may be specified according to specific assumptions about the distribution of the damage detection (identification) statistic (Fassois and Sakellariou, 9; Spiridonakos and Fassois, 3), a more reliable approach 79

is to adjust these values by optimizing the performance of the damage diagnosis methods on an independent set of data. For this purpose, cross-validation techniques can be used to estimate the generalization error (the error obtained on a set of data independent from the training data) of the damage diagnosis methods (Hastie et al., 9, Sec. 7.). In these techniques the set of baseline vibration signals is randomly split into a training set, used to construct and optimize the MM representation, and a validation set. The validation set is designed so that it does not include samples from the training set, in order to avoid evaluating on training samples.

The training and validation sets may be created according to the K-fold cross validation approach, under which the available set of $M$ vibration response signals from healthy and damaged states is randomly divided into $K$ disjoint subsets called folds, each fold $F_k$, $k = 1,\ldots,K$, being composed of $M/K$ vibration signals from all structural states. In the $k$-th iteration of the cross validation, the fold $F_k$ becomes the validation set, while the remaining $K-1$ folds (all folds $F_j$ with $j \in \{1,\ldots,k-1,k+1,\ldots,K\}$) become the training set. Then, the signals in the training set from each structural state are separated to construct individual MMs $\mathcal{M}_v$ for each $v \in \{o,a,b,c,\ldots\}$. In the case of damage detection only an MM for the healthy state is constructed. Subsequently, a vector $q_{v,k}$ of size $M/K$ with the damage characteristic quantities (either the marginal likelihood or the minimum K-L divergence) of the MM based SHM method is computed for each signal in the validation set $F_k$ and for each structural state. This process is repeated for $k = 1,\ldots,K$, from which the matrix

$$Q = \begin{bmatrix} q_{o,1} & q_{o,2} & \cdots & q_{o,K} \\ q_{a,1} & q_{a,2} & \cdots & q_{a,K} \\ q_{b,1} & q_{b,2} & \cdots & q_{b,K} \\ \vdots & \vdots & & \vdots \end{bmatrix}$$

with the characteristic quantities for each signal in the complete set and for each structural state is obtained.
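The K-fold splitting described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function names `stratified_k_folds` and `train_validation_split`, the dictionary layout (state label mapped to a list of signal identifiers) and the round-robin assignment after shuffling are hypothetical choices, not taken from the thesis.

```python
import random


def stratified_k_folds(signals_by_state, n_folds, seed=0):
    """Split the signal identifiers of every structural state into n_folds disjoint folds.

    signals_by_state maps a state label (e.g. "o", "a") to a list of signal ids.
    Each fold receives (approximately) the same number of signals from every state.
    """
    rng = random.Random(seed)
    folds = [{state: [] for state in signals_by_state} for _ in range(n_folds)]
    for state, signal_ids in signals_by_state.items():
        shuffled = list(signal_ids)
        rng.shuffle(shuffled)
        # Round-robin assignment keeps the folds disjoint and balanced.
        for position, signal_id in enumerate(shuffled):
            folds[position % n_folds][state].append(signal_id)
    return folds


def train_validation_split(folds, k):
    """Use fold k for validation and the union of all other folds for training."""
    validation = folds[k]
    training = {state: [] for state in validation}
    for j, fold in enumerate(folds):
        if j != k:
            for state, signal_ids in fold.items():
                training[state].extend(signal_ids)
    return training, validation
```

Looping `k` over all folds and collecting the characteristic quantity of every validation signal yields one row of the matrix $Q$ per structural state.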
An important feature of the matrix $Q$ is that all its entries are computed from an independent set of data. In the case of damage detection $Q$ is composed of a single row, corresponding to the characteristic quantities of the healthy state; otherwise, $Q$ is composed of as many rows as structural states are represented. Afterward, the overall performance of the method is evaluated on the obtained $Q$ matrix.

A robust approach to analyze the performance of a detection method is by means of Receiver Operating Characteristic (ROC) curves, which display the True Positive Rate (TPR) vs. the False Positive Rate (FPR) of a detector (or a two-class classifier) as one of its parameters changes (Duda et al.,, pp. 48-5), Fawcett (6). In the damage detection problem, the parameter may be the detection threshold, while in the damage identification case the parameter may be the ratio between the prior probability of a target structural state and the prior probabilities of the remaining structural states. Perfect performance is achieved when the ROC crosses the point $(0,1)$; the false positive rate then equals zero and the true positive rate equals one. Likewise, curves approaching the point $(0,1)$ indicate improved performance of the method. The Area Under the ROC Curve (AUC) is the integral of the ROC and yields values in the range 0 to 1. Values approaching 1 indicate very good performance, a value close to 0.5 indicates a random selection of the structural state, and values approaching 0 indicate poor performance. The AUC can also be interpreted as the probability that a detector will rank a randomly chosen positive instance higher than a randomly chosen negative one (Duda et al.,, pp. 48-5), Fawcett (6).

Based on the AUC, the free parameters may be obtained as the solution of the constrained optimization problem (hereby shown for the MM-ML damage detection case):

$$\{\pi^*\} = \arg\max_{\pi} \mathrm{AUC}(\pi) \quad \text{s.t.} \quad \pi_{o,l} \geq 0,\ l = 1,\ldots,L, \qquad \sum_{l=1}^{L} \pi_{o,l} = 1 \qquad (5.3.3)$$

where $\pi = [\, \pi_{o,1}\ \cdots\ \pi_{o,L} \,]^T$ and $\mathrm{AUC}(\pi)$ represents the ROC AUC obtained with the MM-ML damage detection method using weights $\pi$. Due to the non-linear nature of the AUC optimization problem, constrained non-linear gradient based optimization methods or constrained non-linear search methods are required. Besides, potential local maxima make locating a global optimum more difficult. In this sense, global optimization methods, such as pattern search, genetic algorithms or particle swarm optimization methods, are recommended for an initial search on the parameter space. In order to initialize the optimization method, equal values for all the

weights and the prior probabilities are advisable. Once the optimal weights are computed, the optimal threshold can be obtained by picking the threshold value of the point in the ROC curve closest to the corner $(0,1)$. The ROC curves are computed between two structural states. When more than two structural states are considered, it is necessary to compute several ROC curves by defining one of the states as the negative class and the remaining ones as the positive class.

Alternatively, the performance may be measured solely based on the correct identification rate, namely the number of correctly identified damages over the total number of considered cases. In that case, the respective optimization problem is defined as follows (hereby presented for the case of the MM-KL damage identification method):

$$\{\pi^*\} = \arg\max_{\pi} \mathrm{CR}(\pi) \quad \text{s.t.} \quad \pi_{v,l} \geq 0,\ l = 1,\ldots,L,\ v \in \{a,b,c,\ldots\}, \qquad \sum_{l=1}^{L} \pi_{v,l} = 1,\ v \in \{a,b,c,\ldots\} \qquad (5.3.33)$$

where $\pi = [\, \pi_{a,1}\ \cdots\ \pi_{a,L}\ \ \pi_{b,1}\ \cdots\ \pi_{b,L}\ \cdots \,]^T$, and $\mathrm{CR}(\pi)$ denotes the correct identification rate of the damage identification method obtained with weights $\pi$. The same guidelines explained for the AUC optimization problem can be followed for the optimization of the correct identification rate.

5.4 Illustrative Case Study: Damage diagnosis in a simple active suspension model with mass uncertainty

5.4.1 The suspension model and problem description

The example presently studied considers a 3-DOF mechanical system consisting of three masses connected by springs/dampers, which simulates an active suspension system of a quarter of a truck. Time-dependent dynamics are introduced by periodically varying stiffness, while uncertainty in the dynamic response is introduced by the uncertain load of the truck. A diagram of the mechanism is presented in Figure 5.4.1(a).
The main objective of this example is to perform SHM on the suspension system based on its vibration response and to demonstrate the workings of the MM based damage diagnosis methods introduced in this work. The dynamics of the mechanism are governed by the differential equation

$$M \ddot{x}(t) + C \dot{x}(t) + K(t)\, x(t) = b\, r(t) \qquad (5.4.1)$$

where $t$ represents continuous time in seconds, $x(t) = [\, x_1(t)\ \ x_2(t)\ \ x_3(t) \,]^T$ is the mechanism's vibration response vector, and $r(t)$ is the stationary zero-mean white Gaussian excitation signal, both being defined on the continuous time $t \in \mathbb{R}$. Besides, the mass $M$, viscous damping $C$, and stiffness $K(t)$ matrices, and the excitation influence vector $b$ are defined as:

$$M = \begin{bmatrix} m_1 & 0 & 0 \\ 0 & m_2 & 0 \\ 0 & 0 & m_3 \end{bmatrix}, \qquad C = \begin{bmatrix} c_2 & -c_2 & 0 \\ -c_2 & c_2 + c_3 & -c_3 \\ 0 & -c_3 & c_3 \end{bmatrix}$$

$$K(t) = \begin{bmatrix} k_1 + k_2(t) & -k_2(t) & 0 \\ -k_2(t) & k_2(t) + k_3(t) & -k_3(t) \\ 0 & -k_3(t) & k_3(t) \end{bmatrix}, \qquad b = \begin{bmatrix} k_1 \\ 0 \\ 0 \end{bmatrix}$$

The two springs connecting masses 1-2 and 2-3 are characterized by a time-dependent stiffness that follows a sinusoidal trajectory, as defined in the expression:

$$k_i(t) = k_{i,0} + k_{i,1} \sin\frac{2\pi}{P} t + k_{i,2} \sin\frac{4\pi}{P} t, \qquad i \in \{2,3\} \qquad (5.4.2)$$

where $P$ represents the fundamental period of oscillation, in seconds, of the time variation. The remaining spring represents the contact of mass 1 with the soil and its stiffness is constant. The two dampers have constant damping coefficients as well.
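As a rough illustration of the periodically varying stiffness and of how it enters the stiffness matrix, the following Python sketch evaluates a two-harmonic sinusoidal trajectory and assembles $K(t)$ for a three-mass chain. It assumes the usual sign pattern of a chain of masses attached to the ground through the first spring; the function names and all numerical values used with it are hypothetical placeholders, not the parameters of the case study.

```python
import math


def stiffness(t, k0, k1, k2, period):
    """Constant term plus two sine harmonics with fundamental period `period`."""
    return (k0
            + k1 * math.sin(2.0 * math.pi * t / period)
            + k2 * math.sin(4.0 * math.pi * t / period))


def stiffness_matrix(t, k_soil, spring_12, spring_23, period):
    """Assemble the 3x3 stiffness matrix of the chain at time t.

    k_soil is the constant stiffness between mass 1 and the ground;
    spring_12 and spring_23 are (k0, k1, k2) coefficient triplets of the
    time-varying springs between masses 1-2 and 2-3.
    """
    k2_t = stiffness(t, *spring_12, period)
    k3_t = stiffness(t, *spring_23, period)
    return [
        [k_soil + k2_t, -k2_t, 0.0],
        [-k2_t, k2_t + k3_t, -k3_t],
        [0.0, -k3_t, k3_t],
    ]
```

Evaluating `stiffness_matrix` on a grid of time instants gives the "frozen" stiffness matrices from which frozen natural frequencies could be computed.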


Figure 5.4.1: A 3-DOF mechanism with time-varying dynamic response and uncertain properties: (a) The mechanism; (b) typical simulated responses (time domain); (c) typical simulated vibration response measured at the third rigid body ($x_3(t)$) (time-frequency representation via frozen TV-PSD).

The third mass represents the mass of the empty truck plus an uncertain load, which is random and follows a Gamma distribution. Thus, the third mass, including the random load, is also random and is defined as follows:

$$m_3 = \bar{m}_3 + \delta_{m3}, \qquad \delta_{m3} \sim \Gamma(\alpha,\beta) \qquad (5.4.3)$$

where $m_3$ is the uncertain mass of the truck, $\bar{m}_3$ is the (deterministic) mass of the empty truck, and $\delta_{m3}$ is the random load, which follows a gamma distribution with shape parameter $\alpha$ and rate parameter $\beta$ (Berger, 985, App. ). A single damage representing the stiffening of the suspension is simulated by increasing the stiffness of the spring and the damping coefficient of the damper connecting masses 1 and 2. The parameters characterizing the structure in its healthy and damaged states are summarized in Table 5.4.1.
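A draw of the uncertain mass can be sketched with the Python standard library as follows. Note that `random.gammavariate` is parameterized by shape and scale, so a rate parameter has to be inverted; the function name `sample_truck_mass` and the numerical values used in any call are hypothetical and do not reproduce the settings of the case study.

```python
import random


def sample_truck_mass(empty_mass, shape, rate, rng):
    """Return the empty-truck mass plus a Gamma(shape, rate) distributed load.

    random.gammavariate expects a scale parameter, hence scale = 1 / rate.
    """
    load = rng.gammavariate(shape, 1.0 / rate)
    return empty_mass + load
```

For instance, `sample_truck_mass(10.0, 2.0, 0.5, random.Random(0))` returns one realization whose load has expected value `shape / rate`.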

Table 5.4.1: Parameters of the mechanical system and of the numerical integration

Property | Symbol | Healthy state | Damaged state
Mass of rigid body 1 | $m_1$ | .5 kg | .5 kg
Mass of rigid body 2 | $m_2$ | .5 kg | .5 kg
Mass of rigid body 3 | $m_3 = \bar{m}_3 + \delta_{m3}$, $\delta_{m3} \sim \Gamma(\alpha,\beta)$ | Uncertain: $\bar{m}_3$ = kg, $\alpha$ = kg, $\beta$ = .5 kg | Uncertain: $\bar{m}_3$ = kg, $\alpha$ = kg, $\beta$ = .5 kg
Damping between rigid bodies 1 and 2 | $c_2$ | .5 N/(m/s) | .55 N/(m/s)
Damping between rigid bodies 2 and 3 | $c_3$ | .3 N/(m/s) | .3 N/(m/s)
Stiffness between rigid body 1 and excitation | $k_1$ | 3 N/m | 3 N/m
Stiffness between rigid bodies 1 and 2 | $k_2(t)$ | Time-varying: $k_{2,0}$ = N/m, $k_{2,1}$ = 6 N/m, $k_{2,2}$ = N/m | Time-varying: $k_{2,0}$ = N/m, $k_{2,1}$ = 6 N/m, $k_{2,2}$ = 5 N/m
Stiffness between rigid bodies 2 and 3 | $k_3(t)$ | Time-varying: $k_{3,0}$ = N/m, $k_{3,1}$ = 7 N/m, $k_{3,2}$ = 4 N/m | Time-varying: $k_{3,0}$ = N/m, $k_{3,1}$ = 7 N/m, $k_{3,2}$ = 4 N/m
Fundamental period of time-variation | $P$ | s | s
Numerical integration method | | Runge Kutta method |
Integration time and sampling rate | | Hz |
Sampling rate used for analysis | | 6 Hz |
Analysis time | | 5 samples (3.5 seconds), initial 5 samples removed |
Number of Monte Carlo runs | | per structural state, total |

5.4.2 The vibration response signals

Simulated vibration response signals are obtained by integrating the differential equation governing the system (Equation (5.4.1)) using the Runge-Kutta 4-5 method for a period of seconds and sampled initially at 64 Hz. Afterwards, the obtained signals are low-pass filtered and re-sampled at 6 Hz. The first 5 samples of the signal (56.5 seconds) are removed to avoid the effects of the initial conditions, thus yielding signals of 5 samples (3.5 seconds) length. A Monte Carlo test is performed with runs per structural state, making a total of realizations. Each realization is computed from independent draws of the excitation signal and the truck load.
Figure 5.4.1(b) shows a typical response of the suspension system in its healthy state measured at each one of the rigid bodies, while Figure 5.4.1(c) shows the frozen Time-Varying Power Spectral Density (TV-PSD) of a typical realization of the vibration response measured at the third rigid body (i.e. the vibration response $x_3(t)$), derived directly from the system equations. The time-dependency of the dynamics is evident in the frozen TV-PSD. Figure 5.4.2 shows the frozen natural frequencies and damping ratios of the system obtained from the Monte Carlo runs in the healthy and damaged states of the system, as seen in the vibration response measured at the third rigid body. The variability of the modes across realizations is evident, demonstrating the effects of the uncertain load on the system's dynamics.

FS-TAR model based dynamic analysis

Further analysis is based on the vibration response measured at the third rigid body. The vibration response signals in the healthy and damaged states are represented via FS-TAR models (equivalent to LPV-AR models with $\beta[t] = t$) using a sinusoidal basis with period equal to that of the mechanism. Specifically, the FS-TAR model uses a functional basis for the representation of the parameter trajectories and innovations variance of the form

$$G_{b_a(1)}[t] = 1, \qquad G_{b_a(2k)}[t] = \sin k\omega_o t, \qquad G_{b_a(2k+1)}[t] = \cos k\omega_o t, \qquad \omega_o = \frac{2\pi T_s}{P} \qquad (5.4.4)$$

where $t$ designates the normalized discrete time $t = 1, 2, \ldots, N$, $k = 1, \ldots, (p_a - 1)/2$, and $T_s = 1/f_s$ is the sampling period.

Figure 5.4.2: Realizations of the frozen time-varying natural frequencies (left) and damping ratios (right) of the vibration response at the third rigid body of the theoretical system under variable load, obtained from Monte Carlo runs in the healthy and damaged states of the structure.

The estimation of the parameters of the AR part and the innovations variance is made using the Multi-Stage Maximum-Likelihood (MS-ML) method (Spiridonakos and Fassois, 4b; Poulimenos and Fassois, 9b). The MS-ML method is iterated until convergence is met, or a maximum number of iterations is reached. The criteria to evaluate the convergence are:

$$\Delta\theta = \| \theta_k - \theta_{k-1} \|, \qquad \Delta L = | p(y \mid \theta_k) - p(y \mid \theta_{k-1}) |$$

where $\theta_k$ and $\theta_{k-1}$ are the parameter vector estimates at the current and previous iteration of the MS-ML method. Then, convergence is considered to be met when either $\Delta\theta$ or $\Delta L$ is lower than the tolerance value 8.

The identification of the model and basis dimensionality is made in terms of the empirical risk, as explained previously. In the evaluation of the empirical risk, the initial 4 samples of each signal are used to compute the ML parameter estimates (Equation (5.3.7a)), while the last samples are used to compute the empirical risk (Equation (5.3.3)). Moreover, BIC, RSS/SSS and log-likelihood are computed as well. An exhaustive search approach is used for model identification, where FS-TAR models with $n_a$ = [5 6 5] and $p_a$ = [5 7 5] are estimated for the complete set of vibration response signals from the healthy state.
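The sinusoidal functional basis used by the FS-TAR models (a constant term followed by sine/cosine pairs at multiples of the fundamental frequency of the time variation) can be sketched as below. The function name `fourier_basis`, the requirement that the basis dimensionality be odd, and the numerical values used in any call are assumptions made for this illustration.

```python
import math


def fourier_basis(n_samples, p_a, sampling_period, variation_period):
    """Evaluate p_a basis functions at the discrete times t = 1, ..., n_samples.

    Column 0 is the constant function; columns 2k-1 and 2k hold
    sin(k * omega_o * t) and cos(k * omega_o * t) for k = 1, ..., (p_a - 1) / 2,
    with omega_o = 2 * pi * sampling_period / variation_period.
    """
    if p_a % 2 == 0:
        raise ValueError("p_a must be odd: one constant plus sine/cosine pairs")
    omega_o = 2.0 * math.pi * sampling_period / variation_period
    basis = []
    for t in range(1, n_samples + 1):
        row = [1.0]
        for k in range(1, (p_a - 1) // 2 + 1):
            row.append(math.sin(k * omega_o * t))
            row.append(math.cos(k * omega_o * t))
        basis.append(row)
    return basis
```

Each time-dependent AR parameter is then a linear combination of the columns of this matrix, weighted by its coefficients of projection.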
The settings used are detailed in Table 5.4.2.

Table 5.4.2: Settings used for the identification of the FS-TAR models

Property | Value
Signal period for training | Initial 4 samples
Identification method | Multi Stage Maximum Likelihood (MS-ML) method
Iteration limit |
Tolerance values | $\Delta\theta = \Delta L$ = 8
Signal period for validation | Final samples
Evaluated model structures | FS-TAR models with $n_a$ = {5, 6,..., 5} and $p_a$ = {5, 7,..., 5}

Figure 5.4.3 provides the median and 9% confidence intervals of the empirical risk and BIC curves obtained

with the procedure explained before. Figure 5.4.3(a) and (c) show the curves obtained for the empirical risk and BIC for all the structural parameters tested. The empirical risk consistently shows the point $n_a$ = as optimal, while the BIC shows minimum values at different points depending on the basis dimensionality, though the value $n_a$ = is always a minimum or very close to the minimum in the range of values analyzed. Figure 5.4.3(b) and (d) show the empirical risk and BIC for the critical point $n_a$ = and $p_a$ = [5 7 5]. While the BIC demonstrates a steady increment as the basis dimensionality is increased and a clear minimum value at $p_a$ = 7, the empirical risk shows almost no change. However, the initial basis order $p_a$ = 5 is characterized by lower values compared to the final basis order $p_a$ = 5. Therefore, according to the empirical risk and BIC curves and the analysis of the obtained projection coefficients on different orders, it is concluded that the bases with indices 5 and higher can be eliminated from the model, thus leading to an FS-TAR() [7,7] model that includes the constant basis and the first three sine/cosine components.

Figure 5.4.3: Selection of the FS-TAR model order and basis order on realizations of the vibration response in the healthy state: (a)-(c) Median and 9% confidence intervals of the empirical risk and BIC obtained for FS-TAR models with $n_a$ = [5 6 5] and $p_a$ = [5 7 5]. (b)-(d) Median and 9% confidence intervals of the empirical risk and BIC evaluated for FS-TAR() models with $p_a$ = [5 7 5].

Figure 5.4.4(a) shows the frozen TV-PSD derived from the estimated FS-TAR() [7,7] model of a single vibration response from the healthy state of the structure.
Figure 5.4.4(b) shows the frozen natural frequencies obtained from the FS-TAR() [7,7] models of vibration signals from the healthy and damaged states. The obtained frozen TV-PSD resembles the one obtained from the actual system shown in Figure 5.4.1(c). Likewise, the frozen natural frequencies also tend to follow those observed from the actual system shown in Figure 5.4.2, though the presence of spurious modes and deviations in the modal frequencies are evident. More importantly, the degree of overlap between the modal quantities in the healthy and damaged states of the structure is clear.

FS-TAR model based Multiple Model representation construction

MMs for the healthy and damaged structural states are constructed based on FS-TAR() [7,7] models obtained for each one of the signals of the Monte Carlo simulation. The optimization of the weights, thresholds and the MM


dimensionality is guided by the performance achieved on damage diagnosis, as shall be shown in the next section.

Figure 5.4.4: FS-TAR() [7,7] model based analysis: (a) Frozen TV-PSD obtained from the FS-TAR() [7,7] model of a single vibration response signal from the healthy state. (b) Frozen natural frequencies obtained from FS-TAR() [7,7] models of vibration responses from Monte Carlo runs on healthy and damaged states.

Damage detection results

Given the identified MMs using FS-TAR() [7,7] models, detection of damage is subsequently attempted by means of the MM-based damage diagnosis methodologies introduced before, namely the MM-ML and the MM-KL methods. The two variants of the MM-ML method are considered, namely the one using the maximum approximation of the marginal likelihood as shown in Equation (5.3.5) (MM-ML-max) and the other using the finite sum approximation of the marginal likelihood as shown in Equation (5.3.3) (MM-ML-sum). Moreover, two variants of the damage detection problem are assessed:
(i) Conventional damage detection, where an MM is constructed only for the healthy state of the structure while damage diagnosis is performed using the damage detection tests in Table 5.3. and Table
(ii) Modified damage detection, where MMs are built for the healthy and for the available damage type while damage diagnosis is performed using the damage identification tests in Table 5.3. and Table

The performance of the damage diagnosis methods is measured in terms of the AUC, which is evaluated within a fold cross validation. In the fold cross-validation, the whole data set consisting of signals from the healthy state and signals from the damaged state is split into folds (subsets), where each fold contains realizations from both structural states ( from the healthy and from the damaged class, see also Table 5.4.3).
The procedure then follows as described in Section 5.3. In the conventional damage detection, a total of 9 realizations only from the healthy state are used to construct a respective healthy MM $\mathcal{M}_o$, while in the modified damage detection the available realizations from the healthy and damaged states are used to construct respective MMs for the healthy and damaged states of the structure, $\mathcal{M}_o$ and $\mathcal{M}_d$ respectively. The training and validation process is repeated until all the realizations from the folds are used as validation data.

Table 5.4.3: Settings of the cross validation used for the validation of the damage diagnosis methods

Property | Value
Cross validation approach | fold cross-validation
Number of folds | per structural state
Number of realizations per fold | per structural state
Number of realizations used to construct each class MM (training) | 9 per structural state
Number of realizations used for performance evaluation (validation) | per structural state

Results on the conventional damage detection problem

Figure 5.4.5 illustrates the obtained results on the conventional damage detection problem using the MM-ML-max method before and after optimizing the weights associated with the MM of the healthy state. Before optimizing, the weights $\pi_{o,l}$ are assumed all equal to /9, with $L$ = 9. Figure 5.4.5(a) displays the distribution of the estimated marginal log-likelihoods per class for the complete set of validation realizations via boxplots. The corresponding ROC curve in Figure 5.4.5(c) evidences a good detection performance of the method, characterized by an AUC =.9, although some overlap is evident between the distributions of the healthy and damaged structural states.

Afterwards, the weights $\pi_{o,l}$ are optimized by maximizing the AUC using a combined genetic algorithm (ga of MATLAB (Goldberg, 989, Ch. )) and a constrained iterative non-linear optimization method that uses the interior-point method (fmincon of MATLAB, Byrd et al. (999)). The genetic algorithm is used to find feasible values for the weights, while the iterative non-linear optimization algorithm is used to refine those feasible values. Constrained optimization is required in order to satisfy the constraints on the weights stated in Equation (5.3.3). The optimization algorithm is initialized using equal weights and iterated until the tolerance value of 8 in the objective function or the weight estimates is achieved. The results of the optimization are shown in Figure 5.4.5(b) and (c), where it is seen that the changes in the estimated marginal log-likelihoods and the ROC are minimal for the present case.

Figure 5.4.5: Results on the conventional damage detection problem using the MM-ML-max method after cross-validation: (a) Boxplots displaying the distribution of the computed marginal likelihoods at healthy and damaged states using equal weights. (b) Boxplots displaying the distribution of the computed marginal likelihoods at healthy and damaged states using optimized weights. (c) Corresponding ROC curves obtained for both cases. Optimal threshold value in each case indicated with dashed lines.

Figure 5.4.6(a) shows the ROC curves obtained with the MM-ML-max, MM-ML-sum and MM-KL methods on the conventional damage detection problem after optimization of the weights associated with the MM representation of the healthy state. The results demonstrate that there is not a large difference in the performance obtained with both versions of the MM-ML method, while the MM-KL method provides poor performance on the conventional damage detection problem.
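For readers who wish to reproduce this kind of evaluation, the following Python sketch computes an empirical ROC curve, its AUC by the trapezoidal rule, and the operating point closest to the corner (0,1). It is a generic illustration and not the code used in this work: it assumes a scalar damage score for which larger values indicate damage (for example a minimum K-L divergence, or a negated marginal log-likelihood), and the function names are hypothetical.

```python
def roc_curve(negative_scores, positive_scores):
    """Return (FPR, TPR, threshold) points, declaring a positive when score >= threshold."""
    thresholds = sorted(set(negative_scores) | set(positive_scores), reverse=True)
    points = [(0.0, 0.0, float("inf"))]
    for threshold in thresholds:
        tpr = sum(s >= threshold for s in positive_scores) / len(positive_scores)
        fpr = sum(s >= threshold for s in negative_scores) / len(negative_scores)
        points.append((fpr, tpr, threshold))
    return points


def area_under_curve(points):
    """Trapezoidal integration of TPR over FPR."""
    area = 0.0
    for (fpr_0, tpr_0, _), (fpr_1, tpr_1, _) in zip(points, points[1:]):
        area += (fpr_1 - fpr_0) * (tpr_0 + tpr_1) / 2.0
    return area


def closest_to_corner(points):
    """Operating point with the smallest Euclidean distance to (FPR, TPR) = (0, 1)."""
    return min(points, key=lambda p: p[0] ** 2 + (1.0 - p[1]) ** 2)
```

Here the healthy realizations play the role of the negative class and the damaged realizations that of the positive class; the threshold of the point returned by `closest_to_corner` corresponds to the "best threshold" criterion described in the text.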

Figure 5.4.6: ROC curves obtained with the MM-ML-sum, MM-ML-max and MM-KL methods on (a) the conventional damage detection problem; (b) the modified damage detection problem. All the ROC curves are obtained after optimization of the weights associated with the respective MMs.

Results on the modified damage detection problem

The modified damage detection problem is tackled this time with the MM-ML-max, MM-ML-sum and MM-KL damage identification methods, where one class corresponds to the healthy state and the other class corresponds to the damaged state. The weights are optimized in the same way as in the previous case. However, in this case it is necessary to optimize the weights corresponding to models from both the healthy and damaged classes. Therefore, the weights for the healthy and damaged states $\pi_{v,l}$ (or $\lambda_{v,l}$ in the MM-KL method) for all $l = 1,\ldots,9$ and $v \in \{o,d\}$ are optimized following the combined genetic algorithm and interior point optimization methods described previously. A summary of the obtained ROC curves after the optimization process is presented in Figure 5.4.6(b). In contrast to the previous case, the MM-KL method now surpasses the performance of both versions of the MM-ML method and largely improves on the performance obtained in the conventional damage detection problem.
Again, both versions of the MM-ML method provide very similar performance.

Analysis of the dimensionality of the MM representation on the MM-KL method

Further analysis of the MM-KL method on the modified damage detection problem is sought by means of the optimization of the dimensionality of the MM representation, following the methodology discussed previously. More specifically, the k-means clustering method (MATLAB's kmeans method) is used to find clusters grouping models with similar features based on the K-L divergence measure. Then, the dimensionality of the MM representation corresponds to the number of clusters. Figure 5.4.7 shows the obtained performance of the MM-KL method for increasing MM dimensionality (number of clusters in the k-means method) on the range [1, 9], where 9 would be the total number of training models. The results obtained from this procedure indicate two important details: (i) The performance is lowest when only a single FS-TAR model is used for detection of damage, thus demonstrating the insufficiency of a single model for damage diagnosis under uncertainty. (ii) The performance increases as the number of models in the MM representation increases. The increase is very sharp in the beginning and stabilizes after reaching a certain dimensionality, which for the present case is about 4 models.
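The decision rule underlying the MM-KL method, in which a test signal is assigned to the structural state whose model set attains the smallest weighted divergence, can be sketched as follows. The sketch assumes that the K-L divergences between the test model and every model in each MM have already been computed; the function name `identify_state`, the dictionary layout and the numbers in the usage example are hypothetical.

```python
def identify_state(divergences, weights):
    """Pick the state with the smallest weighted minimum divergence.

    divergences[v] lists the K-L divergences between the test model and each
    model of state v; weights[v] lists the matching decision weights.
    Returns the selected state label and the per-state scores.
    """
    scores = {
        state: min(w * d for w, d in zip(weights[state], divergences[state]))
        for state in divergences
    }
    best_state = min(scores, key=scores.get)
    return best_state, scores
```

For example, with `divergences = {"o": [3.0, 1.5], "d": [2.0, 4.0]}` and all weights equal to 0.5 the healthy state `"o"` is selected; lowering the weights of state `"d"` biases the decision towards that class, which is the role the weights play in the optimization.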

Figure 5.4.7: Optimization of the dimensionality of the MM representation on the MM-KL method on the modified damage detection problem: (a) ROC curves and (b) respective AUC obtained for increasing MM dimensionality on the MM-KL method. Vertical red line indicates the dimensionality where the AUC stabilizes.

Comparison with the Random Coefficient (RC-FS-TAR) based methods

In order to contrast the results of the MM based damage diagnosis methods, damage diagnosis is also attempted with the RC-FS-TAR model based method described in Avendaño-Valencia and Fassois (5d), in which the coefficients of projection of the RC-FS-TAR model are represented via a Gaussian distribution. The same fold cross validation method is used to estimate the performance of the method. Then, the mean and covariance matrix are obtained by computing averages of the parameter vectors of the models obtained from 9 realizations at each structural state. Due to the small sample size (9 models) compared to the number of elements in the covariance matrix (96 elements), only the diagonal elements are estimated.

Discussion

Table 5.4.4 presents a summary of the results obtained with the evaluated methodologies, including the RC-FS-TAR model based method. The best performing method on the conventional damage detection problem is the MM-ML-max method, although similar performance is achieved with the MM-ML-sum method. On the other hand, the MM-KL method provides the lowest performance. Similarly, the best performing method on the modified damage detection problem is the MM-ML-max method, with the MM-ML-sum method yielding close performance again, while the MM-KL method also provides very good performance, with outcomes very similar to those obtained with the MM-ML methods.
The performance of the MM-ML methods does not change very much from the conventional damage detection problem to the modified one. This may be explained by the fact that the MM-ML method already yields an appropriate description of the distribution of the parameters in the healthy state of the structure, while the extra modeling of the damaged state provides a further improvement in performance. In contrast, in the MM-KL method it appears that the representation of the healthy case alone is not enough to yield a proper description of the vibration response of the structure. However, the performance rises sharply after also representing the damaged state of the structure. The results of the analysis of the dimensionality of the MM representation on the MM-KL method demonstrate that the best performance of the method can be obtained with a limited (relatively small) number of models. Similar results can be obtained with the MM-ML methods. Therefore, very good SHM performance can be obtained from compact MM based methods. The RC-FS-TAR model based method provides more or less the same performance on the two analyzed damage detection problems, which turns out to be much lower than the one obtained with the MM-ML method.

Table 5.4.4: Summary of damage detection results. TNR: True Negative Rate, TPR: True Positive Rate. Best TNR, TPR and threshold are the values corresponding to the point in the ROC closest to the corner (0,1).

Problem | Method | AUC | Best TNR | Best TPR | Best threshold
Conventional damage detection | MM-ML-max | | | |
Conventional damage detection | MM-ML-sum | | | |
Conventional damage detection | MM-KL | | | |
Conventional damage detection | RC FS-TAR model* | | | |
Modified damage detection | MM-ML-max | | | |
Modified damage detection | MM-ML-sum | | | |
Modified damage detection | MM-KL | | | |
Modified damage detection | RC FS-TAR model* | | | |
* Results obtained following the methodology in Avendaño-Valencia and Fassois (5d).

Nonetheless, the performance of this method may rise with the estimation of a full covariance matrix (here only the diagonal elements are estimated due to the low number of samples) or by introducing a more complex distribution model. The evaluation of the damage diagnosis methods presently shown does not attempt to be thorough, but instead aims to illustrate the workings of the methods and their overall behavior. The reader is invited to consult the companion paper (Avendaño-Valencia and Fassois, 5c) for a complete analysis of the methods and a comparison with similar SHM methodologies.

5.5 Concluding Remarks

This work has been devoted to the problem of single vibration response-only SHM for structures with time-dependent dynamics under operational and environmental uncertainty. The methods discussed in this work utilize a framework that includes two entities: (i) non-stationary parametric time-dependent models (either of the FS-TARMA or LPV-ARMA types) for representing the time-dependent vibration response dynamics, and (ii) a Multiple Model representation of an individual health state of the structure under uncertainty. The main theoretical aspects of this framework, including the construction of the MM representation, the definition of corresponding damage diagnosis tests based on the marginal likelihood and Kullback-Leibler divergences associated with the MMs, and the optimization of the free parameters of the methods, have been analyzed and discussed.
Moreover, a case study has also been presented, concerning the detection of damage in a suspension system with time-dependent dynamics, in which the random load of the vehicle serves as the variable inducing uncertainty in the system dynamics. The most important conclusions of this work are summarized next:

(i) TARMA and LPV-ARMA models are very powerful modeling tools for non-stationary vibration. Nonetheless, these models alone are insufficient for the proper representation of the variable dynamics introduced by uncertainties in real-life applications. In this sense, MM representations provide a very simple and yet powerful solution for the description of uncertainties.

(ii) A vital aspect of MM representations is their dimensionality, namely the number of models in the representation. A very large number of models is undesirable, since it leads to a very large representation and thus to increased computational resources and effort. Moreover, in the practical application of the method, new models may be progressively introduced into the MM as new data becomes available, in order to improve the representation accuracy. For such purposes, the application of a model selection methodology, such as the one discussed in this work, is of great value for the achievement of compact and effective MM representations.

(iii) The definition of the MM representation based on its interpretation as a mixture approximation of a random coefficient model facilitated the accurate definition of the damage diagnosis tests. Such a framework may serve as a basis for even more powerful and efficient SHM methods.

(iv) The MM framework can be easily extended to other cases where the handling of uncertainties is of importance, just by changing the type of time-series model. Thus, other SHM problems, including stationary vibration response, single or multiple vibration measurements, and/or input-output or output-only vibration response, can also be tackled by means of the framework discussed in this work.

An exhaustive assessment of the MM framework is provided in the companion work (Avendaño-Valencia and Fassois, 5c), where the MM damage diagnosis methods are evaluated in the SHM of an operating wind turbine. The complete set-up of the modeling and damage diagnosis methods is presented, while comparisons with similar state-of-the-art methods are also shown.

Appendix 5.A Model Validation

The estimation residuals of the obtained FS-TAR() [7,7] models are analyzed to validate Gaussianity and uncorrelatedness. Gaussianity is evaluated by means of histogram and normal probability plots, while uncorrelatedness is tested by analyzing the Auto-Correlation Function (ACF) (Poulimenos and Fassois, 6). Figure 5.A.1 presents the three plots, namely the histogram, the normal probability plot and the normalized ACF plot, computed for the residuals of all the models of the complete vibration responses of the healthy state. Both the histogram and the normal probability plot demonstrate that the residuals of the FS-TAR models follow a Gaussian distribution. However, the ACF plot shows that the residuals tend to be correlated, at least for the initial lags.
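The uncorrelatedness check described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the procedure actually used in this work: the residual sequences are synthetic, and the function names as well as the approximate 95% band of plus or minus 1.96/sqrt(N) are choices introduced here for the example.

```python
import math
import random


def normalized_acf(residuals, max_lag):
    """Sample autocorrelation of a residual sequence for lags 0..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    c0 = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


def lags_outside_bounds(acf, n, z=1.96):
    """Lags (>= 1) whose ACF value lies outside the approximate band +/- z/sqrt(n)."""
    bound = z / math.sqrt(n)
    return [k for k, r in enumerate(acf) if k >= 1 and abs(r) > bound]


# Hypothetical residuals: a white sequence and a first-order correlated one.
rng = random.Random(0)
white = [rng.gauss(0.0, 1.0) for _ in range(2000)]
correlated = [white[0]]
for e in white[1:]:
    correlated.append(0.6 * correlated[-1] + e)

acf_white = normalized_acf(white, 20)
acf_corr = normalized_acf(correlated, 20)
suspect_lags = lags_outside_bounds(acf_corr, len(correlated))
```

Residuals that remain correlated at the initial lags, as reported for the FS-TAR models here, would show up in `suspect_lags` as low lag indices.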
Appendix 5.B Random Coefficient and MM representations: Illustrative example

To better illustrate the concepts of modeling via RC models and MM representations explained in Section 5.3, consider the linear model $y = \psi\theta + w$ with $w \sim \mathrm{NID}(0, \sigma_w^2)$, deterministic $\psi \in \mathbb{R}$, and a scalar random parameter $\theta$ that follows the prior distribution $p(\theta|v)$ shown in Figure 5.B.1(a). The likelihood associated with this simple RC model is $p(y|\theta,v) \propto \exp\left(-(y-\psi\theta)^2/(2\sigma_w^2)\right)$, while the joint PDF is obtained from Equation (5.3.) and is illustrated in Figure 5.B.1(b).

The points $\theta_k$ are sampled from the class parameter prior $p(\theta|v)$ shown in Figure 5.B.1(a). The obtained parameters become part of an MM defined as $\mathcal{M}_v = \{m_{v,k}\}$, with corresponding parameters $\{\theta_k\}$. The MM prior $p(\theta|v)$ in this case is constructed using rectangular kernel functions localized around each one of the sampling points $\theta_k$. Specifically, the whole interval $\Theta = [\theta_{ini}, \theta_{end}]$, with $\theta_{ini}$ being the initial point and $\theta_{end}$ the ending point, is divided into the segments:

$$ \text{Segment } k:\ \left[\theta_k - \delta_k^-,\ \theta_k + \delta_k^+\right], \quad k = 1, \ldots, K $$
$$ \delta_k^- = \tfrac{1}{2}(\theta_k - \theta_{k-1}); \qquad \delta_k^+ = \tfrac{1}{2}(\theta_{k+1} - \theta_k) $$

where $\theta_0 = \theta_{ini}$ and $\theta_{K+1} = \theta_{end}$. Thus, for each kernel, the scale parameter is $\delta_k = \delta_k^- + \delta_k^+$ and the corresponding weight satisfies $\pi_k \propto 1/\delta_k$. The resulting MM approximation of the parameter prior for class $v$ is shown in Figure 5.B.1(c). Finally, Figure 5.B.1(d) shows the joint PDF obtained with the MM parameter prior and Equation (5.3.a).

Figure 5.B.2 provides a comparison of the exact computation of the marginal likelihood with the values obtained with the finite sum and maximum approximations based on the MM. For the values used in the example, both approximations are close to the actual value of the marginal likelihood and coincide in the location of the maximum.
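The comparison between the exact marginal likelihood and its MM-based approximations can be reproduced in spirit with the short sketch below. Everything specific in it is an assumption made for illustration: the triangular prior, the values of $\psi$ and $\sigma_w$, the evenly spaced sampling points, the midpoint partition of the interval, and the choice of taking each weight as the prior probability mass of its segment. It is not the numerical set-up behind Figure 5.B.2.

```python
import math


def gaussian_likelihood(y, theta, psi=1.0, sigma_w=0.5):
    """p(y | theta) for the scalar model y = psi * theta + w, w ~ N(0, sigma_w^2)."""
    z = (y - psi * theta) / sigma_w
    return math.exp(-0.5 * z * z) / (sigma_w * math.sqrt(2.0 * math.pi))


def prior(theta):
    """Hypothetical prior p(theta | v): triangular density on [0, 2] peaking at 1."""
    return max(0.0, 1.0 - abs(theta - 1.0))


def segment_edges(points, theta_ini, theta_end):
    """Midpoint partition of [theta_ini, theta_end] around the sampled points."""
    mids = [0.5 * (a + b) for a, b in zip(points[:-1], points[1:])]
    return [theta_ini] + mids + [theta_end]


def integrate(f, a, b, steps=2000):
    """Composite midpoint rule, used here as the reference ('exact') integral."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


theta_ini, theta_end = 0.0, 2.0
theta_k = [0.25 + 0.25 * k for k in range(7)]  # sampled parameter values
edges = segment_edges(theta_k, theta_ini, theta_end)
# Assumption of this sketch: weight of model k = prior mass of its segment.
weights = [integrate(prior, lo, hi, 200) for lo, hi in zip(edges[:-1], edges[1:])]


def marginal_exact(y):
    """Reference marginal likelihood: integral of p(y | theta) p(theta | v)."""
    return integrate(lambda th: gaussian_likelihood(y, th) * prior(th),
                     theta_ini, theta_end)


def marginal_sum(y):
    """Finite sum approximation over the models of the MM."""
    return sum(w * gaussian_likelihood(y, th) for w, th in zip(weights, theta_k))


def marginal_max(y):
    """Maximum approximation: largest weighted likelihood among the models."""
    return max(w * gaussian_likelihood(y, th) for w, th in zip(weights, theta_k))
```

Because every weighted term is non-negative, the maximum approximation can never exceed the finite sum; with these particular weights it therefore underestimates the marginal likelihood in value, even when it peaks at the same location.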

Figure 5.A.1: Validation of the Gaussianity and uncorrelatedness of the estimation residuals of the obtained FS-TAR()[5,5] models of the vibration response signals of the healthy state: (a) histogram and Gaussian distribution fit (frequency versus residuals); (b) normal probability plot (probability versus residuals); (c) normalized Auto-Correlation Function (ACF) versus lag.

Figure 5.B.1: A simple RC linear model with random scalar parameter $\theta$: (a) The parameter prior is sampled at different points $\theta_k$. (b) The associated joint PDF of the RC model obtained using Equation (5.3.). Surface and contours indicate the values of the joint PDF, while the blue lines show the values sampled on $\theta$. (c) The MM approximation of the parameter prior using a rectangular kernel located at each one of the sampling points $\theta_k$. (d) The MM joint PDF obtained with Equation (5.3.a) using the obtained MM prior. In this case, the surface indicates the MM approximation of the joint PDF, while the contours indicate the actual joint PDF.

Figure 5.B.2: The marginal likelihood of the observations $p(y|v)$ with $y \in \mathbb{R}$ (scalar), computed from the joint PDF of the simple RC model previously analyzed in Figure 5.B.1. Curves: exact computation, finite sum approximation and maximum approximation, plotted against $y$. The value of the marginal probability is approximated via finite sums and the maximum based on the MM representation. Gray lines indicate the likelihoods $p(y|\theta_{v,k})$ of the models of the MM.


Chapter 6

Damage and Fault Diagnosis in an Operating Wind Turbine Under Uncertainty via Vibration Response Multiple Model Based Methods

This work focuses on the problem of vibration-based health monitoring of operating wind turbine structures, which feature time-dependent dynamics and important environmental and operational uncertainty. For this purpose, the MM framework for vibration-based SHM postulated in the companion paper is appraised, with its evaluation based on simulated single vibration response signals of an NREL offshore 5 MW wind turbine obtained with the FAST aeroelastic simulation tool. In the considered experiment, non-stationary dynamics mainly originate from the continually evolving inertial configurations of the structure as the wind turbine blades rotate, while uncertainty is introduced by random variations of the wind speed in the range from to m/s. Monte Carlo simulations are performed on six structural states, including the healthy state and five types of damage in the tower, blades and transmission, each with four levels of damage. The complete set-up of the modeling and damage diagnosis methods is presented, and comparisons with other state-of-the-art methods are provided as well. The results demonstrate consistently good performance of the MM-based methods, which vastly improve upon the performance achieved by other methods, and are thus indicative of their applicability and effectiveness. Moreover, by means of the postulated MM methodology, it is possible to detect most damage types and to specify the level of damage with a single sensor at either the tower or the blades.

6.1 Introduction

Vibration-based output-only Structural Health Monitoring (SHM) aims at damage diagnosis for structures based on measured vibration response signals (Farrar and Worden, 7; Fassois and Sakellariou, 9). The application of vibration-based SHM methods is of particular importance for wind turbines, since these are costly, remotely located structures for which maintenance and repair costs are high (Ciang et al., 8; Hameed et al., 9). In this sense, SHM is of primary importance to reduce maintenance and repair costs, and to avoid structural damage, which may itself lead to catastrophic harm to the integrity of the wind turbine. Nonetheless, the design of a vibration-based SHM system for operating wind turbines poses particular challenges, which are enumerated next: (i) In-operation analysis: Uninterrupted operation of wind turbine facilities is necessary to maximize power production and economic revenue. Therefore, it is most desirable to perform SHM during normal operation. Consequently, the statistical time series model used for SHM must be capable of describing the dynamics characterizing an operating wind turbine, which feature cyclo-stationary and, in a broader sense, non-stationary behavior (Allen et al., b; Avendano-Valencia and Fassois, 3b; Hansen et al., 6b). Moreover, since the actual exciting forces are unmeasurable, the considered time series methods are output-only; in other words, the models are constructed only on the basis of the vibration response of the structure. (ii) Changing environmental and operational conditions: Wind turbines operate in a constantly changing environment determined by the varying winds and weather conditions. Besides, these structures can be set to operate at different regimes in response to the current power demand and the environmental conditions.
A consequence of this is that the stochastic characteristics of the vibration response of the wind turbine can change completely as the environmental and operational conditions change (Hansen et al., 6b). Thus, the SHM system must be capable of recognizing whether those changes are due to damage or due to uncertainty, in order to provide a robust diagnosis of the condition of the wind turbine structure (Farrar and Worden, 7; Sohn, 7). (iii) Data-based models: Analytical models of the wind turbine dynamics derived from the physical properties of the wind turbine and its structural dynamics, which at the same time satisfy the features listed above, would become too complex for most practical applications. Instead, data-based models derived from the vibration response at specific locations in the structure are preferable, since these can lead to simpler, more practical and yet highly effective SHM methods (Fassois and Sakellariou, 9; Avendaño-Valencia and Fassois, 4a).

Most of the currently available vibration-based SHM methodologies utilize characteristic quantities derived from frequency-domain representations or modal properties of the structure. However, the extraction of frequency-domain and modal characteristics from the vibration response of operational wind turbines requires specialized techniques to cope with the non-stationary characteristics of the vibration response. For that purpose, specialized Operational Modal Analysis (OMA) techniques for cyclo-stationary vibration response may be used (Allen et al., b; Antoni, 7; Carne and James, ; Ozbek et al., 3). Two main assumptions of these methods are that the vibration response is cyclo-stationary and that the exciting forces are (at least approximately) white Gaussian. Although wind excitation is in general represented as a colored Gaussian process (Hansen et al., 6b), some methods are available for the case of colored excitation (Chauhan et al., 9; Reynders and de Roeck, ; Reynders, ).
Nonetheless, the assumption of cyclo-stationarity may be more limiting, since wind turbines typically present variations in the rotor speed in response to power demand and wind speed (Hansen et al., 5; Muljadi and Butterfield, 999).

More robust diagnosis of damage on operating wind turbines may be achieved by completely accounting for the non-stationary characteristics of the vibration response. In this sense, SHM can be performed based on time-frequency or time-scale distributions of the non-parametric (Peng and Chu, 4; Staszewski and Robertson, 7; Henríquez et al., 4) or parametric (Conforto and D'Alessio, 999; Wang et al., 8; Feng et al., 3) type. More recently, empirical mode decomposition, the Hilbert-Huang transform, and multi-resolution analysis via wavelet decomposition have been introduced for the representation of non-stationary signals in the modal domain (Yang et al., ; Amirat et al., 3; Worden et al., 4). However, the main problem in applying non-parametric TF methods to vibration-based SHM is the large quantity of data to be analyzed and processed. Pattern recognition methods that include a feature extraction and selection stage can be used to cope with this problem (Peng and Chu, 4; Staszewski and Robertson, 7), although this involves increased processing time during both the baseline and inspection phases of the SHM method.

In contrast to the methods based on non-parametric TF and TS representations, parametric representations, of which Time-dependent ARMA (TARMA) and Linear Parameter Varying ARMA (LPV-ARMA) models are the most recognized, are appealing for vibration-based SHM applications, since they are capable of summarizing the dynamical characteristics of the vibration signal in a compact and yet accurate form (Poulimenos and Fassois, 6; Spiridonakos and Fassois, 4b). Previous studies of the authors have confirmed the usefulness of TARMA modeling for the analysis of the vibration response of operating wind turbines (Avendano-Valencia and Fassois, 3b). Besides, very efficient damage diagnosis techniques already exist for the case of FS-TARMA and FS-TARX models (Avendaño-Valencia and Fassois, 4a; Poulimenos and Fassois, 4; Spiridonakos and Fassois, 3). Nonetheless, these methods are not well suited to the presence of significant levels of uncertainty, since uncertainty is not expressly embedded within the model. In that case, Random Coefficient (RC) probabilistic or similar models, potentially along with Multiple Model (MM) type representations, may be used to effectively describe the effects of uncertainties (Gadsden et al., 3; Zhao et al., 5; Avendaño-Valencia and Fassois, 5b). An alternative, but more complex, approach consists of explicitly including the effects of uncertainties in the model (Hios and Fassois, 4).

In a recent paper of the authors, the problem of vibration-based Structural Health Monitoring (SHM) for structures with time-dependent dynamics under important environmental and operational uncertainty has been addressed (Avendaño-Valencia and Fassois, 5b).
The postulated damage diagnosis methods are based on a Multiple Model (MM) representation of the vibration response of the structure at specific health states. The MM representation is a collection of proper models that can actually be interpreted as a type of mixture approximation of a Random Coefficient (RC) model. The MM representation is characterized by its flexibility on the representation of uncertainty, simplicity of construction and reduced amount of data required for training, in comparison with similar methodologies (utilizing other forms of RC models and interpolating models). The main aim of the present work is to provide a complete assessment of the MM based approach for the particular case of vibration based SHM on an operating wind turbine structure, in contrast to some of our recent works (Avendaño-Valencia and Fassois, 4b, 5d,a) where partial results have been presented. The framework includes two entities: (i) non-stationary time-dependent parametric models (either from the Functional Series FS or Linear Parameter Varying LPV types) for representing the time-dependent structural dynamics, and (ii) a Multiple Model (MM) representation for modeling an individual health state of the structure (also referred to as class) under uncertainty. The rationale behind this formulation stems from the expectation that an MM representation may be best for capturing uncertainty from a limited set of baseline vibration responses in a simple and effective form. The assessment of the damage diagnosis methods is performed on a NREL offshore 5-MW baseline wind turbine simulated via the FAST aeroelastic and structural dynamics simulation code (Jonkman and Buhl, 5), for which the average wind speed at each analysis interval is the external variable inducing uncertainties in the dynamic response of the structure. Five types of damages, each one of various increasing levels, are considered. 
Damage diagnosis is based on a single vibration acceleration response signal measured at either the tower top (fore-aft and lateral directions) or on the blade (at 5% of the blade length from the root; flapwise and edgewise directions). Initially, the identification and analysis of the non-stationary dynamics of the vibration response of the wind turbine via FS-TAR and LPV-AR models is demonstrated, and subsequently, MM-based damage diagnosis methods are built and optimized. A comparison is provided with a single model parameter based method introduced in Spiridonakos and Fassois (3) and with a Time-Frequency Representation with Principal Component Analysis based method (Wang et al., 8; Avendaño-Valencia et al., ). As one of its main contributions, this work provides the reader with a tool for the identification and analysis of structures with non-stationary vibration response, and for the design and optimization of an MM-based damage diagnosis method. The difficulties encountered and the choices made in the process are clarified throughout the text, in order to serve as a guide for the application of the methodology to similar problems. This paper is organized as follows: Initially, the problem of SHM on operating wind turbines is presented in Section 6.2, including the presentation of the FAST simulation package used to simulate the vibration response of operating wind turbines under uncertain wind excitation and the damage and fault scenarios utilized in this work.

A brief overview of the MM damage/fault diagnosis methods is provided in Section 6.3. Modeling and analysis of the vibration response at different structural states via FS-TAR and LPV-AR models is provided in Section 6.4. The results of damage diagnosis with the MM framework based on the optimized FS-TAR and LPV-AR models and a comparison with similar methods are given in Section 6.5. Finally, conclusions are presented in the closing section.

6.2 The Wind Turbine, the Damages/Faults, the Sensors

6.2.1 Wind turbine description and simulation

The SHM problem considered in this work features an NREL (National Renewable Energy Laboratory) offshore 5-MW baseline wind turbine (Jonkman et al., 9), which serves as a standard reference for component design, and for aerodynamic, aeroelastic, structural and control system simulation and assessment. The NREL 5-MW wind turbine is a conventional three-bladed upwind variable-speed and variable blade-pitch-to-feather-controlled turbine. The main properties of this wind turbine model are summarized in Table 6.2.1.

Table 6.2.1: Main properties of the NREL 5-MW baseline wind turbine as shown in (Jonkman et al., 9)
Property: Value
Rating: 5 MW
Rotor orientation, configuration: Upwind, 3 blades
Control: Variable speed, collective pitch
Drive train: High speed, multiple-stage gearbox
Rotor, hub diameter, hub height: {126, 3, 90} m
Cut-in, rated, cut-out wind speed: {3, 11.4, 25} m/s
Cut-in, rated rotor speed: {6.9, 12.1} rpm
Rated tip speed: 80 m/s
Rotor, nacelle, tower mass: {110, 240, 347.46} x 10^3 kg

The vibration response of the NREL 5-MW wind turbine is simulated by means of the FAST wind turbine simulation code. The FAST (Fatigue, Aerodynamics, Structures and Turbulence) code is an aeroelastic simulator capable of predicting the loads of two- and three-bladed horizontal axis wind turbines (Jonkman and Buhl, 5).
FAST uses up to DOFs to describe the wind turbine structure, including (Jonkman and Buhl, 5): 3-4 DOFs for the blades ( flapwise, - edgewise); up to 4 DOFs for the rotor shaft ( for torsion, for the hinges before the first bearing, and for pure rotation); DOF to describe the tilt stiffness of the nacelle; and about 3 DOFs to describe the torsion of the tower and its displacements in the fore-aft and lateral directions. Although in actual operational conditions there are several variables inducing uncertainty in the wind turbine dynamics, the average wind speed in the analysis interval is selected as the uncertainty-inducing variable in the presently considered experiment. More specifically, the value of the average wind speed used in a single simulation is drawn from a Gaussian distribution with mean 5 m/s and standard deviation m/s. Moreover, the input wind excitation is simulated using a Kaimal turbulence model with a power law wind profile (Hansen et al., 6b). The turbulence model is adjusted so as to yield an expected turbulence intensity of 4% at a wind speed of 5 m/s. The wind turbine response is simulated under the rated rotor speed of 12.1 rpm: in each simulation the wind turbine starts at the rated rotor speed with all the control systems on-line. The simulated period is 4 s and the vibration signals are sampled at 5 Hz. For each simulation, the portion from 5 s to 5 s of the obtained vibration signals, corresponding to 5 samples, is used for further analysis.

6.2.2 The damage/fault scenarios

Five types of damage are simulated, each one with four increasing levels of damage. The simulated damages correspond to typical damages or malfunctions in wind turbines, as discussed for example in (Ciang et al., 8; Hameed et al., 9; Adams et al., ; Benedetti et al., ). Each scenario is described in the following paragraphs:

Damage A, increased mass at the tip of the third blade: The mass is increased over the last % of the length of the third blade to simulate the effect of water infiltration into the blade structure, or to partly simulate the effects of ice growing over the tip of the blade (see Figure 6.2.1(b)). Four levels of damage are simulated by increasing the blade mass density by %, 4%, 6% and 8% over the last % of the length of the third blade. These damage levels are equivalent to increases of .37%, .73%, .% and .47% of the total mass of the blade, respectively.

Damage B, stiffness reduction at the root of the third blade: The stiffness of the root of the third blade is reduced in the edgewise direction, which simulates the effect of fatigue damage in the blade root, one of the most common types of damage in blades (see Figure 6.2.1(c)). Four damage levels are simulated by decreasing the blade edgewise stiffness by 5%, 3%, 45% and 6%, from the blade root up to % of the distance to the tip of the blade.

Damage C, stiffness reduction at the tower base: The stiffness in the lateral direction is reduced at the base of the tower to simulate fatigue damage in the welded joints or bolts at the base of the tower, where it is exposed to the highest loads (see Figure 6.2.1(d)). Four damage levels are simulated by decreasing the tower stiffness in the lateral direction by 5%, %, 5% and %, from the base up to % of the tower height.

Damage D, yawing error: Error in the yawing mechanism. Four damage levels are simulated by inducing , 4, 6 and 8 degree errors from the upwind position in the yawing system. This type of problem is common when there are faults in the yawing control system, and it rapidly increases the fatigue loads in the whole structure. It is not a structural damage in itself, but rather a malfunction that may lead to rapid fatigue damage in other structural components.
Damage E, reduction of the damping coefficient in the rotor low speed shaft: The damping coefficient of the low speed shaft (LSS) is reduced to simulate fatigue damage in the transmission system. Four damage levels are simulated by decreasing the damping coefficient of the LSS by 5%, 3%, 45% and 6%.

6.2.3 The sensors and the vibration response signals

For each damage type and level, as well as for the healthy state, realizations of the wind turbine response are simulated with the FAST code using different seeds for the generation of the turbulence time series. In the present analysis, four virtual accelerometers are considered, which are located as depicted in Figure 6.2.1(a): two at the tower top in the fore-aft and lateral directions, and another two at 5% of the distance from the root to the tip of the third blade in the edgewise and flapwise directions. Besides, the instantaneous angular position of the rotor is measured as well. Figure 6.2.2 shows Welch estimates of the Power Spectral Density (PSD) of the vibration response signals obtained for simulated realizations of the wind turbine in the healthy state at different sensor locations. The Welch spectral estimates are obtained using Gaussian windows of 496 samples with 389 samples overlap, based on a signal length of 8 samples. The plots also show the main natural frequencies derived from periodic linearization analysis for the rated rotor speed on the same wind turbine model through the FAST code, as shown in (Bir and Jonkman, 7). For reference, these are summarized in Table 6.2.2. As can be seen in the estimated spectra, the vibration response is characterized by very complex dynamics, in which modes other than those predicted from the linear analysis are observed. As shall be evident in the non-stationary analysis provided later, most of these modes are related to frequency and amplitude modulations stemming from the time-variant dynamics of the wind turbine.
Furthermore, it is observed that the PSD features large variability among different realizations, owing to the influence of the wind speed and turbulence.

6.3 Brief Overview of the Multiple Model Damage/Fault Diagnosis Methods

Let $y[t]$ represent the vibration response of the structure, with $t = 1, 2, \ldots, N$ being the discrete time and $T_s$ the sampling period. The structure may operate in one of several health states $v = \{o, a, b, c, \ldots\}$, where $o$ stands for the healthy state, while $a$, $b$, $c$ and so on represent various types of damages or faults. Damage (or fault) diagnosis refers to the problem of determining the current state of the structure given a newly acquired vibration response signal (the test signal) $y_u = [\, y_u[1] \;\; y_u[2] \;\; \cdots \;\; y_u[N] \,]^T$ and a model to represent the vibration response at each state $v$. In the Multiple Model (MM) approach, each one of these models corresponds to an MM for each structural



state. An MM is simply a set of models $\mathcal{M} = \{m_{v,l}\}$, with $m_{v,l} = \{\theta_{v,l}, \mathcal{S}\}$ and $l = 1, \ldots, L$, all of them corresponding to the structural state $v$ (Avendaño-Valencia and Fassois, 5b).

Figure 6.2.1: Location of the virtual sensors on the wind turbine structure and depiction of the types of damage used in the experiment: (a) Four virtual sensors are located on the structure: two at the tower top in the fore-aft and lateral directions, and another two at about 5% of the length from the root of the 3rd blade in the flapwise and edgewise directions; (b) Damage type A: increased mass in the last % of the length of the 3rd blade; (c) Damage type B: the stiffness at the root of the 3rd blade in the edgewise direction is reduced; (d) Damage type C: the stiffness at the base of the tower is reduced.

Table 6.2.2: Natural frequencies of the NREL Offshore 5-MW Baseline Wind Turbine obtained via periodic linearization analysis (Bir and Jonkman, 7).
Mode | Description | Frequency [Hz]
1 | 1st Tower Fore-Aft | 0.3240
2 | 1st Tower Lateral | 0.3120
3 | 1st Drive Train Torsion | 0.6205
4 | 1st Blade Asymmetric Flapwise Yaw |
5 | 1st Blade Asymmetric Flapwise Pitch |
6 | 1st Blade Collective Flap |
7 | 1st Blade Asymmetric Edgewise Pitch |
8 | 1st Blade Asymmetric Edgewise Yaw |
9 | 2nd Blade Asymmetric Flapwise Yaw | 1.9337
10 | 2nd Blade Asymmetric Flapwise Pitch | 1.9223
11 | 2nd Blade Collective Flap | 2.0205
12 | 2nd Tower Fore-Aft | 2.9003
13 | 2nd Tower Lateral | 2.9361
In this section, the definition and construction of MM representations based on LPV-AR models, as well as the related damage diagnosis tests based on the marginal likelihood and the Kullback-Leibler divergence, are briefly overviewed.

6.3.1 The elementary models

In the case of non-stationary vibration response, Linear Parameter Varying AR (LPV-AR) models may be used as the elementary model in the MM representation, whenever the non-stationary dynamics are governed

by a scheduling variable $\beta[t] \in \mathbb{R}$. Thus, an LPV-AR$(n_a)_{[p_a,p_s]}$ model, with $n_a$ representing the AR order, and $p_a$ and $p_s$ representing the dimensionality of the AR and innovations variance functional subspaces, is defined as follows (Poulimenos and Fassois, 6; Avendaño-Valencia and Fassois, 5b):

$$ y[t] = -\sum_{i=1}^{n_a} a_i(\beta[t])\, y[t-i] + w[t], \qquad w[t] \sim \mathrm{NID}\left(0, \sigma_w^2(\beta[t])\right) \tag{6.3.1a} $$
$$ a_i(\beta[t]) = \sum_{j=1}^{p_a} a_{i,j}\, G_{b_a(j)}(\beta[t]); \qquad \mathcal{F}_{AR} = \left\{ G_{b_a(1)}(\beta[t]), \ldots, G_{b_a(p_a)}(\beta[t]) \right\} \tag{6.3.1b} $$
$$ \sigma_w^2(\beta[t]) = \sum_{j=1}^{p_s} s_j\, G_{b_s(j)}(\beta[t]); \qquad \mathcal{F}_{\sigma_w^2} = \left\{ G_{b_s(1)}(\beta[t]), \ldots, G_{b_s(p_s)}(\beta[t]) \right\} \tag{6.3.1c} $$

where $a_i(\beta[t])$ are the AR parameters, $w[t]$ is a zero-mean NID innovations sequence with variance $\sigma_w^2(\beta[t])$, $\mathcal{F}$ stands for a functional subspace of the respective quantity, $b_a(j)$ ($j = 1, \ldots, p_a$) and $b_s(j)$ ($j = 1, \ldots, p_s$) are the indices of the specific basis functions that are included in each functional subspace, while $a_{i,j}$ and $s_j$ stand for the coefficients of projection of the AR parameters and innovations variance. Functional Series Time-dependent AR (FS-TAR) models are the special case when $\beta[t] \equiv t$. The LPV-AR model $m = \{\theta, \mathcal{S}\}$ is fully determined by the parameter vector $\theta$ and the structure $\mathcal{S}$, each one of them defined as:

$$ \theta = \left[\, a_{1,1} \;\; a_{1,2} \;\; \cdots \;\; a_{n_a,p_a} \;\; s_1 \;\; s_2 \;\; \cdots \;\; s_{p_s} \,\right]^T_{n \times 1}, \qquad \mathcal{S} = \{ n_a, b_a, b_s \} \tag{6.3.2} $$

where $n = n_a p_a + p_s$ and $b_a = [\, b_a(1) \; \cdots \; b_a(p_a) \,]^T_{p_a \times 1}$, $b_s = [\, b_s(1) \; \cdots \; b_s(p_s) \,]^T_{p_s \times 1}$.

Figure 6.2.2: Comparison of the Welch spectral estimates on simulated vibration response signals from the healthy state at different sensor positions. The thick line depicts the sample average PSD, whereas the dotted lines indicate individual realizations. Red lines indicate the main natural frequencies of the wind turbine as provided by (Bir and Jonkman, 7).
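The LPV-AR structure defined above can be made concrete with a small simulation sketch. It is offered only as an illustration under explicit assumptions: the trigonometric basis, the model orders, the coefficient values and the scheduling signal are all hypothetical, the sign convention assumes the AR terms enter with a minus sign, and none of it corresponds to the models identified in this work.

```python
import math
import random


def basis(index, beta):
    """Hypothetical basis G_index(beta): 1, cos(beta), sin(beta), cos(2 beta), ..."""
    if index == 0:
        return 1.0
    k = (index + 1) // 2
    return math.cos(k * beta) if index % 2 == 1 else math.sin(k * beta)


def expand(coeffs, basis_indices, beta):
    """Evaluate sum_j c_j * G_{b(j)}(beta) for one value of the scheduling variable."""
    return sum(c * basis(b, beta) for c, b in zip(coeffs, basis_indices))


def simulate_lpv_ar(a_proj, s_proj, b_a, b_s, beta, seed=0):
    """Simulate y[t] = -sum_i a_i(beta[t]) y[t-i] + w[t], w[t] ~ N(0, sigma_w^2(beta[t]))."""
    rng = random.Random(seed)
    n_a = len(a_proj)
    y = [0.0] * len(beta)
    for t in range(len(beta)):
        sigma2 = expand(s_proj, b_s, beta[t])
        w = rng.gauss(0.0, math.sqrt(sigma2))
        ar_part = sum(
            expand(a_proj[i], b_a, beta[t]) * y[t - 1 - i]
            for i in range(n_a)
            if t - 1 - i >= 0
        )
        y[t] = -ar_part + w
    return y


def one_step_prediction(a_proj, b_a, beta, y, t):
    """Predict y[t] from past samples (valid for t >= n_a)."""
    return -sum(
        expand(a_proj[i], b_a, beta[t]) * y[t - 1 - i] for i in range(len(a_proj))
    )


# Hypothetical LPV-AR(2)[2,1] model scheduled by a slowly rotating angle beta[t].
b_a, b_s = [0, 1], [0]               # basis indices for AR parameters and variance
a_proj = [[-1.5, 0.1], [0.8, 0.05]]  # a_{i,j}: rows i = 1..n_a, columns j = 1..p_a
s_proj = [0.01]                      # s_j: constant innovations variance
beta = [2.0 * math.pi * 0.01 * t for t in range(500)]
y = simulate_lpv_ar(a_proj, s_proj, b_a, b_s, beta)
```

In this sketch the parameter vector of Equation (6.3.2) would be obtained by stacking the rows of `a_proj` followed by `s_proj`, giving $n = n_a p_a + p_s = 5$ entries.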

6.3.2 Construction of the MM representation

The identification of single FS-TAR models by means of maximum likelihood methods is broadly discussed in (Poulimenos and Fassois, 6; Spiridonakos and Fassois, 4b). Nonetheless, for the purpose of constructing MM representations, it is necessary to obtain a common model structure that effectively represents a large set of vibration responses. Therefore, the estimation of the parameter vector and the subsequent evaluation of the model performance are carried out on different non-intersecting training and validation segments, denoted as:

$$ y_m = \left[\, y_m^{(tr)} \;\; y_m^{(val)} \,\right], \qquad \beta_m = \left[\, \beta_m^{(tr)} \;\; \beta_m^{(val)} \,\right], \qquad m = 1, 2, \ldots, M $$

where $M$ is the total number of vibration response signals available for each structural state in the baseline phase, $y_m$ is each one of the available vibration response vectors, while $y_m^{(tr)}$ and $y_m^{(val)}$ are the segments used for parameter estimation (training) and performance evaluation (validation), respectively. Then, the selection of the best model structure is made in terms of the empirical risk (Schölkopf and Smola,, Ch.3), (Vapnik,, Ch.):

$$ R_{emp}(\mathcal{S}) = \frac{1}{M} \sum_{m=1}^{M} \frac{\sum_{t=1}^{N_{val}} \left( y_m^{(val)}[t] - \hat{y}_m^{(val)}[t|t-1] \right)^2}{\sum_{t=1}^{N_{val}} \left( y_m^{(val)}[t] \right)^2} \tag{6.3.3} $$

where $N_{val}$ is the length of the validation segment and $\hat{y}_m^{(val)}[t|t-1]$ are the one-step-ahead model predictions evaluated on the validation segment employing the parameter estimates $\hat{\vartheta}_m$ obtained from the training segment (Avendaño-Valencia and Fassois, 5b). The use of the empirical risk for model structure selection is important to achieve an MM representation that does not over-fit the training data and is capable of generalizing to unseen data sets. Furthermore, the empirical risk tends to favor models with low complexity, thus enabling the selection of the most compact and efficient representation (Schölkopf and Smola,, Ch.3), (Vapnik,, Ch.).
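The structure selection step can be sketched in a few lines. The sketch assumes that the empirical risk is the validation prediction error normalized by the signal energy and averaged over the available records; the validation segments, the predictions and the structure labels below are made-up numbers used purely for illustration.

```python
def empirical_risk(validation_signals, validation_predictions):
    """Average over records of the normalized one-step-ahead prediction error."""
    ratios = []
    for y, y_hat in zip(validation_signals, validation_predictions):
        rss = sum((a - b) ** 2 for a, b in zip(y, y_hat))  # residual sum of squares
        sss = sum(a ** 2 for a in y)                        # signal sum of squares
        ratios.append(rss / sss)
    return sum(ratios) / len(ratios)


def select_structure(candidates):
    """Return the label of the candidate structure with the lowest empirical risk."""
    return min(candidates, key=lambda name: empirical_risk(*candidates[name]))


# Hypothetical validation segments (M = 2 records) and the one-step-ahead
# predictions produced by two candidate model structures.
y_val = [[1.0, -0.5, 0.25, -0.125], [0.8, 0.4, -0.2, -0.6]]
pred_a = [[0.9, -0.4, 0.2, -0.1], [0.7, 0.5, -0.1, -0.5]]
pred_b = [[1.0, -0.5, 0.2, -0.125], [0.8, 0.4, -0.2, -0.5]]
candidates = {"structure_A": (y_val, pred_a), "structure_B": (y_val, pred_b)}
best = select_structure(candidates)
```

Because the predictions are evaluated on data not used for estimation, a structure that merely over-fits the training segment gains no advantage in this comparison.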
Nevertheless, criteria like the RSS, RSS/SSS and BIC may also be used to guide the selection of the best model structure (Poulimenos and Fassois, 6).

A second issue of importance in the construction of an MM is deciding the number of models to keep in the representation. Consider first that a set of $M$ vibration response signals has been obtained from the structure in a single structural state, with the purpose of constructing the respective MM. Then, it must be assessed whether all the corresponding $M$ models are actually necessary. For this purpose, a measure of the similarity between two models, like the Kullback-Leibler (K-L) divergence, can be used. In the case of LPV-AR models, the K-L divergence takes the form (Avendaño-Valencia and Fassois, 5b):

$$D_{KL}(m_a, m_b) = -\frac{N}{2} + \frac{1}{2}\, d_M(m_a, m_b) + \frac{1}{2}\, d_s(m_a, m_b) \tag{6.3.4}$$

$$d_M(m_a, m_b) = (\vartheta_b - \vartheta_a)^T\, \Sigma_{\theta_b}^{-1}\, (\vartheta_b - \vartheta_a)$$

$$d_s(m_a, m_b) = \sum_{t=1}^{N} \left( \frac{\sigma_{w(a)}^2(\beta_a[t])}{\sigma_{w(b)}^2(\beta_a[t])} - \ln \sigma_{w(a)}^2(\beta_a[t]) + \ln \sigma_{w(b)}^2(\beta_a[t]) \right)$$

The K-L divergence is a measure of the information lost when the model $m_b$ is used to represent model $m_a$. Thus, a large K-L divergence indicates that models $m_a$ and $m_b$ are very different from each other and consequently represent different dynamics in the vibration response. Likewise, if models $m_a$ and $m_b$ are similar, hence representing similar dynamic behaviors, then their divergence should be close to zero. In this sense, if two models have a low divergence value, then one of them may be dropped from the representation. This pruning process may be repeated until a certain performance measure is met, such as the performance on damage diagnosis.

6.3.3 Damage diagnosis based on MM representations

MM Marginal Likelihood based methods

The Multiple Model Marginal Likelihood (MM-ML) based damage diagnosis method attempts to determine the presence and type of damage in the structure in terms of the marginal likelihood of the test signal $y_u$ given class

$v$, namely $p(y_u | v)$. The MM-ML method thus assigns the test signal to the class (structural state) with the highest marginal likelihood. The damage detection and identification tests derived from this concept, as presented in (Avendaño-Valencia and Fassois, 5b), are summarized in Table 6.3.1. Two versions are considered according to the type of approximation used to compute the marginalizing integral, namely: (i) the MM-ML-sum method, which uses the finite sum approximation of the marginal likelihood; (ii) the MM-ML-max method, which uses the maximum approximation of the marginal likelihood. The weights $\pi_{v,l}$ and the threshold $\rho_{lim}$ are the variables determining the final performance of the method. As shall be shown in Section 6.5., an optimization method is used in order to find the best values for these variables.

Table 6.3.1: The Multiple Model Marginal Likelihood-based (MM-ML) method

Damage detection: Given a test signal $y_u$ and the MM of the healthy state $\mathcal{M}_o = \{m_{o,l}\}$, $l = 1, \ldots, L$, with $m_{o,l} = \{\hat{\theta}_{o,l}, S\}$ and corresponding weights $\pi_{o,l}$, and a damage detection threshold $\rho_{lim}$:
1. Compute the parameter vector estimate $\hat{\theta}_u$ and its likelihood $p(y_u | \hat{\theta}_u)$;
2. Compute the likelihoods of the individual models in the MM, $p(y_u | \hat{\theta}_{o,l})$, for all $l = 1, \ldots, L$;
3. Evaluate the marginal likelihood with one of the following approximations:
   (i) Finite sum approximation (MM-ML-sum): $p(y_u | o) = \sum_{l=1}^{L} \pi_{o,l}\, p(y_u | \hat{\theta}_{o,l})$
   (ii) Maximum approximation (MM-ML-max): $p(y_u | o) = \max_l \pi_{o,l}\, p(y_u | \hat{\theta}_{o,l})$
4. Damage detection test: if $\rho_d(y_u) = p(y_u | \hat{\theta}_u) / p(y_u | o) \geq \rho_{lim}$, the structure is damaged; otherwise, the structure is healthy;
where $\rho_d(y_u)$ is the statistical quantity for damage detection.

Damage identification: Given a test signal $y_u$ and the MMs of the damaged states $\mathcal{M}_v = \{m_{v,l}\}$ for $l = 1, \ldots, L$ and $v = \{a, b, c, \ldots\}$, with $m_{v,l} = \{\hat{\theta}_{v,l}, S\}$ and corresponding weights $\pi_{v,l}$:
1. Compute the likelihoods of the individual models of each one of the MMs, $p(y_u | \hat{\theta}_{v,l})$, for all $l = 1, \ldots, L$ and $v = \{a, b, c, \ldots\}$;
2. Evaluate the marginal likelihood with one of the following approximations:
   (i) Finite sum approximation (MM-ML-sum): $p(y_u | v) = \sum_{l=1}^{L} \pi_{v,l}\, p(y_u | \hat{\theta}_{v,l})$
   (ii) Maximum approximation (MM-ML-max): $p(y_u | v) = \max_l \pi_{v,l}\, p(y_u | \hat{\theta}_{v,l})$
3. Damage identification test: $\hat{v} = \arg\max_{v = \{a, b, c, \ldots\}} p(y_u | v)$

MM Kullback-Leibler based methods

The Multiple Model Kullback-Leibler divergence (MM-KL) based damage diagnosis method attempts to determine the presence and type of damage/fault in the structure by comparing the likelihoods associated with the test model and the models in the MMs of all the reference states. This comparison is made in terms of the Kullback-Leibler (K-L) divergence measure in Equation (6.3.4). The damage/fault detection and identification tests derived from the K-L divergence are referred to as MM-KL methods and are summarized in Table 6.3.2. The fixing parameters of the

MM-KL method are the weights $\lambda_{v,l}$ and the threshold $d_{lim}$ (only for the damage detection test). The optimization of these parameters for the maximization of the damage diagnosis performance is explained in Section 6.5.

Table 6.3.2: The Multiple Model Kullback-Leibler divergence-based (MM-KL) method

Damage detection: Given a test signal $y_u$ and the MM of the healthy state $\mathcal{M}_o = \{m_{o,l}\}$, with $m_{o,l} = \{\hat{\theta}_{o,l}, S\}$ and weights $\lambda_{o,l}$ for all $l = 1, \ldots, L$, and a damage detection threshold $d_{lim}$:
1. Compute the ML estimate $\hat{\theta}_u$ and its corresponding covariance matrix $\Sigma_{\theta_u}$;
2. Evaluate the Kullback-Leibler divergence between the test model and all the models in $\mathcal{M}_o$ as shown in Equation (6.3.4);
3. Damage detection test: if $\min_{l=1,\ldots,L} \lambda_{o,l}\, D_{KL}(m_u, m_{o,l}) \leq d_{lim}$, the structure is healthy; otherwise, the structure is damaged.

Damage identification: Given a test signal $y_u$ and the MMs of the damaged states $\mathcal{M}_v = \{m_{v,l}\}$, with $m_{v,l} = \{\hat{\theta}_{v,l}, S\}$ and weights $\lambda_{v,l}$, for all $l = 1, \ldots, L$ and $v = \{a, b, c, \ldots\}$:
1. Compute the ML estimate $\hat{\theta}_u$ and its corresponding covariance matrix $\Sigma_{\theta_u}$;
2. Evaluate the Kullback-Leibler divergence between the test model and all the models in $\mathcal{M}_v$, $v = \{a, b, c, \ldots\}$, as shown in Equation (6.3.4);
3. Damage identification test: $\hat{v} = \arg\min_{v \in \{a, b, c, \ldots\}} \big( \min_{l=1,\ldots,L} \lambda_{v,l}\, D_{KL}(m_u, m_{v,l}) \big)$

6.4 Modeling and Analysis of the Vibration Response Signals

6.4.1 LPV-AR and FS-TAR based modeling

Complete LPV-AR structures are estimated via the Multi-Stage Maximum Likelihood (MS-ML) method (Poulimenos and Fassois, 6; Avendaño-Valencia and Fassois, 5b), where the parameters and innovations variance are expanded on a functional subspace of the trigonometric type, defined as follows (Avendaño-Valencia and Fassois, 5d,a):

$$G_1(\beta[t]) = 1, \quad G_{2k}(\beta[t]) = \sin(k\beta[t]), \quad G_{2k+1}(\beta[t]) = \cos(k\beta[t]), \quad k = 1, 2, \ldots, \frac{p_a - 1}{2} \tag{6.4.1}$$
only for $p_a$ odd (to keep the same number of sine and cosine components), where $\beta[t]$ is the instantaneous angular position of the rotor at time $t$, taken modulo $2\pi$. The functional basis is defined in this form to account for the variability in the instantaneous angular speed of the rotor, which changes constantly with the incoming wind. Simpler FS-TAR models are also considered, this time utilizing a trigonometric basis synchronized with the rated speed of the rotor for the expansion of the FS-TAR parameters. Thus, the functional expansion basis is of the form (Avendano-Valencia and Fassois, 3b; Avendaño-Valencia and Fassois, 4b):

$$G_1[t] = 1, \quad G_{2k}[t] = \sin(k\omega_o t), \quad G_{2k+1}[t] = \cos(k\omega_o t), \quad k = 1, 2, \ldots, \frac{p_a - 1}{2} \tag{6.4.2}$$

also only for $p_a$ odd, where $\omega_o = 2\pi f_{rot} / f_s$ and $f_{rot} = 12.1/60 \approx 0.2$ Hz is the rated speed of the rotor in Hertz.

The selection of the model structure is made in two stages: (i) Selection of $n_a$, where FS-TAR and LPV-AR models are estimated with fixed $p_a = p_s = 3$ and with increasing $n_a$ in the range 6 to 5; (ii) Selection

of $p_a$ and $p_s$, where FS-TAR and LPV-AR models are estimated for the best $n_a$ in the previous step and with $p_a = p_s = \{3, 5,, 9\}$. The selection of the model structure is guided by the empirical risk defined in Equation (6.3.3). The RSS/SSS is also used as reference. Full details of parameter estimation and model order selection are provided in Table 6.4.1.

Table 6.4.1: Details of the identification process for FS-TAR and LPV-AR models.
Stage: Details
Vibration signal features: Total length N = 5 samples, initial 4 samples used for training ($N_{tr}$), last samples used for validation ($N_{val}$).
Estimation: Multi-Stage Maximum Likelihood (MS-ML) method with instantaneous innovations variance estimate (Poulimenos and Fassois, 6; Avendaño-Valencia and Fassois, 5b). Stopping criteria: tolerance in the change of the parameters, tolerance in the change of the likelihood, maximum number of iterations.
Model order selection: Estimate FS-TAR/LPV-AR$(n)_{[p_a,p_s]}$ models with n = 6,...,3 and $p_a = p_s = 3$.
Basis order selection: Estimate FS-TAR/LPV-AR$(n_a)_{[p,p]}$ models with the best $n_a$ of the previous step and $p_a = p_s$ =,3,...,5.

Figure 6.4.1 displays the realization-based empirical risk and sample average RSS/SSS curves obtained at the two stages of model order selection for vibration responses of the healthy structure measured at each one of the considered sensors. The empirical risk curves obtained during the model order selection procedure, shown in Figure 6.4.1(a), display clear minima for each sensor. The minimum values are indicated with arrows on each plot. The curves obtained for the tower top in the lateral direction and for the blade in the edgewise direction also show other minima at higher orders, but the improvement is minimal compared with the selected order.
The RSS/SSS curves behave similarly to the empirical risk curves, although they decrease continuously with increasing model order. Also, the empirical risk curves obtained for FS-TAR models are slightly lower than those of the LPV-AR models, while the RSS/SSS curves show the opposite, thus favoring LPV-AR models. Figure 6.4.1(b) shows the results of the basis order selection procedure, based on the best model orders obtained in the previous step. On this occasion, the empirical risk and RSS/SSS curves show opposite behaviors: the RSS/SSS favors higher orders, while the empirical risk suggests selecting lower orders. Thus, the decision in this step is not as simple as in the previous one. The basis dimensionality is ultimately selected as $p_a = p_s = 7$, so that the representation basis consists of the constant plus the first three sine/cosine pairs. This basis dimensionality is selected for the following reasons: (i) to reach a compromise between the empirical risk and RSS/SSS curves; (ii) to obtain a representation basis that may account for the first three harmonics of the blade rotation frequency; (iii) although some basis functions may not significantly improve the modeling performance at the healthy state of the structure, they may be helpful for modeling the unbalances found under certain damage types.
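The trigonometric basis with dimensionality 7 (a constant plus three sine/cosine pairs) can be generated as in the following sketch. The function name `trig_basis`, the sampling frequency `f_s` and the sample count are hypothetical; the rotor frequency value and the column ordering (constant, then sine and cosine of each harmonic) follow one reading of the basis definitions and should be treated as assumptions.

```python
import numpy as np

def trig_basis(beta, p):
    """Return the N x p matrix [1, sin(b), cos(b), sin(2b), cos(2b), ...] for odd p."""
    if p < 1 or p % 2 == 0:
        raise ValueError("p must be a positive odd integer")
    beta = np.asarray(beta, dtype=float)
    columns = [np.ones_like(beta)]
    for k in range(1, (p - 1) // 2 + 1):
        columns.append(np.sin(k * beta))
        columns.append(np.cos(k * beta))
    return np.column_stack(columns)

# LPV-AR case: beta[t] is the measured rotor angle, wrapped to [0, 2*pi)
rotor_angle = np.mod(np.linspace(0.0, 6 * np.pi, 300), 2 * np.pi)   # hypothetical signal
G_lpv = trig_basis(rotor_angle, p=7)

# FS-TAR case: the basis is driven by time through the rated rotor frequency
f_rot, f_s = 0.2, 10.0              # Hz; f_s is an illustrative sampling frequency
omega_o = 2 * np.pi * f_rot / f_s
t = np.arange(300)
G_fs = trig_basis(omega_o * t, p=7)
```

Both matrices have one row per time instant and one column per basis function, so either can be passed directly to a least-squares or likelihood-based estimator of the projection coefficients.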
A summary of some performance figures of the selected model structures, including the empirical risk, RSS/SSS, log-likelihood, Bayesian Information Criterion (BIC), Condition Number (CN) of the inverted matrices in the computation of the parameter estimates, and Samples Per Parameter (SPP) criterion, is shown in Table 6.4.2. Validation and comparison of the obtained model structures are presented in Appendix 6.A.

Model based analysis of the dynamics

The dynamic characteristics of the vibration response of the wind turbine are analyzed by means of the Time-Variant Power Spectral Density (TV-PSD) and the Spectral Correlation (Antoni, 7; Poulimenos and Fassois, 6; Avendaño-Valencia and Fassois, 6). A brief summary of these quantities and their calculation based on the parameters of a TARMA model is provided in Appendix 6.B. The LPV-ARMA model based Mélard-Tjøstheim PSD is computed on 5 points over the frequency range [,8] Hz (thus the frequency resolution is $f_o$ = 8/5=

[Hz]), and evaluated for a single period of rotation of the wind turbine (thus $\beta[t] = \omega_a t$, with t =,,..., and $\omega_a$ = π/). Likewise, the LPV-ARMA model based spectral correlation is evaluated on the frequency range [,8] [Hz] and for cyclic frequencies from 0p up to 3p, where 1p = 0.2 [Hz] is the rotor frequency.

Figure 6.4.1: Empirical risk and sample average RSS/SSS obtained at the two stages of model order selection for FS-TAR and LPV-AR structures for realizations of the simulated vibration responses of the healthy structure at each one of the sensors (tower top fore-aft, tower top lateral, blade flapwise, blade edgewise): (a) Selection of the model order $n_a$ with fixed $p_a = p_s = 3$; (b) Selection of the basis order $p_a$ using the best $n_a$ obtained in the previous step. Vertical axes: $\log R_{emp}$.

Figure 6.4.2 provides a comparison of the parametric Mélard-Tjøstheim PSDs obtained from the sample average LPV-AR models of the vibration response of the healthy state at each one of the considered sensors. Additionally, Figure 6.4.3 shows the parametric Mélard-Tjøstheim PSD estimates derived from the sample average LPV-AR models of the vibration response of the wind turbine at the tower top in the lateral direction for the different structural states at the highest level of damage. The Mélard-Tjøstheim PSD estimates from the vibration response at the healthy state show the main frequency modes found in the stationary PSD estimates shown in Figure 6... Some of the unexplained frequency modes found in the stationary PSD are evidenced in the Mélard-Tjøstheim PSD estimates

as being part of a single mode modulated in amplitude or frequency. The presence of damage modifies the time-dependent behavior of some modes, depending on the damage type.

Table 6.4.2: FS-TAR/LPV-AR model structure identification results for wind turbine vibration response simulations of the healthy state.
Columns: Sensor; Model; $R_{emp}$; RSS/SSS; $\ln p(y|\theta)$; BIC; log CN; SPP.
Tower top fore-aft: FS-TAR(5)[7,7], LPV-AR(5)[7,7]
Tower top lateral: LPV-AR(8)[7,7], FS-TAR(8)[7,7]
Blade flapwise: LPV-AR(7)[7,7], FS-TAR(7)[7,7]
Blade edgewise: FS-TAR(6)[7,7], LPV-AR(6)[7,7]

Figure 6.4.4 shows the spectral correlation derived from the sample average LPV-AR model for the healthy and all the simulated damage scenarios at the highest damage level. The spectral correlations at 0p (corresponding to the constant terms in the PSD along time) and 3p (corresponding to modulations induced by the blade-to-blade period) remain almost unchanged for most damage types at all the sensors. Only damages of types B and D introduce deviations from the normal state, which manifest differently at different sensor locations. Nonetheless, the most important changes in the dynamics are evident at 1p (related to modulations induced by the period of rotation of the rotor). This is due to the unbalance that the presence of damage introduces in the dynamics of the vibration response. It can also be seen that damage modifies the natural frequencies of some modes, as evidenced for example by the vibration response of the structure under damage A at the blade in the lateral direction (Figure 6.4.4(d)).
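To give a feeling for how a time-variant PSD is obtained from the parameters of a time-dependent AR model, the sketch below evaluates the simpler "frozen-time" PSD, in which the parameters at each time instant are treated as those of a stationary AR model. This is a stand-in chosen for brevity and is not claimed to coincide with the Mélard-Tjøstheim PSD used in this section, whose definition is given in Appendix 6.B. The function name, the sign convention of the AR polynomial and the numerical values are assumptions.

```python
import numpy as np

def frozen_time_psd(a_t, var_t, freqs, fs):
    """S[t, f] = var_t[t] / |1 + sum_i a_i[t] exp(-j 2 pi f i / fs)|^2.

    a_t   : (N, n_a) AR parameter trajectories
    var_t : (N,) innovations variance trajectory
    freqs : (F,) frequencies in Hz
    fs    : sampling frequency in Hz
    """
    n_a = a_t.shape[1]
    lags = np.arange(1, n_a + 1)
    E = np.exp(-2j * np.pi * np.outer(freqs, lags) / fs)   # (F, n_a)
    ar_poly = 1.0 + a_t @ E.T                               # (N, F) AR polynomial values
    return var_t[:, None] / np.abs(ar_poly) ** 2

# Hypothetical AR(2) model whose first parameter varies over one rotation
beta = np.linspace(0.0, 2 * np.pi, 64, endpoint=False)
a_t = np.column_stack([-1.2 + 0.1 * np.sin(beta), np.full(beta.size, 0.6)])
var_t = np.ones(beta.size)
freqs = np.linspace(0.0, 5.0, 101)
S = frozen_time_psd(a_t, var_t, freqs, fs=10.0)             # shape (64, 101)
```

Each row of `S` is the spectrum "seen" at one rotor position, so plotting `S` against rotor angle and frequency gives a time-frequency map of the kind discussed above.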
6.5 Damage and Fault Diagnosis Results

The remainder of this section is devoted to the application of the MM-based damage detection and identification methodologies presented in Section 6.3.3, namely the approach based on the Marginal Likelihood of the MM (MM-ML-sum and MM-ML-max damage detection and identification methods) and the approach based on the Kullback-Leibler divergence (MM-KL method). For the sake of comparison, the Single FS-TAR/LPV-AR Model Parameter-based (SMP) method described in Appendix 6.C is evaluated as well. The damage diagnosis methods are built upon the LPV-AR and FS-TAR models obtained in the previous section.

Description of the optimization of the damage diagnosis methods via cross-validation

In order to ensure the best performance of the methodologies, the fixing parameters of each one of the damage diagnosis methods are optimized within a cross-validation procedure. Cross-validation is a model assessment methodology in which the adjustment of the method (training) and its subsequent performance evaluation (validation) are carried out on independent sets of data, with the purpose of assessing the behavior of the method on unseen data. The damage diagnosis methods utilized in this part of the work are evaluated and optimized within a multi-fold cross-validation approach, in which the whole data set is divided into different non-intersecting subsets. In the SMP method, since the training is based only on a single model, a different way of defining the training and validation subsets must be followed. A summary of the training and validation subsets used for the evaluation of the damage diagnosis methods is provided in Table 6.5.1.

The performance of the damage diagnosis methods is evaluated in terms of the Receiver Operating Characteristic (ROC) curves and their corresponding Area Under the ROC Curve (AUC) (Fawcett, 6). ROC curves are constructed by plotting the True Positive Rate (TPR) vs.
the False Positive Rate (FPR) of a detector or binary classifier as the decision threshold ($\rho_{lim}$ or $d_{lim}$) increases. The ideal ROC curve passes through the point (0, 1),


indicating zero FPR and TPR equal to unity, whereas a ROC curve running along the diagonal TPR = FPR indicates a 50/50% chance that the method decides for either of the classes. The ROC-AUC summarizes the information of the ROC curve in a single variable, and can take values between 0 and 1, where the ideal performance is achieved when AUC = 1, while the worst is at AUC = 0.5. The AUC can also be associated with the probability that the detector or classifier will rank a sample from the target class (damaged state) higher than a sample from the reference class (healthy state), and in that sense, the AUC is also related to the probability of detection of the method, regardless of the value of the decision threshold.

Figure 6.4.2: Comparison of the parametric Mélard-Tjøstheim TV-PSD estimates derived from the sample average LPV-AR models of vibration responses of the wind turbine tower measured at different sensor locations: (a) tower top fore-aft direction, (b) tower top lateral direction, (c) blade flapwise direction, (d) blade edgewise direction.

Thus, the optimization of the weights of the damage detection methods is made in terms of the maximization of the ROC-AUC, defined as follows:

$$\pi^{*} = \arg\max_{\pi} \mathrm{AUC}(\pi), \qquad \text{s.t.: } \sum_{l=1}^{L} \pi_{o,l} = 1, \text{ and } \pi_{o,l} \geq 0 \tag{6.5.1}$$

where $\pi = [\, \pi_{o,1} \;\; \pi_{o,2} \;\; \cdots \;\; \pi_{o,L} \,]$, and $\mathrm{AUC}(\pi)$ is the ROC-AUC obtained with the MM-ML damage detection method (in its two versions) using weights $\pi$. Due to the complexity of the optimization problem and the possibility of several local maxima, a global optimization method is adopted for the search of optimal values of the weights. The Generalized Pattern Search (GPS) method is selected for this task, which consists of the following steps (Kolda et al., 6):

(i) The GPS algorithm begins at a given starting point $\pi_0$.
(ii) A mesh of points is created around the current point at the 2K points determined by the direction vectors [1 0 ⋯ 0], [0 1 ⋯ 0], ..., [0 0 ⋯ 1] and their negatives.
(iii) In a successful poll, the GPS algorithm finds a point that improves the cost function, and then selects it as the next point. Afterwards, the algorithm doubles the size of the mesh and polls again.
(iv) In an unsuccessful poll, the GPS algorithm does not find a point that improves the cost function. In that case, the current point is selected again as the next point, the mesh size is shrunk by a factor of two, and the algorithm polls again.
(v) The algorithm runs until a maximum number of iterations is reached, or until the tolerance values on the mesh size, the objective function or the change in the parameters are met.

Figure 6.4.3: Comparison of the parametric Mélard-Tjøstheim TV-PSD estimates derived from the sample average LPV-AR model of vibration responses of the wind turbine tower at the tower top in the lateral direction for different structural states: (a) healthy, (b) damage A: increased mass blade tip, (c) damage B: decreased stiffness blade root, (d) damage C: decreased stiffness tower base, (e) damage D: yawing error, (f) damage E: decreased damping in the low speed shaft.

Figure 6.4.4: Parametric estimates of the spectral correlation derived from the sample average LPV-AR models of vibration response signals of the wind turbine from the sensors located at: (a) tower-top fore-aft direction, (b) tower-top lateral direction, (c) blade flapwise direction, (d) blade edgewise direction. Each column shows the spectral correlations yy(α, f), with $f = n f_o$, where $f_o$ = [Hz] is the Fourier frequency, and $\alpha = k \cdot 1p$ with $k = 0, \ldots, 3$, where 1p = 0.2 [Hz] is the frequency of rotation of the rotor.

The advantage of the GPS algorithm in the presently analyzed optimization problem is that it can search through

several basins of attraction, thus improving the chances of finding a global optimum. The settings of the GPS algorithm used for the optimization problem in Equation (6.5.1) are summarized in Table 6.5.2.

Table 6.5.1: Definition of the number of records used for training and validation in the cross-validation for evaluation of damage diagnosis in single and multiple model methods.
Column groups: Damage detection (Training, Validation, Total); Damage identification (Training, Validation, Total).
Rows, given separately for the single model method and for the multiple model methods: Healthy; Damage level 1; Damage level 2; Damage level 3; Damage level 4; Total per damage type; Total healthy and 5 damage types.

Table 6.5.2: Settings used for the optimization of the damage diagnosis methods with the GPS algorithm.
Method's implementation: patternsearch (MATLAB's Global Optimization Toolbox)
Stopping criteria: Maximum number of iterations: 100 L; Tolerance value of the mesh size: $10^{-6}$; Tolerance value of the objective function: $10^{-6}$; Tolerance value of the change of the parameter size: $10^{-6}$
Initial values, MM-ML method: Damage detection: equal weights $\pi_{o,l} = 1/L$, L=9; Damage identification: equal weights $\pi_{v,l}$
Initial values, MM-KL method: Damage detection: equal weights $\lambda_{o,l} = 1/L$, L=9; Damage identification: equal weights $\lambda_{v,l}$

The computation of ROC curves comprises the computation of several binary tests between a reference and a target class, which may be cumbersome when the number of evaluated damage types is large, as may be the case in a damage identification problem.
Instead, the performance of the damage identification method can be measured in terms of other performance variables, such as the Correct Identification Rate, defined as the ratio between the number of correctly identified damages and the total number of evaluated cases. The second option is preferred here for the evaluation of the damage identification methods, since it facilitates the optimization of the free parameters. Then, the optimization of the damage identification methods is based on the maximization of the correct identification rate. The fixing parameters in the damage identification methods are the weights of the MM representations of each class, namely $\pi_{v,l}$ for all structural states $v = \{a_1, a_2, \ldots, a_4, b_1, b_2, \ldots, b_4, \ldots, e_1, e_2, \ldots, e_4\}$ and $l = 1, 2, \ldots, L$. Since the number of weights and scale parameters is the product of the number of damage records, the 4 damage levels and the 5 damage types, the use of any optimization method for this elevated number of fixing parameters is computationally prohibitive. For this reason, optimization is attempted only for the weights, and instead of optimizing individually the weights of each model of the MMs, it is considered that all the weights of a class MM are equal, and only differ among classes. Thus, only 20 weights are optimized, each one for a single damage type and level. The optimization of the weights of the MM representations is also performed with the GPS

algorithm, using the same settings as in the optimization of the MM-ML damage detection method summarized in Table 6.5.2. The performance of the damage identification methods can be further analyzed by means of confusion matrices. Each entry (i, j) of a confusion matrix contains the number of instances belonging to class i identified as class j by the classifier (damage identification method). Therefore, the confusion matrix tends to be diagonal if the method performs perfectly, while values off the main diagonal indicate that the method confuses different damage types. Hence, the confusion matrix is a tool that can be used to analyze the behavior of the damage identification method at individual damage types.

MM-ML based methods

The results obtained with the MM-ML damage diagnosis methods in the detection and identification of damage/faults in the wind turbine are presented here.

Detection results

The MM-ML-sum and MM-ML-max damage detection methods described in the first part of Table 6.3.1 are used for the detection of damage in the wind turbine. The fixing parameters of this method in both its versions are the damage detection threshold $\rho_{lim}$ and the weights $\pi_{o,l}$ for $l = 1, \ldots, L$, with L = 9 (according to the data distribution in the cross-validation folds shown in Table 6.5.1). Initially, the values of the fixing parameters are set using equal weights $\pi_{o,l} = 1/L$. Figure 6.5.1(a) shows the distribution of the statistical quantity $\rho_d(y_u) = p(y_u | \hat{\theta}_u) / p(y_u | o)$ used by the MM-ML-sum damage detection method on a single sensor of the wind turbine (tower top in the lateral direction) with the selections for the fixing parameters described before. Larger values of $\rho_d(y_u)$ are an indication of a larger deviation from the healthy state.
Accordingly, the plot evidences the increasing values of $\rho_d(y_u)$ when the structure is in a damaged state, especially as the level of damage increases in the damages of types C, D and E.

Figure 6.5.1: Damage detection results: Distribution of the values of the statistical quantity $\rho_d(y_u)$ of the LPV-AR MM-ML-sum damage detection method (a) with equal weights and scale parameters (before optimization); and (b) after optimization of the weights. Plots shown for the vibration at the tower top in the lateral direction, with $\log \rho_d(y_u)$ plotted against the damage level for damages A to E.
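The MM-ML detection and identification tests can be sketched in the log domain, which avoids numerical underflow when the likelihoods of long signals are multiplied by the weights. The function names, the dictionary layout used for the class MMs and all numbers below are hypothetical; the likelihood values would in practice come from the estimated LPV-AR or FS-TAR models.

```python
import numpy as np

def log_marginal_likelihood(loglik_models, weights, mode="sum"):
    """ln p(y_u | class) from ln p(y_u | theta_l) and weights pi_l (sum or max version)."""
    loglik_models = np.asarray(loglik_models, dtype=float)
    weights = np.asarray(weights, dtype=float)
    keep = weights > 0                        # zero-weight models contribute nothing
    terms = np.log(weights[keep]) + loglik_models[keep]
    if mode == "max":
        return float(terms.max())
    if mode == "sum":
        peak = terms.max()                    # log-sum-exp for numerical stability
        return float(peak + np.log(np.sum(np.exp(terms - peak))))
    raise ValueError("mode must be 'sum' or 'max'")

def detect_damage(loglik_test, loglik_models, weights, log_rho_lim, mode="sum"):
    """Return ln rho_d(y_u) and True when the structure is declared damaged."""
    log_rho = float(loglik_test - log_marginal_likelihood(loglik_models, weights, mode))
    return log_rho, log_rho >= log_rho_lim

def identify_damage(class_models, mode="sum"):
    """class_models maps a class label to (loglik_models, weights); returns the argmax class."""
    scores = {v: log_marginal_likelihood(ll, w, mode) for v, (ll, w) in class_models.items()}
    return max(scores, key=scores.get)

# Hypothetical numbers for a healthy-state MM with two models
loglik_healthy = np.log([0.2, 0.4])
log_rho, damaged = detect_damage(np.log(0.6), loglik_healthy, [0.5, 0.5], log_rho_lim=0.5)
```

Working with `log_rho` also matches the way the statistical quantity is displayed in the figures, where $\log \rho_d(y_u)$ is plotted against the damage level.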

Figure 6.5.2(a) shows the ROC curves obtained with the MM-ML-sum damage detection method on the vibration signals from all the sensors and separated per individual damage type, using the selections described before for the fixing parameters of the method. From the obtained curves, it can be seen that the MM-ML-sum method is insensitive for some combinations of sensor and damage type, since the ROC curves run along the diagonal. However, very satisfactory detection performance can be found for all damage types at certain sensors.

Figure 6.5.2: Damage detection results: Comparison of the ROC curves (TPR vs. FPR) obtained with the LPV-AR MM-ML-sum damage detection method (a) with equal weights and scale parameters (before optimization); and (b) after optimization of the weights. The ROC curves are presented for individual damage types (A to E) and for each one of the sensors (tower top fore-aft, tower top lateral, blade flapwise, blade edgewise).

Optimal performance of the MM-ML-sum method is pursued by optimization of the weights of the MM of the healthy class, as explained in Section 6.5. Figure 6.5.3 shows the weights of the MM of the healthy class after optimization, sorted from highest to lowest values. As observed in the plot, larger values of the weights are found for up to 35 to 6 models. This indicates that after optimization, damage diagnosis mostly relies on a smaller subset of 35 to 6 models. The distribution of the statistical quantity $\rho_d(y_u)$ and the corresponding ROC curves obtained after optimization of the weights are shown in Figure 6.5.1(b) and Figure 6.5.2(b), respectively.
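The ROC curves and their AUC can be computed directly from the values of the detection statistic obtained for healthy and damaged validation records. The sketch below builds the curve by sweeping the threshold over the observed values and also evaluates the AUC through its rank interpretation, namely the probability that a damaged record scores higher than a healthy one. Function names and scores are hypothetical.

```python
import numpy as np

def roc_curve(scores_healthy, scores_damaged):
    """Return (FPR, TPR) arrays obtained by sweeping the threshold over all scores."""
    h = np.asarray(scores_healthy, dtype=float)
    d = np.asarray(scores_damaged, dtype=float)
    thresholds = np.unique(np.concatenate([h, d]))[::-1]     # from strict to permissive
    fpr = [0.0] + [float(np.mean(h >= th)) for th in thresholds]
    tpr = [0.0] + [float(np.mean(d >= th)) for th in thresholds]
    return np.array(fpr), np.array(tpr)

def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

def auc_rank(scores_healthy, scores_damaged):
    """P(damaged score > healthy score) + 0.5 P(tie), the rank form of the AUC."""
    d = np.asarray(scores_damaged, dtype=float)[:, None]
    h = np.asarray(scores_healthy, dtype=float)[None, :]
    return float(np.mean(d > h) + 0.5 * np.mean(d == h))

# Hypothetical values of the detection statistic
healthy = [0.1, 0.4, 0.35]
damaged = [0.8, 0.3, 0.5]
fpr, tpr = roc_curve(healthy, damaged)
auc_a = auc_trapezoid(fpr, tpr)
auc_b = auc_rank(healthy, damaged)
```

Both routes give the same number, which is why the AUC can be used as a threshold-free objective when tuning the weights.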
The change in the distribution of the statistical quantity of the MM-ML-sum method computed with the optimal weight values, in comparison with the non-optimized values, is almost indiscernible. Nevertheless, there is an evident lift in the ROC curves, as seen for example in the ROC curves at damages B, C and E with the sensor at the tower top in the fore-aft direction, or at damage C with the sensors in the tower top lateral direction and blade edgewise direction. Figure 6.5.4(a)-(b) shows the ROC-AUC computed for the MM-ML-sum and MM-ML-max damage detection methods, for each one of the sensors and separated per damage type and level. As previously observed in the distribution of the statistical quantity $\rho_d(y_u)$ shown in Figure 6.5.1, the probability of detection of damage (associated with the AUC value) increases as the level of damage increases. The method operating at different sensors tends to be more sensitive to particular damage types, with the sensors at the tower top in the lateral direction and at the blade in the edgewise direction being the most effective of all. At the same time, the performance obtained with the sensor at the tower top in the fore-aft direction is only adequate for damage type D, while that found with the sensor at the blade in the flapwise direction is not very satisfactory. The detectability of damage also depends on the actual damage type. In fact, damages A and B appear to be the most difficult to detect at the lowest levels of

damage, while the remaining types can be effectively detected, even at earlier stages.

Figure 6.5.3: Damage detection: Sorted optimized weights of the LPV-AR MM-ML-sum damage detection method applied to each one of the sensors (tower top fore-aft, tower top lateral, blade flapwise, blade edgewise), obtained after the GPS optimization and sorted from higher to lower values.

Figure 6.5.4: Damage detection results: ROC-AUC computed for: (a) the LPV-AR MM-ML-sum damage detection method, (b) the LPV-AR MM-ML-max damage detection method, and (c) the LPV-AR MM-KL damage detection method, after optimization of the weights. ROC-AUC values are presented for individual damage types (A to E) and levels, and for each one of the sensors.
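The poll-and-resize logic of the GPS steps listed earlier in this section can be sketched in a few lines. This is a minimal illustration written as a minimizer with coordinate poll directions; it is not the `patternsearch` implementation of MATLAB's Global Optimization Toolbox used in the thesis, and it omits the constraints on the weights (non-negativity and unit sum). The function name, default settings and the toy cost function are assumptions.

```python
import numpy as np

def pattern_search(cost, x0, mesh=1.0, max_iter=500, mesh_tol=1e-6):
    """Minimize `cost` by polling +/- coordinate directions on an adaptive mesh."""
    x = np.asarray(x0, dtype=float)
    fx = cost(x)
    directions = np.vstack([np.eye(x.size), -np.eye(x.size)])   # 2K poll directions
    for _ in range(max_iter):
        if mesh < mesh_tol:                  # stop once the mesh is fine enough
            break
        success = False
        for d in directions:
            candidate = x + mesh * d
            fc = cost(candidate)
            if fc < fx:                      # successful poll: accept the new point
                x, fx, success = candidate, fc, True
                break
        # double the mesh after a success, halve it after a failure
        mesh = 2.0 * mesh if success else 0.5 * mesh
    return x, fx

# Toy example: quadratic bowl with its minimum at (1, -2)
target = np.array([1.0, -2.0])
x_opt, f_opt = pattern_search(lambda x: float(np.sum((x - target) ** 2)), np.zeros(2))
```

To maximize the ROC-AUC as in Equation (6.5.1), the cost passed to such a routine would be the negative AUC evaluated for a candidate weight vector, with the weights projected or re-normalized to satisfy the constraints.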

Identification results

Damage identification is attempted by means of the test shown in the second part of Table 6.3. in both of its versions, using the sum and maximum approximations for the computation of the marginal likelihood. For this purpose, a class MM is constructed for each damage type and level. The optimization of the class weights is carried out as explained in Section . Figure 6.5.5(a)-(d) shows the confusion matrices obtained with the LPV AR MM-ML-sum and MM-ML-max damage identification methods in the identification of the damage type and level, using vibration responses from the tower top in the lateral direction and from the blade in the edgewise direction. The confusion matrices show that the MM-ML methods are capable of discerning among most damage types and levels. Particularly good performance is obtained for damage types B and D, where the MM-ML damage identification methods are almost perfect. On the other hand, both methods are able to distinguish damage A from other damage types but are unable to determine the precise level of damage. Finally, damages C and E tend to be confused by the MM-ML damage identification methods, although when the damage type is correctly identified, the level of damage can also be identified.

MM-KL based methods

The results obtained with the MM-KL damage detection and damage identification methods are presented here.

Detection results

Damage detection is performed using the MM-KL method, whose fixing parameters are the weights λ_{o,l} for each one of the L models in the MM of the healthy state and the damage detection threshold d_lim.
The fixing parameters are again adjusted by optimizing the ROC-AUC by means of the GPS algorithm with the settings shown in Table . Figure 6.5.6(a) shows the distribution of the Kullback-Leibler divergence computed between LPV AR models from healthy and damaged states with the sensor at the tower top in the lateral direction. Figure 6.5.6(b) shows the corresponding ROC curves obtained with all the sensors, while Figure 6.5.4(c) shows the ROC-AUC separated per damage type and level as obtained with all the sensors. The performance of the MM-KL damage detection method is slightly better than that of its MM-ML counterpart, especially at the earlier stages of all damage types.

Identification results

Damage identification is performed using the test shown in the second part of Table . The setup of the method follows the same approach used in the MM-ML damage identification method, which consists of constructing an MM per damage type and level and using the same weight value for each one of the MMs. Then, the weights of each one of the MMs are optimized by maximizing the correct identification rate. Figures 6.5.5(e)-(f) show the confusion matrices of the LPV AR MM-KL damage identification method applied to the vibration responses from the sensors at the tower top in the lateral direction and at the blade in the edgewise direction. The results obtained with the MM-KL damage identification method are again very similar to those obtained with the MM-ML method, although in this case the performance is reduced. As can be observed in the confusion matrices, the number of instances misclassified by the MM-KL method is slightly higher, which explains the reduced effectiveness of the method.
However, as can be seen in the confusion matrices in Figures 6.5.5(e)-(f), the level estimation performance of the MM-KL method at damages A and B is better than that of the MM-ML method.

Comparison with a Model Parameter Based Method

The Single Model Parameter-based (SMP) damage diagnosis method is summarized in Appendix 6.C. In the damage detection test, the damage detection threshold χ_lim is selected using cross-validation, instead of being drawn from a chi-square distribution as originally proposed in (Spiridonakos and Fassois, 3). Also, the damage identification test is a modified version of the one presented in the same paper, where the damage detection test is sequentially applied using each damage as the reference model, until a detection is met.
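Since Appendix 6.C is not reproduced in this section, the following is only a hedged sketch of how a parameter-based detection statistic with a data-driven threshold could look. It assumes a Mahalanobis-type quadratic form between the test and reference parameter vectors, with the covariance of the difference taken as twice the reference covariance, and it replaces the cross-validation procedure by a simple empirical quantile of statistics from held-out healthy cases. All names (`parameter_distance`, `empirical_threshold`, `false_alarm_rate`) are hypothetical.

```python
import numpy as np


def parameter_distance(theta_u, theta_o, cov_o):
    """Quadratic distance between test (u) and reference (o) parameters.

    Assumption: both estimates share the covariance cov_o and are
    independent, so their difference has covariance 2 * cov_o.
    """
    delta = np.asarray(theta_u, dtype=float) - np.asarray(theta_o, dtype=float)
    cov = 2.0 * np.asarray(cov_o, dtype=float)
    return float(delta @ np.linalg.solve(cov, delta))


def empirical_threshold(healthy_statistics, false_alarm_rate=0.05):
    """Threshold taken as an upper quantile of held-out healthy statistics.

    This stands in for the cross-validated threshold; the exact selection
    procedure of the thesis is not shown here.
    """
    return float(np.quantile(healthy_statistics, 1.0 - false_alarm_rate))


def is_damaged(statistic, threshold):
    """Declare damage when the statistic exceeds the threshold."""
    return bool(statistic > threshold)


# Toy usage with made-up numbers (not thesis data).
stat = parameter_distance([1.0, 0.0], [0.0, 0.0], np.eye(2))
limit = empirical_threshold(np.arange(101.0), false_alarm_rate=0.05)
```

Selecting the threshold from data rather than from a theoretical chi-square distribution avoids relying on the distributional assumptions of the parameter estimates, at the cost of requiring enough healthy-state cases to estimate the quantile.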

Figure 6.5.5: Damage identification results: Confusion matrices obtained with the LPV AR MM damage identification methods in the identification of the type and level of damage of the wind turbine: (a) MM-ML-sum method using the sensor at the tower top in the lateral direction; (b) MM-ML-sum method using the sensor at the blade in the edgewise direction; (c) MM-ML-max method using the sensor at the tower top in the lateral direction; (d) MM-ML-max method using the sensor at the blade in the edgewise direction; (e) MM-KL method using the sensor at the tower top in the lateral direction; (f) MM-KL method using the sensor at the blade in the edgewise direction. [Axes: estimated damage type (A to E) versus actual damage type (A to E); color scale: identified rate.]
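As an illustration of how confusion matrices and correct identification rates of this kind are assembled, the sketch below counts (actual, estimated) class pairs and derives the overall and per-class rates. The orientation (rows for the estimated class, columns for the actual class) and the integer class labels are assumptions made for this example, and the function names are hypothetical.

```python
import numpy as np


def confusion_matrix(actual, estimated, n_classes):
    """Count matrix with rows = estimated class, columns = actual class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for a, e in zip(actual, estimated):
        cm[e, a] += 1
    return cm


def identification_rates(cm):
    """Overall and per-class correct identification rates.

    The per-class rate is the fraction of cases of each actual class that
    were assigned to that same class; classes without cases give NaN.
    """
    totals = cm.sum(axis=0)
    per_class = np.divide(
        np.diag(cm), totals,
        out=np.full(cm.shape[0], np.nan), where=totals > 0,
    )
    overall = float(np.trace(cm) / cm.sum())
    return overall, per_class


# Toy usage: two classes, one case of class 0 misidentified as class 1.
cm = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1], n_classes=2)
overall, per_class = identification_rates(cm)
```

When each class combines a damage type and a damage level, off-diagonal entries within the same type correspond to errors in level estimation only.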

Figure 6.5.6: Damage detection results: Analysis of the LPV AR MM-KL damage detection method: (a) Distribution of the values of the Kullback-Leibler divergence measured between the reference and test LPV AR models of the vibration at the tower top in the lateral direction; (b) ROC curves separated per damage type obtained at all the sensors. [Panel (a): log of the minimum KL divergence versus damage level for damages A to E; panel (b): TPR versus FPR for damages A to E; curves: Tower Top Fore-Aft, Tower Top Lateral, Blade Flapwise, Blade Edgewise.]

Damage detection results

Damage detection is attempted by means of the test in the first part of Table 6.C.. In this case, the only fixing parameter is the detection threshold, which is set within the cross-validation. Figure 6.5.7(a) shows the distribution of the damage detection statistic χ_ϑ(ϑ̂_u, ϑ̂_o) measured between the parameters of a reference LPV AR model from the healthy state and the test LPV AR models from all structural states. The ROC curves obtained with the SMP damage detection method at each one of the sensors and for each one of the damage types and levels are shown in Figure 6.5.7(b). The behavior is broadly similar to that obtained with the MM-KL method; however, the distributions of the individual classes appear to overlap more. For this reason, the ROC curves demonstrate a reduced sensitivity compared with the MM-KL and MM-ML damage detection methods.

Damage identification and level estimation results

Damage identification and level estimation are performed using the test shown in the second part of Table 6.C.. Figure shows the confusion matrices of the SMP damage identification method applied to the vibration responses from the sensors at the tower top in the lateral direction and at the blade in the edgewise direction.
The overall behavior is also similar to that obtained with the MM-KL damage identification method, although the SMP damage identification method more often confuses classes C and D, as well as the different levels of damage A. In addition, there is an increased number of misclassified damages in the other damage types.

Figure 6.5.7: Analysis of the LPV AR SMP damage detection method results: (a) Distribution of the values of the damage detection statistic measured between the parameters of a reference model from the healthy state and the test models; (b) ROC curves separated per damage type obtained at all the sensors. [Panel (a): log χ_ϑ(ϑ̂_u, ϑ̂_o) versus damage level for damages A to E; panel (b): TPR versus FPR for damages A to E; curves: Tower Top Fore-Aft, Tower Top Lateral, Blade Flapwise, Blade Edgewise.]

Figure 6.5.8: Confusion matrices obtained with the LPV AR SMP damage identification method in the identification of the type and level of damage of the wind turbine: (a) using the sensor at the tower top in the lateral direction; (b) using the sensor at the blade in the edgewise direction. [Axes: estimated damage type (A to E) versus actual damage type (A to E); color scale: identified rate.]

Discussion

The ROC-AUC obtained by each one of the damage detection methods per damage type and level are displayed in Figure for the LPV AR MM-ML and MM-KL methods, and the confusion matrices displaying the performance in damage identification and level estimation are provided in Figure for the LPV AR MM-ML and MM-KL methods, and in Figure for the LPV AR SMP method. A summary of the performance obtained by the MM-ML-sum, MM-ML-max, MM-KL and SMP damage detection methods based on both LPV AR and FS TAR models, measured in terms of the ROC-AUC, can be found in Table . Note that the results obtained with a method based on a Time Frequency Representation and Principal Component Analysis are also included in the table (details are presented in Appendix 6.D).

Table 6.5.3: Damage detection: Summary of the results obtained after optimization of the damage detection methods, shown in terms of the ROC-AUC for all damages and for individual types of damage. TFR-PCA: Time Frequency Representation with Principal Component Analysis based damage detection method as in (Avendaño-Valencia et al., ). [Rows: MM-ML-sum, MM-ML-max, MM-KL and SMP methods, each with LPV AR and FS TAR models, and the TFR-PCA method, each evaluated at the tower top fore-aft, tower top lateral, blade flapwise and blade edgewise sensors. Columns: ROC-AUC overall and for damages A, B, C, D and E.]

Table provides a summary of the correct identification rates obtained with the different damage identification methods using both LPV AR and FS TAR models. As for damage detection, the results of the Time Frequency Representation and PCA based method are also presented in the table. The correct identification rates are presented overall and per damage type, for each one of the sensors. Again, the best performance for most damage types is obtained with the sensors at the tower top in the lateral direction and at the blade in the edgewise direction. The other sensors show more modest performance and are insensitive to some damage types.
Comparing the two versions of the MM-ML damage identification method, the best performance is obtained this time by the MM-ML-sum version, which uses the finite sum approximation of the marginal likelihood. In order to further summarize the performance and facilitate the comparison of the different damage diagnosis methods, Figure shows the best AUC values among all sensors obtained with each one of the analyzed methods, distributed per damage type and level. Each curve displays the best AUC obtained with a single damage detection method for a single type of damage with increasing level, starting from level (no damage). Based on the obtained results, a final analysis of the performance of each one of the methods at each damage type is provided next.

Table 6.5.4: Damage identification: Summary of the results obtained after optimization of the damage identification methods, shown in terms of the correct identification rate per damage type. TFR-PCA: Time Frequency Representation with Principal Component Analysis based damage identification method as in (Avendaño-Valencia et al., ). [Rows: MM-ML-sum, MM-ML-max, MM-KL and SMP methods, each with LPV AR and FS TAR models, and the TFR-PCA method, each evaluated at the tower top fore-aft, tower top lateral, blade flapwise and blade edgewise sensors. Columns: correct identification rate overall and for damages A, B, C, D and E.]

Damage A: Increased blade tip mass

This damage type is one of the most difficult to detect for all methods. As evident in Figure 6.5.9(a), the methods cannot reach accurate detection performance for the first two levels of damage, but in general the LPV AR MM-based methods are the ones yielding the best AUC.
Similarly, the performance in damage identification and level estimation is limited, mainly because the methods tend to confuse this damage type with others and also have difficulty determining the level of damage. This behavior is evident in all the methods, with the LPV AR MM-based methods again providing the best performance. Nonetheless, it must be noted that this type of damage is perhaps the most challenging, since the damage is located at the tip of the blade, while the sensors are located either near the base of the blade or at the tower top. Moreover, the actual increase of the blade mass in comparison to the total mass of the blade is almost negligible. Hence, the fact that the damage detection methods are capable of detecting the presence of damage at all is actually surprising.

Damage B: Decreased stiffness in the blade root

As with damage A, this type of damage is also difficult to detect with conventional methods. The AUC curves displayed in Figure 6.5.9(b) show that most damage detection methods cannot detect this damage at the lowest levels. The exceptions are the LPV AR based methods, which are capable of determining the presence of damage from an early stage, with those based on multiple models providing the best AUC. Likewise, the LPV AR MM-based damage identification methods are capable of effectively determining the damage type and level, in particular when using the sensor located at the blade. This may be expected, since the sensor at the blade is located close to the damage site. Besides, as seen in the Mélard-Tjøstheim PSD and the spectral correlations in Figure 6.4.3(c) and Figure 6.4.4, the dynamic behavior of the vibration response of the structure under this type of damage appears to be very different from that observed for the other damage types.
Damage C: Decreased stiffness in the tower base

The damage detection performance for this type of damage is relatively consistent across all the methods. Here it is observed for the first time that an FS TAR based method yields better performance than an LPV AR based method; however, the best performance is always found with the MM-based methods. On the other hand, the results obtained in damage identification and level estimation are unusual, since this type of damage and damage E are persistently confused by the damage identification methods. This behavior may be explained by the Mélard-Tjøstheim PSDs shown in Figures 6.4.3(d) and (f), where it can be seen that the dynamics of the vibration response for damages C and E are quite similar.

Damage D: Yawing error

This damage type is the easiest to detect among all damages. The AUC curves in Figure 6.5.9(d) show that the LPV AR based methods yield almost perfect performance from the lowest level of damage. In contrast, the FS TAR and TFR based methods provide very disappointing results, with the only exception being the FS TAR MM-KL method. Moreover, the LPV AR MM-based methods yield very high correct identification rates in damage identification and level estimation.

Damage E: Decreased damping in the low speed shaft

The behavior for this damage type is very similar to that obtained for damage C. The AUC curves in Figure 6.5.9(e) show that all the damage detection methods provide consistent performance, which improves steadily as the level of damage increases. The best performance is found with the MM-based methods, in particular using LPV AR models, but also with the FS TAR MM-KL method. Regarding damage identification and level estimation, as already explained for damage type C, the damage identification methods consistently confuse this damage type with damage C.
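The MM-ML-sum and MM-ML-max variants compared throughout this discussion differ in how the marginal likelihood over the set of models is approximated. The sketch below illustrates that difference under stated assumptions: each model in the set supplies a log-likelihood for the tested signal, the sum version combines them as a weighted finite sum (evaluated with the log-sum-exp trick for numerical stability), and the max version keeps only the largest weighted term. Whether the thesis applies the weights inside the maximum is not stated in this section, so that choice, like the function names, is hypothetical.

```python
import numpy as np


def log_marginal_sum(log_likelihoods, weights):
    """log( sum_k w_k * exp(l_k) ), computed stably via log-sum-exp."""
    a = np.asarray(log_likelihoods, dtype=float) + np.log(
        np.asarray(weights, dtype=float)
    )
    m = a.max()
    return float(m + np.log(np.exp(a - m).sum()))


def log_marginal_max(log_likelihoods, weights):
    """max_k ( log w_k + l_k ): the dominant term of the weighted sum."""
    a = np.asarray(log_likelihoods, dtype=float) + np.log(
        np.asarray(weights, dtype=float)
    )
    return float(a.max())


# Toy usage with made-up log-likelihoods of two models (not thesis data).
ll = [-1000.0, -1001.0]
w = [0.5, 0.5]
sum_value = log_marginal_sum(ll, w)
max_value = log_marginal_max(ll, w)
```

Because every term of the sum is non-negative, the sum approximation is never smaller than the max approximation, and the two coincide when a single model dominates the set.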
A critical analysis of the performance of each one of the modeling and damage diagnosis methods utilized in this work for the vibration-based SHM of a wind turbine is presented in Table .

Figure 6.5.9: Comparison of the best ROC-AUC among all sensors obtained with all the damage detection methods considered, for each damage type and increasing damage levels: (a) damage A: increased mass at the blade tip, (b) damage B: decreased stiffness in the blade root, (c) damage C: decreased stiffness in the tower base, (d) damage D: yawing error, (e) damage E: decreased damping in the low speed shaft. [Axes: AUC versus damage level; curves: LPV-AR MM-ML-i, LPV-AR MM-ML-ii, LPV-AR MM-KL, LPV-AR SMP, FS-TAR MM-ML-i, FS-TAR MM-ML-ii, FS-TAR MM-KL, FS-TAR SMP, TFR-PCA.]

Concluding Remarks

This work has presented the implementation and performance analysis of a vibration-based SHM method for structures with time-dependent dynamic response and subject to operational and environmental uncertainties. The method is based on two elements: (i) modeling of the time-dependent dynamics of the vibration response by means of Functional Series TAR and Linear Parameter Varying AR representations; (ii) representation of the effects of uncertainties on the model parameters by means of a Multiple Model framework, in which each health state of the structure is represented by a set of models. The steps followed from initial model setup and identification up to the optimization of the damage diagnosis methods have been presented in a systematic and critical form. In addition, a comparison with similar damage diagnosis methods has been provided, including the single model parameter based method and the TFR method with PCA for dimensionality reduction. The MM damage diagnosis methods have been shown to be very powerful tools for the vibration-based SHM problem, with a relatively low increase in complexity compared to other methods based on single models. Furthermore, the potential of non-stationary parametric modeling for accurate modeling and analysis of time-dependent dynamics has been made evident. Given the results of this work and the accompanying paper (Avendaño-Valencia and Fassois, 5b), the MM framework may be considered a very interesting line of research for the health monitoring of several types of structures, with those having non-stationary dynamic response as a main application case.

Appendix 6.A Model validation and comparative assessment

Validation aims at determining whether the assumptions made for the model, in the identified model structures, are correct. The validation procedure for the models presently used consists of evaluating the Gaussianity and the uncorrelatedness of the innovations (estimation residuals).
Both properties are visually evaluated by means of normal probability plots and the AutoCorrelation Function (ACF) of the residuals normalized by the estimated innovations standard deviation, respectively.
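The uncorrelatedness check described here can be sketched numerically as follows. This is a minimal illustration, assuming a constant innovations variance (the thesis normalizes by the estimated innovations standard deviation, which may be time-dependent) and large-sample significance bounds of the form ±z/√N; the value z = 1.96, corresponding to a 95% level, is an assumption of this example, as are the function names.

```python
import numpy as np


def normalized_acf(residuals, max_lag):
    """Sample autocorrelation of the residuals at lags 1..max_lag."""
    e = np.asarray(residuals, dtype=float)
    e = e - e.mean()
    n = e.size
    var = np.dot(e, e) / n  # constant-variance assumption
    return np.array(
        [np.dot(e[: n - k], e[k:]) / (n * var) for k in range(1, max_lag + 1)]
    )


def whiteness_check(residuals, max_lag=20, z=1.96):
    """Return the ACF, the significance bound, and whether all lags
    stay within the bound (z = 1.96 is an assumed 95% level)."""
    acf = normalized_acf(residuals, max_lag)
    bound = z / np.sqrt(len(residuals))
    return acf, bound, bool(np.all(np.abs(acf) <= bound))


# Toy usage: a strongly correlated alternating sequence fails the check.
alternating = [1.0, -1.0] * 50
acf, bound, is_white = whiteness_check(alternating, max_lag=5)
```

Note that with many lags, a few values slightly outside the bounds are expected even for truly uncorrelated residuals, so the check is best read together with the plotted ACF rather than as a strict pass or fail rule.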

Table 6.5.5: Final analysis of the performance of the methods involved in the vibration-based SHM.

Modeling of the vibration response

Method: Non-parametric TFR
Analysis: Useful for initial analysis of the vibration response. Simple to compute. Large representation size. More complex to be used in SHM. Low SHM performance.

Method: FS TAR modeling
Analysis: Compact and accurate representation of the vibration response. Simple application on SHM. Simple for dynamic analysis of the vibration response. Involved identification process in comparison to non-parametric methods. More sensitive to variability in SHM compared to LPV AR models. Reduced SHM performance compared to LPV AR models.

Method: LPV AR modeling
Analysis: Compact and very accurate representation of the vibration response. Simple application on SHM. Simple and accurate for dynamic analysis of the vibration response. Most reliable SHM performance. Involved identification process in comparison to non-parametric methods.

SHM methods

Method: Multiple Model methods
Analysis: Very powerful and highly reliable for SHM. Simple to implement. Simple representations of uncertainties. Accurate performance even without optimization of the free parameters. Robustifies the performance even for less accurate modeling methods. Requires a higher number of computations.

Method: Single Model Parameter-based method
Analysis: Provides good performance for low uncertainty values. Simple to implement. May be used for initial assessment. The performance is compromised under the presence of moderate levels of uncertainty.

Method: TFR with PCA-based method
Analysis: May be used for initial analysis. Low performance compared to MM methods. More sensitive to uncertainties. Difficult handling of large-sized TFR matrices.
The normal probability plots obtained for the residuals of the FS TAR and LPV AR models with optimized structures in a subset of realizations from the healthy state are shown in Figure 6.A.(a); the behavior in the whole set of realizations is about the same. Diagonal lines indicate an ideal Gaussian distribution. Since large deviations are not evident, it can be concluded that the distribution of the residuals from both FS TAR and LPV AR models is approximately Gaussian. Small deviations in the tails of the distribution are only evident in the residuals of the LPV AR model of the vibration at the blade in the flapwise direction, and in the residuals of the FS TAR model of the vibration at the blade in the edgewise direction. The sample ACF plots for the residuals of the FS TAR and LPV AR models of the same subset of signals from the healthy state and the 9% significance bounds for the ACF are shown in Figure 6.A.(b). If the ACF remains within the significance bounds, then it is concluded that the residuals are uncorrelated. On this basis, only the models of the vibration response at the tower top in the fore-aft direction seem to satisfy uncorrelatedness, while in the remaining sensors small correlation values are evident at some lags. However, the LPV AR models tend to yield lower correlation values than the FS TAR models.

Figure 6.A.: Validation of the estimation residuals on a selected subset of vibration response signals from the healthy state: (a) normal probability plot (diagonal lines indicate the ideal normal distribution); (b) normalized autocorrelation function (horizontal lines indicate the 9% significance bounds).

Figure 6.A. presents an analysis of the distribution of the estimated coefficients of projection of FS TAR and LPV AR models of the vibration response at the tower top in the lateral direction. Each plot shows the coefficients of projection organized according to their corresponding basis index. The estimates of the coefficients of projection of FS TAR models appear to be distributed around zero or close to zero, except for those corresponding to basis index (the constant basis), which in contrast appear to be distributed consistently around non-zero values. Similar behavior is noticeable in the LPV AR models, although in this case the coefficients of projection with corresponding basis order 6 and 7 (corresponding to cos 3β[t] and sin 3β[t]) also seem to be consistently distributed around non-zero values. The reason for this difference is that the FS TAR models cannot track the small deviations in the rotor angle, which makes the parameter estimates less consistent, while LPV AR models are designed to track these changes. As a consequence, the LPV AR models provide better and more consistent performance later in the damage diagnosis stage.
