Reliability Monitoring Using Log Gaussian Process Regression


COPYRIGHT 2013, M. Modarres

Reliability Monitoring Using Log Gaussian Process Regression
Martin Wayne, Mohammad Modarres
PSA 2013
Center for Risk and Reliability, University of Maryland, Department of Mechanical Engineering, College Park, MD 20742, U.S.A.
mwayne1@umd.edu; modarres@umd.edu

Traditional Regression and Bayesian Regression

Traditional regression assumes an underlying process (e.g., a failure process described by the failure-rate model λ(t)) that generates clean data. The goal is to describe the underlying model in the presence of noisy data.

Bayesian regression: specify a prior P(H_α) over a set of probabilistic models. The likelihood of H_α after observing data D is P(D | H_α). The posterior probability of H_α is given by

P(H_α | D) ∝ P(H_α) P(D | H_α)

Prediction: p(y | D) = Σ_α P(y | H_α) P(H_α | D)

GP Regression

GP regression is a nonlinear regression technique for when you need to learn a function f, with uncertainties, from data D = {X, y}.

Ref: Eurandom 2010, Z. Ghahramani

GP Regression (cont.)

A Gaussian process defines a distribution over functions p(f) which can be used for Bayesian regression:

p(f | D) = p(f) p(D | f) / p(D)

Gaussian processes (GPs) are parameterized by a mean function, μ(x), and a covariance function, or kernel, K(x, x'). The covariance matrix K between all pairs of points x and x' specifies a distribution on functions: for any finite set of inputs, p(f(x_1), ..., f(x_n)) = N(μ, Σ), where μ is now an n × 1 vector and Σ is an n × n matrix.


GP Regression (cont.)

A GP is a set of random variables, any finite collection of which follows a joint Gaussian distribution. The distribution is specified by a mean function, μ(x), and a kernel function, k(x_1, x_2). It may also be defined using the standard linear-model form y = f(x) + ε.

Kernel Function

Kernel function = covariance function. The kernel function determines the correlation between data points and can model trends in the data such as periodicity. A popular example is the squared-exponential kernel function:

k(x_1, x_2) = σ² exp(−(x_1 − x_2)² / (2l²))

where l is the characteristic length-scale of the process (showing "how far apart" two points have to be for the process to change significantly). The kernel is used to develop the overall covariance matrix K for a vector of data.
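As a minimal sketch of the squared-exponential kernel above (not from the original slides; the function and parameter names here are illustrative):

```python
import numpy as np

def sq_exp_kernel(x1, x2, sigma=1.0, length_scale=1.0):
    # k(x1, x2) = sigma^2 * exp(-(x1 - x2)^2 / (2 * l^2))
    return sigma**2 * np.exp(-((x1 - x2) ** 2) / (2.0 * length_scale**2))

def covariance_matrix(x, sigma=1.0, length_scale=1.0):
    # Overall covariance matrix K between all pairs of points in the data vector x
    x = np.asarray(x, dtype=float)
    return sq_exp_kernel(x[:, None], x[None, :], sigma, length_scale)
```

Points separated by much more than the length-scale get a covariance near zero, so their function values are nearly uncorrelated.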

Kernel Function Hyperparameters

Kernel functions contain unknown hyperparameters. The hyperparameters can be chosen by maximizing the log marginal likelihood, given by

log p(y | X) = −(1/2) yᵀ K⁻¹ y − (1/2) log|K| − (n/2) log 2π

This is achieved using a conjugate-gradient optimization technique, a built-in option in available software packages. A chi-square goodness-of-fit test can also be applied to test data to assess the model.
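A hedged sketch of this hyperparameter fitting, assuming an SE-plus-noise kernel (the function names and the log-parameterization are my own; the slides' conjugate-gradient approach is approximated here with SciPy's "CG" method and numerical gradients):

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_marginal_likelihood(log_params, x, y):
    # log-parameterization keeps all three hyperparameters strictly positive
    sigma_f, length, sigma_n = np.exp(log_params)
    K = sigma_f**2 * np.exp(-((x[:, None] - x[None, :]) ** 2) / (2 * length**2))
    K += (sigma_n**2 + 1e-8) * np.eye(len(x))   # noise term plus jitter for stability
    try:
        L = np.linalg.cholesky(K)               # stable inversion via Cholesky
    except np.linalg.LinAlgError:
        return 1e6                              # penalize non-positive-definite proposals
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # -log p(y|X) = 1/2 y^T K^-1 y + 1/2 log|K| + (n/2) log(2*pi)
    return 0.5 * y @ alpha + np.sum(np.log(np.diag(L))) + 0.5 * len(x) * np.log(2 * np.pi)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 20)
y = np.sin(x) + 0.1 * rng.standard_normal(20)   # synthetic data for illustration only
start = np.zeros(3)
result = minimize(neg_log_marginal_likelihood, start, args=(x, y), method="CG")
sigma_f, length, sigma_n = np.exp(result.x)     # fitted hyperparameters
```

Minimizing the negative log marginal likelihood is equivalent to the maximization described above.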

Log Gaussian Processes

The observed data Y are strictly positive, so assume log(Y) is normally distributed:

log(Y) ~ N(b, K(X, X))

The joint distribution of the observed data and the prediction points X* is

[log(Y); log(f*)] ~ N([b; b], [K(X, X), K(X, X*); K(X*, X), K(X*, X*)])

Conditional probability can then be used to determine the prediction for f*:

log(f*) | log(Y) ~ N(μ, Σ)
μ = b + K(X*, X) K(X, X)⁻¹ (log(Y) − b)
Σ = K(X*, X*) − K(X*, X) K(X, X)⁻¹ K(X, X*)

The inverse transform can be used to map predictions back to the proper domain.
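The predictive equations above can be sketched directly (a minimal illustration with hypothetical names, assuming a constant prior mean b, an SE kernel with unit hyperparameters, and a small jitter term):

```python
import numpy as np

def se_kernel(a, b, sigma=1.0, length=1.0):
    return sigma**2 * np.exp(-((a - b) ** 2) / (2 * length**2))

def log_gp_predict(X, Y, X_star, b=0.0, jitter=1e-8):
    # GP posterior in log space: log(f*) | log(Y) ~ N(mu, Sigma)
    logY = np.log(Y)
    K = se_kernel(X[:, None], X[None, :]) + jitter * np.eye(len(X))
    K_s = se_kernel(X_star[:, None], X[None, :])
    K_ss = se_kernel(X_star[:, None], X_star[None, :])
    K_inv = np.linalg.inv(K)
    mu = b + K_s @ K_inv @ (logY - b)
    Sigma = K_ss - K_s @ K_inv @ K_s.T
    # inverse transform back to the strictly positive domain
    return np.exp(mu), Sigma

X = np.array([0.0, 1.0, 2.0, 3.0])
Y = np.array([1.0, 2.0, 1.5, 3.0])      # strictly positive observations
f_star, Sigma = log_gp_predict(X, Y, X)  # predict at the training inputs
```

Predicting at the training inputs essentially reproduces the observations, with near-zero posterior variance, which is a quick sanity check on the conditioning formulas.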

Application to a Fleet of Vehicles

Log GPR is used to model the rate of occurrence of failures per month for a fleet of vehicles.

[Figure: rate of occurrence of failures (per mile) by month]

Covariance Function Choices

The data appear to have periodic behavior along with noise, so kernel-function alternatives are examined to describe the correlation within the data. Combinations of kernel functions are also kernel functions; options include squared exponential (SE), noise, periodic, polynomial, etc. The negative log-likelihood can be used to discriminate between possible kernel functions: smaller values indicate higher likelihood and a better description of the data.

Kernel Function Likelihood

Alternative   Kernel Function Description   Negative Log-Likelihood
1             SE, Periodic, Noise           7.84
2             SE, Noise                     8.54
3             Noise                         15.83
4             SE, Periodic                  8.13
5             SE                            8.54
6             SE, Polynomial, Noise         8.54
7             Polynomial                    13.33
8             Rational quadratic, Noise     8.51
9             Rational quadratic            8.51

The sum of the squared-exponential, periodic, and noise kernels yields the highest likelihood and the best description of the data.

Covariance Function and Hyperparameters

k(x_1, x_2) = θ_1² exp(−(x_1 − x_2)² / (2θ_2²)) + θ_3² exp(−(2/θ_5²) sin²(π(x_1 − x_2)/θ_4)) + σ_n² δ(x_1, x_2)

where δ(x_1, x_2) = 1 if x_1 = x_2, and 0 otherwise.

Parameter          Value
θ_1 (SE 1)         4.38
θ_2 (SE 2)         0.31
θ_3 (Periodic 1)   0.13
θ_4 (Periodic 2)   1.03
θ_5 (Periodic 3)   0.6
σ_n (Noise)        0.1
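The fitted covariance function can be sketched with the tabulated hyperparameters (a reconstruction assuming the standard SE and periodic kernel forms; the function name and parameter packing are my own):

```python
import numpy as np

def composite_kernel(x1, x2, theta, sigma_n):
    t1, t2, t3, t4, t5 = theta
    se = t1**2 * np.exp(-((x1 - x2) ** 2) / (2 * t2**2))                   # squared exponential
    per = t3**2 * np.exp(-2 * np.sin(np.pi * (x1 - x2) / t4) ** 2 / t5**2)  # periodic, period t4
    noise = sigma_n**2 * (np.asarray(x1) == np.asarray(x2))                 # delta (white noise) term
    return se + per + noise

theta = (4.38, 0.31, 0.13, 1.03, 0.6)  # values from the table above
sigma_n = 0.1
```

The sum of the three terms is itself a valid kernel, since sums of kernels are kernels; the delta term contributes only on the diagonal of the covariance matrix.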

Predicted Result

The model provides a probabilistic prediction of failures in future months. + indicates test data used in the goodness-of-fit test.

**Dashed lines indicate the 95% probability interval

Anomaly Detection Example

**Dashed lines indicate the 95% probability interval

Conclusions and Possible Extensions

Log GP regression is a powerful technique for modeling complex nonlinear behavior. It provides a probabilistic indication of reliability problems versus typical trends within the data. It can be extended to model more complex nonlinear relationships, requiring only an appropriate kernel function. It can also handle multidimensional inputs: systems in different locations, of different ages, etc.