A Hybrid Deep Learning Approach For Chaotic Time Series Prediction Based On Unsupervised Feature Learning


A Hybrid Deep Learning Approach for Chaotic Time Series Prediction Based on Unsupervised Feature Learning
Norbert Ayine Agana
Advisor: Abdollah Homaifar
Autonomous Control & Information Technology Institute (ACIT), Department of Electrical and Computer Engineering, North Carolina A&T State University
June 16, 2017

Outline
1. Introduction: Time Series Prediction, Time Series Prediction Models, Problem Statement, Motivation
2. Deep Learning: Unsupervised Deep Learning Models, Stacked Autoencoders, Deep Belief Networks
3. Proposed Deep Learning Approach: Deep Belief Network, Empirical Mode Decomposition (EMD)
4. Empirical Evaluation
5. Conclusion and Future Work

Time Series Prediction
1. Time series prediction is a fundamental problem found in several domains, including climate, finance, health, and industrial applications.
2. Time series forecasting is the process whereby past observations of the same variable are collected and analyzed to develop a model capable of describing the underlying relationship.
3. The model is then used to extrapolate the time series into the future.
4. Many decisions made in society are based on information obtained from time series analysis, provided it is converted into knowledge.

Figure 1

Time Series Prediction Models
1. Statistical methods: autoregressive models are commonly used for time series forecasting:
   1. Autoregressive (AR)
   2. Autoregressive moving average (ARMA)
   3. Autoregressive integrated moving average (ARIMA)
2. Though ARIMA is quite flexible, its major limitation is the assumed linear form of the model: no nonlinear patterns can be captured by ARIMA.
3. Real-world time series, such as weather variables (drought, rainfall, etc.) and financial series, exhibit nonlinear behavior.
4. Neural networks have shown great promise over the last two decades in modeling nonlinear time series:
   1. Generalization ability and flexibility: no assumptions about the model form have to be made.
   2. The ability to capture both deterministic and random features makes them ideal for modeling chaotic systems.
5. Nonconvex optimization issues occur when two or more hidden layers are required for highly complex phenomena.

Problem Statement
1. Deep neural networks trained using back-propagation alone perform worse than shallow networks.
2. A solution is to first use a local unsupervised criterion to pre-train each layer in turn.
3. The aim of the unsupervised pre-training is to:
   - obtain a useful higher-level representation from the lower-level representation output;
   - obtain a better weight initialization.

Motivation
1. Availability of large data sets from various domains (weather, stock markets, health records, industries, etc.)
2. Advancements in hardware as well as in machine learning algorithms
3. Great success in domains such as speech recognition, image classification, and computer vision
4. Deep learning applications in time series prediction, especially for climate data, are relatively new and have rarely been explored
5. Climate data is highly complex and hard to model, so a nonlinear model is beneficial
6. A large set of features has influence on climate variables

Figure 2: How Data Science Techniques Scale with Amount of Data

Deep Learning
1. Deep learning refers to artificial neural networks with several hidden layers.
2. A set of algorithms is used for training these deep neural networks.
3. Deep learning algorithms seek to discover good features that best represent the problem, rather than just a way to combine them.

Figure 3: A Deep Neural Network

Unsupervised Feature Learning and Deep Learning
1. Unsupervised feature learning methods are widely used to learn better representations of the input data.
2. The two common methods are autoencoders (AE) and restricted Boltzmann machines (RBM).

Stacked Autoencoders
1. The stacked autoencoder (SAE) model is a stack of autoencoders.
2. It uses autoencoders as building blocks to create a deep network.
3. An autoencoder is a neural network that attempts to reproduce its input: the target output is the input of the model (see the sketch below).

Figure 4: An Example of an Autoencoder
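As a minimal illustration of the reconstruction objective, the NumPy sketch below encodes an input and measures how well it is reproduced. The 5-10 layer sizes, tied decoder weights, and sigmoid activations are illustrative assumptions, not the presentation's exact configuration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_visible, n_hidden = 5, 10                 # e.g., 5 lagged inputs, 10 hidden units (assumed)
W = rng.normal(0, 0.1, (n_visible, n_hidden))
b_h = np.zeros(n_hidden)                    # hidden (encoder) biases
b_v = np.zeros(n_visible)                   # visible (decoder) biases

def autoencode(x):
    h = sigmoid(x @ W + b_h)                # encode the input into hidden features
    x_hat = sigmoid(h @ W.T + b_v)          # decode: reconstruct the input (tied weights)
    return h, x_hat

x = rng.random(n_visible)
_, x_hat = autoencode(x)
loss = np.mean((x - x_hat) ** 2)            # reconstruction error minimized during training
```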

Deep Belief Networks
1. A Deep Belief Network (DBN) is a multilayer neural network constructed by stacking several Restricted Boltzmann Machines (RBMs) [3].
2. An RBM is an unsupervised learning model that is trained using contrastive divergence.

Figure 5: Construction of a DBN

Proposed Deep Learning Approach
1. We propose an empirical mode decomposition based Deep Belief Network with two Restricted Boltzmann Machines.
2. The purpose of the decomposition is to simplify the forecasting process.

Figure 6: Flowchart of the proposed model

Proposed Deep Learning Approach
Figure 7: Proposed Model
Figure 8: DBN with two RBMs

Restricted Boltzmann Machines (RBMs) I
1. An RBM is a stochastic generative model that consists of only two bipartite layers: a visible layer v and a hidden layer h.
2. It uses only the input (training set) for learning.
3. It is a type of unsupervised neural network that can extract meaningful features of the input data set which are more useful for learning.
4. It is normally defined in terms of the energy of the configuration between the visible units and the hidden units.

Figure 9: An RBM

Restricted Boltzmann Machines (RBMs) II
The joint probability of a configuration is given by [4]:

$$P(v, h) = \frac{e^{-E(v,h)}}{Z}$$

where $Z$ is the partition function (normalization factor):

$$Z = \sum_{v,h} e^{-E(v,h)}$$

and $E(v, h)$ is the energy of the configuration:

$$E(v, h) = -\sum_{i \in \text{visible}} a_i v_i - \sum_{j \in \text{hidden}} b_j h_j - \sum_{i,j} v_i h_j w_{ij}$$

Training of RBMs consists of sampling the $h_j$ given $v$ (or the $v_i$ given $h$) using Contrastive Divergence.
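The toy NumPy sketch below evaluates these quantities directly, for a model small enough that the partition function can be summed by brute force; the layer sizes and random parameters are illustrative assumptions.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n_v, n_h = 3, 2                          # tiny model so Z is tractable (assumed sizes)
a = rng.normal(0, 0.1, n_v)              # visible biases a_i
b = rng.normal(0, 0.1, n_h)              # hidden biases b_j
W = rng.normal(0, 0.1, (n_v, n_h))       # weights w_ij

def energy(v, h):
    # E(v,h) = -sum_i a_i v_i - sum_j b_j h_j - sum_ij v_i h_j w_ij
    return -(a @ v) - (b @ h) - (v @ W @ h)

# Partition function Z: brute-force sum over all 2^(n_v + n_h) binary configurations.
def configs(n):
    return (np.array(s) for s in itertools.product([0, 1], repeat=n))

Z = sum(np.exp(-energy(v, h)) for v in configs(n_v) for h in configs(n_h))

def joint_prob(v, h):
    return np.exp(-energy(v, h)) / Z     # P(v, h) = e^{-E(v,h)} / Z
```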

Training an RBM
1. Set the initial states to the training data set (visible units).
2. Sample in a back-and-forth process:
   - Positive phase: $P(h_j = 1 \mid v) = \sigma\left(c_j + \sum_i w_{ij} v_i\right)$
   - Negative phase: $P(v_i = 1 \mid h) = \sigma\left(b_i + \sum_j w_{ij} h_j\right)$
3. Update all the hidden units in parallel starting from the visible units, reconstruct the visible units from the hidden units, and finally update the hidden units again:
   $$\Delta w_{ij} = \alpha \left( \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}} \right)$$
4. Repeat with all training examples.

Figure 10: Single step of Contrastive Divergence
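A minimal sketch of one CD-1 update following these equations; the learning rate and the binary-unit sampling details are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, rng, alpha=0.1):
    """One contrastive-divergence (CD-1) update for binary units.
    b: visible biases, c: hidden biases, as in the slide's equations."""
    # Positive phase: P(h_j = 1 | v) = sigma(c_j + sum_i w_ij v_i)
    ph0 = sigmoid(c + v0 @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)   # sample hidden states
    # Negative phase: P(v_i = 1 | h) = sigma(b_i + sum_j w_ij h_j)
    pv1 = sigmoid(b + h0 @ W.T)                        # reconstruct visibles
    ph1 = sigmoid(c + pv1 @ W)                         # update hiddens again
    # delta w_ij = alpha * (<v_i h_j>_data - <v_i h_j>_model)
    W += alpha * (np.outer(v0, ph0) - np.outer(pv1, ph1))
    b += alpha * (v0 - pv1)
    c += alpha * (ph0 - ph1)
    return W, b, c
```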

Deep Belief Network
A deep belief network is constructed by stacking multiple RBMs together. Training a DBN is simply the layer-wise training of the stacked RBMs (see the sketch below):
1. Train the first layer using the input data only (unsupervised).
2. Freeze the first-layer parameters and train the second layer using the output of the first layer as the input.
3. Use the outputs of the second layer as inputs to the last (supervised) layer and train that supervised layer.
4. Unfreeze all weights and fine-tune the entire network using error back-propagation in a supervised manner.

Figure 11: A DBN with two RBMs
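Putting the pieces together, here is a self-contained sketch of the greedy layer-wise pre-training in steps 1-2; the 10-10 hidden sizes mirror the DBN(5-10-10-1) structure used later, while the epoch count and learning rate are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(X, n_hidden, epochs=10, alpha=0.1, seed=0):
    # Minimal CD-1 trainer (one step per sample; see the previous sketch).
    rng = np.random.default_rng(seed)
    W = rng.normal(0, 0.1, (X.shape[1], n_hidden))
    b, c = np.zeros(X.shape[1]), np.zeros(n_hidden)
    for _ in range(epochs):
        for v0 in X:
            ph0 = sigmoid(c + v0 @ W)
            h0 = (rng.random(n_hidden) < ph0).astype(float)
            pv1 = sigmoid(b + h0 @ W.T)              # reconstruction
            ph1 = sigmoid(c + pv1 @ W)
            W += alpha * (np.outer(v0, ph0) - np.outer(pv1, ph1))
            b += alpha * (v0 - pv1)
            c += alpha * (ph0 - ph1)
    return W, c

def pretrain_dbn(X, hidden_sizes=(10, 10)):
    # Greedy layer-wise pre-training: each RBM is trained on the (frozen)
    # hidden activations of the layer below (steps 1-2 above).
    layers, inputs = [], X
    for n_hidden in hidden_sizes:
        W, c = train_rbm(inputs, n_hidden)
        layers.append((W, c))
        inputs = sigmoid(c + inputs @ W)             # propagate features upward
    return layers  # initializes the DBN before supervised fine-tuning (step 4)
```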

Empirical Mode Decomposition (EMD)
1. EMD is an adaptive data pre-processing method suitable for nonstationary and nonlinear time series data [5].
2. It is based on the assumption that any dataset consists of different simple intrinsic modes of oscillation.
3. Given a data set $x(t)$, the EMD method decomposes it into several independent intrinsic mode functions (IMFs) with a corresponding residue, which represents the trend [6]:
   $$X(t) = \sum_{j=1}^{n} c_j(t) + r_n(t)$$
   where the $c_j$ are the IMF components and $r_n$ is the residual component.
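As a quick illustration, the sketch below decomposes a toy nonstationary signal using the third-party PyEMD package (installed as `EMD-signal`); the package choice, its API, and the placement of the residue as the last row are assumptions about that library, not part of the presentation.

```python
import numpy as np
from PyEMD import EMD   # third-party package: pip install EMD-signal (assumed)

t = np.linspace(0, 1, 500)
x = np.sin(30 * t) + 0.5 * np.sin(6 * t) + 0.1 * t   # toy nonstationary series

imfs = EMD().emd(x)     # rows: IMF components c_1..c_n, residual trend last (assumed)

# The decomposition is additive, X(t) = sum_j c_j(t) + r_n(t), so the rows
# should sum back to the original signal up to numerical precision.
print(np.allclose(imfs.sum(axis=0), x))
```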

The Hybrid EMD-DBN Model
1. A hybrid model consisting of empirical mode decomposition and a deep belief network (EMD-DBN) is proposed in this work.

Figure 12: Flowchart of the hybrid EMD-DBN model
Figure 13: EMD decomposition of the SSI series: the top is the original signal, followed by 7 IMFs and the residue

Summary of the Proposed Approach
The following steps are used [1], [2] (see the sketch after this list):
1. Given time series data, determine whether it is nonstationary or nonlinear.
2. If yes, decompose the data into a finite number of IMFs and a residue using EMD.
3. Divide the data into training and testing sets (usually 80% for training and 20% for testing).
4. For each IMF and the residue, construct one training matrix as the input for one DBN. The inputs to the DBN are the past five observations.
5. Select the appropriate model structure and initialize the parameters of the DBN. Two hidden layers are used in this work.
6. Using the training data, pre-train the DBN through unsupervised learning for each IMF and the residue.
7. Fine-tune the parameters of the entire network using the back-propagation algorithm.
8. Perform predictions with the trained model using the test data.
9. Combine all the prediction results by summation to obtain the final output.
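The sketch below traces steps 2-9 as a data flow. Here `decompose` and `train_dbn` are hypothetical placeholders (for example, the EMD and DBN sketches above), and the fitted model's `predict` method is likewise assumed.

```python
import numpy as np

def make_lag_matrix(series, n_lags=5):
    # Past five observations as inputs, next value as the target (step 4).
    X = np.column_stack([series[i:len(series) - n_lags + i]
                         for i in range(n_lags)])
    return X, series[n_lags:]

def emd_dbn_forecast(series, decompose, train_dbn, train_frac=0.8):
    # decompose(series) -> list of IMFs plus residue (step 2);
    # train_dbn(X, y)   -> fitted model with a .predict() method (steps 5-7).
    components = decompose(series)
    total = None
    for comp in components:
        X, y = make_lag_matrix(comp)
        split = int(train_frac * len(X))               # 80/20 split (step 3)
        model = train_dbn(X[:split], y[:split])        # pre-train, then fine-tune
        pred = model.predict(X[split:])                # test-set predictions (step 8)
        total = pred if total is None else total + pred  # sum components (step 9)
    return total
```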

Prediction of Solar Activity
1. Solar activity is characterized, among other measures, by the relative sunspot number.
2. Sunspots are dark spots that are often seen on the sun's surface.
3. Sunspot activity is known to influence several geophysical processes on Earth.
4. For example, atmospheric motion, climate anomalies, and ocean changes all have some degree of relation with the sunspot number.
5. It is also a good indicator for solar power generation.
6. Due to the complexity of sunspot number changes, modeling methods have had difficulty describing its dynamics.

Description of Data
1. The monthly time series representing the solar activity cycle during the last 268 years is used.
2. The data represent the sunspot number, an estimate of the number of individual sunspots from 1749 to 2016: a total of 3216 observations.
3. We used the monthly sunspot time series for the years 1749-1960 as the training set and 1961-2016 for cross-validation (testing).

Figure 14: Monthly Total Sunspot Number: 1749-2016

Decomposition of Sunspot Number Series
Figure 15: EMD Decomposition

Results and Discussion

Table 1: Prediction Errors

Model                | MSE     | RMSE    | MAE
MLP (5-10-1)         | 0.00359 | 0.05992 | 0.04798
DBN (5-10-10-1)      | 0.00345 | 0.05865 | 0.04396
EMD-MLP (5-10-1)     | 0.00078 | 0.09205 | 0.02101
EMD-DBN (5-10-10-1)  | 0.00020 | 0.01438 | 0.01070

Figure 16: DBN prediction results

Application to Drought Prediction
1. Drought is a natural disaster that has a great impact on society.
2. It occurs when there is a significant deficit in rainfall compared to the long-term average.
3. It affects water resources and agricultural and socioeconomic activities.
4. Drought prediction is vital in limiting its effects.
5. Predictions can be useful in the control and management of water resource systems and in mitigating economic, environmental, and social impacts.

Figure 17

Study Area and Data
1. The case study is carried out using data from the Gunnison River Basin, located in the Upper Colorado River Basin, with a total drainage area of 5400 km².
2. Monthly streamflow observations from 1912 to 2013 are used.
3. Standardized Streamflow Indices (SSI) are calculated based on the streamflow data.

Figure 18: Location of the Gunnison River Basin [7]

Summary of the Proposed Model
1. Obtain the SSI at the different time scales.
2. Decompose the time series data into several IMFs and a residue using EMD.
3. Divide the data into training and testing sets.
4. Pre-train each layer bottom-up by considering each pair of layers as an RBM.
5. Fine-tune the entire network using the back-propagation algorithm.
6. Use the test data to evaluate the trained model.

Results and Discussion

Table 2: Prediction Errors for SSI 12

Model                | MSE     | RMSE    | MAE
MLP (5-10-1)         | 0.00422 | 0.06468 | 0.04211
EMD-MLP (5-10-1)     | 0.00209 | 0.03580 | 0.02882
DBN (5-10-10-1)      | 0.00211 | 0.04593 | 0.02852
EMD-DBN (5-10-10-1)  | 0.00131 | 0.02257 | 0.01649

Table 3: Prediction Errors for SSI 24

Model                | MSE     | RMSE    | MAE
MLP (5-10-1)         | 0.00303 | 0.05507 | 0.03969
EMD-MLP (5-10-1)     | 0.00179 | 0.04399 | 0.04249
DBN (5-10-10-1)      | 0.00125 | 0.03535 | 0.01876
EMD-DBN (5-10-10-1)  | 0.00077 | 0.02780 | 0.01009

Results and Discussion
Figure 19: Comparison of the RMSE for the SSI 12 forecast
Figure 20: Comparison of the MAE for the SSI 12 forecast

Conclusion
1. This study explored a deep belief network for drought prediction. We proposed a hybrid model comprising empirical mode decomposition and a deep belief network (EMD-DBN) for long-term drought prediction.
2. The results of the proposed approach are compared with those of DBN, MLP, and EMD-MLP.
3. Overall, the hybrid EMD-DBN model was found to provide better forecasting results for SSI 12 and SSI 24 in the Gunnison River Basin.
4. The performance of both MLP and DBN improved when the drought time series were decomposed.

Summary of Contributions and Future Work I
Contributions:
1. Constructed a DBN model for time series prediction by adding a final layer that simply maps the learned features to the target.
2. Improved the performance of the proposed model by integrating empirical mode decomposition to form a hybrid EMD-DBN model.
3. Calculated standardized drought indices using the generalized extreme value (GEV) distribution instead of a gamma distribution.
4. Applied the proposed model to drought prediction.

Future work:
1. Optimize the structure of the model (e.g., number of hidden layers, hidden layer size, and learning rate) using search methods such as grid search or random search.
2. Use a linear neural network to aggregate the individual predictions instead of just summing them.

Summary of Contributions and Future Work II
3. Predict extreme precipitation indices across the southeastern US.
4. Apply the model to predict other climate variables, such as precipitation and temperature, using satellite images.

Thank You!

References I
[1] Norbert A. Agana and Abdollah Homaifar. A deep learning based approach for long-term drought prediction. In SoutheastCon 2017, pages 1-8. IEEE, 2017.
[2] Norbert A. Agana and Abdollah Homaifar. A hybrid deep belief network for long-term drought prediction. In Workshop on Mining Big Data in Climate and Environment (MBDCE 2017), 17th SIAM International Conference on Data Mining (SDM 2017), pages 1-8, April 27-29, 2017, Houston, Texas.
[3] Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh. A fast learning algorithm for deep belief nets. Neural Computation, 18(7):1527-1554, 2006.
[4] Geoffrey Hinton. A practical guide to training restricted Boltzmann machines. Momentum, 9(1):926, 2010.

References II
[5] Zhaohua Wu, Norden E. Huang, Steven R. Long, and Chung-Kang Peng. On the trend, detrending, and variability of nonlinear and nonstationary time series. Proceedings of the National Academy of Sciences, 104(38):14889-14894, 2007.
[6] Norden E. Huang, Zheng Shen, Steven R. Long, Manli C. Wu, Hsing H. Shih, Quanan Zheng, Nai-Chyuan Yen, Chi Chao Tung, and Henry H. Liu. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 454:903-995. The Royal Society, 1998.

References III
[7] Shahrbanou Madadgar and Hamid Moradkhani. A Bayesian framework for probabilistic seasonal drought forecasting. Journal of Hydrometeorology, 14(6):1685-1705, 2013.