Using reservoir computing in a decomposition approach for time series prediction.
Francis wyffels, Benjamin Schrauwen and Dirk Stroobandt
Ghent University - Electronics and Information Systems Department
Sint-Pietersnieuwstraat 41, 9000 Gent - Belgium

Abstract. In this paper we combine wavelet decomposition and recurrent neural networks to provide fast and accurate time series predictions. The original time series is decomposed by means of wavelet decomposition into a hierarchy of time series which are easier to predict. The prediction core of our solution is reservoir computing, a recently developed technique for the very fast training of recurrent neural networks. The three time series of the ESTSP 2008 competition are used as an illustration of our method.

1 Introduction

Forecasting is a domain with a broad range of useful applications. Researchers working on time series prediction therefore come from a wide variety of fields and use many methods, such as the theta method [1], support vector machines, neural networks, local modeling [2], wavelet decomposition [3] and many more. This year, for the second time, the European Symposium on Time Series Prediction is held. This symposium always presents a challenging competition in the domain of time series prediction. This year the competition concerns the prediction of three different time series, which can be found on the competition website. Each time series has different properties, which is interesting because it will reveal the strengths and weaknesses of the many methods applied to the time series during the contest. Although we have no profound experience in the domain of time series prediction, we wanted to join this competition in order to compare our method with many others in the forecasting domain. In this paper we describe how a combined approach of wavelet decomposition and reservoir computing can be used for forecasting. Before we continue with the full explanation of our method, we give a short overview of reservoir computing, which forms the baseline of our method.

This research is partially funded by FWO Flanders project G and the Photonics@be Interuniversity Attraction Poles program (IAP 6/1), initiated by the Belgian State, Prime Minister's Services, Science Policy Office.
2 Reservoir computing: a short overview

Reservoir computing is a novel technique for the fast training of large recurrent neural networks, which has been successfully applied in a broad range of temporal tasks such as robotics [4], speech recognition [5, 6] and time series generation [7]. Last year, reservoir computing outperformed all other methods in the NN3 competition for financial time series prediction [8]. The reservoir computing technique is based on the use of a large untrained dynamical system, the reservoir, where the desired function is implemented by a linear, memory-less mapping from the full instantaneous state of the dynamical system to the desired output. Only this linear mapping is learned. When the dynamical system is a recurrent neural network of analog neurons, the method is referred to as echo state networks [7]; when spiking neurons are used, one often speaks of liquid state machines [9]. Both are now commonly referred to as reservoir computing [10].

Training is done in a supervised way by first driving the reservoir with teacher-forced inputs and/or teacher-forced output feedback. Secondly, the output layer is trained using linear regression methods. This is summarized by the following equations:

    x[k+1] = f( W^{res}_{res} x[k] + W^{res}_{inp} u[k] + W^{res}_{out} y[k] + W^{res}_{bias} ),
    ŷ[k+1] = W^{out}_{res} x[k+1] + W^{out}_{inp} u[k] + W^{out}_{bias},    (1)

where x[k] is the reservoir's state, u[k] is the input, y[k] is the desired output and ŷ[k] is the actual output. When analog neurons are used, the nonlinearity f often represents the sigmoid function. All the weight matrices denoted by W^{out} are trained, while those denoted by W^{res} are fixed and randomly created. During testing, the teacher-forced output feedback y[k] is replaced by the actual output ŷ[k]; we call this free-run mode. Because only the output weights are changed, training can be done very quickly, which is an additional benefit in comparison with other methods. Additionally, reservoir computing does not suffer from local optima like other methods based on neural networks do. We conclude that reservoir computing gives us a powerful tool that can easily be used in a broad range of applications.

3 Time series prediction

Now that we have introduced the baseline of our prediction mechanism, we can formalize our overall prediction scheme, which is illustrated in Fig. 1. We present each of the modules below in greater detail.
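To make equation (1) concrete, the following Python/NumPy sketch implements an output-feedback reservoir, trained by teacher forcing followed by a linear least-squares readout and then run in free-run mode. The reservoir size, weight scalings, tanh nonlinearity and toy target are our own illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100                                      # reservoir size (illustrative)
W_res = rng.normal(scale=0.1, size=(N, N))   # fixed random recurrent weights
W_fb = rng.normal(scale=0.1, size=N)         # fixed output-feedback weights
W_bias = rng.normal(scale=0.1, size=N)       # fixed bias weights

def teacher_forced_states(y_teacher):
    """Drive the reservoir with the desired output y[k] (teacher forcing)."""
    x, states = np.zeros(N), []
    for y in y_teacher:
        x = np.tanh(W_res @ x + W_fb * y + W_bias)  # f in equation (1)
        states.append(x)
    return np.array(states)

y = np.sin(0.1 * np.arange(300))             # toy target series
X = teacher_forced_states(y[:-1])            # row i is the state after y[0..i]
W_out, *_ = np.linalg.lstsq(X, y[1:], rcond=None)  # linear readout: x[k+1] -> y[k+1]

def free_run(x, y_hat, n_steps):
    """Free-run mode: teacher forcing is replaced by the network's own output."""
    out = []
    for _ in range(n_steps):
        x = np.tanh(W_res @ x + W_fb * y_hat + W_bias)
        y_hat = W_out @ x
        out.append(y_hat)
    return np.array(out)

prediction = free_run(X[-1], y[-1], 50)      # generate 50 future samples
```

Note that only W_out is learned; the recurrent, feedback and bias weights stay at their random initialization, which is what makes training a single linear regression.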
Fig. 1: Schematic overview of our prediction methodology (Normalize → Decompose → Discard d_1 → Predict (RC) → Recombine → Rescale). We start by normalizing the given time series. Next, the normalized time series is decomposed by means of wavelet decomposition, which results in a trend and a set of detail coefficients. Hereafter the level 1 detail coefficient is discarded. The obtained components are predicted separately using reservoir computing (RC). Finally, the predicted components are recombined and rescaled in order to get a prediction of the given time series.

3.1 Normalizing the time series

Because we want to work in both the linear and nonlinear part of our recurrent neural network, we need to normalize the time series to the interval [-1, 1]. Otherwise the neurons would saturate and thus lose information. Normalization is done by removing the mean and dividing the outcome by the maximal absolute value:

    x̃ = x − x̄,   x_norm = x̃ / max(|x̃|),    (2)

3.2 Decomposition

Time series can very often be decomposed into components with different dynamics: a trend, periodical effects (sometimes denoted as seasonal effects) and irregular residual components. In [3] wavelet decomposition was motivated by the easy analysis of the obtained components. A second motivation to decompose the original time series is inherent to the use of reservoir computing, which tends to be sensitive to a small temporal range [11]. This can be a problem for time series which contain information on different timescales. When no additional information about the time series is available, decomposition can be done by using a set of successive filters. This is also known as multiscale decomposition and is described in more detail in [12]. The filters are obtained by rescaling the so-called mother wavelet. When the filters are applied iteratively, one obtains a slowly varying trend series and a hierarchy of detail components which contain the system's dynamics at different timescales [3].
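As a sketch of the normalization and multiscale decomposition steps, here is one possible Python version assuming the PyWavelets library, whose 'dmey' wavelet is its discrete Meyer filter; the per-band reconstruction trick, the decomposition level default and the toy signal are our assumptions, not the authors' code.

```python
import numpy as np
import pywt

rng = np.random.default_rng(0)

def normalize(x):
    """Equation (2): remove the mean, then divide by the maximal absolute value."""
    x = x - x.mean()
    return x / np.abs(x).max()

def decompose(x, level=8, wavelet="dmey"):
    """Multiscale decomposition: return the trend c_L followed by the detail
    components d_L ... d_1, each reconstructed back to the original length."""
    coeffs = pywt.wavedec(x, wavelet, level=level)
    bands = []
    for i in range(len(coeffs)):
        # keep only coefficient band i, zero the others, and invert the transform
        only_i = [c if j == i else np.zeros_like(c) for j, c in enumerate(coeffs)]
        bands.append(pywt.waverec(only_i, wavelet)[: len(x)])
    return bands

x = normalize(np.cumsum(rng.normal(size=16384)))  # toy series, long enough for level 8
trend, *details = decompose(x)
# by linearity of the inverse transform, the components sum back to the signal
assert np.allclose(x, trend + sum(details), atol=1e-6)
```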
Because the system's dynamics are now split up into different timescales, processing with reservoir computing becomes a lot easier. The number of iterations L depends on the length N of the time series and is limited by L_max = log_2 N, but fewer iterations can be considered after inspecting the derived detail components. After L iterations, a time series y[k], k = 1...N, of length N can be written as the sum of the trend c_L[k] and L detail coefficients d_m[k], m = 1...L:

    y[k] = c_L[k] + \sum_{m=1}^{L} d_m[k],    (3)

For our experiments we used the MATLAB Wavelet Toolbox for the decomposition of the time series. The filters we used were obtained from the discrete Meyer filter, because this gave components with few discontinuities; we suspect that a Daubechies filter of sufficiently high order is also feasible. In Fig. 3 the first time series of the ESTSP 2008 competition is shown with its trend and detail coefficients using a level eight decomposition. The noisiest coefficient d_1 is not illustrated. By inspecting the derived trend and detail components we decided that a level eight decomposition produced sufficiently smooth components for the first and second time series of the ESTSP 2008 competition. For the third time series we used a level 12 decomposition because this gave smoother and more predictable coefficients.

3.3 Prediction

The trend and detail components obtained from the decomposition are used to predict the original time series. We neglect the level 1 component d_1 because it is too noisy to predict. The remaining components are predicted separately by means of reservoir computing. An important implication is that correlations between different components are neglected. We plan to investigate the use of many timescales within a single reservoir so that all components can be predicted at once instead of being trained separately. This would reduce calculation time and has the benefit of exploiting possible correlations between the components. For each component a fully connected reservoir with 5 sigmoid neurons, one output and only output feedback as an input is constructed. No other external inputs are used. An illustration of this topology can be seen in Fig. 2. The connection weights are scaled so that the largest eigenvalue is nearly 1, which makes the reservoir nearly unstable. The output feedback weights were scaled to 0.1. Depending on the desired output, classical neurons, leaky neurons [7] or band-pass neurons [11] are used. A rule of thumb is that leaky neurons are used for the slowest components, band-pass neurons for the faster components and classical neurons for the fastest varying components. We determined the leak rates and band-pass settings manually, based on the frequency spectrum of the components. Because we want to generate the future of a time series, the output is fed back to each of the neurons in the reservoir. During training the desired signal is used as feedback (teacher forcing), as previously explained.
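A minimal sketch of the reservoir construction just described: weights rescaled so the largest eigenvalue magnitude is nearly 1, output-feedback weights scaled to 0.1, and a leaky-neuron variant of the state update. The dense Gaussian initialization and the leak-rate value are illustrative assumptions on our part.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_reservoir(n, spectral_radius=0.99, fb_scale=0.1):
    """Fully connected reservoir, rescaled so that its largest eigenvalue is
    nearly 1 (nearly unstable), with output-feedback weights scaled to 0.1."""
    W = rng.normal(size=(n, n))
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    W_fb = fb_scale * rng.uniform(-1.0, 1.0, size=n)
    return W, W_fb

def leaky_step(x, y_fb, W, W_fb, leak=0.2):
    """Leaky neurons [7]: a small leak rate slows the state down, which suits
    the slowest components; leak=1 recovers the classical neuron update."""
    return (1.0 - leak) * x + leak * np.tanh(W @ x + W_fb * y_fb)
```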
Fig. 2: Schematic overview of the described reservoir topology (a reservoir with N nodes and one output node; random but fixed interconnections versus trained interconnections). The reservoir has only output feedback as an input. Only the output weights, represented by dashed lines, are trained. All other connection weights are initialized randomly and scaled so that the largest eigenvalue is nearly 1.

In order to avoid overfitting we use ridge regression to train the output, which has proven to have good regularization properties [13], even for generation tasks. For training and testing we divide each component into three parts: one for training (the largest part), and the two last parts (each with a length equal to the desired prediction horizon) for validation and testing. The final results for both testing and the competition were obtained by first training the reservoir using teacher forcing on the largest part. The optimal regularization parameter is determined from the performance of the reservoir in predicting the validation part. Next, the reservoir is retrained by teacher forcing it with the first and second parts, using the obtained optimal regularization parameter, in order to predict the third (known) testing part. This part is used for the evaluation of our approach, and these results are presented in the next section. Finally, we train the reservoir again using the complete component in order to predict the unknown samples needed for the competition. We repeat this process ten times for each of the components, each time using another reservoir. The unknown samples were generated by the reservoir which had the best performance on the testing part.
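The ridge-regression readout and the validation-based choice of the regularization parameter could look as follows. The parameter grid is our assumption, and for brevity the validation error below is one-step-ahead rather than the free-run prediction the paper evaluates.

```python
import numpy as np

def ridge_readout(X, y, lam):
    """Closed-form ridge regression: W_out = (X^T X + lam*I)^{-1} X^T y."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

def select_regularization(X_tr, y_tr, X_val, y_val,
                          grid=(1e-8, 1e-6, 1e-4, 1e-2, 1.0)):
    """Train on the first part and keep the parameter that predicts the
    validation part best, as in the three-part split described above."""
    return min(grid, key=lambda lam:
               np.mean((X_val @ ridge_readout(X_tr, y_tr, lam) - y_val) ** 2))
```

After the parameter is chosen, the readout is refitted on the training and validation parts together before the testing part is predicted, mirroring the retraining step in the text.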
3.4 Composition and rescaling

Recombination of the components is done using equation (3). Afterwards the composed time series is rescaled to undo the normalization.

4 Experimental results

The goal of the ESTSP 2008 competition is to predict three different time series, each with a different prediction horizon. The evaluation of a time series y of length N and its prediction ŷ is done by calculating the NMSE:

    NMSE = \frac{\sum_{t=1}^{N} (y_t − ŷ_t)^2}{N σ_y^2},    (4)

For the first time series we have to predict the next 18 samples based on a history of 354 samples. Two additional time series were given which could be helpful for the prediction of the first time series. After first trying a few prediction setups where these additional time series were used as inputs to our reservoir, we decided to neglect these external variables because they gave no significant improvement. The final result with decomposition of the time series is presented in Fig. 3. An NMSE of 0.25 was obtained on the last known 18 samples. The complete training and prediction procedure takes nearly 2 minutes on an average desktop computer with two gigabytes of memory and a 2.4 GHz Intel-based CPU.

The second time series of the ESTSP 2008 competition consists of 1300 samples, of which we have to predict the next 100 samples. This time series has a period of 7 samples, which makes it tempting to think that it was sampled from a daily updated variable. This impression becomes more pronounced when we cut the time series into sets of 365 samples and look at the correlations between the different sets. Although we first wanted to use this analysis as additional information in a different approach, we chose to reject it because those results were comparable to the results we have now, and we wanted one consistent methodology for the three time series. The predictions are shown in Fig. 4. An NMSE of 0.14 was obtained, which is the best of the three given time series. A total time of nearly 5 minutes was needed to complete the training and testing procedure.

For the third time series we were given a long sample history, of which we had to predict the next 200 samples. Completing the prediction procedure took three hours, which is due to the long sample history and the use of more decomposition levels (and thus more reservoirs for the prediction of the components). The results are shown in Fig. 5. An NMSE of 0.42 was obtained, which is our worst performance, possibly due to the discrete jumps in the trend and the many noisy detail coefficients derived from the decomposition.
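The NMSE of equation (4), used throughout these results, transcribes directly to code; here σ_y^2 is taken as the variance of the known target over the evaluated horizon.

```python
import numpy as np

def nmse(y_true, y_pred):
    """Equation (4): mean squared error normalized by the target variance."""
    return np.mean((y_true - y_pred) ** 2) / np.var(y_true)
```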
Fig. 3: In solid black lines, the original time series 1 of the ESTSP 2008 competition and its decomposition (using level eight wavelet decomposition) into its trend (a8) and detail coefficients (d8 down to d2) are shown. The level 1 detail coefficient is not shown because it was too noisy to predict. The last 18 samples of the original time series and its components were predicted in order to evaluate our prediction methodology; these predictions are shown in dashed gray lines. This resulted in an NMSE of 0.25. The 18 unknown samples which were predicted for the competition are shown in solid gray lines.
Fig. 4: At the top, the complete time series 2 of the ESTSP 2008 competition is shown in a solid black line. Our method was evaluated on the last 100 samples, which gave an NMSE of 0.14. At the bottom, on a subset of the time series, these predictions are marked with a dashed gray line. The next 100 unknown samples for the competition are marked with a solid gray line.
Fig. 5: At the top, an impression of the complete time series 3 of the ESTSP 2008 competition is given. The last 200 samples were used for evaluating our technique, which resulted in an NMSE of 0.42. These predicted samples are shown with a dashed gray line on a subset of the time series. Our prediction for the unknown future of 200 samples is illustrated with a solid gray line.
5 Conclusions

In this work a prediction scheme for fast and accurate time series prediction based on wavelet decomposition and reservoir computing was presented. By using time series decomposition, components on different timescales were obtained which are easier to predict. The obtained trend series and detail coefficients were predicted using reservoir computing. We evaluated our method on a known part of the three time series of the ESTSP 2008 competition. Finally, the unknown samples of the three time series were generated. For future work we plan a setup that uses a single reservoir for both decomposition and prediction. This will give us a prediction mechanism that is able to use the interdependence between the obtained components. To that end, we will first need to investigate how many timescales can be used within one reservoir.

References

[1] V. Assimakopoulos and K. Nikolopoulos. The theta model: a decomposition approach to forecasting. International Journal of Forecasting, 16:521-530, 2000.
[2] J. McNames. Innovations in local modeling for time series prediction. PhD thesis, Stanford University.
[3] S. Soltani. On the use of the wavelet decomposition for time series prediction. Neurocomputing, 48, 2002.
[4] E. A. Antonelo, B. Schrauwen, and J. Van Campenhout. Generative modeling of autonomous robots and their environments using reservoir computing. Neural Processing Letters, 26(3), 2007.
[5] M. D. Skowronski and J. G. Harris. Automatic speech recognition using a predictive echo state network classifier. Neural Networks, 20(3), 2007.
[6] D. Verstraeten, B. Schrauwen, D. Stroobandt, and J. Van Campenhout. Isolated word recognition with the liquid state machine: a case study. Information Processing Letters, 95(6):521-528, 2005.
[7] H. Jaeger. The echo state approach to analysing and training recurrent neural networks. Technical Report GMD Report 148, German National Research Center for Information Technology, 2001.
[8] H. Jaeger. Background information: Jacobs University smart systems seminar wins international financial time series competition, 2007.
[9] W. Maass, T. Natschläger, and H. Markram. Real-time computing without stable states: a new framework for neural computation based on perturbations. Neural Computation, 14(11):2531-2560, 2002.
[10] D. Verstraeten, B. Schrauwen, M. D'Haene, and D. Stroobandt. An experimental unification of reservoir computing methods. Neural Networks, 20:391-403, 2007.
[11] F. wyffels, B. Schrauwen, D. Verstraeten, and D. Stroobandt. Band-pass reservoir computing. In Proceedings of the International Joint Conference on Neural Networks, 2008.
[12] I. Daubechies. Ten Lectures on Wavelets (CBMS-NSF Regional Conference Series in Applied Mathematics). Society for Industrial and Applied Mathematics, 1992.
[13] F. wyffels, B. Schrauwen, and D. Stroobandt. Regularization methods for reservoir computing. In Proceedings of the International Conference on Artificial Neural Networks (ICANN), 2008. (accepted).