A Machine Learning Approach to Define Weights for Linear Combination of Forecasts


Ricardo Prudêncio¹ and Teresa Ludermir²

¹ Department of Information Science, Federal University of Pernambuco, Av. dos Reitores, s/n, Recife (PE), Brazil
² Center of Informatics, Federal University of Pernambuco, Recife (PE), Brazil

S. Kollias et al. (Eds.): ICANN 2006, Part I, LNCS 4131, © Springer-Verlag Berlin Heidelberg 2006

Abstract. The linear combination of forecasts is a procedure that has improved forecasting accuracy for different time series. In this procedure, each method being combined is associated with a numerical weight that indicates its contribution to the combined forecast. We present the use of machine learning techniques to define the weights for the linear combination of forecasts. In this approach, a machine learning technique uses features of the series at hand to define adequate weights for a pre-defined number of forecasting methods. In order to evaluate this solution, we implemented a prototype that uses an MLP network to combine two widespread methods. The experiments performed revealed forecasts significantly more accurate than benchmarks.

1 Introduction

Combining time series forecasts from different methods is a procedure commonly used to improve forecasting accuracy [1]. In the linear combination of forecasts, a weight is associated with each available method, and the combined forecast is the weighted sum of the forecasts individually provided by the methods. An approach that uses knowledge for defining the weights in the linear combination of forecasts is based on the development of expert systems, such as the Rule-Based Forecasting system [2]. The expert rules deployed by the system use descriptive features of the time series (such as length, basic trend, ...) in order to define the adequate combining weight associated with each available forecasting method. Despite its good results, developing rules in this context may be unfeasible, since good forecasting experts are not always available [3].

In this paper, we propose the use of machine learning for predicting the linear weights for combining forecasts. In the proposed solution, each training example stores the description of a series (i.e., the series features) and the combining weights that empirically obtained the best forecasting performance for that series. A machine learning technique uses a set of such examples to relate time series features to adequate combining weights. In order to evaluate the proposed solution, we implemented a prototype that uses MLP neural networks [4] to define the weights for two widespread methods: the Random Walk and the Autoregressive model [5].

The prototype was evaluated on 645 yearly series, and the combined forecasts were compared to the forecasts provided by benchmarking methods. The experiments revealed that the forecasts generated by the predicted weights were significantly more accurate than the benchmarking forecasts.

Section 2 presents methods for the linear combination of forecasts, followed by Section 3, which describes the proposed solution. Section 4 presents the implemented prototype, the experiments and the results. Section 5 presents some related work. Finally, Section 6 presents the conclusions and future work.

2 Combining Forecasts

The combination of forecasts from different methods is a well-established procedure for improving forecasting accuracy [1]. Empirical evidence has shown that procedures that combine forecasts often outperform the individual methods that are used in the combination [1][6].

The linear combination of K methods can be described as follows. Let $Z_t$ ($t = 1, \ldots, T$) be the available data of a series Z and let $Z_t$ ($t = T+1, \ldots, T+H$) be the H future values to be forecasted. Each method k uses the available data to calibrate its parameters and then generates its individual forecasts $\hat{Z}^k_t$ ($t = T+1, \ldots, T+H$). The combined forecasts $\hat{Z}^C_t$ are defined as:

$$\hat{Z}^C_t = \sum_{k=1}^{K} w_k \hat{Z}^k_t \quad (t = T+1, \ldots, T+H) \qquad (1)$$

The combining weights $w_k$ ($k = 1, \ldots, K$) are numerical values that indicate the contribution of each individual method to the combined forecasts. Constraints are sometimes imposed on the weights in such a way that:

$$\sum_{k=1}^{K} w_k = 1 \quad \text{and} \quad w_k \geq 0 \qquad (2)$$

Different approaches for defining adequate combining weights can be identified in the literature [6]. A very simple approach is to define equal weights (i.e., $w_k = 1/K$) for the combined methods, which is usually referred to as the Simple Average (SA) combination method. Despite its simplicity, the SA method has proven robust in the forecasting of different series [7]. A more sophisticated approach for defining the combining weights is to treat the problem within the regression framework [8]. In this context, the individual forecasts are viewed as explanatory variables and the actual values of the series as the response variable. In order to estimate the best weights, each combined method generates in-sample forecasts (i.e., forecasts for the available data $Z_t$). Then, a least squares method computes the weights that minimize the squared error of the combined in-sample forecasts. Empirical results revealed, however, that regression-based methods are almost always less accurate than the SA [9].
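As a concrete illustration of equations (1) and (2), the following minimal Python sketch (not part of the original paper; it assumes only numpy) combines the forecasts of K methods under a given weight vector and builds the Simple Average weights:

```python
import numpy as np

def combine_forecasts(forecasts, weights):
    """Eq. (1): weighted sum of K individual forecasts.

    forecasts -- array of shape (K, H), one row of H forecasts per method
    weights   -- array of shape (K,), non-negative and summing to one (eq. 2)
    """
    forecasts = np.asarray(forecasts, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return weights @ forecasts          # combined forecast at each horizon

# Simple Average (SA): equal weights w_k = 1/K
K = 2
sa_weights = np.full(K, 1.0 / K)
combined = combine_forecasts([[10.2, 10.5], [9.8, 10.1]], sa_weights)
```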

An alternative approach that has shown promising results in the combination of forecasts is based on the development of expert systems, such as Rule-Based Forecasting [2]. In that work, an expert system with 99 rules is used to weight four forecasting methods. The rule base was developed based on guidelines provided by human experts. The authors used time series features in the rules (such as basic trend and presence of outliers) to modify the weight associated with each method. Figure 1 shows an example of a rule used in the system (a code rendering of this rule is sketched at the end of this section). The rule defines how to modify the weights in the case of a series that presents an insignificant trend. In the experiments performed using the expert system, the improvement in accuracy over the SA method was shown to be significant.

Rule: Insignificant Basic Trend - IF NOT a significant basic trend, THEN add 0.05 to the weight on the random walk and subtract it from that on the regression.

Fig. 1. Example of a rule implemented in the Rule-Based Forecasting system

3 The Proposed Solution

As seen, expert systems have been successfully used to define weights for the linear combination of forecasts. Unfortunately, knowledge acquisition in these systems depends on the availability of good human forecasting experts, who are often scarce and expensive [3]. In order to minimize this difficulty, we propose the use of machine learning to define the weights for combining forecasts.

Figure 2 presents the architecture of the proposed solution. The system has two phases: training and use. In the training phase, the Intelligent Combiner (IC) uses a supervised algorithm to acquire knowledge from a set of examples in the Database (DB). Each training example stores the descriptive features of a particular series and the best configuration of weights used to combine K methods for that series. The learner implemented in the IC module generalizes the experience stored in the examples by associating time series features with the most adequate combining weights. In the use phase, given a time series to be forecasted, the Feature Extractor (FE) module extracts the values of the time series features. Based on these values, the IC module predicts an adequate configuration of weights for combining the K methods.

The solution proposed here contributes to two different fields: (1) time series forecasting, since we provide a new method for combining forecasts; and (2) machine learning, since we apply its concepts and techniques to a problem that had not been tackled before. In the following subsections, we provide a more formal description of each module of the proposed solution. Section 4 presents an implemented prototype and the experiments that evaluated the proposed solution.
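To make the flavour of the expert-system baseline concrete, the Fig. 1 rule could be rendered in code roughly as follows. This is a hypothetical sketch: the method names and the significance flag are placeholders, not the Rule-Based Forecasting system's actual implementation.

```python
def insignificant_basic_trend_rule(weights, trend_is_significant):
    """Fig. 1 as code: if the basic trend is not significant, move 0.05
    of weight from the regression method to the random walk."""
    w = dict(weights)                    # do not mutate the caller's dict
    if not trend_is_significant:
        w["random_walk"] += 0.05
        w["regression"] -= 0.05
    return w
```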

Fig. 2. System's architecture: the FE module maps the time series data and contextual information to the features $\mathbf{x}$; the IC module maps $\mathbf{x}$ to the combining weights $\hat{w}_1, \ldots, \hat{w}_K$; the DB stores the training examples E

3.1 The Feature Extractor

As said, the combining weights are predicted for a time series Z based on its description. Formally, a time series Z is described as a vector $\mathbf{x} = (x_1, \ldots, x_p)$, where each $x_j$ ($j = 1, \ldots, p$) corresponds to the value of a descriptive feature $X_j$. A time series feature can be either: (1) contextual information directly provided by the user, such as the domain of the series, the time interval and the forecasting horizon, among others; or (2) a descriptive statistic automatically calculated from the available time series data $Z_t$ ($t = 1, \ldots, T$). In the proposed solution, the FE module extracts those features that are computed from the available data. These features may be, for example, the length, the presence of trend or seasonality, the autocorrelations and the number of turning points, among others. Obviously, the choice of appropriate features is highly dependent on the type of series at hand.

3.2 The Database

An important aspect to be considered is the generation of the set of training examples used by the IC module. Let $E = \{e_1, \ldots, e_n\}$ be a set of n examples, where each example stores: (1) the values of the p features for a particular series; and (2) the adequate weights for combining the K forecasting methods on that series. An example $e_i \in E$ is then defined as a pair $e_i = (\mathbf{x}_i, \mathbf{w}_i)$, where $\mathbf{x}_i = (x^1_i, \ldots, x^p_i)$ is the description of the series and $\mathbf{w}_i = (w^1_i, \ldots, w^K_i)$ is the best configuration of weights associated with the K methods.

In order to generate each example $e_i \in E$, we simulate the forecasting of a series $Z_i$ using the K methods and then compute the configuration of weights that would have obtained the best forecasting result if it had been used to combine the forecasts provided by the K methods. For this, the following tasks have to be performed.

First, given a sample series $Z_i$ ($i = 1, \ldots, n$), its data is divided into two parts: a fit period $Z_{i,t}$ ($t = 1, \ldots, T_i$) and a forecasting period $Z_{i,t}$ ($t = T_i+1, \ldots, T_i+H$). The fit period corresponds to the data available at time $T_i$, used to calibrate the K methods. The forecasting period in turn corresponds to the H observations to be forecasted by the K calibrated methods. Hence, for each series $Z_i$, this task results in the forecasts $\hat{Z}^k_{i,t}$ ($k = 1, \ldots, K$; $t = T_i+1, \ldots, T_i+H$) individually provided by the K calibrated methods for the forecasting period of the series.
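A minimal sketch of this first task, assuming numpy, and using the Random Walk (whose forecast simply repeats the last observed value) as one of the K methods:

```python
import numpy as np

def split_series(z, H):
    """Divide a sample series into a fit period (used to calibrate the
    methods) and a forecasting period of H observations."""
    z = np.asarray(z, dtype=float)
    return z[:-H], z[-H:]

# e.g. the Random Walk forecast for the forecasting period
fit, hold_out = split_series([3.0, 3.2, 3.1, 3.4, 3.6, 3.5], H=2)
rw_forecast = np.full(2, fit[-1])   # repeats the last value of the fit period
```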

In the second task, we define the combining weights $(w^1_i, \ldots, w^K_i)$ that minimize a chosen forecasting error measure E, computed for the combined forecasts $\hat{Z}^C_{i,t}$ ($t = T_i+1, \ldots, T_i+H$). The measure E may be, for example, the Mean Absolute Error or the Mean Squared Error. The task of defining the best combining weights can be formulated as an optimization problem as follows:

Minimize:
$$E(\hat{Z}^C_{i,t}) = E\left(\sum_{k=1}^{K} w^k_i \hat{Z}^k_{i,t}\right) \qquad (3)$$

Subject to:
$$\sum_{k=1}^{K} w^k_i = 1 \quad \text{and} \quad w^k_i \geq 0 \qquad (4)$$

Different optimization methods may be used to solve this problem, depending on the characteristics of the measure E to be minimized. For each series $Z_i$, this optimization process results in the weights $(w^1_i, \ldots, w^K_i)$ that would minimize the forecasting error E if they were used to combine the K methods on that series.

Finally, in the third task, the features $(x^1_i, \ldots, x^p_i)$ are extracted to describe the fit period of the series (as defined in the FE module). These features and the optimized weights $(w^1_i, \ldots, w^K_i)$ are stored in the DB as a new example.

3.3 The Intelligent Combiner

The Intelligent Combiner (IC) module implements a supervised learning algorithm that acquires knowledge from the set of training examples E. Given the set E, the algorithm is used to build a learner, which is a regression model $L: X \to [0,1]^K$ that receives as input a time series description and predicts the best configuration of combining weights, considering the error criterion E. The final output of the system for a time series described as $\mathbf{x} = (x_1, \ldots, x_p)$ is then:

$$\hat{\mathbf{w}} = (\hat{w}^1, \ldots, \hat{w}^K) = L(\mathbf{x}) \qquad (5)$$

The algorithms that can be used to generate the learner L may be, for instance, neural network models, algorithms for the induction of regression trees or the k-nearest neighbour algorithm, among others.
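The paper leaves the choice of optimizer for problem (3)-(4) open (the prototype in Section 4 uses a Matlab line search); the learner L of eq. (5) is then trained on the resulting (features, weights) pairs. As one illustration, the simplex-constrained problem could be solved with a generic constrained optimizer. Below is a sketch assuming numpy and scipy, with the MAE as the measure E:

```python
import numpy as np
from scipy.optimize import minimize

def best_weights(individual_forecasts, actuals):
    """Solve (3)-(4): the simplex-constrained weights that minimize the
    error of the combined forecasts on the forecasting period.

    individual_forecasts -- array of shape (K, H); actuals -- shape (H,)
    """
    F = np.asarray(individual_forecasts, dtype=float)
    y = np.asarray(actuals, dtype=float)
    K = F.shape[0]

    def mae(w):                                  # the error measure E
        return float(np.mean(np.abs(y - w @ F)))

    res = minimize(
        mae,
        x0=np.full(K, 1.0 / K),                  # start at the SA weights
        bounds=[(0.0, 1.0)] * K,                 # w_k >= 0
        constraints=[{"type": "eq",              # sum_k w_k = 1
                      "fun": lambda w: np.sum(w) - 1.0}],
        method="SLSQP",
    )
    return res.x
```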

4 The Implemented Prototype

In order to verify the viability of the proposal, a prototype was implemented for defining the combining weights of K = 2 methods: the Random Walk (RW) and the Auto-Regressive model (AR) [5]. The prototype was applied to forecast the yearly series of the M3-Competition [10], which provides a large set of time series related to economic and demographic domains. In the next subsections, we provide the implementation details as well as the experiments that evaluated the prototype. A short presentation of the implemented prototype can also be found in [11].

4.1 The Feature Extractor

In this module, the following features were used to describe the yearly series:

1. Length of the time series (L): number of observations of the series;
2. Basic Trend (BT): slope of the linear regression model;
3. Percentage of Turning Points (TP): $Z_t$ is a turning point if $Z_{t-1} < Z_t > Z_{t+1}$ or $Z_{t-1} > Z_t < Z_{t+1}$. This feature measures the oscillation in a series;
4. First Coefficient of Autocorrelation (AC): large values of this feature suggest that the value of the series at a point influences the value at the next point;
5. Type of the time series (TYPE): represented by 5 categories: micro, macro, industry, finance and demographic.

The first four features are computed directly from the series data (see the sketch following this subsection and the next); TYPE in turn is contextual information provided by the authors of the M3-Competition.

4.2 The Database

In this case study, we used the 645 yearly series of the M3-Competition to generate the set of examples. Each time series was used to generate a different example, as defined in Section 3.2. First, given a series $Z_i$, the last H = 6 years of the series were defined as the forecasting period and the remaining data of the series was defined as the fit period (as suggested in the M3-Competition). After calibration, the RW and AR models provided their individual forecasts $\hat{Z}^k_{i,t}$ ($k = 1, 2$; $t = T_i+1, \ldots, T_i+6$).

Second, we defined the combining weights $w^k_i$ ($k = 1, 2$) that minimized the Mean Absolute Error (MAE) of the combined forecasts $\hat{Z}^C_{i,t}$ ($t = T_i+1, \ldots, T_i+6$). This task was formulated as the optimization problem:

Minimize:
$$\text{MAE}(\hat{Z}^C_{i,t}) = \frac{1}{6} \sum_{t=T_i+1}^{T_i+6} \left| Z_{i,t} - \hat{Z}^C_{i,t} \right| = \frac{1}{6} \sum_{t=T_i+1}^{T_i+6} \left| Z_{i,t} - \sum_{k=1}^{2} w^k_i \hat{Z}^k_{i,t} \right| \qquad (6)$$

Subject to:
$$\sum_{k=1}^{2} w^k_i = 1 \quad \text{and} \quad w^k_i \geq 0 \qquad (7)$$

This optimization problem was treated using a line search algorithm implemented in the Optimization Toolbox for Matlab [14]. Finally, the example associated with the series $Z_i$ is composed of the values of the five features (defined in the FE module, Section 4.1), computed on the fit data, and the optimum weights $\mathbf{w}_i = (w^1_i, w^2_i)$.
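Here is a sketch, assuming numpy, of how the four numeric features of Section 4.1 could be computed. The paper does not give exact formulas, so the normalization of TP and the estimator used for AC are assumptions:

```python
import numpy as np

def series_features(z):
    """Compute L, BT, TP and AC (Section 4.1) for the fit period of a series."""
    z = np.asarray(z, dtype=float)
    t = np.arange(len(z))

    L = len(z)                                   # 1. length of the series
    BT = np.polyfit(t, z, 1)[0]                  # 2. basic trend: regression slope

    local_max = z[1:-1] > np.maximum(z[:-2], z[2:])
    local_min = z[1:-1] < np.minimum(z[:-2], z[2:])
    TP = np.mean(local_max | local_min)          # 3. fraction of turning points

    zc = z - z.mean()
    AC = (zc[:-1] @ zc[1:]) / (zc @ zc)          # 4. first autocorrelation

    return L, BT, TP, AC
```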

4.3 The Intelligent Combiner

In the implemented prototype, the IC module uses a Multi-Layer Perceptron (MLP) network [4] with one hidden layer as the learner. The MLP input layer has 9 units that represent the 5 time series features described in the FE module. The first four input units receive the values of the numeric features (i.e., L, BT, TP, AC). The feature TYPE was represented by 5 binary attributes (with value either 1 or 0), each one associated with a different category of series (see Fig. 3). The output layer has two nodes that represent the weights associated with the RW and AR models. The output nodes use sigmoid functions, which ensures that the predicted weights are non-negative. In order to ensure that the predicted weights sum to one (see eq. 2), the outputs of the MLP were normalized. Formally, let $O_1$ and $O_2$ be the values provided as output for a given time series description $\mathbf{x}$. The predicted weights $\hat{w}_1$ and $\hat{w}_2$ are defined as:

$$\hat{w}_k = \frac{O_k}{\sum_{l=1}^{2} O_l} \quad (k = 1, 2) \qquad (8)$$

The MLP training was performed by the Backpropagation (BP) algorithm [4] and followed the benchmark training rules provided in [12]. The BP algorithm was implemented using the Neural Network Toolbox for Matlab [13].

Fig. 3. MLP used to define the combining weights for the RW and AR models (inputs: L, BT, TP, AC and the five binary TYPE attributes; outputs $O_1$ and $O_2$, normalized as in eq. 8)
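The input encoding of Fig. 3 and the output normalization of eq. (8) can be sketched as follows (category names follow Section 4.1; the MLP itself, trained in the paper with the Matlab toolbox, is omitted here):

```python
import numpy as np

TYPE_CATEGORIES = ["micro", "macro", "industry", "finance", "demographic"]

def encode_input(L, BT, TP, AC, series_type):
    """Build the 9-dimensional MLP input of Fig. 3: the four numeric
    features plus a 1-of-5 binary encoding of TYPE."""
    one_hot = [1.0 if series_type == c else 0.0 for c in TYPE_CATEGORIES]
    return np.array([L, BT, TP, AC] + one_hot)

def normalize_outputs(o1, o2):
    """Eq. (8): rescale the two sigmoid outputs so that the predicted
    weights are non-negative and sum to one."""
    o = np.array([o1, o2], dtype=float)
    return o / o.sum()
```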

4.4 Experiments and Results

In the performed experiments, the set of 645 examples in the DB was equally divided into training, validation and test sets. We trained the MLP using 2, 4, 6, 8 and 10 nodes in the hidden layer (30 runs for each value). The optimum number of hidden nodes was chosen as the value that obtained the lowest average SSE error on the validation set over the 30 runs. Table 1 summarizes the MLP results. As can be seen, the optimum number of nodes according to the validation error was 8. The gain obtained by this value was also observed in the test set.

Table 1. Training results: average and deviation of the SSE on the training, validation and test sets, for each number of hidden nodes (2, 4, 6, 8 and 10)

We further investigated the quality of the combined forecasts generated by using the weights predicted by the selected MLP (with 8 hidden nodes). Let TEST be the set of 215 series that were used to generate the test set of the above experiments, and let $\hat{w}^1_i$ and $\hat{w}^2_i$ be the weights predicted for a series $Z_i \in \text{TEST}$. The forecasting error produced by the combination of methods at time t is:

$$e^C_{i,t} = Z_{i,t} - \hat{Z}^C_{i,t} = Z_{i,t} - \sum_{k=1}^{2} \hat{w}^k_i \hat{Z}^k_{i,t} \qquad (9)$$

In order to evaluate these errors across all series $Z_i \in \text{TEST}$ and all forecasting points $t \in \{T_i+1, \ldots, T_i+6\}$, we considered the Percentage Better (PB) measure [15]. Given a reference method R that serves for comparison, the PB measure is computed as follows:

$$\text{PB}_R = \frac{100}{m} \sum_{Z_i \in \text{TEST}} \; \sum_{t=T_i+1}^{T_i+6} \delta_{i,t} \qquad (10)$$

where

$$\delta_{i,t} = \begin{cases} 1, & \text{if } |e^R_{i,t}| < |e^C_{i,t}| \\ 0, & \text{otherwise} \end{cases} \qquad (11)$$

In the above definition, $e^R_{i,t}$ is the forecasting error obtained by the method R on the i-th series at forecasting time t, and m is the number of cases in which $e^R_{i,t} \neq e^C_{i,t}$. Hence, $\text{PB}_R$ indicates, in percentage terms, the number of times that the error obtained by the method R was lower than the error obtained using the combined forecasts. Values lower than 50 therefore indicate that the combined forecasts are more accurate than the forecasts obtained by the reference method.
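A sketch of the PB computation of equations (10)-(11), assuming numpy and that the errors are compared in absolute value:

```python
import numpy as np

def percentage_better(ref_errors, comb_errors):
    """PB measure (eqs. 10-11): percentage of forecasts for which the
    reference method's absolute error is strictly lower than that of the
    combined forecasts, over the m cases where the two errors differ."""
    ref = np.abs(np.asarray(ref_errors, dtype=float)).ravel()
    comb = np.abs(np.asarray(comb_errors, dtype=float)).ravel()
    differ = ref != comb                  # the m cases with e_R != e_C
    wins = (ref < comb) & differ
    return 100.0 * wins.sum() / differ.sum()
```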

The PB measure was computed for three reference methods. The first is simply to use RW to forecast all series, and the second is to use AR for all series. The third reference method is the Simple Average (SA). Table 2 summarizes the results over the 30 runs of the best MLP. The average PB measure was lower than 50% for all reference methods, which indicates that the combined forecasts were more accurate. The confidence intervals suggest that the obtained gain is statistically significant.

Table 2. Comparative forecasting performance measured by PB

Reference Method    PB Conf. Interv. (95%)
RW                  [42.11; 42.29]
AR                  [40.04; 40.36]
SA                  [42.72; 43.76]

5 Related Work

The proposed solution is closely related to previous work that used machine learning to select forecasting methods [3][16][17][18][19]. In the selection approach, time series features are used by a learning technique to predict the best method for forecasting among a set of candidates. In the solution presented here, the learner uses the features to define the best combination of methods. Our approach is more general, since the selection problem can be seen as a special case of combination where $w_k = 1$ if k is the best method and 0 otherwise.

In a previous work [18], we adopted the selection approach and applied an MLP network to select between the RW and AR models. Experiments were performed on the same 645 yearly series and with the same forecasting period adopted in the present work. Table 3 shows the PB value computed for the combination approach, using the selection approach developed in [18] as reference. As can be seen, the combination procedure obtained a performance gain compared to the simple selection approach adopted in the previous work.

Table 3. Forecasting performance (PB) of the combined forecasts, using the selection approach as the basis for comparison

Reference Method            PB Conf. Interv. (95%)
Selection Approach (MLP)    [48.01; 48.54]

6 Conclusion

In this work, we proposed the use of machine learning to define linear weights for combining forecasts. In order to evaluate the proposal, we applied an MLP network to support the combination of two forecasting methods. The experiments performed revealed a significant gain in accuracy compared to benchmarking forecasting procedures. Modifications of the current implementation may be considered, such as augmenting the set of features, optimizing the MLP design and considering new forecasting error measures to be minimized.

Although the focus of our work is forecasting, the proposed method can also be adapted to other situations, for example, to combine classifiers. In this context, instead of time series features, characteristics of classification tasks should be considered (such as the number of examples). Our solution then becomes related to the Meta-Learning field [20], which studies how the performance of learning algorithms can be improved through experience. The use of the proposed solution to combine classifiers will be investigated in future work.

References

1. Hibon, M., Evgeniou, T.: To combine or not to combine: selecting among forecasts and their combinations. International Journal of Forecasting 21(1) (2004)
2. Adya, M., Armstrong, J.S., Collopy, F., Kennedy, M.: Automatic identification of time series features for rule-based forecasting. International Journal of Forecasting 17(2) (2001)
3. Arinze, B.: Selecting appropriate forecasting models using rule induction. Omega - International Journal of Management Science 22(6) (1994)
4. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. Nature 323 (1986)
5. Harvey, A.: Time Series Models. MIT Press, Cambridge, MA (1993)
6. De Menezes, L., Bunn, D., Taylor, J.: Review of guidelines for the use of combined forecasts. European Journal of Operational Research 120 (2000)
7. Armstrong, J.: Findings from evidence-based forecasting: methods for reducing forecast error (2005). Available online; accessed on March 20.
8. Granger, C.W.J., Ramanathan, R.: Improved methods of combining forecasts. Journal of Forecasting 3 (1984)
9. Asku, C., Gunter, S.I.: An empirical analysis of the accuracy of SA, OLS, ERLS and NRLS combination of forecasts. International Journal of Forecasting 8 (1992)
10. Makridakis, S., Hibon, M.: The M3-Competition: results, conclusions and implications. International Journal of Forecasting 16(4) (2000)
11. Prudêncio, R., Ludermir, T.B.: Using machine learning techniques to combine forecasting methods. Lecture Notes in Artificial Intelligence 3339 (2004)
12. Prechelt, L.: Proben1: a set of neural network benchmark problems and benchmarking rules. Tech. Rep. 21/94, Fakultät für Informatik, Universität Karlsruhe (1994)
13. Demuth, H., Beale, M.: Neural Network Toolbox for Use with Matlab. The MathWorks Inc. (2003)
14. The MathWorks: Optimization Toolbox User's Guide. The MathWorks Inc. (2003)
15. Flores, B.E.: Use of the sign test to supplement the percentage better statistic. International Journal of Forecasting 2 (1986)
16. Chu, C.-H., Widjaja, D.: Neural network system for forecasting method selection. Decision Support Systems 12(1) (1994)
17. Venkatachalan, A.R., Sohl, J.E.: An intelligent model selection and forecasting system. Journal of Forecasting 18 (1999)
18. Prudêncio, R., Ludermir, T.B.: Meta-learning approaches for selecting time series models. Neurocomputing 61(C) (2004)
19. Prudêncio, R., Ludermir, T.B., De Carvalho, F.: A modal symbolic classifier for selecting time series models. Pattern Recognition Letters 25(8) (2004)
20. Giraud-Carrier, C., Vilalta, R., Brazdil, P.: Introduction to the special issue on meta-learning. Machine Learning 54 (2004)
