Generalized Models Based on Neural Networks and Multiple Linear Regression


Proceedings of the 5th WSEAS Int. Conf. on CIRCUITS, SYSTEMS, ELECTRONICS, CONTROL & SIGNAL PROCESSING, Dallas, USA, November 1-3, 2006, 279

Generalized Models Based on Neural Networks and Multiple Linear Regression

PERO RADONJA 1, SRDJAN STANKOVIC 2, DRAGANA DRAZIC 1 and BRATISLAV MATOVIC 1
1 Institute of Forestry, 113 Kneza Viseslava 3, Belgrade
2 Faculty of Electrical Engineering, University of Belgrade, Belgrade
SERBIA
radonjap@eunet.yu stankovic@etf.bg.ac.yu inszasum@eunet.yu

Abstract - The developed generalized models are based on neural networks and on linear and multiple linear regression. The generalized (regional) models are applied to the estimation of the most important general (common) characteristic of the whole region. The regional models are also tested against real, referent data by regression analysis. The obtained correlation coefficients between the referent data and the corresponding data computed using the regional models based on linear and multiple linear regression are very high, .9971 and .9955 respectively. When neural networks are applied, the correlation coefficient is the largest, .9985.

Key Words: - generalized model, regional model, multiple linear regression, neural networks, Vapnik-Chervonenkis dimension

1 Introduction

Regional (generalized) process models can be obtained by introducing limits on the parameters of general process models. The process considered here is the biological process of growth. A region can contain many thousands of individual objects. To obtain information about a region, it is necessary to take measurements in several (r) samples with many (N) objects each. The total number of objects in all samples, r×N, must encompass the complete variability of the region. In our case there were 156 objects in all samples. Detailed measurement of all objects is expensive and requires a lot of time.
Because of this, we try to find a regional model that makes it possible to obtain the important characteristics of the entire region without detailed measurements of all objects. It is known that neural networks (NN) ensure a smaller modeling error than classical methods, which makes them very suitable for generating both specific and regional models [1, 2, 3].

*Research is supported by the Ministry of Science and Environmental Protection of Serbia, as part of the project: EE - 27315.B

2 Problem Definition

The classical approach to modeling this process is based on the modified Brink's function [4]. The canonical form of this function is:

y(x) = u + v·e^(−px) − w·e^(qx)   ...(1)

Note that y(x), eq. (1), represents the considered process very well [4, 5, 6]. The parameters u, v and w are functions of the basic measured values D and H and of the parameters i, p and q. For every object in the region [7, 8] it is possible to determine its specific model, that is, its parameters i, p and q, by an optimization procedure [9]. Besides the basic measured values D and H, the optimization procedure requires at least 3 sets of new measured data. In our case we used 14 data sets for every object. The main goal is to obtain regional models such that any specific model, or any important regional characteristic, can be derived from the basic measured values D and H alone. For example, an important regional characteristic is the sum of the volumes of all objects. The volumes of the objects of the whole region matter for the sustainable management of the considered ecological system [10, 11, 12].

3 Specific Models Based on Neural Networks

In the considered case NN ensure a smaller modeling error than classical methods based on polynomial or exponential functions, or even on the modified Brink's function. Because of that, in our case, NN are very suitable tools for generating the specific models and, after that, the generalized model. It is known that to achieve good performance both the bias and the variance of the model should be small. In order to avoid overfitting and to obtain a small confidence interval it is necessary to use an NN with a low Vapnik-Chervonenkis (VC) dimension [1, 2, 3]. In our case the VC dimension is determined simply by the number of free model parameters, that is, by the number of tansig neurons in the second (hidden) layer; tansig neurons have a hyperbolic tangent sigmoid transfer function. Note that a decrease of the VC dimension comes at the expense of an increase of the training error. The method of structural risk minimization (SRM) provides a systematic procedure for achieving the best compromise between the training error and the confidence interval by controlling the VC dimension [1, 2]. The goal is to find a network structure such that a decrease of the VC dimension is obtained at the smallest possible increase in the training error. In our case, we monotonically decrease the number of tansig neurons in the hidden layer [13]. The process of learning is shown in Fig.1; the Levenberg-Marquardt algorithm is used because of its good convergence properties [14, 15, 16, 17].

Fig.1 Learning process of NN

In Fig.2, the normalized specific model is presented.

Fig.2 Normalized specific model

The normalization is evidently performed using the largest values of the x and y coordinates. The errors of modeling are shown in Fig.3.

Fig.3 Errors of modeling

4 Generalized Model Obtained by Using NN

The generalized model for the considered region, that is, the regional model, is obtained as the mean value of all normalized specific models from all samples. The obtained model, with normalized y and x coordinates, is presented in Fig.4.

Fig.4 The generalized (regional) model
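The SRM procedure of Section 3 — monotonically shrinking the tansig hidden layer and monitoring the training error — can be sketched with a one-hidden-layer tanh network. This is a simplified sketch under stated assumptions: plain gradient descent stands in for the Levenberg-Marquardt algorithm, and the target function is synthetic, not the paper's data.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_mlp(x, y, hidden, epochs=2000, lr=0.05):
    """One-hidden-layer tanh ('tansig') network trained by plain gradient
    descent (the paper uses Levenberg-Marquardt; GD keeps the sketch short).
    Returns the final training MSE."""
    W1 = rng.normal(0.0, 0.5, (hidden, 1)); b1 = np.zeros((hidden, 1))
    W2 = rng.normal(0.0, 0.5, (1, hidden)); b2 = np.zeros((1, 1))
    X, Y = x.reshape(1, -1), y.reshape(1, -1)
    n = X.shape[1]
    for _ in range(epochs):
        Hid = np.tanh(W1 @ X + b1)           # hidden-layer activations
        out = W2 @ Hid + b2                  # linear output layer
        err = out - Y
        gW2 = err @ Hid.T / n; gb2 = err.mean(axis=1, keepdims=True)
        dH = (W2.T @ err) * (1.0 - Hid**2)   # backprop through tanh
        gW1 = dH @ X.T / n;  gb1 = dH.mean(axis=1, keepdims=True)
        W2 -= lr * gW2; b2 -= lr * gb2; W1 -= lr * gW1; b1 -= lr * gb1
    return float(np.mean((out - Y) ** 2))

x = np.linspace(0.0, 1.0, 50)
y = np.exp(-3.0 * x) * np.sin(4.0 * x)       # synthetic stand-in profile
# SRM-style sweep: decrease the number of tansig neurons (the VC proxy)
# and record the training error; the chosen structure is the smallest one
# whose training-error increase is still acceptable.
errors = {h: train_mlp(x, y, h) for h in (8, 4, 2)}
```

The sweep makes the bias-variance compromise explicit: each halving of the hidden layer lowers the VC dimension at the cost of some training error.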

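The parameter regressions that follow all refer back to the canonical form (1). A minimal numeric sketch, assuming the form y(x) = u + v·e^(−px) − w·e^(qx) with placeholder parameter values (in the paper, u, v and w are derived from D, H and from i, p, q; the numbers below are illustrative only):

```python
import numpy as np

def brink_profile(x, u, v, w, p, q):
    """Canonical modified Brink's function, eq. (1):
    y(x) = u + v*exp(-p*x) - w*exp(q*x)."""
    x = np.asarray(x, dtype=float)
    return u + v * np.exp(-p * x) - w * np.exp(q * x)

# Placeholder parameters, not fitted values from the paper.
x = np.linspace(0.0, 1.0, 11)   # normalized position along the object
y = brink_profile(x, u=1.0, v=0.4, w=0.1, p=6.0, q=2.0)
```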
Now any specific model can be obtained from the generalized model presented in Fig.4 by renormalizing its y and x coordinates with D and H. The quality and performance of the obtained generalized model will be analyzed in the subsequent discussion.

5 Generalized Model Based on Linear Regression

Alternatively to Fig.4, the regional model can be a set of equations that determine the parameters i, p and q from the basic measured values D and H. In order to analyze the most important characteristics of the considered process and obtain the regional model, in the first step the parameters i, p and q are found for all objects by an optimization procedure [9]. In Fig.5 parameter i is plotted versus H.

Fig.5 Parameter i versus H (S = 1.8448854, r = .9549874)

It can be seen that a large correlation, .9541, exists between parameter i and H. This means it is possible to compute i from H alone. In the next figure, Fig.6, as an additional illustration, parameter i is plotted versus D.

Fig.6 Parameter i versus D (S = .425637, r = .997589)

Now it can be seen that an even larger correlation coefficient, r = .9976, exists between i and D. Because of this, parameter i will be computed based on Fig.6, that is, on eq. (2):

i = .32733 + .46132·D   ...(2)

Parameter q versus D is presented in Fig.7.

Fig.7 Parameter q versus D (S = .4281, r = .43277)

It can be seen that the correlation coefficient is only .4328. Unfortunately, the correlation coefficient with respect to H is even smaller, .398. Because of that, the following equation will be used:

q = .12661 − .155·D   ...(3)

Finally, the regression lines for parameter p versus D and H will be analyzed. As in the previous case, the r of the regression with respect to H, .4336, is smaller than with respect to D, .5129. Because of that, parameter p versus D is presented in Fig.8.

Fig.8 Parameter p versus D (S = 1.2416745, r = .51287987)

Also, it is better to use a quadratic fit than a linear one, and consequently the following will be used for computing (estimating) parameter p:

p = 4.23311 − .9571·D + .78·D²   ...(4)

6 Multiple Linear Regression in Model Generating

Generally, multiple regression offers greater possibilities than simple linear regression. With reference to this fact, the application of multiple linear regression is analyzed as well. Based on the available data, three equations, for the parameters i, q and p, are obtained using multiple linear regression:

i = .18554 + .43549·D + .3728·H   ...(5)
q = .1244 − .21·D + .65·H   ...(6)
p = 3.9167 − .5883·D + .865·H   ...(7)

In the case of computing (estimating) parameter i by eq. (5), the equivalent correlation coefficient is .9977, as opposed to .9541 and .9976 (Figs. 5 and 6) in the case of simple linear regression. The standard error of modeling, S, is also smaller, .4175, compared with 1.8449 and .4256 in Figs. 5 and 6. As an illustration of the multiple linear regression, eq. (5), parameter i versus D and H is presented in Fig.A1, Appendix A.

Multiple regression also offers better results with respect to r in the case of estimating parameter q: r has the value .4348, compared with .398 and .4328 in the case of simple linear regression with respect to H and D respectively. Parameter q versus D and H is presented in Fig.9.

Fig.9 Parameter q versus D and H

In the case of parameter p estimation by eq. (7), the equivalent correlation coefficient is .451, as opposed to .5129 and .4336 in the case of simple linear regression. An illustration of the application of multiple linear regression for the estimation of parameter p, eq. (7), is presented in Fig.A2, Appendix B.

7 Testing the Obtained Generalized (Regional) Models

The usability of the regional models will be tested by computing correlation coefficients, that is, by applying the regression analysis technique. Very important general (common) characteristics of the region are the volumes of all objects in the region and the sum of all volumes. The volume of every object depends on the basic measured values D and H and on the regional model. The model can be the one obtained by NN (Fig.4), or based on linear regression, eqs. (2), (3) and (4), or on multiple linear regression, eqs. (5), (6) and (7).
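The fits behind eqs. (2), (4) and (5)-(7) are ordinary least squares. A minimal sketch on synthetic (D, H) data — the generating coefficients below are illustrative assumptions, not the paper's fitted values:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic measurements standing in for the basic measured values D, H.
D = rng.uniform(10.0, 50.0, 200)
H = rng.uniform(5.0, 35.0, 200)
i_obs = 0.3 + 0.46 * D + rng.normal(0.0, 0.3, 200)
p_obs = 4.2 - 0.9 * D + 0.08 * D**2 + rng.normal(0.0, 1.0, 200)

# Simple linear fit, the form of eq. (2): i = a + b*D.
b1, a1 = np.polyfit(D, i_obs, 1)
r_iD = np.corrcoef(D, i_obs)[0, 1]            # correlation coefficient r

# Quadratic fit, the form of eq. (4): p = c0 + c1*D + c2*D^2.
c2, c1, c0 = np.polyfit(D, p_obs, 2)

# Multiple linear fit, the form of eqs. (5)-(7): i = k0 + k1*D + k2*H.
A = np.column_stack([np.ones_like(D), D, H])
k, *_ = np.linalg.lstsq(A, i_obs, rcond=None)
```

Comparing r_iD with the corresponding coefficient from the multiple fit mirrors the paper's comparison of simple versus multiple regression for each parameter.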

Note that it is possible to compute the real (referent) volumes, Vref, of all objects from the considered samples using the known parameters i, p and q (Figs. 5 to 8) [4, 9]. On the other side, we can compute the volumes of the same objects, Vm, using the regional models. Testing the accuracy of the regional models is practically performed by computing the correlation coefficients between the referent volumes Vref and the volumes Vm. The result of testing in the case of the application of linear regression is presented in Fig.10.

Fig.10 Testing of the application of linear regression based only on the values of D (S = .6744, r = .9976)

Obviously, in the ideal case the regression line in Fig.10 must have an angle of 45° and has to start from the origin (0,0). In Fig.11, the result of testing the model based on multiple linear regression is presented.

Fig.11 Testing of the regional model based on multiple linear regression (S = .827, r = .9955)

Unexpectedly, the application of multiple linear regression gives a worse result, tan α = .9591 with starting point .317, whereas the application of simple linear regression gives .9678 and .213. The values of S and r are also worse in the case of multiple linear regression, .827 and .9955, compared with .674 and .9971 in the case of simple linear regression. Testing the accuracy of the regional model based on NN shows its advantage even over the model based on linear regression: indeed, S and r have the values .371 and .9985, compared with .674 and .9971. However, tan α has the value .9454, which is worse than .9678, but the starting point, .53, is better compared with .213.
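The testing procedure of this section — regressing the model volumes Vm on the referent volumes Vref and checking the slope against tan 45° = 1, the intercept against 0 and r against 1 — can be sketched as follows; the synthetic volumes below are illustrative, not the paper's data:

```python
import numpy as np

def agreement_stats(v_ref, v_m):
    """Fit v_m = intercept + slope*v_ref; an ideal regional model gives
    slope tan(45 deg) = 1, intercept 0 and correlation r = 1."""
    slope, intercept = np.polyfit(v_ref, v_m, 1)
    r = np.corrcoef(v_ref, v_m)[0, 1]
    resid = v_m - (intercept + slope * v_ref)
    s = float(np.sqrt(np.sum(resid**2) / (len(v_ref) - 2)))  # standard error S
    return slope, intercept, r, s

# Synthetic referent volumes and a model that slightly underestimates them.
rng = np.random.default_rng(3)
v_ref = rng.uniform(0.1, 4.0, 150)
v_m = 0.02 + 0.96 * v_ref + rng.normal(0.0, 0.05, 150)
slope, intercept, r, s = agreement_stats(v_ref, v_m)
```

A high r alone does not guarantee agreement; as in Figs. 10-11, the slope and intercept reveal systematic over- or underestimation that r can hide.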
8 Conclusion

It is shown that the obtained regional models can be used very successfully in the process of computing (estimating) the most important regional characteristics. The volumes of all objects based on the regional models are tested against the referent volumes, and very high correlation coefficients are obtained. It is also shown that the application of NN ensures a better regional model than those obtained by linear and multiple linear regression.

References:
[1] Haykin, S. (1994): Neural Networks: A Comprehensive Foundation, Macmillan College Publishing Company, New York, 1994.
[2] Vapnik, Vladimir N. (1999): An Overview of Statistical Learning Theory, IEEE Transactions on Neural Networks, Vol.10, No.5, Sept. 1999.
[3] Radonja, Pero, Srdjan Stankovic and Zoran Popovic (2004): Specific Process Models Obtained From General Ones, WSEAS TRANSACTIONS on SYSTEMS, Issue 7, Volume 3, September 2004, Pub. by WSEAS Press, Athens, pp. 2490-2495.

[4] Riemer, T., von Gadow, K., Sloboda, B. (1995): Ein Modell zur Beschreibung von Baumschaften, Allg. Forst Jagdztg 166(7): 144-147.
[5] Hui, G.Y., von Gadow, K. (1997): Entwicklung und Erprobung eines Einheitsschaftmodells fuer die Baumart Cunninghamia lanceolata, Forstw. Cbl. 116: 315-321.
[6] Korol, M., von Gadow, K. (2003): Ein Einheitsschaftmodell fuer die Baumart Fichte, Forstw. Cbl. 122: 1-8.
[7] Matovic, B. (2005): Normal state in spruce-fir forest goals and problems in management in Zlatar, MSc thesis, Faculty of Forestry, Belgrade (in Serbian).
[8] Maunaga, Z. (1995): Productivity and structural characteristics of same-age stands of spruce in Republika Srpska, PhD thesis, Faculty of Forestry, Belgrade (in Serbian).
[9] Radonja, P., Koprivica, M., Matovic, B. (2005): Modelling stem profile and volume by using the modified Brink's function, Forestry, No. 4, Vol. (LVII): 1-1 (in Serbian).
[10] Lek, S., A. Belaud, P. Baran, I. Dimopoulos and M. Delacoste (1996): Application of neural networks to modelling nonlinear relationships in ecology, Ecol. Model., vol. 90, pp. 39-52.
[11] Roadknight, C. M., G. R. Balls, G. E. Mills, and D. Palmer-Brown (1997): Modeling complex environmental data, IEEE Trans. Neural Networks, Vol. 8, pp. 852-862.
[12] Georgina Stegmayer, Marko Pirola, Giancarlo Orengo and Omar Chiotti: Towards a Volterra series representation from a Neural Network model, Proceedings of the 5th WSEAS Int. Conf. on Neural Networks and Applications, WSEAS NNA 2004, Udine, Italy, March 25-27, 2004.
[13] Beale, Mark (1993): Neural Network Toolbox, Version 6...88, Release 12, September 22, 2000, MATLAB 6 R12.
[14] Radonja, J.P.
(2000): Radial Basis Function Neural Networks in Tracking and Extraction of Stochastic Process in Forestry, Proceedings of the 5th Seminar on Neural Networks Application in Electrical Engineering, NEUREL2000, September 25-27, 2000, IEEE and Academic Mind, Belgrade, Yug., pp. 81-86.
[15] Radonja, J.P., Stankovic, S.S. and Cukanovic, Nj.R. (2000): Multilayer neural networks in process of height curve fitting, INFO science 3/2000, Savpo, Belgrade, pp. 22-26.
[16] Radonja, Pero, and Stankovic, Srdjan (2002): Modeling of a highly nonlinear stochastic process by neural networks, Recent Advances in Computers, Computing and Communications, Editors: N. Mastorakis and V. Mladenov, Published by WSEAS Press.
[17] Radonja, Pero, and Stankovic, Srdjan: Neural Network Models Based on Small Data Sets, Proceedings of the 6th Seminar on Neural Networks Application in Electrical Engineering, NEUREL2002, Editors: B. Reljin and S. Stankovic, September 25-27, Published by: Academic Mind and Faculty of Electrical Engineering, Belgrade, 2002, pp. 11-16.

Appendix A

Fig.A1 Parameter i versus the D and H

Appendix B

Fig.A2 Parameter p versus the D and H