MODEL TUNING WITH THE USE OF HEURISTIC-FREE GMDH (GROUP METHOD OF DATA HANDLING) NETWORKS

Similar documents
Neural network-based athletics performance prediction optimization model applied research

Research on Complex Networks Control Based on Fuzzy Integral Sliding Theory

Image Classification Using EM And JE algorithms

MARKOV CHAIN AND HIDDEN MARKOV MODEL

Predicting Model of Traffic Volume Based on Grey-Markov

Cyclic Codes BCH Codes

A finite difference method for heat equation in the unbounded domain

Associative Memories

[WAVES] 1. Waves and wave forces. Definition of waves

Supplementary Material: Learning Structured Weight Uncertainty in Bayesian Neural Networks

The line method combined with spectral chebyshev for space-time fractional diffusion equation

Lower Bounding Procedures for the Single Allocation Hub Location Problem

Lecture Notes on Linear Regression

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

Lower bounds for the Crossing Number of the Cartesian Product of a Vertex-transitive Graph with a Cycle

Development of whole CORe Thermal Hydraulic analysis code CORTH Pan JunJie, Tang QiFen, Chai XiaoMing, Lu Wei, Liu Dong

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

QUARTERLY OF APPLIED MATHEMATICS

3. Stress-strain relationships of a composite layer

The Application of BP Neural Network principal component analysis in the Forecasting the Road Traffic Accident

EEE 241: Linear Systems

COXREG. Estimation (1)

Chapter 6. Rotations and Tensors

Example: Suppose we want to build a classifier that recognizes WebPages of graduate students.

Homework Assignment 3 Due in class, Thursday October 15

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.

Chapter 11: Simple Linear Regression and Correlation

Negative Binomial Regression

Generalized Linear Methods

NUMERICAL DIFFERENTIATION

Structure and Drive Paul A. Jensen Copyright July 20, 2003

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Nested case-control and case-cohort studies

Multispectral Remote Sensing Image Classification Algorithm Based on Rough Set Theory

Short-Term Load Forecasting for Electric Power Systems Using the PSO-SVR and FCM Clustering Techniques

Numerical Heat and Mass Transfer

Supporting Information

Chapter 13: Multiple Regression

SDMML HT MSc Problem Sheet 4

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6

A DIMENSION-REDUCTION METHOD FOR STOCHASTIC ANALYSIS SECOND-MOMENT ANALYSIS

Problem Set 9 Solutions

Supervised Learning. Neural Networks and Back-Propagation Learning. Credit Assignment Problem. Feedforward Network. Adaptive System.

MACHINE APPLIED MACHINE LEARNING LEARNING. Gaussian Mixture Regression

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis

n-step cycle inequalities: facets for continuous n-mixing set and strong cuts for multi-module capacitated lot-sizing problem

Linear Approximation with Regularization and Moving Least Squares

A General Column Generation Algorithm Applied to System Reliability Optimization Problems

VQ widely used in coding speech, image, and video

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method

Chapter 6. Supplemental Text Material

AGC Introduction

Research Article H Estimates for Discrete-Time Markovian Jump Linear Systems

Optimum Selection Combining for M-QAM on Fading Channels

Chapter 9: Statistical Inference and the Relationship between Two Variables

WAVELET-BASED IMAGE COMPRESSION USING SUPPORT VECTOR MACHINE LEARNING AND ENCODING TECHNIQUES

Quantum Runge-Lenz Vector and the Hydrogen Atom, the hidden SO(4) symmetry

Lecture 12: Discrete Laplacian

Correlation and Regression. Correlation 9.1. Correlation. Chapter 9

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

Greyworld White Balancing with Low Computation Cost for On- Board Video Capturing

Turbulence classification of load data by the frequency and severity of wind gusts. Oscar Moñux, DEWI GmbH Kevin Bleibler, DEWI GmbH

IV. Performance Optimization

APPENDIX A Some Linear Algebra

Lecture 10 Support Vector Machines II

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Linear Feature Engineering 11

we have E Y x t ( ( xl)) 1 ( xl), e a in I( Λ ) are as follows:

A Dissimilarity Measure Based on Singular Value and Its Application in Incremental Discounting

Lecture 6: Introduction to Linear Regression

Feature Selection: Part 1

x i1 =1 for all i (the constant ).

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law:

Chapter 3 Describing Data Using Numerical Measures

Week3, Chapter 4. Position and Displacement. Motion in Two Dimensions. Instantaneous Velocity. Average Velocity

Kernel Methods and SVMs Extension

NP-Completeness : Proofs

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

REAL-TIME IMPACT FORCE IDENTIFICATION OF CFRP LAMINATED PLATES USING SOUND WAVES

The optimal delay of the second test is therefore approximately 210 hours earlier than =2.

ON AUTOMATIC CONTINUITY OF DERIVATIONS FOR BANACH ALGEBRAS WITH INVOLUTION

Autonomous State Space Models for Recursive Signal Estimation Beyond Least Squares

Statistics for Economics & Business

On the Power Function of the Likelihood Ratio Test for MANOVA

x yi In chapter 14, we want to perform inference (i.e. calculate confidence intervals and perform tests of significance) in this setting.

Comparison of Regression Lines

REDUCTION OF CORRELATION COMPUTATION IN THE PERMUTATION OF THE FREQUENCY DOMAIN ICA BY SELECTING DOAS ESTIMATED IN SUBARRAYS

SIMPLE LINEAR REGRESSION

Boundary Value Problems. Lecture Objectives. Ch. 27

More metrics on cartesian products

ELASTIC WAVE PROPAGATION IN A CONTINUOUS MEDIUM

Errors for Linear Systems

A Robust Method for Calculating the Correlation Coefficient

Key words. corner singularities, energy-corrected finite element methods, optimal convergence rates, pollution effect, re-entrant corners

ECONOMICS 351*-A Mid-Term Exam -- Fall Term 2000 Page 1 of 13 pages. QUEEN'S UNIVERSITY AT KINGSTON Department of Economics

Sensitivity Analysis Using Neural Network for Estimating Aircraft Stability and Control Derivatives

STAT 511 FINAL EXAM NAME Spring 2001

Statistical Evaluation of WATFLOOD

Logistic Regression Maximum Likelihood Estimation

Transcription:

MODEL TUNING WITH THE USE OF HEURISTIC-FREE (GROUP METHOD OF DATA HANDLING) NETWORKS M.C. Schrver (), E.J.H. Kerchoffs (), P.J. Water (), K.D. Saman () () Rswaterstaat Drecte Zeeand () Deft Unversty of Technoogy Dpt. HMCZ Fac. of Informaton Technoogy and Systems, Dpt. Medamatcs Koestraat 30 Meeweg 4 4331 KX Mddeburg, the Netherands 68 CD Deft, the Netherands Ema:m.schrver(.d.saman)@dz.rws.mnvenw.n Ema: e..h.erchoffs(p..water)@ts.tudeft.n KEYWORDS Parameter tunng, North Sea mode, data mnng, dependency modeng, neura networs, Group Method of Data Handng (). ABSTRACT In ths paper the use of so-caed heurstc-free (Group Method of Data Handng) networs are consdered to tune an unnown parameter (the wnd stress coeffcent) n a mode used for predcton of the meteoroogca effect on the water eve of the North Sea. tres to fnd unnown reatonshps between the parameter concerned and other varabes, purey on the bass of measurement data (data mnng, dependency modeng). The resuts acheved,.e. the predcton capabtes of the based tuned mode, were found to be reasonaby comparabe wth those from the mode that had been prevousy fne-tuned through a years-astng tunng procedure; moreover, the method whch s generay appcabe provded more nsght nto the physcs behnd the mode concerned. meteoroogca effects on the water eve. Heurstc-free s apped to revea the unnown reatonshps between the wnd stress coeffcent and other physca varabes such as wave energy, wnd and pressure. The found reatons were ncuded n the mode. New forecasts wth the modfed mode were made and compared wth forecasts made by the orgna mode. In the orgna mode wnd stress coeffcents have been determned va a prevous ong astng fne-tunng process. Athough not mprovng prevous predctons, the -based fnetunng method proved to gve comparabe resuts. However these -based resuts can be acheved n much shorter tme than the tme needed for the prevous fne-tunng process. The paper s but up as foows: n Sect. a short descrpton s gven of both basc and heurstc-free ; the atter was found to gve the better resuts and s therefore used throughout the research. The consdered exampe mode tunng probem s deat wth n Sect. 3: an operatona North Sea mode wth the wnd stress coeffcent as the parameter to be tuned. Networ tranng and tunng performances are presented n Sect. 4. A Sect. 5 Concusons competes the paper. 1. INTRODUCTION Fne-tunng physca modes may be a ong astng procedure, especay when parameters or varabes cannot be measured drecty and suffcent a pror nowedge about reatons wth other varabes n the mode s mssng. In ths paper we consder the use of (Group Method of Data Handng) networs, more n partcuar heurstcfree GHMH networs, to estmate a varabe n reaton to other varabes. It s assumed that arge amounts of measured data are avaabe and that the unnown reatons we are oong for, are hdden n these data. The resutng data mnng probem (more n partcuar: dependency modeng probem) s soved wth heurstc-free The method s apped to revea the reevant varabes (nputs) on whch the consdered unnown varabe (output) depends upon, as we as a mathematca expresson between the unnown output and ts found reevant nputs. The procedure s ustrated and vadated on the bass of a reastc exampe: fne-tunng wnd stress coeffcents n an operatona North Sea mode whch s used to predct. THE GROUP METHOD OF DATA HANDLING ().1 Basc The Group Method of Data Handng () was ntroduced by A.G. Ivahneno n 1968 and reformuated n 1971 [Ivahneno 1971]. neura networs are typcay used to approxmate a contnuous functon f : A R n R ; n case of vector functons one networ per output eement s needed. The typca networ shown n Fgure 1 has four nputs (components of the vector x) and one output, the estmate y of the correct functon vaue y = f(x). The frst ayer of the networ conssts of a fanout-nput ayer. The nodes of ths ayer dstrbute ther nputs x1, x,, x n to the approprate nodes of the frst hdden ayer. Every node n each ayer foowng the fanout-nput ayer receves two nputs, whch are outputs of the nodes of the prevous ayer. The scheme of the s a smpe feed-forward sequence. The components of the nput vector x are frst supped to the

nput unts. These then dstrbute them to the approprate frst hdden ayer processng eements. The outputs of these hdden ayer processng eements are then supped to the next ayer, and so on. The fna output of the networ s a snge rea number y. X 1 X X 3 X 4 Input ayer Frst hdden ayer Second hdden ayer Thrd hdden ayer Output ayer (one unt) Fgure 1: Exampe of a Typca Neura Networ Except for the nput ayer, a processng eements n the - th ayer ( =,3, ) have the confguraton shown n Fgure. The output sgna of a ayer- processng eement s gven by the quadratc transfer functon: z a d 1 1 1 1 z b z z c z z 1 e z Z -1 Z -1 1 f Z Fgure : Processng Eement of Layer y' (.1) Thus, the networ as a whoe buds up a poynoma functon of the nput components. The output y of the networ can be expressed as a poynoma of degree K (the so-caed Ivahneno poynoma), where K s the number of ayers n the networ foowng the nput ayer. The networ s deveoped by startng at the nput ayer and growng the networ progressvey towards the output ayer, one ayer at the tme. Startng wth = 1 and proceedng to = unt the entre networ s confgured, the process used s the same, regardess the vaue of. Layer s confgured wth one processng eement for each dfferent par of outputs from the prevous ayer. Assumng that the prevous ayer contans M 1 processng eements, the number of processng eements n ayer w be the number of combnatons of the outputs, taen two at a tme: C ( M 1,), where M 1 M 1 M 1! C ( M 1,) (.)! M 1! The basc dea of s that each of these processng eements wth transfer functon (.1) wants to have ts output y match y=f(x) as cosey as possbe for each networ vector nput x. Ths approxmaton s accompshed through near regresson. The sx coeffcents of each processng eement are found usng a arge set of nput/output exampes (x 1, y 1 ), (x, y ),... (x p, y p ). These determne numerca vaues to be entered nto the transfer functon of the th -processng eement of ayer, whch fnay resuts n a system of near equatons n the coeffcents a, b, c, d, e, f : Z T b c d e f Y a (.3) where Z = the matrx of numbers n whch each row corresponds to the vaues n equaton (.1) and Y = the output of the processng eement (the estmate y of the correct functon vaue y = f(x)). The set of near equatons expressed by (.3) w amost never have an exact souton, ony an approxmate one: T b c d e f Z Y a (.4) where souton Z a = the pseudo nverse of the matrx Z. The, b, c, d, e, f represents the best possbe seecton of coeffcents n terms of mnmzng the mean squared error between the output of each processng eement and the desred output. After the sx coeffcents for each of the processng eements of ayer have been derved, the overa performance of each processng eement n terms of ts goa s evauated through cross-vadaton. For ths purpose a test set s used whch s competey dfferent from the data set used for tranng. For each processng eement the so-caed reguarty crteron: 1 ' T Z a b c d e f Y F (.5) r s determned. The next step n the process s to emnate those processng eements on ayer that have a arge reguarty crteron vaue, whereby arge s user defned. The fna set of remanng processng eements then suppes the Z output vector that feeds ayer +1. When the ayer s adapted t s frozen, and the process contnues wth the next ayer. The process of addng new ayers stops when the stop crteron s met, whch normay s the case when the reguarty crteron vaue of the best performng processng eement n a ayer reaches ts mnmum.. Heurstc-free The networ s not free of heurstcs. Heurstcs are needed wth respect to the predetermnaton of the structure of the parta poynomas, the subdvson of the observaton data nto tranng data and vadaton data, and the predetermnaton of the number of nodes seected n

each ayer. To fnd the optma combnaton of these heurstcs the agorthm must be executed many tmes, each tme for a dfferent combnaton. Ths gves a huge computatona overhead, whereby unfortunatey the rea optma combnaton s rarey found. A souton s a revsed agorthm that was deveoped by Tamura and Kondo [Tamura and Kondo 1984]. Ths agorthm s free of heurstcs. A the data are used for both tranng and testng whch emnates the probem of subdvdng the data nto tranng data and test data. Instead of usng the reguarty crteron, the so-caed Predcton Sum of Squares (PreSS) s cacuated. Ths crteron s used for generatng optma parta poynomas, seectng ntermedate varabes and stoppng the agorthm. The heurstc-free agorthm conssts of four steps: Step 1: Generatng optma parta poynomas n each ayer. Optma parta poynomas are generated through the poynoma generator G 1, G, G 3 or G 4, appyng a stepwse regresson procedure to the poynoma: y a0 a1x a x a4 x x a5 x G1 : y a0 a1x a x a4 x x G : y a0 a1x a x G3 : y a0 a1x a x G4 : y a0 a1x a a 3 x 3 x (.6) (.7) (.8) (.9) In ths stepwse regresson procedure, the above-mentoned PreSS s used as the crteron for seectng the best parta poynoma: m y ˆ y PreSS 1 (.10) T T 1 1 x X X x where: T x = 1, x, x, x, x, x x for =1, m and X = [x, 1 x,,x m ] T, m = the number of tranng observatons, y = the th observed vaue for the output varabe, x = the th observed vaue for the nput varabe x and ŷ = the th estmated vaue obtaned by a regresson of one of the parta poynomas aganst a avaabe tranng data. Step : Seectng the nodes. P nodes wth the smaest PreSS are seected from a the nodes n the ayer. The number p s preferred to be as arge as possbe wthn the computatona capacty of the computer. Therefore, p s determned by computatona crtera and not by heurstcs. The outputs of the seected nodes are regarded to be the nputs of the next ayer. Of each ayer, the smaest PreSS s ept for ater use as the stoppng crteron for the teratve computaton. Step 3: Stoppng the mut-ayered teratve computaton. The teratve computaton of the mode s termnated when the PreSS cannot be further mproved, or a seected parta poynomas are generated by G 4. Step 4: Computaton of the predcted vaues. The fna mode descrbng the reatonshp between the nput and the output varabes can be obtaned n two ways: 1. Seect n the fna ayer the node wth the smaest PreSS. The output of ths node s the output of the fna mode.. Seect the L nodes wth the smaest PSS as was done n the precedng ayers, and cacuate the weghted average of these nodes. In the tranng executed durng ths study, the frst method was used..3 -based Dependency Modeng The dependency modeng probem we were confronted wth n ths research, s to dscover on the bass of (a arge amount of) measurement data whch enttes of an a pror seected set of enttes x (where J = {1,, 3,, N}) have reevant dependency to a certan varabe y and to fnd a descrptve mathematca expresson between y and those found reevant enttes x (where I J). In ths descrpton (for dfferent vaues of ) x can be dfferent varabes and/or the same varabes at dfferent tme nstances;.e. there mght be, for nstance, a dependency between y(t) and z 1 (t-3), z (t) and z 3 (t-1): y = f (x 1, x, x 3 ) wth x 1 = z 1 (t-3), x = z (t), x 3 = z 3 (t-1). The data that are assumed to hde the unnown reatonshps are measurements of the varabes (enttes) y and x at many tme nstances. From the descrpton of Sect..1 on how to bud up a -networ t s cear that durng tranng the subsequent emnaton of processng eements (neurons), that do not perform we on the bass of ther reguarty crteron vaues, has as fna resut that rreevant reatonshps are cut off and therefore rreevant enttes x are removed: the vaue y of the snge output neuron s a poynoma of ony the reevant enttes x. It s easy to understand that the dependency modeng capactes of heurstc-free (usuay consderaby) outpaces those of basc ; ths was aso found n practce. However, for ths a prce has to be pad: heurstcfree s much more computng ntensve than basc whch n tsef s aready computng ntensve! In

ths research we have used a home-made parae mpementaton of heurstc-free (a sghty modfed verson of the one consdered n Sect..), runnng on a (hypercube) parae computer and/or a networ of UNIX machnes [Water and Kerchoffs 1999]. 3. MODEL TUNING PROBLEM 3.1 An Operatona North Sea Mode We ustrate and evauate the consdered -based data mnng method to fne-tune physca modes on the bass of an exampe. The exampe concerns a numerca mode of the North Sea and parts of the adonng waters (see aso [Schrver et a. 001]). Ths so-caed North Sea mode s based on nearzed shaow water equatons: u h u V cos 1 pa g fv 0 (3.1) t x D D x v h v V sn 1 pa g fu 0 (3.) t y D D y h ( Du) ( Dv) 0, (3.3) t x y where t = tme, h = meteoroogca effect on the water eve, (u,v) = water veoctes n the x- and y-drecton, D = depth of the water, f = Coros parameter, = near bottom stress coeffcent, = wnd stress coeffcent, V = wnd veocty, = drecton of the wnd wth respect to the postve x-axs, w = densty of water, p a = atmospherc pressure, and g = acceeraton of gravty. The mode s used to predct the meteoroogca effect on the water eve, whch s extremey mportant for decson support purposes. The North Sea mode was extended wth a Kaman fter whch has mproved the predcton accuraces. The predcton capabtes of the mode can be vadated by comparng predcted and actuay measured water eves at certan fxed measurement ocatons. Dscretzaton of the mode n space yeds to a spata grd; the used sze of each ce n ths grd s x = y = 33,8 m. At each grd pont the dscrete dfferenta equatons are soved for dscrete tmes t wth t = t t -1 = 10 mn. In the consdered mode the North Sea s subdvded nto 6 areas (1; South part, : the Channe, 3: Mdde West part, 4: North West part, 5: North East part, 6: Mdde East part). Each area has ts own pressure, wnd speed and wnd drecton, whch are avaabe every three hours and nterpoated (n space and tme). On the bass of an atmospherca numerca mode a forecast of pressure p a, wnd speed V and wnd drecton as functons of space and tme s cacuated. Snce the grd of the atter mode s coarser than the grd used by the North Sea mode and pressure and wnd data are n the forecast ony avaabe every 6 hours, both spata and tme nterpoatons are used to obtan pressure, wnd speed and wnd drecton on the grd ponts of the North Sea mode. Snce the grd s ony defned for the North sea, externa surges are modeed through the correct nta vaues at the grd boundares. w w However, these correct nta vaues are unnown. The North Sea mode was extended wth a steady state Kaman fter to sove ths probem. As sad, the North Sea mode s used for tde predcton (meteoroogca effect on the water eve). In preparaton of any forecast, frst a hndcast s made: the mode s executed wth nown nput data from a certan moment on n the past unt current tme. The hndcast resuts n the mode s state vector representng the actua stuaton at current tme. When runnng the hndcast, nown nput data s used to perform the measurement update of the Kaman fter. After the hndcast the forecast s made wth as startng pont the mode s state vector but from the hndcast. The Kaman correctons on the wnd and boundary condtons are aso nput parameters for the forecast; these correctons certany have nfuence especay durng the frst hours of the forecast, but are then smoothed by the mode. The forecast s done by runnng the mode wth as nput the cacuated expected future pressure and wnd (n the dfferent wnd areas), startng at the tme the hndcast has fnshed up to a number of hours ahead. After havng ran a forecast, at tme t+n a new hndcast/forcast cyce can be performed, thereby mang a new predcton over the next n hours, and so on. 3. The Wnd Stress Coeffcent Most parameters n the shaow water equatons (3.1), (3.) and (3.3) are physca constants or measurabe varabes. Two varabes, however, the bottom stress coeffcent and wnd stress coeffcent, are physca quanttes that cannot be measured. Durng prevous tunng of the North Sea mode t appeared that the bottom stress coeffcent has a much ower nfuence on the meteoroogca effect than the wnd stress coeffcent, therefore s modeed as a constant and s n ths research beyond further nvestgaton. Snce the wnd stress coeffcent s obvousy dependent on the wnd speed and any of the afore-mentoned wnd areas has ts own wnd speed, n the mode dfferent s are assumed. Two areas (North East and Mdde East parts) were found to have a mnor mpact on the meteoroogca effect. Therefore we w try to estabsh a reaton between the four reevant wnd stress coeffcents and other varabes. Commony, a reaton between the wnd speed and the wnd stress coeffcent s used, for nstance the Charnoc reaton. However, t coud we be that there are other varabes that may nfuence the wnd stress coeffcent. Snce measurements of severa physca varabes were avaabe for many years, the probem can be vewed as a data mnng probem: fnd those physca varabes the wnd stress coeffcent essentay depend on, as we as fnd a mathematca reaton between them, as a ths nformaton s supposed to be hdden n the stored physca measurements. Havng found an expresson for the reaton between and other physca varabes, t can be ncuded n the mode and the mode s predctons can be vadated.

4. -BASED MODEL TUNING 4.1 Data Sets To Tran The Networs For each ( s the wnd stress coeffcent for the -th wnd area; = 1,..4), a separate networ s used. In order to tran a networ, data was gathered n the tme perod 01/01/1998 30/09/1999 (.e. 638 days, wth a resouton of 6 hours resutng n 4 sampes a day, so a together 55 data records). Each record n the tranng set (for a snge net) contans nput/output data at a certan tme t (see fgure 3). The 18 measured basc nputs are: pressure p 1 p 4 n each of the 4 areas, the x- and y-components of the wnd n these areas, the sgnfcant wave heght at two speca ocatons L1 and L, where measured wave data was avaabe, the hgh frequency wave energy (> 0. Hz) at these ocatons L1 and L, and the medum range frequency wave energy (0.1-0. Hz) at L1 and L. Moreover, addtona nputs were ncuded: the dfferences of a the basc nputs at tme t and t-6, as we as each basc nput shfted bacward n tme over a perod of 6, 1, 18 and 4 hours; so a together we have 18 + 18 + 184 = 108 nputs. The output data n each record s obvousy. The vaues of were found by runnng (.e. performng hndcast and successve forecast) the mode on a 4-tmes a day bass durng the entre ndcated perod and for a predetermned number of 7 dfferent vaues of on ts defnton nterva [0;6x10-6 ], and then determnng that vaue of that mnmzes the dfference between predcted and actuay measured meteoroogca effect. In summary, the tranng set for each separate net conssts of 55 records, each havng 108 nput tems and 1 output. (see Fgure 3). It s the tas of the traned networ to revea whch of these 108 nputs are actuay reevant for and to provde a mathematca expresson = f(x 1, x,.., x N ) n these found reevant nputs. tme t 1 t t P 1 xxxx P 1 xxxx P 1( t 6) xxxx P1 ( t 4) xxxx P 1 xxxx P 1 xxxx P 1( t 6) xxxx P1 ( t 4) xxxx P 1 xxxx P 1 xxxx P 1( t 6) xxxx P1 ( t 4) xxxx 108 measured nputs Cacuated output Fgure 3: Structure of the Used Data Set 55 records In addton to the above tranng, the networs are aso traned wth measurements n the perod 4/0/1998 14/05/1998 (wth a resouton of 1 hour, fnay resutng n 191 data records). Here each record contans, n addton to the 18 above-mentoned basc nputs aso the pressure, wnd speed and wnd drecton n the areas North East and Mdde East, the dfferences at tme t and t-1 as we as each nput shfted bacwards n tme over a perod of 1,,, 4 hours, so a together 4 + 4 + 44 = 64 nputs. Now the traned s assumed to revea the reevant nputs out of these 64. 4. Performance networs of varous sze (1-5 ayers) have been traned for each ( = 1,, 3, 4), usng both the data gathered for the perod 01/01/1998 30/09/1999 (6 hours resouton) and the perod 4/0/1998 14/05/1998 (1 hour resouton). In the tranng of the networs an mportant roe was payed by the so-caed condton quafer, whch defnes the senstvty of the matrx n the agorthm. By enargng ths vaue, whch s normay set to 10 6, more dependency between the nputs s assumed. A dsadvantage of enargng ths vaue, however, s that the output becomes more senstve to nose on the nputs. In our research the condton quafer was set to 10 1. In order to vadate the resut of the tranng, the mathematca reatons between and the reevant nputs found were ncuded n the North Sea mode. Wth ths modfed mode a seres of forecasts was made. Every 10 mnutes (n smuaton tme) the predcted meteoroogca effect on the water eve was stored and afterwards compared to the actuay measured meteoroogca effect through the we-nown RMSE (Root Mean Square Error) crteron. Smar forecasts wth RMSE estmatons were aso made usng the currenty operatng North Sea mode wth vaues that have been found after a many years astng, dffcut tunng process and that have proved to gve satsfactory predcton resuts. We are obvousy nterested n the dfference RMSE between both these RMSE performances. In a cases the resuts n terms of RMSE obtaned from predctons wth -based s were comparabe wth (actuay however a tte ess than) those for the fne-tuned mode; however, the nterestng pont s that the dfferences n percentage proved to be reatvey sma so that can be consdered worthwhe for qucy estmatng reasonabe s purey on the bass of measurements. Tabe 1 presents resuts of RMSE for one -based ( = 1 4) whe the other three s were not changed n the North Sea mode. Tabe shows RMSE-fgures for the case that a four s were based. Note that n the experments the best resuts wth ether 1-ayer or -ayer networs are between 7 and 8 %. In both tabes, the upper paced fgures refect the resut for the data set that covers the perod 01/01/1998-30/09/1999 (6 hours resouton) and the ower paced fgures the resut for the other data set coverng the perod 4/0/1998-14/05/1998 (1 hour resouton); n case of ony one fgure, the resut concerns the second mentoned data set. Tabe 1: Resuts of RMSE for one -based RMSE (Predcton over 5 months, 1- ayer ) 1 3 4 0,1% 1,9% 4,5% 3,3% 0,3% 0,9% 0,1% 0,1%

Tabe : RMSE for the case that a four s were -based (1- ayer) (- ayer) (3- ayer) (4- ayer) (5- ayer) RMSE (Predcton over 5 months) RMSE (Predcton over months) 7,6% 7,7% 5. CONCLUSION 7,9% 31,6% 54% 11,7% 7,5% 17,6% 7,% In ths research heurstc-free networs are used to fne-tune a North Sea mode for wnd stress coeffcents n four dfferent wnd areas. Athough the -based dependency modeng method dd not mprove the predcton capabtes of the current North Sea mode, the eventua dfferences n predcton performance were reatvey sma. Consderng the fact that the current mode has been extensvey tuned durng severa years, t s an nterestng resut that roughy the same performance can be acheved wth a competey dfferent technque. The data mnng method therefore proved to be usabe to acheve reasonaby good resuts, wth the addtona advantage that the perod of tme for coectng tranng data, tranng and vadatng s much shorter than the tme (severa years) that has been necessary n the past to fnetune the current mode. Summarzng, -based dependency modeng can be frutfuy used for a fast reasonabe estmate of one or more parameters that cannot be drecty measured and that ac (suffcent) a pror nowedge on reatona dependences. REFERENCES Ivahneno, A.G. 1971. Poynoma theory of compex systems, IEEE Trans. Systems Man Cybernet, SMC-1(4), pp. 364-378. Mueer, J.A. 001. Automatc mode generaton based on. In: E.J.H. Kerchoffs and M. Snore (eds.): Modeng and Smuaton 001 (Proc. of the 15 th European Smuaton Mutconference, Prague, 001), SCS Europe, pp. 661-668. Schrver, M.C., Saman K.D. and Kerchoffs E.J.H. 001. Fne-tunng of a North Sea mode wth the ad of neura networs. In: N. Gambas and C. Frydman (eds.): Smuaton n Industry 001 (Proc. of the 13 th European Smuaton Symposum, Marsee, France, 001), pp. 696-704. Tamura, H. and Kondo T. 1984. On revsed agorthms of wth appcatons. In: S.J. Farow (ed.): Sef- Organzng Methods n Modeng: Type Agorthms, Vo. 54 of Statstcs: Textboos and Monographs, Marce Deer, Inc., chapter 1, pp. 5-41. Water, P.R. and Kerchoffs E.J.H. 1999. -based fnanca forecastng on a hypercube parae computer. In: Smuaton n Industry (Proc. of the 11 th European Smuaton Symposum, Erangen, Germany, 1999), SCS Europe, pp. 586-596. Another nterestng aspect of the -based dependency modeng s that unexpected varabes mght be found to have reevant mpact on the wnd stress coeffcents. The procedure may thus ncrease physca nsght. Our research has shown that wnd stress coeffcents consderaby depend on wave energy, whch confrms certan assumptons wth experts n the fed and can be made expcabe by physcs. It means that the usuay assumed dependence of the wnd stress coeffcents on ony wnd speed and wnd drecton s not optma and can be mproved. Athough no a pror nowedge about the reatonshp between the nputs and the output s needed, a thorough nowedge of the system beng modeed s necessary. Ths nowedge s needed when creatng a vad data set and especay when verfyng tranng resuts. One cannot use heurstc-free for dependency modeng wthout nowedge of the system beng modeed.