ADVANCED SPATIAL DATA ANALYSIS AND MODELLING WITH SUPPORT VECTOR MACHINES

Mikhail Kanevski, Aleksey Pozdnukhov, Michel Maignan 3, Stephane Canu 2

IDIAP-RR-00-3

Submitted to International Journal of Fuzzy Systems.

1 Institute of Nuclear Safety, Russian Academy of Sciences, B. Tulskaya 52, 113191 Moscow
2 INSA; Place Emile Blondel, 76131 Mont-Saint-Aignan, France
3 Institute of Mineralogy and Petrology, University of Lausanne, BFSH2, 1015 Lausanne, Switzerland

Abstract-- The present paper deals with novel developments and application of Support Vector Machines (Support Vector Classifier SVC and Support Vector Regression SVR) for the analysis and modeling of spatially distributed environmental and pollution information (categorical and/or continuous data). SVC/SVR models are based on the Statistical Learning Theory or Vapnik-Chervonenkis (VC) theory. The SVC provide non-linear classification by mapping the input space into high dimensional feature spaces where a special type of hyper-planes with maximal margins (giving rise to good generalization) are constructed. SVR provide robust non-linear regression of spatially distributed data. The real case studies of the present paper deal with a binary classification problem of indicator variables, multi-class classification of soil types, and prediction mapping of radioactively contaminated territories. Geostatistical tools (variography) are used to control the performance of the machines and for better understanding of the results. The SVC/SVR are well adapted to fuzzy environmental and pollution data.

Index Terms-- environmental spatial data classification and mapping, support vector machines, geostatistics

I. INTRODUCTION

Recently the analysis and processing of spatially distributed and time dependent information have become a very important problem. On the one side, this is due to the comprehensive development of environmental and pollution monitoring networks, which even leads to data mining problems; on the other side, it is due to a much better understanding of data analysis approaches (both model dependent and data driven). The present paper deals with novel developments and adaptation of SVC and SVR, the models based on Statistical Learning Theory or Vapnik-Chervonenkis (VC) theory, for the analysis and modeling of spatially distributed environmental and pollution information (categorical and/or continuous data). Statistical Learning Theory is a general mathematical framework for estimating dependencies from empirical and finite data sets [1]-[4].
The basic idea of SVM is to determine a classifier or regression machine that minimizes the Structural Risk, consisting of the empirical error and the complexity of the model, leading to a good generalization error. The SVM provides non-linear classification (or regression) by mapping the input space into high dimensional feature spaces where a special type of hyper-planes with maximal margins (giving rise to good generalization, i.e. low errors on validation data sets) are constructed. In the case of classification, SVM focus on the marginal data (support vectors, SV) and not on statistics such as means and variances. Only data points close to the classification decision boundaries are important for the solution of the problem. Essentially the method is non-linear, robust and does not depend on the dimension of the input space.

Recently the first promising results on the application of SVC/SVR to spatially distributed data were published [5]-[7]. The main attention was paid to binary classification problems, to the understanding of SVR application to spatial data and to the interpretation of SVR hyper-parameters. Results of the SVM classification were compared with indicator kriging [6] as well. It was demonstrated that the use of geostatistical spatial correlation measures like the variogram improved both the understanding of the machine performance and the interpretation of the results.

The main attention in the present paper is paid to: 1) the problem of SVC multi-class classification of environmental data (soil types), which is important, e.g., in modeling the vertical migration of radionuclides, and 2) SVR mapping of territories radioactively contaminated by the Chernobyl radionuclide Sr90. Originally SVC were developed for binary (2 class) classification. Different schemes generalizing the 2-class classification problem to multi-class classification are considered in the present study. The methodology and the results on soil types classification are considered in detail. Radial basis Gaussian functions (both isotropic and anisotropic) were used as the SVC kernels. Error surfaces (training and testing errors and number of support vectors versus regularization parameter and kernel bandwidth) are used in order to tune hyper-parameters. In particular, kernel bandwidths are tuned using testing data sets and taking into account the spatial variability of classes. Several approaches for the classification of spatially distributed data that were developed within the framework of geostatistics can be found in the review [8].

The last part of the paper presents results on the application of Support Vector Regression to the problem of prediction mapping of spatial data. The real case study is based on data on soil contamination by the Chernobyl radionuclide Sr90 in the most contaminated Briansk region of Russia. Sr90 has a half-life of about 30 years and is radiologically important.

M. Kanevski is with the IDIAP Dalle Molle Institute of Perceptual Artificial Intelligence, CP 592, 1920 Martigny, Switzerland, kanevski@idiap.ch and with INSA, Rouen, France. A. Pozdnukhov is with the Moscow State University, Physics Department. S. Canu is with the INSA, Rouen, France.

II. INTRODUCTION TO SUPPORT VECTOR MACHINES

The main concepts and principles of SVM are described shortly, starting from linearly separable dichotomies. The presentation of the SVM theory is based on [1]-[4].

A. Principles of SVM

The following problem is considered. A set S of points $X_i$ is given in $R^2$ (we are working in a two dimensional $[X_1, X_2]$ space). Each point $X_i$ belongs to either of two classes and is labeled by $Y_i \in \{-1, +1\}$. The objective is to establish the equation of a hyper-plane that divides S, leaving all the points of the same class on the same side, while maximizing the minimum distance between either of the two classes and the hyper-plane (the maximum margin hyper-plane). The optimal hyper-plane with the largest margins between classes is a solution of the constrained optimization problems considered below [1]-[4].

B. Linearly separable case

Let us recall that the data set S is linearly separable if there exist $W \in R^2$, $b \in R$ such that

$$Y_i (W \cdot X_i + b) \ge +1, \quad i = 1, \dots, N \qquad (1)$$

The pair (W, b) defines a hyper-plane of equation $W \cdot X + b = 0$.

Linearly separable problem: given the training sample $\{X_i, Y_i\}$, find the optimum values of the weight vector W and bias b such that they satisfy the constraints

$$Y_i (W \cdot X_i + b) \ge +1, \quad i = 1, \dots, N \qquad (2)$$

and the weight vector W minimizes the cost function (maximization of the margins)

$$F(W) = W \cdot W / 2 \qquad (3)$$

The cost function is a convex function of W and the constraints are linear in W. This constrained optimization problem can be solved by using Lagrange multipliers. The Lagrange function is defined by

$$L(W, b, \alpha) = W \cdot W / 2 - \sum_{i=1}^{N} \alpha_i \left[ Y_i (W \cdot X_i + b) - 1 \right]$$

where the Lagrange multipliers $\alpha_i \ge 0$. The solution of the constrained optimization problem is determined by the saddle point of the Lagrangian function $L(W, b, \alpha)$, which has to be minimized with respect to W and b and maximized with respect to $\alpha$. Application of the optimality conditions to the Lagrangian function yields

$$W = \sum_{i=1}^{N} \alpha_i Y_i X_i \qquad (4)$$

$$\sum_{i=1}^{N} \alpha_i Y_i = 0 \qquad (5)$$

Thus, the solution vector W is defined in terms of an expansion that involves the training data. Because the constrained optimization problem deals with a convex cost function, it is possible to construct a dual optimization problem. The dual problem has the same optimal value as the primal problem, but with the Lagrange multipliers providing the optimal solution. The dual problem is formulated as follows: maximize the objective function

$$Q(\alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j Y_i Y_j \, X_i \cdot X_j \qquad (6)$$

subject to the constraints

$$\sum_{i=1}^{N} \alpha_i Y_i = 0 \qquad (7)$$

$$\alpha_i \ge 0, \quad i = 1, \dots, N \qquad (8)$$

Note that the dual problem is presented only in terms of the training data. Moreover, the objective function $Q(\alpha)$ to be maximized depends only on the input patterns in the form of a set of dot products $\{X_i \cdot X_j\}$, $i, j = 1, 2, \dots, N$.

After determining the optimal Lagrange multipliers $\alpha_i$, the optimum weight vector is defined by (4) and the bias is calculated as

$$b = 1 - W \cdot X^{(s)}, \quad \text{for } Y^{(s)} = +1$$

Note that from the Kuhn-Tucker conditions it follows that

$$\alpha_i \left[ Y_i (W \cdot X_i + b) - 1 \right] = 0, \quad i = 1, \dots, N \qquad (9)$$

The only $\alpha_i$ that can be nonzero in this equation are those for which the constraints are satisfied with the equality sign. The corresponding points $X_i$, called Support Vectors, are the points of the set S closest to the optimal separating hyper-plane. In many applications the number of support vectors is much smaller than the number of original data points.

The problem of classifying a new data point X is simply solved by computing

$$F(X) = \mathrm{sign}(W \cdot X + b) \qquad (10)$$

with the optimal weights W and bias b.

C. SVM classification of non-separable data. Soft margin classifier (allowing for training errors)

In the case of a linearly non-separable set it is not possible to construct a separating hyper-plane without allowing classification errors. The margin of separation between classes is said to be soft if training data points violate the condition of linear separability. In the case of non-separable data the primal optimization problem is changed by using slack variables. The problem is posed as follows: given the training sample $\{X_i, Y_i\}$, find the optimum values of the weight vector W and bias b such that they satisfy the constraints

$$Y_i (W \cdot X_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, N \qquad (11)$$

The weight vector W and the slack variables $\xi_i$ minimize the cost function

$$F(W) = W \cdot W / 2 + C \sum_{i=1}^{N} \xi_i \qquad (12)$$

where C is a user specified parameter (the regularization parameter is proportional to 1/C). The dual optimization problem is the following: given the training data, maximize the objective function (find the Lagrange multipliers)

$$Q(\alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j Y_i Y_j \, X_i \cdot X_j \qquad (13)$$

subject to the constraints (7) and

$$0 \le \alpha_i \le C, \quad i = 1, \dots, N \qquad (14)$$

Note that neither the slack variables nor their Lagrange multipliers appear in the dual optimization problem. The parameter C controls the trade-off between the complexity of the machine and the number of non-separable points. The parameter C has to be selected by the user. This is usually done in one of two ways: 1) C is determined experimentally via the standard use of training and testing data sets, which is a form of re-sampling; 2) C is determined analytically by estimating the VC dimension and then by using bounds on the generalization performance of the machine based on the VC dimension [1].

D. SVM non-linear classification

In most practical situations the classification problems are non-linear and the hypothesis of linear separation in the input space is too restrictive. The basic idea of Support Vector Machines is 1) to map the data into a high dimensional feature space (possibly of infinite dimension) via a non-linear mapping and 2) to construct an optimal hyper-plane (application of the linear algorithms described above) for separating the features. The first item is in agreement with Cover's theorem on the separability of patterns, which states that a multidimensional input space may be transformed into a new feature space where the patterns are linearly separable with high probability, provided that: 1) the transformation is non-linear; 2) the dimensionality of the feature space is high enough [1]-[4]. Cover's theorem does not discuss the optimality of the separating hyper-plane. By using Vapnik's optimal separating hyper-plane the VC dimension is minimized and generalization is achieved.

Let us recall that in the linear case the procedure requires only the evaluation of dot products of data. Let $\{\varphi_j(x)\}$, $j = 1, \dots, m$, denote a set of non-linear transformations from the input space to the feature space; m is the dimension of the feature space. The non-linear transformation is defined a priori.
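As a minimal illustration of the dual problem (13)-(14), the following Python sketch evaluates the objective function Q(α), checks the constraints (7) and (14), and recovers W and b from (4). It is only a sketch under stated assumptions: the function names, the two-point toy data set and the value of C are hypothetical and are not taken from the case studies; the multipliers are the analytical maximizer for this toy set, not the output of a solver.

```python
def dot(u, v):
    """Plain dot product; any Mercer kernel could be passed instead."""
    return sum(a * b for a, b in zip(u, v))


def dual_objective(alpha, X, Y, kernel=dot):
    """Q(alpha) = sum_i alpha_i - 1/2 sum_ij alpha_i alpha_j Y_i Y_j K(X_i, X_j)."""
    n = len(alpha)
    quadratic = sum(
        alpha[i] * alpha[j] * Y[i] * Y[j] * kernel(X[i], X[j])
        for i in range(n)
        for j in range(n)
    )
    return sum(alpha) - 0.5 * quadratic


def is_feasible(alpha, Y, C, tol=1e-9):
    """Check sum_i alpha_i Y_i = 0 and the box constraints 0 <= alpha_i <= C."""
    balanced = abs(sum(a * y for a, y in zip(alpha, Y))) <= tol
    boxed = all(-tol <= a <= C + tol for a in alpha)
    return balanced and boxed


def weight_vector(alpha, X, Y):
    """W = sum_i alpha_i Y_i X_i (meaningful for the linear kernel only)."""
    return [
        sum(a * y * x[d] for a, y, x in zip(alpha, Y, X))
        for d in range(len(X[0]))
    ]


# Hypothetical toy data: one point per class on the X1 axis.
X = [(1.0, 0.0), (-1.0, 0.0)]
Y = [+1, -1]
# With alpha_1 = alpha_2 = a the objective reduces to 2a - 2a^2,
# which is maximal at a = 0.5.
alpha = [0.5, 0.5]

Q = dual_objective(alpha, X, Y)
W = weight_vector(alpha, X, Y)
b = 1.0 - dot(W, X[0])  # bias from a support vector with Y = +1
```

For this toy set both points are support vectors, and replacing `dot` by another kernel function changes only the `kernel` argument of `dual_objective`.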
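In the non-linear case the optimization problem in the dual form is the following: given the training data, maximize the objective function (find the Lagrange multipliers)

$$Q(\alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j Y_i Y_j \, K(X_i, X_j) \qquad (15)$$

subject to the constraints (7) and (14). The kernel is

$$K(X, Y) = \varphi(X) \cdot \varphi(Y) = \sum_{j=1}^{m} \varphi_j(X) \, \varphi_j(Y) \qquad (16)$$

Thus, we may use the inner-product kernel K(X, Y) to construct the optimal hyper-plane in the feature space without having to consider the feature space itself in explicit form. The optimal hyper-plane is now defined as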

$$f(X) = \sum_{i=1}^{N} \alpha_i Y_i K(X, X_i) + b \qquad (17)$$

Finally, the non-linear decision function is defined by the following relationship:

$$F(X) = \mathrm{sign}\left[ \sum_{i=1}^{N} \alpha_i Y_i K(X, X_i) + b \right] \qquad (18)$$

The requirement on the kernel $K(X, X_i)$ is to satisfy Mercer's conditions [1]. Three common types of Support Vector Machines are widely used:

1. Polynomial kernel
$$K(X, X_i) = (X \cdot X_i + 1)^p$$
where the power p is specified a priori by the user. Mercer's conditions are always satisfied.

2. Radial basis function (RBF) kernel
$$K(X, X_i) = \exp\left\{ - \| X - X_i \|^2 / 2\sigma^2 \right\}$$
where the kernel bandwidth $\sigma$ (sigma value) is specified a priori by the user. In general, the Mahalanobis distance can be used. Mercer's conditions are always satisfied.

3. Two-layer perceptron
$$K(X, X_i) = \tanh\left\{ \beta_0 \, X \cdot X_i + \beta_1 \right\}$$
Mercer's conditions are satisfied only for some values of $\beta_0$, $\beta_1$.

For all three kernels (learning machines), the dimensionality of the feature space is determined by the number of support vectors extracted from the training data by the solution to the constrained optimization problem. In contrast to RBF neural networks, the number of radial basis functions and their centers are determined automatically by the number of Support Vectors and their values. In the present study only the results obtained with the RBF kernel are presented.

III. SPATIAL DATA CLASSIFICATION. CASE STUDIES

Two classification case studies are considered:

Binary non-linear classification of radioactively contaminated territories (Briansk region, Sr90). This part of the study is of a methodological nature and follows the ideas presented in [5]-[6]. Training algorithms are extended with k-fold cross-validation (leave-k-out).

Multi-class classification of real soil types data in the Briansk region, Russia. This case study is important for prediction mapping of radioactively contaminated territories when taking into account the vertical migration of radionuclides in soil.

The generic methodology for the analysis, modeling and presentation of spatially distributed data follows the basic ideas presented in [10]. The main phases (steps) of the study are the following:

Visualization of data.
Monitoring network analysis and description. Understanding of spatial clustering (results of preferential sampling) and representativity of data.
Comprehensive exploratory data analysis.
Comprehensive exploratory structural analysis (variography). Modeling of anisotropic spatial correlation.
Splitting data into training, testing, and validation subsets. In the case of clustered data, spatial declustering procedures can be used.
Training of SVC/SVR. Selection of the optimal SVC/SVR hyper-parameters.
Spatial data classification: categorical data mapping.
Spatial data mapping: spatial regression.
Comprehensive analysis of the residuals (statistical analysis, correlation, variography).
Understanding, interpretation, and presentation of the results.

A. Two class classification problem

Let us consider a binary classification problem applied to the Sr90 indicator transformed variable

I(Sr90 = 0.3 Ci/km²). The indicator transformation means that I(Sr90 = 0.3 Ci/km²) = I = 1 if Sr90 ≤ 0.3 Ci/km² (class 1) and I = 0 if Sr90 > 0.3 Ci/km² (class 2). Thus, the problem is posed as a binary classification problem after the indicator transformation of the Sr90 concentration. Here, the indicator cut-off is chosen close to the median of the Sr90 concentration. In non-parametric geostatistics the indicator transformation is widely used when modeling local probability distribution functions: the expected value of the indicator at an unsampled point is an estimate of the cumulative distribution function at this point for the given cut-off [9].

The post plot of indicator values is presented in Figure 1. The variogram rose for the indicator variable is presented in Figure 2. Let us recall that the variogram (semivariogram) is an important measure of spatial continuity describing spatial correlation and is widely used in geostatistics [9]-[10]:

$$\gamma(h) = 0.5 \, \mathrm{Var}\{ Z(x + h) - Z(x) \}$$

where h is a separation vector between points in space. The variogram, estimated using the indicator variable for several lag distances and several directions, is presented as a variogram rose in Figure 2. Geostat Office software [10] was used for computations. An anisotropic structure (different correlations in different directions) is evident. Information on spatial correlation can be used in data pre-processing: one objective can be a transformation of the input space (X) in order to have more isotropic spatial correlation structures. This information can also be used to tune anisotropic SVM kernels when the Mahalanobis distance is used.

Figure 1. Two class classification problem. Training data set postplot. O class 1, J class 2.

1) SVC training

Two basic strategies were applied for the SVM training: 1) splitting of the original data set into training, testing and validation subsets; 2) leave-k-out cross-validation. The first approach is a traditional procedure in which the training data set is used to develop a model, the testing data set is used to tune the hyper-parameters of the model, and the validation data set is used for the estimation of the generalization (expected) error. Taking into account spatial clustering (preferential sampling in space), spatial declustering procedures were used to split the data in order to have representative data sets.
The simplest way to do it is to cover the region under study by a regular grid and to select randomly one data point from each grid cell. Random splitting was used as well.
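The grid-based declustering described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `grid_decluster`, the cell size, the random seed and the five sample points are hypothetical, and the procedure actually used in the study may differ in its details.

```python
import random


def grid_decluster(points, cell_size, seed=0):
    """Pick one random sample index from every occupied cell of a regular grid."""
    rng = random.Random(seed)
    cells = {}
    for index, (x, y) in enumerate(points):
        # Integer cell address of the sample on the regular grid.
        key = (int(x // cell_size), int(y // cell_size))
        cells.setdefault(key, []).append(index)
    # One randomly chosen representative per occupied cell.
    return sorted(rng.choice(members) for members in cells.values())


# Hypothetical clustered samples: three points fall into the same cell.
points = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (1.5, 0.5), (2.2, 2.7)]
selected = grid_decluster(points, cell_size=1.0)
```

The selected indices can serve as a spatially representative subset (for example the training set), while the remaining samples can be distributed between the other subsets.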

Figure 2. Variogram rose of the Sr90 indicator variable.

There are two hyper-parameters in SVM when the RBF kernel is fixed: the kernel bandwidth (sigma) and the regularization parameter C. In general, a full covariance matrix (Mahalanobis distance) was used. In the present study the results of the isotropic kernel are mainly presented. Basically, there is a general recommendation to set C to a large value when the data are not noisy and there is no special need for regularization. In order to find the best C and sigma parameters (those minimizing the testing error), training and testing error surfaces (training and testing errors versus sigma and C) were estimated. It was found that, above some high value of C and with sigma fixed, training and testing errors do not change. In our case this value was about 100 at the optimal sigma value. The error curves along with the normalized number of Support Vectors (the number of Support Vectors divided by the number of training data) are presented in Figure 3. The minimal testing error was achieved at sigma = 0.1. An important observation, already mentioned in [5] and [6], is that at the optimal point the number of Support Vectors also has a minimum. This, in general, corresponds to small values of the generalization (expected) error [1].

Figure 3. Training and testing error curves and normalized number of Support Vectors. C=100.

2) Binary classification

The optimal SVC hyper-parameters were used for the categorical data mapping (prediction of the categorical variable/class at unsampled points). The result is presented in Figure 4. The variogram rose computed using the results of the SVC classification is presented in Figure 5. Apart from some noise, this variogram rose follows the original experimental variogram rose. Thus, the classification model correctly reflects the basic anisotropic spatial correlations.
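To make the variogram comparison more concrete, the sketch below estimates an omnidirectional experimental variogram with the classical estimator γ(h) = Σ (z_i − z_j)² / (2 N(h)), where N(h) is the number of pairs in the lag class. It is a simplified assumption for illustration: the computations reported here were made with Geostat Office and are directional (variogram roses), whereas this sketch ignores direction; the function name and the four-point data set are hypothetical.

```python
import math


def experimental_variogram(coords, values, lag, n_lags):
    """Omnidirectional variogram: one (distance, gamma, pair count) per lag class."""
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            h = math.hypot(coords[i][0] - coords[j][0],
                           coords[i][1] - coords[j][1])
            k = int(h / lag + 0.5) - 1  # index of the nearest lag class
            if 0 <= k < n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    return [
        ((k + 1) * lag, sums[k] / (2 * counts[k]) if counts[k] else None, counts[k])
        for k in range(n_lags)
    ]


# Hypothetical indicator values alternating along a line of unit spacing.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
indicator = [0, 1, 0, 1]
gamma = experimental_variogram(coords, indicator, lag=1.0, n_lags=3)
```

Applying the same function to the original indicators and to the classes predicted by the SVC gives two curves whose similarity can be inspected, in the spirit of the comparison of Figures 2 and 5.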

Figure 4. SVM 2 class classification (categorical data mapping). White zone: class 2. Kernel bandwidth = 0.1, C=100. Training error = 0.08; testing error = 0.2; validation error = 0.24. + Support Vectors; O class 2 of validation data; K class 1 of validation data.

Basically, by varying the kernel bandwidth at some fixed C value, it is possible to cover a wide range of model complexity, from overfitting at small sigma values to oversmoothing at high sigma values. In the following, a real case study on multi-class classification using data on soil types in the Briansk region, Russia, is considered.

IV. SVM MULTI-CLASS CLASSIFICATION

The current section of the work deals with the prediction mapping of soil types using Support Vector Machines. The main objective of the study is the following: using available categorical data on soil types (measurements on an irregular monitoring network), develop a multi-class classification Support Vector Machine to predict soil types at the unsampled points (spatial prediction of categorical variables). The problem can be considered as a pattern completion task as well.

Figure 5. SVM classification. Variogram rose of indicators after classification.

In the present study, SVC are used for environmental spatial data classification. A straightforward generalization of binary SVM classification to multi-class classification (m classes) is the following:

$$y = \arg\max_{m} \left( \sum_{i} y_i^{(m)} \lambda_i^{(m)} K(x, x_i) + b^{(m)} \right) \qquad (19)$$

The real case study deals with the soil types classification in the Briansk region. This is the part of Russia most contaminated by Chernobyl radionuclides. Actually, prediction mapping of environment

contamination includes both physico-chemical modeling of radionuclides migration in the environment and spatial data analysis and modeling []. Migration of radionuclides in soil depends on the properties of the radionuclides, soil types, precipitation, etc. The variability of environmental parameters and of the initial fallout at different scales highly complicates the solution of the problem. The present problem deals with five classes:

Class      Number of data
Class 1    392
Class 2    48
Class 3    333
Class 4    52
Class 5    485

The grid for predictions consists of 432 points (the boundary of the grid follows the boundary of the region). The influence of soil types on Sr90 vertical migration is presented in Figure 6, where Sr90 profiles after 20 years of fallout are shown.

Figure 6. Radionuclide vertical migration in soil. Vertical profile of Sr90 distribution after 20 years of fallout.

The major classes (post plot of training data) are presented in Figure 7.

Figure 7. Major classes (soil types data) postplot. + class 1, O class 3, L class 5.

As in the case of binary classification, the original data were split into 3 subsets: training (310), testing (500) and validation (500 data). The data were split several times to understand the fluctuations of the results. Spatial correlation structures for two major classes are presented as variogram roses in Figures 8 and 9. Classes were coded as indicators, with 1 corresponding to the class and 0 to all other classes. Different correlation behavior is clearly observed.

Figure 8. Class 1 variogram rose.

Figure 9. Class 3 variogram rose.

A. SVC Training

There are several possibilities for multi-class classification with SVM using binary models: one-to-rest classification, pair-wise classification, direct generalization of the SVM to multi-class problems and others [], [2], [3].

One-to-rest class-insensitive classification. In this case m models are developed from binary classification by applying the simplest algorithm. The m classifiers have the same kernel bandwidths. Error curves give a general overview of the problem without taking into account the different spatial variability of classes. If classes have different variability at different scales and directions, the optimal kernel bandwidth characterizes some averaged scale of variability. Of course, what is optimal for one

class can be over-fitting or over-smoothing for the others. The class-insensitive approach is fast and gives a general overview of the problem. In some cases it can give satisfactory results. The more interesting approach deals with the adaptation of models to the spatial variability of classes.

1) Class-Adaptive Approach

In this case different optimal kernel bandwidths are tuned for each of the m one-to-rest models. Training and testing error curves with the class-adaptive technique are presented in Figure 10.

Figure 10. One-to-rest multi-class classification. Testing error curves.

For each one-to-rest model the optimal kernel bandwidths minimizing the testing errors were selected. Spatial predictions of the categorical variable (soil type mapping) with the optimal m models are presented in Figure 12. The same approach was applied with the generalization of the binary model using pair-wise classifications, both class insensitive and class adaptive. In this case m(m-1)/2 models are developed. An example of training and testing error curves and of the normalized number of Support Vectors is presented in Figure 11.

Figure 11. Pair-wise training and testing error curves and normalized number of Support Vectors. C= 100.
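The one-to-rest decision rule with class-adaptive bandwidths can be illustrated with the short Python sketch below. It assumes that each one-to-rest model has already been trained and is stored as a dictionary holding its own bandwidth, its coefficients (the products of multipliers and labels), its support vectors and its bias; this data layout, the two toy models and all numerical values are hypothetical and serve only to show the arg max rule.

```python
import math


def rbf(u, v, sigma):
    """Isotropic Gaussian RBF kernel with bandwidth sigma."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def decision_value(x, model):
    """Output of one one-to-rest model, using its own kernel bandwidth."""
    return sum(
        coef * rbf(x, sv, model["sigma"])
        for coef, sv in zip(model["coef"], model["support_vectors"])
    ) + model["bias"]


def predict_class(x, models):
    """Assign x to the class whose one-to-rest model gives the largest output."""
    return max(models, key=lambda label: decision_value(x, models[label]))


def n_pairwise_models(m):
    """Number of binary models needed by pair-wise classification of m classes."""
    return m * (m - 1) // 2


# Hypothetical trained models: coef holds multiplier times label (+1 own class, -1 rest).
support = [(0.0, 0.0), (1.0, 1.0)]
models = {
    "A": {"sigma": 0.3, "coef": [1.0, -1.0], "support_vectors": support, "bias": 0.0},
    "B": {"sigma": 0.5, "coef": [-1.0, 1.0], "support_vectors": support, "bias": 0.0},
}
label_near_origin = predict_class((0.1, 0.0), models)
label_near_corner = predict_class((0.9, 1.0), models)
```

With five soil classes the one-to-rest scheme needs five such models, whereas `n_pairwise_models(5)` gives the ten binary models required by the pair-wise scheme.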

Figure 12. SVM Mapping with class-adaptive bandwidths: Class1 = 0.026; Class2 = 0.1; Class3 = 0.4; Class4 = 0.06; Class5 = 0.88.

In the present case pair-wise classification did not significantly improve the results in comparison with the simpler one-to-rest adaptive model. In conclusion, SVC is a promising approach for the classification of spatially distributed environmental and pollution data. The use of simple multi-class classification models (generalizations of the binary models) with the class-adaptive approach efficiently reproduced the spatial variability of classes.

V. POLLUTION DATA MAPPING WITH SUPPORT VECTOR REGRESSION

Let us consider the application of the Statistical Learning Theory to spatial data mapping of continuous variables using the Support Vector Regression model. Assume $Z \in R$ is a variable to be predicted based on some geographical observations (x, y). Our work aims at estimating a dependence between Z and the geographical co-ordinates based on empirical data (samples) $S = \{(x_i, y_i, Z_i, \varepsilon_i),\ i = 1, \dots, N\}$, where

$x_i, y_i$ are the geographical co-ordinates of the samples;
$Z_i$ are the observed or measured quantities, assumed to be realizations of a random variable Z with an unknown probability distribution $P_{x,y}(Z)$;
$\varepsilon_i$ is the measurement accuracy for the observation $Z_i$;
N denotes the sample size.

A. Prediction problem

Assuming f is a prediction function (i.e. a function used to predict the value of Z knowing the geographical co-ordinates), we define the cost of choosing this particular function for a given decision process. First, for a given observation (x, y, Z) we define the ε-insensitive cost function:

$$C\{(x, y), Z, \varepsilon, f\} = \begin{cases} |f(x, y) - Z| - \varepsilon & \text{if } |f(x, y) - Z| > \varepsilon \\ 0 & \text{otherwise} \end{cases} \qquad (20)$$

where ε characterizes some acceptable error. Now, for all possible observations we define the global or generalisation error, also known as the integrated prediction error IPE:

$$IPE(f) = \int E_Z\big( C((x, y), Z, \varepsilon, f) \big) \, \omega(x, y) \, dx \, dy \qquad (21)$$

where ω(x, y) is some economical measure indicating the relative importance of a mistake at point (x, y). In the case of non-homogeneous monitoring networks this function can take into account spatial clustering. Usually ω(x, y) = 1, so that all positions are assumed to be equally important.

B. Empirical and Structural Risk Minimization

1) Function Modeling

Let us assume that the solution is a function that can be decomposed into two different components: a

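trend plus a remaining random process:

$$f(x, y) = \sum_{k=1}^{m} w_k \varphi_k(x, y) + \sum_{j=1}^{J} \beta_j K_j(x, y) \qquad (22)$$

where $K_j(x, y)$ is a basis of the trend component and $\varphi_k$, $k = 1, \dots, m$, is an orthonormal basis of the remaining part (note that m can be infinite). The complexity of the solution can be tuned through $\|w\|^2 = \sum_{k=1}^{m} w_k^2$ [1]. Thus, a relevant strategy to minimise the IPE is to minimize the empirical error while keeping $\|w\|^2$ small. This can be obtained by minimising the following cost function:

$$\text{minimize } \tfrac{1}{2} \|w\|^2 \quad \text{subject to } |f(x_i, y_i) - Z_i| \le \varepsilon, \quad i = 1, \dots, N$$

When data lie outside of this epsilon tube due to noise or outliers, these constraints become too strong and impossible to fulfil, so Vapnik suggested introducing slack variables $\xi_i$, $\xi_i^*$. These variables measure the distance between the observation and the ε tube. Note that by introducing the couple $(\xi_i, \xi_i^*)$ the problem now has 2N additional unknown variables. These variables are linked, since one of the two values is necessarily equal to zero: either the observation lies above the tube ($\xi_i = 0$) or below it ($\xi_i^* = 0$). Thus, $Z_i \in [f(x_i, y_i) - \varepsilon - \xi_i,\ f(x_i, y_i) + \varepsilon + \xi_i^*]$. Following the same ideas as in the case of SVM classification, we arrive at the following optimization problem:

$$\text{minimise } \tfrac{1}{2} \|w\|^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^*) \qquad (23)$$

$$\text{subject to } \quad f(x_i, y_i) - Z_i - \varepsilon \le \xi_i, \quad -f(x_i, y_i) + Z_i - \varepsilon \le \xi_i^*, \quad \xi_i, \xi_i^* \ge 0, \quad i = 1, \dots, N$$

2) Dual formulation

A classical way to reformulate a constraint based minimization problem is to look for the saddle point of the Lagrangian L:

$$L(w, \xi, \xi^*, \alpha, \alpha^*, \eta, \eta^*) = \tfrac{1}{2} \|w\|^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^*) - \sum_{i=1}^{N} \alpha_i \big( Z_i - f(x_i, y_i) + \varepsilon + \xi_i \big) - \sum_{i=1}^{N} \alpha_i^* \big( f(x_i, y_i) - Z_i + \varepsilon + \xi_i^* \big) - \sum_{i=1}^{N} (\eta_i \xi_i + \eta_i^* \xi_i^*)$$

where $\alpha_i$, $\alpha_i^*$, $\eta_i$, $\eta_i^*$ are the Lagrange multipliers associated with the constraints. They can be roughly interpreted as a measure of the influence of the constraints on the solution. A solution with $\alpha_i = \alpha_i^* = 0$ means that the corresponding data point has no influence on this solution.

Finally, the dual formulation of the problem is as follows:

$$\text{maximise } \quad -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\alpha_i^* - \alpha_i)(\alpha_j^* - \alpha_j) \sum_{k=1}^{m} \varphi_k(x_i, y_i) \varphi_k(x_j, y_j) - \varepsilon \sum_{i=1}^{N} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{N} Z_i (\alpha_i^* - \alpha_i) \qquad (24)$$

$$\text{subject to } \quad \sum_{i=1}^{N} (\alpha_i^* - \alpha_i) K_j(x_i, y_i) = 0 \ \text{ for } j = 1, \dots, J, \qquad 0 \le \alpha_i, \alpha_i^* \le C \ \text{ for } i = 1, \dots, N$$

By using the kernel trick this problem can be solved without directly modeling a feature space (the same as in non-linear classification). To do so it is necessary to choose $\varphi_k$ such that: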

$$\sum_{k=1}^{m}\varphi_k(x_i,y_i)\,\varphi_k(x_l,y_l) = G\big((x_i,y_i),(x_l,y_l)\big)$$

This is the case in a reproducing kernel Hilbert space, where $G$ is the reproducing kernel. The functions $\varphi_k$ are the eigenfunctions of $G$. In this case the solution can be formulated in the following form:

$$f(x,y) = \sum_{i=1}^{N} v_i\,G\big((x,y),(x_i,y_i)\big) + \sum_{j=1}^{J}\beta_j K_j(x,y), \qquad \text{with } v_i = \alpha_i^* - \alpha_i.$$

This solution depends only on the kernel function $G$. The main results were obtained with the Gaussian RBF kernel and $K(x,y) = 1$.

VI. SVR MAPPING. CASE STUDY

Let us consider mapping of soil pollution by the Chernobyl radionuclide Sr90 in the Western part of Briansk region, Russia. The case study follows the basic methodology applied to classification in the previous sections. An important development deals with a comprehensive analysis of the residuals. In terms of geostatistics, the useful information to be extracted from data and modelled with SVR is the spatially structured (spatially correlated) information. From this point of view, variography of the residuals is a powerful and efficient tool for controlling the performance of SVR mapping. The variogram rose of the training Sr90 data is presented in Figure 3.

Figure 3. Variogram rose of Sr90 raw data.

In the case of regression, when the Gaussian RBF kernel is fixed there are three hyper-parameters: the kernel bandwidth, the regularization constant C and ε. Therefore, an error cube has to be estimated and analyzed to find optimal SVR parameters. Some ideas on the selection of hyper-parameters are discussed in [7]. Training and testing error surfaces are presented in Figures 4, 5.

Figure 4. Training error surface, C = 1000. Axes correspond to: X, kernel bandwidth; Y, ε parameter.
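As a rough illustration of how such error surfaces over kernel bandwidth and ε could be produced, the following minimal Python sketch fits an ε-insensitive kernel regression with a Gaussian RBF kernel and scans a grid of hyper-parameters. It is not the solver used in this study. Several choices are assumptions made only for the sketch: the trend term is dropped (data are assumed to be centred, which removes the equality constraint of the dual), the dual is solved by plain coordinate ascent with a fixed number of sweeps and no convergence check, RMSE is taken as the error measure, and the names `fit_svr`, `scan` and `sweeps` are hypothetical.

```python
import math

def rbf(p, q, sigma):
    """Gaussian RBF kernel G(p, q) = exp(-|p - q|^2 / (2 sigma^2)) for 2-D points."""
    d2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))

def soft(r, eps):
    """Soft threshold: shrink r towards zero by eps (the epsilon-insensitive zone)."""
    if r > eps:
        return r - eps
    if r < -eps:
        return r + eps
    return 0.0

def fit_svr(points, z, sigma, C, eps, sweeps=200):
    """Coordinate ascent on the SVR dual written in v_i, the difference of the
    two Lagrange multipliers of point i, with the box constraint -C <= v_i <= C.
    Assumption: no trend term, so the dual has no equality constraint."""
    n = len(points)
    G = [[rbf(points[i], points[j], sigma) for j in range(n)] for i in range(n)]
    v = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            # target of point i minus the contribution of all other points
            r = z[i] - sum(G[i][j] * v[j] for j in range(n) if j != i)
            # exact one-dimensional maximiser, clipped to the box [-C, C]
            v[i] = max(-C, min(C, soft(r, eps) / G[i][i]))
    return v

def predict(points, v, sigma, q):
    """Kernel expansion f(q) = sum_i v_i G(q, p_i)."""
    return sum(vi * rbf(p, q, sigma) for p, vi in zip(points, v))

def rmse(train, v, sigma, pts, zs):
    """Root mean square error of the fitted model on (pts, zs)."""
    sq = sum((predict(train, v, sigma, p) - t) ** 2 for p, t in zip(pts, zs))
    return math.sqrt(sq / len(pts))

def scan(train, z_train, test, z_test, sigmas, epsilons, C=1000.0):
    """Return rows (sigma, eps, training RMSE, testing RMSE, fraction of SVs)."""
    rows = []
    for sigma in sigmas:
        for eps in epsilons:
            v = fit_svr(train, z_train, sigma, C, eps)
            n_sv = sum(1 for vi in v if abs(vi) > 1e-9)  # points with v_i != 0
            rows.append((sigma, eps,
                         rmse(train, v, sigma, train, z_train),
                         rmse(train, v, sigma, test, z_test),
                         n_sv / len(v)))
    return rows
```

Plotting the third and fourth columns of the rows returned by `scan` against bandwidth and ε would give training and testing error surfaces for one value of C; repeating the scan for several values of C would give the error cube. The last column is the normalized number of Support Vectors. For real data sets a dedicated quadratic programming or SMO solver would be a more appropriate choice than this sketch.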

Figure 5. Testing error surface, C = 1000. Axes correspond to: X, kernel bandwidth; Y, ε parameter.

The normalized number of Support Vectors is presented in Figure 6. The number of Support Vectors decreases monotonically with parameter ε. Let us note that the largest reasonable order of ε corresponds to the standard deviation of the data.

Figure 5. Normalized number of Support Vectors, C = 1000. Axes correspond to: X, kernel bandwidth; Y, ε parameter.

Figure 6. SVR mapping of Sr90. The variogram of the training residuals of the model is a pure nugget effect corresponding to the nugget of the raw data. o: training data; +: Support Vectors.

The Sr90 concentration varies between 0 and .4 Ci/km² and the number of training data is 200. The regularization parameter C does not significantly influence the error curves when C > 1000. At the optimal kernel bandwidth, the training error curves do not change below some value of the ε parameter, which more or less corresponds to the square root of the nugget in the original data, and then increase significantly. At fixed kernel bandwidth the number of Support Vectors decreases monotonically (Figure 5). Some discussion of the behaviour of the error curves can be found in [7].
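The residual check described above can be sketched in a few lines. The function below computes an omnidirectional empirical semivariogram, $\gamma(h) = \frac{1}{2N(h)}\sum (r_i - r_j)^2$ over the pairs whose separation falls into the distance class of $h$. Applied to the training residuals, a roughly flat curve at the level of the raw-data nugget would indicate a pure nugget effect. This is only an assumed minimal version: the name `empirical_variogram`, the equal-width lag classes and the omnidirectional pooling are illustrative choices, and the directional analysis needed for a variogram rose is not attempted.

```python
import math

def empirical_variogram(coords, values, lag, n_lags):
    """Omnidirectional empirical semivariogram.

    coords : list of (x, y) locations
    values : list of values (for example, training residuals) at those locations
    lag    : width of each distance class
    n_lags : number of distance classes
    Returns (class centre, gamma, number of pairs) for every non-empty class.
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.hypot(coords[i][0] - coords[j][0],
                           coords[i][1] - coords[j][1])
            k = int(d // lag)  # index of the distance class [k*lag, (k+1)*lag)
            if k < n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    return [((k + 0.5) * lag, sums[k] / (2 * counts[k]), counts[k])
            for k in range(n_lags) if counts[k] > 0]
```

Comparing the values returned for the residuals with those obtained for the raw data, using the same `lag` and `n_lags`, is one simple way to see whether any spatially structured information has been left in the residuals.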

VII. CONCLUSIONS

The problem of spatial data analysis and modelling with Support Vector Machines was considered. Both binary and multi-class classification problems were studied. The multi-class problem was investigated using real data on soil types. Several models generalizing binary-class SVC were applied. It was found that a simple one-to-rest model gives satisfactory results. There are still some open questions related to the selection of kernel types, local adaptation of SVC and SVR, the importance of data preprocessing, etc.

Spatial data mapping with SVR is an efficient, nonlinear and robust approach able to extract spatially structured information using raw data. The high flexibility of SVR, controlled by tuning hyper-parameters, can be efficiently used to model non-linear trends as well.

Important and still open questions deal with multivariate spatial predictions when the quantity and quality of data for correlated variables are different (the problem of spatial co-estimation); robustness of the solution; direct adaptation and implementation of geostatistical tools to SVC/SVR; and understanding of the influence of data clustering (preferential sampling).

ACKNOWLEDGEMENTS

The work was supported in part by European INTAS grants 97-31726, 99-00099 and CARTANN Swiss FNRS grant.

REFERENCES

[1] Vapnik V. Statistical Learning Theory. John Wiley & Sons, 1998.
[2] Cristianini N. and Shawe-Taylor J. An Introduction to Support Vector Machines and other kernel-based learning methods. Cambridge University Press, 2000, 189 pp.
[3] Burges C. A tutorial on Support Vector Machines for pattern recognition. Data Mining and Knowledge Discovery, 1998.
[4] Cherkassky V. and F. Mulier. Learning from data. Wiley Interscience, N.Y., 1998, 441 p.
[5] Kanevski M., N. Gilardi, M. Maignan, E. Mayoraz. Environmental Spatial Data Classification with Support Vector Machines. IDIAP Research Report. IDIAP-RR-99-07, 24 p., 1999a. (www.idiap.ch)
[6] N. Gilardi, M. Kanevski, E. Mayoraz, M. Maignan. Spatial Data Classification with Support Vector Machines. Accepted for Geostat 2000 congress. South Africa, April 2000.
[7] M. Kanevski, S. Canu. Spatial Data Mapping with Support Vector Regression. IDIAP Research Report; RR-00-09.
[8] Atkinson P. M. and Lewis P. Geostatistical classification for remote sensing: an introduction. Computers and Geosciences, vol. 26, pp. 361-371, 2000.
[9] Deutsch C.V. and A.G. Journel. GSLIB. Geostatistical Software Library and User's Guide. Oxford University Press, New York, 1997.
[10] Kanevski M., V. Demyanov, S. Chernov, E. Savelieva, A. Serov, V. Timonin, M. Maignan. Geostat Office for Environmental and Pollution Spatial Data Analysis. Mathematische Geologie, 3, April 1999, pp. 73-83.
[11] M. Kanevski, Koptelova, V. Demyanov. RamsW - Software for Modelling Migration of Radionuclides in Soil. Institute of Nuclear Safety (IBRAE). Preprint IBRAE 97-6, Moscow, 1997, 2 p.
[12] Weston J., Watkins C. Multi-class Support Vector Machines. Technical Report CSD-TR-98-04, 9 p., 1998.
[13] E. Mayoraz and E. Alpaydin. Support Vector Machine for Multiclass Classification. IDIAP-RR 98-06, 1998. (www.idiap.ch)
[14] M. Kanevski, R. Arutyunyan, L. Bolshov, V. Demyanov, M. Maignan. Artificial neural networks and spatial estimations of Chernobyl fallout. Geoinformatics, vol. 7, pp. 5-11, 1996.