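Stochastic GIS cellular automata for land use change simulation: application of a kernel based model

O. Okwuashi 1, J. McConchie 2, P. Nwilo 3, E. Eyo 4

1 School of Geography, Environment, and Earth Sciences, Victoria University of Wellington, Wellington, New Zealand +64-7793 onuwa.okwuashi@vuw.ac.nz
2 Opus International Consultants, P.O. Box 3, Wellington, New Zealand +64-75973 Jack.McConchie@opus.co.nz
3 Department of Surveying and Geoinformatics, University of Lagos, Lagos, Nigeria +34-83575644 pcnwilo@yahoo.com
4 Department of Surveying and Geoinformatics, Federal University of Technology, Minna, Nigeria +34-867895 geoetim@yahoo.com

1. Introduction

Cellular Automata (CA) are dynamic mathematical systems (which are discrete in time and space, operate on a uniform regular lattice, and are characterized by local interaction) that can be used with GIS to simulate land use change. The use of Artificial Neural Networks (ANN) is one of the most popular techniques for calibrating stochastic GIS CA. The shortcomings of ANNs are that they are black-box models and have a static nature in which causal factors are not dynamic (Kocabas and Dragicevic 2007); they might also suffer difficulties with generalization and produce models that overfit the data (Karystinos and Pados 2000). This study introduces the use of a kernel based model called Support Vector Machine (SVM) for calibrating the GIS cellular automata. SVMs are robust, dynamic, and not susceptible to overfitting. SVM results are compared with the ANN.

2. CA calibration using support vector machine

A more detailed SVM introduction is given in Cortes and Vapnik (1995) and Watanachaturaporn et al. (2004). Given a training dataset which consists of $n$ training samples $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, where $x_i \in \mathbb{R}^N$, such that $y_i \in \{-1, +1\}$. The objective of SVMs is to find a linear decision function defined by $f(x) = w \cdot x + b$, where $w \in \mathbb{R}^N$, and $b \in \mathbb{R}$ is a bias. The hyperplanes for the two classes are represented by $y_i (w \cdot x_i + b) \ge 1$. Slack variables $\xi_i > 0$ account for misclassification, so that $y_i (w \cdot x_i + b) \ge 1 - \xi_i$ represents the hyperplanes for the two classes. The optimal hyperplane $f(x) = 0$ is located where the margin between the two classes is maximized and the error minimized. The constrained optimization problem is,

$$\text{Min:} \quad \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i \tag{1}$$

$$\text{Subject to:} \quad y_i (w \cdot x_i + b) \ge 1 - \xi_i, \quad \text{for } i = 1, 2, \ldots, n.$$

The constant $C$, $0 < C < \infty$, is called the penalty value. Equation 1 is solved by constructing the Lagrangian,

$$L(w, b, \xi, \alpha, \beta) = \frac{1}{2}\, w \cdot w + C \sum_{i=1}^{n} \xi_i - \sum_{i=1}^{n} \alpha_i \left( y_i (w \cdot x_i + b) - 1 + \xi_i \right) - \sum_{i=1}^{n} \beta_i \xi_i \tag{2}$$

and finding the saddle point of $L(w, b, \xi, \alpha, \beta)$. The dual form of the solution of (2) becomes:

$$\text{Max:} \quad \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) \tag{3}$$

$$\text{Subject to:} \quad \sum_{i=1}^{n} \alpha_i y_i = 0 \ \text{ and } \ 0 \le \alpha_i \le C, \quad \text{for } i = 1, 2, \ldots, n.$$

The $\alpha_i$ are called the Lagrange multipliers. According to the Karush-Kuhn-Tucker (KKT) optimality condition (Fletcher 1987), some of the multipliers will be zero. The training samples whose multipliers are nonzero are called the support vectors. The result from the optimizer, called an optimal solution, is the set $\alpha = (\alpha_1, \ldots, \alpha_n)$; $w$ and $b$ are calculated from $w = \sum_{i=1}^{n} \alpha_i y_i x_i$ and $b = -\frac{1}{2}\left[ w \cdot x_{+1} + w \cdot x_{-1} \right]$, where $x_{+1}$ and $x_{-1}$ are the support vectors of class labels $+1$ and $-1$. The decision rule is then applied to classify the dataset into the two classes $+1$ and $-1$,

$$f(x) = \operatorname{sign}\left( \sum_{i=1}^{n} y_i \alpha_i (x_i \cdot x) + b \right). \tag{4}$$

For a nonlinear problem the transformation function $\varphi$ maps the data into a higher dimensional space. Suppose there exists a function $K$, called a kernel function, such that,

$$K(x_i, x_j) \equiv \varphi(x_i) \cdot \varphi(x_j). \tag{5}$$

The formulation of the kernel function from the dot product is a special case of Mercer's theorem (Mercer 1909). The optimization problem becomes,

$$\text{Max:} \quad \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \tag{6}$$

$$\text{Subject to:} \quad \sum_{i=1}^{n} \alpha_i y_i = 0 \ \text{ and } \ 0 \le \alpha_i \le C, \quad \text{for } i = 1, 2, \ldots, n,$$

while the decision function becomes,

$$f(x) = \operatorname{sign}\left( \sum_{i=1}^{n} y_i \alpha_i K(x_i, x) + b \right). \tag{7}$$

Examples of some well-known kernel functions are: the linear kernel $(x_i \cdot x_j)$, the polynomial function $(x_i \cdot x_j + 1)^d$, the radial basis function $\exp\left(-\|x_i - x_j\|^2 / 2\sigma^2\right)$, and the sigmoid function $\tanh(\kappa (x_i \cdot x_j) + \Theta)$.

Now we can calibrate our CA using SVM outputs. We can map the SVM outputs $f(x)$ to probabilities using a sigmoid function with parameters $A$ and $B$ (Platt 1999),

$$P(y = 1 \mid f) = \frac{1}{1 + \exp(Af + B)}. \tag{8}$$

Let $P(y = 1 \mid f)$ be replaced with $P^{t}_{i,j}$, where $P^{t}_{i,j}$ is the development probability of cell $(i, j)$ at time $t$. For stochastic CA, a stochastic perturbation term can be incorporated to represent unknown errors during the simulation, in order to ensure the predicted patterns approximate reality as closely as possible. The error term ($RA$) is given by White and Engelen (1993) as:

$$RA = 1 + (-\ln \gamma)^{\alpha}. \tag{9}$$

In order to increase control over the perturbation, (9) can be modified as (Okwuashi et al. 2009):

$$RA = \lambda \left\{ \beta + (-\ln \gamma)^{\alpha} \right\}, \tag{10}$$

where $\gamma$ is a uniform random variable within the range 0 and 1; and $\lambda$, $\alpha$, and $\beta$ are parameters that control the magnitude of the perturbation. The development probability at $t + 1$ can be revised as:

$$P^{t+1}_{i,j} = \left[ \lambda \left\{ \beta + (-\ln \gamma)^{\alpha} \right\} \right] * P^{t}_{i,j} * \prod_{k=1}^{m} \mathrm{cons}^{k}_{i,j}, \tag{11}$$

where $\mathrm{cons}^{k}_{i,j}$ denotes the contribution of the $k$-th of $m$ constraints. The outcome for $P^{t+1}_{i,j}$ is determined as:

$$\begin{cases} \text{developed}, & P^{t+1}_{i,j} \ge \psi \\ \text{undeveloped}, & \text{otherwise.} \end{cases} \tag{12}$$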
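The transition described by equations (8) and (10) to (12) can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not the implementation used in the study (which was done in MATLAB): the function names, the 3 x 3 grid, the decision values, and every parameter value below are hypothetical, and the constraint layers are assumed to have already been multiplied into a single array `cons` of zeros and ones.

```python
import numpy as np


def platt_probability(f, A, B):
    """Equation (8): sigmoid mapping of SVM outputs f to probabilities."""
    return 1.0 / (1.0 + np.exp(A * f + B))


def perturbation(shape, lam, beta, alpha, rng):
    """Equation (10): RA = lam * (beta + (-ln gamma)**alpha)."""
    # gamma is uniform on (0, 1]; excluding 0 keeps the logarithm finite.
    gamma = 1.0 - rng.random(shape)
    return lam * (beta + (-np.log(gamma)) ** alpha)


def ca_step(developed, f, cons, A, B, lam, beta, alpha, psi, rng):
    """One t -> t+1 transition following equations (8) and (10)-(12)."""
    p_t = platt_probability(f, A, B)
    ra = perturbation(f.shape, lam, beta, alpha, rng)
    p_next = ra * p_t * cons                    # equation (11)
    converted = ~developed & (p_next >= psi)    # equation (12)
    return developed | converted, p_next


# Hypothetical 3 x 3 example; none of these values come from the study.
rng = np.random.default_rng(0)
developed = np.array([[True, False, False],
                      [False, False, False],
                      [False, False, True]])
f = np.array([[1.5, 0.8, -0.2],
              [0.3, -1.0, 0.9],
              [-0.5, 1.2, 2.0]])      # stand-in SVM decision values
cons = (~developed).astype(float)     # 0 blocks a cell, 1 leaves it free
cons[1, 1] = 0.0                      # e.g. a water cell
new_developed, p_next = ca_step(developed, f, cons, A=-1.0, B=0.0,
                                lam=0.5, beta=1.0, alpha=1.0,
                                psi=0.6, rng=rng)
print(new_developed.astype(int))
```

In a full simulation the neighbourhood variable would be recomputed from `new_developed` and the SVM re-evaluated before the next call to `ca_step`; that loop is omitted here.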
In equation (12), $\psi$ is the predefined threshold.

3. Methodology

This study employed mainly remote sensing Landsat Thematic Mapper images acquired on December 8, 1984 and February 6, covering Lagos, Nigeria; and several other data sources. Fourteen land use variables were extracted for learning (table 1). These variables were extracted with ArcGIS. The distance variables were calculated using the Euclidean Distance function. The number of developed cells in the 3 × 3 Moore neighbourhood was first computed using the Focal Statistics function, while the updated neighbourhood and the modelling were done in MATLAB. Stratified random sampling was used to extract the training data. The following values were used for the calibration: λ =. 58, β = 5, α =, A =, B =, ψ =., polynomial function of degree d =, and C = e 8. Basically, $\alpha$, $b$, and $f(x)$ were computed. Only the neighbourhood variable $x_{14}$ was updated at every $t \to t + 1$ iteration to determine a new $f(x)$ and $P^{t}_{i,j}$. Undeveloped cells that have a development probability $P^{t+1}_{i,j}$ greater than or equal to the threshold probability $\psi$ were converted to developed cells. The visualization of results was done in ArcGIS. The ANN was trained with the method of back-propagation, using a 'two-layer feed-forward network' with 55 neurons in the hidden layer.

Land use variables

Target variables
  $y = +1$: developed
  $y = -1$: undeveloped

Proximity variables
  $x_{1}$: distance to water
  $x_{2}$: distance to residential area
  $x_{3}$: distance to industrial and commercial centres
  $x_{4}$: distance to major roads
  $x_{5}$: distance to Lagos Island
  $x_{6}$: distance to international airport
  $x_{7}$: distance to local airport
  $x_{8}$: distance to Apapa Port
  $x_{9}$: distance to Tin Can Island Port

Weighted variables
  $x_{10}$: distance to all settlements
  $x_{11}$: distance to all vegetated cells
  $x_{12}$: population potential
  $x_{13}$: income potential

Local variable
  $x_{14}$: $\Omega^{t}_{3 \times 3}$ is the number of developed cells in the 3 × 3 Moore neighbourhood at time $t$ (9 pixels)

Constraint variables
  Major roads, water, and developed cells

Table 1. The fourteen land use variables

4. Discussion and results

Figure 1 presents the actual land use development derived from a remotely sensed image and the simulated result from the support vector machine GIS-CA model. The SVM result of the cell-by-cell comparison for the period from 1984 to the target year is given by the confusion matrix in table 2, and the result from the ANN method is given by the confusion matrix in table 3. The kappa statistic was calculated for the SVM and ANN respectively (see figure 2). The kappa coefficient can provide a much better interpretation for measuring accuracy because it can address the difference between the actual agreement and chance agreement (Fung and LeDrew 1988). SVM performed better than ANN judging by their kappa coefficients.

Figure 1. (a) Actual base year: 1984, (b) Actual target year, and (c) SVM simulated target year. Legend: Undeveloped, Developed; scale bar in kilometers.
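To make the kappa comparison concrete, the following minimal Python sketch computes the kappa coefficient from a confusion matrix as the actual agreement corrected for chance agreement. The function name and the counts are hypothetical and chosen only for illustration; they are not the values reported in this study.

```python
def kappa_coefficient(matrix):
    """Kappa from a square confusion matrix (rows: predicted, columns: reference)."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Actual agreement: share of cells on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / total ** 2
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts: [[developed/developed, developed/undeveloped],
#                       [undeveloped/developed, undeveloped/undeveloped]]
example = [[40, 10],
           [5, 45]]
print(f"{kappa_coefficient(example):.3f}")  # prints 0.700
```

Here the actual agreement is 0.85 and the chance agreement is 0.50, so kappa is (0.85 - 0.50) / (1 - 0.50) = 0.70.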
                              Reference data
Predicted data        Developed      Undeveloped
  Developed           5753           776
  Undeveloped         444            9583

Table 2. SVM confusion matrix

                              Reference data
Predicted data        Developed      Undeveloped
  Developed           563            6644
  Undeveloped         574            86893

Table 3. ANN confusion matrix

Figure 2. Validation of models: Kappa coefficients for SVM and ANN. Vertical axis: Kappa coefficient; horizontal axis: CA calibration methods (SVM, ANN).

5. Conclusion

The result of this modelling showed good conformity between the simulated and the actual land use development. The SVM posted a better result than the ANN model. SVMs are relatively new tools that have been applied to various fields of study, but have not been favourably adopted for modelling land use change. SVMs may be computationally intensive, but studies have shown that their results are highly accurate, and they seem a promising tool for simulating land use change.

6. References

Cortes C and Vapnik V, 1995, Support vector networks. Machine Learning, 20(3): 273-297.
Fletcher R, 1987, Practical Methods of Optimization. 2nd Ed., John Wiley & Sons, New York.
Fung T and LeDrew E, 1988, The determination of optimal threshold levels for change detection using various accuracy indices. Photogrammetric Engineering and Remote Sensing, 54(10): 1449-1454.
Karystinos G and Pados D, 2000, On overfitting, generalization, and randomly expanded training sets. IEEE Transactions on Neural Networks, 11(5): 1050-1057.
Kocabas V and Dragicevic S, 2007, Enhancing a GIS Cellular Automata Model of Land Use Change: Bayesian Networks, Influence Diagrams and Causality. Transactions in GIS, 11(5): 681-702.
Mercer J, 1909, Functions of positive and negative type and their connection with the theory of integral equations. Transactions of the London Philosophical Society, A-209: 415-446.
Okwuashi O, McConchie J, Nwilo P, and Eyo E, 2009, Enhancing a GIS cellular automata model of land use change using support vector machine. The 17th International Conference on Geoinformatics 2009, Fairfax, Virginia, USA.
Platt J, 1999, Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In: A. Smola, P. Bartlett, B. Schölkopf, D. Schuurmans, (Eds.), Advances in Large Margin Classifiers. MIT Press, Cambridge, MA, pp. 61-74.
Watanachaturaporn P, Varshney P, and Arora M, 2004, Evaluation of factors affecting support vector machines for hyperspectral classification. American Society for Photogrammetry & Remote Sensing (ASPRS) 2004 Annual Conference, Denver, CO.
White R and Engelen G, 1993, Cellular Automata and fractal urban form: a cellular modelling approach to the evolution of urban land-use patterns. Environment and Planning A, 25(8): 1175-1199.