Decision Boundary Formation of Neural Networks

Decision Boundary Formation of Neural Networks

C. LEE, E. JUNG, O. KWON, M. PARK, AND D. HONG
Department of Electrical and Electronic Engineering, Yonsei University
134 Shinchon-Dong, Seodaemun-Ku, Seoul 120-749, Korea

Abstract: In this paper, we provide a thorough analysis of the decision boundaries of neural networks when they are used as classifiers. First, we divide the classifying mechanism of the neural network into two parts: dimension expansion by the hidden neurons and linear decision boundary formation by the output neurons. In this paradigm, the input data is first warped into a higher dimensional space by the hidden neurons, and the output neurons draw linear decision boundaries in the expanded space (the hidden neuron space). We also found that the decision boundaries in the hidden neuron space are not completely independent. This dependency of decision boundaries extends to multiclass problems, providing valuable insight into the formation of decision boundaries in the hidden neuron space. This analysis offers a new understanding of how neural networks construct complex decision boundaries and explains how different sets of weights may produce similar results.

Key-Words: neural networks, analysis of decision boundary, dimension expansion, linear boundary, dependent decision boundary

1 Introduction
Neural networks have been successfully used in various pattern recognition problems including character recognition [1], remote sensing [2], and communication [3]. The increasing popularity of neural networks is partly due to their ability to learn and therefore generalize. Moreover, neural networks make no prior assumptions about the statistics of the input data and can construct complex decision boundaries. Although it is known that neural networks can define arbitrary decision boundaries without assuming any underlying distribution [4], the decision boundaries of neural networks are not well understood. There have been many papers that analyze how neural networks work [5-7]. Gibson and Cowan investigated the decision regions of multi-layer perceptrons and derived some geometric properties of those regions [8]. Makhoul et al. showed that neural networks with a single hidden layer are capable of forming disconnected decision regions [9]. Blackmore et al. investigated the decision region approximation of neural networks and compared neural networks with polynomials [10]. Nitta analyzed the decision boundaries of complex-valued neural networks [11]. Other researchers investigated a learning algorithm based on a decision-based formulation for pattern classification problems [12]. And Pal et al. proposed a method to find decision boundaries for pattern classification using genetic algorithms and made extensive performance comparisons with neural networks and other classifiers, providing insights into the decision boundaries of complex pattern classification problems [13].

In this paper, we systematically analyze the decision boundaries of feedforward neural networks and provide helpful insight and a new interpretation of the working mechanism of neural networks. In particular, when neural networks are used as classifiers, we note that the working mechanism of the neural network can be divided into two parts: dimension expansion by the hidden neurons and linear decision boundary formation by the output neurons. In this context, we define the role of the hidden neurons as mapping the original data into a space of a different dimension.

Fig. 1. Example of a 3-layer feedforward neural network (2 pattern classes).

(The Korea Science and Engineering Foundation partly supported the publication of this paper through BERC at Yonsei University.)

2 Feedforward neural networks and terminologies
A typical neural network has an input layer, a number of hidden layers, and an output layer. It may also include bias terms. Fig. 1 shows an example of a 3-layer feedforward neural network (2 pattern classes). The decision rule is to choose the class corresponding to the output neuron with the largest output [14]. First, we assume that the activation function is the sigmoid function:

    σ(x) = 1 / (1 + e^(-x))                                    (1)

In the neural network of Fig. 1, we will assume that the input vector X_in is an M x 1 column vector, X an (M+1) x 1 column vector, Z' an N x 1 column vector, and Z an (N+1) x 1 column vector. The vectors in the output layer, Y' and Y, are 2 x 1 column vectors. We will call Z the output vector of the hidden neurons. Each component of Z is computed as follows:

    z_i = 1 / (1 + e^(-φ_i^T X)),  i = 1, ..., N,  and  z_(N+1) = 1.

Consequently, points that satisfy φ_i^T X = c will end up with the same value of z_i. And φ_i^T X = c represents a point, a line, a plane, or a hyperplane in the input space, depending on the dimension of the input vector. In this paper, we will call φ_i^T X = c equivalent-weight lines, equivalent-weight planes, or equivalent-weight hyperplanes, depending on the input dimension. Furthermore, φ_i^T X = 0, which corresponds to z_i = 0.5, will be called a middle-weight line, middle-weight plane, or middle-weight hyperplane, depending on the dimension. In Fig. 1, there are two bias terms b_1 and b_2. Without loss of generality, we can assume b_1 = b_2 = 1. It can be easily seen that the relationships between these vectors are as follows:

    X_in = [x_1, x_2, ..., x_M]^T
    X = [x_1, x_2, ..., x_M, b_1]^T
    Z' = Φ^T X = [φ_1, φ_2, ..., φ_N]^T X = [φ_1^T X, φ_2^T X, ..., φ_N^T X]^T    (2)

where φ_i is an (M+1) x 1 column vector. We will call Φ = [φ_1, φ_2, ..., φ_N] the weight matrix between the inputs and the hidden neurons, and φ_i the weight vector between hidden neuron i and the inputs. The remaining vectors are obtained as follows:

    Z = [σ(φ_1^T X), σ(φ_2^T X), ..., σ(φ_N^T X), b_2]^T
    Y' = Ψ^T Z = [ψ_1, ψ_2]^T Z = [ψ_1^T Z, ψ_2^T Z]^T
    Y = [y_1, y_2]^T = [σ(ψ_1^T Z), σ(ψ_2^T Z)]^T

where ψ_i is an (N+1) x 1 column vector. We will call Ψ = [ψ_1, ψ_2] the weight matrix between the hidden and output neurons, and ψ_i the weight vector between output neuron i and the hidden neurons. Furthermore, we will call the space defined by the input vector X_in the input space, and the space defined by the outputs of the hidden neurons with the last component removed, [z_1, z_2, ..., z_N] = [σ(φ_1^T X), σ(φ_2^T X), ..., σ(φ_N^T X)], the hidden neuron space.

Fig. 2. Example of a radial basis function neural network (2 pattern classes).

Fig. 2 shows an example of a radial basis function neural network with the Gaussian function in the hidden layer. Since the Gaussian function is used as the radial basis function in the hidden neurons, the outputs of the hidden neurons are computed as follows:

    φ_i(X) = exp(-||X - M_i||^2)

where X = [x_1, x_2, ..., x_m]^T and M_i = [μ_i1, μ_i2, ..., μ_im]^T, i = 1, ..., n. M_i is called the center and is one of the parameters to be updated during the learning process. Similarly, the points that satisfy ||X - M_i|| = c represent points, a circle, a sphere, or a hypersphere in the input space, depending on the dimension of the input space. In this paper, the points that satisfy ||X - M_i|| = 0.833, which corresponds to φ_i(X) = 0.5, will be called a half circle, a half sphere, or a half hypersphere, depending on the input dimension. The i-th output of the output layer is simply computed as follows:

    y_i = Σ_(j=1..n) w_ij φ_j = W_i^T Φ

where W_i = [w_i1, w_i2, ..., w_in]^T and Φ = [φ_1, φ_2, ..., φ_n]^T. Here w_ij is the weight between the j-th hidden neuron and the i-th output neuron. In order to avoid confusion, we will denote the neural network whose activation function is the sigmoid function as SIGNN and the radial basis function neural network as RBFNN.
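The layer computations above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the paper; the weight values are arbitrary assumptions chosen to show a 2-dimensional input being mapped into a 3-dimensional hidden neuron space.

```python
import math

def sigmoid(x):
    # Activation function of Eq. (1): sigma(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def forward(X_in, Phi, Psi):
    """One forward pass of a SIGNN as in Fig. 1.

    Phi: N weight vectors phi_i, each of length M+1 (inputs plus bias b1 = 1).
    Psi: 2 weight vectors psi_i, each of length N+1 (hidden plus bias b2 = 1).
    Returns (Z, Y): hidden-layer output vector and the two output values.
    """
    X = X_in + [1.0]                           # append bias b1 = 1
    Z = [sigmoid(dot(phi, X)) for phi in Phi]  # hidden neuron space coordinates
    Z = Z + [1.0]                              # append bias b2 = 1
    Y = [sigmoid(dot(psi, Z)) for psi in Psi]
    return Z, Y

# 2 inputs (M = 2) warped into a 3-dimensional hidden neuron space (N = 3);
# these weights are made-up examples, not trained values:
Phi = [[1.0, -1.0, 0.0], [0.5, 0.5, -0.5], [-1.0, 2.0, 0.2]]
Psi = [[1.0, -2.0, 0.5, 0.1], [-1.0, 1.0, 0.5, -0.1]]
Z, Y = forward([0.3, 0.7], Phi, Psi)
```

Note that an input on the middle-weight line of hidden neuron 1 (phi_1^T X = 0) yields z_1 = 0.5, as the definitions above require.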

3 Dimension expansion of hidden neurons
First, we view the outputs of the hidden neurons as a non-linear mapping of the inputs, since a typical activation function is a non-linear function such as the sigmoid function or the Gaussian function. And we observe that adding a hidden neuron is equivalent to expanding the dimension of the hidden neuron space. Thus, if the number of hidden neurons is larger than the number of inputs, the input data will be warped into a higher dimensional space. However, the intrinsic dimension of the data distribution in the hidden neuron space cannot exceed the dimension of the original input space. For example, if the dimension of the input vector is 2 and the number of hidden neurons is 3, the data will be distributed on a curved plane in the 3-dimensional space, as shown in Fig. 3. In other words, if the number of hidden neurons is larger than the dimension of the input space, the input data will be warped into a higher dimensional space while maintaining the same intrinsic dimension. On the other hand, if φ_i^T X = c_i and φ_j^T X = c_j are parallel, the intrinsic dimension in the hidden neuron space can be smaller than the dimension of the original input space. Fig. 4 illustrates a case where a plane in the input space is mapped onto a curved line in the hidden neuron space. Fig. 5 shows an example of the dimension expansion of an RBFNN with 3 hidden neurons; the corresponding three half circles and the decision boundaries are also displayed. In the next section, we will investigate decision boundaries in the hidden neuron space, which will always be linear boundaries. From this point of view, it can be seen that, when neural networks are used as classifiers, the input data is first mapped non-linearly into a higher dimensional space and then divided by linear decision boundaries. Finally, the linear decision boundaries in the hidden neuron space will be warped into complex decision boundaries in the original input space.

Fig. 3. An example of the dimension expansion of SIGNN: three middle-weight lines in the input space and the corresponding distribution in the hidden neuron space.

Fig. 4. A case where a plane of the input space is mapped onto a curved line in the hidden neuron space (SIGNN).

Fig. 5. An example of the dimension expansion of RBFNN: three half circles, which correspond to 3 hidden neurons, in the 2-dimensional input space, and the corresponding curved plane in the 3-dimensional hidden neuron space.

4 Decision boundaries in the hidden neuron space
In this section, we analyze the decision boundaries of neural networks whose activation function is the sigmoid function. However, it can be easily seen that the same analysis applies to RBFNN.

4.1 Two pattern classes
First, we consider the decision boundaries of neural networks for a 2-pattern class problem, as shown in Fig. 1. Since the decision boundary between two output neurons is a building block for multiclass problems, a thorough analysis of the decision boundaries of a 2-pattern class problem will provide valuable insight into how the decision boundaries of neural networks are defined. After training is finished, the decision rule is to choose the class corresponding to the output neuron with the largest output. The decision boundary defined by the neural network in Fig. 1 is given by y_1 = y_2 in the space defined by y_1 and y_2 (the y_1-y_2 plane). The equivalent decision boundary is given by

    σ(ψ_1^T Z) = σ(ψ_2^T Z)                                    (3)

in the hidden neuron space defined by z_1, z_2, ..., z_N. Since the sigmoid function of (1) is monotonically increasing, the same decision boundary given by (3) will be obtained in the hidden neuron space with the following equation:

    ψ_1^T Z = ψ_2^T Z
    ⇔ (ψ_1 - ψ_2)^T Z = C^T Z = 0
    ⇔ c_1 z_1 + c_2 z_2 + ... + c_N z_N + c_(N+1) b_2 = 0
    ⇔ c_1 z_1 + c_2 z_2 + ... + c_N z_N + c_(N+1) = 0

where b_2 = 1 and C = ψ_1 - ψ_2 is an (N+1) x 1 column vector. In other words, the decision boundary between two classes in the hidden neuron space will always be a linear decision boundary, though the decision boundaries in the input space are complex non-linear decision boundaries. From this analysis, it can be easily seen that a neural network with two output neurons can always be reduced to a neural network with one output neuron, where the weight vector between the hidden neurons and the output neuron is given by C = ψ_1 - ψ_2. In this way, the complexity of neural networks can be reduced substantially. The decision rule is as follows:

    If y = C^T Z = (ψ_1 - ψ_2)^T Z > 0, decide ω_1. Else, decide ω_2.

4.2 Three pattern classes
If there are 3 pattern classes, typically there will be three output neurons, and the decision rule is to choose the class corresponding to the output neuron with the largest output. The output vector of the output neurons, Y, will be given by Y = [y_1, y_2, y_3]^T = [σ(ψ_1^T Z), σ(ψ_2^T Z), σ(ψ_3^T Z)]^T. In the hidden neuron space, the three decision boundaries between each pair of classes will be given by

    ψ_1^T Z = ψ_2^T Z ⇔ (ψ_1 - ψ_2)^T Z = 0                     (4)
    ψ_1^T Z = ψ_3^T Z ⇔ (ψ_1 - ψ_3)^T Z = 0                     (5)
    ψ_2^T Z = ψ_3^T Z ⇔ (ψ_2 - ψ_3)^T Z = 0                     (6)

Each equation represents a linear decision boundary in the hidden neuron space. Thus, any two of the three equations will have an intersection, except in the trivial case that they are parallel. Let Z_0 be a point on the intersection of (4) and (5). In other words,

    (ψ_1 - ψ_2)^T Z_0 = 0 and (ψ_1 - ψ_3)^T Z_0 = 0.

Then we can easily show that

    (ψ_2 - ψ_3)^T Z_0 = (ψ_1 - ψ_3)^T Z_0 - (ψ_1 - ψ_2)^T Z_0 = 0.

In other words, Z_0 will also satisfy (6). Similarly, we can show that any point that is a solution to any two of the three equations will be a solution of the remaining equation. This indicates that the three linear decision boundaries will always meet at the same intersection. Fig. 6 illustrates how the decision boundaries are formed in the hidden neuron space for a 3-class problem. The first decision boundary divides the hidden neuron space into two regions (Fig. 6a). When the second decision boundary, between ω_2 and ω_3, is added as shown in Fig. 6b, the upper-left region can be classified as ω_3 (y_3 > y_1 > y_2), the upper-right region is classified as ω_1, and the lower-right as ω_2. The lower-left region is not determined yet. The third decision boundary cannot divide the upper-left region, which was classified as ω_3 after the second decision boundary was introduced, as shown in Fig. 6c, since that produces a contradiction (y_1 > y_2, y_2 > y_3, y_3 > y_1) in the region denoted NA (not allowed). Thus, the third decision boundary should divide the undetermined region, as shown in Fig. 6d. However, it cannot produce the classification result shown in Fig. 6e. Let P be a point on the line y_3 = y_1 + C, where C is an arbitrary positive number. As P approaches the decision boundary between ω_1 and ω_2, the difference between y_1 and y_2 diminishes to zero (Fig. 6f). However, since y_3 = y_1 + C, where C can be arbitrarily large, P cannot be classified as ω_1 but would be classified as ω_3 (y_3 >> y_1 ≈ y_2). It is noted that the neural network for a 3-pattern class problem divides the hidden neuron space into 3 regions, though the 3 decision boundaries divide the hidden neuron space into 6 regions. As illustrated above, if two decision boundaries are given, the remaining decision boundary is automatically determined, since the three equations (4)-(6) are not linearly independent. In other words,

    (ψ_2 - ψ_3)^T Z = (ψ_1 - ψ_3)^T Z - (ψ_1 - ψ_2)^T Z.

We showed that a neural network with two output neurons can always be reduced to a neural network with one output neuron. Similarly, a neural network with three output neurons can always be reduced to a neural network with two output neurons, where the weight vectors between the hidden neurons and the output neurons are given by C_1 = ψ_1 - ψ_3 and C_2 = ψ_2 - ψ_3.

4.3 Angles between linear decision boundaries in the hidden neuron space
In the hidden neuron space, (ψ_i - ψ_j)^T Z = 0 represents a linear decision boundary. It can be rewritten as

    (ψ_i - ψ_j)^T Z = C^T Z = 0
    ⇔ c_1 z_1 + c_2 z_2 + ... + c_N z_N + c_(N+1) b_2 = 0
    ⇔ c_1 z_1 + c_2 z_2 + ... + c_N z_N = -c_(N+1)

where b_2 = 1 and C = ψ_i - ψ_j is an (N+1) x 1 column vector. Let ψ̄_i be the vector whose components are the same as those of ψ_i with the last component removed. In other words, ψ̄_i = [ψ_i,1, ψ_i,2, ..., ψ_i,N]^T, where ψ_i,j is the j-th component of ψ_i. Depending on the dimension, (ψ_i - ψ_j)^T Z = 0 may represent a line, a plane, or a hyperplane that is perpendicular to the vector

    ψ̄_i - ψ̄_j = [ψ_i,1 - ψ_j,1, ψ_i,2 - ψ_j,2, ..., ψ_i,N - ψ_j,N]^T.

On the other hand, it can be easily seen that

    (ψ̄_1 - ψ̄_2) = (ψ̄_1 - ψ̄_3) - (ψ̄_2 - ψ̄_3).                  (7)

The relationship among the three vectors of (7) is illustrated in Fig. 7. The linear decision boundary between class ω_i and class ω_j, which is perpendicular to ψ̄_i - ψ̄_j, is also shown in Fig. 7. Since the angle between two lines (or planes) is the same as that between the two vectors normal to them, the angle between two decision boundaries in the hidden neuron space is given by

    cos θ = ((ψ̄_i - ψ̄_j) · (ψ̄_j - ψ̄_k)) / (|ψ̄_i - ψ̄_j| |ψ̄_j - ψ̄_k|)    (8)

where θ is the angle between the ω_i-ω_j decision boundary and the ω_j-ω_k decision boundary. Since decision boundaries are not vectors, we may have to take the absolute value of (8). Fig. 7 illustrates the relationship of the angles between the decision boundaries in the hidden neuron space for a 3-pattern class problem. Therefore, it can be said that the angle between two decision boundaries in the hidden neuron space is determined by the directions and magnitudes of the vectors (ψ̄_i - ψ̄_j) and (ψ̄_j - ψ̄_k).

Fig. 6. Decision boundaries for a 3-pattern class problem in the hidden neuron space.

Fig. 7. Angles between decision boundaries for 3 classes in the hidden neuron space.

4.4 Decision boundaries in multiclass problems
If there are K pattern classes (K output neurons), theoretically there will be K(K-1)/2 decision boundaries in the hidden neuron space.
However, only K-1 decision boundaries are independent, and the remaining boundaries will be automatically determined. In other words, there are only K-1 degrees of freedom. For example, for a 4-pattern class problem, there are 6 decision boundaries in the hidden neuron space, but only 3 degrees of freedom. Once 3 decision boundaries are given, the remaining 3 boundaries will be automatically determined, as shown in Fig. 8, where the solid lines represent the 3 independent decision boundaries and the dotted lines represent the 3 dependent decision boundaries. In other words, if the ω_i-ω_j decision boundary and the ω_j-ω_k decision boundary are decided, the ω_i-ω_k decision boundary is automatically determined. It should pass through the intersection where the ω_i-ω_j decision boundary and the ω_j-ω_k decision boundary meet and is given by

    (ψ_i - ψ_k)^T Z = (ψ_i - ψ_j)^T Z + (ψ_j - ψ_k)^T Z = 0.
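This dependency among the pairwise boundaries is easy to check numerically. The sketch below uses random weight vectors (an assumption purely for illustration): for any point Z in the hidden neuron space, the three boundary functions are linearly dependent, so a point lying on two boundaries necessarily lies on the third, and the angle of Eq. (8) follows from the weight vectors with the bias component stripped.

```python
import math
import random

random.seed(0)
N = 4  # number of hidden neurons; weight vectors have N+1 entries (bias last)
psi1 = [random.uniform(-1, 1) for _ in range(N + 1)]
psi2 = [random.uniform(-1, 1) for _ in range(N + 1)]
psi3 = [random.uniform(-1, 1) for _ in range(N + 1)]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# For any Z, (psi2 - psi3)^T Z = (psi1 - psi3)^T Z - (psi1 - psi2)^T Z,
# so the three pairwise boundaries meet at a common intersection:
Z = [random.uniform(0, 1) for _ in range(N + 1)]
lhs = dot(sub(psi2, psi3), Z)
rhs = dot(sub(psi1, psi3), Z) - dot(sub(psi1, psi2), Z)
assert abs(lhs - rhs) < 1e-12

def angle_deg(ci, cj):
    # Angle between two boundaries, Eq. (8), using the bias-stripped
    # normal vectors; the absolute value keeps theta in [0, 90] degrees.
    ci, cj = ci[:-1], cj[:-1]
    c = abs(dot(ci, cj)) / (math.sqrt(dot(ci, ci)) * math.sqrt(dot(cj, cj)))
    return math.degrees(math.acos(min(1.0, c)))

theta = angle_deg(sub(psi1, psi2), sub(psi2, psi3))
```

The same check generalizes to K classes: any boundary (ψ_i - ψ_k)^T Z = 0 is the sum of the ω_i-ω_j and ω_j-ω_k boundary functions.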

As shown previously, the direction of the ω_i-ω_k decision boundary is also determined once the ω_i-ω_j decision boundary and the ω_j-ω_k decision boundary are decided. For a 4-class problem, there will be 4 intersections (points, lines, planes, or hyperplanes) where 3 decision boundaries meet, except in the trivial case that some of the decision boundaries are parallel. At those intersections, the 3 decision regions corresponding to 3 classes will be determined. By combining those regions, one can construct the overall decision boundaries, which are displayed with bold lines in Fig. 8. In Fig. 8, the 4 bold dots represent the 4 intersections where 3 decision boundaries meet. Although there are K(K-1)/2 decision boundaries for a K-class problem and the decision boundaries divide the hidden neuron space into numerous subregions, the final decision boundaries for classification divide the hidden neuron space into only K regions, which are convex. In general, the linear decision boundaries, which divide the hidden neuron space into K regions, will be warped into complex decision boundaries that may divide the original input space non-linearly into many more than K regions.

Fig. 8. Six decision boundaries of a 4-pattern class problem in the hidden neuron space.

5 Examples of decision boundaries of neural networks
Without loss of generality, we may assume that the input vectors to neural networks are bounded; one can always bound the input data by given upper and lower bounds through scaling and translation. We also assume that there are two pattern classes, and we limit the dimension of the input space to 2 for easy illustration.

5.1 Linear decision boundaries in the input space
The simplest linear boundary in the input space would be obtained with one hidden neuron. Assuming SIGNN, in the case of two pattern class problems, the decision boundary in the input space is given by

    c_1 z_1 + c_2 z_2 + ... + c_N z_N + c_(N+1) = 0
    ⇔ c_1/(1 + e^(-φ_1^T X)) + c_2/(1 + e^(-φ_2^T X)) + ... + c_N/(1 + e^(-φ_N^T X)) + c_(N+1) = 0

where C = ψ_1 - ψ_2. With one hidden neuron (N = 1), this reduces to

    c_1/(1 + e^(-φ_1^T X)) + c_2 = 0,

and the decision boundary will be an equivalent-weight line in this case. Typically, decision boundaries in the input space can be straight lines, curves, circles, planes, curved surfaces, curved closed surfaces, etc. More interesting linear boundaries can be obtained if we add more hidden neurons. For instance, Fig. 9 illustrates how a linear decision boundary of SIGNN, which solves an XOR problem, can be obtained with two hidden neurons. In Fig. 9a, the middle-weight lines, i.e., φ_i^T X = 0, are parallel. Fig. 10 shows another example of linear decision boundaries. In general, if the two middle-weight lines meet obliquely, as in the case of Fig. 10a, the decision boundary in the input space is not linear. However, if the decision boundary in the hidden neuron space is perpendicular to one of the coordinates of the hidden neuron space, we will obtain a linear decision boundary in the input space, as shown in Fig. 10. On the other hand, Fig. 11 shows an example of a linear decision boundary when there are 3 hidden neurons (3 middle-weight lines in the input space). Fig. 12 shows an example of linear decision boundaries of RBFNN with two hidden neurons that are closely located. As can be seen from Figs. 9-12, we have more freedom in drawing a more flexible decision boundary with more hidden neurons.

5.2 Convex decision boundaries in the input space
Typically, if the middle-weight lines meet obliquely in SIGNN, as shown in Figs. 13-14, the decision boundaries in the input space will be convex except in some special cases, though the decision boundaries in the hidden neuron space are linear. Usually, the convex decision boundaries in Fig. 13a and Fig. 14a divide the input space into two regions corresponding to the two classes. Due to the nature of the sigmoid function, z_i is bounded by (0, 1), as shown in Figs. 13b and 14b. In Fig. 14b, there are two limit points through which the linear decision boundary passes in the hidden neuron space: (0.5, 1) and (1, 0.5). Thus, it can be seen that the decision boundary in the input space asymptotically converges to two equivalent-weight lines: φ_1^T X = 1.099 and φ_2^T X = 1.099. It is noted that it is impossible to divide the input space into more than 2 regions with two hidden neurons, except in the special case of Fig. 9, where the two middle-weight lines are parallel. However, if the two middle-weight lines do not intersect in a given input range, it is still possible to divide the given input space into 3 regions, as shown in Fig. 15.

Fig. 9. An example of a linear decision boundary of SIGNN: middle-weight lines and decision boundary (bold line) in the input space, and the decision boundary in the hidden neuron space.

Fig. 10. Another example of a linear decision boundary of SIGNN: middle-weight lines and decision boundary (bold line) in the input space, and the decision boundary in the hidden neuron space.

Fig. 11. An example of a linear decision boundary of SIGNN when there are 3 hidden neurons: middle-weight lines and decision boundaries (bold line) in the input space, and the decision boundary in the hidden neuron space.

Fig. 12. An example of linear decision boundaries of RBFNN with two hidden neurons that are closely located: the input space and the hidden neuron space.

Fig. 13. An example of a convex decision boundary of SIGNN (bold line): the input space and the hidden neuron space.

Fig. 14. Another example of convex decision boundaries of SIGNN (bold line): the input space and the hidden neuron space.

Fig. 15. Dividing the input space into 3 regions with two non-parallel middle-weight lines (SIGNN): the input space and the hidden neuron space.
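The XOR construction with two parallel middle-weight lines can be reproduced with hand-picked weights. The slope factor k and the threshold below are illustrative assumptions, not the weights behind Fig. 9: the two middle-weight lines are x1 + x2 = 0.5 and x1 + x2 = 1.5, and the linear boundary z1 - z2 = 0.5 in the hidden neuron space maps back to the band between those two parallel lines, which is exactly the XOR region.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Two hidden neurons whose middle-weight lines x1 + x2 = 0.5 and
# x1 + x2 = 1.5 are parallel; k sharpens the sigmoid transition.
k = 10.0
phi1 = [k, k, -0.5 * k]   # z1 = sigma(k * (x1 + x2 - 0.5))
phi2 = [k, k, -1.5 * k]   # z2 = sigma(k * (x1 + x2 - 1.5))

def hidden(x1, x2):
    X = [x1, x2, 1.0]     # input plus bias b1 = 1
    return [sigmoid(sum(p * x for p, x in zip(phi, X)))
            for phi in (phi1, phi2)]

def classify(x1, x2):
    # Linear boundary z1 - z2 = 0.5 in the hidden neuron space.
    z1, z2 = hidden(x1, x2)
    return 1 if z1 - z2 > 0.5 else 0

labels = [classify(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
# labels == [0, 1, 1, 0], i.e., XOR
```

Only points between the two parallel middle-weight lines have z1 near 1 and z2 near 0 simultaneously, which is why a single linear boundary in the hidden neuron space suffices.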

5.3 Disconnected and closed decision boundaries in the input space
As we add more hidden neurons, more complex decision boundaries can be obtained. As explained previously, the decision boundary in the hidden neuron space will always be linear. However, with more hidden neurons, we have more freedom to draw complex decision boundaries in the input space using linear decision boundaries in the hidden neuron space, as shown in Figs. 16-18 (SIGNN). In general, when there are K output neurons, the neural network divides the hidden neuron space into K regions. However, the hidden neurons and the sigmoid function, which map the entire input space into a bounded region in the hidden neuron space, make it possible to divide the input space into more than K regions. For instance, a neural network with 3 hidden neurons can divide the input space into 4 regions (Fig. 16), 3 regions (Fig. 17), or 2 regions (Fig. 18). In particular, the decision boundary in Fig. 18 is a closed boundary. Fig. 19 shows an example of circular decision boundaries of RBFNN with two hidden neurons, and Fig. 20 shows how two separate decision boundaries can be obtained in the input space. If the half circles in the input space cross, the data distribution in the hidden neuron space is convex (Fig. 19). If the half circles in the input space do not cross, the data distribution in the hidden neuron space becomes concave (Fig. 20). Unlike SIGNN, the typical decision boundary of RBFNN is a closed boundary, except in the special case of Fig. 12.

Fig. 16. Dividing the input space into 4 regions with 3 hidden neurons (SIGNN): the input space and the hidden neuron space.

Fig. 17. More complex decision boundaries in the input space with 3 hidden neurons (SIGNN): the input space and the hidden neuron space.

Fig. 18. A closed decision boundary in the input space with 3 hidden neurons (SIGNN): the input space and the hidden neuron space.

Fig. 19. An example of circular decision boundaries of RBFNN with two hidden neurons: the input space and the hidden neuron space.

Fig. 20. An example of two circular boundaries with two hidden neurons (RBFNN): the input space and the hidden neuron space.
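The closed RBFNN boundary described above can be sketched with two closely located Gaussian hidden units; the centers and threshold below are assumed values for illustration. Thresholding the summed Gaussian responses yields a closed decision boundary around the centers: points near either center fall inside, points far from both fall outside.

```python
import math

# Two Gaussian hidden units with nearby centers (illustrative values).
centers = [(0.0, 0.0), (1.0, 0.0)]

def rbf_output(x1, x2):
    # Sum of the hidden outputs phi_i(X) = exp(-||X - M_i||^2),
    # i.e., output weights w_1 = w_2 = 1 for simplicity.
    return sum(math.exp(-((x1 - c1) ** 2 + (x2 - c2) ** 2))
               for c1, c2 in centers)

def classify(x1, x2, threshold=0.5):
    # Class 1 inside the closed boundary, class 2 outside.
    return 1 if rbf_output(x1, x2) > threshold else 2

inside = classify(0.5, 0.0)    # between the two centers
outside = classify(5.0, 5.0)   # far from both centers
```

Because each Gaussian decays with distance from its center, the level set rbf_output = threshold is a closed curve, matching the observation that RBFNN boundaries are typically closed.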
5.4 Identical decision boundaries with different weights
Fig. 21 illustrates how neural networks with different weights can define the same decision boundary in the input space. It can be seen that the same decision boundary in the input space will be obtained even though we move the decision plane in the hidden neuron space (Fig. 21b). Although the neural network in Fig. 21a is a trivial one, the same phenomenon can occur for general neural networks, as shown in Fig. 22. It has been reported that different sets of weights can provide almost identical performance for a given problem [15], and these characteristics of the decision boundaries in the hidden neuron space may provide some theoretical background for that observation.

Fig. 21. Obtaining the same decision boundaries with different weights (SIGNN): the input space and the hidden neuron space.

Fig. 22. Another example of obtaining the same decision boundaries with different weights (SIGNN): the input space and the hidden neuron space.

6 Conclusions
In this paper, we investigated the decision boundaries of neural networks whose activation function is the sigmoid function, as well as radial basis function neural networks. We divided the classification mechanism of neural networks into two parts: expanding the dimension by the hidden neurons and drawing linear boundaries by the output neurons. In particular, we analyzed the decision boundaries in the hidden neuron space and found some interesting properties. First, the decision boundaries in the hidden neuron space are always linear boundaries, and these boundaries are not completely independent. Finally, we showed how the linear boundaries in the hidden neuron space can define complex decision boundaries in the input space with some interesting properties. The analysis of decision boundaries provides a way to reduce the complexity of neural networks and is helpful in weight initialization.

References:
[1] K. Fukushima and N. Wake, "Handwritten Alphanumeric Character Recognition by the Neocognitron," IEEE Trans. on Neural Networks, Vol. 2, No. 3, pp. 355-365, 1991.
[2] Jon Atli Benediktsson, Johannes R. Sveinsson, Okan K. Ersoy, and Philip H. Swain, "Parallel Consensual Neural Networks," IEEE Trans. Neural Networks, Vol. 8, No. 1, pp. 54-64, 1997.
[3] K. Lee, S. Cho, S. Ong, C. You, and D. Hong, "Equalization techniques using neural networks for digital versatile disk-read-only memory," Optical Engineering, Vol. 38, No. 2, pp. 56-6, 1999.
[4] Chulhee Lee and David A. Landgrebe, "Decision boundary feature extraction for neural networks," IEEE Trans. Neural Networks, Vol. 8, No. 1, pp. 75-83, 1997.
[5] G. J. Gibson, "A combinatorial approach to understanding perceptron capabilities," IEEE Trans. Neural Networks, Vol. 4, No. 6, pp. 989-992, 1993.
[6] B. Scholkopf, S. Mika, C. J. C. Burges, P. Knirsch, K. R. Muller, G. Ratsch, and A. J. Smola, "Input space versus feature space in kernel-based methods," IEEE Trans. Neural Networks, Vol. 10, No. 5, pp. 1000-1017, 1999.
[7] I. Sethi, "Entropy nets: from decision trees to neural networks," Proceedings of the IEEE, Vol. 78, No. 10, pp. 1605-1613, 1990.
[8] G. J. Gibson and C. F. N. Cowan, "On the decision regions of multilayer perceptrons," Proceedings of the IEEE, Vol. 78, No. 10, pp. 1590-1594, 1990.
[9] J. Makhoul, A. El-Jaroudi, and R. Schwartz, "Partitioning capabilities of two-layer neural networks," IEEE Trans. Signal Processing, Vol. 39, No. 6, pp. 1435-1440, 1991.
[10] K. L. Blackmore, R. C. Williamson, and I. Y. Mareels, "Decision region approximation by polynomials or neural networks," IEEE Trans. Information Theory, Vol. 43, No. 3, pp. 903-907, 1997.
[11] T. Nitta, "An analysis on decision boundaries in the complex back-propagation network," in Proc. IEEE World Congress on Computational Intelligence, pp. 934-939, 1994.
[12] S. Y. Kung and J. S. Taur, "Decision-based neural networks with signal/image classification applications," IEEE Trans. Neural Networks, Vol. 6, No. 1, pp. 170-181, 1995.
[13] S. K. Pal, S. Bandyopadhyay, and C. A. Murthy, "Genetic algorithms for generation of class boundaries," IEEE Trans. Systems, Man and Cybernetics, Part B, Vol. 28, No. 6, pp. 816-828, 1998.
[14] R. Lippmann, "An Introduction to Computing with Neural Nets," IEEE ASSP Magazine, Vol. 4, No. 2, pp. 4-22, April 1987.
[15] J. Go and C. Lee, "Analyzing weight distribution of neural networks," in Proc. IEEE IJCNN, 1999.