The Application of Extreme Learning Machine based on Gaussian Kernel in Image Classification



1 The Application of Extreme Learning Machine based on Gaussian Kernel in Image Classification Weijie LI, Yi LIN Postgraduate student in College of Survey and Geo-Informatics, Tongji University

2 Outline: ELM theory; kernel ELM (K-ELM); classification application; conclusion and outlook

3 Study background In our environmental monitoring research, machine learning proved well suited to remote sensing image classification. K-ELM achieved higher classification accuracy than the SVM and ELM classification algorithms. For small study areas, the Gaussian kernel ELM is noticeably more effective and identifies ground features more realistically.

4 SLFN theory Single hidden Layer Feedforward Neural Network (SLFN)
[Diagram: the inputs $x_1, x_2, \dots, x_n$ feed the hidden layer through the weights ω and biases b; the hidden layer feeds the outputs $y_1, y_2, \dots, y_m$ through the weights β.]
Input layer: n neurons, corresponding to the n input features.
Hidden layer: l neurons.
Output layer: m neurons, corresponding to the m output labels.
For N arbitrary samples $(x_i, y_i)$: the input is $x_i = [x_{i1}, x_{i2}, \dots, x_{in}]^T$ and the label (output result) is $y_i = [y_{i1}, y_{i2}, \dots, y_{im}]^T$.
Input layer to hidden layer weight: ω. Hidden layer to output layer weight: β.
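The layer sizes above can be made concrete with a small forward pass. This is only a sketch: the sigmoid activation, the NumPy dependency, the function name `slfn_forward`, and the sizes N = 5, n = 7, l = 10, m = 3 are assumptions chosen for illustration, not values from the slides.

```python
import numpy as np

def slfn_forward(X, W, b, beta):
    """Forward pass of a single-hidden-layer network.

    X: (N, n) inputs, W: (n, l) input-to-hidden weights,
    b: (l,) hidden biases, beta: (l, m) hidden-to-output weights.
    Returns the (N, m) output matrix.
    """
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))  # sigmoid hidden activations, (N, l)
    return H @ beta

rng = np.random.default_rng(0)
N, n, l, m = 5, 7, 10, 3           # illustrative sizes only
X = rng.normal(size=(N, n))
W = rng.normal(size=(n, l))
b = rng.normal(size=l)
beta = rng.normal(size=(l, m))
Y_hat = slfn_forward(X, W, b, beta)
print(Y_hat.shape)
```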

5 SLFN theory An SLFN trained by gradient descent has two aspects to improve. Training is slow: gradient descent needs many iterations to correct the weights and thresholds, so the training process takes a long time. Training also falls easily into a local minimum and may fail to reach the global minimum. Professor Huang therefore studied the SLFN in depth and proposed the ELM theory.

6 SLFN theory SLFN to ELM If the activation function g(ωx + b) is infinitely differentiable in any interval of R, then ω and b can be randomly generated from any continuous probability distribution in any interval of R-space. Compared with the SLFN, there is no need to adjust ω and b, so the only undetermined parameter of the whole network is the output weight β. This is how the extreme learning machine came into being:
$H_{N \times l}\, \beta_{l \times m} = Y_{N \times m}$ (1)
H is the hidden layer output matrix of the training set, Y is the target matrix of the training set, and β is the weight from the hidden layer to the output layer.

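7 ELM theory ELM model: $H_{N \times l}\, \beta_{l \times m} = Y_{N \times m}$, where
$H = \begin{bmatrix} g(\omega_1 x_1 + b_1) & g(\omega_2 x_1 + b_2) & \cdots & g(\omega_l x_1 + b_l) \\ g(\omega_1 x_2 + b_1) & g(\omega_2 x_2 + b_2) & \cdots & g(\omega_l x_2 + b_l) \\ \vdots & \vdots & & \vdots \\ g(\omega_1 x_N + b_1) & g(\omega_2 x_N + b_2) & \cdots & g(\omega_l x_N + b_l) \end{bmatrix}_{N \times l}, \quad \beta = \begin{bmatrix} \beta_1^T \\ \vdots \\ \beta_l^T \end{bmatrix}_{l \times m}, \quad Y = \begin{bmatrix} y_1^T \\ \vdots \\ y_N^T \end{bmatrix}_{N \times m}$ (2)
g: the activation function; ω: the weight from the input layer to the hidden layer; b: the bias; x: the input; y: the data label.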

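8 ELM theory Solution of β: When L = N (L is the number of hidden nodes), H is square and
$H_{N \times l}\, \beta_{l \times m} = Y_{N \times m} \;\Rightarrow\; \hat{\beta} = H^{-1} Y$ (3)
Otherwise the H matrix is ill-conditioned, and β has to be solved according to the minimum norm criterion:
$\min \lVert H(\omega_i, x_i, b_i)\, \beta_i - y_i \rVert$ (4)
$\hat{\beta} = \arg\min_{\beta} \lVert H\beta - Y \rVert_F$ (5)
$\hat{\beta} = H^{\dagger} Y$ (6)
$H^{\dagger}$: the Moore-Penrose generalized inverse of the hidden layer output matrix.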
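The training procedure described so far (random ω and b, then a pseudo-inverse solve for β) can be sketched in a few lines. The sigmoid activation, the normal distribution for the random weights, the random toy data, and the names `hidden_output`, `elm_fit`, and `elm_predict` are assumptions made for this illustration.

```python
import numpy as np

def hidden_output(X, W, b):
    """Hidden layer output matrix H with a sigmoid activation (assumed)."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def elm_fit(X, Y, n_hidden, rng):
    """Draw W and b at random, then solve for beta with the pseudo-inverse."""
    W = rng.normal(size=(X.shape[1], n_hidden))  # never adjusted after this
    b = rng.normal(size=n_hidden)
    H = hidden_output(X, W, b)
    beta = np.linalg.pinv(H) @ Y                 # Moore-Penrose solution
    return W, b, beta

def elm_predict(X, W, b, beta):
    return hidden_output(X, W, b) @ beta

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))   # toy data: N = 20 samples, n = 4 features
Y = rng.normal(size=(20, 2))   # m = 2 outputs
W, b, beta = elm_fit(X, Y, n_hidden=5, rng=rng)
H = hidden_output(X, W, b)
print(beta.shape, np.linalg.norm(H @ beta - Y))
```

Because only β is fitted, training reduces to one linear least-squares solve with no iterative weight updates.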

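9 ELM theory To increase the stability and generalization ability of the ELM, a regularization parameter C can be added. The model is as follows:
$L(X, Y; \beta, C) = \lVert H\beta - Y \rVert^2 + \frac{1}{C} \lVert \beta \rVert^2$ (7)
The weight matrix β is then estimated as:
$\beta = H^T \left( \frac{I}{C} + H H^T \right)^{-1} Y$, for N < L (8)
$\beta = \left( \frac{I}{C} + H^T H \right)^{-1} H^T Y$, for N >> L (9)
Kernel function: the output function used for the kernel form is
$f(x) = h(x)\, H^T \left( \frac{I}{C} + H H^T \right)^{-1} Y$ (10)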
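The two closed forms for the regularized β solve linear systems of different sizes (N x N versus L x L) but agree numerically, which is why the cheaper one can be picked depending on whether N or L is smaller. A minimal check, assuming NumPy and a random matrix standing in for H; the function names are hypothetical.

```python
import numpy as np

def beta_small_n(H, Y, C):
    """beta = H^T (I/C + H H^T)^{-1} Y, an N x N solve."""
    N = H.shape[0]
    return H.T @ np.linalg.solve(np.eye(N) / C + H @ H.T, Y)

def beta_small_l(H, Y, C):
    """beta = (I/C + H^T H)^{-1} H^T Y, an L x L solve."""
    L = H.shape[1]
    return np.linalg.solve(np.eye(L) / C + H.T @ H, H.T @ Y)

rng = np.random.default_rng(1)
H = rng.normal(size=(30, 8))   # stand-in hidden output matrix, N = 30, L = 8
Y = rng.normal(size=(30, 3))
beta_a = beta_small_n(H, Y, C=10.0)
beta_b = beta_small_l(H, Y, C=10.0)
print(np.max(np.abs(beta_a - beta_b)))
```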

10 Kernel function A kernel function maps data from a low-dimensional space to a high-dimensional one, while letting the scalar product in the high-dimensional space be computed in the low-dimensional space:
$K(x, x') = \varphi(x) \cdot \varphi(x')$ (11)
The classes can be separated more easily in a higher-dimensional space.
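A small worked case may help here. For two-dimensional inputs, the degree-2 polynomial kernel $(x \cdot x')^2$ equals the scalar product of the explicit three-dimensional features $(x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$. The polynomial kernel and the sample points are chosen only for illustration; the slides do not use this example.

```python
import math

def phi(x):
    """Explicit feature map for the degree-2 polynomial kernel in 2-D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poly2_kernel(x, xp):
    """Same scalar product, computed in the original 2-D space."""
    return dot(x, xp) ** 2

x, xp = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(xp))   # computed in the 3-D feature space
implicit = poly2_kernel(x, xp)    # computed in the 2-D input space
print(explicit, implicit)
```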

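11 Kernel function Kernel ELM suits the case where the number of training samples is not huge. If the feature mapping h(x) is unknown to users, the dimensionality L of the feature space (number of hidden nodes) need not be given either. Instead, its corresponding kernel K(u, v) is given to users:
$f(x) = h(x)\, H^T \left( \frac{I}{C} + H H^T \right)^{-1} Y$ (12)
$f(x) = \begin{bmatrix} K(x, x_1) \\ \vdots \\ K(x, x_N) \end{bmatrix}^T \left( \frac{I}{C} + \Omega_{ELM} \right)^{-1} Y$ (13)
where $\Omega_{ELM} = H H^T$, with entries $\Omega_{ELM}(i, j) = h(x_i) \cdot h(x_j) = K(x_i, x_j)$.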
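The equivalence between the explicit-feature form and the kernel form can be checked numerically. The sketch below assumes NumPy and deliberately uses a linear kernel with h(x) = x, because for that choice both forms can be computed directly; `kelm_fit`, `kelm_predict`, and `linear_kernel` are hypothetical names.

```python
import numpy as np

def kelm_fit(X, Y, kernel, C):
    """Solve (I/C + Omega) alpha = Y, with Omega_ij = K(x_i, x_j)."""
    Omega = kernel(X, X)
    return np.linalg.solve(np.eye(X.shape[0]) / C + Omega, Y)

def kelm_predict(X_new, X_train, alpha, kernel):
    """f(x) = [K(x, x_1), ..., K(x, x_N)] alpha."""
    return kernel(X_new, X_train) @ alpha

def linear_kernel(A, B):
    return A @ B.T

rng = np.random.default_rng(2)
X = rng.normal(size=(25, 4))
Y = rng.normal(size=(25, 2))
X_new = rng.normal(size=(6, 4))
C = 5.0

alpha = kelm_fit(X, Y, linear_kernel, C)
f_kernel = kelm_predict(X_new, X, alpha, linear_kernel)

# Explicit form with h(x) = x, so that H = X.
beta = X.T @ np.linalg.solve(np.eye(25) / C + X @ X.T, Y)
f_explicit = X_new @ beta
print(np.max(np.abs(f_kernel - f_explicit)))
```

With a kernel such as the Gaussian, only the kernel form is available, since the corresponding h(x) is never written out.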

12 Kernel function Other forms of kernel function. A powerful way to construct new kernel functions is to use simple kernel functions as basic modules. Given legal kernel functions $k_1(x, x')$ and $k_2(x, x')$, the following new kernel functions are also legal:
$k(x, x') = \varphi(x) \cdot \varphi(x')$
$k(x, x') = c\, k_1(x, x')$, for a constant c > 0
$k(x, x') = k_1(x, x') + k_2(x, x')$
$k(x, x') = k_1(x, x')\, k_2(x, x')$ (14)
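Closure rules of this kind, for example scaling a kernel by a positive constant, adding two kernels, or multiplying them pointwise, can be checked numerically: a legal kernel yields a positive semidefinite Gram matrix on any sample. The sketch assumes NumPy, a linear kernel for `k1`, a Gaussian kernel for `k2`, and random sample points; a passing check on one sample is evidence, not a proof.

```python
import numpy as np

def gram(kernel, X):
    """Gram matrix K_ij = kernel(x_i, x_j)."""
    return np.array([[kernel(a, b) for b in X] for a in X])

def k1(a, b):
    return float(a @ b)                                 # linear kernel

def k2(a, b):
    return float(np.exp(-np.sum((a - b) ** 2) / 2.0))   # Gaussian, sigma = 1

def min_eig(K):
    """Smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(K).min())

rng = np.random.default_rng(3)
X = rng.normal(size=(12, 3))
K1, K2 = gram(k1, X), gram(k2, X)

candidates = {"c * k1": 3.0 * K1, "k1 + k2": K1 + K2, "k1 * k2": K1 * K2}
min_eigs = {name: min_eig(K) for name, K in candidates.items()}
for name, value in min_eigs.items():
    print(name, value)   # each should be non-negative up to round-off
```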

13 Gaussian kernel function The Gaussian kernel is one kind of kernel function. It is usually defined as a function of the Euclidean distance between any point x in space and a certain center $x_c$. Its effect is often local: the function takes a small value when x is far from $x_c$.
$k(\lVert x - x_c \rVert) = \exp\left( -\frac{\lVert x - x_c \rVert^2}{2\sigma^2} \right)$ (15)
$x_c$: the kernel function center; σ: the width parameter, which controls the local range of action. σ is determined by cross validation.
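The local behaviour and the role of σ are easy to see numerically. A minimal sketch using only the standard library; the sample points and σ values are arbitrary illustrations.

```python
import math

def gaussian_kernel(x, xc, sigma):
    """exp(-||x - xc||^2 / (2 sigma^2)) for points given as tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xc))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

xc = (0.0, 0.0)
near = gaussian_kernel((0.1, 0.0), xc, sigma=1.0)   # close to the center
far = gaussian_kernel((5.0, 0.0), xc, sigma=1.0)    # far from the center
wide = gaussian_kernel((5.0, 0.0), xc, sigma=10.0)  # same point, wider kernel
print(near, far, wide)
```

A small σ makes the kernel respond only to nearby points, while a large σ lets distant points contribute, which is why σ has to be tuned to the data.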

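14 Gaussian kernel function An extreme learning machine model with a Gaussian kernel function can be expressed as
$f(x) = h(x)\, H^T \left( \frac{I}{C} + H H^T \right)^{-1} Y = \begin{bmatrix} K(x, x_1) \\ \vdots \\ K(x, x_N) \end{bmatrix}^T \left( \frac{I}{C} + \Omega_{ELM} \right)^{-1} Y$ (16)
$k(x, x') = k_{Gauss}(x, x')$ (17)
$G(x, x') = \exp\left( -\frac{\lVert x - x' \rVert^2}{2\sigma^2} \right)$ (18)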
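Putting the pieces together, a Gaussian kernel ELM classifier can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: it uses NumPy, one-hot targets with an arg-max decision, synthetic two-ring data instead of the Landsat-8 image, and hand-picked values σ = 1 and C = 100. All function names are hypothetical.

```python
import numpy as np

def gaussian_gram(A, B, sigma):
    """Matrix of exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def fit(X, labels, n_classes, sigma, C):
    """Solve (I/C + Omega) alpha = Y with one-hot targets Y."""
    Y = np.eye(n_classes)[labels]
    Omega = gaussian_gram(X, X, sigma)
    return np.linalg.solve(np.eye(len(X)) / C + Omega, Y)

def predict(X_new, X_train, alpha, sigma):
    """Pick the class with the largest output score."""
    scores = gaussian_gram(X_new, X_train, sigma) @ alpha
    return scores.argmax(axis=1)

rng = np.random.default_rng(0)

def rings(n):
    """Synthetic data: class 0 on a ring of radius 1, class 1 on radius 3."""
    labels = np.arange(n) % 2
    radius = np.where(labels == 0, 1.0, 3.0)
    angle = rng.uniform(0.0, 2.0 * np.pi, n)
    X = np.c_[radius * np.cos(angle), radius * np.sin(angle)]
    return X + rng.normal(scale=0.1, size=(n, 2)), labels

X_tr, y_tr = rings(200)
X_te, y_te = rings(100)
alpha = fit(X_tr, y_tr, n_classes=2, sigma=1.0, C=100.0)
acc = float((predict(X_te, X_tr, alpha, sigma=1.0) == y_te).mean())
print(acc)
```

In practice σ and C would be chosen by cross validation rather than fixed by hand.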

15 Classification application Study area Satellite: Landsat-8. Band number: 7. Study image: 554 × 591 pixels. Image resolution: 30 m × 30 m. Study region: Chao Lake, China. Figure 1: Original image


16 Classification application Comparison of classification results Figure 3: ELM classification result. Figure 4: Gaussian kernel ELM classification result (σ = ). Legend: water, bare land, building, forest, arable land.

17 Classification application Classification accuracy comparison
Table 1: ELM classification accuracy. Rows and columns: building, forest, arable land, bare land, water. Accuracy = , kappa = .
Table 2: Gaussian kernel ELM classification accuracy. Rows and columns: building, forest, arable land, bare land, water. Accuracy = , kappa = .
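The two summary figures reported under each table, overall accuracy and the kappa coefficient, are both derived from a class-by-class confusion matrix. The sketch below shows the computation with the standard Cohen's kappa formula, which is assumed here to be the kappa the slides refer to. The 2 x 2 matrix is made up purely for illustration and is not the study's data.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

def cohen_kappa(cm):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    total = sum(sum(row) for row in cm)
    p_o = overall_accuracy(cm)
    p_e = sum(
        sum(cm[i]) * sum(row[i] for row in cm)   # row total * column total
        for i in range(len(cm))
    ) / total ** 2
    return (p_o - p_e) / (1.0 - p_e)

# Hypothetical confusion matrix (rows: reference class, columns: predicted).
cm = [[40, 10],
      [5, 45]]
acc = overall_accuracy(cm)
kappa = cohen_kappa(cm)
print(acc, kappa)
```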


18 Classification application The Gaussian kernel function is suitable for small regions, because its effect is often local. The Gaussian kernel ELM classifier gives clearly better results when classifying small-patch regions. Mode select Figure 5: Original image. Figure 6: Standard ELM result. Figure 7: Gaussian kernel ELM result.

19 Conclusion and outlook Conclusion: Gaussian kernel ELM (Gauss-K-ELM) makes the classification system more stable: once regularization is added, the classification result does not change much. The Gaussian kernel function can improve classification accuracy, and the effect is more significant for small-area research objects. Outlook: The Gaussian kernel function runs too slowly. We are trying other kernel functions for the study area, such as the polynomial kernel function and mixed kernel functions. Because the label is an important factor in classification, the next study of the kernel function will start by constructing different feature spaces.

20 Thank you!


CIS 520: Machine Learning Oct 09, 207 Kernel Methods Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture They may or may not cover all the material discussed