High Dimensional Kullback-Leibler divergence for grassland classification using satellite image time series with high spatial resolution


1 High Dimensional Kullback-Leibler divergence for grassland classification using satellite image time series with high spatial resolution
Presented by Maïlys Lopes (1), in collaboration with Mathieu Fauvel (1), Stéphane Girard (2) and David Sheeren (1).
(1) Dynafor, INRA, University of Toulouse, France
(2) Team Mistis, INRIA Grenoble, France

2 Study objectives
Agroecological application: grassland management practices.
Data: satellite image time series (SITS) with high spatial resolution (about 10 m) and high temporal resolution (2-3 images per month).
Method: supervised classification at the object scale.

3 Outline
- Context
- High Dimensional Divergence for measuring grassland similarity
- Experimental results
- Conclusion

4 Satellite remote sensing of grasslands
We propose to use dense multispectral time series with high spatial resolution to characterize grasslands. Grasslands in Europe are:
- relatively small (about 100 m x 100 m), hence the need for high spatial resolution images;
- heterogeneous, hence the need for multispectral images;
- subject to a natural cycle disturbed by human activities (mowing, grazing), hence the need for SITS.


7 Statistical problem
Learn $f$ such that $y_i = f(X_i)$, where $y_i$ is the predicted label and $X_i = [x_{i1} \ldots x_{in_i}]$ is the $(n_i \times d)$ matrix containing all the pixels inside $g_i$, with

$$x_{ik} = \big(x_{ik}(t_1), \ldots, x_{ik}(t_d)\big)^\top \in \mathbb{R}^d$$

where:
- $g_i$: grassland with index $i$,
- $n_i$: number of pixels in grassland $g_i$,
- $k$: pixel index, $k \in \{1, \ldots, n_i\}$,
- $d$: length of the time series,
- $x_{ik}(t_l)$: NDVI value of pixel $k$ at time $t_l$.
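For concreteness, a minimal sketch of this data layout (a hypothetical NumPy representation with synthetic values, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

d = 17     # length of the NDVI time series (17 acquisition dates)
n_i = 120  # number of pixels in grassland g_i (varies per parcel)

# X_i: one row per pixel, one column per date, i.e. the (n_i x d) matrix above.
X_i = rng.uniform(0.2, 0.9, size=(n_i, d))  # synthetic NDVI values

k, l = 3, 5
print(X_i[k, l])  # x_ik(t_l): NDVI of pixel k at date t_l
```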

8 Contributions
Thematic contributions:
- grassland management practices,
- Sentinel-2 contribution.
Methodological contributions:
- model the grassland distribution,
- perform grassland supervised classification at the parcel scale,
- be robust to the dimension of the data ($n_i$ pixels, $d$ temporal variables, with $n_i$ of the same order as $d$) and to the total number of grassland pixels, which might be large.

9 Section: High Dimensional Divergence for measuring grassland similarity

10 Statistical modelling of grasslands
For each grassland $g_i$, each pixel is modelled as a Gaussian random vector: $x_{ik} \sim \mathcal{N}(\mu_i, \Sigma_i)$.
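As a minimal sketch, the model parameters could be estimated per grassland with standard sample statistics (the slides do not spell out the estimators; these are assumed):

```python
import numpy as np

def gaussian_fit(X):
    """Sample mean and covariance of one grassland's pixels.
    X: (n_i, d) matrix of NDVI time series, one row per pixel."""
    mu = X.mean(axis=0)              # (d,) mean NDVI profile
    Sigma = np.cov(X, rowvar=False)  # (d, d) empirical covariance
    return mu, Sigma
```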

11 Measuring similarity between two grasslands
Measuring the similarity between $g_i$ and $g_j$ amounts to measuring the similarity between $\mathcal{N}(\mu_i, \Sigma_i)$ and $\mathcal{N}(\mu_j, \Sigma_j)$.
Symmetrized Kullback-Leibler divergence:

$$\mathrm{KLD}(g_i, g_j) = \frac{1}{2}\Big[\mathrm{Tr}\big(\Sigma_i^{-1}\Sigma_j + \Sigma_j^{-1}\Sigma_i\big) + (\mu_i - \mu_j)^\top\big(\Sigma_i^{-1} + \Sigma_j^{-1}\big)(\mu_i - \mu_j)\Big] - d$$
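A direct transcription of this formula as a sketch (it assumes the empirical covariances are invertible, which, as the next slide shows, often fails in practice):

```python
import numpy as np

def symmetrized_kld(mu_i, Sigma_i, mu_j, Sigma_j):
    """Symmetrized KL divergence between N(mu_i, Sigma_i) and N(mu_j, Sigma_j)."""
    d = mu_i.shape[0]
    Si_inv = np.linalg.inv(Sigma_i)
    Sj_inv = np.linalg.inv(Sigma_j)
    dm = mu_i - mu_j
    trace_term = np.trace(Si_inv @ Sigma_j + Sj_inv @ Sigma_i)
    mean_term = dm @ (Si_inv + Sj_inv) @ dm
    return 0.5 * (trace_term + mean_term) - d
```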

12 Measuring similarity between two grasslands
The number of pixels in a grassland is usually lower than the number of parameters to estimate!
[Figure: histogram of grassland size in number of pixels $n_i$; a red line marks the number of parameters to estimate for each grassland under the multivariate Gaussian model, derived from the number of variables via $d(d+3)/2 = 170$ for $d = 17$.]

13 High dimensional model
The High Dimensional Discriminant Analysis (HDDA) model (1) is used: it assumes that the last eigenvalues of the covariance matrix are equal,

$$\Sigma_i = P_i D_i P_i^\top, \qquad D_i = \mathrm{diag}\big(\lambda_{i1}, \ldots, \lambda_{ip_i}, \underbrace{\lambda_i, \ldots, \lambda_i}_{d - p_i}\big)$$

where:
- $p_i$ is the number of non-equal eigenvalues,
- $\lambda_i$ is the multiple eigenvalue corresponding to the noise term (the last, equal eigenvalues),
- $\lambda_{ij} \geq \lambda_i$ for $j = 1, \ldots, p_i$.

(1) C. Bouveyron, S. Girard and C. Schmid, "High-dimensional discriminant analysis", Communications in Statistics - Theory and Methods, vol. 36, no. 14, 2007.

14 High dimensional model
Equivalently, $\Sigma_i = Q_i \Lambda_i Q_i^\top + \lambda_i I_d$, where:
- $I_d$ is the identity matrix of size $d$,
- $Q_i = [q_{i1}, \ldots, q_{ip_i}]$,
- $\Lambda_i = \mathrm{diag}\big(\lambda_{i1} - \lambda_i, \ldots, \lambda_{ip_i} - \lambda_i\big)$.

Following this model, $\Sigma_i^{-1}$ can be computed explicitly:

$$\Sigma_i^{-1} = \lambda_i^{-1} I_d - Q_i V_i Q_i^\top, \qquad V_i = \mathrm{diag}\Big(\frac{1}{\lambda_i} - \frac{1}{\lambda_{i1}}, \ldots, \frac{1}{\lambda_i} - \frac{1}{\lambda_{ip_i}}\Big)$$
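A sketch of how these quantities could be computed from an eigendecomposition of the empirical covariance (hdda_params is a hypothetical helper; the data-driven choice of p appears on the backup slides):

```python
import numpy as np

def hdda_params(Sigma_hat, p):
    """HDDA parametrization: Sigma ~= Q diag(Lam) Q^T + lam * I and
    inv(Sigma) ~= (1/lam) * I - Q diag(V) Q^T."""
    evals, evecs = np.linalg.eigh(Sigma_hat)    # ascending eigenvalues
    evals, evecs = evals[::-1], evecs[:, ::-1]  # sort descending
    lam = evals[p:].mean()           # common noise eigenvalue (last d - p)
    Q = evecs[:, :p]                 # first p eigenvectors
    Lam = evals[:p] - lam            # diagonal of Lambda_i
    V = 1.0 / lam - 1.0 / evals[:p]  # diagonal of V_i
    return Q, Lam, V, lam
```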

15 High Dimensional Symmetrized KLD
To compute HDKLD, only the first $p_i$ eigenvalues/eigenvectors are required:
- the number of parameters to estimate is reduced,
- the unstable estimation of the eigenvectors associated with small eigenvalues is avoided.

$$\begin{aligned}
\mathrm{HDKLD}(g_i, g_j) = \frac{1}{2}\Big[\, &\lambda_i^{-1}\mathrm{Tr}(\Lambda_j) - \lambda_j \mathrm{Tr}(V_i) + \lambda_j^{-1}\mathrm{Tr}(\Lambda_i) - \lambda_i \mathrm{Tr}(V_j) \\
&- \big\|\Lambda_j^{1/2} Q_j^\top Q_i V_i^{1/2}\big\|_F^2 - \big\|\Lambda_i^{1/2} Q_i^\top Q_j V_j^{1/2}\big\|_F^2 \\
&- \big\|V_i^{1/2} Q_i^\top (\mu_i - \mu_j)\big\|^2 - \big\|V_j^{1/2} Q_j^\top (\mu_i - \mu_j)\big\|^2 \\
&+ \frac{\lambda_i + \lambda_j}{\lambda_i \lambda_j}\,\big\|\mu_i - \mu_j\big\|^2 + \frac{\lambda_i^2 + \lambda_j^2}{\lambda_i \lambda_j}\, d \,\Big] - d
\end{aligned}$$

where $\|L\|_F^2 = \mathrm{Tr}(L^\top L)$ is the Frobenius norm.
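A sketch on top of the hypothetical hdda_params helper above; the term grouping follows the reconstructed formula:

```python
import numpy as np

def hdkld(mu_i, Pi, mu_j, Pj):
    """High-dimensional symmetrized KLD between two grasslands.
    Pi, Pj: (Q, Lam, V, lam) tuples as returned by hdda_params."""
    Qi, Lami, Vi, lami = Pi
    Qj, Lamj, Vj, lamj = Pj
    d = mu_i.shape[0]
    dm = mu_i - mu_j
    # Frobenius cross terms ||Lam_j^(1/2) Qj^T Qi Vi^(1/2)||_F^2 (and symmetric)
    cross_ij = np.linalg.norm(np.sqrt(Lamj)[:, None] * (Qj.T @ Qi) * np.sqrt(Vi)) ** 2
    cross_ji = np.linalg.norm(np.sqrt(Lami)[:, None] * (Qi.T @ Qj) * np.sqrt(Vj)) ** 2
    # projected mean terms ||Vi^(1/2) Qi^T (mu_i - mu_j)||^2 (and symmetric)
    mean_i = np.sum(Vi * (Qi.T @ dm) ** 2)
    mean_j = np.sum(Vj * (Qj.T @ dm) ** 2)
    val = (Lamj.sum() / lami - lamj * Vi.sum()
           + Lami.sum() / lamj - lami * Vj.sum()
           - cross_ij - cross_ji - mean_i - mean_j
           + (lami + lamj) / (lami * lamj) * (dm @ dm)
           + (lami ** 2 + lamj ** 2) / (lami * lamj) * d)
    return 0.5 * val - d
```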

16 Construction of a positive definite kernel
HDKLD is used to build a positive definite kernel function, which can be plugged into any kernel method, such as SVM. Note that (HD)KLD is a semi-metric.

$$K(g_i, g_j) = \exp\Big(-\frac{\mathrm{(HD)KLD}(g_i, g_j)}{\sigma^2}\Big), \qquad \sigma \in \mathbb{R}_+$$
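A sketch of how such a precomputed Gram matrix could feed scikit-learn's SVM (sigma and C are placeholders; the slides tune them by cross-validation):

```python
import numpy as np
from sklearn.svm import SVC

def gram_matrix(grasslands, sigma):
    """K[i, j] = exp(-HDKLD(g_i, g_j) / sigma**2).
    grasslands: list of (mu, (Q, Lam, V, lam)) pairs; uses hdkld above."""
    n = len(grasslands)
    K = np.eye(n)  # hdkld(g, g) = 0, so the diagonal is exp(0) = 1
    for i in range(n):
        for j in range(i + 1, n):
            mu_i, Pi = grasslands[i]
            mu_j, Pj = grasslands[j]
            K[i, j] = K[j, i] = np.exp(-hdkld(mu_i, Pi, mu_j, Pj) / sigma ** 2)
    return K

# Usage sketch:
# clf = SVC(kernel="precomputed", C=10).fit(gram_matrix(train, 2**10), y_train)
```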

17 Section: Experimental results

18 Data
Study site and field data: 52 parcels with 3 management practices (field survey, 2015).

Class                       Nb of grasslands
Mowing                      34
Grazing                     10
Mixed (mowing & grazing)     8

Satellite data: Formosat-2
- spatial resolution: 8 m,
- 4 spectral bands (B, G, R, NIR),
- $d = 17$ images from Feb. 16 to Dec. 20, 2013.

19 Classification methods
Comparison with other methods:

Method           p-SVM        µ-SVM    KLD-SVM        HDKLD-SVM
Scale            Pixel        Object   Object         Object
Expl. variable   x_ik         µ_i      N(µ_i, Σ_i)    N(µ_i, Σ_i)
Kernel           RBF          RBF      K(g_i, g_j)    K(g_i, g_j)
Nb of samples    8741 pixels  52       52             52

For p-SVM, parcel labels are obtained from the pixel predictions by a majority voting rule (see the sketch below). Hyperparameters were optimized by cross-validation.
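A minimal sketch of that voting step (assumed aggregation; the slide only names the rule):

```python
import numpy as np

def majority_vote(pixel_labels):
    """Label a parcel with the most frequent predicted label of its pixels."""
    values, counts = np.unique(np.asarray(pixel_labels), return_counts=True)
    return values[np.argmax(counts)]
```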

20 Processing chain
Leave-One-Out Cross-Validation is used for error estimation.
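A minimal LOOCV sketch with a precomputed kernel matrix (scikit-learn; K and y are assumed to come from the steps above):

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.svm import SVC

def loocv_accuracy(K, y, C=10):
    """Leave-one-out accuracy with a precomputed (n, n) kernel matrix K."""
    y = np.asarray(y)
    correct = 0
    for train, test in LeaveOneOut().split(y):
        clf = SVC(kernel="precomputed", C=C)
        clf.fit(K[np.ix_(train, train)], y[train])
        correct += int(clf.predict(K[np.ix_(test, train)])[0] == y[test[0]])
    return correct / len(y)
```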

21 Results
HDKLD outperforms KLD, but the results are not significantly different from those of the other methods.

[Table: classification accuracy (OA, kappa, per-class REF/PRED scores) for p-SVM, µ-SVM, KLD-SVM and HDKLD-SVM; the numeric values did not survive transcription.]

Test of significance of the observed differences*: pairwise Z-scores between p-SVM, µ-SVM, KLD-SVM and HDKLD-SVM, with

$$Z = \frac{|\hat{\kappa}_m - \hat{\kappa}_n|}{\sqrt{\widehat{\mathrm{var}}(\hat{\kappa}_m) + \widehat{\mathrm{var}}(\hat{\kappa}_n)}}$$

(*from Congalton and Green, Assessing the Accuracy of Remotely Sensed Data: Principles and Practices, 2009)
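A sketch of the pairwise test, assuming the kappa estimates and their variances are already computed:

```python
import math

def kappa_z_score(kappa_m, var_m, kappa_n, var_n):
    """Z-statistic for the difference of two kappa coefficients
    (Congalton and Green, 2009); |Z| > 1.96 means the difference is
    significant at the 5% level."""
    return abs(kappa_m - kappa_n) / math.sqrt(var_m + var_n)
```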

22 Results
The proposed model:
- enables a proper grassland modelling at the parcel scale,
- reduces the number of elements to be processed by the SVM: G = 52 grasslands for HDKLD-SVM versus N = 8741 pixels for p-SVM.

23 Section: Conclusion

24 Conclusion
- Gaussian modelling appears well suited to the distribution of grassland pixels.
- HDKLD is efficient for measuring grassland proximity and robust to the dimension of the data.
- The best results are not significantly different across methods.
- HDKLD performs better than KLD.

25 Prospects
- The method will be tested on a larger dataset.
- The method will be extended to multispectral data.
- The method will be used for unsupervised classification.

Thank you for your attention.


27 Estimation of the HDKLD parameters (formula as on slide 15):
- $\hat{\lambda}_{ij}$ and $\hat{q}_{ij}$, $j \in \{1, \ldots, \hat{p}_i\}$, are the first eigenvalues/eigenvectors of $\hat{\Sigma}_i$;
- $\hat{p}_i$ is the number of eigenvalues needed to reach a given percentage of variance $t$, i.e. the smallest $p$ such that $\sum_{j=1}^{p} \hat{\lambda}_{ij} \big/ \mathrm{Tr}(\hat{\Sigma}_i) \geq t$, with $t$ a user-defined parameter;
- $\hat{\lambda}_i = \Big(\mathrm{Tr}(\hat{\Sigma}_i) - \sum_{j=1}^{\hat{p}_i} \hat{\lambda}_{ij}\Big) \big/ (d - \hat{p}_i)$.
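A sketch of this selection rule, complementing the hypothetical hdda_params helper above:

```python
import numpy as np

def select_p(Sigma_hat, t=0.95):
    """Smallest p such that the first p eigenvalues account for at least
    a fraction t of the total variance Tr(Sigma_hat)."""
    evals = np.linalg.eigvalsh(Sigma_hat)[::-1]  # descending eigenvalues
    ratios = np.cumsum(evals) / evals.sum()      # evals.sum() equals the trace
    return int(np.searchsorted(ratios, t)) + 1
```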

28 Hyperparameters were optimized during cross-validation over the following search grid:

Parameter   Methods                       Grid
σ           p-SVM, µ-SVM (RBF kernels)    {2^-5, 2^-4, ..., 2^5}
σ           KLD-SVM, HDKLD-SVM            {2^8, 2^9, ..., 2^12}
C           all methods                   {1, 10, 100}
t           HDKLD-SVM                     {0.80, 0.85, 0.90, 0.95, 0.99}

Bhattacharyya distance (for reference):

$$D_B(g_i, g_j) = \frac{1}{8}(\mu_i - \mu_j)^\top \bar{\Sigma}^{-1} (\mu_i - \mu_j) + \frac{1}{2} \ln\frac{\det(\bar{\Sigma})}{\sqrt{\det(\Sigma_i)\det(\Sigma_j)}}, \qquad \bar{\Sigma} = \frac{\Sigma_i + \Sigma_j}{2}$$
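A sketch of this baseline distance between two Gaussian models:

```python
import numpy as np

def bhattacharyya(mu_i, Sigma_i, mu_j, Sigma_j):
    """Bhattacharyya distance between N(mu_i, Sigma_i) and N(mu_j, Sigma_j)."""
    Sigma = 0.5 * (Sigma_i + Sigma_j)
    dm = mu_i - mu_j
    term1 = dm @ np.linalg.solve(Sigma, dm) / 8.0
    # log-determinants via slogdet for numerical stability
    _, ld_bar = np.linalg.slogdet(Sigma)
    _, ld_i = np.linalg.slogdet(Sigma_i)
    _, ld_j = np.linalg.slogdet(Sigma_j)
    return term1 + 0.5 * (ld_bar - 0.5 * (ld_i + ld_j))
```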
