Analyzing dynamic ensemble selection techniques using dissimilarity analysis

1 Analyzing dynamic ensemble selection techniques using dissimilarity analysis. George D. C. Cavalcanti, Centro de Informática, Universidade Federal de Pernambuco (UFPE), Brazil. gdcc@cin.ufpe.br

2 Overview: 1. Introduction; 2. Objective; 3. Meta-Learning for Dynamic Ensemble Selection; 4. Experimental Study; 5. Conclusion.

3 Introduction. There is no clear guideline to choose a good learning method.

4 Introduction. There is no clear guideline to choose a good learning method. Selecting the best current classifier can lead to the choice of the worst classifier for future data.

5 Introduction. There is no clear guideline to choose a good learning method. Selecting the best current classifier can lead to the choice of the worst classifier for future data. No free lunch theorem: no dominant classifier exists for all data distributions, and the data distribution of the task at hand is usually unknown.

6 Combination of Classifiers (Introduction). Figure: a query x_q is presented to a pool of classifiers L_1, L_2, ..., L_m, whose outputs are merged by a combiner to produce the decision. Combination of classifiers consists of combining the opinions of an ensemble of classifiers in the hope that the new opinion will be better than the individual ones. Vox populi, vox Dei.
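
To make the fusion idea concrete, here is a minimal sketch (not from the slides) of a majority-vote combiner over a pool of already-fitted classifiers; `pool` and `X` are assumed placeholders, and labels are assumed to be non-negative integers.

```python
import numpy as np

def majority_vote(pool, X):
    """Combine a pool of fitted classifiers by plurality voting.
    `pool` is a list of fitted scikit-learn-style estimators, `X` a 2-D array."""
    votes = np.array([clf.predict(X) for clf in pool])   # shape: (n_classifiers, n_samples)
    # For each sample, pick the label that received the most votes.
    return np.array([np.bincount(col.astype(int)).argmax() for col in votes.T])
```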

7 Combination of Classifiers (Introduction). Is it worth combining classifiers? Three reasons to combine classifiers (following Dietterich): Statistical, Computational, Representational.

8 Statistical (or worst case) motivation (Introduction). Two options:

9 Statistical (or worst case) motivation (Introduction). Two options: 1. Pick any classifier: risk of making a bad choice.

10 Statistical (or worst case) motivation (Introduction). Two options: 1. Pick any classifier: risk of making a bad choice. 2. Average them: no guarantee of performing better than the single best classifier, but probably little interference from bad classifiers. Averaging several classifiers avoids the worst one.

11 Computational motivation (Introduction). Different algorithms lead to different local optima: training algorithms such as hill-climbing and random search leave each classifier D_i at a different point of the error surface, close to the optimal classifier D. Aggregation may lead to a classifier that is a better approximation of D than any single D_i.

12 Representational (or best case) motivation (Introduction). The optimal classifier D might not be in the classifier space. Example: the classifier space contains only linear classifiers; even so, an ensemble of linear classifiers can approximate any decision boundary with any predefined accuracy. The best classifier may lie outside the classifier space.

13 Multiple Classifier System: competition results. Netflix Prize; KDD Cup; Gödel Prize 2003: the AdaBoost algorithm; ImageNet.

14 Multiple Classifier System: Fusion versus Selection. Figure: a query x_q, a pool of classifiers L_1, ..., L_m, a combiner, and the final decision.

15-16 Multiple Classifier System: two main phases, Generation and Combination. Figure: the pool of classifiers L_1, ..., L_m feeds the combiner, which outputs the decision for the query x_q.

17 Dynamic Ensemble Selection (DES) (Introduction). DES techniques assume that each base classifier is a local expert and measure the level of competence of the classifiers.

18-23 Dynamic Ensemble Selection (DES) (Introduction): illustrative figures only.

24 Dynamic Ensemble Selection (DES): general architecture. Figure: for a query x_q, the dynamic selection step picks a subset L' of the pool of classifiers L = {L_1, ..., L_m}; the combiner then merges the outputs of L' to produce the decision.
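
A hedged sketch of this selection-then-combination loop: given per-classifier competence estimates for the query (however they are computed), keep the classifiers judged competent and let only that subset L' vote. The threshold of 0.5 is an illustrative choice, not taken from the slides.

```python
import numpy as np

def des_predict(pool, competences, x_q, threshold=0.5):
    """competences[i] is the estimated competence of pool[i] for the query x_q.
    Classifiers below the threshold are discarded; the surviving subset votes."""
    selected = [clf for clf, c in zip(pool, competences) if c >= threshold]
    if not selected:                      # degenerate case: fall back to the whole pool
        selected = list(pool)
    votes = [clf.predict(x_q.reshape(1, -1))[0] for clf in selected]
    labels, counts = np.unique(votes, return_counts=True)
    return labels[np.argmax(counts)]
```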

25 DES Criterion (Introduction). Given a test pattern x_q: select the most competent classifiers to predict the label of x_q. How to estimate the competence level of the base classifiers? The region of competence of x_q is defined by its neighbours.

26-28 DES Criterion: Region of Competence. Figures: the region of competence of x_q for k = 1, k = 3 and k = 5 nearest neighbours.

29 DES Criterion: the OLA algorithm, an example. Three classifiers: {C_1, C_2, C_3}. Figure: for the k = 5 neighbours of x_q, green means that C_i correctly classified the neighbour and orange means that it did not. The resulting competences are C_1: 2/5 = 0.4, C_2: 4/5 = 0.8, C_3: 3/5 = 0.6.
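
A sketch of the OLA criterion under the same setup: the competence of each classifier is its accuracy over the k nearest neighbours of x_q in a validation (DSEL) set. Variable names are illustrative; with the counts from the example above (2/5, 4/5, 3/5) it would return [0.4, 0.8, 0.6].

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def ola_competence(pool, X_dsel, y_dsel, x_q, k=5):
    """Overall Local Accuracy: accuracy of each base classifier over the
    region of competence, i.e. the k nearest neighbours of x_q in DSEL."""
    nn = NearestNeighbors(n_neighbors=k).fit(X_dsel)
    _, idx = nn.kneighbors(x_q.reshape(1, -1))
    roc_X, roc_y = X_dsel[idx[0]], y_dsel[idx[0]]
    return np.array([np.mean(clf.predict(roc_X) == roc_y) for clf in pool])
```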

30 Oracle [Kuncheva, PAMI 2002]. An ideal DES technique that always selects the classifier that predicts the correct label for x_q, and rejects otherwise. Oracle definition: δ_{i,j} = 1 if c_i correctly classifies x_j; δ_{i,j} = 0 otherwise.
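
The definition translates directly into a competence matrix. Below is a small sketch (placeholder names) that also reports the Oracle accuracy, i.e. the fraction of samples for which at least one classifier in the pool is correct.

```python
import numpy as np

def oracle_matrix(pool, X, y):
    """delta[i, j] = 1 if classifier c_i correctly classifies sample x_j, else 0."""
    delta = np.array([(clf.predict(X) == y).astype(int) for clf in pool])
    # The Oracle is correct on x_j whenever some classifier is correct on it,
    # so its accuracy is an upper bound for any selection scheme over this pool.
    oracle_accuracy = delta.max(axis=0).mean()
    return delta, oracle_accuracy
```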

31 Objective. Compare the criteria used to estimate the level of competence of classifiers, using dissimilarity representation.

32 Objective. Compare the criteria used to estimate the level of competence of classifiers, using dissimilarity representation. The purpose of the dissimilarity analysis is twofold: understand the relationship between different DES criteria, and determine which DES criterion behaves most similarly to the Oracle.

33 Objective. Compare the criteria used to estimate the level of competence of classifiers, using dissimilarity representation. The purpose of the dissimilarity analysis is twofold: understand the relationship between different DES criteria, and determine which DES criterion behaves most similarly to the Oracle. Hypothesis: techniques closer to the Oracle in the dissimilarity space achieve higher recognition accuracy.

34 Dynamic Ensemble Selection: Literature.
OLA: Overall Local Accuracy (Woods et al., PAMI 1997)
LCA: Local Classifier Accuracy (Woods et al., PAMI 1997)
MCB: Multiple Classifier Behaviour (Giacinto, 2001)
MLA: Modified Local Accuracy (Smits, 2002)
KNORA: K-Nearest Oracles Eliminate (Ko et al., Pattern Recognition 2008)
KNOP: K-Nearest Output Profiles (Cavalin et al., Pattern Recognition 2013)
Meta-DES: On Meta-Learning for Dynamic Ensemble Selection (Cruz et al., ICPR 2014)

35 Meta-DES: Meta-Features. Multiple criteria are used to estimate the competence of base classifiers; these criteria are encoded as meta-features (MF). RoC: region of competence.
f_1: local accuracy in the RoC (paradigm: classifier accuracy over a local region)
f_2: extent of consensus in the RoC (paradigm: classifier consensus)
f_3: overall accuracy in the RoC (paradigm: classifier accuracy over a local region)
f_4: accuracy in the decision space (paradigm: output profiles)
f_5: degree of confidence for the input sample (paradigm: classifier confidence)
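
A simplified sketch of how some of these meta-features could be computed for one (classifier, query) pair; it covers rough analogues of f_1, f_3 and f_5 only, the base classifier is assumed to expose predict_proba, and the exact definitions in the Meta-DES paper may differ.

```python
import numpy as np

def meta_features(clf, x_q, roc_X, roc_y):
    """roc_X, roc_y: the region of competence of x_q (its k nearest neighbours).
    Returns per-neighbour hits (f1-like), overall local accuracy (f3-like)
    and the classifier's confidence on x_q itself (f5-like)."""
    hits = (clf.predict(roc_X) == roc_y).astype(float)        # f1: hard hits in the RoC
    local_acc = hits.mean()                                   # f3: overall accuracy in the RoC
    confidence = clf.predict_proba(x_q.reshape(1, -1)).max()  # f5: degree of confidence
    return np.concatenate([hits, [local_acc, confidence]])
```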

36 Meta-DES: Meta-Classifier. A meta-classifier λ is trained to distinguish between competent and not competent classifiers. For each pair (classifier c_i, query pattern x_q), the meta-features f_1, ..., f_5 are fed to λ, whose output is Yes (competent) or No (not competent).
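
A hedged sketch of the meta-training step: one meta-feature vector per (classifier, sample) pair, labelled competent (1) when that classifier gets the sample right. The choice of Gaussian Naive Bayes for λ and the `extract` routine (e.g. a closure around a region-of-competence lookup) are assumptions for illustration.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def train_meta_classifier(pool, X_meta, y_meta, extract):
    """`extract(clf, x)` returns the meta-feature vector v_{i,j} for a
    (classifier, sample) pair; lambda learns to map it to Yes/No (1/0)."""
    V, labels = [], []
    for clf in pool:
        preds = clf.predict(X_meta)
        for x, pred, y in zip(X_meta, preds, y_meta):
            V.append(extract(clf, x))
            labels.append(int(pred == y))        # 1 = competent, 0 = not competent
    return GaussianNB().fit(np.array(V), np.array(labels))
```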

37 Meta-DES architecture. Figure: three phases. (3.2.1) Overproduction: the classifier generation process builds the pool C = {c_1, ..., c_m} from the training data. (3.2.2) Meta-Training: sample selection, extraction of the meta-feature vectors v_{i,j}, and training of the selector λ. (3.2.3) Generalization: meta-features are extracted for the test sample, the selector performs dynamic selection, and the selected classifiers are combined by majority vote. Reference: Rafael M. O. Cruz, Robert Sabourin and George D. C. Cavalcanti. On Meta-Learning for Dynamic Ensemble Selection. International Conference on Pattern Recognition (ICPR), 2014.

38 Experimental protocol. Pool of classifiers: 10 Perceptrons (generated using Bagging). Number of replications: 20. Data split into Training, Meta-training, Selection, and Test sets. Size of the region of competence: 7. Datasets (source): Pima (UCI), Liver Disorders (UCI), Breast (WDBC) (UCI), Vehicle (UCI), Blood Transfusion (UCI), Sonar (UCI), Ionosphere (UCI), Wine (UCI), Haberman (UCI), Banana (PRTOOLS), Lithuanian (PRTOOLS).
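
The pool described in the protocol can be approximated with scikit-learn as below; the dataset is a stand-in for the UCI/PRTOOLS sets, and the parameter is named `base_estimator` in older scikit-learn releases.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)        # stand-in for one of the listed datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Pool of 10 Perceptrons generated with Bagging, as in the experimental protocol.
bagging = BaggingClassifier(estimator=Perceptron(), n_estimators=10, random_state=0)
bagging.fit(X_train, y_train)
pool = bagging.estimators_                         # base classifiers handed to the DES step
```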

39 Dissimilarity Analysis. The first step is to compute the dissimilarity matrix D. D is an 8 × 8 symmetric matrix in which d_{A,B} is the dissimilarity between two DES techniques A and B. The dissimilarity d_{A,B} is computed as

d_{A,B} = \frac{1}{NM} \sum_{j=1}^{N} \sum_{i=1}^{M} \left( \delta^{A}_{i,j} - \delta^{B}_{i,j} \right)^{2}

where δ^A_{i,j} is the level of competence assigned by technique A, δ^B_{i,j} is the level of competence assigned by technique B, N is the size of the validation dataset, and M is the size of the pool of classifiers.
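
A direct sketch of the metric: given the competence matrices δ^A and δ^B of two techniques over the same validation set and pool (shape M × N, classifiers by samples), d_{A,B} is simply the mean squared difference; a helper then assembles the full matrix D for a set of techniques. Names are placeholders.

```python
import numpy as np

def dissimilarity(delta_a, delta_b):
    """d_{A,B} = (1/(N*M)) * sum_{j,i} (delta^A_{i,j} - delta^B_{i,j})**2,
    with delta arrays of shape (M, N): M classifiers, N validation samples."""
    return float(np.mean((delta_a - delta_b) ** 2))

def dissimilarity_matrix(deltas):
    """deltas: dict mapping technique name -> competence matrix.
    Returns the technique names and the symmetric matrix D."""
    names = list(deltas)
    D = np.zeros((len(names), len(names)))
    for a, name_a in enumerate(names):
        for b, name_b in enumerate(names):
            D[a, b] = dissimilarity(deltas[name_a], deltas[name_b])
    return names, D
```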

40 Dissimilarity Matrix. For each dataset, we compute a dissimilarity matrix: D_Pima, D_Liver, ..., D_Lithuanian. The average dissimilarity matrix D is computed as the mean of the per-dataset matrices: D = mean(D_Pima, D_Liver, ..., D_Lithuanian). Entries are mean (standard deviation):

                Meta-Learning  KNORA        MCB          LCA          OLA          MLA          KNOP         Oracle
Meta-Learning   0              0.36(0.06)   0.46(0.15)   0.40(0.07)   0.36(0.06)   0.40(0.04)   0.53(0.08)   0.54(0.03)
KNORA           0.36(0.06)     0            0.89(0.06)   0.42(0.01)   0.44(0.01)   0.71(0.04)   0.74(0.11)   0.68(0.01)
MCB             0.46(0.15)     0.89(0.06)   0            0.58(0.01)   0.89(0.06)   1.06(0.07)   0.75(0.03)   0.72(0.08)
LCA             0.40(0.07)     0.42(0.01)   0.58(0.01)   0            0.42(0.01)   0.45(0.02)   0.31(0.04)   0.60(0.06)
OLA             0.36(0.06)     0.44(0.01)   0.89(0.06)   0.42(0.01)   0            0.71(0.04)   0.74(0.11)   0.68(0.11)
MLA             0.40(0.04)     0.71(0.04)   1.06(0.07)   0.45(0.02)   0.71(0.04)   0            0.54(0.01)   0.63(0.07)
KNOP            0.53(0.08)     0.74(0.11)   0.75(0.03)   0.31(0.04)   0.74(0.11)   0.54(0.01)   0            0.86(0.12)
Oracle          0.54(0.03)     0.68(0.01)   0.72(0.08)   0.60(0.06)   0.68(0.11)   0.63(0.07)   0.86(0.12)   0

41 Dissimilarity Matrix (the same average matrix D as on the previous slide). How to show this matrix D in a 2D plot?

42 Classifier Projection Space (CPS) [Pekalska et al., 2002]. The CPS is an R^n space in which each DES technique is represented as a point and the Euclidean distance between two techniques equals their dissimilarity. A 2-dimensional projection is obtained with non-linear multidimensional scaling (Sammon mapping).
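
A sketch of the projection step. scikit-learn does not ship a Sammon mapping, so metric MDS on the precomputed dissimilarity matrix is used here as a stand-in; the resulting picture is qualitatively similar but not identical to the slides' Sammon-based CPS.

```python
import matplotlib.pyplot as plt
from sklearn.manifold import MDS

def plot_cps(names, D):
    """Embed the techniques in 2-D so that inter-point distances approximate
    the entries of the (precomputed) dissimilarity matrix D."""
    coords = MDS(n_components=2, dissimilarity="precomputed", random_state=0).fit_transform(D)
    plt.scatter(coords[:, 0], coords[:, 1])
    for (x, y), name in zip(coords, names):
        plt.annotate(name, (x, y))
    plt.title("Classifier Projection Space (approximate)")
    plt.show()
```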

43 Classifier Projection Space (CPS). Figure: CPS for the average dissimilarity matrix D, showing the positions of MCB, KNOP, MLA, OLA, LCA, KNORA, Meta-Learning, and the Oracle. Reference: Rafael M. O. Cruz, George D. C. Cavalcanti, Tsang Ing Ren and Robert Sabourin. Feature representation selection based on Classifier Projection Space and Oracle analysis. Expert Systems with Applications, v. 40.

44 Average Distance to the Oracle. For each classification problem, the dissimilarity between each DES technique and the Oracle (d_{A,Oracle}): mean and standard deviation.

Database            Meta-Learning  KNORA-E      MCB          LCA          OLA          MLA          KNOP
Pima                0.32(0.04)     0.43(0.01)   0.47(0.08)   0.36(0.06)   0.43(0.01)   0.44(0.07)   0.41(0.02)
Liver Disorders     0.50(0.04)     0.61(0.01)   0.67(0.008)  0.56(0.06)   0.61(0.01)   0.60(0.07)   0.51(0.02)
Breast Cancer       0.59(0.35)     1.22(0.10)   1.20(0.10)   0.69(0.01)   1.20(0.10)   0.77(0.03)   1.20(0.10)
Blood Transfusion   0.33(0.03)     0.40(0.01)   0.46(0.01)   0.36(0.003)  0.40(0.01)   0.44(0.08)   0.40(0.01)
Banana              0.33(0.10)     0.29(0.01)   0.36(0.01)   0.24(0.01)   0.29(0.01)   0.36(0.01)   0.34(0.01)
Vehicle             0.36(0.07)     0.49(0.01)   0.48(0.02)   0.36(0.04)   0.49(0.01)   0.37(0.05)   0.47(0.02)
Lithuanian Classes  0.47(0.14)     0.49(0.02)   0.56(0.02)   0.39(0.04)   0.49(0.02)   0.54(0.01)   0.51(0.03)
Sonar               0.58(0.10)     0.91(0.04)   0.88(0.01)   0.70(0.01)   0.91(0.04)   0.85(0.02)   0.84(0.06)
Ionosphere          0.62(0.22)     0.89(0.05)   0.88(0.06)   0.70(0.07)   0.89(0.05)   0.68(0.02)   0.88(0.06)
Wine                1.03(0.20)     0.88(0.11)   0.98(0.11)   0.73(0.02)   0.88(0.11)   0.93(0.06)   0.82(0.14)
Haberman            0.79(0.04)     0.89(0.05)   1.01(0.05)   0.82(0.02)   0.89(0.05)   0.92(0.04)   0.86(0.06)
Mean                0.54(0.05)     0.68(0.01)   0.72(0.08)   0.60(0.06)   0.68(0.11)   0.63(0.07)   0.86(0.12)

45 Average Distance to the Oracle (same table as the previous slide). The meta-learning framework is closer to the Oracle for the majority of the datasets, followed by the LCA technique. Reference: Rafael M. O. Cruz, Robert Sabourin and George D. C. Cavalcanti. Analyzing Dynamic Ensemble Selection Techniques Using Dissimilarity Analysis. Workshop on Artificial Neural Networks in Pattern Recognition, v. 8774.

46 Comparative Results. Accuracy rate: mean and standard deviation.

Database            Meta-Learning  KNORA-E      MCB          LCA          OLA          MLA          KNOP          Oracle
Pima                77.74(2.34)    73.16(1.86)  73.05(2.21)  72.86(2.98)  73.14(2.56)  73.96(2.31)  73.42(2.11)   95.10(1.19)
Liver Disorders     (5.57)         63.86(3.28)  63.19(2.39)  62.24(4.01)  62.05(3.27)  57.10(3.29)  65.23(2.29)   90.07(2.41)
Breast Cancer       97.41(1.07)    96.93(1.10)  96.83(1.35)  97.15(1.58)  96.85(1.32)  96.66(1.34)  95.42(0.89)   99.13(0.52)
Blood Transfusion   79.14(1.88)    74.59(2.62)  72.59(3.20)  72.20(2.87)  72.33(2.36)  70.17(3.05)  77.54(2.03)   94.20(2.08)
Banana              90.16(2.09)    88.83(1.67)  88.17(3.37)  89.28(1.89)  89.40(2.15)  80.83(6.15)  85.73(10.65)  94.75(2.09)
Vehicle             82.50(2.07)    81.19(1.54)  80.20(4.05)  80.33(1.84)  81.50(3.24)  71.15(3.50)  80.09(1.47)   96.80(0.94)
Lithuanian Classes  90.26(2.78)    88.83(2.50)  89.17(2.30)  88.10(2.20)  87.95(1.85)  77.67(3.20)  89.33(2.29)   (0.57)
Sonar               79.72(1.86)    74.95(2.79)  75.20(3.35)  76.51(2.06)  74.52(1.54)  74.85(1.34)  75.72(2.82)   94.46(1.63)
Ionosphere          89.31(0.95)    87.37(3.07)  85.71(2.12)  86.56(1.98)  86.56(1.98)  87.35(1.34)  85.71(5.52)   96.20(1.72)
Wine                96.94(4.08)    95.00(1.53)  95.55(2.30)  95.85(2.25)  96.16(3.02)  96.66(3.36)  95.00(4.14)   (0.21)
Haberman            76.71(3.52)    71.23(4.16)  72.86(3.65)  70.16(3.56)  72.26(4.17)  65.01(3.20)  75.00(3.40)   97.36(3.34)

The best results are in bold; results that are significantly better (p < 0.05) are underlined. The accuracy of the proposed meta-learning framework is statistically superior in 8 out of 11 datasets.

47 Conclusion. We conducted a study of the dissimilarity between different DES techniques using the Classifier Projection Space (CPS).

48 Conclusion. We conducted a study of the dissimilarity between different DES techniques using the Classifier Projection Space (CPS). Techniques that use the same kind of information, such as LCA, OLA and MLA, are likely to present similar results.

49 Conclusion. We conducted a study of the dissimilarity between different DES techniques using the Classifier Projection Space (CPS). Techniques that use the same kind of information, such as LCA, OLA and MLA, are likely to present similar results. The combination of multiple criteria using meta-learning achieves a result close to the Oracle in the dissimilarity space, and also achieves higher recognition rates.

50 Analyzing dynamic ensemble selection techniques using dissimilarity analysis. George D. C. Cavalcanti, Centro de Informática, Universidade Federal de Pernambuco (UFPE), Brazil. gdcc@cin.ufpe.br
