Recent advances in Time Series Classification

Simon Malinowski, LinkMedia Research Team. Classification day #3, 21/06/17.

Time Series Classification

Time series: a time series x is a sequence of values x(1), x(2), ..., x(l), with x(i) ∈ R, ordered in time.

Classification: a supervised task. We are given a training set T = {(x_k, y_k), 1 ≤ k ≤ N}, where x_k is a time series in R^L and y_k ∈ {1, ..., C} is the label associated to x_k. Aim: learn a relationship between the time series in T and their labels, in order to predict the labels of unseen test time series.

Example of time series classification

ECG signals: normal heartbeat vs. myocardial infarction. [Figure: one ECG signal of each class.]

Many other applications: satellite image time series, gesture recognition, finance, music...

Main approaches for time series classification

1. Distance-based approaches: rely on distance measures between raw time series
2. Shapelet-based approaches: rely on shapelets
3. Dictionary-based approaches: Bag-of-Words
4. Enhanced Bag-of-Words approaches
5. Ensemble approaches

You can find a good recent review of TSC in [Bagnall et al., 2016]. Website: timeseriesclassification.com, with source code and more than 80 datasets.

Outline

1. Distance-Based Time Series Classification: Euclidean Distance, Dynamic Time Warping, Global Alignment Kernel
2. Shapelet-Based Time Series Classification
3. Dictionary-based approaches
4. Efficient kernels for time series classification
5. Conclusion

Euclidean Distance

Let X = x_1, ..., x_n and Y = y_1, ..., y_n be two time series. Then

d(X, Y)² = Σ_{i=1}^{n} (x_i − y_i)²

Pros: easy to compute; compatible with kernel machines.
Cons: requires series of the same length; sensitive to shifts and warpings.
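
As a minimal sketch (NumPy assumed; the function name is mine), the distance is a one-liner once the two series have been checked to be of equal length:

```python
import numpy as np

def euclidean_distance(x, y):
    """Euclidean distance between two time series of the same length."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("ED requires series of the same length")
    return np.sqrt(np.sum((x - y) ** 2))
```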

Dynamic Time Warping [Sakoe and Chiba, 1978]

A measure based on the best alignment between two time series.

Dynamic Time Warping

[Figure: alignment grid between X = x_1, ..., x_4 and Y = y_1, ..., y_5, with two candidate alignment paths π_1 and π_2.]

Local cost: d_{i,j} = (x_i − y_j)². The score of an alignment path π is S_π² = Σ_{(i,j) ∈ π} d_{i,j}. The best alignment is the one leading to the minimum score, this score being the DTW.

Pros: handles shifts and warpings; good performance.
Cons: quadratic complexity; not a proper distance.
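
The best alignment is found by dynamic programming. A minimal NumPy sketch (names are mine; the square root follows the slide's convention S² = Σ d_{i,j}):

```python
import numpy as np

def dtw(x, y):
    """DTW score between two series via dynamic programming.

    D[i, j] holds the cost of the best alignment of x[:i] and y[:j];
    each cell extends a path diagonally, horizontally or vertically.
    The O(n * m) complexity is the drawback mentioned on the slide.
    """
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d_ij = (x[i - 1] - y[j - 1]) ** 2  # local cost d_{i,j}
            D[i, j] = d_ij + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return np.sqrt(D[n, m])  # score of the best alignment path
```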

ED versus DTW

[Figure: accuracy of 1NN-DTW versus 1NN-ED, one point per dataset, with regions where each method is better.] Win/Tie/Lose: 50/4/...

Global Alignment Kernel (GAK) [Cuturi, 2011]

Local kernel between points: k_{i,j} = e^{−(x_i − y_j)²}.

DTW is the score of the best path; GAK considers all possible paths:

GAK(X, Y) = Σ_{π ∈ Π} Π_{(i,j) ∈ π} k_{i,j}

Pros: handles shifts and warpings; good performance; can be used in kernel machines.
Cons: quadratic complexity.
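
GAK obeys the same dynamic-programming recursion as DTW, with the minimum replaced by a sum and local costs by local kernel values. A sketch under those assumptions (the bandwidth gamma is added for generality; real implementations work in log space to avoid underflow):

```python
import numpy as np

def gak(x, y, gamma=1.0):
    """Global Alignment Kernel: sums the products of local kernel values
    over ALL alignment paths, computed in O(n * m) by dynamic programming."""
    n, m = len(x), len(y)
    G = np.zeros((n + 1, m + 1))
    G[0, 0] = 1.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            k_ij = np.exp(-gamma * (x[i - 1] - y[j - 1]) ** 2)
            G[i, j] = k_ij * (G[i - 1, j] + G[i, j - 1] + G[i - 1, j - 1])
    return G[n, m]
```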

Outline

1. Distance-Based Time Series Classification
2. Shapelet-Based Time Series Classification: Shapelets, Shapelet transform, Learning Shapelets
3. Dictionary-based approaches
4. Efficient kernels for time series classification
5. Conclusion

Shapelets

First introduced in [Ye and Keogh, 2009]. A shapelet is a subsequence (of a time series) that is representative of a class.

[Figure: example time series of class 1 and class 2, with a candidate shapelet s_1 highlighted.]

A shapelet s_1 is representative of class 1 if: distances between s_1 and time series of class 1 are low, and distances between s_1 and time series of other classes are high.

Distance between a shapelet and a time series

The shapelet is slid along the time series, and the distance kept is the minimum distance between the shapelet and all subsequences of the series of the same length.

[Figure: the shapelet slid along a series of each class.] This shapelet represents class 1: low distance to class 1 time series, high distance to class 2 time series.
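
A minimal sketch of that sliding-window distance (NumPy assumed; the function name is mine):

```python
import numpy as np

def shapelet_distance(shapelet, series):
    """Distance between a shapelet and a series: the minimum Euclidean
    distance between the shapelet and every subsequence of equal length."""
    s, x = np.asarray(shapelet, float), np.asarray(series, float)
    l = len(s)
    return min(np.sqrt(np.sum((x[i:i + l] - s) ** 2))
               for i in range(len(x) - l + 1))
```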

Bad or good shapelet?

[Figure: distributions of the distances between a shapelet and the series of each class.] A bad shapelet: the distance distributions of the two classes overlap. A good shapelet: the two distributions are well separated.

Shapelets: example images taken from [Ye and Keogh, 2009].

Shapelet transform [Hills et al., 2014]

This approach is based on 3 steps:

1. Extract the K best shapelets (information gain criterion)
2. Transform each training time series into a K-dimensional vector M = M_1, ..., M_K, where M_i is the distance between the time series and the i-th shapelet
3. Train a classifier on the transformed series

[Picture taken from [Grabocka et al., 2014]]
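
Step 2 is a plain feature map; reusing the shapelet_distance sketch above (names are mine), it could look like this, after which any vector classifier (an SVM, for instance) can be trained on the result:

```python
import numpy as np

def shapelet_transform(series_list, shapelets):
    """Map each series to the K-dimensional vector M = (M_1, ..., M_K),
    where M_i is its distance to the i-th shapelet."""
    return np.array([[shapelet_distance(s, x) for s in shapelets]
                     for x in series_list])
```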

Learning shapelets [Grabocka et al., 2014]

Based on the shapelet transform principle, BUT the shapelets are learned via logistic regression. Given a set of weights w_0, ..., w_K and a set of K shapelets:

Ŷ_i = w_0 + Σ_{k=1}^{K} w_k M_{i,k}

L(Y_i, Ŷ_i) = −Y_i ln(g(Ŷ_i)) − (1 − Y_i) ln(1 − g(Ŷ_i))

where g is the logistic function. Objective: find K shapelets and K weights such that this loss is minimized, via gradient descent.
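
A sketch of the loss and of its gradient with respect to the weights (variable names are mine; the full method also back-propagates through the distances M into the shapelet values themselves):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_weight_gradients(M, y, w0, w):
    """Logistic loss of the slide, averaged over the training set.
    M: (N, K) matrix of shapelet distances, y: binary labels in {0, 1}."""
    p = sigmoid(w0 + M @ w)          # g(Ŷ_i) with Ŷ_i = w_0 + Σ_k w_k M_{i,k}
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    grad_w = M.T @ (p - y) / len(y)  # ∂L/∂w_k
    grad_w0 = np.mean(p - y)         # ∂L/∂w_0
    return loss, grad_w0, grad_w
```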

Comparison between shapelet-based methods

[Figures: pairwise accuracy scatter plots.] Shapelet Transform versus Shapelets: Win/Tie/Lose = 77/2/6. Shapelet Transform versus Learning Shapelets: Win/Tie/Lose = 25/0/...

Shapelet transform versus 1NN-DTW

[Figure: accuracy scatter plot.] Shapelet Transform versus 1NN-DTW: Win/Tie/Lose = 67/4/...

Outline

1. Distance-Based Time Series Classification
2. Shapelet-Based Time Series Classification
3. Dictionary-based approaches: Framework, Bag-of-Temporal SIFT Words
4. Efficient kernels for time series classification
5. Conclusion

Bag-of-Words framework

The BoW framework is the following pipeline:

Time Series → Selection of keypoints (or windows) → Description of keypoints (or windows) → Bag-of-Words representation (over a codebook of codewords) → Classifier

Feature extraction over windows

Features that have been considered in the literature:

- Mean, slope, variance, extrema, starting time [Baydogan et al., 2013]
- SAX symbols (quantized mean) [Lin et al., 2012, Senin and Malinchik, 2013]
- Fourier coefficients [Schäfer, 2014]
- Wavelet coefficients [Wang et al., 2013]
- Temporal-SIFT descriptors [Bailly et al., 2015]

These features are then used in a Bag-of-Words approach.

Bag-of-Temporal SIFT Words (BoTSW)

Joint work with Adeline Bailly, Romain Tavenard (Rennes 2) and Thomas Guyet (IRISA/AgroCampus) [Bailly et al., 2015].

BoTSW can be outlined by the following scheme:

Time Series → Extraction of keypoints → Description of keypoints → Bag-of-Words representation (over a codebook of codewords) → Classifier

Extraction of keypoints: dense extraction. Keypoints are extracted densely along the time series. [Figure: regularly spaced keypoints.]

BoTSW: feature vectors

Parameters: n_b, the number of blocks (4), and a, the block size (2). [Figure: the neighborhood of a keypoint described over n_b blocks of size a.]

This description is done for the original time series S(t) and for its smoothed versions L(t, σ) = G(t, k^s σ_0) ∗ S(t), 0 ≤ s ≤ s_max, where G is a Gaussian smoothing kernel.
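
A sketch of building those smoothed versions with SciPy (parameter names are mine, chosen to mirror the slide's notation):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def scale_space(series, sigma0=1.0, k=2.0, s_max=4):
    """Smoothed versions L(t, sigma) = G(t, k^s * sigma0) * S(t),
    for 0 <= s <= s_max, computed by 1-D Gaussian convolution."""
    series = np.asarray(series, dtype=float)
    return [gaussian_filter1d(series, sigma=(k ** s) * sigma0)
            for s in range(s_max + 1)]
```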

Generation of the dictionary

k-means is used to generate the codewords (here k = 6). [Figure: feature vectors clustered into 6 codewords.]
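
A sketch with scikit-learn (the descriptor matrix here is a random stand-in; any stacked set of keypoint descriptors from the training series would do):

```python
import numpy as np
from sklearn.cluster import KMeans

# Stack the descriptors of all training keypoints row-wise,
# then learn a codebook of k codewords (here k = 6, as on the slide).
descriptors = np.random.randn(1000, 8)  # stand-in for real BoTSW descriptors
codebook = KMeans(n_clusters=6, n_init=10).fit(descriptors)
codewords = codebook.cluster_centers_
```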

BoTSW: feature vector assignment

[Figure: each feature vector of a series is assigned to its nearest codeword.]

Bag-of-Words representation

Each series is represented by the frequency vector of its codewords. Different normalizations can be applied to this frequency vector:

- Signed-Square-Root (SSR)
- TF-IDF
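
A sketch of the assignment, counting, and SSR steps, reusing the fitted codebook from above (names are mine; TF-IDF would instead reweight the counts across the training corpus):

```python
import numpy as np

def bow_histogram(descriptors, codebook, ssr=True):
    """Nearest-codeword assignment, occurrence counts, then optional
    Signed-Square-Root (SSR) normalization of the frequency vector."""
    labels = codebook.predict(descriptors)  # nearest codeword per descriptor
    hist = np.bincount(labels, minlength=codebook.n_clusters).astype(float)
    hist /= max(hist.sum(), 1.0)            # occurrence frequencies
    if ssr:
        hist = np.sign(hist) * np.sqrt(np.abs(hist))
    return hist
```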

BoTSW versus DTW

[Figure: accuracy scatter plot of BoTSW (SIFT) against 1NN-DTW.] Win/Tie/Lose for BoTSW: 66/1/...

BoTSW versus other BoW approaches

[Figures: accuracy scatter plots.] BoTSW versus BOSS: Win/Tie/Lose = 46/8/31. BoTSW versus TSBF: Win/Tie/Lose = 55/2/...

From BoW to kernels

Joint work with Romain Tavenard, Adeline Bailly (Rennes 2), Laetitia Chapel (IRISA), and Benjamin Bustos (Univ. of Chile) [Tavenard et al., 2017].

Up to now: feature vectors are extracted from time series; these vectors are quantized into words; time series are represented as histograms of occurrences of words. Drawbacks: loss of information in the quantization, and temporal information is lost.

Idea: design kernels between sets of feature vectors, with no quantization of the feature vectors, and integrate temporal information.

Signature Quadratic Form Distance

Let X = {x_i, x_i ∈ R^d}_{i=1,...,n} and Y = {y_j, y_j ∈ R^d}_{j=1,...,m} be two time series, each seen as a set of feature vectors. The SQFD between X and Y is defined as:

SQFD(X, Y)² = (1/n²) Σ_{i=1}^{n} Σ_{j=1}^{n} k(x_i, x_j) + (1/m²) Σ_{i=1}^{m} Σ_{j=1}^{m} k(y_i, y_j) − (2/nm) Σ_{i=1}^{n} Σ_{j=1}^{m} k(x_i, y_j)

where k(·,·) is a local kernel between feature vectors (a Gaussian kernel, for instance). It can be kernelized into K_SQFD(X, Y) = e^{−γ_f SQFD(X,Y)²}.

Incorporating time in the SQFD kernel

The most direct way is to integrate time into the local kernel k(x_i, y_j) = e^{−γ_l ||x_i − y_j||²}, which now becomes:

k_t(x_i, y_j) = e^{−γ_t (t_j − t_i)²} · k(x_i, y_j)

where t_i and t_j are the timestamps of the feature vectors.
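
A sketch of both formulas (NumPy assumed; names are mine, setting gamma_t = 0 recovers the time-agnostic SQFD, and K_SQFD then follows as exp(-gamma_f * sqfd(...) ** 2)):

```python
import numpy as np

def local_kernel(x, y, tx, ty, gamma_l=1.0, gamma_t=0.0):
    """Gaussian local kernel between feature vectors, weighted by a
    Gaussian kernel on their timestamps (gamma_t = 0 ignores time)."""
    sq = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma_t * (ty[None, :] - tx[:, None]) ** 2) * np.exp(-gamma_l * sq)

def sqfd(x, y, tx, ty, **kw):
    """Signature Quadratic Form Distance between two sets of feature vectors."""
    kxx = local_kernel(x, x, tx, tx, **kw).mean()  # (1/n^2) sum of k(x_i, x_j)
    kyy = local_kernel(y, y, ty, ty, **kw).mean()  # (1/m^2) sum of k(y_i, y_j)
    kxy = local_kernel(x, y, tx, ty, **kw).mean()  # (1/nm)  sum of k(x_i, y_j)
    return np.sqrt(max(kxx + kyy - 2 * kxy, 0.0))
```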

Classification performance

[Figure: error rate as a function of the feature map dimension D, for SQFD-Fourier without time (γ_t = 0), SQFD-Fourier with time, SQFD-Fourier with time (normalized kernel), and BoW.]

BoW versus temporal kernel

[Figure: accuracy scatter plot.] SQFD-Fourier (D = 8192) versus BoW: W/T/L = 49/8/28, p = ...

Temporal kernel versus other approaches

[Figures: accuracy scatter plots.] SQFD-Fourier (D = 8192) versus BOSS: W/T/L = 51/3/31, p = ... SQFD-Fourier (D = 8192) versus Learning Shapelets: W/T/L = 56/3/26, p < ...

Conclusion

Three main families of approaches for TSC: distance-based, shapelet-based, and dictionary-based. Shapelet-based and dictionary-based approaches are more accurate, but there is no single method better than all the others. Dictionary-based methods can be improved by temporal kernels.

Bibliography

Bagnall, A., Lines, J., Bostrom, A., Large, J., and Keogh, E. (2016). The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances. Data Mining and Knowledge Discovery.

Bailly, A., Malinowski, S., Tavenard, R., Guyet, T., and Chapel, L. (2015). Bag-of-Temporal-SIFT-Words for Time Series Classification. In Proceedings of the ECML-PKDD Workshop on Advanced Analytics and Learning on Temporal Data.

Baydogan, M. G., Runger, G., and Tuv, E. (2013). A Bag-of-Features Framework to Classify Time Series. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(11).

Cuturi, M. (2011). Fast global alignment kernels. In Proceedings of the International Conference on Machine Learning.

Grabocka, J., Schilling, N., Wistuba, M., and Schmidt-Thieme, L. (2014). Learning time-series shapelets. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.

Hills, J., Lines, J., Baranauskas, E., Mapp, J., and Bagnall, A. (2014). Classification of time series by shapelet transformation. Data Mining and Knowledge Discovery, 28(4).

Lin, J., Khade, R., and Li, Y. (2012). Rotation-invariant similarity in time series using bag-of-patterns representation. International Journal of Information Systems, 39.

Sakoe, H. and Chiba, S. (1978). Dynamic programming algorithm optimization for spoken word recognition. IEEE Transactions on Acoustics, Speech and Signal Processing, 26(1).

Schäfer, P. (2014). The BOSS is concerned with time series classification in the presence of noise. Data Mining and Knowledge Discovery, 29(6).

Senin, P. and Malinchik, S. (2013). SAX-VSM: Interpretable Time Series Classification Using SAX and Vector Space Model. In Proceedings of the IEEE International Conference on Data Mining.

Tavenard, R., Malinowski, S., Chapel, L., Bailly, A., Sanchez, H., and Bustos, B. (2017). Efficient Temporal Kernels between Feature Sets for Time Series Classification. In Proceedings of ECML-PKDD.

Wang, J., Liu, P., She, M. F. H., Nahavandi, S., and Kouzani, A. (2013). Bag-of-Words Representation for Biomedical Time Series Classification. Biomedical Signal Processing and Control, 8(6).

Questions?
