Distance Measures Classifiers DTW vs. ED Further Work Questions August 31, 2017
Outline
1
2 Distance Measures
3 Classifiers
4 DTW vs. ED
5 Further Work
6 Questions
Classification
- Sorting objects into pre-defined groups
A supervised learning problem
- We have a training set: data which is already labelled
- Given a new test time series, which group does it belong to?
Applications
- Speech recognition
- Image processing
- ECG readings
Objectives of the Project
- Investigate commonly used methods
- Research the limitations and capabilities of these methods
- Understand when these methods are best used
- Identify areas of further research
Two Step Process
1. Measuring the distance between the test time series and each time series in the training set
   - Euclidean Distance
   - Dynamic Time Warping
2. Classifying the test time series based on its distance from the training time series
   - Usually some form of nearest neighbour algorithm
Distance Measures: Euclidean Distance
- Pointwise difference between time series: for two series $T$ and $S$ of equal length $n$, $ED(T, S) = \sqrt{\sum_{i=1}^{n} (T_i - S_i)^2}$
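A minimal sketch of the Euclidean distance in Python. The function name `euclidean_distance` is an illustrative choice, not something taken from the project, and the two series are the ones used in the DTW example in these slides.

```python
import math


def euclidean_distance(t, s):
    """Euclidean distance between two equal-length time series."""
    if len(t) != len(s):
        raise ValueError("ED requires time series of the same length")
    # Square the pointwise differences, sum them, take the square root.
    return math.sqrt(sum((ti - si) ** 2 for ti, si in zip(t, s)))


T = [1, 3, 1, 0]
S = [0, 1, 3, 2]
print(euclidean_distance(T, S))  # square root of 1 + 4 + 4 + 4 = 13
```

The length check makes the main restriction of ED explicit: it is undefined for series of different lengths.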
Distance Measures: Dynamic Time Warping
- Uses dynamic programming to find the alignment that minimises the total difference between the time series
- Accounts for different time scales
Distance Measures: Dynamic Time Warping - Algorithm
Take two time series: $T = \{1, 3, 1, 0\}$ and $S = \{0, 1, 3, 2\}$
Construct a cost matrix $C$ with $C_{i,j} = (T_i - S_j)^2$:

   1   0   4   1
   9   4   0   1
   1   0   4   1
   0   1   9   4

Construct the matrix $D$ sequentially with $D_{i,j} = C_{i,j} + \min(D_{i-1,j-1}, D_{i,j-1}, D_{i-1,j})$:

   1   1   5   6
  10   5   1   2
  11   5   5   2
  11   6  14   6
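The construction of $C$ and $D$ can be sketched as follows. This is an illustrative implementation under the recurrence stated on the slide, not the code used in the project; the name `dtw_matrices` is hypothetical. Cells on the first row and column simply ignore neighbours that fall outside the matrix.

```python
def dtw_matrices(t, s):
    """Return the cost matrix C and the accumulated matrix D for series t and s."""
    inf = float("inf")
    n, m = len(t), len(s)
    cost = [[(ti - sj) ** 2 for sj in s] for ti in t]
    acc = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                best = 0  # the path starts in the top-left cell
            else:
                best = min(
                    acc[i - 1][j - 1] if i > 0 and j > 0 else inf,  # diagonal
                    acc[i][j - 1] if j > 0 else inf,                # left
                    acc[i - 1][j] if i > 0 else inf,                # above
                )
            acc[i][j] = cost[i][j] + best
    return cost, acc


C, D = dtw_matrices([1, 3, 1, 0], [0, 1, 3, 2])
for row in D:
    print(row)
```

For the example series this reproduces both matrices shown on the slide; the DTW distance is the bottom-right entry of `D`.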
DTW Example: Optimal Path

   1   1   5   6
  10   5   1   2
  11   5   5   2
  11   6  14   6

Step  Cell   Alignment (a = T, b = S)
1     (1,1)  a_1 <-> b_1
2     (1,2)  a_1 <-> b_2
3     (2,3)  a_2 <-> b_3
4     (3,4)  a_3 <-> b_4
5     (4,4)  a_4 <-> b_4

The DTW distance is the bottom-right entry, $D_{4,4} = 6$.
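One way to recover the optimal path is to walk backwards from the bottom-right cell, always moving to the cheapest of the three predecessor cells. The sketch below assumes that approach; `optimal_path` is a hypothetical name, and ties are broken in favour of the diagonal move, which is one of several reasonable conventions.

```python
def optimal_path(acc):
    """Backtrack through an accumulated DTW matrix; cells are returned 1-indexed."""
    i, j = len(acc) - 1, len(acc[0]) - 1
    path = [(i + 1, j + 1)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1][j - 1], i - 1, j - 1))  # diagonal
        if j > 0:
            candidates.append((acc[i][j - 1], i, j - 1))          # left
        if i > 0:
            candidates.append((acc[i - 1][j], i - 1, j))          # above
        # min() keeps the first minimum, so the diagonal wins ties.
        _, i, j = min(candidates, key=lambda c: c[0])
        path.append((i + 1, j + 1))
    path.reverse()
    return path


# Accumulated matrix D copied from the slide.
D = [
    [1, 1, 5, 6],
    [10, 5, 1, 2],
    [11, 5, 5, 2],
    [11, 6, 14, 6],
]
print(optimal_path(D))
```

On the slide's matrix this visits (1,1), (1,2), (2,3), (3,4), (4,4), matching the table.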
Classification Step - K-Nearest Neighbour
- Assigns a label based on the most common label among the K nearest neighbours
- K is pre-specified
- Simplest example is 1-NN
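A short sketch of the K-nearest-neighbour step with a pluggable distance measure. The names `knn_classify` and `euclidean`, the class labels, and the toy training set are all invented for illustration. Voting ties go to the label that appears first among the nearest neighbours, which is an assumption rather than part of the method described here.

```python
import math
from collections import Counter


def euclidean(t, s):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(t, s)))


def knn_classify(train, test_series, k=1, distance=euclidean):
    """train is a list of (series, label) pairs; returns the majority label."""
    # Sort the training series by distance to the test series and keep the k closest.
    neighbours = sorted(train, key=lambda pair: distance(pair[0], test_series))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy training set, made up for this example.
train = [
    ([0, 0, 1, 1], "rising"),
    ([0, 1, 1, 1], "rising"),
    ([1, 1, 0, 0], "falling"),
    ([1, 1, 1, 0], "falling"),
]
print(knn_classify(train, [0, 0, 0.9, 1.2], k=1))
print(knn_classify(train, [0, 0, 0.9, 1.2], k=3))
```

Passing a DTW function as `distance` switches the classifier to DTW without changing the voting logic.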
DTW vs. ED - Theoretical Results
Take a training set made up of two time series, a (Class 1) and b (Class 2), and take a time series c we wish to label.
Assumptions
- All time series are the same length, n
- c is from Class 1, i.e. $c = a + \varepsilon$, where $\varepsilon$ is white noise with $\varepsilon_i \sim N(0, \sigma^2)$
- We use ED and 1-NN
DTW vs. ED - Theoretical Results
$$P(c \text{ labelled correctly}) = \Phi\left( \frac{1}{2\sigma} \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2} \right)$$
What does this mean? We want:
- a small $\sigma^2$ (variance of the noise)
- a longer time series
- well-defined differences between training time series
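The result can be checked numerically. The sketch below is a Monte Carlo simulation under the stated assumptions (c equals a plus independent N(0, σ²) noise, ED with 1-NN), compared with the value of Φ evaluated at the Euclidean distance between a and b divided by 2σ. The series, the value of σ, the number of trials and the function names are all arbitrary choices made for this example.

```python
import math
import random
from statistics import NormalDist


def predicted_accuracy(a, b, sigma):
    """Phi(||a - b|| / (2 * sigma)), the closed-form probability."""
    gap = math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    return NormalDist().cdf(gap / (2 * sigma))


def simulated_accuracy(a, b, sigma, trials=20000, seed=0):
    """Fraction of noisy copies of a that 1-NN with ED labels as Class 1."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        c = [ai + rng.gauss(0, sigma) for ai in a]
        # Comparing squared distances gives the same nearest neighbour as ED.
        d_a = sum((ci - ai) ** 2 for ci, ai in zip(c, a))
        d_b = sum((ci - bi) ** 2 for ci, bi in zip(c, b))
        correct += d_a < d_b
    return correct / trials


a = [1, 3, 1, 0]
b = [0, 1, 3, 2]
sigma = 2.0
print(predicted_accuracy(a, b, sigma), simulated_accuracy(a, b, sigma))
```

The two printed values should agree up to sampling error; lowering `sigma` or moving `a` and `b` further apart pushes both towards 1.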
Standard Performance Comparison
- Data set: coffee bean spectrograph readings
- Our performance measure is the proportion of correct classifications
Shifted Time Series
Efficiency
- Efficiency is very important, particularly when using time series classification (TSC) in real time
- DTW performs very poorly in comparison with ED: its mean time in the table below is roughly 220 times that of ED

Time Taken for Distance Measures (milliseconds)
Measure   Min. Time   Mean Time   Max. Time
ED            11.21       11.69       55.43
DTW         2533.88     2581.30     3919.35
Pros and Cons: DTW vs. ED
ED
- Only works with time series of the same length
- More resistant to noisy or spiky test data than DTW
- Fails when data is shifted or transformed with respect to time
- Quicker and simpler
DTW
- Can be used on time series of any length
- Generally weaker when data is spiky or noisy
- Robust when data is shifted or transformed with respect to time
- Much slower
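The point about shifted data can be illustrated with a toy pair of series: the same pulse, delayed by one time step. The series and function names below are made up for this sketch, and `dtw` returns the accumulated squared cost without a square root, as in the worked example in these slides.

```python
import math


def euclidean(t, s):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(t, s)))


def dtw(t, s):
    """Accumulated DTW cost with squared pointwise differences."""
    inf = float("inf")
    # An extra border row and column means the recurrence needs no special cases.
    acc = [[inf] * (len(s) + 1) for _ in range(len(t) + 1)]
    acc[0][0] = 0.0
    for i in range(1, len(t) + 1):
        for j in range(1, len(s) + 1):
            cost = (t[i - 1] - s[j - 1]) ** 2
            acc[i][j] = cost + min(acc[i - 1][j - 1], acc[i][j - 1], acc[i - 1][j])
    return acc[-1][-1]


pulse = [0, 0, 1, 2, 1, 0, 0, 0]
shifted = [0, 0, 0, 1, 2, 1, 0, 0]  # same pulse, one step later

print(euclidean(pulse, shifted))  # 2.0: ED penalises the shift
print(dtw(pulse, shifted))        # 0.0: DTW warps the flat sections to absorb it
```

The two nested loops in `dtw` also show where the extra running time comes from: every pair of time points is visited, whereas ED makes a single pass.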
Changing DTW
Adding a window: only cells close to the diagonal are computed (here those with $|i - j| \le 1$)

   1   1   .   .
  10   5   1   .
   .   5   5   2
   .   .  14   6

- More efficient
- Not much accuracy lost
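A sketch of DTW restricted to a band around the diagonal, a constraint often called a Sakoe-Chiba band. The function name `dtw_windowed` and the `window` parameter are illustrative; cells outside the band are left at infinity, so they can never be part of the path.

```python
def dtw_windowed(t, s, window):
    """Accumulated DTW matrix computed only where |i - j| <= window."""
    inf = float("inf")
    n, m = len(t), len(s)
    # Extra border row and column so the recurrence needs no special cases.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0
    for i in range(1, n + 1):
        # Only visit the columns that lie inside the band for this row.
        for j in range(max(1, i - window), min(m, i + window) + 1):
            cost = (t[i - 1] - s[j - 1]) ** 2
            acc[i][j] = cost + min(acc[i - 1][j - 1], acc[i][j - 1], acc[i - 1][j])
    # Drop the border before returning.
    return [row[1:] for row in acc[1:]]


for row in dtw_windowed([1, 3, 1, 0], [0, 1, 3, 2], window=1):
    print(row)
```

With `window=1` the finite entries match the banded matrix on the slide, and the bottom-right entry is still 6, the same as unconstrained DTW for this example. If the two series differ in length by more than `window`, the bottom-right cell is unreachable and stays at infinity.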
Further Work
Probabilistic K-NN
- Classification is generally a binary process
- May be advantageous to give a measure of how sure we are
Early TSC
- Classifying time series before we have received all data
- Trade-off between accuracy and speed
Questions
Thank you for listening! Any questions?