Dynamic Time Warping

Size: px
Start display at page:

Download "Dynamic Time Warping"

Transcription

1 Dynamic Time Warping Steven Elsworth School of Mathematics, University of Manchester Friday 29 th September, 2017 S. Elsworth Dynamic Time Warping 1 / 19

2 What is a time series? A time series is a sequence of measurements of a quantity (e.g. temperature, pressure) over time. S. Elsworth Dynamic Time Warping 2 / 19

3 What is a time series? A time series is a sequence of measurements of a quantity (e.g. temperature, pressure) over time. We do not need equally spaced time points. S. Elsworth Dynamic Time Warping 2 / 19

4 Temperature ( / C) What is a time series? A time series is a sequence of measurements of a quantity (e.g. temperature, pressure) over time. We do not need equally spaced time points. Temperature at 12:00 in Manchester during Nov Day in November S. Elsworth Dynamic Time Warping 2 / 19

5 Temperature ( / C) ON or OFF What is a time series? A time series is a sequence of measurements of a quantity (e.g. temperature, pressure) over time. We do not need equally spaced time points. Temperature at 12:00 in Manchester during Nov Light switch over 24 hour period Day in November Hour S. Elsworth Dynamic Time Warping 2 / 19

6 What is a distance measure? Consider two time series x = {x i } m i=1 and y = {y i} n i=1. A distance/ similarity measure d(x, y) gives the distance/ similarity between the time series x and y. Some examples include: S. Elsworth Dynamic Time Warping 3 / 19

7 What is a distance measure? Consider two time series x = {x i } m i=1 and y = {y i} n i=1. A distance/ similarity measure d(x, y) gives the distance/ similarity between the time series x and y. Some examples include: Euclidean distance (on time series or transformed representation of time series) S. Elsworth Dynamic Time Warping 3 / 19

8 What is a distance measure? Consider two time series x = {x i } m i=1 and y = {y i} n i=1. A distance/ similarity measure d(x, y) gives the distance/ similarity between the time series x and y. Some examples include: Euclidean distance (on time series or transformed representation of time series) Dynamic Time Warping S. Elsworth Dynamic Time Warping 3 / 19

9 What is a distance measure? Consider two time series x = {x i } m i=1 and y = {y i} n i=1. A distance/ similarity measure d(x, y) gives the distance/ similarity between the time series x and y. Some examples include: Euclidean distance (on time series or transformed representation of time series) Dynamic Time Warping Longest Common Subsequence model S. Elsworth Dynamic Time Warping 3 / 19

10 What is a distance measure? Consider two time series x = {x i } m i=1 and y = {y i} n i=1. A distance/ similarity measure d(x, y) gives the distance/ similarity between the time series x and y. Some examples include: Euclidean distance (on time series or transformed representation of time series) Dynamic Time Warping Longest Common Subsequence model Threshold based distance measures S. Elsworth Dynamic Time Warping 3 / 19

11 Example data Consider the following time series: S. Elsworth Dynamic Time Warping 4 / 19

12 Euclidean distance Defined by d(x, y) = n i=0 d(x i, y i ) 2 where d(x i, y i ) = x i y i or d(x i, y i ) = (x i y i ) 2. S. Elsworth Dynamic Time Warping 5 / 19

13 Euclidean distance Defined by d(x, y) = n i=0 d(x i, y i ) 2 where d(x i, y i ) = x i y i or d(x i, y i ) = (x i y i ) 2. Fast to compute Satisfies the triangle inequality Requires n = m Performs poorly when the time series have outliers. S. Elsworth Dynamic Time Warping 5 / 19

14 Euclidean distance Defined by d(x, y) = n i=0 d(x i, y i ) 2 where d(x i, y i ) = x i y i or d(x i, y i ) = (x i y i ) 2. Fast to compute Satisfies the triangle inequality Requires n = m Performs poorly when the time series have outliers. S. Elsworth Dynamic Time Warping 5 / 19

15 Dynamic Time Warping First proposed by Berndt and Clifford (1994). Start by constructing n m matrix D, where D i,j = d(x i, y j ) where d(x i, y i ) = x i y i or d(x i, y i ) = (x i y i ) 2. S. Elsworth Dynamic Time Warping 6 / 19

16 Dynamic Time Warping First proposed by Berndt and Clifford (1994). Start by constructing n m matrix D, where D i,j = d(x i, y j ) where d(x i, y i ) = x i y i or d(x i, y i ) = (x i y i ) 2. A warping path w is a contiguous set of matrix elements which defines a mapping between x and y that satisfies the following conditions: S. Elsworth Dynamic Time Warping 6 / 19

17 Dynamic Time Warping First proposed by Berndt and Clifford (1994). Start by constructing n m matrix D, where D i,j = d(x i, y j ) where d(x i, y i ) = x i y i or d(x i, y i ) = (x i y i ) 2. A warping path w is a contiguous set of matrix elements which defines a mapping between x and y that satisfies the following conditions: 1 Boundary conditions: w 1 = (1, 1) and w k = (m, n) where k is the length of the warping path. S. Elsworth Dynamic Time Warping 6 / 19

18 Dynamic Time Warping First proposed by Berndt and Clifford (1994). Start by constructing n m matrix D, where D i,j = d(x i, y j ) where d(x i, y i ) = x i y i or d(x i, y i ) = (x i y i ) 2. A warping path w is a contiguous set of matrix elements which defines a mapping between x and y that satisfies the following conditions: 1 Boundary conditions: w 1 = (1, 1) and w k = (m, n) where k is the length of the warping path. 2 Continuity: if w i = (a, b) then w i 1 = (a, b ) where a a 1 and b b 1. S. Elsworth Dynamic Time Warping 6 / 19

19 Dynamic Time Warping First proposed by Berndt and Clifford (1994). Start by constructing n m matrix D, where D i,j = d(x i, y j ) where d(x i, y i ) = x i y i or d(x i, y i ) = (x i y i ) 2. A warping path w is a contiguous set of matrix elements which defines a mapping between x and y that satisfies the following conditions: 1 Boundary conditions: w 1 = (1, 1) and w k = (m, n) where k is the length of the warping path. 2 Continuity: if w i = (a, b) then w i 1 = (a, b ) where a a 1 and b b 1. 3 Monotonicity: if w i = (a, b) then w i 1 = (a, b ) where a a 0 and b b 0. S. Elsworth Dynamic Time Warping 6 / 19

20 Cont. Boundary Continuity Monotonicity We want d DTW (x, y) = min k i=1 D(w i). S. Elsworth Dynamic Time Warping 7 / 19

21 Cont. Boundary Continuity Monotonicity We want d DTW (x, y) = min k i=1 D(w i). In practise we use dynamic programming to prevent construction of the whole matrix. Define γ(i, j) to be the cumulative distance then γ(i, j) = d(x i, y j ) + min{γ(i 1, j 1), γ(i 1, j), γ(i, j 1)}. S. Elsworth Dynamic Time Warping 7 / 19

22 Cont. Doesnt satisfy triangle inequality Slow to compute (O(nm) time and space complexity) S. Elsworth Dynamic Time Warping 8 / 19

23 Cont. Doesnt satisfy triangle inequality Slow to compute (O(nm) time and space complexity) S. Elsworth Dynamic Time Warping 8 / 19

24 3D graphical view of Dynamic Time Warping S. Elsworth Dynamic Time Warping 9 / 19

25 What are distance measures used for? S. Elsworth Dynamic Time Warping 10 / 19

26 What are distance measures used for? Indexing - Given time series x, find time series y in database DB such that d(x, y) is minimal. S. Elsworth Dynamic Time Warping 10 / 19

27 What are distance measures used for? Indexing - Given time series x, find time series y in database DB such that d(x, y) is minimal. Clustering - Split database DB into groups using d(, ). S. Elsworth Dynamic Time Warping 10 / 19

28 What are distance measures used for? Indexing - Given time series x, find time series y in database DB such that d(x, y) is minimal. Clustering - Split database DB into groups using d(, ). Classification - Given clusters DB 1,, DB k, which cluster does timeseries x belong to? S. Elsworth Dynamic Time Warping 10 / 19

29 What are distance measures used for? Indexing - Given time series x, find time series y in database DB such that d(x, y) is minimal. Clustering - Split database DB into groups using d(, ). Classification - Given clusters DB 1,, DB k, which cluster does timeseries x belong to? Query search - Given large time series x and small query time series q and distance d, find subsequences of x which are less than d away from q. S. Elsworth Dynamic Time Warping 10 / 19

30 How do we speed it up? S. Elsworth Dynamic Time Warping 11 / 19

31 How do we speed it up? 1 Global Constraints S. Elsworth Dynamic Time Warping 11 / 19

32 How do we speed it up? 1 Global Constraints 2 Lower bounds S. Elsworth Dynamic Time Warping 11 / 19

33 How do we speed it up? 1 Global Constraints 2 Lower bounds 3 FASTDTW S. Elsworth Dynamic Time Warping 11 / 19

34 Global Constraints Figure: Sakoe-Chimba band S. Elsworth Dynamic Time Warping 12 / 19

35 Global Constraints Figure: Sakoe-Chimba band S. Elsworth Figure: Itakura parallelogram Dynamic Time Warping 12 / 19

36 Lower Bounds Many lower bounds have been developed. These are cheaper to compute but prone off unnecessary DTW distance measures. Two most popular are LB KIM and LB KEOGH. LB kimfl is the simplest and of O(1). It consists of finding the Eulidean distance of the first and last point in time series. LB kim is of O(n) and consists of distance between maximum and minimum values of time series. S. Elsworth Dynamic Time Warping 13 / 19

37 Tightness of bound Lower Bounds Many lower bounds have been developed. These are cheaper to compute but prone off unnecessary DTW distance measures. Two most popular are LB KIM and LB KEOGH. LB kimfl is the simplest and of O(1). It consists of finding the Eulidean distance of the first and last point in time series. LB kim is of O(n) and consists of distance between maximum and minimum values of time series. 1 max(lb KeoghEQ,LB KeoghEC )! A LB KeoghEQ A LB KimFL A LB LB Yi Kim 0 O(1) O(n) Order of bound S. Elsworth Dynamic Time Warping 13 / 19

38 FASTDTW FASTDTW was developed by Salvador and Chan. It achieves linear time and space complexity. Consists of three stages: Coarsening, projection, refinement. S. Elsworth Dynamic Time Warping 14 / 19

39 FASTDTW FASTDTW was developed by Salvador and Chan. It achieves linear time and space complexity. Consists of three stages: Coarsening, projection, refinement. Figure: 1/4 Figure: 1/2 Figure: 1/1 S. Elsworth Dynamic Time Warping 14 / 19

40 FASTDTW FASTDTW was developed by Salvador and Chan. It achieves linear time and space complexity. Consists of three stages: Coarsening, projection, refinement. Figure: 1/4 Figure: 1/2 Figure: 1/1 Doesnt always find optimal solution. S. Elsworth Dynamic Time Warping 14 / 19

41 Query search example Consider the following time series. 8 Time Series #10 7 This is an electrocardiogram of an 84-year old female. Recording is 22 hours and 23 minutes long containing 110,087 beats. End cropped for visual purposes. S. Elsworth Dynamic Time Warping 15 / 19

42 Time series: x 250 samples per second giving a total length of 20,145,000. Here is a small 4s clippet (1000 data points). 0.4 Time Series clippet S. Elsworth Dynamic Time Warping 16 / 19

43 Query: q Premature venticular contraction. These are premature heartbeats originating from the ventricles of the heart, possibly due to high blood pressure or heart disease. Query containing 421 data points, about 1.7 seconds. 4 Query S. Elsworth Dynamic Time Warping 17 / 19

44 Results Find the following two closest matches. 4 2 Query search Query 1st Match 2nd Match S. Elsworth Dynamic Time Warping 18 / 19

45 References DTW : Berndt and Clifford (1994) Sakoe-Chimba band: Sakoe and Chimba (1978) Itakura parallelogram: Itakura (1975) Bounds and UCR suite: Rakthanmanon et al. (2012) FastDTW: Salvador and Chan (2007) Last worked example: S. Elsworth Dynamic Time Warping 19 / 19

Time-Series Analysis Prediction Similarity between Time-series Symbolic Approximation SAX References. Time-Series Streams

Time-Series Analysis Prediction Similarity between Time-series Symbolic Approximation SAX References. Time-Series Streams Time-Series Streams João Gama LIAAD-INESC Porto, University of Porto, Portugal jgama@fep.up.pt 1 Time-Series Analysis 2 Prediction Filters Neural Nets 3 Similarity between Time-series Euclidean Distance

More information

Anticipatory DTW for Efficient Similarity Search in Time Series Databases

Anticipatory DTW for Efficient Similarity Search in Time Series Databases Anticipatory DTW for Efficient Similarity Search in Time Series Databases Ira Assent Marc Wichterich Ralph Krieger Hardy Kremer Thomas Seidl RWTH Aachen University, Germany Aalborg University, Denmark

More information

Time Series Classification

Time Series Classification Distance Measures Classifiers DTW vs. ED Further Work Questions August 31, 2017 Distance Measures Classifiers DTW vs. ED Further Work Questions Outline 1 2 Distance Measures 3 Classifiers 4 DTW vs. ED

More information

Efficient search of the best warping window for Dynamic Time Warping

Efficient search of the best warping window for Dynamic Time Warping Efficient search of the best warping window for Dynamic Time Warping Chang Wei Tan 1 Matthieu Herrmann 1 Germain Geoffrey I. Forestier 2,1 Webb 1 François Petitjean 1 1 Faculty of IT, Monash University,

More information

University of Florida CISE department Gator Engineering. Clustering Part 1

University of Florida CISE department Gator Engineering. Clustering Part 1 Clustering Part 1 Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville What is Cluster Analysis? Finding groups of objects such that the objects

More information

GLR-Entropy Model for ECG Arrhythmia Detection

GLR-Entropy Model for ECG Arrhythmia Detection , pp.363-370 http://dx.doi.org/0.4257/ijca.204.7.2.32 GLR-Entropy Model for ECG Arrhythmia Detection M. Ali Majidi and H. SadoghiYazdi,2 Department of Computer Engineering, Ferdowsi University of Mashhad,

More information

On-line Dynamic Time Warping for Streaming Time Series

On-line Dynamic Time Warping for Streaming Time Series On-line Dynamic Time Warping for Streaming Time Series Izaskun Oregi 1, Aritz Pérez 2, Javier Del Ser 1,2,3, and José A. Lozano 2,4 1 TECNALIA, 48160 Derio, Spain {izaskun.oregui,javier.delser}@tecnalia.com

More information

Extracting Optimal Performance from Dynamic Time Warping

Extracting Optimal Performance from Dynamic Time Warping Extracting Optimal Performance from Dynamic Time Warping Abdullah Mueen mueen@unm.edu Eamonn Keogh eamonn@cs.ucr.edu While you are waiting: Please download the slides at: www.cs.unm.edu/~mueen/dtw1.pdf

More information

MANAGING UNCERTAINTY IN SPATIO-TEMPORAL SERIES

MANAGING UNCERTAINTY IN SPATIO-TEMPORAL SERIES MANAGING UNCERTAINTY IN SPATIO-TEMPORAL SERIES Yania Molina Souto, Ana Maria de C. Moura, Fabio Porto Laboratório de Computação Científica LNCC DEXL Lab Petrópolis RJ Brasil yaniams@lncc.br, anamoura@lncc.br,

More information

Data Preprocessing. Cluster Similarity

Data Preprocessing. Cluster Similarity 1 Cluster Similarity Similarity is most often measured with the help of a distance function. The smaller the distance, the more similar the data objects (points). A function d: M M R is a distance on M

More information

Vector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis.

Vector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis. Vector spaces DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_fall17/index.html Carlos Fernandez-Granda Vector space Consists of: A set V A scalar

More information

SBASS: Segment based approach for subsequence searches in sequence databases

SBASS: Segment based approach for subsequence searches in sequence databases Comput Syst Sci & Eng (2007) 1-2: 37-46 2007 CRL Publishing Ltd International Journal of Computer Systems Science & Engineering SBASS: Segment based approach for subsequence searches in sequence databases

More information

Option 1: Landmark Registration We can try to align specific points. The Registration Problem. Landmark registration. Time-Warping Functions

Option 1: Landmark Registration We can try to align specific points. The Registration Problem. Landmark registration. Time-Warping Functions The Registration Problem Most analyzes only account for variation in amplitude. Frequently, observed data exhibit features that vary in time. Option 1: Landmark Registration We can try to align specific

More information

The University of Texas at Austin Department of Electrical and Computer Engineering. EE381V: Large Scale Learning Spring 2013.

The University of Texas at Austin Department of Electrical and Computer Engineering. EE381V: Large Scale Learning Spring 2013. The University of Texas at Austin Department of Electrical and Computer Engineering EE381V: Large Scale Learning Spring 2013 Assignment 1 Caramanis/Sanghavi Due: Thursday, Feb. 7, 2013. (Problems 1 and

More information

Improving time-series rule matching performance for detecting energy consumption patterns

Improving time-series rule matching performance for detecting energy consumption patterns Improving time-series rule matching performance for detecting energy consumption patterns Maël Guilleme, Laurence Rozé, Véronique Masson, Cérès Carton, René Quiniou, Alexandre Termier To cite this version:

More information

Università di Pisa A.A Data Mining II June 13th, < {A} {B,F} {E} {A,B} {A,C,D} {F} {B,E} {C,D} > t=0 t=1 t=2 t=3 t=4 t=5 t=6 t=7

Università di Pisa A.A Data Mining II June 13th, < {A} {B,F} {E} {A,B} {A,C,D} {F} {B,E} {C,D} > t=0 t=1 t=2 t=3 t=4 t=5 t=6 t=7 Università di Pisa A.A. 2016-2017 Data Mining II June 13th, 2017 Exercise 1 - Sequential patterns (6 points) a) (3 points) Given the following input sequence < {A} {B,F} {E} {A,B} {A,C,D} {F} {B,E} {C,D}

More information

Recent advances in Time Series Classification

Recent advances in Time Series Classification Distance Shapelet BoW Kernels CCL Recent advances in Time Series Classification Simon Malinowski, LinkMedia Research Team Classification day #3 S. Malinowski Time Series Classification 21/06/17 1 / 55

More information

Data Mining 4. Cluster Analysis

Data Mining 4. Cluster Analysis Data Mining 4. Cluster Analysis 4.2 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Data Structures Interval-Valued (Numeric) Variables Binary Variables Categorical Variables Ordinal Variables Variables

More information

Introduction to Time Series Mining. Slides edited from Keogh Eamonn s tutorial:

Introduction to Time Series Mining. Slides edited from Keogh Eamonn s tutorial: Introduction to Time Series Mining Slides edited from Keogh Eamonn s tutorial: 25.1750 25.2250 25.2500 25.2500 25.2750 25.3250 25.3500 25.3500 25.4000 25.4000 25.3250 25.2250 25.2000 25.1750.... 24.6250

More information

Uncertain Time-Series Similarity: Return to the Basics

Uncertain Time-Series Similarity: Return to the Basics Uncertain Time-Series Similarity: Return to the Basics Dallachiesa et al., VLDB 2012 Li Xiong, CS730 Problem Problem: uncertain time-series similarity Applications: location tracking of moving objects;

More information

Learning Time-Series Shapelets

Learning Time-Series Shapelets Learning Time-Series Shapelets Josif Grabocka, Nicolas Schilling, Martin Wistuba and Lars Schmidt-Thieme Information Systems and Machine Learning Lab (ISMLL) University of Hildesheim, Germany SIGKDD 14,

More information

Math 110 Test # 1. The set of real numbers in both of the intervals [0, 2) and ( 1, 0] is equal to. Question 1. (F) [ 1, 2) (G) (2, ) (H) [ 1, 2]

Math 110 Test # 1. The set of real numbers in both of the intervals [0, 2) and ( 1, 0] is equal to. Question 1. (F) [ 1, 2) (G) (2, ) (H) [ 1, 2] Friday July 8, 00 Jacek Szmigielski Math 0 Test # Fill in the bubbles that correspond to the correct answers. No aids: no calculators, closed book. You are not permitted to consult with your fellow students

More information

Chapter 3. Measuring data

Chapter 3. Measuring data Chapter 3 Measuring data 1 Measuring data versus presenting data We present data to help us draw meaning from it But pictures of data are subjective They re also not susceptible to rigorous inference Measuring

More information

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. Statistics is a field of study concerned with the data collection,

More information

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES CS6220: DATA MINING TECHNIQUES Matrix Data: Clustering: Part 2 Instructor: Yizhou Sun yzsun@ccs.neu.edu November 3, 2015 Methods to Learn Matrix Data Text Data Set Data Sequence Data Time Series Graph

More information

HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence

HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence Eamonn Keogh *Jessica Lin Ada Fu University of California, Riverside George Mason Univ Chinese Univ of Hong Kong The 5 th IEEE International

More information

Michael Lechner Causal Analysis RDD 2014 page 1. Lecture 7. The Regression Discontinuity Design. RDD fuzzy and sharp

Michael Lechner Causal Analysis RDD 2014 page 1. Lecture 7. The Regression Discontinuity Design. RDD fuzzy and sharp page 1 Lecture 7 The Regression Discontinuity Design fuzzy and sharp page 2 Regression Discontinuity Design () Introduction (1) The design is a quasi-experimental design with the defining characteristic

More information

Data Mining: Data. Lecture Notes for Chapter 2. Introduction to Data Mining

Data Mining: Data. Lecture Notes for Chapter 2. Introduction to Data Mining Data Mining: Data Lecture Notes for Chapter 2 Introduction to Data Mining by Tan, Steinbach, Kumar Similarity and Dissimilarity Similarity Numerical measure of how alike two data objects are. Is higher

More information

Cluster Analysis (Sect. 9.6/Chap. 14 of Wilks) Notes by Hong Li

Cluster Analysis (Sect. 9.6/Chap. 14 of Wilks) Notes by Hong Li 77 Cluster Analysis (Sect. 9.6/Chap. 14 of Wilks) Notes by Hong Li 1) Introduction Cluster analysis deals with separating data into groups whose identities are not known in advance. In general, even the

More information

Chapter 17 Current and Resistance

Chapter 17 Current and Resistance Chapter 17 Current and Resistance Current Practical applications were based on static electricity. A steady source of electric current allowed scientists to learn how to control the flow of electric charges

More information

Generalized Gradient Learning on Time Series under Elastic Transformations

Generalized Gradient Learning on Time Series under Elastic Transformations Generalized Gradient Learning on Time Series under Elastic Transformations Brijnesh J. Jain Technische Universität Berlin, Germany e-mail: brijnesh.jain@gmail.com arxiv:1502.04843v2 [cs.lg] 9 Jun 2015

More information

Nearest-Neighbor Searching Under Uncertainty

Nearest-Neighbor Searching Under Uncertainty Nearest-Neighbor Searching Under Uncertainty Wuzhou Zhang Joint work with Pankaj K. Agarwal, Alon Efrat, and Swaminathan Sankararaman. To appear in PODS 2012. Nearest-Neighbor Searching S: a set of n points

More information

Temporal and Frequential Metric Learning for Time Series knn Classication

Temporal and Frequential Metric Learning for Time Series knn Classication Proceedings 1st International Workshop on Advanced Analytics and Learning on Temporal Data AALTD 2015 Temporal and Frequential Metric Learning for Time Series knn Classication Cao-Tri Do 123, Ahlame Douzal-Chouakria

More information

10.1 Vectors. c Kun Wang. Math 150, Fall 2017

10.1 Vectors. c Kun Wang. Math 150, Fall 2017 10.1 Vectors Definition. A vector is a quantity that has both magnitude and direction. A vector is often represented graphically as an arrow where the direction is the direction of the arrow, and the magnitude

More information

The Normal Distribution. Chapter 6

The Normal Distribution. Chapter 6 + The Normal Distribution Chapter 6 + Applications of the Normal Distribution Section 6-2 + The Standard Normal Distribution and Practical Applications! We can convert any variable that in normally distributed

More information

Probabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016

Probabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016 Probabilistic classification CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2016 Topics Probabilistic approach Bayes decision theory Generative models Gaussian Bayes classifier

More information

Retrieval by Content. Part 2: Text Retrieval Term Frequency and Inverse Document Frequency. Srihari: CSE 626 1

Retrieval by Content. Part 2: Text Retrieval Term Frequency and Inverse Document Frequency. Srihari: CSE 626 1 Retrieval by Content Part 2: Text Retrieval Term Frequency and Inverse Document Frequency Srihari: CSE 626 1 Text Retrieval Retrieval of text-based information is referred to as Information Retrieval (IR)

More information

Neural Networks Lecture 2:Single Layer Classifiers

Neural Networks Lecture 2:Single Layer Classifiers Neural Networks Lecture 2:Single Layer Classifiers H.A Talebi Farzaneh Abdollahi Department of Electrical Engineering Amirkabir University of Technology Winter 2011. A. Talebi, Farzaneh Abdollahi Neural

More information

Lecture Notes 1: Vector spaces

Lecture Notes 1: Vector spaces Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector

More information

4. Numerical Quadrature. Where analytical abilities end... continued

4. Numerical Quadrature. Where analytical abilities end... continued 4. Numerical Quadrature Where analytical abilities end... continued Where analytical abilities end... continued, November 30, 22 1 4.3. Extrapolation Increasing the Order Using Linear Combinations Once

More information

Searching Dimension Incomplete Databases

Searching Dimension Incomplete Databases IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 6, NO., JANUARY 3 Searching Dimension Incomplete Databases Wei Cheng, Xiaoming Jin, Jian-Tao Sun, Xuemin Lin, Xiang Zhang, and Wei Wang Abstract

More information

CSE446: non-parametric methods Spring 2017

CSE446: non-parametric methods Spring 2017 CSE446: non-parametric methods Spring 2017 Ali Farhadi Slides adapted from Carlos Guestrin and Luke Zettlemoyer Linear Regression: What can go wrong? What do we do if the bias is too strong? Might want

More information

Chapter 2: Linear Programming Basics. (Bertsimas & Tsitsiklis, Chapter 1)

Chapter 2: Linear Programming Basics. (Bertsimas & Tsitsiklis, Chapter 1) Chapter 2: Linear Programming Basics (Bertsimas & Tsitsiklis, Chapter 1) 33 Example of a Linear Program Remarks. minimize 2x 1 x 2 + 4x 3 subject to x 1 + x 2 + x 4 2 3x 2 x 3 = 5 x 3 + x 4 3 x 1 0 x 3

More information

Research Article Dynamic Time Warping Distance Method for Similarity Test of Multipoint Ground Motion Field

Research Article Dynamic Time Warping Distance Method for Similarity Test of Multipoint Ground Motion Field Hindawi Publishing Corporation Mathematical Problems in Engineering Volume 2, Article ID 74957, 2 pages doi:.55/2/74957 Research Article Dynamic Time Warping Distance Method for Similarity Test of Multipoint

More information

Lecture 3. Econ August 12

Lecture 3. Econ August 12 Lecture 3 Econ 2001 2015 August 12 Lecture 3 Outline 1 Metric and Metric Spaces 2 Norm and Normed Spaces 3 Sequences and Subsequences 4 Convergence 5 Monotone and Bounded Sequences Announcements: - Friday

More information

Active and Semi-supervised Kernel Classification

Active and Semi-supervised Kernel Classification Active and Semi-supervised Kernel Classification Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College London Work done in collaboration with Xiaojin Zhu (CMU), John Lafferty (CMU),

More information

Lower Bounds on the Graphical Complexity of Finite-Length LDPC Codes

Lower Bounds on the Graphical Complexity of Finite-Length LDPC Codes Lower Bounds on the Graphical Complexity of Finite-Length LDPC Codes Igal Sason Department of Electrical Engineering Technion - Israel Institute of Technology Haifa 32000, Israel 2009 IEEE International

More information

Supplementary Information

Supplementary Information Supplementary Information For the article"comparable system-level organization of Archaea and ukaryotes" by J. Podani, Z. N. Oltvai, H. Jeong, B. Tombor, A.-L. Barabási, and. Szathmáry (reference numbers

More information

Grade 6 Curriculum Map Key: Math in Focus Course 1 (MIF)

Grade 6 Curriculum Map Key: Math in Focus Course 1 (MIF) TIME FRAME September (10 days) UNIT/CONCEPTS Course 1A Content Chapter 1: Positive Numbers and the Number Line CORE GOALS & SKILLS Big Idea: Whole numbers, fractions, and decimals are numbers that can

More information

Year 10 Higher Easter 2017 Revision

Year 10 Higher Easter 2017 Revision Year 10 Higher Easter 2017 Revision 1. A show is on Friday, Saturday and Sunday. 2 5 of the total tickets sold are for the Saturday show. 240 tickets are sold for the Saturday show. Three times as many

More information

Clustering by Mixture Models. General background on clustering Example method: k-means Mixture model based clustering Model estimation

Clustering by Mixture Models. General background on clustering Example method: k-means Mixture model based clustering Model estimation Clustering by Mixture Models General bacground on clustering Example method: -means Mixture model based clustering Model estimation 1 Clustering A basic tool in data mining/pattern recognition: Divide

More information

Chapter 17. Current and Resistance

Chapter 17. Current and Resistance Chapter 17 Current and Resistance Electric Current The current is the rate at which the charge flows through a surface Look at the charges flowing perpendicularly through a surface of area A I av The SI

More information

Absolute Value Inequalities 2.5, Functions 3.6

Absolute Value Inequalities 2.5, Functions 3.6 Absolute Value Inequalities 2.5, Functions 3.6 Fall 2013 - Math 1010 (Math 1010) M 1010 3.6 1 / 17 Roadmap 2.5 - Review of absolute value inequalities. 3.6 - Functions: Relations, Functions 3.6 - Evaluating

More information

Support Vector Machines

Support Vector Machines Support Vector Machines Mathematically Sophisticated Classification Todd Wilson Statistical Learning Group Department of Statistics North Carolina State University September 27, 2016 1 / 29 Support Vector

More information

CSE 546 Final Exam, Autumn 2013

CSE 546 Final Exam, Autumn 2013 CSE 546 Final Exam, Autumn 0. Personal info: Name: Student ID: E-mail address:. There should be 5 numbered pages in this exam (including this cover sheet).. You can use any material you brought: any book,

More information

The problem of classification in Content-Based Image Retrieval systems

The problem of classification in Content-Based Image Retrieval systems The problem of classification in Content-Based Image Retrieval systems Tatiana Jaworska Systems Research Institute, Polish Academy of Sciences Warsaw, Poland Presentation plan Overview of the main idea

More information

BREVET BLANC N 1 JANUARY 2012

BREVET BLANC N 1 JANUARY 2012 Exercise. (5 pts) duration: h Presentation and explanations (4 points) Numerical Activities Consider the figure opposite, which is made up of two squares.. a) Calculate the area A of the white part. b)

More information

MAC Learning Objectives. Learning Objectives (Cont.) Module 10 System of Equations and Inequalities II

MAC Learning Objectives. Learning Objectives (Cont.) Module 10 System of Equations and Inequalities II MAC 1140 Module 10 System of Equations and Inequalities II Learning Objectives Upon completing this module, you should be able to 1. represent systems of linear equations with matrices. 2. transform a

More information

STATISTICAL PERFORMANCE

STATISTICAL PERFORMANCE STATISTICAL PERFORMANCE PROVISIONING AND ENERGY EFFICIENCY IN DISTRIBUTED COMPUTING SYSTEMS Nikzad Babaii Rizvandi 1 Supervisors: Prof.Albert Y.Zomaya Prof. Aruna Seneviratne OUTLINE Introduction Background

More information

Outline. Background on time series mining Similarity Measures Normalization

Outline. Background on time series mining Similarity Measures Normalization Reconvene In the first half, we have seen many wonderful things you can do with the Matrix Profile, without explaining how to compute it! In this half, we will describe algorithms to compute matrix profile,

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 218 Outlines Overview Introduction Linear Algebra Probability Linear Regression 1

More information

Time Series. Duen Horng (Polo) Chau Assistant Professor Associate Director, MS Analytics Georgia Tech. Mining and Forecasting

Time Series. Duen Horng (Polo) Chau Assistant Professor Associate Director, MS Analytics Georgia Tech. Mining and Forecasting http://poloclub.gatech.edu/cse6242 CSE6242 / CX4242: Data & Visual Analytics Time Series Mining and Forecasting Duen Horng (Polo) Chau Assistant Professor Associate Director, MS Analytics Georgia Tech

More information

Dynamic Time Warping and FFT: A Data Preprocessing Method for Electrical Load Forecasting

Dynamic Time Warping and FFT: A Data Preprocessing Method for Electrical Load Forecasting Vol. 9, No., 18 Dynamic Time Warping and FFT: A Data Preprocessing Method for Electrical Load Forecasting Juan Huo School of Electrical and Automation Engineering Zhengzhou University, Henan Province,

More information

Computational Intelligence Lecture 3: Simple Neural Networks for Pattern Classification

Computational Intelligence Lecture 3: Simple Neural Networks for Pattern Classification Computational Intelligence Lecture 3: Simple Neural Networks for Pattern Classification Farzaneh Abdollahi Department of Electrical Engineering Amirkabir University of Technology Fall 2011 arzaneh Abdollahi

More information

A FINE-GRAINED APPROACH TO

A FINE-GRAINED APPROACH TO A FINE-GRAINED APPROACH TO ALGORITHMS AND COMPLEXITY Virginia Vassilevska Williams MIT ICDT 2018 THE CENTRAL QUESTION OF ALGORITHMS RESEARCH ``How fast can we solve fundamental problems, in the worst case?

More information

Time: 1 hour 30 minutes

Time: 1 hour 30 minutes Paper Reference(s) 6684/0 Edexcel GCE Statistics S Silver Level S Time: hour 30 minutes Materials required for examination papers Mathematical Formulae (Green) Items included with question Nil Candidates

More information

Decision Trees (Cont.)

Decision Trees (Cont.) Decision Trees (Cont.) R&N Chapter 18.2,18.3 Side example with discrete (categorical) attributes: Predicting age (3 values: less than 30, 30-45, more than 45 yrs old) from census data. Attributes (split

More information

Definitions and Properties of R N

Definitions and Properties of R N Definitions and Properties of R N R N as a set As a set R n is simply the set of all ordered n-tuples (x 1,, x N ), called vectors. We usually denote the vector (x 1,, x N ), (y 1,, y N ), by x, y, or

More information

Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays

Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays Histograms: Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays Sep 9 1:13 PM Shape: Skewed left Bell shaped Symmetric Bi modal Symmetric Skewed

More information

Max-Sum Diversification, Monotone Submodular Functions and Semi-metric Spaces

Max-Sum Diversification, Monotone Submodular Functions and Semi-metric Spaces Max-Sum Diversification, Monotone Submodular Functions and Semi-metric Spaces Sepehr Abbasi Zadeh and Mehrdad Ghadiri {sabbasizadeh, ghadiri}@ce.sharif.edu School of Computer Engineering, Sharif University

More information

Machine Learning on temporal data

Machine Learning on temporal data Machine Learning on temporal data Learning Dissimilarities on Time Series Ahlame Douzal (Ahlame.Douzal@imag.fr) AMA, LIG, Université Joseph Fourier Master 2R - MOSIG (2011) Plan Time series structure and

More information

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES CS6220: DATA MINING TECHNIQUES Mining Graph/Network Data Instructor: Yizhou Sun yzsun@ccs.neu.edu November 16, 2015 Methods to Learn Classification Clustering Frequent Pattern Mining Matrix Data Decision

More information

SYDE 372 Introduction to Pattern Recognition. Probability Measures for Classification: Part I

SYDE 372 Introduction to Pattern Recognition. Probability Measures for Classification: Part I SYDE 372 Introduction to Pattern Recognition Probability Measures for Classification: Part I Alexander Wong Department of Systems Design Engineering University of Waterloo Outline 1 2 3 4 Why use probability

More information

Modeling Periodic Phenomenon with Sinusoidal Functions Lesson 33

Modeling Periodic Phenomenon with Sinusoidal Functions Lesson 33 (A) Lesson Objectives a. Work with the equation f x A Bx C D sin in the context of word problems using the TI-84 and graphic approaches to answering application questions. b. Use the graphing calculator

More information

Bayesian Decision Theory

Bayesian Decision Theory Bayesian Decision Theory Dr. Shuang LIANG School of Software Engineering TongJi University Fall, 2012 Today s Topics Bayesian Decision Theory Bayesian classification for normal distributions Error Probabilities

More information

Math 117: Infinite Sequences

Math 117: Infinite Sequences Math 7: Infinite Sequences John Douglas Moore November, 008 The three main theorems in the theory of infinite sequences are the Monotone Convergence Theorem, the Cauchy Sequence Theorem and the Subsequence

More information

Principles of Pattern Recognition. C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata

Principles of Pattern Recognition. C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata Principles of Pattern Recognition C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata e-mail: murthy@isical.ac.in Pattern Recognition Measurement Space > Feature Space >Decision

More information

Number of weeks in n days. Age in human years of your dog when she has lived n years. Your actual height h when you are wearing 2 inch shoes.

Number of weeks in n days. Age in human years of your dog when she has lived n years. Your actual height h when you are wearing 2 inch shoes. The number of inches in n feet. Age in human years of your dog when she has lived n years Number of weeks in n days Your actual height h when you are wearing 2 inch shoes. 12n 7n! $ %&! $ h - 2 The product

More information

f (x) f (a) f (a) = lim x a f (a) x a

f (x) f (a) f (a) = lim x a f (a) x a Differentiability Revisited Recall that the function f is differentiable at a if exists and is finite. f (a) = lim x a f (x) f (a) x a Another way to say this is that the function f (x) f (a) F a (x) =

More information

Econ Lecture 3. Outline. 1. Metric Spaces and Normed Spaces 2. Convergence of Sequences in Metric Spaces 3. Sequences in R and R n

Econ Lecture 3. Outline. 1. Metric Spaces and Normed Spaces 2. Convergence of Sequences in Metric Spaces 3. Sequences in R and R n Econ 204 2011 Lecture 3 Outline 1. Metric Spaces and Normed Spaces 2. Convergence of Sequences in Metric Spaces 3. Sequences in R and R n 1 Metric Spaces and Metrics Generalize distance and length notions

More information

a) 3 cm b) 3 cm c) cm d) cm

a) 3 cm b) 3 cm c) cm d) cm (1) Choose the correct answer: 1) =. a) b) ] - [ c) ] - ] d) ] [ 2) The opposite figure represents the interval. a) [-3, 5 ] b) ] -3, 5 [ c) [ -3, 5 [ d) ] -3, 5 ] -3 5 3) If the volume of the sphere is

More information

Lecture 5. Ch. 5, Norms for vectors and matrices. Norms for vectors and matrices Why?

Lecture 5. Ch. 5, Norms for vectors and matrices. Norms for vectors and matrices Why? KTH ROYAL INSTITUTE OF TECHNOLOGY Norms for vectors and matrices Why? Lecture 5 Ch. 5, Norms for vectors and matrices Emil Björnson/Magnus Jansson/Mats Bengtsson April 27, 2016 Problem: Measure size of

More information

Lecture 10: Dimension Reduction Techniques

Lecture 10: Dimension Reduction Techniques Lecture 10: Dimension Reduction Techniques Radu Balan Department of Mathematics, AMSC, CSCAMM and NWC University of Maryland, College Park, MD April 17, 2018 Input Data It is assumed that there is a set

More information

Knowledge Discovery in Databases II Summer Term 2017

Knowledge Discovery in Databases II Summer Term 2017 Ludwig-Maximilians-Universität München Institut für Informatik Lehr- und Forschungseinheit für Datenbanksysteme Knowledge Discovery in Databases II Summer Term 2017 Lecture 3: Sequence Data Lectures :

More information

Nonlinear Support Vector Machines through Iterative Majorization and I-Splines

Nonlinear Support Vector Machines through Iterative Majorization and I-Splines Nonlinear Support Vector Machines through Iterative Majorization and I-Splines P.J.F. Groenen G. Nalbantov J.C. Bioch July 9, 26 Econometric Institute Report EI 26-25 Abstract To minimize the primal support

More information

ICML - Kernels & RKHS Workshop. Distances and Kernels for Structured Objects

ICML - Kernels & RKHS Workshop. Distances and Kernels for Structured Objects ICML - Kernels & RKHS Workshop Distances and Kernels for Structured Objects Marco Cuturi - Kyoto University Kernel & RKHS Workshop 1 Outline Distances and Positive Definite Kernels are crucial ingredients

More information

Transforming Hierarchical Trees on Metric Spaces

Transforming Hierarchical Trees on Metric Spaces CCCG 016, Vancouver, British Columbia, August 3 5, 016 Transforming Hierarchical Trees on Metric Spaces Mahmoodreza Jahanseir Donald R. Sheehy Abstract We show how a simple hierarchical tree called a cover

More information

THE CRYSTAL BALL FORECAST CHART

THE CRYSTAL BALL FORECAST CHART One-Minute Spotlight THE CRYSTAL BALL FORECAST CHART Once you have run a simulation with Oracle s Crystal Ball, you can view several charts to help you visualize, understand, and communicate the simulation

More information

Lecture 9: March 26, 2014

Lecture 9: March 26, 2014 COMS 6998-3: Sub-Linear Algorithms in Learning and Testing Lecturer: Rocco Servedio Lecture 9: March 26, 204 Spring 204 Scriber: Keith Nichols Overview. Last Time Finished analysis of O ( n ɛ ) -query

More information

Measures of Morphological Complexity of Gray Matter on Magnetic Resonance Imaging for Control Age Grouping

Measures of Morphological Complexity of Gray Matter on Magnetic Resonance Imaging for Control Age Grouping Measures of Morphological Complexity of Gray Matter on Magnetic Resonance Imaging for Control Age Grouping Tuan Pham, Taishi Abe, Ryuichi Oka and Yung-Fu Chen Linköping University Post Print N.B.: When

More information

A Comparison of HRV Techniques: The Lomb Periodogram versus The Smoothed Pseudo Wigner-Ville Distribution

A Comparison of HRV Techniques: The Lomb Periodogram versus The Smoothed Pseudo Wigner-Ville Distribution A Comparison of HRV Techniques: The Lomb Periodogram versus The Smoothed Pseudo Wigner-Ville Distribution By: Mark Ebden Submitted to: Prof. Lionel Tarassenko Date: 19 November, 2002 (Revised 20 November)

More information

An invariance result for Hammersley s process with sources and sinks

An invariance result for Hammersley s process with sources and sinks An invariance result for Hammersley s process with sources and sinks Piet Groeneboom Delft University of Technology, Vrije Universiteit, Amsterdam, and University of Washington, Seattle March 31, 26 Abstract

More information

Data Mining Stat 588

Data Mining Stat 588 Data Mining Stat 588 Lecture 9: Basis Expansions Department of Statistics & Biostatistics Rutgers University Nov 01, 2011 Regression and Classification Linear Regression. E(Y X) = f(x) We want to learn

More information

Consider teaching Geometry throughout the school year. Geometry Friday is a suggestion. See a member of the pacing committee for information.

Consider teaching Geometry throughout the school year. Geometry Friday is a suggestion. See a member of the pacing committee for information. Overview Content Standard Domains and Clusters Ratios and roportional Relationships [R] Understand ratio concepts and use ratio reasoning to solve problems. The Number System [NS] Apply and extend previous

More information

Unsupervised Learning

Unsupervised Learning 2018 EE448, Big Data Mining, Lecture 7 Unsupervised Learning Weinan Zhang Shanghai Jiao Tong University http://wnzhang.net http://wnzhang.net/teaching/ee448/index.html ML Problem Setting First build and

More information

Decision Trees. Machine Learning CSEP546 Carlos Guestrin University of Washington. February 3, 2014

Decision Trees. Machine Learning CSEP546 Carlos Guestrin University of Washington. February 3, 2014 Decision Trees Machine Learning CSEP546 Carlos Guestrin University of Washington February 3, 2014 17 Linear separability n A dataset is linearly separable iff there exists a separating hyperplane: Exists

More information

Machine Learning Linear Classification. Prof. Matteo Matteucci

Machine Learning Linear Classification. Prof. Matteo Matteucci Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)

More information

Write down all the 3-digit positive integers that are squares and whose individual digits are non-zero squares. Answer: Relay

Write down all the 3-digit positive integers that are squares and whose individual digits are non-zero squares. Answer: Relay A Write down all the 3-digit positive integers that are squares and whose individual digits are non-zero squares. A2 What is the sum of all positive fractions that are less than and have denominator 73?

More information

Probabilistic Similarity Search for Uncertain Time Series

Probabilistic Similarity Search for Uncertain Time Series Probabilistic Similarity Search for Uncertain Time Series Johannes Aßfalg, Hans-Peter Kriegel, Peer Kröger, Matthias Renz Ludwig-Maximilians-Universität München, Oettingenstr. 67, 80538 Munich, Germany

More information

Day-ahead time series forecasting: application to capacity planning

Day-ahead time series forecasting: application to capacity planning Day-ahead time series forecasting: application to capacity planning Colin Leverger 1,2, Vincent Lemaire 1 Simon Malinowski 2, Thomas Guyet 3, Laurence Rozé 4 1 Orange Labs, 35000 Rennes France 2 Univ.

More information