CLASSIFICATION OF KINETICS OF MOVEMENT FOR LOWER LIMB USING COVARIATE SHIFT METHOD FOR BRAIN COMPUTER INTERFACE

2014 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP)

A. Hassan 1, I. Niazi 2, M. Jochumsen 2, F. Riaz 1, K. Dremstrup 2

1 College of Electrical and Mechanical Engineering, National University of Sciences and Technology (NUST), Islamabad, Pakistan.
2 Department of Health Science and Technology (HST), Aalborg University, Denmark.

ABSTRACT

Detecting movement intentions from electroencephalography (EEG) signals and extracting intended kinetic information such as force and speed may have implications for rehabilitation with assistive technologies by causally linking afferent feedback from the assistive device with the cortically generated movement potentials. However, extraction and classification of kinetics from the movement intention (before task onset) on a single-trial basis have only been achieved with limited performance due to low signal-to-noise ratio and large trial-to-trial variability. The aim of this study was to investigate a covariate shift method to address the basic challenge of non-stationarity (changes from session to session and trial-to-trial variability) for decoding different levels of speed and force. We tested this method using cross-validation procedures and a linear support vector machine to classify temporal features associated with two levels of force and speed in 9 subjects. The classification accuracy obtained across different class pairs across subjects was 73.1 ± 6.8 % and 70.0 ± 3.6 % with and without the covariate shift method, respectively. The classification accuracy was significantly higher (p < 0.03) using the covariate shift method.

Index Terms: Covariate shift, movement-related cortical potentials, brain-computer interface, speed, force.

Thanks to National University of Sciences and Technology, Pakistan for funding this work. This work was supported by the Danish Agency for Science and Technology and Innovation, Denmark.

1. INTRODUCTION

In the last decade, brain-controlled devices, commonly known as brain-computer interfaces (BCIs), have emerged as an augmentative tool for rehabilitation to improve or restore the ability of movement-impaired individuals [1]. It has been reported that coincident activation of the brain from somatosensory feedback (from a single-pulse electrical stimulation) and motor imagination can promote plastic changes associated with those seen in motor relearning [2] for different patient groups, e.g. stroke. This idea was implemented as a BCI by detecting the movement intention in the movement-related cortical potential (MRCP), which then triggered electrical stimulation [3].

Instead of triggering a single-pulse electrical stimulation, the detection of the movement intention can trigger functional electrical stimulation or a robotic device that can replicate a certain movement. However, to replicate the intended movement, information about speed and force must be extracted before task onset; in this way, correct afferent feedback can be provided by the assistive device according to the movement intention. This will close the motor control loop and provide the BCI with more degrees of freedom that can be used to introduce variable training, which can maximize the retention of re-learned movements during rehabilitation of stroke patients [4].

Different levels of speed and force have previously been extracted from movement intentions, e.g. [5, 6, 7], by using the marginal distribution of the discrete wavelet transform and temporal features. These were classified with support vector machines (SVMs) with optimized Gaussian kernels. The extraction of kinetic information from a limb is impeded by the inherent problem of non-stationarity in EEG signals.
This is a major hindrance to the robustness of BCI systems that can be used for rehabilitation. The problem becomes even more challenging if fewer electrodes are used, which would be desirable for a quick EEG setup in a clinical setting. The factors contributing to the non-stationarity of EEG data are changes in the user's attention level during sessions, fatigue, and differences in electrode impedance due to electrode positioning. Hence, the EEG distributions change from one session to another and even within a single session, illustrating the non-stationarity of the data [8]. However, in standard supervised machine learning algorithms the prior probability of the inputs, p(x), is assumed to remain the same during the training and testing phases. The situation where this probability changes between training and testing is called covariate shift [9]. In this situation, an adaptive BCI system can be designed so that it adjusts itself when there is a shift in the data.

In a previous study, covariate shift adaptation was applied to a two-class problem (discrimination between a movement and an idle state) using event-related synchronization in the beta band, which appears after termination of the movement [10]. In this paper, only time-domain features from the movement intention are extracted to classify force and speed into different levels. The features are classified with an SVM classifier with a linear kernel. The aim of this study was to compare the performance of the SVMs with and without the use of the covariate shift method.

978-1-4799-2893-4/14/$31.00 ©2014 IEEE

2. METHODOLOGY

2.1. Subjects

Nine healthy subjects (1 female and 8 males, 29 ± 6 years old) participated in this study. Informed consent was obtained from all subjects before participation. The procedures were approved by the local ethical committee (N-20100067).

2.2. Experimental Protocol

Subjects were seated in a chair with their right foot fixated to a pedal to which a force transducer was attached. They were instructed to perform maximum voluntary contractions (MVCs) followed by four different tasks of real isometric dorsiflexions of the right ankle: i) 0.5 s to reach 20 % MVC (f20), ii) 0.5 s to reach 60 % MVC (f60), iii) 3 s to reach 20 % MVC (s20) and iv) 3 s to reach 60 % MVC (s60). A custom-made program called Follow Me (Knud Larsen, Aalborg University) was used to assist the subjects in performing the movements with the correct level of force and speed by providing cues. The subjects were instructed to follow a particular force trace (see Fig. 1). They received visual feedback of their performance during the experiment. Each of the four tasks was repeated 50 times in blocks, which were randomized.

[Fig. 1 panel labels: Preparation (3 s); Maintain contraction (0.5 s) at 20/60 % MVC; Rest (3.5-4.5 s).]

2.3. Signal Acquisition

EEG
Only ten channels of monopolar EEG were continuously recorded using scalp electrodes with a sampling rate of 500 Hz. The electrodes (20 mm Blue Sensor Ag/AgCl, AMBU A/S, Denmark) were placed on the scalp according to the international 10-20 system at the FP1, F3, F4, Fz, C3, C4, Cz, P3, P4 and Pz locations.
The reference and ground electrodes were placed on the right earlobe and at the nasion, respectively. The EEG was divided into epochs using a trigger that was sent from Follow Me to the EEG amplifier at the beginning of each trial (at the beginning of the preparation phase in Fig. 1). FP1 was used to record EOG activity.

Force and MVC
The force was recorded using Mr. Kick (Knud Larsen, SMI, Aalborg University) and used as input to Follow Me. The force was sampled at 2000 Hz. A total of three MVCs were recorded, with one minute of rest between contractions. The MVC with the highest value was used.

Fig. 1. The subjects had to prepare for 3 s, after which they started the execution of the movement. The execution phase lasted either 0.5 or 3 s. When they reached the desired force level (20 or 60 % MVC) they maintained the contraction for 0.5 s, followed by a rest period. (Horizontal axis: Time [s].)

Pre-processing
The EEG signal was bandpass filtered from 0.05 to 10 Hz with a 2nd order Butterworth filter and spatially filtered with a large Laplacian spatial filter.

2.4. Classification of Movements

Feature extraction
Epochs having EOG activity greater than 125 µV were rejected ( 6 per task). The single-trial movement intentions (2000 to 100 ms before the task onset) were used to extract six temporal features to predict the intended movement type. The following features were extracted: i) point of maximum negativity, ii) mean amplitude, iii) and iv) slope and intercept of a linear regression using data from the entire interval, v) and vi) slope and intercept of a linear regression using data from 500 to 100 ms before the task onset (see Fig. 2).

Covariate shift
Under the conditions of covariate shift, the prior probability p(x) changes between the training and testing data, where the data are assumed to be generated by the model p(y|x)p(x). In this situation, the training data will not accurately represent the statistics of the test data.
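To make the covariate shift condition concrete, the following sketch evaluates the ratio p_te(x)/p_tr(x) of a test and a training input density at a few training inputs. Everything in it is an illustrative assumption rather than part of the study: the two densities are hypothetical one-dimensional Gaussians with a shared standard deviation, and the function names and numerical values are chosen only for the example. Training inputs that lie where test inputs are more likely obtain a ratio above one.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def density_ratio(x, mean_tr, mean_te, std):
    """Ratio p_te(x) / p_tr(x) for two Gaussians with a shared std (assumed)."""
    return gaussian_pdf(x, mean_te, std) / gaussian_pdf(x, mean_tr, std)


# Hypothetical values: training inputs centred at 0.0, test inputs at 1.0.
MEAN_TR, MEAN_TE, STD = 0.0, 1.0, 1.0

training_inputs = [-1.0, 0.0, 0.5, 1.0, 2.0]
ratios = [density_ratio(x, MEAN_TR, MEAN_TE, STD) for x in training_inputs]

# Inputs closer to the test mean receive a larger ratio.
for x, r in zip(training_inputs, ratios):
    print(f"x = {x:+.1f}  p_te/p_tr = {r:.3f}")
```

In practice the two densities are unknown and hard to estimate in several dimensions, which is why the ratio is usually estimated directly from samples instead of from fitted densities.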
In the case of covariate shift in the data, we would like to assign more weight to those training samples that give the most information about the test data. The goal is to estimate importance weights (IWs) from training ($x_i^{tr}$) and test samples ($x_i^{te}$) such that:

$$\beta(x) = \frac{p_{te}(x)}{p_{tr}(x)} \tag{1}$$

where $\beta(x)$ are the IWs. By using these IWs to weight the training data, we push the learning algorithm towards those training samples that best represent the test data. However, estimating probability distributions for multidimensional data is always a challenge. In this paper, we propose to use Gaussian kernels to transform the input data into a high-dimensional space. The kernel mean matching (KMM) method proposed by Gretton et al. [11] is used to minimise the difference between the training and testing data distributions in the transformed domain. The objective function (J) is given by the difference of the two empirical means of the training ($n_{tr}$) and testing ($n_{te}$) data samples:

$$J(\beta) = \min_{\beta} \left\| \frac{1}{n_{tr}} \sum_{i=1}^{n_{tr}} \beta_i \Phi(x_i^{tr}) - \frac{1}{n_{te}} \sum_{i=1}^{n_{te}} \Phi(x_i^{te}) \right\|^2 \tag{2}$$

subject to $\beta_i \in [0, B]$ and $\left| \frac{1}{n_{tr}} \sum_{i=1}^{n_{tr}} \beta_i - 1 \right| \le \epsilon$, where $B > 0$ and $\epsilon > 0$ are tuning parameters and $\Phi$ is the Gaussian basis function. The objective function can be expanded into a quadratic programming problem to find suitable $\beta$ as:

$$\min_{\beta} \; \frac{1}{2} \beta^T K \beta - \kappa^T \beta \tag{3}$$

subject to $\beta_i \in [0, B]$ and $\left| \sum_{i=1}^{n_{tr}} \beta_i - n_{tr} \right| \le n_{tr} \epsilon$, where $K_{ij} = k(x_i^{tr}, x_j^{tr})$, $\kappa_i = \frac{n_{tr}}{n_{te}} \sum_{j=1}^{n_{te}} k(x_i^{tr}, x_j^{te})$ and $k$ is the Gaussian kernel mapping function. In this paper, we have used the following values for calculating the IWs: Gaussian kernel width $\sigma = 0.1$, $B = 1000$, and $\epsilon = (\sqrt{n_{tr}} - 1)/\sqrt{n_{tr}}$.

Fig. 2. Average of the four tasks for subject 1. Fast and Slow refer to 0.5 s and 3 s, respectively, to reach the desired level of MVC, where 0 is the task onset and n is the number of trials in the average. (Axes: Amplitude [µV] against Time [s]. Legend: Slow 20 % MVC (n=50), Slow 60 % MVC (n=49), Fast 20 % MVC (n=50), Fast 60 % MVC (n=50).)

Importance Weighted Support Vector Machines
After calculating the IWs, the next step is to incorporate them into a machine learning algorithm. For our problem, we have used standard SVMs as the machine learning algorithm because they are considered to offer a favourable trade-off between complexity and performance. The standard form of the soft-margin SVM is:

$$\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \alpha^T H \alpha \tag{4}$$

such that $0 \le \alpha_i \le C$ and $\sum_{i=1}^{n} \alpha_i y_i = 0$, where $\alpha_i$ are the Lagrange multipliers, $H_{i,j} = y_i y_j \langle x_i, x_j \rangle$ and $C$ is the penalty factor for misclassification for $n$ training samples. By incorporating the IWs into the SVM, the importance weighted SVM (IWSVM) becomes:

$$\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \alpha^T H \alpha \tag{5}$$

such that $0 \le \alpha_i \le \beta_i C$ and $\sum_{i=1}^{n} \alpha_i y_i = 0$. For complete mathematical and algorithmic details of IWSVMs, readers are referred to [12]. The only difference between the standard SVM and the IWSVM is the upper limit on the Lagrange multipliers ($\alpha_i$). The IWSVM adapts the decision boundary to take into account the IWs calculated from the training and testing data. This type of SVM should perform better than a standard SVM if there is a shift in the input data.

3. RESULTS

Three-fold cross-validation was used to calculate the classification accuracy of each task pair (fast 20 % MVC (f20) versus fast 60 % MVC (f60), etc.). The results are presented in Fig. 3. The best performance was obtained when the linear SVM was used in combination with the covariate shift methodology for the f60-s60 class pair (79.3 ± 8.7 %). On average across the subjects and the six task pairs, the linear SVM with the covariate shift method gave a classification accuracy of 73.1 ± 6.8 %, whereas the linear SVM without the covariate shift method gave 70.0 ± 3.6 %.
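The "with covariate shift" condition in these results depends on the quantities K and κ of Eq. (3) and on the per-sample bound of Eq. (5). As a minimal sketch under stated assumptions, the code below builds K and κ for a toy data set, checks the constraints of Eq. (3) and evaluates the quadratic objective for two candidate weight vectors. The kernel form exp(-||a - b||² / (2σ²)), the toy points, the kernel width used here and all function names are assumptions made for illustration; no quadratic programming solver is included, so the weights are compared rather than optimised.

```python
import math


def gaussian_kernel(a, b, sigma):
    """Assumed kernel form: exp(-||a - b||^2 / (2 * sigma^2))."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kmm_terms(x_tr, x_te, sigma):
    """Return K (n_tr x n_tr) and kappa (length n_tr) as defined for Eq. (3)."""
    n_tr, n_te = len(x_tr), len(x_te)
    K = [[gaussian_kernel(xi, xj, sigma) for xj in x_tr] for xi in x_tr]
    kappa = [(n_tr / n_te) * sum(gaussian_kernel(xi, xj, sigma) for xj in x_te)
             for xi in x_tr]
    return K, kappa


def kmm_objective(beta, K, kappa):
    """Value of 0.5 * beta^T K beta - kappa^T beta."""
    n = len(beta)
    quad = sum(beta[i] * K[i][j] * beta[j] for i in range(n) for j in range(n))
    lin = sum(k * b for k, b in zip(kappa, beta))
    return 0.5 * quad - lin


def default_eps(n_tr):
    """epsilon = (sqrt(n_tr) - 1) / sqrt(n_tr), the value quoted in the text."""
    return (math.sqrt(n_tr) - 1.0) / math.sqrt(n_tr)


def is_feasible(beta, B, eps):
    """Check beta_i in [0, B] and |sum(beta) - n_tr| <= n_tr * eps."""
    n_tr = len(beta)
    in_box = all(0.0 <= b <= B for b in beta)
    return in_box and abs(sum(beta) - n_tr) <= n_tr * eps


def iwsvm_upper_bounds(beta, C):
    """Per-sample upper bounds beta_i * C on the multipliers, as in Eq. (5)."""
    return [b * C for b in beta]


# Hypothetical toy data: two training points far from the test data, two close.
x_tr = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 0.9]]
x_te = [[1.0, 1.0], [0.9, 1.1]]
SIGMA, B = 0.5, 1000.0  # SIGMA suits the toy scale; the text reports 0.1

K, kappa = kmm_terms(x_tr, x_te, SIGMA)
eps = default_eps(len(x_tr))

uniform = [1.0, 1.0, 1.0, 1.0]  # no re-weighting
shifted = [0.5, 0.5, 1.5, 1.5]  # more weight on points near the test data

print("objective, uniform weights:", kmm_objective(uniform, K, kappa))
print("objective, shifted weights:", kmm_objective(shifted, K, kappa))
print("IWSVM bounds for C = 1:", iwsvm_upper_bounds(shifted, 1.0))
```

On this toy data the weights that favour the training points near the test samples give the lower objective value, which is the behaviour a solver for Eq. (3) would be expected to exploit.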
A two-way analysis of variance (ANOVA) with the factors method (linear SVM with and without covariate shift) and class pair (f20-f60, f20-s20, f20-s60, f60-s20, f60-s60, and s20-s60) showed a significant improvement (F(1, 8) = 6.27, p < 0.03).

Fig. 3. The columns represent the average (across subjects) classification performance of the SVMs with (dark shade) and without (light shade) the covariate shift method for all combinations of the task pairs. The standard deviations are indicated in the figure. (Vertical axis: classification accuracy, 0 to 1; horizontal axis: class pairs f20-f60, f20-s20, f20-s60, f60-s20, f60-s60, s20-s60.)

4. DISCUSSIONS

In this paper, we have proposed to use covariate shift to deal with the non-stationarity of EEG signals from one session to another in order to classify different movement kinetics before the onset of the task. Any BCI system capable of classifying movement kinetics with short latency with respect to the onset of a task can be used as a neuromodulatory system [2] and could potentially be used to drive an external augmentative device (robot-assisted movements) for restoring or improving motor function in stroke patients. As proposed by Mrachacz-Kersting et al. [2], plasticity can be induced within the human motor cortex if an artificially generated signal (peripheral electrical stimulation) coincides with a physiologically generated signal (movement intention). The experimental results presented in this paper show that, by using a very simple classifier (linear kernel SVM), we can deal with the non-stationarity of EEG signals, achieving a significant improvement (p < 0.03) in classification accuracy over a standard method.

The classification accuracies of self-paced neuromodulatory BCI systems have been found to be positively correlated with the excitability of the corticospinal projections of the target muscle [3]. It is known that the neural signals generated in self-paced scenarios differ from those obtained in cue-based scenarios [13]. In this paper, the level of force and speed was precisely controlled across subjects using a cue-based system. One of the issues with the KMM methodology used in this paper is that it requires the testing data to be available before the IWs can be calculated. Therefore, data recorded from previous sessions can be used to train the IWSVMs. In future, we would like to make covariate shift methods incremental and adaptive so that they are directly applicable to online BCI systems.

5. REFERENCES

[1] J. Daly and J. Wolpaw, "Brain-computer interfaces in neurological rehabilitation," The Lancet Neurology, vol. 7, no. 11, pp. 1032-1043, 2008.
[2] N. Mrachacz-Kersting, S. Kristensen, I. Niazi, and D. Farina, "Precise temporal association between cortical potentials evoked by motor imagination and afference induces cortical plasticity," Journal of Physiology, vol. 590, no. 7, pp. 1669-1682, 2012.
[3] I. Niazi, N. Mrachacz-Kersting, N. Jiang, K. Dremstrup, and D. Farina, "Peripheral electrical stimulation triggered by self-paced detection of motor intention enhances motor evoked potentials," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 20, no. 4, pp. 595-604, 2012.
[4] J. Krakauer, "Motor learning: its relevance to stroke recovery and neurorehabilitation," Current Opinion in Neurology, vol. 19, no. 1, pp. 84-90, 2006.
[5] D. Farina, O. Nascimento, M. Lucas, and C. Doncarli, "Optimization of wavelets for classification of movement-related cortical potentials generated by variation of force-related parameters," Journal of Neuroscience Methods, vol. 162, no. 1-2, pp. 357-363, 2007.
[6] Y. Gu, O. Nascimento, M. Lucas, and D. Farina, "Identification of task parameters from movement-related cortical potentials," Medical & Biological Engineering & Computing, vol. 47, no. 12, pp. 1257-1264, 2009.
[7] M. Jochumsen, I. Niazi, N. Mrachacz-Kersting, D. Farina, and K. Dremstrup, "Detection and classification of movement-related cortical potentials associated with task force and speed," Journal of Neural Engineering, vol. 10, no. 5, pp. 056015, 2013.
[8] Y. Li, H. Kambara, Y. Koike, and M. Sugiyama, "Application of covariate shift adaptation techniques in brain computer interfaces," IEEE Transactions on Biomedical Engineering, vol. 57, no. 6, pp. 1318-1324, 2010.
[9] J. Quiñonero-Candela, M. Sugiyama, A. Schwaighofer, and N. Lawrence, Eds., Dataset Shift in Machine Learning, MIT Press, Cambridge, MA, 2009.
[10] R. Mohammadi, A. Mahloojifar, and D. Coyle, "Unsupervised short-term covariate shift minimization for self-paced BCI," in IEEE Symposium on Computational Intelligence, Cognitive Algorithms, Mind, and Brain, Singapore, 2013.
[11] A. Gretton, A. Smola, J. Huang, M. Schmittfull, K. Borgwardt, and B. Schoelkopf, Dataset Shift in Machine Learning, chapter 8, "Covariate Shift and Local Learning by Distribution Matching," pp. 131-160, MIT Press, Cambridge, MA, 2009.
[12] A. Hassan, R. Damper, and M. Niranjan, "On acoustic emotion recognition: Compensating for covariate shift," IEEE Transactions on Audio, Speech and Language Processing, vol. 21, no. 7, pp. 1458-1468, 2013.
[13] S. Jankelowitz and J. Colebatch, "Movement related potentials associated with self-paced, cued and imagined arm movements," Experimental Brain Research, vol. 147, no. 1, pp. 98-107, 2002.