Discrete Single Vs. Multiple Stream HMMs: A Comparative Evaluation of Their Use in On-Line Handwriting Recognition of Whiteboard Notes


Joachim Schenk, Stefan Schwärzler, and Gerhard Rigoll
Institute for Human-Machine Communication, Technische Universität München
Theresienstraße 90, München
{schenk, schwaerzler, rigoll}@mmk.ei.tum.de

Abstract

In this paper we study the influence of quantization on the features used in on-line handwriting recognition in order to apply discrete single and multiple stream HMMs. It turns out that the separation of the features into statistically independent streams influences the performance of the recognition system: using the discrete pressure feature as an example, we show that the statistical dependencies between the features are as important as their proper quantization. In an experimental section we show that a continuous state-of-the-art system for on-line handwritten whiteboard note recognition can be outperformed by r = 2.0 % relative in word level accuracy using a three-stream discrete HMM system with well chosen streams. A relative improvement of r = 4.5 % can be achieved when comparing a single stream HMM system with the best performing multiple stream HMM system.

Keywords: discrete HMM, discrete multiple stream HMM, on-line whiteboard note recognition, quantization

1. Introduction

In modern on-line handwriting recognition, Hidden Markov Models (HMMs, [15; 18]) have proven to be the classifier of choice [13], due to their capability of modeling time-dynamic sequences of variable lengths. HMMs also compensate for statistical variations in those sequences. More recently, they were introduced for on-line handwritten whiteboard note recognition [9]. Commonly, in a handwriting recognition system, each symbol (for the purposes of this paper, each character) is represented by a single HMM (either discrete or continuous). By combining character-HMMs, words are recognized using a dictionary. While high recognition rates are reported for isolated word recognition systems, e.g. [5], performance drops considerably when it comes to unconstrained handwritten sentence recognition [9]: the lack of prior word segmentation introduces new variability and therefore requires more sophisticated character recognizers. An even more demanding task is the recognition of handwritten whiteboard notes as introduced in [9]. Due to the conditions described in [9], one may characterize the problem of on-line whiteboard note recognition as difficult.

One distinguishes between continuous and discrete HMMs. The latter are further divided into discrete single and multiple stream HMMs [16]. In the case of continuous HMMs, the observation probability is modeled by mixtures of Gaussians [15]. It is known that the performance of a handwriting recognition system depends on both the number of Gaussians used and the number of iterations for which each Gaussian is trained [4]. Therefore, exhaustive optimizations are needed when using continuous HMMs. In the case of discrete HMMs, the probabilities are estimated by counting each discrete symbol's occurrences [15]. The probability density function of the observation is thereby derived exactly and not approximated by Gaussians, as in the case of continuous HMMs. However, in order to transform the continuous handwriting data into discrete symbols, vector quantization (VQ) is performed, which introduces a quantization error.
While in automatic speech recognition (ASR) continuous HMMs have gained wide acceptance, it remains unclear whether discrete or continuous HMMs should be used in on-line handwriting [16] and whiteboard note recognition in particular. In this paper we discuss the use of discrete single and multiple stream HMMs for the task of on-line whiteboard note recognition with respect to varying codebook sizes and numbers of streams. While ASR features generally are continuous in nature, in handwriting recognition continuous, discrete, and binary features are used [9]. We show that the feature describing the pen pressure is not adequately quantized when standard vector quantization is performed, due to quantization errors. In this paper we therefore form groups of features which are quantized separately and show that an improvement can be achieved using certain multiple stream configurations.

To that end, the next section gives a brief overview of the general preprocessing and feature extraction system for whiteboard note recognition. Section 3 reviews VQ and both discrete single and multiple stream HMMs. The five different discrete HMM recognition systems used and evaluated in this paper are briefly summarized in Sec. 4. In the experimental section (Sec. 5), the database used for evaluation is explained and the previously introduced systems are evaluated. In addition, our system is compared to a state-of-the-art continuous system to prove competitiveness. Finally, conclusions and an outlook on further work are presented in Sec. 6.

2. System Overview

For recording the handwritten whiteboard data the eBeam system is used; for further information refer to [9]. After recording, the sampled data (which is sampled equidistantly neither in time nor in space) is preprocessed and normalized as a first step. The data is thereby resampled in order to achieve an equidistant sampling in space. Then, a histogram-based skew and slant correction is performed as described in [7]. Finally, all text lines are normalized to meet a distance of one between the upper and lower baseline, similar to [3].

Following the preprocessing, 24 features are extracted from the recorded data, resulting in a 24-dimensional feature vector f(t) = (f_1(t), ..., f_24(t))^T derived from the three-dimensional data vector (sample points) s_t = (x(t), y(t), p(t))^T. The state-of-the-art features for handwriting recognition [6] and the recently published features for whiteboard note recognition [9] used in this paper are briefly listed below. Commonly, the features can be divided into two classes: on-line and off-line features. As on-line features we extract:

f_1: pen pressure, i.e. f_1 = 1 if the pen touches the whiteboard and f_1 = 0 otherwise
f_2: velocity equivalent, computed before resampling and later interpolated
f_3: x-coordinate after resampling and subtraction of the moving average
f_4: y-coordinate after resampling and normalization
f_5,6: angle α of the spatially resampled and normalized strokes, coded as sin α and cos α ("writing direction")
f_7,8: difference of consecutive angles, Δα = α_t − α_{t−1}, coded as sin Δα and cos Δα ("curvature")

In addition, we use on-line features which describe the relation between a sample point s_t and its neighbors, as described in [9] and altered:

f_9: logarithmic transformation of the aspect v of the trajectory between the points s_{t−τ} and s_t, where τ < t denotes the τ-th sample point before s_t. This aspect ("vicinity aspect") is

v = (Δy − Δx) / (Δy + Δx),  with Δx = x(t) − x(t−τ) and Δy = y(t) − y(t−τ).

As v tends to peak for small values of Δy + Δx, we narrow its range by

f_9 = sign(v) · log(1 + |v|).
f_10,11: angle φ between the line [s_{t−τ}, s_t] and the lower baseline, coded as sin φ and cos φ ("vicinity slope")
f_12: the length of the trajectory between s_{t−τ} and s_t normalized by max(|Δx|; |Δy|) ("vicinity curliness")
f_13: average squared distance between each point of the trajectory and the line [s_{t−τ}, s_t]

The off-line features are:

f_14–22: a 3×3 subsampled bitmap slid along the pen's trajectory ("context map") to incorporate a partition of the actual image of the currently written letter
f_23,24: the number of pixels above and beneath the current sample point s_t (the "ascenders" and "descenders")
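To make the on-line feature definitions above concrete, the following minimal Python sketch computes the writing direction, curvature, and the log-compressed vicinity aspect f_9 from resampled pen coordinates. It is our own illustration, not code from the paper; the function name, the numpy-based implementation, and the window size tau are assumptions.

```python
import numpy as np

def online_features(x, y, tau=3):
    """Illustrative sketch (not from the paper) of features f_5..f_9.

    x, y : 1-D numpy arrays of spatially resampled, baseline-normalized
           pen coordinates (one entry per sample point s_t).
    tau  : number of sample points spanned by the vicinity features.
    """
    # f_5,6: writing direction, the angle of the stroke between consecutive
    # points, coded as (sin a, cos a).
    dx = np.diff(x, prepend=x[0])
    dy = np.diff(y, prepend=y[0])
    norm = np.hypot(dx, dy) + 1e-12          # avoid division by zero
    sin_a, cos_a = dy / norm, dx / norm

    # f_7,8: curvature, the difference of consecutive writing-direction
    # angles, again coded as sine and cosine.
    a = np.arctan2(dy, dx)
    da = np.diff(a, prepend=a[0])
    sin_da, cos_da = np.sin(da), np.cos(da)

    # f_9: vicinity aspect v = (Dy - Dx)/(Dy + Dx) over a window of tau points,
    # compressed by sign(v) * log(1 + |v|) to narrow its range.
    # Note: the first tau entries wrap around via np.roll and are not meaningful.
    Dx = x - np.roll(x, tau)
    Dy = y - np.roll(y, tau)
    v = (Dy - Dx) / (Dy + Dx + 1e-12)
    f9 = np.sign(v) * np.log1p(np.abs(v))

    return {"sin_a": sin_a, "cos_a": cos_a,
            "sin_da": sin_da, "cos_da": cos_da, "f9": f9}
```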

3. Discrete Single and Multiple Stream HMMs and Vector Quantization

In this section we briefly summarize discrete Hidden Markov Models (HMMs) and discrete multiple stream HMMs from a Graphical Model (GM, [2]) point of view and give a common notation. Then vector quantization (VQ) is reviewed and the necessary feature normalization for on-line whiteboard note recognition is explained.

3.1. Discrete single stream HMMs

The GM of a discrete single stream HMM λ with the variable parameters λ = (A, B, π) and hidden states s_1, ..., s_N is depicted in Fig. 1 (left), where q_t denotes the state s_i which is occupied at time instance t. The matrix A, consisting of the entries a_ij = p(q_t = s_j | q_{t-1} = s_i), thereby describes the time-invariant probability of a state transition q_{t-1} → q_t; B, with entries b_{s_i}(o_t) = p(o_t | s_i), holds the discrete emission probability of each state s_i for each possible symbol o_t; and π = (π_1, ..., π_N) is the initial state distribution with π_i = p(q_1 = s_i) [15].

Figure 1. GM of a discrete single stream HMM (left) and a discrete multiple stream HMM (right): hidden discrete nodes q_{t-1}, q_t and observed discrete nodes o_{t-1}, o_t (single stream) and o_{t,1}, ..., o_{t,D} (multiple stream).

Given a certain parameter set λ, the discrete HMM's joint probability of the observation sequence o = (o_1, ..., o_T) and the state sequence q = (q_1, ..., q_T) can be calculated as

p(o, q | λ) = p(q_1) p(o_1 | q_1) ∏_{t=2}^{T} p(q_t | q_{t-1}) p(o_t | q_t).   (1)

By marginalizing, that is, summing Eq. 1 over all possible state sequences q ∈ Q, and using the above substitutions a_ij, b_{s_i}(o_t), and π_i, the well-known production probability

p(o | λ) = Σ_{q ∈ Q} π_{q_1} b_{q_1}(o_1) ∏_{t=2}^{T} a_{q_{t-1} q_t} b_{q_t}(o_t)   (2)

is derived; it can be computed efficiently using the forward algorithm [15]. The parameters λ of an HMM can be trained using the EM algorithm, in the case of HMMs known as the Baum-Welch algorithm [1].

3.2. Discrete multiple stream HMMs

In GM notation, a discrete multiple stream HMM with D observation streams and parameters λ_St = (A, B, π) producing the observation sequence O = [o_1, ..., o_T] is shown on the right-hand side of Fig. 1. It is the discrete counterpart of the multi-stream HMM with equally weighted streams as described, e.g., in [14]. While the parameters A and π have the same definition as for single stream HMMs, the production probabilities of the discrete multiple stream HMM are stored separately for each stream d in B = {B_1, ..., B_D}. At each time instance t the observation o_t = [o_t(1), ..., o_t(D)] is produced, where the observation o_t(d) of stream d is produced statistically independently from all other streams given the current state q_t. This is indicated in Fig. 1:

p(o_t | s_i) = ∏_{d=1}^{D} p(o_t(d) | s_i) = ∏_{d=1}^{D} b_i(o_t(d)) =: l_i(o_t).   (3)

The joint probability of the observation sequence O and the state sequence q = (q_1, ..., q_T) is then, similarly to Eq. 1, given by

p(O, q | λ_St) = p(q_1) ∏_{d=1}^{D} p(o_1(d) | q_1) ∏_{t=2}^{T} [ p(q_t | q_{t-1}) ∏_{d=1}^{D} p(o_t(d) | q_t) ].   (4)

Again, by marginalizing and using the substitutions a_ij, π_i, and Eq. 3, the production probability yields

p(O | λ_St) = Σ_{q ∈ Q} π_{q_1} l_{q_1}(o_1) ∏_{t=2}^{T} a_{q_{t-1} q_t} l_{q_t}(o_t).   (5)

As indicated by the similar structure of Eqs. 2 and 5, the training algorithms for the discrete HMM as described in [1] only have to be slightly modified to handle multiple streams.
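To make Eqs. 3 and 5 concrete, the following Python sketch (our own illustration; the function name multistream_forward and the array layout are assumptions, not the paper's implementation) evaluates the per-state stream likelihood l_i(o_t) and the production probability p(O | λ_St) with the forward algorithm.

```python
import numpy as np

def multistream_forward(pi, A, B, O):
    """Illustrative sketch: production probability of a discrete multi-stream HMM.

    pi : (N,)   initial state distribution
    A  : (N, N) transition probabilities a_ij
    B  : list of D arrays; B[d] has shape (N, M_d) and holds the discrete
         emission probabilities of stream d (the factors of Eq. 3)
    O  : (T, D) integer array; O[t, d] is the discrete symbol o_t(d)
    """
    N, (T, D) = len(pi), O.shape

    def l(t):
        # Eq. 3: l_i(o_t) = prod_d b_i(o_t(d)), streams independent given the state
        out = np.ones(N)
        for d in range(D):
            out *= B[d][:, O[t, d]]
        return out

    # Forward recursion: efficient evaluation of Eq. 5
    # (no scaling for brevity; real implementations normalize alpha to avoid underflow).
    alpha = pi * l(0)
    for t in range(1, T):
        alpha = (alpha @ A) * l(t)
    return alpha.sum()
```

With D = 1 this reduces to the ordinary discrete single stream HMM of Eq. 2.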
3.3. Vector Quantization

In order to use discrete single or multiple stream HMMs, all continuous observations are assigned to a stream of discrete observations via quantization, whereby a continuous, N-dimensional sequence O = (f_1, ..., f_T), f_t ∈ R^N, of length T is mapped to a discrete, one-dimensional sequence of codebook indices ô = (f̂_1, ..., f̂_T), f̂_t ∈ N, provided by a codebook C = (c_1, ..., c_{N_cdb}), c_k ∈ R^N, containing |C| = N_cdb centroids [12]. For N = 1 this mapping is called scalar quantization, and in all other cases (N ≥ 2) vector quantization (VQ). Once a codebook C is generated, the assignment of the continuous sequence to the codebook entries is a minimum distance search

f̂_t = argmin_{1 ≤ k ≤ N_cdb} d(f_t, c_k),   (6)

where d(f_t, c_k) is commonly the squared Euclidean distance.

The codebook C itself and its entries c_k are derived from a training set S_train containing |S_train| = N_train training samples O_i by partitioning the N-dimensional feature space defined by S_train into N_cdb cells. This is performed by the well-known k-means algorithm as described, e.g., in [12]. As stated in [12], the centroids of a well-trained codebook capture the distribution p(f) of the underlying feature vectors in the training data.

As the values of the features described in Sec. 2 are neither mean- nor variance-normalized, each feature f_j is normalized to mean µ_j = 0 and standard deviation σ_j = 1 prior to quantization, yielding the normalized feature f̃_j. The statistical dependencies between the features are thereby not changed.
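The quantization step of Sec. 3.3 can be sketched in a few lines of Python. This is again our own illustration under assumptions: the plain Lloyd/k-means loop, the function names train_codebook and quantize, and the random initialization are not the paper's implementation, and a production system would use a more careful codebook design.

```python
import numpy as np

def train_codebook(F, n_cdb, n_iter=20, seed=0):
    """Illustrative k-means codebook C = (c_1, ..., c_Ncdb) for feature vectors F (shape (n, N))."""
    rng = np.random.default_rng(seed)
    C = F[rng.choice(len(F), n_cdb, replace=False)].astype(float)   # initial centroids
    for _ in range(n_iter):
        # assign every sample to its nearest centroid (squared Euclidean distance)
        idx = np.argmin(((F[:, None, :] - C[None, :, :]) ** 2).sum(-1), axis=1)
        for k in range(n_cdb):
            if np.any(idx == k):          # move centroid to the mean of its cell
                C[k] = F[idx == k].mean(axis=0)
    return C

def quantize(F, C):
    """Minimum-distance search of Eq. 6: map each feature vector f_t to a codebook index."""
    d = ((F[:, None, :] - C[None, :, :]) ** 2).sum(-1)
    return np.argmin(d, axis=1)

# As in Sec. 3.3, features would be mean/variance normalized before quantization, e.g.
# F_norm = (F - F.mean(axis=0)) / F.std(axis=0)
```

In the systems described next, separate codebooks of this kind are trained for different feature groups.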

4. Discrete HMM systems

In this section we briefly summarize the five different system topologies that are evaluated in Sec. 5.

System 1: The first system is the baseline system, in which all normalized features (f̃_1, ..., f̃_24) are jointly quantized by one codebook and mapped to the discrete symbol f̂. This results in the single observation stream o = (f̂_1, ..., f̂_T). As only one observation stream exists, recognition is performed by single stream HMMs as described in Sec. 3.1. The same baseline system has been used in our previous work [17].

System 2: To prove that the binary pressure feature f_1 is not adequately quantized by standard VQ, the second system is used. All features except f_1 are jointly quantized and mapped to the discrete symbol f̂_r, resulting in one single observation stream o = (f̂_{r,1}, ..., f̂_{r,T}). Again, the single stream observations are recognized by single stream HMMs (Sec. 3.1; see also [17]).

System 3: In order to keep the exact value of feature f_1, its values are directly used to form an independent observation, where o_t(1) in Eq. 3 becomes o_t(1) = f_1(t), using the unnormalized values of the feature. The remaining features (f̃_2, ..., f̃_24) are jointly quantized as in the second system, forming a second observation o_t(2) = f̂_r(t). The two-stream observation o_t = (f_1(t), f̂_{r,t}) is then recognized using two-stream HMMs as explained in Sec. 3.2.

System 4: Besides the discrete pressure feature, we use other discrete features: the off-line features f_14, ..., f_24. In this system the values of each discrete feature, including f_1, form a separate stream. The remaining features (f_2, ..., f_13) are jointly quantized and mapped to the discrete symbol f̂_on. Thus a 13-stream observation o_t = (f_1, f̂_on, f_14, ..., f_24) is derived, and a 13-stream HMM is applied for recognition.

System 5: The last system is motivated by, e.g., [16], in which the on-line and off-line features form two separate observation streams. In this paper we augment this approach by a further discrete observation stream formed by the values of f_1, whereby the on-line features f_2, ..., f_13 are jointly quantized and mapped to the discrete symbol f̂_on and the off-line features f_14, ..., f_24 are mapped to the discrete symbol f̂_off. Hence, a three-stream observation o_t = (f_1, f̂_on, f̂_off) is obtained. As the on-line and off-line features are quantized independently, their codebook sizes N_on and N_off may differ. The optimal ratio R = N_off/N_on is found by experiment.
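As a small illustration of how the stream configurations differ, the following Python sketch (our own; the feature layout and the codebook arguments cb_all, cb_rest, cb_on, cb_off are hypothetical names) assembles the observation streams of systems 1, 3 and 5 from a feature matrix F of shape (T, 24), whose columns correspond to f_1, ..., f_24 with the pressure column kept as its raw binary values and the remaining columns normalized.

```python
import numpy as np

def quantize(F, C):
    # minimum-distance search of Eq. 6 (same as in the VQ sketch above)
    return np.argmin(((F[:, None, :] - C[None, :, :]) ** 2).sum(-1), axis=1)

def system1_obs(F, cb_all):
    """System 1: all 24 features jointly quantized into a single stream."""
    return quantize(F, cb_all)[:, None]                               # shape (T, 1)

def system3_obs(F, cb_rest):
    """System 3: raw binary pressure f_1 plus jointly quantized f_2..f_24."""
    pressure = F[:, 0].astype(int)
    return np.stack([pressure, quantize(F[:, 1:], cb_rest)], axis=1)  # shape (T, 2)

def system5_obs(F, cb_on, cb_off):
    """System 5: pressure, on-line features f_2..f_13, off-line features f_14..f_24."""
    pressure = F[:, 0].astype(int)
    on = quantize(F[:, 1:13], cb_on)     # columns 1..12  -> f_2..f_13
    off = quantize(F[:, 13:24], cb_off)  # columns 13..23 -> f_14..f_24
    return np.stack([pressure, on, off], axis=1)                      # shape (T, 3)
```

The resulting (T, D) integer arrays are exactly the multi-stream observations O consumed by the multistream_forward sketch in Sec. 3.2.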
5. Experiments

The experiments presented in this section are conducted on a database containing handwritten, heuristically line-segmented whiteboard notes (IAM-OnDB², [8]). Comparability of the results is provided by using the settings of the writer-independent IAM-OnDB-t1 benchmark. The IAM-OnDB-t1 benchmark provides four writer-disjoint sets: one training set, two validation sets and a test set. In this paper the training set is used for training, the combination of both validation sets is used for validation, and the final tests are performed on the test set. Our experiments use the same HMM topology and number of states as in [9].

² fki/iamnodb/

The following five experiments are conducted on the combination of both validation sets and on each system described in Sec. 4 at different codebook sizes. As the number of observation streams varies, the total number N of feature-describing parameters is kept fixed, i.e. N = Σ_{d=1}^{D} N_d, where N_d is the number of codebook entries of observation stream d. To achieve comparability, all experiments are performed for values of N = 10, 100, 500, 1000, 2000, 5000 and 7500.

Experiment 1 (Exp. 1): Using the first system, this experiment forms the baseline: all features are quantized in one single stream. The results derived on the validation sets are shown in Fig. 2 (left). The best character level accuracy is achieved for N_cdb = 5000 prototypes and yields a_base = 62.6 %. The drop in performance when further increasing the codebook size to N = 7500 is due to sparse data [15].

Experiment 2 (Exp. 2): The inadequate quantization of the binary pressure feature is shown by experiments on the second system. As Fig. 2 (left) shows, only a slight degradation in recognition performance compared to the baseline can be observed. In fact, both codebook size vs. accuracy curves run almost parallel. The peak rate of a_sys2 = 62.5 % is once again reached for a codebook size of N_cdb = 5000, which equals a relative change of r = 0.2 %. This is rather surprising, as in [11] pressure is shown to be a relevant feature in on-line whiteboard note recognition.
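The relative changes quoted throughout these experiments are plain ratios with respect to the baseline accuracy; as a worked check for Exp. 2 (our own arithmetic, not part of the original text):

r = (a_sys2 − a_base) / a_base = (62.5 % − 62.6 %) / 62.6 % ≈ −0.2 %,

i.e. the quoted r = 0.2 % is the magnitude of this relative change.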

Figure 2. Left: evaluation of the different systems' character level accuracies with respect to the codebook size N. Right: character level accuracies for different codebook sizes N and varying ratios R = N_off/N_on of the three-stream HMM system (where the three streams are formed by the values of the pressure, the jointly quantized on-line features using N_on centroids, and the jointly quantized off-line features using N_off centroids).

Experiment 3 (Exp. 3): The result of the previous experiment suggests that the discrete pressure information of feature f_1 is not adequately quantized by the vector quantizer. The third system allows the direct use of the values of f_1 as a separate observation stream; the pressure information is thereby modeled without loss. This results in no improvement. As mentioned in Sec. 3.2, both observation streams are modeled statistically independently. Although the pressure is modeled without loss in a separate stream, neglecting the statistical dependencies between the pressure and the remaining features inhibits any improvement.

Experiment 4 (Exp. 4): The previous experiment's outcome suggests that neglecting the statistical dependencies between the features influences the recognition accuracy. Because in on-line handwritten whiteboard note recognition other discrete features besides the pressure feature are used, they can be modeled in separate observation streams without loss using the fourth system. The exact values of each discrete feature are thereby modeled without any loss, while their statistical dependencies are neglected. Peak performance of a_sys4 = 60.6 % is reached for N = 5000, which is a relative drop of r = 3.3 % compared to the baseline system. This drop confirms the assumption that the statistical dependencies between features have an influence on the system's performance.

Table 1. Final results of experiments 1-5 and a state-of-the-art continuous system [10] on the same word level recognition task.

system     Exp. 1 (A_base)  Exp. 2 (A_sys2)  Exp. 3 (A_sys3)  Exp. 4 (A_sys4)  Exp. 5 (A_sys5)  [10]
word acc.  63.5 %           63.3 %           63.6 %           61.0 %           66.5 %           65.2 %

Experiment 5 (Exp. 5): In this last experiment the fifth system is evaluated. Three independent observation streams are formed, consisting of the values of feature f_1, the jointly quantized on-line features f_2, ..., f_13, and the jointly quantized off-line features f_14, ..., f_24. Both parameters, N = 2 + N_off + N_on and the ratio R = N_off/N_on (where N_on and N_off denote the numbers of centroids used to quantize the on-line and off-line features), are found by experiment. Fig. 2 (right) shows the character level accuracy over the ratio R for different N. The peak rates achieved for each N and the corresponding optimal R are plotted in Fig. 2 (left). The highest character level accuracy of a_sys5 = 64.8 % is reached for N = 5000 and R_opt = 0.105, i.e. N_on = 4523 and N_off = 475 centroids are used to quantize the on-line and off-line features, respectively. This is a relative improvement of r = 3.4 % compared to the baseline.
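As a quick consistency check of the Exp. 5 operating point (again our own arithmetic): N = 2 + N_on + N_off = 2 + 4523 + 475 = 5000, where the pressure stream contributes its two binary symbols, and R_opt = N_off / N_on = 475 / 4523 ≈ 0.105.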

In order to prove the competitiveness of our systems, the parameters and models which provide the best results in experiments Exp. 1, ..., 5 are taken to perform word level recognition on the test set of the IAM-OnDB-t1 benchmark. The results are compared with the performance of a state-of-the-art continuous recognition system [10] and shown in Tab. 1. The highest word level accuracy of A_sys5 = 66.5 % can be reported for the fifth system using three independent feature streams, leading to an improvement of r = 2.0 % relative (1.3 % absolute) compared to the state-of-the-art system. A straightforward approach, i.e. jointly quantizing all features in a single stream, leads to a word level accuracy of A_base = 63.5 %. The three-stream approach as performed in system five therefore leads to a relative improvement of r = 4.5 % compared to this baseline system.

6. Conclusions and Outlook

In this paper we intensively investigated the use of discrete single and multiple stream HMMs. We examined both the continuous and discrete features typically used in on-line handwritten whiteboard note recognition and their vector quantization, which introduces quantization errors. By conducting a series of experiments we showed, using the pressure feature as an example, that both the quantization of the features and the statistical dependencies between them influence recognition performance. This is confirmed by the observation that recognition accuracy drops considerably if the statistical dependencies between the features are neglected, even though the features are modeled without loss in separate streams. However, we also showed that a single-stream discrete HMM baseline system using jointly quantized features can be outperformed by r = 4.5 % relative in word level accuracy using a three-stream discrete HMM system which models the pressure, the on-line features and the off-line features independently. Finally, our discrete systems were compared to a state-of-the-art continuous system as presented in [10], yielding an improvement of r = 2.0 % relative in word level accuracy. As our experiments indicate, the statistical dependencies between the features have an impact on the system's performance. In future work we therefore plan to study these influences in further detail in order to achieve an optimal feature stream selection.

Acknowledgments

The authors thank M. Liwicki for providing the lattice for the final benchmark and G. Weinberg for his useful comments.

References

[1] L. Baum and T. Petrie, Statistical Inference for Probabilistic Functions of Finite State Markov Chains, Annals of Mathematical Statistics, 37.
[2] J. Bilmes, Graphical Models and Automatic Speech Recognition, in: M. Johnson, S. Khudanpur, M. Ostendorf and R. Rosenfeld (eds.), Mathematical Foundations of Speech and Language Processing, Springer.
[3] R. Bozinovic and S. Srihari, Off-Line Cursive Script Word Recognition, IEEE Trans. on Pattern Analysis and Machine Intelligence, 11(1):68-83.
[4] S. Günter and H. Bunke, HMM-based Handwritten Word Recognition: On the Optimization of the Number of States, Training Iterations and Gaussian Components, Pattern Recogn., 37.
[5] J. Schenk and G. Rigoll, Novel Hybrid NN/HMM Modelling Techniques for On-Line Handwriting Recognition, Proc. of the Int. Workshop on Frontiers in Handwriting Recogn.
[6] S. Jaeger, S. Manke, J. Reichert and A. Waibel, The NPen++ Recognizer, Int. J. on Document Analysis and Recogn., 3.
[7] E. Kavallieratou, N. Fakotakis and G. Kokkinakis, New Algorithms for Skewing Correction and Slant Removal on Word-Level, Proc. of the Int. Conf. ECS, 2.
[8] M. Liwicki and H. Bunke, IAM-OnDB - an On-Line English Sentence Database Acquired from Handwritten Text on a Whiteboard, Proc. of the Int. Conf. on Document Analysis and Recogn., 2.
[9] M. Liwicki and H. Bunke, HMM-Based On-Line Recognition of Handwritten Whiteboard Notes, Proc. of the Int. Workshop on Frontiers in Handwriting Recogn.
[10] M. Liwicki and H. Bunke, Combining On-Line and Off-Line Systems for Handwriting Recognition, Proc. of the Int. Conf. on Document Analysis and Recogn.
[11] M. Liwicki and H. Bunke, Feature Selection for On-Line Handwriting Recognition of Whiteboard Notes, Proc. of the Conf. of the Graphonomics Society.
[12] J. Makhoul, S. Roucos and H. Gish, Vector Quantization in Speech Coding, Proc. of the IEEE, 73(11).
[13] R. Plamondon and S. Srihari, On-Line and Off-Line Handwriting Recognition: A Comprehensive Survey, IEEE Trans. on Pattern Analysis and Machine Intelligence, 22(1):63-84.
[14] P. S. Aleksic and A. K. Katsaggelos, Automatic Facial Expression Recognition Using Facial Animation Parameters and Multistream HMMs, IEEE Transactions on Information Forensics and Security, pp. 1-9.
[15] L. Rabiner, A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition, Proc. of the IEEE, 77(2).
[16] G. Rigoll, A. Kosmala, J. Rottland and C. Neukirchen, A Comparison between Continuous and Discrete Density Hidden Markov Models for Cursive Handwriting Recognition, Proc. of the Int. Conf. on Pattern Recogn., 2.
[17] J. Schenk, S. Schwärzler, G. Ruske and G. Rigoll, Novel VQ Designs for Discrete HMM On-Line Handwritten Whiteboard Note Recognition, Proc. of the 30th Symposium of DAGM.
[18] H. Yasuda, T. Takahashi and T. Matsumoto, A Discrete HMM For Online Handwriting Recognition, Int. J. of Pattern Recogn. and Artificial Intelligence, 14(5), 2000.
