The Comparison of Vector Quantization Algorithms in Fish Species Acoustic Voice Recognition Using Hidden Markov Model


Diponegoro A.D. 1) and Fawwaz Al Maki W. 1)
1) Department of Electrical Engineering, University of Indonesia, Indonesia

Abstract

Vector Quantization (VQ) was implemented in fish acoustic voice recognition using the Hidden Markov Model (HMM) in order to reduce the memory capacity and the computation time. Three VQ algorithms were implemented in the fish voice recognition: Traditional K-Means Clustering, LBG (Linde, Buzo, and Gray), and Successive Binary Split. In the recognition processing, the input fish voice waveform was converted to a discrete signal, and its spectral characteristics were extracted using Mel Frequency Cepstrum Coefficients (MFCC). The vector components of the fish voice spectrum were quantized using the three VQ algorithms. The performance of these VQ algorithms was examined during fish voice recognition processing by means of the HMM. Based on the experimental results, the Successive Binary Split algorithm was the optimum algorithm because it had the highest accuracy compared to the other two algorithms. During the recognition processing, the Successive Binary Split algorithm required the lowest memory capacity and time consumption.

Keywords: Vector quantization, HMM, Fish acoustic voice

I. INTRODUCTION

Every kind of soniferous fish is able to produce a specific acoustic voice that distinguishes it from other species and signals its behaviour, such as courtship behaviour [1], mating behaviour [2], spawning behaviour [3] [4], and reproductive behaviour [5]. During the recognition processing, the wave characteristics of the observed fish were compared to the wave characteristics of a number of fish voices in a database. When the number of fishes in the database was large, searching the vector components in the database required a long computation time.
To solve this problem, the neighbouring vector components of each spectrum were combined into one value, called a centroid or codeword. The combination of several vector components into one codeword was carried out by means of a VQ algorithm. Three VQ algorithms were considered: Traditional K-Means Clustering, LBG, and Successive Binary Split. The question was which of the three gave the best performance in terms of the smallest memory capacity, the shortest computation time and, most importantly, the highest recognition accuracy.

II. VECTOR QUANTIZATION

The vector components of the extracted fish voice spectra were mapped from a large vector space to a finite number of regions of that space. Each region was called a cluster. The vector components in a cluster were called sample points. The nearest-neighbor sample points were quantized to a centroid, or codeword, by means of VQ (see Fig. 1). The distance between a sample point and its centroid was called the VQ distortion. Increasing the number of sample points made the VQ distortion smaller, which means that the accuracy became higher. For a given number of sample points (vector components), a small VQ distortion required a large number of centroids, which means that the computation time became longer and the storage capacity became bigger. It also depended on the number of extracted waves. The relation between the VQ distortion and the acoustic waves depended on the number of extracted waves that were produced from the acoustic wave concerned. If the acoustic waves of the observed kinds of fish differed greatly from each other, the duration of the extracted waves was longer than the duration of the extracted waves when the acoustic waves of the kinds of fish were nearly the same as each other. The choice of VQ algorithm would determine the performance of fish species recognition based on the acoustic voices that the fishes produced.
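As an illustration of the quantization step described above, the following minimal sketch maps each feature vector to the index of its nearest codeword and computes the average VQ distortion. It assumes squared Euclidean distance as the distortion measure and uses made-up two-dimensional vectors; the function names `squared_distance`, `quantize` and `average_distortion` are hypothetical and are not taken from the paper.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors (assumed measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vector, codebook):
    """Return the index of the codeword nearest to the given vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def average_distortion(vectors, codebook):
    """Mean squared distance between each vector and its nearest codeword."""
    total = sum(squared_distance(v, codebook[quantize(v, codebook)])
                for v in vectors)
    return total / len(vectors)


# Toy data: four 2-D "feature vectors" and a two-codeword codebook (illustrative only).
features = [(0.0, 0.0), (0.2, 0.0), (4.0, 4.0), (4.0, 4.2)]
codebook = [(0.1, 0.0), (4.0, 4.1)]

# Each vector is replaced by a discrete symbol, which is what a discrete HMM consumes.
symbols = [quantize(v, codebook) for v in features]
print(symbols)
print(average_distortion(features, codebook))
```

Storing one small integer per frame instead of a full feature vector is what yields the memory and search-time savings the text refers to.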
The VQ algorithms used in this paper were K-Means Clustering (Traditional K-Means Clustering), Successive Binary Split (Binary Split), and LBG (Linde, Buzo, and Gray).

A. K-Means Clustering algorithm [7]

The K-Means Clustering algorithm was used as the method to build the codewords. The procedure of the K-Means Clustering algorithm is shown in the flow chart in Fig. 2.
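A codebook-building loop of this kind can be sketched as follows. This is a generic K-Means (Lloyd) iteration written under stated assumptions, not a transcription of the flow chart: it assumes squared Euclidean distance, an initial codebook supplied by the caller, and a stopping rule that ends the loop once the average distortion improves by no more than a threshold. The names `kmeans_codebook`, `threshold` and `max_iter` are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors (assumed measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_codebook(vectors, initial_codebook, threshold=1e-6, max_iter=100):
    """Refine a codebook by alternating cluster assignment and centroid update."""
    codebook = [tuple(c) for c in initial_codebook]
    previous = float("inf")
    distortion = float("inf")
    for _ in range(max_iter):
        # Assignment step: put every training vector in the cluster of its nearest codeword.
        clusters = [[] for _ in codebook]
        total = 0.0
        for v in vectors:
            i = min(range(len(codebook)),
                    key=lambda j: squared_distance(v, codebook[j]))
            clusters[i].append(v)
            total += squared_distance(v, codebook[i])
        distortion = total / len(vectors)
        # Stop when the distortion no longer improves by more than the threshold.
        if previous - distortion <= threshold:
            break
        previous = distortion
        # Update step: each codeword becomes the centroid of its cluster;
        # an empty cluster keeps its old codeword.
        codebook = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else old
            for members, old in zip(clusters, codebook)
        ]
    return codebook, distortion


# Toy training vectors and a hand-picked initial codebook (illustrative only).
training_vectors = [(0.0, 0.0), (0.2, 0.0), (4.0, 4.0), (4.0, 4.2)]
trained_codebook, final_distortion = kmeans_codebook(
    training_vectors, initial_codebook=[(0.0, 0.0), (4.0, 4.0)])
print(trained_codebook, final_distortion)
```

The result depends on the initial codebook, which is one reason the splitting-based algorithms in the following subsections choose their starting points differently.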

Fig. 1. VQ processing [6]

Fig. 2. Flow chart of the K-Means Clustering algorithm

B. Successive Binary Split algorithm [7]

In the Binary Split algorithm, the initial codebook is set at a random value M. The procedure of the Successive Binary Split algorithm is shown in the flow chart in Fig. 3.

Fig. 3. Flow chart of the Successive Binary Split algorithm

C. LBG algorithm [7] [8]

The procedure of the LBG algorithm is shown in Fig. 4. Each current codeword $C_m$ is split according to the rule

$$ C_m^{+} = C_m (1 + \varepsilon), \qquad C_m^{-} = C_m (1 - \varepsilon) $$

where $\varepsilon$ is a splitting parameter ($\varepsilon = 0.01$ was chosen).

Fig. 4. Flow chart of the LBG algorithm [8]
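The splitting rule can be illustrated with a short sketch of an LBG-style design loop. It is a generic version written under assumptions rather than the exact procedure of Fig. 4: the codebook starts from the mean of all training vectors, every codeword is split into $C_m(1+\varepsilon)$ and $C_m(1-\varepsilon)$, and the doubled codebook is refined with nearest-neighbor clustering until the distortion stops improving. The names `lbg`, `refine` and `target_size` are hypothetical, and the target size is assumed to be a power of two.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors (assumed measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def refine(vectors, codebook, threshold=1e-6, max_iter=100):
    """Alternate nearest-codeword assignment and centroid update until convergence."""
    previous = float("inf")
    for _ in range(max_iter):
        clusters = [[] for _ in codebook]
        total = 0.0
        for v in vectors:
            i = min(range(len(codebook)),
                    key=lambda j: squared_distance(v, codebook[j]))
            clusters[i].append(v)
            total += squared_distance(v, codebook[i])
        distortion = total / len(vectors)
        if previous - distortion <= threshold:
            break
        previous = distortion
        codebook = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else old
            for members, old in zip(clusters, codebook)
        ]
    return codebook


def lbg(vectors, target_size, epsilon=0.01):
    """Grow a codebook by repeated splitting; target_size is assumed to be a power of two."""
    # Start from a single codeword: the centroid of all training vectors.
    codebook = [tuple(sum(col) / len(vectors) for col in zip(*vectors))]
    while len(codebook) < target_size:
        # Split every codeword C_m into C_m(1 + eps) and C_m(1 - eps).
        # A codeword exactly at the origin would not move under this multiplicative rule.
        codebook = [tuple(x * (1 + sign * epsilon) for x in c)
                    for c in codebook for sign in (+1, -1)]
        codebook = refine(vectors, codebook)
    return codebook


# Toy training vectors (illustrative only), kept away from the origin.
training_vectors = [(1.0, 1.0), (1.2, 1.0), (5.0, 5.0), (5.0, 5.2)]
codebook = sorted(lbg(training_vectors, target_size=2))
print(codebook)
```

Because each pass doubles the number of codewords, this scheme naturally produces codebook sizes such as 32, 64 and 128.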

III. RECOGNITION PROCESSING

In the recognition processing, the characteristics (vector components and HMM parameters) of the extracted wave of the observed fish acoustic voice were determined and compared with the characteristics stored in the database. The result of this comparison was used to recognize the species name of the observed fish: the kind of fish with the highest log-probability value was taken as the name of the observed fish. The block diagram of the recognition processing is shown in Fig. 5.

Fig. 5. Recognition processing procedure (blocks: fish acoustic signal, discrete signal processing, VQ, HMM for training, HMM for recognition, database, decision)

The HMM notation can be written as follows [9]:

$$ \lambda = (A, B, \pi) \tag{1} $$

where
$A = \{a_{ij}\}$, $a_{ij} = P[q_{t+1} = j \mid q_t = i]$, is the state-transition probability distribution;
$B = \{b_j(k)\}$, $b_j(k) = P[o_t = v_k \mid q_t = j]$, is the observation symbol probability distribution;
$\pi = \{\pi_i\}$, $\pi_i = P[q_1 = i]$, is the initial state distribution.

The observation sequence is given by

$$ O = (o_1\, o_2 \ldots o_T) \tag{2} $$

The state sequence is given by

$$ q = (q_1\, q_2 \ldots q_T) \tag{3} $$

The HMM probability (whose logarithm is the log-probability) is given by

$$ P(O \mid \lambda) = \sum_{q} P(O \mid q, \lambda)\, P(q \mid \lambda) \tag{4} $$

where the probability of the observation sequence can be written as

$$ P(O \mid q, \lambda) = b_{q_1}(o_1)\, b_{q_2}(o_2) \cdots b_{q_T}(o_T) \tag{5} $$

and the probability of a state sequence $q$ can be written as

$$ P(q \mid \lambda) = \pi_{q_1}\, a_{q_1 q_2}\, a_{q_2 q_3} \cdots a_{q_{T-1} q_T} \tag{6} $$

IV. EXPERIMENT RESULT

The fish species used in these experiments consisted of 5 (five) kinds of fish acoustic voice, namely:
- Cynoscion regalis drumming
- Cynoscion regalis chattering
- Conodon nobilis
- Opsanus tau
- Cynoscion jamaicensis

The acoustic voice of each of the 5 (five) kinds of fish was segmented into 60 (sixty) bursts of the extracted wave within a certain time period. The training processing was executed 12 (twelve) times.
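Equations (4) to (6) of Section III, together with the highest-log-probability decision rule, can be sketched in a few lines. The example below is only an illustration under assumptions: it evaluates equation (4) literally by summing over all state sequences, then computes the same quantity with the scaled forward recursion (a standard, cheaper equivalent that is swapped in here and not described in the paper), and finally picks the model with the largest log-probability. All model parameters, the species labels and the function names are made up.

```python
import math
from itertools import product


def brute_force_probability(obs, A, B, pi):
    """Equation (4): sum over every state sequence q of P(O|q, lambda) * P(q|lambda)."""
    n, total = len(pi), 0.0
    for q in product(range(n), repeat=len(obs)):
        p_obs = 1.0                      # equation (5)
        for state, symbol in zip(q, obs):
            p_obs *= B[state][symbol]
        p_path = pi[q[0]]                # equation (6)
        for prev, nxt in zip(q, q[1:]):
            p_path *= A[prev][nxt]
        total += p_obs * p_path
    return total


def log_likelihood(obs, A, B, pi):
    """log P(O|lambda) via the scaled forward recursion (same value as equation (4))."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        alpha = [a / scale for a in alpha]   # rescale to avoid numerical underflow
        log_prob += math.log(scale)
    return log_prob


def recognize(obs, models):
    """Return the label of the model with the highest log-probability."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical two-state model over two VQ symbols (values are illustrative only).
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.6, 0.4]
observation = [0, 1, 0]
print(brute_force_probability(observation, A, B, pi))
print(math.exp(log_likelihood(observation, A, B, pi)))

# Two made-up species models: one emits mostly symbol 0, the other mostly symbol 1.
uniform = [[0.5, 0.5], [0.5, 0.5]]
models = {
    "species_a": (uniform, [[0.9, 0.1], [0.9, 0.1]], [0.5, 0.5]),
    "species_b": (uniform, [[0.1, 0.9], [0.1, 0.9]], [0.5, 0.5]),
}
print(recognize([0, 0, 0], models))
```

The brute-force sum grows exponentially with the sequence length, which is why a recursion of this kind is normally used in practice; the two functions should agree up to floating-point error.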
In this experiment, the time period (duration) of the extracted waves took 3 (three) forms, namely:
1) Duration less than 0.4 second
2) Duration between 0.6 and 2.3 seconds
3) The above durations combined with bursts of random duration

The codebook dimensions applied in this experiment were of 3 (three) sizes, namely:
1) 32 bit codebook
2) 64 bit codebook
3) 128 bit codebook

A. The accuracy level performance

The experiment measured the accuracy level of each VQ algorithm. The results are shown in Table I to Table III.

TABLE I. Accuracy level (%) of fish voice recognition for the Traditional K-Means Clustering algorithm
Codebook | Accuracy level (%): 0.4 s | 2.3 s | Comb
, ,33 46,67 36, , ,67 26,67 36,67

TABLE II. Accuracy level (%) of fish voice recognition for the Successive Binary Split algorithm
Codebook | Accuracy level (%): 0.4 s | 2.3 s | Comb
32 46,67 63,33 56, ,67 83, , ,67

From the tables, it can be seen that the LBG algorithm was the most accurate of the three algorithms for the combination burst, the largest codebook and 10 cycles.

TABLE III. Accuracy level (%) of fish voice recognition for the LBG algorithm
Codebook | Accuracy level (%): 0.4 s | 2.3 s | Comb
, , , ,67 86, ,67 46,67 56,67

B. Relative time consumption

The time consumption was measured as the computer time from entering the data until the results were completely displayed on the monitor. The relative time results of HMM training for each VQ algorithm are shown in Table IV. The table shows that the LBG algorithm consumed the smallest execution time.

TABLE IV. Execution time of each VQ algorithm for several codebooks and numbers of Baum-Welch iterations
Relative time consumption of the HMM training processing
Codebook | Trad. K-Means Clust. | LBG | Succ. Binary Split
32 42,05 34,75 37, ,56 51,84 53, ,53 86, ,79 108,14 111,76

C. VQ distortion

The VQ distortion for several codebooks and numbers of iterations is shown in Table V. The table shows that the VQ distortion became smaller for a bigger codebook and also for a bigger number of iterations.

TABLE V. VQ distortion for several codebooks and numbers of iterations
Iteration | Codebook | VQ distortion

Based on the above results, for the same number of repetitions and the same duration, increasing the codebook mainly increased the recognition accuracy. This happened because increasing the number of codewords in a codebook made the VQ distortion smaller, which means that the probability of error also became lower.

V. CONCLUSION

Based on the results, the LBG algorithm had the smallest execution time, and it was also the most accurate of the three algorithms for the combination burst type, the largest codebook and 10 cycles.

REFERENCES

[1] Gerald, J. W., Sound production during courtship in six species of sunfish, Evolution 25: 75-87.
[2] Fine, M. L., Seasonal and geographical variation of the mating call of the oyster toad-fish, Oecologia 36: 45-47, 1978.
[3] Philip S. Lobel, Sound produced by spawning fish, Environmental Biology of Fishes 33, 1992.
[4] Lugli Marco, Gianni Pavan, Patrizia Torricelli, Laura Bobbio, Spawning vocalization in male freshwater gobiids, Environmental Biology of Fishes 43.
[5] Stout J. F., Sound communication during the reproductive behavior of Notropis analostanus, Amer. Midl. Nat. 94.
[6] Batri, Nadim, Robust Spectral Parameter Coding in Speech Recognition, Thesis, Department of Electrical Engineering, McGill University, Montreal, Canada.
[7] Thomas M. Parks, Vector Quantization Codebook Design Using Neural Networks, Air Force Office of Scientific Research (AFOSR/JSEP), December.
[8] Liu, Zhongmin, Yin, Qizhang, Zhang, Weimin, A Speaker Identification and Verification System, EEL6586 Final Project, 2002.
[9] Rabiner, L., Juang, Bing Hwang, Fundamentals of Speech Recognition, Prentice Hall, Inc., New Jersey, 1993.


TABLE IV. Baum Welch Algorithm Accuracy level (%) for various types of signals used as input B P C 10 23, ,67 26,67 26,67


More information

Dept. of Linguistics, Indiana University Fall 2009

Dept. of Linguistics, Indiana University Fall 2009 1 / 14 Markov L645 Dept. of Linguistics, Indiana University Fall 2009 2 / 14 Markov (1) (review) Markov A Markov Model consists of: a finite set of statesω={s 1,...,s n }; an signal alphabetσ={σ 1,...,σ

More information

CS838-1 Advanced NLP: Hidden Markov Models

CS838-1 Advanced NLP: Hidden Markov Models CS838-1 Advanced NLP: Hidden Markov Models Xiaojin Zhu 2007 Send comments to jerryzhu@cs.wisc.edu 1 Part of Speech Tagging Tag each word in a sentence with its part-of-speech, e.g., The/AT representative/nn

More information

Lecture 3: ASR: HMMs, Forward, Viterbi

Lecture 3: ASR: HMMs, Forward, Viterbi Original slides by Dan Jurafsky CS 224S / LINGUIST 285 Spoken Language Processing Andrew Maas Stanford University Spring 2017 Lecture 3: ASR: HMMs, Forward, Viterbi Fun informative read on phonetics The

More information

Hidden Markov Models Hamid R. Rabiee

Hidden Markov Models Hamid R. Rabiee Hidden Markov Models Hamid R. Rabiee 1 Hidden Markov Models (HMMs) In the previous slides, we have seen that in many cases the underlying behavior of nature could be modeled as a Markov process. However

More information

The Noisy Channel Model. CS 294-5: Statistical Natural Language Processing. Speech Recognition Architecture. Digitizing Speech

The Noisy Channel Model. CS 294-5: Statistical Natural Language Processing. Speech Recognition Architecture. Digitizing Speech CS 294-5: Statistical Natural Language Processing The Noisy Channel Model Speech Recognition II Lecture 21: 11/29/05 Search through space of all possible sentences. Pick the one that is most probable given

More information

Allpass Modeling of LP Residual for Speaker Recognition

Allpass Modeling of LP Residual for Speaker Recognition Allpass Modeling of LP Residual for Speaker Recognition K. Sri Rama Murty, Vivek Boominathan and Karthika Vijayan Department of Electrical Engineering, Indian Institute of Technology Hyderabad, India email:

More information

The Noisy Channel Model. Statistical NLP Spring Mel Freq. Cepstral Coefficients. Frame Extraction ... Lecture 9: Acoustic Models

The Noisy Channel Model. Statistical NLP Spring Mel Freq. Cepstral Coefficients. Frame Extraction ... Lecture 9: Acoustic Models Statistical NLP Spring 2010 The Noisy Channel Model Lecture 9: Acoustic Models Dan Klein UC Berkeley Acoustic model: HMMs over word positions with mixtures of Gaussians as emissions Language model: Distributions

More information

Estimation of Relative Operating Characteristics of Text Independent Speaker Verification

Estimation of Relative Operating Characteristics of Text Independent Speaker Verification International Journal of Engineering Science Invention Volume 1 Issue 1 December. 2012 PP.18-23 Estimation of Relative Operating Characteristics of Text Independent Speaker Verification Palivela Hema 1,

More information

Estimation of Cepstral Coefficients for Robust Speech Recognition

Estimation of Cepstral Coefficients for Robust Speech Recognition Estimation of Cepstral Coefficients for Robust Speech Recognition by Kevin M. Indrebo, B.S., M.S. A Dissertation submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment

More information

Hidden Markov Models

Hidden Markov Models Hidden Markov Models Slides mostly from Mitch Marcus and Eric Fosler (with lots of modifications). Have you seen HMMs? Have you seen Kalman filters? Have you seen dynamic programming? HMMs are dynamic

More information

Speech Recognition HMM

Speech Recognition HMM Speech Recognition HMM Jan Černocký, Valentina Hubeika {cernocky ihubeika}@fit.vutbr.cz FIT BUT Brno Speech Recognition HMM Jan Černocký, Valentina Hubeika, DCGM FIT BUT Brno 1/38 Agenda Recap variability

More information

Frog Sound Identification System for Frog Species Recognition

Frog Sound Identification System for Frog Species Recognition Frog Sound Identification System for Frog Species Recognition Clifford Loh Ting Yuan and Dzati Athiar Ramli Intelligent Biometric Research Group (IBG), School of Electrical and Electronic Engineering,

More information

speaker recognition using gmm-ubm semester project presentation

speaker recognition using gmm-ubm semester project presentation speaker recognition using gmm-ubm semester project presentation OBJECTIVES OF THE PROJECT study the GMM-UBM speaker recognition system implement this system with matlab document the code and how it interfaces

More information

Embedded Bernoulli Mixture HMMs for Handwritten Word Recognition

Embedded Bernoulli Mixture HMMs for Handwritten Word Recognition 2009 10th International Conference on Document Analysis and Recognition Embedded Bernoulli Mixture HMMs for Handwritten Word Recognition AdriàGiménez and Alfons Juan DSIC/ITI,Univ. Politècnica de València,

More information

Discrete Single Vs. Multiple Stream HMMs: A Comparative Evaluation of Their Use in On-Line Handwriting Recognition of Whiteboard Notes

Discrete Single Vs. Multiple Stream HMMs: A Comparative Evaluation of Their Use in On-Line Handwriting Recognition of Whiteboard Notes Discrete Single Vs. Multiple Stream HMMs: A Comparative Evaluation of Their Use in On-Line Handwriting Recognition of Whiteboard Notes Joachim Schenk, Stefan Schwärzler, and Gerhard Rigoll Institute for

More information

Part of Speech Tagging: Viterbi, Forward, Backward, Forward- Backward, Baum-Welch. COMP-599 Oct 1, 2015

Part of Speech Tagging: Viterbi, Forward, Backward, Forward- Backward, Baum-Welch. COMP-599 Oct 1, 2015 Part of Speech Tagging: Viterbi, Forward, Backward, Forward- Backward, Baum-Welch COMP-599 Oct 1, 2015 Announcements Research skills workshop today 3pm-4:30pm Schulich Library room 313 Start thinking about

More information

Brief Introduction of Machine Learning Techniques for Content Analysis

Brief Introduction of Machine Learning Techniques for Content Analysis 1 Brief Introduction of Machine Learning Techniques for Content Analysis Wei-Ta Chu 2008/11/20 Outline 2 Overview Gaussian Mixture Model (GMM) Hidden Markov Model (HMM) Support Vector Machine (SVM) Overview

More information

Hidden Markov Models

Hidden Markov Models CS769 Spring 2010 Advanced Natural Language Processing Hidden Markov Models Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu 1 Part-of-Speech Tagging The goal of Part-of-Speech (POS) tagging is to label each

More information

arxiv: v1 [cs.sd] 25 Oct 2014

arxiv: v1 [cs.sd] 25 Oct 2014 Choice of Mel Filter Bank in Computing MFCC of a Resampled Speech arxiv:1410.6903v1 [cs.sd] 25 Oct 2014 Laxmi Narayana M, Sunil Kumar Kopparapu TCS Innovation Lab - Mumbai, Tata Consultancy Services, Yantra

More information

L23: hidden Markov models

L23: hidden Markov models L23: hidden Markov models Discrete Markov processes Hidden Markov models Forward and Backward procedures The Viterbi algorithm This lecture is based on [Rabiner and Juang, 1993] Introduction to Speech

More information

Master 2 Informatique Probabilistic Learning and Data Analysis

Master 2 Informatique Probabilistic Learning and Data Analysis Master 2 Informatique Probabilistic Learning and Data Analysis Faicel Chamroukhi Maître de Conférences USTV, LSIS UMR CNRS 7296 email: chamroukhi@univ-tln.fr web: chamroukhi.univ-tln.fr 2013/2014 Faicel

More information

Fractal Dimension and Vector Quantization

Fractal Dimension and Vector Quantization Fractal Dimension and Vector Quantization [Extended Abstract] Krishna Kumaraswamy Center for Automated Learning and Discovery, Carnegie Mellon University skkumar@cs.cmu.edu Vasileios Megalooikonomou Department

More information

Comparing linear and non-linear transformation of speech

Comparing linear and non-linear transformation of speech Comparing linear and non-linear transformation of speech Larbi Mesbahi, Vincent Barreaud and Olivier Boeffard IRISA / ENSSAT - University of Rennes 1 6, rue de Kerampont, Lannion, France {lmesbahi, vincent.barreaud,

More information

MANY papers and books are devoted to modeling fading

MANY papers and books are devoted to modeling fading IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 16, NO. 9, DECEMBER 1998 1809 Hidden Markov Modeling of Flat Fading Channels William Turin, Senior Member, IEEE, Robert van Nobelen Abstract Hidden

More information

Multiscale Systems Engineering Research Group

Multiscale Systems Engineering Research Group Hidden Markov Model Prof. Yan Wang Woodruff School of Mechanical Engineering Georgia Institute of echnology Atlanta, GA 30332, U.S.A. yan.wang@me.gatech.edu Learning Objectives o familiarize the hidden

More information

Singer Identification using MFCC and LPC and its comparison for ANN and Naïve Bayes Classifiers

Singer Identification using MFCC and LPC and its comparison for ANN and Naïve Bayes Classifiers Singer Identification using MFCC and LPC and its comparison for ANN and Naïve Bayes Classifiers Kumari Rambha Ranjan, Kartik Mahto, Dipti Kumari,S.S.Solanki Dept. of Electronics and Communication Birla

More information

Fractal Dimension and Vector Quantization

Fractal Dimension and Vector Quantization Fractal Dimension and Vector Quantization Krishna Kumaraswamy a, Vasileios Megalooikonomou b,, Christos Faloutsos a a School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 523 b Department

More information

A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models

A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models Jeff A. Bilmes (bilmes@cs.berkeley.edu) International Computer Science Institute

More information

A Hidden Markov Model Based Procedure for Identifying Household Electric Loads

A Hidden Markov Model Based Procedure for Identifying Household Electric Loads A Hidden Markov Model Based Procedure for Identifying Household Electric Loads *Tehseen Zia, **Dietmar Bruckner (Senior Member, IEEE), *** Adeel Zaidi *Department of Computer Science and Information Technology,

More information

An Introduction to Bioinformatics Algorithms Hidden Markov Models

An Introduction to Bioinformatics Algorithms   Hidden Markov Models Hidden Markov Models Outline 1. CG-Islands 2. The Fair Bet Casino 3. Hidden Markov Model 4. Decoding Algorithm 5. Forward-Backward Algorithm 6. Profile HMMs 7. HMM Parameter Estimation 8. Viterbi Training

More information

Exploring the Discrete Wavelet Transform as a Tool for Hindi Speech Recognition

Exploring the Discrete Wavelet Transform as a Tool for Hindi Speech Recognition Exploring the Discrete Wavelet Transform as a Tool for Hindi Speech Recognition Shivesh Ranjan Abstract In this paper, we propose a new scheme for recognition of isolated words in Hindi Language speech,

More information

Research Article The Application of Baum-Welch Algorithm in Multistep Attack

Research Article The Application of Baum-Welch Algorithm in Multistep Attack e Scientific World Journal, Article ID 374260, 7 pages http://dx.doi.org/10.1155/2014/374260 Research Article The Application of Baum-Welch Algorithm in Multistep Attack Yanxue Zhang, 1 Dongmei Zhao, 2

More information

Isolated word recognition from in-ear microphone data using Hidden Markov Models (HMM)

Isolated word recognition from in-ear microphone data using Hidden Markov Models (HMM) Calhoun: The NPS Institutional Archive DSpace Repository Theses and Dissertations Thesis and Dissertation Collection 2006-03 Isolated word recognition from in-ear microphone data using Hidden Markov Models

More information

Lecture 7: Pitch and Chord (2) HMM, pitch detection functions. Li Su 2016/03/31

Lecture 7: Pitch and Chord (2) HMM, pitch detection functions. Li Su 2016/03/31 Lecture 7: Pitch and Chord (2) HMM, pitch detection functions Li Su 2016/03/31 Chord progressions Chord progressions are not arbitrary Example 1: I-IV-I-V-I (C-F-C-G-C) Example 2: I-V-VI-III-IV-I-II-V

More information

1. Probability density function for speech samples. Gamma. Laplacian. 2. Coding paradigms. =(2X max /2 B ) for a B-bit quantizer Δ Δ Δ Δ Δ

1. Probability density function for speech samples. Gamma. Laplacian. 2. Coding paradigms. =(2X max /2 B ) for a B-bit quantizer Δ Δ Δ Δ Δ Digital Speech Processing Lecture 16 Speech Coding Methods Based on Speech Waveform Representations and Speech Models Adaptive and Differential Coding 1 Speech Waveform Coding-Summary of Part 1 1. Probability

More information

CS 136 Lecture 5 Acoustic modeling Phoneme modeling

CS 136 Lecture 5 Acoustic modeling Phoneme modeling + September 9, 2016 Professor Meteer CS 136 Lecture 5 Acoustic modeling Phoneme modeling Thanks to Dan Jurafsky for these slides + Directly Modeling Continuous Observations n Gaussians n Univariate Gaussians

More information