Sparse Models for Speech Recognition
|
|
- Lenard Rose
- 5 years ago
- Views:
Transcription
1 Sparse Models for Speech Recognition Weibin Zhang and Pascale Fung Human Language Technology Center Hong Kong University of Science and Technology
2 Outline Introduction to speech recognition Motivations for sparse models Maximum likelihood training of sparse models ML training of sparse banded models Discriminative training of sparse models Conclusions ECE/HKUST Weibin Zhang 2
3 Speech Recognition & its Applications 1. Automatic Speech Recognition (ASR): Convert speech wave into text automatically 2. Applications: Office/business systems: Manufacturing Telecommunications Mobile telephony Home Automation Navigation ECE/HKUST Weibin Zhang 3
4 History of ASR Technical Point of View ECE/HKUST Weibin Zhang 4
5 ASR Research -- Overview Statistical approaches lead in all area. Still big gap between human and machine performance...however Useful systems have been built which are changing the way we interact with the world...within five years people will discard their keyboards and interact with computers using touch-screens and voice controls... Bill Gates, Feb 2008 ECE/HKUST Weibin Zhang 5
6 Statistical speech recognition system ECE/HKUST Weibin Zhang 6
7 Statistical speech recognition system Language Model: P( recognize speech ) >> P( wreck a nice beach ) Dictionary: Wreck r e k Beach b i th Acoustic Model: P(O recognize speech ) ECE/HKUST Weibin Zhang 7
8 Statistical speech recognition system ECE/HKUST Weibin Zhang 8
9 Acoustic modeling Left-to-right hidden Markov models (HMMs) GMM-HMM based acoustic models p o t s j = m c jm N(o t : μ jm, Σ jm ) Θ = a ij, b j (o t ) = {a ij, c jm, u jm, Σ jm } ECE/HKUST Weibin Zhang 9
10 Evaluation of ASR system Word error rate (WER) = 1 accuracy Real time factor (RTF) ECE/HKUST Weibin Zhang 10
11 Covariance modeling ECE/HKUST Weibin Zhang 11
12 Covariance modeling Less Training data ECE/HKUST Weibin Zhang 12
13 ML training of sparse models Maximum likelihood (ML) training Θ = argmax{log(p(o Θ))} Θ The proposed new objective function S 1 i=2 M m=1 L Θ = log(p(o Θ)) ρ C im 1 ECE/HKUST Weibin Zhang 13
14 Maximizing the auxiliary function The precision matrices can be updated using C im = argmax{logdetc im trace S im C im λ C im 1 } C im 0 λ = 2ρ and S γ im is the sample covariance matrix. im Convex optimization or other more efficient methods (e.g. graphical lasso) ECE/HKUST Weibin Zhang 14
15 Experiments on the WSJ data Experimental setup Training, development and testing data sets Standard bigram language model Feature vector: 39-dimension MFCC 39 phonemes for English (39 3 triphones) 2843 tied HMM states ECE/HKUST Weibin Zhang 15
16 Tuning results on the dev. data set ECE/HKUST Weibin Zhang 16
17 WER on the testing data set Our result of 8.77% WER is comparable to the 8.6% WER reported in (Ko & Mak, 2011) using a similar testing configuration, but using 70 hours of training data Model type #Gaussians WER Rel. improv. Significant? Full No Diagonal Sparse % Yes ECE/HKUST Weibin Zhang 17
18 Sparse banded models Sparse models Sparse models Sparse banded feature reorder models ECE/HKUST Weibin Zhang 18
19 Training of sparse banded models Weighted lasso: f C im = H C im 1 H k, l = C im k, l = 0 Diagonal Sparse banded Full ECE/HKUST Weibin Zhang 19
20 Importance of the feature order O~N μ, Σ ; C = Σ 1 ; C ij = 0 o i and o j are conditionally independent (CI), given other variables. Rearrange the feature order so that o i and o j are CI if i j > b Three orders are investigated: HTK order : m 1 m 13 Δm 1 Δm 13 ΔΔm 1 ΔΔm 13 Knowledge-based order : m 1 Δm 1 ΔΔm 1 m 13 Δm 13 ΔΔm 13 Data-driven order : m 1 ΔΔm 1 Δm 6 Δm 10 ECE/HKUST Weibin Zhang 20
21 Results on the development data ECE/HKUST Weibin Zhang 21
22 Results on the test data Model type #Gaussians WER Rel. improv. Significant? Full No Diagonal Sparse % Yes Band Yes ECE/HKUST Weibin Zhang 22
23 Decoding time Sparse banded modes are the fastest since: 1) smaller searching beamwidths; 2) less model parameters. ECE/HKUST Weibin Zhang 23
24 Discriminative training MMI objective function: Θ = Argmax{ log P w r O, Θ } Θ New Objective function S 1 L Θ = log P w r O, Θ ρ C im 1 i=2 M m=1 A valid weak-sense auxiliary function is Q Θ; Θ =Q n Θ; Θ -Q d Θ; Θ +Q s Θ; Θ +Q I Θ; Θ S 1 M ρ C im 1 i=2 m=1 Same as ML training Ensure stability Improve generalization Regularization term ECE/HKUST Weibin Zhang 24
25 Results on the WSJ testing data Model type #Gaussians ML training MMI Full Diagonal Diagonal+ STC Sparse ECE/HKUST Weibin Zhang 25
26 Summary Sparse models are effective in dealing with the problems that conventional diagonal and full covariance models face: computation, incorrect model assumptions and over-fitting when training data is insufficient. We derive the overall training process under the HMM framework using both maximum likelihood training and discriminative training. The proposed sparse models subsume the traditional diagonal and full covariance models as special cases. ECE/HKUST Weibin Zhang 26
27 Thank you! ECE/HKUST Weibin Zhang 27
Hidden Markov Model and Speech Recognition
1 Dec,2006 Outline Introduction 1 Introduction 2 3 4 5 Introduction What is Speech Recognition? Understanding what is being said Mapping speech data to textual information Speech Recognition is indeed
More informationExperiments with a Gaussian Merging-Splitting Algorithm for HMM Training for Speech Recognition
Experiments with a Gaussian Merging-Splitting Algorithm for HMM Training for Speech Recognition ABSTRACT It is well known that the expectation-maximization (EM) algorithm, commonly used to estimate hidden
More informationSpeech and Language Processing. Chapter 9 of SLP Automatic Speech Recognition (II)
Speech and Language Processing Chapter 9 of SLP Automatic Speech Recognition (II) Outline for ASR ASR Architecture The Noisy Channel Model Five easy pieces of an ASR system 1) Language Model 2) Lexicon/Pronunciation
More informationStatistical Machine Learning from Data
Samy Bengio Statistical Machine Learning from Data Statistical Machine Learning from Data Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole Polytechnique Fédérale de Lausanne (EPFL),
More informationThe Noisy Channel Model. CS 294-5: Statistical Natural Language Processing. Speech Recognition Architecture. Digitizing Speech
CS 294-5: Statistical Natural Language Processing The Noisy Channel Model Speech Recognition II Lecture 21: 11/29/05 Search through space of all possible sentences. Pick the one that is most probable given
More informationWhy DNN Works for Acoustic Modeling in Speech Recognition?
Why DNN Works for Acoustic Modeling in Speech Recognition? Prof. Hui Jiang Department of Computer Science and Engineering York University, Toronto, Ont. M3J 1P3, CANADA Joint work with Y. Bao, J. Pan,
More informationAugmented Statistical Models for Speech Recognition
Augmented Statistical Models for Speech Recognition Mark Gales & Martin Layton 31 August 2005 Trajectory Models For Speech Processing Workshop Overview Dependency Modelling in Speech Recognition: latent
More informationLecture 3: ASR: HMMs, Forward, Viterbi
Original slides by Dan Jurafsky CS 224S / LINGUIST 285 Spoken Language Processing Andrew Maas Stanford University Spring 2017 Lecture 3: ASR: HMMs, Forward, Viterbi Fun informative read on phonetics The
More informationAutomatic Speech Recognition (CS753)
Automatic Speech Recognition (CS753) Lecture 6: Hidden Markov Models (Part II) Instructor: Preethi Jyothi Aug 10, 2017 Recall: Computing Likelihood Problem 1 (Likelihood): Given an HMM l =(A, B) and an
More informationCovariance Matrix Enhancement Approach to Train Robust Gaussian Mixture Models of Speech Data
Covariance Matrix Enhancement Approach to Train Robust Gaussian Mixture Models of Speech Data Jan Vaněk, Lukáš Machlica, Josef V. Psutka, Josef Psutka University of West Bohemia in Pilsen, Univerzitní
More informationThe Noisy Channel Model. Statistical NLP Spring Mel Freq. Cepstral Coefficients. Frame Extraction ... Lecture 9: Acoustic Models
Statistical NLP Spring 2010 The Noisy Channel Model Lecture 9: Acoustic Models Dan Klein UC Berkeley Acoustic model: HMMs over word positions with mixtures of Gaussians as emissions Language model: Distributions
More informationHidden Markov Models and Gaussian Mixture Models
Hidden Markov Models and Gaussian Mixture Models Hiroshi Shimodaira and Steve Renals Automatic Speech Recognition ASR Lectures 4&5 23&27 January 2014 ASR Lectures 4&5 Hidden Markov Models and Gaussian
More informationUniversity of Cambridge. MPhil in Computer Speech Text & Internet Technology. Module: Speech Processing II. Lecture 2: Hidden Markov Models I
University of Cambridge MPhil in Computer Speech Text & Internet Technology Module: Speech Processing II Lecture 2: Hidden Markov Models I o o o o o 1 2 3 4 T 1 b 2 () a 12 2 a 3 a 4 5 34 a 23 b () b ()
More informationCS 136 Lecture 5 Acoustic modeling Phoneme modeling
+ September 9, 2016 Professor Meteer CS 136 Lecture 5 Acoustic modeling Phoneme modeling Thanks to Dan Jurafsky for these slides + Directly Modeling Continuous Observations n Gaussians n Univariate Gaussians
More informationStatistical NLP Spring Digitizing Speech
Statistical NLP Spring 2008 Lecture 10: Acoustic Models Dan Klein UC Berkeley Digitizing Speech 1 Frame Extraction A frame (25 ms wide) extracted every 10 ms 25 ms 10ms... a 1 a 2 a 3 Figure from Simon
More informationDigitizing Speech. Statistical NLP Spring Frame Extraction. Gaussian Emissions. Vector Quantization. HMMs for Continuous Observations? ...
Statistical NLP Spring 2008 Digitizing Speech Lecture 10: Acoustic Models Dan Klein UC Berkeley Frame Extraction A frame (25 ms wide extracted every 10 ms 25 ms 10ms... a 1 a 2 a 3 Figure from Simon Arnfield
More informationThe Noisy Channel Model and Markov Models
1/24 The Noisy Channel Model and Markov Models Mark Johnson September 3, 2014 2/24 The big ideas The story so far: machine learning classifiers learn a function that maps a data item X to a label Y handle
More informationThe Noisy Channel Model. Statistical NLP Spring Mel Freq. Cepstral Coefficients. Frame Extraction ... Lecture 10: Acoustic Models
Statistical NLP Spring 2009 The Noisy Channel Model Lecture 10: Acoustic Models Dan Klein UC Berkeley Search through space of all possible sentences. Pick the one that is most probable given the waveform.
More informationStatistical NLP Spring The Noisy Channel Model
Statistical NLP Spring 2009 Lecture 10: Acoustic Models Dan Klein UC Berkeley The Noisy Channel Model Search through space of all possible sentences. Pick the one that is most probable given the waveform.
More informationHidden Markov Models in Language Processing
Hidden Markov Models in Language Processing Dustin Hillard Lecture notes courtesy of Prof. Mari Ostendorf Outline Review of Markov models What is an HMM? Examples General idea of hidden variables: implications
More informationSpeaker adaptation based on regularized speaker-dependent eigenphone matrix estimation
Zhang et al EURASIP Journal on Audio, Speech, and Music Processing 2014, 2014:11 RESEARCH Speaker adaptation based on regularized speaker-dependent eigenphone matrix estimation Wen-Lin Zhang 1*,Wei-QiangZhang
More informationForward algorithm vs. particle filtering
Particle Filtering ØSometimes X is too big to use exact inference X may be too big to even store B(X) E.g. X is continuous X 2 may be too big to do updates ØSolution: approximate inference Track samples
More informationShankar Shivappa University of California, San Diego April 26, CSE 254 Seminar in learning algorithms
Recognition of Visual Speech Elements Using Adaptively Boosted Hidden Markov Models. Say Wei Foo, Yong Lian, Liang Dong. IEEE Transactions on Circuits and Systems for Video Technology, May 2004. Shankar
More informationCS 136a Lecture 7 Speech Recognition Architecture: Training models with the Forward backward algorithm
+ September13, 2016 Professor Meteer CS 136a Lecture 7 Speech Recognition Architecture: Training models with the Forward backward algorithm Thanks to Dan Jurafsky for these slides + ASR components n Feature
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning Undirected Graphical Models Mark Schmidt University of British Columbia Winter 2016 Admin Assignment 3: 2 late days to hand it in today, Thursday is final day. Assignment 4:
More informationDiagonal Priors for Full Covariance Speech Recognition
Diagonal Priors for Full Covariance Speech Recognition Peter Bell 1, Simon King 2 Centre for Speech Technology Research, University of Edinburgh Informatics Forum, 10 Crichton St, Edinburgh, EH8 9AB, UK
More informationHidden Markov Modelling
Hidden Markov Modelling Introduction Problem formulation Forward-Backward algorithm Viterbi search Baum-Welch parameter estimation Other considerations Multiple observation sequences Phone-based models
More informationEngineering Part IIB: Module 4F11 Speech and Language Processing Lectures 4/5 : Speech Recognition Basics
Engineering Part IIB: Module 4F11 Speech and Language Processing Lectures 4/5 : Speech Recognition Basics Phil Woodland: pcw@eng.cam.ac.uk Lent 2013 Engineering Part IIB: Module 4F11 What is Speech Recognition?
More informationSTA 414/2104: Machine Learning
STA 414/2104: Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistics! rsalakhu@cs.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 9 Sequential Data So far
More informationReformulating the HMM as a trajectory model by imposing explicit relationship between static and dynamic features
Reformulating the HMM as a trajectory model by imposing explicit relationship between static and dynamic features Heiga ZEN (Byung Ha CHUN) Nagoya Inst. of Tech., Japan Overview. Research backgrounds 2.
More information] Automatic Speech Recognition (CS753)
] Automatic Speech Recognition (CS753) Lecture 17: Discriminative Training for HMMs Instructor: Preethi Jyothi Sep 28, 2017 Discriminative Training Recall: MLE for HMMs Maximum likelihood estimation (MLE)
More informationHidden Markov Models
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Hidden Markov Models Matt Gormley Lecture 22 April 2, 2018 1 Reminders Homework
More informationHidden Markov Models Hamid R. Rabiee
Hidden Markov Models Hamid R. Rabiee 1 Hidden Markov Models (HMMs) In the previous slides, we have seen that in many cases the underlying behavior of nature could be modeled as a Markov process. However
More informationGraphical models for part of speech tagging
Indian Institute of Technology, Bombay and Research Division, India Research Lab Graphical models for part of speech tagging Different Models for POS tagging HMM Maximum Entropy Markov Models Conditional
More informationHidden Markov Models and Gaussian Mixture Models
Hidden Markov Models and Gaussian Mixture Models Hiroshi Shimodaira and Steve Renals Automatic Speech Recognition ASR Lectures 4&5 25&29 January 2018 ASR Lectures 4&5 Hidden Markov Models and Gaussian
More informationLecture 5: GMM Acoustic Modeling and Feature Extraction
CS 224S / LINGUIST 285 Spoken Language Processing Andrew Maas Stanford University Spring 2017 Lecture 5: GMM Acoustic Modeling and Feature Extraction Original slides by Dan Jurafsky Outline for Today Acoustic
More informationLecture 10. Discriminative Training, ROVER, and Consensus. Michael Picheny, Bhuvana Ramabhadran, Stanley F. Chen
Lecture 10 Discriminative Training, ROVER, and Consensus Michael Picheny, Bhuvana Ramabhadran, Stanley F. Chen IBM T.J. Watson Research Center Yorktown Heights, New York, USA {picheny,bhuvana,stanchen}@us.ibm.com
More informationA TWO-LAYER NON-NEGATIVE MATRIX FACTORIZATION MODEL FOR VOCABULARY DISCOVERY. MengSun,HugoVanhamme
A TWO-LAYER NON-NEGATIVE MATRIX FACTORIZATION MODEL FOR VOCABULARY DISCOVERY MengSun,HugoVanhamme Department of Electrical Engineering-ESAT, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, Bus
More informationDetection-Based Speech Recognition with Sparse Point Process Models
Detection-Based Speech Recognition with Sparse Point Process Models Aren Jansen Partha Niyogi Human Language Technology Center of Excellence Departments of Computer Science and Statistics ICASSP 2010 Dallas,
More informationCS 188: Artificial Intelligence Fall 2011
CS 188: Artificial Intelligence Fall 2011 Lecture 20: HMMs / Speech / ML 11/8/2011 Dan Klein UC Berkeley Today HMMs Demo bonanza! Most likely explanation queries Speech recognition A massive HMM! Details
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 11 Project
More informationA Direct Criterion Minimization based fmllr via Gradient Descend
A Direct Criterion Minimization based fmllr via Gradient Descend Jan Vaněk and Zbyněk Zajíc University of West Bohemia in Pilsen, Univerzitní 22, 306 14 Pilsen Faculty of Applied Sciences, Department of
More informationCourse 495: Advanced Statistical Machine Learning/Pattern Recognition
Course 495: Advanced Statistical Machine Learning/Pattern Recognition Lecturer: Stefanos Zafeiriou Goal (Lectures): To present discrete and continuous valued probabilistic linear dynamical systems (HMMs
More informationFull-covariance model compensation for
compensation transms Presentation Toshiba, 12 Mar 2008 Outline compensation transms compensation transms Outline compensation transms compensation transms Noise model x clean speech; n additive ; h convolutional
More informationAn Introduction to Bioinformatics Algorithms Hidden Markov Models
Hidden Markov Models Outline 1. CG-Islands 2. The Fair Bet Casino 3. Hidden Markov Model 4. Decoding Algorithm 5. Forward-Backward Algorithm 6. Profile HMMs 7. HMM Parameter Estimation 8. Viterbi Training
More information1. Markov models. 1.1 Markov-chain
1. Markov models 1.1 Markov-chain Let X be a random variable X = (X 1,..., X t ) taking values in some set S = {s 1,..., s N }. The sequence is Markov chain if it has the following properties: 1. Limited
More informationGraphical Models for Collaborative Filtering
Graphical Models for Collaborative Filtering Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Sequence modeling HMM, Kalman Filter, etc.: Similarity: the same graphical model topology,
More informationAn Evolutionary Programming Based Algorithm for HMM training
An Evolutionary Programming Based Algorithm for HMM training Ewa Figielska,Wlodzimierz Kasprzak Institute of Control and Computation Engineering, Warsaw University of Technology ul. Nowowiejska 15/19,
More informationHIDDEN MARKOV MODELS IN SPEECH RECOGNITION
HIDDEN MARKOV MODELS IN SPEECH RECOGNITION Wayne Ward Carnegie Mellon University Pittsburgh, PA 1 Acknowledgements Much of this talk is derived from the paper "An Introduction to Hidden Markov Models",
More informationLow development cost, high quality speech recognition for new languages and domains. Cheap ASR
Low development cost, high quality speech recognition for new languages and domains Cheap ASR Participants: Senior members : Lukas Burget, Nagendra Kumar Goel, Daniel Povey, Richard Rose Graduate students:
More informationA New OCR System Similar to ASR System
A ew OCR System Similar to ASR System Abstract Optical character recognition (OCR) system is created using the concepts of automatic speech recognition where the hidden Markov Model is widely used. Results
More informationPHONEME CLASSIFICATION OVER THE RECONSTRUCTED PHASE SPACE USING PRINCIPAL COMPONENT ANALYSIS
PHONEME CLASSIFICATION OVER THE RECONSTRUCTED PHASE SPACE USING PRINCIPAL COMPONENT ANALYSIS Jinjin Ye jinjin.ye@mu.edu Michael T. Johnson mike.johnson@mu.edu Richard J. Povinelli richard.povinelli@mu.edu
More informationFactor analysed hidden Markov models for speech recognition
Factor analysed hidden Markov models for speech recognition A-V.I. Rosti, M.J.F. Gales Cambridge University Engineering Department, Trumpington Street, Cambridge, CB2 1PZ, UK Abstract Recently various
More informationHidden Markov Models
Hidden Markov Models Outline 1. CG-Islands 2. The Fair Bet Casino 3. Hidden Markov Model 4. Decoding Algorithm 5. Forward-Backward Algorithm 6. Profile HMMs 7. HMM Parameter Estimation 8. Viterbi Training
More informationHidden Markov Models
Hidden Markov Models Dr Philip Jackson Centre for Vision, Speech & Signal Processing University of Surrey, UK 1 3 2 http://www.ee.surrey.ac.uk/personal/p.jackson/isspr/ Outline 1. Recognizing patterns
More informationCSC321 Lecture 7 Neural language models
CSC321 Lecture 7 Neural language models Roger Grosse and Nitish Srivastava February 1, 2015 Roger Grosse and Nitish Srivastava CSC321 Lecture 7 Neural language models February 1, 2015 1 / 19 Overview We
More informationspeaker recognition using gmm-ubm semester project presentation
speaker recognition using gmm-ubm semester project presentation OBJECTIVES OF THE PROJECT study the GMM-UBM speaker recognition system implement this system with matlab document the code and how it interfaces
More informationChapter 3: Basics of Language Modelling
Chapter 3: Basics of Language Modelling Motivation Language Models are used in Speech Recognition Machine Translation Natural Language Generation Query completion For research and development: need a simple
More informationFEATURE PRUNING IN LIKELIHOOD EVALUATION OF HMM-BASED SPEECH RECOGNITION. Xiao Li and Jeff Bilmes
FEATURE PRUNING IN LIKELIHOOD EVALUATION OF HMM-BASED SPEECH RECOGNITION Xiao Li and Jeff Bilmes Department of Electrical Engineering University. of Washington, Seattle {lixiao, bilmes}@ee.washington.edu
More informationStructured Precision Matrix Modelling for Speech Recognition
Structured Precision Matrix Modelling for Speech Recognition Khe Chai Sim Churchill College and Cambridge University Engineering Department 6th April 2006 Dissertation submitted to the University of Cambridge
More informationIndependent Component Analysis and Unsupervised Learning
Independent Component Analysis and Unsupervised Learning Jen-Tzung Chien National Cheng Kung University TABLE OF CONTENTS 1. Independent Component Analysis 2. Case Study I: Speech Recognition Independent
More information10. Hidden Markov Models (HMM) for Speech Processing. (some slides taken from Glass and Zue course)
10. Hidden Markov Models (HMM) for Speech Processing (some slides taken from Glass and Zue course) Definition of an HMM The HMM are powerful statistical methods to characterize the observed samples of
More informationRobust Speaker Identification
Robust Speaker Identification by Smarajit Bose Interdisciplinary Statistical Research Unit Indian Statistical Institute, Kolkata Joint work with Amita Pal and Ayanendranath Basu Overview } } } } } } }
More informationHidden Markov Models
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Hidden Markov Models Matt Gormley Lecture 19 Nov. 5, 2018 1 Reminders Homework
More informationAutomatic Speech Recognition (CS753)
Automatic Speech Recognition (CS753) Lecture 12: Acoustic Feature Extraction for ASR Instructor: Preethi Jyothi Feb 13, 2017 Speech Signal Analysis Generate discrete samples A frame Need to focus on short
More informationDeep Learning for Speech Recognition. Hung-yi Lee
Deep Learning for Speech Recognition Hung-yi Lee Outline Conventional Speech Recognition How to use Deep Learning in acoustic modeling? Why Deep Learning? Speaker Adaptation Multi-task Deep Learning New
More informationLinear Transforms in Automatic Speech Recognition: Estimation Procedures & Integration of Diverse Acoustic Data
Linear Transforms in Automatic Speech Recognition: Estimation Procedures & Stavros Tsakalidis Advisor: Prof. William Byrne Thesis Committee: Prof. William Byrne, Prof. Sanjeev Khudanpur, Prof. Paul Sotiriadis,
More informationAutomatic Speech Recognition and Statistical Machine Translation under Uncertainty
Outlines Automatic Speech Recognition and Statistical Machine Translation under Uncertainty Lambert Mathias Advisor: Prof. William Byrne Thesis Committee: Prof. Gerard Meyer, Prof. Trac Tran and Prof.
More informationOutline of Today s Lecture
University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005 Jeff A. Bilmes Lecture 12 Slides Feb 23 rd, 2005 Outline of Today s
More informationExpectation Maximization (EM)
Expectation Maximization (EM) The EM algorithm is used to train models involving latent variables using training data in which the latent variables are not observed (unlabeled data). This is to be contrasted
More informationAutomatic Speech Recognition (CS753)
Automatic Speech Recognition (CS753) Lecture 21: Speaker Adaptation Instructor: Preethi Jyothi Oct 23, 2017 Speaker variations Major cause of variability in speech is the differences between speakers Speaking
More informationIntroduction to Neural Networks
Introduction to Neural Networks Steve Renals Automatic Speech Recognition ASR Lecture 10 24 February 2014 ASR Lecture 10 Introduction to Neural Networks 1 Neural networks for speech recognition Introduction
More informationNote Set 5: Hidden Markov Models
Note Set 5: Hidden Markov Models Probabilistic Learning: Theory and Algorithms, CS 274A, Winter 2016 1 Hidden Markov Models (HMMs) 1.1 Introduction Consider observed data vectors x t that are d-dimensional
More informationSYMBOL RECOGNITION IN HANDWRITTEN MATHEMATI- CAL FORMULAS
SYMBOL RECOGNITION IN HANDWRITTEN MATHEMATI- CAL FORMULAS Hans-Jürgen Winkler ABSTRACT In this paper an efficient on-line recognition system for handwritten mathematical formulas is proposed. After formula
More informationStatistical Pattern Recognition
Statistical Pattern Recognition Expectation Maximization (EM) and Mixture Models Hamid R. Rabiee Jafar Muhammadi, Mohammad J. Hosseini Spring 2014 http://ce.sharif.edu/courses/92-93/2/ce725-2 Agenda Expectation-maximization
More informationChapter 3: Basics of Language Modeling
Chapter 3: Basics of Language Modeling Section 3.1. Language Modeling in Automatic Speech Recognition (ASR) All graphs in this section are from the book by Schukat-Talamazzini unless indicated otherwise
More informationDiscriminative models for speech recognition
Discriminative models for speech recognition Anton Ragni Peterhouse University of Cambridge A thesis submitted for the degree of Doctor of Philosophy 2013 Declaration This dissertation is the result of
More informationHidden Markov Models
CS769 Spring 2010 Advanced Natural Language Processing Hidden Markov Models Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu 1 Part-of-Speech Tagging The goal of Part-of-Speech (POS) tagging is to label each
More informationGraphical Models for Automatic Speech Recognition
Graphical Models for Automatic Speech Recognition Advanced Signal Processing SE 2, SS05 Stefan Petrik Signal Processing and Speech Communication Laboratory Graz University of Technology GMs for Automatic
More informationSegmental Recurrent Neural Networks for End-to-end Speech Recognition
Segmental Recurrent Neural Networks for End-to-end Speech Recognition Liang Lu, Lingpeng Kong, Chris Dyer, Noah Smith and Steve Renals TTI-Chicago, UoE, CMU and UW 9 September 2016 Background A new wave
More informationSpeech Recognition Lecture 8: Expectation-Maximization Algorithm, Hidden Markov Models.
Speech Recognition Lecture 8: Expectation-Maximization Algorithm, Hidden Markov Models. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.com This Lecture Expectation-Maximization (EM)
More informationA Bayesian Perspective on Residential Demand Response Using Smart Meter Data
A Bayesian Perspective on Residential Demand Response Using Smart Meter Data Datong-Paul Zhou, Maximilian Balandat, and Claire Tomlin University of California, Berkeley [datong.zhou, balandat, tomlin]@eecs.berkeley.edu
More informationImproving the Multi-Stack Decoding Algorithm in a Segment-based Speech Recognizer
Improving the Multi-Stack Decoding Algorithm in a Segment-based Speech Recognizer Gábor Gosztolya, András Kocsor Research Group on Artificial Intelligence of the Hungarian Academy of Sciences and University
More informationUsing Sub-Phonemic Units for HMM Based Phone Recognition
Jarle Bauck Hamar Using Sub-Phonemic Units for HMM Based Phone Recognition Thesis for the degree of Philosophiae Doctor Trondheim, June 2013 Norwegian University of Science and Technology Faculty of Information
More informationHuman Mobility Pattern Prediction Algorithm using Mobile Device Location and Time Data
Human Mobility Pattern Prediction Algorithm using Mobile Device Location and Time Data 0. Notations Myungjun Choi, Yonghyun Ro, Han Lee N = number of states in the model T = length of observation sequence
More informationHeeyoul (Henry) Choi. Dept. of Computer Science Texas A&M University
Heeyoul (Henry) Choi Dept. of Computer Science Texas A&M University hchoi@cs.tamu.edu Introduction Speaker Adaptation Eigenvoice Comparison with others MAP, MLLR, EMAP, RMP, CAT, RSW Experiments Future
More informationComputational Genomics and Molecular Biology, Fall
Computational Genomics and Molecular Biology, Fall 2011 1 HMM Lecture Notes Dannie Durand and Rose Hoberman October 11th 1 Hidden Markov Models In the last few lectures, we have focussed on three problems
More informationWeighted Finite-State Transducers in Computational Biology
Weighted Finite-State Transducers in Computational Biology Mehryar Mohri Courant Institute of Mathematical Sciences mohri@cims.nyu.edu Joint work with Corinna Cortes (Google Research). 1 This Tutorial
More informationEstimation of Cepstral Coefficients for Robust Speech Recognition
Estimation of Cepstral Coefficients for Robust Speech Recognition by Kevin M. Indrebo, B.S., M.S. A Dissertation submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment
More informationBrief Introduction of Machine Learning Techniques for Content Analysis
1 Brief Introduction of Machine Learning Techniques for Content Analysis Wei-Ta Chu 2008/11/20 Outline 2 Overview Gaussian Mixture Model (GMM) Hidden Markov Model (HMM) Support Vector Machine (SVM) Overview
More informationHMM part 1. Dr Philip Jackson
Centre for Vision Speech & Signal Processing University of Surrey, Guildford GU2 7XH. HMM part 1 Dr Philip Jackson Probability fundamentals Markov models State topology diagrams Hidden Markov models -
More informationMultimodal Deep Learning for Predicting Survival from Breast Cancer
Multimodal Deep Learning for Predicting Survival from Breast Cancer Heather Couture Deep Learning Journal Club Nov. 16, 2016 Outline Background on tumor histology & genetic data Background on survival
More informationTemporal Modeling and Basic Speech Recognition
UNIVERSITY ILLINOIS @ URBANA-CHAMPAIGN OF CS 498PS Audio Computing Lab Temporal Modeling and Basic Speech Recognition Paris Smaragdis paris@illinois.edu paris.cs.illinois.edu Today s lecture Recognizing
More informationSubspace based/universal Background Model (UBM) based speech modeling This paper is available at
Subspace based/universal Background Model (UBM) based speech modeling This paper is available at http://dpovey.googlepages.com/jhu_lecture2.pdf Daniel Povey June, 2009 1 Overview Introduce the concept
More informationCross-Lingual Language Modeling for Automatic Speech Recogntion
GBO Presentation Cross-Lingual Language Modeling for Automatic Speech Recogntion November 14, 2003 Woosung Kim woosung@cs.jhu.edu Center for Language and Speech Processing Dept. of Computer Science The
More informationIndependent Component Analysis and Unsupervised Learning. Jen-Tzung Chien
Independent Component Analysis and Unsupervised Learning Jen-Tzung Chien TABLE OF CONTENTS 1. Independent Component Analysis 2. Case Study I: Speech Recognition Independent voices Nonparametric likelihood
More informationHidden Markov Models
Hidden Markov Models Lecture Notes Speech Communication 2, SS 2004 Erhard Rank/Franz Pernkopf Signal Processing and Speech Communication Laboratory Graz University of Technology Inffeldgasse 16c, A-8010
More informationUnsupervised Learning of Hierarchical Models. in collaboration with Josh Susskind and Vlad Mnih
Unsupervised Learning of Hierarchical Models Marc'Aurelio Ranzato Geoff Hinton in collaboration with Josh Susskind and Vlad Mnih Advanced Machine Learning, 9 March 2011 Example: facial expression recognition
More informationEigenvoice Speaker Adaptation via Composite Kernel PCA
Eigenvoice Speaker Adaptation via Composite Kernel PCA James T. Kwok, Brian Mak and Simon Ho Department of Computer Science Hong Kong University of Science and Technology Clear Water Bay, Hong Kong [jamesk,mak,csho]@cs.ust.hk
More informationASR using Hidden Markov Model : A tutorial
ASR using Hidden Markov Model : A tutorial Samudravijaya K Workshop on ASR @BAMU; 14-OCT-11 samudravijaya@gmail.com Tata Institute of Fundamental Research Samudravijaya K Workshop on ASR @BAMU; 14-OCT-11
More information