University of Birmingham Research Archive
University of Birmingham Research Archive e-theses repository

This unpublished thesis/dissertation is copyright of the author and/or third parties. The intellectual property rights of the author or third parties in respect of this work are as defined by The Copyright Designs and Patents Act 1988 or as modified by any successor legislation. Any use made of information contained in this thesis/dissertation must be in accordance with that legislation and must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the permission of the copyright holder.
[Figure: thesis structure. Chapter 1: Introduction. Chapter 2: Literature and review of techniques. Chapter 3: Corpora descriptions (microphone recorded speech signals; telephony recorded speech signals). Chapter 4: Base-line systems and system evaluations. Chapter 5: Accent ID vs Speaker ID. Chapter 6: Speaker recognition. Chapter 7: Gender ID and Age-group ID. Chapter 8: Conclusion. Branch labels in the figure: adult speech, child speech, microphone recorded speech signals, full-band experiments, sub-band experiments, human experiments.]
[Figure: gain versus frequency (Hz).]
[Figure: SVM separating hyperplane w·x + b = 0 with its margin hyperplanes and margin width 2/‖w‖.]
[Figure: MAP adaptation. The UBM is adapted to the features of a given speaker S, giving a speaker/session-dependent GMM and the corresponding speaker/session-dependent mean supervector.]
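[Figure: i-vector system. Unlabelled data are used to train the UBM and the T matrix. A test utterance passes through front-end processing, supervector extraction and i-vector extraction, followed by optional normalization and compensation techniques such as LDA and length normalization. The resulting i-vector is scored against the model i-vector.]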
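The scoring stage of the i-vector pipeline outlined above can be illustrated with a minimal sketch. It assumes cosine similarity between length-normalized i-vectors. The thesis's own scoring formula did not survive extraction, so cosine scoring is an assumption rather than the author's confirmed method. The function names (`length_normalize`, `cosine_score`, `identify`) are hypothetical, i-vectors are represented as plain lists of floats, and the LDA step is omitted.

```python
import math


def length_normalize(vec):
    """Scale a vector to unit Euclidean length (length normalization)."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot length-normalize a zero vector")
    return [v / norm for v in vec]


def cosine_score(model_ivector, test_ivector):
    """Cosine similarity between two i-vectors (assumed scoring rule)."""
    a = length_normalize(model_ivector)
    b = length_normalize(test_ivector)
    return sum(x * y for x, y in zip(a, b))


def identify(models, test_ivector):
    """Return the label whose model i-vector scores highest against the test i-vector."""
    return max(models, key=lambda label: cosine_score(models[label], test_ivector))
```

For identification, `identify` takes a dictionary mapping each class label (speaker, gender or age group) to its model i-vector and returns the best-scoring label. For verification, `cosine_score` would instead be compared against a threshold.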
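[Figure: i-vector classification system. I-vectors are extracted from the training utterances X1 ... Xn with class labels Y1 ... Ym, passed through LDA and length normalization, and used to build Model 1 ... Model m. The test utterance goes through i-vector extraction, LDA and length normalization before the decision.]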
Table: CSLU Kids Corpus partitions (total number of speakers: 1118)
Age-Group ID: Train 352 spk.; Test 766 spk.; Evaluation 766 spk. (n-1 file per spk.)*
Gender ID: Train 430 spk.; Test 687 spk.; Evaluation 687 spk. (n-1 file per spk.)*
Speaker ID (identifying a child in school): Train and evaluation 918 spk.; Test 100 spk.; Evaluation 100 spk.
Gender balance entries: 50% male and 50% female; 54.2% male and 45.8% female; 50% male and 50% female; 55.7% male and 44.3% female; 50% male and 50% female.
Table: TIMIT Corpus partitions (total number of speakers: 630; 438 male + 192 female)
Speaker Identification: Train and evaluation 530 spk.; Test 100 spk.
[Figure: speaker detection performance. Axes: false alarm probability (in %) and miss probability (in %).]
[Figure: i-vector system block diagram. Blocks: development features, UBM training, computing statistics, T-matrix training, extracting i-vectors, LDA, training i-vectors, test i-vectors, scoring, decision.]
[Figures: false negative rate (FNR) [%] versus false positive rate (FPR) [%]; scores by model index and test index.]
[Figure: i-vector SVM system. I-vectors are extracted from the training utterances X1 ... Xn with class labels Y1 ... Ym and passed to an SVM, which makes the decision for the i-vector of the test utterance.]
[Figures: identification rate (%) per sub-band for SID with 30-second, 10-second and 3-second test segments, with regions A, B, C and D marked; NSID and NAID per sub-band, with regions A, B, C and D marked.]
[Figures: EER (%) with 90% confidence interval versus number of mixture components; EER (%) with 90% confidence interval versus frequency (Hz).]
[Figures: EER (%) and identification rate (%) per sub-band for the GMM-UBM and GMM-SVM systems (64 mixture components), with regions B1, B2 and B3 marked; correlation matrices for GMM-UBM and GMM-SVM; identification rate (%) per sub-band for kindergarten to 2nd grade, 3rd to 6th grade, and 7th to 10th grade speakers.]
[Figures: identification rate (%) per sub-band (S1 to S21) with full-bandwidth (FB) performance, for age groups AG1 (5 to 9 years old), AG2 (9 to 13 years old) and AG3 (13 to 16 years old); identification rate (%) per sub-band (S1 to S21); normalised Gender ID and normalised Age ID per sub-band.]
Front-End Factor Analysis For Speaker Verification
IEEE TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING Front-End Factor Analysis For Speaker Verification Najim Dehak, Patrick Kenny, Réda Dehak, Pierre Dumouchel, and Pierre Ouellet, Abstract This
More informationHow to Deal with Multiple-Targets in Speaker Identification Systems?
How to Deal with Multiple-Targets in Speaker Identification Systems? Yaniv Zigel and Moshe Wasserblat ICE Systems Ltd., Audio Analysis Group, P.O.B. 690 Ra anana 4307, Israel yanivz@nice.com Abstract In
More informationTNO SRE-2008: Calibration over all trials and side-information
Image from Dr Seuss TNO SRE-2008: Calibration over all trials and side-information David van Leeuwen (TNO, ICSI) Howard Lei (ICSI), Nir Krause (PRS), Albert Strasheim (SUN) Niko Brümmer (SDV) Knowledge
More informationi-vector and GMM-UBM Bie Fanhu CSLT, RIIT, THU
i-vector and GMM-UBM Bie Fanhu CSLT, RIIT, THU 2013-11-18 Framework 1. GMM-UBM Feature is extracted by frame. Number of features are unfixed. Gaussian Mixtures are used to fit all the features. The mixtures
More informationSession Variability Compensation in Automatic Speaker Recognition
Session Variability Compensation in Automatic Speaker Recognition Javier González Domínguez VII Jornadas MAVIR Universidad Autónoma de Madrid November 2012 Outline 1. The Inter-session Variability Problem
More informationIBM Research Report. Training Universal Background Models for Speaker Recognition
RC24953 (W1003-002) March 1, 2010 Other IBM Research Report Training Universal Bacground Models for Speaer Recognition Mohamed Kamal Omar, Jason Pelecanos IBM Research Division Thomas J. Watson Research
More informationISCA Archive
ISCA Archive http://www.isca-speech.org/archive ODYSSEY04 - The Speaker and Language Recognition Workshop Toledo, Spain May 3 - June 3, 2004 Analysis of Multitarget Detection for Speaker and Language Recognition*
More informationRobust Speaker Identification
Robust Speaker Identification by Smarajit Bose Interdisciplinary Statistical Research Unit Indian Statistical Institute, Kolkata Joint work with Amita Pal and Ayanendranath Basu Overview } } } } } } }
More informationspeaker recognition using gmm-ubm semester project presentation
speaker recognition using gmm-ubm semester project presentation OBJECTIVES OF THE PROJECT study the GMM-UBM speaker recognition system implement this system with matlab document the code and how it interfaces
More informationAround the Speaker De-Identification (Speaker diarization for de-identification ++) Itshak Lapidot Moez Ajili Jean-Francois Bonastre
Around the Speaker De-Identification (Speaker diarization for de-identification ++) Itshak Lapidot Moez Ajili Jean-Francois Bonastre The 2 Parts HDM based diarization System The homogeneity measure 2 Outline
More informationSCORE CALIBRATING FOR SPEAKER RECOGNITION BASED ON SUPPORT VECTOR MACHINES AND GAUSSIAN MIXTURE MODELS
SCORE CALIBRATING FOR SPEAKER RECOGNITION BASED ON SUPPORT VECTOR MACHINES AND GAUSSIAN MIXTURE MODELS Marcel Katz, Martin Schafföner, Sven E. Krüger, Andreas Wendemuth IESK-Cognitive Systems University
More informationA Small Footprint i-vector Extractor
A Small Footprint i-vector Extractor Patrick Kenny Odyssey Speaker and Language Recognition Workshop June 25, 2012 1 / 25 Patrick Kenny A Small Footprint i-vector Extractor Outline Introduction Review
More informationNovel Quality Metric for Duration Variability Compensation in Speaker Verification using i-vectors
Published in Ninth International Conference on Advances in Pattern Recognition (ICAPR-2017), Bangalore, India Novel Quality Metric for Duration Variability Compensation in Speaker Verification using i-vectors
More informationJoint Factor Analysis for Speaker Verification
Joint Factor Analysis for Speaker Verification Mengke HU ASPITRG Group, ECE Department Drexel University mengke.hu@gmail.com October 12, 2012 1/37 Outline 1 Speaker Verification Baseline System Session
More informationMaximum Likelihood and Maximum A Posteriori Adaptation for Distributed Speaker Recognition Systems
Maximum Likelihood and Maximum A Posteriori Adaptation for Distributed Speaker Recognition Systems Chin-Hung Sit 1, Man-Wai Mak 1, and Sun-Yuan Kung 2 1 Center for Multimedia Signal Processing Dept. of
More informationStudies on Model Distance Normalization Approach in Text-independent Speaker Verification
Vol. 35, No. 5 ACTA AUTOMATICA SINICA May, 009 Studies on Model Distance Normalization Approach in Text-independent Speaker Verification DONG Yuan LU Liang ZHAO Xian-Yu ZHAO Jian Abstract Model distance
More informationSupport Vector Machines using GMM Supervectors for Speaker Verification
1 Support Vector Machines using GMM Supervectors for Speaker Verification W. M. Campbell, D. E. Sturim, D. A. Reynolds MIT Lincoln Laboratory 244 Wood Street Lexington, MA 02420 Corresponding author e-mail:
More informationExemplar-based voice conversion using non-negative spectrogram deconvolution
Exemplar-based voice conversion using non-negative spectrogram deconvolution Zhizheng Wu 1, Tuomas Virtanen 2, Tomi Kinnunen 3, Eng Siong Chng 1, Haizhou Li 1,4 1 Nanyang Technological University, Singapore
More informationSpeaker recognition by means of Deep Belief Networks
Speaker recognition by means of Deep Belief Networks Vasileios Vasilakakis, Sandro Cumani, Pietro Laface, Politecnico di Torino, Italy {first.lastname}@polito.it 1. Abstract Most state of the art speaker
More informationSpeaker Verification Using Accumulative Vectors with Support Vector Machines
Speaker Verification Using Accumulative Vectors with Support Vector Machines Manuel Aguado Martínez, Gabriel Hernández-Sierra, and José Ramón Calvo de Lara Advanced Technologies Application Center, Havana,
More informationExperiments with a Gaussian Merging-Splitting Algorithm for HMM Training for Speech Recognition
Experiments with a Gaussian Merging-Splitting Algorithm for HMM Training for Speech Recognition ABSTRACT It is well known that the expectation-maximization (EM) algorithm, commonly used to estimate hidden
More informationAutomatic Speech Recognition (CS753)
Automatic Speech Recognition (CS753) Lecture 21: Speaker Adaptation Instructor: Preethi Jyothi Oct 23, 2017 Speaker variations Major cause of variability in speech is the differences between speakers Speaking
More informationPresented By: Omer Shmueli and Sivan Niv
Deep Speaker: an End-to-End Neural Speaker Embedding System Chao Li, Xiaokong Ma, Bing Jiang, Xiangang Li, Xuewei Zhang, Xiao Liu, Ying Cao, Ajay Kannan, Zhenyao Zhu Presented By: Omer Shmueli and Sivan
More informationText-Independent Speaker Identification using Statistical Learning
University of Arkansas, Fayetteville ScholarWorks@UARK Theses and Dissertations 7-2015 Text-Independent Speaker Identification using Statistical Learning Alli Ayoola Ojutiku University of Arkansas, Fayetteville
More informationEstimation of Relative Operating Characteristics of Text Independent Speaker Verification
International Journal of Engineering Science Invention Volume 1 Issue 1 December. 2012 PP.18-23 Estimation of Relative Operating Characteristics of Text Independent Speaker Verification Palivela Hema 1,
More informationTime-Varying Autoregressions for Speaker Verification in Reverberant Conditions
INTERSPEECH 017 August 0 4, 017, Stockholm, Sweden Time-Varying Autoregressions for Speaker Verification in Reverberant Conditions Ville Vestman 1, Dhananjaya Gowda, Md Sahidullah 1, Paavo Alku 3, Tomi
More informationUncertainty Modeling without Subspace Methods for Text-Dependent Speaker Recognition
Uncertainty Modeling without Subspace Methods for Text-Dependent Speaker Recognition Patrick Kenny, Themos Stafylakis, Md. Jahangir Alam and Marcel Kockmann Odyssey Speaker and Language Recognition Workshop
More informationMonaural speech separation using source-adapted models
Monaural speech separation using source-adapted models Ron Weiss, Dan Ellis {ronw,dpwe}@ee.columbia.edu LabROSA Department of Electrical Enginering Columbia University 007 IEEE Workshop on Applications
More informationGeoffrey Zweig May 7, 2009
Geoffrey Zweig May 7, 2009 Taxonomy of LID Techniques LID Acoustic Scores Derived LM Vector space model GMM GMM Tokenization Parallel Phone Rec + LM Vectors of phone LM stats [Carrasquillo et. al. 02],
More informationKernel Methods for Text-Independent Speaker Verification
Kernel Methods for Text-Independent Speaker Verification Chris Longworth Cambridge University Engineering Department and Christ s College February 25, 2010 Dissertation submitted to the University of Cambridge
More informationModified-prior PLDA and Score Calibration for Duration Mismatch Compensation in Speaker Recognition System
INERSPEECH 2015 Modified-prior PLDA and Score Calibration for Duration Mismatch Compensation in Speaker Recognition System QingYang Hong 1, Lin Li 1, Ming Li 2, Ling Huang 1, Lihong Wan 1, Jun Zhang 1
More informationAn Integration of Random Subspace Sampling and Fishervoice for Speaker Verification
Odyssey 2014: The Speaker and Language Recognition Workshop 16-19 June 2014, Joensuu, Finland An Integration of Random Subspace Sampling and Fishervoice for Speaker Verification Jinghua Zhong 1, Weiwu
More informationThe effect of speaking rate and vowel context on the perception of consonants. in babble noise
The effect of speaking rate and vowel context on the perception of consonants in babble noise Anirudh Raju Department of Electrical Engineering, University of California, Los Angeles, California, USA anirudh90@ucla.edu
More informationAllpass Modeling of LP Residual for Speaker Recognition
Allpass Modeling of LP Residual for Speaker Recognition K. Sri Rama Murty, Vivek Boominathan and Karthika Vijayan Department of Electrical Engineering, Indian Institute of Technology Hyderabad, India email:
More informationIndependent Component Analysis and Unsupervised Learning. Jen-Tzung Chien
Independent Component Analysis and Unsupervised Learning Jen-Tzung Chien TABLE OF CONTENTS 1. Independent Component Analysis 2. Case Study I: Speech Recognition Independent voices Nonparametric likelihood
More informationLow-dimensional speech representation based on Factor Analysis and its applications!
Low-dimensional speech representation based on Factor Analysis and its applications! Najim Dehak and Stephen Shum! Spoken Language System Group! MIT Computer Science and Artificial Intelligence Laboratory!
More informationSupport Vector Machines and Speaker Verification
1 Support Vector Machines and Speaker Verification David Cinciruk March 6, 2013 2 Table of Contents Review of Speaker Verification Introduction to Support Vector Machines Derivation of SVM Equations Soft
More informationMulticlass Discriminative Training of i-vector Language Recognition
Odyssey 214: The Speaker and Language Recognition Workshop 16-19 June 214, Joensuu, Finland Multiclass Discriminative Training of i-vector Language Recognition Alan McCree Human Language Technology Center
More informationEFFECTIVE ACOUSTIC MODELING FOR ROBUST SPEAKER RECOGNITION. Taufiq Hasan Al Banna
EFFECTIVE ACOUSTIC MODELING FOR ROBUST SPEAKER RECOGNITION by Taufiq Hasan Al Banna APPROVED BY SUPERVISORY COMMITTEE: Dr. John H. L. Hansen, Chair Dr. Carlos Busso Dr. Hlaing Minn Dr. P. K. Rajasekaran
More informationIndependent Component Analysis and Unsupervised Learning
Independent Component Analysis and Unsupervised Learning Jen-Tzung Chien National Cheng Kung University TABLE OF CONTENTS 1. Independent Component Analysis 2. Case Study I: Speech Recognition Independent
More informationSTAT 501 EXAM I NAME Spring 1999
STAT 501 EXAM I NAME Spring 1999 Instructions: You may use only your calculator and the attached tables and formula sheet. You can detach the tables and formula sheet from the rest of this exam. Show your
More informationA Generative Model Based Kernel for SVM Classification in Multimedia Applications
Appears in Neural Information Processing Systems, Vancouver, Canada, 2003. A Generative Model Based Kernel for SVM Classification in Multimedia Applications Pedro J. Moreno Purdy P. Ho Hewlett-Packard
More informationHarmonic Structure Transform for Speaker Recognition
Harmonic Structure Transform for Speaker Recognition Kornel Laskowski & Qin Jin Carnegie Mellon University, Pittsburgh PA, USA KTH Speech Music & Hearing, Stockholm, Sweden 29 August, 2011 Laskowski &
More informationBich Ngoc Do. Neural Networks for Automatic Speaker, Language and Sex Identification
Charles University in Prague Faculty of Mathematics and Physics University of Groningen Faculty of Arts MASTER THESIS Bich Ngoc Do Neural Networks for Automatic Speaker, Language and Sex Identification
More informationA SUPERVISED FACTORIAL ACOUSTIC MODEL FOR SIMULTANEOUS MULTIPARTICIPANT VOCAL ACTIVITY DETECTION IN CLOSE-TALK MICROPHONE RECORDINGS OF MEETINGS
A SUPERVISED FACTORIAL ACOUSTIC MODEL FOR SIMULTANEOUS MULTIPARTICIPANT VOCAL ACTIVITY DETECTION IN CLOSE-TALK MICROPHONE RECORDINGS OF MEETINGS Kornel Laskowski and Tanja Schultz interact, Carnegie Mellon
More information26 Chapter 4 Classification
26 Chapter 4 Classification The preceding tree cannot be simplified. 2. Consider the training examples shown in Table 4.1 for a binary classification problem. Table 4.1. Data set for Exercise 2. Customer
More informationUnifying Probabilistic Linear Discriminant Analysis Variants in Biometric Authentication
Unifying Probabilistic Linear Discriminant Analysis Variants in Biometric Authentication Aleksandr Sizov 1, Kong Aik Lee, Tomi Kinnunen 1 1 School of Computing, University of Eastern Finland, Finland Institute
More informationSegmental Recurrent Neural Networks for End-to-end Speech Recognition
Segmental Recurrent Neural Networks for End-to-end Speech Recognition Liang Lu, Lingpeng Kong, Chris Dyer, Noah Smith and Steve Renals TTI-Chicago, UoE, CMU and UW 9 September 2016 Background A new wave
More informationUsing Deep Belief Networks for Vector-Based Speaker Recognition
INTERSPEECH 2014 Using Deep Belief Networks for Vector-Based Speaker Recognition W. M. Campbell MIT Lincoln Laboratory, Lexington, MA, USA wcampbell@ll.mit.edu Abstract Deep belief networks (DBNs) have
More informationDomain-invariant I-vector Feature Extraction for PLDA Speaker Verification
Odyssey 2018 The Speaker and Language Recognition Workshop 26-29 June 2018, Les Sables d Olonne, France Domain-invariant I-vector Feature Extraction for PLDA Speaker Verification Md Hafizur Rahman 1, Ivan
More informationMixtures of Gaussians with Sparse Structure
Mixtures of Gaussians with Sparse Structure Costas Boulis 1 Abstract When fitting a mixture of Gaussians to training data there are usually two choices for the type of Gaussians used. Either diagonal or
More informationMixtures of Gaussians with Sparse Regression Matrices. Constantinos Boulis, Jeffrey Bilmes
Mixtures of Gaussians with Sparse Regression Matrices Constantinos Boulis, Jeffrey Bilmes {boulis,bilmes}@ee.washington.edu Dept of EE, University of Washington Seattle WA, 98195-2500 UW Electrical Engineering
More informationSPEECH enhancement has been studied extensively as a
JOURNAL OF L A TEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2017 1 Phase-Aware Speech Enhancement Based on Deep Neural Networks Naijun Zheng and Xiao-Lei Zhang Abstract Short-time frequency transform STFT)
More informationApplication of a GA/Bayesian Filter-Wrapper Feature Selection Method to Classification of Clinical Depression from Speech Data
Application of a GA/Bayesian Filter-Wrapper Feature Selection Method to Classification of Clinical Depression from Speech Data Juan Torres 1, Ashraf Saad 2, Elliot Moore 1 1 School of Electrical and Computer
More informationUnsupervised Methods for Speaker Diarization. Stephen Shum
Unsupervised Methods for Speaker Diarization by Stephen Shum B.S., University of California, Berkeley (2009) Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment
More informationGender Classification in Speech Processing. Baiju M Nair(2011CRF3637) Geetanjali Srivastava(2012EEZ8304) Group No. 22
Gender Classification in Speech Processing Baiju M Nair(2011CRF3637) Geetanjali Srivastava(2012EEZ8304) Group No. 22 Introduction Main objective is gender classification in speech processing. Classify
More informationSUBMITTED TO IEEE TRANSACTIONS ON SIGNAL PROCESSING 1. Correlation and Class Based Block Formation for Improved Structured Dictionary Learning
SUBMITTED TO IEEE TRANSACTIONS ON SIGNAL PROCESSING 1 Correlation and Class Based Block Formation for Improved Structured Dictionary Learning Nagendra Kumar and Rohit Sinha, Member, IEEE arxiv:178.1448v2
More informationAutomatic Regularization of Cross-entropy Cost for Speaker Recognition Fusion
INTERSPEECH 203 Automatic Regularization of Cross-entropy Cost for Speaker Recognition Fusion Ville Hautamäki, Kong Aik Lee 2, David van Leeuwen 3, Rahim Saeidi 3, Anthony Larcher 2, Tomi Kinnunen, Taufiq
More informationFast speaker diarization based on binary keys. Xavier Anguera and Jean François Bonastre
Fast speaker diarization based on binary keys Xavier Anguera and Jean François Bonastre Outline Introduction Speaker diarization Binary speaker modeling Binary speaker diarization system Experiments Conclusions
More informationINTERSPEECH 2016 Tutorial: Machine Learning for Speaker Recognition
INTERSPEECH 2016 Tutorial: Machine Learning for Speaker Recognition Man-Wai Mak and Jen-Tzung Chien The Hong Kong Polytechnic University, Hong Kong National Chiao Tung University, Taiwan September 8, 2016
More informationKernel Based Text-Independnent Speaker Verification
12 Kernel Based Text-Independnent Speaker Verification Johnny Mariéthoz 1, Yves Grandvalet 1 and Samy Bengio 2 1 IDIAP Research Institute, Martigny, Switzerland 2 Google Inc., Mountain View, CA, USA The
More informationOn The Best Principal. Submatrix Problem
On The Best Principal Submatrix Problem by Seth Charles Lewis A thesis submitted to University of Birmingham for the degree of Doctor of Philosophy (PhD) School of Mathematics University of Birmingham
More informationSpoken Language Understanding in a Latent Topic-based Subspace
INTERSPEECH 2016 September 8 12, 2016, San Francisco, USA Spoken Language Understanding in a Latent Topic-based Subspace Mohamed Morchid 1, Mohamed Bouaziz 1,3, Waad Ben Kheder 1, Killian Janod 1,2, Pierre-Michel
More informationApproximate Bayesian Inference for Robust Speech Processing. A Thesis. Submitted to the Faculty. Drexel University. Ciira wa Maina
Approximate Bayesian Inference for Robust Speech Processing A Thesis Submitted to the Faculty of Drexel University by Ciira wa Maina in partial fulfillment of the requirements for the degree of Doctor
More informationPerformance Evaluation
Performance Evaluation David S. Rosenberg Bloomberg ML EDU October 26, 2017 David S. Rosenberg (Bloomberg ML EDU) October 26, 2017 1 / 36 Baseline Models David S. Rosenberg (Bloomberg ML EDU) October 26,
More informationMinimax i-vector extractor for short duration speaker verification
Minimax i-vector extractor for short duration speaker verification Ville Hautamäki 1,2, You-Chi Cheng 2, Padmanabhan Rajan 1, Chin-Hui Lee 2 1 School of Computing, University of Eastern Finl, Finl 2 ECE,
More information8. Classification and Pattern Recognition
8. Classification and Pattern Recognition 1 Introduction: Classification is arranging things by class or category. Pattern recognition involves identification of objects. Pattern recognition can also be
More informationTowards Multi-Modal Driver s Stress Detection
Towards Multi-Modal Driver s Stress Detection Hynek Bořil, Pinar Boyraz, John H.L. Hansen Center for Robust Speech Systems, Erik Jonsson School of Engineering & Computer Science, University of Texas at
More informationReview of Lecture 1. Across records. Within records. Classification, Clustering, Outlier detection. Associations
Review of Lecture 1 This course is about finding novel actionable patterns in data. We can divide data mining algorithms (and the patterns they find) into five groups Across records Classification, Clustering,
More informationPHONEME CLASSIFICATION OVER THE RECONSTRUCTED PHASE SPACE USING PRINCIPAL COMPONENT ANALYSIS
PHONEME CLASSIFICATION OVER THE RECONSTRUCTED PHASE SPACE USING PRINCIPAL COMPONENT ANALYSIS Jinjin Ye jinjin.ye@mu.edu Michael T. Johnson mike.johnson@mu.edu Richard J. Povinelli richard.povinelli@mu.edu
More informationSpeech Signal Representations
Speech Signal Representations Berlin Chen 2003 References: 1. X. Huang et. al., Spoken Language Processing, Chapters 5, 6 2. J. R. Deller et. al., Discrete-Time Processing of Speech Signals, Chapters 4-6
More informationGlottal Modeling and Closed-Phase Analysis for Speaker Recognition
Glottal Modeling and Closed-Phase Analysis for Speaker Recognition Raymond E. Slyh, Eric G. Hansen and Timothy R. Anderson Air Force Research Laboratory, Human Effectiveness Directorate, Wright-Patterson
More informationUnsupervised Anomaly Detection for High Dimensional Data
Unsupervised Anomaly Detection for High Dimensional Data Department of Mathematics, Rowan University. July 19th, 2013 International Workshop in Sequential Methodologies (IWSM-2013) Outline of Talk Motivation
More informationGMM-Based Speech Transformation Systems under Data Reduction
GMM-Based Speech Transformation Systems under Data Reduction Larbi Mesbahi, Vincent Barreaud, Olivier Boeffard IRISA / University of Rennes 1 - ENSSAT 6 rue de Kerampont, B.P. 80518, F-22305 Lannion Cedex
More informationUnsupervised Vocabulary Induction
Infant Language Acquisition Unsupervised Vocabulary Induction MIT (Saffran et al., 1997) 8 month-old babies exposed to stream of syllables Stream composed of synthetic words (pabikumalikiwabufa) After
More informationA Generative Model for Score Normalization in Speaker Recognition
INTERSPEECH 017 August 0 4, 017, Stockholm, Sweden A Generative Model for Score Normalization in Speaker Recognition Albert Swart and Niko Brümmer Nuance Communications, Inc. (South Africa) albert.swart@nuance.com,
More informationIntroduction to Machine Learning Midterm, Tues April 8
Introduction to Machine Learning 10-701 Midterm, Tues April 8 [1 point] Name: Andrew ID: Instructions: You are allowed a (two-sided) sheet of notes. Exam ends at 2:45pm Take a deep breath and don t spend
More informationApproximating the Covariance Matrix with Low-rank Perturbations
Approximating the Covariance Matrix with Low-rank Perturbations Malik Magdon-Ismail and Jonathan T. Purnell Department of Computer Science Rensselaer Polytechnic Institute Troy, NY 12180 {magdon,purnej}@cs.rpi.edu
More informationGain Compensation for Fast I-Vector Extraction over Short Duration
INTERSPEECH 27 August 2 24, 27, Stockholm, Sweden Gain Compensation for Fast I-Vector Extraction over Short Duration Kong Aik Lee and Haizhou Li 2 Institute for Infocomm Research I 2 R), A STAR, Singapore
More informationSpeaker Recognition Using Artificial Neural Networks: RBFNNs vs. EBFNNs
Speaer Recognition Using Artificial Neural Networs: s vs. s BALASKA Nawel ember of the Sstems & Control Research Group within the LRES Lab., Universit 20 Août 55 of Sida, BP: 26, Sida, 21000, Algeria E-mail
More informationBayesian Estimation of Bipartite Matchings for Record Linkage
Bayesian Estimation of Bipartite Matchings for Record Linkage Mauricio Sadinle msadinle@stat.duke.edu Duke University Supported by NSF grants SES-11-30706 to Carnegie Mellon University and SES-11-31897
More informationImproved Method for Epoch Extraction in High Pass Filtered Speech
Improved Method for Epoch Extraction in High Pass Filtered Speech D. Govind Center for Computational Engineering & Networking Amrita Vishwa Vidyapeetham (University) Coimbatore, Tamilnadu 642 Email: d
More informationSpectral and Textural Feature-Based System for Automatic Detection of Fricatives and Affricates
Spectral and Textural Feature-Based System for Automatic Detection of Fricatives and Affricates Dima Ruinskiy Niv Dadush Yizhar Lavner Department of Computer Science, Tel-Hai College, Israel Outline Phoneme
More informationAutomatic Phoneme Recognition. Segmental Hidden Markov Models
Automatic Phoneme Recognition with Segmental Hidden Markov Models Areg G. Baghdasaryan Thesis submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment
More informationJuly 6, Applause Identification and its relevance to Archival of Carnatic Music. Padi Sarala, Vignesh Ishwar, Ashwin Bellur and Hema A.
Applause Identification and its relevance to Archival of Carnatic Music Padi Sarala 1 Vignesh Ishwar 1 Ashwin Bellur 1 Hema A.Murthy 1 1 Computer Science Dept, IIT Madras, India. July 6, 2012 Outline of
More informationModeling Prosody for Speaker Recognition: Why Estimating Pitch May Be a Red Herring
Modeling Prosody for Speaker Recognition: Why Estimating Pitch May Be a Red Herring Kornel Laskowski & Qin Jin Carnegie Mellon University Pittsburgh PA, USA 28 June, 2010 Laskowski & Jin ODYSSEY 2010,
More informationINVESTIGATION OF MICROWAVE TRI-RESONATOR STRUCTURES
SCHOOL OF ELECTRONIC, ELECTRICAL AND COMPUER ENGINEERING THE UNIVERSITY OF BIRMINGHAM INVESTIGATION OF MICROWAVE TRI-RESONATOR STRUCTURES Negassa Sori Gerba A thesis submitted to the University of Birmingham
"Robust Automatic Speech Recognition through on-line Semi Blind Source Extraction"
"Robust Automatic Speech Recognition through on-line Semi Blind Source Extraction" Francesco Nesta, Marco Matassoni {nesta, matassoni}@fbk.eu Fondazione Bruno Kessler-Irst, Trento (ITALY) For contacts:
A Confidence-Based Late Fusion Framework For Audio-Visual Biometric Identification
Pattern Recognition Letters journal homepage: www.elsevier.com A Confidence-Based Late Fusion Framework For Audio-Visual Biometric Identification Mohammad Rafiqul Alam a,, Mohammed Bennamoun a, Roberto
Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance ECON 509. Dr.
Department of Economics Business Statistics Chapter 12 Chi-square test of independence & Analysis of Variance ECON 509 Dr. Mohammad Zainal Chapter Goals After completing this chapter, you should be able
Correspondence. Pulse Doppler Radar Target Recognition using a Two-Stage SVM Procedure
Correspondence Pulse Doppler Radar Target Recognition using a Two-Stage SVM Procedure It is possible to detect and classify moving and stationary targets using ground surveillance pulse-doppler radars
Stat 135 Fall 2013 FINAL EXAM December 18, 2013
Stat 135 Fall 2013 FINAL EXAM December 18, 2013 Name: Person on right SID: Person on left There will be one, double sided, handwritten, 8.5in x 11in page of notes allowed during the exam. The exam is closed
Augmented Statistical Models for Classifying Sequence Data
Augmented Statistical Models for Classifying Sequence Data Martin Layton Corpus Christi College University of Cambridge September 2006 Dissertation submitted to the University of Cambridge for the degree
1. Use Scenario 3-1. In this study, the response variable is
Chapter 8 Bell Work Scenario 3-1 The height (in feet) and volume (in cubic feet) of usable lumber of 32 cherry trees are measured by a researcher. The goal is to determine if volume of usable lumber can
Statistical Pattern Recognition
Statistical Pattern Recognition Expectation Maximization (EM) and Mixture Models Hamid R. Rabiee Jafar Muhammadi, Mohammad J. Hosseini Spring 2014 http://ce.sharif.edu/courses/92-93/2/ce725-2 Agenda Expectation-maximization
Norm Referenced Test (NRT)
22 Norm Referenced Test (NRT) NRT Test Design In 2005, the MSA Mathematics tests included the TerraNova Mathematics Survey (TN) Form C at Grades 3, 4, 5, 7, and 8 and Form D at Grade 6. The MSA Grade 10
Achieving Reliable Energy Production During Winter Months
Achieving Reliable Energy Production During Winter Months Monelle Comeau CanWEA 2017 2017-10-05 1 AGENDA 1 Overview of ENERCON s Icing and Cold Climate Innovations 2 Ice Detection System Evaluations Technology
A NONPARAMETRIC BAYESIAN APPROACH FOR SPOKEN TERM DETECTION BY EXAMPLE QUERY
A NONPARAMETRIC BAYESIAN APPROACH FOR SPOKEN TERM DETECTION BY EXAMPLE QUERY Amir Hossein Harati Nead Torbati and Joseph Picone College of Engineering, Temple University Philadelphia, Pennsylvania, USA
Model-based unsupervised segmentation of birdcalls from field recordings
Model-based unsupervised segmentation of birdcalls from field recordings Anshul Thakur School of Computing and Electrical Engineering Indian Institute of Technology Mandi Himachal Pradesh, India Email:
FoCal Multi-class: Toolkit for Evaluation, Fusion and Calibration of Multi-class Recognition Scores Tutorial and User Manual
FoCal Multi-class: Toolkit for Evaluation, Fusion and Calibration of Multi-class Recognition Scores Tutorial and User Manual Niko Brümmer Spescom DataVoice niko.brummer@gmail.com June 2007 Contents 1 Introduction