ECE 598: The Speech Chain. Lecture 9: Consonants


1 ECE 598: The Speech Chain Lecture 9: Consonants

2 Today
- International Phonetic Alphabet: history; SAMPA, an IPA for ASCII
- Sounds with a Side Branch: nasal consonants; reminder: impedance of a uniform tube; liquids /l/ and /r/
- Events in the Release of a Syllable-Initial Stop Consonant: transient and frication sources; aspiration; formant transitions

3 Topic #1: International Phonetic Alphabet

4 International Phonetic Alphabet: Purpose and Brief History
Purpose of the alphabet: to provide a universal notation for the sounds of the world's languages.
- Universal = if any language on Earth distinguishes two phonemes, IPA must also distinguish them.
- Distinguish = the meaning of a word changes when the phoneme changes, e.g. cat vs. bat.
Very brief history:
- 1446: King Sejong of Choson publishes a distinctive-feature-based phonetic notation for Korean.
- 1867: Alexander Melville Bell publishes a distinctive-feature-based universal phonetic notation in Visible Speech: The Science of the Universal Alphabetic. His notation is rejected as being too expensive to print.
- 1886: International Phonetic Association founded in Paris by phoneticians from across Europe; it begins developing the IPA notation.
- 1991: Unicode provides a standard method for including IPA notation in computer documents.

5 International Phonetic Alphabet: Vowels
(Vowel chart labeled with SAMPA symbols, including i, y, I, Y, e, E, 9, {, a, 3, V, u, U, o, O, A, Q.)
SAMPA: an international standard for the ASCII transcription of the IPA phonemes. It maps most IPA phones to the ASCII printable characters.
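
To make the SAMPA/IPA correspondence concrete, here is a small illustrative lookup table covering symbols used in these slides. This is a partial mapping, not the full SAMPA standard, and the names are arbitrary; multi-character symbols are handled one character at a time.

```python
# Partial SAMPA-to-IPA mapping (illustrative subset for this lecture only).
SAMPA_TO_IPA = {
    "i": "i", "I": "ɪ", "e": "e", "E": "ɛ", "{": "æ",
    "u": "u", "U": "ʊ", "o": "o", "O": "ɔ", "A": "ɑ",
    "V": "ʌ", "@": "ə", "3": "ɜ",
    "p": "p", "b": "b", "t": "t", "d": "d", "k": "k", "g": "g",
    "f": "f", "v": "v", "T": "θ", "D": "ð", "s": "s", "z": "z",
    "S": "ʃ", "Z": "ʒ", "h": "h",
    "m": "m", "n": "n", "N": "ŋ", "4": "ɾ",
    "r": "r",  # strict SAMPA r is a trill; these slides use it for English /r/
    "l": "l", "w": "w", "j": "j",
}

def sampa_to_ipa(transcription):
    """Convert a SAMPA string to IPA one symbol at a time (digraphs not treated specially)."""
    return "".join(SAMPA_TO_IPA.get(ch, ch) for ch in transcription)

print(sampa_to_ipa("SIn"))   # shin  -> ʃɪn
print(sampa_to_ipa("TIN"))   # thing -> θɪŋ
```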

6 Vowels: Sample Words
With phonological feature annotation. All are [+syllabic].
Lax Vowels [-reduced, +lax, -blade] (lax: like IPA "central," but also short in duration; high: like IPA "close"):
- I pit pIt [+high, +front]
- E pet pEt [-high, +front]
- V cut kVt [-high, -front]
- U put pUt [+high, -front]
Tense Vowels [-reduced, -lax] (low: like IPA "open," specified only if [-high, -lax]; [round] possible only if [-lax, -front]):
- i ease iz [-low, +high, +front]
- e raise rez [-low, -high, +front]
- { pat p{t [+low, +front]
- u lose luz [-low, +high, -front, +round]
- o nose noz [-low, -high, -front, +round]
- O cause kOz [+low, -front, +round]
- A pot pAt [+low, -front, -round]
Classic Diphthongs:
- aI rise raIz
- OI noise nOIz
- aU rouse raUz
Schwa and Syllabic /r/:
- 3 furs f3z [-reduced, +lax, +blade, -anterior]
- @ corner korn@ [+reduced, +blade, -anterior, +distributed]

7 IPA: Regular Consonants
(Consonant chart organized by articulator: lips, tongue blade, tongue body. SAMPA symbols shown include p, b, t, d, k, g, q, m, n, N, ?, 4, B, f, v, T, D, s, z, S, Z, C, x, G, h, r, j, l, L.)

8 Obstruent Consonants: Sample Words
With phonological feature annotation. All are [-syllabic].
Stops [-sonorant, -continuant] (anterior: constriction at or in front of the alveolar ridge; distributed: tongue tip flat):
- p pin pIn [+lips, -round, -voice]
- b bin bIn [+lips, -round, +voice]
- t tin tIn [+blade, +anterior, -distributed, -voice]
- d din dIn [+blade, +anterior, -distributed, +voice]
- k kin kIn [+body, +high, -voice]
- g give gIv [+body, +high, +voice]
Affricates [-sonorant, -continuant]:
- tS chin tSIn [+blade, -anterior, +distributed, -voice]
- dZ gin dZIn [+blade, -anterior, +distributed, +voice]
Fricatives [-sonorant, +continuant]:
- f fin fIn [+lips, -round, -voice]
- v vim vIm [+lips, -round, +voice]
- T thin TIn [+blade, +anterior, +distributed, -voice]
- D this DIs [+blade, +anterior, +distributed, +voice]
- s sin sIn [+blade, +anterior, -distributed, -voice]
- z zin zIn [+blade, +anterior, -distributed, +voice]
- S shin SIn [+blade, -anterior, +distributed, -voice]
- Z measure meZ@ [+blade, -anterior, +distributed, +voice]
- h hit hIt [+glottal] (voicing is not specified; it depends on context)

9 Doubly-Articulated Consonants (SAMPA: w, H)

10 Sonorant Consonants: Sample Words
With phonological feature annotation.
Flap [+sonorant, -continuant, -nasal, -syllabic]:
- 4 butter bV43 [+blade, +anterior, -distributed]
Nasals [+sonorant, -continuant, +nasal]:
- m mock mak [+lips, -round, -syllabic]
- n knock nak [+blade, +anterior, -distributed, -syllabic]
- =n button bV4=n [+blade, +anterior, -distributed, +syllabic]
- N thing TIN [+body, +high, -syllabic]
Liquids [+sonorant, +continuant, -syllabic]:
- r wrong rON [+blade, -anterior, +distributed]
- l long lON [+blade, +anterior, -distributed]
Glides [+sonorant, +continuant, -syllabic]:
- w wasp wasp [+lips, +round]
- j yacht jat [+blade, -anterior, +distributed]

11 Examples: SAMPA
1. You wish to know all about my grandfather. Well, he is nearly ninety-three years old, but he still thinks as swiftly as ever.
2. ju wis tu no Al mai wel, hi iz nirli naintitri yirz old, bvt hi stil Tinks }z swiftli }z

12 Example: Phonological Feature Matrix
(Feature matrix with one row per feature: [high], [pharyngeal/low], [lax], [lips], [glottal], [voice], [nasal], [front], [body], [distributed], [anterior], [blade], [round], [reduced], [continuant], [sonorant], [syllabic]; and one column per segment: k, I, t, s, u.)
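
One way to make the feature-matrix idea concrete is to store it as a dictionary of dictionaries. The sketch below is not the slide's matrix: it fills in only a few features whose values for k, I, t, s, u are uncontroversial, and the names are just illustrative.

```python
# A toy phonological feature matrix for the segments k, I, t, s, u.
# Only a handful of standard binary features are filled in.
FEATURES = ["syllabic", "sonorant", "continuant", "voice"]

MATRIX = {
    "k": {"syllabic": "-", "sonorant": "-", "continuant": "-", "voice": "-"},
    "I": {"syllabic": "+", "sonorant": "+", "continuant": "+", "voice": "+"},
    "t": {"syllabic": "-", "sonorant": "-", "continuant": "-", "voice": "-"},
    "s": {"syllabic": "-", "sonorant": "-", "continuant": "+", "voice": "-"},
    "u": {"syllabic": "+", "sonorant": "+", "continuant": "+", "voice": "+"},
}

# Print the matrix with features as rows and segments as columns.
segments = list(MATRIX)
print("feature     " + "  ".join(segments))
for feat in FEATURES:
    print(f"{feat:<12}" + "  ".join(MATRIX[seg][feat] for seg in segments))
```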

13 Non-Pulmonic Consonants

14 Topic #2: Sounds with Side Branches: Nasal Consonants, /l/, /r/

15 Nasal Murmur
Examples: "the mug," "the nut," "sing a song."
Observations:
- A low-frequency resonance (about 300 Hz) is always present
- The low-frequency resonance has a wide bandwidth (about 150 Hz)
- The energy of the low-frequency resonance is very constant
- Most high-frequency resonances are cancelled by zeros
- Different places of articulation have different high-frequency spectra
- The high-frequency spectrum is talker-dependent and variable

16 Acoustic Description of a Vocal Tract with a Side Branch
(Figure: a pharynx tube of area A_P and length L_P, a nasal tube, and a mouth tube of area A_M and length L_M, joined at x = 0, with volume velocities U_P(0,ω), U_N(0,ω), and U_M(0,ω) at the juncture.)
Continuity equations at the juncture (x = 0):
1. Conservation of mass: u_P(0,ω) - u_N(0,ω) - u_M(0,ω) = 0
2. Continuity of pressure: p_P(0,ω) = p_N(0,ω) = p_M(0,ω)

17 Reminder: How to Calculate the Impedance of a Uniform Tube
1. Express u(x,ω) and p(x,ω) in terms of p+ and p-:
   p(x,ω) = p+ e^(-jωx/c) + p- e^(jωx/c)
   u(x,ω) = A v(x,ω) = (A/ρc)(p+ e^(-jωx/c) - p- e^(jωx/c))
2. Impose one boundary condition:
   u(L,ω) = 0:  p- = p+ e^(-2jωL/c),   or
   p(L,ω) = 0:  p- = -p+ e^(-2jωL/c)
3. Calculate the impedance at the other boundary:
   z(0,ω) = p(0,ω)/v(0,ω) = (ρc)(p+ + p-)/(p+ - p-),   or
   z(0,ω) = p(0,ω)/u(0,ω) = (ρc/A)(p+ + p-)/(p+ - p-)
4. Result (for a uniform tube, using z = p/u):
   u(L,ω) = 0:  z(0,ω) = -j(ρc/A) cot(ωL/c)
   p(L,ω) = 0:  z(0,ω) = +j(ρc/A) tan(ωL/c)
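
To make step 4 concrete, here is a minimal numerical sketch (not from the lecture) that evaluates z(0,ω) for an assumed uniform tube: A = 3 cm², L = 17.5 cm, c = 35400 cm/s, and ρc of roughly 41 in cgs units. The function names are arbitrary.

```python
import numpy as np

RHO_C = 41.0    # characteristic impedance of air, g/(cm^2 s) (approximate)
C = 35400.0     # speed of sound used in this lecture, cm/s

def z_closed_far_end(f, A, L):
    """z(0,w) = -j (rho c / A) cot(wL/c), for a tube with u(L,w) = 0."""
    w = 2.0 * np.pi * f
    return -1j * (RHO_C / A) / np.tan(w * L / C)

def z_open_far_end(f, A, L):
    """z(0,w) = +j (rho c / A) tan(wL/c), for a tube with p(L,w) = 0."""
    w = 2.0 * np.pi * f
    return 1j * (RHO_C / A) * np.tan(w * L / C)

# A 17.5 cm tube open at the far end has poles of z(0,w) (its resonances when
# driven by a volume-velocity source at x = 0) at odd multiples of c/4L.
A, L = 3.0, 17.5
predicted = [(2 * m + 1) * C / (4 * L) for m in range(3)]
print([round(p) for p in predicted])                        # [506, 1517, 2529] Hz
for p in predicted:
    print(round(p), abs(z_open_far_end(0.999 * p, A, L)))   # |z| is large near each pole
```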

18 Resonant Frequencies ("Formants") of a Nasal Consonant
1. Air flow from the pharynx must equal air flow into the nose and mouth:
   u_P - u_N - u_M = 0
2. Therefore, the admittances sum to zero:
   (1/z_P) + (1/z_N) + (1/z_M) = 0
3. Plug in the true admittances:
   j(A_P/ρc) tan(ωL_P/c) - j(A_N/ρc) cot(ωL_N/c) + j(A_M/ρc) tan(ωL_M/c) = 0
4. For most resonant frequencies, Eq. (3) must be solved numerically on a computer (or graphically, as in Fujimura, JASA 1962).
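
Equation (3) has no closed-form solution in general, but it is easy to solve numerically. Below is a minimal sketch (not from the lecture): it scans the summed admittance for sign changes, discards the jumps at the tangent/cotangent poles, and refines each genuine crossing with scipy. The tube dimensions are assumed for illustration, chosen so that A_P*L_P = 60 cm³, A_M*L_M = 40 cm³, and A_N/L_N = 3.5/9 cm, matching the example on the next slide.

```python
import numpy as np
from scipy.optimize import brentq

C = 35400.0   # speed of sound, cm/s (value used in this lecture)

# Assumed, illustrative tube dimensions (areas in cm^2, lengths in cm).
A_P, L_P = 5.0, 12.0    # pharynx
A_N, L_N = 3.5, 9.0     # nasal passage
A_M, L_M = 5.0, 8.0     # mouth cavity (closed at the lips)

def admittance_sum(f):
    """Sum of the three branch admittances at the juncture, divided by j/(rho*c).
    A zero of this function is a resonance of the combined system."""
    w = 2.0 * np.pi * f
    return (A_P * np.tan(w * L_P / C)
            - A_N / np.tan(w * L_N / C)
            + A_M * np.tan(w * L_M / C))

# Scan for sign changes, keep only genuine zero crossings (jumps across the
# tan/cot poles have huge values on both sides), and refine with brentq.
freqs = np.linspace(50.0, 3000.0, 20000)
vals = admittance_sum(freqs)
formants = []
for f_lo, f_hi, v_lo, v_hi in zip(freqs[:-1], freqs[1:], vals[:-1], vals[1:]):
    if v_lo * v_hi < 0 and abs(v_lo) < 100 and abs(v_hi) < 100:
        formants.append(brentq(admittance_sum, f_lo, f_hi))

print([round(f) for f in formants])
# With these dimensions the lowest root comes out near 350 Hz, consistent
# with the low-frequency approximation on the next slide.
```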

19 First Nasal Formant
1. The true resonance equation:
   j A_P tan(ωL_P/c) - j A_N cot(ωL_N/c) + j A_M tan(ωL_M/c) = 0
2. Low-frequency approximation (tan x ≈ x, cot x ≈ 1/x):
   A_P (ωL_P/c) - A_N/(ωL_N/c) + A_M (ωL_M/c) = 0
   (A_M L_M + A_P L_P)(ω/c)^2 - A_N/L_N = 0
   ω^2 = c^2 (A_N/L_N)/(A_M L_M + A_P L_P)
3. Typical example, nasal F1: A_P L_P = 60 cm^3, A_M L_M = 40 cm^3, A_N/L_N = 3.5 cm^2 / 9 cm ≈ 2/5 cm
   F1 = (c/2π)(1/250)^(1/2) ≈ 350 Hz
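
The arithmetic in step 3, written out (using c = 35400 cm/s, the value used elsewhere in this lecture):

```python
import math

c = 35400.0                  # cm/s
A_P_L_P = 60.0               # cm^3
A_M_L_M = 40.0               # cm^3
A_N_over_L_N = 3.5 / 9.0     # cm^2 / cm

F1 = (c / (2 * math.pi)) * math.sqrt(A_N_over_L_N / (A_M_L_M + A_P_L_P))
print(round(F1))             # 351, which the slide rounds to about 350 Hz
```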

20 Nasal Consonant Formants = Resonances of the Combined Oral-Nasal-Pharyngeal System
- Fourth nasal resonance: about 2400 Hz
- Third nasal resonance: about 1800 Hz
- Second nasal resonance: about 1400 Hz
- First nasal resonance: about 350 Hz, with broad bandwidth (B ≈ 300 Hz)

21 Nasal Consonant Anti-Resonances
1. Anti-resonances (zeros) occur because of energy lost into the side branch.
2. Continuity of pressure at the juncture: p_P(0,ω) = p_N(0,ω) = p_M(0,ω)
3. If it turns out that p_N(0,ω) = 0, that's OK; that's just part of the normal standing-wave pattern in the nostrils.
4. If it turns out that p_M(0,ω) = 0, that's extra: it means that the side branch (into the mouth) is draining away all of the energy that would otherwise go out through the nostrils.

22 Nasal Consonant Anti-Resonances
1. Anti-resonances (zeros) occur at any frequency such that, regardless of u_M(0,ω), p_M(0,ω) is required to be zero.
2. In other words: z_M(0,ω) = p_M/u_M = 0
3. Uniform tube: z_M(0,ω) = -j(ρc/A_M) cot(ωL_M/c)
4. Therefore F_zM = (mc/2L_M) + (c/4L_M), m = 0, 1, 2, ...
5. Anti-resonances of the English nasal consonants:
   - /m/: L_M ≈ 8 cm, F_zM ≈ 1100, 3300, 5500 Hz
   - /n/: L_M ≈ 5 cm, F_zM ≈ 1700, 5100, 8500 Hz
   - /ng/: L_M ≈ 2 cm, F_zM ≈ 4400 Hz
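
A quick numerical check of steps 4 and 5, again assuming c = 35400 cm/s: the anti-resonances are the odd quarter-wave frequencies of the closed mouth cavity.

```python
c = 35400.0  # cm/s

def antiresonances(L_M, f_max=9000.0):
    """Zeros of the mouth-cavity impedance: F = (2m + 1) * c / (4 * L_M), m = 0, 1, 2, ..."""
    freqs = []
    m = 0
    while True:
        f = (2 * m + 1) * c / (4 * L_M)
        if f > f_max:
            return freqs
        freqs.append(round(f))
        m += 1

for phone, L_M in [("m", 8.0), ("n", 5.0), ("N", 2.0)]:
    print(phone, antiresonances(L_M))
# m [1106, 3319, 5531, 7744]
# n [1770, 5310, 8850]
# N [4425]
```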

23 Liquids
/l/:
- Main airway: around the tongue (on both sides, so there may be zeros because of added transfer functions)
- Side branch: above the tongue, L ≈ 6 cm
- F_Z = 35400/4L ≈ 1500 Hz, between F2 and F3; pushes F2 down
/r/:
- Main airway: as in a vowel
- Side branch: under the tongue, L ≈ 3.5 cm
- F_Z ≈ 2500 Hz, between F3 and F4; pushes F3 down

24 Topic #3: Events in the Release of a Stop (Plosive) Consonant

25 Events in the Release of a Stop
Burst = transient + frication (the part of the spectrogram whose transfer function has poles only at the front-cavity resonance frequencies, not at the back-cavity resonances).

26 Events in the Release of a Stop
(Spectrograms comparing an unaspirated stop (/b/) and an aspirated stop (/t/), with the transient, frication, aspiration, and voicing intervals labeled.)

27 Prevoicing during Closure
To make a voiced stop in most European languages, the tongue root is relaxed, allowing it to expand so that the vocal folds can continue vibrating for a little while after oral closure. The result is a low-frequency voice bar that may continue well into closure. In English, closure voicing is typical of read speech, but not of casual speech. (Example: "the bug.")

28 Transient: The Release of Pressure

29 Transfer Function During Transient and Frication: Poles
Turbulence striking an obstacle makes noise.
Front-cavity resonance frequency: F_R = c/4L_f
Example: F_R = 2250 Hz = c/4L_f, so L_f ≈ 4 cm. (Example utterance: "the shutter.")
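
The same quarter-wave relation can be inverted to estimate the front-cavity length from a measured burst resonance; a tiny sketch with c = 35400 cm/s assumed:

```python
c = 35400.0  # cm/s

def front_cavity_length(F_R):
    """Invert F_R = c / (4 * L_f): length of the cavity in front of the constriction."""
    return c / (4.0 * F_R)

print(round(front_cavity_length(2250.0), 1))  # 3.9 cm, i.e. about 4 cm
```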

30 Transfer Function During Frication: An Important Zero

31 Transfer Function During Frication: An Important Zero

32 Transfer Function During Aspiration

33 Formant Transitions: A Perceptual Study
The study: (1) synthesize speech with different formant patterns, (2) record subject responses. (Delattre, Liberman and Cooper, J. Acoust. Soc. Am.)

34 Perception of Formant Transitions

35 Summary
International Phonetic Alphabet
- Distinct = distinguishes words
- Universal = distinct in any language implies distinct in IPA
Sounds with a Side Branch
- Poles = poles of the entire system (oral-nasal-pharyngeal)
- Zeros = resonances of the side branch (P = 0 at the juncture)
Events in the Release of a Syllable-Initial Stop Consonant
- Transient = "pop," triangular shape, about 0.5 ms long
- Frication = turbulence at the constriction, about 5 ms
- Aspiration = turbulence at the glottis, 0-70 ms
- Formant transitions may start during aspiration (in the case of an unvoiced stop release)
