CS578- Speech Signal Processing
1 CS578- Speech Signal Processing Lecture 7: Speech Coding Yannis Stylianou University of Crete, Computer Science Dept., Multimedia Informatics Lab Univ. of Crete
2 Outline 1 Introduction 2 Statistical Models 3 Scalar Quantization Max Quantizer Companding Adaptive quantization Differential and Residual quantization 4 Vector Quantization The k-means algorithm The LBG algorithm 5 Model-based Coding Basic Linear Prediction, LPC Mixed Excitation LPC (MELP) 6 Acknowledgments
3 Digital telephone communication system
4 Categories of speech coders Waveform coders (16-64 kbps, f_s = 8000 Hz) Hybrid coders ( kbps, f_s = 8000 Hz) Vocoders ( kbps, f_s = 8000 Hz)
7 Speech quality Closeness of the processed speech waveform to the original speech waveform Naturalness Background artifacts Intelligibility Speaker identifiability
12 Measuring Speech Quality Subjective tests: Diagnostic Rhyme Test (DRT) Diagnostic Acceptability Measure (DAM) Mean Opinion Score (MOS) Objective tests: Segmental Signal-to-Noise Ratio (SNR) Articulation Index
17 Quantization Statistical models of speech (preliminary) Scalar quantization (i.e., waveform coding) Vector quantization (i.e., subband and sinusoidal coding)
20 Linear Prediction Coding, LPC Classic LPC Mixed Excitation Linear Prediction, MELP Multipulse LPC Code Excited Linear Prediction (CELP)
24 Outline 1 Introduction 2 Statistical Models 3 Scalar Quantization Max Quantizer Companding Adaptive quantization Differential and Residual quantization 4 Vector Quantization The k-means algorithm The LBG algorithm 5 Model-based Coding Basic Linear Prediction, LPC Mixed Excitation LPC (MELP) 6 Acknowledgments
25 Probability Density of speech Treating x[n] as a realization of a random variable x, the histogram of speech samples can be approximated by a gamma density: p_X(x) = \left( \frac{\sqrt{3}}{8\pi\sigma_x |x|} \right)^{1/2} e^{-\sqrt{3}|x|/(2\sigma_x)}, or by a simpler Laplacian density: p_X(x) = \frac{1}{\sqrt{2}\sigma_x} e^{-\sqrt{2}|x|/\sigma_x}
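As a quick sanity check, both densities can be evaluated numerically. This is a minimal sketch (the function names are mine, not from the lecture), assuming the standard forms with sigma_x the signal standard deviation:

```python
import numpy as np

def laplacian_pdf(x, sigma_x):
    """Laplacian density: p(x) = exp(-sqrt(2)|x|/sigma_x) / (sqrt(2) sigma_x)."""
    return np.exp(-np.sqrt(2.0) * np.abs(x) / sigma_x) / (np.sqrt(2.0) * sigma_x)

def gamma_pdf(x, sigma_x):
    """Gamma density: (sqrt(3)/(8 pi sigma_x |x|))^(1/2) exp(-sqrt(3)|x|/(2 sigma_x))."""
    return np.sqrt(np.sqrt(3.0) / (8.0 * np.pi * sigma_x * np.abs(x))) \
        * np.exp(-np.sqrt(3.0) * np.abs(x) / (2.0 * sigma_x))
```

Both functions integrate to one over the real line; the gamma density is more sharply peaked at x = 0, matching the heavier concentration of small-amplitude speech samples.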
26 Densities comparison
27 Outline 1 Introduction 2 Statistical Models 3 Scalar Quantization Max Quantizer Companding Adaptive quantization Differential and Residual quantization 4 Vector Quantization The k-means algorithm The LBG algorithm 5 Model-based Coding Basic Linear Prediction, LPC Mixed Excitation LPC (MELP) 6 Acknowledgments
28 Coding and Decoding
29 Fundamentals of Scalar Coding Let's quantize a single speech sample x[n] into one of M reconstruction levels: \hat{x}[n] = \hat{x}_i = Q(x[n]), for x_{i-1} < x[n] \le x_i, with 1 \le i \le M, where x_k, 0 \le k \le M, denote the decision levels. Assign a codeword to each reconstruction level; the collection of codewords makes a codebook. Using a B-bit binary codebook we can represent 2^B different quantization (reconstruction) levels. The bit rate, I, is defined as: I = B f_s
33 Uniform quantization x_i - x_{i-1} = \Delta, 1 \le i \le M, and \hat{x}_i = (x_i + x_{i-1})/2, 1 \le i \le M, where \Delta is referred to as the uniform quantization step size. Example of a 2-bit uniform quantization:
34 Uniform quantization: designing decision regions Signal range: -4\sigma_x \le x[n] \le 4\sigma_x. Assuming a B-bit binary codebook, we get 2^B quantization (reconstruction) levels, a quantization step size \Delta = 2 x_{max} / 2^B, and quantization noise.
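A minimal sketch of such a B-bit uniform (mid-rise) quantizer over [-x_max, x_max]; the function name and the clipping detail at the upper edge are my own choices:

```python
import numpy as np

def uniform_quantize(x, B, x_max):
    """Mid-rise uniform quantizer: 2**B levels over [-x_max, x_max]."""
    delta = 2.0 * x_max / 2**B              # step size Delta = 2*x_max / 2^B
    # clip to the quantizer range (overload region), then map to cell midpoints
    x_clipped = np.clip(x, -x_max, x_max - 1e-12)
    idx = np.floor((x_clipped + x_max) / delta)
    return -x_max + (idx + 0.5) * delta
```

For B = 2 and x_max = 1 this gives the four reconstruction levels -0.75, -0.25, 0.25, 0.75, and any in-range input lands within Delta/2 of its reconstruction.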
35 Classes of Quantization noise There are two classes of quantization noise: Granular distortion: \hat{x}[n] = x[n] + e[n], where e[n] is the quantization noise, with -\Delta/2 \le e[n] \le \Delta/2. Overload distortion: clipped samples.
36 Assumptions Quantization noise is an ergodic white-noise random process: r_e[m] = E(e[n] e[n+m]) = \sigma_e^2 for m = 0, and 0 for m \ne 0. Quantization noise and input signal are uncorrelated: E(x[n] e[n+m]) = 0 for all m. Quantization noise is uniform over the quantization interval: p_e(e) = 1/\Delta for -\Delta/2 \le e \le \Delta/2, and 0 otherwise.
39 Dithering Definition (Dithering) We can force e[n] to be white and uncorrelated with x[n] by adding noise to x[n] before quantization!
40 Signal-to-Noise Ratio To quantify the severity of the quantization noise, we define the Signal-to-Noise Ratio (SNR) as: SNR = \sigma_x^2 / \sigma_e^2 = E(x^2[n]) / E(e^2[n]) \approx \frac{\frac{1}{N}\sum_{n=0}^{N-1} x^2[n]}{\frac{1}{N}\sum_{n=0}^{N-1} e^2[n]}. For a uniform noise pdf and quantizer range 2 x_{max}: \sigma_e^2 = \Delta^2/12 = x_{max}^2 / (3 \cdot 2^{2B}), so SNR = 3 \cdot 2^{2B} (\sigma_x / x_{max})^2, and in dB: SNR(dB) \approx 6B + 4.77 - 20 \log_{10}(x_{max}/\sigma_x); since x_{max} = 4\sigma_x: SNR(dB) \approx 6B - 7.2
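The 6B - 7.2 dB rule can be checked empirically. The sketch below uses a Gaussian input so that overload at x_max = 4 sigma_x is negligible; that input choice is my assumption for the experiment, not part of the slide:

```python
import numpy as np

# Empirical check of SNR(dB) ~ 6B - 7.2 for a uniform quantizer
# with range x_max = 4 * sigma_x.
rng = np.random.default_rng(0)
sigma_x = 1.0
x = rng.normal(0.0, sigma_x, size=200_000)
x_max = 4.0 * sigma_x

snr_db = {}
for B in (6, 8):
    delta = 2.0 * x_max / 2**B
    xc = np.clip(x, -x_max, x_max - 1e-12)
    xq = -x_max + (np.floor((xc + x_max) / delta) + 0.5) * delta
    e = xq - x
    snr_db[B] = 10.0 * np.log10(np.mean(x**2) / np.mean(e**2))
```

The measured values land within about a dB of 6B - 7.2 (28.8 dB at B = 6, 40.8 dB at B = 8).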
45 Pulse Code Modulation, PCM B bits of information per sample are transmitted as a codeword; instantaneous coding; not signal-specific. 11 bits are required for toll quality: what is the rate for f_s = 10 kHz? For CD, with f_s = 44.1 kHz and B = 16 (16-bit PCM), what is the SNR?
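Worked answers to the two questions, using I = B * f_s and the SNR(dB) ~ 6B - 7.2 rule of thumb from the previous slide (so the CD figure assumes x_max = 4 sigma_x, which is pessimistic for a full-scale signal):

```python
# toll quality: 11 bits/sample at f_s = 10 kHz
rate_bps = 11 * 10_000        # 110,000 bps = 110 kbps

# CD: 16-bit PCM, SNR from the 6B - 7.2 rule of thumb
snr_cd_db = 6 * 16 - 7.2      # approximately 88.8 dB
```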
51 Optimal decision and reconstruction level If x[n] \sim p_x(x), we determine the optimal decision levels x_i and reconstruction levels \hat{x}_i by minimizing: D = E[(\hat{x} - x)^2] = \int p_x(x) (\hat{x} - x)^2 dx, and assuming M reconstruction levels \hat{x} = Q[x]: D = \sum_{i=1}^{M} \int_{x_{i-1}}^{x_i} p_x(x) (\hat{x}_i - x)^2 dx. So: \partial D / \partial \hat{x}_k = 0, 1 \le k \le M, and \partial D / \partial x_k = 0, 1 \le k \le M-1
54 Optimal decision and reconstruction level, cont. The minimization of D over decision level x_k gives: x_k = (\hat{x}_{k+1} + \hat{x}_k)/2, 1 \le k \le M-1. The minimization of D over reconstruction level \hat{x}_k gives: \hat{x}_k = \frac{\int_{x_{k-1}}^{x_k} p_x(x)\, x\, dx}{\int_{x_{k-1}}^{x_k} p_x(x)\, dx}, i.e., the centroid of the pdf over the cell.
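The two conditions suggest the classic iterative (Lloyd-Max) design, alternating between them on training data. This sketch approximates the pdf by the sample distribution and runs a fixed number of iterations; the function name and initialization are my own choices:

```python
import numpy as np

def lloyd_max(samples, M, n_iter=100):
    """Lloyd-Max design of an M-level scalar quantizer from training samples:
    alternately set decision levels to midpoints of reconstruction levels,
    and reconstruction levels to centroids of their decision regions."""
    samples = np.sort(samples)
    # initial reconstruction levels: spread over the sample range
    r = np.linspace(samples[0], samples[-1], M)
    for _ in range(n_iter):
        x = 0.5 * (r[:-1] + r[1:])            # decision levels (midpoints)
        idx = np.searchsorted(x, samples)     # assign samples to cells
        for i in range(M):
            cell = samples[idx == i]
            if cell.size:
                r[i] = cell.mean()            # centroid update
    return r, x
```

For uniformly distributed data on [0, 1] with M = 2 this converges to reconstruction levels near 0.25 and 0.75 with the decision level near 0.5, as the closed-form solution predicts.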
56 Example with Laplacian pdf
57 Principle of Companding
58 Companding examples Transformation to a uniform density: g[n] = T(x[n]) = \int_{-\infty}^{x[n]} p_x(s) ds - \frac{1}{2}, for -\frac{1}{2} \le g[n] \le \frac{1}{2}, and 0 elsewhere. \mu-law: T(x[n]) = x_{max} \frac{\log(1 + \mu |x[n]| / x_{max})}{\log(1 + \mu)} \, \mathrm{sign}(x[n])
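A sketch of the mu-law characteristic and its inverse; the expander is my addition (not on the slide), and mu = 255 is the value used in North American telephony:

```python
import numpy as np

def mu_law_compress(x, x_max=1.0, mu=255.0):
    """mu-law compression characteristic from the slide."""
    return x_max * np.log1p(mu * np.abs(x) / x_max) / np.log1p(mu) * np.sign(x)

def mu_law_expand(y, x_max=1.0, mu=255.0):
    """Inverse of the mu-law characteristic."""
    return x_max / mu * np.expm1(np.abs(y) * np.log1p(mu) / x_max) * np.sign(y)
```

The pair is an exact round trip, which is what lets a uniform quantizer applied to g[n] behave like a nonuniform quantizer on x[n].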
60 Adaptive quantization
61 Differential and Residual quantization
62 Outline 1 Introduction 2 Statistical Models 3 Scalar Quantization Max Quantizer Companding Adaptive quantization Differential and Residual quantization 4 Vector Quantization The k-means algorithm The LBG algorithm 5 Model-based Coding Basic Linear Prediction, LPC Mixed Excitation LPC (MELP) 6 Acknowledgments
63 Motivation for VQ
64 Comparing scalar and vector quantization
65 Distortion in VQ Here we have a multidimensional pdf p_x(x): D = E[(\hat{x} - x)^T (\hat{x} - x)] = \int (\hat{x} - x)^T (\hat{x} - x) p_x(x) dx = \sum_{i=1}^{M} \int_{x \in C_i} (r_i - x)^T (r_i - x) p_x(x) dx. Two constraints: A vector x must be quantized to the reconstruction level r_i that gives the smallest distortion: C_i = \{x : \|x - r_i\|^2 \le \|x - r_l\|^2, l = 1, 2, \dots, M\}. Each reconstruction level r_i must be the centroid of the corresponding decision region, i.e., of the cell C_i: r_i = \frac{1}{N_i} \sum_{x_m \in C_i} x_m, i = 1, 2, \dots, M, where N_i is the number of training vectors in C_i.
67 The k-means algorithm S1: Compute the average distortion D = \frac{1}{N} \sum_{k=0}^{N-1} (\hat{x}_k - x_k)^T (\hat{x}_k - x_k). S2: Pick an initial guess at the reconstruction levels \{r_i\}. S3: For each x_k select the r_i closest to x_k; form clusters (clustering step). S4: Find the mean of the x_k in each cluster, which gives a new r_i; compute D. S5: Stop when the change in D over two consecutive iterations is insignificant.
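Steps S1-S5 can be sketched as follows; for simplicity this version runs a fixed number of iterations instead of testing the change in D, and the names are mine:

```python
import numpy as np

def kmeans_vq(X, M, n_iter=50, seed=0):
    """k-means codebook design: cluster, recompute centroids, repeat."""
    rng = np.random.default_rng(seed)
    r = X[rng.choice(len(X), M, replace=False)]     # S2: initial levels
    for _ in range(n_iter):
        d = ((X[:, None, :] - r[None, :, :])**2).sum(axis=2)
        labels = d.argmin(axis=1)                   # S3: clustering step
        for i in range(M):
            if np.any(labels == i):
                r[i] = X[labels == i].mean(axis=0)  # S4: centroid update
    D = ((X - r[labels])**2).sum(axis=1).mean()     # S1: average distortion
    return r, D
```

On two well-separated clusters the codevectors converge to the cluster means and D drops to the within-cluster variance.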
68 The LBG algorithm Set the desired number of cells: M = 2^B. Set an initial codebook C^{(0)} with one codevector, set as the average of the entire training sequence x_k, k = 1, 2, \dots, N. Split the codevector into two to get an initial new codebook C^{(1)}. Perform the k-means algorithm to optimize the codebook and get the final C^{(1)}. Split the final codevectors into four, and repeat the above process until the desired number of cells is reached.
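A sketch of the splitting procedure. The perturbation factor (1 +/- eps) is a common choice not specified on the slide, and the multiplicative split assumes the centroid coordinates are nonzero:

```python
import numpy as np

def lbg(X, B, eps=0.01, n_iter=30):
    """LBG: start from the global centroid, repeatedly split each codevector
    into a perturbed pair, then refine with k-means, until 2**B codevectors."""
    codebook = X.mean(axis=0, keepdims=True)           # C(0): one codevector
    while len(codebook) < 2**B:
        codebook = np.vstack([codebook * (1 + eps),
                              codebook * (1 - eps)])   # split step
        for _ in range(n_iter):                        # k-means refinement
            d = ((X[:, None, :] - codebook[None, :, :])**2).sum(axis=2)
            labels = d.argmin(axis=1)
            for i in range(len(codebook)):
                if np.any(labels == i):
                    codebook[i] = X[labels == i].mean(axis=0)
    return codebook
```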
73 Outline 1 Introduction 2 Statistical Models 3 Scalar Quantization Max Quantizer Companding Adaptive quantization Differential and Residual quantization 4 Vector Quantization The k-means algorithm The LBG algorithm 5 Model-based Coding Basic Linear Prediction, LPC Mixed Excitation LPC (MELP) 6 Acknowledgments
74 Basic coding scheme in LPC Vocal tract system function: H(z) = \frac{A}{P(z)}, where P(z) = 1 - \sum_{k=1}^{p} a_k z^{-k}. Input is binary: impulse/noise excitation. If the frame rate is 100 frames/s and we use 13 parameters (p = 10, 1 for gain, 1 for pitch, 1 for the voicing decision), we need 1300 parameters/s, instead of 8000 samples/s for f_s = 8000 Hz.
77 Scalar quantization within LPC For 7200 bps: Voiced/unvoiced decision: 1 bit. Pitch (if voiced): 6 bits (uniform). Gain: 5 bits (nonuniform). Poles d_i: 10 bits each (nonuniform) [5 bits for frequency and 5 bits for bandwidth]; 6 poles = 60 bits. So: (1 + 6 + 5 + 60) bits/frame at 100 frames/s = 7200 bps
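The bit-allocation arithmetic checks out:

```python
# per-frame bits: voicing + pitch + gain + 6 poles x 10 bits each
bits_per_frame = 1 + 6 + 5 + 6 * 10   # 72 bits
rate_bps = bits_per_frame * 100        # at 100 frames/s: 7200 bps
```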
79 Refinements to the basic LPC coding scheme Companding in the form of a logarithmic operator on pitch and gain. Instead of poles, use the reflection (or PARCOR) coefficients k_i (nonuniform). Companding of k_i: g_i = T[k_i] = \log \frac{1 - k_i}{1 + k_i}. The coefficients g_i can be coded at 5-6 bits each (which results in 4800 bps for an order-6 predictor at 100 frames/s). Reducing the frame rate by a factor of two (50 frames/s) gives a bit rate of 2400 bps.
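The PARCOR companding and its inverse, valid for |k_i| < 1; the inverse transform is my addition for completeness:

```python
import numpy as np

def parcor_compand(k):
    """g_i = log((1 - k_i) / (1 + k_i)), defined for |k_i| < 1."""
    k = np.asarray(k, dtype=float)
    return np.log((1.0 - k) / (1.0 + k))

def parcor_expand(g):
    """Inverse transform: k_i = (1 - exp(g_i)) / (1 + exp(g_i))."""
    g = np.asarray(g, dtype=float)
    return (1.0 - np.exp(g)) / (1.0 + np.exp(g))
```

The transform expands the range near |k_i| = 1, where the spectrum is most sensitive to quantization error, which is what makes the coarse 5-6 bit allocation workable.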
84 VQ in LPC coding A 10-bit codebook (1024 codewords): an 800 bps VQ provides quality comparable to a 2400 bps scalar quantizer. A 22-bit codebook (2^22 = 4,194,304 codewords): a 2400 bps VQ provides higher output speech quality.
85 Unique components of MELP Mixed pulse and noise excitation Periodic or aperiodic pulses Adaptive spectral enhancements Pulse dispersion filter
86 Line Spectral Frequencies (LSFs) in MELP The LSFs for a pth-order all-pole model are defined as follows: 1 Form two polynomials: P(z) = A(z) + z^{-(p+1)} A(z^{-1}) and Q(z) = A(z) - z^{-(p+1)} A(z^{-1}). 2 Find the roots of P(z) and Q(z), \omega_i, which are on the unit circle. 3 Exclude the trivial roots at \omega_i = 0 and \omega_i = \pi.
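The three steps can be sketched directly with polynomial root finding; this is a didactic sketch (production coders use more robust methods), assuming A(z) = 1 - sum a_k z^{-k} as elsewhere in the lecture:

```python
import numpy as np

def lsf_from_lpc(a):
    """Line spectral frequencies from LPC coefficients a = [a_1, ..., a_p].
    Returns the p sorted LSFs in (0, pi)."""
    A = np.concatenate(([1.0], -np.asarray(a, dtype=float)))  # A(z) coefficients
    Arev = A[::-1]                                            # z^{-(p+1)} A(z^{-1})
    P = np.concatenate((A, [0.0])) + np.concatenate(([0.0], Arev))
    Q = np.concatenate((A, [0.0])) - np.concatenate(([0.0], Arev))
    roots = np.concatenate((np.roots(P), np.roots(Q)))
    w = np.angle(roots)
    # keep one of each conjugate pair, excluding trivial roots at 0 and pi
    w = w[(w > 1e-6) & (w < np.pi - 1e-6)]
    return np.sort(w)
```

For a stable second-order model the routine returns two interleaved frequencies in (0, pi), one from P(z) and one from Q(z).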
87 MELP coding For a 2400 bps coder: 34 bits allocated to scalar quantization of the LSFs; 8 bits for gain; 7 bits for pitch and overall voicing; 5 bits for multi-band voicing; 1 bit for the jittery state, which is 54 bits. With a frame interval of 22.5 ms, we get a 2400 bps coder.
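The rate arithmetic, taking the slide's 54-bit frame at face value:

```python
frame_s = 0.0225            # 22.5 ms frame interval
bits_per_frame = 54
rate_bps = bits_per_frame / frame_s   # 2400 bps
```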
88 Outline 1 Introduction 2 Statistical Models 3 Scalar Quantization Max Quantizer Companding Adaptive quantization Differential and Residual quantization 4 Vector Quantization The k-means algorithm The LBG algorithm 5 Model-based Coding Basic Linear Prediction, LPC Mixed Excitation LPC (MELP) 6 Acknowledgments
89 Acknowledgments Most, if not all, figures in this lecture come from the book: T. F. Quatieri, Discrete-Time Speech Signal Processing: Principles and Practice, Prentice Hall, 2002, and have been used with permission from Prentice Hall.
INFOTEH-JAHORINA Vol. 14, March 2015. Analysis of methods for speech signals quantization Stefan Stojkov Mihajlo Pupin Institute, University of Belgrade Belgrade, Serbia e-mail: stefan.stojkov@pupin.rs
More informationVector Quantization Encoder Decoder Original Form image Minimize distortion Table Channel Image Vectors Look-up (X, X i ) X may be a block of l
Vector Quantization Encoder Decoder Original Image Form image Vectors X Minimize distortion k k Table X^ k Channel d(x, X^ Look-up i ) X may be a block of l m image or X=( r, g, b ), or a block of DCT
More informationIntroduction p. 1 Compression Techniques p. 3 Lossless Compression p. 4 Lossy Compression p. 5 Measures of Performance p. 5 Modeling and Coding p.
Preface p. xvii Introduction p. 1 Compression Techniques p. 3 Lossless Compression p. 4 Lossy Compression p. 5 Measures of Performance p. 5 Modeling and Coding p. 6 Summary p. 10 Projects and Problems
More informationReview of Quantization. Quantization. Bring in Probability Distribution. L-level Quantization. Uniform partition
Review of Quantization UMCP ENEE631 Slides (created by M.Wu 004) Quantization UMCP ENEE631 Slides (created by M.Wu 001/004) L-level Quantization Minimize errors for this lossy process What L values to
More informationQuantization 2.1 QUANTIZATION AND THE SOURCE ENCODER
2 Quantization After the introduction to image and video compression presented in Chapter 1, we now address several fundamental aspects of image and video compression in the remaining chapters of Section
More informationE303: Communication Systems
E303: Communication Systems Professor A. Manikas Chair of Communications and Array Processing Imperial College London Principles of PCM Prof. A. Manikas (Imperial College) E303: Principles of PCM v.17
More informationELEN E4810: Digital Signal Processing Topic 11: Continuous Signals. 1. Sampling and Reconstruction 2. Quantization
ELEN E4810: Digital Signal Processing Topic 11: Continuous Signals 1. Sampling and Reconstruction 2. Quantization 1 1. Sampling & Reconstruction DSP must interact with an analog world: A to D D to A x(t)
More informationPCM Reference Chapter 12.1, Communication Systems, Carlson. PCM.1
PCM Reference Chapter 1.1, Communication Systems, Carlson. PCM.1 Pulse-code modulation (PCM) Pulse modulations use discrete time samples of analog signals the transmission is composed of analog information
More informationObjectives of Image Coding
Objectives of Image Coding Representation of an image with acceptable quality, using as small a number of bits as possible Applications: Reduction of channel bandwidth for image transmission Reduction
More informationETSI TS V5.0.0 ( )
Technical Specification Universal Mobile Telecommunications System (UMTS); AMR speech Codec; Transcoding Functions () 1 Reference RTS/TSGS-046090v500 Keywords UMTS 650 Route des Lucioles F-0691 Sophia
More informationQuantization of LSF Parameters Using A Trellis Modeling
1 Quantization of LSF Parameters Using A Trellis Modeling Farshad Lahouti, Amir K. Khandani Coding and Signal Transmission Lab. Dept. of E&CE, University of Waterloo, Waterloo, ON, N2L 3G1, Canada (farshad,
More informationL7: Linear prediction of speech
L7: Linear prediction of speech Introduction Linear prediction Finding the linear prediction coefficients Alternative representations This lecture is based on [Dutoit and Marques, 2009, ch1; Taylor, 2009,
More informationSignal representations: Cepstrum
Signal representations: Cepstrum Source-filter separation for sound production For speech, source corresponds to excitation by a pulse train for voiced phonemes and to turbulence (noise) for unvoiced phonemes,
More informationThe information loss in quantization
The information loss in quantization The rough meaning of quantization in the frame of coding is representing numerical quantities with a finite set of symbols. The mapping between numbers, which are normally
More informationEfficient Block Quantisation for Image and Speech Coding
Efficient Block Quantisation for Image and Speech Coding Stephen So, BEng (Hons) School of Microelectronic Engineering Faculty of Engineering and Information Technology Griffith University, Brisbane, Australia
More informationETSI EN V7.1.1 ( )
European Standard (Telecommunications series) Digital cellular telecommunications system (Phase +); Adaptive Multi-Rate (AMR) speech transcoding GLOBAL SYSTEM FOR MOBILE COMMUNICATIONS R Reference DEN/SMG-110690Q7
More informationCompression methods: the 1 st generation
Compression methods: the 1 st generation 1998-2017 Josef Pelikán CGG MFF UK Praha pepca@cgg.mff.cuni.cz http://cgg.mff.cuni.cz/~pepca/ Still1g 2017 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 1 / 32 Basic
More information7.1 Sampling and Reconstruction
Haberlesme Sistemlerine Giris (ELE 361) 6 Agustos 2017 TOBB Ekonomi ve Teknoloji Universitesi, Guz 2017-18 Dr. A. Melda Yuksel Turgut & Tolga Girici Lecture Notes Chapter 7 Analog to Digital Conversion
More informationMultimedia Communications. Differential Coding
Multimedia Communications Differential Coding Differential Coding In many sources, the source output does not change a great deal from one sample to the next. This means that both the dynamic range and
More informationResearch Article Adaptive Long-Term Coding of LSF Parameters Trajectories for Large-Delay/Very- to Ultra-Low Bit-Rate Speech Coding
Hindawi Publishing Corporation EURASIP Journal on Audio, Speech, and Music Processing Volume 00, Article ID 597039, 3 pages doi:0.55/00/597039 Research Article Adaptive Long-Term Coding of LSF Parameters
More informationVoiced Speech. Unvoiced Speech
Digital Speech Processing Lecture 2 Homomorphic Speech Processing General Discrete-Time Model of Speech Production p [ n] = p[ n] h [ n] Voiced Speech L h [ n] = A g[ n] v[ n] r[ n] V V V p [ n ] = u [
More informationRe-estimation of Linear Predictive Parameters in Sparse Linear Prediction
Downloaded from vbnaaudk on: januar 12, 2019 Aalborg Universitet Re-estimation of Linear Predictive Parameters in Sparse Linear Prediction Giacobello, Daniele; Murthi, Manohar N; Christensen, Mads Græsbøll;
More informationBASICS OF COMPRESSION THEORY
BASICS OF COMPRESSION THEORY Why Compression? Task: storage and transport of multimedia information. E.g.: non-interlaced HDTV: 0x0x0x = Mb/s!! Solutions: Develop technologies for higher bandwidth Find
More informationTimbral, Scale, Pitch modifications
Introduction Timbral, Scale, Pitch modifications M2 Mathématiques / Vision / Apprentissage Audio signal analysis, indexing and transformation Page 1 / 40 Page 2 / 40 Modification of playback speed Modifications
More informationEE368B Image and Video Compression
EE368B Image and Video Compression Homework Set #2 due Friday, October 20, 2000, 9 a.m. Introduction The Lloyd-Max quantizer is a scalar quantizer which can be seen as a special case of a vector quantizer
More informationE4702 HW#4-5 solutions by Anmo Kim
E70 HW#-5 solutions by Anmo Kim (ak63@columbia.edu). (P3.7) Midtread type uniform quantizer (figure 3.0(a) in Haykin) Gaussian-distributed random variable with zero mean and unit variance is applied to
More information3GPP TS V6.1.1 ( )
Technical Specification 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Speech codec speech processing functions; Adaptive Multi-Rate - Wideband (AMR-WB)
More informationAnalysis of Finite Wordlength Effects
Analysis of Finite Wordlength Effects Ideally, the system parameters along with the signal variables have infinite precision taing any value between and In practice, they can tae only discrete values within
More informationChapter 3. Quantization. 3.1 Scalar Quantizers
Chapter 3 Quantization As mentioned in the introduction, two operations are necessary to transform an analog waveform into a digital signal. The first action, sampling, consists of converting a continuous-time
More informationTime-domain representations
Time-domain representations Speech Processing Tom Bäckström Aalto University Fall 2016 Basics of Signal Processing in the Time-domain Time-domain signals Before we can describe speech signals or modelling
More informationLloyd-Max Quantization of Correlated Processes: How to Obtain Gains by Receiver-Sided Time-Variant Codebooks
Lloyd-Max Quantization of Correlated Processes: How to Obtain Gains by Receiver-Sided Time-Variant Codebooks Sai Han and Tim Fingscheidt Institute for Communications Technology, Technische Universität
More informationHARMONIC CODING OF SPEECH AT LOW BIT RATES
HARMONIC CODING OF SPEECH AT LOW BIT RATES Peter Lupini B. A.Sc. University of British Columbia, 1985 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
More informationDigital Signal Processing 2/ Advanced Digital Signal Processing Lecture 3, SNR, non-linear Quantisation Gerald Schuller, TU Ilmenau
Digital Signal Processing 2/ Advanced Digital Signal Processing Lecture 3, SNR, non-linear Quantisation Gerald Schuller, TU Ilmenau What is our SNR if we have a sinusoidal signal? What is its pdf? Basically
More informationExample: for source
Nonuniform scalar quantizer References: Sayood Chap. 9, Gersho and Gray, Chap.'s 5 and 6. The basic idea: For a nonuniform source density, put smaller cells and levels where the density is larger, thereby
More informationLog Likelihood Spectral Distance, Entropy Rate Power, and Mutual Information with Applications to Speech Coding
entropy Article Log Likelihood Spectral Distance, Entropy Rate Power, and Mutual Information with Applications to Speech Coding Jerry D. Gibson * and Preethi Mahadevan Department of Electrical and Computer
More informationat Some sort of quantization is necessary to represent continuous signals in digital form
Quantization at Some sort of quantization is necessary to represent continuous signals in digital form x(n 1,n ) x(t 1,tt ) D Sampler Quantizer x q (n 1,nn ) Digitizer (A/D) Quantization is also used for
More informationMultimedia Networking ECE 599
Multimedia Networking ECE 599 Prof. Thinh Nguyen School of Electrical Engineering and Computer Science Based on lectures from B. Lee, B. Girod, and A. Mukherjee 1 Outline Digital Signal Representation
More informationVECTOR QUANTIZATION OF SPEECH WITH NOISE CANCELLATION
VECTOR QUANTIZATION OF SPEECH WITH NOISE CANCELLATION Xiangyang Chen B. Sc. (Elec. Eng.), The Branch of Tsinghua University, 1983 A THESIS SUBMITTED LV PARTIAL FVLFILLMENT OF THE REQUIREMENTS FOR THE DEGREE
More informationImage Compression using DPCM with LMS Algorithm
Image Compression using DPCM with LMS Algorithm Reenu Sharma, Abhay Khedkar SRCEM, Banmore -----------------------------------------------------------------****---------------------------------------------------------------
More informationCommunication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi
Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi Lecture - 41 Pulse Code Modulation (PCM) So, if you remember we have been talking
More information6 Quantization of Discrete Time Signals
Ramachandran, R.P. Quantization of Discrete Time Signals Digital Signal Processing Handboo Ed. Vijay K. Madisetti and Douglas B. Williams Boca Raton: CRC Press LLC, 1999 c 1999byCRCPressLLC 6 Quantization
More informationETSI TS V ( )
TS 146 060 V14.0.0 (2017-04) TECHNICAL SPECIFICATION Digital cellular telecommunications system (Phase 2+) (GSM); Enhanced Full Rate (EFR) speech transcoding (3GPP TS 46.060 version 14.0.0 Release 14)
More informationPredictive Coding. Prediction Prediction in Images
Prediction Prediction in Images Predictive Coding Principle of Differential Pulse Code Modulation (DPCM) DPCM and entropy-constrained scalar quantization DPCM and transmission errors Adaptive intra-interframe
More informationPredictive Coding. Prediction
Predictive Coding Prediction Prediction in Images Principle of Differential Pulse Code Modulation (DPCM) DPCM and entropy-constrained scalar quantization DPCM and transmission errors Adaptive intra-interframe
More informationETSI TS V7.0.0 ( )
TS 6 9 V7.. (7-6) Technical Specification Digital cellular telecommunications system (Phase +); Universal Mobile Telecommunications System (UMTS); Speech codec speech processing functions; Adaptive Multi-Rate
More informationA Systematic Description of Source Significance Information
A Systematic Description of Source Significance Information Norbert Goertz Institute for Digital Communications School of Engineering and Electronics The University of Edinburgh Mayfield Rd., Edinburgh
More informationDesign of Optimal Quantizers for Distributed Source Coding
Design of Optimal Quantizers for Distributed Source Coding David Rebollo-Monedero, Rui Zhang and Bernd Girod Information Systems Laboratory, Electrical Eng. Dept. Stanford University, Stanford, CA 94305
More informationINTERNATIONAL TELECOMMUNICATION UNION. Wideband coding of speech at around 16 kbit/s using Adaptive Multi-Rate Wideband (AMR-WB)
INTERNATIONAL TELECOMMUNICATION UNION ITU-T TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU G.722.2 (07/2003) SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS Digital terminal equipments
More informationCh. 10 Vector Quantization. Advantages & Design
Ch. 10 Vector Quantization Advantages & Design 1 Advantages of VQ There are (at least) 3 main characteristics of VQ that help it outperform SQ: 1. Exploit Correlation within vectors 2. Exploit Shape Flexibility
More informationAN IMPROVED ADPCM DECODER BY ADAPTIVELY CONTROLLED QUANTIZATION INTERVAL CENTROIDS. Sai Han and Tim Fingscheidt
AN IMPROVED ADPCM DECODER BY ADAPTIVELY CONTROLLED QUANTIZATION INTERVAL CENTROIDS Sai Han and Tim Fingscheidt Institute for Communications Technology, Technische Universität Braunschweig Schleinitzstr.
More informationBASIC COMPRESSION TECHNIQUES
BASIC COMPRESSION TECHNIQUES N. C. State University CSC557 Multimedia Computing and Networking Fall 2001 Lectures # 05 Questions / Problems / Announcements? 2 Matlab demo of DFT Low-pass windowed-sinc
More informationNEAR EAST UNIVERSITY
NEAR EAST UNIVERSITY GRADUATE SCHOOL OF APPLIED ANO SOCIAL SCIENCES LINEAR PREDICTIVE CODING \ Burak Alacam Master Thesis Department of Electrical and Electronic Engineering Nicosia - 2002 Burak Alacam:
More informationCEPSTRAL ANALYSIS SYNTHESIS ON THE MEL FREQUENCY SCALE, AND AN ADAPTATIVE ALGORITHM FOR IT.
CEPSTRAL ANALYSIS SYNTHESIS ON THE EL FREQUENCY SCALE, AND AN ADAPTATIVE ALGORITH FOR IT. Summarized overview of the IEEE-publicated papers Cepstral analysis synthesis on the mel frequency scale by Satochi
More informationITU-T G khz audio-coding within 64 kbit/s
International Telecommunication Union ITU-T G.722 TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU (9/212) SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS Digital terminal equipments
More informationSignal Modeling Techniques in Speech Recognition. Hassan A. Kingravi
Signal Modeling Techniques in Speech Recognition Hassan A. Kingravi Outline Introduction Spectral Shaping Spectral Analysis Parameter Transforms Statistical Modeling Discussion Conclusions 1: Introduction
More informationEXAMPLE OF SCALAR AND VECTOR QUANTIZATION
EXAMPLE OF SCALAR AD VECTOR QUATIZATIO Source sequence : This could be the output of a highly correlated source. A scalar quantizer: =1, M=4 C 1 = {w 1,w 2,w 3,w 4 } = {-4, -1, 1, 4} = codeboo of quantization
More informationSpeaker Identification Based On Discriminative Vector Quantization And Data Fusion
University of Central Florida Electronic Theses and Dissertations Doctoral Dissertation (Open Access) Speaker Identification Based On Discriminative Vector Quantization And Data Fusion 2005 Guangyu Zhou
More informationModule 3. Quantization and Coding. Version 2, ECE IIT, Kharagpur
Module Quantization and Coding ersion, ECE IIT, Kharagpur Lesson Logarithmic Pulse Code Modulation (Log PCM) and Companding ersion, ECE IIT, Kharagpur After reading this lesson, you will learn about: Reason
More informationImage Coding. Chapter 10. Contents. (Related to Ch. 10 of Lim.) 10.1
Chapter 1 Image Coding Contents Introduction..................................................... 1. Quantization..................................................... 1.3 Scalar quantization...............................................
More informationBasic Principles of Video Coding
Basic Principles of Video Coding Introduction Categories of Video Coding Schemes Information Theory Overview of Video Coding Techniques Predictive coding Transform coding Quantization Entropy coding Motion
More informationEE-597 Notes Quantization
EE-597 Notes Quantization Phil Schniter June, 4 Quantization Given a continuous-time and continuous-amplitude signal (t, processing and storage by modern digital hardware requires discretization in both
More informationLOW COMPLEX FORWARD ADAPTIVE LOSS COMPRESSION ALGORITHM AND ITS APPLICATION IN SPEECH CODING
Journal of ELECTRICAL ENGINEERING, VOL. 62, NO. 1, 2011, 19 24 LOW COMPLEX FORWARD ADAPTIVE LOSS COMPRESSION ALGORITHM AND ITS APPLICATION IN SPEECH CODING Jelena Nikolić Zoran Perić Dragan Antić Aleksandra
More information