CS578- Speech Signal Processing

Size: px
Start display at page:

Download "CS578- Speech Signal Processing"

Transcription

1 CS578- Speech Signal Processing Lecture 7: Speech Coding Yannis Stylianou University of Crete, Computer Science Dept., Multimedia Informatics Lab Univ. of Crete

2 Outline 1 Introduction 2 Statistical Models 3 Scalar Quantization Max Quantizer Companding Adaptive quantization Differential and Residual quantization 4 Vector Quantization The k-means algorithm The LBG algorithm 5 Model-based Coding Basic Linear Prediction, LPC Mixed Excitation LPC (MELP) 6 Acknowledgments

3 Digital telephone communication system

4 Categories of speech coders Waveform coders (16-64 kbps, f s = 8000Hz) Hybrid coders ( kbps, f s = 8000Hz) Vocoders ( kbps, f s = 8000Hz)

5 Categories of speech coders Waveform coders (16-64 kbps, f s = 8000Hz) Hybrid coders ( kbps, f s = 8000Hz) Vocoders ( kbps, f s = 8000Hz)

6 Categories of speech coders Waveform coders (16-64 kbps, f s = 8000Hz) Hybrid coders ( kbps, f s = 8000Hz) Vocoders ( kbps, f s = 8000Hz)

7 Speech quality Closeness of the processed speech waveform to the original speech waveform Naturalness Background artifacts Intelligibility Speaker identifiability

8 Speech quality Closeness of the processed speech waveform to the original speech waveform Naturalness Background artifacts Intelligibility Speaker identifiability

9 Speech quality Closeness of the processed speech waveform to the original speech waveform Naturalness Background artifacts Intelligibility Speaker identifiability

10 Speech quality Closeness of the processed speech waveform to the original speech waveform Naturalness Background artifacts Intelligibility Speaker identifiability

11 Speech quality Closeness of the processed speech waveform to the original speech waveform Naturalness Background artifacts Intelligibility Speaker identifiability

12 Measuring Speech Quality Subjective tests: Diagnostic Rhyme Test (DRT) Diagnostic Acceptability Measure (DAM) Mean Opinion Score (MOS) Objective tests: Segmental Signal-to-Noise Ratio (SNR) Articulation Index

13 Measuring Speech Quality Subjective tests: Diagnostic Rhyme Test (DRT) Diagnostic Acceptability Measure (DAM) Mean Opinion Score (MOS) Objective tests: Segmental Signal-to-Noise Ratio (SNR) Articulation Index

14 Measuring Speech Quality Subjective tests: Diagnostic Rhyme Test (DRT) Diagnostic Acceptability Measure (DAM) Mean Opinion Score (MOS) Objective tests: Segmental Signal-to-Noise Ratio (SNR) Articulation Index

15 Measuring Speech Quality Subjective tests: Diagnostic Rhyme Test (DRT) Diagnostic Acceptability Measure (DAM) Mean Opinion Score (MOS) Objective tests: Segmental Signal-to-Noise Ratio (SNR) Articulation Index

16 Measuring Speech Quality Subjective tests: Diagnostic Rhyme Test (DRT) Diagnostic Acceptability Measure (DAM) Mean Opinion Score (MOS) Objective tests: Segmental Signal-to-Noise Ratio (SNR) Articulation Index

17 Quantization Statistical models of speech (preliminary) Scalar quantization (i.e., waveform coding) Vector quantization (i.e., subband and sinusoidal coding)

18 Quantization Statistical models of speech (preliminary) Scalar quantization (i.e., waveform coding) Vector quantization (i.e., subband and sinusoidal coding)

19 Quantization Statistical models of speech (preliminary) Scalar quantization (i.e., waveform coding) Vector quantization (i.e., subband and sinusoidal coding)

20 Linear Prediction Coding, LPC Classic LPC Mixed Excitation Linear Prediction, MELP Multipulse LPC Code Excited Linear Prediction (CELP)

21 Linear Prediction Coding, LPC Classic LPC Mixed Excitation Linear Prediction, MELP Multipulse LPC Code Excited Linear Prediction (CELP)

22 Linear Prediction Coding, LPC Classic LPC Mixed Excitation Linear Prediction, MELP Multipulse LPC Code Excited Linear Prediction (CELP)

23 Linear Prediction Coding, LPC Classic LPC Mixed Excitation Linear Prediction, MELP Multipulse LPC Code Excited Linear Prediction (CELP)

24 Outline 1 Introduction 2 Statistical Models 3 Scalar Quantization Max Quantizer Companding Adaptive quantization Differential and Residual quantization 4 Vector Quantization The k-means algorithm The LBG algorithm 5 Model-based Coding Basic Linear Prediction, LPC Mixed Excitation LPC (MELP) 6 Acknowledgments

25 Probability Density of speech By setting x[n] x, the histogram of speech samples can be approximated by a gamma density: p X (x) = or by a simpler Laplacian density: ( ) 1/2 3 e 3 x 2σx 8πσ x x p X (x) = 1 e 3 x σx 2σx

26 Densities comparison

27 Outline 1 Introduction 2 Statistical Models 3 Scalar Quantization Max Quantizer Companding Adaptive quantization Differential and Residual quantization 4 Vector Quantization The k-means algorithm The LBG algorithm 5 Model-based Coding Basic Linear Prediction, LPC Mixed Excitation LPC (MELP) 6 Acknowledgments

28 Coding and Decoding

29 Fundamentals of Scalar Coding Let s quantize a single sample speech value, x[n] into M reconstruction or decision levels: ˆx[n] = ˆx i = Q(x[n]), x i 1 < x[n] x i with 1 i M and x k denotes the M decision levels with 0 k M. Assign a codeword in each reconstruction level. Collection of codewords makes a codebook. Using B-bit binary codebook we can represent each 2 B different quantization (reconstruction) levels. Bit rate, I, is defined as: I = Bf s

30 Fundamentals of Scalar Coding Let s quantize a single sample speech value, x[n] into M reconstruction or decision levels: ˆx[n] = ˆx i = Q(x[n]), x i 1 < x[n] x i with 1 i M and x k denotes the M decision levels with 0 k M. Assign a codeword in each reconstruction level. Collection of codewords makes a codebook. Using B-bit binary codebook we can represent each 2 B different quantization (reconstruction) levels. Bit rate, I, is defined as: I = Bf s

31 Fundamentals of Scalar Coding Let s quantize a single sample speech value, x[n] into M reconstruction or decision levels: ˆx[n] = ˆx i = Q(x[n]), x i 1 < x[n] x i with 1 i M and x k denotes the M decision levels with 0 k M. Assign a codeword in each reconstruction level. Collection of codewords makes a codebook. Using B-bit binary codebook we can represent each 2 B different quantization (reconstruction) levels. Bit rate, I, is defined as: I = Bf s

32 Fundamentals of Scalar Coding Let s quantize a single sample speech value, x[n] into M reconstruction or decision levels: ˆx[n] = ˆx i = Q(x[n]), x i 1 < x[n] x i with 1 i M and x k denotes the M decision levels with 0 k M. Assign a codeword in each reconstruction level. Collection of codewords makes a codebook. Using B-bit binary codebook we can represent each 2 B different quantization (reconstruction) levels. Bit rate, I, is defined as: I = Bf s

33 Uniform quantization x i x i 1 =, 1 i M ˆx i = x i +x i 1 2, 1 i M is referred to as uniform quantization step size. Example of a 2-bit uniform quantization:

34 Uniform quantization: designing decision regions Signal range: 4σ x x[n] 4σ x Assuming B-bit binary codebook, we get 2 B quantization (reconstruction) levels Quantization step size, : and quantization noise. = 2x max 2 B

35 Classes of Quantization noise There are two classes of quantization noise: Granular Distortion: ˆx[n] = x[n] + e[n] where e[n] is the quantization noise, with: 2 e[n] 2 Overload Distortion: clipped samples

36 Assumptions Quantization noise is an ergodic white-noise random process: r e [m] = E(e[n]e[n + m]) = σ 2 e, m = 0 = 0, m 0 Quantization noise and input signal are uncorrelated: E(x[n]e[n + m]) = 0 m Quantization noise is uniform over the quantization interval p e (e) = 1, 2 e 2 = 0, otherwise

37 Assumptions Quantization noise is an ergodic white-noise random process: r e [m] = E(e[n]e[n + m]) = σ 2 e, m = 0 = 0, m 0 Quantization noise and input signal are uncorrelated: E(x[n]e[n + m]) = 0 m Quantization noise is uniform over the quantization interval p e (e) = 1, 2 e 2 = 0, otherwise

38 Assumptions Quantization noise is an ergodic white-noise random process: r e [m] = E(e[n]e[n + m]) = σ 2 e, m = 0 = 0, m 0 Quantization noise and input signal are uncorrelated: E(x[n]e[n + m]) = 0 m Quantization noise is uniform over the quantization interval p e (e) = 1, 2 e 2 = 0, otherwise

39 Dithering Definition (Dithering) We can force e[n] to be white and uncorrelated with x[n] by adding noise to x[n] before quantization!

40 Signal-to-Noise Ratio To quantify the severity of the quantization noise, we define the Signal-to-Noise Ratio (SNR) as: SNR = = σx 2 σ e 2 E(x 2 [n]) E(e 2 [n]) 1 N 1 N 1 N n=0 x2 [n] N 1 n=0 e2 [n] For uniform pdf and quantizer range 2x max : σe 2 = 2 12 x = max B Or and in db: and since x max = 4σ x : SNR = 3 22B ( xmax σx )2 ( ) xmax SNR(dB) 6B log 10 SNR(dB) 6B 7.2 σ x

41 Signal-to-Noise Ratio To quantify the severity of the quantization noise, we define the Signal-to-Noise Ratio (SNR) as: SNR = = σx 2 σ e 2 E(x 2 [n]) E(e 2 [n]) 1 N 1 N 1 N n=0 x2 [n] N 1 n=0 e2 [n] For uniform pdf and quantizer range 2x max : σe 2 = 2 12 x = max B Or and in db: and since x max = 4σ x : SNR = 3 22B ( xmax σx )2 ( ) xmax SNR(dB) 6B log 10 SNR(dB) 6B 7.2 σ x

42 Signal-to-Noise Ratio To quantify the severity of the quantization noise, we define the Signal-to-Noise Ratio (SNR) as: SNR = = σx 2 σ e 2 E(x 2 [n]) E(e 2 [n]) 1 N 1 N 1 N n=0 x2 [n] N 1 n=0 e2 [n] For uniform pdf and quantizer range 2x max : σe 2 = 2 12 x = max B Or and in db: and since x max = 4σ x : SNR = 3 22B ( xmax σx )2 ( ) xmax SNR(dB) 6B log 10 SNR(dB) 6B 7.2 σ x

43 Signal-to-Noise Ratio To quantify the severity of the quantization noise, we define the Signal-to-Noise Ratio (SNR) as: SNR = = σx 2 σ e 2 E(x 2 [n]) E(e 2 [n]) 1 N 1 N 1 N n=0 x2 [n] N 1 n=0 e2 [n] For uniform pdf and quantizer range 2x max : σe 2 = 2 12 x = max B Or and in db: and since x max = 4σ x : SNR = 3 22B ( xmax σx )2 ( ) xmax SNR(dB) 6B log 10 SNR(dB) 6B 7.2 σ x

44 Signal-to-Noise Ratio To quantify the severity of the quantization noise, we define the Signal-to-Noise Ratio (SNR) as: SNR = = σx 2 σ e 2 E(x 2 [n]) E(e 2 [n]) 1 N 1 N 1 N n=0 x2 [n] N 1 n=0 e2 [n] For uniform pdf and quantizer range 2x max : σe 2 = 2 12 x = max B Or and in db: and since x max = 4σ x : SNR = 3 22B ( xmax σx )2 ( ) xmax SNR(dB) 6B log 10 SNR(dB) 6B 7.2 σ x

45 Pulse Code Modulation, PCM B bits of information per sample are transmitted as a codeword instantaneous coding not signal-specific 11 bits are required for toll quality what is the rate for f s = 10kHz? For CD, with f s = and B = 16 (16-bit PCM), what is the SNR?

46 Pulse Code Modulation, PCM B bits of information per sample are transmitted as a codeword instantaneous coding not signal-specific 11 bits are required for toll quality what is the rate for f s = 10kHz? For CD, with f s = and B = 16 (16-bit PCM), what is the SNR?

47 Pulse Code Modulation, PCM B bits of information per sample are transmitted as a codeword instantaneous coding not signal-specific 11 bits are required for toll quality what is the rate for f s = 10kHz? For CD, with f s = and B = 16 (16-bit PCM), what is the SNR?

48 Pulse Code Modulation, PCM B bits of information per sample are transmitted as a codeword instantaneous coding not signal-specific 11 bits are required for toll quality what is the rate for f s = 10kHz? For CD, with f s = and B = 16 (16-bit PCM), what is the SNR?

49 Pulse Code Modulation, PCM B bits of information per sample are transmitted as a codeword instantaneous coding not signal-specific 11 bits are required for toll quality what is the rate for f s = 10kHz? For CD, with f s = and B = 16 (16-bit PCM), what is the SNR?

50 Pulse Code Modulation, PCM B bits of information per sample are transmitted as a codeword instantaneous coding not signal-specific 11 bits are required for toll quality what is the rate for f s = 10kHz? For CD, with f s = and B = 16 (16-bit PCM), what is the SNR?

51 Optimal decision and reconstruction level if x[n] p x (x) we determine the optimal decision level, x i and the reconstruction level, ˆx, by minimizing: D = E[(ˆx x) 2 ] = p x(x)(ˆx x) 2 dx and assuming M reconstruction levels ˆx = Q[x]: D = M = i=1 xi x i 1 p x (x)(ˆx i x) 2 dx So: D ˆx k = 0, 1 k M D x k = 0, 1 k M 1

52 Optimal decision and reconstruction level if x[n] p x (x) we determine the optimal decision level, x i and the reconstruction level, ˆx, by minimizing: D = E[(ˆx x) 2 ] = p x(x)(ˆx x) 2 dx and assuming M reconstruction levels ˆx = Q[x]: D = M = i=1 xi x i 1 p x (x)(ˆx i x) 2 dx So: D ˆx k = 0, 1 k M D x k = 0, 1 k M 1

53 Optimal decision and reconstruction level if x[n] p x (x) we determine the optimal decision level, x i and the reconstruction level, ˆx, by minimizing: D = E[(ˆx x) 2 ] = p x(x)(ˆx x) 2 dx and assuming M reconstruction levels ˆx = Q[x]: D = M = i=1 xi x i 1 p x (x)(ˆx i x) 2 dx So: D ˆx k = 0, 1 k M D x k = 0, 1 k M 1

54 Optimal decision and reconstruction level, cont. The minimization of D over decision level, x k, gives: x k = ˆx k+1 + ˆx k, 1 k M 1 2 The minimization of D over reconstruction level, ˆx k, gives: ] xdx ˆx k = x k x k 1 [ p x (x) xk x p x (s)ds k 1 = x k x k 1 p x (x)xdx

55 Optimal decision and reconstruction level, cont. The minimization of D over decision level, x k, gives: x k = ˆx k+1 + ˆx k, 1 k M 1 2 The minimization of D over reconstruction level, ˆx k, gives: ] xdx ˆx k = x k x k 1 [ p x (x) xk x p x (s)ds k 1 = x k x k 1 p x (x)xdx

56 Example with Laplacian pdf

57 Principle of Companding

58 Companding examples Companding examples: Transformation to a uniform density: g[n] = T (x[n]) = x[n] p x(s)ds 1 2, 1 2 g[n] 1 2 = 0 elsewhere µ-law: x max ) T (x[n]) = x max log (1 + µ x[n] log (1 + µ) sign(x[n])

59 Companding examples Companding examples: Transformation to a uniform density: g[n] = T (x[n]) = x[n] p x(s)ds 1 2, 1 2 g[n] 1 2 = 0 elsewhere µ-law: x max ) T (x[n]) = x max log (1 + µ x[n] log (1 + µ) sign(x[n])

60 Adaptive quantization

61 Differential and Residual quantization

62 Outline 1 Introduction 2 Statistical Models 3 Scalar Quantization Max Quantizer Companding Adaptive quantization Differential and Residual quantization 4 Vector Quantization The k-means algorithm The LBG algorithm 5 Model-based Coding Basic Linear Prediction, LPC Mixed Excitation LPC (MELP) 6 Acknowledgments

63 Motivation for VQ

64 Comparing scalar and vector quantization

65 Distortion in VQ Here we have a multidimensional pdf p x (x): D = E[(ˆx x) T (ˆx x)] = (ˆx x)t (ˆx x)p x (x)dx = M i=1 x C i (r i x) T (r i x)p x (x)dx Two constraints: A vector x must be quantized to a reconstruction level r i that gives the smallest distortion: C i = {x : x r i 2 x r l 2, l = 1, 2,, M} Each reconstruction level r i must be the centroid of the corresponding decision region, i.e., of the cell C i : x r i = m C i x m i = 1, 2,, M x m C i 1

66 Distortion in VQ Here we have a multidimensional pdf p x (x): D = E[(ˆx x) T (ˆx x)] = (ˆx x)t (ˆx x)p x (x)dx = M i=1 x C i (r i x) T (r i x)p x (x)dx Two constraints: A vector x must be quantized to a reconstruction level r i that gives the smallest distortion: C i = {x : x r i 2 x r l 2, l = 1, 2,, M} Each reconstruction level r i must be the centroid of the corresponding decision region, i.e., of the cell C i : x r i = m C i x m i = 1, 2,, M x m C i 1

67 The k-means algorithm S1: D = 1 N 1 (ˆx k x k ) T (ˆx k x k ) N k=0 S2: Pick an initial guess at the reconstruction levels {r i } S3: For each x k elect an r i closest to x k. Form clusters (clustering step) S4: Find the mean of x k in each cluster which gives a new r i. Compute D. S5: Stop when the change in D over two consecutive iterations is insignificant.

68 The LBG algorithm Set the desired number of cells: M = 2 B Set an initial codebook C (0) with one codevector which is set as the average of the entire training sequence, x k, k = 1, 2,, N. Split the codevector into two and get an initial new codebook C (1). Perform a k-means algorithm to optimize the codebook and get the final C (1) Split the final codevectors into four and repeat the above process until the desired number of cells is reached.

69 The LBG algorithm Set the desired number of cells: M = 2 B Set an initial codebook C (0) with one codevector which is set as the average of the entire training sequence, x k, k = 1, 2,, N. Split the codevector into two and get an initial new codebook C (1). Perform a k-means algorithm to optimize the codebook and get the final C (1) Split the final codevectors into four and repeat the above process until the desired number of cells is reached.

70 The LBG algorithm Set the desired number of cells: M = 2 B Set an initial codebook C (0) with one codevector which is set as the average of the entire training sequence, x k, k = 1, 2,, N. Split the codevector into two and get an initial new codebook C (1). Perform a k-means algorithm to optimize the codebook and get the final C (1) Split the final codevectors into four and repeat the above process until the desired number of cells is reached.

71 The LBG algorithm Set the desired number of cells: M = 2 B Set an initial codebook C (0) with one codevector which is set as the average of the entire training sequence, x k, k = 1, 2,, N. Split the codevector into two and get an initial new codebook C (1). Perform a k-means algorithm to optimize the codebook and get the final C (1) Split the final codevectors into four and repeat the above process until the desired number of cells is reached.

72 The LBG algorithm Set the desired number of cells: M = 2 B Set an initial codebook C (0) with one codevector which is set as the average of the entire training sequence, x k, k = 1, 2,, N. Split the codevector into two and get an initial new codebook C (1). Perform a k-means algorithm to optimize the codebook and get the final C (1) Split the final codevectors into four and repeat the above process until the desired number of cells is reached.

73 Outline 1 Introduction 2 Statistical Models 3 Scalar Quantization Max Quantizer Companding Adaptive quantization Differential and Residual quantization 4 Vector Quantization The k-means algorithm The LBG algorithm 5 Model-based Coding Basic Linear Prediction, LPC Mixed Excitation LPC (MELP) 6 Acknowledgments

74 Basic coding scheme in LPC Vocal tract system function: where H(z) = P(z) = A 1 P(z) p a k z 1 k=1 Input is binary: impulse/noise excitation. If frame rate is 100 frames/s and we use 13 parameters (p = 10, 1 for Gain, 1 for pitch, 1 for voicing decision) we need 1300 parameters/s, instead of 8000 samples/s for f s = 8000Hz.

75 Basic coding scheme in LPC Vocal tract system function: where H(z) = P(z) = A 1 P(z) p a k z 1 k=1 Input is binary: impulse/noise excitation. If frame rate is 100 frames/s and we use 13 parameters (p = 10, 1 for Gain, 1 for pitch, 1 for voicing decision) we need 1300 parameters/s, instead of 8000 samples/s for f s = 8000Hz.

76 Basic coding scheme in LPC Vocal tract system function: where H(z) = P(z) = A 1 P(z) p a k z 1 k=1 Input is binary: impulse/noise excitation. If frame rate is 100 frames/s and we use 13 parameters (p = 10, 1 for Gain, 1 for pitch, 1 for voicing decision) we need 1300 parameters/s, instead of 8000 samples/s for f s = 8000Hz.

77 Scalar quantization within LPC For 7200 bps: Voiced/unvoiced decision: 1 bit Pitch (if voiced): 6 bits (uniform) Gain: 5 bits (nonuniform) Poles d i : 10 bits (nonuniform) [5 bits for frequency and 5 bits for bandwidth] 6 poles = 60 bits So: ( ) 100 frames/s = 7200 bps

78 Scalar quantization within LPC For 7200 bps: Voiced/unvoiced decision: 1 bit Pitch (if voiced): 6 bits (uniform) Gain: 5 bits (nonuniform) Poles d i : 10 bits (nonuniform) [5 bits for frequency and 5 bits for bandwidth] 6 poles = 60 bits So: ( ) 100 frames/s = 7200 bps

79 Refinements to the basic LPC coding scheme Companding in the form of a logarithmic operator on pitch and gain Instead of poles use the reflection (or the PARCOR) coefficients k i,(nonuniform) Companding of k i : g i = T [k ( i ] ) = log 1 ki 1+k i Coefficients g i can be coded at 5-6 bits each! (which results in 4800 bps for an order 6 predictor, and 100 frames/s) Reduce the frame rate by a factor of two (50 frames/s) gives us a bit rate of 2400 bps

80 Refinements to the basic LPC coding scheme Companding in the form of a logarithmic operator on pitch and gain Instead of poles use the reflection (or the PARCOR) coefficients k i,(nonuniform) Companding of k i : g i = T [k ( i ] ) = log 1 ki 1+k i Coefficients g i can be coded at 5-6 bits each! (which results in 4800 bps for an order 6 predictor, and 100 frames/s) Reduce the frame rate by a factor of two (50 frames/s) gives us a bit rate of 2400 bps

81 Refinements to the basic LPC coding scheme Companding in the form of a logarithmic operator on pitch and gain Instead of poles use the reflection (or the PARCOR) coefficients k i,(nonuniform) Companding of k i : g i = T [k ( i ] ) = log 1 ki 1+k i Coefficients g i can be coded at 5-6 bits each! (which results in 4800 bps for an order 6 predictor, and 100 frames/s) Reduce the frame rate by a factor of two (50 frames/s) gives us a bit rate of 2400 bps

82 Refinements to the basic LPC coding scheme Companding in the form of a logarithmic operator on pitch and gain Instead of poles use the reflection (or the PARCOR) coefficients k i,(nonuniform) Companding of k i : g i = T [k ( i ] ) = log 1 ki 1+k i Coefficients g i can be coded at 5-6 bits each! (which results in 4800 bps for an order 6 predictor, and 100 frames/s) Reduce the frame rate by a factor of two (50 frames/s) gives us a bit rate of 2400 bps

83 Refinements to the basic LPC coding scheme Companding in the form of a logarithmic operator on pitch and gain Instead of poles use the reflection (or the PARCOR) coefficients k i,(nonuniform) Companding of k i : g i = T [k ( i ] ) = log 1 ki 1+k i Coefficients g i can be coded at 5-6 bits each! (which results in 4800 bps for an order 6 predictor, and 100 frames/s) Reduce the frame rate by a factor of two (50 frames/s) gives us a bit rate of 2400 bps

84 VQ in LPC coding A 10-bit codebook (1024 codewords), 800 bps VQ provides a comparable quality to a 2400 bps scalar quantizer. A 22-bit codebook ( codewords), 2400 bps VQ provides a higher output speech quality.

85 Unique components of MELP Mixed pulse and noise excitation Periodic or aperiodic pulses Adaptive spectral enhancements Pulse dispersion filter

86 Line Spectral Frequencies (LSFs) in MELP LSFs for a pth order all-pole model are defines as follows: 1 Form two polynomials: P(z) = A(z) + z (p+1) A(z 1 ) Q(z) = A(z) z (p+1) A(z 1 ) 2 Find the roots of P(z) and Q(z), ωi which are on the unit circle. 3 Exclude trivial roots at ωi = 0 and ω i = π.

87 MELP coding For a 2400 bps: 34 bits allocated to scalar quantization of the LSFs 8 bits for gain 7 bits for pitch and overall voicing 5 bits for multi-band voicing 1 bit for the jittery state which is 54 bits. With a frame rate of 22.5 ms, we get an 2400 bps coder.

88 Outline 1 Introduction 2 Statistical Models 3 Scalar Quantization Max Quantizer Companding Adaptive quantization Differential and Residual quantization 4 Vector Quantization The k-means algorithm The LBG algorithm 5 Model-based Coding Basic Linear Prediction, LPC Mixed Excitation LPC (MELP) 6 Acknowledgments

89 Acknowledgments Most, if not all, figures in this lecture are coming from the book: T. F. Quatieri: Discrete-Time Speech Signal Processing, principles and practice 2002, Prentice Hall and have been used after permission from Prentice Hall

90

1. Probability density function for speech samples. Gamma. Laplacian. 2. Coding paradigms. =(2X max /2 B ) for a B-bit quantizer Δ Δ Δ Δ Δ

1. Probability density function for speech samples. Gamma. Laplacian. 2. Coding paradigms. =(2X max /2 B ) for a B-bit quantizer Δ Δ Δ Δ Δ Digital Speech Processing Lecture 16 Speech Coding Methods Based on Speech Waveform Representations and Speech Models Adaptive and Differential Coding 1 Speech Waveform Coding-Summary of Part 1 1. Probability

More information

Pulse-Code Modulation (PCM) :

Pulse-Code Modulation (PCM) : PCM & DPCM & DM 1 Pulse-Code Modulation (PCM) : In PCM each sample of the signal is quantized to one of the amplitude levels, where B is the number of bits used to represent each sample. The rate from

More information

Class of waveform coders can be represented in this manner

Class of waveform coders can be represented in this manner Digital Speech Processing Lecture 15 Speech Coding Methods Based on Speech Waveform Representations ti and Speech Models Uniform and Non- Uniform Coding Methods 1 Analog-to-Digital Conversion (Sampling

More information

Digital Signal Processing

Digital Signal Processing COMP ENG 4TL4: Digital Signal Processing Notes for Lecture #3 Wednesday, September 10, 2003 1.4 Quantization Digital systems can only represent sample amplitudes with a finite set of prescribed values,

More information

Source modeling (block processing)

Source modeling (block processing) Digital Speech Processing Lecture 17 Speech Coding Methods Based on Speech Models 1 Waveform Coding versus Block Waveform coding Processing sample-by-sample matching of waveforms coding gquality measured

More information

EE 5345 Biomedical Instrumentation Lecture 12: slides

EE 5345 Biomedical Instrumentation Lecture 12: slides EE 5345 Biomedical Instrumentation Lecture 1: slides 4-6 Carlos E. Davila, Electrical Engineering Dept. Southern Methodist University slides can be viewed at: http:// www.seas.smu.edu/~cd/ee5345.html EE

More information

Design of a CELP coder and analysis of various quantization techniques

Design of a CELP coder and analysis of various quantization techniques EECS 65 Project Report Design of a CELP coder and analysis of various quantization techniques Prof. David L. Neuhoff By: Awais M. Kamboh Krispian C. Lawrence Aditya M. Thomas Philip I. Tsai Winter 005

More information

The Secrets of Quantization. Nimrod Peleg Update: Sept. 2009

The Secrets of Quantization. Nimrod Peleg Update: Sept. 2009 The Secrets of Quantization Nimrod Peleg Update: Sept. 2009 What is Quantization Representation of a large set of elements with a much smaller set is called quantization. The number of elements in the

More information

Scalar and Vector Quantization. National Chiao Tung University Chun-Jen Tsai 11/06/2014

Scalar and Vector Quantization. National Chiao Tung University Chun-Jen Tsai 11/06/2014 Scalar and Vector Quantization National Chiao Tung University Chun-Jen Tsai 11/06/014 Basic Concept of Quantization Quantization is the process of representing a large, possibly infinite, set of values

More information

Chapter 9. Linear Predictive Analysis of Speech Signals 语音信号的线性预测分析

Chapter 9. Linear Predictive Analysis of Speech Signals 语音信号的线性预测分析 Chapter 9 Linear Predictive Analysis of Speech Signals 语音信号的线性预测分析 1 LPC Methods LPC methods are the most widely used in speech coding, speech synthesis, speech recognition, speaker recognition and verification

More information

Speech Coding. Speech Processing. Tom Bäckström. October Aalto University

Speech Coding. Speech Processing. Tom Bäckström. October Aalto University Speech Coding Speech Processing Tom Bäckström Aalto University October 2015 Introduction Speech coding refers to the digital compression of speech signals for telecommunication (and storage) applications.

More information

Gaussian source Assumptions d = (x-y) 2, given D, find lower bound of I(X;Y)

Gaussian source Assumptions d = (x-y) 2, given D, find lower bound of I(X;Y) Gaussian source Assumptions d = (x-y) 2, given D, find lower bound of I(X;Y) E{(X-Y) 2 } D

More information

SPEECH ANALYSIS AND SYNTHESIS

SPEECH ANALYSIS AND SYNTHESIS 16 Chapter 2 SPEECH ANALYSIS AND SYNTHESIS 2.1 INTRODUCTION: Speech signal analysis is used to characterize the spectral information of an input speech signal. Speech signal analysis [52-53] techniques

More information

Linear Prediction Coding. Nimrod Peleg Update: Aug. 2007

Linear Prediction Coding. Nimrod Peleg Update: Aug. 2007 Linear Prediction Coding Nimrod Peleg Update: Aug. 2007 1 Linear Prediction and Speech Coding The earliest papers on applying LPC to speech: Atal 1968, 1970, 1971 Markel 1971, 1972 Makhoul 1975 This is

More information

Multimedia Systems Giorgio Leonardi A.A Lecture 4 -> 6 : Quantization

Multimedia Systems Giorgio Leonardi A.A Lecture 4 -> 6 : Quantization Multimedia Systems Giorgio Leonardi A.A.2014-2015 Lecture 4 -> 6 : Quantization Overview Course page (D.I.R.): https://disit.dir.unipmn.it/course/view.php?id=639 Consulting: Office hours by appointment:

More information

Principles of Communications

Principles of Communications Principles of Communications Weiyao Lin, PhD Shanghai Jiao Tong University Chapter 4: Analog-to-Digital Conversion Textbook: 7.1 7.4 2010/2011 Meixia Tao @ SJTU 1 Outline Analog signal Sampling Quantization

More information

The Equivalence of ADPCM and CELP Coding

The Equivalence of ADPCM and CELP Coding The Equivalence of ADPCM and CELP Coding Peter Kabal Department of Electrical & Computer Engineering McGill University Montreal, Canada Version.2 March 20 c 20 Peter Kabal 20/03/ You are free: to Share

More information

Finite Word Length Effects and Quantisation Noise. Professors A G Constantinides & L R Arnaut

Finite Word Length Effects and Quantisation Noise. Professors A G Constantinides & L R Arnaut Finite Word Length Effects and Quantisation Noise 1 Finite Word Length Effects Finite register lengths and A/D converters cause errors at different levels: (i) input: Input quantisation (ii) system: Coefficient

More information

Sinusoidal Modeling. Yannis Stylianou SPCC University of Crete, Computer Science Dept., Greece,

Sinusoidal Modeling. Yannis Stylianou SPCC University of Crete, Computer Science Dept., Greece, Sinusoidal Modeling Yannis Stylianou University of Crete, Computer Science Dept., Greece, yannis@csd.uoc.gr SPCC 2016 1 Speech Production 2 Modulators 3 Sinusoidal Modeling Sinusoidal Models Voiced Speech

More information

L used in various speech coding applications for representing

L used in various speech coding applications for representing IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 1, NO. 1. JANUARY 1993 3 Efficient Vector Quantization of LPC Parameters at 24 BitsFrame Kuldip K. Paliwal, Member, IEEE, and Bishnu S. Atal, Fellow,

More information

C.M. Liu Perceptual Signal Processing Lab College of Computer Science National Chiao-Tung University

C.M. Liu Perceptual Signal Processing Lab College of Computer Science National Chiao-Tung University Quantization C.M. Liu Perceptual Signal Processing Lab College of Computer Science National Chiao-Tung University http://www.csie.nctu.edu.tw/~cmliu/courses/compression/ Office: EC538 (03)5731877 cmliu@cs.nctu.edu.tw

More information

Audio Coding. Fundamentals Quantization Waveform Coding Subband Coding P NCTU/CSIE DSPLAB C.M..LIU

Audio Coding. Fundamentals Quantization Waveform Coding Subband Coding P NCTU/CSIE DSPLAB C.M..LIU Audio Coding P.1 Fundamentals Quantization Waveform Coding Subband Coding 1. Fundamentals P.2 Introduction Data Redundancy Coding Redundancy Spatial/Temporal Redundancy Perceptual Redundancy Compression

More information

Proc. of NCC 2010, Chennai, India

Proc. of NCC 2010, Chennai, India Proc. of NCC 2010, Chennai, India Trajectory and surface modeling of LSF for low rate speech coding M. Deepak and Preeti Rao Department of Electrical Engineering Indian Institute of Technology, Bombay

More information

SCELP: LOW DELAY AUDIO CODING WITH NOISE SHAPING BASED ON SPHERICAL VECTOR QUANTIZATION

SCELP: LOW DELAY AUDIO CODING WITH NOISE SHAPING BASED ON SPHERICAL VECTOR QUANTIZATION SCELP: LOW DELAY AUDIO CODING WITH NOISE SHAPING BASED ON SPHERICAL VECTOR QUANTIZATION Hauke Krüger and Peter Vary Institute of Communication Systems and Data Processing RWTH Aachen University, Templergraben

More information

Vector Quantization and Subband Coding

Vector Quantization and Subband Coding Vector Quantization and Subband Coding 18-796 ultimedia Communications: Coding, Systems, and Networking Prof. Tsuhan Chen tsuhan@ece.cmu.edu Vector Quantization 1 Vector Quantization (VQ) Each image block

More information

Lab 9a. Linear Predictive Coding for Speech Processing

Lab 9a. Linear Predictive Coding for Speech Processing EE275Lab October 27, 2007 Lab 9a. Linear Predictive Coding for Speech Processing Pitch Period Impulse Train Generator Voiced/Unvoiced Speech Switch Vocal Tract Parameters Time-Varying Digital Filter H(z)

More information

Multimedia Communications. Scalar Quantization

Multimedia Communications. Scalar Quantization Multimedia Communications Scalar Quantization Scalar Quantization In many lossy compression applications we want to represent source outputs using a small number of code words. Process of representing

More information

Digital Image Processing Lectures 25 & 26

Digital Image Processing Lectures 25 & 26 Lectures 25 & 26, Professor Department of Electrical and Computer Engineering Colorado State University Spring 2015 Area 4: Image Encoding and Compression Goal: To exploit the redundancies in the image

More information

Low Bit-Rate Speech Codec Based on a Long-Term Harmonic Plus Noise Model

Low Bit-Rate Speech Codec Based on a Long-Term Harmonic Plus Noise Model PAPERS Journal of the Audio Engineering Society Vol. 64, No. 11, November 216 ( C 216) DOI: https://doi.org/1.17743/jaes.216.28 Low Bit-Rate Speech Codec Based on a Long-Term Harmonic Plus Noise Model

More information

Vector Quantization. Institut Mines-Telecom. Marco Cagnazzo, MN910 Advanced Compression

Vector Quantization. Institut Mines-Telecom. Marco Cagnazzo, MN910 Advanced Compression Institut Mines-Telecom Vector Quantization Marco Cagnazzo, cagnazzo@telecom-paristech.fr MN910 Advanced Compression 2/66 19.01.18 Institut Mines-Telecom Vector Quantization Outline Gain-shape VQ 3/66 19.01.18

More information

Vector Quantizers for Reduced Bit-Rate Coding of Correlated Sources

Vector Quantizers for Reduced Bit-Rate Coding of Correlated Sources Vector Quantizers for Reduced Bit-Rate Coding of Correlated Sources Russell M. Mersereau Center for Signal and Image Processing Georgia Institute of Technology Outline Cache vector quantization Lossless

More information

A comparative study of LPC parameter representations and quantisation schemes for wideband speech coding

A comparative study of LPC parameter representations and quantisation schemes for wideband speech coding Digital Signal Processing 17 (2007) 114 137 www.elsevier.com/locate/dsp A comparative study of LPC parameter representations and quantisation schemes for wideband speech coding Stephen So a,, Kuldip K.

More information

Mel-Generalized Cepstral Representation of Speech A Unified Approach to Speech Spectral Estimation. Keiichi Tokuda

Mel-Generalized Cepstral Representation of Speech A Unified Approach to Speech Spectral Estimation. Keiichi Tokuda Mel-Generalized Cepstral Representation of Speech A Unified Approach to Speech Spectral Estimation Keiichi Tokuda Nagoya Institute of Technology Carnegie Mellon University Tamkang University March 13,

More information

A 600bps Vocoder Algorithm Based on MELP. Lan ZHU and Qiang LI*

A 600bps Vocoder Algorithm Based on MELP. Lan ZHU and Qiang LI* 2017 2nd International Conference on Electrical and Electronics: Techniques and Applications (EETA 2017) ISBN: 978-1-60595-416-5 A 600bps Vocoder Algorithm Based on MELP Lan ZHU and Qiang LI* Chongqing

More information

EE123 Digital Signal Processing

EE123 Digital Signal Processing EE123 Digital Signal Processing Lecture 19 Practical ADC/DAC Ideal Anti-Aliasing ADC A/D x c (t) Analog Anti-Aliasing Filter HLP(jΩ) sampler t = nt x[n] =x c (nt ) Quantizer 1 X c (j ) and s < 2 1 T X

More information

HARMONIC VECTOR QUANTIZATION

HARMONIC VECTOR QUANTIZATION HARMONIC VECTOR QUANTIZATION Volodya Grancharov, Sigurdur Sverrisson, Erik Norvell, Tomas Toftgård, Jonas Svedberg, and Harald Pobloth SMN, Ericsson Research, Ericsson AB 64 8, Stockholm, Sweden ABSTRACT

More information

Chapter 10 Applications in Communications

Chapter 10 Applications in Communications Chapter 10 Applications in Communications School of Information Science and Engineering, SDU. 1/ 47 Introduction Some methods for digitizing analog waveforms: Pulse-code modulation (PCM) Differential PCM

More information

CODING SAMPLE DIFFERENCES ATTEMPT 1: NAIVE DIFFERENTIAL CODING

CODING SAMPLE DIFFERENCES ATTEMPT 1: NAIVE DIFFERENTIAL CODING 5 0 DPCM (Differential Pulse Code Modulation) Making scalar quantization work for a correlated source -- a sequential approach. Consider quantizing a slowly varying source (AR, Gauss, ρ =.95, σ 2 = 3.2).

More information

Analysis of methods for speech signals quantization

Analysis of methods for speech signals quantization INFOTEH-JAHORINA Vol. 14, March 2015. Analysis of methods for speech signals quantization Stefan Stojkov Mihajlo Pupin Institute, University of Belgrade Belgrade, Serbia e-mail: stefan.stojkov@pupin.rs

More information

Vector Quantization Encoder Decoder Original Form image Minimize distortion Table Channel Image Vectors Look-up (X, X i ) X may be a block of l

Vector Quantization Encoder Decoder Original Form image Minimize distortion Table Channel Image Vectors Look-up (X, X i ) X may be a block of l Vector Quantization Encoder Decoder Original Image Form image Vectors X Minimize distortion k k Table X^ k Channel d(x, X^ Look-up i ) X may be a block of l m image or X=( r, g, b ), or a block of DCT

More information

Introduction p. 1 Compression Techniques p. 3 Lossless Compression p. 4 Lossy Compression p. 5 Measures of Performance p. 5 Modeling and Coding p.

Introduction p. 1 Compression Techniques p. 3 Lossless Compression p. 4 Lossy Compression p. 5 Measures of Performance p. 5 Modeling and Coding p. Preface p. xvii Introduction p. 1 Compression Techniques p. 3 Lossless Compression p. 4 Lossy Compression p. 5 Measures of Performance p. 5 Modeling and Coding p. 6 Summary p. 10 Projects and Problems

More information

Review of Quantization. Quantization. Bring in Probability Distribution. L-level Quantization. Uniform partition

Review of Quantization. Quantization. Bring in Probability Distribution. L-level Quantization. Uniform partition Review of Quantization UMCP ENEE631 Slides (created by M.Wu 004) Quantization UMCP ENEE631 Slides (created by M.Wu 001/004) L-level Quantization Minimize errors for this lossy process What L values to

More information

Quantization 2.1 QUANTIZATION AND THE SOURCE ENCODER

Quantization 2.1 QUANTIZATION AND THE SOURCE ENCODER 2 Quantization After the introduction to image and video compression presented in Chapter 1, we now address several fundamental aspects of image and video compression in the remaining chapters of Section

More information

E303: Communication Systems

E303: Communication Systems E303: Communication Systems Professor A. Manikas Chair of Communications and Array Processing Imperial College London Principles of PCM Prof. A. Manikas (Imperial College) E303: Principles of PCM v.17

More information

ELEN E4810: Digital Signal Processing Topic 11: Continuous Signals. 1. Sampling and Reconstruction 2. Quantization

ELEN E4810: Digital Signal Processing Topic 11: Continuous Signals. 1. Sampling and Reconstruction 2. Quantization ELEN E4810: Digital Signal Processing Topic 11: Continuous Signals 1. Sampling and Reconstruction 2. Quantization 1 1. Sampling & Reconstruction DSP must interact with an analog world: A to D D to A x(t)

More information

PCM Reference Chapter 12.1, Communication Systems, Carlson. PCM.1

PCM Reference Chapter 12.1, Communication Systems, Carlson. PCM.1 PCM Reference Chapter 1.1, Communication Systems, Carlson. PCM.1 Pulse-code modulation (PCM) Pulse modulations use discrete time samples of analog signals the transmission is composed of analog information

More information

Objectives of Image Coding

Objectives of Image Coding Objectives of Image Coding Representation of an image with acceptable quality, using as small a number of bits as possible Applications: Reduction of channel bandwidth for image transmission Reduction

More information

ETSI TS V5.0.0 ( )

ETSI TS V5.0.0 ( ) Technical Specification Universal Mobile Telecommunications System (UMTS); AMR speech Codec; Transcoding Functions () 1 Reference RTS/TSGS-046090v500 Keywords UMTS 650 Route des Lucioles F-0691 Sophia

More information

Quantization of LSF Parameters Using A Trellis Modeling

Quantization of LSF Parameters Using A Trellis Modeling 1 Quantization of LSF Parameters Using A Trellis Modeling Farshad Lahouti, Amir K. Khandani Coding and Signal Transmission Lab. Dept. of E&CE, University of Waterloo, Waterloo, ON, N2L 3G1, Canada (farshad,

More information

L7: Linear prediction of speech

L7: Linear prediction of speech L7: Linear prediction of speech Introduction Linear prediction Finding the linear prediction coefficients Alternative representations This lecture is based on [Dutoit and Marques, 2009, ch1; Taylor, 2009,

More information

Signal representations: Cepstrum

Signal representations: Cepstrum Signal representations: Cepstrum Source-filter separation for sound production For speech, source corresponds to excitation by a pulse train for voiced phonemes and to turbulence (noise) for unvoiced phonemes,

More information

The information loss in quantization

The information loss in quantization The information loss in quantization The rough meaning of quantization in the frame of coding is representing numerical quantities with a finite set of symbols. The mapping between numbers, which are normally

More information

Efficient Block Quantisation for Image and Speech Coding

Efficient Block Quantisation for Image and Speech Coding Efficient Block Quantisation for Image and Speech Coding Stephen So, BEng (Hons) School of Microelectronic Engineering Faculty of Engineering and Information Technology Griffith University, Brisbane, Australia

More information

ETSI EN V7.1.1 ( )

ETSI EN V7.1.1 ( ) European Standard (Telecommunications series) Digital cellular telecommunications system (Phase +); Adaptive Multi-Rate (AMR) speech transcoding GLOBAL SYSTEM FOR MOBILE COMMUNICATIONS R Reference DEN/SMG-110690Q7

More information

Compression methods: the 1 st generation

Compression methods: the 1 st generation Compression methods: the 1 st generation 1998-2017 Josef Pelikán CGG MFF UK Praha pepca@cgg.mff.cuni.cz http://cgg.mff.cuni.cz/~pepca/ Still1g 2017 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 1 / 32 Basic

More information

7.1 Sampling and Reconstruction

7.1 Sampling and Reconstruction Haberlesme Sistemlerine Giris (ELE 361) 6 Agustos 2017 TOBB Ekonomi ve Teknoloji Universitesi, Guz 2017-18 Dr. A. Melda Yuksel Turgut & Tolga Girici Lecture Notes Chapter 7 Analog to Digital Conversion

More information

Multimedia Communications. Differential Coding

Multimedia Communications. Differential Coding Multimedia Communications Differential Coding Differential Coding In many sources, the source output does not change a great deal from one sample to the next. This means that both the dynamic range and

More information

Research Article Adaptive Long-Term Coding of LSF Parameters Trajectories for Large-Delay/Very- to Ultra-Low Bit-Rate Speech Coding

Research Article Adaptive Long-Term Coding of LSF Parameters Trajectories for Large-Delay/Very- to Ultra-Low Bit-Rate Speech Coding Hindawi Publishing Corporation EURASIP Journal on Audio, Speech, and Music Processing Volume 00, Article ID 597039, 3 pages doi:0.55/00/597039 Research Article Adaptive Long-Term Coding of LSF Parameters

More information

Voiced Speech. Unvoiced Speech

Voiced Speech. Unvoiced Speech Digital Speech Processing Lecture 2 Homomorphic Speech Processing General Discrete-Time Model of Speech Production p [ n] = p[ n] h [ n] Voiced Speech L h [ n] = A g[ n] v[ n] r[ n] V V V p [ n ] = u [

More information

Re-estimation of Linear Predictive Parameters in Sparse Linear Prediction

Re-estimation of Linear Predictive Parameters in Sparse Linear Prediction Downloaded from vbnaaudk on: januar 12, 2019 Aalborg Universitet Re-estimation of Linear Predictive Parameters in Sparse Linear Prediction Giacobello, Daniele; Murthi, Manohar N; Christensen, Mads Græsbøll;

More information

BASICS OF COMPRESSION THEORY

BASICS OF COMPRESSION THEORY BASICS OF COMPRESSION THEORY Why Compression? Task: storage and transport of multimedia information. E.g.: non-interlaced HDTV: 0x0x0x = Mb/s!! Solutions: Develop technologies for higher bandwidth Find

More information

Timbral, Scale, Pitch modifications

Timbral, Scale, Pitch modifications Introduction Timbral, Scale, Pitch modifications M2 Mathématiques / Vision / Apprentissage Audio signal analysis, indexing and transformation Page 1 / 40 Page 2 / 40 Modification of playback speed Modifications

More information

EE368B Image and Video Compression

EE368B Image and Video Compression EE368B Image and Video Compression Homework Set #2 due Friday, October 20, 2000, 9 a.m. Introduction The Lloyd-Max quantizer is a scalar quantizer which can be seen as a special case of a vector quantizer

More information

E4702 HW#4-5 solutions by Anmo Kim

E4702 HW#4-5 solutions by Anmo Kim E70 HW#-5 solutions by Anmo Kim (ak63@columbia.edu). (P3.7) Midtread type uniform quantizer (figure 3.0(a) in Haykin) Gaussian-distributed random variable with zero mean and unit variance is applied to

More information

3GPP TS V6.1.1 ( )

3GPP TS V6.1.1 ( ) Technical Specification 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Speech codec speech processing functions; Adaptive Multi-Rate - Wideband (AMR-WB)

More information

Analysis of Finite Wordlength Effects

Analysis of Finite Wordlength Effects Analysis of Finite Wordlength Effects Ideally, the system parameters along with the signal variables have infinite precision taing any value between and In practice, they can tae only discrete values within

More information

Chapter 3. Quantization. 3.1 Scalar Quantizers

Chapter 3. Quantization. 3.1 Scalar Quantizers Chapter 3 Quantization As mentioned in the introduction, two operations are necessary to transform an analog waveform into a digital signal. The first action, sampling, consists of converting a continuous-time

More information

Time-domain representations

Time-domain representations Time-domain representations Speech Processing Tom Bäckström Aalto University Fall 2016 Basics of Signal Processing in the Time-domain Time-domain signals Before we can describe speech signals or modelling

More information

Lloyd-Max Quantization of Correlated Processes: How to Obtain Gains by Receiver-Sided Time-Variant Codebooks

Lloyd-Max Quantization of Correlated Processes: How to Obtain Gains by Receiver-Sided Time-Variant Codebooks Lloyd-Max Quantization of Correlated Processes: How to Obtain Gains by Receiver-Sided Time-Variant Codebooks Sai Han and Tim Fingscheidt Institute for Communications Technology, Technische Universität

More information

HARMONIC CODING OF SPEECH AT LOW BIT RATES

HARMONIC CODING OF SPEECH AT LOW BIT RATES HARMONIC CODING OF SPEECH AT LOW BIT RATES Peter Lupini B. A.Sc. University of British Columbia, 1985 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

More information

Digital Signal Processing 2/ Advanced Digital Signal Processing Lecture 3, SNR, non-linear Quantisation Gerald Schuller, TU Ilmenau

Digital Signal Processing 2/ Advanced Digital Signal Processing Lecture 3, SNR, non-linear Quantisation Gerald Schuller, TU Ilmenau Digital Signal Processing 2/ Advanced Digital Signal Processing Lecture 3, SNR, non-linear Quantisation Gerald Schuller, TU Ilmenau What is our SNR if we have a sinusoidal signal? What is its pdf? Basically

More information

Example: for source

Example: for source Nonuniform scalar quantizer References: Sayood Chap. 9, Gersho and Gray, Chap.'s 5 and 6. The basic idea: For a nonuniform source density, put smaller cells and levels where the density is larger, thereby

More information

Log Likelihood Spectral Distance, Entropy Rate Power, and Mutual Information with Applications to Speech Coding

Log Likelihood Spectral Distance, Entropy Rate Power, and Mutual Information with Applications to Speech Coding entropy Article Log Likelihood Spectral Distance, Entropy Rate Power, and Mutual Information with Applications to Speech Coding Jerry D. Gibson * and Preethi Mahadevan Department of Electrical and Computer

More information

at Some sort of quantization is necessary to represent continuous signals in digital form

at Some sort of quantization is necessary to represent continuous signals in digital form Quantization at Some sort of quantization is necessary to represent continuous signals in digital form x(n 1,n ) x(t 1,tt ) D Sampler Quantizer x q (n 1,nn ) Digitizer (A/D) Quantization is also used for

More information

Multimedia Networking ECE 599

Multimedia Networking ECE 599 Multimedia Networking ECE 599 Prof. Thinh Nguyen School of Electrical Engineering and Computer Science Based on lectures from B. Lee, B. Girod, and A. Mukherjee 1 Outline Digital Signal Representation

More information

VECTOR QUANTIZATION OF SPEECH WITH NOISE CANCELLATION

VECTOR QUANTIZATION OF SPEECH WITH NOISE CANCELLATION VECTOR QUANTIZATION OF SPEECH WITH NOISE CANCELLATION Xiangyang Chen B. Sc. (Elec. Eng.), The Branch of Tsinghua University, 1983 A THESIS SUBMITTED LV PARTIAL FVLFILLMENT OF THE REQUIREMENTS FOR THE DEGREE

More information

Image Compression using DPCM with LMS Algorithm

Image Compression using DPCM with LMS Algorithm Image Compression using DPCM with LMS Algorithm Reenu Sharma, Abhay Khedkar SRCEM, Banmore -----------------------------------------------------------------****---------------------------------------------------------------

More information

Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi

Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi Lecture - 41 Pulse Code Modulation (PCM) So, if you remember we have been talking

More information

6 Quantization of Discrete Time Signals

6 Quantization of Discrete Time Signals Ramachandran, R.P. Quantization of Discrete Time Signals Digital Signal Processing Handboo Ed. Vijay K. Madisetti and Douglas B. Williams Boca Raton: CRC Press LLC, 1999 c 1999byCRCPressLLC 6 Quantization

More information

ETSI TS V ( )

ETSI TS V ( ) TS 146 060 V14.0.0 (2017-04) TECHNICAL SPECIFICATION Digital cellular telecommunications system (Phase 2+) (GSM); Enhanced Full Rate (EFR) speech transcoding (3GPP TS 46.060 version 14.0.0 Release 14)

More information

Predictive Coding. Prediction Prediction in Images

Predictive Coding. Prediction Prediction in Images Prediction Prediction in Images Predictive Coding Principle of Differential Pulse Code Modulation (DPCM) DPCM and entropy-constrained scalar quantization DPCM and transmission errors Adaptive intra-interframe

More information

Predictive Coding. Prediction

Predictive Coding. Prediction Predictive Coding Prediction Prediction in Images Principle of Differential Pulse Code Modulation (DPCM) DPCM and entropy-constrained scalar quantization DPCM and transmission errors Adaptive intra-interframe

More information

ETSI TS V7.0.0 ( )

ETSI TS V7.0.0 ( ) TS 6 9 V7.. (7-6) Technical Specification Digital cellular telecommunications system (Phase +); Universal Mobile Telecommunications System (UMTS); Speech codec speech processing functions; Adaptive Multi-Rate

More information

A Systematic Description of Source Significance Information

A Systematic Description of Source Significance Information A Systematic Description of Source Significance Information Norbert Goertz Institute for Digital Communications School of Engineering and Electronics The University of Edinburgh Mayfield Rd., Edinburgh

More information

Design of Optimal Quantizers for Distributed Source Coding

Design of Optimal Quantizers for Distributed Source Coding Design of Optimal Quantizers for Distributed Source Coding David Rebollo-Monedero, Rui Zhang and Bernd Girod Information Systems Laboratory, Electrical Eng. Dept. Stanford University, Stanford, CA 94305

More information

INTERNATIONAL TELECOMMUNICATION UNION. Wideband coding of speech at around 16 kbit/s using Adaptive Multi-Rate Wideband (AMR-WB)

INTERNATIONAL TELECOMMUNICATION UNION. Wideband coding of speech at around 16 kbit/s using Adaptive Multi-Rate Wideband (AMR-WB) INTERNATIONAL TELECOMMUNICATION UNION ITU-T TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU G.722.2 (07/2003) SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS Digital terminal equipments

More information

Ch. 10 Vector Quantization. Advantages & Design

Ch. 10 Vector Quantization. Advantages & Design Ch. 10 Vector Quantization Advantages & Design 1 Advantages of VQ There are (at least) 3 main characteristics of VQ that help it outperform SQ: 1. Exploit Correlation within vectors 2. Exploit Shape Flexibility

More information

AN IMPROVED ADPCM DECODER BY ADAPTIVELY CONTROLLED QUANTIZATION INTERVAL CENTROIDS. Sai Han and Tim Fingscheidt

AN IMPROVED ADPCM DECODER BY ADAPTIVELY CONTROLLED QUANTIZATION INTERVAL CENTROIDS. Sai Han and Tim Fingscheidt AN IMPROVED ADPCM DECODER BY ADAPTIVELY CONTROLLED QUANTIZATION INTERVAL CENTROIDS Sai Han and Tim Fingscheidt Institute for Communications Technology, Technische Universität Braunschweig Schleinitzstr.

More information

BASIC COMPRESSION TECHNIQUES

BASIC COMPRESSION TECHNIQUES BASIC COMPRESSION TECHNIQUES N. C. State University CSC557 Multimedia Computing and Networking Fall 2001 Lectures # 05 Questions / Problems / Announcements? 2 Matlab demo of DFT Low-pass windowed-sinc

More information

NEAR EAST UNIVERSITY

NEAR EAST UNIVERSITY NEAR EAST UNIVERSITY GRADUATE SCHOOL OF APPLIED ANO SOCIAL SCIENCES LINEAR PREDICTIVE CODING \ Burak Alacam Master Thesis Department of Electrical and Electronic Engineering Nicosia - 2002 Burak Alacam:

More information

CEPSTRAL ANALYSIS SYNTHESIS ON THE MEL FREQUENCY SCALE, AND AN ADAPTATIVE ALGORITHM FOR IT.

CEPSTRAL ANALYSIS SYNTHESIS ON THE MEL FREQUENCY SCALE, AND AN ADAPTATIVE ALGORITHM FOR IT. CEPSTRAL ANALYSIS SYNTHESIS ON THE EL FREQUENCY SCALE, AND AN ADAPTATIVE ALGORITH FOR IT. Summarized overview of the IEEE-publicated papers Cepstral analysis synthesis on the mel frequency scale by Satochi

More information

ITU-T G khz audio-coding within 64 kbit/s

ITU-T G khz audio-coding within 64 kbit/s International Telecommunication Union ITU-T G.722 TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU (9/212) SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS Digital terminal equipments

More information

Signal Modeling Techniques in Speech Recognition. Hassan A. Kingravi

Signal Modeling Techniques in Speech Recognition. Hassan A. Kingravi Signal Modeling Techniques in Speech Recognition Hassan A. Kingravi Outline Introduction Spectral Shaping Spectral Analysis Parameter Transforms Statistical Modeling Discussion Conclusions 1: Introduction

More information

EXAMPLE OF SCALAR AND VECTOR QUANTIZATION

EXAMPLE OF SCALAR AND VECTOR QUANTIZATION EXAMPLE OF SCALAR AD VECTOR QUATIZATIO Source sequence : This could be the output of a highly correlated source. A scalar quantizer: =1, M=4 C 1 = {w 1,w 2,w 3,w 4 } = {-4, -1, 1, 4} = codeboo of quantization

More information

Speaker Identification Based On Discriminative Vector Quantization And Data Fusion

Speaker Identification Based On Discriminative Vector Quantization And Data Fusion University of Central Florida Electronic Theses and Dissertations Doctoral Dissertation (Open Access) Speaker Identification Based On Discriminative Vector Quantization And Data Fusion 2005 Guangyu Zhou

More information

Module 3. Quantization and Coding. Version 2, ECE IIT, Kharagpur

Module 3. Quantization and Coding. Version 2, ECE IIT, Kharagpur Module Quantization and Coding ersion, ECE IIT, Kharagpur Lesson Logarithmic Pulse Code Modulation (Log PCM) and Companding ersion, ECE IIT, Kharagpur After reading this lesson, you will learn about: Reason

More information

Image Coding. Chapter 10. Contents. (Related to Ch. 10 of Lim.) 10.1

Image Coding. Chapter 10. Contents. (Related to Ch. 10 of Lim.) 10.1 Chapter 1 Image Coding Contents Introduction..................................................... 1. Quantization..................................................... 1.3 Scalar quantization...............................................

More information

Basic Principles of Video Coding

Basic Principles of Video Coding Basic Principles of Video Coding Introduction Categories of Video Coding Schemes Information Theory Overview of Video Coding Techniques Predictive coding Transform coding Quantization Entropy coding Motion

More information

EE-597 Notes Quantization

EE-597 Notes Quantization EE-597 Notes Quantization Phil Schniter June, 4 Quantization Given a continuous-time and continuous-amplitude signal (t, processing and storage by modern digital hardware requires discretization in both

More information

LOW COMPLEX FORWARD ADAPTIVE LOSS COMPRESSION ALGORITHM AND ITS APPLICATION IN SPEECH CODING

LOW COMPLEX FORWARD ADAPTIVE LOSS COMPRESSION ALGORITHM AND ITS APPLICATION IN SPEECH CODING Journal of ELECTRICAL ENGINEERING, VOL. 62, NO. 1, 2011, 19 24 LOW COMPLEX FORWARD ADAPTIVE LOSS COMPRESSION ALGORITHM AND ITS APPLICATION IN SPEECH CODING Jelena Nikolić Zoran Perić Dragan Antić Aleksandra

More information