High Rate Speech Service Option 17 for Wideband Spread Spectrum Communication Systems


1 Document: C.S000-0 Version: Date: December High Rate Speech Service Option for Wideband Spread Spectrum Communication Systems COPYRIGHT GPP and its Organizational Partners claim copyright in this document and individual Organizational Partners may copyright and issue documents or standards publications in individual Organizational Partner's name based on this document. Requests for reproduction of this document should be directed to the GPP Secretariat at Requests to reproduce individual Organizational Partner's documents should be directed to that Organizational Partner. See for more information.

2 High Rate Speech Service Option for Wideband Spread Spectrum Communication Systems Publish Version November,

3 Copyright TIA.

PREFACE

These technical requirements form a standard for Service Option 17, a variable rate, two-way speech service option. The maximum speech coding rate of the service option is. kbps.

This standard does not address the quality or reliability of Service Option 17, nor does it cover equipment performance or measurement procedures.

SECTION SUMMARY

General. This section defines the terms and numeric indicators used in this document.

Service Option 17: Variable Data Rate Two-Way Voice. This section describes the requirements for Service Option 17. Included in these requirements is the description of a speech codec algorithm for variable rate, two-way voice.

Annex A. Bibliography. This is an informative annex (not considered part of this standard) listing documents which may be useful in implementing the standard.

NOTES

TIA/EIA/IS- Recommended Minimum Performance Standard for the High Rate Speech Service Option for Wideband Spread Spectrum Communication Systems, provides specifications and measurement methods.

"Base station" refers to the functions performed on the land side, which are typically distributed among a cell, a sector of a cell, and a mobile switching center.

Section uses the following verbal forms:

- "Shall" and "shall not" identify requirements to be followed strictly to conform to the standard and from which no deviation is permitted.
- "Should" and "should not" indicate that one of several possibilities is recommended as particularly suitable, without mentioning or excluding others; that a certain course of action is preferred but not necessarily required; or that (in the negative form) a certain possibility or course of action is discouraged but not prohibited.
- "May" and "need not" indicate a course of action permissible within the limits of the standard.
- "Can" and "cannot" are used for statements of possibility and capability, whether material, physical, or causal.

Footnotes appear at various points in this specification to elaborate and further clarify items discussed in the body of the specification.

Unless indicated otherwise, this document presents numbers in decimal form. Binary numbers are distinguished in the text by the use of single quotation marks. In some tables, binary values may appear without single quotation marks if table notation clearly specifies that values are binary. The character x is used to represent a binary bit of unspecified value. For example xxx0000 represents any -bit binary value such that the least significant five bits equal.

Hexadecimal numbers (base 16) are distinguished in the text by use of the form 0xh…h, where h…h represents a string of hexadecimal digits. For example, 0xfa represents a number whose binary value is and whose decimal value is.
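The binary and hexadecimal notation described above can be illustrated with a short Python sketch. The function names and the sample pattern `xxx00101` are hypothetical and chosen only for illustration; they do not come from this standard.

```python
def matches_bit_pattern(value: int, pattern: str) -> bool:
    """Return True if value fits a pattern made of '0', '1', and 'x' (unspecified) bits."""
    width = len(pattern)
    if value < 0 or value >= (1 << width):
        return False  # value does not fit in the pattern's bit width
    bits = format(value, f"0{width}b")  # MSB-first binary string, zero padded
    return all(p in ("x", b) for p, b in zip(pattern, bits))


def parse_hex(text: str) -> int:
    """Parse a number written in the 0xh...h form."""
    if not text.lower().startswith("0x"):
        raise ValueError("expected a 0x prefix")
    return int(text[2:], 16)


# Hypothetical example: any 8-bit value whose five least significant bits are 00101.
print(matches_bit_pattern(0b10100101, "xxx00101"))
print(format(parse_hex("0x2f"), "08b"), parse_hex("0x2f"))
```

Treating the pattern as a most-significant-bit-first string keeps the sketch aligned with how binary values are written in running text.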

NOTES

The following conventions apply to mathematical expressions in this standard:

- ⌊x⌋ indicates the largest integer less than or equal to x.
- ⌈x⌉ indicates the smallest integer greater than or equal to x.
- |x| indicates the absolute value of x.
- ⊕ indicates exclusive OR.
- min(x, y) indicates the minimum of x and y.
- max(x, y) indicates the maximum of x and y.
- In figures, ⊗ indicates multiplication. In formulas within the text, multiplication is implicit. For example, if h(n) and p_L(n) are functions, then h(n) p_L(n) = h(n) ⊗ p_L(n).
- x mod y indicates the remainder after dividing x by y: x mod y = x - (y ⌊x/y⌋).
- round(x) is traditional rounding: round(x) = ⌊x + 0.5⌋.
- sign(x) is defined as

  $$\operatorname{sign}(x) = \begin{cases} 1, & x \ge 0 \\ -1, & x < 0 \end{cases}$$

- Σ indicates summation. If the summation symbol specifies initial and terminal values, and the initial value is greater than the terminal value, then the value of the summation is 0. For example, if N = 0, and if f(n) represents an arbitrary function, then

  $$\sum_{n=1}^{N} f(n) = 0.$$

- The bracket operator, [ ], isolates individual bits of a binary value. VAR[n] refers to bit n of the binary representation of the value of the variable VAR, such that VAR[0] is the least significant bit of VAR. The value of VAR[n] is either 0 or 1.

This standard uses the two-sided z-transform as given below. See Oppenheim, A. V. and Schafer, R. W., Digital Signal Processing, pp. -.

$$F(z) = \sum_{i=-\infty}^{\infty} x_i z^{-i}$$
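These conventions differ in places from the defaults of common programming languages, so a minimal Python sketch may help. It assumes that traditional rounding means floor(x + 0.5) and that sign(0) is 1, as the definitions above suggest; the helper names are illustrative only.

```python
import math


def mod(x, y):
    """x mod y = x - y * floor(x / y); the result takes the sign of y."""
    return x - y * math.floor(x / y)


def round_half_up(x):
    """Traditional rounding, floor(x + 0.5); Python's built-in round() rounds halves to even."""
    return math.floor(x + 0.5)


def sign(x):
    """Return 1 for x >= 0 and -1 otherwise (note that sign(0) is 1)."""
    return 1 if x >= 0 else -1


def bit(var, n):
    """VAR[n]: bit n of var, where bit 0 is the least significant bit."""
    return (var >> n) & 1


def summation(f, first, last):
    """Sum f(n) for n = first..last; an empty range (first > last) sums to 0."""
    return sum(f(n) for n in range(first, last + 1))


print(mod(-7, 3), round_half_up(2.5), sign(0), bit(0b0110, 1))
print(summation(lambda n: n * n, 1, 0))
```

The `mod` helper matters mainly for negative arguments, where truncating remainder operators in languages such as C give a different result from the floor-based definition.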

REFERENCES

The following standards contain provisions which, through reference in this text, constitute provisions of this Standard. At the time of publication, the editions indicated were valid. All standards are subject to revision, and parties to agreements based on this Standard are encouraged to investigate the possibility of applying the most recent editions of the standards indicated below. ANSI and TIA maintain registers of currently valid national standards published by them.

American National Standards:

- ANSI/EIA/TIA-, Acoustic-to-Digital and Digital-to-Acoustic Transmission Requirements for ISDN Terminals, March.

Other Standards:

- CCITT Recommendation G., Pulse Code Modulation (PCM) of Voice Frequencies, Vol. III, Geneva.
- CCITT Recommendation G., Separate Performance Characteristics for the Encoding and Decoding Sides of PCM Channels Applicable to -Wire Voice-Frequency Interfaces, Blue Book, Vol. III, Melbourne.
- IEEE Standard -, IEEE Standard Methods for Measuring Transmission Performance of Analog and Digital Telephone Sets.
- IEEE Standard -, Method for Determining Objective Loudness Ratings of Telephone Connections.
- ANSI J-STD-00, Personal Station-Base Station Compatibility Requirements for. to.0 GHz Code Division Multiple Access (CDMA) Personal Communications Systems.
- TIA/EIA/IS--A, Mobile Station-Base Station Compatibility Standard for Dual-Mode Wideband Spread Spectrum Cellular System. All references to TIA/EIA/IS--A shall be inclusive of text adopted by TSB.
- TIA/EIA/IS-, Recommended Minimum Performance Standard for Digital Cellular Wideband Spread Spectrum Speech Service Option, May.
- TIA/EIA/IS-, Recommended Minimum Performance Standard for the High Rate Speech Service Option for Wideband Spread Spectrum Communication Systems.
- TSB, Telecommunications Systems Bulletin: Support for. kbps Data Rate and PCS Interaction for Wideband Spread Spectrum Cellular Systems, December.

8 CONTENTS GENERAL Terms and Numeric Information... - SERVICE OPTION : VARIABLE DATA RATE TWO-WAY VOICE General Description Service Option Number Multiplex Option Required Multiplex Option Support Interface to Multiplex Option Transmitted Packets Received Packets Service Negotiation Initialization and Connection Mobile Station Requirements Base Station Requirements Service Option Control Messages Mobile Station Requirements Base Station Requirements Variable Rate Speech Coding Algorithm Introduction Input Audio Interface Input Audio Interface in the Mobile Station Conversion and Scaling Digital Audio Input Analog Audio Input Adjusting the Transmit Level Band Pass Filtering Echo Return Loss Input Audio Interface in the Base Station Sampling and Format Conversion Adjusting the Transmit Level Echo Canceling Ear Protection... - v

9 CONTENTS Determining the Formant Prediction Parameters Form of the Formant Synthesis Filter Encoding High-Pass Filtering of Input Samples Windowing the Samples Computing the Autocorrelation Function Determining the LPC Coefficients from the Autocorrelation Function Transforming the LPC Coefficients to Line Spectrum Pairs (LSPs) Converting the LSP Frequencies to Transmission Codes for Rate, Rate /, and Rate / Computing the Sensitivities of the LSP Frequencies Vector Quantizing the LSP Frequencies LSP VQ Codebooks Converting the LSP Frequencies to Transmission Codes for Rate / Decoding LSP Frequencies and Converting to LPC Coefficients Converting the LSP Transmission Codes to LSP Frequencies Checking the Stability of the LSP Frequencies for Rate / Encoding Low-Pass Filtering the LSP Frequencies Interpolating the LSP Frequencies Converting the Interpolated LSP Frequencies to LPC Coefficients Scaling the LPC Coefficients to Perform Bandwidth Expansion Determining the Packet Type (Rate) First Stage of Rate Determination Algorithm Computing Band Energy Calculating Rate Determination Thresholds Comparing Thresholds Performing Hangover Constraining Rate Selection Updating Smoothed Band Energy Updating the Smoothed Band Energy...-0 vi

10 CONTENTS Updating Background Noise Estimate Updating Signal Energy Estimate Second Stage of Rate Determination Algorithm: Rate Reduction Unvoiced Detection Temporally Masked Frame Detection Stationary Voiced Frame Detection Adapting Thresholds to Achieve Target Average Rate Determining the Pitch Prediction Parameters Encoding Computing the Pitch Lag and Pitch Gain Implementing the Pitch Search Convolutions Converting the Pitch Gain and Pitch Lag to the Transmission Codes Decoding Determining the Excitation Codebook Parameters Encoding Computing the Codebook Index and Codebook Gain for Rate and Rate / Implementing the Codebook Search Convolutions Computing the Codebook Gain for Rate / and Rate / Frames Converting Codebook Parameters into Transmission Codes for Rate and Rate / Converting Codebook Parameters into Transmission Codes for Rate / Converting Codebook Parameters into Transmission Codes for Rate / Decoding Converting Codebook Transmission Codes for Rate and Rate / Converting Codebook Transmission Codes for Rate / Converting Codebook Transmission Codes for Rate / Data Packing Rate Packing Rate / Packing Rate / Packing... - vii

11 CONTENTS Rate / Packing Decoding at the Transmitting Speech Codec and the Receiving Speech Codec Generating the Scaled Codebook Vector Generating the Scaled Codebook Vector for Rate and Rate / Generating the Scaled Codebook Vector for Rate / Generating the Scaled Codebook Vector for Rate / Generating the Pitch Synthesis Filter Output Generating the Pitch Pre-Filter Synthesis Output Generating the Formant Synthesis Filter Output Updating the Memories of W(z) in the Transmitting Speech Codec The Adaptive Postfilter in the Receiving Speech Codec Special Cases Insufficient Frame Quality (Erasure) Packets Blank Packets Incorrect Packet Detection Initializing Speech Codec Output Audio Interface Output Audio Interface in the Mobile Station Band Pass Filtering Adjusting the Receive Level Output Audio Interface in the Base Station Adjusting the Receive Level Summary of Encoding and Decoding Encoding Summary Decoding Summary Allowable Delays Allowable Transmitting Speech Codec Encoding Delay Allowable Receiving Speech Codec Decoding Delay...-. Summary of Service Option Notation...- ANNEX A BIBLIOGRAPHY...- viii

12 FIGURES Speech Synthesis Structure in the Receiving Speech Codec Bit Allocation for a Rate Packet Bit Allocation for a Rate / Packet Bit Allocation for a Rate / Packet Bit Allocation for a Rate / Packet Converting the LSP Frequencies to Transmission Codes for Rate / Converting the LSP Transmission Codes to LSP Frequencies for Rate / and Insufficient Frame Quality Frames Two Stages in the Rate Determination Algorithm Decimation of the Prediction Residual for NACF Computation Flowchart for the Second Stage of the Rate Determination Algorithm Histogram of Target_SNR Feature with Reference to Target_SNR_Threshold Analysis-by-Synthesis Procedure for the Pitch Parameter Search Analysis-by-Synthesis Procedure for Codebook Parameter Search Converting Codebook Parameters for Rate and Rate / Converting Codebook Parameters for Rate / Converting Codebook Parameters for Rate / Converting Codebook Transmission Codes for Rate and Rate / Converting Codebook Transmission Codes for Rate / Converting Codebook Transmission Codes for Rate / Decoding at the Transmitting Speech Codec Decoding at the Receiving Speech Codec... - ix

13 TABLES Packet Types Supplied by Service Option to the Multiplex Sublayer Packet Types Supplied by the Multiplex Sublayer to Service Option Valid Service Configuration Attributes for Service Option Service Option Control Message Type-Specific Fields Fraction of Packets at Rate, Rate /, and Rate / with Rate Reduction Parameters Used for Each Rate Transmission Codes and Bit Allocations (Part of ) Transmission Codes and Bit Allocations (Part of ) Hamming Window Values WH(n) LSP Vector Quantization for LSPVQ LSP Vector Quantization for LSPVQ (Part of ) LSP Vector Quantization for LSPVQ (Part of ) LSP Vector Quantization for LSPVQ (Part of ) LSP Vector Quantization for LSPVQ (Part of ) LSP Vector Quantization for LSPVQ LSP Vector Quantization for LSPVQ LSP Subframe Interpolation for All Rates Valid Rate Modifications for the Rate Reduction Algorithm FIR Filter Coefficients Used for Band Energy Calculations Threshold Scale Factors as a Function of SNR Hangover Frames as a Function of SNR Impulse Response of LPF Used in the Decimation Process to Calculate the NACF Unvoiced Encoding Rate as a Function of Reduced Rate Level Definition of Terms for Pitch Search Circular Codebook for Rate / Frames Circular Codebook for Rate Frames Definition of Terms for Codebook Search Codebook Quantizer (Rate, Rate /, and Rate /) Codebook Quantizer (Rate Every th Subframe)...- x

14 TABLES...- Conversion for CBGAIN (Rate, Rate /, and Rate /) Conversion for CBGAIN (Rate Every th Subframe) Conversion for CBSIGN for Rate and Rate / Rate / Frame Bits Used as the Seed for Pseudorandom Number Generation Codebook Quantizer (Rate /) Conversion for CBGAIN (Rate /) Table for Conversion from CBSIGN to G Ù S Table for Conversion from CBGAIN to G Ù Table for Conversion from G Ù to G Ù a Rate Packet Structure (Part of ) Rate Packet Structure (Part of ) Rate Packet Structure (Part of ) Rate / Packet Structure Rate / Packet Structure Rate / Packet Structure Impulse Response of BPF Used to Filter the White Excitation for Rate / Synthesis Gain Subtraction Value as a Function of Consecutive Erasures Pitch Saturation Levels as a Function of Consecutive Erasures LSP Predictor Decay as a Function of Consecutive Erasures Summary of Service Option Notation (Part of ) Summary of Service Option Notation (Part of ) Summary of Service Option Notation (Part of ) Summary of Service Option Notation (Part of ) Summary of Service Option Notation (Part of ) Summary of Service Option Notation (Part of ) xi


GENERAL

Terms and Numeric Information

Autocorrelation Function. A function showing the relationship of a signal with a time-shifted version of itself.

Base Station. A station in the Public Radio Telecommunications Service, other than a mobile station, used for radio communications with mobile stations.

CELP. See Code Excited Linear Predictive Coding.

Codec. The combination of an encoder and decoder in series (encoder/decoder).

Code Excited Linear Predictive Coding (CELP). A speech coding algorithm. CELP coders use codebook excitation, a long-term pitch prediction filter, and a short-term formant prediction filter.

Codebook. A set of vectors used by the speech codec. For each speech codec codebook subframe, one particular vector is chosen and used to excite the speech codec's filters. The codebook vector is chosen to minimize the weighted error between the original and synthesized speech after the pitch and formant synthesis filter coefficients have been determined.

Coder. Same as encoder.

Decoder. Generally, a device for the translation of a signal from a digital representation into an analog format. For this standard, a device which converts speech encoded in the format specified in this standard to analog or an equivalent PCM representation.

DECSD. Decoder Seed.

Encoder. Generally, a device for the translation of a signal into a digital representation. For this standard, a device which converts speech from an analog or its equivalent PCM representation to the digital representation described in this standard.

Formant. A resonant frequency of the human vocal tract causing a peak in the short-term spectrum of speech.

IIR Filter. An infinite-duration impulse response filter is a filter for which the output, in response to an impulse input, never totally converges to zero. This term is usually used in reference to digital filters.

Linear Predictive Coding (LPC). A method of predicting future samples of a sequence by a linear combination of the previous samples of the same sequence. Linear Predictive Coding is frequently used in reference to a class of speech codecs.

Line Spectral Pair (LSP). A representation of digital filter coefficients in a pseudo-frequency domain. This representation has good quantization and interpolation properties.

LPC. See Linear Predictive Coding.

LSB. Least significant bit.

LSP. See Line Spectral Pair.

MSB. Most significant bit.

Mobile Station. A station in the Public Radio Telecommunications Service intended to be used while in motion or during halts at unspecified points.

Normalized Autocorrelation Function (NACF). A measure used to determine the pitch period and the degree of periodicity of the input speech. This measure is useful in distinguishing voiced from unvoiced speech.

Packet. The unit of information exchanged between service option applications in the base station and the mobile station.

Pitch. The fundamental frequency in speech caused by the periodic vibration of the human vocal cords.

RDA. Rate Determination Algorithm.

Receive Objective Loudness Rating (ROLR). A measure of receive audio sensitivity. ROLR is a frequency-weighted ratio of the line voltage input signal to a reference encoder to the acoustic output of the receiver. IEEE defines the measurement of sensitivity and IEEE defines the calculation of objective loudness rating.

SPL. Sound Pressure Level.

Transmit Objective Loudness Rating (TOLR). A measure of transmit audio sensitivity. TOLR is a frequency-weighted ratio of the acoustic input signal at the transmitter to the line voltage output of the reference decoder. IEEE defines the measurement of sensitivity and IEEE defines the calculation of objective loudness rating.

Voiced Speech. Speech generated when the vocal cords are vibrating at a fundamental frequency. Characterized by high energy, periodicity, and a large ratio of energy below kHz to energy above kHz.

Unvoiced Speech. Speech generated by forcing air through constrictions in the vocal tract without vibration of the vocal cords. Characterized by a lack of periodicity, and a near-unity ratio of energy below kHz to energy above kHz.

WAEPL. Weighted Acoustic Echo Path Loss. A measure of the echo performance under normal conversation. ANSI/EIA/TIA- defines the measurement of WAEPL.

Zero Input Response (ZIR). The filter output caused by the non-zero initial state of the filter when no input is present.

Zero State Response (ZSR). The filter output caused by an input when the initial state of the filter is zero.

ZIR. See Zero Input Response.

ZSR. See Zero State Response.
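The ZIR and ZSR definitions imply that, for a linear filter, the full output equals the sum of the two responses. The Python sketch below checks this for a one-pole IIR filter, y(n) = x(n) + a·y(n-1). The filter, its coefficient, and the sample values are assumptions chosen for illustration and are not filters defined by this standard.

```python
def one_pole(x, a, y_prev=0.0):
    """Run y(n) = x(n) + a * y(n-1), starting from the stored state y_prev."""
    out = []
    for sample in x:
        y_prev = sample + a * y_prev
        out.append(y_prev)
    return out


x = [1.0, 0.5, -0.25, 0.0]   # illustrative input samples
a = 0.5                      # illustrative filter coefficient
state = 2.0                  # non-zero initial filter state

zir = one_pole([0.0] * len(x), a, state)  # no input, non-zero initial state
zsr = one_pole(x, a, 0.0)                 # input present, zero initial state
full = one_pole(x, a, state)              # input present, non-zero initial state

print(full)
print([i + s for i, s in zip(zir, zsr)])
```

Both printed lists are identical, which is the superposition property that makes it possible to handle the filter memory and the new input separately.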

SERVICE OPTION 17: VARIABLE DATA RATE TWO-WAY VOICE

General Description

Service Option 17 provides two-way voice communications between the base station and the mobile station using the dynamically variable data rate speech codec algorithm described in this standard. The service option takes voice samples and generates an encoded speech packet for every Traffic Channel frame. The receiving station generates a speech packet from every Traffic Channel frame and supplies it to the service option for decoding into voice samples. The two speech codecs communicate at one of four rates: Rate, Rate /, Rate /, and Rate /.

In case of a discrepancy between the master C simulation and the algorithmic description, the master C simulation will prevail. The master C simulation is contained in the database of the performance specification for this algorithm, TIA/EIA/IS-.

Service Option Number

The variable data rate two-way voice service option using the speech codec algorithm described by this standard shall use service option number 17 and is called Service Option 17.

Multiplex Option

Required Multiplex Option Support

Service Option 17 shall support an interface with Multiplex Option (see TIA/EIA/IS-). Speech packets for Service Option 17 shall only be transported as primary traffic.

Interface to Multiplex Option

Transmitted Packets

The service option shall generate and supply exactly one packet to the multiplex sublayer every 0 ms. The packet contains the service option information bits which are transmitted as primary traffic.

Footnote: IS- Mobile Station-Base Station Compatibility Standard for Dual-Mode Wideband Spread Spectrum Cellular System and J-STD-00 Personal Station-Base Station Compatibility Requirements for. to.0 GHz Code Division Multiple Access (CDMA) Personal Communications Systems use the term frame to represent a 0 ms grouping of data on the Traffic Channel. Common speech codec terminology also uses the term frame to represent a quantum of processing. For Service Option 0x000, the speech codec frame corresponds to speech sampled over 0 ms. The speech samples are processed into a packet. This packet is transmitted in a Traffic Channel frame.

The service option shall operate in one of two modes:

In the first mode, the packet supplied by the service option shall be one of the types shown in Table...-. Upon command, the service option shall generate Blank packets. Also, upon command, the service option shall generate a non-blank packet with a maximum rate of Rate /.

In the second mode, the packet supplied by the service option shall be one of the types shown in Table...-, excluding the Rate packet. Upon command, the service option shall generate a Blank packet. Also upon command, the service option shall generate a non-blank packet with a maximum rate of Rate /.

Table...-. Packet Types Supplied by Service Option to the Multiplex Sublayer

Packet Type, Bits per Packet: Rate Rate / Rate / Rate / 0 Blank

Received Packets

The multiplex sublayer in the mobile station categorizes every received Traffic Channel frame and supplies the packet type and accompanying bits, if any, to the service option as shown in Table...-. The service option processes the bits of the packet as described in.. The first five received packet types shown in Table...- correspond to the transmitted packet types shown in Table...-. When the multiplex sublayer determines that a received frame is in error, the multiplex sublayer supplies an insufficient frame quality (erasure) packet to the service option.

Table...-. Packet Types Supplied by the Multiplex Sublayer to Service Option

Packet Type, Bits per Packet: Rate Rate / Rate / Rate / 0 Blank 0 Insufficient frame quality (erasure) 0
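The categorization performed on the receive side can be sketched in Python as shown below. The packet sizes are not legible in this copy of the tables, so the values 266, 124, 54, and 20 bits are assumptions that should be checked against the published standard. The enum member names and the `classify` function are likewise hypothetical.

```python
from enum import Enum


class PacketType(Enum):
    RATE_FULL = "full rate"
    RATE_HALF = "half rate"
    RATE_QUARTER = "quarter rate"
    RATE_EIGHTH = "eighth rate"
    BLANK = "blank"
    ERASURE = "insufficient frame quality (erasure)"


# ASSUMED bits per packet; the figures in the source tables are not legible.
BITS_PER_PACKET = {
    266: PacketType.RATE_FULL,
    124: PacketType.RATE_HALF,
    54: PacketType.RATE_QUARTER,
    20: PacketType.RATE_EIGHTH,
    0: PacketType.BLANK,
}


def classify(frame_ok: bool, num_bits: int) -> PacketType:
    """Map a received frame to the packet type handed to the service option."""
    if not frame_ok:
        return PacketType.ERASURE  # frame in error: supply an erasure packet
    try:
        return BITS_PER_PACKET[num_bits]
    except KeyError:
        raise ValueError(f"unexpected packet size: {num_bits} bits") from None


print(classify(True, 124), classify(False, 266), classify(True, 0))
```

Only the first five members can also be produced on the transmit side; the erasure type exists solely for received frames.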

Service Negotiation

The mobile station and base station shall perform service negotiation for the service option as described in IS- or J-STD-00, and the negotiated service configuration shall include only valid attributes for the service option as specified in Table..-.

Table..-. Valid Service Configuration Attributes for Service Option

Service Configuration Attribute    Valid Selections
Forward Multiplex Option           Multiplex Option
Reverse Multiplex Option           Multiplex Option
Forward Transmission Rates         Rate Set with all four rates enabled
Reverse Transmission Rates         Rate Set with all four rates enabled
Forward Traffic Type               Primary Traffic
Reverse Traffic Type               Primary Traffic

Initialization and Connection

Mobile Station Requirements

If the mobile station accepts a service configuration, as specified in a Service Connect Message, that includes a service option connection using the service option, the mobile station shall perform the following:

- If the service option connection is new (that is, not part of the previous service configuration), the mobile station shall perform speech codec initialization (see..) at the action time associated with the Service Connect Message. The mobile station shall complete the initialization within 0 ms.
- Commencing at the action time associated with the Service Connect Message, and continuing for as long as the service configuration includes the service option connection, the service option shall process received packets and shall generate and supply packets for transmission as follows:
  - If the mobile station is in the Conversation Substate, the service option shall process the received packets and generate and supply packets for transmission in accordance with this standard.
  - If the mobile station is not in the Conversation Substate, the service option shall process the received packets in accordance with this standard, and shall generate and supply All Ones Rate / Packets for transmission, except when commanded to generate a blank packet.

Base Station Requirements

If the base station establishes a service configuration, as specified in a Service Connect Message, that includes a service option connection using the service option, the base station shall perform the following:

- If the service option connection is new (that is, not part of the previous service configuration), the base station shall perform speech codec initialization (see..) no later than the action time associated with the Service Connect Message.
- Commencing at the action time associated with the Service Connect Message and continuing for as long as the service configuration includes the service option connection, the service option shall process received packets and shall generate and supply packets for transmission in accordance with this standard. The base station may defer enabling the audio input and output.

Service Option Control Messages

Mobile Station Requirements

The mobile station shall support one pending Service Option Control Message for the service option.

If the mobile station receives a Service Option Control Message for the service option, then, at the action time associated with the message, the mobile station shall process the message as follows:

- If the MOBILE_TO_MOBILE field is equal to 1, the service option shall process each received Blank packet as an insufficient frame quality (erasure) packet. In addition, if the INIT_CODEC field is equal to 1, the service option should disable the audio output for 1 second after initialization. If the MOBILE_TO_MOBILE field is equal to 0, the service option shall process each received packet as described in....
- If the INIT_CODEC field is equal to 1, the mobile station shall perform speech codec initialization (see..). The mobile station shall complete the initialization within 0 ms.
- If the RATE_REDUC field is equal to a value defined in Table...-, the service option shall generate the fraction of those packets normally generated as Rate packets (see...) at either Rate, Rate /, or Rate / as specified by the corresponding line in Table...-. The service option shall continue to use these fractions until either of the following events occurs:
  - The mobile station receives a Service Option Control Message specifying a different RATE_REDUC, or
  - The service option is initialized.

  The service option may use the procedure defined in... to perform this rate reduction. This rate reduction mechanism is not deterministic, but depends upon the statistics of the input speech. The values in Table...- are based upon the assumption that 0% of active speech is unvoiced. In reduced rate level, unvoiced speech is encoded using Rate /. In reduced rate levels and, unvoiced speech is encoded using Rate /. In reduced rate level, 0% of the voiced speech frames are encoded using Rate /. The decision to encode the input voiced speech frame as Rate / or Rate is made based upon the statistics of the input speech and the average encoding rate for active speech as defined in...
- If the RATE_REDUC field is not equal to a value defined in Table...-, the mobile station shall reject the message by sending a Mobile Station Reject Order with the ORDQ field set equal to.

Base Station Requirements

The base station may send a Service Option Control Message to the mobile station. If the base station sends a Service Option Control Message, the base station shall include the following type-specific fields for the service option:

Table...-. Service Option Control Message Type-Specific Fields

Field               Length (bits)
RATE_REDUC
RESERVED
MOBILE_TO_MOBILE
INIT_CODEC

RATE_REDUC - Rate reduction. The base station shall set this field to the RATE_REDUC value from Table...- corresponding to the rate reduction that the mobile station is to perform.

RESERVED - Reserved bits. The base station shall set this field to 000.

MOBILE_TO_MOBILE - Mobile-to-mobile processing. If the mobile station is to perform mobile-to-mobile processing (see...), the base station shall set this field to 1. In addition, if the mobile station is to disable the audio output of the speech codec for 1 second after initialization, the base station shall set the INIT_CODEC field and the MOBILE_TO_MOBILE field to 1. If the mobile station is not to perform mobile-to-mobile processing, the base station shall set this field to 0.

INIT_CODEC - Initialize speech codec. If the mobile station is to initialize the speech codec (see..), the base station shall set this field to 1; otherwise, the base station shall set this field to 0.
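A receiver could decode the type-specific fields listed above along the following lines. This is a minimal Python sketch under stated assumptions: the field widths (3 bits for RATE_REDUC, 3 for RESERVED, 1 each for MOBILE_TO_MOBILE and INIT_CODEC) and the most-significant-bit-first packing into one octet are assumed, since only the three-bit RESERVED value 000 is legible in this copy. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlFields:
    rate_reduc: int
    mobile_to_mobile: bool
    init_codec: bool


def parse_control_fields(octet: int) -> ControlFields:
    """Split one octet into the type-specific fields (ASSUMED widths 3, 3, 1, 1)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("expected a single octet")
    rate_reduc = (octet >> 5) & 0b111          # assumed 3-bit RATE_REDUC
    # Bits 4..2 are RESERVED; the sender sets them to 000 and they are ignored here.
    mobile_to_mobile = bool((octet >> 1) & 1)  # assumed 1-bit flag
    init_codec = bool(octet & 1)               # assumed 1-bit flag
    return ControlFields(rate_reduc, mobile_to_mobile, init_codec)


def effective_packet_type(packet_type: str, fields: ControlFields) -> str:
    """With mobile-to-mobile processing on, a Blank packet is handled as an erasure."""
    if fields.mobile_to_mobile and packet_type == "blank":
        return "erasure"
    return packet_type


fields = parse_control_fields(0b010_000_1_1)
print(fields)
print(effective_packet_type("blank", fields))
```

The sketch ignores the reserved bits on receipt rather than rejecting the message, which is a design choice made here and not a requirement taken from the text.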

Table...-. Fraction of Packets at Rate, Rate /, and Rate / with Rate Reduction

Columns: RATE_REDUC; Reduced Rate Mode Level; Average Encoding Rate for Active Speech (kbps); Fraction of Normally Rate Packets to be Rate; Fraction of Normally Rate Packets to be Rate /; Fraction of Normally Rate Packets to be Rate /

All other RATE_REDUC values are reserved.

Note: Average Encoding Rate calculation uses channel rates of.,., and. kbps for Rate, /, and / respectively.

Variable Rate Speech Coding Algorithm

Introduction

The speech codec uses a code excited linear predictive (CELP) coding algorithm. This technique uses a codebook to vector quantize the residual signal using an analysis-by-synthesis method. The speech codec produces a variable output data rate based upon speech activity. For typical two-way telephone conversations, the average data rate is reduced by a factor of two or more with respect to the maximum data rate.

The overall speech synthesis or decoder model is shown in Figure..-. First, a vector is taken from one of two sources depending on the rate. For Rate / and Rate / a pseudorandom vector is generated. For all other rates, a vector specified by an index $\hat{I}$ is taken from the codebook, which is a table of vectors. This vector is multiplied by a gain term $\hat{G}$, and then is filtered by the long-term pitch synthesis filter whose characteristics are governed by the pitch parameters $\hat{L}$ and $\hat{b}$. The output of the pitch synthesis filter is processed by the pitch pre-filter. The pitch pre-filter parameters are the pitch lag, $\hat{L}$, and an attenuated pitch gain coefficient, $\hat{b}'$, derived from $\hat{b}$. The output of the pre-filter is filtered by the formant synthesis filter to reproduce the speech signal. The output of the formant synthesis filter is filtered by the adaptive postfilter, PF(z).

Footnote: For a summary of Service Option 0x000 notation, see..

Footnote: The formant synthesis filter is also called the linear predictive coding filter, whose characteristics are governed by the filter coefficients $\hat{a}$,..., $\hat{a}$ 0.

The speech codec encoding procedure involves determining the input parameters for the decoder which minimize the perceptual difference between the synthesized and the original speech. The selection processes for each set of parameters are described in this section. The encoding procedure also includes quantizing the parameters and packing them into data packets for transmission.

The speech codec decoding procedure involves unpacking the data packets, unquantizing the received parameters, and reconstructing the speech signal from these parameters. The reconstruction consists of filtering the scaled codebook vector, c_d(n), as shown in Figure..-.

Figure..-. Speech Synthesis Structure in the Receiving Speech Codec
(Block diagram. Blocks: Pseudorandom Vector Generator (seeded by DECSD) for Rate / or /, Codebook for all other rates, Gain Control, Pitch Synthesis Filter P(z), Pitch Pre-Filter P'(z), Formant Synthesis (LPC) Filter A(z), Postfilter PF(z), Gain Control. Signals: c_d(n), p_d(n), p'_d(n), p_pre(n), y_d(n), pf(n), s_d(n). Input parameters: $\hat{I}$; $\hat{G}$; $\hat{L}$ and $\hat{b}$; $\hat{L}$ and $\hat{b}'$; the LPC coefficients. Output: speech.)

The input speech is sampled at kHz. This speech is broken down into 0 ms speech codec frames, each consisting of 0 samples. The formant synthesis (LPC) filter coefficients are updated once per frame, regardless of the data rate selected. The number of bits used to encode the LPC parameters is a function of the selected data rate. Within each frame, the pitch and codebook parameters are updated a varying number of times, depending upon the selected data rate. Table..- describes the various parameters used for each rate.

Table..-. Parameters Used for Each Rate

Columns: Parameter; Rate; Rate /; Rate /; Rate /

Linear predictive coding (LPC) updates per frame:
Samples per LPC update, L_A: 0 (0 ms); 0 (0 ms); 0 (0 ms); 0 (0 ms)
Bits per LPC update: 0
Pitch updates (subframes) per frame: 0 0
Samples per pitch subframe, L_p: 0 ( ms) 0 ( ms) -
Bits per pitch update: -
Codebook updates (subframes) per frame:
Samples per codebook subframe, L_C: 0 (. ms) 0 ( ms) ( ms) 0 (0 ms)
Bits per codebook update:. * * *

*Note: Rate uses bits per codebook update in of the codebook subframes per frame and bits per codebook update, in four codebook subframes. Rate / uses five unsigned codebook gains, each -bits long for scaling the pseudorandom excitation. Rate / uses six bits for pseudorandom excitation, instead of using the codebook.

The components for each rate packet are shown in Figures..- through..-. In these figures, each LPC frame corresponds to one 0-sample frame of speech. The number in the LPC block of each figure is the number of bits used at that rate to encode the LPC coefficients. Each pitch block corresponds to a pitch update within each frame, and the number in each pitch block corresponds to the number of bits used to encode the updated pitch parameters. For example at Rate, the pitch parameters are updated four times, once for each quarter of the speech frame, each time using bits to encode the new pitch parameters. Similarly, each codebook block corresponds to a codebook update within each frame, and the number in each codebook block corresponds to the number of bits used to encode the updated codebook parameters. For example at Rate /, the codebook parameters are updated four times, once for each quarter of the speech frame, each time using bits to encode the parameters.

Figure..-. Bit Allocation for a Rate Packet
Figure..-. Bit Allocation for a Rate / Packet
Figure..-. Bit Allocation for a Rate / Packet
Figure..-. Bit Allocation for a Rate / Packet
Each figure shows the LPC frame, the pitch subframes, the codebook subframes, any reserved bits, and the total number of bits in the packet.

Table..- lists all the parameter codes transmitted for each rate packet. The following list describes each parameter:

LSPi: Line Spectral Pair frequency i.
LSPVi: Line Spectral Pair frequencies grouped into five vectors of dimension two.
PLAGi: Pitch Lag for the ith pitch subframe.
PFRACi: Fractional Pitch Lag for the ith pitch subframe.
PGAINi: Pitch Gain for the ith pitch subframe.
CBINDEXi: Codebook Index for the ith codebook subframe.
CBGAINi: Unsigned Codebook Gain for the ith codebook subframe.
CBSEED: Random Seed for Rate / packets.
CBSIGNi: Sign of the Codebook Gain for the ith codebook subframe.

This standard refers to the LSB of a particular code as CODE[0] and the more significant bits as CODE[1], CODE[2], etc. For example, if LSPV = '001011' in binary for a maximum rate frame, LSPV[0] = 1, LSPV[1] = 1, LSPV[2] = 0, LSPV[3] = 1, LSPV[4] = 0, and LSPV[5] = 0.
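The bit-naming convention above (CODE[0] is the least significant bit) can be illustrated with a short sketch. The helper names `code_bit` and `code_bits` are hypothetical and are not part of the standard; the 6-bit value is used only as an illustration.

```python
def code_bit(code, i):
    """Return CODE[i], where CODE[0] is the least significant bit."""
    return (code >> i) & 1


def code_bits(code, width):
    """Return [CODE[0], CODE[1], ..., CODE[width-1]] for a code of the given width."""
    return [code_bit(code, i) for i in range(width)]


# Illustrative 6-bit code '001011' (binary).
example = 0b001011
print(code_bits(example, 6))  # [1, 1, 0, 1, 0, 0]
```

Listing the bits from index 0 upward makes it easy to check a packing routine against the per-bit notation used in the transmission code tables.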

28 Table..-. Transmission Codes and Bit Allocations (Part of ) Rate Rate Code / / / Code / / / LSP CBINDEX LSP CBINDEX LSP CBINDEX LSP CBINDEX LSP CBINDEX LSP CBINDEX LSP CBINDEX LSP CBINDEX0 LSP CBINDEX LSP0 CBINDEX LSPV CBINDEX LSPV CBINDEX LSPV CBINDEX LSPV CBINDEX LSPV CBGAIN PLAG CBGAIN PLAG CBGAIN PLAG CBGAIN PLAG CBGAIN PFRAC CBGAIN PFRAC CBGAIN PFRAC CBGAIN PFRAC CBGAIN PGAIN CBGAIN0 PGAIN CBGAIN PGAIN CBGAIN PGAIN CBGAIN CBSEED CBGAIN CBINDEX CBGAIN CBINDEX CBGAIN -

Table..-. Transmission Codes and Bit Allocations (Part of )

Columns (repeated twice across the page): Code | Rate | Rate / | Rate / | Rate /
Codes listed: CBSIGN CBSIGN CBSIGN CBSIGN0 CBSIGN CBSIGN CBSIGN CBSIGN CBSIGN CBSIGN CBSIGN CBSIGN CBSIGN CBSIGN CBSIGN CBSIGN

Input Audio Interface

... Input Audio Interface in the Mobile Station

The input audio may be either an analog or digital signal.

... Conversion and Scaling

The speech shall be sampled at a rate of 000 samples per second. The speech shall be quantized to a uniform PCM format with at least magnitude bits of dynamic range. The quantities in this standard assume a -bit integer input quantization with a range of ±0. The following speech codec discussion assumes this -bit integer quantization. If the speech codec uses a different quantization, then appropriate scaling should be used.

... Digital Audio Input

If the input audio is an -bit µ-law PCM signal, it shall be converted to a uniform PCM format according to Table in CCITT Recommendation G. "Pulse Code Modulation (PCM) of Voice Frequencies."

... Analog Audio Input

If the input is in analog form, the mobile station shall sample the analog speech and shall convert the samples to a digital format for speech codec processing. This shall be done by either the following or an equivalent method: First, the input audio gain level is adjusted. Then, the signal is bandpass filtered to prevent aliasing. Finally, the filtered signal is sampled and quantized (see...).

... Adjusting the Transmit Level

The mobile station shall have a transmit objective loudness rating (TOLR) equal to - dB when transmitting to a reference base station (see..0..). The loudness ratings are described in IEEE Standard - "IEEE Standard Method for Determining Objective Loudness Ratings of Telephone Connections." Measurement techniques and tolerances are described in IS- "Recommended Minimum Performance Standard for Wideband Spread Spectrum Digital Cellular System Speech Service Options."

... Band Pass Filtering

Input anti-aliasing filtering shall conform to CCITT Recommendation G. "Separate Performance Characteristics for the Encoding and Decoding Sides of PCM Channels Applicable to -Wire Voice-Frequency Interfaces." Additional anti-aliasing filtering may be provided by the manufacturer.

... Echo Return Loss

Provision shall be made to ensure adequate isolation between receive and transmit audio paths in all modes of operation. When no external transmit audio is present, the speech codec shall not generate packets at rates higher than Rate / (see..) due to acoustic coupling of the receive audio into the transmit audio path (specifically with the receive audio at full volume). Target levels of dB WAEPL should be met. See ANSI/EIA/TIA Standard "Acoustic-to-Digital and Digital-to-Acoustic Transmission Requirements for ISDN Terminals." Refer to the requirements stated in IS- "Recommended Minimum Performance Standard for Wideband Spread Spectrum Digital Cellular System Speech Service Options."

... Input Audio Interface in the Base Station

... Sampling and Format Conversion

The base station converts the input speech (analog, µ-law companded Pulse Code Modulation, or other format) into a uniform quantized PCM format with at least magnitude bits of dynamic range. The sampling rate is 000 samples per second. The sampling and conversion process shall be as in ....

... Adjusting the Transmit Level

The base station shall set the transmit level so that a 00 Hz tone at a level of 0 dBm0 at the network interface produces a level . dB below the level of a sine wave whose peak is at the maximum quantization level. Measurement techniques and tolerances are described in IS- "Recommended Minimum Performance Standard for Wideband Spread Spectrum Digital Cellular System Speech Service Options."

... Echo Canceling

The base station shall provide a method to cancel echoes returned by the PSTN interface. The echo canceling function should provide at least 0 dB of echo return loss enhancement. The echo canceling function should work over a range of PSTN echo return delays from 0 to ms. Because of the relatively long delays inherent in the speech coding and transmitting processes, echoes that are not sufficiently suppressed are noticeable to the mobile station user.
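The µ-law to uniform PCM conversion called for in the digital audio input requirements of this section can be sketched as below. This is a minimal illustration that uses the commonly implemented µ-law expansion arithmetic with the result scaled to a 16-bit integer range; the function name `ulaw_to_linear` and the 16-bit scaling are assumptions, and the normative mapping remains the table in the referenced CCITT Recommendation.

```python
ULAW_BIAS = 0x84  # bias used by the common mu-law companding arithmetic


def ulaw_to_linear(code):
    """Expand one 8-bit mu-law code word to a signed linear PCM sample.

    Assumption: output is scaled to a 16-bit integer range, as in widely
    used software expanders. This is a sketch, not the normative table.
    """
    code = ~code & 0xFF             # mu-law code words are stored bit-inverted
    sign = code & 0x80              # sign bit after inversion
    exponent = (code >> 4) & 0x07   # 3-bit segment number
    mantissa = code & 0x0F          # 4-bit step within the segment
    magnitude = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS
    return -magnitude if sign else magnitude


# Expand a short run of code words into linear samples.
samples = [ulaw_to_linear(c) for c in (0xFF, 0x80, 0x00)]
print(samples)  # [0, 32124, -32124]
```

If the codec assumes a different integer range than the one produced here, the expanded samples would need the "appropriate scaling" that the conversion and scaling requirement mentions.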

... Ear Protection

To protect the user from possible ear damage, ear-piece acoustic output shall be limited so as not to exceed 0 dB SPL when placed to the ear, as measured in accordance with . of IEEE - "Standard Method for Measuring Transmission Performance on Analog and Digital Telephone Sets."

.. Determining the Formant Prediction Parameters

... Form of the Formant Synthesis Filter

The formant synthesis filter, which is similar to the traditional LPC formant synthesis filter, is the inverse of the formant prediction error filter. The prediction error filter is of the tenth order (i.e., P is equal to 10), and has transfer function

$$ A(z) = 1 - \sum_{i=1}^{P} a_i z^{-i} $$ (...-)

The formant synthesis filter has transfer function

$$ \frac{1}{A(z)} = \frac{1}{1 - \sum_{i=1}^{P} a_i z^{-i}} $$ (...-)

The LPC coefficients, $a_i$, are computed from the input speech.

... Encoding

The encoding process begins by determining the formant prediction parameters. This is performed by the following steps:

1. High-pass filter the input samples.
2. Window the filtered samples using a Hamming window.
3. Compute the values of the autocorrelation function corresponding to shifts from 0 to samples.
4. Determine the LPC coefficients from the autocorrelation values.
5. Transform the LPC coefficients to LSP frequencies.
6. Convert the LSP frequencies into LSP codes (these codes are placed into the packet for transmission).
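The relationship between the prediction error filter and the formant synthesis filter defined in this section can be sketched in a few lines. The function names and the list layout (`a[0]` holds the first LPC coefficient) are assumptions made for illustration; the sketch uses floating point and zero initial filter memory, which the standard does not necessarily do.

```python
def prediction_error(signal, a):
    """Apply A(z): e(n) = s(n) - sum_i a_i * s(n - i), with zero initial state."""
    out = []
    for n, s in enumerate(signal):
        pred = sum(a[i] * signal[n - 1 - i]
                   for i in range(len(a)) if n - 1 - i >= 0)
        out.append(s - pred)
    return out


def formant_synthesis(excitation, a):
    """Apply 1/A(z): y(n) = x(n) + sum_i a_i * y(n - i), with zero initial state."""
    history = [0.0] * len(a)        # y(n-1), y(n-2), ..., y(n-P)
    out = []
    for x in excitation:
        y = x + sum(a[i] * history[i] for i in range(len(a)))
        out.append(y)
        history = [y] + history[:-1]
    return out


# A first-order example: the impulse response of 1/(1 - 0.5 z^-1).
print(formant_synthesis([1.0, 0.0, 0.0], [0.5]))  # [1.0, 0.5, 0.25]
```

Running the synthesis output back through `prediction_error` with the same coefficients recovers the excitation, which is what "inverse" means for the two transfer functions.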

... High-Pass Filtering of Input Samples

A high-pass digital filter is inserted into the input signal path to remove unwanted background and circuit noise and to prevent a DC offset from artificially increasing R(0) (see...) and thus disrupting the rate decision algorithm (see..). One possible high-pass filter for accomplishing these objectives is defined as

HPF(z) = 0. z - z + z -.z + 0. (...-)

... Windowing the Samples

The high-pass filtered speech samples are windowed using a Hamming window which is centered at the center of the fourth Rate pitch subframe. The window is 0 samples long (i.e., $L_A$ is equal to 0). Let s(n) be the input speech signal with the DC removed, where s(0) denotes the first sample of the current frame. The windowed speech signal is defined as

$$ S_w(n) = s(n+0)\,W_H(n), \qquad 0 \le n \le L_A - 1 $$ (...-)

where the Hamming window, $W_H(n)$, is defined in Table...- in hexadecimal format. Each value in the table has fractional bits. Note the offset of 0 samples, which results in the window of speech being centered between the th and 0th samples of the current speech frame of 0 samples, and s(0+i) for 0 i are the first 0 samples of the next speech frame.

33 Table...-. Hamming Window Values W H (n) n W H (n) n n W H (n) n n W H (n) n 0 0x0f 0x 0x 0 0x0 0x0 0xf 0 0x0 0xd 0 0x 0 0x0 0 0xf 0xc 0 0x0d 0x 0x00 0 0x0b 0xaf 0xdb 00 0x0f 0xacd 0 0xaf 0x0d 0xbee 0xa 0x0 0xd 0xd 0x0f 0 0xe 0xf 0 0x0 0xfe 0xaa 0x0dc 0x0 0xbc 0x0e 0xb0 0 0xbe 0x0ec 0 0xda 0xcb 0x0 0x0 0xd0 0x0a 0xd 0xd0 0 0x0ad0 0x 0 0xdf 0x0b 0xb 0xeb 0x0c 0xa0 0xeb 0x0d0 0 0xc 0xf0 0 0x0dd 0xae 0xff 0x0eb0 0xbfd 0xf 0x0f0 0xd 0 0xfb 0x0 0 0xe 0 0xfdb 0x 0xf 0 0xff 0xb 0x0 0 0xfff 0 0x 0x 0 -

... Computing the Autocorrelation Function

Following the windowing operation, the kth value of the autocorrelation function is computed, for each shift k starting at 0, as

$$ R(k) = \sum_{m=0}^{L_A-1-k} S_w(m)\,S_w(m+k) $$ (...-)

Only the first values of the autocorrelation function, R(0) through R(), need to be computed from the windowed speech signal within the analysis window. Of these, the first values of the autocorrelation function are required for LPC analysis. All values are used for the rate determination algorithm defined in ....

... Determining the LPC Coefficients from the Autocorrelation Function

The LPC coefficients are obtained from the autocorrelation function. One method is Durbin's recursion, as shown below (see Rabiner, L. R. and Schafer, R. W., Digital Processing of Speech Signals, (New Jersey: Prentice-Hall Inc, ), pp. -.). The superscripts in parentheses represent the stage of Durbin's recursion; for example, $a_j^{(i)}$ refers to $a_j$ at the ith stage.

{
    E^(0) = R(0)
    i = 1
    while (i ≤ P)
    {
        k_i = { R(i) - Σ_{j=1}^{i-1} a_j^(i-1) R(i-j) } / E^(i-1)
        a_i^(i) = k_i
        j = 1
        while (j ≤ i-1)
        {
            a_j^(i) = a_j^(i-1) - k_i a_{i-j}^(i-1)
            j = j + 1
        }
        E^(i) = (1 - k_i^2) E^(i-1)
        i = i + 1
    }
}

The LPC coefficients are

$$ a_j = a_j^{(P)}, \qquad 1 \le j \le P $$ (...-)
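The autocorrelation computation and Durbin's recursion described above can be sketched as follows. This is an illustrative floating-point version, not the bit-exact procedure of the standard; the function names and the argument `max_shift` are assumptions, and the sketch does not guard against R(0) being zero (an all-zero frame).

```python
def autocorrelation(sw, max_shift):
    """R(k) = sum_{m=0}^{N-1-k} sw[m] * sw[m+k] for k = 0..max_shift."""
    n = len(sw)
    return [sum(sw[m] * sw[m + k] for m in range(n - k))
            for k in range(max_shift + 1)]


def durbin(R, P):
    """Durbin's recursion. Returns ([a_1, ..., a_P], final prediction error E)."""
    a = [0.0] * (P + 1)             # a[j] holds a_j; index 0 is unused
    E = R[0]
    for i in range(1, P + 1):
        # Reflection coefficient k_i from the stage i-1 coefficients.
        k = (R[i] - sum(a[j] * R[i - j] for j in range(1, i))) / E
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        E = (1.0 - k * k) * E
    return a[1:], E


# Autocorrelation of a first-order process: R(k) = 0.5 ** k.
coeffs, err = durbin([1.0, 0.5, 0.25], 2)
print(coeffs, err)  # [0.5, 0.0] 0.75
```

Copying the coefficient list at each stage keeps the stage i-1 values intact while the stage i values are formed, which mirrors the superscript notation of the recursion.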

... Transforming the LPC Coefficients to Line Spectrum Pairs (LSPs)

The LPC coefficients are transformed into line spectrum pair frequencies. The prediction error filter transfer function, A(z), is given by

$$ A(z) = 1 - a_1 z^{-1} - \cdots - a_{10} z^{-10} $$ (...-)

where $a_i$, $1 \le i \le 10$, are the LPC coefficients as described earlier. Define two new transfer functions $P_A(z)$ and $Q_A(z)$ as

$$ P_A(z) = A(z) + z^{-11} A(z^{-1}) = 1 + p_1 z^{-1} + \cdots + p_5 z^{-5} + p_5 z^{-6} + \cdots + p_1 z^{-10} + z^{-11} $$ (...-)

and

$$ Q_A(z) = A(z) - z^{-11} A(z^{-1}) = 1 + q_1 z^{-1} + \cdots + q_5 z^{-5} - q_5 z^{-6} - \cdots - q_1 z^{-10} - z^{-11} $$ (...-)

where

$$ p_i = -a_i - a_{11-i}, \qquad 1 \le i \le 5 $$ (...-)

and

$$ q_i = -a_i + a_{11-i}, \qquad 1 \le i \le 5 $$ (...-)

The LSP frequencies are the ten roots which exist between $\omega = 0$ and $\omega = 1.0$ in the following two equations:

$$ P'(\omega) = \cos(5\pi\omega) + p'_1 \cos(4\pi\omega) + \cdots + p'_4 \cos(\pi\omega) + \frac{p'_5}{2} $$ (...-)

$$ Q'(\omega) = \cos(5\pi\omega) + q'_1 \cos(4\pi\omega) + \cdots + q'_4 \cos(\pi\omega) + \frac{q'_5}{2} $$ (...-)

where the parameters p' and q' are computed recursively from the parameters p and q as

$$ p'_0 = q'_0 = 1 $$ (...-)

$$ p'_i = p_i - p'_{i-1}, \qquad 1 \le i \le 5 $$ (...-)

$$ q'_i = q_i + q'_{i-1}, \qquad 1 \le i \le 5 $$ (...-0)
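The LPC-to-LSP transformation in this section can be sketched as a grid search followed by bisection. The root-finding strategy, the function name `lpc_to_lsp`, and the `grid` parameter are assumptions chosen for illustration; the standard does not prescribe a particular search method here. The sketch takes the two cosine polynomials to have a constant term equal to half of the fifth primed coefficient, which is what dividing the symmetric polynomials by their trivial roots yields.

```python
import math


def lpc_to_lsp(a, grid=2000):
    """Return the LSP frequencies in (0, 1) for LPC coefficients a = [a_1..a_P], P even."""
    P = len(a)
    half = P // 2
    # p_i = -a_i - a_{P+1-i},  q_i = -a_i + a_{P+1-i},  i = 1..P/2
    p = [-a[i - 1] - a[P - i] for i in range(1, half + 1)]
    q = [-a[i - 1] + a[P - i] for i in range(1, half + 1)]
    # p'_0 = q'_0 = 1,  p'_i = p_i - p'_{i-1},  q'_i = q_i + q'_{i-1}
    pp, qq = [1.0], [1.0]
    for i in range(half):
        pp.append(p[i] - pp[-1])
        qq.append(q[i] + qq[-1])

    def poly(c, w):
        # cos(5*pi*w) + c_1 cos(4*pi*w) + ... + c_4 cos(pi*w) + c_5 / 2 (for P = 10)
        total = sum(c[i] * math.cos((half - i) * math.pi * w) for i in range(half))
        return total + c[half] / 2.0

    def find_roots(c):
        roots = []
        prev_w, prev_f = 0.0, poly(c, 0.0)
        for k in range(1, grid + 1):
            w = k / grid
            f = poly(c, w)
            if prev_f * f < 0.0:            # sign change: refine by bisection
                lo, hi, flo = prev_w, w, prev_f
                for _ in range(60):
                    mid = 0.5 * (lo + hi)
                    fm = poly(c, mid)
                    if flo * fm <= 0.0:
                        hi = mid
                    else:
                        lo, flo = mid, fm
                roots.append(0.5 * (lo + hi))
            prev_w, prev_f = w, f
        return roots

    return sorted(find_roots(pp) + find_roots(qq))


# With all LPC coefficients zero, the LSP frequencies are evenly spaced at i/11.
print([round(w, 4) for w in lpc_to_lsp([0.0] * 10)])
```

A fixed grid can miss a pair of roots that fall inside one grid cell, so a practical implementation would pick the grid density (or a different search) with that in mind.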

Since the formant synthesis (LPC) filter is stable, the roots of the two functions alternate in the range from 0 to 1.0. If these ten roots are denoted as $\omega_1, \omega_2, \ldots, \omega_{10}$ in increasing order of magnitude, then $\omega_i$ for i = 1, 3, 5, 7, 9 are roots of $P'(\omega)$ and $\omega_i$ for i = 2, 4, 6, 8, 10 are those of $Q'(\omega)$.

... Converting the LSP Frequencies to Transmission Codes for Rate, Rate /, and Rate /

For Rate, Rate /, and Rate /, a vector quantizer (VQ) is used to quantize the 10 LSP frequencies into bits. The quantization procedure is described in the following subsections.

... Computing the Sensitivities of the LSP Frequencies

Before quantization begins, the following algorithm is used to compute how sensitive each LSP is to quantization. These sensitivity weightings are used in the quantization process to weight the quantization error in each LSP frequency appropriately.

First, obtain the set of values $J_i$, composed of $J_i(1)$ through $J_i(10)$, where i is the index of the LSP frequency of interest, by performing long division operations on $P_A(z)$ and $Q_A(z)$ given in Equations...- and...-. For the LSP frequencies with odd index, $\omega_1$, $\omega_3$, etc., the long division is performed as

$$ \frac{1 + p_1 z^{-1} + \cdots + p_1 z^{-10} + z^{-11}}{1 - 2\cos(\pi\omega_i) z^{-1} + z^{-2}} = J_i(1) + J_i(2) z^{-1} + \cdots + J_i(10) z^{-9} $$ (...-)

and for the LSP frequencies with even index, $\omega_2$, $\omega_4$, etc., the long division is performed as

$$ \frac{1 + q_1 z^{-1} + \cdots - q_1 z^{-10} - z^{-11}}{1 - 2\cos(\pi\omega_i) z^{-1} + z^{-2}} = J_i(1) + J_i(2) z^{-1} + \cdots + J_i(10) z^{-9} $$ (...-)

Next, compute the autocorrelations of the vectors $J_i$, using the following equation:

$$ R_{J_i}(n) = \sum_{k=1}^{10-n} J_i(k)\,J_i(k+n), \qquad 0 \le n < 10 \text{ and } 1 \le i \le 10 $$ (...-)

Finally, compute the sensitivity weights for the LSP frequencies by cross correlating the $R_{J_i}$ vectors with the autocorrelation vector computed from the speech (see Equation...-) and multiplying the results by $\sin^2(\pi\omega_i)$. The final sensitivity weights, $SW_i$, are given by

$$ SW_i = \sin^2(\pi\omega_i) \left( R(0)\,R_{J_i}(0) + 2.0 \sum_{k=1}^{9} R(k)\,R_{J_i}(k) \right), \qquad 1 \le i \le 10 $$ (...-)

Use these weights, $SW_i$, to compute the weighted square error distortion metrics needed to search the LSP VQ codebooks, as described in the next subsection.
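The sensitivity computation in this subsection (long division, autocorrelation of the quotient, cross correlation with the speech autocorrelation) can be sketched as below. The function names are hypothetical, the arithmetic is floating point, and the squared sine factor and the factor of 2.0 on the cross-correlation sum are taken as stated assumptions of this sketch rather than as a bit-exact rendering of the standard.

```python
import math


def long_division(num, den, n_terms):
    """First n_terms quotient coefficients of num(z^-1) / den(z^-1)."""
    rem = list(num) + [0.0] * n_terms
    quot = []
    for n in range(n_terms):
        c = rem[n] / den[0]
        quot.append(c)
        for j, d in enumerate(den):
            rem[n + j] -= c * d
    return quot


def lsp_sensitivities(a, w, R):
    """Sensitivity weights SW_i for LSP frequencies w (in (0, 1)).

    a: LPC coefficients [a_1..a_P]; R: speech autocorrelation with at least P values.
    """
    P = len(a)
    half = P // 2
    p = [-a[i - 1] - a[P - i] for i in range(1, half + 1)]
    q = [-a[i - 1] + a[P - i] for i in range(1, half + 1)]
    PA = [1.0] + p + p[::-1] + [1.0]                   # symmetric coefficients
    QA = [1.0] + q + [-x for x in reversed(q)] + [-1.0]  # antisymmetric coefficients
    weights = []
    for i, wi in enumerate(w, start=1):
        num = PA if i % 2 == 1 else QA                 # odd index: P_A, even index: Q_A
        den = [1.0, -2.0 * math.cos(math.pi * wi), 1.0]
        J = long_division(num, den, P)
        RJ = [sum(J[k] * J[k + n] for k in range(P - n)) for n in range(P)]
        cross = R[0] * RJ[0] + 2.0 * sum(R[k] * RJ[k] for k in range(1, P))
        weights.append(math.sin(math.pi * wi) ** 2 * cross)
    return weights


# Flat-spectrum example: zero LPC coefficients, LSPs at i/11, white autocorrelation.
w = [i / 11.0 for i in range(1, 11)]
sw = lsp_sensitivities([0.0] * 10, w, [1.0] + [0.0] * 9)
print([round(x, 3) for x in sw])
```

Because each divisor is an exact factor of the polynomial it divides when the LSP frequency is a true root, the quotient has exactly ten coefficients and the remainder vanishes.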

... Vector Quantizing the LSP Frequencies

In the LSP VQ algorithm, the 10-dimensional LSP vector is partitioned into five 2-dimensional subvectors. Each of these 2-dimensional subvectors is quantized by a VQ, whose codebooks vary in size. Define $\omega_i$ as the ith LSP frequency and $\omega q_i$ as the quantized ith LSP frequency. The VQ codebook values are given in tables in .... Define $L_k(i,j)$ as the jth element of the kth vector in the ith VQ codebook. For example, L (,) is the first element of the rd vector in codebook, shown in Table...- as 0..

The vectors in the vector quantizer codebooks are differential vectors; i.e., the VQ codebooks contain possible values for the quantized differences in the LSP frequencies, given by $\Delta\omega_i = \omega_i - \omega_{i-1}$. The five subvectors are quantized sequentially in the following manner. The first VQ codebook contains possible quantized values for $\Delta\omega_1 = \omega_1 - \omega_0 = \omega_1$ and $\Delta\omega_2 = \omega_2 - \omega_1$. The best vector in the first codebook is selected as the vector which minimizes the sensitivity weighted error between the quantized and unquantized LSP frequencies in the first subvector, which is computed by

$$ \begin{aligned} \text{error} &= SW_1 (\omega_1 - \omega q_1)^2 + SW_2 (\omega_2 - \omega q_2)^2 \\ &= SW_1 (\omega_1 - \Delta\omega q_1)^2 + SW_2 \bigl(\omega_2 - (\Delta\omega q_1 + \Delta\omega q_2)\bigr)^2 \\ &= SW_1 \bigl(\omega_1 - L_k(1,1)\bigr)^2 + SW_2 \bigl(\omega_2 - (L_k(1,1) + L_k(1,2))\bigr)^2 \end{aligned} $$ (...-)

This error function is computed for each of the codevectors in the first LSP VQ codebook (i.e., 0 ≤ k < ). The codevector which results in the minimum error is selected, and the -bit LSPV transmission code is set equal to the index of this codevector. Define the index of the best vector for the ith codebook as kbst(i). Once kbst(1) has been determined, the first two quantized LSP frequencies can be reconstructed from the first VQ codebook as

$$ \begin{aligned} \omega q_1 &= \Delta\omega q_1 = L_{kbst(1)}(1,1) \\ \omega q_2 &= \Delta\omega q_1 + \Delta\omega q_2 = L_{kbst(1)}(1,1) + L_{kbst(1)}(1,2) \end{aligned} $$ (...-)

The remaining subvectors are quantized sequentially in a similar manner. The ith VQ codebook contains possible quantized values for $\Delta\omega_{2i-1} = \omega_{2i-1} - \omega_{2i-2}$ and $\Delta\omega_{2i} = \omega_{2i} - \omega_{2i-1}$. The best vector in the ith codebook is selected as the vector which minimizes the sensitivity weighted error between the quantized and unquantized LSP frequencies in the ith subvector, computed by

$$ \begin{aligned} \text{error} &= SW_{2i-1} (\omega_{2i-1} - \omega q_{2i-1})^2 + SW_{2i} (\omega_{2i} - \omega q_{2i})^2 \\ &= SW_{2i-1} \bigl(\omega_{2i-1} - (\omega q_{2i-2} + \Delta\omega q_{2i-1})\bigr)^2 + SW_{2i} \bigl(\omega_{2i} - (\omega q_{2i-2} + \Delta\omega q_{2i-1} + \Delta\omega q_{2i})\bigr)^2 \\ &= SW_{2i-1} \bigl(\omega_{2i-1} - (\omega q_{2i-2} + L_k(i,1))\bigr)^2 + SW_{2i} \bigl(\omega_{2i} - (\omega q_{2i-2} + L_k(i,1) + L_k(i,2))\bigr)^2 \end{aligned} $$ (...-)
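The sequential, differential VQ search described above can be sketched as follows. The function name `quantize_lsp` and the toy codebooks in the usage example are hypothetical and stand in for the codebook tables of the standard, which are not reproduced here; the sketch works for any number of two-element subvectors.

```python
def quantize_lsp(w, sw, codebooks):
    """Sequentially quantize LSP frequencies with differential 2-D codebooks.

    w: LSP frequencies; sw: sensitivity weights (same length as w);
    codebooks: one list of (delta_1, delta_2) pairs per subvector.
    Returns (best index per codebook, quantized LSP frequencies).
    """
    indices = []
    wq = []
    prev = 0.0                      # quantized frequency preceding the subvector
    for i, codebook in enumerate(codebooks):
        lo, hi = 2 * i, 2 * i + 1
        best_k, best_err = 0, float("inf")
        for k, (d1, d2) in enumerate(codebook):
            q1 = prev + d1          # candidate for the first frequency of the pair
            q2 = q1 + d2            # candidate for the second frequency of the pair
            err = sw[lo] * (w[lo] - q1) ** 2 + sw[hi] * (w[hi] - q2) ** 2
            if err < best_err:
                best_k, best_err = k, err
        d1, d2 = codebook[best_k]
        wq.extend([prev + d1, prev + d1 + d2])
        prev = wq[-1]               # later subvectors build on the quantized value
        indices.append(best_k)
    return indices, wq


# Toy example with two subvectors and two candidate vectors per codebook.
toy_codebooks = [[(0.05, 0.05), (0.10, 0.10)],
                 [(0.10, 0.10), (0.15, 0.15)]]
idx, wq = quantize_lsp([0.10, 0.20, 0.35, 0.50], [1.0] * 4, toy_codebooks)
print(idx)  # [1, 1]
```

Because each subvector is coded relative to the previously quantized frequency rather than the unquantized one, the encoder and decoder reconstruct identical values from the transmitted indices.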

Source-Controlled Variable-Rate Multimode Wideband Speech Codec (VMR-WB) Service Option 62 for Spread Spectrum Systems

Source-Controlled Variable-Rate Multimode Wideband Speech Codec (VMR-WB) Service Option 62 for Spread Spectrum Systems GPP C.S00-0 Version.0 Date: June, 00 Source-Controlled Variable-Rate Multimode Wideband Speech Codec (VMR-WB) Service Option for Spread Spectrum Systems COPYRIGHT GPP and its Organizational Partners claim

More information

Source-Controlled Variable-Rate Multimode Wideband Speech Codec (VMR-WB), Service Options 62 and 63 for Spread Spectrum Systems

Source-Controlled Variable-Rate Multimode Wideband Speech Codec (VMR-WB), Service Options 62 and 63 for Spread Spectrum Systems GPP C.S00-A v.0 GPP C.S00-A Version.0 Date: April, 00 Source-Controlled Variable-Rate Multimode Wideband Speech Codec (VMR-WB), Service Options and for Spread Spectrum Systems COPYRIGHT GPP and its Organizational

More information

SPEECH ANALYSIS AND SYNTHESIS

SPEECH ANALYSIS AND SYNTHESIS 16 Chapter 2 SPEECH ANALYSIS AND SYNTHESIS 2.1 INTRODUCTION: Speech signal analysis is used to characterize the spectral information of an input speech signal. Speech signal analysis [52-53] techniques

More information

ETSI TS V ( )

ETSI TS V ( ) TS 146 060 V14.0.0 (2017-04) TECHNICAL SPECIFICATION Digital cellular telecommunications system (Phase 2+) (GSM); Enhanced Full Rate (EFR) speech transcoding (3GPP TS 46.060 version 14.0.0 Release 14)

More information

ETSI TS V5.0.0 ( )

ETSI TS V5.0.0 ( ) Technical Specification Universal Mobile Telecommunications System (UMTS); AMR speech Codec; Transcoding Functions () 1 Reference RTS/TSGS-046090v500 Keywords UMTS 650 Route des Lucioles F-0691 Sophia

More information

3GPP TS V6.1.1 ( )

3GPP TS V6.1.1 ( ) Technical Specification 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Speech codec speech processing functions; Adaptive Multi-Rate - Wideband (AMR-WB)

More information

ETSI EN V7.1.1 ( )

ETSI EN V7.1.1 ( ) European Standard (Telecommunications series) Digital cellular telecommunications system (Phase +); Adaptive Multi-Rate (AMR) speech transcoding GLOBAL SYSTEM FOR MOBILE COMMUNICATIONS R Reference DEN/SMG-110690Q7

More information

CS578- Speech Signal Processing

CS578- Speech Signal Processing CS578- Speech Signal Processing Lecture 7: Speech Coding Yannis Stylianou University of Crete, Computer Science Dept., Multimedia Informatics Lab yannis@csd.uoc.gr Univ. of Crete Outline 1 Introduction

More information

INTERNATIONAL TELECOMMUNICATION UNION. Wideband coding of speech at around 16 kbit/s using Adaptive Multi-Rate Wideband (AMR-WB)

INTERNATIONAL TELECOMMUNICATION UNION. Wideband coding of speech at around 16 kbit/s using Adaptive Multi-Rate Wideband (AMR-WB) INTERNATIONAL TELECOMMUNICATION UNION ITU-T TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU G.722.2 (07/2003) SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS Digital terminal equipments

More information

Chapter 9. Linear Predictive Analysis of Speech Signals 语音信号的线性预测分析

Chapter 9. Linear Predictive Analysis of Speech Signals 语音信号的线性预测分析 Chapter 9 Linear Predictive Analysis of Speech Signals 语音信号的线性预测分析 1 LPC Methods LPC methods are the most widely used in speech coding, speech synthesis, speech recognition, speaker recognition and verification

More information

The Equivalence of ADPCM and CELP Coding

The Equivalence of ADPCM and CELP Coding The Equivalence of ADPCM and CELP Coding Peter Kabal Department of Electrical & Computer Engineering McGill University Montreal, Canada Version.2 March 20 c 20 Peter Kabal 20/03/ You are free: to Share

More information

ETSI TS V7.0.0 ( )

ETSI TS V7.0.0 ( ) TS 6 9 V7.. (7-6) Technical Specification Digital cellular telecommunications system (Phase +); Universal Mobile Telecommunications System (UMTS); Speech codec speech processing functions; Adaptive Multi-Rate

More information

ETSI TS V5.0.0 ( )

ETSI TS V5.0.0 ( ) TS 126 192 V5.0.0 (2001-03) Technical Specification Universal Mobile Telecommunications System (UMTS); Mandatory Speech Codec speech processing functions AMR Wideband Speech Codec; Comfort noise aspects

More information

Design of a CELP coder and analysis of various quantization techniques

Design of a CELP coder and analysis of various quantization techniques EECS 65 Project Report Design of a CELP coder and analysis of various quantization techniques Prof. David L. Neuhoff By: Awais M. Kamboh Krispian C. Lawrence Aditya M. Thomas Philip I. Tsai Winter 005

More information

ITU-T G khz audio-coding within 64 kbit/s

ITU-T G khz audio-coding within 64 kbit/s International Telecommunication Union ITU-T G.722 TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU (9/212) SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS Digital terminal equipments

More information

ETSI TS V ( )

ETSI TS V ( ) TS 126 192 V15.0.0 (2018-0) TECHNICAL SPECIFICATION Digital cellular telecommunications system (Phase 2+) (GSM); Universal Mobile Telecommunications System (UMTS); LTE; Speech codec speech processing functions;

More information

Time-domain representations

Time-domain representations Time-domain representations Speech Processing Tom Bäckström Aalto University Fall 2016 Basics of Signal Processing in the Time-domain Time-domain signals Before we can describe speech signals or modelling

More information

Speech Coding. Speech Processing. Tom Bäckström. October Aalto University

Speech Coding. Speech Processing. Tom Bäckström. October Aalto University Speech Coding Speech Processing Tom Bäckström Aalto University October 2015 Introduction Speech coding refers to the digital compression of speech signals for telecommunication (and storage) applications.

More information

Linear Prediction Coding. Nimrod Peleg Update: Aug. 2007

Linear Prediction Coding. Nimrod Peleg Update: Aug. 2007 Linear Prediction Coding Nimrod Peleg Update: Aug. 2007 1 Linear Prediction and Speech Coding The earliest papers on applying LPC to speech: Atal 1968, 1970, 1971 Markel 1971, 1972 Makhoul 1975 This is

More information

SCELP: LOW DELAY AUDIO CODING WITH NOISE SHAPING BASED ON SPHERICAL VECTOR QUANTIZATION

SCELP: LOW DELAY AUDIO CODING WITH NOISE SHAPING BASED ON SPHERICAL VECTOR QUANTIZATION SCELP: LOW DELAY AUDIO CODING WITH NOISE SHAPING BASED ON SPHERICAL VECTOR QUANTIZATION Hauke Krüger and Peter Vary Institute of Communication Systems and Data Processing RWTH Aachen University, Templergraben

More information

Chapter 12 Variable Phase Interpolation

Chapter 12 Variable Phase Interpolation Chapter 12 Variable Phase Interpolation Contents Slide 1 Reason for Variable Phase Interpolation Slide 2 Another Need for Interpolation Slide 3 Ideal Impulse Sampling Slide 4 The Sampling Theorem Slide

More information

L used in various speech coding applications for representing

L used in various speech coding applications for representing IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 1, NO. 1. JANUARY 1993 3 Efficient Vector Quantization of LPC Parameters at 24 BitsFrame Kuldip K. Paliwal, Member, IEEE, and Bishnu S. Atal, Fellow,

More information

Pulse-Code Modulation (PCM) :

Pulse-Code Modulation (PCM) : PCM & DPCM & DM 1 Pulse-Code Modulation (PCM) : In PCM each sample of the signal is quantized to one of the amplitude levels, where B is the number of bits used to represent each sample. The rate from

More information

Proc. of NCC 2010, Chennai, India

Proc. of NCC 2010, Chennai, India Proc. of NCC 2010, Chennai, India Trajectory and surface modeling of LSF for low rate speech coding M. Deepak and Preeti Rao Department of Electrical Engineering Indian Institute of Technology, Bombay

More information

ETSI EN V7.0.1 ( )

ETSI EN V7.0.1 ( ) EN 3 969 V7.. (-) European Standard (Telecommunications series) Digital cellular telecommunications system (Phase +); Half rate speech; Half rate speech transcoding (GSM 6. version 7.. Release 998) GLOBAL

More information

Finite Word Length Effects and Quantisation Noise. Professors A G Constantinides & L R Arnaut

Finite Word Length Effects and Quantisation Noise. Professors A G Constantinides & L R Arnaut Finite Word Length Effects and Quantisation Noise 1 Finite Word Length Effects Finite register lengths and A/D converters cause errors at different levels: (i) input: Input quantisation (ii) system: Coefficient

More information

Chapter 10 Applications in Communications

Chapter 10 Applications in Communications Chapter 10 Applications in Communications School of Information Science and Engineering, SDU. 1/ 47 Introduction Some methods for digitizing analog waveforms: Pulse-code modulation (PCM) Differential PCM

More information

Oversampling Converters

Oversampling Converters Oversampling Converters David Johns and Ken Martin (johns@eecg.toronto.edu) (martin@eecg.toronto.edu) slide 1 of 56 Motivation Popular approach for medium-to-low speed A/D and D/A applications requiring

More information

Lab 9a. Linear Predictive Coding for Speech Processing

Lab 9a. Linear Predictive Coding for Speech Processing EE275Lab October 27, 2007 Lab 9a. Linear Predictive Coding for Speech Processing Pitch Period Impulse Train Generator Voiced/Unvoiced Speech Switch Vocal Tract Parameters Time-Varying Digital Filter H(z)

More information

Introduction p. 1 Compression Techniques p. 3 Lossless Compression p. 4 Lossy Compression p. 5 Measures of Performance p. 5 Modeling and Coding p.

Introduction p. 1 Compression Techniques p. 3 Lossless Compression p. 4 Lossy Compression p. 5 Measures of Performance p. 5 Modeling and Coding p. Preface p. xvii Introduction p. 1 Compression Techniques p. 3 Lossless Compression p. 4 Lossy Compression p. 5 Measures of Performance p. 5 Modeling and Coding p. 6 Summary p. 10 Projects and Problems

More information

Source modeling (block processing)

Source modeling (block processing) Digital Speech Processing Lecture 17 Speech Coding Methods Based on Speech Models 1 Waveform Coding versus Block Waveform coding Processing sample-by-sample matching of waveforms coding gquality measured

More information

Topic 3. Design of Sequences with Low Correlation

Topic 3. Design of Sequences with Low Correlation Topic 3. Design of Sequences with Low Correlation M-sequences and Quadratic Residue Sequences 2 Multiple Trace Term Sequences and WG Sequences 3 Gold-pair, Kasami Sequences, and Interleaved Sequences 4

More information

Digital Speech Processing Lecture 10. Short-Time Fourier Analysis Methods - Filter Bank Design

Digital Speech Processing Lecture 10. Short-Time Fourier Analysis Methods - Filter Bank Design Digital Speech Processing Lecture Short-Time Fourier Analysis Methods - Filter Bank Design Review of STFT j j ˆ m ˆ. X e x[ mw ] [ nˆ m] e nˆ function of nˆ looks like a time sequence function of ˆ looks

More information

Optical Storage Technology. Error Correction

Optical Storage Technology. Error Correction Optical Storage Technology Error Correction Introduction With analog audio, there is no opportunity for error correction. With digital audio, the nature of binary data lends itself to recovery in the event

More information

BASICS OF COMPRESSION THEORY

BASICS OF COMPRESSION THEORY BASICS OF COMPRESSION THEORY Why Compression? Task: storage and transport of multimedia information. E.g.: non-interlaced HDTV: 0x0x0x = Mb/s!! Solutions: Develop technologies for higher bandwidth Find

More information

Multimedia Networking ECE 599

Multimedia Networking ECE 599 Multimedia Networking ECE 599 Prof. Thinh Nguyen School of Electrical Engineering and Computer Science Based on lectures from B. Lee, B. Girod, and A. Mukherjee 1 Outline Digital Signal Representation

More information

Class of waveform coders can be represented in this manner

Class of waveform coders can be represented in this manner Digital Speech Processing Lecture 15 Speech Coding Methods Based on Speech Waveform Representations ti and Speech Models Uniform and Non- Uniform Coding Methods 1 Analog-to-Digital Conversion (Sampling

More information

Speech Signal Representations

Speech Signal Representations Speech Signal Representations Berlin Chen 2003 References: 1. X. Huang et. al., Spoken Language Processing, Chapters 5, 6 2. J. R. Deller et. al., Discrete-Time Processing of Speech Signals, Chapters 4-6

More information
