Chapter 4: Operations on a Single Random Variable


Chapter [4] "Operations on a Single Random Variable" 4.1 Introduction In our study of random variables we use the probability density function or the cumulative distribution function to provide a complete statistical description of the random variable. From these functions we could, in theory, determine just about anything we might want to know about the random variable. In many cases, it is of interest to distill this information down to a few parameters that describe some of the important features of the random variable. For example, we saw in Chapter 3 that the Gaussian random variable is described by two parameters, which were referred to as the mean and variance. In this chapter, we will look at these parameters as well as several others that describe various characteristics of random variables. We will see that these parameters can be viewed as the results of Performing various operations on a random variable. 4.2 Expected Value of a Random Variable To begin, we introduce the idea of an average or expected value of a random Variable. This is perhaps the single most important characteristic of a random variable and is also a concept very familiar to most students. After taking a test, one of the most common questions a student will ask after they see their grade is, What was the average? On the other hand, how often does a student ask, What was the probability density function of the exam scores? While the answer to the second question would provide the student with more information about how the class performed, the student may not want all that information. Just knowing the average may be sufficient to tell the student how he or she performed relative to the rest of the class. -190-

DEFINITION 4.1: The expected value of a random variable X which has a PDF, f_X(x), is

E[X] = \int_{-\infty}^{\infty} x f_X(x)\, dx.    (4.1)

The terms average, mean, expectation, and first moment are all alternative names for the concept of expected value and will be used interchangeably throughout the text. Furthermore, an overbar is often used to denote expected value, so that the symbol \bar{X} is to be interpreted as meaning the same thing as E[X]. Another commonly used notation is to write \mu_X = E[X].

For discrete random variables, the PDF can be written in terms of the probability mass function,

f_X(x) = \sum_k P_X(x_k)\, \delta(x - x_k).    (4.2)

In this case, using the properties of delta functions, the definition of expected value for discrete random variables reduces to

E[X] = \sum_k x_k P_X(x_k).    (4.3)

Hence, the expected value of a discrete random variable is simply a weighted average of the values that the random variable can take on, weighted by the probability mass of each value. Naturally, the expected value of a random variable exists only if the integral in Equation 4.1 or the series in Equation 4.3 converges. One can dream up many random variables for which the integral or series does not converge and thus their expected values do not exist (or, less formally, their expected value is infinite). To gain some physical insight into this concept of expected value, we may think of f_X(x) as a mass distribution of an object along the x-axis; Equation 4.1 then calculates the centroid or center of gravity of the mass.
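As a quick illustration of Definition 4.1 and Equation 4.3, the following sketch evaluates the defining integral numerically for an exponential PDF (the same form used in Example 4.1 below) and forms the probability-weighted average for a small PMF; the parameter values and the PMF are illustrative choices, not taken from the text.

import numpy as np

# Continuous case (Eq. 4.1): exponential PDF f_X(x) = (1/b) exp(-x/b), x >= 0, with b = 2.
b = 2.0
x = np.linspace(0, 100, 200001)           # truncate the semi-infinite integral
f_X = (1.0 / b) * np.exp(-x / b)
mean_cont = np.trapz(x * f_X, x)          # E[X] = integral of x f_X(x) dx
print(mean_cont)                          # approximately 2.0 = b

# Discrete case (Eq. 4.3): a PMF on {0, 1, 2, 3}; E[X] is the probability-weighted average.
values = np.array([0, 1, 2, 3])
pmf = np.array([0.1, 0.2, 0.3, 0.4])
mean_disc = np.sum(values * pmf)
print(mean_disc)                          # 2.0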

EXAMPLE 4.1: Consider a random variable that has an exponential PDF given by

f_X(x) = \frac{1}{b} e^{-x/b} u(x).

Its expected value is calculated as follows:

E[X] = \int_0^{\infty} \frac{x}{b} e^{-x/b}\, dx = b.

The last equality is obtained by using integration by parts once. It is seen from this example that the parameter b that appears in this exponential distribution is, in fact, the mean (or expected value) of the random variable.

EXAMPLE 4.2: Next, consider a Poisson random variable whose probability mass function is given by

P_X(k) = \frac{\alpha^k e^{-\alpha}}{k!},  k = 0, 1, 2, ....

Its expected value is found in a similar manner:

E[X] = \sum_{k=0}^{\infty} k \frac{\alpha^k e^{-\alpha}}{k!} = \alpha.

Once again, we see that the parameter \alpha in the Poisson distribution is equal to the mean.

4.3 Expected Values of Functions of Random Variables

The concept of expectation can be applied to functions of random variables as well as to the random variable itself. This will allow us to define many other parameters that describe various aspects of a random variable.

DEFINITION 4.2: Given a random variable X with PDF f_X(x), the expected value of a function, g(X), of that random variable is given by

E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\, dx.    (4.4)

For a discrete random variable, this definition reduces to

E[g(X)] = \sum_k g(x_k) P_X(x_k).    (4.5)
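For completeness, the intermediate steps in Examples 4.1 and 4.2 can be written out as follows (assuming the PDF and PMF forms stated in those examples).

% Example 4.1: integration by parts with u = x, dv = (1/b) e^{-x/b} dx
E[X] = \int_0^\infty \frac{x}{b} e^{-x/b}\,dx
     = \Bigl[-x\, e^{-x/b}\Bigr]_0^\infty + \int_0^\infty e^{-x/b}\,dx
     = 0 + \Bigl[-b\, e^{-x/b}\Bigr]_0^\infty = b.

% Example 4.2: the k = 0 term vanishes, then re-index with m = k - 1
E[X] = \sum_{k=1}^\infty k\,\frac{\alpha^k e^{-\alpha}}{k!}
     = \alpha e^{-\alpha} \sum_{m=0}^\infty \frac{\alpha^{m}}{m!}
     = \alpha e^{-\alpha} e^{\alpha} = \alpha.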

To start with, we demonstrate one extremely useful property of expectations in the following theorem.

THEOREM 4.1: For any constants a and b,

E[aX + b] = a E[X] + b.    (4.6)

Furthermore, for any function g(x) that can be written as a sum of several other functions, g(x) = g_1(x) + g_2(x) + ... + g_N(x),

E\left[\sum_{k=1}^{N} g_k(X)\right] = \sum_{k=1}^{N} E[g_k(X)].    (4.7)

In other words, expectation is a linear operation and the expectation operator can be exchanged (in order) with any other linear operation.

Table 4.1: Expected Values of Various Functions of Random Variables

Different functional forms of g(X) lead to various parameters that describe the random variable and are known by special names. A few of the more common ones are listed in Table 4.1. In the following sections, selected parameters will be studied in more detail.

4.4 Moments

DEFINITION 4.3: The nth moment of a random variable X is defined as

E[X^n] = \int_{-\infty}^{\infty} x^n f_X(x)\, dx.    (4.8)

For a discrete random variable, this definition reduces to

E[X^n] = \sum_k x_k^n P_X(x_k).    (4.9)

The zeroth moment is simply the area under the PDF and hence must be 1 for any random variable. The most commonly used moments are the first and second moments. The first moment is what we previously referred to as the mean, while the second moment is the mean squared value. For some random variables, the second moment might be a more meaningful characterization than the first. For example, suppose X is a sample of a noise waveform. We might expect that the distribution of the noise is symmetric about zero (i.e., just as likely to be positive as negative) and hence the first moment will be zero. So if we are told that X has a zero mean, this merely says that the noise does not have a bias. On the other hand, the second moment of the random noise sample is in some sense a measure of the strength of the noise. In fact, we will associate the second moment of a noise process with the power in the process. Hence, specifying the second moment can give us some useful physical insight into the noise process.

EXAMPLE 4.3: Consider a random variable with a uniform probability density function given as

f_X(x) = 1/a for 0 ≤ x ≤ a, and 0 otherwise.

The mean is given by

E[X] = \int_0^a \frac{x}{a}\, dx = \frac{a}{2},

while the second moment is

E[X^2] = \int_0^a \frac{x^2}{a}\, dx = \frac{a^2}{3}.

In fact, it is not hard to see that, in general, the nth moment of this uniform random variable is given by

E[X^n] = \int_0^a \frac{x^n}{a}\, dx = \frac{a^n}{n+1}.

4.5 Central Moments

Consider a random variable Y which can be expressed as the sum, Y = a + X, of a deterministic (i.e., not random) part a and a random part X. Furthermore, suppose that the random part tends to be very small compared to the fixed part; that is, the random variable Y tends to take small fluctuations about the constant value a. Such might be the case in a situation where there is a fixed signal corrupted by noise. In this case, the nth moment of Y would be dominated by the fixed part; that is, it is difficult to characterize the randomness in Y by looking at the moments alone. To overcome this, we can use the concept of central moments.

DEFINITION 4.4: The nth central moment of a random variable X is defined as

E[(X - \mu_X)^n] = \int_{-\infty}^{\infty} (x - \mu_X)^n f_X(x)\, dx.    (4.10)

In this equation, \mu_X is the mean (first moment) of the random variable. For discrete random variables, this definition reduces to

E[(X - \mu_X)^n] = \sum_k (x_k - \mu_X)^n P_X(x_k).    (4.11)

With central moments, the mean is subtracted from the variable before the moment is taken in order to remove the bias in the higher moments due to the mean. Note that, like regular moments, the zeroth central moment is E[(X - \mu_X)^0] = E[1] = 1. Furthermore, the first central moment is E[X - \mu_X] = E[X] - \mu_X = 0. Therefore, the lowest central moment of any real interest is the second central moment. This central moment is given a special name, the variance, and we quite often use the notation \sigma_X^2 to represent the variance of the random variable X. Note that

\sigma_X^2 = E[(X - \mu_X)^2] = E[X^2] - \mu_X^2.    (4.12)

In many cases, the best way to calculate the variance of a random variable is to calculate the first two moments and then form the second moment minus the first moment squared.

EXAMPLE 4.4: For a binomial random variable with parameters n and p, the mean is E[X] = np and the second moment is E[X^2] = np(1 - p) + n^2 p^2. Therefore, the variance is given by \sigma_X^2 = np(1 - p). Similarly, for the uniform random variable in Example 4.3, E[X] = a/2 and E[X^2] = a^2/3, and hence \sigma_X^2 = a^2/3 - a^2/4 = a^2/12. Note that if the moments have not previously been calculated, it may be just as easy to compute the variance directly. In the case of the uniform random variable, once the mean has been calculated, the variance can be found as

\sigma_X^2 = \int_0^a \frac{(x - a/2)^2}{a}\, dx = \frac{a^2}{12}.

Another common quantity related to the second central moment of a random variable is the standard deviation, which is defined as the square root of the variance,

\sigma_X = \sqrt{E[(X - \mu_X)^2]}.
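A quick numerical check of the uniform-variable results above, under illustrative assumptions (a = 3 and Monte Carlo sampling with numpy); the printed pairs should agree closely.

import numpy as np

# Verify E[X] = a/2, E[X^2] = a^2/3 and Var[X] = a^2/12 for X uniform on (0, a).
rng = np.random.default_rng(0)
a = 3.0
x = rng.uniform(0, a, size=1_000_000)

print(x.mean(), a / 2)                    # first moment
print(np.mean(x**2), a**2 / 3)            # second moment
print(x.var(), a**2 / 12)                 # variance = E[X^2] - (E[X])^2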

Both the variance and the standard deviation serve as a measure of the width of the PDF of a random variable. Some of the higher order central moments also have special names, although they are much less frequently used. The third central moment is known as the skewness and is a measure of the asymmetry of the PDF about the mean. The fourth central moment is called the kurtosis and is a measure of the peakedness of a random variable near the mean. Note that not all random variables have finite moments and/or central moments. We give an example of this later for the Cauchy random variable. Some quantities related to these higher order central moments are given in Definition 4.5.

DEFINITION 4.5: The coefficient of skewness is

c_s = \frac{E[(X - \mu_X)^3]}{\sigma_X^3}.    (4.13)

This is a dimensionless quantity that is positive if the random variable has a PDF skewed to the right and negative if skewed to the left. The coefficient of kurtosis is also dimensionless and is given as

c_k = \frac{E[(X - \mu_X)^4]}{\sigma_X^4}.    (4.14)

The more the PDF is concentrated near its mean, the larger the coefficient of kurtosis. In other words, a random variable with a large coefficient of kurtosis will have a large peak near the mean.
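As a quick numerical illustration of Definition 4.5, the sketch below estimates the coefficients of skewness and kurtosis from samples of a Gaussian random variable (the mean and variance chosen are arbitrary); the results should be close to 0 and 3, the values for any Gaussian.

import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(loc=1.0, scale=2.0, size=1_000_000)

mu = x.mean()
sigma = x.std()
c_s = np.mean((x - mu) ** 3) / sigma ** 3   # coefficient of skewness, Eq. (4.13)
c_k = np.mean((x - mu) ** 4) / sigma ** 4   # coefficient of kurtosis, Eq. (4.14)
print(c_s, c_k)                             # approximately 0 and 3 for a Gaussian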

4.6 Conditional Expected Values

Another important concept involving expectation is that of conditional expected value. As specified in Definition 4.6, the conditional expected value of a random variable is a weighted average of the values the random variable can take on, weighted by the conditional PDF of the random variable.

DEFINITION 4.6: The expected value of a random variable X, conditioned on some event A, is

E[X | A] = \int_{-\infty}^{\infty} x f_{X|A}(x)\, dx.    (4.15)

For a discrete random variable, this definition reduces to

E[X | A] = \sum_k x_k P_{X|A}(x_k).    (4.16)

Similarly, the conditional expectation of a function, g(X), of a random variable, conditioned on the event A, is

E[g(X) | A] = \int_{-\infty}^{\infty} g(x) f_{X|A}(x)\, dx  or  E[g(X) | A] = \sum_k g(x_k) P_{X|A}(x_k),    (4.17)

depending on whether the random variable is continuous or discrete. Conditional expected values are computed in the same manner as regular expected values, with the PDF or PMF replaced by a conditional PDF or conditional PMF.

EXAMPLE 4.5: Consider a Gaussian random variable of the form

f_X(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}.

Suppose the event A is that the random variable X is positive, A = {X > 0}. Then

f_{X|X>0}(x) = \frac{f_X(x)}{\Pr(X > 0)} u(x) = \frac{2}{\sqrt{2\pi}} e^{-x^2/2} u(x).

The conditional expected value of X given that X > 0 is then

E[X | X > 0] = \int_0^{\infty} x \frac{2}{\sqrt{2\pi}} e^{-x^2/2}\, dx = \sqrt{\frac{2}{\pi}}.

4.7 Transformations of Random Variables

Consider a random variable X with a PDF and CDF given by f_X(x) and F_X(x), respectively. Define a new random variable Y such that Y = g(X) for some function g(·). What is the PDF, f_Y(y) (or CDF), of the new random variable? This problem is often encountered in the study of systems where the PDF of the input random variable X is known and the PDF of the output random variable Y needs to be determined. In such a case, we say that the input random variable has undergone a transformation.

A. Monotonically Increasing Functions

To begin our exploration of transformations of random variables, let us assume that the function g(x) is continuous, one-to-one, and monotonically increasing. A typical function of this form is illustrated in Figure 4.1(a). This assumption will be lifted later when we consider more general functions, but for now this simpler case applies. Under these assumptions, the inverse function, X = g^{-1}(Y), exists and is well behaved. In order to obtain the PDF of Y, we first calculate the CDF. Recall that F_Y(y) = Pr(Y ≤ y). Since there is a one-to-one relationship between values of Y and their corresponding values of X, this CDF can be written in terms of X according to

F_Y(y) = Pr(Y ≤ y) = Pr(X ≤ g^{-1}(y)) = F_X(g^{-1}(y)).    (4.18)

Note that this can also be written as

F_Y(g(x)) = F_X(x).    (4.19)

Figure 4.1: A monotonic increasing function (a) and a monotonic decreasing function (b).

Differentiating Equation 4.18 with respect to y produces

f_Y(y) = f_X(g^{-1}(y)) \frac{d}{dy} g^{-1}(y),    (4.20)

while differentiating Equation 4.19 with respect to x gives

f_Y(y) = \left. \frac{f_X(x)}{dg/dx} \right|_{x = g^{-1}(y)}.    (4.21)

Either Equation 4.20 or 4.21 can be used (whichever is more convenient) to compute the PDF of the new random variable.

EXAMPLE 4.6: Suppose X is a Gaussian random variable with mean μ and variance σ². A new random variable is formed according to Y = aX + b, where a > 0 (so that the transformation is monotonically increasing). Since dy/dx = a, applying Equation 4.21 produces

f_Y(y) = \frac{1}{a} f_X\left(\frac{y - b}{a}\right) = \frac{1}{\sqrt{2\pi a^2 \sigma^2}} \exp\left(-\frac{(y - b - a\mu)^2}{2 a^2 \sigma^2}\right).

Note that the PDF of Y still has a Gaussian form. In this example, the transformation did not change the form of the PDF; it merely changed the mean and variance.

EXAMPLE 4.7: Let X be an exponential random variable with f_X(x) = 2 e^{-2x} u(x) and let the transformation be Y = X^3. Then dy/dx = 3x^2 and hence

f_Y(y) = \frac{f_X(y^{1/3})}{3 y^{2/3}} = \frac{2}{3 y^{2/3}} e^{-2 y^{1/3}} u(y).

EXAMPLE 4.8: Suppose a zero-mean Gaussian random variable (Example 4.6 with μ = 0) is passed through a half-wave rectifier, which is described by the input-output relationship

y = g(x) = x for x ≥ 0, and g(x) = 0 for x < 0.

For y > 0, dy/dx = 1, so that f_Y(y) = f_X(y). However, the event X < 0 is equivalent to the event Y = 0; hence Pr(Y = 0) = Pr(X < 0). Since the input Gaussian PDF is symmetric about zero, Pr(Y = 0) = 1/2. Basically, the random variable Y is a mixed random variable: it has a continuous part over the region y > 0 and a discrete part at y = 0. Using a delta function, we can write the PDF of Y as

f_Y(y) = f_X(y) u(y) + \frac{1}{2} \delta(y).

Example 4.8 illustrates how to deal with transformations that are flat over some interval of nonzero length. In general, suppose the transformation y = g(x) is such that g(x) = y_o for any x in the interval x_1 ≤ x ≤ x_2. Then the PDF of Y will include a discrete component (a delta function) of height Pr(Y = y_o) = F_X(x_2) - F_X(x_1) at the point y = y_o. One often encounters transformations that have several different flat regions. One such staircase function is shown in Figure 4.2. Here, a random variable X that may be continuous will be converted into a discrete random variable. The classical example of this is analog-to-digital conversion of signals. Suppose the transformation is of the general staircase form

g(x) = y_i  for  x_{i-1} < x ≤ x_i,  i = 1, 2, ..., N.    (4.22)

Then Y will be a discrete random variable whose PMF is

P_Y(y_i) = Pr(x_{i-1} < X ≤ x_i) = F_X(x_i) - F_X(x_{i-1}),  i = 1, 2, ..., N.    (4.23)

Figure 4.2: A staircase (quantizer) transformation: a continuous random variable is converted into a discrete random variable.
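A short numerical sketch of the two ideas above, with illustrative choices: the exponential-to-cube transformation of Example 4.7 checked through its CDF, and a hypothetical four-level staircase transformation checked against Eq. (4.23).

import numpy as np

rng = np.random.default_rng(3)
x = rng.exponential(scale=0.5, size=1_000_000)   # samples with f_X(x) = 2 exp(-2x) u(x)

# Example 4.7: Y = X^3. Check the derived distribution via the CDF:
# F_Y(c) = Pr(X^3 <= c) = F_X(c**(1/3)) = 1 - exp(-2 c**(1/3)).
y = x**3
for c in (0.05, 0.2, 0.5):
    print(np.mean(y <= c), 1 - np.exp(-2 * c**(1 / 3)))

# Staircase transformation (Eq. 4.23): a hypothetical 4-level quantizer with
# breakpoints x_i; the PMF equals F_X(x_i) - F_X(x_{i-1}).
breaks = np.array([0.0, 0.25, 0.5, 1.0, 20.0])    # 20.0 stands in for +infinity
F = lambda v: 1.0 - np.exp(-2.0 * v)              # CDF of X
pmf_theory = F(breaks[1:]) - F(breaks[:-1])
pmf_empirical = np.histogram(x, bins=breaks)[0] / x.size
print(pmf_theory, pmf_empirical)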

4.8 Characteristic Functions

In this section we introduce the concept of a characteristic function. The characteristic function of a random variable is closely related to the Fourier transform of the PDF of that random variable. Thus, the characteristic function provides a sort of frequency domain representation of a random variable, although in this context there is no connection between our frequency variable ω and any physical frequency. In studies of deterministic signals, it was found that the use of Fourier transforms greatly simplified many problems, especially those involving convolutions. We will see in future chapters the need for performing convolution operations on PDFs of random variables, and hence frequency domain tools will become quite useful. Furthermore, we will find that characteristic functions have many other uses. For example, the characteristic function is quite useful for finding moments of a random variable. In addition to the characteristic function, two other related functions, namely the moment-generating function (analogous to the Laplace transform) and the probability-generating function (analogous to the z-transform), will also be studied in the following sections.

DEFINITION 4.7: The characteristic function of a random variable, X, is given by

\Phi_X(\omega) = E[e^{j\omega X}] = \int_{-\infty}^{\infty} e^{j\omega x} f_X(x)\, dx.    (4.24)

Note the similarity between this integral and the Fourier transform. In most of the electrical engineering literature, the Fourier transform of the function f_X(x) would be \Phi_X(-\omega). Given this relationship between the PDF and the characteristic function, it should be clear that one can get the PDF of a random variable from its characteristic function through an inverse Fourier transform operation:

f_X(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \Phi_X(\omega)\, e^{-j\omega x}\, d\omega.    (4.25)

The characteristic functions associated with various random variables can be easily found using tables of commonly used Fourier transforms, but one must be careful since the Fourier integral used in Equation 4.24 may differ from the definition used to generate common tables of Fourier transforms. In addition, various properties of Fourier transforms can also be used to help calculate characteristic functions, as shown in the following example.

EXAMPLE 4.9: An exponential random variable has a PDF given by f_X(x) = exp(-x) u(x). Its characteristic function is found to be

\Phi_X(\omega) = \int_0^{\infty} e^{-x} e^{j\omega x}\, dx = \frac{1}{1 - j\omega}.

This result assumes that ω is a real quantity. Now suppose another random variable Y has a PDF given by f_Y(y) = a exp(-ay) u(y). Note that f_Y(y) = a f_X(ay); thus, using the scaling property of Fourier transforms, the characteristic function associated with the random variable Y is given by

\Phi_Y(\omega) = \Phi_X\left(\frac{\omega}{a}\right) = \frac{1}{1 - j\omega/a} = \frac{a}{a - j\omega},

assuming a is a positive constant (which it must be for Y to have a valid PDF). Finally, suppose that Z has a PDF given by f_Z(z) = a exp(-a(z - b)) u(z - b). Since f_Z(z) = f_Y(z - b), the shifting property of Fourier transforms can be used to help find the characteristic function associated with the random variable Z:

\Phi_Z(\omega) = e^{j\omega b}\, \Phi_Y(\omega) = \frac{a\, e^{j\omega b}}{a - j\omega}.
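A quick numerical check of the first result in Example 4.9: evaluate the defining integral of Eq. (4.24) at a few values of ω (with the semi-infinite range truncated) and compare with the closed form 1/(1 − jω).

import numpy as np

x = np.linspace(0, 60, 600001)            # truncate the semi-infinite integral
f_X = np.exp(-x)                          # f_X(x) = exp(-x) u(x)
for w in (0.5, 1.0, 3.0):
    phi_num = np.trapz(np.exp(1j * w * x) * f_X, x)
    phi_closed = 1.0 / (1.0 - 1j * w)
    print(w, phi_num, phi_closed)          # the two values agree to high accuracy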

4.9 Probability Generating Functions

In the world of signal analysis, we often use Fourier transforms to describe continuous time signals, but when we deal with discrete time signals, it is common to use a z-transform instead. In the same way, the characteristic function is a useful tool for working with continuous random variables, but when discrete random variables are concerned, it is often more convenient to use a device similar to the z-transform, known as the probability generating function.

DEFINITION 4.8: For a discrete random variable with a probability mass function, P_X(k), defined on the nonnegative integers, k = 0, 1, 2, ..., the probability generating function, H_X(z), is defined as

H_X(z) = \sum_{k=0}^{\infty} P_X(k)\, z^k.    (4.26)

Note the similarity between the probability generating function and the unilateral z-transform of the probability mass function. Since the PMF values are the coefficients of the Taylor series expansion of H_X(z) about z = 0, it should be apparent that the PMF can be obtained from the probability generating function through

P_X(k) = \frac{1}{k!} \left. \frac{d^k H_X(z)}{dz^k} \right|_{z=0}.    (4.27)

The derivatives of the probability generating function evaluated at zero return the PMF and not the moments as with the characteristic function. However, the moments of the random variable can be obtained from the derivatives of the probability generating function at z = 1.

THEOREM 4.2: The mean of a discrete random variable can be found from its probability generating function according to

E[X] = \left. \frac{dH_X(z)}{dz} \right|_{z=1}.    (4.28)

Furthermore, the higher order derivatives of the probability generating function evaluated at z = 1 lead to quantities known as the factorial moments,

h_k = \left. \frac{d^k H_X(z)}{dz^k} \right|_{z=1} = E[X(X-1)(X-2)\cdots(X-k+1)].    (4.29)

EXAMPLE 4.10: A geometric random variable has a PMF given by P_X(k) = (1 - p) p^k, k = 0, 1, 2, .... The probability generating function is found to be

H_X(z) = \sum_{k=0}^{\infty} (1 - p)(pz)^k = \frac{1 - p}{1 - pz}.    (4.30)

In order to facilitate forming a Taylor series expansion of this function about the point z = 1, it is written explicitly as a function of z - 1. From there, the power series expansion is fairly simple:

H_X(z) = \frac{1 - p}{1 - p - p(z - 1)} = \frac{1}{1 - \frac{p}{1-p}(z - 1)} = \sum_{k=0}^{\infty} \left(\frac{p}{1-p}\right)^k (z - 1)^k.    (4.31)

Comparing the coefficients of this series with the Taylor series coefficients h_k/k! implied by Equation 4.29 leads to immediate identification of the factorial moments,

E[X(X-1)\cdots(X-k+1)] = \frac{k!\, p^k}{(1 - p)^k}.    (4.32)
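A symbolic check of Eq. (4.32), assuming the geometric PMF form P_X(k) = (1 − p)p^k stated above; sympy is used to differentiate the probability generating function at z = 1.

import sympy as sp

z, p = sp.symbols('z p', positive=True)
H = (1 - p) / (1 - p * z)                      # probability generating function, Eq. (4.30)
for k in range(1, 4):
    # difference between the kth derivative at z = 1 and k! (p/(1-p))^k
    diff_expr = sp.simplify(sp.diff(H, z, k).subs(z, 1) - sp.factorial(k) * (p / (1 - p))**k)
    print(k, diff_expr)                        # prints 0 for each k, confirming Eq. (4.32)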

4.10 Moment Generating Functions

In many problems, the random quantities we are studying are inherently nonnegative. Examples include the frequency of a random signal, the time between arrivals of successive customers in a queueing system, or the number of points scored by your favorite football team. The resulting PDFs of these quantities are naturally one-sided. For such one-sided waveforms, it is common to use Laplace transforms as a frequency domain tool. The moment generating function is the equivalent tool for studying random variables.

DEFINITION 4.9: The moment generating function, M_X(u), of a nonnegative random variable, X, is

M_X(u) = E[e^{uX}] = \int_0^{\infty} e^{ux} f_X(x)\, dx.    (4.33)

Note the similarity between the moment generating function and the Laplace transform of the PDF. The PDF can in principle be retrieved from the moment generating function through an operation similar to an inverse Laplace transform,

f_X(x) = \frac{1}{2\pi j} \int_{c - j\infty}^{c + j\infty} M_X(u)\, e^{-ux}\, du.    (4.34)

Because the sign in the exponential term in the integral in Equation 4.33 is the opposite of the traditional Laplace transform, the contour of integration (the so-called Bromwich contour) in the integral specified in Equation 4.34 must now be placed to the left of all poles of the moment generating function. As with the characteristic function, the moments of the random variable can be found from the derivatives of the moment generating function (hence its name) according to

E[X^n] = \left. \frac{d^n M_X(u)}{du^n} \right|_{u=0}.    (4.35)

It is also noted that if the moment generating function is expanded in a power series of the form

M_X(u) = \sum_{k=0}^{\infty} m_k u^k,    (4.36)

then the moments of the random variable are given by E[X^k] = k!\, m_k.
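As a small symbolic illustration of Eq. (4.35), the sketch below differentiates the known closed-form MGF of an exponential random variable with mean b, M_X(u) = 1/(1 − bu) (the PDF of Example 4.1); the nth moment should come out as n! bⁿ.

import sympy as sp

u, b = sp.symbols('u b', positive=True)
M = 1 / (1 - b * u)                        # closed-form MGF of the exponential RV with mean b
for n in range(1, 4):
    moment = sp.simplify(sp.diff(M, u, n).subs(u, 0))
    print(n, moment)                       # n-th moment: factorial(n) * b**n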

4.11 Optimum Receiver for Signals Corrupted by AWGN

Let us begin by developing a mathematical model for the signal at the input to the receiver. We assume that the transmitter sends digital information by use of M signal waveforms {s_m(t), m = 1, 2, ..., M}. Each waveform is transmitted within the symbol (signaling) interval of duration T. To be specific, we consider the transmission of information over the interval 0 ≤ t ≤ T.

Fig. 4.3: Model for the received signal passed through an AWGN channel

The channel is assumed to corrupt the signal by the addition of white Gaussian noise (AWGN), as illustrated in Figure 4.3. Thus the received signal in the interval 0 ≤ t ≤ T may be expressed as

r(t) = s_m(t) + n(t),  0 ≤ t ≤ T,    (4.37)

where n(t) denotes a sample function of the additive white Gaussian noise (AWGN) process with power spectral density N_0/2 W/Hz. Based on the observation of r(t) over the signal interval, we wish to design a receiver that is optimum in the sense that it minimizes the probability of error. It is convenient to subdivide the receiver into two parts, the signal demodulator and the detector, as shown in Figure 4.4. The function of the signal demodulator is to convert the received waveform r(t) into an N-dimensional vector r = [r_1, r_2, ..., r_N], where N is the dimension of the transmitted signal waveforms. The function of the detector is to decide which of the M possible signal waveforms was transmitted, based on the vector r.

Fig. 4.4: Receiver configuration

Two realizations of the signal demodulator are described in the next two sections. One is based on the use of signal correlators; the second is based on the use of matched filters. The optimum detector that follows the signal demodulator is designed to minimize the probability of error.

4.11.1 Correlation Demodulator

In this section, we describe a correlation demodulator that decomposes the received signal and the noise into N-dimensional vectors. In other words, the signal and the noise are expanded into a series of linearly weighted orthonormal basis functions. It is assumed that the N basis functions {f_k(t)} span the signal space, so every one of the possible transmitted signals of the set {s_m(t)} can be represented as a weighted linear combination of {f_k(t)}. In the case of the noise, the functions {f_k(t)} do not span the noise space. However, we show below that the noise terms that fall outside the signal space are irrelevant to the detection of the signal.

Suppose the received signal r(t) is passed through a parallel bank of N cross-correlators which basically compute the projection of r(t) onto the N basis functions, as illustrated in Figure 4.5. Thus we have

r_k = \int_0^T r(t) f_k(t)\, dt = s_{mk} + n_k,  k = 1, 2, ..., N,    (4.38)

where

s_{mk} = \int_0^T s_m(t) f_k(t)\, dt,  k = 1, 2, ..., N,    (4.39)

n_k = \int_0^T n(t) f_k(t)\, dt,  k = 1, 2, ..., N.    (4.40)

The signal is now represented by the vector s_m = [s_{m1}, s_{m2}, ..., s_{mN}], whose values depend on which of the M signals was transmitted. The components {n_k} are random variables that arise from the presence of the additive noise. In fact, we can express the received signal r(t) in the interval 0 ≤ t ≤ T as

r(t) = \sum_{k=1}^{N} s_{mk} f_k(t) + \sum_{k=1}^{N} n_k f_k(t) + n'(t) = \sum_{k=1}^{N} r_k f_k(t) + n'(t).    (4.41)

The term n'(t), defined as

n'(t) = n(t) - \sum_{k=1}^{N} n_k f_k(t),    (4.42)

is a zero-mean Gaussian noise process that represents the difference between the original noise process n(t) and the part corresponding to the projection of n(t) onto the basis functions {f_k(t)}. We shall show below that n'(t) is irrelevant to the decision as to which signal was transmitted. Correspondingly, the decision may be based entirely on the correlator output signal and noise components r_k = s_{mk} + n_k, k = 1, 2, ..., N.

Since the signals {s_m(t)} are deterministic, the signal components {s_{mk}} are deterministic. The noise components {n_k} are Gaussian. Their means are zero for all k:

E[n_k] = \int_0^T E[n(t)] f_k(t)\, dt = 0.    (4.43)

Fig. 4.5: Correlation-type demodulator

Their covariances are

E[n_k n_m] = \int_0^T \int_0^T E[n(t) n(\tau)] f_k(t) f_m(\tau)\, dt\, d\tau = \frac{N_0}{2} \int_0^T f_k(t) f_m(t)\, dt = \frac{N_0}{2} \delta_{km},    (4.44)

where δ_{km} = 1 when k = m and zero otherwise. Therefore, the N noise components {n_k} are zero-mean, uncorrelated Gaussian random variables with common variance N_0/2. From the above development, it follows that the correlator outputs {r_k}, conditioned on the mth signal being transmitted, are Gaussian random variables with mean

E[r_k | s_m] = E[s_{mk} + n_k] = s_{mk},    (4.45)

and equal variance

\sigma_r^2 = E[(r_k - s_{mk})^2] = E[n_k^2] = \frac{N_0}{2}.    (4.46)

Since the noise components {n_k} are uncorrelated Gaussian random variables, they are also statistically independent. As a consequence, the correlator outputs {r_k}, conditioned on the mth signal being transmitted, are statistically independent Gaussian variables. Hence, the conditional probability density functions of the random variables r = [r_1, r_2, ..., r_N] are simply

p(r | s_m) = \prod_{k=1}^{N} p(r_k | s_{mk}),  m = 1, 2, ..., M,    (4.47)

where

p(r_k | s_{mk}) = \frac{1}{\sqrt{\pi N_0}} \exp\left[-\frac{(r_k - s_{mk})^2}{N_0}\right],  k = 1, 2, ..., N.    (4.48)

By substituting Eq. (4.48) into Eq. (4.47), we obtain the joint conditional PDFs

p(r | s_m) = (\pi N_0)^{-N/2} \exp\left[-\sum_{k=1}^{N} \frac{(r_k - s_{mk})^2}{N_0}\right].    (4.49)

As a final point, we wish to show that the correlator outputs [r_1, r_2, ..., r_N] are sufficient statistics for reaching a decision on which of the M signals was transmitted, i.e., that no additional relevant information can be extracted from the remaining noise term n'(t). Hence, n'(t) may be ignored.
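The statistics in Eqs. (4.43)-(4.46) can be checked with a short simulation. The sketch below uses illustrative assumptions (T = 1, N_0 = 2, two sinusoidal orthonormal basis functions, and a discrete-time surrogate for white noise); the correlator outputs should have means equal to the signal components and variance close to N_0/2.

import numpy as np

rng = np.random.default_rng(1)
T, N0 = 1.0, 2.0
dt = 1e-3
t = np.arange(0, T, dt)

# Two orthonormal basis functions on [0, T).
f1 = np.sqrt(2 / T) * np.cos(2 * np.pi * t / T)
f2 = np.sqrt(2 / T) * np.sin(2 * np.pi * t / T)

# Transmitted signal: an arbitrary linear combination of the basis functions.
s = 1.0 * f1 - 0.5 * f2

trials = 20000
r1 = np.empty(trials)
r2 = np.empty(trials)
for i in range(trials):
    # Discrete-time surrogate of white noise with PSD N0/2: sample variance
    # (N0/2)/dt makes the integrals below have variance N0/2.
    n = rng.normal(scale=np.sqrt(N0 / 2 / dt), size=t.size)
    r = s + n
    r1[i] = np.sum(r * f1) * dt           # r_k = integral of r(t) f_k(t) dt, Eq. (4.38)
    r2[i] = np.sum(r * f2) * dt

print(r1.mean(), r2.mean())               # approx. the signal components 1.0 and -0.5
print(r1.var(), r2.var())                 # both approx. N0/2 = 1.0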

4.11.2 Matched-Filter Demodulator

Instead of using a bank of N correlators to generate the variables {r_k}, we may use a bank of N linear filters. Suppose that the impulse responses of the N linear filters are

h_k(t) = f_k(T - t),  0 ≤ t ≤ T.    (4.50)

Sampling the outputs of these filters at t = T, we obtain

y_k(T) = \int_0^T r(\tau) f_k(\tau)\, d\tau = r_k,  k = 1, 2, ..., N.    (4.51)

The result at t = T is the same as that obtained from the linear correlators. This does not mean, however, that the two types of demodulator produce identical outputs at all times; the equality holds only at the time instant t = T. The figure below illustrates the behavior of the two demodulators.

Fig. 4.6: Output of correlator and matched filter for a sine wave input

A filter matched to the signal s(t) is one whose impulse response is h(t) = s(T - t), where s(t) is assumed to be confined to the time interval 0 ≤ t ≤ T. An example is shown below.

Fig. 4.7: Signal s(t) and filter matched to s(t)

The response of h(t) = s(T - t) to the signal s(t) is

y(t) = \int_0^t s(\tau)\, s(T - t + \tau)\, d\tau.    (4.52)

This is the time-autocorrelation function of the signal s(t), as illustrated below.

Fig. 4.8: The matched filter output is the autocorrelation function of s(t)

Note that the autocorrelation function is symmetric about t = T, where it attains its peak. Figure 4.9 shows the matched-filter demodulator that generates the observed variables {r_k}.

Fig. 4.9: The matched-filter demodulator
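A short numerical illustration of the matched-filter behavior described above, under an assumed rectangular pulse of duration T = 1: the filter output peaks at t = T, and the peak value equals both the signal energy E_s and the correlator output of Eq. (4.51).

import numpy as np

dt = 1e-3
T = 1.0
t = np.arange(0, T, dt)
s = np.ones_like(t)                        # illustrative pulse s(t) = 1 on [0, T)
h = s[::-1]                                # matched filter h(t) = s(T - t)

y = np.convolve(s, h) * dt                 # filter output y(t)
t_out = np.arange(y.size) * dt
peak_index = np.argmax(y)

energy = np.sum(s**2) * dt                 # signal energy E_s
correlator = np.sum(s * s) * dt            # correlator output at t = T
print(t_out[peak_index], y[peak_index], energy, correlator)
# the peak occurs near t = T and equals E_s, matching the correlator output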

Frequency-domain interpretation of the matched filter: Because h(t) = s(T - t), the Fourier transform of this relationship is

H(f) = S^*(f)\, e^{-j2\pi f T}.    (4.53)

Note:

(1) The matched filter has a frequency response that is the complex conjugate of the transmitted signal spectrum, multiplied by the phase factor e^{-j2\pi f T}, which represents the sampling delay of T.

(2) |H(f)| = |S(f)|: the magnitude response of the matched filter is identical to the transmitted signal spectrum. In addition, the phase of H(f) is the negative of the phase of S(f), apart from the linear phase term due to the delay.

Hence, the filter output has a spectrum Y(f) = |S(f)|^2 e^{-j2\pi f T}. The output waveform is

y(t) = \int_{-\infty}^{\infty} |S(f)|^2\, e^{-j2\pi f T}\, e^{j2\pi f t}\, df.    (4.54)

By sampling the output of the matched filter at t = T, we have

y(T) = \int_{-\infty}^{\infty} |S(f)|^2\, df = \int_0^T s^2(t)\, dt = E_s.    (4.55)

The noise at the output of the matched filter has a PSD

\Phi_o(f) = \frac{N_0}{2} |S(f)|^2.    (4.56)

The total noise power at the output of the matched filter is

P_n = \int_{-\infty}^{\infty} \Phi_o(f)\, df = \frac{N_0}{2} \int_{-\infty}^{\infty} |S(f)|^2\, df = \frac{N_0 E_s}{2}.    (4.57)

The signal power at the output of the matched filter, at the sampling instant t = T, is

P_s = y^2(T) = E_s^2.    (4.58)

The output SNR is then

\mathrm{SNR}_o = \frac{P_s}{P_n} = \frac{E_s^2}{N_0 E_s / 2} = \frac{2 E_s}{N_0},    (4.59)

which depends on the energy of the waveform s(t) but not on its detailed shape.

4.11.3 The Optimum Detector

The optimum detector applies an optimum decision rule to the observation vector r at the output of the demodulator. Three kinds of detectors will be discussed in the sequel: the symbol-by-symbol maximum likelihood detector for signals without memory, the maximum likelihood sequence detector, and the symbol-by-symbol MAP detector, the latter two for signals with memory. In this section, we design the symbol-by-symbol detector that maximizes the probability of a correct decision.

Define the posterior probabilities as

P(s_m | r) = P(signal s_m was transmitted | r),  m = 1, 2, ..., M.

The decision criterion is based on selecting the signal corresponding to the maximum of the set of posterior probabilities; such a criterion is called the maximum a posteriori probability (MAP) criterion. The posterior probabilities can be expressed by

P(s_m | r) = \frac{p(r | s_m) P(s_m)}{p(r)},    (4.60)

where p(r | s_m) is the conditional PDF of the observed vector given s_m, and P(s_m) is the a priori probability of the mth signal being transmitted. The denominator can be written as

p(r) = \sum_{m=1}^{M} p(r | s_m) P(s_m).    (4.61)

Therefore, computing the posterior probabilities requires knowledge of the a priori probabilities P(s_m) and the conditional PDFs p(r | s_m) for m = 1, 2, ..., M. We observe that the denominator in Eq. (4.60) is independent of which signal is transmitted. Furthermore, suppose the M signals are equally likely, so that the a priori probability is P(s_m) = 1/M for all m. Consequently, maximizing P(s_m | r) is equivalent to maximizing p(r | s_m).

The conditional PDF p(r | s_m), or any monotonic function of it, is usually called the likelihood function. The decision criterion based on the maximum of the likelihood function p(r | s_m) over the M signals is called the maximum-likelihood (ML) criterion, where p(r | s_m) is given by Eq. (4.49). To simplify the computation, we take the natural logarithm of both sides:

\ln p(r | s_m) = -\frac{N}{2} \ln(\pi N_0) - \frac{1}{N_0} \sum_{k=1}^{N} (r_k - s_{mk})^2.    (4.62)

Apparently, maximizing ln p(r | s_m) over s_m is equivalent to finding the signal that minimizes the Euclidean distance

D(r, s_m) = \sum_{k=1}^{N} (r_k - s_{mk})^2.    (4.63)

D(r, s_m) is called the distance metric. As a result, this decision rule is called minimum distance detection. Eq. (4.63) can be expanded as

D(r, s_m) = \sum_{k=1}^{N} r_k^2 - 2 \sum_{k=1}^{N} r_k s_{mk} + \sum_{k=1}^{N} s_{mk}^2 = \|r\|^2 - 2\, r \cdot s_m + \|s_m\|^2.    (4.64)

The term \|r\|^2 can be ignored in the computation of the metrics since it is common to all distance metrics. The modified distance metrics are

D'(r, s_m) = -2\, r \cdot s_m + \|s_m\|^2.    (4.65)

Minimizing D'(r, s_m) is equivalent to maximizing the correlation metric C(r, s_m), given by

C(r, s_m) = 2\, r \cdot s_m - \|s_m\|^2.    (4.66)

Notes:

(1) The term r · s_m represents the projection of the received signal vector onto each of the M possible transmitted signal vectors. The projection is a measure of the correlation between the received vector and the mth signal; therefore C(r, s_m) is called the correlation metric for deciding which of the M signals was transmitted.

(2) The terms \|s_m\|^2 = E_m, m = 1, 2, ..., M, may be viewed as bias terms for signal sets that have unequal energies. Accordingly, for signals with equal energy, this term can be ignored.

The correlation metrics can be expressed as

C(r, s_m) = 2 \int_0^T r(t) s_m(t)\, dt - E_m,  m = 1, 2, ..., M.    (4.67)

Therefore, the metrics in (4.67) can be generated by a demodulator that cross-correlates the received signal r(t) with each of the M possible transmitted signals and adjusts each output by the corresponding signal energy E_m. Consequently, the optimum receiver (demodulator and detector) can be implemented as shown in Figure 4.10.

Fig. 4.10: An alternative realization of the optimum AWGN receiver

As noted above, when the signals are equally likely the MAP criterion reduces to the ML criterion, and when, in addition, all signals have the same energy, the bias terms E_m can be dropped from the metrics. If the a priori probabilities (or energies) are unequal, the MAP criterion should be adopted, and the corresponding metrics are

PM(r, s_m) = p(r | s_m)\, P(s_m),  m = 1, 2, ..., M.    (4.68)
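The following sketch is a minimal, illustrative implementation of the decision rules above for a hypothetical 2-dimensional, equal-energy signal set; it verifies that the minimum-distance rule of Eq. (4.63) and the correlation-metric rule of Eq. (4.66) produce identical decisions, and estimates the resulting symbol error rate.

import numpy as np

rng = np.random.default_rng(5)
N0 = 0.5
signals = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])  # vectors s_m
energies = np.sum(signals**2, axis=1)

trials = 100_000
m_true = rng.integers(0, 4, size=trials)
# Correlator outputs: signal components plus Gaussian noise of variance N0/2 per dimension.
r = signals[m_true] + rng.normal(scale=np.sqrt(N0 / 2), size=(trials, 2))

D = np.sum((r[:, None, :] - signals[None, :, :])**2, axis=2)   # distance metric, Eq. (4.63)
C = 2 * r @ signals.T - energies                               # correlation metric, Eq. (4.66)
m_dist = np.argmin(D, axis=1)
m_corr = np.argmax(C, axis=1)

print(np.all(m_dist == m_corr))            # True: the two rules are equivalent
print(np.mean(m_dist != m_true))           # symbol error rate of the ML detector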

4.11.4 Optimum Filtering: The Wiener-Hopf Filter

When a desired signal is mixed with noise, the SNR can be improved by passing it through a filter that suppresses frequency components where the signal is weak but the noise is strong. The SNR improvement in this case can be explained qualitatively by considering a case of white noise mixed with a signal m(t) whose PSD decreases at high frequencies. If the filter attenuates higher frequencies more, the signal will be reduced and, in fact, distorted. The distortion component may be considered as added noise. Thus, attenuation of higher frequencies will cause additional noise (from signal distortion), but, in compensation, it will reduce the channel noise, which is relatively strong at higher frequencies. Because the signal has small power content at higher frequencies, the total noise may be smaller than before.

Let H(\omega) be the optimum filter (Fig. 4.11a). This filter, not being ideal, will cause signal distortion. The distortion signal can be found from Fig. 4.11b. The distortion signal power appearing at the output is given by

N_D = \frac{1}{2\pi} \int_{-\infty}^{\infty} |1 - H(\omega)|^2 S_m(\omega)\, d\omega,    (4.69)

where S_m(\omega) is the signal PSD at the input of the receiving filter. The channel noise power appearing at the filter output is given by

Fig. 4.11: Wiener-Hopf filter calculations

N_{ch} = \frac{1}{2\pi} \int_{-\infty}^{\infty} |H(\omega)|^2 S_n(\omega)\, d\omega,    (4.70)

where S_n(\omega) is the noise PSD appearing at the input of the receiving filter. The distortion component acts as a noise. Because the signal and the channel noise are incoherent, the total noise at the receiving filter output is the sum of the channel noise and the distortion noise,

N_o = N_D + N_{ch} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left[ |1 - H(\omega)|^2 S_m(\omega) + |H(\omega)|^2 S_n(\omega) \right] d\omega.    (4.71)

Using the fact that |1 - H(\omega)|^2 = 1 - H(\omega) - H^*(\omega) + |H(\omega)|^2, and noting that both S_m(\omega) and S_n(\omega) are real, Eq. (4.71) can be rearranged as

N_o = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left[ \bigl(S_m(\omega) + S_n(\omega)\bigr) \left| H(\omega) - \frac{S_m(\omega)}{S_m(\omega) + S_n(\omega)} \right|^2 + \frac{S_m(\omega) S_n(\omega)}{S_m(\omega) + S_n(\omega)} \right] d\omega.    (4.72)

The integrand on the right-hand side of Eq. (4.72) is non-negative; moreover, it is a sum of two non-negative terms. Hence, to minimize N_o, we must minimize each term. Because the second term is independent of H(\omega), only the first term can be minimized. From Eq. (4.72) it is obvious that this term is minimized (in fact, made zero) when

H_{op}(\omega) = \frac{S_m(\omega)}{S_m(\omega) + S_n(\omega)}.    (4.73)

For this optimum choice, the output noise power N_o is given by

N_o = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{S_m(\omega) S_n(\omega)}{S_m(\omega) + S_n(\omega)}\, d\omega.    (4.74)

Eq. (4.73) shows that H_{op}(\omega) ≈ 1 (no attenuation) in bands where S_m(\omega) ≫ S_n(\omega), but in bands where S_n(\omega) ≫ S_m(\omega) the filter has high attenuation. In other words, the optimum filter attenuates heavily the band where the noise is relatively stronger. This causes some signal distortion, but at the same time it attenuates the noise more heavily, so that the overall SNR is improved.
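As a numerical sketch of Eqs. (4.73) and (4.74), the script below evaluates the optimum filter and its output noise power for illustrative PSDs (a Lorentzian signal spectrum and white channel noise, truncated to a finite band); these PSD choices are assumptions made only for the example.

import numpy as np

# Illustrative PSDs: S_m(w) = 2a/(w^2 + a^2) (low-pass signal), S_n(w) = N (white noise).
a, N = 1.0, 0.1
w = np.linspace(-200, 200, 400001)
S_m = 2 * a / (w**2 + a**2)
S_n = N * np.ones_like(w)

H_op = S_m / (S_m + S_n)                                        # Eq. (4.73)
N_o_opt = np.trapz(S_m * S_n / (S_m + S_n), w) / (2 * np.pi)    # Eq. (4.74)

# For comparison: an all-pass filter H = 1 passes all the channel noise and
# introduces no distortion; over the (truncated) band used here its output noise is
N_o_allpass = np.trapz(S_n, w) / (2 * np.pi)
print(N_o_opt, N_o_allpass)                                     # the optimum filter gives less noise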

Comments on the Optimum Filter: If the SNR at the filter input is reasonably large, e.g., S_m(\omega) > 100\, S_n(\omega) (an SNR of 20 dB), then the optimum filter of Eq. (4.73) is practically an ideal (all-pass) filter, and N_o in Eq. (4.74) is given approximately by

N_o ≈ \frac{1}{2\pi} \int S_n(\omega)\, d\omega.    (4.75)

Hence, for a large input SNR, optimization yields insignificant improvement. The Wiener-Hopf filter is therefore practical only when the input SNR is small (the large-noise case). Another issue is the realization of the optimum filter. Because S_m(\omega) and S_n(\omega) are both even functions of \omega, the optimum filter H_{op}(\omega) is an even function of \omega. Hence, the unit impulse response h_{op}(t) is an even function of t. This makes h_{op}(t) noncausal and the filter unrealizable. As noted earlier, such a filter can be realized approximately if we are willing to tolerate some delay in the output. If delay cannot be tolerated, the derivation of H_{op}(\omega) must be repeated with a realizability constraint. Note that the realizable optimum filter can never be superior to the unrealizable optimum filter.

Problems for Chapter 4

4.1) Imagine that you are trapped in a circular room with three doors symmetrically placed around the perimeter. You are told by a mysterious voice that one door leads to the outside after a two-hour trip through a maze. However, the other two doors lead to mazes that terminate back in the room after a two-hour trip, at which time you are unable to tell through which door you exited or entered. What is the average time for escape to the outside? Can you guess the answer ahead of time? If not, can you provide a physical explanation for the answer you calculate?

4.2) A communication system sends data in the form of packets of fixed length. Noise in the communication channel may cause a packet to be received incorrectly. If this happens, then the packet is retransmitted. Let the probability that a packet is received incorrectly be q. Determine the average number of transmissions that are necessary before a packet is received correctly.

4.3) In Exercise 4.2, let the transmission time be T_t seconds for a packet. If the packet is received incorrectly, then a message is sent back to the transmitter stating that the message was received incorrectly. Let the time for sending such a message be T_i. Assume that if the packet is received correctly, we do not send an acknowledgment. What is the average time for a successful transmission?

4.4) For a Gaussian random variable, derive expressions for the coefficient of skewness and the coefficient of kurtosis in terms of the mean and variance, μ and σ².

4.5) Prove that all odd central moments of a Gaussian random variable are equal to zero. Furthermore, develop an expression for all even central moments of a Gaussian random variable.

4.6) Let c_n be the nth central moment of a random variable and m_n be its nth moment. Find a relationship between c_n and m_k, k = 0, 1, ..., n.

4.7) Let X be a random variable with E[X] = [ ] and σ_X² = [ ]. Find the following: (a) E[ ], (b) E[ ]², (c) E[ ²].

4.8) Suppose X is a Gaussian random variable with a mean of μ and a variance of σ². Find an expression for E[ ].

4.9) Suppose a random variable has a PDF which is nonzero only on the interval [0, ∞); that is, the random variable cannot take on negative values. Prove that [ ].

4.10) Show that the concept of total probability can be extended to expected values. That is, if {A_i} is a set of mutually exclusive and exhaustive events, then

E[X] = \sum_i E[X | A_i] \Pr(A_i).

4.11) Prove Jensen's inequality, which states that for any convex function g and any random variable X,

E[g(X)] \ge g(E[X]).

4.12) Suppose X is a random variable with an exponential PDF of the form f_X(x) = 2 e^{-2x} u(x). A new random variable is created according to the transformation Y = [ ]. (a) Find the ranges of X and Y. (b) Find f_Y(y).

4.13) Let X be a standard normal random variable. Find the PDF of Y = [ ].

4.14) Repeat Exercise 4.13 if the transformation is Y = [ ].

4.15) Suppose a random variable, X, has a Gaussian PDF with zero mean and variance σ_X². The random variable is transformed by the device whose input/output relationship is shown in the accompanying figure. Find and sketch the PDF of the transformed random variable, Y.

4.16) A matched filter has the frequency response H(f) = [ ].

a) Determine the impulse response h(t) corresponding to H(f).
b) Determine the signal waveform to which the filter characteristic is matched.

4.17) Consider the signal s(t) = [ ] for 0 ≤ t ≤ T (and zero otherwise).

a) Determine the impulse response of the filter matched to the signal.
b) Determine the output of the matched filter at t = T.
c) Suppose the signal s(t) is passed through a correlator that correlates the input with s(t). Determine the value of the correlator output at t = T. Compare your result with that in (b).

4.18) A signal process m(t) is mixed with a channel noise n(t). The respective PSDs are S_m(ω) = [ ] and S_n(ω) = [ ].

a) Find the optimum Wiener-Hopf filter.
b) Sketch its unit impulse response.
c) Estimate the amount of delay necessary to make this filter closely realizable.