Detecting Parametric Signals in Noise Having Exactly Known Pdf/Pmf

Reading: Ch. 5 in Kay-II. (Part of) Ch. III.B in Poor. EE 527, Detection and Estimation Theory, # 5c

Detecting Parametric Signals in Noise $w$ Having Exactly Known Pdf/Pmf $p_w(w)$ (Bayesian Decision-theoretic Approach)

Consider the simple binary signal-detection problem:

$$\mathcal{H}_0: x = \mu_0(\varphi) + w \quad \text{versus} \quad \mathcal{H}_1: x = \mu_1(\varphi) + w$$

where $\mu_0(\varphi)$ and $\mu_1(\varphi)$ are known vector-valued functions of the nuisance parameter $\varphi$ and the noise probability density or mass function (pdf/pmf) $p_w(w)$ is exactly known. Recall our discussion on handling nuisance parameters on pp. 7-8 of handout # 5.

Since we have simple hypotheses, we need to specify the Bernoulli prior pmf for the two hypotheses, using prior probabilities $\pi_0$ and $\pi_1 = 1 - \pi_0$. Specializing the result in eq. (5) in handout # 5 (where we have assumed that the hypotheses and $\varphi$ are independent a priori, see eq. (4) in handout # 5) to the above scenario, we obtain the following Bayes decision rule:

$$\Lambda(x) = \frac{\int p_w\big(x - \mu_1(\varphi)\big)\, \pi(\varphi)\, d\varphi}{\int p_w\big(x - \mu_0(\varphi)\big)\, \pi(\varphi)\, d\varphi} \;\overset{\mathcal{H}_1}{\gtrless}\; \frac{\pi_0\, L(1\,|\,0)}{\pi_1\, L(0\,|\,1)} \tag{1}$$

where $\pi(\varphi)$ is the prior pdf of $\varphi$ and $\Lambda(x)$ is the integrated likelihood ratio.

Example. Detection of on-off keying signals with unknown phase in additive white Gaussian noise (AWGN): Choose AWGN with $\Sigma = \sigma^2 I$ and known noise variance $\sigma^2$, $\mu_0(\varphi) = 0$, and

$$\mu_1(\varphi) = s(\varphi) = \begin{bmatrix} s[0,\varphi] \\ s[1,\varphi] \\ \vdots \\ s[N-1,\varphi] \end{bmatrix} = \begin{bmatrix} a_0 \sin(0 \cdot \omega_c + \varphi) \\ a_1 \sin(1 \cdot \omega_c + \varphi) \\ \vdots \\ a_{N-1} \sin((N-1)\, \omega_c + \varphi) \end{bmatrix}$$

where $a_0, a_1, \ldots, a_{N-1}$ is a known amplitude sequence, $\omega_c$ is the known carrier frequency, and $\varphi$ is an unknown phase angle, independent of the noise, with prior $\pi(\varphi) = \text{uniform}(0, 2\pi)$.

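Now, (1) reduces to

$$\Lambda(x) = \frac{p(x\,|\,\mathcal{H}_1)}{p(x\,|\,\mathcal{H}_0)} = \frac{\int_0^{2\pi} \frac{1}{2\pi}\, p(x\,|\,\mathcal{H}_1, \varphi)\, d\varphi}{\int_0^{2\pi} \frac{1}{2\pi}\, p(x\,|\,\mathcal{H}_0, \varphi)\, d\varphi}$$

$$= \frac{1}{2\pi} \int_0^{2\pi} \frac{\exp\big\{ -\frac{1}{2\sigma^2} \sum_{n=0}^{N-1} (x[n] - s[n,\varphi])^2 \big\}}{\exp\big\{ -\frac{1}{2\sigma^2} \sum_{n=0}^{N-1} (x[n])^2 \big\}}\, d\varphi$$

$$= \frac{1}{2\pi} \int_0^{2\pi} \exp\Big\{ \frac{1}{\sigma^2} \Big[ \sum_{n=0}^{N-1} x[n]\, s[n,\varphi] - \frac{1}{2} \sum_{n=0}^{N-1} (s[n,\varphi])^2 \Big] \Big\}\, d\varphi$$

where we have used the fact that $p(x\,|\,\mathcal{H}_0, \varphi)$ is independent of $\varphi$.

We prefer to choose $\omega_c$ equal to an integer multiple of $2\pi/N$. For dense signal sampling ($N$ large) and $\omega_c$ not close to $0$ or $\pi$, we have [see eq. (III.B.67) in Poor]:

$$\sum_{n=0}^{N-1} (s[n,\varphi])^2 = \sum_{n=0}^{N-1} a_n^2 \sin^2(n\, \omega_c + \varphi) \approx \frac{1}{2} \sum_{n=0}^{N-1} a_n^2 = \frac{N}{2}\, \overline{a^2}$$

where we have used the identity $\sin^2 x = \frac{1}{2} - \frac{1}{2} \cos 2x$ and the following definition:

$$\overline{a^2} = \frac{1}{N} \sum_{n=0}^{N-1} a_n^2.$$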
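The energy approximation above can be checked numerically. The following is a minimal sketch, assuming a hypothetical constant amplitude sequence and a carrier at an integer multiple of $2\pi/N$; the function names are illustrative and not part of the handout.

```python
import math

def signal_energy(a, omega_c, phi):
    """Exact energy: sum_n (a_n * sin(n*omega_c + phi))^2."""
    return sum((a_n * math.sin(n * omega_c + phi)) ** 2 for n, a_n in enumerate(a))

def approx_energy(a):
    """Approximation (N/2) * mean(a_n^2), i.e. half the sum of squared amplitudes."""
    return 0.5 * sum(a_n ** 2 for a_n in a)

# Hypothetical example values (assumptions, not from the handout)
N = 64
a = [1.0] * N                    # constant amplitude sequence
omega_c = 2 * math.pi * 8 / N    # integer multiple of 2*pi/N, away from 0 and pi

for phi in (0.0, 0.7, 2.5):
    # The two numbers agree for every phase, so the energy term can be pulled out of the integral
    print(phi, signal_energy(a, omega_c, phi), approx_energy(a))
```

For constant amplitudes the double-frequency cosine sums to zero over the record, so the two values coincide up to rounding; for slowly varying amplitude sequences the agreement is only approximate.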

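Thus,

$$\Lambda(x) \approx \frac{1}{2\pi} \int_0^{2\pi} \exp\Big\{ \frac{1}{\sigma^2} \Big[ \sum_{n=0}^{N-1} x[n]\, s[n,\varphi] - \frac{N}{4}\, \overline{a^2} \Big] \Big\}\, d\varphi$$

$$= \exp\Big( -\frac{N\, \overline{a^2}}{4 \sigma^2} \Big)\, \frac{1}{2\pi} \int_0^{2\pi} \exp\Big\{ \frac{1}{\sigma^2} \sum_{n=0}^{N-1} x[n]\, s[n,\varphi] \Big\}\, d\varphi$$

$$= \exp\Big( -\frac{N\, \overline{a^2}}{4 \sigma^2} \Big)\, \frac{1}{2\pi} \int_0^{2\pi} \exp\Big\{ \frac{1}{\sigma^2} \sum_{n=0}^{N-1} x[n]\, a_n \cos\big(n\, \omega_c + \varphi - \tfrac{\pi}{2}\big) \Big\}\, d\varphi$$

$$= \exp\Big( -\frac{N\, \overline{a^2}}{4 \sigma^2} \Big)\, \frac{1}{2\pi} \int_0^{2\pi} \exp\Big( \frac{1}{\sigma^2}\, \mathrm{Re}\Big\{ \sum_{n=0}^{N-1} x[n]\, a_n\, e^{j (n \omega_c + \varphi - \pi/2)} \Big\} \Big)\, d\varphi$$

$$= \exp\Big( -\frac{N\, \overline{a^2}}{4 \sigma^2} \Big)\, \frac{1}{2\pi} \int_{\text{any } 2\pi \text{ interval}} \exp\Big( \frac{1}{\sigma^2}\, \mathrm{Re}\big\{ z(x) \exp[\, j (\varphi - \tfrac{\pi}{2})] \big\} \Big)\, d\varphi \tag{2}$$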

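where

$$z(x) = \sum_{n=0}^{N-1} x[n]\, a_n \exp(j n \omega_c).$$

Clearly, (2) does not depend on the phase $\angle z(x)$ and is, therefore, a function of $z(x)$ only through its magnitude $|z(x)|$. Furthermore, (2) is an increasing function of $|z(x)|$, implying that we can simplify our test to

$$\Big| \sum_{n=0}^{N-1} x[n]\, a_n \exp(j n \omega_c) \Big| \;\overset{\mathcal{H}_1}{\gtrless}\; \text{a threshold}$$

or, since $x[n]\, a_n$ is real-valued, to a test on the magnitude of the Fourier transform of $x[n]\, a_n$ evaluated at $\omega_c$:

$$\Big| \sum_{n=0}^{N-1} x[n]\, a_n \exp(-j n \omega_c) \Big| \;\overset{\mathcal{H}_1}{\gtrless}\; \text{a threshold}$$

or, in terms of the two quadrature components,

$$\Big\{ \sum_{n=0}^{N-1} x[n]\, a_n \cos(n\, \omega_c) \Big\}^2 + \Big\{ \sum_{n=0}^{N-1} x[n]\, a_n \sin(n\, \omega_c) \Big\}^2 \;\overset{\mathcal{H}_1}{\gtrless}\; \text{a threshold}$$

in both the Bayesian and Neyman-Pearson scenarios (as usual, only the choice of the threshold differs between the two scenarios).

In this case, we can evaluate the integral (2) exactly, yielding

$$\Lambda(x) \approx \exp\Big( -\frac{N\, \overline{a^2}}{4 \sigma^2} \Big)\, I_0\Big( \frac{|z(x)|}{\sigma^2} \Big)$$

where $I_0(\cdot)$ denotes the zeroth-order modified Bessel function of the first kind, which can be defined as follows:

$$I_0(|z|) = \frac{1}{2\pi} \int_0^{2\pi} e^{\mathrm{Re}\{ z e^{j\varphi} \}}\, d\varphi.$$

[Here, we can easily show that the right-hand side of the above expression is a function of $|z|$ only (i.e. independent of the phase $\angle z$). Substituting $\theta = \angle z + \varphi$:

$$\frac{1}{2\pi} \int_0^{2\pi} e^{\mathrm{Re}\{ z e^{j\varphi} \}}\, d\varphi = \frac{1}{2\pi} \int_0^{2\pi} e^{\mathrm{Re}\{ |z|\, e^{j(\angle z + \varphi)} \}}\, d\varphi = \frac{1}{2\pi} \int_{\text{any } 2\pi \text{ interval}} e^{\mathrm{Re}\{ |z|\, e^{j\theta} \}}\, d\theta$$

$$= \frac{1}{2\pi} \int_{\text{any } 2\pi \text{ interval}} e^{|z|\, \mathrm{Re}\{ e^{j\theta} \}}\, d\theta = \frac{1}{2\pi} \int_0^{2\pi} e^{|z| \cos\theta}\, d\theta = I_0(|z|).$$

]

Therefore, our Bayes decision rule simplifies to

$$|z(x)| \;\overset{\mathcal{H}_1}{\gtrless}\; \sigma^2\, I_0^{-1}\Big( \exp\Big( \frac{N\, \overline{a^2}}{4 \sigma^2} \Big)\, \frac{\pi_0\, L(1\,|\,0)}{\pi_1\, L(0\,|\,1)} \Big) = \gamma$$

leading to the following receiver structure:

[Figure: receiver structure]

Note: Here, to implement the maximum-likelihood detection rule

$$|z(x)| \;\overset{\mathcal{H}_1}{\gtrless}\; \sigma^2\, I_0^{-1}\Big( \exp\Big( \frac{N\, \overline{a^2}}{4 \sigma^2} \Big) \Big) \quad \text{(maximum-likelihood rule threshold)}$$

we need to know the AWGN noise variance $\sigma^2$.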
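The noncoherent rule above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the handout's implementation: the function names, the power-series evaluation of $I_0$ (adequate only for moderate arguments), and the noiseless example data are all hypothetical. It compares the approximate likelihood ratio directly with the threshold $\tau = \pi_0 L(1|0) / (\pi_1 L(0|1))$ instead of inverting $I_0$.

```python
import cmath
import math

def z_stat(x, a, omega_c):
    """z(x) = sum_n x[n] * a_n * exp(j*n*omega_c)."""
    return sum(x_n * a_n * cmath.exp(1j * n * omega_c)
               for n, (x_n, a_n) in enumerate(zip(x, a)))

def bessel_i0(v, terms=200):
    """I_0(v) from the power series sum_k ((v/2)^k / k!)^2 (moderate v only)."""
    total, term = 0.0, 1.0           # term holds (v/2)^k / k!
    for k in range(terms):
        total += term * term
        term *= (v / 2.0) / (k + 1)
    return total

def likelihood_ratio(x, a, omega_c, sigma2):
    """Approximate Lambda(x) = exp(-N*mean(a^2)/(4*sigma2)) * I_0(|z(x)|/sigma2)."""
    energy_term = sum(a_n ** 2 for a_n in a) / (4.0 * sigma2)
    return math.exp(-energy_term) * bessel_i0(abs(z_stat(x, a, omega_c)) / sigma2)

def decide_h1(x, a, omega_c, sigma2, tau=1.0):
    """True if H1 is chosen; tau = 1 corresponds to the maximum-likelihood rule."""
    return likelihood_ratio(x, a, omega_c, sigma2) > tau

# Hypothetical noiseless example (assumed values)
N = 64
a = [1.0] * N
omega_c = 2 * math.pi * 8 / N
signal = [a_n * math.sin(n * omega_c + 0.7) for n, a_n in enumerate(a)]

print(abs(z_stat(signal, a, omega_c)))                # |z| does not depend on the phase 0.7
print(decide_h1(signal, a, omega_c, sigma2=1.0))      # signal present
print(decide_h1([0.0] * N, a, omega_c, sigma2=1.0))   # all-zero record
```

Because $I_0$ is increasing, thresholding the likelihood ratio at $\tau$ is equivalent to thresholding $|z(x)|$ at $\gamma$; working with $\log I_0$ would be a more robust choice when $|z(x)|/\sigma^2$ is large.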

Detecting A Stochastic Signal in AWGN (Neyman-Pearson Approach)

Some signals have unknown waveform (e.g. speech signals or NDE defect responses). We may need to use stochastic models to describe such signals. We start with a simple independent-signal model, described below.

Estimator-correlator. Consider the following hypothesis test:

$$\mathcal{H}_0: x[n] = w[n], \quad n = 1, 2, \ldots, N$$
$$\mathcal{H}_1: x[n] = s[n] + w[n], \quad n = 1, 2, \ldots, N$$

where

- $s[n]$ are zero-mean independent Gaussian random variables with known variances $\sigma_{s,n}^2$, i.e. $s[n] \sim \mathcal{N}(0, \sigma_{s,n}^2)$,
- the noise $w[n]$ is AWGN with known variance $\sigma^2$, i.e. $w[n] \sim \mathcal{N}(0, \sigma^2)$,
- $s[n]$ and $w[n]$ are independent.

Here is an alternative formulation. Consider the following family of pdfs:

$$p(x\,|\,C_s) = \mathcal{N}(0, C_s + \sigma^2 I) = \Big[ \prod_{n=1}^{N} \frac{1}{\sqrt{2\pi (\sigma^2 + c_{s,n})}} \Big] \exp\Big[ -\sum_{n=1}^{N} \frac{(x[n])^2}{2 (\sigma^2 + c_{s,n})} \Big], \qquad C_s = \mathrm{diag}\{c_{s,1}, c_{s,2}, \ldots, c_{s,N}\}$$

with $x = [x[1], x[2], \ldots, x[N]]^T$ and the following (equivalent) hypotheses:

$$\mathcal{H}_0: C_s = 0 \ \text{(signal absent)} \quad \text{versus} \quad \mathcal{H}_1: C_s = \mathrm{diag}\{\sigma_{s,1}^2, \sigma_{s,2}^2, \ldots, \sigma_{s,N}^2\} \ \text{(signal present)}.$$

Clearly, we have integrated $s[n]$, $n = 1, 2, \ldots, N$ out to obtain the marginal likelihood $p(x\,|\,C_s)$ under $\mathcal{H}_1$. Here, the only discrimination between the two hypotheses is in the variance of the measurements (i.e. the power of the received signal).

The Neyman-Pearson detector computes the likelihood ratio:

$$\Lambda(x) = \frac{p(x\,|\,\mathrm{diag}\{\sigma_{s,1}^2, \sigma_{s,2}^2, \ldots, \sigma_{s,N}^2\})}{p(x\,|\,0)}.$$

Note that

$$\frac{1}{\sigma^2} - \frac{1}{\sigma^2 + \sigma_{s,n}^2} = \frac{1}{\sigma^2} \cdot \frac{\sigma_{s,n}^2}{\sigma^2 + \sigma_{s,n}^2} = \frac{\kappa_n}{\sigma^2}$$

where we define

$$\kappa_n = \frac{\sigma_{s,n}^2}{\sigma^2 + \sigma_{s,n}^2}.$$

Let us compute the log likelihood ratio:

$$\log \Lambda(x) = \text{const} - \sum_{n=1}^{N} \frac{(x[n])^2}{2 (\sigma^2 + \sigma_{s,n}^2)} + \sum_{n=1}^{N} \frac{(x[n])^2}{2 \sigma^2} = \text{const} + \frac{1}{2 \sigma^2} \sum_{n=1}^{N} \kappa_n\, (x[n])^2$$

where const is not a function of $x$. Therefore, our test simplifies to (after ignoring the constant terms and scaling the log likelihood ratio by the positive constant $2\sigma^2$):

$$T(x) = \sum_{n=1}^{N} \kappa_n\, (x[n])^2 \;\overset{\mathcal{H}_1}{\gtrless}\; \gamma \ \text{(a threshold)}.$$

We first provide two interpretations of this detector and then generalize it to the case of correlated signal $s[n]$ having known covariance.

Filter-squarer interpretation:

$$\sum_{n=1}^{N} \kappa_n\, (x[n])^2 = \sum_{n=1}^{N} \big( \sqrt{\kappa_n}\, x[n] \big)^2.$$

Estimator-correlator interpretation:

$$\sum_{n=1}^{N} \kappa_n\, (x[n])^2 = \sum_{n=1}^{N} x[n]\, (\kappa_n\, x[n]) = \sum_{n=1}^{N} x[n]\, \underbrace{\frac{\sigma_{s,n}^2}{\sigma^2 + \sigma_{s,n}^2}\, x[n]}_{\mathrm{E}[s[n]\,|\,x[n]] \,=\, \hat{s}_{\mathrm{MMSE}}[n]}.$$
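The two interpretations compute the same statistic, which the following minimal sketch illustrates. The function names and the example numbers are hypothetical; only the formulas for $\kappa_n$ and $T(x)$ come from the text.

```python
import math

def kappa(sigma2_s, sigma2):
    """Weights kappa_n = sigma_{s,n}^2 / (sigma^2 + sigma_{s,n}^2)."""
    return [s / (sigma2 + s) for s in sigma2_s]

def t_filter_squarer(x, sigma2_s, sigma2):
    """Filter-squarer form: scale each sample by sqrt(kappa_n), then square and sum."""
    return sum((math.sqrt(k) * x_n) ** 2 for k, x_n in zip(kappa(sigma2_s, sigma2), x))

def t_estimator_correlator(x, sigma2_s, sigma2):
    """Estimator-correlator form: correlate x[n] with the MMSE estimate kappa_n * x[n]."""
    s_hat = [k * x_n for k, x_n in zip(kappa(sigma2_s, sigma2), x)]
    return sum(x_n * s_n for x_n, s_n in zip(x, s_hat))

# Hypothetical data: three samples, the last one carrying no signal power
x = [1.0, -2.0, 0.5]
sigma2_s = [1.0, 3.0, 0.0]
sigma2 = 1.0

print(t_filter_squarer(x, sigma2_s, sigma2))
print(t_estimator_correlator(x, sigma2_s, sigma2))
```

Samples with $\sigma_{s,n}^2 = 0$ receive weight $\kappa_n = 0$ and therefore do not contribute to the decision, as expected when no signal power is present at that index.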

(Asymptotic) Estimator-correlator: Wide-sense Stationary (WSS) signal $s[n]$ in WSS noise $w[n]$.

Suppose that $s[n]$ and $w[n]$ are zero-mean WSS sequences with power spectral densities (PSDs) $P_{ss}(f)$ and $P_{ww}(f)$, $f \in [-\tfrac{1}{2}, \tfrac{1}{2}]$. Then, our estimator-correlator structure remains the same, with the estimator block modified accordingly:

[Figure: estimator-correlator structure with modified estimator block]

which is an asymptotic estimator-correlator.

Estimator-correlator for Correlated Signal in the Form of a Linear Model

We extend the estimator-correlator to the case of correlated signal:

$$\mathcal{H}_0: x = w \quad \text{versus} \quad \mathcal{H}_1: x = \underbrace{H \theta}_{\text{signal } s} + w$$

where

- $H$ is a known $N \times p$ matrix and $N \ge p$,
- the noise $w$ is zero-mean Gaussian with known covariance matrix $\Sigma_w$: $p(w) = \mathcal{N}(0, \Sigma_w)$,
- $\theta$ is unknown, with the following prior pdf: $\pi(\theta) = \mathcal{N}(0, \Sigma_\theta)$, where $\Sigma_\theta$ is known.

We focus on the following general family of pdfs:

$$p(x\,|\,C_s) = \mathcal{N}(0, C_s + \Sigma_w)$$

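where $C_s$ is a positive-semidefinite covariance matrix. Then, the above hypothesis-testing problem can be equivalently stated as

$$\mathcal{H}_0: C_s = 0 \ \text{(signal absent)} \quad \text{versus} \quad \mathcal{H}_1: C_s = H \Sigma_\theta H^T \ \text{(signal present)}.$$

The estimator-correlator test statistic is now

$$T(x) = \tfrac{1}{2}\, x^T \Sigma_w^{-1} x - \tfrac{1}{2}\, x^T (H \Sigma_\theta H^T + \Sigma_w)^{-1} x.$$

Recall the matrix inversion lemma:

$$(A + BCD)^{-1} = A^{-1} - A^{-1} B\, (C^{-1} + D A^{-1} B)^{-1} D A^{-1}$$

and use it as follows:

$$(\Sigma_w + H \Sigma_\theta H^T)^{-1} = \Sigma_w^{-1} - \Sigma_w^{-1} H\, (\Sigma_\theta^{-1} + H^T \Sigma_w^{-1} H)^{-1} H^T \Sigma_w^{-1}$$

yielding

$$T(x) = \tfrac{1}{2}\, x^T \Sigma_w^{-1} H\, \underbrace{(\Sigma_\theta^{-1} + H^T \Sigma_w^{-1} H)^{-1} H^T \Sigma_w^{-1} x}_{\mathrm{E}[\theta\,|\,x] \,=\, \hat{\theta}_{\mathrm{MMSE}}, \text{ see handout \# 4}}. \tag{3}$$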
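As a consistency check, (3) can be specialized to the independent-signal model treated earlier. The following worked derivation is a sketch that additionally assumes all $\sigma_{s,n}^2 > 0$, so that $\Sigma_\theta$ is invertible; it takes $H = I$, $\Sigma_\theta = \mathrm{diag}\{\sigma_{s,1}^2, \ldots, \sigma_{s,N}^2\}$ and $\Sigma_w = \sigma^2 I$.

```latex
\begin{align*}
(\Sigma_\theta^{-1} + H^T \Sigma_w^{-1} H)^{-1}
  &= \mathrm{diag}\Big\{ \Big( \frac{1}{\sigma_{s,n}^2} + \frac{1}{\sigma^2} \Big)^{-1} \Big\}
   = \mathrm{diag}\Big\{ \frac{\sigma^2 \sigma_{s,n}^2}{\sigma^2 + \sigma_{s,n}^2} \Big\}
   = \sigma^2\, \mathrm{diag}\{ \kappa_n \}, \\
T(x)
  &= \frac{1}{2}\, x^T \frac{1}{\sigma^2}\, \big( \sigma^2\, \mathrm{diag}\{\kappa_n\} \big)\, \frac{1}{\sigma^2}\, x
   = \frac{1}{2 \sigma^2} \sum_{n=1}^{N} \kappa_n\, (x[n])^2 .
\end{align*}
```

This agrees with the log likelihood ratio of the independent-signal model up to terms that do not depend on $x$, so the general statistic (3) and the earlier weighted-energy detector define the same test.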

Example: Detecting a Sinusoid in a Rayleigh-fading Channel (Neyman-Pearson Approach)

Over a short time interval, the channel output is a constant-amplitude sinusoid with random amplitude and phase, i.e.

$$s[n] = A \cos(2\pi f_0 n + \varphi) = a \cos(2\pi f_0 n) + b \sin(2\pi f_0 n)$$

for $n = 0, 1, \ldots, N-1$. Let us choose independent, identically distributed (i.i.d.) Rayleigh fading:

$$\theta = \begin{bmatrix} a \\ b \end{bmatrix} \sim \mathcal{N}(0, \underbrace{\sigma_\theta^2 I}_{\Sigma_\theta})$$

implying

$$A = \sqrt{a^2 + b^2} \sim \text{a Rayleigh random variable}, \qquad \varphi \sim \text{uniform}(0, 2\pi).$$

With these assumptions, $s[n]$ is a WSS Gaussian random process, since

$$\mathrm{E}(s[n]) = 0$$

$$\mathrm{E}(s[n]\, s[n+k]) = \mathrm{E}\{ [a \cos 2\pi f_0 n + b \sin 2\pi f_0 n]\, [a \cos 2\pi f_0 (n+k) + b \sin 2\pi f_0 (n+k)] \}$$

$$= \sigma_\theta^2\, [\cos 2\pi f_0 n\, \cos 2\pi f_0 (n+k) + \sin 2\pi f_0 n\, \sin 2\pi f_0 (n+k)]$$

$$= \sigma_\theta^2 \Big\{ \frac{e^{j 2\pi f_0 n} + e^{-j 2\pi f_0 n}}{2} \cdot \frac{e^{j 2\pi f_0 (n+k)} + e^{-j 2\pi f_0 (n+k)}}{2} + \frac{e^{j 2\pi f_0 n} - e^{-j 2\pi f_0 n}}{2j} \cdot \frac{e^{j 2\pi f_0 (n+k)} - e^{-j 2\pi f_0 (n+k)}}{2j} \Big\}$$

$$= \sigma_\theta^2 \cos 2\pi f_0 k = r_{ss}[k]$$

for $n = 0, 1, \ldots, N-1$. We now construct a linear model with

$$H = \begin{bmatrix} 1 & 0 \\ \cos 2\pi f_0 & \sin 2\pi f_0 \\ \vdots & \vdots \\ \cos 2\pi f_0 (N-1) & \sin 2\pi f_0 (N-1) \end{bmatrix}$$

and $\Sigma_w = \sigma^2 I$ (AWGN). Note that $\sigma^2$ and $\sigma_\theta^2$ are assumed known. Now, (3) reduces to

$$T(x) = \tfrac{1}{2}\, x^T \Sigma_w^{-1} H\, (\Sigma_\theta^{-1} + H^T \Sigma_w^{-1} H)^{-1} H^T \Sigma_w^{-1} x = \frac{1}{2 (\sigma^2)^2}\, x^T H\, (\sigma_\theta^{-2} I + \sigma^{-2} H^T H)^{-1} H^T x.$$

For large $N$ and $f_0$ not too close to $0$ or $\tfrac{1}{2}$,

$$H^T H \approx (N/2)\, I$$

see p. 157 in Kay-II. Furthermore,

$$H^T x = \begin{bmatrix} \sum_{n=0}^{N-1} x[n] \cos 2\pi f_0 n \\ \sum_{n=0}^{N-1} x[n] \sin 2\pi f_0 n \end{bmatrix}.$$

After scaling by the positive constant $2 (\sigma^2)^2\, (\sigma_\theta^{-2} + \sigma^{-2} N/2) / N$, our test statistic $T(x)$ simplifies to

$$T(x) = \frac{1}{N} \Big| \sum_{n=0}^{N-1} x[n] \exp(-j 2\pi f_0 n) \Big|^2$$

which is nothing but the periodogram of $x[n]$, $n = 0, 1, \ldots, N-1$ evaluated at frequency $f = f_0$, also known as the quadrature or noncoherent matched filter.
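The periodogram statistic is straightforward to compute directly from its definition. The sketch below uses hypothetical function names and a hypothetical noiseless test signal; it also evaluates the equivalent quadrature form $(\xi_1^2 + \xi_2^2)/N$ built from the two entries of $H^T x$.

```python
import cmath
import math

def periodogram_stat(x, f0):
    """T(x) = (1/N) * |sum_n x[n] * exp(-j*2*pi*f0*n)|^2."""
    n_samples = len(x)
    z = sum(x_n * cmath.exp(-2j * math.pi * f0 * n) for n, x_n in enumerate(x))
    return abs(z) ** 2 / n_samples

def quadrature_stat(x, f0):
    """Same statistic from the cosine and sine correlations (the entries of H^T x)."""
    n_samples = len(x)
    xi1 = sum(x_n * math.cos(2 * math.pi * f0 * n) for n, x_n in enumerate(x))
    xi2 = sum(x_n * math.sin(2 * math.pi * f0 * n) for n, x_n in enumerate(x))
    return (xi1 ** 2 + xi2 ** 2) / n_samples

# Hypothetical noiseless sinusoid with unit amplitude and arbitrary phase
n_samples, f0 = 16, 0.25
x = [math.cos(2 * math.pi * f0 * n + 0.3) for n in range(n_samples)]

print(periodogram_stat(x, f0))   # equals N * A^2 / 4 for this choice of f0 and N
print(quadrature_stat(x, f0))
```

The value does not depend on the phase 0.3, which is the sense in which the detector is noncoherent.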

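Performance Analysis for the Neyman-Pearson Setup: Define

$$\xi = \begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix} = H^T x.$$

Under $\mathcal{H}_0$, we have only noise, implying that

$$\xi \sim \mathcal{N}\big(0,\ H^T \sigma^2 I\, H\big), \qquad H^T \sigma^2 I\, H \approx \underbrace{\frac{N \sigma^2}{2}}_{s_0^2}\, I.$$

Under $\mathcal{H}_1$, we have

$$\xi \sim \mathcal{N}\big(0,\ H^T (H \Sigma_\theta H^T + \sigma^2 I) H\big), \qquad \Sigma_\theta = \sigma_\theta^2 I.$$

We approximate the covariance matrix of $\xi$ under $\mathcal{H}_1$ as follows:

$$H^T (H \Sigma_\theta H^T + \sigma^2 I) H \approx \sigma_\theta^2 (N/2)^2\, I + \sigma^2 (N/2)\, I = \underbrace{\frac{N}{2} \Big( \frac{N}{2} \sigma_\theta^2 + \sigma^2 \Big)}_{s_1^2}\, I.$$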

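Now, under $\mathcal{H}_0$,

$$T(x) = \frac{1}{N} (\xi_1^2 + \xi_2^2) = \frac{s_0^2}{N} \underbrace{\Big[ \Big( \frac{\xi_1}{s_0} \Big)^2 + \Big( \frac{\xi_2}{s_0} \Big)^2 \Big]}_{\chi_2^2 \text{ under } \mathcal{H}_0}$$

where $\xi_1/s_0$ and $\xi_2/s_0$ are standard normal random variables, implying that

$$P_{\mathrm{FA}} = P[T(X) > \gamma\,|\,C_s = 0] = P\Big[ \underbrace{\frac{N\, T(X)}{s_0^2}}_{\chi_2^2} > \frac{N \gamma}{s_0^2}\ \Big|\ C_s = 0 \Big] = Q_{\chi_2^2}\Big( \frac{N \gamma}{s_0^2} \Big) = \exp\big( -\tfrac{1}{2} N \gamma / s_0^2 \big)$$

see eq. (2.10) in Kay-II. Similarly, under $\mathcal{H}_1$,

$$P_{\mathrm{D}} = P[T(X) > \gamma\,|\,C_s = \sigma_\theta^2 H H^T] = P\Big[ \frac{N\, T(X)}{s_1^2} > \frac{N \gamma}{s_1^2}\ \Big|\ C_s = \sigma_\theta^2 H H^T \Big] = \exp\big( -\tfrac{1}{2} N \gamma / s_1^2 \big).$$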
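The false-alarm expression can be checked with a small Monte Carlo experiment. This is a sketch under assumed values (noise variance, record length, frequency, trial count and seed are all hypothetical); it uses $s_0^2 = N \sigma^2 / 2$, so that $P_{\mathrm{FA}} = \exp(-\gamma / \sigma^2)$ and the threshold for a target $P_{\mathrm{FA}}$ is $\gamma = -\sigma^2 \log P_{\mathrm{FA}}$.

```python
import cmath
import math
import random

def periodogram_stat(x, f0):
    """T(x) = (1/N) * |sum_n x[n] * exp(-j*2*pi*f0*n)|^2."""
    n_samples = len(x)
    z = sum(x_n * cmath.exp(-2j * math.pi * f0 * n) for n, x_n in enumerate(x))
    return abs(z) ** 2 / n_samples

def estimate_pfa(gamma, sigma2, n_samples, f0, trials, seed=0):
    """Fraction of noise-only records whose statistic exceeds gamma."""
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    exceed = 0
    for _ in range(trials):
        noise = [rng.gauss(0.0, sigma) for _ in range(n_samples)]
        if periodogram_stat(noise, f0) > gamma:
            exceed += 1
    return exceed / trials

# Assumed example values; f0 = 0.25 with N a multiple of 4 makes H^T H = (N/2) I exact
sigma2, n_samples, f0 = 2.0, 32, 0.25
target_pfa = 0.1
gamma = -sigma2 * math.log(target_pfa)   # from P_FA = exp(-N*gamma/(2*s0^2)), s0^2 = N*sigma2/2

pfa_hat = estimate_pfa(gamma, sigma2, n_samples, f0, trials=20000)
print(target_pfa, pfa_hat)   # the estimate should be close to the target
```

For frequencies where $H^T H$ is only approximately $(N/2) I$, the simulated rate would deviate slightly from the closed form.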

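But

$$-\tfrac{1}{2} N \gamma / s_0^2 = \log P_{\mathrm{FA}}$$

leading to

$$P_{\mathrm{D}} = \exp\Big( \frac{s_0^2}{s_1^2} \log P_{\mathrm{FA}} \Big) = P_{\mathrm{FA}}^{\,s_0^2 / s_1^2}.$$

Recall the expressions for $s_0^2$ and $s_1^2$ and compute their ratio:

$$\frac{s_1^2}{s_0^2} = \frac{\frac{N}{2} \big( \frac{N}{2} \sigma_\theta^2 + \sigma^2 \big)}{N \sigma^2 / 2} = \frac{N}{2} \frac{\sigma_\theta^2}{\sigma^2} + 1 = \frac{\eta}{2} + 1$$

where

$$\eta = \frac{N \sigma_\theta^2}{\sigma^2} = \frac{N\, \mathrm{E}[A^2 / 2]}{\sigma^2} = \text{average SNR}, \qquad A^2 = a^2 + b^2.$$

Hence

$$P_{\mathrm{D}} = P_{\mathrm{FA}}^{\,1 / (1 + \eta/2)}.$$

$P_{\mathrm{D}}$ increases slowly with the average signal-to-noise ratio (SNR) $\eta$ (see Figure 5.7 in Kay-II) because Rayleigh fading causes the amplitude to be small with high probability.

Coherent channel: matched filter. Noncoherent channel: quadrature matched filter. Compare Figures 5.7 and 4.5 in Kay-II.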
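The closed-form operating characteristic is easy to tabulate. The sketch below uses hypothetical function names and example SNR values; the formulas are the ones derived above.

```python
import math

def average_snr(n_samples, sigma2_theta, sigma2):
    """eta = N * sigma_theta^2 / sigma^2."""
    return n_samples * sigma2_theta / sigma2

def prob_detection(p_fa, eta):
    """P_D = P_FA ** (1 / (1 + eta/2)) for the Rayleigh-fading sinusoid."""
    return p_fa ** (1.0 / (1.0 + eta / 2.0))

# Tabulate P_D at P_FA = 0.01 for a few assumed average SNRs (in dB)
p_fa = 0.01
for snr_db in (0, 10, 20, 30):
    eta = 10.0 ** (snr_db / 10.0)
    print(snr_db, prob_detection(p_fa, eta))
```

The printed values rise only gradually with SNR, which is the slow growth of $P_{\mathrm{D}}$ that the text attributes to Rayleigh fading.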

Noncoherent FSK in a Rayleigh-fading Channel (Bayesian decision-theoretic detection for 0-1 loss)

$$\mathcal{H}_0: x[n] = A \cos(2\pi f_0 n + \varphi) + w[n], \quad n = 0, 1, \ldots, N-1$$
$$\mathcal{H}_1: x[n] = A \cos(2\pi f_1 n + \varphi) + w[n], \quad n = 0, 1, \ldots, N-1$$

where, as before, $A$ and $\varphi$ are the random amplitude and phase and

$$w = \begin{bmatrix} w[0] \\ w[1] \\ \vdots \\ w[N-1] \end{bmatrix} \sim \mathcal{N}(0, \Sigma_w).$$

We now focus on the following family of pdfs:

$$p(x\,|\,H) = \mathcal{N}(0, H \Sigma_\theta H^T + \Sigma_w)$$

where $\Sigma_\theta$ is known. We can rewrite the above detection problem using the linear model as follows:

$$\mathcal{H}_0: H = H_0 \quad \text{versus} \quad \mathcal{H}_1: H = H_1$$

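where

$$x = \begin{bmatrix} x[0] \\ x[1] \\ \vdots \\ x[N-1] \end{bmatrix}, \qquad H_0 = \begin{bmatrix} 1 & 0 \\ \cos 2\pi f_0 & \sin 2\pi f_0 \\ \vdots & \vdots \\ \cos 2\pi f_0 (N-1) & \sin 2\pi f_0 (N-1) \end{bmatrix}$$

and

$$H_1 = \begin{bmatrix} 1 & 0 \\ \cos 2\pi f_1 & \sin 2\pi f_1 \\ \vdots & \vdots \\ \cos 2\pi f_1 (N-1) & \sin 2\pi f_1 (N-1) \end{bmatrix}.$$

For a priori equiprobable hypotheses

$$\pi(H = H_0) = \pi(H = H_1) = \tfrac{1}{2}$$

we have the maximum-likelihood test:

$$\frac{p(x\,|\,H = H_1)}{p(x\,|\,H = H_0)} \;\overset{\mathcal{H}_1}{\gtrless}\; 1$$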

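i.e.

$$\frac{|H_1 \Sigma_\theta H_1^T + \Sigma_w|^{-1/2} \exp\big[ \tfrac{1}{2} x^T \Sigma_w^{-1} x - \tfrac{1}{2} x^T (H_1 \Sigma_\theta H_1^T + \Sigma_w)^{-1} x \big]}{|H_0 \Sigma_\theta H_0^T + \Sigma_w|^{-1/2} \exp\big[ \tfrac{1}{2} x^T \Sigma_w^{-1} x - \tfrac{1}{2} x^T (H_0 \Sigma_\theta H_0^T + \Sigma_w)^{-1} x \big]} \;\overset{\mathcal{H}_1}{\gtrless}\; 1$$

which can be written as

$$\frac{|H_1 \Sigma_\theta H_1^T + \Sigma_w|^{-1/2} \exp\big[ \tfrac{1}{2} x^T \Sigma_w^{-1} H_1 (\Sigma_\theta^{-1} + H_1^T \Sigma_w^{-1} H_1)^{-1} H_1^T \Sigma_w^{-1} x \big]}{|H_0 \Sigma_\theta H_0^T + \Sigma_w|^{-1/2} \exp\big[ \tfrac{1}{2} x^T \Sigma_w^{-1} H_0 (\Sigma_\theta^{-1} + H_0^T \Sigma_w^{-1} H_0)^{-1} H_0^T \Sigma_w^{-1} x \big]} \;\overset{\mathcal{H}_1}{\gtrless}\; 1.$$

To further simplify the above test, we adopt additional assumptions. In particular, we assume i.i.d. Rayleigh fading:

$$\Sigma_\theta = \sigma_\theta^2 I$$

and AWGN:

$$\Sigma_w = \sigma^2 I$$

where $\sigma^2$ and $\sigma_\theta^2$ are known. For large $N$ and $f_i$, $i \in \{0, 1\}$ not too close to $0$ or $\tfrac{1}{2}$,

$$H_i^T H_i \approx (N/2)\, I$$

see p. 157 in Kay-II. Now, we apply this approximation and the identities $|PQ| = |P|\,|Q|$ and $|I + AB| = |I + BA|$ to further simplify the above determinant expressions:

$$|H_i \Sigma_\theta H_i^T + \Sigma_w| = |\sigma_\theta^2 H_i H_i^T + \sigma^2 I| = |\sigma^2 I| \cdot \Big| I + \frac{\sigma_\theta^2}{\sigma^2} H_i H_i^T \Big| = (\sigma^2)^N\, \Big| I_2 + \frac{\sigma_\theta^2}{\sigma^2} H_i^T H_i \Big| \approx (\sigma^2)^N\, \Big| I_2 + \frac{\sigma_\theta^2}{\sigma^2} \frac{N}{2} I_2 \Big|$$

for $i \in \{0, 1\}$. The determinants are therefore (approximately) the same under both hypotheses and cancel. Applying the above approximation and assumptions yields the simplified maximum-likelihood test:

$$\frac{\exp\big[ \frac{1}{2 \sigma^4}\, x^T H_1 \big( \sigma_\theta^{-2} I + \frac{N}{2 \sigma^2} I \big)^{-1} H_1^T x \big]}{\exp\big[ \frac{1}{2 \sigma^4}\, x^T H_0 \big( \sigma_\theta^{-2} I + \frac{N}{2 \sigma^2} I \big)^{-1} H_0^T x \big]} \;\overset{\mathcal{H}_1}{\gtrless}\; 1$$

and, equivalently,

$$x^T H_1 H_1^T x \;\overset{\mathcal{H}_1}{\gtrless}\; x^T H_0 H_0^T x$$

or

$$\frac{1}{N} \Big| \sum_{n=0}^{N-1} x[n] \exp(-j 2\pi f_1 n) \Big|^2 \;\overset{\mathcal{H}_1}{\gtrless}\; \frac{1}{N} \Big| \sum_{n=0}^{N-1} x[n] \exp(-j 2\pi f_0 n) \Big|^2.$$

For performance analysis of this detector (i.e. computing an approximate expression for the average error probability), see Example 5.6 in Kay-II.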
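The simplified FSK rule compares the periodogram at the two candidate frequencies. The sketch below uses hypothetical function names and noiseless example signals; it is an illustration of the decision rule, not a full receiver.

```python
import cmath
import math

def periodogram(x, f):
    """(1/N) * |sum_n x[n] * exp(-j*2*pi*f*n)|^2."""
    n_samples = len(x)
    z = sum(x_n * cmath.exp(-2j * math.pi * f * n) for n, x_n in enumerate(x))
    return abs(z) ** 2 / n_samples

def detect_fsk(x, f0, f1):
    """Return 1 if the periodogram at f1 exceeds the one at f0, else 0."""
    return 1 if periodogram(x, f1) > periodogram(x, f0) else 0

# Hypothetical noiseless tones with arbitrary amplitudes and phases
n_samples = 32
f0, f1 = 0.125, 0.25
x0 = [math.cos(2 * math.pi * f0 * n + 1.0) for n in range(n_samples)]
x1 = [0.3 * math.cos(2 * math.pi * f1 * n - 2.0) for n in range(n_samples)]

print(detect_fsk(x0, f0, f1))   # tone at f0
print(detect_fsk(x1, f0, f1))   # tone at f1
```

The rule needs neither $\sigma^2$ nor $\sigma_\theta^2$, because both appear only in a common positive scale factor that cancels in the comparison.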

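Preliminaries

Let us define a matrix-variate circularly symmetric complex Gaussian pdf of a $p \times q$ random matrix $Z$ with mean $M$ (of size $p \times q$) and positive-definite covariance matrices $S$ and $\Sigma$ (of dimensions $p \times p$ and $q \times q$, respectively) as follows:

$$\mathcal{N}_{p \times q}(Z\,|\,M, S, \Sigma) = \frac{1}{\pi^{pq}\, |S|^q\, |\Sigma|^p} \exp\big\{ -\mathrm{tr}[\Sigma^{-1} (Z - M)^{\mathsf{H}} S^{-1} (Z - M)] \big\}$$

$$\propto \exp\big\{ -\mathrm{tr}[\Sigma^{-1} (Z - M)^{\mathsf{H}} S^{-1} (Z - M)] \big\}$$

$$\propto \exp\big[ -\mathrm{tr}(\Sigma^{-1} Z^{\mathsf{H}} S^{-1} Z) + \mathrm{tr}(\Sigma^{-1} Z^{\mathsf{H}} S^{-1} M) + \mathrm{tr}(\Sigma^{-1} M^{\mathsf{H}} S^{-1} Z) \big]$$

where $^{\mathsf{H}}$ denotes the Hermitian (conjugate) transpose.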

Noncoherent Detection of Space-time Codes in a Rayleigh Fading Channel

Consider the multiple-input multiple-output (MIMO) flat-fading channel where the $n_R \times 1$ vector signal received by the receiver array at time $t$ is modeled as

$$x(t) = H \phi(t) + w(t), \quad t = 1, \ldots, N$$

where $H$ is the $n_R \times n_T$ channel-response matrix, $\phi(t)$ is the $n_T \times 1$ vector of symbols transmitted by $n_T$ transmitter antennas and received by the receiver array at time $t$, and $w(t)$ is additive noise. Note that we can write the above model as

$$\underbrace{[x(1) \cdots x(N)]}_{X} = H\, \underbrace{[\phi(1) \cdots \phi(N)]}_{\Phi} + \underbrace{[w(1) \cdots w(N)]}_{W}. \tag{4}$$

Here, $\Phi$ is a space-time code (multivariate symbol) belonging to an $M$-ary constellation:

$$\Phi \in \{\Phi_0, \Phi_1, \ldots, \Phi_{M-1}\}.$$

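We adopt a priori equiprobable hypotheses:

$$\pi(\Phi) = \frac{1}{M}\, i_{\{\Phi_0, \Phi_1, \ldots, \Phi_{M-1}\}}(\Phi) \tag{5}$$

and assume that $w(t)$, $t = 1, 2, \ldots, N$ are zero-mean i.i.d. with known covariance

$$\mathrm{cov}(w(t)) = \sigma^2 I_{n_R}.$$

Here, $i_A(x)$ denotes the indicator function:

$$i_A(x) = \begin{cases} 1, & x \in A, \\ 0, & \text{otherwise.} \end{cases}$$

Now, the likelihood function for the measurement model (4) is

$$p(X\,|\,\Phi, H) = \mathcal{N}_{n_R \times N}(X\,|\,H\Phi, \sigma^2 I, I) = \frac{1}{\pi^{n_R N}\, |\sigma^2 I|^N} \exp\Big\{ -\frac{1}{\sigma^2}\, \mathrm{tr}[(X - H\Phi)^{\mathsf{H}} (X - H\Phi)] \Big\}.$$

We assume that $\Phi$ and $H$ are independent a priori, i.e.

$$\pi(\Phi, H) = \pi(\Phi)\, \pi(H)$$

where $\pi(\Phi)$ is given in (5) and $\pi(H)$ is chosen according to the following separable Rayleigh-fading model:

$$\pi(H) = \mathcal{N}_{n_R \times n_T}(H\,|\,0, I, \Delta_h) = \frac{1}{\pi^{n_R n_T}\, |\Delta_h|^{n_R}} \exp\big[ -\mathrm{tr}(\Delta_h^{-1} H^{\mathsf{H}} H) \big]$$

where $\Delta_h$ denotes the transmitter fading correlation matrix. Now

$$p(\Phi, H\,|\,X) \propto p(X\,|\,\Phi, H)\, \pi(\Phi)\, \pi(H)$$

$$\propto \exp\Big[ \frac{1}{\sigma^2}\, \mathrm{tr}(\Phi^{\mathsf{H}} H^{\mathsf{H}} X) + \frac{1}{\sigma^2}\, \mathrm{tr}(X^{\mathsf{H}} H \Phi) - \frac{1}{\sigma^2}\, \mathrm{tr}(\Phi^{\mathsf{H}} H^{\mathsf{H}} H \Phi) \Big] \cdot i_{\{\Phi_0, \Phi_1, \ldots, \Phi_{M-1}\}}(\Phi) \cdot \exp\big[ -\mathrm{tr}(\Delta_h^{-1} H^{\mathsf{H}} H) \big]$$

$$= \exp\Big\{ \frac{1}{\sigma^2}\, \mathrm{tr}(\Phi^{\mathsf{H}} H^{\mathsf{H}} X) + \frac{1}{\sigma^2}\, \mathrm{tr}(X^{\mathsf{H}} H \Phi) - \mathrm{tr}\Big[ \underbrace{\Big( \frac{1}{\sigma^2} \Phi \Phi^{\mathsf{H}} + \Delta_h^{-1} \Big)}_{C_H(\Phi)^{-1}} H^{\mathsf{H}} H \Big] \Big\} \cdot i_{\{\Phi_0, \Phi_1, \ldots, \Phi_{M-1}\}}(\Phi)$$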

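implying that

$$p(H\,|\,\Phi, X) \propto \exp\Big\{ \frac{1}{\sigma^2}\, \mathrm{tr}(\Phi^{\mathsf{H}} H^{\mathsf{H}} X) + \frac{1}{\sigma^2}\, \mathrm{tr}(X^{\mathsf{H}} H \Phi) - \mathrm{tr}\Big[ \Big( \frac{1}{\sigma^2} \Phi \Phi^{\mathsf{H}} + \Delta_h^{-1} \Big) H^{\mathsf{H}} H \Big] \Big\}$$

$$= \exp\Big[ -\mathrm{tr}\{ C_H(\Phi)^{-1} H^{\mathsf{H}} H \} + \mathrm{tr}\{ C_H(\Phi)^{-1} \hat{H}(\Phi)^{\mathsf{H}} H \} + \mathrm{tr}\{ C_H(\Phi)^{-1} H^{\mathsf{H}} \hat{H}(\Phi) \} \Big]$$

$$\propto \mathcal{N}_{n_R \times n_T}\Big( H\ \Big|\ \underbrace{\frac{1}{\sigma^2} X \Phi^{\mathsf{H}} \Big( \frac{1}{\sigma^2} \Phi \Phi^{\mathsf{H}} + \Delta_h^{-1} \Big)^{-1}}_{\hat{H}(\Phi)},\ I,\ \underbrace{\Big( \frac{1}{\sigma^2} \Phi \Phi^{\mathsf{H}} + \Delta_h^{-1} \Big)^{-1}}_{C_H(\Phi)} \Big)$$

$$= \frac{1}{\pi^{n_R n_T}\, |C_H(\Phi)|^{n_R}} \exp\Big( -\mathrm{tr}\big\{ C_H(\Phi)^{-1} [H - \hat{H}(\Phi)]^{\mathsf{H}} [H - \hat{H}(\Phi)] \big\} \Big)$$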

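where we have defined

$$\hat{H}(\Phi) = \frac{1}{\sigma^2}\, X \Phi^{\mathsf{H}}\, C_H(\Phi), \qquad C_H(\Phi) = \Big( \frac{1}{\sigma^2} \Phi \Phi^{\mathsf{H}} + \Delta_h^{-1} \Big)^{-1}.$$

Note the following useful facts:

$$\mathrm{tr}[C_H(\Phi)^{-1} H^{\mathsf{H}} \hat{H}(\Phi)] = \frac{1}{\sigma^2}\, \mathrm{tr}[C_H(\Phi)^{-1} H^{\mathsf{H}} X \Phi^{\mathsf{H}} C_H(\Phi)] = \frac{1}{\sigma^2}\, \mathrm{tr}(H^{\mathsf{H}} X \Phi^{\mathsf{H}})$$

$$\mathrm{tr}[C_H(\Phi)^{-1} \hat{H}(\Phi)^{\mathsf{H}} H] = \frac{1}{\sigma^2}\, \mathrm{tr}[C_H(\Phi)^{-1} C_H(\Phi)\, \Phi X^{\mathsf{H}} H] = \frac{1}{\sigma^2}\, \mathrm{tr}(\Phi X^{\mathsf{H}} H).$$

To obtain the marginal posterior pmf of $\Phi$, we apply our notorious trick, keeping track of the terms containing $\Phi$ and $H$:

$$p(\Phi\,|\,X) = \frac{p(\Phi, H\,|\,X)}{p(H\,|\,\Phi, X)}$$

$$\propto \frac{\exp\big[ \frac{1}{\sigma^2}\, \mathrm{tr}(\Phi^{\mathsf{H}} H^{\mathsf{H}} X) + \frac{1}{\sigma^2}\, \mathrm{tr}(X^{\mathsf{H}} H \Phi) - \mathrm{tr}(C_H(\Phi)^{-1} H^{\mathsf{H}} H) \big] \cdot i_{\{\Phi_0, \Phi_1, \ldots, \Phi_{M-1}\}}(\Phi)}{|C_H(\Phi)|^{-n_R} \exp\big\{ -\mathrm{tr}\big[ C_H(\Phi)^{-1} (H - \hat{H}(\Phi))^{\mathsf{H}} (H - \hat{H}(\Phi)) \big] \big\}}$$

$$= i_{\{\Phi_0, \Phi_1, \ldots, \Phi_{M-1}\}}(\Phi)\, |C_H(\Phi)|^{n_R} \exp\big\{ \mathrm{tr}[C_H(\Phi)^{-1} \hat{H}(\Phi)^{\mathsf{H}} \hat{H}(\Phi)] \big\}$$

$$= i_{\{\Phi_0, \Phi_1, \ldots, \Phi_{M-1}\}}(\Phi)\, |C_H(\Phi)|^{n_R} \exp\Big\{ \frac{1}{(\sigma^2)^2}\, \mathrm{tr}[C_H(\Phi)^{-1} C_H(\Phi)\, \Phi X^{\mathsf{H}} X \Phi^{\mathsf{H}} C_H(\Phi)] \Big\}$$

$$= i_{\{\Phi_0, \Phi_1, \ldots, \Phi_{M-1}\}}(\Phi)\, |C_H(\Phi)|^{n_R} \exp\Big\{ \frac{1}{(\sigma^2)^2}\, \mathrm{tr}[\Phi X^{\mathsf{H}} X \Phi^{\mathsf{H}} C_H(\Phi)] \Big\}$$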

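and the maximum-likelihood test becomes:

$$\mathcal{X}_m = \Big\{ X : m = \arg\max_{l \in \{0, 1, \ldots, M-1\}} \Big( |C_H(\Phi_l)|^{n_R} \exp\Big( \frac{1}{(\sigma^2)^2}\, \mathrm{tr}[\Phi_l X^{\mathsf{H}} X \Phi_l^{\mathsf{H}} C_H(\Phi_l)] \Big) \Big) \Big\}.$$

Unitary space-time codes and i.i.d. fading: Suppose that

$$\Phi_m \Phi_m^{\mathsf{H}} = I_{n_T}$$

for all $m$ and the fading is i.i.d., i.e.

$$\Delta_h = \psi_h^2\, I_{n_T}.$$

Then $C_H(\Phi_l) = (\sigma^{-2} + \psi_h^{-2})^{-1} I_{n_T}$ is the same for every $l$, and the above maximum-likelihood test greatly simplifies:

$$\mathcal{X}_m = \Big\{ X : m = \arg\max_{l \in \{0, 1, \ldots, M-1\}} \mathrm{tr}[\Phi_l X^{\mathsf{H}} X \Phi_l^{\mathsf{H}}] \Big\}$$

which is the detector proposed in B.M. Hochwald and T.L. Marzetta, IEEE Trans. Inform. Theory, vol. 46, pp. 543-564, March 2000.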
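The simplified rule for unitary codes can be sketched without any linear-algebra library by using $\mathrm{tr}[\Phi_l X^{\mathsf{H}} X \Phi_l^{\mathsf{H}}] = \| X \Phi_l^{\mathsf{H}} \|_F^2$. The code below is an illustrative sketch: the function names, the two-codeword constellation, and the channel values are hypothetical, and matrices are stored as lists of rows.

```python
def x_phi_h(X, Phi):
    """Product X * Phi^H for X (n_R x N) and Phi (n_T x N), both given as lists of rows."""
    return [[sum(x_rn * phi_tn.conjugate() for x_rn, phi_tn in zip(x_row, phi_row))
             for phi_row in Phi]
            for x_row in X]

def metric(X, Phi):
    """tr[Phi X^H X Phi^H], computed as the squared Frobenius norm of X * Phi^H."""
    return sum(abs(v) ** 2 for row in x_phi_h(X, Phi) for v in row)

def detect(X, constellation):
    """Index l maximizing tr[Phi_l X^H X Phi_l^H] over the constellation."""
    return max(range(len(constellation)), key=lambda l: metric(X, constellation[l]))

# Hypothetical setup: n_T = 1, N = 2, n_R = 2, two unitary codewords (Phi Phi^H = 1)
constellation = [
    [[1 + 0j, 0j]],   # Phi_0
    [[0j, 1 + 0j]],   # Phi_1
]
H = [[0.8 + 0.3j], [-0.5j]]   # channel, unknown to the detector
Phi_sent = constellation[1]
# Noiseless received block X = H * Phi_sent
X = [[sum(H[r][t] * Phi_sent[t][n] for t in range(1)) for n in range(2)] for r in range(2)]

print(detect(X, constellation))
```

The detector never uses `H`: it only projects the received block onto each candidate codeword and picks the one capturing the most energy.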