Chapter 2. Binary and M-ary Hypothesis Testing 2.1 Introduction (Levy 2.1)
Detection problems can usually be cast as binary or M-ary hypothesis testing problems.

Applications:

This chapter: simple hypothesis testing problems, where the probability distribution of the observations under each hypothesis is assumed to be known exactly. Example:

Composite hypothesis testing: problems involving unknown parameters (Chapter 4). Example:
Objectives:

1. Design testing rules that are optimal in some appropriate sense, based on the amount of information available.
2. Analyze the performance of the test.

Structure of this chapter:

2.2 Binary Hypothesis Testing Problem Formulation: problem modeling, notation, performance measure, etc.
2.3 Bayesian Test: a-priori probabilities known, cost structure known.
2.4 Minimax Test: a-priori probabilities unknown, cost structure known.
2.5 Neyman-Pearson Test: a-priori probabilities unknown, cost structure unknown; receiver operating characteristic (ROC).
2.6 Gaussian Detection
2.7 M-ary Hypothesis Testing
2.2 Binary Hypothesis Testing Problem Formulation (Levy 2.2 and 2.4)

Binary hypothesis testing is to decide between two hypotheses based on a (random) observation. The model contains:

1. Hypotheses and a-priori probabilities
2. Observation
3. Connection between hypotheses and observation
4. Decision function
5. Performance measure
1. Hypotheses and a-priori probabilities

Hypotheses: $H_0$ and $H_1$, with a-priori probabilities $\pi_0 = P(H_0)$ and $\pi_1 = P(H_1) = 1 - \pi_0$.

2. Observation

A random vector $Y$ with sample space $\mathcal{Y}$. An observation is a sample vector $y$ of $Y$.

3. Connection between hypotheses and observation: the distributions of $Y$ under $H_0$ and $H_1$, assumed known in this chapter.

For continuous $Y$, the PDF under each hypothesis:
$$H_0: Y \sim f(y|H_0), \qquad H_1: Y \sim f(y|H_1).$$

For discrete $Y$, the PMF under each hypothesis:
$$H_0: P(Y = y|H_0) = p(y|H_0), \qquad H_1: P(Y = y|H_1) = p(y|H_1).$$
4. Decision function

Decide whether $H_0$ or $H_1$ is true given an observation: a map from $\mathcal{Y}$ to $\{0, 1\}$,
$$\delta(y) = \begin{cases} 1 & \text{if we decide on } H_1 \\ 0 & \text{if we decide on } H_0. \end{cases}$$

Decision regions: $\mathcal{Y}_0 \triangleq \{y : \delta(y) = 0\}$ and $\mathcal{Y}_1 \triangleq \{y : \delta(y) = 1\}$. We have $\mathcal{Y}_0 \cap \mathcal{Y}_1 = \emptyset$ and $\mathcal{Y}_0 \cup \mathcal{Y}_1 = \mathcal{Y}$, so a decision function is a partition of the sample space of $Y$.

Examples of decision rules:
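As a toy sketch of the partition idea (the threshold rule and the value of eta here are purely illustrative, not from the notes), a decision function $\delta$ splits a sampled observation space into the two regions $\mathcal{Y}_0$ and $\mathcal{Y}_1$:

```python
# Hypothetical threshold decision rule: delta(y) = 1 iff y > eta.
# It induces the partition Y0 = {y : delta(y)=0}, Y1 = {y : delta(y)=1}.
def delta(y, eta=0.5):
    """Decide H1 (return 1) iff the observation exceeds the threshold."""
    return 1 if y > eta else 0

samples = [-2.0 + 0.5 * k for k in range(9)]   # a finite stand-in for the sample space
Y0 = [y for y in samples if delta(y) == 0]
Y1 = [y for y in samples if delta(y) == 1]

# Disjoint regions that together cover the whole (sampled) space:
assert set(Y0).isdisjoint(Y1) and len(Y0) + len(Y1) == len(samples)
```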
Goal: obtain the decision rule that is optimal (in some sense).

5. An optimality/performance measure

Bayesian formulation: all uncertainties are quantifiable, and the costs and benefits of outcomes can be measured.

Cost function: $C_{ij}$, for $i = 0, 1$ and $j = 0, 1$, is the cost of deciding on $H_i$ when $H_j$ holds. The value of $C_{ij}$ depends on the application/nature of the problem. Examples of cost functions:

Assumption: making a correct decision is always less costly than making a mistake, i.e., $C_{00} < C_{10}$ and $C_{11} < C_{01}$.
Bayes risk of a decision function $\delta$:

Risk under $H_0$:
$$R(\delta|H_0) = C_{00} P(\delta(Y) = 0|H_0) + C_{10} P(\delta(Y) = 1|H_0) = C_{00} P(\mathcal{Y}_0|H_0) + C_{10} P(\mathcal{Y}_1|H_0).$$

Risk under $H_1$:
$$R(\delta|H_1) = C_{01} P(\delta(Y) = 0|H_1) + C_{11} P(\delta(Y) = 1|H_1) = C_{01} P(\mathcal{Y}_0|H_1) + C_{11} P(\mathcal{Y}_1|H_1).$$

Continuous $Y$: $P(\mathcal{Y}_i|H_j) = \int_{\mathcal{Y}_i} f(y|H_j)\,dy$. Discrete $Y$: $P(\mathcal{Y}_i|H_j) = \sum_{y \in \mathcal{Y}_i} p(y|H_j)$.
Bayes risk:
$$R(\delta) = R(\delta|H_0) P(H_0) + R(\delta|H_1) P(H_1) = \pi_0 C_{00} P(\mathcal{Y}_0|H_0) + \pi_0 C_{10} P(\mathcal{Y}_1|H_0) + \pi_1 C_{01} P(\mathcal{Y}_0|H_1) + \pi_1 C_{11} P(\mathcal{Y}_1|H_1) = \sum_{i=0}^{1} \sum_{j=0}^{1} \pi_j C_{ij} P(\mathcal{Y}_i|H_j).$$

Since $P(\mathcal{Y}_0|H_0) + P(\mathcal{Y}_1|H_0) = P(\mathcal{Y}_0|H_1) + P(\mathcal{Y}_1|H_1) = 1$,
$$R(\delta) = \pi_0 C_{00} + \pi_0 (C_{10} - C_{00}) P(\mathcal{Y}_1|H_0) + \pi_1 C_{01} + \pi_1 (C_{11} - C_{01}) P(\mathcal{Y}_1|H_1).$$

Optimal $\delta$: the $\delta$ that minimizes the Bayes risk $R(\delta)$.
False alarm: $H_0$ is true but $H_1$ is decided (error of Type I).
Detection: $H_1$ is true and $H_1$ is decided.
Miss: $H_1$ is true but $H_0$ is decided (error of Type II).

Probability of detection: $P_D(\delta) = P(\mathcal{Y}_1|H_1)$.
Probability of false alarm: $P_F(\delta) = P(\mathcal{Y}_1|H_0)$.
Probability of miss: $P_M(\delta) = P(\mathcal{Y}_0|H_1) = 1 - P_D(\delta)$.

$$R(\delta) = \pi_0 C_{00} + \pi_0 (C_{10} - C_{00}) P_F(\delta) + \pi_1 C_{01} + \pi_1 (C_{11} - C_{01}) P_D(\delta).$$

Ideally, we want $P_D(\delta) \to 1$ and $P_F(\delta) \to 0$.

Receiver operating characteristic (ROC): the upper boundary between the achievable and unachievable regions in the $(P_F, P_D)$-square.
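The expression of the Bayes risk through $P_F$ and $P_D$ is easy to check numerically. A minimal sketch (the cost values, prior, and operating point below are hypothetical):

```python
# Bayes risk via P_F and P_D:
#   R(delta) = pi0*C00 + pi0*(C10-C00)*P_F + pi1*C01 + pi1*(C11-C01)*P_D
def bayes_risk(pi0, C, PF, PD):
    """Bayes risk of a test with false-alarm prob PF and detection prob PD.
    C[i][j] = C_ij, the cost of deciding H_i when H_j holds."""
    pi1 = 1.0 - pi0
    return (pi0 * C[0][0] + pi0 * (C[1][0] - C[0][0]) * PF
            + pi1 * C[0][1] + pi1 * (C[1][1] - C[0][1]) * PD)

# With uniform error costs C_ij = 1 - delta_ij the risk reduces to the
# probability of error: pi0*P_F + pi1*(1 - P_D).
C = [[0.0, 1.0], [1.0, 0.0]]
r = bayes_risk(0.4, C, PF=0.1, PD=0.8)
assert abs(r - (0.4 * 0.1 + 0.6 * (1 - 0.8))) < 1e-12
```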
2.3 Bayesian Testing (Levy 2.2)

Assume:
1. A-priori probabilities $(\pi_0, \pi_1)$ are known.
2. Cost structure ($C_{ij}$) is known.

Find the optimal decision rule $\delta$ that minimizes the Bayes risk $R(\delta)$.

Note: the distribution of $Y$ under each hypothesis is also known.
2.3.1 Likelihood-Ratio Test (LRT)

From the previous derivation,
$$R(\delta) = \pi_0 C_{00} + \pi_1 C_{01} + \int_{\mathcal{Y}_1} \left[ \pi_0 (C_{10} - C_{00}) f(y|H_0) + \pi_1 (C_{11} - C_{01}) f(y|H_1) \right] dy.$$

$R(\delta)$ is minimized if and only if $\mathcal{Y}_1$ collects exactly the points where the integrand is non-positive:
$$\pi_0 (C_{10} - C_{00}) f(y|H_0) + \pi_1 (C_{11} - C_{01}) f(y|H_1) \le 0$$
$$\pi_0 (C_{10} - C_{00}) f(y|H_0) \le \pi_1 (C_{01} - C_{11}) f(y|H_1)$$
$$\frac{f(y|H_1)}{f(y|H_0)} \ge \frac{(C_{10} - C_{00})\pi_0}{(C_{01} - C_{11})\pi_1}, \quad \text{since } C_{01} > C_{11}.$$
Define the likelihood ratio as
$$L(y) \triangleq \frac{f(y|H_1)}{f(y|H_0)}.$$

The optimal decision rule is the likelihood-ratio test (LRT)
$$L(y) \overset{H_1}{\underset{H_0}{\gtrless}} \tau \triangleq \frac{(C_{10} - C_{00})\pi_0}{(C_{01} - C_{11})\pi_1}, \qquad \delta_B(y) = \begin{cases} 1 & \text{if } L(y) \ge \tau \\ 0 & \text{if } L(y) < \tau. \end{cases}$$

For discrete $Y$, similarly, the optimal decision rule is the LRT
$$L(y) \triangleq \frac{p(y|H_1)}{p(y|H_0)} \overset{H_1}{\underset{H_0}{\gtrless}} \tau.$$
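A minimal sketch of the LRT, assuming for concreteness a Gaussian shift-in-mean pair $H_0: Y \sim \mathcal{N}(0,1)$ vs $H_1: Y \sim \mathcal{N}(\mu_1,1)$ (this particular model, and the numbers below, are illustrative and not part of the derivation above):

```python
import math

def likelihood_ratio(y, mu1=1.0):
    """L(y) = f(y|H1)/f(y|H0) for N(mu1,1) vs N(0,1)."""
    f1 = math.exp(-0.5 * (y - mu1) ** 2) / math.sqrt(2 * math.pi)
    f0 = math.exp(-0.5 * y ** 2) / math.sqrt(2 * math.pi)
    return f1 / f0

def bayes_test(y, pi0, C, mu1=1.0):
    """delta_B(y) = 1 iff L(y) >= tau = (C10-C00)*pi0 / ((C01-C11)*pi1)."""
    pi1 = 1.0 - pi0
    tau = (C[1][0] - C[0][0]) * pi0 / ((C[0][1] - C[1][1]) * pi1)
    return 1 if likelihood_ratio(y, mu1) >= tau else 0

# Uniform costs and equal priors give tau = 1, so the test reduces to
# the ML rule: decide H1 iff y >= mu1/2.
C = [[0.0, 1.0], [1.0, 0.0]]
assert bayes_test(0.4, 0.5, C) == 0   # 0.4 < mu1/2 = 0.5
assert bayes_test(0.6, 0.5, C) == 1   # 0.6 > mu1/2 = 0.5
```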
Maximum A-Posteriori (MAP) Rule

Consider the following cost structure: $C_{00} = C_{11} = 0$, $C_{01} = C_{10} = 1$, i.e., $C_{ij} = 1 - \delta_{ij}$, where $\delta_{ij}$ is the Kronecker delta,
$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j. \end{cases}$$

Every error incurs a unit cost, so minimizing the Bayes risk becomes minimizing the probability of error. The LRT becomes
$$L(y) = \frac{f(y|H_1)}{f(y|H_0)} \overset{H_1}{\underset{H_0}{\gtrless}} \frac{(C_{10} - C_{00})\pi_0}{(C_{01} - C_{11})\pi_1} = \frac{\pi_0}{\pi_1},$$
equivalently,
$$\pi_1 f(y|H_1) \overset{H_1}{\underset{H_0}{\gtrless}} \pi_0 f(y|H_0), \quad \text{i.e.,} \quad P(H_1|Y = y) \overset{H_1}{\underset{H_0}{\gtrless}} P(H_0|Y = y):$$
choose the hypothesis with the larger a-posteriori probability.
Maximum Likelihood (ML) Rule

If, furthermore, $\pi_0 = \pi_1 = 1/2$ (equiprobable hypotheses), the LRT becomes
$$f(y|H_1) \overset{H_1}{\underset{H_0}{\gtrless}} f(y|H_0):$$
choose the hypothesis with the larger likelihood function value.

2.3.2 Examples
2.3.3 Asymptotic Performance of LRT (Levy 3.2)

For a binary hypothesis testing problem, let $Y_1, Y_2, \ldots, Y_N$ be a sequence of i.i.d. random observations, $Y_k \in \mathbb{R}^n$, and write $Y = (Y_1, Y_2, \ldots, Y_N)$. Assume that $Y$ is continuous. The LRT is
$$L(y) = \frac{f(y|H_1)}{f(y|H_0)} = \prod_{k=1}^{N} \frac{f(y_k|H_1)}{f(y_k|H_0)} = \prod_{k=1}^{N} L(y_k) \overset{H_1}{\underset{H_0}{\gtrless}} \tau(N),$$
or, taking logarithms and normalizing,
$$\frac{1}{N} \sum_{k=1}^{N} \ln L(y_k) \overset{H_1}{\underset{H_0}{\gtrless}} \frac{1}{N} \ln \tau(N) \triangleq \gamma(N).$$
Let $Z_k \triangleq \ln L(Y_k) = \ln \frac{f(Y_k|H_1)}{f(Y_k|H_0)}$ and $S_N \triangleq \frac{1}{N} \sum_{k=1}^{N} Z_k$. The LRT becomes $S_N \overset{H_1}{\underset{H_0}{\gtrless}} \gamma(N)$.

Notice that the $Z_k$'s are i.i.d. and $S_N$ is the sample mean of the $Z_k$'s. As $N \to \infty$, by the strong law of large numbers,
$$\text{under } H_1: \quad S_N \xrightarrow{\text{a.s.}} E[Z_k|H_1] = \int \ln \frac{f(y|H_1)}{f(y|H_0)} f(y|H_1)\,dy,$$
$$\text{under } H_0: \quad S_N \xrightarrow{\text{a.s.}} E[Z_k|H_0] = \int \ln \frac{f(y|H_1)}{f(y|H_0)} f(y|H_0)\,dy.$$

Def. For two PDFs $f$ and $g$, the Kullback-Leibler (KL) divergence is
$$D(f\|g) = \int f(x) \ln \frac{f(x)}{g(x)}\,dx.$$
It is a natural notion of distance between distributions, but not a true distance metric.
Properties:
1. $D(f\|g) \ge 0$, with equality if and only if $f = g$.
2. Non-symmetric: in general $D(f\|g) \ne D(g\|f)$.
3. Does not satisfy the triangle inequality.

Let $f_0(y) \triangleq f(y|H_0)$ and $f_1(y) \triangleq f(y|H_1)$. As $N \to \infty$,
$$\text{under } H_1: \quad S_N \xrightarrow{\text{a.s.}} \int \ln \frac{f_1(y)}{f_0(y)} f_1(y)\,dy = D(f_1\|f_0) > 0,$$
$$\text{under } H_0: \quad S_N \xrightarrow{\text{a.s.}} \int \ln \frac{f_1(y)}{f_0(y)} f_0(y)\,dy = -D(f_0\|f_1) < 0.$$

Thus $P_D(N) \to 1$ and $P_F(N) \to 0$: as long as we are willing to collect an arbitrarily large number of independent observations, we can separate $H_0$ and $H_1$ perfectly, regardless of $\pi_0$ and $C_{ij}$.

How fast do $P_D(N) \to 1$ and $P_F(N) \to 0$? Exponentially in $N$.
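The two almost-sure limits are easy to see in a Monte Carlo sketch, again assuming the illustrative Gaussian pair $\mathcal{N}(0,1)$ vs $\mathcal{N}(\mu_1,1)$, for which both KL divergences equal $\mu_1^2/2$:

```python
import math, random

# S_N = (1/N) sum ln L(Y_k) should approach +D(f1||f0) under H1 and
# -D(f0||f1) under H0; for N(0,1) vs N(1,1) both divergences are 0.5.
random.seed(0)
mu1, N = 1.0, 200000

def log_lr(y):
    # ln[f(y|H1)/f(y|H0)] = mu1*y - mu1^2/2 for this Gaussian pair
    return mu1 * y - 0.5 * mu1 ** 2

S_H1 = sum(log_lr(random.gauss(mu1, 1.0)) for _ in range(N)) / N
S_H0 = sum(log_lr(random.gauss(0.0, 1.0)) for _ in range(N)) / N

assert abs(S_H1 - 0.5) < 0.02   # S_N -> +D(f1||f0) under H1
assert abs(S_H0 + 0.5) < 0.02   # S_N -> -D(f0||f1) under H0
```

Any fixed threshold $\gamma(N)$ strictly between the two limits therefore eventually separates the hypotheses with probability one.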
2.4 Minimax Hypothesis Testing (Levy 2.5)

Assume:
1. A-priori probabilities $(\pi_0, \pi_1)$ are unknown.
2. Cost structure ($C_{ij}$) is known.

Possible solutions:
1. Guess the a-priori probabilities. May lead to bad performance.
2. Design the test conservatively: assume the least-favorable choice of a-priori probabilities and select the test that minimizes the Bayes risk for this choice. This minimizes the maximum risk and guarantees a minimum level of performance independent of the a-priori probabilities.

Problem statement: find the test $\delta_M$ and a-priori value $\pi_{0M}$ that solve the minimax problem
$$(\delta_M, \pi_{0M}) = \arg \min_{\delta} \max_{\pi_0 \in [0,1]} R(\delta, \pi_0).$$
Approach: saddle-point method.

Def. A saddle point is a stationary point in the domain of a function that is not a local extremum. If a point $(\delta_M, \pi_{0M})$ satisfies
$$R(\delta_M, \pi_0) \le R(\delta_M, \pi_{0M}) \le R(\delta, \pi_{0M}) \quad \text{for any } \delta, \pi_0, \tag{1}$$
it is a saddle point of the function $R$, and it is the solution of the minimax problem.

Proof outline:
Step 1: A saddle point of the form (1) exists.
Step 2: The saddle point is the solution (saddle-point property).
Step 3: Construct the saddle point.

Minimax equation (equal conditional risks $R(\delta_M|H_0) = R(\delta_M|H_1)$):
$$(C_{01} - C_{00}) + (C_{11} - C_{01}) P_D(\delta_M) - (C_{10} - C_{00}) P_F(\delta_M) = 0.$$
Comments:
1. If $C_{00} = C_{11}$, the minimax equation becomes
$$P_D = 1 - \frac{C_{10} - C_{00}}{C_{01} - C_{11}} P_F,$$
a line through $(0, 1)$ of the $(P_F, P_D)$ square. If $C_{ij} = 1 - \delta_{ij}$, the minimax equation becomes $P_D = 1 - P_F$.
2. The minimax test corresponds to the intersection of the ROC and the line of the minimax equation.
3. The LRT threshold $\tau_M$ of the minimax test equals the slope of the ROC at this intersection point. The corresponding a-priori probability can be calculated as
$$\pi_{0M} = \left[ 1 + \frac{C_{10} - C_{00}}{(C_{01} - C_{11})\tau_M} \right]^{-1}.$$

Another way of finding $\pi_{0M}$: $\pi_{0M} = \arg \max_{\pi_0} \min_{\delta} R(\delta, \pi_0)$. Define $V(\pi_0) \triangleq \min_{\delta} R(\delta, \pi_0)$, the minimum Bayes risk with a-priori probability $\pi_0$, achieved by the LRT; then $\pi_{0M} = \arg \max_{\pi_0} V(\pi_0)$.
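A numerical sketch of finding the minimax operating point, assuming (purely for illustration) the Gaussian pair $H_0: \mathcal{N}(0,1)$ vs $H_1: \mathcal{N}(\mu_1,1)$ and uniform costs $C_{ij} = 1 - \delta_{ij}$, so the minimax equation is $P_D = 1 - P_F$:

```python
import math

# For "decide H1 iff y >= eta": P_F = Q(eta), P_D = Q(eta - mu1),
# where Q is the standard normal tail. Root-find the minimax equation
# P_D - (1 - P_F) = 0 by bisection over eta.
mu1 = 1.0
Q = lambda x: 0.5 * math.erfc(x / math.sqrt(2))  # standard normal tail probability

def minimax_gap(eta):
    PF, PD = Q(eta), Q(eta - mu1)
    return PD - (1.0 - PF)   # decreases monotonically in eta

lo, hi = -5.0, 5.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if minimax_gap(mid) > 0:
        lo = mid
    else:
        hi = mid
eta_M = 0.5 * (lo + hi)

# By symmetry of the two Gaussians the minimax threshold is eta = mu1/2.
assert abs(eta_M - mu1 / 2) < 1e-6
```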
Examples:
2.5 Neyman-Pearson (NP) Testing (Levy 2.4.1)

Assume:
1. A-priori probabilities $(\pi_0, \pi_1)$ are unknown.
2. Cost structure ($C_{ij}$) is unknown.

NP testing problem: select the test $\delta$ that maximizes $P_D(\delta)$ while ensuring that the probability of false alarm $P_F(\delta)$ is no more than $\alpha$:
$$D_\alpha \triangleq \{\delta : P_F(\delta) \le \alpha\}, \qquad \delta_{NP} = \arg \max_{\delta \in D_\alpha} P_D(\delta).$$

Approach: Lagrangian method for constrained optimization.
$$\delta_{NP} = \arg \max_{\delta} P_D(\delta) \quad \text{subject to} \quad P_F(\delta) \le \alpha.$$

Consider the Lagrangian
$$L(\delta, \lambda) \triangleq P_D(\delta) - \lambda (P_F(\delta) - \alpha).$$

A test $\delta$ is optimal if it maximizes $L(\delta, \lambda)$ with $\lambda \ge 0$, $P_F(\delta) \le \alpha$, and $\lambda(\alpha - P_F(\delta)) = 0$ (complementary slackness). Now
$$L(\delta, \lambda) = \int_{\mathcal{Y}_1} f(y|H_1)\,dy + \lambda \alpha - \lambda \int_{\mathcal{Y}_1} f(y|H_0)\,dy = \int_{\mathcal{Y}_1} \left[ f(y|H_1) - \lambda f(y|H_0) \right] dy + \lambda \alpha.$$

$L(\delta, \lambda)$ is maximized when
$$\delta(y) = \begin{cases} 1 & \text{if } f(y|H_1) > \lambda f(y|H_0) \\ 0 & \text{if } f(y|H_1) < \lambda f(y|H_0) \\ 0 \text{ or } 1 & \text{if } f(y|H_1) = \lambda f(y|H_0) \end{cases} = \begin{cases} 1 & \text{if } L(y) > \lambda \\ 0 & \text{if } L(y) < \lambda \\ 0 \text{ or } 1 & \text{if } L(y) = \lambda. \end{cases}$$

Thus $\delta_{NP}$ has to be an LRT, and $\lambda$ must satisfy the KKT conditions.
Let $F_L(l|H_0) \triangleq P(L \le l|H_0)$ be the CDF of the likelihood ratio $L = L(Y)$ under $H_0$, and let $f_0 \triangleq F_L(0|H_0) = P(L = 0|H_0)$. Define two tests:
$$\delta_{L,\lambda}(y) = \begin{cases} 1 & \text{if } L(y) > \lambda \\ 0 & \text{if } L(y) \le \lambda \end{cases} \qquad \delta_{U,\lambda}(y) = \begin{cases} 1 & \text{if } L(y) \ge \lambda \\ 0 & \text{if } L(y) < \lambda. \end{cases}$$

Case 1: If $1 - \alpha < f_0$, let $\lambda = 0$ and $\delta_{NP} = \delta_{L,0}$.

Case 2: If $1 - \alpha \ge f_0$ and there exists a $\lambda$ such that $F_L(\lambda|H_0) = 1 - \alpha$, i.e., $1 - \alpha$ is in the range of $F_L(l|H_0)$, choose this $\lambda$ as the LRT threshold and let $\delta_{NP} = \delta_{L,\lambda}$.

Case 3: If $1 - \alpha \ge f_0$ and $1 - \alpha$ is not in the range of $F_L(l|H_0)$, i.e., there is a discontinuity point $\lambda > 0$ of $F_L(l|H_0)$ such that $F_L(\lambda^-|H_0) < 1 - \alpha < F_L(\lambda|H_0)$, choose this $\lambda$ as the LRT threshold; the NP test is then the randomized test that uses $\delta_{U,\lambda}$ with probability $p$ and $\delta_{L,\lambda}$ with probability $1 - p$, equivalently,
$$\delta_{NP}(y) = \begin{cases} 1 & \text{if } L(y) > \lambda \\ 0 & \text{if } L(y) < \lambda \\ 1 \text{ w.p. } p,\ 0 \text{ w.p. } 1 - p & \text{if } L(y) = \lambda, \end{cases}$$
with $p$ chosen so that $P_F(\delta_{NP}) = \alpha$ exactly.

Comments:
1. When $Y$ is discrete, $F_L(l|H_0)$ is discontinuous; thus a randomized test is usually needed.
2. Similarly, we could minimize $P_F$ under the constraint $P_M(\delta) \le \beta$; a similar solution can be obtained. This problem is called an NP test of Type II; the one discussed above is called an NP test of Type I.

Example:
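Case 3 is the typical situation for discrete observations. A small sketch (the two pmfs and $\alpha$ below are hypothetical): sort the observation values by likelihood ratio, decide $H_1$ outright while the false-alarm budget allows, and randomize on the boundary value so that $P_F = \alpha$ exactly:

```python
# Hypothetical pmfs on the alphabet {0, 1, 2} and target false-alarm level.
p0 = {0: 0.6, 1: 0.3, 2: 0.1}   # pmf under H0
p1 = {0: 0.2, 1: 0.3, 2: 0.5}   # pmf under H1
alpha = 0.2

# Sort values by likelihood ratio L(y) = p1(y)/p0(y), largest first.
ys = sorted(p0, key=lambda y: p1[y] / p0[y], reverse=True)

PF, decide_one, rand = 0.0, set(), {}
for y in ys:
    if PF + p0[y] <= alpha:
        decide_one.add(y)                # L(y) > lambda: always decide H1
        PF += p0[y]
    else:
        rand[y] = (alpha - PF) / p0[y]   # L(y) = lambda: decide H1 w.p. p
        break

# The randomization makes the achieved false-alarm probability exactly alpha.
PF_total = (sum(p0[y] for y in decide_one)
            + sum(p * p0[y] for y, p in rand.items()))
assert abs(PF_total - alpha) < 1e-12
```

Here the boundary value is $y = 1$ with $p = 1/3$, and no deterministic LRT could meet $P_F = 0.2$ with equality.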
ROC Properties

Finding the ROC is naturally the NP test problem, whose solution must be an LRT: $L(y) \overset{H_1}{\underset{H_0}{\gtrless}} \tau$, with
$$P_D(\tau) = \int_{\tau}^{\infty} f_L(l|H_1)\,dl, \qquad P_F(\tau) = \int_{\tau}^{\infty} f_L(l|H_0)\,dl. \tag{2}$$

As $\tau$ varies from $0$ to $\infty$, $(P_F(\tau), P_D(\tau))$ moves continuously along the ROC curve.

1. Let $\tau = 0$. Then $\delta(y) = 1$ always, so $P_D(\delta) = P_F(\delta) = 1$: the point $(1, 1)$ belongs to the ROC.
2. Let $\tau = \infty$. Then $\delta(y) = 0$ always, so $P_D(\delta) = P_F(\delta) = 0$: the point $(0, 0)$ belongs to the ROC.
3. The slope of the ROC at the point $(P_F(\tau), P_D(\tau))$ equals $\tau$.
4. The ROC curve is concave, i.e., the region of achievable pairs $(P_F, P_D)$ is convex.
5. All points on the ROC curve satisfy $P_D \ge P_F$.
6. The region of feasible tests is symmetric about the point $(1/2, 1/2)$: if $(P_F, P_D)$ is feasible, so is $(1 - P_F, 1 - P_D)$.

Example:
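These properties can be checked on a concrete ROC, assuming for illustration the Gaussian pair $H_0: \mathcal{N}(0,1)$ vs $H_1: \mathcal{N}(d,1)$, for which the threshold test gives $P_F(\eta) = Q(\eta)$ and $P_D(\eta) = Q(\eta - d)$:

```python
import math

# Trace the ROC of the Gaussian shift-in-mean detector by sweeping the
# y-domain threshold eta; Q is the standard normal tail probability.
d = 1.0
Q = lambda x: 0.5 * math.erfc(x / math.sqrt(2))

etas = [i * 0.1 for i in range(-50, 51)]
roc = [(Q(e), Q(e - d)) for e in etas]

# The curve runs from near (1,1) to near (0,0) and stays above P_D = P_F.
assert roc[0][0] > 0.99 and roc[0][1] > 0.99
assert roc[-1][0] < 0.01 and roc[-1][1] < 0.01
assert all(PD >= PF for PF, PD in roc)
```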
More informationIEOR E4570: Machine Learning for OR&FE Spring 2015 c 2015 by Martin Haugh. The EM Algorithm
IEOR E4570: Machine Learning for OR&FE Spring 205 c 205 by Martin Haugh The EM Algorithm The EM algorithm is used for obtaining maximum likelihood estimates of parameters when some of the data is missing.
More informationBayesian inference. Justin Chumbley ETH and UZH. (Thanks to Jean Denizeau for slides)
Bayesian inference Justin Chumbley ETH and UZH (Thanks to Jean Denizeau for slides) Overview of the talk Introduction: Bayesian inference Bayesian model comparison Group-level Bayesian model selection
More informationNaïve Bayes classification. p ij 11/15/16. Probability theory. Probability theory. Probability theory. X P (X = x i )=1 i. Marginal Probability
Probability theory Naïve Bayes classification Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s height, the outcome of a coin toss Distinguish
More informationECE531 Screencast 11.4: Composite Neyman-Pearson Hypothesis Testing
ECE531 Screencast 11.4: Composite Neyman-Pearson Hypothesis Testing D. Richard Brown III Worcester Polytechnic Institute Worcester Polytechnic Institute D. Richard Brown III 1 / 8 Basics Hypotheses H 0
More informationFinite sample size optimality of GLR tests. George V. Moustakides University of Patras, Greece
Finite sample size optimality of GLR tests George V. Moustakides University of Patras, Greece Outline Hypothesis testing and GLR Randomized tests Classical hypothesis testing with randomized tests Alternative
More informationLecture 3: More on regularization. Bayesian vs maximum likelihood learning
Lecture 3: More on regularization. Bayesian vs maximum likelihood learning L2 and L1 regularization for linear estimators A Bayesian interpretation of regularization Bayesian vs maximum likelihood fitting
More informationStatistical Signal Processing Detection, Estimation, and Time Series Analysis
Statistical Signal Processing Detection, Estimation, and Time Series Analysis Louis L. Scharf University of Colorado at Boulder with Cedric Demeure collaborating on Chapters 10 and 11 A TT ADDISON-WESLEY
More information2.1 Optimization formulation of k-means
MGMT 69000: Topics in High-dimensional Data Analysis Falll 2016 Lecture 2: k-means Clustering Lecturer: Jiaming Xu Scribe: Jiaming Xu, September 2, 2016 Outline Optimization formulation of k-means Convergence
More informationOptimization using Calculus. Optimization of Functions of Multiple Variables subject to Equality Constraints
Optimization using Calculus Optimization of Functions of Multiple Variables subject to Equality Constraints 1 Objectives Optimization of functions of multiple variables subjected to equality constraints
More informationLecture 4 September 15
IFT 6269: Probabilistic Graphical Models Fall 2017 Lecture 4 September 15 Lecturer: Simon Lacoste-Julien Scribe: Philippe Brouillard & Tristan Deleu 4.1 Maximum Likelihood principle Given a parametric
More informationSYDE 372 Introduction to Pattern Recognition. Probability Measures for Classification: Part I
SYDE 372 Introduction to Pattern Recognition Probability Measures for Classification: Part I Alexander Wong Department of Systems Design Engineering University of Waterloo Outline 1 2 3 4 Why use probability
More informationEECS564 Estimation, Filtering, and Detection Exam 2 Week of April 20, 2015
EECS564 Estimation, Filtering, and Detection Exam Week of April 0, 015 This is an open book takehome exam. You have 48 hours to complete the exam. All work on the exam should be your own. problems have
More informationIntroduction to Machine Learning
What does this mean? Outline Contents Introduction to Machine Learning Introduction to Probabilistic Methods Varun Chandola December 26, 2017 1 Introduction to Probability 1 2 Random Variables 3 3 Bayes
More informationBayesian Decision Theory
Introduction to Pattern Recognition [ Part 4 ] Mahdi Vasighi Remarks It is quite common to assume that the data in each class are adequately described by a Gaussian distribution. Bayesian classifier is
More informationMachine Learning Linear Classification. Prof. Matteo Matteucci
Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)
More informationLecture 17: Density Estimation Lecturer: Yihong Wu Scribe: Jiaqi Mu, Mar 31, 2016 [Ed. Apr 1]
ECE598: Information-theoretic methods in high-dimensional statistics Spring 06 Lecture 7: Density Estimation Lecturer: Yihong Wu Scribe: Jiaqi Mu, Mar 3, 06 [Ed. Apr ] In last lecture, we studied the minimax
More informationBayesian Learning. Bayesian Learning Criteria
Bayesian Learning In Bayesian learning, we are interested in the probability of a hypothesis h given the dataset D. By Bayes theorem: P (h D) = P (D h)p (h) P (D) Other useful formulas to remember are:
More informationIntelligent Systems Statistical Machine Learning
Intelligent Systems Statistical Machine Learning Carsten Rother, Dmitrij Schlesinger WS2014/2015, Our tasks (recap) The model: two variables are usually present: - the first one is typically discrete k
More informationUncertainty. Jayakrishnan Unnikrishnan. CSL June PhD Defense ECE Department
Decision-Making under Statistical Uncertainty Jayakrishnan Unnikrishnan PhD Defense ECE Department University of Illinois at Urbana-Champaign CSL 141 12 June 2010 Statistical Decision-Making Relevant in
More informationParametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory
Statistical Inference Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory IP, José Bioucas Dias, IST, 2007
More information