1 Biometric Authentication: A Copula Based Approach


Satish G. Iyengar (a), Pramod K. Varshney (a) and Thyagaraju Damarla (b)
(a) EECS Department, Syracuse University, Syracuse, NY 13244, USA
(b) Army Research Laboratory, Adelphi, MD, USA

1.1 Introduction

Biometrics involves the design of automatic human recognition systems that use physical features such as face, fingerprints and iris, or behavioral traits such as gait or rate of keystrokes, as passwords. For example, in building access control applications, a person's face may be matched to templates stored in a database of all enrolled users. The decision to allow or deny entry is then based on the similarity score generated by the face matching algorithm. Security systems that rely on biometrics have several advantages over conventional ones in which alphanumeric personal identification numbers (PINs) are issued to the users. For example, a PIN, if leaked, may be used by an unauthorized person, causing serious security concerns. A person's physical signature, however, belongs only to that individual and is very difficult, if not impossible, to emulate. Further, biometric systems may be more convenient and user-friendly as there is no code to remember or token to carry.

However, there exist several limitations. Biometric traits such as face and voice change with age, so the system's database may need to be updated to counter this time variability. Environmental noise and noise in the acquisition system further affect the accuracy and reliability of the system. Overlap between physical features, or inter-class similarity (e.g., twins with identical facial features), limits the system's ability to distinguish between the classes. There also exist intra-class variations due to differences between the acquired biometric signature of an individual requesting access and his/her template registered in the database.
Apart from the noise sources stated above, these differences may also stem from the psychological and behavioral variations of an individual at different instances of time. One method to overcome these limitations is to combine multiple sources of information. It

may include fusing observations of disparate modalities (e.g., voice and face), multiple features (extracted from the same biometric trait), multiple classifiers or multiple samples of the same source. This method of fusing several biometric sources is called multi-biometrics. Figure 1.1 shows a multimodal biometric system that fuses disparate biometric signatures such as face, iris and fingerprints.

Fig. 1.1 A multi-biometric authentication system. Biometric signatures of disparate modalities such as face, iris and fingerprint are fused.

Fusion of multimodal biometrics offers several advantages and new possibilities for system improvement. For example, in video/image based person authentication systems for security and access control applications, the system performance degrades when the subjects age or when the lighting conditions are poor. The presence of an audio signature along with video would overcome many of these difficulties. In other words, noise in one modality may not affect the other, thus making the system more robust to noise. Secondly, multiple modalities may contain complementary information relevant to the verification task. The level of detail and the type of information present in one modality may be different from the other. An efficient system design is one which exploits this heterogeneity of the multimodal data set. In this chapter, we concern ourselves with the design of rules for fusing

different biometric signatures and describe a new approach based on copula theory (Dass et al. 2005, Iyengar et al. and Iyengar et al. 2009). We show how copula theory enables the development of a mathematical framework for fusion of data from heterogeneous sources. We also consider the problem of analyzing the effect of inter-modal dependence on system performance and describe some recent results from Iyengar et al. The chapter provides a detailed exposition of copula theory and its applicability to biometric authentication. Although Fig. 1.1 implies fusion of multimodal observations, the developed framework is general enough to handle other problems where the heterogeneity is due to multiple samples, algorithms or classifiers; i.e., the measurements $z_1, z_2, \ldots, z_N$ in Fig. 1.1 may also denote multiple samples, multiple features (extracted from the same modality) or outputs of multiple algorithms which are combined (jointly processed) at a fusion center.

1.2 Fusion of Multi-biometrics

A biometric authentication task is essentially a binary hypothesis testing problem in which data (biometric signatures) from several modalities are fused to test the hypothesis $H_1$ against $H_0$, where

$H_0$: claimant is an impostor
$H_1$: claimant is a genuine user

Fusion of information from multiple biometric sensors can be classified into three different levels:

Data or feature level fusion: Local observations from each source are processed directly or are transformed so that only relevant features are extracted and retained for further processing. The resultant features from each source are then combined to obtain a global decision regarding the identity of the human under test. This method is most efficient in terms of performance as it involves minimum information loss.
However, the feature sets obtained are typically of high dimensionality and one often suffers from the well-known curse of dimensionality. Further, the design of a fusion rule may not be straightforward as the acquired feature sets may not be commensurate (e.g., audio and corresponding video features).

Score level fusion: Fusion of match scores is second to feature level fusion in terms of performance efficiency, as the input undergoes moderate information loss in its transformation to similarity scores. Similarity (or dissimilarity) scores are obtained for each modality by matching the input

data (or features) to those stored in the system's database. The approach is then to fuse the scores obtained by matching the input at each source to its corresponding template. The range of score values may differ across matching algorithms, and one of the challenges in designing a score level fusion rule is to accommodate this heterogeneity.

Decision level fusion: Local decisions regarding the presence or absence of a genuine user are obtained by processing each modality independently. The binary data thus obtained for each source are subsequently fused to make a global decision. A significant reduction in system complexity can be obtained, but at the cost of potentially high information loss due to the binary quantization of individual features.

Further, methods for multi-biometric fusion can be classified into two broad categories (Jain et al. 2005): (a) the classification approach and (b) the combination approach. As an example, consider the problem of fusing information from an audio and a video sensor. Approach (a) involves training a classifier to jointly process the audio-visual features or matching scores and classify the claimant as a genuine user (Accept) or an impostor (Reject). The classifier is expected to learn the decision boundary irrespective of how the feature vectors or matching scores are generated, so no processing is required before feeding them into the classifier. Several classifiers based on neural networks, k-NN and support vector machines (SVM), as well as classifiers based on the likelihood principle, have been used in the literature. Approach (b), on the other hand, combines all the feature vectors or matching scores into a single scalar score, and the global decision regarding the presence or absence of a genuine user is based on this combined score.
Thus, in order to achieve a meaningful combination, the features obtained from disparate modalities must be transformed or normalized to a common domain. The design of such a transform is not straightforward; it is often data-dependent and requires extensive empirical evaluation (Toh et al. 2004, Snelick et al. 2005, Jain et al. 2005).

Our interest in this chapter is in the likelihood ratio based fusion approach. The method does not require the data to be transformed to make them commensurable. Further, the method has a strong theoretical foundation and is provably optimal (in both the Neyman-Pearson (NP) and the Bayesian sense) (Lehmann 2008). However, it requires complete knowledge of the joint probability density functions (PDFs) of the multiple genuine features, $f_{\mathbf{Z}}(\mathbf{z} \mid H_1)$, and impostor features, $f_{\mathbf{Z}}(\mathbf{z} \mid H_0)$. Estimation of these PDFs is thus one major challenge, and optimality suffers if there is a mismatch between the true and the estimated joint PDFs.

Next, we discuss statistical modeling of heterogeneous data (features or scores).

1.3 Statistical Modeling of Heterogeneous Biometric Data

A parametric approach to statistical signal processing applications such as detection, estimation and tracking necessitates complete specification of the joint PDF of the observed samples. However, in many cases, the derivation of the joint PDF becomes mathematically intractable. In problems such as fusion of multi-biometrics, random variables associated with each biometric trait may follow probability distributions that are different from one another. For example, it is highly likely that features (or match scores) derived from face and acoustic measurements follow disparate PDFs. The differences in the physics governing each modality result in disparate marginal (univariate) distributions. There may also be differences in signal dimensionality, support and sampling rate requirements across modalities. Moreover, the marginal distributions may exhibit non-zero statistical dependence due to complex inter-modal interactions. Deriving the underlying dependence structure is a challenge and may not always be possible. We can thus identify the following two challenges when modeling the joint distribution of heterogeneous data:

(i) Quantification of inter-modal dependence and interactions
(ii) Derivation of the joint probability density function (PDF) of dependent heterogeneous measurements when the underlying marginals follow disparate distributions

Prabhakar and Jain (2002) use non-parametric density estimation to combine the scores obtained from four fingerprint matching algorithms and use likelihood ratio based fusion to make the final decision. Several issues, such as the selection of the kernel bandwidth and density estimation at the tails, complicate this approach.
More recently, Nandakumar et al. consider the use of finite Gaussian mixture models (GMM) for the genuine and impostor score densities. They show that GMMs are easier to implement than kernel density estimators (KDE) while also achieving high system performance. However, GMMs require selecting an appropriate number of Gaussian components. Using too many components may over-fit the data, while using too few may not approximate the true density well. They use a GMM fitting algorithm developed by Figueiredo and Jain (2002), which automatically estimates the number of components and the component parameters using the expectation-maximization (EM) algorithm and the minimum message length criterion.

We present, in this chapter, an alternative approach based on copula theory. We show how copula functions possess all the ingredients necessary for modeling the joint PDF of heterogeneous data. One of the main advantages of the copula approach is that it allows us to express the log-likelihood ratio as a sum of two terms: the first corresponds to the strategies employed by the individual modalities and the second to cross-modal processing. This allows us to quantify separately the system performance due only to the model differences across the two hypotheses and that contributed only by the cross-modal interactions. Thus, it provides an elegant framework to study the effects of cross-modal interactions. Further, as will be evident later, there is also a reduction in multivariate PDF estimation complexity, as the estimation problem can be split into two steps:

(i) Estimation of only the marginal distributions
(ii) Estimation of the copula parameter

The use of nonparametric measures such as Kendall's $\tau$ (as opposed to MLE) to estimate the copula parameter further reduces the computational complexity. We now explain how copula theory can be exploited to address the modeling issues discussed above. We begin with the following definition.

Definition 1 A random vector $\mathbf{Z} = \{Z_n\}_{n=1}^{N}$ governing the joint statistics of an $N$-variate data set can be termed heterogeneous if the marginals $Z_n$ are non-identically distributed. The variables $Z_n$ may exhibit statistical dependence in that

$$f_{\mathbf{Z}}(\mathbf{z}) \neq \prod_{n=1}^{N} f_{Z_n}(z_n) \tag{1.1}$$

The goal is to construct the joint PDF $f_{\mathbf{Z}}(\mathbf{z})$ of the heterogeneous random vector $\mathbf{Z}$. Definition 1 is, of course, inclusive of the special case when the marginals are identically distributed and/or are statistically independent. Characterizing multivariate statistical dependence is one of the most widely researched topics and has always been a difficult problem (Mari and Kotz 2001).
The most commonly used bivariate measure, Pearson's correlation $\rho$, captures only the linear relationship between variables and is a weak measure of dependence when dealing with non-Gaussian random variables. Two random variables $X$ and $Y$ are said to be uncorrelated if the covariance,

$$\Sigma_{X,Y} = E(XY) - E(X)E(Y) \tag{1.2}$$

is zero, i.e., $\rho = \frac{\Sigma_{X,Y}}{\sqrt{\sigma_X^2 \sigma_Y^2}} = 0$. Statistical independence has a stricter requirement in that $X$ and $Y$ are independent only if their joint density can be factored as the product of the marginals. In general, a zero correlation does not guarantee independence (except when the variables are jointly Gaussian). For example, we see that though dependence of one variable on the other is evident in the scatter plots (Fig. 1.2 a and b), the correlation coefficient is zero.

Fig. 1.2 Illustrative example showing that the correlation coefficient $\rho$ is a weak measure of dependence.

The problem is further compounded when dependent heterogeneous random variables with disparate PDFs are involved. Often one then chooses to assume multivariate Gaussianity or inter-modal independence (also called the product model) to construct a tractable statistical model. A multivariate Gaussian model requires the marginals to be Gaussian and thus fails to incorporate the true non-Gaussian marginal PDFs. Assuming statistical independence neglects inter-modal dependence, thus leading to suboptimal solutions. As we will show later, a copula based model for dependent heterogeneous random vectors allows us to retain the marginal PDFs as well as capture the inter-modal dependence information.

Copula Theory and its Implications

Copulas are functions that couple multivariate joint distributions to their component marginal distribution functions (Nelsen 1999, Kurowicka and Cooke 2006). The main advantage of the copula based approach is that it allows us to define inter-modal dependence irrespective of the underlying marginal distributions. One can construct joint distributions with arbitrary marginals and the desired dependence structure. This is well suited for heterogeneous random vectors.
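As a numerical aside on the earlier remark that zero correlation does not guarantee independence, the following minimal sketch builds a pair in which one variable is a deterministic function of the other, yet Pearson's correlation vanishes. The symmetric grid and the choice Y = X^2 are illustrative assumptions, not the data behind Fig. 1.2.

```python
def pearson(xs, ys):
    """Sample Pearson correlation: covariance over the product of standard deviations."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5

# Hypothetical data: X on a grid symmetric about zero, Y fully determined by X.
xs = [i / 100 for i in range(-100, 101)]
ys = [x * x for x in xs]

rho = pearson(xs, ys)
print(f"Pearson correlation of (X, X^2): {rho:.2e}")
```

The printed value is zero up to floating-point rounding even though Y is completely determined by X, which is the kind of dependence a linear measure cannot see.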

Sklar (1959) was the first to define copula functions.

Theorem 1 (Sklar's Theorem) Let $F_{\mathbf{Z}}(z_1, z_2, \ldots, z_N)$ be the joint cumulative distribution function (CDF) with continuous marginal CDFs $F_{Z_1}(z_1), F_{Z_2}(z_2), \ldots, F_{Z_N}(z_N)$. Then there exists a copula function $C(\cdot)$ such that for all $z_1, z_2, \ldots, z_N$ in $[-\infty, \infty]$,

$$F_{\mathbf{Z}}(z_1, z_2, \ldots, z_N) = C(F_{Z_1}(z_1), F_{Z_2}(z_2), \ldots, F_{Z_N}(z_N)) \tag{1.3}$$

For continuous marginals, $C(\cdot)$ is unique; otherwise $C(\cdot)$ is uniquely determined on $\mathrm{Ran}F_{Z_1} \times \mathrm{Ran}F_{Z_2} \times \cdots \times \mathrm{Ran}F_{Z_N}$, where $\mathrm{Ran}X$ denotes the range of $X$. Conversely, if $C(\cdot)$ is a copula and $F_{Z_1}(z_1), F_{Z_2}(z_2), \ldots, F_{Z_N}(z_N)$ are marginal CDFs, then the function $F_{\mathbf{Z}}(\cdot)$ in (1.3) is a valid joint CDF with the marginals $F_{Z_1}(z_1), F_{Z_2}(z_2), \ldots, F_{Z_N}(z_N)$.

Note that the copula function $C(u_1, u_2, \ldots, u_N)$ is itself a CDF with uniform marginals, as $u_n = F_{Z_n}(z_n) \sim \mathcal{U}(0, 1)$ (by the probability integral transform). The copula based joint PDF of $N$ continuous heterogeneous random variables can now be obtained by taking an $N$th order derivative of (1.3),

$$f_{\mathbf{Z}}(\mathbf{z}) = \left( \prod_{n=1}^{N} f_{Z_n}(z_n) \right) c(F_{Z_1}(z_1), \ldots, F_{Z_N}(z_N)) = f^{c}_{\mathbf{Z}}(\mathbf{z}) \tag{1.4}$$

where $\mathbf{Z} = [Z_1, Z_2, \ldots, Z_N]$ and we use the superscript $c$ to denote that $f^{c}_{\mathbf{Z}}(\mathbf{z})$ is the copula representation of $f_{\mathbf{Z}}(\mathbf{z})$. Note that we need to know the true copula density function $c(\cdot)$ to have an exact representation as in (1.4).

We emphasize here that any joint PDF with continuous marginals can be written in terms of a copula function as in (1.4) (due to Sklar's theorem). However, identifying the true copula is not a straightforward task. A common approach then is to select a copula function $k(\cdot)$ a priori and fit the given marginals and the desired dependence structure to derive the joint distribution. Model mismatch errors are thus introduced when $k(\cdot) \neq c(\cdot)$, i.e., when the selected copula is not equal to the true dependence structure given by $c(\cdot)$, and it is essential to account for this error and its effect on the detection performance.
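The construction in (1.4) can be sketched for N = 2 as follows. This is only an illustration under stated assumptions: the Gaussian copula is picked as the density, and the exponential and normal marginals, the parameter names `rate`, `mu`, `sigma` and the function names are hypothetical choices made here, not taken from the chapter.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for the inverse CDF

def gaussian_copula_density(u1, u2, theta):
    """Bivariate Gaussian copula density c(u1, u2; theta), with |theta| < 1."""
    x1, x2 = _STD.inv_cdf(u1), _STD.inv_cdf(u2)
    det = 1.0 - theta * theta
    expo = -(theta * theta * (x1 * x1 + x2 * x2) - 2.0 * theta * x1 * x2) / (2.0 * det)
    return math.exp(expo) / math.sqrt(det)

def joint_pdf(z1, z2, theta, rate=1.0, mu=0.0, sigma=1.0):
    """Equation (1.4) for N = 2: product of the marginal PDFs times the copula density.

    Assumed marginals: Z1 ~ Exponential(rate) with z1 > 0, Z2 ~ Normal(mu, sigma).
    """
    f1 = rate * math.exp(-rate * z1)       # marginal PDF of Z1
    F1 = 1.0 - math.exp(-rate * z1)        # marginal CDF of Z1
    normal = NormalDist(mu, sigma)
    f2 = normal.pdf(z2)                    # marginal PDF of Z2
    F2 = normal.cdf(z2)                    # marginal CDF of Z2
    return f1 * f2 * gaussian_copula_density(F1, F2, theta)

print(joint_pdf(1.0, 0.5, theta=0.0))  # reduces to the product model
print(joint_pdf(1.0, 0.5, theta=0.6))  # same marginals, dependent pair
```

Setting `theta = 0` makes the copula density identically one, so the joint PDF collapses to the product of the marginals; a non-zero `theta` changes only the dependence term and leaves both marginals untouched.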
In the following, we first consider system design and its performance analysis assuming the knowledge of the true underlying copula c( ). This allows us to analyze the effects of inter-modal dependence. We defer the discussion on joint PDF construction with potentially misspecified copula functions until Section 1.5.

Copula Based Multi-biometric Fusion

The biometric authentication problem can be formulated as a binary hypothesis test. A decision theory problem consists of deciding which of the hypotheses $H_0, \ldots, H_k$ is true based on the acquired observation vector of (say) $L$ samples. An optimal test (in both the Neyman-Pearson (NP) and the Bayesian sense) for a two-hypothesis problem ($H_0$ vs. $H_1$) computes the log-likelihood ratio ($\Lambda$) and decides in favor of $H_1$ when the ratio is larger than a pre-defined threshold ($\eta$),

$$\Lambda(\mathbf{z}) = \log \frac{f_{\mathbf{Z}}(\mathbf{z} \mid H_1)}{f_{\mathbf{Z}}(\mathbf{z} \mid H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \eta \tag{1.5}$$

where $f_{\mathbf{Z}}(\mathbf{z} \mid H_i)$ is the joint PDF of the random observation vector $\mathbf{z} = [z_1, \ldots, z_N]^T \in \mathbb{R}^N$ under the hypothesis $H_i$ ($i = 0, 1$). In the NP setup, the threshold $\eta$ is selected to constrain the false alarm error probability $P_F$ to a value $\alpha < 1$ and at the same time minimize the probability of miss, $P_M$. The two error probabilities are given as

$$P_F = P(\Lambda > \eta \mid H_0), \qquad P_M = P(\Lambda < \eta \mid H_1) \tag{1.6}$$

Consider a binary hypothesis testing problem where $H_0$ and $H_1$ correspond to an impostor and a genuine user respectively,

$$\begin{aligned} H_1 &: f_{\mathbf{Z}}(z_1, z_2, \ldots, z_N \mid H_1) \\ H_0 &: f_{\mathbf{Z}}(z_1, z_2, \ldots, z_N \mid H_0) = \prod_{n=1}^{N} f_{Z_n}(z_n \mid H_0) \end{aligned} \tag{1.7}$$

The random variables $Z_1, \ldots, Z_N$ are assumed to be statistically independent under the hypothesis $H_0$, contrary to when $H_1$ is true.

Log-likelihood ratio test for heterogeneous signals

Using copula functions, the log-likelihood ratio test in (1.5) can be written as

$$\Lambda_c(\mathbf{z}) = \log \frac{f^{c}_{\mathbf{Z}}(\mathbf{z} \mid H_1)}{f_{\mathbf{Z}}(\mathbf{z} \mid H_0)} = \log \left( \prod_{n=1}^{N} \frac{f_{Z_n}(z_n \mid H_1)}{f_{Z_n}(z_n \mid H_0)} \right) + \log \left[ c\left(F^{1}_{Z_1}(z_1), \ldots, F^{1}_{Z_N}(z_N)\right) \right] \tag{1.8}$$

where the copula density $c(\cdot)$ characterizes dependence under $H_1$. The superscript $i$ in $F^{i}_{Z_n}(z_n)$ denotes the CDF of $Z_n$ under hypothesis $i$. The first term in (1.8) corresponds to the differences in the statistics of each modality across the two hypotheses while the cross modal dependence

and interactions are included in the second term. This allows us to exactly factor out the role of cross modal dependence and quantify performance gains (if any) achieved due to inter-modal dependence. We denote the test based on the decision statistic in (1.8) as LLRT-H.

Log-likelihood ratio test for the product distribution

It is interesting to note the form of the test statistic in (1.8). The first term,

$$\Lambda_p(\mathbf{z}) = \log \left( \prod_{n=1}^{N} \frac{f_{Z_n}(z_n \mid H_1)}{f_{Z_n}(z_n \mid H_0)} \right) \tag{1.9}$$

is the test obtained when the variables $Z_1, Z_2, \ldots, Z_N$ are statistically independent or when dependence between them is deliberately neglected for simplicity. The test based on this decision statistic is the log-likelihood ratio test for the product distribution (LLRT-P). In problems where the derivation of the joint density becomes mathematically intractable, tests are usually employed assuming independence between variables conditioned on each hypothesis. This naturally results in performance degradation. We now compare the performance of the LLRT-H and LLRT-P detectors.

Performance analysis

The asymptotic performance of a likelihood ratio test can be quantified using the Kullback-Leibler (KL) divergence, $D(f_{\mathbf{Z}}(\mathbf{z} \mid H_1) \,\|\, f_{\mathbf{Z}}(\mathbf{z} \mid H_0))$, between the PDFs underlying the two hypotheses. For two distributions $p_X(x)$ and $q_X(x)$, the KL divergence is defined as

$$D(p \,\|\, q) = \int p_X(x) \log \left( \frac{p_X(x)}{q_X(x)} \right) dx \tag{1.10}$$

and it measures how different $p_X(x)$ is relative to $q_X(x)$. Further, $D(p \,\|\, q) \geq 0$, and $D(p \,\|\, q) = 0 \Leftrightarrow p = q$. For $L$ (independent) users of the system, through Stein's Lemma (Chernoff 1956), we have for a fixed value of $P_M = \beta$ ($0 < \beta < 1$),

$$\lim_{L \to \infty} \frac{1}{L} \log P_F = -D(f_{\mathbf{Z}}(\mathbf{z} \mid H_1) \,\|\, f_{\mathbf{Z}}(\mathbf{z} \mid H_0)) \tag{1.11}$$

Note: The base of the logarithm is arbitrary. In this chapter, $\log(\cdot)$ denotes the natural logarithm unless defined otherwise.
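The relationship between the two statistics in (1.8) and (1.9) can be sketched numerically for N = 2. Everything concrete below is an assumption made for illustration: the Gaussian marginals under each hypothesis, the Frank copula as the dependence model under H1, the parameter value and the function names. The Frank density is the mixed second derivative of the Frank copula CDF.

```python
import math
from statistics import NormalDist

def frank_density(u, v, theta):
    """Frank copula density c(u, v; theta), theta != 0."""
    a = 1.0 - math.exp(-theta)
    num = theta * a * math.exp(-theta * (u + v))
    den = (a - (1.0 - math.exp(-theta * u)) * (1.0 - math.exp(-theta * v))) ** 2
    return num / den

# Hypothetical marginal models for the two matchers under each hypothesis.
H0 = [NormalDist(0.0, 1.0), NormalDist(0.0, 1.0)]   # impostor
H1 = [NormalDist(1.0, 1.0), NormalDist(1.5, 1.0)]   # genuine

def llrt_p(z):
    """Product-model statistic (1.9): sum of marginal log-likelihood ratios."""
    return sum(math.log(g.pdf(x) / i.pdf(x)) for x, g, i in zip(z, H1, H0))

def llrt_h(z, theta):
    """Copula-based statistic (1.8): marginal term plus log copula density under H1."""
    u = [g.cdf(x) for x, g in zip(z, H1)]
    return llrt_p(z) + math.log(frank_density(u[0], u[1], theta))

def decide(z, theta, eta):
    """Return 1 (accept H1) if the LLRT-H statistic exceeds the threshold eta, else 0."""
    return 1 if llrt_h(z, theta) > eta else 0

z = (1.2, 1.4)
print(llrt_p(z), llrt_h(z, theta=5.0), decide(z, theta=5.0, eta=0.0))
```

The difference `llrt_h(z, theta) - llrt_p(z)` is exactly the cross-modal term, so the contribution of dependence to a given decision can be inspected in isolation.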

The greater the value of $D(f_{\mathbf{Z}}(\mathbf{z} \mid H_1) \,\|\, f_{\mathbf{Z}}(\mathbf{z} \mid H_0))$, the faster the convergence of $P_F$ to zero as $L \to \infty$. The KL divergence is thus indicative of the performance of a log-likelihood ratio test. Further, it is additive when the dependence across the heterogeneous observations is zero,

$$D\left(f^{p}_{\mathbf{Z}}(\mathbf{z} \mid H_1) \,\|\, f_{\mathbf{Z}}(\mathbf{z} \mid H_0)\right) = \sum_{n=1}^{N} D\left(f_{Z_n}(z_n \mid H_1) \,\|\, f_{Z_n}(z_n \mid H_0)\right) \tag{1.12}$$

where $D(f_{Z_n}(z_n \mid H_1) \,\|\, f_{Z_n}(z_n \mid H_0))$ is the KL divergence for a single modality $Z_n$. The following theorem helps us understand the effect of statistical dependence across the biometric traits on recognition performance.

Theorem 2 (Iyengar et al., 2009) The KL divergence between the two hypotheses ($H_1$ vs. $H_0$) increases by an amount equal to the multi-information between the random variables when dependence between the variables is taken into account,

$$D(f_{\mathbf{Z}}(\mathbf{z} \mid H_1) \,\|\, f_{\mathbf{Z}}(\mathbf{z} \mid H_0)) = D_p(f_{\mathbf{Z}}(\mathbf{z} \mid H_1) \,\|\, f_{\mathbf{Z}}(\mathbf{z} \mid H_0)) + \underbrace{I_1(Z_1; Z_2, \ldots, Z_N)}_{\geq 0} \tag{1.13}$$

where $I_i(Z_1; Z_2, \ldots, Z_N) = I(Z_1; Z_2, \ldots, Z_N; H_i)$.

The result in (1.13) is intuitively satisfying as the multi-information $I_1(Z_1; Z_2, \ldots, Z_N)$ (which reduces to the well-known mutual information for $N = 2$) describes the complete nature of dependence between the variables.

Effect of statistical dependence across multiple biometrics on fusion

Poh and Bengio (2005) note, "Despite considerable efforts in fusions, there is a lack of understanding on the roles and effects of correlation and variance (of both the client and impostor scores of base classifiers/experts)." While it is widely accepted that it is essential to correctly account for correlation in classifier design (Roli 2002, Ushmaev 2006), the exact link between classification performance and statistical dependence has not been established to the best of our knowledge.
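As a quick sanity check of Theorem 2, consider a worked special case that is assumed here purely for illustration (it is not an example from the chapter): two modalities that are independent standard normals under $H_0$, and jointly Gaussian under $H_1$ with means $\mu_1, \mu_2$, unit variances and correlation $\rho$. The standard closed form for the KL divergence between Gaussians gives:

```latex
\begin{aligned}
D\big(f_{\mathbf{Z}}(\mathbf{z}\mid H_1)\,\|\,f_{\mathbf{Z}}(\mathbf{z}\mid H_0)\big)
  &= \tfrac{1}{2}\Big[\operatorname{tr}(\Sigma) + \mu^{T}\mu - 2 - \log\det\Sigma\Big],
  \qquad \Sigma = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \\
  &= \underbrace{\tfrac{1}{2}\big(\mu_1^{2} + \mu_2^{2}\big)}_{D_p:\ \text{sum of the marginal divergences, as in (1.12)}}
   \;+\; \underbrace{-\tfrac{1}{2}\log\big(1-\rho^{2}\big)}_{I_1(Z_1;Z_2)\ \geq\ 0}
\end{aligned}
```

The second term is the mutual information of a bivariate Gaussian pair, so the decomposition matches (1.13) term by term. For instance, $\rho = 0.6$ contributes $-\tfrac{1}{2}\log(0.64) \approx 0.223$ nats on top of the product-model divergence, and $\rho = 0$ contributes nothing.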
Some recent contributions in this direction include, apart from Poh and Bengio (2005), Koval et al. and Kryszczuk and Drygajlo (2008). Poh and Bengio (2005) studied the problem under the assumption of normality for both genuine and impostor features or scores and concluded that

a positive value for the correlation coefficient is detrimental to the system. Contrary to this result, Koval et al. used error exponent analysis to conclude that a non-zero inter-modal dependence always enhances system performance. Recently, Kryszczuk and Drygajlo (2008) considered the impact of correlation for bivariate Gaussian features. They used the Matusita distance as a measure of separation between the PDFs of the competing hypotheses. They showed that the conclusions of the above two studies do not hold in general, i.e., they do not extend to arbitrary distributions.

Copula theory allows us to answer this important question and is general in treatment. The result in Theorem 2 makes no assumptions about the PDFs of biometric features. From Theorem 2,

$$D(f_{\mathbf{Z}}(\mathbf{z} \mid H_1) \,\|\, f_{\mathbf{Z}}(\mathbf{z} \mid H_0)) \geq D_p(f_{\mathbf{Z}}(\mathbf{z} \mid H_1) \,\|\, f_{\mathbf{Z}}(\mathbf{z} \mid H_0)) \tag{1.14}$$

due to the non-negativity of the multi-information. Thus, in addition to the model differences across the hypotheses, the presence of non-zero dependence further increases the inter-class distinguishability. However, the problem is more complicated when the variables exhibit statistical dependence under both hypotheses, and the result that "dependence can only enhance detection performance" is no longer true (Iyengar - PhD Diss.). Next, we discuss methods to construct joint PDFs with potentially misspecified copula functions.

1.5 Joint PDF construction using Copulas

As pointed out earlier, the copula density $c(\cdot)$ (the true dependence structure) is often unknown. Instead, a copula density $k(\cdot)$ is chosen a priori from a valid set $A = [k_1(\cdot), \ldots, k_p(\cdot)]$ of copula functions. Several copula functions have been defined, especially in the econometrics and finance literature (e.g., Clemen and Reilly 1999); popular ones among them are the multivariate Gaussian copula, Student's t copula and copula functions from the Archimedean family.
Given a copula density function $k(\cdot)$ and the marginal distributions, the joint PDF estimate then has a form similar to (1.4),

$$f_{\mathbf{Z}}(\mathbf{z}) = \left( \prod_{n=1}^{N} f_{Z_n}(z_n) \right) k(F_{Z_1}(z_1), \ldots, F_{Z_N}(z_N)) = f^{k}_{\mathbf{Z}}(\mathbf{z}) \tag{1.15}$$

As an example, let $Z_1$ and $Z_2$ be the random variables associated with two

heterogeneous biometrics; i.e., they may follow disparate distributions. One can first estimate the marginals $f_{Z_1}(z_1)$ and $f_{Z_2}(z_2)$ individually if they are unknown and then proceed to estimate the parameters of the copula density $k(\cdot)$. In the following, we assume that the marginal PDFs are known or have been consistently estimated and concentrate only on copula fitting. Given a copula function $K(\cdot)$ selected a priori, we wish to construct a copula based bivariate density function of the form (1.15) from acquired data. Table 1.1 lists some of the well-known bivariate copulas. Each of these functions is parameterized by $\theta$, which controls the amount of dependence between the two variables. Thus, $\theta$ must be estimated from the acquired bivariate observations.

Table 1.1. Some well-known copula functions

| Copula | $C(u_1, u_2)$ | Kendall's $\tau$ |
| Gaussian | $\Phi_N[\Phi^{-1}(u_1), \Phi^{-1}(u_2); \theta]$ | $\frac{2}{\pi} \arcsin(\theta)$ |
| Clayton | $\left[u_1^{-\theta} + u_2^{-\theta} - 1\right]^{-1/\theta}$ | $\frac{\theta}{\theta + 2}$ |
| Frank | $-\frac{1}{\theta} \log\left(1 + \frac{(e^{-\theta u_1} - 1)(e^{-\theta u_2} - 1)}{e^{-\theta} - 1}\right)$ | $1 - \frac{4}{\theta}\left[1 - \frac{1}{\theta} \int_0^{\theta} \frac{t}{e^{t} - 1}\, dt\right]$ |
| Gumbel | $\exp\left[-\left\{(-\log u_1)^{\theta} + (-\log u_2)^{\theta}\right\}^{1/\theta}\right]$ | $1 - \frac{1}{\theta}$ |
| Product | $u_1 \cdot u_2$ | $0$ |

Estimation using nonparametric dependence measures

Nelsen (1999) describes how copulas can be used to quantify concordance (a measure of dependence) between random variables. Nonparametric rank correlations such as Kendall's tau ($\tau$) and Spearman's rho ($\rho_s$) measure concordance between two random variables. Now let $(z_1(i), z_2(i))$ and $(z_1(j), z_2(j))$ be two observations from a bivariate measurement vector $(Z_1, Z_2)$ of continuous random variables. The observations are said to be concordant if $(z_1(i) - z_1(j))(z_2(i) - z_2(j)) > 0$ and discordant if $(z_1(i) - z_1(j))(z_2(i) - z_2(j)) < 0$. The population version of Kendall's $\tau$ can be expressed in terms of $K(\cdot)$ (Nelsen 1999),

$$\tau_{Z_1, Z_2} = 4 \iint_{[0,1]^2} K(u_1, u_2; \theta)\, dK(u_1, u_2; \theta) - 1 \tag{1.16}$$

where $u_n = F_{Z_n}(z_n)$.
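The concordance definition and the $\tau$ column of Table 1.1 suggest a simple way to turn data into a copula parameter: count concordant and discordant pairs, form a sample $\tau$, and invert the closed-form $\tau$-$\theta$ relationship. The sketch below does this for the Gaussian, Clayton and Gumbel rows; the function names and the toy data are assumptions introduced here. The Frank row has no closed-form inverse and would need numerical root finding, which is omitted.

```python
import math

def kendall_tau(z1, z2):
    """Sample Kendall's tau from concordant (c) and discordant (d) pair counts."""
    c = d = 0
    n = len(z1)
    for i in range(n):
        for j in range(i + 1, n):
            s = (z1[i] - z1[j]) * (z2[i] - z2[j])
            if s > 0:
                c += 1
            elif s < 0:
                d += 1
    if c + d == 0:
        raise ValueError("no concordant or discordant pairs")
    return (c - d) / (c + d)

def theta_from_tau(tau, family):
    """Invert the tau-theta relations of Table 1.1 (closed-form rows only)."""
    if family == "gaussian":
        return math.sin(math.pi * tau / 2.0)   # tau = (2/pi) arcsin(theta)
    if family == "clayton":
        return 2.0 * tau / (1.0 - tau)         # tau = theta / (theta + 2)
    if family == "gumbel":
        return 1.0 / (1.0 - tau)               # tau = 1 - 1/theta
    raise ValueError(f"no closed-form inverse implemented for {family!r}")

# Toy scores from two hypothetical matchers.
scores_1 = [1, 2, 3, 4]
scores_2 = [1, 3, 2, 4]
tau_hat = kendall_tau(scores_1, scores_2)
for fam in ("gaussian", "clayton", "gumbel"):
    print(fam, theta_from_tau(tau_hat, fam))
```

The pairwise loop is quadratic in the number of observations, which is fine for a sketch; large score sets would call for a sort-based O(L log L) algorithm instead.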
(In Table 1.1, $\Phi_N(\cdot, \cdot)$ and $\Phi(\cdot)$ denote the standard bivariate and univariate Gaussian CDFs respectively.) Thus, for a given $\tau$, the integral equation above can be

solved to obtain an estimate $\hat{\theta}_{\tau}$; the subscript $\tau$ denotes that it is the $\tau$-based estimate. Table 1.1 shows the relationship between $\tau$ and $\theta$ for some of the well-known copula functions (Mari and Kotz 2001, Nelsen 1999, Kurowicka and Cooke 2006). When $\tau$ is unknown, $\hat{\theta}_{\tau}$ can be obtained from the sample estimate $\hat{\tau}$. Given $L$ i.i.d. measurements $(z_1(l), z_2(l))$, $l = 1, 2, \ldots, L$, the observations are rank ordered and $\hat{\tau}$ can be computed as

$$\hat{\tau}_{Z_1, Z_2} = \frac{c - d}{c + d} \tag{1.17}$$

where $c$ and $d$ are the numbers of concordant and discordant pairs respectively.

Similar relations hold between $\rho_s$ and a copula function $K(\cdot)$. The population version of $\rho_s$ in terms of the copula function $K(\cdot)$ is given as

$$\rho_{s_{Z_1, Z_2}} = 12 \iint_{[0,1]^2} u_1 u_2\, dK(u_1, u_2) - 3 \tag{1.18}$$

Equation (1.18) can be used to obtain $\hat{\theta}_{\rho_s}$; the subscript $\rho_s$ denotes that it is the $\rho_s$-based estimate. When $\rho_s$ is unknown, its sample estimate can be used. Bivariate measurements $(z_1(l), z_2(l))$ are first converted to rankings $x_i$ and $y_i$. The sample estimate $\hat{\rho}_s$ is then given as

$$\hat{\rho}_{Z_1, Z_2} = 1 - \frac{6 \sum_i d_i^2}{L(L^2 - 1)} \tag{1.19}$$

where $d_i = x_i - y_i$ is the difference between the ranks of $z_1$ and $z_2$, and $L$ is the number of observations.

Thus, the joint PDF constructed using the above method captures the inter-modal rank correlations ($\tau$ or $\rho_s$) even when the copula function chosen a priori is misspecified.

Maximum likelihood estimation

Maximum likelihood estimation (MLE) based approaches can also be used to estimate $\theta$ and are discussed in Bouyé et al. The copula representation allows one to estimate the marginals (if unknown) first and then the dependence parameter $\theta$ separately. This two-step method is known as the

method of inference functions for margins (IFM) (Joe and Xu 1996). Given $L$ i.i.d. realizations $(z_1(l), z_2(l))$, $l = 1, 2, \ldots, L$,

$$\hat{\theta}_{IFM} = \underset{\theta}{\arg\max} \sum_{l=1}^{L} \log k\left(F_{Z_1}(z_1(l)), F_{Z_2}(z_2(l)); \theta\right) \tag{1.20}$$

When the marginal CDFs in (1.20) are replaced by their empirical estimates,

$$\hat{F}_{Z_n}(x) = \frac{1}{L} \sum_{l=1}^{L} I(X_l \leq x) \tag{1.21}$$

where $I(E)$ is an indicator of event $E$, the method is called the canonical maximum likelihood (CML) method,

$$\hat{\theta}_{CML} = \underset{\theta}{\arg\max} \sum_{l=1}^{L} \log k\left(\hat{F}_{Z_1}(z_1(l)), \hat{F}_{Z_2}(z_2(l)); \theta\right) \tag{1.22}$$

Though we have only discussed the case of two modalities here, the method described above extends to the construction of joint PDFs when more than two modalities are involved. Clemen and Reilly (1999) discuss the multivariate Gaussian copula approach, computing pair-wise Kendall's $\tau$ between the variables. Multivariate Archimedean copulas are studied in Nelsen.

Experimental Results

In the following, we apply the copula based method to fuse similarity scores from two different face matching algorithms to classify between genuine and impostor users. We consider both the NP and the Bayesian framework for this example. We use the biometric score set developed by the National Institute of Standards and Technology (NIST-BSSR 1 database). Three thousand subjects participated in the experiment and two samples were obtained per subject, thus giving (2 × 3000) genuine and ( ) impostor scores. Similarity scores generated by the two face recognition systems are heterogeneous as they use different matching algorithms. Let $z_1^i$ and $z_2^i$ denote the scores generated by face matchers 1 and 2 respectively under the hypothesis $H_i$. Further, $z_1^i \in R_1 = [0, 1]$ and $z_2^i \in R_2 = [0, 100]$. The higher the score, the better the match between the subject under test $P_n$ and the template $P_m$ to which it is compared. In Fig. 1.3, we show a score-matrix generated by face matcher 1.
An entry at $(m, n)$ in the matrix corresponds to the score obtained when $P_n$

is matched to $P_m$ stored in the database.

Fig. 1.3 Scores generated by Face Matcher 1: NIST BSSR 1 Database.

It can be seen that for some $P_n$ ($n = 29$ and $1273$ here), a match score of negative one ($-1$) is reported for all $m$. Surprisingly, this is true even for some cases when $m = n$ (e.g., $m = n = 29, 1273$); i.e., when $P_n$ is matched to its own template. This may have been due to errors during data acquisition, image registration or feature extraction caused by poor image quality. Negative one ($-1 \notin R_1$) thus appears to serve as an indicator that flags the incorrect working of the matcher. The global performance of the biometric authentication system will thus depend on how this anomaly is handled in decision making. For example, the fusion center can be designed to respond in one of the following ways upon reception of the error flag:

(i) Request for retrial: The fusion center does not make a global decision upon receiving the error flag. Instead, the person $P_n$ claiming an identity is requested to provide his/her biometric measurement again. The request for retrial design ensures $Z_1 \in R_1$. We emulate this by deleting the users whose match scores were reported to be negative one. For example, we would delete both the row and the column corresponding to user 29 in the above matrix. We present results using (2 × 2992) genuine and ( ) impostor scores. However, there may be applications where the system does not have the liberty

17 Biometric Authentication: A Copula Based Approach 17 to request for a retrial and the fusion center has to make a decision after each match. (ii) Censoring (face matchers that generate the error flag): In our example, face matcher 1 generates the error flag. Upon the reception of the error flag (z 1 = 1), the terms of the log likelihood ratio test that depend on z 1 are discarded. Thus, the first and the third terms in Λ(z) = log f Z 1 (z 1 H 1 ) f Z1 (z 1 H 0 ) } {{ } =0 + log f Z 2 (z 2 H 1 ) f Z2 (z 2 H 0 ) + log c 1( ) c 0 ( ) }{{} =0 (1.23) are set to zero. (iii) Accept H 0 : The system decides in favor of H 0 when one or more of the face matchers generate an error flag. This design is conservative and is thus suitable for applications where one desires minimal false alarm rates. (iv) Accept H 1 : The system decides in favor of H 1 when one or more error flags are generated. (v) Random decision: Toss a fair coin to decide between H 0 and H 1. We show performance results for all five designs in this section. Data is first partitioned into two subsets of equal size where the first subset is the training set to be used for model fitting. Recognition performance (P F vs. P D ) is evaluated with the second subset; the testing set. The marginal PDFs for the impostor (H 0 ) and genuine (H 1 ) scores are shown in Fig A Gaussian mixture model is fit to the scores generated by both face matchers 1 and 2 (Figueiredo and Jain 2002). Scores under both hypotheses are statistically dependent and a KL divergence based criterion resulted in the use of Frank and Gumbel copula functions to model genuine and impostor scores respectively. For more details on the selection procedure, see Iyengar - PhD Diss. Data is randomly partitioned to generate thirty training-testing sets (resamples) and a mean receiver operating characteristic (ROC) is obtained using threshold averaging (Fawcett 2006). 
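The five error-flag designs (i) to (v) can be summarized in a short sketch. This is an illustration under stated assumptions, not the implementation used for the experiments: the marginal densities are single Gaussians with made-up parameters, the copula term of (1.23) is a placeholder that returns zero, and the names `decide`, `llr_z1`, `llr_z2` and `llr_copula` are hypothetical.

```python
import math
import random

ERROR_FLAG = -1.0  # value reported by face matcher 1 when it fails


def gaussian_logpdf(x, mean, std):
    """Log density of a univariate Gaussian."""
    return (-0.5 * math.log(2.0 * math.pi * std * std)
            - 0.5 * ((x - mean) / std) ** 2)


def llr_z1(z1):
    # First term of (1.23); parameters are invented for illustration.
    return gaussian_logpdf(z1, 0.8, 0.1) - gaussian_logpdf(z1, 0.4, 0.1)


def llr_z2(z2):
    # Second term of (1.23); parameters are invented for illustration.
    return gaussian_logpdf(z2, 80.0, 10.0) - gaussian_logpdf(z2, 50.0, 10.0)


def llr_copula(z1, z2):
    # Placeholder for log c1(.)/c0(.); fitted copula densities would go here.
    return 0.0


def decide(z1, z2, threshold, strategy, rng=random):
    """Return 1 (decide H1), 0 (decide H0) or None (retrial requested)."""
    if z1 != ERROR_FLAG:
        # No error flag: use all three terms of the log likelihood ratio.
        stat = llr_z1(z1) + llr_z2(z2) + llr_copula(z1, z2)
        return int(stat > threshold)
    if strategy == "retrial":      # design (i): no global decision is made
        return None
    if strategy == "censor":       # design (ii): only the z2 term survives
        return int(llr_z2(z2) > threshold)
    if strategy == "accept_h0":    # design (iii)
        return 0
    if strategy == "accept_h1":    # design (iv)
        return 1
    if strategy == "random":       # design (v): fair coin toss
        return int(rng.random() < 0.5)
    raise ValueError("unknown strategy: " + strategy)


for name in ("retrial", "censor", "accept_h0", "accept_h1", "random"):
    print(name, decide(ERROR_FLAG, 90.0, 0.0, name, random.Random(1)))
```

With the copula placeholder returning zero, the unflagged branch reduces to the product rule; replacing `llr_copula` with the log ratio of fitted copula densities would give the copula based statistic.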
ROCs for LLRT-P and LLRT-H for each strategy are shown in the figure on receiver operating characteristics. Note that the Accept $H_1$ and Random decision methods are more liberal (in granting access) than the other schemes. In other words, both methods suffer heavily from increased false alarm rates. However, the superiority of the copula based method (LLRT-H) over the product rule (LLRT-P) is evident in all five approaches.

Fig. Marginal PDF estimation: (a), (b) Gaussian models for impostor and genuine scores generated by Face Matcher 1; (c), (d) Gaussian mixture models for impostor and genuine scores generated by Face Matcher 2

Fig. Receiver Operating Characteristics for the five approaches to handle error flags using the Neyman-Pearson framework

We note here that Dass et al. (2005) addressed the biometric score fusion problem using copulas and observed no improvement over the product rule (LLRT-P). To account for the error flags (negative ones), they modeled the marginal PDFs as a mixture of discrete and continuous components. However, copula methods require the marginals to be strictly continuous. Further, their analysis was limited to the use of Gaussian copula densities, which are insufficient to model the inter-modal dependence. In this chapter, we have employed different approaches to handle the error flags and have considered a more general family of copula functions with the potential of improving system performance. These reasons could explain the differences between our results and those of Dass et al. (2005).

We now consider the Bayesian framework. In some problems, one is able to assign, or knows a priori, the probabilities of occurrence of the two competing classes, $H_0$ and $H_1$, denoted by $P(H_0)$ and $P(H_1)$ respectively. The objective of a Bayesian detector is to minimize the probability of error $P_E$ (or more generally, the Bayes risk function) given the priors, where

$$P_E = \min\big(P(H_0 \mid \mathbf{z}), P(H_1 \mid \mathbf{z})\big) \qquad (1.24)$$

and $P(H_0 \mid \mathbf{z})$ and $P(H_1 \mid \mathbf{z})$ are the posterior probabilities of the two hypotheses given the observations. In Fig. 1.6, we plot $P_E$ averaged over the thirty resamples versus the prior probability $P(H_1)$ for all five strategies. We see that LLRT-H achieves the best performance over the entire range of priors, showing that our copula based approach performs better than the one using the product model.

Fig. 1.6 Probability of Error vs. $P(H_1)$ for the five approaches to handle error flags using the Bayesian framework

1.7 Concluding Remarks

In this chapter, we have discussed the statistical theory of copula functions and its applicability to biometric authentication in detail. Copulas are better descriptors of statistical dependence across heterogeneous sources. No assumptions on the source of heterogeneity are required; the same machinery holds for fusion of multiple modalities, samples, algorithms or multiple classifiers. Another interesting property of the copula approach is that it allows us to separate the cross-modal terms from the unimodal ones in the log likelihood ratio, thus allowing intra-modal vs. inter-modal analyses. Performance analysis in the asymptotic regime proved the intuitive result that when inter-modal dependence is accounted for in the test statistic, discriminability between the two competing hypotheses increases over the product rule by a factor exactly equal to the multi-information between the heterogeneous biometric signatures. In all, the copula approach provides a general framework for processing heterogeneous information. The applicability and superiority of our copula based approach were shown by applying it to the NIST-BSSR 1 database.

A couple of extensions that are of interest to us include:

Combination of multiple copula densities. Different copula functions exhibit different behavior, and a combination of multiple copula functions may better characterize dependence between several modalities than a single copula function. It would be interesting to explore this multi-model approach in detail.

Joint feature extraction. The design of a multibiometric identification system includes, apart from information fusion, several pre-processing steps such as feature selection and extraction. In this chapter, we focused only on the fusion aspect and chose to omit entirely the discussion of feature selection and extraction methods. Deriving features of reduced dimensionality is an essential step in which data is transformed so that only relevant information is extracted and retained for further processing. This alleviates the well-known curse of dimensionality. Several studies and methods have been proposed for common-modality or homogeneous signals. Heterogeneous signal processing offers new possibilities for system improvement. One can envision a joint feature extraction algorithm that exploits the dependence structure between the multimodal signals.
Development of feature extraction methods that optimize for inter-modal redundancy/synergy could be an interesting direction for future research.

1.8 Acknowledgements

Research was sponsored by the Army Research Laboratory and was accomplished under Cooperative Agreement No. W911NF. It was also supported by ARO grant W911NF. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon. The authors also thank Dr. Anil K. Jain and Dr. Karthik Nandakumar for their valuable comments and suggestions during the preparation of this chapter.

Bibliography

Bouye, E., Durrleman, V., Nikeghbali, A., Riboulet, G. and Roncalli, T. (2000). Copulas for Finance - A Reading Guide and Some Applications. Available at SSRN.

Chernoff, H. (1956). Large-sample theory: Parametric case. Ann. Math. Statist., 27.

Clemen, R. T. and Reilly, T. (1999). Correlations and copulas for decision and risk analysis. Management Science, 45.

Cover, T. and Thomas, J. (2006). Elements of Information Theory. (John Wiley and Sons, Ltd, New Jersey).

Dass, S. C., Nandakumar, K. and Jain, A. K. (2005). A principled approach to score level fusion in multimodal biometric systems. In Proc. of Audio and Video based Biometric Person Authentication.

Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8).

Figueiredo, M. and Jain, A. K. (2002). Unsupervised learning of finite mixture models. IEEE Trans. Pattern Anal. Mach. Intell., 24.

Iyengar, S. G., Varshney, P. K. and Damarla, T. (2007). On the detection of footsteps using acoustic and seismic sensing. In Proc. of 41st Annual Asilomar Conference on Signals, Systems and Computers.

Iyengar, S. G., Varshney, P. K. and Damarla, T. (2009). A parametric copula based framework for multimodal signal processing. In Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

Iyengar, S. G. PhD dissertation in progress. Syracuse University, Syracuse, NY.

Jain, A. K., Nandakumar, K. and Ross, A. (2005). Score normalization in multimodal biometric systems. Pattern Recognition, 38.

Joe, H. and Xu, J. J. (1996). The estimation method of inference functions for margins for multivariate models. Technical Report, Department of Statistics, University of British Columbia.

Koval, O., Voloshynovskiy, S. and Pun, T. (2007). Analysis of multimodal binary detection systems based on dependent/independent modalities. IEEE 9th Workshop on Multimedia Signal Processing.

Kryszczuk, K. and Drygajlo, A. (2008). Impact of feature correlations on separation between bivariate normal distributions. In Proc. of the 19th International Conference on Pattern Recognition.

Kurowicka, D. and Cooke, R. (2006). Uncertainty Analysis with High Dimensional Dependence Modeling. (John Wiley and Sons, Ltd, West Sussex, England).

Lehmann, E. L. and Romano, J. P. (2008). Testing Statistical Hypotheses. (Springer, 3rd edition).

Mari, D. and Kotz, S. (2001). Correlation and Dependence. (Imperial College Press, London).

Nandakumar, K., Chen, Y., Dass, S. and Jain, A. K. (2007). Likelihood ratio-based biometric score fusion. IEEE Trans. Pattern Anal. Mach. Intell., 55.

Nat'l Inst. of Standards and Tech. (2004). NIST Biometric Scores Set, Release 1.

Nelsen, R. B. (1999). An Introduction to Copulas. (Springer-Verlag, New York).

Poh, N. and Bengio, S. (2005). How Do Correlation and Variance of Base-Experts Affect Fusion in Biometric Authentication Tasks? IEEE Trans. Signal Processing, 53(11).

Prabhakar, S. and Jain, A. K. (2002). Decision-Level Fusion in Fingerprint Verification. Pattern Recognition, 35(4).

Roli, F., Fumera, G. and Kittler, J. (2002). Fixed and trained combiners for fusion of imbalanced pattern classifiers. In Proc. of the International Conference on Information Fusion.

Snelick, R., Uludag, U., Mink, A., Indovina, M. and Jain, A. K. (2005). Large Scale Evaluation of Multimodal Biometric Authentication Using State-of-the-Art Systems. IEEE Trans. Pattern Anal. Mach. Intell., 27(3).

Toh, K. A., Jiang, X. and Yau, W. Y. (2004). Exploiting Global and Local Decisions for Multimodal Biometrics Verification. IEEE Trans. Signal Processing, supplement on secure media, 52(10).

Ushmaev, O. and Novikov, S. (2006). Biometric fusion: Robust approach. In Proc. of the 2nd Workshop on Multimodal User Authentication (Toulouse, France).

Varshney, P. K. (1997). Distributed Detection and Data Fusion. (Springer, New York).


More information

Lecture 22: Error exponents in hypothesis testing, GLRT

Lecture 22: Error exponents in hypothesis testing, GLRT 10-704: Information Processing and Learning Spring 2012 Lecture 22: Error exponents in hypothesis testing, GLRT Lecturer: Aarti Singh Scribe: Aarti Singh Disclaimer: These notes have not been subjected

More information

Testing Simple Hypotheses R.L. Wolpert Institute of Statistics and Decision Sciences Duke University, Box Durham, NC 27708, USA

Testing Simple Hypotheses R.L. Wolpert Institute of Statistics and Decision Sciences Duke University, Box Durham, NC 27708, USA Testing Simple Hypotheses R.L. Wolpert Institute of Statistics and Decision Sciences Duke University, Box 90251 Durham, NC 27708, USA Summary: Pre-experimental Frequentist error probabilities do not summarize

More information

Lecture 7 Introduction to Statistical Decision Theory

Lecture 7 Introduction to Statistical Decision Theory Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7

More information

Intelligent Systems Statistical Machine Learning

Intelligent Systems Statistical Machine Learning Intelligent Systems Statistical Machine Learning Carsten Rother, Dmitrij Schlesinger WS2014/2015, Our tasks (recap) The model: two variables are usually present: - the first one is typically discrete k

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Mixtures of Gaussians with Sparse Regression Matrices. Constantinos Boulis, Jeffrey Bilmes

Mixtures of Gaussians with Sparse Regression Matrices. Constantinos Boulis, Jeffrey Bilmes Mixtures of Gaussians with Sparse Regression Matrices Constantinos Boulis, Jeffrey Bilmes {boulis,bilmes}@ee.washington.edu Dept of EE, University of Washington Seattle WA, 98195-2500 UW Electrical Engineering

More information

Should all Machine Learning be Bayesian? Should all Bayesian models be non-parametric?

Should all Machine Learning be Bayesian? Should all Bayesian models be non-parametric? Should all Machine Learning be Bayesian? Should all Bayesian models be non-parametric? Zoubin Ghahramani Department of Engineering University of Cambridge, UK zoubin@eng.cam.ac.uk http://learning.eng.cam.ac.uk/zoubin/

More information

Nonparametric Bayesian Methods (Gaussian Processes)

Nonparametric Bayesian Methods (Gaussian Processes) [70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent

More information

MINIMUM EXPECTED RISK PROBABILITY ESTIMATES FOR NONPARAMETRIC NEIGHBORHOOD CLASSIFIERS. Maya Gupta, Luca Cazzanti, and Santosh Srivastava

MINIMUM EXPECTED RISK PROBABILITY ESTIMATES FOR NONPARAMETRIC NEIGHBORHOOD CLASSIFIERS. Maya Gupta, Luca Cazzanti, and Santosh Srivastava MINIMUM EXPECTED RISK PROBABILITY ESTIMATES FOR NONPARAMETRIC NEIGHBORHOOD CLASSIFIERS Maya Gupta, Luca Cazzanti, and Santosh Srivastava University of Washington Dept. of Electrical Engineering Seattle,

More information

A Contrario Detection of False Matches in Iris Recognition

A Contrario Detection of False Matches in Iris Recognition A Contrario Detection of False Matches in Iris Recognition Marcelo Mottalli, Mariano Tepper, and Marta Mejail Departamento de Computación, Universidad de Buenos Aires, Argentina Abstract. The pattern of

More information

Songklanakarin Journal of Science and Technology SJST R1 Sukparungsee

Songklanakarin Journal of Science and Technology SJST R1 Sukparungsee Songklanakarin Journal of Science and Technology SJST-0-0.R Sukparungsee Bivariate copulas on the exponentially weighted moving average control chart Journal: Songklanakarin Journal of Science and Technology

More information

EVALUATING SYMMETRIC INFORMATION GAP BETWEEN DYNAMICAL SYSTEMS USING PARTICLE FILTER

EVALUATING SYMMETRIC INFORMATION GAP BETWEEN DYNAMICAL SYSTEMS USING PARTICLE FILTER EVALUATING SYMMETRIC INFORMATION GAP BETWEEN DYNAMICAL SYSTEMS USING PARTICLE FILTER Zhen Zhen 1, Jun Young Lee 2, and Abdus Saboor 3 1 Mingde College, Guizhou University, China zhenz2000@21cn.com 2 Department

More information

Performance Evaluation and Comparison

Performance Evaluation and Comparison Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Cross Validation and Resampling 3 Interval Estimation

More information

Robust Speaker Identification

Robust Speaker Identification Robust Speaker Identification by Smarajit Bose Interdisciplinary Statistical Research Unit Indian Statistical Institute, Kolkata Joint work with Amita Pal and Ayanendranath Basu Overview } } } } } } }

More information

1 Using standard errors when comparing estimated values

1 Using standard errors when comparing estimated values MLPR Assignment Part : General comments Below are comments on some recurring issues I came across when marking the second part of the assignment, which I thought it would help to explain in more detail

More information

Information Theory and Hypothesis Testing

Information Theory and Hypothesis Testing Summer School on Game Theory and Telecommunications Campione, 7-12 September, 2014 Information Theory and Hypothesis Testing Mauro Barni University of Siena September 8 Review of some basic results linking

More information

Novel spectrum sensing schemes for Cognitive Radio Networks

Novel spectrum sensing schemes for Cognitive Radio Networks Novel spectrum sensing schemes for Cognitive Radio Networks Cantabria University Santander, May, 2015 Supélec, SCEE Rennes, France 1 The Advanced Signal Processing Group http://gtas.unican.es The Advanced

More information

Introduction: MLE, MAP, Bayesian reasoning (28/8/13)

Introduction: MLE, MAP, Bayesian reasoning (28/8/13) STA561: Probabilistic machine learning Introduction: MLE, MAP, Bayesian reasoning (28/8/13) Lecturer: Barbara Engelhardt Scribes: K. Ulrich, J. Subramanian, N. Raval, J. O Hollaren 1 Classifiers In this

More information

A measure of radial asymmetry for bivariate copulas based on Sobolev norm

A measure of radial asymmetry for bivariate copulas based on Sobolev norm A measure of radial asymmetry for bivariate copulas based on Sobolev norm Ahmad Alikhani-Vafa Ali Dolati Abstract The modified Sobolev norm is used to construct an index for measuring the degree of radial

More information

Minimum Error-Rate Discriminant

Minimum Error-Rate Discriminant Discriminants Minimum Error-Rate Discriminant In the case of zero-one loss function, the Bayes Discriminant can be further simplified: g i (x) =P (ω i x). (29) J. Corso (SUNY at Buffalo) Bayesian Decision

More information

L11: Pattern recognition principles

L11: Pattern recognition principles L11: Pattern recognition principles Bayesian decision theory Statistical classifiers Dimensionality reduction Clustering This lecture is partly based on [Huang, Acero and Hon, 2001, ch. 4] Introduction

More information

Machine Learning 2017

Machine Learning 2017 Machine Learning 2017 Volker Roth Department of Mathematics & Computer Science University of Basel 21st March 2017 Volker Roth (University of Basel) Machine Learning 2017 21st March 2017 1 / 41 Section

More information

Testing Statistical Hypotheses

Testing Statistical Hypotheses E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions

More information

In this chapter, we provide an introduction to covariate shift adaptation toward machine learning in a non-stationary environment.

In this chapter, we provide an introduction to covariate shift adaptation toward machine learning in a non-stationary environment. 1 Introduction and Problem Formulation In this chapter, we provide an introduction to covariate shift adaptation toward machine learning in a non-stationary environment. 1.1 Machine Learning under Covariate

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

Optimal Mean-Square Noise Benefits in Quantizer-Array Linear Estimation Ashok Patel and Bart Kosko

Optimal Mean-Square Noise Benefits in Quantizer-Array Linear Estimation Ashok Patel and Bart Kosko IEEE SIGNAL PROCESSING LETTERS, VOL. 17, NO. 12, DECEMBER 2010 1005 Optimal Mean-Square Noise Benefits in Quantizer-Array Linear Estimation Ashok Patel and Bart Kosko Abstract A new theorem shows that

More information

Machine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall

Machine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall Machine Learning Gaussian Mixture Models Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall 2012 1 The Generative Model POV We think of the data as being generated from some process. We assume

More information

Detection theory. H 0 : x[n] = w[n]

Detection theory. H 0 : x[n] = w[n] Detection Theory Detection theory A the last topic of the course, we will briefly consider detection theory. The methods are based on estimation theory and attempt to answer questions such as Is a signal

More information

L2: Review of probability and statistics

L2: Review of probability and statistics Probability L2: Review of probability and statistics Definition of probability Axioms and properties Conditional probability Bayes theorem Random variables Definition of a random variable Cumulative distribution

More information

Parametric Models. Dr. Shuang LIANG. School of Software Engineering TongJi University Fall, 2012

Parametric Models. Dr. Shuang LIANG. School of Software Engineering TongJi University Fall, 2012 Parametric Models Dr. Shuang LIANG School of Software Engineering TongJi University Fall, 2012 Today s Topics Maximum Likelihood Estimation Bayesian Density Estimation Today s Topics Maximum Likelihood

More information

Review of Probability Theory

Review of Probability Theory Review of Probability Theory Arian Maleki and Tom Do Stanford University Probability theory is the study of uncertainty Through this class, we will be relying on concepts from probability theory for deriving

More information

Forecasting Wind Ramps

Forecasting Wind Ramps Forecasting Wind Ramps Erin Summers and Anand Subramanian Jan 5, 20 Introduction The recent increase in the number of wind power producers has necessitated changes in the methods power system operators

More information

Copula modeling for discrete data

Copula modeling for discrete data Copula modeling for discrete data Christian Genest & Johanna G. Nešlehová in collaboration with Bruno Rémillard McGill University and HEC Montréal ROBUST, September 11, 2016 Main question Suppose (X 1,

More information

EXTRACTING BIOMETRIC BINARY STRINGS WITH MINIMAL AREA UNDER THE FRR CURVE FOR THE HAMMING DISTANCE CLASSIFIER

EXTRACTING BIOMETRIC BINARY STRINGS WITH MINIMAL AREA UNDER THE FRR CURVE FOR THE HAMMING DISTANCE CLASSIFIER 17th European Signal Processing Conference (EUSIPCO 29) Glasgow, Scotland, August 24-28, 29 EXTRACTING BIOMETRIC BINARY STRINGS WITH MINIMA AREA UNER THE CURVE FOR THE HAMMING ISTANCE CASSIFIER Chun Chen,

More information

certain class of distributions, any SFQ can be expressed as a set of thresholds on the sufficient statistic. For distributions

certain class of distributions, any SFQ can be expressed as a set of thresholds on the sufficient statistic. For distributions Score-Function Quantization for Distributed Estimation Parvathinathan Venkitasubramaniam and Lang Tong School of Electrical and Computer Engineering Cornell University Ithaca, NY 4853 Email: {pv45, lt35}@cornell.edu

More information

Heterogeneous Sensor Signal Processing for Inference with Nonlinear Dependence

Heterogeneous Sensor Signal Processing for Inference with Nonlinear Dependence Syracuse University SURFACE Dissertations - ALL SURFACE December 2015 Heterogeneous Sensor Signal Processing for Inference with Nonlinear Dependence Hao He Syracuse University Follow this and additional

More information

A Modified Incremental Principal Component Analysis for On-line Learning of Feature Space and Classifier

A Modified Incremental Principal Component Analysis for On-line Learning of Feature Space and Classifier A Modified Incremental Principal Component Analysis for On-line Learning of Feature Space and Classifier Seiichi Ozawa, Shaoning Pang, and Nikola Kasabov Graduate School of Science and Technology, Kobe

More information