Theoretical Statistical Correlation for Biometric Identification Performance

Size: px
Start display at page:

Download "Theoretical Statistical Correlation for Biometric Identification Performance"

Transcription

1 Theoretical Statistical Correlation for Biometric Identification Performance Michael E. Schuckers Theoretical Statistical Correlation for Biometric Identification Performance p. 1/20

2 Overview Problem Statement: There is a need for appropriate statistical methods for estimation of the performance measures for biometric identification devices. Contribution: This paper lays out general correlation structure that allows for estimation, specifically standard errors, for FTE, FTA, FMR, FNMR. Theoretical Statistical Correlation for Biometric Identification Performance p. 2/20

3 Biometric Process and Notation Enrollment, (E i ) Theoretical Statistical Correlation for Biometric Identification Performance p. 3/20

4 Biometric Process and Notation Enrollment, (E i ) Acquisition, (A ij ) Theoretical Statistical Correlation for Biometric Identification Performance p. 3/20

5 Biometric Process and Notation Enrollment, (E i ) Acquisition, (A ij ) Feature Extraction, (S ik, X ik ) Theoretical Statistical Correlation for Biometric Identification Performance p. 3/20

6 Biometric Process and Notation Enrollment, (E i ) Acquisition, (A ij ) Feature Extraction, (S ik, X ik ) Matching, (Y ik,i k ) Theoretical Statistical Correlation for Biometric Identification Performance p. 3/20

7 Biometric Process and Notation Enrollment, (E i ) Acquisition, (A ij ) Feature Extraction, (S ik, X ik ) Matching, (Y ik,i k ) Decision, (D ik,i k ) Theoretical Statistical Correlation for Biometric Identification Performance p. 3/20

8 Failure to Enrol (FTE) Notation (1) E i = { 1 if individual i is unable to enroll 0 otherwise. Failure to enroll (FTE) rate is (2) FTE = i E E i N E. where N E is the total number of individuals who attempted to enrol in the system. Theoretical Statistical Correlation for Biometric Identification Performance p. 4/20

9 Failure to Acquire (FTA) Notation (3) A ij = 1 if the j th acquisition attempt by individual i is not acquired 0 otherwise. where a i is the total number of attempts for the i th individual A represents all individuals who attempt to have their biometric image collected. Failure to acquire (FTA) rate is (4) FTA = i A ai j=1 A ij i A a i Theoretical Statistical Correlation for Biometric Identification Performance p. 5/20

10 False Match Rate (FMR) Notation (5) D ii l = { 0 if i i,y ik,i k > τ 1 if i i,y ik,i k τ False match rate (FMR) is then (6) FMR = i i i i l D ii l i i n ii where n ii is the number of decisions on the ordered pair of individuals i and i. Theoretical Statistical Correlation for Biometric Identification Performance p. 6/20

11 False Non-Match Rate (FNMR) Notation (7) D ii l = { 1 if i = i,y ik,ik > τ 0 if i = i,y ik,ik τ False non-match rate (FNMR) is then i FNMR = l D iil (8), i n ii where n ii is the number of decisions on individual i. Theoretical Statistical Correlation for Biometric Identification Performance p. 7/20

12 Statistical Methods Estimated FTE, FTA, FMR, FNMR are all linear combinations of binary RV s Linear Combination R = v b vr v Expectation E[R ] = V v=1 a ve[r v ] Variance V [R ] = V V v=1 w=1 a va w Cov(R v,r w ) If variance is constant then V [R ] = σr 2 [ V v=1 a2 v + 2 V V v=1 w>v a va w Corr(R v,r w )]. Theoretical Statistical Correlation for Biometric Identification Performance p. 8/20

13 Assumptions For the rest of this paper, we assume that FTE, FTA, FMR, and FNMR Constant mean Constant variance/covariance/correlation Get simple right move to more complicated E s, A s, D s are binary (variance mean(1-mean)) Work from correlation Theoretical Statistical Correlation for Biometric Identification Performance p. 9/20

14 Binary Correlations Which of the following rows of binary sequences is correlated? A Theoretical Statistical Correlation for Biometric Identification Performance p. 10/20

15 Binary Correlations Which of the following rows of binary sequences is correlated? A B Theoretical Statistical Correlation for Biometric Identification Performance p. 10/20

16 Binary Correlations Which of the following rows of binary sequences is correlated? A B C Theoretical Statistical Correlation for Biometric Identification Performance p. 10/20

17 Binary Correlations Which of the following rows of binary sequences is correlated? A B C D Theoretical Statistical Correlation for Biometric Identification Performance p. 10/20

18 Binary Correlations Which of the following rows of binary sequences is correlated? A B C D Answer: None, all are binomial Corr = 0 Theoretical Statistical Correlation for Biometric Identification Performance p. 10/20

19 Binary Correlations Which of the following rows of binary sequences is correlated? A B C D Answer: None, all are binomial Corr = 0 Theoretical Statistical Correlation for Biometric Identification Performance p. 10/20

20 Correlation Structure: FTE Uncorrelated Between Individuals Structure { 1 if i = i Corr(E i,e i ) (9) = 0 otherwise. Then estimated variance is (10) V [ ˆ FTE] = FTE(1 ˆ N E ˆ FTE) Theoretical Statistical Correlation for Biometric Identification Performance p. 11/20

21 Correlation Structure: FTA Intra-individual Correlation (11) Corr(A ij,a i j ) = 1 if i = i,j = j ψ if i = i,j j 0 otherwise. Then estimated variance is (12) V [ ˆ FTA] = FTA(1 ˆ N 2 A ˆ FTA) [ N A + ψ n i=1 a i (a i 1) ] where N A = a i. Theoretical Statistical Correlation for Biometric Identification Performance p. 12/20

22 Correlation Structure: FNMR Intra-individual Correlation (13) Corr(D iil,d i i l ) = 1 if i = i,l = l ρ if i = i,l l 0 otherwise V [ FNMR] ˆ = (14) where N G = n ii. FNMR(1 ˆ N 2 G ˆ FNMR) [ N G + ρ n i=1 n ii (n ii 1) ] Theoretical Statistical Correlation for Biometric Identification Performance p. 13/20

23 Correlation Structure: FMR Shared elements of a pair correlation OR ALL HELL BREAKS LOOSE (15) Corr(D ikl, D i k l ) = 1 if i = i, k = k, l = l η if i = i, k = k, l l ω 1 if i = i, k k, i k,i k ω 2 if i i, k = k, i k,i k ω 3 if i = k, i k,i i, i k ω 3 if i = k, i k, i i, k k ξ 1 if i = k, k = i, i i, k k, l = l ξ 2 if i = k, k = i, i i, k k, l l 0 otherwise Theoretical Statistical Correlation for Biometric Identification Performance p. 14/20

24 Correlation Structure: FMR (Variance) V [ FNMR] ˆ = (16) where N I = i FNMR ˆ I (1 + ω 1 i=1 + ω 3 i=1 + ξ 1 i=1 k i n ik. k=1 k i k=1 k i k=1 k i N 2 I n ik n ik ˆ FNMR) k =1 k i,k k i =1 i i,i k n ki + ξ 2 n i=1 N I + η i=1 n ik k=1 k i + ω 2 i=1 n i i + k =1 k i,k k k=1 k i n ki (n ki 1) n ik (n ik 1) k=1 k i n kk n ik i =1 i i,i k n i k Theoretical Statistical Correlation for Biometric Identification Performance p. 15/20

25 Application: FMR from Ross and Jain (2003) Modality τ ˆ FMR ˆη ˆω 1 ˆω 2 ˆω 3 ˆξ1 ˆξ2 ˆV [ ˆπ I ] Hand *0.000 * FP * Face * * * τ here is the threshold * indicates truncated correlation at zero N I = Overdispersion here: 2.64, 1.25, 1.45 respectively relative to binomial Theoretical Statistical Correlation for Biometric Identification Performance p. 16/20

26 Application: FNMR from Ross and Jain (2003) Modality τ ˆ FNMR ˆρ ˆV [ˆπG ] Hand FP * Face τ here is the threshold * indicates truncated correlation at zero N G = 500 Overdispersion here: 1.16, 1.00, 1.04 respectively relative to binomial Theoretical Statistical Correlation for Biometric Identification Performance p. 17/20

27 Structure and Methods Metric Parametric model Non-parametric model FTE Binomial IID FTA Beta-binomial User-specific Bootstrap FNMR Beta-binomial User-specific Bootstrap FMR N/A N/A User-specific Bootstrap due to Poh et al 2007 FMR model generalization of implicit structure by Bickel Central Limit Theorems apply to all cases. Theoretical Statistical Correlation for Biometric Identification Performance p. 18/20

28 Discussion Theoretical Formulation of Correlation Theoretical Statistical Correlation for Biometric Identification Performance p. 19/20

29 Discussion Theoretical Formulation of Correlation All correlations are threshold dependent Theoretical Statistical Correlation for Biometric Identification Performance p. 19/20

30 Discussion Theoretical Formulation of Correlation All correlations are threshold dependent Basis for More Complicated Correlation Structures Theoretical Statistical Correlation for Biometric Identification Performance p. 19/20

31 Discussion Theoretical Formulation of Correlation All correlations are threshold dependent Basis for More Complicated Correlation Structures Appropriate Statistical CI s and Hypo Tests Theoretical Statistical Correlation for Biometric Identification Performance p. 19/20

32 Discussion Theoretical Formulation of Correlation All correlations are threshold dependent Basis for More Complicated Correlation Structures Appropriate Statistical CI s and Hypo Tests Parametric approach used for Sample Size Calc s Theoretical Statistical Correlation for Biometric Identification Performance p. 19/20

33 Discussion Theoretical Formulation of Correlation All correlations are threshold dependent Basis for More Complicated Correlation Structures Appropriate Statistical CI s and Hypo Tests Parametric approach used for Sample Size Calc s Needed for GLM s, Covariate approaches Theoretical Statistical Correlation for Biometric Identification Performance p. 19/20

34 Discussion Theoretical Formulation of Correlation All correlations are threshold dependent Basis for More Complicated Correlation Structures Appropriate Statistical CI s and Hypo Tests Parametric approach used for Sample Size Calc s Needed for GLM s, Covariate approaches Mean Transaction Time?? Theoretical Statistical Correlation for Biometric Identification Performance p. 19/20

35 Thank You Questions? Theoretical Statistical Correlation for Biometric Identification Performance p. 20/20

When enough is enough: early stopping of biometrics error rate testing

When enough is enough: early stopping of biometrics error rate testing When enough is enough: early stopping of biometrics error rate testing Michael E. Schuckers Department of Mathematics, Computer Science and Statistics St. Lawrence University and Center for Identification

More information

Test Sample and Size. Synonyms. Definition. Main Body Text. Michael E. Schuckers 1. Sample Size; Crew designs

Test Sample and Size. Synonyms. Definition. Main Body Text. Michael E. Schuckers 1. Sample Size; Crew designs Test Sample and Size Michael E. Schuckers 1 St. Lawrence University, Canton, NY 13617, USA schuckers@stlawu.edu Synonyms Sample Size; Crew designs Definition The testing and evaluation of biometrics is

More information

Estimation and sample size calculations for correlated binary error rates of biometric identification devices

Estimation and sample size calculations for correlated binary error rates of biometric identification devices Estimation and sample size calculations for correlated binary error rates of biometric identification devices Michael E. Schuckers,11 Valentine Hall, Department of Mathematics Saint Lawrence University,

More information

Error Rates. Error vs Threshold. ROC Curve. Biometrics: A Pattern Recognition System. Pattern classification. Biometrics CSE 190 Lecture 3

Error Rates. Error vs Threshold. ROC Curve. Biometrics: A Pattern Recognition System. Pattern classification. Biometrics CSE 190 Lecture 3 Biometrics: A Pattern Recognition System Yes/No Pattern classification Biometrics CSE 190 Lecture 3 Authentication False accept rate (FAR): Proportion of imposters accepted False reject rate (FRR): Proportion

More information

Score Normalization in Multimodal Biometric Systems

Score Normalization in Multimodal Biometric Systems Score Normalization in Multimodal Biometric Systems Karthik Nandakumar and Anil K. Jain Michigan State University, East Lansing, MI Arun A. Ross West Virginia University, Morgantown, WV http://biometrics.cse.mse.edu

More information

Linear, Generalized Linear, and Mixed-Effects Models in R. Linear and Generalized Linear Models in R Topics

Linear, Generalized Linear, and Mixed-Effects Models in R. Linear and Generalized Linear Models in R Topics Linear, Generalized Linear, and Mixed-Effects Models in R John Fox McMaster University ICPSR 2018 John Fox (McMaster University) Statistical Models in R ICPSR 2018 1 / 19 Linear and Generalized Linear

More information

Accounting for Population Uncertainty in Covariance Structure Analysis

Accounting for Population Uncertainty in Covariance Structure Analysis Accounting for Population Uncertainty in Structure Analysis Boston College May 21, 2013 Joint work with: Michael W. Browne The Ohio State University matrix among observed variables are usually implied

More information

Biometrics: Introduction and Examples. Raymond Veldhuis

Biometrics: Introduction and Examples. Raymond Veldhuis Biometrics: Introduction and Examples Raymond Veldhuis 1 Overview Biometric recognition Face recognition Challenges Transparent face recognition Large-scale identification Watch list Anonymous biometrics

More information

Index Models for Sparsely Sampled Functional Data. Supplementary Material.

Index Models for Sparsely Sampled Functional Data. Supplementary Material. Index Models for Sparsely Sampled Functional Data. Supplementary Material. May 21, 2014 1 Proof of Theorem 2 We will follow the general argument in the proof of Theorem 3 in Li et al. 2010). Write d =

More information

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics Jiti Gao Department of Statistics School of Mathematics and Statistics The University of Western Australia Crawley

More information

A Resampling Method on Pivotal Estimating Functions

A Resampling Method on Pivotal Estimating Functions A Resampling Method on Pivotal Estimating Functions Kun Nie Biostat 277,Winter 2004 March 17, 2004 Outline Introduction A General Resampling Method Examples - Quantile Regression -Rank Regression -Simulation

More information

Chapter 1. Modeling Basics

Chapter 1. Modeling Basics Chapter 1. Modeling Basics What is a model? Model equation and probability distribution Types of model effects Writing models in matrix form Summary 1 What is a statistical model? A model is a mathematical

More information

Reports of the Institute of Biostatistics

Reports of the Institute of Biostatistics Reports of the Institute of Biostatistics No 02 / 2008 Leibniz University of Hannover Natural Sciences Faculty Title: Properties of confidence intervals for the comparison of small binomial proportions

More information

Analytics Software. Beyond deterministic chain ladder reserves. Neil Covington Director of Solutions Management GI

Analytics Software. Beyond deterministic chain ladder reserves. Neil Covington Director of Solutions Management GI Analytics Software Beyond deterministic chain ladder reserves Neil Covington Director of Solutions Management GI Objectives 2 Contents 01 Background 02 Formulaic Stochastic Reserving Methods 03 Bootstrapping

More information

Vibration Based Health Monitoring for a Thin Aluminum Plate: Experimental Assessment of Several Statistical Time Series Methods

Vibration Based Health Monitoring for a Thin Aluminum Plate: Experimental Assessment of Several Statistical Time Series Methods Vibration Based Health Monitoring for a Thin Aluminum Plate: Experimental Assessment of Several Statistical Time Series Methods Fotis P. Kopsaftopoulos and Spilios D. Fassois Stochastic Mechanical Systems

More information

Chapter 5 continued. Chapter 5 sections

Chapter 5 continued. Chapter 5 sections Chapter 5 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions

More information

Internet Appendix to Are Stocks Really Less Volatile in the Long Run?

Internet Appendix to Are Stocks Really Less Volatile in the Long Run? Internet Appendix to Are Stocks Really Less Volatile in the Long Run? by Ľuboš Pástor and Robert F. Stambaugh January 1, 211 B1. Roadmap This Appendix is organized as follows. A simple illustration demonstrating

More information

A strategy for modelling count data which may have extra zeros

A strategy for modelling count data which may have extra zeros A strategy for modelling count data which may have extra zeros Alan Welsh Centre for Mathematics and its Applications Australian National University The Data Response is the number of Leadbeater s possum

More information

Has this Person Been Encountered Before? : Modeling an Anonymous Identification System

Has this Person Been Encountered Before? : Modeling an Anonymous Identification System A version of this paper appears in the IEEE Computer Society Workshop on Biometrics at the Computer Vision and Pattern Recognition Conference (CVPR 1, Providence, RI, USA, June 1 Has this Person Been Encountered

More information

Score calibration for optimal biometric identification

Score calibration for optimal biometric identification Score calibration for optimal biometric identification (see also NIST IBPC 2010 online proceedings: http://biometrics.nist.gov/ibpc2010) AI/GI/CRV 2010, Ottawa Dmitry O. Gorodnichy Head of Video Surveillance

More information

Small Area Confidence Bounds on Small Cell Proportions in Survey Populations

Small Area Confidence Bounds on Small Cell Proportions in Survey Populations Small Area Confidence Bounds on Small Cell Proportions in Survey Populations Aaron Gilary, Jerry Maples, U.S. Census Bureau U.S. Census Bureau Eric V. Slud, U.S. Census Bureau Univ. Maryland College Park

More information

Inference For High Dimensional M-estimates. Fixed Design Results

Inference For High Dimensional M-estimates. Fixed Design Results : Fixed Design Results Lihua Lei Advisors: Peter J. Bickel, Michael I. Jordan joint work with Peter J. Bickel and Noureddine El Karoui Dec. 8, 2016 1/57 Table of Contents 1 Background 2 Main Results and

More information

Corner. Corners are the intersections of two edges of sufficiently different orientations.

Corner. Corners are the intersections of two edges of sufficiently different orientations. 2D Image Features Two dimensional image features are interesting local structures. They include junctions of different types like Y, T, X, and L. Much of the work on 2D features focuses on junction L,

More information

Greg Taylor. Taylor Fry Consulting Actuaries University of Melbourne University of New South Wales. CAS Spring meeting Phoenix Arizona May 2012

Greg Taylor. Taylor Fry Consulting Actuaries University of Melbourne University of New South Wales. CAS Spring meeting Phoenix Arizona May 2012 Chain ladder correlations Greg Taylor Taylor Fry Consulting Actuaries University of Melbourne University of New South Wales CAS Spring meeting Phoenix Arizona 0-3 May 0 Overview The chain ladder produces

More information

Models for Multivariate Panel Count Data

Models for Multivariate Panel Count Data Semiparametric Models for Multivariate Panel Count Data KyungMann Kim University of Wisconsin-Madison kmkim@biostat.wisc.edu 2 April 2015 Outline 1 Introduction 2 3 4 Panel Count Data Motivation Previous

More information

Lecture 2: Repetition of probability theory and statistics

Lecture 2: Repetition of probability theory and statistics Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:

More information

SUPPLEMENTARY SIMULATIONS & FIGURES

SUPPLEMENTARY SIMULATIONS & FIGURES Supplementary Material: Supplementary Material for Mixed Effects Models for Resampled Network Statistics Improve Statistical Power to Find Differences in Multi-Subject Functional Connectivity Manjari Narayan,

More information

Lecture 13: More on Binary Data

Lecture 13: More on Binary Data Lecture 1: More on Binary Data Link functions for Binomial models Link η = g(π) π = g 1 (η) identity π η logarithmic log π e η logistic log ( π 1 π probit Φ 1 (π) Φ(η) log-log log( log π) exp( e η ) complementary

More information

Permutation-invariant regularization of large covariance matrices. Liza Levina

Permutation-invariant regularization of large covariance matrices. Liza Levina Liza Levina Permutation-invariant covariance regularization 1/42 Permutation-invariant regularization of large covariance matrices Liza Levina Department of Statistics University of Michigan Joint work

More information

(Refer Slide Time: 0:36)

(Refer Slide Time: 0:36) Probability and Random Variables/Processes for Wireless Communications. Professor Aditya K Jagannatham. Department of Electrical Engineering. Indian Institute of Technology Kanpur. Lecture -15. Special

More information

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a

More information

Recitation 5. Inference and Power Calculations. Yiqing Xu. March 7, 2014 MIT

Recitation 5. Inference and Power Calculations. Yiqing Xu. March 7, 2014 MIT 17.802 Recitation 5 Inference and Power Calculations Yiqing Xu MIT March 7, 2014 1 Inference of Frequentists 2 Power Calculations Inference (mostly MHE Ch8) Inference in Asymptopia (and with Weak Null)

More information

The Multilevel Logit Model for Binary Dependent Variables Marco R. Steenbergen

The Multilevel Logit Model for Binary Dependent Variables Marco R. Steenbergen The Multilevel Logit Model for Binary Dependent Variables Marco R. Steenbergen January 23-24, 2012 Page 1 Part I The Single Level Logit Model: A Review Motivating Example Imagine we are interested in voting

More information

Using the Empirical Probability Integral Transformation to Construct a Nonparametric CUSUM Algorithm

Using the Empirical Probability Integral Transformation to Construct a Nonparametric CUSUM Algorithm Using the Empirical Probability Integral Transformation to Construct a Nonparametric CUSUM Algorithm Daniel R. Jeske University of California, Riverside Department of Statistics Joint work with V. Montes

More information

OWL to the rescue of LASSO

OWL to the rescue of LASSO OWL to the rescue of LASSO IISc IBM day 2018 Joint Work R. Sankaran and Francis Bach AISTATS 17 Chiranjib Bhattacharyya Professor, Department of Computer Science and Automation Indian Institute of Science,

More information

Bayesian non-parametric model to longitudinally predict churn

Bayesian non-parametric model to longitudinally predict churn Bayesian non-parametric model to longitudinally predict churn Bruno Scarpa Università di Padova Conference of European Statistics Stakeholders Methodologists, Producers and Users of European Statistics

More information

Modeling and Performance Analysis with Discrete-Event Simulation

Modeling and Performance Analysis with Discrete-Event Simulation Simulation Modeling and Performance Analysis with Discrete-Event Simulation Chapter 9 Input Modeling Contents Data Collection Identifying the Distribution with Data Parameter Estimation Goodness-of-Fit

More information

STAT 461/561- Assignments, Year 2015

STAT 461/561- Assignments, Year 2015 STAT 461/561- Assignments, Year 2015 This is the second set of assignment problems. When you hand in any problem, include the problem itself and its number. pdf are welcome. If so, use large fonts and

More information

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr.

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr. Simulation Discrete-Event System Simulation Chapter 8 Input Modeling Purpose & Overview Input models provide the driving force for a simulation model. The quality of the output is no better than the quality

More information

Chapter 5. Chapter 5 sections

Chapter 5. Chapter 5 sections 1 / 43 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions

More information

Stat 502 Design and Analysis of Experiments General Linear Model

Stat 502 Design and Analysis of Experiments General Linear Model 1 Stat 502 Design and Analysis of Experiments General Linear Model Fritz Scholz Department of Statistics, University of Washington December 6, 2013 2 General Linear Hypothesis We assume the data vector

More information

Biometric Hash based on Statistical Features of Online Signatures

Biometric Hash based on Statistical Features of Online Signatures Biometric Hash based on Statistical Features of Online Signatures Claus Vielhauer 1,2, Ralf Steinmetz 1, Astrid Mayerhöfer 3 1 Technical University Darmstadt Institute for Industrial Process- and System

More information

Pointwise Exact Bootstrap Distributions of Cost Curves

Pointwise Exact Bootstrap Distributions of Cost Curves Pointwise Exact Bootstrap Distributions of Cost Curves Charles Dugas and David Gadoury University of Montréal 25th ICML Helsinki July 2008 Dugas, Gadoury (U Montréal) Cost curves July 8, 2008 1 / 24 Outline

More information

Efficient adaptive covariate modelling for extremes

Efficient adaptive covariate modelling for extremes Efficient adaptive covariate modelling for extremes Slides at www.lancs.ac.uk/ jonathan Matthew Jones, David Randell, Emma Ross, Elena Zanini, Philip Jonathan Copyright of Shell December 218 1 / 23 Structural

More information

Statistical Inference

Statistical Inference Statistical Inference J. Daunizeau Institute of Empirical Research in Economics, Zurich, Switzerland Brain and Spine Institute, Paris, France SPM Course Edinburgh, April 2011 Image time-series Spatial

More information

arxiv: v2 [stat.me] 4 Jun 2016

arxiv: v2 [stat.me] 4 Jun 2016 Variable Selection for Additive Partial Linear Quantile Regression with Missing Covariates 1 Variable Selection for Additive Partial Linear Quantile Regression with Missing Covariates Ben Sherwood arxiv:1510.00094v2

More information

Vast Volatility Matrix Estimation for High Frequency Data

Vast Volatility Matrix Estimation for High Frequency Data Vast Volatility Matrix Estimation for High Frequency Data Yazhen Wang National Science Foundation Yale Workshop, May 14-17, 2009 Disclaimer: My opinion, not the views of NSF Y. Wang (at NSF) 1 / 36 Outline

More information

Applied Asymptotics Case studies in higher order inference

Applied Asymptotics Case studies in higher order inference Applied Asymptotics Case studies in higher order inference Nancy Reid May 18, 2006 A.C. Davison, A. R. Brazzale, A. M. Staicu Introduction likelihood-based inference in parametric models higher order approximations

More information

Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines

Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines Maximilian Kasy Department of Economics, Harvard University 1 / 37 Agenda 6 equivalent representations of the

More information

INDIAN INSTITUTE OF SCIENCE STOCHASTIC HYDROLOGY. Lecture -20 Course Instructor : Prof. P. P. MUJUMDAR Department of Civil Engg., IISc.

INDIAN INSTITUTE OF SCIENCE STOCHASTIC HYDROLOGY. Lecture -20 Course Instructor : Prof. P. P. MUJUMDAR Department of Civil Engg., IISc. INDIAN INSTITUTE OF SCIENCE STOCHASTIC HYDROLOGY Lecture -20 Course Instructor : Prof. P. P. MUJUMDAR Department of Civil Engg., IISc. Summary of the previous lecture Case study -3: Monthly streamflows

More information

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij =

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij = K. Model Diagnostics We ve already seen how to check model assumptions prior to fitting a one-way ANOVA. Diagnostics carried out after model fitting by using residuals are more informative for assessing

More information

EE/CpE 345. Modeling and Simulation. Fall Class 9

EE/CpE 345. Modeling and Simulation. Fall Class 9 EE/CpE 345 Modeling and Simulation Class 9 208 Input Modeling Inputs(t) Actual System Outputs(t) Parameters? Simulated System Outputs(t) The input data is the driving force for the simulation - the behavior

More information

Hierarchical Modeling for Univariate Spatial Data

Hierarchical Modeling for Univariate Spatial Data Hierarchical Modeling for Univariate Spatial Data Geography 890, Hierarchical Bayesian Models for Environmental Spatial Data Analysis February 15, 2011 1 Spatial Domain 2 Geography 890 Spatial Domain This

More information

The Random Effects Model Introduction

The Random Effects Model Introduction The Random Effects Model Introduction Sometimes, treatments included in experiment are randomly chosen from set of all possible treatments. Conclusions from such experiment can then be generalized to other

More information

L2: Review of probability and statistics

L2: Review of probability and statistics Probability L2: Review of probability and statistics Definition of probability Axioms and properties Conditional probability Bayes theorem Random variables Definition of a random variable Cumulative distribution

More information

Newman-Penrose formalism in higher dimensions

Newman-Penrose formalism in higher dimensions Newman-Penrose formalism in higher dimensions V. Pravda various parts in collaboration with: A. Coley, R. Milson, M. Ortaggio and A. Pravdová Introduction - algebraic classification in four dimensions

More information

Hierarchical Modelling for Univariate Spatial Data

Hierarchical Modelling for Univariate Spatial Data Hierarchical Modelling for Univariate Spatial Data Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department

More information

High-Throughput Sequencing Course

High-Throughput Sequencing Course High-Throughput Sequencing Course DESeq Model for RNA-Seq Biostatistics and Bioinformatics Summer 2017 Outline Review: Standard linear regression model (e.g., to model gene expression as function of an

More information

System Design and Performance Assessment: A Biometric Menagerie Perspective. Norman Poh Lecturer Dept of Computing

System Design and Performance Assessment: A Biometric Menagerie Perspective. Norman Poh Lecturer Dept of Computing System Design and Performance Assessment: A Biometric Menagerie Perspective Norman Poh n.poh@surrey.ac.uk Lecturer Dept of Computing 1 2 Biometric menagerie Terms & performance metrics Variability vs Repeatability

More information

FUNCTIONAL DATA ANALYSIS FOR VOLATILITY PROCESS

FUNCTIONAL DATA ANALYSIS FOR VOLATILITY PROCESS FUNCTIONAL DATA ANALYSIS FOR VOLATILITY PROCESS Rituparna Sen Monday, July 31 10:45am-12:30pm Classroom 228 St-C5 Financial Models Joint work with Hans-Georg Müller and Ulrich Stadtmüller 1. INTRODUCTION

More information

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate

More information

Poisson Latent Feature Calculus for Generalized Indian Buffet Processes

Poisson Latent Feature Calculus for Generalized Indian Buffet Processes Poisson Latent Feature Calculus for Generalized Indian Buffet Processes Lancelot F. James (paper from arxiv [math.st], Dec 14) Discussion by: Piyush Rai January 23, 2015 Lancelot F. James () Poisson Latent

More information

Performance Evaluation and Comparison

Performance Evaluation and Comparison Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Cross Validation and Resampling 3 Interval Estimation

More information

Approximate Bayesian Computation

Approximate Bayesian Computation Approximate Bayesian Computation Michael Gutmann https://sites.google.com/site/michaelgutmann University of Helsinki and Aalto University 1st December 2015 Content Two parts: 1. The basics of approximate

More information

Reports of the Institute of Biostatistics

Reports of the Institute of Biostatistics Reports of the Institute of Biostatistics No 01 / 2010 Leibniz University of Hannover Natural Sciences Faculty Titel: Multiple contrast tests for multiple endpoints Author: Mario Hasler 1 1 Lehrfach Variationsstatistik,

More information

DELTA METHOD and RESERVING

DELTA METHOD and RESERVING XXXVI th ASTIN COLLOQUIUM Zurich, 4 6 September 2005 DELTA METHOD and RESERVING C.PARTRAT, Lyon 1 university (ISFA) N.PEY, AXA Canada J.SCHILLING, GIE AXA Introduction Presentation of methods based on

More information

Nesting and Equivalence Testing

Nesting and Equivalence Testing Nesting and Equivalence Testing Tihomir Asparouhov and Bengt Muthén August 13, 2018 Abstract In this note, we discuss the nesting and equivalence testing (NET) methodology developed in Bentler and Satorra

More information

Math Review Sheet, Fall 2008

Math Review Sheet, Fall 2008 1 Descriptive Statistics Math 3070-5 Review Sheet, Fall 2008 First we need to know about the relationship among Population Samples Objects The distribution of the population can be given in one of the

More information

Chapter 3 - Estimation by direct maximization of the likelihood

Chapter 3 - Estimation by direct maximization of the likelihood Chapter 3 - Estimation by direct maximization of the likelihood 02433 - Hidden Markov Models Martin Wæver Pedersen, Henrik Madsen Course week 3 MWP, compiled June 7, 2011 Recall: Recursive scheme for the

More information

Measurement Error and Linear Regression of Astronomical Data. Brandon Kelly Penn State Summer School in Astrostatistics, June 2007

Measurement Error and Linear Regression of Astronomical Data. Brandon Kelly Penn State Summer School in Astrostatistics, June 2007 Measurement Error and Linear Regression of Astronomical Data Brandon Kelly Penn State Summer School in Astrostatistics, June 2007 Classical Regression Model Collect n data points, denote i th pair as (η

More information

Inverse Sampling for McNemar s Test

Inverse Sampling for McNemar s Test International Journal of Statistics and Probability; Vol. 6, No. 1; January 27 ISSN 1927-7032 E-ISSN 1927-7040 Published by Canadian Center of Science and Education Inverse Sampling for McNemar s Test

More information

SVMs: Non-Separable Data, Convex Surrogate Loss, Multi-Class Classification, Kernels

SVMs: Non-Separable Data, Convex Surrogate Loss, Multi-Class Classification, Kernels SVMs: Non-Separable Data, Convex Surrogate Loss, Multi-Class Classification, Kernels Karl Stratos June 21, 2018 1 / 33 Tangent: Some Loose Ends in Logistic Regression Polynomial feature expansion in logistic

More information

EECS564 Estimation, Filtering, and Detection Exam 2 Week of April 20, 2015

EECS564 Estimation, Filtering, and Detection Exam 2 Week of April 20, 2015 EECS564 Estimation, Filtering, and Detection Exam Week of April 0, 015 This is an open book takehome exam. You have 48 hours to complete the exam. All work on the exam should be your own. problems have

More information

ECE531 Screencast 9.2: N-P Detection with an Infinite Number of Possible Observations

ECE531 Screencast 9.2: N-P Detection with an Infinite Number of Possible Observations ECE531 Screencast 9.2: N-P Detection with an Infinite Number of Possible Observations D. Richard Brown III Worcester Polytechnic Institute Worcester Polytechnic Institute D. Richard Brown III 1 / 7 Neyman

More information

Sample-weighted semiparametric estimates of cause-specific cumulative incidence using left-/interval censored data from electronic health records

Sample-weighted semiparametric estimates of cause-specific cumulative incidence using left-/interval censored data from electronic health records 1 / 22 Sample-weighted semiparametric estimates of cause-specific cumulative incidence using left-/interval censored data from electronic health records Noorie Hyun, Hormuzd A. Katki, Barry I. Graubard

More information

CORRELATION ANALYSIS OF ORDERED OBSERVATIONS FROM A BLOCK-EQUICORRELATED MULTIVARIATE NORMAL DISTRIBUTION

CORRELATION ANALYSIS OF ORDERED OBSERVATIONS FROM A BLOCK-EQUICORRELATED MULTIVARIATE NORMAL DISTRIBUTION CORRELATION ANALYSIS OF ORDERED OBSERVATIONS FROM A BLOCK-EQUICORRELATED MULTIVARIATE NORMAL DISTRIBUTION MARLOS VIANA AND INGRAM OLKIN Abstract. Inferential procedures are developed for correlations between

More information

Optimal bandwidth selection for the fuzzy regression discontinuity estimator

Optimal bandwidth selection for the fuzzy regression discontinuity estimator Optimal bandwidth selection for the fuzzy regression discontinuity estimator Yoichi Arai Hidehiko Ichimura The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper CWP49/5 Optimal

More information

Outline. Clustering. Capturing Unobserved Heterogeneity in the Austrian Labor Market Using Finite Mixtures of Markov Chain Models

Outline. Clustering. Capturing Unobserved Heterogeneity in the Austrian Labor Market Using Finite Mixtures of Markov Chain Models Capturing Unobserved Heterogeneity in the Austrian Labor Market Using Finite Mixtures of Markov Chain Models Collaboration with Rudolf Winter-Ebmer, Department of Economics, Johannes Kepler University

More information

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Canonical Edps/Soc 584 and Psych 594 Applied Multivariate Statistics Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Canonical Slide

More information

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

More information

Optimal Multiple Decision Statistical Procedure for Inverse Covariance Matrix

Optimal Multiple Decision Statistical Procedure for Inverse Covariance Matrix Optimal Multiple Decision Statistical Procedure for Inverse Covariance Matrix Alexander P. Koldanov and Petr A. Koldanov Abstract A multiple decision statistical problem for the elements of inverse covariance

More information

Spatial Misalignment

Spatial Misalignment November 4, 2009 Objectives Land use in Madagascar Example with WinBUGS code Objectives Learn how to realign nonnested block level data Become familiar with ways to deal with misaligned data which are

More information

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Department of Mathematics Carl von Ossietzky University Oldenburg Sonja Greven Department of

More information

Lifetime prediction and confidence bounds in accelerated degradation testing for lognormal response distributions with an Arrhenius rate relationship

Lifetime prediction and confidence bounds in accelerated degradation testing for lognormal response distributions with an Arrhenius rate relationship Scholars' Mine Doctoral Dissertations Student Research & Creative Works Spring 01 Lifetime prediction and confidence bounds in accelerated degradation testing for lognormal response distributions with

More information

Inference with Transposable Data: Modeling the Effects of Row and Column Correlations

Inference with Transposable Data: Modeling the Effects of Row and Column Correlations Inference with Transposable Data: Modeling the Effects of Row and Column Correlations Genevera I. Allen Department of Pediatrics-Neurology, Baylor College of Medicine, Jan and Dan Duncan Neurological Research

More information

An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data

An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data Jae-Kwang Kim 1 Iowa State University June 28, 2012 1 Joint work with Dr. Ming Zhou (when he was a PhD student at ISU)

More information

FDVAR: R code for variance of the number of false discoveries Beta version

FDVAR: R code for variance of the number of false discoveries Beta version FDVAR: R code for variance of the number of false discoveries Beta version Art B. Owen May 2004 Abstract This report documents a beta version of code for computing the variance of the false discovery rate,

More information

Non-Inferiority Tests for the Ratio of Two Proportions in a Cluster- Randomized Design

Non-Inferiority Tests for the Ratio of Two Proportions in a Cluster- Randomized Design Chapter 236 Non-Inferiority Tests for the Ratio of Two Proportions in a Cluster- Randomized Design Introduction This module provides power analysis and sample size calculation for non-inferiority tests

More information

Biometric Data Fusion Based on Subjective Logic 1

Biometric Data Fusion Based on Subjective Logic 1 Biometric Data Fusion Based on Subjective Logic 1 Audun Jøsang University of Oslo Norway josang@mn.uio.no Thorvald H. Munch-Møller University of Oslo Norway thorvahm@ifi.uio.no Abstract Biometric identification

More information

A Probability Review

A Probability Review A Probability Review Outline: A probability review Shorthand notation: RV stands for random variable EE 527, Detection and Estimation Theory, # 0b 1 A Probability Review Reading: Go over handouts 2 5 in

More information

Stochastic Primal-Dual Methods for Reinforcement Learning

Stochastic Primal-Dual Methods for Reinforcement Learning Stochastic Primal-Dual Methods for Reinforcement Learning Alireza Askarian 1 Amber Srivastava 1 1 Department of Mechanical Engineering University of Illinois at Urbana Champaign Big Data Optimization,

More information

In-house germination methods validation studies: analysis

In-house germination methods validation studies: analysis 1 In-house germination methods validation studies: analysis Jean-Louis Laffont - ISTA Statistics Committee Design assumptions for the validation study Based on peer validation guidelines From Table 1 in

More information

A Conditional Approach to Modeling Multivariate Extremes

A Conditional Approach to Modeling Multivariate Extremes A Approach to ing Multivariate Extremes By Heffernan & Tawn Department of Statistics Purdue University s April 30, 2014 Outline s s Multivariate Extremes s A central aim of multivariate extremes is trying

More information

A Practical Approach to Inferring Large Graphical Models from Sparse Microarray Data

A Practical Approach to Inferring Large Graphical Models from Sparse Microarray Data A Practical Approach to Inferring Large Graphical Models from Sparse Microarray Data Juliane Schäfer Department of Statistics, University of Munich Workshop: Practical Analysis of Gene Expression Data

More information

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model

More information

Machine Learning, Fall 2012 Homework 2

Machine Learning, Fall 2012 Homework 2 0-60 Machine Learning, Fall 202 Homework 2 Instructors: Tom Mitchell, Ziv Bar-Joseph TA in charge: Selen Uguroglu email: sugurogl@cs.cmu.edu SOLUTIONS Naive Bayes, 20 points Problem. Basic concepts, 0

More information

EXTRACTING BIOMETRIC BINARY STRINGS WITH MINIMAL AREA UNDER THE FRR CURVE FOR THE HAMMING DISTANCE CLASSIFIER

EXTRACTING BIOMETRIC BINARY STRINGS WITH MINIMAL AREA UNDER THE FRR CURVE FOR THE HAMMING DISTANCE CLASSIFIER 17th European Signal Processing Conference (EUSIPCO 29) Glasgow, Scotland, August 24-28, 29 EXTRACTING BIOMETRIC BINARY STRINGS WITH MINIMA AREA UNER THE CURVE FOR THE HAMMING ISTANCE CASSIFIER Chun Chen,

More information

Estimation of the Bivariate and Marginal Distributions with Censored Data

Estimation of the Bivariate and Marginal Distributions with Censored Data Estimation of the Bivariate and Marginal Distributions with Censored Data Michael Akritas and Ingrid Van Keilegom Penn State University and Eindhoven University of Technology May 22, 2 Abstract Two new

More information

Contents. 1 Review of Residuals. 2 Detecting Outliers. 3 Influential Observations. 4 Multicollinearity and its Effects

Contents. 1 Review of Residuals. 2 Detecting Outliers. 3 Influential Observations. 4 Multicollinearity and its Effects Contents 1 Review of Residuals 2 Detecting Outliers 3 Influential Observations 4 Multicollinearity and its Effects W. Zhou (Colorado State University) STAT 540 July 6th, 2015 1 / 32 Model Diagnostics:

More information

INDIAN INSTITUTE OF SCIENCE STOCHASTIC HYDROLOGY. Lecture -18 Course Instructor : Prof. P. P. MUJUMDAR Department of Civil Engg., IISc.

INDIAN INSTITUTE OF SCIENCE STOCHASTIC HYDROLOGY. Lecture -18 Course Instructor : Prof. P. P. MUJUMDAR Department of Civil Engg., IISc. INDIAN INSTITUTE OF SCIENCE STOCHASTIC HYDROLOGY Lecture -18 Course Instructor : Prof. P. P. MUJUMDAR Department of Civil Engg., IISc. Summary of the previous lecture Model selection Mean square error

More information