Team 250 Page 2. I. Introduction


Summary

Given the possible number of genetic variations, the probability of having a naturally occurring Doppelganger is low. This is why DNA evidence acquired at crime scenes is such conclusive evidence when presented in criminal trials. Though the process of DNA fingerprinting is fallible, the probability that two unrelated people with the same DNA exist is microscopic. Barring, then, that you have an identical evil twin, the probability that you will be mistaken for a criminal based on such evidence is low. Fingerprints, however, being only a portion of this genetic identity, seem far less restricting. It is then conceivably possible that one could be mistaken as the perpetrator of a crime based on fingerprint evidence. It is our goal to determine exactly how probable this is.

One of the progenitors of the study of fingerprint identity was Sir Francis Galton, who identified characteristic ridge patterns in the skin that vary widely among a population, but which are constant over time in an individual. In addition to these minutiae, fingerprints also have an overall pattern that in nearly all cases falls into one of three groups: loops, arches, and whorls. Using both the overall fingerprint patterns, and a set of the most commonly occurring Galton Characteristics (GCs), we created a model to test the individuality of fingerprints, based on a probabilistic interpretation: highly probable fingerprints are less individual, and less probable fingerprints are more individual.

In this model, we first divided an ideal rectangular thumbprint into squares of equal area, denoted as cells. Knowing that any comparison between two fingerprints first matches the general pattern of a fingerprint and then a certain number of GCs, we calculated the fingerprint patterns that have the maximum probability of occurrence. This was done by using figures which determined the relative frequency of occurrence of each of the patterns and GCs. To start, we assumed that from an ideal thumbprint containing N total cells, we chose to confirm the form and placement of GCs in n of those cells. Our model proceeds in stages, first choosing the overall pattern of the print, and then proceeding to choose n locations of GCs from the N total placements possible.
Once the pattern and placement have been determined, it remains only to factor in the relative occurrence probabilities of each GC in order to determine a measure of the individuality of the fingerprint.

The model is constructed based on a number of assumptions. To begin with, we first assume that the patterns and GCs occur independently; neither has an influence on the other's probability. In later stages of our analysis, then, we account for the fact that dependencies may exist, and alter the selection of GCs accordingly. Another assumption that our model makes is that the GCs occur independently; that is, in the spaces in which we wish to confirm the presence of GCs, placement has no effect on which characteristic is selected. Since there has been no conclusive evidence that a particular fingerprint pattern has any influence on the minutiae present in the fingerprint, this seems to be a valid assumption, and hence no unnecessary restrictions were placed on the form of the fingerprint.

The construction of the model allowed us to calculate the ability to confirm a fingerprint based on partial fingerprint evidence. In addition, we used population figures of many countries and the entire world to find what the minimum number of GCs common between fingerprints should be before a match can be said to occur.

In testing this model, we did not calculate the probability of occurrence for every individual pattern and placement of GCs. Rather, we calculated only the probability of the most likely occurrence. Also, the orientation of GCs was not taken into consideration. This may at first seem to be a weakness, but is in fact a strength, as requiring a fingerprint to occur with GCs oriented in a particular direction is stricter than not requiring any particular direction for their placement. Thus, any fingerprint occurring in nature is hypothetically less likely to occur than our calculated maximum. For a template fingerprint with 12 identified minutiae, a reasonable required number given new advancements in laser recognition of fingerprints, the probability of finding a match was calculated to be on the order of 10^-13. This figure shows that even the most likely fingerprint is thus highly individual, and fingerprint identification is as reliable on ideal grounds as DNA identification, which has reliability on the order of 10^-10.

AN INQUIRY INTO INDIVIDUALITY OF THUMBPRINTS
Asma Al-Rawi, Steve Gilberston, Jonathan Whitmer
Kansas State University
Mathematical Contest in Modeling 2004

I. Introduction

"How can you disbelieve me when I have created each one of you down to the prints on your fingers?" --God (The Holy Qur'an 75:3-4) [4]

The above reference, depending on one's religiousness or secularism, either confirms that fingerprints are distinct to individuals, or at the very least, that knowledge of variation of fingerprints between persons, and its inherent properties in identification, has existed since the 8th century. In modern Western culture, the idea of using fingerprints as a means of identification first appeared in an article written by Henry Faulds in 1880 in the journal Nature [3]. His interest was aroused by his discovery of ridged pattern imprints in handmade pottery. After performing a series of experiments to determine differences in fingerprints among individuals as well as their resilience, he recommended that these ridged imprints be used as evidence of criminal identity at the scene of the crime. At the root of this assertion is the assumption of uniqueness in each human's fingerprint patterns.

There are several commonalities in the patterns of ridged skin, however, which allow fingerprints to be systematically classified. For example, the ridged lines on fingers appear in a number of major pattern types: loops, which comprise the largest portion of all fingerprints and occur in two chiralities; whorls, which are characterized by the spiraling pattern of the ridges; and the arches, which comprise the smallest major group [1]. Other possible manifestations exist; however, their occurrence is very rare. In addition to these major groups, the ridges of different fingerprints show certain defining characteristics. This idea was prevalent in one of the first attempted quantifications of fingerprint individuality, which was performed by Sir Francis Galton in 1892 [1]. The patterns of finger ridge divergences and combinations, termed minutiae, are also identified as Galton Characteristics in his honor. Later developments have incorporated his ideas along with other print-determining factors to establish more exactly each print's uniqueness [1,2,6].
Whether or not each fingerprint pattern is truly unique, fingerprints have found much use as a form of identification in forensic science. Recently, however, the validity of fingerprint evidence has been called into question, as evidenced by the case United States v. Mitchell, which presented the US with its first challenge as to the admissibility of latent fingerprint evidence as a means of identification [7]. This necessitates a reevaluation of the validity of fingerprint uniqueness measurement. Thus, we become faced with the problem of determining the probability that two people in the world might share the same fingerprints to measurable accuracy. This is quite a complex problem if one allows it to be, as there seem at first to be almost infinitely many variations within ridge patterns whose appearance and interplay must be accounted for, and yet it has a simple and elegant solution which we will show in this paper. In our study, we focus not on each of the ten fingers, but on only the thumb, which effectively serves as an upper bound for the multiple occurrence probability of all friction ridged skin. Our calculations have found on the basis of a discrete probability model that it is extremely unlikely that two people with the same thumbprints have ever existed, within the limitations of current measurement practices.

II. Model

The first step in devising a model for thumbprint individuality is simply to understand what types of fingerprints exist. As mentioned previously, fingerprints occur in what seems to be an infinite number of variations, determined by both their overall pattern and the distribution of Galton Characteristics (GCs). The patterns fall into three main categories: loops, arches, and whorls. These can be further divided into over a thousand subcategories [1]. Figure 1 shows the major types of prints.

FIGURE 1. These are the four most common fingerprint patterns: left and right loops, whorls, and arches. From www.sfis.ca.gov/pattern_types.htm.

Prints which fall into these categories can, to the untrained eye, and oftentimes even the trained eye, appear very similar. When the contribution of GCs is factored in, a particular fingerprint's unique character starts to become apparent. The major types of GCs are illustrated in Figure 2. Whether the pattern on the finger is a loop, arch, or whorl, GCs occur randomly throughout the entire print. These occurrences give distinct attributes to the print that can be systematically classified.

FIGURE 2. A chart showing the 10 most common forms of Galton Characteristics. (Osterburg??)

The central problem, given a known classification of a fingerprint by its pattern and GCs, becomes to calculate the probability that an identical finger exists. Our model focuses specifically on thumbprints, for a variety of reasons. For instance, a thumb is easy to idealize. In practice, when fingerprints are taken, the finger is rolled over nearly its entire surface above the first knuckle. This is similar to the unrolling of an uncapped cylinder. The shape of this print on paper is approximately rectangular. The thumbprint has the largest area, and also the largest number of defining qualities, due to the random distribution of GCs.

For an ideal rectangular thumbprint, we partition the area into equally sized squares, with a minimum size on the order of one square millimeter, due to the minimum extent to which a GC can be identified as occurring in one of the squares. Since only a finite number of visible GCs can occur on a single patterned finger, a discrete probability method is useful for determining the possibility of Doppelganger thumbs. It is then perfectly admissible to use a counting argument to find approximately the number of possible arrangements of friction ridges on the thumb, and their relative occurrences based on the features they contain.

It should be noted that ideal fingerprints as described above do not usually occur in actual fieldwork. Usually only portions of fingerprints are left by oils or other substances on the fingers of the criminal; these are called latent prints. After these latent prints are developed and brought to visible form, they are described as partial prints. These partial prints contain only a fraction of the total surface of the friction ridged skin on the thumb. Using similar ideas to the ones above, we can model partial prints simply by decreasing N; that is, limiting the number of cells on which the prints have to match up. Since a partial print cannot possibly match the rest of the cells contained in an ideal print, the characteristics of those cells are irrelevant. Decreasing N then gives an accurate model, as we can say that the area we are sampling from is smaller. Accordingly, the probability of matching the print among people of a given population grows, as we show below.

III. Probability Algorithms

Our first step was to measure the dimensions of an idealized thumb. Averaging over the three members in our group, we found the dimensions of a nearly rectangular print, when measured as described above, to be approximately 3 cm by 4 cm. Thus there are approximately 1200 square millimeters on the idealized thumb. We took each square millimeter to be a cell, so that in our ideal thumb model, a full print has a possibility of 1200 identification points.

In practice, a suspect's thumbprint and the thumbprint found at the scene of the crime are compared to each other on both the overall pattern and a certain number of distinguishing characteristics. The distinguishing factors can correspond to either scars on the suspect's thumbprint or GCs. Since scars are the result of completely random events, and thus are nearly impossible to quantify without exact personal histories, our model considers only the cases in which GCs occupy these identifying points. In previous models [1,2], the relation between GCs and the overall pattern was not considered; only the occurrence of GCs was taken into account. In our model, various degrees of pattern and GC dependence were considered. This accounts for the possibility that a certain percentage of the GCs are inherent in the overall pattern. In the case where pattern and GC occurrences are completely independent, one can separate the probability of a fingerprint's occurrence into two factors:

$$P_{fp} = P_p \, P_{GC} \qquad (1)$$

In the above equation, $P_{fp}$ is the probability a particular fingerprint will occur, $P_p$ is the probability a particular pattern will occur, some approximate figures for which are given in Table 1, and $P_{GC}$ is the probability of a particular combination of GCs.
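As a quick check of the cell count and of equation (1), the following minimal sketch tiles a rectangular print with square cells and multiplies the two factors. The function names `cell_count` and `fingerprint_probability` are illustrative only and are not part of the spreadsheet calculation used in this paper; the 15 mm by 40 mm partial print is an assumed example.

```python
def cell_count(width_mm, height_mm, cell_side_mm=1.0):
    """Number of whole square cells that fit in a rectangular print."""
    columns = int(width_mm // cell_side_mm)
    rows = int(height_mm // cell_side_mm)
    return columns * rows


def fingerprint_probability(p_pattern, p_gc):
    """Equation (1): with independent pattern and GCs, the factors multiply."""
    return p_pattern * p_gc


if __name__ == "__main__":
    print(cell_count(30, 40))  # full idealized print, 3 cm by 4 cm -> 1200
    print(cell_count(15, 40))  # assumed partial print, half the width -> 600
```

Shrinking the print area in this way is how a partial print enters the model: only N changes, and the rest of the calculation is unchanged.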
Class of Print    Probability
Right Loop        0.325
Left Loop         0.325
Whorl             0.3
Arch              0.05
Total             1

TABLE 1: A list of approximate occurrence probabilities of the four most common thumbprints from Osterburg, et al. The loop category is determined there to have a 65% occurrence probability, which here is divided into the two chiralities, which are easily distinguishable and occur at nearly the same rate overall.

Our model treats non-measured GCs and cells in which there are no GCs as equivalent empty cells. Thus, in the case where GCs are dependent on which pattern a fingerprint has, we can still use this independence model, by noting that since a particular percentage of the GCs are determined by the pattern, we can treat those as empty space in which no defining characteristic occurs.

Suppose then, that we wish to find the probability that a particular distribution of n measured GCs occurs. To do this, we note that of the N total cells in the fingerprint, only n of these cells have any significance in terms of GC measurement. The number of ways this can be distributed is easy to compute. Placing all measured cells on the same level, we begin placing GCs and empty cells on the surface of the thumbprint. At first there are n GCs to place within the total area of the print, and N total cells in which to place them. If the first cell is empty space, we are left with N-1 cells in which to place characteristics, and n characteristics. If the first cell contains a characteristic, we have N-1 empty cells in which to place characteristics, and n-1 GCs. Iterating this choice process over all cells, we find that the number of ways we can place the n GCs is

$$\binom{N}{n} = \frac{N!}{n!\,(N-n)!} \qquad (2)$$

This leaves us to calculate the probability that each GC cell contains a particular GC. Osterburg, et al., contains relative frequencies of occurrence for each characteristic averaged over 39 fingers. Table 2 gives these figures. In our model, since we disregard empty spaces, we considered only the relative frequency of the eleven most common elements. Double occurrences, or the event that two GCs occur in the same space, while certainly possible, were ignored in this model calculation, due to their small frequency. The number in the table is misleading, as it accounts for all double occurrences, not double occurrences of particular types.

Parameter   Cell configuration      Frequency   Probability of Parameter
0           Empty                   6,584       0.766
1           Island                  152         0.018
2           Bridge                  105         0.012
3           Spur                    64          0.007
4           Dot                     130         0.015
5           Ending ridge            715         0.083
6           Fork                    328         0.038
7           Lake                    55          0.006
8           Trifurcation            5           0.001
9           Double bifurcation      12          0.001
10          Delta                   17          0.002
11          Broken ridge            119         0.014
12          Multiple occurrences    305         0.036
Total                               8,591       1.000

TABLE 2. Experimentally determined Galton Characteristic probability numbers. From Osterburg, et al.
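The counting argument behind equation (2) can be checked numerically. The sketch below is a minimal illustration, not the procedure used for the paper's figures: it walks through the cells one at a time, treating each as either empty or holding a GC, and compares the result with Python's built-in `math.comb`. The function name `count_placements` is hypothetical.

```python
from math import comb


def count_placements(total_cells, gc_count):
    """Count ways to choose which cells hold GCs, one cell at a time.

    ways[k] is the number of ways to have placed k GCs in the cells
    processed so far. Each new cell is either empty (k unchanged) or
    holds a GC (k increases by one).
    """
    ways = [1] + [0] * gc_count
    for _ in range(total_cells):
        # Update from high k to low k so each cell is counted only once.
        for k in range(gc_count, 0, -1):
            ways[k] += ways[k - 1]
    return ways[gc_count]


if __name__ == "__main__":
    N, n = 1200, 12
    assert count_placements(N, n) == comb(N, n)
    print(comb(N, n))
```

The cell-by-cell update is the same "empty or occupied" choice described in the text, so agreement with `math.comb` confirms that the iteration yields the binomial coefficient.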
Our model disregards multiple occurrences, hence for our purposes, the characteristics numbered 0 and 12 are empty cells. Only the characteristics numbered 1-11 are relevant. The relative probability is a necessary factor for determining which characteristic is most likely to occur in the GC cells. The probability of the i-th occurrence is given by:

$$r_i = \frac{P(i)}{\sum_{j=1}^{11} P(j)} \qquad (3)$$

where the elements P(i) are determined from Table 2. Here i ranges from 1 to 11, as our model considers only single GC occurrences, and treats the low probability and multiple occurrence GCs as empty space. It should be noted that their inclusion would decrease the relative probability of the i-th term as defined above; hence, it would decrease the upper bound which our calculation aims to set. Clearly, the sum of these relative probability quantities is 1, hence they are validly defined as probabilities.

For n GCs, the probability of each arrangement is given by the relative probability of each GC to the power of the number of times the GC is selected, divided by the number of ways to divide those n elements into groups categorized by the eleven GCs considered. Though the idea is complex, the notation is rather mathematically simple, and corresponds to the product of the selection probabilities divided by the multinomial coefficient corresponding to choosing $n_1$ of GC number 1, $n_2$ of GC number 2, etc. If we divide this quantity by the number of ways to place each of the n GCs considered, we obtain the probability of each arrangement of GCs, shown in equation (4a).

$$P_{GC} = \frac{\prod_{i=1}^{11} r_i^{n_i}}{\dfrac{n!}{\prod_{i=1}^{11} n_i!}} \cdot \frac{1}{\dfrac{N!}{n!\,(N-n)!}} = \frac{\prod_{i=1}^{11} n_i!\, r_i^{n_i}}{n!} \cdot \frac{n!\,(N-n)!}{N!} = \frac{(N-n)!}{N!} \prod_{i=1}^{11} (n_i!)\, r_i^{n_i} \qquad (4a)$$

One should note that in the above,

$$\sum_{i=1}^{11} n_i = n \qquad (4b)$$

hence there are only as many stages considered in the determination of GCs as there are GCs that are measured and available to compare to.

To reiterate, our algorithm for calculating Doppelganger thumb probabilities considers separately the probabilities of both the general pattern and GC occurrence. The probability of GC occurrence is determined by the number of places in which GCs are observed, the relative probability of a GC occurring there, and the number of ways these GCs can then be ordered. The quantification of this is then given by equation (4a). Now, given equations (1) and (4a), we can calculate the probability of any particular fingerprint matching on both the pattern and any n GCs by using the information in Tables 1 and 2.
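The relative probabilities of equation (3) and the arrangement probability of equation (4a) can be sketched as follows. This is an illustration under stated assumptions: it uses the raw frequencies of characteristics 1-11 from Table 2 rather than the rounded probabilities, and it takes equation (4a) to be the product of selection probabilities divided by the multinomial coefficient and by the number of placements, as the text describes. The names `GC_FREQUENCIES`, `relative_probabilities`, and `gc_arrangement_probability` are hypothetical.

```python
from math import comb, factorial, prod

# Frequencies of characteristics 1-11 as listed in Table 2.
GC_FREQUENCIES = {
    "island": 152, "bridge": 105, "spur": 64, "dot": 130,
    "ending ridge": 715, "fork": 328, "lake": 55, "trifurcation": 5,
    "double bifurcation": 12, "delta": 17, "broken ridge": 119,
}


def relative_probabilities(frequencies):
    """Equation (3): each frequency divided by the total over GCs 1-11."""
    total = sum(frequencies.values())
    return {name: count / total for name, count in frequencies.items()}


def gc_arrangement_probability(counts, rel_probs, total_cells):
    """Equation (4a) for a template holding counts[name] copies of each GC."""
    n = sum(counts.values())  # equation (4b)
    selection = prod(rel_probs[name] ** k for name, k in counts.items())
    multinomial = factorial(n) // prod(factorial(k) for k in counts.values())
    return selection / multinomial / comb(total_cells, n)


if __name__ == "__main__":
    r = relative_probabilities(GC_FREQUENCIES)
    print(max(r, key=r.get))  # the most likely single characteristic
    template = {"ending ridge": 8, "fork": 3, "island": 1}  # assumed example
    print(gc_arrangement_probability(template, r, total_cells=1200))
```

The example template is made up for illustration; any dictionary of non-negative counts keyed by the names in `GC_FREQUENCIES` can be passed in its place.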
Since we wish, then, to put a limit on the number of people in the world who can match fingerprints, given these characteristics, we calculated $P_{\max}$, the probability of any thumbprint matching a template with only the most likely characteristics in each of the n GC places. This simplifies equation (4a), by restricting choice to only the GC with maximum probability. Thus we have

$$P_{GC} \le \frac{n!\,(N-n)!}{N!}\, r_{\max}^{\,n}, \qquad P_{\max} = P_{p,\max}\, \frac{n!\,(N-n)!}{N!}\, r_{\max}^{\,n} \qquad (5)$$

where $r_{\max}$ is the largest of the $r_i$ and $P_{p,\max}$ is the largest pattern probability. Some plots of this are given in Appendix A. These plots use the value of $r_{\max}$ obtained by computing the relative probability of ending ridges, and consider only the right and left loop patterns (occurring in equal supply) to constitute the maximum pattern probability.

To calculate the quantities determined in equation (5), it becomes necessary to calculate factorials of very large numbers to determine values of N choose n. This can be approximately done by using Stirling's approximation, whose formula is given by

$$\log(m!) \approx m\log(m) - m + \tfrac{1}{2}\log(2\pi m) \qquad (6)$$

This, in turn, leads us to the approximation

$$\log\binom{N}{n} \approx \log(N!) - \log\big((N-n)!\big) - \log(n!) \qquad (7)$$

which can be utilized to approximate $\binom{N}{n}$.

If we suppose that a percentage of GCs are dependent on the overlying pattern, then our model changes very little. Assuming that l of the n total GCs are dependent on a particular pattern, we can essentially disregard all pattern-dependent GCs as empty cells, as they would be exactly what is expected in the print at that point in the pattern. Hence, with a slight modification from n to n - l, where l denotes the number of GCs dependent on the pattern, equations (4a) and (4b) can still be utilized. In the event that the GCs are wholly determined by the overlying pattern, we can disregard the influence of the pattern in our calculation of $P_{fp}$, as we have more precise information about GC form and occurrence than we do about pattern and sub-pattern form and occurrence. Also, our estimates for the likelihood of a GC occurring at a given point in the N-square array give a more limiting maximum for the probability than do our figures on general pattern characteristics. The omission of the pattern influence on the fingerprint probability is completely valid, since total GC dependence on pattern is equivalent to total pattern dependence on GC; they simply become two different types of taxonomy.

IV. Data

Returning to the problem now, we are specifically asked to determine what the probability is that a person can be misidentified by fingerprint evidence; that is, we are to determine the probability that two people share the same fingerprint characteristics.
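The Stirling approximation described in Section III can be evaluated in log space, which avoids forming very large factorials. The sketch below is a hedged illustration: it assumes that the maximum probability is the largest pattern probability multiplied by $r_{\max}^{\,n}$ and divided by N choose n, with the defaults `r_max = 715/1702` (ending ridges, from the Table 2 frequencies) and `p_pattern_max = 0.325` (one loop chirality, from Table 1). These defaults and all function names are assumptions made for this example, and the values it prints should not be read as the spreadsheet figures reported in this paper.

```python
from math import comb, log, pi


def log_factorial_stirling(m):
    """Stirling's approximation to the natural log of m!."""
    if m == 0:
        return 0.0  # 0! = 1; the formula itself is undefined at m = 0
    return m * log(m) - m + 0.5 * log(2 * pi * m)


def log_choose_stirling(N, n):
    """Approximate natural log of 'N choose n' from three Stirling terms."""
    return (log_factorial_stirling(N)
            - log_factorial_stirling(N - n)
            - log_factorial_stirling(n))


def log10_p_max(n, N=1200, r_max=715 / 1702, p_pattern_max=0.325):
    """Base-10 log of the assumed maximum probability for n matched GCs."""
    natural = log(p_pattern_max) + n * log(r_max) - log_choose_stirling(N, n)
    return natural / log(10)


if __name__ == "__main__":
    # Compare the Stirling estimate with the exact value for N=1200, n=12.
    print(log_choose_stirling(1200, 12), log(comb(1200, 12)))
    for n in (5, 10, 12, 20):
        print(n, log10_p_max(n))
```

Working with logarithms keeps every intermediate value small, which is the practical reason for using an approximation of this kind in a spreadsheet.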
For a template with n GCs, we are to calculate the probability that two distinct people match the template. This is limited by the square of $P_{\max}$ for a given n, which, as graphed in Figure 3 below, is seen to be very low for all n ≥ 10. For the value of n = 12, taken in Osterburg, et al. to be a median value for what is required for verification by various international law enforcement agencies, we can see that the probability of fingerprint multiplicity is 4.64 x 10^-15. These calculations were simply performed using a Microsoft Excel spreadsheet and the formulas in Section III.

[Plot: "Maximum Probabilities at Various Pattern Dependences". Vertical axis: P_max, logarithmic, from 1.00E+04 down to 1.00E-31. Horizontal axis: number of GCs, 0 to 35. Series: No Dependence, 25% Dependence, 50% Dependence, 75% Dependence, 100% Dependence.]

FIGURE 3: Plot of maximum probability as a function of the number of GCs used in the verification process. Here n is allowed to range from 1 to 30.

Another directly applicable and highly interesting problem is the following: What is the minimum number of GCs that a particular country's law enforcement agencies must use in order to get the highest probability of a match using the lowest number of GCs per identification? Using the population figures in Table 3, we can determine this. To do so, we multiply the population of a country by $P_{\max}$ to find the number of people in a country that are probable to match a given GC template. The results are plotted in Appendix A. The plots in Appendix A all point to near certain identification for n ≥ 12. This is true regardless of the country in which the identification is being made. In fact, using the world population figure, on a thumb with 1200 cells a match is all but certain, and indeed, only one person is likely to have ever existed with such a print.

Country          Number of people
US               2.925E+08
World            6.347E+09
China            1.295E+09
Liechtenstein    3.284E+04
# People Ever    1.269E+10

Table 3: Population figures for the world and some representative countries. The number of people ever was a figure computed on the assumption that roughly twice as many people have existed in the history of humanity than exist at this particular point in time.

As was noted before, however, it might be the case that a thumb with 1200 cells is overly large, or that only partial prints can be obtained for identification purposes. In this case, we restrict the number N to a number less than 1200. For the plots in Appendix B, we changed the number 1200 in our calculation to values of N = 600 and N = 300. This increases the probability of finding multiple matches, because fewer sites are available in which to place GCs. However, if as few as 12 GCs are matched, the fingerprint's unique identity is all but assured.

V. Error Analysis

A previous investigation by Pankanti, et al. included the orientation of each minutia in the model for fingerprint individuality. We neglect to include the factor of orientation of the characteristic for many reasons. Firstly, including the factor of GC orientation could only decrease our estimate of the maximum possible thumb Doppelganger probability. Since we are attempting only to find a maximum bound for this probability, removal of a factor which can only decrease the probability of a particular print, while in the same breath unnecessarily complicating our solution, does no damage to our model. Pankanti, who accounts for orientation in his model, arrived at a lower figure for fingerprint individuality than we did. In accounting for this orientation, however, Pankanti completely disregards the differences in minutiae, only concentrating on location and orientation of defining features in the fingerprint ridges. Some figures done on various model calculations that are included in Pankanti's paper are listed in Table 4, Appendix C.

A second reason our model disregards orientation is that our model relies on the assumption that minutiae occur either independently or semi-independently. In accounting for orientation, we would have to take into account restrictions placed on the orientation of the GC by the overall pattern.
This is simple to see: persons with loop patterns have a higher probability of upward and downward pointing GCs than do persons with arches. Accounting for orientation would make the pattern and minutiae probabilities inseparable, and again harm the simplicity of our model while offering little improvement to our limiting maximum.

Another unavoidable problem with our model is the roughness of pattern and GC frequencies. Unfortunately, there are no good assessments published on the percentages of the population whose patterns fall into the arch, loop, and whorl categories. The frequency of occurrence of GCs faces a similar problem. In fact, the only figures we could find were rough estimations based on a small sample of people. Osterburg, whose figures we used in this model, arrived at his probability parameters of GCs by sampling from 39 fingerprints. He did break them into a total of 8,591 cells, but as we do not know whether or not a single person is more likely to have a certain type of GC, these probabilities cannot be taken at face value [1]. Surely more recent figures on these parameters exist, but they again do not harm our model, only the figures which it calculates.

As mentioned before, there is a possibility that there exists dependence between GCs and the overall pattern of a fingerprint. In our model, we attempted to account for this by decreasing the identifying traits of a particular minutia by 25%, 50%, and 100%. For the 100% case, we simply calculated the probability of a particular GC occurrence and disregarded the pattern, as either can be seen to be the determining factor of the other. This is not an exact model simply because this assumes semi-dependence where complete dependence may occur. Without proper relations that give the dependence of minutiae on the overall pattern, however, we are unable to properly account for this. Inasmuch as we were able to adjust for these parameters, our model still predicts that identifying 12 or more minutiae on a print, which is well within current technology, all but assures a positive match.

One who pays astute attention to our graph in Figure 3 notes that the graphs of 100% and 0% dependence are actually the closest in predicted probability. This is because removal of the pattern parameter in the calculation of $P_{\max}$ only increases the overall maximum probability by an approximate factor of 10. The other figures suffer from inexactness in relating the dependence between occurrence of pattern and minutiae. In the figures for our model, we have more precise knowledge of GC occurrence than of pattern occurrence. Hence, the plots in which we require a percent dependence on pattern suffer unnecessarily from inexact data.

As we are creating a somewhat idealistic model of fingerprints, scars were not taken into consideration. As can be seen in Figure 4, scars do have an effect on the appearance of fingerprints. This may create inaccuracies; however, there is no good way to model the formation of scars, as this is completely due to personal experiences.

FIGURE 4: The effect of scars on fingerprint analysis. From Cowger, p. 4.

Our model also differs on one account from most other models of fingerprints. Previous articles [3] published on fingerprint analysis define fingerprints only as the portion in the general vicinity of the central pattern.
Our model actually takes the print on the entire area above the upper joint of the thumb, which would be the type of fingerprint on file. Accordingly, our probabilities are significantly lower than those calculated by others. However, our model can, as mentioned before, be made to approximate these in the limit where the number of cells N is at a value around 300 and n is around 12. The values we calculated in this method match up to other models accordingly, as seen in Table 4.

The major problem which our model suffers from is its inability to account for human error in determining thumbprint probability. Epstein [7] notes that the major problem with latent fingerprint evidence is the inability of the humans who examine the prints to discern exact characteristics. We now have the ability to use optical scans to determine fingerprints of an individual exactly, as opposed to putting ink on file. If the thumbprint matches were able to be tested by a computer, it would be highly unlikely, given our model, that anyone would ever be misidentified.

Comparing the output of our model with the probabilities of error in DNA analysis, we find that fingerprints are a much more accurate method of identification. Though everyone except identical twins and clones has a unique sequence of DNA, for criminology, the exact sequence is not actually used as evidence. Instead, DNA is cut up with an enzyme into restriction fragment length polymorphisms (RFLPs). These pieces of DNA are then run out on a gel, which separates them by the size of the segment [8]. Accordingly, if two or more people simply have restriction sites in approximately the same area, or even have the same amounts of DNA between restriction sites, they can be mistaken for one another. This is a much higher probability than if the exact sequence were taken into account. Accordingly, though misidentification is rare, the probability of misidentification in DNA analysis is on the order of one in ten billion, while according to our data that of fingerprint analysis is much lower [5].
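To make the comparison concrete, the per-comparison probabilities can be turned into an expected number of coincidental matches in a population. The following is a minimal sketch, not part of our model: the population figure is an assumed round number, the function names are illustrative, and comparisons are treated as independent. The DNA value is the order of magnitude quoted above, and the fingerprint value is the upper bound quoted in Appendix C.

```python
import math

# Order-of-magnitude figures quoted in the text.
P_DNA = 1e-10          # DNA misidentification, "one in ten billion"
P_FINGERPRINT = 4e-15  # upper bound for an individual fingerprint (Appendix C)

# Assumption for illustration only: a round world population figure.
WORLD_POPULATION = 6_000_000_000


def expected_matches(population: int, p_match: float) -> float:
    """Expected number of people who coincidentally match (population * p)."""
    return population * p_match


def prob_at_least_one(population: int, p_match: float) -> float:
    """P(at least one coincidental match) = 1 - (1 - p)^population.

    Uses log1p/expm1 so that very small p does not vanish in rounding.
    Assumes independent comparisons.
    """
    return -math.expm1(population * math.log1p(-p_match))


for label, p in (("DNA", P_DNA), ("fingerprint", P_FINGERPRINT)):
    print(
        f"{label:12s} expected matches = {expected_matches(WORLD_POPULATION, p):.3g}, "
        f"P(at least one) = {prob_at_least_one(WORLD_POPULATION, p):.3g}"
    )
```

Under these assumptions the expected number of coincidental DNA matches is of order one, while the fingerprint figure stays several orders of magnitude below one.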

VI. Conclusion

Initially, this problem aroused in us many concerns. What if one of us really had a thumb Doppelganger? We could be convicted for crimes we had never committed! This situation would be most unfortunate. However, after running our model under a case of maximum probability, we discovered that there is a better chance of misidentification through DNA profiling if the fingerprint analysis is conducted with minimal human error. This is plainly evident in the fact that DNA evidence, regarded in legal and public opinion as nearly infallible, has a probability of misidentification on the order of 10^-10, while the probability of fingerprint misidentification is four orders of magnitude less, according to our model. Needless to say, it seems unreasonable to deny fingerprint profiling as evidence in a criminal trial.

Appendix A: Shared Characteristics of a Population

The following plots were used to determine the optimum figure for identification of criminals based on fingerprint evidence that is given in section IV.

[Plot: Number of like thumbprints, 0% dependence, N=1200. Horizontal axis: number of GCs (0 to 35). Vertical axis: number of people with thumbprint (logarithmic scale). Series: US Most, World Most, China Most, Liechtenstein Most, Ever Most.]

Figure 5: Plot of the number of probable like thumbprints in a given country using the model of zero percent pattern dependence. This shows that if only 10 minutiae are required to match, then it is likely that no one in the history of the world has had an exactly matching whole thumbprint.

[Plot: Number of like thumbprints, 25% dependence, N=1200. Axes and series as in Figure 5.]

Figure 6: Same as above, for the 25% dependence model. Here, only 10 minutiae are required for positive identification as well.

[Plot: Number of like thumbprints, 50% dependence, N=1200. Horizontal axis: number of GCs (0 to 35). Vertical axis: number of people with thumbprint (logarithmic scale). Series: US Most, World Most, China Most, Liechtenstein Most, Ever Most.]

Figure 7: Same as above, for the 50% pattern dependence model. Here, around 12 characteristics are required for a highly probable identification. The difference here is likely caused by error in our knowledge of pattern frequencies.

[Plot: Number of like thumbprints, 100% dependence, N=1200. Axes and series as in Figure 7.]

Figure 8: Same as above, for the complete dependence model. Again, only about 10 characteristics are required for a positive identification.

Appendix B: Shared Partial Print Characteristics of a Population

The following plots were used to determine the optimum number of GCs to match up with a given population if only partial prints are available for comparison.

[Plot: Number of like thumbprints, 0% dependence, N=600. Horizontal axis: number of GCs (0 to 35). Vertical axis: number of people with thumbprint (logarithmic scale). Series: US Most, World Most, China Most, Liechtenstein Most, Ever Most.]

Figure 9: A plot of the number of possible like half-thumbprints, given zero dependence on fingerprint pattern.

[Plot: Number of like thumbprints, 100% dependence, N=600. Axes and series as in Figure 9.]

Figure 10: A plot of the number of possible like half-thumbprints, given one hundred percent dependence on fingerprint pattern.

[Plot: Number of like thumbprints, 0% dependence, N=300. Horizontal axis: number of GCs (1 to 30). Vertical axis: number of people with thumbprint (logarithmic scale). Series: US Most, World Most, China Most, Liechtenstein Most, Ever Most.]

Figure 11: A plot of the number of possible like quarter-thumbprints, given zero dependence on fingerprint pattern.

[Plot: Number of like thumbprints, 100% dependence, N=300. Axes and series as in Figure 11.]

Figure 12: A plot of the number of possible like quarter-thumbprints, given one hundred percent dependence on fingerprint pattern.

Appendix C: Table of Calculated Probabilities

These probabilities were calculated using various past models by Pankanti [6]. As noted earlier, our model, which predicts a value less than 4 x 10^-15 for the probability of each individual fingerprint, is in good agreement with these calculations.

Author                      P_fp                                  N=36, R=24, M=72   N=12, R=8, M=72
Galton (1892)               (1/16)(1/256)(1/2)^R                  1.45 x 10^-11      9.54 x 10^-7
Pearson (1930)              (1/16)(1/256)(1/36)^R                 1.09 x 10^-41      8.65 x 10^-17
Henry (1900)                (1/4)^(N+2)                           1.32 x 10^-23      3.72 x 10^-9
Balthazard (1911)           (1/4)^N                               2.12 x 10^-22      5.96 x 10^-8
Bose (1917)                 (1/4)^N                               2.12 x 10^-22      5.96 x 10^-8
Wentworth & Wilder (1918)   (1/50)^N                              6.87 x 10^-62      4.10 x 10^-21
Cummins & Midlo (1943)      (1/31)(1/50)^N                        2.22 x 10^-63      1.32 x 10^-22
Gupta (1968)                (1/10)(1/10)(1/10)^N                  1.00 x 10^-38      1.00 x 10^-14
Roxburgh (1933)             (1/1000)(1.5/(10 x 2.412))^N          3.75 x 10^-47      3.35 x 10^-18
Trauring (1963)             (0.1944)^N                            2.47 x 10^-26      2.91 x 10^-9
Osterburg et al. (1980)     (0.766)^(M-N) (0.234)^N               1.33 x 10^-27      3.05 x 10^-15
Stoney (1985)               (N/5)(0.6)(0.5 x 10^-3)^(N-1)         1.2 x 10^-80       3.5 x 10^-26

TABLE 4: Calculated probabilities for various models. Obtained from Pankanti et al. [6]. Here, R is the number of regions of a fingerprint considered as defined by Galton, and M is the number of regions as defined by Osterburg.
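The closed-form expressions in Table 4 are simple enough to evaluate directly. The following is a minimal sketch that recomputes most of the table under stated assumptions: the dictionary and function names are illustrative and not from Pankanti et al., N is taken to be the number of minutiae, and Stoney's model is left out of the sketch.

```python
import math

# Each entry maps (N, R, M) to the probability of a fingerprint
# configuration, following the expressions listed in Table 4.
MODELS = {
    "Galton (1892)": lambda N, R, M: (1 / 16) * (1 / 256) * (1 / 2) ** R,
    "Pearson (1930)": lambda N, R, M: (1 / 16) * (1 / 256) * (1 / 36) ** R,
    "Henry (1900)": lambda N, R, M: (1 / 4) ** (N + 2),
    "Balthazard (1911)": lambda N, R, M: (1 / 4) ** N,
    "Bose (1917)": lambda N, R, M: (1 / 4) ** N,
    "Wentworth & Wilder (1918)": lambda N, R, M: (1 / 50) ** N,
    "Cummins & Midlo (1943)": lambda N, R, M: (1 / 31) * (1 / 50) ** N,
    "Gupta (1968)": lambda N, R, M: (1 / 10) * (1 / 10) * (1 / 10) ** N,
    "Roxburgh (1933)": lambda N, R, M: (1 / 1000) * (1.5 / (10 * 2.412)) ** N,
    "Trauring (1963)": lambda N, R, M: 0.1944 ** N,
    "Osterburg et al. (1980)": lambda N, R, M: 0.766 ** (M - N) * 0.234 ** N,
}


def evaluate(N: int, R: int, M: int) -> dict:
    """Evaluate every model in MODELS for the given N, R and M."""
    return {name: model(N, R, M) for name, model in MODELS.items()}


# The two parameter sets used as column headings in Table 4.
full_print = evaluate(N=36, R=24, M=72)
partial_print = evaluate(N=12, R=8, M=72)

for name in MODELS:
    print(f"{name:28s} {full_print[name]:.2e}   {partial_print[name]:.2e}")
```

Evaluating the formulas this way is a quick check that a transcribed expression and its tabulated value agree to the precision shown.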

References

[1] J. Osterburg, et al., Development of a Mathematical Formula for the Calculation of Fingerprint Probabilities Based on Individual Characteristics, Journal of the American Statistical Association, Vol. 72, No. 360, pp. 772-778, 1977.
[2] S. L. Sclove, The Occurrence of Fingerprint Characteristics as a Two Dimensional Process, Journal of the American Statistical Association, Vol. 74, No. 367, pp. 588-595, 1979.
[3] James F. Cowger, Friction Ridge Skin: Comparison and Identification of Fingerprints, Elsevier Science Publishing Co. Inc., New York, New York, 1983.
[4] The Noble Qur'an: In the English Language, Dr. Muhammad Taqi-ud-Din Al-Hilali. Riyadh, Houston, Lahore: Darussalam Publishers and Distributors, 1998.
[5] DNA Fingerprinting. The Columbia Encyclopedia, Sixth Edition. New York: Columbia University Press, 2003.
[6] Sharath Pankanti, et al., On the Individuality of Fingerprints, http://biometrics.cse.msu.edu/2cvpr230.pdf
[7] Robert Epstein, Fingerprints Meet Daubert: The Myth of Fingerprint Science is Revealed, Southern California Law Review, Vol. 75, pp. 605-658, 2002.
[8] Anthony J. F. Griffiths, Modern Genetic Analysis, W. H. Freeman and Company, New York, New York, 2002.