
Random Walks

Suppose $X_0$ is fixed at some value, and $X_1, X_2, \ldots$ are iid Bernoulli trials with $P(X_i = 1) = p$ and $P(X_i = -1) = q = 1 - p$. Let $Z_n = X_0 + X_1 + \cdots + X_n$ (so $Z_n$ is $X_0$ plus the $n$th partial sum of the $X_i$). The sequence $Z_n$ is called a simple random walk. Geometrically, $Z_n$ can be viewed as determining a path in the integer lattice from $(0, Z_0)$ to $(n, Z_n)$. The path is continuous, and is formed entirely from diagonal segments of the form $(a, b) \to (a+1, b+1)$ or $(a, b) \to (a+1, b-1)$. A random walk is a Markov chain, with transition probabilities

$$P(Z_n = a \mid Z_{n-1} = b) = p\,I(a = b+1) + q\,I(a = b-1).$$

Suppose we are interested in a hitting time $T_a = \inf\{n : Z_n = a\}$ (i.e. the first time that the path $(n, Z_n)$ hits the horizontal line $y = a$). Given two integers $a, b$, a natural question is to determine the probability $P(T_b < T_a \mid Z_0 = z)$ (i.e. the probability that $b$ is reached before $a$, starting from a specified point $z$). The interesting case is when $a \le z \le b$.
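As a quick illustration of the setup, the walk can be simulated directly. This is a Python sketch (the function name is ours):

```python
import random

def simple_random_walk_path(z0, p, n_steps, rng):
    """Simulate Z_0 = z0 and Z_n = Z_{n-1} + X_n with P(X=+1)=p, P(X=-1)=1-p."""
    path = [z0]
    for _ in range(n_steps):
        step = 1 if rng.random() < p else -1
        path.append(path[-1] + step)
    return path

rng = random.Random(0)
path = simple_random_walk_path(0, 0.4, 10, rng)
# Consecutive values always differ by exactly 1, as in the lattice-path picture.
assert all(abs(b - a) == 1 for a, b in zip(path, path[1:]))
```

Each realization is one lattice path of diagonal segments, as described above.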

Conditioning on the first step,

$$\begin{aligned}
P(T_b < T_a \mid Z_0 = z) &= P(T_b < T_a, X_1 = 1 \mid Z_0 = z) + P(T_b < T_a, X_1 = -1 \mid Z_0 = z) \\
&= P(T_b < T_a \mid Z_0 = z, X_1 = 1)P(X_1 = 1) + P(T_b < T_a \mid Z_0 = z, X_1 = -1)P(X_1 = -1) \\
&= p\,P(T_b < T_a \mid Z_0 = z, X_1 = 1) + q\,P(T_b < T_a \mid Z_0 = z, X_1 = -1) \\
&= p\,P(T_b < T_a \mid Z_1 = z+1) + q\,P(T_b < T_a \mid Z_1 = z-1) \\
&= p\,P(T_b < T_a \mid Z_0 = z+1) + q\,P(T_b < T_a \mid Z_0 = z-1),
\end{aligned}$$

where the last step uses the Markov property (the walk restarted at time 1 is again a simple random walk). If we let $W_z = P(T_b < T_a \mid Z_0 = z)$, then we see that the $W_z$ satisfy the recurrence relation

$$W_z = p W_{z+1} + q W_{z-1},$$

subject to the boundary conditions $W_a = 0$, $W_b = 1$.

Difference equations often have solutions of the form $W_z = \exp(cz)$, and any linear combination of solutions of this form is again a solution. In the current case, we would have

$$\exp(cz) = p\exp(c(z+1)) + q\exp(c(z-1)),$$

which is equivalent to

$$1 = p\exp(c) + q\exp(-c), \quad \text{or} \quad \exp(c) = p\exp(2c) + q.$$

This is quadratic in $\exp(c)$, so we can solve it explicitly:

$$\exp(c) = \frac{1 \pm \sqrt{1 - 4p(1-p)}}{2p} = \frac{1 \pm (2p - 1)}{2p},$$

and we get $\exp(c) = 1$ or $\exp(c) = q/p$. This gives $c = 0$ or $c = \log(q/p)$, which in turn gives the solutions $W_z = 1$ and $W_z = \exp(z\log(q/p)) = (q/p)^z$. (We assume $p \ne 1/2$ throughout, so that the two roots are distinct.) A general solution is of the form

$$W_z = C_1 + C_2\exp(z\log(q/p)),$$

and using the boundary conditions $W_a = 0$ and $W_b = 1$, we get the unique solution

$$W_z = \frac{\exp(z\log(q/p)) - \exp(a\log(q/p))}{\exp(b\log(q/p)) - \exp(a\log(q/p))}.$$

Now extend the definition of the hitting time so that $T_{a,b} = \inf\{n : Z_n = a \text{ or } Z_n = b\}$. We would like to find the expectation $E(T_{a,b} \mid Z_0 = z)$. Denote this value $M_z$. Conditioning on the first step again,

$$\begin{aligned}
M_z &= E(T_{a,b} \mid X_1 = 1)P(X_1 = 1) + E(T_{a,b} \mid X_1 = -1)P(X_1 = -1) \\
&= p(1 + M_{z+1}) + q(1 + M_{z-1}) \\
&= 1 + p M_{z+1} + q M_{z-1}.
\end{aligned}$$
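The closed form for $W_z$ can be checked numerically against the recurrence and boundary conditions. A Python sketch (function name ours, assuming $p \ne 1/2$):

```python
def ruin_probability(z, a, b, p):
    """W_z = P(T_b < T_a | Z_0 = z) for a simple walk with P(step=+1) = p != 1/2."""
    r = (1 - p) / p  # q/p, so exp(z*log(q/p)) = r**z
    return (r**z - r**a) / (r**b - r**a)

p, a, b = 0.4, -3, 5
q = 1 - p
for z in range(a + 1, b):
    W = ruin_probability(z, a, b, p)
    # W_z satisfies the difference equation W_z = p*W_{z+1} + q*W_{z-1}
    assert abs(W - (p * ruin_probability(z + 1, a, b, p)
                    + q * ruin_probability(z - 1, a, b, p))) < 1e-12
assert ruin_probability(a, a, b, p) == 0.0
assert ruin_probability(b, a, b, p) == 1.0
```

The boundary conditions come out exactly, and the interior values satisfy the recurrence to floating-point accuracy.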

This equation is inhomogeneous because of the unit additive constant. The associated homogeneous recurrence is

$$M_z = p M_{z+1} + q M_{z-1}.$$

It is easy to verify that the sum of a particular solution to the inhomogeneous recurrence and any linear combination of solutions to the homogeneous recurrence is again a solution to the inhomogeneous recurrence.

It is also easy to verify that $z/(q-p)$ is a particular solution to the inhomogeneous recurrence: the left side, minus the unit constant, is $z/(q-p) - 1$, and the right side (without the constant) is

$$\frac{p(z+1)}{q-p} + \frac{q(z-1)}{q-p} = \frac{pz + p + qz - q}{q-p} = \frac{z + p - q}{q-p} = \frac{z}{q-p} - 1.$$

The homogeneous recurrence is the same as the recurrence from the previous problem (in different notation). Therefore its general solution has the form $C_1 + C_2\exp(z\log(q/p))$, hence a general solution to the inhomogeneous recurrence has the form

$$\frac{z}{q-p} + C_1 + C_2\exp(z\log(q/p)).$$

The boundary conditions are $M_a = M_b = 0$, which leads to the solution

$$M_z = \frac{z - a}{q - p} - \frac{b - a}{q - p}\cdot\frac{\exp(z\log(q/p)) - \exp(a\log(q/p))}{\exp(b\log(q/p)) - \exp(a\log(q/p))}.$$
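The same kind of numerical check works for $M_z$. This sketch (names ours) rebuilds the ruin probability and verifies the inhomogeneous recurrence and boundary conditions:

```python
def ruin_probability(z, a, b, p):
    """W_z = P(T_b < T_a | Z_0 = z), p != 1/2."""
    r = (1 - p) / p
    return (r**z - r**a) / (r**b - r**a)

def expected_exit_time(z, a, b, p):
    """M_z = E(T_{a,b} | Z_0 = z), via particular + homogeneous solution."""
    q = 1 - p
    return (z - a) / (q - p) - (b - a) / (q - p) * ruin_probability(z, a, b, p)

p, a, b = 0.4, -3, 5
q = 1 - p
assert abs(expected_exit_time(a, a, b, p)) < 1e-12  # boundary M_a = 0
assert abs(expected_exit_time(b, a, b, p)) < 1e-12  # boundary M_b = 0
for z in range(a + 1, b):
    M = expected_exit_time(z, a, b, p)
    # M_z satisfies M_z = 1 + p*M_{z+1} + q*M_{z-1}
    assert abs(M - (1 + p * expected_exit_time(z + 1, a, b, p)
                    + q * expected_exit_time(z - 1, a, b, p))) < 1e-10
```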

This can be re-written as

$$M_z = \frac{W_z(b - z) + (1 - W_z)(a - z)}{p - q}.$$

Moment generating functions (MGFs) are another way to derive this type of result. The moment generating function of a random variable $Z$ is defined to be the expectation $M(\theta) = E\exp(\theta Z)$. This is equal to $\int\exp(\theta z)\pi(z)\,dz$ or $\sum_z\exp(\theta z)\pi(z)$, depending on whether $Z$ is continuous or discrete (here $\pi$ denotes the density or mass function of $Z$).

A key property of the MGF is that (under fairly general conditions) $d^k M(\theta)/d\theta^k = E Z^k\exp(\theta Z)$, hence the $k$th derivative evaluated at $\theta = 0$ is $EZ^k$, the $k$th moment of $Z$. Two special cases are $k = 0$, giving $M(0) = 1$, and $k = 1$, giving $M'(0) = EZ$.

Another important property of the MGF is that if $X$ and $Y$ are independent with MGFs $M_X(\theta)$ and $M_Y(\theta)$, and $Z = X + Y$, then $M_Z(\theta) = M_X(\theta)M_Y(\theta)$. In particular, if $X_1, X_2, \ldots$ are iid and $Z_n = \sum_{i=1}^n X_i$, then $M_{Z_n}(\theta) = M_X(\theta)^n$.

Suppose that $Z$ is discrete, $EZ \ne 0$, and there exist $a < 0$ and $b > 0$ such that $P(Z = a) > 0$ and $P(Z = b) > 0$. We will prove that there is a unique value $\theta^*$ distinct from $0$ such that $M(\theta^*) = 1$.

The moment generating function is bounded below by $P(Z = a)\exp(\theta a)$ and by $P(Z = b)\exp(\theta b)$. Therefore as $\theta$ approaches $\pm\infty$, $M(\theta)$ approaches $+\infty$. The second derivative of the MGF is $EZ^2\exp(\theta Z)$, which is positive, hence $M(\theta)$ is convex. Since $M'(0) = EZ \ne 0$, $M(\theta)$ is either strictly increasing or strictly decreasing as it passes through $\theta = 0$, where $M(0) = 1$. By continuity, and by the limits at $\pm\infty$, there must be a second solution to $M(\theta) = 1$. By convexity, this solution is unique.

Review

$X_1, X_2, \ldots$ iid with $P(X_i = 1) = p$, $P(X_i = -1) = q = 1 - p$. The process $Z_n = X_0 + X_1 + \cdots + X_n$ is a random walk, where $X_0$ is a constant. $T_a = \inf\{n : Z_n = a\}$ is the random number of steps before $Z_n$ hits $a$. We derived formulas for $M_z \equiv E(T_{a,b} \mid X_0 = z)$ and $W_z \equiv P(T_b < T_a \mid X_0 = z)$ using difference equations.

The moment generating function (MGF) of a random variable $X$ is $E\exp(\theta X)$. The MGF has three key properties:

1. If $M_X^{(k)}(\theta) \equiv d^k M_X(\theta)/d\theta^k$, then $M_X^{(k)}(0) = EX^k$.

2. If $A$ and $B$ are independent, then $M_{A+B}(\theta) = M_A(\theta)M_B(\theta)$.

3. If $X$ is (i) discrete, (ii) has positive probability of being both positive and negative, and (iii) has nonzero mean, then the MGF $M_X(\theta)$ crosses the level $1$ at exactly two distinct points. One of the points is $\theta = 0$; the other is a point $\theta^* \ne 0$.
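Since $M$ is convex with $M(0) = 1$, the second crossing $\theta^*$ can be located by bisection. A Python sketch for the $\pm 1$ walk with $p < 1/2$ (so $\theta^* > 0$; function names are ours):

```python
import math

def mgf(theta, p):
    """M_X(theta) = p*e^theta + q*e^(-theta) for the +1/-1 step distribution."""
    return p * math.exp(theta) + (1 - p) * math.exp(-theta)

def theta_star(p, lo=1e-9, hi=50.0, tol=1e-12):
    """Bisect for the nonzero root of M(theta) = 1, assuming p < 1/2.
    M dips below 1 just right of 0 (since M'(0) = 2p - 1 < 0) and then
    grows to infinity, so the root is bracketed by [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mgf(mid, p) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p = 0.4
t = theta_star(p)
assert abs(t - math.log((1 - p) / p)) < 1e-9  # closed form: theta* = log(q/p)
```

For this walk the root is known in closed form, which makes a convenient check of the numerical scheme.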

Continuation

The MGF for each $X_i$ is $M_X(\theta) = p\exp(\theta) + q\exp(-\theta)$. Since $X_i$ satisfies conditions (i), (ii), and (iii) in property 3 above, there exists $\theta^* \ne 0$ such that $M_X(\theta^*) = 1$. Solving directly yields $\theta^* = \log(q/p)$.

The random variable $Z_n - X_0$ is the signed net vertical displacement of the random walk after $n$ steps. The MGF for $Z_n - X_0$ is $M_{Z_n - X_0}(\theta) = (p\exp(\theta) + q\exp(-\theta))^n$, and $M_{Z_n - X_0}(\theta^*) = 1$.

Now consider $Z_{T_{a,b}} - X_0$. This is the net vertical displacement of the random walk at the point when it has first hit $a$ or $b$. Therefore it can take on only two values: $b - X_0$ (with probability $W_z$) and $a - X_0$ (with probability $1 - W_z$).

Now consider the MGF of $Z_{T_{a,b}} - X_0$. This looks like the MGF of $Z_n - X_0$ with $n = T_{a,b}$, but it is actually quite different, because $T_{a,b}$ is a random variable, and the final expression for the MGF should not depend on it. Thus the MGF of $Z_{T_{a,b}} - X_0$ is not $(p\exp(\theta) + q\exp(-\theta))^{T_{a,b}}$. It is true, however (by Wald's identity, stated below), that

$$M_{Z_{T_{a,b}} - X_0}(\theta^*) = E\big[(p\exp(\theta^*) + q\exp(-\theta^*))^{T_{a,b}}\big] = 1,$$

since $p\exp(\theta^*) + q\exp(-\theta^*) = 1$. Therefore, even though we do not yet have the MGF of $Z_{T_{a,b}} - X_0$, we do know that it equals $1$ when evaluated at $\theta = \theta^*$.

Since $Z_{T_{a,b}} - X_0$ can take on only $b - X_0$ (with probability $W_z$) or $a - X_0$ (with probability $1 - W_z$), the MGF of $Z_{T_{a,b}} - X_0$ satisfies

$$M_{Z_{T_{a,b}} - X_0}(\theta) = W_z\exp(\theta(b - X_0)) + (1 - W_z)\exp(\theta(a - X_0)).$$

This expression must evaluate to $1$ at $\theta^* = \log(q/p)$. Therefore we can solve for $W_z$ (writing $z = X_0$):

$$W_z = \frac{\exp(z\theta^*) - \exp(a\theta^*)}{\exp(b\theta^*) - \exp(a\theta^*)}.$$

Wald's identity:

$$E\left[\frac{\exp(\theta(Z_{T_{a,b}} - X_0))}{M_X(\theta)^{T_{a,b}}}\right]
= E_{T_{a,b}}\,E\!\left[\frac{\exp(\theta(Z_{T_{a,b}} - X_0))}{M_X(\theta)^{T_{a,b}}}\,\Big|\,T_{a,b}\right]
= E_{T_{a,b}}\,\frac{M_X(\theta)^{T_{a,b}}}{M_X(\theta)^{T_{a,b}}} = 1.$$

Differentiating Wald's identity with respect to $\theta$ shows that the following is equal to $0$:

$$E\left[(Z_{T_{a,b}} - X_0)\,\frac{\exp(\theta(Z_{T_{a,b}} - X_0))}{M_X(\theta)^{T_{a,b}}}
- T_{a,b}\,\frac{M_X'(\theta)}{M_X(\theta)}\,\frac{\exp(\theta(Z_{T_{a,b}} - X_0))}{M_X(\theta)^{T_{a,b}}}\right].$$

Evaluating at $\theta = 0$, where $M_X(0) = 1$ and $M_X'(0) = EX_i$, yields

$$E(Z_{T_{a,b}} - X_0) - E(T_{a,b})\,EX_i = 0.$$

From this we conclude

$$E(Z_{T_{a,b}} - X_0) = E(T_{a,b})\,EX_i.$$

We know that

$$E(Z_{T_{a,b}} - X_0) = W_z(b - X_0) + (1 - W_z)(a - X_0),$$

and that $EX_i = p - q$. Therefore

$$M_z = E(T_{a,b}) = \frac{W_z(b - X_0) + (1 - W_z)(a - X_0)}{p - q}.$$

Connection to a simple sequence alignment model: Suppose that we are aligning two random sequences against each other without gaps, and the scoring model gives $+1$ for a match and $-1$ for a mismatch. Then the score at position $n$ is a random walk, where $p$ is the null probability of a match (which will ordinarily be less than $1/2$). Suppose we choose as our alignment score the highest score that is ever reached before the first time that the walk reaches a negative score. We would like to know the null distribution of this score.

In the notation used earlier, $z = 0$, since we always start at a score of $0$, and $a = -1$, since we always stop when the score becomes negative. The probability that the maximal score is at least $b$ is $P(T_b < T_{-1})$, which is $W_0$ as computed above. Specifically, it is

$$W_0 = \frac{1 - \exp(-\theta^*)}{\exp(b\theta^*) - \exp(-\theta^*)}.$$

Approximating further, $\exp(b\theta^*)$ typically dominates $\exp(-\theta^*)$, so we get

$$P(S_{\max} \ge b) \approx C\exp(-\theta^* b),$$

which is a geometric probability.

We want to generalize beyond a simple random walk. Suppose that each $X_i$ has support contained in $\{-c, -c+1, \ldots, d\}$, where $c$ and $d$ are positive integers, and let $p_{-c}, \ldots, p_d$ denote the probabilities with which $X_i$ takes on each of these values. The MGF of $X_i$ is

$$m_X(\theta) = \sum_j p_j\exp(j\theta).$$

We assume that $p_{-c}$ and $p_d$ are positive, that $EX_i = \sum_j j p_j < 0$, and that no integer greater than $1$ divides all of the step sizes $j$ with $p_j \ne 0$. Under these conditions, there exists $\theta^* > 0$ such that $m_X(\theta^*) = 1$.

We now generalize $T_{a,b}$ for $a < b$ as follows:

$$T_{a,b} = \inf\{n : Z_n \ge b \text{ or } Z_n \le a\}.$$

The MGF of $Z_{T_{a,b}} - X_0$ also crosses $1$ at $\theta^*$: $m_{Z_{T_{a,b}} - X_0}(\theta^*) = 1$. From now on we take $X_0 = 0$.

Taking $a = -1$, as in the alignment model, we observe that $Z_{T_{a,b}}$ must terminate at one of $b, b+1, \ldots, b+d$ (overshooting the upper boundary) or one of $-c, -c+1, \ldots, -1$ (overshooting the lower boundary). Thus we can write

$$m_{Z_{T_{a,b}}}(\theta) = \sum_{k=-c}^{-1} P_k\exp(k\theta) + \sum_{k=b}^{b+d} P_k\exp(k\theta),$$
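For a general step distribution the root $\theta^*$ of $m_X(\theta) = 1$ has no closed form, but the same bisection idea applies, since $m_X$ is convex, equals $1$ at $0$, dips below $1$ for small $\theta > 0$ (negative drift), and grows to infinity. A sketch (the step distribution below is made up for illustration):

```python
import math

def mgf(theta, steps):
    """m_X(theta) = sum_j p_j e^{j theta}, for a step distribution {j: p_j}."""
    return sum(p_j * math.exp(j * theta) for j, p_j in steps.items())

def theta_star(steps, hi=50.0, tol=1e-12):
    """Bisect for the positive root of m_X(theta) = 1, assuming EX < 0
    and at least one positive step with positive probability."""
    lo = 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if mgf(mid, steps) < 1.0 else (lo, mid)
    return 0.5 * (lo + hi)

# Hypothetical scoring steps: +2 w.p. 0.1, +1 w.p. 0.2, -1 w.p. 0.7.
# Drift is 0.2 + 0.2 - 0.7 = -0.3 < 0, so theta* > 0 exists.
steps = {2: 0.1, 1: 0.2, -1: 0.7}
t = theta_star(steps)
assert t > 0 and abs(mgf(t, steps) - 1.0) < 1e-9
```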

where $P_k$ is the probability that $Z_{T_{a,b}} = k$.

Based on Wald's identity, we have $E(T_{a,b}) = E(Z_{T_{a,b}})/EX$. We fix $a = -1$ and let $b \to \infty$ to get an asymptotic estimate for $A = E(T_{a,b})$. Since $E(Z_{T_{a,b}}) = \sum_k k P_k$, as $b \to \infty$ this converges to $\sum_{k=-c}^{-1} k R_k$, where $R_k$ is the limit as $b \to \infty$ of the probability that $Z_{T_{a,b}} = k$ (note that for positive $k$ the $P_k$ vanish in the limit). Thus we get

$$A = \frac{\sum_{k=-c}^{-1} k R_k}{\sum_{k=-c}^{d} k p_k}$$

as an approximation to $E(T_{a,b})$. Ultimately this can be used in inference for sequence alignment: a local alignment whose length is comparable to $A$ is what chance alone produces, so it is unlikely to reflect a true biological relationship. At this point we cannot evaluate $A$, because we do not yet know how to compute the $R_k$.

Now consider the case where the stopping occurs at $T_{-L,1}$, where $L > 0$. The MGF of $Z_{T_{-L,1}}$ evaluated at $\theta^*$ becomes

$$\sum_{k=-L-c+1}^{-L} Q_k(L)\exp(k\theta^*) + \sum_{k=1}^{d} Q_k(L)\exp(k\theta^*).$$

It is a fact that the limits $Q_k = \lim_{L\to\infty} Q_k(L)$ exist. The sum $1 - Q = \sum_{k=1}^{d} Q_k$ is less than $1$; that is, $Q > 0$ is the probability that the random walk is never positive. Furthermore, the terms $Q_k(L)\exp(k\theta^*)$ vanish as $L \to \infty$ for $k < 0$ (since we know that $\theta^* > 0$). Therefore we get

$$\sum_{k=1}^{d} Q_k\exp(k\theta^*) = 1.$$

Let $F(y)$ be the probability that the walk never exceeds $y$. We can decompose this event according to the first positive value reached (if any):

$$F(y) = P(\text{no positive value ever reached}) + \sum_{k=1}^{d} P(k \text{ is first positive value reached})\,P(y \text{ never exceeded} \mid k \text{ is first positive value reached}),$$

that is,

$$F(y) = Q + \sum_{k=1}^{d} Q_k F(y - k)$$

for $y \ge 0$, with the convention $F(x) = 0$ for $x < 0$.

Next we apply the renewal theorem, which states that if $b_i$, $f_i$, and $u_i$ ($i \ge 0$) are non-negative and satisfy

1. $B = \sum_i b_i < \infty$,

2. $\sum_i f_i = 1$,

3. $\mu = \sum_i i f_i < \infty$,

4. the GCD of $\{i : f_i > 0\}$ is $1$,

5. $u_y - b_y = \sum_{k=0}^{y} f_k u_{y-k}$,

then $u_y \to B/\mu$ as $y \to \infty$.
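The $Q_k$ can be approximated by a truncated dynamic program over the nonpositive states; this scheme is our own illustration, not something from the notes. For the $\pm 1$ walk the first positive value is always $1$, and classically $P(\text{ever reach } +1) = p/q$, which gives a check of the identity $\sum_k Q_k\exp(k\theta^*) = 1$:

```python
import math

def first_positive_distribution(steps, floor=-100, n_iter=3000):
    """Approximate Q_k = P(first positive value of the walk is k), k = 1..d.
    prob[x] tracks the probability that the walk is at x <= 0 and has never
    been positive; mass stepping to y > 0 is recorded in Q[y]. Paths falling
    below 'floor' are dropped: with negative drift they have only probability
    on the order of exp(theta* * floor) of ever becoming positive."""
    d = max(steps)
    Q = {k: 0.0 for k in range(1, d + 1)}
    prob = {0: 1.0}
    for _ in range(n_iter):
        nxt = {}
        for x, px in prob.items():
            for j, pj in steps.items():
                y = x + j
                if y > 0:
                    Q[y] += px * pj
                elif y >= floor:
                    nxt[y] = nxt.get(y, 0.0) + px * pj
        prob = nxt
    return Q

p, q = 0.3, 0.7
Q = first_positive_distribution({1: p, -1: q})
theta = math.log(q / p)
assert abs(Q[1] - p / q) < 1e-4                 # Q_1 = p/q for the +-1 walk
assert abs(sum(qk * math.exp(k * theta) for k, qk in Q.items()) - 1.0) < 1e-3
```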

Let $V(y) = (1 - F(y))\exp(y\theta^*)$. Substituting $F(y) = 1 - V(y)\exp(-y\theta^*)$ into the recursion gives

$$1 - V(y)\exp(-y\theta^*) = Q + \sum_{k=1}^{\min(y,d)} Q_k\big(1 - V(y-k)\exp(-(y-k)\theta^*)\big),$$

which when $y < d$ can be rewritten

$$V(y) = \exp(y\theta^*)(Q_{y+1} + \cdots + Q_d) + \sum_{k=1}^{y} Q_k\exp(k\theta^*)V(y-k),$$

and when $y \ge d$ can be written

$$V(y) = \sum_{k=1}^{d} Q_k\exp(k\theta^*)V(y-k).$$

Now if we let $f_k = Q_k\exp(k\theta^*)$, and $b_y = \exp(y\theta^*)(Q_{y+1} + \cdots + Q_d)$ if $y < d$ and $b_y = 0$ if $y \ge d$, then the renewal theorem can be applied (note that $\sum_k f_k = 1$ by the identity above). We can compute $\mu = \sum_k k Q_k\exp(k\theta^*)$ directly, as it is a finite sum. To make use of the renewal theorem, we must also compute $B = \sum_y b_y$. It is easy to verify that

$$(\exp(\theta^*) - 1)B = \sum_k Q_k\exp(k\theta^*) - \sum_k Q_k = 1 - (1 - Q) = Q.$$

Thus $B = Q/(\exp(\theta^*) - 1)$, and the renewal theorem states that

$$V(y) \to \frac{Q}{(\exp(\theta^*) - 1)\sum_k k Q_k\exp(k\theta^*)},$$

and using the relationship $F(y) = 1 - V(y)\exp(-y\theta^*)$ we are done.

Review:

Let $F(y)$ denote the probability that a random walk starting at zero never exceeds $y$. We found that

$$F(y) = 1 - V(y)\exp(-y\theta^*),$$

and that

$$V(y) \to \frac{Q}{(\exp(\theta^*) - 1)\sum_{k=1}^{d} k Q_k\exp(k\theta^*)} \equiv V_\infty,$$

where $Q_k$ is the probability that the first positive value of the walk occurs at height $k$, and $1 - Q = \sum_{k=1}^{d} Q_k$. Thus we write

$$1 - F(y) \approx V_\infty\exp(-y\theta^*), \quad \text{or equivalently} \quad (1 - F(y))\exp(y\theta^*) \approx V_\infty.$$

Now we are interested in the case where the walk stops when it first reaches a negative value. We would like to obtain an approximation to the probability $G(y)$ that such a walk never exceeds $y$. Let $\bar F(y) = 1 - F(y)$ and $\bar G(y) = 1 - G(y)$ denote the probabilities that the two walks exceed $y$. Let $R_j$ denote the probability that $j$ is the first negative value reached by the walk that stops when it reaches a negative score (so $j$ must be one of $-1, -2, \ldots, -c$). Decomposing according to whether the unrestricted walk exceeds $y$ before or after its first negative value, we can write

$$\bar F(y) = \bar G(y) + \sum_{j=-c}^{-1} R_j\bar F(y - j),$$

or

$$\exp(y\theta^*)\bar F(y) = \exp(y\theta^*)\bar G(y) + \sum_{j=-c}^{-1} R_j\exp(j\theta^*)\exp((y-j)\theta^*)\bar F(y-j),$$

which, letting $y \to \infty$, leads to

$$V_\infty \approx \exp(y\theta^*)\bar G(y) + V_\infty\sum_{j=-c}^{-1} R_j\exp(j\theta^*).$$

Using the value of $V_\infty$ computed earlier, we have

$$\bar G(y) \approx C\exp(-\theta^*(y+1)), \quad \text{where} \quad C = \frac{Q\big(1 - \sum_{j=-c}^{-1} R_j\exp(j\theta^*)\big)}{(1 - \exp(-\theta^*))\sum_{k=1}^{d} k Q_k\exp(k\theta^*)},$$

and we conclude that the probability of the walk exceeding $y$ before it becomes negative decays geometrically, proportionally to $\exp(-y\theta^*)$.

Extreme Values of iid Sequences

Let $X_1, \ldots, X_n$ denote an iid sequence of random variables. Then

$$P(\max\{X_1,\ldots,X_n\} > t) = 1 - P(\text{all } X_i \le t) = 1 - P(X_1 \le t)^n.$$

Two examples where this can be used to give a simple expression for probabilities involving extreme values are:

1. Uniform distribution on $(a, b)$:

$$P(\max\{X_1,\ldots,X_n\} > t) = 1 - \left(\frac{t-a}{b-a}\right)^n.$$

2. Exponential distribution with mean $\lambda$:

$$P(\max\{X_1,\ldots,X_n\} > t) = 1 - (1 - \exp(-t/\lambda))^n.$$

Now suppose that the $X_i$ have a density $\pi(\cdot)$. Let $F(t) = \int_{-\infty}^{t}\pi(u)\,du$ denote the cumulative distribution function, and note that $P(\max\{X_1,\ldots,X_n\} \le t) = F(t)^n$, and therefore the density of $\max\{X_1,\ldots,X_n\}$ is

$$\frac{d}{dt}F(t)^n = nF(t)^{n-1}F'(t) = nF(t)^{n-1}\pi(t).$$

Similarly, the minimum satisfies

$$P(\min\{X_1,\ldots,X_n\} > t) = (1 - F(t))^n,$$

and hence the density of the minimum is $n(1 - F(t))^{n-1}\pi(t)$.

For the exponential distribution with mean $\lambda$, the density of the maximum is

$$n(1 - \exp(-t/\lambda))^{n-1}\exp(-t/\lambda)/\lambda,$$

and the density of the minimum is

$$n(\exp(-t/\lambda))^{n-1}\exp(-t/\lambda)/\lambda = n\exp(-nt/\lambda)/\lambda,$$

which is itself an exponential density, with mean $\lambda/n$.

The exponential distribution has an important property called the memoryless property. Suppose $X_1, X_2, \ldots, X_n$ are exponential, and let $X_{(1)} \le \cdots \le X_{(n)}$ be the same set of values in increasing order. Let $I_1 = X_{(1)}$, and for $k > 1$ let $I_k = X_{(k)} - X_{(k-1)}$. A consequence of the memoryless property is that the spacings $I_k$ are independent. Furthermore, $I_k$ has the distribution of the smallest of $n - k + 1$ exponential random variables, so $I_k$ is exponential with mean $\lambda_k = \lambda/(n - k + 1)$. Therefore $EI_1 = \lambda/n$, $EI_2 = \lambda/(n-1)$, and so on. Since $X_{\max} = \sum_k I_k$, we have

$$EX_{\max} = \lambda(1 + 1/2 + 1/3 + \cdots + 1/n),$$

$$\operatorname{Var}X_{\max} = \lambda^2(1 + 1/4 + \cdots + 1/n^2).$$
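The spacings decomposition turns the mean and variance of the maximum into finite sums that are easy to evaluate. A quick sketch (function name ours):

```python
import math

def exp_max_moments(lam, n):
    """Mean and variance of the max of n iid Exponential(mean lam) variables,
    via the independent spacings I_k ~ Exponential(mean lam/(n-k+1))."""
    mean = lam * sum(1.0 / j for j in range(1, n + 1))
    var = lam**2 * sum(1.0 / j**2 for j in range(1, n + 1))
    return mean, var

lam, n = 2.0, 1000
mean, var = exp_max_moments(lam, n)
gamma = 0.5772156649015329  # Euler's constant
# H_n is close to log n + gamma, so the mean is close to lam*(log n + gamma)
assert abs(mean - lam * (math.log(n) + gamma)) < 0.01
# The variance is bounded by lam^2 * pi^2 / 6 for every n
assert var < lam**2 * math.pi**2 / 6
```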

Since $\sum_{k=1}^{\infty} 1/k^2 = \pi^2/6$, we can approximate the variance of $X_{\max}$ by $\lambda^2\pi^2/6$. It is a fact that $\sum_{j=1}^{n} 1/j - \log n$ has a finite limit called Euler's constant, denoted $\gamma$. Thus the mean of $X_{\max}$ can be approximated as $\lambda(\gamma + \log n)$. Note the unusual nature of $X_{\max}$: as $n \to \infty$ the mean becomes infinite while the variance stays bounded.

The extreme value theory for the geometric distribution is very similar to that of the exponential distribution. For geometric random variables with mass function $(1-p)p^k$ on $k = 0, 1, \ldots$, we have

$$P(Y \ge t) = p^t = \exp(-t\log(1/p)).$$

If we set $\lambda = 1/\log(1/p)$, this is $\exp(-t/\lambda)$, and the results for the exponential distribution are approximately true for the geometric distribution.

Returning to the basic fact for the exponential distribution,

$$P(\max(Y_1,\ldots,Y_n) \le t) = (1 - \exp(-t/\lambda))^n,$$

changing variables to $t \to t + \lambda\log n$ yields

$$P(\max(Y_1,\ldots,Y_n) \le t + \lambda\log n) = (1 - \exp(-t/\lambda - \log n))^n = (1 - \exp(-t/\lambda)/n)^n \to \exp(-\exp(-t/\lambda)).$$

Thus we can write

P (max(y 1,..., Y n ) t) exp( exp( t/λ + log n/λ)). Next we will get a slightly better approximation. Let F (t) = exp( exp( t)). This is clearly a cumulative distribution function on t (0, ) (it is positive, non-decreasing, and has left and right limits at 0 and 1 respectively). The corresponding density is f(t) = exp( t exp( t)). This is called the double exponential distribution. It is a fact that for a large class of random variables (including all geometric-like distributions), the maximum of n iid realizations comes from a distribution that converges to F as n. Specifically, let µ n = λ(log n + γ), and σ 2 n = λ 2 π 2 /6. Then the distribution function for Y max for n iid geometric-like random variables is P (Y max t) = exp( exp( π(t µ n )/σ n 6)). To get a p-value we would use P (Y max t) = 1 exp( exp( π(t µ n )/σ n 6)). 19

Application to Inference for Ungapped Alignment Scores

Suppose we have an ungapped alignment of two random sequences, where the following conditions hold:

1. The GCD of the possible scores is $1$.

2. At any given position, there is a positive probability of receiving both a negative and a positive score.

3. The expected score is negative; that is, $\sum_{x,y} p(x)p(y)s(x,y) < 0$.

We know condition 3 is automatically satisfied if $s(x,y) = \log\big(p(x,y)/(p(x)p(y))\big)$ is defined as a log-likelihood ratio.

Let $S_i = s(X_i, Y_i)$ be the score for position $i$ alone. The global alignment score up to position $n$ is $Z_n = S_1 + \cdots + S_n$; $Z_n$ is a random walk that goes to $-\infty$ with probability $1$. The local alignment score for positions $m+1, \ldots, n$ is $Z_n - Z_m$.

Define a ladder point to be a new minimum of the walk: $k$ is a ladder point if $Z_k < \min(Z_1, \ldots, Z_{k-1})$. Let $L_1, L_2, \ldots$ be the ladder points, and let

$$U_k = \max\{Z_{L_k} - Z_{L_k},\; Z_{L_k+1} - Z_{L_k},\; \ldots,\; Z_{L_{k+1}} - Z_{L_k}\}.$$

The $U_k$ are independent, and each $U_k$ has the distribution of the maximum value of a random walk that starts at $0$ and is absorbed at any negative state (which was worked out above).

Because of the negative drift, the optimal local alignment score $U_{\max}$ is approximated by the maximum of the $U_k$. Let $A$ be the expected number of steps before a negative score is reached (as above). If $N$ is the length of the two sequences being compared, the number of ladder points is approximately $n = N/A$. Substituting this into the approximate extreme value distribution for geometric-like random variables yields

$$P(U_{\max} \le u) \approx \exp(-KN\exp(-u\theta^*)),$$

where $K = C\exp(-\theta^*)/A$.
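Putting the pieces together, a p-value calculation in the spirit of this approximation might look as follows. The parameter values here are made up for illustration; in practice $K$, $\theta^*$, and $A$ would be computed from the scoring scheme as above:

```python
import math

def ungapped_pvalue(u, N, K, theta_star):
    """Approximate p-value P(U_max >= u) = 1 - exp(-K*N*exp(-theta_star*u)),
    under the extreme value approximation for ladder-epoch maxima."""
    return 1.0 - math.exp(-K * N * math.exp(-theta_star * u))

# Illustrative (made-up) parameters, not fitted to any real scoring matrix:
K, theta, N = 0.1, math.log(0.7 / 0.3), 1_000_000
p30 = ungapped_pvalue(30.0, N, K, theta)
p40 = ungapped_pvalue(40.0, N, K, theta)
assert 0.0 < p40 < p30 < 1.0  # p-values shrink geometrically as u grows
```

As expected from the geometric tail, each unit increase in the score threshold $u$ multiplies the (small) p-value by roughly $\exp(-\theta^*)$.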