RANDOM WALKS, LARGE DEVIATIONS, AND MARTINGALES


Chapter 7

RANDOM WALKS, LARGE DEVIATIONS, AND MARTINGALES

7.1 Introduction

Definition 7.1. Let {X_i; i ≥ 1} be a sequence of IID random variables, and let S_n = X_1 + X_2 + ··· + X_n. The integer-time stochastic process {S_n; n ≥ 1} is called a random walk, or, more precisely, the random walk based on {X_i; i ≥ 1}.

For any given n, S_n is simply a sum of IID random variables, but here the behavior of the entire random walk process, {S_n; n ≥ 1}, is of interest. Thus, for a given real number α > 0, we might try to find the probability that S_n ≥ α for any n, or, given that S_n ≥ α for one or more values of n, we might want to find the distribution of the smallest n such that S_n ≥ α. Typical questions about random walks, then, are finding the smallest n such that S_n reaches or exceeds a threshold, and finding the probability that the threshold is ever reached or crossed. Since S_n tends to drift downward with increasing n if E[X] = X̄ < 0, and tends to drift upward if X̄ > 0, the results to be obtained depend critically on whether X̄ < 0, X̄ > 0, or X̄ = 0. Since results for X̄ < 0 can be easily translated into results for X̄ > 0 by considering {−S_n; n ≥ 0}, we will focus on the case X̄ < 0. As one might expect, both the results and the techniques have a very different flavor when X̄ = 0, since here the random walk does not drift but typically wanders around in a rather aimless fashion.

The following three subsections discuss three special cases of random walks. The first two, simple random walks and integer random walks, will be useful throughout as examples, since they can be easily visualized and analyzed. The third special case is that of renewal processes, which we have already studied and which will provide additional insight into the general study of random walks. After this, Sections 7.2 and 7.3 show how two major application areas, G/G/1 queues and

hypothesis testing, can be treated in terms of random walks. These sections also show why questions related to threshold crossings are so important in random walks. Section 7.4 then develops the theory of threshold crossing for general random walks, and Section 7.5 extends and in many ways simplifies these results through the use of stopping rules and a powerful generalization of Wald's equality known as Wald's identity.

The remainder of the chapter is devoted to a rather general type of stochastic process called martingales. The topic of martingales is both a subject of interest in its own right and also a tool that provides additional insight into random walks, laws of large numbers, and other basic topics in probability and stochastic processes.

7.1.1 Simple random walks

Suppose X_1, X_2, ... are IID binary random variables, each taking on the value 1 with probability p and −1 with probability q = 1 − p. Letting S_n = X_1 + ··· + X_n, the sequence of sums {S_n; n ≥ 1} is called a simple random walk. S_n is the difference between positive and negative occurrences in the first n trials. Thus, if there are j positive occurrences for 0 ≤ j ≤ n, then S_n = 2j − n, and

    Pr{S_n = 2j − n} = [n! / (j!(n−j)!)] p^j (1−p)^{n−j}.    (7.1)

This distribution allows us to answer questions about S_n for any given n, but it is not very helpful in answering such questions as the following: for any given integer k > 0, what is the probability that the sequence S_1, S_2, ... ever reaches or exceeds k? This probability can be expressed¹ as Pr{⋃_{n=1}^∞ {S_n ≥ k}} and is referred to as the probability that the random walk crosses a threshold at k. Exercise 7.1 demonstrates the surprisingly simple result that for a simple random walk with p < 1/2, this threshold crossing probability is

    Pr{⋃_{n=1}^∞ {S_n ≥ k}} = (p / (1−p))^k.    (7.2)

Sections 7.4 and 7.5 treat this same question for general random walks.
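The closed form (7.2) is easy to check numerically. The sketch below is not part of the text's development: the parameters p and k are arbitrary choices, and each simulated path is truncated at a finite number of steps, which is a reasonable approximation here because the walk drifts downward when p < 1/2.

```python
import random

def crossing_prob(p, k, trials=5000, n_max=200, seed=1):
    """Estimate Pr{S_n >= k for some n} for a simple random walk with
    steps +1 (prob p) and -1 (prob 1-p), truncating each path at n_max steps."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = 0
        for _ in range(n_max):
            s += 1 if rng.random() < p else -1
            if s >= k:
                hits += 1
                break
    return hits / trials

p, k = 0.25, 3
exact = (p / (1 - p)) ** k      # (7.2): (1/3)^3, about 0.037
print(exact, crossing_prob(p, k))
```

The estimate should agree with (7.2) to within Monte Carlo error, since paths that ever reach k = 3 almost always do so long before the truncation point.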
They also treat questions such as the overshoot given a threshold crossing, the time at which the threshold is crossed given that it is crossed, and the probability of crossing such a positive threshold before crossing a given negative threshold.

7.1.2 Integer-valued random walks

Suppose next that X_1, X_2, ... are arbitrary IID integer-valued random variables. We can again ask for the probability that such an integer-valued random walk crosses a threshold

¹ This same probability is often expressed as Pr{sup_{n≥1} S_n ≥ k}. For a general random walk, the event ⋃_{n≥1} {S_n ≥ k} is slightly different from {sup_{n≥1} S_n ≥ k}. The latter event can include sample sequences s_1, s_2, ... in which a subsequence of values s_n approach k as a limit but never quite reach k. This is impossible for a simple random walk since all s_k must be integers. It is possible, but can be shown to have probability zero, for general random walks. We will avoid this silliness by not using the sup notation to refer to threshold crossings.

at k, i.e., that the event ⋃_{n=1}^∞ {S_n ≥ k} occurs, but the question is considerably harder than for simple random walks. Since this random walk takes on only integer values, it can be represented as a Markov chain with the set of integers forming the state space. In the Markov chain representation, threshold crossing problems are first-passage-time problems. These problems can be attacked by the Markov chain tools we already know, but the special structure of the random walk provides new approaches and simplifications that will be explained in Sections 7.4 and 7.5.

7.1.3 Renewal processes as special cases of random walks

If X_1, X_2, ... are IID positive random variables, then {S_n; n ≥ 1} is both a special case of a random walk and also the sequence of arrival epochs of a renewal counting process, {N(t); t ≥ 0}. In this special case, the sequence {S_n; n ≥ 1} must eventually cross a threshold at any given positive value α, and the question of whether the threshold is ever crossed becomes uninteresting. However, the trial on which a threshold is crossed and the overshoot when it is crossed are familiar questions from the study of renewal theory. For the renewal counting process, N(α) is the largest n for which S_n ≤ α and N(α) + 1 is the smallest n for which S_n > α, i.e., the smallest n for which the threshold at α is strictly exceeded. Thus the trial at which α is crossed is a central issue in renewal theory. Also the overshoot, which is S_{N(α)+1} − α, is familiar as the residual life at α. Figure 7.1 illustrates the difference between general random walks and positive random walks, i.e., renewal processes. Note that the renewal process is illustrated with the axes reversed from the usual representation.
We usually view each renewal epoch as a time (epoch) and view N(α) as the number of trials up to time α, whereas with random walks, we usually view the number of trials as a discrete-time variable and view the sum of rv's as some kind of amplitude or cost. Mathematically this makes no difference, and it is often valuable to move from one point of view to the other.

7.2 The waiting time in a G/G/1 queue

This section and the next introduce two important problems that are best solved by viewing them as random walks. In this section we represent the waiting time in a G/G/1 queue as a threshold crossing problem in a random walk. In the next section, we represent the error probability in a standard type of detection problem as a random walk problem. This detection problem will later be generalized to a sequential detection problem based on threshold crossings in a random walk.

Consider a G/G/1 queue with first-come-first-serve (FCFS) service. We shall find how to associate the probability that a customer must wait more than some given time α in the queue with the probability that a certain random walk crosses a threshold at α. Let X_1, X_2, ... be the interarrival times of a G/G/1 queueing system; thus these variables are IID with a given distribution function F_X(x) = Pr{X_i ≤ x}. Assume that arrival 0 enters an empty system at time 0, so that S_n = X_1 + X_2 + ··· + X_n is the epoch of the n-th arrival after time 0. Let Y_0, Y_1, ..., be the service times of the successive customers. These are IID

Figure 7.1: The sample function in (a) illustrates a random walk with arbitrary (positive and negative) step sizes {X_i; i ≥ 1}. The sample function in (b) illustrates a random walk restricted to positive step sizes {X_i > 0; i ≥ 1}, i.e., a renewal process. Note that the axes are reversed from the usual depiction of a renewal process. The same sample function is shown in part (c) using the customary axes for a renewal process. For both the arbitrary random walk of part (a) and the random walk with positive step sizes of parts (b) and (c), a threshold at α is crossed on trial 4 with an overshoot S_4 − α.

with some given distribution function F_Y(y) and are independent of {X_i; i ≥ 1}. Figure 7.2 shows a sample path of arrivals and departures and illustrates the waiting time in queue for each arrival.

To analyze the waiting time, note that the system time, i.e., the time in queue plus the time in service, for any given customer n is W_n + Y_n, where W_n is the queueing time and Y_n is the service time. As illustrated in Figure 7.2, customer n+1 arrives X_{n+1} time units after the beginning of this interval, i.e., after the arrival of customer n. If X_{n+1} < W_n + Y_n, then customer n+1 arrives while customer n is still in the system, and thus must wait in the queue until n finishes service (in the figure, for example, customer 2 arrives while customer 1 is still in the queue). Thus

    W_{n+1} = W_n + Y_n − X_{n+1}    if X_{n+1} ≤ W_n + Y_n.    (7.3)

On the other hand, if X_{n+1} > W_n + Y_n, then customer n (and all earlier customers) have departed when n+1 arrives. Thus n+1 starts service immediately and W_{n+1} = 0. These two cases can be combined in the single equation

    W_{n+1} = max[W_n + Y_n − X_{n+1}, 0]    for n ≥ 0;  W_0 = 0.    (7.4)
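The recursion (7.4) can be iterated directly in code. The following is a minimal sketch; the sample interarrival and service times are arbitrary toy numbers, not taken from the figure.

```python
def waiting_times(interarrivals, services):
    """Iterate W_{n+1} = max(W_n + Y_n - X_{n+1}, 0) of (7.4), starting from
    W_0 = 0.  interarrivals = [x_1, x_2, ...], services = [y_0, y_1, ...].
    Returns the queueing times [w_0, w_1, ..., w_n]."""
    w = [0.0]
    for x_next, y in zip(interarrivals, services):
        w.append(max(w[-1] + y - x_next, 0.0))
    return w

# Arbitrary toy sample path; the third customer finds the system empty.
print(waiting_times([1.0, 0.5, 2.0, 0.3], [1.2, 0.8, 0.4, 1.0]))
# approximately [0.0, 0.2, 0.5, 0.0, 0.7]
```

Note that a zero appearing in the returned list marks an arrival to an empty system, i.e., the start of a new busy period, which is exactly the structure exploited in the maximization below.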
Since Y_n and X_{n+1} are coupled together in this equation for each n, it is convenient to define U_{n+1} = Y_n − X_{n+1}. Note that {U_i; i ≥ 1} is a sequence of IID random variables. From (7.4), W_n = max[W_{n−1} + U_n, 0], and iterating on this equation,

    W_n = max[max[W_{n−2} + U_{n−1}, 0] + U_n, 0]    (7.5)
        = max[(W_{n−2} + U_{n−1} + U_n), U_n, 0]
        = max[(W_{n−3} + U_{n−2} + U_{n−1} + U_n), (U_{n−1} + U_n), U_n, 0]
        = ···

Figure 7.2: Sample path of arrivals and departures from a G/G/1 queue. Customer 0 arrives at time 0 and enters service immediately. Customer 1 arrives at time s_1 = x_1. For the case shown above, customer 0 has not yet departed, i.e., x_1 < y_0, so customer 1's time in queue is w_1 = y_0 − x_1. As illustrated, customer 1's system time (queueing time plus service time) is w_1 + y_1. Customer 2 arrives at s_2 = x_1 + x_2. For the case shown above, this is before customer 1 departs at y_0 + y_1. Thus, customer 2's wait in queue is w_2 = y_0 + y_1 − x_1 − x_2. As illustrated above, x_2 + w_2 is also equal to customer 1's system time, so w_2 = w_1 + y_1 − x_2. Customer 3 arrives when the system is empty, so it enters service immediately with no wait in queue, i.e., w_3 = 0.

        = max[(U_1 + U_2 + ··· + U_n), (U_2 + U_3 + ··· + U_n), ..., (U_{n−1} + U_n), U_n, 0].    (7.6)

It is not necessary for the theorem below, but we can understand this maximization better by realizing that if the maximization is achieved at U_i + U_{i+1} + ··· + U_n, then a busy period must start with the arrival of customer i − 1 and continue at least through the service of customer n. To see this intuitively, note that the analysis above starts with the arrival of customer 0 to an empty system at time 0, but the choice of time 0 and customer number 0 has nothing to do with the analysis, and thus the analysis is valid for any arrival to an empty system. Choosing the largest customer number before n that starts a busy period must then give the correct waiting time, and thus maximize (7.5). Exercise 7.2 provides further insight into this maximization.

Define Z^n_1 = U_n, define Z^n_2 = U_n + U_{n−1}, and in general, for i ≤ n, define Z^n_i = U_n + U_{n−1} + ··· + U_{n−i+1}. Thus Z^n_n = U_n + ··· + U_1. With these definitions, (7.5) becomes

    W_n = max[0, Z^n_1, Z^n_2, ..., Z^n_n].    (7.7)

Note that the terms in {Z^n_i; 1 ≤ i ≤ n} are the first n terms of a random walk, but it is not the random walk based on U_1, U_2, ..., but rather the random walk going backward, starting with U_n. Note also that W_{n+1}, for example, is the maximum of a different set of variables, i.e., it is the walk going backward from U_{n+1}. Fortunately, this doesn't matter for the analysis, since the reversed variables (U_n, U_{n−1}, ..., U_1) are statistically identical to (U_1, ..., U_n). The probability that the wait is greater than or equal to a given value α is

    Pr{W_n ≥ α} = Pr{max[0, Z^n_1, Z^n_2, ..., Z^n_n] ≥ α}.    (7.8)

This says that, for the n-th customer, Pr{W_n ≥ α} is equal to the probability that the random walk {Z^n_i; 1 ≤ i ≤ n} crosses a threshold at α by the n-th trial. Because of the

initialization used in the analysis, we see that W_n is the waiting time in queue of the n-th arrival after the beginning of a busy period (although this n-th arrival might belong to a later busy period than that initial busy period). As noted above, (U_n, U_{n−1}, ..., U_1) is statistically identical to (U_1, ..., U_n), and thus Pr{W_n ≥ α} is the same as the probability that the first n terms of the random walk based on {U_i; i ≥ 1} cross a threshold at α. Since the first n+1 terms of this random walk provide one more opportunity to cross α than the first n terms, we see that

    Pr{W_n ≥ α} ≤ Pr{W_{n+1} ≥ α} ≤ ··· ≤ 1.    (7.9)

Since this sequence of probabilities is non-decreasing, it must have a limit as n → ∞, and this limit is denoted Pr{W ≥ α}. Mathematically,² this limit is the probability that a random walk based on {U_i; i ≥ 1} ever crosses a threshold at α. Physically, this limit is the probability that the waiting time in queue is at least α for any given very large-numbered customer (i.e., for customer n when the influence of a busy period starting n customers earlier has died out). These results are summarized in the following theorem.

Theorem 7.1. Let {X_i; i ≥ 1} be the interarrival intervals of a G/G/1 queue, let {Y_i; i ≥ 0} be the service times, and assume that the system is empty at time 0 when customer 0 arrives. Let W_n be the time that the n-th customer waits in the queue, and let U_n = Y_{n−1} − X_n for n ≥ 1. Then for any α > 0 and n ≥ 1, W_n is given by (7.7). Also, Pr{W_n ≥ α} is equal to the probability that the random walk based on {U_i; i ≥ 1} crosses a threshold at α by the n-th trial. Finally, Pr{W ≥ α} = lim_{n→∞} Pr{W_n ≥ α} is equal to the probability that the random walk based on {U_i; i ≥ 1} ever crosses a threshold at α.

Note that the theorem specifies the distribution function of W_n for each n, but says nothing about the joint distribution of successive waiting times.
These are not the same as the distribution of successive terms in a random walk because of the reversal of terms above.

We shall find a relatively simple solution for the probability that a random walk crosses a positive threshold in Section 7.4. From Theorem 7.1, this also solves for the distribution of queueing delay for the G/G/1 queue (and thus also for the M/G/1 and M/M/1 queues).

7.3 Detection, decisions, and hypothesis testing

Consider a situation in which we make n noisy observations of the outcome of a single binary random variable H and then guess, on the basis of the observations alone, which binary outcome occurred. In communication technology, this is called a detection problem. It models, for example, the situation in which a single binary digit is transmitted over some time interval but a noisy vector depending on that binary digit is received. It similarly models the problem of detecting whether or not a target is present in a radar observation. In control theory, such situations are usually referred to as decision problems, whereas in statistics, they are referred to as hypothesis testing.

² More precisely, the sequence of waiting times W_1, W_2, ..., has distribution functions F_{W_n} that converge to F_W, the generic distribution of the given threshold crossing problem with unlimited trials. As n increases, the distribution of W_n approaches F_W, and we refer to W as the waiting time in steady state.

Specifically, let H_0 and H_1 be the names for the two possible values of the binary random variable H, and let p_0 = Pr{H_0} and p_1 = 1 − p_0 = Pr{H_1}. Thus p_0 and p_1 are the a priori probabilities³ for the random variable H. Let Y_1, Y_2, ..., Y_n be the n observations. We assume that, conditional on H_0, the observations Y_1, ..., Y_n are IID random variables. Suppose, to be specific, that these variables have a density f(y|H_0). Conditional on H_0, the joint density of a sample n-tuple y = (y_1, y_2, ..., y_n) of observations is given by

    f(y|H_0) = ∏_{i=1}^n f(y_i|H_0).    (7.10)

Similarly, conditional on H_1, we assume that Y_1, ..., Y_n are IID random variables with a conditional joint density given by (7.10) with H_1 in place of H_0. In summary, then, the model is that H is a rv with PMF {p_0, p_1}, and conditional on H, Y = (Y_1, ..., Y_n) is an n-tuple of IID rv's.

Given a particular sample of n observations y = (y_1, y_2, ..., y_n), we can evaluate Pr{H_1|y} as

    Pr{H_1|y} = p_1 ∏_{i=1}^n f(y_i|H_1) / [p_1 ∏_{i=1}^n f(y_i|H_1) + p_0 ∏_{i=1}^n f(y_i|H_0)].    (7.11)

We can evaluate Pr{H_0|y} in the same way, and the ratio of these quantities is given by

    Pr{H_1|y} / Pr{H_0|y} = [p_1 ∏_{i=1}^n f(y_i|H_1)] / [p_0 ∏_{i=1}^n f(y_i|H_0)].    (7.12)

If we observe y and choose H_0, then Pr{H_1|y} is the resulting probability of error, and conversely, if we choose H_1, then Pr{H_0|y} is the resulting probability of error. Thus the probability of error is minimized, for a given y, by evaluating the above ratio and choosing H_1 if the ratio is greater than 1 and choosing H_0 otherwise. If the ratio is equal to 1, the error probability is the same whether H_0 or H_1 is chosen. The above rule for choosing H_0 or H_1 is called the maximum a posteriori probability detection rule, usually abbreviated as the MAP rule.
The rule has a more attractive form (and also brings us back to random walks) if we take the logarithm of each side of (7.12), getting

    ln[Pr{H_1|y} / Pr{H_0|y}] = ln(p_1/p_0) + Σ_{i=1}^n z_i,  where  z_i = ln[f(y_i|H_1) / f(y_i|H_0)].    (7.13)

The quantity z_i in (7.13) is called a log likelihood ratio. Note that z_i is a function only of y_i, and that this same function is used for each i. For simplicity, we assume that this

³ Statisticians have argued since the beginning of statistics about the validity of choosing a priori probabilities for a hypothesis to be tested. Bayesian statisticians are comfortable with this practice and non-Bayesians are not. Both are comfortable with choosing a probability model for the observations conditional on each hypothesis. We take a Bayesian approach here, partly to take advantage of the power of a complete probability model, and partly because non-Bayesian results, i.e., results that do not depend on the a priori probabilities, are much easier to derive and interpret within a full probability model. As will be seen, the Bayesian approach also makes it natural to incorporate the results of early observations into updated a priori probabilities for analyzing later observations.

function is finite for all y. The MAP rule is to choose H_1 or H_0 depending on whether the quantity on the right is positive or negative, i.e.,

    Σ_{i=1}^n z_i  > ln(p_0/p_1) :  choose H_1;
    Σ_{i=1}^n z_i  < ln(p_0/p_1) :  choose H_0;    (7.14)
    Σ_{i=1}^n z_i  = ln(p_0/p_1) :  don't care; choose either.

Conditional on H_0, the rv's {Y_i; 1 ≤ i ≤ n} are IID. Since Z_i = ln[f(Y_i|H_1)/f(Y_i|H_0)] for 1 ≤ i ≤ n, and since Z_i is the same finite function of Y_i for all i, we see that each Z_i is a rv and that Z_1, ..., Z_n are IID conditional on H_0. Similarly, Z_1, ..., Z_n are IID conditional on H_1. Without conditioning on H_0 or H_1, neither the rv's Y_1, ..., Y_n nor the rv's Z_1, ..., Z_n are IID. Thus it is important to keep in mind the basic structure of this problem: initially a sample value is chosen for H; then n observations, IID conditional on H, are made. Naturally the observer does not observe the original selection for H.

Conditional on H_0, the sum on the left in (7.14) is thus the sample value of the n-th term in the random walk S_n = Z_1 + ··· + Z_n based on the rv's {Z_i; i ≥ 1} conditional on H_0. The MAP rule chooses H_1, thus making an error conditional on H_0, if S_n is greater than the threshold ln(p_0/p_1). Similarly, conditional on H_1, S_n = Z_1 + ··· + Z_n is the n-th term in a random walk with the conditional probabilities from H_1, and an error is made, conditional on H_1, if S_n is less than the threshold ln(p_0/p_1).

It is interesting to observe that Σ_i z_i in (7.14) depends only on the observations and not on p_0, whereas the threshold ln(p_0/p_1) depends only on p_0 and not on the observations. Naturally the marginal probability distribution of Σ_i Z_i does depend on p_0 (and on the conditioning), but Σ_i z_i is a function only of the observations, so its value does not depend on p_0. The decision rule in (7.14) is called a threshold test in the sense that Σ_i z_i is compared with a threshold to make a decision.
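As a concrete illustration of (7.13) and (7.14), suppose (hypothetically; this example is not part of the text) that conditional on H_0 the observations are N(−1, 1) and conditional on H_1 they are N(+1, 1). The log likelihood ratio then reduces to z_i = 2y_i, and the MAP rule is a threshold test on Σ_i z_i. The sketch below simulates the whole model, with p_0 = 0.6 and n = 10 chosen arbitrarily:

```python
import math
import random

def llr(y, m0=-1.0, m1=1.0, sigma=1.0):
    """Log likelihood ratio z = ln[f(y|H1)/f(y|H0)] for equal-variance Gaussians."""
    return ((y - m0) ** 2 - (y - m1) ** 2) / (2 * sigma ** 2)

def map_decide(ys, p0):
    """MAP rule (7.14): choose H1 iff the sum of the z_i exceeds ln(p0/p1)."""
    return 1 if sum(llr(y) for y in ys) > math.log(p0 / (1 - p0)) else 0

# Simulate: first draw H with Pr{H0} = p0, then n IID observations given H.
rng = random.Random(0)
p0, n, trials = 0.6, 10, 2000
errors = 0
for _ in range(trials):
    h = 0 if rng.random() < p0 else 1
    ys = [rng.gauss(2 * h - 1.0, 1.0) for _ in range(n)]
    errors += (map_decide(ys, p0) != h)
print(errors / trials)   # empirical error probability; small at this separation
```

Note how the structure of the simulation mirrors the structure emphasized above: a single sample value of H is drawn first, and only then are the n conditionally IID observations generated.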
There are a number of other formulations of the problem that also lead to threshold tests. For example, maximum likelihood (ML) detection chooses the hypothesis i that maximizes f(y|H_i), and thus corresponds to a threshold at 0. The ML rule has the property that it minimizes the maximum of Pr{H_0|Y} and Pr{H_1|Y}; this has obvious benefits when one is unsure of the a priori probabilities. In many detection situations there are unequal costs associated with the two kinds of errors. For example, one kind of error in a medical test could lead to death of the patient and the other to an unneeded medical procedure. A minimum-cost decision minimizes the expected cost over the two types of errors. As shown in Exercise 7.5, this is also a threshold test. Finally, one might impose the constraint that Pr{error|H_1} must be less than some tolerable limit α, and then minimize Pr{error|H_0} subject to this constraint. The solution to this is called a Neyman-Pearson threshold test (see Exercise 7.6). The Neyman-Pearson test is of particular interest since it does not require any assumptions about a priori probabilities.

So far we have assumed that a decision is made after n observations. In many situations there is a cost associated with observations, and one would prefer, after a given number of observations, to make a decision if the resulting probability of error is small enough, and to

continue with more observations otherwise. Common sense dictates such a strategy, and the branch of probability theory analyzing such strategies is called sequential analysis, which is based on the results in the next section. Essentially, we will see that the appropriate way to vary the number of observations based on the result of the observations is as follows. The probability of error under either hypothesis is based on S_n = Z_1 + ··· + Z_n. Thus we will see that the appropriate rule is to choose H_0 if the sample value of S_n is less than some negative threshold β, to choose H_1 if the sample value of S_n is at least some positive threshold α, and to continue testing if the sample value has not exceeded either threshold.

The previous examples have all involved random walks crossing thresholds, and we now turn to the systematic study of threshold crossing problems. First we look at single thresholds, so that one question of interest is to find Pr{S_n ≥ α} for an arbitrary integer n ≥ 1 and arbitrary α > 0. Another question is whether S_n ≥ α for any n ≥ 1. We then turn to random walks with both a positive and a negative threshold. Here, some questions of interest are to find the probability that the positive threshold is crossed before the negative threshold, to find the distribution of the threshold crossing time given the particular threshold crossed, and to find the overshoot when a threshold is crossed.

7.4 Threshold crossing probabilities in random walks

Let {X_i; i ≥ 1} be a sequence of IID random variables with the distribution function F_X(x), and let {S_n; n ≥ 1} be a random walk with S_n = X_1 + ··· + X_n. We assume throughout that E[X] exists and is finite. The reader should focus on the case E[X] = X̄ < 0 on a first reading, and consider X̄ = 0 and X̄ > 0 later. For X̄ < 0 and α > 0, we shall develop upper bounds on Pr{S_n ≥ α} that are exponentially decreasing in n and α.
These bounds, and many similar results to follow, are examples of large deviation theory, i.e., the theory of the probabilities of highly unlikely events.

We assume throughout this section that X has a moment generating function g(r) = E[e^{rX}] = ∫ e^{rx} dF_X(x), and that g(r) is finite in some open interval around r = 0. As pointed out in Chapter 1, X must then have moments of all orders, and the tails of its distribution function F_X(x) must decay at least exponentially in x as x → −∞ and as x → +∞. Note that e^{rx} is increasing in r for x > 0, so that if ∫_0^∞ e^{rx} dF_X(x) blows up for some r_+ > 0, it remains infinite for all r > r_+. Similarly, for x < 0, e^{rx} is decreasing in r, so that if ∫_{−∞}^0 e^{rx} dF_X(x) blows up at some r_− < 0, it is infinite for all r < r_−. Thus if r_− and r_+ are the smallest and largest values such that g(r) is finite for r_− < r < r_+, then g(r) is infinite for r > r_+ and for r < r_−. The end points r_− and r_+ can each be finite or infinite, and the values g(r_+) and g(r_−) can each be finite or infinite.

Note that if X is bounded, in the sense that Pr{X < −B} = 0 and Pr{X > B} = 0 for some B < ∞, then g(r) is finite for all r. Such rv's are said to have finite support and include all discrete rv's with a finite set of possible values. Another simple example is that if X is a non-negative rv with F_X(x) = 1 − exp(−αx) for x ≥ 0, then r_+ = α. Similarly, if X is a negative rv with F_X(x) = exp(βx) for x < 0, then r_− = −β. Exercise 7.7 provides further examples of these possibilities.

The moment generating function of S_n = X_1 + ··· + X_n is given by

    g_{S_n}(r) = E[exp(rS_n)] = E[exp(r(X_1 + ··· + X_n))] = {E[exp(rX)]}^n = {g(r)}^n.    (7.15)

It follows that g_{S_n}(r) is finite in the same interval (r_−, r_+) as g(r).

First we look at the probability, Pr{S_n ≥ α}, that the n-th step of the random walk satisfies S_n ≥ α for some threshold α > 0. We could actually find the distribution of S_n either by convolving the density of X with itself n times or by going through the transform domain. This would not give us much insight, however, and would be computationally tedious for large n. Instead, we explore the exponential bound, (1.38). For any r ≥ 0 in the region where g(r) is finite, i.e., for 0 ≤ r < r_+, we have

    Pr{S_n ≥ α} ≤ g_{S_n}(r) e^{−rα} = [g(r)]^n e^{−rα}.    (7.16)

It is convenient to rewrite (7.16) in terms of the semi-invariant moment generating function γ(r) = ln[g(r)]:

    Pr{S_n ≥ α} ≤ exp[nγ(r) − rα]    for any r, 0 ≤ r < r_+.    (7.17)

The first two derivatives of γ with respect to r are given by

    γ′(r) = g′(r)/g(r);    γ′′(r) = [g(r)g′′(r) − (g′(r))²] / [g(r)]².    (7.18)

Recall from (1.32) that g′(0) = E[X] and g′′(0) = E[X²]. Substituting this into (7.18), we can evaluate γ′(0) and γ′′(0) as

    γ′(0) = X̄ = E[X];    γ′′(0) = σ²_X.    (7.19)

The fact that γ′′(0) is the second central moment of X is why γ is called a semi-invariant moment generating function. Unfortunately, the higher-order derivatives of γ, evaluated at r = 0, are not equal to the higher-order central moments.

Over the range of r where g(r) < ∞, it is shown in Exercise 7.8 that γ′′(r) ≥ 0, with strict inequality except in the very special (and uninteresting) case where X is deterministic. If X is deterministic, then S_n is also, and there is no point to considering a probabilistic model.
We thus assume in what follows that X is non-deterministic, and thus γ′′(r) > 0 for all r between r_− and r_+. Figure 7.3 sketches γ(r) assuming that X̄ < 0 and r_+ = ∞.

We can now minimize the exponent in (7.17) over r ≥ 0. For simplicity, first assume that r_+ = ∞. Since γ′′(r) > 0, the exponent is minimized by setting its derivative equal to 0. The minimum (if it exists) occurs at the r, say r_o, for which γ′(r) = α/n. As seen from Figure 7.3, this is satisfied with r ≥ 0 only if α/n ≥ X̄. Thus

    Pr{S_n ≥ α} ≤ exp{n[γ(r_o) − r_o γ′(r_o)]}    where γ′(r_o) = α/n ≥ E[X]    (7.20)

                 = exp{−α[r_o − γ(r_o)/γ′(r_o)]}.    (7.21)
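The identities in (7.19) are easy to check numerically. The sketch below uses the simple-random-walk step of Section 7.1.1 (X = +1 with probability p, −1 otherwise, with p = 1/4 chosen arbitrarily so that X̄ < 0) and compares finite-difference derivatives of γ at r = 0 with the mean and variance:

```python
import math

p = 0.25   # arbitrary; X = +1 w.p. p, -1 w.p. 1-p, so E[X] = 2p - 1 < 0

def gamma(r):
    """Semi-invariant MGF gamma(r) = ln g(r) for the binary step above."""
    return math.log(p * math.exp(r) + (1 - p) * math.exp(-r))

h = 1e-5
d1 = (gamma(h) - gamma(-h)) / (2 * h)                 # ~ gamma'(0)
d2 = (gamma(h) - 2 * gamma(0.0) + gamma(-h)) / h**2   # ~ gamma''(0)
mean = 2 * p - 1                                      # E[X] = -0.5
var = 1 - mean ** 2                                   # E[X^2] - mean^2 = 0.75
print(d1, mean, d2, var)   # d1 ~ mean and d2 ~ var, as (7.19) asserts
```

The same finite-difference check fails at higher orders, consistent with the remark that derivatives of γ beyond the second do not equal the corresponding central moments.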

Figure 7.3: Semi-invariant moment generating function γ(r) for a rv X such that E[X] < 0 and r_+ = ∞. Note that γ(r) is tangent to the line of slope E[X] < 0 at r = 0 and has a positive second derivative everywhere.

Figure 7.4: Graphical minimization of γ(r) − (α/n)r. For any r, γ(r) − (α/n)r is found by drawing a line of slope α/n from the point (r, γ(r)) to the vertical axis. The minimum occurs when the line of slope α/n is tangent to the curve.

The first of these inequalities shows how Pr{S_n ≥ α} decreases exponentially with n for fixed α/n = γ′(r_o), and the second shows how it decreases with α for the same ratio α/n = γ′(r_o). We now give a graphical interpretation, in Figure 7.4, of what these exponents mean, and return subsequently to discuss whether α/n = γ′(r) actually has a solution.

The function γ(r) has a strictly positive second derivative, and thus any tangent to the function must lie below the function everywhere except at the point of tangency. The particular tangent shown is tangent at the point r = r_o where γ′(r) = α/n. Thus this tangent line has slope α/n = γ′(r_o) and meets the vertical axis at the point γ(r_o) − r_o γ′(r_o). As illustrated, this vertical axis intercept is smaller than γ(r) − (α/n)r for any other choice of r. This is the exponent in (7.20). This exponent is negative and shows that for a fixed ratio α/n, Pr{S_n ≥ α} decays exponentially in n.

Our primary interest is in the probability that S_n exceeds a positive threshold, α > 0, but it can be seen that both the algebraic and graphical arguments above apply whenever α > nE[X]. Since E[X] < 0, we might also be interested in the probability that S_n exceeds the mean by some amount, while also being negative. Figure 7.4 also gives a geometric interpretation of (7.21) for the case α > 0.
The exponent in α is given by (7.21) to be −r_o + γ(r_o)/γ′(r_o), where r_o satisfies γ′(r_o) = α/n. The negative

of this is seen to be the horizontal axis intercept of the tangent to γ(r) at r_o, and thus this intercept gives the exponential decay rate of Pr{S_n ≥ α} in α for fixed α/n.

It is interesting to observe what happens to (7.21) as n is changed while holding α > 0 fixed. This is an important question for threshold crossings, since it provides an upper bound on crossing a fixed α for different values of n. For the α and n illustrated in Figure 7.4, note that as n increases with fixed α, the slope of the tangent decreases, moving the horizontal axis intercept to the right, i.e., increasing the exponential decay rate in α. Conversely, as n is decreased, the intercept moves to the left, decreasing the exponential decay rate. Note, however, that when the slope increases to the point where the intercept reaches the point where γ(r) = 0, i.e., the point labelled r* in Figure 7.4, then further reductions in n move the tangent point to where γ(r) is positive. At this point, the intercept starts to move to the right again. This means that for all n, an upper bound to Pr{S_n ≥ α} is given by

    Pr{S_n ≥ α} ≤ exp(−r*α)    for arbitrary α > 0, n ≥ 1.    (7.22)

We now must return to the question of whether the equation α/n = γ′(r) has a solution. From the assumption that E[X] < 0, we know that γ′(0) < 0. We have not yet shown why γ′(r) should become positive as r increases. To see this in the simplest case, assume that X is discrete and takes on positive values (if X were a non-positive rv, there would be no point in discussing the probability of crossing a positive threshold). Let x_max be the largest such value. Then g(r) = Σ_x p(x)e^{rx} ≥ p(x_max)e^{r x_max}. It follows that γ(r) ≥ r x_max + ln(p(x_max)). Since γ has a positive second derivative, it follows that γ′(r) must be increasing with r and must approach x_max in the limit as r → ∞. Thus α/n = γ′(r) has a solution whenever α/n < x_max.
It is also clear that Pr{S_n ≥ α} = 0 for α/n > x_max. Thus γ′(r) = α/n has a solution over the range of interest. One can extend this argument to the case where X has an arbitrary distribution function with negative mean.

Although we have only established (7.20), (7.21), and (7.22) as upper bounds, Exercise 7.10 shows that for any fixed ratio a = α/n and any ε > 0, there is an n_0(ε) such that for all n ≥ n_0(ε),

Pr{S_n ≥ n(a − ε)} > exp{−n[ra − γ(r) + ε]},

where r satisfies γ′(r) = a. This means that for fixed a = α/n, (7.20) is exponentially tight, i.e., Pr{S_n ≥ na} decays exponentially with increasing n at the asymptotic rate ra − γ(r), where r satisfies γ′(r) = a.

The above discussion has treated only the case where r_+ = ∞. Figure 7.5 illustrates the minimization of (7.17) for the case where r_+ < ∞. We have assumed that γ(r) < 0 for r < r_+, since the previous argument applies if γ(r) crosses 0 at some r < r_+. To include this case, (7.20) is generalized to

Pr{S_n ≥ α} ≤ exp{n[γ(r_o) − r_o α/n]}    if α/n = γ′(r_o) for some r_o < r_+;
Pr{S_n ≥ α} ≤ exp{n lim_{r→r_+} [γ(r) − rα/n]}    otherwise.    (7.23)
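The quantities above are easy to evaluate numerically. The following sketch (an illustration added here, not part of the text) takes X binary with Pr{X=1} = 1/4 and Pr{X=−1} = 3/4 (so E[X] = −1/2), finds r_o and r* by bisection, and confirms that the exact tail probability lies below the Chernoff bound of (7.21), which in turn lies below the uniform bound exp(−r*α) of (7.22).

```python
import math

p, q = 0.25, 0.75          # Pr{X=1}, Pr{X=-1};  E[X] = -0.5 < 0

def gamma(r):
    # semi-invariant MGF  gamma(r) = ln E[e^{rX}]
    return math.log(p * math.exp(r) + q * math.exp(-r))

def gamma_prime(r):
    # derivative gamma'(r)
    num = p * math.exp(r) - q * math.exp(-r)
    return num / (p * math.exp(r) + q * math.exp(-r))

def bisect(f, lo, hi, tol=1e-12):
    # simple bisection; assumes f(lo) < 0 < f(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n, alpha = 40, 10          # slope alpha/n = 0.25 < x_max = 1
r_o = bisect(lambda r: gamma_prime(r) - alpha / n, 0.0, 5.0)
r_star = bisect(gamma, 0.5, 5.0)   # positive root of gamma (the root at 0 is excluded)

chernoff = math.exp(n * gamma(r_o) - r_o * alpha)   # bound (7.21)
uniform = math.exp(-r_star * alpha)                 # bound (7.22)

# exact tail: S_n >= alpha iff the number k of +1 steps satisfies 2k - n >= alpha
k_min = (n + alpha + 1) // 2
exact = sum(math.comb(n, k) * p**k * q**(n - k) for k in range(k_min, n + 1))

print(exact, chernoff, uniform)    # exact <= chernoff <= uniform
```

For this binary X, the root can also be found by hand: γ(r*) = 0 gives e^{r*} = q/p·... in fact r* = ln 3, which the bisection reproduces.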

Figure 7.5: Graphical minimization of γ(r) − (α/n)r for the case where r_+ < ∞.

As before, for any r < r_+, γ(r) − rα/n is found by drawing a line of slope α/n from the point (r, γ(r)) to the vertical axis. The minimum occurs either when the line of slope α/n is tangent to the curve or when it touches the curve at r = r_+. If we extend the definition of r* to be the supremum of r such that γ(r) ≤ 0, then Pr{S_n ≥ α} ≤ exp(−r*α) still holds for arbitrary α > 0, n ≥ 1.

The next section establishes Wald's identity, which shows, among other things, that if X̄ < 0, then exp(−r*α) is an upper bound (and a reasonable approximation) to the probability that the walk ever crosses a threshold at α > 0. Note that we have already found an upper bound to Pr{S_n ≥ α} for any α > 0, n ≥ 1, but this new result bounds Pr{⋃_n {S_n ≥ α}} for any α > 0. Both the threshold-crossing bounds in this section and Wald's identity in the next suggest that for large n or large α, the most important parameter of the IID rv's X making up the walk is the positive root r* of γ(r), rather than the mean, variance, or other moments of X. As a prelude to developing these large-deviation results about threshold crossings, we define stopping rules in a way that is both simpler and more general than the treatment in Chapter 3.

7.5 Thresholds, stopping rules, and Wald's identity

The following lemma shows that a random walk with two thresholds, say α > 0 and β < 0, eventually crosses one of the thresholds. Figure 7.6 illustrates two sample paths and how they cross thresholds. More specifically, the random walk first crosses a threshold at trial n if β < S_i < α for 1 ≤ i < n and either S_n ≥ α or S_n ≤ β. The lemma shows that this random number of trials N is finite with probability 1 (i.e., N is a rv) and that N has moments of all orders.

Lemma 7.1. Let {X_i; i ≥ 1} be IID and not identically 0. For each n ≥ 1, let S_n = X_1 + ··· + X_n.
Let α > 0 and β < 0 be arbitrary real numbers, and let N be the smallest n for which either S_n ≥ α or S_n ≤ β. Then N is a random variable (i.e., lim_{m→∞} Pr{N ≥ m} = 0) and N has finite moments of all orders.

Figure 7.6: Two sample paths of a random walk with two thresholds. In the first, the threshold at α is crossed at N = 5. In the second, the threshold at β is crossed at N = 4.

Proof: Since X is not identically 0, there is some n for which either Pr{S_n ≤ β − α} > 0 or Pr{S_n ≥ α − β} > 0. For any such n, let

ε = max[Pr{S_n ≤ β − α}, Pr{S_n ≥ α − β}].

For any integer k ≥ 1, given that N > n(k−1), and given any value of S_{n(k−1)} in (β, α), a threshold will be crossed by time nk with probability at least ε. Thus,

Pr{N > nk | N > n(k−1)} ≤ 1 − ε.

Iterating on k,

Pr{N > nk} ≤ (1 − ε)^k.

This shows that N is finite with probability 1 and that Pr{N ≥ j} goes to 0 at least geometrically in j. It follows that the moment generating function g_N(r) of N is finite in a region around r = 0, and that N has moments of all orders.

7.5.1 Stopping rules

In this section, we start with a definition of stopping rules that is more fundamental and quite different from that in Chapter 3. We then use this definition to establish Wald's identity, which is the basis for all of our subsequent results about random walks and threshold crossings.

First consider a simple example. Consider a sequence {X_n; n ≥ 1} of binary random variables taking on only the values ±1. Suppose we are interested in the first occurrence of the string (+1, −1), and we view this condition as a stopping rule. Figure 7.7 illustrates this stopped process by viewing it as the truncation of a tree of possible sequences. Aside from the complexity of the tree, the same approach can be taken when considering a random walk with a stopping rule that stops at the first trial at which the random walk reaches either α > 0 or β < 0. In this case also, the stopping node is the initial segment for which the first crossing occurs at the final trial of that segment.
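As a concrete illustration (added here, not part of the text), the stopping rule above can be simulated directly. For a fair ±1 sequence, the expected time to the first occurrence of the string (+1, −1) is the classical value E[N] = 4, and a seeded Monte Carlo run reproduces it.

```python
import random

random.seed(1)

def stop_time_plus_minus():
    """Trials until the string (+1, -1) first occurs in an IID fair +/-1 sequence."""
    n, prev = 0, 0
    while True:
        n += 1
        x = random.choice((1, -1))
        if prev == 1 and x == -1:   # stopping node reached
            return n
        prev = x

trials = 20000
mean_N = sum(stop_time_plus_minus() for _ in range(trials)) / trials
print(mean_N)   # sample mean; the exact value is E[N] = 4
```

Note that the stopping rule itself uses only the sample sequence, not the probability measure; the measure enters only when we ask about the distribution of N.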

Figure 7.7: A tree representing the collection of binary (+1, −1) sequences, with a stopping rule viewed as a pruning of the tree. The particular stopping rule here is to stop on the first occurrence of the string (+1, −1). The leaves of the tree (i.e., the nodes at which stopping occurs) are marked with large dots and the intermediate nodes (the other nodes) with small dots. Note that each leaf in the tree has a one-to-one correspondence with an initial segment of the tree, so the stopping nodes can be unambiguously viewed either as leaves of the tree or as initial segments of the sample sequences.

Note that in both of these examples, the stopping rule determines which initial segment of any given sample sequence satisfies the rule. The distribution of each X_n, and even whether or not the sequence is IID, is usually not relevant for defining these stopping rules. In other words, the conditions about statistical independence used in Chapter 3 for the indicator functions of stopping rules are quite unnatural for most applications. The essence of a stopping rule, however, is illustrated quite well in Figure 7.7. If one stops at some initial segment of a sample sequence, then one cannot stop again at some longer initial segment of the same sample sequence. This leads us to the following definitions of stopping nodes, stopping rules, and stopping times.

Definition 7.2 (Stopping nodes). Given a sequence {X_n; n ≥ 1} of rv's, a collection of stopping nodes is a collection of initial segments of the sample sequences of {X_n; n ≥ 1}. If an initial segment of one sequence is a stopping node, then it is a stopping node for all sequences with that same initial segment. Also, no stopping node can be an initial segment of any other stopping node.

This definition is less abstract when each X_n is discrete with a finite number, say m, of possible values.
In this case, as illustrated in Figure 7.7, the set of sequences is represented by a tree in which each node has one branch coming in from the root and m branches going out. Each stopping node corresponds to pruning the tree at that node. All the sequences with that given initial segment can then be ignored, since they all have that same initial segment, i.e., stopping node. In this sense, every pruning of the tree corresponds to a collection of stopping nodes.

In information theory, such a collection of stopping nodes is called a prefix-free source code. Each segment corresponding to a stopping node is used as a codeword for some given message. If a sequence of consecutive segments is transmitted, a receiver can parse the incoming letters into segments by using the fact that no stopping node is an initial segment of any other stopping node.

Definition 7.3 (Stopping rule and stopping time). A stopping rule for {X_n; n ≥ 1} is a rule that determines a collection of stopping nodes. A stopping time is a possibly defective rv whose value, for a sample sequence with a stopping node, is the length of the initial segment for that node. Its value, for a sample sequence with no stopping node, is infinite.

For most interesting stopping rules, sample sequences exist that have no stopping nodes. For the example of a random walk with two thresholds, there are many sequences that stay inside the thresholds forever. As shown by Lemma 7.1, however, this set of sequences has zero probability and thus the stopping time is a (non-defective) rv. We see from this that, although stopping rules are generally defined without the use of a probability measure, and the mapping from sample sequences to stopping nodes is similarly independent of the probability measure, the question of whether the stopping time is defective and whether it has moments is very dependent on the probability measure.

Theorem 7.2 (Wald's identity). Let {X_i; i ≥ 1} be IID and let γ(r) = ln{E[e^{rX}]} be the semi-invariant moment generating function of each X_i. Assume γ(r) is finite in an open interval (r_−, r_+) with r_− < 0 < r_+. For each n ≥ 1, let S_n = X_1 + ··· + X_n. Let α > 0 and β < 0 be arbitrary real numbers, and let N be the smallest n for which either S_n ≥ α or S_n ≤ β.
Then for all r ∈ (r_−, r_+),

E[exp(rS_N − Nγ(r))] = 1.    (7.24)

We first show how to use and interpret this theorem, and then prove it. The proof is quite simple, but will mean more after understanding the surprising power of this result. Wald's identity can be thought of as a generating-function form of Wald's equality as established in Theorem 3.3. First note that the trial N at which a threshold is crossed in the theorem is a stopping time in the terminology of Chapter 3. Also, if we take the derivative with respect to r of both sides of (7.24), we get

E[(S_N − Nγ′(r)) exp{rS_N − Nγ(r)}] = 0.

Setting r = 0 and recalling that γ(0) = 0 and γ′(0) = X̄, this becomes Wald's equality,

E[S_N] = E[N] X̄.    (7.25)

Note that this derivation of Wald's equality is restricted to a random walk with two thresholds (and this automatically satisfies the constraint in Wald's equality that E[N] < ∞). The result in Chapter 3 was more general, applying to any stopping time such that E[N] < ∞.
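Both (7.24) and (7.25) are easy to check by simulation. The sketch below (an added illustration with arbitrarily chosen parameters Pr{X=1} = 0.3, α = 5, β = −5) estimates E[S_N], E[N], and the left side of Wald's identity at a sample value of r.

```python
import math
import random

random.seed(2)
p = 0.3                      # Pr{X=1}; Pr{X=-1} = 0.7, so the mean is -0.4
alpha, beta = 5, -5

def run_walk():
    # run one two-threshold random walk; return (S_N, N)
    s, n = 0, 0
    while beta < s < alpha:
        s += 1 if random.random() < p else -1
        n += 1
    return s, n

samples = [run_walk() for _ in range(20000)]
mean_SN = sum(s for s, _ in samples) / len(samples)
mean_N = sum(n for _, n in samples) / len(samples)

xbar = 2 * p - 1             # E[X] = -0.4
print(mean_SN, mean_N * xbar)   # Wald's equality (7.25): these should agree

def gamma(r):
    return math.log(p * math.exp(r) + (1 - p) * math.exp(-r))

# Wald's identity (7.24) at a test point r = 0.1
r = 0.1
ident = sum(math.exp(r * s - n * gamma(r)) for s, n in samples) / len(samples)
print(ident)   # should be close to 1 (it is exactly 1 in expectation)
```

The identity holds for every r in (r_−, r_+); the single test point here is only a spot check.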

The second derivative of (7.24) with respect to r is

E[((S_N − Nγ′(r))² − Nγ″(r)) exp{rS_N − Nγ(r)}] = 0.

At r = 0, this is

E[S_N² − 2N S_N X̄ + N² X̄²] = E[N] σ_X².    (7.26)

This equation is often difficult to use because of the cross term between S_N and N, but its main application comes in the case where X̄ = 0. In this case, Wald's equality provides no information about E[N], but (7.26) simplifies to

E[S_N²] = E[N] σ_X².    (7.27)

Example (Simple random walks again). As an example, consider again the simple random walk with Pr{X=1} = Pr{X=−1} = 1/2, and assume that α > 0 and β < 0 are integers. Since S_n takes on only integer values and changes only by ±1, it takes on the value α or β before exceeding either of these values. Thus S_N = α or S_N = β. Let q_α denote Pr{S_N = α}. The expected value of S_N is then αq_α + β(1 − q_α). From Wald's equality, E[S_N] = 0, so

q_α = −β/(α − β);    1 − q_α = α/(α − β).    (7.28)

From (7.27),

E[N] σ_X² = E[S_N²] = α² q_α + β² (1 − q_α).    (7.29)

Using the value of q_α from (7.28) and recognizing that σ_X² = 1,

E[N] = −βα/σ_X² = −βα.    (7.30)

As a sanity check, note that if α and β are each multiplied by some large constant k, then E[N] increases by k². Since the variance of S_n is n, we would expect S_n to fluctuate with increasing n, with typical values growing as √n, and thus it is reasonable for the time to reach a threshold to increase with the square of the distance to the threshold. We also notice that if β is decreased toward −∞, while holding α constant, then q_α → 1 and E[N] → ∞, which helps explain the possibility of winning one coin with probability 1 in a coin-tossing game, assuming we have infinite capital to risk and infinite time to wait. For more general random walks with X̄ = 0, there is usually an overshoot when a threshold is crossed.
If the magnitudes of α and β are large relative to the range of X, however, it is often reasonable to ignore the overshoots, and then −βα/σ_X² becomes a good approximation to E[N]. If one wants to include the overshoot, then its effect must be taken into account in both (7.28) and (7.29). We next apply Wald's identity to upper bound Pr{S_N ≥ α} for the case where X̄ < 0.
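Before doing so, note that the closed-form answers (7.28) and (7.30) for the simple random walk are easy to confirm by simulation. The sketch below (an added illustration, with the arbitrary choice α = 4, β = −6, so q_α = 0.6 and E[N] = 24) estimates q_α and E[N] from seeded Monte Carlo runs.

```python
import random

random.seed(3)
alpha, beta = 4, -6      # integer thresholds for the fair simple random walk

def run_walk():
    # one fair +/-1 walk until a threshold is hit; return (S_N, N)
    s, n = 0, 0
    while beta < s < alpha:
        s += random.choice((1, -1))
        n += 1
    return s, n

trials = 20000
samples = [run_walk() for _ in range(trials)]
q_est = sum(s == alpha for s, _ in samples) / trials
N_est = sum(n for _, n in samples) / trials

q_exact = -beta / (alpha - beta)   # (7.28): 6/10 = 0.6
N_exact = -beta * alpha            # (7.30): 24
print(q_est, N_est)                # close to 0.6 and 24
```

Since the walk moves in unit steps, there is no overshoot here, so the agreement is limited only by sampling error.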

Corollary 7.1. Under the conditions of Theorem 7.2, assume that γ(r) has a root at r* > 0. Then

Pr{S_N ≥ α} ≤ exp(−r*α).    (7.31)

Proof: Wald's identity, with r = r*, reduces to E[exp(r*S_N)] = 1. We can express this as

Pr{S_N ≥ α} E[exp(r*S_N) | S_N ≥ α] + Pr{S_N ≤ β} E[exp(r*S_N) | S_N ≤ β] = 1.    (7.32)

Since the second term on the left is non-negative,

Pr{S_N ≥ α} E[exp(r*S_N) | S_N ≥ α] ≤ 1.    (7.33)

Given that S_N ≥ α, we see that exp(r*S_N) ≥ exp(r*α). Thus

Pr{S_N ≥ α} exp(r*α) ≤ 1,    (7.34)

which is equivalent to (7.31).

This bound is valid for all β < 0, and thus is also valid in the limit β → −∞ (see Exercise 7.12 for a more careful demonstration that (7.31) is valid without a lower threshold). Equation (7.31) is also valid for the case of Figure 7.5, where γ(r) < 0 for all r ∈ (0, r_+). The exponential bound in (7.22) shows that Pr{S_n ≥ α} ≤ exp(−r*α) for each n; (7.31) is stronger than this. It shows that Pr{⋃_n {S_n ≥ α}} ≤ exp(−r*α). This also holds in the limit β → −∞.

When Corollary 7.1 is applied to the G/G/1 queue in Theorem 7.1, (7.31) is referred to as the Kingman bound.

Corollary 7.2 (Kingman bound). Let {X_i; i ≥ 1} and {Y_i; i ≥ 0} be the interarrival intervals and service times of a G/G/1 queue that is empty at time 0 when customer 0 arrives. Let {U_i = Y_{i−1} − X_i; i ≥ 1}, and let γ(r) = ln{E[e^{rU}]} be the semi-invariant moment generating function of each U_i. Assume that γ(r) has a root at r* > 0. Then W_n, the waiting time of the nth arrival, and W, the steady-state waiting time, satisfy

Pr{W_n ≥ α} ≤ Pr{W ≥ α} ≤ exp(−r*α)    for all α > 0.    (7.35)

In most applications, a positive threshold crossing for a random walk with a negative drift corresponds to some exceptional, and usually undesirable, circumstance (for example an error in the hypothesis testing problem or an overflow in the G/G/1 queue).
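As an aside, the Kingman bound is simple to evaluate in special cases. The sketch below (an added illustration, not from the text) assumes exponential interarrival and service distributions with rates λ < μ, for which γ(r) = ln[μ/(μ−r)] + ln[λ/(λ+r)]; bisection then locates the root r*, which for this case works out to μ − λ.

```python
import math

lam, mu = 1.0, 2.0    # hypothetical choice: arrivals Exp(lam), services Exp(mu)

def gamma_U(r):
    # gamma(r) = ln E[e^{rU}] with U = Y - X, Y ~ Exp(mu), X ~ Exp(lam), independent;
    # finite for -lam < r < mu
    return math.log(mu / (mu - r)) + math.log(lam / (lam + r))

# bisection for the positive root r* of gamma_U; for these rates,
# gamma_U(0.1) < 0 < gamma_U(mu - 0.1)
lo, hi = 0.1, mu - 0.1
while hi - lo > 1e-12:
    mid = (lo + hi) / 2
    if gamma_U(mid) < 0:
        lo = mid
    else:
        hi = mid
r_star = (lo + hi) / 2

def kingman(a):
    # bound (7.35) on Pr{W >= a}
    return math.exp(-r_star * a)

print(r_star)    # for this exponential case the root is mu - lam = 1.0
```

For this special case the exact steady-state tail is known to be (λ/μ)exp(−(μ−λ)α), so the bound is off only by the factor λ/μ.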
Thus an upper bound such as (7.31) provides an assurance of a certain level of performance and is often more useful than either an approximation or an exact expression that is very difficult to evaluate. For a random walk with X̄ > 0, the exceptional circumstance is Pr{S_N ≤ β}. This can be analyzed by changing the sign of X and β and using the results for a negative expected value. These exponential bounds do not work for X̄ = 0, and we will not analyze that case here.

Note that (7.31) is an upper bound because, first, the effect of the second threshold in (7.32) was set to 0, and, second, the overshoot in the threshold crossing at α was set to 0 in going from (7.33) to (7.34). It is easy to account for the second threshold by recognizing that Pr{S_N ≤ β} = 1 − Pr{S_N ≥ α}. Then (7.32) can be solved, getting

Pr{S_N ≥ α} = (1 − E[exp(r*S_N) | S_N ≤ β]) / (E[exp(r*S_N) | S_N ≥ α] − E[exp(r*S_N) | S_N ≤ β]).    (7.36)

Accounting for the overshoots is much more difficult. For the case of the simple random walk, overshoots never occur, since the random walk always changes in unit steps. Thus, for α and β integers, we have E[exp(r*S_N) | S_N ≤ β] = exp(r*β) and E[exp(r*S_N) | S_N ≥ α] = exp(r*α). Substituting this in (7.36) yields the exact solution

Pr{S_N ≥ α} = exp(−r*α)[1 − exp(r*β)] / (1 − exp[−r*(α − β)]).    (7.37)

Solving the equation γ(r*) = 0 for the simple random walk with probabilities p = Pr{X = 1} and q = Pr{X = −1} yields r* = ln(q/p). This is also valid if X takes on the three values −1, 0, and +1 with p = Pr{X = 1}, q = Pr{X = −1}, and 1 − p − q = Pr{X = 0}. It can be seen that if α and −β are large positive integers, then the simple bound of (7.31) is almost exact for this example. Equation (7.37) is sometimes taken as an approximation for (7.36). Unfortunately, for many applications, the overshoots are more significant than the effect of the opposite threshold, so that (7.37) is only negligibly better than (7.31) as an approximation, and has the disadvantage of not being a bound. If Pr{S_N ≥ α} must actually be calculated, then the overshoots in (7.36) must be taken into account. See Chapter 12 of [9] for a treatment of overshoots.

7.5.2 Joint distribution of N and barrier

Next we look at Pr{N ≥ n, S_N ≥ α}, where again we assume that X̄ < 0 and that γ(r*) = 0 for some r* > 0. For any r in the region where γ(r) ≤ 0 (i.e., for 0 ≤ r ≤ r*), we have −Nγ(r) ≥ −nγ(r) for N ≥ n.
Thus, from the Wald identity, we have

1 ≥ E[exp(rS_N − Nγ(r)) | N ≥ n, S_N ≥ α] Pr{N ≥ n, S_N ≥ α}
  ≥ exp[rα − nγ(r)] Pr{N ≥ n, S_N ≥ α},

so that

Pr{N ≥ n, S_N ≥ α} ≤ exp[−rα + nγ(r)]    for all r such that 0 ≤ r ≤ r*.    (7.38)

Under our assumption that X̄ < 0, we have γ(r) ≤ 0 in the range 0 ≤ r ≤ r*, and (7.38) is valid for all r in this range. To obtain the tightest bound of this form, we should minimize the right-hand side of (7.38). This is the same minimization (except for the constraint r ≤ r*) as in Figure 7.4, and the result, if α/n ≤ γ′(r*), is

Pr{N ≥ n, S_N ≥ α} ≤ exp[−r_o α + nγ(r_o)],    (7.39)

where r_o satisfies γ′(r_o) = α/n. This is the same as the bound on Pr{S_n ≥ α} in (7.20), except that r_o ≤ r* in (7.39). For the special case described in Figure 7.5 where γ(r) < 0 for all r < r_+, (7.39) is modified in the same way as used in (7.23).
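Returning to the simple random walk, the exact solution (7.37) is easy to check numerically. The sketch below (an added illustration, with the arbitrary choice p = 0.4, α = 3, β = −3) compares (7.37) against a seeded Monte Carlo estimate and against the simpler bound (7.31).

```python
import math
import random

random.seed(4)
p, q = 0.4, 0.6          # Pr{X=1}, Pr{X=-1};  r* = ln(q/p)
alpha, beta = 3, -3
r_star = math.log(q / p)

# exact two-threshold crossing probability (7.37) for the simple random walk
exact = (math.exp(-r_star * alpha) * (1 - math.exp(r_star * beta))
         / (1 - math.exp(-r_star * (alpha - beta))))

# the simpler single-threshold bound (7.31)
bound = math.exp(-r_star * alpha)

def crosses_alpha():
    # run one walk; True if the upper threshold is the one crossed
    s = 0
    while beta < s < alpha:
        s += 1 if random.random() < p else -1
    return s == alpha

trials = 20000
est = sum(crosses_alpha() for _ in range(trials)) / trials
print(exact, est, bound)   # exact and est agree to within sampling error
```

For these small thresholds the gap between (7.37) and (7.31) is visible; as α and −β grow, the two coalesce, as noted above.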

DISCRETE STOCHASTIC PROCESSES, Draft of 2nd Edition, R. G. Gallager, January 31, 2011.


More information

ALGEBRA. 1. Some elementary number theory 1.1. Primes and divisibility. We denote the collection of integers

ALGEBRA. 1. Some elementary number theory 1.1. Primes and divisibility. We denote the collection of integers ALGEBRA CHRISTIAN REMLING 1. Some elementary number theory 1.1. Primes and divisibility. We denote the collection of integers by Z = {..., 2, 1, 0, 1,...}. Given a, b Z, we write a b if b = ac for some

More information

IEOR 6711, HMWK 5, Professor Sigman

IEOR 6711, HMWK 5, Professor Sigman IEOR 6711, HMWK 5, Professor Sigman 1. Semi-Markov processes: Consider an irreducible positive recurrent discrete-time Markov chain {X n } with transition matrix P (P i,j ), i, j S, and finite state space.

More information

Sequential Decisions

Sequential Decisions Sequential Decisions A Basic Theorem of (Bayesian) Expected Utility Theory: If you can postpone a terminal decision in order to observe, cost free, an experiment whose outcome might change your terminal

More information

Polynomial Expressions and Functions

Polynomial Expressions and Functions Hartfield College Algebra (Version 2017a - Thomas Hartfield) Unit FOUR Page - 1 - of 36 Topic 32: Polynomial Expressions and Functions Recall the definitions of polynomials and terms. Definition: A polynomial

More information

Sample Spaces, Random Variables

Sample Spaces, Random Variables Sample Spaces, Random Variables Moulinath Banerjee University of Michigan August 3, 22 Probabilities In talking about probabilities, the fundamental object is Ω, the sample space. (elements) in Ω are denoted

More information

BRANCHING PROCESSES 1. GALTON-WATSON PROCESSES

BRANCHING PROCESSES 1. GALTON-WATSON PROCESSES BRANCHING PROCESSES 1. GALTON-WATSON PROCESSES Galton-Watson processes were introduced by Francis Galton in 1889 as a simple mathematical model for the propagation of family names. They were reinvented

More information

V. Graph Sketching and Max-Min Problems

V. Graph Sketching and Max-Min Problems V. Graph Sketching and Max-Min Problems The signs of the first and second derivatives of a function tell us something about the shape of its graph. In this chapter we learn how to find that information.

More information

Proof Techniques (Review of Math 271)

Proof Techniques (Review of Math 271) Chapter 2 Proof Techniques (Review of Math 271) 2.1 Overview This chapter reviews proof techniques that were probably introduced in Math 271 and that may also have been used in a different way in Phil

More information

EE5139R: Problem Set 4 Assigned: 31/08/16, Due: 07/09/16

EE5139R: Problem Set 4 Assigned: 31/08/16, Due: 07/09/16 EE539R: Problem Set 4 Assigned: 3/08/6, Due: 07/09/6. Cover and Thomas: Problem 3.5 Sets defined by probabilities: Define the set C n (t = {x n : P X n(x n 2 nt } (a We have = P X n(x n P X n(x n 2 nt

More information

The Liapunov Method for Determining Stability (DRAFT)

The Liapunov Method for Determining Stability (DRAFT) 44 The Liapunov Method for Determining Stability (DRAFT) 44.1 The Liapunov Method, Naively Developed In the last chapter, we discussed describing trajectories of a 2 2 autonomous system x = F(x) as level

More information

Chapter 3 Representations of a Linear Relation

Chapter 3 Representations of a Linear Relation Chapter 3 Representations of a Linear Relation The purpose of this chapter is to develop fluency in the ways of representing a linear relation, and in extracting information from these representations.

More information

LECTURE 10: REVIEW OF POWER SERIES. 1. Motivation

LECTURE 10: REVIEW OF POWER SERIES. 1. Motivation LECTURE 10: REVIEW OF POWER SERIES By definition, a power series centered at x 0 is a series of the form where a 0, a 1,... and x 0 are constants. For convenience, we shall mostly be concerned with the

More information

1 Gambler s Ruin Problem

1 Gambler s Ruin Problem Coyright c 2017 by Karl Sigman 1 Gambler s Ruin Problem Let N 2 be an integer and let 1 i N 1. Consider a gambler who starts with an initial fortune of $i and then on each successive gamble either wins

More information

2905 Queueing Theory and Simulation PART III: HIGHER DIMENSIONAL AND NON-MARKOVIAN QUEUES

2905 Queueing Theory and Simulation PART III: HIGHER DIMENSIONAL AND NON-MARKOVIAN QUEUES 295 Queueing Theory and Simulation PART III: HIGHER DIMENSIONAL AND NON-MARKOVIAN QUEUES 16 Queueing Systems with Two Types of Customers In this section, we discuss queueing systems with two types of customers.

More information

Queueing Theory and Simulation. Introduction

Queueing Theory and Simulation. Introduction Queueing Theory and Simulation Based on the slides of Dr. Dharma P. Agrawal, University of Cincinnati and Dr. Hiroyuki Ohsaki Graduate School of Information Science & Technology, Osaka University, Japan

More information

Stochastic process. X, a series of random variables indexed by t

Stochastic process. X, a series of random variables indexed by t Stochastic process X, a series of random variables indexed by t X={X(t), t 0} is a continuous time stochastic process X={X(t), t=0,1, } is a discrete time stochastic process X(t) is the state at time t,

More information

Chapter 3 Representations of a Linear Relation

Chapter 3 Representations of a Linear Relation Chapter 3 Representations of a Linear Relation The purpose of this chapter is to develop fluency in the ways of representing a linear relation, and in extracting information from these representations.

More information

14.1 Finding frequent elements in stream

14.1 Finding frequent elements in stream Chapter 14 Streaming Data Model 14.1 Finding frequent elements in stream A very useful statistics for many applications is to keep track of elements that occur more frequently. It can come in many flavours

More information

Discrete Mathematics and Probability Theory Fall 2014 Anant Sahai Note 15. Random Variables: Distributions, Independence, and Expectations

Discrete Mathematics and Probability Theory Fall 2014 Anant Sahai Note 15. Random Variables: Distributions, Independence, and Expectations EECS 70 Discrete Mathematics and Probability Theory Fall 204 Anant Sahai Note 5 Random Variables: Distributions, Independence, and Expectations In the last note, we saw how useful it is to have a way of

More information

4 Sums of Independent Random Variables

4 Sums of Independent Random Variables 4 Sums of Independent Random Variables Standing Assumptions: Assume throughout this section that (,F,P) is a fixed probability space and that X 1, X 2, X 3,... are independent real-valued random variables

More information

Lecture 8: Information Theory and Statistics

Lecture 8: Information Theory and Statistics Lecture 8: Information Theory and Statistics Part II: Hypothesis Testing and I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 23, 2015 1 / 50 I-Hsiang

More information

An analogy from Calculus: limits

An analogy from Calculus: limits COMP 250 Fall 2018 35 - big O Nov. 30, 2018 We have seen several algorithms in the course, and we have loosely characterized their runtimes in terms of the size n of the input. We say that the algorithm

More information

Lecture December 2009 Fall 2009 Scribe: R. Ring In this lecture we will talk about

Lecture December 2009 Fall 2009 Scribe: R. Ring In this lecture we will talk about 0368.4170: Cryptography and Game Theory Ran Canetti and Alon Rosen Lecture 7 02 December 2009 Fall 2009 Scribe: R. Ring In this lecture we will talk about Two-Player zero-sum games (min-max theorem) Mixed

More information

Class 11 Non-Parametric Models of a Service System; GI/GI/1, GI/GI/n: Exact & Approximate Analysis.

Class 11 Non-Parametric Models of a Service System; GI/GI/1, GI/GI/n: Exact & Approximate Analysis. Service Engineering Class 11 Non-Parametric Models of a Service System; GI/GI/1, GI/GI/n: Exact & Approximate Analysis. G/G/1 Queue: Virtual Waiting Time (Unfinished Work). GI/GI/1: Lindley s Equations

More information

Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process

Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process Applied Mathematical Sciences, Vol. 4, 2010, no. 62, 3083-3093 Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process Julia Bondarenko Helmut-Schmidt University Hamburg University

More information

CIS 2033 Lecture 5, Fall

CIS 2033 Lecture 5, Fall CIS 2033 Lecture 5, Fall 2016 1 Instructor: David Dobor September 13, 2016 1 Supplemental reading from Dekking s textbook: Chapter2, 3. We mentioned at the beginning of this class that calculus was a prerequisite

More information

MARKOV CHAINS A finite state Markov chain is a sequence of discrete cv s from a finite alphabet where is a pmf on and for

MARKOV CHAINS A finite state Markov chain is a sequence of discrete cv s from a finite alphabet where is a pmf on and for MARKOV CHAINS A finite state Markov chain is a sequence S 0,S 1,... of discrete cv s from a finite alphabet S where q 0 (s) is a pmf on S 0 and for n 1, Q(s s ) = Pr(S n =s S n 1 =s ) = Pr(S n =s S n 1

More information

STAT2201. Analysis of Engineering & Scientific Data. Unit 3

STAT2201. Analysis of Engineering & Scientific Data. Unit 3 STAT2201 Analysis of Engineering & Scientific Data Unit 3 Slava Vaisman The University of Queensland School of Mathematics and Physics What we learned in Unit 2 (1) We defined a sample space of a random

More information

RENEWAL THEORY STEVEN P. LALLEY UNIVERSITY OF CHICAGO. X i

RENEWAL THEORY STEVEN P. LALLEY UNIVERSITY OF CHICAGO. X i RENEWAL THEORY STEVEN P. LALLEY UNIVERSITY OF CHICAGO 1. RENEWAL PROCESSES A renewal process is the increasing sequence of random nonnegative numbers S 0,S 1,S 2,... gotten by adding i.i.d. positive random

More information

A little context This paper is concerned with finite automata from the experimental point of view. The behavior of these machines is strictly determin

A little context This paper is concerned with finite automata from the experimental point of view. The behavior of these machines is strictly determin Computability and Probabilistic machines K. de Leeuw, E. F. Moore, C. E. Shannon and N. Shapiro in Automata Studies, Shannon, C. E. and McCarthy, J. Eds. Annals of Mathematics Studies, Princeton University

More information

Parameter estimation Conditional risk

Parameter estimation Conditional risk Parameter estimation Conditional risk Formalizing the problem Specify random variables we care about e.g., Commute Time e.g., Heights of buildings in a city We might then pick a particular distribution

More information

West Windsor-Plainsboro Regional School District Algebra Grade 8

West Windsor-Plainsboro Regional School District Algebra Grade 8 West Windsor-Plainsboro Regional School District Algebra Grade 8 Content Area: Mathematics Unit 1: Foundations of Algebra This unit involves the study of real numbers and the language of algebra. Using

More information

Appendix B for The Evolution of Strategic Sophistication (Intended for Online Publication)

Appendix B for The Evolution of Strategic Sophistication (Intended for Online Publication) Appendix B for The Evolution of Strategic Sophistication (Intended for Online Publication) Nikolaus Robalino and Arthur Robson Appendix B: Proof of Theorem 2 This appendix contains the proof of Theorem

More information

2.2 Some Consequences of the Completeness Axiom

2.2 Some Consequences of the Completeness Axiom 60 CHAPTER 2. IMPORTANT PROPERTIES OF R 2.2 Some Consequences of the Completeness Axiom In this section, we use the fact that R is complete to establish some important results. First, we will prove that

More information

Multimedia Communications. Mathematical Preliminaries for Lossless Compression

Multimedia Communications. Mathematical Preliminaries for Lossless Compression Multimedia Communications Mathematical Preliminaries for Lossless Compression What we will see in this chapter Definition of information and entropy Modeling a data source Definition of coding and when

More information

5 + 9(10) + 3(100) + 0(1000) + 2(10000) =

5 + 9(10) + 3(100) + 0(1000) + 2(10000) = Chapter 5 Analyzing Algorithms So far we have been proving statements about databases, mathematics and arithmetic, or sequences of numbers. Though these types of statements are common in computer science,

More information

DETECTION theory deals primarily with techniques for

DETECTION theory deals primarily with techniques for ADVANCED SIGNAL PROCESSING SE Optimum Detection of Deterministic and Random Signals Stefan Tertinek Graz University of Technology turtle@sbox.tugraz.at Abstract This paper introduces various methods for

More information

1 Introduction (January 21)

1 Introduction (January 21) CS 97: Concrete Models of Computation Spring Introduction (January ). Deterministic Complexity Consider a monotonically nondecreasing function f : {,,..., n} {, }, where f() = and f(n) =. We call f a step

More information

Information measures in simple coding problems

Information measures in simple coding problems Part I Information measures in simple coding problems in this web service in this web service Source coding and hypothesis testing; information measures A(discrete)source is a sequence {X i } i= of random

More information

A Rothschild-Stiglitz approach to Bayesian persuasion

A Rothschild-Stiglitz approach to Bayesian persuasion A Rothschild-Stiglitz approach to Bayesian persuasion Matthew Gentzkow and Emir Kamenica Stanford University and University of Chicago December 2015 Abstract Rothschild and Stiglitz (1970) represent random

More information

May 2015 Timezone 2 IB Maths Standard Exam Worked Solutions

May 2015 Timezone 2 IB Maths Standard Exam Worked Solutions May 015 Timezone IB Maths Standard Exam Worked Solutions 015, Steve Muench steve.muench@gmail.com @stevemuench Please feel free to share the link to these solutions http://bit.ly/ib-sl-maths-may-015-tz

More information

Theory and Applications of Stochastic Systems Lecture Exponential Martingale for Random Walk

Theory and Applications of Stochastic Systems Lecture Exponential Martingale for Random Walk Instructor: Victor F. Araman December 4, 2003 Theory and Applications of Stochastic Systems Lecture 0 B60.432.0 Exponential Martingale for Random Walk Let (S n : n 0) be a random walk with i.i.d. increments

More information

Tail Inequalities Randomized Algorithms. Sariel Har-Peled. December 20, 2002

Tail Inequalities Randomized Algorithms. Sariel Har-Peled. December 20, 2002 Tail Inequalities 497 - Randomized Algorithms Sariel Har-Peled December 0, 00 Wir mssen wissen, wir werden wissen (We must know, we shall know) David Hilbert 1 Tail Inequalities 1.1 The Chernoff Bound

More information

Testing Problems with Sub-Learning Sample Complexity

Testing Problems with Sub-Learning Sample Complexity Testing Problems with Sub-Learning Sample Complexity Michael Kearns AT&T Labs Research 180 Park Avenue Florham Park, NJ, 07932 mkearns@researchattcom Dana Ron Laboratory for Computer Science, MIT 545 Technology

More information

Discrete-event simulations

Discrete-event simulations Discrete-event simulations Lecturer: Dmitri A. Moltchanov E-mail: moltchan@cs.tut.fi http://www.cs.tut.fi/kurssit/elt-53606/ OUTLINE: Why do we need simulations? Step-by-step simulations; Classifications;

More information

Bayesian decision theory Introduction to Pattern Recognition. Lectures 4 and 5: Bayesian decision theory

Bayesian decision theory Introduction to Pattern Recognition. Lectures 4 and 5: Bayesian decision theory Bayesian decision theory 8001652 Introduction to Pattern Recognition. Lectures 4 and 5: Bayesian decision theory Jussi Tohka jussi.tohka@tut.fi Institute of Signal Processing Tampere University of Technology

More information

Lecture - 30 Stationary Processes

Lecture - 30 Stationary Processes Probability and Random Variables Prof. M. Chakraborty Department of Electronics and Electrical Communication Engineering Indian Institute of Technology, Kharagpur Lecture - 30 Stationary Processes So,

More information

Stability and Rare Events in Stochastic Models Sergey Foss Heriot-Watt University, Edinburgh and Institute of Mathematics, Novosibirsk

Stability and Rare Events in Stochastic Models Sergey Foss Heriot-Watt University, Edinburgh and Institute of Mathematics, Novosibirsk Stability and Rare Events in Stochastic Models Sergey Foss Heriot-Watt University, Edinburgh and Institute of Mathematics, Novosibirsk ANSAPW University of Queensland 8-11 July, 2013 1 Outline (I) Fluid

More information

Discrete Event Systems Exam

Discrete Event Systems Exam Computer Engineering and Networks Laboratory TEC, NSG, DISCO HS 2016 Prof. L. Thiele, Prof. L. Vanbever, Prof. R. Wattenhofer Discrete Event Systems Exam Friday, 3 rd February 2017, 14:00 16:00. Do not

More information

The Derivative of a Function Measuring Rates of Change of a function. Secant line. f(x) f(x 0 ) Average rate of change of with respect to over,

The Derivative of a Function Measuring Rates of Change of a function. Secant line. f(x) f(x 0 ) Average rate of change of with respect to over, The Derivative of a Function Measuring Rates of Change of a function y f(x) f(x 0 ) P Q Secant line x 0 x x Average rate of change of with respect to over, " " " " - Slope of secant line through, and,

More information

Joint Probability Distributions and Random Samples (Devore Chapter Five)

Joint Probability Distributions and Random Samples (Devore Chapter Five) Joint Probability Distributions and Random Samples (Devore Chapter Five) 1016-345-01: Probability and Statistics for Engineers Spring 2013 Contents 1 Joint Probability Distributions 2 1.1 Two Discrete

More information

Introduction and Preliminaries

Introduction and Preliminaries Chapter 1 Introduction and Preliminaries This chapter serves two purposes. The first purpose is to prepare the readers for the more systematic development in later chapters of methods of real analysis

More information

Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016

Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016 Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016 1 Entropy Since this course is about entropy maximization,

More information