Estimating and Simulating Nonhomogeneous Poisson Processes

Similar documents
Nonparametric estimation and variate generation for a nonhomogeneous Poisson process from event count data

Estimation for Nonhomogeneous Poisson Processes from Aggregated Data

A TEST OF RANDOMNESS BASED ON THE DISTANCE BETWEEN CONSECUTIVE RANDOM NUMBER PAIRS. Matthew J. Duggan John H. Drew Lawrence M.

Section 7.5: Nonstationary Poisson Processes

Reliability Growth in JMP 10

Mahdi karbasian* & Zoubi Ibrahim

An Integral Measure of Aging/Rejuvenation for Repairable and Non-repairable Systems

Introduction to repairable systems STK4400 Spring 2011

Repairable Systems Reliability Trend Tests and Evaluation

AN INTEGRAL MEASURE OF AGING/REJUVENATION FOR REPAIRABLE AND NON REPAIRABLE SYSTEMS

Introduction to Modeling and Generating Probabilistic Input Processes for Simulation

ISSN (ONLINE): , ISSN (PRINT):

Time-varying failure rate for system reliability analysis in large-scale railway risk assessment simulation

Confidence Intervals for Reliability Growth Models with Small Sample Sizes. Reliability growth models, Order statistics, Confidence intervals

Recap. Probability, stochastic processes, Markov chains. ELEC-C7210 Modeling and analysis of communication networks

LEAST SQUARES ESTIMATION OF NONHOMOGENEOUS POISSON PROCESSES

Computer Simulation of Repairable Processes

Proceedings of the 2014 Winter Simulation Conference A. Tolk, S. D. Diallo, I. O. Ryzhov, L. Yilmaz, S. Buckley, and J. A. Miller, eds.

BAYESIAN MODELING OF DYNAMIC SOFTWARE GROWTH CURVE MODELS

Stat 710: Mathematical Statistics Lecture 31

Independent Events. Two events are independent if knowing that one occurs does not change the probability of the other occurring

Solution: The process is a compound Poisson Process with E[N (t)] = λt/p by Wald's equation.

The exponential distribution and the Poisson process

Notes largely based on Statistical Methods for Reliability Data by W.Q. Meeker and L. A. Escobar, Wiley, 1998 and on their class notes.

STAT331 Lebesgue-Stieltjes Integrals, Martingales, Counting Processes

Northwestern University Department of Electrical Engineering and Computer Science

Random variables. DS GA 1002 Probability and Statistics for Data Science.

Stat 516, Homework 1

IEOR 3106: Second Midterm Exam, Chapters 5-6, November 7, 2013

Martín Gastón Teresa León Fermín Mallor

MAS1302 Computational Probability and Statistics

MODELING AND ANALYZING TRANSIENT MILITARY AIR TRAFFIC CONTROL

Simulating Multivariate Nonhomogeneous Poisson Processes Using Projections

IE 581 Introduction to Stochastic Simulation

Economics 583: Econometric Theory I A Primer on Asymptotics

Stochastic Renewal Processes in Structural Reliability Analysis:

Poisson Processes. Particles arriving over time at a particle detector. Several ways to describe most common model.

3 Continuous Random Variables

Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama

Variability within multi-component systems. Bayesian inference in probabilistic risk assessment The current state of the art

S1. Proofs of the Key Properties of CIATA-Ph

Estimation for generalized half logistic distribution based on records

Simulation and Calibration of a Fully Bayesian Marked Multidimensional Hawkes Process with Dissimilar Decays

Modeling and Simulation of Nonstationary Non-Poisson Arrival Processes

On the Partitioning of Servers in Queueing Systems during Rush Hour

Lecture 2: CDF and EDF

Optimal maintenance for minimal and imperfect repair models

Choosing Arrival Process Models for Service Systems: Tests of a Nonhomogeneous Poisson Process

1.225J J (ESD 205) Transportation Flow Systems

ISE/OR 762 Stochastic Simulation Techniques

Computer simulation on homogeneity testing for weighted data sets used in HEP

Introduction to Reliability Theory (part 2)

I I FINAL, 01 Jun 8.4 to 31 May TITLE AND SUBTITLE 5 * _- N, '. ', -;

Continuous-time Markov Chains

A STATISTICAL TEST FOR MONOTONIC AND NON-MONOTONIC TREND IN REPAIRABLE SYSTEMS

1 Poisson processes, and Compound (batch) Poisson processes

Online Supplement to Are Call Center and Hospital Arrivals Well Modeled by Nonhomogeneous Poisson Processes?

Recognition Performance from SAR Imagery Subject to System Resource Constraints

Q = (c) Assuming that Ricoh has been working continuously for 7 days, what is the probability that it will remain working at least 8 more days?

Survival Analysis. Lu Tian and Richard Olshen Stanford University

ABSTRACT. LIU, RAN. Modeling and Simulation of Nonstationary Non-Poisson Processes. (Under the direction of Dr. James R. Wilson.)

ST745: Survival Analysis: Nonparametric methods

XLVII SIMPÓSIO BRASILEIRO DE PESQUISA OPERACIONAL

3. Poisson Processes (12/09/12, see Adult and Baby Ross)

Statistical Methods for Astronomy

Multivariate Distributions

Master of Science in Statistics A Proposal

Slides 5: Random Number Extensions

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3

This paper investigates the impact of dependence among successive service times on the transient and

RENEWAL PROCESSES AND POISSON PROCESSES

Chapter 5. Statistical Models in Simulations 5.1. Prof. Dr. Mesut Güneş Ch. 5 Statistical Models in Simulations

Finite-Horizon Statistics for Markov chains

(b) What is the variance of the time until the second customer arrives, starting empty, assuming that we measure time in minutes?

A Simulation Study on Confidence Interval Procedures of Some Mean Cumulative Function Estimators

Modelling and simulating non-stationary arrival processes to facilitate analysis

System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models

Uniform random numbers generators

Gradient-based Adaptive Stochastic Search

Notation Precedence Diagram

Examination paper for TMA4275 Lifetime Analysis

Multistate models and recurrent event models

Permutation tests. Patrick Breheny. September 25. Conditioning Nonparametric null hypotheses Permutation testing

Estimating a parametric lifetime distribution from superimposed renewal process data

Department of Mathematics

errors every 1 hour unless he falls asleep, in which case he just reports the total errors

ABC methods for phase-type distributions with applications in insurance risk problems

An algorithm for computing the PDF of order statistics drawn from discrete parent populations is presented,

Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim

Proceedings of the 2012 Winter Simulation Conference C. Laroque, J. Himmelspach, R. Pasupathy, O. Rose, and A. M. Uhrmacher, eds.

Expected Values, Exponential and Gamma Distributions

A nonparametric monotone maximum likelihood estimator of time trend for repairable systems data

STATISTICAL INFERENCE IN ACCELERATED LIFE TESTING WITH GEOMETRIC PROCESS MODEL. A Thesis. Presented to the. Faculty of. San Diego State University

Stock Sampling with Interval-Censored Elapsed Duration: A Monte Carlo Analysis

Analytical Bootstrap Methods for Censored Data

2 Continuous Random Variables and their Distributions

Slides 8: Statistical Models in Simulation

Conditioning Nonparametric null hypotheses Permutation testing. Permutation tests. Patrick Breheny. October 5. STA 621: Nonparametric Statistics

Poisson Processes. Stochastic Processes. Feb UC3M

Birth-Death Processes

Transcription:

Estimating and Simulating Nonhomogeneous Poisson Processes Larry Leemis Department of Mathematics The College of William & Mary Williamsburg, VA 2387 8795 USA 757 22 234 E-mail: leemis@math.wm.edu May 23, 23 Outline. Motivation 2. Probabilistic properties 3. Estimating Λ(t) from k realizations on (, S] 4. Estimating Λ(t) from overlapping realizations 5. Software 6. Conclusions Note: Portions of this work are with Brad Arkin (RST Corporation), Andy Glen (United States Military Academy), John Drew (William & Mary), and Diane Evans (Rose Hulman). Part of this work was supported by an NSF grant supporting the UMSA (Undergraduate Modeling and Simulation Analysis) REU (Research Experience for Undergraduates) at The College of William & Mary.

. Motivation Although easy to estimate and simulate, HPPs and renewal processes do not allow for varying rates. The use of an NHPP is often more appropriate for modeling. Example: Customer arrivals to a fast food restaurant 6 5 4 3 2 rate Breakfast Lunch Dinner 5 5 2 time Other examples: Cyclone arrival times in the Arctic Sea (Lee, Wilson, and Crawford, 99) Database transaction times (Lewis and Shedler, 976) Calls for on-line analysis of electrocardiograms at a hospital in Houston (Kao and Chang, 988) Respiratory cancer deaths near a steel complex in Scotland (Lawson, 988) Repairable systems: blood analyzers, fan motors, power supplies, turbines (Nelson, 988)

2. Probabilistic properties Notation t time N(t) number of events by time t λ(t) instantaneous arrival rate at time t (intensity function) Λ(t) = t λ(τ)dτ cumulative intensity function Property [ b a λ(τ)dτ ] n e b a λ(τ)dτ Pr (N(b) N(a) = n) = n! Property 2 E[N(t)] = Λ(t) n =,,... Property 3 (Çinlar, 975) If E, E 2,... are event times in a unit HPP then Λ (E ), Λ (E 2 ),... are event times in an NHPP with cumulative intensity function Λ(t). Λ( t ) E 5 E 4 E 3 E 2 E T T 2 T 3 T 4 T 5 t

3. Estimating Λ(t) from k realizations on (, S] Data t time (, S] time interval where observations are collected k number of realizations collected n, n 2,..., n k number of observations per realization t (), t (2),..., t (n) superposition of observations n = k i= n i total number of observations collected Intuitive solution (Law and Kelton, 2): partition time axis and let λ(t) be piecewise constant. Problems (a) Determining cell width Small cell width sampling variability Large cell width miss trend (b) Subjective due to arbitrary parameters 6 5 4 3 2 rate Breakfast Lunch Dinner 5 5 2 time

O O O O O O O O O Piecewise-linear nonparametric cumulative intensity function estimator in ˆΛ(t) = (n + )k + n ( ) t t (i) (n + ) k ( ) t (i) < t t (i+) t (i+) t (i) for i =,, 2,..., n. Rationale: ˆΛ(S) = n/k n k Λ( t ) 2n (n+)k n (n+)k t () t (2) S t

Properties Handles ties as expected Consistency lim ˆΛ(t) = Λ(t) k Confidence interval (asymptotically exact) for Λ(t) ˆΛ(t) ± z α/2 ˆΛ(t) Variate generation straightforward Input: Number of observed arrivals n Number of active realizations k Superpositioned observations t (), t (2),..., t (n+) Output: Event times T, T 2,..., T i on (, S] k i generate U i U(, ) E i log e ( U i ) while E i < n/k do begin m (n+)kei n [initialize variate counter] [generate initial random number] [generate initial exponential variate] [set m ˆΛ(t (m) ) < E i ˆΛ(t (m+) )] ( T i t (m) + [t (m+) t (m) ] (n+)kei n m [generate event time] i i + [increment variate counter] generate U i U(, ) [generate next random number] E i E i log e ( U i ) [generate next HPP event time] end )

4. Estimating Λ(t) from overlapping realizations Data t time (, S] time interval where observations are collected r # time intervals where the # realizations is fixed (s j, s j+ ] interval j +, j =,,..., r k j+ # realizations observed on (s j, s j+ ], j =,,..., r n j+ number of observations on (s j, s j+ ] t (), t (),..., t (n+r) superposition of observations, s, s,..., s r Note: s =, s r = S ˆΛ(t) = j q= n q k q + ( i j q= (n q + ) ) ( ) n j+ n j+ + t t(i) ( ) (n j+ + ) k j+ (n j+ + ) k j+ t(i+) t (i) t (i) < t t (i+) ; i =,, 2,..., n + r, Rationale: ˆΛ(s j+ ) = j+ s j < t s j+ ; j =,,..., r, q= n q /k q Single segment of ˆΛ(t) in the (j + )st region:, j + q = n q k q k j+ realizations n j+ observations ˆΛ(t (i+) ) ˆΛ(t (i) ) jq = n q k q s j t (i) t (i+) s j+

Example: Lunchwagon arrivals (Klein and Roberts, 984) Depiction of three regions for lunchwagon arrivals from : AM to 2:3 PM for k =, k 2 = 2, and k 3 = ; s =.5, s 2 = 3, and s 3 = 4.5. 6 5 4 Λ(t) 3 2.5.5 2 2.5 t 3 3.5 4 4.5

An asymptotically exact ( α)% confidence interval for Λ(t) Λ(t) ˆΛ(t) < z α/2 ˆΛ(t) ˆΛ(s j ) k j+ + j q= ˆΛ(s q ) ˆΛ(s q ) Parent cumulative intensity function, nonparametric estimator, and 95% confidence bands for lunchwagon arrivals from : AM to 2:3 PM for k =, k 2 = 2, and k 3 = ; s =.5, s 2 = 3, and s 3 = 4.5. k q 6 5 4 Λ(t) 3 2.5.5 2 2.5 t 3 3.5 4 4.5

Variate Generation Input: Number of partitions r Number of active realizations k, k 2,..., k r Number of observed arrivals per partition n, n 2,..., n r Superpositioned values t (), t (),..., t (n+r) Output: Event times T, T 2,..., T i on (, S] i j MAX r q= n q /k q generate U i U(, ) E i log e ( U i ) [initialize variate counter] [initialize region counter] [set MAX to ˆΛ(S)] [generate initial random number] [generate initial exponential variate] while E i < MAX do begin while E i > j+ q= n q /k q do [update j if necessary] begin j j + [increment region counter] end m (nj+ +)k j+(e i j q= n q/k q) n j+ ( T i t (m) +[t (m+) t (m) ] (n j++)k j+ + j q= (n q + ) [set m ˆΛ(t (m) ) < E i ˆΛ(t (m+) )] ) j E i q= nq/kq n j+ ( m j q= (n q + ) ) i i + generate U i U(, ) E i E i log e ( U i ) end [generate event time] [increment variate counter] [generate next random number] [generate next HPP event time]

Example : Monte Carlo evaluation of the confidence interval for Λ(t) Coverages in the lunchwagon example (nominal coverage.95;, replications; k =, k 2 = 2, k 3 = ; s =, s =.5, s 2 = 3, s 3 = 4.5). Time Actual Coverage Misses High Misses Low.9.95.3.487.35.9386.48.566.8.955.2.296 2.25.9466.96.339 2.7.9498.74.329 3.5.959.295.96 3.6.9498.25.25 4.5.957.67.36

Example 2: Failure times for 2 copy machines (Zaino and Berke 992) Failure times 45 Machine Number 4 35 3 25 234567 Actuations ˆΛ(t) for each machine ˆΛ(t) 8 7 6 5 4 3 2 234567 Actuations

ˆΛ(t) for the copy machine failure times 8 ˆΛ(t) 6 4 2 234567 Actuations

Example 3: Failure times of heat pump compressors (Nelson 99) The compressors are located in five separate buildings, each under repair contract for a time span (a, b], indicated by the boldface values. The data set consists of n = 28 failure times, and yields r = 29 regions. Bldg Num of Comp Entry time, Failure Times, Exit time B 64 2.59, 3.3, 4.62, 4.62, 5.75, 5.75, 7.42, 7.42, 8.77, 9.27, D 356 4.45, 4.47, 4.47, 5.56, 5.57, 5.8, 6.3, 7.2, 7. E 458., 2,85, 4.65, 4.79, 5.85, 6.73, 7.33 H 49.,.7,.7,.34, 5.9 K 95., 2.7, 3.65, 4.4, 4.4 ˆΛ(t) for the heat pump compressor failure times..8 ˆΛ(t).6.4.2 2 3 4 5 t (years) 6 7 8 9

5. Software Civilization advances by extending the number of important operations which we can perform without thinking about them. Alfred North Whitehead (86 947) APPL (A Probability Programming Language) is a Maplebased language with data structures for discrete and continuous random variables and algorithms for their manipulation. Example : Let X, X 2,..., X be independent and identically distributed U(,) random variables. Find Typical approaches Pr Central limit theorem Simulation 4 < i= n := ; X := UniformRV(, ); Y := Convolution(X, n); CDF(Y, 6) - CDF(Y, 4); 65577 972 X i < 6

Example 2: X T riangular(, 2, 4) Y T riangular(, 2, 3) Find the distribution of V = XY. 4 y 3 xy = 2 2 2 3 4 xy = 8 xy = 6 xy = 4 xy = 3 xy = 2 xy = x X := TriangularRV(, 2, 4); Y := TriangularRV(, 2, 3); V := Product(X, Y); which returns the probability density function of V as f V (v) = 4 3 v + 2 3 ln v + 2v 3 ln v + 4 3 < v 2 8 + 4 3 ln 2 + 7v 3 ln 2 + 3 v 4 ln v 5v 3 ln v 2 < v 3 4 + 4 3 ln 2 + 7v 3 ln 2 + 2v 2 ln v v ln v 2 ln 3 2v 3 ln 3 44 3 < v 4 3 4 ln 2 7v 3 ln 2 8 3 v 2 ln 3 + 22 3 ln v 2v 3 ln 3 + 4v 3 ln v 4 < v 6 8 3 8 ln 2 4v 3 ln 2 2 3 v + 4 3 ln v + v 3 ln v + 4 ln 3 + v 3 ln 3 6 < v 8 8 + 8 ln 2 + 2v 3 ln 2 + 2 3 v + 4 ln 3 4 ln v + v 3 ln 3 v 3 ln v 8 < v < 2

Example 3: Kolmogorov Smirnov test statistic (all parameters known) Defining formula: D n = sup x F (x) F n (x), Computational formula: D n = max i=,2,...,n i n x (i), i n x (i) The cdf for the test statistic is (Birnbaum, 952) P D n < 2n + v for v 2n 2n, where for u u 2 u n. = n! 2n +v 2n v 3 2n +v 3 2n +v v... 2n 2n v 2n 2n g(u, u 2,..., u n ) du n... du 2 du g(u, u 2,..., u n ) = CASE I: n = F D (t) = Pr(D t) = t 2 2t 2 < t < t CASE II: n = 2 F D2 (t) = Pr(D 2 t) = t 4 8 ( t ) 2 4 4 < t < 2 2( t) 2 2 < t < t

CASE I: n =. F(x).8.6.4.2. x..2.4.6.8. CASE II: n = 2. F(x).8.6.4.2. x..2.4.6.8.

Goal: X := KSRV(n); CASE III: n = 6. F(x).8.6.4.2. x..2.4.6.8. y < 2 468 y 6 234 y 5 + 48 y 4 6 3 y 3 + 3 y2 9 y + 5 324 2 y < 6 288 y 6 48 y 5 + 236 y 4 28 3 y 3 + 235 9 y2 + 27 y 5 8 6 y < 4 32 y 6 + 32 y 5 26 3 y 4 + 424 9 y 3 785 9 y2 + 45 27 y 35 296 4 y < 3 28 y F D6 (y) = 6 + 56 y 5 5 3 y 4 + 55 9 y3 + 525 54 y2 565 8 y + 5 6 3 y < 5 2 4 y 6 24 y 5 + 295 y 4 985 9 y 3 + 775 9 y2 7645 648 y + 5 5 6 2 y < 2 2 y 6 + 32 y 5 85 9 y3 + 75 36 y2 + 337 648 y y < 2 3 y 6 38 y 5 + 6 3 y4 265 9 y3 5 8 y2 + 465 648 y 2 3 y < 5 6 2 y 6 + 2 y 5 3 y 4 + 4 y 3 3 y 2 5 + 2 y 6 y < y.

Example 4:. Let X, X 2,..., X be iid geometric( / 4) random variables (parameterized from ). Find the mean and variance of X (2). Y := OrderStat(GeometricRV( / 4),, 2); Mean(Y); Variance(Y); yielding µ = 3583658956 2399275947 and σ 2 = 39699845747469522944 52329477258653395669 Example 5:. Fit a power law process to the odometer failure data over (, ]: 2,942 28,489 65,56 78,254 83,639 85,63 88,43 9,89 92,36 94,78 98,23 99,9. CarFailures := [2942, 28489, 6556, 78254, 83639, 8563, 8843, 989, 9236, 9478, 9823, 999]; X := WeibullRV(lambda, kappa); hat := MLENHPP(X, CarFailures, [lambda, kappa], ); PlotEmpVsFittedCIF(X, Sample, [lambda = hat[], kappa = hat[2]],, ); ˆλ =.2637 ˆκ = 2.568

2 8 CIF 6 4 2 2 4 6 8 x 6. Conclusions (a) Nonparametric estimation and simulation for NHPPs is straightforward. (b) Collecting data across overlapping intervals does not pose any significant problems. (c) Once coded, this approach requires less effort than a parametric renewal process in order to simulate the observations. (d) There may be potential for a probability package analogous to statistical packages such as SAS, SPSS, or S-Plus. (e) I am searching for situations where an exact probability calculation is needed (typically not the case in economics, OR, engineering, classical statistics; possibly the case in biology, chemistry, physics).

Bibliography Arkin, B., and Leemis, L., Nonparametric Estimation of the Cumulative Intensity Function for a Nonhomogeneous Poisson Process from Overlapping Intervals, Management Science, 46, 7, 2, 989 998. Drew, J.H., Glen, A.G., Leemis, L.M., Computing the Cumulative Distribution Function of the Kolmogorov-Smirnov Statistic, Computational Statistics and Data Analysis, 34,, 2, 5. Evans, D.L. and Leemis, L.M., Input Modeling Using a Computer Algebra System, in Proceedings of the 2 Winter Simulation Conference, J.A. Joines, R.R. Barton, K. Kang, P.A. Fishwick, eds., Institute of Electrical and Electronics Engineers, Orlando, Florida, 2, 577 586. Glen, A.G., Leemis, L.M., and Drew, J.H., A Generalized Univariate Change-of-Variable Transformation Technique, INFORMS Journal on Computing, 9, 3, 997, 288 295. Glen, A.G., Evans, D.L., and Leemis, L.M., APPL: A Probability Programming Language, The American Statistician, 55, 2, 2, 56 66. Glen, A.G., Leemis, L.M., and Drew, J.H., Computing the Distribution of the Product of Two Continuous Random Variables, forthcoming, Computational Statistics and Data Analysis. Leemis, L.M., Nonparametric Estimation of the Intensity Function for a Nonhomogeneous Poisson Process, Management Science, 37, 7, 99, 886 9.