Strategic Allocation of Test Units in an Accelerated Degradation Test Plan

Zhi-Sheng YE, Qingpei HU, and Dan YU

Department of Industrial and Systems Engineering, National University of Singapore
Academy of Mathematics and Systems Science, Chinese Academy of Sciences, China

Abstract

Degradation is often defined in terms of the change of a key performance characteristic over time. It is common to see that the initial performance of the test units varies, and that it is strongly correlated with the degradation rate. Motivated by a real application in the semiconductor sensor industry, this study advocates an allocation strategy in accelerated degradation test (ADT) planning that capitalizes on this correlation. In the proposed strategy, the initial degradation levels of the test units are measured and the measurements are ranked. The ranking information is used to allocate the test units to different factor levels of the accelerating variable. More specifically, we may prefer to allocate units with lower degradation rates to a higher factor level in order to hasten the degradation process. The allocation strategy is first demonstrated using a cumulative-exposure degradation model. Likelihood inference for the model is developed. The optimum test plan is obtained by minimizing the large-sample variance of a lifetime quantile at nominal use conditions. Various compromise plans are discussed. A comparison of the results with those from traditional ADTs with random allocation reveals the value of the proposed allocation rule. To demonstrate its broad applicability, we further apply the allocation strategy to two more degradation models, both variants of the cumulative-exposure model.

Key words: Random initial degradation, order statistics, general path model, compromise test plan, large sample approximate variance, reliability.

1 Introduction

Degradation is a common mechanism of product failures. In many applications, degradation of a product is measurable. Degradation data can be obtained through repeated measurements of the degradation over time, based on which an appropriate degradation model can be chosen for the degradation process. A degradation-induced failure is often defined as the event that the degradation exceeds a certain threshold. Therefore, the failure time distribution of the product can be readily established by making use of the degradation model. When the degradation is measurable, a degradation test requires fewer test units and a shorter test duration to quantify the product lifetime distribution, compared with traditional life tests that use only binary failure/survival information.

Nevertheless, degradation of a product is often quite slow, and a long time is needed to observe noticeable degradation signals. The slow degradation rate causes problems when the maximum allowable duration of the test is limited. A common remedy is to use harsh test conditions such as high temperature, use rate, and vibration. The purpose of acceleration is to increase the degradation rate so that clear degradation signals can be observed within the test duration. An interesting observation from applications is that the degradation rate is highly correlated with the initial degradation level of a test unit. This observation, together with the purpose of acceleration, naturally leads to a research question on the use of the initial degradation measurements for a better ADT plan. Two examples below motivate our study.

1.1 Motivating examples

Semiconductor infra-red sensors are widely used in advanced applications such as spacecraft attitude determination, thermal imaging systems, and high-temperature pyrometers. Our study concerns an important application of infra-red sensors to railway wheel temperature detection. Overheating of wheel bearings is a major cause of railway accidents. A thermal sensor monitors the temperature of the bearings and provides timely alerts of overheating. Signals generated by the sensor are used to determine the temperature, and the noise level in the signals affects the measurement accuracy. The noise is caused by carrier trapping in defects of the infra-red thin film. Detailed discussions on the definition, root cause, and quantification of the noise can be found in Jevtić (1995). The noise level increases over time due to inner degradation of the film. In order to identify the signals, the noise level should not exceed a given tolerance limit. To understand the noise-induced failures, an ADT was conducted in which 60 thermal sensors were assigned to three temperature levels. Degradation of 13 units under one temperature level is shown in Figure 1; the data were scaled to protect proprietary information.

Figure 1: Degradation paths of 13 infra-red sensors.

As can be seen from Figure 1, the degradation measurements at the beginning vary. The difference may be explained by a short, yet unknown, usage time during production testing and burn-in screening, as well as by measurement errors. A close look at the degradation paths reveals that a unit with a higher initial degradation measurement tends to have a higher degradation rate. This positive correlation has been observed consistently in most tests of the infra-red sensors in this company. Because the purpose of an accelerated test is to hasten the degradation rate, a natural question raised by the engineers is: shall we allocate units with higher initial degradation measurements to a lower temperature level? These units are expected to have higher degradation rates, so a stronger degradation signal may be observed under the low factor level than from a randomly selected unit. On the contrary, a unit with a small degradation rate may need a higher temperature level to bring out the degradation signal.

It is also possible that the initial degradation level is negatively correlated with the degradation rate. Such an example can be found in Weaver et al. (2013), where a degradation test was performed on inkjet printer heads. Degradation of the printhead is defined as the diffusion of an ink-related substance in the head. When the substance reaches a certain location in the head, which can be regarded as a degradation-failure threshold, a failure will soon follow.

In total, 12 units were tested to assess the lifetime distribution of the degradation-induced failures. The data are shown in Figure 2. The analysis in Weaver et al. (2013) showed that the correlation coefficient between the initial degradation level and the degradation rate is as high as -0.82. Given the negative correlation, a natural strategy is to allocate units with a higher initial degradation to a higher factor level.

Figure 2: Scatterplot of printhead migration data.

1.2 Related literature

Over the last decade, there has been rapid growth in the literature on ADT models and ADT planning. A comprehensive demonstration of ADT data analysis and a brief overview of the early literature are given in Meeker and Escobar (1998, Chapter 21). Among the literature, some studies consider optimum degradation test planning without stress acceleration, i.e., the test is conducted at room temperature. Common objectives of the planning include cost minimization (Wu and Chang 2002; Weaver et al. 2013; Kim and Bae 2013) and (asymptotic) variance minimization of an estimated reliability quantity such as a lifetime quantile (Yu and Tseng 1999; Tsai et al. 2012). These planning methods are useful when degradation under use conditions is sufficiently fast. When the degradation is very slow, however, acceleration of the degradation process using accelerating variables is necessary.

General path models (Meeker et al. 1998) and stochastic process models (e.g., Park and Padgett 2005) have been well developed for ADTs, based on which optimum ADTs can be designed. Using general path models, Boulanger and Escobar (1994) discussed optimum ADT design when there is an upper bound on the maximum degradation level. Yu (2006) derived optimum ADTs when the degradation rate follows a reciprocal Weibull distribution and the initial degradation is zero. Liu and Tang (2010) used Bayesian methods for ADT design, where the mean path function is derived from the degradation physics. Weaver and Meeker (2014) illustrated methods for finding optimum ADTs when the rate is normally distributed and the initial degradation level is random. Shi et al. (2009) and Shi and Meeker (2012) considered the scenario where the measurement process is destructive to the test units. On the other hand, optimum ADT planning using stochastic processes has also been studied extensively in recent years. For example, Liao and Tseng (2006), Peng and Tseng (2010), Lim and Yum (2011), and Hu et al. (2015) derived optimum ADT designs for the Wiener degradation process under different acceleration schemes, i.e., constant-stress, step-stress, and progressive-stress accelerations. Tseng and Lee (2015) discussed optimum ADT planning for a general class of degradation processes called the exponential-dispersion models.

Generally speaking, the initial degradation at time zero is nonzero and random. This fact is supported by numerous degradation datasets reported in the literature, e.g., the block error rate data of storage disks (Meeker et al. 1998, Table C.18), the strength data of adhesive bonds (Jeng et al. 2011), the luminosity data of vacuum fluorescent displays (Bae and Kvam 2004), and the printhead migration data presented in Figure 2. In a general path model, the randomness in the initial degradation is often modelled by a normal distribution, e.g., Lu et al. (1997), Yuan and Pandey (2009), and Weaver et al. (2013). All the analyses in the above-mentioned work revealed strong correlations between the initial degradation level and the degradation rate. For example, the estimated correlation coefficient in the wall-thinning data of carbon steel pipes used in a nuclear power plant (Yuan and Pandey 2009) is as high as 0.91. When stochastic-process models are used for the degradation data, however, most studies ignored the random initial degradation and focused on the degradation increments (e.g., Si et al. 2012; Hu et al. 2015).

1.3 Overview

The strong correlation between the initial degradation level and the degradation rate implies that the degradation rates can be roughly estimated/ranked using the initial degradation measurements. Test units whose degradation rates are deemed low can then be allocated to a high level of the accelerating variable in order to accelerate the degradation.

Based on this idea, this study proposes an allocation strategy for test units in an ADT. Statistical inference for degradation data from such an ADT is developed. Optimum designs of ADTs with strategic allocation are also investigated.

The remainder of the paper is organized as follows. Section 2 states the allocation strategy and the degradation data. Section 3 presents a general path model for the degradation process. The cumulative-exposure principle is used to incorporate the accelerating variable in the model. Likelihood inference for the model is developed in Section 4. The results are used for ADT planning in Section 5. Section 6 demonstrates the allocation strategy using two more ADT models. Section 7 concludes the paper.

2 Allocation Strategy and the Data

The study is confined to ADTs with one accelerating variable. In a conventional ADT plan, test units are randomly allocated to a number of factor levels of the accelerating variable. This allocation scheme is called random allocation, in contrast to the strategic allocation proposed in this study. In the proposed allocation strategy, the initial degradation measurements of the test units, denoted as $X$, are taken at the beginning of the ADT. The units are then allocated to different factor levels of the accelerating variable based on their initial measurements.

Specifically, assume that a total of $n$ test units are available for the ADT experiment. Degradation levels of all $n$ units are measured at time $t = 0$. The degradation measurements are ranked in ascending order, i.e., from the smallest to the largest. For notational ease, we call the unit with the $i$th smallest initial degradation measurement unit $i$, and let $X_{(i)}$ be the corresponding initial degradation measurement. Therefore, $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$ form a sequence of order statistics from $X$. Based on the ranking, unit $i$ is allocated to factor level $s(i)$. For instance, if the planner decides to use two factor levels in the ADT and the correlation between the initial degradation measurements and the degradation rates is negative, then she may consider allocating the first $n_L$ units to a low factor level $s_L$ and the remainder to the highest allowable level $s_H$. Values of $n_L$ and $s_L$ can be determined by optimizing a certain planning criterion, e.g., minimizing the asymptotic variance of an estimated lifetime quantile. In this case, $s(i) = s_L$ for $i = 1, \ldots, n_L$ and $s(i) = s_H$ otherwise; a code sketch of this rule is given below.

After the $i$th unit is allocated to factor level $s(i)$, denote its measurement times as $t_i = \{t_{i1}, \ldots, t_{im_i}\}$, where $m_i$ is the total number of measurements (excluding the measurement at $t = 0$) for this unit during the accelerated test. The associated degradation measurements are denoted as $Y_i = \{Y_i(t_{i1}), \ldots, Y_i(t_{im_i})\}$. Therefore, the degradation data collected from the ADT are $D = \{(X_{(i)}, s(i), t_i, Y_i);\ i = 1, 2, \ldots, n\}$. In order to analyze the data and to plan the ADT, an appropriate model for the degradation process is needed. In the following section, a cumulative-exposure degradation model is presented for the data.
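To make the allocation rule concrete, here is a minimal Python sketch (ours, not from the paper; function and variable names are illustrative). It ranks units by their initial degradation measurements and assigns the $n_L$ lowest-ranked units to the low stress level, as in the negative-correlation case above.

```python
import numpy as np

def strategic_allocation(x0, n_low, s_low, s_high=1.0):
    """Rank units by their initial degradation measurements and assign
    stress levels (two-level plan, negative-correlation case: the n_low
    units with the SMALLEST initial readings go to the low level s_low)."""
    order = np.argsort(x0)              # unit indices, smallest reading first
    stress = np.empty(len(x0))
    stress[order[:n_low]] = s_low       # ranks 1..n_low -> s_L
    stress[order[n_low:]] = s_high      # remainder -> s_H
    return stress

# toy usage: 10 units, 6 of them assigned to the low level
rng = np.random.default_rng(1)
x0 = rng.normal(loc=11.2, scale=2.6, size=10)   # hypothetical initial readings
print(strategic_allocation(x0, n_low=6, s_low=0.24))
```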

3 A Cumulative-Exposure ADT Model

Following Lu et al. (1997), Yuan and Pandey (2009), and Weaver et al. (2013), we first assume that under nominal use conditions, the degradation measurement at time $t$ can be described by a random-effects degradation model as
$$Y(t) = Y_0 + bt + \epsilon, \qquad (1)$$
where $Y_0$ is the initial degradation level, $b$ is the degradation rate that is assumed to vary from unit to unit, and $\epsilon \sim N(0, \sigma_\epsilon^2)$ is a measurement error term assumed to be independent and identically distributed across different $t$. Here, $t$ can be a monotone transformation of the chronological time, e.g., Tseng et al. (2003), Lee and Tang (2007), and Weaver and Meeker (2014). We further model $(Y_0, b)$ by a bivariate normal distribution as
$$(Y_0, b) \sim N\left(\beta_0,\ \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{01} & \sigma_1^2 \end{bmatrix}\right),$$
where $\beta_0 = [\beta, \alpha_0]^T$ is the mean parameter vector. Throughout the paper, we consider increasing degradation, i.e., $\alpha_0 > 0$. The case of decreasing degradation can be handled in a similar vein.

Let $D$ be a fixed failure threshold. A unit fails when its inherent degradation $Y_0 + bt$ exceeds $D$. The cumulative distribution function (CDF) of the failure time $T$ under the nominal use conditions is given by
$$F_T(t) = 1 - \Phi\left(\frac{D - \beta - \alpha_0 t}{\sqrt{\sigma_0^2 + t^2\sigma_1^2 + 2t\sigma_{01}}}\right), \qquad (2)$$
where $\Phi(\cdot)$ is the standard normal CDF. The somewhat complicated expression for the lifetime quantile can be found in Equation (14) of Weaver et al. (2013), and the partial derivatives are provided in their appendix. A common objective of ADT planning is to minimize the asymptotic variance of the estimated $p$ lifetime quantile; this criterion is also adopted in our study. A numerical sketch of (2) is given below.
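The CDF in (2) follows because $Y_0 + bt \sim N(\beta + \alpha_0 t,\ \sigma_0^2 + t^2\sigma_1^2 + 2t\sigma_{01})$, so a quick numerical check is easy to write. The sketch below is ours; the inputs are illustrative (loosely based on the printhead planning values quoted in Section 5, with $\sigma_{01}$ taken negative to reflect the correlation of -0.82 reported in Section 1).

```python
import numpy as np
from scipy.stats import norm

def cdf_failure_time(t, D, beta, alpha0, s0sq, s1sq, s01):
    """Failure-time CDF (2): P(Y0 + b*t >= D), with (Y0, b) bivariate
    normal so that Y0 + b*t ~ N(beta + alpha0*t, s0sq + t^2*s1sq + 2*t*s01)."""
    sd = np.sqrt(s0sq + t**2 * s1sq + 2.0 * t * s01)
    return 1.0 - norm.cdf((D - beta - alpha0 * t) / sd)

# illustrative values only
print(cdf_failure_time(t=500.0, D=600.0, beta=11.22, alpha0=1.14,
                       s0sq=0.2025, s1sq=0.0049, s01=-0.0258))
```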

When the product is subjected to a higher factor level of the accelerating variable, the degradation is hastened. This study considers a single stress. Let $\tilde s_0$ be the nominal use level and $\tilde s_H$ the maximum allowable level of the accelerating variable. In this study, we normalize the stress as
$$s = \frac{\psi(\tilde s) - \psi(\tilde s_0)}{\psi(\tilde s_H) - \psi(\tilde s_0)},$$
where $\psi(\cdot)$ is a monotone transformation of the stress whose form depends on the acceleration relation, e.g., $\psi(\tilde s) = 1/\tilde s$ for the Arrhenius relation and $\psi(\tilde s) = \ln(\tilde s)$ for the power law. After normalization, $s_0 = 0$, $s_H = 1$ and $0 \le s \le 1$. It is reasonable to believe that the initial degradation level $Y_0$ does not change with the stress $s$. On the other hand, the cumulative-exposure principle is used to link the normalized accelerating variable to the degradation model (1). Under this principle, we assume that if a unit is operated under stress $s$ for a duration $t$, this is equivalent to an operating time of $\exp(\alpha_1 s)t$ under the nominal use conditions. Therefore, the degradation process under factor level $s$ can be described as
$$Y(t; s) = Y_0 + b\exp(\alpha_1 s)t + \epsilon. \qquad (3)$$

4 Likelihood Inference

The initial degradation measurement is $X = Y_0 + \epsilon$, and thus $X \sim N(\beta, \sigma_0^2 + \sigma_\epsilon^2)$. In order to derive the likelihood based on $D$, we first look at the joint distribution of $[Y_0, b, X]$, which is multivariate normal with mean vector $[\beta, \alpha_0, \beta]^T$ and covariance matrix
$$\begin{bmatrix} \sigma_0^2 & \sigma_{01} & \sigma_0^2 \\ \sigma_{01} & \sigma_1^2 & \sigma_{01} \\ \sigma_0^2 & \sigma_{01} & \sigma_0^2 + \sigma_\epsilon^2 \end{bmatrix}.$$
Given $X = x$, the conditional joint distribution of $[Y_0, b]$ is normal with mean $\beta(x) = \beta_0 + \frac{x - \beta}{\sigma_0^2 + \sigma_\epsilon^2}\,[\sigma_0^2, \sigma_{01}]^T$ and covariance matrix $U$ given by
$$U = \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{01} & \sigma_1^2 \end{bmatrix} - \frac{1}{\sigma_0^2 + \sigma_\epsilon^2}\begin{bmatrix} \sigma_0^4 & \sigma_{01}\sigma_0^2 \\ \sigma_{01}\sigma_0^2 & \sigma_{01}^2 \end{bmatrix} = \frac{\sigma_\epsilon^2}{\sigma_0^2 + \sigma_\epsilon^2}\begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{01} & (\sigma_0^2\sigma_1^2 - \sigma_{01}^2)/\sigma_\epsilon^2 + \sigma_1^2 \end{bmatrix}. \qquad (4)$$
This means that the variance of $[(Y_0, b) \mid X]$ does not depend on the realization of $X$, while a higher realization of $X$ leads to a higher mean for $[(Y_0, b) \mid X]$, provided that $\sigma_{01}$ is positive.
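The conditional mean $\beta(x)$ and covariance $U$ in (4) are straightforward to compute; the following sketch (our own transcription of (4), with illustrative inputs) does so with NumPy.

```python
import numpy as np

def conditional_Y0_b(x, beta, alpha0, s0sq, s1sq, seps, s01):
    """Mean beta(x) and covariance U of [Y0, b] given the initial
    measurement X = x, transcribed from (4); seps = sigma_eps^2."""
    v = s0sq + seps                            # var(X)
    mean = np.array([beta, alpha0]) + (x - beta) / v * np.array([s0sq, s01])
    Sigma0 = np.array([[s0sq, s01], [s01, s1sq]])
    c = np.array([[s0sq], [s01]])              # cov([Y0, b], X)
    U = Sigma0 - (c @ c.T) / v                 # conditional covariance (4)
    return mean, U

m, U = conditional_Y0_b(x=12.0, beta=11.22, alpha0=1.14,
                        s0sq=0.2025, s1sq=0.0049, seps=6.76, s01=-0.0258)
print(m)
print(U)
```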

For unit $i$, the realization of the $i$th order statistic $X_{(i)}$ is $x_i$. The following proposition shows that, conditional on $X_{(i)} = x_i$, the joint distribution of $[Y_0, b]$ is the same as if it were conditional on $X = x_i$. The proof is given in the appendix.

Proposition 1. Assume $(W, X)$ follows a $d$-dimensional multivariate normal distribution, where $X$ is one-dimensional. Let $(W_{(i)}, X_{(i)})$ be the observation from $n$ i.i.d. realizations of $(W, X)$ for which $X_{(i)}$ is the $i$th smallest among the $n$ realizations of $X$. Then $f_{W_{(i)} \mid X_{(i)}}(w \mid x) = f_{W \mid X}(w \mid x)$.

From the proposition, we see that the distribution of $Y_i$ conditional on $X_{(i)} = x_i$ is multivariate normal with mean vector $\mu_i = Z_i\tilde Z_i\,\beta(x_i)$ and covariance matrix $\Sigma_i = Z_i\tilde Z_i U \tilde Z_i Z_i^T + \sigma_\epsilon^2 I_{m_i}$, where $I_{m_i}$ is the $m_i \times m_i$ identity matrix, and
$$Z_i = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ t_{i1} & t_{i2} & \cdots & t_{im_i} \end{bmatrix}^T, \qquad \tilde Z_i = \begin{bmatrix} 1 & 0 \\ 0 & e^{\alpha_1 s(i)} \end{bmatrix}. \qquad (5)$$

Let $\theta = (\beta, \alpha_0, \alpha_1, \sigma_0^2, \sigma_1^2, \sigma_\epsilon^2, \sigma_{01})^T$ be the model parameters to estimate. The likelihood contribution from unit $i$ is the joint density of $[X_{(i)}, Y_i]$, which can be factorized as $f_{Y_i \mid X_{(i)}}(y_i \mid x_i)\,f_{X_{(i)}}(x_i)$. The first part is multivariate normal, as argued above. The second part is the probability density of a normal order statistic, whose expression is complicated. Nevertheless, when we consider the joint likelihood from all $n$ units, the joint density of $(X_{(1)}, \ldots, X_{(n)})$ is the same as that of $n$ independent realizations of $X$, up to the multiplicative constant $n!$. Therefore, the overall log-likelihood, up to a constant, can be neatly expressed as
$$\ell(\theta \mid D) = -\frac{1}{2}\sum_{i=1}^n \left[\ln(\sigma_0^2 + \sigma_\epsilon^2) + \frac{(x_i - \beta)^2}{\sigma_0^2 + \sigma_\epsilon^2}\right] - \frac{1}{2}\sum_{i=1}^n \ln\det(\Sigma_i) - \frac{1}{2}\sum_{i=1}^n (y_i - \mu_i)^T \Sigma_i^{-1} (y_i - \mu_i). \qquad (6)$$
Maximum likelihood (ML) estimates of $\theta$ can be obtained by numerically maximizing this log-likelihood function; a sketch of one unit's contribution is given below.
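A direct transcription of one unit's contribution to (6) might look as follows. This is a sketch only (our names, no numerical hardening), and it reuses conditional_Y0_b from the previous sketch.

```python
import numpy as np

def loglik_unit(theta, x, s, t, y):
    """One unit's contribution to the log-likelihood (6), up to constants.
    theta = (beta, alpha0, alpha1, s0sq, s1sq, seps, s01); x is the initial
    measurement, s the normalized stress, t the measurement-time array,
    y the degradation readings."""
    beta, alpha0, alpha1, s0sq, s1sq, seps, s01 = theta
    v = s0sq + seps
    mean, U = conditional_Y0_b(x, beta, alpha0, s0sq, s1sq, seps, s01)
    Z = np.column_stack([np.ones_like(t), t])      # m x 2 design matrix
    Ztil = np.diag([1.0, np.exp(alpha1 * s)])      # acceleration factor, (5)
    A = Z @ Ztil
    mu = A @ mean                                  # conditional mean of Y_i
    Sigma = A @ U @ A.T + seps * np.eye(len(t))    # conditional covariance
    r = y - mu
    ll = -0.5 * (np.log(v) + (x - beta) ** 2 / v)  # initial-measurement part
    ll -= 0.5 * (np.linalg.slogdet(Sigma)[1] + r @ np.linalg.solve(Sigma, r))
    return ll
```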

The Fisher information from all measurements on the $n$ units consists of two parts, $I_\theta = \tilde I + \bar I$. The first part $\tilde I$ is contributed by the initial degradation measurements $X_{(i)}$, $i = 1, 2, \ldots, n$; its only nonzero entries are
$$\tilde I_{11} = \frac{n}{\sigma_0^2 + \sigma_\epsilon^2}, \qquad \tilde I_{44} = \tilde I_{46} = \tilde I_{64} = \tilde I_{66} = \frac{n}{2(\sigma_0^2 + \sigma_\epsilon^2)^2}. \qquad (7)$$
Derivation of the second part $\bar I$ of the Fisher information matrix requires some effort. By making use of Equation (16) in Klein et al. (2000), the $(k, l)$th element of $\bar I$ can be expressed as
$$\bar I_{kl} = \sum_{i=1}^n E\left[\dot\mu_{ik}^T\Sigma_i^{-1}\dot\mu_{il}\right] + \frac{1}{2}\sum_{i=1}^n \mathrm{trace}\left(\Sigma_i^{-1}\dot\Sigma_{ik}\Sigma_i^{-1}\dot\Sigma_{il}\right), \qquad (8)$$
where $\dot\mu_{ik} = \partial\mu_i/\partial\theta_k$ is the partial derivative of $\mu_i$ with respect to the $k$th element of $\theta$, $\dot\Sigma_{ik} = \partial\Sigma_i/\partial\theta_k$, and the expectation is taken with respect to $X_{(1)}, \ldots, X_{(n)}$. The partial derivative $\dot\Sigma_{ik}$ can be specified as $\dot\Sigma_{ik} = Z_i\tilde Z_i\dot U_k\tilde Z_i Z_i^T + Z_i\dot{\tilde Z}_{ik} U \tilde Z_i Z_i^T + Z_i\tilde Z_i U \dot{\tilde Z}_{ik} Z_i^T$ for $k \neq 6$, and $\dot\Sigma_{i6} = Z_i\tilde Z_i\dot U_6\tilde Z_i Z_i^T + I_{m_i}$. Here, $\dot U_k$ and $\dot{\tilde Z}_{ik}$ are the partial derivatives of $U$ in (4) and $\tilde Z_i$ in (5) with respect to the $k$th element of $\theta$, respectively. Note that $\dot{\tilde Z}_{ik} = 0_{2\times 2}$ except for $k = 3$. On the other hand, we can decompose $\dot\mu_{ik}$ as $\dot\mu_{ik} = \dot\mu_{ik:1} + \dot\mu_{ik:2}\,z_i$, where $z_i = (X_{(i)} - \beta)/\sqrt{\sigma_0^2 + \sigma_\epsilon^2}$ are standard-normal order statistics, and $\dot\mu_{ik:1}$ and $\dot\mu_{ik:2}$ are functions of $\theta$ that do not depend on $X_{(i)}$ or $z_i$. Based on the above analysis, $\bar I_{kl}$ can be evaluated as
$$\bar I_{kl} = \frac{1}{2}\sum_{i=1}^n \mathrm{trace}\left(\Sigma_i^{-1}\dot\Sigma_{ik}\Sigma_i^{-1}\dot\Sigma_{il}\right) + \sum_{i=1}^n\left[\dot\mu_{ik:1}^T\Sigma_i^{-1}\dot\mu_{il:1} + E[z_i]\left(\dot\mu_{ik:1}^T\Sigma_i^{-1}\dot\mu_{il:2} + \dot\mu_{ik:2}^T\Sigma_i^{-1}\dot\mu_{il:1}\right) + E[z_i^2]\,\dot\mu_{ik:2}^T\Sigma_i^{-1}\dot\mu_{il:2}\right]. \qquad (9)$$
Detailed expressions for $\dot\mu_{ik:1}$, $\dot\mu_{ik:2}$ and $\dot U_k$ are given in the appendix. The quantities $E[z_i]$ and $E[z_i^2]$ are the first two moments of a standard-normal order statistic. They can be obtained numerically using the routines proposed by Royston (1982) and Shea and Scallon (1988). Alternatively, we can generate and sort $n$ standard normal random variables, repeat the process a large number of times, and average over the repeats to approximate $E[z_i]$ and $E[z_i^2]$; a sketch is given below.
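The simulation-based approximation of the order-statistic moments just described takes only a few lines; the sketch below is one possible implementation.

```python
import numpy as np

def order_stat_moments(n, reps=10_000, seed=0):
    """Monte Carlo approximation of E[z_(i)] and E[z_(i)^2] for the order
    statistics of n i.i.d. standard normals, as described in the text."""
    rng = np.random.default_rng(seed)
    z = np.sort(rng.standard_normal((reps, n)), axis=1)
    return z.mean(axis=0), (z ** 2).mean(axis=0)

m1, m2 = order_stat_moments(n=50)
print(m1[:3])   # means of the three smallest order statistics
print(m2[:3])   # second moments of the three smallest order statistics
```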

Remark: The above inference method is developed for ADTs with strategic allocation of the test units, where the allocation is based on the ranking of the initial degradation measurements. Nevertheless, the results can be readily modified to analyze degradation data from a conventional ADT with random allocation. Denote the data from a conventional ADT as $D_r = \{(X_i, s(i), t_i, Y_i);\ i = 1, 2, \ldots, n\}$. The difference between the degradation data from the two types of ADTs lies in the distribution of the initial degradation measurement for the $i$th unit: it is normally distributed for an ADT with random allocation, while it is the $i$th order statistic among $n$ normally distributed random variables for an ADT with strategic allocation. The log-likelihood function based on $D_r$ is the same as (6), which can be shown by a conditional argument similar to that for ADTs with strategic allocation. However, the Fisher information matrix is slightly different in that the $z_i$ for ADTs with random allocation are i.i.d. standard normal random variables, so the two moments $E[z_i]$ and $E[z_i^2]$ in (9) equal 0 and 1, respectively. Based on this Fisher information matrix, optimum ADT plans for the cumulative-exposure model under random allocation can be obtained and compared with the optimum plan under strategic allocation, as shown in the next section.

5 Optimum ADT Planning

After the ML estimate $\hat\theta_n$ is obtained, the ML estimate of the $p$ lifetime quantile $t_p$ under nominal use conditions can be obtained by the plug-in method. By the delta method (e.g., Meeker and Escobar 1998, Section B.2), the asymptotic variance is $\mathrm{AVar}(\hat t_p) = \nabla t_p^T\, I_\theta^{-1}\, \nabla t_p$, where $\nabla t_p$ is the gradient of $t_p$ with respect to $\theta$; a code sketch of this computation is given below. A common objective of an ADT design is to minimize $\mathrm{AVar}(\hat t_p)$.

There are many possible planning variables for an ADT design, e.g., the measurement frequency, the test duration, the total number of test units, the stress levels, and the number of units allocated to each level. In our study, we fix the number of test units, as well as the measurement times and the test duration for all units. This is reasonable, as measurements are often taken on a daily or weekly basis and there is often a constraint on the maximum allowable test duration. Therefore, our objective is to minimize $\mathrm{AVar}(\hat t_p)$ by carefully choosing the factor levels and the number of units allocated to each level. Because the optimum test plan is expected to spread the units out to the boundary of the experimental region (Weaver and Meeker 2014), we adopt the convention that sets the highest factor level in the ADT to $s_H = 1$.

Take the degradation of the inkjet printhead as an example. ML estimates of the degradation parameters under use conditions are given in Weaver et al. (2013). They are treated as the planning values of the model parameters, denoted as
$$\beta^* = 11.22,\ \alpha_0^* = 1.14,\ \alpha_1^* = 2.5,\ \sigma_0^{2*} = 0.2025,\ \sigma_1^{2*} = 0.0049,\ \sigma_\epsilon^{2*} = 6.76,\ \sigma_{01}^* = -0.0258,$$
where the sign of $\sigma_{01}^*$ reflects the correlation of -0.82 reported in Section 1. Furthermore, we assume that the degradation can be accelerated by an accelerating variable, say, temperature; the planning value of the coefficient $\alpha_1^*$ associated with the normalized stress is assumed to be 2.5. We let the degradation threshold be $D = 600$. The goal is to develop a test plan that minimizes the asymptotic variance of the estimated 0.1 quantile $\hat t_{0.1}$. The total number of units available for the test is set to $n = 50$, and each unit is measured every two units of time until $t = 20$.
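In code, the delta-method computation is essentially a one-liner once $\nabla t_p$ and $I_\theta$ are available; the sketch below assumes both are supplied by the caller (the toy inputs are purely hypothetical).

```python
import numpy as np

def avar_quantile(grad_tp, info):
    """Delta-method variance AVar(t_p_hat) = grad' * I(theta)^{-1} * grad,
    where grad_tp is the gradient of t_p in theta and info is the total
    Fisher information matrix I_theta."""
    return float(grad_tp @ np.linalg.solve(info, grad_tp))

# toy usage with a hypothetical 2-parameter problem
print(avar_quantile(np.array([1.0, -0.5]),
                    np.array([[4.0, 1.0], [1.0, 3.0]])))
```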

5.1 Two-point plan

First, we consider a two-point plan with two factor levels. The allocation strategy proposed in Section 2 is used for the ADT planning. Because there is a strong negative correlation between the initial degradation and the degradation rate, we allocate the first $n_L$ units, those with the smallest initial degradation measurements, to the low stress level $s_L$ and the remaining $n - n_L$ units to $s_H = 1$. Therefore, $s(i) = s_L$ for $i \le n_L$ and $s(i) = s_H$ otherwise. The asymptotic variance $\mathrm{AVar}(\hat t_{0.1})$ under each combination of $(n_L, s_L)$ can be obtained from the Fisher information matrix $I_\theta$ derived in Section 4 together with the delta method. The optimum ADT plan chooses $(n_L, s_L)$ to minimize $\mathrm{AVar}(\hat t_{0.1})$. There are only two planning variables and $n_L$ is integer-valued, so a grid search is used to find the optimum design point (see the sketch below); the grid spacing for $s_L$ is set to 0.001. We highlight that because the test units are ranked by their initial degradation measurements, the sample size $n$ must be fixed before the planning is done. This is different from conventional ADT plans with random allocation, where the proportion of units tested at $s_L$ can be used instead of $n_L$ (e.g., Shi et al. 2009; Tseng and Lee 2015).

A contour plot of the asymptotic variance versus $(n_L, s_L)$ is provided in Figure 3(a). The optimum ADT plan is $n_L = 38$ and $s_L = 0.237$, with a corresponding $\mathrm{AVar}(\hat t_{0.1})$ of 77.94. To demonstrate the effectiveness of the proposed allocation strategy, we also show the results of conventional ADT planning with random allocation, in which $n_L$ randomly chosen units are allocated to factor level $s_L$ and the remainder are tested at $s_H = 1$. The Fisher information matrix can be readily derived from the results for strategic allocation, as discussed in the remark above. The contour plot of the asymptotic variance versus $(n_L, s_L)$ is provided in Figure 3(b). The optimum plan is $(n_L, s_L) = (35, 0.222)$ with a minimum asymptotic variance of 88.73. The strategic allocation significantly reduces the asymptotic variance by 12.94%.

Simulation is used to evaluate the adequacy of the large-sample approximation to the variance of $\hat t_{0.1}$. We first use the strategic allocation that assigns the 12 units with the largest initial degradation measurements to $s_H = 1$ and the remainder to $s_L$. Different values of $s_L$ ranging from 0.05 to 0.6 are evaluated in the simulation. The planning values and test settings of the two-point design above are used to generate the degradation data, and the simulation is repeated 10,000 times for each value of $s_L$. The variance of $\hat t_{0.1}$ estimated from the 10,000 replications is shown in Figure 4(a). As can be seen from the figure, the variance obtained from the simulation is very close to the asymptotic variance, indicating the appropriateness of the large-sample approximation. We then validate the large-sample approximation for the ADT with random allocation, in which 15 units are randomly allocated to $s_H = 1$ and the remainder to $s_L$. Different values of $s_L$ are chosen for the simulation.
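The grid search over $(n_L, s_L)$ can be sketched as follows; here avar_fn is a hypothetical callback wrapping the Fisher-information and delta-method computations of Section 4. With the 0.001 spacing this costs $49 \times 999$ evaluations, so in practice one may coarsen the grid first and refine around the minimum.

```python
import numpy as np

def optimize_two_point_plan(avar_fn, n_total=50,
                            s_grid=np.arange(0.001, 1.0, 0.001)):
    """Exhaustive grid search over (n_L, s_L) for a two-point plan with
    s_H fixed at 1. avar_fn(n_low, s_low) returns AVar(t_0.1_hat) for
    the candidate plan."""
    best_nL, best_sL, best_avar = None, None, np.inf
    for n_low in range(1, n_total):        # at least one unit per level
        for s_low in s_grid:
            a = avar_fn(n_low, s_low)
            if a < best_avar:
                best_nL, best_sL, best_avar = n_low, s_low, a
    return best_nL, best_sL, best_avar
```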

Figure 3: Contour plot of the large-sample variance of $\hat t_{0.1}$ using the cumulative-exposure model (3) under (a) strategic allocation and (b) random allocation. The diamond in each plot denotes the optimum plan.

Figure 4: The exact variances of $\hat t_{0.1}$ obtained from simulation versus the asymptotic variances under (a) strategic allocation with $n_L = 38$ and (b) random allocation with $n_L = 35$.

The variances of $\hat t_{0.1}$ estimated from 10,000 Monte Carlo replications are shown in Figure 4(b), which again supports the accuracy of the large-sample approximation.

5.2 Compromise design

As argued in the ADT literature, a two-point test plan may not be robust to deviations from model assumptions. In our ADT design with strategic allocation, a key assumption is the correlation between the initial degradation measurements and the degradation rates. A compromise test plan can therefore be formed by first randomly allocating $n_0$ units to the nominal use conditions $s_0$ and then applying the strategic allocation to the remaining $n - n_0$ units. The decision variables are again $n_L$ and $s_L$, and their optimum values are the same as those in a two-point ADT with $n - n_0$ units. We call this ADT a type I compromise test. In the printhead example above, if we randomly allocate $n_0 = 10$ units to $s_0$ and apply the strategic allocation rule to the remaining 40 units as above, the optimum compromise test plan is $(n_L, s_L) = (28, 0.27)$ with asymptotic variance $\mathrm{AVar}(\hat t_{0.1}) = 82.17$. If the random allocation rule is used instead, the optimum compromise test is $(n_L, s_L) = (26, 0.25)$ with asymptotic variance $\mathrm{AVar}(\hat t_{0.1}) = 92.43$, 12.5% higher than that with strategic allocation.

When the degradation under $s_0$ is too slow, the degradation pattern may not be well revealed within a short test duration. An alternative compromise test plan is to add an additional point $s_M$ midway between $s_L$ and $s_H$ and randomly allocate a fixed number $n_M$ of test units to $s_M$. The correlation between the initial degradation measurements and the degradation rate can then be verified using the degradation data from $s_M$. We call this method a type II compromise test. Using the printhead example again with $n_M = 10$ and $s_M = (s_L + s_H)/2$, the optimum setting of this compromise test with strategic allocation is $(n_L, s_L) = (30, 0.18)$ with asymptotic variance $\mathrm{AVar}(\hat t_{0.1}) = 83.87$. It is higher than that of the first compromise ADT above, as the printhead degradation under $s_0$ is relatively fast.

6 Strategic Allocation for More Degradation Models

The merits of the allocation strategy proposed in Section 2 have been demonstrated using the cumulative-exposure ADT model (3). The strategy is also advantageous when applied to other ADT models. For demonstration, this section applies it to two variants of the ADT model in Section 3. The first variant introduces an onset parameter to simplify the cumulative-exposure model (3); the second uses a different method to link the accelerating variable to the degradation model (1). For both variants, the inference procedure is similar to that in Section 4. It is shown that strategic allocation again leads to a better ADT plan than the traditional counterpart with random allocation.

6.1 A simplified cumulative-exposure model with unknown onset

The cumulative-exposure ADT model (3) involves as many as seven parameters. The large number of parameters causes difficulties in parameter estimation and results in a $7 \times 7$ information matrix, which requires some computational effort to invert. This section introduces an unknown onset to replace the initial degradation $Y_0$ in (3). Many factors contribute to the random initial degradation; one is a period of usage before the ADT, due to tests during production or burn-in screening. For instance, engineers of the infra-red sensors believe that this unknown period of testing is the root cause of the positive correlation between the random initial values and the degradation rate. If this factor is dominant over the others, it is reasonable to introduce an unknown onset time $\tau$, so that the initial degradation level $Y_0$ can be replaced by $\beta + b\tau$, where $\beta$ is the common value of the degradation characteristic when the units are manufactured.

Although the test conditions during the unknown production period prior to the ADT might vary and differ from the nominal use conditions, the cumulative-exposure principle ensures an equivalent operating time for this unknown period under unknown test conditions. The parameter $\tau$ denotes this unknown equivalent operating time. From the viewpoint of data analysis, we may allow $\tau$ to be negative so that a negative correlation between the initial degradation and the rate can be captured. Usually, the common degradation level $\beta$ of the product at the very beginning is zero (as for the infra-red sensor degradation) or is known; in these cases, the degradation data can simply be adjusted to exclude this parameter. Therefore, we may consider an unknown-onset model whose degradation measurements under factor level $s$ are
$$Y(t; s) = b\tau + b\exp(\alpha_1 s)t + \epsilon, \qquad (10)$$
where $b \sim N(\alpha_0, \sigma_1^2)$ and $\epsilon \sim N(0, \sigma_\epsilon^2)$. The parameter vector is $\theta = (\tau, \alpha_0, \alpha_1, \sigma_1^2, \sigma_\epsilon^2)$, which has only five elements. Given a degradation-failure threshold $D$, the lifetime CDF $F_T(t)$ and the $p$ lifetime quantile $t_p$ under the nominal use conditions are given by
$$F_T(t) = 1 - \Phi\left(\frac{D - (\tau + t)\alpha_0}{(\tau + t)\sigma_1}\right); \qquad t_p = \frac{D}{z_{1-p}\sigma_1 + \alpha_0} - \tau. \qquad (11)$$
The quantile is much simpler than that for the cumulative-exposure model in Section 3; a numerical sketch of (11) is given below.

Given the model (10), the initial measurement $X \sim N(\alpha_0\tau, \tau^2\sigma_1^2 + \sigma_\epsilon^2)$. Similarly to Section 4, we can show that $[b \mid X = x]$ is normal with mean and variance
$$E[b \mid X = x] = \alpha_0 + \frac{\tau\sigma_1^2}{\tau^2\sigma_1^2 + \sigma_\epsilon^2}(x - \alpha_0\tau), \qquad V \equiv \mathrm{var}(b \mid X = x) = \frac{\sigma_1^2\sigma_\epsilon^2}{\tau^2\sigma_1^2 + \sigma_\epsilon^2}.$$
Suppose that the strategic allocation proposed in Section 2 is used for the test and the test settings are the same as those in Section 2. The resulting data are $D = \{(X_{(i)}, s(i), t_i, Y_i);\ i = 1, 2, \ldots, n\}$. If model (10) is used for $D$, the log-likelihood for $\theta$ can be expressed as
$$\ell(\theta \mid D) = -\frac{1}{2}\sum_{i=1}^n \left[\ln(\tau^2\sigma_1^2 + \sigma_\epsilon^2) + \frac{(x_i - \alpha_0\tau)^2}{\tau^2\sigma_1^2 + \sigma_\epsilon^2}\right] - \frac{1}{2}\sum_{i=1}^n \ln\det(\Sigma_i) - \frac{1}{2}\sum_{i=1}^n (y_i - \mu_i)^T\Sigma_i^{-1}(y_i - \mu_i), \qquad (12)$$
where $\mu_i = E[b \mid X = x_i]\,Z_i\tilde Z_i$, $\Sigma_i = V Z_i\tilde Z_i\tilde Z_i^T Z_i^T + \sigma_\epsilon^2 I_{m_i}$, $\tilde Z_i = [\tau, e^{\alpha_1 s(i)}]^T$, and $Z_i$ is defined in (5).
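The quantile formula in (11) is simple enough to check numerically. The following sketch (ours) evaluates $t_{0.1}$ at the infra-red sensor planning values quoted later in this subsection.

```python
import numpy as np
from scipy.stats import norm

def quantile_onset(p, D, tau, alpha0, s1sq):
    """p lifetime quantile under the unknown-onset model, from (11):
    t_p = D / (z_{1-p} * sigma_1 + alpha_0) - tau."""
    return D / (norm.ppf(1.0 - p) * np.sqrt(s1sq) + alpha0) - tau

# sensor planning values: tau = 1.86, alpha0 = 1.01, sigma_1^2 = 0.0103
print(quantile_onset(p=0.1, D=50.0, tau=1.86, alpha0=1.01, s1sq=0.0103))
```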

The Fisher information matrix is $I_\theta = \tilde I + \bar I$, where $\tilde I$ is contributed by the degradation measurements at $t = 0$ and $\bar I$ by the $Y_i$, $i = 1, 2, \ldots, n$. Let $\sigma_{un}^2 = \tau^2\sigma_1^2 + \sigma_\epsilon^2$. Using Equation (16) in Klein et al. (2000), $\tilde I$ can be obtained as
$$\tilde I = \frac{n}{\sigma_{un}^4}\begin{bmatrix} \alpha_0^2\sigma_{un}^2 + 2\tau^2\sigma_1^4 & \alpha_0\tau\sigma_{un}^2 & 0 & \tau^3\sigma_1^2 & \tau\sigma_1^2 \\ \alpha_0\tau\sigma_{un}^2 & \tau^2\sigma_{un}^2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ \tau^3\sigma_1^2 & 0 & 0 & \tau^4/2 & \tau^2/2 \\ \tau\sigma_1^2 & 0 & 0 & \tau^2/2 & 1/2 \end{bmatrix}.$$
On the other hand, the $(k, l)$th element of the second part $\bar I$ can be derived as
$$\bar I_{kl} = \frac{1}{2}\sum_{i=1}^n \mathrm{trace}\left(\Sigma_i^{-1}\dot\Sigma_{ik}\Sigma_i^{-1}\dot\Sigma_{il}\right) + \sum_{i=1}^n\left[\dot\mu_{ik:1}^T\Sigma_i^{-1}\dot\mu_{il:1} + E[z_i]\left(\dot\mu_{ik:1}^T\Sigma_i^{-1}\dot\mu_{il:2} + \dot\mu_{ik:2}^T\Sigma_i^{-1}\dot\mu_{il:1}\right) + E[z_i^2]\,\dot\mu_{ik:2}^T\Sigma_i^{-1}\dot\mu_{il:2}\right],$$
where $\dot\mu_{ik:1} + \dot\mu_{ik:2}z_i$ is the partial derivative of $\mu_i$, $\dot\Sigma_{ik} = \dot V_k Z_i\tilde Z_i\tilde Z_i^T Z_i^T + V Z_i\dot{\tilde Z}_{ik}\tilde Z_i^T Z_i^T + V Z_i\tilde Z_i\dot{\tilde Z}_{ik}^T Z_i^T$ for $k = 1, 2, 3, 4$, and $\dot\Sigma_{i5} = \dot V_5 Z_i\tilde Z_i\tilde Z_i^T Z_i^T + V Z_i\dot{\tilde Z}_{i5}\tilde Z_i^T Z_i^T + V Z_i\tilde Z_i\dot{\tilde Z}_{i5}^T Z_i^T + I_{m_i}$. Here, $z_i$ is again a standard-normal order statistic, and $\dot V_k$ and $\dot{\tilde Z}_{ik}$ are the respective partial derivatives of $V$ and $\tilde Z_i$ with respect to the $k$th element of $\theta$. Detailed expressions for $\dot{\tilde Z}_{ik}$, $\dot\mu_{ik:1}$, $\dot\mu_{ik:2}$ and $\dot V_k$ are given in the appendix. The asymptotic variance $\mathrm{AVar}(\hat t_p)$ can be obtained from the information matrix $I_\theta$ together with the delta method; the partial derivatives of $t_p$ are easily obtained from the simple explicit form (11).

Our analysis suggests that the unknown-onset model (10) provides a reasonably good fit to the degradation data of the infra-red sensors. We therefore demonstrate ADT planning with strategic allocation for the sensors using model (10). ML estimates from the sensor degradation data are used as the planning values of the model parameters: $\tau^* = 1.86$, $\alpha_0^* = 1.01$, $\alpha_1^* = 1.32$, $\sigma_1^{2*} = 0.0103$ and $\sigma_\epsilon^{2*} = 2.88$. The degradation-failure threshold and the number of test units are set to $D = 50$ and $n = 70$. Measurement times are assumed the same for all units: from time $t = 0$, degradation measurements of all units are taken every 0.3 units of time until $t = 3$.

We first consider a two-point design for the ADT with strategic allocation. Because the correlation between the initial degradation and the degradation rate is positive, the $n_L$ units with the highest initial degradation measurements are allocated to the low factor level $s_L$ and the remainder to $s_H$. That is, $s(i) = s_H$ for $i = 1, \ldots, n - n_L$ and $s(i) = s_L$ otherwise. The contour plot of the asymptotic variance $\mathrm{AVar}(\hat t_{0.1})$ versus $(n_L, s_L)$ is given in Figure 5(a).

Figure 5: Contour plot of the large-sample variance of $\hat t_{0.1}$ under (a) strategic allocation and (b) random allocation. The diamond in each plot denotes the optimum plan.

The optimum plan and the corresponding asymptotic variance are given in Table 1. For comparison, results for the traditional ADT with random allocation are also given in Figure 5(b) and Table 1. The results in the table again show that the proposed allocation strategy leads to a better plan with a smaller asymptotic variance.

Simulation is used to evaluate the appropriateness of the large-sample approximation. For both allocation strategies, we fix $n_L = 61$ and evaluate the exact variance $\mathrm{var}(\hat t_{0.1})$ under different values of $s_L$. The exact variance under each combination of $(n_L, s_L)$ is estimated using 10,000 Monte Carlo replications. The results are given in Figure 6. The exact variances are slightly higher than the asymptotic variances, but the differences are small, which supports the validity of the large-sample approximation used in the design.

As in Section 5.2, compromise tests are considered for the infra-red sensors. We first consider the type I compromise test, where $n_0 = 15$ units are randomly selected and allocated to $s_0$. The optimum setting and the resulting asymptotic variance of $\hat t_{0.1}$ are reported in Table 1. The type II compromise test is then evaluated, where $n_M = 15$ units are randomly selected and allocated to $s_M = (s_L + s_H)/2$. The optimum results are also included in Table 1. From the table, the type I compromise test with strategic allocation has only a slightly larger asymptotic variance than the two-point optimum plan. This compromise test is therefore recommended, as it enables us to check the correlation and lets us observe degradation under nominal use conditions.

Figure 6: The exact variances of $\hat t_{0.1}$ obtained from simulation versus the asymptotic variances when $n_L = 61$, for both ADTs with random allocation and ADTs with strategic allocation.

Table 1: Optimum test plans and the associated asymptotic variances for the two-point design and the type I and type II compromise tests.

                  Strategic allocation            Random allocation
  Plan            (n_L, s_L)    AVar(t_0.1)       (n_L, s_L)    AVar(t_0.1)
  Two-point       (61, 0.208)   7.418             (61, 0.205)   7.619
  Compromise I    (47, 0.226)   7.542             (47, 0.222)   7.730
  Compromise II   (47, 0.117)   7.893             (47, 0.116)   8.076

6.2 Drift-acceleration ADT model

Consider the degradation model (1). Section 3 used the cumulative-exposure principle to link the accelerating variable $s$ to the degradation model. Another popular method is to let some parameter(s) in (1) be functions of $s$. One such acceleration model assumes that the mean of the degradation rate $b$ is an increasing function of $s$; see, e.g., Yuan and Pandey (2009) and Weaver and Meeker (2014). Let $b_s$ be the degradation rate under factor level $s$. The covariance matrix of $[Y_0, b_s]$ is the same as that of $[Y_0, b]$, while the mean vector changes to $[\beta, \alpha_0 e^{\alpha_1 s}]^T$. Because the stress affects only the mean of the degradation rate, we call this the drift-acceleration ADT model. This model is simpler than the cumulative-exposure model in that (i) the covariance matrix of $Y_i$ does not depend on $s$, and (ii) the information matrix has more zero entries. However, the initial degradation $Y_0$ in this model cannot be simplified using an unknown onset, because it is difficult to justify the onset time $\tau$. In addition, the drift-acceleration ADT model may not be appropriate when the degradation is sensitive to the stress: the variance of the degradation measurement does not depend on $s$, whereas we would typically expect degradation under normal use conditions to be almost constant with near-zero variance and degradation under a high factor level to exhibit larger variation.

Let the allocation strategy and the test settings in Section 2 carry over. Using the same inference procedure as in Section 4, we can derive the log-likelihood function for the model parameters $\theta = [\beta, \alpha_0, \alpha_1, \sigma_0^2, \sigma_1^2, \sigma_\epsilon^2, \sigma_{01}]^T$ as
$$\ell(\theta \mid D) = -\frac{1}{2}\sum_{i=1}^n \left[\ln(\sigma_0^2 + \sigma_\epsilon^2) + \frac{(x_i - \beta)^2}{\sigma_0^2 + \sigma_\epsilon^2}\right] - \frac{1}{2}\sum_{i=1}^n \ln\det(\Sigma_i) - \frac{1}{2}\sum_{i=1}^n (y_i - \mu_i)^T\Sigma_i^{-1}(y_i - \mu_i), \qquad (13)$$
where $\mu_i = Z_i\left\{[\beta, \alpha_0 e^{\alpha_1 s(i)}]^T + \frac{x_i - \beta}{\sigma_0^2 + \sigma_\epsilon^2}\,[\sigma_0^2, \sigma_{01}]^T\right\}$ and $\Sigma_i = Z_i U Z_i^T + \sigma_\epsilon^2 I_{m_i}$; a code sketch of these moments is given below. The Fisher information $I_\theta$ consists of $\tilde I$ from the initial measurements at $t = 0$ and $\bar I$ from the $Y_i$. The first part $\tilde I$ is the same as (7). The $(k, l)$th element of $\bar I$ can be derived using (8); the two partial derivatives $\dot\mu_{ik}$ and $\dot\Sigma_{ik}$ are needed to evaluate $\bar I_{kl}$. The partial derivative $\dot\Sigma_{ik}$ can be computed as $\dot\Sigma_{ik} = Z_i\dot U_k Z_i^T$ for $k \neq 6$ and $\dot\Sigma_{i6} = Z_i\dot U_6 Z_i^T + I_{m_i}$. Letting $z_i$ be a standard-normal order statistic as above, $\dot\mu_{ik}$ can be decomposed as $\dot\mu_{ik:1} + \dot\mu_{ik:2}z_i$; detailed expressions for $\dot\mu_{ik:1}$ and $\dot\mu_{ik:2}$ are given in the appendix. For ADTs with random allocation (Weaver and Meeker 2014), the log-likelihood is the same as (13), while the information matrix is obtained by treating $z_i$ as a standard normal random variable; see the remark in Section 4.
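The conditional moments $\mu_i$ and $\Sigma_i$ under the drift-acceleration model can be assembled as below. This is a sketch with illustrative inputs (our names), mirroring the structure used for the cumulative-exposure model.

```python
import numpy as np

def drift_accel_moments(x, s, t, theta):
    """Conditional mean mu_i and covariance Sigma_i of the readings under
    the drift-acceleration model of Section 6.2: the stress scales only
    the mean degradation rate, so Sigma_i does not depend on s."""
    beta, alpha0, alpha1, s0sq, s1sq, seps, s01 = theta
    v = s0sq + seps
    Z = np.column_stack([np.ones_like(t), t])
    # conditional mean of [Y0, b_s] given X = x
    mean2 = (np.array([beta, alpha0 * np.exp(alpha1 * s)])
             + (x - beta) / v * np.array([s0sq, s01]))
    Sigma0 = np.array([[s0sq, s01], [s01, s1sq]])
    c = np.array([[s0sq], [s01]])
    U = Sigma0 - (c @ c.T) / v                   # same U as in (4)
    mu = Z @ mean2
    Sigma = Z @ U @ Z.T + seps * np.eye(len(t))
    return mu, Sigma

mu, Sig = drift_accel_moments(x=12.0, s=0.8, t=np.arange(2.0, 21.0, 2.0),
                              theta=(11.22, 1.14, 2.5, 0.2025,
                                     0.0049, 6.76, -0.0258))
print(mu.shape, Sig.shape)
```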

Continuing with the printhead degradation example in Section 5, assume that the degradation process follows the drift-acceleration ADT model, and let the planning values of $\theta$ be the same as in Section 5. Consider a two-point design. When strategic allocation is used in the planning, a contour plot of the asymptotic variance $\mathrm{AVar}(\hat t_{0.1})$ is given in Figure 7(a). The optimum test plan is $n_L = 30$ and $s_L = 0.814$, and the corresponding $\mathrm{AVar}(\hat t_{0.1})$ is 217.5. Figure 7(b) shows the asymptotic variance of $\hat t_{0.1}$ when an ADT with random allocation is used; the optimum plan is $(n_L, s_L) = (41, 0.612)$ with minimum $\mathrm{AVar}(\hat t_{0.1}) = 227.0$. This asymptotic variance is 4.4% higher than that of the ADT with strategic allocation, which again demonstrates the merits of the proposed allocation strategy.

Figure 7: Contour plot of the large-sample variance of $\hat t_{0.1}$ under (a) strategic allocation and (b) random allocation. The diamond in each plot denotes the optimum plan.

Compared with the cumulative-exposure model, the drift-acceleration model tends to favor a higher value for the low factor level $s_L$. This is because the signal-to-noise ratio $E[Y(t)]/\mathrm{std}(Y(t))$ increases much faster in $s$ under the drift-acceleration model. Therefore, a higher value of $s_L$ gains more information about the parameters related to the degradation rate, i.e., $\alpha_0$ and $\alpha_1$.

7 Conclusions

An allocation strategy for test units in an ADT has been proposed in this study. The motivation is the observation that the initial degradation level is usually strongly correlated with the degradation rate. The proposed allocation strategy tallies with the principle of accelerated tests in that more reliable units, manifested by smaller degradation rates, are exposed to higher levels of the accelerating variable. The allocation strategy was demonstrated using general path degradation models. We first considered a cumulative-exposure ADT model, for which the likelihood function and the Fisher information matrix under strategic allocation were derived. The optimum two-point test plan was obtained by minimizing the asymptotic variance of the ML estimator of a lifetime quantile. It was found that, compared with random allocation, strategic allocation of the test units significantly reduces the asymptotic variance. Compromise test plans were also discussed. To further demonstrate the applicability of the allocation strategy, we applied it to two more ADT models and developed the corresponding inference procedures. The first model simplifies the cumulative-exposure model (3) by introducing an onset parameter; the second uses a different method to link the accelerating variable to the degradation model (1). Examples for both models again revealed the ability of the strategy to reduce the asymptotic variance of $\hat t_p$.

The study may be extended in several important directions. The first is to apply the strategy to stochastic process models, such as the Wiener process and the gamma process. It is straightforward to extend (1) to a Wiener process with random effects as $Y(t) = Y_0 + bt + \sigma B(t)$, where $[Y_0, b]$ is bivariate normal, $\sigma$ is the volatility parameter, and $B(t)$ is standard Brownian motion; in this model, $Y(t)$ has a closed-form PDF. Extension of (1) to the gamma process is difficult, however, as it is hard to find a bivariate distribution that describes the initial degradation and the degradation rate under the gamma process model. For this process, the unknown-onset model (10) is useful, as it links the initial degradation and the degradation rate through the onset parameter $\tau$; this indicates potential applications of the unknown-onset model in degradation analysis. The second direction is the optimization of the test plan. Because there are only two planning variables in our ADT setting, a grid search is implementable. Nevertheless, most contour plots in this study suggest that the asymptotic variance $\mathrm{AVar}(\hat t_{0.1})$ is bowl-shaped in the two planning variables, which implies that a derivative-free optimization algorithm might be able to locate the global optimum. If we allow for more planning variables, such as the test duration and the measurement frequency, a well-designed optimization algorithm becomes all the more necessary.

Appendix

Proof of Proposition 1

Suppose $(W_k, X_k)$, $k = 1, \ldots, n$, are $n$ i.i.d. realizations of $(W, X)$. Let $\mathbf X$ be the collection of the ordered random variables $X_{(1)}, \ldots, X_{(n)}$ and $\mathbf X_{(-k)}$ the collection of $X_{(1)}, \ldots, X_{(k-1)}, X_{(k+1)}, \ldots, X_{(n)}$, with realizations denoted by $\mathbf x$ and $\mathbf x_{(-k)}$, respectively. Let $W_{(k)}$ be the $(d-1)$-dimensional random vector associated with $X_{(k)}$, and define $\mathbf W$, $\mathbf W_{(-k)}$, $\mathbf w$ and $\mathbf w_{(-k)}$ in a similar way. Based on the law of total probability and Bayes' theorem, we have
$$f_{W_{(i)} \mid X_{(i)}}(w_i \mid x_i) = \int f_{[\mathbf W, \mathbf X_{(-i)} \mid X_{(i)}]}(\mathbf w, \mathbf x_{(-i)} \mid x_i)\, d\mathbf w_{(-i)}\, d\mathbf x_{(-i)} = \int \frac{f_{[\mathbf W, \mathbf X]}(\mathbf w, \mathbf x)}{f_{X_{(i)}}(x_i)}\, d\mathbf w_{(-i)}\, d\mathbf x_{(-i)}. \qquad (14)$$
By the properties of order statistics, the PDF $f_{[\mathbf W, \mathbf X]}(\mathbf w, \mathbf x)$ is proportional to the joint density of the $(W_k, X_k)$, $k = 1, \ldots, n$:
$$f_{[\mathbf W, \mathbf X]}(\mathbf w, \mathbf x) = n! \prod_{k=1}^n \left[f_{W_k \mid X_k}(w_k \mid x_k)\,f_{X_k}(x_k)\right].$$
Substituting this result into (14) shows that $f_{W_{(i)} \mid X_{(i)}}(w_i \mid x_i) \propto f_{W_i \mid X_i}(w_i \mid x_i)$. Because both sides are PDFs and the right-hand side integrates to one, the left-hand side must equal the right-hand side. This completes the proof.

Partial derivatives for the cumulative-exposure ADT model

The partial derivatives of $U$ in (4) with respect to the elements of $\theta = [\beta, \alpha_0, \alpha_1, \sigma_0^2, \sigma_1^2, \sigma_\epsilon^2, \sigma_{01}]^T$ are
$$\dot U_k = 0_{2\times 2},\ k = 1, 2, 3; \qquad \dot U_4 = \frac{1}{(\sigma_0^2 + \sigma_\epsilon^2)^2}\begin{bmatrix} \sigma_\epsilon^4 & -\sigma_\epsilon^2\sigma_{01} \\ -\sigma_\epsilon^2\sigma_{01} & \sigma_{01}^2 \end{bmatrix}; \qquad \dot U_5 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix};$$
$$\dot U_6 = \frac{1}{(\sigma_0^2 + \sigma_\epsilon^2)^2}\begin{bmatrix} \sigma_0^4 & \sigma_0^2\sigma_{01} \\ \sigma_0^2\sigma_{01} & \sigma_{01}^2 \end{bmatrix}; \qquad \dot U_7 = \frac{1}{\sigma_0^2 + \sigma_\epsilon^2}\begin{bmatrix} 0 & \sigma_\epsilon^2 \\ \sigma_\epsilon^2 & -2\sigma_{01} \end{bmatrix}.$$
The partial derivative of $\mu_i$ is denoted $\dot\mu_{ik} = \dot\mu_{ik:1} + \dot\mu_{ik:2}z_i$, where $z_i$ is the $i$th order statistic of $n$ i.i.d. standard normal random variables. Detailed expressions for $\dot\mu_{ik:l}$, $k = 1, 2, \ldots, 7$ and $l = 1, 2$, are given below. Addition between a scalar and a vector is understood elementwise: the scalar is added to every element of the vector.

For the cumulative-exposure model (writing $v = \sigma_0^2 + \sigma_\epsilon^2$):
$$\dot\mu_{i1:1} = \frac{\sigma_\epsilon^2}{v} - \frac{\sigma_{01}e^{\alpha_1 s(i)}}{v}\,t_i, \quad \dot\mu_{i1:2} = 0; \qquad \dot\mu_{i2:1} = e^{\alpha_1 s(i)}t_i, \quad \dot\mu_{i2:2} = 0;$$
$$\dot\mu_{i3:1} = s(i)\alpha_0 e^{\alpha_1 s(i)}t_i, \quad \dot\mu_{i3:2} = \frac{s(i)\sigma_{01}e^{\alpha_1 s(i)}}{v^{1/2}}\,t_i; \qquad \dot\mu_{i4:1} = 0, \quad \dot\mu_{i4:2} = \frac{\sigma_\epsilon^2 - \sigma_{01}e^{\alpha_1 s(i)}t_i}{v^{3/2}};$$
$$\dot\mu_{i5:1} = \dot\mu_{i5:2} = \dot\mu_{i6:1} = \dot\mu_{i7:1} = 0; \qquad \dot\mu_{i6:2} = -\frac{\sigma_0^2 + \sigma_{01}e^{\alpha_1 s(i)}t_i}{v^{3/2}}; \qquad \dot\mu_{i7:2} = \frac{e^{\alpha_1 s(i)}}{v^{1/2}}\,t_i.$$

Some partial derivatives for the simplified model

Again, the partial derivative of $\mu_i$ is denoted $\dot\mu_{ik} = \dot\mu_{ik:1} + \dot\mu_{ik:2}z_i$, with $\sigma_{un}^2 = \tau^2\sigma_1^2 + \sigma_\epsilon^2$:
$$\dot\mu_{i1:1} = \frac{\alpha_0\sigma_\epsilon^2}{\sigma_{un}^2} - \frac{\tau\sigma_1^2\alpha_0 e^{\alpha_1 s(i)}}{\sigma_{un}^2}\,t_i, \qquad \dot\mu_{i1:2} = \frac{2\tau\sigma_1^2\sigma_\epsilon^2}{\sigma_{un}^3} + \frac{\sigma_1^2\sigma_\epsilon^2 - \tau^2\sigma_1^4}{\sigma_{un}^3}\,e^{\alpha_1 s(i)}t_i;$$
$$\dot\mu_{i2:1} = \frac{\sigma_\epsilon^2}{\sigma_{un}^2}\left(\tau + e^{\alpha_1 s(i)}t_i\right), \quad \dot\mu_{i2:2} = 0; \qquad \dot\mu_{i3:1} = s(i)\alpha_0 e^{\alpha_1 s(i)}t_i, \quad \dot\mu_{i3:2} = \frac{\tau\sigma_1^2}{\sigma_{un}}\,s(i)e^{\alpha_1 s(i)}t_i;$$
$$\dot\mu_{i4:1} = \dot\mu_{i5:1} = 0; \qquad \dot\mu_{i4:2} = \frac{\tau\sigma_\epsilon^2}{\sigma_{un}^3}\left(\tau + e^{\alpha_1 s(i)}t_i\right); \qquad \dot\mu_{i5:2} = -\frac{\tau\sigma_1^2}{\sigma_{un}^3}\left(\tau + e^{\alpha_1 s(i)}t_i\right).$$
On the other hand, the partial derivatives of $V$ and $\tilde Z_i$ are
$$\dot V_1 = -\frac{2\tau\sigma_1^4\sigma_\epsilon^2}{\sigma_{un}^4}; \quad \dot V_k = 0,\ k = 2, 3; \quad \dot V_4 = \frac{\sigma_\epsilon^4}{\sigma_{un}^4}; \quad \dot V_5 = \frac{\tau^2\sigma_1^4}{\sigma_{un}^4};$$
$$\dot{\tilde Z}_{ik} = 0_{2\times 1},\ k = 2, 4, 5; \qquad \dot{\tilde Z}_{i1} = [1, 0]^T; \qquad \dot{\tilde Z}_{i3} = \left[0,\ s(i)e^{\alpha_1 s(i)}\right]^T.$$

Partial derivatives for the drift-acceleration ADT model

The partial derivative of $\mu_i$ is $\dot\mu_{ik} = \dot\mu_{ik:1} + \dot\mu_{ik:2}z_i$, where
$$\dot\mu_{i1:1} = \frac{\sigma_\epsilon^2}{\sigma_0^2 + \sigma_\epsilon^2} - \frac{\sigma_{01}}{\sigma_0^2 + \sigma_\epsilon^2}\,t_i; \qquad \dot\mu_{i2:1} = e^{\alpha_1 s(i)}t_i; \qquad \dot\mu_{i3:1} = \alpha_0 s(i)e^{\alpha_1 s(i)}t_i;$$
$$\dot\mu_{i4:2} = \frac{\sigma_\epsilon^2 - \sigma_{01}t_i}{(\sigma_0^2 + \sigma_\epsilon^2)^{3/2}}; \qquad \dot\mu_{i6:2} = -\frac{\sigma_0^2 + \sigma_{01}t_i}{(\sigma_0^2 + \sigma_\epsilon^2)^{3/2}}; \qquad \dot\mu_{i7:2} = \frac{t_i}{(\sigma_0^2 + \sigma_\epsilon^2)^{1/2}};$$
$$\dot\mu_{i1:2} = \dot\mu_{i2:2} = \dot\mu_{i3:2} = \dot\mu_{i4:1} = \dot\mu_{i5:1} = \dot\mu_{i5:2} = \dot\mu_{i6:1} = \dot\mu_{i7:1} = 0.$$