Seminar on Longitudinal Analysis


1 Seminar on Longitudinal Analysis. James Heckman, University of Chicago. This draft, May 20. 1 / 191

2 Review of Data Collection Techniques Before addressing issues regarding the modeling and estimation of individual histories, we first review the type of data available on processes such as fertility, employment and unemployment, and other state transition processes. For illustration, suppose we examine the characterization of a two-state unemployment/employment process over continuous time with asynchronous switching. In the next figure, let t = 0 be the starting date and T_0 be the termination date. Individual I experiences three transitions in this time period, while II makes only two. 2 / 191

3 Two-State Unemployment/Employment [Figure: sample paths for individuals I and II between states 0 (unemployed) and 1 (employed) from t = 0 to T_0] 3 / 191

4 Event history data Event history data provides the analyst with information on: 1 all of the switches of each individual 2 length of time spent in each state This data is extremely rare. A few examples of such data sets: Fertility studies of the Bicol region of the Philippines Another from Denmark Seattle employment study These studies are available only for a finite length of time. Thus, the ending date of the survey censors the final observation: the duration of the individual's stay in the last state is unknown. 4 / 191

5 CPS data collection The CPS data collection technique actually provides exceedingly little information for longitudinal analysis despite its breadth. The CPS survey follows a group of individuals in the following fashion: 1 At time t = 0 an initial survey is performed. 2 After 9 months, a second survey is made. 3 After 11 months, a followup is performed. The following four case histories, each represented as a single broken line (line for unemployment and break for employment), illustrate how choppy the CPS data is. 5 / 191

6 CPS Survey [Figure: four case histories A, B, C, D drawn on a timeline with marks at t = 0, 2 months, T = 9, and T = 11] 6 / 191

7 The questioning about past history of unemployment spells is such that: A is recorded as one uninterrupted spell of unemployment B is also recorded as an uninterrupted spell C is a spell of known duration and ending date D is known only as lasting between 2 and 9 months and is right-censored. Such data doesn't even allow computation of an unbiased mean duration of unemployment spells, let alone allowing us to answer questions regarding the dependence of spell length on past history of the individual or on calendar time. 7 / 191

8 Prospective Point Sampling A good deal of occupational information is collected on this basis, sometimes (misleadingly) called continuous history data. Compared with the CPS framework, this scheme is very simple. At chosen sampling intervals, the individual reports the state he or she currently occupies. The detail of such surveys depends greatly on the chosen sample interval and the characteristics of the transition process. 8 / 191

9 Prospective Point Sampling One-year intervals may be sufficiently narrow for a fertility series, but too wide for employment transitions. Multiple transitions may occur within the interval. Unless retrospective information is gathered at the sampling dates to fill in such gaps, this method may convey too little information regarding the time dynamics of the underlying process. 9 / 191

10 Event Count Data Such surveys record the number of switches between states only. There is no information on duration, incidence rates or spacing. 10 / 191

11 Partial observability This is a situation arising when the frequency of observation is not the same across variables. In some surveys, earnings are recorded on a yearly basis while unemployment is followed at three-month intervals. Methods to study correlation between such disparate series will be addressed. 11 / 191

12 Partial observability The moral of this review for those involved in data collection design is: 1 collect all retrospective data; 2 the wider the chosen sampling intervals, the noisier the recording of the underlying process, although the surveying may be less costly; 3 the more characteristics regarding each individual are collected, the easier it is to control for individual errors in reporting. 12 / 191

13 Duration Models Standard hazard specifications A duration model characterizes the probability of an event occurring as a function of elapsed time, when the waiting time is governed by some random distribution. Let T be the waiting time until the occurrence of an event.

S(t) = exp[ −∫_0^t h(u) du ] = the probability that T exceeds some time span t

h(t) = −d/dt [log S(t)] = g(t)/S(t),

where g(t) is the density associated with G(t) = 1 − S(t). 13 / 191

14 S(t) is called the survivor function and h(t) the hazard function, and G(t) is the distribution of spell lengths. Note that when discussing a given population of individuals and a single event which each individual experiences only once, the functions have the following interpretations:
S(t) = the fraction of people who have not yet experienced the event
g(t) = rate of occurrence per unit time
h(t) = rate of occurrence per individual still to experience the event, or at risk
14 / 191

15 A crude approximation to the survivor function S(t) is a step function giving the number surviving as a percent of the total starting population. Each time an event occurs, the number of survivors falls in a discrete fashion. 15 / 191

16 Survivor Step Function [Figure: step-function approximation to S(t), starting at 1 and falling toward 0 as t increases] 16 / 191

17 An interesting transformation of the S(t) function is

−log S(t) = ∫_0^t h(u) du.

The curvature of this integrated hazard may be read from the next graph: in case A the rate of occurrence is slowing down with time, and so is the number of individuals who leave; in case B the rate of occurrence is speeding up; in case C the rate of occurrence changes with time, as does the number of individuals who leave. 17 / 191

18 Curvature of Integrated Hazard [Figure: −log S(t) plotted against t for the three cases A, B, and C] 18 / 191

19 Giving a functional form to the specification of the hazard is a problem of balancing flexibility to incorporate all of cases A, B and C against convenience: finding a parametrization which is easy to work with. Six types of hazard function are given below (a short numerical sketch of several of them follows the list). 1 The simplest is of course a constant hazard over time: h(t) = C for all t > 0. Somewhat restrictive. 2 The Weibull family of distributions offers simplicity and more flexibility: h(t) = αt^{α−1}, α > 0. If α = 1, this yields a constant hazard; if 0 < α < 1, h(t) is decreasing in t; if α > 1, h(t) is increasing in t. 19 / 191

20 3 The Gompertz distributions also offer flexibility and convenience, with a problem when the parameter γ is negative: h(t) = exp(γt). For γ > 0, h(t) is increasing. For γ < 0, h(t) is decreasing, but the distribution function is defective: the probability P(T > t) = S(t) does not go to zero as t goes to infinity. For γ < 0,

∫_0^t h(u) du = ∫_0^t exp(γu) du = (1/γ)[exp(γt) − 1],

S(t) = exp[ −(1/γ)(exp(γt) − 1) ],

lim_{t→∞} S(t) = exp(1/γ) ≠ 0. 20 / 191

21 4 The Box-Cox hazard is a commingling of the preceding two:

h(t) = exp[ γ (t^λ − 1)/λ ].

As λ → 0, h(t) approaches the Weibull form; λ = 1 gives the Gompertz; γ = 0 gives a constant hazard. 21 / 191

22 5 The unimodal parametrization allows h(t) to be initially increasing and subsequently decreasing. The inflection point of curve C is the mode of the hazard function.

h(t) = λα(λt)^{α−1} / [1 + (λt)^α]

If 0 < α < 1, h(t) is decreasing from +∞; if α = 1, h(t) is decreasing from λ; if α > 1, h(t) is unimodal with mode at (α − 1)^{1/α}/λ. 22 / 191

23 6 Finally, a more complex hazard function with a concrete basis for the parametrization is the following. Let t = elapsed calendar time and x = age of an individual female. A hazard specification for the fertility of women who have had a child might be as follows:

h(t, x) = ρ η(x) (η(x) t)^{α−1} exp[ −η(x) t ] 23 / 191

24 Notice the relation between this function and the unimodal hazard of 5; here the parameter which governs the mode is η(x). By specifying that η(x) declines with the age of the female, the declining fertility of older women is incorporated into the hazard. An example of η(x) would be

η(x) = 0 if x < 14; 1 if 14 ≤ x < 25; [1 − (x − 25)/τ]^2 if 25 ≤ x < the upper age cutoff; 0 thereafter. 24 / 191

25 [Figure: η(x) plotted against age x] 25 / 191
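
To make the shapes of these families concrete, the following short Python sketch (not part of the original notes; all parameter values are made up) evaluates the constant, Weibull, Gompertz, and unimodal hazards from the list above on a grid, and checks the location of the unimodal mode. The log-logistic form used for the unimodal family is the one written above.

```python
import numpy as np

t = np.linspace(0.1, 5.0, 50)   # avoid t = 0, where the Weibull hazard can blow up

def weibull_hazard(t, alpha):
    return alpha * t ** (alpha - 1)

def gompertz_hazard(t, gamma):
    return np.exp(gamma * t)

def unimodal_hazard(t, lam, alpha):
    # log-logistic form assumed for the "unimodal parametrization" above
    return lam * alpha * (lam * t) ** (alpha - 1) / (1.0 + (lam * t) ** alpha)

print("constant hazard  :", np.full_like(t, 0.5)[:3])
print("Weibull, a=0.5   :", weibull_hazard(t, 0.5)[:3])   # decreasing
print("Weibull, a=2.0   :", weibull_hazard(t, 2.0)[:3])   # increasing
print("Gompertz, g=-0.3 :", gompertz_hazard(t, -0.3)[:3]) # decreasing, defective S
print("unimodal, a=2    :", unimodal_hazard(t, 1.0, 2.0)[:3])
# mode of the unimodal hazard should be near (alpha - 1)**(1/alpha) / lam = 1.0
print("argmax of unimodal hazard:", t[np.argmax(unimodal_hazard(t, 1.0, 2.0))])
```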

26 Mixtures of constant hazard rates Even in the simple model in which all individuals have some constant hazard rate, θ, and these rates are distributed across the population according to some mixing density, m(θ), problems of identification arise. First, we show that such a mixing model always generates a declining proportional hazard rate, and then discuss distinguishing such a model from one in which each individual has a decreasing hazard rate. 26 / 191

27 This form of heterogeneity, a mixture of constant hazard rates, is common in the duration analysis literature.

S(t) = ∫_0^∞ exp(−θt) dµ(θ),

where dµ(θ) is equivalent to m(θ) dθ, with m(θ) = mixing density of θ in the population and θ = unobserved heterogeneous hazard rate. 27 / 191

28 For a population with such a combination of individuals, the proportional, or population, hazard rate is always decreasing over time:

h(t) = −d/dt [ln S(t)] = ∫_0^∞ θ e^{−θt} dµ(θ) / ∫_0^∞ e^{−θt} dµ(θ)

d/dt h(t) = { [∫_0^∞ θ e^{−θt} dµ(θ)]^2 − ∫_0^∞ e^{−θt} dµ(θ) · ∫_0^∞ θ^2 e^{−θt} dµ(θ) } / [∫_0^∞ e^{−θt} dµ(θ)]^2

By the Cauchy-Schwarz inequality for L^2, the sign of dh(t)/dt is negative.^1

^1 (∫ x(s) y(s) ds)^2 ≤ ∫ x^2(s) ds · ∫ y^2(s) ds. Let x^2(s) ds correspond to e^{−θt} dµ(θ) and y^2(s) ds to θ^2 e^{−θt} dµ(θ). 28 / 191
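
A quick numerical illustration of this result, under an assumed two-point mixture (a Python sketch; the hazard values and weights are arbitrary): every individual has a constant hazard, yet the population hazard computed from the mixed survivor function declines.

```python
import numpy as np

# Two-point mixture: half the population has hazard 0.5, half has hazard 2.0.
thetas = np.array([0.5, 2.0])
weights = np.array([0.5, 0.5])

t = np.linspace(0.0, 5.0, 501)
S = (weights * np.exp(-np.outer(t, thetas))).sum(axis=1)   # mixed survivor function

# Population hazard h(t) = -d log S(t) / dt, by numerical differentiation.
h = -np.gradient(np.log(S), t)

print("h near 0 :", h[0])     # close to the mean hazard, 1.25
print("h near 5 :", h[-1])    # close to the smallest hazard, 0.5
print("monotone decreasing:", bool(np.all(np.diff(h) <= 1e-9)))
```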

29 This result is quite general but the converse is not true. One restriction that helps but does not completely resolve the difficulty of distinguishing between mixtures of constant hazards and mixtures of decreasing hazards is that of complete monotonicity. That is, mixtures of constant hazards must have derivatives which alternate in sign:

(−1)^n d^n S(t)/dt^n ≥ 0 for all t ≥ 0 and all positive integers n.

If the data are fine enough to support differentiation, then a simple sufficient condition (see Feller 1971, Vol. II) for this property to be violated is

−h''(t_0) + 3 h(t_0) h'(t_0) − h^3(t_0) > 0. 29 / 191

30 However, it may be that an alternative specification with decreasing hazards,

S(t) = ∫_0^∞ e^{−θ t^α} dµ(θ), 0 < α < 1, (1)

is identical to a mixture-of-constant-hazards model

S(t) = ∫_0^∞ e^{−θ t} dµ*(θ), (2)

where the new mixture dµ* is properly defined. Both (1) and (2) are completely monotone, and of course observationally equivalent. 30 / 191

31 This underidentification can be overcome only by further a priori assumptions. If only densities with finite means are allowed, then (2) may be ruled out: for example, if dµ(θ) is a gamma density, then dµ*(θ) must have a fat tail. Alternatively, if there is a theoretical restriction on the time dependence, which is presumed the same for all individuals, i.e., α is known, then the mixture dµ(θ) is identified. 31 / 191

32 Actuarial Estimators Non-parametric estimates of S(t) and ∫_0^t h(u) du will involve maximum likelihood estimators over a function space. No distributional assumptions regarding m(θ) are made. The first example of such a likelihood is an actuarial estimator on a single-state process. 32 / 191

33 The actuarial estimator is based on the following observation: if failure times across a population are governed by the same distribution, but the information on the process gives only the number of survivors at arbitrary time points t_1 < t_2 < t_3 < ⋯ < t_k delimiting intervals I_1, I_2, I_3, …, then the survivor function evaluated at each one of the time points is

S(t_k) = P(T > t_k) = P(T > t_1, T > t_2, …, T > t_k)
= P(T > t_k | T > t_1, …, T > t_{k−1}) · P(T > t_{k−1} | T > t_1, …, T > t_{k−2}) ⋯ P(T > t_1)
= P(T > t_k | T > t_{k−1}) · P(T > t_{k−1} | T > t_{k−2}) ⋯ P(T > t_1). 33 / 191

34 Notice that if T > t_{k−1}, then T > t_j for all j < k − 1. If there are η_j individuals at risk at the beginning of time interval I_j and one failure occurs in that interval, then an estimated probability of surviving the interval is P̂(T > t_j | T > t_{j−1}) = 1 − 1/η_j. The estimated survivor function is

Ŝ(t_k) = ∏_{j=1}^{k} (1 − 1/η_j). 34 / 191

35 If the number of failures within an interval is greater than one, then the actuarial estimator may be modified. Let d_i be the number of terminations in interval I_i; then

Ŝ(t_k) = ∏_{j=1}^{k} (1 − d_j/η_j).

The Kaplan-Meier non-parametric maximum likelihood estimator is an extension of this actuarial construct:

Ŝ(t) = ∏_{i: t_i < t} (1 − d_i/η_i).

Here t_1 < t_2 < ⋯ are the actual times at which individuals experience the event:
d_i = number of individuals exiting at the i-th event time
η_i = number of individuals at risk at the i-th event time 35 / 191
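
A compact implementation of this product-limit estimator (a sketch; the function name, data layout, and the small worked example are ours, not from the notes):

```python
import numpy as np

def kaplan_meier(times, censored):
    """Product-limit estimator S_hat(t) = prod over event times t_i <= t of (1 - d_i/eta_i).

    times    : observed durations (event or censoring times)
    censored : 1 if the observation is censored, 0 if the event was observed
    Returns the distinct event times and the estimated survivor function there.
    """
    times = np.asarray(times, dtype=float)
    censored = np.asarray(censored, dtype=int)
    event_times = np.unique(times[censored == 0])
    S, surv = 1.0, []
    for t in event_times:
        eta = np.sum(times >= t)                       # number still at risk at t
        d = np.sum((times == t) & (censored == 0))     # number of events at t
        S *= 1.0 - d / eta
        surv.append(S)
    return event_times, np.array(surv)

# Small worked example: eight durations with two censored observations.
t_obs = [2, 3, 3, 5, 7, 8, 8, 11]
cens  = [0, 0, 1, 0, 1, 0, 0, 0]
print(kaplan_meier(t_obs, cens))
```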

36 Given the non-parametric likelihood, in order to perform hypothesis testing we need a standard deviation as a function of time. Greenwood's formula gives

γ(t) = Ŝ(t) Σ_{j=1}^{k_t} d_j / [η_j (η_j + 1)],

where k_t is the value of k such that t ∈ [t_{(k)}, t_{(k+1)}). 36 / 191

37 The Aalen estimator of the integrated hazard is a good descriptive device; it uses the same technique used in procedures for estimating multi-state transition rates, which may involve complicated time dependence. The survivor function is

S(t) = exp( −∫_0^t h(u) du ).

A procedure to estimate the integrated hazard sums the ratio of exiting individuals to the number remaining at risk at each event time:

∫_0^t ĥ(u) du = Σ_{i: t_i < t} d_i/η_i. 37 / 191

38 Let Ŝ(t) be the Kaplan-Meier estimator. The relation of the Aalen estimate to Ŝ(t) may be seen by the following transformation:

−ln Ŝ(t) = −Σ_{i: t_i < t} ln(1 − d_i/η_i) ≈ ∫_0^t h(u) du. 38 / 191

39 Each term of this Kaplan-Meier integrated hazard, when written as a series expansion, is

−ln(1 − d_i/η_i) = d_i/η_i + d_i^2/(2η_i^2) + d_i^3/(3η_i^3) + ⋯.

The Aalen estimator ignores the higher order terms of this expansion. If the number at risk is large, most of the weight in this series does indeed fall on the first term alone. The Aalen estimator also tends to correct the bias introduced by the nonlinear transformation ln(S(t)). 39 / 191
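
The following sketch (ours) computes the Aalen estimate Σ d_i/η_i alongside −Σ ln(1 − d_i/η_i) on simulated exponential data, illustrating that the two agree closely when the risk sets are large relative to the d_i; the simulated rates and the evaluation point are arbitrary.

```python
import numpy as np

def integrated_hazard(times, censored, tau):
    """Nelson-Aalen sum d_i/eta_i and the Kaplan-Meier based -sum log(1 - d_i/eta_i),
    both taken over event times t_i <= tau."""
    times = np.asarray(times, dtype=float)
    censored = np.asarray(censored, dtype=int)
    na, neg_log_km = 0.0, 0.0
    for t in np.unique(times[(censored == 0) & (times <= tau)]):
        eta = np.sum(times >= t)                      # at risk just before t
        d = np.sum((times == t) & (censored == 0))    # events at t
        na += d / eta
        neg_log_km -= np.log1p(-d / eta)
    return na, neg_log_km

rng = np.random.default_rng(1)
true_hazard = 0.5
durations = rng.exponential(1.0 / true_hazard, size=500)
censoring = rng.exponential(4.0, size=500)
obs, cens = np.minimum(durations, censoring), (censoring < durations).astype(int)

tau = 3.0
na, neg_log_km = integrated_hazard(obs, cens, tau)
print(na, neg_log_km, "true value:", true_hazard * tau)   # all three should be close
```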

40 In failure time models, once an individual experiences the event, he is out of the pool of individuals at risk for the rest of the survey. In the following example taken from economics, transitions between unemployment, employment and being out of the labor force may be repeated any number of times. The Aalen estimator may be used in this context as well in testing a hypothesis regarding the transition rates between states. 40 / 191

41 Given three employment states, there are six possible transition rates. [Figure: transition diagram among the states U (unemployed), E (employed), and O (out of the labor force)] 41 / 191

42 Define the following event rates:
r_{U,E}(t) = expected number of unemployment to employment transitions per unit time, per individual at risk
r_{O,E}(t) = expected number of transitions from out of the labor force to employment, per unit time, per individual at risk.
Flinn and Heckman pose the question of whether unemployment and out of the labor force should be designated as separate classifications. 42 / 191

43 To test this, they examine the hypothesis H_0: r_{U,E}(t) = r_{O,E}(t). First, the assumption that transitions depend on past history is made. Suppose the data available give a counting process over the entire population. The six cumulative counts are

N(t) = (N_{U,E}(t), N_{O,E}(t), N_{E,U}(t), …).

Note that individuals may appear in these counts more than once. 43 / 191

44 The Aalen estimator for the integrated transition rate is

∫_0^t r̂_{U,E}(u) du = Σ_{k: 0 ≤ t_k < t} d_k / Y_U(t^{U,E}_{(k)}),

where Y_U(t) = the number of individuals in state U at time t. Although Y_U need not decline monotonically with multistate flows, it is still the number of individuals at risk in state U. 44 / 191

45 The test relies on the query: how frequently do events occur per time period per individual? The event times of the two transition patterns, s_1 < s_2 < ⋯ < s_k for O → E and t_1 < t_2 < ⋯ < t_k for U → E, do not have to match, but having complete event history data is crucial. The test asks whether

∫_0^t r̂_{U,E}(u) du = Σ_{k: 0 ≤ t_k < t} d_k / Y_U(t^{U,E}_{(k)})

is equal to

∫_0^t r̂_{O,E}(u) du = Σ_{k: 0 ≤ s_k < t} d_k / Y_O(s^{O,E}_{(k)}). 45 / 191

46 If event history data were not available, and instead prospective point data were, then multiple intermediate transitions would be unobservable. To infer what jumps occurred between observed points, one might try to fit a Markov or semi-Markov process. 46 / 191

47 Estimation of Separable Hazard Models In this section, the problem of estimating the form of time dependence is addressed. Specifically, the sensitivity of the time dependence estimate to the form of the distribution of unobserved heterogeneity assumed and to the parametrization of the time dependency is examined. Finally, the nonparametric method of the EM algorithm is presented as an alternative to the standard maximum likelihood methods. 47 / 191

48 If the survivor function is of the separable form

S(t | x) = [S_0(t)]^{exp(x′β)},

then partial likelihood estimation will yield a β̂ estimate, and parametric specifications of the time dependence, such as h_0(t) = αt^{α−1} or h_0(t) = e^{αt}, where

S_0(t) = exp( −∫_0^t h_0(u) du ),

may be estimated with standard techniques. Given the variety of functional forms that might be chosen for the time dependence, one would perhaps like the data to speak for themselves, in the absence of any theory on time dependence. 48 / 191

49 A nonparametric approach would proceed in the following fashion. Choose a priori a set of time intervals that need not correspond to jump times associated with occurrences of transitions; they must satisfy the condition that they are long enough to be non-empty of events. Then set

h_0(t) = e^{α_1} for 0 ≤ t < t_1; e^{α_2} for t_1 ≤ t < t_2; …; e^{α_k} for t_{k−1} ≤ t < t_k. 49 / 191

50 This places no restrictions of monotonicity, nor on the number of modes, of the hazard function. The integrated hazard at t_k will be

∫_0^{t_k} h_0(u) du = Σ_{j=1}^{k} e^{α_j} (t_j − t_{j−1})

for a stepwise hazard, as in the graph below: 50 / 191

51 Stepwise Hazard Function [Figure: piecewise-constant hazard over the chosen intervals] 51 / 191

52 The estimated hazards are

α̂_k = ln d_k − ln[ Σ_{i ∈ R_k} δ_i exp(x_i′β̂) ],

where
d_k = number of individuals who experience the event of interest in the half-open interval [t_{k−1}, t_k)
R_k = set of individuals at risk in the interval [t_{k−1}, t_k)
δ_i = 1 if the individual is uncensored.
Again, δ_i points out that non-censored individuals provide information regarding time dependence. 52 / 191

53 If the estimated ĥ_0(t) is of the form in graph (A), which appears unimodal, this would support a specification of

h_0(t) = λα(λt)^{α−1} / [1 + (λt)^α] for α > 1. 53 / 191

54 (A) [Figure: estimated ĥ_0(t) against t, unimodal] 54 / 191

55 (B) [Figure: estimated ĥ_0(t) against t] 55 / 191

56 Survivor Analysis and GMLE A brief look at the theory underlying generalized maximum likelihood follows. (See Kiefer and Wolfowitz.) Define dµ(x) to be a dominating measure if, for dP_0(x) = f(x) dµ(x), µ(A) = 0 implies that P_0(A) = 0. Let P be the class of all probability measures. For every pair of probability measures P_1 and P_2 in the class P, define

f(x; P_1, P_2) = dP_1(x) / d(P_1 + P_2)(x). 56 / 191

57 This is a Radon-Nikodym derivative. Here, f(x; P_1, P_2) plays the role of the likelihood. The measure P̂ is a generalized maximum likelihood estimator if

f(x; P̂, P) ≥ f(x; P, P̂) for every P in the class,

or

dP̂(x)/d(P̂ + P)(x) ≥ dP(x)/d(P̂ + P)(x). 57 / 191

58 Let T 1, T 2,..., T n be the times an event of interest occurs in a population surveyed. To extend the analysis, we now allow a censoring of the data in a random manner. Let C 1, C 2,..., C n be the censoring times: this is equivalent to an individual dropping out of a sample population before experiencing the event in question. 58 / 191

59 The observable is Y_i = min(T_i, C_i). At each time Y_i, an individual either makes the transition out of the state, having experienced the event, or drops out from the sample's observed population prematurely. Let the δ_i variable indicate whether y_i is censored or not: δ_i = 0 if the individual is censored, 1 if not. Consider the probability distributions on x = ((y_1, δ_1), (y_2, δ_2), …, (y_n, δ_n)). 59 / 191

60 Finding a generalized maximum likelihood estimator of the hazard rate is the same as finding a P(x) such that the observed events are given the maximum probability:

P(x) = ∏_{i=1}^{n} [Pr(T = y_{(i)})]^{δ_i} [Pr(T > y_i)]^{1−δ_i}.

The first bracketed term is the probability of an event occurring at exactly time T. It is given weight only if the event is uncensored, i.e., when δ_i = 1. The second term is the probability that the event occurs any time after the time T, which is the most information that a censoring at T yields. It receives weight only if δ_i = 0. 60 / 191

61 More succinctly, the problem is to

max ∏_{i=1}^{n} p_i^{δ_i} ( Σ_{j: y_j > y_i} p_j )^{1−δ_i} for p_1, p_2, …, p_n ≥ 0.

When the number of individuals is finite, the sum of the number of events and censorings must also be finite. (The two are equal.) 61 / 191

62 When the time partitioning is fine enough so that no two events occur exactly simultaneously, then the solution to the maximum problem is exactly the Kaplan-Meier estimate:

P̂_i = [δ_i/(η_i + 1)] ∏_{j=1}^{i−1} (1 − δ_j/(η_j + 1)).

A comparison of time dependent rate estimates depends on the assumption of a homogeneous population. Finding differences across covariates requires further investigation. 62 / 191

63 Duration Models with Covariates Complicating the underlying process by postulating that it is affected by some covariates (which we assume are observable, for now) leads to gains in estimation efficiency, given some specification of the form of duration dependence. The issue which arises, on which statisticians and social scientists are divided, is whether to model the durations (i.e., waiting times) themselves, or to model the rates of exit (the hazard function). Application of a standard regression framework to durations implies complex hazard specifications. Statisticians tend to favor fitting curves to the hazard itself with more convenient parameterizations. 63 / 191

64 Regression Framework In order to apply linear regression to waiting times, a standard technique is to take a log transformation, mapping non-negative waiting times onto the entire real line. Then a symmetric disturbance term ε_i may be applied to a regression of log durations:

ln t_i = β_0 + Σ_{j=1}^{k} β_j x_{ji} + ε_i,

where x_{ji} = the value of the j-th covariate for individual i and ε_i is iid Φ(0, σ^2), with Φ the normal cdf. 64 / 191

65 The theorizing here is at the level of linking the covariates to the expected value of log duration. The hazard is subsumed in this specification, and is worth examining. Although the waiting times are straightforward, the hazard is complex (read: "crazy"). 65 / 191

66 Let

ln T = β_0 + X′β + ε, so T = exp(β_0 + X′β + ε).

The conditional survivor function is

P(T > t | X) = P( exp(β_0 + X′β) exp(ε) > t )
= P( exp(ε) > t exp(−β_0 − X′β) )
= P( ε > ln t − β_0 − X′β )
= 1 − Φ( ln t − β_0 − X′β ). 66 / 191

67 The survivor function is

S(t) = 1 − Φ(ln t − β_0 − X′β) = exp[ −∫_0^t h(u) du ].

The hazard may be retrieved as well by differentiation:

∫_0^t h(u) du = −ln[ 1 − Φ(ln t − β_0 − X′β) ],

h(t) = (1/t) φ(ln t − β_0 − X′β) / [1 − Φ(ln t − β_0 − X′β)].

Another drawback of this method is having to know the completed waiting times t_i. Traditionally, one may know only transition times censored in some fashion. 67 / 191
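
As an illustration of how awkward this implied hazard is, the following sketch (ours, with made-up coefficients) evaluates S(t) and h(t) for one individual under the log-normal regression model above; the hazard first rises and then falls, a shape no monotone Weibull or Gompertz baseline can reproduce.

```python
import numpy as np
from scipy.stats import norm

beta0, beta = 0.5, np.array([0.3, -0.2])   # made-up regression coefficients
x = np.array([1.0, 2.0])                   # covariates for one individual
mu = beta0 + x @ beta                      # mean of log duration for this individual

def survivor(t):
    return 1.0 - norm.cdf(np.log(t) - mu)

def hazard(t):
    z = np.log(t) - mu
    return norm.pdf(z) / (t * (1.0 - norm.cdf(z)))

t = np.array([0.25, 0.5, 1.0, 2.0, 4.0, 8.0])
print("S(t):", survivor(t))
print("h(t):", hazard(t))    # non-monotone: rises and then falls over this grid
```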

68 Consider a More General Approach This approach postulates a general separable hazard specification where the time dependent portion is multiplicatively separable from the portion which varies with X, some vector of covariates, i.e.,

h(t | X) = ψ(t) u(X),

where ψ(t) and u(X) are non-negative valued functions. 68 / 191

69 An example of such a hazard is the Cox specification:

h(t | X) = ψ(t) exp(X′β),

or, equivalently,

exp( −∫_0^t h(u | X) du ) = [S_0(t)]^{exp(X′β)}, where S_0(t) = exp( −∫_0^t ψ(u) du ).

A full maximum likelihood estimator of the hazard and the coefficients on the covariates is constructed as follows. 69 / 191

70 Define
D_i = indicator for individuals who experience an event of interest at time i
C_i = indicator for censored individuals: those who drop out of the sample before they experience an event.
By looking at the conditional survivor function, we find the contribution to the likelihood for an individual experiencing the event is

[S_0(t_{(i)})]^{exp(x_i′β)} − [S_0(t^+_{(i)})]^{exp(x_i′β)},

where t^+_{(i)} denotes the time just after t_{(i)}. 70 / 191

71 For a censored individual, the contribution is [S_0(t^+_{(i)})]^{exp(x_i′β)}. The likelihood is therefore

L = ∏_{l: D_l = 1} { [S_0(t_{(l)})]^{exp(x_l′β)} − [S_0(t^+_{(l)})]^{exp(x_l′β)} } × ∏_{l: C_l = 1} [S_0(t^+_{(l)})]^{exp(x_l′β)}. 71 / 191

72 Following a method suggested by Cox, if one is interested in the covariates, then one might maximize the partial likelihood. This ignores the time-dependent part of the hazard, separating out the changes which enter through the covariates, given the additional requirement that the covariates X be time-invariant (see B. Efron). The Cox proposal is

max_β ∏_{i=1}^{n} exp(x_i′β) / Σ_{l ∈ R(t_{(i)})} exp(x_l′β),

where R(t_{(i)}) is the set of individuals at risk at time t_{(i)}. 72 / 191

73 Heuristically, this term may be related to the full maximum likelihood by the following argument. The conditional probability that person (i) experiences an event at t_{(i)}, given that R(t_{(i)}) individuals are at risk and that exactly one event occurs at time t_{(i)}, is built from the hazard

h(t | X_i) = P(t < T < t + δ | T > t) ∝ φ(t) exp(x_i′β). 73 / 191

74 By the definition of conditional probability, this is

(fraction with covariates x_{(i)} who exit in the interval (t, t + δ)) / (total population at risk as of time t) = φ(t) exp(x_{(i)}′β) / Σ_{l ∈ R(t)} φ(t) exp(x_l′β).

Making the assumption that the nature of time dependence is the same for all individuals, φ(t) cancels out, yielding one element of the Cox partial likelihood objective. 74 / 191
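
A minimal sketch of this partial-likelihood objective (our code; it assumes no tied event times and time-invariant covariates) that can be handed directly to a numerical optimizer:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_partial_likelihood(beta, times, events, X):
    """-sum over events of [ x_i'beta - log sum over the risk set of exp(x_l'beta) ]."""
    eta = X @ np.atleast_1d(beta)
    value = 0.0
    for i in np.flatnonzero(events == 1):
        risk_set = times >= times[i]                  # individuals still at risk at t_i
        value -= eta[i] - np.log(np.exp(eta[risk_set]).sum())
    return value

# Simulated data: exponential durations whose rate depends on one covariate.
rng = np.random.default_rng(2)
n = 300
X = rng.normal(size=(n, 1))
true_beta = np.array([0.8])
T = rng.exponential(1.0 / np.exp(X @ true_beta))
C = rng.exponential(2.0, size=n)
times, events = np.minimum(T, C), (T <= C).astype(int)

fit = minimize(neg_log_partial_likelihood, x0=np.zeros(1),
               args=(times, events, X), method="BFGS")
print("estimated beta:", fit.x)    # should be in the neighbourhood of 0.8
```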

75 A Diagnostic Check for Specific Hazards Given the expense of generalized maximum likelihood estimation, it is useful to have simple diagnostic tests of the specification of the time dependence in a duration model. A test of proportional hazards under Weibull time dependence can be performed by a graphical technique. Let

S(t | x) = P(T > t | X = x) = [S_0(t)]^{exp(x′β)}.

Taking a log transformation twice over, we have

ln S(t | x) = exp(x′β) ln S_0(t),
ln(−ln S(t | x)) = ln(−ln S_0(t)) + x′β. (3) 75 / 191

76 When the time dependence part of the survivor function is of the Weibull family, then S_0(t) = exp(−t^α) and the first term of equation (3) is ln(−ln S_0(t)) = α ln t, so

ln(−ln S(t | x)) = α ln t + x′β. 76 / 191

77 Graphing ln(−ln S(t | x)), a double log transformation of the survivor function, against ln(t) should produce a family of straight lines of the same slope, α. The intercepts should vary according to x′β under this separable specification of S(t | x). If the graph does not conform, then this convenient specification does not apply. 77 / 191

78 Double log Transformation of Survivor Function Against ln(t) [Figure: ln(−ln S(t)) plotted against ln(t); parallel straight lines with slope α] 78 / 191

79 More generally, unless parallel curves are generated by plots of ln[−ln S(t | x)] against ln t for various chosen values of the covariates x, the assumption of separability cannot hold. For an example, see J. Menken, J. Trussell, D. Stempel, and O. Babakol, Demography, Vol. 18, 1981 (on marital dissolution). 79 / 191
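
A small simulation of the graphical check (our code; the Weibull shape, the covariate effect, and the absence of censoring are assumptions made for the illustration): for two covariate groups, the fitted slopes of ln(−ln Ŝ) on ln t should both be near α, with intercepts differing by x′β.

```python
import numpy as np

rng = np.random.default_rng(3)
alpha, beta = 1.5, 0.7                          # Weibull shape and covariate effect
n = 2000

def simulate(xb, size):
    # S(t|x) = exp(-t^alpha * exp(xb))  =>  T = (E / exp(xb))**(1/alpha), E ~ Exp(1)
    return (rng.exponential(size=size) / np.exp(xb)) ** (1.0 / alpha)

for xb in (0.0, beta):                          # two covariate groups: x'beta = 0 and 0.7
    t = np.sort(simulate(xb, n))
    S = 1.0 - np.arange(1, n + 1) / (n + 1.0)   # empirical survivor function (no censoring)
    keep = slice(10, -10)                       # trim tails where log(-log S) is noisy
    slope, intercept = np.polyfit(np.log(t[keep]), np.log(-np.log(S[keep])), 1)
    print(f"x'beta = {xb:0.1f}: slope ~ {slope:0.2f} (alpha = {alpha}), intercept ~ {intercept:0.2f}")
```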

80 A Duration Model from Economic Theory In a search model of unemployment, the Poisson arrival of new job offers and a reservation wage strategy of income maximizing workers generate observed unemployment spells which have fundamentally non-separable hazards. Heterogeneity across workers is likely to exist either in costs of search or in wage offer distributions. This model is outlined here as an example of a duration model based on optimizing behavior by economic agents. A thorough presentation may be found in Lippman and McCall (1976). 80 / 191

81 Let
λ = Poisson encounter rate with new job offers
V = value of search
rV = reservation wage
c = instantaneous cost of search
r = instantaneous interest rate
F(w) = distribution of wage offers, assumed to have finite mean. 81 / 191

82 Agents maximize income subject to the following scheme: if search cost c is incurred, job offers arrive at rate λ independent of c; wage offers are independently drawn without recall from distribution F(w); agents are infinitely lived and jobs last forever, having present discounted value w/r. The value of search is

V = −cΔt/(1 + rΔt) + [(1 − λΔt)/(1 + rΔt)] V + [λΔt/(1 + rΔt)] E[max(w/r, V)] + o(Δt) if V > 0,
V = 0 otherwise. 82 / 191

83 Passing to the limit, we have

c + rV = (λ/r) ∫_{rV}^{∞} (w − rV) dF(w),

and the reservation strategy is

d = 1 if w > rV (job offer accepted), d = 0 if w ≤ rV (job offer rejected).

Then the probability that an unemployment spell exceeds duration t_u, given the hazard rate of exit (acceptance of a job) h_u = λ(1 − F(rV)), is

Pr(T_u > t_u) = exp( −λ(1 − F(rV)) t_u ). 83 / 191
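
A minimal numerical sketch (ours; the wage grid and all parameter values are made up) that solves c + rV = (λ/r)∫_{rV}(w − rV) dF(w) for the reservation wage on a discrete wage grid and reports the implied exit hazard and mean spell length:

```python
import numpy as np
from scipy.optimize import brentq

lam, r, c = 0.5, 0.05, 0.2                    # offer arrival rate, interest rate, search cost
wages = np.arange(1.0, 11.0)                  # wage offer support
probs = np.full(wages.size, 1.0 / wages.size) # F(w): uniform over the grid

def excess(x):
    # c + rV - (lambda/r) * E[(w - rV)+], evaluated at rV = x; increasing in x
    return x + c - (lam / r) * np.sum(probs * np.maximum(wages - x, 0.0))

rV = brentq(excess, -1000.0, wages.max())     # reservation wage
accept_prob = np.sum(probs[wages > rV])       # 1 - F(rV)
hazard = lam * accept_prob                    # exit rate out of unemployment
print("reservation wage:", round(rV, 3))
print("exit hazard:", round(hazard, 3), " mean spell length:", round(1.0 / hazard, 2))
```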

84 Returning to the entire population, with some observed characteristics x and unobserved characteristics θ, we have

Pr(T_u > t_u | x) = ∫_Θ exp[ −λ(x, θ)(1 − F(rV(x, θ))) t_u ] dµ(θ),

where

Θ = { θ : 0 < (λ(x, θ)/r) ∫_{rV}^{ω̄} (w − rV(θ, x)) dF(w) − c(x, θ) }.

Obviously, there is no general separable hazard specification that emerges. Instead, the hazard and covariates are linked by the solution to the Bellman equation, in which the reservation wage rV is implicitly defined. See Heckman and Singer (1984). 84 / 191

85 Duration Models with Unobservables In most applications, the analyst has data on some process, along with some observable characteristics regarding individuals included in the survey. In addition to these, there may be other characteristics of these individuals which are factors affecting the process but which are unmeasured. For example, in the Stanford study of heart transplant recipients, it is likely that each patient differed in some dimension, call it frailty, which affected their survival times after receiving transplants. Thus, the survivor function includes an unobservable θ:

P(T > t | x, θ) = exp( −H(t) U(x) V(θ) ) = exp( −H(t) exp(x′β + θ) ). 85 / 191

86 The statistician has the following issues to address: what estimation strategy to use to obtain β̂ in the presence of θ, and what estimation can reveal regarding the distribution of θ itself. Even when this distribution of θ is of no interest to the analyst, its presence will affect the consistency of estimation of β. Assuming some form of the mixing distribution of θ, dµ(θ), the data may be confronted with the integrated survivor function, where θ has been integrated out:

P(T > t | x) = ∫ exp( −H(t) exp(x′β + θ) ) dµ(θ). 86 / 191

87 Commonly used functional forms for dµ(θ) are gamma and normal distributions:

Gamma: dµ(θ) = [ b^a θ^{a−1} exp(−bθ) / Γ(a) ] dθ
Normal: dµ(θ) = [ exp(−(θ − a)^2 / (2b)) / √(2πb) ] dθ.

Both of these offer a flexible, analytically tractable and computationally convenient family of distributions. Both are fully described by the specification of the two parameters a, b above. 87 / 191

88 For all the claimed convenience of these specifications, along with lognormal variation, they do not all yield the same qualitative estimates for the coefficients on time dependence and observed covariates. Sensitivity of these estimators is apparent in one study, the Heckman and Singer analysis of labor earnings data, but not in the Manton, Stallard and Vaupel study of mortality risks among the aged. 88 / 191

89 Mortality Risks Among the Aged

I. Weibull Parameter Estimates: φ(t) = αt^{α−1}

Age   Gamma Heterogeneity   Inverse Gaussian   No Heterogeneity
…     … (.05)               5.88 (.08)         5.44 (.04)
…     … (.06)               5.98 (.09)         5.46 (.04)
…     … (.07)               6.35 (.11)         5.69 (.05)

II. Gompertz Parameter Estimates: φ(t) = e^{γt}

Age   Gamma Heterogeneity   Inverse Gaussian   No Heterogeneity
…     ( )                   ( )                ( )
…     ( )                   ( )                ( )
…     ( )                   ( )                ( )

(Standard errors in parentheses.) 89 / 191

90 Although the labor economics study shows qualitative differences in the model and in the duration dependence, the mortality study shows duration dependence that is robust to alternative specifications of the mixing distribution. It has yet to be shown what characterizes a data set which will be robust to alternative heterogeneity specifications. 90 / 191

91 Alternatively, in a study of child mortality, the specification of time dependence is shown to be sensitive to the inclusion of unobservables in the estimation strategy. In Monte Carlo studies, nonparametric maximum likelihood estimation has been shown to give unbiased estimates of covariate coefficients, provided one has chosen a particular form of time dependence. This is where theory must play an important role. 91 / 191

92 [Figure: estimated hazards with unobservables estimated by NPMLE versus unobservables ignored, under Weibull and Gompertz time dependence] 92 / 191

93 Without strong enough theory to suggest the underlying functional form of the time dependence, these studies suggest, the effect of covariates and the effect of time dependence cannot be distinguished, even with the use of a nonparametric estimation strategy. 93 / 191

94 The following digression on an area of statistics concerned with the extraction of true test scores focuses on the issues that underlie why duration models with underlying heterogeneity need strong a priori restrictions from theory to obtain identification. The demands on the data from models of this sort are far greater than those of regression models. 94 / 191

95 True test scores are discussed in Lord and Novick, Statistical Theories of Mental Test Scores (1968). Let
X = observed test score
ξ = error, with density h_ξ
u = unobserved true test score, with density g_u
X = ξ + u
f(X) = ∫ h_ξ(X − t) g_u(t) dt. 95 / 191

96 Notice that a survivor function of the form S(t) = ∫ k(t | θ) dµ(θ) has the same structure as the mixture of h_ξ(X − θ) above. The density of true test scores g_u(t) is the analog of the mixing distribution. The question in both problems is fundamentally the same: when can an observed histogram be decomposed and purged of some noise? J. W. Tukey does so with strong assumptions regarding the densities h_ξ and g_u in Named and Faceless Values, Sankhya. 96 / 191

97 Heuristically, you want to decompose observed from true test scores, something that can be accomplished only with some already known characteristics of the errors:

θ̂_i = X_i + (σ_Z^2 / σ_X^2)(X̄ − X_i).

In this linear model, an empirical Bayes estimator θ̂ uses a variance obtained from a previous study by ETS on errors in test scores to extract an estimate of u. 97 / 191

98 For the survivor analysis, we need a nonlinear version of this. For each individual, identify a normal density with mean θ̂_i and variance σ_i^2, where σ_i^2 is the variance of θ̂_i. The estimated true score distribution must be extracted from an observed histogram, where each portion of that histogram is one observation on an underlying distribution for that one individual. 98 / 191

99 [Figure: histogram of the individual estimates, each carrying a normal density η(θ̂_i, σ_i^2)] 99 / 191


101 The estimated true score distribution is reflated by the normally distributed errors:

ĝ_u(θ) = (1/n) Σ_{i=1}^{n} η(θ̂_i, σ_i^2).

Again, the analytical device most useful depends on the question at hand. If each individual realization is important, the histogram is useful. If the population distribution is under discussion, the histogram must be reflated by the errors. 101 / 191

102 Nonparametric estimation of the mixing distribution Two approaches to extracting estimates of µ(θ), α and β with a nonparametric approach to µ(θ) are: 1 to use a maximum amount of information to obtain the largest number of jump points (if η is the number of points of increase used, then N − η is the number of remaining θ's available to calculate the probabilities along those discrete levels of dµ(θ)); 2 to use a coarse distribution, say with 2 or 3 jumps, and recompute the maximum likelihood for additional jump points until the likelihood no longer increases. 102 / 191

103 The model, with some simplifying notation, for each individual l yields a probability of experiencing the event at time t_l, conditional on the unobservable taking value θ_j, of

f(t_l | θ_j, X_l) = f(t_l; λ_{j,l}) = αt_l^{α−1} λ_{j,l} exp(−t_l^α λ_{j,l}),

and unconditionally

f(t_l | X_l) = Σ_{j=1}^{J} p_j αt_l^{α−1} λ_{j,l} exp(−t_l^α λ_{j,l}) = Σ_{j=1}^{J} p_j f(t_l; λ_{j,l}),

where
λ_{j,l} = exp(X_l′β + θ_j)
J = number of points of increase in the mixing distribution
p_j = probability attached to the j-th point of increase. 103 / 191

104 For censored individuals,

S(t*_l | λ_l) = Σ_{j=1}^{J} p_j exp(−(t*_l)^α λ_{j,l}),

where t*_l is the censoring time. Since θ is unobserved, the p_j's and λ's can't be separated, and the previous method of proportional hazards is invalid. 104 / 191

105 Direct Estimation of Time-Dependent Models with Unobserved Heterogeneity The problems of estimating such a model with no a priori restrictions on heterogeneity, such as whether dµ(θ) is unimodal or has a finite mean or any other moments, are apparent from examining the absolute simplest model conceivable. Imagine there are only two types of individuals. Their values of the unobservable are θ_1 and θ_2. Their proportions in the sample population are p and 1 − p. 105 / 191

106 The data may then be confronted with an integrated survivor function

S(t | X) = p exp( −H(t) exp(X′β + θ_1) ) + (1 − p) exp( −H(t) exp(X′β + θ_2) ).

There are three parameters to be assigned values relating to the heterogeneity: θ_1, θ_2, p. These are in addition to the vector β and whatever α is implicit in the time dependence specifications H(t). 106 / 191

107 [Figure: two-point mixing distribution with mass p at θ_1 and mass 1 − p at θ_2] 107 / 191

108 For three support points, we have

S(t | X) = Σ_{k=1}^{3} p_k exp( −H(t) exp(X′β + θ_k) ).

For any n support points, 2n − 1 free parameters are introduced. 108 / 191

109 It is then no surprise that, in order to distinguish between survivor functions, which are all monotone decreasing functions of time, across separate possible combinations of θ mixtures, Monte Carlo studies indicate that the size of the sample must be on the order of 20,000 observations. One Monte Carlo study used a sample created using a gamma distributed unobservable. The nonparametric maximum likelihood technique was poor at reaching any discrete mixture approaching the shape of the continuous gamma density. For example, the next figure was not forthcoming. 109 / 191

110 [Figure: discrete mixing distribution whose mass points trace out the shape of a gamma density] 110 / 191

111 However, the data did yield estimates of the covariates β which were consistent with the actual ones used to create the sample. 111 / 191

112 The E.M. (Expectation-Maximization) algorithm involves two separate steps.

E-step: p_j^{(m+1)} = (1/N) Σ_{l=1}^{N} p_j^{(m)} f(t_l; λ_{j,l}^{(m)}) / Σ_{j′=1}^{J} p_{j′}^{(m)} f(t_l; λ_{j′,l}^{(m)}) = (1/N) Σ_{l=1}^{N} p_j^{(m)} φ_{j,l}^{(m+1)}

M-step: maximize L = Σ_{l=1}^{N} Σ_{j=1}^{J} ln f(t_l; λ_{j,l}) p_j^{(m)} φ_{j,l}^{(m+1)}. 112 / 191

113 The procedure is as follows:
1 Select starting values: (p_1^{(0)}, …, p_J^{(0)}), (θ_1^{(0)}, …, θ_J^{(0)}), α^{(0)}, β^{(0)}.
2 Form φ_{j,l}^{(1)} = f(t_l; λ_{j,l}^{(0)}) / Σ_{j′=1}^{J} p_{j′}^{(0)} f(t_l; λ_{j′,l}^{(0)}). Substitute into L^{(1)} = Σ_l Σ_j ln f(t_l; λ_{j,l}) φ_{j,l}^{(1)} and maximize L to obtain α^{(1)}, β^{(1)}, θ^{(1)}.
3 Calculate p_j^{(1)}, a weighted average of p_j^{(0)}.
4 Repeat from step 2, with φ_{j,l}^{(2)} and L^{(2)}, to convergence. 113 / 191
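
A stripped-down EM sketch (ours) for the simplest version of this setup: a two-point mixture of exponential durations with no covariates and no censoring, so that only the p_j and θ_j are updated and the M-step has a closed form. It follows the E-step/M-step pattern above but is not the full Weibull-with-covariates procedure.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulate a two-type population with constant hazards theta = 0.5 and 3.0.
true_theta, true_p = np.array([0.5, 3.0]), np.array([0.6, 0.4])
types = rng.choice(2, size=3000, p=true_p)
t = rng.exponential(1.0 / true_theta[types])

theta, p = np.array([0.3, 2.0]), np.array([0.5, 0.5])     # crude starting values
for _ in range(500):
    # E-step: posterior probability that observation l comes from support point j.
    dens = p * theta * np.exp(-np.outer(t, theta))        # shape (N, J)
    phi = dens / dens.sum(axis=1, keepdims=True)
    # M-step: closed form for exponential components.
    p = phi.mean(axis=0)
    theta = phi.sum(axis=0) / (phi * t[:, None]).sum(axis=0)

print("estimated p    :", p.round(3))      # should be close to (0.6, 0.4)
print("estimated theta:", theta.round(3))  # should be close to (0.5, 3.0)
```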

114 Dempster, Laird and Rubin show this procedure does indeed converge under exceedingly weak assumptions on f. Problems do arise with this technique: the likelihood function tends to have bumps, so the convergence may yield a local rather than a global maximum. This technique can also be very slow to converge over flat portions of the likelihood. 114 / 191

115 In practice, it makes sense to combine this method with a steepest ascent route. Also, this is an unbalanced problem in that one part (time dependence and observed covariates) is parametric and the other part (unobservables) is nonparametric. Consistency of an estimate of m(θ) is available only for huge data sets (10k observations), whereas convergence is quicker for α and β. 115 / 191

116 The recoverability of m(θ) is difficult for even the simplest form of the survivor function. Take

S(t | θ) = exp(−θt).

Observable data is on

S(t) = ∫_0^∞ exp(−θt) dµ(θ). 116 / 191

117 S(t) is the Laplace transform of the probability distribution µ(θ); it is invertible if S(t) is known in continuous time. But very small changes in S(t) will yield large differences in dµ(θ), and the analytical problem yields only approximations of S(t) at some times t. 117 / 191

118 Using the EM algorithm often yields a mixing distribution with only 3-8 points. A spiked estimated mixing distribution is likely an approximation to a true distribution with more variance: the data not being rich enough to distinguish between all such grouped peaks. 118 / 191

119 Given a discrete mixing distribution, there is thus a finite number of duration densities:

g(t | θ) = θ exp(−θt) (exponential)
g(t | θ) = αt^{α−1} exp(θ) exp(−t^α exp(θ)) (Weibull)
ḡ(t) = ∫ g(t | θ) dµ(θ) (observed mixed density). 119 / 191

120 An interesting mathematical result on variation diminishing transformations allows the analyst to infer information regarding the mixing distribution from the histogram of the durations. If g(t | θ) exhibits sign regularity, then the number of times that the mixing distribution crosses an arbitrary constant c must be at least the number of times that the histogram ḡ crosses the same constant function c. [Figure: ḡ(t) and m(θ) plotted against a constant level c] 120 / 191

121 Formally, sign regularity requires that, for all values t_1 < t_2 and θ_1 < θ_2, either

det [ g(t_1 | θ_1)  g(t_1 | θ_2) ; g(t_2 | θ_1)  g(t_2 | θ_2) ] ≥ 0,

or that the same determinant is less than or equal to zero for all t_1 < t_2, θ_1 < θ_2. Every member of the exponential family of functions exhibits this property. The variation diminishing property is

#{ sign changes of m(θ) − c for θ ∈ R } ≥ #{ sign changes of ḡ(t) − c for t ∈ R }. 121 / 191

122 This not only determines the minimum number of modes that m(θ) may have, but gives local information on amplitude, since c may be chosen arbitrarily. 122 / 191

123 Tests which are insensitive to the specification of heterogeneity An important question given these results which show sensitivity of estimators to heterogeneity specification is what simple tests are reliable indicators of the basic structural form of the time dependence in a duration model. 123 / 191

124 For example, one study by Chahnazarian, Menken and Choe shows that a commonly used practice of picking an arbitrary time segmentation on a duration model in order to run logistic regressions on transitions at say 6 month intervals is quite sensitive to that arbitrary choice. They show that covariate coefficients vary qualitatively depending on the choice of time segmentation. The lesson, one might say, is to devote some effort to modeling the underlying dynamic process before seeking to obtain results on what variables affect that process! 124 / 191

125 Thus, some preliminary tests are in order to investigate what forms of switching processes are consistent with the data. First, I discuss properties of Markov switching processes in discrete time. Following this, I present a test which can reject a broad class of models, all based on the notion of partial exchangeability. 125 / 191

126 Finally, this is extended to a continuous time model of a particular form of separability, namely where switching times are given by a Poisson process (i.e. exponential waiting times) but where the switches across states depend only on the current state (Markovian). 126 / 191

127 Discrete time switching models The following discussion of Markov processes leads to an indication of the assumptions that must be made in order to test duration models when only prospective point sampling data is available, not event history data. Let the following diagram represent transitions between two states, represented by the two points 0 and 1. The dashed red line is one individual's history; the solid line is the other's. [Figure: two sample paths switching between states 0 and 1 over time t] 127 / 191

128 The definition of the Markov property is

P{X(kΔ) = i_k | X((k−1)Δ) = i_{k−1}, X((k−2)Δ) = i_{k−2}, …, X(0) = i_0} = P{X(kΔ) = i_k | X((k−1)Δ) = i_{k−1}}, for k = 1, 2, …,

where k = 0 is the first survey. Time homogeneity requires that P{X(kΔ) = i_k | X((k−1)Δ) = i_{k−1}} is the same for all sampling times k. 128 / 191

129 Suppose that one only has two surveys. All that may be estimated is P(X(Δ) = i_1 | X(0) = i_0) = m_{i_0, i_1}. A 2 × 2 transition matrix summarizes the probabilities assigned to the possible transitions (0, 0), (0, 1), (1, 0), (1, 1) at the survey times k = 0 and k = Δ:

P(0, Δ) = M = [ a  1−a ; 1−b  b ]. 129 / 191

130 To examine the properties of such a Markov process, assume that we have a discrete time event model where the transition times are identified with the sampling times. Assume that switches occur just before the sampling times. Let the transition matrix have the actual values

M = [ 1/4  3/4 ; 5/8  3/8 ].

If switches occurred twice as often, say at times 0, Δ/2, and Δ, then in order for the property of time stationarity to hold there must be a new half-interval transition matrix M_0 such that M_0 M_0 = M. 130 / 191

131 Similarly, if time stationarity is to hold with sampling and switching times at 0, Δ/3, 2Δ/3, and Δ, then M_0 M_0 M_0 = M. For the above value of M, the first condition does not hold, but the second does. As the number of subintervals changes, the existence of such roots changes. 131 / 191
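
The existence of such roots is easy to check numerically. The sketch below (ours) attempts to construct a k-th root of M through its eigendecomposition and accepts it only if the result is a genuine transition matrix; with the M above it fails for k = 2 and succeeds for k = 3, as stated.

```python
import numpy as np

# Discrete-time transition matrix over one sampling interval (values from the slide).
M = np.array([[0.25, 0.75],
              [0.625, 0.375]])

def stochastic_root(M, k, tol=1e-10):
    """Try to build a k-th root of M from its eigendecomposition.

    Returns the root if it is real, non-negative, and row-stochastic
    (so that k equally spaced switching dates are compatible with M),
    otherwise returns None.
    """
    eigvals, H = np.linalg.eig(M.astype(complex))
    roots = np.empty_like(eigvals)
    for i, lam in enumerate(eigvals):
        if abs(lam.imag) < tol and lam.real < 0.0:
            if k % 2 == 0:
                return None                              # negative eigenvalue: no real even root
            roots[i] = -abs(lam.real) ** (1.0 / k)       # real odd root of a negative number
        else:
            roots[i] = lam ** (1.0 / k)                  # principal root otherwise
    M0 = (H @ np.diag(roots) @ np.linalg.inv(H)).real
    if (M0 >= -tol).all() and np.allclose(M0.sum(axis=1), 1.0):
        return M0
    return None

print("half-interval root with M0 @ M0 = M      :", stochastic_root(M, 2))
print("third-interval root with M0 @ M0 @ M0 = M:\n", stochastic_root(M, 3))
```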

132 A Continuous Time Switching Process Assume that two rates describe waiting times for the transitions from state 0 to state 1 and vice versa, and that transition probabilities are independent across spells:

P(T_0 > s) = exp(−r_0 s), P(T_1 > s) = exp(−r_1 s). 132 / 191

133 A basic test of this specification, as before, is a graph of the log of the percent of the population surviving against time: one should observe linear functions. [Figure: −ln S(t) against duration of spell in state 0 (slope r_0) and against duration of spell in state 1 (slope r_1)] 133 / 191

134 The transition rates may be summarized by the matrix R:

R = [ −r_0  r_0 ; r_1  −r_1 ].

Reducing R to eigenvalue form,

R = H diag(λ_1, …, λ_s) H^{−1}, exp(ΔR) = H diag(exp(Δλ_1), …, exp(Δλ_s)) H^{−1}.

Here Δ = length of time between observations. 134 / 191

135 Taking logs, ΔR = ln M, or (1/Δ) ln M = R. If the data allow an R matrix with real roots,

R = [ −r_0  r_0 ; r_1  −r_1 ], (4)

then this model has a probabilistic interpretation, with

M = B diag(λ_1, …, λ_s) B^{−1}. 135 / 191

136 In general,

ln M = B diag( ln|λ_j| + i(arg(λ_j) + 2πk) ) B^{−1},

where the right-hand term uses the polar decomposition of the eigenvalues, and where, for

M = [ a  1−a ; 1−b  b ],

ln M = [ ln(a + b − 1) / (a + b − 2) ] · [ a−1  1−a ; 1−b  b−1 ]. 136 / 191

137 If a + b > 1, then the observations are compatible with a unique continuous time transition model with transition rates r_0 and r_1, where

r_0 = (1 − a) ln(a + b − 1) / [Δ(a + b − 2)],
r_1 = (1 − b) ln(a + b − 1) / [Δ(a + b − 2)].

If a + b < 1, then ln(a + b − 1) is complex and there is no probabilistic interpretation of the R matrix. 137 / 191
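
A short sketch (ours; the values of a, b, and Δ are hypothetical) that recovers r_0 and r_1 from an embeddable M and verifies the answer with the matrix exponential and logarithm:

```python
import numpy as np
from scipy.linalg import expm, logm

# Hypothetical two-state transition matrix estimated from two survey waves a span
# Delta apart: a = P(stay in 0), b = P(stay in 1). The numbers are made up.
a, b, Delta = 0.7, 0.8, 1.0
M = np.array([[a, 1 - a],
              [1 - b, b]])

if a + b > 1:
    # Closed-form rates from the slides (embeddable case).
    c = np.log(a + b - 1) / (Delta * (a + b - 2))
    r0, r1 = (1 - a) * c, (1 - b) * c
    R = np.array([[-r0, r0],
                  [r1, -r1]])
    print("r0, r1 =", r0, r1)
    print("exp(Delta * R) reproduces M:", np.allclose(expm(Delta * R), M))
    print("matrix log check           :", np.allclose(logm(M) / Delta, R))
else:
    print("a + b <= 1: no continuous-time Markov model is compatible with M")
```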

138 Suppose that the transition rates are dependent on some observables x:

r_0 = c_0 exp(x′β), r_1 = c_1 exp(x′β).

What happens when this form is used on discrete sampling data?

P(0, Δ) = exp(ΔR) = exp( Δ [ −c_0 exp(x′β)  c_0 exp(x′β) ; c_1 exp(x′β)  −c_1 exp(x′β) ] ) = [ m_00  m_01 ; m_10  m_11 ]. 138 / 191

139 The likelihood is then

L = ∏_{i=1}^{N} m_00^{δ_i^{00}} m_01^{δ_i^{01}} m_10^{δ_i^{10}} m_11^{δ_i^{11}} = ∏_{i=1}^{N} ∏_{j,k} [ exp(ΔR(x_i′β)) ]_{jk}^{δ_i^{jk}},

where δ_i^{jk} = 1 if individual i occupies state j at the first survey and state k at the second. Reference: J. Cohen and B. Singer, Malaria in Nigeria: Constrained continuous-time Markov models for discrete-time longitudinal data, in S. Levin (ed.), Lectures on Mathematics in the Life Sciences, American Mathematical Society, pp. 69-133, 1979. 139 / 191

140 Nonstationary Continuous Time Models and Discrete Time Data Here we ask when the transition matrix M = [ a  1−a ; 1−b  b ], based on observations in two states, is consistent with a non-stationary continuous time model:

1) ∂P(s, t)/∂t = P(s, t) R(t)
2) ∂P(s, t)/∂s = −R(s) P(s, t)
P(t, t) = I
P_{ij}(s, t) = P(X(t) = j | X(s) = i), for s < t

R(t) = [ −r_0(t)  r_0(t) ; r_1(t)  −r_1(t) ]. 140 / 191

141 Nonstationary Continuous Time Models and Discrete Time Data In the R matrix, r(t) can be any non-negative function of t. Solutions to the differential equations (1) and (2) are conditional probabilities. There exists a P(0, Δ) iff (a + b) > 1. Given that this condition holds, a problem with identification still needs to be solved: assumptions or indications from theory need to be applied to the parametrization. Ref.: G. S. Goodman 1970, An Intrinsic Time Model for Nonstationary Markov Chains, Zeitschrift für Wahrscheinlichkeitstheorie. 141 / 191

142 In the case where continuous event history data is available, the nonstationary Markov property can be tested and the nonstationarity may be characterized. First, use the language of a counting process:

N(t) = (N_01(t), N_10(t)) = (cumulative transitions from 0 → 1 in time 0 ≤ s < t, cumulative transitions from 1 → 0 in time 0 ≤ s < t). 142 / 191

143 If the Markov property holds, then by examining the 0 → 1 transitions,

∫_0^t r̂_0(u) du = Σ_{k: t_k ≤ t} 1/Y_0(t_k^{01}),

where Y_0(t) is the number of individuals at risk in state 0 at time t. 143 / 191

144 For separate time periods t_0, t_1, …, this gives rise to an estimated curve of ∫_0^t r_0(u) du, as below, which conforms to a Weibull function. [Figure: estimated integrated transition rate ∫_0^t r_0(u) du plotted at t_0, t_1, t_2, …] 144 / 191

145 If ∫_0^t r_0(u) du = t^α, then r_0(t) = αt^{α−1}. Notice that the estimation above applies the same Aalen theory of counting processes presented at the beginning of these notes. The variance is

Var[ ∫_0^t r(u) du ] = Σ_{k: t_k^{(0,1)} ≤ t} [ 1/Y_0(t_k^{(0,1)}) ]^2.

This technique cannot be applied with point sampling data. 145 / 191

146 An example of the theoretical restriction that must be imposed with point sampling data follows. It can be of a relatively simple form, as in the study of malaria: the climate's effect on the mosquito population led to the restrictive presumption that the winter incidence rates of malaria would be distinctly different from the summer's.

R(t) = R_1 for 0 < t < Δ/2, R_2 for Δ/2 < t < Δ
M_1 = P(0, Δ/2) = exp((Δ/2) R_1), M_2 = P(Δ/2, Δ) = exp((Δ/2) R_2)
M = [ a  1−a ; 1−a  a ] = M_1 M_2 = [ α  1−α ; 1−α  α ] [ β  1−β ; 1−β  β ]. 146 / 191

147 In order for the transitions over a year to have a Markovian probabilistic interpretation, a > 1/2; and for the 6-month seasons, α > 1/2 and β > 1/2. From the condition that M = M_1 M_2, it follows that a = αβ + (1 − α)(1 − β). Graphically, this restriction requires (α, β) to lie on a locus of points for each possible a. With data from point sampling available at six month intervals, the estimated â, α̂, and β̂ can be used to test this restriction. 147 / 191

148 Time Aggregation There are dangers in using comparisons between estimated discrete time Markov transitions and continuous time switching probabilities to draw conclusions regarding the dynamics of the process. For example, comparing the sum of the diagonal elements of the matrix M^k to the sum of the diagonals of P̂(0, kΔ) may be interpreted as an examination of those who stay in the originating state over all time periods k versus those who begin and end in the same state, but who may make transitions in and out in the intervening periods. 148 / 191

149 Since the second condition seems weaker, one might expect the following regularity: tr(M^k) < tr P̂(0, kΔ). This cannot be claimed with certainty, since the Markov model applied to discrete sampling points would not capture asynchronous switching times. Suppose the population were not homogeneous, but a simple mixture of movers and stayers, with stayers making up a proportion s, 0 ≤ s < 1, of the population. 149 / 191


More information

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Noname manuscript No. (will be inserted by the editor) A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Mai Zhou Yifan Yang Received: date / Accepted: date Abstract In this note

More information

Identification of Models of the Labor Market

Identification of Models of the Labor Market Identification of Models of the Labor Market Eric French and Christopher Taber, Federal Reserve Bank of Chicago and Wisconsin November 6, 2009 French,Taber (FRBC and UW) Identification November 6, 2009

More information

Likelihood Construction, Inference for Parametric Survival Distributions

Likelihood Construction, Inference for Parametric Survival Distributions Week 1 Likelihood Construction, Inference for Parametric Survival Distributions In this section we obtain the likelihood function for noninformatively rightcensored survival data and indicate how to make

More information

Physics 509: Bootstrap and Robust Parameter Estimation

Physics 509: Bootstrap and Robust Parameter Estimation Physics 509: Bootstrap and Robust Parameter Estimation Scott Oser Lecture #20 Physics 509 1 Nonparametric parameter estimation Question: what error estimate should you assign to the slope and intercept

More information

Decomposing Duration Dependence in a Stopping Time Model

Decomposing Duration Dependence in a Stopping Time Model Decomposing Duration Dependence in a Stopping Time Model Fernando Alvarez University of Chicago Katarína Borovičková New York University June 8, 2015 Robert Shimer University of Chicago Abstract We develop

More information

Exam C Solutions Spring 2005

Exam C Solutions Spring 2005 Exam C Solutions Spring 005 Question # The CDF is F( x) = 4 ( + x) Observation (x) F(x) compare to: Maximum difference 0. 0.58 0, 0. 0.58 0.7 0.880 0., 0.4 0.680 0.9 0.93 0.4, 0.6 0.53. 0.949 0.6, 0.8

More information

17 : Markov Chain Monte Carlo

17 : Markov Chain Monte Carlo 10-708: Probabilistic Graphical Models, Spring 2015 17 : Markov Chain Monte Carlo Lecturer: Eric P. Xing Scribes: Heran Lin, Bin Deng, Yun Huang 1 Review of Monte Carlo Methods 1.1 Overview Monte Carlo

More information

In contrast, parametric techniques (fitting exponential or Weibull, for example) are more focussed, can handle general covariates, but require

In contrast, parametric techniques (fitting exponential or Weibull, for example) are more focussed, can handle general covariates, but require Chapter 5 modelling Semi parametric We have considered parametric and nonparametric techniques for comparing survival distributions between different treatment groups. Nonparametric techniques, such as

More information

Semiparametric Generalized Linear Models

Semiparametric Generalized Linear Models Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student

More information

More on Roy Model of Self-Selection

More on Roy Model of Self-Selection V. J. Hotz Rev. May 26, 2007 More on Roy Model of Self-Selection Results drawn on Heckman and Sedlacek JPE, 1985 and Heckman and Honoré, Econometrica, 1986. Two-sector model in which: Agents are income

More information

STA 4273H: Sta-s-cal Machine Learning

STA 4273H: Sta-s-cal Machine Learning STA 4273H: Sta-s-cal Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 2 In our

More information

Survival Analysis: Weeks 2-3. Lu Tian and Richard Olshen Stanford University

Survival Analysis: Weeks 2-3. Lu Tian and Richard Olshen Stanford University Survival Analysis: Weeks 2-3 Lu Tian and Richard Olshen Stanford University 2 Kaplan-Meier(KM) Estimator Nonparametric estimation of the survival function S(t) = pr(t > t) The nonparametric estimation

More information

Survival Analysis I (CHL5209H)

Survival Analysis I (CHL5209H) Survival Analysis Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca January 7, 2015 31-1 Literature Clayton D & Hills M (1993): Statistical Models in Epidemiology. Not really

More information

Part VII. Accounting for the Endogeneity of Schooling. Endogeneity of schooling Mean growth rate of earnings Mean growth rate Selection bias Summary

Part VII. Accounting for the Endogeneity of Schooling. Endogeneity of schooling Mean growth rate of earnings Mean growth rate Selection bias Summary Part VII Accounting for the Endogeneity of Schooling 327 / 785 Much of the CPS-Census literature on the returns to schooling ignores the choice of schooling and its consequences for estimating the rate

More information

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca Lenz-Tönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1

More information

Introduction to Reliability Theory (part 2)

Introduction to Reliability Theory (part 2) Introduction to Reliability Theory (part 2) Frank Coolen UTOPIAE Training School II, Durham University 3 July 2018 (UTOPIAE) Introduction to Reliability Theory 1 / 21 Outline Statistical issues Software

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

Robust estimates of state occupancy and transition probabilities for Non-Markov multi-state models

Robust estimates of state occupancy and transition probabilities for Non-Markov multi-state models Robust estimates of state occupancy and transition probabilities for Non-Markov multi-state models 26 March 2014 Overview Continuously observed data Three-state illness-death General robust estimator Interval

More information

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner Fundamentals CS 281A: Statistical Learning Theory Yangqing Jia Based on tutorial slides by Lester Mackey and Ariel Kleiner August, 2011 Outline 1 Probability 2 Statistics 3 Linear Algebra 4 Optimization

More information

Survival Distributions, Hazard Functions, Cumulative Hazards

Survival Distributions, Hazard Functions, Cumulative Hazards BIO 244: Unit 1 Survival Distributions, Hazard Functions, Cumulative Hazards 1.1 Definitions: The goals of this unit are to introduce notation, discuss ways of probabilistically describing the distribution

More information

Decomposing Duration Dependence in a Stopping Time Model

Decomposing Duration Dependence in a Stopping Time Model Decomposing Duration Dependence in a Stopping Time Model Fernando Alvarez University of Chicago Katarína Borovičková New York University February 5, 2016 Robert Shimer University of Chicago Abstract We

More information

Hakone Seminar Recent Developments in Statistics

Hakone Seminar Recent Developments in Statistics Hakone Seminar Recent Developments in Statistics November 12-14, 2015 Hotel Green Plaza Hakone: http://www.hgp.co.jp/language/english/sp/ Organizer: Masanobu TANIGUCHI (Research Institute for Science &

More information

Default Priors and Effcient Posterior Computation in Bayesian

Default Priors and Effcient Posterior Computation in Bayesian Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

Outline of GLMs. Definitions

Outline of GLMs. Definitions Outline of GLMs Definitions This is a short outline of GLM details, adapted from the book Nonparametric Regression and Generalized Linear Models, by Green and Silverman. The responses Y i have density

More information

Stochastic Modelling Unit 1: Markov chain models

Stochastic Modelling Unit 1: Markov chain models Stochastic Modelling Unit 1: Markov chain models Russell Gerrard and Douglas Wright Cass Business School, City University, London June 2004 Contents of Unit 1 1 Stochastic Processes 2 Markov Chains 3 Poisson

More information

Estimation for Modified Data

Estimation for Modified Data Definition. Estimation for Modified Data 1. Empirical distribution for complete individual data (section 11.) An observation X is truncated from below ( left truncated) at d if when it is at or below d

More information

Proportional hazards regression

Proportional hazards regression Proportional hazards regression Patrick Breheny October 8 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/28 Introduction The model Solving for the MLE Inference Today we will begin discussing regression

More information

A Distributional Framework for Matched Employer Employee Data

A Distributional Framework for Matched Employer Employee Data A Distributional Framework for Matched Employer Employee Data (Preliminary) Interactions - BFI Bonhomme, Lamadon, Manresa University of Chicago MIT Sloan September 26th - 2015 Wage Dispersion Wages are

More information

Statistics: Learning models from data

Statistics: Learning models from data DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial

More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods By Oleg Makhnin 1 Introduction a b c M = d e f g h i 0 f(x)dx 1.1 Motivation 1.1.1 Just here Supresses numbering 1.1.2 After this 1.2 Literature 2 Method 2.1 New math As

More information

Brief Review on Estimation Theory

Brief Review on Estimation Theory Brief Review on Estimation Theory K. Abed-Meraim ENST PARIS, Signal and Image Processing Dept. abed@tsi.enst.fr This presentation is essentially based on the course BASTA by E. Moulines Brief review on

More information

Limited Dependent Variables and Panel Data

Limited Dependent Variables and Panel Data and Panel Data June 24 th, 2009 Structure 1 2 Many economic questions involve the explanation of binary variables, e.g.: explaining the participation of women in the labor market explaining retirement

More information

1 Degree distributions and data

1 Degree distributions and data 1 Degree distributions and data A great deal of effort is often spent trying to identify what functional form best describes the degree distribution of a network, particularly the upper tail of that distribution.

More information

Notes on Heterogeneity, Aggregation, and Market Wage Functions: An Empirical Model of Self-Selection in the Labor Market

Notes on Heterogeneity, Aggregation, and Market Wage Functions: An Empirical Model of Self-Selection in the Labor Market Notes on Heterogeneity, Aggregation, and Market Wage Functions: An Empirical Model of Self-Selection in the Labor Market Heckman and Sedlacek, JPE 1985, 93(6), 1077-1125 James Heckman University of Chicago

More information

Random variables. DS GA 1002 Probability and Statistics for Data Science.

Random variables. DS GA 1002 Probability and Statistics for Data Science. Random variables DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Motivation Random variables model numerical quantities

More information

The Weibull Distribution

The Weibull Distribution The Weibull Distribution Patrick Breheny October 10 Patrick Breheny University of Iowa Survival Data Analysis (BIOS 7210) 1 / 19 Introduction Today we will introduce an important generalization of the

More information

Survival Analysis Math 434 Fall 2011

Survival Analysis Math 434 Fall 2011 Survival Analysis Math 434 Fall 2011 Part IV: Chap. 8,9.2,9.3,11: Semiparametric Proportional Hazards Regression Jimin Ding Math Dept. www.math.wustl.edu/ jmding/math434/fall09/index.html Basic Model Setup

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where

STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where STAT 331 Accelerated Failure Time Models Previously, we have focused on multiplicative intensity models, where h t z) = h 0 t) g z). These can also be expressed as H t z) = H 0 t) g z) or S t z) = e Ht

More information

Survival Analysis. Lu Tian and Richard Olshen Stanford University

Survival Analysis. Lu Tian and Richard Olshen Stanford University 1 Survival Analysis Lu Tian and Richard Olshen Stanford University 2 Survival Time/ Failure Time/Event Time We will introduce various statistical methods for analyzing survival outcomes What is the survival

More information

R. Koenker Spring 2017 Economics 574 Problem Set 1

R. Koenker Spring 2017 Economics 574 Problem Set 1 R. Koenker Spring 207 Economics 574 Problem Set.: Suppose X, Y are random variables with joint density f(x, y) = x 2 + xy/3 x [0, ], y [0, 2]. Find: (a) the joint df, (b) the marginal density of X, (c)

More information

Hypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006

Hypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006 Hypothesis Testing Part I James J. Heckman University of Chicago Econ 312 This draft, April 20, 2006 1 1 A Brief Review of Hypothesis Testing and Its Uses values and pure significance tests (R.A. Fisher)

More information

Multivariate Survival Analysis

Multivariate Survival Analysis Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in

More information

Computational treatment of the error distribution in nonparametric regression with right-censored and selection-biased data

Computational treatment of the error distribution in nonparametric regression with right-censored and selection-biased data Computational treatment of the error distribution in nonparametric regression with right-censored and selection-biased data Géraldine Laurent 1 and Cédric Heuchenne 2 1 QuantOM, HEC-Management School of

More information

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates Communications in Statistics - Theory and Methods ISSN: 0361-0926 (Print) 1532-415X (Online) Journal homepage: http://www.tandfonline.com/loi/lsta20 Analysis of Gamma and Weibull Lifetime Data under a

More information

ECON 721: Lecture Notes on Duration Analysis. Petra E. Todd

ECON 721: Lecture Notes on Duration Analysis. Petra E. Todd ECON 721: Lecture Notes on Duration Analysis Petra E. Todd Fall, 213 2 Contents 1 Two state Model, possible non-stationary 1 1.1 Hazard function.......................... 1 1.2 Examples.............................

More information

GOV 2001/ 1002/ E-2001 Section 10 1 Duration II and Matching

GOV 2001/ 1002/ E-2001 Section 10 1 Duration II and Matching GOV 2001/ 1002/ E-2001 Section 10 1 Duration II and Matching Mayya Komisarchik Harvard University April 13, 2016 1 Heartfelt thanks to all of the Gov 2001 TFs of yesteryear; this section draws heavily

More information

CIMAT Taller de Modelos de Capture y Recaptura Known Fate Survival Analysis

CIMAT Taller de Modelos de Capture y Recaptura Known Fate Survival Analysis CIMAT Taller de Modelos de Capture y Recaptura 2010 Known Fate urvival Analysis B D BALANCE MODEL implest population model N = λ t+ 1 N t Deeper understanding of dynamics can be gained by identifying variation

More information

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008 Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:

More information

Survival Analysis. Stat 526. April 13, 2018

Survival Analysis. Stat 526. April 13, 2018 Survival Analysis Stat 526 April 13, 2018 1 Functions of Survival Time Let T be the survival time for a subject Then P [T < 0] = 0 and T is a continuous random variable The Survival function is defined

More information

Maximum likelihood estimation of a log-concave density based on censored data

Maximum likelihood estimation of a log-concave density based on censored data Maximum likelihood estimation of a log-concave density based on censored data Dominic Schuhmacher Institute of Mathematical Statistics and Actuarial Science University of Bern Joint work with Lutz Dümbgen

More information

Lecture 7. Poisson and lifetime processes in risk analysis

Lecture 7. Poisson and lifetime processes in risk analysis Lecture 7. Poisson and lifetime processes in risk analysis Jesper Rydén Department of Mathematics, Uppsala University jesper.ryden@math.uu.se Statistical Risk Analysis Spring 2014 Example: Life times of

More information

Accelerated Failure Time Models

Accelerated Failure Time Models Accelerated Failure Time Models Patrick Breheny October 12 Patrick Breheny University of Iowa Survival Data Analysis (BIOS 7210) 1 / 29 The AFT model framework Last time, we introduced the Weibull distribution

More information

Chapter 9. Non-Parametric Density Function Estimation

Chapter 9. Non-Parametric Density Function Estimation 9-1 Density Estimation Version 1.2 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least

More information

1 The problem of survival analysis

1 The problem of survival analysis 1 The problem of survival analysis Survival analysis concerns analyzing the time to the occurrence of an event. For instance, we have a dataset in which the times are 1, 5, 9, 20, and 22. Perhaps those

More information

Modelling geoadditive survival data

Modelling geoadditive survival data Modelling geoadditive survival data Thomas Kneib & Ludwig Fahrmeir Department of Statistics, Ludwig-Maximilians-University Munich 1. Leukemia survival data 2. Structured hazard regression 3. Mixed model

More information

Continuous-time Markov Chains

Continuous-time Markov Chains Continuous-time Markov Chains Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ October 23, 2017

More information

ST745: Survival Analysis: Nonparametric methods

ST745: Survival Analysis: Nonparametric methods ST745: Survival Analysis: Nonparametric methods Eric B. Laber Department of Statistics, North Carolina State University February 5, 2015 The KM estimator is used ubiquitously in medical studies to estimate

More information

Chapter 9. Non-Parametric Density Function Estimation

Chapter 9. Non-Parametric Density Function Estimation 9-1 Density Estimation Version 1.1 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least

More information

AGEC 661 Note Fourteen

AGEC 661 Note Fourteen AGEC 661 Note Fourteen Ximing Wu 1 Selection bias 1.1 Heckman s two-step model Consider the model in Heckman (1979) Y i = X iβ + ε i, D i = I {Z iγ + η i > 0}. For a random sample from the population,

More information

Estimation of Quantiles

Estimation of Quantiles 9 Estimation of Quantiles The notion of quantiles was introduced in Section 3.2: recall that a quantile x α for an r.v. X is a constant such that P(X x α )=1 α. (9.1) In this chapter we examine quantiles

More information

Testing Restrictions and Comparing Models

Testing Restrictions and Comparing Models Econ. 513, Time Series Econometrics Fall 00 Chris Sims Testing Restrictions and Comparing Models 1. THE PROBLEM We consider here the problem of comparing two parametric models for the data X, defined by

More information

Multi-state Models: An Overview

Multi-state Models: An Overview Multi-state Models: An Overview Andrew Titman Lancaster University 14 April 2016 Overview Introduction to multi-state modelling Examples of applications Continuously observed processes Intermittently observed

More information

One-Parameter Processes, Usually Functions of Time

One-Parameter Processes, Usually Functions of Time Chapter 4 One-Parameter Processes, Usually Functions of Time Section 4.1 defines one-parameter processes, and their variations (discrete or continuous parameter, one- or two- sided parameter), including

More information

September Math Course: First Order Derivative

September Math Course: First Order Derivative September Math Course: First Order Derivative Arina Nikandrova Functions Function y = f (x), where x is either be a scalar or a vector of several variables (x,..., x n ), can be thought of as a rule which

More information

Distribution Fitting (Censored Data)

Distribution Fitting (Censored Data) Distribution Fitting (Censored Data) Summary... 1 Data Input... 2 Analysis Summary... 3 Analysis Options... 4 Goodness-of-Fit Tests... 6 Frequency Histogram... 8 Comparison of Alternative Distributions...

More information

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes:

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes: Practice Exam 1 1. Losses for an insurance coverage have the following cumulative distribution function: F(0) = 0 F(1,000) = 0.2 F(5,000) = 0.4 F(10,000) = 0.9 F(100,000) = 1 with linear interpolation

More information