Maximum likelihood estimation of a log-concave density based on censored data

1 Maximum likelihood estimation of a log-concave density based on censored data. Dominic Schuhmacher, Institute of Mathematical Statistics and Actuarial Science, University of Bern. Joint work with Lutz Dümbgen and Kaspar Rufibach. Swiss Statistics Seminar, 25 October 2011.

2 Overview. 1. A review of log-concave density estimation for exact data. 2. Log-concave density estimation for censored data: problem formulation, theoretical results, an EM algorithm, simulated and real data examples.

3 Part 1: Log-concave density estimation for exact data

6 Part 1: log-concave densities and exact data. Problem: nonparametric estimation of the density $f$ of a distribution $P$ on $\mathbb{R}$ based on independent observations $X_1, X_2, \dots, X_n$ using maximum likelihood. The log-likelihood
$$\ell(f) := \sum_{i=1}^n \log f(X_i)$$
is not bounded from above on the space of all densities $f$. Our approach: impose a shape constraint. Unimodality is not enough.

7 Part 1: log-concave densities and exact data. Unimodality is not enough! Distribute mass 1/2 uniformly over $[X_{(1)}, X_{(n)}]$ (blue) and concentrate mass 1/2 more and more closely around one individual observation (red): the density stays unimodal while the likelihood tends to infinity.

9 Part 1: log-concave densities and exact data. Log-concavity: a density $f$ is called log-concave if $\varphi := \log f : \mathbb{R} \to [-\infty, \infty)$ is a concave function. It turns out that the class $\mathcal{F}_{lc}$ of log-concave densities is a large, rather flexible class with many nice properties.

14 Part 1: log-concave densities and exact data. Properties of $\mathcal{F}_{lc}$:
- Every log-concave density is unimodal.
- Ibragimov (1956): $f$ is log-concave if and only if $f * g$ is unimodal for every unimodal $g$ (log-concavity has been called "strong unimodality").
- $\mathcal{F}_{lc}$ is closed under convolution and weak limits.
- Schuhmacher, Hüsler and Dümbgen (2009/11), strong continuity property: weak convergence implies pointwise convergence of densities, and convergence in an exponentially weighted total variation distance: $\int \exp(\delta |x|)\,|f_n(x) - f(x)|\,dx \to 0$ for some $\delta > 0$.
- Every log-concave density has subexponential tails and a non-decreasing hazard rate.

15 Part 1: log-concave densities and exact data. Parametric families of log-concave distributions. The class of log-concave densities contains the following parametric families: normal, logistic, exponential, Laplace, uniform, Gumbel; Gamma and Weibull with shape parameter $\ge 1$; Beta with both shape parameters $\ge 1$.
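
These membership conditions are easy to check numerically. A minimal R sketch (my illustration, not from the talk) tests concavity of $\log f$ via second-order finite differences on an equispaced grid:

```r
## Check log-concavity numerically: all second differences of log f(x)
## on an equispaced grid must be <= 0 (up to rounding error).
is_logconcave <- function(f, from, to, n = 1000) {
  x <- seq(from, to, length.out = n)
  all(diff(log(f(x)), differences = 2) <= 1e-10)
}

is_logconcave(function(x) dgamma(x, shape = 3),   0.01, 20)  # TRUE:  shape >= 1
is_logconcave(function(x) dgamma(x, shape = 0.5), 0.01, 20)  # FALSE: shape <  1
```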

17 Part 1: log-concave densities and exact data. Log-concave density estimation. Nonparametric maximum likelihood estimation under the log-concavity constraint has been studied independently in Rufibach (2006) and Dümbgen and Rufibach (2009), and in Pal, Woodroofe and Meyer (2007). Task: for $n \ge 2$ find
$$\hat f_n \in \operatorname*{arg\,max}_{f \in \mathcal{F}_{lc}} \sum_{i=1}^n \log f(X_i),$$
where $\mathcal{F}_{lc} := \{f : \mathbb{R} \to \mathbb{R}_+ \mid f \text{ is a log-concave density}\}$.

20 Part 1: log-concave densities and exact data. First results. Equivalent problem (by "Silverman's trick"): find
$$\hat\varphi_n \in \operatorname*{arg\,max}_{\varphi \in \Phi} \underbrace{\frac{1}{n}\sum_{i=1}^n \varphi(X_i) - \int \exp\varphi(x)\,dx}_{=: L(\varphi)},$$
where $\Phi$ is the set of all concave (and upper semicontinuous, say) functions $\mathbb{R} \to [-\infty, \infty)$.
- Shape: $\hat\varphi_n$ is piecewise linear, with changes of slope only possible at the data points $X_i$, and $\mathrm{dom}(\hat\varphi_n) = [X_{(1)}, X_{(n)}]$.
- $L$ depends on $\varphi$ only via $\varphi(X_1), \dots, \varphi(X_n)$; it is strictly concave and coercive, defined on a closed convex cone in $\mathbb{R}^n$. $\Longrightarrow$ Existence and uniqueness of the NPMLE $\hat f_n$.

25 Part 1: log-concave densities and exact data. Further properties of $\hat f_n$ (simplified) (Dümbgen and Rufibach, 2009; Balabdaoui, Rufibach and Wellner, 2009):
- Mean($\hat f_n$) $=$ empirical mean; Variance($\hat f_n$) $\le$ empirical variance.
- Uniform consistency: $\|\hat f_n - f\|_\infty \to 0$ and $\|\hat F_n - F\|_\infty \to 0$.
- Rate-optimality and adaptivity: for any compact interval $T \subset \mathrm{int}\{f > 0\}$ and a true $f$ that is Hölder continuous with exponent $\beta \in [1, 2]$, we have $\max_{x \in T} |\hat f_n(x) - f(x)| = O_p\big((\log(n)/n)^{\beta/(2\beta+1)}\big)$.
- Pointwise limit distributions: if $f(x_0) > 0$, $\varphi \in C^2$ around $x_0$, and $\varphi''(x_0) \neq 0$, then $n^{2/5}\big(\hat f_n(x_0) - f(x_0)\big) \xrightarrow{\;D\;} c(x_0, f)\,H''(0)$, where $H$ is the so-called lower invelope of $Y(t) = \int_0^t W(s)\,ds - t^4$, with $W$ a two-sided Brownian motion starting in 0.
- Results for derived quantities: mode, distance between knots.

27 Part 1: log-concave densities and exact data. Computation. We have a concave maximization problem over a closed convex cone in $\mathbb{R}^n$, with $n$ potentially large. There is a fast active set algorithm by Dümbgen and Rufibach (2011), available in the R package logcondens.

30 Part 1: log-concave densities and exact data. Simulation example: sample 100 points from $\Gamma(3, 1)$. [Figure: log-concave density estimation from i.i.d. data; density estimate with dashed vertical lines indicating the knots of the log-density; curves show the log-concave NPMLE $\hat f_n$ and the smoothed log-concave estimator $\hat f_n^*$.]
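
This example is easy to reproduce with the logcondens package mentioned above; logConDens is its main fitting routine, and smoothed = TRUE also computes the smoothed estimator $\hat f_n^*$. A minimal sketch (plot behaviour per my recollection of the package interface):

```r
library(logcondens)

set.seed(1)
x <- rgamma(100, shape = 3, rate = 1)   # 100 exact observations from Gamma(3, 1)

fit <- logConDens(x, smoothed = TRUE)   # NPMLE via the active set algorithm
plot(fit)                               # density estimate with knots marked
curve(dgamma(x, shape = 3, rate = 1), add = TRUE, lty = 3)  # true density
```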

33 Part 1: log-concave densities and exact data. Related work:
- Multivariate case (Cule, Samworth and Stewart, 2010, JRSS B); R package LogConcDEAD (Cule, Gramacy and Samworth, 2009, J. Stat. Soft.).
- General approximation theory with applications to additive regression models with log-concave error distribution (Dümbgen, Samworth and Schuhmacher, 2010, Ann. Stat.).
- Discrete case (Balabdaoui and Rufibach, preprint 2011).

34 Part 2: Log-concave density estimation for censored data

38 Censored data. For simplicity of notation: we estimate densities on $\mathbb{R}_+$; think of the distribution of the time to a certain event. Instead of exact values $X_i$ we observe intervals $\mathcal{X}_i$ containing $X_i$. The following cases are possible: $\mathcal{X}_i = \{X_i\}$; $\mathcal{X}_i = (L_i, R_i]$ where $0 \le L_i < R_i < \infty$; $\mathcal{X}_i = (L_i, \infty)$ where $L_i \ge 0$. Write generally $L_i := \inf \mathcal{X}_i$ and $R_i := \sup \mathcal{X}_i$. We would like to estimate the underlying distribution of the $X_i$'s. (It is not clear what this means without further specification.)

42 A concrete censoring model. Assume that the $i$-th individual is inspected at times $T_{ij}$, $1 \le j \le K_i$, where $0 =: T_{i0} < T_{i1} < T_{i2} < \dots < T_{i,K_i}$. If $K_i = \infty$, assume that $T_{i,\infty} := \sup_j T_{ij} = \infty$. Let furthermore $Y_{ij}$ be an indicator for exact observability in the $j$-th inter-inspection interval
$$I_{ij} := \begin{cases} (T_{ij}, T_{i,j+1}] & \text{if } 0 \le j < K_i; \\ (T_{ij}, \infty) & \text{if } j = K_i. \end{cases}$$
Then the $X_i$ translate into observations $\mathcal{X}_i$ as follows:
$$\mathcal{X}_i := \begin{cases} \{X_i\} & \text{if } X_i \in I_{ij} \text{ and } Y_{ij} = 1; \\ I_{ij} & \text{if } X_i \in I_{ij} \text{ and } Y_{ij} = 0. \end{cases}$$
Note: the $K_i$, $T_{ij}$, and $Y_{ij}$ may be arbitrarily dependent random variables.
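
As an illustration of this mechanism (my sketch, not code from the talk), the following R code draws inspection times from a standard Poisson process and i.i.d. Bernoulli observability indicators, one admissible special case of the model:

```r
set.seed(2)
n <- 100
x <- rgamma(n, shape = 3, rate = 1)          # latent event times X_i

censor_one <- function(xi, rate = 1, p_exact = 0.5) {
  t_prev <- 0
  repeat {                                   # walk through inspection times T_ij
    t_next <- t_prev + rexp(1, rate)
    if (xi <= t_next) break
    t_prev <- t_next
  }
  if (rbinom(1, 1, p_exact) == 1) c(xi, xi)  # Y_ij = 1: observed exactly, L = R
  else c(t_prev, t_next)                     # Y_ij = 0: censored into (L, R]
}

intervals <- t(vapply(x, censor_one, numeric(2)))
colnames(intervals) <- c("L", "R")
head(intervals)
```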

46 Special cases.
- Right-censoring: if the event happens prior to a single inspection time, it is observed exactly, otherwise censored: $K_i \equiv 1$, $Y_{i0} \equiv 1$, $Y_{i1} \equiv 0$.
- Current status model: it is only known whether the event happened before or after a single inspection time: $K_i \equiv 1$, $Y_{i0} \equiv Y_{i1} \equiv 0$.
- Mixed case interval-censoring: for each individual it is known in which of a number of contiguous intervals the event happened: $K_i$ finite-valued, $Y_{ij} \equiv 0$ for all $j$.
- Rounding and binning: the $X_i$'s are known up to rounding to the closest integer (say): $K_i \equiv \infty$, $Y_{ij} \equiv 0$, $T_{ij} \equiv j - 1/2$ for all $i$ and $j \ge 1$.

48 Conditional log-likelihood. Assume that conditionally on all $K_i$, $T_{ij}$, and $Y_{ij}$, the random variables $X_1, \dots, X_n$ are i.i.d. with density $f = \exp\varphi$. The normalized log-likelihood for $\varphi$ is then
$$\ell(\varphi) := \ell(\varphi; \mathcal{X}_1, \dots, \mathcal{X}_n) := \frac{1}{n}\sum_{i=1}^n \bigg[ 1\{L_i = R_i\}\,\varphi(X_i) + 1\{L_i < R_i\}\,\log\Big(\int_{L_i}^{R_i} \exp\varphi(x)\,dx\Big) \bigg],$$
where $L_i = \inf \mathcal{X}_i$ and $R_i = \sup \mathcal{X}_i$.
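
In code this criterion translates one-to-one. The sketch below (my illustration) evaluates $\ell$ for a density $f = \exp\varphi$ given as an R function, with the intervals stored as in the simulation above:

```r
## Normalized log-likelihood for censored data: exact observations contribute
## log f(X_i), censored ones contribute log P_f(X_i in (L_i, R_i]).
loglik_cens <- function(f, intervals) {
  terms <- apply(intervals, 1, function(lr) {
    if (lr[1] == lr[2]) log(f(lr[1]))
    else log(integrate(f, lr[1], lr[2])$value)  # handles R = Inf as well
  })
  mean(terms)
}

## Score the true Gamma(3, 1) density on the simulated intervals:
loglik_cens(function(x) dgamma(x, shape = 3, rate = 1), intervals)
```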

50 Likelihood maximization. Maximize $\ell(\varphi)$ over all $\varphi \in \Phi$ that satisfy $\int \exp\varphi(x)\,dx = 1$. Equivalently: maximize
$$L(\varphi) = \frac{1}{n}\sum_{i=1}^n \bigg[ 1\{L_i = R_i\}\,\varphi(X_i) + 1\{L_i < R_i\}\,\log\Big(\int_{L_i}^{R_i} \exp\varphi(x)\,dx\Big)\bigg] - \int_0^\infty \exp\varphi(x)\,dx$$
over $\varphi \in \Phi$.

53 Shape requirements for a maximizer $\varphi^*$. Let $0 \le \tau_1 < \tau_2 < \dots < \tau_q$ be the endpoints of the data intervals, i.e. $\{\tau_1, \dots, \tau_q\} = \{L_i : 1 \le i \le n\} \cup \{R_i : 1 \le i \le n\}$. Then any $\varphi^* \in \arg\max_{\varphi \in \Phi} L(\varphi)$ satisfies:
- $[\min_i R_i, \max_i L_i] \subset \mathrm{dom}(\varphi^*) \subset [\tau_1, \tau_q]$;
- $\varphi^*$ is linear on every interval $(\tau_j, \tau_{j+1}) \subset \mathrm{dom}(\varphi^*) \setminus \bigcup_{i=1}^n [L_i, R_i]$.

56 Shape simplifications for a maximizer. Theorem (shape). Suppose that $\arg\max_{\varphi \in \Phi} L(\varphi) \neq \emptyset$. Then there exists a maximizer $\varphi^*$ with $\mathrm{dom}(\varphi^*) = [\tau_{j_1}, \tau_{j_2}] \subset \mathbb{R}_+$ for some $j_1, j_2$ that is piecewise linear on $\mathrm{dom}(\varphi^*)$ with at most one change of slope in any interval $(\tau_j, \tau_{j+1})$. There is no change of slope in:
- the two extreme intervals $(\tau_{j_1}, \tau_{j_1+1})$ and $(\tau_{j_2-1}, \tau_{j_2})$;
- the interval $(\tau_{j_1+1}, \tau_{j_1+2})$ unless there is an $i$ such that $L_i = R_i = \tau_{j_1}$, and the interval $(\tau_{j_2-2}, \tau_{j_2-1})$ unless there is an $i$ such that $L_i = R_i = \tau_{j_2}$;
- any interval $(\tau_j, \tau_{j+1}) \subset \mathrm{dom}(\varphi^*) \setminus \bigcup_{i=1}^n [L_i, R_i]$.

57 Existence of a maximizer. Theorem (existence). Assume that there is no $i_0 \in \{1, \dots, n\}$ with $L_{i_0} = R_{i_0} = X_{i_0}$ and $\bigcap_{i=1}^n [L_i, R_i] = \{X_{i_0}\}$. Then $\arg\max_{\varphi \in \mathcal{G}_{conc}} L(\varphi) \neq \emptyset$, where $\mathcal{G}_{conc} \subset \Phi$ consists of all piecewise linear functions with domain and changes of slope according to the shape theorem. (We only allow slope changes on a fine grid $t_1 < t_2 < \dots < t_m$ containing the finite $\tau_j$.)

60 Counter-example. Suppose that there is an $i_0 \in \{1, \dots, n\}$ with $L_{i_0} = R_{i_0} = X_{i_0}$ and $\bigcap_{i=1}^n [L_i, R_i] = \{X_{i_0}\}$. Assuming that there are intervals to the left and to the right of $X_{i_0} =: \tau_{j_0}$ (otherwise the idea is easily adapted), we can obtain an arbitrarily large log-likelihood by defining $\varphi$ as a triangle function with domain $[\tau_{j_0-1}, \tau_{j_0+1}]$ and peak in $\tau_{j_0}$: the integrals of $\exp\varphi$ over $[\tau_{j_0-1}, \tau_{j_0}]$ and $[\tau_{j_0}, \tau_{j_0+1}]$ can be kept constant, while $\varphi(\tau_{j_0})$, and hence $L(\varphi)$, goes towards $\infty$.

64 Proof idea of the existence theorem. Consider it as a problem of maximizing $L(\varphi)$ over a convex cone in $[-\infty, \infty)^m \times [-\infty, 0)$, where $\varphi = (\varphi_1, \dots, \varphi_m, \varphi_{m+1})$ with $\varphi_i := \varphi(t_i)$ and $\varphi_{m+1} := \varphi'(t_m+)$.
1. The log-likelihood is continuous on $\mathcal{G}_{conc} \cap \big([-\infty, k]^m \times [-\infty, -1/k]\big)$ for every $k \in \mathbb{N}$, which is compact in the extended Euclidean topology $\Longrightarrow$ the maximum over this set exists.
2. Show $L(\varphi^{(k)}) \to -\infty$ for every sequence $(\varphi^{(k)})_k \subset \mathcal{G}_{conc}$ with $\max_{1 \le i \le m+1} |\varphi^{(k)}_i| \to \infty$, by somewhat tedious calculations, where $\varphi^{(k)}_{m+1} := \log\big(-\varphi^{(k)\prime}(t_m+)\big)$.
Assuming now that there is no maximizer in $\mathcal{G}_{conc} \cap \big([-\infty, \infty)^m \times [-\infty, 0)\big)$ leads to a contradiction.

67 Open problems: uniqueness; consistency and rates; everything else...

71 Computation. Still a maximization over a cone in typically very high dimension (now $m$). But our log-likelihood function is not concave anymore! We try an EM algorithm. Remember:
$$\ell(\varphi) := \frac{1}{n}\sum_{i=1}^n \bigg[ 1\{L_i = R_i\}\,\varphi(X_i) + 1\{L_i < R_i\}\,\log\Big(\int_{L_i}^{R_i} \exp\varphi(x)\,dx\Big)\bigg],$$
the incomplete-data log-likelihood (normalized);
$$\ell^*(\varphi) := \frac{1}{n}\sum_{i=1}^n \varphi(X_i),$$
the complete-data log-likelihood (normalized).

73 EM algorithm. E step: given $\varphi_r \in \mathcal{G}_{conc}$ (log-densities in $\mathcal{G}_{conc}$), compute
$$\bar\ell^*(\varphi; \varphi_r) := E_{\varphi_r}\big(\ell^*(\varphi) \mid X_i \in \mathcal{X}_i \text{ for all } i\big) = \frac{1}{n}\sum_{i=1}^n E_{\varphi_r}\big(\varphi(X_i) \mid X_i \in \mathcal{X}_i\big) = \frac{1}{n}\sum_{i=1}^n \bigg[ 1\{L_i = R_i\}\,\varphi(X_i) + 1\{L_i < R_i\}\,\frac{E_{\varphi_r}\big(\varphi(X_i)\,1\{X_i \in \mathcal{X}_i\}\big)}{P_{\varphi_r}\big(X_i \in \mathcal{X}_i\big)}\bigg].$$
This is easy, since $\varphi_r$ and $\varphi$ are piecewise linear.
M step: maximize $\bar\ell^*(\varphi; \varphi_r)$ over all $\varphi \in \mathcal{G}_{conc}$ with $\mathrm{dom}(\varphi) \supset \mathrm{dom}(\varphi_r)$ (the domain will not get smaller). Call the maximizer $\varphi_{r+1}$.
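
For a single censored observation, the conditional expectation in the E step is just a ratio of two integrals. A small numerical sketch (my illustration; the talk's implementation exploits piecewise linearity instead of generic quadrature):

```r
## E_{phi_r}[ phi(X) | X in (l, r] ] = int_l^r phi(x) e^{phi_r(x)} dx
##                                   / int_l^r        e^{phi_r(x)} dx
e_step_term <- function(phi, phi_r, l, r) {
  num <- integrate(function(x) phi(x) * exp(phi_r(x)), l, r)$value
  den <- integrate(function(x) exp(phi_r(x)), l, r)$value
  num / den
}

phi_r <- function(x) -x          # current iterate: log of the Exp(1) density
phi   <- function(x) 1 - 2 * x   # candidate linear piece of the new log-density
e_step_term(phi, phi_r, l = 1, r = 3)
```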

77 Reduction to the active set algorithm.
- $E_{\varphi_r}\big(\varphi(X_i)\,1\{X_i \in [a, b]\}\big)$ is a linear combination of $\varphi(a)$ and $\varphi(b)$ if $\varphi_r$ and $\varphi$ are linear on $[a, b]$, $0 \le a \le b < \infty$.
- $E_{\varphi_r}\big(\varphi(X_i)\,1\{X_i \in [a, \infty)\}\big)$ is a linear combination of $\varphi(a)$ and $\varphi'(a+)$ if $\varphi_r$ and $\varphi$ are linear on $[a, \infty)$, $0 \le a < \infty$.
Therefore
$$\bar\ell^*(\varphi; \varphi_r) = \sum_{j=1}^m w_j\,\varphi(t_j) + w_{m+1}\,\varphi'(t_m+)$$
with certain (computable) weights $w_1, \dots, w_m > 0$, and $w_{m+1} > 0$ if the right endpoint $\infty$ appears in the data intervals, $w_{m+1} = 0$ otherwise. This is a weighted version of our exact-data log-likelihood. We may use the original active set algorithm for $m$ data points (with an extension for the right-hand slope)!

79 Domain reduction. The EM algorithm starts on the maximal domain $[\min_i L_i, \max_i R_i]$. Problem: the domain is never reduced, but we may have started on a too large domain; the algorithm then forces $\varphi_r$ towards $-\infty$ on the extra domain (leading to convergence / numerical problems). Solution: if $\int_{\tau_j}^{\tau_{j+1}} \exp\varphi(x)\,dx$ gets too small for some $j$ (only possible at the very left or very right of the current domain), the corresponding interval is removed from the domain and the EM algorithm is restarted on the reduced domain.

82 Three examples: (simulated) right-censored $\Gamma(3, 1)$ data; (simulated) mixed-case interval-censored Gumbel(2, 1) data; (real data) ovarian cancer cases.

83 Example 1: Right-censoring. Consider the same 100 data points from $\Gamma(3, 1)$ as before, but now introduce censoring after individual $\Gamma(2, 1/2)$-distributed inspection times. Resulting data (37 censored): [Figure: exact observations and right-censored intervals.]
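
In R, this setup and the fit can be sketched as follows (assuming the CRAN package logconcens; per its documentation as I recall it, logcon takes an n x 2 matrix of interval endpoints, with L = R for exact observations and R = Inf for right-censored ones; the rate parametrization of Gamma(2, 1/2) is my assumption):

```r
library(logconcens)

set.seed(1)
x   <- rgamma(100, shape = 3, rate = 1)    # event times, Gamma(3, 1)
ins <- rgamma(100, shape = 2, rate = 1/2)  # inspection times, Gamma(2, 1/2)

## Right-censoring: observed exactly if the event precedes the inspection.
L <- ifelse(x <= ins, x, ins)
R <- ifelse(x <= ins, x, Inf)
sum(R == Inf)                              # number of censored observations

fit <- logcon(cbind(L, R))                 # EM + active set algorithm
plot(fit)
```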

84 Example 1: Log-density estimates. [Figure: NPMLE for censored data, true log-density, and NPMLE for exact data.]

85 Example 1: Density estimates. [Figure: NPMLE for censored data, true density, and NPMLE for exact data.]

86 Example 2: Interval-censoring. 100 data points from a Gumbel(2, 1) distribution. Each individual is inspected according to a Poisson(1) process, i.e. each inspection takes place an Exp(1)-distributed time after the last (all times independent).

87 Example 2: Survival function estimates. [Figure: survival function of the log-concave NPMLE, true survival function, and the Turnbull estimator of the survival function.]

88 Example 3: Real data. We consider the dataset ovarian from the R package survival: survival times in days of 26 women with ovarian cancer (14 censored).
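
The data and the Kaplan-Meier benchmark are available directly in R; the censored-data NPMLE can then be fitted on the corresponding interval matrix (a sketch under the same logconcens interface assumptions as above):

```r
library(survival)
library(logconcens)

## fustat == 1: death observed (exact); fustat == 0: right-censored at futime.
L <- ovarian$futime
R <- ifelse(ovarian$fustat == 1, ovarian$futime, Inf)

km  <- survfit(Surv(futime, fustat) ~ 1, data = ovarian)  # Kaplan-Meier
fit <- logcon(cbind(L, R))                                 # log-concave NPMLE

plot(km, conf.int = FALSE)   # step function; overlay the survival function
                             # derived from `fit` for the comparison shown.
```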

89 Example 3: Survival function estimates. [Figure: log-concave NPMLE vs. Kaplan-Meier estimator.]

93 Cure probability. What if we allow for the possibility that patients are cured, i.e. that with a certain probability $p_0$ a woman will not die from cancer? We model this by allowing subprobability densities of total mass $1 - p_0$ and adding a point mass of $p_0$ at time $\infty$. New log-likelihood:
$$\ell(\varphi, p_0) := \frac{1}{n}\sum_{i=1}^n \bigg[ 1\{L_i = R_i\}\,\varphi(X_i) + 1\{L_i < R_i\}\,\log\Big(\int_{L_i}^{R_i} \exp\varphi(x)\,dx + p_0\,1\{R_i = \infty\}\Big)\bigg].$$
For known $p_0$, nothing essential changes in our algorithm!

95 Profile likelihood maximization in $p_0$. Maximize $\ell_{\mathrm{profile}}(p_0) := \max_{\varphi \in \mathcal{G}_{conc}(p_0)} \ell(\varphi, p_0)$ over $p_0$. [Figure: profile log-likelihood as a function of $p_0$.] In our example this yields a maximizer $\hat p_0$; then compute $\varphi^* \in \arg\max_{\varphi \in \mathcal{G}_{conc}(\hat p_0)} \ell(\varphi, \hat p_0)$.
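
In the logconcens interface (again per my recollection of the package documentation, so treat the argument names as assumptions), the cure parameter can either be fixed or adapted by the algorithm itself:

```r
## Fit with a fixed cure probability p0 (intervals cbind(L, R) as above):
fit_fixed <- logcon(cbind(L, R), p0 = 0.3)

## Let the algorithm adapt p0 (profile-type maximization over p0):
fit_cure <- logcon(cbind(L, R), adapt.p0 = TRUE)
```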

96 Example 3: Survival function estimates, allowing for a cure probability. [Figure: log-concave NPMLE vs. Kaplan-Meier estimator.]

97 R package. The algorithm is implemented in the R package logconcens. Give it a try!
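
A minimal end-to-end usage sketch (interval conventions as in the examples above; treat the exact call signature as an assumption to be checked against the package help pages):

```r
install.packages("logconcens")
library(logconcens)

## Rows: two exact, one interval-censored, one right-censored observation.
x <- rbind(c(1.2, 1.2),
           c(3.0, 3.0),
           c(2.0, 4.5),
           c(5.0, Inf))

fit <- logcon(x)
plot(fit)
```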
