Maximum likelihood estimation of a log-concave density based on censored data
1 Maximum likelihood estimation of a log-concave density based on censored data Dominic Schuhmacher Institute of Mathematical Statistics and Actuarial Science University of Bern Joint work with Lutz Dümbgen and Kaspar Rufibach Swiss Statistics Seminar, 25 October 2011
2 Overview
1. A review of log-concave density estimation for exact data
2. Log-concave density estimation for censored data: problem formulation, theoretical results, an EM algorithm, simulated and real data examples
3 Part 1: Log-concave density estimation for exact data
4 Part 1: log-concave densities and exact data. Problem
Problem: Nonparametric estimation of the density f of a distribution P on $\mathbb{R}$, based on independent observations $X_1, X_2, \ldots, X_n$, using maximum likelihood.
Log-likelihood: $\ell(f) := \sum_{i=1}^n \log f(X_i)$, which is not bounded from above on the space of all densities f.
Our approach: impose a shape constraint. Unimodality is not enough.
7 Unimodality is not enough!
Distribute mass 1/2 uniformly over $[x_{(1)}, x_{(n)}]$ (blue) and concentrate mass 1/2 more and more closely around one individual data point (red): the densities stay unimodal while the likelihood grows without bound.
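This blow-up can be checked numerically. A minimal sketch in Python (hypothetical sample values): the mixture of a uniform on $[x_{(1)}, x_{(n)}]$ and a narrowing uniform spike around one data point stays unimodal, yet its log-likelihood diverges as the spike width shrinks.

```python
import math

# Hypothetical sample; any finite sample works for this argument.
x = [0.3, 1.1, 1.8, 2.4, 3.7]
spike_at = x[0]            # concentrate mass 1/2 around one data point
a, b = min(x), max(x)      # uniform part over [x_(1), x_(n)]

def mixture_loglik(eps):
    """Log-likelihood of: 1/2 uniform on [a, b] + 1/2 uniform spike of width eps."""
    ll = 0.0
    for xi in x:
        dens = 0.5 / (b - a)
        if abs(xi - spike_at) <= eps / 2:
            dens += 0.5 / eps
        ll += math.log(dens)
    return ll

# The likelihood diverges as the spike narrows:
for eps in [1.0, 0.1, 0.01]:
    print(eps, mixture_loglik(eps))
```

The printed log-likelihoods increase without bound as eps shrinks, even though every density in the sequence is unimodal.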
8 Log-concavity
A density f is called log-concave if $\varphi := \log f \colon \mathbb{R} \to [-\infty, \infty)$ is a concave function.
It turns out that the class $\mathcal{F}_{lc}$ of log-concave densities is a large, rather flexible class with many nice properties.
10 Properties of $\mathcal{F}_{lc}$
Every log-concave density is unimodal.
Ibragimov (1956): f is log-concave if and only if $f * g$ is unimodal for every unimodal g (log-concavity has been called "strong unimodality").
$\mathcal{F}_{lc}$ is closed under convolution and weak limits.
S, Hüsler and Dümbgen (2009/11), strong continuity property: weak convergence implies pointwise convergence of the densities, and convergence in an exponentially weighted total variation distance: $\int \exp(\delta |x|)\, |f_n(x) - f(x)|\, dx \to 0$ for some $\delta > 0$.
Every log-concave density has subexponential tails and a non-decreasing hazard rate.
15 Parametric families of log-concave distributions
The class of log-concave densities contains the following parametric families: normal, logistic, exponential, Laplace, uniform, Gumbel; Gamma and Weibull with shape parameter $\geq 1$; Beta with both shape parameters $\geq 1$.
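The shape-parameter conditions are easy to verify numerically via second differences of the log-density on a grid; a small sketch (grid bounds and tolerance chosen arbitrarily, densities taken up to normalizing constants):

```python
import math

def is_log_concave(logf, xs, tol=1e-9):
    """Check concavity of log f on an equispaced grid via second differences."""
    vals = [logf(x) for x in xs]
    return all(vals[i - 1] - 2 * vals[i] + vals[i + 1] <= tol
               for i in range(1, len(vals) - 1))

log_normal = lambda x: -0.5 * x * x                  # N(0,1), up to a constant
def log_gamma(shape):
    return lambda x: (shape - 1) * math.log(x) - x   # Gamma(shape, 1), up to a constant

grid_r = [-5 + 0.01 * i for i in range(1001)]        # grid on the real line
grid_rp = [0.01 * (i + 1) for i in range(1000)]      # grid on the positive half-line

print(is_log_concave(log_normal, grid_r))            # True
print(is_log_concave(log_gamma(3.0), grid_rp))       # True: shape >= 1
print(is_log_concave(log_gamma(0.5), grid_rp))       # False: shape < 1
```

For shape parameter below 1 the Gamma log-density is convex near 0, so the check fails, matching the condition on the slide.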
16 Log-concave density estimation
Nonparametric maximum likelihood estimation under the log-concavity constraint has been studied independently in Rufibach (2006) and Dümbgen and Rufibach (2009), and in Pal, Woodroofe, and Meyer (2007).
Task: for $n \geq 2$ find $\hat f_n \in \arg\max_{f \in \mathcal{F}_{lc}} \sum_{i=1}^n \log f(X_i)$, where $\mathcal{F}_{lc} := \{f \colon \mathbb{R} \to \mathbb{R}_+ \mid f \text{ log-concave density}\}$.
18 First results
Equivalent problem (by "Silverman's trick"): find $\hat\varphi_n \in \arg\max_{\varphi \in \Phi} L(\varphi)$, where
$L(\varphi) := \frac{1}{n} \sum_{i=1}^n \varphi(X_i) - \int \exp \varphi(x)\, dx$
and $\Phi$ is the set of all concave (and upper semicontinuous, say) functions $\mathbb{R} \to [-\infty, \infty)$.
Shape: $\hat\varphi_n$ is piecewise linear, changes of slope are only possible at data points $X_i$, and $\mathrm{dom}(\hat\varphi_n) = [X_{(1)}, X_{(n)}]$.
L depends on $\varphi$ only via $\varphi(X_1), \ldots, \varphi(X_n)$; it is strictly concave and coercive, defined on a closed, convex cone in $\mathbb{R}^n$. Hence existence and uniqueness of the NPMLE $\hat f_n$.
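Why the normalization constraint can be dropped in Silverman's trick can be sketched in two lines: shift $\varphi$ by a constant c and optimize over c.

```latex
L(\varphi + c)
  \;=\; \frac{1}{n}\sum_{i=1}^n \varphi(X_i) + c - e^{c}\!\int \exp\varphi(x)\,dx ,
\qquad
\frac{d}{dc}\,L(\varphi + c) \;=\; 1 - e^{c}\!\int \exp\varphi(x)\,dx .
```

The optimal shift makes $e^{c}\int \exp\varphi(x)\,dx = 1$, so any maximizer of L automatically satisfies $\int \exp\varphi^*(x)\,dx = 1$, and on that constraint set maximizing L is the same as maximizing the log-likelihood.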
21 Further properties of $\hat f_n$ (simplified)
(Dümbgen & Rufibach, 2009, and Balabdaoui, Rufibach & Wellner, 2009)
Mean($\hat f_n$) = empirical mean; Variance($\hat f_n$) $\leq$ empirical variance.
Uniform consistency: $\|\hat f_n - f\|_\infty \to 0$ and $\|\hat F_n - F\|_\infty \to 0$.
Rate-optimality and adaptivity: for any compact interval $T \subset \mathrm{int}\{f > 0\}$ and a true f that is Hölder continuous with exponent $\beta \in [1, 2]$, we have $\max_{x \in T} |\hat f_n(x) - f(x)| = O_p\bigl( (\log(n)/n)^{\beta/(2\beta+1)} \bigr)$.
Pointwise limit distributions: if $f(x_0) > 0$, $\varphi \in C^2$ around $x_0$, and $\varphi''(x_0) \neq 0$, then $n^{2/5}\bigl( \hat f_n(x_0) - f(x_0) \bigr) \to_D c(x_0, f)\, H''(0)$, where H is the so-called lower invelope of $Y(t) = \int_0^t W(s)\, ds - t^4$ and W is a two-sided Brownian motion starting in 0.
Also results for derived quantities: mode, distance between knots.
26 Computation
We have a concave maximization problem over a closed convex cone in $\mathbb{R}^n$, with n potentially large.
There is a fast active set algorithm by Dümbgen and Rufibach (2011), available in the R package logcondens.
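Because the estimator is piecewise linear, $\int \exp\varphi$ has a closed form on each linear piece, which is what makes the objective cheap to evaluate inside such algorithms. A sketch of this evaluation (Python; `exp_integral` and `objective` are hypothetical helper names, not the logcondens API):

```python
import math

def exp_integral(a, b, ya, yb):
    """Closed form for the integral of exp(l(x)) over [a, b], l linear, l(a)=ya, l(b)=yb."""
    if a == b:
        return 0.0
    s = (yb - ya) / (b - a)              # slope of the linear piece
    if abs(s) < 1e-12:                   # (near-)constant piece
        return (b - a) * math.exp(ya)
    return (math.exp(yb) - math.exp(ya)) / s

def objective(knots, phi, data):
    """L(phi) = (1/n) * sum phi(X_i) - integral of exp(phi); phi piecewise linear."""
    pieces = list(zip(zip(knots, knots[1:]), zip(phi, phi[1:])))
    def interp(x):
        for (a, b), (ya, yb) in pieces:
            if a <= x <= b:
                return ya + (yb - ya) * (x - a) / (b - a)
        return float("-inf")             # outside dom(phi)
    integral = sum(exp_integral(a, b, ya, yb) for (a, b), (ya, yb) in pieces)
    return sum(interp(x) for x in data) / len(data) - integral

# Tiny demo: constant log-density log(1/2) on [0, 2] gives L = log(1/2) - 1.
phi = [math.log(0.5)] * 2
print(objective([0.0, 2.0], phi, [0.5, 1.5]))
```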
28 Simulation example
Sample 100 points from $\Gamma(3, 1)$.
[Figure: log-concave density estimation from i.i.d. data; density estimate showing the log-concave $\hat f_n$ and the smoothed $\hat f_n^*$; dashed vertical lines indicate knots of the log-density.]
31 Related work
Multivariate case (Cule, Samworth and Stewart, 2010, JRSS B); R package LogConcDEAD (Cule, Gramacy and Samworth, 2009, J. Stat. Soft.).
General approximation theory with applications to additive regression models with log-concave error distribution (Dümbgen, Samworth and S, 2010, Ann. Stat.).
Discrete case (Balabdaoui and Rufibach, preprint 2011).
34 Part 2: Log-concave density estimation for censored data
35 Censored data
For simplicity of notation we estimate densities on $\mathbb{R}_+$; think of the distribution of the time to a certain event. Instead of exact values $X_i$ we observe intervals $\mathcal{X}_i$ containing $X_i$. The following cases are possible:
$\mathcal{X}_i = \{X_i\}$;
$\mathcal{X}_i = (L_i, R_i]$ where $0 \leq L_i < R_i < \infty$;
$\mathcal{X}_i = (L_i, \infty)$ where $L_i \geq 0$.
Write generally $L_i := \inf \mathcal{X}_i$, $R_i := \sup \mathcal{X}_i$.
We would like to estimate the underlying distribution of the $X_i$'s. (It is not clear what this means without further specification.)
39 A concrete censoring model
Assume that the i-th individual is inspected at times $T_{ij}$, $1 \leq j \leq K_i$, where $0 =: T_{i0} < T_{i1} < T_{i2} < \ldots < T_{i,K_i}$. If $K_i = \infty$, assume that $T_{i,\infty} := \sup_j T_{ij} = \infty$.
Let furthermore $Y_{ij}$ be an indicator for exact observability in the j-th inter-inspection interval
$I_{ij} := (T_{ij}, T_{i,j+1}]$ if $0 \leq j < K_i$; $I_{ij} := (T_{ij}, \infty)$ if $j = K_i$.
Then the $X_i$ translate into observations $\mathcal{X}_i$ as follows:
$\mathcal{X}_i := \{X_i\}$ if $X_i \in I_{ij}$ and $Y_{ij} = 1$; $\mathcal{X}_i := I_{ij}$ if $X_i \in I_{ij}$ and $Y_{ij} = 0$.
Note: the $K_i$, $T_{ij}$, and $Y_{ij}$ may be arbitrarily dependent random variables.
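The translation from latent times to observed sets can be sketched as follows (Python; `censor` is a hypothetical helper, and the demo encodes right-censoring with an arbitrary single inspection time of 2.0):

```python
import random

def censor(x, inspection_times, observable):
    """Map a latent event time x to the observed set.

    inspection_times: 0 = T_0 < T_1 < ... < T_K (finite K here).
    observable: flags Y_0, ..., Y_K, one per inter-inspection interval.
    Returns ('exact', x) or ('interval', (L, R)) with R possibly infinite.
    """
    T = list(inspection_times) + [float("inf")]
    for j in range(len(T) - 1):
        if T[j] < x <= T[j + 1]:              # x lies in I_j = (T_j, T_{j+1}]
            if observable[j]:
                return ("exact", x)
            return ("interval", (T[j], T[j + 1]))
    raise ValueError("x must be positive")

random.seed(1)
# Right-censoring: K_i = 1, Y_0 = 1, Y_1 = 0, single inspection at time 2.0.
obs = [censor(random.expovariate(1.0), [0.0, 2.0], [True, False]) for _ in range(5)]
print(obs)
```

Swapping the flags to `[False, False]` yields the current status model, and longer inspection lists give mixed-case interval-censoring, matching the special cases on the next slide.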
43 Special cases
Right-censoring: if the event happens prior to a single inspection time, it is observed exactly, otherwise it is censored. $K_i \equiv 1$, $Y_{i0} \equiv 1$, $Y_{i1} \equiv 0$.
Current status model: it is only known whether the event happened before or after a single inspection time. $K_i \equiv 1$, $Y_{i0} \equiv Y_{i1} \equiv 0$.
Mixed case interval-censoring: for each individual it is known in which of a number of contiguous intervals the event happened. $K_i$ finite-valued, $Y_{ij} \equiv 0$ for all j.
Rounding and binning: the $X_i$'s are known up to rounding to the closest integer (say). $K_i \equiv \infty$, $Y_{ij} \equiv 0$, $T_{ij} \equiv j + 1/2$ for all i, j.
47 Conditional log-likelihood
Assume that conditionally on all $K_i$, $T_{ij}$, and $Y_{ij}$ the random variables $X_1, \ldots, X_n$ are i.i.d. with density $f = \exp \varphi$.
The normalized log-likelihood for $\varphi$ is then
$\ell(\varphi) := \ell(\varphi; \mathcal{X}_1, \ldots, \mathcal{X}_n) := \frac{1}{n} \sum_{i=1}^n \Bigl[ 1\{L_i = R_i\}\, \varphi(X_i) + 1\{L_i < R_i\} \log \Bigl( \int_{L_i}^{R_i} \exp \varphi(x)\, dx \Bigr) \Bigr]$,
where $L_i = \inf \mathcal{X}_i$, $R_i = \sup \mathcal{X}_i$.
49 Likelihood maximization
Maximize $\ell(\varphi)$ over all $\varphi \in \Phi$ that satisfy $\int \exp \varphi(x)\, dx = 1$.
Equivalently: maximize
$L(\varphi) = \frac{1}{n} \sum_{i=1}^n \Bigl[ 1\{L_i = R_i\}\, \varphi(X_i) + 1\{L_i < R_i\} \log \Bigl( \int_{L_i}^{R_i} \exp \varphi(x)\, dx \Bigr) \Bigr] - \int_0^\infty \exp \varphi(x)\, dx$
over $\varphi \in \Phi$.
51 Shape requirements for a maximizer $\varphi^*$
Let $0 \leq \tau_1 < \tau_2 < \ldots < \tau_q$ be the endpoints of the data intervals, i.e. $\{\tau_1, \ldots, \tau_q\} = \{L_i : 1 \leq i \leq n\} \cup \{R_i : 1 \leq i \leq n\}$.
Then any $\varphi^* \in \arg\max_{\varphi \in \Phi} L(\varphi)$ satisfies:
$[\min_i R_i, \max_i L_i] \subset \mathrm{dom}(\varphi^*) \subset [\tau_1, \tau_q]$;
$\varphi^*$ is linear on every interval $(\tau_j, \tau_{j+1}) \subset \mathrm{dom}(\varphi^*) \setminus \bigcup_{i=1}^n [L_i, R_i]$.
54 Shape simplifications for a maximizer
Theorem (shape). Suppose that $\arg\max_{\varphi \in \Phi} L(\varphi) \neq \emptyset$. Then there exists a maximizer $\varphi^*$ with $\mathrm{dom}(\varphi^*) = [\tau_{j_1}, \tau_{j_2}] \subset \mathbb{R}_+$ for some $j_1, j_2$ that is piecewise linear on $\mathrm{dom}(\varphi^*)$, with at most one change of slope in any interval $(\tau_j, \tau_{j+1})$.
There is no change of slope in:
the two extreme intervals $(\tau_{j_1}, \tau_{j_1+1})$ and $(\tau_{j_2-1}, \tau_{j_2})$;
the interval $(\tau_{j_1+1}, \tau_{j_1+2})$ unless there is an i such that $L_i = R_i = \tau_{j_1}$, and the interval $(\tau_{j_2-2}, \tau_{j_2-1})$ unless there is an i such that $L_i = R_i = \tau_{j_2}$;
any interval $(\tau_j, \tau_{j+1}) \subset \mathrm{dom}(\varphi^*) \setminus \bigcup_{i=1}^n [L_i, R_i]$.
57 Existence of a maximizer
Theorem (existence). Assume that there is no $i_0 \in \{1, \ldots, n\}$ with $L_{i_0} = R_{i_0} = X_{i_0}$ and $\bigcap_{i=1}^n [L_i, R_i] = \{X_{i_0}\}$. Then $\arg\max_{\varphi \in \mathcal{G}_{conc}} L(\varphi) \neq \emptyset$, where $\mathcal{G}_{conc} \subset \Phi$ consists of all piecewise linear functions with domain and changes of slope according to the shape theorem. We only allow slope changes on a fine grid $t_1 < t_2 < \ldots < t_m$ containing the finite $\tau_j$.
58 Counter-example
Suppose that there is an $i_0 \in \{1, \ldots, n\}$ with $L_{i_0} = R_{i_0} = X_{i_0}$ and $\bigcap_{i=1}^n [L_i, R_i] = \{X_{i_0}\}$.
Assuming that there are intervals to the left and to the right of $X_{i_0} =: \tau_{j_0}$ (otherwise the idea is easily adapted), we can obtain an arbitrarily large log-likelihood by defining $\varphi$ as a triangle function with domain $[\tau_{j_0-1}, \tau_{j_0+1}]$ and peak in $\tau_{j_0}$: the integral of $\exp \varphi$ over $[\tau_{j_0-1}, \tau_{j_0}]$ and $[\tau_{j_0}, \tau_{j_0+1}]$ can be kept constant, while $\varphi(\tau_{j_0})$ and hence $L(\varphi)$ go towards $\infty$.
61 Proof idea of the existence theorem
Consider this as a problem of maximizing $L(\varphi)$ over a convex cone in $[-\infty, \infty)^m \times [-\infty, 0)$; $\varphi = (\varphi_1, \ldots, \varphi_m, \varphi_{m+1})$, where $\varphi_i := \varphi(t_i)$ and $\varphi_{m+1} := \varphi'(t_m+)$.
1. The log-likelihood is continuous on $\mathcal{G}_{conc} \cap \bigl( [-\infty, k]^m \times [-\infty, -1/k] \bigr)$ for every $k \in \mathbb{N}$, which is compact in the extended Euclidean topology; hence the maximum over this set exists.
2. Show that $L(\varphi^{(k)}) \to -\infty$ for every sequence $(\varphi^{(k)})_k \subset \mathcal{G}_{conc}$ with $\max_{1 \leq i \leq m+1} |\varphi_i^{(k)}| \to \infty$, by somewhat tedious calculations, where in the maximum $\varphi_{m+1}^{(k)}$ is replaced by $\log(-\varphi_{m+1}^{(k)})$.
Assuming now that there is no maximizer in $\mathcal{G}_{conc} \cap \bigl( [-\infty, \infty)^m \times [-\infty, 0) \bigr)$ leads to a contradiction.
65 Open problems
Uniqueness; consistency and rates; everything else...
68 Computation
Still a maximization over a cone in typically very high dimension (now m). But our log-likelihood function is not concave anymore! We try an EM algorithm.
Remember:
$\ell(\varphi) := \frac{1}{n} \sum_{i=1}^n \Bigl[ 1\{L_i = R_i\}\, \varphi(X_i) + 1\{L_i < R_i\} \log \Bigl( \int_{L_i}^{R_i} \exp \varphi(x)\, dx \Bigr) \Bigr]$,
the incomplete data log-likelihood (normalized);
$\ell^*(\varphi) := \frac{1}{n} \sum_{i=1}^n \varphi(X_i)$,
the complete data log-likelihood (normalized).
72 EM algorithm
E step: given $\varphi_r \in \mathcal{G}_{conc}$ (log-densities in $\mathcal{G}_{conc}$), compute
$\ell^*(\varphi; \varphi_r) := E_{\varphi_r}\bigl( \ell^*(\varphi) \mid X_i \in \mathcal{X}_i \text{ for all } i \bigr) = \frac{1}{n} \sum_{i=1}^n E_{\varphi_r}\bigl( \varphi(X_i) \mid X_i \in \mathcal{X}_i \bigr) = \frac{1}{n} \sum_{i=1}^n \Bigl[ 1\{L_i = R_i\}\, \varphi(X_i) + 1\{L_i < R_i\}\, \frac{E_{\varphi_r}\bigl( \varphi(X_i)\, 1\{X_i \in \mathcal{X}_i\} \bigr)}{P_{\varphi_r}\bigl( X_i \in \mathcal{X}_i \bigr)} \Bigr]$.
This is easy, since $\varphi_r$, $\varphi$ are piecewise linear.
M step: maximize $\ell^*(\varphi; \varphi_r)$ over all $\varphi \in \mathcal{G}_{conc}$ with $\mathrm{dom}(\varphi) \subset \mathrm{dom}(\varphi_r)$. The domain will not get smaller. Call the maximizer $\varphi_{r+1}$.
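The structure of one EM sweep can be illustrated on a deliberately simplified model: a histogram on a fixed grid instead of a log-concave log-density, so the M step is trivial. This is Turnbull-style self-consistency with hypothetical toy data, not the actual algorithm, but the E step's redistribution of each censored observation's mass is the same idea.

```python
# Simplified illustration of the E step: redistribute each censored
# observation's mass across grid cells in proportion to the current density.
cells = [(0, 1), (1, 2), (2, 3)]                 # hypothetical grid
obs = [("exact", 0.5), ("interval", (0, 2)), ("interval", (1, 3))]

p = [1 / 3, 1 / 3, 1 / 3]                        # current cell probabilities
for _ in range(200):                             # EM iterations
    counts = [0.0] * len(cells)
    for kind, v in obs:
        if kind == "exact":
            for k, (a, b) in enumerate(cells):
                if a <= v < b:
                    counts[k] += 1.0
        else:
            L, R = v
            w = [p[k] if L <= a and b <= R else 0.0
                 for k, (a, b) in enumerate(cells)]
            tot = sum(w)
            for k in range(len(cells)):
                counts[k] += w[k] / tot          # E step: conditional mass
    p = [c / len(obs) for c in counts]           # M step: MLE of the histogram
print(p)
```

For this toy data the iteration converges to cell probabilities (1/2, 1/2, 0); in the actual algorithm the M step instead fits a log-concave log-density to the redistributed mass.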
74 Reduction to the active set algorithm
$E_{\varphi_r}\bigl( \varphi(X_i)\, 1\{X_i \in [a, b]\} \bigr)$ is a linear combination of $\varphi(a)$ and $\varphi(b)$ if $\varphi_r$, $\varphi$ are linear on $[a, b]$, $0 \leq a \leq b < \infty$.
$E_{\varphi_r}\bigl( \varphi(X_i)\, 1\{X_i \in [a, \infty)\} \bigr)$ is a linear combination of $\varphi(a)$ and $\varphi'(a+)$ if $\varphi_r$, $\varphi$ are linear on $[a, \infty)$, $0 \leq a < \infty$.
Therefore
$\ell^*(\varphi; \varphi_r) = \sum_{j=1}^m w_j\, \varphi(t_j) + w_{m+1}\, \varphi'(t_m+)$
with certain (computable) weights $w_1, \ldots, w_m > 0$; $w_{m+1} > 0$ if the right endpoint $\infty$ appears in the data intervals, $= 0$ otherwise.
This is a weighted version of our exact data log-likelihood. We may use the original active set algorithm for m data points (with an extension for the right-hand slope)!
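The first linear-combination claim can be checked directly: writing $\varphi$ on $[a, b]$ through its endpoint values,

```latex
\varphi(x) \;=\; \frac{b-x}{b-a}\,\varphi(a) + \frac{x-a}{b-a}\,\varphi(b),
\qquad x \in [a,b],
```

so that

```latex
E_{\varphi_r}\bigl(\varphi(X)\,1\{X \in [a,b]\}\bigr)
  \;=\; \varphi(a)\int_a^b \frac{b-x}{b-a}\,e^{\varphi_r(x)}\,dx
  \;+\; \varphi(b)\int_a^b \frac{x-a}{b-a}\,e^{\varphi_r(x)}\,dx ,
```

and both integrals have closed forms since $\varphi_r$ is linear on $[a, b]$; such coefficients are what accumulate into the weights $w_1, \ldots, w_m$.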
78 Domain reduction
The EM algorithm starts on the maximal domain $[\min_i L_i, \max_i R_i]$. Problem: the domain is never reduced, but we may have started on a too large domain. The algorithm then forces $\varphi_r$ towards $-\infty$ on the extra domain (leading to convergence / numerical problems).
Solution: if $\int_{\tau_j}^{\tau_{j+1}} \exp \varphi(x)\, dx$ gets too small for some j (only possible at the very left or very right of the current domain), the corresponding interval is removed from the domain and the EM algorithm is restarted on the reduced domain.
80 Three examples
(simulated) right-censored $\Gamma(3, 1)$ data;
(simulated) mixed-case interval-censored Gumbel(2, 1) data;
(real data) ovarian cancer cases.
83 Example 1: Right-censoring
Consider the same 100 data points from $\Gamma(3, 1)$ as before, but now introduce censoring after individual $\Gamma(2, 1/2)$-distributed inspection times. In the resulting data, 37 of the 100 observations are censored.
84 Example 1: Log-density estimates
[Figure: NPMLE for censored data, true log-density, NPMLE for exact data.]
85 Example 1: Density estimates
[Figure: NPMLE for censored data, true density, NPMLE for exact data.]
86 Example 2: Interval-censoring
100 data points from a Gumbel(2, 1) distribution. Each individual is inspected according to a Poisson(1) process, i.e. each inspection takes place an Exp(1)-distributed time after the last (all times independent).
87 Example 2: Survival function estimates
[Figure: survival function of the log-concave NPMLE, true survival function, and Turnbull estimator of the survival function.]
88 Example 3: Real data
We consider the dataset ovarian from the R package survival: survival times in days of 26 women with ovarian cancer (14 censored).
89 Example 3: Survival function estimates
[Figure: log-concave NPMLE and Kaplan-Meier estimator.]
90 Cure probability
What if we allow for the possibility that patients are cured, i.e. with a certain probability $p_0$ a woman will not die from cancer? We model this by allowing subprobability densities of total mass $1 - p_0$ and adding a point mass of $p_0$ at time $\infty$.
New log-likelihood:
$\ell(\varphi, p_0) := \frac{1}{n} \sum_{i=1}^n \Bigl[ 1\{L_i = R_i\}\, \varphi(X_i) + 1\{L_i < R_i\} \log \Bigl( \int_{L_i}^{R_i} \exp \varphi(x)\, dx + p_0\, 1\{R_i = \infty\} \Bigr) \Bigr]$.
For known $p_0$, nothing essential changes in our algorithm!
94 Profile likelihood maximization in $p_0$
Maximize $\ell_{\mathrm{profile}}(p_0) := \max_{\varphi \in \mathcal{G}_{conc}(p_0)} \ell(\varphi, p_0)$ over $p_0$.
[Figure: profile log-likelihood as a function of $p_0$; in our example $\hat p_0 \approx$ ...]
Then compute $\varphi^* \in \arg\max_{\varphi \in \mathcal{G}_{conc}(\hat p_0)} \ell(\varphi, \hat p_0)$.
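The profile step can be illustrated with a toy stand-in: replace the log-concave part by an Exp(rate) density, so the inner maximization becomes a one-dimensional grid search, and profile out $p_0$. All data values and grids below are hypothetical; this is only the structure of the computation, not the actual estimator.

```python
import math

# Toy profile likelihood in the cure probability p0: the event-time part is
# an Exp(rate) density, a parametric stand-in for the log-concave class.
exact = [0.8, 1.2, 0.5, 2.0]           # exactly observed event times
censored_at = [3.0, 4.0, 5.0]          # right-censored: only X_i > L_i is known

RATES = [0.05 * k for k in range(1, 80)]

def loglik(rate, p0):
    ll = sum(math.log((1 - p0) * rate) - rate * x for x in exact)
    # P(X > L) = (1 - p0) * exp(-rate * L) + p0   (point mass p0 at infinity)
    ll += sum(math.log((1 - p0) * math.exp(-rate * L) + p0) for L in censored_at)
    return ll / (len(exact) + len(censored_at))

def profile(p0):
    return max(loglik(r, p0) for r in RATES)   # inner maximization over the density

p0_grid = [0.01 * k for k in range(100)]
p0_hat = max(p0_grid, key=profile)
print(p0_hat, profile(p0_hat))
```

In the actual method the inner maximization over RATES is replaced by the EM/active-set fit of $\varphi$ for fixed $p_0$.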
96 Example 3: Survival function estimates
[Figure: log-concave NPMLE and Kaplan-Meier estimator, revisited with the cure probability included.]
97 R package
The algorithm is implemented in the R package logconcens. Give it a try!
More informationMultistate Modeling and Applications
Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)
More informationSurvival Analysis for Interval Censored Data - Nonparametric Estimation
Survival Analysis for Interval Censored Data - Nonparametric Estimation Seminar of Statistics, ETHZ Group 8, 02. May 2011 Martina Albers, Nanina Anderegg, Urs Müller Overview Examples Nonparametric MLE
More informationStatistics: Learning models from data
DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial
More informationJournal of Statistical Software
JSS Journal of Statistical Software March 2011, Volume 39, Issue 6. http://www.jstatsoft.org/ logcondens: Computations Related to Univariate Log-Concave Density Estimation Lutz Dümbgen University of Bern
More informationAsymptotics for posterior hazards
Asymptotics for posterior hazards Pierpaolo De Blasi University of Turin 10th August 2007, BNR Workshop, Isaac Newton Intitute, Cambridge, UK Joint work with Giovanni Peccati (Université Paris VI) and
More informationNonparametric estimation under shape constraints, part 2
Nonparametric estimation under shape constraints, part 2 Piet Groeneboom, Delft University August 7, 2013 What to expect? Theory and open problems for interval censoring, case 2. Same for the bivariate
More informationSTAT 6350 Analysis of Lifetime Data. Probability Plotting
STAT 6350 Analysis of Lifetime Data Probability Plotting Purpose of Probability Plots Probability plots are an important tool for analyzing data and have been particular popular in the analysis of life
More informationChernoff s distribution is log-concave. But why? (And why does it matter?)
Chernoff s distribution is log-concave But why? (And why does it matter?) Jon A. Wellner University of Washington, Seattle University of Michigan April 1, 2011 Woodroofe Seminar Based on joint work with:
More informationCPSC 340: Machine Learning and Data Mining
CPSC 340: Machine Learning and Data Mining MLE and MAP Original version of these slides by Mark Schmidt, with modifications by Mike Gelbart. 1 Admin Assignment 4: Due tonight. Assignment 5: Will be released
More informationSurvival Analysis. Lu Tian and Richard Olshen Stanford University
1 Survival Analysis Lu Tian and Richard Olshen Stanford University 2 Survival Time/ Failure Time/Event Time We will introduce various statistical methods for analyzing survival outcomes What is the survival
More informationProblem set 4, Real Analysis I, Spring, 2015.
Problem set 4, Real Analysis I, Spring, 215. (18) Let f be a measurable finite-valued function on [, 1], and suppose f(x) f(y) is integrable on [, 1] [, 1]. Show that f is integrable on [, 1]. [Hint: Show
More informationLecture 3. Truncation, length-bias and prevalence sampling
Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in
More informationLearning and Testing Structured Distributions
Learning and Testing Structured Distributions Ilias Diakonikolas University of Edinburgh Simons Institute March 2015 This Talk Algorithmic Framework for Distribution Estimation: Leads to fast & robust
More informationSurvival Analysis: Weeks 2-3. Lu Tian and Richard Olshen Stanford University
Survival Analysis: Weeks 2-3 Lu Tian and Richard Olshen Stanford University 2 Kaplan-Meier(KM) Estimator Nonparametric estimation of the survival function S(t) = pr(t > t) The nonparametric estimation
More informationStatistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation
Statistics 62: L p spaces, metrics on spaces of probabilites, and connections to estimation Moulinath Banerjee December 6, 2006 L p spaces and Hilbert spaces We first formally define L p spaces. Consider
More informationA tailor made nonparametric density estimate
A tailor made nonparametric density estimate Daniel Carando 1, Ricardo Fraiman 2 and Pablo Groisman 1 1 Universidad de Buenos Aires 2 Universidad de San Andrés School and Workshop on Probability Theory
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview
Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations
More informationSurvival Analysis. Stat 526. April 13, 2018
Survival Analysis Stat 526 April 13, 2018 1 Functions of Survival Time Let T be the survival time for a subject Then P [T < 0] = 0 and T is a continuous random variable The Survival function is defined
More informationExercises. (a) Prove that m(t) =
Exercises 1. Lack of memory. Verify that the exponential distribution has the lack of memory property, that is, if T is exponentially distributed with parameter λ > then so is T t given that T > t for
More informationLog-concave distributions: definitions, properties, and consequences
Log-concave distributions: definitions, properties, and consequences Jon A. Wellner University of Washington, Seattle; visiting Heidelberg Seminaire, Institut de Mathématiques de Toulouse 28 February 202
More informationApproximate Self Consistency for Middle-Censored Data
Approximate Self Consistency for Middle-Censored Data by S. Rao Jammalamadaka Department of Statistics & Applied Probability, University of California, Santa Barbara, CA 93106, USA. and Srikanth K. Iyer
More informationST495: Survival Analysis: Maximum likelihood
ST495: Survival Analysis: Maximum likelihood Eric B. Laber Department of Statistics, North Carolina State University February 11, 2014 Everything is deception: seeking the minimum of illusion, keeping
More informationNotes, March 4, 2013, R. Dudley Maximum likelihood estimation: actual or supposed
18.466 Notes, March 4, 2013, R. Dudley Maximum likelihood estimation: actual or supposed 1. MLEs in exponential families Let f(x,θ) for x X and θ Θ be a likelihood function, that is, for present purposes,
More informationGaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012
Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature
More informationChapter 3. Chord Length Estimation. 3.1 Introduction
Chapter 3 Chord Length Estimation 3.1 Introduction Consider a random closed set W R 2 which we observe through a bounded window B. Important characteristics of the probability distribution of a random
More informationMaster s Written Examination - Solution
Master s Written Examination - Solution Spring 204 Problem Stat 40 Suppose X and X 2 have the joint pdf f X,X 2 (x, x 2 ) = 2e (x +x 2 ), 0 < x < x 2
More informationThe high order moments method in endpoint estimation: an overview
1/ 33 The high order moments method in endpoint estimation: an overview Gilles STUPFLER (Aix Marseille Université) Joint work with Stéphane GIRARD (INRIA Rhône-Alpes) and Armelle GUILLOU (Université de
More informationEmpirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm
Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm Mai Zhou 1 University of Kentucky, Lexington, KY 40506 USA Summary. Empirical likelihood ratio method (Thomas and Grunkmier
More informationOverview of Extreme Value Theory. Dr. Sawsan Hilal space
Overview of Extreme Value Theory Dr. Sawsan Hilal space Maths Department - University of Bahrain space November 2010 Outline Part-1: Univariate Extremes Motivation Threshold Exceedances Part-2: Bivariate
More informationCox s proportional hazards model and Cox s partial likelihood
Cox s proportional hazards model and Cox s partial likelihood Rasmus Waagepetersen October 12, 2018 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function.
More informationLikelihood Based Inference for Monotone Response Models
Likelihood Based Inference for Monotone Response Models Moulinath Banerjee University of Michigan September 5, 25 Abstract The behavior of maximum likelihood estimates (MLE s) the likelihood ratio statistic
More informationProbability and Statistics
Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be Chapter 3: Parametric families of univariate distributions CHAPTER 3: PARAMETRIC
More informationChapter 9. Non-Parametric Density Function Estimation
9-1 Density Estimation Version 1.2 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least
More informationA Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints
Noname manuscript No. (will be inserted by the editor) A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Mai Zhou Yifan Yang Received: date / Accepted: date Abstract In this note
More informationA SMOOTHED VERSION OF THE KAPLAN-MEIER ESTIMATOR. Agnieszka Rossa
A SMOOTHED VERSION OF THE KAPLAN-MEIER ESTIMATOR Agnieszka Rossa Dept. of Stat. Methods, University of Lódź, Poland Rewolucji 1905, 41, Lódź e-mail: agrossa@krysia.uni.lodz.pl and Ryszard Zieliński Inst.
More information1 Glivenko-Cantelli type theorems
STA79 Lecture Spring Semester Glivenko-Cantelli type theorems Given i.i.d. observations X,..., X n with unknown distribution function F (t, consider the empirical (sample CDF ˆF n (t = I [Xi t]. n Then
More informationCTDL-Positive Stable Frailty Model
CTDL-Positive Stable Frailty Model M. Blagojevic 1, G. MacKenzie 2 1 Department of Mathematics, Keele University, Staffordshire ST5 5BG,UK and 2 Centre of Biostatistics, University of Limerick, Ireland
More informationA COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky
A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),
More informationGaussian discriminant analysis Naive Bayes
DM825 Introduction to Machine Learning Lecture 7 Gaussian discriminant analysis Marco Chiarandini Department of Mathematics & Computer Science University of Southern Denmark Outline 1. is 2. Multi-variate
More information(Y; I[X Y ]), where I[A] is the indicator function of the set A. Examples of the current status data are mentioned in Ayer et al. (1955), Keiding (199
CONSISTENCY OF THE GMLE WITH MIXED CASE INTERVAL-CENSORED DATA By Anton Schick and Qiqing Yu Binghamton University April 1997. Revised December 1997, Revised July 1998 Abstract. In this paper we consider
More informationPattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions
Pattern Recognition and Machine Learning Chapter 2: Probability Distributions Cécile Amblard Alex Kläser Jakob Verbeek October 11, 27 Probability Distributions: General Density Estimation: given a finite
More informationEfficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models
NIH Talk, September 03 Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models Eric Slud, Math Dept, Univ of Maryland Ongoing joint project with Ilia
More information12 - Nonparametric Density Estimation
ST 697 Fall 2017 1/49 12 - Nonparametric Density Estimation ST 697 Fall 2017 University of Alabama Density Review ST 697 Fall 2017 2/49 Continuous Random Variables ST 697 Fall 2017 3/49 1.0 0.8 F(x) 0.6
More informationAnalysing geoadditive regression data: a mixed model approach
Analysing geoadditive regression data: a mixed model approach Institut für Statistik, Ludwig-Maximilians-Universität München Joint work with Ludwig Fahrmeir & Stefan Lang 25.11.2005 Spatio-temporal regression
More informationLegendre-Fenchel transforms in a nutshell
1 2 3 Legendre-Fenchel transforms in a nutshell Hugo Touchette School of Mathematical Sciences, Queen Mary, University of London, London E1 4NS, UK Started: July 11, 2005; last compiled: August 14, 2007
More informationStatistical Estimation: Data & Non-data Information
Statistical Estimation: Data & Non-data Information Roger J-B Wets University of California, Davis & M.Casey @ Raytheon G.Pflug @ U. Vienna, X. Dong @ EpiRisk, G-M You @ EpiRisk. a little background Decision
More informationChapter 9. Non-Parametric Density Function Estimation
9-1 Density Estimation Version 1.1 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least
More informationClass 2 & 3 Overfitting & Regularization
Class 2 & 3 Overfitting & Regularization Carlo Ciliberto Department of Computer Science, UCL October 18, 2017 Last Class The goal of Statistical Learning Theory is to find a good estimator f n : X Y, approximating
More informationNew mixture models and algorithms in the mixtools package
New mixture models and algorithms in the mixtools package Didier Chauveau MAPMO - UMR 6628 - Université d Orléans 1ères Rencontres BoRdeaux Juillet 2012 Outline 1 Mixture models and EM algorithm-ology
More information4 Testing Hypotheses. 4.1 Tests in the regression setting. 4.2 Non-parametric testing of survival between groups
4 Testing Hypotheses The next lectures will look at tests, some in an actuarial setting, and in the last subsection we will also consider tests applied to graduation 4 Tests in the regression setting )
More informationBayesian estimation of the discrepancy with misspecified parametric models
Bayesian estimation of the discrepancy with misspecified parametric models Pierpaolo De Blasi University of Torino & Collegio Carlo Alberto Bayesian Nonparametrics workshop ICERM, 17-21 September 2012
More informationPATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS Parametric Distributions Basic building blocks: Need to determine given Representation: or? Recall Curve Fitting Binary Variables
More informationSurvival Distributions, Hazard Functions, Cumulative Hazards
BIO 244: Unit 1 Survival Distributions, Hazard Functions, Cumulative Hazards 1.1 Definitions: The goals of this unit are to introduce notation, discuss ways of probabilistically describing the distribution
More informationLinear Methods for Prediction
Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we
More informationNonparametric Model Construction
Nonparametric Model Construction Chapters 4 and 12 Stat 477 - Loss Models Chapters 4 and 12 (Stat 477) Nonparametric Model Construction Brian Hartman - BYU 1 / 28 Types of data Types of data For non-life
More informationCPSC 340: Machine Learning and Data Mining. MLE and MAP Fall 2017
CPSC 340: Machine Learning and Data Mining MLE and MAP Fall 2017 Assignment 3: Admin 1 late day to hand in tonight, 2 late days for Wednesday. Assignment 4: Due Friday of next week. Last Time: Multi-Class
More informationδ xj β n = 1 n Theorem 1.1. The sequence {P n } satisfies a large deviation principle on M(X) with the rate function I(β) given by
. Sanov s Theorem Here we consider a sequence of i.i.d. random variables with values in some complete separable metric space X with a common distribution α. Then the sample distribution β n = n maps X
More informationClassification. Chapter Introduction. 6.2 The Bayes classifier
Chapter 6 Classification 6.1 Introduction Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode
More informationSOLUTION FOR HOMEWORK 8, STAT 4372
SOLUTION FOR HOMEWORK 8, STAT 4372 Welcome to your 8th homework. Here you have an opportunity to solve classical estimation problems which are the must to solve on the exam due to their simplicity. 1.
More informationGaussian Mixture Models
Gaussian Mixture Models David Rosenberg, Brett Bernstein New York University April 26, 2017 David Rosenberg, Brett Bernstein (New York University) DS-GA 1003 April 26, 2017 1 / 42 Intro Question Intro
More informationLikelihood Based Inference for Monotone Response Models
Likelihood Based Inference for Monotone Response Models Moulinath Banerjee University of Michigan September 11, 2006 Abstract The behavior of maximum likelihood estimates (MLEs) and the likelihood ratio
More informationCS Lecture 19. Exponential Families & Expectation Propagation
CS 6347 Lecture 19 Exponential Families & Expectation Propagation Discrete State Spaces We have been focusing on the case of MRFs over discrete state spaces Probability distributions over discrete spaces
More informationShape constrained kernel density estimation
Shape constrained kernel density estimation Melanie Birke Ruhr-Universität Bochum Fakultät für Mathematik 4478 Bochum, Germany e-mail: melanie.birke@ruhr-uni-bochum.de March 27, 28 Abstract In this paper,
More informationParameter Estimation
Parameter Estimation Chapters 13-15 Stat 477 - Loss Models Chapters 13-15 (Stat 477) Parameter Estimation Brian Hartman - BYU 1 / 23 Methods for parameter estimation Methods for parameter estimation Methods
More informationMaximum Likelihood Estimation under Shape Constraints
Maximum Likelihood Estimation under Shape Constraints Hanna K. Jankowski June 2 3, 29 Contents 1 Introduction 2 2 The MLE of a decreasing density 3 2.1 Finding the Estimator..............................
More informationIntroduction: exponential family, conjugacy, and sufficiency (9/2/13)
STA56: Probabilistic machine learning Introduction: exponential family, conjugacy, and sufficiency 9/2/3 Lecturer: Barbara Engelhardt Scribes: Melissa Dalis, Abhinandan Nath, Abhishek Dubey, Xin Zhou Review
More informationA Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks
A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks Y. Xu, D. Scharfstein, P. Mueller, M. Daniels Johns Hopkins, Johns Hopkins, UT-Austin, UF JSM 2018, Vancouver 1 What are semi-competing
More informationMaximum likelihood: counterexamples, examples, and open problems. Jon A. Wellner. University of Washington. Maximum likelihood: p.
Maximum likelihood: counterexamples, examples, and open problems Jon A. Wellner University of Washington Maximum likelihood: p. 1/7 Talk at University of Idaho, Department of Mathematics, September 15,
More informationSTAT 6385 Survey of Nonparametric Statistics. Order Statistics, EDF and Censoring
STAT 6385 Survey of Nonparametric Statistics Order Statistics, EDF and Censoring Quantile Function A quantile (or a percentile) of a distribution is that value of X such that a specific percentage of the
More informationLegendre-Fenchel transforms in a nutshell
1 2 3 Legendre-Fenchel transforms in a nutshell Hugo Touchette School of Mathematical Sciences, Queen Mary, University of London, London E1 4NS, UK Started: July 11, 2005; last compiled: October 16, 2014
More informationThe International Journal of Biostatistics
The International Journal of Biostatistics Volume 1, Issue 1 2005 Article 3 Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics Moulinath Banerjee Jon A. Wellner
More informationKey Words: survival analysis; bathtub hazard; accelerated failure time (AFT) regression; power-law distribution.
POWER-LAW ADJUSTED SURVIVAL MODELS William J. Reed Department of Mathematics & Statistics University of Victoria PO Box 3060 STN CSC Victoria, B.C. Canada V8W 3R4 reed@math.uvic.ca Key Words: survival
More information2. Variance and Higher Moments
1 of 16 7/16/2009 5:45 AM Virtual Laboratories > 4. Expected Value > 1 2 3 4 5 6 2. Variance and Higher Moments Recall that by taking the expected value of various transformations of a random variable,
More informationPDEs in Image Processing, Tutorials
PDEs in Image Processing, Tutorials Markus Grasmair Vienna, Winter Term 2010 2011 Direct Methods Let X be a topological space and R: X R {+ } some functional. following definitions: The mapping R is lower
More informationProduct-limit estimators of the survival function with left or right censored data
Product-limit estimators of the survival function with left or right censored data 1 CREST-ENSAI Campus de Ker-Lann Rue Blaise Pascal - BP 37203 35172 Bruz cedex, France (e-mail: patilea@ensai.fr) 2 Institut
More informationStat 451 Lecture Notes Simulating Random Variables
Stat 451 Lecture Notes 05 12 Simulating Random Variables Ryan Martin UIC www.math.uic.edu/~rgmartin 1 Based on Chapter 6 in Givens & Hoeting, Chapter 22 in Lange, and Chapter 2 in Robert & Casella 2 Updated:
More information