Population size estimation by means of capture-recapture with special emphasis on heterogeneity using Chao and Zelterman bounds

Size: px
Start display at page:

Download "Population size estimation by means of capture-recapture with special emphasis on heterogeneity using Chao and Zelterman bounds"

Transcription

1 Population size estimation by means of capture-recapture with special emphasis on heterogeneity using Chao and Zelterman bounds Dankmar Böhning Quantitative Biology and Applied Statistics, School of Biological Sciences University of Reading Ege University, Izmir, June 3, 2008 prepared: June 2, 2008

2 Outline Introduction Some Applications Ways to a Realistic Solution Generalized Chao Bounds and a Monotonicity Property Zelterman Estimation Applications Outlook Zelterman as MLE Zelterman can be extended to case data Zelterman extended to higher counts Zelterman extended: truncation or censoring? Some simulation results

3 Introduction Formulation of the Problem a population has N units of which n are identified by some mechanism (trap, register, police database,...) probability of identifying an unit is (1 p 0 ) so that N = (1 p 0 )N + p 0 N = n + p 0 N and the Horvitz-Thompson estimator follows: ˆN = n 1 p 0

4 Introduction Formulation of the Problem ˆN = n 1 p 0 is fine BUT: p 0 is assumed to be known usually an estimate of p 0 is required

5 Introduction Formulation of the Problem as Frequencies of Frequencies a common setting for estimating p 0 is the Frequencies of Frequencies: the identifying mechanism provides a count Y of repeated identifications (w.r.t. to a reference period), but zero counts are not observed leading to frequencies f 1, f 2,..., f m where m is the largest observed count and f j is the frequency of units with exactly j counts

6 Introduction Formulation of the Problem as Frequencies of Frequencies we have: f 0 is not observed Recall that N = f 0 + n = f 0 + f 1 + f f m, so that ˆf 0 leads to ˆN

7 Introduction Introduction Some Applications Ways to a Realistic Solution Generalized Chao Bounds and a Monotonicity Property Zelterman Estimation Applications Outlook Zelterman as MLE Zelterman can be extended to case data Zelterman extended to higher counts Zelterman extended: truncation or censoring? Some simulation results

8 Some Applications McKendrick s Data on Cholera in India McKendrick (1926) had the following frequency of households with j cases of Cholera in an Indian village: f 0 f 1 f 2 f 3 f 4 n How many households f 0 are affected by the epidemic, but have no cases?

9 Some Applications Mathews s Data on Estimating the Dystrophin Density in the Human Muscle Cullen et al. (1990) attempted to locate dystrophin, a gene product of possible importance in muscular dystrophies, within the muscle fibres of biopsy specimens taken from normal patients Units (epitops) of Dystrophin cannot be detected by the electron microscope until they have been labelled by a suitable electron-dense substance; technique used gold-conjugated antibodies which adhere to the dystrophin

10 Some Applications Mathews s Data on Estimating the Dystrophin Density in the Human Muscle not all units are labelled and it is important to account for all labelled and unlabelled units to achieve an unbiased estimate of the dystrophin density more than one anti-body molecule may attach to a dystrophin unit; observed then is a count variable Y counting the number of antibody molecules on each dystrophin unit Y = 0 means that unit is unlabelled and not observed

11 Some Applications Mathews s Data on Estimating the Dystrophin Density in the Human Muscle the frequency distribution of the antibody count attached to a dystrophin unit: f 0 f 1 f 2 f 3 f 4 f 5 n

12 Some Applications Del Rio Vilas s Data on Estimating Hidden Scrapie in Great Britain 2005 sheep is kept in holdings in great Britain (and elsewhere) the occurrence of scrapie is monitored in the Compulsory Scrapie Flocks Scheme (CSFS) summarizing abbatoir survey, stock survey and the statutory reporting of clinical cases CSFS established since 2004 the frequency distribution of the scrapie count within each holding for the year 2005: f 0 f 1 f 2 f 3 f 4 f 5 f 6 f 7 f 8 n

13 Some Applications Hser s Data on Estimating Hidden Intravenous Drug Users in Los Angeles 1989 intravenous drug users in L.A. county were entered into the California Drug Abuse Data System (CAL-DADS) the data below refer to the frequency distribution of the episode count per drug user in 1989 the frequency distribution of the episode count per drug user for the year 1989: f 0 f 1 f 2 f 3 f 4 f 5 f 6-11,982 3,893 1,959 1, f 7 f 8 f 9 f 10 f 11 f 12 n ,198

14 Some Applications Drakos Data on Estimating Hidden Transnational Terrorist Activity data on terrorism are provided by various databases including RAND terrorism chronology, the terrorism indictment and DFI International research on terrorist organizations on 153 countries and the period terrorism is violence or the threat of violence, calculated to create an atmosphere of fear and alarm

15 Some Applications Drakos Data on Estimating Hidden Transnational Terrorist Activity the frequency distribution (Drakos 2007) of the count of transnational terrorist activity Y it in country i and year t: f 0 f 1 f 2 f 3 f 4 f 5 f f 7 f 8 f 9 f f 136 n similar to the McKendrick data there is an f 0 = 1, 357 however it is thought that there is a hidden number of terrorist activities which is of interest to be estimated estimate f 0 the frequency of periods and countries with unrecorded terrorist activites

16 Some Applications Introduction Some Applications Ways to a Realistic Solution Generalized Chao Bounds and a Monotonicity Property Zelterman Estimation Applications Outlook Zelterman as MLE Zelterman can be extended to case data Zelterman extended to higher counts Zelterman extended: truncation or censoring? Some simulation results

17 Ways to a Realistic Solution Formulation of the Problem and the Idea for its Solution Suppose we can find some model for the count probabilities p j = p j (λ) then estimate λ by some method (truncated likelihood) and then use the model for p 0 : 1 ˆN = 1 p 0 (ˆλ)

18 Ways to a Realistic Solution Formulation of the Problem and the Idea for its Solution Only to illustrate: Poisson model for the count probabilities then estimate λ and arrive at: p j = p j (λ) = exp( λ)λ j /j! ˆN = n 1 ˆp 0 = n 1 exp( ˆλ)

19 Ways to a Realistic Solution Formulation of the Problem and the Idea for its Solution However: using a simple Poisson model for the count probabilities is not appropriate, since every unit is different p j = p j (λ) = exp( λ)λ j /j! there is population heterogeneity so that more realistic p j = p j (λ) = 0 exp( t)t j /j!λ(t)dt where λ(t) stands for the heterogeneity distribution of the Poisson parameter

20 Ways to a Realistic Solution Formulation of the Problem and the Idea for its Solution instead of providing an estimate ˆλ(t) by means of nonparametric mixture models (Böhning and Schön 2005, JRSSC) interest is on two alternatives: 1. lower bound approach by Chao (1987, 1989, Biometrics) monotonicity of the ratios of consecutive mixed Poisson probabilities diagnostic device for presence of a mixed Poisson 2. robust approach of Zelterman (1988, JSPI) likelihood framework for the Zelterman estimate variance via Fisher information and covariate modelling via logistic regression (Böhning and Del Rio Vilas 2008, JABES) bias investigation

21 Ways to a Realistic Solution Introduction Some Applications Ways to a Realistic Solution Generalized Chao Bounds and a Monotonicity Property Zelterman Estimation Applications Outlook Zelterman as MLE Zelterman can be extended to case data Zelterman extended to higher counts Zelterman extended: truncation or censoring? Some simulation results

22 Generalized Chao Bounds and a Monotonicity Property Chao s Lower Bound Estimate Poisson mixture for j = 0, 1, 2,... p j = 0 exp( t)t j /j!λ(t)dt with unknown λ(t) for t > 0. Then, by the Cauchy-Schwartz inequality: E(XY ) 2 E(X 2 )E(Y 2 ) where X = exp( t) and Y = exp( t)t, and expected values are w.r.t. λ(t): ( 2 exp( t)tλ(t)dt) 0 0 exp( t)λ(t)dt 0 exp( t)t 2 λ(t)dt

23 Generalized Chao Bounds and a Monotonicity Property Chao s Lower Bound Estimate ( 2 exp( t)tλ(t)dt) 0 0 exp( t)λ(t)dt 2 p 2 1 p 0 2p 2 p2 1 2p 2 p 0 0 exp( t) t2 2 λ(t)dt which leads to Chao s lower bound estimate (truely nonparametric) ˆf 0 = f 2 1 2f 2

24 Generalized Chao Bounds and a Monotonicity Property A Monotonicity Property Cauchy-Schwartz more generally applicable use X = exp( t)t j 1 and Y = exp( t)t j+1 ): E(X 2 )E(Y 2 ) = ( E(XY ) 2 = 0 0 ) 2 exp( t)t j λ(t)dt exp( t)t j 1 λ(t)dt 0 exp( t)t j+1 λ(t)dt (j!p j ) 2 (j 1)!p j 1 (j + 1)!p j+1 p j j (j + 1) p j+1 p j 1 says: ratios of consecutive mixed Poissons are monotone p j

25 Generalized Chao Bounds and a Monotonicity Property Application of the Monotonicity Property: Ratio Plot p j plot j p j 1 against j for j = 1, 2,..., m 1 replace p j by observed frequency f j so that: f j plot j f j 1 against j for j = 1, 2,..., m 1 monotonicity indicative for a mixture (heterogeneity) conceptually related to the Poisson plot (Hoaglin 1980 American Statistician, Gart 1970, Rao 1971) difference: Poisson plot looks for a horizontal line, the ratio plot looks for a monotone increasing pattern

26 Generalized Chao Bounds and a Monotonicity Property 5 Ratio Plot for Matthews-Data 4 ratio j 3 4

27 Generalized Chao Bounds and a Monotonicity Property 14 Ratio Plot for Scrapie-Data ratio j 5 6 7

28 Generalized Chao Bounds and a Monotonicity Property Ratio Plot for Drug User Data ratio j

29 Generalized Chao Bounds and a Monotonicity Property 12 Ratio Plot for Terrorist Activity Data 10 8 ratio j

30 Generalized Chao Bounds and a Monotonicity Property Conclusions from the Ratio Plot frequently, we find in count data sets evidence for heterogeneity in form of a mixture concept applicable for both: zero-truncated and untruncated count data (normalizing constant cancels out!)

31 Generalized Chao Bounds and a Monotonicity Property Introduction Some Applications Ways to a Realistic Solution Generalized Chao Bounds and a Monotonicity Property Zelterman Estimation Applications Outlook Zelterman as MLE Zelterman can be extended to case data Zelterman extended to higher counts Zelterman extended: truncation or censoring? Some simulation results

32 Zelterman Estimation The Idea of Zelterman (1988) he noted that leading to the proposal and in particular for j = 1 λ = λj+1 λ j = (j + 1) λj+1 /(j + 1)! λ j /j! Po(j + 1; λ) λ = (j + 1) Po(j; λ) ˆλ j = (j + 1) f j+1 f j ˆλ = ˆλ 1 = 2 f 2 f 1

33 Zelterman Estimation ˆλ = 2 f 2 f1 is robust in the sense that it is not affected by any changes in counts larger than 2 count distribution need only to behave like a Poisson for counts of 1 or 2

34 Zelterman Estimation frequently: evidence for a mixture (Scrapie data) frequency Distribution Observed Frequency Fitted Poisson Mixture Fitted Simple Poisson count

35 Zelterman Estimation frequently: evidence for a 2-component mixture model Example McKendrick Matthews Scrapie Drug Use L.A. terrorist activity Non-parametric mixture model homogeneity 2-component 2-component 3-component 6-component

36 Zelterman Estimation Population Size Estimates for the Examples Example n simple MLE Chao Zelterman McKendrick Matthews Scrapie Drug Use L.A. 20,198 26,425 38,637 42,268 terrorist activity ,144 1,429

37 Zelterman Estimation Bias for Zelterman in a 2-component mixture model assume that for j = 0, 1, 2,... p j = (1 p)po(j; λ) + ppo(j; µ) bias of is determined by bias in ˆp 0 ˆN = n 1 ˆp 0

38 Zelterman Estimation Zelterman and non-parametric mixture Example n Chao Zelterman NPMLE of mixture McKendrick (1) Matthews (2) Scrapie (2) Drug Use L.A. 20,198 38,637 42,268 39,173 (2) 56,836 (3) terrorist activity 785 1,144 1,429 -

39 Zelterman Estimation Bias for Zelterman in a 2-component mixture model for Zelterman: ˆp 0 = exp( ˆλ) = exp( 2 f 2 f 1 ) replacing frequencies by expected values bias of ˆp 0 E(ˆp 0 ) exp( 2 p 2 p 1 ) E(ˆp 0 ) p 0 exp( 2 p 2 p 1 ) [(1 p)e λ + pe µ ]

40 Zelterman Estimation Bias for Zelterman in a 2-component mixture model bias of ˆp 0 ( exp 2 p ) 2 [(1 p)e λ + pe µ ] p 1 = exp ( (1 p)λ2 e λ + pµ 2 e µ ) (1 p)λe λ + pµe µ [(1 p)e λ + pe µ ] µ e λ (1 p)e λ = pe λ small amount of contamination = small bias

41 Zelterman Estimation Bias for Zelterman in a 2-component mixture model bias of ˆp 0 for large µ pe λ > 0 Zelterman overestimates (upper bound) small amount of contamination = small bias

42 Zelterman Estimation For Comparison: Bias of simple MLE in a 2-component mixture model bias of ˆp 0 = exp( Ȳ ) 0 by Jensen s inequality e [(1 p)λ+pµ] [(1 p)e λ + pe µ ] so that simple homogeneity model underestimates for all mixture models and for large µ bias of ˆp 0 = (1 p)e λ small amount of contamination = large bias

43 Zelterman Estimation next graph shows: bias of Zelterman ˆp 0 = pe λ bias of simple MLE ˆp 0 = (1 p)e λ

44 Zelterman Estimation Asymptotic Bias Variable Zelterman simple MLE p

45 Zelterman Estimation next graphs show: exact bias of Zelterman ˆp 0 exp ( (1 p)λ2 e λ + pµ 2 e µ ) (1 p)λe λ + pµe µ [(1 p)e λ + pe µ ] exact bias of simple MLE ˆp 0 e [(1 p)λ+pµ] [(1 p)e λ + pe µ ]

46 Zelterman Estimation Zelterman Bias Bias under Homogenous Poisson μ λ μ λ p = 0.05

47 Zelterman Estimation Zelterman Bias Bias under Homogenous Poisson λ μ μ λ p = 0.5

48 Zelterman Estimation Introduction Some Applications Ways to a Realistic Solution Generalized Chao Bounds and a Monotonicity Property Zelterman Estimation Applications Outlook Zelterman as MLE Zelterman can be extended to case data Zelterman extended to higher counts Zelterman extended: truncation or censoring? Some simulation results

49 Applications Zelterman larger than Chao? ˆN Z = n 1 exp( ˆλ) = n + n exp(ˆλ) 1 n + n 1 + ˆλ ˆλ 2 1 = n + n ˆλ ˆλ 2 = n + n 2f 2 f ( 2f2 f 1 ) 2 = n + ( ) f 2 n + 1 = 2f ˆN C 2 yes, if ˆλ is small (Böhning and Brittain 2008) ( ) f 2 1 n 2f 2 f 1 + f 2

50 Applications Zelterman larger than Chao? f 2 f 1 n f 1 +f 2 Example n Chao Zelterman McKendrick Matthews Scrapie Drug Use L.A. 20,198 38,637 42, terrorist activity 785 1,144 1,

51 Outlook Zelterman as MLE Zelterman Estimation offers Flexibility Zelterman estimate truncates all counts different from 1 or 2: write 1 p = p 1 = exp( λ)λ exp( λ)λ + exp( λ)λ 2 /2 = λ/2 exp( λ)λ 2 /2 p = p 2 = exp( λ)λ + exp( λ)λ 2 /2 = λ/2 1 + λ/2 and consider associated binomial log-likelihood f 1 log(p 1 ) + f 2 log(p 2 ) = f 1 log(1 p) + f 2 log(p) which is maximized for ˆp = ˆp 2 = f 1 f 1 +f 2, or ˆλ = 2ˆp 2 1 ˆp 2 = 2f 2 f 1

52 Outlook Zelterman as MLE Zelterman Estimation offers Flexibility a likelihood framework offers generalizations: (correct) variance estimate of the Zelterman estimator (Fisher information) (Böhning 2008, Statistical Methodology) extension of the estimator for case data incorporation of covariates (binomial logistic regression with log-link function to the Poisson parameter) (Böhning and van der Heijden 2008 Ann. Appl. Statist.) efficiency

53 Outlook Zelterman can be extended to case data Zelterman Estimation: Extension to Case Data Table: Illustration of Case Data with Individual Recapture Counts Unit i Count y i δ i Sex i Age i Male Male Female Male Female Female

54 Outlook Zelterman can be extended to case data Zelterman Estimation offers Flexibility Binomial likelihood for grouped data f 1 log(1 p) + f 2 log(p) becomes for case data (1 δ i ) log(1 p) + δ i log(p) i which becomes with covariate information on case i a logistic regression model p i = exp(βt x i ) 1 + exp(β T x i )

55 Outlook Zelterman can be extended to case data Zelterman Estimation offers Flexibility covariate information on case i p i = exp(βt x i ) 1 + exp(β T x i ) compare with parameterization in capture probability λ it follows that p i = λ i/2 1 + λ i /2 λ i = 2 exp(β T x i ) and the generalization of the Horvitz-Thompson estimator is n i=1 1 1 exp( 2e βt x i )

56 Outlook Zelterman extended to higher counts Generalizing the Idea of Zelterman: Improving Efficiency not only but also Po(j + 1; λ) λ = (j + 1) Po(j; λ) 1 {( }}{ j ) i=1 λ = λ λi j = i=1 λi j i=1 (i + 1) λi+1 (i+1)! j i=1 λi /i! = j i=1 (i + 1)Po(i + 1; λ) j i=1 Po(i; λ)

57 Outlook Zelterman extended to higher counts λ = leads to the proposal j i=1 (i + 1)Po(i + 1; λ) j i=1 Po(i; λ) ˆλ j = j i=1 (i + 1)f i+1 j i=1 f i and in particular for j = 1, j = 2, j = 3, j = 4 ˆλ = ˆλ 1 = 2 f 2 f 1, ˆλ 2 = 2f 2 + 3f 3 f 1 + f 2, ˆλ 3 = 2f 2 + 3f 3 + 4f 4 f 1 + f 2 + f 3 ˆλ 4 = 2f 2 + 3f 3 + 4f 4 + 5f 5 f 1 + f 2 + f 3 + f 4 n Z i = 1 exp( ˆλ i )

58 Outlook Zelterman extended: truncation or censoring? Generalizing the idea of Zelterman: truncation or censoring? disadvantage of conventional Zelterman: uses only f 1 and f 2 idea of truncation: ignore all counts different from 1 and 2 idea of censoring: use marginal likelihood for all counts of 2 and larger p 1 = P(Y = 1) = p 2+ = P(Y > 1) = exp( λ) 1 exp( λ) λ exp( λ) 1 exp( λ) [λ2 /2! + λ 3 /3! +...] = 1 exp( λ) 1 exp( λ) λ

59 Outlook Zelterman extended: truncation or censoring? Generalizing the idea of Zelterman: truncation or censoring? leads to the binomial likelihood f 1 log(p 1 ) + f 2+ log(p 2+ ) since we have leads to f 1 /n = ˆp 1 = f 1 /n exp( λ) 1 exp( λ) λ = 1 exp(λ) 1 λ 1 λ + λ 2 /2 λ ˆλ C = 2(n f 1) f 1

60 Outlook Zelterman extended: truncation or censoring? Bias reduced estimator ˆλ C = 2(n f 1) f 1 = 2f 2+2f f 1 is biased since equate E(ˆλ C ) 2λ2 /2 + 2λ 3 / λ ˆλ C = λ + λ 2 /3 λ + λ 2 /3 and solve for λ to provide ˆλ C bias = ˆλ C denote Z C bias = n 1 exp( λ C bias )

61 Outlook Some simulation results Simulation Experiment Goal: Compare Z j = n 1 exp( ˆλ j ) N C = n + f 1 2 n 2f 2 as well as Z C = corrected version and Chao s estimator 1 exp( ˆλ C ) and its bias count data: f j arise from 0.5Po(j; 0.5) + 0.5Po(j; µ) for µ = 1, 2,..., 7 and j = 0, 1, 2,... population size: N = f 0 + f = 100 f 0 is truncated N estimated using Z 1, Z 2, Z 3, Z C, Z C Bias and N C

62 Outlook Some simulation results Bias Estimator Chao Z_C Z_C_Bias Z1 Z2 Z3 Mean lambda 4 5

63 Outlook Some simulation results Standard Deviation Estimator Chao Z_C Z_C_Bias Z1 Z2 Z3 SD lambda 4 5

64 Outlook Some simulation results RMSE Estimator Chao Z_C Z_C_Bias Z1 Z2 Z3 RMSE lambda 4 5

65 Outlook Some simulation results Main Conclusions Generalized Chao Bounds lead to a diagnostic device for presence of heterogeneity Zelterman Estimation appears appropriate since it offers flexible likelihood concept Further generalizations of Zelterman estimation (covariate modelling and increasing efficiency by including higher counts) easily possible

Estimation MLE-Pandemic data MLE-Financial crisis data Evaluating estimators. Estimation. September 24, STAT 151 Class 6 Slide 1

Estimation MLE-Pandemic data MLE-Financial crisis data Evaluating estimators. Estimation. September 24, STAT 151 Class 6 Slide 1 Estimation September 24, 2018 STAT 151 Class 6 Slide 1 Pandemic data Treatment outcome, X, from n = 100 patients in a pandemic: 1 = recovered and 0 = not recovered 1 1 1 0 0 0 1 1 1 0 0 1 0 1 0 0 1 1 1

More information

Capture recapture estimation based upon the geometric distribution allowing for heterogeneity

Capture recapture estimation based upon the geometric distribution allowing for heterogeneity Capture recapture estimation based upon the geometric distribution allowing for heterogeneity Sa-aat Niwitpong, Dankmar Böhning, Peter G. M. van der Heijden & Heinz Holling Metrika International Journal

More information

Mixture Models for Capture- Recapture Data

Mixture Models for Capture- Recapture Data Mixture Models for Capture- Recapture Data Dankmar Böhning Invited Lecture at Mixture Models between Theory and Applications Rome, September 13, 2002 How many cases n in a population? Registry identifies

More information

One-Source Capture-Recapture: Models, applications and software

One-Source Capture-Recapture: Models, applications and software One-Source Capture-Recapture: Models, applications and software Maarten Cruyff, Guus Cruts, Peter G.M. van der Heijden, * Utrecht University Trimbos ISI 2011 1 Outline 1. One-source data 2. Models and

More information

Capture Recapture Methods for Estimating the Size of a Population: Dealing with Variable Capture Probabilities

Capture Recapture Methods for Estimating the Size of a Population: Dealing with Variable Capture Probabilities 18 Capture Recapture Methods for Estimating the Size of a Population: Dealing with Variable Capture Probabilities Louis-Paul Rivest and Sophie Baillargeon Université Laval, Québec, QC 18.1 Estimating Abundance

More information

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1 Parametric Modelling of Over-dispersed Count Data Part III / MMath (Applied Statistics) 1 Introduction Poisson regression is the de facto approach for handling count data What happens then when Poisson

More information

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model Other Survival Models (1) Non-PH models We briefly discussed the non-proportional hazards (non-ph) model λ(t Z) = λ 0 (t) exp{β(t) Z}, where β(t) can be estimated by: piecewise constants (recall how);

More information

Nonparametric estimation of the number of zeros in truncated count distributions

Nonparametric estimation of the number of zeros in truncated count distributions Nonparametric estimation of the number of zeros in truncated count distributions Célestin C. KOKONENDJI University of Franche-Comté, France Laboratoire de Mathématiques de Besançon - UMR 6623 CNRS-UFC

More information

Semiparametric Generalized Linear Models

Semiparametric Generalized Linear Models Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student

More information

Fully general Chao and Zelterman estimators with application to a whale shark population

Fully general Chao and Zelterman estimators with application to a whale shark population Appl. Statist. (2018) 67, Part 1, pp. 217 229 Fully general Chao and Zelterman estimators with application to a whale shark population Alessio Farcomeni Sapienza University of Rome, Italy [Received May

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What? You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?) I m not goin stop (What?) I m goin work harder (What?) Sir David

More information

Lecture 3. Truncation, length-bias and prevalence sampling

Lecture 3. Truncation, length-bias and prevalence sampling Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in

More information

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Guosheng Yin Department of Statistics and Actuarial Science The University of Hong Kong Joint work with J. Xu PSI and RSS Journal

More information

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 00 MODULE : Statistical Inference Time Allowed: Three Hours Candidates should answer FIVE questions. All questions carry equal marks. The

More information

Statistics 135 Fall 2008 Final Exam

Statistics 135 Fall 2008 Final Exam Name: SID: Statistics 135 Fall 2008 Final Exam Show your work. The number of points each question is worth is shown at the beginning of the question. There are 10 problems. 1. [2] The normal equations

More information

f(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain

f(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain 0.1. INTRODUCTION 1 0.1 Introduction R. A. Fisher, a pioneer in the development of mathematical statistics, introduced a measure of the amount of information contained in an observaton from f(x θ). Fisher

More information

t x 1 e t dt, and simplify the answer when possible (for example, when r is a positive even number). In particular, confirm that EX 4 = 3.

t x 1 e t dt, and simplify the answer when possible (for example, when r is a positive even number). In particular, confirm that EX 4 = 3. Mathematical Statistics: Homewor problems General guideline. While woring outside the classroom, use any help you want, including people, computer algebra systems, Internet, and solution manuals, but mae

More information

Lecture 5: Poisson and logistic regression

Lecture 5: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 3-5 March 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

Chapter 3: Maximum Likelihood Theory

Chapter 3: Maximum Likelihood Theory Chapter 3: Maximum Likelihood Theory Florian Pelgrin HEC September-December, 2010 Florian Pelgrin (HEC) Maximum Likelihood Theory September-December, 2010 1 / 40 1 Introduction Example 2 Maximum likelihood

More information

Capture Recapture to Estimate Criminal Populations. Synonyms

Capture Recapture to Estimate Criminal Populations. Synonyms Capture Recapture to Estimate Criminal Populations 267 C Brower AM, Carroll L (2007) Spatial and temporal aspects of alcohol-related crime in a college town. J Am Coll Health 55:267 75 Cohen L, Felson

More information

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data Malaysian Journal of Mathematical Sciences 11(3): 33 315 (217) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal homepage: http://einspem.upm.edu.my/journal Approximation of Survival Function by Taylor

More information

A class of latent marginal models for capture-recapture data with continuous covariates

A class of latent marginal models for capture-recapture data with continuous covariates A class of latent marginal models for capture-recapture data with continuous covariates F Bartolucci A Forcina Università di Urbino Università di Perugia FrancescoBartolucci@uniurbit forcina@statunipgit

More information

11 Survival Analysis and Empirical Likelihood

11 Survival Analysis and Empirical Likelihood 11 Survival Analysis and Empirical Likelihood The first paper of empirical likelihood is actually about confidence intervals with the Kaplan-Meier estimator (Thomas and Grunkmeier 1979), i.e. deals with

More information

WEIGHTED LIKELIHOOD NEGATIVE BINOMIAL REGRESSION

WEIGHTED LIKELIHOOD NEGATIVE BINOMIAL REGRESSION WEIGHTED LIKELIHOOD NEGATIVE BINOMIAL REGRESSION Michael Amiguet 1, Alfio Marazzi 1, Victor Yohai 2 1 - University of Lausanne, Institute for Social and Preventive Medicine, Lausanne, Switzerland 2 - University

More information

STATISTICS SYLLABUS UNIT I

STATISTICS SYLLABUS UNIT I STATISTICS SYLLABUS UNIT I (Probability Theory) Definition Classical and axiomatic approaches.laws of total and compound probability, conditional probability, Bayes Theorem. Random variable and its distribution

More information

Contents. Part I: Fundamentals of Bayesian Inference 1

Contents. Part I: Fundamentals of Bayesian Inference 1 Contents Preface xiii Part I: Fundamentals of Bayesian Inference 1 1 Probability and inference 3 1.1 The three steps of Bayesian data analysis 3 1.2 General notation for statistical inference 4 1.3 Bayesian

More information

arxiv: v1 [stat.ap] 27 Jul 2011

arxiv: v1 [stat.ap] 27 Jul 2011 The Annals of Applied Statistics 2011, Vol. 5, No. 2B, 1512 1533 DOI: 10.1214/10-AOAS436 c Institute of Mathematical Statistics, 2011 POPULATION SIZE ESTIMATION BASED UPON RATIOS OF RECAPTURE PROBABILITIES

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2009 Paper 248 Application of Time-to-Event Methods in the Assessment of Safety in Clinical Trials Kelly

More information

Introduction to Probabilistic Machine Learning

Introduction to Probabilistic Machine Learning Introduction to Probabilistic Machine Learning Piyush Rai Dept. of CSE, IIT Kanpur (Mini-course 1) Nov 03, 2015 Piyush Rai (IIT Kanpur) Introduction to Probabilistic Machine Learning 1 Machine Learning

More information

Economics 620, Lecture 9: Asymptotics III: Maximum Likelihood Estimation

Economics 620, Lecture 9: Asymptotics III: Maximum Likelihood Estimation Economics 620, Lecture 9: Asymptotics III: Maximum Likelihood Estimation Nicholas M. Kiefer Cornell University Professor N. M. Kiefer (Cornell University) Lecture 9: Asymptotics III(MLE) 1 / 20 Jensen

More information

Generalized Linear Models (1/29/13)

Generalized Linear Models (1/29/13) STA613/CBB540: Statistical methods in computational biology Generalized Linear Models (1/29/13) Lecturer: Barbara Engelhardt Scribe: Yangxiaolu Cao When processing discrete data, two commonly used probability

More information

Lecture 4: Generalized Linear Mixed Models

Lecture 4: Generalized Linear Mixed Models Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 11-12 December 2014 An example with one random effect An example with two nested random effects

More information

Propensity Score Weighting with Multilevel Data

Propensity Score Weighting with Multilevel Data Propensity Score Weighting with Multilevel Data Fan Li Department of Statistical Science Duke University October 25, 2012 Joint work with Alan Zaslavsky and Mary Beth Landrum Introduction In comparative

More information

Empirical Likelihood Methods for Sample Survey Data: An Overview

Empirical Likelihood Methods for Sample Survey Data: An Overview AUSTRIAN JOURNAL OF STATISTICS Volume 35 (2006), Number 2&3, 191 196 Empirical Likelihood Methods for Sample Survey Data: An Overview J. N. K. Rao Carleton University, Ottawa, Canada Abstract: The use

More information

Measurement Error and Linear Regression of Astronomical Data. Brandon Kelly Penn State Summer School in Astrostatistics, June 2007

Measurement Error and Linear Regression of Astronomical Data. Brandon Kelly Penn State Summer School in Astrostatistics, June 2007 Measurement Error and Linear Regression of Astronomical Data Brandon Kelly Penn State Summer School in Astrostatistics, June 2007 Classical Regression Model Collect n data points, denote i th pair as (η

More information

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Anastasios (Butch) Tsiatis Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

Exponential Families

Exponential Families Exponential Families David M. Blei 1 Introduction We discuss the exponential family, a very flexible family of distributions. Most distributions that you have heard of are in the exponential family. Bernoulli,

More information

Lecture 5 Models and methods for recurrent event data

Lecture 5 Models and methods for recurrent event data Lecture 5 Models and methods for recurrent event data Recurrent and multiple events are commonly encountered in longitudinal studies. In this chapter we consider ordered recurrent and multiple events.

More information

Indirect Sampling in Case of Asymmetrical Link Structures

Indirect Sampling in Case of Asymmetrical Link Structures Indirect Sampling in Case of Asymmetrical Link Structures Torsten Harms Abstract Estimation in case of indirect sampling as introduced by Huang (1984) and Ernst (1989) and developed by Lavalle (1995) Deville

More information

Imprecise Prior for Imprecise Inference on Poisson Sampling Model

Imprecise Prior for Imprecise Inference on Poisson Sampling Model Imprecise Prior for Imprecise Inference on Poisson Sampling Model A Thesis Submitted to the College of Graduate Studies and Research in Partial Fulfillment of the Requirements for the degree of Doctor

More information

Package SPECIES. R topics documented: April 23, Type Package. Title Statistical package for species richness estimation. Version 1.

Package SPECIES. R topics documented: April 23, Type Package. Title Statistical package for species richness estimation. Version 1. Package SPECIES April 23, 2011 Type Package Title Statistical package for species richness estimation Version 1.0 Date 2010-01-24 Author Ji-Ping Wang, Maintainer Ji-Ping Wang

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Lecture 10: Introduction to Logistic Regression

Lecture 10: Introduction to Logistic Regression Lecture 10: Introduction to Logistic Regression Ani Manichaikul amanicha@jhsph.edu 2 May 2007 Logistic Regression Regression for a response variable that follows a binomial distribution Recall the binomial

More information

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL The Cox PH model: λ(t Z) = λ 0 (t) exp(β Z). How do we estimate the survival probability, S z (t) = S(t Z) = P (T > t Z), for an individual with covariates

More information

Estimating the number of opiate users in Rotterdam using statistical models for incomplete count data *

Estimating the number of opiate users in Rotterdam using statistical models for incomplete count data * Estimating the number of opiate users in Rotterdam using statistical models for incomplete count data * December 1997 Filip Smit 1, Jaap Toet 2, Peter van der Heijden 3 1 Methods Section, Trimbos Institute,

More information

Estimating direct effects in cohort and case-control studies

Estimating direct effects in cohort and case-control studies Estimating direct effects in cohort and case-control studies, Ghent University Direct effects Introduction Motivation The problem of standard approaches Controlled direct effect models In many research

More information

CRISP: Capture-Recapture Interactive Simulation Package

CRISP: Capture-Recapture Interactive Simulation Package CRISP: Capture-Recapture Interactive Simulation Package George Volichenko Carnegie Mellon University Pittsburgh, PA gvoliche@andrew.cmu.edu December 17, 2012 Contents 1 Executive Summary 1 2 Introduction

More information

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Statistics - Lecture One. Outline. Charlotte Wickham  1. Basic ideas about estimation Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence

More information

Targeted Maximum Likelihood Estimation in Safety Analysis

Targeted Maximum Likelihood Estimation in Safety Analysis Targeted Maximum Likelihood Estimation in Safety Analysis Sam Lendle 1 Bruce Fireman 2 Mark van der Laan 1 1 UC Berkeley 2 Kaiser Permanente ISPE Advanced Topics Session, Barcelona, August 2012 1 / 35

More information

Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk

Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Ann Inst Stat Math (0) 64:359 37 DOI 0.007/s0463-00-036-3 Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Paul Vos Qiang Wu Received: 3 June 009 / Revised:

More information

Density Estimation. Seungjin Choi

Density Estimation. Seungjin Choi Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/

More information

A hidden semi-markov model for the occurrences of water pipe bursts

A hidden semi-markov model for the occurrences of water pipe bursts A hidden semi-markov model for the occurrences of water pipe bursts T. Economou 1, T.C. Bailey 1 and Z. Kapelan 1 1 School of Engineering, Computer Science and Mathematics, University of Exeter, Harrison

More information

Constrained estimation for binary and survival data

Constrained estimation for binary and survival data Constrained estimation for binary and survival data Jeremy M. G. Taylor Yong Seok Park John D. Kalbfleisch Biostatistics, University of Michigan May, 2010 () Constrained estimation May, 2010 1 / 43 Outline

More information

Frailty Models and Copulas: Similarities and Differences

Frailty Models and Copulas: Similarities and Differences Frailty Models and Copulas: Similarities and Differences KLARA GOETHALS, PAUL JANSSEN & LUC DUCHATEAU Department of Physiology and Biometrics, Ghent University, Belgium; Center for Statistics, Hasselt

More information

Final Exam. Name: Solution:

Final Exam. Name: Solution: Final Exam. Name: Instructions. Answer all questions on the exam. Open books, open notes, but no electronic devices. The first 13 problems are worth 5 points each. The rest are worth 1 point each. HW1.

More information

General Regression Model

General Regression Model Scott S. Emerson, M.D., Ph.D. Department of Biostatistics, University of Washington, Seattle, WA 98195, USA January 5, 2015 Abstract Regression analysis can be viewed as an extension of two sample statistical

More information

MS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari

MS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari MS&E 226: Small Data Lecture 11: Maximum likelihood (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 18 The likelihood function 2 / 18 Estimating the parameter This lecture develops the methodology behind

More information

A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL

A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL Discussiones Mathematicae Probability and Statistics 36 206 43 5 doi:0.75/dmps.80 A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL Tadeusz Bednarski Wroclaw University e-mail: t.bednarski@prawo.uni.wroc.pl

More information

Multi-level Models: Idea

Multi-level Models: Idea Review of 140.656 Review Introduction to multi-level models The two-stage normal-normal model Two-stage linear models with random effects Three-stage linear models Two-stage logistic regression with random

More information

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Glenn Heller and Jing Qin Department of Epidemiology and Biostatistics Memorial

More information

Statistical Data Mining and Machine Learning Hilary Term 2016

Statistical Data Mining and Machine Learning Hilary Term 2016 Statistical Data Mining and Machine Learning Hilary Term 2016 Dino Sejdinovic Department of Statistics Oxford Slides and other materials available at: http://www.stats.ox.ac.uk/~sejdinov/sdmml Naïve Bayes

More information

Variations. ECE 6540, Lecture 10 Maximum Likelihood Estimation

Variations. ECE 6540, Lecture 10 Maximum Likelihood Estimation Variations ECE 6540, Lecture 10 Last Time BLUE (Best Linear Unbiased Estimator) Formulation Advantages Disadvantages 2 The BLUE A simplification Assume the estimator is a linear system For a single parameter

More information

Lecture 14: Introduction to Poisson Regression

Lecture 14: Introduction to Poisson Regression Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why

More information

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week

More information

2 Inference for Multinomial Distribution

2 Inference for Multinomial Distribution Markov Chain Monte Carlo Methods Part III: Statistical Concepts By K.B.Athreya, Mohan Delampady and T.Krishnan 1 Introduction In parts I and II of this series it was shown how Markov chain Monte Carlo

More information

Logistic regression: Miscellaneous topics

Logistic regression: Miscellaneous topics Logistic regression: Miscellaneous topics April 11 Introduction We have covered two approaches to inference for GLMs: the Wald approach and the likelihood ratio approach I claimed that the likelihood ratio

More information

Lecture 2: Poisson and logistic regression

Lecture 2: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 11-12 December 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

Theory of Maximum Likelihood Estimation. Konstantin Kashin

Theory of Maximum Likelihood Estimation. Konstantin Kashin Gov 2001 Section 5: Theory of Maximum Likelihood Estimation Konstantin Kashin February 28, 2013 Outline Introduction Likelihood Examples of MLE Variance of MLE Asymptotic Properties What is Statistical

More information

To appear in The American Statistician vol. 61 (2007) pp

To appear in The American Statistician vol. 61 (2007) pp How Can the Score Test Be Inconsistent? David A Freedman ABSTRACT: The score test can be inconsistent because at the MLE under the null hypothesis the observed information matrix generates negative variance

More information

Graduate Econometrics I: Maximum Likelihood I

Graduate Econometrics I: Maximum Likelihood I Graduate Econometrics I: Maximum Likelihood I Yves Dominicy Université libre de Bruxelles Solvay Brussels School of Economics and Management ECARES Yves Dominicy Graduate Econometrics I: Maximum Likelihood

More information

COS513 LECTURE 8 STATISTICAL CONCEPTS

COS513 LECTURE 8 STATISTICAL CONCEPTS COS513 LECTURE 8 STATISTICAL CONCEPTS NIKOLAI SLAVOV AND ANKUR PARIKH 1. MAKING MEANINGFUL STATEMENTS FROM JOINT PROBABILITY DISTRIBUTIONS. A graphical model (GM) represents a family of probability distributions

More information

Chapter 6 Expectation and Conditional Expectation. Lectures Definition 6.1. Two random variables defined on a probability space are said to be

Chapter 6 Expectation and Conditional Expectation. Lectures Definition 6.1. Two random variables defined on a probability space are said to be Chapter 6 Expectation and Conditional Expectation Lectures 24-30 In this chapter, we introduce expected value or the mean of a random variable. First we define expectation for discrete random variables

More information

Censoring mechanisms

Censoring mechanisms Censoring mechanisms Patrick Breheny September 3 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/23 Fixed vs. random censoring In the previous lecture, we derived the contribution to the likelihood

More information

Review of Maximum Likelihood Estimators

Review of Maximum Likelihood Estimators Libby MacKinnon CSE 527 notes Lecture 7, October 7, 2007 MLE and EM Review of Maximum Likelihood Estimators MLE is one of many approaches to parameter estimation. The likelihood of independent observations

More information

Exercises and Answers to Chapter 1

Exercises and Answers to Chapter 1 Exercises and Answers to Chapter The continuous type of random variable X has the following density function: a x, if < x < a, f (x), otherwise. Answer the following questions. () Find a. () Obtain mean

More information

Statistical Methods for Alzheimer s Disease Studies

Statistical Methods for Alzheimer s Disease Studies Statistical Methods for Alzheimer s Disease Studies Rebecca A. Betensky, Ph.D. Department of Biostatistics, Harvard T.H. Chan School of Public Health July 19, 2016 1/37 OUTLINE 1 Statistical collaborations

More information

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #27 Estimation-I Today, I will introduce the problem of

More information

Statistics 3858 : Maximum Likelihood Estimators

Statistics 3858 : Maximum Likelihood Estimators Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,

More information

Association studies and regression

Association studies and regression Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration

More information

UNIVERSITY OF CALIFORNIA, SAN DIEGO

UNIVERSITY OF CALIFORNIA, SAN DIEGO UNIVERSITY OF CALIFORNIA, SAN DIEGO Estimation of the primary hazard ratio in the presence of a secondary covariate with non-proportional hazards An undergraduate honors thesis submitted to the Department

More information

Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method

Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method Yan Wang 1, Michael Ong 2, Honghu Liu 1,2,3 1 Department of Biostatistics, UCLA School

More information

Analysis of competing risks data and simulation of data following predened subdistribution hazards

Analysis of competing risks data and simulation of data following predened subdistribution hazards Analysis of competing risks data and simulation of data following predened subdistribution hazards Bernhard Haller Institut für Medizinische Statistik und Epidemiologie Technische Universität München 27.05.2013

More information

Generalized Linear Models Introduction

Generalized Linear Models Introduction Generalized Linear Models Introduction Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Generalized Linear Models For many problems, standard linear regression approaches don t work. Sometimes,

More information

Contents Lecture 4. Lecture 4 Linear Discriminant Analysis. Summary of Lecture 3 (II/II) Summary of Lecture 3 (I/II)

Contents Lecture 4. Lecture 4 Linear Discriminant Analysis. Summary of Lecture 3 (II/II) Summary of Lecture 3 (I/II) Contents Lecture Lecture Linear Discriminant Analysis Fredrik Lindsten Division of Systems and Control Department of Information Technology Uppsala University Email: fredriklindsten@ituuse Summary of lecture

More information

45 6 = ( )/6! = 20 6 = selected subsets: 6 = and 10

45 6 = ( )/6! = 20 6 = selected subsets: 6 = and 10 Homework 1. Selected Solutions, 9/12/03. Ch. 2, # 34, p. 74. The most natural sample space S is the set of unordered -tuples, i.e., the set of -element subsets of the list of 45 workers (labelled 1,...,

More information

Generalized logit models for nominal multinomial responses. Local odds ratios

Generalized logit models for nominal multinomial responses. Local odds ratios Generalized logit models for nominal multinomial responses Categorical Data Analysis, Summer 2015 1/17 Local odds ratios Y 1 2 3 4 1 π 11 π 12 π 13 π 14 π 1+ X 2 π 21 π 22 π 23 π 24 π 2+ 3 π 31 π 32 π

More information

GENERALIZED LINEAR MIXED MODELS AND MEASUREMENT ERROR. Raymond J. Carroll: Texas A&M University

GENERALIZED LINEAR MIXED MODELS AND MEASUREMENT ERROR. Raymond J. Carroll: Texas A&M University GENERALIZED LINEAR MIXED MODELS AND MEASUREMENT ERROR Raymond J. Carroll: Texas A&M University Naisyin Wang: Xihong Lin: Roberto Gutierrez: Texas A&M University University of Michigan Southern Methodist

More information

BTRY 4090: Spring 2009 Theory of Statistics

BTRY 4090: Spring 2009 Theory of Statistics BTRY 4090: Spring 2009 Theory of Statistics Guozhang Wang September 25, 2010 1 Review of Probability We begin with a real example of using probability to solve computationally intensive (or infeasible)

More information

Application of Latent Class with Random Effects Models to Longitudinal Data. Ken Beath Macquarie University

Application of Latent Class with Random Effects Models to Longitudinal Data. Ken Beath Macquarie University Application of Latent Class with Random Effects Models to Longitudinal Data Ken Beath Macquarie University Introduction Latent trajectory is a method of classifying subjects based on longitudinal data

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview

Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

More information

The Accelerated Failure Time Model Under Biased. Sampling

The Accelerated Failure Time Model Under Biased. Sampling The Accelerated Failure Time Model Under Biased Sampling Micha Mandel and Ya akov Ritov Department of Statistics, The Hebrew University of Jerusalem, Israel July 13, 2009 Abstract Chen (2009, Biometrics)

More information

Multivariate Survival Analysis

Multivariate Survival Analysis Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in

More information

Likelihood Construction, Inference for Parametric Survival Distributions

Likelihood Construction, Inference for Parametric Survival Distributions Week 1 Likelihood Construction, Inference for Parametric Survival Distributions In this section we obtain the likelihood function for noninformatively rightcensored survival data and indicate how to make

More information

β j = coefficient of x j in the model; β = ( β1, β2,

β j = coefficient of x j in the model; β = ( β1, β2, Regression Modeling of Survival Time Data Why regression models? Groups similar except for the treatment under study use the nonparametric methods discussed earlier. Groups differ in variables (covariates)

More information

ON THE FAILURE RATE ESTIMATION OF THE INVERSE GAUSSIAN DISTRIBUTION

ON THE FAILURE RATE ESTIMATION OF THE INVERSE GAUSSIAN DISTRIBUTION ON THE FAILURE RATE ESTIMATION OF THE INVERSE GAUSSIAN DISTRIBUTION ZHENLINYANGandRONNIET.C.LEE Department of Statistics and Applied Probability, National University of Singapore, 3 Science Drive 2, Singapore

More information

Federated analyses. technical, statistical and human challenges

Federated analyses. technical, statistical and human challenges Federated analyses technical, statistical and human challenges Bénédicte Delcoigne, Statistician, PhD Department of Medicine (Solna), Unit of Clinical Epidemiology, Karolinska Institutet What is it? When

More information

Lecture 4 Discriminant Analysis, k-nearest Neighbors

Lecture 4 Discriminant Analysis, k-nearest Neighbors Lecture 4 Discriminant Analysis, k-nearest Neighbors Fredrik Lindsten Division of Systems and Control Department of Information Technology Uppsala University. Email: fredrik.lindsten@it.uu.se fredrik.lindsten@it.uu.se

More information

STATISTICAL INFERENCE FOR SURVEY DATA ANALYSIS

STATISTICAL INFERENCE FOR SURVEY DATA ANALYSIS STATISTICAL INFERENCE FOR SURVEY DATA ANALYSIS David A Binder and Georgia R Roberts Methodology Branch, Statistics Canada, Ottawa, ON, Canada K1A 0T6 KEY WORDS: Design-based properties, Informative sampling,

More information

Non-parametric bootstrap and small area estimation to mitigate bias in crowdsourced data Simulation study and application to perceived safety

Non-parametric bootstrap and small area estimation to mitigate bias in crowdsourced data Simulation study and application to perceived safety Non-parametric bootstrap and small area estimation to mitigate bias in crowdsourced data Simulation study and application to perceived safety David Buil-Gil, Reka Solymosi Centre for Criminology and Criminal

More information