Statistical Models for Defective Count Data
Transcription
1 Statistical Models for Defective Count Data
Gerhard Neubauer (a), Gordana Djuraš (a), and Herwig Friedl (b)
(a) Statistical Applications, Joanneum Research, Graz, Austria
(b) Institute of Statistics, University of Technology, Graz, Austria
Statistik Tage 2011, Graz, September 7th-9th
2 Introduction
3 Defective Count Data
Defective: too many and/or too few cases are counted, recorded, reported, ...
Most prominent example: Criminology
Crimes associated with shame (sexual offences, domestic violence)
Theft of low-value goods
These are likely not to be reported to the police, so official numbers are too low: Under-Reporting
4 Defective Count Data
Less considered: insurance companies suspect that some thefts are reported only in order to make an insurance claim
Most common with bicycles and skiing equipment
These are likely to be reported to the police, so official numbers are too high: Over-Reporting
More precisely: Under-Reporting of fraud, Over-Reporting of theft
5 Criminology: Bicycle Theft Data
Weekly counts of bicycle theft in an Austrian region
[Figure: weekly counts plotted against week]
6 Bicycle Theft?
7 Defective Count Data
Health data
Registers for infectious diseases (HIV) and chronic diseases (diabetes)
Cause-of-death registers (heart attack)
Any misclassification causes both errors: Under-Reporting in the right category and Over-Reporting in the wrong category
8 Health: Heart Attack Data
Monthly counts of heart attack discharges from Styrian hospitals
[Figure: monthly counts plotted against month]
9 Defective Count Data
Criminology
Health data
Insurance data: traffic accidents with minor damage
Production: number of defective goods in production
Warranty: number of goods sent back for warranty claims
...
Any sample of count data may be defective due to recording failures
10 Under-Reporting
How to estimate the total number of cases?
11 Bernoulli Sampling
One observation, case reported: Yes $R$, No $1 - R$
$\lambda$ observations, case reported: Yes $Y$, No $D$
$R \sim \text{Bernoulli}(\pi)$
$Y = \sum_{i=1}^{\lambda} R_i \sim \text{Binomial}(\lambda, \pi)$
$Y$ ... (random) number of reported cases
$D$ ... (random) number of unreported cases
$\lambda$ ... total number of cases
$\pi$ ... reporting probability
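The sampling scheme on this slide can be illustrated with a small simulation. This is only a sketch: the values of `lam` and `pi`, the seed, and the function name `simulate_reported` are hypothetical choices for illustration and do not come from the talk. Each of the `lam` cases is reported independently with probability `pi`, so the simulated mean and variance of the reported count should be close to the binomial moments.

```python
import random

def simulate_reported(total_cases, report_prob, rng):
    """Draw Y = R_1 + ... + R_lambda with R_i ~ Bernoulli(report_prob)."""
    return sum(1 for _ in range(total_cases) if rng.random() < report_prob)

rng = random.Random(2011)   # arbitrary fixed seed for reproducibility
lam, pi = 40, 0.6           # hypothetical total number of cases and reporting probability
draws = [simulate_reported(lam, pi, rng) for _ in range(20000)]

# Empirical moments of the reported count Y
mean_y = sum(draws) / len(draws)
var_y = sum((y - mean_y) ** 2 for y in draws) / (len(draws) - 1)

print(f"mean of Y: {mean_y:.2f}  (lambda * pi = {lam * pi:.2f})")
print(f"var of Y:  {var_y:.2f}  (lambda * pi * (1 - pi) = {lam * pi * (1 - pi):.2f})")
```

The number of unreported cases in each draw is simply `lam - y`, which corresponds to $D$ on the slide.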
13 Binomial Model
Case reported: Yes $Y$, No $D$
$Y = \sum_{i=1}^{\lambda} R_i \sim \text{Binomial}(\lambda, \pi)$
$E(Y) = \mu = \lambda\pi$
$\text{var}(Y) = \sigma^2 = \lambda\pi(1 - \pi)$
Both $\lambda$ and $\pi$ have to be estimated
No longer a member of the 1-parameter exponential family
Limitation to data with $s^2 < \bar{y}$
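The restriction to $s^2 < \bar{y}$ can be seen by matching the two binomial moments to the sample mean and variance: the ratio var/mean equals $1 - \pi$, which must lie strictly between 0 and 1. The sketch below is a plain method-of-moments illustration, not the estimation procedure used in the talk; the function name and the example counts are made up.

```python
def binomial_moment_estimates(counts):
    """Moment estimates (lambda_hat, pi_hat) for iid Binomial(lambda, pi) counts.

    Uses mean = lambda * pi and var = lambda * pi * (1 - pi),
    hence pi_hat = 1 - s^2 / ybar and lambda_hat = ybar / pi_hat.
    """
    n = len(counts)
    ybar = sum(counts) / n
    s2 = sum((y - ybar) ** 2 for y in counts) / (n - 1)
    if not 0 < s2 < ybar:
        # Equidispersed or overdispersed data give pi_hat <= 0
        raise ValueError("binomial model requires 0 < s^2 < ybar")
    pi_hat = 1 - s2 / ybar
    lam_hat = ybar / pi_hat
    return lam_hat, pi_hat

# Hypothetical underdispersed counts: ybar = 24, s^2 = 4
lam_hat, pi_hat = binomial_moment_estimates([22, 25, 24, 27, 21, 26, 23, 24])
print(lam_hat, pi_hat)
```

For overdispersed data the function raises an error, which mirrors the limitation stated on the slide.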
14 Is iid Assumption Appropriate?
[Figures: weekly counts against week; monthly counts against month]
Trend/Seasonality: Regression approach
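15 Regression Approach
Consider $Y_t \overset{\text{ind}}{\sim} \text{Binomial}(\lambda_t, \pi)$, $t = 1, \dots, T$, where
$\lambda_t = \exp(x_t^\top\beta)$, $\beta \in \mathbb{R}^d$, and $\pi = \frac{\exp(\alpha)}{1 + \exp(\alpha)}$, $\alpha \in \mathbb{R}$,
to ensure $\lambda_t > 0$ and $0 < \pi < 1$
Likelihood contribution of $y_t$:
$L(\alpha, \beta \mid y_t) = \binom{\lambda_t}{y_t} \pi^{y_t} (1 - \pi)^{\lambda_t - y_t}$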
17 Maximum Likelihood Estimation
Sample Log-Likelihood
$l(\alpha, \beta \mid y) = \sum_{t=1}^{T} \log L(\alpha, \beta \mid y_t) = \sum_{t=1}^{T} \log\left[\binom{\lambda_t}{y_t} \pi^{y_t} (1 - \pi)^{\lambda_t - y_t}\right]$
Score and observed Information are analytically evaluated
Newton-Raphson procedure till convergence
Still problems with overdispersion!
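A direct evaluation of this sample log-likelihood might look as follows. It is a sketch under assumptions: because $\lambda_t = \exp(x_t^\top\beta)$ is generally not an integer, the binomial coefficient is computed through log-gamma functions (an assumption about how the coefficient is generalized), and observations with $y_t > \lambda_t$ are treated as infeasible. The function names and the toy inputs are hypothetical, and no Newton-Raphson step is shown.

```python
import math

def log_binom_coef(n, k):
    """log of the binomial coefficient, extended to real n via the gamma function."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def binomial_loglik(alpha, beta, X, y):
    """Sample log-likelihood l(alpha, beta | y) for Y_t ~ Binomial(lambda_t, pi)."""
    pi = 1.0 / (1.0 + math.exp(-alpha))          # logistic transform keeps 0 < pi < 1
    total = 0.0
    for x_t, y_t in zip(X, y):
        lam_t = math.exp(sum(b * x for b, x in zip(beta, x_t)))  # log link keeps lambda_t > 0
        if y_t > lam_t:
            return -math.inf                      # more reported cases than total cases
        total += (log_binom_coef(lam_t, y_t)
                  + y_t * math.log(pi)
                  + (lam_t - y_t) * math.log1p(-pi))
    return total

# Toy check: one observation, intercept only, lambda = 10, pi = 0.5, y = 4
value = binomial_loglik(0.0, [math.log(10.0)], [[1.0]], [4])
print(value, math.log(210 / 1024))
```

Such a function could be handed to a generic optimizer; the talk itself uses analytically evaluated score and information with Newton-Raphson.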
18 Randomized Parameter Models
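19 Beta-Binomial Model
Randomize the reporting probability $\pi$
$Y_t \mid P_t \sim \text{Binomial}(\lambda_t, P_t)$, $P_t \sim \text{Beta}(\gamma, \delta)$
$\Rightarrow Y_t \sim \text{Beta-Binomial}(\lambda_t, \gamma, \delta)$
We use the parameterization $\theta = \gamma + \delta$ and $\pi = \frac{\gamma}{\gamma + \delta} = E(P_t)$
Hence $E(Y_t) = \mu_t = \lambda_t\pi$ and $\text{var}(Y_t) = \mu_t(1 - \pi)\phi_t$, where $\phi_t = (\lambda_t + \gamma + \delta)(1 + \gamma + \delta)^{-1}$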
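The mean and variance of the beta-binomial model can be checked numerically against its probability mass function. The sketch below uses the standard beta-binomial pmf with an integer number of trials; the parameter values and function names are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math

def log_beta(a, b):
    """log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_pmf(y, lam, gamma, delta):
    """P(Y = y) for Y | P ~ Binomial(lam, P), P ~ Beta(gamma, delta)."""
    log_choose = math.lgamma(lam + 1) - math.lgamma(y + 1) - math.lgamma(lam - y + 1)
    return math.exp(log_choose + log_beta(y + gamma, lam - y + delta) - log_beta(gamma, delta))

def beta_binomial_moments(lam, gamma, delta):
    """Closed-form mean and variance: mu = lam * pi, var = mu * (1 - pi) * phi."""
    pi = gamma / (gamma + delta)
    mu = lam * pi
    phi = (lam + gamma + delta) / (1 + gamma + delta)
    return mu, mu * (1 - pi) * phi

lam, gamma, delta = 20, 3.0, 2.0   # hypothetical values: pi = 0.6, theta = 5
pmf = [beta_binomial_pmf(y, lam, gamma, delta) for y in range(lam + 1)]
mean_num = sum(y * p for y, p in enumerate(pmf))
var_num = sum((y - mean_num) ** 2 * p for y, p in enumerate(pmf))
mu, var = beta_binomial_moments(lam, gamma, delta)
print(mean_num, mu)   # both close to 12
print(var_num, var)   # both close to 20
```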
20 Poisson Models
Randomize the total number of cases $\lambda$
$Y_t \mid L_t \sim \text{Binomial}(L_t, \pi)$, $L_t \sim \text{Poisson}(\lambda_t)$
$\Rightarrow Y_t \sim \text{Poisson}(\lambda_t\pi)$
Model not identified
Winkelmann (2000): Pogit Model
$Y_t \sim \text{Poisson}(\lambda_t\pi_t)$
$\lambda_t = \exp(x_t^\top\beta)$ and $\pi_t = \frac{\exp(z_t^\top\alpha)}{1 + \exp(z_t^\top\alpha)}$
with two (disjoint) sets of regressors $x_t$ and $z_t$
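Why the plain Poisson model is not identified can be shown in a few lines: the distribution depends on $\lambda_t$ and $\pi$ only through their product, so different pairs give identical probabilities. The sketch below also evaluates the Pogit mean with separate regressors; all numbers and names are hypothetical.

```python
import math

def poisson_pmf(y, mean):
    """P(Y = y) for Y ~ Poisson(mean)."""
    return math.exp(y * math.log(mean) - mean - math.lgamma(y + 1))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pogit_mean(x_t, z_t, beta, alpha):
    """Pogit mean lambda_t * pi_t with a log link for lambda_t and a logit link for pi_t."""
    lam_t = math.exp(dot(x_t, beta))
    pi_t = 1.0 / (1.0 + math.exp(-dot(z_t, alpha)))
    return lam_t * pi_t

# Two different (lambda, pi) pairs with the same product give the same pmf
p_a = poisson_pmf(7, 40 * 0.5)
p_b = poisson_pmf(7, 25 * 0.8)
print(p_a, p_b)

# Pogit mean with hypothetical intercept-only regressors: lambda = 30, pi = 0.5
m = pogit_mean([1.0], [1.0], [math.log(30.0)], [0.0])
print(m)
```

With distinct regressor sets in `x_t` and `z_t`, the two components are affected by different covariates, which is what allows them to be separated in the Pogit model.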
21 Negative Binomial Model
Randomize the total number of cases $\lambda$
$Y_t \mid L_t \sim \text{Binomial}(L_t, \pi)$, $L_t \mid K_t \sim \text{Poisson}(K_t\lambda_t)$, $K_t \sim \text{Gamma}(\omega_t, \omega_t)$
$\Rightarrow Y_t \sim \text{Negative Binomial}(\omega_t, 1 - \pi)$
where $\omega_t = \lambda_t(1 - \pi)$ is the expected number of unreported cases
$E(Y_t) = \mu_t = \lambda_t\pi$
$\text{var}(Y_t) = \mu_t + \frac{\mu_t^2}{\omega_t}$
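22 Beta-Poisson Model
Randomize $\pi$ and $\lambda$
$Y_t \mid L_t, P_t \sim \text{Binomial}(L_t, P_t)$, $L_t \sim \text{Poisson}(\lambda_t)$, $P_t \sim \text{Beta}(\gamma, \delta)$
$\Rightarrow Y_t \sim \text{Beta-Poisson}(\lambda_t, \gamma, \delta)$
We use the parameterization $\theta = \gamma + \delta$ and $\pi = \frac{\gamma}{\gamma + \delta} = E(P_t)$
Hence $E(Y_t) = \mu_t = \lambda_t\pi$ and $\text{var}(Y_t) = \mu_t\phi_t$, where $\phi_t = 1 + \lambda_t(1 - \pi)(1 + \gamma + \delta)^{-1}$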
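To compare how the three randomized-parameter models of this section relax the binomial variance, their variance functions can be evaluated at a common $\lambda_t$ and $\pi$. The sketch below only transcribes the variance formulas stated on the slides; the parameter values and function names are hypothetical.

```python
def binomial_variance(lam, pi):
    """Plain binomial model: var = mu * (1 - pi), always below the mean."""
    return lam * pi * (1 - pi)

def beta_binomial_variance(lam, pi, theta):
    """var = mu * (1 - pi) * (lam + theta) / (1 + theta), with theta = gamma + delta."""
    return lam * pi * (1 - pi) * (lam + theta) / (1 + theta)

def negative_binomial_variance(lam, pi):
    """var = mu + mu^2 / omega, with omega = lam * (1 - pi)."""
    mu = lam * pi
    omega = lam * (1 - pi)
    return mu + mu ** 2 / omega

def beta_poisson_variance(lam, pi, theta):
    """var = mu * (1 + lam * (1 - pi) / (1 + theta))."""
    return lam * pi * (1 + lam * (1 - pi) / (1 + theta))

lam, pi, theta = 40.0, 0.6, 5.0    # hypothetical values, common mean mu = 24
mu = lam * pi
for name, var in [
    ("binomial", binomial_variance(lam, pi)),
    ("beta-binomial", beta_binomial_variance(lam, pi, theta)),
    ("negative binomial", negative_binomial_variance(lam, pi)),
    ("beta-Poisson", beta_poisson_variance(lam, pi, theta)),
]:
    print(f"{name:18s} var = {var:6.2f}  var / mean = {var / mu:.2f}")
```

At these values only the plain binomial model has a variance below the mean; the three mixtures are overdispersed.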
23 Estimation and Inference
24 Estimation
We use Maximum Likelihood (ML), where the solution of the score equations
$\frac{\partial}{\partial\alpha} \sum_t \log L(\Omega_M \mid y_t, x_t) = 0$ and $\frac{\partial}{\partial\beta} \sum_t \log L(\Omega_M \mid y_t, x_t) = 0$,
with $\Omega_M$ the parameter vector of the given model, is found by the Newton-Raphson algorithm
For the Beta-mixtures we also use the Hybrid algorithm
Cycle between
1. ML for $\alpha$, $\beta$ given $\theta$, and
2. MoM for $\theta$ given $\alpha$, $\beta$
Choice of starting values sometimes crucial
25 Inference
Normality of parameter estimates
Simulation studies: for reasonable settings $\hat\alpha$ and $\hat\beta$ are approximately normal
Problems when $\pi \to 0$ (Poisson limit) and $\pi \to 1$ (perfect reporting system)
Inference within distribution: usual model selection methods, e.g. t-values, LRT, ...
26 Inference
Inference between distributions: Non-nested testing (Allcroft and Glasbey, 2003)
Comparison of various models $M = (M_k)$, $k = 1, \dots, K$
1. find $\hat\theta_k$ for data $y = (y_t)$ by optimizing the GoF criterion $c = (c_k(y))$
2. simulate samples $y^{(s)}(\hat\theta_k)$, $s = 1, \dots, S$ (e.g. $S = 100$), and obtain all $S$ matrices $C^{(s)}$ of dimension $K \times K$
3. obtain the mean $\bar{C}$ of the $S$ matrices and compare $c$ to the $k$-th column $\bar{c}_k$ by a multivariate normality assumption
If $D_k \sim \chi^2_K$, with $D_k = (c - \bar{c}_k)^\top V_k^{-1} (c - \bar{c}_k)$, the $k$th model is correct
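The distance $D_k$ in step 3 is a quadratic form that can be computed with a few lines of linear algebra. The sketch below is only an illustration: it assumes that $V_k$ is the sample covariance of the criterion vectors simulated under model $k$ (the slide does not say how $V_k$ is obtained), uses a small hand-written Gaussian elimination to stay within the standard library, and works on made-up numbers. The comparison of $D_k$ with a $\chi^2_K$ quantile is left out.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def distance_statistic(c_obs, simulated):
    """D_k = (c - cbar_k)' V_k^{-1} (c - cbar_k).

    c_obs: observed criterion vector of length K.
    simulated: S criterion vectors from data simulated under model k.
    V_k is taken as their sample covariance (an assumption of this sketch).
    """
    S, K = len(simulated), len(c_obs)
    cbar = [sum(v[i] for v in simulated) / S for i in range(K)]
    V = [[sum((v[i] - cbar[i]) * (v[j] - cbar[j]) for v in simulated) / (S - 1)
          for j in range(K)] for i in range(K)]
    diff = [c_obs[i] - cbar[i] for i in range(K)]
    weights = solve_linear(V, diff)          # V^{-1} (c - cbar) without forming the inverse
    return sum(d * w for d, w in zip(diff, weights))

# Made-up example with K = 2 criteria and S = 4 simulated vectors
D = distance_statistic([3.0, 1.0], [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
print(D)
```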
27 Real Data Applications
28 Bicycle Theft
[Table: columns Distribution, log L, Pearson, BIC, $D$, $p(D)$, $\hat\pi$; rows Negative Binomial, Beta-Binomial, Beta-Poisson]
Model selected: Negative Binomial, $\hat\pi = 0.61$, $p(D) = 0.59$
29 Bicycle Theft Data
[Figure: weekly counts plotted against week]
30 Bicycle Theft Model
[Figure: weekly series plotted against week. Estimated mean (solid), estimated total number of thefts with confidence interval (dashed)]
31 Heart Attack Data
[Table: columns Distribution, log L, Pearson, BIC, $D$, $p(D)$, $\hat\pi$; rows Negative Binomial, Beta-Binomial, Beta-Poisson]
Model selected: Negative Binomial, $\hat\pi = 0.55$, $p(D) = 0.63$
32 Heart Attack Data
[Figure: monthly counts plotted against month. Estimated mean (solid), estimated total number of heart attacks with confidence interval (dashed)]
33 Heart Attack Counts + Cause of Death Counts
[Figure: monthly counts plotted against month. Estimated mean (black solid), estimated total number of heart attacks with confidence interval (red dashed)]
34 Over-Reporting
How to estimate the correct number of cases?
35 Cameron & Trivedi, 1998
Model dependence between the two Bernoulli variables
E = Event and R = Recording
$T_0$: true not reported
$T_1$: true reported
$O$: over-reported
$U$: under-reported
$Y$: observed count
Bivariate Binomial Random Variable
         R = 0     R = 1
E = 0    $T_0$     $O$       $N_0$
E = 1    $U$       $T_1$     $N_1$
         $N - Y$   $Y$       $N$
36 Independent Sampling
In Criminology $T_0$ does not exist. Therefore we consider independent sampling.
         R = 0     R = 1
E = 0              $R_0$
E = 1    $U$       $R_1$     $N$
and $Y = R_0 + R_1$
Specify an under-reporting model for $R_1$ and a count model for $R_0$
Model $Y$ by convolution of both
37 A Family of Poisson Convolutions
Assume that $R_1$ is generated by under-reporting, i.e. $R_1 \mid N, P \sim \text{Binomial}(N, P)$, and $R_0 \sim \text{Poisson}(\alpha)$
Then
$p(Y = y = r_0 + r_1) = \sum_{j=0}^{y} p_0(j)\, p_1(y - j) = \sum_{j=0}^{y} p_0(y - j)\, p_1(j)$
with
$E(Y) = \mu_0 + \mu_1 = \alpha + \lambda\pi$
$\text{var}(Y) = \sigma_0^2 + \sigma_1^2 = \alpha + \text{var}(R_1)$
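38 Negative-Binomial-Poisson Convolution
For $R_1 \sim \text{Negative Binomial}(\omega, \pi)$: $Y \sim \text{Delaporte}(\lambda, \pi, \alpha)$, $\lambda = \omega/(1 - \pi)$, with
$p(Y = y = r_1 + r_0) = \frac{(1 - \pi)^{\omega}\, \alpha^{y} \exp(-\alpha)}{\Gamma(\omega)}\, S$
where
$S = \sum_{j=0}^{y} \frac{\Gamma(j + \omega)}{\Gamma(j + 1)\,\Gamma(y - j + 1)} \left(\frac{\pi}{\alpha}\right)^{j}$
and
$E(Y) = \mu_0 + \mu_1 = \alpha + \lambda\pi$
$\text{var}(Y) = \sigma_0^2 + \sigma_1^2 = \alpha + \lambda\pi(1 - \pi)^{-1}$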
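The convolution probability on this slide can be evaluated directly and checked against the stated mean and variance. The sketch below assumes the negative binomial pmf $\Gamma(j + \omega) / (\Gamma(\omega)\, j!)\, (1 - \pi)^{\omega} \pi^{j}$ for $R_1$, which is the form implied by the sum $S$, and sums the convolution over $j = 0, \dots, y$. Parameter values and the function name are hypothetical.

```python
import math

def delaporte_pmf(y, omega, pi, alpha):
    """P(Y = y) for Y = R_0 + R_1, R_0 ~ Poisson(alpha), R_1 ~ Negative Binomial(omega, pi)."""
    log_front = (omega * math.log1p(-pi) + y * math.log(alpha)
                 - alpha - math.lgamma(omega))
    # S = sum_j Gamma(j + omega) / (Gamma(j + 1) Gamma(y - j + 1)) * (pi / alpha)^j
    s = sum(
        math.exp(math.lgamma(j + omega) - math.lgamma(j + 1)
                 - math.lgamma(y - j + 1) + j * math.log(pi / alpha))
        for j in range(y + 1)
    )
    return math.exp(log_front) * s

omega, pi, alpha = 4.0, 0.5, 3.0          # hypothetical parameter values
lam = omega / (1 - pi)                    # lambda = omega / (1 - pi) = 8

probs = [delaporte_pmf(y, omega, pi, alpha) for y in range(150)]
total = sum(probs)
mean_num = sum(y * p for y, p in enumerate(probs))
var_num = sum((y - mean_num) ** 2 * p for y, p in enumerate(probs))

print(total)                                   # close to 1
print(mean_num, alpha + lam * pi)              # both close to 7
print(var_num, alpha + lam * pi / (1 - pi))    # both close to 11
```

Truncating the support at 150 is an arbitrary choice that is harmless here because the tail mass is negligible for these values.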
39 Application to bicycle theft data
[Figure: data and mean in grey, plotted against week]
40 Application to bicycle theft data
[Figure: data and mean in grey, $E(R_0)$ in red, $E(R_1)$ in green, $E(U)$ in blue, plotted against week]
41 Averaged Results: Application to bicycle theft data
         R = 0                  R = 1
E = 0                           $\hat{E}(R_0)$ = [missing]
E = 1    $\hat{E}(U) = 5.08$    $\hat{E}(R_1)$ = [missing]    $\hat{E}(N)$ = [missing]
and $\hat{E}(Y) = \hat{E}(R_0 + R_1)$ = [missing]
Reporting probability in the under-reporting model: $\hat\pi = 0.77$
Fraud rate: 14.35/35.53 = 0.40
42 Conclusion
Highly relevant methodology for widespread applications
Models based on Bernoulli sampling
Conditional binomial models suitable for cases when $\text{var}(Y) > \mu$
Estimation relies on ML and Hybrid ML
Non-nested technique for model selection
Limitations: $\pi \to 0$, i.e. Poisson limit; $\pi \to 1$, perfect reporting system
43 References
Allcroft, D.J. and Glasbey, Ch.A. (2003). A simulation-based method for model evaluation. Statistical Modelling, 3.
Cameron, A.C. and Trivedi, P.K. (1998). Regression Analysis of Count Data. Cambridge: Cambridge University Press.
Neubauer, G., Djuraš, G. and Friedl, H. (2011). Models for Underreporting: A Bernoulli Sampling Approach for Reported Counts. Austrian Journal of Statistics, 40, 85-92.
Winkelmann, R. (2000). Econometric Analysis of Count Data. Berlin: Springer.
STAC51: Categorical data Analysis Mahinda Samarakoon January 26, 2016 Mahinda Samarakoon STAC51: Categorical data Analysis 1 / 32 Table of contents Contingency Tables 1 Contingency Tables Mahinda Samarakoon
More informationMS&E 226: Small Data
MS&E 226: Small Data Lecture 12: Logistic regression (v1) Ramesh Johari ramesh.johari@stanford.edu Fall 2015 1 / 30 Regression methods for binary outcomes 2 / 30 Binary outcomes For the duration of this
More informationSemiparametric Mixed Effects Models with Flexible Random Effects Distribution
Semiparametric Mixed Effects Models with Flexible Random Effects Distribution Marie Davidian North Carolina State University davidian@stat.ncsu.edu www.stat.ncsu.edu/ davidian Joint work with A. Tsiatis,
More informationSB1a Applied Statistics Lectures 9-10
SB1a Applied Statistics Lectures 9-10 Dr Geoff Nicholls Week 5 MT15 - Natural or canonical) exponential families - Generalised Linear Models for data - Fitting GLM s to data MLE s Iteratively Re-weighted
More informationPartition models and cluster processes
and cluster processes and cluster processes With applications to classification Jie Yang Department of Statistics University of Chicago ICM, Madrid, August 26 and cluster processes utline 1 and cluster
More informationLecture 5: Poisson and logistic regression
Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 3-5 March 2014 introduction to Poisson regression application to the BELCAP study introduction
More informationCOPYRIGHTED MATERIAL CONTENTS. Preface Preface to the First Edition
Preface Preface to the First Edition xi xiii 1 Basic Probability Theory 1 1.1 Introduction 1 1.2 Sample Spaces and Events 3 1.3 The Axioms of Probability 7 1.4 Finite Sample Spaces and Combinatorics 15
More informationTesting Statistical Hypotheses
E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions
More informationStatistical Model Of Road Traffic Crashes Data In Anambra State, Nigeria: A Poisson Regression Approach
Statistical Model Of Road Traffic Crashes Data In Anambra State, Nigeria: A Poisson Regression Approach Nwankwo Chike H., Nwaigwe Godwin I Abstract: Road traffic crashes are count (discrete) in nature.
More informationStat 587: Key points and formulae Week 15
Odds ratios to compare two proportions: Difference, p 1 p 2, has issues when applied to many populations Vit. C: P[cold Placebo] = 0.82, P[cold Vit. C] = 0.74, Estimated diff. is 8% What if a year or place
More informationStatistics and econometrics
1 / 36 Slides for the course Statistics and econometrics Part 10: Asymptotic hypothesis testing European University Institute Andrea Ichino September 8, 2014 2 / 36 Outline Why do we need large sample
More informationGeneralized Quasi-likelihood versus Hierarchical Likelihood Inferences in Generalized Linear Mixed Models for Count Data
Sankhyā : The Indian Journal of Statistics 2009, Volume 71-B, Part 1, pp. 55-78 c 2009, Indian Statistical Institute Generalized Quasi-likelihood versus Hierarchical Likelihood Inferences in Generalized
More informationContinuous Probability Spaces
Continuous Probability Spaces Ω is not countable. Outcomes can be any real number or part of an interval of R, e.g. heights, weights and lifetimes. Can not assign probabilities to each outcome and add
More informationInstitute of Actuaries of India
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2018 Examinations Subject CT3 Probability and Mathematical Statistics Core Technical Syllabus 1 June 2017 Aim The
More informationBOOTSTRAPPING WITH MODELS FOR COUNT DATA
Journal of Biopharmaceutical Statistics, 21: 1164 1176, 2011 Copyright Taylor & Francis Group, LLC ISSN: 1054-3406 print/1520-5711 online DOI: 10.1080/10543406.2011.607748 BOOTSTRAPPING WITH MODELS FOR
More informationSTAT 705: Analysis of Contingency Tables
STAT 705: Analysis of Contingency Tables Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Analysis of Contingency Tables 1 / 45 Outline of Part I: models and parameters Basic
More informationPoisson regression: Further topics
Poisson regression: Further topics April 21 Overdispersion One of the defining characteristics of Poisson regression is its lack of a scale parameter: E(Y ) = Var(Y ), and no parameter is available to
More information