A flexible two-step randomised response model for estimating the proportions of individuals with sensitive attributes

Anne-Françoise Donneau, Murielle Mauer, Francisco Sartor and Adelin Albert
Department of Biostatistics, University of Liège, Liège
Scientific Institute of Public Health Louis Pasteur, Brussels

Content
1. Problem definition
2. Classical RRM approach
3. New RRM approach
4. Properties
5. Application
6. Conclusion

Problem definition
Let $p$ be the proportion of individuals in a given population presenting a certain characteristic of interest (e.g., smoking, heart defect, diabetes, ...).
Problem: estimating $p$ from a sample of size $n$ drawn from the population.
Classical studies: let $r$ be the number of subjects with the characteristic in the sample. Then an estimate of $p$ and its standard error are given by
$$\hat{p} = \frac{r}{n}, \qquad SE(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
95% CI:
$$\hat{p} - 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \;\le\; p \;\le\; \hat{p} + 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
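The classical estimate and its 95% interval can be sketched in a few lines of Python. This is only an illustration: the function name `wald_ci` and the counts used in the example call are hypothetical and do not come from the slides.

```python
import math

def wald_ci(r, n, z=1.96):
    """Classical proportion estimate r/n with its standard error and Wald interval."""
    p_hat = r / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se, (p_hat - z * se, p_hat + z * se)

# Hypothetical counts, chosen only to show the call
p_hat, se, (low, high) = wald_ci(r=30, n=100)
print(f"p_hat={p_hat:.3f}  SE={se:.4f}  95% CI=({low:.3f}, {high:.3f})")
```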

Difficulties
When the attribute of interest is a sensitive one (e.g., illicit drug consumption, psychiatric disorder, ...), people tend to refuse to answer or to give intentionally misleading answers, so the classical method yields a biased estimate of the true proportion $p$.
Solution to reduce or eliminate the bias linked to the wish to keep one's private life secret: the Randomised Response Model (RRM) (Warner, 1965).

RRM approach (I)
Objective: reassuring the respondent that an affirmative answer to a sensitive question cannot put him/her in a dangerous situation.
Method:
- Pair the sensitive question with an unrelated one.
- The respondent randomly selects one of these two questions.
- The interviewer records only a Yes or No, without knowing which question has been answered.

RRM approach (II)

New approach (I)
No unrelated question, merely the sensitive question.
The question has categorical answers, which can be combined with some flexibility to obtain a binary outcome.

New approach (II)
Let $f$ denote the probability of a positive answer regardless of the question:
$$f = qp + m(1-q)$$
Then, as the probabilities $q$ and $m$ of the random processes RP1 and RP2 are supposed to be known (and can be fixed a priori by the investigator), the probability of a positive answer to the sensitive question is
$$p = \frac{f - m(1-q)}{q}$$

Properties (I)
If $n_1$ and $n_2$ are the observed numbers of subjects responding positively according to the outcome of Step 1 and of Step 1 + Step 2, respectively, then
$$\hat{f} = \frac{n_1 + n_2}{n}$$
Consequently,
$$\hat{p} = \frac{\hat{f} - m(1-q)}{q} \ \text{ if the right-hand side is } > 0; \qquad \hat{p} = 0 \ \text{ otherwise.}$$
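As a minimal sketch, the estimator on this slide (observed proportion of positive answers, corrected for the known probabilities q and m, and set to 0 when the correction gives a negative value) could be coded as follows. The function name `rrm_estimate` and the example counts are assumptions made for illustration.

```python
def rrm_estimate(n_positive, n, q, m):
    """Two-step RRM estimate of p from the total number of positive answers.

    n_positive : observed positive answers (n1 + n2 in the slide's notation)
    q          : probability that Step 1 sends the respondent to the sensitive question
    m          : probability of a positive outcome in Step 2
    """
    f_hat = n_positive / n
    raw = (f_hat - m * (1 - q)) / q
    return max(0.0, raw)  # truncated at 0, as on the slide

# Hypothetical counts: 60 positive answers out of 100, q = m = 0.5
print(rrm_estimate(60, 100, q=0.5, m=0.5))
```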

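Properties (II)
Statement 1: $\hat{p}$ is an unbiased and consistent estimator of $p$.
- Unbiasedness: as $\hat{f}$ is an unbiased estimator of $f$ (and before truncation at 0),
$$E(\hat{p}) = E\left[\frac{\hat{f} - m(1-q)}{q}\right] = \frac{E(\hat{f}) - m(1-q)}{q} = \frac{f - m(1-q)}{q} = p$$
- Consistency: by the law of large numbers, $\hat{f} \to f$. Hence,
$$\hat{p} = \frac{\hat{f} - m(1-q)}{q} \;\to\; \frac{f - m(1-q)}{q} = p$$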

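Properties (III)
Statement 2:
$$Var(\hat{p}) = \frac{1}{nq^2}\, f(1-f)$$
Indeed, since $m$ and $q$ are constant,
$$Var(\hat{p}) = Var\left(\frac{\hat{f} - m(1-q)}{q}\right) = \frac{1}{q^2}\, Var(\hat{f})$$
and, as $Var(\hat{f}) = f(1-f)/n$,
$$Var(\hat{p}) = \frac{1}{nq^2}\, f(1-f)$$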

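Properties (IV)
Thus, for large $n$,
$$\frac{\hat{p} - p}{\sqrt{\dfrac{1}{nq^2}\,\hat{f}(1-\hat{f})}} \sim N(0,1)$$
Confidence interval:
$$\left[\hat{p} - Q_Z(1-\alpha/2)\sqrt{\frac{1}{nq^2}\,\hat{f}(1-\hat{f})}\;;\;\; \hat{p} + Q_Z(1-\alpha/2)\sqrt{\frac{1}{nq^2}\,\hat{f}(1-\hat{f})}\right]$$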
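The large-sample interval can be sketched in Python as below, assuming the standard error is the square root of f̂(1 - f̂)/(n q²) with f̂ the observed proportion of positive answers. The function name `rrm_confidence_interval` and the counts in the example call are hypothetical.

```python
import math

def rrm_confidence_interval(n_positive, n, q, m, z=1.96):
    """Point estimate, standard error and normal-approximation interval for p."""
    f_hat = n_positive / n
    p_hat = max(0.0, (f_hat - m * (1 - q)) / q)
    se = math.sqrt(f_hat * (1 - f_hat) / n) / q   # sqrt( f(1-f) / (n q^2) )
    return p_hat, se, (p_hat - z * se, p_hat + z * se)

# Hypothetical counts: 240 positive answers out of 400, q = m = 0.5
p_hat, se, (low, high) = rrm_confidence_interval(240, 400, q=0.5, m=0.5)
print(f"p_hat={p_hat:.3f}  SE={se:.4f}  95% CI=({low:.3f}, {high:.3f})")
```

Dividing by q shows the price of the randomisation: with q = 0.5 the standard error is twice that of the raw proportion f̂.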

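One-sample power (I)
Determination of the sample size $n$ at the $\alpha$% significance level, so that $|\hat{p} - p| \le \Delta$:
$$Q_Z(1-\alpha/2)\sqrt{\frac{f(1-f)}{nq^2}} \le \Delta \quad\Longrightarrow\quad n \ge \frac{Q_Z^2(1-\alpha/2)\, f(1-f)}{q^2\Delta^2}$$
As $f = qp + m(1-q)$,
$$n \ge \frac{Q_Z^2(1-\alpha/2)\,\left[qp + m(1-q)\right]\left[1 - qp - m(1-q)\right]}{q^2\Delta^2}$$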
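A small Python sketch of this sample-size rule follows, assuming the requirement is that the half-width of the normal-approximation interval does not exceed a margin Δ. The function name `rrm_sample_size` and the input values in the example call are hypothetical and are not the values of the cannabis example.

```python
import math

def rrm_sample_size(p, q, m, delta, z=1.96):
    """Smallest n such that z * sqrt(f(1-f) / (n q^2)) <= delta, with f = q*p + m*(1-q)."""
    f = q * p + m * (1 - q)
    return math.ceil(z ** 2 * f * (1 - f) / (q ** 2 * delta ** 2))

# Hypothetical planning values: anticipated p = 50%, q = m = 0.5, margin of 10 points
print(rrm_sample_size(p=0.5, q=0.5, m=0.5, delta=0.10))
```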

One-sample problem (II)
Example based on the prevalence of the use of cannabis.
Hypothesis: Δ = %, α = 5%, m = 5/6, q = 0.5 and p = 5%.
Plugging these values into the previous formula gives n = 191.

Application: Question
During your whole life, how many times have you used illicit drugs?
1 = Never
2 = 1-2 times
3 = 3-5 times
4 = 6-9 times
5 = 10-19 times
6 = 20 times or more
This six-category scale is better than a binary answer (0 = No, 1 = Yes).

Application: Random process
Flip the coin (RP1). If the result is tails, answer question «a». If the result is heads, go to «b».
a) During your whole life, how many times have you used illicit drugs?
Response:
1 = Never
2 = 1-2 times
3 = 3-5 times
4 = 6-9 times
5 = 10-19 times
6 = 20 times or more
b) Roll the die once (RP2). What is the result?
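The two-step procedure can be illustrated with a small simulation. This is a sketch under stated assumptions: a fair coin (q = 0.5), a fair six-sided die, truthful answers to question «a», and a made-up distribution of true answers. The names `simulate_two_step` and `estimate_p` are hypothetical.

```python
import random

def simulate_two_step(true_probs, yes_categories, n, q=0.5, seed=0):
    """Return the number of recorded answers falling in yes_categories.

    true_probs     : assumed probabilities of the true answers 1..6 (hypothetical)
    yes_categories : set of answer codes counted as "Yes"
    """
    rng = random.Random(seed)
    n_yes = 0
    for _ in range(n):
        if rng.random() < q:
            # RP1 (coin): respondent answers the sensitive question truthfully
            answer = rng.choices([1, 2, 3, 4, 5, 6], weights=true_probs)[0]
        else:
            # RP2 (die): respondent reports the result of the roll
            answer = rng.randint(1, 6)
        n_yes += answer in yes_categories
    return n_yes

def estimate_p(n_yes, n, yes_categories, q=0.5):
    """Two-step RRM estimate; m is the chance that a fair die lands in yes_categories."""
    m = len(yes_categories) / 6
    return max(0.0, (n_yes / n - m * (1 - q)) / q)

# Made-up population in which 40% have used illicit drugs at least once
probs = [0.60, 0.10, 0.10, 0.10, 0.05, 0.05]
yes = {2, 3, 4, 5, 6}
n_yes = simulate_two_step(probs, yes, n=435)
print(estimate_p(n_yes, 435, yes))
```

Changing `yes_categories` (for example to `{4, 5, 6}` or `{6}`) changes m without changing the recorded data, which is the flexibility the slides refer to.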

Application: Study material
Undergraduate students registered at the University of Liège during the academic year 2003-2004.
Characteristics of the students (n = 435):
Age (years): 18.1 ± 0.78
Gender: Female 289 (66.4%), Male 146 (33.6%)

Application: Practical aspect
- During supervised practical work: students are more receptive and concentrated
- Explanation
- Closed box

Application: Results
Sensitive question
Yes answer: 2-6 (1 or more times)
No answer: 1 (Never)
m = 5/6
Illicit drug: prevalence (± SE) 40.9% (± 4.7), 95% CI 31.7-50.0
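As a rough consistency check, the reported standard error and interval can be recomputed from the reported prevalence alone. The sketch assumes a fair coin (q = 0.5), n = 435 respondents and m = 5/6, and back-calculates f̂ from the rounded prevalence, so small rounding differences are expected.

```python
import math

q, m, n = 0.5, 5 / 6, 435   # q = 0.5 assumes a fair coin in RP1
p_hat = 0.409               # reported prevalence (40.9%)

f_hat = q * p_hat + m * (1 - q)                 # implied proportion of "Yes" records
se = math.sqrt(f_hat * (1 - f_hat) / n) / q     # sqrt( f(1-f) / (n q^2) )
ci = (p_hat - 1.96 * se, p_hat + 1.96 * se)

print(f"SE = {100 * se:.1f}%, 95% CI = {100 * ci[0]:.1f}% to {100 * ci[1]:.1f}%")
```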

Application: Flexibility (I)
Sensitive question
Yes answer: 4-6 (6 or more times)
No answer: 1-3 (Never to 5 times)
m = 1/2
Illicit drug: prevalence (± SE) 26.7% (± 4.7), 95% CI 17.6-35.9

Application: Flexibility (II)
Sensitive question
Yes answer: 6 (20 or more times)
No answer: 1-5 (Never to 19 times)
m = 1/6
Illicit drug: prevalence (± SE) 16.4% (± 3.6), 95% CI 9.3-23.8

Application: Comparison with an anonymous questionnaire
A classical anonymous questionnaire was distributed to n = 46 undergraduate students also registered at the University of Liège during the academic year 2003-2004.
Illicit drug, prevalence (± SE):
Classical (n = 46): 4. (± .0)
RRM (n = 435): 16.4 (± 3.6)

Conclusion
- The problem of estimating the prevalence of sensitive attributes is quite common (public health, social life, economy, ...).
- New RRM approach based on a two-step (RP1, RP2) rather than a one-step (RP1) random process.
- Flexibility with RP2 (question with multiple answers, modify m).
- Can the two-step RRM be a substitute for the classical approach in general? Sensitivity analyses should be carried out.