Improving Estimation Accuracy in Nonrandomized Response Questioning Methods by Multiple Answers

International Journal of Statistics and Probability; Vol. 6, No. 5; September 2017
ISSN 1927-7032 E-ISSN 1927-7040
Published by Canadian Center of Science and Education
http://ijsp.ccsenet.org

Heiko Groenitz
Working group Statistics, School of Business and Economics, Philipps-University Marburg, Germany
Correspondence: Heiko Groenitz, Working group Statistics, School of Business and Economics, Philipps-University Marburg, Germany. E-mail: groenitz@staff.uni-marburg.de

Received: June 24, 2017 Accepted: August 8, 2017 Online Published: August 11, 2017
doi:10.5539/ijsp.v6n5p101 URL: https://doi.org/10.5539/ijsp.v6n5p101

Abstract

When private or stigmatizing characteristics are included in sample surveys, direct questions result in low cooperation of the respondents. To increase cooperation, indirect questioning procedures have been established in the literature. Nonrandomized response methods are one group of such procedures and have attracted much attention in recent years. In this article, we consider four popular nonrandomized response schemes and present a possibility to improve the estimation precision of these schemes. The basic idea is to require multiple indirect answers from each respondent. We develop a Fisher scoring algorithm for the maximum likelihood estimation in the presented new schemes and show the better efficiency of the new schemes compared with the original designs.

Keywords: Fisher scoring, indirect questioning, Löwner order, privacy of respondents, sample survey, sensitive characteristic

1. Introduction

Surveys are important tools in many disciplines of science, for instance, social science and economics. Sometimes, variables which are viewed as private or stigmatizing are involved in the survey. Examples of such sensitive variables are financial situation, political views, cheating in examinations, undeclared work, insurance fraud, and discrimination. Direct questions on such characteristics will often yield low cooperation of the respondents, i.e., answer refusal and untruthful answers will often occur. Therefore, skilful questioning procedures that protect the interviewees' privacy and deliver data enabling statistical inference were developed in the literature.

One group of procedures is the class of randomized response (RR) methods. In RR techniques, the respondent conducts a random experiment and gives a certain indirect answer depending on the result of the random experiment. For example, consider the following process with the sensitive attribute undeclared work and throwing a die as random experiment: If the die shows 1 or 2, the interviewee answers the question "Have you conducted work in the last year without declaring this to the relevant public authorities?" If the die shows 3-6, the opposite question "Did you declare all your work in the last year to the relevant public authorities?" must be answered. The interviewer does not observe the random experiment and hears only "yes" or "no", but does not know the question that is answered. This protects the privacy. Based on the indirect answers of many respondents, the distribution of the sensitive variable can be estimated. The described procedure corresponds to the RR technique by Warner (1965). Various other RR methods are available today. See, for example, Fox and Tracy (1986), Chaudhuri (2011), Chaudhuri and Christofides (2013), or Chaudhuri, Christofides and Rao (2016) for overviews.

The random experiment in RR methods is a bit cumbersome and causes doubts on the suitability of RR methods for online surveys. This motivated diverse authors to introduce nonrandomized response (NRR) methods, for example, Yu, Tian, and Tang (2008), Tan, Tian, and Tang (2009), Tang, Tian, Tang, and Liu (2009), or Groenitz (2014). In NRR schemes, an indirect answer that depends on the respondent's outcome of an auxiliary characteristic must be given. The auxiliary characteristic is defined on the same population on which the sensitive characteristic is defined. Typically, the auxiliary characteristic is independent of the sensitive attribute and possesses a known distribution. To give an example, we mention the characteristic describing whether the respondent's birthday is in January to April or not. In NRR procedures, the respondent would give the same answer if he or she is asked again.

To improve the estimation efficiency of RR methods, some authors study repeated RR methods (Eriksson, 1973; Alavi & Tajodini, 2016; Groenitz, 2016). Here, the interviewee must repeat the random experiment multiple times. Say, we have two repetitions. Depending on the sensitive characteristic and the result of the first repetition of the experiment, the first indirect answer must be given. Depending on the sensitive characteristic and the result of the second repetition, the second indirect answer must be provided. That is, two indirect answers are necessary.

In this article, we present some repeated NRR techniques. We derive inference for these procedures and show that our repeated NRR methods improve the estimation efficiency of the original NRR techniques. The basic idea for repeated NRR techniques is to involve multiple different auxiliary characteristics in the procedure. For example, one can consider the characteristic describing whether the respondent's birthday is in January to April and the characteristic describing whether the respondent's telephone number ends in 0-6.

In Section 2, we explain the NRR methods considered in this paper. In Section 3, we describe the corresponding repeated NRR designs. The maximum likelihood (ML) estimation and the estimation variance for the multiple-trial NRR schemes are addressed in Section 4. The accuracy gains of repeated NRR techniques in comparison with single-trial NRR techniques are demonstrated in Section 5.

2. NRR Methods

In this section, four NRR methods are described: the crosswise method and the triangular method (both Yu et al., 2008), the multi-category design by Tang et al. (2009), and the diagonal technique by Groenitz (2014). Let the sensitive characteristic be denoted by $X$. We give some concrete examples for $X$:

(i) $X \in \{1, 2\}$ with $X = 1$ if the person has paid the taxes for the last year correctly and $X = 2$ if he or she has evaded taxes last year.
(ii) $X \in \{1, 2\}$ with $X = 1$ if the person's annual income exceeds a certain value and $X = 2$ else.
(iii) $X \in \{1, 2, 3\}$ where $X = 1$ holds if the person never has conducted insurance fraud, $X = 2$ holds if the person has conducted insurance fraud once or twice, and $X = 3$ holds if the person has conducted insurance fraud three or more times.
(iv) $X \in \{1, 2, 3, 4\}$ where each value of $X$ represents a certain income class.

For the crosswise and triangular method, $X \in \{1, 2\}$, i.e., $X$ with two categories, is required. For the methods by Tang et al. (2009) and Groenitz (2014), $X$ can have an arbitrary number of categories coded by $1, 2, \ldots, k$. The triangular method and the Tang et al. (2009) method demand that the category $X = 1$ is nonsensitive. The crosswise method can be applied for the examples (i) and (ii). The triangular method can handle example (i). The technique of Tang et al. (2009) is suitable for (iii), and the diagonal technique can be applied for (iii) and (iv).

For each of the considered NRR designs, a nonsensitive auxiliary variable $W$ is necessary. The respondents' individual values of $W$ must not be known to the interviewer or the survey agency. $W$ and $X$ must be independent, and $W$ must possess a known distribution. For the crosswise and triangular method, $W$ must have the categories $W = 1$ and $W = 2$. For the Tang et al. (2009) method and the diagonal technique, $W$ must have the $k$ categories $W = 1, \ldots, W = k$. Examples for $W$ with two categories were already given in the Introduction. A $W$ with $k = 4$ is as follows: Let $W$ be based on the number formed by the last three digits of the interviewee's telephone number. If this number is at most 624, in 625-749, in 750-874, or in 875-999, we define $W = 1$, $W = 2$, $W = 3$, and $W = 4$, respectively. For example, the telephone number 9478722 results in the number 722 and $W = 2$.

In the survey, the respondents provide an indirect answer $A$ that depends on $X$ and $W$. Giving an indirect answer $A$ protects the privacy. The concrete answer schemes are:

- Crosswise method: For $X = W = 1$ or $X = W = 2$, the answer $A = 1$ must be given. For other combinations of $X$ and $W$, the indirect answer is $A = 2$.
- Triangular method: For $X = W = 1$, we have $A = 1$. In the other cases, $A = 2$ is required.
- Tang et al. (2009) method: For $X = 1$, the answer is the value of the nonsensitive variable, that is, $A = W$. For $X = i$ with $i = 2, \ldots, k$, the answer is the value of the sensitive characteristic, that is, $A = X$.
- Diagonal technique: The answer is given by the formula $A = [(W - X) \bmod k] + 1$. However, the respondents do not receive this mathematical formula. Instead, they receive a table that illustrates the answer to give. For example, for $k = 4$, Table 1 is such a table.
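The four answer rules above can be sketched in a few lines of Python. This is only an illustration: the function names and the method labels ("crosswise", "triangular", "tang", "diagonal") are chosen here and do not come from the article, and the telephone-number helper assumes the four digit ranges of the example with $k = 4$.

```python
def indirect_answer(method, x, w, k=2):
    """Return the indirect answer A for sensitive value x and auxiliary value w.

    x and w are 1-based category codes, as in the text. The method labels
    are names chosen for this sketch, not notation from the article.
    """
    if method == "crosswise":    # A = 1 iff X = W = 1 or X = W = 2
        return 1 if x == w else 2
    if method == "triangular":   # A = 1 iff X = W = 1
        return 1 if (x == 1 and w == 1) else 2
    if method == "tang":         # A = W for X = 1, otherwise A = X
        return w if x == 1 else x
    if method == "diagonal":     # A = [(W - X) mod k] + 1
        return (w - x) % k + 1
    raise ValueError(f"unknown method: {method}")


def w_from_last_three_digits(number):
    """Auxiliary variable W (k = 4) from the last three digits of a phone number."""
    for w, upper in enumerate((624, 749, 874, 999), start=1):
        if number <= upper:
            return w
    raise ValueError("expected a number between 0 and 999")


def diagonal_answer_table(k):
    """Answer table of the diagonal technique: rows X = 1..k, columns W = 1..k."""
    return [[indirect_answer("diagonal", x, w, k) for w in range(1, k + 1)]
            for x in range(1, k + 1)]


for row in diagonal_answer_table(4):
    print(row)
print(w_from_last_three_digits(722))
```

For $k = 4$, the printed rows are the answers a respondent would read off the answer table of the diagonal technique, and the last line reproduces the telephone-number example ($722$ gives $W = 2$).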

Table 1. Required indirect answer $A$ for the diagonal technique ($k = 4$)

X/W      W = 1   W = 2   W = 3   W = 4
X = 1      1       2       3       4
X = 2      4       1       2       3
X = 3      3       4       1       2
X = 4      2       3       4       1

3. Repeated NRR Methods

In this section, we introduce a repeated version for each of the NRR methods from Section 2. Here, every respondent gives multiple indirect answers. We consider the case of two indirect answers in particular.

As a preliminary consideration, let us fix some NRR scheme from Section 2 and assume that the respondent should give a first indirect answer $A_1$ based on the sensitive $X$ and the nonsensitive auxiliary characteristic $W$ and a second indirect answer $A_2$ also based on $X$ and $W$. Then, $A_1 = A_2$ always follows. Consequently, the second indirect answer does not contain additional information. Thus, it does not work to base both indirect answers on $X$ and $W$. The solution is to utilize a separate nonsensitive auxiliary attribute for each repetition. Say, the nonsensitive auxiliary characteristic for the first and second trial is denoted by $W_1$ and $W_2$, respectively.

For a fixed NRR scheme from Section 2, the interview procedure for the two-trial version is as follows. The interviewee first gives the indirect answer $A_1$ depending on $X$ and $W_1$ according to the fixed NRR scheme. Afterward, he or she gives the second indirect answer $A_2$ depending on $X$ and $W_2$, also according to the selected NRR scheme.

For each NRR technique from Section 2, neither the respondent's value of $W_1$ nor the value of $W_2$ may be known to the interviewer or the survey agency. For the crosswise and triangular method, $W_1, W_2 \in \{1, 2\}$ is necessary. For the Tang et al. (2009) and Groenitz (2014) method, $W_1$ and $W_2$ both must have the categories $1, \ldots, k$. We make three further assumptions: The vector $(W_1, W_2)$ and $X$ are independent, $W_1$ and $W_2$ are independent, and $W_1$ and $W_2$ possess known distributions ($W_1$ and $W_2$ are allowed to have different distributions). These three assumptions can usually be seen as fulfilled when the auxiliary characteristics are constructed, for example, from birthday periods, telephone numbers, or house numbers.

4. Statistical Inference for Repeated NRR Designs

We define $\pi_i$ to be the proportion of persons in the population having $X$ value equal to $i$ ($i = 1, \ldots, k$) and set $\pi = (\pi_1, \ldots, \pi_k)^T$. We now develop the ML estimation for $\pi$ for the repeated NRR designs and a sample of size $n$ drawn by simple random sampling with replacement. The estimation variance is also addressed.

Fix one of the repeated NRR designs and define $c_{1i}$ and $c_{2i}$ to be the proportion of population units with $W_1 = i$ and $W_2 = i$, respectively. Let the entry $(i, j)$ of the $k \times k$ matrix $C_1$ be given by $P(A_1 = i \mid X = j)$. Analogously, let the entry $(i, j)$ of the $k \times k$ matrix $C_2$ be given by $P(A_2 = i \mid X = j)$. For the crosswise method, we have

$$C_1 = \begin{pmatrix} c_{11} & c_{12} \\ c_{12} & c_{11} \end{pmatrix} \quad \text{and} \quad C_2 = \begin{pmatrix} c_{21} & c_{22} \\ c_{22} & c_{21} \end{pmatrix}.$$

For the triangular method,

$$C_1 = \begin{pmatrix} c_{11} & 0 \\ c_{12} & 1 \end{pmatrix} \quad \text{and} \quad C_2 = \begin{pmatrix} c_{21} & 0 \\ c_{22} & 1 \end{pmatrix}$$

hold. For the technique by Tang et al. (2009), the first column of $C_1$ equals $(c_{11}, \ldots, c_{1k})^T$. The $j$th column of $C_1$ for $j = 2, \ldots, k$ has entry 1 as $j$th component while the other components are 0. In the matrix $C_2$, the first column is $(c_{21}, \ldots, c_{2k})^T$. The $j$th column of $C_2$ for $j = 2, \ldots, k$ has entry 1 as $j$th component and entry 0 for the other components. For the diagonal technique, each row of $C_1$ is a left-cyclic shift of the row above and the first row is $(c_{11}, \ldots, c_{1k})$. Regarding $C_2$, each row is again a left-cyclic shift of the row above where the first row is now $(c_{21}, \ldots, c_{2k})$.

Consider $a_1, a_2, x \in \{1, \ldots, k\}$, define

$$I_1 = I_1(a_1, x) = \{ i \in \{1, \ldots, k\} : W_1 = i, X = x \text{ result in answer } A_1 = a_1 \},$$
$$I_2 = I_2(a_2, x) = \{ j \in \{1, \ldots, k\} : W_2 = j, X = x \text{ result in answer } A_2 = a_2 \},$$
$$I = I(a_1, a_2, x) = \{ (i, j) \in \{1, \ldots, k\}^2 : (W_1, W_2) = (i, j), X = x \text{ yield } (A_1, A_2) = (a_1, a_2) \}$$

and obtain

$$\begin{aligned}
P(A_1 = a_1, A_2 = a_2 \mid X = x)
&= \frac{1}{P(X = x)} \, P(A_1 = a_1, A_2 = a_2, X = x) \\
&= \frac{1}{P(X = x)} \Big[ \sum_{(w_1, w_2) \in I} P(A_1 = a_1, A_2 = a_2, W_1 = w_1, W_2 = w_2, X = x) \\
&\qquad\qquad + \sum_{(w_1, w_2) \notin I} P(A_1 = a_1, A_2 = a_2, W_1 = w_1, W_2 = w_2, X = x) \Big] \\
&= \frac{1}{P(X = x)} \Big[ \sum_{(w_1, w_2) \in I} P(W_1 = w_1, W_2 = w_2, X = x) + 0 \Big] \\
&= \frac{1}{P(X = x)} \sum_{(w_1, w_2) \in I} P(W_1 = w_1) \, P(W_2 = w_2) \, P(X = x) \\
&= \sum_{(w_1, w_2) \in I} P(W_1 = w_1) \, P(W_2 = w_2)
 = \sum_{w_1 \in I_1} P(W_1 = w_1) \cdot \sum_{w_2 \in I_2} P(W_2 = w_2) \\
&= P(A_1 = a_1 \mid X = x) \cdot P(A_2 = a_2 \mid X = x) = C_1(a_1, x) \cdot C_2(a_2, x),
\end{aligned}$$

where entry $(p, q)$ of the matrix $C_1$ and $C_2$ is denoted by $C_1(p, q)$ and $C_2(p, q)$, respectively. Consequently, $A_1$ and $A_2$ are conditionally independent given $X$.

As the next step, we define $\lambda_{ij}$ to be the joint proportion of population units with $A_1 = i$ and $A_2 = j$ ($i, j = 1, \ldots, k$). These joint proportions are arranged in the column vector $\lambda$ of length $k^2$ where we first sort by the value of $A_1$. For example, for $k = 3$, $\lambda$ is given by

$$\lambda = (\lambda_{11}, \lambda_{12}, \lambda_{13}, \lambda_{21}, \lambda_{22}, \lambda_{23}, \lambda_{31}, \lambda_{32}, \lambda_{33})^T.$$

It follows that $\lambda = C \cdot (\pi_1, \ldots, \pi_k)^T$ where $C$ is a $k^2 \times k$ matrix and the $j$th column of $C$ is given by $C_1(:, j) \otimes C_2(:, j)$. Here, $C_1(:, j)$ and $C_2(:, j)$ represent the $j$th column of $C_1$ and $C_2$, respectively, and the symbol $\otimes$ stands for the Kronecker matrix product. The Kronecker matrix product of two matrices $R \in \mathbb{R}^{r_1 \times r_2}$ and $S \in \mathbb{R}^{s_1 \times s_2}$ is defined as

$$R \otimes S = \begin{pmatrix} R_{11} & R_{12} & \cdots & R_{1,r_2} \\ R_{21} & R_{22} & \cdots & R_{2,r_2} \\ \vdots & \vdots & & \vdots \\ R_{r_1,1} & R_{r_1,2} & \cdots & R_{r_1,r_2} \end{pmatrix} \otimes S = \begin{pmatrix} R_{11} S & R_{12} S & \cdots & R_{1,r_2} S \\ R_{21} S & R_{22} S & \cdots & R_{2,r_2} S \\ \vdots & \vdots & & \vdots \\ R_{r_1,1} S & R_{r_1,2} S & \cdots & R_{r_1,r_2} S \end{pmatrix},$$

that is, $R \otimes S$ is a matrix of size $r_1 s_1 \times r_2 s_2$. Thus, $C$ is the columnwise Kronecker product of $C_1$ and $C_2$. To give an example, for $k = 3$, we have

$$C = \begin{pmatrix}
C_1(1,1) C_2(1,1) & C_1(1,2) C_2(1,2) & C_1(1,3) C_2(1,3) \\
C_1(1,1) C_2(2,1) & C_1(1,2) C_2(2,2) & C_1(1,3) C_2(2,3) \\
C_1(1,1) C_2(3,1) & C_1(1,2) C_2(3,2) & C_1(1,3) C_2(3,3) \\
C_1(2,1) C_2(1,1) & C_1(2,2) C_2(1,2) & C_1(2,3) C_2(1,3) \\
C_1(2,1) C_2(2,1) & C_1(2,2) C_2(2,2) & C_1(2,3) C_2(2,3) \\
C_1(2,1) C_2(3,1) & C_1(2,2) C_2(3,2) & C_1(2,3) C_2(3,3) \\
C_1(3,1) C_2(1,1) & C_1(3,2) C_2(1,2) & C_1(3,3) C_2(1,3) \\
C_1(3,1) C_2(2,1) & C_1(3,2) C_2(2,2) & C_1(3,3) C_2(2,3) \\
C_1(3,1) C_2(3,1) & C_1(3,2) C_2(3,2) & C_1(3,3) C_2(3,3)
\end{pmatrix}.$$

For the following, it is advisable to number the $k^2$ answer categories by $1, \ldots, k^2$ where we first sort by answer $A_1$ and then by $A_2$. For example, for $k = 3$, the numbering scheme is given by Table 2.

Table 2. Answer categories $1, \ldots, 9$ for $A_1 \in \{1, 2, 3\}$ and $A_2 \in \{1, 2, 3\}$

answer category   1   2   3   4   5   6   7   8   9
answer A_1        1   1   1   2   2   2   3   3   3
answer A_2        1   2   3   1   2   3   1   2   3

Let $n_l$ ($l = 1, \ldots, k^2$) be the observed absolute frequency of answer category $l$ in the sample, and let $\lambda_l$ denote the corresponding component of $\lambda$. Furthermore, let $\tilde{C}$ be the $k^2 \times (k - 1)$ matrix that arises as follows from $C$: The $j$th column of $\tilde{C}$ ($j = 1, \ldots, k - 1$) is given by the difference $C(:, j) - C(:, k)$. The log-likelihood function is

$$l(\pi_1, \ldots, \pi_{k-1}) = (n_1, \ldots, n_{k^2}) \cdot \log \big[ \tilde{C} \cdot (\pi_1, \ldots, \pi_{k-1})^T + C(:, k) \big],$$

with componentwise application of the logarithm. The score function corresponding to $l$ is

$$s(\pi_1, \ldots, \pi_{k-1}) = [l'(\pi_1, \ldots, \pi_{k-1})]^T = \tilde{C}^T \cdot \Big( \frac{n_1}{\lambda_1}, \ldots, \frac{n_{k^2}}{\lambda_{k^2}} \Big)^T.$$

For the second derivative of $l$, we obtain

$$l''(\pi_1, \ldots, \pi_{k-1}) = - \tilde{C}^T \cdot \operatorname{diag}\Big( \frac{n_1}{\lambda_1^2}, \ldots, \frac{n_{k^2}}{\lambda_{k^2}^2} \Big) \cdot \tilde{C}.$$

Consequently, the Fisher matrix is

$$F = F(\pi_1, \ldots, \pi_{k-1}) = \tilde{C}^T \cdot n \cdot \operatorname{diag}\Big( \frac{1}{\lambda_1}, \ldots, \frac{1}{\lambda_{k^2}} \Big) \cdot \tilde{C}.$$

We maximize the log-likelihood by a Fisher scoring algorithm. However, other algorithms such as the EM algorithm, the Newton algorithm or the Nelder/Mead simplex algorithm are also possible. Our Fisher scoring algorithm generates a sequence $\pi^{(t)} = (\pi_1^{(t)}, \ldots, \pi_{k-1}^{(t)})^T$, $t = 1, 2, \ldots$, via the rule

$$\pi^{(t+1)} = \pi^{(t)} + \big[ F(\pi_1^{(t)}, \ldots, \pi_{k-1}^{(t)}) \big]^{-1} \cdot s(\pi_1^{(t)}, \ldots, \pi_{k-1}^{(t)})$$

until convergence. We denote the ML estimator for $\pi = (\pi_1, \ldots, \pi_{k-1})^T$ by $\hat{\pi} = (\hat{\pi}_1, \ldots, \hat{\pi}_{k-1})^T$. The asymptotic variance of this ML estimator is given by $[F(\pi_1, \ldots, \pi_{k-1})]^{-1}$. An estimator for the asymptotic variance is $[F(\hat{\pi}_1, \ldots, \hat{\pi}_{k-1})]^{-1}$.

5. Precision Improvement

We quantify the estimation inaccuracy by the trace of the asymptotic variance matrix of the ML estimator for $\pi = (\pi_1, \ldots, \pi_{k-1})^T$. For this variance matrix, we refer to the end of the previous section.

We start this section with a formal proof that the estimation inaccuracy of a two-trial NRR method is always less than or equal to the estimation inaccuracy of the single-trial process. Let $A_{ij}$ be the $j$th indirect answer of respondent $i$ ($i = 1, \ldots, n$, $j = 1, 2$). We set $f_{A_1}(a_1) = P(A_{11} = a_1)$, $f_{A_1, A_2}(a_1, a_2) = P(A_{11} = a_1, A_{12} = a_2)$, as well as $f_{A_2 \mid A_1}(a_2 \mid a_1) = P(A_{12} = a_2 \mid A_{11} = a_1)$. The Fisher matrix can be written as

$$F = F(\pi_1, \ldots, \pi_{k-1}) = n \cdot E \Big[ \Big( \frac{\partial}{\partial \pi} \log f_{A_1, A_2}(A_{11}, A_{12}) \Big)^T \frac{\partial}{\partial \pi} \log f_{A_1, A_2}(A_{11}, A_{12}) \Big].$$

We have

$$\begin{aligned}
& E \Big[ \Big( \frac{\partial}{\partial \pi} \log f_{A_1, A_2}(A_{11}, A_{12}) \Big)^T \frac{\partial}{\partial \pi} \log f_{A_1, A_2}(A_{11}, A_{12}) \Big] \\
&= E \Big[ \Big( \frac{\partial}{\partial \pi} \log f_{A_2 \mid A_1}(A_{12} \mid A_{11}) + \frac{\partial}{\partial \pi} \log f_{A_1}(A_{11}) \Big)^T \Big( \frac{\partial}{\partial \pi} \log f_{A_2 \mid A_1}(A_{12} \mid A_{11}) + \frac{\partial}{\partial \pi} \log f_{A_1}(A_{11}) \Big) \Big] \\
&= E \Big[ \Big( \frac{\partial}{\partial \pi} \log f_{A_1}(A_{11}) \Big)^T \frac{\partial}{\partial \pi} \log f_{A_1}(A_{11}) \Big]
 + E \Big[ \Big( \frac{\partial}{\partial \pi} \log f_{A_2 \mid A_1}(A_{12} \mid A_{11}) \Big)^T \frac{\partial}{\partial \pi} \log f_{A_2 \mid A_1}(A_{12} \mid A_{11}) \Big] \\
&\quad + E \Big[ \Big( \frac{\partial}{\partial \pi} \log f_{A_2 \mid A_1}(A_{12} \mid A_{11}) \Big)^T \frac{\partial}{\partial \pi} \log f_{A_1}(A_{11}) \Big]
 + E \Big[ \Big( \frac{\partial}{\partial \pi} \log f_{A_1}(A_{11}) \Big)^T \frac{\partial}{\partial \pi} \log f_{A_2 \mid A_1}(A_{12} \mid A_{11}) \Big].
\end{aligned}$$

In the following, we show that the last two summands are zero (zero matrix). We introduce the function $g$ with

$$g(a_1, a_2) = \Big( \frac{\partial}{\partial \pi} \log f_{A_2 \mid A_1}(a_2 \mid a_1) \Big)^T \frac{\partial}{\partial \pi} \log f_{A_1}(a_1)
= \Big( \frac{1}{f_{A_2 \mid A_1}(a_2 \mid a_1)} \cdot \frac{\partial}{\partial \pi} f_{A_2 \mid A_1}(a_2 \mid a_1) \Big)^T \frac{\partial}{\partial \pi} \log f_{A_1}(a_1).$$

It is true that

$$\begin{aligned}
E \big[ g(A_{11}, A_{12}) \mid A_{11} = a_1 \big]
&= E \Big[ \Big( \frac{1}{f_{A_2 \mid A_1}(A_{12} \mid a_1)} \cdot \frac{\partial}{\partial \pi} f_{A_2 \mid A_1}(A_{12} \mid a_1) \Big)^T \frac{\partial}{\partial \pi} \log f_{A_1}(a_1) \,\Big|\, A_{11} = a_1 \Big] \\
&= \sum_{a_2 \in \{1, \ldots, k\}} \Big( \frac{1}{f_{A_2 \mid A_1}(a_2 \mid a_1)} \cdot \frac{\partial}{\partial \pi} f_{A_2 \mid A_1}(a_2 \mid a_1) \Big)^T \frac{\partial}{\partial \pi} \log f_{A_1}(a_1) \cdot f_{A_2 \mid A_1}(a_2 \mid a_1) \\
&= \sum_{a_2 \in \{1, \ldots, k\}} \Big( \frac{\partial}{\partial \pi} f_{A_2 \mid A_1}(a_2 \mid a_1) \Big)^T \frac{\partial}{\partial \pi} \log f_{A_1}(a_1) \\
&= \Big( \frac{\partial}{\partial \pi} \sum_{a_2 \in \{1, \ldots, k\}} f_{A_2 \mid A_1}(a_2 \mid a_1) \Big)^T \frac{\partial}{\partial \pi} \log f_{A_1}(a_1) \\
&= (0, \ldots, 0)^T \cdot \frac{\partial}{\partial \pi} \log f_{A_1}(a_1) = 0,
\end{aligned}$$

because the conditional probabilities $f_{A_2 \mid A_1}(a_2 \mid a_1)$ sum to 1 over $a_2$. Consequently, $E(E(g(A_{11}, A_{12}) \mid A_{11})) = E(g(A_{11}, A_{12})) = 0$ holds. That is, the third summand is zero. Regarding the fourth summand, we have

$$E \Big[ \Big( \frac{\partial}{\partial \pi} \log f_{A_1}(A_{11}) \Big)^T \frac{\partial}{\partial \pi} \log f_{A_2 \mid A_1}(A_{12} \mid A_{11}) \Big]
= \Big\{ E \Big[ \Big( \frac{\partial}{\partial \pi} \log f_{A_2 \mid A_1}(A_{12} \mid A_{11}) \Big)^T \frac{\partial}{\partial \pi} \log f_{A_1}(A_{11}) \Big] \Big\}^T = 0.$$

Thus, we obtain

$$\begin{aligned}
F &= n \cdot E \Big[ \Big( \frac{\partial}{\partial \pi} \log f_{A_1}(A_{11}) \Big)^T \frac{\partial}{\partial \pi} \log f_{A_1}(A_{11}) \Big]
 + n \cdot E \Big[ \Big( \frac{\partial}{\partial \pi} \log f_{A_2 \mid A_1}(A_{12} \mid A_{11}) \Big)^T \frac{\partial}{\partial \pi} \log f_{A_2 \mid A_1}(A_{12} \mid A_{11}) \Big] \\
&=: G + n \cdot E \Big[ \Big( \frac{\partial}{\partial \pi} \log f_{A_2 \mid A_1}(A_{12} \mid A_{11}) \Big)^T \frac{\partial}{\partial \pi} \log f_{A_2 \mid A_1}(A_{12} \mid A_{11}) \Big]. \qquad (1)
\end{aligned}$$

The matrix $G = G(\pi_1, \ldots, \pi_{k-1})$ is the Fisher matrix if we only have observations on $A_{11}, \ldots, A_{n1}$, that is, if we only require one indirect answer per respondent. It follows from (1) that $F - G$ is positive-semidefinite. By a known property of the Löwner order (Nordström, 1989, p. 4473), we obtain that $G^{-1} - F^{-1}$ is positive-semidefinite. Thus, the trace of $G^{-1}$ is larger than or equal to the trace of $F^{-1}$. $G^{-1}$ is the asymptotic variance matrix of the ML estimator for $\pi$ for one indirect answer per interviewee, and $F^{-1}$ is the asymptotic variance matrix of the ML estimator for two indirect answers per interviewee. Hence, we have shown that the estimation inaccuracy of a two-trial NRR method is always less than or equal to the estimation inaccuracy of the single-trial process.

For numerical illustration, we now compute the estimation inaccuracy of our two-trial NRR techniques for concrete parameter specifications and make comparisons to the estimation inaccuracy of the single-trial versions. For the crosswise method, we set $\pi_1 = 0.8$ and consider

$$c_{11} \in \{0.1, 0.2, \ldots, 0.9, 1\} \quad \text{and} \quad c_{21} \in \{0.1, 0.2, \ldots, 0.9, 1\}. \qquad (2)$$

The quantity $n$ times the asymptotic variance of the ML estimator for $\pi_1$ for the two-trial crosswise method is presented for any combination of $c_{11}$ and $c_{21}$ in the middle of Table 3. In the right column of Table 3, we provide the quantity $n$ times the asymptotic variance of the ML estimator for $\pi_1$ for the single-trial crosswise method depending on the parameter $c_{11}$. Here, the asymptotic variance for the single-trial version is

$$\Big( \tilde{C}_1^T \cdot n \cdot \operatorname{diag}\big( 1 / (C_1 \cdot (\pi_1, \pi_2)^T) \big) \cdot \tilde{C}_1 \Big)^{-1}$$

with $\tilde{C}_1 = C_1(:, 1) - C_1(:, 2)$ and "/" symbolizing componentwise division.

For the triangular method, we again consider $\pi_1 = 0.8$ and proceed analogously to the crosswise method. The computational results for the triangular method are given in Table 4.

For the Tang et al. (2009) technique, we consider $k = 3$ categories, $(\pi_1, \pi_2, \pi_3) = (0.6, 0.3, 0.1)$, and 10 distributions of an auxiliary variable as follows:

c^(1) = (0.3333, 0.3333, 0.3333),   c^(2)  = (0.4372, 0.5124, 0.0504),
c^(3) = (0.4153, 0.5167, 0.0680),   c^(4)  = (0.1353, 0.7807, 0.0841),
c^(5) = (0.0901, 0.4471, 0.4628),   c^(6)  = (0.7780, 0.1278, 0.0942),
c^(7) = (0.4384, 0.5243, 0.0373),   c^(8)  = (0.2212, 0.4280, 0.3509),
c^(9) = (0.0617, 0.5125, 0.4257),   c^(10) = (0.3799, 0.4751, 0.1450).

The first distribution is a uniform distribution. The other vectors $c^{(2)}, \ldots, c^{(10)}$ were drawn randomly. The middle of Table 5 shows $n$ times the trace of the asymptotic variance matrix of the ML estimator for $(\pi_1, \pi_2)$ for the Tang et al. (2009) method with two indirect answers per respondent, for each combination in which $W_1$ has distribution $c^{(i)}$ and $W_2$ has distribution $c^{(j)}$. The right column of this table provides the quantity $n$ times the trace of the asymptotic variance of the ML estimator for $(\pi_1, \pi_2)$ for the single-trial method. The asymptotic variance for the single-trial Tang et al. (2009) method is

$$\Big( \tilde{C}_1^T \cdot n \cdot \operatorname{diag}\big( 1 / (C_1 \cdot (\pi_1, \pi_2, \pi_3)^T) \big) \cdot \tilde{C}_1 \Big)^{-1}$$

with $\tilde{C}_1 = [C_1(:, 1) - C_1(:, 3), \; C_1(:, 2) - C_1(:, 3)]$.

We finally come to the diagonal technique. Say, we have $k = 4$ categories, the vector $(\pi_1, \ldots, \pi_4) = (0.4, 0.3, 0.2, 0.1)$, and the 10 distributions of an auxiliary characteristic

c^(1) = (0.3250, 0.2250, 0.2250, 0.2250),   c^(2)  = (0.4000, 0.2000, 0.2000, 0.2000),
c^(3) = (0.4750, 0.1750, 0.1750, 0.1750),   c^(4)  = (0.5500, 0.1500, 0.1500, 0.1500),
c^(5) = (0.6250, 0.1250, 0.1250, 0.1250),   c^(6)  = (0.7000, 0.1000, 0.1000, 0.1000),
c^(7) = (0.7750, 0.0750, 0.0750, 0.0750),   c^(8)  = (0.8500, 0.0500, 0.0500, 0.0500),
c^(9) = (0.9250, 0.0250, 0.0250, 0.0250),   c^(10) = (1.0000, 0.0000, 0.0000, 0.0000).

The distributions were chosen according to Groenitz (2014, p. 219), where we considered $\sigma \in \{1/20, 2/20, \ldots, 9/20, 10/20\}$. For the two-trial and single-trial diagonal technique, the results concerning estimation inaccuracy are given in Table 6. For the right column of this table, we remark that the asymptotic variance matrix of the ML estimator for $(\pi_1, \pi_2, \pi_3)$ in the single-trial diagonal technique is equal to

$$\Big( \tilde{C}_1^T \cdot n \cdot \operatorname{diag}\big( 1 / (C_1 \cdot (\pi_1, \ldots, \pi_4)^T) \big) \cdot \tilde{C}_1 \Big)^{-1}$$

where $\tilde{C}_1 = [C_1(:, 1) - C_1(:, 4), \; C_1(:, 2) - C_1(:, 4), \; C_1(:, 3) - C_1(:, 4)]$.
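For the two-category designs, the inaccuracy measure reduces to a scalar and can be evaluated directly from the design matrices. The following is a minimal sketch under that restriction ($k = 2$); the function names are chosen here for illustration and are not taken from the article. It builds the two-trial matrix $C$ as the columnwise Kronecker product of $C_1$ and $C_2$ and returns $n$ times the asymptotic variance, i.e. the reciprocal of the Fisher information per respondent.

```python
def columnwise_kronecker(C1, C2):
    """Two-trial matrix C: row (a1, a2), column j holds C1[a1][j] * C2[a2][j]."""
    k = len(C1[0])
    return [[C1[a1][j] * C2[a2][j] for j in range(k)]
            for a1 in range(len(C1)) for a2 in range(len(C2))]


def n_times_variance(C, pi1):
    """n times the asymptotic variance of the ML estimator of pi_1 (k = 2 only).

    Each row of C holds P(answer | X = 1) and P(answer | X = 2).
    """
    info = 0.0
    for p_x1, p_x2 in C:
        lam = p_x1 * pi1 + p_x2 * (1.0 - pi1)   # probability of this answer
        if lam > 0.0:                           # skip impossible answers
            info += (p_x1 - p_x2) ** 2 / lam    # (C-tilde entry)^2 / lambda
    return 1.0 / info


def crosswise(c):
    """Crosswise design matrix for P(W = 1) = c."""
    return [[c, 1.0 - c], [1.0 - c, c]]


def triangular(c):
    """Triangular design matrix for P(W = 1) = c."""
    return [[c, 0.0], [1.0 - c, 1.0]]


pi1 = 0.8
single = n_times_variance(crosswise(0.3), pi1)
double = n_times_variance(columnwise_kronecker(crosswise(0.3), crosswise(0.4)), pi1)
print(round(single, 2), round(double, 2))
```

With $\pi_1 = 0.8$, $c_{11} = 0.3$ and $c_{21} = 0.4$, the two printed numbers should be close to 1.47 (single trial) and 1.19 (two trials). Designs with $k > 2$ would need a matrix inverse and a trace instead of the scalar reciprocal used here.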
Altogether, Tables 3-6 demonstrate that large efficiency gains are possible by two-trial NRR methods in comparison with single-trial NRR schemes.

Table 3. Inaccuracy for the crosswise method

             inaccuracy two-trial crosswise method (columns: c_21)           single
c_11     0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    1.0   trial
0.1     0.21   0.25   0.27   0.29   0.30   0.29   0.27   0.25   0.21   0.16    0.30
0.2     0.25   0.34   0.45   0.56   0.60   0.56   0.45   0.34   0.25   0.16    0.60
0.3     0.27   0.45   0.75   1.19   1.47   1.19   0.75   0.45   0.27   0.16    1.47
0.4     0.29   0.56   1.19   3.08   6.16   3.08   1.19   0.56   0.29   0.16    6.16
0.5     0.30   0.60   1.47   6.16     -    6.16   1.47   0.60   0.30   0.16      -
0.6     0.29   0.56   1.19   3.08   6.16   3.08   1.19   0.56   0.29   0.16    6.16
0.7     0.27   0.45   0.75   1.19   1.47   1.19   0.75   0.45   0.27   0.16    1.47
0.8     0.25   0.34   0.45   0.56   0.60   0.56   0.45   0.34   0.25   0.16    0.60
0.9     0.21   0.25   0.27   0.29   0.30   0.29   0.27   0.25   0.21   0.16    0.30
1.0     0.16   0.16   0.16   0.16   0.16   0.16   0.16   0.16   0.16   0.16    0.16

Note. This table shows the quantity $n$ times the asymptotic variance of the ML estimator for $\pi_1$ for the crosswise method. For $c_{11} = 0.5$ in the single-trial procedure and $c_{11} = c_{21} = 0.5$ in the two-trial procedure, the log-likelihood does not depend on $\pi_1$, implying that ML estimation is not adequate in these cases.

Table 4. Inaccuracy for the triangular method

             inaccuracy two-trial triangular method (columns: c_21)          single
c_11     0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    1.0   trial
0.1     3.57   2.22   1.52   1.10   0.81   0.61   0.46   0.34   0.24   0.16    7.36
0.2     2.22   1.58   1.18   0.90   0.69   0.54   0.41   0.31   0.23   0.16    3.36
0.3     1.52   1.18   0.93   0.74   0.59   0.47   0.37   0.29   0.22   0.16    2.03
0.4     1.10   0.90   0.74   0.61   0.50   0.41   0.34   0.27   0.21   0.16    1.36
0.5     0.81   0.69   0.59   0.50   0.43   0.36   0.30   0.25   0.20   0.16    0.96
0.6     0.61   0.54   0.47   0.41   0.36   0.31   0.27   0.23   0.19   0.16    0.69
0.7     0.46   0.41   0.37   0.34   0.30   0.27   0.24   0.21   0.18   0.16    0.50
0.8     0.34   0.31   0.29   0.27   0.25   0.23   0.21   0.19   0.18   0.16    0.36
0.9     0.24   0.23   0.22   0.21   0.20   0.19   0.18   0.18   0.17   0.16    0.25
1.0     0.16   0.16   0.16   0.16   0.16   0.16   0.16   0.16   0.16   0.16    0.16

Note. This table provides the quantity $n$ times the asymptotic variance of the ML estimator for $\pi_1$ for the triangular method.

Table 5. Inaccuracy for the Tang et al. (2009) design

            inaccuracy two-trial Tang et al. (2009) design (columns: distribution of W_2)    single
W_1       c^(1)  c^(2)  c^(3)  c^(4)  c^(5)  c^(6)  c^(7)  c^(8)  c^(9)  c^(10)               trial
c^(1)     0.70   0.71   0.72   0.91   0.82   0.52   0.72   0.76   0.85   0.72                 2.05
c^(2)     0.71   0.88   0.89   1.26   0.83   0.54   0.89   0.81   0.90   0.84                 1.89
c^(3)     0.72   0.89   0.89   1.27   0.85   0.54   0.90   0.82   0.92   0.85                 1.99
c^(4)     0.91   1.26   1.27   2.36   1.16   0.59   1.29   1.10   1.32   1.18                 7.46
c^(5)     0.82   0.83   0.85   1.16   1.03   0.55   0.84   0.93   1.09   0.85                 8.11
c^(6)     0.52   0.54   0.54   0.59   0.55   0.48   0.54   0.54   0.56   0.54                 0.71
c^(7)     0.72   0.89   0.90   1.29   0.84   0.54   0.91   0.81   0.91   0.85                 1.91
c^(8)     0.76   0.81   0.82   1.10   0.93   0.54   0.81   0.85   0.98   0.81                 3.32
c^(9)     0.85   0.90   0.92   1.32   1.09   0.56   0.91   0.98   1.16   0.91                12.44
c^(10)    0.72   0.84   0.85   1.18   0.85   0.54   0.85   0.81   0.91   0.82                 2.07

Note. This table shows $n$ times the trace of the asymptotic variance matrix of the ML estimator for $(\pi_1, \pi_2)$ for the Tang et al. (2009) method.

Table 6. Inaccuracy for the diagonal technique

            inaccuracy two-trial diagonal technique (columns: distribution of W_2)           single
W_1       c^(1)  c^(2)  c^(3)  c^(4)  c^(5)  c^(6)  c^(7)  c^(8)  c^(9)  c^(10)               trial
c^(1)    28.60  11.60   5.88   3.50   2.31   1.64   1.22   0.94   0.75   0.61                56.97
c^(2)    11.60   7.38   4.64   3.07   2.14   1.57   1.19   0.93   0.75   0.61                14.41
c^(3)     5.88   4.64   3.45   2.56   1.92   1.48   1.16   0.92   0.75   0.61                 6.47
c^(4)     3.50   3.07   2.56   2.08   1.68   1.36   1.11   0.90   0.74   0.61                 3.68
c^(5)     2.31   2.14   1.92   1.68   1.45   1.24   1.05   0.88   0.74   0.61                 2.37
c^(6)     1.64   1.57   1.48   1.36   1.24   1.11   0.98   0.85   0.73   0.61                 1.66
c^(7)     1.22   1.19   1.16   1.11   1.05   0.98   0.90   0.81   0.72   0.61                 1.23
c^(8)     0.94   0.93   0.92   0.90   0.88   0.85   0.81   0.77   0.70   0.61                 0.95
c^(9)     0.75   0.75   0.75   0.74   0.74   0.73   0.72   0.70   0.67   0.61                 0.75
c^(10)    0.61   0.61   0.61   0.61   0.61   0.61   0.61   0.61   0.61   0.61                 0.61

Note. This table presents $n$ times the trace of the asymptotic variance matrix of the ML estimator for $(\pi_1, \pi_2, \pi_3)$ for the diagonal method by Groenitz (2014).
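To connect the tabulated variances with the Fisher scoring rule of Section 4, the following sketch runs the iteration for the two-trial crosswise method, where $\pi_1$ is a scalar. Several things here are assumptions made for illustration only: the function names, the starting value 0.5, the stopping tolerance, the clipping of iterates to the open interval (0, 1), and the hypothetical answer counts, which are the idealised expected counts for $n = 1000$, $\pi_1 = 0.8$, $c_{11} = 0.3$ and $c_{21} = 0.4$.

```python
def two_trial_crosswise(c11, c21):
    """Rows (a1, a2) = (1,1), (1,2), (2,1), (2,2); columns X = 1, X = 2."""
    C1 = [[c11, 1.0 - c11], [1.0 - c11, c11]]
    C2 = [[c21, 1.0 - c21], [1.0 - c21, c21]]
    return [[C1[a1][j] * C2[a2][j] for j in range(2)]
            for a1 in range(2) for a2 in range(2)]


def fisher_scoring(C, counts, start=0.5, tol=1e-10, max_iter=100):
    """ML estimate of pi_1 (k = 2) and its estimated asymptotic variance."""
    n = sum(counts)
    diff = [row[0] - row[1] for row in C]   # the single column of C-tilde
    base = [row[1] for row in C]            # C(:, k) with k = 2

    def lam(pi1):
        return [d * pi1 + b for d, b in zip(diff, base)]

    def fisher(pi1):
        return n * sum(d * d / l for d, l in zip(diff, lam(pi1)) if l > 0.0)

    pi1 = start
    for _ in range(max_iter):
        score = sum(d * m / l for d, m, l in zip(diff, counts, lam(pi1)) if l > 0.0)
        # Fisher scoring step; the clipping is a safeguard added in this sketch.
        new = min(max(pi1 + score / fisher(pi1), 1e-9), 1.0 - 1e-9)
        converged = abs(new - pi1) < tol
        pi1 = new
        if converged:
            break
    return pi1, 1.0 / fisher(pi1)


# Hypothetical counts: expected frequencies for n = 1000 and pi_1 = 0.8.
counts = [180, 200, 260, 360]
estimate, variance = fisher_scoring(two_trial_crosswise(0.3, 0.4), counts)
print(round(estimate, 4), round(sum(counts) * variance, 2))
```

Because the counts are exact expected frequencies, the estimate should return to 0.8, and $n$ times the estimated variance should be close to the entry 1.19 of Table 3 for $c_{11} = 0.3$, $c_{21} = 0.4$. With real survey counts the estimate would of course vary around the true value.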

6. Summary

NRR designs for sensitive attributes have attracted much attention in the literature of the last years. In this article, we have considered two-trial versions of four NRR schemes. In a two-trial design, each person in the sample must provide two indirect answers. Each answer depends on a separate auxiliary characteristic. We have developed the maximum likelihood inference for the distribution of the sensitive variable and derived the asymptotic estimation variance. Moreover, we have analyzed the gains in estimation precision obtained by requiring two indirect answers per respondent instead of one indirect answer.

Acknowledgements

The author thanks the Editor and two anonymous reviewers for their comments and suggestions on the manuscript.

References

Alavi, S. M. R., & Tajodini, M. (2016). Maximum Likelihood Estimation of Sensitive Proportion Using Repeated Randomized Response Techniques. Journal of Applied Statistics, 43, 563-571.

Chaudhuri, A. (2011). Randomized Response and Indirect Questioning Techniques in Surveys. Chapman & Hall/CRC. https://doi.org/10.1201/b10476

Chaudhuri, A., & Christofides, T. C. (2013). Indirect Questioning in Sample Surveys. Springer. https://doi.org/10.1007/978-3-642-36276-7

Chaudhuri, A., Christofides, T. C., & Rao, C. R. (2016). Data Gathering, Analysis and Protection of Privacy Through Randomized Response Techniques: Qualitative and Quantitative Human Traits. Handbook of Statistics 34, North Holland. https://doi.org/10.1016/s0169-7161(16)x0002-8

Eriksson, S. A. (1973). A New Model for Randomized Response. International Statistical Review, 41, 101-113. https://doi.org/10.2307/1402791

Fox, J. A., & Tracy, P. E. (1986). Randomized Response - A Method for Sensitive Surveys. Sage. https://doi.org/10.4135/9781412985581

Groenitz, H. (2014). A New Privacy-Protecting Survey Design for Multichotomous Sensitive Variables. Metrika, 77, 211-224. https://doi.org/10.1007/s00184-012-0406-8

Groenitz, H. (2016). Valid Estimates for Repeated Randomized Response Methods. Journal of Applied Statistics, in press.

Nordström, K. (1989). Some Further Aspects of the Löwner-Ordering Antitonicity of the Moore-Penrose Inverse. Communications in Statistics - Theory and Methods, 18, 4471-4489.

Tan, M. T., Tian, G. L., & Tang, M. L. (2009). Sample Surveys with Sensitive Questions: A Nonrandomized Response Approach. The American Statistician, 63, 9-16. https://doi.org/10.1198/tast.2009.0002

Tang, M. L., Tian, G. L., Tang, N. S., & Liu, Z. (2009). A New Non-randomized Multi-category Response Model for Surveys With a Single Sensitive Question: Design and Analysis. Journal of the Korean Statistical Society, 38, 339-349. https://doi.org/10.1016/j.jkss.2008.12.004

Warner, S. L. (1965). Randomized Response: A Survey Technique for Eliminating Evasive Answer Bias. Journal of the American Statistical Association, 60, 63-69. https://doi.org/10.2307/2283137

Yu, J. W., Tian, G. L., & Tang, M. L. (2008). Two New Models for Survey Sampling With Sensitive Characteristic: Design and Analysis. Metrika, 67, 251-263. https://doi.org/10.1007/s00184-007-0131-x

Copyrights

Copyright for this article is retained by the author(s), with first publication rights granted to the journal. This is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).