Estimation of Sensitive Characteristic using Two-Phase Sampling for Regression Type Estimator

Similar documents
An Improved Generalized Estimation Procedure of Current Population Mean in Two-Occasion Successive Sampling

On the Estimation Of Population Mean Under Systematic Sampling Using Auxiliary Attributes

Ratio Estimators in Simple Random Sampling Using Information on Auxiliary Attribute

On the Relative Efficiency of the Proposed Reparametized Randomized Response Model

Chapter 7 Sampling and Sampling Distributions. Introduction. Selecting a Sample. Introduction. Sampling from a Finite Population

A Multi-proportion Randomized Response Model Using the Inverse Sampling

Estimation of sensitive quantitative characteristics in randomized response sampling

¼ ¼ 6:0. sum of all sample means in ð8þ 25

Introduction to Probability and Statistics

Singh, S. (2013). A dual problem of calibration of design weights. Statistics: A Journal of Theoretical and Applied Statistics 47 (3),

Hotelling s Two- Sample T 2

Estimation of the large covariance matrix with two-step monotone missing data

LOGISTIC REGRESSION. VINAYANAND KANDALA M.Sc. (Agricultural Statistics), Roll No I.A.S.R.I, Library Avenue, New Delhi

A Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO)

Variable Selection and Model Building

Tests for Two Proportions in a Stratified Design (Cochran/Mantel-Haenszel Test)

General Linear Model Introduction, Classes of Linear models and Estimation

ESTIMATION OF THE RECIPROCAL OF THE MEAN OF THE INVERSE GAUSSIAN DISTRIBUTION WITH PRIOR INFORMATION

On split sample and randomized confidence intervals for binomial proportions

Research Note REGRESSION ANALYSIS IN MARKOV CHAIN * A. Y. ALAMUTI AND M. R. MESHKANI **

A MIXED CONTROL CHART ADAPTED TO THE TRUNCATED LIFE TEST BASED ON THE WEIBULL DISTRIBUTION

Notes on Instrumental Variables Methods

Pretest (Optional) Use as an additional pacing tool to guide instruction. August 21

CHAPTER 5 STATISTICAL INFERENCE. 1.0 Hypothesis Testing. 2.0 Decision Errors. 3.0 How a Hypothesis is Tested. 4.0 Test for Goodness of Fit

The Binomial Approach for Probability of Detection

Exponential Ratio Type Estimators of Population Mean under Non-Response

Guaranteed In-Control Performance for the Shewhart X and X Control Charts

Sampling. Inferential statistics draws probabilistic conclusions about populations on the basis of sample statistics

7.2 Inference for comparing means of two populations where the samples are independent

Machine Learning: Homework 4

Bayesian System for Differential Cryptanalysis of DES

MODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL

Statistics II Logistic Regression. So far... Two-way repeated measures ANOVA: an example. RM-ANOVA example: the data after log transform

On the Estimation of Population Proportion

4. Score normalization technical details We now discuss the technical details of the score normalization method.

An Investigation on the Numerical Ill-conditioning of Hybrid State Estimators

Indirect Rotor Field Orientation Vector Control for Induction Motor Drives in the Absence of Current Sensors

substantial literature on emirical likelihood indicating that it is widely viewed as a desirable and natural aroach to statistical inference in a vari

Statics and dynamics: some elementary concepts

The Poisson Regression Model

Published: 14 October 2013

On The Class of Double Sampling Exponential Ratio Type Estimator Using Auxiliary Information on an Attribute and an Auxiliary Variable

Lower Confidence Bound for Process-Yield Index S pk with Autocorrelated Process Data

Supplementary Materials for Robust Estimation of the False Discovery Rate

MULTIVARIATE SHEWHART QUALITY CONTROL FOR STANDARD DEVIATION

Using Factor Analysis to Study the Effecting Factor on Traffic Accidents

Using a Computational Intelligence Hybrid Approach to Recognize the Faults of Variance Shifts for a Manufacturing Process

Solved Problems. (a) (b) (c) Figure P4.1 Simple Classification Problems First we draw a line between each set of dark and light data points.

Formal Modeling in Cognitive Science Lecture 29: Noisy Channel Model and Applications;

Estimating function analysis for a class of Tweedie regression models

ASYMPTOTIC RESULTS OF A HIGH DIMENSIONAL MANOVA TEST AND POWER COMPARISON WHEN THE DIMENSION IS LARGE COMPARED TO THE SAMPLE SIZE

Background. GLM with clustered data. The problem. Solutions. A fixed effects approach

AI*IA 2003 Fusion of Multiple Pattern Classifiers PART III

Analysis of M/M/n/K Queue with Multiple Priorities

Downloaded from jhs.mazums.ac.ir at 9: on Monday September 17th 2018 [ DOI: /acadpub.jhs ]

Tax Evasion, Social Customs and Optimal Auditing

A New Asymmetric Interaction Ridge (AIR) Regression Method

Evaluation of the critical wave groups method for calculating the probability of extreme ship responses in beam seas

3.4 Design Methods for Fractional Delay Allpass Filters

On Wald-Type Optimal Stopping for Brownian Motion

Biostat Methods STAT 5500/6500 Handout #12: Methods and Issues in (Binary Response) Logistic Regression

16.2. Infinite Series. Introduction. Prerequisites. Learning Outcomes

Tax Evasion, Social Customs and Optimal Auditing

On The Class of Double Sampling Ratio Estimators Using Auxiliary Information on an Attribute and an Auxiliary Variable

ANALYSIS ON PROBLEMS AND MOTIVATION OF POST 90S UNDERGRADUATES' ETIQUETTE

Multivariable PID Control Design For Wastewater Systems

Bayesian Spatially Varying Coefficient Models in the Presence of Collinearity

Use of Transformations and the Repeated Statement in PROC GLM in SAS Ed Stanek

Monte Carlo Studies. Monte Carlo Studies. Sampling Distribution

Pretest (Optional) Use as an additional pacing tool to guide instruction. August 21

On Wrapping of Exponentiated Inverted Weibull Distribution

Conditioning and Independence

DIFFERENTIAL evolution (DE) [3] has become a popular

Temperature, current and doping dependence of non-ideality factor for pnp and npn punch-through structures

On Fractional Predictive PID Controller Design Method Emmanuel Edet*. Reza Katebi.**

ON POLYNOMIAL SELECTION FOR THE GENERAL NUMBER FIELD SIEVE

Improved General Class of Ratio Type Estimators

MATHEMATICAL MODELLING OF THE WIRELESS COMMUNICATION NETWORK

COMPARISON OF VARIOUS OPTIMIZATION TECHNIQUES FOR DESIGN FIR DIGITAL FILTERS

Modeling and Estimation of Full-Chip Leakage Current Considering Within-Die Correlation

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

AN OPTIMAL CONTROL CHART FOR NON-NORMAL PROCESSES

16.2. Infinite Series. Introduction. Prerequisites. Learning Outcomes

The European Commission s science and knowledge service. Joint Research Centre

arxiv:cond-mat/ v2 25 Sep 2002

A new improved estimator of population mean in partial additive randomized response models

F. Augustin, P. Rentrop Technische Universität München, Centre for Mathematical Sciences

Chapter 9 Practical cycles

1. INTRODUCTION. Fn 2 = F j F j+1 (1.1)

Econ 3790: Business and Economics Statistics. Instructor: Yogesh Uppal

BERNOULLI TRIALS and RELATED PROBABILITY DISTRIBUTIONS

A Qualitative Event-based Approach to Multiple Fault Diagnosis in Continuous Systems using Structural Model Decomposition

PERFORMANCE BASED DESIGN SYSTEM FOR CONCRETE MIXTURE WITH MULTI-OPTIMIZING GENETIC ALGORITHM

Research of PMU Optimal Placement in Power Systems

The Interpretation of Unit Value Indices

Linear diophantine equations for discrete tomography

Chapter 10. Supplemental Text Material

Econ 3790: Business and Economics Statistics. Instructor: Yogesh Uppal

Generating Swerling Random Sequences

Transcription:

Journal of Informatics and Mathematical Sciences Volume 4 (01), Number 3,. 311 317 RGN Publications htt://www.rgnublications.com Estimation of Sensitive Characteristic using Two-Phase Samling for Regression Tye Estimator M. Javed and M.L. Bansal Abstract. The roblem of underreorting and non resonse on the sensitive issues are very common in most of the surveys due to our social setu. The randomied resonse (RR) models reduce rates of non-resonse and biased resonse that would ensure resondents rivacy if they resond truthfully concerning ersonal questions. Here RR device is roosed for regression tye estimator in which two indeendent samles are drawn from the oulation. The estimator of oulation mean of sensitive variable has been develoed. Its bias and variance and otimum variance have also been derived. Introduction Warner (1965) in his ioneer model suggested the technique of randomied resonse where every erson in the oulation belongs either to sensitive grou A, or its comliment. Enquirer often feels embarrassed in asking direct questions about sensitive issues. Such questions might be about drug usage, tax evasion, history of induced abortion, illegal income etc. Horvit et al. (1967) and Greenberg et al. (1971) have extended the Warner (1965) model to the case where the resonses to the sensitive question are quantitative rather than a simle yes or no. The resondent selects one of the two questions by means of a randomiation device. According to Greenberg et al. (1971), it is essential that the mean and variance of the resonses to the unrelated question be close to those for the sensitive questions, otherwise, it will often be ossible to recognie from the resonse which question was selected. Two samle single alternate RR model roosed by Greenberg and co-workers for obtaining the data on continuous tye sensitive random variable is more racticable and easy in handling. If some auxiliary information is available then resonses can be used in regression analysis by following Singh et al. (1996), Strachan et al. (1998), and Singh and King (1999). Recent ublications on RR techniques among others are, Singh et al. (000), Chaudhuri (001, 004), Singh (00), Elffers et al. (003), Huang Key words and hrases. RR device; Ratio tye estimator; Two-hase samling; Bias and variance.

31 M. Javed and M.L. Bansal (004), Javed and Grewal (006), Grewal et al. (006) and Sidhu et al. (007) and Gjestvang and Singh (006, 007), Samuel (008) and Carel et al. (010). Proosed Randomied Resonse Device Here an RR device is roosed for ratio tye estimator in which two indeendent samles, samle-i and samle-ii of sies n and k with relacement are drawn from the oulation. The resondents in the first samle are given the RR device which contained statements regarding A based on sensitive characteristic X and Q based on unrelated characteristic Y with resective robabilities and (1 ). Then the resondents are asked to answer the selected statement in terms of numerical figures without revealing to the interviewer which one of the two statements is answered. The resondents in the second samle are asked to answer in numerical figures (without using the RR device) for two direct statements Q 1 and Q based on unrelated characteristics Y 1 and Y resectively. The statements Q 1 and Q are correlated. The characteristics Y 1 and Y are unrelated with the sensitive character X. Let = robability that sensitive question is selected by the first resondent in the first samle. 1 = robability that non-sensitive question Q is selected by the first resondent in the first samle. = q Z i = observed resonse from individual i in first samle. X i = resonse of individual i in case he/she selects sensitive question through RR device in first samle. Y i = resonse of individual i in case he/ she selects alternate question Q through RR device in first samle. Y 1 j = resonse of first question Q 1 directly asked from j-th resondent in second samle. Y j = resonse of second question Q directly asked from j-th resondent in second samle. Estimator based on roosed randomied device Here a regression tye estimator for two-hase samling is roosed using the suggested RR device. A regression tye estimator µ xlr of mean µ x is roosed in two-hase samling is as below µ xlr = 1 X + (1 )ȳ (1 ){ ȳ + β(ȳ 1 ȳ 1} (1)

Estimation of Sensitive Characteristic using Two-Phase Samling for Regression Tye Estimator 313 where ȳ 1 = 1 m y 1 j, ȳ = 1 m y j, ȳ 1 = 1 k k y 1j and ȳ = 1 n n y i, () i=1 Let β = s y 1 y s y1, (3) s y1 y = 1 m 1 s y = 1 m 1 (y 1 j ȳ 1 )(y j ȳ ), s y 1 = 1 (y 1 j ȳ 1 ) and m 1 (y j ȳ ). (4) ε 1 = ȳ1 Ȳ 1 1, ε = ȳ 1 Ȳ 1 1, ε 3 = s y 1 y S y1 y such that E(ε 3 ) = E(ε 4 ) = 0 where 1 and ε 4 = s y 1 S y 1 1 (5) E(ε 1 ε 3 ) = µ 1 ȳ 1 µ 11 and E(ε 1 ε 4 ) = µ 30 ȳ 1 µ 0 (6) µ i j = E(y 1 ȳ 1 ) i (y ȳ ) j. (7) Bias of the Estimator The roosed estimator is biased. The bias of the estimator µ xlr is given in the following theorem: Theorem 1. The bias of the estimator µ xlr to the order O(n ) is given by 1 1 B(µ xlr ) = m 1 µ1 µ 11µ 30 k µ 0 µ. (8) 0 Proof. We have E(µ xlr ) = 1 E[ X + (1 )ȳ (1 ){ȳ + β(ȳ 1 ȳ 1}] = 1 [ X + (1 )Ȳ (1 ){Ȳ + βȳ 1 E((1 + ε 3 )(1 + ε 4 ) 1 (ε ε 1 ))}]. Considering the binomial exansion of (1 + ε 4 ) 1 u to the order of O(n ) and substituting the values of E(ε 1 ε 3 ) and E(ε 1 ε 4 ) in the above exression, we have 1 1 E(µ xlr ) = X m 1 µ1 µ 11µ 30 k µ 0 µ. (9) 0 As B(µ xlr ) = E(µ xlr ) X. (10) Using (9) in (10) we get (8).

314 M. Javed and M.L. Bansal Variance of the Estimator The variance of the estimator µ xlr is given in the following theorem: Theorem. The variance of the estimator µ xlr to the order of O(n ) is given by V (µ xlr ) = 1 σ 1 + (1 ) n k 1 1 σ y N + m 1 σ y (1 ρ ) (11) where σ = σ x + qσ y + q( X Ȳ ). Proof. We have V (µ xlr ) = 1 [V ( ) + (1 ) V (ȳ lrd )] (1) where = X + (1 )ȳ and ȳ lrd = ȳ + β(ȳ 1 ȳ 1) V (µ xlr ) = 1 σ n + (1 ) V (ȳ lrd ). (13) Now V (ȳ lrd ) = E 1 V ( ȳ lrd /s) + V 1 E (ȳ lrd /s) = E 1 V ( ȳ lrd /s) + V 1 (y ) where s means reliminary samle and V (ȳ lrd /s) = E( ȳ lrd /s) [E(ȳ lrd/s)] = E({ ȳ + β(ȳ 1 ȳ 1)} ) (E{ ȳ + β(ȳ 1 ȳ 1)}). For two-hase samling one can refer Sukhatme et al. (1984) to obtain the variance exression 1 V (ȳ lrd /s) = m 1 s y (1 ρ ) where ρ is the coefficient of correlation between y 1 and y and s y is defined at (4). 1 V (ȳ lrd ) = E 1 m 1 k s y (1 ρ ) + V 1 (y ) 1 = m 1 1 s y (1 ρ ) + k 1 σ y N. (14) Using the result obtained at (14) in (13), we get (11). Otimum Variance of the Estimator We should choose that rocedure which for a fixed cost C 0 minimies the variance of the estimate. If C 1, C and C 3 are the costs er unit of collecting information on the variable Z, auxiliary variable y 1 and y under study resectively. Then, the total cost of the survey can be exressed as C 0 = nc 1 + kc + mc 3. (15)

Estimation of Sensitive Characteristic using Two-Phase Samling for Regression Tye Estimator 315 We shall now determine n, k and m so that for a fixed cost C 0, the variance of the estimator µ xlr is least. Consider therefore, the exression φ = V (µ xlr ) λ(c 0 nc 1 kc mc 3 ). (16) On differentiating φ artially with resect to n, k and m and equating them equal to ero, we have φ n = 1 σ n + λc 1, (17) φ 1 σ k = y (1 ρ ) + λc, (18) φ 1 σ m = y (1 ρ ) + λc 3. (19) Solving (17), (18) and (19) for n, k and m resectively, we have σ m n =, (0) λc 1 1 1 ρ k = σ y, (1) λc m = k C C 3. () Again differentiating φ artially with resect to λ and equating the result equal to ero, we have φ λ = C 0 nc 1 kc mc 3. (3) Substituting the values of n, k and m in (3), we have λ = σ C1 + C [qa + k C 3 ] C 0 where A = σ y 1 ρ. Now substituting the values of λ in (0), (1) and (), we get the otimum values of n, k and m resectively as n ot = C 0σ B C 1, k ot = C 0qA B C, m ot = k ot C C 3

316 M. Javed and M.L. Bansal and B = σ C1 + C [qa + k C 3 ]. where A is defined above and q = 1. Theorem 3. The otimum variance of the estimator µ xlr to the order of O(n ) is given by V (µ xlr ) = 1 σ B C 1 B C + q C 0 C 0 qa 1 B σ y + N C 0 qa ( C 3 C )A. (4) Proof. We have V (µ xlr ) = 1 σ 1 + (1 ) n k 1 N 1 σ y + m 1 σ y (1 ρ ). Substituting the otimum values of n, k and m obtained earlier in the above exression, we get (4). References [1] F.W. Carel, Gerty Peeters, J.L.M. Lensvelt-Mulders and Karin Lasthuien (010), A note on a simle and ractical randomied resonse framework for eliciting dichotomous and quantitative information, Sociological Meth. Res., 39, 83 96. [] A. Chaudhuri (001), Using randomied resonse from a comlex survey to estimate a sensitive roortion in a dichotomous finite oulation, J. Statist. Planning Inference 94, 37 4. [3] A. Chaudhuri (004), Christofides randomied resonse technique in comlex samle survey, Metrika 60, 3 8. [4] E. Elffers, P.G.M. Van Der Heijden and M. Heemans (003), Exlaining regulatory non-comliance: a survey study of rule transgression for two Deutch instrumentl laws, alying the randomied resonse method, J. Quantitative Criminology 19, 409 439. [5] C.R. Gjestvang and S. Singh (006), A new randomied resonse model, J. Roy. Stat. Soc., B, 68, 53 530. [6] C.R. Gjestvang and S. Singh (007), Forced quantitative randomied resonse model: a new device, Metrika 66(), 43 56. [7] D.G. Horvit, B.V. Shah and W.R. Simmons (1967), The unrelated question randomied resonse model, Proc. Social Statist. Sec., Am. Statist. Ass., 65 7. [8] C.H. Huang (004), A survey technique for estimating the roortion and sensitivity in a dichotomous finite oulation, Statistica Neerlandica 58, 75 8. [9] B.G. Greenberg, R.R. Kuebler, J.R. Abernathy and D.G. Horvit (1971), Alication of the randomied resonse technique in obtaining quantitative data, J. Am. Statist. Ass. 66, 43 50. [10] I.S. Grewal, M. Bansal and S.S. Sidhu (006), Poulation mean estimator corresonding to Horvit and Thomson s estimator for multi-characteristics using randomied resonse technique, Model Assisted Statistics and Alications 1(4), 15 0. [11] M. Javed and I.S. Grewal (006), On the relative efficiencies of randomied resonse devices with Greenberg unrelated question model, Model Assisted Statistics and Alications 1(4), 91 97.

Estimation of Sensitive Characteristic using Two-Phase Samling for Regression Tye Estimator 317 [1] H. Samuel (008), The multi-item randomied resonse technique, Sociological Meth. Res. 36(4), 495 514. [13] S.S. Sidhu, M.L. Bansal and S. Singh (007), Estimation of sensitive multi-characters using unknown value of unrelated question, Alied Mathematical Sciences 1(37), 1803 180. [14] S. Singh (00), A new stochastic randomied resonse model, Metrika 56, 131 14. [15] S. Singh, A.H. Joarder and M.L. King (1996), Regression analysis using scrambled resonses, Australian J. Statist. 38(), 01 11. [16] S. Singh and M.L. King (1999), Estimation of coefficient of determination using scrambled resonses, J. Indian Soc. Agric. Statist. 5(3), 338 343. [17] S. Singh, R. Singh and N.S. Mangat (000), Some alternative strategies to Moor s model in randomied resonse samling, J. Statist. Planning Inference 83, 43 55. [18] R. Strachan, M.L. King and S. Singh (1998), Likelihood based estimation of the regression model with scrambled resonse, Australian Newealand J. Statist. 40(3), 79 90. [19] P.V. Sukhatme, B.V. Sukhatme, S. Sukhatme and C. Asok (1984), Samling Theory of Surveys with Alications, Iowa State University Press, USA and Indian Society of Agricultural Statistics, New Delhi, India. [0] S.L. Warner (1965), Randomied resonse: a survey technique for eliminating evasive answer bias, J. Am. Statist. Ass. 60, 63 69. M. Javed, Deartment of Mathematics, Statistics & Physics, Punjab Agricultural University, Ludhiana 141004, India. E-mail: mjaved@au.edu M.L. Bansal, Deartment of Mathematics, Statistics & Physics, Punjab Agricultural University, Ludhiana 141004, India. Received May, 011 Acceted May 9, 01