Analysis of rounded data in mixture normal model

Stat Papers (2012) 53:895-914
DOI 10.1007/s00362-011-0395-0

REGULAR ARTICLE

Ningning Zhao, Zhidong Bai

Received: 13 August 2010 / Revised: 9 June 2011 / Published online: 13 July 2011
Springer-Verlag 2011

N. Zhao, Z. Bai (corresponding author): KLASMOE and School of Mathematics and Statistics, Northeast Normal University, 5268 People's Road, Changchun 130024, China. Z. Bai e-mail: baizd@nenu.edu.cn. N. Zhao e-mail: zhaonn456@gmail.com

Abstract

Rounding errors have a considerable impact on statistical inferences, especially when the data size is large, and the finite normal mixture model is very important in many applied statistical problems, such as bioinformatics. In this article, we investigate the statistical impact of rounding errors on the finite normal mixture model with a known number of components, and develop a new estimation method to obtain consistent and asymptotically normal estimates of the unknown parameters based on rounded data drawn from this kind of model.

Keywords: Finite mixture normal model, EM algorithm, Consistent estimation, Asymptotic normality

Mathematics Subject Classification (2000): 62F10, 62F12

1 Introduction

Many unavoidable factors cause data to be rounded in statistical practice, such as the precision of measuring instruments and/or the limitations of the recording or storage mechanism. Although data rounding is omnipresent in the measurement of continuous variables and rounding errors certainly affect statistical inferences, rounding has been ignored in almost all statistical procedures because the effect of rounding errors is not significant for small samples. However, data sets of large sizes have become more common thanks to the wide application of modern data collection techniques, and thus the effect of rounding errors becomes more and more serious for hypothesis testing and confidence intervals. Simulation results have shown that any null hypothesis will be rejected with probability 1 when the usual t-test is applied to rounded data and the sample size is large enough. Mitigating the effect of rounding errors has therefore gradually attracted considerable attention in modern statistical research, and some new methods dealing with rounding errors have been proposed in the literature.

The earliest discussion of rounding errors dates back to Sheppard (1898). Subsequent work continued in Fisher (1922), Lindley (1950) and Mullet and Murray (1971). In more recent work, Tricker (1984, 1990a,b) and Tricker et al. (1998) considered the effects of rounding errors on the precision of Type I errors, powers, or R charts. Vardeman (2005) considered the construction of confidence intervals based on rounded data. Dempster and Rubin (1983) compared several methods for correcting rounding error, including Sheppard's correction, the BRB correction and ordinary least squares estimation with rounding errors ignored. Both Sheppard's and the BRB (see Beaton et al. 1976) corrections were derived under the assumption that rounding errors are uniformly distributed over the symmetric interval whose length is the rounding precision unit and which is centered at the rounded values. Furthermore, the rounding errors are supposed to be independent of the rounded values or the original values, respectively. It is obvious that these assumptions are invalid in general (see footnote 1), because both the rounded data and the rounding errors are functions of the true values of the sample subjects. All earlier works had failed to provide consistent estimation except for the recent work by Lee and Vardeman (2001, 2002, 2003), Bai et al. (2009) and Zhang et al. (2010).
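To make the size of the rounding effect concrete, the following minimal sketch computes the exact mean and variance of a rounded normal variable by summing over the rounding cells. It is only an illustration under stated assumptions: the function names `norm_cdf` and `rounded_moments` and the truncation parameter `span` are hypothetical and do not come from the article, and the comparison value h^2/12 is the standard variance term that Sheppard's correction subtracts.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def rounded_moments(mu, sigma, h=1.0, span=12.0):
    """Exact mean and variance of Y, the value of X ~ N(mu, sigma^2)
    rounded to the grid {j*h}, where Y = y if y - h/2 <= X < y + h/2.

    The sum runs over all grid points within `span` standard deviations
    of mu (an assumed truncation; the neglected tail mass is negligible).
    """
    lo = math.floor((mu - span * sigma) / h) - 1
    hi = math.ceil((mu + span * sigma) / h) + 1
    mean = 0.0
    second = 0.0
    for j in range(lo, hi + 1):
        y = j * h
        prob = (norm_cdf((y + h / 2 - mu) / sigma)
                - norm_cdf((y - h / 2 - mu) / sigma))
        mean += y * prob
        second += y * y * prob
    return mean, second - mean ** 2


# sigma large relative to h: the mean is almost unbiased and the variance
# is inflated by roughly h^2/12, which is what Sheppard's correction removes.
m_wide, v_wide = rounded_moments(mu=0.3, sigma=1.0)
# sigma small relative to h: the mean of the rounded data is visibly biased.
m_narrow, v_narrow = rounded_moments(mu=0.3, sigma=0.2)
# True mean at a multiple of h/2: the bias vanishes by symmetry.
m_half, v_half = rounded_moments(mu=0.5, sigma=0.2)

print(m_wide, v_wide, 1.0 + 1.0 / 12.0)
print(m_narrow, v_narrow)
print(m_half, v_half)
```

With h = 1, the second case gives a mean of about 0.16 instead of 0.3, a bias that no amount of additional data removes, whereas the first case behaves almost exactly as the uniform-error assumption predicts.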
Lee and Vardeman gave confidence interval estimates of the parameters $\mu$ or $\sigma^2$ based on a rounded sample drawn from the normal $N(\mu,\sigma^2)$ population, but they could not give estimates of $\mu$ and $\sigma^2$ simultaneously. Bai et al. (2009) proved that the sample mean of rounded independent and identically distributed (i.i.d.) normal data is not consistent unless the true mean is a multiple of half the precision unit. In the same article, the authors also proposed a new method, named approximate likelihood estimation, to deal with parameter estimation with rounded data from an AR(p) time series. Zhang et al. (2010) further extended this work to a general model of $\alpha$-mixing sequences and proposed a method, named the short overlapping series (SOS) method, for parameter estimation based on rounded data from a weakly dependent sequence. They proved that the new estimates are strongly consistent and asymptotically normally distributed under general regularity conditions.

Footnote 1: The raw data X cannot be independent of the rounding error because the conditional distribution of the rounding error given the raw data X is always degenerate. But there are some examples in which the rounded data and the rounding error are independent. As an example, if X can be decomposed as a sum of independent random variables Y and Z, where Y is integer-valued and Z is supported by (0, 1), then the rounded version Y of X is independent of the rounding error Z. However, such cases seem to be artificial rather than natural. In general, such a special decomposition is not available and thus the rounding error is not independent of the rounded data Y.

The finite normal mixture is an important statistical model in many applied areas, such as bioinformatics, remote sensing applications, pharmaceutical studies, and modeling for economic and financial problems. Although the rounded model on a finite normal mixture is a special case of the general theorem given in Zhang et al. (2010), it is not easy to verify that the ten regularity conditions hold for finite normal mixtures. In this article, we investigate the statistical errors of classical methods induced by rounding errors, propose a new estimation method which ensures the consistency and asymptotic normality of the estimates of parameters based on rounded data from a normal mixture family, and demonstrate its performance by simulation studies.

The classical methods of parameter estimation with unrounded data (more precisely, with rounding errors ignored) from finite normal mixture distributions can be found in Basford and McLachlan (1985), Cuesta-Albertos et al. (2008), Hasselblad (1966), Murray and Donald (1985), Richard (1985), among others. In this article we focus on estimating the parameters of a finite normal mixture distribution with a known number of components when the data from this distribution are recorded in rounded form. We investigate the estimates of the unknown parameters for both one-dimensional and high-dimensional finite normal mixture distributions. We start with one-dimensional finite normal mixture distributions, for which we employ the EM algorithm, and then extend our method to the high-dimensional case. For a high-dimensional normal mixture model, in order to reduce the amount of computation and improve the precision of the estimates, we follow the idea of the SOS method proposed in Zhang et al. (2010) and split our sample sequence into several low-dimensional sequences. Finally, we prove the consistency and asymptotic normality of the estimates.

The remainder of this article is organized as follows. In Sect. 2, we describe the method of parameter estimation for finite normal mixtures, starting from the one-dimensional case and then extending it to high-dimensional cases. In Sect. 3, we present the asymptotic properties (consistency and asymptotic normality) of the estimates obtained from rounded data. In Sect. 4, some simulation studies comparing the classical approaches with the newly proposed method are presented. Some conclusions are presented in Sect. 5. Some technical proofs are postponed to the Appendix.

2 Parameter estimation

In this section, we propose a new method for parameter estimation based on rounded data from a finite normal mixture model. We consider the one-dimensional case first and then extend it to the high-dimensional case.

2.1 One-dimensional mixture

In this subsection, we consider the case where the samples are drawn from a one-dimensional finite normal mixture distribution. Suppose that the true sample follows the finite normal mixture distribution

$$f(x,\theta)=\sum_{l=1}^{L}p_l\,f_l(x,\theta_l),$$

where $L$ is a known positive integer (see footnote 2); $f_l(x,\theta_l)$, $l=1,\dots,L$, are the component densities, each of which is the probability density function (pdf) of a normal distribution with mean and variance parameters $\theta_l=(\mu_l,\sigma_l^2)$; and $p_l$, $l=1,\dots,L$, are the mixing proportions satisfying $\sum_{l=1}^{L}p_l=1$. The density function $f$ is then parameterized by

$$\theta=(p_1,\dots,p_{L-1},\mu_1,\sigma_1^2,\dots,\mu_L,\sigma_L^2)\in\Theta=\{\theta;\ 0<p_l<1,\ -\infty<\mu_l<\infty,\ \sigma_l^2>0\}.$$

Footnote 2: It is well known that the parameters in a mixture model are not identifiable when some mixing proportion equals 0 or two components are identical. In either case, the number of components can be reduced to a smaller number. Hence, we use the statement "the number of mixture components is known" to implicitly mean that no mixing proportion equals 0 and no two components are identical. In any case, we shall clearly mention these identifiability conditions when listing our assumptions.

Let $X_1,\dots,X_n$ be an i.i.d. sample drawn from the pdf $f(x,\theta)$. The rounded sample data $Y_i$, $i=1,\dots,n$, are then defined by

$$Y_i=y_i\quad\text{if}\quad y_i-h/2\le X_i<y_i+h/2,$$

where $h$ is the width of the rounding interval. Based on the rounded data $y_1,\dots,y_n$, the log-likelihood function is

$$\ell_n(\theta)=\sum_{i=1}^{n}\log g(y_i,\theta),\tag{2.1}$$

where $g(y,\theta)=\int_{y-h/2}^{y+h/2}f(x,\theta)\,dx$. For later use, we similarly define $g_l(y,\theta_l)$ with $f(x,\theta)$ replaced by $f_l(x,\theta_l)$.

To compute the maximum likelihood estimate (MLE) of the parameters in (2.1), we use the EM algorithm. Let $Z_i=(Z_{i1},\dots,Z_{iL})$, $i=1,\dots,n$, be indicator variables; that is, $Z_{il}=1$ indicates that $X_i$ arises from the $l$-th subpopulation, with

$$\sum_{l=1}^{L}Z_{il}=1,\qquad P(Z_{il}=1)=p_l.$$

Then the complete data set is $(Y,Z)=((y_i,z_{il});\ i=1,\dots,n,\ l=1,\dots,L)$. Since the $z_{il}$ are not observable, the EM algorithm can be applied. The parameter vector $\theta=(p_1,\dots,p_L,\theta_1,\dots,\theta_L)$ is the object to be estimated in this part. It is easy to verify that $(Y_i\mid Z_i,\theta)$ are conditionally independent and have the conditional densities

$$\sum_{l=1}^{L}Z_{il}\,g_l(y_i,\theta_l)=\prod_{l=1}^{L}\bigl(g_l(y_i,\theta_l)\bigr)^{Z_{il}},\tag{2.2}$$

and $(Z_i)$ are i.i.d. with distribution

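$$\prod_{l=1}^{L}p_l^{Z_{il}}.\tag{2.3}$$

From the specifications (2.2) and (2.3), we are led to the likelihood function

$$L(\theta\mid Y,Z)=\prod_{i=1}^{n}\prod_{l=1}^{L}p_l^{Z_{il}}\bigl(g_l(y_i,\theta_l)\bigr)^{Z_{il}}.\tag{2.4}$$

Then the log-likelihood function is given by

$$\ell(\theta\mid Y,Z)=\sum_{i=1}^{n}\sum_{l=1}^{L}Z_{il}\log p_l+\sum_{i=1}^{n}\sum_{l=1}^{L}Z_{il}\log\bigl(g_l(y_i,\theta_l)\bigr).\tag{2.5}$$

Starting from a set of initial values $\theta^{(0)}=(p_1^{(0)},\dots,p_L^{(0)},\theta_1^{(0)},\dots,\theta_L^{(0)})$, we use the standard steps of the EM algorithm to get a sequence of recursively updated estimates $\theta^{(m)}=(p_1^{(m)},\dots,p_L^{(m)},\theta_1^{(m)},\dots,\theta_L^{(m)})$.

E step: Evaluate

$$Q(\theta,\theta^{(m)})=E\bigl(\ell(\theta\mid Y,Z)\mid Y,\theta^{(m)}\bigr)=\sum_{i=1}^{n}\sum_{l=1}^{L}E(Z_{il}\mid Y,\theta^{(m)})\log\bigl(g_l(y_i,\theta_l)\bigr)+\sum_{i=1}^{n}\sum_{l=1}^{L}E(Z_{il}\mid Y,\theta^{(m)})\log p_l,$$

where

$$E(Z_{il}\mid Y,\theta^{(m)})=\frac{p_l^{(m)}\,g_l(y_i,\theta_l^{(m)})}{\sum_{l'=1}^{L}p_{l'}^{(m)}\,g_{l'}(y_i,\theta_{l'}^{(m)})}.$$

M step: Compute

$$\theta^{(m+1)}=\arg\max_{\theta}Q(\theta,\theta^{(m)}).$$

In the M step, maximizing the second sum of $Q$ under the constraint $\sum_{l}p_l=1$ gives

$$p_l^{(m+1)}=\frac{1}{n}\sum_{i=1}^{n}E(Z_{il}\mid Y,\theta^{(m)})=\frac{1}{n}\sum_{i=1}^{n}\frac{p_l^{(m)}\,g_l(y_i,\theta_l^{(m)})}{\sum_{l'=1}^{L}p_{l'}^{(m)}\,g_{l'}(y_i,\theta_{l'}^{(m)})},$$

and $\theta_l^{(m+1)}$ can be found by maximizing

$$Q_1(\theta_1,\dots,\theta_L,\theta^{(m)})=\sum_{l=1}^{L}\sum_{i=1}^{n}E(Z_{il}\mid Y,\theta^{(m)})\log\bigl(g_l(y_i,\theta_l)\bigr).$$

For the latter, we employ the Newton-Raphson algorithm

$$\theta^{(m+1)}=\theta^{(m)}-\left[\frac{\partial^2 Q_1(\theta_1,\dots,\theta_L,\theta^{(m)})}{\partial\theta^2}\bigg|_{\theta^{(m)}}\right]^{-1}\left[\frac{\partial Q_1(\theta_1,\dots,\theta_L,\theta^{(m)})}{\partial\theta}\bigg|_{\theta^{(m)}}\right],$$

where here $\theta=(\theta_1,\dots,\theta_L)$. The calculation of the partial derivatives in the Newton-Raphson algorithm is not difficult for finite normal mixture models, so we obtain a sequence $\theta^{(m+1)}$ which maximizes $Q(\theta,\theta^{(m)})$. The iteration stops at $\theta^{(m+1)}$ when $Q(\theta^{(m+1)},\theta^{(m)})-Q(\theta^{(m)},\theta^{(m)})$ falls below the prechosen accuracy threshold. The final estimate $\hat\theta_n$ is consistent and asymptotically normal.

2.2 Two-dimensional mixture

The estimation procedure described in Sect. 2.1 can easily be extended to the two-dimensional case, provided that the observations are taken to be $x_i=(x_{i1},x_{i2})$, the rounded observations are $y_i=(y_{i1},y_{i2})$, and the parameters are $\theta=(p_1,\dots,p_{L-1},\theta_1,\dots,\theta_L)$. Then formula (2.1) becomes

$$\ell_n(\theta)\triangleq\sum_{i=1}^{n}\log g(y_i,\theta),\tag{2.6}$$

and the MLE of the parameters in (2.6) is obtained by the same procedure as in the one-dimensional case, except that formulae (2.2), (2.4) and (2.5) become

$$\sum_{l=1}^{L}Z_{il}\,g_l(y_i,\theta_l)=\prod_{l=1}^{L}\bigl(g_l(y_i,\theta_l)\bigr)^{Z_{il}},$$

$$L(\theta\mid Y,Z)=\prod_{i=1}^{n}\prod_{l=1}^{L}p_l^{Z_{il}}\bigl(g_l(y_i,\theta_l)\bigr)^{Z_{il}}$$

and

$$\ell(\theta\mid Y,Z)=\sum_{i=1}^{n}\sum_{l=1}^{L}Z_{il}\log p_l+\sum_{i=1}^{n}\sum_{l=1}^{L}Z_{il}\log\int_{y_i-h/2}^{y_i+h/2}f_l(x,\theta_l)\,dx,$$

where $g_l$ is defined as in the one-dimensional case, with the integral now taken over the two-dimensional rounding cell. The target function to be maximized in the M step is

$$Q(\theta,\theta^{(m)})=E\bigl(\ell(\theta\mid Y,Z)\mid Y,\theta^{(m)}\bigr)=\sum_{i=1}^{n}\sum_{l=1}^{L}E\bigl(Z_{il}\mid Y,\theta^{(m)}\bigr)\log\bigl(g_l(y_i,\theta_l)\bigr)+\sum_{i=1}^{n}\sum_{l=1}^{L}E\bigl(Z_{il}\mid Y,\theta^{(m)}\bigr)\log p_l.$$

2.3 High-dimensional mixture

For the $k$-dimensional finite normal mixture model ($k>2$), the pdf can be written as

$$f(x,\theta)=\sum_{l=1}^{L}p_l\,f_l(x,\theta_l),\qquad x\in\mathbb{R}^k,$$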

where $p_1,\dots,p_L$ are still the mixing proportions satisfying $\sum_{l=1}^{L}p_l=1$, and the mixture density $f$ is parameterized by $\theta=(p_1,\dots,p_{L-1},\theta_1,\dots,\theta_L)$ with $\theta_l=(\mu_l,\Sigma_l)$, $l=1,\dots,L$. Let $X=(X_1,\dots,X_n)$ be drawn from $f(x,\theta)=\sum_{l=1}^{L}p_l f_l(x,\theta_l)$, where $X_i=(X_{i1},\dots,X_{ik})^T$, $i=1,\dots,n$. Correspondingly, the rounded data of $X$, denoted by $Y$, can be formulated as $Y=(Y_1,\dots,Y_n)^T$, where $Y_i=(Y_{i1},\dots,Y_{ik})^T$ and

$$Y_i=y_i\quad\text{if}\quad y_{ij}-h_j/2\le X_{ij}<y_{ij}+h_j/2,\qquad j=1,\dots,k,$$

with $y_i=(y_{i1},\dots,y_{ik})^T$ and $h=(h_1,\dots,h_k)$ being the rounding units. Then, for the rounded data, the log-likelihood function is given by

$$\sum_{i=1}^{n}\log\left(\sum_{l=1}^{L}p_l\,g_l(y_i,\theta_l)\right).\tag{2.7}$$

Logically, the procedure developed in the last two subsections can be formally extended to the case where the dimension $k$ is larger than 2. However, because the likelihood (2.7) contains $k$-fold integrals, the numerical computation is very demanding even for moderately large $k$. In order to reduce the amount of computation while keeping acceptable accuracy of the estimates, we convert the high-dimensional structure into several two-dimensional ones as follows. We split the random vector $X_i=(X_{i1},\dots,X_{ik})^T$ into the pairs

$$\begin{array}{l}
(X_{i1},X_{i2}),\ (X_{i2},X_{i3}),\ (X_{i3},X_{i4}),\ (X_{i4},X_{i5}),\ \dots,\ (X_{i(k-1)},X_{ik}),\\
(X_{i1},X_{i3}),\ (X_{i2},X_{i4}),\ (X_{i3},X_{i5}),\ \dots,\ (X_{i(k-2)},X_{ik}),\\
(X_{i1},X_{i4}),\ (X_{i2},X_{i5}),\ \dots,\ (X_{i(k-3)},X_{ik}),\\
\quad\vdots\\
(X_{i1},X_{ik}).
\end{array}$$

For each pair $(t,s)$, $t=1,\dots,k-1$, $s=1,\dots,k-t$, the sequence $\{\tilde X_{its}=(X_{it},X_{i(t+s)}),\ i=1,\dots,n\}$ forms a two-dimensional mixture of normal variables with pdf $\sum_{l=1}^{L}p_l f_{lts}(x)$, where $f_{lts}(x)$ is the density of the two-dimensional normal distribution $N(\mu_{lts},\Sigma_{lts})$, and $\mu_{lts}$ and $\Sigma_{lts}$ are the mean vector and covariance matrix of the $t$-th and $(t+s)$-th entries, respectively.
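The splitting scheme above can be sketched in a few lines. This is an illustrative enumeration only: the names `coordinate_pairs` and `split_into_pairs` are hypothetical, and coordinates are numbered from 1 as in the text.

```python
from collections import Counter


def coordinate_pairs(k):
    """All index pairs (t, t + s), 1-based, listed lag by lag:
    first s = 1, then s = 2, and so on up to s = k - 1."""
    return [(t, t + s) for s in range(1, k) for t in range(1, k - s + 1)]


def split_into_pairs(x):
    """Split one k-dimensional observation into its two-dimensional
    sub-vectors, keyed by the pair of coordinate indices."""
    return {(t, u): (x[t - 1], x[u - 1]) for (t, u) in coordinate_pairs(len(x))}


pairs = coordinate_pairs(5)
# How many pairs each coordinate takes part in.
appearances = Counter(index for pair in pairs for index in pair)

print(pairs)
print(len(pairs), dict(appearances))
print(split_into_pairs([10, 20, 30]))
```

For k = 5 this gives k(k-1)/2 = 10 pairs, and every coordinate appears in k-1 = 4 of them, so each two-dimensional fit involves only a double integral per rounding cell instead of a five-fold one.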
Similarly, the vector $Y_i$ is split in the same manner:

$$\begin{array}{l}
(Y_{i1},Y_{i2}),\ (Y_{i2},Y_{i3}),\ (Y_{i3},Y_{i4}),\ (Y_{i4},Y_{i5}),\ \dots,\ (Y_{i(k-1)},Y_{ik}),\\
(Y_{i1},Y_{i3}),\ (Y_{i2},Y_{i4}),\ (Y_{i3},Y_{i5}),\ \dots,\ (Y_{i(k-2)},Y_{ik}),\\
(Y_{i1},Y_{i4}),\ (Y_{i2},Y_{i5}),\ \dots,\ (Y_{i(k-3)},Y_{ik}),\\
\quad\vdots\\
(Y_{i1},Y_{ik}).
\end{array}$$

Based on the rounded observations $\tilde Y_{its}=(Y_{it},Y_{i(t+s)})$, $i=1,2,\dots,n$, of $\tilde X_{its}$ and using the method described in Sect. 2.2, we can obtain the maximum likelihood estimates $\hat\mu_{lts}$, $\hat\Sigma_{lts}$ and $\hat p_{lts}$. Note that although $p_l$ does not depend on the coordinate indices $t$ and $t+s$, its estimators do. Note also that the covariance (or correlation) of the $t$-th and $(t+s)$-th entries has been estimated only once, and thus we can take that estimate as the final one. But the mean and variance of each coordinate have been estimated $k-1$ times, and each mixing weight $p_l$ has been estimated $k(k-1)/2$ times. Thus, we can take the averages of the estimates as the final estimates of these parameters.

Remark 1 To improve the estimates of the means and variances of the normal components as well as the mixture weights, one may also treat $\{X_{it},\ i=1,2,\dots,n\}$ as a one-dimensional normal mixture and use the procedure described in Sect. 2.1 to get the MLEs $\hat\mu_{lt}$, $\hat\sigma_{lt}^2$ and the mixture weight $\hat p_{lt}$. Thus, the estimates of the means and variances can be taken as the average of $k$ estimates, and the MLE of the mixture weight can be taken as the average of $k(k+1)/2$ estimates.

Remark 2 By the consistency which will be established in the next section, the estimates of the mean $\mu_t$ and variance $\sigma_t^2$ from each pair of subsamples should be very close to the true value when $n$ is large. However, for a real sample, the values of the estimates may differ, and the differences may even be large. In real applications, we may take the average after each M step until convergence.

3 Asymptotic properties of parameter estimates

In this section, we show that the proposed estimates of the unknown parameters of a finite normal mixture are consistent and asymptotically normal. We assume that the number $L$ of components in the normal mixture distribution is known. Then we have $p_l>0$ for each $l\le L$ with $\sum_{l=1}^{L}p_l=1$, and $(\mu_i,\Sigma_i)\ne(\mu_j,\Sigma_j)$ for each pair $i\ne j$.

Theorem 1 For a finite normal mixture distribution in which the number of components is known, the estimate $\hat\theta_n$ obtained from the likelihood function (2.7) satisfies

$$\hat\theta_n\to\theta_0\quad\text{a.s.},\tag{3.1}$$

$$\sqrt{n}\,(\hat\theta_n-\theta_0)\xrightarrow{d}N\bigl(0,I^{-1}(\theta_0)\bigr),\tag{3.2}$$

where $I(\theta_0)$ is the Fisher information matrix given by

$$I(\theta_0)=\sum_{j}\frac{1}{q(j\mid\theta)}\,\frac{\partial q(j\mid\theta)}{\partial\theta}\,\frac{\partial q(j\mid\theta)}{\partial\theta^{T}}\bigg|_{\theta=\theta_0}$$

and

$$q(j\mid\theta)=\int_{j-1/2}^{j+1/2}f(x,\theta)\,dx,\qquad j=(j_1,\dots,j_k),$$

with $j_i$ integers for $i=1,\dots,k$.

The proofs of (3.1) and (3.2) are postponed to the Appendix.
We point out here that the asymptotic normality is important for hypothesis tests and confidence regions in finite normal mixture models. In practical applications of the asymptotic normality of $\hat\theta_n$, one also needs a consistent estimator of the asymptotic covariance matrix of $\hat\theta_n$. To this end, we propose to estimate $I(\theta_0)$ by replacing $\theta_0$ with $\hat\theta_n$.
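As a rough illustration of this plug-in idea, the sketch below evaluates the Fisher information matrix of Theorem 1 in the simplest special case, a single one-dimensional normal component (L = 1) with parameter (mu, sigma^2), and turns it into asymptotic standard errors. The function names, the general rounding width `h` (the theorem is stated for rounding to integers) and the truncation parameter `span` are assumptions made for this example, not part of the article.

```python
import math


def Phi(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def fisher_information_rounded_normal(mu, var, h=1.0, span=8.0):
    """2x2 Fisher information of (mu, sigma^2) for one observation of
    X ~ N(mu, var) rounded to the grid {j*h}:
    I = sum_j (1/q_j) (dq_j/dtheta) (dq_j/dtheta)^T."""
    sd = math.sqrt(var)
    lo = math.floor((mu - span * sd) / h)
    hi = math.ceil((mu + span * sd) / h)
    i11 = i12 = i22 = 0.0
    for j in range(lo, hi + 1):
        a = (j * h - h / 2 - mu) / sd
        b = (j * h + h / 2 - mu) / sd
        q = Phi(b) - Phi(a)
        if q < 1e-14:  # negligible cell; also avoids division by ~0
            continue
        dq_dmu = (phi(a) - phi(b)) / sd
        dq_dvar = (a * phi(a) - b * phi(b)) / (2.0 * var)
        i11 += dq_dmu * dq_dmu / q
        i12 += dq_dmu * dq_dvar / q
        i22 += dq_dvar * dq_dvar / q
    return [[i11, i12], [i12, i22]]


def asymptotic_se(info, n):
    """Standard errors sqrt(diag(I^{-1}) / n) for a 2x2 information matrix."""
    det = info[0][0] * info[1][1] - info[0][1] ** 2
    return math.sqrt(info[1][1] / det / n), math.sqrt(info[0][0] / det / n)


# Plug-in use: evaluate the information at the estimate (here mu = 0.3,
# sigma^2 = 9 stand in for an estimate obtained from n = 10000 rounded values).
info = fisher_information_rounded_normal(mu=0.3, var=9.0)
se_mu, se_var = asymptotic_se(info, n=10000)
print(info)
print(se_mu, se_var)
```

For unrounded data the corresponding information entries are 1/sigma^2 and 1/(2 sigma^4); the rounded-data values are slightly smaller, which is the price of rounding. For a genuine mixture the same sum applies with q(j | theta) built from all components and a larger parameter vector.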

4 Simulation results and discussions

In this section, we present some simulation results to show how the new procedure performs in the presence of rounding error. We estimate the unknown parameters in both one-dimensional and five-dimensional finite normal mixture distributions based on samples that are rounded to integers. As a reference, we present the classical estimates obtained from unrounded data (named "precise estimate") and those based on rounded data with rounding errors ignored (named "uncorrected estimate").

For the one-dimensional normal mixture model, the simulation is conducted for a two-component normal mixture distribution and a three-component normal mixture distribution. Simulation results are obtained from 1000 repetitions of samples of size 10,000. It is well known that the global maximum of the likelihood for a finite normal mixture model is always infinite. However, when the data are not rounded, it is natural to believe that the likelihood should have a local maximum near the true parameter when the sample size is large. Therefore, if the initial values are suitably chosen, the estimates from the EM algorithm may converge to a local maximum. In our simulation, we employ the EM algorithm to compute the precise estimates and the uncorrected estimates. The EM algorithm works well for unrounded data in most cases. However, when the two components of the normal mixture are heavily mixed and the data are rounded, the EM algorithm fails in the cases where the recursive estimates tend to the global maximizers. In this case, we report NA, standing for "not available". The estimates by our proposed method are indicated by "our correction". The simulation results are presented in Tables 1, 2, 3 and 4. In these tables, we also present the square roots of the mean squared errors over the 1000 repetitions.
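A single replication of the two-component experiment can be sketched as follows. This is a simplified illustration, not the authors' implementation. It maximizes the rounded-data likelihood (2.1), but instead of the Newton-Raphson M step of Sect. 2.1 it uses a swapped-in EM variant that treats the unrounded values as missing as well, so that the M step is available in closed form through truncated-normal moments. The function names, the seed, the starting values and the reading of p = 0.8 as the weight of the first component are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def Phi(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def cell_moments(y, mu, var, h):
    """For X ~ N(mu, var) return (g, m, v): the probability g of the rounding
    cell [y - h/2, y + h/2) and the mean m and variance v of X given the cell."""
    sd = math.sqrt(var)
    a, b = (y - h / 2 - mu) / sd, (y + h / 2 - mu) / sd
    g = max(Phi(b) - Phi(a), 0.0)
    if g < 1e-12:  # far tail: avoid dividing by (almost) zero
        return g, y, 0.0
    r = (phi(a) - phi(b)) / g
    v = var * (1.0 + (a * phi(a) - b * phi(b)) / g - r * r)
    return g, mu + sd * r, max(v, 0.0)


def em_rounded_mixture(ys, h, p, mu, var, n_iter=300):
    """EM for the rounded-data likelihood (2.1), with both the labels and
    the unrounded values treated as missing (closed-form M step)."""
    counts, n, L = Counter(ys), len(ys), len(p)
    for _ in range(n_iter):
        w_sum, m_sum, s_sum = [0.0] * L, [0.0] * L, [0.0] * L
        for y, c in counts.items():
            cells = [cell_moments(y, mu[l], var[l], h) for l in range(L)]
            mix = sum(p[l] * cells[l][0] for l in range(L))  # g(y, theta)
            if mix <= 0.0:
                continue
            for l, (g, m, v) in enumerate(cells):
                w = c * p[l] * g / mix        # cell count times E(Z_il | y)
                w_sum[l] += w
                m_sum[l] += w * m
                s_sum[l] += w * (v + m * m)   # weighted E(X^2 | cell, l)
        p = [w / n for w in w_sum]
        mu = [m_sum[l] / w_sum[l] for l in range(L)]
        # The small floor on the variance is a numerical safeguard only.
        var = [max(s_sum[l] / w_sum[l] - mu[l] ** 2, 1e-6) for l in range(L)]
    return p, mu, var


def simulate_rounded(n, p1, mu, var, h, rng):
    """Draw n values from a two-component mixture and round them to the
    grid {j*h} using y - h/2 <= x < y + h/2."""
    ys = []
    for _ in range(n):
        l = 0 if rng.random() < p1 else 1
        x = rng.gauss(mu[l], math.sqrt(var[l]))
        ys.append(h * math.floor(x / h + 0.5))
    return ys


rng = random.Random(2012)
y = simulate_rounded(10000, 0.8, (0.0, 3.0), (0.25, 1.0), 1.0, rng)
p_hat, mu_hat, var_hat = em_rounded_mixture(
    y, 1.0, p=[0.5, 0.5], mu=[-0.5, 2.5], var=[1.0, 1.0])
print([round(v, 3) for v in p_hat])
print([round(v, 3) for v in mu_hat])
print([round(v, 3) for v in var_hat])
```

Because the data are rounded to integers, the sample collapses to a handful of distinct values, so each EM sweep only loops over the cell counts. The parameters used here are those labelled Case 1 in Table 1.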
From these tabes, it can be seen that the EM agorithm works we for unrounded data for a cases and it is fais when the data are rounded and the two components are mixed heaviy. For five-dimensiona norma mixture modes, the simuation is aso conducted by 100 repetitions of sampes of sizes 10,000. In the simuation, we sha use the foowing two sets of true parameters; the first set is p 1 = p = 0.75, μ 1 = (0.25, 1.75, 2.25, 3, 4.25), μ 2 = (3.25, 5, 5.25, 6.25, 7.25), 0.25 0.2 0.2 0.2 0.2 1 0.1 0.1 0.1 0.1 0.2 0.25 0.2 0.2 0.2 1 = 0.2 0.2 0.25 0.2 0.2 0.2 0.2 0.2 0.25 0.2 and 0.1 1 0.1 0.1 0.1 2 = 0.1 0.1 1 0.1 0.1 0.1 0.1 0.1 1 0.1. 0.2 0.2 0.2 0.2 0.25 0.1 0.1 0.1 0.1 1 And the second set of parameters is p 1 = p = 0.8, μ 1 = (4, 0, 0.25, 0.75, 1), μ 2 = (13.4, 2.2, 2.45, 2.95, 3.8), 5.174 0.955 1.116 1.407 1.496 0.955 0.35 0.18 0.245 0.28 1 = 1.116 0.18 0.46 0.336 0.24 1.407 0.245 0.336 0.59 0.336 and 1.496 0.28 0.24 0.336 0.74

904 N. Zhao, Z. Bai Tabe 1 Estimates for parameters in one-dimensiona norma mixture distribution (Case 1) Parameter True vaue Precise estimate Uncorrected estimate Our correction μ 1 0 0.000426 0.00231 0.000547 MSE( ˆμ1 )) (0.00623) (0.00763) (0.00719) σ1 2 ) 0.25 0.248 0.321 0.247 MSE( ˆσ 21 ) (0.00445) (0.0741) (0.00551) μ 2 3 2.978 2.959 2.977 MSE( ˆμ2 )) (0.0287) (0.0423) (0.0335) σ2 2 ) 1 0.995 1.111 0.998 MSE( ˆσ 22 ) (0.0457) (0.135) (0.0575) p 0.8 0.794 0.792 0.794 ) MSE( ˆp) (0.00434) (0.00514) (0.00459) Tabe 2 Estimates for parameters in one-dimensiona norma mixture distribution (Case 2) Parameter True vaue Precise estimate Uncorrected estimate Our correction μ 1 0 0.0000577 NA 0.000250 MSE( ˆμ1 )) (0.00659) (NA) (0.00843) σ1 2 ) 0.25 0.250 NA 0.250 MSE( ˆσ 21 ) (0.00488) (NA) (0.00620) μ 2 2.5 2.498 NA 2.497 MSE( ˆμ2 )) (0.0379) (NA) (0.0510) σ2 2 ) 1 1.001 NA 1.001 MSE( ˆσ 22 ) (0.0542) (NA) (0.0708) p 0.8 0.799 NA 0.799 ) MSE( ˆp) (0.00517) (NA) (0.00665) 5.452 1.428 1.194 0.99 1.64 1.428 0.59 0.336 0.21 0.392 2 = 1.194 0.336 0.46 0.21 0.288 0.99 0.21 0.21 0.35 0.32. 1.64 0.392 0.288 0.32 0.74 For the first set of true parameters, the covariance matrices were manuay chosen to have equa variances and equa correation coefficients. Whereas, for the second set of true parameters, the covariance matrices were generated by sampe covariance matrices of sma sampe sizes, so that to have different variances and different correation coefficients. As in one-dimensiona norma mixture distribution, there are aso some situations where the cassica maximum ikeihood estimation and/or EM agorithm are not appicabe to the rounded data with rounding errors ignored. Our simua-

Anaysis of rounded data in mixture norma mode 905 Tabe 3 Estimates for parameters in one-dimensiona norma mixture distribution (Case 3) Parameter True vaue Precise estimate Uncorrected estimate Our correction μ 1 0 0.000885 NA 0.00402 MSE( ˆμ1 )) (0.00774) (NA) (0.0141) σ1 2 ) 0.25 0.250 NA 0.251 MSE( ˆσ 21 ) (0.00690) (NA) (0.0109) μ 2 1.5 1.510 NA 1.547 MSE( ˆμ2 )) (0.0923) (NA) (0.164) σ2 2 ) 1 0.990 NA 0.961 MSE( ˆσ 22 ) (0.0854) (NA) (0.131) p 0.8 0.800 NA 0.805 ) MSE( ˆp) (0.0147) (NA) (0.0259) Tabe 4 Estimates for parameters in one-dimensiona norma mixture distribution (Three-component case) Parameter True vaue Precise estimation Uncorrected Our correction Our correction (round to integer) (round to 0.1 digit) μ 1 2 2.000701 NA 1.999 2.001 MSE( ˆμ1 )) (0.0484) (NA) (0.0850) (0.0486) σ1 2 ) 1 1.000 NA 1.000 1.000 MSE( ˆσ 21 ) (0.0592) (NA) (0.0886) (0.0592) μ 2 0 0.000513 NA 0.00131 0.000545 MSE( ˆμ2 )) (0.0116) (NA) (0.0250) (0.0117) σ2 2 ) 0.25 0.250 NA 0.250 0.251 MSE( ˆσ 22 ) (0.0118) (NA) (0.0177) (0.0119) μ 3 2 2.00557 NA 2.006 2.006 MSE( ˆμ3 )) (0.0604) (NA) (0.102) (0.0604) σ3 2 ) 1 0.995 NA 0.994 0.995 MSE( ˆσ 23 ) (0.0712) (NA) (0.107) (0.0712) p 1 0.3 0.300 NA 0.300 0.3000 MSE( ˆp1 )) (0.0101) (NA) (0.0177) (0.0101) p 2 0.5 0.501 NA 0.501 0.501 MSE( ˆp2 )) (0.0141) (NA) (0.0233) (0.0141) tion was conducted to a five-dimensiona norma mixture distribution of two components with 100 repetitions of sampes of sizes 1000. The data are rounded to integers. Because there are too many parameters in the five-component mode to present the simuation resuts by tabes, we sha present our simuation resuts in Figs. 1 and 2 for the two sets of parameters. Each figure consists of biases and the root mean square

906 N. Zhao, Z. Bai biases 0.20 0.05 0.10 mu1 sigma1 rho1 mu2 sigma2 rho2 p precise uncorrected our correction parameters squre roots of MSE 0.00 0.10 0.20 mu1 sigma1 rho1 mu2 sigma2 rho2 p precise uncorrected our correction parameters Fig. 1 The biases and the square-roots of MSE of the parameter estimates for the first set of parameters obtained by precise data, rounded data without correction and our correction based on rounded data error. In Fig. 1, we arrange the parameters on the horizonta axes in the order: the five means, five variances, and ten correation coefficients of the first subpopuation, then the five means, five variances, and ten correation coefficients of the second subpopuation, and finay the four mixture coefficients, denoted respectivey by mu1, sigma1, rho1, mu2, sigma2, rho2, and p accordingy. In Fig. 2, we arrange the parameters on the horizonta axes in the order: the five means, five variances, and ten covariances of the first subpopuation, then the five means, five variances, and ten covariances of the second subpopuation, and finay the four mixture coefficients, denoted respectivey by mu1, sigma1, cov1, mu2, sigma2, cov2, and p accordingy. At each x-point, there are three points presenting, respectivey, the bias (or root MSE) of the precise estimate, uncorrected estimate, and our corrected estimates based on rounded data. To make the differences between the three methods easiy visibe, for the mean estimators, the y-coordinates were mutipied by five (or two for the second set of parameters). 5 Comments and concusions Tabes 1, 2 and 3 show that the EM agorithm works we for unrounded data in most cases, but may fai for rounded data when the two components mixed up heaviy. Our corrections work we for a cases. In the proofs of the main theorem given in the Appendix, one may find that in our proposed method, the integrated ikeihood never has an infinite maximum and thus the EM agorithm wi never fai. Aso, these

tables show that our method provides estimates as accurate as those obtained from unrounded data, which means our approach corrects the effect of rounding errors effectively. As for the multi-dimensional normal mixture models, Figures 1 and 2 also show that our method corrects the rounding effect effectively for the means, the variances, and the mixture weights. As for the correlation estimates, our method performs better than the uncorrected method, except for the correlation estimates for the second population with the first set of parameters. We believe that this is due to random errors in the simulation. According to the simulation results, rounding errors have a smaller impact on the estimates of means but a larger impact on the estimates of variances when the sample size is adequately large. But the influence of the rounding error is enormous when the variance of the population is close to or smaller than the rounding errors. All the simulation results show that our method corrects the rounding errors effectively for normal mixture distributions.

[Figure: two panels, "biases" and "square roots of MSE"; horizontal axis "parameters" (mu1, sigma1, cov1, mu2, sigma2, cov2, p); legend: precise, uncorrected, our correction]
Fig. 2 The biases and the square roots of MSE of the parameter estimates for the second set of parameters obtained by precise data, rounded data without correction and our correction based on rounded data

Finite normal mixture is a useful class of models in applied data analysis. On the other hand, rounded data are ubiquitous in applications. In this article, we deal with the rounding error in finite normal mixture distributions, including one-dimensional and high-dimensional normal mixture distributions. We give consistent estimators of the unknown parameters in the models, and prove the asymptotic normality of the estimates. The simulation results show that the proposed method corrects rounding errors effectively.
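The remark that rounding affects variance estimates more than mean estimates can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the simulation design used in this article: it draws from a single normal component (with $\sigma = 0.5$, as for the first component in Table 3), rounds to integers, and compares naive moment estimates. The function name `simulate_rounding_effect` and its default arguments are hypothetical.

```python
import random
import statistics


def simulate_rounding_effect(mu=0.0, sigma=0.5, n=200_000, seed=0):
    """Compare naive mean/variance estimates on precise and integer-rounded data.

    Assumption for illustration only: one normal component, rounding
    interval h = 1 (round to the nearest integer), no correction applied.
    """
    rng = random.Random(seed)
    precise = [rng.gauss(mu, sigma) for _ in range(n)]
    # Rounding to the nearest integer mimics the recording mechanism.
    rounded = [float(round(x)) for x in precise]
    return {
        "mean_precise": statistics.fmean(precise),
        "mean_rounded": statistics.fmean(rounded),
        "var_precise": statistics.pvariance(precise),
        "var_rounded": statistics.pvariance(rounded),
    }


if __name__ == "__main__":
    for name, value in simulate_rounding_effect().items():
        print(f"{name}: {value:.4f}")
```

In this sketch the two sample means stay close to each other, while the sample variance of the rounded data is visibly inflated, which is the kind of bias that an uncorrected estimate inherits.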

6 Appendix

6.1 Some ancillary results

(3.1) and (3.2) will be proved by using Theorem 1 of Zhang et al. (2010). For completeness, we first quote Theorem 1 of Zhang et al. (2010), together with its regularity conditions, as follows:

1. The parameter space $\Theta$ is an open set of $R^k$ and the true value $\theta_0$ is an inner point of $\Theta$.
2. The sample space $\mathcal{X} = \{x;\ f_{a+1}(x, \theta) > 0\}$ is independent of the parameters, where $f_a(x, \theta)$ is the joint pdf of a consecutive sequence of length $a$ of $\{X_t\}$.
3. $\lambda(\{x : g(y, \theta) \ne g(y, \theta')\}) > 0$ for all $\theta \ne \theta'$, where $\lambda$ is the Lebesgue measure in $R^{a+1}$ and $g(y, \theta)$ is the probability defined as $g(y, \theta) = \int_{y - h/2}^{y + h/2} f_{a+1}(x, \theta)\,dx$.
4. $\{X_t\}$ is strictly stationary and $\alpha$-mixing.
5. At the true value $\theta_0$, $\psi(\theta_0; \theta) = \int f_{a+1}(x, \theta_0) \log g(y, \theta)\,dx$ exists for all $\theta \in \Theta$ and is continuous in $\theta$.
6. For each compact subset $C$ of $\Theta$, there is a function $h(y) = h_C(y)$ such that $E h(Y) = \sum_j h(j) g(j, \theta_0) < \infty$ and $\left\| \frac{\partial}{\partial \theta} f_{a+1}(x, \theta) \right\| \le h(y) f_{a+1}(x, \theta)$ for all $\theta \in C$, where $y$ is the vector of rounded $x$ and the norm $\|\cdot\|$ denotes the maximum absolute value of entries of the indicated vector or matrix.
7. Except for a null set of $\mathcal{X}$, for each fixed $x$, $\sup_{\theta \in \Theta_m^c} f_{a+1}(x, \theta) \to 0$ as $m \to \infty$, where $\{\Theta_m\}$ is an increasing sequence of compact subspaces $\Theta_m$ of $\Theta$ such that $\bigcup_m \Theta_m = \Theta$.
8. For some constant $\delta > 0$, $E h(Y)^{2+\delta} < \infty$, where the function $h(y)$ is defined in condition 6.
9. The $\alpha$-mixing coefficients satisfy $\sum_{n=1}^{\infty} \alpha_n^{\delta/(2+\delta)} < \infty$.
10. The integration with respect to $x$ and the differentiation with respect to $\theta$ of $f_{a+1}(x, \theta)$ are interchangeable.
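To make the rounded-data probability $g(y, \theta)$ of condition 3 concrete, the sketch below evaluates it for a one-dimensional finite normal mixture in the independent case, where the integral over $[y - h/2, y + h/2]$ reduces to a difference of normal cdfs. It is a minimal illustration and not the authors' implementation: the helper names `normal_cdf`, `rounded_prob` and `rounded_loglik` are hypothetical, and the parameter values in the usage part are the true values of Table 3 (standard deviations 0.5 and 1.0 correspond to the variances 0.25 and 1).

```python
import math


def normal_cdf(x, mu, sigma):
    """Cdf of N(mu, sigma^2) at x, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def rounded_prob(y, weights, mus, sigmas, h=1.0):
    """g(y, theta): mixture probability of the cell [y - h/2, y + h/2]."""
    total = 0.0
    for p, mu, sigma in zip(weights, mus, sigmas):
        upper = normal_cdf(y + h / 2.0, mu, sigma)
        lower = normal_cdf(y - h / 2.0, mu, sigma)
        total += p * (upper - lower)
    return total


def rounded_loglik(ys, weights, mus, sigmas, h=1.0):
    """Log-likelihood of independent rounded observations ys."""
    return sum(math.log(rounded_prob(y, weights, mus, sigmas, h)) for y in ys)


if __name__ == "__main__":
    # Illustrative parameters: p = 0.8, mu = (0, 1.5), sigma^2 = (0.25, 1).
    weights, mus, sigmas = (0.8, 0.2), (0.0, 1.5), (0.5, 1.0)
    for y in range(-2, 5):
        print(y, rounded_prob(y, weights, mus, sigmas))
    print(rounded_loglik([0, 0, 1, 2, -1], weights, mus, sigmas))
```

Because every $g(y, \theta)$ is a probability, each term of this log-likelihood is at most zero, which is the boundedness that the consistency argument below relies on.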

Theorem Z1 (Zhang et al.) Under regularity conditions 1–6, the local SOS-MLE is consistent, that is, for some compact subspace $C \subset \Theta$, if $\theta_0$ is an inner point of $C$, then $\hat\theta_n \to \theta_0$ a.s., where

$$\hat\theta_n = \arg\Big\{\sup_{\theta \in C} \ell_{sos}(\theta)\Big\}.$$

Under regularity conditions 1–7, the global SOS-MLE is strongly consistent, where the global SOS-MLE is defined by

$$\hat\theta_{ng} = \arg\Big\{\sup_{\theta \in \Theta} \ell_{sos}(\theta)\Big\}.$$

Theorem Z2 (Zhang et al.) Under regularity conditions 1–10,

$$\sqrt{n - a}\,(\hat\theta_n - \theta_0) \xrightarrow{d} N_k\big(0,\ I^{-1}(\theta_0) V(\theta_0) I^{-1}(\theta_0)\big),$$

where $N_k(\mu, \Sigma)$ denotes a multivariate normal variable of dimension $k$ with mean vector $\mu$ and covariance matrix $\Sigma$, and

$$I(\theta_0) = E\left[\frac{\partial \log g(Y_1, \theta)}{\partial \theta}\,\frac{\partial \log g(Y_1, \theta)}{\partial \theta^{T}}\right]_{\theta = \theta_0},$$

$$V(\theta_0) = I(\theta_0) + \sum_{t=1}^{\infty} E\left[\frac{\partial \log g(Y_1, \theta)}{\partial \theta}\,\frac{\partial \log g(Y_{t+1}, \theta)}{\partial \theta^{T}} + \frac{\partial \log g(Y_{t+1}, \theta)}{\partial \theta}\,\frac{\partial \log g(Y_1, \theta)}{\partial \theta^{T}}\right]_{\theta = \theta_0}.$$

In light of the properties of the finite normal mixture distribution with a known number of components, we have the following observations:

(1) Due to independence, one may choose $a = 0$ and $f_1(x; \theta) = f(x; \theta)$, the pdf of the finite normal mixture. Also, $\alpha_n = 0$.
(2) In light of these, the regularity conditions 1–5 and 8–10 of Theorem 1 of Zhang et al. (2010) are satisfied.
(3) The regularity condition 7 is not satisfied when the parameters of two components tend to each other or one of the mixture weights tends to 0. However, in view of the application of the EM algorithm, our maximization can be limited to local maximizers in the proofs of Theorems 3.1 and 3.2.
(4) What remains to be verified is whether the regularity condition 6 is satisfied for the finite normal mixture.
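Observation (1) also simplifies the limiting covariance in Theorem Z2. The following worked step is a sketch under two assumptions that are ours rather than quoted from the source: the rounded observations $Y_t$ are independent and identically distributed, and the score $\partial \log g(Y_1, \theta)/\partial\theta$ has mean zero at $\theta_0$ (which follows when summation over cells and differentiation in $\theta$ can be interchanged).

```latex
\begin{align*}
E\left[\frac{\partial \log g(Y_1,\theta)}{\partial\theta}\,
       \frac{\partial \log g(Y_{t+1},\theta)}{\partial\theta^{T}}\right]_{\theta=\theta_0}
  &= E\left[\frac{\partial \log g(Y_1,\theta)}{\partial\theta}\right]_{\theta=\theta_0}
     E\left[\frac{\partial \log g(Y_{t+1},\theta)}{\partial\theta^{T}}\right]_{\theta=\theta_0}
   = 0, \qquad t \ge 1, \\
V(\theta_0) &= I(\theta_0), \\
\sqrt{n}\,(\hat\theta_n - \theta_0) &\xrightarrow{d} N_k\big(0,\ I^{-1}(\theta_0)\big)
  \qquad (a = 0).
\end{align*}
```

Under these assumptions the cross terms in $V(\theta_0)$ vanish, so the sandwich form $I^{-1} V I^{-1}$ collapses to the inverse information matrix of the rounded-data likelihood.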

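Verification for one-dimensional case

We have $f(x, \theta) = \sum_{l=1}^{L} p_l f_l(x, \theta_l)$, where $f_l(x, \theta_l)$ is the pdf of the normal distribution with parameter $\theta_l = (\mu_l, \sigma_l^2)$. The unknown parameters in this model are denoted by $\theta = (p_1, \dots, p_{L-1}, \mu_1, \sigma_1^2, \dots, \mu_L, \sigma_L^2)$. In each compact subset of $\Theta$, there exist $K_1 > 0$, $K_2 > 0$, $K_3 > 0$, $K_2 < K_3$, such that $|\mu_l| \le K_1$, $K_2 \le \sigma_l^2 \le K_3$, $K_2 < p_l$, $l = 1, \dots, L-1$, and $K_2 < 1 - p_1 - \cdots - p_{L-1}$.

First, notice that the partial derivative of the pdf with respect to $p_l$, $l = 1, \dots, L-1$, is given by

$$\frac{\partial f(x, \theta)}{\partial p_l} = f_l(x, \theta_l) - f_L(x, \theta_L), \quad l = 1, \dots, L-1.$$

In a finite normal mixture model, we have $\frac{p_l f_l(x, \theta_l)}{f(x, \theta)} < 1$, $l = 1, \dots, L$. Since $L$ is known, we have

$$\frac{\left|\partial f(x, \theta)/\partial p_l\right|}{f(x, \theta)} = \frac{\left|f_l(x, \theta_l) - f_L(x, \theta_L)\right|}{f(x, \theta)} \le \max\left\{\frac{1}{p_l}, \frac{1}{p_L}\right\} \le 1/K_2.$$

Next, the partial derivative of the density function with respect to $\mu_l$ is given by

$$\frac{\partial f(x, \theta)}{\partial \mu_l} = p_l\,\frac{x - \mu_l}{\sigma_l^2}\, f_l(x, \theta_l).$$

Then, since $p_l f_l(x, \theta_l)/f(x, \theta) < 1$ and $K_2 < 1$,

$$\frac{\left|\partial f(x, \theta)/\partial \mu_l\right|}{f(x, \theta)} = \frac{|x - \mu_l|}{\sigma_l^2}\cdot\frac{p_l f_l(x, \theta_l)}{f(x, \theta)} \le \frac{|x| + K_1}{K_2^2} \le C_1 |x| + C_2.$$

Finally, we note that the partial derivative of the density function with respect to $\sigma_l^2$ is

$$\frac{\partial f(x, \theta)}{\partial \sigma_l^2} = p_l\,\frac{-\sigma_l^2 + (x - \mu_l)^2}{2\sigma_l^4}\, f_l(x, \theta_l).$$

Then

$$\frac{\left|\partial f(x, \theta)/\partial \sigma_l^2\right|}{f(x, \theta)} = \frac{\left|-\sigma_l^2 + (x - \mu_l)^2\right|}{2\sigma_l^4}\cdot\frac{p_l f_l(x, \theta_l)}{f(x, \theta)} \le \frac{K_3 + 2x^2 + 2K_1^2}{2K_2^3} \le C_3 + C_4 x^2.$$

Taking $s(x) = \max\{M_1,\ C_1 |x| + C_2,\ C_3 + C_4 x^2\}$, since normal distributions have finite moments of all orders, we obtain $\int s(x) f(x, \theta_0)\,dx < \infty$. Taking $h(y) =$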

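$\max_{y - \frac{1}{2} < x < y + \frac{1}{2}} s(x)$, and noticing that there is a constant $\Delta > 0$ such that $s(x) \ge \Delta h(y)$, we have

$$E h(Y) \le \Delta^{-1} \sum_j \int_{j - \frac{1}{2}}^{j + \frac{1}{2}} s(x) f(x, \theta_0)\,dx < \infty.$$

Verification for multi-dimensional case

In this case, we have $f(x, \theta) = \sum_{l=1}^{L} p_l f_l(x, \theta_l)$, where the unknown parameters are denoted by $\theta = (p_1, \dots, p_{L-1}, \mu_1, \dots, \mu_L, \Sigma_1, \dots, \Sigma_L)$, with $\mu_l = (\mu_{l1}, \dots, \mu_{ln})^T$ and $\Sigma_l = (\sigma_{l,ij})_{n \times n}$. In a compact subset of $\Theta$, it can be ensured that there are positive constants $W_1$, $W_2$ and $W_3$ such that $|\mu_{li}| \le W_1$, $W_2 \le \lambda_{\min}(\Sigma_l) \le \lambda_{\max}(\Sigma_l) \le W_3$, and $W_2 < p_l$, $l = 1, \dots, L$, $i = 1, \dots, n$. Under these conditions, we have $|\sigma_{l,ii}| \le W_3^{1/2}$.

First we note that the partial derivative of $f(x, \theta)$ with respect to $p_l$, $l = 1, \dots, L-1$, is

$$\frac{\partial f(x, \theta)}{\partial p_l} = f_l(x, \theta_l) - f_L(x, \theta_L).$$

Then, similar to the one-dimensional case, we have

$$\frac{\left|\partial f(x, \theta)/\partial p_l\right|}{f(x, \theta)} \le \frac{1}{W_2} =: M,$$

where $M$ is a positive constant. Next, we find that the partial derivative of $f(x, \theta)$ with respect to $\mu_l$ is given by

$$\frac{\partial f(x, \theta)}{\partial \mu_l} = p_l f_l(x, \theta_l)\,\Sigma_l^{-1}(x - \mu_l).$$

So

$$\frac{\left\|\partial f(x, \theta)/\partial \mu_l\right\|}{f(x, \theta)} = \left\|\frac{p_l f_l(x, \theta_l)}{f(x, \theta)}\,\Sigma_l^{-1}(x - \mu_l)\right\| \le \frac{p_l f_l(x, \theta_l)}{f(x, \theta)}\left\|\Sigma_l^{-1}(x - \mu_l)\right\| \le W_2^{-1}\big(\|x\| + W_1\big) \le D_1 + D_2 \|x\|,$$

where $\|\cdot\|$ denotes the Euclidean norm of matrices or vectors.

Finally, we consider the partial derivative of $f(x, \theta)$ with respect to $\sigma_{l,ij}$. For a symmetric matrix $\Sigma$, we use the notation $\frac{\partial g(\Sigma)}{\partial \Sigma}$ for the matrix whose $i$-th diagonal element is $\frac{\partial g(\Sigma)}{\partial \sigma_{ii}}$ and whose $(i, j)$-element is $\frac{1}{2}\frac{\partial g(\Sigma)}{\partial \sigma_{ij}}$. Then, we have

$$\frac{\partial f(x, \theta)}{\partial \Sigma_l} = p_l f_l(x, \theta_l)\,\Sigma_l^{-1}(x - \mu_l)(x - \mu_l)^T \Sigma_l^{-1}.$$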

Consequently,

$$\frac{\left\|\partial f(x, \theta)/\partial \Sigma_l\right\|}{f(x, \theta)} \le \left\|\Sigma_l^{-1}(x - \mu_l)(x - \mu_l)^T \Sigma_l^{-1}\right\| \le 2W_2^{-2}\big(\|x\|^2 + W_1^2\big) \le D_3 \|x\|^2 + D_4.$$

Define $s(x) = \max\{M,\ D_1 + D_2 \|x\|,\ D_3 \|x\|^2 + D_4\}$ and $h(y) = \max\{s(x);\ y - \frac{1}{2} \le x \le y + \frac{1}{2}\}$, where the inequalities are understood componentwise. It is not difficult to see that there is a positive constant $\Delta$ such that $h(y) \le \Delta \min\{s(x);\ y - \frac{1}{2} \le x \le y + \frac{1}{2}\}$. Observing these, it is not difficult to prove that $h(y)$ is integrable. Now the verification of regularity condition 6 is complete.

6.2 Proof of (3.1)

Now, we apply Theorem 1 of Zhang et al. (2010) to prove (3.1). Without loss of generality, we only prove the theorem for the 2-dimensional normal mixture. For any compact subset $\Theta_0$ of $\Theta$ that contains the true parameter $\theta_0$ as an inner point, the local maximizer $\hat\theta_n(\Theta_0)$ of the likelihood (2.6) tends almost surely to $\theta_0$. Now let us consider the global maximizer $\hat\theta_n$ of the log-likelihood. We claim that $\hat\theta_n \to \theta_0$ almost surely. Otherwise, there exists a subsequence $\{n'\}$ such that $\hat\theta_{n'} \to \theta^* \ne \theta_0$. Now, define a random variable $X^{(n)}$ with conditional distribution $P(X^{(n)} = X_i \mid \mathcal{F}_n) = n^{-1}$, where $\mathcal{F}_n$ is the $\sigma$-field generated by $\{X_1, \dots, X_n\}$. It is well known that for a compact subset $\Theta_0$ of $\Theta$,

$$\frac{1}{n}\,\ell_n\big(\hat\theta_n(\Theta_0)\big) = E\big(\log g(Y^{(n)}, \hat\theta_n(\Theta_0)) \mid \mathcal{F}\big) \xrightarrow{a.s.} \int f(x, \theta_0) \log g(y, \theta_0)\,dx, \qquad (6.1)$$

where $\mathcal{F} = \lim \mathcal{F}_n$ and $Y^{(n)}$ is the rounded version of $X^{(n)}$. On the other hand, by noticing that $g(y, \theta) \le 1$ and applying Fatou's Lemma, we have

$$\limsup \frac{1}{n'}\,\ell_{n'}(\hat\theta_{n'}) = \limsup E\big(\log g(Y^{(n')}, \hat\theta_{n'}) \mid \mathcal{F}\big) \le E\big(\limsup \log g(Y^{(n')}, \hat\theta_{n'}) \mid \mathcal{F}\big) = \int f(x, \theta_0) \log g(y, \theta^*)\,dx. \qquad (6.2)$$

By assumption, the right hand side of (6.1) is larger than the right hand side of (6.2). But the fact that $\ell_{n'}(\hat\theta_{n'}) \ge \ell_{n'}(\hat\theta_{n'}(\Theta_0))$ contradicts this. Thus, the proof of Theorem 3.1 is complete.

Remark 6.1 In the proof above, we have abused the notation $g(y, \theta)$, which is well defined only when $\theta \in \Theta$. When $\theta \notin \Theta$ (including the case that some components

of $\theta$ are infinity), $g(y, \theta)$ should be understood as the limit of $g(y, \theta^{(k)})$ whenever $\theta^{(k)} \to \theta$ as $k \to \infty$. It is not difficult to show that such a limit exists and is unique.

Remark 6.2 The reader should be reminded that the right hand side of (6.2) may be negative infinity.

Remark 6.3 The fact that $g(y, \theta) \in [0, 1]$ plays an important role in the consistency of the MLE for rounded data of the finite normal mixture model. For unrounded data of a finite normal mixture, this is not the case for the likelihood $f(x, \theta^{(k)})$, so Fatou's Lemma is not applicable. By the same method, one can show that the local maximizer of the likelihood of unrounded data from a finite normal mixture model is strongly consistent, but not the global maximizer. This is an important advantage of the MLE procedure for rounded data of the finite normal mixture model.

6.3 The proof of (3.2)

After the strong consistency of $\hat\theta_n$ is proved, the proof of (3.2) becomes routine and is hence omitted.

Acknowledgements The research was partially supported by the Fundamental Research Funds for the Central Universities 10ssxt149 and by the China NSF Grant 10871036. The authors would like to express their sincere thanks to the anonymous referees for their invaluable comments that improved the article significantly.

References

Bai ZD, Zheng SR, Zhang BX, Hu GR (2009) Statistical analysis for rounded data. J Stat Plan Inference 139(8):2526–2542
Basford KE, McLachlan GJ (1985) Likelihood estimation with normal mixture models. J R Stat Soc C 34:282–289
Beaton AE, Rubin DB, Barone JL (1976) The acceptability of regression solutions: another look at computational accuracy. J Am Stat Assoc 71:158–168
Cuesta-Albertos JA, Matrán C, Mayo-Iscar A (2008) Robust estimation in the normal mixture model based on robust clustering. J R Stat Soc B 70:779–802
Dempster AP, Rubin DB (1983) Rounding error in regression: the appropriateness of Sheppard's corrections. J R Stat Soc B 45:51–59
Fisher RA (1922) On the mathematical foundations of theoretical statistics. Philos Trans R Soc A 222:309–368
Hasselblad V (1966) Estimation of parameters for a mixture of normal distributions. Technometrics 8(3):431–444
Lee CS, Vardeman SB (2001) Interval estimation of a normal process mean from rounded data. J Qual Technol 33:335–348
Lee CS, Vardeman SB (2002) Interval estimation of a normal process standard deviation from rounded data. Commun Stat B 31:13–34
Lee CS, Vardeman SB (2003) Confidence interval based on rounded data from the balanced one-way normal random effects model. Commun Stat B 32:835–856
Lindley DV (1950) Grouping correction and maximum likelihood equations. Proc Camb Philos Soc 46:106–110
Mullet GM, Murray TW (1971) A new method for examining rounding error in least-squares regression computer programs. J Am Stat Assoc 66:496–498
Murray A, Donald BR (1985) Estimation and hypothesis testing in finite mixture models. J R Stat Soc B 47:67–75

Richard JH (1985) A constrained formulation of maximum-likelihood estimation for normal mixture distributions. Ann Stat 13(2):795–800
Sheppard WF (1898) On the calculation of the most probable values of frequency constants for data arranged according to equidistant divisions of a scale. Proc Lond Math Soc 29:353–380
Tricker A (1984) The effect of rounding data sampled from the exponential distribution. J Appl Stat 11:54–87
Tricker A (1990a) Estimation of parameters for rounded data from non-normal distributions. J Appl Stat 17:219–228
Tricker A (1990b) The effect of rounding on the significance level of certain normal test statistics. J Appl Stat 17:31–38
Tricker A, Coates E, Okell E (1998) The effect on the R chart of precision of measurement. J Qual Technol 30:232–239
Vardeman SB (2005) Sheppard's correction for variances and the quantization noise model. IEEE Trans Instrum Meas 54:2117–2119
Zhang BX, Liu TQ, Bai ZD (2010) Analysis of rounded data from dependent sequences. Ann Inst Stat Math 62(6):1143–1173