Two-way contingency tables for complex sampling schemes

Biometrika (1976), 63, 2, pp. 271-6. Printed in Great Britain

By J. J. SHUSTER
Department of Statistics, University of Florida, Gainesville

and D. J. DOWNING
Department of Statistics, Marquette University, Milwaukee

SUMMARY

Methods for testing independence, quasiindependence, and marginal symmetry in contingency tables are derived for a wide variety of sampling schemes including stratified multistage cluster sampling. The null hypothesis is a vector of linear and quadratic contrasts involving the probabilities. The asymptotic null distribution of the test statistic is chi-squared. The theory can also be used to test equality of failure distributions in a stratified prospective multiclinic trial, jointly for all strata or merely population-wide.

Some key words: Censored survival data; Clinical trial; Cluster design; Contingency table; Independence; Marginal symmetry; Mover-stayer model; Quasiindependence.

1. INTRODUCTION

In many sample survey problems, we wish to test hypotheses in a two-way contingency table. The data are usually collected by complex sampling schemes (Kish & Frankel, 1974), rather than by simple random sampling. The usual multinomial approximations of the classical contingency table analyses may lead to misleading values of the chi-squared statistics. Such methods are insensitive to dependence among sampling units. Although many techniques are available for data analysis when simple random sampling is employed (Cox, 1970; Gart, 1972; Goodman, 1970; Ku & Kullback, 1974; Zelen, 1971), very little beyond the 2 × 2 table has been published on extensions of these techniques to complex sampling schemes. Gart (1971), Kish & Hess (1959) and Miettinen (1969, 1970) consider the 2 × 2 table in some nonstandard sampling situations.
From time to time, applied scientists use simple random sampling analyses, but note that clustering affects the validity of their conclusions (Kessner, Snow & Singer, 1974; Tietze & Lewit, 1974).

We propose statistical methods which can be used to test vectors of linear and quadratic hypotheses involving a vector of probabilities. Section 2 presents the general asymptotic theory of our test statistics. Section 3 is devoted to examples of the types of statistical hypotheses that can be tested. These include tests of independence, quasiindependence (Goodman, 1968), marginal symmetry (Stuart, 1955), mover-stayer model (Goodman, 1961), equality of survival distributions under various treatments, and simultaneous independence of row and column effects for all strata. In § 4, we derive estimators, which can be used to implement the methods described in §§ 2 and 3, for complex sampling situations, including stratified multistage cluster sampling and stratified multiclinic censored survival experiments.

2. GENERAL ASYMPTOTIC THEORY FOR TESTING LINEAR AND QUADRATIC HYPOTHESES IN COMPLEX SAMPLING SITUATIONS

In this section we shall develop the general asymptotic theory required for a wide variety of sampling situations. Some typical applications of these methods are presented in §§ 3 and 4.

Let $\pi = (P_1, \ldots, P_L)'$ be an arbitrary vector of probabilities, not necessarily summing to any specified value, and let $\{\hat\pi_n\}$ be a sequence of estimators of $\pi$ such that $n^{1/2}(\hat\pi_n - \pi)$ tends in law to $N(0, V)$ as $n \to \infty$. The covariance matrix $V$ need not be of full rank, but we shall assume that all linear constraints in $\pi$ are satisfied by $\hat\pi_n$. Let

$$U(\pi) = A\pi - (\pi' B_1 \pi, \ldots, \pi' B_m \pi)', \tag{2.1}$$

where $A$ is a specified $m \times L$ matrix, and the $B_i$ are specified $L \times L$ matrices. The following theorem provides a test for the hypothesis

$$H_0: U(\pi) = 0. \tag{2.2}$$

THEOREM 2.1. Let

$$G(\pi) = A - Q, \tag{2.3}$$

where the $i$th row of $Q$ is $Q_i = \pi'(B_i + B_i')$ $(i = 1, \ldots, m)$. Furthermore, let $\hat V_n$ converge in probability to $V$. Then, as $n \to \infty$, $n^{1/2}\{U(\hat\pi_n) - U(\pi)\}$ converges in law to $N\{0, G(\pi) V G(\pi)'\}$; and if $G(\pi) V G(\pi)'$ is invertible,

$$n\,\{U(\hat\pi_n) - U(\pi)\}'\,\{G(\hat\pi_n)\,\hat V_n\,G(\hat\pi_n)'\}^{-1}\,\{U(\hat\pi_n) - U(\pi)\} \tag{2.4}$$

converges in law to $\chi^2_m$. The quantities $U(\cdot)$ and $G(\cdot)$ are given in (2.1) and (2.3) respectively.

Under the null hypothesis (2.2), the statistic (2.4) has no unknown parameters. The proof follows methods of Wald (1943) and is, therefore, omitted. Note that by setting the $B$ matrices equal to zero, $U(\pi)$ is linear.

3. EXAMPLES OF LINEAR AND QUADRATIC HYPOTHESES IN TWO-WAY TABLES

We shall write $\pi$ for the $R \times C$ table as

$$\pi = (P_1, \ldots, P_{RC})' = (\pi_{11}, \ldots, \pi_{1C}, \pi_{21}, \ldots, \pi_{2C}, \ldots, \pi_{R1}, \ldots, \pi_{RC})'. \tag{3.1}$$

Example 3.1. The classical tests of complete independence in a two-way table. For the case of testing homogeneity of several multinomial distributions, $U(\pi)$ is linear, while for the case of complete independence in a two-way table, $U(\pi)$ is quadratic. In either case, $U(\pi)$ is an $(R-1)(C-1)$ vector of contrasts, leading to $(R-1)(C-1)$ degrees of freedom.

Example 3.2. Test for marginal symmetry, i.e. matched control studies. Here $\pi$ is as in (3.1) with $R = C$. The hypothesis of interest is

$$H_0: \sum_j (\pi_{ij} - \pi_{ji}) = 0 \quad (1 \le i \le R-1), \tag{3.2}$$

when $\pi$ is a probability distribution. This is linear in the $P_i$'s. The number of degrees of freedom, $m$, is $(R-1)$.
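The statistic (2.4) and the marginal symmetry contrasts of Example 3.2 can be sketched numerically as follows. This is a minimal illustration, not the authors' program: the function names `wald_statistic` and `marginal_symmetry_contrasts`, the row-major ordering of cells, the 2 × 2 numbers, and the use of a multinomial covariance as a stand-in for a design-based estimate of $V$ are all assumptions made for the example.

```python
import numpy as np


def wald_statistic(pi_hat, V_hat, n, A, Bs=None):
    """Return (statistic, degrees of freedom) for H0: U(pi) = 0.

    U(pi) = A pi - (pi' B_1 pi, ..., pi' B_m pi)' as in (2.1);
    G(pi) = A - Q with rows Q_i = pi' (B_i + B_i') as in (2.3).
    """
    pi_hat = np.asarray(pi_hat, dtype=float)
    V_hat = np.asarray(V_hat, dtype=float)
    A = np.asarray(A, dtype=float)
    m, L = A.shape
    if Bs is None:  # purely linear hypothesis: all B_i are zero
        Bs = [np.zeros((L, L)) for _ in range(m)]
    Bs = [np.asarray(B, dtype=float) for B in Bs]
    U = A @ pi_hat - np.array([pi_hat @ B @ pi_hat for B in Bs])
    G = A - np.array([pi_hat @ (B + B.T) for B in Bs])
    middle = G @ V_hat @ G.T  # must be invertible (Theorem 2.1)
    stat = n * (U @ np.linalg.solve(middle, U))
    return float(stat), m


def marginal_symmetry_contrasts(R):
    """Matrix A for H0: sum_j (pi_ij - pi_ji) = 0, i = 1..R-1, row-major cells."""
    A = np.zeros((R - 1, R * R))
    for i in range(R - 1):
        for j in range(R):
            A[i, i * R + j] += 1.0  # +pi_ij
            A[i, j * R + i] -= 1.0  # -pi_ji (the diagonal cell cancels)
    return A


# Illustrative 2 x 2 table of estimated cell probabilities (made-up numbers).
pi_hat = np.array([0.40, 0.15, 0.05, 0.40])
# Stand-in covariance: the multinomial form diag(pi) - pi pi'. Under a complex
# design this would be replaced by a design-based estimate of V.
V_hat = np.diag(pi_hat) - np.outer(pi_hat, pi_hat)
stat, df = wald_statistic(pi_hat, V_hat, n=100, A=marginal_symmetry_contrasts(2))
print(stat, df)
```

The statistic would be referred to a chi-squared distribution with `df` degrees of freedom. For a quadratic hypothesis, a list of `m` matrices $B_i$ is passed as `Bs`; the linear case simply leaves them at zero.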

Example 3.3. Test for quasiindependence, $\pi$ as in (3.1) over the set $S$ of $(i, j)$ values. The hypothesis of interest is

[display (3.3) not legible in the source]

for all $i$, $i'$ and $j \in S_i \cap S_{i'}$, where $S_i = \{j : (i, j) \in S\}$, when $\pi$ is a probability distribution (Goodman, 1968, equations (1.2), (1.3), (1.5)). While this is quadratic in the $P_i$'s, $H_0$ must be reduced so that $G(\pi) V G(\pi)'$ is invertible. Although the mechanics of such reductions are routine for any given situation, the general notation is awkward. Hence we do not include it here.

Example 3.4. Test for a mover-stayer model (Goodman, 1961) in the $R \times R$ situation. The question of interest is whether, given that a move is made from any 'parent classification' $i$, the probabilities of the 'offspring classifications' are the same as the population-wide probabilities of being in those categories, given that one is not in classification $i$. If $\pi$ is the probability distribution of all moves, not the transition matrix, then the hypothesis of interest is

[display (3.4) not legible in the source; it sets to zero an expression involving sums over $k \ne i$]

for all $(i, j)$ such that $i \ne j$. It can be shown that (3.4) can be described by $R(R-2)$ functionally independent equations. Hence $m = R(R-2)$.

Example 3.5. Simultaneous test of independence of row and column effects for all strata. Here there are probabilities $\theta_{ijk}$ $(1 \le i \le R,\ 1 \le j \le C,\ 1 \le k \le L)$, with $\sum_i \sum_j \sum_k \theta_{ijk} = 1$. The hypothesis of interest is

[display (3.5) not legible in the source]

for all $i$, $j$ and $k$. Then $m = L(R-1)(C-1)$.

Example 3.6. Test for equality of survival distributions under various treatments. Each $\pi_{ij}$ in (3.1) represents the probability that an individual using treatment $i$ survives $j$ periods after initiation of treatment. The hypothesis of interest is

$$H_0: \pi_{ij} = \pi_{i+1,j} \quad (1 \le i \le R-1,\ 1 \le j \le C). \tag{3.6}$$

This is linear in the $P_i$'s. Here $m = (R-1)C$.

4. DERIVATION OF $\hat\pi_n$ AND $\hat V_n$ IN COMPLEX SAMPLING SITUATIONS

Example 4.1. Two-stage cluster sampling.

(a) First stage. The population is subdivided into $N$ primary clusters. A simple random sample of $n$ clusters is drawn with replacement.

(b) Second stage. Any sampling scheme that yields unbiased estimates of the $L$ cluster totals may be used. The method must be invariant under changes in the order in which primary units are drawn. Hence, once a cluster is sampled, we shall restore it to its original form. Typical methods of sampling within primary clusters include: simple random sampling, stratified sampling, single- or two-stage cluster sampling, etc. Sample sizes are arbitrary and may be taken with or without replacement.
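To make the two-stage design concrete, the sketch below takes a matrix of estimated cluster totals, one row per sampled primary cluster and one column per category, and forms a ratio estimate of the category proportions together with a covariance estimate for $n^{1/2}(\hat\pi_n - \pi)$. The function name `cluster_ratio_estimate` is hypothetical. The sketch assumes that the $L$ categories partition the elements, so that row sums estimate cluster sizes, and it uses the standard survey linearization for the covariance; that formula is an assumption of this example and may differ from the paper's own display, for instance by a finite-sample factor such as $n/(n-1)$.

```python
import numpy as np


def cluster_ratio_estimate(T):
    """Ratio estimate of category proportions from estimated cluster totals.

    T is an (n, L) array; T[i, j] is the estimated total for category j in
    sampled primary cluster i. Returns (pi_hat, V_hat), where V_hat is a
    linearization estimate of the covariance of sqrt(n) * (pi_hat - pi).
    """
    T = np.asarray(T, dtype=float)
    n = T.shape[0]
    grand_total = T.sum()                   # sum over clusters and categories
    pi_hat = T.sum(axis=0) / grand_total    # column totals over grand total
    cluster_size = T.sum(axis=1)            # estimated size of each cluster
    # Linearized residuals: observed totals minus those expected if every
    # cluster had the pooled composition pi_hat.
    U = T - np.outer(cluster_size, pi_hat)
    V_hat = n * (U.T @ U) / grand_total**2
    return pi_hat, V_hat


# Two illustrative clusters with different compositions (made-up numbers).
pi_hat, V_hat = cluster_ratio_estimate([[3, 1], [1, 3]])
print(pi_hat)
print(V_hat)
```

Because each row of residuals sums to zero, `V_hat` is singular, which is the situation Section 2 allows for: $V$ need not be of full rank so long as $G(\pi) V G(\pi)'$ is invertible. When all clusters share the same composition the residuals vanish and the estimated covariance is zero.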

Let $\hat T_{ij}$ be the estimated total in the $j$th category of the $i$th primary cluster. Under our sampling conditions, the vectors $(\hat T_{i1}, \ldots, \hat T_{iL})$ $(i = 1, \ldots, n)$ are independent and identically distributed. The vector $\pi = (P_1, \ldots, P_L)'$ can be estimated by

[display not legible in the source]

where a dot in place of a subscript denotes summation over that subscript. By the law of large numbers and Taylor's Theorem, we have

[display (4.1) only partly legible in the source; it relates $n^{1/2}(\hat\pi_n - \pi)$ to the vector $(\hat T_{.1} - P_1 \hat T_{..}, \ldots, \hat T_{.L} - P_L \hat T_{..})'$]

where $\sim$ symbolizes the fact that the ratio of any linear function of the left-hand side divided by the corresponding linear function of the right-hand side converges to one in probability, whenever such linear functions are not trivially zero. By the Central Limit Theorem, the $\hat T_{.j}$ vector has an asymptotic multivariate normal distribution, and hence so has $n^{1/2}(\hat\pi_n - \pi)$ by the relation (4.1). Let $\hat V_n$ be the $L \times L$ matrix whose entries are given by

[display (4.2) not legible in the source]

where $U_{ij} = \hat T_{ij} - (\hat T_{.j}/\hat T_{..})\,\hat T_{i.}$. By the law of large numbers $\hat V_n$ converges in probability to $V$, the covariance matrix in the limiting distribution of $n^{1/2}(\hat\pi_n - \pi)$.

For stratified random sampling within clusters, with sample size functionally related to strata sizes within the clusters, we use

$$\hat T_{ij} = \sum_k M_{ik} Y_{ijk} \quad (1 \le i \le n,\ 1 \le j \le L),$$

where $M_{ik}$ is the number of elements in the entire $i$th cluster and $k$th stratum, and $Y_{ijk}$ is the fraction of elements sampled in the $k$th stratum of the $i$th cluster that fall in the $j$th category. If $M_{ik} = 0$, we take $Y_{ijk} = 0$. If $M_{ik} \ne 0$, at least one element is assumed to be sampled from stratum $k$.

Example 4.2. Extensions of Example 4.1.

(a) Example 4.1 can be extended to sampling of primary clusters without replacement, provided that our large sample size of primary clusters is a small fraction of the population size.

(b) The primary clusters themselves may be drawn from a stratified random sample, rather than a simple random sample. If $W_t$ is the $t$th stratum weight, $n_t$ is the number of clusters sampled in stratum $t$, $\hat\pi(t)$ is the $\hat\pi_n$ for stratum $t$, $\hat V(t)$ is the $\hat V_n$ for stratum $t$, and $n_t/n \to \lambda_t$ as $n \to \infty$, with $n = \sum_t n_t$, then $\hat\pi_n = \sum_t W_t \hat\pi(t)$ and $\hat V_n = \sum_t W_t^2 \hat V(t)/\lambda_t$.

(c) In single-stage cluster sampling, where the primary clusters chosen are exhaustively investigated, the results of Example 4.1 can be applied (Madow, 1948) to sampling without replacement using the same $\hat\pi_n$ and $\hat V_n = (1 - f)\hat V_n^*$, where $\hat V_n^*$ has entries given in (4.2) and $f = \lim(n/N)$.

Example 4.3. Multiclinic prospective trials with censored data. Suppose that $H$ clinics combine to run a prospective clinical trial. Conditional on the fact that each clinic has surviving patients in all treatment groups in all time periods, we can make an overall inference as to the survival distributions.

Let $W_h$ be the fraction of patients who would be treated in the $h$th clinic, given that they would be treated in one of the $H$ clinics. Using the life table methods for survival data of Cutler & Ederer (1958), each clinic obtains independent estimates of their set $\theta_{ijh}$, the probabilities of surviving $j$ periods under treatment $i$, at clinic $h$. The variance-covariance structure of each clinic's estimator is obtained by Greenwood's formula. These estimates are respectively denoted by $\hat\theta_{ijh}$ and $\hat V_{n_h}(h)$, where $n_h$ is the number of patients in clinic $h$ who entered the trial. Let $n = \sum_h n_h \to \infty$ in such a way that $n_h/n \to \alpha_h$. Then

[display not legible in the source]

gives the components of $\hat\pi_n$, and

[second display not legible in the source]

The framework of such inferences applies only to the $H$ clinics, since the above is a fixed effect analysis.

Example 4.4. Extension of Example 4.3. Should the trial be run as a stratified trial within clinics, Example 4.3 extends in an obvious way. We can test equality of survival distributions, or equality of survival distributions within all strata. The latter is a much more stringent condition. An account of stratified trials is given by Zelen (1974). Again, the inference applies only to the $H$ clinics.

5. CONCLUDING DISCUSSION

Often, in the social science and medical literature, Pearson chi-squared has been used, rather than a technique of the type described in this article. In the following example the significance level achieved by Pearson's chi-squared is larger than the nominal level of our test.

A hypothetical medical experiment is run as follows. A simple random sample of 40 patients is drawn from a target population. Each patient provides 4 muscle specimens. Each muscle specimen is cut into three pieces. Each of the three pieces is randomly assigned to one of treatments A, B and C such that each treatment is used once on each specimen. The measured response is quantal, i.e. all or none.
We treat subjects as primary clusters, specimens as a simple random sample of secondary clusters, and treatment assignment as a sample of size one of the six possible assignments. The hypothesis of interest is that the treatments are equivalent. The test is given in Example 3.1 and the distribution theory in Example 4.1.

Note that, so long as the subject-treatment interaction is not too large, and substantial variability between subjects' positive response probabilities exists, the estimates of positive response probabilities for each treatment will be highly positively correlated. The naive assumption that we have three independent binomial samples of 160 takes these correlations to be zero. The clustering tends to keep the relative frequencies closer together than would three independent binomials. Hence, Pearson's chi-squared is conservative in this example.

The methods described in this paper are easy to use. The only computational requirement is an accurate matrix inversion subroutine.

The authors wish to acknowledge the help of the referees and editor. In addition, we thank Professor Marvin Zelen for his helpful discussion of the comparison of survival distributions.

REFERENCES

COX, D. R. (1970). The Analysis of Binary Data. London: Methuen.
CUTLER, S. J. & EDERER, F. (1958). Maximum utilization of the life table method in analyzing survival. J. Chron. Dis. 8, 699-712.
GART, J. J. (1971). The comparison of proportions: A review of significance tests, confidence intervals, and adjustments for stratification. Rev. Inst. Int. Statist. 39, 148-69.
GART, J. J. (1972). Interaction tests for 2 × s × t contingency tables. Biometrika 59, 309-16.
GOODMAN, L. A. (1961). Statistical methods for the mover-stayer model. J. Am. Statist. Assoc. 56, 841-68.
GOODMAN, L. A. (1968). The analysis of cross-classified data: independence, quasi-independence, and interactions in contingency tables with or without missing entries. J. Am. Statist. Assoc. 63, 1091-131.
GOODMAN, L. A. (1970). The multivariate analysis of qualitative data: Interactions among multiple classifications. J. Am. Statist. Assoc. 65, 226-56.
KESSNER, D., SNOW, C. & SINGER, J. (1974). Assessment of Medical Care for Children. Washington: National Academy of Sciences.
KISH, L. & FRANKEL, M. R. (1974). Inference from complex samples. J. R. Statist. Soc. B 36, 1-38.
KISH, L. & HESS, I. (1959). On variances of ratios and their differences in multistage sampling. J. Am. Statist. Assoc. 54, 416-46. Correction (1963), J. Am. Statist. Assoc. 58, 1162.
KU, H. H. & KULLBACK, S. (1974). Loglinear models in contingency table analysis. Am. Statistician 28, 115-22.
MADOW, W. G. (1948). On the limiting distribution of estimates based on samples from finite universes. Ann. Math. Statist. 19, 535-45.
MIETTINEN, O. S. (1969). Individual matching with multiple controls in the case of all or none responses. Biometrics 25, 339-55.
MIETTINEN, O. S. (1970). Estimation of relative risk from individually matched series. Biometrics 26, 75-86.
STUART, A. (1955). Test for homogeneity of the marginal distributions in a two-way classification. Biometrika 42, 412-6.
TIETZE, C. & LEWIT, S. (1974). Comparison of the copper-T and loop-D: A research report. Stud. Fam. Plann. 5, 277-8.
WALD, A. (1943). Tests of statistical hypotheses concerning several parameters when the number of observations is large. Trans. Am. Math. Soc. 54, 426-82.
ZELEN, M. (1971). The analysis of several 2 × 2 contingency tables. Biometrika 58, 129-37.
ZELEN, M. (1974). The randomization and stratification of patients to clinical trials. J. Chron. Dis. 27, 365-75.

[Received August 1974. Revised December 1975]