Statistical Inference with Randomized Nomination Sampling

by

Mohammad Nourmohammadi

A Thesis submitted to the Faculty of Graduate Studies of The University of Manitoba in partial fulfilment of the requirements of the degree of

Doctor of Philosophy

Department of Statistics
University of Manitoba
Winnipeg

Copyright © 2014 by Mohammad Nourmohammadi

Abstract

In this dissertation, we develop several new inference procedures based on randomized nomination sampling (RNS). The first problem we consider is that of constructing distribution-free confidence intervals for quantiles in finite populations. The algorithms required for computing the coverage probabilities of the proposed confidence intervals are presented. The second problem we address is that of constructing nonparametric confidence intervals for quantiles in infinite populations. We describe the procedures for constructing these confidence intervals and compare them, under both perfect and imperfect ranking scenarios, with their simple random sampling (SRS) counterparts. Recommendations are made for choosing the design parameters so as to achieve shorter confidence intervals than their SRS counterparts. The third problem we investigate is the construction of tolerance intervals using the RNS technique. We describe the procedures for constructing one- and two-sided RNS tolerance intervals and investigate the sample sizes required to achieve tolerance intervals that contain the specified proportions of the underlying population. We also investigate the efficiency of RNS-based tolerance intervals compared with the corresponding intervals based on SRS. A new method for estimating ranking error probabilities is proposed. The final problem we consider is that of parametric inference based on RNS. We introduce the different data types associated with the different situations that one might encounter using the RNS design and provide the maximum likelihood (ML) and method of moments (MM) estimators of the parameters in two classes of distributions: the proportional hazard rate (PHR) and proportional reverse hazard rate (PRHR) models.

To my wife Elaheh and my daughters Dorsa and Tara

Acknowledgement

I would like to express my sincere appreciation to my advisors, Dr. Mohammad Jafari Jozani and Dr. Brad C. Johnson, for their support, encouragement, consistently positive outlook and insightful feedback. I would also like to thank Dr. Mary Thompson, Dr. Mahmoud Torabi and Dr. Katherine Davies for serving on my dissertation committee and for their valuable comments and suggestions. I wish to thank the Department of Statistics at the University of Manitoba for providing me with a wonderful environment for my Ph.D. study. My special thanks go to my wife, Elaheh Heydari, and my daughters, Dorsa and Tara, for their enduring love and support.

Abbreviations

CDF            Cumulative Distribution Function
EM             Expectation-Maximization
ERSS           Extreme Ranked Set Sampling
IID            Independent and Identically Distributed
LS             Least Squares
MANS           Maxima Nomination Sampling
MINS           Minima Nomination Sampling
ML             Maximum Likelihood
MM             Method of Moments
MML            Modified Maximum Likelihood
PDF            Probability Distribution Function
PHR            Proportional Hazard Rate
PIHR           Proportional Inverse Hazard Rate
RNS            Randomized Nomination Sampling
RNS(ρ, ζ)      Randomized Nomination Sampling with parameters ρ and ζ
RSS            Ranked Set Sampling
SRS            Simple Random Sampling
SRSWOR         Simple Random Sampling Without Replacement
SRSWR          Simple Random Sampling With Replacement

Published and Submitted Papers From This Thesis

1. Nourmohammadi, M., Jafari Jozani, M. and Johnson, B. (2014). Confidence interval for quantiles in finite populations with randomized nomination sampling. Computational Statistics and Data Analysis, 73,
2. Nourmohammadi, M., Jafari Jozani, M. and Johnson, B. (2014). Nonparametric confidence intervals for quantiles with randomized nomination sampling. Sankhya, DOI /s
3. Nourmohammadi, M., Jafari Jozani, M. and Johnson, B. (2014). Distribution-free tolerance intervals with randomized nomination samples. Statistical Methodology, Revision Submitted.
4. Nourmohammadi, M., Jafari Jozani, M. and Johnson, B. (2014). Parametric randomized nomination sampling. In preparation.

Contents

Contents
List of Tables
List of Figures

1 Introduction
    On Rank-Based Sampling Techniques
    Obtaining a Randomized Nomination Sample
    Motivation
    Literature Review
    Thesis Outline

2 Confidence Intervals for Quantiles in Finite Populations
    Introduction
    RNS Replacement Protocols
        Level 0 RNS Design
        Level 1 RNS Design
        Level 2 RNS Design
    A Small Example
    Confidence Intervals Based on Level 0 RNS Design
    Optimal Choice of ζ and a Guideline for Choosing r
    Confidence Intervals Based on Level 1 RNS Design
    Confidence Intervals Based on Level 2 RNS Design
    Numerical Study
    Symmetric Confidence Intervals
    Equal-tail Confidence Intervals

3 Confidence Intervals for Quantiles in Infinite Populations
    Introduction
    Distributional Properties of RNS Samples
    RNS Confidence Intervals for Quantiles
    Large Sample Theory
    Choosing ζ for Symmetric Confidence Intervals
    On the Optimal Choice of ζ
    An Illustrative Example
    A Case Study

4 Distribution-Free Tolerance Interval
    Introduction
    RNS Tolerance Intervals
    Sample Size Determination
    RNS in Presence of Ranking Error
    Numerical Study
    A Case Study

5 Parametric Inference using RNS
    Introduction
    ML Estimation in RNS Complete-Data
    ML Estimation in the PHR Model
    ML Estimation in RNS Incomplete-Data
    RNS-Based MM Estimators
    Numerical Studies

6 Summary and Future Work

Bibliography

List of Tables

2.1 Probabilities of possible samples under three RNS protocols when $N = 4$, $m = 2$, and $P(K = 2) =$

2.2 Probabilities of all possible samples under three RNS protocols in Table 2.1 when $\zeta \in \{0, 0.5, 1\}$

2.3 The values of $r$, lower and upper coverage probabilities associated with $[Y_{r+1:m}, Y_{m-r:m}]$ and $[Y_{r:m}, Y_{m-r+1:m}]$, respectively, and average length of 95% interpolated symmetric confidence intervals for the first quartile for three introduced population distributions of size $N = 100$, $m = 15$, $\zeta = 0$, and $K_i - 1$ is a Poisson(0.25) random variable that has been truncated so that $K_i \in \{1, 2, 3, 4\}$

2.4 The values of $r$ and $s$, length and the observed coverage probabilities (in parentheses) of interpolated 95% equal-tail confidence intervals for the first and third quartiles associated with SRSWOR and Level 0 RNS design with $\zeta = 0$ and $\zeta = 1$, respectively. We considered three population shapes with population sizes $N \in \{40, 80, 100\}$, the sampling fraction $f = 20\%$ and $K_i - 1 \sim$ Bernoulli(0.75)

2.5 The length and the observed coverage probabilities (in parentheses) of interpolated 95% equal-tail and symmetric confidence intervals for the first ($Q_1$), second (Median) and third ($Q_3$) quartiles associated with Level 0 RNS design with $\zeta = 0.05$, $\zeta = 0.5$ and $\zeta = 0.95$, respectively. We considered the population shape to be LogNormal with population size $N = 100$, the sampling fraction $f = 20\%$ for two distributions of $K_i$, $K_i - 1 \sim$ Bernoulli(0.75) and $P(K_i = 5) =$

2.6 The length and the observed coverage probabilities (in parentheses) of interpolated 95% equal-tail and symmetric confidence intervals for the first ($Q_1$), second (Median) and third ($Q_3$) quartiles associated with Level 0 RNS design with $\zeta = 0.05$, $\zeta = 0.5$ and $\zeta = 0.95$, respectively. We considered the population shape to be Normal Mixture with population size $N = 100$, the sampling fraction $f = 20\%$ for two distributions of $K_i$, $K_i - 1 \sim$ Bernoulli(0.75) and $P(K_i = 5) =$

2.7 The length and the observed coverage probabilities (in parentheses) of interpolated 95% equal-tail and symmetric confidence intervals for the first ($Q_1$), second (Median) and third ($Q_3$) quartiles associated with Level 0 RNS design with $\zeta = 0.05$, $\zeta = 0.5$ and $\zeta = 0.95$, respectively. We considered the population shape to be Normal with population size $N = 100$, the sampling fraction $f = 20\%$ for two distributions of $K_i$, $K_i - 1 \sim$ Bernoulli(0.75) and $P(K_i = 5) =$

3.1 Minimum values of sample size $m$ in 95% SRS and RNS($\rho_i$, $\zeta_i$) confidence intervals for $Q_{X,p}$, $i = 1, 2, 3$, when $p = 0.005$, $0.01$, and $0.05$ with known $\rho_i$ and optimum values of $\zeta$ calculated as functions of $\rho_i$

4.1 Minimum values of the sample size needed such that $Y_{1:m}(\zeta_r = 0)$ ($Y_{m:m}(\zeta_s = 1)$) are lower (upper) $(\pi, \gamma)$ SRS- and RNS-based tolerance limits for four distributions on $K$. In the imperfect ranking setting the ranking is done by two concomitant random variables with Kendall's $\tau = 0.4$ ($I_1$) and $\tau = 0.7$ ($I_2$), when $\pi = 0.6(0.1)0.9$, $0.95$ and $0.99$, $\gamma = 0.8$, $0.9$ and $0.95$, $B = 50000$, and the variable of study follows Normal(1000, )

4.2 The range of $\zeta$ which improves RNS over SRS for three distributions on $K$. In the imperfect ranking case the ranking is done by a concomitant variable with Kendall's $\tau = 0.7$ ($I_2$), when $\gamma = 0.95$, $\pi = 0.6(0.1)0.9$ and $0.95$, $B = 50000$, and the variable of study follows Normal(1000, )

4.3 Values of $r$ ($s$) for lower (upper) one-sided $(\pi, \gamma)$ tolerance limits of the form $[Y_{r:m}(\zeta_r = 0), \infty)$ ($(-\infty, Y_{s:m}(\zeta_s = 0)]$) for various ranking settings, three distributions on $K$, $\pi = 0.6(0.1)0.9$, $\gamma = 0.9$, $0.95$, and $m = 20$, $40$ and $70$. The imperfect ranking is conducted based on $B =$ when the population shape is LogNormal

4.4 Values of $r$ ($s$) for lower (upper) one-sided $(\pi, \gamma)$ tolerance limits of the form $[Y_{r:m}(\zeta_r = 0), \infty)$ ($(-\infty, Y_{s:m}(\zeta_s = 0)]$) for various ranking settings, three distributions on $K$, $\pi = 0.6(0.1)0.9$, $\gamma = 0.9$, $0.95$, and $m = 20$, $40$ and $70$. The imperfect ranking is conducted based on $B =$ when the population shape is Mixture Normal

4.5 Values of $r$ ($s$) for lower (upper) one-sided $(\pi, \gamma)$ tolerance limits of the form $[Y_{r:m}(\zeta_r = 0), \infty)$ ($(-\infty, Y_{s:m}(\zeta_s = 0)]$) for various ranking settings, three distributions on $K$, $\pi = 0.5(0.1)0.9$, $\gamma = 0.9$, $0.95$, and $m = 20$, $40$ and $70$. The imperfect ranking is conducted based on $B =$ when the population shape is Normal

4.6 Optimal values of $\zeta$ to construct the RNS-based confidence interval for $Q_\pi$ for three distributions on $K$ and $\pi = 0.6(0.1)$

4.7 Values of $r$ ($s$) for lower (upper) one-sided $(\pi, \gamma)$ tolerance limits of the form $[Y_{r:m}(\zeta), \infty)$ ($(-\infty, Y_{s:m}(\zeta)]$) for four distributions on $K$, $\pi = 0.6(0.1)0.9$, $\gamma = 0.9$, $0.95$, $m = 20$, $40$ and $70$, and for $\zeta_{rs} = 0.5$ and $\zeta_0 = 0.8$ and $\zeta = \zeta_\pi$ when the ranking setting is perfect

4.8 Minimum values of $m$ such that $[Y_{1:m}(\zeta), Y_{m:m}(\zeta)]$ are two-sided $(\pi, \gamma)$ SRS- and RNS-based tolerance intervals for four distributions on $K$ when the ranking is perfect and $\zeta_{rs} = 0.5$ and $\zeta_0 =$

4.9 Interval length and values of $(r, s)$ for two-sided $(\pi, \gamma)$ tolerance intervals of the form $[Y_{r:m}(\zeta), Y_{s:m}(\zeta)]$ for four distributions on $K$, $\pi = 0.6(0.1)0.9$, $\gamma = 0.90$ and $0.95$, $m = 20$, $40$ and $70$, and $\zeta_{rs} = 0.5$, when the ranking is perfect and the population shape is LogNormal

4.10 Interval length and values of $(r, s)$ for two-sided $(\pi, \gamma)$ tolerance intervals of the form $[Y_{r:m}(\zeta), Y_{s:m}(\zeta)]$ for three distributions on $K$, $\pi = 0.6(0.1)0.9$, $\gamma = 0.90$ and $0.95$, $m = 20$, $40$ and $70$, and $\zeta_{rs} = 0.5$, when the ranking is perfect and the population shape is Mixture Normal

4.11 Interval length and values of $(r, s)$ for two-sided $(\pi, \gamma)$ tolerance intervals of the form $[Y_{r:m}(\zeta), Y_{s:m}(\zeta)]$ for three distributions on $K$, $\pi = 0.6(0.1)0.9$, $\gamma = 0.90$ and $0.95$, $m = 20$, $40$ and $70$, and $\zeta_{rs} = 0.5$, when the ranking is perfect and the population shape is Normal

4.12 The range of $\zeta$ on which RNS($\rho_2$, $\zeta$) needs smaller sample size than SRS for $\rho_3 = (0.2, 0, 0, 0.8)$, perfect ($P$) and imperfect ranking settings when the auxiliary variable is weight ($W$) and length ($L$), $\pi \in \{0.8, 0.9\}$ and $\gamma =$

4.13 Values of $r$ ($s$) for lower (upper) $(\pi, \gamma)$ tolerance limits of the form $[Y_{r:m}(\zeta_r = 0), \infty)$ ($(-\infty, Y_{s:m}(\zeta_s = 1)]$) for perfect ($P$) and imperfect ranking settings when the auxiliary variable is weight ($W$) and length ($L$), $\pi \in \{0.8, 0.9\}$, $\gamma = 0.95$ and $m =$

4.14 Values of $(r, s)$ for two-sided $(\pi, \gamma)$ tolerance intervals of the form $[Y_{r:m}(\zeta_{rs} = 0.5), Y_{s:m}(\zeta_{rs} = 0.5)]$ for perfect ($P$) and imperfect ranking settings when the auxiliary variable is weight ($W$) and length ($L$), $\pi \in \{0.8, 0.9\}$, $\gamma = 0.95$ and $m =$

5.1 The value of $R_i$ introduced in Theorem 5.2 for RNS complete-data and Type I, Type II and Type III incomplete-data

List of Figures

2.1 Computed ρ (SRS,RNS) from Level 0 (L0-solid line), Level 1 (L1-dashed line) and Level 2 (L2-dotted line) RNS protocols and SRSWOR 95% interpolated symmetric confidence intervals for the first ($Q_1$), second (median) and third ($Q_3$) population quartiles, three hypothetical population shapes, $\zeta \in [0, 1]$, $N = 100$ and $m = 15$ when $K_i - 1$ is a Bernoulli(0.75) random variable. The first row shows the shapes of the underlying distributions

2.2 Computed ρ (SRS,RNS) from Level 0 (L0-solid line), Level 1 (L1-dashed line) and Level 2 (L2-dotted line) RNS protocols and SRSWOR 95% interpolated equal-tail confidence intervals for the first ($Q_1$), second (median) and third ($Q_3$) population quartiles, three hypothetical population shapes, $\zeta \in [0, 1]$, $N = 100$ and $m = 15$ when $K_i - 1$ is a Bernoulli(0.75) random variable. The first row shows the shapes of the underlying distributions

3.1 The area of $\zeta$ which improves RNS($\rho$, $\zeta$)-based symmetric confidence intervals over their SRS counterparts for $p \in (0, 1)$

3.2 Comparison of $p$ and ρ (p, ζ ) and comparison of $m - 2r + 1$ in SRS, RNS($\rho_1$, $\zeta_1$), RNS($\rho_2$, $\zeta_2$), and RNS($\rho_3$, $\zeta_3$), where $\zeta_i$ is the optimum value of $\zeta$ as a function of $\rho_i$, $i = 1, 2, 3$, $m = 45$ and $r$ is chosen as the largest value for which the coverage probability exceeds the nominal 0.95 level

3.3 Comparison of the coverage probabilities of exact confidence intervals for $p \in [0, 1]$ in SRS, RNS($\rho_1$, $\zeta_1$), RNS($\rho_2$, $\zeta_2$), and RNS($\rho_3$, $\zeta_3$), where $\zeta_i$ is the optimal value of $\zeta$ as a function of $\rho_i$, $i = 1, 2, 3$. The nominal confidence level is

3.4 Comparison of the coverage probabilities of confidence intervals for $p = 0.005$, $0.01$, and $0.05$ in SRS, RNS($\rho_1$, $\zeta_1$), RNS($\rho_2$, $\zeta_2$), and RNS($\rho_3$, $\zeta_3$), where $\zeta_i$ is the optimal value of $\zeta$ calculated as a function of $\rho_i$, $i = 1, 2, 3$. The nominal confidence level is

3.5 Comparison of the average coverage probability of the confidence interval $[Y_{1:m}(\zeta_i), Y_{m:m}(\zeta_i)]$ in SRS, RNS($\rho_1$, $\zeta_1$), RNS($\rho_2$, $\zeta_2$), and RNS($\rho_3$, $\zeta_3$), where $\zeta_i$ is the optimal value of $\zeta$ calculated as a function of $p$ and $\rho_i$, $i = 1, 2, 3$

3.6 Average confidence interval lengths for the SRS and RNS($\rho_i$, $\zeta_{i,0.75}$) designs (first row) and the SRS and RNS($\rho_i$, $\zeta^{o}_{i,0.75}$) designs (second row), $i = 1, 2, 3$, under perfect ranking and imperfect ranking 1 and 2 scenarios

3.7 The difference between the simulated and expected coverage probabilities for the SRS and RNS($\rho_i$, $\zeta_{i,0.75}$) designs (first row) and the SRS and RNS($\rho_i$, $\zeta^{o}_{i,0.75}$) designs (second row), $i = 1, 2, 3$, under perfect ranking and imperfect ranking 1 and 2 scenarios

Mercury levels, weight and length of Sander vitreus in the dataset and associated scatterplots

5.1 The relative precision of the ML estimators of $\gamma(\theta)$ based on the RNS complete-data over their SRS counterparts in the PHR (left panel) and PRHR (right panel) models when $K = 2, 3, 4, 5$ and the proportion of maximums is $p = 0(0.1)$

5.2 The relative precision of the RNS incomplete-data Type I (top left panel) for $\zeta \in [0, 1]$ and $\lambda \in \{1, 2, 3, 4\}$, Type II (top right panel) for four distributions on $K$ and $\lambda = 1(0.1)5$, and Type III (middle and bottom panels) for $\zeta \in \{0, 0.25, 0.75, 1\}$ and $\lambda = 1(0.1)5$ in an exponential distribution with parameter $\lambda$ and $m =$

5.3 The relative precision of the RNS incomplete-data Type I (top left panel) for $\zeta \in [0, 1]$ and $\eta \in \{1, 2, 3, 4\}$, Type II (top right panel) for four distributions on $K$ and $\eta = 1(0.1)5$, and Type III (middle and bottom panels) for $\zeta \in \{0, 0.25, 0.75, 1\}$ and $\eta = 1(0.1)5$ in an exponential distribution with parameter $\eta$ and $m =$

5.4 The relative precision of the MM estimators of $\gamma(\theta)$ based on the RNS complete-data over their SRS counterparts in the PHR (left panel) and PRHR (right panel) models when $k_i = 2, 3, 4$, and $5$, and the proportion of maximums is $p = 0.1(0.1)$

5.5 The relative precision of the RNS incomplete-data Type I (top left panel) for $\zeta \in [0, 1]$ and $\lambda \in \{1, 2, 3, 4\}$, Type II (top right panel) for four distributions on $K$ and $\eta = 1(0.1)5$, and Type III (middle and bottom panels) for $\zeta \in \{0, 0.25, 0.75, 1\}$ and $\eta = 1(0.1)5$ in a beta distribution with parameter $1/\eta$, the shape parameter $\beta = 1$, and $m =$

5.6 The relative precision of the RNS incomplete-data Type I (top left panel) for $\zeta \in [0, 1]$ and $\eta \in \{1, 2, 3, 4\}$, Type II (top right panel) for four distributions on $K$ and $\eta = 1(0.1)5$, and Type III (middle and bottom panels) for $\zeta \in \{0, 0.25, 0.75, 1\}$ and $\eta = 1(0.1)5$ in a beta distribution with parameter $1/\eta$, the shape parameter $\beta = 1$, and $m =$

Chapter 1

Introduction

1.1 On Rank-Based Sampling Techniques

One of the important practical problems in statistics is to obtain more representative samples from the underlying population, so as to give an informative picture of the variable of interest. Providing more structured samples than those obtained in a simple random sampling (SRS) design is one way to address this issue. Rank-based sampling schemes use easily obtained rank-based information related to the variable of interest in the population to provide an artificially stratified sample and direct our attention toward the more representative units in the population.

Ranked set sampling (RSS) is a rank-based approach to data collection first proposed by McIntyre (1952) for situations where taking actual measurements of the attribute of interest on the sampling units is time-consuming, destructive, or costly, but a small set of sample units can be ordered with respect to the variable of interest, without actually measuring them, relatively easily and at negligible cost. In RSS, a set of $k$ items is drawn from the population, the items of the set are ranked by judgement, and only the item judged the smallest is quantified. Then another set of size $k$ is drawn and ranked, and only the item judged the second smallest is quantified. The procedure continues until the item judged the largest in the $k$-th set is quantified. This completes one cycle of the sampling. The cycle is then repeated as many times as desired. This form of RSS is referred to as balanced, in the sense that each order statistic in the ranked sets is quantified the same number of times. In the unbalanced RSS scheme, a cycle of $k$ sets of size $k$ is drawn and ranked, and the order statistics with orders $r_1, \ldots, r_k$ are quantified in these $k$ ranked sets, where the $r_j$'s can be any integers from 1 to $k$ and are not necessarily different. For more information on RSS, see Chen et al. (2004) and Wolfe et al. (2004).

Randomized nomination sampling (RNS) is also a rank-based sampling technique; it was introduced by Jafari Jozani and Johnson (2012) for estimating the mean of the characteristic of interest in finite populations. This thesis concerns parametric and nonparametric inference based on the RNS design. To this end, in this chapter we discuss the RNS design and some of its applications. We show that several commonly used rank-based sampling techniques can be written as special cases of the RNS design.

1.2 Obtaining a Randomized Nomination Sample

Let $\{K_i : i \in \mathbb{N}\}$ be a sequence of independent random variables taking values in $\{1, \ldots, M\}$ with probabilities $\rho = \{(\rho_1, \rho_2, \ldots, \rho_M) : \sum_{i=1}^{M} \rho_i = 1\}$. When the underlying population is finite with size $N$, we have $1 \le M \le N$. The value of $M$ is defined by the context, as will be seen later. Further, let $\{Z_i : i \in \mathbb{N}\}$ be a sequence of independent Bernoulli random variables with success probability $\zeta \in [0, 1]$, independent of the $K_i$. First, an initial simple random sample of $K_1$ units is selected from the population; this sample is called a set. Then, the sampled units in the set are ordered with respect to the attribute of interest via some ranking process so that the smallest and the largest units are identified. This judgement ranking can be accomplished using a variety of mechanisms, including expert opinion, visual comparisons, or the use of an easy-to-obtain auxiliary variable. Note that the resulting ranking can be perfect or imperfect. Once the judgement ranking of the $K_1$ units in the set is accomplished, the first unit in the sample is selected based on the value of $Z_1$: we select the smallest unit in the set if $Z_1 = 0$ and the largest if $Z_1 = 1$. Then the attribute of interest is measured on the selected unit. This process is repeated $m$ times to obtain an RNS sample of size $m$. We refer to this design as an RNS design with parameters $\rho$ and $\zeta$, or an RNS($\rho$, $\zeta$) design for short.

Suppose $X$ is the variable of interest. Within the $i$-th set, the selected nominee for the final measurement can be written as

$$Y_i := Y(X, K_i, Z_i) = (1 - Z_i)\, X_{1:K_i} + Z_i\, X_{K_i:K_i},$$

where $X$, $K_i$ and $Z_i$ are independent and $X_{r:k}$ refers to the $r$-th order statistic in a sample of size $k$. Throughout the thesis, we use $X_{r:k}$ and $X_{[r:k]}$ to distinguish between the $r$-th order statistics in a sample of size $k$ under perfect and imperfect ranking scenarios, respectively. RNS-based statistical inference may be made under the assumption that the value $y_i$ is observed and $k_i$ and $z_i$, where $i = 1, \ldots, m$, are unknown. However, there might be situations in which the sizes of the sets, or the number of maximums (and subsequently the number of minimums), or both, are chosen in advance instead of being determined by a randomized process. The cumulative distribution function (CDF) of $Y_i$ can be found by conditioning on $K_i = k_i$, on $Z_i = z_i$, or on both. The conditioning argument makes the theoretical investigation more complicated, but it provides more efficient statistical inference. In this thesis, both unconditional and conditional RNS are presented.

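
The sampling procedure described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the thesis's own implementation: it assumes perfect ranking and an infinite population (each set is an IID draw of $X$), and the names `rns_sample`, `rns_cdf` and `draw_x` are hypothetical. The second function evaluates the unconditional CDF of $Y_i$ that follows from conditioning on $K_i$ and $Z_i$ under these assumptions, namely $\sum_{k=1}^{M} \rho_k \left[ (1 - \zeta)\{1 - (1 - F(y))^k\} + \zeta F(y)^k \right]$, where $F$ is the CDF of $X$; it uses only the standard distributions of the minimum and maximum of $k$ IID draws.

```python
import numpy as np


def rns_sample(draw_x, rho, zeta, m, rng):
    """Draw an RNS(rho, zeta) sample of size m, assuming perfect ranking.

    draw_x(n, rng) must return n IID draws of X (infinite-population assumption).
    rho[k-1] = P(K_i = k) for k = 1, ..., M; zeta = P(Z_i = 1).
    """
    rho = np.asarray(rho, dtype=float)
    sizes = rng.choice(np.arange(1, rho.size + 1), size=m, p=rho)  # K_1, ..., K_m
    is_max = rng.random(m) < zeta                                  # Z_1, ..., Z_m
    y = np.empty(m)
    for i in range(m):
        x_set = draw_x(int(sizes[i]), rng)                # the i-th set
        y[i] = x_set.max() if is_max[i] else x_set.min()  # nominate one extreme
    return y, sizes, is_max.astype(int)


def rns_cdf(u, rho, zeta):
    """CDF of Y_i at a point y with F(y) = u, by conditioning on K_i and Z_i."""
    rho = np.asarray(rho, dtype=float)
    k = np.arange(1, rho.size + 1)
    cdf_min = 1.0 - (1.0 - u) ** k  # P(minimum of k draws <= y)
    cdf_max = u ** k                # P(maximum of k draws <= y)
    return float(np.sum(rho * ((1.0 - zeta) * cdf_min + zeta * cdf_max)))


# Illustration with X ~ Uniform(0, 1), so that F(y) = y on [0, 1].
rng = np.random.default_rng(2014)
rho = [0.25, 0.25, 0.25, 0.25]
y, k_obs, z_obs = rns_sample(lambda n, g: g.random(n), rho, zeta=0.7, m=20_000, rng=rng)
print(np.mean(y <= 0.5), rns_cdf(0.5, rho, 0.7))  # empirical vs. closed-form CDF
```

In this sketch, setting `rho = [1.0]` gives back simple random sampling, while `zeta = 1` and `zeta = 0` give samples of maxima only and minima only, respectively. Imperfect ranking and finite-population replacement protocols are deliberately left out.
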
Some well-known examples of randomized nomination sampling are given below:

(1) The choice of $K_i = 1$, $i = 1, \ldots, m$, results in the SRS design. Throughout this thesis the simple random sample is denoted by $X_i$, $i = 1, \ldots, m$.
(2) The choice of $\zeta = 1$ nominates the maximum from each set and results in a maxima nomination sampling (MANS) design (see Willemain, 1980, and Jafari Jozani and Mirkamali, 2010).
(3) The choice of $\zeta = 0$ nominates the minimum from each set and results in a minima nomination sampling (MINS) design (see Tiwari and Wells, 1989).
(4) The choice of $\zeta = \frac{1}{2}$ and $k_i = k$, for a constant $k \in \mathbb{N}$, results in a randomized extreme ranked set sampling (RERSS) design (see Jafari Jozani and Johnson, 2012).

With slight modifications the results can be used in the following designs:

(5) The choice of $Z_{2i-1} = 1$ and $Z_{2i} = 0$, and $K_i = k$, where $i = 1, \ldots, m$, for a constant $k \in \mathbb{N}$ and an even number $m$, results in an extreme ranked set sampling (ERSS) design (see Samawi et al., 1996, and Ghosh and Tiwari, 2009).
(6) The choice of $Z_{2i-1} = 1$ and $Z_{2i} = 0$, and $K_i = i$, where $i = 1, \ldots, m$, for an even number $m$, results in a moving extreme ranked set sampling (MRSS) design (see Al-Odat and Al-Saleh, 2001).

Note that $Z_i$, $i = 1, \ldots, m$, in (5) and (6) are no longer IID.

1.3 Motivation

Willemain (1980) introduced the MANS design with a fixed set size $k$ (i.e., $\zeta = 1$ with $\rho_k = 1$) to estimate the median of a population in a situation where it is practically impossible to draw a simple random sample. In his application, a federal agency was interested in relating the reimbursement amounts paid for nursing home services more closely to the actual cost of the services provided. Since complete enumeration of costs for all patients in every nursing home would be prohibitively expensive, sampling methods were considered. However, it was felt that the nursing home operators would be more willing to accept sample-based rates if they were given a role in selecting the cases chosen for assessment. The operators were therefore allowed to make the selection of those residents whose care they felt to be most expensive more likely. Willemain (1980) described a procedure in which one selects $m$ independent sets of size $k$ and, for each set, allows the nursing home operator(s) to select the patient they judge to have the most expensive care requirements. Since Willemain (1980), maxima nomination sampling has been the topic of numerous papers; more details are presented in Section 1.4.

Jafari Jozani and Johnson (2012) showed that the MANS design may not necessarily be the optimal design choice for estimating the population total. They suggested using observations from both extremes, maximums and minimums, and giving more flexibility to the design by allowing the set size to be random. The method proposed by Jafari Jozani and Johnson (2012) also provides a unified way of analyzing MANS and MINS designs.

The efficiency of statistical inference using rank-based sampling techniques relative to their SRS counterpart, with the same number of quantified units, depends upon the underlying population, the ranking accuracy, and the set sizes. One concern in working with extremes arises when the distribution of the variable of interest is skewed. RNS has the flexibility to adjust the proportion of maximums and minimums by choosing an appropriate $\zeta \in [0, 1]$ based on the shape of the underlying distribution. Regarding the ranking accuracy, unlike regular RSS, RNS allows for an increase of the set size without introducing too much ranking error. Identifying the extremes, rather than producing a complete ranking, is more practical, since we need to successfully identify only the first and/or the last ordered unit. In terms of the set size, while the RNS design does not preclude the use of fixed set sizes (by taking $P(K_i = k) = 1$ for $i = 1, \ldots, m$ and some fixed $k$), allowing for random set sizes provides additional flexibility in the design.

There are circumstances in which the set size might naturally be random. For example, reliability experiments quite often give rise to an RNS design, and statistical inference using RNS can easily be applied in a system reliability context to make inference on the lifetime of a system with several components in parallel or in series. If $K$ is the number of components in a parallel (series) system and $X$ is the system lifetime, then repeated observations on the pair $(X, K)$ constitute a maxima (minima) nomination sample. In this context, one notes that $K$ might be random or non-random, depending on the application. A user wishing to control the level of redundancy in a system might use $K$ as a design parameter. Life-testing experiments carried out by a manufacturer might involve a variety of $K$ values chosen according to some fixed scheme. On the other hand, consumers often have the option of choosing the value of $K$, perhaps implicitly. Thus a random sample of consumers would result in a randomized nomination sample with random $K$. Gemayel et al. (2010) provide more discussion on random set sizes in RSS.

Another advantage of allowing random set sizes is that, when $\rho_1 > 0$, we have, on average, $m\rho_1$ observations which comprise a simple random sample. Indeed, on average, RNS samples will contain $m\zeta\rho_k$ maximums from sets of size $k$ and $m(1 - \zeta)\rho_k$ minimums from sets of size $k$, for $k = 1, \ldots, M$. Thus, in addition to the simple random sample portion of the RNS sample, we also have a collection of extremal order statistics from various set sizes, which can contain much more information about the population than SRS observations. The RNS scheme can accommodate such random set sizes in applications.

The studies on the RNS design presented in this thesis were motivated by the work of Jafari Jozani and Johnson (2012). We investigate the performance of the RNS design in making parametric and nonparametric statistical inference in both finite and infinite populations.

1.4 Literature Review

After Willemain (1980) introduced MANS, Boyles and Samaniego (1986) used this design to examine the problem of nonparametric estimation of the population CDF, relaxing the assumption that the $K_i$ are fixed. In this case, the units obtained from the MANS design are generated from the same parent distribution, but they are not identically distributed unless the $m$ sets from which they were obtained are of the same size. Considering a random set size, Boyles and Samaniego (1986) derived the nonparametric ML estimator of the population CDF and established the strong consistency of their estimator. They also identified the asymptotic distribution of their estimator. They applied their results to a real data set from the Natural Environment Research Council of Great Britain in 1975, consisting of the number of floods per year of the Nidd River in England and the discharge, in cubic meters per second, corresponding to the maximum flood. They also demonstrated the strong consistency of Willemain's estimate of the median as a byproduct of their development. Tiwari (1988) derived the Bayes estimator of the population CDF ($F$) in the MANS design. Taking the Dirichlet distribution as the prior distribution, he showed that under some conditions the Bayes estimator is identical to the nonparametric ML estimator derived by Boyles and Samaniego (1986). Later, Tiwari and Wells (1989) presented a quantile estimator using maxima nomination samples based on the ML estimator derived by Boyles and Samaniego (1986). Wells et al. (1990) considered another important case, in which the nominee is the minimum of each set, and called it MINS. Kvam and Samaniego (1993) proposed an alternative nonparametric estimator of the population CDF using the least squares approach in MANS. This approach was also used by the same authors for estimating the CDF using RSS.

Kvam and Samaniego (1993) concluded that if the cost of obtaining maximums is no more than the cost of obtaining regular sample units, then the least squares (LS) estimator of the CDF within the range of the maxima nomination sample is preferred over the empirical distribution function (EDF) of an independent and identically distributed (IID) sample when the primary interest is the upper portion of the unknown distribution. Their simulation showed that, with minor adjustments, the LS estimator performs equally well using the observed minima rather than maxima.

Jafari Jozani and Mirkamali (2010) provided an application of MANS in constructing better acceptance sampling plans for attributes of interest than the usual ones based on SRS. Acceptance sampling uses statistical sampling to determine whether to accept or reject a production lot of material. It has been a common quality control technique in industry. An important area in quality control research is control charting for attributes, which monitors the number or proportion of nonconforming items, i.e., the items in the process which do not satisfy some conditions. Jafari Jozani and Mirkamali (2011) showed that using the MANS technique in constructing attribute control charts leads to a substantial improvement over the usual control charts for attributes based on SRS.

RNS was first introduced by Jafari Jozani and Johnson (2012), who provided design-based estimation procedures for use in finite population sampling and showed some optimality results over SRS and strict maxima (minima) nomination sampling schemes. They provided the first- and second-order inclusion probabilities of the population elements for both the with-replacement and without-replacement variations. Minimizing the variance of the Hansen-Hurwitz estimator, they stated that, under the perfect ranking assumption, there is always an RNS design with $\zeta > \frac{1}{2}$ that outperforms every RNS design with $\zeta \le \frac{1}{2}$. In particular, MINS and a randomized version of ERSS with $\zeta = \frac{1}{2}$ cannot be optimal in the finite population setting, and MANS always performs better than MINS. They showed that the RNS design can perform better than the SRS design in estimating the population total for many population types.
Moreover, they provided the relative efficiencies of the RNS designs with and without replacement with respect to an SRS design without replacement, for the optimal choice of design parameters, under both perfect and imperfect ranking scenarios.

1.5 Thesis Outline

The outline of the thesis is as follows. In Chapter 2, we discuss three different ways of constructing an RNS design in finite populations using various replacement policies. We also discuss the construction of confidence intervals for population quantiles. Several theoretical results are presented in this chapter. We also provide guidelines for choosing the design parameter $\zeta$ which lead to shorter confidence intervals for specific population quantiles compared with their SRS counterparts. We develop recursive algorithms that can be used to obtain the confidence coefficient associated with the confidence intervals. Numerical studies are conducted to evaluate the performance of the RNS-based symmetric and equal-tail confidence intervals compared with their counterparts based on the SRS design. Chapter 2 also contains a discussion of the effect, on the length of the constructed confidence intervals, of fixing the set size and of conditioning on whether the selected unit is a maximum or a minimum.

Chapter 3 describes the RNS design and presents some basic distribution theory of the random variables associated with this design when the underlying population is absolutely continuous. We study the construction of the exact and asymptotic RNS confidence intervals for quantiles. We show a duality between the problem of constructing an RNS confidence interval for the $p$-th quantile of the underlying population and that of constructing an SRS confidence interval for the $p'$-th quantile of the population, where $p, p' \in [0, 1]$ and $p'$ is obtained as a function of $p$ and the parameters of the RNS design. We compare the RNS and SRS confidence intervals based on their length and coverage probabilities and the necessary sample size to achieve the desired coverage probabilities in each design.
It is observed that the design parameter $\zeta$ associated with RNS provides a flexible tool which enables one to construct confidence intervals with the exact desired coverage probabilities for a wide range of population quantiles without the use of randomized procedures. A case study based on a small livestock data set for both perfect and imperfect ranking situations is also provided.

In Chapter 4, distribution-free RNS-based tolerance intervals are studied and some of the necessary theoretical results are derived. We compare the proposed RNS tolerance intervals with those based on SRS in terms of their coverage probabilities and the sample size necessary for their existence. The efficiency of the constructed RNS-based tolerance intervals compared to the SRS counterparts is discussed. We investigate the performance of RNS-based tolerance intervals for different values of the design parameters and various population shapes. We find the values of the design parameters that make the RNS-based tolerance intervals superior to their SRS-based counterparts in terms of the sample size needed to construct the tolerance intervals. The RNS design in the presence of ranking error is also discussed and a new method for estimating ranking error probabilities is proposed.

Chapter 5 examines the use of the RNS design in the estimation of the distribution parameters $\theta$ in proportional hazard rate (PHR) models and proportional reverse hazard rate (PRHR) models. We consider both the method of moments (MM) and maximum likelihood (ML) estimation of $\theta$. Depending on which of the values $y_i$, $k_i$, and $z_i$ are known, four types of RNS data are defined. The RNS complete-data case, in which the triplet $(y_i, k_i, z_i)$ is known for the selected units, is investigated in detail, and the expected values and the variances of the estimators are obtained. It is shown that there is always a value of the design parameter $\zeta$ for which the RNS design is more efficient than the SRS design. The EM algorithm is used to compare the RNS-based estimators with their SRS counterparts when the data are not complete.

In Chapter 6, we conclude with a summary of what was accomplished in Chapters 2 to 5. We also describe some future work that could be carried out in areas related to those discussed in this dissertation.

Chapter 2

Confidence Intervals for Quantiles in Finite Populations

2.1 Introduction

There are many cases in which one may be interested in the quantiles of a distribution. For example, inference on population quantiles may be of more interest than inference on the population mean when the underlying distribution is highly skewed, as quantiles are less influenced by extreme observations. In this chapter, we study the problem of constructing confidence intervals for population quantiles under different RNS designs for finite populations. In recent years, many researchers have considered similar problems for finite and infinite populations under different rank-based sampling designs. For example, under the RSS design, Ozturk and Deshpande (2006) proposed distribution-free confidence intervals for quantiles of infinite populations, and showed that RSS-based intervals tend to be shorter than their counterparts based on SRS. Later, Deshpande et al. (2006) developed nonparametric RSS-based confidence intervals for quantiles of finite populations. For recent developments in this direction, see Frey (2007a), Ozturk (2012) and the references therein. In other applications, the construction of confidence intervals for quantiles has been the topic of the papers by Philip and Lam (1997) for the assessment of the status of hazardous waste sites, Kvam (2003) for monitoring water quality, Murff and Sager (2006) for evaluating mercury contamination in fish, and Burgette and Reiter (2012) for studying the adverse effects of mothers' tobacco smoke exposure during pregnancy on health indices of infants.

In the finite population setting, the construction of a randomized nomination sample can be done in different ways. It is usual to assume that sets are drawn without replacement from the underlying population. However, different replacement policies for the measured and ranked units in a set, prior to the selection of the units in the next set, result in different RNS designs. Following Deshpande et al. (2006), we consider three without-replacement RNS techniques, denoted by the Level 0, Level 1, and Level 2 RNS designs, respectively. In the Level 0 RNS design, sets are drawn without replacement, but all units in the set, including the measured unit, are returned to the population prior to selection of the next set. In the Level 1 RNS design, all units in the set, except the unit selected for full measurement, are returned to the population. If none of the units from the sets are returned to the population before drawing the next set, then we call this the Level 2 RNS design. Jafari Jozani and Johnson (2012) developed recursive algorithms to obtain the first- and second-order inclusion probabilities for population units under the Level 0 and Level 1 RNS designs.

In Section 1.3, we briefly discussed the advantages of random set sizes in RNS. In addition, when $\rho_1 > 0$ is moderately large, as proposed in Chapter 4, after observing the RNS sample we can bootstrap its SRS portion to estimate the ranking error probabilities in an imperfect RNS design.
One may also want to choose the number of maxima (and hence the number of minima) in advance, instead of using a randomized process; this can be accomplished by a conditioning argument on the $Z_i$'s (see Section 2.6). Despite the complexity of making inference conditional on $Z_i = z_i$ after randomization, the conditioning argument may lead to better results. However, the proportion of required maxima in this setting would be another concern requiring attention. This concern can be addressed using the results we obtain in the randomized setting.

The outline of this chapter is as follows. In Section 2.2, we discuss three different ways of constructing an RNS design in finite populations using the replacement policies Level 0, Level 1 and Level 2. Section 2.3 deals with the construction of confidence intervals for population quantiles under the Level 0 RNS design. Several theoretical results are presented in this section. We also provide a guideline for choosing the design parameter $\zeta$ in the Level 0 RNS design to obtain more efficient confidence intervals for specific population quantiles compared with their SRS counterparts. In Sections 2.4 and 2.5, we develop recursive algorithms that can be used to obtain the confidence coefficients associated with Level 1 and Level 2 RNS confidence intervals, respectively. In Section 2.6, numerical studies are conducted to evaluate the performance of the RNS-based symmetric and equal-tail confidence intervals compared with their counterparts based on the SRS design. Section 2.6 also contains a discussion of the effect of the fixed set size, and of the conditional results given $Z_i = z_i$, $i = 1, \dots, m$, on the length of the constructed symmetric and equal-tail confidence intervals.

2.2 RNS Replacement Protocols

In this section, we describe three protocols for drawing randomized nomination samples from the finite population $U$. We assume that ranking of the units in each set is done based on an auxiliary variable. To set the notation, suppose we have a finite population of $N$ elements, labeled $U = \{1, \dots, N\}$, consisting of bivariate pairs $(x_1, u_1), \dots, (x_N, u_N)$, where $X$ is the study variable and $U$ is an auxiliary variable, known to us in advance and assumed to be positively (or negatively) associated with $X$.
Without loss of generality, we assume the $u_i$, $i = 1, \dots, N$, are distinct; otherwise a random noise $\epsilon_i$ should be added to $u_i$ to make them unique. Let $u_{(i)}$, $i = 1, \dots, N$, denote the $i$-th ordered $u$ value in the population and $x_{[i]}$ denote the $x$ value associated with $u_{(i)}$, so that the ordered population may be written as $(x_{[1]}, u_{(1)}), \dots, (x_{[N]}, u_{(N)})$. Similarly, in a simple random sample of size $k$, we denote the $i$-th ordered $u$ value by $u_{i:k}$ and the associated $x$ value by $x_{[i:k]}$. For each RNS protocol described below, the final sample can be represented as a set of vectors $\{(y_i, k_i, z_i), i = 1, \dots, m\}$, where $y_i$ represents the final measurement obtained from set $i$, $k_i$ is the set size, and $z_i \in \{0, 1\}$. The value $z_i = 1$ indicates that $y_i$ is the measurement obtained from the unit associated with $x_{[k_i:k_i]}$, which is ranked the largest among the $k_i$ units in the set, while $z_i = 0$ shows that $y_i$ is obtained from the unit associated with $x_{[1:k_i]}$, which is ranked the smallest in the set, so that

$$y_i = z_i \, x_{[k_i:k_i]} + (1 - z_i) \, x_{[1:k_i]}.$$

2.2.1 Level 0 RNS Design

In the Level 0 RNS design, given $K_i = k_i$ and $Z_i = z_i$, we first draw a simple random sample without replacement of size $k_i$ from the population and then rank the units in the set without actual measurement of the variable of interest. We then select, for full measurement, the unit ranked $k_i$ when $z_i = 1$ or the unit ranked 1 when $z_i = 0$ within the ordered set. Finally, we return all the units to the population before drawing the next set for the next measurement. In other words, the Level 0 RNS protocol allows the same unit to appear more than once in the final sample as well as in the sets used for ranking purposes. The observations $y_i$ are therefore marginally independent under this protocol. Here, there is no restriction on the set size and we can simply let $M$ equal the population size $N$, although in practice small values of $M$ are recommended. We can state the Level 0 RNS design in algorithmic form as follows.
For each $i = 1, \dots, m$:

Step I: Observe $K_i = k_i$ using the probability model $\rho = \{(\rho_1, \rho_2, \dots, \rho_M) : 0 \le \rho_i \le 1, \sum_{i=1}^{M} \rho_i = 1\}$.
Step II: Draw a simple random set without replacement of size $k_i$ from the underlying population.
Step III: Using the auxiliary variable, rank the set units from the minimum to the maximum.
Step IV: Select the minimum (with probability $1 - \zeta$) or the maximum (with probability $\zeta$) from the ranked set and measure the study variable for this unit.
Step V: Return all $k_i$ units to the population.

2.2.2 Level 1 RNS Design

The algorithm for this level is similar to that for Level 0, except that in Step V we return only the $k_i - 1$ units not measured in Step IV. In other words, we do not allow the same unit to appear more than once in the final sample, and so the observations are not independent. As a result, we will see in the next section that finding the confidence interval for quantiles under this protocol is more complicated. In this level, the unit which is selected from the population is removed, and to be able to draw a randomized nomination sample of size $m$ from the underlying population, we are forced to restrict the set size $K_i$ to be less than $M - m + 1$, for $i = 1, \dots, m$.

2.2.3 Level 2 RNS Design

In the Level 2 RNS design, the algorithm is similar to that for Level 0, except that none of the units selected in a set are returned to the population after the measurement is taken (i.e., in Step V). That is, given $K_i = k_i$, all $k_i$ units in the set are removed from the population prior to the next selection. As in Level 1, units cannot be selected more than once for full measurement in the final sample. However, because of the strict insistence on sampling without replacement for both ranking and measurement purposes, the Level 2 sample can only be drawn under more restrictive conditions on the set size $K_i$. For this design, to be able to select a randomized nomination sample of size $m$ from the finite population $U$, we must have $P(K_i \le N/m) = 1$, for $i = 1, \dots, m$.

2.2.4 A Small Example

To show the difference between these sampling protocols, we consider a simple example and obtain all possible randomized nomination samples and their probabilities under the various sampling protocols from the underlying finite population.

Example 2.1. Suppose $U = \{1, 2, 3, 4\}$, and without loss of generality, assume that the values of the study variable are such that $x_1 < x_2 < x_3 < x_4$. Consider a simple case in which $m = 2$ and $P(K_i = 2) = 1$, for $i = 1, 2$. Table 2.1 shows all possible randomized nomination samples of size $m = 2$ and their probabilities (as a function of $\zeta$) under the various RNS sampling protocols from $U$. From Table 2.2, we can easily observe that the Level 0, Level 1 and Level 2 RNS designs assign substantially different probabilities to the possible samples depending on the value of $\zeta$. The effect of the without-replacement policy in the Level 2 RNS protocol can be seen from the fact that, when $\zeta = 0$, only two distinct samples are possible, $(x_1, x_2)$ and $(x_1, x_3)$. When $\zeta = 1$, the only samples that have positive probabilities of being chosen under the Level 2 RNS protocol are $(x_2, x_4)$ and $(x_3, x_4)$. The set of possible samples is larger under the Level 1 and Level 0 RNS designs.
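The probabilities in Example 2.1 can be checked by brute-force enumeration. The following Python sketch is not taken from the thesis: it assumes perfect ranking, a fixed set size $k$ (so that $P(K_i = k) = 1$), and units labelled $1, \dots, N$ in increasing order of the study variable. The function name `rns_sample_probs` and its arguments are hypothetical. It walks through every possible sequence of sets and selections under the chosen replacement level and accumulates exact probabilities of the unordered samples.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations


def rns_sample_probs(N, m, k, zeta, level):
    """Exact probabilities of unordered RNS samples of size m.

    Assumptions (illustrative, not from the thesis): perfect ranking, fixed
    set size k, and units 1..N labelled in increasing order of the study
    variable. `level` is 0, 1 or 2 and selects the replacement policy.
    """
    zeta = Fraction(zeta)
    probs = defaultdict(Fraction)

    def recurse(population, measured, weight):
        if len(measured) == m:
            probs[tuple(sorted(measured))] += weight
            return
        sets = list(combinations(population, k))
        for s in sets:
            # Maximum with probability zeta, minimum with probability 1 - zeta.
            for unit, p in ((max(s), zeta), (min(s), 1 - zeta)):
                if p == 0:
                    continue
                if level == 0:    # all k units go back
                    remaining = population
                elif level == 1:  # only the measured unit is removed
                    remaining = tuple(u for u in population if u != unit)
                else:             # the whole set is removed
                    remaining = tuple(u for u in population if u not in s)
                recurse(remaining, measured + [unit], weight * p / len(sets))

    recurse(tuple(range(1, N + 1)), [], Fraction(1))
    return dict(probs)


if __name__ == "__main__":
    for level in (0, 1, 2):
        for zeta in (Fraction(0), Fraction(1, 2), Fraction(1)):
            table = rns_sample_probs(4, 2, 2, zeta, level)
            print(level, zeta, {s: str(p) for s, p in sorted(table.items())})
```

With $N = 4$, $m = 2$, $k = 2$ and `level=2`, the call with `zeta = 0` returns probability 2/3 for the sample (1, 2) and 1/3 for (1, 3), which are the two samples identified in Example 2.1 for $\zeta = 0$. Exact rational arithmetic is used so that the results can be compared with closed-form expressions without rounding; passing $\zeta$ as a `Fraction` or an integer avoids binary floating-point artefacts.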

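Table 2.1: Probabilities of possible samples under three RNS protocols when $N = 4$, $m = 2$, and $P(K = 2) = 1$.

Sample | Level 0 | Level 1 | Level 2
$x_1, x_1$ | $\frac{9}{36}(1-\zeta)^2$ | 0 | 0
$x_1, x_2$ | $\frac{6}{36}\zeta(1-\zeta) + \frac{12}{36}(1-\zeta)^2$ | $\frac{20}{36}(1-\zeta)^2 + \frac{4}{36}\zeta(1-\zeta)$ | $\frac{24}{36}(1-\zeta)^2$
$x_1, x_3$ | $\frac{12}{36}\zeta(1-\zeta) + \frac{6}{36}(1-\zeta)^2$ | $\frac{10}{36}(1-\zeta)^2 + \frac{14}{36}\zeta(1-\zeta)$ | $\frac{12}{36}(1-\zeta)^2 + \frac{12}{36}\zeta(1-\zeta)$
$x_1, x_4$ | $\frac{18}{36}\zeta(1-\zeta)$ | $\frac{24}{36}\zeta(1-\zeta)$ | $\frac{24}{36}\zeta(1-\zeta)$
$x_2, x_2$ | $\frac{4}{36}\zeta(1-\zeta) + \frac{4}{36}(1-\zeta)^2 + \frac{1}{36}\zeta^2$ | 0 | 0
$x_2, x_3$ | $\frac{4}{36}(1-\zeta)^2 + \frac{10}{36}\zeta(1-\zeta) + \frac{4}{36}\zeta^2$ | $\frac{12}{36}\zeta(1-\zeta) + \frac{6}{36}\zeta^2 + \frac{6}{36}(1-\zeta)^2$ | $\frac{24}{36}\zeta(1-\zeta)$
$x_2, x_4$ | $\frac{12}{36}\zeta(1-\zeta) + \frac{6}{36}\zeta^2$ | $\frac{10}{36}\zeta^2 + \frac{14}{36}\zeta(1-\zeta)$ | $\frac{12}{36}\zeta(1-\zeta) + \frac{12}{36}\zeta^2$
$x_3, x_3$ | $\frac{1}{36}(1-\zeta)^2 + \frac{4}{36}\zeta(1-\zeta) + \frac{4}{36}\zeta^2$ | 0 | 0
$x_3, x_4$ | $\frac{6}{36}\zeta(1-\zeta) + \frac{12}{36}\zeta^2$ | $\frac{20}{36}\zeta^2 + \frac{4}{36}\zeta(1-\zeta)$ | $\frac{24}{36}\zeta^2$
$x_4, x_4$ | $\frac{9}{36}\zeta^2$ | 0 | 0

Table 2.2: Probabilities of all possible samples under three RNS protocols in Table 2.1 when $\zeta \in \{0, 0.5, 1\}$.

Sample | $\zeta = 0$: Level 0 | Level 1 | Level 2 | $\zeta = 0.5$: Level 0 | Level 1 | Level 2 | $\zeta = 1$: Level 0 | Level 1 | Level 2
$x_1, x_1$ | 0.2500 | 0 | 0 | 0.0625 | 0 | 0 | 0 | 0 | 0
$x_1, x_2$ | 0.3333 | 0.5556 | 0.6667 | 0.1250 | 0.1667 | 0.1667 | 0 | 0 | 0
$x_1, x_3$ | 0.1667 | 0.2778 | 0.3333 | 0.1250 | 0.1667 | 0.1667 | 0 | 0 | 0
$x_1, x_4$ | 0 | 0 | 0 | 0.1250 | 0.1667 | 0.1667 | 0 | 0 | 0
$x_2, x_2$ | 0.1111 | 0 | 0 | 0.0625 | 0 | 0 | 0.0278 | 0 | 0
$x_2, x_3$ | 0.1111 | 0.1667 | 0 | 0.1250 | 0.1667 | 0.1667 | 0.1111 | 0.1667 | 0
$x_2, x_4$ | 0 | 0 | 0 | 0.1250 | 0.1667 | 0.1667 | 0.1667 | 0.2778 | 0.3333
$x_3, x_3$ | 0.0278 | 0 | 0 | 0.0625 | 0 | 0 | 0.1111 | 0 | 0
$x_3, x_4$ | 0 | 0 | 0 | 0.1250 | 0.1667 | 0.1667 | 0.3333 | 0.5556 | 0.6667
$x_4, x_4$ | 0 | 0 | 0 | 0.0625 | 0 | 0 | 0.2500 | 0 | 0

2.3 Confidence Intervals Based on Level 0 RNS Design

Suppose $Y_1, Y_2, \dots, Y_m$ is a sample of size $m$ drawn from the finite population $U$ (consisting of $N$ elements) using the Level 0 RNS design with design parameters $\rho$ and $\zeta$. Let the ordered values of the study variable for the population elements be denoted by $x_{(1)} < x_{(2)} < \dots < x_{(N)}$. Let $t$ be a fixed integer, $t \in \{1, \dots, N\}$; it is desired to obtain a confidence interval for $x_{(t)}$, the $(t/N)$-th quantile of $U$, where $P(X < x_{(t)}) \le t/N \le P(X \le x_{(t)})$. Let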

Real-Time Software Transactional Memory: Contention Managers, Time Bounds, and Implementations

Real-Time Software Transactional Memory: Contention Managers, Time Bounds, and Implementations Real-Time Software Transactional Memory: Contention Managers, Time Bounds, and Implementations Mohammed El-Shambakey Dissertation Submitted to the Faculty of the Virginia Polytechnic Institute and State

More information

Institute of Actuaries of India

Institute of Actuaries of India Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2018 Examinations Subject CT3 Probability and Mathematical Statistics Core Technical Syllabus 1 June 2017 Aim The

More information

Contents. Preface to Second Edition Preface to First Edition Abbreviations PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1

Contents. Preface to Second Edition Preface to First Edition Abbreviations PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1 Contents Preface to Second Edition Preface to First Edition Abbreviations xv xvii xix PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1 1 The Role of Statistical Methods in Modern Industry and Services

More information

HANDBOOK OF APPLICABLE MATHEMATICS

HANDBOOK OF APPLICABLE MATHEMATICS HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester

More information

Exact two-sample nonparametric test for quantile difference between two populations based on ranked set samples

Exact two-sample nonparametric test for quantile difference between two populations based on ranked set samples Ann Inst Stat Math (2009 61:235 249 DOI 10.1007/s10463-007-0141-5 Exact two-sample nonparametric test for quantile difference between two populations based on ranked set samples Omer Ozturk N. Balakrishnan

More information

RANKED SET SAMPLING FOR ENVIRONMENTAL STUDIES

RANKED SET SAMPLING FOR ENVIRONMENTAL STUDIES . RANKED SET SAMPLING FOR ENVIRONMENTAL STUDIES 000 JAYANT V. DESHPANDE Indian Institute of Science Education and Research, Pune - 411021, India Talk Delivered at the INDO-US Workshop on Environmental

More information

PARAMETRIC TESTS OF PERFECT JUDGMENT RANKING BASED ON ORDERED RANKED SET SAMPLES

PARAMETRIC TESTS OF PERFECT JUDGMENT RANKING BASED ON ORDERED RANKED SET SAMPLES REVSTAT Statistical Journal Volume 16, Number 4, October 2018, 463 474 PARAMETRIC TESTS OF PERFECT JUDGMENT RANKING BASED ON ORDERED RANKED SET SAMPLES Authors: Ehsan Zamanzade Department of Statistics,

More information

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations John R. Michael, Significance, Inc. and William R. Schucany, Southern Methodist University The mixture

More information

3 Joint Distributions 71

3 Joint Distributions 71 2.2.3 The Normal Distribution 54 2.2.4 The Beta Density 58 2.3 Functions of a Random Variable 58 2.4 Concluding Remarks 64 2.5 Problems 64 3 Joint Distributions 71 3.1 Introduction 71 3.2 Discrete Random

More information

RANK-SUM TEST FOR TWO-SAMPLE LOCATION PROBLEM UNDER ORDER RESTRICTED RANDOMIZED DESIGN

RANK-SUM TEST FOR TWO-SAMPLE LOCATION PROBLEM UNDER ORDER RESTRICTED RANDOMIZED DESIGN RANK-SUM TEST FOR TWO-SAMPLE LOCATION PROBLEM UNDER ORDER RESTRICTED RANDOMIZED DESIGN DISSERTATION Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate

More information

Examining the accuracy of the normal approximation to the poisson random variable

Examining the accuracy of the normal approximation to the poisson random variable Eastern Michigan University DigitalCommons@EMU Master's Theses and Doctoral Dissertations Master's Theses, and Doctoral Dissertations, and Graduate Capstone Projects 2009 Examining the accuracy of the

More information

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 1.1 The Probability Model...1 1.2 Finite Discrete Models with Equally Likely Outcomes...5 1.2.1 Tree Diagrams...6 1.2.2 The Multiplication Principle...8

More information

Estimation of Quantiles

Estimation of Quantiles 9 Estimation of Quantiles The notion of quantiles was introduced in Section 3.2: recall that a quantile x α for an r.v. X is a constant such that P(X x α )=1 α. (9.1) In this chapter we examine quantiles

More information

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances Advances in Decision Sciences Volume 211, Article ID 74858, 8 pages doi:1.1155/211/74858 Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances David Allingham 1 andj.c.w.rayner

More information

A Simulation Study on Confidence Interval Procedures of Some Mean Cumulative Function Estimators

A Simulation Study on Confidence Interval Procedures of Some Mean Cumulative Function Estimators Statistics Preprints Statistics -00 A Simulation Study on Confidence Interval Procedures of Some Mean Cumulative Function Estimators Jianying Zuo Iowa State University, jiyizu@iastate.edu William Q. Meeker

More information

STATISTICAL INFERENCE IN ACCELERATED LIFE TESTING WITH GEOMETRIC PROCESS MODEL. A Thesis. Presented to the. Faculty of. San Diego State University

STATISTICAL INFERENCE IN ACCELERATED LIFE TESTING WITH GEOMETRIC PROCESS MODEL. A Thesis. Presented to the. Faculty of. San Diego State University STATISTICAL INFERENCE IN ACCELERATED LIFE TESTING WITH GEOMETRIC PROCESS MODEL A Thesis Presented to the Faculty of San Diego State University In Partial Fulfillment of the Requirements for the Degree

More information

Learning Objectives for Stat 225

Learning Objectives for Stat 225 Learning Objectives for Stat 225 08/20/12 Introduction to Probability: Get some general ideas about probability, and learn how to use sample space to compute the probability of a specific event. Set Theory:

More information

Subject CS1 Actuarial Statistics 1 Core Principles

Subject CS1 Actuarial Statistics 1 Core Principles Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and

More information

Consideration of prior information in the inference for the upper bound earthquake magnitude - submitted major revision

Consideration of prior information in the inference for the upper bound earthquake magnitude - submitted major revision Consideration of prior information in the inference for the upper bound earthquake magnitude Mathias Raschke, Freelancer, Stolze-Schrey-Str., 6595 Wiesbaden, Germany, E-Mail: mathiasraschke@t-online.de

More information

Contents. Acknowledgments. xix

Contents. Acknowledgments. xix Table of Preface Acknowledgments page xv xix 1 Introduction 1 The Role of the Computer in Data Analysis 1 Statistics: Descriptive and Inferential 2 Variables and Constants 3 The Measurement of Variables

More information

COPYRIGHTED MATERIAL CONTENTS. Preface Preface to the First Edition

COPYRIGHTED MATERIAL CONTENTS. Preface Preface to the First Edition Preface Preface to the First Edition xi xiii 1 Basic Probability Theory 1 1.1 Introduction 1 1.2 Sample Spaces and Events 3 1.3 The Axioms of Probability 7 1.4 Finite Sample Spaces and Combinatorics 15

More information

Experimental designs for multiple responses with different models

Experimental designs for multiple responses with different models Graduate Theses and Dissertations Graduate College 2015 Experimental designs for multiple responses with different models Wilmina Mary Marget Iowa State University Follow this and additional works at:

More information

Recall the Basics of Hypothesis Testing

Recall the Basics of Hypothesis Testing Recall the Basics of Hypothesis Testing The level of significance α, (size of test) is defined as the probability of X falling in w (rejecting H 0 ) when H 0 is true: P(X w H 0 ) = α. H 0 TRUE H 1 TRUE

More information

A nonparametric two-sample wald test of equality of variances

A nonparametric two-sample wald test of equality of variances University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 211 A nonparametric two-sample wald test of equality of variances David

More information

Constant Stress Partially Accelerated Life Test Design for Inverted Weibull Distribution with Type-I Censoring

Constant Stress Partially Accelerated Life Test Design for Inverted Weibull Distribution with Type-I Censoring Algorithms Research 013, (): 43-49 DOI: 10.593/j.algorithms.01300.0 Constant Stress Partially Accelerated Life Test Design for Mustafa Kamal *, Shazia Zarrin, Arif-Ul-Islam Department of Statistics & Operations

More information

Inferences about Parameters of Trivariate Normal Distribution with Missing Data

Inferences about Parameters of Trivariate Normal Distribution with Missing Data Florida International University FIU Digital Commons FIU Electronic Theses and Dissertations University Graduate School 7-5-3 Inferences about Parameters of Trivariate Normal Distribution with Missing

More information

Rank-sum Test Based on Order Restricted Randomized Design

Rank-sum Test Based on Order Restricted Randomized Design Rank-sum Test Based on Order Restricted Randomized Design Omer Ozturk and Yiping Sun Abstract One of the main principles in a design of experiment is to use blocking factors whenever it is possible. On

More information

Irr. Statistical Methods in Experimental Physics. 2nd Edition. Frederick James. World Scientific. CERN, Switzerland

Irr. Statistical Methods in Experimental Physics. 2nd Edition. Frederick James. World Scientific. CERN, Switzerland Frederick James CERN, Switzerland Statistical Methods in Experimental Physics 2nd Edition r i Irr 1- r ri Ibn World Scientific NEW JERSEY LONDON SINGAPORE BEIJING SHANGHAI HONG KONG TAIPEI CHENNAI CONTENTS

More information

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Sahar Z Zangeneh Robert W. Keener Roderick J.A. Little Abstract In Probability proportional

More information

MATH4427 Notebook 4 Fall Semester 2017/2018

MATH4427 Notebook 4 Fall Semester 2017/2018 MATH4427 Notebook 4 Fall Semester 2017/2018 prepared by Professor Jenny Baglivo c Copyright 2009-2018 by Jenny A. Baglivo. All Rights Reserved. 4 MATH4427 Notebook 4 3 4.1 K th Order Statistics and Their

More information

TIME SERIES ANALYSIS. Forecasting and Control. Wiley. Fifth Edition GWILYM M. JENKINS GEORGE E. P. BOX GREGORY C. REINSEL GRETA M.

TIME SERIES ANALYSIS. Forecasting and Control. Wiley. Fifth Edition GWILYM M. JENKINS GEORGE E. P. BOX GREGORY C. REINSEL GRETA M. TIME SERIES ANALYSIS Forecasting and Control Fifth Edition GEORGE E. P. BOX GWILYM M. JENKINS GREGORY C. REINSEL GRETA M. LJUNG Wiley CONTENTS PREFACE TO THE FIFTH EDITION PREFACE TO THE FOURTH EDITION

More information

c 2011 JOSHUA DAVID JOHNSTON ALL RIGHTS RESERVED

c 2011 JOSHUA DAVID JOHNSTON ALL RIGHTS RESERVED c 211 JOSHUA DAVID JOHNSTON ALL RIGHTS RESERVED ANALYTICALLY AND NUMERICALLY MODELING RESERVOIR-EXTENDED POROUS SLIDER AND JOURNAL BEARINGS INCORPORATING CAVITATION EFFECTS A Dissertation Presented to

More information

Empirical Likelihood Inference for Two-Sample Problems

Empirical Likelihood Inference for Two-Sample Problems Empirical Likelihood Inference for Two-Sample Problems by Ying Yan A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of Mathematics in Statistics

More information

Statistics for scientists and engineers

Statistics for scientists and engineers Statistics for scientists and engineers February 0, 006 Contents Introduction. Motivation - why study statistics?................................... Examples..................................................3

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Empirical Likelihood Methods for Sample Survey Data: An Overview

Empirical Likelihood Methods for Sample Survey Data: An Overview AUSTRIAN JOURNAL OF STATISTICS Volume 35 (2006), Number 2&3, 191 196 Empirical Likelihood Methods for Sample Survey Data: An Overview J. N. K. Rao Carleton University, Ottawa, Canada Abstract: The use

More information

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics DETAILED CONTENTS About the Author Preface to the Instructor To the Student How to Use SPSS With This Book PART I INTRODUCTION AND DESCRIPTIVE STATISTICS 1. Introduction to Statistics 1.1 Descriptive and

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

Likelihood ratio testing for zero variance components in linear mixed models

Likelihood ratio testing for zero variance components in linear mixed models Likelihood ratio testing for zero variance components in linear mixed models Sonja Greven 1,3, Ciprian Crainiceanu 2, Annette Peters 3 and Helmut Küchenhoff 1 1 Department of Statistics, LMU Munich University,

More information

5 Years (10 Semester) Integrated UG/PG Program in Physics & Electronics

5 Years (10 Semester) Integrated UG/PG Program in Physics & Electronics Courses Offered: 5 Years (10 ) Integrated UG/PG Program in Physics & Electronics 2 Years (4 ) Course M. Sc. Physics (Specialization in Material Science) In addition to the presently offered specialization,

More information

Model Assisted Survey Sampling

Model Assisted Survey Sampling Carl-Erik Sarndal Jan Wretman Bengt Swensson Model Assisted Survey Sampling Springer Preface v PARTI Principles of Estimation for Finite Populations and Important Sampling Designs CHAPTER 1 Survey Sampling

More information

Hybrid Censoring; An Introduction 2

Hybrid Censoring; An Introduction 2 Hybrid Censoring; An Introduction 2 Debasis Kundu Department of Mathematics & Statistics Indian Institute of Technology Kanpur 23-rd November, 2010 2 This is a joint work with N. Balakrishnan Debasis Kundu

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

Multivariate Distribution Models

Multivariate Distribution Models Multivariate Distribution Models Model Description While the probability distribution for an individual random variable is called marginal, the probability distribution for multiple random variables is

1 Degree distributions and data

1 Degree distributions and data 1 Degree distributions and data A great deal of effort is often spent trying to identify what functional form best describes the degree distribution of a network, particularly the upper tail of that distribution.

Contents. Chapter 1 Vector Spaces. Foreword... (vii) Message...(ix) Preface...(xi)

Contents. Chapter 1 Vector Spaces. Foreword... (vii) Message...(ix) Preface...(xi) (xiii) Contents Foreword... (vii) Message...(ix) Preface...(xi) Chapter 1 Vector Spaces Vector space... 1 General Properties of vector spaces... 5 Vector Subspaces... 7 Algebra of subspaces... 11 Linear

Name :. Roll No. :... Invigilator s Signature :.. CS/B.TECH (NEW)(CSE/IT)/SEM-4/M-401/ MATHEMATICS - III

Name :. Roll No. :... Invigilator s Signature :.. CS/B.TECH (NEW)(CSE/IT)/SEM-4/M-401/ MATHEMATICS - III Name :. Roll No. :..... Invigilator s Signature :.. 202 MATHEMATICS - III Time Allotted : 3 Hours Full Marks : 70 The figures in the margin indicate full marks. Candidates are required to give their answers

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions Journal of Modern Applied Statistical Methods Volume 8 Issue 1 Article 13 5-1-2009 Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error

Simulating Realistic Ecological Count Data

Simulating Realistic Ecological Count Data 1 / 76 Simulating Realistic Ecological Count Data Lisa Madsen Dave Birkes Oregon State University Statistics Department Seminar May 2, 2011 2 / 76 Outline 1 Motivation Example: Weed Counts 2 Pearson Correlation

The performance of estimation methods for generalized linear mixed models

The performance of estimation methods for generalized linear mixed models University of Wollongong Research Online University of Wollongong Thesis Collection 1954-2016 University of Wollongong Thesis Collections 2008 The performance of estimation methods for generalized linear

Fundamentals of Applied Probability and Random Processes

Fundamentals of Applied Probability and Random Processes Fundamentals of Applied Probability and Random Processes, 2nd Edition Oliver C. Ibe University of Massachusetts, Lowell, Massachusetts AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS

DESIGN AND ANALYSIS OF EXPERIMENTS Third Edition

DESIGN AND ANALYSIS OF EXPERIMENTS Third Edition DESIGN AND ANALYSIS OF EXPERIMENTS Third Edition Douglas C. Montgomery ARIZONA STATE UNIVERSITY JOHN WILEY & SONS New York Chichester Brisbane Toronto Singapore Contents Chapter 1. Introduction 1-1 What

Math 494: Mathematical Statistics

Math 494: Mathematical Statistics Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/

Step-Stress Models and Associated Inference

Step-Stress Models and Associated Inference Department of Mathematics & Statistics Indian Institute of Technology Kanpur August 19, 2014 Outline Accelerated Life Test 1 Accelerated Life Test 2 3 4 5 6 7 Outline Accelerated Life Test 1 Accelerated

Design and Implementation of CUSUM Exceedance Control Charts for Unknown Location

Design and Implementation of CUSUM Exceedance Control Charts for Unknown Location Design and Implementation of CUSUM Exceedance Control Charts for Unknown Location MARIEN A. GRAHAM Department of Statistics University of Pretoria South Africa marien.graham@up.ac.za S. CHAKRABORTI Department

Contents. Set Theory. Functions and its Applications CHAPTER 1 CHAPTER 2. Preface... (v)

Contents. Set Theory. Functions and its Applications CHAPTER 1 CHAPTER 2. Preface... (v) (vii) Preface... (v) CHAPTER 1 Set Theory Definition of Set... 1 Roster, Tabular or Enumeration Form... 1 Set builder Form... 2 Union of Set... 5 Intersection of Sets... 9 Distributive Laws of Unions and

Transition Passage to Descriptive Statistics 28

Transition Passage to Descriptive Statistics 28 viii Preface xiv chapter 1 Introduction 1 Disciplines That Use Quantitative Data 5 What Do You Mean, Statistics? 6 Statistics: A Dynamic Discipline 8 Some Terminology 9 Problems and Answers 12 Scales of

Integrated reliable and robust design

Integrated reliable and robust design Scholars' Mine Masters Theses Student Research & Creative Works Spring 011 Integrated reliable and robust design Gowrishankar Ravichandran Follow this and additional works at: http://scholarsmine.mst.edu/masters_theses

Frequency Analysis & Probability Plots

Frequency Analysis & Probability Plots Note Packet #14 Frequency Analysis & Probability Plots CEE 3710 October 0, 017 Frequency Analysis Process by which engineers formulate magnitude of design events (i.e. 100 year flood) or assess risk associated

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Statistics - Lecture One. Outline. Charlotte Wickham  1. Basic ideas about estimation Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence

Monte Carlo Studies. The response in a Monte Carlo study is a random variable.

Monte Carlo Studies. The response in a Monte Carlo study is a random variable. Monte Carlo Studies The response in a Monte Carlo study is a random variable. The response in a Monte Carlo study has a variance that comes from the variance of the stochastic elements in the data-generating

Probability for Statistics and Machine Learning

Probability for Statistics and Machine Learning Springer Anirban DasGupta Probability for Statistics and Machine Learning Fundamentals and Advanced Topics Contents Suggested Courses with Different Themes ... xix 1 Review of Univariate

New Bayesian methods for model comparison

New Bayesian methods for model comparison Back to the future New Bayesian methods for model comparison Murray Aitkin murray.aitkin@unimelb.edu.au Department of Mathematics and Statistics The University of Melbourne Australia Bayesian Model Comparison

On Comparison of Some Variation of Ranked Set Sampling (Tentang Perbandingan Beberapa Variasi Pensampelan Set Terpangkat)

On Comparison of Some Variation of Ranked Set Sampling (Tentang Perbandingan Beberapa Variasi Pensampelan Set Terpangkat) Sains Malaysiana 40(4)(2011): 397 401 On Comparison of Some Variation of Ranked Set Sampling (Tentang Perbandingan Beberapa Variasi Pensampelan Set Terpangkat) KAMARULZAMAN IBRAHIM* ABSTRACT Many sampling

NON-NUMERICAL RANKING BASED ON PAIRWISE COMPARISONS

NON-NUMERICAL RANKING BASED ON PAIRWISE COMPARISONS NON-NUMERICAL RANKING BASED ON PAIRWISE COMPARISONS By Yun Zhai, M.Sc. A Thesis Submitted to the School of Graduate Studies in partial fulfilment of the requirements for the degree of Ph.D. Department

Bayesian inference for sample surveys. Roderick Little Module 2: Bayesian models for simple random samples

Bayesian inference for sample surveys. Roderick Little Module 2: Bayesian models for simple random samples Bayesian inference for sample surveys Roderick Little Module 2: Bayesian models for simple random samples Superpopulation Modeling: Estimating parameters Various principles: least squares, method of moments,

Appendix A. Math Reviews 03Jan2007. A.1 From Simple to Complex. Objectives. 1. Review tools that are needed for studying models for CLDVs.

Appendix A. Math Reviews 03Jan2007. A.1 From Simple to Complex. Objectives. 1. Review tools that are needed for studying models for CLDVs. Appendix A Math Reviews 03Jan2007 Objectives 1. Review tools that are needed for studying models for CLDVs. 2. Get you used to the notation that will be used. Readings 1. Read this appendix before class. 2. Pay

Simultaneous Prediction Intervals for the (Log)- Location-Scale Family of Distributions

Simultaneous Prediction Intervals for the (Log)- Location-Scale Family of Distributions Statistics Preprints Statistics 10-2014 Simultaneous Prediction Intervals for the (Log)- Location-Scale Family of Distributions Yimeng Xie Virginia Tech Yili Hong Virginia Tech Luis A. Escobar Louisiana

Distribution Fitting (Censored Data)

Distribution Fitting (Censored Data) Distribution Fitting (Censored Data) Summary... 1 Data Input... 2 Analysis Summary... 3 Analysis Options... 4 Goodness-of-Fit Tests... 6 Frequency Histogram... 8 Comparison of Alternative Distributions...

18.05 Practice Final Exam

18.05 Practice Final Exam No calculators. 18.05 Practice Final Exam Number of problems 16 concept questions, 16 problems. Simplifying expressions Unless asked to explicitly, you don t need to simplify complicated expressions. For

1. 4 2y 1 2 = x = x 1 2 x + 1 = x x + 1 = x = 6. w = 2. 5 x

1. 4 2y 1 2 = x = x 1 2 x + 1 = x x + 1 = x = 6. w = 2. 5 x .... VII x + x + = x x x 8 x x = x + a = a + x x = x + x x Solve the absolute value equations.. z = 8. x + 7 =. x =. x =. y = 7 + y VIII Solve the exponential equations.. 0 x = 000. 0 x+ = 00. x+ = 8.

Bayesian Estimation of log N log S

Bayesian Estimation of log N log S Bayesian Estimation of log N log S Paul D. Baines Department of Statistics University of California, Davis May 10th, 2013 Introduction Project Goals Develop a comprehensive method to infer (properties

AN INTRODUCTION TO PROBABILITY AND STATISTICS

AN INTRODUCTION TO PROBABILITY AND STATISTICS AN INTRODUCTION TO PROBABILITY AND STATISTICS WILEY SERIES IN PROBABILITY AND STATISTICS Established by WALTER A. SHEWHART and SAMUEL S. WILKS Editors: David J. Balding, Noel A. C. Cressie, Garrett M.

Problem 1 (20) Log-normal. f(x) Cauchy

Problem 1 (20) Log-normal. f(x) Cauchy ORF 245. Rigollet Date: 11/21/2008 Problem 1 (20) [Figure: density plots of f(x) against x; panel titles include Normal (with mean -1) and Negative-exponential]

STAT Section 2.1: Basic Inference. Basic Definitions

STAT Section 2.1: Basic Inference. Basic Definitions STAT 518 --- Section 2.1: Basic Inference Basic Definitions Population: The collection of all the individuals of interest. This collection may be or even. Sample: A collection of elements of the population.

Test Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics

Test Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics Test Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics The candidates for the research course in Statistics will have to take two shortanswer type tests

Testing Statistical Hypotheses

Testing Statistical Hypotheses E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions

STATISTICS SYLLABUS UNIT I

STATISTICS SYLLABUS UNIT I STATISTICS SYLLABUS UNIT I (Probability Theory) Definition Classical and axiomatic approaches.laws of total and compound probability, conditional probability, Bayes Theorem. Random variable and its distribution

What to do today (Nov 22, 2018)?

What to do today (Nov 22, 2018)? What to do today (Nov 22, 2018)? Part 1. Introduction and Review (Chp 1-5) Part 2. Basic Statistical Inference (Chp 6-9) Part 3. Important Topics in Statistics (Chp 10-13) Part 4. Further Topics (Selected

ECLT 5810 Data Preprocessing. Prof. Wai Lam

ECLT 5810 Data Preprocessing. Prof. Wai Lam ECLT 5810 Data Preprocessing Prof. Wai Lam Why Data Preprocessing? Data in the real world is imperfect incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate

Foundations of Probability and Statistics

Foundations of Probability and Statistics Foundations of Probability and Statistics William C. Rinaman Le Moyne College Syracuse, New York Saunders College Publishing Harcourt Brace College Publishers Fort Worth Philadelphia San Diego New York

STATISTICS ( CODE NO. 08 ) PAPER I PART - I

STATISTICS ( CODE NO. 08 ) PAPER I PART - I STATISTICS ( CODE NO. 08 ) PAPER I PART - I 1. Descriptive Statistics Types of data - Concepts of a Statistical population and sample from a population ; qualitative and quantitative data ; nominal and

and Comparison with NPMLE

and Comparison with NPMLE NONPARAMETRIC BAYES ESTIMATOR OF SURVIVAL FUNCTIONS FOR DOUBLY/INTERVAL CENSORED DATA and Comparison with NPMLE Mai Zhou Department of Statistics, University of Kentucky, Lexington, KY 40506 USA http://ms.uky.edu/

Motivation Scale Mixtures of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University

Motivation Scale Mixtures of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University Econ 690 Purdue University In virtually all of the previous lectures, our models have made use of normality assumptions. From a computational point of view, the reason for this assumption is clear: combined

Measurement And Uncertainty

Measurement And Uncertainty Measurement And Uncertainty Based on Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results, NIST Technical Note 1297, 1994 Edition PHYS 407 1 Measurement approximates or

NAG Library Chapter Introduction. G08 Nonparametric Statistics

NAG Library Chapter Introduction. G08 Nonparametric Statistics NAG Library Chapter Introduction G08 Nonparametric Statistics Contents 1 Scope of the Chapter.... 2 2 Background to the Problems... 2 2.1 Parametric and Nonparametric Hypothesis Testing... 2 2.2 Types

Lifetime prediction and confidence bounds in accelerated degradation testing for lognormal response distributions with an Arrhenius rate relationship

Lifetime prediction and confidence bounds in accelerated degradation testing for lognormal response distributions with an Arrhenius rate relationship Scholars' Mine Doctoral Dissertations Student Research & Creative Works Spring 01 Lifetime prediction and confidence bounds in accelerated degradation testing for lognormal response distributions with

A CONDITIONALLY-UNBIASED ESTIMATOR OF POPULATION SIZE BASED ON PLANT-CAPTURE IN CONTINUOUS TIME. I.B.J. Goudie and J. Ashbridge

A CONDITIONALLY-UNBIASED ESTIMATOR OF POPULATION SIZE BASED ON PLANT-CAPTURE IN CONTINUOUS TIME. I.B.J. Goudie and J. Ashbridge This is an electronic version of an article published in Communications in Statistics Theory and Methods, 1532-415X, Volume 29, Issue 11, 2000, Pages 2605-2619. The published article is available online

The bootstrap. Patrick Breheny. December 6. The empirical distribution function The bootstrap

The bootstrap. Patrick Breheny. December 6. The empirical distribution function The bootstrap Patrick Breheny December 6 Patrick Breheny BST 764: Applied Statistical Modeling 1/21 The empirical distribution function Suppose X ~ F, where F(x) = Pr(X ≤ x) is a distribution function, and we wish to estimate

Statistical. Psychology

Statistical. Psychology SEVENTH EDITION Statistical Methods for Psychology David C. Howell University of Vermont WADSWORTH CENGAGE Learning Australia Brazil Japan Korea Mexico Singapore

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis. 401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis

18.05 Final Exam. Good luck! Name. No calculators. Number of problems 16 concept questions, 16 problems, 21 pages

18.05 Final Exam. Good luck! Name. No calculators. Number of problems 16 concept questions, 16 problems, 21 pages Name No calculators. 18.05 Final Exam Number of problems 16 concept questions, 16 problems, 21 pages Extra paper If you need more space we will provide some blank paper. Indicate clearly that your solution

Correlated and Interacting Predictor Omission for Linear and Logistic Regression Models

Correlated and Interacting Predictor Omission for Linear and Logistic Regression Models Clemson University TigerPrints All Dissertations Dissertations 8-207 Correlated and Interacting Predictor Omission for Linear and Logistic Regression Models Emily Nystrom Clemson University, emily.m.nystrom@gmail.com

Modelling Under Risk and Uncertainty

Modelling Under Risk and Uncertainty Modelling Under Risk and Uncertainty An Introduction to Statistical, Phenomenological and Computational Methods Etienne de Rocquigny Ecole Centrale Paris, Universite Paris-Saclay, France WILEY A John Wiley

Test of the Correlation Coefficient in Bivariate Normal Populations Using Ranked Set Sampling

Test of the Correlation Coefficient in Bivariate Normal Populations Using Ranked Set Sampling JIRSS (05) Vol. 4, No., pp -3 Test of the Correlation Coefficient in Bivariate Normal Populations Using Ranked Set Sampling Nader Nematollahi, Reza Shahi Department of Statistics, Allameh Tabataba i University,

Inference on distributions and quantiles using a finite-sample Dirichlet process

Inference on distributions and quantiles using a finite-sample Dirichlet process Dirichlet IDEAL Theory/methods Simulations Inference on distributions and quantiles using a finite-sample Dirichlet process David M. Kaplan University of Missouri Matt Goldman UC San Diego Midwest Econometrics

Two-by-two ANOVA: Global and Graphical Comparisons Based on an Extension of the Shift Function

Two-by-two ANOVA: Global and Graphical Comparisons Based on an Extension of the Shift Function Journal of Data Science 7(2009), 459-468 Two-by-two ANOVA: Global and Graphical Comparisons Based on an Extension of the Shift Function Rand R. Wilcox University of Southern California Abstract: When comparing

Continuous Univariate Distributions

Continuous Univariate Distributions Continuous Univariate Distributions Volume 1 Second Edition NORMAN L. JOHNSON University of North Carolina Chapel Hill, North Carolina SAMUEL KOTZ University of Maryland College Park, Maryland N. BALAKRISHNAN

Statistics: Learning models from data

Statistics: Learning models from data DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial
