Effectiveness of the hybrid Levine equipercentile and modified frequency estimation equating methods under the common-item nonequivalent groups design

University of Iowa
Iowa Research Online
Theses and Dissertations, 2007

Effectiveness of the hybrid Levine equipercentile and modified frequency estimation equating methods under the common-item nonequivalent groups design

Jianlin Hou, University of Iowa

Copyright 2007 Jianlin Hou

This dissertation is available at Iowa Research Online.

Recommended Citation: Hou, Jianlin. "Effectiveness of the hybrid Levine equipercentile and modified frequency estimation equating methods under the common-item nonequivalent groups design." PhD (Doctor of Philosophy) thesis, University of Iowa, 2007.

EFFECTIVENESS OF THE HYBRID LEVINE EQUIPERCENTILE AND MODIFIED FREQUENCY ESTIMATION EQUATING METHODS UNDER THE COMMON-ITEM NONEQUIVALENT GROUPS DESIGN

by Jianlin Hou

An Abstract of a thesis submitted in partial fulfillment of the requirements for the Doctor of Philosophy degree in Psychological and Quantitative Foundations (Educational Measurement and Statistics) in the Graduate College of The University of Iowa

December 2007

Thesis Supervisor: Professor Walter P. Vispoel

ABSTRACT

The purpose of this study was to evaluate the effectiveness of the hybrid Levine equipercentile (Hybrid LE) and modified frequency estimation (MFE) equating methods in improving accuracy of equating as compared to the percentile rank frequency estimation (FE), kernel frequency estimation (Kernel FE) and percentile rank chained equipercentile (CE) equating methods under the common-item nonequivalent groups (CINEG) design. The methods were compared under a wide variety of simulated conditions with log-linear pre-smoothing. The simulated conditions reflected differences in sample size, group proficiency, test length, ratio of common items and the similarity of form difficulty. An item response theory (IRT) model was used to define the true equating criterion, simulate group differences, and generate response data.

Considered collectively, the simulation results revealed that the Hybrid LE and MFE methods performed best under most simulation conditions. When group proficiency differed only in the mean, the MFE method produced the lowest equating error. When the group proficiency differed in both mean and variance, the Hybrid LE method yielded the lowest error. Overall, the Hybrid LE method yielded less bias but greater standard error of equating (SEE) than the MFE method except at the two ends of the scale score. Aside from method effects, group proficiency differences had the most impact on equating error. Equating error also decreased as the common item ratio increased, sample size increased and test length decreased. Form difficulty, in contrast, had little effect on equating error.

EFFECTIVENESS OF THE HYBRID LEVINE EQUIPERCENTILE AND MODIFIED FREQUENCY ESTIMATION EQUATING METHODS UNDER THE COMMON-ITEM NONEQUIVALENT GROUPS DESIGN

by Jianlin Hou

A thesis submitted in partial fulfillment of the requirements for the Doctor of Philosophy degree in Psychological and Quantitative Foundations (Educational Measurement and Statistics) in the Graduate College of The University of Iowa

December 2007

Thesis Supervisor: Professor Walter P. Vispoel

Copyright by JIANLIN HOU 2007 All Rights Reserved

Graduate College
The University of Iowa
Iowa City, Iowa

CERTIFICATE OF APPROVAL

Ph.D. THESIS

This is to certify that the Ph.D. thesis of Jianlin Hou has been approved by the Examining Committee for the thesis requirement for the Doctor of Philosophy degree in Psychological and Quantitative Foundations (Educational Measurement and Statistics) at the December 2007 graduation.

Thesis Committee: Walter P. Vispoel, Thesis Supervisor; Tianyou Wang; Robert L. Brennan; Timothy Ansley; Kung-Sik Chan

To Zhongwei, Tony and My parents

ACKNOWLEDGMENTS

First, I would like to express my deepest gratitude to my advisor and thesis supervisor, Dr. Walter P. Vispoel, for his mentorship during my studies in the Educational Measurement and Statistics program. He was always supportive and generous when I was in need. This dissertation would not have been possible without his expert guidance, advice, time, effort, and patience. I am also deeply grateful to Dr. Tianyou Wang for helping me select the dissertation topic, guiding the design of the study, sharing his computer program, and helping throughout the completion of the study. I am also thankful to Dr. Robert L. Brennan for his insightful comments and suggestions that made the dissertation better, and for his excellent supervision during my Master's degree program. I would like to thank the other committee members, Drs. Timothy Ansley and Kung-Sik Chan, for their valuable time and input on the dissertation.

I would like to thank ACT, Inc., Iowa City, for giving me an opportunity to enhance my research experience, which contributed considerably to my learning, and for providing me with financial support. I especially thank Drs. Deborah Harris and Xiaohong Gao for their excellent supervision and guidance in educational measurement during my assistantships at ACT, Inc.

Finally, this dissertation would not have been possible without the love and patience of my family. My husband Zhongwei deserves my special recognition. His support, encouragement, quiet patience and unwavering love were undeniably the bedrock upon which the past ten years of my life have been built. I am also indebted to my parents for their unconditional love, support and encouragement. Last but not least, I would like to thank all my teachers, classmates, and friends during every phase of my study. I have learned much through their teaching and help.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF NOTATIONS
CHAPTER I INTRODUCTION
    Common-Item Nonequivalent Group Design
    Five Equipercentile Methods
    The Frequency Estimation Method
    The Chained Equipercentile Method
    The Kernel Frequency Estimation Method
    The Hybrid Levine Equipercentile Equating Method
    The Modified Frequency Estimation Method
    Equating Error
    The Need for This Study
    Research Purposes and Questions
    Comparison of Bias in Equating
    Comparison of Standard Errors of Equating
    Comparisons of Root Mean Square Error of Equating
    Suggestions for a Preferred Equating Method in Practice
CHAPTER II LITERATURE REVIEW
    Equipercentile Equating
    Common-Item Nonequivalent Group Design
    The Frequency Estimation Method
    The Basic Assumption for the Frequency Method
    Procedures of the Frequency Estimation Method
    A Problem with the Basic Assumption for the FE Method
    The Chained Equipercentile Equating
    The Kernel Equating with Frequency Estimation
    Hybrid Levine Equipercentile Equating Method
    Levine Linear Observed Equating
    Two theorems
    The Hybrid Levine Equipercentile Equating Function
    The Modified Frequency Estimation Method
    Equating Errors
    Relevant Studies
    Studies on the FE and CE Methods
    Studies on the Kernel FE, FE and/or CE Methods
    Studies on the Hybrid LE and MFE method
    Summary
CHAPTER III METHODS
    Justification for Using Simulation Studies
    The Criterion Equating Function
    Data Used in This Study
    Study Design and Procedures of the Study
    Independent Variables
    Simulation Steps
    Evaluation Indices
    Study Questions and Hypotheses
CHAPTER IV RESULTS
    Bias
    Method Comparisons for Weighted Absolute Bias
    Method Comparisons for Conditional Bias
    Effects of Group Proficiency, Form Similarity, Ratio of Common Items, Test Length, and Sample Size within the Method
    Standard Error of Equating
    Method Comparisons for Weighted SEE
    Method Comparisons for Conditional SEE
    Effects of Group Proficiency, Form Similarity, Ratio of Common Items, Test Length, and Sample Size within the Method
    Root Mean Square Error of Equating
    Method Comparisons for Weighted RMSE
    Method Comparisons for Conditional RMSE
    Effects of Group Proficiency, Form Similarity, Ratio of Common Items, Test Length, and Sample Size within the Method
    Best Method in Simulation
CHAPTER V SUMMARY AND DISCUSSION
    Summary of Findings
    Discussion and Conclusions
    Limitations and Recommendations for Future Study
APPENDIX A EQUATION DERIVATION
APPENDIX B TABLES
APPENDIX C FIGURES
REFERENCES

LIST OF TABLES

Table 2.1 Important Previous Comparison Studies
Table 3.1 Descriptive Statistics of Difficulty Parameters (b) of Seven Forms
Table 3.2 The Conditions Studied for Each Equating Method
Table 3.3 Descriptive Statistics for 40-item Test Forms with a Common Item Set Length in a Ratio 1:
Table 3.4 Descriptive Statistics for 40-item Test Forms with a Common Item Set Length in a Ratio 1:
Table 3.5 Descriptive Statistics for 80-item Test Forms with a Common Item Set Length in a Ratio 1:
Table 3.6 Descriptive Statistics for 80-item Test Forms with a Common Item Set Length in a Ratio 1:
Table 4.1 Effects of Factors on Bias for Each Method
Table 4.2 Effects of Factors on SEE for Each Method
Table 4.3 Effects of Factors on RMSE for Each Method
Table 4.4 Equating Method that Results in the Smallest Weighted Bias, WSEE, and WRMSE within Each Condition
Table 4.5 Equating Method that Results in the Second Smallest Weighted Bias, WSEE, and WRMSE for Each Condition
Table B1 WAB for the 40-Item Test Form Pairs and N=
Table B2 WAB for the 80-Item Test Form Pairs and N=
Table B3 WAB for the 40-Item Test Form Pairs and N=
Table B4 WAB for the 80-Item Test Form Pairs and N=
Table B5 Overall Summary of Bias for the Five Equating Methods
Table B6 WSEE for the 40-Item Test Form Pairs and N=
Table B7 WSEE for the 80-Item Test Form Pairs and N=
Table B8 WSEE for the 40-Item Test Form Pairs and N=
Table B9 WSEE for the 80-Item Test Form Pairs and N=
Table B10 Overall Summary of WSEE for the Five Equating Methods
Table B11 WRMSE for the 40-Item Test Form Pairs and N=
Table B12 WRMSE for the 80-Item Test Form Pairs and N=
Table B13 WRMSE for the 40-Item Test Form Pairs and N=
Table B14 WRMSE for the 80-Item Test Form Pairs and N=
Table B15 Overall Summary of WRMSE for the Five Equating Methods
Table B16 Ratio of Common Item Effects on WAB with N=
Table B17 Ratio of Common Item Effects on WAB with N=
Table B18 Test Length Effects on WAB with N=
Table B19 Test Length Effects on WAB with N=
Table B20 Sample Size Effects on WAB with 40-Item Test
Table B21 Sample Size Effects on WAB with 80-Item Test
Table B22 Ratio of Common Item Effects on WSEE with N=
Table B23 Ratio of Common Item Effects on WSEE with N=
Table B24 Test Length Effects on WSEE with N=
Table B25 Test Length Effects on WSEE with N=
Table B26 Sample Size Effects on WSEE with 40-Item Test
Table B27 Sample Size Effects on WSEE with 80-Item Test
Table B28 Ratio of Common Item Effects on WRMSE with N=
Table B29 Ratio of Common Item Effects on WRMSE with N=
Table B30 Test Length Effects on WRMSE with N=
Table B31 Test Length Effects on WRMSE with N=
Table B32 Sample Size Effects on WRMSE with 40-Item Test
Table B33 Sample Size Effects on WRMSE with 80-Item Test

LIST OF FIGURES

Figure C1 Bias for the 40-Item Test Form Pairs with 10 Common Items, Small Form Difficulty Difference and N=
Figure C2 Bias for the 40-Item Test Form Pairs with 10 Common Items, Large Form Difficulty Difference and N=
Figure C3 Bias for the 40-Item Test Form Pairs with 20 Common Items, Small Form Difficulty Difference and N=
Figure C4 Bias for the 40-Item Test Form Pairs with 20 Common Items, Large Form Difficulty Difference and N=
Figure C5 Bias for the 80-Item Test Form Pairs with 20 Common Items, Small Form Difficulty Difference and N=
Figure C6 Bias for the 80-Item Test Form Pairs with 20 Common Items, Large Form Difficulty Difference and N=
Figure C7 Bias for the 80-Item Test Form Pairs with 40 Common Items, Small Form Difficulty Difference and N=
Figure C8 Bias for the 80-Item Test Form Pairs with 40 Common Items, Large Form Difficulty Difference and N=
Figure C9 Bias for the 40-Item Test Form Pairs with 10 Common Items, Small Form Difficulty Difference and N=
Figure C10 Bias for the 40-Item Test Form Pairs with 10 Common Items, Large Form Difficulty Difference and N=
Figure C11 Bias for the 40-Item Test Form Pairs with 20 Common Items, Small Form Difficulty Difference and N=
Figure C12 Bias for the 40-Item Test Form Pairs with 20 Common Items, Large Form Difficulty Difference and N=
Figure C13 Bias for the 80-Item Test Form Pairs with 20 Common Items, Small Form Difficulty Difference and N=
Figure C14 Bias for the 80-Item Test Form Pairs with 20 Common Items, Large Form Difficulty Difference and N=
Figure C15 Bias for the 80-Item Test Form Pairs with 40 Common Items, Small Form Difficulty Difference and N=
Figure C16 Bias for the 80-Item Test Form Pairs with 40 Common Items, Large Form Difficulty Difference and N=
Figure C17 SEE for the 40-Item Test Form Pairs with 10 Common Items, Small Form Difficulty Difference and N=
Figure C18 SEE for the 40-Item Test Form Pairs with 10 Common Items, Large Form Difficulty Difference and N=
Figure C19 SEE for the 40-Item Test Form Pairs with 20 Common Items, Small Form Difficulty Difference and N=
Figure C20 SEE for the 40-Item Test Form Pairs with 20 Common Items, Large Form Difficulty Difference and N=
Figure C21 SEE for the 80-Item Test Form Pairs with 20 Common Items, Small Form Difficulty Difference and N=
Figure C22 SEE for the 80-Item Test Form Pairs with 20 Common Items, Large Form Difficulty Difference and N=
Figure C23 SEE for the 80-Item Test Form Pairs with 40 Common Items, Small Form Difficulty Difference and N=
Figure C24 SEE for the 80-Item Test Form Pairs with 40 Common Items, Large Form Difficulty Difference and N=
Figure C25 SEE for the 40-Item Test Form Pairs with 10 Common Items, Small Form Difficulty Difference and N=
Figure C26 SEE for the 40-Item Test Form Pairs with 10 Common Items, Large Form Difficulty Difference and N=
Figure C27 SEE for the 40-Item Test Form Pairs with 20 Common Items, Small Form Difficulty Difference and N=
Figure C28 SEE for the 40-Item Test Form Pairs with 20 Common Items, Large Form Difficulty Difference and N=
Figure C29 SEE for the 80-Item Test Form Pairs with 20 Common Items, Small Form Difficulty Difference and N=
Figure C30 SEE for the 80-Item Test Form Pairs with 20 Common Items, Large Form Difficulty Difference and N=
Figure C31 SEE for the 80-Item Test Form Pairs with 40 Common Items, Small Form Difficulty Difference and N=
Figure C32 SEE for the 80-Item Test Form Pairs with 40 Common Items, Large Form Difficulty Difference and N=
Figure C33 RMSE for the 40-Item Test Form Pairs with 10 Common Items, Small Form Difficulty Difference and N=
Figure C34 RMSE for the 40-Item Test Form Pairs with 10 Common Items, Large Form Difficulty Difference and N=
Figure C35 RMSE for the 40-Item Test Form Pairs with 20 Common Items, Small Form Difficulty Difference and N=
Figure C36 RMSE for the 40-Item Test Form Pairs with 20 Common Items, Large Form Difficulty Difference and N=
Figure C37 RMSE for the 80-Item Test Form Pairs with 20 Common Items, Small Form Difficulty Difference and N=
Figure C38 RMSE for the 80-Item Test Form Pairs with 20 Common Items, Large Form Difficulty Difference and N=
Figure C39 RMSE for the 80-Item Test Form Pairs with 40 Common Items, Small Form Difficulty Difference and N=
Figure C40 RMSE for the 80-Item Test Form Pairs with 40 Common Items, Large Form Difficulty Difference and N=
Figure C41 RMSE for the 40-Item Test Form Pairs with 10 Common Items, Small Form Difficulty Difference and N=
Figure C42 RMSE for the 40-Item Test Form Pairs with 10 Common Items, Large Form Difficulty Difference and N=
Figure C43 RMSE for the 40-Item Test Form Pairs with 20 Common Items, Small Form Difficulty Difference and N=
Figure C44 RMSE for the 40-Item Test Form Pairs with 20 Common Items, Large Form Difficulty Difference and N=
Figure C45 RMSE for the 80-Item Test Form Pairs with 20 Common Items, Small Form Difficulty Difference and N=
Figure C46 RMSE for the 80-Item Test Form Pairs with 20 Common Items, Large Form Difficulty Difference and N=
Figure C47 RMSE for the 80-Item Test Form Pairs with 40 Common Items, Small Form Difficulty Difference and N=
Figure C48 RMSE for the 80-Item Test Form Pairs with 40 Common Items, Large Form Difficulty Difference and N=

LIST OF NOTATIONS

Symbol or Arabic: Explanation
^: Denotes an estimate.
a: Item slope parameter in IRT.
a_X^2: Part of the definition of the kernel equating continuization process.
Bias: The bias of the equating.
b: Item location parameter in IRT.
c: Item pseudochance level parameter in IRT.
C_X: The moments preserved in the marginal distribution of the total score for Form X.
C_V: The moments preserved in the marginal distribution of the total score for the common item set.
C_XV: The cross moments preserved in the bivariate distribution.
E_X, E_V: Error scores in the congeneric model.
e: The equipercentile equating function.
e_Y(x): The Form Y equipercentile equivalent of a Form X score.
e_V1(x): The common item set score equipercentile equivalent of a Form X score for Population 1.
e_Y2(v): The Form Y score equipercentile equivalent of a common item set score V for Population 2.
F(x): The cumulative distribution for X.
f(x): The discrete density function for X.
f(x, v): The joint density of X and V.
f(x|v): The conditional density of x given v.
f_hX(x_j): The height of the pdf of the kernel distribution at x_j.
G(y): The cumulative distribution for Y.
G^-1: The inverse of function G.
g(y): The discrete density for Y.
g(y, v): The joint density of Y and V.
g(y|v): The conditional density of y given v.
h(v): The discrete density for V.
h_X, h_Y: The two bandwidths that are used to define kernel equating continuization of F(x) and G(y).
K: A positive number used in the kernel equating continuization.
l_Ys(x): The Form Y linear equivalent of a Form X score for the synthetic population.
l_Ys(lv)(x): The Form Y linear equivalent of a Form X score for the synthetic population by the Levine method.
P_1: Population taking the new test form X.
P_2: Population taking the old test form Y.
P(x): The percentile rank function for X.
P(X|V): The score distribution of X conditional on V in IRT.
P(X|θ): The score distribution of X conditional on θ in IRT.
P^-1: The inverse of the percentile rank function.
P_ij: Probability of a correct response in IRT.
p_jk: Fitted score probability for total score j and common item score k.
PEN_1: A penalty function used to choose an optimal bandwidth in the continuization process. Defined in Equation (2-26).
PEN_2: A penalty function used to choose an optimal bandwidth in the continuization process. Defined in Equation (2-26).
Q(y): The percentile rank function for Y.
Q^-1: The inverse of the percentile rank function.
R(x): The remainder term in the decomposition of the equipercentile function.
RMSE: Root mean square error of equating.
r(z): The term used in the decomposition of the equipercentile function.
r_j: The discrete relative frequency at score x_j.
SEE: The standard error of equating.
s: Synthetic population.
T: Number correct true score.
t: Realization of number correct true score.
U: Uniform random variable.
V: The random variable indicating raw score on the common items.
v: A realization of V.
X: The random variable indicating raw score on Form X.
w: Population weight.
x: A realization of X.
x_j: A possible score value for X, j = 1 to J.
x*: The score point computed in the Hybrid LE method.
Y: The random variable indicating raw score on Form Y.
y: A realization of Y.

Greek Letters: Explanation
α: The normalizing constant selected in log-linear presmoothing.
β: The parameters to be estimated in log-linear presmoothing.
δ: The true score difference of the common items between Population 1 and Population 2.
γ: Expansion factor in linear equating with the common-item nonequivalent groups design.
θ: Ability in IRT.
ρ: Correlation, such as ρ(X, V).
ρ(X, X'), ρ(V, V'): Reliability.
µ: Mean, as in µ(X), µ(Y) and µ(V).
σ^2: Variance, such as σ^2(X).
σ: Standard deviation, such as σ(X) and σ(V).
σ(X, V): Covariance.
Φ: The normal or Gaussian cdf.
ϕ: The normal or Gaussian pdf.
λ: The congeneric coefficients, as in λ_X, λ_V.

CHAPTER I

INTRODUCTION

To address test security, testing programs regularly produce multiple test forms to administer on different occasions. These forms are similar in content and statistical characteristics but may contain different sets of questions. Because the forms contain different questions, they can vary in difficulty, which may make test scores depend on the particular form used. Although test items may differ, scores from different test forms must be comparable to ensure fairness and consistency on each occasion. Test equating refers to statistical and psychometric methods that adjust for differences in difficulty so that scores obtained on different test forms are comparable (Kolen & Brennan, 2004).

To conduct equating, different designs can be used to collect data, with specialized equating methods available for each design. Three commonly used data collection designs are the random groups design, the single-group design, and the common-item nonequivalent groups design. In the random groups design, a spiraling process is typically used in which alternate examinees are administered alternate forms of the exam at the same time. A single-group design refers to the scenario in which one group is administered both forms to be equated. Typically, the order of administration of the forms is counterbalanced to eliminate order effects. Both designs lead to equivalent groups of examinees taking alternate forms, so that differences in form difficulty are directly reflected in differences in group-level performance.

In the common-item nonequivalent groups (CINEG) design, discrepancies between group scores reflect both group and form difficulty differences. In this design, separately sampled groups of examinees from different populations are administered different test forms that have a subset of items in common.
In this case, strong statistical assumptions for scores across forms are required to adjust for population differences.

Some equating methods are based on the statistical relationships between true scores, such as Levine true score equating (Levine, 1959) and item response theory (IRT) true score equating (Lord, 1965). Other equating methods are based on statistical relationships between observed scores. Linear equating sets the mean and standard deviation of the equated scores of the new form equal to those of the old form for a specified population of examinees, whereas equipercentile equating converts a score on the new form to the score on the old form that has the same percentile rank. The present study focuses exclusively on equipercentile-like methods of observed score equating under the common-item nonequivalent groups (CINEG) design.

Under the CINEG design, two common equipercentile-like equating methods are the frequency estimation method and the chained equipercentile method (Angoff, 1971). The frequency estimation method involves an equipercentile equating of the marginal distributions of the new (X) and old (Y) test forms for a synthetic population, based on statistical assumptions about the invariance of the conditional distribution of X on V (the common item set). The chained equipercentile method equates X to V and V to Y through a chain of two equipercentile equatings.

In performing the equipercentile equating for either method, there are two ways to handle the discreteness of the score distributions. One uses the percentile rank function (Kolen & Brennan, 2004; also known as poststratification equating (PSE), von Davier et al., 2004a); the other, kernel equating (von Davier et al., 2004a), transforms the discrete distributions into continuous distributions using an adjusted Gaussian kernel procedure. Both procedures can be combined with the frequency estimation and chained equipercentile methods, creating four possible equipercentile equating methods for the CINEG design: percentile rank based frequency estimation, percentile rank based chained equipercentile, kernel frequency estimation, and kernel chained equipercentile. For convenience, in this study they are called the FE, CE, Kernel FE and Kernel CE methods, respectively.
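The percentile rank approach described above can be illustrated with a minimal sketch. It assumes integer scores 0..K, strictly positive score probabilities, and the usual convention that each integer score is spread uniformly over the interval from half a point below to half a point above it; the function names and example distributions are hypothetical and are not taken from the study.

```python
import numpy as np

def percentile_rank(x, probs):
    """Percentile rank of a (possibly non-integer) score x on a form
    with integer scores 0..K and score probabilities `probs`."""
    cdf = np.cumsum(probs)
    k = len(probs) - 1
    if x < -0.5:
        return 0.0
    if x >= k + 0.5:
        return 100.0
    x_star = int(np.floor(x + 0.5))              # nearest integer score
    below = cdf[x_star - 1] if x_star > 0 else 0.0
    return 100.0 * (below + (x - (x_star - 0.5)) * probs[x_star])

def inverse_percentile_rank(p, probs):
    """Score whose percentile rank is p (assumes all probs > 0)."""
    cdf = np.cumsum(probs)
    k = len(probs) - 1
    if p >= 100.0:
        return k + 0.5
    # smallest integer score whose cumulative proportion exceeds p/100
    y_star = min(int(np.searchsorted(cdf, p / 100.0, side="right")), k)
    below = cdf[y_star - 1] if y_star > 0 else 0.0
    return (p / 100.0 - below) / probs[y_star] + y_star - 0.5

def equipercentile(x, probs_x, probs_y):
    """Form Y score with the same percentile rank as Form X score x."""
    return inverse_percentile_rank(percentile_rank(x, probs_x), probs_y)

# Hypothetical 5-point score distributions for the two forms.
f_x = [0.1, 0.2, 0.4, 0.2, 0.1]   # Form X
g_y = [0.1, 0.1, 0.2, 0.3, 0.3]   # Form Y (scores run higher, i.e. easier form)
y_equiv = equipercentile(2, f_x, g_y)
```

In this toy case Form Y scores run higher than Form X scores, so a Form X score of 2 maps to a Form Y equivalent above 2. In the FE method the two input distributions would be the estimated synthetic-population marginals rather than raw observed distributions.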

Recently, von Davier et al. (2006b) and Wang & Brennan (2006) developed two additional methods to improve equating accuracy under the CINEG design; they are labeled here as the Hybrid Levine equipercentile equating (Hybrid LE) method (also known as the Hybrid PSE-Levine equipercentile equating function, von Davier et al., 2006b) and the modified frequency estimation (MFE) method (Wang & Brennan, 2006). These methods are based on assumptions about true scores and have not been studied thoroughly. The primary purpose of this dissertation was to evaluate the effectiveness of these two newly developed methods more extensively by comparing them to the FE, CE, and Kernel FE methods under a variety of assessment conditions.

In the remainder of this chapter, the CINEG design is described in greater detail. Next, the five equipercentile-like equating methods under the CINEG design investigated in this study are discussed. Then equating error, the main criterion for comparing methods, is defined. Finally, the need for the study, research purposes and research questions are delineated.

Common-Item Nonequivalent Group Design

In the common-item nonequivalent groups (CINEG) design, samples from two different populations, Population 1 and Population 2, take the new test form X and the old test form Y, possibly on different occasions. The two populations are not required to be equivalent. Because of this, it is critical that the two forms share a common set of test items V to adjust for differences in skill across populations. If the scores on the common items contribute to the total scores of X and Y, the common items are considered internal. If not, they are considered external.

In test equating, the equating function is typically defined for a single population even if two populations are used for the data collection. Braun & Holland (1982) introduced the concept of a synthetic population to define a single population for which an equating relationship is constructed.
The synthetic population is defined as the weighted average of Population 1 and Population 2, where the weights can be defined subjectively.

Five Equipercentile Methods

In the CINEG design, the discrepancy in performance scores between the two sampled groups reflects both group and form difficulty differences. The major task of equating for this design is to separate group and form differences using information from the common item set V. Five equipercentile-like equating methods that can be used under this design were examined in this study: FE, CE, Kernel FE, Hybrid LE, and MFE. These methods are all based on classical test theory and are described in more detail in the sections that follow.

The Frequency Estimation Method

The core idea of equipercentile equating is that scores on the alternate forms are considered equivalent if their corresponding percentile ranks in equivalent groups and/or synthetic groups are equal (Angoff, 1971; Braun & Holland, 1982). To obtain the cumulative distributions of Form X and Form Y for a synthetic population using the frequency estimation (FE) method, the following basic assumption is made: for both Form X and Form Y, the conditional distribution of the total score given each common item score is the same for Population 1 and Population 2. This assumption is the same whether the common items are internal or external. Under this assumption, the FE method involves the following steps:

Step 1: Data collection to get the bivariate score distributions of total score and common item score for both Form X and Form Y;
Step 2: Pre-smoothing the bivariate score distributions (optional);
Step 3: Estimation of the marginal score distributions for the synthetic population based on the invariance assumption of conditional distributions;
Step 4: Using the percentile rank function to continuize the estimated discrete score distributions from Step 3;
Step 5: Performing equipercentile equating using the percentile rank functions from Step 4.

Pre-smoothing, referred to in Step 2 above, is commonly used in score equating. A function (e.g., log-linear; Holland & Thayer, 1987, 2000) is fit to the observed score distribution to reduce irregularities in the data due to sampling fluctuations. Pre-smoothing is required for some equating methods (e.g., the Kernel FE method) but not for others (e.g., the FE method).

The Chained Equipercentile Method

The chained equipercentile (CE) method (Angoff, 1971) treats the common item score as a chain in the linking. Form X scores are equated to V scores for Population 1 using an equipercentile function e_V1(x). Then V scores are equated to Form Y scores for Population 2 using an equipercentile function e_Y2(v). Finally, the two functions are composed to give the equating function e_Y2[e_V1(x)], which converts Form X scores to Form Y scores. This approach assumes that the statistical relationship between observed test scores and common-item scores forms a symmetric linking for the two forms. As with the FE method, pre-smoothing is optional in obtaining the equipercentile function.

The Kernel Frequency Estimation Method

Kernel equating is a unified approach to conducting test equating. It is based on several equipercentile-like equating functions and treats linear equating as a special case. It is called kernel equating because Gaussian kernel smoothing is used to continuize the score distribution by replacing the frequency at each discrete score point with a continuous normal (Gaussian) frequency distribution centered at that point. The kernel frequency estimation (Kernel FE) method has the same basic assumption as the FE method (or the percentile rank frequency estimation method).
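To make the Gaussian kernel continuization concrete, the following is a minimal sketch rather than the procedure used in this study. It assumes the standard mean- and variance-preserving form of the kernel-smoothed cumulative distribution from the kernel equating literature, in which each score point is shrunk toward the mean by a factor a with a^2 = σ^2 / (σ^2 + h^2). The bandwidth value, function names and example distribution are hypothetical; in practice the bandwidth is chosen by minimizing a penalty function.

```python
import math
import numpy as np

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kernel_cdf(x, scores, probs, h):
    """Gaussian-kernel continuized CDF of a discrete score distribution.

    scores: possible score values x_j
    probs:  relative frequencies r_j (summing to 1)
    h:      bandwidth (assumed given here, not optimized)
    """
    scores = np.asarray(scores, dtype=float)
    probs = np.asarray(probs, dtype=float)
    mu = np.sum(probs * scores)
    var = np.sum(probs * (scores - mu) ** 2)
    # Shrinkage factor that keeps the mean and variance of the
    # continuized distribution equal to those of the discrete one.
    a = math.sqrt(var / (var + h * h))
    z = (x - a * scores - (1.0 - a) * mu) / (a * h)
    return float(sum(r * normal_cdf(zj) for r, zj in zip(probs, z)))

# Hypothetical symmetric 5-point distribution and bandwidth.
scores = [0, 1, 2, 3, 4]
probs = [0.1, 0.2, 0.4, 0.2, 0.1]
h = 0.6
median_value = kernel_cdf(2.0, scores, probs, h)
```

Because the continuized distribution is smooth and strictly increasing, it can be inverted numerically, which is what allows an equipercentile function to be computed without the percentile rank convention.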

Generally speaking, the Kernel FE equating steps are similar to those of the FE method, with two main differences. The first is that the Kernel FE method requires pre-smoothing of the bivariate distribution of the total score conditional on the common item score before calculating the cumulative distribution of the synthetic population. In contrast, the FE method can be applied with or without pre-smoothing. The second difference is that the Kernel FE method uses a normal (Gaussian) kernel rather than the percentile rank function to continuize the estimated score distribution.

The Hybrid Levine Equipercentile Equating Method

The hybrid Levine equipercentile (Hybrid LE) equating method, or hybrid PSE-Levine equipercentile equating function (von Davier et al., 2006b), was developed based on the idea that any equipercentile equating function can be decomposed into linear and non-linear portions. Specifically, the linear part is Levine's linear function, and the non-linear part reflects a compatible form that involves the difference between the Kernel FE and Kernel FE linear functions. This method can be viewed as using the Levine linear equating function to replace the kernel linear function in the decomposition of the Kernel FE function. Since the Levine linear observed-score method is based on true score assumptions, we consider the Hybrid LE method to be based on true scores as well. In addition, because the Hybrid LE function is derived from a decomposition of an equipercentile function, it is also an equipercentile-like function under the CINEG design. Logically, if the function works well, the Hybrid LE method is expected to produce more accurate equating results because the Levine observed-score equating method is more accurate than other linear equating methods (Petersen et al., 1982). More details about the steps for implementing the Hybrid LE method are presented in Chapter II.
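The decomposition idea behind the Hybrid LE method can be written schematically as follows. This is a sketch based only on the verbal description above and on the symbols in the List of Notations; the subscripts "FE" and "Hybrid" are labels introduced here for illustration, and the exact definition of the score point x* at which the remainder is evaluated is given in Chapter II and is not reproduced here.

```latex
% Decomposition of the Kernel FE equipercentile function into a linear
% part and a remainder (the labels FE and Hybrid are illustrative):
e_{\mathrm{FE}}(x) = l_{Ys}(x) + R(x),
\qquad
R(x) = e_{\mathrm{FE}}(x) - l_{Ys}(x).

% Hybrid LE: the kernel linear part is replaced by the Levine linear
% function, and the remainder is evaluated at the score point x^*:
e_{\mathrm{Hybrid}}(x) = l_{Ys(lv)}(x) + R(x^{*}).
```

Read this way, the hybrid function inherits its linear trend from the Levine method and its curvature from the Kernel FE method.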
The Modified Frequency Estimation Method

The modified frequency estimation (MFE) method, introduced by Wang & Brennan (2006), incorporates a change in the basic assumption of the FE method. Instead of assuming that the conditional distribution of X given the observed score on V is the same for Population 1 and Population 2, the MFE method assumes that the conditional distribution of X given the true score on V is invariant across the two populations. Modifying the assumption yields an estimated marginal score distribution that differs from that of the FE method. The other steps are the same as in the FE method. The altered assumption in the MFE method is weaker than the FE assumption and is expected to improve equating accuracy relative to the FE method. More details regarding this method are presented in Chapter II as well.

Equating Error

Accuracy of equating can be described by equating error. There are two types of equating error: random and systematic. Random error occurs when a sample of examinees is used to estimate the equating relationship for the whole population. If the whole population were used to conduct the equating, there would be no random error. Therefore, random error decreases as the sample size increases. However, even when the whole population is used, systematic error may still be present. Systematic error may be introduced by violation of the equating assumptions, improper implementation of the data collection design, or bias embedded in the equating methods. Total error is defined as the sum of random error and systematic error (Kolen & Brennan, 2004).

In the present simulation study, the standard error of equating (SEE) was used to index random error. It is defined as the standard deviation of the equated scores across replications. Bias was used to denote systematic error. It is the difference between the mean equated score across replications and the true equated score. The total equating error is quantified by the root mean squared error (RMSE) index (see Chapters II and III for further details).

The Need for This Study

Under the CINEG design, each of the equating methods described above uses specific statistical assumptions to establish the equating relationship between the two forms. Most of these assumptions cannot be tested using the data usually available, and unmet assumptions may result in bias in the equating results. Previous studies using real test data (Harris & Kolen, 1990; Livingston et al., 1990; Marco et al., 1983; von Davier et al., 2006a; Holland et al., 2006) have shown that when two groups differ substantially, the FE method may not be the best choice for conducting the equating. Livingston (2004), for example, showed graphically that the FE method may bias the equating results whenever the two equating samples differ in their scores on the common items and the correlations between the total scores and the common-item score are not perfect. Wang & Brennan (2006) also showed that the basic assumption of the FE method did not hold under either congeneric or IRT models. The Kernel FE method has the same basic assumption as the FE method, and it holds promise of approximating the FE method. Theoretically, its bias is similar to that of the FE method. The CE method produces less bias when groups differ substantially (Kolen & Brennan, 2004; Wang et al., 2006), but is criticized because it does not incorporate the synthetic population concept. In addition, the CE method involves equating a long test (total test) to a short test (common items) that may not mirror the long test's characteristics.

The newer Hybrid LE and MFE methods are expected to reduce the bias caused by the FE and Kernel FE methods, but neither method has been thoroughly studied. The Hybrid LE function was introduced in a study by von Davier et al. (2006b), but only as a special case due to software limitations. The MFE method (Wang & Brennan, 2006) was shown to reduce the bias of the FE method in a simulation study, but that study had several limitations. First, test form difficulty differences were not considered. Second, the MFE method was compared only to the FE and CE methods. Finally, the three equating methods (FE, CE, and MFE) in Wang & Brennan (2006) were compared without pre-smoothing.
Evaluating the two newly developed Hybrid LE and MFE methods along with the FE, CE, and Kernel FE methods under different simulation conditions should contribute to our understanding of how those equating methods behave in different assessment contexts. In this dissertation, all equipercentile-like methods were investigated with pre-smoothing to yield results that can be compared to those of the Kernel FE and Hybrid LE methods, in which pre-smoothing is required.

Research Purposes and Questions

The main purpose of this study was to evaluate the performance of two newly developed equating methods (Hybrid LE and MFE) using a simulation procedure. For comparison purposes, three previously developed equating methods (FE, CE, and Kernel FE) were included as well. The simulation procedure involved an item response theory based (IRT-based) approach. With an IRT-based approach, the true equating function can be defined, and the five equating methods can be evaluated by comparing their equating results to the true equating results.

In comparing the five equating methods, five factors were manipulated to mimic typical conditions encountered in practice: sample size, group proficiency difference, test length, ratio of common items (the ratio of the number of common items to total test length), and the similarity of form difficulty. The equating results for the five equating methods under each specific simulation condition were evaluated using two groups of indices: (1) weighted absolute bias, weighted SEEs, and weighted RMSEs collapsed across the test score scale; and (2) equating bias, standard errors of equating (SEE), and root mean square errors of equating (RMSE) derived conditional on each score point.

Specifically, the study was designed to answer the research questions within the following four categories.

Comparison of Bias in Equating

Research question 1.1: How does weighted absolute bias for the Hybrid LE and MFE methods compare to each other and to the FE, CE, and Kernel FE methods under each condition of group proficiency, similarity of test form, ratio of common items, test length, and sample size?

Research question 1.2: How does conditional bias for the Hybrid LE and MFE methods compare to each other and to the FE, CE, and Kernel FE methods under each condition of group proficiency, similarity of test form, ratio of common items, test length, and sample size?

Research question 1.3: What are the relative effects of group proficiency, similarity of test form, ratio of common items, test length, and sample size on the bias yielded by each method?

Comparison of Standard Errors of Equating

Research question 2.1: How does the weighted standard error of equating (WSEE) for the Hybrid LE and MFE methods compare to each other and to the FE, CE, and Kernel FE methods under each condition of group proficiency, similarity of test form, ratio of common items, test length, and sample size?

Research question 2.2: How does the conditional standard error of equating (SEE) for the Hybrid LE and MFE methods compare to each other and to the FE, CE, and Kernel FE methods under each condition of group proficiency, similarity of test form, ratio of common items, test length, and sample size?

Research question 2.3: What are the relative effects of group proficiency, similarity of test form, ratio of common items, test length, and sample size on the SEE yielded by each method?

Comparisons of Root Mean Square Error of Equating

Research question 3.1: How does the weighted root mean square error (WRMSE) for the Hybrid LE and MFE methods compare to each other and to the FE, CE, and Kernel FE methods under each condition of group proficiency, similarity of test form, ratio of common items, test length, and sample size?

Research question 3.2: How does the conditional RMSE for the Hybrid LE and MFE methods compare to each other and to the FE, CE, and Kernel FE methods under each condition of group proficiency, similarity of test form, ratio of common items, test length, and sample size?

Research question 3.3: What are the relative effects of group proficiency, similarity of test form, ratio of common items, test length, and sample size on the RMSE yielded by each method?

Suggestions for a Preferred Equating Method in Practice

Research question 4.1: What suggestions can be made about the preferred method of equating to use in practice among the FE, CE, Kernel FE, Hybrid LE, and MFE methods, based on how well each equating method estimates the criterion equating relationship?

In summary, the objective of this study was to evaluate the Hybrid LE and MFE methods, not only by comparing them to each other but also by comparing them to three previously developed equipercentile-like equating methods (FE, CE, and Kernel FE). The primary goal was to provide appropriate guidelines for selecting an equating method under a common-item nonequivalent groups design in different assessment situations.
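The conditional indices named in these research questions follow directly from the definitions given under Equating Error: bias is the mean equated score across replications minus the true equated score, and SEE is the standard deviation of the equated scores across replications. The Python sketch below computes them for a hypothetical array of simulated equated scores. The function name, the input layout, and in particular the weighting scheme (a weighted average across score points using score frequencies) are assumptions made for illustration; the weighting used in the study itself is defined in later chapters and may differ.

```python
import numpy as np

def equating_error_indices(equated, true_equated, weights):
    """Conditional and weighted error indices.

    equated      : array (replications, score points) of equated scores
    true_equated : array (score points,) of criterion equated scores
    weights      : array (score points,), e.g. Form X score frequencies
                   (assumed weighting scheme, for illustration only)
    """
    equated = np.asarray(equated, dtype=float)
    true_equated = np.asarray(true_equated, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()

    bias = equated.mean(axis=0) - true_equated      # systematic error
    see = equated.std(axis=0)                       # random error (ddof=0)
    rmse = np.sqrt(((equated - true_equated) ** 2).mean(axis=0))

    return {
        "bias": bias,
        "see": see,
        "rmse": rmse,
        "weighted_abs_bias": float(np.sum(weights * np.abs(bias))),
        "weighted_see": float(np.sum(weights * see)),
        "weighted_rmse": float(np.sum(weights * rmse)),
    }

# Hypothetical example: 200 replications, 3 score points, constant bias 0.3.
rng = np.random.default_rng(0)
true_equated = np.array([1.0, 2.0, 3.0])
equated = true_equated + 0.3 + rng.normal(0.0, 0.5, size=(200, 3))
result = equating_error_indices(equated, true_equated, weights=[1, 2, 1])
print(result["weighted_abs_bias"], result["weighted_see"], result["weighted_rmse"])
```

With the population form of the standard deviation used here, the conditional indices satisfy RMSE² = bias² + SEE² at every score point, which is one way to read the statement that total error combines random and systematic error.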

CHAPTER II

LITERATURE REVIEW

This chapter is focused on literature related to five equipercentile-like equating methods under the common-item nonequivalent groups (CINEG) design: the frequency estimation (FE) method, the chained equipercentile (CE) method, the kernel equating with frequency estimation (Kernel FE) method, the hybrid Levine equipercentile equating (Hybrid LE) method, and the modified frequency estimation (MFE) method. The chapter starts with an introduction to equipercentile equating, which serves as the basic building block for all the methods. Next, the common-item design, the five equating methods, and equating error are described in detail. Finally, previous studies involving comparisons among these equipercentile-like equating methods under the CINEG design are reviewed and summarized.

Equipercentile Equating

Angoff (1971, p. 563) described equipercentile equating as follows: "Two scores, one on Form X and the other on Form Y (where X and Y measure the same function with the same degree of reliability) may be considered equivalent if their corresponding percentile ranks in any given group are equal." Such equating is typically done when separate random samples of examinees from the same population take each test form (i.e., a randomly equivalent groups design) or a single random sample of examinees takes both test forms (i.e., a single group design).

In practice, test scores are typically discrete integer values such as number-correct scores. To obtain the equipercentile equating function, the cumulative score distributions $F(x)$ for Form X (the new form) and $G(y)$ for Form Y (the old form) need to be continuized before conducting the equating. The traditional approach to continuization is to use the percentile rank function (Holland & Thayer, 1989). In this method, the discrete probability at each score point is spread uniformly over the unit interval around it, and the cumulative score distributions are thereby continuized. For a given discrete score $x_j$ and a continuous random variable $u$ that is uniformly distributed between $-0.5$ and $+0.5$, a new random variable $X^*$ is defined as

$$X^* = x_j + u. \tag{2-1}$$

The new random variable $X^*$ is continuous, and for score point $x_j$ its values fall in the interval $[x_j - 0.5,\ x_j + 0.5)$. With this continuization, the score distribution of $X^*$ can be viewed as a histogram that spreads the score probability over the unit interval centered at the score point $x_j$. Then, the percentile rank function for a score $x$ is

$$P(x) = \begin{cases} 100\{F_s(x_j - 1) + [x - (x_j - 0.5)][F_s(x_j) - F_s(x_j - 1)]\}, & -0.5 \le x < K_X + 0.5 \\ 0, & x < -0.5 \\ 100, & x \ge K_X + 0.5 \end{cases} \tag{2-2}$$

where $x_j$ is the integer closest to $x$ such that $x_j - 0.5 \le x < x_j + 0.5$, and $K_X$ represents the number of items on Form X. The percentile rank function $Q(y)$ can be obtained from the score distribution $G(y)$ in a similar fashion.

Define $e_Y(x)$ to be the equipercentile equating function that converts Form X scores to Form Y scores. The equated score $y^*$ is then

$$y^* = e_Y(x). \tag{2-3}$$

According to the definition of equipercentile equating,

$$Q(y^*) = P(x), \tag{2-4}$$

and

$$y^* = Q^{-1}(P(x)). \tag{2-5}$$

Substituting Equation (2-3) into Equation (2-5),

$$e_Y(x) = Q^{-1}(P(x)). \tag{2-6}$$

Equation (2-6) is the general function for equipercentile equating.
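Equations (2-2) and (2-6) can be sketched in a few lines of Python. The percentile rank function is piecewise linear between the half-points $-0.5, 0.5, \ldots, K + 0.5$, so linear interpolation through the cumulative probabilities at those points reproduces Equation (2-2), including its boundary cases. The function names and the score distributions below are hypothetical, and the inverse step assumes that every Form Y score has positive probability, so that $Q$ is strictly increasing.

```python
import numpy as np

def percentile_rank(x, probs):
    """Percentile rank P(x) of Equation (2-2) for scores 0..K with probabilities probs."""
    probs = np.asarray(probs, dtype=float)
    knots = np.arange(len(probs) + 1) - 0.5            # -0.5, 0.5, ..., K + 0.5
    cum = np.concatenate(([0.0], np.cumsum(probs)))    # F(-1) = 0, F(0), ..., F(K)
    cum = cum / cum[-1]
    # np.interp returns 0 below -0.5 and 1 above K + 0.5, matching the boundary cases.
    return 100.0 * np.interp(x, knots, cum)

def equipercentile(x, probs_x, probs_y):
    """Equated score e_Y(x) = Q^{-1}(P(x)) of Equation (2-6).

    Assumes every Form Y score has positive probability (illustrative restriction).
    """
    probs_y = np.asarray(probs_y, dtype=float)
    knots_y = np.arange(len(probs_y) + 1) - 0.5
    cum_y = np.concatenate(([0.0], np.cumsum(probs_y)))
    cum_y = 100.0 * cum_y / cum_y[-1]
    p = percentile_rank(x, probs_x)
    # Invert the piecewise-linear Q by swapping the roles of knots and values.
    return np.interp(p, cum_y, knots_y)

# Hypothetical score distributions for two 4-item forms (scores 0..4).
probs_x = [0.1, 0.2, 0.4, 0.2, 0.1]
probs_y = [0.2, 0.2, 0.2, 0.2, 0.2]
print(percentile_rank(2, probs_x))
print(equipercentile(np.arange(5), probs_x, probs_y))
```

Equating a form to itself returns the original scores, which is a quick sanity check on any implementation of Equation (2-6).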

Common-Item Nonequivalent Groups Design

In the common-item nonequivalent groups (CINEG) design, samples from two different populations, Population 1 ($P_1$) and Population 2 ($P_2$), take the new test form X and the old test form Y, respectively, possibly on different occasions. The two populations are not required to be equivalent. Because of this, it is critical that there be a common set of test items V on the two forms to adjust for differences in proficiency across populations. If the scores on the common items contribute to the total scores on X and Y, the common items are considered internal. If not, they are considered external. The common items should be representative of the test forms to be equated in terms of content and statistical characteristics, regardless of whether they are internal or external.

Typically, the equating function is defined for a single population even if two populations are used for the data collection. Braun & Holland (1982) introduced the concept of a synthetic population (sometimes called the target population) to define a single population when creating an equating relationship. The synthetic population $S$ is defined as $S = wP_1 + (1 - w)P_2$, where $w$ is the weight for $P_1$ in the synthetic population. If $w = 1$, then the synthetic population is Population 1.

The Frequency Estimation Method

Equipercentile equating typically assumes that the examinees who take the two forms are random samples from the same population. However, under the common-item nonequivalent groups (CINEG) design, the examinees who take the tests are from two different populations that are not required to be equivalent. As a result, equipercentile equating cannot be applied to the CINEG design directly. Making some statistical assumptions, Angoff (1971) described the FE method, and Braun & Holland (1982) developed the method to estimate the cumulative distributions of scores on Form X and Form Y for a synthetic population under the CINEG design. The equating function can then be constructed based on percentile ranks obtained from the cumulative distributions.
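As a rough illustration of how a synthetic-population distribution might be assembled, the Python sketch below combines the weighting $S = wP_1 + (1 - w)P_2$ with the conditional-distribution assumption that the FE method relies on (Equation 2-8 in the next section). Under that assumption, the Form X distribution in Population 2 can be written as $f_2(x) = \sum_v f_1(x \mid v)\, h_2(v)$, which is the standard frequency estimation step rather than a formula quoted from this chapter. The function name, the array layout, and the numbers are hypothetical.

```python
import numpy as np

def fe_synthetic_distribution(joint_1, h_2, w):
    """Form X score distribution in the synthetic population.

    joint_1 : array (x scores, v scores), f_1(x, v) observed in Population 1
    h_2     : array (v scores,), common-item distribution h_2(v) in Population 2
    w       : weight given to Population 1 in the synthetic population
    Requires h_1(v) > 0 for every common-item score v.
    """
    joint_1 = np.asarray(joint_1, dtype=float)
    h_2 = np.asarray(h_2, dtype=float)

    h_1 = joint_1.sum(axis=0)        # marginal of V in Population 1
    cond_1 = joint_1 / h_1           # f_1(x | v), one column per v
    f_1 = joint_1.sum(axis=1)        # marginal of X in Population 1
    f_2 = cond_1 @ h_2               # f_2(x) = sum_v f_1(x | v) h_2(v)
    return w * f_1 + (1.0 - w) * f_2

# Hypothetical joint distribution (3 total scores x 3 common-item scores).
joint_1 = np.array([[0.10, 0.05, 0.00],
                    [0.10, 0.20, 0.05],
                    [0.05, 0.20, 0.25]])
h_2 = np.array([0.4, 0.4, 0.2])      # Population 2 scores lower on V
print(fe_synthetic_distribution(joint_1, h_2, w=0.5))
```

The Form Y distribution for the synthetic population would be built in the mirror-image way, from $g_2(y \mid v)$ and $h_1(v)$. With $w = 1$ the result reduces to the observed Population 1 distribution.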

The Basic Assumption for the Frequency Estimation Method

Two quantities are defined to reflect the basic assumption of the FE method. The first is $f(x, v)$, which is the joint distribution of total score and common-item score, representing the probability of earning a score of $x$ on Form X and a score of $v$ on the common items. The second is $f(x \mid v)$, which is the conditional distribution of scores on Form X for examinees earning score $v$ on the common items. $f(x \mid v)$ can be expressed as

$$f(x \mid v) = \frac{f(x, v)}{h(v)}, \tag{2-7}$$

where $h(v)$ represents the marginal score distribution on the common items. The basic assumption, stated in terms of the conditional distributions in Equation (2-7), is that for Form X and Form Y the conditional distribution of total score given each common-item score $v$ is the same in both populations. Whether the common items are internal or external, this assumption can be expressed as

$$f_1(x \mid v) = f_2(x \mid v), \text{ for all } v$$
$$g_1(y \mid v) = g_2(y \mid v), \text{ for all } v. \tag{2-8}$$

Procedures for the Frequency Estimation Method

The procedures for the frequency estimation (FE) method involve the steps below.

Step 1: Data collection

From the data collected under the CINEG design, the following distributions can be obtained directly: $f_1(x \mid v)$, $f_1(x)$, and $h_1(v)$ for Population 1 on Form X; and $g_2(y \mid v)$, $g_2(y)$, and $h_2(v)$ for Population 2 on Form Y.

Step 2: Pre-smoothing

After the data are collected, pre-smoothing can be done to reduce the sampling-error component of equating error. This step is optional for the FE method, but is required for the Kernel FE method. To allow comparison of results for the FE and Kernel FE methods, pre-smoothing was conducted in this study before calculating the cumulative distributions of Form X scores and Form Y scores for the synthetic population.

The purpose of pre-smoothing is to obtain a better estimate of the population distribution. The quality of smoothing can affect equating error. Pre-smoothing can reduce sampling error, but at the same time it may introduce systematic error. However, proper pre-smoothing should result in less total error than using an unsmoothed score distribution (Kolen & Jarjoura, 1987; Harris & Kolen, 1990; Livingston, 1993b). Several pre-smoothing methods can be used in practice, including log-linear pre-smoothing (see Rosenbaum & Thayer, 1987; Holland & Thayer, 1987, 2000), beta-4 smoothing (also called strong true score smoothing; Lord, 1965), and cubic B-spline pre-smoothing (Cui, 2006). In this study, the log-linear pre-smoothing method was used to reduce sampling error.

Log-linear models (see Rosenbaum & Thayer, 1987; Holland & Thayer, 1987, 2000) can fit not only univariate but also bivariate score distributions. They belong to exponential families of discrete distributions and can be estimated by maximum likelihood using standard iterative techniques to fit various power moments of the distributions. For bivariate distributions, the model has the following form:

$$\log(p_{jk}) = \alpha + u_{jk} + \sum_{i=1}^{C_X} \beta_{xi}(x_j)^i + \sum_{i=1}^{C_V} \beta_{vi}(v_k)^i + \sum_{i=1}^{C_{IX}} \sum_{i'=1}^{C_{IV}} \beta_{xvii'}(x_j)^i (v_k)^{i'} \tag{2-9}$$

where $p_{jk}$ is the fitted score probability for total score $x_j$ and common-item score $v_k$, and $\alpha$ is the normalizing constant selected to make the $p_{jk}$ sum to one. $u_{jk}$ is the null distribution, for which different choices result in different models. Normally, $u_{jk}$ is set to zero (Holland & Thayer, 2000). The values of $\beta$ are the parameters to be estimated in the model fitting process.
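To make the model fitting step more concrete, the following Python sketch fits the univariate analogue of Equation (2-9), $\log(p_j) = \alpha + \sum_{i=1}^{C} \beta_i (x_j)^i$ with the null distribution set to zero, by maximum likelihood using Newton-Raphson with step halving. It is a simplified illustration rather than the procedure used in the study: the bivariate model adds common-item and cross-product terms, the function name and frequencies are hypothetical, and the rescaling of scores to $[-1, 1]$ is a numerical convenience that spans the same polynomial space.

```python
import numpy as np

def fit_loglinear(counts, degree, max_iter=100, tol=1e-12):
    """Fit a univariate polynomial log-linear model to score frequencies.

    counts[j] is the observed frequency at score j (j = 0..K).
    Returns the fitted score probabilities.
    """
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    scores = np.arange(len(counts), dtype=float)
    z = 2.0 * scores / scores[-1] - 1.0                 # rescaled scores
    basis = np.column_stack([z ** i for i in range(1, degree + 1)])

    def probs(beta):
        eta = basis @ beta
        p = np.exp(eta - eta.max())
        return p / p.sum()        # normalization plays the role of alpha

    def loglik(beta):
        return counts @ np.log(probs(beta))

    beta = np.zeros(degree)
    for _ in range(max_iter):
        p = probs(beta)
        grad = basis.T @ (counts - n * p)               # score vector
        centred = basis - p @ basis
        info = n * (centred.T * p) @ centred            # information matrix
        step = np.linalg.solve(info, grad)
        # Step halving keeps the log-likelihood from decreasing.
        scale, current = 1.0, loglik(beta)
        while loglik(beta + scale * step) < current and scale > 1e-8:
            scale *= 0.5
        beta = beta + scale * step
        if np.max(np.abs(scale * step)) < tol:
            break
    return probs(beta)

# Hypothetical observed frequencies for scores 0..8.
counts = np.array([2, 5, 12, 20, 26, 18, 10, 5, 2])
fitted = fit_loglinear(counts, degree=3)
print(np.round(fitted, 4))
```

At the maximum likelihood solution the fitted distribution reproduces the first `degree` moments of the observed distribution, which is the univariate counterpart of the moment-preservation property of the bivariate model.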
The fitting of the model in Equation (2-9) produces a fitted bivariate distribution that preserves $C_X$ moments in the marginal (univariate) distribution of the total scores, $C_V$ moments in the marginal (univariate) distribution of the common-item scores, and $C_{XV}$ cross-moments in the bivariate


Real-Time Software Transactional Memory: Contention Managers, Time Bounds, and Implementations Real-Time Software Transactional Memory: Contention Managers, Time Bounds, and Implementations Mohammed El-Shambakey Dissertation Submitted to the Faculty of the Virginia Polytechnic Institute and State

More information

Linear Equating Models for the Common-item Nonequivalent-Populations Design Michael J. Kolen and Robert L. Brennan American College Testing Program

Linear Equating Models for the Common-item Nonequivalent-Populations Design Michael J. Kolen and Robert L. Brennan American College Testing Program Linear Equating Models for the Common-item Nonequivalent-Populations Design Michael J. Kolen Robert L. Brennan American College Testing Program The Tucker Levine equally reliable linear meth- in the common-item

More information

Institute of Actuaries of India

Institute of Actuaries of India Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2018 Examinations Subject CT3 Probability and Mathematical Statistics Core Technical Syllabus 1 June 2017 Aim The

More information

Content Descriptions Based on the Common Core Georgia Performance Standards (CCGPS) CCGPS Coordinate Algebra

Content Descriptions Based on the Common Core Georgia Performance Standards (CCGPS) CCGPS Coordinate Algebra Content Descriptions Based on the Common Core Georgia Performance Standards (CCGPS) CCGPS Coordinate Algebra Introduction The State Board of Education is required by Georgia law (A+ Educational Reform

More information

Lessons in Estimation Theory for Signal Processing, Communications, and Control

Lessons in Estimation Theory for Signal Processing, Communications, and Control Lessons in Estimation Theory for Signal Processing, Communications, and Control Jerry M. Mendel Department of Electrical Engineering University of Southern California Los Angeles, California PRENTICE HALL

More information

Abdul-Majid Wazwaz. Linear and Nonlinear Integral Equations. Methods and Applications

Abdul-Majid Wazwaz. Linear and Nonlinear Integral Equations. Methods and Applications Abdul-Majid Wazwaz Linear and Nonlinear Integral Equations Methods and Applications Abdul-Majid Wazwaz Linear and Nonlinear Integral Equations Methods and Applications With 4 figures ~ -:tr It i >j: Pt.~l

More information

The impact of equating method and format representation of common items on the adequacy of mixed-format test equating using nonequivalent groups

The impact of equating method and format representation of common items on the adequacy of mixed-format test equating using nonequivalent groups University of Iowa Iowa Research Online Theses and Dissertations Summer 2010 The impact of equating method and format representation of common items on the adequacy of mixed-format test equating using

More information

Reduced [tau]_n-factorizations in Z and [tau]_nfactorizations

Reduced [tau]_n-factorizations in Z and [tau]_nfactorizations University of Iowa Iowa Research Online Theses and Dissertations Summer 2013 Reduced [tau]_n-factorizations in Z and [tau]_nfactorizations in N Alina Anca Florescu University of Iowa Copyright 2013 Alina

More information

equate: An R Package for Observed-Score Linking and Equating

equate: An R Package for Observed-Score Linking and Equating equate: An R Package for Observed-Score Linking and Equating Anthony D. Albano University of Nebraska-Lincoln Abstract The R package equate (Albano 2016) contains functions for observed-score linking and

More information

Center for Advanced Studies in Measurement and Assessment. CASMA Research Report. A Multinomial Error Model for Tests with Polytomous Items

Center for Advanced Studies in Measurement and Assessment. CASMA Research Report. A Multinomial Error Model for Tests with Polytomous Items Center for Advanced Studies in Measurement and Assessment CASMA Research Report Number 1 for Tests with Polytomous Items Won-Chan Lee January 2 A previous version of this paper was presented at the Annual

More information

equate: An R Package for Observed-Score Linking and Equating

equate: An R Package for Observed-Score Linking and Equating equate: An R Package for Observed-Score Linking and Equating Anthony D. Albano University of Nebraska-Lincoln Abstract The R package equate (Albano 2014) contains functions for observed-score linking and

More information

Prediction of double gene knockout measurements

Prediction of double gene knockout measurements Prediction of double gene knockout measurements Sofia Kyriazopoulou-Panagiotopoulou sofiakp@stanford.edu December 12, 2008 Abstract One way to get an insight into the potential interaction between a pair

More information

A Note on the Choice of an Anchor Test in Equating

A Note on the Choice of an Anchor Test in Equating Research Report ETS RR 12-14 A Note on the Choice of an Anchor Test in Equating Sandip Sinharay Shelby Haberman Paul Holland Charles Lewis September 2012 ETS Research Report Series EIGNOR EXECUTIVE EDITOR

More information

VALUES FOR THE CUMULATIVE DISTRIBUTION FUNCTION OF THE STANDARD MULTIVARIATE NORMAL DISTRIBUTION. Carol Lindee

VALUES FOR THE CUMULATIVE DISTRIBUTION FUNCTION OF THE STANDARD MULTIVARIATE NORMAL DISTRIBUTION. Carol Lindee VALUES FOR THE CUMULATIVE DISTRIBUTION FUNCTION OF THE STANDARD MULTIVARIATE NORMAL DISTRIBUTION Carol Lindee LindeeEmail@netscape.net (708) 479-3764 Nick Thomopoulos Illinois Institute of Technology Stuart

More information

Curvature measures for generalized linear models

Curvature measures for generalized linear models University of Wollongong Research Online University of Wollongong Thesis Collection 1954-2016 University of Wollongong Thesis Collections 1999 Curvature measures for generalized linear models Bernard A.

More information

SESUG 2011 ABSTRACT INTRODUCTION BACKGROUND ON LOGLINEAR SMOOTHING DESCRIPTION OF AN EXAMPLE. Paper CC-01

SESUG 2011 ABSTRACT INTRODUCTION BACKGROUND ON LOGLINEAR SMOOTHING DESCRIPTION OF AN EXAMPLE. Paper CC-01 Paper CC-01 Smoothing Scaled Score Distributions from a Standardized Test using PROC GENMOD Jonathan Steinberg, Educational Testing Service, Princeton, NJ Tim Moses, Educational Testing Service, Princeton,

More information

Honors Algebra II / Trigonometry

Honors Algebra II / Trigonometry Honors Algebra II / Trigonometry 2013-2014 Instructor: Busselmaier Room: 158 Academic Support Location: Room 158 or Office 152 E-mail: cbusselmaier@regisjesuit.com (email is the best way to get in touch

More information

Numerical computation of an optimal control problem with homogenization in one-dimensional case

Numerical computation of an optimal control problem with homogenization in one-dimensional case Retrospective Theses and Dissertations Iowa State University Capstones, Theses and Dissertations 28 Numerical computation of an optimal control problem with homogenization in one-dimensional case Zhen

More information

A THESIS. Submitted by MAHALINGA V. MANDI. for the award of the degree of DOCTOR OF PHILOSOPHY

A THESIS. Submitted by MAHALINGA V. MANDI. for the award of the degree of DOCTOR OF PHILOSOPHY LINEAR COMPLEXITY AND CROSS CORRELATION PROPERTIES OF RANDOM BINARY SEQUENCES DERIVED FROM DISCRETE CHAOTIC SEQUENCES AND THEIR APPLICATION IN MULTIPLE ACCESS COMMUNICATION A THESIS Submitted by MAHALINGA

More information

Subject CS1 Actuarial Statistics 1 Core Principles

Subject CS1 Actuarial Statistics 1 Core Principles Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and

More information

LESOTHO HIGH SCHOOL STUDENTS CONCEPTIONS OF EARTHQUAKES

LESOTHO HIGH SCHOOL STUDENTS CONCEPTIONS OF EARTHQUAKES LESOTHO HIGH SCHOOL STUDENTS CONCEPTIONS OF EARTHQUAKES MALITŠOANELO NTHATI THAMAE Degree of Master of Science by coursework and research: A research report submitted to the Faculty of Science, University

More information

Shelby J. Haberman. Hongwen Guo. Jinghua Liu. Neil J. Dorans. ETS, Princeton, NJ

Shelby J. Haberman. Hongwen Guo. Jinghua Liu. Neil J. Dorans. ETS, Princeton, NJ Consistency of SAT I: Reasoning Test Score Conversions Shelby J. Haberman Hongwen Guo Jinghua Liu Neil J. Dorans ETS, Princeton, NJ Paper presented at the annual meeting of the American Educational Research

More information

From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. About This Book... xiii About The Author...

From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. About This Book... xiii About The Author... From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. Contents About This Book... xiii About The Author... xxiii Chapter 1 Getting Started: Data Analysis with JMP...

More information

Equating Tests Under The Nominal Response Model Frank B. Baker

Equating Tests Under The Nominal Response Model Frank B. Baker Equating Tests Under The Nominal Response Model Frank B. Baker University of Wisconsin Under item response theory, test equating involves finding the coefficients of a linear transformation of the metric

More information

FRANKLIN UNIVERSITY PROFICIENCY EXAM (FUPE) STUDY GUIDE

FRANKLIN UNIVERSITY PROFICIENCY EXAM (FUPE) STUDY GUIDE FRANKLIN UNIVERSITY PROFICIENCY EXAM (FUPE) STUDY GUIDE Course Title: Probability and Statistics (MATH 80) Recommended Textbook(s): Number & Type of Questions: Probability and Statistics for Engineers

More information

Development and Calibration of an Item Response Model. that Incorporates Response Time

Development and Calibration of an Item Response Model. that Incorporates Response Time Development and Calibration of an Item Response Model that Incorporates Response Time Tianyou Wang and Bradley A. Hanson ACT, Inc. Send correspondence to: Tianyou Wang ACT, Inc P.O. Box 168 Iowa City,

More information

IRT linking methods for the bifactor model: a special case of the two-tier item factor analysis model

IRT linking methods for the bifactor model: a special case of the two-tier item factor analysis model University of Iowa Iowa Research Online Theses and Dissertations Summer 2017 IRT linking methods for the bifactor model: a special case of the two-tier item factor analysis model Kyung Yong Kim University

More information

Center for Advanced Studies in Measurement and Assessment. CASMA Research Report

Center for Advanced Studies in Measurement and Assessment. CASMA Research Report Center for Advanced Studies in Measurement and Assessment CASMA Research Report Number 23 Comparison of Three IRT Linking Procedures in the Random Groups Equating Design Won-Chan Lee Jae-Chun Ban February

More information

PIRLS 2016 Achievement Scaling Methodology 1

PIRLS 2016 Achievement Scaling Methodology 1 CHAPTER 11 PIRLS 2016 Achievement Scaling Methodology 1 The PIRLS approach to scaling the achievement data, based on item response theory (IRT) scaling with marginal estimation, was developed originally

More information

AP Calculus. Analyzing a Function Based on its Derivatives

AP Calculus. Analyzing a Function Based on its Derivatives AP Calculus Analyzing a Function Based on its Derivatives Presenter Notes 016 017 EDITION Copyright 016 National Math + Science Initiative, Dallas, Texas. All rights reserved. Visit us online at www.nms.org

More information

Multivariate Analysis in The Human Services

Multivariate Analysis in The Human Services Multivariate Analysis in The Human Services INTERNATIONAL SERIES IN SOCIAL WELFARE Series Editor: William J. Reid State University of New York at Albany Advisory Editorial Board: Weiner W. Boehm Rutgers,

More information

Effect of 3D Stress States at Crack Front on Deformation, Fracture and Fatigue Phenomena

Effect of 3D Stress States at Crack Front on Deformation, Fracture and Fatigue Phenomena Effect of 3D Stress States at Crack Front on Deformation, Fracture and Fatigue Phenomena By Zhuang He B. Eng., M. Eng. A thesis submitted for the degree of Doctor of Philosophy at the School of Mechanical

More information

Estimating Measures of Pass-Fail Reliability

Estimating Measures of Pass-Fail Reliability Estimating Measures of Pass-Fail Reliability From Parallel Half-Tests David J. Woodruff and Richard L. Sawyer American College Testing Program Two methods are derived for estimating measures of pass-fail

More information

The application and empirical comparison of item. parameters of Classical Test Theory and Partial Credit. Model of Rasch in performance assessments

The application and empirical comparison of item. parameters of Classical Test Theory and Partial Credit. Model of Rasch in performance assessments The application and empirical comparison of item parameters of Classical Test Theory and Partial Credit Model of Rasch in performance assessments by Paul Moloantoa Mokilane Student no: 31388248 Dissertation

More information

Irr. Statistical Methods in Experimental Physics. 2nd Edition. Frederick James. World Scientific. CERN, Switzerland

Irr. Statistical Methods in Experimental Physics. 2nd Edition. Frederick James. World Scientific. CERN, Switzerland Frederick James CERN, Switzerland Statistical Methods in Experimental Physics 2nd Edition r i Irr 1- r ri Ibn World Scientific NEW JERSEY LONDON SINGAPORE BEIJING SHANGHAI HONG KONG TAIPEI CHENNAI CONTENTS

More information

JEFFERSON COLLEGE COURSE SYLLABUS MTH 110 INTRODUCTORY ALGEBRA. 3 Credit Hours. Prepared by: Skyler Ross & Connie Kuchar September 2014

JEFFERSON COLLEGE COURSE SYLLABUS MTH 110 INTRODUCTORY ALGEBRA. 3 Credit Hours. Prepared by: Skyler Ross & Connie Kuchar September 2014 JEFFERSON COLLEGE COURSE SYLLABUS MTH 110 INTRODUCTORY ALGEBRA 3 Credit Hours Prepared by: Skyler Ross & Connie Kuchar September 2014 Dr. Robert Brieler, Division Chair, Math & Science Ms. Shirley Davenport,

More information

JEFFERSON COLLEGE COURSE SYLLABUS MTH 110 INTRODUCTORY ALGEBRA. 3 Credit Hours. Prepared by: Skyler Ross & Connie Kuchar September 2014

JEFFERSON COLLEGE COURSE SYLLABUS MTH 110 INTRODUCTORY ALGEBRA. 3 Credit Hours. Prepared by: Skyler Ross & Connie Kuchar September 2014 JEFFERSON COLLEGE COURSE SYLLABUS MTH 110 INTRODUCTORY ALGEBRA 3 Credit Hours Prepared by: Skyler Ross & Connie Kuchar September 2014 Ms. Linda Abernathy, Math, Science, & Business Division Chair Ms. Shirley

More information

Mir Md. Maruf Morshed

Mir Md. Maruf Morshed Investigation of External Acoustic Loadings on a Launch Vehicle Fairing During Lift-off Supervisors: Professor Colin H. Hansen Associate Professor Anthony C. Zander School of Mechanical Engineering South

More information

Lifetime prediction and confidence bounds in accelerated degradation testing for lognormal response distributions with an Arrhenius rate relationship

Lifetime prediction and confidence bounds in accelerated degradation testing for lognormal response distributions with an Arrhenius rate relationship Scholars' Mine Doctoral Dissertations Student Research & Creative Works Spring 01 Lifetime prediction and confidence bounds in accelerated degradation testing for lognormal response distributions with

More information

Using statistical equating for standard maintaining in GCSEs and A levels

Using statistical equating for standard maintaining in GCSEs and A levels Using statistical equating for standard maintaining in GCSEs and A levels Tom Bramley & Carmen Vidal Rodeiro Cambridge Assessment Research Report 22 nd January 2014 Author contact details: Tom Bramley

More information

Event Operators: Formalization, Algorithms, and Implementation Using Interval- Based Semantics

Event Operators: Formalization, Algorithms, and Implementation Using Interval- Based Semantics Department of Computer Science and Engineering University of Texas at Arlington Arlington, TX 76019 Event Operators: Formalization, Algorithms, and Implementation Using Interval- Based Semantics Raman

More information

Contents. Preface to Second Edition Preface to First Edition Abbreviations PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1

Contents. Preface to Second Edition Preface to First Edition Abbreviations PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1 Contents Preface to Second Edition Preface to First Edition Abbreviations xv xvii xix PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1 1 The Role of Statistical Methods in Modern Industry and Services

More information

Advanced Placement Calculus AB/BC Standards

Advanced Placement Calculus AB/BC Standards A Correlation of Calculus AP Edition, 2018 To the Advanced Placement Calculus AB/BC Standards AP is a trademark registered and/or owned by the College Board, which was not involved in the production of,

More information

Center for Advanced Studies in Measurement and Assessment. CASMA Research Report. Hierarchical Cognitive Diagnostic Analysis: Simulation Study

Center for Advanced Studies in Measurement and Assessment. CASMA Research Report. Hierarchical Cognitive Diagnostic Analysis: Simulation Study Center for Advanced Studies in Measurement and Assessment CASMA Research Report Number 38 Hierarchical Cognitive Diagnostic Analysis: Simulation Study Yu-Lan Su, Won-Chan Lee, & Kyong Mi Choi Dec 2013

More information

On the Use of Nonparametric ICC Estimation Techniques For Checking Parametric Model Fit

On the Use of Nonparametric ICC Estimation Techniques For Checking Parametric Model Fit On the Use of Nonparametric ICC Estimation Techniques For Checking Parametric Model Fit March 27, 2004 Young-Sun Lee Teachers College, Columbia University James A.Wollack University of Wisconsin Madison

More information

DESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective

DESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective DESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective Second Edition Scott E. Maxwell Uniuersity of Notre Dame Harold D. Delaney Uniuersity of New Mexico J,t{,.?; LAWRENCE ERLBAUM ASSOCIATES,

More information

STAT FINAL EXAM

STAT FINAL EXAM STAT101 2013 FINAL EXAM This exam is 2 hours long. It is closed book but you can use an A-4 size cheat sheet. There are 10 questions. Questions are not of equal weight. You may need a calculator for some

More information

Probability and Stochastic Processes

Probability and Stochastic Processes Probability and Stochastic Processes A Friendly Introduction Electrical and Computer Engineers Third Edition Roy D. Yates Rutgers, The State University of New Jersey David J. Goodman New York University

More information

REALIZING TOURNAMENTS AS MODELS FOR K-MAJORITY VOTING

REALIZING TOURNAMENTS AS MODELS FOR K-MAJORITY VOTING California State University, San Bernardino CSUSB ScholarWorks Electronic Theses, Projects, and Dissertations Office of Graduate Studies 6-016 REALIZING TOURNAMENTS AS MODELS FOR K-MAJORITY VOTING Gina

More information

PRINCIPLES OF STATISTICAL INFERENCE

PRINCIPLES OF STATISTICAL INFERENCE Advanced Series on Statistical Science & Applied Probability PRINCIPLES OF STATISTICAL INFERENCE from a Neo-Fisherian Perspective Luigi Pace Department of Statistics University ofudine, Italy Alessandra

More information

Learning Gaussian Process Models from Uncertain Data

Learning Gaussian Process Models from Uncertain Data Learning Gaussian Process Models from Uncertain Data Patrick Dallaire, Camille Besse, and Brahim Chaib-draa DAMAS Laboratory, Computer Science & Software Engineering Department, Laval University, Canada

More information

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 1: August 22, 2012

More information

All rights reserved. Reproduction of these materials for instructional purposes in public school classrooms in Virginia is permitted.

All rights reserved. Reproduction of these materials for instructional purposes in public school classrooms in Virginia is permitted. Algebra I Copyright 2009 by the Virginia Department of Education P.O. Box 2120 Richmond, Virginia 23218-2120 http://www.doe.virginia.gov All rights reserved. Reproduction of these materials for instructional

More information

INFORMATION APPROACH FOR CHANGE POINT DETECTION OF WEIBULL MODELS WITH APPLICATIONS. Tao Jiang. A Thesis

INFORMATION APPROACH FOR CHANGE POINT DETECTION OF WEIBULL MODELS WITH APPLICATIONS. Tao Jiang. A Thesis INFORMATION APPROACH FOR CHANGE POINT DETECTION OF WEIBULL MODELS WITH APPLICATIONS Tao Jiang A Thesis Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the

More information

Transition Passage to Descriptive Statistics 28

Transition Passage to Descriptive Statistics 28 viii Preface xiv chapter 1 Introduction 1 Disciplines That Use Quantitative Data 5 What Do You Mean, Statistics? 6 Statistics: A Dynamic Discipline 8 Some Terminology 9 Problems and Answers 12 Scales of

More information

Preliminary statistics

Preliminary statistics 1 Preliminary statistics The solution of a geophysical inverse problem can be obtained by a combination of information from observed data, the theoretical relation between data and earth parameters (models),

More information

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

One-Way ANOVA. Some examples of when ANOVA would be appropriate include: One-Way ANOVA 1. Purpose Analysis of variance (ANOVA) is used when one wishes to determine whether two or more groups (e.g., classes A, B, and C) differ on some outcome of interest (e.g., an achievement

More information

c 2011 JOSHUA DAVID JOHNSTON ALL RIGHTS RESERVED

c 2011 JOSHUA DAVID JOHNSTON ALL RIGHTS RESERVED c 211 JOSHUA DAVID JOHNSTON ALL RIGHTS RESERVED ANALYTICALLY AND NUMERICALLY MODELING RESERVOIR-EXTENDED POROUS SLIDER AND JOURNAL BEARINGS INCORPORATING CAVITATION EFFECTS A Dissertation Presented to

More information