The LTA Data Aggregation Program

Similar documents
Using SPSS for One Way Analysis of Variance

WinLTA USER S GUIDE for Data Augmentation

Q-Matrix Development. NCME 2009 Workshop

New York State. Electronic Certificate of Need. HCS Coordinator. Overview. Version 1.0

1 Interaction models: Assignment 3

Testing the Limits of Latent Class Analysis. Ingrid Carlson Wurpts

Retrieve and Open the Data

Twin Case Study: Treatment for Articulation Disabilities

A Re-Introduction to General Linear Models (GLM)

Table of content. Understanding workflow automation - Making the right choice Creating a workflow...05

Supplemental Materials. In the main text, we recommend graphing physiological values for individual dyad

4:3 LEC - PLANNED COMPARISONS AND REGRESSION ANALYSES

Mining Newsgroups Using Networks Arising From Social Behavior by Rakesh Agrawal et al. Presented by Will Lee

Regional estimates of poverty indicators based on a calibration technique

Systematic error, of course, can produce either an upward or downward bias.

CHEMICAL INVENTORY ENTRY GUIDE

LED Lighting Facts: Manufacturer Guide

Simple, Marginal, and Interaction Effects in General Linear Models: Part 1

AN INTERNATIONAL SOLAR IRRADIANCE DATA INGEST SYSTEM FOR FORECASTING SOLAR POWER AND AGRICULTURAL CROP YIELDS

William M. Frank* and George S. Young The Pennsylvania State University, University Park, PA. 2. Data

PRELIMINARY GEOLOGIC MAP OF THE SANTA SUSANA 7.5' QUADRANGLE, SOUTHERN CALIFORNIA: A DIGITAL DATABASE

example can be used to refute a conjecture, it cannot be used to prove one is always true.] propositions or conjectures

Two Correlated Proportions Non- Inferiority, Superiority, and Equivalence Tests

Multiple Correspondence Analysis

Phasor mathematics. Resources and methods for learning about these subjects (list a few here, in preparation for your research):

LED Lighting Facts: Product Submission Guide

Contingency Tables. Safety equipment in use Fatal Non-fatal Total. None 1, , ,128 Seat belt , ,878

Weighting Missing Data Coding and Data Preparation Wrap-up Preview of Next Time. Data Management

Companion. Jeffrey E. Jones

Categorical and Zero Inflated Growth Models

FIT100 Spring 01. Project 2. Astrological Toys

Comparative study on the complex samples design features using SPSS Complex Samples, SAS Complex Samples and WesVarPc.

VELA. Getting started with the VELA Versatile Laboratory Aid. Paul Vernon

Lab 3. Newton s Second Law

GENERAL INSTRUCTIONS

Manipulating Radicals

AMBIGUITY PROBLEMS WITH POLAR COORDINATES

Variable-Specific Entropy Contribution

Equating Tests Under The Nominal Response Model Frank B. Baker

bmuthen posted on Tuesday, August 23, :06 am The following question appeared on SEMNET Aug 19, 2005.

NINE CHOICE SERIAL REACTION TIME TASK

Comparing IRT with Other Models

Solving Word Problems

User's Guide for SCORIGHT (Version 3.0): A Computer Program for Scoring Tests Built of Testlets Including a Module for Covariate Analysis

OpenSees Navigator. OpenSees Navigator

CS103 Handout 08 Spring 2012 April 20, 2012 Problem Set 3

TESTING AND MEASUREMENT IN PSYCHOLOGY. Merve Denizci Nazlıgül, M.S.

Algebra. Mathematics Help Sheet. The University of Sydney Business School

Refine & Validate. In the *.res file, be sure to add the following four commands after the UNIT instruction and before any atoms: ACTA CONF WPDB -2

August 7, 2007 NUMERICAL SOLUTION OF LAPLACE'S EQUATION

Latent Class Analysis for Models with Error of Measurement Using Log-Linear Models and An Application to Women s Liberation Data

Chapter 19: Logistic regression

2 Prediction and Analysis of Variance

Double Star Observations

Unit 4 Patterns and Algebra

Supplementary Material. Computational Study of Hydrido Boronium Dications and Comparison with the Isoelectronic Carbon Analogs

PHYS 301 HOMEWORK #13-- SOLUTIONS

Social Problem Solving Scale Grade 2 /Year 3. Fast Track Project Technical Report Anne Corrigan July 28, 2003

Getting to the Roots of Quadratics

A GUI FOR EVOLVE ZAMS

A SAS MACRO FOR ESTIMATING SAMPLING ERRORS OF MEANS AND PROPORTIONS IN THE CONTEXT OF STRATIFIED MULTISTAGE CLUSTER SAMPLING

Correlations with Categorical Data

Looking hard at algebraic identities.

Electric Fields and Equipotentials

Teaching about circuits at the introductory level: An emphasis on potential difference

On the declaration of the measurement uncertainty of airborne sound insulation of noise barriers

Sample Project: Simulation of Turing Machines by Machines with only Two Tape Symbols

Part 7: Glossary Overview

Yu Xie, Institute for Social Research, 426 Thompson Street, University of Michigan, Ann

Solving Equations by Adding and Subtracting

Test and Evaluation of an Electronic Database Selection Expert System

Probability and Samples. Sampling. Point Estimates

Grade 3 / Constructed Response Student Samples INFORMATIONAL PERFORMANCE TASK

Mixed- Model Analysis of Variance. Sohad Murrar & Markus Brauer. University of Wisconsin- Madison. Target Word Count: Actual Word Count: 2755

HOMEWORK 8 SOLUTIONS MATH 4753

LAB 2: INTRODUCTION TO MOTION

Fundamentals of Algebra, Geometry, and Trigonometry. (Self-Study Course)

Classical Test Theory. Basics of Classical Test Theory. Cal State Northridge Psy 320 Andrew Ainsworth, PhD

UCLA STAT 233 Statistical Methods in Biomedical Imaging

Applied Statistics in Business & Economics, 5 th edition

Visualization of Commuter Flow Using CTPP Data and GIS

Collaborative topic models: motivations cont

Space Group & Structure Solution

mylab: Chemical Safety Module Last Updated: January 19, 2018

The Difficulty of Test Items That Measure More Than One Ability

RADIATION SAFETY GUIDELINES FOR NON-USERS

Permutations and Isomorphisms

ChemSep Case Book: Handling Missing Components

Group Dependence of Some Reliability

GEOGRAPHIC INFORMATION SYSTEMS SPECIALIST 3 DEFINITION:

Regression Analysis in R

Some general observations.

Proportions PRE-ACTIVITY PREPARATION

1 Overview of Simulink. 2 State-space equations

CS 125 Section #12 (More) Probability and Randomized Algorithms 11/24/14. For random numbers X which only take on nonnegative integer values, E(X) =

Computational Complexity

Jeffrey D. Ullman Stanford University

A MACRO-DRIVEN FORECASTING SYSTEM FOR EVALUATING FORECAST MODEL PERFORMANCE

SAMPLING- Method of Psychology. By- Mrs Neelam Rathee, Dept of Psychology. PGGCG-11, Chandigarh.

How to Read 100 Million Blogs (& Classify Deaths Without Physicians)

Transcription:

The LTA Data Aggregation Program Brian P Flaherty The Methodology Center The Pennsylvania State University August 1, 1999

I Introduction Getting data into response pattern format (grouped data) has often been a difficulty for new users of latent transition analysis (LTA; Collins, Wugalter & Rousculp, 1992) Typically, researchers have data files which are at the level of the individual Individual level data files have a separate set of responses for each person in the sample If one collected all the individuals who made identical responses to each of the items, one could record that set of responses once and follow it with the number of people making those particular responses In other words, one could cross-tabulate the item responses and record the number of subjects in each cell of the contingency table This is the format in which LTA expects data The program documented here creates, from a rectangular, individual level ASCII text file, a response pattern data file useable by LTA II Preprocessing the Data The individual level data need to meet certain criteria before this program can accurately aggregate the data The responses must be coded correctly, including any missing values, and the variables must be written out in a certain order LTA is a categorical data program, so all variables in an analysis must be coded categorically While the coding of categorical variables is arbitrary, LTA requires that response categories be consecutive positive integers starting with one So even if the codes you are working with are currently letters or some other set of labels, these must be recoded in order for 1

this program and LTA to handle them properly If missing values will be included in the LTA analysis, the missing values must be coded as zero This is the only code telling LTA that the response to a particular item is missing This recoding is easily performed in a statistical package like SPSS, SAS, etc The individual level data should be written out with at least one space between the variables, as shown in figure 1 LTA expects the items in the analysis to appear in the data file in a certain order For example, if one were analyzing four items measured three times, the items must appear in the same order within each time, and the four time 1 items must precede the time 2 items, which must precede the time 3 items If latent class indicators are also being used, they must come before all items measuring the latent statuses Finally, some people think that it is incorrect to have people in the analysis with all responses missing If one does not wish to have a completely missing response pattern in the analysis, then they should also be removed during this preprocessing step III Running the Data Aggregation Program The data aggregation program is invoked at the command prompt, either DOS or UNIX, by the command dataaggexe Once started, the user is asked four questions and then the program runs The user must type in 0 0 0 0 0 0 0 0 2 2 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 1 2 1 2 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 2 1 1 1 2 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 2 2 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 2 2 2 1 2 1 1 1 2 1 2 1 2 2 2 2 2 1 2 1 2 1 2 2 0 0 0 0 2 2 2 2 2 2 2 2 0 0 0 0 0 0 0 0 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 Figure 1 An example individual level data file 2

the name of the input (individual level) data file, the name of the output data file, the number of respondents in the input data file, and the number of variables (columns) in the input data file For example, if an LTA problem has four items measured at five times, the input data might look like Figure 1 These are the data for the subjects comprising the analysis When running the data aggregation program, the name of the input and output files need to be specified, as well as the number of subjects and variables, here equal to 20 Figure 2 shows part of the response pattern file created by the data aggregation 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 2 1 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 2 2 1 1 1 0 0 2 2 1 1 0 0 0 0 2 1 1 1 2 2 1 1 1 1 1 0 1 2 1 2 2 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 2 1 2 1 2 2 2 2 2 2 2 2 1 1 1 1 0 0 0 0 0 2 2 2 1 0 0 0 0 2 2 2 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 162 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 2 Figure 2 An example of the resultant response pattern file program IV Limitations Currently, there are a few limitations to the program The upper limit on the number of variables is 9999 at one time of measurement It is unlikely that this will prove to be a constraint The only real limitation is that the program is slow with a large data file The data needs to be sorted, and the sort routine is not the fastest available The sort routine is a selection sort (adapted from Ellis, Philips, & Lahey, 1994) Future versions of the program may have an improved sorting algorithm 3

V Bug reports these methods Send bug reports, suggestions, and/or feature requests to Brian P Flaherty in either of Email: bxf4@psuedu US Post Mail: The Pennsylvania State University The Methodology Center S-159 Henderson Bldg University Park, PA, 16802 References Collins, L M, Wugalter, S E, & Rousculp, S S (1992) Latent Transition Analysis (LTA) Program Manual (Version 10) Los Angeles: University of Southern California Ellis, T M R, Philips, I R, & Lahey, T M (1994) Fortran 90 Programming New York: Addison-Wesley 4