eMERGE Network CERC Survey: Survey Sampling Data Preparation


Overview

The entire patient population does not use inpatient and outpatient clinic services at the same rate, nor are racial and ethnic subpopulations represented proportionately. If the clinic population is used as the pool of potential survey participants, random sampling will likely bias the survey results toward the opinions of an older, sicker, and whiter portion of the populace. We will use demographic and geocode-based data to develop a stratified sampling scheme that enables greater representation from demographic groups that are less likely to come to clinic and are less often studied. The Sampling Strategy Workgroup will centrally develop a sampling plan for use by the entire Network. The plan will be based on patient-specific demographic data, combined with data from the 2012 American Community Survey (ACS), that the sites will send to the Coordinating Center (CC).

Population definitions

Adult populations: All patients who have had at least one inpatient or outpatient clinic visit at one of the participating sites between October 1, 2012 and October 1, 2014, who were at least 18 years old at the time of the last visit, who have a valid home address in the United States of America, and who are not known to be deceased.

Pediatric populations: Parents of all patients who have had at least one inpatient or outpatient clinic visit at one of the participating sites between October 1, 2012 and October 1, 2014, who were less than 18 years old at the time of the last visit, who have a valid home address in the United States of America, and who are not known to be deceased.

Data transfers

There will be four population data transfers relevant to the sampling strategy: two transfers from the site to the CC and two transfers from the CC to the site. The first data transfer corresponds to the pilot study and will be used, in part, to identify all challenges. The second will be on a relatively short timeline and will be used to identify participants for the survey itself.

Data transfer #1 (May 1, 2014 to June 30, 2014) Site → CC:

Sites will send the CC data on their entire population of patients who are eligible to participate in this study, but will restrict the date of the most recent clinic visit to the period from October 1, 2012 to May 1, 2014.

Data transfer #2 (September 1, 2014) CC → Site:

The CC will return the population data to the sites with five added columns:

1) MetInclusion (1 = met phase 1 and phase 2 checks*)
2) SurveyCode (A, B, C, or NA)
3) Passcode (a unique 6-character code used to access the website; values are lowercase and do not include "o" or "l")
4) InstitutionCode (a 2-digit code defined in the Scantron document)
5) CustomerContactID (a concatenation of the institution code and ID)

* Phase 1 identified subjects that had a missing ID, a duplicate ID, missing census information, a missing address/household ID, a common HouseholdID (shared by 20 or more patients), a date issue, or an address that was not geocodable (if available). Phase 2 then identified subjects with a missing age or gender, or with an age conflict (e.g., age >= 18 but seen at a pediatric center).

Data transfer #3 (November 1, 2014 to November 15, 2014) Site → CC:

Sites will update the population datasets using the entire date range, October 1, 2012 to October 1, 2014. The updated dataset will contain updated information (including updated age, geocode, and other data) for patients who were included in data transfer #1, as well as new patients observed between May 1, 2014 and October 1, 2014.

Data transfer #4 (January 15, 2015) CC → Site:

The CC will return the population data to the sites along with the five columns outlined in Data transfer #2.

Data transfer protocol

Data may be transferred from each institution to the Coordinating Center using the institution's own secure data transfer program or service. If one is not available, Data Hippo (https://data.vanderbilt.edu/data-hippo/) may be used. All data returned from the Coordinating Center will be sent using Vanderbilt's subscription to Accellion.
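As an illustration of the columns the CC adds in Data transfer #2, the following is a minimal Python sketch, not the CC's actual procedure. It assumes passcodes use letters only (the protocol does not say whether digits are allowed), implements only a subset of the phase 1 and phase 2 checks (missing or duplicate ID, missing or common HouseholdID, missing age or sex), and omits SurveyCode because it depends on the sampling plan. The function names and the sample values in the usage note are hypothetical.

```python
import secrets
import string
from collections import Counter

# Lowercase letters minus "o" and "l", as the Passcode description requires.
# Assumption: letters only; the protocol does not say whether digits are allowed.
PASSCODE_ALPHABET = "".join(c for c in string.ascii_lowercase if c not in "ol")


def make_passcode(used, length=6):
    """Draw a random passcode that is not already in `used`, and record it there."""
    while True:
        code = "".join(secrets.choice(PASSCODE_ALPHABET) for _ in range(length))
        if code not in used:
            used.add(code)
            return code


def met_inclusion(rows, household_limit=20):
    """Return 1 or 0 per row for a subset of the phase 1 and phase 2 checks."""
    id_counts = Counter(r.get("id") for r in rows)
    household_counts = Counter(r.get("HouseholdID") for r in rows)
    flags = []
    for r in rows:
        pid, hh = r.get("id"), r.get("HouseholdID")
        phase1 = (
            bool(pid)                                        # ID present
            and id_counts[pid] == 1                          # ID not duplicated
            and bool(hh)                                     # HouseholdID present
            and household_counts[hh] < household_limit       # not shared by 20+
        )
        phase2 = r.get("age") not in (None, "") and r.get("sex") not in (None, "")
        flags.append(1 if phase1 and phase2 else 0)
    return flags


def add_cc_columns(rows, institution_code):
    """Return copies of `rows` with four of the five added columns filled in."""
    used = set()
    out = []
    for row, flag in zip(rows, met_inclusion(rows)):
        new = dict(row)
        new["MetInclusion"] = flag
        new["Passcode"] = make_passcode(used)
        new["InstitutionCode"] = institution_code
        # CustomerContactID: institution code followed by the patient ID.
        new["CustomerContactID"] = f"{institution_code}{row.get('id') or ''}"
        out.append(new)
    return out
```

For example, `add_cc_columns([{"id": "VU00000001", "age": 54, "sex": "F", "HouseholdID": "H000001"}], "12")` would yield one row whose CustomerContactID is `12VU00000001`; the site prefix and institution code here are made up.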
Questions regarding the transfer of data should be directed to Nate Mercaldo (nate.mercaldo@vanderbilt.edu) and Jonathan Schildcrout (jonathan.schildcrout@vanderbilt.edu).

Process description

1. Send Data to CC

Send the CC a csv file that contains the demographic variables and the census-derived variables at the block group level (see Data to be sent to the CC below). The table will have N rows, where N is the number of unique patients. Please use the variable names (case-sensitive) provided below, but please DO NOT include the GEOID variable in the dataset. See DemographicsExample.csv for an example of the file that should be sent to the CC.

2. CC will create sampling strata

The CC will cross-tabulate the demographic data at each site. The cross-tabulation will involve the following stratification variables: age, sex, race, ethnicity, urban/rural (census block group), median income (census block group), and educational attainment for adults at least 25 years old (census block group). For patients with missing race or ethnicity, we will define strata using the race and ethnicity information at the block group level.

3. Sampling Strategy Workgroup Develops Sampling Plan

The CC will work with the Sampling Strategy Workgroup to develop a sampling plan. The sampling scheme will consider the availability of demographic subpopulations that are of interest across the Network. The goal is to develop the simplest approach (i.e., the fewest sampling strata) that will permit adequate representation of the subpopulations of interest. This can only be detailed after the demographic data are observed.

4. Data Returned to Site

After the sampling plan has been identified, the CC will return the population data to the sites as described above in Data transfers.

Data to be sent to the CC

The data dictionary below describes the demographics dataset that is to be joined with both the block group based census data (including urban/rural) and the ACS 2008-2012 five-year summary file estimates. The CC created a national shapefile and state-specific shapefiles that include all relevant census and ACS data. Join the demographics dataset with the shapefile. The datasets are to be joined by the block group variable, GEOID.

Important note: Do not include GEOID in the file that is sent to the CC.
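The join-then-drop step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the sites' required tooling: it assumes the shapefile attributes have already been exported to one dictionary per block group, keyed by GEOID, and that GEOID is held as a string so leading zeros survive. The function names and sample values are hypothetical.

```python
import csv
import io


def join_and_strip_geoid(demographics, census_by_geoid):
    """Left-join patient rows to block-group rows on GEOID, then drop GEOID."""
    joined = []
    for patient in demographics:
        # Patients whose block group is not found keep only their own columns.
        census = census_by_geoid.get(patient.get("GEOID"), {})
        row = {**patient, **census}
        row.pop("GEOID", None)  # GEOID must not appear in the file sent to the CC
        joined.append(row)
    return joined


def to_csv(rows, fieldnames):
    """Serialize rows as csv text; columns missing from a row are left blank."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up example values, for illustration only.
demographics = [
    {"id": "VU00000001", "age": "54", "GEOID": "470370165001"},
    {"id": "VU00000002", "age": "8", "GEOID": "999999999999"},
]
census_by_geoid = {
    "470370165001": {"GEOID": "470370165001", "LSAD10": "75", "B19013_001": "52000"},
}

joined = join_and_strip_geoid(demographics, census_by_geoid)
csv_text = to_csv(joined, ["id", "age", "LSAD10", "B19013_001"])
```

Dropping GEOID after the join, rather than before, keeps the census-derived columns attached to each patient while removing the block group identifier itself from the outgoing file.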
Variable Name | Format | Definition
------------- | ------ | ----------
id | character(10) | Patient identifier. The first two characters correspond to the eMERGE site id.
age | numeric | Patient age on the date of the most recent clinic visit, rounded to the nearest year (a whole number). If age cannot be calculated this way, please indicate how it was calculated.
sex | character(1) | "M" = male, "F" = female, "O" = other, "U" = unknown
race | character(2) | Patient race. Include race categories that are at least as finely detailed as those listed in NIH Planned Enrollment Report.docx (in the dropbox folder). Attach a detailed and comprehensive data dictionary.
ethnicity | character(1) | Hispanic or Latino: "Y" = yes, "N" = no, "U" = unknown
biobank | numeric | Is the patient a participant in the biobank? 1 = yes; 2 = no (declined participation); 3 = unknown
monthyear | character(7) | Month and year of the most recent inpatient or outpatient contact with the EMR. Format: mm/yyyy
AddressID | character(7) | This variable should be coded so that each unique street address can be identified. Patients who live at the same street address should receive the same AddressID. Patients who live in different apartments at the same street address should also receive the same AddressID.
HouseholdID | character(7) | This variable should be coded so that each unique household can be identified based on the street address and possibly the apartment number. Patients who live at the same street address but in different apartments should receive different values for HouseholdID.
GEOID | numeric | Census block group: a 12-digit concatenation of state (2 digits), county (3 digits), census tract (6 digits), and block group (1 digit). Not to be confused with GEOID_1 and GEOID_TIGR.

The following is a data dictionary of the census and ACS shapefiles that are to be joined with the demographics dataset. For a data dictionary of the ACS five-year summary tables, see ACS2012_5-year_TableShells.xls in the dropbox folder. To understand how the census and ACS shapefile was developed, see the document emergegeocodedata.docx in the dropbox folder. Finally, see shapefile_download_instructions.docx for instructions on how to download the state-specific and country shapefiles.

Variable Name | Format | Definition
------------- | ------ | ----------
GEOID | numeric | Census block group: a 12-digit concatenation of state (2), county (3), census tract (6), and block group (1). Not to be confused with GEOID_1 and GEOID_TIGR.
LSAD10 | numeric | 75 = urbanized area (50,000 or more), 76 = urban cluster (2,500 to 50,000), missing = rural. From the 2010 Census urban and rural classification and urban area criteria.
B19013_001, B19013A_00 through B19013I_00 | numeric | Median household income overall and by race / ethnicity (White alone, Black or AA alone, Am Indian / Alaska Native alone, Asian alone, Native Hawaiian / Other Pacific Islander, Some other race alone, Two or more races, White alone not Hispanic or Latino, Hispanic or Latino) for the census block group. Table B19013 from the 5-year estimate summary file (ACS 2008-2012).
B03002_001 through B03002_021 | numeric | Number overall and of each race (White alone, Black or AA alone, Am Indian / Alaska Native alone, Asian alone, Native Hawaiian / Other Pacific Islander, Some other race alone, Two or more races, White alone not Hispanic or Latino, Hispanic or Latino, Two races including some other race, Two races excluding some other race / three or more races) by ethnicity (Hispanic, not Hispanic) combination within the census block group. Table B03002 from the 5-year estimate summary file (ACS 2008-2012).
B15002_001 through B15002_035 | numeric | Number in each educational attainment group (no schooling, nursery to 4th grade, 5th and 6th, 7th-8th, 9th, 10th, 11th, 12th with no diploma, HS grad/GED/alternative, some college less than 1 year, some college one or more years and no degree, associate's degree, bachelor's degree, master's degree, professional school degree, doctorate) by gender combination within the census block group, for those who are 25 or older. Table B15002 from the 5-year estimate summary file (ACS 2008-2012).
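The GEOID layout and the LSAD10 coding given in the data dictionaries can be made concrete with a short Python sketch. It is a hedged illustration only. The function names are hypothetical, and the zero-padding step is an assumption: because GEOID is listed as numeric, a leading zero in the state code may have been lost on import, and the sketch restores it.

```python
def split_geoid(geoid):
    """Split a 12-digit block group GEOID into its four components."""
    # Assumption: pad on the left, since a numeric GEOID for a state such as
    # "01" would otherwise have lost its leading zero.
    text = str(geoid).zfill(12)
    if len(text) != 12 or not text.isdigit():
        raise ValueError(f"not a 12-digit block group GEOID: {geoid!r}")
    return {
        "state": text[0:2],        # 2 digits
        "county": text[2:5],       # 3 digits
        "tract": text[5:11],       # 6 digits
        "block_group": text[11],   # 1 digit
    }


def urban_rural(lsad10):
    """Map an LSAD10 value to the classification in the data dictionary."""
    if lsad10 is None or lsad10 == "":
        return "rural"             # missing = rural
    code = int(lsad10)
    if code == 75:
        return "urbanized area"    # 50,000 or more
    if code == 76:
        return "urban cluster"     # 2,500 to 50,000
    raise ValueError(f"unexpected LSAD10 value: {lsad10!r}")
```

For example, `split_geoid("470370165001")` separates state `47`, county `037`, tract `016500`, and block group `1`, and `urban_rural("")` returns `"rural"`.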