Can a Pseudo Panel be a Substitute for a Genuine Panel?

1 Can a Pseudo Panel be a Substitute for a Genuine Panel?
Min Hee Seo
Washington University in St. Louis
minheeseo@wustl.edu
February 16th

2 Outline
Motivation: gauging the mechanism of changes
Introduce pseudo panels as an alternative statistical tool
Limitations in existing pseudo panel approaches
Technique for improvement
Empirical analysis
Result
Conclusion

3 Motivation: Presidential Approval Rating
Figure: Individual-level changes in presidential approval rating (categories: Strongly Approve, Somewhat Approve, Neutral, Somewhat Disapprove, Strongly Disapprove)
Data: CCES Panel

4 Motivation
1. Panel survey data are rarely available
2. Panel surveys are costly and less feasible to conduct
3. Cross-sectional surveys have limitations

7 Pseudo Panel as Alternative Tool
Advantages:
1. Different sources can be combined
2. Approximates a true panel
Disadvantages:
1. Measurement error (observed - true)
2. Absence of robust techniques
3. Reliability of pseudo panels is contested
4. Not yet applied in political science
Types:
1. Macro/cohort level
2. Individual level

10 Pseudo Panel with Matching Technique
What it does:
1. Finds a unit with similar observable characteristics
2. Reduces bias due to confounding
3. Enables a comparison of outcomes between matched and original units
Nearest neighbor matching:
Propensity scores are a common tool for matching cases
Match on the nearest distance of a scalar, $\pi$
Propensity score: $\pi = \Pr(Y = 1 \mid X)$
$\mathrm{Distance}(X_i, X_j) = |\pi_i - \pi_j|$
Limitations:
1. Applies only to complete cases
2. Focuses on the distribution of covariates based on a single criterion rather than one-to-one exact matching
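
The nearest-neighbor step can be sketched as follows. This is a minimal illustration, assuming the propensity scores have already been estimated for the units in two cross-sections. The function name `nearest_neighbor_match` and the toy scores are hypothetical, and matching is done with replacement for simplicity.

```python
def nearest_neighbor_match(scores_a, scores_b):
    """For each unit i in wave A, return the index j of the wave-B unit
    that minimizes |pi_i - pi_j| (matching with replacement)."""
    matches = []
    for pi_i in scores_a:
        j = min(range(len(scores_b)), key=lambda idx: abs(pi_i - scores_b[idx]))
        matches.append(j)
    return matches


# Hypothetical propensity scores for two cross-sectional waves.
wave1 = [0.10, 0.52, 0.90]
wave2 = [0.85, 0.15, 0.50, 0.49]

pairs = nearest_neighbor_match(wave1, wave2)
print(pairs)  # [1, 2, 0]
```

Because the match depends on a single scalar, two units with very different covariate profiles can still be paired when their scores happen to be close, which is the second limitation listed above.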

11 Pseudo Panel: Affinity Score Matching
Finds exact matches between two individuals based on n dimensions (accounting for discrete variables and missing values)
Table: Process of Constructing a Pseudo Panel with Affinity Score Matching (two tables with columns ID, year, y, x1, x2, x3, x4, ...; the first includes an NA entry)

14 Validation and Empirical Application
Data
Survey data: Cooperative Congressional Election Study (CCES), both panel and cross-sectional surveys, n = 9500
Measurement
Response variable: Obama's approval rating (5-point scale)
Explanatory variable: positive perception of the national economy between two waves
Control variables: female, party identification, education, race, income
Model Strategy
$\text{Approval Rating}_i = \alpha_{j[i]} + \beta_{[i]}\,(\text{time}_i \times \text{BetterEcon}_i) + \varepsilon_i$
$\alpha_j \sim N(\mu_\alpha, \sigma_\alpha^2)$

15 Result: Varying Intercept Model on Obama's Approval Rating
Standard errors by data source (point estimates are missing from this transcript):
                      True Panel   Affinity Matching Pseudo   Propensity Matching Pseudo
Democrat              (0.018)      (0.018)                    (0.019)
Republican            (0.018)      (0.018)                    (0.019)
Time2:BetterEcon.t    (0.014)      (0.025)                    (0.028)
Time3:BetterEcon.t    (0.014)      (0.027)                    (0.028)
Also reported (values missing): σ_α, σ_y, ICC
Number of observations = 28500, unique individuals = 9500. Standard errors are in parentheses.
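
The slide reports an ICC alongside σ_α and σ_y but does not spell out the formula. Assuming the conventional definition for a varying-intercept model, it is the share of total variance attributable to the individual-level intercepts:

```latex
\mathrm{ICC} = \frac{\sigma_\alpha^2}{\sigma_\alpha^2 + \sigma_y^2}
```

Under this definition, an ICC near 1 means most variation in approval lies between individuals rather than within an individual over time.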

17 Method: Statistical Power
Power: 1 - the probability of making a Type II error ($\beta$)
Estimates the precision of inferences
Expectations:
True panel > Pseudo panel (affinity score)
Pseudo panel (affinity score) > Pseudo panel (propensity score)
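
To see why a noisier estimate implies lower power, here is a minimal sketch using the analytic power of a two-sided z-test. This is an assumed stand-in, not the power procedure used in the talk. The effect size of 0.05 is hypothetical, the function name `z_test_power` is illustrative, and the standard errors echo those reported for the Time2:BetterEcon.t row.

```python
from statistics import NormalDist


def z_test_power(effect, std_error, alpha=0.05):
    """Power of a two-sided z-test, i.e. 1 - P(Type II error)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect / std_error
    # Probability that the test statistic lands in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


# Hypothetical true effect; larger standard errors yield lower power.
effect = 0.05
for label, se in [("true panel", 0.014), ("affinity", 0.025), ("propensity", 0.028)]:
    print(label, round(z_test_power(effect, se), 3))
```

With the effect held fixed, power falls as the standard error grows, which matches the ordering stated in the expectations.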

18 Result
Figure: Approval rating (Strongly Disapprove to Strongly Approve) by year for three sample IDs, with the estimated power reported above each panel: (a) True Panel, (b) Pseudo Panel (Affinity Score), (c) Pseudo Panel (Propensity Score)

19 Conclusion
Summary:
1. Existing studies that construct pseudo panels with matching techniques have limitations
2. An improved matching technique for building pseudo panels is suggested
Findings:
1. A pseudo panel can approximate true panel data
2. A more feasible technique for building pseudo panels is introduced

20 Where to go next?
Limitations: examined 1) a short period, 2) a specific outcome variable, 3) one specific type of pseudo panel
Future studies:
Explore the local level, different datasets, and different types of pseudo panel
Identify panel attrition by applying the affinity score matching technique
Power analysis in dynamic hierarchical models
Multiple imputation in longitudinal studies

21 Supplementary Materials
Detail: Riverplot, Cohort Pseudo Panel, Affinity Score, CCES, Aggregated Estimation, Data - Graphics
Climate Change Model: Individual-level - Table, Individual-level - Posterior Distribution, Cohort-level - Table, Cohort-level - Posterior Distribution

22 River plot of approval rating at the individual level:
1. Data: CCES
2. Panel: n = 9500, complete cases = [value missing]
3. Average percentage of n in each category: 47%, 8%, 1%, 25%, 19%
4. Percentage of n who changed their opinion over three waves: 32%

23 Cohort Pseudo Panel:
The sample is divided into a small number of cohorts with a large number of observations in each (Browning et al. 1985; Propper, Rees, and Green 2001).
A cohort is defined by time-invariant variables such as birth year.
Aggregate-level analysis:
$\bar{y}_{ct} = \bar{x}_{ct}\beta + \bar{\alpha}_{ct} + \bar{u}_{ct}$, where $c = 1, \dots, C$ and $t = 1, \dots, T$
1. If $n_c$ is large enough, the time-varying $\bar{\alpha}_{ct}$ can be treated as constant over time, $\bar{\alpha}_c$.
2. Bias due to sampling error in the cohort averages exists and can be substantial even for sample sizes in the thousands.
3. There is no robust approach to building a cohort pseudo panel: little discussion, but many blind applications.
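
The aggregation step behind a cohort pseudo panel can be sketched as follows. This is a minimal illustration, assuming cohorts are defined by birth-year bands. The function name `cohort_means` and the toy records are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def cohort_means(records, span=10):
    """Average y within each (birth cohort, survey year) cell.

    records: iterable of (birth_year, survey_year, y) tuples.
    span: width of each birth-year cohort in years.
    """
    cells = defaultdict(list)
    for birth_year, survey_year, y in records:
        cohort = (birth_year // span) * span  # e.g. 1963 -> 1960 when span=10
        cells[(cohort, survey_year)].append(y)
    return {cell: mean(values) for cell, values in cells.items()}


# Hypothetical repeated cross-sections: different respondents each year.
records = [
    (1961, 2010, 2), (1968, 2010, 4), (1975, 2010, 5),
    (1963, 2012, 3), (1966, 2012, 3), (1972, 2012, 4),
]
panel = cohort_means(records)
print(panel)
```

Each cohort is then followed across survey years as if it were one unit. With only a few respondents per cell, as in this toy data, the cell means are noisy, which is the sampling-error bias noted in point 2.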

24 Affinity Score Computation:
$\text{Affinity Score}_{i,j} = \dfrac{k_i - q_i - z_{i,j}}{k_i - q_i}$
$k_i$: the total number of variables of interest for individual $i$
$q_i$: the number of variables with missing values for individual $i$
$z_{i,j}$: the number of variables on which $i$ and $j$ have different values
$\text{Affinity Score}_{i,j}$: the number of variables that match exactly between the two individuals, divided by the number of variables of interest for individual $i$ (excluding those with missing values)
* Threshold: > 0.8 (among 7 dimensions, at least 6 must match exactly)
Criteria: age, gender, education, race, party identification, ideology, income
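
The score can be sketched in a few lines. This is a minimal illustration, assuming the score is (k_i - q_i - z_ij) / (k_i - q_i), that missing values are coded as `None`, and that a variable missing for j counts as a mismatch. The function name, variable keys, and respondent records are hypothetical.

```python
def affinity_score(person_i, person_j, variables):
    """(k_i - q_i - z_ij) / (k_i - q_i), treating None as a missing value."""
    k = len(variables)
    observed = [v for v in variables if person_i.get(v) is not None]
    q = k - len(observed)
    if k - q == 0:
        return 0.0  # assumption: nothing observed for i means no match
    # z counts observed variables on which i and j differ
    # (a value missing for j is treated as a difference).
    z = sum(1 for v in observed if person_i[v] != person_j.get(v))
    return (k - q - z) / (k - q)


variables = ["age", "gender", "education", "race", "party_id", "ideology", "income"]
respondent = {"age": 34, "gender": "F", "education": 4, "race": "white",
              "party_id": "D", "ideology": 2, "income": 6}
# Differs from respondent on income only: 6 of 7 variables match.
candidate_a = dict(respondent, income=7)
# Differs on income and ideology: 5 of 7 variables match.
candidate_b = dict(respondent, income=7, ideology=3)

THRESHOLD = 0.8
for name, cand in [("a", candidate_a), ("b", candidate_b)]:
    score = affinity_score(respondent, cand, variables)
    print(name, round(score, 3), score > THRESHOLD)
# a 0.857 True
# b 0.714 False
```

With seven criteria, one mismatch still clears the 0.8 threshold (6/7 ≈ 0.857) while two do not (5/7 ≈ 0.714), which agrees with the "6 of 7" rule on the slide.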

25 CCES
Cross-sectional: 2010 (n = 55400), 2012 (n = 54535), 2014 (n = 56200)
Approval Rating: 1 (strongly disapprove) to 5 (strongly approve)
National Economy Status: 1 (gotten much worse) to 5 (gotten much better)

26 Result
Figure: Approval rating (Strongly Disapprove to Strongly Approve) against support for same-sex marriage (Oppose to Support), with p reported above each panel: (a) True Panel, (b) Pseudo Panel (Affinity Score), (c) Pseudo Panel (Propensity Score)

27 Panel and Pseudo Panel Dataset
1. Pseudo Panel by Nearest Neighbor Propensity Score Matching (table columns: ID, year, y, x1, x2, x3, ...)
2. Pseudo Panel by Affinity Score Matching (table columns: ID, year, y, x1, x2, x3, ...)
vs. True Panel Survey (table columns: ID, year, y, x1, x2, x3, ...)

28 Individual-Level Analysis Result
Table: Logistic Hierarchical Model on Belief in Global Warming
Standard errors by data source (point estimates are missing from this transcript):
                       Panel      Pseudo Panel (Affinity)
Unusual Temperature    (0.019)    (0.019)
Female                 (0.239)    (0.228)
White                  (0.311)    (0.309)
Age                    (0.008)    (0.008)
Education              (0.062)    (0.062)
Democrat               (0.317)    (0.270)
Republican             (0.248)    (0.247)
Interest in Politics   (0.151)    (0.140)
Intercept              (7.281)    (6.948)
σt                     (2.374)    (3.044)
N wave                 3          3
Also reported (values missing): N, N panelist

29 Individual-Level Analysis Result
Figure: Comparison of posterior distributions of Unusual Temperature (density against estimated effect size) for panel, pseudo (affinity), and their difference

30 Cohort-Level Analysis Result
Table: Logistic Hierarchical Model on Belief in Global Warming
Standard errors by data source and cohort width (point estimates are missing from this transcript):
                       Panel                    Pseudo Panel (Affinity)
                       10-year     3-year       10-year     3-year
Unusual Temperature    (0.020)     (0.023)      (0.019)     (0.019)
Intercept              (0.997)     (0.986)      (0.961)     (0.957)
σc                     (0.521)     (0.252)      (0.245)     (0.170)
σt                     (2.636)     (1.515)      (1.765)     (2.714)
Also reported (values missing): ICC for σ²c, Control Variables, N, N panelist, N wave, N cohort

31 Cohort-Level Analysis Result
Figure: Comparison of posterior distributions of Unusual Temperature by cohort group (density against estimated effect size) for panel, pseudo (affinity), and their difference: (a) 10-year Age Span Cohort Group, (b) 3-year Age Span Cohort Group
