Dynamics in Social Networks and Causality


Web Science & Technologies, University of Koblenz-Landau, Germany. Dynamics in Social Networks and Causality. JProf. Dr., University of Koblenz-Landau, GESIS Leibniz Institute for the Social Sciences

Last Time: Case study: the spread of Facebook; (causal) relationships; correlation. Today: (causal) relationships; regression analysis; matching methods; spreading of culture

Isolating mechanisms is difficult but important. We should think about the mechanisms that may explain causal effects before we measure them. http://www.bitbybitbook.com/en/running-experiments/beyond-simple/mechanisms/

Is X related to Y? RELATIONS

Correlation Coefficient: the normalized version of the covariance:

r = cov(X, Y) / (sd(X) * sd(Y)), where cov(X, Y) = mean[(X - mean(X)) * (Y - mean(Y))] and var(X) = mean[(X - mean(X))^2]

If the deviations from the means frequently have the same sign: positive correlation. Frequently different signs: negative correlation. Varying signs: no correlation.
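A minimal sketch of this definition in Python, on synthetic data (the data-generating numbers are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(0, 2, 1000)
y = 3 * x + rng.normal(0, 1, 1000)      # positive linear relationship plus noise

# Covariance: average product of the deviations from the means
cov = np.mean((x - x.mean()) * (y - y.mean()))
# Correlation: covariance normalized by both standard deviations
r = cov / (x.std() * y.std())
```

Because both the covariance and the standard deviations use the same 1/n factor, this matches `np.corrcoef(x, y)[0, 1]` up to floating-point error.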

Correlation coefficient http://guessthecorrelation.com/ http://greenteapress.com/thinkstats/thinkstats.pdf

Can we predict Y from X? LINEAR REGRESSION

Observations: Can we predict Y from X? Can we formulate a model that expresses Y as a function of X?

Assume a Linear Relationship: Y_i = b0 + b1 X_i + e_i, with e_1, ..., e_n iid ~ N(0, sigma^2), so Y_i ~ N(b0 + b1 X_i, sigma^2). Example: Y_i = 2 + 0.5 X_i. Predicted value: Y^_i = b0 + b1 X_i; the error e_i is the gap between the observed and the predicted value. [Figure: scatter plot of observed values around the fitted line, with the intercept b0 at X = 0.]

Bivariate Regression: how do we estimate b0 and b1? Minimize the sum of squared errors:

SSE = sum_{i=1..N} (Y_i - Y^_i)^2 = sum_{i=1..N} (Y_i - (b0 + b1 X_i))^2

[Figure: scatter plot with candidate regression line.]
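Minimizing the SSE has a closed-form solution: b1 = cov(X, Y) / var(X) and b0 = mean(Y) - b1 * mean(X). A sketch on synthetic data generated from the slide's example line Y = 2 + 0.5 X (noise level and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 8, 200)
y = 2 + 0.5 * x + rng.normal(0, 0.5, 200)   # true b0 = 2, b1 = 0.5

# Closed-form least-squares estimates
b1 = np.mean((x - x.mean()) * (y - y.mean())) / np.mean((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# The minimized objective from the slide
sse = np.sum((y - (b0 + b1 * x)) ** 2)
```

The estimates land close to the true coefficients and agree with `np.polyfit(x, y, 1)`.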

Interpreting coefficients: Y = b0 + b1 X + e. b1 is the average effect of X on Y: a one-unit increase in X leads to b1 units of change in Y. b0 is the best guess for Y if X = 0.

Example: how much does the IQ of a kid differ on average depending on whether the mother has a high-school degree? IQ_kid = b0 + b1 HC_mom + e (outcome IQ_kid, predictor HC_mom, intercept b0, coefficient b1, error e). Binary predictor: b1 tells us how much IQ_kid differs between the two groups.

Example, continuous predictor: how much does the IQ of a kid change on average if we increase the IQ of the mother? IQ_kid = b0 + b1 IQ_mom + e (outcome IQ_kid, predictor IQ_mom, intercept b0, coefficient b1, error e). b0 is IQ_kid when IQ_mom = 0; b1 is the difference in IQ_kid between two groups that differ by one unit of IQ_mom.

Example: IQ_kid = 26 + 0.6 IQ_mom. If we compare children, a one-unit change in the mother's IQ corresponds to a 0.6-unit change in the kid's IQ; a 10-point difference in the mothers' IQ corresponds to a 6-point difference in the kids' IQ. What does the intercept tell us? Kids of mothers with IQ = 0 would have a predicted IQ of 26 (an extrapolation, since no mother has an IQ near 0).

Multiple Predictors (continuous and binary): IQ_kid = b0 + b1 IQ_mom + b2 HC_mom + e. b0 is IQ_kid when IQ_mom = 0 and HC_mom = 0. b1 is the difference in IQ_kid between two groups that differ by one unit of IQ_mom but have the same HC_mom value. b2 is the difference in IQ_kid between two groups that differ in HC_mom but have equal IQ_mom.
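A sketch of fitting such a two-predictor model by least squares (the coefficient values and group shares in the simulated data are made up for illustration, not taken from the lecture's example):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1000
iq_mom = rng.normal(100, 15, n)
hc_mom = (rng.uniform(size=n) < 0.7).astype(float)   # 1 = mother has a high-school degree
iq_kid = 20 + 0.55 * iq_mom + 6 * hc_mom + rng.normal(0, 10, n)

# Design matrix with an intercept column; solve for all coefficients at once
X = np.column_stack([np.ones(n), iq_mom, hc_mom])
b0, b1, b2 = np.linalg.lstsq(X, iq_kid, rcond=None)[0]
```

b1 is then the IQ_mom slope holding HC_mom fixed, and b2 the high-school-degree difference holding IQ_mom fixed, exactly the interpretations on the slide.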

Correlation versus Linear Regression: both measure the strength of the linear relationship between X and Y. Correlation gives you a bounded measurement of how closely X and Y follow a perfect linear relationship. The regression coefficient indicates the estimated change in the expected value of Y for a given change in X; it depends on the scale. The Pearson correlation coefficient is the slope of the regression line when both variables have been standardized first.

Do Regression Models Measure Causal Effects? Do students from elite colleges earn more later in life? earn ~ b0 + b1*college + error. Do people who went to an elite college earn more on average later in life? [Diagram: College -> Salary, over time.]

Problem with Regressions: being accepted into an elite college correlates with motivation and socio-economic status, and these factors also correlate with salary. [Diagram: socio-economic status and motivation both point to College and Salary.]

Observational Data: randomization in experiments assures that the treatment assignment T is independent of the covariates C (C -> O and T -> O, but no arrow into T). In observational studies we need to control for covariates C that affect both the outcome O and the treatment assignment T (C -> T, C -> O, T -> O).

Do Regression Models Measure Causality? Even if we include all relevant covariates so that C no longer confounds T and O, a regression model measures population-wide average effects: the average effect of the treatment T on the outcome O when controlling for the covariates C. earn ~ b0 + b1*college + b2*motivation + b3*socio-econ + error

Causal Effect, Individual-Level Effect: did going to an elite college impact your future earnings? Y_i(T) - Y_i(C). [Figure: wealth over time, with and without college.]

Does X cause Y? CAUSAL RELATIONSHIPS

Solutions: Matching Methods. Idea: find people that look like twins in the pre-treatment covariates.

Matching Methods. Goal: the treatment assignment T should be conditionally independent of the outcome O given the observed covariates X (T -> O, X -> both). Balance the distribution of the observed covariates in the treated and control group: matching == pruning. Unobserved covariates that are related to the treatment and the outcome and are not correlated with the observed covariates are still a problem!

Matching Methods, 4 steps: 1) Define a distance measure. 2) Find matches (e.g., use greedy k:1 matching). 3) Assess the quality of the matches: are the covariate distributions similar for the treated and untreated group? 4) Analyze the effect of the treatment on the outcome.

Does Special Training Help Job Promotion? Outcome: position. One-dimensional covariate: education (in years). Treated: those with elite education. [Figure: position vs. education for treated and control units.] Ho, Daniel, Kosuke Imai, Gary King, and Elizabeth Stuart. 2007. "Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference." Political Analysis 15: 199-236. Copy at http://j.mp/jpupwz

Example: how do we match people based on education? Greedy distance matching:

Treated  Edu  Pos
   1      4    5
   1      7    5
   1     10    9
   0      6    5
   0     10    5
   0      1    9

Greedy matches (on education): 4-6, 7-10, 10-1. One really bad match! Optimal solution: 4-1, 7-6, and 10-10.
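The slide's toy example can be sketched in Python: greedy 1:1 matching walks through the treated units in order and grabs the nearest still-unmatched control, while the optimal matching minimizes the total distance over all assignments (brute force is fine for three units):

```python
from itertools import permutations

treated = [4, 7, 10]     # education (years) of the treated units, from the slide
controls = [6, 10, 1]    # education (years) of the control units

# Greedy 1:1 matching: each treated unit takes the nearest unmatched control
available = list(controls)
greedy = []
for t in treated:
    c = min(available, key=lambda c: abs(t - c))
    available.remove(c)
    greedy.append((t, c))

# Optimal matching: try every assignment, keep the smallest total distance
optimal = min((list(zip(treated, perm)) for perm in permutations(controls)),
              key=lambda pairs: sum(abs(t - c) for t, c in pairs))
```

Greedy yields the pairs 4-6, 7-10, 10-1 (total distance 14), while the optimal assignment 4-1, 7-6, 10-10 has total distance 4, reproducing the slide's "one really bad match".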

Greedy Distance Matching: greedy matching may not find the optimal matches! Introduce a caliper, the maximal acceptable distance, and throw bad matches away. Pair matching (1:1 matching), but 1:k matching is also possible. What if our data is multivariate?

Euclidean distance doesn't make sense when different dimensions are on different scales, for example yearly income, age, gender, and body weight. Problem: the distance is dominated by the dimensions with the largest values.

Mahalanobis Distance: a multivariate distance measure that measures how many standard deviations apart two points are. It rescales the variables based on their direction and variance using the inverse of the variance-covariance matrix S: d(x, y) = sqrt((x - y)^T S^-1 (x - y))
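A minimal sketch of this formula (the income/age sample values are made up to echo the different-scales example above):

```python
import numpy as np

def mahalanobis(x, y, cov):
    """Distance in units of standard deviation, accounting for scale and correlation."""
    diff = np.asarray(x, float) - np.asarray(y, float)
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

# Covariates on very different scales: yearly income and age
data = np.array([[30000, 25], [52000, 31], [41000, 45],
                 [76000, 52], [63000, 38]], dtype=float)
S = np.cov(data, rowvar=False)          # variance-covariance matrix of the sample

d = mahalanobis(data[0], data[1], S)    # scale-free distance between two subjects
```

With S equal to the identity matrix the formula reduces to the plain Euclidean distance, which is a quick sanity check.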

Example: distance matching approximates a fully blocked experiment. Completely randomized experiment: flip a coin for each subject (heads -> T, tails -> C); we could get unlucky, e.g. all men assigned to T. Fully blocked experiment: first pair up similar subjects (e.g. same gender, age, ...), then flip a coin for each pair; one gets T, one C. This balances the known covariates. [Figure: position vs. education with matched pairs.] Gary King, "Why Propensity Scores Should Not Be Used for Matching", Methods Colloquium, 2015

Assess the Quality of the Matches: was the matching successful? Are the covariates balanced between the two groups? The standardized mean difference (SMD) is the difference between the groups for each covariate, in units of standard deviation. SMD < 0.1 is good; SMD > 0.2 indicates serious imbalance.
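A sketch of the balance check (using the pooled-standard-deviation convention, which is one common way to standardize; the lecture does not pin down a specific variant):

```python
import numpy as np

def smd(treated, control):
    """Standardized mean difference: group difference in units of pooled standard deviation."""
    treated = np.asarray(treated, float)
    control = np.asarray(control, float)
    pooled_sd = np.sqrt((treated.var(ddof=1) + control.var(ddof=1)) / 2)
    return abs(treated.mean() - control.mean()) / pooled_sd
```

Applied to each covariate after matching, values below 0.1 indicate good balance and values above 0.2 signal serious imbalance, as on the slide.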

Matching Methods, 4 steps: 1) Define a distance measure. 2) Create matches (e.g., use greedy k:1 matching). 3) Assess the quality of the matches: are the covariate distributions similar for the treated and untreated group? 4) Analyze the effect of the treatment on the outcome: ATE, regression.

Average Treatment Effect (ATE), Randomization Tests: compute the test statistic from the observed (matched) data, ATE = mean(Outcome_Treated) - mean(Outcome_Control). Assume the null hypothesis is true (no treatment effect): ATE = E[Outcome_Treated] - E[Outcome_Control] = 0. Randomly permute the treatment assignments (shuffle the treatment labels) and recompute the test statistic. Is our observed test statistic surprising?
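The shuffle-and-recompute procedure can be sketched directly (the outcome values below are hypothetical matched data, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(3)
treated = np.array([5.1, 6.0, 6.4, 7.2, 5.8])   # hypothetical matched outcomes
control = np.array([4.9, 5.2, 5.5, 6.1, 5.0])

ate_hat = treated.mean() - control.mean()        # observed test statistic

# Under the null of no treatment effect the labels are exchangeable:
# shuffle them many times and see how extreme the observed statistic is
pooled = np.concatenate([treated, control])
n_t = len(treated)
perm_stats = []
for _ in range(10000):
    rng.shuffle(pooled)
    perm_stats.append(pooled[:n_t].mean() - pooled[n_t:].mean())
p_value = np.mean(np.abs(perm_stats) >= abs(ate_hat))
```

A small p-value means the observed ATE would be surprising if the treatment had no effect.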

Regression After Matching: pos = b0 + b1*edu + b2*is_treated. 1) Preprocessing (matching); 2) estimation of effects (regression models). Matching can help to reduce model dependence! [Figure: position (p) vs. education (e).]

What if we simply use regression analysis? pos = b0 + b1*edu + b2*is_treated, where is_treated is a binary variable and b2 is the estimated treatment effect. Correcting for education, the treated group has higher positions. [Figure: position change (p) vs. education (e).]

Quadratic Regression: pos = b0 + b1*edu + b2*edu^2 + gamma*is_treated. Correcting for education, the treated group now has lower positions: model dependence. Too much freedom is given to the analyst; the reason is the imbalance of the covariates. [Figure: position (p) vs. education (e) with quadratic fits.]

Distance Matching works well if we have few covariates (fewer than about 50). What if we have high-dimensional data with many covariates? Which should we include? Idea: project the covariates into a lower-dimensional space: compute for each observation (a high-dimensional vector) one number, the probability of being treated (the propensity score), and match based on this number.

Propensity Score Matching: the propensity score is the probability that a subject receives the treatment, given all covariates we want to control for. If a subject has a propensity score of 0.3, a subject with these covariate values has a 30% chance of receiving the treatment. Idea: subjects are matched not on the covariates but on the propensity score.

Propensity Score Matching approximates complete randomization. Completely randomized experiment: flip a coin for each subject (heads -> T, tails -> C); we could get unlucky, e.g. all men assigned to T. [Figure: position vs. education.]

Propensity Score Matching: estimate the propensity score pi(X) from the observed data. Outcome variable: T; independent variables: X. Logistic regression: take the predicted outcome as the propensity score. Idea: achieve balance in the covariates by conditioning on the propensity score. This works if: P(X = x | pi(X) = p, T = 1) = P(X = x | pi(X) = p, T = 0). Note: in a randomized trial the propensity score (the allocation probability) is known: P(T = 1 | X) = P(T = 1) = 0.5.
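A minimal sketch of this estimation step, fitting the logistic regression T ~ X by gradient ascent on the log-likelihood (the covariate, the true treatment model, and the learning-rate/iteration choices are all illustrative assumptions; in practice one would use a library such as statsmodels or scikit-learn):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(0, 1, n)                                  # one covariate, e.g. parental income
true_p = 1 / (1 + np.exp(-(0.3 + 1.5 * x)))              # assumed treatment model
t = (rng.uniform(size=n) < true_p).astype(float)         # treatment indicator T

# Fit logistic regression T ~ X by gradient ascent on the average log-likelihood
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(5000):
    p = 1 / (1 + np.exp(-X @ beta))
    beta += 0.5 * X.T @ (t - p) / n                      # score (gradient) step

propensity = 1 / (1 + np.exp(-X @ beta))                 # one number per subject
```

Each subject is thus reduced to a single number in (0, 1), and matching then proceeds on this number (or its logit) instead of the full covariate vector.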

Propensity Score Matching. Matching: greedy, nearest neighbour. Match on logit(propensity score), the log-odds of the propensity score; we do that to stretch the values, since treatment probabilities are often very small. Look at the standard deviation of the transformed propensity scores. Caliper: remove bad matches, i.e. subjects that are more than 0.2 standard deviations away from the propensity score of their match. A smaller caliper means lower bias but more variance.

Assess Quality of Matches: Is the distribution of propensity scores similar for treated and control group?

Instrumental Variables. Problem: let's assume we cannot observe socio-economic status and motivation. What could we do? [Diagram: socio-economic status and motivation point to College and Salary.]

Instrumental Variables are highly correlated with the (unobserved) covariates but not with the outcome. Example: applying to an elite college correlates with motivation; the living area correlates with socio-economic status. earn ~ b0 + b1*college + b2*collegeap + b3*living + error. We come closer to the causal effect of the college choice on earnings after controlling for the confounder socio-economic status.

Summary: matching methods and instrumental variables are powerful and help to approximate causality. Problems: instrumental variables are often hard to find; researchers have a lot of freedom when deciding how to match; we remove data, so we need to specify for which group the causal effect holds. Compare the results from different matching methods, different dimensionality-reduction methods, and different models. Avoid model dependence and method dependence!

Dissemination of Culture: How does culture evolve?

Cultural practices

Dynamics of Culture: what is culture, and how does it diffuse? Culture can be seen as an agglomerate of beliefs, opinions, values, behaviour, and other things that certain groups of people have agreed on. People learn culture by interacting, and people are more likely to interact with similar people: e.g., if they share a language they are more likely to interact and to start sharing other traits over time. Why do we see global polarization and local convergence?

Axelrod Model: each agent has a vector of f different features, and each feature may take q different traits. Example: features (Ethnicity, Political Orientation, Religion) with trait vector (1, 5, 4).

Initial Setup

Axelrod Model, dynamic process: an agent i and one of its neighbours j are randomly selected. The overlap w_ij between their cultural vectors (the fraction of features with identical traits) is computed. With probability w_ij the interaction takes place. If the interaction takes place, one feature is selected randomly and the trait of the neighbour j is set to the trait of i for this feature.
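A minimal sketch of this update rule on a square grid (grid size, f, q, and the number of steps are arbitrary choices; selecting the copied feature among the differing ones is a common implementation convention, since copying a shared feature would change nothing):

```python
import random

def axelrod_step(grid, f, rng):
    """One interaction of Axelrod's culture model on an n x n grid of trait vectors."""
    n = len(grid)
    i, j = rng.randrange(n), rng.randrange(n)
    # pick a random lattice neighbour (von Neumann neighbourhood)
    neighbours = [(i + di, j + dj)
                  for di, dj in ((0, 1), (0, -1), (1, 0), (-1, 0))
                  if 0 <= i + di < n and 0 <= j + dj < n]
    ni, nj = neighbours[rng.randrange(len(neighbours))]
    a, b = grid[i][j], grid[ni][nj]
    overlap = sum(x == y for x, y in zip(a, b)) / f
    # the interaction takes place with probability equal to the cultural overlap
    if overlap < 1 and rng.random() < overlap:
        # the neighbour j adopts one of agent i's differing traits
        k = rng.choice([idx for idx in range(f) if a[idx] != b[idx]])
        b[k] = a[k]

f, q, n = 3, 4, 5                       # features, traits per feature, grid size
rng = random.Random(42)
grid = [[[rng.randrange(q) for _ in range(f)] for _ in range(n)] for _ in range(n)]
for _ in range(20000):
    axelrod_step(grid, f, rng)
```

Running this long enough on a small grid typically produces homogeneous cultural regions, illustrating the local-convergence/global-polarization question below.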

Will we converge to one mono-culture? http://www-personal.umich.edu/~axe/research/dissemination.pdf

Cultural Complexity: the more cultural features we have, the more likely it is that two agents will have something in common and can interact. With few features and many traits, the probability that agents will not share anything is high. http://www-personal.umich.edu/~axe/research/dissemination.pdf

Any further questions? See you next week