Modeling networks: regression with additive and multiplicative effects

Modeling networks: regression with additive and multiplicative effects
Alexander Volfovsky, Department of Statistical Science, Duke University
May 25, 2017. Health Networks

Why model networks?
- Interested in understanding the formation of relationships.
- Applied fields: sociology, economics, biology, epidemiology.
- Fundamental theory questions:
  - What assumptions are made for different network models?
  - What models work when the assumptions fail?
  - How to develop fail-safes to overcome these problems?
- Where to apply these?
  - Causal inference
  - Link prediction

Some context: Facebook
- Facebook wants to change its ad algorithm.
- Can't do it on the whole graph.
- Need the total network effect.
Source: Wikimedia

How do they solve it?
Interested in estimating the global average treatment effect
  (1/N) Σ_{i=1}^N [ Y_i(all treated) − Y_i(all control) ]
"At a high level, graph cluster randomization is a technique in which the graph is partitioned into a set of clusters, and then randomization between treatment and control is performed at the cluster level."
Where can we find clusters?
- Observable information (e.g. same school)
- Unobservable information ("social space")
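To make the mechanics concrete, here is a minimal base-R sketch of cluster-level randomization, assuming cluster labels are already available (for instance from observable groups or an estimated social space); the difference-in-means at the end is only a stand-in for the exposure-based estimators used in practice.

```r
# Hedged sketch: randomize treatment at the cluster level and compare means.
# The cluster labels `cl` and outcomes `y_obs` are simulated placeholders.
set.seed(1)
n  <- 200
cl <- sample(1:10, n, replace = TRUE)      # hypothetical cluster membership
trt_cluster <- rbinom(10, 1, 0.5)          # randomize whole clusters
trt <- trt_cluster[cl]                     # induced individual-level assignment
y_obs <- 1 + 0.5 * trt + rnorm(n)          # toy outcomes
mean(y_obs[trt == 1]) - mean(y_obs[trt == 0])   # naive difference in means
```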

Some context: (im)migration
Want to know how regime change affects population. Politicians during election years care about direct effects.
Source: http://openscience.alpine-geckos.at/courses/social-networkanalyses/empirical-network-analysis/

Some more context
Studying tram traffic in Vienna.
Source: kurier.at

And one more
Studying taxi rides in Porto:
- 442 taxis
- 1.7 million rides with (x, y) coordinates at 15-second intervals
- Project into a 100-dimensional latent space
- Learn hidden interpretable patterns...
Source: Kucukelbir, A., Tran, D., Ranganath, R., Gelman, A., & Blei, D. M. (2017). Automatic Differentiation Variational Inference. Journal of Machine Learning Research, 18(14), 1-45.

Relational data: common examples and goals
Changes in exports from year to year.
[Figure: countries plotted on the first and second eigenvectors of the row and column factors of R̂.]
Network regression problems
  y_ij = x_ij' β + ε_ij
frequently assume independence of the ε_ij.

Estimating β in network regression
[Figure: the same eigenvector plots as above.]
For Y = <X, β> + E we have
OLS (assumes no dependence among the ε_ij):
  β̂_ols = ( mat(X)' mat(X) )^{-1} mat(X)' vec(Y)
Oracle GLS (assumes dependence among the ε_ij, with Σ known):
  β̂_gls = ( mat(X)' Σ^{-1} mat(X) )^{-1} mat(X)' Σ^{-1} vec(Y)
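A small base-R sketch, not the amen implementation, contrasting the two estimators; the reciprocal error structure used to build Σ is an assumption made only for this example.

```r
# Sketch: OLS vs. oracle GLS for y_ij = x_ij' beta + e_ij with reciprocal
# (within-dyad) error correlation rho; everything here is simulated.
set.seed(1)
n <- 20; p <- 2; rho <- 0.9
beta_true <- c(1, -0.5)
X <- array(rnorm(n * n * p), dim = c(n, n, p))

E <- matrix(0, n, n)
for (i in 1:(n - 1)) for (j in (i + 1):n) {
  e1 <- rnorm(1)
  e2 <- rho * e1 + sqrt(1 - rho^2) * rnorm(1)    # corr(e_ij, e_ji) = rho
  E[i, j] <- e1; E[j, i] <- e2
}
Y <- X[, , 1] * beta_true[1] + X[, , 2] * beta_true[2] + E

off  <- which(diag(n) == 0)                      # drop the undefined y_ii
matX <- apply(X, 3, function(M) M[off])          # "mat(X)": one column per covariate
vecY <- Y[off]                                   # "vec(Y)"

beta_ols <- solve(crossprod(matX), crossprod(matX, vecY))

# Oracle GLS: build the true covariance of vec(Y) (unit variances, correlation
# rho between the two entries of each dyad) and plug it in.
idx   <- matrix(seq_len(n * n), n, n)
Sigma <- diag(n * n)
for (i in 1:n) for (j in 1:n) if (i != j) Sigma[idx[i, j], idx[j, i]] <- rho
Sigma <- Sigma[off, off]
Si    <- solve(Sigma)
beta_gls <- solve(t(matX) %*% Si %*% matX, t(matX) %*% Si %*% vecY)
```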

Network models: the data
- There are n actors/nodes labeled 1, ..., n.
- Y is a sociomatrix: y_ij is a dyadic relationship between node i and node j; y_ii is frequently undefined.
- Covariates:
  - node specific: x_i
  - dyad specific: x_ij

Social relations model
Goal: describe the variability in Y. Sender effects describe sociability; receiver effects describe popularity. Capture this in the Social Relations Model (SRM):
  y_ij = a_i + b_j + ε_ij
Almost an ANOVA, except that we want to relate a_i to b_i since the senders and receivers are from the same set.

Social relations model
  y_ij = μ + a_i + b_j + ε_ij
  (a_i, b_i) ~ iid N(0, Σ_ab)
  (ε_ij, ε_ji) ~ iid N(0, Σ_e)
Σ_ab = [ σ_a²  σ_ab ; σ_ab  σ_b² ] describes sender/receiver variability and within-person similarity.
Σ_e = σ_ε² [ 1  ρ ; ρ  1 ] describes within-dyad correlation.
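A brief sketch of simulating one sociomatrix from this SRM, with illustrative values plugged in for Σ_ab and Σ_e; the draw can be used to check the covariance identities on the next slide empirically.

```r
# Sketch: draw one sociomatrix from the social relations model above.
set.seed(1)
n <- 100; mu <- 0
s2a <- 1; s2b <- 1; sab <- 0.5        # Sigma_ab entries (illustrative)
s2e <- 1; rho <- 0.6                  # Sigma_e entries  (illustrative)

# correlated sender/receiver effects (a_i, b_i)
a <- rnorm(n, 0, sqrt(s2a))
b <- sab / s2a * a + rnorm(n, 0, sqrt(s2b - sab^2 / s2a))

# correlated within-dyad errors (eps_ij, eps_ji)
E <- matrix(0, n, n)
for (i in 1:(n - 1)) for (j in (i + 1):n) {
  e1 <- rnorm(1, 0, sqrt(s2e))
  e2 <- rho * e1 + rnorm(1, 0, sqrt(s2e * (1 - rho^2)))
  E[i, j] <- e1; E[j, i] <- e2
}

Y <- mu + outer(a, rep(1, n)) + outer(rep(1, n), b) + E   # y_ij = mu + a_i + b_j + e_ij
diag(Y) <- NA                                             # self-ties undefined
```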

Variability
  var(y_ij) = σ_a² + σ_b² + σ_ε²
  cov(y_ij, y_ik) = σ_a²
  cov(y_ij, y_kj) = σ_b²
  cov(y_ij, y_jk) = σ_ab
  cov(y_ij, y_ji) = 2σ_ab + ρσ_ε²
How hard is it to fit this model?
  fit_srm <- ame(Y)

Pictures that pop up
These help capture how well the Markov chain is mixing and give goodness-of-fit information.
Source: Hoff (2015). arXiv:1506.08237

Goodness of fit
Posterior predictive distributions of:
- sd.rowmean: standard deviation of the row means of Y.
- sd.colmean: standard deviation of the column means of Y.
- dyad.dep: correlation between vectorized Y and vectorized Y'.
- triad.dep: Σ_{i,j,k} e_ij e_jk e_ki / ( var(vec(Y))^{3/2} × #triangles on n nodes ).
Source: Hoff (2015). arXiv:1506.08237
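A hedged, self-contained approximation of these four statistics computed directly from a sociomatrix; amen's own gofstats() may normalize the triadic term differently.

```r
# Sketch of the four statistics above for a numeric sociomatrix Y (NA diagonal).
gof_sketch <- function(Y) {
  diag(Y) <- NA
  sd.rowmean <- sd(rowMeans(Y, na.rm = TRUE))
  sd.colmean <- sd(colMeans(Y, na.rm = TRUE))
  dyad.dep   <- cor(c(Y), c(t(Y)), use = "complete.obs")
  E <- Y - mean(Y, na.rm = TRUE)            # crude residuals (no covariates)
  E[is.na(E)] <- 0
  # sum over triples of e_ij e_jk e_ki, normalized roughly as in the slide
  triad.dep <- sum(diag(E %*% E %*% E)) /
    (choose(nrow(Y), 3) * var(c(Y), na.rm = TRUE)^(3 / 2))
  c(sd.rowmean = sd.rowmean, sd.colmean = sd.colmean,
    dyad.dep = dyad.dep, triad.dep = triad.dep)
}

# e.g., applied to the SRM draw from the previous sketch: gof_sketch(Y)
```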

Incorporating covariates
Imagine you have some covariates and want to fit
  y_ij = β_d' x_d,ij + β_r' x_r,i + β_c' x_c,j + a_i + b_j + ε_ij
- x_d,ij are dyad-specific covariates.
- x_r,i are row (sender) covariates.
- x_c,j are column (receiver) covariates.
- Frequently x_r,i = x_c,i = x_i. When does this not make sense? (Example: popularity is affected by athletic success, but sociability is not.)
How hard is it to fit this model?
  fit_srrm <- ame(Y, Xd=Xd, Xr=Xr, Xc=Xc)

Parsing the input
  fit_srrm <- ame(Y,
    Xdyad=Xd, # n x n x pd array of covariates
    Xrow=Xr,  # n x pr matrix of nodal row covariates
    Xcol=Xc   # n x pc matrix of nodal column covariates
  )
- Xr[i,p] is the value of the pth row covariate for node i.
- Xd[i,j,p] is the value of the pth dyadic covariate in the direction of i to j.
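A short sketch of assembling these inputs from nodal data; the covariates (age, smoke) and their dyadic transformations are illustrative assumptions, not part of the amen package.

```r
# Sketch: build Xrow/Xcol/Xdyad for ame() from two hypothetical nodal covariates.
set.seed(1)
n <- 50
age   <- rnorm(n)               # hypothetical nodal covariate
smoke <- rbinom(n, 1, 0.3)      # hypothetical nodal covariate

Xr <- cbind(age = age, smoke = smoke)   # n x pr row (sender) covariates
Xc <- Xr                                # often the same covariates for receivers

Xd <- array(NA, dim = c(n, n, 2),
            dimnames = list(NULL, NULL, c("agediff", "bothsmoke")))
Xd[, , "agediff"]   <- abs(outer(age, age, "-"))   # dyadic: |age_i - age_j|
Xd[, , "bothsmoke"] <- outer(smoke, smoke, "*")    # dyadic: both smoke

# fit_srrm <- ame(Y, Xdyad = Xd, Xrow = Xr, Xcol = Xc)   # as on the slide
```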

Back to basics
Can you get rid of the dependencies in the model?
  fit_rm <- ame(Y, Xd=Xd, Xr=Xn, Xc=Xn,
    rvar=FALSE, # should you fit row random effects?
    cvar=FALSE, # should you fit column random effects?
    dcor=FALSE  # should you fit a dyadic correlation?
  )
Note that summary will output:
  Variance parameters:
       pmean   psd
  va   0.000 0.000
  cab  0.000 0.000
  vb   0.000 0.000
  rho  0.000 0.000
  ve   0.229 0.011

So what's missing here?
We have a lot of leftover variability. Common themes in network analysis:
- Homophily: similar people connect to each other.
- Stochastic equivalence: similar people act similarly.

Which is which?
Left: homophily; right: stochastic equivalence. What are good models for this?
Source: Hoff (2008). NIPS

Introducing multiplicative effects
- The SR(R)M can represent second-order dependencies very well, but it has a hard time capturing triadic behavior.
- Homophily: create dyadic covariates x_d,ij = x_i x_j. Generally this can be represented by x_r,i' B x_c,j = Σ_k Σ_l b_kl x_r,ik x_c,jl. This is linear in the covariates and so can be baked into the amen framework.
- Sometimes there is excess correlation to account for. This suggests a multiplicative effects model:
  y_ij = β_d' x_d,ij + β_r' x_r,i + β_c' x_c,j + a_i + b_j + u_i' v_j + ε_ij

Fitting these models and beyond
  fit_ame2 <- ame(Y, Xd, Xn, Xn,
    R=2 # dimension of the multiplicative effect
  )
Source: Hoff (2015). arXiv:1506.08237

What happened here?
Why do multiplicative effects help with triadic behavior? The triadic measure is related to transitivity (at least for binary data). It turns out homophily can capture transitivity...
  y_ij = β_d' x_d,ij + β_r' x_r,i + β_c' x_c,j + a_i + b_j + u_i' v_j + ε_ij
- u_i is information about the sender; v_j is information about the receiver.
- If u_i ≈ v_j then u_i' v_j > 0...
- If u_i ≈ u_j then there is some stochastic equivalence...

Let's generalize: ordinal models
Imagine a binary (probit) model:
  y_ij = 1{z_ij > 0}
  z_ij = μ + a_i + b_j + ε_ij
Looks like the SRM on the latent scale.
  fit_srm <- ame(Y,
    model="bin" # lots of model options here
  )
If we go to the iid setup this is just an Erdos-Renyi model:
  fit_srg <- ame(Y, model="bin", rvar=FALSE, cvar=FALSE, dcor=FALSE)

Even more general
Consider the following generative model:
  z_ij = u_i' D v_j + ε_ij
  y_ij = g(z_ij)
- u_i are latent factors describing i as a sender.
- v_j are latent factors describing j as a receiver.
- D is a matrix of factor weights.
- g is an increasing function mapping the latent space to the observed space.
(Some g's... normal: g(z) = z; binary: g(z) = 1{z > 0}.)
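A minimal sketch of simulating from this generative model with a binary link, assuming R = 2 latent dimensions and arbitrary factor weights.

```r
# Sketch: z_ij = u_i' D v_j + eps_ij, y_ij = g(z_ij) with g(z) = 1{z > 0}.
set.seed(2)
n <- 40; R <- 2
U <- matrix(rnorm(n * R), n, R)        # sender factors u_i
V <- matrix(rnorm(n * R), n, R)        # receiver factors v_j
D <- diag(c(2, 1))                     # factor weights (illustrative)
Z <- U %*% D %*% t(V) + matrix(rnorm(n * n), n, n)
Y <- 1 * (Z > 0)                       # observed binary network
diag(Y) <- NA                          # self-ties undefined
```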

This works for symmetric matrices too
Imagine that y_ij = y_ji; then the model looks like:
  z_ij = u_i' Λ u_j + ε_ij
  y_ij = g(z_ij)
- u_i ≈ u_j represents stochastic equivalence.
- Λ is a matrix of eigenvalues: positive λ_i imply homophily, negative ones imply heterophily.
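A matching sketch for the symmetric case, contrasting positive and negative eigenvalues in Λ (homophily versus heterophily); the specific values are assumptions for illustration.

```r
# Sketch: symmetric eigenmodel z_ij = u_i' Lambda u_j + eps_ij, y_ij = 1{z_ij > 0}.
set.seed(3)
n <- 40
u <- matrix(rnorm(n * 2), n, 2)
sim_sym <- function(lambda) {
  L   <- diag(lambda)
  Eps <- matrix(rnorm(n * n), n, n)
  Eps <- (Eps + t(Eps)) / sqrt(2)            # symmetric errors
  Z <- u %*% L %*% t(u) + Eps
  Y <- 1 * (Z > 0); diag(Y) <- NA
  Y
}
Y_homophily   <- sim_sym(c( 2,  1))          # positive eigenvalues
Y_heterophily <- sim_sym(c(-2, -1))          # negative eigenvalues
```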

What is this latent space?
- Problem 1: need to select a dimension R. This is hard... sometimes there is some intuition.
- Problem 2: should the latent positions be interpreted? Unclear; maybe think of the distances in this space...
- Problem 3: what about my favorite other models, like stochastic blockmodels? These are just a subclass of models. For example, the stochastic blockmodel has discrete support for the latent positions.

What is this latent space?
All quotes from Hoff et al. (2002):
"A subset of individuals in the population with a large number of social ties between them may be indicative of a group of individuals who have nearby positions in this space of characteristics, or 'social space'."
"Various concepts of social space have been discussed by McFarland and Brown (1973) and Faust (1988). In the context of this article, social space refers to a space of unobserved latent characteristics that represent potential transitive tendencies in network relations. A probability measure over these unobserved characteristics induces a model in which the presence of a tie between two individuals is dependent on the presence of other ties."

(Tiny portion of the) literature
- Nowicki, Krzysztof, and Tom A. B. Snijders. "Estimation and prediction for stochastic blockstructures." Journal of the American Statistical Association 96, no. 455 (2001): 1077-1087.
- Hoff, Peter D., Adrian E. Raftery, and Mark S. Handcock. "Latent space approaches to social network analysis." Journal of the American Statistical Association 97, no. 460 (2002): 1090-1098.
- Hoff, Peter. "Modeling homophily and stochastic equivalence in symmetric relational data." In Advances in Neural Information Processing Systems, pp. 657-664. 2008.
- Airoldi, Edoardo M., David M. Blei, Stephen E. Fienberg, and Eric P. Xing. "Mixed membership stochastic blockmodels." Journal of Machine Learning Research 9 (2008): 1981-2014.
- Hoff, Peter, Bailey Fosdick, Alex Volfovsky, and Katherine Stovel. "Likelihoods for fixed rank nomination networks." Network Science 1, no. 3 (2013): 253-277.
- Hoff, Peter D. "Dyadic data analysis with amen." arXiv preprint arXiv:1506.08237 (2015).

ame(Y, Xdyad=NULL, Xrow=NULL, Xcol=NULL,
    rvar = !(model=="rrl"), cvar = TRUE, dcor = !symmetric, nvar = TRUE,
    R = 0, model="nrm",
    intercept = !is.element(model, c("rrl","ord")),
    symmetric = FALSE,
    odmax = rep(max(apply(Y>0, 1, sum, na.rm=TRUE)), nrow(Y)), ...)
- Y: an n x n square relational matrix of relations.
- Xdyad: an n x n x pd array of covariates.
- Xrow: an n x pr matrix of nodal row covariates.
- Xcol: an n x pc matrix of nodal column covariates.
- rvar: logical: fit row random effects (asymmetric case)?
- cvar: logical: fit column random effects (asymmetric case)?
- dcor: logical: fit a dyadic correlation (asymmetric case)?
- nvar: logical: fit nodal random effects (symmetric case)?
- R: integer: dimension of the multiplicative effects (can be 0).
- model: character: one of "nrm", "bin", "ord", "cbin", "frn", "rrl".
- odmax: a scalar integer or vector of length n giving the maximum number of nominations that each node may make.
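For concreteness, a hedged end-to-end call with simulated data; it assumes the amen package is installed, and every setting below is an arbitrary choice rather than a recommendation.

```r
# Illustrative usage on a small simulated relational matrix.
library(amen)
set.seed(1)
n <- 30
Y <- matrix(rnorm(n * n), n, n); diag(Y) <- NA
x <- rnorm(n)                                   # a single hypothetical nodal covariate
fit <- ame(Y, Xrow = cbind(x = x), Xcol = cbind(x = x),
           R = 1, model = "nrm",
           nscan = 1000, burn = 100, odens = 10,
           plot = FALSE, print = FALSE)
summary(fit)
```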

What's in the ...?
  seed = 1, nscan = 10000, burn = 500, odens = 25, plot=TRUE, print=TRUE, gof=TRUE
- seed: random seed.
- nscan: number of iterations of the Markov chain (beyond burn-in).
- burn: burn-in for the Markov chain.
- odens: output density for the Markov chain.
- plot: logical: plot results while running?
- print: logical: print results while running?
- gof: logical: calculate goodness-of-fit statistics?

An AddHealth Example

Social network data
- Datasets: PROSPER, NSCR, AddHealth.
- Relate network characteristics to individual-level behavior.
- Literature: ERGM, latent variable models.
- Assumptions: the data are fully observed; the support is the set of all sociomatrices.
- In practice: ranked data, censored observations.
[Figure: bar plot of proportions for grades 9-12.]
A type of likelihood that accommodates the ranked and censored nature of data from Fixed Rank Nomination (FRN) surveys and allows for estimation of regression effects.

Data collection examples
- PROmoting School-Community-University Partnerships to Enhance Resilience (PROSPER): "Who are your best and closest friends in your grade?"
- National Longitudinal Study of Adolescent to Adult Health (AddHealth): "Your male friends. List your closest male friends. List your best male friend first, then your next best friend, and so on."

Notation
- Z = {z_ij : i ≠ j} is a sociomatrix of ordinal relationships; z_ij > z_ik denotes person i preferring person j to person k.
    Z = [  -    z_12  ...  z_1n
          z_21   -         .
           .           -   .
          z_n1  ...         -  ]
- Instead of Z we observe a sociomatrix Y = {y_ij : i ≠ j}.
- Different sampling schemes define different maps between Y and Z (set relations between y_ij and z_ij).
- A statistical model {p(Z | θ) : θ ∈ Θ} assists in the analysis.

Fixed rank nominations: F(Y)
Z ∈ F(Y) if and only if:
- y_ij > y_ik ⟹ z_ij > z_ik
- y_ij > 0 ⟹ z_ij > 0
- y_ij = 0 and d_i < m ⟹ z_ij ≤ 0
(m = maximal number of nominations, d_i = individual outdegree.)
- Differentiates between different ranks.
- Captures censoring in the data.
Example (m = 5): y_i = (4, 3, 2, 1, 0, ..., 0), so d_i = 4 < m and F(Y) implies
  z_i1 > z_i2 > z_i3 > z_i4 > 0 ≥ z_ij for every unranked j.
Example (m = 5): y_i = (5, 4, 3, 2, 1, 0, ..., 0), so d_i = 5 = m (a censored list) and F(Y) implies
  z_i1 > z_i2 > z_i3 > z_i4 > z_i5 > 0 and z_i5 > z_ij for every unranked j, whose signs are otherwise unconstrained.
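To make the constraint set concrete, here is a hedged helper (not from the amen package) that checks whether a latent matrix Z is consistent with an observed nomination matrix Y under F(Y); m is the nomination cap.

```r
# Sketch: does a latent matrix Z lie in F(Y)? (y_ij in {0, 1, ..., m}; diagonal ignored.)
in_FY <- function(Y, Z, m) {
  n <- nrow(Y)
  d <- rowSums(Y > 0, na.rm = TRUE)                                # outdegrees
  for (i in 1:n) for (j in 1:n) {
    if (i == j) next
    if (Y[i, j] > 0 && !(Z[i, j] > 0)) return(FALSE)               # ranked => positive
    if (Y[i, j] == 0 && d[i] < m && !(Z[i, j] <= 0)) return(FALSE) # unranked, uncensored => nonpositive
    for (k in 1:n) {                                               # ranks => ordering
      if (k == i || k == j) next
      if (Y[i, j] > Y[i, k] && !(Z[i, j] > Z[i, k])) return(FALSE)
    }
  }
  TRUE
}
```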

Rank likelihood: R(Y)
Z ∈ R(Y) if and only if:
- y_ij > y_ik ⟹ z_ij > z_ik
Valid but not fully informative: F(Y) ⊂ R(Y).
Example (m = 5): y_i = (4, 3, 2, 1, 0, ..., 0) only implies z_i1 > z_i2 > z_i3 > z_i4 > z_ij for every unranked j; the censoring at zero is not used.
Example (m = 5): y_i = (5, 4, 3, 2, 1, 0, ..., 0) only implies z_i1 > z_i2 > z_i3 > z_i4 > z_i5 > z_ij for every unranked j.
Cannot estimate row ("sender") specific effects.

Binary likelihood: B(Y)
Z ∈ B(Y) if and only if:
- y_ij > 0 ⟹ z_ij > 0
- y_ij = 0 ⟹ z_ij < 0
Neither fully informative nor valid:
- Discards information on the ranks.
- Ignores the censoring on the outdegrees.
- In particular, F(Y) ⊄ B(Y).
Example (m = 5): y_i = (4, 3, 2, 1, 0, ..., 0) is treated as z_i1, z_i2, z_i3, z_i4 > 0 > z_ij for every unranked j, losing the ordering among the ranked individuals.
Example (m = 5): y_i = (5, 4, 3, 2, 1, 0, ..., 0) is treated as z_i1, ..., z_i5 > 0 > z_ij for every unranked j, even though the unranked z_ij are censored and need not be negative.

Bayesian Estimation for Fixed Rank Nominations
- Model: Z ~ p(Z | θ), θ ∈ Θ.
- Data: Z ∈ F(Y).
- Likelihood: L_F(θ : Y) = Pr(Z ∈ F(Y) | θ) = ∫_{F(Y)} p(Z | θ) dZ.
- Estimation: given a prior p(θ), the posterior p(θ | Z ∈ F(Y)) can be approximated by a Gibbs sampler.
Simulate z_ij ~ p(z_ij | θ, Z_{-ij}, Z ∈ F(Y)):
1. y_ij > 0: z_ij ~ p(z_ij | θ, Z_{-ij}) 1{z_ij ∈ (a, b)}, where a = max(z_ik : y_ik < y_ij) and b = min(z_ik : y_ik > y_ij).
2. y_ij = 0 and d_i < m: z_ij ~ p(z_ij | Z_{-ij}, θ) 1{z_ij ≤ 0}.
3. y_ij = 0 and d_i = m: z_ij ~ p(z_ij | Z_{-ij}, θ) 1{z_ij ≤ min(z_ik : y_ik > 0)}.
Allows for imputation of missing y_ij.
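A hedged sketch of one constrained draw from this sampler, assuming the full conditional of z_ij is normal with mean mu and standard deviation sig (as in the normal/probit SRM case); rtnorm_one() samples from the truncated normal by inversion.

```r
# Sketch: sample z_ij from its full conditional restricted to F(Y).
rtnorm_one <- function(mu, sig, lo, hi) {
  u <- runif(1, pnorm(lo, mu, sig), pnorm(hi, mu, sig))
  qnorm(u, mu, sig)
}

update_zij <- function(i, j, Y, Z, mu, sig, m) {
  d_i <- sum(Y[i, ] > 0, na.rm = TRUE)                     # outdegree of sender i
  if (Y[i, j] > 0) {                                       # case 1: j is ranked by i
    lo <- max(c(Z[i, Y[i, ] < Y[i, j]], 0), na.rm = TRUE)  # also enforce z_ij > 0
    hi <- min(c(Z[i, Y[i, ] > Y[i, j]], Inf), na.rm = TRUE)
  } else if (d_i < m) {                                    # case 2: unranked, list not full
    lo <- -Inf; hi <- 0
  } else {                                                 # case 3: unranked, list full (censored)
    lo <- -Inf; hi <- min(Z[i, Y[i, ] > 0], na.rm = TRUE)
  }
  rtnorm_one(mu, sig, lo, hi)
}
```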

Simulations
We generated Z from the following Social Relations Model (Warner, Kenny and Stoto (1979)):
  z_ij = β' x_ij + a_i + b_j + ε_ij
  (a_i, b_i) ~ iid normal(0, [1, 0.5; 0.5, 1])
  (ε_ij, ε_ji) ~ iid normal(0, [1, 0.9; 0.9, 1])
Mean model:
  β' x_ij = β_0 + β_r x_i,r + β_c x_j,c + β_d1 x_ij,1 + β_d2 x_ij,2
- x_i,r, x_j,c: individual-level variables.
- x_ij,1: pair-specific variable.
- x_ij,2: co-membership in a group.
- β_r = β_c = β_d1 = β_d2 = 1 and β_0 = -3.26.
- x_i,r, x_i,c, x_ij,1 ~ iid N(0, 1); x_ij,2 = s_i s_j / 0.42 for s_i ~ iid binary(1/2).
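A sketch of generating one latent sociomatrix Z from this design; the constants follow the slide (with the intercept read as -3.26), and the final observation step (each sender ranking their top-m positive z_ij) is noted in a comment rather than implemented.

```r
# Sketch: one draw of Z from the simulation design above (n = 100 nodes).
set.seed(1)
n <- 100
beta0 <- -3.26; beta_r <- beta_c <- beta_d1 <- beta_d2 <- 1

x_r  <- rnorm(n); x_c <- rnorm(n)                # individual-level variables
x_d1 <- matrix(rnorm(n * n), n, n)               # pair-specific variable
s    <- rbinom(n, 1, 0.5)
x_d2 <- outer(s, s) / 0.42                       # co-membership indicator, rescaled

a <- rnorm(n); b <- 0.5 * a + rnorm(n, 0, sqrt(1 - 0.5^2))     # corr(a_i, b_i) = 0.5
E <- matrix(0, n, n)
for (i in 1:(n - 1)) for (j in (i + 1):n) {
  e1 <- rnorm(1); e2 <- 0.9 * e1 + rnorm(1, 0, sqrt(1 - 0.9^2))  # corr = 0.9
  E[i, j] <- e1; E[j, i] <- e2
}

Z <- beta0 + beta_r * outer(x_r, rep(1, n)) + beta_c * outer(rep(1, n), x_c) +
     beta_d1 * x_d1 + beta_d2 * x_d2 +
     outer(a, rep(1, n)) + outer(rep(1, n), b) + E
diag(Z) <- NA

# The observed Y would then be obtained by letting each sender rank their
# most-preferred positive z_ij, up to the nomination cap m (e.g. 5 or 15).
```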

Simulations - Censoring
[Figure: confidence intervals for β_r, β_c, β_d1 and β_d2 across the eight simulations, for m = 5 and m = 15; within each simulation the group of three CIs is based on the binary, FRN and rank likelihoods, from left to right.]
- 8 simulations for each m ∈ {5, 15}, with 100 nodes each, a group covariate and an iid dyadic variable.
- The rank likelihood cannot estimate row effects: Z ∈ R(Y) ⟺ Z + c 1' ∈ R(Y) for any c ∈ R^n.
- The binary likelihood poorly estimates row effects: there is a large amount of censoring, the heterogeneity of the censored outdegrees is low, and the regression coefficients are estimated too low.

Simulations - Censoring
[Figure: confidence intervals for β_d1 and β_d2 across the eight simulations, for m = 5 and m = 15.]
- Recall: x_ij,2 ∝ s_i s_j, an indicator of co-membership in a group.
- Ignoring the censoring (the binary likelihood) underestimates the row variability.
- It also underestimates the variability associated with x_ij,2.

Simulations - information in the ranks
Let C(Y) be the set of values of Z for which the following hold:
- y_ij > 0 ⟹ z_ij > 0
- y_ij = 0 and d_i < m ⟹ z_ij ≤ 0
- min{z_ij : y_ij > 0} > max{z_ij : y_ij = 0}
We refer to L_C(θ : Y) = Pr(Z ∈ C(Y) | θ) as the censored binary likelihood.
- Recognizes censoring but ignores the information in the ranks.
- Performs similarly to FRN in the previous study.
- Less precise than FRN when m is big.

Simulations - information in the ranks
Same setup as before, but the average uncensored outdegree is m.
[Figure: relative concentration around the true value for β_r (row), β_c (column), β_d1 (continuous dyad) and β_d2 (co-membership) as a function of m.]
Relative concentration around the true value of each parameter, measured by E[(β - 1)² | Z ∈ F(Y)] / E[(β - 1)² | Z ∈ C(Y)] and averaged across eight simulated datasets for each m ∈ {5, 15, 30, 50}.
- When m ≈ n, most of the information comes from considering ranked/unranked individuals as groups rather than from the relative ordering of the ranked individuals.
- As the censored binary likelihood recognizes the censoring in the data, we expect it to provide parameter estimates that do not have the biases of the binary likelihood estimators. On the other hand, L_C ignores the information in the ranks of the scored individuals, and so we might expect it to provide less precise estimates than the FRN likelihood.

AddHealth Data - Results
[Figure: posterior CIs for the intercept; row effects rsmoke, rdrink, rgpa; column effects csmoke, cdrink, cgpa; dyadic effects dsmoke, ddrink, dgpa; co-membership effects dacad, darts, dsport, dcivic; and dgrade, drace.]
- 646 females were asked to rank up to 5 female friends.
- Mean model with row, column and dyadic effects for smoking, drinking and GPA, as well as dyadic effects for co-membership in activities and grade, and a similarity-in-race measure.
- The CIs are based on the binary, FRN and rank likelihoods.