Higher order patterns via factor models


1/39 Higher order patterns via factor models
567 Statistical analysis of social networks
Peter Hoff, Statistics, University of Washington

2/39 Conflict data

Y <- conflict90s$conflicts
Xn <- conflict90s$nodevars
colnames(Xn)
## [1] "pop"    "gdp"    "polity"

Xn[,1:2] <- log(Xn[,1:2])

table(Y)
## Y
##     0     1     2     3     4     5     6     7     8
## 16567   154    22    14     7     2     2     1     1

3/39 Dichotomized conflict data

[Figure: dichotomized conflict network among the countries in the data, labeled by three-letter code (AFG, ALB, ..., ZIM)]

4/39 SRM fit to ordinal data

#### nodal covariates only
fit_n <- ame(Y, Xrow=Xn, Xcol=Xn, model="ord", nscan=10000)
summary(fit_n)
##
## beta:
##             pmean   psd z-stat p-val
## pop.row     0.256 0.121  2.114 0.034
## gdp.row    -0.430 0.099 -4.334 0.000
## polity.row -0.014 0.017 -0.797 0.425
## pop.col     0.207 0.096  2.151 0.031
## gdp.col    -0.371 0.078 -4.774 0.000
## polity.col -0.001 0.014 -0.062 0.950
##
## Sigma_ab pmean:
##       a     b
## a 1.134 0.801
## b 0.801 0.687
##
## rho pmean:
## 0.804

5/39 SRM fit to ordinal data

plot(fit_n)

[Figure: MCMC trace plots (SABR, BETA) and posterior predictive distributions of the GOF statistics sd.rowmean, sd.colmean, dyad.dep, and triad.dep]

6/39 Triad dependence measure

gofstats
## function (Y)
## {
##     sd.rowmean <- sd(rowMeans(Y, na.rm = TRUE), na.rm = TRUE)
##     sd.colmean <- sd(colMeans(Y, na.rm = TRUE), na.rm = TRUE)
##     dyad.dep <- cor(c(Y), c(t(Y)), use = "complete.obs")
##     E <- Y - mean(Y, na.rm = TRUE)
##     D <- 1 * (!is.na(E))
##     E[is.na(E)] <- 0
##     triad.dep <- sum(diag(E %*% E %*% E))/(sum(diag(D %*% D %*%
##         D)) * sd(c(Y), na.rm = TRUE)^3)
##     gof <- c(sd.rowmean, sd.colmean, dyad.dep, triad.dep)
##     names(gof) <- c("sd.rowmean", "sd.colmean", "dyad.dep", "triad.dep")
##     gof
## }
## <environment: namespace:amen>

7/39 Triad dependence measure

Let E = (Y − 11^T ȳ)/s_y, that is, e_{i,j} = (y_{i,j} − ȳ)/s_y:
- ȳ is the grand mean and s_y is the sample standard deviation of the y_{i,j}'s;
- e_{i,j} is like a scaled residual from the simple mean model y_{i,j} = μ + ε_{i,j}.

The sum of the diagonal of E^3 gives the scaled third-order moment:

trace(E^3) = Σ_{i≠j≠k} e_{i,j} e_{j,k} e_{k,i}

The GOF statistic used in amen is the average of the e_{i,j} e_{j,k} e_{k,i}'s:

t(Y) = Σ_{i≠j≠k} e_{i,j} e_{j,k} e_{k,i} / (# of ordered triads)

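The statistic above can be sketched numerically. The following is a minimal Python/NumPy version of t(Y) for a complete (no-NA) matrix with an ignored diagonal; the amen implementation additionally handles missing entries, and the function name triad_dep is mine, not from amen.

```python
import numpy as np

def triad_dep(Y):
    """Triad dependence GOF statistic: the average of e_ij * e_jk * e_ki
    over ordered triads of distinct indices, where E holds the scaled
    residuals (Y - ybar) / s_y.  Assumes a complete square matrix and
    ignores the diagonal."""
    Y = np.asarray(Y, dtype=float)
    n = Y.shape[0]
    offdiag = ~np.eye(n, dtype=bool)
    ybar = Y[offdiag].mean()           # grand mean over off-diagonal entries
    sy = Y[offdiag].std(ddof=1)        # sample standard deviation
    E = (Y - ybar) / sy
    np.fill_diagonal(E, 0.0)           # zero diagonal: triads with repeated indices vanish
    num = np.trace(E @ E @ E)          # sum of e_ij e_jk e_ki over ordered triads
    n_triads = n * (n - 1) * (n - 2)   # number of ordered triads
    return num / n_triads
```

With the diagonal zeroed, trace(E^3) picks up only terms with i, j, k distinct, so dividing by n(n−1)(n−2) gives exactly the average in the slide's t(Y).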

8/39 Triad dependence and transitivity

This triadic dependence measure is related to transitivity for binary data:

Transitive triple: an ordered triple (i, j, k) is transitive if i → j → k → i, i.e., y_{i,j} = y_{j,k} = y_{k,i} = 1.

The number of transitive triples is

trans(Y) = Σ_{i≠j≠k} y_{i,j} y_{j,k} y_{k,i}

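For a binary adjacency matrix with zero diagonal, trans(Y) is just the trace of Y^3. A small sketch (the helper name trans_count is illustrative):

```python
import numpy as np

def trans_count(Y):
    """Count ordered transitive triples i -> j -> k -> i, i.e. triples
    with y_ij = y_jk = y_ki = 1, in a binary adjacency matrix with zero
    diagonal.  With the diagonal zero this equals trace(Y^3)."""
    Y = np.asarray(Y)
    return int(np.round(np.trace(Y @ Y @ Y)))
```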

9/39 SRM fit to ordinal data with dyadic covariates

#### dyadic and nodal covariates
fit_dn <- ame(Y, Xdyad=Xd, Xrow=Xn, Xcol=Xn, model="ord", nscan=10000)
summary(fit_dn)
##
## beta:
##                    pmean   psd  z-stat p-val
## pop.row            0.212 0.095   2.230 0.026
## gdp.row            0.077 0.078   0.992 0.321
## polity.row        -0.028 0.013  -2.045 0.041
## pop.col            0.205 0.092   2.232 0.026
## gdp.col           -0.029 0.078  -0.376 0.707
## polity.col         0.002 0.012   0.130 0.896
## polity_int.dyad   -0.003 0.001  -2.419 0.016
## imports.dyad      -0.142 0.100  -1.415 0.157
## shared_igos.dyad  -0.017 0.005  -3.383 0.001
## distance.dyad     -1.837 0.101 -18.244 0.000
##
## Sigma_ab pmean:
##       a     b
## a 0.500 0.332
## b 0.332 0.462
##
## rho pmean:
## 0.566

10/39 SRM fit to ordinal data with dyadic covariates

plot(fit_dn)

[Figure: MCMC trace plots (SABR, BETA) and posterior predictive distributions of the GOF statistics sd.rowmean, sd.colmean, dyad.dep, and triad.dep]

11/39 GOF comparison

[Figure: posterior predictive densities of the triad dependence statistic under the two fits]

12/39 Dyadic functions of nodal characteristics

Adding these dyadic covariates improved the fit with regard to t(Y).

Note that each of these dyadic covariates was a function of nodal covariates:

x_{i,j,1} = f_1(polity_i, polity_j)
x_{i,j,2} = f_2(igo_i, igo_j)
x_{i,j,3} = f_3(location_i, location_j)

This suggests models of the form

y_{i,j} ≈ β_0 + β_1 x_{i,j},   x_{i,j} = s(x_i, x_j),

where s(·,·) is some (non-additive) function of the nodal characteristics x_i, x_j.

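As a sketch of building dyadic covariates from nodal characteristics, the following assumes two hypothetical nodal inputs, a polity score and a latitude/longitude pair; the particular functions s(x_i, x_j) used here (absolute difference, great-circle distance) are illustrative choices, not necessarily those used for the conflict data:

```python
import numpy as np

def dyadic_covariates(polity, lat, lon):
    """Dyadic covariates x_{i,j} = s(x_i, x_j) built from nodal inputs:
    absolute polity difference and great-circle distance (in radians on a
    unit sphere) between node locations.  Both choices are illustrative."""
    polity, lat, lon = map(np.asarray, (polity, lat, lon))
    polity_diff = np.abs(polity[:, None] - polity[None, :])
    la, lo = np.radians(lat), np.radians(lon)
    cosang = (np.sin(la)[:, None] * np.sin(la)[None, :]
              + np.cos(la)[:, None] * np.cos(la)[None, :]
              * np.cos(lo[:, None] - lo[None, :]))
    dist = np.arccos(np.clip(cosang, -1.0, 1.0))   # scale by Earth's radius if needed
    return np.stack([polity_diff, dist], axis=-1)  # n x n x 2 dyadic array
```

An array shaped like this (nodes × nodes × covariates) is the form a dyadic design array such as Xd takes.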

13/39 Homophily and stochastic equivalence

More generally, let
- u_i be a covariate of i as a sender of ties;
- v_j be a covariate of j as a receiver of ties;

y_{i,j} ≈ β_0 + β_1 s(u_i, v_j)

Such a model can describe various types of higher-order dependence, including
- transitivity
- stochastic equivalence


14/39 Transitivity via homophily

Homophily on covariates can explain transitivity:

y_{i,j} ≈ β_0 + β_1 s(x_i, x_j)

If β_1 > 0 and the y_{i,j}'s are binary, then
- y_{i,j} = 1 suggests x_i ≈ x_j;
- y_{i,k} = 1 suggests x_i ≈ x_k;
- x_i ≈ x_j and x_i ≈ x_k imply x_j ≈ x_k;
- x_j ≈ x_k makes y_{j,k} = 1 likely.

Similarly, homophily can explain triadic dependence for ordinal y_{i,j}'s.

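A quick simulation illustrates the point: linking nodes whose (hypothetical) scalar characteristics are close produces many more transitive triples than a density-matched random graph. All parameters below (n = 60, threshold 0.5) are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 60
x = rng.normal(size=n)                        # latent nodal characteristics
Y = (np.abs(x[:, None] - x[None, :]) < 0.5).astype(int)
np.fill_diagonal(Y, 0)                        # homophily graph: link if close

trans_hom = int(np.trace(Y @ Y @ Y))          # ordered transitive triples

p = Y.sum() / (n * (n - 1))                   # density-matched random graph
R = (rng.random((n, n)) < p).astype(int)
np.fill_diagonal(R, 0)
trans_rand = int(np.trace(R @ R @ R))
```

Under homophily trans_hom comes out well above trans_rand, even though both graphs have roughly the same density.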

15/39 Stochastic equivalence

Returning to the more general model:

y_{i,j} ≈ β_0 + β_1 s(u_i, v_j)

If u_i = u_k, then i and k are equivalent as senders in terms of the model.

Example: logistic regression. If u_i = u_k = u, then

Pr(Y_{i,j} = 1) = e^{β_0 + β_1 s(u, v_j)} / (1 + e^{β_0 + β_1 s(u, v_j)}) = Pr(Y_{k,j} = 1),

and nodes i and k are stochastically equivalent (as senders).

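A minimal numerical check of this, with illustrative values for β_0, β_1 and s(u, v) = u·v (none of these are from the slides):

```python
import math

def p_tie(u, v, b0=-1.0, b1=2.0, s=lambda u, v: u * v):
    """Pr(Y_ij = 1) under the logistic model with linear predictor
    b0 + b1 * s(u_i, v_j).  The defaults are illustrative choices."""
    eta = b0 + b1 * s(u, v)
    return 1.0 / (1.0 + math.exp(-eta))

# Nodes i and k with identical sender covariate u are stochastically
# equivalent as senders: equal tie probability to every receiver j.
u_i = u_k = 0.7
for v_j in (-1.0, 0.0, 0.5, 2.0):
    assert p_tie(u_i, v_j) == p_tie(u_k, v_j)
```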

16/39 Homophily and stochastic equivalence

In our hypothetical model of social relations, we've seen how nodal characteristics relate to homophily and stochastic equivalence.

Homophily: Similar nodes link to each other
- similar in terms of characteristics (potentially unobserved)
- homophily leads to transitive or clustered social networks
- observed transitivity may be due to exogenous or endogenous factors (see Shalizi and Thomas 2010 for a more careful discussion)

Stochastic equivalence: Similar nodes have similar relational patterns
- similar nodes may or may not link to each other
- equivalent nodes can be thought of as having the same role


17/39 Visualizing stochastic equivalence

[Figure: two example networks]

For which network is homophily a plausible explanation? Which network exhibits a large degree of stochastic equivalence?


18/39 Homophily and stochastic equivalence in real networks

[Figure: three network plots]
- AddHealth friendships: friendships among 247 12th-graders
- Word neighbors in Genesis: neighboring occurrences among 158 words
- Protein binding interactions: binding patterns among 230 proteins

b3984b3303 b3309 b3321 b0023 b1552 b3054 b2606 b3186 b4200 b3306 b3230 b3307 b3165 b2609 b4202 b0470 b0640 b2699 b0595 b1294 b2610 b3305 b2515 b2311 b3340 b3247 b1413 b0439 b0436 b0178 b1808 b3935 b3417 b2487 b2721 b3686 b3986 b2096 b0922 b3168 b0480 b2011 b3301 b1716 b2372 b4051 b1817 b0925 b3637 b3652 b3294 b3499 b3304 b3602 b3318 b2804 b3342 AddHealth friendships: friendships among 247 12th-graders Word neighbors in Genesis: neighboring occurrences among 158 words Protein binding interactions: binding patterns among 230 proteins 8/39

Homophily and stochastic equivalence in real networks, ; :. a above abundantly after air all also and appear be bearing beast beginning behold blessed bring brought called cattle created creature creepeth creeping darkness day days deep divide divided dominion dry earth evening every face female fifth fill finished firmament first fish fly for form forth fourth fowl from fruit fruitful gathered gathering give given god good grass great greater green had hath have he heaven heavens herb him his host i image in is it itself kind land lesser let life light lights likeness living made make male man may meat midst morning moved moveth moving multiply night of one open our over own place replenish rule said saw saying sea seas seasons second seed set shall signs sixth so spirit stars subdue that the their them there thing third thus to together tree two under unto upon us very void was waters were whales wherein which whose winged without years yielding you b0185 b2316 b1094 b4187 b2697 b1731 b2097 b2496 b4131 b1000 b0882 b0437 b0438b1120 b3556 b1557 b1823 b0880 b0623 b3162 b3702 b0184 b0015 b0014 b0215 b2779 b1288 b0180 b2925 b3895 b0684 b3463 b0095 b3517 b1493 b3181 b3406 b2614 b2231 b3699 b0059 b4172 b1099 b4259 b4372 b1842 b2526 b3931 b3932 b4000 b0440 b2728 b2729 b3687 b1718 b2530 b2529 b1215 b0186 b0179 b2890 b4129 b3418 b2942 b4143 b4142 b3251 b0924 b0923 b3189 b1224 b1225 b1226 b1467 b3169 b3982 b0133 b3019 b1127 b0903 b3164 b3863 b4226 b1207 b2585 b3725 b2476 b1854 b2892 b3822 b3619 b2038 b3780 b1084 b0214 b3704 b3319 b3295b3987 b3988 b3067 b3461 b3202 b2741 b3649 b0098 b3609 b3590 b2620 b3650 b2576 b4059 b3115 b0406 b1274 b1763 b3919 b0170 b3339 b3980 b2319 b1913 b4179 b0101 b0119 b0528 b0607 b0660 b0877 b0948 b1055 b1086 b1103 b1269 b2330 b2517 b2579 b2581 b2830 b2960 b3180 b3470 b3746 b4191 b0661 b2836 b2323 b4041 b2563 b1095 b1093 b3730 b0736 b3888 b0492 b3701 b3783 b3317 b4203 b3315 b3314 b0169 b3341 b1468 b2592 b0990 b3298 b3320 b3308 b3231 b0911 b3296 
b3984b3303 b3309 b3321 b0023 b1552 b3054 b2606 b3186 b4200 b3306 b3230 b3307 b3165 b2609 b4202 b0470 b0640 b2699 b0595 b1294 b2610 b3305 b2515 b2311 b3340 b3247 b1413 b0439 b0436 b0178 b1808 b3935 b3417 b2487 b2721 b3686 b3986 b2096 b0922 b3168 b0480 b2011 b3301 b1716 b2372 b4051 b1817 b0925 b3637 b3652 b3294 b3499 b3304 b3602 b3318 b2804 b3342 AddHealth friendships: friendships among 247 12th-graders Word neighbors in Genesis: neighboring occurrences among 158 words Protein binding interactions: binding patterns among 230 proteins 8/39

19/39 Latent factor models

Homophily and stochastic equivalence arising from unobserved variables can be represented with a latent factor model:

y_{i,j} ≈ u_i^T D v_j,

where
- u_i is a vector of latent factors describing i as a sender of ties;
- v_j is a vector of latent factors describing j as a receiver of ties;
- D is a diagonal matrix of factor weights.

Normal, binomial, and ordinal data can be represented with this structure as follows:

z_{i,j} = u_i^T D v_j + ε_{i,j}
y_{i,j} = g(z_{i,j}),

where g is some increasing function:
- g(z) = z for normal data;
- g(z) = 1(z > 0) for binomial data;
- g(z) = some increasing step function for ordinal data.
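This generating process is easy to simulate. The sketch below draws latent factors, forms z_{i,j} = u_i^T D v_j + ε_{i,j}, and applies g(z) = 1(z > 0) to get a binary network; the dimensions and factor weights are illustrative, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 50, 2  # number of nodes, number of latent factors (illustrative)

# Sender factors u_i (rows of U), receiver factors v_j (rows of V),
# and diagonal factor weights D.
U = rng.normal(size=(n, K))
V = rng.normal(size=(n, K))
D = np.diag([2.0, 1.0])

# Latent relational matrix: z_{i,j} = u_i^T D v_j + eps_{i,j}.
Z = U @ D @ V.T + rng.normal(size=(n, n))

# Binary network via the link g(z) = 1(z > 0).
Y = (Z > 0).astype(int)
```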

20/39 Understanding latent factors

Z = U D V^T + E
z_{i,j} = u_i^T D v_j + ε_{i,j} = Σ_{r=1}^R d_r u_{i,r} v_{j,r} + ε_{i,j}

For example, in a 2-factor model we have

z_{i,j} = d_1 (u_{i,1} v_{j,1}) + d_2 (u_{i,2} v_{j,2}) + ε_{i,j}

Interpretation:
- u_i ≈ u_j: similarity of latent factors implies approximate stochastic equivalence;
- u_i ≈ v_j: similarity of latent factors implies high probability of a tie.
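The identity between the quadratic form u_i^T D v_j and the sum Σ_r d_r u_{i,r} v_{j,r} can be checked numerically (all values below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
R = 2
u = rng.normal(size=R)          # latent factors of node i as a sender
v = rng.normal(size=R)          # latent factors of node j as a receiver
d = np.array([3.0, -1.5])       # factor weights (illustrative)

# u^T D v as a matrix product...
quad = u @ np.diag(d) @ v
# ...equals the termwise sum d_1 u_1 v_1 + d_2 u_2 v_2.
summed = sum(d[r] * u[r] * v[r] for r in range(R))
assert np.isclose(quad, summed)
```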

21/39 Matrix decomposition interpretation

Recall from linear algebra: every m × n matrix Z can be written

Z = U D V^T,

where D = diag(d_1, ..., d_n) and U and V are orthonormal. If U D V^T is the SVD of Z, then

Ẑ_k = U_{[,1:k]} D_{[1:k,1:k]} V_{[,1:k]}^T

is the least-squares rank-k approximation to Z.
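Truncating the SVD is a one-liner in numpy; a sketch with an arbitrary matrix (dimensions and k are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
Z = rng.normal(size=(8, 6))

# Full SVD: Z = U diag(d) V^T, with singular values d in decreasing order.
U, d, Vt = np.linalg.svd(Z, full_matrices=False)

# Keeping the top k singular values/vectors gives the least-squares
# rank-k approximation (Eckart-Young).
k = 2
Z_hat = U[:, :k] @ np.diag(d[:k]) @ Vt[:k, :]

err = np.linalg.norm(Z - Z_hat)  # Frobenius error of the approximation
```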

22/39 Least squares matrix approximations

[Figure: scree plot of singular values, and scatterplots of the entries of Ẑ against those of Z for rank r = 1, 3, 6, 12 approximations.]

23/39 LFM for symmetric data

Probit version of the symmetric latent factor model:

y_{i,j} = g(z_{i,j}), where g is a nondecreasing function;
z_{i,j} = u_i^T Λ u_j + ε_{i,j}, where u_i ∈ R^K, Λ = diag(λ_1, ..., λ_K);
{ε_{i,j}} iid normal(0, 1).

Writing {z_{i,j}} as a matrix, Z = U Λ U^T + E.

Recall from linear algebra: every n × n symmetric matrix Z can be written Z = U Λ U^T, where Λ = diag(λ_1, ..., λ_n) and U is orthonormal. If U Λ U^T is the eigendecomposition of Z, then

Ẑ_k = U_{[,1:k]} Λ_{[1:k,1:k]} U_{[,1:k]}^T

is the least-squares rank-k approximation to Z.
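The symmetric analogue uses the eigendecomposition; note that for the least-squares property one keeps the k eigenvalues largest in magnitude, which may include negative ones. A sketch (dimensions illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(6, 6))
Z = (A + A.T) / 2  # an arbitrary symmetric matrix

# Eigendecomposition Z = U Lambda U^T; eigh returns eigenvalues in
# ascending order for symmetric input.
lam, U = np.linalg.eigh(Z)

# Rank-k least-squares approximation: keep the k eigenvalues of
# largest absolute value (they may be negative).
k = 2
idx = np.argsort(-np.abs(lam))[:k]
Z_hat = U[:, idx] @ np.diag(lam[idx]) @ U[:, idx].T
```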

24/39 Least squares approximations of increasing rank

[Figure: plot of the eigenvalues of Z, and scatterplots of the entries of Ẑ against those of Z for rank k = 1, 3, 9, 27 approximations.]

25/39 Understanding eigenvectors and eigenvalues

Z = U Λ U^T + E
z_{i,j} = u_i^T Λ u_j + ε_{i,j} = Σ_{r=1}^R λ_r u_{i,r} u_{j,r} + ε_{i,j}

For example, in a rank-2 model we have

z_{i,j} = λ_1 (u_{i,1} u_{j,1}) + λ_2 (u_{i,2} u_{j,2}) + ε_{i,j}

Interpretation:
- u_{i,r} ≈ u_{j,r}: equality of latent factors represents stochastic equivalence;
- λ_r > 0: positive eigenvalues represent homophily;
- λ_r < 0: negative eigenvalues represent antihomophily.
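The sign of an eigenvalue flips which pairs of nodes get large z_{i,j}. A minimal rank-1 illustration with two groups of nodes (group structure and λ values are invented for the example):

```python
import numpy as np

# Six nodes in two groups; within a group all nodes share the same
# latent factor u_{i,1}.
u = np.array([1.0] * 3 + [-1.0] * 3)

# Rank-1 model without noise: z_{i,j} = lambda_1 * u_{i,1} * u_{j,1}.
lam_pos, lam_neg = 2.0, -2.0
Z_homophily = lam_pos * np.outer(u, u)      # within-group z is large: homophily
Z_antihomophily = lam_neg * np.outer(u, u)  # between-group z is large: antihomophily
```

With λ > 0, nodes with equal factors (same group) have high tie probability; with λ < 0, nodes with opposite factors do.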

26/39 Returning to directed relations: AME models

SRM: we motivated the SRM to represent second-order dependence (within-row dependence, within-column dependence, and within-dyad dependence):

z_{i,j} = β^T x_{i,j} + a_i + b_j + ε_{i,j}
y_{i,j} = g(z_{i,j})

This model is made up of additive random effects.

LFM: we motivated the LFM to model more complex structures (third-order dependence and transitivity; stochastic equivalence):

z_{i,j} = u_i^T D v_j + ε_{i,j}
y_{i,j} = g(z_{i,j})

This model is made up of multiplicative random effects.
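The two linear predictors can be simulated side by side; the sketch below is illustrative (all dimensions, coefficients, and variances are invented), not the amen package's implementation.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, K = 30, 2, 2  # nodes, dyadic covariates, latent factors (illustrative)

X = rng.normal(size=(n, n, p))   # dyadic covariates x_{i,j}
beta = np.array([0.5, -0.25])    # regression coefficients (illustrative)

# SRM: additive sender effects a_i, receiver effects b_j, dyadic noise.
a = rng.normal(size=n)
b = rng.normal(size=n)
Z_srm = X @ beta + a[:, None] + b[None, :] + rng.normal(size=(n, n))

# LFM: multiplicative sender/receiver factors u_i^T D v_j.
U = rng.normal(size=(n, K))
V = rng.normal(size=(n, K))
D = np.diag([1.5, 0.5])
Z_lfm = U @ D @ V.T + rng.normal(size=(n, n))

# Binary networks via g(z) = 1(z > 0).
Y_srm = (Z_srm > 0).astype(int)
Y_lfm = (Z_lfm > 0).astype(int)
```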