Higher order patterns via factor models


1/39 Higher order patterns via factor models
567 Statistical analysis of social networks
Peter Hoff, Statistics, University of Washington

2/39 Conflict data

Y <- conflict90s$conflicts
Xn <- conflict90s$nodevars
colnames(Xn)
## [1] "pop"    "gdp"    "polity"

Xn[,1:2] <- log(Xn[,1:2])

table(Y)
## Y
##     0     1     2     3     4     5     6     7     8
## 16567   154    22    14     7     2     2     1     1

3/39 Dichotomized conflict data

[Figure: dichotomized conflict network among the countries in the data, labeled by three-letter code (AFG, ALB, ..., ZIM)]

4/39 SRM fit to ordinal data

#### nodal covariates only
fit_n <- ame(Y, Xrow=Xn, Xcol=Xn, model="ord", nscan=10000)
summary(fit_n)
##
## beta:
##             pmean   psd z-stat p-val
## pop.row     0.256 0.121  2.114 0.034
## gdp.row    -0.430 0.099 -4.334 0.000
## polity.row -0.014 0.017 -0.797 0.425
## pop.col     0.207 0.096  2.151 0.031
## gdp.col    -0.371 0.078 -4.774 0.000
## polity.col -0.001 0.014 -0.062 0.950
##
## Sigma_ab pmean:
##       a     b
## a 1.134 0.801
## b 0.801 0.687
##
## rho pmean:
## 0.804

5/39 SRM fit to ordinal data

plot(fit_n)

[Figure: MCMC trace plots (SABR, BETA) and posterior predictive distributions of the GOF statistics sd.rowmean, sd.colmean, dyad.dep, and triad.dep]

6/39 Triad dependence measure

gofstats
## function (Y)
## {
##     sd.rowmean <- sd(rowMeans(Y, na.rm = TRUE), na.rm = TRUE)
##     sd.colmean <- sd(colMeans(Y, na.rm = TRUE), na.rm = TRUE)
##     dyad.dep <- cor(c(Y), c(t(Y)), use = "complete.obs")
##     E <- Y - mean(Y, na.rm = TRUE)
##     D <- 1 * (!is.na(E))
##     E[is.na(E)] <- 0
##     triad.dep <- sum(diag(E %*% E %*% E))/(sum(diag(D %*% D %*%
##         D)) * sd(c(Y), na.rm = TRUE)^3)
##     gof <- c(sd.rowmean, sd.colmean, dyad.dep, triad.dep)
##     names(gof) <- c("sd.rowmean", "sd.colmean", "dyad.dep", "triad.dep")
##     gof
## }
## <environment: namespace:amen>

7/39 Triad dependence measure

Let E = (Y − 11^T ȳ)/s_y, that is, e_{i,j} = (y_{i,j} − ȳ)/s_y:
- ȳ is the grand mean and s_y is the sample standard deviation of the y_{i,j}'s;
- e_{i,j} is like a scaled residual from the simple mean model y_{i,j} = μ + ε_{i,j}.

The sum of the diagonal of E^3 gives the scaled third-order moment:

trace(E^3) = Σ_{i≠j≠k} e_{i,j} e_{j,k} e_{k,i}

The GOF statistic used in amen is the average of the e_{i,j} e_{j,k} e_{k,i}'s:

t(Y) = Σ_{i≠j≠k} e_{i,j} e_{j,k} e_{k,i} / (# of ordered triads)

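The statistic above can be sketched numerically. The following is a minimal Python/NumPy version of t(Y) for a complete (no-NA) matrix with an ignored diagonal; the amen implementation additionally handles missing entries, and the function name triad_dep is mine, not from amen.

```python
import numpy as np

def triad_dep(Y):
    """Triad dependence GOF statistic: the average of e_ij * e_jk * e_ki
    over ordered triads of distinct indices, where E holds the scaled
    residuals (Y - ybar) / s_y.  Assumes a complete square matrix and
    ignores the diagonal."""
    Y = np.asarray(Y, dtype=float)
    n = Y.shape[0]
    offdiag = ~np.eye(n, dtype=bool)
    ybar = Y[offdiag].mean()           # grand mean over off-diagonal entries
    sy = Y[offdiag].std(ddof=1)        # sample standard deviation
    E = (Y - ybar) / sy
    np.fill_diagonal(E, 0.0)           # zero diagonal: triads with repeated indices vanish
    num = np.trace(E @ E @ E)          # sum of e_ij e_jk e_ki over ordered triads
    n_triads = n * (n - 1) * (n - 2)   # number of ordered triads
    return num / n_triads
```

With the diagonal zeroed, trace(E^3) picks up only terms with i, j, k distinct, so dividing by n(n−1)(n−2) gives exactly the average in the slide's t(Y).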

8/39 Triad dependence and transitivity

This triadic dependence measure is related to transitivity for binary data:

Transitive triple: an ordered triple (i, j, k) is transitive if i → j → k → i, i.e., y_{i,j} = y_{j,k} = y_{k,i} = 1.

The number of transitive triples is

trans(Y) = Σ_{i≠j≠k} y_{i,j} y_{j,k} y_{k,i}

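For a binary adjacency matrix with zero diagonal, trans(Y) is just the trace of Y^3. A small sketch (the helper name trans_count is illustrative):

```python
import numpy as np

def trans_count(Y):
    """Count ordered transitive triples i -> j -> k -> i, i.e. triples
    with y_ij = y_jk = y_ki = 1, in a binary adjacency matrix with zero
    diagonal.  With the diagonal zero this equals trace(Y^3)."""
    Y = np.asarray(Y)
    return int(np.round(np.trace(Y @ Y @ Y)))
```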

9/39 SRM fit to ordinal data with dyadic covariates

#### dyadic and nodal covariates
fit_dn <- ame(Y, Xdyad=Xd, Xrow=Xn, Xcol=Xn, model="ord", nscan=10000)
summary(fit_dn)
##
## beta:
##                    pmean   psd  z-stat p-val
## pop.row            0.212 0.095   2.230 0.026
## gdp.row            0.077 0.078   0.992 0.321
## polity.row        -0.028 0.013  -2.045 0.041
## pop.col            0.205 0.092   2.232 0.026
## gdp.col           -0.029 0.078  -0.376 0.707
## polity.col         0.002 0.012   0.130 0.896
## polity_int.dyad   -0.003 0.001  -2.419 0.016
## imports.dyad      -0.142 0.100  -1.415 0.157
## shared_igos.dyad  -0.017 0.005  -3.383 0.001
## distance.dyad     -1.837 0.101 -18.244 0.000
##
## Sigma_ab pmean:
##       a     b
## a 0.500 0.332
## b 0.332 0.462
##
## rho pmean:
## 0.566

10/39 SRM fit to ordinal data with dyadic covariates

plot(fit_dn)

[Figure: MCMC trace plots (SABR, BETA) and posterior predictive distributions of the GOF statistics sd.rowmean, sd.colmean, dyad.dep, and triad.dep]

11/39 GOF comparison

[Figure: posterior predictive densities of the triad dependence statistic under the two fits]

12/39 Dyadic functions of nodal characteristics

Adding these dyadic covariates improved the fit with regard to t(Y).

Note that each of these dyadic covariates was a function of nodal covariates:

x_{i,j,1} = f_1(polity_i, polity_j)
x_{i,j,2} = f_2(igo_i, igo_j)
x_{i,j,3} = f_3(location_i, location_j)

This suggests models of the form

y_{i,j} ≈ β_0 + β_1 x_{i,j},   x_{i,j} = s(x_i, x_j),

where s(·,·) is some (non-additive) function of the nodal characteristics x_i, x_j.

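As a sketch of building dyadic covariates from nodal characteristics, the following assumes two hypothetical nodal inputs, a polity score and a latitude/longitude pair; the particular functions s(x_i, x_j) used here (absolute difference, great-circle distance) are illustrative choices, not necessarily those used for the conflict data:

```python
import numpy as np

def dyadic_covariates(polity, lat, lon):
    """Dyadic covariates x_{i,j} = s(x_i, x_j) built from nodal inputs:
    absolute polity difference and great-circle distance (in radians on a
    unit sphere) between node locations.  Both choices are illustrative."""
    polity, lat, lon = map(np.asarray, (polity, lat, lon))
    polity_diff = np.abs(polity[:, None] - polity[None, :])
    la, lo = np.radians(lat), np.radians(lon)
    cosang = (np.sin(la)[:, None] * np.sin(la)[None, :]
              + np.cos(la)[:, None] * np.cos(la)[None, :]
              * np.cos(lo[:, None] - lo[None, :]))
    dist = np.arccos(np.clip(cosang, -1.0, 1.0))   # scale by Earth's radius if needed
    return np.stack([polity_diff, dist], axis=-1)  # n x n x 2 dyadic array
```

An array shaped like this (nodes × nodes × covariates) is the form a dyadic design array such as Xd takes.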

13/39 Homophily and stochastic equivalence

More generally, let
- u_i be a covariate of i as a sender of ties;
- v_j be a covariate of j as a receiver of ties;

y_{i,j} ≈ β_0 + β_1 s(u_i, v_j)

Such a model can describe various types of higher-order dependence, including
- transitivity
- stochastic equivalence


14/39 Transitivity via homophily

Homophily on covariates can explain transitivity:

y_{i,j} ≈ β_0 + β_1 s(x_i, x_j)

If β_1 > 0 and the y_{i,j}'s are binary, then
- y_{i,j} = 1 suggests x_i ≈ x_j;
- y_{i,k} = 1 suggests x_i ≈ x_k;
- x_i ≈ x_j and x_i ≈ x_k imply x_j ≈ x_k;
- x_j ≈ x_k makes y_{j,k} = 1 likely.

Similarly, homophily can explain triadic dependence for ordinal y_{i,j}'s.

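A quick simulation illustrates the point: linking nodes whose (hypothetical) scalar characteristics are close produces many more transitive triples than a density-matched random graph. All parameters below (n = 60, threshold 0.5) are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 60
x = rng.normal(size=n)                        # latent nodal characteristics
Y = (np.abs(x[:, None] - x[None, :]) < 0.5).astype(int)
np.fill_diagonal(Y, 0)                        # homophily graph: link if close

trans_hom = int(np.trace(Y @ Y @ Y))          # ordered transitive triples

p = Y.sum() / (n * (n - 1))                   # density-matched random graph
R = (rng.random((n, n)) < p).astype(int)
np.fill_diagonal(R, 0)
trans_rand = int(np.trace(R @ R @ R))
```

Under homophily trans_hom comes out well above trans_rand, even though both graphs have roughly the same density.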

15/39 Stochastic equivalence

Returning to the more general model:

y_{i,j} ≈ β_0 + β_1 s(u_i, v_j)

If u_i = u_k, then i and k are equivalent as senders in terms of the model.

Example: logistic regression. If u_i = u_k = u, then

Pr(Y_{i,j} = 1) = e^{β_0 + β_1 s(u, v_j)} / (1 + e^{β_0 + β_1 s(u, v_j)}) = Pr(Y_{k,j} = 1),

and nodes i and k are stochastically equivalent (as senders).

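A minimal numerical check of this, with illustrative values for β_0, β_1 and s(u, v) = u·v (none of these are from the slides):

```python
import math

def p_tie(u, v, b0=-1.0, b1=2.0, s=lambda u, v: u * v):
    """Pr(Y_ij = 1) under the logistic model with linear predictor
    b0 + b1 * s(u_i, v_j).  The defaults are illustrative choices."""
    eta = b0 + b1 * s(u, v)
    return 1.0 / (1.0 + math.exp(-eta))

# Nodes i and k with identical sender covariate u are stochastically
# equivalent as senders: equal tie probability to every receiver j.
u_i = u_k = 0.7
for v_j in (-1.0, 0.0, 0.5, 2.0):
    assert p_tie(u_i, v_j) == p_tie(u_k, v_j)
```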

16/39 Homophily and stochastic equivalence

In our hypothetical model of social relations, we've seen how nodal characteristics relate to homophily and stochastic equivalence.

Homophily: Similar nodes link to each other
- similar in terms of characteristics (potentially unobserved)
- homophily leads to transitive or clustered social networks
- observed transitivity may be due to exogenous or endogenous factors (see Shalizi and Thomas 2010 for a more careful discussion)

Stochastic equivalence: Similar nodes have similar relational patterns
- similar nodes may or may not link to each other
- equivalent nodes can be thought of as having the same role


17/39 Visualizing stochastic equivalence

[Figure: two example networks]

For which network is homophily a plausible explanation? Which network exhibits a large degree of stochastic equivalence?


18/39 Homophily and stochastic equivalence in real networks

[Figure: three network plots]
- AddHealth friendships: friendships among 247 12th-graders
- Word neighbors in Genesis: neighboring occurrences among 158 words
- Protein binding interactions: binding patterns among 230 proteins

b3984b3303 b3309 b3321 b0023 b1552 b3054 b2606 b3186 b4200 b3306 b3230 b3307 b3165 b2609 b4202 b0470 b0640 b2699 b0595 b1294 b2610 b3305 b2515 b2311 b3340 b3247 b1413 b0439 b0436 b0178 b1808 b3935 b3417 b2487 b2721 b3686 b3986 b2096 b0922 b3168 b0480 b2011 b3301 b1716 b2372 b4051 b1817 b0925 b3637 b3652 b3294 b3499 b3304 b3602 b3318 b2804 b3342 AddHealth friendships: friendships among 247 12th-graders Word neighbors in Genesis: neighboring occurrences among 158 words Protein binding interactions: binding patterns among 230 proteins 8/39

Homophily and stochastic equivalence in real networks, ; :. a above abundantly after air all also and appear be bearing beast beginning behold blessed bring brought called cattle created creature creepeth creeping darkness day days deep divide divided dominion dry earth evening every face female fifth fill finished firmament first fish fly for form forth fourth fowl from fruit fruitful gathered gathering give given god good grass great greater green had hath have he heaven heavens herb him his host i image in is it itself kind land lesser let life light lights likeness living made make male man may meat midst morning moved moveth moving multiply night of one open our over own place replenish rule said saw saying sea seas seasons second seed set shall signs sixth so spirit stars subdue that the their them there thing third thus to together tree two under unto upon us very void was waters were whales wherein which whose winged without years yielding you b0185 b2316 b1094 b4187 b2697 b1731 b2097 b2496 b4131 b1000 b0882 b0437 b0438b1120 b3556 b1557 b1823 b0880 b0623 b3162 b3702 b0184 b0015 b0014 b0215 b2779 b1288 b0180 b2925 b3895 b0684 b3463 b0095 b3517 b1493 b3181 b3406 b2614 b2231 b3699 b0059 b4172 b1099 b4259 b4372 b1842 b2526 b3931 b3932 b4000 b0440 b2728 b2729 b3687 b1718 b2530 b2529 b1215 b0186 b0179 b2890 b4129 b3418 b2942 b4143 b4142 b3251 b0924 b0923 b3189 b1224 b1225 b1226 b1467 b3169 b3982 b0133 b3019 b1127 b0903 b3164 b3863 b4226 b1207 b2585 b3725 b2476 b1854 b2892 b3822 b3619 b2038 b3780 b1084 b0214 b3704 b3319 b3295b3987 b3988 b3067 b3461 b3202 b2741 b3649 b0098 b3609 b3590 b2620 b3650 b2576 b4059 b3115 b0406 b1274 b1763 b3919 b0170 b3339 b3980 b2319 b1913 b4179 b0101 b0119 b0528 b0607 b0660 b0877 b0948 b1055 b1086 b1103 b1269 b2330 b2517 b2579 b2581 b2830 b2960 b3180 b3470 b3746 b4191 b0661 b2836 b2323 b4041 b2563 b1095 b1093 b3730 b0736 b3888 b0492 b3701 b3783 b3317 b4203 b3315 b3314 b0169 b3341 b1468 b2592 b0990 b3298 b3320 b3308 b3231 b0911 b3296 
b3984b3303 b3309 b3321 b0023 b1552 b3054 b2606 b3186 b4200 b3306 b3230 b3307 b3165 b2609 b4202 b0470 b0640 b2699 b0595 b1294 b2610 b3305 b2515 b2311 b3340 b3247 b1413 b0439 b0436 b0178 b1808 b3935 b3417 b2487 b2721 b3686 b3986 b2096 b0922 b3168 b0480 b2011 b3301 b1716 b2372 b4051 b1817 b0925 b3637 b3652 b3294 b3499 b3304 b3602 b3318 b2804 b3342 AddHealth friendships: friendships among 247 12th-graders Word neighbors in Genesis: neighboring occurrences among 158 words Protein binding interactions: binding patterns among 230 proteins 8/39

19/39 Latent factor models

Homophily and stochastic equivalence arising from unobserved variables can be represented with a latent factor model:

y_{i,j} ≈ u_i^T D v_j,

where
- u_i is a vector of latent factors describing i as a sender of ties;
- v_j is a vector of latent factors describing j as a receiver of ties;
- D is a diagonal matrix of factor weights.

Normal, binomial, and ordinal data can be represented with this structure as follows:

z_{i,j} = u_i^T D v_j + ε_{i,j}
y_{i,j} = g(z_{i,j}),

where g is some increasing function:
- g(z) = z for normal data;
- g(z) = 1(z > 0) for binomial data;
- g(z) = some increasing step function for ordinal data.
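This generating process is easy to simulate. The sketch below draws latent factors, forms z_{i,j} = u_i^T D v_j + ε_{i,j}, and applies g(z) = 1(z > 0) to get a binary network; the dimensions and factor weights are illustrative, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 50, 2  # number of nodes, number of latent factors (illustrative)

# Sender factors u_i (rows of U), receiver factors v_j (rows of V),
# and diagonal factor weights D.
U = rng.normal(size=(n, K))
V = rng.normal(size=(n, K))
D = np.diag([2.0, 1.0])

# Latent relational matrix: z_{i,j} = u_i^T D v_j + eps_{i,j}.
Z = U @ D @ V.T + rng.normal(size=(n, n))

# Binary network via the link g(z) = 1(z > 0).
Y = (Z > 0).astype(int)
```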

20/39 Understanding latent factors

Z = U D V^T + E
z_{i,j} = u_i^T D v_j + ε_{i,j} = Σ_{r=1}^R d_r u_{i,r} v_{j,r} + ε_{i,j}

For example, in a 2-factor model we have

z_{i,j} = d_1 (u_{i,1} v_{j,1}) + d_2 (u_{i,2} v_{j,2}) + ε_{i,j}

Interpretation:
- u_i ≈ u_j: similarity of latent factors implies approximate stochastic equivalence;
- u_i ≈ v_j: similarity of latent factors implies high probability of a tie.
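The identity between the quadratic form u_i^T D v_j and the sum Σ_r d_r u_{i,r} v_{j,r} can be checked numerically (all values below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
R = 2
u = rng.normal(size=R)          # latent factors of node i as a sender
v = rng.normal(size=R)          # latent factors of node j as a receiver
d = np.array([3.0, -1.5])       # factor weights (illustrative)

# u^T D v as a matrix product...
quad = u @ np.diag(d) @ v
# ...equals the termwise sum d_1 u_1 v_1 + d_2 u_2 v_2.
summed = sum(d[r] * u[r] * v[r] for r in range(R))
assert np.isclose(quad, summed)
```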

21/39 Matrix decomposition interpretation

Recall from linear algebra: every m × n matrix Z can be written

Z = U D V^T,

where D = diag(d_1, ..., d_n) and U and V are orthonormal. If U D V^T is the SVD of Z, then

Ẑ_k = U_{[,1:k]} D_{[1:k,1:k]} V_{[,1:k]}^T

is the least-squares rank-k approximation to Z.
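Truncating the SVD is a one-liner in numpy; a sketch with an arbitrary matrix (dimensions and k are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
Z = rng.normal(size=(8, 6))

# Full SVD: Z = U diag(d) V^T, with singular values d in decreasing order.
U, d, Vt = np.linalg.svd(Z, full_matrices=False)

# Keeping the top k singular values/vectors gives the least-squares
# rank-k approximation (Eckart-Young).
k = 2
Z_hat = U[:, :k] @ np.diag(d[:k]) @ Vt[:k, :]

err = np.linalg.norm(Z - Z_hat)  # Frobenius error of the approximation
```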

22/39 Least squares matrix approximations

[Figure: scree plot of singular values, and scatterplots of the entries of Ẑ against those of Z for rank r = 1, 3, 6, 12 approximations.]

23/39 LFM for symmetric data

Probit version of the symmetric latent factor model:

y_{i,j} = g(z_{i,j}), where g is a nondecreasing function;
z_{i,j} = u_i^T Λ u_j + ε_{i,j}, where u_i ∈ R^K, Λ = diag(λ_1, ..., λ_K);
{ε_{i,j}} iid normal(0, 1).

Writing {z_{i,j}} as a matrix, Z = U Λ U^T + E.

Recall from linear algebra: every n × n symmetric matrix Z can be written Z = U Λ U^T, where Λ = diag(λ_1, ..., λ_n) and U is orthonormal. If U Λ U^T is the eigendecomposition of Z, then

Ẑ_k = U_{[,1:k]} Λ_{[1:k,1:k]} U_{[,1:k]}^T

is the least-squares rank-k approximation to Z.
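The symmetric analogue uses the eigendecomposition; note that for the least-squares property one keeps the k eigenvalues largest in magnitude, which may include negative ones. A sketch (dimensions illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(6, 6))
Z = (A + A.T) / 2  # an arbitrary symmetric matrix

# Eigendecomposition Z = U Lambda U^T; eigh returns eigenvalues in
# ascending order for symmetric input.
lam, U = np.linalg.eigh(Z)

# Rank-k least-squares approximation: keep the k eigenvalues of
# largest absolute value (they may be negative).
k = 2
idx = np.argsort(-np.abs(lam))[:k]
Z_hat = U[:, idx] @ np.diag(lam[idx]) @ U[:, idx].T
```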

24/39 Least squares approximations of increasing rank

[Figure: plot of the eigenvalues of Z, and scatterplots of the entries of Ẑ against those of Z for rank k = 1, 3, 9, 27 approximations.]

25/39 Understanding eigenvectors and eigenvalues

Z = U Λ U^T + E
z_{i,j} = u_i^T Λ u_j + ε_{i,j} = Σ_{r=1}^R λ_r u_{i,r} u_{j,r} + ε_{i,j}

For example, in a rank-2 model we have

z_{i,j} = λ_1 (u_{i,1} u_{j,1}) + λ_2 (u_{i,2} u_{j,2}) + ε_{i,j}

Interpretation:
- u_{i,r} ≈ u_{j,r}: equality of latent factors represents stochastic equivalence;
- λ_r > 0: positive eigenvalues represent homophily;
- λ_r < 0: negative eigenvalues represent antihomophily.
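The sign of an eigenvalue flips which pairs of nodes get large z_{i,j}. A minimal rank-1 illustration with two groups of nodes (group structure and λ values are invented for the example):

```python
import numpy as np

# Six nodes in two groups; within a group all nodes share the same
# latent factor u_{i,1}.
u = np.array([1.0] * 3 + [-1.0] * 3)

# Rank-1 model without noise: z_{i,j} = lambda_1 * u_{i,1} * u_{j,1}.
lam_pos, lam_neg = 2.0, -2.0
Z_homophily = lam_pos * np.outer(u, u)      # within-group z is large: homophily
Z_antihomophily = lam_neg * np.outer(u, u)  # between-group z is large: antihomophily
```

With λ > 0, nodes with equal factors (same group) have high tie probability; with λ < 0, nodes with opposite factors do.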

26/39 Returning to directed relations: AME models

SRM: we motivated the SRM to represent second-order dependence (within-row dependence, within-column dependence, and within-dyad dependence):

z_{i,j} = β^T x_{i,j} + a_i + b_j + ε_{i,j}
y_{i,j} = g(z_{i,j})

This model is made up of additive random effects.

LFM: we motivated the LFM to model more complex structures (third-order dependence and transitivity; stochastic equivalence):

z_{i,j} = u_i^T D v_j + ε_{i,j}
y_{i,j} = g(z_{i,j})

This model is made up of multiplicative random effects.
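The two linear predictors can be simulated side by side; the sketch below is illustrative (all dimensions, coefficients, and variances are invented), not the amen package's implementation.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, K = 30, 2, 2  # nodes, dyadic covariates, latent factors (illustrative)

X = rng.normal(size=(n, n, p))   # dyadic covariates x_{i,j}
beta = np.array([0.5, -0.25])    # regression coefficients (illustrative)

# SRM: additive sender effects a_i, receiver effects b_j, dyadic noise.
a = rng.normal(size=n)
b = rng.normal(size=n)
Z_srm = X @ beta + a[:, None] + b[None, :] + rng.normal(size=(n, n))

# LFM: multiplicative sender/receiver factors u_i^T D v_j.
U = rng.normal(size=(n, K))
V = rng.normal(size=(n, K))
D = np.diag([1.5, 0.5])
Z_lfm = U @ D @ V.T + rng.normal(size=(n, n))

# Binary networks via g(z) = 1(z > 0).
Y_srm = (Z_srm > 0).astype(int)
Y_lfm = (Z_lfm > 0).astype(int)
```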