Modeling networks: regression with additive and multiplicative effects

Alexander Volfovsky
Department of Statistical Science, Duke
May 25, 2017
Health Networks

Why model networks?
- Interested in understanding the formation of relationships
- Applied fields: sociology, economics, biology, epidemiology
- Fundamental theory questions:
  - What assumptions are made for different network models?
  - What models work when the assumptions fail?
  - How to develop fail-safes to overcome these problems?
- Where to apply these?
  - Causal inference
  - Link prediction

Some context: Facebook
- Facebook wants to change its ad algorithm.
- Can't do it on the whole graph.
- Need the total network effect.
Source: Wikimedia

How do they solve it?
Interested in estimating
$$\frac{1}{N} \sum_{i=1}^{N} \left[ Y_i(\text{all treated}) - Y_i(\text{all controls}) \right]$$
At a high level, graph cluster randomization is a technique in which the graph is partitioned into a set of clusters, and then randomization between treatment and control is performed at the cluster level.
Where can we find clusters?
- Observable information (e.g. same school)
- Unobservable information ("social space")
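
The cluster-level assignment described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the partition `cluster`, the toy outcome `y` and its unit treatment effect are hypothetical, and the difference in means is shown as a simple contrast, not as the estimator used in practice.

```r
set.seed(1)
n <- 12
# Hypothetical partition of the graph into 4 clusters of 3 nodes each
cluster <- rep(1:4, each = 3)
# Randomize at the cluster level: 2 of the 4 clusters are treated
treated_clusters <- sample(unique(cluster), size = 2)
# Every node inherits the assignment of its cluster
treat_node <- as.integer(cluster %in% treated_clusters)
table(cluster, treat_node)
# Toy outcomes (assumed): noise plus a unit effect for treated nodes
y <- rnorm(n) + 1 * treat_node
# Simple contrast between treated and control nodes
est <- mean(y[treat_node == 1]) - mean(y[treat_node == 0])
est
```

Because whole clusters are assigned together, a node and its within-cluster neighbors always share the same treatment status, which is the point of randomizing at the cluster level.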

Some context: (im)migration
- Want to know how regime change affects population.
- Politicians during election years care about direct effects.

Some more context
Studying tram traffic in Vienna
Source: kurier.at

And one more
Studying taxi rides in Porto
- 442 taxis
- 1.7 million rides with (x, y) coordinates at 15 second intervals
- Project into a 100 dimensional latent space.
- Learn hidden interpretable patterns...
Source: Kucukelbir, A., Tran, D., Ranganath, R., Gelman, A., & Blei, D. M. (2017). Automatic Differentiation Variational Inference. Journal of Machine Learning Research, 18(14).

Relational data: common examples and goals
Changes in exports from year to year
[Figure: countries plotted by the first and second eigenvectors of $\hat{R}_{row}$ and of $\hat{R}_{col}$]
Network regression problems $y_{ij} = x_{ij}\beta + \epsilon_{ij}$ frequently assume independence of the $\epsilon_{ij}$.

Estimating β in network regression
[Figure: countries plotted by the first and second eigenvectors of $\hat{R}_{row}$ and of $\hat{R}_{col}$]
For $Y = \langle X, \beta \rangle + E$ we have
- OLS (assume no dependence among $\epsilon_{ij}$):
  $$\hat\beta^{(ols)} = \left(\mathrm{mat}(X)^T \mathrm{mat}(X)\right)^{-1} \mathrm{mat}(X)^T \mathrm{vec}(Y)$$
- Oracle GLS (assume dependence among $\epsilon_{ij}$):
  $$\hat\beta^{(gls)} = \left(\mathrm{mat}(X)^T \Sigma^{-1} \mathrm{mat}(X)\right)^{-1} \mathrm{mat}(X)^T \Sigma^{-1} \mathrm{vec}(Y)$$
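
A minimal base R sketch of the two estimators, assuming a hypothetical n x n x p covariate array `X` and a known error covariance `Sigma` (the "oracle" case). The helper `gls` and all data here are made up for illustration.

```r
set.seed(1)
n <- 5; p <- 2
# Hypothetical dyadic covariates: an n x n x p array
X <- array(rnorm(n * n * p), dim = c(n, n, p))
beta_true <- c(1, -0.5)
# Y = <X, beta> + E, entry by entry
Y <- X[, , 1] * beta_true[1] + X[, , 2] * beta_true[2] +
  matrix(rnorm(n * n), n, n)
# Drop the undefined diagonal and vectorize
off <- row(Y) != col(Y)
y_vec <- Y[off]
X_mat <- cbind(X[, , 1][off], X[, , 2][off])   # mat(X): one row per dyad
# OLS: (X'X)^{-1} X'y
beta_ols <- solve(crossprod(X_mat), crossprod(X_mat, y_vec))
# GLS with a known error covariance Sigma
gls <- function(Xm, y, Sigma) {
  Si <- solve(Sigma)
  solve(t(Xm) %*% Si %*% Xm, t(Xm) %*% Si %*% y)
}
# Sanity check: with Sigma = I, GLS reduces to OLS
beta_gls_I <- gls(X_mat, y_vec, diag(length(y_vec)))
all.equal(c(beta_ols), c(beta_gls_I))
```

In a real network the difficulty is that Σ is unknown, which is what motivates modeling the dependence explicitly.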

Network models
The data
- There are n actors/nodes labeled 1, ..., n.
- Y is a sociomatrix: $y_{ij}$ is a dyadic relationship between node i and node j. $y_{ii}$ is frequently undefined.
- Covariates:
  - node specific: $x_i$
  - dyad specific: $x_{ij}$

Social relations model
Goal: describe the variability in Y.
- Sender effects describe sociability.
- Receiver effects describe popularity.
Capture this in the Social Relations Model (SRM):
$$y_{ij} = a_i + b_j + \epsilon_{ij}$$
Almost an ANOVA, but we want to relate $a_i$ to $b_i$ since the senders and receivers are from the same set.

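Social relations model
$$y_{ij} = \mu + a_i + b_j + \epsilon_{ij}$$
$$(a_i, b_i) \overset{iid}{\sim} N(0, \Sigma_{ab}), \qquad (\epsilon_{ij}, \epsilon_{ji}) \overset{iid}{\sim} N(0, \Sigma_e)$$
$\Sigma_{ab} = \begin{pmatrix} \sigma_a^2 & \sigma_{ab} \\ \sigma_{ab} & \sigma_b^2 \end{pmatrix}$ describes sender/receiver variability and within person similarity.
$\Sigma_e = \sigma_\epsilon^2 \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$ describes within dyad correlation.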
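
To make the SRM concrete, here is a small base R simulation from the model. The parameter values (`Sab`, `s2e`, `rho`) are arbitrary assumptions chosen for illustration, and the construction of the dyadic errors from a symmetric and an antisymmetric Gaussian matrix is one convenient way to obtain the required within-dyad correlation.

```r
set.seed(1)
n <- 30
mu <- 0
Sab <- matrix(c(1.0, 0.5,
                0.5, 1.0), 2, 2)   # assumed Sigma_ab
s2e <- 1; rho <- 0.5               # assumed sigma_eps^2 and rho
# Rows of ab are iid N(0, Sab): iid normals times the Cholesky factor
ab <- matrix(rnorm(2 * n), n, 2) %*% chol(Sab)
a <- ab[, 1]; b <- ab[, 2]
# Dyadic errors with cor(eps_ij, eps_ji) = rho
W <- matrix(rnorm(n * n), n, n)
S <- (W + t(W)) / sqrt(2)          # symmetric part, unit variance off-diagonal
A <- (W - t(W)) / sqrt(2)          # antisymmetric part, unit variance
E <- sqrt(s2e) * (sqrt((1 + rho) / 2) * S + sqrt((1 - rho) / 2) * A)
# y_ij = mu + a_i + b_j + eps_ij; the diagonal is undefined
Y <- mu + outer(a, b, "+") + E
diag(Y) <- NA
# Empirical within-dyad correlation of the errors (off-diagonal entries)
off <- row(E) != col(E)
cor(E[off], t(E)[off])
```

The printed correlation should be in the neighborhood of the assumed `rho`, up to simulation noise.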

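Variability
For distinct nodes i, j, k:
$$\mathrm{var}(y_{ij}) = \sigma_a^2 + \sigma_b^2 + \sigma_\epsilon^2$$
$$\mathrm{cov}(y_{ij}, y_{ik}) = \sigma_a^2$$
$$\mathrm{cov}(y_{ij}, y_{kj}) = \sigma_b^2$$
$$\mathrm{cov}(y_{ij}, y_{jk}) = \sigma_{ab}$$
$$\mathrm{cov}(y_{ij}, y_{ji}) = 2\sigma_{ab} + \rho\sigma_\epsilon^2$$
The variance carries no $\sigma_{ab}$ term because $a_i$ and $b_j$ are independent when $i \ne j$.
How hard is it to fit this model?
    fit_srm <- ame(Y)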
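
As a worked check of two of these identities, the derivation below uses only the SRM assumptions: the pairs $(a_i, b_i)$ are iid across nodes, the pairs $(\epsilon_{ij}, \epsilon_{ji})$ are iid across dyads, and the two families are independent of each other. Take $i$, $j$, $k$ distinct.

```latex
\begin{aligned}
\operatorname{cov}(y_{ij}, y_{ik})
  &= \operatorname{cov}(a_i + b_j + \epsilon_{ij},\; a_i + b_k + \epsilon_{ik})
   = \operatorname{var}(a_i) = \sigma_a^2, \\[4pt]
\operatorname{cov}(y_{ij}, y_{ji})
  &= \operatorname{cov}(a_i + b_j + \epsilon_{ij},\; a_j + b_i + \epsilon_{ji}) \\
  &= \operatorname{cov}(a_i, b_i) + \operatorname{cov}(b_j, a_j)
     + \operatorname{cov}(\epsilon_{ij}, \epsilon_{ji}) \\
  &= 2\sigma_{ab} + \rho\,\sigma_\epsilon^2 .
\end{aligned}
```

All other cross terms vanish because they pair effects belonging to different nodes or different dyads.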

Pictures that pop up
These help capture how well the Markov chain is mixing and give goodness of fit information.
Source: Hoff (2015). arXiv:1506.08237

Goodness of fit
Posterior predictive distributions of:
- sd.rowmean: standard deviation of row means of Y.
- sd.colmean: standard deviation of column means of Y.
- dyad.dep: correlation between vectorized $Y$ and vectorized $Y^T$.
- triad.dep: $\dfrac{\sum_{i \ne j \ne k} e_{ij} e_{jk} e_{ki}}{\#\{\text{triangles on } n \text{ nodes}\} \cdot \mathrm{Var}(\mathrm{vec}(Y))^{3/2}}$, with $e_{ij}$ the entries of Y centered at its overall mean.
Source: Hoff (2015). arXiv:1506.08237
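
The four statistics can be computed directly from a sociomatrix. The function below is a sketch written from the definitions above, not a copy of the package's own routine; `gof_stats` and the toy matrix `Ysym` are hypothetical names.

```r
gof_stats <- function(Y) {
  sd_rowmean <- sd(rowMeans(Y, na.rm = TRUE))
  sd_colmean <- sd(colMeans(Y, na.rm = TRUE))
  # Correlation between vec(Y) and vec(Y^T), skipping the undefined diagonal
  dyad_dep <- cor(c(Y), c(t(Y)), use = "complete.obs")
  # Residuals from the grand mean, set to zero where Y is missing
  E <- Y - mean(Y, na.rm = TRUE)
  D <- 1 * (!is.na(E))
  E[is.na(E)] <- 0
  # trace(E^3) sums e_ij e_jk e_ki over ordered triples of distinct nodes;
  # trace(D^3) counts those triples
  triad_dep <- sum(diag(E %*% E %*% E)) /
    (sum(diag(D %*% D %*% D)) * sd(c(Y), na.rm = TRUE)^3)
  c(sd.rowmean = sd_rowmean, sd.colmean = sd_colmean,
    dyad.dep = dyad_dep, triad.dep = triad_dep)
}

# Toy symmetric sociomatrix with an undefined diagonal
Ysym <- matrix(c(NA, 1, 2, 3,
                 1, NA, 4, 5,
                 2, 4, NA, 6,
                 3, 5, 6, NA), 4, 4)
s <- gof_stats(Ysym)
s
```

For a symmetric matrix the dyadic dependence is exactly 1 and the row and column summaries coincide, which gives a quick sanity check of the code.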

Incorporating covariates
Imagine you have some covariates and want to fit
$$y_{ij} = \beta_d^T x_{d,ij} + \beta_r^T x_{r,i} + \beta_c^T x_{c,j} + a_i + b_j + \epsilon_{ij}$$
- $x_{d,ij}$ are dyad specific covariates.
- $x_{r,i}$ are row (sender) covariates.
- $x_{c,j}$ are column (receiver) covariates.
Frequently $x_{r,i} = x_{c,i} = x_i$. When does this not make sense? (Example: popularity is affected by athletic success, but sociability is not.)
How hard is it to fit this model?
    fit_srrm <- ame(Y, Xd=Xd, Xr=Xr, Xc=Xc)

Parsing the input
    fit_srrm <- ame(Y,
      Xdyad=Xd,  # n x n x pd array of covariates
      Xrow=Xr,   # n x pr matrix of nodal row covariates
      Xcol=Xc    # n x pc matrix of nodal column covariates
    )
Xr[i,p] is the value of the pth row covariate for node i.
Xd[i,j,p] is the value of the pth dyadic covariate in the direction of i to j.
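
A short base R sketch of how inputs with these shapes might be assembled. The covariates (`age`, `smoker`, `age_diff`, `same_smoke`) are hypothetical and only serve to show the expected dimensions and indexing.

```r
set.seed(1)
n <- 6
# Hypothetical nodal covariates: one row per node, one column per covariate
Xn <- cbind(age = rnorm(n), smoker = rbinom(n, 1, 0.5))
# Two hypothetical dyadic covariates stacked into an n x n x 2 array
Xd <- array(NA_real_, dim = c(n, n, 2),
            dimnames = list(NULL, NULL, c("age_diff", "same_smoke")))
Xd[, , "age_diff"]   <- abs(outer(Xn[, "age"], Xn[, "age"], "-"))
Xd[, , "same_smoke"] <- 1 * outer(Xn[, "smoker"], Xn[, "smoker"], "==")
# Xn[i, p]: p-th nodal covariate of node i
# Xd[i, j, p]: p-th dyadic covariate in the direction i -> j
dim(Xn)   # n x 2
dim(Xd)   # n x n x 2
```

The same nodal matrix can be passed as both row and column covariates when senders and receivers are described by the same variables.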

Back to basics
Can you get rid of the dependencies in the model?
    fit_rm <- ame(Y, Xd=Xd, Xr=Xn, Xc=Xn,
      rvar=FALSE,  # should you fit row random effects?
      cvar=FALSE,  # should you fit column random effects?
      dcor=FALSE   # should you fit a dyadic correlation?
    )
Note that summary will output a "Variance parameters" table with columns pmean and psd and rows va, cab, vb, rho and ve.

So what's missing here?
We have a lot of left over variability. Common themes in network analysis:
- Homophily: similar people connect to each other
- Stochastic equivalence: similar people act similarly

Which is which?
Left: homophily; Right: stochastic equivalence
What are good models for this?
Source: Hoff (2008). NIPS

Introducing multiplicative effects
- SR(R)M can represent second-order dependencies very well.
- It has a hard time capturing triadic behavior.
Homophily: create dyadic covariates $x_{d,ij} = x_i x_j$.
Generally this can be represented by
$$x_{r,i}^T B x_{c,j} = \sum_k \sum_l b_{kl}\, x_{r,ik}\, x_{c,jl}$$
This is linear in the coefficients $b_{kl}$, with the products $x_{r,ik} x_{c,jl}$ acting as dyadic covariates, and so can be baked into the amen framework.
Sometimes there is excess correlation to account for. This suggests a multiplicative effects model:
$$y_{ij} = \beta_d^T x_{d,ij} + \beta_r^T x_{r,i} + \beta_c^T x_{c,j} + a_i + b_j + u_i^T v_j + \epsilon_{ij}$$
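
The claim that the bilinear term is just a regression on product covariates can be checked numerically. In this base R sketch the covariate matrices `Xr`, `Xc` and the coefficient matrix `B` are arbitrary made-up values.

```r
set.seed(1)
n <- 5; pr <- 2; pc <- 2
Xr <- matrix(rnorm(n * pr), n, pr)   # hypothetical row covariates
Xc <- matrix(rnorm(n * pc), n, pc)   # hypothetical column covariates
B  <- matrix(c(1, 0.5, -0.5, 2), pr, pc)
# Direct evaluation: entry (i, j) is x_{r,i}^T B x_{c,j}
M_direct <- Xr %*% B %*% t(Xc)
# Same quantity as a linear combination of pr * pc product covariates
Xd <- array(0, dim = c(n, n, pr * pc))
M_linear <- matrix(0, n, n)
idx <- 0
for (k in 1:pr) for (l in 1:pc) {
  idx <- idx + 1
  Xd[, , idx] <- outer(Xr[, k], Xc[, l])      # x_{r,ik} * x_{c,jl}
  M_linear <- M_linear + B[k, l] * Xd[, , idx]
}
all.equal(M_direct, M_linear)
```

The array `Xd` built in the loop has the n x n x p shape of a dyadic covariate array, so the entries of B play the role of ordinary regression coefficients.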

Fitting these models and beyond
    fit_ame2 <- ame(Y, Xd, Xn, Xn,
      R=2  # dimension of the multiplicative effect
    )
Source: Hoff (2015). arXiv:1506.08237

What happened here?
Why do multiplicative effects help triadic behavior?
- The triadic measure is related to transitivity (at least for binary data).
- Turns out homophily can capture transitivity...
$$y_{ij} = \beta_d^T x_{d,ij} + \beta_r^T x_{r,i} + \beta_c^T x_{c,j} + a_i + b_j + u_i^T v_j + \epsilon_{ij}$$
- $u_i$ is information about the sender, $v_j$ is information about the receiver.
- If $u_i \approx v_j$ then $u_i^T v_j > 0$...
- If $u_i \approx u_j$ then there is some stochastic equivalence...

Let's generalize: ordinal models
Imagine a binary (probit) model:
$$y_{ij} = 1(z_{ij} > 0), \qquad z_{ij} = \mu + a_i + b_j + \epsilon_{ij}$$
Looks like the SRM on the latent scale.
    fit_srm <- ame(Y,
      model="bin"  # lots of model options here
    )
If we go to the iid setup this is just an Erdős-Rényi model:
    fit_srg <- ame(Y, model="bin",
      rvar=FALSE, cvar=FALSE, dcor=FALSE)
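
The iid special case is easy to see by simulation: with no row, column or dyadic dependence, every tie is an independent coin flip with probability Φ(μ). A base R sketch, with the intercept `mu` and unit error variance chosen arbitrarily:

```r
set.seed(1)
n <- 200
mu <- -1                                 # assumed intercept
Z <- mu + matrix(rnorm(n * n), n, n)     # iid latent values, unit variance
Y <- 1 * (Z > 0)                         # probit link: y = 1(z > 0)
diag(Y) <- NA
# Observed density versus the tie probability implied by the model
dens <- mean(Y, na.rm = TRUE)
c(observed = dens, implied = pnorm(mu))
```

With unit error variance the implied tie probability is `pnorm(mu)`, so the two printed numbers should agree up to sampling noise.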

Even more general
Consider the following generative model:
$$z_{ij} = u_i^T D v_j + \epsilon_{ij}, \qquad y_{ij} = g(z_{ij})$$
- $u_i$ are latent factors describing i as a sender
- $v_j$ are latent factors describing j as a receiver
- $D$ is a matrix of factor weights
- $g$ is an increasing function mapping the latent space to the observed space.
  (Some g's... Normal: $g(z) = z$, binomial: $g(z) = 1(z > 0)$)

This works for symmetric matrices too
Imagine that $y_{ij} = y_{ji}$; then the model looks like:
$$z_{ij} = u_i^T \Lambda u_j + \epsilon_{ij}, \qquad y_{ij} = g(z_{ij})$$
- $u_i \approx u_j$ represents stochastic equivalence
- $\Lambda$ is a matrix of eigenvalues: positive $\lambda_i$ imply homophily, negative ones imply heterophily.
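
A tiny noise-free example of the sign interpretation, in base R. The one-dimensional positions `u` and the two groups are hypothetical; with a single latent dimension, Λ is just a scalar λ.

```r
# Two groups placed at +1 and -1 on a single latent dimension
u <- c(1, 1, 1, -1, -1, -1)
group <- c(1, 1, 1, 2, 2, 2)
same <- outer(group, group, "==")
off  <- row(same) != col(same)
# Noise-free latent scores z_ij = u_i * lambda * u_j
z_pos <- outer(u, u) * 1       # lambda = +1
z_neg <- outer(u, u) * (-1)    # lambda = -1
# lambda > 0: within-group scores exceed between-group scores (homophily)
gap_pos <- mean(z_pos[same & off]) - mean(z_pos[!same])
# lambda < 0: the ordering flips (heterophily)
gap_neg <- mean(z_neg[same & off]) - mean(z_neg[!same])
c(gap_pos = gap_pos, gap_neg = gap_neg)
```

Nodes in the same group have identical positions here, so they also relate to everyone else in the same way, which is the stochastic equivalence reading of $u_i \approx u_j$.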

What is this latent space?
Problem 1: need to select a dimension R.
- This is hard... sometimes there is some intuition.
Problem 2: should the latent positions be interpreted?
- Unclear; maybe think of the distances in this space...
Problem 3: what about my favorite other models like stochastic blockmodels?
- These are just a subclass of models.
- For example, the stochastic blockmodel has discrete support for the latent positions.

What is this latent space?
All quotes from Hoff et al. (2002):
- "A subset of individuals in the population with a large number of social ties between them may be indicative of a group of individuals who have nearby positions in this space of characteristics, or social space."
- "Various concepts of social space have been discussed by McFarland and Brown (1973) and Faust (1988). In the context of this article, social space refers to a space of unobserved latent characteristics that represent potential transitive tendencies in network relations."
- "A probability measure over these unobserved characteristics induces a model in which the presence of a tie between two individuals is dependent on the presence of other ties."

(Tiny portion of the) literature
- Nowicki, Krzysztof, and Tom A. B. Snijders. Estimation and prediction for stochastic blockstructures. Journal of the American Statistical Association 96, no. 455 (2001).
- Hoff, Peter D., Adrian E. Raftery, and Mark S. Handcock. Latent space approaches to social network analysis. Journal of the American Statistical Association 97, no. 460 (2002).
- Hoff, Peter. Modeling homophily and stochastic equivalence in symmetric relational data. In Advances in Neural Information Processing Systems (2008).
- Airoldi, Edoardo M., David M. Blei, Stephen E. Fienberg, and Eric P. Xing. Mixed membership stochastic blockmodels. Journal of Machine Learning Research 9, no. Sep (2008).
- Hoff, Peter, Bailey Fosdick, Alex Volfovsky, and Katherine Stovel. Likelihoods for fixed rank nomination networks. Network Science 1, no. 03 (2013).
- Hoff, Peter D. Dyadic data analysis with amen. arXiv preprint arXiv:1506.08237 (2015).

    ame(Y, Xdyad=NULL, Xrow=NULL, Xcol=NULL,
      rvar = !(model=="rrl"), cvar = TRUE, dcor = !symmetric, nvar = TRUE,
      R = 0, model="nrm",
      intercept=!is.element(model,c("rrl","ord")),
      symmetric=FALSE,
      odmax=rep(max(apply(Y>0,1,sum,na.rm=TRUE)),nrow(Y)), ...)
- Y: an n x n square relational matrix of relations
- Xdyad: an n x n x pd array of covariates
- Xrow: an n x pr matrix of nodal row covariates
- Xcol: an n x pc matrix of nodal column covariates
- rvar: logical: fit row random effects (asymmetric case)?
- cvar: logical: fit column random effects (asymmetric case)?
- dcor: logical: fit a dyadic correlation (asymmetric case)?
- nvar: logical: fit nodal random effects (symmetric case)?
- R: int: dimension of the multiplicative effects (can be 0)
- model: char: one of "nrm", "bin", "ord", "cbin", "frn", "rrl"
- odmax: a scalar integer or vector of length n giving the maximum number of nominations that each node may make

What's in the ...?
    seed = 1, nscan = 10000, burn = 500, odens = 25,
    plot=TRUE, print = TRUE, gof=TRUE
- seed: random seed
- nscan: number of iterations of the Markov chain (beyond burn-in)
- burn: burn-in for the Markov chain
- odens: output density for the Markov chain
- plot: logical: plot results while running?
- print: logical: print results while running?
- gof: logical: calculate goodness of fit statistics?

An AddHealth Example

Social network data
- Datasets: PROSPER, NSCR, AddHealth
- Relate network characteristics to individual-level behavior
- Literature: ERGM, latent variable models
- Assumptions:
  - Data is fully observed
  - The support is the set of all sociomatrices
- In practice:
  - Ranked data
  - Censored observations
A type of likelihood that accommodates the ranked and censored nature of data from Fixed Rank Nomination (FRN) surveys and allows for estimation of regression effects.

Data collection examples
- PROmoting School Community-University Partnerships to Enhance Resilience (PROSPER): "Who are your best and closest friends in your grade?"
- National Longitudinal Study of Adolescent to Adult Health (AddHealth): "Your male friends. List your closest male friends. List your best male friend first, then your next best friend, and so on."

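Notation
- $Z = \{z_{ij} : i \ne j\}$ is a sociomatrix of ordinal relationships.
- $z_{ij} > z_{ik}$ denotes person i preferring person j to person k.
$$Z = \begin{pmatrix} - & z_{12} & \cdots & z_{1n} \\ z_{21} & - & \cdots & z_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ z_{n1} & z_{n2} & \cdots & - \end{pmatrix}$$
- Instead of Z we observe a sociomatrix $Y = \{y_{ij} : i \ne j\}$.
- Different sampling schemes define different maps between Y and Z (set relations between $y_{ij}$ and $z_{ij}$).
- A statistical model $\{p(Z \mid \theta) : \theta \in \Theta\}$ assists in analysis.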

Fixed rank nominations
F(Y) is the set of Z satisfying
- $y_{ij} > y_{ik} \Rightarrow z_{ij} > z_{ik}$
- $y_{ij} = 0$ and $d_i < m \Rightarrow z_{ij} \le 0$
- $y_{ij} > 0 \Rightarrow z_{ij} > 0$
(The relation $y_{ij} = 0 \Rightarrow z_{ij} < 0$ belongs to the binary set B(Y), not to F(Y).)
m = maximal number of nominations, $d_i$ = individual outdegree
- Differentiates between different ranks
- Captures censoring in the data
Example rows $y_i \to z_i$:
- Four ranked friends, six unranked: $z_{i1} > z_{i2} > z_{i3} > z_{i4} > 0$, and $0 > z_{ij}$ for each of the six unranked j
- Five ranked friends, five unranked: $z_{i1} > z_{i2} > z_{i3} > z_{i4} > z_{i5} >$ ? ? ? ? ? (each unranked value lies below $z_{i5}$, with its sign unknown)
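
The map from a latent row to an FRN response can be sketched as a small base R function. The scoring convention (the best friend receives the highest score, unranked people receive 0) is an assumption chosen to agree with $y_{ij} > y_{ik} \Rightarrow z_{ij} > z_{ik}$; `frn_row` and the example values are hypothetical.

```r
# Map one latent row z (self excluded) to an FRN response with at most m
# nominations. Higher y = more preferred nominee; 0 = not nominated.
frn_row <- function(z, m) {
  y <- rep(0, length(z))
  pos <- which(z > 0)                            # only positive relations qualify
  top <- pos[order(z[pos], decreasing = TRUE)]   # best first
  top <- head(top, m)                            # censor at m nominations
  y[top] <- rev(seq_along(top))                  # best nominee gets the top score
  y
}

z <- c(2.0, -1.0, 0.5, 3.1, 1.2, -0.3)
y_uncensored <- frn_row(z, m = 5)   # d_i = 4 < m: every zero reflects z <= 0
y_censored   <- frn_row(z, m = 2)   # d_i = m = 2: zeros at 3 and 5 hide z > 0
rbind(y_uncensored, y_censored)
```

In the censored row, positions 3 and 5 are recorded as 0 even though their latent values are positive, which is exactly the information the condition $d_i < m$ is there to protect.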

Rank
R(Y) is the set of Z satisfying only
- $y_{ij} > y_{ik} \Rightarrow z_{ij} > z_{ik}$
Valid but not fully informative: $F(Y) \subset R(Y)$
Cannot estimate row ("sender") specific effects
Example rows $y_i \to z_i$:
- Four ranked friends, six unranked: $z_{i1} > z_{i2} > z_{i3} > z_{i4} >$ ? ? ? ? ? ?
- Five ranked friends, five unranked: $z_{i1} > z_{i2} > z_{i3} > z_{i4} > z_{i5} >$ ? ? ? ? ?

Binary
B(Y) is the set of Z satisfying
- $y_{ij} > 0 \Rightarrow z_{ij} > 0$
- $y_{ij} = 0 \Rightarrow z_{ij} < 0$
Neither fully informative nor valid
- Discards information on the ranks
- Ignores the censoring on the outdegrees
- In particular: $F(Y) \not\subset B(Y)$
Example rows $y_i \to z_i$:
- Four nominations, six zeros: $z_{ij} > 0$ for the four nominated j and $0 > z_{ij}$ for the other six
- Five nominations, five zeros: $z_{ij} > 0$ for the five nominated j and $0 > z_{ij}$ for the other five
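
The three sets can be compared on a single row with a few base R predicates. This is a sketch under assumptions: `in_F`, `in_R`, `in_B` and the example values are hypothetical names and numbers, and the row `y` is what a respondent with latent row `z` would report when at most m = 2 nominations are allowed (best friend scored highest).

```r
# z: latent row; y: observed FRN row (higher = more preferred, 0 = unranked)
rank_ok <- function(z, y) {
  higher <- outer(y, y, ">")        # pairs (j, k) with y_j > y_k
  all(outer(z, z, ">")[higher])     # each such pair must have z_j > z_k
}
in_R <- function(z, y) rank_ok(z, y)
in_F <- function(z, y, m) {
  d <- sum(y > 0)
  rank_ok(z, y) && all(z[y > 0] > 0) && (d >= m || all(z[y == 0] <= 0))
}
in_B <- function(z, y) all(z[y > 0] > 0) && all(z[y == 0] < 0)

z <- c(2.0, -1.0, 0.5, 3.1, 1.2, -0.3)   # true latent row
y <- c(1, 0, 0, 2, 0, 0)                 # its FRN response when m = 2
res <- c(F = in_F(z, y, m = 2), R = in_R(z, y), B = in_B(z, y))
res
```

The true latent row lies in F(Y) and R(Y) but not in B(Y): the binary set wrongly forces the censored positive values at positions 3 and 5 to be negative, which is what "not valid" means here.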

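Bayesian Estimation for Fixed Rank Nominations
- Model: $Z \sim p(Z \mid \theta)$, $\theta \in \Theta$
- Data: $Z \in F(Y)$
- Likelihood: $L_F(\theta : Y) = \Pr(Z \in F(Y) \mid \theta) = \int_{F(Y)} dP(Z \mid \theta)$
- Estimation: given $p(\theta)$, $p(\theta \mid Z \in F(Y))$ can be approximated by a Gibbs sampler.
Simulate $z_{ij} \sim p(z_{ij} \mid \theta, Z_{-ij}, Z \in F(Y))$:
1. $y_{ij} > 0$: $z_{ij} \sim p(z_{ij} \mid \theta, Z_{-ij})\, 1(z_{ij} \in (a, b))$, where $a = \max(z_{ik} : y_{ik} < y_{ij})$ and $b = \min(z_{ik} : y_{ik} > y_{ij})$.
2. $y_{ij} = 0$ and $d_i < m$: $z_{ij} \sim p(z_{ij} \mid Z_{-ij}, \theta)\, 1(z_{ij} \le 0)$.
3. $y_{ij} = 0$ and $d_i = m$: $z_{ij} \sim p(z_{ij} \mid Z_{-ij}, \theta)\, 1(z_{ij} \le \min(z_{ik} : y_{ik} > 0))$.
Allows for imputation of missing $y_{ij}$.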
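
The three cases translate into a truncation interval for each $z_{ij}$, followed by a draw from a truncated distribution. The base R sketch below assumes the untruncated conditional is normal and uses placeholder values for its mean and standard deviation; in an actual sampler those would come from $p(z_{ij} \mid \theta, Z_{-ij})$. The names `frn_bounds` and `rtnorm1` are hypothetical.

```r
# Truncation interval for z_ij given the current row zi, the FRN row yi
# and the nomination cap m, following the three cases above
frn_bounds <- function(zi, yi, j, m) {
  d <- sum(yi > 0)
  if (yi[j] > 0) {                       # case 1: ranked
    below <- zi[yi < yi[j]]
    above <- zi[yi > yi[j]]
    lower <- if (length(below)) max(below) else -Inf
    upper <- if (length(above)) min(above) else Inf
  } else if (d < m) {                    # case 2: unranked, not censored
    lower <- -Inf; upper <- 0
  } else {                               # case 3: unranked, censored
    lower <- -Inf; upper <- min(zi[yi > 0])
  }
  c(lower = lower, upper = upper)
}

# Inverse-CDF draw from a normal truncated to (lower, upper)
rtnorm1 <- function(mean, sd, lower, upper) {
  p <- runif(1, pnorm(lower, mean, sd), pnorm(upper, mean, sd))
  qnorm(p, mean, sd)
}

set.seed(1)
zi <- c(2.0, -1.0, 0.5, 3.1, 1.2, -0.3)
yi <- c(1, 0, 0, 2, 0, 0)
bd <- frn_bounds(zi, yi, j = 1, m = 2)    # ranked: between 1.2 and 3.1
# Placeholder conditional mean and sd (assumed, not derived from a model)
z_new <- rtnorm1(mean = 1, sd = 1, lower = bd[["lower"]], upper = bd[["upper"]])
c(bd, draw = z_new)
```

Cycling this update over all entries keeps Z inside F(Y) at every iteration.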

Simulations
We generated Z from the following Social Relations Model (Warner, Kenny and Stoto (1979)):
$$z_{ij} = \beta^T x_{ij} + a_i + b_j + \epsilon_{ij}$$
$$(a_i, b_i) \overset{iid}{\sim} \text{normal}(0, \Sigma_{ab}), \qquad (\epsilon_{ij}, \epsilon_{ji}) \overset{iid}{\sim} \text{normal}(0, \Sigma_e)$$
Mean model:
$$\beta^T x_{ij} = \beta_0 + \beta_r x_{ir} + \beta_c x_{jc} + \beta_{d1} x_{ij1} + \beta_{d2} x_{ij2}$$
- $x_{ir}$, $x_{jc}$: individual level variables
- $x_{ij1}$: pair specific variable
- $x_{ij2}$: co-membership in a group
- $\beta_r = \beta_c = \beta_{d1} = \beta_{d2} = 1$ and $\beta_0 = 3.26$
- $x_{ir}, x_{ic}, x_{ij1} \overset{iid}{\sim} N(0, 1)$
- $x_{ij2} = s_i s_j / .42$ for $s_i \overset{iid}{\sim}$ binary(1/2)

Simulations - Censoring
Simulations for each $m \in \{5, 15\}$ with 100 nodes each.
[Figure: confidence intervals by simulation for $\beta_r$, $\beta_c$, $\beta_{d1}$ and $\beta_{d2}$, with one panel per value of m. The groups of three CIs are based on binary, FRN and rank likelihoods, from left to right.]

Simulations - Censoring
[Figure: confidence intervals for $\beta_r$, $\beta_c$ and $\beta_d$, with one panel per value of m]
$$Z \in R(Y) \Rightarrow Z + c 1^T \in R(Y) \quad \forall c \in \mathbb{R}^n$$
- Rank likelihood cannot estimate row effects
- Binary likelihood poorly estimates row effects
  - Large amount of censoring
  - Heterogeneity of censored outdegrees is low
  - Regression coefficients estimated too low
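
The invariance behind the first bullet is easy to verify numerically: adding a separate constant to each row of Z leaves every within-row ordering unchanged. A base R sketch with arbitrary simulated values (`row_ranks` is a hypothetical helper):

```r
set.seed(1)
n <- 5
Z <- matrix(rnorm(n * n), n, n)
c_shift <- rnorm(n)              # arbitrary row-specific constants
# Recycling down columns adds c_shift[i] to every entry of row i,
# i.e. this is Z + c 1^T
Z_shift <- Z + c_shift
# Within-row ranks of each matrix
row_ranks <- function(M) t(apply(M, 1, rank))
same_ranks <- identical(row_ranks(Z), row_ranks(Z_shift))
same_ranks
```

Since both matrices imply the same ranks, a likelihood built only on within-row orderings carries no information about row-level shifts such as sender effects.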

Simulations - Censoring
[Figure: confidence intervals for a dyadic coefficient by simulation, with one panel per value of m]
Recall: $x_{ij2} \propto s_i s_j$, an indicator of co-membership in a group.
- Ignore the censoring
- Binary likelihood underestimates row variability
- Underestimate the variability in $x_{ij2}$

Simulations - information in the ranks
Let C(Y) be the set of values for which the following is true:
- $y_{ij} > 0 \Rightarrow z_{ij} > 0$
- $y_{ij} = 0$ and $d_i < m \Rightarrow z_{ij} \le 0$
- $\min\{z_{ij} : y_{ij} > 0\} \ge \max\{z_{ij} : y_{ij} = 0\}$
We refer to $L_C(\theta : Y) = \Pr(Z \in C(Y) \mid \theta)$ as the censored binary likelihood.
- Recognizes censoring but ignores information in the ranks
- Performs similarly to FRN in the previous study
- Less precise than FRN when m is big
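
What "ignores information in the ranks" means can be seen on one row. In this base R sketch, `in_C` and `in_F` are hypothetical predicates written from the set definitions, applied within a single row that is assumed to contain at least one nominated and one unnominated person; the numbers are made up.

```r
# z: latent row; y: observed FRN row (higher = more preferred, 0 = unranked)
rank_ok <- function(z, y) all(outer(z, z, ">")[outer(y, y, ">")])
in_F <- function(z, y, m) {
  d <- sum(y > 0)
  rank_ok(z, y) && all(z[y > 0] > 0) && (d >= m || all(z[y == 0] <= 0))
}
in_C <- function(z, y, m) {
  d <- sum(y > 0)
  all(z[y > 0] > 0) &&
    (d >= m || all(z[y == 0] <= 0)) &&
    min(z[y > 0]) >= max(z[y == 0])
}

y <- c(1, 0, 0, 2, 0, 0)                        # FRN row with m = 2
z_true    <- c(2.0, -1.0, 0.5, 3.1, 1.2, -0.3)  # agrees with the ranks
z_swapped <- c(3.1, -1.0, 0.5, 2.0, 1.2, -0.3)  # nominees in the wrong order
res <- rbind(
  true    = c(F = in_F(z_true, y, 2),    C = in_C(z_true, y, 2)),
  swapped = c(F = in_F(z_swapped, y, 2), C = in_C(z_swapped, y, 2))
)
res
```

C(Y) accepts both rows because it only separates nominated from unnominated people, while F(Y) rejects the row whose nominees are ordered inconsistently with the reported ranks.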

Simulations - information in the ranks
Same setup as before, but average uncensored outdegree is m.
[Figure: relative concentration around the true value, plotted against m, for $\beta_r$ (row), $\beta_c$ (column), $\beta_{d1}$ (continuous dyad) and $\beta_{d2}$ (co-membership); averaged across eight simulated datasets for each $m \in \{5, 15, 30, 50\}$.]
Relative concentration around the true value of each parameter, measured by
$$E\left[(\beta - 1)^2 \mid F(Y)\right] \,/\, E\left[(\beta - 1)^2 \mid C(Y)\right]$$
for each β.
When $m \ll n$, most of the information is found by considering ranked/unranked individuals as groups rather than the relative ordering of the ranked individuals.
As the censored binomial likelihood recognizes the censoring in the data, we expect it to provide parameter estimates that do not have the biases of the binomial likelihood estimators. On the other hand, $L_C$ ignores the information in the ranks of the scored individuals, and so we might expect it to provide less precise estimates than the FRN likelihood.

AddHealth Data - Results
[Figure: confidence intervals for the coefficients intercept, rsmoke, rdrink, rgpa, csmoke, cdrink, cgpa, dsmoke, ddrink, dgpa, dacad, darts, dsport, dcivic, dgrade and drace]
- 646 females were asked to rank up to 5 female friends.
- Mean model with row, column and dyadic effects for smoking, drinking and gpa, as well as dyadic effects for co-membership in activities and grade, and a similarity-in-race measure.
- The CIs are based on binary, FRN and rank likelihoods.
