Curved exponential family models for networks

1 Curved exponential family models for networks
David R. Hunter, Penn State University
Mark S. Handcock, University of Washington
February 18, 2005
Available online as Penn State Dept. of Statistics Technical report from
Statistical Models, Sunbelt 2005 p. 1

2 Overview
This talk focuses on the alternating k-star and alternating k-triangle statistics of Snijders et al.
General goals:
1. Present equivalent (and preferable) formulations of these statistics:
   alternating k-stars: $d_G(y; \theta)$
   alternating k-triangles: $p_G(y; \theta)$
2. Introduce the mathematical issues that make model-fitting particularly challenging when using these statistics.
Snijders et al. is W.P. # 42 at

4 Alternating k-star statistic
The alternating k-star statistic is defined as
$$ s_2(y) - \frac{s_3(y)}{\gamma} + \cdots + (-1)^{n-1} \frac{s_{n-1}(y)}{\gamma^{n-3}}, $$
where $s_k(y)$ denotes the number of k-stars in the network y.
Consider the γ parameter. What does it do? How do we choose it?
Note in particular that if the alternating k-star statistic is used in a model, γ enters in a nonlinear way.

10 A small undirected network
Degree distribution: $(d_0, \ldots, d_{n-1}) = (0, 1, 1, 3, 0)$
k-star distribution: $(s_1, \ldots, s_{n-1}) = (6, 10, 3, 0)$
Edgewise shared partner distribution: $(p_0, \ldots, p_{n-2}) = (1, 4, 1, 0)$
k-triangle distribution: $(t_1, \ldots, t_{n-2}) = (2, 1, 0)$
The relationship between edgewise shared partners and k-triangles is analogous to the relationship between degrees and k-stars (and also to the relationship between dyadic shared partners and alternating independent 2-paths).
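The four distributions on this slide are linked by simple counting. The sketch below assumes the usual relationships: a node of degree i sits at the centre of C(i, k) k-stars, an edge with i shared partners is the base of C(i, k) k-triangles, each edge is counted from both of its endpoints, and each triangle from all three of its edges. The function names are illustrative and do not come from the talk.

```python
from math import comb


def k_star_counts(degree_dist):
    """degree_dist[i] = number of nodes of degree i, for i = 0..n-1.

    Returns [s_1, ..., s_{n-1}].
    """
    n = len(degree_dist)
    counts = []
    for k in range(1, n):
        total = sum(comb(i, k) * d for i, d in enumerate(degree_dist))
        if k == 1:
            total //= 2  # every edge is seen once from each endpoint
        counts.append(total)
    return counts


def k_triangle_counts(esp_dist):
    """esp_dist[i] = number of edges with i shared partners, for i = 0..n-2.

    Returns [t_1, ..., t_{n-2}].
    """
    m = len(esp_dist)
    counts = []
    for k in range(1, m):
        total = sum(comb(i, k) * p for i, p in enumerate(esp_dist))
        if k == 1:
            total //= 3  # every triangle is seen once from each of its edges
        counts.append(total)
    return counts


print(k_star_counts([0, 1, 1, 3, 0]))   # [6, 10, 3, 0]
print(k_triangle_counts([1, 4, 1, 0]))  # [2, 1, 0]
```

Both printed lists agree with the k-star and k-triangle distributions given on the slide.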

12 Rewriting alternating k-stars
The alternating k-star statistic
$$ s_2(y) - \frac{s_3(y)}{\gamma} + \cdots + (-1)^{n-1} \frac{s_{n-1}(y)}{\gamma^{n-3}} $$
may be rewritten (brace yourself):
$$ d_G(y; \theta) = 2 e^{\theta} s_1(y) - \sum_{i=1}^{n-1} e^{2\theta} \left[ 1 - \left( 1 - e^{-\theta} \right)^{i} \right] d_i(y), $$
where:
γ is replaced by $e^{\theta}$ (to ensure γ > 0);
$s_k(y)$ = # of k-stars in the graph y (in particular, $s_1$ = # of edges);
$d_k(y)$ = # of nodes of degree k in y.
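The equivalence can be checked numerically on the small network from the earlier slide. This is a sketch that reads the rewritten form as 2e^θ s_1(y) minus the weighted degree sum, with θ = log γ. The function names and the choice γ = 2 are illustrative only.

```python
import math


def alternating_k_star(s, gamma):
    """s = [s_1, ..., s_{n-1}]; sum over k = 2..n-1 of (-1)^k s_k / gamma^(k-2)."""
    return sum((-1) ** k * s[k - 1] / gamma ** (k - 2)
               for k in range(2, len(s) + 1))


def d_G(d, s1, theta):
    """d = [d_0, ..., d_{n-1}] (degree distribution), s1 = number of edges."""
    r = 1.0 - math.exp(-theta)
    weighted = sum(math.exp(2.0 * theta) * (1.0 - r ** i) * d[i]
                   for i in range(1, len(d)))
    return 2.0 * math.exp(theta) * s1 - weighted


gamma = 2.0
stars = [6, 10, 3, 0]       # (s_1, ..., s_4) from the small network
degrees = [0, 1, 1, 3, 0]   # (d_0, ..., d_4) from the small network

lhs = alternating_k_star(stars, gamma)
rhs = d_G(degrees, stars[0], math.log(gamma))
print(math.isclose(lhs, rhs))  # True
```

For γ = 2 both sides come to 10 - 3/2 = 8.5, so the degree-based form reproduces the alternating sum without ever counting k-stars.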

14 Alternating k-triangle statistic, rewritten
The alternating k-triangle statistic of Snijders et al. (2004) is
$$ 3 t_1(y) - \frac{t_2(y)}{\gamma} + \cdots + (-1)^{n-1} \frac{t_{n-2}(y)}{\gamma^{n-3}}. $$
In analogy with the alternating k-star case, we rewrite:
$$ p_G(y; \theta) = \sum_{i=1}^{n-2} e^{\theta} \left\{ 1 - \left( 1 - e^{-\theta} \right)^{i} \right\} p_i(y), $$
where:
γ is replaced by $e^{\theta}$ (to ensure γ > 0);
$t_k(y)$ = # of k-triangles in the graph y;
$p_k(y)$ = # of edges in y whose endpoints have exactly k shared partners.
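The same kind of numerical check works for the k-triangle version, using the edgewise shared partner and k-triangle distributions of the small network shown earlier. This is only a sketch: the function names and the value γ = 2 are chosen for illustration.

```python
import math


def alternating_k_triangle(t, gamma):
    """t = [t_1, ..., t_{n-2}]; 3 t_1 - t_2/gamma + t_3/gamma^2 - ..."""
    tail = sum((-1) ** (k + 1) * t[k - 1] / gamma ** (k - 1)
               for k in range(2, len(t) + 1))
    return 3.0 * t[0] + tail


def p_G(p, theta):
    """p = [p_0, ..., p_{n-2}] (edgewise shared partner distribution)."""
    r = 1.0 - math.exp(-theta)
    return sum(math.exp(theta) * (1.0 - r ** i) * p[i]
               for i in range(1, len(p)))


gamma = 2.0
triangles = [2, 1, 0]   # (t_1, t_2, t_3) from the small network
esp = [1, 4, 1, 0]      # (p_0, ..., p_3) from the small network

lhs = alternating_k_triangle(triangles, gamma)
rhs = p_G(esp, math.log(gamma))
print(math.isclose(lhs, rhs))  # True
```

With γ = 2 both expressions equal 6 - 1/2 = 5.5.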

15 An important question
We have shown that
   the alternating k-star statistic is the same as $d_G(y; \theta)$,
   the alternating k-triangle statistic is the same as $p_G(y; \theta)$,
where θ = log γ.
Suppose we wish to include $d_G(y; \theta_1)$ and/or $p_G(y; \theta_2)$ in an ERGM, but we wish to estimate $\theta_1$ and $\theta_2$ rather than fix them in advance. How do we do it?

17 ERGM specification
For a (random, as-yet-unobserved) graph Y, we assume
$$ P(Y = y) = \frac{\exp\{\eta^{T} g(y)\}}{c(\eta)} = \frac{\exp\{\eta_1 g_1(y) + \cdots + \eta_p g_p(y)\}}{c(\eta)} $$
for all possible realizations y.
The name ERGM (exponential random graph model) arises because this model is based on a statistical exponential family.
As usual, g(y) is a vector of statistics to be specified by the modeler (and p is the number of statistics). The vector η is sometimes called the canonical parameter.
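For a very small node set the normalizing constant c(η) can be computed by brute force, which makes the specification concrete. The sketch below is an illustration under assumptions that are not in the talk: four nodes, undirected graphs, and g(y) = (edges, triangles). Enumeration is hopeless for realistic networks, since the number of graphs doubles with every added dyad.

```python
import itertools
import math


def edges_and_triangles(n, edges):
    """Example statistic vector g(y) = (number of edges, number of triangles)."""
    triangles = sum(
        1 for a, b, c in itertools.combinations(range(n), 3)
        if {(a, b), (a, c), (b, c)} <= edges
    )
    return (len(edges), triangles)


def ergm_probabilities(n, eta, stats):
    """Enumerate every undirected graph on n nodes and return {graph: P(Y = graph)}."""
    dyads = list(itertools.combinations(range(n), 2))
    weights = {}
    for bits in itertools.product([0, 1], repeat=len(dyads)):
        edges = frozenset(d for d, b in zip(dyads, bits) if b)
        g = stats(n, edges)
        weights[edges] = math.exp(sum(e * x for e, x in zip(eta, g)))
    c = sum(weights.values())  # the normalizing constant c(eta)
    return {y: w / c for y, w in weights.items()}


probs = ergm_probabilities(4, (-1.0, 0.5), edges_and_triangles)
print(len(probs))  # 64 graphs on 4 nodes
```

Setting η = (0, 0) makes all 64 graphs equally likely, which is a quick sanity check on the enumeration.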

19 Not quite an ERGM?
The ERGM says
$$ P(Y = y) = \frac{\exp\{\eta_1 g_1(y) + \cdots + \eta_p g_p(y)\}}{c(\eta)}. $$
But suppose g(y) consists of only the statistic $p_G(y; \theta)$. Thus, we wish to estimate the parameters $\theta_1$ and $\theta_2$ in the model
$$ P(Y = y) = \frac{\exp\{\theta_1 \, p_G(y; \theta_2)\}}{c(\theta)}. $$
The second equation is not in ERGM form because the parameters are mixed up with the statistics!

20 η vs. θ
Earlier, we showed
$$ p_G(y; \theta) = \sum_{i=1}^{n-2} e^{\theta} \left\{ 1 - \left( 1 - e^{-\theta} \right)^{i} \right\} p_i(y). $$
Thus, the model we wish to fit turns into
$$ P(Y = y) = \frac{\exp\{\theta_1 \, p_G(y; \theta_2)\}}{c} = \frac{\exp\left[ \sum_{i=1}^{n-2} \theta_1 e^{\theta_2} \left\{ 1 - \left( 1 - e^{-\theta_2} \right)^{i} \right\} p_i(y) \right]}{c}. $$
Thus, we can write $\eta_i$ (the coefficient of the ith statistic) as a function of θ (the parameter vector we want to estimate).
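Reading the coefficients off the exponent gives η_i(θ) = θ_1 e^{θ_2} {1 - (1 - e^{-θ_2})^i} for i = 1, ..., n-2. The sketch below builds that vector and confirms that the linear form Σ η_i(θ) p_i(y) equals θ_1 p_G(y; θ_2) on the small network from the earlier slide. The function names and parameter values are illustrative.

```python
import math


def eta(theta1, theta2, n):
    """Canonical coefficients eta_1, ..., eta_{n-2} as a function of (theta1, theta2)."""
    r = 1.0 - math.exp(-theta2)
    return [theta1 * math.exp(theta2) * (1.0 - r ** i) for i in range(1, n - 1)]


def p_G(p, theta):
    """p = [p_0, ..., p_{n-2}] (edgewise shared partner distribution)."""
    r = 1.0 - math.exp(-theta)
    return sum(math.exp(theta) * (1.0 - r ** i) * p[i] for i in range(1, len(p)))


esp = [1, 4, 1, 0]                 # (p_0, ..., p_3) for the 5-node example
theta1, theta2 = 0.7, math.log(2.0)

coefficients = eta(theta1, theta2, n=5)
linear_form = sum(c * esp[i] for i, c in enumerate(coefficients, start=1))
print(math.isclose(linear_form, theta1 * p_G(esp, theta2)))  # True
```

Note that η_1(θ) = θ_1 regardless of θ_2, so θ_2 only shapes how the weight grows across higher shared-partner counts.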

23 Curved exponential family models
The original model has turned into
$$ P(Y = y) = \frac{\exp\{\eta_1(\theta) p_1(y) + \cdots + \eta_{n-2}(\theta) p_{n-2}(y)\}}{c[\eta(\theta)]}. $$
Thus, η is the vector of coefficients, whereas θ is the parameter vector to be estimated.
Often, the two vectors are the same, so this distinction is ignored.
But sometimes, η(θ) is a nonlinear function; the equation above imposes nonlinear constraints on η. In that case, statisticians call the model a curved exponential family.

26 Staving off death
The complication of a nonlinear constraint on η can actually be a good thing: What happens if we try to estimate the unconstrained η vector in
$$ P(Y = y) = \frac{\exp\{\eta_1 p_1(y) + \cdots + \eta_{n-2} p_{n-2}(y)\}}{c(\eta)}? $$
Answer: In the words of Pip Pattison, "Death By Parameter!"
Boiling the entire η vector down into a function of just the pair $(\theta_1, \theta_2)$ is actually healthy.
But nothing that's good for you is ever fun, so there's more work to be done at the estimation step. See paper for details.
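One ingredient that gradient-based fitting of a curved family generally needs is the Jacobian of η(θ), because the chain rule carries derivatives with respect to η back to θ. The sketch below is not the paper's algorithm. It only differentiates η_i(θ) = θ_1 e^{θ_2} {1 - (1 - e^{-θ_2})^i} by hand and compares the result with central finite differences. All names and parameter values are illustrative.

```python
import math


def eta(theta1, theta2, n):
    """Canonical coefficients eta_1, ..., eta_{n-2} as a function of (theta1, theta2)."""
    r = 1.0 - math.exp(-theta2)
    return [theta1 * math.exp(theta2) * (1.0 - r ** i) for i in range(1, n - 1)]


def eta_jacobian(theta1, theta2, n):
    """Rows are (d eta_i / d theta1, d eta_i / d theta2) for i = 1..n-2."""
    e = math.exp(theta2)
    r = 1.0 - math.exp(-theta2)
    rows = []
    for i in range(1, n - 1):
        w = e * (1.0 - r ** i)                      # equals eta_i / theta1
        rows.append((w, theta1 * (w - i * r ** (i - 1))))
    return rows


def finite_difference_jacobian(theta1, theta2, n, h=1e-6):
    """Central-difference approximation, used only as a check on eta_jacobian."""
    up1, dn1 = eta(theta1 + h, theta2, n), eta(theta1 - h, theta2, n)
    up2, dn2 = eta(theta1, theta2 + h, n), eta(theta1, theta2 - h, n)
    return [((a - b) / (2 * h), (c - d) / (2 * h))
            for a, b, c, d in zip(up1, dn1, up2, dn2)]


exact = eta_jacobian(0.7, 0.3, n=5)
approx = finite_difference_jacobian(0.7, 0.3, n=5)
print(all(math.isclose(x, y, abs_tol=1e-6)
          for row_e, row_a in zip(exact, approx)
          for x, y in zip(row_e, row_a)))  # True
```

The Jacobian has n - 2 rows but only two columns, which is the "boiling down" described on the slide: however many shared-partner statistics the model uses, only two free parameters are estimated.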

27 Lazega's lawyer collaboration data
[Figure: network plot of the lawyer collaboration data.]
Sizes indicate seniority (larger = more recent); colors indicate office location; F indicates female; shapes indicate practice (circle = litigation, square = corporate).

28 Coefficient estimates (generated by statnet)
[Table: estimate (est.) and standard error (s.e.) columns for Model 1 and Model 2; the numeric entries are missing from this transcription.]
Parameters listed: Alternating k-triangles; Rate of transitivity; Seniority main effect; Practice main effect; Same practice; Same gender; Same office.

29 Deviance analysis
[Table: columns Model, Residual Deviance, Deviance, Residual d.f., p-value; rows NULL, Covariates, Full model; the numeric entries are missing from this transcription.]
"Covariates": the model with only the covariate terms.
"Full model": the model with covariate terms plus $p_G(y; \theta)$.

34 Conclusion
Other things covered in our paper:
   A general formulation of the problem of fitting curved exponential family models
   Numerical algorithms for estimating curved EF parameters and their standard errors
   How to estimate likelihood ratio statistics and loglikelihoods using MCMC
Huge thanks to: Tom Snijders for extremely helpful suggestions about the manuscript; Steve Goodreau for blackboard brainstorming sessions; Martina Morris, Garry Robins, and Pip Pattison for insightful comments.

Specification and estimation of exponential random graph models for social (and other) networks

Specification and estimation of exponential random graph models for social (and other) networks Specification and estimation of exponential random graph models for social (and other) networks Tom A.B. Snijders University of Oxford March 23, 2009 c Tom A.B. Snijders (University of Oxford) Models for

More information

Assessing the Goodness-of-Fit of Network Models

Assessing the Goodness-of-Fit of Network Models Assessing the Goodness-of-Fit of Network Models Mark S. Handcock Department of Statistics University of Washington Joint work with David Hunter Steve Goodreau Martina Morris and the U. Washington Network

More information

Inference in curved exponential family models for networks

Inference in curved exponential family models for networks Inference in curved exponential family models for networks David R. Hunter, Penn State University Mark S. Handcock, University of Washington Penn State Department of Statistics Technical Report No. TR0402

More information

Modeling tie duration in ERGM-based dynamic network models

Modeling tie duration in ERGM-based dynamic network models University of Wollongong Research Online Faculty of Engineering and Information Sciences - Papers: Part A Faculty of Engineering and Information Sciences 2012 Modeling tie duration in ERGM-based dynamic

More information

Fast Maximum Likelihood estimation via Equilibrium Expectation for Large Network Data

Fast Maximum Likelihood estimation via Equilibrium Expectation for Large Network Data Fast Maximum Likelihood estimation via Equilibrium Expectation for Large Network Data Maksym Byshkin 1, Alex Stivala 4,1, Antonietta Mira 1,3, Garry Robins 2, Alessandro Lomi 1,2 1 Università della Svizzera

More information

Algorithmic approaches to fitting ERG models

Algorithmic approaches to fitting ERG models Ruth Hummel, Penn State University Mark Handcock, University of Washington David Hunter, Penn State University Research funded by Office of Naval Research Award No. N00014-08-1-1015 MURI meeting, April

More information

Goodness of Fit of Social Network Models

Goodness of Fit of Social Network Models Goodness of Fit of Social Network Models David R. HUNTER, StevenM.GOODREAU, and Mark S. HANDCOCK We present a systematic examination of a real network data set using maximum likelihood estimation for exponential

More information

Statistical Models for Social Networks with Application to HIV Epidemiology

Statistical Models for Social Networks with Application to HIV Epidemiology Statistical Models for Social Networks with Application to HIV Epidemiology Mark S. Handcock Department of Statistics University of Washington Joint work with Pavel Krivitsky Martina Morris and the U.

More information

Assessing Goodness of Fit of Exponential Random Graph Models

Assessing Goodness of Fit of Exponential Random Graph Models International Journal of Statistics and Probability; Vol. 2, No. 4; 2013 ISSN 1927-7032 E-ISSN 1927-7040 Published by Canadian Center of Science and Education Assessing Goodness of Fit of Exponential Random

More information

Goodness of Fit of Social Network Models 1

Goodness of Fit of Social Network Models 1 Goodness of Fit of Social Network Models David R. Hunter Pennsylvania State University, University Park Steven M. Goodreau University of Washington, Seattle Mark S. Handcock University of Washington, Seattle

More information

c Copyright 2015 Ke Li

c Copyright 2015 Ke Li c Copyright 2015 Ke Li Degeneracy, Duration, and Co-evolution: Extending Exponential Random Graph Models (ERGM) for Social Network Analysis Ke Li A dissertation submitted in partial fulfillment of the

More information

Bayesian Inference for Contact Networks Given Epidemic Data

Bayesian Inference for Contact Networks Given Epidemic Data Bayesian Inference for Contact Networks Given Epidemic Data Chris Groendyke, David Welch, Shweta Bansal, David Hunter Departments of Statistics and Biology Pennsylvania State University SAMSI, April 17,

More information

Conditional Marginalization for Exponential Random Graph Models

Conditional Marginalization for Exponential Random Graph Models Conditional Marginalization for Exponential Random Graph Models Tom A.B. Snijders January 21, 2010 To be published, Journal of Mathematical Sociology University of Oxford and University of Groningen; this

More information

Statistical Model for Soical Network

Statistical Model for Soical Network Statistical Model for Soical Network Tom A.B. Snijders University of Washington May 29, 2014 Outline 1 Cross-sectional network 2 Dynamic s Outline Cross-sectional network 1 Cross-sectional network 2 Dynamic

More information

TEMPORAL EXPONENTIAL- FAMILY RANDOM GRAPH MODELING (TERGMS) WITH STATNET

TEMPORAL EXPONENTIAL- FAMILY RANDOM GRAPH MODELING (TERGMS) WITH STATNET 1 TEMPORAL EXPONENTIAL- FAMILY RANDOM GRAPH MODELING (TERGMS) WITH STATNET Prof. Steven Goodreau Prof. Martina Morris Prof. Michal Bojanowski Prof. Mark S. Handcock Source for all things STERGM Pavel N.

More information

Instability, Sensitivity, and Degeneracy of Discrete Exponential Families

Instability, Sensitivity, and Degeneracy of Discrete Exponential Families Instability, Sensitivity, and Degeneracy of Discrete Exponential Families Michael Schweinberger Abstract In applications to dependent data, first and foremost relational data, a number of discrete exponential

More information

arxiv: v1 [stat.me] 3 Apr 2017

arxiv: v1 [stat.me] 3 Apr 2017 A two-stage working model strategy for network analysis under Hierarchical Exponential Random Graph Models Ming Cao University of Texas Health Science Center at Houston ming.cao@uth.tmc.edu arxiv:1704.00391v1

More information

Chaos, Complexity, and Inference (36-462)

Chaos, Complexity, and Inference (36-462) Chaos, Complexity, and Inference (36-462) Lecture 21 Cosma Shalizi 3 April 2008 Models of Networks, with Origin Myths Erdős-Rényi Encore Erdős-Rényi with Node Types Watts-Strogatz Small World Graphs Exponential-Family

More information

Chaos, Complexity, and Inference (36-462)

Chaos, Complexity, and Inference (36-462) Chaos, Complexity, and Inference (36-462) Lecture 21: More Networks: Models and Origin Myths Cosma Shalizi 31 March 2009 New Assignment: Implement Butterfly Mode in R Real Agenda: Models of Networks, with

More information

arxiv: v1 [stat.me] 12 Apr 2018

arxiv: v1 [stat.me] 12 Apr 2018 A New Generative Statistical Model for Graphs: The Latent Order Logistic (LOLOG) Model Ian E. Fellows Fellows Statistics, San Diego, United States E-mail: ian@fellstat.com arxiv:1804.04583v1 [stat.me]

More information

An approximation method for improving dynamic network model fitting

An approximation method for improving dynamic network model fitting University of Wollongong Research Online Faculty of Engineering and Information Sciences - Papers: Part A Faculty of Engineering and Information Sciences 2014 An approximation method for improving dynamic

More information

Exponential random graph models for the Japanese bipartite network of banks and firms

Exponential random graph models for the Japanese bipartite network of banks and firms Exponential random graph models for the Japanese bipartite network of banks and firms Abhijit Chakraborty, Hazem Krichene, Hiroyasu Inoue, and Yoshi Fujiwara Graduate School of Simulation Studies, The

More information

A multilayer exponential random graph modelling approach for weighted networks

A multilayer exponential random graph modelling approach for weighted networks A multilayer exponential random graph modelling approach for weighted networks Alberto Caimo 1 and Isabella Gollini 2 1 Dublin Institute of Technology, Ireland; alberto.caimo@dit.ie arxiv:1811.07025v1

More information

An Approximation Method for Improving Dynamic Network Model Fitting

An Approximation Method for Improving Dynamic Network Model Fitting DEPARTMENT OF STATISTICS The Pennsylvania State University University Park, PA 16802 U.S.A. TECHNICAL REPORTS AND PREPRINTS Number 12-04: October 2012 An Approximation Method for Improving Dynamic Network

More information

Generalized Exponential Random Graph Models: Inference for Weighted Graphs

Generalized Exponential Random Graph Models: Inference for Weighted Graphs Generalized Exponential Random Graph Models: Inference for Weighted Graphs James D. Wilson University of North Carolina at Chapel Hill June 18th, 2015 Political Networks, 2015 James D. Wilson GERGMs for

More information

Learning with Blocks: Composite Likelihood and Contrastive Divergence

Learning with Blocks: Composite Likelihood and Contrastive Divergence Arthur U. Asuncion 1, Qiang Liu 1, Alexander T. Ihler, Padhraic Smyth {asuncion,qliu1,ihler,smyth}@ics.uci.edu Department of Computer Science University of California, Irvine Abstract Composite likelihood

More information

Overview course module Stochastic Modelling

Overview course module Stochastic Modelling Overview course module Stochastic Modelling I. Introduction II. Actor-based models for network evolution III. Co-evolution models for networks and behaviour IV. Exponential Random Graph Models A. Definition

More information

A note on perfect simulation for exponential random graph models

A note on perfect simulation for exponential random graph models A note on perfect simulation for exponential random graph models A. Cerqueira, A. Garivier and F. Leonardi October 4, 017 arxiv:1710.00873v1 [stat.co] Oct 017 Abstract In this paper we propose a perfect

More information

Delayed Rejection Algorithm to Estimate Bayesian Social Networks

Delayed Rejection Algorithm to Estimate Bayesian Social Networks Dublin Institute of Technology ARROW@DIT Articles School of Mathematics 2014 Delayed Rejection Algorithm to Estimate Bayesian Social Networks Alberto Caimo Dublin Institute of Technology, alberto.caimo@dit.ie

More information

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models SCHOOL OF MATHEMATICS AND STATISTICS Linear and Generalised Linear Models Autumn Semester 2017 18 2 hours Attempt all the questions. The allocation of marks is shown in brackets. RESTRICTED OPEN BOOK EXAMINATION

More information

IV. Analyse de réseaux biologiques

IV. Analyse de réseaux biologiques IV. Analyse de réseaux biologiques Catherine Matias CNRS - Laboratoire de Probabilités et Modèles Aléatoires, Paris catherine.matias@math.cnrs.fr http://cmatias.perso.math.cnrs.fr/ ENSAE - 2014/2015 Sommaire

More information

Logistic regression. 11 Nov Logistic regression (EPFL) Applied Statistics 11 Nov / 20

Logistic regression. 11 Nov Logistic regression (EPFL) Applied Statistics 11 Nov / 20 Logistic regression 11 Nov 2010 Logistic regression (EPFL) Applied Statistics 11 Nov 2010 1 / 20 Modeling overview Want to capture important features of the relationship between a (set of) variable(s)

More information

Nonlinear Regression

Nonlinear Regression Nonlinear Regression 28.11.2012 Goals of Today s Lecture Understand the difference between linear and nonlinear regression models. See that not all functions are linearizable. Get an understanding of the

More information

Bayesian Analysis of Network Data. Model Selection and Evaluation of the Exponential Random Graph Model. Dissertation

Bayesian Analysis of Network Data. Model Selection and Evaluation of the Exponential Random Graph Model. Dissertation Bayesian Analysis of Network Data Model Selection and Evaluation of the Exponential Random Graph Model Dissertation Presented to the Faculty for Social Sciences, Economics, and Business Administration

More information

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1 Parametric Modelling of Over-dispersed Count Data Part III / MMath (Applied Statistics) 1 Introduction Poisson regression is the de facto approach for handling count data What happens then when Poisson

More information

Modeling of Dynamic Networks based on Egocentric Data with Durational Information

Modeling of Dynamic Networks based on Egocentric Data with Durational Information DEPARTMENT OF STATISTICS The Pennsylvania State University University Park, PA 16802 U.S.A. TECHNICAL REPORTS AND PREPRINTS Number 12-01: April 2012 Modeling of Dynamic Networks based on Egocentric Data

More information

A COMPARATIVE ANALYSIS ON COMPUTATIONAL METHODS FOR FITTING AN ERGM TO BIOLOGICAL NETWORK DATA A THESIS SUBMITTED TO THE GRADUATE SCHOOL

A COMPARATIVE ANALYSIS ON COMPUTATIONAL METHODS FOR FITTING AN ERGM TO BIOLOGICAL NETWORK DATA A THESIS SUBMITTED TO THE GRADUATE SCHOOL A COMPARATIVE ANALYSIS ON COMPUTATIONAL METHODS FOR FITTING AN ERGM TO BIOLOGICAL NETWORK DATA A THESIS SUBMITTED TO THE GRADUATE SCHOOL IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE MASTER

More information

Dynamic modeling of organizational coordination over the course of the Katrina disaster

Dynamic modeling of organizational coordination over the course of the Katrina disaster Dynamic modeling of organizational coordination over the course of the Katrina disaster Zack Almquist 1 Ryan Acton 1, Carter Butts 1 2 Presented at MURI Project All Hands Meeting, UCI April 24, 2009 1

More information

A Graphon-based Framework for Modeling Large Networks

A Graphon-based Framework for Modeling Large Networks A Graphon-based Framework for Modeling Large Networks Ran He Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Graduate School of Arts and Sciences COLUMBIA

More information

Parameterizing Exponential Family Models for Random Graphs: Current Methods and New Directions

Parameterizing Exponential Family Models for Random Graphs: Current Methods and New Directions Carter T. Butts p. 1/2 Parameterizing Exponential Family Models for Random Graphs: Current Methods and New Directions Carter T. Butts Department of Sociology and Institute for Mathematical Behavioral Sciences

More information

Using contrastive divergence to seed Monte Carlo MLE for exponential-family random graph models

Using contrastive divergence to seed Monte Carlo MLE for exponential-family random graph models University of Wollongong Research Online National Institute for Applied Statistics Research Australia Working Paper Series Faculty of Engineering and Information Sciences 2015 Using contrastive divergence

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models Generalized Linear Models - part II Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs.

More information

Point of intersection

Point of intersection Name: Date: Period: Exploring Systems of Linear Equations, Part 1 Learning Goals Define a system of linear equations and a solution to a system of linear equations. Identify whether a system of linear

More information

Single-level Models for Binary Responses

Single-level Models for Binary Responses Single-level Models for Binary Responses Distribution of Binary Data y i response for individual i (i = 1,..., n), coded 0 or 1 Denote by r the number in the sample with y = 1 Mean and variance E(y) =

More information

ST5215: Advanced Statistical Theory

ST5215: Advanced Statistical Theory Department of Statistics & Applied Probability Monday, September 26, 2011 Lecture 10: Exponential families and Sufficient statistics Exponential Families Exponential families are important parametric families

More information

Essentials of Intermediate Algebra

Essentials of Intermediate Algebra Essentials of Intermediate Algebra BY Tom K. Kim, Ph.D. Peninsula College, WA Randy Anderson, M.S. Peninsula College, WA 9/24/2012 Contents 1 Review 1 2 Rules of Exponents 2 2.1 Multiplying Two Exponentials

More information

An Introduction to Exponential-Family Random Graph Models

An Introduction to Exponential-Family Random Graph Models An Introduction to Exponential-Family Random Graph Models Luo Lu Feb.8, 2011 1 / 11 Types of complications in social network Single relationship data A single relationship observed on a set of nodes at

More information

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter

More information

Variational Inference (11/04/13)

Variational Inference (11/04/13) STA561: Probabilistic machine learning Variational Inference (11/04/13) Lecturer: Barbara Engelhardt Scribes: Matt Dickenson, Alireza Samany, Tracy Schifeling 1 Introduction In this lecture we will further

More information

Constrained and Unconstrained Optimization Prof. Adrijit Goswami Department of Mathematics Indian Institute of Technology, Kharagpur

Constrained and Unconstrained Optimization Prof. Adrijit Goswami Department of Mathematics Indian Institute of Technology, Kharagpur Constrained and Unconstrained Optimization Prof. Adrijit Goswami Department of Mathematics Indian Institute of Technology, Kharagpur Lecture - 01 Introduction to Optimization Today, we will start the constrained

More information

Joint Modeling of Longitudinal Item Response Data and Survival

Joint Modeling of Longitudinal Item Response Data and Survival Joint Modeling of Longitudinal Item Response Data and Survival Jean-Paul Fox University of Twente Department of Research Methodology, Measurement and Data Analysis Faculty of Behavioural Sciences Enschede,

More information

arxiv: v1 [stat.me] 1 Aug 2012

arxiv: v1 [stat.me] 1 Aug 2012 Exponential-family Random Network Models arxiv:208.02v [stat.me] Aug 202 Ian Fellows and Mark S. Handcock University of California, Los Angeles, CA, USA Summary. Random graphs, where the connections between

More information

Poisson Regression. Gelman & Hill Chapter 6. February 6, 2017

Poisson Regression. Gelman & Hill Chapter 6. February 6, 2017 Poisson Regression Gelman & Hill Chapter 6 February 6, 2017 Military Coups Background: Sub-Sahara Africa has experienced a high proportion of regime changes due to military takeover of governments for

More information

Introduction to Bayesian Statistics with WinBUGS Part 4 Priors and Hierarchical Models

Introduction to Bayesian Statistics with WinBUGS Part 4 Priors and Hierarchical Models Introduction to Bayesian Statistics with WinBUGS Part 4 Priors and Hierarchical Models Matthew S. Johnson New York ASA Chapter Workshop CUNY Graduate Center New York, NY hspace1in December 17, 2009 December

More information

Outline of GLMs. Definitions

Outline of GLMs. Definitions Outline of GLMs Definitions This is a short outline of GLM details, adapted from the book Nonparametric Regression and Generalized Linear Models, by Green and Silverman. The responses Y i have density

More information

Centre for Statistical and Survey Methodology. Working Paper
