Sampling and Estimation in Network Graphs
|
|
- Lorena Johnston
- 6 years ago
- Views:
Transcription
1 Sampling and Estimation in Network Graphs Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester March 23, 2017 Network Science Analytics Sampling and Estimation in Network Graphs 1
2 Network sampling Network sampling and challenges Background on statistical sampling theory Network graph sampling designs Estimation of network totals and group size Estimation of degree distributions Network Science Analytics Sampling and Estimation in Network Graphs 2
3 Sampling network graphs Measurements often gathered only from a portion of a complex system Ex: social study of high-school class vs. large corporation, Internet Network graph sample from a larger underlying network Goal: use sampled network data to infer properties of the whole system Approach using principles of statistical sampling theory Sampling in network contexts introduces various potential challenges System under study G(V, E) Population graph Random Procedure Available measurements G (V, E ) Sampled graph G often a subgraph of G (i.e., V V, E E), but may not be Network Science Analytics Sampling and Estimation in Network Graphs 3
4 The fundamental problem Suppose a given graph characteristic or summary η(g) is of interest Ex: order Nv, size N e, degree d v, clustering coefficient cl(g),... Typically impossible to recover η(g) exactly from G Q: Can we still form a useful estimate ˆη = ˆη(G ) of η(g)? Plug-in estimator ˆη := η(g ) Boils down to computing the characteristic of interest in G Many familiar estimators in statistical practice are of this type Ex: sample means, standard deviations, covariances, quantiles... Oftentimes η(g ) is a poor representation of η(g) Network Science Analytics Sampling and Estimation in Network Graphs 4
5 Example: Estimating average degreee Let G(V, E) be a network of protein interactions in yeast Characteristic of interest is average degree η(g) = 1 N v i V Here N v = 5, 151, N e = 31, 201 η(g) = Consider two sampling designs to obtain G First sample n vertices V = {i 1,..., i n } without replacement Design 1: For each i V, observe incident edges (i, j) E Design 2: Observe edge (i, j) only when both i, j V Estimate η(g) by averaging the observed degree sequence {d i } i V η(g ) = 1 n d i di i V Network Science Analytics Sampling and Estimation in Network Graphs 5
6 Example: Estimating average degreee (cont.) Random sample of n = 1, 500 vertices, Designs 1 and 2 for edges Process repeated for 10,000 trials histogram of η(g ) Density Design 2 Design Estimate of of average Average Degree degree Under-estimate η(g) for Design 2, but Design 1 on target. Why? Design 1: sample vertex degree explicitly, i.e., d i = d i Design 2: (implicitly) sample vertex degree with bias, i.e., d i n N v d i Network Science Analytics Sampling and Estimation in Network Graphs 6
7 Improving estimation accuracy In order to do better we need to incorporate the effects of Random sampling; and/or Measurement error Sampling design, topology of G, nature of η( ) all critical Model-based inference Likelihood-based and Bayesian paradigms Design-based methods Statistical sampling theory Assume observations made without measurement error Only source of randomness sampling procedure Ex: Estimating average degree Under Design 2 the estimate is biased, with mean of only Adjusting η(g ) upward by a factor Nv = yields 12,115 n Will see how statistical sampling theory justifies this correction Network Science Analytics Sampling and Estimation in Network Graphs 7
8 Background Network sampling and challenges Background on statistical sampling theory Network graph sampling designs Estimation of network totals and group size Estimation of degree distributions Network Science Analytics Sampling and Estimation in Network Graphs 8
9 Statistical sampling theory Suppose we have a population U = {1,..., N u } of N u units Ex: People, animals, objects, vertices,... A value y i is associated with each unit i U Ex: Height, age, gender, infected, membership,... Typical interest in the population totals τ and averages µ τ := y i and µ := 1 y i = 1 τ N u N u i U i U Basic sampling theory paradigm oriented around these steps: S1: Randomly sample n units S = {i 1,..., i n } from U S2: Observe the value y ik for k = 1,..., n S3: Form an unbiased estimator ˆµ of µ, i.e., E [ˆµ] = µ S4: Evaluate or estimate the variance var [ˆµ] Network Science Analytics Sampling and Estimation in Network Graphs 9
10 Inclusion probabilities Def: For given sampling design, the inclusion probability π i of unit i is π i := P (unit i belongs in the sample S) Simple random sampling (SRS): n units sampled uniformly form U Without replacement: i 1 chosen from U, i 2 from U \ {i 1 }, and so on There are ( N u ) n such possible samples of size n There are ( N u 1) n 1 samples which include a given unit i The inclusion probability is π i = ( Nu 1 n 1 ( Nu n ) ) = n N u Network Science Analytics Sampling and Estimation in Network Graphs 10
11 Sample mean estimator Definition of sample mean estimator ˆµ = 1 n Using indicator RVs I {i S} for i U, where E [I {i S}] = π i [ ] [ ] 1 1 N u E [ˆµ] = E y i = E y i I {i S} n n = 1 n N u i=1 i S i S y i i=1 y i E [I {i S}] = 1 n N u i=1 y i π i SRS without replacement unbiased because π i = n N u Unequal probability sampling More common than SRS, especially with networks. (More soon) Sample mean can be a poor (i.e., biased) estimator for µ Network Science Analytics Sampling and Estimation in Network Graphs 11
12 Horvitz-Thompson estimation for totals Idea: weighted average using inclusion probabilities as weights Horvitz-Thompson (HT) estimator ˆµ π = 1 N u i S y i π i and ˆτ π = N u ˆµ π Remedies the bias problem E [ˆµ π ] = 1 N u N u i=1 y i π i E [I {i S}] = 1 N u Size of the population N u assumed known N u i=1 y i = µ Broad applicability, but π i may be difficult to compute Network Science Analytics Sampling and Estimation in Network Graphs 12
13 Horvitz-Thompson estimator variance Def: Joint inclusion probability π ij of units i and j is π ij := P (units i and j belong in the sample S) If inclusion of units i and j are independent events π ij = π i π j Ex: Simple random sampling without replacement yields π ij = Variance of the HT estimator: var [ˆτ π ] = i U j U n(n 1) N u (N u 1) y i y j ( πij π i π j 1 ), var [ˆµ π ] = var [ˆτ π] N 2 u Typically estimated in an unbiased fashion from the sample S Network Science Analytics Sampling and Estimation in Network Graphs 13
14 Probability proportional to size sampling Unequal probability sampling n units selected w.r.t. a distribution {p 1,..., p Nu } on U Uniform sampling: special case with p i = 1 N u Probability proportional to size (PPS) sampling for all i U Probabilities p i proportional to a characteristic c i Ex: households chosen by drawing names from a database If sampling with replacement, PPS inclusion probabilities are π i = 1 (1 p i ) n, where p i = c i k c k Joint inclusion probabilities for variance calculations π ij = π i + π j [1 (1 p i p j ) n ] Network Science Analytics Sampling and Estimation in Network Graphs 14
15 Estimation of group size So far implicitly assumed N u known Often not the case! Ex: endangered animal species, people at risk of rare disease Special population total often of interest is the group size N u = 1 i U Suggests the following HT estimator of N u ˆN u = i S π 1 i Infeasible, since knowledge of N u needed to compute π i Network Science Analytics Sampling and Estimation in Network Graphs 15
16 Capture-recapture estimator Capture-recapture estimators overcome HT limitations in this setting Two rounds of SRS without replacement Two samples S 1, S 2 Round 1: Mark all units in sample S 1 of size n 1 from U Ex: tagging a fish, noting the ID number... All units in S1 are returned to the population Round 2: Obtain a sample S 2 of size n 2 from U Capture-recapture estimator of N u ˆN u := n 2 m n 1, where m := S 1 S 2 Factor m/n 2 indicative of marked fraction of the overall population Can derive using model-based arguments as an ML estimator Network Science Analytics Sampling and Estimation in Network Graphs 16
17 Common network graph sampling designs Network sampling and challenges Background on statistical sampling theory Network graph sampling designs Estimation of network totals and group size Estimation of degree distributions Network Science Analytics Sampling and Estimation in Network Graphs 17
18 Graph sampling designs Q: What are common designs for sampling a network graph G? A: Will see a few examples, along with their inclusion probabilities π i Graph-based sampling designs Two inter-related classes of units, vertices i and edges (i, j) Often two stages Selection among one class of units (e.g., vertices) Observation of units from the other class (e.g., edges) Inclusion probabilities offer insight into the nature of the designs Central to HT estimators of network graph characteristics η(g) Network Science Analytics Sampling and Estimation in Network Graphs 18
19 Induced subgraph sampling S) Sample n vertices V = {i 1,..., i n } without replacement (SRS) O) Observe edges (i, j) E only when both i, j V (induced by V ) Ex: construction of contact networks in social network research Vertex and edge inclusion probabilities are uniformly equal to π i = n N v and π {i,j} = n(n 1) N v (N v 1) Network Science Analytics Sampling and Estimation in Network Graphs 19
20 Incident subgraph sampling Consider a complementary design to induced subgraph sampling S) Sample n edges E without replacement (SRS) O) Observe vertices i V incident to those selected edges in E Ex: construction of sampled telephone call graphs Network Science Analytics Sampling and Estimation in Network Graphs 20
21 Inclusion probabilities For incident subgraph sampling, edge inclusion probabilities are π {i,j} = n N e Vertex in V if any one or more of its incident edges are sampled π i = P (vertex i is sampled) = 1 P (no edge incident to i is sampled) { ( Ne di ) n 1 ( = Ne ), if n N e d i n 1, if n > N e d i Vertices included with unequal probs. that depend on their degrees Probability proportional to size (degree) sampling of vertices Requires knowledge of N e and degree sequence {d i } i V Network Science Analytics Sampling and Estimation in Network Graphs 21
22 Snowball sampling S) Sample n vertices V 0 = {i 1,..., i n } without replacement (SRS) O1) Observe edges E0 incident to each i V 0, forming the initial wave O2) Observe neighbors N (V 0 ) of i V 0, i.e., V 1 = N (V 0 ) (V 0 )c Iterate to a desired number of e.g., k waves, or until Vk empty G has V = V0 V 1... V k, and their incident edges Ex: spiders or crawlers to discover the WWW s structure Network Science Analytics Sampling and Estimation in Network Graphs 22
23 Star sampling Difficult to compute inclusion probabilities beyond a single wave Single-wave snowball sampling reduces to star sampling Unlabeled: V = V0 and E = E0 their incident edges Ex: Count all co-authors of n sampled authors Vertex inclusion probabilities are simply πi = n/n v Labeled: V = V0 (N (V 0 ) (V 0 )c ) and E = E0 Ex: Count and identify all co-authors of n sampled authors Vertex inclusion probabilities can be shown to look like π i = L N i ( 1) L +1 P (L), where P (L) = ( Nv L n L ( Nv ) n ) Denoted by Ni the neighborhood of vertex i (including i itself) Network Science Analytics Sampling and Estimation in Network Graphs 23
24 Link tracing Link-tracing designs Select an initial sample of vertices VS Trace edges (links) from Vs to another set of vertices VT Snowball sampling: special case where all incident edges are traced May be infeasible to follow all incident edges to a given vertex Ex: lack of recollection/deception in social contact networks Path sampling designs Source nodes V S = {s 1,..., s ns } V Target nodes V T = {t 1,..., t nt } V \ V S Traverse and measure the path between each pair (s i, t j ) Ex: Traceroute Internet studies, Milgram s Six Degrees experiment Network Science Analytics Sampling and Estimation in Network Graphs 24
25 Traceroute sampling Trace shortest paths from each source to all targets s1 s2 t1 t2 Vertex and edge inclusion probabilities roughly [Dall Asta et al 06]: π i 1 (1 ρ S ρ T )e ρ S ρ T c Be (i) and π {i,j} 1 e ρ S ρ T c Be ({i,j}) Source and target sampling fractions ρ S := n S /N v and ρ T := n T /N v Induces PPS sampling, size given by betweenness centralities Network Science Analytics Sampling and Estimation in Network Graphs 25
26 Estimation of totals in network graphs Network sampling and challenges Background on statistical sampling theory Network graph sampling designs Estimation of network totals and group size Estimation of degree distributions Network Science Analytics Sampling and Estimation in Network Graphs 26
27 Network summaries as totals Various graph summaries η(g) are expressible in terms of totals τ Average degree: Let U = V and y i = d i, then η(g) = d i V d i Graph size: Let U = E and y ij = 1, then η(g) = N e = (i,j) E 1 Betweenness centrality: Let U = V (2) (unordered vertex pairs) and y ij = I { k P (i,j) }. For unique shortest i j paths P(i,j), then η(g) = c Be (k) = (i,j) V (2) I { k P(i,j) } Clustering coefficient: Let U = V (3) (unordered vertex triples), then η(g) = cl(g) = 3 total number of triangles total number of connected triples Often such totals can be obtained from sampled G via HT estimation Network Science Analytics Sampling and Estimation in Network Graphs 27
28 Vertex totals Vertex totals are of the form τ = i V y i, averages are τ/n v Ex: average degree where yi = d i Ex: nodes with characteristic C, where yi = I {i C} Given a sample V V, the HT estimator for vertex totals is ˆτ π = y i π i i V Variance expressions carry over, let U = V and V for estimates Inclusion probabilities π i depend on network sampling design Sampling also influences whether y i is observable, e.g., y i = d i Network Science Analytics Sampling and Estimation in Network Graphs 28
29 Totals on vertex pairs Quantity y ij corresponding to vertex pairs (i, j) V (2) of interest Totals τ = (i,j) V (2) y ij become relevant Ex: graph size Ne and betweenness c Be (k) where y ij = I { k P (i,j) } Ex: shared gender in friendship network, average dissimilarity The HT estimator in this context is ˆτ π = y ij π ij (i,j) V (2) Edge totals a special case, when y ij 0 only for (i, j) E Variance expression increasingly complicated, namely var [ˆτ π ] = ( ) πijkl y ik y kl 1 π ij π kl (i,j) V (2) (k,l) V (2) Depends on inclusion probabilities π ijkl of vertex quadruples Network Science Analytics Sampling and Estimation in Network Graphs 29
30 Example: Estimating network size Consider estimating N e as an edge total, i.e., N e = 1 = (i,j) E (i,j) V (2) A ij Bernoulli sampling (BS): I {i V } Ber(p) i.i.d. for all i V Edges E obtained via induced subgraph sampling π ij = p 2 The HT estimator of N e is ˆN e = (i,j) V (2) A ij π ij = p 2 N e Scales up the empirically observed edge total N e by p 2 > 1 Variance can be shown to take the form [Frank 77] [ ] var ˆN e = (p 1 1) di 2 + (p 2 2p 1 + 1)N e i V Network Science Analytics Sampling and Estimation in Network Graphs 30
31 Example: Estimating network size (cont.) Protein network: N v = 5, 151, N e = 31, 201 BS of vertices with p = 0.1 and p = 0.3 Process repeated for 10,000 trials histogram of ˆN e Density 0e+00 6e Density 0e+00 3e N^ e, for p=0.10 se^ (N^ e), for p=0.10 Density Density N^ e, for p=0.30 se^ (N^ e), for p=0.30 Average of ˆN e was 31, 116 and 31, 203 Unbiasedness supported Mean and variability of ŝe shrinks with p (larger sample) Network Science Analytics Sampling and Estimation in Network Graphs 31
32 Example: Estimating clustering coefficient Average clustering coefficient cl(g) can be expressed as cl(g) = 3 τ (G) τ 3 (G) Involves the quotient of two totals on vertex triples τ = (i,j,k) V (3) y ijk ˆτ π = Total number of triangles τ (G), where (i,j,k) V (3) y ijk π ijk y ijk = A ij A jk A ki Total number of connected triples τ 3 (G), where y ijk = A ij A jk + A ij A ki + A jk A ki Network Science Analytics Sampling and Estimation in Network Graphs 32
33 Example: Estimating clustering coefficient (cont.) Protein network: τ (G) = 44, 858, τ 3 (G) 1M, and cl(g) = BS of vertices with p = 0.2 Induced subgraph sampling of edges Process repeated for 10,000 trials histogram of ĉl(g) Density Estimated Clustering Coefficient Unbiased HT estimators ˆτ = p 3 τ (G ) and ˆτ 3 = p 3 τ 3 (G ) Plug-in estimator ĉl(g) = 3ˆτ /ˆτ 3 results in ĉl(g) = cl(g ) Quite accurate with mean and ŝe of Network Science Analytics Sampling and Estimation in Network Graphs 33
34 Caveat emptor Horvitz-Thompson framework fairly straightforward in its essence Success in network sampling and estimation rests on interaction among a) Sampling design; b) Measurements taken; and c) Total to be estimated Three basic elements must be present in the problem 1) Network summary statistic η(g) expressible as total; 2) Values y either observed, or obtainable from measurements; and 3) Inclusion probabilities π computable for the sampling design Unfortunately, often not all three are present at the same time... Network Science Analytics Sampling and Estimation in Network Graphs 34
35 Example: Estimating average degreee Recall our first example on estimation of average degree 1 N v i V d i Design 1: Unlabeled star sampling, observes degrees di, i V Design 2: Induced subgraph sampling, does not observe degrees Average degree is a scaling of a vertex total (N v known) HT estimation applicable so long as y i = d i observed True for unlabeled star sampling, and since π i = n/n v we have ˆµ St = ˆτ St N v, where ˆτ St = i V St d i n/n v We do not observe d i under induced subgraph sampling Not amenable to HT estimation as vertex total for this design Network Science Analytics Sampling and Estimation in Network Graphs 35
36 Example: Estimating average degreee (cont.) Identity µ = 2Ne N v For induced subgraph sampling π ij = ˆN e,is = Tackle instead as estimation of network size N e n(n 1) N v (N v 1), so HT estimator is A ij n(n 1)/[N v (N v 1)] = N v (N v 1) n(n 1) N e,is (i,j) V (2) Desired unbiased estimator for the average degree is ˆµ IS = 2 ˆN e,is N v Estimators under both designs can be compared by writing them as ˆµ St = 2N e,st n and ˆµ IS = 2N e,is. N v 1 n n 1 Design 1: uses the identity µ = 2Ne Design 2: same but inflated by N v on G St Nv 1 n 1, compensates d i,is < d i Network Science Analytics Sampling and Estimation in Network Graphs 36
37 Estimation of network group size Assuming that N v is known may not be on safe grounds Human or animal groups too mobile or elusive to count accurately All Web pages or Internet routers are too massive and dispersed Often estimating N v may well be the prime objective If vertex SRS or BS feasible, could sample twice marking in between Facilitates usage of capture-recapture estimators off-the-shelf If sampling infeasible, or capture-recapture performs poorly Develop estimators of N v tailored to the graph sampling at hand Network Science Analytics Sampling and Estimation in Network Graphs 37
38 Estimating the size of a hidden population Hidden population: individuals do not wish to expose themselves Ex: humans of socially sensitive status, such as homeless Ex: involved in socially sensitive activities, e.g., drugs, prostitution Such groups are often small Estimating their size is challenging Snowball sampling used to estimate the size of hidden populations O. Frank and T. Snijders, Estimating the size of hidden populations using snowball sampling, J. Official Stats., vol. 10, pp , 1994 Network Science Analytics Sampling and Estimation in Network Graphs 38
39 Sampling a hidden population Directed graph G(V, E), V the members of the hidden population Graph describing willingness to identify other members Arc (i, j) when ask individual i, mentions j as a member Graph G obtained via one-wave snowball sampling, i.e., V = V0 V 1 Initial sample V0 obtained via BS from V with probability p 0 Consider the following random variables (RVs) of interest N = V 0 : size of the initial sample M1: number of arcs among individuals in V0 M2: number of arcs from individuals in V0 to individuals in V1 Snowball sampling yields measurements n, m 1, and m 2 of these RVs Network Science Analytics Sampling and Estimation in Network Graphs 39
40 Method of moments estimator Method of moments: equate moments to sample counterparts [ ] E [N] = E I {i V0 } = N v p 0 = n E [M 1 ] = E j E [M 2 ] = E j i I {i V0 }I {j V0 }A ij = N e p0= 2 m 1 i j I {i V0 }I {j / V0 }A ij = N e p 0 (1 p 0 )= m 2 i j Expectation w.r.t. randomness in selecting the sample V 0. Solution: ( ) m1 + m 2 ˆN v = n m 1 Size of initial sample inflated by estimate of the sampling rate Network Science Analytics Sampling and Estimation in Network Graphs 40
41 Estimation of degree distributions Network sampling and challenges Background on statistical sampling theory Network graph sampling designs Estimation of network totals and group size Estimation of degree distributions Network Science Analytics Sampling and Estimation in Network Graphs 41
42 Estimation of other network characteristics Classical sampling theory rests heavily on Horvitz-Thompson theory Scope limited to network totals Q: Other network summaries, e.g., degree distributions? Findings on the effect of sampling on observed degree distributions: Highly unrepresentative of actual degree distributions; and Unhelpful to characterizing heterogeneous distributions Ex: Internet traceroute sampling [Lakhina et al 03] Broad degree distribution in G, while concentrated in G Ex: Sampling protein-protein interaction networks [Han et al 05] Power-law exponent estimate from G underestimates α in G Network Science Analytics Sampling and Estimation in Network Graphs 42
43 Impact of sampling on degree distribution Let N(d) denote the number of vertices with degree d in G Let N (d) be the counterpart in a sampled graph G Introduce vectors n = [N(0),..., N(d max )] and likewise n Under a variety of sampling designs, it holds that E [n ] = Pn Matrix P depends fully on the sampling, not G itself Expectation w.r.t. randomness in selecting the sample G O. Frank, Estimation of the number of vertices of different degrees in a graph, J. Stat. Planning and Inference, vol. 4, pp , 1980 Network Science Analytics Sampling and Estimation in Network Graphs 43
44 An inverse problem Recall the identity E [n ] = Pn Face a linear inverse problem Unbiased estimator of the degree distribution n ˆn naive = P 1 n While natural, two problems with this simple solution Matrix P typically not invertible in practice; and Non-negativity of the solution is not guaranteed We actually have an ill-posed linear inverse problem Network Science Analytics Sampling and Estimation in Network Graphs 44
45 Performance of naive estimator Erdös-Renyi graph with N v = 100 and N e = 500 BS of vertices with p = 0.6 Induced subgraph sampling of edges Network Science Analytics Sampling and Estimation in Network Graphs 45
46 Penalized least-squares formulation Constrained, penalized, weighted least-squares [Zhang et al 14] min (Pn n ) C 1 (Pn n ) + λpen(n) n s. to N(d) 0, d = 0, 1,..., d max, d max N(d) = N v d=1 Matrix C denotes the covariance of n Functional pen(n) penalizes complexity in n, tuned by λ Constraints Non-negativity of degree counts Total degree counts equal the number of vertices Smoothness: pen(n) = Dn 2, D differentiating operator Network Science Analytics Sampling and Estimation in Network Graphs 46
47 Application to online social networks Communities from online social networks Orkut and LiveJournal BS of vertices with p = 0.3 Induced subgraph sampling of edges True, sampled, and estimated degree distribution Network Science Analytics Sampling and Estimation in Network Graphs 47
48 Glossary Enumeration and samping Population graph Sampled graph Plug-in estimator Sampling design Sample with(out) replacement Design-based methods Averages and totals Inclusion probability Simple random sampling Bernoulli sampling Unequal probability sampling Horvitz-Thompson estimator Probability proportional to size sampling Capture-recapture estimator Induced subgraph sampling Incident subgraph sampling Snowball and star sampling Traceroute sampling Hidden population Ill-posed inverse problem Penalized least squares Network Science Analytics Sampling and Estimation in Network Graphs 48
Estimating network degree distributions from sampled networks: An inverse problem
Estimating network degree distributions from sampled networks: An inverse problem Eric D. Kolaczyk Dept of Mathematics and Statistics, Boston University kolaczyk@bu.edu Introduction: Networks and Degree
More informationSampling and incomplete network data
1/58 Sampling and incomplete network data 567 Statistical analysis of social networks Peter Hoff Statistics, University of Washington 2/58 Network sampling methods It is sometimes difficult to obtain a
More informationIndirect Sampling in Case of Asymmetrical Link Structures
Indirect Sampling in Case of Asymmetrical Link Structures Torsten Harms Abstract Estimation in case of indirect sampling as introduced by Huang (1984) and Ernst (1989) and developed by Lavalle (1995) Deville
More informationSampling. Jian Pei School of Computing Science Simon Fraser University
Sampling Jian Pei School of Computing Science Simon Fraser University jpei@cs.sfu.ca INTRODUCTION J. Pei: Sampling 2 What Is Sampling? Select some part of a population to observe estimate something about
More informationof being selected and varying such probability across strata under optimal allocation leads to increased accuracy.
5 Sampling with Unequal Probabilities Simple random sampling and systematic sampling are schemes where every unit in the population has the same chance of being selected We will now consider unequal probability
More informationNetworks: Lectures 9 & 10 Random graphs
Networks: Lectures 9 & 10 Random graphs Heather A Harrington Mathematical Institute University of Oxford HT 2017 What you re in for Week 1: Introduction and basic concepts Week 2: Small worlds Week 3:
More informationContinuous-time Markov Chains
Continuous-time Markov Chains Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ October 23, 2017
More informationSpecification and estimation of exponential random graph models for social (and other) networks
Specification and estimation of exponential random graph models for social (and other) networks Tom A.B. Snijders University of Oxford March 23, 2009 c Tom A.B. Snijders (University of Oxford) Models for
More informationGaussian, Markov and stationary processes
Gaussian, Markov and stationary processes Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ November
More informationSupporting Statistical Hypothesis Testing Over Graphs
Supporting Statistical Hypothesis Testing Over Graphs Jennifer Neville Departments of Computer Science and Statistics Purdue University (joint work with Tina Eliassi-Rad, Brian Gallagher, Sergey Kirshner,
More information6.207/14.15: Networks Lecture 12: Generalized Random Graphs
6.207/14.15: Networks Lecture 12: Generalized Random Graphs 1 Outline Small-world model Growing random networks Power-law degree distributions: Rich-Get-Richer effects Models: Uniform attachment model
More informationConservative variance estimation for sampling designs with zero pairwise inclusion probabilities
Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities Peter M. Aronow and Cyrus Samii Forthcoming at Survey Methodology Abstract We consider conservative variance
More information1 Complex Networks - A Brief Overview
Power-law Degree Distributions 1 Complex Networks - A Brief Overview Complex networks occur in many social, technological and scientific settings. Examples of complex networks include World Wide Web, Internet,
More informationMIXED MODELS THE GENERAL MIXED MODEL
MIXED MODELS This chapter introduces best linear unbiased prediction (BLUP), a general method for predicting random effects, while Chapter 27 is concerned with the estimation of variances by restricted
More informationTutorial on Machine Learning for Advanced Electronics
Tutorial on Machine Learning for Advanced Electronics Maxim Raginsky March 2017 Part I (Some) Theory and Principles Machine Learning: estimation of dependencies from empirical data (V. Vapnik) enabling
More informationCluster Sampling 2. Chapter Introduction
Chapter 7 Cluster Sampling 7.1 Introduction In this chapter, we consider two-stage cluster sampling where the sample clusters are selected in the first stage and the sample elements are selected in the
More informationData Mining and Analysis: Fundamental Concepts and Algorithms
Data Mining and Analysis: Fundamental Concepts and Algorithms dataminingbook.info Mohammed J. Zaki 1 Wagner Meira Jr. 2 1 Department of Computer Science Rensselaer Polytechnic Institute, Troy, NY, USA
More informationUncovering structure in biological networks: A model-based approach
Uncovering structure in biological networks: A model-based approach J-J Daudin, F. Picard, S. Robin, M. Mariadassou UMR INA-PG / ENGREF / INRA, Paris Mathématique et Informatique Appliquées Statistics
More informationTELCOM2125: Network Science and Analysis
School of Information Sciences University of Pittsburgh TELCOM2125: Network Science and Analysis Konstantinos Pelechrinis Spring 2015 Figures are taken from: M.E.J. Newman, Networks: An Introduction 2
More informationProbabilistic Foundations of Statistical Network Analysis Chapter 3: Network sampling
Probabilistic Foundations of Statistical Network Analysis Chapter 3: Network sampling Harry Crane Based on Chapter 3 of Probabilistic Foundations of Statistical Network Analysis Book website: http://wwwharrycranecom/networkshtml
More information1 Mechanistic and generative models of network structure
1 Mechanistic and generative models of network structure There are many models of network structure, and these largely can be divided into two classes: mechanistic models and generative or probabilistic
More informationSTATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
More informationComplex networks: an introduction
Alain Barrat Complex networks: an introduction CPT, Marseille, France ISI, Turin, Italy http://www.cpt.univ-mrs.fr/~barrat http://cxnets.googlepages.com Plan of the lecture I. INTRODUCTION II. I. Networks:
More informationBootstrap inference for the finite population total under complex sampling designs
Bootstrap inference for the finite population total under complex sampling designs Zhonglei Wang (Joint work with Dr. Jae Kwang Kim) Center for Survey Statistics and Methodology Iowa State University Jan.
More informationStatistics: Learning models from data
DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial
More informationBlind Identification of Invertible Graph Filters with Multiple Sparse Inputs 1
Blind Identification of Invertible Graph Filters with Multiple Sparse Inputs Chang Ye Dept. of ECE and Goergen Institute for Data Science University of Rochester cye7@ur.rochester.edu http://www.ece.rochester.edu/~cye7/
More informationBayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang
Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January
More informationStatistics Canada International Symposium Series - Proceedings Symposium 2004: Innovative Methods for Surveying Difficult-to-reach Populations
Catalogue no. 11-522-XIE Statistics Canada International Symposium Series - Proceedings Symposium 2004: Innovative Methods for Surveying Difficult-to-reach Populations 2004 Proceedings of Statistics Canada
More informationLecture 3: Simulating social networks
Lecture 3: Simulating social networks Short Course on Social Interactions and Networks, CEMFI, May 26-28th, 2014 Bryan S. Graham 28 May 2014 Consider a network with adjacency matrix D = d and corresponding
More informationGraph Detection and Estimation Theory
Introduction Detection Estimation Graph Detection and Estimation Theory (and algorithms, and applications) Patrick J. Wolfe Statistics and Information Sciences Laboratory (SISL) School of Engineering and
More informationCHOOSING THE RIGHT SAMPLING TECHNIQUE FOR YOUR RESEARCH. Awanis Ku Ishak, PhD SBM
CHOOSING THE RIGHT SAMPLING TECHNIQUE FOR YOUR RESEARCH Awanis Ku Ishak, PhD SBM Sampling The process of selecting a number of individuals for a study in such a way that the individuals represent the larger
More informationUnsupervised Learning with Permuted Data
Unsupervised Learning with Permuted Data Sergey Kirshner skirshne@ics.uci.edu Sridevi Parise sparise@ics.uci.edu Padhraic Smyth smyth@ics.uci.edu School of Information and Computer Science, University
More informationNonresponse weighting adjustment using estimated response probability
Nonresponse weighting adjustment using estimated response probability Jae-kwang Kim Yonsei University, Seoul, Korea December 26, 2006 Introduction Nonresponse Unit nonresponse Item nonresponse Basic strategy
More informationMixed-Models. version 30 October 2011
Mixed-Models version 30 October 2011 Mixed models Mixed models estimate a vector! of fixed effects and one (or more) vectors u of random effects Both fixed and random effects models always include a vector
More informationSzemerédi s regularity lemma revisited. Lewis Memorial Lecture March 14, Terence Tao (UCLA)
Szemerédi s regularity lemma revisited Lewis Memorial Lecture March 14, 2008 Terence Tao (UCLA) 1 Finding models of large dense graphs Suppose we are given a large dense graph G = (V, E), where V is a
More informationAdventures in random graphs: Models, structures and algorithms
BCAM January 2011 1 Adventures in random graphs: Models, structures and algorithms Armand M. Makowski ECE & ISR/HyNet University of Maryland at College Park armand@isr.umd.edu BCAM January 2011 2 Complex
More information3.2 Configuration model
3.2 Configuration model 3.2.1 Definition. Basic properties Assume that the vector d = (d 1,..., d n ) is graphical, i.e., there exits a graph on n vertices such that vertex 1 has degree d 1, vertex 2 has
More informationNetwork models: random graphs
Network models: random graphs Leonid E. Zhukov School of Data Analysis and Artificial Intelligence Department of Computer Science National Research University Higher School of Economics Structural Analysis
More informationEfficient MCMC Samplers for Network Tomography
Efficient MCMC Samplers for Network Tomography Martin Hazelton 1 Institute of Fundamental Sciences Massey University 7 December 2015 1 Email: m.hazelton@massey.ac.nz AUT Mathematical Sciences Symposium
More informationarxiv: v2 [stat.ap] 9 Jun 2018
Estimating Shortest Path Length Distributions via Random Walk Sampling Minhui Zheng and Bruce D. Spencer arxiv:1806.01757v2 [stat.ap] 9 Jun 2018 Department of Statistics, Northwestern University June 12,
More informationNetwork Degree Distribution Inference Under Sampling
Network Degree Distribution Inference Under Sampling Aleksandrina Goeva 1 Eric Kolaczyk 1 Rich Lehoucq 2 1 Department of Mathematics and Statistics, Boston University 2 Sandia National Labs August 18,
More informationPageRank: Ranking of nodes in graphs
PageRank: Ranking of nodes in graphs Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ October
More informationA Tunable Mechanism for Identifying Trusted Nodes in Large Scale Distributed Networks
A Tunable Mechanism for Identifying Trusted Nodes in Large Scale Distributed Networks Joydeep Chandra 1, Ingo Scholtes 2, Niloy Ganguly 1, Frank Schweitzer 2 1 - Dept. of Computer Science and Engineering,
More informationUndirected Graphical Models
Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Properties Properties 3 Generative vs. Conditional
More informationPart 7: Hierarchical Modeling
Part 7: Hierarchical Modeling!1 Nested data It is common for data to be nested: i.e., observations on subjects are organized by a hierarchy Such data are often called hierarchical or multilevel For example,
More informationA graph contains a set of nodes (vertices) connected by links (edges or arcs)
BOLTZMANN MACHINES Generative Models Graphical Models A graph contains a set of nodes (vertices) connected by links (edges or arcs) In a probabilistic graphical model, each node represents a random variable,
More informationCRISP: Capture-Recapture Interactive Simulation Package
CRISP: Capture-Recapture Interactive Simulation Package George Volichenko Carnegie Mellon University Pittsburgh, PA gvoliche@andrew.cmu.edu December 17, 2012 Contents 1 Executive Summary 1 2 Introduction
More informationIntroduction to Machine Learning. Lecture 2
Introduction to Machine Learning Lecturer: Eran Halperin Lecture 2 Fall Semester Scribe: Yishay Mansour Some of the material was not presented in class (and is marked with a side line) and is given for
More informationShortest Paths & Link Weight Structure in Networks
Shortest Paths & Link Weight Structure in etworks Piet Van Mieghem CAIDA WIT (May 2006) P. Van Mieghem 1 Outline Introduction The Art of Modeling Conclusions P. Van Mieghem 2 Telecommunication: e2e A ETWORK
More informationYou are allowed 3? sheets of notes and a calculator.
Exam 1 is Wed Sept You are allowed 3? sheets of notes and a calculator The exam covers survey sampling umbers refer to types of problems on exam A population is the entire set of (potential) measurements
More informationFigure 36: Respiratory infection versus time for the first 49 children.
y BINARY DATA MODELS We devote an entire chapter to binary data since such data are challenging, both in terms of modeling the dependence, and parameter interpretation. We again consider mixed effects
More informationA = A U. U [n] P(A U ). n 1. 2 k(n k). k. k=1
Lecture I jacques@ucsd.edu Notation: Throughout, P denotes probability and E denotes expectation. Denote (X) (r) = X(X 1)... (X r + 1) and let G n,p denote the Erdős-Rényi model of random graphs. 10 Random
More informationNonparametric Bayesian Matrix Factorization for Assortative Networks
Nonparametric Bayesian Matrix Factorization for Assortative Networks Mingyuan Zhou IROM Department, McCombs School of Business Department of Statistics and Data Sciences The University of Texas at Austin
More informationAdaptive Web Sampling
Biometrics 62, 1224 1234 December 26 DOI: 1.1111/j.1541-42.26.576.x Adaptive Web Sampling Steven K. Thompson Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, British Columbia
More informationSelection on Observables: Propensity Score Matching.
Selection on Observables: Propensity Score Matching. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 24/10/2017 I. Brunetti Labour Economics in an European Perspective 24/10/2017
More informationLecture 10: Dimension Reduction Techniques
Lecture 10: Dimension Reduction Techniques Radu Balan Department of Mathematics, AMSC, CSCAMM and NWC University of Maryland, College Park, MD April 17, 2018 Input Data It is assumed that there is a set
More informationReducing contextual bandits to supervised learning
Reducing contextual bandits to supervised learning Daniel Hsu Columbia University Based on joint work with A. Agarwal, S. Kale, J. Langford, L. Li, and R. Schapire 1 Learning to interact: example #1 Practicing
More informationFCE 3900 EDUCATIONAL RESEARCH LECTURE 8 P O P U L A T I O N A N D S A M P L I N G T E C H N I Q U E
FCE 3900 EDUCATIONAL RESEARCH LECTURE 8 P O P U L A T I O N A N D S A M P L I N G T E C H N I Q U E OBJECTIVE COURSE Understand the concept of population and sampling in the research. Identify the type
More informationComputational statistics
Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated
More information,..., θ(2),..., θ(n)
Likelihoods for Multivariate Binary Data Log-Linear Model We have 2 n 1 distinct probabilities, but we wish to consider formulations that allow more parsimonious descriptions as a function of covariates.
More informationLikelihood-Based Methods
Likelihood-Based Methods Handbook of Spatial Statistics, Chapter 4 Susheela Singh September 22, 2016 OVERVIEW INTRODUCTION MAXIMUM LIKELIHOOD ESTIMATION (ML) RESTRICTED MAXIMUM LIKELIHOOD ESTIMATION (REML)
More informationJIGSAW PERCOLATION ON ERDÖS-RÉNYI RANDOM GRAPHS
JIGSAW PERCOLATION ON ERDÖS-RÉNYI RANDOM GRAPHS ERIK SLIVKEN Abstract. We extend the jigsaw percolation model to analyze graphs where both underlying people and puzzle graphs are Erdös-Rényi random graphs.
More informationCS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu Intro sessions to SNAP C++ and SNAP.PY: SNAP.PY: Friday 9/27, 4:5 5:30pm in Gates B03 SNAP
More informationMN 400: Research Methods. CHAPTER 7 Sample Design
MN 400: Research Methods CHAPTER 7 Sample Design 1 Some fundamental terminology Population the entire group of objects about which information is wanted Unit, object any individual member of the population
More information45 6 = ( )/6! = 20 6 = selected subsets: 6 = and 10
Homework 1. Selected Solutions, 9/12/03. Ch. 2, # 34, p. 74. The most natural sample space S is the set of unordered -tuples, i.e., the set of -element subsets of the list of 45 workers (labelled 1,...,
More informationOverall Plan of Simulation and Modeling I. Chapters
Overall Plan of Simulation and Modeling I Chapters Introduction to Simulation Discrete Simulation Analytical Modeling Modeling Paradigms Input Modeling Random Number Generation Output Analysis Continuous
More informationMining Newsgroups Using Networks Arising From Social Behavior by Rakesh Agrawal et al. Presented by Will Lee
Mining Newsgroups Using Networks Arising From Social Behavior by Rakesh Agrawal et al. Presented by Will Lee wwlee1@uiuc.edu September 28, 2004 Motivation IR on newsgroups is challenging due to lack of
More informationAppendix: Modeling Approach
AFFECTIVE PRIMACY IN INTRAORGANIZATIONAL TASK NETWORKS Appendix: Modeling Approach There is now a significant and developing literature on Bayesian methods in social network analysis. See, for instance,
More informationContents. 1 Review of Residuals. 2 Detecting Outliers. 3 Influential Observations. 4 Multicollinearity and its Effects
Contents 1 Review of Residuals 2 Detecting Outliers 3 Influential Observations 4 Multicollinearity and its Effects W. Zhou (Colorado State University) STAT 540 July 6th, 2015 1 / 32 Model Diagnostics:
More informationNotes 6 : First and second moment methods
Notes 6 : First and second moment methods Math 733-734: Theory of Probability Lecturer: Sebastien Roch References: [Roc, Sections 2.1-2.3]. Recall: THM 6.1 (Markov s inequality) Let X be a non-negative
More informationSTATS 306B: Unsupervised Learning Spring Lecture 2 April 2
STATS 306B: Unsupervised Learning Spring 2014 Lecture 2 April 2 Lecturer: Lester Mackey Scribe: Junyang Qian, Minzhe Wang 2.1 Recap In the last lecture, we formulated our working definition of unsupervised
More informationBayesian Hierarchical Models
Bayesian Hierarchical Models Gavin Shaddick, Millie Green, Matthew Thomas University of Bath 6 th - 9 th December 2016 1/ 34 APPLICATIONS OF BAYESIAN HIERARCHICAL MODELS 2/ 34 OUTLINE Spatial epidemiology
More informationStatistical analysis of biological networks.
Statistical analysis of biological networks. Assessing the exceptionality of network motifs S. Schbath Jouy-en-Josas/Evry/Paris, France http://genome.jouy.inra.fr/ssb/ Colloquium interactions math/info,
More informationMachine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function.
Bayesian learning: Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Let y be the true label and y be the predicted
More informationMultivariate Distributions
IEOR E4602: Quantitative Risk Management Spring 2016 c 2016 by Martin Haugh Multivariate Distributions We will study multivariate distributions in these notes, focusing 1 in particular on multivariate
More informationarxiv: v4 [stat.me] 28 May 2015
The Annals of Applied Statistics 2015, Vol. 9, No. 1, 166 199 DOI: 10.1214/14-AOAS800 c Institute of Mathematical Statistics, 2015 arxiv:1305.4977v4 [stat.me] 28 May 2015 ESTIMATING NETWORK DEGREE DISTRIBUTIONS
More informationNon-uniform coverage estimators for distance sampling
Abstract Non-uniform coverage estimators for distance sampling CREEM Technical report 2007-01 Eric Rexstad Centre for Research into Ecological and Environmental Modelling Research Unit for Wildlife Population
More informationBasic Sampling Methods
Basic Sampling Methods Sargur Srihari srihari@cedar.buffalo.edu 1 1. Motivation Topics Intractability in ML How sampling can help 2. Ancestral Sampling Using BNs 3. Transforming a Uniform Distribution
More informationAdaptive Web Sampling
Adaptive Web Sampling To appear in Biometrics, 26 Steven K. Thompson Department of Statistics and Actuarial Science Simon Fraser University, Burnaby, BC V5A 1S6 CANADA email: thompson@stat.sfu.ca Abstract
More informationHypothesis Testing hypothesis testing approach
Hypothesis Testing In this case, we d be trying to form an inference about that neighborhood: Do people there shop more often those people who are members of the larger population To ascertain this, we
More informationUnequal Probability Designs
Unequal Probability Designs Department of Statistics University of British Columbia This is prepares for Stat 344, 2014 Section 7.11 and 7.12 Probability Sampling Designs: A quick review A probability
More informationDouble-Sampling Designs for Dropouts
Double-Sampling Designs for Dropouts Dave Glidden Division of Biostatistics, UCSF 26 February 2010 http://www.epibiostat.ucsf.edu/dave/talks.html Outline Mbarra s Dropouts Future Directions Inverse Prob.
More informationLink Prediction. Eman Badr Mohammed Saquib Akmal Khan
Link Prediction Eman Badr Mohammed Saquib Akmal Khan 11-06-2013 Link Prediction Which pair of nodes should be connected? Applications Facebook friend suggestion Recommendation systems Monitoring and controlling
More informationEmpirical Bayes Quantile-Prediction aka E-B Prediction under Check-loss;
BFF4, May 2, 2017 Empirical Bayes Quantile-Prediction aka E-B Prediction under Check-loss; Lawrence D. Brown Wharton School, Univ. of Pennsylvania Joint work with Gourab Mukherjee and Paat Rusmevichientong
More informationToday s Lecture: HMMs
Today s Lecture: HMMs Definitions Examples Probability calculations WDAG Dynamic programming algorithms: Forward Viterbi Parameter estimation Viterbi training 1 Hidden Markov Models Probability models
More informationSmall Area Confidence Bounds on Small Cell Proportions in Survey Populations
Small Area Confidence Bounds on Small Cell Proportions in Survey Populations Aaron Gilary, Jerry Maples, U.S. Census Bureau U.S. Census Bureau Eric V. Slud, U.S. Census Bureau Univ. Maryland College Park
More informationNonconcave Penalized Likelihood with A Diverging Number of Parameters
Nonconcave Penalized Likelihood with A Diverging Number of Parameters Jianqing Fan and Heng Peng Presenter: Jiale Xu March 12, 2010 Jianqing Fan and Heng Peng Presenter: JialeNonconcave Xu () Penalized
More informationSanjay Chaudhuri Department of Statistics and Applied Probability, National University of Singapore
AN EMPIRICAL LIKELIHOOD BASED ESTIMATOR FOR RESPONDENT DRIVEN SAMPLED DATA Sanjay Chaudhuri Department of Statistics and Applied Probability, National University of Singapore Mark Handcock, Department
More informationProbability and Information Theory. Sargur N. Srihari
Probability and Information Theory Sargur N. srihari@cedar.buffalo.edu 1 Topics in Probability and Information Theory Overview 1. Why Probability? 2. Random Variables 3. Probability Distributions 4. Marginal
More informationSurvey Sample Methods
Survey Sample Methods p. 1/54 Survey Sample Methods Evaluators Toolbox Refreshment Abhik Roy & Kristin Hobson abhik.r.roy@wmich.edu & kristin.a.hobson@wmich.edu Western Michigan University AEA Evaluation
More informationLecture 3: Introduction to Complexity Regularization
ECE90 Spring 2007 Statistical Learning Theory Instructor: R. Nowak Lecture 3: Introduction to Complexity Regularization We ended the previous lecture with a brief discussion of overfitting. Recall that,
More informationA note on multiple imputation for general purpose estimation
A note on multiple imputation for general purpose estimation Shu Yang Jae Kwang Kim SSC meeting June 16, 2015 Shu Yang, Jae Kwang Kim Multiple Imputation June 16, 2015 1 / 32 Introduction Basic Setup Assume
More informationChapter 3. Erdős Rényi random graphs
Chapter Erdős Rényi random graphs 2 .1 Definitions Fix n, considerthe set V = {1,2,...,n} =: [n], andput N := ( n 2 be the numberofedgesonthe full graph K n, the edges are {e 1,e 2,...,e N }. Fix also
More informationOptimal Blocking by Minimizing the Maximum Within-Block Distance
Optimal Blocking by Minimizing the Maximum Within-Block Distance Michael J. Higgins Jasjeet Sekhon Princeton University University of California at Berkeley November 14, 2013 For the Kansas State University
More informationTGDR: An Introduction
TGDR: An Introduction Julian Wolfson Student Seminar March 28, 2007 1 Variable Selection 2 Penalization, Solution Paths and TGDR 3 Applying TGDR 4 Extensions 5 Final Thoughts Some motivating examples We
More informationConsistency Under Sampling of Exponential Random Graph Models
Consistency Under Sampling of Exponential Random Graph Models Cosma Shalizi and Alessandro Rinaldo Summary by: Elly Kaizar Remember ERGMs (Exponential Random Graph Models) Exponential family models Sufficient
More informationDiscrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14
CS 70 Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14 Introduction One of the key properties of coin flips is independence: if you flip a fair coin ten times and get ten
More informationHigh dimensional Ising model selection
High dimensional Ising model selection Pradeep Ravikumar UT Austin (based on work with John Lafferty, Martin Wainwright) Sparse Ising model US Senate 109th Congress Banerjee et al, 2008 Estimate a sparse
More informationWhat s New in Econometrics. Lecture 1
What s New in Econometrics Lecture 1 Estimation of Average Treatment Effects Under Unconfoundedness Guido Imbens NBER Summer Institute, 2007 Outline 1. Introduction 2. Potential Outcomes 3. Estimands and
More informationLecture 2: Repetition of probability theory and statistics
Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:
More information