Long Term Evolution of Networks CS224W Final Report


Jave Kane (SUNET ID javekane), Conrad Roche (SUNET ID conradr)
Nov. 13, 2014

Introduction and Motivation

Many studies in social networking and computer science have investigated the evolution of large networks on time scales of one to ten years. Less well investigated is the long-term evolution of large networks and the stability of communities when the external forces generating these structures are themselves evolving. We investigate collaboration networks generated from American Physical Society (APS) article metadata. We study the correlation between author interests as represented by the APS PACS fields. We then simulate the APS network using the Kronecker and Forest Fire models, which turn out to be difficult to fit to APS. Therefore, we create a third, simple model of a collaboration network based on a few intuitive rules about how coauthors are chosen. This model produces a network with a power law degree structure and a clustering coefficient similar to APS. The effective diameter is systematically lower than for APS, and we explore why. Overall the results suggest APS is a collection of tightly bound communities, with few edges joining them. Finally, we briefly investigate longer-term evolution in two what-if scenarios.

Summary & Critique of Selected Papers

Microscopic Evolution of Social Networks, Leskovec et al. Analyzes the evolution of four social networks at the microscopic level. Shows that edge creation for a node seems unaffected by the node's age, but is proportional to its degree.

Evolution of the social network of scientific collaborations, Barabási et al. Investigates structural properties of two collaboration networks. For both networks, the degree distribution has a power law tail, with different exponents for the two networks.

Tracking the Evolution of Communities in Dynamic Social Networks, Greene & Doyle. A model for tracking user communities in dynamic networks, where the evolution of each community is determined by a set of significant events.

Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication, Leskovec et al. A graph generator that obeys the static properties and temporal evolution patterns of real-life networks and is mathematically tractable. Generated graphs exhibit a multinomial degree distribution and a multinomial eigenvalue distribution.

Processing the APS Data and Creating Network Snapshots

The corpus of Physical Review Letters, Physical Review, and Reviews of Modern Physics contains metadata for 541,448 articles dating back to 1893 from the 12 APS journals described in Table 1. Pre-processing the data to generate network snapshots is a significant effort; after considerable investigation we settled on the following. We select only articles that have the authors and names fields, thus eliminating preambles and commentaries, and that list first names, thus eliminating collaborations (which average 331 authors per article). We build 121 yearly snapshots (1893 to 2013) by scanning the articles in date order. We add an author as a node at the time of her first article. For each article, we add a coauthor edge for each pair of authors. We accumulate the time-dependent interest vector for each author as the sum of the PACS vectors of all their papers.

Identifying authors is very difficult. An author may be listed with various first names or initials. Some names are very common, e.g. Lee and Brown. The affiliations field is not helpful; its format varies even for the same author. Furthermore, authors often change institutions. Misidentification means authors may map one-to-many or many-to-one to nodes, spuriously creating or omitting edges. The final (2013) network has 2.7e5 nodes and 5.7e6 edges.

Full Title | Year | Description
Physical Review | | Original journal
Physical Review A | | Atomic, molecular, optical, quantum information
Physical Review B | | Condensed matter and materials
Physical Review C | | Experimental and theoretical nuclear
Physical Review D | | Elementary particle, field theory, gravitation, cosmology
Physical Review E | | Collective phenomena of many-body systems
PR Series I | |
Physical Review Letters | | All fields, short letters, important research
PR Special Topics Accelerators and Beams | |
PR Special Topics Education | |
PR X | | All areas: pure, applied, interdisciplinary
Reviews of Modern Physics | 1929 | Broad fundamental current trends and applications

Table 1. The APS journals.

We use the APS PACS classification field to assign interest vectors to authors. The 10 PACS areas were introduced in 1975, e.g. (10) The Physics of Elementary Particles and Fields; (70) Condensed Matter: Electronic Structure, Electrical, Magnetic, and Optical Properties. For each article we reduce the PACS code (e.g. a code of the form 7y.xx) to its broad category. Some articles have no PACS code; these are mostly older articles, but they persist well into the 1980s. We tested using journal-averaged PACS in these cases but felt this procedure was too complicated and could obscure the network effects of interest, so we assign a zero PACS vector instead, and when analyzing the interests of communities we average over the normalized non-zero vectors.

Community Detection and PACS Interest Vectors

To detect communities in the APS network snapshots and investigate their correlation with the accumulated PACS vectors of authors, we tried the Girvan-Newman (GN) and Clauset-Newman-Moore (CNM) algorithms (fastcommunity_mh, in C). GN appears intractable, while CNM takes 18 hours on the 2013 network. We considered Big Clam, and settled on Louvain, which takes less than a minute on both the APS and simulated networks and generates intuitively reasonable communities in both cases. We also tested Louvain with edge weighting on our simulated network; we found it was too slow.

Figure 1 shows the distribution of PACS for each of the top 5 communities by size in the 2010, 2011, 2012 and 2013 snapshots. The largest community (blue bars in all plots) is always specialized in Condensed Matter Physics (PACS categories 7 and 8), but the continuity of the other communities is less clear. All communities grow in size with time, but in addition Louvain unpredictably splits communities with similar interests that might intuitively be combined into one. Communities also change order in the ranking by size.

To match communities from one year to the next, we computed two Jaccard-type similarities on the author (node) ids, and also cosine similarities between the mean normalized total PACS vectors (summed over authors) within each community. Because a very small community U (of, say, 200 nodes) could become split off from a much larger community V that contained it a year earlier, we computed both $J_1 = |U \cap V| / |U \cup V|$ and $J_2 = |U \cap V| / \min(|U|, |V|)$. We found that both $J_1$ and $J_2$ were well correlated with the cosine similarities of the groups. Since our goal here was only to confirm a supposed link between communities and interests, we did not attempt to compute exhaustive statistics for the database, but for example $J_1$ = 72% and $J_2$ = 88% both match community 3 (green) in 2012 with community 2 (light blue) in 2012; the interest cosine similarity is 99.76%. These results suggest APS communities are tight knit and identifiable by similar PACS interests.

Figure 1. Fraction of total community PACS vector in each category for the top 5 Louvain APS communities by size for the years 2010, 2011, 2012 and 2013. The legend shows the total number of nodes in each group.
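The community-matching measures described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the report: the function names and the toy communities are hypothetical, and communities are assumed to be plain sets of author ids.

```python
import math


def jaccard_union(u, v):
    """J1: size of the intersection divided by size of the union."""
    u, v = set(u), set(v)
    union = u | v
    return len(u & v) / len(union) if union else 0.0


def jaccard_min(u, v):
    """J2: size of the intersection divided by the smaller community size."""
    u, v = set(u), set(v)
    smaller = min(len(u), len(v))
    return len(u & v) / smaller if smaller else 0.0


def cosine_similarity(a, b):
    """Cosine similarity of two interest vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy case (made-up ids): a 200-node community fully contained in a
# 1000-node community from the previous year.
small = set(range(200))
large = set(range(1000))
print(jaccard_union(small, large))  # 0.2
print(jaccard_min(small, large))    # 1.0
```

The toy case shows why both measures are useful: a small community split off from a large one scores low on J1 but high on J2.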

Models of the APS Network

Forest Fire Model

The Forest Fire model [3] generates evolving graphs by burning through an existing graph. It has two parameters, the forward and backward burning probabilities p and r. Each arriving node v attaches to an ambassador node w chosen uniformly at random, then selects a subset of the out- and in-neighbors of w, with the subset sizes drawn from geometric distributions based on p and r. The node v then creates edges to the selected nodes. This is repeated recursively on the selected nodes until no nodes are left, with any node visited only once. The resulting graphs exhibit densification and a power law degree distribution. For the APS network, the forest fire model can be viewed as a new author selecting a primary coauthor, then selecting more coauthors recursively.

The forward and backward burning probabilities were determined for each of the APS snapshots. A simple least-squares goodness-of-fit measure was used to determine the probabilities for which the model most closely matches the properties of the actual network. The representative properties used were node count, edge count, approximate diameter, maximum WCC size, clustering coefficient and maximum node degree. The best-fit probabilities fell in a sweet spot starting at about 0.2. The forest fire model with the best fit over all the snapshots had a forward burning probability p of 0.33, together with its fitted backward burning probability r.

The basic Forest Fire model generates a graph consisting of a single component: ignoring edge direction, one can navigate from any node to any other node. Thus, the maximal WCC is the entire network. This is not true for APS, however (Figure 4), so the forest fire model did not correctly model the maximum WCC size.

Figure 2. Graph properties over time for the Forest Fire models. The orphan model fits the initial years of the graph, while the basic forest fire model fits the later years of the graph.
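The growth step of the basic Forest Fire model described above can be sketched as follows. This is a simplified illustration rather than the implementation used for the fits: the function names are hypothetical, the number of out- and in-neighbors burned at each step is drawn as a count of consecutive successes with probabilities p and r (one common reading of the geometric rule), and the value r = 0.3 in the demo is an assumption, since only p = 0.33 is stated.

```python
import random


def geometric_count(prob, rng):
    """Count consecutive successes of probability `prob` (mean prob / (1 - prob))."""
    count = 0
    while rng.random() < prob:
        count += 1
    return count


def forest_fire(num_nodes, p, r, seed=0):
    """Grow a directed graph; returns (out_nbrs, in_nbrs) adjacency dicts."""
    rng = random.Random(seed)
    out_nbrs = {0: set()}
    in_nbrs = {0: set()}
    for v in range(1, num_nodes):
        out_nbrs[v], in_nbrs[v] = set(), set()
        ambassador = rng.randrange(v)      # uniform over existing nodes
        visited = {v, ambassador}
        frontier = [ambassador]
        while frontier:
            w = frontier.pop()
            out_nbrs[v].add(w)             # v links to every burned node
            in_nbrs[w].add(v)
            # Unvisited neighbors of w, split by edge direction.
            forward = sorted(out_nbrs[w] - visited)
            backward = sorted(in_nbrs[w] - visited)
            rng.shuffle(forward)
            rng.shuffle(backward)
            picked = (forward[:geometric_count(p, rng)]
                      + backward[:geometric_count(r, rng)])
            for u in picked:
                if u not in visited:       # any node is burned at most once
                    visited.add(u)
                    frontier.append(u)
    return out_nbrs, in_nbrs


# p = 0.33 as in the text; r = 0.3 is an assumed placeholder value.
out_nbrs, in_nbrs = forest_fire(1000, p=0.33, r=0.3)
print("nodes:", len(out_nbrs), "edges:", sum(len(s) for s in out_nbrs.values()))
```

Because every arriving node links to at least its ambassador, the sketch always produces a single weakly connected component, which is the property that makes the basic model a poor match for the APS maximum WCC size.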
As suggested in [3], one way to generate a graph which contains multiple weakly connected components is the orphan model. Here, with orphan probability op, the arriving node v does not establish an edge with the existing graph. This leads to many orphan nodes, which eventually form edges with new nodes. The forest fire orphan model with the best fit over all the snapshots had a forward burning probability p of 0.27, a backward burning probability r of 0.28, and an orphan probability of 0.3. The Forest Fire orphan model fit the initial evolution of APS well, but the basic model fit the latter part of the evolution better.

The clustering coefficients in both models remain flat for most of the evolution. The APS network, on the other hand, has a steep initial increase in the clustering coefficient and then, as the number of nodes increases, it approaches an asymptote. This makes it difficult to fit the forest fire model to the data.

Stochastic Kronecker Graph Model

The self-similar Stochastic Kronecker graph [4] is generated recursively, starting with an $N_1 \times N_1$ probability matrix ($N_1 = 2$ in our case) and computing its $k$-th Kronecker power. Edges are placed according to the probability in the corresponding entry of the resulting matrix. For APS, self-similarity would imply that the network amongst authors is similar (but not exactly the same) across the network. This makes sense, as authors in different fields would exhibit similar collaborative behavior. Unlike Forest Fire, stochastic Kronecker is undirected. It exhibits a power law degree distribution, small diameter and densification.

Another interpretation of Stochastic Kronecker graphs, as described in [4], is to associate each node with a set of features, where the probability of an edge depends on the feature similarities. For APS the features are the authors' (research) interests.

We used the KronFit approach [4] to fit the APS network snapshots and determine the initiator matrix. We chose $N_1 = 2$, since larger $N_1$ did not radically improve the fit in [4]. The matrix varied for each of the snapshots. The diagonal values in the matrices were nearly the same, with only a small average difference over the years. The median value of the matrix elements in the yearly snapshots was chosen as the initiator $\Theta$ for the entire time series.

Figure 3.

The Stochastic Kronecker graph modeled the edge growth and the effective diameter well. It could not model the behavior of the maximum SCC or the clustering coefficient: the model's clustering coefficient falls steeply with network growth, while the APS clustering coefficient rises over the latter half of the period.
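The Kronecker power construction described above can be sketched in pure Python. This is an illustration under stated assumptions, not the KronFit code: the function names are hypothetical, and the initiator values below are placeholders chosen for the demo, not the fitted initiator for APS.

```python
import random


def kronecker_product(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [x * y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]


def kronecker_power(theta, k):
    """k-th Kronecker power of the initiator matrix theta (k >= 1)."""
    result = theta
    for _ in range(k - 1):
        result = kronecker_product(result, theta)
    return result


def sample_edges(prob, rng):
    """Include edge (i, j) independently with probability prob[i][j]."""
    return [
        (i, j)
        for i, row in enumerate(prob)
        for j, p_ij in enumerate(row)
        if rng.random() < p_ij
    ]


# Placeholder 2x2 initiator (assumed values, for illustration only).
theta = [[0.9, 0.5],
         [0.5, 0.2]]
prob = kronecker_power(theta, 3)          # 8 x 8 edge-probability matrix
edges = sample_edges(prob, random.Random(0))
print(len(prob), "nodes,", len(edges), "sampled edges")
```

Materializing the full probability matrix costs memory quadratic in the number of nodes, so this direct approach is only practical for small k; it is meant to show the definition, not to scale to APS-size graphs.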

Simple Model

We have also built a simple model for generating an APS-like network. The key element is a small set of intuitive rules for choosing coauthors that account for the interests of the authors, where interest means any external influence, especially funding or popular support for particular subfields of physics. A rule is chosen randomly using the weighted distribution shown in Table 2. Through extensive experimentation, these rules were found to be both effective in controlling network properties and feasible for APS-size graphs.

Rule | Coauthor selection rule | Typical relative weighting
1 | complete a triangle where nodes have the same interest | 23
2 | seek a high degree node with same interest | 70
3 | seek any node with same interest | 6
4 | seek a high degree node (neighbor of random node with any interest) | 1e-3
5 | random node (any degree, any interest) | 0.4

Table 2. Intuitive rules for choosing coauthors in the simple model.

Notably, Rule 4, essentially an edge rewiring rule, reduces the effective diameter in the model shown here from around 8 to an APS-like value of around 5 to 6.

In the model shown here, for simplicity of exposition each author has one of three possible scalar interest values (1, 2, or 3). We also tried as many as 20 scalar interests. For the model run shown here, the network G = (V, E) is initialized with 4 nodes in each of three equal-size complete subgraph communities C1, C2 and C3, where every node in Ci has the single scalar interest i. The subgraphs are connected in a ring (three additional edges joining pairs of subgraphs). The model is run for 1440 time steps (representing 120 years times 12 months). Each month $n = \max(1, rN)$ new authors (nodes) are added to the network, where the rate r = 0.058/12 and $N = |V|$. Each new author is assigned an interest value 1, 2, or 3 with equal probability.

The top right panel of Figure 4 shows an excellent quadratic fit to the number of edges vs. the number of nodes in the APS Wcc, so we stipulate that the number of new papers $\Delta P$ is related to the number of new nodes $\Delta N$ at each step as $\Delta P = (1 + 2\alpha N^2)\,\Delta N$; the constant $\alpha$ = 3e-3 is tuned along with the weighting for the number of coauthors given below. Each new paper has m coauthors in addition to the first author, where m is chosen randomly for each paper separately using the weighting [0.35, 0.15, 0.10, 0.10, 0.05] for m = [1, 2, 3, 4, 5]. This weighting influences the rate at which edges appear.

All n new authors are placed in an initially empty queue q of first authors waiting for coauthors, and then $s = |q| - n$ existing first authors are also added to the queue. To choose an existing first author for q, an existing node v is chosen randomly from the network. With 10% probability, v is added to the queue; with 90% probability, a random neighbor of v is added instead. The latter implements a preferential "well-connected authors get more publications" mode. Once q is assembled, each v in q is assigned its own m(v) - 1 coauthors by the rules in Table 2. Because the initial network is connected and every paper has at least one coauthor, the network remains connected.

Using a profiling utility in Python, we found that careful approximations to the rules reduced the running time to generate the full APS-size network from several hours to two minutes. For example, at first we used sampling routines in the Python stats package to generate a list of candidate nodes for Rule 1, from which we randomly downselected. However, the profiler showed that this and similar approaches are very expensive. Therefore, in Rule 1, if the chosen neighbor of v or its neighbor does not have the same interest as v, we simply iterate and choose a rule again (possibly Rule 1). This means the rules are not applied with exactly the distribution shown in Table 2. However, because the edges are sparse, $|E| \ll N(N-1)/2$, and the rules tend to surround nodes with neighbors of similar interest, the rules usually find a candidate, so we expect the approximations to work well. The improved speed made it possible to test many concepts for the model and to vary the parameters.

Results of Simple Model

As the top left panel of Figure 4 shows, the APS network is significantly disconnected until the 1940s, when the network has only a few thousand nodes. Therefore we compare simulated results to the maximum weakly connected component (Wcc) of APS. The left panel on the second line of Figure 4 shows the number of edges versus the number of nodes. The simple model produces an APS-like network with the following features.

1) clustering coefficient C close to the APS value of 0.61: second line, right panel of Figure 4.
2) power law degree distribution similar to APS: third line of Figure 4.
3) effective diameter $D_e$ and number of nodes at a given number of hops similar to APS: bottom line.

Surprisingly, $D_e$ of the APS network is the measure most sensitive to the input parameters. As the left panel on the bottom line of Figure 4 shows, for the model run shown the final $D_e$ is closer to 4 than to the final $D_e$ of about 5 to 6 in the APS Wcc, although the two may be converging. We note that the full diameter of the model is very similar to the APS Wcc $D_e$.

One way a graph can have high C and high $D_e$ is if it has dense communities with tendrils. For example, a complete graph $G_C$ on n nodes has C = 1. If a chain (line) subgraph $G_L$ of m nodes is attached by one edge to one node v of $G_C$, then every node i in $G_L$ has $C_i = 0$. The full diameter of the graph is $D = m + 1$. Every node j in $G_C$ except v has $C_j = 1$. The node v has

$$C_v = \frac{2 e_v}{k_v (k_v - 1)},$$

where $k_v = (n - 1) + 1 = n$ is the degree of v and $e_v = (n-1)(n-2)/2$ is the number of edges between pairs of neighbors of v, i.e. between the $n - 1$ neighbors of v in $G_C$. Thus

$$C_v = \frac{(n-1)(n-2)}{n(n-1)} = \frac{n-2}{n}.$$

Therefore, for the entire graph, the clustering coefficient is

$$C = \frac{1}{m+n}\left[ m \cdot 0 + \frac{n-2}{n} + (n-1) \cdot 1 \right] = \frac{1}{m+n} \cdot \frac{n^2 - 2}{n} = \frac{n^2 - 2}{n\,(D + n - 1)}, \qquad m = D - 1.$$

For a given D, $\lim_{n \to \infty} C = 1$, i.e. C can be arbitrarily close to 1. We suspect this type of structure is present in APS, but it would not result from the simple rules in Table 2. (This would be interesting further work.)

Notably, APS has a cloud of high-frequency, high-degree counts above the power law (third line, left panel of Figure 4) that the simple model does not reproduce. These could be explained as small, highly connected communities nearly detached from the main network, and could raise $D_e$. (These may be specialty institutions or fields; this would be worth investigating further.)

In general, by varying parameters in the simple model it is easy to improve the match on any of criteria 1) to 3) individually (not shown), such as the number of nodes at degree 1 or the tail of single-count, high-degree nodes. We have made no systematic attempt to find a simultaneous good match on all 3 criteria. However, given the simplicity of our rules, these results suggest that simple intuitive rules can robustly explain the APS network.
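The clique-with-tendril argument can be checked numerically. The sketch below builds a complete graph on n nodes with an m-node chain attached to one clique node, computes the average local clustering coefficient directly, and compares it with the closed form $(n^2 - 2)/(n(m + n))$. The function names are hypothetical, and the check assumes n of at least 3 so that every clique node has a well-defined clustering coefficient.

```python
from itertools import combinations


def clique_with_chain(n, m):
    """Complete graph on nodes 0..n-1 plus a chain of m nodes hanging off node 0."""
    adj = {i: set() for i in range(n + m)}
    for i, j in combinations(range(n), 2):
        adj[i].add(j)
        adj[j].add(i)
    prev = 0                          # node 0 plays the role of v
    for c in range(n, n + m):
        adj[prev].add(c)
        adj[c].add(prev)
        prev = c
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def predicted_clustering(n, m):
    """Closed form from the derivation: (n^2 - 2) / (n (m + n))."""
    return (n * n - 2) / (n * (m + n))


# Fixed chain length m = 6 (full diameter D = m + 1 = 7), growing clique size n.
for n in (5, 20, 80):
    adj = clique_with_chain(n, 6)
    print(n, round(average_clustering(adj), 4), round(predicted_clustering(n, 6), 4))
```

With the chain length held fixed, the printed values rise toward 1 as n grows, which is the limit stated in the derivation.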

Figure 4. Results of the simple model. Top left: APS # Nodes vs. Year. Top right: Fit to APS # Edges vs. # Nodes. 2nd line left: # Edges vs. # Nodes. 2nd line right: Global Clustering Coefficient. 3rd line left: degree distribution for APS. 3rd line right: same for simple model. Bottom left: Diameter. Bottom right: distribution of nodes versus number of hops.
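The coauthor selection step of the simple model, with its "draw a rule, and redraw if the rule finds no candidate" approximation, can be sketched as follows. This is a simplified reading of the five rules, not the report's code: the function names, the retry cap, and the exact candidate tests are assumptions. The weights are the relative weights from Table 2, and the starting network is the three 4-node complete subgraphs joined in a ring.

```python
import random

# Relative weights from Table 2 (rule number -> weight).
RULE_WEIGHTS = {1: 23, 2: 70, 3: 6, 4: 1e-3, 5: 0.4}


def initial_network():
    """Three 4-node complete subgraphs with interests 1, 2, 3, joined in a ring."""
    adj = {i: [] for i in range(12)}
    interest = {i: i // 4 + 1 for i in range(12)}

    def link(a, b):
        adj[a].append(b)
        adj[b].append(a)

    for c in range(3):
        members = range(4 * c, 4 * c + 4)
        for a in members:
            for b in members:
                if a < b:
                    link(a, b)
    for c in range(3):
        link(4 * c, (4 * c + 4) % 12)   # ring edges 0-4, 4-8, 8-0
    return adj, interest


def propose(rule, v, adj, interest, nodes, rng):
    """Return a candidate coauthor for v under `rule`, or None if the draw fails."""
    if rule == 1:                        # close a triangle within one interest
        if not adj[v]:
            return None
        u = rng.choice(adj[v])
        w = rng.choice(adj[u])
        ok = interest[u] == interest[v] == interest[w]
    elif rule in (2, 4):                 # neighbor of a random node: degree-biased
        nbrs = adj[rng.choice(nodes)]
        if not nbrs:
            return None
        w = rng.choice(nbrs)
        ok = rule == 4 or interest[w] == interest[v]
    else:                                # rules 3 and 5: uniform random node
        w = rng.choice(nodes)
        ok = rule == 5 or interest[w] == interest[v]
    return w if ok and w != v and w not in adj[v] else None


def choose_coauthor(v, adj, interest, nodes, rng, max_tries=1000):
    """Draw rules by weight until one yields a candidate; returns (rule, node)."""
    rules = list(RULE_WEIGHTS)
    weights = list(RULE_WEIGHTS.values())
    for _ in range(max_tries):
        rule = rng.choices(rules, weights=weights)[0]
        w = propose(rule, v, adj, interest, nodes, rng)
        if w is not None:
            return rule, w
    return None


adj, interest = initial_network()
adj[12], interest[12] = [], 1            # a new author with interest 1
print(choose_coauthor(12, adj, interest, list(adj), random.Random(0)))
```

Picking a random neighbor of a random node is a cheap way to bias toward high-degree nodes without building a degree-weighted sampling table, which matches the spirit of the speed-up described for the model.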

Long term evolution of simulated network

We have also used the simple model to investigate longer-term evolution of a collaboration network in two what-if scenarios. The type of scenario we envision is a significant change in the funding levels for two subfields of physics research. An example might be an increase in funding for Condensed Matter as it becomes the dominant profitable field in physics, and a decrease in funding for nuclear physics as it loses political support.

To mock up these scenarios, at 1200 months (out of 1440 months) we impose a change in the interest attributes of new nodes and of existing nodes publishing new papers. At 1200 months the network contains about 1/3 the number of nodes it does at 1440 months. The change in interest mocks up an increase in the fractional funding to Interest 1 from 33% to 60% of total physics funding, and a decrease in funding for Interest 3 from 33% to 5%. New authors entering the network are assigned interests according to the changed interest distribution, while existing first authors who currently have Interest 3 reassign their interest according to the new interest distribution when they publish a new paper. This mocks up the effect of existing authors possibly bailing out of a declining field. We run Louvain community detection on yearly network snapshots.

The result of this model is that the clustering coefficient increases, while the effective diameter stays nearly constant. This is a somewhat surprising result. A change in author interest might be expected to generate more long-distance connections, since an existing author who changes interest would tend to have different interests from its neighbors; this would not tend to decrease the clustering coefficient, but would decrease the diameter. As a fraction of the total number of authors in the network, C1 increases in size in response to increased funding, while C2 stays constant in size. C3 changes in several ways: it shrinks in fractional size by a factor of 3; its average interest changes from 3 to 1.6; and the fractions of authors in Community 3 with interests 1, 2 and 3 each become about 1/3. This result suggests that a fairly simple change in the funding input to a collaboration network can lead to the disruption of a previously stable community in a short amount of time. Since existing edges (papers published on earlier interests) are not removed in this simple model, it is interesting that the new edges come to dominate the structure. However, this result was obtained for the case where the network continued to grow; in this case new authors entering the network may dominate the structure.

To address the last point, we simulate a case where we stop adding nodes to the network at 1200 months, and where existing authors with Interest 3 change their interest according to the prescription above when they publish a paper. By 1440 months the effective network diameter drops to 5 and the clustering coefficient actually decreases to 0.46, because many existing nodes that had Interest 3 have started new triangles that are not yet completed. The number of edges in the network has increased from 1.8e6 to 3.9e6 (compared to ~5.9e6 in the case where new nodes are added). At 1440 months the Interest 3 community from 1200 months has significantly changed: the community is only 5% Interest 3, and 51% Interest 1. This result is not surprising, but suggests that the rapid transformation of communities takes place among both existing nodes and new nodes.
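The funding-change prescription used in both scenarios can be sketched as a pair of small functions. This is an illustration under stated assumptions: the function and constant names are hypothetical, and the post-change share of 35% for Interest 2 is inferred as the remainder after the stated 60% and 5%, not a value given in the text.

```python
import random

CHANGE_MONTH = 1200                        # month at which funding shifts

# Interest -> probability, before and after the change.
BEFORE = {1: 1 / 3, 2: 1 / 3, 3: 1 / 3}
AFTER = {1: 0.60, 2: 0.35, 3: 0.05}       # 0.35 for Interest 2 is an assumed remainder


def draw_interest(month, rng):
    """Interest assigned to a new author entering the network at `month`."""
    dist = AFTER if month >= CHANGE_MONTH else BEFORE
    return rng.choices(list(dist), weights=list(dist.values()))[0]


def interest_on_publish(current, month, rng):
    """Interest of an existing first author at the time of a new paper.

    Only authors currently holding Interest 3 redraw, and only after the change.
    """
    if month >= CHANGE_MONTH and current == 3:
        return draw_interest(month, rng)
    return current


rng = random.Random(0)
redrawn = [interest_on_publish(3, 1300, rng) for _ in range(10_000)]
print({i: redrawn.count(i) / len(redrawn) for i in (1, 2, 3)})
```

Under this prescription an Interest 3 author who redraws stays in Interest 3 only about 5% of the time per paper, so repeated publication quickly drains the declining field.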

Conclusions and Further Work

The forward and backward probabilities of the forest fire model that fit each of the yearly snapshots varied from year to year. We could consider a model where the probabilities are a function of the number of nodes in the graph, decreasing as the number of nodes increases. An alternative approach could be a model where the orphan probability decreases as the graph grows, so that it reduces to zero once the graph reaches a certain size.

We would like to add more features to our simple model of network generation. Nodes and edges could age, and be refreshed by new papers; edges could have strengths based on the cosine similarity of the endpoint interests. We have created and run models with these features (shown in the milestone report) but have not presented results here. External changes on collaboration networks could also include influence/outbreak, where author interests spread along contact edges. However, from the viewpoint of creating a simple, realistic model, it is not obvious that such effects are as important as more practical concerns like funding levels and the political/social palatability of particular areas of research.

The results for extended 'longer term' evolution with the simple model are interesting, but not too surprising, because the APS network consists of tight communities with few links between them, i.e. as a whole it is barely a network. The average number of edges from a node to a node in a different community is much less than one. This is not astonishing; coauthoring a paper is a much more involved undertaking and commitment than friending someone or liking their post. The main conclusion is that the APS network is formed by authors choosing like-minded (or like-funded) coauthors that are well-connected.

References

[1]
[2] A. Clauset, M.E.J. Newman and C. Moore, "Finding community structure in very large networks." Phys. Rev. E 70 (2004).
[3] J. Leskovec, J. Kleinberg and C. Faloutsos, "Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations." ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD).
[4] J. Leskovec, D. Chakrabarti, J. Kleinberg, C. Faloutsos, and Z. Ghahramani, "Kronecker graphs: An approach to modeling networks." Journal of Machine Learning Research (JMLR).

Tasks/Roles

C. Roche: compiled Louvain for Mac; ran and analyzed Forest Fire and Kronecker models.
J. Kane: formulated project goals; supervised project; preprocessed APS data; created network snapshots; performed Louvain community detection and analysis; built and ran simple model and analyzed output; did long-term evolution runs; assembled final report and wrote 80% of it.


More information

Link Analysis Ranking

Link Analysis Ranking Link Analysis Ranking How do search engines decide how to rank your query results? Guess why Google ranks the query results the way it does How would you do it? Naïve ranking of query results Given query

More information

Prediction of Citations for Academic Papers

Prediction of Citations for Academic Papers 000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050

More information

Identification of Bursts in a Document Stream

Identification of Bursts in a Document Stream Identification of Bursts in a Document Stream Toshiaki FUJIKI 1, Tomoyuki NANNO 1, Yasuhiro SUZUKI 1 and Manabu OKUMURA 2 1 Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute

More information

Predicting flight on-time performance

Predicting flight on-time performance 1 Predicting flight on-time performance Arjun Mathur, Aaron Nagao, Kenny Ng I. INTRODUCTION Time is money, and delayed flights are a frequent cause of frustration for both travellers and airline companies.

More information

RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs

RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs Leman Akoglu Mary McGlohon Christos Faloutsos Carnegie Mellon University, School of Computer Science {lakoglu, mmcgloho, christos}@cs.cmu.edu

More information

Machine Learning for OR & FE

Machine Learning for OR & FE Machine Learning for OR & FE Regression II: Regularization and Shrinkage Methods Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com

More information

RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs

RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs Submitted for Blind Review Abstract How do real, weighted graphs change over time? What patterns, if any, do they obey? Earlier studies

More information

Chaos, Complexity, and Inference (36-462)

Chaos, Complexity, and Inference (36-462) Chaos, Complexity, and Inference (36-462) Lecture 21: More Networks: Models and Origin Myths Cosma Shalizi 31 March 2009 New Assignment: Implement Butterfly Mode in R Real Agenda: Models of Networks, with

More information

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18 CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$

More information

Chaos, Complexity, and Inference (36-462)

Chaos, Complexity, and Inference (36-462) Chaos, Complexity, and Inference (36-462) Lecture 21 Cosma Shalizi 3 April 2008 Models of Networks, with Origin Myths Erdős-Rényi Encore Erdős-Rényi with Node Types Watts-Strogatz Small World Graphs Exponential-Family

More information

CS 224w: Problem Set 1

CS 224w: Problem Set 1 CS 224w: Problem Set 1 Tony Hyun Kim October 8, 213 1 Fighting Reticulovirus avarum 1.1 Set of nodes that will be infected We are assuming that once R. avarum infects a host, it always infects all of the

More information

CS 277: Data Mining. Mining Web Link Structure. CS 277: Data Mining Lectures Analyzing Web Link Structure Padhraic Smyth, UC Irvine

CS 277: Data Mining. Mining Web Link Structure. CS 277: Data Mining Lectures Analyzing Web Link Structure Padhraic Smyth, UC Irvine CS 277: Data Mining Mining Web Link Structure Class Presentations In-class, Tuesday and Thursday next week 2-person teams: 6 minutes, up to 6 slides, 3 minutes/slides each person 1-person teams 4 minutes,

More information

Spatial and Temporal Behaviors in a Modified Evolution Model Based on Small World Network

Spatial and Temporal Behaviors in a Modified Evolution Model Based on Small World Network Commun. Theor. Phys. (Beijing, China) 42 (2004) pp. 242 246 c International Academic Publishers Vol. 42, No. 2, August 15, 2004 Spatial and Temporal Behaviors in a Modified Evolution Model Based on Small

More information

Network Observational Methods and. Quantitative Metrics: II

Network Observational Methods and. Quantitative Metrics: II Network Observational Methods and Whitney topics Quantitative Metrics: II Community structure (some done already in Constraints - I) The Zachary Karate club story Degree correlation Calculating degree

More information

1 Searching the World Wide Web

1 Searching the World Wide Web Hubs and Authorities in a Hyperlinked Environment 1 Searching the World Wide Web Because diverse users each modify the link structure of the WWW within a relatively small scope by creating web-pages on

More information

Supplementary Information Activity driven modeling of time varying networks

Supplementary Information Activity driven modeling of time varying networks Supplementary Information Activity driven modeling of time varying networks. Perra, B. Gonçalves, R. Pastor-Satorras, A. Vespignani May 11, 2012 Contents 1 The Model 1 1.1 Integrated network......................................

More information

Clustering. CSL465/603 - Fall 2016 Narayanan C Krishnan

Clustering. CSL465/603 - Fall 2016 Narayanan C Krishnan Clustering CSL465/603 - Fall 2016 Narayanan C Krishnan ckn@iitrpr.ac.in Supervised vs Unsupervised Learning Supervised learning Given x ", y " "%& ', learn a function f: X Y Categorical output classification

More information

T , Lecture 6 Properties and stochastic models of real-world networks

T , Lecture 6 Properties and stochastic models of real-world networks T-79.7003, Lecture 6 Properties and stochastic models of real-world networks Charalampos E. Tsourakakis 1 1 Aalto University November 1st, 2013 Properties of real-world networks Properties of real-world

More information

Cover Page. The handle holds various files of this Leiden University dissertation

Cover Page. The handle  holds various files of this Leiden University dissertation Cover Page The handle http://hdl.handle.net/1887/39637 holds various files of this Leiden University dissertation Author: Smit, Laurens Title: Steady-state analysis of large scale systems : the successive

More information

Class President: A Network Approach to Popularity. Due July 18, 2014

Class President: A Network Approach to Popularity. Due July 18, 2014 Class President: A Network Approach to Popularity Due July 8, 24 Instructions. Due Fri, July 8 at :59 PM 2. Work in groups of up to 3 3. Type up the report, and submit as a pdf on D2L 4. Attach the code

More information

Lecture 21: Spectral Learning for Graphical Models

Lecture 21: Spectral Learning for Graphical Models 10-708: Probabilistic Graphical Models 10-708, Spring 2016 Lecture 21: Spectral Learning for Graphical Models Lecturer: Eric P. Xing Scribes: Maruan Al-Shedivat, Wei-Cheng Chang, Frederick Liu 1 Motivation

More information

arxiv:cond-mat/ v1 [cond-mat.dis-nn] 4 May 2000

arxiv:cond-mat/ v1 [cond-mat.dis-nn] 4 May 2000 Topology of evolving networks: local events and universality arxiv:cond-mat/0005085v1 [cond-mat.dis-nn] 4 May 2000 Réka Albert and Albert-László Barabási Department of Physics, University of Notre-Dame,

More information

Lecture 20 : Markov Chains

Lecture 20 : Markov Chains CSCI 3560 Probability and Computing Instructor: Bogdan Chlebus Lecture 0 : Markov Chains We consider stochastic processes. A process represents a system that evolves through incremental changes called

More information

LINK ANALYSIS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS

LINK ANALYSIS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS LINK ANALYSIS Dr. Gjergji Kasneci Introduction to Information Retrieval WS 2012-13 1 Outline Intro Basics of probability and information theory Retrieval models Retrieval evaluation Link analysis Models

More information

Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides

Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides Web Search: How to Organize the Web? Ranking Nodes on Graphs Hubs and Authorities PageRank How to Solve PageRank

More information

Lecture: Local Spectral Methods (1 of 4)

Lecture: Local Spectral Methods (1 of 4) Stat260/CS294: Spectral Graph Methods Lecture 18-03/31/2015 Lecture: Local Spectral Methods (1 of 4) Lecturer: Michael Mahoney Scribe: Michael Mahoney Warning: these notes are still very rough. They provide

More information

Lecture 14: Random Walks, Local Graph Clustering, Linear Programming

Lecture 14: Random Walks, Local Graph Clustering, Linear Programming CSE 521: Design and Analysis of Algorithms I Winter 2017 Lecture 14: Random Walks, Local Graph Clustering, Linear Programming Lecturer: Shayan Oveis Gharan 3/01/17 Scribe: Laura Vonessen Disclaimer: These

More information

Models of Communication Dynamics for Simulation of Information Diffusion

Models of Communication Dynamics for Simulation of Information Diffusion Models of Communication Dynamics for Simulation of Information Diffusion Konstantin Mertsalov, Malik Magdon-Ismail, Mark Goldberg Rensselaer Polytechnic Institute Department of Computer Science 11 8th

More information

Modularity in several random graph models

Modularity in several random graph models Modularity in several random graph models Liudmila Ostroumova Prokhorenkova 1,3 Advanced Combinatorics and Network Applications Lab Moscow Institute of Physics and Technology Moscow, Russia Pawe l Pra

More information

Web Structure Mining Nodes, Links and Influence

Web Structure Mining Nodes, Links and Influence Web Structure Mining Nodes, Links and Influence 1 Outline 1. Importance of nodes 1. Centrality 2. Prestige 3. Page Rank 4. Hubs and Authority 5. Metrics comparison 2. Link analysis 3. Influence model 1.

More information

Nonlinear Dynamical Behavior in BS Evolution Model Based on Small-World Network Added with Nonlinear Preference

Nonlinear Dynamical Behavior in BS Evolution Model Based on Small-World Network Added with Nonlinear Preference Commun. Theor. Phys. (Beijing, China) 48 (2007) pp. 137 142 c International Academic Publishers Vol. 48, No. 1, July 15, 2007 Nonlinear Dynamical Behavior in BS Evolution Model Based on Small-World Network

More information

Supporting Statistical Hypothesis Testing Over Graphs

Supporting Statistical Hypothesis Testing Over Graphs Supporting Statistical Hypothesis Testing Over Graphs Jennifer Neville Departments of Computer Science and Statistics Purdue University (joint work with Tina Eliassi-Rad, Brian Gallagher, Sergey Kirshner,

More information

Mini course on Complex Networks

Mini course on Complex Networks Mini course on Complex Networks Massimo Ostilli 1 1 UFSC, Florianopolis, Brazil September 2017 Dep. de Fisica Organization of The Mini Course Day 1: Basic Topology of Equilibrium Networks Day 2: Percolation

More information

Protein Complex Identification by Supervised Graph Clustering

Protein Complex Identification by Supervised Graph Clustering Protein Complex Identification by Supervised Graph Clustering Yanjun Qi 1, Fernanda Balem 2, Christos Faloutsos 1, Judith Klein- Seetharaman 1,2, Ziv Bar-Joseph 1 1 School of Computer Science, Carnegie

More information

Congratulations! You ve completed Practice Test 1! You re now ready to check your

Congratulations! You ve completed Practice Test 1! You re now ready to check your Practice Test 1: Answers and Explanations Congratulations! You ve completed Practice Test 1! You re now ready to check your answers to see how you fared. In this chapter, I provide the answers, including

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu 2/7/2012 Jure Leskovec, Stanford C246: Mining Massive Datasets 2 Web pages are not equally important www.joe-schmoe.com

More information

4 : Exact Inference: Variable Elimination

4 : Exact Inference: Variable Elimination 10-708: Probabilistic Graphical Models 10-708, Spring 2014 4 : Exact Inference: Variable Elimination Lecturer: Eric P. ing Scribes: Soumya Batra, Pradeep Dasigi, Manzil Zaheer 1 Probabilistic Inference

More information

Intuitionistic Fuzzy Estimation of the Ant Methodology

Intuitionistic Fuzzy Estimation of the Ant Methodology BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 9, No 2 Sofia 2009 Intuitionistic Fuzzy Estimation of the Ant Methodology S Fidanova, P Marinov Institute of Parallel Processing,

More information

Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication

Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication Jurij Leskovec 1, Deepayan Chakrabarti 1, Jon Kleinberg 2, and Christos Faloutsos 1 1 School of Computer

More information

Lecture 2: Divide and conquer and Dynamic programming

Lecture 2: Divide and conquer and Dynamic programming Chapter 2 Lecture 2: Divide and conquer and Dynamic programming 2.1 Divide and Conquer Idea: - divide the problem into subproblems in linear time - solve subproblems recursively - combine the results in

More information

Quantum Percolation: Electrons in a Maze. Brianna Dillon-Thomas, PhD 2016

Quantum Percolation: Electrons in a Maze. Brianna Dillon-Thomas, PhD 2016 Quantum Percolation: Electrons in a Maze Brianna Dillon-Thomas, PhD 2016 Physicists, especially theoretical physicists, love to make models of the world to help us understand it. We weigh various effects

More information

Preliminaries and Complexity Theory

Preliminaries and Complexity Theory Preliminaries and Complexity Theory Oleksandr Romanko CAS 746 - Advanced Topics in Combinatorial Optimization McMaster University, January 16, 2006 Introduction Book structure: 2 Part I Linear Algebra

More information

Class Note #14. In this class, we studied an algorithm for integer multiplication, which. 2 ) to θ(n

Class Note #14. In this class, we studied an algorithm for integer multiplication, which. 2 ) to θ(n Class Note #14 Date: 03/01/2006 [Overall Information] In this class, we studied an algorithm for integer multiplication, which improved the running time from θ(n 2 ) to θ(n 1.59 ). We then used some of

More information

arxiv: v2 [stat.ml] 21 Aug 2009

arxiv: v2 [stat.ml] 21 Aug 2009 KRONECKER GRAPHS: AN APPROACH TO MODELING NETWORKS Kronecker graphs: An Approach to Modeling Networks arxiv:0812.4905v2 [stat.ml] 21 Aug 2009 Jure Leskovec Computer Science Department, Stanford University

More information

Online Social Networks and Media. Link Analysis and Web Search

Online Social Networks and Media. Link Analysis and Web Search Online Social Networks and Media Link Analysis and Web Search How to Organize the Web First try: Human curated Web directories Yahoo, DMOZ, LookSmart How to organize the web Second try: Web Search Information

More information

Spectral Analysis of Directed Complex Networks. Tetsuro Murai

Spectral Analysis of Directed Complex Networks. Tetsuro Murai MASTER THESIS Spectral Analysis of Directed Complex Networks Tetsuro Murai Department of Physics, Graduate School of Science and Engineering, Aoyama Gakuin University Supervisors: Naomichi Hatano and Kenn

More information

Real Estate Price Prediction with Regression and Classification CS 229 Autumn 2016 Project Final Report

Real Estate Price Prediction with Regression and Classification CS 229 Autumn 2016 Project Final Report Real Estate Price Prediction with Regression and Classification CS 229 Autumn 2016 Project Final Report Hujia Yu, Jiafu Wu [hujiay, jiafuwu]@stanford.edu 1. Introduction Housing prices are an important

More information

Approximate Inference

Approximate Inference Approximate Inference Simulation has a name: sampling Sampling is a hot topic in machine learning, and it s really simple Basic idea: Draw N samples from a sampling distribution S Compute an approximate

More information

Pattern Recognition Approaches to Solving Combinatorial Problems in Free Groups

Pattern Recognition Approaches to Solving Combinatorial Problems in Free Groups Contemporary Mathematics Pattern Recognition Approaches to Solving Combinatorial Problems in Free Groups Robert M. Haralick, Alex D. Miasnikov, and Alexei G. Myasnikov Abstract. We review some basic methodologies

More information

Complex Networks, Course 303A, Spring, Prof. Peter Dodds

Complex Networks, Course 303A, Spring, Prof. Peter Dodds Complex Networks, Course 303A, Spring, 2009 Prof. Peter Dodds Department of Mathematics & Statistics University of Vermont Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.

More information

Purnamrita Sarkar (Carnegie Mellon) Deepayan Chakrabarti (Yahoo! Research) Andrew W. Moore (Google, Inc.)

Purnamrita Sarkar (Carnegie Mellon) Deepayan Chakrabarti (Yahoo! Research) Andrew W. Moore (Google, Inc.) Purnamrita Sarkar (Carnegie Mellon) Deepayan Chakrabarti (Yahoo! Research) Andrew W. Moore (Google, Inc.) Which pair of nodes {i,j} should be connected? Variant: node i is given Alice Bob Charlie Friend

More information

arxiv: v1 [cs.it] 26 Sep 2018

arxiv: v1 [cs.it] 26 Sep 2018 SAPLING THEORY FOR GRAPH SIGNALS ON PRODUCT GRAPHS Rohan A. Varma, Carnegie ellon University rohanv@andrew.cmu.edu Jelena Kovačević, NYU Tandon School of Engineering jelenak@nyu.edu arxiv:809.009v [cs.it]

More information

QR FACTORIZATIONS USING A RESTRICTED SET OF ROTATIONS

QR FACTORIZATIONS USING A RESTRICTED SET OF ROTATIONS QR FACTORIZATIONS USING A RESTRICTED SET OF ROTATIONS DIANNE P. O LEARY AND STEPHEN S. BULLOCK Dedicated to Alan George on the occasion of his 60th birthday Abstract. Any matrix A of dimension m n (m n)

More information

Integrated CME Project Mathematics I-III 2013

Integrated CME Project Mathematics I-III 2013 A Correlation of -III To the North Carolina High School Mathematics Math I A Correlation of, -III, Introduction This document demonstrates how, -III meets the standards of the Math I. Correlation references

More information

Making Our Cities Safer: A Study In Neighbhorhood Crime Patterns

Making Our Cities Safer: A Study In Neighbhorhood Crime Patterns Making Our Cities Safer: A Study In Neighbhorhood Crime Patterns Aly Kane alykane@stanford.edu Ariel Sagalovsky asagalov@stanford.edu Abstract Equipped with an understanding of the factors that influence

More information

Computing PageRank using Power Extrapolation

Computing PageRank using Power Extrapolation Computing PageRank using Power Extrapolation Taher Haveliwala, Sepandar Kamvar, Dan Klein, Chris Manning, and Gene Golub Stanford University Abstract. We present a novel technique for speeding up the computation

More information

Dynamics of Real-world Networks

Dynamics of Real-world Networks Dynamics of Real-world Networks Thesis proposal Jurij Leskovec Machine Learning Department Carnegie Mellon University May 2, 2007 Thesis committee: Christos Faloutsos, CMU Avrim Blum, CMU John Lafferty,

More information

0.1 O. R. Katta G. Murty, IOE 510 Lecture slides Introductory Lecture. is any organization, large or small.

0.1 O. R. Katta G. Murty, IOE 510 Lecture slides Introductory Lecture. is any organization, large or small. 0.1 O. R. Katta G. Murty, IOE 510 Lecture slides Introductory Lecture Operations Research is the branch of science dealing with techniques for optimizing the performance of systems. System is any organization,

More information

RaRE: Social Rank Regulated Large-scale Network Embedding

RaRE: Social Rank Regulated Large-scale Network Embedding RaRE: Social Rank Regulated Large-scale Network Embedding Authors: Yupeng Gu 1, Yizhou Sun 1, Yanen Li 2, Yang Yang 3 04/26/2018 The Web Conference, 2018 1 University of California, Los Angeles 2 Snapchat

More information

CS345a: Data Mining Jure Leskovec and Anand Rajaraman Stanford University

CS345a: Data Mining Jure Leskovec and Anand Rajaraman Stanford University CS345a: Data Mining Jure Leskovec and Anand Rajaraman Stanford University TheFind.com Large set of products (~6GB compressed) For each product A=ributes Related products Craigslist About 3 weeks of data

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu How to organize/navigate it? First try: Human curated Web directories Yahoo, DMOZ, LookSmart

More information

Collaborative Filtering. Radek Pelánek

Collaborative Filtering. Radek Pelánek Collaborative Filtering Radek Pelánek 2017 Notes on Lecture the most technical lecture of the course includes some scary looking math, but typically with intuitive interpretation use of standard machine

More information

Lecture 1: March 7, 2018

Lecture 1: March 7, 2018 Reinforcement Learning Spring Semester, 2017/8 Lecture 1: March 7, 2018 Lecturer: Yishay Mansour Scribe: ym DISCLAIMER: Based on Learning and Planning in Dynamical Systems by Shie Mannor c, all rights

More information

Oriented majority-vote model in social dynamics

Oriented majority-vote model in social dynamics Author: Facultat de Física, Universitat de Barcelona, Diagonal 645, 08028 Barcelona, Spain. Advisor: M. Ángeles Serrano Mass events ruled by collective behaviour are present in our society every day. Some

More information

Correlation Lengths of Red and Blue Galaxies: A New Cosmic Ruler

Correlation Lengths of Red and Blue Galaxies: A New Cosmic Ruler 10/22/08 Correlation Lengths of Red and Blue Galaxies: A New Cosmic Ruler Michael J. Longo University of Michigan, Ann Arbor, MI 48109 A comparison of the correlation lengths of red galaxies with blue

More information

Communities Via Laplacian Matrices. Degree, Adjacency, and Laplacian Matrices Eigenvectors of Laplacian Matrices

Communities Via Laplacian Matrices. Degree, Adjacency, and Laplacian Matrices Eigenvectors of Laplacian Matrices Communities Via Laplacian Matrices Degree, Adjacency, and Laplacian Matrices Eigenvectors of Laplacian Matrices The Laplacian Approach As with betweenness approach, we want to divide a social graph into

More information

Introduction to Search Engine Technology Introduction to Link Structure Analysis. Ronny Lempel Yahoo Labs, Haifa

Introduction to Search Engine Technology Introduction to Link Structure Analysis. Ronny Lempel Yahoo Labs, Haifa Introduction to Search Engine Technology Introduction to Link Structure Analysis Ronny Lempel Yahoo Labs, Haifa Outline Anchor-text indexing Mathematical Background Motivation for link structure analysis

More information

Collaborative Filtering Applied to Educational Data Mining

Collaborative Filtering Applied to Educational Data Mining Journal of Machine Learning Research (200) Submitted ; Published Collaborative Filtering Applied to Educational Data Mining Andreas Töscher commendo research 8580 Köflach, Austria andreas.toescher@commendo.at

More information

Greedy Search in Social Networks

Greedy Search in Social Networks Greedy Search in Social Networks David Liben-Nowell Carleton College dlibenno@carleton.edu Joint work with Ravi Kumar, Jasmine Novak, Prabhakar Raghavan, and Andrew Tomkins. IPAM, Los Angeles 8 May 2007

More information

Teaching a Prestatistics Course: Propelling Non-STEM Students Forward

Teaching a Prestatistics Course: Propelling Non-STEM Students Forward Teaching a Prestatistics Course: Propelling Non-STEM Students Forward Jay Lehmann College of San Mateo MathNerdJay@aol.com www.pearsonhighered.com/lehmannseries Learning Is in the Details Detailing concepts

More information

LAMMPS Simulation of a Microgravity Shear Cell 299r Progress Report Taiyo Wilson. Units/Parameters:

LAMMPS Simulation of a Microgravity Shear Cell 299r Progress Report Taiyo Wilson. Units/Parameters: Units/Parameters: In our simulations, we chose to express quantities in terms of three fundamental values: m (particle mass), d (particle diameter), and τ (timestep, which is equivalent to (g/d)^0.5, where

More information

INFO 2950 Intro to Data Science. Lecture 18: Power Laws and Big Data

INFO 2950 Intro to Data Science. Lecture 18: Power Laws and Big Data INFO 2950 Intro to Data Science Lecture 18: Power Laws and Big Data Paul Ginsparg Cornell University, Ithaca, NY 7 Apr 2016 1/25 Power Laws in log-log space y = cx k (k=1/2,1,2) log 10 y = k log 10 x +log

More information

Learning Energy-Based Models of High-Dimensional Data

Learning Energy-Based Models of High-Dimensional Data Learning Energy-Based Models of High-Dimensional Data Geoffrey Hinton Max Welling Yee-Whye Teh Simon Osindero www.cs.toronto.edu/~hinton/energybasedmodelsweb.htm Discovering causal structure as a goal

More information

Lower Bounds for Testing Bipartiteness in Dense Graphs

Lower Bounds for Testing Bipartiteness in Dense Graphs Lower Bounds for Testing Bipartiteness in Dense Graphs Andrej Bogdanov Luca Trevisan Abstract We consider the problem of testing bipartiteness in the adjacency matrix model. The best known algorithm, due

More information

Algebra 1 Yearlong Curriculum Plan. Last modified: June 2014

Algebra 1 Yearlong Curriculum Plan. Last modified: June 2014 Algebra 1 Yearlong Curriculum Plan Last modified: June 2014 SUMMARY This curriculum plan is divided into four academic quarters. In Quarter 1, students first dive deeper into the real number system before

More information

Course Number 432/433 Title Algebra II (A & B) H Grade # of Days 120

Course Number 432/433 Title Algebra II (A & B) H Grade # of Days 120 Whitman-Hanson Regional High School provides all students with a high- quality education in order to develop reflective, concerned citizens and contributing members of the global community. Course Number

More information