Published in: ICCSA 2014: Fourth International Conference on Complex Systems and Applications: June 23-26, 2014, Le Havre, Normandie, France

Similar documents
Recent revisions of phosphate rock reserves and resources: a critique Edixhoven, J.D.; Gupta, J.; Savenije, H.H.G.

Data-driven methods in application to flood defence systems monitoring and analysis Pyayt, A.

UvA-DARE (Digital Academic Repository)

Citation for published version (APA): Harinck, S. (2001). Conflict issues matter : how conflict issues influence negotiation

Rotational symmetry breaking in the topological superconductor SrxBi2Se3 probed by uppercritical

UvA-DARE (Digital Academic Repository) Converting lignin to aromatics: step by step Strassberger, Z.I. Link to publication

Published in: Tenth Tbilisi Symposium on Language, Logic and Computation: Gudauri, Georgia, September 2013

Mass loss and evolution of asymptotic giant branch stars in the Magellanic Clouds van Loon, J.T.

UvA-DARE (Digital Academic Repository) Phenotypic variation in plants Lauss, K. Link to publication

NMDA receptor dependent functions of hippocampal networks in spatial navigation and memory formation de Oliveira Cabral, H.

Citation for published version (APA): Weber, B. A. (2017). Sliding friction: From microscopic contacts to Amontons law

Physiological and genetic studies towards biofuel production in cyanobacteria Schuurmans, R.M.

On a unified description of non-abelian charges, monopoles and dyons Kampmeijer, L.

UvA-DARE (Digital Academic Repository)

Citation for published version (APA): Hin, V. (2017). Ontogenesis: Eco-evolutionary perspective on life history complexity.

Coherent X-ray scattering of charge order dynamics and phase separation in titanates Shi, B.

Combining radar systems to get a 3D - picture of the bird migration Liechti, F.; Dokter, A.; Shamoun-Baranes, J.Z.; van Gasteren, J.R.; Holleman, I.

Monitoring and modelling hydrological fluxes in support of nutrient cycling studies in Amazonian rain forest ecosystems Tobon-Marin, C.

UvA-DARE (Digital Academic Repository) The syntax of relativization de Vries, M. Link to publication

NMDA receptor dependent functions of hippocampal networks in spatial navigation and memory formation de Oliveira Cabral, H.

UvA-DARE (Digital Academic Repository) Electrokinetics in porous media Luong, D.T. Link to publication

UvA-DARE (Digital Academic Repository) Charge carrier dynamics in photovoltaic materials Jensen, S.A. Link to publication

Water in confinement : ultrafast dynamics of water in reverse micelles Dokter, A.M.

Citation for published version (APA): Jak, S. (2013). Cluster bias: Testing measurement invariance in multilevel data

Love and fear of water: Water dynamics around charged and apolar solutes van der Post, S.T.

Measurement of the charged particle density with the ATLAS detector: First data at vs = 0.9, 2.36 and 7 TeV Kayl, M.S.

Measuring more or less: Estimating product period penetrations from incomplete panel data Hoogendoorn, A.W.

Citation for published version (APA): Ochea, M. I. (2010). Essays on nonlinear evolutionary game dynamics Amsterdam: Thela Thesis

Optical spectroscopy of carrier dynamics in semiconductor nanostructures de Jong, E.M.L.D.

Citation for published version (APA): Nguyen, X. C. (2017). Different nanocrystal systems for carrier multiplication

Citation for published version (APA): Adhyaksa, G. W. P. (2018). Understanding losses in halide perovskite thin films

Young intermediate-mass stars: from a HIFI spectral survey to the physics of the disk inner rim Kama, M.

Theoretical simulation of nonlinear spectroscopy in the liquid phase La Cour Jansen, Thomas

Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017

An impossibility theorem concerning multilateral international comparison of volumes van Veelen, C.M.

Citation for published version (APA): Andogah, G. (2010). Geographically constrained information retrieval Groningen: s.n.

UvA-DARE (Digital Academic Repository) Delayed fracture in poreus media Shahidzadeh, N.F.; Vie, P.; Chateau, X.; Roux, J.N.; Bonn, D.

A polarized view on DNA under tension van Mameren, Joost; Vermeulen, K.; Wuite, G.J.L.; Peterman, E.J.G.

Climate change and topography as drivers of Latin American biome dynamics Flantua, S.G.A.

Power Laws & Rich Get Richer

Downloaded from UvA-DARE, the institutional repository of the University of Amsterdam (UvA)

Love and fear of water: Water dynamics around charged and apolar solutes van der Post, S.T.

Citation for published version (APA): Fathi, K. (2004). Dynamics and morphology in the inner regions of spiral galaxies Groningen: s.n.

Citation for published version (APA): Bremmer, R. H. (2011). Non-contact spectroscopic age determination of bloodstains

Efficient Monitoring Algorithm for Fast News Alert

Citation for published version (APA): Weber, B. A. (2017). Sliding friction: From microscopic contacts to Amontons law

UvA-DARE (Digital Academic Repository) Matrix perturbations: bounding and computing eigenvalues Reis da Silva, R.J. Link to publication

Quantitative Prediction of Crystal Nucleation Rates for Spherical Colloids: A Computational Study Auer, S.

Citation for published version (APA): Susyanto, N. (2016). Semiparametric copula models for biometric score level fusion

NOTES AND CORRESPONDENCE. A Quantitative Estimate of the Effect of Aliasing in Climatological Time Series

arxiv:cond-mat/ v1 [cond-mat.dis-nn] 4 May 2000

Mountain View Community Shuttle Monthly Operations Report

MSc Thesis. Modelling of communication dynamics. Szabolcs Vajna. Supervisor: Prof. János Kertész

Optimization and approximation on systems of geometric objects van Leeuwen, E.J.

Citation for published version (APA): Petrov, D. S. (2003). Bose-Einstein condensation in low-dimensional trapped gases

Bose-Einstein condensation of a finite number of particles trapped in one or three dimensions Ketterle, W.; van Druten, N.J.

Exploring the Patterns of Human Mobility Using Heterogeneous Traffic Trajectory Data

CIMA Professional

CIMA Professional

UvA-DARE (Digital Academic Repository)

System-theoretic properties of port-controlled Hamiltonian systems Maschke, B.M.; van der Schaft, Arjan

6.207/14.15: Networks Lecture 7: Search on Networks: Navigation and Web Search

University of Groningen. Hollow-atom probing of surfaces Limburg, Johannes

Aalborg Universitet. CLIMA proceedings of the 12th REHVA World Congress Heiselberg, Per Kvols. Publication date: 2016

Modeling Dynamic Evolution of Online Friendship Network

Exploring spatial decay effect in mass media and social media: a case study of China

University of Groningen. Laser Spectroscopy of Trapped Ra+ Ion Versolato, Oscar Oreste

Analysis of Violent Crime in Los Angeles County

U N I V E R S I T A S N E G E R I S E M A R A N G J U L Y, S U H A R T M A I L. U N N E S. A C. I D E D I T O R D O A J

UvA-DARE (Digital Academic Repository)

CIMA Professional 2018

Citation for published version (APA): Boomsma, R. (2007). The disk-halo connection in NGC 6946 and NGC 253 s.n.

UvA-DARE (Digital Academic Repository)

Citation for published version (APA): Hin, V. (2017). Ontogenesis: Eco-evolutionary perspective on life history complexity.

Scaling invariant distributions of firms exit in OECD countries

Direct Observation of Entropic Stabilization of bcc Crystals Near Melting Sprakel, J.; Zaccone, A.; Spaepen, F.; Schall, P.; Weitz, D.A.

CIMA Professional 2018

ACCA Interactive Timetable & Fees

University of Groningen. Statistical Auditing and the AOQL-method Talens, Erik

Citation for published version (APA): Shen, C. (2006). Wave Propagation through Photonic Crystal Slabs: Imaging and Localization. [S.l.]: s.n.

Mountain View Community Shuttle Monthly Operations Report

UvA-DARE (Digital Academic Repository)

UvA-DARE (Digital Academic Repository)

University of Groningen. Statistical inference via fiducial methods Salomé, Diemer

UvA-DARE (Digital Academic Repository)

CS5112: Algorithms and Data Structures for Applications

University of Groningen. Water in protoplanetary disks Antonellini, Stefano

SUPPLEMENTARY INFORMATION

Elastodynamic single-sided homogeneous Green's function representation: Theory and examples

Citation for published version (APA): Shen, C. (2006). Wave Propagation through Photonic Crystal Slabs: Imaging and Localization. [S.l.]: s.n.

University of Groningen. Event-based simulation of quantum phenomena Zhao, Shuang

DM-Group Meeting. Subhodip Biswas 10/16/2014

Citation for published version (APA): Hoekstra, S. (2005). Atom Trap Trace Analysis of Calcium Isotopes s.n.

UvA-DARE (Digital Academic Repository)

arxiv:cs/ v1 [cs.dl] 22 Sep 2006

Loco-regional hyperthermia treatment planning: optimisation under uncertainty de Greef, M.

Applied Natural Language Processing

MEP Y7 Practice Book B

Classification of Coronal Mass Ejections and Image Processing Techniques

Why does attention to web articles fall with time?

Transcription:

UvA-DARE (Digital Academic Repository) Dynamics of Media Attention Traag, V.A.; Reinanda, R.; Hicks, J.; van Klinken, G.A. Published in: ICCSA 2014: Fourth International Conference on Complex Systems and Applications: June 23-26, 2014, Le Havre, Normandie, France Link to publication Citation for published version (APA): Traag, V. A., Reinanda, R., Hicks, J., & van Klinken, G. (2014). Dynamics of Media Attention. In M. A. Aziz- Alaoui, C. Bertelle, X. Liu, & D. Olivier (Eds.), ICCSA 2014: Fourth International Conference on Complex Systems and Applications: June 23-26, 2014, Le Havre, Normandie, France (pp. 301-304). Le Havre: Normandie University. General rights It is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), other than for strictly personal, individual use, unless the work is under an open content license (like Creative Commons). Disclaimer/Complaints regulations If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library: http://uba.uva.nl/en/contact, or a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible. UvA-DARE is a service provided by the library of the University of Amsterdam (http://dare.uva.nl) Download date: 01 Jan 2019

ICCSA 2014 Fourth International Conference on Complex Systems and Applications June 23-26, 2014 Le Havre, Normandie, France Editors M.A. Aziz-Alaoui, Cyrille Bertelle, Xinzhi Liu and Damien Olivier

Proceedings of ICCSA 2014 Normandie University, Le Havre, France - June 23-26, 2014 DYNAMICS OF MEDIA ATTENTION V.A. Traag, R. Reinanda, J. Hicks, G. Van Klinken Abstract. Studies of human attention dynamics analyses how attention is focused on specific topics, issues or people. In online social media, there are clear signs of exogenous shocks, bursty dynamics, and an exponential or powerlaw lifetime distribution. We here analyse the attention dynamics of traditional media, focussing on co-occurrence of people in newspaper articles. The results are quite different from online social networks and attention. Different regimes seem to be operating at two different time scales. At short time scales we see evidence of bursty dynamics and fast decaying edge lifetimes and attention. This behaviour disappears for longer time scales, and in that regime we find Poissonian dynamics and slower decaying lifetimes. This suggests that a cascading Poisson process may take place, with issues arising at a constant rate over a long time scale, and faster dynamics at a smaller time scale. Keywords. co-occurrence network media attention attention dynamics lifetime Poisson process. 1 Introduction With the arrival of large scale data sets, interest in quantifying human attention rose. It became possible to measure quite precisely how attention grew and decayed [1]. Moreover, it appeared that many human dynamics showed signs of bursty behaviour: short windows of intense activity with long intermittent time spans of inactivity [2]. The duration a person is active the time between its first and last occurrence, i.e. its lifetime seems to decay as an exponential, while the edge lifetime seems to follow a powerlaw distribution [3, 4]. We analyse a large dataset of newspaper articles from traditional printed media. We show that the dynamics of this dataset are quite different from social media. Our data consist of 140 263 newspaper articles from Indonesia from roughly 2004 to 2012, gathered by a news service called Joyo, mainly focussing on political news. We automatically identify entities by using a technique known as named entity recognition, and only retain person names (we discard organisations and locations) [5]. We then construct a network by creating a node for every person and an edge for each co-occurrence between two persons, and we record the date of the co-occurrence. We only take into account co-occurrences of people in the same sentence, and only about 3.2% of the sentences contain Corresponding author: vincent@traag.net All authors are with the KITLV, Leiden, the Netherlands R. Reinanda is associated to Faculty of Sciences, University of Amsterdam, the Netherlands Manuscript received April 19, 2009; revised January 11, 2010. Magnitude 300 250 200 150 100 50 0 2004 2005 2006 2007 2008 Time 2009 2010 2011 2012 Autocorrelation1.0 0.8 0.6 0.4 0.2 0.0 0 5 10 15 20 Lag 25 30 35 40 50000 40000 30000 20000 10000 0 0.0 0.1 0.2 0.3 Frequency 0.4 0.5 Frequency Figure 1: Daily node frequency. The first panel contains the number of nodes in the newspaper (that have at least one co-occurrence) per day (in gray), with the solid black line representing a smoothing (using convolution with a Hann window of 8 weeks) of this daily frequency. The second panel shows the autocorrelation function, which shows a clear peak at a lag of 7 days, indicating there is a weekly pattern in the data. This is also confirmed by Fourier analysis in the third panel, which shows a clear peak at a frequency of 1/7 0.14. more than one person, so this is quite restrictive. All time is measured in days. 2 Results In total, there are n = 9 467 nodes and they have about k i 12 neighbours on average. Two people co-occur on average about 3 times. Let us first simply look at how these quantities vary over time. Let E t be the number of co-occurrences at time t, and N t the number of nodes that have a co-occurrence at time t. The dynamics of N t follow a distinctive weekly pattern (Fig. 1). This is confirmed by the autocorrelation function, which shows a clear peak at a lag of 7 days with a correlation of about 0.57, while the Fourier transform shows clear peaks at a frequency of about 1/7 0.14. Results for E t follow a similar pat- ICCSA 2014, Normandie University, Le Havre, France June 23-26, 2014 301

Traag, Reinanda, Hicks & Van Klinken 7 6 5 4 3 2 1 0 Nodes per article Edges per article Mon Tue Wed Thu Fri Sat Sun Figure 2: Number of nodes and edges per article per day of the week. Although the largest part of the cyclical behaviour is due to the weekly newscycle (weekend vs. weekday), there still remains some cyclical patterns after normalisation. tern. Although this largely follows the weekly cycle of the number of articles, some cyclic pattern remains if the node frequencies are normalised by the article frequencies (Fig. 2). For the nodes this pattern is quite weak, but for the edges more noticeable. Surprisingly, the cycle seems to be at its high point at the end/beginning of the week, while its low point is attained in the middle of the week. Let us denote by N t (i) the number of co-occurrences of node i with any other node at time t. Neighbours tend to show quite similar patterns of attention, much more than compared to the overall trend. This suggests that attention for people rises and falls together, hinting at some underlying commonality. One possible explanation is that issues arise in which people play a role together, so that they tend to show similar patterns of attention. Let t p (i) = arg max t N t (i) be the peak of the number of co-occurrences of node i with any other node (if there are multiple such times the first is used). We then normalise the time series, such that the peak is centred at 0 with a value of 1, and denote the average of these time series by Ñt (see Fig. 3a). Hence Ñ0 = 1, and we are interested in how Ñt grows for t < 0 and decays for t > 0. It was suggested that Ñt t β, where different exponents β would correspond to different universal classes [1]. However, it was also suggested that Ñt β log t, which is unrealistic for large times since log t < 0 for sufficiently large t. Alternatively, an exponential growth and decay Ñt e β t is a common phenomena, suggesting the rate of growth/decay is constant throughout time. However, we find that none of these satisfactorily model the growth and decay of attention. The logarithmic and exponential grow/decay too slowly at a small time (around 10 30 days), while the powerlaw poorly fits the dynamics for larger times. This suggests that Ñ t e β t t γ might be a better fit. Indeed, this functional form quite accurately captures the growth and decay in our data, and comparing AIC values, clearly points to this model (see Table 1 for results). This suggests that at a short timescale there is a very fast (powerlaw) decay, but that at longer timescales the exponential decay dominates. In fact, it suggest that the attention essentially diverges for t 0. Although there is a large degree of symmetry, which would suggest an endogenous pattern following [1], we do find different exponents for the growth and decay (see Table 1). In particular, the decay seems slightly slower than the growth at short time scales, and nodes tend to occur more frequently after their peak than before their peak on longer time scales, contrary to [6]. Let t s (i, j) be the time of the s th co-occurrence between node i and j. The inter-event time can then be denoted by δ s (i, j) = t s+1 (i, j) t s (i, j), which can possibly be 0 if two (or more) co-occurrences happened at the same day. Similarly for nodes, we denote by t s (i) the s th co-occurrence of i with another node, and by δ s (i) = t s+1 (i) t s (i) the inter-event time. If events happen at a constant rate, we would expect an exponential distribution of inter-event times. If events follow a bursty patter however, we expect to find a powerlaw inter-event time distribution, often observed in other settings [2]. We find that although the inter-event times decay quite fast at a relatively short time scale, the inter-event times for a larger time scale follow an exponential distribution (Fig. 3b). Altogether, a powerlaw with exponential cutoff p(x) x α e βx provides the better fit, compared to a pure powerlaw or pure exponential distribution (log-likelihood ratios 6.3 10 4 and 3 10 4 ). For the edges, using MLE we find coefficients α 1.003 ± 2.6 10 3 and β 0.00151 ± 1.1 10 5, while for the nodes the distribution is slightly less skewed, but decays slightly faster, with α = 1.045 ± 3.7 10 3 and β = 0.00244 ± 2.3 10 5, both using x min = 10. Indeed, for even shorter time intervals, the decay is very fast, and using lower x min gives poorer fits. One possible explanation is that if somebody is involved in an issue, his co-occurrence will show a bursty pattern at this shorter time scale, but that the rate at which somebody is involved in an issue follows a Poisson process over a longer time scale, suggesting something like a cascading Poisson process [2]. If the processes would be stationary, then the expected first time we observe a node would be t 0 (i) = δs(i)2 2 δ s(i) (and similarly so for edges). Of course there is some censoring in the data, and for those nodes (and edges) we would expect to find t 0 (i) t 0 (i). However, if the actual first time a node appears in the paper t 0 (i) is much larger than this expected value t 0 (i), it suggest this really is the first time this person appears in the media. We call the difference t 0 (i) = t 0 (i) t 0 (i) the first time delay, and the distribution is plotted in Fig. 3c. First, 302 ICCSA 2014, Normandie University, Le Havre, France June 23-26, 2014

Dynamics of Media Attention 10-1 10-2 10 0 10-1 10-2 Edges Nodes Attention 10-3 Probability 10-3 10-4 10-4 βlog( t ) +α α t β αe β t αe β t t γ 10-5 -10 4-10 3-10 2-10 1-10 0 010 0 10 1 10 2 10 3 10 4 Time (a) Peak attention 10-5 10-6 10-7 0 500 1000 1500 2000 2500 3000 IET (b) Inter event time 0.06 0.05 Edges Nodes 10 0 Edges Nodes 0.04 10-1 Probability 0.03 Probability 0.02 10-2 0.01 0.00 1500 1000 500 0 500 1000 1500 2000 2500 3000 First Time Delay (c) First time delay 10-3 0 500 1000 1500 2000 2500 3000 Lifetime (d) Life time Figure 3: Attention statistics. Panel (a) shows how attention as in the frequency of occurring grows and decays with the peak attention centred at zero. Panel (b) shows the inter-event time distribution, which has a clear exponential tail, suggesting a Poisson process. Panel (c) shows the distribution of the first time delay. This delay represents the difference between the time at which we would expect to first observe a node (or edge) and the time at which we actually observed it in the dataset. This suggests that new nodes and edges are continuously being observed. Finally, in panel (d) the lifetime the amount of time between the first and last time of observing that node or edge follows a very peculiar distribution. The edge lifetime decays very fast for the first few days, but then only decays very slowly. The node lifetime actually shows an increasing distribution, suggesting they can have a very long lifetime. there is a clear peak around t 0 (i) = 0, suggesting that censoring played a role in observing these nodes for the first time. The fast decay that is visible on the negative part, does not show at the positive part. This implies that many nodes and edges are first observed much later than expected. This suggests that these nodes and edges are only introduced at a later time. Hence, there are continuously new edges and nodes appearing in the news. This seems especially prominent for the edges, which show an almost uniform distribution between 250 2 000 days of difference, whereas the distribution for the nodes decays more continuously. We denote by (i, j) = max t s (i, j) min t s (i, j) the lifetime of an edge, and similarly by (i) the lifetime of a node (Fig. 3d). Most edges have a very low lifetime of only a single day, and the probability to have a larger lifetime quickly decreases. Nonetheless, after an initial rapid decay, the probability decays much slower. This ICCSA 2014, Normandie University, Le Havre, France June 23-26, 2014 303

Traag, Reinanda, Hicks & Van Klinken Growth Decay Model α β log t α t β αe βt αe β t t γ α 0.011 ± 6.9 10 5 0.031 ± 3.2 10 4 0.0056 ± 6.3 10 5 0.0233 ± 1.9 10 4 β 0.0013 ± 9.8 10 6 0.49 ± 2.3 10 3 0.0019 ± 3.1 10 5 6.6 10 4 ± 1.2 10 5 γ 0.36 ± 2.3 10 3 AIC 3 117 3 045 4 883 α 0.014 ± 8.1 10 5 0.037 ± 3.8 10 4 0.0067 ± 7.1 10 5 0.028 ± 2.3 10 4 β 0.0017 ± 1.1 10 5 0.46 ± 2.1 10 3 0.0015 ± 2.3 10 5 5.5 10 4 ± 9.8 10 6 γ 0.34 ± 2.2 10 3 AIC 2 559 2 986 4 820 Table 1: Growth and decay estimates and model performance. The AIC differences are quite high, and suggest that αe β t t γ is the best model. suggests that besides the more volatile short term links, there are quite some long term stable links. The node lifetimes show a quite unexpected behaviour. Although there are nodes that have a lifetime that is quite short, nodes tend to have a longer lifetime, only cutoff by the duration of the dataset. This suggests that the lifetime of nodes can be extremely long, and can easily run in the decades. 3 Conclusion [5] J. R. Finkel, T. Grenager, C. Manning, Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, ACL 05 (Association for Computational Linguistics, Stroudsburg, PA, USA, 2005), p. 363370. [6] J. Leskovec, L. Backstrom, J. Kleinberg, Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 09 (ACM, New York, NY, USA, 2009), p. 497506. Online social media show signs of exogenous shocks, bursty dynamics and exponential lifetime distributions. We have shown here that traditional media seems to operate differently. The current results suggest the media operates at two different time scales. There is a short time scale which operates at the level of issues: links have short lifetime, attention decays quickly and there are indications of burstiness. However, at a longer time scale, these issues seem to occur at a uniform rate, and often involve similar actors: nodes and links have a relatively long lifetime, attention decays slower, and inter-event times decay exponentially. We aim to further analyse this idea in future research, following the cascading Poisson model [2]. References [1] R. Crane, D. Sornette, Proc. Natl. Acad. Sci. USA 105, 15649 (2008). [2] R. D. Malmgren, D. B. Stouffer, A. E. Motter, L. A. N. Amaral, Proc. Natl. Acad. Sci. USA 105, 18153 (2008). [3] C. A. Hidalgo, C. Rodriguez-Sickert, Physica A 387, 3017 (2008). [4] J. Leskovec, L. Backstrom, R. Kumar, A. Tomkins, Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 08 (ACM, New York, NY, USA, 2008), p. 462470. 304 ICCSA 2014, Normandie University, Le Havre, France June 23-26, 2014