Supplementary Information to
The role of endogenous and exogenous mechanisms in the formation of R&D networks

Mario V. Tomasello 1, Nicola Perra 2, Claudio J. Tessone 1, Márton Karsai 3, and Frank Schweitzer 1

1 Chair of Systems Design, Department of Management, Technology and Economics (D-MTEC), ETH Zurich, Weinbergstrasse 56/58, 8092 Zurich, Switzerland
2 Laboratory for the Modeling of Biological and Socio-technical Systems, Northeastern University, Boston, MA 02115, USA
3 Laboratoire de l'Informatique du Parallélisme, INRIA-UMR 5668, IXXI, ENS de Lyon, 69364 Lyon, France

Dataset details

The typical dispersed and right-skewed distribution of partners per alliance is a universal feature across industrial sectors. In Fig. 1 we show this distribution for the nine largest sectors reported in the dataset.

Empirical firm activities are robust with respect to the time window t over which they are measured. In Fig. 2, we show that shifting a time window of any length t along the 26-year observation period of the dataset does not affect the results. In addition, we find that the activity distribution is robust to the sectoral classification of the firms. In Fig. 3 we show the empirical firm activity distributions (computed on four different time windows) for the nine largest sectors reported in the dataset.

Numerical simulation results

For each of the 684,000 computer simulations we run, we test the resulting generated R&D network with respect to three properties: average degree k, average path length l and global clustering coefficient C. In Fig. 4 we show how these three quantities are distributed across all 684,000 realizations and we compare them with the observed values k_OBS, l_OBS and C_OBS. We find that the global clustering coefficient and the average path length distributions are peaked around the observed values. The average degree distribution, however, does not display any peak, although it is relatively narrow and centered around the real value (note the values on the x-axis in Fig. 4).
The fact that these three distributions are centered around the real values indicates that our model captures the topology of the observed network well for a large set of free parameters, even though we have imposed only a few features of the network (the number of nodes N and alliances E, and the distributions of node activities a_i and partners per alliance m). At the same time, the distributions of k, l and C are not excessively narrow, showing that we can meaningfully perform an exploration and consequently a fit of the free parameters of our model.

mtomasello@ethz.ch
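As an illustration of how the three properties can be measured on a generated network, the following Python sketch computes the average degree, the average path length and the global clustering coefficient from a list of alliances. It rests on assumptions that are not stated in this document: every alliance is taken to link all of its partners pairwise, the average path length is taken over connected pairs of nodes only, and the global clustering coefficient is taken to be the transitivity (closed triplets over connected triplets). The function names are hypothetical, and this is not the code used for the simulations.

```python
from collections import deque
from itertools import combinations


def build_adjacency(alliances):
    """Build an undirected adjacency map from alliances.

    Assumption: each alliance (a tuple of partner ids) links all of its
    partners pairwise, i.e. it forms a clique.
    """
    adj = {}
    for partners in alliances:
        members = set(partners)
        for node in members:
            adj.setdefault(node, set())
        for u, v in combinations(members, 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj


def average_degree(adj):
    """Mean number of neighbours per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over connected pairs (BFS from each node)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # reachable nodes, excluding the source
    return total / pairs if pairs else 0.0


def global_clustering(adj):
    """Transitivity: closed triplets divided by all connected triplets."""
    closed, triplets = 0, 0
    for nbrs in adj.values():
        k = len(nbrs)
        triplets += k * (k - 1) // 2
        closed += sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return closed / triplets if triplets else 0.0


# Toy example (illustrative data, not taken from the SDC dataset).
toy_alliances = [("A", "B", "C"), ("C", "D")]
toy_adj = build_adjacency(toy_alliances)
k_sim = average_degree(toy_adj)
l_sim = average_path_length(toy_adj)
c_sim = global_clustering(toy_adj)
```

On the toy network, the three returned values play the role of k, l and C for a single realization; in a full exploration they would be collected over all runs at each point of the parameter space.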
[Figure 1: nine panels, one per sector (Computer Software; Pharmaceuticals; R&D, Lab and Testing; Computer Hardware; Electronic Components; Communications Equipment; Medical Supplies; Telephone Communications; Universities); axis label: partners per alliance.]

Figure 1: Distribution of the number of partners per alliance for the nine largest industrial sectors, as measured from the SDC dataset.

The error threshold value ε_0 that we impose for the computation of the Likelihood score influences the number of points in the parameter space that fulfill our matching criteria. As expected, decreasing ε_0 reduces the number of points displaying high Likelihood scores, because a better representation of reality is required. In Fig. 5 we show the Likelihood scores of every point in the parameter space for six different values of ε_0, ranging from 1% to 10%. For our analysis, we take a conservative approach and fix ε_0 = 2%.
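The effect of the threshold can be made concrete with a short Python sketch. The exact definition of the Likelihood score is not given in this document, so the sketch assumes one plausible form: the score of a point in the parameter space is the fraction of its runs in which k, l and C all lie within a relative error ε_0 of the observed values. The function names and the numbers are hypothetical and purely illustrative.

```python
def within_threshold(simulated, observed, eps0):
    """True if the relative error of `simulated` w.r.t. `observed` is at most eps0."""
    return abs(simulated - observed) <= eps0 * abs(observed)


def likelihood_score(runs, observed, eps0):
    """Fraction of runs matching all observed quantities within eps0.

    Assumption: this is one possible reading of the Likelihood score,
    not necessarily the definition used in the paper.
    runs     -- list of (k, l, C) tuples for one parameter point
    observed -- tuple (k_obs, l_obs, C_obs)
    """
    if not runs:
        return 0.0
    hits = sum(
        1
        for run in runs
        if all(within_threshold(s, o, eps0) for s, o in zip(run, observed))
    )
    return hits / len(runs)


# Illustrative numbers only (not results from the paper).
observed_example = (2.0, 5.0, 0.10)
runs_example = [
    (2.01, 5.05, 0.101),  # close on all three quantities
    (2.10, 5.00, 0.100),  # average degree off by 5%
    (1.99, 4.95, 0.099),  # close on all three quantities
    (2.00, 6.00, 0.100),  # average path length off by 20%
]
score_strict = likelihood_score(runs_example, observed_example, 0.02)
score_loose = likelihood_score(runs_example, observed_example, 0.10)
```

Under this assumed definition, a smaller threshold can only leave the score of a given point unchanged or lower it, which is consistent with fewer points reaching high scores as ε_0 decreases.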
[Figure 2: six panels showing CCDF(a) for time windows t = 1, 2, 3, 5, 10 and 26 years; in the panels with windows shorter than 26 years, the curves are labelled by year (1984 to 2008).]

Figure 2: Complementary cumulative distribution function (CCDF) of the empirical firm activities, measured on the SDC dataset with six different time windows t of 1, 2, 3, 5, 10 and 26 years. When the time window is shorter than 26 years, we shift the time window along the observation period and show the corresponding activity CCDF.
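The shifting-window procedure behind this figure can be sketched in Python as follows. The definition of firm activity is not given in this document, so the sketch assumes that the activity of a firm in a window is its share of all alliance participations recorded in that window; the CCDF is taken as P(X >= x). The function names and the event format (year, firm) are hypothetical.

```python
from collections import Counter


def activities(events, start, length):
    """Activity of each firm in the window [start, start + length).

    Assumption: activity = firm's share of all alliance participations
    in the window. `events` is a list of (year, firm) pairs.
    """
    counts = Counter(firm for year, firm in events if start <= year < start + length)
    total = sum(counts.values())
    return {firm: n / total for firm, n in counts.items()} if total else {}


def ccdf(values):
    """Empirical CCDF as a sorted list of (x, P(X >= x)) over distinct values."""
    n = len(values)
    counts = Counter(values)
    remaining = n
    points = []
    for x in sorted(counts):
        points.append((x, remaining / n))
        remaining -= counts[x]
    return points


def sliding_ccdfs(events, first_year, last_year, length):
    """Activity CCDF for every position of a window of `length` years."""
    return {
        start: ccdf(list(activities(events, start, length).values()))
        for start in range(first_year, last_year - length + 2)
    }


# Toy events (illustrative, not taken from the SDC dataset).
toy_events = [(2000, "A"), (2000, "B"), (2001, "A"), (2002, "C")]
toy_curves = sliding_ccdfs(toy_events, 2000, 2002, 2)
```

Each key of the returned dictionary is the first year of a window position, and each value is the corresponding CCDF, so the curves for different positions can be overlaid to check whether they coincide.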
[Figure 3: nine panels, one per sector (Computer Software; Pharmaceuticals; R&D, Lab and Testing; Computer Hardware; Electronic Components; Communications Equipment; Medical Supplies; Telephone Communications; Universities), each showing CCDF(a) for time windows t = 1, 5, 10 and 26 years.]

Figure 3: Complementary cumulative distribution function (CCDF) of the empirical firm activities, measured for the nine largest industrial sectors in the SDC dataset.
[Figure 4: three histograms with frequency on the vertical axis; horizontal axes: average degree (1.5 to 3.0), average path length (2 to 12) and clustering coefficient (0.0 to 0.6).]

Figure 4: Distributions of average degree k, average path length l and global clustering coefficient C across all 684,000 runs of our model (each of the 3,420 points in the parameter space has been explored 200 times). The vertical red lines represent the observed values k_OBS, l_OBS and C_OBS in the empirical R&D network.

[Figure 5: six panels, (a) to (f).]

Figure 5: Likelihood scores for all points in the parameter space, for ε_0 equal to 10% (a), 8% (b), 5% (c), 3% (d), 2% (e) and 1% (f).