A proposal for a First-Citation-Speed-Index Link Peer-reviewed author version

Similar documents
OBJECTIVES INTRODUCTION

PHY 171. Lecture 14. (February 16, 2012)

Measures of average are called measures of central tendency and include the mean, median, mode, and midrange.

A note on the multiplication of sparse matrices

COS 424: Interacting with Data. Written Exercises

Soft Computing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis

Chapter 6: Economic Inequality

13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices

Chapter 6 1-D Continuous Groups

Mathematical derivation of the impact factor distribution Link Peer-reviewed author version

A Note on Scheduling Tall/Small Multiprocessor Tasks with Unit Processing Time to Minimize Maximum Tardiness

Feature Extraction Techniques

Cosine similarity and the Borda rule

. The univariate situation. It is well-known for a long tie that denoinators of Pade approxiants can be considered as orthogonal polynoials with respe

Curious Bounds for Floor Function Sums

In this chapter, we consider several graph-theoretic and probabilistic models

LONG-TERM PREDICTIVE VALUE INTERVAL WITH THE FUZZY TIME SERIES

Physics 202H - Introductory Quantum Physics I Homework #12 - Solutions Fall 2004 Due 5:01 PM, Monday 2004/12/13

ma x = -bv x + F rod.

A Better Algorithm For an Ancient Scheduling Problem. David R. Karger Steven J. Phillips Eric Torng. Department of Computer Science

Probability Distributions

Principal Components Analysis

New Slack-Monotonic Schedulability Analysis of Real-Time Tasks on Multiprocessors

lecture 36: Linear Multistep Mehods: Zero Stability

R. L. Ollerton University of Western Sydney, Penrith Campus DC1797, Australia

e-companion ONLY AVAILABLE IN ELECTRONIC FORM

Determining OWA Operator Weights by Mean Absolute Deviation Minimization

A method to determine relative stroke detection efficiencies from multiplicity distributions

Estimation of the Population Mean Based on Extremes Ranked Set Sampling

Ph 20.3 Numerical Solution of Ordinary Differential Equations

Lecture 21. Interior Point Methods Setup and Algorithm

On Poset Merging. 1 Introduction. Peter Chen Guoli Ding Steve Seiden. Keywords: Merging, Partial Order, Lower Bounds. AMS Classification: 68W40

Relations between the shape of a size-frequency distribution and the shape of a rank-frequency distribution Link Peer-reviewed author version

13 Harmonic oscillator revisited: Dirac s approach and introduction to Second Quantization

Gaussian Illuminants and Reflectances for Colour Signal Prediction

26 Impulse and Momentum

A Note on the Applied Use of MDL Approximations

Polygonal Designs: Existence and Construction

AN IMPROVED FUZZY TIME SERIES THEORY WITH APPLICATIONS IN THE SHANGHAI CONTAINERIZED FREIGHT INDEX

E0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis

Keywords: Estimator, Bias, Mean-squared error, normality, generalized Pareto distribution

SPECTRUM sensing is a core concept of cognitive radio

Name: Partner(s): Date: Angular Momentum

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t.

IN modern society that various systems have become more

Midterm 1 Sample Solution

About the definition of parameters and regimes of active two-port networks with variable loads on the basis of projective geometry

CHAPTER ONE. Physics and the Life Sciences

Non-Parametric Non-Line-of-Sight Identification 1

When Short Runs Beat Long Runs

4 = (0.02) 3 13, = 0.25 because = 25. Simi-

List Scheduling and LPT Oliver Braun (09/05/2017)

Estimation of Combined Wave and Storm Surge Overtopping at Earthen Levees

Intelligent Systems: Reasoning and Recognition. Perceptrons and Support Vector Machines

arxiv: v1 [stat.ot] 7 Jul 2010

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon

Lean Walsh Transform

Introduction to Machine Learning. Recitation 11

Efficient Filter Banks And Interpolators

Figure 1: Equivalent electric (RC) circuit of a neurons membrane

The accelerated expansion of the universe is explained by quantum field theory.

Experimental Design For Model Discrimination And Precise Parameter Estimation In WDS Analysis

Estimation of Korean Monthly GDP with Mixed-Frequency Data using an Unobserved Component Error Correction Model

The Weierstrass Approximation Theorem

arxiv: v1 [math.nt] 14 Sep 2014

The Transactional Nature of Quantum Information

[95/95] APPROACH FOR DESIGN LIMITS ANALYSIS IN VVER. Shishkov L., Tsyganov S. Russian Research Centre Kurchatov Institute Russian Federation, Moscow

AN OPTIMAL SHRINKAGE FACTOR IN PREDICTION OF ORDERED RANDOM EFFECTS

time time δ jobs jobs

On Conditions for Linearity of Optimal Estimation

Journal of Modern Physics, 2011, 2, doi: /jmp Published Online November 2011 (

3.8 Three Types of Convergence

Handwriting Detection Model Based on Four-Dimensional Vector Space Model

Reducing Vibration and Providing Robustness with Multi-Input Shapers

Lecture #8-3 Oscillations, Simple Harmonic Motion

Ocean 420 Physical Processes in the Ocean Project 1: Hydrostatic Balance, Advection and Diffusion Answers

Block designs and statistics

Analysis of Impulsive Natural Phenomena through Finite Difference Methods A MATLAB Computational Project-Based Learning

ASSUME a source over an alphabet size m, from which a sequence of n independent samples are drawn. The classical

Designing for the Road User. Maximum Spiral Transition Lengths

Bootstrapping Dependent Data

AUTOMATIC DETECTION OF RWIS SENSOR MALFUNCTIONS (PHASE II) Northland Advanced Transportation Systems Research Laboratories Project B: Fiscal Year 2006

Support Vector Machine Classification of Uncertain and Imbalanced data using Robust Optimization

Lecture 16: Scattering States and the Step Potential. 1 The Step Potential 1. 4 Wavepackets in the step potential 6

IMPROVEMENTS IN DESCRIBING WAVE OVERTOPPING PROCESSES

A Generalized Permanent Estimator and its Application in Computing Multi- Homogeneous Bézout Number

Example A1: Preparation of a Calibration Standard

Birthday Paradox Calculations and Approximation

Physics 2107 Oscillations using Springs Experiment 2

Beyond Mere Convergence

Extension of CSRSM for the Parametric Study of the Face Stability of Pressurized Tunnels

USEFUL HINTS FOR SOLVING PHYSICS OLYMPIAD PROBLEMS. By: Ian Blokland, Augustana Campus, University of Alberta

1 Proof of learning bounds

A Simplified Analytical Approach for Efficiency Evaluation of the Weaving Machines with Automatic Filling Repair

THE CONSTRUCTION OF GOOD EXTENSIBLE RANK-1 LATTICES. 1. Introduction We are interested in approximating a high dimensional integral [0,1]

2 Q 10. Likewise, in case of multiple particles, the corresponding density in 2 must be averaged over all

Many-to-Many Matching Problem with Quotas

Distributed Subgradient Methods for Multi-agent Optimization

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence

Transcription:

A proposal for a First-Citation-Speed-Index Link Peer-reviewed author version Made available by Hasselt University Library in Docuent Server@UHasselt Reference (Published version): EGGHE, Leo; Bornann, L. & Guns, R.(20) A proposal for a First-Citation-Speed-Index. In: JOURNAL OF INFORMETRICS, 5(). p. 8-86 DOI: 0.06/j.joi.200.0.006 Handle: http://hdl.handle.net/942/52

A proposal for a First-Citation-Speed-Index by L. Egghe (*) Universiteit Hasselt (UHasselt), Capus Diepenbeek, Agoralaan, B-3590 Diepenbeek, Belgiu (**) and Universiteit Antwerpen (UA), Inforatie- en Bibliotheekwetenschap, Stadscapus, Venusstraat 35, B-2000 Antwerpen, Belgiu leo.egghe@uhasselt.be L. Bornann Professorship for Social Psychology and Research on Higher Education ETH Zurich Zähringerstr. 24, 8092 Zurich, Switzerland bornann@gess.ethz.ch R. Guns Universiteit Antwerpen (UA), Inforatie- en bibliotheekwetenschap, Stadscapus, Venusstraat 35, 2000 Antwerpen raf.guns@ua.ac.be ABSTRACT In this paper, we define a First-Citation-Speed-Index (FCSI) for a set of papers, based on their ties of publication and of first citation. The index is based on the definition of a h-index for increasing sequences. We show that the index has several good properties in the sense that the shorter the ties are between publication and first citation (in a global anner) the higher the FCSI is.

2 We present two case studies: a first-citation speed coparison of three journals in the field of psychology and a first-citation speed coparison of accepted and rejected, but published elsewhere anuscripts by the journal Angewandte Cheie International Edition. Both case studies indicate that our FCSI satisfies the intuitive feeling of what values a FCSI should have in these cases. (*) Corresponding author (**) Peranent address. Key words and phrases: First-Citation-Speed-Index, FCSI, h-index, increasing sequence.

3 I. Introduction If we have a set of articles (e.g. in a journal) we can deterine, for each article, the publication tie, denoted t P and (if it exists), the tie that this article received its first citation (denoted t C and of course we can assue that t C t P ). How t P and t C can be expressed (e.g; in onths or years) is not iportant at this oent, but we will spend soe coents on it in the discussion section at the end of this paper. So for all these papers, we can deterine t tc tp (), i.e. the tie between publication of a paper and first-citation to this paper. This is an iportant indicator since it expresses how fast ( t sall ) this paper changes its status of unused to used. Since we have a set of papers we, hence, also have a sequence of t - values. As is classical in such cases we can wonder if this sequence can be used to define a kind of First-Citation-Speed-Index (FCSI). Of course, we should discuss soe desired properties for such a speed index (this will, of course, be done in this paper). Deriving an indicator fro a sequence reinds us of the Hirsch-index or h -index (Hirsch, 2005). There, in its classical application, we have a decreasing sequence of nubers of citations to papers of e.g. an author. This is a situation where large is beautiful since the higher the nuber of citations to a paper are, the better and also: the higher the h -index, the better. In our case of t -values, however, it does not ake uch sense to arrange the in decreasing order. Here we have sall is beautiful since the saller the t -values, the better. So it is natural to arrange these t -values in increasing order. But it is not clear how to apply the well-known definition of the h -index to increasing sequences. In the next section we will propose a ethod to define an h -index for increasing sequences. With this h -index we are then able to define an indicator of first-citation-speed. Here, again large is beautiful since high speed indicates sall t -values. Details will be given. We give soe exaples but we also forulate soe logical good properties that a FCSI should

4 have. For instance, if all t -values are 0, the FCSI should attain its axiu. Conversely, if all t -values are very high (in the liit, ), the FCSI should go to its iniu. Also adding a constant a 0 to all t -values should decrease the value of the FCSI. Our indicator satisfies these good properties. Also we can show that ultiplying all t -values with a constant a leads to a decrease of our FCSI, another good property. In the next section practical exaples of journals in psychology and cheistry are given, illustrating the good properties of our FCSI. A first attept for defining such a FCSI was given in Bornann and Daniel (200). Avoiding the proble of defining a h -index for increasing sequences, they apply the h -index (for decreasing sequences) on the data of tie between the present tie and the tie of the firstcitation. They argue that the larger this nuber is, the ore the first-citation tie is to the past, hence the earlier the first-citation is received. The present paper tries to iprove this ethod by also taking into account the tie of publication since it is essential in calculating first-citation speed. The paper closes with soe concluding rearks and soe open probles (incl. the proble of how to deal with tie-units in FCSIs).

5 II. Definition of the First-Citation-Speed-Index (FCSI) and its good properties We assue that we have a set of papers (e.g. in a journal) and for each of the we have deterined the tie t P of publication and the tie t C at which the first citation is received and t tc tp. Let t ax t (2) where the axiu is over all papers. Since sall values of t are iportant (cf. the Introduction) we will define a h -index for increasing sequences as follows. Calculate t t for each paper and put the in decreasing order (we have added so that the case of one paper or the case of papers with equal t yields and not 0 ). On this decreasing sequence, the classical h -index can be calculated. We denote it by h. We then define the First-Citation-Speed-Index (FCSI), denoted F, as F t h (3) First we look at soe siple exaples which show that we are on the right track.. Set A has 3 articles with t -values:, 2,3. Hence t 3 and the t t values are 3, 2,, hence h 2 and hence F 3 2 2 2. Set B has 3 articles with t -values:,2,4. Now t 4 and the t t values are 4,3,, hence h 2 and hence F 4 2 3 It is logical that the FCSI of B is saller than that of A.

6 3. Set C has 3 articles with t -values:,2,6. Now t 6 and the t t values are 6,5,2, hence h 2 and F 6-2 5 It is logical that this value of C is saller than the ones of A and B. Of course, as is also the case with the h -index, there is soe robustness. 4. Set D has 3 articles with t -values:,3,4. Now t 4 and the t t values are 4, 2,, hence h 2 and F 4 2 3, the sae as for B although the speed in D is a bit slower (a strictly higher value of F for case D would be a bad property which is not the case here). Let us now consider soe logical requireents for a FCSI and we will check if F satisfies these.. Let us have N papers all with t 0 (in practise, any papers have this as follows fro Bornann and Daniel (200) but it is also the case for the data of e.g. Egghe (see Egghe, L. in WOS)). Now t 0 and t t for all papers, hence h (as we can see clearly here, h is not a good speed easure). Now t h 0 and hence F, the highest possible value, as it should in this case since all t -values are 0. Note: if we want to avoid as highest value we can always apply a strictly increasing transforation such as 2 x Arctan x (4) (highest value is then the highest value while the lowest value 0 reains 0 ). 2. Let us have N papers with all t -values equal and very high (say ), the worst case. Now t, all values t t and hence h.

7 Now F 0 t h (5) (and 0 for t ), the lowest possible value for F. 3. Copare two cases 3.. N papers with the sae t -values Here t t, and all t t values are, hence h. So F t h t (6) 3.2. N papers with the sae t -values, naely t t. The sae arguent as above, with t replaced by t yields F t t (7) a logical fact. More general: if all t -values are equal to t a ( a 0 ) then F, (8) t a decreasing in a, which is a logical fact. 3.3. N papers with the sae t -values, naely t at where a. The sae arguent as above with t replaced by t yields F (8) at t a logical fact.

8 4. The next case shows a robustness property of F. Let N papers have equal t -values t th and the N paper has t - value t t. Then t t. In the table for calculating h we have that the first N papers have t t t t values and the t t value. If N is sufficiently large we have that h t t and F t h t t t th N paper has F t (as if the th N paper was not there): this is exactly the sae robustness as with the h -index where sall values do not count. We could have several papers with high t -values as long as the h -index is based on the top articles with t -values, t t. This robustness of the h -index is considered as a good property!

9 III. Case study: three journals fro subject category Psychology We will now investigate the FCSI in practice. To this end, we collected publication and citation data for the following three journals fro the ISI subject category Psychology : Discourse & Society, Psychological Methods, and Aerican Journal of Counity Psychology. All data were collected fro Thoson Reuters Web of Science on May 0, 200. The citations in the data refer to articles published in these journals in the year 2000, thus allowing for sufficient tie for all articles to gain (at least) a first citation. The data were obtained using a query like SO=(AMERICAN JOURNAL OF COMMUNITY PSYCHOLOGY) AND PY=2000 and siilarly for the other two journals and analyzed using the Citation Report feature. The tie units here are years; for instance, if t 0, the first citation was received in the year of publication. We note that, overall, there are only two articles one in Discourse & Society and one in Psychological Methods that have no citations and hence no tie of first citation t C or t. With t and t defined as in the previous section, Table contains the first citation data for Discourse & Society, Psychological Methods, and Aerican Journal of Counity Psychology. The two uncited articles are oitted, since there is no obvious rank where they should be positioned. In a way, uncited articles receive their first citation at t. It is now straightforward to deterine the h -index for each journal on the basis of the third colun for each journal. We first look at Discourse & Society and find h 7. The largest value for t is 8, hence t h 2 and F. Next, we turn to Psychological Methods. For 2 this journal, we find that h 6, and hence t h and F. The Aerican Journal of Counity Psychology has h 4. Thus, t h 0and F. This is a case soewhat siilar to the one discussed in requireent. 0

0 Generally, if the first t papers have t 0, then each of these papers has t t t, and h t. Hence, in this case t h 0 and F. This best case scenario can also be found in practice, as evidenced by Table. Indeed, the first ( t 4) papers received their first citation within less than a year, leading to the largest F possible. These results accord well with our intuition of how these three journals should copare regarding first citation speed. Although Discourse & Society has published less articles than Psychological Methods, its t is larger. The sae observation holds when coparing either to Aerican Journal of Counity Psychology. More iportantly, ore than one quarter of the articles in the latter journal have been cited within the sae year, which clearly exceeds the other two journals. Siilarly, a larger fraction of articles fro Psychological Methods has been cited after one year, copared to Discourse & Society. It is interesting to copare the FCSI to the edian of the t values. In principle, this also gives an indication of first citation speed (note, though, that, for the edian, saller values indicate higher speed). In this case study, however, the edian for each journal equals 2, which would lead us to conclude that these journals are siilar in first citation speed. This is clearly in contrast with the FCSI, which assigns a different score to each one. It sees that the FCSI paints a ore correct picture of their existing differences. The iediacy index II is the nuber of citations in the year of publication divided by the nuber of publications. Hence, it can also be considered an indicator of early interest in a given paper. For the three journals fro Table, we find iediacy index values of, respectively, 0.263, 0.85 and 0.405. Just like for the FCSI, the Aerican Journal of Counity Psychology has the highest value. The ranking of the other two journals by II is, however, different fro the ranking by F. Closer exaination shows that this is due to the nuber of papers published in 2000; in fact, both journals receive 5 citations within the sae year, but Psychological Methods has published ore articles, leading to a lower iediacy index.

Table. First citation data for Discourse & Society, Psychological Methods and Aerican Journal of Counity Psychology Discourse & Society Psychological Methods Aerican Journal of Counity Psychology rank t t t + Rank t t t + rank t t t + 0 9 0 7 0 4 2 8 2 6 2 0 4 3 8 3 6 3 0 4 4 8 4 6 4 0 4 5 8 5 6 5 0 4 6 8 6 6 6 0 4 7 8 7 6 7 0 4 8 2 7 8 6 8 0 4 9 2 7 9 6 9 0 4 0 2 7 0 6 0 0 4 2 7 6 0 4 2 3 6 2 6 2 3 3 3 6 3 2 5 3 3 4 4 5 4 2 5 4 3 5 4 5 5 2 5 5 3 6 5 4 6 2 5 6 3 7 5 4 7 2 5 7 2 2 8 8 8 2 5 8 2 2 9 3 4 9 2 2 20 4 3 20 2 2 2 4 3 2 2 2 22 4 3 22 2 2 23 4 3 23 2 2 24 5 2 24 2 2 25 6 25 2 2 26 6 26 2 2 27 2 2 28 2 2 29 2 2 30 2 2 3 2 2 32 2 2 33 3 34 3 35 3 36 3 37 3

2 IV. Case study: first-citation speed coparison of accepted and rejected, but published elsewhere anuscripts by Angewandte Cheie International Edition. For the second case study we used biblioetric data of Bornann and Daniel (200) for 899 anuscripts that were subitted to Angewandte Cheie International Edition (AC-IE). What the editors of AC-IE look for ost of all is excellence in cheical research. Manuscripts that reviewers dee to be of high quality are selected for publication. Manuscripts that do not eet the high standards are rejected. Of the 899 anuscripts that were reviewed by the AC- IE in the year 2000, 46% (n=878) were accepted for publication in AC-IE, and 54% (n=02) were rejected. A search in the literature databases Science Citation Index (SCI, Thoson Reuters) and Cheical Abstracts (CA, Cheical Abstracts Services, CAS, Colubus, OH) revealed that of the 02 rejected anuscripts, 959 (94%) were published later in 36 other (different) journals. For accepted and rejected (but published elsewhere) anuscripts, Bornann and Daniel (200) deterined in addition to the nuber of citations the nuber of onths since publication and the first tie the paper was cited. If t =0, the first citation was received within the publication onth. The searches were done using Web of Science (WoS, Thoson Reuters). For different values of t and t -t +, Table 2 shows the nuber of accepted and rejected, but published elsewhere anuscripts. There are, e.g., 8 accepted and rejected, but published elsewhere anuscripts with t =7 and t -t +=6. For both anuscript groups, the largest value for the difference between t c and t p is 77 (=t ). Siilar to the Aerican Journal of Counity Psychology in the previous section, we have a best case scenario in Table 2 for the accepted anuscripts: 86 anuscripts received their first citation within the publication onth (about 0% of all accepted anuscripts). This leads to the largest possible F. For the rejected, but published elsewhere anuscripts we find h=76. The largest value for t is 77 (=t ), hence t -h+=2 and F. 2

3 The difference in F between both anuscript groups point to a higher first-citation-speed for accepted than for rejected, but published elsewhere anuscripts. These findings are in accordance to the results of Bornann and Daniel (200). They found not only a higher h index (Hirsch, 2005) for accepted than for rejected, but published elsewhere anuscripts, but also a higher speed, defined there as a longer tie between first citation and date of search for the first citation. Thus, the FCSI results are convergently valid: they correspond to the results produced by other (speed) indicators and to the qualitative outcoe of the AC-IE peer review process. Table 2. First citation data for accepted or rejected, but published elsewhere anuscripts with t <8 t t -t + Accepted anuscripts Rejected anuscripts 7 6 8 6 62 8 22 5 63 3 25 4 64 5 25 3 65 9 2 2 66 22 43 67 22 46 0 68 28 45 9 69 52 47 8 70 60 57 7 7 60 63 6 72 74 70 5 73 72 7 4 74 92 65 3 75 66 5 2 76 74 32 77 44 24 0 78 86 27

4 V. Conclusions and suggestions for further research We presented a proposal for a First-Citation-Speed-Index (FCSI), hereby introducing an h - index for increasing sequences. Several good logical properties are shown and the practical exaples underline the good distinctive power of the FCSI. The FCSI takes values between 0 and but we indicate how the FCSI can be transfored into an FCSI with values in the interval 0, and with the sae good properties. Several probles reain. First of all, one should search for other good FCSIs. As in concentration theory (cf. Egghe (2005)) one has defined several concentration easures with good properties. So we should do the sae for FCSIs. All FCSIs should be based on the values t t C t P. Next there is the proble with the tie units. Of course, if we keep the sae tie unit, exaples can be copared. The proble with tie unit can be forulated in two different ways. Suppose we keep the sae tie unit but we apply the transforation t at where a. Then all tie periods (such as t ) are strictly larger and hence the FCSI should decrease. But we can also look at t at being a transforation where the tie unit is changed (e.g. going fro a year to a onth). The sae tie period is then ultiplied by a 2 and it is not clear how a FCSI should behave in this context. This proble is coparable (though not identical) with the principle of scale invariance for concentration easures (also called inequality easures). If a transforation r ar eans a change of the currency (e.g. fro $ to ) then the inequality should be identical: wealth or poverty is not changed when we change the currency! But if the transforation r ar (say for a ) eans that each person s incoe (or capital) is ultiplied by a (keeping the sae currency) then it is not at all clear that inequality reains the sae. The proble of the tie units in the fraework of FCSIs is left as an open proble.

5 References Bornann L., & Daniel H.-D. (200). The citation speed index: a useful biblioetric indicator to add to the h index. Journal of Inforetrics, 4(), 83-88. Egghe L. (2005). Power Laws in the Inforation Production Process: Lotkaian Inforetrics., Oxford, UK: Elsevier. Hirsch J. (2005). An index to quantify an individual s scientific research output. Proceedings of the National Acadey of Sciences of the United States of Aerica 02(46), 6569-6572.