Why Sirtes s claims (Sirtes, 2012) do not square with reality

Similar documents
Analysis of bibliometric indicators for individual scholars in a large data set

Coercive Journal Self Citations, Impact Factor, Journal Influence and Article Influence

arxiv: v2 [physics.soc-ph] 27 Oct 2008

Field-Normalized Impact Factors (IFs): A Comparison of Rescaling and Fractionally Counted IFs

Journal Impact Factor Versus Eigenfactor and Article Influence*

arxiv: v2 [cond-mat.stat-mech] 9 Dec 2010

Article-Count Impact Factor of Materials Science Journals in SCI Database

Elsevier Editorial System(tm) for Computers & Chemical Engineering Manuscript Draft

2 Chapter 2: Conditional Probability

Computing the (k-)monopoly number of direct product of graphs

FORMULATION OF THE LEARNING PROBLEM

Introducing Proof 1. hsn.uk.net. Contents

W G. Centre for R&D Monitoring and Dept. MSI, KU Leuven, Belgium

(x 1 +x 2 )(x 1 x 2 )+(x 2 +x 3 )(x 2 x 3 )+(x 3 +x 1 )(x 3 x 1 ).

Information Retrieval and Web Search Engines

AVOLTAGE SAG is a short-duration reduction in rms

-1- THE PROBABILITY THAT TWEETY IS ABLE TO FLY Giangiacomo Gerla Dipartimento di Matematica e Fisica, Università di Camerino ITALY.

Letter to the Editor. On the calculation of decision limits in doping control

Scientometrics as network science The hidden face of a misperceived research field. Sándor Soós, PhD

Journal influence factors

arxiv: v1 [physics.soc-ph] 23 May 2017

Analogies and discrepancies between the vertex cover number and the weakly connected domination number of a graph

Math 471 (Numerical methods) Chapter 3 (second half). System of equations

Locating an Astronomy and Astrophysics Publication Set in a Map of the Full Scopus Database

arxiv:cond-mat/ v1 [cond-mat.mtrl-sci] 6 Jun 2001

Paul Barrett

5.2 Tests of Significance

Characterisation of the open cluster M67

2.6 Complexity Theory for Map-Reduce. Star Joins 2.6. COMPLEXITY THEORY FOR MAP-REDUCE 51

Reading for Lecture 6 Release v10

Testing Research and Statistical Hypotheses

THE RESIDUE THEOREM. f(z) dz = 2πi res z=z0 f(z). C

MGP versus Kochen-Specker condition in hidden variables theories

Finding the Median. 1 The problem of finding the median. 2 A good strategy? lecture notes September 2, Lecturer: Michel Goemans

Linear Regression. Linear Regression. Linear Regression. Did You Mean Association Or Correlation?

Building a (sort of) GoI from denotational semantics: an improvisation

Elementary Linear Algebra, Second Edition, by Spence, Insel, and Friedberg. ISBN Pearson Education, Inc., Upper Saddle River, NJ.

Samit De Library, Saha Institute of Nuclear Physics 1/AF, Bidhannagar, Kolkata E mail:

Direct Proof and Counterexample I:Introduction

Lecture 3: Probabilistic Retrieval Models

Chapter Three. Hypothesis Testing

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

Direct Proof and Counterexample I:Introduction. Copyright Cengage Learning. All rights reserved.

CS229 Supplemental Lecture notes

CURRICULUM VITAE ET STUDIORUM. Luciano Ortenzi

Comparing the effects of two treatments on two ordinal outcome variables

arxiv:cond-mat/ v1 [cond-mat.dis-nn] 18 Feb 2004 Diego Garlaschelli a,b and Maria I. Loffredo b,c

Linear Algebra, Summer 2011, pt. 2

Please bring the task to your first physics lesson and hand it to the teacher.

2. Two binary operations (addition, denoted + and multiplication, denoted

Is the Relationship between Numbers of References and Paper Lengths the Same for All Sciences?

The Simultaneous Local Metric Dimension of Graph Families

Math 4242 Fall 2016 (Darij Grinberg): homework set 6 due: Mon, 21 Nov 2016 Let me first recall a definition.

Introduction to Algorithms / Algorithms I Lecturer: Michael Dinitz Topic: Intro to Learning Theory Date: 12/8/16

Journal of Informetrics

AIM HIGH SCHOOL. Curriculum Map W. 12 Mile Road Farmington Hills, MI (248)

Confidence intervals and the Feldman-Cousins construction. Edoardo Milotti Advanced Statistics for Data Analysis A.Y

Chapter 1: Introduction. Material from Devore s book (Ed 8), and Cengagebrain.com

Improving Dense Packings of Equal Disks in a Square

West Windsor-Plainsboro Regional School District Algebra Grade 8

Stochastic calculus for summable processes 1

Introduction to Metalogic 1

Radiological Control Technician Training Fundamental Academic Training Study Guide Phase I

Boosting: Foundations and Algorithms. Rob Schapire

So far we discussed random number generators that need to have the maximum length period.

Explaining the Phenomenon of Neutrino Particles Moving Faster than Light Speed. Le Van Cuong

Listing the families of Sufficient Coalitions of criteria involved in Sorting procedures

The Problem of Inconsistency of Arithmetic M. George Dept. of Physics Southwestern College Chula Vista, CA /10/2015

arxiv: v2 [cs.dl] 16 Oct 2017

Probability and Statistics

Structure learning in human causal induction

hypothesis a claim about the value of some parameter (like p)

Optimal Control of Partiality Observable Markov. Processes over a Finite Horizon

1 Normal Distribution.

CHAPTER 1. REVIEW: NUMBERS

Chapter 10: Comparing Two Populations or Groups

Numbers, proof and all that jazz.

Friedman s test with missing observations

Supplementary Information

A-LEVEL STATISTICS. SS04 Report on the Examination June Version: 1.0

Introduction to Error Analysis

EVALUATING THE USAGE AND IMPACT OF E-JOURNALS IN THE UK

Remarks on the cavitation of Thorium-228

Chapter 10: Comparing Two Populations or Groups

Axiomatic set theory. Chapter Why axiomatic set theory?

Unsupervised Learning with Permuted Data

Mathematics-I Prof. S.K. Ray Department of Mathematics and Statistics Indian Institute of Technology, Kanpur. Lecture 1 Real Numbers

Some Formal Analysis of Rocchio s Similarity-Based Relevance Feedback Algorithm

Discrete Mathematics, Spring 2004 Homework 4 Sample Solutions

Conclusions About the Simultaneity of Two Events. Vesselin C. Noninski

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14

Discriminative Learning can Succeed where Generative Learning Fails

Difference Between Pair Differences v. 2 Samples

A NUMBER-THEORETIC CONJECTURE AND ITS IMPLICATION FOR SET THEORY. 1. Motivation

From: "Pryor, Benjamin S." Subject: Reductions in Electronic Resources

A posteriori reading of Virtual Impactors impact probability

IOP Journal Archive. complete your library s scientific collection. Over 206,000 articles

Repe$$ve music and gap filling in full disc helioseismology and asteroseismology of solar- like stars. Eric Fossat Laboratoire Lagrange, O.C.A.

Reaction-diffusion processes and meta-population models in heterogeneous networks Supplementary information

Superstatistics and temperature fluctuations. F. Sattin 1. Padova, Italy

Transcription:

Why Sirtes s claims (Sirtes, ) do not square with reality Filippo Radicchi a, Claudio Castellano b,c a Departament d Enginyeria Quimica, Universitat Rovira i Virgili, Av. Paisos Catalans 6, 437 Tarragona, Catalunya, Spain b Istituto dei Sistemi Complessi (ISC-CNR), Via dei Taurini 9, I-85 Roma, Italy c Dipartimento di Fisica, Sapienza Università di Roma, P.le A. Moro, I-85 Roma, Italy In a recent Letter (Sirtes, ), D. Sirtes presents, in an unnecessarily harsh and aggressive tone, two criticisms to a recent publication of ours(radicchi & Castellano, ) in Journal of Informetrics. In our work, we discussed the problem of identifying quantitative bibliometric indicators for comparing in a fair manner publications belonging to different fields or subfields. We proposed a statistical procedure to test the fairness of indicators and then used it to evaluate two recently proposed indicators (Radicchi, Fortunato & Castellano, 8; Leydesdorff & Opthof, ), when applied to a set of papers, those published in journals of the American Physical Society (APS, www.aps.org), belonging to all subfields of physics. Sirtes presents two arguments against our work. The first is the allegation of circularity. According to Sirtes whatever the inequality of the samples of the different fields stems from, whether from genuine field-specific differences or from differences in the quality of the samples, the process of rescaling evens them all out. Now the problem becomes clear: As they use the same sample to calculate their average c as they use to test the fairness of their indicator, their test becomes circular and begs the question. First, let us notice that this criticism has nothing to do with the fairness test. It has to do with one of the indicators the test is applied to. The criticism is rather obscure. The most straightforward interpretation is that, according to Sirtes, by simply dividing the number of citations by their average, all differences between fields are necessarily canceled. This idea is clearly wrong. Nothing guarantees a priori that simply rescaling by the average many distribution functions is enough to make them equal. In the case Preprint submitted to Journal of Informetrics June,

of citation distributions for different subfields this scaling property turns out to hold, but this is highly nontrivial. We reached this conclusion only after having tested the performance of the rescaled indicator with a statistical analysis. It is surprising that Sirtes is so deeply convinced that our rescaling works, that he considers it to be trivial. There is nothing circular in our method. There is instead a short-circuit in Sirtes s argument. The second criticism is, fortunately, more clearly formulated. According to Sirtes, APS journals cannot be used as benchmark for our fairness test, because their quality is largely different in the various subfields of physics. The fact that indicators based on raw and fractional citation counts fail our fairness test would be only the consequence of the different quality of APS publications in different physics subfields. The proof of this claim is, according to the author, in Fig. of Sirtes (). This proof does not stand a closer scrutiny. There are several important methodological errors is Sirtes s procedure. Let us first remark that Sirtes did not include in the analysis papers from one APS journal, Physical Review Letters (PRL), which publishes many highly cited papers in all physics subfields. Judging the quality of APS journals without considering PRL publications is a serious shortcoming that alone makes the argument of Sirtes inconclusive. Secondly, it is clearly untenable the use of the Impact Factor (IF) (Garfield, 6) as a measure for the impact of all publications in a journal, despite ample empirical evidence that a huge variability exists in the citations accrued by papers published in the same outlet (Seglen, 997). But the most inappropriate ingredient of Fig. of Sirtes () is the use of the Journal Citation Reports (JCR) subject-categories and the mapping between them and APS journals. JCR categories are known to be an unreliable categorization of scientific papers. According to Pudovkin & Garfield (), in JCR classification Journals are assigned to categories by subjective, heuristic methods. In many fields these categories are sufficient but in many areas of research these classifications are crude and do not permit the user to quickly learn which journals are most closely related. When using JCR categories for classifying papers, articles are assigned to categories automatically, depending on where they have been published. In doing that, however, a major problem arises because many journals belong to more than one category. For example, papers published in Physical Review A (PRA), automatically belong both to Physics, atomic, molecular & chemical and to Optics. This means that the articles, which make (according to Sirtes) PRA a second rate publishing venue in Physics, atomic, molecular & chemical and a top jour-

nal in Optics, are the same! In our paper, we decided to use Physics and Astronomy Classification Scheme (PACS, publish.aps.org/pacs) numbers as the categorization for our benchmark exactly to overcome these problems with the JCR categorization. PACS numbers are author-provided and attributed to individual articles, so that they solve the two main problems of JCR categorization. Unfortunately, in the assessment of the relative quality of APS journals with respect to others, the problem with JCR categories cannot be fully eliminated, because, while APS papers can be attributed to JCR categories based on their PACS numbers, non APS publications are classified with the automatic (and unreliable) JCR procedure. Keeping this caveat in mind, we now proceed to test empirically Sirtes s claim, by measuring the correlation between the prestige of APS papers in a certain subfield and the results of our fairness test. Sirtes claims that such a correlation exists for the indicators based on raw citations and on fractional citations. As discussed above the mapping between JCR categories and the ten PACS categories considered in Radicchi & Castellano () is a nontrivial problem. We consider the seven JCR subject-categories already analyzed by Sirtes, plus the subject-category Astronomy & Astrophysics. We collected from Web of Science (WoS, isiknowledge.com) all papers published in year in journals that are classified in one of these categories, plus all papers published in in APS journals (including PRL). We consider for consistency only publications that are classified as Articles by WoS, and for each of them we retrieve the information about the number of citations accumulated. Data were collected between Feb. 9 and Apr. 4,. To create the mapping between JCR categories and PACS category, we first associate APS papers to JCR subject-categories based on the information given by the first two digits of their principal PACS number. Then, we associate to each JCR category the PACS category where most papers were published in. See Table for the detailed mapping. Note that, while in some cases this mapping is obvious, there are also cases such as Physics, Condensed matter in which many PACS with different first digit are placed together. In such a case the attribution to the category 7 is due to the fact that 67% of the papers had PACS code beginning with 7, the 3% of the papers had PACS code beginning with 6, and the remaining papers had PACS code beginning with 4 or 8. In order to quantify the prestige of APS papers within each JCR subjectcategory we select the most cited z% papers appeared in year, irre- 3

JCR category PACS first two digits PACS first digit PHYSICS, MATHEMATICAL 3 4 5 89 PHYSICS, PARTICLES & FIELDS 3 4 PHYSICS, NUCLEAR 3 4 5 6 7 8 9 PHYSICS, ATOMIC, 3 3 33 34 35 36 37 8 3 MOLECULAR & CHEMICAL OPTICS 4 4 4 PHYSICS, FLUIDS & PLASMAS 47 5 5 5 PHYSICS, CONDENSED MATTER 44 6 6 63 64 65 66 67 68 7 7 73 74 75 76 77 78 79 7 ASTRONOMY & ASTROPHYSICS 8 83 84 85 95 96 97 98 9 Table : Mapping between JCR categories (first column), first two digits of the of PACS codes (second column), and most frequent first digit of the PACS number (third column). Some first numbers of PACS codes do not appear in the mapping, because they cannot be univocally attributed to a single JCR category. spectively of their journal of publication, and calculate the proportion p i,z of APS papers that are part of this top cited group. For z = for example, we find that the 8% of APS papers are part of the top list in Astronomy & Astrophysics while this figure is 5% for Optics and 5% for Physics, Atomic, molecular & chemical. We then quantify the relative error of each citation indicator, with respect to our fairness test. We take, from Fig. 3 in Radicchi & Castellano (), the values r i,z indicating the percentage of papers in PACS category i belonging to the z% globally most cited papers. The relative deviation is given by = (r i,z r i,z )/w i,z, where r i,z is the median value of the proportion predicted by our model and w i,z is the width of the 9% confidence interval (See Fig. 3 in Radicchi & Castellano ()). Using these quantities, we are now able to test the validity of Sirtes s claim. q (indicator) i,z In Fig. we plot q (indicator) i,z vs p i,z for all citation indicators (unscaled, fractional, and rescaled) considered in our previous analysis. The values of the correlation coefficients are reported in Table. They are all within the 9% confidence interval expected for a random configuration with same size, which is roughly between -.6 and.6 in all cases. Thus there is no statistically Noticethatthevaluesofz inthedeterminationofp i,z andinq (indicator) i,z independent, and could also be taken different. are in principle 4

z R unsc R frac R resc.7.54.49 -...6 5 -.7.3 -.9 -.6.7.53 -..4 -.33 3 -.5.7 -.44 Table : Values of the correlation coeffients for plots in Fig. for unscaled (R unsc ), fractional (R frac ) and rescaled (R resc ) citation indicators. No value of the correlation coefficient (either positive or negative) is significant at 5% confidence level (the 9% confidence interval is roughly between.6 and.6 in all cases). significant evidence of correlation. The failure of the indicators based on raw citations and on fractional citations in (Radicchi & Castellano, ) cannot be explained by variations of the prestige of APS papers in the different subfields. In conclusion, the analysis presented by Sirtes contains serious methodological flaws. Even if such flaws are removed, a statistically grounded approach, presented here, contradicts Sirtes s claims. The present results support instead the validity of the work by Radicchi & Castellano (). Sirtes, D. (). Finding the Easter eggs hidden by oneself: Why Radicchi and Castellano s () fairness test for citation indicators is not fair. Journal of Informetrics 6(), 448 45. doi:.6/j.joi...8 Radicchi, F., & Castellano, C. (). Testing the fairness of citation indicators for comparison across scientific domains: The case of fractional citation counts. Journal of Informetrics, 6(), 3. doi:.6/j.joi..9. Radicchi, F., Fortunato, S., & Castellano, C. (8). Universality of citation distributions: Toward an objective measure of scientific impact. Proceedings of the National Academy of Sciences, 5(45), 768 77. doi:.73/pnas.869775 Leydesdorff, L., & Opthof, T. (). Normalization at the field level: Fractional counting of citations. Journal of Informetrics, 4(4), 644 646. doi:.6/j.joi..5.3 5

Garfield, E. (6). The history and meaning of the journal impact factor. The Journal of the American Medical Association, 95(): 9 93. doi:./jama.95..9 Seglen, P.O. (997). Why the impact factor of journals should not be used for evaluating research. British Medical Journal, 34(779): 4985. doi:.36/bmj.34.779.497 Pudovkin, A.I., & Garfield, E. (). Algorithmic procedure for finding semantically related journals. Journal of the American Society for Information Science and Technology, 53(3): 39. doi:./asi.53 6

Publication year Relative deviations from expected impact of PACS categories - - - Top % Top 5% unscaled rescaled fractional.5..5..5 3. 3.5 4. 4.5 5 6 7 8 9 3 4 Top % Top % Top % Top 3% 4 8 3 36 4 44 - - -..5 3. 3.5 4. 4.5 5. 5.5 6. 6.5 7. 4 6 8 4 6 34 36 38 4 4 44 46 48 5 5 54 56 Relative impact in JCR subject-categories Figure : For a given value of z and each PACS category, we quantify the relative deviation from the expected proportion value predicted by Eq.() of Radicchi & Castellano (). These values are then plotted against the relative impact of APS papers in their JCR subject-category of reference. We consider the same values of z used in our previous analysis (Radicchi & Castellano, ). 7