On the correction of the h-index for career length

by L. Egghe

Universiteit Hasselt (UHasselt), Campus Diepenbeek, Agoralaan, B-3590 Diepenbeek, Belgium (permanent address), and Universiteit Antwerpen (UA), IBW, Stadscampus, Venusstraat 35, B-2000 Antwerpen, Belgium

leo.egghe@uhasselt.be

ABSTRACT

We describe mathematically the age-independent version of the h-index, defined by Abt (Scientometrics 91(3), 863-868, 2012), and explain when this indicator is constant with age. We compare this index with the one obtained when it is not the h-index that is divided by career length, but all citation numbers are divided by career length and the h-index of the result is then calculated. Both mathematical models are compared. A variant of this second method calculates the h-index of the citation data divided by article age. Examples are given.

Key words and phrases: age-independent, h-index (Hirsch-index)

Introduction

Consider a researcher with $T$ publications and let $c_i$ ($i = 1, \ldots, T$) be the number of citations received by paper $i$. We suppose that the papers are arranged in decreasing order of the number of received citations (i.e. $c_i \ge c_j$ whenever $i \le j$). Then the Hirsch-index (Hirsch (2005)), or h-index, is the largest rank $r = h$ such that all papers on ranks $1, \ldots, h$ have at least $h$ citations (i.e. the largest rank $r = h$ such that $c_h \ge h$ and hence $c_i \ge h$ for all $i = 1, \ldots, h$).

It is clear that the h-index is age-dependent (i.e. it depends on career length). Long careers usually have higher values of $T$ (total number of publications) and of $c_i$ ($i = 1, \ldots, T$) (number of citations received by paper $i$) than shorter careers (e.g. those of younger researchers). This fact was already noted in the defining paper Hirsch (2005). So, with the h-index, one should not compare researchers with different career lengths (just as one should not compare researchers from different fields, but that holds for all citation-based indicators; in this paper we do not deal with this problem).

This has led Abt (2012) to the following age-independent h-index (also implicit in Hirsch (2005)). Denote by $h(t)$ the h-index of a researcher at career length $t$ (starting at the time of the researcher's first published paper). Then define

$$a(t) = \frac{h(t)}{t/10} \qquad (1)$$

i.e. the h-index (at time $t$) divided by the (fractional) number of decades since the first published paper. The factor 10 is only useful in practical examples; in theoretical models we might use

$$a^*(t) = \frac{h(t)}{t} \qquad (2)$$

as well.
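The two definitions above can be sketched in a few lines of Python. This is only an illustration: the function names `h_index` and `abt_index` and the citation counts in `example` are hypothetical and do not come from the paper.

```python
def h_index(citations):
    """Largest rank h such that the h most-cited papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(ranked, start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h


def abt_index(citations, career_years, per_decade=True):
    """h-index divided by career length, in decades as in (1) or in years as in (2)."""
    length = career_years / 10 if per_decade else career_years
    return h_index(citations) / length


# Hypothetical citation counts for a researcher 20 years into a career.
example = [25, 18, 12, 7, 6, 5, 3, 1, 0]
print(h_index(example))        # 5
print(abt_index(example, 20))  # 2.5
```

Sorting inside `h_index` means the caller does not have to supply the citation counts in decreasing order.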

Abt (2012) claims that $a(t)$ is constant in $t$ and gives practical evidence for it. Incidentally, a constant $a(t)$ implies that $h(t)$ increases linearly, as is trivial from (1) or (2). In this paper, based on the models developed in Egghe and Rousseau (2006) and Egghe (2009), we give a mathematical model for $h(t)$, $a(t)$ and $a^*(t)$ and present necessary and sufficient conditions for $a(t)$ and $a^*(t)$ to be constant. Practical data on $h(t)$ and $a(t)$ for this author's career are given. This is done in the next section.

In the third section we describe a second way to reach an age-independent indicator: we do not divide $h(t)$ by $t$ (or $t/10$) but we divide all citation numbers $c_i$ by this career length and then calculate the h-index of this transformed set of data. We denote this indicator by $b(t)$ (if we use $t/10$) and by $b^*(t)$ (if we use $t$). Also for these indicators we present a mathematical model based on Egghe and Rousseau (2006) and Egghe (2008a, b) and give necessary and sufficient conditions for $b(t)$ or $b^*(t)$ to be constant. An example from this author's career is given.

Also in this section we remark that a third indicator can be constructed to reach age-independence. In the previous two cases we always divide by the career length $t$ or $t/10$, without taking into account the different ages of the published papers. So, theoretically, it makes more sense to divide the number of citations $c_i$ of the $i$th paper by the age of the paper, which in 2012 is given by

$$2012 - \text{publication year} + 1 \qquad (3)$$

and then calculate the h-index of this transformed set of data. If we use age/10 we denote this indicator $c(t)$ and if we use age we denote this indicator $c^*(t)$ (as we did with the previous two indicators). We prove relations (inequalities) between these three indicators and present an example from this author's career.

The paper closes with some conclusions and suggestions for further research.

A model for Abt's age-independent index

In Egghe and Rousseau (2006) we assumed that the paper-citation system is Lotkaian (Egghe (2005)) with Lotka exponent $\alpha > 1$. We showed there that, if there are $T$ papers in total, the h-index of this system is given by

$$h = T^{1/\alpha} \qquad (4)$$

In Egghe (2009) we assumed that the number (density) of publications per time unit (at $t$) equals $d\,t^{\beta}$, where $d > 0$, $\beta \ge 0$ ($d$ was denoted $b$ in Egghe (2009), but we avoid this in order to prevent confusion with the second indicator $b(t)$ discussed in the introduction). Note that the case $\beta = 0$ is the case where we have a constant number of publications per year. The total number of publications $T$, dependent on $t$ (denoted as $T(t)$), is given by

$$T(t) = \int_0^t d\,(t')^{\beta}\,dt' \qquad (5)$$

$$T(t) = \frac{d}{\beta+1}\,t^{\beta+1} \qquad (6)$$

Combining (4) and (6) (supposing $\alpha$ to be independent of $t$) and denoting the time-dependent h-index $h$ by $h(t)$ yields

$$h(t) = \left(\frac{d}{\beta+1}\right)^{1/\alpha} t^{(\beta+1)/\alpha} \qquad (7)$$
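As a rough numerical illustration of (5) to (7), the sketch below evaluates the closed form for $T(t)$, compares it with a midpoint-rule approximation of the integral, and computes the model h-index. The function names and the parameter values `d`, `alpha`, `beta` are illustrative assumptions, not values fitted to any data in the paper.

```python
def total_publications(t, d, beta):
    """Closed form (6): T(t) = d / (beta + 1) * t**(beta + 1)."""
    return d / (beta + 1) * t ** (beta + 1)


def model_h(t, d, alpha, beta):
    """Model h-index (7): h(t) = T(t)**(1 / alpha)."""
    return total_publications(t, d, beta) ** (1 / alpha)


def integrate_rate(t, d, beta, steps=100_000):
    """Midpoint-rule approximation of the integral (5) of d * t'**beta over [0, t]."""
    width = t / steps
    return sum(d * ((k + 0.5) * width) ** beta for k in range(steps)) * width


# Illustrative parameter values (assumptions, not fitted to any data).
d, alpha, beta = 2.0, 2.0, 1.0
print(total_publications(35, d, beta))  # closed form (6)
print(integrate_rate(35, d, beta))      # numerical value of (5), close to the line above
print(model_h(35, d, alpha, beta))      # model h-index (7) at t = 35
```

With $\beta + 1 = \alpha$, as in these illustrative values, the model h-index grows linearly in $t$.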

Note that $h(t)$ is a concavely increasing function of $t$ if and only if $\beta + 1 < \alpha$, is linear in $t$ if and only if $\beta + 1 = \alpha$, and is a convexly increasing function of $t$ if and only if $\beta + 1 > \alpha$. By definition of Abt's indicator $a(t)$ we have, by (1) and (7),

$$a(t) = 10 \left(\frac{d}{\beta+1}\right)^{1/\alpha} t^{\frac{\beta+1}{\alpha} - 1} \qquad (8)$$

(and the same for $a^*(t)$ with the factor 10 deleted). We now have Proposition 1, yielding a necessary and sufficient condition for Abt's claim to be valid.

Proposition 1: The indicator $a(t)$ is constant if and only if

$$\beta + 1 = \alpha \qquad (9)$$

The same result is true for $a^*(t)$.

Proof: This follows trivially from (8), since the exponent of $t$ there vanishes exactly when $\beta + 1 = \alpha$.

Note: The case $\beta + 1 = \alpha$ is a classical informetric case. The most classical Lotka exponent is $\alpha = 2$ (see Egghe (2005)). This implies $\beta = 1$, and by (7) we have a linearly increasing $h(t)$ function, much in line with Fig. 1 ($h(t)$ for this author's career). So Proposition 1 indicates that a constant $a(t)$ function is classical and hence supports the finding in Abt (2012).

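We further have Proposition 2.

Proposition 2:

(i) $a(t)$ is convexly decreasing if and only if

$$\beta + 1 < \alpha \qquad (10)$$

(ii) $a(t)$ is increasing if and only if

$$\beta + 1 > \alpha \qquad (11)$$

(iii) $a(t)$ is convexly increasing if and only if

$$\beta + 1 > 2\alpha \qquad (12)$$

(iv) $a(t)$ is concavely increasing if and only if

$$\alpha < \beta + 1 < 2\alpha \qquad (13)$$

(v) $a(t)$ is linearly increasing if and only if

$$\beta + 1 = 2\alpha \qquad (14)$$

The same results are true for $a^*(t)$.

Proof: All these results follow trivially from (8).

These results are illustrated by this author's citation data, yielding Fig. 1 ($h(t)$) and Fig. 2 ($a(t)$).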
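Propositions 1 and 2 all follow from the sign and size of one exponent, $(\beta+1)/\alpha - 1$. A minimal sketch of that case analysis follows; the function names and the tolerance `tol` are hypothetical choices made for illustration.

```python
def abt_exponent(alpha, beta):
    """Exponent of t in the model for a(t): (beta + 1) / alpha - 1."""
    return (beta + 1) / alpha - 1


def shape_of_a(alpha, beta, tol=1e-12):
    """Classify the model a(t) along the lines of Propositions 1 and 2."""
    e = abt_exponent(alpha, beta)
    if abs(e) <= tol:
        return "constant"              # beta + 1 = alpha
    if e < 0:
        return "convexly decreasing"   # beta + 1 < alpha
    if abs(e - 1) <= tol:
        return "linearly increasing"   # beta + 1 = 2 * alpha
    # 0 < e < 1 gives a concave power function, e > 1 a convex one.
    return "concavely increasing" if e < 1 else "convexly increasing"


print(shape_of_a(2, 1))  # constant
print(shape_of_a(2, 0))  # convexly decreasing
print(shape_of_a(2, 2))  # concavely increasing
```

The tolerance only guards the two equality cases against floating-point rounding when non-integer parameters are used.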

Fig. 1: $h(t)$ sequence for this author's career

Fig. 2: $a(t)$ sequence for this author's career

Fig. 1 is an updated version of Fig. 2 in Egghe (2009), updated to 35 career years ($t = 1$ is 1978, the year of the first publication, and $t = 35$ is 2012, the present year). We can say that $h(t)$ is convexly increasing ($\beta + 1 > \alpha$) but is close to linear ($\beta + 1 \approx \alpha$), leading to an increasing $a(t)$ (see also Proposition 2 (ii)) but with a relatively constant middle part (see also Proposition 1).

A variant of Abt's age-independent index

There are several ways to correct the h-index for career length. One of them is Abt's index, discussed in the previous section: the simple idea there is to divide the h-index by the career length $t$ (or $t/10$). A similar idea is as follows. We take the citation data $c_i$ ($i = 1, \ldots, T$) and divide all these numbers by $t$ (or $t/10$). For these normalized citation data we calculate the h-index. We denote this age-independent index by $b(t)$ (using $t/10$) and by $b^*(t)$ (using $t$). To model $b(t)$ and $b^*(t)$ we invoke a result from Egghe (2008a, b) on transformations of the h-index.

Proposition 3 (Egghe (2008a, b)): Let $h$ be the h-index of the system $c_i$ ($i = 1, \ldots, T$): the number of received citations for paper $i$, where there are $T$ papers in total. If we do not change the total number $T$ of papers but multiply each $c_i$ by a positive number $B$, then the h-index $h^*$ of this transformed system is given by

$$h^* = B^{\frac{\alpha-1}{\alpha}}\, h \qquad (15)$$

where $\alpha$ is the Lotka exponent of the original system.
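The scaling rule of Proposition 3 can be checked numerically under one explicit assumption that is not stated in this section: in a continuous Lotkaian system with $T$ papers, the citation count at rank $r$ is taken to be $g(r) = (T/r)^{1/(\alpha-1)}$, a form for which $g(h) = h$ gives $h = T^{1/\alpha}$. The function name `lotka_h` and the numerical values below are hypothetical.

```python
def lotka_h(T, alpha, B=1.0, iterations=200):
    """Solve B * (T / r)**(1 / (alpha - 1)) = r for r by bisection.

    Assumption: continuous Lotkaian rank-frequency form
    g(r) = (T / r)**(1 / (alpha - 1)), with every citation count scaled by B.
    """
    lo, hi = 1e-9, 10 * max(1.0, B) * max(1.0, T)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if B * (T / mid) ** (1 / (alpha - 1)) > mid:
            lo = mid  # scaled citations still exceed the rank: the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


# Illustrative values (assumptions): 1000 papers, alpha = 2, citations divided by 3.5.
T, alpha, B = 1000.0, 2.0, 1 / 3.5
h = lotka_h(T, alpha)
h_star = lotka_h(T, alpha, B)
print(h, T ** (1 / alpha))                     # the two values should agree
print(h_star, B ** ((alpha - 1) / alpha) * h)  # the two values should agree
```

Bisection is used because the scaled citation curve is decreasing in the rank while the rank itself is increasing, so the crossing point is unique.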

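So we have the following formulae for $b(t)$ and $b^*(t)$:

$$b(t) = \left(\frac{1}{t/10}\right)^{\frac{\alpha-1}{\alpha}} h(t) \qquad (16)$$

$$b^*(t) = \left(\frac{1}{t}\right)^{\frac{\alpha-1}{\alpha}} h(t) \qquad (17)$$

Combining (16) with (7) yields

$$b(t) = 10^{\frac{\alpha-1}{\alpha}} \left(\frac{d}{\beta+1}\right)^{1/\alpha} t^{\frac{\beta+2}{\alpha} - 1} \qquad (18)$$

and similarly for $b^*(t)$ (with $10^{\frac{\alpha-1}{\alpha}}$ deleted). From (8) and (18) we now see that

$$b(t) = 10^{-1/\alpha}\, a(t)\, t^{1/\alpha} \qquad (19)$$

and similarly

$$b^*(t) = a^*(t)\, t^{1/\alpha} \qquad (20)$$

Note that, since $t \ge 1$, it follows from (20) that $b^*(t) \ge a^*(t)$ for all $t$. This shows that $b(t)$ increases faster than $a(t)$ (and similarly for $b^*(t)$ and $a^*(t)$). We have the following propositions, similar to Propositions 1 and 2.

Proposition 4: The indicator $b(t)$ is constant if and only if

$$\beta + 2 = \alpha \qquad (21)$$

The same result is true for $b^*(t)$.

Proposition 5:

(i) $b(t)$ is convexly decreasing if and only if

$$\beta + 2 < \alpha \qquad (22)$$

(ii) $b(t)$ is increasing if and only if

$$\beta + 2 > \alpha \qquad (23)$$

(iii) $b(t)$ is convexly increasing if and only if

$$\beta + 2 > 2\alpha \qquad (24)$$

(iv) $b(t)$ is concavely increasing if and only if

$$\alpha < \beta + 2 < 2\alpha \qquad (25)$$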

(v) $b(t)$ is linearly increasing if and only if

$$\beta + 2 = 2\alpha \qquad (26)$$

The same results are true for $b^*(t)$.

The proofs of Propositions 4 and 5 follow trivially from (18) and the similar result for $b^*(t)$.

From Proposition 4 we see that, for the classical Lotka exponent $\alpha = 2$, $b(t)$ is constant if $\beta = 0$. From (7) it follows that $h(t)$ is then concavely increasing (since $\beta + 1 < \alpha$).

The calculation of $b(t)$ is illustrated on this author's data for $t = 35$ (the year 2012). The citation data are as in Table 1. From this table we see that $h = h(35) = 22$. Hence, by (1),

$$a(35) = \frac{h(35)}{35/10} = 6.2857 \qquad (27)$$

For $b(35)$ we have to divide the $c_i$-values by $35/10$. This is done in Table 2, from which it follows that

$$b(35) = 10 \qquad (28)$$
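The model relations behind Propositions 4 and 5 can be sanity-checked numerically. The sketch below assumes the power-law model $h(t) = (d/(\beta+1))^{1/\alpha}\, t^{(\beta+1)/\alpha}$ and the scaling exponent $(\alpha-1)/\alpha$ from Proposition 3. The function names and all parameter values are illustrative assumptions.

```python
def model_h(t, d, alpha, beta):
    """Model h-index: (d / (beta + 1))**(1 / alpha) * t**((beta + 1) / alpha)."""
    return (d / (beta + 1)) ** (1 / alpha) * t ** ((beta + 1) / alpha)


def model_a(t, d, alpha, beta):
    """Abt's indicator in the model: h(t) divided by the career length in decades."""
    return model_h(t, d, alpha, beta) / (t / 10)


def model_b(t, d, alpha, beta):
    """Second indicator in the model: citations scaled by B = 10 / t, so h scales by B**((alpha - 1) / alpha)."""
    return (10 / t) ** ((alpha - 1) / alpha) * model_h(t, d, alpha, beta)


# Illustrative parameters (assumptions, not fitted values).
d, alpha, beta = 3.0, 2.5, 1.2
for t in (1, 5, 20, 35):
    lhs = model_b(t, d, alpha, beta)
    rhs = 10 ** (-1 / alpha) * model_a(t, d, alpha, beta) * t ** (1 / alpha)
    print(t, lhs, rhs)  # the two columns should agree, as in relation (19)

# With beta + 2 = alpha the model b(t) should not depend on t (Proposition 4).
print(model_b(5, d, 2.0, 0.0), model_b(30, d, 2.0, 0.0))
```

The check covers only the internal consistency of the model formulae; it says nothing about how well the model fits real citation data.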

Table 1. Citation data of this author, retrieved from the Web of Science on August 21, 2012, yielding $h = h(35) = 22$

   i    c_i
   1    258
   2    137
   3    106
   4     61
   5     60
   6     52
   7     48
   8     41
   9     41
  10     35
  11     31
  12     31
  13     30
  14     29
  15     26
  16     26
  17     24
  18     24
  19     24
  20     24
  21     23
  22     23
  23     20
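The values $h(35) = 22$, $a(35) = 6.2857$ and $b(35) = 10$ can be reproduced from the 23 citation counts of Table 1 with a short script. The helper name `h_index` is a hypothetical choice. Papers beyond rank 23 are not listed in the table, but they cannot change the result because the paper at rank 23 already has fewer citations than its rank.

```python
def h_index(values):
    """Largest rank r such that the r-th largest value is at least r."""
    ranked = sorted(values, reverse=True)
    return sum(1 for rank, v in enumerate(ranked, start=1) if v >= rank)


# Citation counts from Table 1 (ranks 1 to 23).
citations = [258, 137, 106, 61, 60, 52, 48, 41, 41, 35, 31, 31,
             30, 29, 26, 26, 24, 24, 24, 24, 23, 23, 20]

t = 35                                         # career length in years (1978 to 2012)
h = h_index(citations)                         # plain h-index
a_35 = h / (t / 10)                            # first method: divide the h-index by t / 10
b_35 = h_index([c / (t / 10) for c in citations])  # second method: divide every count first

print(h)               # 22
print(round(a_35, 4))  # 6.2857
print(b_35)            # 10
```

Counting the ranks that satisfy the condition gives the h-index here because, in a decreasing list, those ranks always form an initial segment.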

Table 2. Citation data from Table 1, divided by 3.5, yielding $b(35) = 10$

   i    c_i / 3.5
   1    73.71
   2    39.14
   3    30.29
   4    17.43
   5    17.14
   6    14.86
   7    13.71
   8    11.71
   9    11.71
  10    10
  11     8.86

Both methods of normalizing the h-index use the career length $t$. There is a third normalizing method. It is the same as the one yielding $b(t)$ (or $b^*(t)$), but instead of dividing each $c_i$ by $t/10$ (or $t$) we now divide each $c_i$ by (article age)/10, respectively by the article age, and calculate the h-index of this new table. The new age-independent indices are denoted $c(t)$ and $c^*(t)$ respectively. Note that, by definition, $c(t) \ge b(t)$ and $c^*(t) \ge b^*(t)$, since article age $\le t$.

The disadvantage of this third method is that it is complex to calculate: we have to check every article, and the new table obtained is no longer decreasing. A manual check of this author's citation data in the Web of Science on August 21, 2012 showed that $c(35) = 19$. Here $c(35) < h(35) = 22$, but the simple example in Table 3 shows that $c(t) > h(t)$ is possible. This occurs when articles are cited quickly, which is a good property.
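The third method can be sketched as follows. The function names and the `(citations, publication_year)` input format are assumptions made for illustration. The two-paper data set mirrors the small example of Table 3: 10 and 1 citations, both papers of age 1.

```python
def h_index(values):
    """Largest rank r such that the r-th largest value is at least r."""
    ranked = sorted(values, reverse=True)
    return sum(1 for rank, v in enumerate(ranked, start=1) if v >= rank)


def age_normalised_h(papers, current_year, per_decade=True):
    """h-index of citation counts divided by article age (in decades or in years).

    papers: iterable of (citations, publication_year) pairs.
    Article age is current_year - publication_year + 1.
    """
    scaled = []
    for cites, year in papers:
        age = current_year - year + 1
        unit = age / 10 if per_decade else age
        scaled.append(cites / unit)
    # Re-sorting happens inside h_index, since the scaled list need not be decreasing.
    return h_index(scaled)


# Two papers published in 2012, with 10 and 1 citations, evaluated in 2012.
papers = [(10, 2012), (1, 2012)]
print(h_index([c for c, _ in papers]))  # 1
print(age_normalised_h(papers, 2012))   # 2
```

The example shows the age-normalized index exceeding the plain h-index when young papers collect citations quickly.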

Table 3. Example of $h(t) = 1 < c(t) = 2$

   i    c_i    age    c_i / (age/10)
   1    10     1      100
   2     1     1       10

Discussion on the three correction methods

As discussed above, we have selected three methods for correcting the h-index for career length. The first method is Abt's original proposal (Abt (2012)): simply divide the h-index of a researcher by the career length. The second method presented here is to divide all citation data by this career length and then calculate the h-index of this set of normalized data. The third and last method presented here is to divide each citation number by the age of the cited paper and then calculate the h-index of this set of normalized data.

Clearly the first method is the simplest, followed by the second method. The third method is the most logical one (since each article's citation number is divided by its own age) but is the most intricate one, since the order of the normalized citation data differs from the original order and hence one is obliged to consider all papers of the researcher. In this sense, the second method is an acceptable alternative to the third method, since one divides each article's citation number by the career length. This is more logical than simply dividing the h-index by the career length as in the first method (Abt's method). The first method is not logical in this sense: it suggests that a normalization is obtained by simply dividing the h-index by career length, which would only be logical if the h-index were a linear function of time (career length), which is, by (7), not always the case.

From the above discussion one would be inclined to say that the second method is to be preferred since it is more logical than the first one and simpler than the third one. However, from (20):

$$a^*(t) = \frac{b^*(t)}{t^{1/\alpha}} \qquad (29)$$

indicating that the first method is equivalent to the second method, since it is obtained by dividing the normalized h-index $b^*(t)$ by the career length raised to the power $1/\alpha$, as indicated in (29). So the first method (although too simple in its definition) performs as well as the second method, and hence the first method should be preferred due to its simplicity. We also repeat that the first method yields a constant function (as indicated by Abt) in a classical informetric case: $\beta + 1 = \alpha$ (see Proposition 1), which is e.g. the case for $\alpha = 2$ (the most classical Lotka exponent, see Egghe (2005)) and $\beta = 1$ (linear growth of the h-index).

Conclusions and suggestions for further research

This paper presented a mathematical model for the age-independent indicator of Abt. We give characterizations of the different shapes of this function of the career length $t$, among which a characterization of when this function is constant. We show that this happens in classical informetric cases, giving evidence for Abt's claim that this indicator is often $t$-independent.

A second type of age-independent indicator is obtained not by dividing the h-index by $t$ but by dividing each citation number $c_i$ by $t$ and then calculating the h-index of this transformed table. Also for this indicator a mathematical model is presented and characterizations of the different shapes are given, among which a characterization of when this function is constant. Both models are also compared.

A third method of normalizing the h-index for career length $t$ is the same as the second method but, instead of dividing every citation number by $t$, we divide by the age of each article. Although this third index is more difficult to calculate, it has the good property that the faster articles are cited, the higher this index becomes.

We encourage the reader to conduct further experiments on these three types of age-independent indices and to define new variants of these three methods.

References

H.A. Abt (2012). A publication index that is independent of age. Scientometrics 91(3), 863-868.

L. Egghe (2005). Power Laws in the Information Production Process: Lotkaian Informetrics. Elsevier, Oxford, UK.

L. Egghe (2008a). The influence of transformations on the h-index and the g-index. Journal of the American Society for Information Science and Technology 59(8), 1304-1312.

L. Egghe (2008b). Examples of simple transformations of the h-index: Qualitative and quantitative conclusions and consequences for other indices. Journal of Informetrics 2(2), 136-148.

L. Egghe (2009). Mathematical study of h-index sequences. Information Processing and Management 45(2), 288-297.

L. Egghe and R. Rousseau (2006). An informetric model for the Hirsch-index. Scientometrics 69(1), 121-129.

J.E. Hirsch (2005). An index to quantify an individual's scientific research output. Proceedings of the National Academy of Sciences of the United States of America 102, 16569-16572.