The number of scientists in the world is considerable
(several millions), and in biology alone more than 1.5 million
articles are published every year. It is therefore impossible
to read most of them. Research is financed most often by public
money, and this makes it essential to evaluate the overall
output of scientists.
The core product of research
activity is the scientific article. Obviously, because this pertains
to creation of hypotheses, accumulation of hopefully significant
facts, placing them in context, and discovery, evaluation cannot
proceed via any type of mass process. Knowledge
is not the result of a democratic vote. As in all types
of highly specialised competences, only specialists well aware
of the domain (would you trust a self-taught engineer to take
care of the engine of the plane on which you board?) can judge
whether scientific activity is relevant and creative. This resulted
in the widely accepted peer-review system. This system, which builds
up its efficiency on the direct competence of specialists, is not
perfect, but it cannot be replaced by any other system because
it rests on knowledge which cannot be widely shared (except for
one essential element, unfortunately often forgotten, common sense).
The problem with peer-review is that it cannot avoid biases, either
ideological or personal. For this reason it is generally accepted
that peer-review is handled anonymously. This has many drawbacks
(in particular it allows much unethical behaviour) which are usually
remedied by involving several reviewers (at least two, often three
or more) to judge one piece of work. Yet, this process is extremely
time-consuming, and funding agencies have had an (unfortunate)
tendency to try to replace peer-review with automated processes.
Bibliometrics is one such process.
Bibliometrics is one of the many
social science techniques that rest on the creation of measures.
In bibliometrics, a variety of measures (distances and the like)
are used to evaluate the information content of articles and
the performance of their authors, according to a variety of methods.
Among those are the citations of work, notoriety of journals
and notoriety of authors. Indices have been created, combining
bibliometrics and other sociological measures to evaluate the
performance of universities and countries in education and scientific
domains. Particular emphasis is generally placed on the poorly
defined concept of notoriety.
Interesting views on this question have been published
by Jacques Ninio, by Frank Laloe and by John Ioannidis.
A retraction index has been created to identify
those journals or magazines that carry the majority of retracted
articles. Remarkably, there is a significant correlation between
the relative number of retractions and the Impact
Factor of a journal.
A piece of advice provided by Phil Bourne:
Ten Simple Rules for Getting Ahead as a Computational Biologist in Academia
Rule 1: Emphasize Publication Impact, Not Journal Impact
Rule 2: Quantify and Convince
Rule 3: Make Methods and Software Count
Rule 4: Make Web Sites Count
Rule 5: Make Data Deposition, Curation, and Other Related Activities Count
Rule 6: Use Modern Tools to Emphasize/Quantify Your Academic Standing
Rule 7: Make an Easily Digestible Quantified Summary of Your Accomplishments
Rule 8: Make the Reviewers’ Job Easy
Rule 9: Make the Job of Your References Easy
Rule 10: Do Not Oversell Yourself
Among the important consequences of the pressure
exerted on scientists to publish fashionable results is biased
thinking.
This should, in fact, decrease the efficiency
of research institutions,
in terms of discoveries produced per capita (as well as per dollar
spent in supporting research).
A common
way to evaluate the content of research is to rely on fashion,
assuming that what is fashionable (hence rapidly cited) is
scientifically sound and interesting. A commonly used measure
of fashion, the Impact Factor (or Fashion Index, IF) of a journal,
has been created to take this popular view (easy to communicate
to politicians and the media) into account. The Impact Factor
was invented by a commercial company, the Institute for
Scientific Information (ISI), and it plays an important role
in generating revenue, as scholars as well as institutions
wait, every year, for the new IFs of journals, which they
can only get via the company. The IF represents, for a given
year, the number of citations received that year by the articles
a journal published during the two preceding years, divided by
the number of those articles. It measures the average frequency
with which the articles of that journal are cited
during a defined period of time. It is a retrospective index
of the short-term impact of a journal. It is of course not
in any way a measure of the quality of the output of a scientist,
but rather of his or her ability to cope with fashionable
items, including lobbying. The IF is also
a way, for countries hosting the corresponding journals, to profit
from research performed by other countries without spending a single
cent on the work.
Because using the IF of the journals
in which an author publishes to evaluate the quality of her or
his output can be very misleading (in particular it works strongly
against innovation), the University of Cork (Ireland) issued in 2009,
in its guidelines for peer-reviewers of research,
the following statement:
All panels will work to an underpinning
principle that all forms of research output will be assessed
on a fair and equal basis. Panels will neither rank
outputs, nor regard any particular form of output as of
greater or lesser quality than another per se. Panels
may use, as one measure of quality, evidence that the output
has already been reviewed or refereed by experts (who may
include users of the research), and has been judged to
embody research of high quality. No
panel will use journal impact factors as a proxy measure for assessing quality.
An example is the rise and
fall of a discipline: driven by the ubiquitous development of
genomics, the year 2002 witnessed a dramatic change in the Impact
Factor of many journals. The Impact Factors of journals
computed by the Institute for Scientific Information (ISI) for
2009 have been available since mid-June 2010. They are quite interesting
as they illustrate extremely well the effects of fashion in
science. Indeed, after a few years of celebration,
genomics and bioinformatics are now on the decline. Microbiology
is clearly out of fashion. In contrast, anything publishing
images (and it is well known that the visual cortex is very
important but quite incapable of deep integration of concepts,
for example) is now very successful. This will probably be
trendy for the next few years. So, if
you wish to be visible (yes, this is not simply a metaphor!)
publish fine images.
The content no longer seems to matter much ...
The IF of genomics and bioinformatics
journals has risen considerably, as has that of
open access journals. It has since been levelling off as new fashionable
domains, such as Systems Biology, emerge. Because
of those biases it is important to check, using Google
Scholar for example, that no important references from
an author have been missed by the ISI. This is particularly
important when analyzing the track record of young scientists,
who might be discriminated against simply because they did
not publish their important work in journals immediately tracked
by the ISI. Furthermore the way the ISI "analyses" the
output of investigators mixes up all kinds of publications,
including work that is not meant to be cited (secondary publications
in popularization magazines, for example) so that a superficial
use of the automatic indicators is only valid for scientists
who do not communicate with the public and follow the mainstream
trends. Of course plagiarism plays an unfortunate role in distorting
indicators: an interesting view of the situation can be obtained
by browsing the Deja Vu site, which collects plagiarized articles. See
also the European initiative Scientific
Red Cards. For the time being Google Scholar is more reliable
than the ISI (except for papers published earlier than the beginning
of the Internet, ca 1985).
In January 2009, at the ScienceOnline'09 conference
in Research Triangle Park, North Carolina, the open access journal PLoS
ONE introduced a series of metrics as alternatives
to the traditional Impact Factor.
Many features of scientific publications
pertain to game theory, with no direct connection with
the scientific content supposed to be carried by articles.
A remarkable study shows why
current publication practices may distort science, demonstrating
that "The current system of publication
in biomedical research provides a distorted view of the reality
of scientific data that are generated in the laboratory and clinic", with
a considerable bias towards overestimation of the quality of
work published in high IF journals... Many examples
of the situation can be found. In late 2008, the retraction of
a high-profile study on a long-sought abscisic acid receptor in
plants was a further demonstration of the unfortunate situation
we have now reached. For a list of high-profile retractions in
2010 see The Scientist, but there are many, many more!
A fair use of bibliometric indicators (beware of cheaters)
Open access (making access to Science free for all)
The Impact Factor (the impact of a journal, not of a work or a scientist)
The H-index (the citation level of an author)
Other indices (notoriety, immediacy, SCimago...)
Before investigating further
the nature of bibliometric indicators such as the IF,
it is essential to exercise common sense, and to consider
that the aim of research is discovery, not making oneself
known. By definition a discovery cannot be predicted, and
because it is new, it often takes time to be recognized.
In a world where emphasis is placed on the futile, on what
is important one day and forgotten the next day, where
crooners make the headlines, where money has replaced moral
values, it is unavoidable that many scientists are tempted
by the limelight. Some scientific magazines, whose aim
is profit, take the full measure of this unfortunate situation
and play on indicators which best fit their money-driven
goal. We hope that the vast majority of our colleagues
are still motivated by the quest for Knowledge, and that
they will resist the temptation of the easy path, which would
have them evaluate their peers through a crude use of bibliometric
indicators, rather than by analyzing the actual content of
their work. The following paragraphs are meant to help
them in this endeavour. It should finally be noticed that
journals producing images are systematically biased positively,
demonstrating that the role of structured language is much
less important in the way science is produced at the moment
than the ever-growing power of images.
(A further analysis)
As remarked by the late Maurice
Hofnung (1942-2001), many factors affect bibliometric
indicators:
1- The number of citations dramatically
depends on the research domain, on the number of scientists
publishing in the domain, on the number of publications in
the domain. Medical sciences,
for example, have impact factors (see
below for a definition) which are often several times higher
than those of biochemistry, just because of the sheer number
of publications and scientists in the domain (many publications
come from hospitals all over the world, and a large number
are simple case studies). In addition, medical journals contain
many articles that are not peer-reviewed, so that up to
40% of the IF is due to references to non peer-reviewed articles
in these journals! It is therefore expected to find medical
publications, or publications dealing with medical subjects
in the top IF publications, even when they would be quite
average in other domains. Inside medical sciences, it is
better to be an immunologist than a clinician, for example.
In contrast, zoology or molecular microbiology would fare
low. If one absolutely wishes to use IFs, a correct way to
appreciate a domain is then to calculate a "relative
impact factor" which standardizes the IF by dividing
it by the IF of the highest impact journal in the domain. For
example the reference journal in cell biology (cytology)
is the journal Cell, while its counterpart in microbiology
is Molecular Microbiology: comparing scientists
in both domains would benefit from comparing their citation
record in the perspective of the relative IFs of these journals
(a ratio of 4 to 5 in the significant number of citations).
Authors who publish successful methods can have a huge impact
(see protein dosage, plasmid preparation, software for protein
model construction and the like) and this contributes to
the impact of a journal. In contrast, authors who take the
time to popularize science in popular magazines are discriminated
against as soon as the corresponding magazines are indexed
by the ISI. Indeed, articles in these magazines are not meant
to be cited, but read by the general public, so that they
immediately lower the average yearly citations of the
authors who think that it is important to promote interest
in Science among the general public.
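The "relative impact factor" suggested in this point can be sketched in a few lines of Python. This is only an illustration of the normalization described above (each IF divided by the highest IF of its domain); the function name, the journal names and all the IF values are invented for the example and do not come from the ISI.

```python
def relative_impact_factors(domain_ifs):
    """Divide each journal's IF by the highest IF found in the same domain."""
    if not domain_ifs:
        raise ValueError("the domain contains no journal")
    top = max(domain_ifs.values())
    if top <= 0:
        raise ValueError("the highest IF in the domain must be positive")
    return {journal: value / top for journal, value in domain_ifs.items()}


# Hypothetical IF values, invented for illustration only.
domain_a = {"Journal A1": 30.0, "Journal A2": 6.0}
domain_b = {"Journal B1": 6.0, "Journal B2": 3.0}

relative_a = relative_impact_factors(domain_a)
relative_b = relative_impact_factors(domain_b)

# The same raw IF (6.0) is modest in domain A but the reference in domain B.
print(relative_a["Journal A2"], relative_b["Journal B1"])
```

With such a normalization, a raw IF only acquires meaning once it is placed next to the reference journal of its own domain.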
2- The bibliometric profile depends on the
history of the domain. Generalist magazines such
as Nature or Science have
a high impact factor because of their format (weekly magazines)
and of their status as established publications. They are also
journals with high advertising impact, which requires them to
maintain regular contact with the popular daily mass media. This
has nothing to do with the quality of science (and, as a matter
of fact, many fakes are published there and many great discoveries
have been refused publication there). Thus, at the beginning
of molecular biology of pathogenic bacteria, it was extremely
difficult to publish in high impact generalist journals such
as The EMBO Journal. This is much less so today, and
many new journals appeared in the domain, as the size of the
community increased (with concomitant increase in impact factor).
In contrast, publications on model organisms, which reached
high impact journals formerly, now are confined to much lower
impact journals (as the size of the community is shrinking).
Publications of genome sequences, which contributed considerably
to the IF of popular magazines, are now considered standard
work and are published in the specialized journals of the disciplines
of the corresponding topics. Bibliometrics using IFs measures
the impact of a domain and not only that of the work under
analysis. How can we compare, using IFs, disciplines as different
today as mycology, entomology or development? Since it is
difficult to define a domain, one may compare recognized
scientists known to belong to the same domain. Examples of
domains are: HIV, Yersinia, protein structure, vaccinology,
cellular microbiology, etc. This may allow one to situate the
scientist in his/her domain. One may try to normalize for each
domain by dividing by the total number of publications in the
domain during the same period of time. Comparing scientists
in different domains is extremely difficult: a way might be
to compare the level they have in their specialized domain.
Multivariate analyses may be important methods to perform the
task, but they need to be used by people competent in statistics.
3- Bibliometrics measures an ensemble of
factors describing the ability of a scientist to make discoveries
and/or inventions and to make them known. Some scientists have
the knack to make discoveries, while others help other scientists
to make them, others to make their own work known, and others
to make the work of others known! Putting too much emphasis
on a narrow use of bibliometrics has the unfortunate consequence
of making the "make-known" more important than the "make-discover" or
the "make-invent". It is also an incentive for unverified
or even fake experiments. In its narrow use, bibliometrics
does not take into account patents (and even less the fact
that a patent has been granted a licence!) or databases, and
it forgets conferences, teaching, the organisation of meetings,
creation of laboratories, etc.
4- The bibliometric profile depends on the
moment of the career of a scientist and it is rare that his/her
production is constant in quality or in quantity. This should
be taken into account. A new subject or a new laboratory setting
will inevitably introduce a gap in scientific production, and
bibliometrics should not prevent this type of innovative approach
to Science! Indeed, the best reviewing committees measure the
production of scientists placing it in proper context, and
they are careful not to simply evaluate the quantity of output. As
a rule of thumb it should not be accepted that a scientist
publishes more than one article every two weeks (and usually
much less), as too large an output is a sign of sloppiness,
unethical behaviour and lack of proper consideration of the
importance of Science. For journals it is good practice to
black-list scientists who are familiar with such practices
and never to use them as peer-reviewers. A sudden explosive
increase in the output of a scientist should be carefully monitored,
as it is often the sign that something unethical is happening.
5- Some heads of laboratories sign only
the articles where they have had a significant scientific contribution.
Others have the tendency to sign everything, even without reading
what they sign! Some journals now demand that each individual
author is identified by his/her explicit contribution. This
practice should be generalized. The normal ethical behaviour
is that the first author of an article is the person who performed
most of the work. While this practice is still not general
it is good policy, to judge a leader, to count not only his/her
production, but all that coming from his/her laboratory. In
any event, scientists who publish far
too much (some sign 50 articles per year or even more!) should
not be considered as belonging to the category of ethical scientists
and should be black-listed.
6- It is now recognized that the utilization
of bibliometric criteria modifies the policy of the signature
of articles. There already exist scientists (especially
in countries familiar with lobbying practices) who deliberately
omit to cite their competitors in order to lower their impact. This
attitude goes against fairness in choosing citations and
jeopardizes the objective use of bibliometrics. This is already
reflected in the average reference lists: references of articles
in the USA contain more citations from English-speaking authors
than the real world-wide contribution in the domain. This is
easily measured by comparing the citations offered by authors
of other nations in the same domain. This bibliometric
practice should be known when scientists from diverse countries
compete for a given position.
7- A study has shown that authors' names that are
difficult to write, or unfamiliar in English-speaking
countries, are often misspelled, and are therefore neither
quoted properly nor counted in the citation half-life, for
example. It is indeed important that the spelling of the names
of authors is reported without errors. Because English is the
standard publication language, spelling errors in English names
are less frequent than in other names (e.g. Polish names, with
their many consonants, often suffer spelling mistakes).
Also, it is not infrequent, when a new word is created for
a new concept, that it is misspelled (because it is absent
from dictionaries) and this results in under-reporting of citations
(see for example the amusing "homeotropic"
instead of "homeotopic" in the record at
the ISI of an article on the origin of life). This
is another bias (fortunately not acting against Chinese authors, whose
surnames are simple to spell) which again goes
in favour of the extreme domination of English-speaking countries,
already favored by the use of English as the basic language
of communication. Of course this has nothing to do with the
quality of the corresponding science. As a consequence, bibliometrics
should be used with appropriate caveats, especially in non-English
speaking countries.
Draft of a possible scheme for a more objective bibliometric evaluation of a scientist
| Number of years since the first publication |
| Number of peer-reviewed publications |
| Number of ill-spelled citations (when identified) |
| Total number of citations |
| Average number of citations per year (before the five preceding years) |
| Publications in the five preceding years; give a negative value when the number of articles is higher than 30 per year |
| Number of papers cited less than 5 times (before the five preceding years) |
| Number of papers cited more than 10 times |
| Number of citations for the five most cited papers |
| Number of papers cited after 10 years |
| Average rank in publications (first = last = 1; second = penultimate = 2; other place = 3, etc.): high influence, near 1; highly collaborative, near 3 |
Of course, this is only one indicator,
and, for comparative purposes, it is important to evaluate
the impact of the specific domain, novelty, publication of
patents, databases, etc! It
is also important to check articles that were subsequently "commented"
upon by other authors: the comments often underline plagiarism,
sloppy experiments or even fakes...
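The "average rank in publications" entry of the scheme can be made concrete with a short Python sketch. It assumes that positions are counted from 1, that a sole author counts as rank 1, and that every position other than first, last, second and penultimate is given rank 3 (one possible reading of "other place = 3 etc"); the function names and the sample data are hypothetical.

```python
def author_rank(position, n_authors):
    """Rank of one signature: first = last = 1, second = penultimate = 2, else 3.

    `position` is 1-based. Capping at 3 is an assumption about the scheme.
    """
    if not 1 <= position <= n_authors:
        raise ValueError("position must lie between 1 and n_authors")
    distance_from_end = n_authors - position + 1
    return min(position, distance_from_end, 3)


def average_rank(signatures):
    """Mean rank over a list of (position, n_authors) pairs."""
    if not signatures:
        raise ValueError("at least one publication is needed")
    ranks = [author_rank(pos, n) for pos, n in signatures]
    return sum(ranks) / len(ranks)


# Hypothetical record: first of 3, second of 5, third of 6, last of 4.
example_record = [(1, 3), (2, 5), (3, 6), (4, 4)]
example_average = average_rank(example_record)
print(example_average)  # 1.75
```

A value close to 1 then points to a scientist who mostly signs first or last, and a value close to 3 to one who mostly appears in the middle of long author lists.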
An evident bias in favour of native English speakers has been
found, and there seems also to exist a gender bias:
Gender bias in the refereeing process? Tom Tregenza
Trends in Ecology & Evolution, (June 06, 2002), 10.1016/S0169-5347(02)02545-4
Abstract
Scientists are measured by their publications. Yet anonymous
peer review is far from transparent. Does bias lurk within
the refereeing process? Investigating the outcomes of manuscript
submissions suggests that the overall process is not sexist,
but differences in acceptance rates across journals according
to gender of the first author give grounds for caution.
Manuscripts with more authors and by native English speakers
are more successful; whether this is due to bias remains
to be seen.
Note also that scientific authors can also be cited in
the Literature and Arts domain, as well as in the domain
of Social Sciences, Anthropology and Philosophy...
Finally, unfortunately, the peer review system as
it is working now is heavily flawed. An interesting way out,
which would improve both the articles and the review process
has been proposed
by David Kaplan.
For several years a bitter fight has been developing
between the proponents of private publishing and those favoring open
access to scientific research. The role of the Impact
Factor of journals is important in this fight, as already
established commercial publications owe a large proportion of
their success to this bibliometric measure of their influence.
Government agencies, such as the National Institutes of Health
in the USA, consider that because the research they support is funded
by taxes, it should be public and open access. In a similar move,
the Wellcome Trust, the most influential charity in the UK, has
required, from October 2005, that the research it supports
be published in open access journals. Some commercial journals,
such as Nucleic Acids Research,
have already decided to become open access. Open access
journals make the content of original publications free and public,
leaving the copyright property to the authors, provided they
refer exactly to the place where the work has been published.
A study
published on February 19th, 2009 shows
that free online availability of scientific articles increases
the prospect for authors to get cited. The tendency is particularly
visible in developing countries, where funding for research is
limited. It is now common practice for an author to look for
a reference in a field directly related to his or her work, and
to shift to a related paper if the article initially chosen
is not readily available.
(see a
thorough analysis of Biomedical Digital Libraries)?
A fashionable way to evaluate
Science is to use bibliometric studies. One often considers
the "Impact Factor" associated with the publications of a scientist,
assuming that this is a way to evaluate the quality of his/her
production. In fact an "Impact Factor" (invented
by Eugene Garfield of the profit-making Institute
for Scientific Information) is but one among several bibliometric
markers; it is a measure of the number of times a journal is
quoted in references, for a limited period of time. It rested
initially on the sole responsibility of a private company,
that which maintains the Institute
for Scientific Information (ISI). Several other organizations
now compute a similar index, which can now also be computed using
Google Scholar.
The Impact Factor represents, for a given year,
the number of citations received that year by the articles a journal
published during the two preceding years, divided by the number
of those articles. It measures the average frequency with which
the articles of that journal are cited during a defined
period of time. It is a retrospective index of the short-term
impact of a journal.
For example, the impact factor of Science in 1995 (21.911)
has been computed as follows:
- citations in 1995 of articles published in 1993: 24,979;
in 1994: 20,684; total: 45,663
- number of articles published in 1993: 1,030; in 1994: 1,054;
total: 2,084
- IF = number of citations / number of articles = 45,663 / 2,084
= 21.911
This means that the papers published in Science in
1993 and 1994 have been cited slightly less than 22 times in
1995 on average.
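The arithmetic of this example can be checked with a few lines of Python. This is a minimal sketch of the ratio as described here, not the ISI's actual procedure (which also depends on which items are counted as "articles"); the function name and the dictionary layout are assumptions made for the illustration, while the figures are those quoted above.

```python
def impact_factor(citations_by_pub_year, articles_by_pub_year):
    """Citations received in the reference year, divided by the number of
    articles published during the two preceding years."""
    total_citations = sum(citations_by_pub_year.values())
    total_articles = sum(articles_by_pub_year.values())
    if total_articles == 0:
        raise ValueError("no article was published in the reference period")
    return total_citations / total_articles


# Figures quoted in the text for Science, reference year 1995.
citations_in_1995 = {1993: 24_979, 1994: 20_684}
articles_published = {1993: 1_030, 1994: 1_054}

science_if_1995 = impact_factor(citations_in_1995, articles_published)
print(round(science_if_1995, 3))  # 21.911
```

Because the result is a plain ratio, any change in what is counted in the numerator or in the denominator shifts the IF directly.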
Because this is a ratio, the impact factor depends
heavily on the definition of an article, and the same definition
is not used for the numerator and the denominator. Only scientific
articles are usually counted in the denominator, while the
citation count in the numerator covers all types of items published
in the journal, including Reviews, Comments, Editorials,
etc. Therefore a
journal publishing reviews always has a higher IF than
those which do not publish reviews (hence the vogue for mini-reviews
or even review sections in most major journals now). For example
the largest IF in 1999 was that of Annual Review of Biochemistry (37.111).
Many review articles are not peer-reviewed in the same way
as standard scientific articles, but commissioned, thus creating
a huge bias in the choice of authors. Lobbying in this domain
is common practice. Moreover, comments and editorials are
often political in nature, and therefore frequently quoted:
the IF of a journal such as The Lancet owes much to
its controversial editorials, not to the scientific content
of its articles. This is even more so for Nature or Science,
and this explains the introduction of special sections such
as "Insight" in Nature, since this will automatically boost the
impact factor of the journal.
As
the magazine Nature discovered a few years ago, the
Impact Factor is flawed. Nature reiterated its words
of caution on June 23rd, 2005:
"The net result
of all these variables is a conclusion that impact factors
don't tell us as much as some people may think about the
respective quality of the science that journals are publishing.
Neither do most scientists judge journals using such statistics;
they rely instead on their own assessment of what they
actually read. None of this would really matter very much,
were it not for the unhealthy reliance on impact factors
by administrators and investigators' employers worldwide
to assess the scientific quality of nations and institutions,
and often even to judge individuals. There is no doubt
that impact factors are here to stay. But these figures
illustrate why they should be handled with caution."
An analysis of quotations of the Human Genome
articles shows important errors in citation statistics (Nature
(2002) 415: 101.) As stated in the editorial of this famous
magazine: "This adds to worries about relying heavily
on these figures when rating scientific performance." Furthermore,
this figure is only computed for journals selected by the ISI,
excluding some very important journals (in particular some
published on the World Wide Web and fundamental to genomics).
It is important now to complement data provided by the ISI
with other sources of citation records, such as those provided
by Google Scholar. Caveat: journals
are considered only as providing references from the date they
are incorporated in the survey: this means that older papers
from those journals are not taken into account. There is also
a very strong bias in favour of native English speakers and
English-speaking countries (Tom Tregenza, Trends
in Ecology and Evolution 2002, 17:349-350).
The
Impact Factor does not measure the quality of the production
of a scientist. An old study by Maunoury already showed
that "9% of the articles in Cell, 16% in PNAS,
43% in Experimental Physiology and 52% in the European
Journal of Pharmacology, published between 1989 and 1993,
have never been quoted". The citation of one single
article may significantly affect the IF of a journal. For
example, the article on Blast2 is quoted so often that this
single paper increased the IF of Nucleic Acids Research by
one unit! In the same way, genome papers were often "hot
papers" (see below) and they affected considerably the
IF of journals (this is the case of both Nature and Science).
In total, just a handful of articles may double the IF of
a journal. By far, the most important
factor to evaluate the production of a scientist is the number
of times his or her work has been cited by others. This
however depends heavily on the journals considered in the
databases. In the absence of an independent, non-commercial
study, the figures we possess may be flawed. Note that
some articles act as "attractors" and get most
citations of a given subject. Quite unfortunately (and unethically)
lobbies are trained to cite only papers of friends... Furthermore
immoral scientists (in particular in the highly hierarchized
medical domain) sign more papers than they can really contribute
to, artificially increasing their citation record (in particular
through self-citation). This unfortunate
unethical and sloppy practice often involves plagiarism (including
self-plagiarism) using a general canvas for articles where
the name of the cases, organisms etc. may easily be replaced
by a variety of alternatives.
Other approaches are much better,
e.g. the number of times a paper (or a scientist) is still
quoted after 5 years, 10 years or 20 years. The Impact Factor
measures the ability of a journal to make itself known by advertisement
(often paid advertisement), or even scandal. The
publication of fakes or uncontrolled results increases
the impact factor of a journal (see the arsenic nightmare)!
For a public view of the situation, the "Research's
Scarlet List" as named by Alison McCook provides a
very conservative identification of misconduct, see also for
example (and this happens quite often):
Retraction:
Metal-insulator transition in chains with correlated disorder
PEDRO CARPENA, PEDRO BERNAOLA-GALVAN, PLAMEN CH. IVANOV & H.
EUGENE STANLEY
Retraction:
A cytosolic catalase is needed to extend adult lifespan in C.
elegans daf-C and clk-1 mutants
J. TAUB, J. F. LAU, C. MA, J. H. HAHN, R. HOQUE, J. ROTHBLATT &
M. CHALFIE
Another example of retraction that has important
consequences in the understanding of oxidative damage in animal
cells, a fundamental topic in cancer studies is the following:
A highly cited 1997 paper on transcription-coupled
repair was retracted by Science in June 2005, after
coauthor Steven Leadon, formerly of the University of North
Carolina, was found guilty by a university committee of fabricating
and falsifying data. An
analysis by Graciela Flores shows that in spite of this
retraction, the matter is not being dealt with as ethics,
as we understand it, would dictate. As is often the case (we could
recall the famous fakes of Mark Spector, which turned the
whole community of cancer scientists away from investigating
metabolic features of the disease for more than two decades),
this is not the first time that a paper by Leadon has been retracted. Scientific
misconduct is a widespread plague, unfortunately. Aside
from publishing many more papers than intellectually possible
(a behaviour akin to corruption, and widespread in
places where corruption is omnipresent), one of the most
pervasive forms of misconduct is the lack of proper citation of related
work, which, of course, contributes greatly to the
impact of scientific work. In fact, as stated in an editorial
of Nature about the terrible case of misconduct
of a Korean scientist working in the domain of embryonic
stem cells: "In view of the pattern of behaviour
that led up to Hwang's disgrace, however, no one should argue
ever again that despotism, abuse of junior colleagues, promiscuous
authorship on scientific papers or undisclosed payment of
research subjects can be tolerated on the grounds of eccentricity
or genius. Research ethics matter immensely to the health
of the scientific enterprise. Anyone who thinks differently
should seek employment in another sphere."
There is a strong correlation between the Impact Factor of
a journal and the number of retractions of articles published
in the journal (see the retraction
index).
Is, then, the impact factor a good way to evaluate
excellent science?
After the "Impact Factor",
another index, the "h-index",
has been proposed by J. E. Hirsch to evaluate the production of mainstream scientists:
it is the highest number
"h" such that a given author has "h" papers each cited at
least "h" times (the same "h"). h=35 means that an author
has published at least 35 papers cited 35 times or more. This
index is interesting in that it dampens the bias in favour of
authors who have a few highly cited papers but nothing else.
It also takes into account the past contribution of a scientist
if his or her work continues to be cited over the years, provided
that this work comprises a significant number of papers. It seems likely that this measure
will play a role at least as important as that of the popular
Impact Factor. A strong caveat must be
borne in mind when considering this index: because it
refers to citation in a given domain, it is highly sensitive
to the number of articles published in that particular domain.
For example (perhaps because of the attraction of images), cell
biology probably enjoys the largest number of citations,
whereas biochemistry or microbiology have four to five times
fewer citations. Hence a rule of thumb would be that an h-index
of 25 in microbiology would match an h-index of 100 in cell
biology (50 in immunology or in studies of the central nervous
system). This way of computing is substantiated by the comparison
between the Impact Factor of the highest ranked journal in cell
biology, Cell (around 30), and that of its counterpart
in microbiology, Molecular Microbiology (around 6.5).
In fact, purists would contend that, as the process of citation
is the
result of many multiplicative causes, it is more likely that
the distribution of citations is log-normal rather than normal.
If this is the case, then the multiplicative factor should probably
be the logarithm of the ratio of the overall number of citations.
An h-index of 100 in cell biology would then
correspond to 75 in immunology or studies of the CNS, and 40-50
in microbiology. Naturally, combining two fields, as in
cellular microbiology, would increase the h-index even further. It is
therefore extremely important to remember that the size
of a community matters when comparing the performance
of various authors: an h-index of 30 in cell biology is much less
significant than the same value in microbiology. Strictly
speaking, comparing investigators on the basis of their h-indexes should
only be possible if they belong to the same field.
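The definition of the h-index given above can be sketched in a few
lines of Python. This is only an illustration: the function name and
the citation counts are invented for the example and do not describe
any real author.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank      # the first `rank` papers all have >= rank citations
        else:
            break         # counts only decrease from here, so h cannot grow
    return h


# Hypothetical citation records, invented for illustration only.
few_big_hits = [900, 450, 3, 2, 1, 0]
steady_record = [40, 38, 35, 30, 28, 25, 22, 20, 18, 15]

print(h_index(few_big_hits))    # 3
print(h_index(steady_record))   # 10
```

The first record totals far more citations than the second, yet its
h-index is much lower: this is the dampening effect described above
for authors who have a few highly cited papers but nothing else.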
A major issue with this index
is, of course, the way citations are recorded. It is easy to
see how authors publishing in journals that are not counted among
the citations used by the ISI, for example, will be overlooked
even if their research has a large impact: while the Japanese
journal DNA Research is
in the ISI list, for some reason its early papers are not among
the papers cited by the same ISI, and there are many such examples,
in particular among the journals published by BioMed Central...
For this reason it is important to prefer Google
Scholar to the
ISI and to compute the h-index using this source of information.
Whatever the source, it is prudent to take the h-index with a grain of salt, as scientists
are gregarious and tend to cluster in heavily trodden avenues:
it is not unusual to see that fairly boring topics are very popular
(hence, highly cited). Indeed, analysis of the publication record
of scientists who were awarded a Nobel Prize suggests an important
caveat: while most have a fairly high "h-index" (but
usually not among the highest ones), some do not.
This is easily understood: a senior scientist signing a
very large number of papers will have a high h-index despite an
often evident lack of originality (interestingly, the h-index may help
to identify misconduct: no author should be able to co-author
one article per week, for example, except in a very corrupt
system). Also, many important discoveries are made at the interface
between domains, or are found outside the mainstream, and this
does not always contribute to a high "h-index". It
is sometimes also important to monitor, together with the h-index,
the slope of the decrease in the number of citations. Naturally,
it is essential to have an idea of the size of the underlying
community: working on infectious diseases, for example, will
lead to work widely cited in hospitals, which are much more numerous
than microbiology departments at universities! Original work
takes time to be recognized, and recent domains have much smaller
communities than older ones. This has been particularly true in the
past ten years or so, when many new domains were created, precluding
high citation records.
The
place where one has access to citation numbers is also very
important. "Google
Scholar" is becoming an important resource in that domain
and it sometimes provides better insight, and probably fewer biases,
for new or innovative articles than the expensive ISI resource.
The phenomenon is quite visible in Bioinformatics, where the
number of articles referenced at the ISI is significantly lower,
and sometimes much lower, than at Google
Scholar (this is highly time-dependent, however, as this
database is recent and does not easily identify citations that are
more than 10 years old). Conversely, because Google Scholar started recently,
papers published before 1985 are considerably less cited in "Google
Scholar" than at the ISI. Google Scholar often fails to
associate articles with a given author, in particular when
the list of authors is long. It is also heavily influenced
by the behaviour of people with respect to Internet connection.
A new factor, the "y
factor", is being created to monitor the Internet-related
prestige of journals: we are not very far, now, from Science
as full-blown show business! Hence, as always with bibliometrics,
one needs to take all indexes with a grain of salt, especially
in the most innovative domains. In any case, it is of course essential
to consider first the content of research,
rather than the place where it is published or its visibility.
The immediacy index of
a journal is calculated by taking the citations that the journal's
articles of the current year receive in that same year and dividing
them by the number of articles it publishes
in that year, i.e., the 2000 immediacy index is the average
number of citations in 2000 to articles published in 2000.
The number measures how quickly items in that journal get cited
upon publication. Hot papers is
another way to measure the immediacy of the few papers which are
most quoted immediately after their publication and during
the following two years.
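As a worked illustration of the immediacy index as described above,
here is a minimal Python sketch. The function name and the figures
are hypothetical, chosen only to make the arithmetic visible.

```python
def immediacy_index(same_year_citations, articles_published):
    """Citations received in year Y by articles published in year Y,
    divided by the number of articles published in year Y."""
    if articles_published <= 0:
        raise ValueError("the journal must have published at least one article")
    return same_year_citations / articles_published


# Hypothetical journal: 120 articles published in 2000, which were
# cited 90 times in total during 2000 (invented numbers).
index_2000 = immediacy_index(90, 120)
print(index_2000)   # 0.75
```

An index below 1 simply means that, on average, an article in this
imaginary journal is cited less than once during its year of publication.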
The cited half-life is
a measure of the rate of decline of the citation curve of an
article. It is the number of years that the number of current
citations takes to decline to 50% of its initial value. It
is a measure of how long articles in a journal continue to
be cited after publication. Review articles usually have longer
cited half-lives.
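The description of the cited half-life given above can be turned into
a small Python sketch. It follows the wording of the text literally
(the number of years needed for yearly citations to fall to 50% of
their initial value), which is an assumption about how the measure is
computed. The function name and the citation series are invented for
illustration.

```python
def cited_half_life(yearly_citations):
    """Return the number of years after which the yearly citation count
    has fallen to 50% or less of its initial value, or None if it never
    does within the series. yearly_citations[0] is the initial count."""
    if not yearly_citations or yearly_citations[0] <= 0:
        raise ValueError("need a positive initial citation count")
    threshold = yearly_citations[0] / 2
    for years_elapsed, count in enumerate(yearly_citations):
        if count <= threshold:
            return years_elapsed
    return None


# Hypothetical yearly citation counts (invented for illustration).
research_paper = [80, 70, 55, 40, 30]
review_article = [80, 78, 75, 70, 66, 60, 52, 45, 40]

print(cited_half_life(research_paper))   # 3
print(cited_half_life(review_article))   # 8
```

With these made-up series, the review article takes much longer to
lose half of its yearly citations, in line with the remark above that
review articles usually have longer cited half-lives.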
The lucrative business of publishing scientific
articles cannot accept that a fair assessment of research might
affect its financial benefits. There is therefore considerable
activity in the creation of indexes meant to rank
papers (and hence publishers). Elsevier's database Scopus
is the basis of a new index, SCImago, meant
to rank journals and countries. SCImago's
ranks differ somewhat from those obtained at the ISI, but review
articles and journals producing images obtain the top scores,
as expected.
Publishing a scientific article assumes some sort of selection
process. The most widely accepted form of this process is
peer-review. Like democracy, this is not a perfect system, but
it is probably the best possible one. The idea behind it is that
an author cannot be entirely confident in the research (s)he is
producing, not only because (s)he is judge and jury, but also
because knowledge is so vast that it is always possible to
overlook important research pertaining to the subject of an
investigation, or simply to fail to spot a defect
in reasoning. As a principle, a referee should therefore help
to improve the content of an article, and only reject it when
it is logically flawed, plagiarizes previous work, or simply
fails to acknowledge the proper scientific background of the
work submitted for publication.
Many drawbacks are associated with this procedure, in particular
conflict of interest (when the reviewer is doing work in the
exact same domain as the author of the work under review) or
lack of competence or insight. The balance between these
two features is difficult to strike, as competence requires that
the reviewer be knowledgeable about the reviewed work. Peer-review
is unpaid work, to lower the impact of another type of conflict
of interest. Yet, the very fact that many journals aim at getting
a high impact factor means that reviewers are asked to reject
much more work than they normally would. And, in the
most fashionable journals, which make a huge amount of money
out of the fame attached to a high impact factor, reviewing
strongly distorts science towards fashion. However, there
is also a huge number of studies which are simply sound, but
dull, and do not help much in the progress of knowledge. The
most difficult question associated with the peer-review process
is the way it tackles innovation and discoveries. Fashion and
pressure for impact favour fakes (and retractions of articles
in otherwise reputed journals are extremely frequent).
On the
other hand, using reviewers who are excellent professionals
but are not open to innovation is the most frequent drawback
in the process. It is often excellent to use scientists who
have long been recognized, and do not need further glory, to
support highly innovative ideas. A 2009 study of the peer-reviewing
process in the Proceedings of the National
Academy of Sciences of the USA demonstrates this in a remarkable way. Rand and
Pfeiffer have investigated systematic
differences in impact across publication tracks at this
journal. PNAS has three tracks for article submission. Papers
can be “Communicated” for others by NAS members (Track
I), submitted directly via the standard peer review process
(Track II), or “Contributed” by NAS members (Track
III). In the latter case the NAS members choose the reviewers
according to their preferences, so that, in general,
if the NAS member has agreed to transmit the paper,
the reviewers will be kind enough to have the paper accepted.
For this reason it was feared that, for Track III papers, the
process would end up producing papers of below-average
quality. This work shows that this is indeed the case for a
significant number of papers but that, in contrast, the most
interesting and innovative papers belong systematically to
this category. The standard peer-review papers are of excellent
professional quality, with few papers having a low citation
rank, but they are not the papers which would leave important
traces in science...
Retractions are the plague of scientific communication. The
fact that self-advertisement has become more and more important
in recent years has pushed investigators to become very sloppy
in the way they use statistics, or even to fake important
results. This is particularly true in the medical domain, as
shown in a study published
by Ioannidis in PLoS Medicine: Why most
published research findings are false. The more fashionable
a journal, the less confident we should be in the published
results. This may have considerable consequences in terms of
medicine. For example, back
in 1998 Andrew
Wakefield and his colleagues claimed in the famous
medical journal The Lancet to have found
a link between the triple vaccine against measles,
mumps, and rubella and autism. This claim was based on
wrong statistics and very poor experiments. Unfortunately,
because it was published in a fashionable journal, it
triggered a strong anti-vaccination reaction, which is
still ongoing (rumors are difficult to stop). The
Lancet has finally published
a complete retraction of the paper, telling readers
that the
flawed study should never have been made public. But this
work had enormous consequences not only in the UK, with a considerable
increase in morbidity and mortality from these diseases, but
also in the developing world, where persons coming from the UK
contaminated children.
In 2005, among many other cases,
a prominent Japanese scientist who published two major papers
in Nature (H. Kawasaki and K. Taira Nature 423,
838−842; 2003 and Nature 431,
211−217; 2004) could not produce the corresponding
experimental data. The first paper had already been retracted
(Nature 426,
100; 2003) and the second corrected (Nature 431,
878; 2004). An investigation panel asked Taira to submit
samples and notebooks relating to the experiments, but the
researcher in his lab who ran the experiments did not have
them. It is obvious that references to those papers will appear,
if only to state that they are likely to be fakes, and this
will of course increase the Impact Factor of the journal...
In the same way we can see in October 2005: Retraction: RNA-interference-directed
chromatin modification coupled to RNA polymerase II transcription
Vera Schramke, Daniel M. Sheedy, Ahmet M. Denli, Carolina Bonila,
Karl Ekwall, Gregory J. Hannon and Robin C. Allshire
Nature 435, 1275−1279 (2005)
In spring 2006 came an investigation of the RNA work in Taira's group
(at least 12 papers published in quite "visible" journals are
probably fakes).
In the same way, on December 16th, 2005, the magazine Science reported
that Woo-suk Hwang, the Korean cloning researcher, had requested
a retraction of a paper on patient-specific human stem cells
that had made the headlines of dailies world-wide. The consequences
of this work were considerable, including ethical
considerations about human cloning...