A system for the assessment of the importance of scientific papers

The final assessment in a traditional journal is just a yes or no. The journal an article is published in gives some information on its importance and helps somewhat in finding relevant articles. There is no post-publication review beyond rare retractions of papers, traditionally in cases of fraud, nowadays increasingly also in cases of big mistakes.

A grassroots journal would be able to provide more information on the value of an article by publishing the reviews of the experts in the field. A grassroots assessment would include

  • A written review that discusses the article’s place in the literature and its strengths and weaknesses, and gives detailed comments.
  • A categorisation of the article to make it easier to find relevant articles.
  • An assessment of the importance of the article.

This blog post is on the assessment of the importance of an article; the rest will follow later. The importance of an article has at least five aspects.

  1. Contribution to the scientific field of the journal.
  2. Impact on the larger scientific community.
  3. The technical quality of the paper.
  4. Importance at the time of publishing.
  5. Importance of the research program.

Let me try to explain these five aspects below. The example I have in the back of my head is my own field, statistical homogenisation of climate station data as part of the large climate change research community. This field works a lot on methodological problems, which may make my proposal less suitable for other fields.

It would be good if this system fit all fields, or at least many. It should also be as simple as possible; can we merge or skip aspects? Are there aspects missing (that are important for other fields)?

Importance for the field

This metric assesses how large the contribution of the paper to the scientific field is. A paper can be important for the homogenisation community when it helps us understand the algorithms we use or the problems we have in the data, or when it proposes a better homogenisation method. Scientists need to know which papers in the field are important to prioritise their reading.


Impact on the larger scientific community

Traditional peer review mainly measures the expected impact of an article via the Impact Factor of the journal: how often an article in that journal is cited in the first two or five years after publishing. Ideally journals make a more complete assessment of the importance of submitted manuscripts and the impact is just an emergent property, but now that the Impact Factor is published and taken into account in the micro-management of science, a high Impact Factor has become a goal in itself.

I would love to get rid of this system, but an assessment of the impact cannot be avoided as long as publish-or-perish micro-management systems are prescribed by politicians. Without something like an impact factor, bureaucrats will not be satisfied with a publication in a grassroots journal and scientists would thus not submit their work to such journals.

Such an assessment would also have some role for science. It would inform scientists in the larger community which papers are important to read. In the case of homogenisation these would be papers that change our assessment of the size of climatic changes or papers on how much-used datasets have been homogenised. These do not necessarily have to be the papers that bring the field itself forward most.

This metric could also include the importance of a paper for the public, or maybe that could be a metric by itself. One reason to include it is that media attention often also leads to more citations; another is to stimulate this kind of research. In the public climate “debate” the upper air warming estimated by satellites plays a large role. These estimates are scientifically not that important because the time series is short and the data is expected to be unreliable, but studies improving this dataset are important for the public debate. Studies on the relationship between vaccines and autism are also scientifically no longer needed, but could still help inform the public and increase vaccination rates. This kind of research is important and we tend to do too little of it, focussing on scientific importance.

Technical quality

This would assess the technical quality of the work. A paper may not give many new insights and thus score low on the previous metrics, but may dot all the i’s and cross all the t’s very carefully to increase our confidence in our assessment. Papers of high technical quality are good for citing. An extensive, balanced review article would also score high in this aspect.

Important when published

Hopefully grassroots journals would not only review current articles, but also important classical ones. Some of these may not be strictly important any more, for example papers that introduce methods that have now been superseded by better ones, but were important innovations at the time. It would feel bad to give them a low assessment on the previous metrics without at least acknowledging their contribution to the field as classical papers.

Important research program

Some papers will be important as part of a series, for example as the last work of an important series. This could be a new paper on an important method that makes only a small further improvement and would thus by itself not be too important.

Hopefully this will discourage publishing studies in thin salami slices, as only the last paper would be marked as important in this metric and the reviewer can feel free to give lower assessments on the previous ones. It will often be completely legitimate to publish papers that only make a small additional contribution; improving the best methods and studies is hard.

To summarise and be more concrete: In my own field, a paper can be important

  1. for the homogenisation community when it helps us understand the algorithms we use or proposes a better method;
  2. for the broader climatological community (and general public) because it changes the assessment of climate trends;
  3. for the homogenisation community because it improves our confidence, for example an analytic study helping us understand a previous numerical result;
  4. for the history of the field, for example the first study using the relative homogenisation principle or the papers on the much-used Standard Normal Homogeneity Test;
  5. for the users of homogenisation methods because it improves one of the best homogenisation methods.

The systems used for these assessments should preferably be useful for all (or many) sciences to make it easier for scientists from other disciplines (and for micro-managers) to judge papers. Thus I would very much appreciate feedback on these five aspects, especially from researchers from other fields to see if this would also work there.

If anyone knows of scholarly work on the assessment of the importance of papers please also leave a comment. (I am not thinking of papers on computing a bibliographic index based on citations, but on assessment of the intrinsic value of papers.)

An important advantage of such an assessment of the importance of a paper is that all papers on one topic, but of different importance, can be published in the same journal. In this way replication studies could also be published, which in traditional journals often do not reach the publication threshold. A grassroots journal also has a lower quality limit: a paper should at least be technically sound.

The importance of a paper can change over time. Maybe it is found that a certain method has an important application that was previously not appreciated. Maybe a problem is found in the paper. Maybe it is superseded by newer work.

The assessment of the papers in a grassroots journal should thus be updated if new comments and reviews come in and also periodically because the field is making progress. These changes should be visible (if only to be able to demonstrate that earlier assessments can predict impact) and should be justified by the editors.
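To make the idea of visible, justified updates a bit more tangible, here is a minimal sketch in Python of how a journal's software might store an assessment. Everything in it is an assumption for illustration only: the names Assessment, Revision and ASPECTS, the short labels for the five aspects and the choice to keep an append-only history are hypothetical, not part of any existing system.

```python
from dataclasses import dataclass, field
from datetime import date

# Hypothetical short labels for the five aspects discussed in this post.
ASPECTS = ("field", "community", "technical_quality",
           "when_published", "research_program")


@dataclass
class Revision:
    day: date
    scores: dict          # aspect label -> score after this revision
    justification: str    # the editors' reason for the change


@dataclass
class Assessment:
    paper_id: str
    history: list = field(default_factory=list)  # append-only, oldest first

    def current(self):
        """Return the most recent scores, or an empty dict if unassessed."""
        return self.history[-1].scores if self.history else {}

    def update(self, day, scores, justification):
        """Record new scores; earlier revisions stay visible in history."""
        if not justification.strip():
            raise ValueError("every change needs a justification")
        unknown = set(scores) - set(ASPECTS)
        if unknown:
            raise ValueError(f"unknown aspects: {sorted(unknown)}")
        merged = dict(self.current())
        merged.update(scores)
        self.history.append(Revision(day, merged, justification))
```

Because old revisions are never overwritten, a reader could later compare an early assessment with what happened to the paper afterwards.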

I would suggest making the quality assessments on a percentile scale. Once enough reviews are in, the software should present the reader with calibrated percentiles, to prevent the reviews from placing 90% of the studies in the best 50%. (Except for classical papers, where likely only the best ones are reviewed and thus no calibration should be performed.)

So much for the general ideas. I hope to get some feedback before writing a more concrete proposal later, as well as posts on the other parts of the assessment and on the work of editors and reviewers.
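One possible way to do such a calibration, sketched in Python: replace each raw reviewer score by its empirical rank among all scores in the journal. This is only an illustration under my own assumptions. The function name calibrate_percentiles, the min_scores threshold for "enough reviews" and the mid-rank treatment of ties are all hypothetical choices, and classical papers would simply not be passed through it.

```python
from bisect import bisect_left, bisect_right


def calibrate_percentiles(raw_scores, min_scores=5):
    """Map raw reviewer percentiles to empirical percentiles (0-100).

    raw_scores: one raw score per paper, for one aspect.
    min_scores: hypothetical threshold; with fewer scores the raw
                values are returned unchanged.
    """
    if len(raw_scores) < min_scores:
        return list(raw_scores)

    ordered = sorted(raw_scores)
    n = len(ordered)
    calibrated = []
    for score in raw_scores:
        below = bisect_left(ordered, score)           # papers scored lower
        ties = bisect_right(ordered, score) - below   # papers with this score
        # Mid-rank: tied papers share the centre of their rank interval.
        calibrated.append(100.0 * (below + 0.5 * ties) / n)
    return calibrated


# Reviewers put 9 of these 10 papers in the "best 50%".
raw = [60, 70, 80, 90, 95, 99, 85, 75, 65, 55, ][:10]
print(calibrate_percentiles(raw))
```

After calibration exactly half of these ten papers end up above the 50th percentile, whatever the reviewers' raw scores were; only the ordering of the papers is kept.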
