Citations are peer review by the world
22 June, 2017 | Phil Ward
Phil Ward, Deputy Director of Research Services at the University of Kent, gives us a snapshot of how research quality is assessed in the UK

The REF is used by the UK government to assess the quality of the research being undertaken in UK universities and research centres, and the results inform the distribution of ‘quality-related’ or QR block funding.
However, it’s taken on a life of its own, and is now used by league table compilers as a proxy for the research intensity of an institution. Moreover, the regular exercise has, since 1986, provided the navigation points in a turbulent and changing landscape, marking progress and, like a modern-day Lachesis, assessing the threads of individual research in readiness for Atropos’ shears.
As we gear up for the next iteration, REF2021, and wait to hear from the Higher Education Funding Council for England (HEFCE) about what form it will take, it seems an appropriate time to take stock and question whether the game is worth the candle, and whether the measurement of research should be undertaken differently.
This is a perennial question. It maps onto the tide times of the REF itself. As the REF flows towards high tide, and a disproportionate amount of academic and administrative time is taken up with judging outputs and deciding on staff to be submitted, more voices are raised in protest. As it ebbs and people return to their day jobs, the protest dies down.
What are the alternatives?
Before REF2014, Derek Sayer of Lancaster University questioned the whole framework of the REF in his book Rank Hypocrisies, and even went so far as to appeal against his inclusion in Lancaster’s submission. Similarly, Dorothy Bishop of Oxford looked at alternatives, suggesting that the use of bibliometrics, which was rejected before the last exercise, should be re-examined.
At the heart of their argument is the claim that peer review is not fit for purpose. The sheer scale of the exercise does not allow for an informed assessment of outputs. “The REF,” wrote Sayer, “is a system in which overburdened assessors assign vaguely defined grades in fields that are frequently not their own while (within many panels) ignoring all external indicators of the academic influence of the publications they are appraising, then shred all records of their deliberations.”
Whilst many might concur with this, most see no alternative. And yet peer review is a relative infant in the world of academia. I’ve written before about the surprisingly short history of this apparent gold standard. Its current prevalence and dominance are the result, essentially, of a confluence of the baby boomers coming of age and the photocopier becoming readily available. Indeed, the term ‘peer review’ was only coined in 1969.
Challenging the norm
A couple of weeks ago the Kent Business School convened a ‘debate’ to reopen the wound. ‘The Future of Research Assessment’ meeting heard from both sceptics and believers, from those who were involved in the last REF and those who questioned it, and looked again at the potential worth of Dorothy Bishop’s bibliometric solution.
Prof John Mingers began by laying his cards on the table. ‘Debate is the cornerstone of academic progress but there is not enough of it when it comes to peer review and possible alternatives,’ he said.
Taking aim at the REF, he suggested that, whilst it was intended to evaluate and improve research, it actually has the opposite effect. It does not effectively evaluate research, and can have ‘disastrous’ effects on the research culture. For him, the REF peer review was fundamentally subjective and open to conscious and unconscious biases. The process by which panel members were appointed was secretive and opaque, and the final membership came to represent an ‘established pecking order’ which might not have the necessary expertise to properly evaluate all the areas of submission.
Moreover, even accepting the good faith of the members in assessing outputs objectively and fairly, the sheer workload (in the order of 600 papers) meant that “whatever the rhetoric, the evaluation of an individual paper was sketchy.”
Citations as a proxy
For Mingers, the exercise wasn’t fit for purpose, and didn’t justify the huge expense and time involved. Instead, he suggested that a nuanced analysis of citations offered a simpler, far cheaper and more immediate solution. After all, “citations are ‘peer review’ by the world.”
He accepted the potential problems inherent in them – such as disciplinary limitations – but suggested these could be allowed for through normalisation. In return, they offer an objectivity that peer review can never hope to achieve, however well intentioned.
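To make that concrete, here is a minimal sketch of one common form of field normalisation: each output’s citation count is divided by the average count for outputs of the same field and year, so that papers from low-citing and high-citing disciplines end up on a comparable scale. The fields and figures below are invented for illustration; they are not taken from Mingers’ analysis.

```python
from collections import defaultdict

# Hypothetical records: (field, publication_year, citation_count)
papers = [
    ("physics", 2012, 40),
    ("physics", 2012, 10),
    ("history", 2012, 4),
    ("history", 2012, 1),
]

# Build a baseline: mean citations per (field, year)
totals = defaultdict(lambda: [0, 0])  # (field, year) -> [sum_of_cites, paper_count]
for field, year, cites in papers:
    totals[(field, year)][0] += cites
    totals[(field, year)][1] += 1
baselines = {key: s / n for key, (s, n) in totals.items()}

# Field-normalised score: 1.0 means 'cited as often as the average
# paper in that field and year'; above 1.0 means above average.
for field, year, cites in papers:
    score = cites / baselines[(field, year)]
    print(f"{field} {year}: {cites} cites -> normalised score {score:.2f}")
```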
Mingers compared the REF rankings to an analysis he had done using the data underlying Google Scholar. The REF table produced an odd result, with the Institute of Cancer Research topping the ranking, Cardiff University coming seventh, and a solid research institution such as Brunel University sliding far below its natural position.
Much of this is a result of the game-playing that the Stern Review sought to remedy. However, whatever solution is finally agreed by HEFCE, such a peer-review-based system will always drive behaviour in a negative way.
A different picture from a different method
His alternative used Google Scholar institutional level data, focussing on the top 50 academics by total citations. He examined a variety of metrics – mean cites, median cites, and h-index – before selecting median five-year cites as the measure which offered the most accurate reflection of actual research excellence.
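For readers who like to see the arithmetic, here is a rough sketch of how those institutional metrics might be computed once the citation counts are in hand. The academics and numbers are invented, Google Scholar offers no official API so the collection step is assumed, and treating each academic’s five-year total as one data point is simply one plausible reading of ‘median five-year cites’, not a reconstruction of Mingers’ actual method.

```python
from statistics import mean, median

# Hypothetical data: total citations over the last five years for an
# institution's academics, e.g. gathered from Google Scholar profiles.
five_year_cites = {
    "Academic A": 2400,
    "Academic B": 950,
    "Academic C": 610,
    "Academic D": 75,
}

def h_index(counts):
    """Largest h such that at least h of the counts are h or more."""
    ordered = sorted(counts, reverse=True)
    return sum(1 for rank, c in enumerate(ordered, start=1) if c >= rank)

# Keep the top 50 academics by total citations, as in the exercise described.
top = sorted(five_year_cites.values(), reverse=True)[:50]

print("mean cites:  ", round(mean(top), 1))
print("median cites:", median(top))   # the measure Mingers preferred
print("h-index:     ", h_index(top))
```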
The resulting table is less surprising than the one that resulted from the REF. Oxford and Cambridge are at the top, followed closely by Imperial and UCL, with Cardiff down to 34 and Brunel up to 33. Interestingly, LSE – a social science and humanities institution that hosts disciplines which don’t traditionally deal in citations – came 12th.
It was an interesting exercise, and seemed plausible. Whilst The Metric Tide had cautioned against over-reliance on metrics, it did suggest that there should be “a more sophisticated and nuanced approach to the contribution and limitations of quantitative indicators.” Bishop and Mingers seem to provide such sophistication and nuance, and demonstrate that a careful, normalised analysis of bibliometrics can produce an equally robust and accurate result for a fraction of the cost – and agony – of the REF.
A version of this blog was first posted on Phil’s personal blog Research Fundermentals.