Thursday, December 27, 2018

Measuring What Matters: On the Tyranny of Academic Metrics

As an academic, I’m acutely aware of my numbers. They are the measure of my worth.

As of December 2018, I have published 34 peer-reviewed journal articles (not bad for my career stage?), 12 book chapters (okay, but there are more on the way!), 1 edited collection (should I have more?), and 1 monograph (technically not due for publication until next year but it is important that we count it now). If you count publications on this blog (and I do) then I have more than 1000 ‘other’ ‘non-peer reviewed’ publications, which have been viewed more than 3.5 million times (probably more, but I don’t have access to the figures on other websites). For all this effort, I have just under 270 citations, with an h-index of 9 and an i-10 index of 9 (not good, but I’m a philosopher/lawyer and we don’t do well on these metrics). In addition to this, I’ve been awarded just over €150,000 of research funding in my time (pathetic really, but I’ve applied for and failed to receive several million - don’t forget that!), and I’ve received more media mentions than I care to mention (mostly because I’m ashamed of the majority of them). I’ve taught more than a thousand students but received zero teaching awards. I’ve held one relatively major administrative role, and half a dozen minor ones. In short, after seven years as a full-time, post-PhD academic, I am a bundle of statistics and numbers, none of them very meaningful in their own right, but all of them crucial when it comes to selling myself and my ideas.
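For readers unfamiliar with the two citation indices mentioned above: the h-index is the largest number h such that h of your papers have at least h citations each, and the i-10 index is simply the number of papers with at least ten citations. A minimal sketch in Python (the citation counts are invented for illustration, not my actual figures):

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the rank-th paper still has >= rank citations
        else:
            break
    return h

def i10_index(citations):
    """Number of papers with at least ten citations."""
    return sum(1 for cites in citations if cites >= 10)

# Hypothetical per-paper citation counts, for illustration only
papers = [45, 33, 20, 15, 12, 11, 10, 9, 9, 5, 3, 1]
print(h_index(papers))    # 9
print(i10_index(papers))  # 7
```

Note how blunt these measures are: a dozen papers collapse into two small integers, and a paper with 45 citations counts for no more than one with 10.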

I’m also quite disillusioned. As I say to graduate students every year, when asked to give advice on publishing, the numbers shouldn’t matter. They don’t make you happy or make your scholarship any better. Having 34 peer-reviewed journal articles doesn’t make you any happier (or any better a scholar) than having 24. When asked for proof of this, I point to famous examples: Wittgenstein only published 3 things in his lifetime, and one of those was a book review. That didn’t stop him from having an outsized influence on an entire generation of philosophers. The numbers are a distraction from the ideas, the arguments, the intellectual curiosity — the things that got you into academia in the first place.

But it is worth reflecting on why this might be the case. After all, a fixation on numbers does make a difference in other walks of life. If I want to improve my personal best at running, then it is useful to keep track of, and try to purposefully improve, my split times. If I want to reduce my weight, then it is useful to keep track of, and try to reduce, my calorie intake. Numbers really are a metric of success in some walks of life. Why aren’t they (or why don’t they seem to be) in academic life?

Jerry Muller’s short polemic — The Tyranny of Metrics — offers some insight. In this book, Muller presents a tightly argued critique of ‘metric fixation’, a trend that has taken over in the management of education, academia, healthcare, policing and other bureaucratic systems. Metric fixation promises us a more productive and more efficient system, but those of us living under the thumb of the metrically obsessed often don’t see it that way. It’s not that we are lazy and inefficient — at least not all of us; it’s that we find the metrics misleading, distracting, and undermining. We yearn (or at least I yearn) to be free of them.

In what follows I want to briefly set out Muller’s critique of metric fixation, consider how it applies to my life as an academic, and then reflect on three strategies I have found helpful for avoiding the tyranny of academic metrics.

1. Muller’s Critique of Metric Fixation
Metric fixation starts with the best of intentions. The worry is that certain systems, particularly those in the public sector but also some in the private sector, are bloated, inefficient and expensive. The goal is to make them more efficient and effective. The metrics are supposed to do that. They give people clear targets. They introduce something like the discipline of the market to forms of work that are usually removed from such discipline. The result is that workers will be more motivated and self-actualised, and that consumers, service users and stakeholders will benefit as well. It all starts from a good place.

But then things quickly go downhill. Muller’s argument is very easy to grasp. He thinks that the metrics that have been introduced to sectors like education, academia, health, policing, finance, the military and elsewhere, have a tendency to be ‘dysfunctional’. There are two main problems of dysfunction, each of which breaks down into a number of sub-problems (all of these are set out on pages 23-25 of Muller’s book):

(1) The Distortion Problem: The metrics that are introduced tend to distort or distract attention from what really matters (what the true mark of efficiency or effectiveness is). This isn’t simply because the metrics are bad and need to be replaced by better ones (though that might sometimes be the case), but because some forms of success cannot be easily (or ever) metricised. The problem of distortion can manifest in a number of ways: 

(1.1) Measuring that which is easy to measure instead of the more complex thing that matters more - This is a big one because most large organisations are complex and so focusing on one thing that happens to be easy to measure is rarely the best way to ensure success.

(1.2) Measuring inputs instead of outcomes - it is usually the results of a project or collective effort that matter, but they are often diffuse and difficult to track, so organisations tend to focus on inputs that can be easily tracked, e.g. hours worked, money spent and so on.

(1.3) Degrading information quality through standardisation - in the effort to create standardised measures that allow for the comparison of performance across individuals and organisations, we often strip away much of the contextual information that is needed to properly understand what is going on. We create an illusion of simplicity and certainty when the reality is anything but.

(2) The Gaming Problem: Once the metrics are installed and people are rewarded for optimising them, they are frequently gamed, i.e. they become an end in themselves and people do whatever it takes (up to and including fraud) to hit their targets. This is obviously a problem because gaming often undermines or contravenes the original purpose of the metric. Again, the problem of gaming can manifest in a number of ways: 

(2.1) Gaming through creaming - people focus on the easy cases (clients, problems etc) in order to meet their targets and exclude the tougher cases that might cause them to miss their targets (the name comes from the idea of ‘skimming the cream’).

(2.2) Lowering standards - possibly the same as the above but can be a distinctive problem in education or other sectors where getting people through a system is one of the metrics of success. One of the easiest ways to do this is to lower the standard that people have to attain to get through the system.

(2.3) Omission and distortion of data - leaving out any figures that might undermine your metrics, or classifying/reclassifying cases so that they don’t get included (a particular problem, it seems, in policing where offences get reclassified in order to reduce crime rates).

(2.4) Cheating - going beyond mere distortion and omission and actually fabricating data in order to make it seem like targets have been met or exceeded.

Muller spends the majority of his book documenting how each of these problems arises across a range of sectors. The cumulative weight of this evidence is pretty impressive, though I have to confess that I am not able to critically assess some of the examples. It could be that he is unfair in a few cases, but I am inclined to believe him. Part of the reason for this is that I see the majority of the problems arising both in academia in general and in my own life in particular. It’s to my personal experiences that I now turn.

2. Examples of the Tyranny of Academic Metrics
As mentioned earlier, the life of a modern academic is suffused by a panoply of metrics. They are everywhere. The universities in which you work are scored and ranked, your teaching is assessed and evaluated by your students, and your students are encouraged to take endless ‘satisfaction surveys’ to determine how they compare to students at other universities. Your research and skill as an academic are also assessed by various metrics: number of publications in top-ranking journals, number of paper downloads, number of citations (h-indexes and i-10 indexes), number of PhD students, amount of research funding won and so on and so on.

Many of these metrics are of dubious merit. Student surveys that compare student satisfaction across different institutions, for example, often strike me as being of limited value: most students only have experience of one university and one degree programme, and so comparing their satisfaction ratings across different institutions gives you questionable insights. It’s a bit like using Fahrenheit and Celsius scales to measure temperature in different universities and assuming the numbers are equivalent. And yet institutions are incentivised to optimise these ratings in order to advertise themselves to potential students. This can encourage them to lower standards in order to make students more satisfied, or to focus on these ratings to the exclusion of other important aspects of student well-being. And that’s just one example. In a great paper entitled ‘Academic Research in the 21st Century’, Marc Edwards and Siddhartha Roy document how virtually every performance metric that has been introduced into academia has had some perverse impact. They summarise the problems in the following table (which I reproduce from their article).

From Edwards and Roy 2017

These all seem to provide clear illustrations of the problems that Muller outlines in his book: metrics that distort and turn into self-perpetuating games. But rather than talk about these problems in the abstract, I want to consider four examples of distortion and gaming from my own life, each of which I find personally problematic.

The first has to do with research publications. In the past, I have definitely found myself distracted by the publication game. Here’s the dirty little secret about academia: the vast majority of people you work with, on a day-to-day basis, do not care about the research that you do.* They have neither the time, inclination nor expertise to read what you write. Your research is probably only of interest to a handful of like-minded researchers dotted around the world in other research institutions. You might be lucky enough to work in an institution where there is a concentration of people with a shared research focus. But in my experience that is a relative rarity. Indeed, even people who seem to share research interests with you have their own narrow specialisms (and their own metrics to optimise) and as a result won’t be that engaged with what you do. What they are engaged with, however, are your numbers: how many publications do you have? What’s your citation count?

These numbers become your currency of success within your institution. They determine your reputation (Professor X has 1,000 publications! Isn’t she amazing?). It’s hard not to let your ego get bound up with these numbers. You judge others by their numbers and you judge yourself by them too. I fall into this trap all the time. I’m almost ashamed to admit this, but it’s somewhat cathartic to confess that I have spent hours browsing through the websites of colleagues to see how far ‘ahead’ or ‘behind’ them I am in the publication game. This is a problem because publication numbers are not what really matters. What really matters is how accurate, persuasive, insightful, original, explanatory, comprehensive (etc) the ideas/arguments/theories contained within those publications are.

At least, that’s what should really matter. But you get distracted from writing high quality research in order to optimise your metrics. As recent scandals have suggested, it is possible, with enough perseverance and pigheadedness, to get any old rubbish published in a peer-reviewed journal. I know I have. There are several things I have published that I think are substandard and that I should have spent more time working on. But once you become obsessed with the game, you cannot afford the luxury of quality. You have to act now lest you fall behind your self-imposed standards of productivity. “I published 4 papers last year so I have to publish at least as many again this year, otherwise people will think I’m a slacker!”

The second example has to do with research funding. This is, historically, a bigger deal in the hard sciences than it is in the arts, humanities and social sciences. But it is becoming a big deal there too. Publications are all well and good, but to be truly successful as a modern academic, you have to think of yourself as being akin to a startup CEO. You have to rake in millions of pounds/euros/dollars in research funding to be counted among the elite. To do this, you have to spend long stretches of time dreaming up multi-year projects that can justify those millions. What’s bizarre is that universities themselves see this as a core metric of success and will often invest considerable resources in helping you out in this process. They may send you for costly and nauseating corporate training, and they will usually have dedicated staff and offices to help you with grant writing and interviewing. Now, as I pointed out above, I haven’t been that successful in the grant-winning game, but I have become increasingly invested in it and have won some small pots of funding. These pots of funding have covered my PhD, two small projects, and one longer 18-month project. Furthermore, and perhaps more significantly, I have been shortlisted for some large grants. For example, last year (2018) I was shortlisted for an ERC starter grant worth just over €1 million. I’m proud to report that I failed spectacularly at the presentation and interview stage.

I have mixed feelings about grants. Getting money can be great. Not only is it a reputation builder, it is also genuinely useful to have money to be able to run workshops and events, and to buy out your time so that you can focus on certain research questions. But the grant-winning game also has significant downsides. People care mainly about how much funding you have ‘won’, not what you do with that funding. It’s a classic example of the ‘measuring inputs, not outcomes’ fallacy outlined above. Universities set ambitious targets for research funding, but often don’t provide much support to grant winners once their projects are up and running. This is a disaster for someone like me who is bad at managing anything other than my own time (and I struggle a lot with that too). What’s more, there is relatively little follow-up to see that you have done what you promised to do. Sure, you have to write progress reports and spend the money appropriately, but as long as you avoid outright fraud, you should be able to write semi-plausible project reports that don’t raise too many red flags. You thus end up managing your project as a series of punctuated panics: long periods of frustration and procrastination, followed by flurries of activity as reporting deadlines approach.

The funding game has other downsides too. It generates a lot of wasted competition: lots of time and money is spent in trying to win ultra-competitive grants. The nature of the game is such that the vast majority of that time ends up being wasted (at least 90% of people won’t get the funding). The high rejection rates are the mark of how prestigious the grants are. All that competition might be justified in the sciences on the grounds that there are some projects that are genuinely better than others and require lots of expensive equipment, but in arts, humanities and social sciences I’m not so sure. I suspect the time and money dedicated to winning grants would be better spent on actual research. And there is another big problem with the grant game too: it contributes to the casualisation of labour in higher education, and the over-supply of PhDs and postdocs. An ambitious researcher is now encouraged to view it as a mark of success to ‘win’ funding that employs other people to do work for them and in lieu of them on a temporary basis. On a case-by-case level this looks like a win-win: more jobs and more money. But at a structural level things look much less rosy.

The third example has to do with teaching. Like most of my colleagues, I have a healthy degree of scepticism when it comes to teaching evaluations. I think they have some merit. For example, if there are serious problems with your course, the student evaluations can reveal them to you (though there should be other avenues for this). But outside those extreme cases, I think they are practically worthless. Students aren’t always the best evaluators of their own educations. Courses that are difficult often score poorly on evaluations because evaluations capture students’ immediate feelings about a course. But these difficult courses can have the most positive long-term impact. The problem is that it is too difficult to assess those long-term impacts as a matter of course (though some researchers have tried to do so and confirmed common sense). They also provide disgruntled students with an excuse to vent their (unwarranted) frustrations with you (and research suggests that women often bear the brunt of this). Furthermore, if the evaluations are good, they do little more than massage your fragile ego.

Indeed, in some ways I find this last feature of student evaluations to be the most pernicious. Most of my teaching evaluations are positive (I’m genuinely not trying to brag). Some students hate me, to be sure, but the majority say that they enjoy my lectures and find them to be enlightening, well-organised and, occasionally, entertaining. That all sounds great. What bothers me is how much I have come to rely on these positive comments for my own self-worth. I anticipate them eagerly after every teaching evaluation. If I don’t get them (or don’t get as many as I think I deserve) I get upset about it. I also then fixate upon the few negative comments (“JD is a cunt”, “John Danaher is a very boring man”)** and let them get me down. In the end, I feel a need to ingratiate myself to my students to boost my metrics. I want to give them more handouts and easier assignments, even though I know that this may not be in their long-term interest. And, of course, the evaluations are not a purely private affair between myself and my students: my institution cares about them too and encourages me to use them in my own self-promotion. So I’m incentivised to optimise my evaluations for reasons that are unrelated to my fragile ego.

The fourth and final example has nothing to do with individual metrics and the games they generate. It has to do with the net effect of working in an environment in which there are so many metrics, and where you are rewarded and incentivised to optimise them all: you end up being spread too thin. You cannot possibly optimise every single metric. There are some opportunity costs. But how do you pick and choose? Unless you are laser-focused and harbour no doubts about the merits of what you are doing, it’s very easy to fall between all the cracks. To end up fragmented, distracted and perpetually frustrated.

3. Breaking Free from the Tyranny of Academic Metrics
All of this paints a negative picture. Is there any light at the end of the tunnel? Maybe. First, let me say that things aren’t as bad as I am making them out to be. I’m being deliberately provocative with my pessimism. I don’t feel disillusioned, distracted or disheartened all the time. I only feel like that some of the time, usually when I fall prey to my worst instincts. The rest of the time I approach my metricisation with a degree of irony and stoicism.

Nevertheless, I think finding an actual escape from the tyranny of academic metrics is hard. Part of the problem is institutional/structural: these are the metrics that are used to determine your employability and promotability within academic institutions. And part of the problem is intrinsic to certain academic disciplines: there are no criteria of success that obviously trump the metrics. In certain mathematical disciplines and hard sciences, it is possible to have laser focus. If your life’s mission is to prove the Riemann hypothesis, then you have a pretty clear standard for your own success. But in other disciplines things are much less clearcut. You want your ideas to achieve some degree of acceptance and influence within your chosen discipline, but beyond that you’re not really sure what it is that you are trying to do. It’s easy to fall back on the institutionally imposed metrics because they are at least tractable, even if they are misleading and disheartening.

So what can you do? I can only speak to my own experience. I find a combination of three strategies to be useful:

Achieve a reasonable ‘leave me alone’ credibility on the major metrics: In other words, score sufficiently highly on some key metrics that no one is going to question what you are doing. To use myself as an example, I feel I have done this on teaching and research. My student evaluations are sufficiently positive that no one is going to call my competence as a teacher into question; I have published enough (I think) that people don’t question my capacity as a researcher (particularly since the majority of them aren’t going to read what I have written); and I’ve also won a few bonus points when it comes to research funding and non-academic impact. This gives me some breathing space to do what I most like to do.

Find some metrics that you can live with and do your best to ignore the rest: The abundance of metrics, and the lack of clear criteria for success, can be turned to your advantage. Since no one is really sure what the ultimate measure of success is, and since new metrics are being invented all the time, you have some flexibility to develop and play ‘games’ that appeal to you, and that also make you stand out from the crowd. For example, I genuinely enjoy reading, thinking and writing (not necessarily in that order!). There are plenty of metrics that I can use to pursue those ends that aren’t tied to traditional academic publishing. I’ve found, for example, that my limited success as an academic blogger generates metrics (page views per month, number of downloads per podcast) that have resonated with my employers. Since I’ve achieved reasonable credibility in other metrics, these metrics are often weighted quite favourably in performance reviews and job interviews. This does, admittedly, require a degree of imagination and flexibility in the application of assessment criteria, but it is a potential boon to the more independent-minded researcher.

Use regulative ideals or grand projects to motivate yourself, not metrics: This is the most cringeworthy strategy, and I apologise for that, but it is also the most important. To avoid distraction and frustration I think it is essential not to lose sight of the grander projects and ambitions that motivate your academic life. This doesn’t necessarily mean getting caught up in some messianic revolutionary project (though some of my colleagues do have such ambitions and they seem to work for them); it just means pursuing something other than personal, metricised success. My grand ambitions, for example, include ‘learning more about the world’, ‘pursuing my curiosity wherever it may lead’, and ‘mapping and evaluating the possible futures of humanity’. These projects do not generate simple metrics of success. But I like to constantly remind myself that they are the reason why I’m doing what I’m doing.

H-indexes be damned.

* Yes, I know that this translates more generally: the vast majority of people (full stop) will not read anything you write.

** Both real. For American readers: ‘cunt’ has a less gendered meaning in British/Irish slang. It’s probably most often used to negatively describe a man. It can however be used in a positive sense (as in “He’s a funny cunt, isn’t he?”). However, I’m pretty sure that the intended meaning was not positive in this case.
