|Image taken from Roche et al 2014|
A lot of the contemporary debate around digital surveillance and data-mining focuses on privacy. This is for good reason. Mass digital surveillance impinges on the right to privacy. There are significant asymmetries of power between the companies and governments that utilise mass surveillance and the individuals affected by it. Hence, it is important to introduce legal safeguards that allow ordinary individuals to ensure that their rights are not eroded by the digital superpowers. This is, in effect, the ethos underlying the EU’s General Data Protection Regulation (GDPR).
But is this always a good thing? I have encountered a number of AI enthusiasts who lament this fixation on privacy and data protection. Their worry seems to be this: Modern AI systems depend on massive amounts of data in order to be effective. If they don’t get the data, they cannot learn and develop the pattern-matching abilities that they need in order to work. This means that we need mass data collection in order to unlock the potential benefits of AI. If the pendulum swings too far in favour of privacy and data protection, the worry is that we will never realise these benefits.
Now, I am pretty sure that this is not a serious practical worry just yet. There is still plenty of data being collected even with the protections of the GDPR and there are also plenty of jurisdictions around the world where individuals are not so well protected against the depredations of digital surveillance. So it’s not clear that AI is being held back right now by the lack of data. Still, the objection is an interesting one because it suggests that (a) if there is a sufficiently beneficial use case for AI and (b) if the development of that form of AI relies on mass data collection then (c) there might be some reason to think that individuals ought to share their data with AI developers. This doesn’t mean they should be legally obliged to do so, but perhaps we might think there is a strong ethical or civic duty to do so (like, say, a duty to vote).
But this argument encounters an immediate difficulty, which we can call the ‘data free-rider problem’:
Data Free-Rider Problem: If the effectiveness of AI depends on mass data collection, then the contribution of any one individual’s data to the effectiveness of AI is negligible. Given that there is some moral cost to data sharing (in terms of loss of privacy etc.) then it seems that it is both rational and morally acceptable for any one individual to refuse to share their data.
If this is right, then it would be difficult to argue that there is a strong moral obligation on individuals to share their data.
Problems similar to this plague other ethical and political debates. In the remainder of this article, I want to see if arguments that have recently been made in relation to the ethics of vaccination might carry over to the case of data sharing and support the idea of an obligation to share data.
1. The Vaccination Analogy: Is there a duty to vaccinate?
The dynamics of vaccination are quite similar to the dynamics of AI development (at least if what I’ve said in the introduction is accurate). Vaccination is beneficial but only if a sufficient number of people in a given population get vaccinated. This is what allows for so-called ‘herd immunity’. The exact percentage of people within a population that need to be vaccinated in order to achieve herd immunity varies, but it is usually around 90-95%. This, of course, means that the contribution of any one individual to achieving herd immunity is negligible. Given this, how can you argue that any one individual has an obligation to get vaccinated?
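Where that 90-95% figure comes from is worth a brief aside. It reflects a standard epidemiological rule of thumb (not something specific to the vaccination-ethics literature): if each infected person would infect R0 others in a fully susceptible population, transmission fizzles out once roughly 1 − 1/R0 of the population is immune. A minimal sketch:

```python
def herd_immunity_threshold(r0: float) -> float:
    """Rule of thumb: the fraction of the population that must be
    immune for an outbreak to die out is 1 - 1/R0."""
    if r0 <= 1:
        return 0.0  # an infection with R0 <= 1 dies out on its own
    return 1 - 1 / r0

# Measles is highly contagious (R0 is often cited in the 12-18 range),
# which is why its vaccination target is so high.
print(f"{herd_immunity_threshold(15):.0%}")  # → 93%
```

The more contagious the disease, the closer the threshold creeps to 100%, and the less slack there is for individual refusals.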
This is not a purely academic question. Although vaccination is medically contraindicated for some people, for the vast majority it is safe and low cost, with minimal side effects. Unfortunately, there has been a lot of misinformation spread about the harmfulness of vaccination in the past 20 years. This has led many people to refuse to vaccinate themselves and their children. This is creating all manner of real-world health crises, with, for example, measles outbreaks now becoming more common despite the fact that an effective vaccine is available.
In a recent paper, Alberto Giublini, Tom Douglas and Julian Savulescu have argued that despite the fact that the individual contribution to herd immunity is minimal, there is nevertheless a moral obligation on individuals (for whom vaccination is not medically contraindicated) to get vaccinated. They make three arguments in support of this claim.
The first argument is a utilitarian one and derives from the work of Derek Parfit. Parfit asks us to imagine a hypothetical case in which a group of people are in a desert and need water. You belong to another group of people each of whom has 1 litre of water to spare. If you all pooled together your spare water, and carted it off to the desert, it would rescue the thirsty group of people. What should you do? Your intuition in such a case would probably be “well, of course I should give my spare water to the other group”. Parfit argues that this intuition can be justified on utilitarian grounds. If you have a case in which collective action is required to secure some beneficial outcome, then, under the right conditions, the utility-maximising thing to do is to contribute to the collective effort. So if you are a utilitarian, you ought to contribute to the collective effort, even if your contribution is minimal.
But what are the ‘right conditions’? One of the conditions stipulated by Parfit is that in order to secure the beneficial outcome everyone must contribute to the collective effort. In other words, if one person refuses to contribute, the benefit is not realised. That’s a bit of a problem, since it is presumably not true either in the hypothetical he is imagining or in the kind of case we are concerned with. It is unlikely that your 1 litre of water makes a critical difference to the survival of the thirsty group: 99 litres of water will save their lives just as much as 100 litres. Furthermore, you may yourself be a little thirsty and derive utility from drinking the water. So it might be the case that, if everyone else has donated their water, the utility-maximising thing to do is to keep the water for yourself.
Giublini et al acknowledge this problem and address it by modifying Parfit’s thought experiment. Imagine that instead of pooling the water into a tank that is delivered to the people in the desert, each litre of water goes to a specific person and helps to save their life (they call this a case of ‘directed donation’ and contrast it with the original case of ‘collective donation’). In that case, the utility-maximising thing to do would be to donate the water. They then argue that vaccination is more like a directed donation case than a collective donation case. This is because although any one non-vaccinated person is unlikely to make a difference to herd immunity, they might still make a critical difference by being the person who exposes another person to a serious or fatal illness. This is true even if the risk of contracting and conveying the disease is very low. The small chance of being the crucial causal contributor to another person’s serious illness is enough to generate a utilitarian duty to vaccinate (provided the cost of vaccination to the vaccinated person is low). Giublini et al then generalise from this to formulate a rule to the effect that if your failure to do X poses a low-probability but high-magnitude risk to others, and if doing X is low cost (lower than the expected harm to others), then you have a duty to do X. This means a utilitarian can endorse a duty to vaccinate. Note, however, that this utilitarian rule ultimately has little to do with collective benefit: the rule would apply even if there were no collective benefit; it applies in virtue of the low-probability, high-magnitude risk to others.
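The expected-cost comparison behind this rule can be made concrete with some toy numbers. The function and figures below are purely illustrative (none of them come from Giublini et al’s paper); the point is only that a tiny probability multiplied by a very large harm can still exceed a small personal cost:

```python
def duty_to_act(cost_to_self: float, p_harm: float, harm_magnitude: float) -> bool:
    """Sketch of the utilitarian rule: you have a duty to do X when the
    cost to you is lower than the expected harm your omission imposes
    on others (probability of harm x magnitude of harm)."""
    return cost_to_self < p_harm * harm_magnitude

# Illustrative numbers only: a 1-in-100,000 chance of causing a very
# serious harm (valued at 1,000,000 units) gives an expected harm of 10,
# which outweighs a personal vaccination cost of 1.
print(duty_to_act(cost_to_self=1, p_harm=1e-5, harm_magnitude=1_000_000))  # → True
```

Notice that if the personal cost rises above the expected harm (say, for someone for whom vaccination is medically risky), the rule no longer generates a duty, which matches the medical-contraindication exception in the paper.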
The second argument is a deontological one. Giublini et al actually consider two separate deontological arguments. The first one is based on a Kantian principle of universalisability: you ought to do that which you can endorse everyone doing; and you ought not to do that which you cannot endorse everyone doing. The argument then is that refusing to vaccinate yourself is not universalisable because you could not endorse a world in which everyone refused to vaccinate. Hence you ought to vaccinate yourself. Giublini et al dismiss this argument for somewhat technical reasons that I won’t get into right here. They do, however, accept a second closely-related deontological argument based on contractualism.
Contractualism in moral philosophy is the view that we can work out what our duties are by asking what rules of behaviour we would be willing to accept under certain idealised bargaining conditions. Giublini et al focus on the version of contractualism that was developed by the philosopher Thomas Scanlon:
Scanlonian Contractualism: “[a]n act is wrong if its performance under the circumstances would be disallowed by any set of principles for the general regulation of behaviour that no one could reasonably reject as a basis for informed, unforced, general agreement.” (Scanlon 1998, 153 - quoted in Giublini et al 2018)
Reasonable rejectability is thus the standard for assessing moral duties. If a requirement to do X is reasonably rejectable under idealised bargaining conditions, then you do not have a duty to do X; if it is not reasonably rejectable, then you do. The argument is that the requirement to vaccinate is not reasonably rejectable under idealised bargaining conditions. Or, to put it another way, the failure to vaccinate would be disallowed by a set of rules that no one could reasonably reject. If each person in society is at some risk of infection, and if the cost of reducing that risk through vaccination is minimal, then it is reasonable to demand that each person get vaccinated. Note that the reasonableness of this depends on the cost of vaccination. If the cost of vaccination is very high (and it might be, for certain people, under certain conditions) then it may not be reasonable to demand that everyone get vaccinated. Giublini et al’s argument is simply that for most vaccinations, for most people, the cost is sufficiently low to make the demand reasonable.
The third argument is neither utilitarian nor deontological. It derives from a widely-accepted moral duty that can be embraced by either school of thought. This is the duty of easy rescue, roughly: if you can save someone from a harmful outcome at minimal cost to yourself, then you have a duty to do so (because it is an ‘easy rescue’). The classic thought experiment outlining this duty is Peter Singer’s drowning infant case: you are walking past a pond with a drowning infant; you could easily jump in and save the infant. Do you have a duty to do so? Of course you do.
Giublini et al argue that vaccination gives rise to a duty of easy rescue. The only difference is that, in this case, the duty applies not to individuals but to collectives. The argument works like this: The collective could ensure the safety of individuals by achieving herd immunity. This comes at a minimal cost to the collective as a whole. Therefore, the collective has a duty to do what it takes to achieve herd immunity. The difficulty is that this can only happen if 90-95% of the population contributes to achieving that end through vaccination. This means that in order for the collective to discharge its duty, it must somehow get 90-95% of the population to vaccinate themselves. This means the group must impose the burden of vaccination on that percentage of the population. How can it do this? Giublini et al argue that instead of selecting some specific cohort of 90-95% of the people (and sparing another cohort of 5-10%) the fairest way to distribute that burden is just to say that everyone ought to vaccinate. This means no one is singled out for harsher or more preferential treatment. In short, then, an individual duty to vaccinate can be derived from the collective duty of easy rescue because it is the fairest way to distribute the burden of vaccination.
Suffice to say there is a lot more detail and qualification in Giublini et al’s paper. This quick summary is merely intended to show how they try to overcome the free rider problem in the case of vaccination and conclude that there is an individual duty to vaccinate. The question now is whether these arguments carry over to data collection and AI.
2. Do the arguments carry over to AI development?
Each of Giublini et al’s arguments identifies a set of conditions that must apply in order to derive an individual duty to contribute to a collective benefit. Most of these conditions are shared across the three arguments. The two most important conditions are (a) that there is some genuine and significant benefit to be derived from the collective effort and (b) that the individual contribution to that collective benefit comes at a minimal cost to the individual. There are also other conditions that are only relevant to certain arguments. This is particularly true of the utilitarian argument which, in addition to the two conditions just mentioned, also requires that (c) the individual’s failure to perform the contributory act poses some low probability, high magnitude risk to others.
Identifying these three conditions helps with the present inquiry. Given the analogy we are drawing between AI development and vaccination, the question we need to focus on is whether these three conditions also apply to AI development. Let’s take them one at a time.
First, is there some genuine and significant benefit to be derived from mass data collection and the subsequent development of AI? At present, I am somewhat sceptical. There are lots of touted benefits of AI, but I don’t know that there is a single provable case of significant benefit that is akin to the benefit we derive from vaccination. The use of AI and data collection in medicine is the most obvious direct analogy, but my reading of the literature on AI in medicine suggests that the jury is still out on whether it generates significant benefits or not. There are some interesting projects in progress, but I don’t see a “killer” use case (pardon the irony) at this stage. That said, I would qualify this by pointing out that there are already people who argue that there is a duty to share public health data in some cases, and there is a strong 'open data' movement in the sciences that suggests there is a duty on scientists to share data. One could easily imagine these arguments being modified to make the case for a duty to share such data in order to develop medical AI.
The use of mass data collection to ensure safe autonomous vehicles might be another compelling case in which significant benefit depends on data sharing, but again it is early days there too. Until we have proof of significant benefit, it is hard to argue that there is an individual obligation to contribute data to the development of self-driving cars. And, remember, with any of these use cases it is not enough to show that the AI itself is genuinely beneficial, it must be shown that the benefit depends on mass data collection. This might not be the case. For example, it might be the case that targeted or specialised data (small data) is more useful. Still, despite my scepticism of the present state of AI, it is possible that a genuine and significant benefit will emerge in the future. If that happens, the case for an individual obligation to contribute data could be reopened.
Second, does the individual contribution to AI development (in the form of data sharing) come at minimal cost to the individual? Here is where the privacy activists will sharpen their knives. They will argue that there are indeed significant and underappreciated costs associated with data sharing that make it quite unlike the vaccination case. These costs include the intrinsic harm caused by the loss of privacy* as well as potential consequential harms arising from the misuse of data. For example, the data used to create better medical diagnostics AI could also be used to deny people medical insurance. The former might be beneficial but the latter might encourage more authoritarian control and greater social inequality.
My general take on these arguments is that they can be more or less compelling, depending on the type of data being shared and the context in which it is being shared. The sharing of some data (in some contexts) does come at minimal cost; in other cases the costs are higher. So it is not easy to do a global assessment of this second condition. Furthermore, I think it is worth bearing in mind that the users of technology often don’t seem to be that bothered by the alleged costs of data sharing. They share personal data willy-nilly and for minimal personal benefit. They might be wrong to do this (privacy activists would argue that they are) but this is one reason to think that the worry that prompted this article (that too much data protection is hindering AI) is probably misguided at the present time.
Finally, does the individual failure to contribute data pose some low probability high magnitude risk to others? I don’t know the answer to this. I find it hard to believe that it would. But it is conceivable that there could be a case in which your failure to share data poses a specific risk to another (i.e. that your data makes the crucial causal difference to the welfare of at least one other person). I don’t know of any such cases, but I’m happy to hear of them if they exist. Either way, it is worth remembering that this condition is only relevant if you are making the utilitarian argument for the duty to share data.
What can we conclude from this analysis? To briefly summarise, there is a prima facie case for thinking that AI development depends for its effectiveness on mass data collection and hence that the free rider dynamics of mass data collection pose a threat to the development of effective and beneficial AI. This raises the intriguing question as to whether there might be a duty on individuals to share data with AI developers. Drawing an analogy with vaccination, I have argued that it is unlikely that such a duty exists at the present time. This is because the reasons for thinking that there is an individual duty to contribute to herd immunity in the vaccination case do not easily carry over to the AI case. Nevertheless, this is a tentative and defeasible argument. In the future, it is possible that a compelling case could be made for an individual duty to contribute data to AI development. It all depends on the collective benefits of the AI and the costs to the individual of sharing data.
*There are complexities to this. Is privacy harmed if you voluntarily submit your data, even if this is guided by your belief that you have an obligation to do so? This is something privacy scholars struggle with. Historically, the willingness to defer to an individual’s expressed preference (via informed consent) was quite high, but nowadays a more paternalistic view is being taken. The GDPR, for example, doesn’t make ‘notice-and-consent’ the sole factor in determining the legitimacy of data processing. It works with the implicit assumption that sometimes individuals need to be protected in spite of their informed consent.