Philosophical Disquisitions: The Singularity

Roughly (I’ll refine later on) the “technological singularity” (or “singularity” for short, and in the right context) is the name given to point in time at which greater-than-human superintelligent machines are created. The concept (and name) was popularised by the science fiction author Vernor Vinge in the 1980s and 90s, though its roots can be traced further back in time to the work of John Von Neumann and I.J. Good.

The notion of a technological singularity is treated with derision in some quarters. But that attitude seems inapposite. As David Chalmers points out in his long (and excellent) discussion, the singularity is worthy of serious philosophical and intellectual consideration. The arguments for its occurrence are themselves intrinsically fascinating, raising all sort of philosophical and scientific questions. Furthermore, if it ever did happen, it would have serious practical, ethical and metaphysical consequences. Why should their consideration warrant derision?

Many now seem to agree. The California-based Singularity Institute has long been calling for (and more recently delivering) serious research on this topic. The Oxford-based Future of Humanity Institute has several researchers who dedicate themselves either full or part-time to the issue. And the recently-launched Cambridge Centre for the Study of Existential Risk includes the possible creation of superintelligent AI as one the four greatest threats to humanity, each of which it will be studying.

I also agree. Like Chalmers, I think there are some intrinsically and instrumentally important philosophical issues to consider here, but I’m not entirely sure where I come down on any of those issues. One of the reasons for this is that I’m not entirely sure what the main issues are. I’m a philosopher (sort of - not officially!). I work best when complex topics are broken down into a set of theses, each of which can be supported by an argument (or set of arguments), and each of which is a potential object of further research and contestation. Unfortunately, I have yet to come across a general philosophical overview of the singularity that breaks it down in a way that I like. Chalmers’s article certainly attempts to do this, and succeeds in many areas, but his focus is both too narrow, and too complex for my liking. He doesn’t break down the core theses in a way that clearly and simply emphasises why the singularity is a matter of serious practical concern, something that may pose a genuine existential threat to humanity.

This post is my attempt to correct for this deficiency. I write it in full or partial awareness of other attempts to do the same thing, and with full awareness of the limitations of any such attempt. Breaking a complex topic down into discreet theses, while often useful, runs the risk of overlooking important connections, de-emphasising some issues, and obscuring others. I’m happy to entertain critiques of what I’m about to do that highlight failings of this sort. This is very much a first pass at this topic, one that I’m sure I’ll revisit in the future.

So what am I about to do? I’m going to suggest that there are three main theses to contend with when one thinks about the technological singularity. They are, respectively: (i) the Explosion thesis; (ii) the Unfriendliness thesis; and (iii) the Inevitability thesis. Each one of these theses is supported by a number of arguments. Identifying and clarifying the premises of these arguments is an important area of research. Some of this work has been done, but more can and should be done. Furthermore, each of three main theses represents a point on scale of possibilities. These other possibilities represent additional points of argumentation and debate, ones that may be taken up by opponents of the core theses. As a consequence, one’s view on the technological singularity could be thought to occupy a location within a three-dimensional space of possible views. But I’ll abjure this level of abstraction and focus instead on a couple of possible positions within this space.

With that in mind, the remainder of this post has the following structure. First, I’ll present each of the three main theses in order. As I do so, I’ll highlight the conceptual issues associated with these theses, the arguments that support them, and the scale on which each thesis lies. Second, I’ll consider which combinations of beliefs about these theses are most interesting. In particular, I’ll highlight how the combination of all three theses supports the notion of an AI-pocalypse, but that weaker versions of the theses might support AI-husbandry or AI-topia. And third, I’ll briefly present my own (tentative) views on each of the theses (without supporting argumentation).

1. The Explosion Thesis
The core thesis of singularitarian thinking is the Explosion Thesis. It might be characterised in the following manner.

Explosion Thesis: There will be an intelligence explosion. That is: a state of affairs will arise in which for every AI_n that is created, AI_n will create AI_n+1 (where AI_n+1 is more intelligent than AI_n) up to some limit of intelligence or resources.

Here, “AI” refers to an artificial intelligence with human level cognitive capacities. Sometimes, the label AGI is used. This refers to an Artificial General Iintelligence and when used can help us to avoid confusing the phenomenon of interest to singularitarians with computer programs that happen to be highly competent in a narrow domain. Still, I’ll use the term AI throughout this discussion.

The first human level AI will be numerically designated as “AI₀”, with AI₁ used to designate the first AI with greater than human capacities. Some useful shorthand expressions were introduced by Chalmers (2010), and they will be followed here. Chalmers used “AI+” to refer to any AI with greater than human intelligence, and “AI++” to refer to vastly more intelligent AIs of the sort that could arise during an intelligence explosion.

Conceptual and practical issues related to the Explosion Thesis include: (a) the nature of intelligence and (b) the connection between “intelligence” and “power”. Intelligence is a contested concept. Although many researchers think that there is a general capacity or property known as intelligence (labelled g) others challenge this notion. Their debate might be important here because if there is no such general capacity we might not be able to create an AI. But this is a red herring. As Chalmers (2010) argues, all that matters is that we can create machines with some self-amplifying capacity or capacities that correlate with other capacities we are interested in and are of general practical importance. For example, the capacity to do science, or to engage in general means-end reasoning, or to do philosophy or so on. Whether there is some truly general capacity called “intelligence” matters not. The relationship between “intelligence” and “power” is, however, more important. If “power” is understood as the capacity to act in the real world, then one might think it is possible to be super-smart or intelligent without having any power. As we shall see, this possibility is an important topic of debate.

The original argument for the Explosion Thesis was I.J. Good’s. He said:

Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an 'intelligence explosion,' and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make. (LINK)

Good’s argument has, in recent years, been refined considerably by Chalmers and Hutter (there may be others too whose work I have not read). I hope to look at Chalmers’s argument in a future post so I’ll say no more about it now.

As noted at the outset, the Explosion Thesis is a point on a scale. The scale is illustrated below. Below the Explosion Thesis one finds the AI+ Thesis. Defenders of this thesis agree that greater than human level AI is likely, but do not think this will lead to the recursive intelligence explosion identified by Good. In terms of practical implications, there may be little difference between the two positions. Many of the potentially negative (and positive) consequences of the Explosion Thesis would also follow from the AI+ Thesis, though the levels of severity might be different. Next on the scale is the AI Thesis, and below that is the No AI Thesis. I think these speak for themselves. Proponents and advocates of the singularity take up some position along this scale.

The Explosion Thesis is the core of singularitarianism. Unless one accepts the Explosion Thesis, or at a minimum the AI+ Thesis, the other theses will be irrelevant. But what does it mean to “accept” the thesis? I adopt a Bayesian or probabilistic approach to the matter. This means that even if I think the probability of an intelligence explosion is low, it might still be worth dedicating serious effort to working out its implications. For example, I think a 10% chance of an intelligence explosion would be enough for it to warrant attention. Where the threshold comes for theses that are not worthy of consideration is another matter. I couldn’t give a precise figure, but I could probably give analogous probabilities, e.g. the probability of my car suddenly morphing into a shark seems sufficiently low to me to not warrant my attention. If the probability of an intelligence explosion was similar to that, I wouldn’t care about it.

2. The Unfriendliness Thesis
This thesis may be characterised in the following manner:

Unfriendliness Thesis: It is highly likely that any AI+ or AI++ will be unfriendly to us. That is: will have goals that are antithetical to those of us human beings, or will act in a manner that is antithetical to our goals.

This thesis, and its complement The Friendliness Thesis, form an obvious spectrum of possible outcomes: at one end of the scale it is highly likely that AI+ will be unfriendly, and at the other end of the scale it is highly likely that AI+ will be friendly. One’s beliefs can then be represented as a probability distribution over this spectrum.

There are a couple of conceptual issues associated with this thesis. First, there is the question of what is meant by “friendliness” or “unfriendliness”. If the concept were introduced by a philosopher, it would probably be couched in terms of objective and subjective value, not friendliness. So, in other words, they would view the important issue as being whether an AI+ would act in manner that promotes or undermines objective or subjective value. Objective value would concern what is best from an agent-neutral perspective; subjective value would concern what is best from the perspective of one or more agents. The use of the term “us” in the definition of the thesis suggests that it is some form of subjective value that is being invoked — perhaps an aggregate form — but that may not be the case. What really seems to be at issue is whether the creation of AI+ would be “good” or “bad”, all things considered. Those terms are fuzzy, for sure, but their fuzziness probably enables them to better capture the issues at stake.

If the thesis does invoke some form of (aggregate) subjective value, it raises another conceptual question: who is the “us” that is being referred to? It might be that the “us” refers to us human beings as we currently exist, thus invoking a species relative conception of value, but maybe that’s not the best way to think about it. Within the debate over human enhancement, similar issues arise. Some philosophers — such as Nicholas Agar — argue that human enhancement is not welcome because enhanced human beings are unlikely to share the values that current unenhanced human beings share. But so what? Enhanced human beings might have better values, and if we are the ones being enhanced we’re unlikely to care too much about the transition (think about your own shift in values as you moved from childhood to adulthood). Something similar may be true in the case of an intelligence explosion. Even if an AI has different values to our own, “we” may not care, particularly if our intelligence is enhanced as part of the process. This may well be the case if we integrate more deeply with AI architectures.

The main argument in favour of the Unfriendliness Thesis would be the Vastness of Mindspace Argument. That is: the multidimensional space of possible AI+ minds is vast; most locations within that space are taken up by minds that are antithetical to our values; therefore, it is highly probable that any AI+ will have values that are antithetical to our own. Of course, this argument is flawed because it assumes that all locations within that mindspace are equally probable. This may not be the case. Additional arguments are needed.

The Orthogonality Argument (which I’ve looked at before) is one such argument. It has been developed by Armstrong and Bostrom and holds that, because intelligence and final goals are orthogonal, a high level of intelligence is compatible with (nearly) any final goal. In other words, there is no reason to think that intelligence correlates with increased moral virtue. Similarly, others (e.g. Muehlhauser and Helm) argue that values are incredibly complex and difficult to specify and stabilise within an AI architecture. Thus, there is no reason to think we could create an AI with goals that match our own.

The Friendliness Thesis might be held by a certain type of moral cognitivist. One that believes that as intelligence goes up, so too does the likelihood of having accurate moral beliefs. Since these beliefs are motivating, it would follow that AI+ is highly likely to be morally virtuous. Although many reject this claim, they may point to the possibility of a Friendly AI+ as a welcome one, and give this as a reason to be favourably disposed to the project of creating AI+. More on this later.

3. The Inevitability Thesis
The third thesis may be characterised in the following manner:

Inevitability Thesis: The creation of an unfriendly AI+ is inevitable.

The term “inevitable” is being used here in the same sense in which Daniel Dennett used it in his book Freedom Evolves. It means “unavoidable” and can be contrasted with the opposing concept of “evitability” (avoidability). One may wonder why I don’t simply use the terms “unavoidable” and “avoidable” instead. My response is that I like the idea of drawing upon the rich connotations of the “inevitability-evitability” distinction that arise from Dennett’s work.

In any event, the Inevitability Thesis obviously represents a point on a scale of possibilities that includes the Evitability Thesis at its opposite end. Those who subscribe to the latter will believe that it is possible to avoid the creation of unfriendly AI+. But we must be careful here. There are different ways in which unfriendly AI+ might be inevitable and different ways in which it might be evitable. Indeed, I identify at least two types of each. As follows:

Inevitability₁: The creation of an unfriendly AI+ with the ability to realise its goals is inevitable.(Strong Inevitability Thesis)

Inevitability₂: The creation of an unfriendly AI+ is inevitable, but it need not have the ability to realise its goals.(Containment/Leakproofing Thesis)

Evitability₁: It is possible to avoid the creation of an unfriendly AI+ and to realise the creation of a friendly AI+.(The AI-Husbandry Thesis)

Evitability₂: It is possible to avoid the creation of both unfriendly and friendly AI+.(Strong Evitability Thesis)

The two types of inevitability draw upon the distinction between intelligence and power to which I alluded earlier. Those who believe in strong inevitability think that intelligence carries power with it. In other words, any sufficiently intelligent AI will automatically have the ability to realise its goals. Those who believe in containment/leakproofing think it is possible to disentangle the two. Proponents or Oracle AI or Tool AI take this view.

The first type of evitability is the view adopted by the likes of Kurzweil. They are generally optimistic about the benefits of technology and intelligence explosions, and think we should facilitate them because they can be of great benefit to us. It is also the view that people in the Singularity Institute (and the FHI) would like to believe in. That is: they think we should try to control AI research in such a way that it brings about friendly rather than unfriendly AI+. Hence, I call it the AI-husbandry thesis. Its proponents think we can corral AI research in the direction of friendly AI+.

The arguments in this particular area are complex and multifarious. They range over a huge swathe of intellectual territory including international politics, the incentive structures of academic research, human psychology and foresightedness, technological feasibility and so on. It would be difficult to give a precis of the arguments here.

4. The AI-pocalypse or the AI-topia?
So there we have it: three main theses, each of which is supported by a number of arguments, and each of which can be viewed as a point on a scale of possible theses which in turn are supported by a variety of arguments. In other words, a complex web of premises and conclusions that is ample fodder for philosophical and scientific investigation.

Some combinations of views within this complex web are significant. The most obvious is the one in which the Explosion Thesis (or the AI+ thesis), the Unfriendliness Thesis, and the Strong Inevitability Thesis are accepted. The combination of these three views is thought to imply the AI-pocalypse: the state of affairs in which AI+ take over and do something very bad. This could range from enslaving the human population, to exterminating them, to destroying the world and to many other imaginable horrors. The precise form of the AI-pocalypse is not important, the fact that it is implied (or suggested) by the combination of the three theses is what matters.

Other combinations of views are more benign. For example, if you don’t think that AI+ is possible, then there’s nothing particularly significant to worry about. At least, nothing over and above what we already worry about. If you accept the likelihood of AI+, and the Unfriendliness Thesis, and the AI-Husbandry Thesis, then you have some reason for cautious optimism. Why? Because you think it is possible for us to create a friendly AI+, and a friendly AI+ could help us to get all sorts of things that are valuable to us. And if you accept the likelihood of AI+ and the Friendliness Thesis, you should be very optimistic indeed. Why? Because you think that the more intelligence there is, the more likely it is that valuable states of affairs will be brought about. So we’d better ramp-up intelligence as quickly as we can. This is the view of AI-topians.

I suspect, though I can’t say for sure, that cautious optimism is probably the prevalent attitude amongst researchers in this area, though this is modified by a good dose of AI-pocalypticism. This is useful for fundraising.

5. What do I think?
Honestly, I have no idea. I’m not sure I could even estimate my subjective probabilities for some of the theses. Still, here are my current feelings about each:

The Explosion Thesis: Good’s argument, along with Chalmers’s more recent refinement of it, seems plausible to me, but it relies crucially on the existence of self-amplifying capacities and the absence of intelligence and resource limitations. I couldn’t estimate a probability for all of these variables. I’d be a little bit more confident about the potential creation of AI+ for the simple reason that (a) I think there are many skills and abilities that humans have improved and will improve in the future (some through collective action), (b) artificial agents have several advantages over us (processing speed etc) and (c) I see no obvious reason why an artificial agent could not (some day) replicate our abilities. So I would put my subjective probability for this at approximately 0.75. I should also add that I’m not sure that the creation of a truly general AI is necessary for AI to pose a significant threat. A generally naive AI that was particularly competent at one thing (e.g. killing humans) would be bad enough (but some may argue that to be truly competent at that one thing, general intelligence would be required, I’ll not comment on that here).

The Unfriendliness Thesis: I have criticised Bostrom’s orthogonality thesis in the past, but that was mainly due some dubious argumentation in Bostrom’s paper, not so much because I disagree with the notion tout court. I am a moral cognitivist, and I do believe that benevolence should increase with intelligence (although it would depend on what we mean by “intelligence”). Still, I wouldn’t bet my house (if I owned one) on this proposition. I reckon it’s more likely than not that a highly competent AI+ would have goals that are antithetical to our own.

The Inevitability Thesis: This is where I would be most pessimistic. I think that if a particular technology is feasible, if its creation might bring some strategic advantage, and if people are incentivised to create it, then it probably will be created. I suspect the latter two conditions hold in the case of AI+, and the former is likely (see my comments on the Explosion Thesis). I’m not sure that there are any good ways to avoid this, though I’m open to suggestions.

At no point here have I mentioned the timeframe for AI+ because I’m not sure it matters. To paraphrase Chalmers, if it’s a decade away, or a hundred years, or a thousand years, who cares? What’s a thousand years in the scale of history? If it’s probable at some point, and its implications are significant, it’s worth thinking about now. Whether it should be one’s main preoccupation is another matter.

Philosophical Disquisitions

Pages

Monday, December 24, 2012

The Singularity - Overview and Framework

No comments:

Post a Comment