Sunday, December 7, 2014

Brain-based Lie Detection and the Mereological Fallacy

Some people think that neuroscience will have a significant impact on the law. Some people are more sceptical. A recent book by Michael Pardo and Dennis Patterson — Minds, Brains and Law: The Conceptual Foundations of Law and Neuroscience — belongs to the sceptical camp. In the book, Pardo and Patterson make a passionate plea for conceptual clarity when it comes to the interpretation of neuroscientific evidence and its potential application in the law. They suggest that most neurolaw hype stems from conceptual confusion. They want to throw some philosophical cold water on the proponents of this hype.

In many ways, I am sympathetic to their aims. I too am keen to downplay the neurolaw hype. Once upon a time, I wrote a thesis about criminal responsibility and advances in neuroscience. Half-way through that thesis, I realised that few, if any, of the supposedly revolutionary impacts of neuroscience on the law were all that revolutionary. Most were simply rehashed arguments about free will and responsibility, dressed up in a neuroscientific garb, but which had been around for millennia. I also agree with the authors that there has been much misunderstanding and philosophical naivety on display.

That said, one area of neurolaw that I’m slightly more bullish about is the potential use of brain-based lie detection. But let me clarify. I’m not bullish about the use of “lie detection” per se, but rather EEG-based recognition detection tests or concealed information tests. I’ve written about them many times. Initially I doubted their practical importance and worried about their possibly mystifying effects on legal practice. But, more recently, I’ve come around to the possibility that they may not be all that bad.

Anyway, I’m participating in a conference next week about Pardo and Patterson’s book and so I thought I should really take a look at what they have to say about brain-based lie detection. That’s what this post is about. It’s going to be critical and will argue that Pardo and Patterson’s scepticism about EEG-based tests is misplaced. There are several reasons for this. One of the main ones is that they focus too much on fMRI lie detection, and not enough on the EEG-based alternatives; another is that they fail to really engage with the best current scientific work being done on the EEG tests. The result is that their central philosophical critique of these methods seems to lack purchase, at least when it comes to this particular class of tests.

But I’m getting ahead of myself. To develop this critique, I first need to review the basic techniques of brain-based lie detection and to summarise Pardo and Patterson’s main argument (the “Mereological Fallacy”-argument). Only then can I move on to my own critical musings.

1. What kinds of technologies are we talking about?
When debating the merits of brain-based lie detection techniques, it’s important to distinguish between two distinct phenomena: (i) the scanning technology and (ii) the testing protocol. The scanning technology is what provides us with data about brain activity. There are currently two main technologies in use in this field. Functional magnetic resonance imaging (fMRI) is used to track variations in the flow of oxygenated blood across different brain regions. This is typically thought to be a good proxy measure for underlying brain activity. Electro-encephalographic imaging (EEG) tracks variations in electrical activity across the scalp. This too is thought to be a good measure of underlying brain activity, though the measure is cruder than that provided by fMRI (by “cruder” I mean less capable of being localised to a specific sub-region of the brain).

The testing protocol is how the data provided by the scanning technology is used to find out something interesting about the test subject. In the classic control question test (CQT) the data is used to make inferences as to whether a test subject is lying or being deceitful. This testing protocol involves asking the test subject a series of questions, some of which are relevant to a particular incident (e.g. a hypothetical or real crime), some of which are irrelevant, and some of which are emotionally salient and similar to the relevant questions. The latter are known as “control” questions. The idea behind the CQT is that the pattern of brain activity recorded from those who lie in response to relevant questions will be different from the pattern of activity recorded from those who do not. In this way, the test can help us to separate the deceptive from the honest.

This is to be contrasted with the concealed information test (CIT), which doesn’t try to assess whether a test subject is being deceptive or not. Instead, it tries to assess whether they, or more correctly their brain, recognises certain information. The typical CIT involves presenting a test subject with various stimuli (e.g. pictures or words). These stimuli are either connected to a particular incident (“probes”), not connected to a particular incident but similar to those that are (“targets”), or irrelevant to the incident (“irrelevants”). The subject will usually be asked to perform some task to ensure that they are paying attention to the stimuli (e.g. pressing a button or answering a question). The idea behind the CIT is that certain recorded patterns of activity (data signals) are reliably correlated with the recognition of the probe stimuli. In this way, the test can be used to separate those who recognise certain information from those who do not. Since the information in question will usually be tied to a crime scene, the test is sometimes referred to as the guilty knowledge test. But this name is unfortunate and should be avoided. The test does not prove guilt or even knowledge. At best, it proves recognition of information. Further inferences must be made in order to prove that a suspect has guilty knowledge. Indeed, calling it the “concealed” information test is not great either, since the suspect may or may not be “concealing” the information in question. For these reasons, I tend to prefer calling it something like a memory detection test or, better, a recognition detection test, but concealed information test is the norm within the literature so I’ll stick with that.

As I said above, scanning technologies and testing protocols are distinct. At present, it happens to be the case that EEGs are used to provide the basis for a CIT, and that fMRIs are used to provide the basis for a CQT. But this is only because of present limitations in what we can infer from the data provided by those scans. It is possible that fMRI data could provide the basis for a CIT; and it is possible that EEG data could provide the basis for a CQT. In fact, there are already people investigating the possibility of an fMRI-based CIT.

All that said, the technology I am most interested in, and the one that I will focus on for the remainder of this post, is the P300 CIT. This is an EEG-based technology. The P300 is a particular kind of brainwave (an “evoked response potential”) that can be detected by the EEG. The P300 is typically detected when a subject views a rare and meaningful (i.e. recognised) stimulus, in a sequence of other stimuli. As such, it is thought to provide a promising basis for a CIT. I won’t go into any great depth about the empirical evidence for this technique, though you can read about it in some of my papers, as well as in this review article from Rosenfeld et al 2013. I’m avoiding this because Pardo and Patterson’s criticisms of these technologies are largely conceptual in nature.
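For readers who want a concrete picture of the inference involved, here is a toy sketch in Python. It is not a description of any published protocol: the per-trial amplitudes are simulated, and the function names, trial counts and amplitude values are all hypothetical choices made purely for illustration. The idea it illustrates is simply that a CIT asks whether the response to probes is reliably larger than the response to irrelevants, here estimated by resampling trials.

```python
import random
from statistics import mean


def simulate_trials(n, mean_amplitude, noise_sd, rng):
    """Simulate n single-trial amplitudes (hypothetical units, Gaussian noise)."""
    return [rng.gauss(mean_amplitude, noise_sd) for _ in range(n)]


def bootstrap_probe_vs_irrelevant(probe, irrelevant, n_iter=1000, seed=0):
    """Fraction of bootstrap resamples in which the mean probe amplitude
    exceeds the mean irrelevant amplitude."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_iter):
        # Resample each set of trials with replacement.
        p = [rng.choice(probe) for _ in probe]
        i = [rng.choice(irrelevant) for _ in irrelevant]
        if mean(p) > mean(i):
            wins += 1
    return wins / n_iter


rng = random.Random(42)

# Hypothetical numbers: irrelevants are frequent, probes are rare.
irrelevant = simulate_trials(120, mean_amplitude=4.0, noise_sd=3.0, rng=rng)
# A subject whose brain responds more strongly to the probe ...
recognised_probe = simulate_trials(30, mean_amplitude=8.0, noise_sd=3.0, rng=rng)
# ... and one for whom the probe is just another irrelevant item.
unrecognised_probe = simulate_trials(30, mean_amplitude=4.0, noise_sd=3.0, rng=rng)

recognised_score = bootstrap_probe_vs_irrelevant(recognised_probe, irrelevant)
unrecognised_score = bootstrap_probe_vs_irrelevant(unrecognised_probe, irrelevant)

print("recognised:", recognised_score)
print("unrecognised:", unrecognised_score)
```

Note what the output is: a statistical statement about a signal. Any step from a high score to "this person knows something about the crime" is a further inference, which is exactly where the conceptual debate begins.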

Let’s turn to that critique now.

2. The P300 and the Mereological Fallacy
Before I get into the meat of Pardo and Patterson’s argument, I need to offer some warnings to the reader. Although in the relevant chapter of their book, the authors do lump together the P300 CIT with the fMRI CQT, it is pretty clear that their major focus is on the latter, not the former. This is clear both from the time they spend covering the evidence in relation to the fMRI tests, and from their focus on the concept of lying in their critique of these tests. It is also clear from the fact that they assume that the P300 CIT is itself a type of lie detection. In this they are not entirely correct. It is true that one may be inclined to infer deceptiveness from the results of a P300 CIT, either because the testing protocol forces subjects to lie when responding to the stimuli, or because the subject themselves may deny recognising the target information. But inferring deceptiveness is not the primary goal nor the primary forensic use of this test — inferring recognition is.

Pardo and Patterson’s preoccupation with lie detection should blunt some of the force of my critique. After all, my focus is on recognition detection, and so it may be fairly said that my defence of that technology misses their larger point, or does not really call into question their larger point. Nevertheless, I do think there is some value to what I am about to say. Pardo and Patterson do still talk about the P300 CIT and they do still argue that the mereological fallacy (which I’ll explain in a moment) could apply to the interpretation of evidence drawn from that test. The fact that they spend less time fleshing this out doesn’t mean the topic is irrelevant. Indeed, it may serve to bolster my critique since it suggests that their application of the mereological fallacy to the P300 CIT is not as well thought-out, nor as respectful of the current state of the art in research and scholarship, as it should be.

But what is this mereological fallacy and how does it affect their argument? Mereology is the study of part-whole relations, so as you might gather the mereological fallacy arises from a failure to appreciate the difference between a whole and one of its parts. For the sake of clarity, let’s distinguish between two versions of the mereological fallacy. The first, more general one, can be defined like this:

The General Mereological Fallacy: Arises whenever you ascribe properties that are rightly ascribed to a whole to some part or sub-part of that whole. For example, applying the predicate “fast” to a runner’s leg, rather than to the runner themselves.

The second, more specific one, can be defined like this:

The Neurolaw Mereological Fallacy: Arises whenever a neurolaw proponent ascribes behavioural or person-level properties to a state of the brain. (In other words, whenever they assume that a brain state is constitutive of or equivalent to a behavioural or person-level state). For example, applying the predicate “wise” to a state of the brain, as opposed to the person whose brain state it is.

This more specific version of the fallacy is the centrepiece of Pardo and Patterson’s book. Indeed, their book is effectively one long elaboration of how the neurolaw mereological fallacy arises in various aspects of the neurolaw literature. In basing their criticism on this fallacy, they are following the work of others. As far as I am aware, the mereological fallacy was first introduced into debates about the philosophy of mind by Bennett and Hacker. Pardo and Patterson are simply adapting and updating Bennett and Hacker’s critique, and applying it to the neurolaw debate. This is not a criticism of their work since they do that job with great care and aplomb; it is simply an attempt to recognise the origins of the critique.

Anyway, the neurolaw mereological fallacy provides the basis for Pardo and Patterson’s main critique of brain-based lie detection. Though they do not set this critique out with any formality, I think it can be plausibly interpreted as taking the following form (see pp. 99-105 for the details):

  • (1) If it is likely that the use of brain-based lie detection evidence would lead legal actors (lawyers, judges, juries etc) to commit the neurolaw mereological fallacy, then we should be (very) cautious about its forensic uses.

  • (2) The use of brain-based lie detection evidence is likely to lead legal actors to commit the neurolaw mereological fallacy.

  • (3) Therefore, we should be (very) cautious about the forensic uses of brain-based lie detection evidence.

Let’s go through the main premises of this argument in some detail.

The first premise is the guiding normative assumption. I am not going to challenge it here. I will simply accept it arguendo (“for the sake of argument”). Nevertheless, one might wonder why we should endorse it. Why is the mereological fallacy so normatively problematic? There are some reasons. The main one is that the law cares about certain concepts. These include the intentions of a murder suspect, the knowledge of the thief, and the credibility or potential deceptiveness of the witness. The application of these concepts to real people is what carries all the normative weight in a legal trial. The content of one’s intentions and the state of one’s knowledge are what separate the criminal from the innocent. The deceptiveness of one’s testimony is what renders it probative (or not) in legal decision-making. Pardo and Patterson maintain that all these concepts, properly understood, apply to the behavioural or personal level of analysis. For example, they argue that deceptiveness depends on a complex relationship between a person’s behaviour and the context in which that behaviour is performed. To be precise, being deceptive means saying or implying something that one believes to be false, in a social context in which truth-telling is expected or demanded. These behavioural-contextual criteria are what ultimately determine the correct application of the predicate “deceptive” to an individual.

If we make a mistake in the application of those predicates, it has significant normative implications. If we deem someone deceptive when, by rights, they are not, then we risk a miscarriage of justice (or something less severe but still malign). The concern that Pardo and Patterson have is that neurolaw will encourage people to make such mistakes. If they start using neurological criteria as the gold-standard in the application of normative, behavioural-level predicates like “intention” and “knowledge”, then they risk making normative errors. This is why we should be cautious about the use of neuroscientific evidence in the law.

But how cautious should we be? That’s something I’m not entirely clear about from my reading of Pardo and Patterson’s book. They are not completely opposed to the use of brain-based lie detection in the law. Far from it. They think it could, one day, be used to assist legal decision-making. But they do urge some level of caution. My sense from their discussion, and from their book as a whole, is that they favour a lot of caution. This is why I have put “very” in brackets in my statement of premise (1).

Moving on then to premise (2), this is the key factual claim about the use of brain-based lie detection evidence. In its current form it does not discriminate between the P300 CIT and the fMRI CQT. Pardo and Patterson’s concern is that evidence drawn from these tests will lead to legal actors confusing the presence of brain signal X with the subject’s meeting the criteria for the application of a behavioural predicate like “knowing” or “intending” or “deceiving”. In the case of the P300 CIT, the fallacy arises if the detection of the P300 is taken to be equivalent to the detection of a “knowledge”-state within the subject’s brain, instead of merely evidence that can be used to infer that the subject is in the appropriate behavioural knowledge state.

But do proponents of this technology make the fallacy? Pardo and Patterson argue that they do. They offer support for this by quoting from an infamous proponent of the P300 CIT: Lawrence Farwell. When describing how the technology worked, Farwell once said that the “brain of the criminal is always there, recording events, in some ways like a video camera”. Hence, he argued that the P300 CIT reveals whether or not crime-relevant information is present in the brain’s recording. Farwell is committing the fallacy here because he thinks that the state of knowing crime relevant information is equivalent to a brain state. But it is not:

This characterization depends on a confused conception of knowledge. Neither knowing something nor what is known — a detail about a crime, for example — is stored in the brain…Suppose, for example, a defendant has brain activity that is purported to be knowledge of a particular fact about a crime. But, suppose further, this defendant sincerely could not engage in any behavior that would count as manifestation of knowledge. On what basis could one claim and prove that the defendant truly had knowledge of this fact? We suggest that there is none; rather, as with a discrepancy regarding lies and deception, the defendant’s failure to satisfy any criteria for knowing would override claims that depend on the neuroscientific evidence
(Pardo and Patterson 2013, pp. 101-102)

Or as they put it again later, behavioural evidence is “criterial” evidence for someone knowing a particular fact (satisfaction of the behaviour criteria simply is equivalent to being in a state of knowledge); neuroscientific evidence is merely inductive evidence that can be used to infer what someone knows. People like Farwell are wont to confuse the latter with the former and hence wont to commit the mereological fallacy.

That, at any rate, would appear to be their argument. Is it any good?

3. Should we take the mereological fallacy seriously?
I want to make three criticisms of Pardo and Patterson’s argument. First, I want to suggest that the risk of proponents of the P300 CIT committing the mereological fallacy is, in reality, slight. At least, it is when one takes into account the most up-to-date work being done on the topic. Second, I want to push back against Pardo and Patterson’s characterisation of the mereological fallacy in the case of the P300 CIT. And third — and perhaps most significantly — I want to argue that in emphasising the risk of a neurolaw mereological fallacy, Pardo and Patterson ignore other possible — and arguably more serious — evidential errors in the legal system.

(Note: these criticisms are hastily constructed. They are my preliminary take on the matter. I hope to revise them after next week’s conference)

Turning to the first criticism, my worry is that in their defence of premise (2), Pardo and Patterson are constructing something of a straw man. For instance, they cite Lawrence Farwell as an example of someone who might confuse inductive neuroscientific evidence of knowledge with criterial behavioural evidence of knowledge. But this is a misleading example. Farwell’s characterisation of the brain as something which simply records and stores information has been criticised by leading proponents of the P300 CIT. For example, J. Peter Rosenfeld, himself a leading psychophysiologist and P300 researcher, wrote a lengthy critical appraisal of Farwell back in 2005. In it, he identified the problems with Farwell’s analogy, and noted that the act of remembering or recollecting information is highly fallible and reconstructive. There are also other P300 CIT researchers who have actually tried to check the vulnerability of the technique to false memories. Beyond this, Farwell has been more generally criticised by experts in the field. In a recent commentary on a review article written by Farwell, the authors (a group of leading P300 researchers) said this:

By selectively dismissing relevant data, presenting conference abstracts as published data, and most worrisome, deliberately duplicating participants and studies, he misrepresents the scientific status of brain fingerprinting. Thus, [Farwell] violates some of the cherished canons of science and if [he] is, as he claims to be, a ‘brain fingerprinting scientist’ he should feel obligated to retract the article. 
(Meijer et al, 2013)

Of course, Farwell isn’t a straw man: he really exists and he really has pushed for the use of this technology in the courtroom. So I’m not claiming that there is no danger here or that Pardo and Patterson are completely wrong to warn us about it. My only point is that Farwell isn’t particularly representative of the work being done in this field, and that there are others who are alive to the dangers of assuming that the P300 signal does anything more than provide inductive evidence of knowledge. To be fair, I have a dog in this fight since I have written positively about this technology. But I would never claim that the detection of a P300 is criterial evidence of guilty knowledge; I would always point out that further inferential steps are needed to reach such a conclusion. I am also keen to point out that this technology is not yet ready for forensic use. Along with other proponents, I think widespread field-testing, in which the results of a P300 are measured against other more conclusive forms of evidence (including behavioural evidence) in actual criminal/legal cases, would be needed before we seriously consider it.

This leads me to the second criticism, which is that I am not entirely sure about Pardo and Patterson’s characterisation of the mereological fallacy, at least as it pertains to the P300 CIT. They are claiming that there is an important distinction between a person knowing something and the neurological states of that person. Knowledge is a state pertaining to the whole, whereas neurological states are sub-parts of that whole. Fair enough. But as I see it, the P300 CIT is not a test of a subject’s knowledge at all. It is a recognition test. In fact, it is not even a test of whether a person recognises information; rather, it is a test of whether the person’s brain recognises information. A person’s brain could recognise a stimulus without the person themselves recognising the stimulus. Why? Because large parts of what the brain does are sub-conscious (sub-personal, if we assume the personal is defined by continuing streams of consciousness). Figuring out whether a subject’s brain recognises a stimulus seems forensically useful to me, and it need not be confused with assuming that the person recognises the stimulus.

The final criticism is probably the most important. A major problem I have with Pardo and Patterson’s discussion of brain-based lie detection is how isolated it feels. They highlight the empirical and conceptual problems with this form of evidence without considering that evidence in its appropriate context. I will grant that there is a slight risk that proponents of the P300 CIT will commit the mereological fallacy. But how important is that risk? Should it really lead us to be (very) cautious about the use of this technology? That’s something that can only be assessed in context. What other methods do we currently use for determining whether a witness or suspect recognises certain crime-relevant information? There are several. The most common are robust questioning, cross examination and interrogation. Verbal or behavioural responses from these methods are then used to make inferences about what someone knows or does not know. But these methods are not particularly reliable. Even if behavioural criteria determine what it means for a subject to know something, there are all sorts of behavioural signals that can mislead us. Is someone hiding something if they are being fidgety? Or if they look nervous and blink too often? What if they change their story? We routinely make inferences from these behavioural signals without knowing for sure how reliable they are or how likely they are to mislead us (though we may have some intuitive sense of this).

And this matters. One of the points that I, and others, have been making in relation to the P300 CIT is that it provides a neurological signal, from which we can make certain inferences, and that it comes with known error rates and precise protocols for its administration. In this respect it seems to have a comparative advantage over many of the other methods we use for making similar inferences. This is why we should take it seriously. In other words, even if it does carry with it the risk that legal actors will commit the mereological fallacy, that risk has to be weighed against the risks associated with other, similar, evidential methods. If the latter outweigh the former, Pardo and Patterson’s argument seems a good deal less significant.
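To see why known error rates matter, consider a minimal worked example using Bayes' rule. Every number below is hypothetical and chosen only to make the arithmetic easy; none is a real error rate for the P300 CIT or for any behavioural cue. The point is structural: when the hit rate and false-positive rate of a test are known, a positive result supports a definite posterior probability of recognition, whereas with an unknown false-positive rate the same kind of result is compatible with a wide range of conclusions.

```python
def posterior_recognition(prior, sensitivity, false_positive_rate):
    """P(recognition | positive result), computed with Bayes' rule."""
    true_pos = sensitivity * prior
    false_pos = false_positive_rate * (1.0 - prior)
    if true_pos + false_pos == 0:
        raise ValueError("a positive result has zero probability under these inputs")
    return true_pos / (true_pos + false_pos)


# Hypothetical prior probability that the subject recognises the information.
prior = 0.5

# A test with (hypothetically) known error rates yields one definite answer.
known_test = posterior_recognition(prior, sensitivity=0.9, false_positive_rate=0.1)

# A demeanour cue (e.g. fidgeting) whose false-positive rate is not known:
# sweep over some candidate values to see how far the conclusion can move.
candidate_fprs = (0.1, 0.3, 0.6)
demeanour_range = [
    posterior_recognition(prior, sensitivity=0.6, false_positive_rate=fpr)
    for fpr in candidate_fprs
]

print(round(known_test, 3))
print([round(p, 3) for p in demeanour_range])
```

In the second case the inference drawn from the cue could be fairly strong or no better than the prior, and without measured error rates there is no way of telling which.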

4. Conclusion
To briefly sum up, Pardo and Patterson offer an interesting and philosophically sophisticated critique of brain-based lie detection. They argue that one of the dangers with this technology is that the legal actors who make use of it will be prone to commit the neurolaw mereological fallacy. This fallacy arises when they ascribe behavioural-level properties to brain states. Though I agree that this is a fallacy, I argue that it is not that dangerous, at least in the case of evidence drawn from the P300 CIT. This is for three reasons. First, I think the risk of actual proponents of the technology committing this fallacy is slight. With the exception of Lawrence Farwell — whom Pardo and Patterson critique — most proponents of the technology are sensitive to its various shortcomings. Second, Pardo and Patterson’s characterisation of the mereological fallacy — at least when it comes to this type of evidence — seems misleading. The P300 CIT provides a signal of brain-recognition of certain information, not person-recognition of information. And third, and most important, the risk of committing the mereological fallacy must be weighed against the risk of making faulty inferences from other types of evidence. It is suggested that the latter are likely to be higher than the former.
