This is the second part in my series of posts about John Meixner’s article “Liar, Liar, Jury’s the Trier”. The article makes the case for the (potential) use of brain-based lie detection in legal trials. More precisely, it makes the case for the potential use of the P300 concealed information test in legal trials, which, although often lumped together with traditional lie detection tests, is importantly different from them.
In this series of posts, I am only looking at a small part of what Meixner has to say. As he notes, one objection to the use of brain-based lie detection tests is that the evidence drawn from them trespasses upon an issue that is the exclusive prerogative of the jury: determining the credibility of the witness. The classic view is that expert scientific evidence is only admissible when an issue strays beyond the court’s areas of competence; assessing witness credibility is traditionally deemed a competency of the court (particularly the jury); therefore, there is no reason to rely upon brain-based lie detection evidence when making credibility assessments.
Meixner argues that the problem with this is that juries are not, as a matter of fact, particularly good at determining the credibility of witnesses. And one of the nice things about Meixner’s article is how he reviews some of the empirical studies that have been done on this issue. We looked at the first batch of such studies in part one. Each of them dealt with the possibility of juries using “demeanour cues” to determine whether a witness was telling the truth or not. But demeanour cues are not the only thing one could use to assess credibility. Jurors could also assess credibility by looking at the consistency or depth of a witness’ statements (call these “content cues”). This tracks with common sense: I often try to work out whether a friend or colleague is lying to me by checking the consistency of their story. It stands to reason that jurors would do the same thing, doesn’t it?
Maybe. But we need to see where the evidence leads. To this end, we’ll break down the remainder of the discussion into two parts. First, we’ll ask whether jurors do, as a matter of fact, use content cues to assess credibility. Second, we’ll ask whether they are any good at doing so.
1. Do juries use content cues to assess credibility?
As was the case in part one, the majority of this post will consist of a summary of empirical research. The research doesn’t always deal specifically with “juries” (mock or otherwise). Sometimes it just deals with “ordinary” people (usually undergraduate students). We make inferences from these studies to the behaviour of possible jurors on the grounds that jurors are no different from ordinary people. This, of course, may be a problematic inference in certain cases.
It is worth bearing in mind that Meixner looks at the empirical evidence in a particular context. He starts by examining an argument from a guy called Max Minzner, who has criticised the notion that all credibility assessments are based on demeanour cues. Minzner argues that juries also rely on content cues. He cites two studies in support of this claim:
Hee Sun Park et al 2002: This was a survey of 202 undergraduate students. Each student was asked to recall a recent situation in which they had caught someone lying to them, and to recall as much information about the incident as possible. They were then asked specific questions about the incident, including questions about how they detected the lie. It was found that 32% of the lies were discovered through 3rd party information; 31% through a combination of methods; 18% through physical evidence; and 2% through demeanour cues.
As you can see, this study doesn’t expressly support the notion that ordinary people use content cues to assess credibility; all it does is suggest that they don’t rely very heavily on demeanour cues. This would be an interesting finding if we could trust the results of the survey, but such trust seems unwarranted. Meixner notes several methodological flaws in the study. By relying on incidents that people can actually recall, the study risks biasing the evidence in favour of a particular view. For example, it could be that our memories are selective, and that we simply don’t remember instances of deception detection that relied heavily on demeanour cues. Furthermore, the scenario envisaged here is highly disanalogous to that which would take place in a courtroom. So drawing an inference from this to the likely behaviour of juries is problematic.
That brings us to the second study:
Seidman Diamond et al 2006: This was part of an ongoing research project dealing with the actual behaviour of real world juries. The study analysed the kinds of questions asked by jurors in 50 civil cases in Arizona state courts. The findings were that 42% of all juror questions were of the “cross-checking” variety. In other words, they were attempts by members of the jury to check the consistency of the story being presented by the witness. This was particularly so when the witness was a disinterested party. It is worth noting that not all courts allow juries to ask questions of witnesses during a trial, so this study could only be replicated in certain jurisdictions.
Meixner says that this study does provide some support for the notion that jurors use content cues to determine credibility. That said, it suffers from several limitations. The biggest one is that it doesn’t allow us to make any claims about the effect of demeanour cues on credibility assessments. The study looked at questions asked by jurors of witnesses, and it is highly unlikely that a juror would ever ask a question about demeanour. Nevertheless, such cues could still play a very important role in juror deliberation. Furthermore, the study was limited to civil cases, the majority of which involved motor vehicle accidents. It is possible that the consistency of the witness’ story is more important in these kinds of cases.
So the upshot of all this is that there is some reason to think that content cues play a role in credibility assessments. But how great a role that is, and how widespread it is, remains to be determined.
2. Are content cues reliable guides to credibility?
We now move on to the more important question. If we grant that content cues are being used, we must ask whether they are being used effectively. Several studies touch upon this issue:
Hartwig et al. 2005: This study used a similar paradigm to that adopted by Kassin and Fong (discussed in part one). Essentially, students were divided into two groups, one of which performed a mock crime, the other of which performed a related non-criminal act. The students were then interrogated for five minutes about the crime. All of them were instructed to deny the crime during the interrogation. The interrogations were videotaped and played for another group of students, each of whom had to determine who was being deceptive and who wasn’t.
So far, this is identical to the Kassin and Fong study. Where Hartwig’s study differed was in the introduction of a “late” or “early” disclosure condition during the interrogation. In the early disclosure condition, the interrogator presented three bits of incriminating evidence to the student at the start of the interview. They then gave the student a free recall period in which to explain what happened. In the late disclosure condition, the three bits of evidence were presented after the free recall period and after the student’s initial explanation of what happened. The findings were interesting. 43% of those who viewed tapes involving an early disclosure were able to accurately assess credibility; whereas 62% of those who viewed the late disclosure cases were able to accurately assess credibility.
The idea here, of course, is that in the late disclosure condition there are more likely to be inconsistencies between the student’s story and the evidence being presented (if the student is lying about denying the crime). These inconsistencies are content cues that can be used to determine credibility more accurately.
In principle this sounds okay, but then we face the inevitable question: how well does this study replicate the kind of scenario faced by juries in legal trials? Minzner argues that the late disclosure condition is like that faced by the typical jury; Meixner is not convinced. In some jurisdictions, witnesses are well-coached and well-prepped. They have time to prepare their stories and they know what sort of questions they are likely to be asked. Furthermore, in civil cases discovery procedures prior to trial ensure that both sides know what evidence is going to be presented before the court. Add this to the fact that the late disclosure condition only increased accuracy to 62% and we are still left with the conclusion that juries are not particularly good at credibility assessments.
Most damning of all, however, is the fact that accuracy in this study was not equal across truth-tellers and liars. Those who observed the late disclosure condition correctly classified 68% of the deceptive interrogations, but only 54% of the truthful ones. In other words, they misclassified 46% of the truth-tellers. That is a significant rate of false positives, and it would have significant implications for the normative goals of a legal trial. After all, in criminal cases, we are generally biased against false convictions. This study does nothing to suggest that content-based credibility assessments help us to avoid false convictions.
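To make the asymmetry concrete, here is a quick back-of-the-envelope calculation. The per-class rates are the ones reported above; the even split between deceptive and truthful tapes is my assumption, introduced only to show how the class-specific figures relate to the overall accuracy in the low 60s:

```python
# Illustrative arithmetic only: the per-class rates are the reported figures;
# the 50/50 split of deceptive and truthful tapes is an assumption of mine.
deceptive_hit_rate = 0.68  # deceptive interrogations correctly classified
truthful_hit_rate = 0.54   # truthful interrogations correctly classified

# Overall accuracy under an even split of the two kinds of tape.
overall_accuracy = (deceptive_hit_rate + truthful_hit_rate) / 2

# Rate at which truth-tellers are wrongly flagged as deceptive.
false_positive_rate = 1 - truthful_hit_rate

print(f"overall accuracy:    {overall_accuracy:.0%}")     # 61%
print(f"false positive rate: {false_positive_rate:.0%}")  # 46%
```

The point of separating the two numbers is that a single headline accuracy figure can hide exactly the kind of error (condemning the honest) that the criminal trial is designed to avoid.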
Hartwig and her colleagues did a follow-up to their 2005 study:
Hartwig et al 2006: This study followed the same basic protocol as the previous one. The difference was that it focused on the ability of interrogators to assess credibility. Prior to the interview, some of the interrogators were trained for three hours in how to effectively use the late disclosure method. It was found that those who were trained achieved accuracy levels of 85% when it came to determining witness credibility.
This is an interesting result, and perhaps says something about how to effectively train interrogators. The problem is that jurors don’t have the same powers as a police interrogator.
To sum up, Meixner’s argument is that we shouldn’t create absolute legal impediments to the use of brain-based lie detection because existing methods of credibility assessment are not that impressive. As I noted in part one, this argument is worthy of critical scrutiny since it privileges accuracy or “truthiness” over the other virtues of the legal trial. Still, I think the empirical evidence reviewed by Meixner is worthy of our consideration.