Exposure of faked dishonesty study makes me proud to be a behavioural scientist

The story has a lot to recommend it: psychologist Dan Ariely, the author of a bestselling book on the behavioural science of dishonesty, retracts his study because the data was faked. No wonder it’s been picked up by the world’s media. Buzzfeed declared this “the latest blow to the buzzy field of behavioural economics”. Psychologist Stuart Ritchie, himself a scientist, wrote about the case under the headline: “Never trust a scientist”.

Dan Ariely accepted that the data on which his study relied had been faked and that he should have double checked it.

I worry about these interpretations. And not because I teach on a behavioural science master’s programme. I worry because headlines like this risk stoking anti-science sentiment at a time when faith in experts is low, when thoughtful people parrot that we live in a “post-truth world” and when mistrust of science is causing deaths.

But most of all, I worry about these interpretations because I take the opposite conclusion from this story. In this case, the lesson is that the scientific process actually worked well.

Casting doubt on the science

An important and overlooked detail is that the scientific process revealed years ago that the results of the paper didn’t hold. Using data provided by an insurance company, Ariely’s study claimed that people are more honest in their reports if they sign a declaration of truthfulness at the beginning of a document rather than at the end of it. The method was adopted by the IRS, the US tax collection agency, and at least one big insurance company.

While nobody expressed concerns about deliberate fraud, many research teams had reported their failed attempts to replicate the initial studies. Replication is important. Because science is rooted in probability, observing the same result on two independent occasions makes it far less likely that the result is a fluke.
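To put a rough number on why replication matters, here is a minimal sketch. It assumes the effect being tested does not really exist, that each study has a 5% chance of producing a fluke result, and that the studies are fully independent. The function name is made up for illustration.

```python
def chance_all_flukes(alpha: float, n_studies: int) -> float:
    """Chance that n independent studies of a non-existent effect all
    come out statistically significant purely by luck.

    alpha is the assumed fluke rate of a single study (an illustrative
    figure, not one taken from the studies discussed in this article).
    """
    return alpha ** n_studies


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} independent result(s): {chance_all_flukes(0.05, n):.6f}")
```

Under these assumptions, one fluke has a 1-in-20 chance, while two independent flukes in a row have a 1-in-400 chance.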

In 2020, Ariely and his co-authors published a paper in which they themselves attempted and failed to replicate the initial results. At that time it had not yet emerged that the initial data had been faked. The authors concluded that the initial results were a fluke and titled the follow-up paper: “Signing at the beginning versus at the end does not decrease dishonesty.”

Another striking feature is that the failed replications were published in one of the top general science journals. It’s a recent development that scientists would devote their time to replication studies – and that top journals would devote precious column inches to publishing them – and follows a series of statistical studies that cast doubt on the rigour of published science.

First was the provocative data simulation study that suggested more than half of published results of scientific research are false. This finding derives from the following three features:

1. Some results are flukes.
2. New results are being found all the time.
3. Unexpected and eye-catching results are more likely to be published.
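The way these three features combine can be sketched with some simple arithmetic. The numbers below are illustrative assumptions, not figures from the simulation study itself: suppose only 5% of the new ideas being tested are actually true, a study of a true idea detects it half the time, a study of a false idea produces a fluke 5% of the time, and only statistically significant results get published.

```python
def share_of_published_false(prior_true: float, power: float, alpha: float) -> float:
    """Share of published (significant) results that are false.

    prior_true: assumed fraction of tested ideas that are really true
    power:      assumed chance a study detects a true effect
    alpha:      assumed chance a study of a false idea yields a fluke
    Assumes only significant results are published.
    """
    true_positives = prior_true * power
    false_positives = (1.0 - prior_true) * alpha
    return false_positives / (true_positives + false_positives)


if __name__ == "__main__":
    share = share_of_published_false(prior_true=0.05, power=0.5, alpha=0.05)
    print(f"Share of published results that are false: {share:.1%}")
```

With these made-up inputs, roughly two thirds of published results would be false. The exact share depends heavily on the assumptions, particularly on how many of the tested ideas are true to begin with.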

Then there was the Many Labs replication project. It found that more than half the results published in top psychology journals couldn’t be replicated.

Exposing false results

Some insightful contributions come from behavioural science, which comprises several disciplines that look at human behaviour and interaction, and works at the intersection of statistics, economics and psychology. One of those insights was that scientists can publish false results even without knowing it.

To get a sense of this, you first need to know that the scientific community treats a result as evidence only if it passes a threshold. That threshold is expressed as a p-value, with p standing for probability: roughly, the chance of seeing a result at least this striking if there were no real effect. Lower p-values indicate more reliable results. A result passes the threshold into reliable evidence or, in the language of science, is statistically significant, if its p-value is below a conventional cut-off, for example, p < 0.05.

Intentionally or otherwise, researchers inflate the chances of attaining statistically significant results by engaging in questionable research practices. In a survey published in 2012, a majority of psychologists reported that they test their theory by measuring more than one outcome and then report the results only on the outcome which attains statistical significance. Presumably they admitted to this behaviour because they didn’t recognise that it inflates the chance of drawing an incorrect conclusion.
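A short calculation shows how much this practice can inflate the error rate. The sketch assumes there is no real effect, that each outcome is tested at the 5% level, and that the outcomes are independent of one another (in real studies they are usually correlated, which softens the inflation somewhat). The function name is hypothetical.

```python
def chance_of_false_positive(alpha: float, n_outcomes: int) -> float:
    """Chance that at least one of n independent outcomes looks
    statistically significant when no real effect exists."""
    return 1.0 - (1.0 - alpha) ** n_outcomes


if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        rate = chance_of_false_positive(0.05, n)
        print(f"{n:>2} outcome(s) measured: {rate:.1%} chance of a fluke")
```

Under these assumptions, a researcher who measures five outcomes and reports whichever one works has about a 23% chance of publishing a fluke, not the 5% the threshold seems to promise.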

Uri Simonsohn, Leif Nelson and Joe Simmons, a trio of behavioural scientists who are routinely described as “data detectives”, devised a test to ascertain whether a conclusion is likely to have derived from questionable research practices. The test examines whether the evidence that supports a claim is suspiciously clustered just below the threshold of statistical significance.
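The intuition behind a test of this kind can be sketched as follows. This is a simplified illustration, not the trio's actual procedure: it asks how surprising it would be to see so many significant p-values in the upper half of the significant range (between 0.025 and 0.05) if each one were equally likely to land in either half. The p-values in the example are invented.

```python
from math import comb


def binom_tail(k: int, n: int, p: float = 0.5) -> float:
    """Probability of k or more successes in n trials with success chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def clustering_near_threshold(p_values, alpha=0.05):
    """Count significant p-values in the upper half of (0, alpha) and
    return (count_high, count_significant, tail_probability)."""
    significant = [p for p in p_values if p < alpha]
    high = sum(1 for p in significant if p > alpha / 2)
    return high, len(significant), binom_tail(high, len(significant))


if __name__ == "__main__":
    # Hypothetical p-values, made up purely for illustration.
    reported = [0.041, 0.048, 0.032, 0.044, 0.049, 0.038, 0.012, 0.046]
    high, total, prob = clustering_near_threshold(reported)
    print(f"{high} of {total} significant results sit just below the threshold")
    print(f"Chance of that many or more by luck alone: {prob:.4f}")
```

A small tail probability suggests the results are bunched closer to the threshold than chance alone would easily explain, which is the kind of pattern that invites a closer look.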

It was this test that debunked the idea of “power posing” – the widely publicised claim that you can perform better in stressful situations if you adopt an assertive physical posture, such as hands on hips.

Now the three data detectives have done it again. It was on their blog that the stark and sensational facts of Ariely’s dishonesty study were exposed. Contrary to Buzzfeed’s claim that this case constitutes a blow to behavioural economics, it in fact demonstrates how behavioural science has led us to root out phoney results. Exposing that bad apple, and the fascinating techniques employed to do it, actually constitutes a victory for behavioural scientists.

David Comerford receives funding from UKRI.