You might think that scientists take care to ensure that the results they publish are as accurate as possible — particularly so in medicine, perhaps, where the current fashion is for decisions to be evidence-based, and for the evidence to come from published science.
A BMJ (British Medical Journal) podcast last year included a feature on the ways in which scientific knowledge progresses through articles in published journals. Even if you listened to the podcast at the time, its subdued tone and convoluted phrasing made it easy to miss its central message: that scientific deception is widespread.
Here’s just the relevant part (the full podcast contains other news and features):
The guest speaker, Steven A. Greenberg, is an associate professor of neurology. He became interested in a very specific scientific claim (relating to Alzheimer’s disease, as it happens), and so he undertook a detailed analysis of all the research papers published on the subject over a fifteen-year period (0:43):
I found it hard to understand why so many people and so many articles have reported this claim as fact…and so I decided to try and study it in greater depth.
He presented the results of his analysis as a network of connected research papers. Each connection in the network is a reference in one paper to some earlier paper. Only references that relate to the specific scientific claim are counted as connections for the purpose of this analysis.
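A network like this is naturally represented as a directed graph: each paper is a node, and each claim-specific reference is an edge pointing back to an earlier paper. The sketch below is purely illustrative (the paper numbers and citations are invented, not Greenberg's data), but it shows how inverting the citation map reveals which papers the network treats as most-cited:

```python
# Illustrative sketch of a citation network in the style of Greenberg's
# analysis. Paper numbers and edges are invented, NOT the study's data.

from collections import defaultdict

# citations[p] lists the earlier papers that paper p cites
# specifically in relation to the claim; all other references
# in each paper are ignored, as in the study.
citations = {
    90: [74, 80],   # hypothetical: paper 90 cites primary data papers 74, 80
    26: [74],
    129: [26, 80],
}

# Invert the map to ask the study's question: who cites each paper?
cited_by = defaultdict(list)
for paper, refs in citations.items():
    for ref in refs:
        cited_by[ref].append(paper)

# The papers with the most incoming claim-specific citations act as
# "authorities" in the network, whether or not they report primary data.
authorities = sorted(cited_by, key=lambda p: len(cited_by[p]), reverse=True)
print(authorities)
```

Counting incoming edges this way is how one can identify the "authority" papers discussed below, independently of whether those papers contain any original research.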
Greenberg’s study (in which the diagrams below appear) is published by the BMJ here:
How citation distortions create unfounded authority: analysis of a citation network
Data and belief
The original research mostly took place over a three-year period and produced what Greenberg calls primary data. However, over the following twelve to fifteen years, what people came to believe about the claim ceased to reflect the original research findings, the primary data (2:21):
…there were approximately twelve or so primary data papers addressing the claim…only some of this data propagated through the network into a belief system.
Instead, subsequent researchers tended to take into account only the primary data that supported the claim, while they tended to ignore the rest (2:41):
About half of the data, all of which refuted the claim, or weakened the claim, failed to propagate and spread through the belief system.
This diagram shows a subset of the papers to make things clearer:
Each blob represents a published paper. The six white blobs in the first column near the bottom represent the original research (primary data) that is critical of the claim. Only one later paper (paper 90) refers to any of these six critical primary data papers.
The six blobs in the second column, five at the bottom and one at the top, represent the original research (primary data) that supports the claim. There are thirty-one references to these supportive papers by later papers.
Greenberg uses the term citation bias for this effect in which researchers are apparently blind to primary data that does not suit them.
Some papers tend to be cited by subsequent papers much more often than others. These often-cited papers seem to have some authoritative status amongst other researchers. But the papers that can be identified as authorities in the network are not always the original research, the primary data (3:20):
What should be the authorities are the primary data papers, the papers that actually report data addressing the validity of the claim.
In this diagram, yellow blobs represent authoritative papers (the ones that have the most connections in the network).
No primary data paper that criticizes the claim is an authority in the network. Four of the papers that support the claim are authorities in the network. In addition, six other papers are authorities in the network even though they do not report on any relevant original research:
Greenberg coined the term citation diversion for cases where a paper dishonestly refers to an earlier paper so as to make it appear to support false information (4:18):
…I came across examples where certain papers were cited, but their content was perverted…so a paper would say that, “we did not find this to be the case” but then subsequent papers would cite it as saying “this paper did indeed find this claim to be true.”
For example, paper 77 was critical of the claim yet three subsequent papers (28, 37 and 38) reported that the claim had been confirmed, citing paper 77 as a reference. Greenberg writes:
Over the ensuing 10 years, these three supportive citations developed into 7848 supportive citation paths—chains of false claim in the network created by citation diversion.
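The jump from three citations to thousands of "citation paths" is a property of directed graphs: a citation path is a chain of citations leading back to the source paper, and the number of distinct chains multiplies at every layer of the network. The toy network below is invented for illustration (it is not Greenberg's data), but it shows the mechanism with a simple memoized count:

```python
# Sketch of how citation paths multiply. A citation path from paper X
# back to source paper S is any chain X -> ... -> S of citations.
# The graph below is invented for illustration, not Greenberg's data.

from functools import lru_cache

cites = {                    # each paper -> the earlier papers it cites
    "S": [],                 # the (mis-cited) source paper
    "A": ["S"], "B": ["S"], "C": ["S"],   # three papers citing S directly
    "D": ["A", "B"], "E": ["B", "C"],     # second generation
    "F": ["D", "E"], "G": ["D", "E"],     # third generation
}

@lru_cache(maxsize=None)
def paths_to_S(paper):
    """Count distinct citation chains from `paper` back to paper S."""
    if paper == "S":
        return 1
    return sum(paths_to_S(ref) for ref in cites[paper])

# Total citation paths back to S from every other paper in the network.
total = sum(paths_to_S(p) for p in cites if p != "S")
print(total)
```

Here just three direct citations of S already yield fifteen citation paths across a nine-paper network; in a network of hundreds of papers spanning a decade, the same multiplication plausibly accounts for figures on the scale Greenberg reports.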
Greenberg coined the term citation transmutation for cases where papers reported things for which there was no data at all (5:26):
…[claims] were stated initially as being hypothesis in certain papers, but over time, as these papers were cited…this hypothesis spontaneously evolved into fact…without additional data supporting it.
This diagram shows how a hypothesis proposed in primary data papers 74 and 80 (bottom left) was reported by subsequent writers. There was no data supporting the hypothesis — it was just speculation:
For example, paper 74 speculated:
…such accumulations may represent early changes…
Ten subsequent papers (the split white-green blobs) reported that the hypothesis was likely, based on references alone, even though there was no new data supporting the hypothesis. For example, paper 26:
…accumulation…occurs very early in the disease process and appears to precede other abnormalities…
Eleven subsequent papers (the solid green blobs) reported that the hypothesis was fact, also based on references alone, even though there was no new data supporting the hypothesis. For example, paper 129:
We have previously demonstrated that accumulation…precedes other abnormalities…
Some papers reported the invented fact without even so much as a citation. For example, paper 16:
…supported by the fact that accumulation…precedes other abnormalities…
Why this is important
This is important because while evidence-based medical decisions have become the norm, enforced by government departments, professional organizations and commercial interests alike, the term “evidence” should really be printed in scare quotes. Apparent scientific evidence in published research papers can at times be flimsy or fraudulent, and Greenberg’s study demonstrates that misleading evidence of this kind may be widespread within a body of research.
It might seem strange that I, as a proponent of CBT, often criticize scientific evidence when CBT is widely thought to be evidence-based. The problem is that evidence is so easily distorted. So, if the only reason to support CBT were the scientific evidence, that would be no reason at all for anyone with any sense.
For real patients facing real problems, CBT works precisely because it does not rely on scientific evidence. Instead of treating individuals as data points within some scientific model, CBT treats individuals as individuals, and formulates individualized common-sense solutions to their problems.
Both patients and practitioners who might try to interpret scientific evidence as it applies to an individual case should therefore take great care — much greater care than the researchers might have taken when they published their papers.