Seven years ago, the long arc of the Community for Rigor began with an NIH survey. It found that there were few classes on rigorous research practices in higher education. This should have come as no surprise since, by then, many incontrovertibly good practices had been measured at less than 50% adoption. Schools are one of the main stakeholders in science that stand to lose ranking, funding, and reputation if they publish less because they’re too worried about rigor. NIH, specifically the National Institute of Neurological Disorders and Stroke (NINDS), set out to change this by funding the production of teaching materials on rigorous science for schools to use.
Those well-versed in replication crisis interventions may see a hole in this logic: long-resistant schools may simply resist using the materials, too. It may be that knowing about the crisis, knowing the rigorous practices, and finding those practices easy to carry out still won’t induce someone with a lot to lose to follow through and practice rigor. To put it simply, the reproducibility crisis is embarrassing for schools. Schools generally don’t want to teach something that makes them look bad. In education, making yourself look bad also costs money, in grants and students.
This is a very strong incentive. However, schools that avoid teaching the replication crisis and improving rigor are only kicking the can down the road. The longer this goes on, the worse they will look. NIH having to spoon-feed schools teaching materials on an ongoing historical event, and on the many empirical studies that back it up, only makes schools look worse still.
Those well-versed in replication crisis interventions may have missed another hole in this plan: the Community for Rigor grantees work at schools. The grantees may not want to produce materials that cite the many empirical studies behind the replication crisis, or that mention the crisis at all. They may instead want to produce material that makes schools look good and helps them continue to publish low-quality papers, winning grants and students rather than losing them.
With apologies to those Community for Rigor grantees and administrators who have done excellent work, some grantees want to keep kicking the can down the road. It may be a sign of what’s to come.
What’s to come
Recently, now-HHS Secretary RFK Jr. said that 20% of NIH funding should go to replication. Vice President J.D. Vance is referencing the reproducibility crisis on X. And NIH has been asked to consider congressional proposals that include metascience, like replications, before Congress makes them law (PDF).
The long-requested scientific reforms of funding replications, oversight, and audits are coming true. Judging both from an understanding of the science reform community and from the fact that RFK’s comment has not been repeated anywhere in that community since, this is not the administration from which reformers wanted to get these reforms. Nonetheless, a large amount of money may soon be spent on metascience.
Historically, money has not been spent on metascience. Only a fraction of one percent of NIH projects mention replication. Metascience has instead been funded by private philanthropy. (Even then, there was a worry that criticizing scientific practices is a political act, because science is seen as predominantly liberal.)
The natural question to ask, then, is whether funding metascience will improve the core metrics of scientific success: replication and self-correction. I believe it will. However, the incentives that compromise research, variously described as “publish or perish,” science progressing “one funeral at a time,” and putting one’s head in a guillotine, may be too strong and too existential to overcome by simply funding the same people in the opposite direction.
It’s worth asking whether giving more money to researchers with those incentives will reverse them. One place to look for an answer is one of the few federally funded metascience projects, the Community for Rigor (C4R).
The architects of C4R (NINDS head Walter Koroshetz, Director of Research Quality Shai Silberberg, and, later, Program Director Devon Crawford) designed it well. It acknowledges the ongoing crisis in rigor and neurology’s past scandal with “sham” stroke research, which Dr. Koroshetz outlines in his opening statements. (In this talk, which I’ve recommended to many C4R stakeholders, Dr. Koroshetz quotes his scientific director calling some NINDS grantees “crooks” and summarizes the extent of modern science’s problems. It’s hard to dismiss an NIH director, particularly if you think the hierarchy in science works. If that’s your view, I’d recommend the talk over what you’re reading now.)
The design of C4R stipulated that NIH would have an ongoing role in administering the project, that community feedback would be essential, and that an outside committee would help. NINDS also requires past rigorous research to win a grant in the first place.
However, rigorous grantees would need to apply, and they would need to produce materials that could be inherently critical of research institutions. After the money was granted, NINDS’ ongoing involvement and feedback from the community would be the only mechanisms to make sure that grantees weighed their incentives in favor of rigor.
This is where I believe this particular funding line went wrong. There’s nothing that says grantees can’t undermine the entire metascientific project, claim rigor is what they’ve been doing all along, and fail to mention the large body of literature that says it’s not. This is the part that I believe may unwind the future funding of metascience and of other research that may threaten the reputations and business models of research stakeholders: schools, scientists, journals, and funders. One incentive may not be enough to compensate for another.
I believe some of the C4R grantees chose the path of optimizing future funding, and despite feedback and intervention all the way up to Dr. Koroshetz, they continued on it. More disturbingly, every effort was made to protect them from criticism, including removing videos from the internet.
The Operation
The “Red Team Operation,” like others in science, is simple. Take a process such as “data available upon request” and, so to speak, request the data. In this case, the Community for Rigor solicited feedback on their annual conference. I sent some feedback. The process shouldn’t end in stonewalling, or in anything else that prevents the feedback from having a chance to take effect. Also, like other “operations,” it didn’t start with high-minded principles. I was simply annoyed.
Results
What follows is the feedback I sent: first to Devon Crawford at NINDS/NIH; then to C4R and to Devon’s boss, Shai Silberberg, partly to get his cooperation and partly to make sure a FOIA-able address was included; then to Shai’s boss, Walter Koroshetz; and finally to individuals who gave presentations at the C4R conference.
In summary, the response I got, first from Dr. Crawford and then from C4R, was that the materials presented at the conference were not ready to be critiqued. This is contrary to Dr. Silberberg’s call for critique of the conference, and to common sense. If any of the grantees are giving out bad information on how to do research, the material is not ready for the public to consume either. NINDS rules also say the grantees need to have done rigorous work in the past; one would assume they’d be ready right away.
I asked when the material would be ready for critique. I was told that the dates, previously given as estimates on the C4R website, were part of “our internal planning process.”
As the months went on, I saw some hopeful signs at C4R, like Richard Born’s “Stop Fooling Yourself” paper. However, I eventually told C4R I was going to email the professors my feedback directly, and I did, both critiques and compliments. Around this point, C4R took the videos of the conference down and asked me not to contact “team members” directly.
Asking someone not to contact professors about their work is, on its face, ridiculous, and a response from them would have obviated this post. But direct contact is also explicitly part of the plan. Quoting Shai Silberberg: “Since this is a community-led initiative, we envision champions from the community who are not part of this initiative will provide input and feedback directly to the METERs [the professors] and CENTER [the coordinators].”
One video remains. My comments on the remaining video were directed at a guest speaker, not a C4R grantee. I asked C4R why the videos were taken down and didn’t get a response to that question.
I did get a reasoned response to my feedback on the module based on Dr. Born’s paper. In the two months since, the wording of one of the presentation’s 140 slides has changed, and only slightly.
This is a point worth stressing. C4R is under no obligation to take my feedback, or anyone’s. However, in two years of following the project, I have not seen evidence that C4R has made changes based on my feedback, on other community members’ feedback, or on feedback from NINDS’ director Walter Koroshetz (see emails). That doesn’t mean changes weren’t made internally. But given the seriousness of the topic, and how clear the errors were, I think it’s worth considering that self-governance doesn’t work as well as it should. The promised “tight feedback loop with the community,” “community-driven movement,” and “bidirectional collaboration with the community” are, at best, not apparent.
This is an approach to self-correction in science that I thoroughly condemn: make science policed by something called metascience; fund the teaching of metascience, which is meta-meta-science; make that policed by councils and feedback, which is meta-meta-meta-science; and by that time nobody cares, the videos have a handful of views, and anything that goes wrong gets tossed into the void along with all the materials you produce. It won’t work. It’s not working.
For months, C4R refused to give me any indication that my feedback had reached the authors. When it finally did, C4R asked me to stop. The authors I critiqued didn’t reply.
Afterword: Human Subjects
There has also been a development that is more evocative than conclusive. It fits with the theme of barriers to community feedback: The public can’t participate in module feedback sessions without agreeing to be a research subject. If someone joins a feedback session without agreeing, they are asked to leave.
The agreement includes this language:
“Participation in our research initiatives may pose some risks to you and your privacy. In extreme cases, when your responses are collected in a group session, others may be able to cause you hardship such as embarrassment or loss of employment on the basis of your responses. This risk is about the same as any time you speak in a professional capacity.”
Risks are part of informed consent, and the review board that approved this research may have wanted this language included. However, I don’t think community members should have to become human subjects, or take on these risks, just to participate in these sessions. It is certainly chilling to tell people that they may lose their job if they give you feedback.
The University of Pennsylvania’s own proposal for C4R makes the case against this policy perfectly: “While many papers have been written and countless presentations delivered on rigorous practices, the average scientist fails to consistently follow the relevant principles.” If anyone should be at risk of anything, it should be those not following the relevant principles.