The shocking truth about randomised control trials exposed!
By Terence Wood
Development debates are frequently fierce and rarely resolved. Often this makes sense: many disputes are ideologically charged, evidence is unclear, and people's lives are at stake.
In other instances, the source of the sound and fury is hard to fathom. Randomised control trials (RCTs) are a case in point. Some eminent development thinkers proclaim their virtues, insisting they are the final word in evidence; others decry them in treatises.
I’m here to tell you both sides of this fight are wrong. Like much else in development, RCTs are remarkable, but also flawed. Here’s what you need to know.
What’s an RCT?
To simplify, RCTs involve taking a treatment and randomly giving it to some people but not others, then comparing the treated and untreated groups. RCTs are best known as a way to test medicines, but in development all sorts of treatments are possible, from free bed nets to civic education campaigns for voters.
Crucially, treatments are randomly allocated. If samples are large enough, random allocation means the group of people who get the treatment will be very similar in other aspects of their lives to those who don’t. Because of this, any subsequent differences between the two groups will very likely have been caused by the treatment alone. Randomisation does not have to occur between individual people. Treatments can, for example, be randomly allocated to entire villages.
The practice of running RCTs can be more complicated, but these basics are all you need for the sake of this blog.
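The logic above can be sketched in a few lines of code. This is a toy simulation, not a real study: the "treatment effect" of 5 points and every other number are invented for illustration. The point is that random allocation alone lets a simple difference in group means recover the true effect.

```python
import random

random.seed(42)

# Hypothetical example: a treatment (say, a free bed net) that raises an
# outcome score by 5 points on average. All numbers here are invented.
TRUE_EFFECT = 5.0
population = [random.gauss(50, 10) for _ in range(10_000)]

# Randomly allocate each person to treatment or control.
treated, control = [], []
for baseline in population:
    if random.random() < 0.5:
        treated.append(baseline + TRUE_EFFECT)  # outcome with treatment
    else:
        control.append(baseline)                # outcome without it

def mean(xs):
    return sum(xs) / len(xs)

# Because allocation was random, the two groups are alike in everything
# except the treatment, so the difference in means estimates its effect.
estimate = mean(treated) - mean(control)
print(f"Estimated effect: {estimate:.2f}")  # close to the true effect of 5
```

With a large enough sample, the estimate lands near the true effect; with a small one, chance differences between the groups would add noise, which is why sample size matters for RCTs.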
In aid work, RCTs can be used for two reasons: to evaluate a single aid project (an evaluation); and/or to contribute to more generalised learning (research). In the rest of this post I’ll make it clear when a particular strength or weakness matters for evaluation but not research and vice versa. If I don’t do this, assume that it’s relevant to both.
What’s so good about RCTs?
One benefit of RCTs is so prosaic it’s often forgotten. RCTs require good data. This entails making an effort to gather data well, usually beginning before the start of the actual aid project. In many evaluations evidence is an afterthought, only worried about once it’s too late. You don’t need to run an RCT to get better data but, if nothing else, running an RCT will almost certainly make you think about evidence while you can still do something about it.
Another strength of RCTs is that their findings are usually unambiguous. Compared to other approaches, particularly subjective approaches such as interviewing people who may have a vested interest in a certain outcome, the findings of RCTs are helpfully clear-cut. This matters in aid, where people are often reluctant to abandon a favoured project unless the evidence is overwhelming, or where projects get cut, not because they've failed, but because there's no good evidence they've succeeded. Clarity helps.
RCTs also guard against other problems. If, for example, you simply compare outcomes before and after an aid project, and find an improvement, how can you be sure that the improvement was actually caused by the aid, not the continuation of some pre-existing trend? RCTs address this: the control group experiences the same trend, so the trend can't be mistaken for project impact.
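To see how a before/after comparison can mislead, here is a toy simulation. All numbers are invented: incomes are rising by 10 points a year everywhere, and the hypothetical project itself does nothing. The before/after comparison credits the project with the trend; the RCT comparison does not.

```python
import random

random.seed(0)

# Hypothetical example: outcomes improve by 10 points a year everywhere,
# and the aid project has zero real effect. All numbers are invented.
TREND = 10.0

before = [random.gauss(100, 15) for _ in range(10_000)]

# Randomly split people into treated and control; both ride the same trend,
# and the (ineffective) project adds nothing to the treated group.
assignment = [random.random() < 0.5 for _ in before]
after_treated = [b + TREND for b, t in zip(before, assignment) if t]
after_control = [b + TREND for b, t in zip(before, assignment) if not t]

def mean(xs):
    return sum(xs) / len(xs)

# Naive before/after comparison: looks like a big success (~10 points),
# but that is just the pre-existing trend.
naive = mean(after_treated) - mean(before)
print(f"Before/after change: {naive:.1f}")

# RCT comparison of treated vs control after the project: roughly zero,
# correctly revealing that the project achieved nothing.
rct = mean(after_treated) - mean(after_control)
print(f"RCT estimate: {rct:.1f}")
```

The control group absorbs the trend, which is exactly why the treated-versus-control comparison isolates the project's own contribution.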
What's wrong with RCTs?
Despite these strengths RCTs are not universally popular.
At times the complaints are spurious. The worst of the arguments against RCTs is the claim that RCTs are unethical. If you have very good evidential grounds for believing a certain treatment will work, and if you have enough money to give it to everyone, denying the treatment to 50% of the relevant population just so you have a control group would be unethical. However, this almost never occurs in aid work. Usually, we don't know what works, and often we don't have money to treat everyone. In this case, RCTs are actually more ethical than the alternative: non-randomly allocating treatment and/or not gathering data, and not learning, thereby losing the opportunity to improve.
RCTs do have real limitations though.
Findings from individual RCTs may not be transferable. (Technically, their external validity is not guaranteed.) This is not usually an issue for evaluations, but it is an issue for research. Context matters a lot in development. Just because something works in Northern India doesn't mean it will work in Papua New Guinea. It's possible to combine the results of many RCTs in a way that increases the transferability of findings. But even then, it still isn't guaranteed that something that works in many countries will work in Papua New Guinea.
Also, many development projects cannot be evaluated with RCTs. RCTs can’t usually be run on attempts to build the capacity of a country’s ministry of education, or on a national policy change, or on a large infrastructure project.
Moreover, on their own, RCTs only reveal how much impact a treatment had; they don't reveal how it worked, or why it failed to work.
Finally, RCTs are expensive.
These are real objections. But they aren't fatal. The first objection is merely a reminder that you need to be wary of context. The second and third objections show that RCTs aren't the be-all and end-all: other research and evaluation methods are essential, sometimes as complements to RCTs, sometimes as substitutes.
And cost is really a question of priorities: is an RCT appropriate? What’s the current uncertainty? What’s the value of learning? Often the price of an RCT will be well worth paying. For NGOs, the binding constraint of cost could also be eased by government donors providing contestable funds for RCTs.
RCTs won’t answer all the questions that matter in development work, but they can answer some better than other approaches. Use them carefully when they will help. Use something else when they won’t.
Many development debates play out over decades. Some deserve to. Others wouldn’t be debates at all if we could just tolerate some nuance. The debate around RCTs falls into the latter category. The shocking truth about RCTs is that they are useful, when used appropriately.