When rational drug design meets an irrational disease

On April 25, 2023, the FDA approved Biogen’s novel drug tofersen under the brand name “Qalsody” as a treatment for a rare form of amyotrophic lateral sclerosis, or ALS. ALS, if you don’t know, is a terrible disease in which you slowly lose control of your muscles, including, eventually, the muscles that allow you to breathe. Tofersen is a drug that purports to slow the progression of ALS, specifically in people who get ALS because of a mutation in a specific gene called SOD1.

The press release for Qalsody touts this approval as a milestone because it’s the first treatment to target a genetic cause of ALS. If you read further into the press release, you’ll see the real milestone, though: Biogen is now 2 for 2 on getting the FDA to approve neurodegeneration “treatments” after they fail clinical efficacy trials, roughly two years after they managed the same with their clinically ineffective Alzheimer’s drug aducanumab.

That is to say, Biogen is now 2 for 2 on the following playbook: 

1) spend a lot of money developing a treatment for a neurodegenerative disease based on a mechanistic understanding of said neurodegenerative disease

2) successfully treat the supposed “causal mechanism” of that disease but fail to actually cure the symptoms of the disease

3) convince the FDA that the drug should be approved anyways because they treated the causal mechanism, even though they failed their own clinical trial

To quote the great Dr. Doofenshmirtz, if I had a nickel for every time that happened, I’d have two nickels. That isn’t a lot, but it’s weird that it happened twice.

From the outside looking in, it can be difficult to explain how we got here. A lot of the Twitter commentariat would simply respond with “Because the FDA is corrupt”, but I don’t think that’s fair¹. For one thing, the tofersen approval was supported by the FDA’s panel of independent advisors, the advisory committee or adcom. This is entirely unlike the aducanumab approval, which the adcom opposed so strongly that three experts resigned from it in light of the approval. Even if the FDA is corrupt (which I don’t think it is), the professors and doctors who make up the adcoms that meet for every drug approval are not.

Instead, I think you can trace Biogen’s weird success in getting the FDA to approve clinically ineffective drugs back to a certain way of viewing medicine and drug development that’s infected both the industry and the regulators that run it. It’s a view that’s top-down and “objective”, free from the shooting in the dark that characterized all of drug development in its earlier decades, and still much of drug development and medicine now. It disdains the blind testing of massive numbers of compounds and the development of medicines that work for unclear reasons.

This infectious view is the “rational” view of drug development. Rational drug development has been used by a lot of people to mean a lot of things (especially as a marketing term), but the most important component of rational drug development is the assumption that every part of the body is mechanical: the body works by one component physically affecting another component, like a Rube Goldberg device. 

Diseases, in the rational view, also work the same way, but are the result of malfunctioning Rube Goldberg devices, like a waste product failing to be removed, building up, and then poisoning organs. So, to continue the analogy, if we carefully design our drugs to prevent one step in the process (e.g. the buildup of the waste product), we can prevent the end result of the Rube Goldberg device, which is the physical symptoms of the disease.

When pharmaceutical companies view diseases in this sequential way, they start to think of drug development as an engineering problem. They let the academics define the steps of the Rube Goldberg machine and probably also a way to jam a step. Then they swoop in and optimize the jammer. They make it exceptionally good at jamming that step and make sure it doesn’t jam anything else. Then, at every point in the drug development process, they measure how well the jammer works, based on how much of the next step still gets completed. They measure it in vitro (in a Petri dish), then in mice, and maybe then in healthy people. And, by the time they measure it in sick people, there are no surprises. At least, there shouldn’t be.

This way of thinking is also very appealing to regulators. When a company develops a drug, they talk constantly with the FDA to make sure their hundreds of millions or billions of dollars in development costs are going towards trials that would be convincing to the FDA. Now, the FDA sees their role in drug development as the following, roughly in order of importance:

1. Make sure that harmful drugs do not make it to the public (or, ideally, even to volunteers).

2. Make sure that ineffective drugs do not make it to the public.

3. Make sure that effective drugs do make it to the public.

For a long time, number 3 on that list was mostly an afterthought. The FDA was so scared of a high profile failure on their most important priority of preventing dangerous drugs from making it to the public² that they were just straight up gatekeepers.

But that all changed in the 70s with Nixon’s War on Cancer. JFK had put an American on the Moon by the end of the 60s, and goddamnit Tricky Dick was going to put an end to cancer by the end of the 70s. Hundreds of billions of dollars in federal funding poured into cancer research, including expanding the National Cancer Institute within the NIH and funding cancer centers around the country.

So suddenly the FDA got a lot of pressure from the top and the bottom to help end cancer. From the top, Tricky Dick and his cronies had promised the American people explicitly that they were going to end cancer by the year 1980. From the bottom, every university in the country started doing research on cancer and drug companies got a ton of incentives to help develop cures. The FDA was feeling the heat.

But, of course, they didn’t want to just roll over and let any snake oil peddler get a possibly harmful chemo treatment on the market. So what was the FDA to do? Well, first, they tried just repurposing some random drugs that looked like they had anticancer activity. This led to the rapid approval of megestrol acetate, a hormonal contraceptive that had limited success in treating sex-related cancers like endometrial cancer, and tretinoin, an acne treatment which had, again, limited success in leukemia. But this well quickly ran dry.

Then, quietly, and without much fanfare, the FDA relaxed rule 1. It was now ok for drugs to be harmful, as long as they were at least somewhat effective against cancer, because the alternative was death. So, the FDA started approving a bunch of nasty drugs. This included bleomycin, approved in 1973, which is the only drug I’ve ever heard of that has a lifetime limit (if you take more than 400 units it irrevocably scars your lungs), and doxorubicin, approved in 1974, which is so nasty that doctors nickname it “red death”.

This weird state of affairs with the FDA approving nasty drugs with limited effectiveness continued until the 80s, when suddenly: a plot twist! AIDS protesters began accusing the FDA of withholding promising treatments behind insurmountably long and expensive trials. These protesters eventually took the unprecedented step of picketing the FDA offices, calling them murderers.

In 1992, after 10 years of harassment by AIDS protesters, the FDA finally acquiesced to their demands, kind of. Along with allowing companies to run pilot trials of their drugs in sick patients before approval, the FDA created a new pathway for drugs to be approved: the accelerated approval pathway. The accelerated approval pathway meant that, in life-threatening diseases with “unmet clinical need”, the FDA could approve drugs based on a “surrogate endpoint” instead of a “clinical endpoint”, given that the company promised to run a trial on a clinical endpoint after approval. Or, in other words, companies could get drugs approved based on something that looked like it was related to curing the disease, rather than the disease actually being cured, as long as they promised to do a proper trial later.

Now, the idea of using a surrogate endpoint wasn’t new. Surrogate endpoints had been used before, like in asthma, where the FDA often used the increase in forced expiratory volume (how much air people can blow out) after exercise to measure the effectiveness of a new inhaler rather than the more nebulous (no pun intended) issue of an improvement in asthma in general. What was new is that these accelerated surrogate endpoints didn’t have to be clinically proven. They just had to be “reasonable”, or, we might say, rational.

And now you see where rational drug development comes in. Accelerated approval is tailor-made for rational drug development, as both rely on reason to justify each step in the drug’s development and approval. But what you might not see immediately is that accelerated approval also changes the role of the FDA. 

If the FDA is also trying to figure out what’s “reasonable” for a surrogate endpoint without any clinical evidence, they are implicitly putting themselves into more of a coach role than a judge role. The reasonableness of a given surrogate endpoint depends a lot on how it was chosen, which then goes back to how the drug was developed, and, if a company is smart, how much they involved the FDA in the drug’s development.

Accelerated approval was made for AIDS/HIV drugs, but pharmaceutical companies quickly realized how useful it could be for cancer drugs. They were still getting lots of incentives to develop cancer drugs, the FDA was still allowing cancer drugs with minor to moderate benefits and severe side effects to be approved, and it wasn’t hard to find cancers that were life-threatening and poorly treated, so why not use this new pathway? They just needed to find a proper surrogate endpoint.

Their first surrogate endpoint had been just measuring tumors by the “objective response rate”. This was, quite literally, just the doctors’ estimation of whether the tumors felt like they shrank when they palpated them (no, really, that was the entirety of the “objective” measure). It had been used as a secondary endpoint for a lot of the cancer drugs that had been approved in the 70s and 80s.

But, by 1992, doctors started complaining that it wasn’t exactly objective, despite the name. So, a new surrogate endpoint was developed, “disease free survival” (along with its close cousin, “progression free survival”). This kind of surrogate endpoint was based on how long people live with cancer without their symptoms getting noticeably worse. Adopting this new surrogate endpoint had a few benefits.

For the patients, it tied the endpoint a little more closely to stuff that patients cared about (although still not overall survival). For the drug companies, it allowed them to argue that their drug halted or delayed the progression of cancer, even if it didn’t cure it or shrink the tumors. And, for the FDA, it allowed for the creation of increasingly elaborate stages of cancer treatment, with many multi-step flow charts that only PhDs could keep track of but which seemed like progress.

[Image caption: What to do if you have prostate cancer.] Side note: prostate cancer is insanely common among elderly men, and mostly does not matter clinically and should not be treated. As my ex’s med school professor said, “Most men die with prostate cancer, not of it.”

With this new surrogate endpoint in hand, cancer companies flourished. Not only were cancer drugs 36% of accelerated approvals from 1992 to 2010 (vs. 40% for HIV drugs), they were actually a full 85% of accelerated approvals from 2010 to 2020.

All of this activity famously resulted in relatively weak progress in treating cancer compared to the literal hundreds of billions spent on it. Even now, in the year 2023, where we have some pretty miraculous cancer drugs, the FDA is still approving drugs that do not result in a statistically significantly higher overall survival rate than the standard of care, like abemaciclib for certain forms of breast cancer. That is, if you have one of the forms of breast cancer that abemaciclib is approved for and you get abemaciclib added onto your treatment regimen, you will not live any longer overall. You’ll just get a little more time where your breast cancer isn’t advancing, before it comes back with a vengeance and kills you anyways at the same time it was going to in the first place. If this still sounds like a good bargain, you’ll also be fighting off statistically higher levels of immunodeficiency and diarrhea, so, uh, have fun. Also, you (or, hopefully, your insurance) will be paying $14k/month for this privilege.

And, for the record, the drug I just mentioned wasn’t cherry-picked. I just went to the FDA’s webpage for most recently approved cancer drugs and clicked on a few until I saw one that did not improve survival. Try it yourself on the FDA’s recent approvals website and I guarantee you will find a terrible cancer drug in 30 seconds.

Now, for Internet nerds like you and me, this all likely sounds like, at best, mixed success on the part of the FDA’s “Oncology Center of Excellence”. But, think about it from the perspective of the rest of the FDA. The Oncology Center of Excellence, unlike every other part of the FDA, is approving a ton of drugs, is in constant contact with patient advocacy groups and incredibly well funded non-profits (the American Cancer Society alone brings in $700+ million per year), and regularly collaborates with industry to design super clever trials. That can start to make your own division look pretty boring.

This is especially true if your division is something like Neurology, which traditionally has had a lot of problems getting successful drugs. In fact, none of the big diseases in neurology have any successful drugs that can really modify the course of the disease. It can probably be a pretty depressing field to be a regulator in.

So, when there started to be a partly grassroots, partly astroturfed groundswell of support for new treatments in neurology from well-heeled, high-profile nonprofits, like the ~$500 million/year Alzheimer’s Association and their $1.5 million/year salaried CEO, as well as pressure from legislators, like the ALS Association’s roster of senators who’ve explicitly promised to pressure the FDA to help end ALS, the neurology division, too, wanted to become “rational”³. Maybe the problem in neurology all along was that they weren’t rational enough, and that they, too, were throwing out the baby with the bathwater by focusing solely on non-harm and then on clinical efficacy, without ever considering mechanistic understanding.

Now, there’s nothing wrong with this, per se. The FDA’s insistence on making sure no possibly harmful drugs ever made it to healthy volunteers did probably kill some drugs that could have been helpful in patients. And, ironically, the FDA’s focus on clinical efficacy above mechanism benefited some drugs that definitely weren’t helpful in neurodegenerative diseases, and that we had zero reason to believe would be, but that skated by on sketchy trial data⁴.

The problem is how this rationality is applied. And that, at last, brings us back to tofersen, Biogen’s ALS drug. Tofersen is a great example of misapplied rationality. And to prove it, all we need to do is to look at Biogen’s, well, rationale.

You see, one of the really nice things about drug development is that so much of it is public. Not only do drug companies publish a bunch of their research, but their arguments to the FDA’s experts are public. That gives transparency to how drug companies think about these diseases (or at least how they want the FDA to think about them), which can illuminate a lot more than what the company would like us to see.

So, if we dive into Biogen’s presentation to the FDA, we can see exactly how they approached this. Looking at their slide presentation, they started off thinking about wanting to tackle ALS in general. But, it was too confusing for them. There were too many mechanisms which all seemed to lead to the same pathology, and they didn’t think they could figure out an effective way to tackle all of the Rube Goldberg machines at once.

Instead, they thought, let’s tackle just one, the absolute simplest, most clear Rube Goldberg machine in ALS. We’ll nail this one and then work on the others.

In Biogen’s eyes, that simple Rube Goldberg machine was SOD1 ALS, a rare genetic variant of ALS that affects only about 3,000 people worldwide. Biogen chose this one because the logic seemed so clear:

1. Academics had identified SOD1 as a gene in which mutations led to ALS. Certain ALS patients tended to have mutations in this gene, and misfolded SOD1 proteins were present in ALS patients’ spines. Inducing these mutations in mice led to ALS-type symptoms.

Therefore, according to Biogen, the ALS Rube Goldberg machine works by accumulating too much of this bad protein, at least in this version of ALS. Some ALS patients have this gene, this gene leads to bad proteins in mice and humans, and these bad proteins lead to ALS symptoms in mice. 

2. Preventing the expression of mutant SOD1 using a newish technology called antisense oligonucleotides (ASOs) not only stopped the development of ALS type symptoms in mutant mice, but it stopped the accumulation of neurofilament in the blood and cerebrospinal fluid of mice, which Biogen had become convinced was the ultimate sign of neurodegeneration in ALS.

Therefore, according to Biogen, it seemed pretty clear that the ALS SOD1 Rube Goldberg machine could be stopped with an ASO jammer. The ASO jammer stopped the mutant SOD1 protein from being made; it stopped the ALS symptoms; and it stopped the accumulation of the thing that Biogen thought was most obviously caused by the SOD1 protein damage. 

3. Using ASOs in people with ALS was pretty safe and it resulted in reductions in SOD1 mutated proteins and neurofilament. 

Therefore, according to Biogen, their ASO, tofersen, would prevent SOD1 mutations from leading to ALS in humans.

Biogen’s favorite part of all of this was how quantifiable it was. In every step, it wasn’t just “people with SOD1 mutations look worse” or “it looks like they have ALS”. It was “people with quantifiably high levels of mutated SOD1 proteins in their cerebrospinal fluid have quantifiably high levels of neurofilament”.

This, in Biogen’s eyes, made SOD1 ALS the perfect Rube Goldberg machine to stop. Every part of it was quantifiable. That meant that they could also deliver quantifiable amounts of their Rube Goldberg jammer, tofersen, and say, “This amount of tofersen stops this amount of neurofilament build up.” Then, of course, they could start talking about how much that amount of tofersen costs (cost of goods sold), build a revenue model, and basically act like this is an engineering problem instead of a scientific one.

Unfortunately, as we already know, because Biogen only wanted to focus on quantifiable, rational parts of drug development, they ended up just assuming that the non-quantifiable (or semi-quantifiable) parts of developing this drug (i.e. the scientific unknowns) would take care of themselves. This was not the case.

Notably, once it came time to administer the drug to people, there was almost no difference in the ALS functional rating scale, the ALSFRS-R, between people who got the drug and people who got placebo at 28 weeks. Now, this rating scale is, at best, semi-quantifiable. It assigns people with ALS a score of 0-4 on each of 12 measures (for a maximum of 48 points), with 4 being the best and 0 being the worst. So, someone who can speak normally gets a 4, someone who slurs a bit gets a 3, and non-verbal people get a 0.

Because it’s semi-quantifiable and difficult to translate to mice, Biogen basically chose to ignore the ALSFRS-R all throughout their development. Even when it came to their final, pivotal trial, the one that they knew this rating scale would be the “primary endpoint” for (i.e. the main thing that the FDA judged them on), they still chose to ignore it. They just assumed that it would, well, behave rationally. This assumption bit them in the ass. Or, at least, it should have.

It’s interesting reading Biogen’s responses after their trial failed. Along with a bunch of post-trial data slicing in a desperate attempt to find some significance somewhere, they rail against how fundamentally unfair a placebo-controlled trial on a semi-quantitative measure is. They are so frustrated that their placebo group declined more slowly than they had expected, that there were no deaths attributable to ALS in either group, and that participants decline at different rates on this scale.

“All this real-world data is so messy,” you can almost hear them complaining behind their dry language and copious citations. “None of it is predictable. It doesn’t behave at all as it should. Why are we even paying attention to it?”

Now, I don’t think it was this complaining that actually sealed the deal with the FDA. I think that was probably the open-label extension, in which all the participants in the trial got switched to treatment after the end of the trial, regardless of whether they were on treatment before. This is standard for diseases where there are no other treatments, with the idea being that some hope of success is better than none.

In this open-label extension, people who got tofersen in the first place did actually do significantly better at the one-year mark than people who got tofersen later (i.e. they had, on average, a 3.5 point higher score on the ALSFRS-R rating scale, which is not a lot but might be the difference between not walking and walking with assistance). Biogen argued this was because tofersen actually just takes a while to work.

It’s not quite so simple as Biogen would like, though. By the time the open-label extension finished, 15/72 people had dropped out of the early tofersen group, and those left might have just been the people whose ALS happened to progress slowly. The trouble is that neither Biogen nor the patients know in advance whether an individual patient’s ALS will progress quickly or slowly (e.g. end up on a feeding tube in 2 years or 4).

The entire point of randomizing a trial is to try to allocate equal numbers of slow progressors and fast progressors to the treatment and placebo arms, and to avoid thinking that a treatment worked better than placebo when in fact the treatment group just randomly would have done better than the placebo group anyways. The people who don’t drop out of a treatment group (and, in fact, choose to continue treatment even after the study has concluded) might be people who were helped by the treatment, or might be people who would have done comparatively well even if they hadn’t been on treatment. It’s impossible for either Biogen or the patients to know. These sorts of complications in interpretation are exactly why the FDA has traditionally been very strict about primary endpoints being what determines the approval of a drug.
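If you want to see how much dropout alone can flatter a drug, here is a minimal sketch in Python. To be clear about what is made up: the only numbers taken from this post are the 72 people in the early tofersen group, the 15 dropouts, and the roughly 9.5 point average decline. The normal distribution, the 5 point spread, and the (deliberately extreme) assumption that the dropouts are exactly the fastest progressors are mine, purely for illustration. This is not Biogen’s data or analysis. The simulated drug does absolutely nothing.

```python
import random
import statistics


def simulate_completer_bias(n_patients=72, n_dropouts=15, seed=0):
    """Compare average decline in everyone vs. only those who stay in the study.

    The simulated drug has NO effect: each patient's decline is drawn from the
    same hypothetical distribution no matter what they were given.
    """
    rng = random.Random(seed)

    # Hypothetical one-year decline in ALSFRS-R points for each patient.
    # Mean of 9.5 echoes the figure quoted in the post; the spread of 5 is an
    # assumption. Declines are clipped at 0 so nobody "improves".
    declines = [max(0.0, rng.gauss(9.5, 5.0)) for _ in range(n_patients)]

    # Extreme illustrative assumption: the patients who drop out are exactly
    # the ones declining fastest, so only the slowest progressors remain.
    completers = sorted(declines)[: n_patients - n_dropouts]

    return statistics.mean(declines), statistics.mean(completers)


all_mean, completer_mean = simulate_completer_bias()
print(f"Average decline, everyone enrolled: {all_mean:.1f} points")
print(f"Average decline, completers only:   {completer_mean:.1f} points")
```

The completers always look better than the full group here, even though nothing was treated. Real dropout is messier than “the 15 fastest progressors leave”, so the real bias would be smaller, but it points in the same direction, and nobody can say by how much.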

On a very related note, it’s pretty weird that Biogen argued that stopping the SOD1 mutation and preventing the accumulation of neurofilament was the absolute be-all and end-all, and that they had definitively understood the ALS Rube Goldberg machine and how to stop it, and then their strongest effect in ALS (by their own accounting!) was… possibly making people decline only 6 points on the 48 point rating scale rather than 9.5 points. Not exactly a miracle cure!

And it’s especially bad that Biogen was so successful with these rational arguments because their success is going to have a knock-on effect on ALS drug development and drug development in neurology more generally, especially if they’re financially successful with this drug. If Biogen can make money with this approach, VCs and angels will insist that others follow this approach as well. This will also take up the limited clinical trial resources available in a small disease like ALS (and especially small, genetically distinct subsets of ALS), making even testing alternative approaches to this disease much more difficult. I would say this is not an ideal situation!

But, the FDA, at the end of the day, does not care what I have to say. Instead, they bought Biogen’s argument about rationality and the open label extension, and let Biogen sell the drug in exchange for promising to do a follow-up trial later, with no actual enforcement on when that “later” is. That is, Biogen once again successfully argued before the FDA Neurology division that:

1) Weak clinical evidence that is subject to interpretation plus

2) Overwhelming “rational” evidence that they successfully reduced the accumulation of their biomarker means that

3) Their drug is effective for patients in an incurable neurodegenerative disease

And somewhere, Dr. Doofenshmirtz sighs as he finds another two nickels in his pocket.

1

Although it is pretty suspicious that Billy Dunn, who was in charge of the FDA’s Neurology Division during the aducanumab decision/debacle, joined the well-paid Prothena Board of Directors this past May. Prothena is a company trying to convince the FDA to approve their therapeutic for Alzheimer’s. They may want Billy Dunn on their Board for reasons other than his scientific expertise.

2

Like the infamous “morning sickness pill that causes birth defects” thalidomide, although that was actually never approved in the US, in large part because the FDA demanded further testing.

3

This is about a decade or two after academia/industry also became “rational” about neurology, exemplified most prominently by the amyloid hypothesis in Alzheimer’s. As that link explains, the NIH got so captured by the proponents of the eminently rational hypothesis that Alzheimer’s is caused by a buildup of amyloid protein that it became almost impossible to get funding for studying anything else in Alzheimer’s. 

4

Although this was more easily fixed by the FDA insisting on stuff like pre-registering trials, not hiding failed trials, and avoiding p-hacking. The FDA got way more serious about this stuff in the 2000s, but earlier drugs were grandfathered in. This is why we still have some neurodegenerative “treatments” from the 90s and early 2000s that nobody talks about, like memantine, a drug that was approved for Alzheimer’s in 2003 and is safe but definitely doesn’t work.