Which Future?

Astera Institute
February 17, 2026

This essay is the text for a talk on how to wisely navigate risks from transformative technology, especially artificial superintelligence (ASI). It was given at Astera on January 28, 2026.

In 1954 the United States carried out its first full-scale test of a thermonuclear bomb, on Bikini Atoll in the Western Pacific. The test, known as Castle Bravo, was expected by the bomb's designers to yield 6 megatons. They were shocked when it yielded 15 megatons, an excess of 9 megatons that was itself about 600 times the Hiroshima blast. The unexpected radiation fallout caused many deaths – the exact number is disputed – and serious radiation exposure to more than a thousand people, triggering a major international incident.

What went wrong? The bomb contained both the lithium-6 and lithium-7 isotopes of lithium. The bomb's designers believed only the lithium-6 would contribute to the yield, while the lithium-7 would be inert. But the lithium-7 converted to tritium far faster than expected, and that turned out to almost triple the yield. It's a case where a group of outstanding scientists failed to anticipate a deadly possibility latent in nature.

There's a curious coda to this story. A few years earlier, in 1946, physicists had seriously investigated whether a thermonuclear explosion would ignite the nitrogen in the atmosphere, causing it to catch fire and end almost all life on Earth. They concluded it would not1. I'm glad it was the lithium-7 reaction they got wrong, and the nitrogen calculation they got right.

I knew one of the creators of Castle Bravo slightly. In 1997 and 1998 I worked on quantum computing in the theoretical astrophysics group at Los Alamos National Laboratory. One of the other people in the group was Stirling Colgate, who decades earlier had run the 3,000-person diagnostic team for Castle Bravo. In the 90s, Stirling would sometimes join us for lunch.

I liked Stirling. He was one of the most imaginative and daring people I've ever met. Among his many adventures, he'd taught himself to fly a plane so he could chase tornadoes, firing rockets from the plane, hoping to get diagnostic equipment inside a tornado, so he could study how they worked2. According to lab legend he was one of the inspirations behind the movie Twister, but they'd had to tone his personality down for the movie. I never did ask him if that was true. He was the kind of person who, if he was 30 today, I'd expect to be running a successful company in Silicon Valley, or pursuing wildly ambitious technology research projects.

Which future?

Astera's motto is "The future, faster." A good question to ask is: which future? Obviously, we don't want it to be the bleak future imagined by some environmental activists, or those fearful of a large-scale nuclear exchange. No, we want it to be a good future, hopefully a future wildly better than today.

Unfortunately, good intentions don't ensure good outcomes. The people who developed and championed asbestos, DDT, leaded gasoline, and CFCs all intended to help humanity. Naysayers got little traction at the time: the benefits seemed too large, the harms too easy to contest. Humanity took a laissez-faire approach and millions suffered. It's the Castle Bravo problem: dangerous capabilities, latent in nature, which we didn't sufficiently understand until too late.

One common framing of this issue is: how can we balance the risks and opportunities of science and technology? It's a fine question as far as it goes, and it arises often in policy and engineering circles. But it's also often a platitude, something people say to sound wise before doing whatever they were going to do anyway.

The framing only makes sense informed by deeper analysis. What are our best models of risk and opportunity? What powerful ideas underlie the institutions we use to shape science and technology? Can we improve those ideas and institutions? These questions are becoming especially urgent as humanity develops artificial superintelligence (ASI)3. If we're about to instigate an explosion of posthuman intelligences, how can we ensure it goes well?

These questions are too large to answer comprehensively today. But what we can do is examine some models and historical examples, and use them to develop a conceptual armory which improves our understanding of these questions.

AI for virus design

The original talk included a section prior to this one, discussing specific viral pandemic agents. The material covered is all public knowledge, and it makes the talk more concrete, but on balance I see no good reason to collect and present such details in public. (There is a damned-if-you-do, damned-if-you-don't character to discussing risks: critics can dismiss the discussion as too vague if you omit details, or irresponsible if you include them. Regardless of critics, omission is the better call here.) The point of including it in the talk was as a second example, in the vein of Castle Bravo, of humans accidentally stumbling on unanticipated destructive capabilities latent in nature. It also illustrates a kind of scientific dysergy, where moderately concerning individual discoveries can be combined into something much worse – a pattern that could also be illustrated with the history of nuclear weapons. Most of all, it again raises the question: what other unknown destructive capabilities are latent in nature?

We'll table that question, and turn to discussing the more general (and much less specific) problem of tools to intentionally engineer viruses. That sounds horrid in the context of pandemic agents, but if we take a step back, there are many benefits to such tools. Those benefits have driven much work over the past few decades, and that work is now yielding fruit, with new viral delivery mechanisms for gene therapies – like Zolgensma, for spinal muscular atrophy – as well as therapies like T-VEC for melanoma, and progress on phage therapies for antibiotic-resistant bacteria. These are modifications targeting single genes, though there are also promising results from techniques such as directed evolution and rational design.

What's the long-term aim here? The examples I just gave are not de novo design, so much as minor alterations of existing viruses. But many groups are now aiming to design viruses, proteins, and other biological entities from much nearer to scratch. The vision is to develop predictive models good enough to enable rapid exploration of the design space, often in the form of so-called AI "biological foundation models".

The best existing example is protein design4 – AlphaFold 2 and subsequent models can predict protein structure well enough that they're becoming useful for design, despite significant limits that are still being overcome. Efforts like Astera's Diffuse project may provide data that helps improve the design tools further.

Progress on viral design lags behind protein design, but a great deal of effort and strong market incentives are pushing it forward. And as with proteins, there is a mad scientist ideal in which you say what properties you want in a virus, and if it's possible you'll be given a design and synthesis instructions.

I'm not a biologist, but I've noticed a striking divergence among experts: some tell me they are skeptical of this vision, perhaps because the biological foundation models have been so hyped; others seem strongly bought into the vision (bring on the artificial cell). Personally, I think the fundamental scientific and technological interest is so great, and progress so rapid, that we must take seriously the goal of predicting and designing biology, even if artificial cells built from scratch certainly aren't coming tomorrow.

So let's suppose future AI-based models do enable de novo virus design. This will have many benefits, but the possibility of designing pandemic agents (and similar threats) will necessarily give rise to a field of AI biosafety engineering, similar to the safety engineering preventing misuse of large language models. What will safety in such AI models look like?

In early foundation models, the safety engineering is primitive and likely easy to circumvent. Let's use as an example Arc Institute's well-known Evo 2 model5. The model was trained on sequence data from roughly 128,000 genomes. The team's main safety measure was to exclude from the training data all viruses that infect eukaryotic hosts. This worked well, in the sense that the trained model regards actual viruses that infect humans as biologically implausible, and it performs extremely poorly when prompted to generate human-infecting viruses.
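
To make the flavor of that guardrail concrete, here is a minimal sketch of exclusion-by-training-data: dropping viral genomes annotated with eukaryotic hosts from a candidate training set. The record format, field names, and helper functions are hypothetical illustrations for this essay, not the actual Evo 2 data pipeline.

    # Minimal sketch of a guardrail built into the training data: exclude viral
    # genomes whose annotated host lineage includes Eukaryota before training.
    # Record format and field names are hypothetical, not the Evo 2 pipeline.

    EUKARYOTIC_MARKER = "Eukaryota"

    def is_excluded(record: dict) -> bool:
        """True if the record is a virus annotated as infecting a eukaryotic host."""
        if record.get("kind") != "virus":
            return False
        host_lineage = record.get("host_lineage", [])  # e.g. ["Eukaryota", "Metazoa", ...]
        return EUKARYOTIC_MARKER in host_lineage

    def filter_training_set(records: list[dict]) -> list[dict]:
        """Keep only records that pass the exclusion rule."""
        return [r for r in records if not is_excluded(r)]

    demo = [
        {"id": "phage_T4", "kind": "virus", "host_lineage": ["Bacteria"]},
        {"id": "influenza_A", "kind": "virus", "host_lineage": ["Eukaryota", "Metazoa"]},
        {"id": "E_coli_K12", "kind": "bacterium", "host_lineage": []},
    ]
    print([r["id"] for r in filter_training_set(demo)])  # ['phage_T4', 'E_coli_K12']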

While this is superficially encouraging, it seems nearly certain the guardrails can be quickly and easily removed by finetuning the model on human-infecting viruses. The authors imply as much in the paper, saying: "Task-specific post-training may circumvent this risk mitigation measure and should be approached with caution." Of course, there's a strong economic incentive to do such post-training: for applications to gene therapy and the like you need human-infecting, immune-evading viruses. And such finetuning is possible, since the model and training code were all released openly. This isn't a serious approach to safety.

Indeed, similar guardrails have often been easily removed from language models. For instance, a group led by synthetic biologist Kevin Esvelt found it cost only a few hundred dollars to finetune an existing open-source language model so that it would be far more helpful in generating pandemic agents, concluding6:

Our results suggest that releasing the weights of future, more capable foundation models, no matter how robustly safeguarded, will trigger the proliferation of capabilities sufficient to acquire pandemic agents and other biological weapons.

They were talking about language models, but the same seems almost certainly true of biological foundation models as well.

Suppose we succeed in building powerful foundation models to enable biological design. Even deeper than the technical safety problem is the social problem: no matter what safety measures are possible in principle, many organizations will build versions of the models with weaker guardrails, or none at all. That will be true of military organizations such as DARPA; many companies will also have seemingly compelling cases. Guardrails are inherently a slippery slope: easily removed using finetuning, and dependent in any case on subjective social consensus. By contrast, reality is an objective, stable target for investigation.

The real underlying issue is that such models aim to capture an understanding of how biology works. The deeper that understanding, the better the models will function. But the understanding is fundamentally value-free: there's nothing intrinsically "good" or "bad" about understanding, let us say, what makes a protease cleavage site efficient. That's just part of understanding biology well. "Good" and "bad" are downstream of such understanding. And we only get benefits by learning how to control things like immune evasion, the rate of lethal side effects, and the rate of viral spread. When you learn to control a system so as to improve outcomes, you can very often apply the same control to make things worse. Benefits and threats are intrinsically linked.

That is: a deep enough understanding of reality is intrinsically dual use7.

This isn't just true in biology. We've seen a similar pattern play out repeatedly through history, across the sciences. A personally resonant example is the development of quantum mechanics in the twentieth century. This helped lead to many wonderful things, including much of modern molecular biology, materials science, and semiconductors. But it also underpinned nuclear weapons. It's hard to see how you can get the benefits without the downsides. Should we have surrendered quantum mechanics and its benefits in order to avoid nuclear weapons? Some may argue the answer is "yes", but that is far from a universal position. Again we see the pattern: sufficiently deep understanding of reality grants tremendous power for both good and ill8.

Instead of relying on fragile safety engineering, a different response is to say: "look, it's near-inevitable that we will soon build tools to uncover many deadly pandemic agents. Let's also use our improved understanding to defend the world."

There's a lot of work being done toward that end. I will mention just two ideas, both rather speculative. One intriguing idea is from my friend Hannu Rajaniemi, CEO of Red Queen Bio, who has suggested immune-computer interfaces. The idea is that people will wear devices which do real-time detection of environmental threats, and then develop and deploy countermeasures, also in real time. It'd be just-in-time immune system modulation, based on surveillance and response.
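
As a purely schematic sketch of that surveillance-and-response loop: every callable below is a hypothetical placeholder standing in for detection hardware, threat models, and countermeasure platforms that do not yet exist; it is an illustration of the idea, not a real device API.

    # Schematic sketch of an "immune-computer interface" loop: sample the
    # environment, flag threats, design and deploy a countermeasure, repeat.
    # All callables are hypothetical placeholders, not a real API.
    import time

    def monitor_loop(sample_environment, classify_threat, design_countermeasure,
                     deploy, interval_seconds=60.0):
        """Just-in-time immune modulation: detect, design, deploy, then wait."""
        while True:
            sample = sample_environment()        # e.g. air or surface sensor data
            threat = classify_threat(sample)     # None if nothing concerning found
            if threat is not None:
                countermeasure = design_countermeasure(threat)
                deploy(countermeasure)           # e.g. prime the wearer's immune response
            time.sleep(interval_seconds)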

Another possibility is to secure the built environment. A lot of people are putting serious work into that, but I will just mention one amusing and perhaps slightly tongue-in-cheek observation, I believe first made by Carl Shulman: the cost of BSL-3 lab space is within a small multiple of San Francisco real estate (both currently in the very rough range of $1k / square foot). Obviously I don't mean we should all live in BSL-3 labs! But it does suggest that if biological disasters become common or severe enough, we may have both the incentive and the capacity to secure the entire built environment.

I mention this in part because it is very similar to the strategy humanity uses to deal with fire. As of 2014, the US spent more than $300 billion annually on fire safety9. That means investment in new materials, in meeting the fire code, in surveillance to detect and respond to threats, and many other measures. The fire code is expensive and disliked by many people, but it provides a form of collective safety, somewhat similar to the childhood vaccine schedule. We don't address the challenge of fire by putting guardrails on matches, making them "safe" or "aligned". Instead, we align the entire external world through materials and surveillance and institutions.

Existential risk

Let's move from specific examples to broader patterns. The underlying issue is that as humanity understands the world more deeply, that understanding enables more powerful technologies, for both good and ill. You can express that via the following heuristic graph:

[Figure omitted: heuristic graph relating depth of understanding to the power of the resulting technologies, for both good and ill.]