Jun 13, 2025
Four Paths to Failure: Red Teaming ASI Governance
Luca De Leo, Zoé Roy-Stang, Heramb Podar, Damin Curtis, Vishakha Agrawal, Ben Smyth
We stress‑tested A Narrow Path Phase 0—the proposed 20‑year moratorium on training artificial super‑intelligence (ASI)—during a one‑day red‑teaming hackathon. Drawing on rapid literature reviews, historical analogues (nuclear, bioweapon, cryptography, and export‑control regimes), and rough‑order cost modelling, we examined four cornerstone safeguards: datacentre compute caps, training‑licence thresholds, use‑ban triggers, and 12‑day breakout‑time detection logic.
Our analysis surfaced four realistic circumvention routes that an adversary could pursue while remaining nominally compliant:
Jurisdictional arbitrage via non‑signatory “AI‑haven” states.
Distributed consumer‑GPU swarms operating below per‑node licence limits.
Covert runs on classified state supercomputers hidden behind national‑security carve‑outs.
Offshore proxy datacentres protected by host‑state sovereignty and inspection vetoes.
Each path could power GPT‑4‑scale training within 6–18 months, undermining the moratorium’s objectives.
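As a rough-order illustration of the swarm path, the sketch below estimates how many consumer GPUs would have to cooperate to reach a given training budget within the 6 to 18 month window. Every number in it is an assumption made for illustration and none is taken from the paper: the 2e25 FLOP target is a commonly cited public estimate for GPT-4-scale training, and the per-GPU throughput and swarm utilisation are hypothetical placeholders.

```python
import math

SECONDS_PER_DAY = 86_400


def gpus_needed(total_flop, flop_per_gpu_per_s, utilisation, days):
    """Return the number of GPUs needed to deliver total_flop within `days`.

    utilisation is the fraction of peak throughput actually achieved
    (a distributed swarm would likely sit well below 1.0).
    """
    flop_per_gpu = flop_per_gpu_per_s * utilisation * days * SECONDS_PER_DAY
    return math.ceil(total_flop / flop_per_gpu)


# Assumed inputs (illustrative only, not figures from the paper).
TARGET_FLOP = 2e25        # assumed GPT-4-scale training budget
GPU_PEAK_FLOP_S = 1e14    # hypothetical consumer GPU, 100 TFLOP/s peak
SWARM_UTILISATION = 0.2   # hypothetical efficiency of a loosely coupled swarm

if __name__ == "__main__":
    for days in (180, 540):  # roughly 6 and 18 months
        n = gpus_needed(TARGET_FLOP, GPU_PEAK_FLOP_S, SWARM_UTILISATION, days)
        print(f"{days} days: about {n:,} GPUs")
```

Under these assumptions the required node count falls in the tens of thousands, and each node stays far below any per-node limit, which is why the swarm path targets thresholds defined per machine rather than per network.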
To close these gaps, we propose ten mutually reinforcing amendments, including universal cryptographic “compute passports,” quarterly‑updating compute thresholds, an International AI Safety Commission with secondary‑sanctions powers, HSM‑bound weights, kill‑switch performance bonds, whistle‑blower bounties, whole‑network swarm detection, conditional infrastructure aid, and a 30‑month sunset‑and‑iteration clause. Implemented as a single package, these measures transform Phase 0 from a jurisdiction‑bounded freeze into a verifiable, adaptive global regime that blocks every breach path identified while permitting licensed low‑risk research.
Tolga Bilge
Great paper! I have reservations about adding sunset provisions.
Dave Kasten
First off, I appreciate the explicit choice to say "We deliberately favoured substance critiques—those that treat the policies as flawlessly implemented—and set aside feasibility or implementation questions". This is the heart of the exercise! Thank you.
The discussion of regulatory flight is particularly thoughtful, especially with regard to the specific arguments about timeframes for moving to other countries and the examples (we were already familiar with the Soviet bioweapons and Pakistani nuclear program examples and think they are some of the strongest bits of evidence against A Narrow Path, but thinking of FTX as an example of this is particularly inspired).
I think the datacentre discussion is also good, though it doesn't wrestle with the core questions readers are left with: why aren't players in the space doing this already, and how quickly would the optimization pressure of our policies move the companies to overcome the (real but not insurmountable) technical hurdles? The compute passports idea in this context is intriguing, however. The discussion of other ways to identify sufficient criminal suspicion for a search (e.g., power usage or heat emissions, which is in line with how many jurisdictions detect e.g. unlawful marijuana grow houses today) is also thoughtful and better-articulated than most examples I've seen.
The offshore discussion is good, though it immediately made me think of examples of rendition during the US Global War on Terror that I wish you had brought in as well.
The 10 ideas are quite good as well; some of these mirror our proposals (e.g., the IASC inspections) but the specifics of how to make them work are just really sharp.
Overall, this hits the mark of giving me 1. specific additional things that I need to think about to find things to patch and 2. specific additional sentences that I can already use in discussions with policymakers to answer a "how might this work" question, driven by examples.
Cite this work
@misc {
title={Four Paths to Failure: Red Teaming ASI Governance},
author={Luca De Leo, Zoé Roy-Stang, Heramb Podar, Damin Curtis, Vishakha Agrawal, Ben Smyth},
date={6/13/25},
organization={Apart Research},
note={Research submission to the research sprint hosted by Apart.},
howpublished={https://apartresearch.com}
}