Jun 13, 2025
Red Teaming A Narrow Path: ControlAI Policy Sprint by Aritra Das and Vaani Goenka
Vaani Goenka, Aritra Das
This research analyses two proposed AI governance policies (prohibiting recursive self-improvement in AI systems and mandating safety cases for deployment) through historical precedent analysis, agent-based modeling, and formal verification. Examining failures in analogous regulations (Basel II, BWC, NSG, Wassenaar, SEC rules), we identify systemic vulnerabilities: definitional ambiguity, enforcement leakage, and the impossibility of exhaustively enumerating safety scenarios for advanced AI. Agent-based simulations demonstrate that even minor probabilities of enforcement algorithm leakage or successful system ‘relabeling’ lead inevitably to policy failure, with evasion skill rapidly outpacing controls. Testing with current state-of-the-art (SoTA) LLMs reveals inherent knowledge leakage risks, while formal analysis proves the fundamental contradiction in assuming finite safety cases for superintelligent systems. Our findings urge policymakers to prioritize precise definitions, acknowledge inherent verification limits for AIs improving AIs, and develop dynamic, leakage-resistant enforcement mechanisms, recognizing that the proposed controls offer incomplete solutions against determined circumvention.
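To illustrate why even a small leakage probability compounds over time, here is a minimal sketch. It is not the authors' agent-based model: it assumes independent enforcement periods with a constant, hypothetical per-period leak probability `p`, and the function name and parameter values are illustrative only.

```python
def cumulative_leak_probability(p: float, periods: int) -> float:
    """Probability of at least one leak across `periods` independent periods.

    Assumption (not from the source): each period leaks independently
    with the same probability `p`.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if periods < 0:
        raise ValueError("periods must be non-negative")
    # P(no leak in any period) = (1 - p) ** periods, so take the complement.
    return 1.0 - (1.0 - p) ** periods


if __name__ == "__main__":
    # Hypothetical 1% per-period leak probability over growing horizons.
    for n in (1, 10, 100, 1000):
        print(n, round(cumulative_leak_probability(0.01, n), 4))
```

Under these assumptions the cumulative probability approaches 1 as the number of periods grows for any `p > 0`, which is one simple way to read the claim above. The authors' simulations also model evasion skill, which this sketch omits.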
Tolga Bilge
This is a great project. I'd be excited to see these researchers continue to apply themselves to problems in AI policy.
Dave Kasten
Overall, this is a thoughtful critique. The Basel II example is a really good one to use with policymakers and I am immediately thinking about places to use it. I need to learn more, similarly, about the SEC kill-switch example you raise -- it sounds fascinating but I was previously unfamiliar with the case.
I liked the agent-based modeling as well, though I think it could have been helpful to provide clearer "so what" impacts of the analysis.
Tolga Bilge
Very interesting and novel approach, enjoyed reading it.
The concrete proposals seem broadly quite sensible.
Would like to see these researchers continue to apply themselves to policy.
Cite this work
@misc{
  title={Red Teaming A Narrow Path: ControlAI Policy Sprint},
  author={Vaani Goenka and Aritra Das},
  date={2025-06-13},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}