Mar 23, 2026
Performative Subversion: Who Monitors the Monitor?
Max Pinelo, Angel Tenorio, Fernando Avalos, Valeria Mora, Sebastian Urrea
Every AI control eval assumes clean code. Real codebases aren't clean: 87% carry vulnerabilities (Black Duck, 2026). When you ask an AI to extend vulnerable code, writing good code means perpetuating the vulnerability. No malicious step. No anomaly to detect. We call this performative subversion.
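To make the failure mode concrete, here is a minimal hypothetical sketch (the schema, function names, and SQL are illustrative, not taken from our trial codebases): an assistant asked to extend a data-access module matches the surrounding style, and the surrounding style is the vulnerability.

```python
import sqlite3

def get_user(db: sqlite3.Connection, username: str):
    # Pre-existing flaw: user input interpolated directly into SQL.
    return db.execute(
        f"SELECT id, email FROM users WHERE username = '{username}'"
    ).fetchone()

def get_order(db: sqlite3.Connection, order_id: str):
    # The "extension": well-typed, documented, consistent with house
    # style, and it faithfully reproduces the interpolation pattern
    # around it. No malicious step anywhere; the injection is simply
    # carried forward into new code.
    return db.execute(
        f"SELECT id, total FROM orders WHERE id = '{order_id}'"
    ).fetchone()
```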
We ran 1,800 trials. Results: 74–81% perpetuation across all framings. The codebase drives everything; prompting barely matters. The scariest finding is transitive perpetuation: the AI calls into vulnerable functions without ever copying insecure code. A scanner sees nothing wrong. The standard monitor? A 97% false positive rate, which makes it useless. Give it codebase context? The FPR drops to 5% and detection stays at 100%, but vulnerabilities still persist at 78%. The only real fix is to clean the code before the AI touches it.
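Transitive perpetuation is the harder case to catch. Another hypothetical sketch (again, the names and schema are illustrative): the new function contains no insecure construct at all, so a diff-scoped scanner has nothing to flag, yet every call still flows through the injection.

```python
import sqlite3

def find_account(db: sqlite3.Connection, name: str):
    # Vulnerable helper that already lives elsewhere in the codebase.
    return db.execute(
        f"SELECT id, balance FROM accounts WHERE name = '{name}'"
    ).fetchone()

def account_summary(db: sqlite3.Connection, name: str):
    # New code with nothing to flag: no raw SQL, no string building.
    # Attacker-controlled `name` still reaches the injection inside
    # find_account(), so the vulnerability persists transitively.
    row = find_account(db, name)
    return {"id": row[0], "balance": row[1]} if row else None
```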
The cleanroom assumption is wrong. 1,800 trials prove it.
Cite this work
@misc{pinelo2026performative,
  title={(HckPrj) Performative Subversion: Who Monitors the Monitor?},
  author={Pinelo, Max and Tenorio, Angel and Avalos, Fernando and Mora, Valeria and Urrea, Sebastian},
  date={2026-03-23},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={\url{https://apartresearch.com}}
}


