Mar 23, 2026
Toolchain Adversaries: A Control Setting for Supply-Chain Attacks on AI Agent Pipelines
Alexander Liu
Supply-chain attacks on AI agent tool ecosystems have emerged as a practical threat, yet existing AI control evaluations assume the adversary is the model itself. We introduce a complementary control setting where the agent is always honest but a tool in its pipeline is compromised, enabling data exfiltration through the agent's faithful execution of tool outputs.
Overall solidly motivated (this was written before LiteLLM supply chain attack). Supply chain attacks are a *known* issue, so I'm giving this a 4.5 instead of a 5. Execution and Presentation were about as good as I could have expected in a weekend. Overall this is about as high as I'm willing to give something that's essentially a solid proof of concept, and the author notes that this is different from a full-fledged experiment with more examples and ablations.
The new control protocol and conceptual frame is interesting and worth developing. I'd like to see this extended.
Cite this work
@misc {
title={
(HckPrj) Toolchain Adversaries: A Control Setting for Supply-Chain Attacks on AI Agent Pipelines
},
author={
Alexander Liu
},
date={
3/23/26
},
organization={Apart Research},
note={Research submission to the research sprint hosted by Apart.},
howpublished={https://apartresearch.com}
}


