Mar 23, 2026
RecruitmentArena: AI control setting for multi-agent recruitment, manipulation and collusion
Ana Belen Barbero Castejon, Carlos Vecina Tebar
AI control research aims to ensure that misaligned agent behavior is detected or constrained before it causes harm. Existing AI control frameworks focus primarily on single-agent settings and direct misbehavior. But as AI systems increasingly operate in multi-agent architectures, a critical gap emerges: harm can arise through inter-agent influence and coordination, where no single action reveals the broader intent. We introduce RecruitmentArena, an open-source evaluation framework that treats agent recruitment (one agent inducing others to advance a hidden objective) as a first-class threat in AI control settings.
We contribute (1) a formal definition of recruitment attacks; (2) a six-dimensional taxonomy that characterizes how attacks are initiated, how influence is exerted, how the payload is structured across agents, how recruitment propagates, when it occurs, and how it evades detection; and (3) an extensible framework for evaluating recruitment susceptibility, the performance-safety trade-off, and control protocol effectiveness. Inspired by coordination patterns in online criminal organizations, we instantiate four attack scenarios and evaluate them against two control protocols.
(https://github.com/AnaBelenBarbero/RecruitmentArena)
Cite this work
@misc{
  title={(HckPrj) RecruitmentArena: AI control setting for multi-agent recruitment, manipulation and collusion},
  author={Ana Belen Barbero Castejon, Carlos Vecina Tebar},
  date={3/23/26},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


