Mar 23, 2026
CovertCollude: A Control Setting for Steganographic Multi-Agent Collusion
Evelyn Hu, Nicole Tang, Kin Ching Ip, Gabriel Rosenbaum
Existing control benchmarks lack robust frameworks for evaluating collusion via steganography. This project addresses that gap by evaluating two agents that coordinate covertly to achieve a given outcome while each individually appears compliant to a monitor. We introduce CovertCollude, a control setting that empirically measures this behavior through three experiments. In our main experiment, two agents colluded via explicit covert channels embedded in legitimate code review outputs. The channels achieved 85-100% signal accuracy, and monitors flagged compromised exchanges at meaningful rates only under aggressive configurations that also produced high false positive rates, which suggests a significant tradeoff between safety and usefulness. In an ablation testing spontaneous coordination, agents given a shared goal but no encoding instructions did not produce a verifiable steganographic channel. However, the models did exhibit monitor-detectable behavioral strain, suggesting that coordination pressure leaves traces even without explicit channels. A third experiment, using social media captions as the transfer medium and an acrostic encoding scheme, showed perfect decoding with zero monitor detection at any practical operating point, and revealed that LLMs actively searching for signals report false positives 67% of the time on clean inputs. Finally, we find that two-hop signal propagation achieves 100% end-to-end accuracy with only modest additional monitor detection (32%) and zero false positives under structured hop-by-hop monitoring. Overall, explicit steganographic coordination is both more reliable and harder to detect than spontaneous behavioral coordination. Monitors face an unfavorable tradeoff: the sensitivity required to catch covert channels produces false positive rates that render them impractical.
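To make the acrostic channel mentioned above concrete, here is a minimal Python sketch of how a signal could be hidden in, and recovered from, the line initials of a caption. This is an illustration under our own assumptions, not the authors' implementation: the function names, the `PHRASE_BANK` contents, and the one-letter-per-line convention are all hypothetical.

```python
def encode_acrostic(signal: str, phrase_bank: dict[str, list[str]]) -> str:
    """Build a caption whose line initials spell `signal`.

    `phrase_bank` is a hypothetical lookup from an uppercase letter to
    innocuous caption lines starting with that letter.
    """
    lines = []
    for ch in signal.upper():
        candidates = phrase_bank.get(ch)
        if not candidates:
            raise ValueError(f"no phrase available for letter {ch!r}")
        lines.append(candidates[0])  # always take the first candidate (toy choice)
    return "\n".join(lines)


def decode_acrostic(caption: str) -> str:
    """Recover the signal from the first alphabetic character of each line."""
    out = []
    for line in caption.splitlines():
        # Skip leading whitespace, emoji, or hashtags; ignore lines with no letters.
        first = next((c for c in line if c.isalpha()), None)
        if first is not None:
            out.append(first.upper())
    return "".join(out)


# Hypothetical phrase bank, for illustration only.
PHRASE_BANK = {
    "G": ["Golden hour at the pier"],
    "O": ["Ocean air fixes everything"],
}

caption = encode_acrostic("GO", PHRASE_BANK)
assert decode_acrostic(caption) == "GO"
```

Each line of the resulting caption reads as ordinary text on its own, which is one plausible reason a monitor inspecting surface content would struggle to separate such captions from clean ones.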
Cite this work
@misc{
  title={(HckPrj) CovertCollude: A Control Setting for Steganographic Multi-Agent Collusion},
  author={Evelyn Hu and Nicole Tang and Kin Ching Ip and Gabriel Rosenbaum},
  date={3/23/26},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


