Mar 23, 2026
LidaRed: ControlArena Red-Team Dataset Generator
Linh Le, David Williams-King
We present LidaRed, an automated pipeline for generating adversarial red-team datasets for AI control. Built on the ControlArena benchmark, the pipeline iteratively refines attacks using monitor feedback, focusing on omission-based attacks. We achieve attack success rates of up to 60% against Claude Opus 4.6 acting as a safety monitor, suggesting that current frontier models can still miss subtle side-task execution.
Cite this work
@misc{
  title={(HckPrj) LidaRed: ControlArena Red-Team Dataset Generator},
  author={Linh Le and David Williams-King},
  date={2026-03-23},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


