Mar 23, 2026
ODRArena: Rigorously Evaluating Deep Research Control Policies
Henry Castillo
Deep research agents have become a staple for professionals performing high-level, high-stakes knowledge work, which makes them an attractive target for attackers seeking to scale their impact by stealing sensitive information and compromising critical systems through prompt injection attacks. To model this class of threats and defenses, we develop ODRArena, a control setting built on Open Deep Research to facilitate blue- and red-teaming efforts in realistic deep research scenarios. We provide both a white-box and a black-box setting, emulating open-source systems and closed-source products such as ChatGPT Deep Research, respectively. In our initial red-teaming efforts, we find deep research agents to be robust to prompt injection attacks, which we attribute to the multiple rounds of summarization and compression in the deep research pipeline.
Cite this work
@misc{
  title={(HckPrj) ODRArena: Rigorously Evaluating Deep Research Control Policies},
  author={Henry Castillo},
  date={2026-03-23},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


