Mar 23, 2026
A Comparative Analysis of Ensemble Protocols for AI Control
Ariel Monzon, Natan Katz, Inbar Shulman, Mike Vazan, Or Shemesh, Guy Porat
We empirically evaluate ensembles of AI control protocols and show that performance depends on signal diversity rather than the number of tools: small, diverse ensembles outperform larger, redundant ones. Correlation between tools limits gains from stacking, while consensus-based aggregation improves robustness, providing a principled basis for ensemble design.
Congrats on a very thorough project! Mapping ensemble interactions across 100+ tool combinations is ambitious for a hackathon weekend. The report is really clear and well presented.
Impact & Innovation: The research question, although not groundbreaking, is interesting, and answering it successfully would yield concrete safety gains. The two novel detection mechanisms are well-motivated. As you flagged, the non-adaptive adversary does limit how directly these ensemble results transfer to deployment, though a full red-team exploration is understandably beyond hackathon scope.
Execution Quality: The experimental design is impressively thorough. Two gaps worth noting:
- The comparison between ensembles and individual tools isn't compute-matched. A 3-tool ensemble gets three inference passes per sample while a single protocol gets one. Therefore, it's unclear whether the gains come from complementary signals or simply from more compute. The right control would be to run each individual protocol multiple times and aggregate those results, then compare against a tool ensemble at equal compute budget. Without this, the central claim about diversity vs. quantity is confounded.
- There are no confidence intervals in the results or error bars in the figures. Adding them would help readers judge whether the observed differences between ensembles are statistically meaningful rather than noise.
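On the confidence-interval point, a percentile bootstrap over per-sample outcomes is one lightweight way to get the missing error bars. The sketch below is illustrative only: `outcomes` is hypothetical data, not results from the paper, and the authors' actual metric may differ.

```python
import random

def bootstrap_ci(scores, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of a per-sample metric
    (e.g. 1 if a backdoored sample was caught, else 0)."""
    rng = random.Random(seed)
    n = len(scores)
    # Resample with replacement n_boot times and record each resample mean.
    means = sorted(
        sum(rng.choice(scores) for _ in range(n)) / n
        for _ in range(n_boot)
    )
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot)]
    return lo, hi

# Hypothetical per-sample detection outcomes (1 = backdoor caught).
outcomes = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
lo, hi = bootstrap_ci(outcomes)
print(f"mean={sum(outcomes)/len(outcomes):.2f}, 95% CI=({lo:.2f}, {hi:.2f})")
```

The same resampling loop also gives the compute-matched control suggested above: bootstrap both the repeated-single-protocol aggregate and the tool ensemble, then compare the intervals rather than the point estimates.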
Presentation & Clarity: The paper is well-written and concise with a clear narrative arc. Figures are well-chosen and informative. Minor notes: some figure captions carry substantial analytical claims that would read better in the main text.
Strong empirical contribution! Really enjoyed reading through the project.
Exhaustively evaluating all tool combinations across multiple aggregation methods is thorough and well-designed. The central finding (mechanism diversity matters more than tool count, and the best small ensemble outperforms the full ensemble) is counterintuitive and immediately actionable. The two novel detection mechanisms are both clever. ACP exploiting backdoor brittleness for semantically preserving perturbations is a particularly nice insight.
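For readers less familiar with the aggregation methods compared here, consensus aggregation over an ensemble reduces to a simple vote across monitors. The sketch below is a minimal illustration, not the authors' implementation; the function name and the strict-majority default are my assumptions.

```python
def consensus_flag(votes, threshold=None):
    """Aggregate binary suspicion votes from k monitoring tools.

    votes: list of bools, one per tool (True = tool flagged the sample).
    threshold: minimum positive votes needed to flag; defaults to a
    strict majority. Correlated tools tend to vote together, which is
    why adding redundant tools adds little beyond this baseline.
    """
    k = len(votes)
    if threshold is None:
        threshold = k // 2 + 1  # strict majority
    return sum(votes) >= threshold

# Three hypothetical monitors disagree; the strict majority decides.
print(consensus_flag([True, True, False]))   # flagged
print(consensus_flag([True, False, False]))  # not flagged
```

Raising `threshold` trades false positives for false negatives, which is the same robustness knob the report's consensus results turn.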
Cite this work
@misc{monzon2026ensemble,
  title={(HckPrj) A Comparative Analysis of Ensemble Protocols for AI Control},
  author={Ariel Monzon and Natan Katz and Inbar Shulman and Mike Vazan and Or Shemesh and Guy Porat},
  date={2026-03-23},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


