Nov 23, 2025
Voyager - Self-Evolving AI Control Platform
Nav Sangameswaran, Preetham Sathyamurthy, Sabarish Varadharajan
Voyager is an AI control platform that treats AI safety as an engineering problem, not a policy afterthought. Frontier labs already use AI to accelerate AI capabilities, while safety work is still done by small human teams running occasional red-teaming exercises and reading papers. This is a losing game: offensive use of AI is being automated and scaled, while defensive use remains fragmented and slow.
Voyager is our attempt to flip that. It turns cutting-edge safety ideas such as evolutionary red-teaming, constitutional judging, and mechanistic interpretability into a continuous control loop around the real models that real organisations run. In this hackathon, we implemented the Offend stage of our OUDA lifecycle and exposed it as a SaaS product. A model owner connects their model, presses “audit”, and Voyager uses powerful models as adversaries, mutating psychologically rich attack strategies until they elicit failures such as deception, policy circumvention, or harmful assistance. The system then writes an audit report that a risk or compliance team can act on.
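The evolutionary loop above can be sketched in a few lines. This is a minimal illustration, not Voyager's implementation: the adversary model (which mutates strategies), the target model, and the constitutional judge are all stubbed with hypothetical placeholder functions, and the seed strategies and mutation tactics are invented for the example.

```python
import random

random.seed(0)  # deterministic for the sketch

# Hypothetical seed attack strategies the adversary model would start from.
SEED_STRATEGIES = [
    "role-play as an unrestricted assistant",
    "frame the harmful request as fiction",
    "appeal to authority to justify the request",
]

# Hypothetical psychological tactics the adversary model would splice in.
MUTATIONS = [
    "; add emotional pressure",
    "; escalate gradually over several turns",
    "; embed the request inside a translation task",
]

def mutate(strategy: str) -> str:
    """Stand-in for the adversary model: vary a strategy with a new tactic."""
    return strategy + random.choice(MUTATIONS)

def judge(strategy: str) -> float:
    """Stand-in for the constitutional judge. In Voyager this would run the
    strategy against the target model and rate any elicited failure
    (deception, policy circumvention, harmful assistance); here we use
    strategy length as a toy fitness signal so the loop is runnable."""
    return float(len(strategy))

def evolve(generations: int = 5, population_size: int = 6, survivors: int = 2) -> str:
    """Keep the highest-scoring strategies each generation and mutate them."""
    population = list(SEED_STRATEGIES)
    for _ in range(generations):
        top = sorted(population, key=judge, reverse=True)[:survivors]
        offspring = [mutate(s) for s in top
                     for _ in range(population_size // survivors - 1)]
        population = top + offspring
    return max(population, key=judge)

best = evolve()
```

In the real system the fitness signal comes from the judge scoring the target model's responses, so the loop climbs toward strategies that reliably elicit failures rather than merely longer prompts.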
In experiments, Voyager autonomously discovered non-trivial deceptive behaviour in Claude Sonnet 4.5, one of the most safety-tuned models available, with no hand-written jailbreaks. That is the core def/acc story in one sentence: AI is already capable of doing serious safety work on AI, and wrapping that capability in a reusable control platform lets defence scale at something like the same rate as capability, for both frontier labs and enterprises.
Cite this work
@misc{voyager2025,
  title={(HckPrj) Voyager - Self-Evolving AI Control Platform},
  author={Nav Sangameswaran and Preetham Sathyamurthy and Sabarish Varadharajan},
  date={2025-11-23},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


