Jan 11, 2026
Vector Forge
Ayşe Asude Demir, Tuğrul Demir
We introduce Vector Forge, a fully automated agentic tool that transforms behavioral descriptions into verified steering vectors without needing manual data collection. Our approach bridges the gap between labor-intensive dataset curation and unstable one-shot optimization methods. Vector Forge accelerates safety research by enabling rapid detection of emerging deceptive LLM behaviors.
The experiments are comprehensive, and mitigating dangerous LLM behaviors with steering vectors is an interesting direction.
Impressive engineering with some interesting results! The multi-stage pipeline is well thought through and the price of execution makes this genuinely accessible. The honest reporting of mixed results is appreciated. With some additional validation (different model, evals, seed stability, ablation on the filtering stages) I could see this being a tool the community uses and builds on!
Cite this work
@misc{
  title={(HckPrj) Vector Forge},
  author={Ayşe Asude Demir and Tuğrul Demir},
  date={2026-01-11},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


