Nov 24, 2025
Helix-Aegis: LLM Based screening for bio-sequences
Vishnu Vardhan Sai Lanka
We introduce Helix-Aegis, a prototype defensive screening system designed to
detect hazardous biological sequences (toxins, pathogens, virulence factors)
using fine-tuned Large Language Models. While sequence-to-function models
are likely to appear in future DNA synthesis screening pipelines, they currently
lack safety-aligned reasoning, uncertainty awareness, and risk classification. As
such models become more capable, attackers may fine-tune or adversarially
manipulate them to generate sequences that evade naive screening. Our goal is
to explore whether LLM-based guardrails, analogous to Llama Guard for
text LLMs, can improve the robustness, interpretability, and safety of
protein-sequence screening.
Helix-Aegis clearly addresses a real and rising risk. The team did a great job of outlining why this work is important and beneficial. However, tools from larger commercial providers appear to be far more mature and tested at scale, and those providers have access to more data.
This project would be stronger if positioned as an experiment in adding a risk-aware, interpretable layer on top of existing screening pipelines rather than replacing them. I’d also like to see an evaluation on realistic adversarial or obfuscated sequences, with results compared against the performance of current tools.
Could this tool be used to explain to users why existing black-box tools reject or flag a sequence, rather than making the decision itself?
Cite this work
@misc{
  title={(HckPrj) Helix-Aegis: LLM Based screening for bio-sequences},
  author={Vishnu Vardhan Sai Lanka},
  date={11/24/25},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


