Apr 26, 2026
BioShield-AI
Rohit Singh Rajput
BioShield AI is an advanced biosecurity screening system that detects dangerous DNA and protein sequences by analyzing their function rather than just sequence similarity. It uses protein language models, 3D structure prediction, and pathway analysis to catch novel AI‑designed toxins that evade traditional tools. The system includes a five‑station pipeline (functional fingerprinting, domain risk checks, pathway assembly detection, risk scoring with explainability, and an adversarial self‑hardening loop). Compared to existing tools, BioShield AI uniquely identifies functional analogs, explains why sequences are flagged, and continuously improves against evasion attempts.
This is a proposal, not a project. The only hint of any actual development or evaluation work comes near the end, where the authors briefly state "Phase 1 — Hackathon MVP: Stations 1-2 functional. ESM-2 embeddings + Pfam scanning. Basic risk scoring. Proof-of-concept evasion detection on synthetic test set." However, they provide absolutely no detail or results.
As a proposal, it's essentially a laundry list of nearly every idea and approach anyone has ever suggested, strung together as a "pipeline" without addressing whether each element would actually work as advertised or how it would be accomplished. Swept under the rug is any acknowledgement that many of the one-sentence capabilities described in the proposal are hard unsolved problems that will take considerable work to solve, if they can be solved at all.
Many of the assertions are insufficiently explained for this reviewer to interpret. Key details are omitted.
It is unclear whether the authors appreciate the distinction between nucleotide-level obfuscation that maintains the same amino acid sequence (easy to do, easy to detect), amino-acid-level obfuscation that maintains the same protein structure (currently challenging to do and very costly and slow to verify), and structure-level obfuscation that maintains overall protein functionality (ditto). The logic of their claims and design seems to depend on conflating these.
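To make the first of these distinctions concrete, here is a minimal sketch of why nucleotide-level obfuscation is easy to detect: translating both sequences with the standard genetic code collapses synonymous codon changes. The toy sequences and the helper names `translate` and `nucleotide_identity` are illustrative assumptions, not taken from the proposal.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 0, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def nucleotide_identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical toy example: the second sequence uses synonymous codons only.
original = "ATGGCTCTGAAA"
recoded = "ATGGCCTTAAAG"

print(nucleotide_identity(original, recoded))       # nucleotide sequences differ
print(translate(original) == translate(recoded))    # encoded peptides are identical
```

Comparing at the protein level removes this class of evasion at negligible cost. The amino-acid-level and structure-level cases have no comparably cheap check, which is why they should be treated separately.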
The threat scenario is dramatically exaggerated, without acknowledging the difficulty of computationally generating alternative versions of proteins (at the AA or structure level) with desired functionality.
The proposal correctly identifies a vulnerability that advances in AI-generated design create for sequence-based screening. However, the proposed solution has already been studied extensively in existing publications, and it still has critical issues that must be addressed before it can be deployed in existing screening systems.
1. Generative protein design tools that require capable GPUs for inference are either prohibitively slow or prohibitively expensive, which limits their application in large-scale synthesis screening.
2. Existing generative protein design tools are trained on template coding sequences free from regulatory elements, genetic engineering techniques (e.g., 2A peptides, guide RNAs), multiple ORFs, or fused proteins, among many other complications in real-world synthetic sequences. Generative protein design tools struggle to characterize these sequences without extensive pre-processing.
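As an illustration of the pre-processing mentioned in point 2, the sketch below performs a naive six-frame scan for open reading frames in a construct that contains more than one ORF. The function names, the `min_codons` parameter, and the toy construct are assumptions made for illustration. A real screening pipeline would also need to handle 2A peptides, fusions, regulatory elements, and non-canonical start codons, none of which this sketch attempts.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string containing only A, C, G, T."""
    return dna.translate(COMPLEMENT)[::-1]


def find_orfs(dna: str, min_codons: int = 3):
    """Return (strand, start, end, sequence) for each ATG...stop ORF in six frames.

    min_codons counts codons before the stop codon. Coordinates for the "-"
    strand refer to positions on the reverse-complemented sequence.
    """
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            i = frame
            while i + 3 <= len(seq):
                if seq[i:i + 3] != "ATG":
                    i += 3
                    continue
                # Walk codon by codon until a stop codon or the end of the sequence.
                j = i
                while j + 3 <= len(seq) and seq[j:j + 3] not in STOPS:
                    j += 3
                found_stop = j + 3 <= len(seq)
                if found_stop and (j - i) // 3 >= min_codons:
                    orfs.append((strand, i, j + 3, seq[i:j + 3]))
                i = j + 3  # resume scanning after this ORF (nested starts are skipped)
    return orfs


# Hypothetical toy construct: two short ORFs in different frames, joined by a spacer.
construct = "CCC" + "ATGGCTCTGAAATAA" + "GGGTT" + "ATGAAAGCTCTGTGA" + "CC"
for strand, start, end, orf in find_orfs(construct):
    print(strand, start, end, orf)
```

Each extracted ORF could then be translated and passed to a downstream model separately. Deciding which ORFs are real, and how fused or multi-part proteins should be split, is the hard part that the proposal does not address.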
Visually clean, nice, no implementation/data.
Cite this work
@misc{
  title={(HckPrj) BioShield-AI},
  author={Rohit Singh Rajput},
  date={4/26/26},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


