Sep 15, 2025
Molecules Under Watch: Multi-Modal AI Driven Threat Emergence Detection for Biosecurity
Subramanyam Sahoo
This study presents a comprehensive multi-modal pipeline for assessing biosecurity risks in chemical compounds, integrating real and synthetic datasets from public repositories such as ChEMBL, PubChem, and USPTO patents. The system leverages molecular descriptors extracted via RDKit, contextual embeddings from the Qwen-2.5 language reasoning model, and unsupervised anomaly detection using Isolation Forest to compute a novel Threat Emergence Detection (TED) score. This score quantifies dual-use potential, synthesis feasibility, and novelty, enabling scalable threat triage. We evaluate the pipeline on hybrid datasets, demonstrating robust differentiation between real pharmaceutical compounds and synthetic benchmarks. Our approach advances AI-driven CBRN (Chemical, Biological, Radiological, Nuclear) safety by providing interpretable risk metrics and constitutional oversight, with implications for regulatory compliance and dual-use research governance.
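As a rough illustration of how a composite score of this kind could be assembled, the sketch below combines three component scores into one weighted sum. It is a minimal sketch under stated assumptions, not the author's implementation: the 0.4/0.3/0.3 weights are the ones mentioned in the reviewer comments, but which weight belongs to which component is assumed here, and the names `ComponentScores` and `ted_score` are hypothetical. Each component is taken as an already-computed value in [0, 1]; how the pipeline derives them from RDKit descriptors, Qwen-2.5 embeddings, and Isolation Forest is not reproduced.

```python
from dataclasses import dataclass

# ASSUMPTION: the source gives the weights 0.4/0.3/0.3 but not their
# assignment to components; this mapping is illustrative only.
WEIGHTS = {"dual_use": 0.4, "feasibility": 0.3, "novelty": 0.3}


@dataclass
class ComponentScores:
    """Hypothetical container for the three TED components, each in [0, 1]."""
    dual_use: float      # dual-use potential
    feasibility: float   # synthesis feasibility
    novelty: float       # novelty (e.g. a rescaled anomaly score)


def ted_score(scores: ComponentScores, weights: dict = WEIGHTS) -> float:
    """Return the weighted sum of the component scores."""
    total = 0.0
    for name, weight in weights.items():
        value = getattr(scores, name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
        total += weight * value
    return total
```

Because the weights sum to 1 and every component is bounded, the result also stays in [0, 1], which keeps scores comparable across compounds.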
The project addresses a critical problem at the AI-CBRN interface, and the scoring framework is novel and implementable. My only note to the author is that the binary threat probability assignment (0.8 vs 0.2) is overly simplistic and could miss chemical structural signals.
This project is ambitious and makes a valuable contribution by tackling the challenge of detecting potentially dual-use chemical compounds. The integration of molecular descriptors with a composite “TED score” is a creative and promising approach, and the methodology is laid out clearly.
Some design choices, such as assigning fixed 0.8/0.2 threat probabilities or weighting TED score components at 0.4/0.3/0.3, come across as somewhat arbitrary. These would benefit from either empirical justification or sensitivity analyses.
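One possible form of the sensitivity analysis suggested above is to sweep the three weights over a grid and check how often the top-ranked compound stays the same as under the 0.4/0.3/0.3 baseline. The sketch below is an assumption about how this might be done, not part of the submission; the function names and the input layout (a dict mapping a compound name to a tuple of three component scores) are hypothetical.

```python
from itertools import product


def weight_grid(step: float = 0.1):
    """Yield every (w1, w2, w3) on a regular grid with w1 + w2 + w3 = 1."""
    n = round(1 / step)
    for i, j in product(range(n + 1), repeat=2):
        if i + j <= n:
            yield (i / n, j / n, (n - i - j) / n)


def top_ranked(compounds: dict, weights: tuple) -> str:
    """Return the compound name with the highest weighted score."""
    def score(name: str) -> float:
        return sum(w * c for w, c in zip(weights, compounds[name]))
    return max(compounds, key=score)


def top_rank_stability(compounds: dict,
                       baseline: tuple = (0.4, 0.3, 0.3),
                       step: float = 0.1):
    """Return the baseline top compound and the fraction of grid
    weightings under which that compound is still ranked first."""
    base_top = top_ranked(compounds, baseline)
    grid = list(weight_grid(step))
    agree = sum(top_ranked(compounds, w) == base_top for w in grid)
    return base_top, agree / len(grid)
```

A fraction close to 1 would suggest the ranking is robust to the weight choice, while a low fraction would indicate that the 0.4/0.3/0.3 split materially drives the result and needs empirical justification.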
Overall, this is a very strong project: impressive in scope, well-executed, and directly relevant to AIxCBRN risk. This could evolve into a genuinely useful tool for practical biosecurity screening.
Cite this work
@misc {
title={(HckPrj) Molecules Under Watch: Multi-Modal AI Driven Threat Emergence Detection for Biosecurity},
author={Subramanyam Sahoo},
date={9/15/25},
organization={Apart Research},
note={Research submission to the research sprint hosted by Apart.},
howpublished={https://apartresearch.com}
}


