Apr 27, 2026
TRACE: Threat Recognition via Attention, Context, and Embedding Assembly for Context-Aware Biosecurity Intelligence
Samuel Kong, Mathias Ramm Haugland
Generative AI has democratized protein design but introduced a critical biosecurity gap: AI can "paraphrase" known toxins into synthetic homologs that preserve hazardous folds while evading homology-based DNA synthesis screening. Concurrently, regulatory frameworks now mandate screening down to 50 bp and detection of cross-fragment assembly, yet current tools suffer from high false positives, poor calibration, and uninterpretable outputs. We present TRACE, a context-aware escalation layer that bridges high-throughput first-line screening and expert human review. TRACE combines a deterministic short-window prefilter, a threat-pruned De Bruijn graph for cart-level assembly reconstruction, and a LoRA-fine-tuned ESM-2 protein risk scorer with temperature-scaled calibration and SHAP-based explainability. Evaluated on a family-held-out dataset of 32,526 sequences, TRACE achieves 95.1% recall at ≤2% false-positive rate, 0.987 PR-AUC, and 0.024 expected calibration error, with sub-100ms CPU latency. Deployed as a lightweight ONNX service with an interactive Streamlit dashboard and FastAPI guardrail endpoints, TRACE provides regulator-ready evidence and plug-in safety controls for AI biological design tools, directly addressing OSTP 2024, IGSC v3.0, and CBAI Track 1 objectives.
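For readers unfamiliar with the calibration terms in the abstract, the following is a minimal sketch of temperature scaling and expected calibration error (ECE) for a binary risk scorer. It is not the authors' implementation: the function names, the grid search over the temperature, the 0.5 decision threshold, and the ten equal-width bins are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def nll(logits, labels, temperature):
    # Mean binary negative log-likelihood of logits divided by the temperature.
    eps = 1e-12
    total = 0.0
    for z, y in zip(logits, labels):
        p = sigmoid(z / temperature)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
    return total / len(logits)

def fit_temperature(logits, labels):
    # Hypothetical fitting procedure: grid search over T in (0, 10] in steps of 0.05.
    grid = [0.05 * i for i in range(1, 201)]
    return min(grid, key=lambda t: nll(logits, labels, t))

def expected_calibration_error(probs, labels, n_bins=10):
    # Weighted mean |accuracy - confidence| over equal-width confidence bins.
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        pred = 1 if p >= 0.5 else 0
        conf = p if pred == 1 else 1.0 - p
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if pred == y else 0.0))
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            accuracy = sum(a for _, a in b) / len(b)
            ece += len(b) / len(probs) * abs(accuracy - avg_conf)
    return ece
```

Dividing logits by a fitted temperature T > 1 softens overconfident scores without changing which sequences are flagged, so calibration improves while recall and false-positive rate at the 0.5 threshold stay the same.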
This is an ambitious project that combines multiple technologies for evasion-resistant synthesis screening. The grounding in recent regulatory frameworks is compelling, though the authors don't fully explain what they mean by TRACE being a "regulator-ready layer". Technical execution seems strong overall, but I found it difficult to follow how the parts of the system work together and how the assembly mechanism was constructed and validated.
Addressing vulnerabilities in screening short sequences is an important area of research, but this submission doesn't seem to present a new or effective approach. The Short Window Prefilter reduces the solution to sequence-based screening, for which a large set of techniques already exists and in which short sequences remain a vulnerability. Without the prefilter, the subsequent steps do not perform well. The performance of the system therefore depends entirely on the prefilter, which operates like the seed algorithms in popular alignment tools such as BLAST, BLAT, and Bowtie.
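To illustrate the kind of seed step the comparison above refers to, here is a minimal sketch of exact k-mer seed matching in the general style used by alignment tools. It is a generic illustration, not the submission's prefilter: the function names, the toy reference, and the choice of k = 5 are assumptions, and the sequences are arbitrary strings with no biological meaning.

```python
from collections import defaultdict

def build_seed_index(references, k):
    # Map every k-mer to the set of (reference name, offset) pairs where it occurs.
    index = defaultdict(set)
    for name, seq in references.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add((name, i))
    return index

def seed_hits(query, index, k):
    # Return (query offset, reference name, reference offset) for each exact k-mer match.
    query = query.upper()
    hits = []
    for i in range(len(query) - k + 1):
        for name, j in sorted(index.get(query[i:i + k], ())):
            hits.append((i, name, j))
    return hits

# Toy data only: arbitrary strings chosen for illustration.
toy_index = build_seed_index({"toy_ref": "ACGTACGGTTCA"}, k=5)
close_query = "TTTACGGTTAAA"      # shares an exact 7-base stretch with toy_ref
divergent_query = "ACCTACCGTTGA"  # toy_ref with a substitution every few bases
```

In this toy case `seed_hits(close_query, toy_index, 5)` finds seeds, while `seed_hits(divergent_query, toy_index, 5)` finds none, which is the limitation of exact-seed screening that the comment points to.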
The paper tries to address a relevant problem at the edge of synthesis screening: how can screening systems detect AI-generated sequences of concern while preserving tool efficiency?
While the results seem promising, many of the methods and results raise further questions for the reader, such as how the paper validated the functionality of sequences of concern (SOCs) in silico.
The paper is written with very heavy technical jargon and gives insufficient space to explaining the methods or results to the desired standard. For example, it lists the software and packages used without explaining the rationale for them or detailing how the app works, which may account for some of the gaps and lack of clarity in the paper.
Overall, the problem the paper is trying to address is very relevant, and the metrics seem promising, but I am ultimately left wondering whether the methods address the problem the paper is trying to solve.
Cite this work
@misc{
  title={(HckPrj) TRACE: Threat Recognition via Attention, Context, and Embedding Assembly for Context-Aware Biosecurity Intelligence},
  author={Samuel Kong and Mathias Ramm Haugland},
  date={4/27/26},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


