Nov 24, 2025
Gene Guard: Real-Time Genomic Data Leak Prevention
Neha Suresh, Angana Mondal, Riccardo Campanella, Richard Guan, Ahmed Rockey Saikia
Genomic data (such as DNA sequences) are both highly sensitive and uniquely vulnerable to accidental leakage in modern collaborative and AI-assisted workflows. This report introduces Gene-Guard, a two-layer detection and prevention system designed to safeguard genomic sequences from unauthorized exposure in real time. We illustrate the growing risk with real-world scenarios: researchers inadvertently pasting DNA sequences into chat platforms or AI tools, and bio-lab staff emailing unencrypted gene data. To address this gap, Gene-Guard's first layer uses machine learning classifiers, trained and benchmarked on an open DNA sequence dataset, to flag genomic data with high recall and precision. Upon detection, an optional second layer provides an LLM-powered risk analysis, identifying the sequence and its potential biosecurity implications using tools like IBBIS’s Common Mechanism and SeqScreen. We emphasize a lightweight, fast, platform-agnostic design that can plug into any Data Loss Prevention (DLP) pipeline without heavy resource demands or added latency. Our results suggest Gene-Guard can significantly reduce the chance of genomic data leaks without impeding researchers’ workflows, offering a novel safety net at the intersection of cybersecurity and biosecurity.
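To make the first-layer idea concrete, the sketch below flags DNA-like runs in outgoing text before it leaves a workstation. It is only an illustration: it swaps in a plain regular-expression heuristic for the machine learning classifiers described above, and the names `MIN_RUN_LENGTH`, `find_dna_like_runs`, and `should_block` are hypothetical, not part of Gene-Guard.

```python
import re

# Hypothetical threshold (an assumption, not a Gene-Guard setting): the
# shortest run of nucleotide letters that is treated as suspicious.
MIN_RUN_LENGTH = 30

# A run of A/C/G/T/N letters in either case. Whitespace between letters is
# tolerated because pasted sequences are often wrapped across lines.
_DNA_RUN = re.compile(r"[ACGTNacgtn](?:\s*[ACGTNacgtn])*")


def find_dna_like_runs(text, min_length=MIN_RUN_LENGTH):
    """Return (start, end, sequence) for each DNA-like run in `text`.

    `start` and `end` index into the original text; `sequence` is the run
    with whitespace removed and letters upper-cased.
    """
    hits = []
    for match in _DNA_RUN.finditer(text):
        sequence = re.sub(r"\s", "", match.group()).upper()
        if len(sequence) >= min_length:
            hits.append((match.start(), match.end(), sequence))
    return hits


def should_block(text):
    """Return True if the text contains at least one DNA-like run."""
    return bool(find_dna_like_runs(text))
```

A DLP hook could call `should_block` on clipboard or message content and pass any flagged span to the optional second layer. A heuristic this simple can also absorb stray nucleotide letters from an adjacent word, which is one reason a trained classifier is a better fit for the detection layer.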
Cite this work
@misc {
  title={(HckPrj) Gene Guard: Real-Time Genomic Data Leak Prevention},
  author={Neha Suresh and Angana Mondal and Riccardo Campanella and Richard Guan and Ahmed Rockey Saikia},
  date={2025-11-24},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


