Apr 27, 2026
OliGraph: graph-based screening of large oligopools
Ryan Teo, Christopher Kent, Samuel A.R. Richardson
🏆 2nd Place Winner + 🏆 Track 1 Winner: DNA Screening & Synthesis Controls
Existing synthesis screening tools cannot evaluate short oligonucleotide pools, whose overlapping fragments can be reassembled into regulated sequences via polymerase cycling assembly (PCA) yet fall below gene-length detection thresholds. We present OliGraph, an open-source tool that constructs a bi-directed overlap graph from an oligonucleotide pool and extracts contigs for downstream gene-length screening. An optional PCA mode retains only cross-strand overlaps consistent with PCA chemistry. We validated OliGraph in a blinded study across ten simulated pools (70–9,184 oligonucleotides, 30–300 bp) spanning four risk categories. BLAST screening of individual oligonucleotides failed to identify sequences of concern in most pools: three returned zero hits, and vector noise obscured true positives in the remainder. After OliGraph assembly, contig-level BLAST matched the longest assembled sequences (up to 1,905 bp) to sequences of concern at 97–100% identity. In one pool, assembly collapsed 1,634 individual BLAST results into 10 hits from a single contig, all assigned to the same source organism. PCA mode correctly distinguished assemblable from non-assemblable fragments within the same pool. Two pools with no assemblable structure yielded no contigs. OliGraph processed all pools in under 0.2 seconds, fast enough for real-time order screening and consistent with proposals to bring oligonucleotide orders within the scope of synthesis screening regulation.
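To make the assembly step concrete, here is a minimal Python sketch of an overlap graph and contig walk over a small simulated pool. It is not OliGraph's implementation: the function names, the 15 bp minimum overlap, the greedy contig walk, and the `cross_strand_only` flag (a simplified reading of the PCA mode, keeping only overlaps between oligos supplied on opposite strands) are all assumptions made for illustration. The pairwise comparison is quadratic in pool size, so this is a toy, not something that would match the reported runtimes.

```python
import random
from itertools import permutations

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def suffix_prefix_overlap(a, b, min_len):
    """Longest k >= min_len such that the last k bases of a equal the first k of b, else 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def build_overlap_graph(oligos, min_len=15, cross_strand_only=False):
    """Nodes are (oligo index, strand). An edge u -> v with weight k means the
    oriented sequence of u ends with the first k bases of the oriented v."""
    oriented = {}
    for i, seq in enumerate(oligos):
        oriented[(i, "+")] = seq
        oriented[(i, "-")] = revcomp(seq)
    edges = {}
    for u, v in permutations(oriented, 2):
        if u[0] == v[0]:
            continue  # never join an oligo to itself
        if cross_strand_only and u[1] == v[1]:
            continue  # assumed PCA-style filter: neighbours must be on opposite strands
        k = suffix_prefix_overlap(oriented[u], oriented[v], min_len)
        if k:
            edges[(u, v)] = k
    return oriented, edges

def extract_contigs(oriented, edges):
    """Greedy walk: start at nodes with no incoming edge and follow the
    largest overlap. Oligos with no overlaps are not emitted as contigs."""
    succ, has_pred = {}, set()
    for (u, v), k in edges.items():
        if k > succ.get(u, (None, 0))[1]:
            succ[u] = (v, k)
        has_pred.add(v)
    used, contigs = set(), []
    for start in sorted(oriented):
        if start in has_pred or start[0] in used or start not in succ:
            continue
        contig, node = oriented[start], start
        used.add(start[0])
        while node in succ and succ[node][0][0] not in used:
            node, k = succ[node]
            contig += oriented[node][k:]
            used.add(node[0])
        contigs.append(contig)
    return contigs

# Simulated pool: a 200 bp target cut into 60 bp oligos with 20 bp overlaps,
# alternating strands as a PCA design would.
rng = random.Random(0)
target = "".join(rng.choice("ACGT") for _ in range(200))
pieces = [target[i:i + 60] for i in range(0, 161, 40)]
pool = [p if i % 2 == 0 else revcomp(p) for i, p in enumerate(pieces)]

oriented, edges = build_overlap_graph(pool, cross_strand_only=True)
contigs = extract_contigs(oriented, edges)
print(len(contigs), [len(c) for c in contigs])
```

The resulting contigs, rather than the individual oligos, are what would be handed to a gene-length screening step such as BLAST.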
This is an important piece of work in the sequence screening space to address potential vulnerabilities in oligo synthesis. While there are many existing tools for in silico assembly from contigs (e.g., https://edinburgh-genome-foundry.github.io/DnaCauldron/), this is one of the few works that combines the approach with a practical implementation of screening. Since PCA was introduced, many other assembly methods have emerged that use different techniques, relying on short overlaps or none at all. OliGraph would benefit from supporting a larger set of genetic engineering techniques for assembly.
This is a strong submission. The problem is real, the gap is well-documented, and delivering a working open-source tool that closes it is a legitimate contribution. The assembly logic is technically sound — contig reconstruction, threshold precision, and signal separation from background noise all perform as described. The paper is unusually clear for a hackathon submission, and the dual-use appendix reflects genuine biosecurity literacy.
The natural next step is validation on real commercial order data. Self-generated test sets are a reasonable starting point for a hackathon, but the path to regulatory relevance runs through synthesis providers. Getting one or two to run OliGraph against their actual order pipelines would transform this from a promising proof of concept into something policymakers can cite. The EU Biotech Act angle is well-positioned — following through on provider coordination would make that connection concrete rather than aspirational.
On deployment, the paper frames OliGraph as a tool for synthesis providers but doesn't address how it fits within existing screening infrastructure. The cross-provider evasion problem is acknowledged but quickly set aside as a shared limitation. It deserves more than a footnote. A more complete deployment picture would include integration with customer order logs to flag repeat fragmented purchases across time. Current frameworks require providers to conduct both sequence screening and customer verification — knowing who is ordering, not just what. Extending that to assembly-aware oligopool screening would require policy consideration around flagging criteria, escalation thresholds, and what triggers a hold on an order. That operational layer is absent and worth developing.
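One way to picture the order-log integration suggested above is to pool each customer's recent oligos before running assembly, so that fragments split across several orders are screened together. The sketch below is hypothetical throughout: the `Order` record, its field names, the `pool_recent_orders` helper, and the 90-day window are illustrative assumptions, not part of OliGraph or of any provider's system.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class Order:
    """Hypothetical order-log record."""
    customer_id: str
    placed_at: datetime
    oligos: tuple

def pool_recent_orders(orders, as_of, window=timedelta(days=90)):
    """Merge each customer's oligos from orders placed within `window` of
    `as_of`, dropping duplicates while keeping first-seen order."""
    pools = defaultdict(dict)
    for order in sorted(orders, key=lambda o: o.placed_at):
        if as_of - window <= order.placed_at <= as_of:
            for seq in order.oligos:
                pools[order.customer_id].setdefault(seq, None)
    return {customer: list(seqs) for customer, seqs in pools.items()}

orders = [
    Order("c1", datetime(2026, 1, 10), ("AAA", "CCC")),
    Order("c1", datetime(2026, 3, 1), ("CCC", "GGG")),
    Order("c1", datetime(2025, 6, 1), ("TTT",)),   # outside the window
    Order("c2", datetime(2026, 2, 1), ("ACG",)),
]
pools = pool_recent_orders(orders, as_of=datetime(2026, 4, 1))
print(pools)
```

Each pooled list could then be passed to the assembly step as if it were a single order. The window length, and what happens when a pooled assembly raises a flag, are exactly the policy questions raised above.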
The more interesting tension is with cryptographic screening like SecureDNA. In cryptographic screening, sequences never leave the provider's premises, and the screening service never sees them in plaintext. OliGraph requires the opposite: you have to reconstruct the full assembly in the clear to know what the pool encodes. These approaches are complementary but in fundamental tension. Making assembly-aware screening privacy-preserving is an open problem; multi-party computation over graph structures is an active research area, but has not been applied in this context. Engaging seriously with this would be the most significant contribution a follow-on version of this work could make.
Two practical issues are worth fixing before wider deployment. First, the web interface doesn't expose the PCA mode toggle, so the most biosecurity-relevant feature requires CLI access. Second, when the tool returns zero results, it does so silently, with no indication of whether the pool is clean or whether a parameter needs adjusting. For the non-specialist operator this paper envisions, that's a dead end.
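As an illustration of the second point, an empty result could carry a short explanation instead of nothing at all. The sketch below is an assumption about what such a message might look like; `summarize_assembly` and its parameters are hypothetical and do not correspond to OliGraph's actual interface or output.

```python
def summarize_assembly(n_oligos, n_overlaps, contigs, min_overlap):
    """Return a one-line, operator-facing summary of an assembly run.

    All arguments are hypothetical: counts of oligos read and overlaps found,
    the list of assembled contig strings, and the minimum overlap setting.
    """
    if contigs:
        longest = max(len(c) for c in contigs)
        return (f"{len(contigs)} contig(s) assembled from {n_oligos} oligos; "
                f"longest is {longest} bp.")
    if n_oligos == 0:
        return "No oligos were read; check the input file."
    if n_overlaps == 0:
        return (f"No overlaps of at least {min_overlap} bp were found among "
                f"{n_oligos} oligos. The pool may have no assemblable structure, "
                f"or the minimum overlap may be set too high.")
    return (f"{n_overlaps} overlap(s) were found among {n_oligos} oligos, "
            f"but no contig was extracted; consider reviewing the parameters.")

print(summarize_assembly(70, 0, [], min_overlap=15))
```

A message like this does not tell the operator that a pool is clean; it only distinguishes "nothing to assemble at these settings" from "nothing was read", which is the ambiguity described above.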
One framing clarification: OliGraph is an assembly engine. Detection happens downstream in BLAST. Being precise about where OliGraph ends and screening begins would sharpen the contribution and set more accurate expectations for anyone building on this.
Solid foundation, worth developing further.
What an exciting project! I learned a bunch by reading it. I have very little to critique here: the write-up was easy to follow (though it probably could have been somewhat shorter), the results were impressive, and the limitations were stated very clearly. Good work.
Cite this work
@misc{teo2026oligraph,
  title={(HckPrj) OliGraph: graph-based screening of large oligopools},
  author={Ryan Teo and Christopher Kent and Samuel A.R. Richardson},
  date={2026-04-27},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


