SpecTrojan: Adversarial Specification Validation via Evil Twin Synthesis
Ojas Marathe
SpecTrojan is a spec-validation tool that inverts the traditional input-space search: given a candidate specification, an attacker LLM synthesizes an "Evil Twin" — an alternative implementation that satisfies the spec yet diverges from the reference on intent-bearing inputs. A successful twin is an artifact-level proof that the spec is too weak. On a Bio Honeypot (an LLM-written spec for a sequence-screening predicate), SpecTrojan synthesized 16 evil twins in 60 seconds, including a trivial return True that passes verification while admitting every threat-bearing input. The system also includes an attack-to-strengthening loop that auto-proposes and mechanically verifies spec repairs, and metamorphic spec testing that catches syntax-overfit specs.
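To make the "Evil Twin" idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not SpecTrojan's code: the motif, the predicate names, and the weak spec are all hypothetical, and the standard-library random module stands in for the Hypothesis sampling the project uses. The spec only requires that benign sequences are admitted, so a trivial return True satisfies it while diverging from the reference on every threat-bearing input.

```python
import random

# Hypothetical threat motif; the real honeypot's screening logic is not
# described in the source, so this is purely illustrative.
THREAT_MOTIF = "GATTACA"


def reference_screen(seq: str) -> bool:
    """Reference implementation: admit a sequence only if it lacks the motif."""
    return THREAT_MOTIF not in seq


def weak_spec(impl, seq: str) -> bool:
    """Deliberately weak spec (assumed): benign sequences must be admitted.

    It states no obligation for threat-bearing sequences.
    """
    if THREAT_MOTIF in seq:
        return True  # the spec is silent here, so anything is allowed
    return impl(seq) is True


def evil_twin(seq: str) -> bool:
    """Trivial twin: admits everything, including threat-bearing inputs."""
    return True


def random_sequence(rng: random.Random, max_len: int = 40) -> str:
    """Draw a random nucleotide string of length 0..max_len."""
    return "".join(rng.choice("ACGT") for _ in range(rng.randint(0, max_len)))


def satisfies_spec(impl, spec, samples: int = 80, seed: int = 0) -> bool:
    """Sampled check: the implementation must satisfy the spec on every draw."""
    rng = random.Random(seed)
    return all(spec(impl, random_sequence(rng)) for _ in range(samples))


def diverges_on(impl, reference, inputs):
    """Return the inputs on which the implementation and reference disagree."""
    return [s for s in inputs if impl(s) != reference(s)]


# Intent-bearing inputs: sequences the reference is meant to reject.
intent_inputs = ["AAGATTACATT", "GATTACA"]

twin_passes = satisfies_spec(evil_twin, weak_spec)
witnesses = diverges_on(evil_twin, reference_screen, intent_inputs)
print("twin satisfies weak spec:", twin_passes)
print("divergence witnesses:", witnesses)
```

A twin that passes the sampled check and has at least one divergence witness is the kind of artifact the abstract describes: concrete evidence that the spec, as written, does not pin down the intended behavior.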
Using evil twins to expose naive specifications is a powerful idea. It would be useful to test the approach fully on a larger set of specifications and to focus more on demonstrating robustness.
The reframe is novel: search the implementation space and synthesize a whole spec-satisfying alternative, so that a surviving twin is an artifact-level proof that the specification is too weak. The system is broad and well built: a twin synthesizer, two baselines, metamorphic testing, and a strengthening loop whose verifier catches every broken LLM repair while a hand-crafted control passes, along with an unusually honest robustness section. Caveats: the headline target is a constructed honeypot, and twin counts are stochastic (sixteen in one run, zero in the next), so foreground the stable binary signal (whether any twin is found) over the counts. A twin is only checked against the spec on eighty Hypothesis samples, so satisfaction is probabilistic rather than proven. The capability matrix is rhetorically circular, since its rows are defined as what the method does, so drop the strictly-subsumes claim. Finally, the central comparison figure has a truncated caption with several mangled identifiers and is worth a cleanup pass.
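The point about probabilistic satisfaction can be made quantitative with a back-of-the-envelope sketch. The following assumes, purely for illustration, that samples are drawn independently and that a twin violates the spec on a fixed fraction p of the input distribution. Hypothesis does not sample this way, so the numbers are indicative only, and the function name is hypothetical.

```python
def miss_probability(p: float, samples: int = 80) -> float:
    """Chance that `samples` independent draws all miss a violation.

    Assumes each draw hits a spec-violating input with probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    return (1.0 - p) ** samples


# How likely a spec-violating twin is to slip through 80 samples,
# for a few illustrative violation rates.
for rate in (0.001, 0.01, 0.05):
    print(f"violation rate {rate}: miss probability {miss_probability(rate):.3f}")
```

Under these assumptions, a twin that violates the spec on only a small fraction of inputs has a substantial chance of passing eighty samples, which is why a passing twin is strong evidence rather than a proof of spec satisfaction.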
Cite this work
@misc{
  title={(HckPrj) SpecTrojan: Adversarial Specification Validation via Evil Twin Synthesis},
  author={Ojas Marathe},
  date={},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


