May 24, 2026
Where to Look: Energy-Based Fault Localization for Verus Vericoding
Guy Nachshon
We present a 1.5B-parameter discriminative energy-based model that assigns each line of a Verus implementation an energy, used as a proxy for "this line is the bug." The model is Qwen2.5-Coder-1.5B with a LoRA adapter and a sentinel-token per-line head,
trained on 39k Microsoft Verus pairs with a combination of InfoNCE, pairwise hinge, and ListNet losses. It is a single adapter, runs on a single H100, and the demo runs in the browser.
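The three training terms named above can be sketched on a list of per-line scores. This is a minimal illustration, not the released training code. It assumes that a score is the negative energy (higher means more suspicious) and that each implementation has exactly one buggy line. The temperature, margin, and equal loss weights are hypothetical placeholders.

```python
import math

def log_softmax(scores):
    # Numerically stable log-softmax over a list of floats.
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]

def info_nce(scores, bug_idx, temperature=1.0):
    # Cross-entropy of the buggy line against every line in the impl.
    logp = log_softmax([s / temperature for s in scores])
    return -logp[bug_idx]

def pairwise_hinge(scores, bug_idx, margin=1.0):
    # Mean penalty over clean lines that are not at least `margin` below the bug.
    others = [s for i, s in enumerate(scores) if i != bug_idx]
    if not others:
        return 0.0
    return sum(max(0.0, margin - (scores[bug_idx] - s)) for s in others) / len(others)

def listnet(scores, relevance):
    # ListNet top-1 loss: cross-entropy between softmax(relevance) and softmax(scores).
    target = [math.exp(lp) for lp in log_softmax(relevance)]
    logp = log_softmax(scores)
    return -sum(t * lp for t, lp in zip(target, logp))

def combined_loss(scores, bug_idx, weights=(1.0, 1.0, 1.0)):
    # Assumed weighting: a plain weighted sum of the three terms.
    relevance = [1.0 if i == bug_idx else 0.0 for i in range(len(scores))]
    w_nce, w_hinge, w_list = weights
    return (w_nce * info_nce(scores, bug_idx)
            + w_hinge * pairwise_hinge(scores, bug_idx)
            + w_list * listnet(scores, relevance))
```

In this sketch all three terms decrease as the buggy line's score separates from the rest, so a well-ranked implementation yields a lower combined loss than a mis-ranked one.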
The finding is split. The 1.5B specialist beats every frontier LLM on per-line fault localization (top-3 0.84 vs 0.74). Frontier LLMs win on whole-impl ranking (AUROC 0.91 vs 0.78) and on CEGIS repair (30% vs
25%). These are different tools for different layers of the verification loop, and we are explicit about which tool fits which layer.
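For readers unfamiliar with the two metrics quoted above, here is a small sketch of how they are commonly computed. The exact definitions used in the evaluation are an assumption: top-k is taken as "at least one true buggy line is among the k highest-scored lines," and AUROC as the fraction of (failing, passing) implementation pairs ranked correctly, with ties counted as half. Function names are illustrative.

```python
def top_k_hit(line_scores, bug_lines, k=3):
    # True if any ground-truth buggy line index is among the k highest scores.
    ranked = sorted(range(len(line_scores)), key=lambda i: line_scores[i], reverse=True)
    return any(i in bug_lines for i in ranked[:k])

def top_k_accuracy(examples, k=3):
    # examples: list of (line_scores, set_of_bug_line_indices), one per implementation.
    hits = sum(top_k_hit(scores, bugs, k) for scores, bugs in examples)
    return hits / len(examples)

def auroc(fail_scores, pass_scores):
    # Pairwise AUROC over whole-impl scores: P(fail score > pass score), ties count 0.5.
    wins = 0.0
    for f in fail_scores:
        for p in pass_scores:
            if f > p:
                wins += 1.0
            elif f == p:
                wins += 0.5
    return wins / (len(fail_scores) * len(pass_scores))
```

Top-k is computed per line within one implementation, while AUROC compares whole implementations, which is why a model can lead on one and trail on the other.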
The audit is the other contribution. Every FAIL impl in the dev-test corpus carries a // FAILS debug marker that Qwen's pretraining prior couples to failure. When the marker is stripped, the signal collapses for
some checkpoints, holds for frontier LLMs, and over-corrects for ours. We document the leak, release the strip-FAILS audit pipeline, and ship Counterfactual Marker Augmentation as the fix.
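The marker-stripping step of such an audit can be sketched as a single regular-expression pass. This is an illustration only, not the released pipeline. It assumes the marker is a `//` line comment beginning with the word FAILS, possibly followed by more text, and it does not handle the marker appearing inside a string literal.

```python
import re

# Matches optional leading whitespace, "//", the word FAILS, and the rest of the line.
# The \b keeps identifiers such as "FAILSAFE" from matching.
FAILS_MARKER = re.compile(r"[ \t]*//[ \t]*FAILS\b[^\n]*")

def strip_fails_markers(source: str) -> str:
    # Remove the comment but keep the newline, so line numbers stay aligned
    # with any per-line labels.
    return FAILS_MARKER.sub("", source)

def audit_pair(source: str):
    # Return the (original, stripped) inputs to score side by side.
    return source, strip_fails_markers(source)
```

Scoring each implementation both with and without the marker, then comparing the two sets of metrics, shows how much of a model's signal depends on the marker rather than on the code.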
Released: model, dataset, audit pipeline, CEGIS harness, every LLM-baseline record, browser-side demo. All under: https://ozlabsai.github.io/VericodingEBM/
Cite this work
@misc{
  title={(HckPrj) Where to Look: Energy-Based Fault Localization for Verus Vericoding},
  author={Guy Nachshon},
  date={5/24/26},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


