May 25, 2026
Does Oracle Quality Matter? Adversarial Feedback for Formally Verified Code Synthesis
Pranav Kasetty
Counterexample-Guided Inductive Synthesis (CEGIS) is the standard approach for repairing LLM-generated code that fails formal verification. We investigate whether augmenting the verifier's error feedback with adversarial LLM-generated counterexamples accelerates convergence. Using 50 Dafny tasks from the Vericoding benchmark, we find that adding a same-capability oracle (Sonnet) does not change pass rates but improves early convergence: pass@2 increases from 66% to 72%, with 9 tasks converging faster vs 6 slower among 42 jointly-solved tasks. The more surprising finding emerges when we upgrade to a stronger oracle (Opus): it unlocks a task that failed under all prior conditions by providing specific proof-repair guidance — identifying a missing lemma call — where the weaker oracle could only report "it times out." This suggests that for secure program synthesis pipelines, oracle quality — the ability to provide actionable proof-engineering feedback — matters more than oracle presence.
No reviews are available yet
Cite this work
@misc {
title={
(HckPrj) Does Oracle Quality Matter? Adversarial Feedback for Formally Verified Code Synthesis
},
author={
Pranav Kasetty
},
date={
5/25/26
},
organization={Apart Research},
note={Research submission to the research sprint hosted by Apart.},
howpublished={https://apartresearch.com}
}


