May 23, 2026
When Models Disagree: Cross-Model Divergence Analysis for Ambiguity Risk Estimation in Software Requirements
Jack Lakkapragada
As AI systems generate growing volumes of software code, one bottleneck in trustworthy software development is shifting upstream: specifying what code should do and verifying that it does so. We investigate whether cross-model divergence in specification generation can serve as a practical signal of requirement underspecification. We present a structured experimental framework (a 30-requirement dataset spanning three ambiguity tiers, a six-category divergence taxonomy, and a normalized JSON comparison pipeline) and apply it across three contemporary LLMs: Claude Sonnet 4.6, GPT-4o, and Llama 3.3 70B. Divergence increases with pre-labeled ambiguity tier (TDR: 7.2, 7.4, and 7.9 across the three tiers in order of increasing ambiguity), assumption divergence dominates across all tiers, and contradiction collapses to zero in adversarially ambiguous requirements. Operationally clear requirements still exhibit substantial latent interpretive variance, with many false consensus events in Tier A concentrated in security constraints. The framework, taxonomy, and dataset were frozen prior to execution, enabling reproducible analysis without post-hoc adjustment.
Cite this work
@misc{
  title={(HckPrj) When Models Disagree: Cross-Model Divergence Analysis for Ambiguity Risk Estimation in Software Requirements},
  author={Jack Lakkapragada},
  date={5/23/26},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


