May 24, 2026
SPEC-CHECK
Vijaya L Peketi
SPEC-CHECK addresses Track 1 (Specification Elicitation) by automatically extracting formal constraints from natural-language requirements documents and evaluating whether LLM outputs adhere to them or silently override them.
SPEC-CHECK builds on empirical findings from ARCORE-ML (NeurIPS 2026 #2102, EMNLP 2026 #391), which documented that Gemini 1.5 Pro silently inflated its estimate for a UPI payment system by 212%, substituting enterprise messaging rates for the explicitly specified transactional rates without signalling the departure. This is the specification elicitation problem in production: informal constraints exist in natural-language requirements but are never formally verified against AI outputs.
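To make the size of that deviation concrete, the short sketch below shows how a signed percentage overestimate can be computed. The function name and the two input figures are hypothetical, chosen only to reproduce a +212% result; they are not values from ARCORE-ML.

```python
def overestimate_pct(estimated: float, baseline: float) -> float:
    """Signed percentage by which `estimated` exceeds `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (estimated - baseline) / baseline * 100.0


# Hypothetical figures (not from the dataset): the baseline is the value
# implied by the specified rates, the estimate is what the model returned.
baseline_value = 100.0
estimated_value = 312.0

print(f"{overestimate_pct(estimated_value, baseline_value):+.0f}%")  # prints +212%
```

An overestimate of +212% therefore means the reported figure is 3.12 times the value implied by the specified rates.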
SPEC-CHECK extracts formal constraints from requirements documents, evaluates LLM outputs for three failure modes (template-execution bias, silent constraint override, and single-vendor output tendency), and issues APPROVED or BLOCKED governance verdicts with evidence and recommendations. Grounded in the 2026 International AI Safety Report finding that pre-deployment safety testing is broken and relevant to Rajashree Agrawal's work on trustworthy-by-default AI, SPEC-CHECK is the enterprise implementation of specification elicitation as a safety gate.
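The check-and-verdict step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not SPEC-CHECK's actual implementation: the `Constraint` and `Verdict` types, the `evaluate` function, and the field names are all hypothetical, constraints are assumed to be already extracted as key/value pairs, and only the silent-constraint-override failure mode is covered.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Constraint:
    name: str      # hypothetical field name, e.g. "rate_type"
    expected: str  # value stated in the requirements document


@dataclass
class Verdict:
    status: str          # "APPROVED" or "BLOCKED"
    evidence: List[str]  # one entry per violated or unaddressed constraint


def evaluate(constraints: List[Constraint], output: Dict[str, str]) -> Verdict:
    """Compare values used in an LLM output against the specified constraints."""
    evidence: List[str] = []
    for c in constraints:
        actual = output.get(c.name)
        if actual is None:
            evidence.append(f"{c.name}: not stated in output (expected {c.expected!r})")
        elif actual != c.expected:
            evidence.append(f"{c.name}: expected {c.expected!r}, got {actual!r}")
    # Any mismatch blocks the output; an empty evidence list approves it.
    return Verdict("BLOCKED" if evidence else "APPROVED", evidence)


# Hypothetical example mirroring the rate substitution described earlier.
constraints = [Constraint("rate_type", "transactional")]
llm_output = {"rate_type": "enterprise_messaging"}

verdict = evaluate(constraints, llm_output)
print(verdict.status)    # prints BLOCKED
print(verdict.evidence)
```

Treating a missing value as a violation is a design choice in this sketch: an output that never states which rate it used cannot be verified against the specification, so it is blocked rather than approved by default.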
Dataset: ARCORE-ML, comprising 595 annotated requirements from 10 production systems across 8 domains, including government, healthcare, finance and payments in India.
India presents a unique and urgent context for this research. According to the 2024 State of Cloud Computing Survey, 68% of Indian SMEs select cloud vendors based on familiarity alone rather than systematic evaluation. India's digital economy is growing at an unprecedented scale: UPI processes 15 billion transactions monthly, government systems serve 1.4 billion citizens, and healthcare platforms reach previously unserved populations. In this setting the consequences of silent constraint override in AI-assisted infrastructure decisions are not hypothetical: they are financial losses, vendor lock-in, and compliance failures affecting the world's most populous democracy.
SPEC-CHECK addresses this by making specification elicitation accessible to SMEs without formal methods expertise. A business owner in Hyderabad who describes their appointment booking system in plain language receives formal constraint extraction, LLM output validation, and a governance verdict: the same pre-deployment safety testing that large enterprises achieve through dedicated FinOps and architecture teams. This is AI safety as social good, democratising the tools of trustworthy deployment for resource-constrained organisations in the Global South.
Cite this work
@misc{
  title={(HckPrj) SPEC-CHECK},
  author={Vijaya L Peketi},
  date={5/24/26},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


