May 22, 2026
SpecShift SPS Evaluator
Benjamin J. Clark, Oluwagbemike Olowe
SpecShift SPS Evaluator is a public synthetic scaffold for observable-only review of generated-code tasks.
The project explores a simple problem: passing visible tests is not the same as satisfying the underlying specification.
The prototype runs structured review passes under four conditions (baseline, schema, specification-perturbation, and adversarial) over a small synthetic dataset designed for local inspection and reproducibility review.
The system does not execute generated code or require model weights, private prompts, training data, or internal activations. Instead, it operates only on observable records and reviewer-visible artifacts.
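As a rough illustration of what an observable-only review pass could look like, the sketch below treats each record as plain data and each pass as a function that returns a list of findings. It is not the project's implementation: the `Record` fields, the pass functions, the perturbation clause, and the keyword heuristics are all hypothetical stand-ins chosen for this example. The generated code is only inspected as text and is never executed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Record:
    """Hypothetical observable record: only what a reviewer can see."""
    task_id: str
    spec: str                   # specification text
    code: str                   # generated code, treated as plain text
    visible_tests_passed: bool  # reported result of the visible tests


def baseline_pass(rec: Record) -> List[str]:
    # Baseline: take the visible test result at face value.
    return [] if rec.visible_tests_passed else ["visible tests failed"]


def schema_pass(rec: Record) -> List[str]:
    # Schema: required text fields must be present and non-empty.
    fields = ("task_id", "spec", "code")
    return [f"empty field: {name}" for name in fields
            if not getattr(rec, name).strip()]


# Assumed perturbation: one extra clause plus crude textual markers that
# would suggest the code addresses it. Purely illustrative.
PERTURBATION = ("Return 0.0 for an empty list.", ("if not", "== 0"))


def perturbation_pass(rec: Record) -> List[str]:
    # Specification perturbation (toy heuristic): add a clause to the spec
    # and check whether the code text shows any sign of handling it.
    clause, markers = PERTURBATION
    if any(marker in rec.code for marker in markers):
        return []
    return [f"no visible handling for perturbed clause: {clause}"]


def adversarial_pass(rec: Record) -> List[str]:
    # Adversarial (toy heuristic): flag code text that appears to
    # reference the tests themselves.
    suspicious = ("test_", "assert")
    return [f"code mentions '{s}'" for s in suspicious if s in rec.code]


PASSES: Dict[str, Callable[[Record], List[str]]] = {
    "baseline": baseline_pass,
    "schema": schema_pass,
    "perturbation": perturbation_pass,
    "adversarial": adversarial_pass,
}


def review(rec: Record) -> Dict[str, List[str]]:
    """Run every pass and collect findings per condition."""
    return {name: run(rec) for name, run in PASSES.items()}
```

In this toy setup, a record whose code passes the visible tests can still draw a finding from the perturbation pass, which mirrors the gap between visible tests and the underlying specification. The findings are prompts for a human reviewer, not verdicts.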
The project is intended as a bounded research prototype for specification validation and reviewer-assisted discrepancy analysis, not as a production system or automated verifier.
Included artifacts:
* runnable local demo
* synthetic dataset
* structured review passes
* public review packet
* explicit limitations and boundary documents
The prototype also explores reviewer-controlled hidden-label evaluation structures for future observable-only assessment workflows.
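One way to picture a reviewer-controlled hidden-label structure is sketched below, under stated assumptions: the reviewer keeps a private mapping from task ID to whether the record truly violates its specification, and the evaluator only ever sees its own flags. The function name, the label format, and the four count categories are hypothetical and are not taken from the project.

```python
from typing import Dict, Set


def score_against_hidden_labels(
    flagged: Set[str],
    hidden_labels: Dict[str, bool],
) -> Dict[str, int]:
    """Compare evaluator flags with labels held only by the reviewer.

    hidden_labels maps task_id -> True if the record violates its spec
    (assumed format, for illustration only).
    """
    counts = {
        "true_positive": 0,
        "false_positive": 0,
        "false_negative": 0,
        "true_negative": 0,
    }
    for task_id, violates in hidden_labels.items():
        was_flagged = task_id in flagged
        if was_flagged and violates:
            counts["true_positive"] += 1
        elif was_flagged and not violates:
            counts["false_positive"] += 1
        elif violates:
            counts["false_negative"] += 1
        else:
            counts["true_negative"] += 1
    return counts
```

Keeping the labels on the reviewer's side means the evaluator cannot be tuned against them, and the scoring step can run without access to anything beyond the flags.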
Cite this work
@misc{
  title={(HckPrj) SpecShift SPS Evaluator},
  author={Benjamin J. Clark and Oluwagbemike Olowe},
  date={2026-05-22},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


