Jan 11, 2026
Seed Aware Evaluation Of Semantic Stability in Text-to Image Diffusion Models
Chinmay Muralidharan
Text-to-image diffusion models rely on stochastic sampling, introducing variability across generations. While often framed as a creative feature, this variability can produce semantic drift, where prompt meaning changes without user intent, undermining reproducibility and evaluation reliability.
We examine how random seed choice affects semantic stability in Stable Diffusion v1.5 under fixed prompts and parameters. Using three controlled seed regimes—determinism, sensitivity, and diversity—we evaluate outputs with qualitative metrics capturing structural consistency and outcome variability.
Across five prompts, identical seeds yield fully reproducible images, small seed perturbations preserve core semantics with minor variation, and distant seeds frequently induce reinterpretation. We argue that seed-aware evaluation offers a lightweight tool for auditing generative models in safety-critical contexts.
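As an illustration of the three regimes, the sketch below builds one seed list per regime. The function name, base seed, step sizes, and sample count are all hypothetical; the submission does not state which seeds were used.

```python
def seed_regimes(base_seed=42, n=4, near_step=1, far_step=100_003):
    """Return one list of seeds per regime (all values are illustrative)."""
    return {
        # Determinism: the same seed repeated, expecting identical images.
        "determinism": [base_seed] * n,
        # Sensitivity: small integer offsets from the base seed.
        "sensitivity": [base_seed + i * near_step for i in range(n)],
        # Diversity: widely spaced seeds.
        "diversity": [base_seed + i * far_step for i in range(n)],
    }


regimes = seed_regimes()
```

Each list would then be passed, one seed at a time, to the image generator with the prompt and all other parameters held fixed.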
1. Clean experimental design with the three-regime framework. The cherry-picking manipulation surface is a valid concern worth exploring.
2. "Different seeds give different outputs" is the definition of stochastic sampling, not a research contribution. The interesting question is when and why semantic drift occurs.
3. Qualitative assessments ("High/Medium/Low") make results hard to compare or reproduce. CLIP similarity between outputs, perceptual distance, or embedding-space analysis would make findings quantitative.
4. Five prompts are too few. Testing 50–100 prompts would allow statistical claims about which prompt types are more or less stable under seed variation.
5. Can an adversary search over seeds to find outputs supporting a desired narrative? How many seeds must be sampled? What is the success rate? Answering these questions would make the project a better fit for the hackathon.
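Points 3 and 5 can be made concrete with a small sketch. The first function scores a set of outputs by the mean pairwise cosine similarity of their embeddings (for example, CLIP image embeddings computed elsewhere). The second gives the number of seeds needed to hit at least one "successful" output with a chosen confidence, assuming each seed succeeds independently with probability p. Both function names and the independence assumption are illustrative and not part of the submission.

```python
import math

import numpy as np


def mean_pairwise_cosine(embeddings):
    """Mean cosine similarity over all distinct pairs of embedding rows."""
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # unit-normalise rows
    sims = x @ x.T                                    # all pairwise cosines
    upper = np.triu_indices(len(x), k=1)              # distinct pairs only
    return float(sims[upper].mean())


def seeds_needed(p_success, confidence=0.95):
    """Smallest n such that 1 - (1 - p_success)**n >= confidence.

    Assumes independent seeds with equal success probability.
    """
    if not 0.0 < p_success < 1.0:
        raise ValueError("p_success must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_success))


# Toy embeddings: identical rows are maximally stable, orthogonal rows are not.
stable = mean_pairwise_cosine([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
unstable = mean_pairwise_cosine([[1.0, 0.0], [0.0, 1.0]])
n_for_5_percent = seeds_needed(0.05, confidence=0.95)
```

A score near 1 indicates that outputs stay semantically close across seeds; lower scores indicate drift. The per-seed success rate would itself have to be estimated empirically, for instance as the fraction of sampled seeds whose output matches the target narrative.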
The main source of randomness in diffusion models is the initial latent noise. It would have been interesting to see how changing the seed affects that latent noise, and how directly perturbing the initial latent noise affects generations, rather than doing so indirectly by changing the seed.
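A minimal sketch of both ideas follows, using NumPy as a stand-in for the generator a real pipeline would use. It draws an initial latent of shape 4×64×64 (the latent size for a 512×512 image in Stable Diffusion v1.5), compares latents across seeds, and perturbs a latent directly with a variance-preserving mix. The helper names and the mixing rule are assumptions for illustration, not the author's method.

```python
import numpy as np

LATENT_SHAPE = (4, 64, 64)  # SD v1.5 latent for a 512x512 image


def initial_latent(seed):
    """Draw standard-normal initial noise from a seeded NumPy generator."""
    return np.random.default_rng(seed).standard_normal(LATENT_SHAPE)


def cosine(a, b):
    """Cosine similarity between two flattened latents."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def perturb(latent, eps, seed):
    """Mix in fresh noise while keeping roughly unit variance.

    eps = 0 returns the latent unchanged; eps = 1 replaces it entirely.
    """
    noise = np.random.default_rng(seed).standard_normal(latent.shape)
    return np.sqrt(1.0 - eps**2) * latent + eps * noise


z0 = initial_latent(0)
same_seed = cosine(z0, initial_latent(0))      # identical seed
adjacent_seed = cosine(z0, initial_latent(1))  # seed changed by one
small_step = cosine(z0, perturb(z0, eps=0.1, seed=7))
```

In this stand-in, the same seed reproduces the latent exactly, whereas an adjacent seed yields a latent that is essentially uncorrelated with the first, so seed distance does not translate into latent distance. Direct perturbation with a small `eps`, by contrast, keeps the latent close to the original and gives a controllable notion of "nearby" noise.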
Cite this work
@misc{
  title={(HckPrj) Seed Aware Evaluation Of Semantic Stability in Text-to Image Diffusion Models},
  author={Chinmay Muralidharan},
  date={1/11/26},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


