Apr 8, 2025
The Incentive Gap: Extending Darkbench to Reveal Conflict of Value Biases in LLMs
Nancy Vigil
This preliminary research investigates a new dark design pattern, conflict of values, using prompts designed to elicit possible corporate or model incentives in the outputs of several OpenAI models. The amount of conflict of values detected varies across models, with the most detected in GPT-4 Turbo and GPT-4o. Further research is needed to confirm these results.
Esben Kran
Great approach and a good way to expand brand bias to general selfhood bias for the companies themselves! Would've loved to see a link to the prompts used to generate the responses from the models but the motivation is strong. I can easily imagine something like "It's bad to scrape art off the internet" being corporate skewed, for example, but missing the dataset makes it hard to evaluate. Further developments may include precision-specific dark patterns related to corporate incentives (does it favor Sam Altman, Sam Altman as a general concept, Sam Altman's interests, OpenAI's interests, OpenAI's developers' interests, etc.). Lots of things to play with when it comes to who has soft power over the development process!
Cite this work
@misc{
  title={The Incentive Gap: Extending Darkbench to Reveal Conflict of Value Biases in LLMs},
  author={Nancy Vigil},
  date={4/8/25},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}