Apr 7, 2025
DimSeat: Evaluating chain-of-thought reasoning models for Dark Patterns
Thomas Kiefer, Pepjin Cobben, Fred Defokue, Rasmus Moorits Veski
Recently, Kran et al. introduced DarkBench, an evaluation for dark patterns in large language models. Expanding on DarkBench, we introduce DimSeat, an evaluation system for novel reasoning models with chain-of-thought (CoT) reasoning. We find that while the integration of reasoning in DeepSeek reduces the occurrence of dark patterns, chain-of-thought frequently proves inadequate at preventing such patterns by default and may even inadvertently contribute to their manifestation.
Esben Kran
This is an awesome project and a great way to expand the dark patterns work in a very unique way. There's some deep implications in here about how the models work at a fundamental level and where they self-identify their misalignment. I'd be incredibly curious to see this with Grok as it seems to have significantly different behaviors when it comes to truthfulness / self-consistency in reasoning compared to other models.
Great investigation, great statistics, great introduction, great discussion. Phenomenal work. There are some very interesting pieces of continued work that I would love to see. Some examples that go a bit beyond the template are 1) understanding when models' alignment training makes them behave in self-inconsistent ways (e.g. helpfulness vs. honesty in the anthropomorphization example), 2) modeling out the effects of CoT identification of potential dark patterns (i.e. a deeper dive into the individual 2x2 matrices of results), and 3) establishing some sort of CoT-result ethical consistency benchmark.
Cite this work
@misc{kiefer2025dimseat,
  title={DimSeat: Evaluating chain-of-thought reasoning models for Dark Patterns},
  author={Thomas Kiefer and Pepjin Cobben and Fred Defokue and Rasmus Moorits Veski},
  date={2025-04-07},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}