Summary
This paper builds on the research in Seemingly Human: Dark Patterns in ChatGPT (Park et al., 2024) by introducing a new benchmark of 392 questions designed to elicit dark pattern behaviours in language models. We ran this benchmark on GPT-4 Turbo and Claude 3 Sonnet, and had each model self-evaluate and cross-evaluate the responses.
Cite this work:
@misc{nguyen2024benchmarkingdarkpatterns,
  title={Benchmarking Dark Patterns in LLMs},
  author={Nguyen, Jord and Kundu, Akash and Jawhar, Sami},
  date={2024-05-27},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}