May 27, 2024

Benchmarking Dark Patterns in LLMs

Jord Nguyen, Akash Kundu, Sami Jawhar

Summary

This paper builds on the research in Seemingly Human: Dark Patterns in ChatGPT (Park et al., 2024) by introducing a new benchmark of 392 questions designed to elicit dark pattern behaviours in language models. We ran this benchmark on GPT-4 Turbo and Claude 3 Sonnet, and had each model self-evaluate and cross-evaluate the responses.

Cite this work:

@misc{
  title={Benchmarking Dark Patterns in LLMs},
  author={Jord Nguyen and Akash Kundu and Sami Jawhar},
  date={5/27/24},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}

Review

Reviewer's Comments

No reviews are available yet

This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.