Jan 11, 2026
Manipulation Playground
Giles Edkins, Zoravur Singh, Christopher Berry
As Large Language Models (LLMs) operate in multi-agent settings, understanding emergent manipulation and deception becomes increasingly important. While prior work focuses on LLMs manipulating humans, LLM-to-LLM dynamics remain understudied. We extend the Cheap Talk setup (Pham 2025) to create controlled scenarios where agents communicate under partially aligned or conflicting incentives. This framework enables observation of influence attempts, misrepresentation, information withholding, and covert signaling. We contextualize our approach within recent work on manipulation and deception, including APE and DeceptionBench, and offer preliminary observations about when models engage in manipulation. Early findings suggest that some agents alter their behavior depending on the capability of the victim, that agents double down on manipulative behavior in iterated games, and that manipulative tendencies arise without explicit prompting.
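The cheap-talk setup described above can be illustrated as a minimal sender/receiver game: the sender observes the true state and sends a costless message, the receiver acts on that message, and the two payoff functions only partially overlap. The sketch below is purely illustrative; the agent policies, state space, and payoff values are assumptions, not the paper's actual implementation (which uses LLM agents rather than fixed rules).

```python
import random

# Illustrative cheap-talk game with partially conflicting incentives.
# All names and payoffs here are hypothetical stand-ins for LLM policies.

STATES = [0, 1]

def sender_payoff(state, action):
    # Sender prefers action 1 regardless of the true state (conflict).
    return 1 if action == 1 else 0

def receiver_payoff(state, action):
    # Receiver wants its action to match the true state.
    return 1 if action == state else 0

def deceptive_sender(state):
    # Always claims state 1 to induce the sender-preferred action.
    return 1

def trusting_receiver(message):
    # Acts as if the message were the true state.
    return message

def play_round(sender, receiver, rng):
    state = rng.choice(STATES)
    action = receiver(sender(state))
    return sender_payoff(state, action), receiver_payoff(state, action)

rng = random.Random(0)
results = [play_round(deceptive_sender, trusting_receiver, rng)
           for _ in range(1000)]
avg_sender = sum(s for s, _ in results) / len(results)
avg_receiver = sum(r for _, r in results) / len(results)
# The deceptive sender always earns 1; the trusting receiver earns 1
# only when the true state happens to be 1 (about half the time).
```

In an iterated version of this game, a receiver that tracks past message accuracy can discount the sender's claims, which is the kind of capability-dependent strategy shift the abstract reports observing between LLM agents.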
Manipulation Playground effectively explores emergent strategic behaviors between LLMs using a multi-agent game setup. The graphs comparing different model pairs provide clear visual evidence of behaviors such as doubling down on manipulation and capability-dependent strategy shifts.
To improve, the project could include quantitative metrics for manipulative actions, more explicit definitions of what counts as manipulation, and additional experiments across a broader set of models or incentive structures. Overall, it is a well-executed exploratory study with promising insights into LLM-to-LLM interactions.
Cite this work
@misc{edkins2026manipulation,
  title={(HckPrj) Manipulation Playground},
  author={Edkins, Giles and Singh, Zoravur and Berry, Christopher},
  date={2026-01-11},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


