Mar 10, 2025

AI-Powered Policymaking: Behavioral Nudges and Democratic Accountability

Michel Chbeir, Jana Dagher

This research explores AI-driven policymaking, behavioral nudges, and democratic accountability, focusing on how governments use AI to shape citizen behavior. It highlights key risks around transparency, cognitive security, and manipulation. Through a comparative analysis of the EU AI Act and Singapore’s AI Governance Framework, we assess how different models address AI safety and public trust. The study proposes policy solutions such as algorithmic impact assessments, AI safety-by-design principles, and cognitive security standards to ensure AI-powered policymaking remains transparent, accountable, and aligned with democratic values.

Reviewer's Comments

Hi Michel and Jana, I thought your project idea was good and relevant, and could be published as a peer-reviewed article after further polishing. A few recommendations to strengthen your work:

(1) Overall, this is a very comprehensive paper that touches upon many aspects of the subject; for the future, I’d suggest going into more depth in each section. I’m aware you had limited time to achieve this during the hackathon, but I’m mentioning it in case you want to progress your work.

(2) Add citations in the text, especially as you’re introducing a lot of definitions/concepts. Citations also help to clearly distinguish your own perspective from that of others and from what is backed up by evidence.

(3) For section 1, a lot of examples were used, which I’d only suggest keeping if you’re going to extend the section and go more in depth.

(4) For section 2, start by explaining why you chose Europe vs. Singapore specifically; that makes it easier for the reader to understand as they go through the section. In the same section, it’s great that you added a table and a summary of the comparison.

(5) I’d suggest more integration and transitions between sections. For example, while I understand the comparison, I’d like to see it integrated into section 3 so that the implications flow more naturally from it.

Nicely described and motivated introduction to past research on nudging. Raises important concerns and risks with AI-driven/personalized nudging. This is a nicely written overview of the issues in this area. The paper does not present a research project proposal, however.

Thank you for submitting your work on ‘AI-Powered Policymaking: Behavioral Nudges and Democratic Accountability’ - it was interesting to read. Please see below for some feedback on your project:

1. Strengths of your proposal/project:

- The abstract (and paper generally) is clear and well structured - it is immediately clear to me what you are working on, why it is deemed important, and what your project is looking to do.

- Clear definitions are provided from the outset, which aids reader comprehension (e.g., ‘nudge’).

2. Areas for improvement:

- Large sections of the paper are unreferenced, so I would encourage the use of more evidence-based statements.

- Could you back up statements with examples? e.g., ‘AI-enhanced nudging offers unprecedented precision and scalability’ - how? Why?

- Key terms should be defined. E.g., you mention that ‘Transparency thus becomes a non negligible consideration’ (p.2) - how are you defining transparency?

- Overall, a clearer methodology (explicitly outlined), more critical analysis (vs descriptive language) and the insertion of a threat model(s) would greatly strengthen this paper.

Cite this work

@misc{
  title={AI-Powered Policymaking: Behavioral Nudges and Democratic Accountability},
  author={Michel Chbeir and Jana Dagher},
  date={3/10/25},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}

Recent Projects

Jan 11, 2026

Eliciting Deception on Generative Search Engines

Large language models (LLMs) with web browsing capabilities are vulnerable to adversarial content injection—where malicious actors embed deceptive claims in web pages to manipulate model outputs. We investigate whether frontier LLMs can be deceived into providing incorrect product recommendations when exposed to adversarial pages.

We evaluate four OpenAI models (gpt-4.1-mini, gpt-4.1, gpt-5-nano, gpt-5-mini) across 30 comparison questions spanning 10 product categories, comparing responses between baseline (truthful) and adversarial (injected) conditions. Our results reveal significant variation: gpt-4.1-mini showed 45.5% deception rate, while gpt-4.1 demonstrated complete resistance. Even frontier gpt-5 models exhibited non-zero deception rates (3.3–7.1%), confirming that adversarial injection remains effective against current models.
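To make the setup concrete, below is a minimal sketch of how such a baseline-vs-adversarial comparison could be scripted with the OpenAI Python client. The ask and deception_rate helpers, the prompt wording, and the is_deceived check are hypothetical placeholders under our own assumptions, not the project's actual evaluation harness.

# Minimal sketch of a baseline-vs-adversarial comparison harness.
# The prompts, page contents, and deception check are hypothetical
# placeholders, not the project's actual evaluation code.
from openai import OpenAI

client = OpenAI()

def ask(model: str, question: str, page_text: str) -> str:
    """Ask a product-comparison question with a given web page as context."""
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "Answer using the provided web page."},
            {"role": "user", "content": f"Page:\n{page_text}\n\nQuestion: {question}"},
        ],
    )
    return resp.choices[0].message.content

def deception_rate(model, questions, truthful_pages, adversarial_pages, is_deceived):
    """Fraction of questions where the adversarial page flips the recommendation."""
    flipped = 0
    for q, clean, injected in zip(questions, truthful_pages, adversarial_pages):
        baseline = ask(model, q, clean)
        attacked = ask(model, q, injected)
        if is_deceived(baseline, attacked):
            flipped += 1
    return flipped / len(questions)

A harness along these lines would be run once per model (e.g., "gpt-4.1-mini") over the question set to obtain per-model rates of the kind reported above.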

These findings underscore the need for robust defenses before deploying LLMs in high-stakes recommendation contexts.

Jan 11, 2026

SycophantSee - Activation-based diagnostics for prompt engineering: monitoring sycophancy at prompt and generation time

Activation monitoring reveals that prompt framing affects a model's internal state before generation begins.

Jan 11, 2026

Who Does Your AI Serve? Manipulation By and Of AI Assistants

AI assistants can be both instruments and targets of manipulation. In our project, we investigated both directions across three studies.

AI as Instrument: Operators can instruct AI to prioritise their interests at the expense of users. We found models comply with such instructions 8–52% of the time (Study 1, 12 models, 22 scenarios). In a controlled experiment with 80 human participants, an upselling AI reliably withheld cheaper alternatives from users - not once recommending the cheapest product when explicitly asked - and ~one third of participants failed to detect the manipulation (Study 2).

AI as Target: Users can attempt to manipulate AI into bypassing safety guidelines through psychological tactics. Resistance varied dramatically - from 40% (Mistral Large 3) to 99% (Claude 4.5 Opus) - with strategic deception and boundary erosion proving most effective (Study 3, 153 scenarios, AI judge validated against human raters r=0.83).
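For the "AI judge validated against human raters (r=0.83)" step, an agreement check of this kind can be sketched as a simple Pearson correlation; the scores below are invented placeholders under our own assumptions, not data from Study 3.

# Sketch of validating an AI judge against human raters via Pearson correlation.
# The scores are made-up placeholders, not data from Study 3.
import numpy as np

# One resistance score per scenario, e.g. on a 0-10 scale.
ai_judge_scores = np.array([9.0, 7.5, 3.0, 8.0, 6.5, 2.0, 9.5, 5.0])
human_ratings = np.array([8.5, 7.0, 3.5, 8.5, 6.0, 2.5, 9.0, 5.5])

# Pearson r between the judge and the human raters.
r = np.corrcoef(ai_judge_scores, human_ratings)[0, 1]
print(f"judge-human agreement: r = {r:.2f}")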

Our key finding was that model selection matters significantly in both settings. We learned that some models complied with manipulative requests at much higher rates than others. And we found some models readily follow operator instructions that come at the user's expense - highlighting a tension for model developers between serving paying operators and protecting end users.

This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.