This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.
ApartSprints
Deception Detection Hackathon: Preventing AI deception
660d65646a619f5cf53b1f56
Deception Detection Hackathon: Preventing AI deception
July 1, 2024
Accepted at the 
660d65646a619f5cf53b1f56
 research sprint on 

Detecting Deception with AI Tics 😉

We present a novel approach: intentionally inducing subtle "tics" in AI responses as a marker for deceptive behavior. By adding a system prompt, we embed innocuous yet detectable patterns that manifest when the AI knowingly engages in deception.

By 
Samuel Svenningsen, Ilan Moscovitz, Nikhil Kotecha
🏆 
4th place
3rd place
2nd place
1st place
 by peer review
Thank you! Your submission is under review.
Oops! Something went wrong while submitting the form.

This project is private