This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.
ApartSprints
Hackathon for Technical AI Safety Startups
66792de23b5e6f1a6eb18e3f
Hackathon for Technical AI Safety Startups
September 2, 2024
Accepted at the 
66792de23b5e6f1a6eb18e3f
 research sprint on 

Jailbreaking general purpose robots

We show that state of the art LLMs can be jailbroken by adversarial multimodal inputs, and that this can lead to dangerous scenarios if these LLMs are used as planners in robotics. We propose finetuning small multimodal language models to act as guardrails in the robot's planning pipeline.

By 
Axel Backlund, Lukas Petersson
🏆 
4th place
3rd place
2nd place
1st place
 by peer review
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

This project is private