Aug 25, 2024
Web App for Interacting with Refusal-Ablated Language Model Agents
Simon Lermen
While many people, including policymakers, have interacted with
language models, they often hold outdated assumptions about them. A
significant fraction are unaware of agentic capabilities.
Furthermore, most models that are available online have various
safety guardrails. We want to demonstrate refusal-ablated agents
to people to make them aware of the potential for misuse. Letting
people experience agentic AI firsthand, and perhaps having the agent
operate against them, could give them a better intuition for agency in
AI systems. We present a simple web app that allows users to
instruct and experiment with an unrestricted agent.
Adam Binksmith
Nice work! Agent capabilities are likely to be important, so I’m excited to see demos of them. I particularly liked the ability to receive an email from the model that you can reply to - great to show it working outside the assistant chat UI. For next steps, I’d prioritise a polished, optimised UX that makes it clear to the user what’s going on, and gives a default prompt. I’d also look to speed it up or stream in intermediate results to make it more engaging.
Cite this work
@misc {
title={Web App for Interacting with Refusal-Ablated Language Model Agents},
author={Simon Lermen},
date={8/25/24},
organization={Apart Research},
note={Research submission to the research sprint hosted by Apart.},
howpublished={https://apartresearch.com}
}