Summary
Steer aims to be an API that helps developers, researchers, and businesses steer open-source LLMs away from societal biases and toward the use cases they need. To do this, Steer uses activation additions, a fairly new technique with great promise. Developers simply enter steering prompts to make open models behave more safely and in more task-specific ways. This avoids the data collection and human evaluation that fine-tuning requires, as well as the extra tokens that prompt-engineering approaches add to every request.
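To make the idea of activation additions concrete, here is a minimal, self-contained sketch. It is not Steer's API or implementation, which this summary does not describe. It assumes the general recipe: run a contrasting pair of steering prompts through the model, take the difference of their activations at one layer, scale it by a coefficient, and add the result to the activations at that layer while processing the user's prompt. The toy model and every name in it (`embed`, `toy_layer`, `forward`, `steering_vector`, `DIM`, `N_LAYERS`) are hypothetical stand-ins for a real LLM and its hidden states.

```python
import hashlib

DIM = 8        # hidden size of the toy model (illustrative only)
N_LAYERS = 3   # number of toy residual layers (illustrative only)


def embed(token):
    """Deterministic pseudo-embedding; stands in for a real embedding table."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [(b - 127.5) / 127.5 for b in digest[:DIM]]


def toy_layer(hidden, index):
    """Toy residual layer: add a fixed ReLU-based update to the hidden state."""
    update = [0.5 * max(0.0, hidden[(i + index + 1) % DIM]) for i in range(DIM)]
    return [h + u for h, u in zip(hidden, update)]


def forward(prompt, steering=None, inject_at=None, capture_at=None):
    """Run the toy model on `prompt`.

    If `steering` is given, it is added to the hidden state entering layer
    `inject_at`. Returns (final_hidden, captured), where `captured` is the
    hidden state entering layer `capture_at`, taken after any injection.
    """
    tokens = prompt.split()
    if not tokens:
        raise ValueError("prompt must contain at least one token")
    vectors = [embed(t) for t in tokens]
    # Assumption: one mean-pooled vector per prompt instead of per-token states.
    hidden = [sum(col) / len(vectors) for col in zip(*vectors)]
    captured = None
    for index in range(N_LAYERS):
        if steering is not None and index == inject_at:
            hidden = [h + s for h, s in zip(hidden, steering)]
        if index == capture_at:
            captured = list(hidden)
        hidden = toy_layer(hidden, index)
    return hidden, captured


def steering_vector(positive, negative, layer_index, coefficient):
    """Scaled difference of the two prompts' activations at `layer_index`."""
    _, pos = forward(positive, capture_at=layer_index)
    _, neg = forward(negative, capture_at=layer_index)
    return [coefficient * (p - n) for p, n in zip(pos, neg)]


if __name__ == "__main__":
    vec = steering_vector("Love", "Hate", layer_index=1, coefficient=4.0)
    base, _ = forward("I think dogs are")
    steered, _ = forward("I think dogs are", steering=vec, inject_at=1)
    shift = [round(s - b, 3) for s, b in zip(steered, base)]
    print("shift in final hidden state:", shift)
```

In a real model the same steps would be applied to per-token hidden states inside the network. The layer index and coefficient are tuning choices, and a coefficient of zero leaves the model's behavior unchanged.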
Cite this work:
@misc {
title={Steer: An API to Steer Open LLMs},
author={Niranjan Nair},
date={9/2/24},
organization={Apart Research},
note={Research submission to the research sprint hosted by Apart.},
howpublished={https://apartresearch.com}
}