Summary
To build a safe and unbiased path to AGI, we must mitigate the biases in our LLMs. With this goal in mind, I tested the Goodfire SDK and its feature steering as a way to mitigate bias, as part of the recently held ‘Reprogramming AI Models’ hackathon led by Apart Research and Goodfire.
Cite this work:
@misc {
title={Bias Mitigation in LLM by Steering Features},
author={Akanksha Devkar},
date={11/25/24},
organization={Apart Research},
note={Research submission to the research sprint hosted by Apart.},
howpublished={https://apartresearch.com}
}