Mar 10, 2025
Latent Knowledge Analysis via Feature-Based Causal Tracing
Letlotlo
This project explores how factual knowledge is stored in large language models using Goodfire’s Ember API. By identifying and manipulating internal features related to specific facts, it shows how facts are encoded and how model behavior changes when those features are amplified or erased.
Good work this weekend! The paper builds on existing literature (the research carried out by Burns et al.); I would recommend reading more widely around the topic to get a better sense of the space. The study identifies general risks posed by LLMs. Moving forward, I would suggest delving into these risks more deeply. The paper addresses key limitations, such as the focus on a single example fact. I would strongly encourage broadening the scope of the research so that the results are more widely applicable.
Cite this work
@misc {
title={Latent Knowledge Analysis via Feature-Based Causal Tracing},
author={Letlotlo},
date={3/10/25},
organization={Apart Research},
note={Research submission to the research sprint hosted by Apart.},
howpublished={https://apartresearch.com}
}


