Summary
This project aims to improve feature interpretability in large language models (LLMs) by visualizing relationships between latent features. The tool builds an interactive graph that connects features which co-activate on a given prompt, so that clusters of related features can be explored intuitively. Deployed as a web application for Llama-3-70B and Llama-3-8B, it offers insight into how latent features are organized and what roles they play in the models' decision-making.
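As a rough illustration of the co-activation idea described above, the sketch below builds a weighted graph in which two features share an edge whenever both are active on the same prompt. It is not the project's actual implementation: the function names, the `threshold` parameter, the input layout (a mapping from prompt to per-feature activation values), and the toy numbers are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def build_coactivation_graph(activations, threshold=0.0):
    """Count how often each pair of features is active on the same prompt.

    `activations` is a hypothetical layout: {prompt: {feature_id: value}}.
    A feature counts as active when its value exceeds `threshold`.
    Returns {(feature_u, feature_v): co_activation_count} with u < v.
    """
    edge_weights = defaultdict(int)
    for feature_values in activations.values():
        active = sorted(f for f, v in feature_values.items() if v > threshold)
        # Every pair of features active on this prompt gets one more count.
        for u, v in combinations(active, 2):
            edge_weights[(u, v)] += 1
    return dict(edge_weights)


def neighbors(graph, feature):
    """Return {other_feature: weight} for every edge touching `feature`."""
    result = {}
    for (u, v), weight in graph.items():
        if u == feature:
            result[v] = weight
        elif v == feature:
            result[u] = weight
    return result


# Toy data, invented for illustration only.
toy_activations = {
    "prompt_a": {1: 0.9, 2: 0.4, 3: 0.0},
    "prompt_b": {1: 0.7, 2: 0.2, 4: 0.5},
}
toy_graph = build_coactivation_graph(toy_activations)
print(toy_graph)
print(neighbors(toy_graph, 1))
```

In a sketch like this, edge weights can drive the visual layout (for example, thicker edges for pairs that co-activate more often), and the neighbor lookup corresponds to clicking a feature to see its cluster.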
Cite this work:
@misc{
title={BBLLM},
author={Joey SKAF and Mickaël Boillaud and Thaïs Distinguin},
date={11/25/24},
organization={Apart Research},
note={Research submission to the research sprint hosted by Apart.},
howpublished={https://apartresearch.com}
}