This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.
ARENA 4.0 Interpretability Hackathon
September 15, 2024
Accepted at the ARENA 4.0 Interpretability Hackathon research sprint on September 15, 2024

Latent Space Clustering and Summarization

I wanted to see how modern dimensionality reduction and clustering approaches can support the visualization and interpretation of LLM latent spaces. I explored a number of approaches and algorithms, and ultimately converged on UMAP for dimensionality reduction and BIRCH clustering to extract groups of tokens in the latent space of a layer.
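The pipeline described above (UMAP for reduction, then BIRCH for grouping) might be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the function name `cluster_latents`, the parameter values, and the synthetic blobs standing in for per-token layer activations are all hypothetical. It assumes `scikit-learn` and, ideally, `umap-learn` are installed; if `umap-learn` is missing, the sketch substitutes PCA so it still runs, which is a different technique from the one named in the abstract.

```python
import numpy as np
from sklearn.cluster import Birch
from sklearn.datasets import make_blobs

try:
    import umap  # provided by the umap-learn package
except ImportError:
    umap = None  # the PCA substitute below is used instead


def cluster_latents(activations, n_components=2, n_clusters=5, seed=0):
    """Reduce activations to a few dimensions, then group them with BIRCH.

    `activations` is an (n_tokens, d_model) array. All defaults here are
    illustrative assumptions, not values taken from the project.
    """
    if umap is not None:
        reducer = umap.UMAP(n_components=n_components, random_state=seed)
    else:
        # Substitute only: PCA is NOT the method described in the abstract.
        from sklearn.decomposition import PCA
        reducer = PCA(n_components=n_components, random_state=seed)

    embedding = reducer.fit_transform(activations)

    # BIRCH builds a tree of subclusters, then merges them into n_clusters groups.
    labels = Birch(n_clusters=n_clusters).fit_predict(embedding)
    return embedding, labels


# Synthetic stand-in for one layer's token activations (hypothetical data).
activations, _ = make_blobs(
    n_samples=300, n_features=64, centers=5, random_state=0
)
embedding, labels = cluster_latents(activations, n_clusters=5)
```

In practice, `activations` would be replaced by the hidden states of a chosen layer, one row per token, and the tokens sharing a label could then be inspected or summarized together.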

By Matthew Shinkle