Summary
This study introduces the Coordinated Sparse Autoencoder Network (CoSAEN) to investigate causal relationships between features, extracted by sparse autoencoders (SAEs), across MLP layers. CoSAEN combines sparse autoencoders for feature extraction with the PC algorithm for causal discovery in order to trace path-based activations of features through the MLP layers.
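As a rough illustration of the causal-discovery half of this pipeline, the sketch below runs a simplified PC-style skeleton search over three synthetic columns that stand in for SAE feature activations from consecutive MLP layers. Everything in it is an assumption made for illustration: the names `f0`, `f1`, `f2`, the linear-Gaussian toy data, the Fisher-z partial-correlation test, the significance level, and the cap of one conditioning variable. It is not the CoSAEN implementation, and it omits both SAE training and the edge-orientation phase of the PC algorithm.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(data, i, j, cond):
    """Correlation of i and j given zero or one conditioning variable."""
    r_ij = pearson(data[i], data[j])
    if not cond:
        return r_ij
    (k,) = cond  # this sketch supports at most one conditioning variable
    r_ik = pearson(data[i], data[k])
    r_jk = pearson(data[j], data[k])
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def independent(data, i, j, cond, alpha):
    """Fisher-z test: True if independence of i and j given cond is not rejected."""
    n = len(data[i])
    r = max(min(partial_corr(data, i, j, cond), 0.999999), -0.999999)
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - len(cond) - 3)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return p_value > alpha


def pc_skeleton(data, alpha=0.001, max_cond=1):
    """Start fully connected; drop an edge once some conditioning set separates it."""
    names = sorted(data)
    adj = {v: set(names) - {v} for v in names}
    for size in range(max_cond + 1):
        for i, j in combinations(names, 2):
            if j not in adj[i]:
                continue
            # Candidate conditioning sets come from the neighbours of i or of j.
            pools = (adj[i] - {j}, adj[j] - {i})
            cond_sets = {c for pool in pools
                         for c in combinations(sorted(pool), size)}
            if any(independent(data, i, j, c, alpha) for c in sorted(cond_sets)):
                adj[i].discard(j)
                adj[j].discard(i)
    return {frozenset((i, j)) for i in names for j in adj[i]}


# Hypothetical toy data: a chain f0 -> f1 -> f2 across three "layers".
random.seed(0)
n = 2000
f0 = [random.gauss(0, 1) for _ in range(n)]
f1 = [0.8 * a + random.gauss(0, 1) for a in f0]
f2 = [0.8 * b + random.gauss(0, 1) for b in f1]
data = {"f0": f0, "f1": f1, "f2": f2}

edges = pc_skeleton(data)
print(sorted(tuple(sorted(e)) for e in edges))
```

On a chain like this, the search would be expected to keep the two adjacent-layer edges and drop the direct `f0`/`f2` link once `f1` is conditioned on, which is the kind of path structure the summary describes. Real SAE activations are sparse and non-Gaussian, so a different conditional-independence test may be more appropriate in practice.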
Cite this work:
@misc {
title={Superposition, but at a Cross-MLP Layers view?},
author={Woon Yee Ng},
date={3/10/25},
organization={Apart Research},
note={Research submission to the research sprint hosted by Apart.},
howpublished={https://apartresearch.com}
}