May 26, 2024
Manifold Recovery as a Benchmark for Text Embedding Models
Lennart Finke
Summary
Drawing on recent developments in the interpretability of deep learning models and on dimensionality reduction, we derive a framework to quantify the interpretability of text embedding models. Using the example of recovering the world map from place names, our empirical results reveal surprising phenomena in state-of-the-art embedding models and offer a way to compare them. We hope that this can provide a benchmark for the interpretability of generative language models through their internal embeddings. A look at the meta-benchmark MTEB suggests that our approach is original.
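To make the map-recovery idea concrete, here is a minimal sketch. It is not the method used in this work: it assumes a simple linear least-squares map from embeddings to coordinates, scored by held-out R^2. The function name `recovery_score` and all data are hypothetical, and the "embeddings" are synthetic stand-ins rather than the output of a real embedding model.

```python
import numpy as np

def recovery_score(embeddings, coords, train_frac=0.7, seed=0):
    """Held-out R^2 of a linear map from embeddings to 2-D coordinates.

    Assumption: a linear probe stands in for manifold recovery here;
    the original work may use a different reduction or metric.
    """
    rng = np.random.default_rng(seed)
    n = embeddings.shape[0]
    perm = rng.permutation(n)
    n_train = int(train_frac * n)
    train, test = perm[:n_train], perm[n_train:]

    # Append a constant column so the fitted map can include an offset.
    X = np.hstack([embeddings, np.ones((n, 1))])
    W, *_ = np.linalg.lstsq(X[train], coords[train], rcond=None)

    pred = X[test] @ W
    ss_res = np.sum((coords[test] - pred) ** 2)
    ss_tot = np.sum((coords[test] - coords[test].mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (hypothetical): random "places" with lat/lon.
rng = np.random.default_rng(42)
n_places, dim = 300, 32
coords = np.column_stack([
    rng.uniform(-90, 90, n_places),    # latitude
    rng.uniform(-180, 180, n_places),  # longitude
])

# "Structured" embeddings contain a noisy linear image of the coordinates;
# "unstructured" embeddings are pure noise with no geographic signal.
mixing = rng.normal(size=(2, dim))
structured = coords @ mixing + rng.normal(scale=1.0, size=(n_places, dim))
unstructured = rng.normal(size=(n_places, dim))

score_structured = recovery_score(structured, coords)
score_unstructured = recovery_score(unstructured, coords)
print(f"structured:   {score_structured:.3f}")
print(f"unstructured: {score_unstructured:.3f}")
```

With real data, `structured` would be replaced by the embeddings of place names from the model under test, and `coords` by their true latitude and longitude. A higher held-out score would then indicate that the map is more readily recoverable from that model's embedding space.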
Cite this work:
@misc{finke2024manifold,
  title={Manifold Recovery as a Benchmark for Text Embedding Models},
  author={Lennart Finke},
  date={2024-05-26},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}