Jun 2, 2024
Exploring Hierarchical Structure Representation in Transformer Models through Computational Mechanics
Olli Järviniemi, Udayanto Dwi Atmojo, Aayush Kucheria, Konsta Tiilikainen
Summary
This work explores how well an approach based on computational mechanics copes with a more complex, hierarchical generative process, i.e., a process composed of Hidden Markov Models (HMMs) whose transition probabilities change over time.
We find that small transformer models are capable of modeling such changes in an HMM. However, our preliminary investigations did not find geometrically represented probabilities for different hypotheses.
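To make the phrase "HMMs whose transition probabilities change over time" concrete, the following is a minimal sketch of one such hierarchical process. It is an illustration under stated assumptions, not the setup used in this work: the two-regime structure, the matrices, the emission probabilities, and the names `sample_hierarchical_hmm` and `switch_prob` are all hypothetical.

```python
import random


def sample_hierarchical_hmm(n_steps, switch_prob=0.05, seed=0):
    """Sample tokens from a 2-state HMM whose transition matrix changes over time.

    A hidden top-level "regime" selects which transition matrix is active,
    and the regime itself flips with probability `switch_prob` at each step.
    All numbers are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)

    # Two hypothetical transition matrices (rows: current state, cols: next state).
    transitions = [
        [[0.9, 0.1], [0.1, 0.9]],  # regime 0: states tend to persist
        [[0.2, 0.8], [0.8, 0.2]],  # regime 1: states tend to alternate
    ]
    # Hypothetical emission model: P(token = 1 | hidden state).
    emit_one = [0.1, 0.9]

    regime, state = 0, 0
    tokens, regimes = [], []
    for _ in range(n_steps):
        # Top level: possibly switch which transition matrix is in force.
        if rng.random() < switch_prob:
            regime = 1 - regime
        # Lower level: advance the hidden state using the active matrix.
        p_next_is_one = transitions[regime][state][1]
        state = 1 if rng.random() < p_next_is_one else 0
        # Emit an observed token from the new hidden state.
        tokens.append(1 if rng.random() < emit_one[state] else 0)
        regimes.append(regime)
    return tokens, regimes


tokens, regimes = sample_hierarchical_hmm(1000)
```

A model trained on `tokens` alone would need to track both the hidden state and the currently active regime to predict well; `regimes` is returned only so that such a hypothesis could be compared against ground truth.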
Cite this work:
@misc{jarviniemi2024exploring,
  title={Exploring Hierarchical Structure Representation in Transformer Models through Computational Mechanics},
  author={Olli Järviniemi and Udayanto Dwi Atmojo and Aayush Kucheria and Konsta Tiilikainen},
  date={2024-06-02},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}