Jun 2, 2025

LLM Fingerprinting Through Semantic Variability

Luiza Corpaci, Chris Forrester, Siddhesh Pawar

This project develops an LLM fingerprinting and analysis toolkit to increase transparency in AI routing systems, addressing Track 2: Intelligent Router Systems through two key investigations. We adapted semantic variability analysis to create behavioral fingerprints that identify which specific models are operating behind opaque routing services, and we conducted tool detection experiments under semantic noise to assess model robustness. Our findings show that models maintain high semantic robustness under noise, and that our fingerprinting technique reliably distinguishes between models based on their response patterns. These contributions support the Expert Orchestration Architecture vision by providing practical tools for auditing multi-model AI systems: organizations can verify which models their routers actually use and how reliably those models perform under real-world conditions, making router systems more transparent and trustworthy for production deployment.
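The core idea can be sketched in a few lines: repeatedly query an endpoint with a fixed set of probe prompts, embed the responses, and compare the resulting semantic variability profile against profiles collected from known models. The snippet below is a minimal illustration only; the probe prompts, the `query_model` callable, and the embedding model are placeholder assumptions, not the toolkit's actual implementation.

```python
# Minimal sketch of fingerprinting via semantic variability (illustrative only).
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

PROBE_PROMPTS = [
    "Explain recursion to a child.",
    "Summarize the plot of Hamlet in one sentence.",
    "List three everyday uses for a paperclip.",
]

def fingerprint(query_model, n_samples=5):
    """Return one variability score per probe: the mean pairwise cosine
    similarity across repeated completions of the same prompt."""
    scores = []
    for prompt in PROBE_PROMPTS:
        replies = [query_model(prompt) for _ in range(n_samples)]
        vecs = embedder.encode(replies, normalize_embeddings=True)
        sims = vecs @ vecs.T                              # cosine similarities (unit vectors)
        n = len(replies)
        scores.append((sims.sum() - n) / (n * (n - 1)))   # mean of off-diagonal entries
    return np.array(scores)

def closest_known_model(unknown_fp, known_fps):
    """Match an unknown endpoint to the known model whose fingerprint is nearest."""
    return min(known_fps, key=lambda name: np.linalg.norm(unknown_fp - known_fps[name]))
```

An unknown router endpoint would then be classified by computing its fingerprint vector and comparing it against a library of fingerprints precomputed for candidate models.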

Cite this work:

@misc{corpaci2025llmfingerprinting,
  title={LLM Fingerprinting Through Semantic Variability},
  author={Luiza Corpaci and Chris Forrester and Siddhesh Pawar},
  date={2025-06-02},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}

Reviewer's Comments


Erin Saint Gull

This is a well-executed research project that tackles critical AI transparency challenges with impressive methods. The cognitive load experiments reveal fascinating insights about AI attention management, particularly that technical jargon causes 96% tool reduction despite 100% acknowledgment, mirroring human selective attention failures. The most important finding, in my opinion, is that meta-analysis of the task constitutes cognitive load that isn’t recognized. Identifying this is crucial for AI safety improvements and defending against potential prompt injections.

Furthermore, this project addresses the model-routing opacity problem for AI governance and compliance with an impressive and novel method: the "hyperstrings" concept for probing model consistency patterns successfully distinguishes between 11 different models with statistical significance, which demonstrates real technical achievement. The experimental design shows scientific rigor, with systematic similarity measurements and clear statistical testing, while the preferential tool-dropping behavior (abandoning enhancements while preserving core functionality) provides actionable guidance for building robust AI systems.

The research successfully bridges theoretical AI interpretability with practical system transparency. While sample sizes could be larger for broader validation, the work creates exactly the kind of transparency tools needed as AI systems become more complex: understanding both "which model is running" and "how reliably it performs under real-world conditions." This is a strong contribution with immediate applicability to production environments.

This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.