Feb 2, 2026
Attested Multi-Agent Conversation Logs: A Tamper-Evident Black Box for AI Governance
Anantha Shakthi Ganeshan Thevar, Publius Dirac
As multi-agent AI systems assume high-stakes responsibilities, governance frameworks such as the EU AI Act demand traceable, auditable records of agent interactions. We introduce Attested Logs, an open-source Python library that serves as a tamper-evident "black box" for AI conversations. Each message is cryptographically signed (Ed25519), hash-chained (SHA-256), and anchored to trusted public keys, enabling fully offline verification of integrity, authenticity, and ordering. Inspired by aviation flight recorders and C2PA content provenance, the library supports incident forensics, regulatory compliance, and cross-organizational trust. Integrations for AutoGen (manual logging) and LangGraph (callback-based auditing) are provided, together with runnable demos showing end-to-end signing, verification, and tamper detection in real LLM conversations, and 20+ passing tests covering the core, crypto, chain, and verification layers.
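To illustrate the mechanism the abstract describes, here is a minimal sketch of sign-then-hash-chain logging with offline verification. This is not the Attested Logs API; the entry layout, function names, and genesis value are illustrative assumptions, built on the `cryptography` package's Ed25519 primitives and `hashlib`.

```python
# Illustrative sketch only -- not the Attested Logs library API.
import hashlib
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

GENESIS = "0" * 64  # assumed sentinel hash for the first entry

def append_entry(log, private_key, agent, content):
    """Sign a message and link it to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    # Canonical JSON so signer and verifier hash identical bytes.
    payload = json.dumps(
        {"agent": agent, "content": content, "prev_hash": prev_hash},
        sort_keys=True,
    ).encode()
    log.append({
        "agent": agent,
        "content": content,
        "prev_hash": prev_hash,
        "hash": hashlib.sha256(payload).hexdigest(),
        "sig": private_key.sign(payload).hex(),
    })

def verify_chain(log, public_keys):
    """Offline check of ordering, integrity, and authenticity."""
    prev_hash = GENESIS
    for entry in log:
        payload = json.dumps(
            {"agent": entry["agent"], "content": entry["content"],
             "prev_hash": entry["prev_hash"]},
            sort_keys=True,
        ).encode()
        if entry["prev_hash"] != prev_hash:
            return False  # reordered or dropped entry
        if hashlib.sha256(payload).hexdigest() != entry["hash"]:
            return False  # content tampered
        try:
            public_keys[entry["agent"]].verify(bytes.fromhex(entry["sig"]), payload)
        except Exception:
            return False  # forged or re-signed entry
        prev_hash = entry["hash"]
    return True
```

Because each entry embeds the previous entry's hash inside the signed payload, editing, deleting, or reordering any message invalidates every later hash link, so verification needs only the log and the trusted public keys, with no network access.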
This is the most shippable project here. It actually delivers a real library that people could use tomorrow: signed messages, hash chaining, offline verification, integrations, tests. For a hackathon, that’s exceptional execution.
Novelty is moderate because the crypto pieces are standard. The value is that you packaged it cleanly and made it usable in real agent frameworks. This is exactly the kind of infrastructure that improves auditability in deployed systems.
To go further: key management (rotation / revocation), what happens when an agent is compromised, and scaling / storage patterns. But as a sprint project: this is very strong.
Tamper-proof evidence of multi-agent interactions matters for future digital forensics, AI governance, and regulation, so the project explores an important problem. The write-up is clear on the purely technical side as well as on some of the related work, and the code artifact seems solid.
To make this a much stronger contribution: explore one clearly defined, realistic threat model in depth. Red-team your approach and figure out what the hard part actually is. A more in-depth exploration would probably reveal that the trust chain bottoming out at the model provider is a fundamental limitation of the approach you're describing. Put yourself in the shoes of an adversary (e.g., a model provider who wants to alter the logs): what attacks can they run, and how does your approach fare?
Cite this work
@misc{thevar2026attested,
  title={(HckPrj) Attested Multi-Agent Conversation Logs: A Tamper-Evident Black Box for AI Governance},
  author={Anantha Shakthi Ganeshan Thevar and Publius Dirac},
  date={2026-02-02},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}
