MicroVMM: AI-Assisted Verification-Oriented Virtual Machine Monitor
Thanh Do
Modern agentic systems increasingly execute untrusted AI-generated code inside lightweight sandboxing environments. While many existing sandboxes rely on operating-system namespaces, recent advances in autonomous vulnerability discovery have exposed weaknesses in these traditional isolation boundaries. At the same time, vulnerabilities in virtualization infrastructure itself demonstrate that lightweight VMMs remain difficult to reason about formally due to complex shared-memory device protocols such as VirtIO. This project explores whether AI-assisted workflows can help construct formally analyzable virtualization software centered around explicit protocol invariants. Motivated by CVE-2026-5747, we derive queue lifecycle invariants from the vulnerability and construct a minimal verification-oriented VMM in Lean 4. We define abstract operational semantics for queue lifecycle behavior and establish refinement proofs connecting executable implementation behavior to the abstract model. The resulting system proves several safety properties, including post-activation queue immutability, directly eliminating the motivating vulnerability class within the modeled semantics. Our results suggest that AI-assisted development workflows can reduce the engineering complexity of refinement-oriented systems verification when combined with intentionally verification-oriented architectural design.
It would be helpful to situate this work in related verified OS style work like CertiKOS and Sel4.
The claim is that AI-Assistance can help to develop verified vmms, but all we are given is an AI generated github repo. While this is in a way an existence proof, there is no baseline (how long does it take to generate this without AI). There is also no clear demonstration that the CVE of interest is no longer viable on this system. Additionally, how should we trust that the properties proved in Lean are actually useful? The generated documentation is so verbose it is not clear what we should care about.
While the effort to boot a Linux guest using Lean 4 is great for a hackathon, the paper’s claims go well beyond what is actually demonstrated. Much of the security framing relies on broad AI-safety language, while the core result is a limited systems-verification exercise. Delegating KVM interactions to an unverified C shim leaves a gap in the trusted computing base, so the proofs do not cover the security‑relevant host–guest boundary. Expanding the model to include the FFI boundary, concurrency, and more realistic devices would strengthen the contribution.
Cite this work
@misc {
title={
(HckPrj) MicroVMM: AI-Assisted Verification-Oriented Virtual Machine Monitor
},
author={
Thanh Do
},
date={
},
organization={Apart Research},
note={Research submission to the research sprint hosted by Apart.},
howpublished={https://apartresearch.com}
}


