Feb 2, 2026
Markov Chain Lock Watermarking: Provably Secure Authentication for LLM Outputs
Chengheng Li Chen, Kyuhee Kim
🏆 2nd Place Winner
We present Markov Chain Lock (MCL) watermarking, a cryptographically secure framework for authenticating LLM outputs. MCL constrains token generation to follow a secret Markov chain over SHA-256 vocabulary partitions. Using doubly stochastic transition matrices, we prove four theoretical guarantees: (1) exponentially decaying false positive rates via Hoeffding bounds, (2) graceful degradation under adversarial modification with closed-form expected scores, (3) information-theoretic security without key access, and (4) bounded quality loss via KL divergence. Experiments on 173 Wikipedia prompts using Llama-3.2-3B demonstrate that the optimal 7-state soft cycle configuration achieves 100% detection, 0% FPR, and perplexity 4.20. Robustness testing confirms detection above 96% even with 30% word replacement. The framework enables O(n) model-free detection, addressing EU AI Act Article 50 requirements. Code available at https://github.com/ChenghengLi/MCLW
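The detection procedure the abstract describes can be sketched in a few lines. This is a minimal reconstruction, not the paper's code: it assumes a keyed SHA-256 partition of token ids into states and a "hard cycle" chain in which each state permits exactly one successor (the paper's soft-cycle transition matrix would relax this), and the `token_state` keying scheme is my own guess.

```python
import hashlib

def token_state(token_id: int, key: bytes, n_states: int) -> int:
    """Map a token id to one of n_states vocabulary partitions via
    keyed SHA-256 (hypothetical keying; the paper's may differ)."""
    digest = hashlib.sha256(key + token_id.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big") % n_states

def detection_score(token_ids, key, n_states, allowed):
    """Fraction of consecutive state transitions permitted by the secret
    chain -- O(n) and model-free, as the abstract claims."""
    states = [token_state(t, key, n_states) for t in token_ids]
    if len(states) < 2:
        return 0.0
    hits = sum((a, b) in allowed for a, b in zip(states, states[1:]))
    return hits / (len(states) - 1)

# Toy hard 3-state cycle: state i may only be followed by (i + 1) mod 3.
N_STATES = 3
ALLOWED = {(i, (i + 1) % N_STATES) for i in range(N_STATES)}
KEY = b"secret-demo-key"

# Build a "watermarked" sequence by emitting only tokens whose partition
# matches the next cycle state (a stand-in for constrained decoding).
seq, state, tid = [], 0, 0
while len(seq) < 60:
    if token_state(tid, KEY, N_STATES) == state:
        seq.append(tid)
        state = (state + 1) % N_STATES
    tid += 1

print(detection_score(seq, KEY, N_STATES, ALLOWED))              # watermarked: 1.0
print(detection_score(list(range(60)), KEY, N_STATES, ALLOWED))  # unmarked: near 1/3 in expectation
```

Under the null hypothesis the score concentrates near 1/k for k states, so a Hoeffding bound gives FPR ≤ exp(−2n(τ − 1/k)²) at threshold τ, which is where the abstract's exponentially decaying false-positive rate comes from.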
When your theorems predict experimental results within 1-6% error, that's not a coincidence; that's a framework that actually works. The math checks out and the experiments back it up across three different models.
But the practical concerns are real. The detector needs to know which tokenizer generated the text. In a world where every model uses a different tokenizer, that's a deployment headache that the paper waves away. You'd need some kind of registry or metadata standard, and that's a whole separate problem.
The overlap story is also worrying. At 0% overlap you get perfect detection. At 10% overlap, detection craters to 35%. There's no graceful middle ground. You're either getting full watermark strength with constrained vocabulary, or you're relaxing vocabulary and losing the watermark almost entirely. For production use, that cliff is a problem.
And the robustness testing needs to be tougher. Replacing random words with "masked" isn't what a real adversary does. A real adversary paraphrases, back-translates, or runs the text through a second LLM. The 30% replacement survival rate is encouraging, but it's answering an easier question than the one that matters.
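A rough back-of-envelope supports the reviewer's read of the 30% result. Under a simple independence model (my assumption, not the paper's closed form), a transition stays valid only if both endpoint tokens survive the edit, and otherwise passes by chance with probability 1/k:

```python
import math

def expected_score(eps: float, k: int) -> float:
    """Expected detection score when a fraction eps of tokens is replaced
    at random. Simple independence model (an assumption, not the paper's
    closed-form result): a transition survives with probability (1-eps)^2,
    and a broken transition still passes by chance with probability 1/k."""
    survive = (1 - eps) ** 2
    return survive + (1 - survive) / k

def hoeffding_fpr(n: int, tau: float, k: int) -> float:
    """Hoeffding bound on the false-positive rate at threshold tau,
    given n scored transitions and null hit rate 1/k."""
    return math.exp(-2 * n * (tau - 1 / k) ** 2)

print(expected_score(0.3, 7))   # ~0.56: still far above the null rate 1/7
print(hoeffding_fpr(200, 0.35, 7))  # ≈ 3.5e-8
```

Even at 30% random replacement the expected score (~0.56) sits well above any reasonable threshold, which is consistent with the reported 96% survival; the open question is whether paraphrase attacks, which rewrite nearly every token, leave any such margin.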
The sparse watermarking idea they mention at the end (only watermark every N-th token) is where this probably needs to go for real-world use. That's worth a whole follow-up paper.
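Detection for that sparse variant would presumably score only the constrained positions. A hypothetical sketch (the actual scheme is future work; the `period` parameter and scoring rule here are my own framing):

```python
def sparse_detection_score(states, allowed, period: int) -> float:
    """Score only transitions ending at watermarked positions (every
    `period`-th token). Hypothetical sketch of the sparse variant the
    review points to; not from the paper."""
    checked = [(states[i - 1], states[i]) for i in range(1, len(states))
               if i % period == 0]
    if not checked:
        return 0.0
    return sum(t in allowed for t in checked) / len(checked)

# Toy 2-state chain: only alternations are allowed transitions.
allowed = {(0, 1), (1, 0)}
states = [0, 1, 0, 1, 0, 1, 0, 1]                    # fully alternating
print(sparse_detection_score(states, allowed, 3))    # 1.0
```

The trade-off is visible in the Hoeffding bound: checking only every N-th transition divides n by N in the exponent, so a sparser watermark needs proportionally longer texts to reach the same false-positive rate.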
Really nice idea, and impressive progress towards developing it in the timespan of a hackathon.
Cite this work
@misc{
  title={(HckPrj) Markov Chain Lock Watermarking: Provably Secure Authentication for LLM Outputs},
  author={Chengheng Li Chen and Kyuhee Kim},
  date={2026-02-02},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


