Feb 2, 2026

Markov Chain Lock Watermarking: Provably Secure Authentication for LLM Outputs

Chengheng Li Chen, Kyuhee Kim

🏆 2nd Place Winner

We present Markov Chain Lock (MCL) watermarking, a cryptographically secure framework for authenticating LLM outputs. MCL constrains token generation to follow a secret Markov chain over SHA-256 vocabulary partitions. Using doubly stochastic transition matrices, we prove four theoretical guarantees: (1) exponentially decaying false positive rates via Hoeffding bounds, (2) graceful degradation under adversarial modification with closed-form expected scores, (3) information-theoretic security without key access, and (4) bounded quality loss via KL divergence. Experiments on 173 Wikipedia prompts using Llama-3.2-3B demonstrate that the optimal 7-state soft cycle configuration achieves 100% detection, 0% false positive rate (FPR), and perplexity 4.20. Robustness testing confirms detection above 96% even with 30% word replacement. The framework enables O(n) model-free detection, addressing EU AI Act Article 50 requirements. Code is available at https://github.com/ChenghengLi/MCLW
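To make the detection idea concrete, here is a minimal sketch under stated assumptions; it is not the authors' implementation. It assumes a hard cycle over k partitions (state s may only be followed by state (s + 1) mod k), whereas the abstract's best configuration is a "soft" cycle whose details are not given here. The names `state_of`, `detection_score`, `hoeffding_fpr_bound`, and `toy_watermarked_sequence` are hypothetical, and the bound assumes that each transition in unwatermarked text matches independently with probability 1/k.

```python
import hashlib
import math


def state_of(token_id: int, key: bytes, k: int) -> int:
    """Map a token id to one of k partitions with a keyed SHA-256 hash."""
    digest = hashlib.sha256(key + token_id.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big") % k


def detection_score(token_ids, key: bytes, k: int) -> float:
    """Fraction of adjacent token pairs that follow the cycle s -> (s + 1) mod k.

    Runs in a single pass over the tokens and needs only the key, not the model.
    """
    states = [state_of(t, key, k) for t in token_ids]
    pairs = list(zip(states, states[1:]))
    if not pairs:
        return 0.0
    hits = sum(1 for a, b in pairs if b == (a + 1) % k)
    return hits / len(pairs)


def hoeffding_fpr_bound(n_pairs: int, k: int, threshold: float) -> float:
    """Hoeffding upper bound on P(score >= threshold) for unwatermarked text.

    Assumption: each transition matches independently with probability 1/k.
    """
    gap = threshold - 1.0 / k
    if gap <= 0:
        return 1.0  # the bound is vacuous at or below the chance rate
    return math.exp(-2.0 * n_pairs * gap * gap)


def toy_watermarked_sequence(length: int, key: bytes, k: int, vocab_size: int = 1000):
    """Build a toy token sequence that follows the cycle exactly (illustration only)."""
    first_token_in_state = {}
    for t in range(vocab_size):
        first_token_in_state.setdefault(state_of(t, key, k), t)
    return [first_token_in_state[i % k] for i in range(length)]
```

In this sketch a detector would flag text whose `detection_score` exceeds a chosen threshold, and `hoeffding_fpr_bound` shows how the false positive bound shrinks exponentially as the number of scored transitions grows.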


Reviewer's Comments

When your theorems predict experimental results within 1-6% error, that's not a coincidence; that's a framework that actually works. The math seems to check out and the experiments back it up across three different models.

But the practical concerns are real. The detector needs to know which tokenizer generated the text. In a world where every model uses a different tokenizer, that's a deployment headache that the paper waves away. You'd need some kind of registry or metadata standard, and that's a whole separate problem.

The overlap story is also worrying. At 0% overlap you get perfect detection. At 10% overlap, detection craters to 35%. There's no graceful middle ground. You're either getting full watermark strength with constrained vocabulary, or you're relaxing vocabulary and losing the watermark almost entirely. For production use, that cliff is a problem.

And the robustness testing needs to be tougher. Replacing random words with "masked" isn't what a real adversary does. A real adversary paraphrases, back-translates, or runs the text through a second LLM. The 30% replacement survival rate is encouraging, but it's answering an easier question than the one that matters.
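For reference, the random-replacement attack described above can be sketched in a few lines. This is an assumed reconstruction, not the authors' test harness: the function name `replace_words`, the whitespace tokenization, and the seeded choice of positions are all hypothetical.

```python
import random


def replace_words(text: str, fraction: float, seed: int = 0, filler: str = "masked") -> str:
    """Replace a random fraction of whitespace-separated words with a filler word."""
    words = text.split()
    n_replace = round(fraction * len(words))
    rng = random.Random(seed)  # seeded so the attack is reproducible
    for i in rng.sample(range(len(words)), n_replace):
        words[i] = filler
    return " ".join(words)
```

An attack like this leaves the untouched words, and therefore most token transitions, intact, which is why paraphrasing or back-translation is the harder test.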

The sparse watermarking idea they mention at the end (only watermark every N-th token) is where this probably needs to go for real-world use. That's worth a whole follow-up paper.

Really nice idea, and impressive progress towards developing it in the timespan of a hackathon.

Cite this work

@misc{
  title={(HckPrj) Markov Chain Lock Watermarking: Provably Secure Authentication for LLM Outputs},
  author={Chengheng Li Chen, Kyuhee Kim},
  date={2/2/26},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}

Recent Projects


Feb 2, 2026

Prototyping an Embedded Off-Switch for AI Compute

This project prototypes an embedded off-switch for AI accelerators. The security block requires periodic cryptographic authorization to operate: the chip generates a nonce, an external authority signs it, and the chip verifies the signature before granting time-limited permission. Without valid authorization, outputs are gated to zero. The design was implemented in HardCaml and validated in simulation.

Feb 2, 2026

Fingerprinting All AI Cluster I/O Without Mutually Trusted Processors

We design and simulate a "border patrol" device for generating cryptographic evidence of data traffic entering and leaving an AI cluster, while eliminating the specific analog and steganographic side-channels that post-hoc verification cannot close. The device eliminates the need for any mutually trusted logic, while still meeting the security needs of the prover and verifier.

Feb 2, 2026

Political Intelligence for AI Safety: The AI Risk Attitudes Survey (AIRAS)

The AI safety and governance community is making progress on defining red lines around existential risk from advanced AI systems, and building verification infrastructure to support this objective. However, this is only half the battle. Implementing these red lines requires unprecedented international coordination, and enforcing them requires credible commitments from national governments, all during a period of increased geopolitical tensions.

We present a whitepaper that makes the case that:

Implementation and enforcement of red lines is as much a political question as it is a technical one

While studies exist showing general public support for AI regulation, the depth of domestic political support among major powers for costly international AI regulation and willingness to make real sacrifices to enforce these red lines is understudied

Rigorous policy-credible surveys of public opinion in this area would both enhance the AI safety community’s efforts to prioritize scarce resources to the most effective action areas and provide advocates a valuable source of evidence to point to when lobbying national governments and international organizations

The whitepaper therefore proposes and develops the methodology for AIRAS.

Feb 2, 2026