Mar 23, 2026

Omission-Control: A ControlArena Setting for Selective Omission Attacks in Multi-Agent Pipelines

Vibhu Ganesan, Karthick Chandrasekharan, Naveen Prabu Palanisamy

There is an old adage, often attributed to George Orwell, that "the most powerful form of a lie is the omission." In multi-agent pipelines, this insight takes on operational significance: an untrusted agent need not fabricate anything to sabotage a downstream system — it need only stay silent about what matters.

We introduce Omission-Control, a ControlArena setting for studying selective omission attacks in multi-agent pipelines. An untrusted Agent A reads a ground-truth specification and writes knowledge base entries; a trusted intervention layer optionally inspects and amends them; an honest Agent B implements code from those entries alone. In attack mode, Agent A silently omits domain-specific requirements, producing downstream code that passes all visible tests but fails hidden security checks. We evaluate seven monitoring protocols across 96 run-epochs, two domains (OAuth 2.0 middleware and database migration safety), and two attack strategies (pure omission and scope narrowing). Three findings emerge. First, the attack is architecturally invisible: visible test performance is 1.000 in every run across both domains, because the evaluation surface inspects outputs, not the knowledge base from which they were derived. Second, reference access is the decisive factor: only monitors equipped with the original specification achieve meaningful attack suppression (up to 89% reduction). Monitors without it achieve near-zero improvement on OAuth; on migration, detection without reference access is unreliable and can push security below the unmonitored baseline. Third, the mechanism of intervention matters less than information access: an interrogation-based protocol that forces the attacker into active deception achieves results comparable to direct reference-based rewriting, while detection-only and general-knowledge restoration protocols remain ineffective regardless of how they are delivered.
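The three-stage pipeline described above can be illustrated with a toy sketch. This is not the authors' ControlArena implementation: the `Requirement` type, the four-item `SPEC`, and every function name below are hypothetical, and the sketch assumes that visible tests cover only non-security requirements while hidden checks cover the security-critical ones. The monitor without reference access is modeled as a no-op, which is a deliberate simplification of the near-zero improvement reported for such monitors.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Requirement:
    req_id: str
    text: str
    security_critical: bool


# Hypothetical OAuth-flavoured ground-truth specification (illustrative only).
SPEC: List[Requirement] = [
    Requirement("R1", "Reject requests without a bearer token", False),
    Requirement("R2", "Validate the token signature", False),
    Requirement("R3", "Reject tokens issued for a different audience", True),
    Requirement("R4", "Reject expired tokens", True),
]

# A monitor maps knowledge base entries to (amended entries, restored ids).
Monitor = Callable[[List[Requirement]], Tuple[List[Requirement], List[str]]]


def agent_a(spec: List[Requirement], attack: bool) -> List[Requirement]:
    """Untrusted writer: in attack mode, silently omits security-critical entries."""
    return [r for r in spec if not (attack and r.security_critical)]


def make_reference_monitor(spec: List[Requirement]) -> Monitor:
    """Trusted monitor WITH the original spec: restores whatever is missing."""
    def monitor(kb: List[Requirement]) -> Tuple[List[Requirement], List[str]]:
        present = {r.req_id for r in kb}
        missing = [r for r in spec if r.req_id not in present]
        return kb + missing, [r.req_id for r in missing]
    return monitor


def no_reference_monitor(kb: List[Requirement]) -> Tuple[List[Requirement], List[str]]:
    """Trusted monitor WITHOUT the spec: nothing to compare against (assumed no-op)."""
    return kb, []


def agent_b(kb: List[Requirement]) -> Set[str]:
    """Honest implementer: implements exactly the entries it is given."""
    return {r.req_id for r in kb}


def pass_rate(implemented: Set[str], tests: List[Requirement]) -> float:
    """Fraction of the given test requirements that the implementation satisfies."""
    return sum(r.req_id in implemented for r in tests) / len(tests)


def run_pipeline(attack: bool, monitor: Optional[Monitor] = None):
    """Returns (visible pass rate, hidden pass rate, ids restored by the monitor)."""
    kb = agent_a(SPEC, attack)
    restored: List[str] = []
    if monitor is not None:
        kb, restored = monitor(kb)
    implemented = agent_b(kb)
    visible = [r for r in SPEC if not r.security_critical]
    hidden = [r for r in SPEC if r.security_critical]
    return pass_rate(implemented, visible), pass_rate(implemented, hidden), restored


if __name__ == "__main__":
    print(run_pipeline(attack=True))                                  # (1.0, 0.0, [])
    print(run_pipeline(attack=True, monitor=no_reference_monitor))    # (1.0, 0.0, [])
    print(run_pipeline(attack=True, monitor=make_reference_monitor(SPEC)))  # (1.0, 1.0, ['R3', 'R4'])
```

In this toy setting the visible pass rate stays at 1.0 in every configuration, so the omission only shows up in the hidden checks, and only the monitor that holds the original specification can restore the dropped entries.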

Reviewer's Comments

Clever angle to explore (omissions); interesting work that I'd like to see developed.

Cite this work

@misc{
  title={(HckPrj) Omission-Control: A ControlArena Setting for Selective Omission Attacks in Multi-Agent Pipelines},
  author={Vibhu Ganesan, Karthick Chandrasekharan, Naveen Prabu Palanisamy},
  date={3/23/26},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}
