Pragmatic Sophistry in Vietnamese Multi-Agent Oversight
Bui Cao Dong Nghi, Truong Ngoc Minh Tam, Bui Ngoc Linh
AI safety monitors are usually evaluated on the assumption that risky behavior is lexically visible in the text being watched. We test this assumption in a multilingual, multi-agent setting: Vietnamese-language workflow agents in which control violations are expressed not through explicit risk keywords but through pragmatic, culturally fluent language (hierarchy, relationship pressure, euphemism, omission, and routine framing). We build a four-stage pipeline (attacker-renderer, fluent oracle, naive executor, monitor) and evaluate two LLM monitors (Gemini 2.5 Flash, DeepSeek Chat) under three visibility conditions (seeing the original instruction, only the executor's action/report, or the full instruction-action trace) across three domains: e-commerce, banking, and government. Our central finding is a context-compression failure that is strongly domain-dependent. In e-commerce, monitoring only the executor's action raised false negatives on suspicious pragmatic-Vietnamese requests from 6.4% to 22.3%, with full-trace monitoring recovering performance to 8.5%; up to 24.4% of violations catchable from the instruction were missed once only the action was shown. This failure was mild in banking and absent in government, where violations are concrete, explicit actions that survive compression. Across all domains, the rare misses were high-confidence (0.90–1.00). We conclude that the executor/report layer can disguise safety-relevant context in some domains, that the vulnerability tracks the type of violation rather than language alone, and that preserving the instruction-action trace is an important, low-cost safeguard for multilingual agentic oversight.
No reviews are available yet
Cite this work
@misc {
title={
(HckPrj) Pragmatic Sophistry in Vietnamese Multi-Agent Oversight
},
author={
Bui Cao Dong Nghi, Truong Ngoc Minh Tam, Bui Ngoc Linh
},
date={
},
organization={Apart Research},
note={Research submission to the research sprint hosted by Apart.},
howpublished={https://apartresearch.com}
}


