Jul 28, 2025

AI agentic system epidemiology

Valentina Schastlivaia, Aray Karjauv


As AI systems scale into decentralized, multi-agent deployments, emergent vulnerabilities challenge our ability to evaluate and manage systemic risks.

In this work, we adapt classical epidemiological modeling (specifically SEIR compartment models) to model adversarial behavior propagation in AI agents.
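To make the adaptation concrete, here is a minimal sketch of a standard SEIR compartment model simulated over a normalized agent population. The parameter values are illustrative placeholders, not the paper's calibrated estimates:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative rates, not the paper's calibrated values.
BETA = 0.30    # adversarial transmission rate per contact
SIGMA = 0.20   # rate at which exposed agents become actively compromised
GAMMA = 0.10   # detection/patching (removal) rate

def seir(t, y, beta=BETA, sigma=SIGMA, gamma=GAMMA):
    """Classic SEIR right-hand side over normalized agent fractions."""
    s, e, i, r = y
    return [-beta * s * i,
            beta * s * i - sigma * e,
            sigma * e - gamma * i,
            gamma * i]

# Start with 1% of agents exposed to an adversarial payload.
y0 = [0.99, 0.01, 0.0, 0.0]
sol = solve_ivp(seir, (0, 200), y0, dense_output=True)
s, e, i, r = sol.y[:, -1]
print(f"final susceptible fraction: {s:.3f}, removed: {r:.3f}")
```

With these illustrative rates the outbreak sweeps through most of the population before removal (detection and patching) halts it.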

By solving the resulting systems of ODEs with physics-informed neural networks (PINNs), we analyze stable and unstable equilibria, bifurcation points, and the effectiveness of interventions.
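The equilibrium and bifurcation analysis mentioned above can be illustrated for the standard SEIR model, where the basic reproduction number is R_0 = beta/gamma and the uncompromised ("disease-free") equilibrium loses stability exactly at R_0 = 1. A minimal numerical check via the Jacobian of the infected subsystem (this is textbook SEIR analysis, not the paper's specific computation):

```python
import numpy as np

def jacobian_dfe(beta, sigma, gamma):
    """Jacobian of the (E, I) infected subsystem of a standard SEIR
    model, linearized at the disease-free equilibrium S = 1."""
    return np.array([[-sigma, beta],
                     [sigma, -gamma]])

def is_stable(beta, sigma, gamma):
    """The disease-free equilibrium is locally stable iff all
    eigenvalues of the infected-subsystem Jacobian have negative
    real part."""
    eig = np.linalg.eigvals(jacobian_dfe(beta, sigma, gamma))
    return bool(np.all(eig.real < 0))

# Stability flips at R_0 = beta/gamma = 1 (the bifurcation point).
print(is_stable(beta=0.05, sigma=0.2, gamma=0.1))  # R_0 = 0.5
print(is_stable(beta=0.30, sigma=0.2, gamma=0.1))  # R_0 = 3.0
```

In security terms, the bifurcation point marks the threshold where an adversarial payload can no longer sustain itself against the detection-and-patching rate.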

We estimate parameters from real-world data (e.g., adversarial success rates, detection latency, patching delays) and simulate attack propagation scenarios across 8 sectors (enterprise, retail, trading, development, customer service, academia, medical, and critical infrastructure AI tools).
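One plausible (hypothetical) mapping from the measurable quantities named above to SEIR rates is sketched below; the paper's actual calibration procedure is not reproduced here, and every formula in this block is an assumption:

```python
# Hypothetical mapping from measurable security metrics to SEIR rates;
# the paper's actual calibration is not specified here.
def estimate_rates(contact_rate_per_day, adv_success_prob,
                   activation_delay_days, detection_latency_days,
                   patch_delay_days):
    beta = contact_rate_per_day * adv_success_prob       # transmission rate
    sigma = 1.0 / activation_delay_days                  # exposed -> infectious
    gamma = 1.0 / (detection_latency_days + patch_delay_days)  # removal rate
    return beta, sigma, gamma

# E.g., 10 inter-agent contacts/day, 3% adversarial success rate,
# 2-day activation delay, 5-day detection latency, 5-day patching delay.
beta, sigma, gamma = estimate_rates(10, 0.03, 2.0, 5.0, 5.0)
r0 = beta / gamma
print(f"beta={beta:.2f}/day, gamma={gamma:.2f}/day, R0={r0:.1f}")
```

Note that under a mapping like this, shortening detection latency or patching delay raises gamma and directly lowers R_0, which is how policy interventions enter the dynamics.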

Our results demonstrate how agent population dynamics interact with architectural and policy design interventions to stabilize the system.

This framework bridges concepts from dynamical systems and cybersecurity to offer a proactive, quantitative toolbox for AI safety.

We argue that epidemic-style monitoring and tools grounded in interpretable, physics-aligned dynamics can serve as early warning systems for cascading AI agentic failures.

Reviewer's Comments



This is a cool, cross-disciplinary project using PINNs and epidemiological models to study propagation of adversarial behavior in multi-agent systems. The physics angle and potential AI safety impact is there, and it does seem like this could be modeled as a dynamical system. The states each agent can take, like 'exposed', 'infected', and 'removed', make sense, and the parameters in their equations of motion are recast in an AI context, but the paper is missing more motivation of these equations and the form of L_physics. Overall, the project reads like the description of a method, which is OK but not a proof of concept of adversarial spread in multi-agent systems (which is how the introduction reads). Walking through a concrete example would have helped, as it's unclear what 'adversarial' means in this context or what they took for initial conditions (for example). Without comparing the modeled data to a ground truth, it's hard to know if this would truly carry over to an AI setting. Perhaps it would have worked better as an exploration.

This project’s application of epidemiological dynamical systems models to study population dynamics of multi-agent systems is very interesting, and highlights an innovative way in which methods from physics (PINNs) can be applied in AI safety. My main critique is that the setup seems somewhat contrived, with malicious agents “infecting” normal ones, and it’s not clear to me how realistic this model is. That said, I think this approach is worth exploring further.

The topic is relevant for multi-agent, decentralized AI deployments. Using a physics-grounded dynamical systems lens to reason about systemic security risk is well-motivated. However, the empirical grounding and calibration require substantial tightening; there are inconsistencies in results presentation (e.g., risk labels vs. R_0 values) and several places where methodological choices are asserted but not validated (e.g., mapping from adversarial success rates to beta via "guestimation"). But it's a really interesting approach, and fairly new and relevant as multi-agent systems become more common.

Cite this work

@misc{
  title={(HckPrj) AI agentic system epidemiology},
  author={Valentina Schastlivaia and Aray Karjauv},
  date={2025-07-28},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}

Recent Projects

Jan 11, 2026

Eliciting Deception on Generative Search Engines

Large language models (LLMs) with web browsing capabilities are vulnerable to adversarial content injection—where malicious actors embed deceptive claims in web pages to manipulate model outputs. We investigate whether frontier LLMs can be deceived into providing incorrect product recommendations when exposed to adversarial pages.

We evaluate four OpenAI models (gpt-4.1-mini, gpt-4.1, gpt-5-nano, gpt-5-mini) across 30 comparison questions spanning 10 product categories, comparing responses between baseline (truthful) and adversarial (injected) conditions. Our results reveal significant variation: gpt-4.1-mini showed 45.5% deception rate, while gpt-4.1 demonstrated complete resistance. Even frontier gpt-5 models exhibited non-zero deception rates (3.3–7.1%), confirming that adversarial injection remains effective against current models.

These findings underscore the need for robust defenses before deploying LLMs in high-stakes recommendation contexts.


Jan 11, 2026

SycophantSee - Activation-based diagnostics for prompt engineering: monitoring sycophancy at prompt and generation time

Activation monitoring reveals that prompt framing affects a model's internal state before generation begins.


Jan 11, 2026

Who Does Your AI Serve? Manipulation By and Of AI Assistants

AI assistants can be both instruments and targets of manipulation. In our project, we investigated both directions across three studies.

AI as Instrument: Operators can instruct AI to prioritise their interests at the expense of users. We found models comply with such instructions 8–52% of the time (Study 1, 12 models, 22 scenarios). In a controlled experiment with 80 human participants, an upselling AI reliably withheld cheaper alternatives from users - not once recommending the cheapest product when explicitly asked - and ~one third of participants failed to detect the manipulation (Study 2).

AI as Target: Users can attempt to manipulate AI into bypassing safety guidelines through psychological tactics. Resistance varied dramatically - from 40% (Mistral Large 3) to 99% (Claude 4.5 Opus) - with strategic deception and boundary erosion proving most effective (Study 3, 153 scenarios, AI judge validated against human raters r=0.83).

Our key finding was that model selection matters significantly in both settings. We learned some models complied with manipulative requests at much higher rates. And we found some models readily follow operator instructions that come at the user's expense - highlighting a tension for model developers between serving paying operators and protecting end users.


This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.