Nov 23, 2025
Actions speak louder than words: Evaluating Tool Usage Risk in Open-Weight AI for Defensive Deployment
Ana Belen Barbero Castejon, Carlos Vecina Tebar
Open-weight language models are increasingly deployed in agentic workflows where they can invoke external tools, creating attack vectors beyond those of traditional text generation. We present a systematic evaluation framework that measures how model tampering affects tool-usage behavior across cybersecurity, bioengineering, and decision-making domains. Evaluating 10 model variants (5 base + 5 tampered) across 125 scenarios, we find that tampering produces a statistically significant increase in harmful tool invocation rates, with effects that vary significantly by domain. Our framework, released as open source with an interactive dashboard, provides a foundation for monitoring AI agent behavior in production deployments and highlights tool-usage/MCP security as a critical defensive intervention point. (https://github.com/AnaBelenBarbero/OW-AI-cyber-bio-actions-eval)
Keywords: AI tools security, open-weight models, model tampering, agentic AI
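The abstract's central comparison, harmful tool invocation rates for base versus tampered models, can be illustrated with a simple significance check. The sketch below is a one-sided two-proportion z-test in plain Python. It is an assumption made for illustration: the statistical method the authors actually used is not stated here, the function name `two_proportion_z_test` is hypothetical, and the counts in the usage lines are invented placeholders, not results from the project.

```python
import math


def two_proportion_z_test(harmful_base, total_base, harmful_tampered, total_tampered):
    """One-sided test of whether the tampered harmful-invocation rate exceeds the base rate.

    Returns (base_rate, tampered_rate, z, p_value).
    """
    rate_base = harmful_base / total_base
    rate_tampered = harmful_tampered / total_tampered

    # Pooled rate under the null hypothesis that both models share one rate.
    pooled = (harmful_base + harmful_tampered) / (total_base + total_tampered)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_base + 1 / total_tampered))
    if std_err == 0:
        raise ValueError("rates are all 0 or all 1; the z-test is undefined")

    z = (rate_tampered - rate_base) / std_err
    # Upper-tail probability of the standard normal: P(Z >= z).
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return rate_base, rate_tampered, z, p_value


# Hypothetical counts for illustration only (NOT the project's data):
# 20 of 125 scenarios harmful for a base model, 45 of 125 for its tampered variant.
rate_base, rate_tampered, z, p_value = two_proportion_z_test(20, 125, 45, 125)
print(f"base={rate_base:.2f} tampered={rate_tampered:.2f} z={z:.2f} p={p_value:.5f}")
```

Running the same check separately per domain (cybersecurity, bioengineering, decision-making) would be one way to surface the domain-level variation the abstract mentions. With several model pairs and domains, a multiple-comparison correction would also be worth considering.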
Mackenzie Puig-Hall
Strong prototype, and the written report does a great job of shifting thinking about AI risk from what models say to what models do. I also appreciate the great title and the inclusion of a real-world case.
For this project, I would appreciate seeing a short literature review just to show that this research is not repeating work happening elsewhere. I would also like a slightly firmer argument for why open-weight models are higher risk.
Team clearly outlined application to a variety of domains. Next step would be to anchor one of those in a more realistic scenario and show how this evaluation framework would identify and quantify risk in a real setting. With such a robust report, I'd like to see an example of what a final deployment would look like.
Cite this work
@misc{
  title={(HckPrj) Actions speak louder than words: Evaluating Tool Usage Risk in Open-Weight AI for Defensive Deployment},
  author={Ana Belen Barbero Castejon and Carlos Vecina Tebar},
  date={11/23/25},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


