Sep 15, 2025

Arbiter: Automated Review of Bio-AI Tools for Emerging Risk

Ryan Teo, Shrestha Rath

The rapid advancement of AI-enabled biological tools presents significant biosecurity and AI safety challenges, necessitating scalable and balanced assessments. Building on the Global Risk Index (GRI) for AI-enabled Biological Tools report's foundational framework, this work introduces Arbiter, an automated pipeline designed to overcome the GRI's resource-intensive manual analysis. Arbiter employs a multi-stage LLM-driven process to analyze scientific literature, systematically evaluating AI-Bio tools for misuse risks and, notably, their potential benefits. This includes assessing economic impact, national competitiveness, and crisis response capabilities, providing policymakers with the comprehensive, balanced insights required for informed decision-making. A pilot execution demonstrated Arbiter's ability to efficiently monitor emerging tools, highlight areas for prompt and model refinement, and support the development of future decision-support frameworks. Arbiter's modular and extensible design empowers users to tailor analyses, ensuring continuous, scalable, and adaptable oversight in this dynamic field.

Reviewer's Comments


This project is a very well-conceived prototype. Translating the Global Risk Index into an automated system is brilliant; the GRI is already being used as a cornerstone of AI-Bio risk assessment.

The documentation could use more usage examples, and the flow diagram from the paper could also go into the documentation. Since the paper notes the risk of amplifying dual-use details, the repo should carry the same disclaimer.

Excellent idea to expand the suggestions in the GRI report, and Arbiter is a promising prototype: modular, flexible, well-documented, and already working as an MVP. It’s also interesting that the project considers not only risks but also the potential benefits of bio-AI tools.

The main weakness is that the quality of the AI evaluations is uncertain without comparison to expert judgments, which could limit how much trust one can put in the results. A good and obvious next step would be to run calibration studies comparing Arbiter’s outputs with expert reviews, on a wider range of bio-AI tools.
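A minimal sketch of what such a calibration check might look like, assuming Arbiter's output and an expert review can each be reduced to one categorical risk label per tool. The labels, the sample data, and the function name `cohens_kappa` are hypothetical illustrations; Cohen's kappa is one common agreement statistic, not something the submission specifies.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two raters over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(labels_a)

    # Fraction of items on which the two raters give the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical risk labels for six tools (not real Arbiter or expert output).
arbiter_labels = ["high", "low", "medium", "low", "high", "low"]
expert_labels = ["high", "low", "low", "low", "high", "medium"]

kappa = cohens_kappa(arbiter_labels, expert_labels)
print(f"Cohen's kappa: {kappa:.3f}")
```

A kappa near 1 would indicate close agreement with experts, while a value near 0 would mean agreement no better than chance. A real study would need many more tools and probably a per-dimension breakdown.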

Regarding minimal code replication, I managed to run Arbiter and the tests without issue.

Key strengths of the approach

There were strong visual aids (a clear flowchart and table) that guide the reader. The worked example using the Europe PMC database also helped show how the tool could be used. I liked that it built on existing literature (the GRI paper) by building the tool from the appendix. There was also a clear explanation of the tool's stages and objectives. Some key gaps were identified, acknowledging both risks and transformative benefits. The need for this tool was very clear (to keep up with the pace of the field). I also liked that the appendix mentions information hazards and implementation feasibility, along with the availability of full-text research papers.

Specific areas for improvement

Definition of “risk” unclear (maybe this was in the GRI paper?)

Unclear intended audience (researchers or policymakers). The tool felt a little bit reactive (which by itself is ok), since it was based on someone querying.

Policymaker use-case unrealistic without keyword expertise. Also, unsure how this connects or feeds into governance.

Not sure about tool’s ability to remain up-to-date (maybe this is just through scanning or API calls?)

Another limitation could be that the analysis is limited to public tools, not dark web or closed communities.

Suggestions for how to develop this into a stronger project

Clarify risk definition (draw distinction from GRI paper if needed)

Specify target users and tailor tool accordingly.

Discuss handling of non-public datasets or acknowledge limitation explicitly

Potential next steps or future directions

Explore integration into policymaking frameworks (e.g., EU AI Act) or EU Market Surveillance directives.

Frame tool as an “early warning system” with policy applications

The submission presents an LLM crawler of scientific literature that analyzes it for biorisk according to the GRI framework.

While the project sketches out the technical architecture, it would be strengthened by demonstrating practical utility and contrasting it with alternative approaches.

Cite this work

@misc {
  title={(HckPrj) Arbiter: Automated Review of Bio-AI Tools for Emerging Risk},
  author={Ryan Teo, Shrestha Rath},
  date={9/15/25},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}

This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.