APART RESEARCH

Impactful AI safety research

Explore our projects, publications and pilot experiments

Our Approach

Our research focuses on critical research paradigms in AI Safety. We produce foundational research enabling the safe and beneficial development of advanced AI.

Safe AI

Publishing rigorous empirical work for safe AI: evaluations, interpretability and more

Novel Approaches

Our research is underpinned by novel approaches focused on neglected topics

Pilot Experiments

Apart Sprints have kickstarted hundreds of pilot experiments in AI Safety

Apart Sprint Pilot Experiments

Feb 2, 2026

Systematic Cross-Regulation Threat Topology for EU AI Governance

A single frontier AI training run can simultaneously trigger obligations under the EU AI Act, GDPR, the Copyright Directive, and NIS, yet no systematic framework maps these compounding regulatory threats across stakeholder types and jurisdictions. We present a systematic threat topology covering 19 EU-level regulations across five stakeholder categories (frontier model developers, deployment platforms, hardware providers, open-source developers, and research organizations), with geographic enforcement modifiers for all 27 Member States. Our methodology employs an activity-based stakeholder taxonomy, temporal activation mapping, and a KNOW/GUESS/UNKNOWN epistemic framework that quantifies regulatory uncertainty rather than obscuring it. Key findings include: (1) cross-regulation compounding creates multiplicative compliance surfaces where identical development activities trigger 3–5 regulatory regimes simultaneously; (2) enforcement concentration, with five DPAs accounting for over 85% of the €5.88B in cumulative GDPR fines, creates significant compliance cost differentials depending on establishment jurisdiction; (3) temporal cascading between February 2025 and August 2027 activates obligations under four major regulatory categories in overlapping waves, with August 2025 marking a critical inflection point for frontier AI providers. We release the full 41-page threat matrix as open infrastructure for practitioners navigating EU AI compliance during this implementation period.
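As a rough illustration of how one entry in such a threat matrix might be represented (the actual 41-page matrix schema is not reproduced in the abstract), the sketch below uses assumed field names; the KNOW/GUESS/UNKNOWN labels and the idea of counting compounding regimes per activity follow the description above.

```python
from dataclasses import dataclass, field
from enum import Enum

class Epistemic(Enum):
    """Epistemic status of an obligation, per the KNOW/GUESS/UNKNOWN framework."""
    KNOW = "know"        # obligation and activation date are settled
    GUESS = "guess"      # obligation likely, but interpretation is still open
    UNKNOWN = "unknown"  # regulatory outcome cannot yet be determined

@dataclass
class Obligation:
    regulation: str        # e.g. "EU AI Act", "GDPR"
    stakeholder: str       # e.g. "frontier model developer"
    activates: str         # date the obligation starts applying
    status: Epistemic

@dataclass
class ThreatMatrixEntry:
    activity: str
    obligations: list[Obligation] = field(default_factory=list)

    def compounding_count(self) -> int:
        """Number of distinct regulatory regimes triggered by this one activity."""
        return len({o.regulation for o in self.obligations})

# Illustrative entry: one training run triggering several regimes at once.
entry = ThreatMatrixEntry(
    activity="frontier training run",
    obligations=[
        Obligation("EU AI Act", "frontier model developer", "2025-08-02", Epistemic.KNOW),
        Obligation("GDPR", "frontier model developer", "2018-05-25", Epistemic.KNOW),
        Obligation("Copyright Directive", "frontier model developer", "2021-06-07", Epistemic.GUESS),
    ],
)
print(entry.compounding_count())  # -> 3
```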

Read More

Feb 2, 2026

AEGIS

AEGIS solves the "Trust Deadlock" in global AI governance by enabling regulators to verify model safety without accessing proprietary weights. Acting as a "Digital IAEA," it uses Trusted Execution Environments (TEEs) to perform "blind" inspections that check for the 10^25 FLOP compute threshold and CBRN risks. This protocol generates tamper-proof cryptographic proofs of compliance, shifting international oversight from unreliable self-reporting to a secure, zero-trust technical standard.
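As a rough illustration of the proof-of-compliance flow described above (not the AEGIS implementation itself), the sketch below assumes an Ed25519 keypair held inside the TEE via the third-party cryptography package, plus an invented report format; a real deployment would rely on hardware attestation rather than a plain software signature.

```python
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Key material that, in the real protocol, would never leave the enclave.
tee_key = Ed25519PrivateKey.generate()
tee_pub = tee_key.public_key()

def attest_inside_tee(nonce: bytes, training_flop: float, cbrn_eval_passed: bool) -> dict:
    """'Blind' inspection: the enclave inspects the model and signs a verdict,
    so the regulator learns the compliance result but never sees the weights."""
    report = {
        "nonce": nonce.hex(),
        "below_compute_threshold": training_flop < 1e25,  # EU AI Act systemic-risk line
        "cbrn_eval_passed": cbrn_eval_passed,
    }
    payload = json.dumps(report, sort_keys=True).encode()
    return {"payload": payload, "signature": tee_key.sign(payload)}

def regulator_verify(nonce: bytes, proof: dict) -> dict:
    """Check the signature and nonce freshness before trusting the verdict."""
    try:
        tee_pub.verify(proof["signature"], proof["payload"])
    except InvalidSignature:
        raise ValueError("tampered compliance proof")
    report = json.loads(proof["payload"])
    if report["nonce"] != nonce.hex():
        raise ValueError("stale or replayed proof")
    return report

challenge = b"regulator-challenge-001"
proof = attest_inside_tee(challenge, training_flop=3e24, cbrn_eval_passed=True)
print(regulator_verify(challenge, proof))
```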

Read More

Feb 3, 2026

Maxwell

We built Maxwell, a governance simulator showing that safety and innovation don't have to be a trade-off. By treating governance as a thermodynamic system ('Incentive Engineering'), we modeled a 'Sovereign Subsidy' that keeps permit costs affordable ($0.84) while maintaining 100% security, solving the 'North Korea vs. Wild West' dilemma.
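The abstract does not spell out the simulator's mechanics, so the following is only a toy illustration of the subsidy idea it gestures at: a sponsor covers most of the permit price so that complying stays cheaper than evading, under entirely made-up numbers that are not Maxwell's parameters.

```python
# Toy illustration only: prices, probabilities and penalties are invented.
def effective_permit_cost(base_cost: float, subsidy_rate: float) -> float:
    """Cost a developer actually pays once the 'Sovereign Subsidy' is applied."""
    return base_cost * (1.0 - subsidy_rate)

def expected_evasion_cost(penalty: float, detection_prob: float) -> float:
    """Expected cost of skipping the permit and risking getting caught."""
    return penalty * detection_prob

permit = effective_permit_cost(base_cost=8.40, subsidy_rate=0.90)     # -> 0.84
evasion = expected_evasion_cost(penalty=1_000.0, detection_prob=0.05)  # -> 50.0

# Compliance dominates whenever the subsidised permit is cheaper than expected evasion cost.
print(f"permit={permit:.2f}, evasion={evasion:.2f}, comply={permit < evasion}")
```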

Read More

Feb 2, 2026

Technical AI Governance via an Agentic Bill of Materials and Risk Tiering

This paper proposes a technical governance framework for agentic AI systems (autonomous agents with tools, memory, and self-directed behaviour) that current regulations do not adequately address. It introduces an Agentic Bill of Materials (ABOM), a machine-readable manifest that documents an agent’s capabilities, autonomy, memory, and safety controls; a quantitative risk scoring formula that computes agent risk from agency, autonomy, persistence, and mitigation factors; and a Unified Agentic Risk Tiering (UART) system that maps agents into five governance tiers aligned with the EU AI Act and international safety standards. Combined with hardware-based attestation, the framework enables verifiable, enforceable AI governance by allowing regulators to cryptographically verify agent configurations rather than relying on self-reported claims, and is validated through an open-source reference implementation.
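The paper's exact scoring formula and tier boundaries are not given in the abstract, so the sketch below uses placeholder weights and thresholds purely to show the shape of an ABOM manifest plus agency/autonomy/persistence/mitigation scoring and a UART-style tier mapping; field names are assumptions, not the reference implementation's.

```python
from dataclasses import dataclass

@dataclass
class ABOM:
    """Agentic Bill of Materials: a machine-readable manifest of an agent's makeup."""
    name: str
    tools: list[str]       # external capabilities the agent can invoke
    has_memory: bool       # persistent state across sessions
    autonomy: float        # 0 (human-approved steps) .. 1 (fully self-directed)
    mitigations: float     # 0 (none) .. 1 (strong sandboxing, oversight, kill switch)

def risk_score(m: ABOM) -> float:
    """Illustrative scoring: agency, autonomy and persistence raise risk,
    mitigations discount it. Weights are placeholders, not the paper's."""
    agency = min(len(m.tools) / 10.0, 1.0)
    persistence = 1.0 if m.has_memory else 0.2
    raw = 0.4 * agency + 0.4 * m.autonomy + 0.2 * persistence
    return raw * (1.0 - 0.5 * m.mitigations)

def uart_tier(score: float) -> int:
    """Map a risk score onto five governance tiers (cutoffs illustrative)."""
    for tier, cutoff in enumerate((0.15, 0.3, 0.5, 0.75), start=1):
        if score < cutoff:
            return tier
    return 5

agent = ABOM("ops-assistant", tools=["shell", "browser", "email"],
             has_memory=True, autonomy=0.7, mitigations=0.6)
print(uart_tier(risk_score(agent)))  # -> tier 3 under these placeholder weights
```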

Read More

Feb 2, 2026

Markov Chain Lock Watermarking: Provably Secure Authentication for LLM Outputs

We present Markov Chain Lock (MCL) watermarking, a cryptographically secure framework for authenticating LLM outputs. MCL constrains token generation to follow a secret Markov chain over SHA-256 vocabulary partitions. Using doubly stochastic transition matrices, we prove four theoretical guarantees: (1) exponentially decaying false positive rates via Hoeffding bounds, (2) graceful degradation under adversarial modification with closed-form expected scores, (3) information-theoretic security without key access, and (4) bounded quality loss via KL divergence. Experiments on 173 Wikipedia prompts using Llama-3.2-3B demonstrate that the optimal 7-state soft cycle configuration achieves 100% detection, 0% FPR, and perplexity 4.20. Robustness testing confirms detection above 96% even with 30% word replacement. The framework enables O(n) model-free detection, addressing EU AI Act Article 50 requirements. Code is available at https://github.com/ChenghengLi/MCLW.
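A minimal sketch of the two core ideas named above: partitioning the vocabulary into states with keyed SHA-256, and model-free detection that counts how many consecutive token pairs follow the secret chain. For brevity the chain here is a hard 7-state cycle and the Hoeffding-based threshold is reduced to a z-score; the paper's doubly stochastic "soft cycle" matrices are not reproduced.

```python
import hashlib
import math

SECRET = b"demo-key"   # illustrative key, not from the paper
K = 7                  # number of Markov states (7-state configuration)

def state_of(token_id: int) -> int:
    """Assign a token to one of K states via a keyed SHA-256 partition."""
    digest = hashlib.sha256(SECRET + token_id.to_bytes(4, "big")).digest()
    return digest[0] % K

def follows_chain(prev_token: int, next_token: int) -> bool:
    """Simplified watermark rule: each step moves to the next state in a cycle."""
    return state_of(next_token) == (state_of(prev_token) + 1) % K

def detection_z(tokens: list[int]) -> float:
    """Model-free O(n) detection: under unwatermarked text a transition matches
    the cycle with probability ~1/K, so count matches and standardise."""
    n = len(tokens) - 1
    hits = sum(follows_chain(a, b) for a, b in zip(tokens, tokens[1:]))
    p0 = 1.0 / K
    return (hits - n * p0) / math.sqrt(n * p0 * (1.0 - p0))

# Watermarked generation would sample only tokens satisfying follows_chain;
# this synthetic "watermarked" sequence just picks ids that happen to comply.
watermarked = [0]
candidate = 1
while len(watermarked) < 200:
    if follows_chain(watermarked[-1], candidate):
        watermarked.append(candidate)
    candidate += 1

print(round(detection_z(watermarked), 1))        # large positive z => detected
print(round(detection_z(list(range(200))), 1))   # typically near 0 for unmarked ids
```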

Read More

Feb 2, 2026

Prototyping an Embedded Off-Switch for AI Compute

This project prototypes an embedded off-switch for AI accelerators. The security block requires periodic cryptographic authorization to operate: the chip generates a nonce, an external authority signs it, and the chip verifies the signature before granting time-limited permission. Without valid authorization, outputs are gated to zero. The design was implemented in HardCaml and validated in simulation.
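The design itself is an RTL implementation in HardCaml; the Python sketch below only models the handshake the abstract describes (nonce out, signature in, time-limited permission, outputs gated to zero otherwise), with an HMAC standing in for whatever signature scheme the hardware actually uses.

```python
import hashlib
import hmac
import time

AUTHORITY_KEY = b"shared-secret"   # stand-in for the external authority's signing key
PERMIT_SECONDS = 5.0               # illustrative validity window

class OffSwitchBlock:
    """Software model of the on-chip security block."""
    def __init__(self):
        self._nonce = None
        self._valid_until = 0.0

    def issue_nonce(self) -> bytes:
        self._nonce = hashlib.sha256(str(time.time_ns()).encode()).digest()
        return self._nonce

    def submit_authorization(self, signature: bytes) -> bool:
        if self._nonce is None:
            return False
        expected = hmac.new(AUTHORITY_KEY, self._nonce, hashlib.sha256).digest()
        if hmac.compare_digest(signature, expected):
            self._valid_until = time.monotonic() + PERMIT_SECONDS
            self._nonce = None          # nonce is single-use
            return True
        return False

    def gate(self, accelerator_output: int) -> int:
        """Outputs pass through only while a permit is live; otherwise forced to zero."""
        return accelerator_output if time.monotonic() < self._valid_until else 0

# External authority signs the chip's challenge.
chip = OffSwitchBlock()
nonce = chip.issue_nonce()
signature = hmac.new(AUTHORITY_KEY, nonce, hashlib.sha256).digest()
chip.submit_authorization(signature)
print(chip.gate(42))   # -> 42 while authorized; -> 0 once the permit lapses
```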

Read More

Feb 2, 2026

AI Safety Template

A prototype for creating standardized AI safety evaluations that run in a hardened, private environment.

Read More

Feb 2, 2026

Verification Mechanism Feasibility Scorer (VMFS)

A decision-support framework and dashboard that scores AI verification mechanisms across feasibility dimensions, helping policymakers, diplomats, technical AI governance researchers, and related stakeholders design pragmatic, layered treaties for global AI risks.
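The abstract does not list the framework's actual dimensions or weights, so the sketch below scores two hypothetical verification mechanisms along a few plausible feasibility axes with made-up numbers, just to show the arithmetic such a dashboard would run.

```python
# Hypothetical feasibility dimensions and weights; not the VMFS rubric.
WEIGHTS = {
    "technical_maturity": 0.3,
    "political_acceptability": 0.25,
    "cost": 0.2,                 # higher score = cheaper to operate
    "evasion_resistance": 0.25,
}

MECHANISMS = {
    "compute accounting": {"technical_maturity": 0.7, "political_acceptability": 0.6,
                           "cost": 0.5, "evasion_resistance": 0.4},
    "on-chip attestation": {"technical_maturity": 0.4, "political_acceptability": 0.5,
                            "cost": 0.3, "evasion_resistance": 0.8},
}

def feasibility(scores: dict) -> float:
    """Weighted sum across dimensions, each scored in [0, 1]."""
    return sum(WEIGHTS[d] * scores[d] for d in WEIGHTS)

# Rank mechanisms from most to least feasible under these illustrative weights.
for name, scores in sorted(MECHANISMS.items(), key=lambda kv: -feasibility(kv[1])):
    print(f"{name}: {feasibility(scores):.2f}")
```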

Read More

Feb 2, 2026

Domain Ownership Probing

We propose Domain Ownership Probing (DOP), a lightweight verification method that evaluates a model’s internal representation structure instead of its stochastic text outputs. DomainProbe embeds domain-specific statements, forms prototype centroids, and computes domain ownership win-rate and cohesion to assess whether knowledge domains are consistently encoded. By adding an auto-tuned layer search, the method remains effective for both encoder models and decoder-only LLMs, supporting practical AI governance and compliance verification without exposing training datasets.
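A minimal sketch of the centroid, win-rate, and cohesion computation described above, using random vectors in place of real model embeddings; the function names and the cosine similarity measure are assumptions, not the DomainProbe code.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed(statements: list[str]) -> np.ndarray:
    """Stand-in for model embeddings of domain-specific statements."""
    return rng.normal(size=(len(statements), 64))

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Prototype centroids for each knowledge domain under test.
domains = {
    "medicine": embed(["statement"] * 20),
    "law": embed(["statement"] * 20),
}
centroids = {name: vecs.mean(axis=0) for name, vecs in domains.items()}

def win_rate(name: str) -> float:
    """Fraction of a domain's statements whose nearest centroid is their own domain."""
    wins = sum(
        max(centroids, key=lambda d: cosine(v, centroids[d])) == name
        for v in domains[name]
    )
    return wins / len(domains[name])

def cohesion(name: str) -> float:
    """Mean similarity of a domain's statements to their own centroid."""
    return float(np.mean([cosine(v, centroids[name]) for v in domains[name]]))

for d in domains:
    print(d, round(win_rate(d), 2), round(cohesion(d), 2))
```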

Read More
