Nov 2, 2025

ExogenousAI

Aheli Poddar

Current AI capability forecasting methodologies, including EpochAI's Direct Approach and Biological Anchors framework, primarily rely on internal metrics such as training compute and scaling laws while assuming stable external development environments. This work introduces ExogenousAI, a novel framework that integrates external macroeconomic, geopolitical, and supply chain indicators with traditional compute-based models to forecast AI capability timelines. Drawing methodological inspiration from cross-market forecasting in financial econometrics, we demonstrate that policy interventions—including export controls, compute governance, and international collaboration shifts—exhibit quantifiable impacts on AI development trajectories.

Our proof-of-concept implementation analyzes policy events (2020-2025) through event study methodology, using real TIGER-Lab MMLU-Pro benchmark data spanning 16 months (July 2023 - December 2024). Using Monte Carlo simulation with 10,000 iterations per scenario, we project AGI timelines under four policy scenarios. All four scenarios converge to a 2027 median, with 95% confidence intervals spanning 2026-2030. We decompose AI timeline uncertainty into technical (76.9%), economic (23.1%), and policy (0.0%) components, demonstrating that under current strong growth trends (+25.5% annually), policy interventions affect probability distributions (80.8%-90.8% probability of AGI within 5 years) but not central estimates. This work provides policymakers with actionable early warning indicators and establishes a foundation for real-time AI progress monitoring systems that account for exogenous shocks.
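The abstract does not spell out how the Monte Carlo projection is computed, so the following is only a minimal sketch of one way such a simulation could be set up, not the project's actual implementation. It assumes a generic capability index that grows multiplicatively each year, with the annual growth rate drawn from a normal distribution centered on the reported +25.5%. The starting level, the threshold, the growth standard deviation, the scenario names, and the per-scenario growth shifts are all hypothetical values chosen for illustration.

```python
import random
import statistics


def simulate_crossing_years(start_year, start_level, threshold,
                            mean_growth, growth_sd,
                            n_iter=10_000, max_years=50, seed=0):
    """Sample the year in which a growing index first reaches `threshold`.

    All parameters are illustrative assumptions, not values from the project.
    """
    rng = random.Random(seed)
    years = []
    for _ in range(n_iter):
        level = start_level
        year = start_year
        while level < threshold and year - start_year < max_years:
            # Draw this year's growth rate; floor at -50% so the level stays positive.
            growth = max(rng.gauss(mean_growth, growth_sd), -0.5)
            level *= 1.0 + growth
            year += 1
        years.append(year)
    return years


def summarize(years, start_year, horizon=5):
    """Median, empirical 95% interval, and share of runs crossing within `horizon` years."""
    ordered = sorted(years)
    n = len(ordered)
    return {
        "median": statistics.median(ordered),
        "ci95": (ordered[int(0.025 * n)], ordered[int(0.975 * n) - 1]),
        "p_within_horizon": sum(y <= start_year + horizon for y in ordered) / n,
    }


# Hypothetical scenarios: each shifts the mean annual growth rate.
SCENARIOS = {
    "baseline": 0.00,
    "stricter_export_controls": -0.05,
    "compute_governance": -0.03,
    "expanded_collaboration": 0.03,
}

if __name__ == "__main__":
    for name, shift in SCENARIOS.items():
        runs = simulate_crossing_years(
            start_year=2024, start_level=1.0, threshold=2.0,  # assumed values
            mean_growth=0.255 + shift, growth_sd=0.10,        # 0.10 is assumed
        )
        print(name, summarize(runs, start_year=2024))
```

In a setup like this, a strong mean growth rate relative to the distance to the threshold means that scenario shifts mostly move the tails and the within-horizon probability rather than the median, which is one possible reading of the convergence the abstract reports.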


Reviewer's Comments

* The motivation is well presented. A policy-aware forecasting framework would be extremely valuable, especially if it could take in policy interventions and adjust predictions based on them, allowing for a much more informed policymaking approach (note: this kind of thing is actually quite a large area of research; check out "DMDU" for more info).

* I really appreciate the clear presentation of the datapoints used and their rationale, and the very explicit pointers to prior work that justifies design decisions, e.g. 2.1.2.

* Further exploration of what the implementation challenge discussed in 2.1.2 means would be worthwhile; the "interpretation" paragraph is well placed and certainly needed, but I think even more is required to unpack this limitation.

* While reasoning is provided for the "AGI threshold calibration," I think more is required to make this claim.

* I personally don't think we can use the AI 2027 report as an anchor here; it is much more of a think piece than rigorous forecasting.

* The plots, while interesting and varied, don't provide much value to the reader as is. I think they very well could, and their presentation isn't counted against the submission, but I think it would have been better to select one plot and really make its value to policymakers extremely intuitive and explicit than to stick all of them on one page w/ font that's too small.

* It isn't clear to me what process is being used to conduct the timeline projections. What are the inputs to the model? How do the scenarios differ in these variables? What isn't being picked up by your approach? The fact that your forecast is substantially earlier than both of the primary anchors used (Epoch and Cotra) is suspicious to me; it warrants additional scrutiny.

* I think a key limitation is that this only uses performance on one benchmark, which I would argue is essentially already saturated, and doesn't take into account other real-world limitations. I'd be curious how this assessment would change if more benchmarks were aggregated, in addition to other real-world factors like AI investment.

Cite this work

@misc{
  title={(HckPrj) ExogenousAI},
  author={Aheli Poddar},
  date={11/2/25},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}

Recent Projects


Feb 2, 2026

Markov Chain Lock Watermarking: Provably Secure Authentication for LLM Outputs

We present Markov Chain Lock (MCL) watermarking, a cryptographically secure framework for authenticating LLM outputs. MCL constrains token generation to follow a secret Markov chain over SHA-256 vocabulary partitions. Using doubly stochastic transition matrices, we prove four theoretical guarantees: (1) exponentially decaying false positive rates via Hoeffding bounds, (2) graceful degradation under adversarial modification with closed-form expected scores, (3) information-theoretic security without key access, and (4) bounded quality loss via KL divergence. Experiments on 173 Wikipedia prompts using Llama-3.2-3B demonstrate that the optimal 7-state soft cycle configuration achieves 100% detection, 0% FPR, and perplexity 4.20. Robustness testing confirms detection above 96% even with 30% word replacement. The framework enables O(n) model-free detection, addressing EU AI Act Article 50 requirements. Code available at https://github.com/ChenghengLi/MCLW

Feb 2, 2026

Prototyping an Embedded Off-Switch for AI Compute

This project prototypes an embedded off-switch for AI accelerators. The security block requires periodic cryptographic authorization to operate: the chip generates a nonce, an external authority signs it, and the chip verifies the signature before granting time-limited permission. Without valid authorization, outputs are gated to zero. The design was implemented in HardCaml and validated in simulation.

Feb 2, 2026

Fingerprinting All AI Cluster I/O Without Mutually Trusted Processors

We design and simulate a "border patrol" device for generating cryptographic evidence of data traffic entering and leaving an AI cluster, while eliminating the specific analog and steganographic side-channels that post-hoc verification cannot close. The device eliminates the need for any mutually trusted logic, while still meeting the security needs of the prover and verifier.
