Mar 10, 2025

U Reg AI: you regulate it, or you regenerate it!

Vinaya Sivakumar, Kayla Jew, Amy Wong

We have created a 'choose your path' role-playing game about mitigating existential AI risk ... at this point the scenarios might be actual situations in the near future. The mitigation options are holistic and adapt dynamically to the player's previous choices. The final result is an evaluation of the player's decision-making performance in the wake of the existential risk situation, recommendations for how they can improve or aspects they should consider carefully in the future, and finally how they can take part in AI safety through various careers or BlueDot Impact courses.

Reviewer's Comments

This was a creative solution to articulating various risk scenarios in AI! The "Choose your own adventure" style play is an engaging way to capture audience attention. I enjoyed how you also captured some personalization, such as the "Your choices suggest a preference for..." call out in the final call to action screen. It was a creative way to pull together insights from the game and push users forward on their AI safety journey.

For future iterations, I'd recommend incorporating more in-line learnings within the scenario itself. You mentioned that these scenarios may realistically be more "near-future". Is there a way to highlight the real-life urgency of these scenarios for a learner? The "Guidance" page is helpful, but once the game begins, it will be difficult to navigate back to. How can we further encourage or emphasize educational call-outs without losing users to context switching? Finally, there is a lot of strength in communicating how things can go right *and* wrong in AI safety. You mentioned in the accompanying paper how strong a motivator gamification can be for learners. How could U Reg AI further gamify itself, for example by introducing reward models? E.g., could users collect "badges" as they navigate different scenarios, whether they play for "good" or "evil"?

This MVP is a strong start and I loved the creativity on display here. Would be excited to play out many of the gameplays you've suggested!

The project is a simple, interactive choose-your-own-adventure style story, with users invested in their outcomes to provide deeper engagement with the ideas (though it is noted that the Figma project linked in the appendix only includes one pathway, despite several starting points and choices). It would be great to see the literature that discusses the potential scenarios shown (e.g. "prioritizing safety risk increases the risk of AI systems behaving unpredictably or autonomously"). As a user, I would like to understand more about the current literature on this: references to support further reading, or elaboration on the concepts, would be ideal to support deeper learning and help the learner contextualize the content. That being said, within the context of the hackathon, an engaging experience was created, and I look forward to seeing this project evolve and iterate further.

Additional analysis of the games mentioned (Fermi's fake, Hitchhiker's Guide, and existing educational gamification methods), including their format, strengths, and weaknesses, would help the reader understand the pre-existing literature and inspiration. Additionally, I am keen to read more about the method used to break down current governance and policy discourse into the scenarios chosen for the different pathways.

Cite this work

@misc {
  title={U Reg AI: you regulate it, or you regenerate it!},
  author={Vinaya Sivakumar, Kayla Jew, Amy Wong},
  date={3/10/25},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}

Recent Projects

Feb 2, 2026

Markov Chain Lock Watermarking: Provably Secure Authentication for LLM Outputs

We present Markov Chain Lock (MCL) watermarking, a cryptographically secure framework for authenticating LLM outputs. MCL constrains token generation to follow a secret Markov chain over SHA-256 vocabulary partitions. Using doubly stochastic transition matrices, we prove four theoretical guarantees: (1) exponentially decaying false positive rates via Hoeffding bounds, (2) graceful degradation under adversarial modification with closed-form expected scores, (3) information-theoretic security without key access, and (4) bounded quality loss via KL divergence. Experiments on 173 Wikipedia prompts using Llama-3.2-3B demonstrate that the optimal 7-state soft cycle configuration achieves 100% detection, 0% FPR, and perplexity 4.20. Robustness testing confirms detection above 96% even with 30% word replacement. The framework enables O(n) model-free detection, addressing EU AI Act Article 50 requirements. Code available at https://github.com/ChenghengLi/MCLW

Feb 2, 2026

Prototyping an Embedded Off-Switch for AI Compute

This project prototypes an embedded off-switch for AI accelerators. The security block requires periodic cryptographic authorization to operate: the chip generates a nonce, an external authority signs it, and the chip verifies the signature before granting time-limited permission. Without valid authorization, outputs are gated to zero. The design was implemented in HardCaml and validated in simulation.

Feb 2, 2026

Fingerprinting All AI Cluster I/O Without Mutually Trusted Processors

We design and simulate a "border patrol" device for generating cryptographic evidence of data traffic entering and leaving an AI cluster, while eliminating the specific analog and steganographic side-channels that post-hoc verification can not close. The device eliminates the need for any mutually trusted logic, while still meeting the security needs of the prover and verifier.

This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.