Interactive Assessments for AI Safety: A Gamified Approach to Evaluation and Personal Journey Mapping

Anusha Asim, Ammar Ahmed Farooqi, Aqsa Khan

An interactive assessment platform and mentor chatbot hosted on the Canvas LMS for testing and guiding learners in BlueDot's Intro to Transformative AI Course.

Reviewer's Comments

Hannah Betts

A great idea to cement learning through interactive, nuanced decision-making. References early in the paper create a clear scaffold for the project. While the analysis of current courses was difficult to follow, the problem was clearly defined ("independent learning can be improved by moving beyond self-report").

I would love to see more detail about the analysis of AI safety courses in sections 1 and 2, and about how you chose the key concepts for inclusion in your interactive materials. This would help others build on your work and aid the reader's understanding. For example, for the sentence "we analyzed learning patterns from AI safety education initiatives...", what was the method of your analysis, and what did it show? Some citations did not show strong relevance to the context in which they were cited (e.g. Li et al. 2019 on iterative learning).

The AI mentor was a good idea, and I wonder how it could be extended to provide support and feedback that truly leverages the opportunity of an LLM chatbot; the demonstration seemed to follow a scripted conversation, with limited insight beyond the materials already available to the learner.

It was good to see a preliminary evaluation of the project, though an analysis of the breadth and depth of content covered, together with access to the course materials for deeper evaluation, would help in assessing the technical quality.

Li-Lian Ang

Nice work on the effort put into this project! FYI, I don't have access to the course, so I could only rely on the demo video.

Here are some things I thought were particularly good:

- I admire the breadth of the approach, incorporating many different media for users to learn through; it shows a lot of creativity.

- Nice touch with making sure it is accessible!

Here are some places where it could have been improved:

- The questions in your user feedback might be slightly leading, given they are framed as positives. In the future, given your small sample size, I would recommend conducting user interviews instead; The Mom Test is a good guide on how to do these interviews well. User interviews would likely yield higher-quality responses and provide more information on areas to improve.

- I would have loved to see more details on how you designed the chatbot mentor and how it would adapt to learners with specific needs. One of our major challenges is curating content for each user's unique background and interests.

- I would have loved to learn more about how the debates and assignments are graded. Is a human doing the grading, or would an LLM do so? How would you ensure that users get informative feedback they can learn from?

Cite this work

@misc{asim2025interactive,
  title={Interactive Assessments for AI Safety: A Gamified Approach to Evaluation and Personal Journey Mapping},
  author={Anusha Asim, Ammar Ahmed Farooqi, Aqsa Khan},
  date={3/10/25},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}

May 20, 2025

EscalAtion: Assessing Multi-Agent Risks in Military Contexts

Our project investigates the potential risks and implications of integrating multiple autonomous AI agents within national defense strategies, exploring whether these agents tend to escalate or de-escalate conflict situations. Through a simulation that models real-world international relations scenarios, our preliminary results indicate that AI models exhibit a tendency to escalate conflicts, posing a significant threat to maintaining peace and preventing uncontrollable military confrontations. The experiment and subsequent evaluations are designed to reflect established international relations theories and frameworks, aiming to understand the implications of autonomous decision-making in military contexts comprehensively and without bias.

Read More

Apr 28, 2025

The Early Economic Impacts of Transformative AI: A Focus on Temporal Coherence

We investigate the economic potential of Transformative AI, focusing on "temporal coherence"—the ability to maintain goal-directed behavior over time—as a critical, yet underexplored, factor in task automation. We argue that temporal coherence represents a significant bottleneck distinct from computational complexity. Using a Large Language Model to estimate the 'effective time' (a proxy for temporal coherence) needed for humans to complete remote O*NET tasks, the study reveals a non-linear link between AI coherence and automation potential. A key finding is that an 8-hour coherence capability could potentially automate around 80-84% of the analyzed remote tasks.

Read More

Mar 31, 2025

Model Models: Simulating a Trusted Monitor

We offer initial investigations into whether the untrusted model (U) can 'simulate' the trusted monitor (T): is U able to successfully guess what suspicion score T will assign in the APPS setting? We also offer a clean, modular codebase which we hope can streamline future research into this question.

Read More


This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.