Mar 10, 2025

Scam Detective: Using Gamification to Improve AI-Powered Scam Awareness

Angelica Marie M. Casuela

This project describes the development of an interactive web application that engages users with AI's dual capabilities: producing believable scams and identifying deceptive content. The game challenges human players to judge, in competition with an AI, whether text messages are genuine or fraudulent. The project tackles the growing threat of AI-generated fraud while showcasing both the capabilities and the limitations of AI detection systems. The application functions both as a training resource that improves human ability to recognize digital deception and as a demonstration of current AI capabilities in fraud identification. Through gameplay, users learn to identify the signs of AI-generated scams and sharpen the critical-thinking skills essential for navigating an increasingly complicated digital landscape. The project contributes to AI safety by equipping users with essential insight into AI-generated risks, while underscoring the complementary roles that humans and AI can play in combating fraud.
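The core mechanic described above, where the player and an AI classifier each label the same message and are scored against the ground truth, can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation; the `Round` structure, function names, and example messages are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Round:
    message: str
    is_scam: bool  # ground-truth label for this message

def score_round(rnd: Round, player_says_scam: bool, ai_says_scam: bool) -> dict:
    """Score one round: the player and the AI each classify the same message,
    and both verdicts are compared against the ground-truth label."""
    return {
        "player_correct": player_says_scam == rnd.is_scam,
        "ai_correct": ai_says_scam == rnd.is_scam,
    }

# Hypothetical message bank mixing a scam text and a genuine one.
bank = [
    Round("Your package is held at customs. Pay a $2 fee to release it.", True),
    Round("Hi, it's Sam. Running 10 minutes late for lunch, see you soon!", False),
]

result = score_round(bank[0], player_says_scam=True, ai_says_scam=False)
```

Tracking the two accuracy streams separately is what lets the game serve both stated purposes: the player's score measures their scam-spotting skill, while the AI's score demonstrates the current strength (and limits) of automated detection.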

Reviewer's Comments


This was a really fun and engaging way to educate audiences on multiple levels. You successfully communicated the risks and capabilities of AI while simultaneously empowering users, educating them on how to proactively identify already-common AI scams and protect themselves from harm. The UI also delivered a clean and elegant user experience.

This was a strong MVP, and I'm excited about its continued development opportunities. For example, as you mentioned, the current difficulty level is relatively low. I would love to see more investment in escalating difficulty levels to further demonstrate the strength of these systems' capabilities. This could also incorporate more explicit learning, such as the OpenAI Persuasion risk-level identifiers you called out in the paper, or further emphasize the reality of these harms by including and specifically identifying real-life scam examples.

Overall, this was a great multi-purpose game to capture learners' attention, educate them on broader AI risk categories, communicate the reality of AI-powered harm, and empower them against realistic threats. Well done!

Great work on this project!

Here are some things I particularly enjoyed:

- I love the clean UI!

- I think this could be a good tool to educate the public on how to detect scam messages by teaching them what to look out for.

Here are some things I thought could be improved:

- I would have loved to see more instructions or context when I first opened the web app. I was a bit confused at first about what the AI assessment was meant to be.

- I think it would be useful to set an end state for the user; otherwise it might feel like they could keep playing forever! I'm also not sure whether I'm supposed to come away feeling that AI systems are very good at scamming, or simply to educate myself on detecting phishing attempts. Perhaps showing different types of phishing attempts would help?

- An interesting addition to this project might have been to see how AI can create realistic, targeted phishing attempts based on a target's user profile. Perhaps this veers into the realm of dual-use concerns, but an idea nonetheless!

Cite this work

@misc{casuela2025scamdetective,
  title={Scam Detective: Using Gamification to Improve AI-Powered Scam Awareness},
  author={Angelica Marie M. Casuela},
  date={3/10/25},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.