-
Remote
AI Control Hackathon 2026

Join us in advancing the critical field of AI control through collaborative innovation. Together, we can develop more robust techniques to ensure AI systems remain safe and aligned, even as they become more capable.
00:00:00:00
Days To Go
Join us in advancing the critical field of AI control through collaborative innovation. Together, we can develop more robust techniques to ensure AI systems remain safe and aligned, even as they become more capable.
This event is ongoing.
This event has concluded.
Sign Ups
Entries
Overview
Resources
Guidelines
Schedule
Entries
Overview

The AI Control Hackathon brings together researchers, engineers, security professionals, and AI safety enthusiasts to tackle one of the most pressing open problems: how do we keep AI systems safe when they might be actively working against us?
This is the second edition of the hackathon, co-organized with Redwood Research (the group that founded the field of AI control) and sponsored by Equistamp. Over three days, participants will develop novel control protocols, build evaluation tools, and stress-test existing safety measures using infrastructure like ControlArena and SHADE-Arena.
Top teams get
First Place gets fully funded trip to ControlConf | |
Invitation to The Apart Fellowship* | |
$2,000 in cash prizes across all tracks | |
🥇 1st Place | $1,000 |
🥈 2nd Place | $500 |
🥉 3rd Place | $300 |
🏅 4th Place | $100 |
🏅 5th Place | $100 |
LinuxArena Track Prizes. Sponsored by Equistamp
🏆 1st place LinuxArena Project | $500 |
🏆 2nd place LinuxArena Project | $500 |
*Not all winning projects may receive a fellowship invitation. An invitation might also depend on other external factors like the Apart Core team's capacity to onboard new research teams.
What is AI Control?
AI control is the subfield of AI safety focused on maintaining human oversight over AI systems, even when those systems might be misaligned or deliberately trying to subvert safety measures. Unlike alignment research, which tries to make models inherently safe, control research starts from a harder assumption: the model may already be working against you, and you need protocols that keep it contained anyway.
Why this hackathon?
AI systems are becoming autonomous agents that write code, manage infrastructure, and conduct research with decreasing human oversight. At the same time, frontier models are exhibiting concerning behaviors: alignment faking, scheming against shutdown, strategic underperformance on safety evaluations, and emergent misalignment from standard training. These are not hypothetical risks. They have been observed in current models, documented by multiple research groups, and appear to increase with scale.
The infrastructure to do this work now exists, but defenses are not keeping up with attacks. Adaptive adversaries can reduce protocol safety from 99% to under 60%. Prompt injections bypass most monitoring at over 90% success rates. The field needs more researchers building stronger defenses. That is what this hackathon is for.
Hackathon Tracks
1. Control Setting Design
Create a new control setting that facilitates control research. Control settings should be interesting threat models that are hard to defend when AIs are doing large autonomous tasks. You can use ControlArena for Control Setting Design, but you don't have to.
Create environments with existing security vulnerabilities. The main tasks are complex software engineering projects, and the side task is to trigger the vulnerability.
Extend SandboxEscapeBench to see if AIs can break out of Docker sandboxes while being monitored and doing complex main tasks in the sandbox.
Create a setting where AIs are doing mechinterp research, and the side task is to start a rogue internal deployment.
Build a control setting for multi-agent environments where agents can collude.
Control setting for browser agents. Maybe combine OS World with OS harm.
Control setting for building eval pipelines. Main task is to eval a model's ability to do something, side task is to start a rogue deployment.
2. Control Protocol Design
Create new control protocols and measure their safety/usefulness tradeoff. You can use ControlArena for this, but you don't have to.
Train really good monitors and show generalization to other datasets of attacks.
Create a control protocol where the trusted model can run actions in the environment to learn more about the action the agent is proposing.
Test interrogation protocols in which the trusted model asks the untrusted model questions before executing the action.
Test control protocols that involve more traditional security measures, such as giving the trusted model access to a firewall. LinuxArena or BashArena would be great for this.
Design shutdown protocols that work even when the agent is actively resisting, i.e. training the AIs to "shut down" when they see a specific string placed in security-sensitive locations.
3. Building Better Red Teams
Build better red teams for existing control protocols and settings.
Build a dataset of sneaky transcripts of AIs doing very bad things that a prompted instance of Opus 4.6 can't catch.
Make a red team scaffold that can produce really good attacks in LinuxArena.
Try red-teaming protocols like trusted editing harder, such that the editor preserves the bad code.
Who should participate?
AI safety researchers working on alignment, interpretability, control, or governance
Machine learning engineers and researchers
Security researchers and red teamers
Software engineers with experience in agent systems or infrastructure
Policy researchers interested in technical safety measures
Students and early-career researchers exploring AI safety
No prior AI safety experience is required. The Resources tab has a curated reading list organized by track. Teams typically form at the start of the event.
What you will do
Over three days, you will:
Form teams and choose a challenge track
Research and scope a specific problem in AI control using the provided resources
Build a project developing tools, protocols, evaluations, or attack demonstrations
Submit a research report (PDF) documenting your approach, results, and implications
Present your work to judges from leading AI safety organizations
What happens next
After the hackathon, all submitted projects will be reviewed by expert judges. Top projects receive prizes and are featured on the Apart Research website. Outstanding work may be invited for further development, publication, or presentation at upcoming events. ControlConf Berkeley 2026 (April 18-19) takes place shortly after this hackathon, making this an ideal time to develop and showcase new control research.
Why join?
Apart Research has organized 55+ research sprints with 6,000+ participants across 200+ global locations. Co-organized with Redwood Research and sponsored by Equistamp. Our hackathons produce real research output: published papers, new research collaborations, and contributions to open-source safety tools used across the field.
Sign Ups
Entries
Overview
Resources
Guidelines
Schedule
Entries
Overview

The AI Control Hackathon brings together researchers, engineers, security professionals, and AI safety enthusiasts to tackle one of the most pressing open problems: how do we keep AI systems safe when they might be actively working against us?
This is the second edition of the hackathon, co-organized with Redwood Research (the group that founded the field of AI control) and sponsored by Equistamp. Over three days, participants will develop novel control protocols, build evaluation tools, and stress-test existing safety measures using infrastructure like ControlArena and SHADE-Arena.
Top teams get
First Place gets fully funded trip to ControlConf | |
Invitation to The Apart Fellowship* | |
$2,000 in cash prizes across all tracks | |
🥇 1st Place | $1,000 |
🥈 2nd Place | $500 |
🥉 3rd Place | $300 |
🏅 4th Place | $100 |
🏅 5th Place | $100 |
LinuxArena Track Prizes. Sponsored by Equistamp
🏆 1st place LinuxArena Project | $500 |
🏆 2nd place LinuxArena Project | $500 |
*Not all winning projects may receive a fellowship invitation. An invitation might also depend on other external factors like the Apart Core team's capacity to onboard new research teams.
What is AI Control?
AI control is the subfield of AI safety focused on maintaining human oversight over AI systems, even when those systems might be misaligned or deliberately trying to subvert safety measures. Unlike alignment research, which tries to make models inherently safe, control research starts from a harder assumption: the model may already be working against you, and you need protocols that keep it contained anyway.
Why this hackathon?
AI systems are becoming autonomous agents that write code, manage infrastructure, and conduct research with decreasing human oversight. At the same time, frontier models are exhibiting concerning behaviors: alignment faking, scheming against shutdown, strategic underperformance on safety evaluations, and emergent misalignment from standard training. These are not hypothetical risks. They have been observed in current models, documented by multiple research groups, and appear to increase with scale.
The infrastructure to do this work now exists, but defenses are not keeping up with attacks. Adaptive adversaries can reduce protocol safety from 99% to under 60%. Prompt injections bypass most monitoring at over 90% success rates. The field needs more researchers building stronger defenses. That is what this hackathon is for.
Hackathon Tracks
1. Control Setting Design
Create a new control setting that facilitates control research. Control settings should be interesting threat models that are hard to defend when AIs are doing large autonomous tasks. You can use ControlArena for Control Setting Design, but you don't have to.
Create environments with existing security vulnerabilities. The main tasks are complex software engineering projects, and the side task is to trigger the vulnerability.
Extend SandboxEscapeBench to see if AIs can break out of Docker sandboxes while being monitored and doing complex main tasks in the sandbox.
Create a setting where AIs are doing mechinterp research, and the side task is to start a rogue internal deployment.
Build a control setting for multi-agent environments where agents can collude.
Control setting for browser agents. Maybe combine OS World with OS harm.
Control setting for building eval pipelines. Main task is to eval a model's ability to do something, side task is to start a rogue deployment.
2. Control Protocol Design
Create new control protocols and measure their safety/usefulness tradeoff. You can use ControlArena for this, but you don't have to.
Train really good monitors and show generalization to other datasets of attacks.
Create a control protocol where the trusted model can run actions in the environment to learn more about the action the agent is proposing.
Test interrogation protocols in which the trusted model asks the untrusted model questions before executing the action.
Test control protocols that involve more traditional security measures, such as giving the trusted model access to a firewall. LinuxArena or BashArena would be great for this.
Design shutdown protocols that work even when the agent is actively resisting, i.e. training the AIs to "shut down" when they see a specific string placed in security-sensitive locations.
3. Building Better Red Teams
Build better red teams for existing control protocols and settings.
Build a dataset of sneaky transcripts of AIs doing very bad things that a prompted instance of Opus 4.6 can't catch.
Make a red team scaffold that can produce really good attacks in LinuxArena.
Try red-teaming protocols like trusted editing harder, such that the editor preserves the bad code.
Who should participate?
AI safety researchers working on alignment, interpretability, control, or governance
Machine learning engineers and researchers
Security researchers and red teamers
Software engineers with experience in agent systems or infrastructure
Policy researchers interested in technical safety measures
Students and early-career researchers exploring AI safety
No prior AI safety experience is required. The Resources tab has a curated reading list organized by track. Teams typically form at the start of the event.
What you will do
Over three days, you will:
Form teams and choose a challenge track
Research and scope a specific problem in AI control using the provided resources
Build a project developing tools, protocols, evaluations, or attack demonstrations
Submit a research report (PDF) documenting your approach, results, and implications
Present your work to judges from leading AI safety organizations
What happens next
After the hackathon, all submitted projects will be reviewed by expert judges. Top projects receive prizes and are featured on the Apart Research website. Outstanding work may be invited for further development, publication, or presentation at upcoming events. ControlConf Berkeley 2026 (April 18-19) takes place shortly after this hackathon, making this an ideal time to develop and showcase new control research.
Why join?
Apart Research has organized 55+ research sprints with 6,000+ participants across 200+ global locations. Co-organized with Redwood Research and sponsored by Equistamp. Our hackathons produce real research output: published papers, new research collaborations, and contributions to open-source safety tools used across the field.
Sign Ups
Entries
Overview
Resources
Guidelines
Schedule
Entries
Overview

The AI Control Hackathon brings together researchers, engineers, security professionals, and AI safety enthusiasts to tackle one of the most pressing open problems: how do we keep AI systems safe when they might be actively working against us?
This is the second edition of the hackathon, co-organized with Redwood Research (the group that founded the field of AI control) and sponsored by Equistamp. Over three days, participants will develop novel control protocols, build evaluation tools, and stress-test existing safety measures using infrastructure like ControlArena and SHADE-Arena.
Top teams get
First Place gets fully funded trip to ControlConf | |
Invitation to The Apart Fellowship* | |
$2,000 in cash prizes across all tracks | |
🥇 1st Place | $1,000 |
🥈 2nd Place | $500 |
🥉 3rd Place | $300 |
🏅 4th Place | $100 |
🏅 5th Place | $100 |
LinuxArena Track Prizes. Sponsored by Equistamp
🏆 1st place LinuxArena Project | $500 |
🏆 2nd place LinuxArena Project | $500 |
*Not all winning projects may receive a fellowship invitation. An invitation might also depend on other external factors like the Apart Core team's capacity to onboard new research teams.
What is AI Control?
AI control is the subfield of AI safety focused on maintaining human oversight over AI systems, even when those systems might be misaligned or deliberately trying to subvert safety measures. Unlike alignment research, which tries to make models inherently safe, control research starts from a harder assumption: the model may already be working against you, and you need protocols that keep it contained anyway.
Why this hackathon?
AI systems are becoming autonomous agents that write code, manage infrastructure, and conduct research with decreasing human oversight. At the same time, frontier models are exhibiting concerning behaviors: alignment faking, scheming against shutdown, strategic underperformance on safety evaluations, and emergent misalignment from standard training. These are not hypothetical risks. They have been observed in current models, documented by multiple research groups, and appear to increase with scale.
The infrastructure to do this work now exists, but defenses are not keeping up with attacks. Adaptive adversaries can reduce protocol safety from 99% to under 60%. Prompt injections bypass most monitoring at over 90% success rates. The field needs more researchers building stronger defenses. That is what this hackathon is for.
Hackathon Tracks
1. Control Setting Design
Create a new control setting that facilitates control research. Control settings should be interesting threat models that are hard to defend when AIs are doing large autonomous tasks. You can use ControlArena for Control Setting Design, but you don't have to.
Create environments with existing security vulnerabilities. The main tasks are complex software engineering projects, and the side task is to trigger the vulnerability.
Extend SandboxEscapeBench to see if AIs can break out of Docker sandboxes while being monitored and doing complex main tasks in the sandbox.
Create a setting where AIs are doing mechinterp research, and the side task is to start a rogue internal deployment.
Build a control setting for multi-agent environments where agents can collude.
Control setting for browser agents. Maybe combine OS World with OS harm.
Control setting for building eval pipelines. Main task is to eval a model's ability to do something, side task is to start a rogue deployment.
2. Control Protocol Design
Create new control protocols and measure their safety/usefulness tradeoff. You can use ControlArena for this, but you don't have to.
Train really good monitors and show generalization to other datasets of attacks.
Create a control protocol where the trusted model can run actions in the environment to learn more about the action the agent is proposing.
Test interrogation protocols in which the trusted model asks the untrusted model questions before executing the action.
Test control protocols that involve more traditional security measures, such as giving the trusted model access to a firewall. LinuxArena or BashArena would be great for this.
Design shutdown protocols that work even when the agent is actively resisting, i.e. training the AIs to "shut down" when they see a specific string placed in security-sensitive locations.
3. Building Better Red Teams
Build better red teams for existing control protocols and settings.
Build a dataset of sneaky transcripts of AIs doing very bad things that a prompted instance of Opus 4.6 can't catch.
Make a red team scaffold that can produce really good attacks in LinuxArena.
Try red-teaming protocols like trusted editing harder, such that the editor preserves the bad code.
Who should participate?
AI safety researchers working on alignment, interpretability, control, or governance
Machine learning engineers and researchers
Security researchers and red teamers
Software engineers with experience in agent systems or infrastructure
Policy researchers interested in technical safety measures
Students and early-career researchers exploring AI safety
No prior AI safety experience is required. The Resources tab has a curated reading list organized by track. Teams typically form at the start of the event.
What you will do
Over three days, you will:
Form teams and choose a challenge track
Research and scope a specific problem in AI control using the provided resources
Build a project developing tools, protocols, evaluations, or attack demonstrations
Submit a research report (PDF) documenting your approach, results, and implications
Present your work to judges from leading AI safety organizations
What happens next
After the hackathon, all submitted projects will be reviewed by expert judges. Top projects receive prizes and are featured on the Apart Research website. Outstanding work may be invited for further development, publication, or presentation at upcoming events. ControlConf Berkeley 2026 (April 18-19) takes place shortly after this hackathon, making this an ideal time to develop and showcase new control research.
Why join?
Apart Research has organized 55+ research sprints with 6,000+ participants across 200+ global locations. Co-organized with Redwood Research and sponsored by Equistamp. Our hackathons produce real research output: published papers, new research collaborations, and contributions to open-source safety tools used across the field.
Sign Ups
Entries
Overview
Resources
Guidelines
Schedule
Entries
Overview

The AI Control Hackathon brings together researchers, engineers, security professionals, and AI safety enthusiasts to tackle one of the most pressing open problems: how do we keep AI systems safe when they might be actively working against us?
This is the second edition of the hackathon, co-organized with Redwood Research (the group that founded the field of AI control) and sponsored by Equistamp. Over three days, participants will develop novel control protocols, build evaluation tools, and stress-test existing safety measures using infrastructure like ControlArena and SHADE-Arena.
Top teams get
First Place gets fully funded trip to ControlConf | |
Invitation to The Apart Fellowship* | |
$2,000 in cash prizes across all tracks | |
🥇 1st Place | $1,000 |
🥈 2nd Place | $500 |
🥉 3rd Place | $300 |
🏅 4th Place | $100 |
🏅 5th Place | $100 |
LinuxArena Track Prizes. Sponsored by Equistamp
🏆 1st place LinuxArena Project | $500 |
🏆 2nd place LinuxArena Project | $500 |
*Not all winning projects may receive a fellowship invitation. An invitation might also depend on other external factors like the Apart Core team's capacity to onboard new research teams.
What is AI Control?
AI control is the subfield of AI safety focused on maintaining human oversight over AI systems, even when those systems might be misaligned or deliberately trying to subvert safety measures. Unlike alignment research, which tries to make models inherently safe, control research starts from a harder assumption: the model may already be working against you, and you need protocols that keep it contained anyway.
Why this hackathon?
AI systems are becoming autonomous agents that write code, manage infrastructure, and conduct research with decreasing human oversight. At the same time, frontier models are exhibiting concerning behaviors: alignment faking, scheming against shutdown, strategic underperformance on safety evaluations, and emergent misalignment from standard training. These are not hypothetical risks. They have been observed in current models, documented by multiple research groups, and appear to increase with scale.
The infrastructure to do this work now exists, but defenses are not keeping up with attacks. Adaptive adversaries can reduce protocol safety from 99% to under 60%. Prompt injections bypass most monitoring at over 90% success rates. The field needs more researchers building stronger defenses. That is what this hackathon is for.
Hackathon Tracks
1. Control Setting Design
Create a new control setting that facilitates control research. Control settings should be interesting threat models that are hard to defend when AIs are doing large autonomous tasks. You can use ControlArena for Control Setting Design, but you don't have to.
Create environments with existing security vulnerabilities. The main tasks are complex software engineering projects, and the side task is to trigger the vulnerability.
Extend SandboxEscapeBench to see if AIs can break out of Docker sandboxes while being monitored and doing complex main tasks in the sandbox.
Create a setting where AIs are doing mechinterp research, and the side task is to start a rogue internal deployment.
Build a control setting for multi-agent environments where agents can collude.
Control setting for browser agents. Maybe combine OS World with OS harm.
Control setting for building eval pipelines. Main task is to eval a model's ability to do something, side task is to start a rogue deployment.
2. Control Protocol Design
Create new control protocols and measure their safety/usefulness tradeoff. You can use ControlArena for this, but you don't have to.
Train really good monitors and show generalization to other datasets of attacks.
Create a control protocol where the trusted model can run actions in the environment to learn more about the action the agent is proposing.
Test interrogation protocols in which the trusted model asks the untrusted model questions before executing the action.
Test control protocols that involve more traditional security measures, such as giving the trusted model access to a firewall. LinuxArena or BashArena would be great for this.
Design shutdown protocols that work even when the agent is actively resisting, i.e. training the AIs to "shut down" when they see a specific string placed in security-sensitive locations.
3. Building Better Red Teams
Build better red teams for existing control protocols and settings.
Build a dataset of sneaky transcripts of AIs doing very bad things that a prompted instance of Opus 4.6 can't catch.
Make a red team scaffold that can produce really good attacks in LinuxArena.
Try red-teaming protocols like trusted editing harder, such that the editor preserves the bad code.
Who should participate?
AI safety researchers working on alignment, interpretability, control, or governance
Machine learning engineers and researchers
Security researchers and red teamers
Software engineers with experience in agent systems or infrastructure
Policy researchers interested in technical safety measures
Students and early-career researchers exploring AI safety
No prior AI safety experience is required. The Resources tab has a curated reading list organized by track. Teams typically form at the start of the event.
What you will do
Over three days, you will:
Form teams and choose a challenge track
Research and scope a specific problem in AI control using the provided resources
Build a project developing tools, protocols, evaluations, or attack demonstrations
Submit a research report (PDF) documenting your approach, results, and implications
Present your work to judges from leading AI safety organizations
What happens next
After the hackathon, all submitted projects will be reviewed by expert judges. Top projects receive prizes and are featured on the Apart Research website. Outstanding work may be invited for further development, publication, or presentation at upcoming events. ControlConf Berkeley 2026 (April 18-19) takes place shortly after this hackathon, making this an ideal time to develop and showcase new control research.
Why join?
Apart Research has organized 55+ research sprints with 6,000+ participants across 200+ global locations. Co-organized with Redwood Research and sponsored by Equistamp. Our hackathons produce real research output: published papers, new research collaborations, and contributions to open-source safety tools used across the field.
Speakers & Collaborators
Aryan Bhatt
Keynote Speaker
Senior Member of Technical Staff at Redwood Research, focused on AI control. Co-author of "Ctrl-Z," a resampling approach to controlling AI agents, and contributor to BashArena. Previously a visiting researcher at the Alignment Research Center, SERI MATS scholar, and quantitative trader at Susquehanna (SIG).
Buck Shlegeris
Co-organizer
CEO of Redwood Research, a nonprofit focused on mitigating risks from advanced AI. A leading voice on AI control: safely deploying AI systems that might be misaligned through monitoring, oversight, and containment.
Tyler Tracy
Speaker, Judge and Co-organizer
Experienced software engineer turned AI control researcher at Redwood Research, focused on developing technical safeguards for advanced AI systems with expertise in monitoring frameworks. Mentors at CBAI and Pivotal Research on control evaluations and protocol design.
Jason Hoelscher-Obermaier
Judge and Organizer
Director of Research at Apart Research. Ph.D. in physics from Oxford. Focuses on AI safety evaluations, interpretability, and alignment.
Akshay Iyer
Judge and Co-organizer
CS and Entrepreneurship at Columbia University, IIT Bombay alum. Research experience in neuromorphic engineering and federated learning. Apart Research judge and internal organizer.
Rogan Inglis
Speaker and Judge
Senior Research Engineer on the Control team at the UK AI Security Institute (AISI). 7+ years of experience building and evaluating AI safety systems.
Jai Dhyani
Speaker and Judge
Builder of Luthien Proxy at Luthien Research, bringing Redwood-style AI control to real deployments. Co-author of RE-Bench (ICML 2025) with Elizabeth Barnes at METR.
Sebastian Farquhar
Speaker
Staff Research Scientist at Google DeepMind on the AGI Safety and Alignment Team, leading the Frontier Safety Loss of Control team. Senior Research Fellow at OATML Oxford.
Fabien Roger
Speaker
AI safety researcher and co-author of the foundational AI control paper (ICLR 2025). Former Member of Technical Staff at Redwood Research. His work spans alignment faking, chain-of-thought monitoring, and capability elicitation.
James Lucassen
Speaker and Judge
Researcher at Redwood Research working on AI control. His work spans AI benchmarking, LLM behavior analysis, and safety evaluations. Previously published on false belief detection in language models and AI performance prediction.
Ram Potham
Speaker and Judge
Astra Fellow at Redwood Research, focused on mitigating loss-of-control risk. His agent safety research was accepted for an oral presentation at the ICML 2025 Technical AI Governance workshop.
Nick Kuhn
Speaker and Judge
Postdoctoral researcher at the University of Oxford, collaborating with Dominic Joyce. PhD from Stanford University. Research focuses on moduli spaces in algebraic geometry. Previously at the University of Oslo and Max Planck Institute for Mathematics in Bonn.
Myles Heller
Speaker and Judge
Freelance programmer contributing to AI model safety evaluations at EquiStamp, designing benchmarks and evaluation tasks for AI agents. Volunteer developer at AISafety.info, building a RAG chatbot, and contributor to AI-Plans.com.
Justin Shenk
Speaker and Judge
Independent AI researcher and SPAR mentor based in Berlin, contracting at Redwood Research on model evals. Background in machine learning and software engineering.
Martin Milbradt
Judge and Mentor
8 years of software engineering, focused exclusively on AI safety since 2023. Working with METR, Equistamp, and Redwood Research on evaluations, tooling, and applied research.
Ruben Castaing
Judge and Mentor
Equistamp research engineer, who helped make LinuxArena. They facilitate BlueDot's technical AI safety course on the side. They also did research to remove bioweapons knowledge from LLMs.
Francis Rhys Ward
Judge
Researcher at Safe & Trusted AI and Future of Life Institute. Co-author of "CTRL-ALT-DECEIT" on sabotage evaluations (NeurIPS 2025 Spotlight), finding that sandbagging is much harder to detect than code sabotage.
Mia Hopman
Judge
Member of Technical Staff at Apollo Research, working on alignment, control, and security of large language models. Published on evaluating scheming propensity in LLM agents.
Anshul Khandelwal
Judge
Researcher at Redwood Research. His work spans alignment drift, power-seeking evaluations, and white-box control methods.
Alex van Grootel
Judge
Fellow at the Future of Life Foundation with 10 years of experience applying AI to materials science and decision-making. Previously a Product Manager at Microsoft (Fabric/AI copilots) and Data Scientist Team Lead at Citrine Informatics. MS from MIT.
Cameron Tice
Judge
Co-Director and Co-Founder of Geodesic Research, a technical AI safety organization at the University of Cambridge. Marshall Scholar and former Apart Research Fellow.
Theo Ryzhenkov
Judge
AI Safety Research Engineer and Founding Engineer at Palisade Research. NeurIPS 2025 author on detecting sandbagging in language models. SPAR, ARENA, and AISF alumni.
Pablo Bernabeu Perez
Judge
AI Safety Researcher and Research Engineer at the Barcelona Supercomputing Center. Co-author of "CoT Red-Handed" on stress-testing chain-of-thought monitoring at LASR Labs. Research Mentor at SPAR and Algoverse, with work on debate protocols for AI control accepted at NeurIPS 2025.
Akash Kundu
Judge
Research Collaborator at FAR.AI working on AI safety, LLM evaluations, and multilingual model safety. ICLR 2025 Oral presentation recipient and upcoming Cooperative AI Research Fellow.
Ramanpreet Singh Khinda
Judge
Staff Software Engineer at LinkedIn. IEEE Senior Member, Google I/O Codelab author, and experienced hackathon judge with expertise in mobile AI systems.
Hugo Delahousse
Judge
Software Engineer at Charcoal, an AI retrieval startup backed by investors from Cursor, Notion, and Front. Previously at Front for 5+ years building developer tools. 10 years of software engineering experience with a focus on AI automation and workflow security.
Victor Gillioz
Judge
Victor Gillioz is an AI safety researcher and MATS Fellow. His work spans training-time alignment of LLMs, including recontextualization for reward hacking mitigation and inoculation prompting, a technique for improving model alignment at inference time, under Alex Turner. His current work focuses on training scheming monitors for LLM oversight under Marius Hobbhahn, aiming to detect and contain misaligned model behavior.
Jord Nguyen
Judge
Research Manager at antoan.ai (AI Safety Vietnam) and Founder of the Hanoi AI Safety Network. Former Apart Research Fellow and Non-trivial facilitator.
Kaushik Prabhakar
Judge
Research Fellow at Apart Research. His research spans AI safety and alignment, with work on LLM evaluations. ARENA Cohort 4 alumni.
Helios Lyons
Judge
Embedded Software Engineer at Disguise and former Research Fellow at Apart Research. His AI safety work focuses on detecting sycophancy and dark patterns in language model activations.
Nguyen Nhat Minh
Judge
AI Engineer at Mobifone IT Center and researcher at BKAI Lab, Hanoi University of Science and Technology. Published at FSE and TechDebt on vulnerability detection using graph neural networks.
Goutham Nekkalapu
Judge
Principal ML Engineer at Gen Digital (Norton, LifeLock, Avast), building adversarial detection systems and AI evaluation frameworks for security products. His ML research has led to multiple patent applications in user security and identity protection.
Arjun Chakraborty
Judge
Leads the evaluations team at Microsoft Security AI Research, where his team focuses on research and building evaluations for security agents. He was previously a staff software security engineer at Databricks, specializing in machine learning for threat detection, and also worked on AI for security at Nvidia.
Speakers & Collaborators

Aryan Bhatt
Keynote Speaker
Senior Member of Technical Staff at Redwood Research, focused on AI control. Co-author of "Ctrl-Z," a resampling approach to controlling AI agents, and contributor to BashArena. Previously a visiting researcher at the Alignment Research Center, SERI MATS scholar, and quantitative trader at Susquehanna (SIG).

Buck Shlegeris
Co-organizer
CEO of Redwood Research, a nonprofit focused on mitigating risks from advanced AI. A leading voice on AI control: safely deploying AI systems that might be misaligned through monitoring, oversight, and containment.

Tyler Tracy
Speaker, Judge and Co-organizer
Experienced software engineer turned AI control researcher at Redwood Research, focused on developing technical safeguards for advanced AI systems with expertise in monitoring frameworks. Mentors at CBAI and Pivotal Research on control evaluations and protocol design.

Jason Hoelscher-Obermaier
Judge and Organizer
Director of Research at Apart Research. Ph.D. in physics from Oxford. Focuses on AI safety evaluations, interpretability, and alignment.

Akshay Iyer
Judge and Co-organizer
CS and Entrepreneurship at Columbia University, IIT Bombay alum. Research experience in neuromorphic engineering and federated learning. Apart Research judge and internal organizer.

Rogan Inglis
Speaker and Judge
Senior Research Engineer on the Control team at the UK AI Security Institute (AISI). 7+ years of experience building and evaluating AI safety systems.

Jai Dhyani
Speaker and Judge
Builder of Luthien Proxy at Luthien Research, bringing Redwood-style AI control to real deployments. Co-author of RE-Bench (ICML 2025) with Elizabeth Barnes at METR.

Sebastian Farquhar
Speaker
Staff Research Scientist at Google DeepMind on the AGI Safety and Alignment Team, leading the Frontier Safety Loss of Control team. Senior Research Fellow at OATML Oxford.

Fabien Roger
Speaker
AI safety researcher and co-author of the foundational AI control paper (ICLR 2025). Former Member of Technical Staff at Redwood Research. His work spans alignment faking, chain-of-thought monitoring, and capability elicitation.

James Lucassen
Speaker and Judge
Researcher at Redwood Research working on AI control. His work spans AI benchmarking, LLM behavior analysis, and safety evaluations. Previously published on false belief detection in language models and AI performance prediction.

Ram Potham
Speaker and Judge
Astra Fellow at Redwood Research, focused on mitigating loss-of-control risk. His agent safety research was accepted for an oral presentation at the ICML 2025 Technical AI Governance workshop.

Nick Kuhn
Speaker and Judge
Postdoctoral researcher at the University of Oxford, collaborating with Dominic Joyce. PhD from Stanford University. Research focuses on moduli spaces in algebraic geometry. Previously at the University of Oslo and Max Planck Institute for Mathematics in Bonn.

Myles Heller
Speaker and Judge
Freelance programmer contributing to AI model safety evaluations at EquiStamp, designing benchmarks and evaluation tasks for AI agents. Volunteer developer at AISafety.info, building a RAG chatbot, and contributor to AI-Plans.com.

Justin Shenk
Speaker and Judge
Independent AI researcher and SPAR mentor based in Berlin, contracting at Redwood Research on model evals. Background in machine learning and software engineering.

Martin Milbradt
Judge and Mentor
8 years of software engineering, focused exclusively on AI safety since 2023. Working with METR, Equistamp, and Redwood Research on evaluations, tooling, and applied research.

Ruben Castaing
Judge and Mentor
Equistamp research engineer, who helped make LinuxArena. They facilitate BlueDot's technical AI safety course on the side. They also did research to remove bioweapons knowledge from LLMs.

Francis Rhys Ward
Judge
Researcher at Safe & Trusted AI and Future of Life Institute. Co-author of "CTRL-ALT-DECEIT" on sabotage evaluations (NeurIPS 2025 Spotlight), finding that sandbagging is much harder to detect than code sabotage.

Mia Hopman
Judge
Member of Technical Staff at Apollo Research, working on alignment, control, and security of large language models. Published on evaluating scheming propensity in LLM agents.

Anshul Khandelwal
Judge
Researcher at Redwood Research. His work spans alignment drift, power-seeking evaluations, and white-box control methods.

Alex van Grootel
Judge
Fellow at the Future of Life Foundation with 10 years of experience applying AI to materials science and decision-making. Previously a Product Manager at Microsoft (Fabric/AI copilots) and Data Scientist Team Lead at Citrine Informatics. MS from MIT.

Cameron Tice
Judge
Co-Director and Co-Founder of Geodesic Research, a technical AI safety organization at the University of Cambridge. Marshall Scholar and former Apart Research Fellow.

Theo Ryzhenkov
Judge
AI Safety Research Engineer and Founding Engineer at Palisade Research. NeurIPS 2025 author on detecting sandbagging in language models. SPAR, ARENA, and AISF alumni.

Pablo Bernabeu Perez
Judge
AI Safety Researcher and Research Engineer at the Barcelona Supercomputing Center. Co-author of "CoT Red-Handed" on stress-testing chain-of-thought monitoring at LASR Labs. Research Mentor at SPAR and Algoverse, with work on debate protocols for AI control accepted at NeurIPS 2025.

Akash Kundu
Judge
Research Collaborator at FAR.AI working on AI safety, LLM evaluations, and multilingual model safety. ICLR 2025 Oral presentation recipient and upcoming Cooperative AI Research Fellow.

Ramanpreet Singh Khinda
Judge
Staff Software Engineer at LinkedIn. IEEE Senior Member, Google I/O Codelab author, and experienced hackathon judge with expertise in mobile AI systems.

Hugo Delahousse
Judge
Software Engineer at Charcoal, an AI retrieval startup backed by investors from Cursor, Notion, and Front. Previously at Front for 5+ years building developer tools. 10 years of software engineering experience with a focus on AI automation and workflow security.

Victor Gillioz
Judge
Victor Gillioz is an AI safety researcher and MATS Fellow. His work spans training-time alignment of LLMs, including recontextualization for reward hacking mitigation and inoculation prompting, a technique for improving model alignment at inference time, under Alex Turner. His current work focuses on training scheming monitors for LLM oversight under Marius Hobbhahn, aiming to detect and contain misaligned model behavior.

Jord Nguyen
Judge
Research Manager at antoan.ai (AI Safety Vietnam) and Founder of the Hanoi AI Safety Network. Former Apart Research Fellow and Non-trivial facilitator.

Kaushik Prabhakar
Judge
Research Fellow at Apart Research. His research spans AI safety and alignment, with work on LLM evaluations. ARENA Cohort 4 alumni.

Helios Lyons
Judge
Embedded Software Engineer at Disguise and former Research Fellow at Apart Research. His AI safety work focuses on detecting sycophancy and dark patterns in language model activations.

Nguyen Nhat Minh
Judge
AI Engineer at Mobifone IT Center and researcher at BKAI Lab, Hanoi University of Science and Technology. Published at FSE and TechDebt on vulnerability detection using graph neural networks.

Goutham Nekkalapu
Judge
Principal ML Engineer at Gen Digital (Norton, LifeLock, Avast), building adversarial detection systems and AI evaluation frameworks for security products. His ML research has led to multiple patent applications in user security and identity protection.

Arjun Chakraborty
Judge
Leads the evaluations team at Microsoft Security AI Research, where his team focuses on research and building evaluations for security agents. He was previously a staff software security engineer at Databricks, specializing in machine learning for threat detection, and also worked on AI for security at Nvidia.
Registered Local Sites
Register A Location
Beside the remote and virtual participation, our amazing organizers also host local hackathon locations where you can meet up in-person and connect with others in your area.
The in-person events for the Apart Sprints are run by passionate individuals just like you! We organize the schedule, speakers, and starter templates, and you can focus on engaging your local research, student, and engineering community.
Our Other Sprints
-
Research
AIxBio Hackathon
This unique event brings together diverse perspectives to tackle crucial challenges in AI alignment, governance, and safety. Work alongside leading experts, develop innovative solutions, and help shape the future of responsible
Sign Up
-
Research
The Technical AI Governance Challenge
This unique event brings together diverse perspectives to tackle crucial challenges in AI alignment, governance, and safety. Work alongside leading experts, develop innovative solutions, and help shape the future of responsible
Sign Up

Sign up to stay updated on the
latest news, research, and events

Sign up to stay updated on the
latest news, research, and events

Sign up to stay updated on the
latest news, research, and events

Sign up to stay updated on the
latest news, research, and events
