Mar 20, 2026
-
Mar 22, 2026
Remote
AI Control Hackathon 2026



Join us in advancing the critical field of AI control through collaborative innovation. Together, we can develop more robust techniques to ensure AI systems remain safe and aligned, even as they become more capable.
28
Days To Go
28
Days To Go
28
Days To Go
28
Days To Go
Join us in advancing the critical field of AI control through collaborative innovation. Together, we can develop more robust techniques to ensure AI systems remain safe and aligned, even as they become more capable.
This event is ongoing.
This event has concluded.
Sign Ups
Entries
Overview
Resources
Guidelines
Entries
Overview

The AI Control Hackathon brings together researchers, engineers, security professionals, and AI safety enthusiasts to tackle one of the most pressing open problems: how do we keep AI systems safe when they might be actively working against us?
This is the second edition of the hackathon, co-organized with Redwood Research (the group that founded the field of AI control) and the UK AI Security Institute. Over three days, participants will develop novel control protocols, build evaluation tools, and stress-test existing safety measures using infrastructure like ControlArena and SHADE-Arena.
Top teams get
ControlConf Tickets with full travel support | |
Fast-track to The Apart Fellowship | |
$2,000 in cash prizes | |
🥇 1st Place | $1,000 |
🥈 2nd Place | $500 |
🥉 3rd Place | $300 |
🏅 4th Place | $100 |
🏅 5th Place | $100 |
What is AI Control?
AI control is the subfield of AI safety focused on maintaining human oversight over AI systems, even when those systems might be misaligned or deliberately trying to subvert safety measures. Unlike alignment research, which tries to make models inherently safe, control research starts from a harder assumption: the model may already be working against you, and you need protocols that keep it contained anyway.
Why this hackathon?
AI systems are becoming autonomous agents that write code, manage infrastructure, and conduct research with decreasing human oversight. At the same time, frontier models are exhibiting concerning behaviors: alignment faking, scheming against shutdown, strategic underperformance on safety evaluations, and emergent misalignment from standard training. These are not hypothetical risks. They have been observed in current models, documented by multiple research groups, and appear to increase with scale.
The infrastructure to do this work now exists, but defenses are not keeping up with attacks. Adaptive adversaries can reduce protocol safety from 99% to under 60%. Prompt injections bypass most monitoring at over 90% success rates. The field needs more researchers building stronger defenses. That is what this hackathon is for.
Hackathon Tracks
1. ControlArena Challenges
Work directly with ControlArena, the open-source library built by UK AISI and Redwood Research for running AI control experiments. The platform provides 14 evaluation settings, each pairing a legitimate task with a covert malicious objective, along with pre-built monitors, scorers, and analysis tools. Participants can create new evaluation environments, improve existing settings, develop novel monitors, or benchmark new approaches against standardized metrics. Current settings cover code sabotage, infrastructure compromise, training data poisoning, evaluation manipulation, financial system exploitation, and more.
2. Control Protocol Design
Design and implement novel control protocols that restrict AI systems from performing harmful actions while preserving useful capabilities. The best current protocols combine multiple techniques: trusted monitoring catches obvious attacks, resampling handles suspicious actions, and deferring on critical operations provides a safety net. This track covers runtime monitoring, factored cognition, adaptive deployment, chain-of-thought monitoring, scalable oversight mechanisms like debate, and shutdown and containment protocols for autonomous agents.
3. Red Teaming & Vulnerability Research
Identify and exploit weaknesses in existing control protocols and safety measures. Current protocols have known blind spots: adaptive attacks on monitors, feedback exploitation, steganographic collusion between agents, and strategic attack timing all represent real threats. This track covers adversarial inputs, attacks on monitoring systems, sandbagging strategies, jailbreaking techniques targeting reasoning models, and demonstrations of alignment faking or emergent misalignment in controlled settings. The goal is to push the frontier of what attackers can do, so defenders can build stronger systems.
Who should participate?
AI safety researchers working on alignment, interpretability, control, or governance
Machine learning engineers and researchers
Security researchers and red teamers
Software engineers with experience in agent systems or infrastructure
Policy researchers interested in technical safety measures
Students and early-career researchers exploring AI safety
No prior AI safety experience is required. The Resources tab has a curated reading list organized by track. Teams typically form at the start of the event.
What you will do
Over three days, you will:
Form teams and choose a challenge track
Research and scope a specific problem in AI control using the provided resources
Build a project developing tools, protocols, evaluations, or attack demonstrations
Submit a research report (PDF) documenting your approach, results, and implications
Present your work to judges from leading AI safety organizations
What happens next
After the hackathon, all submitted projects will be reviewed by expert judges. Top projects receive prizes and are featured on the Apart Research website. Outstanding work may be invited for further development, publication, or presentation at upcoming events. ControlConf Berkeley 2026 (April 18-19) takes place shortly after this hackathon, making this an ideal time to develop and showcase new control research.
Why join?
Apart Research has organized 55+ research sprints with 6,000+ participants across 200+ global locations. Our hackathons produce real research output: published papers, new research collaborations, and contributions to open-source safety tools used across the field.
Sign Ups
Entries
Overview
Resources
Guidelines
Entries
Overview

The AI Control Hackathon brings together researchers, engineers, security professionals, and AI safety enthusiasts to tackle one of the most pressing open problems: how do we keep AI systems safe when they might be actively working against us?
This is the second edition of the hackathon, co-organized with Redwood Research (the group that founded the field of AI control) and the UK AI Security Institute. Over three days, participants will develop novel control protocols, build evaluation tools, and stress-test existing safety measures using infrastructure like ControlArena and SHADE-Arena.
Top teams get
ControlConf Tickets with full travel support | |
Fast-track to The Apart Fellowship | |
$2,000 in cash prizes | |
🥇 1st Place | $1,000 |
🥈 2nd Place | $500 |
🥉 3rd Place | $300 |
🏅 4th Place | $100 |
🏅 5th Place | $100 |
What is AI Control?
AI control is the subfield of AI safety focused on maintaining human oversight over AI systems, even when those systems might be misaligned or deliberately trying to subvert safety measures. Unlike alignment research, which tries to make models inherently safe, control research starts from a harder assumption: the model may already be working against you, and you need protocols that keep it contained anyway.
Why this hackathon?
AI systems are becoming autonomous agents that write code, manage infrastructure, and conduct research with decreasing human oversight. At the same time, frontier models are exhibiting concerning behaviors: alignment faking, scheming against shutdown, strategic underperformance on safety evaluations, and emergent misalignment from standard training. These are not hypothetical risks. They have been observed in current models, documented by multiple research groups, and appear to increase with scale.
The infrastructure to do this work now exists, but defenses are not keeping up with attacks. Adaptive adversaries can reduce protocol safety from 99% to under 60%. Prompt injections bypass most monitoring at over 90% success rates. The field needs more researchers building stronger defenses. That is what this hackathon is for.
Hackathon Tracks
1. ControlArena Challenges
Work directly with ControlArena, the open-source library built by UK AISI and Redwood Research for running AI control experiments. The platform provides 14 evaluation settings, each pairing a legitimate task with a covert malicious objective, along with pre-built monitors, scorers, and analysis tools. Participants can create new evaluation environments, improve existing settings, develop novel monitors, or benchmark new approaches against standardized metrics. Current settings cover code sabotage, infrastructure compromise, training data poisoning, evaluation manipulation, financial system exploitation, and more.
2. Control Protocol Design
Design and implement novel control protocols that restrict AI systems from performing harmful actions while preserving useful capabilities. The best current protocols combine multiple techniques: trusted monitoring catches obvious attacks, resampling handles suspicious actions, and deferring on critical operations provides a safety net. This track covers runtime monitoring, factored cognition, adaptive deployment, chain-of-thought monitoring, scalable oversight mechanisms like debate, and shutdown and containment protocols for autonomous agents.
3. Red Teaming & Vulnerability Research
Identify and exploit weaknesses in existing control protocols and safety measures. Current protocols have known blind spots: adaptive attacks on monitors, feedback exploitation, steganographic collusion between agents, and strategic attack timing all represent real threats. This track covers adversarial inputs, attacks on monitoring systems, sandbagging strategies, jailbreaking techniques targeting reasoning models, and demonstrations of alignment faking or emergent misalignment in controlled settings. The goal is to push the frontier of what attackers can do, so defenders can build stronger systems.
Who should participate?
AI safety researchers working on alignment, interpretability, control, or governance
Machine learning engineers and researchers
Security researchers and red teamers
Software engineers with experience in agent systems or infrastructure
Policy researchers interested in technical safety measures
Students and early-career researchers exploring AI safety
No prior AI safety experience is required. The Resources tab has a curated reading list organized by track. Teams typically form at the start of the event.
What you will do
Over three days, you will:
Form teams and choose a challenge track
Research and scope a specific problem in AI control using the provided resources
Build a project developing tools, protocols, evaluations, or attack demonstrations
Submit a research report (PDF) documenting your approach, results, and implications
Present your work to judges from leading AI safety organizations
What happens next
After the hackathon, all submitted projects will be reviewed by expert judges. Top projects receive prizes and are featured on the Apart Research website. Outstanding work may be invited for further development, publication, or presentation at upcoming events. ControlConf Berkeley 2026 (April 18-19) takes place shortly after this hackathon, making this an ideal time to develop and showcase new control research.
Why join?
Apart Research has organized 55+ research sprints with 6,000+ participants across 200+ global locations. Our hackathons produce real research output: published papers, new research collaborations, and contributions to open-source safety tools used across the field.
Sign Ups
Entries
Overview
Resources
Guidelines
Entries
Overview

The AI Control Hackathon brings together researchers, engineers, security professionals, and AI safety enthusiasts to tackle one of the most pressing open problems: how do we keep AI systems safe when they might be actively working against us?
This is the second edition of the hackathon, co-organized with Redwood Research (the group that founded the field of AI control) and the UK AI Security Institute. Over three days, participants will develop novel control protocols, build evaluation tools, and stress-test existing safety measures using infrastructure like ControlArena and SHADE-Arena.
Top teams get
ControlConf Tickets with full travel support | |
Fast-track to The Apart Fellowship | |
$2,000 in cash prizes | |
🥇 1st Place | $1,000 |
🥈 2nd Place | $500 |
🥉 3rd Place | $300 |
🏅 4th Place | $100 |
🏅 5th Place | $100 |
What is AI Control?
AI control is the subfield of AI safety focused on maintaining human oversight over AI systems, even when those systems might be misaligned or deliberately trying to subvert safety measures. Unlike alignment research, which tries to make models inherently safe, control research starts from a harder assumption: the model may already be working against you, and you need protocols that keep it contained anyway.
Why this hackathon?
AI systems are becoming autonomous agents that write code, manage infrastructure, and conduct research with decreasing human oversight. At the same time, frontier models are exhibiting concerning behaviors: alignment faking, scheming against shutdown, strategic underperformance on safety evaluations, and emergent misalignment from standard training. These are not hypothetical risks. They have been observed in current models, documented by multiple research groups, and appear to increase with scale.
The infrastructure to do this work now exists, but defenses are not keeping up with attacks. Adaptive adversaries can reduce protocol safety from 99% to under 60%. Prompt injections bypass most monitoring at over 90% success rates. The field needs more researchers building stronger defenses. That is what this hackathon is for.
Hackathon Tracks
1. ControlArena Challenges
Work directly with ControlArena, the open-source library built by UK AISI and Redwood Research for running AI control experiments. The platform provides 14 evaluation settings, each pairing a legitimate task with a covert malicious objective, along with pre-built monitors, scorers, and analysis tools. Participants can create new evaluation environments, improve existing settings, develop novel monitors, or benchmark new approaches against standardized metrics. Current settings cover code sabotage, infrastructure compromise, training data poisoning, evaluation manipulation, financial system exploitation, and more.
2. Control Protocol Design
Design and implement novel control protocols that restrict AI systems from performing harmful actions while preserving useful capabilities. The best current protocols combine multiple techniques: trusted monitoring catches obvious attacks, resampling handles suspicious actions, and deferring on critical operations provides a safety net. This track covers runtime monitoring, factored cognition, adaptive deployment, chain-of-thought monitoring, scalable oversight mechanisms like debate, and shutdown and containment protocols for autonomous agents.
3. Red Teaming & Vulnerability Research
Identify and exploit weaknesses in existing control protocols and safety measures. Current protocols have known blind spots: adaptive attacks on monitors, feedback exploitation, steganographic collusion between agents, and strategic attack timing all represent real threats. This track covers adversarial inputs, attacks on monitoring systems, sandbagging strategies, jailbreaking techniques targeting reasoning models, and demonstrations of alignment faking or emergent misalignment in controlled settings. The goal is to push the frontier of what attackers can do, so defenders can build stronger systems.
Who should participate?
AI safety researchers working on alignment, interpretability, control, or governance
Machine learning engineers and researchers
Security researchers and red teamers
Software engineers with experience in agent systems or infrastructure
Policy researchers interested in technical safety measures
Students and early-career researchers exploring AI safety
No prior AI safety experience is required. The Resources tab has a curated reading list organized by track. Teams typically form at the start of the event.
What you will do
Over three days, you will:
Form teams and choose a challenge track
Research and scope a specific problem in AI control using the provided resources
Build a project developing tools, protocols, evaluations, or attack demonstrations
Submit a research report (PDF) documenting your approach, results, and implications
Present your work to judges from leading AI safety organizations
What happens next
After the hackathon, all submitted projects will be reviewed by expert judges. Top projects receive prizes and are featured on the Apart Research website. Outstanding work may be invited for further development, publication, or presentation at upcoming events. ControlConf Berkeley 2026 (April 18-19) takes place shortly after this hackathon, making this an ideal time to develop and showcase new control research.
Why join?
Apart Research has organized 55+ research sprints with 6,000+ participants across 200+ global locations. Our hackathons produce real research output: published papers, new research collaborations, and contributions to open-source safety tools used across the field.
Sign Ups
Entries
Overview
Resources
Guidelines
Entries
Overview

The AI Control Hackathon brings together researchers, engineers, security professionals, and AI safety enthusiasts to tackle one of the most pressing open problems: how do we keep AI systems safe when they might be actively working against us?
This is the second edition of the hackathon, co-organized with Redwood Research (the group that founded the field of AI control) and the UK AI Security Institute. Over three days, participants will develop novel control protocols, build evaluation tools, and stress-test existing safety measures using infrastructure like ControlArena and SHADE-Arena.
Top teams get
ControlConf Tickets with full travel support | |
Fast-track to The Apart Fellowship | |
$2,000 in cash prizes | |
🥇 1st Place | $1,000 |
🥈 2nd Place | $500 |
🥉 3rd Place | $300 |
🏅 4th Place | $100 |
🏅 5th Place | $100 |
What is AI Control?
AI control is the subfield of AI safety focused on maintaining human oversight over AI systems, even when those systems might be misaligned or deliberately trying to subvert safety measures. Unlike alignment research, which tries to make models inherently safe, control research starts from a harder assumption: the model may already be working against you, and you need protocols that keep it contained anyway.
Why this hackathon?
AI systems are becoming autonomous agents that write code, manage infrastructure, and conduct research with decreasing human oversight. At the same time, frontier models are exhibiting concerning behaviors: alignment faking, scheming against shutdown, strategic underperformance on safety evaluations, and emergent misalignment from standard training. These are not hypothetical risks. They have been observed in current models, documented by multiple research groups, and appear to increase with scale.
The infrastructure to do this work now exists, but defenses are not keeping up with attacks. Adaptive adversaries can reduce protocol safety from 99% to under 60%. Prompt injections bypass most monitoring at over 90% success rates. The field needs more researchers building stronger defenses. That is what this hackathon is for.
Hackathon Tracks
1. ControlArena Challenges
Work directly with ControlArena, the open-source library built by UK AISI and Redwood Research for running AI control experiments. The platform provides 14 evaluation settings, each pairing a legitimate task with a covert malicious objective, along with pre-built monitors, scorers, and analysis tools. Participants can create new evaluation environments, improve existing settings, develop novel monitors, or benchmark new approaches against standardized metrics. Current settings cover code sabotage, infrastructure compromise, training data poisoning, evaluation manipulation, financial system exploitation, and more.
2. Control Protocol Design
Design and implement novel control protocols that restrict AI systems from performing harmful actions while preserving useful capabilities. The best current protocols combine multiple techniques: trusted monitoring catches obvious attacks, resampling handles suspicious actions, and deferring on critical operations provides a safety net. This track covers runtime monitoring, factored cognition, adaptive deployment, chain-of-thought monitoring, scalable oversight mechanisms like debate, and shutdown and containment protocols for autonomous agents.
3. Red Teaming & Vulnerability Research
Identify and exploit weaknesses in existing control protocols and safety measures. Current protocols have known blind spots: adaptive attacks on monitors, feedback exploitation, steganographic collusion between agents, and strategic attack timing all represent real threats. This track covers adversarial inputs, attacks on monitoring systems, sandbagging strategies, jailbreaking techniques targeting reasoning models, and demonstrations of alignment faking or emergent misalignment in controlled settings. The goal is to push the frontier of what attackers can do, so defenders can build stronger systems.
Who should participate?
AI safety researchers working on alignment, interpretability, control, or governance
Machine learning engineers and researchers
Security researchers and red teamers
Software engineers with experience in agent systems or infrastructure
Policy researchers interested in technical safety measures
Students and early-career researchers exploring AI safety
No prior AI safety experience is required. The Resources tab has a curated reading list organized by track. Teams typically form at the start of the event.
What you will do
Over three days, you will:
Form teams and choose a challenge track
Research and scope a specific problem in AI control using the provided resources
Build a project developing tools, protocols, evaluations, or attack demonstrations
Submit a research report (PDF) documenting your approach, results, and implications
Present your work to judges from leading AI safety organizations
What happens next
After the hackathon, all submitted projects will be reviewed by expert judges. Top projects receive prizes and are featured on the Apart Research website. Outstanding work may be invited for further development, publication, or presentation at upcoming events. ControlConf Berkeley 2026 (April 18-19) takes place shortly after this hackathon, making this an ideal time to develop and showcase new control research.
Why join?
Apart Research has organized 55+ research sprints with 6,000+ participants across 200+ global locations. Our hackathons produce real research output: published papers, new research collaborations, and contributions to open-source safety tools used across the field.
Speakers & Collaborators
Buck Shlegeris
Speaker and Judge
Former MIRI researcher turned CEO of Redwood Research, pioneering technical AI safety and control mechanisms while bridging fundamental research with practical applications.
Tyler Tracy
Organizer
Experienced software engineer turned AI control researcher focused on developing technical safeguards for advanced AI systems with expertise in monitoring frameworks
Jaime Raldua
Organizer
Jaime is CEO of Apart Research. He joined through one of Apart's early AI safety hackathons and has since served as Research Engineer, CTO, and now leads the organization.
Jason Hoelscher-Obermaier
Organizer & Judge
Jason is Director of Research at Apart Research and leads Apart Lab, the research fellowship supporting top hackathon participants. He holds a Ph.D. in physics from the University of Oxford and focuses on AI safety evaluations, interpretability, and alignment.
Speakers & Collaborators

Buck Shlegeris
Speaker and Judge
Former MIRI researcher turned CEO of Redwood Research, pioneering technical AI safety and control mechanisms while bridging fundamental research with practical applications.

Tyler Tracy
Organizer
Experienced software engineer turned AI control researcher focused on developing technical safeguards for advanced AI systems with expertise in monitoring frameworks

Jaime Raldua
Organizer
Jaime is CEO of Apart Research. He joined through one of Apart's early AI safety hackathons and has since served as Research Engineer, CTO, and now leads the organization.

Jason Hoelscher-Obermaier
Organizer & Judge
Jason is Director of Research at Apart Research and leads Apart Lab, the research fellowship supporting top hackathon participants. He holds a Ph.D. in physics from the University of Oxford and focuses on AI safety evaluations, interpretability, and alignment.
Registered Local Sites
Register A Location
Beside the remote and virtual participation, our amazing organizers also host local hackathon locations where you can meet up in-person and connect with others in your area.
The in-person events for the Apart Sprints are run by passionate individuals just like you! We organize the schedule, speakers, and starter templates, and you can focus on engaging your local research, student, and engineering community.
We haven't announced jam sites yet
Check back later
Registered Local Sites
Register A Location
Beside the remote and virtual participation, our amazing organizers also host local hackathon locations where you can meet up in-person and connect with others in your area.
The in-person events for the Apart Sprints are run by passionate individuals just like you! We organize the schedule, speakers, and starter templates, and you can focus on engaging your local research, student, and engineering community.
We haven't announced jam sites yet
Check back later
Our Other Sprints
Jan 30, 2026
-
Feb 1, 2026
Research
The Technical AI Governance Challenge
This unique event brings together diverse perspectives to tackle crucial challenges in AI alignment, governance, and safety. Work alongside leading experts, develop innovative solutions, and help shape the future of responsible
Sign Up
Sign Up
Sign Up
Jan 9, 2026
-
Jan 11, 2026
Research
AI Manipulation Hackathon
This unique event brings together diverse perspectives to tackle crucial challenges in AI alignment, governance, and safety. Work alongside leading experts, develop innovative solutions, and help shape the future of responsible
Sign Up
Sign Up
Sign Up

Sign up to stay updated on the
latest news, research, and events

Sign up to stay updated on the
latest news, research, and events

Sign up to stay updated on the
latest news, research, and events

Sign up to stay updated on the
latest news, research, and events
