Mar 20, 2026

-

Mar 22, 2026

Remote

AI Control Hackathon 2026

Join us in advancing the critical field of AI control through collaborative innovation. Together, we can develop more robust techniques to ensure AI systems remain safe and aligned, even as they become more capable.

28

Days To Go

28

Days To Go

28

Days To Go

28

Days To Go

0

Sign Ups

0

Entries

Overview

Resources

Guidelines

Entries

Overview

Arrow

The AI Control Hackathon brings together researchers, engineers, security professionals, and AI safety enthusiasts to tackle one of the most pressing open problems: how do we keep AI systems safe when they might be actively working against us?

This is the second edition of the hackathon, co-organized with Redwood Research (the group that founded the field of AI control) and the UK AI Security Institute. Over three days, participants will develop novel control protocols, build evaluation tools, and stress-test existing safety measures using infrastructure like ControlArena and SHADE-Arena.

Top teams get

ControlConf Tickets with full travel support

Fast-track to The Apart Fellowship

$2,000 in cash prizes

🥇 1st Place

$1,000

🥈 2nd Place

$500

🥉 3rd Place

$300

🏅 4th Place

$100

🏅 5th Place

$100

What is AI Control?

AI control is the subfield of AI safety focused on maintaining human oversight over AI systems, even when those systems might be misaligned or deliberately trying to subvert safety measures. Unlike alignment research, which tries to make models inherently safe, control research starts from a harder assumption: the model may already be working against you, and you need protocols that keep it contained anyway.

Why this hackathon?

AI systems are becoming autonomous agents that write code, manage infrastructure, and conduct research with decreasing human oversight. At the same time, frontier models are exhibiting concerning behaviors: alignment faking, scheming against shutdown, strategic underperformance on safety evaluations, and emergent misalignment from standard training. These are not hypothetical risks. They have been observed in current models, documented by multiple research groups, and appear to increase with scale.

The infrastructure to do this work now exists, but defenses are not keeping up with attacks. Adaptive adversaries can reduce protocol safety from 99% to under 60%. Prompt injections bypass most monitoring at over 90% success rates. The field needs more researchers building stronger defenses. That is what this hackathon is for.

Hackathon Tracks

1. ControlArena Challenges

Work directly with ControlArena, the open-source library built by UK AISI and Redwood Research for running AI control experiments. The platform provides 14 evaluation settings, each pairing a legitimate task with a covert malicious objective, along with pre-built monitors, scorers, and analysis tools. Participants can create new evaluation environments, improve existing settings, develop novel monitors, or benchmark new approaches against standardized metrics. Current settings cover code sabotage, infrastructure compromise, training data poisoning, evaluation manipulation, financial system exploitation, and more.

2. Control Protocol Design

Design and implement novel control protocols that restrict AI systems from performing harmful actions while preserving useful capabilities. The best current protocols combine multiple techniques: trusted monitoring catches obvious attacks, resampling handles suspicious actions, and deferring on critical operations provides a safety net. This track covers runtime monitoring, factored cognition, adaptive deployment, chain-of-thought monitoring, scalable oversight mechanisms like debate, and shutdown and containment protocols for autonomous agents.

3. Red Teaming & Vulnerability Research

Identify and exploit weaknesses in existing control protocols and safety measures. Current protocols have known blind spots: adaptive attacks on monitors, feedback exploitation, steganographic collusion between agents, and strategic attack timing all represent real threats. This track covers adversarial inputs, attacks on monitoring systems, sandbagging strategies, jailbreaking techniques targeting reasoning models, and demonstrations of alignment faking or emergent misalignment in controlled settings. The goal is to push the frontier of what attackers can do, so defenders can build stronger systems.

Who should participate?

  • AI safety researchers working on alignment, interpretability, control, or governance

  • Machine learning engineers and researchers

  • Security researchers and red teamers

  • Software engineers with experience in agent systems or infrastructure

  • Policy researchers interested in technical safety measures

  • Students and early-career researchers exploring AI safety

No prior AI safety experience is required. The Resources tab has a curated reading list organized by track. Teams typically form at the start of the event.

What you will do

Over three days, you will:

  1. Form teams and choose a challenge track

  2. Research and scope a specific problem in AI control using the provided resources

  3. Build a project developing tools, protocols, evaluations, or attack demonstrations

  4. Submit a research report (PDF) documenting your approach, results, and implications

  5. Present your work to judges from leading AI safety organizations

What happens next

After the hackathon, all submitted projects will be reviewed by expert judges. Top projects receive prizes and are featured on the Apart Research website. Outstanding work may be invited for further development, publication, or presentation at upcoming events. ControlConf Berkeley 2026 (April 18-19) takes place shortly after this hackathon, making this an ideal time to develop and showcase new control research.

Why join?

Apart Research has organized 55+ research sprints with 6,000+ participants across 200+ global locations. Our hackathons produce real research output: published papers, new research collaborations, and contributions to open-source safety tools used across the field.

0

Sign Ups

0

Entries

Overview

Resources

Guidelines

Entries

Overview

Arrow

The AI Control Hackathon brings together researchers, engineers, security professionals, and AI safety enthusiasts to tackle one of the most pressing open problems: how do we keep AI systems safe when they might be actively working against us?

This is the second edition of the hackathon, co-organized with Redwood Research (the group that founded the field of AI control) and the UK AI Security Institute. Over three days, participants will develop novel control protocols, build evaluation tools, and stress-test existing safety measures using infrastructure like ControlArena and SHADE-Arena.

Top teams get

ControlConf Tickets with full travel support

Fast-track to The Apart Fellowship

$2,000 in cash prizes

🥇 1st Place

$1,000

🥈 2nd Place

$500

🥉 3rd Place

$300

🏅 4th Place

$100

🏅 5th Place

$100

What is AI Control?

AI control is the subfield of AI safety focused on maintaining human oversight over AI systems, even when those systems might be misaligned or deliberately trying to subvert safety measures. Unlike alignment research, which tries to make models inherently safe, control research starts from a harder assumption: the model may already be working against you, and you need protocols that keep it contained anyway.

Why this hackathon?

AI systems are becoming autonomous agents that write code, manage infrastructure, and conduct research with decreasing human oversight. At the same time, frontier models are exhibiting concerning behaviors: alignment faking, scheming against shutdown, strategic underperformance on safety evaluations, and emergent misalignment from standard training. These are not hypothetical risks. They have been observed in current models, documented by multiple research groups, and appear to increase with scale.

The infrastructure to do this work now exists, but defenses are not keeping up with attacks. Adaptive adversaries can reduce protocol safety from 99% to under 60%. Prompt injections bypass most monitoring at over 90% success rates. The field needs more researchers building stronger defenses. That is what this hackathon is for.

Hackathon Tracks

1. ControlArena Challenges

Work directly with ControlArena, the open-source library built by UK AISI and Redwood Research for running AI control experiments. The platform provides 14 evaluation settings, each pairing a legitimate task with a covert malicious objective, along with pre-built monitors, scorers, and analysis tools. Participants can create new evaluation environments, improve existing settings, develop novel monitors, or benchmark new approaches against standardized metrics. Current settings cover code sabotage, infrastructure compromise, training data poisoning, evaluation manipulation, financial system exploitation, and more.

2. Control Protocol Design

Design and implement novel control protocols that restrict AI systems from performing harmful actions while preserving useful capabilities. The best current protocols combine multiple techniques: trusted monitoring catches obvious attacks, resampling handles suspicious actions, and deferring on critical operations provides a safety net. This track covers runtime monitoring, factored cognition, adaptive deployment, chain-of-thought monitoring, scalable oversight mechanisms like debate, and shutdown and containment protocols for autonomous agents.

3. Red Teaming & Vulnerability Research

Identify and exploit weaknesses in existing control protocols and safety measures. Current protocols have known blind spots: adaptive attacks on monitors, feedback exploitation, steganographic collusion between agents, and strategic attack timing all represent real threats. This track covers adversarial inputs, attacks on monitoring systems, sandbagging strategies, jailbreaking techniques targeting reasoning models, and demonstrations of alignment faking or emergent misalignment in controlled settings. The goal is to push the frontier of what attackers can do, so defenders can build stronger systems.

Who should participate?

  • AI safety researchers working on alignment, interpretability, control, or governance

  • Machine learning engineers and researchers

  • Security researchers and red teamers

  • Software engineers with experience in agent systems or infrastructure

  • Policy researchers interested in technical safety measures

  • Students and early-career researchers exploring AI safety

No prior AI safety experience is required. The Resources tab has a curated reading list organized by track. Teams typically form at the start of the event.

What you will do

Over three days, you will:

  1. Form teams and choose a challenge track

  2. Research and scope a specific problem in AI control using the provided resources

  3. Build a project developing tools, protocols, evaluations, or attack demonstrations

  4. Submit a research report (PDF) documenting your approach, results, and implications

  5. Present your work to judges from leading AI safety organizations

What happens next

After the hackathon, all submitted projects will be reviewed by expert judges. Top projects receive prizes and are featured on the Apart Research website. Outstanding work may be invited for further development, publication, or presentation at upcoming events. ControlConf Berkeley 2026 (April 18-19) takes place shortly after this hackathon, making this an ideal time to develop and showcase new control research.

Why join?

Apart Research has organized 55+ research sprints with 6,000+ participants across 200+ global locations. Our hackathons produce real research output: published papers, new research collaborations, and contributions to open-source safety tools used across the field.

0

Sign Ups

0

Entries

Overview

Resources

Guidelines

Entries

Overview

Arrow

The AI Control Hackathon brings together researchers, engineers, security professionals, and AI safety enthusiasts to tackle one of the most pressing open problems: how do we keep AI systems safe when they might be actively working against us?

This is the second edition of the hackathon, co-organized with Redwood Research (the group that founded the field of AI control) and the UK AI Security Institute. Over three days, participants will develop novel control protocols, build evaluation tools, and stress-test existing safety measures using infrastructure like ControlArena and SHADE-Arena.

Top teams get

ControlConf Tickets with full travel support

Fast-track to The Apart Fellowship

$2,000 in cash prizes

🥇 1st Place

$1,000

🥈 2nd Place

$500

🥉 3rd Place

$300

🏅 4th Place

$100

🏅 5th Place

$100

What is AI Control?

AI control is the subfield of AI safety focused on maintaining human oversight over AI systems, even when those systems might be misaligned or deliberately trying to subvert safety measures. Unlike alignment research, which tries to make models inherently safe, control research starts from a harder assumption: the model may already be working against you, and you need protocols that keep it contained anyway.

Why this hackathon?

AI systems are becoming autonomous agents that write code, manage infrastructure, and conduct research with decreasing human oversight. At the same time, frontier models are exhibiting concerning behaviors: alignment faking, scheming against shutdown, strategic underperformance on safety evaluations, and emergent misalignment from standard training. These are not hypothetical risks. They have been observed in current models, documented by multiple research groups, and appear to increase with scale.

The infrastructure to do this work now exists, but defenses are not keeping up with attacks. Adaptive adversaries can reduce protocol safety from 99% to under 60%. Prompt injections bypass most monitoring at over 90% success rates. The field needs more researchers building stronger defenses. That is what this hackathon is for.

Hackathon Tracks

1. ControlArena Challenges

Work directly with ControlArena, the open-source library built by UK AISI and Redwood Research for running AI control experiments. The platform provides 14 evaluation settings, each pairing a legitimate task with a covert malicious objective, along with pre-built monitors, scorers, and analysis tools. Participants can create new evaluation environments, improve existing settings, develop novel monitors, or benchmark new approaches against standardized metrics. Current settings cover code sabotage, infrastructure compromise, training data poisoning, evaluation manipulation, financial system exploitation, and more.

2. Control Protocol Design

Design and implement novel control protocols that restrict AI systems from performing harmful actions while preserving useful capabilities. The best current protocols combine multiple techniques: trusted monitoring catches obvious attacks, resampling handles suspicious actions, and deferring on critical operations provides a safety net. This track covers runtime monitoring, factored cognition, adaptive deployment, chain-of-thought monitoring, scalable oversight mechanisms like debate, and shutdown and containment protocols for autonomous agents.

3. Red Teaming & Vulnerability Research

Identify and exploit weaknesses in existing control protocols and safety measures. Current protocols have known blind spots: adaptive attacks on monitors, feedback exploitation, steganographic collusion between agents, and strategic attack timing all represent real threats. This track covers adversarial inputs, attacks on monitoring systems, sandbagging strategies, jailbreaking techniques targeting reasoning models, and demonstrations of alignment faking or emergent misalignment in controlled settings. The goal is to push the frontier of what attackers can do, so defenders can build stronger systems.

Who should participate?

  • AI safety researchers working on alignment, interpretability, control, or governance

  • Machine learning engineers and researchers

  • Security researchers and red teamers

  • Software engineers with experience in agent systems or infrastructure

  • Policy researchers interested in technical safety measures

  • Students and early-career researchers exploring AI safety

No prior AI safety experience is required. The Resources tab has a curated reading list organized by track. Teams typically form at the start of the event.

What you will do

Over three days, you will:

  1. Form teams and choose a challenge track

  2. Research and scope a specific problem in AI control using the provided resources

  3. Build a project developing tools, protocols, evaluations, or attack demonstrations

  4. Submit a research report (PDF) documenting your approach, results, and implications

  5. Present your work to judges from leading AI safety organizations

What happens next

After the hackathon, all submitted projects will be reviewed by expert judges. Top projects receive prizes and are featured on the Apart Research website. Outstanding work may be invited for further development, publication, or presentation at upcoming events. ControlConf Berkeley 2026 (April 18-19) takes place shortly after this hackathon, making this an ideal time to develop and showcase new control research.

Why join?

Apart Research has organized 55+ research sprints with 6,000+ participants across 200+ global locations. Our hackathons produce real research output: published papers, new research collaborations, and contributions to open-source safety tools used across the field.

0

Sign Ups

0

Entries

Overview

Resources

Guidelines

Entries

Overview

Arrow

The AI Control Hackathon brings together researchers, engineers, security professionals, and AI safety enthusiasts to tackle one of the most pressing open problems: how do we keep AI systems safe when they might be actively working against us?

This is the second edition of the hackathon, co-organized with Redwood Research (the group that founded the field of AI control) and the UK AI Security Institute. Over three days, participants will develop novel control protocols, build evaluation tools, and stress-test existing safety measures using infrastructure like ControlArena and SHADE-Arena.

Top teams get

ControlConf Tickets with full travel support

Fast-track to The Apart Fellowship

$2,000 in cash prizes

🥇 1st Place

$1,000

🥈 2nd Place

$500

🥉 3rd Place

$300

🏅 4th Place

$100

🏅 5th Place

$100

What is AI Control?

AI control is the subfield of AI safety focused on maintaining human oversight over AI systems, even when those systems might be misaligned or deliberately trying to subvert safety measures. Unlike alignment research, which tries to make models inherently safe, control research starts from a harder assumption: the model may already be working against you, and you need protocols that keep it contained anyway.

Why this hackathon?

AI systems are becoming autonomous agents that write code, manage infrastructure, and conduct research with decreasing human oversight. At the same time, frontier models are exhibiting concerning behaviors: alignment faking, scheming against shutdown, strategic underperformance on safety evaluations, and emergent misalignment from standard training. These are not hypothetical risks. They have been observed in current models, documented by multiple research groups, and appear to increase with scale.

The infrastructure to do this work now exists, but defenses are not keeping up with attacks. Adaptive adversaries can reduce protocol safety from 99% to under 60%. Prompt injections bypass most monitoring at over 90% success rates. The field needs more researchers building stronger defenses. That is what this hackathon is for.

Hackathon Tracks

1. ControlArena Challenges

Work directly with ControlArena, the open-source library built by UK AISI and Redwood Research for running AI control experiments. The platform provides 14 evaluation settings, each pairing a legitimate task with a covert malicious objective, along with pre-built monitors, scorers, and analysis tools. Participants can create new evaluation environments, improve existing settings, develop novel monitors, or benchmark new approaches against standardized metrics. Current settings cover code sabotage, infrastructure compromise, training data poisoning, evaluation manipulation, financial system exploitation, and more.

2. Control Protocol Design

Design and implement novel control protocols that restrict AI systems from performing harmful actions while preserving useful capabilities. The best current protocols combine multiple techniques: trusted monitoring catches obvious attacks, resampling handles suspicious actions, and deferring on critical operations provides a safety net. This track covers runtime monitoring, factored cognition, adaptive deployment, chain-of-thought monitoring, scalable oversight mechanisms like debate, and shutdown and containment protocols for autonomous agents.

3. Red Teaming & Vulnerability Research

Identify and exploit weaknesses in existing control protocols and safety measures. Current protocols have known blind spots: adaptive attacks on monitors, feedback exploitation, steganographic collusion between agents, and strategic attack timing all represent real threats. This track covers adversarial inputs, attacks on monitoring systems, sandbagging strategies, jailbreaking techniques targeting reasoning models, and demonstrations of alignment faking or emergent misalignment in controlled settings. The goal is to push the frontier of what attackers can do, so defenders can build stronger systems.

Who should participate?

  • AI safety researchers working on alignment, interpretability, control, or governance

  • Machine learning engineers and researchers

  • Security researchers and red teamers

  • Software engineers with experience in agent systems or infrastructure

  • Policy researchers interested in technical safety measures

  • Students and early-career researchers exploring AI safety

No prior AI safety experience is required. The Resources tab has a curated reading list organized by track. Teams typically form at the start of the event.

What you will do

Over three days, you will:

  1. Form teams and choose a challenge track

  2. Research and scope a specific problem in AI control using the provided resources

  3. Build a project developing tools, protocols, evaluations, or attack demonstrations

  4. Submit a research report (PDF) documenting your approach, results, and implications

  5. Present your work to judges from leading AI safety organizations

What happens next

After the hackathon, all submitted projects will be reviewed by expert judges. Top projects receive prizes and are featured on the Apart Research website. Outstanding work may be invited for further development, publication, or presentation at upcoming events. ControlConf Berkeley 2026 (April 18-19) takes place shortly after this hackathon, making this an ideal time to develop and showcase new control research.

Why join?

Apart Research has organized 55+ research sprints with 6,000+ participants across 200+ global locations. Our hackathons produce real research output: published papers, new research collaborations, and contributions to open-source safety tools used across the field.

Speakers & Collaborators

Buck Shlegeris

Speaker and Judge

Former MIRI researcher turned CEO of Redwood Research, pioneering technical AI safety and control mechanisms while bridging fundamental research with practical applications.

Tyler Tracy

Organizer

Experienced software engineer turned AI control researcher focused on developing technical safeguards for advanced AI systems with expertise in monitoring frameworks

Jaime Raldua

Organizer

Jaime is CEO of Apart Research. He joined through one of Apart's early AI safety hackathons and has since served as Research Engineer, CTO, and now leads the organization.

Jason Hoelscher-Obermaier

Organizer & Judge

Jason is Director of Research at Apart Research and leads Apart Lab, the research fellowship supporting top hackathon participants. He holds a Ph.D. in physics from the University of Oxford and focuses on AI safety evaluations, interpretability, and alignment.

Kamil Alaa

Organizer

Kamil coordinates operations at Apart Research, managing hackathons and research sprints end-to-end. He brings 8+ years of experience in pharmaceutical R&D and project management.

Speakers & Collaborators

Buck Shlegeris

Speaker and Judge

Former MIRI researcher turned CEO of Redwood Research, pioneering technical AI safety and control mechanisms while bridging fundamental research with practical applications.

Tyler Tracy

Organizer

Experienced software engineer turned AI control researcher focused on developing technical safeguards for advanced AI systems with expertise in monitoring frameworks

Jaime Raldua

Organizer

Jaime is CEO of Apart Research. He joined through one of Apart's early AI safety hackathons and has since served as Research Engineer, CTO, and now leads the organization.

Jason Hoelscher-Obermaier

Organizer & Judge

Jason is Director of Research at Apart Research and leads Apart Lab, the research fellowship supporting top hackathon participants. He holds a Ph.D. in physics from the University of Oxford and focuses on AI safety evaluations, interpretability, and alignment.

Kamil Alaa

Organizer

Kamil coordinates operations at Apart Research, managing hackathons and research sprints end-to-end. He brings 8+ years of experience in pharmaceutical R&D and project management.

Registered Local Sites

Register A Location

Beside the remote and virtual participation, our amazing organizers also host local hackathon locations where you can meet up in-person and connect with others in your area.

The in-person events for the Apart Sprints are run by passionate individuals just like you! We organize the schedule, speakers, and starter templates, and you can focus on engaging your local research, student, and engineering community.

We haven't announced jam sites yet

Check back later

Registered Local Sites

Register A Location

Beside the remote and virtual participation, our amazing organizers also host local hackathon locations where you can meet up in-person and connect with others in your area.

The in-person events for the Apart Sprints are run by passionate individuals just like you! We organize the schedule, speakers, and starter templates, and you can focus on engaging your local research, student, and engineering community.

We haven't announced jam sites yet

Check back later