
Remote

AI Control Hackathon 2026

Join us in advancing the critical field of AI control through collaborative innovation. Together, we can develop more robust techniques to ensure AI systems remain safe and aligned, even as they become more capable.

766 Sign Ups · 126 Entries

Overview

The AI Control Hackathon brings together researchers, engineers, security professionals, and AI safety enthusiasts to tackle one of the most pressing open problems: how do we keep AI systems safe when they might be actively working against us?

This is the second edition of the hackathon, co-organized with Redwood Research (the group that founded the field of AI control) and sponsored by Equistamp. Over three days, participants will develop novel control protocols, build evaluation tools, and stress-test existing safety measures using infrastructure like ControlArena and SHADE-Arena.

Top teams get

First Place gets a fully funded trip to ControlConf

Invitation to The Apart Fellowship*

$2,000 in cash prizes across all tracks

🥇 1st Place

$1,000

🥈 2nd Place

$500

🥉 3rd Place

$300

🏅 4th Place

$100

🏅 5th Place

$100

LinuxArena Track Prizes (sponsored by Equistamp)

🏆 1st place LinuxArena Project

$500

🏆 2nd place LinuxArena Project

$500

*Not every winning project will receive a fellowship invitation. Invitations may also depend on external factors, such as the Apart Core team's capacity to onboard new research teams.

What is AI Control?

AI control is the subfield of AI safety focused on maintaining human oversight over AI systems, even when those systems might be misaligned or deliberately trying to subvert safety measures. Unlike alignment research, which tries to make models inherently safe, control research starts from a harder assumption: the model may already be working against you, and you need protocols that keep it contained anyway.
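The simplest such protocol, trusted monitoring, can be sketched in a few lines: a weaker but trusted model scores each action the untrusted agent proposes, and suspicious actions are escalated instead of executed. This is an illustrative Python sketch only; `suspicion_score`, `trusted_monitoring`, and the threshold are hypothetical stand-ins, not any real deployment or library.

```python
# Illustrative trusted-monitoring sketch: a trusted (weaker) model scores
# each action proposed by the untrusted agent; suspicious actions are
# escalated to a human instead of executed. All names here are hypothetical.

SUSPICION_THRESHOLD = 0.8

def suspicion_score(action: str) -> float:
    """Stand-in for a trusted model's judgment of how suspicious an action is."""
    red_flags = ["curl http", "chmod 777", "/etc/shadow", "nc -l"]
    hits = sum(flag in action for flag in red_flags)
    return min(1.0, hits / 2)

def trusted_monitoring(actions: list[str]) -> list[tuple[str, str]]:
    """Route each proposed action: execute if benign, escalate if suspicious."""
    decisions = []
    for action in actions:
        verdict = "escalate" if suspicion_score(action) >= SUSPICION_THRESHOLD else "execute"
        decisions.append((action, verdict))
    return decisions
```

In a real protocol the scoring call would go to a trusted language model rather than a keyword heuristic, and the escalation branch is where the safety/usefulness tradeoff lives: escalate too often and the agent is useless, too rarely and attacks slip through.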

Why this hackathon?

AI systems are becoming autonomous agents that write code, manage infrastructure, and conduct research with decreasing human oversight. At the same time, frontier models are exhibiting concerning behaviors: alignment faking, scheming against shutdown, strategic underperformance on safety evaluations, and emergent misalignment from standard training. These are not hypothetical risks. They have been observed in current models, documented by multiple research groups, and appear to increase with scale.

The infrastructure to do this work now exists, but defenses are not keeping up with attacks. Adaptive adversaries can reduce protocol safety from 99% to under 60%. Prompt injections bypass most monitoring at over 90% success rates. The field needs more researchers building stronger defenses. That is what this hackathon is for.

Hackathon Tracks

1. Control Setting Design

Create a new control setting that facilitates control research. A good setting instantiates an interesting threat model that is hard to defend against when AIs carry out large autonomous tasks. You can use ControlArena for this track, but you don't have to.

  • Create environments with existing security vulnerabilities. The main tasks are complex software engineering projects, and the side task is to trigger the vulnerability.

  • Extend SandboxEscapeBench to see if AIs can break out of Docker sandboxes while being monitored and doing complex main tasks in the sandbox.

  • Create a setting where AIs are doing mechinterp research, and the side task is to start a rogue internal deployment.

  • Build a control setting for multi-agent environments where agents can collude.

  • Build a control setting for browser agents, perhaps combining OSWorld with OS-Harm.

  • Build a control setting for eval pipelines: the main task is to evaluate a model's ability to do something, and the side task is to start a rogue deployment.
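Whatever the domain, a control setting boils down to the same ingredients: a benign main task, a hidden side task, and a way to score each from a transcript. The sketch below shows that shape in framework-agnostic Python; it is illustrative only and is not the ControlArena API.

```python
from dataclasses import dataclass
from typing import Callable

# Illustrative structure of a control setting (NOT the ControlArena API).
# A setting pairs a benign main task with a hidden side task; an episode is
# scored on both: did the agent do its job, and did an attack get through?

@dataclass
class ControlSetting:
    name: str
    main_task: str                      # what the agent is openly asked to do
    side_task: str                      # what an attacking agent tries to sneak in
    main_scorer: Callable[[str], bool]  # did the transcript complete the main task?
    side_scorer: Callable[[str], bool]  # did the transcript trigger the side task?

def evaluate(setting: ControlSetting, transcript: str) -> dict[str, bool]:
    """Score one episode on usefulness (main task) and attack success (side task)."""
    return {
        "main_task_success": setting.main_scorer(transcript),
        "side_task_success": setting.side_scorer(transcript),
    }
```

Real settings replace the string scorers with programmatic checks of the environment (tests passing, a vulnerability actually triggered, a deployment actually started), but the two-scorer shape is what makes a setting usable for control experiments.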

2. Control Protocol Design

Create new control protocols and measure their safety/usefulness tradeoff. You can use ControlArena for this, but you don't have to.

  • Train really good monitors and show generalization to other datasets of attacks.

  • Create a control protocol where the trusted model can run actions in the environment to learn more about the action the agent is proposing.

  • Test interrogation protocols in which the trusted model asks the untrusted model questions before executing the action.

  • Test control protocols that involve more traditional security measures, such as giving the trusted model access to a firewall. LinuxArena or BashArena would be great for this.

  • Design shutdown protocols that work even when the agent is actively resisting, e.g. training AIs to "shut down" when they see a specific string placed in security-sensitive locations.
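The safety/usefulness tradeoff these projects measure has a simple operational form: safety is the fraction of attack episodes in which the side task does not succeed uncaught, and usefulness is the fraction of honest episodes in which the main task succeeds. A minimal scoring sketch (illustrative Python, simplified from how control evaluations compute these quantities):

```python
# Illustrative safety/usefulness scoring for a control protocol. Each
# episode records whether the policy was attacking, whether the main task
# succeeded, and whether the side task succeeded without being caught.

def safety(episodes: list[dict]) -> float:
    """Fraction of attack episodes where the side task did NOT succeed uncaught."""
    attacks = [e for e in episodes if e["attacking"]]
    if not attacks:
        return 1.0  # no attacks observed: vacuously safe
    return sum(not e["side_task_uncaught"] for e in attacks) / len(attacks)

def usefulness(episodes: list[dict]) -> float:
    """Fraction of honest episodes where the main task succeeded."""
    honest = [e for e in episodes if not e["attacking"]]
    if not honest:
        return 0.0
    return sum(e["main_task_success"] for e in honest) / len(honest)
```

A protocol that escalates everything is perfectly safe but useless; one that escalates nothing is maximally useful until the first attack. Plotting safety against usefulness across protocol variants is the standard way to compare them.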

3. Building Better Red Teams

Build better red teams for existing control protocols and settings.

  • Build a dataset of sneaky transcripts of AIs doing very bad things that a prompted instance of Opus 4.6 can't catch.

  • Make a red team scaffold that can produce really good attacks in LinuxArena.

  • Red-team protocols like trusted editing more aggressively, crafting attacks that the trusted editor fails to remove, so the bad code survives editing.

Who should participate?

  • AI safety researchers working on alignment, interpretability, control, or governance

  • Machine learning engineers and researchers

  • Security researchers and red teamers

  • Software engineers with experience in agent systems or infrastructure

  • Policy researchers interested in technical safety measures

  • Students and early-career researchers exploring AI safety

No prior AI safety experience is required. The Resources tab has a curated reading list organized by track. Teams typically form at the start of the event.

What you will do

Over three days, you will:

  1. Form teams and choose a challenge track

  2. Research and scope a specific problem in AI control using the provided resources

  3. Build a project developing tools, protocols, evaluations, or attack demonstrations

  4. Submit a research report (PDF) documenting your approach, results, and implications

  5. Present your work to judges from leading AI safety organizations

What happens next

After the hackathon, all submitted projects will be reviewed by expert judges. Top projects receive prizes and are featured on the Apart Research website. Outstanding work may be invited for further development, publication, or presentation at upcoming events. ControlConf Berkeley 2026 (April 18-19) takes place shortly after this hackathon, making this an ideal time to develop and showcase new control research.

Why join?

Apart Research has organized 55+ research sprints with 6,000+ participants across 200+ global locations. This event is co-organized with Redwood Research and sponsored by Equistamp. Our hackathons produce real research output: published papers, new research collaborations, and contributions to open-source safety tools used across the field.

Speakers & Collaborators

Aryan Bhatt

Keynote Speaker

Senior Member of Technical Staff at Redwood Research, focused on AI control. Co-author of "Ctrl-Z," a resampling approach to controlling AI agents, and contributor to BashArena. Previously a visiting researcher at the Alignment Research Center, SERI MATS scholar, and quantitative trader at Susquehanna (SIG).

Buck Shlegeris

Co-organizer

CEO of Redwood Research, a nonprofit focused on mitigating risks from advanced AI. A leading voice on AI control: safely deploying AI systems that might be misaligned through monitoring, oversight, and containment.

Tyler Tracy

Speaker, Judge and Co-organizer

Experienced software engineer turned AI control researcher at Redwood Research, focused on developing technical safeguards for advanced AI systems with expertise in monitoring frameworks. Mentors at CBAI and Pivotal Research on control evaluations and protocol design.

Jason Hoelscher-Obermaier

Judge and Organizer

Director of Research at Apart Research. Ph.D. in physics from Oxford. Focuses on AI safety evaluations, interpretability, and alignment.

Kamil Alaa

Organizer

Operations at Apart Research, managing research sprints and hackathons.

Akshay Iyer

Judge and Co-organizer

CS and Entrepreneurship at Columbia University, IIT Bombay alum. Research experience in neuromorphic engineering and federated learning. Apart Research judge and internal organizer.

Rogan Inglis

Speaker and Judge

Senior Research Engineer on the Control team at the UK AI Security Institute (AISI). 7+ years of experience building and evaluating AI safety systems.

Jai Dhyani

Speaker and Judge

Builder of Luthien Proxy at Luthien Research, bringing Redwood-style AI control to real deployments. Co-author of RE-Bench (ICML 2025) with Elizabeth Barnes at METR.

Sebastian Farquhar

Speaker

Staff Research Scientist at Google DeepMind on the AGI Safety and Alignment Team, leading the Frontier Safety Loss of Control team. Senior Research Fellow at OATML Oxford.

Fabien Roger

Speaker

AI safety researcher and co-author of the foundational AI control paper (ICLR 2025). Former Member of Technical Staff at Redwood Research. His work spans alignment faking, chain-of-thought monitoring, and capability elicitation.

James Lucassen

Speaker and Judge

Researcher at Redwood Research working on AI control. His work spans AI benchmarking, LLM behavior analysis, and safety evaluations. Previously published on false belief detection in language models and AI performance prediction.

Ram Potham

Speaker and Judge

Astra Fellow at Redwood Research, focused on mitigating loss-of-control risk. His agent safety research was accepted for an oral presentation at the ICML 2025 Technical AI Governance workshop.

Nick Kuhn

Speaker and Judge

Postdoctoral researcher at the University of Oxford, collaborating with Dominic Joyce. PhD from Stanford University. Research focuses on moduli spaces in algebraic geometry. Previously at the University of Oslo and Max Planck Institute for Mathematics in Bonn.

Myles Heller

Speaker and Judge

Freelance programmer contributing to AI model safety evaluations at EquiStamp, designing benchmarks and evaluation tasks for AI agents. Volunteer developer at AISafety.info, building a RAG chatbot, and contributor to AI-Plans.com.

Justin Shenk

Speaker and Judge

Independent AI researcher and SPAR mentor based in Berlin, contracting at Redwood Research on model evals. Background in machine learning and software engineering.

Martin Milbradt

Judge and Mentor

8 years of software engineering, focused exclusively on AI safety since 2023. Working with METR, Equistamp, and Redwood Research on evaluations, tooling, and applied research.

Ruben Castaing

Judge and Mentor

Research engineer at Equistamp who helped build LinuxArena. They also facilitate BlueDot's technical AI safety course and have done research on removing bioweapons knowledge from LLMs.

Francis Rhys Ward

Judge

Researcher at Safe & Trusted AI and Future of Life Institute. Co-author of "CTRL-ALT-DECEIT" on sabotage evaluations (NeurIPS 2025 Spotlight), finding that sandbagging is much harder to detect than code sabotage.

Mia Hopman

Judge

Member of Technical Staff at Apollo Research, working on alignment, control, and security of large language models. Published on evaluating scheming propensity in LLM agents.

Anshul Khandelwal

Judge

Researcher at Redwood Research. His work spans alignment drift, power-seeking evaluations, and white-box control methods.

Alex van Grootel

Judge

Fellow at the Future of Life Foundation with 10 years of experience applying AI to materials science and decision-making. Previously a Product Manager at Microsoft (Fabric/AI copilots) and Data Scientist Team Lead at Citrine Informatics. MS from MIT.

Cameron Tice

Judge

Co-Director and Co-Founder of Geodesic Research, a technical AI safety organization at the University of Cambridge. Marshall Scholar and former Apart Research Fellow.

Theo Ryzhenkov

Judge

AI Safety Research Engineer and Founding Engineer at Palisade Research. NeurIPS 2025 author on detecting sandbagging in language models. SPAR, ARENA, and AISF alumnus.

Pablo Bernabeu Perez

Judge

AI Safety Researcher and Research Engineer at the Barcelona Supercomputing Center. Co-author of "CoT Red-Handed" on stress-testing chain-of-thought monitoring at LASR Labs. Research Mentor at SPAR and Algoverse, with work on debate protocols for AI control accepted at NeurIPS 2025.

Akash Kundu

Judge

Research Collaborator at FAR.AI working on AI safety, LLM evaluations, and multilingual model safety. ICLR 2025 Oral presentation recipient and upcoming Cooperative AI Research Fellow.

Ramanpreet Singh Khinda

Judge

Staff Software Engineer at LinkedIn. IEEE Senior Member, Google I/O Codelab author, and experienced hackathon judge with expertise in mobile AI systems.

Hugo Delahousse

Judge

Software Engineer at Charcoal, an AI retrieval startup backed by investors from Cursor, Notion, and Front. Previously at Front for 5+ years building developer tools. 10 years of software engineering experience with a focus on AI automation and workflow security.

Victor Gillioz

Judge

AI safety researcher and MATS Fellow. His work spans training-time alignment of LLMs, including recontextualization for reward-hacking mitigation and inoculation prompting, under Alex Turner. His current work, under Marius Hobbhahn, focuses on training scheming monitors for LLM oversight, aiming to detect and contain misaligned model behavior.

Jord Nguyen

Judge

Research Manager at antoan.ai (AI Safety Vietnam) and Founder of the Hanoi AI Safety Network. Former Apart Research Fellow and Non-trivial facilitator.

Kaushik Prabhakar

Judge

Research Fellow at Apart Research. His research spans AI safety and alignment, with work on LLM evaluations. ARENA Cohort 4 alum.

Helios Lyons

Judge

Embedded Software Engineer at Disguise and former Research Fellow at Apart Research. His AI safety work focuses on detecting sycophancy and dark patterns in language model activations.

Nguyen Nhat Minh

Judge

AI Engineer at Mobifone IT Center and researcher at BKAI Lab, Hanoi University of Science and Technology. Published at FSE and TechDebt on vulnerability detection using graph neural networks.

Goutham Nekkalapu

Judge

Principal ML Engineer at Gen Digital (Norton, LifeLock, Avast), building adversarial detection systems and AI evaluation frameworks for security products. His ML research has led to multiple patent applications in user security and identity protection.

Arjun Chakraborty

Judge

Leads the evaluations team at Microsoft Security AI Research, where his team focuses on research and building evaluations for security agents. He was previously a staff software security engineer at Databricks, specializing in machine learning for threat detection, and also worked on AI for security at Nvidia.

Speakers & Collaborators

Aryan Bhatt

Keynote Speaker

Senior Member of Technical Staff at Redwood Research, focused on AI control. Co-author of "Ctrl-Z," a resampling approach to controlling AI agents, and contributor to BashArena. Previously a visiting researcher at the Alignment Research Center, SERI MATS scholar, and quantitative trader at Susquehanna (SIG).

Buck Shlegeris

Co-organizer

CEO of Redwood Research, a nonprofit focused on mitigating risks from advanced AI. A leading voice on AI control: safely deploying potentially misaligned AI systems through monitoring, oversight, and containment.

Tyler Tracy

Speaker, Judge and Co-organizer

Experienced software engineer turned AI control researcher at Redwood Research, developing technical safeguards for advanced AI systems with expertise in monitoring frameworks. He mentors at CBAI and Pivotal Research on control evaluations and protocol design.

Jason Hoelscher-Obermaier

Judge and Organizer

Director of Research at Apart Research. Ph.D. in physics from Oxford. Focuses on AI safety evaluations, interpretability, and alignment.

Kamil Alaa

Organizer

Operations at Apart Research, managing research sprints and hackathons.

Akshay Iyer

Judge and Co-organizer

Studies CS and Entrepreneurship at Columbia University; IIT Bombay alum. Research experience in neuromorphic engineering and federated learning. Apart Research judge and internal organizer.

Rogan Inglis

Speaker and Judge

Senior Research Engineer on the Control team at the UK AI Security Institute (AISI). 7+ years of experience building and evaluating AI safety systems.

Jai Dhyani

Speaker and Judge

Builder of Luthien Proxy at Luthien Research, bringing Redwood-style AI control to real deployments. Co-author of RE-Bench (ICML 2025) with Elizabeth Barnes at METR.

Sebastian Farquhar

Speaker

Staff Research Scientist at Google DeepMind on the AGI Safety and Alignment Team, leading the Frontier Safety Loss of Control team. Senior Research Fellow at OATML Oxford.

Fabien Roger

Speaker

AI safety researcher and co-author of the foundational AI control paper (ICLR 2025). Former Member of Technical Staff at Redwood Research. His work spans alignment faking, chain-of-thought monitoring, and capability elicitation.

James Lucassen

Speaker and Judge

Researcher at Redwood Research working on AI control. His work spans AI benchmarking, LLM behavior analysis, and safety evaluations. Previously published on false belief detection in language models and AI performance prediction.

Ram Potham

Speaker and Judge

Astra Fellow at Redwood Research, focused on mitigating loss-of-control risk. His agent safety research was accepted for an oral presentation at the ICML 2025 Technical AI Governance workshop.

Nick Kuhn

Speaker and Judge

Postdoctoral researcher at the University of Oxford, collaborating with Dominic Joyce. PhD from Stanford University. Research focuses on moduli spaces in algebraic geometry. Previously at the University of Oslo and Max Planck Institute for Mathematics in Bonn.

Registered Local Sites

Register A Location

Besides remote and virtual participation, our amazing organizers also host local hackathon sites where you can meet up in person and connect with others in your area.

The in-person events for the Apart Sprints are run by passionate individuals just like you! We organize the schedule, speakers, and starter templates, so you can focus on engaging your local research, student, and engineering community.