May 30, 2025
-
Jun 1, 2025
Online & In-Person
Apart x Martian Mechanistic Router Interpretability Hackathon




Join the effort to make AI orchestration interpretable from the ground up—where judge models reveal their reasoning process and routing decisions become windows into AI decision-making!
This event has concluded.

✨ Overview
Join us in pioneering a paradigm shift in AI development! This hackathon will focus on creating the building blocks for a revolutionary Expert Orchestration AI architecture that democratizes, aligns, and enhances large language models through intelligent routing to specialized experts.
The current approach of developing increasingly powerful monolithic AI models faces fundamental limitations in performance, transparency, and democratization. This hackathon aims to develop key components of an alternative architecture where specialized "judge" models evaluate the capabilities of thousands of specialized models, and "router" systems intelligently direct queries to the most appropriate models based on these evaluations.
We’re excited to collaborate with Martian, pioneers in model routing and interpretability research, for this specialized hackathon. Their novel expert orchestration approach distills complex AI models into compact, predictive components that retain only the information needed to estimate performance on specific tasks. By combining this with cutting-edge Mechanistic Interpretability techniques, Martian is developing a general theory of model capabilities—enabling us to evaluate and deploy models more safely, transparently, and effectively. Their work also supports an open ecosystem of specialized models, helping democratize AI development while aligning business impact with AI safety.
Each participating team will receive $50 in model API credits to power their projects, with additional credits available for promising implementations. You'll have access to Martian's judge and router APIs, along with sample code libraries to kickstart your projects.
Sign up here to stay updated on this event!
Why This Hackathon Matters:
The current trajectory of AI development, focused on increasingly powerful monolithic models, faces fundamental limitations:
Winner-takes-all dynamics: The high cost of training frontier models leads to market concentration, with economic power accruing to a few corporations.
Misaligned safety incentives: Companies racing to release increasingly powerful models may underreport risks and rush products to market.
High barriers to entry: Specialized models struggle to gain market traction against generalist models, even when they excel in specific domains.
Limited user control: Users have minimal visibility into how models "think" and limited ability to control characteristics like factuality, bias, or ethical reasoning.
Inefficient resource use: Using powerful frontier models for all tasks wastes resources and often underperforms specialized alternatives.
The Expert Orchestration Architecture addresses these issues by creating a more transparent, efficient, and democratic AI ecosystem where specialized models can thrive based on their unique strengths, and users gain unprecedented control over AI capabilities.
Expected Outcomes
Participants will create components that advance the Expert Orchestration vision:
Prototype judge models for evaluating specific AI capabilities
Intelligent routing algorithms for directing queries to appropriate models
Frameworks for decomposing complex tasks across multiple specialized models
Integration APIs that allow seamless discovery and utilization of specialized models
Evaluation metrics and benchmarks for comparing different routing and judge strategies
The most promising projects will have opportunities for continued development and potential integration into production systems.
Challenge Tracks
Track 1: Judge Model Development
Build specialized evaluation models that can assess different AI models' capabilities across dimensions that matter to users (manipulation skills and tendencies, deception and hidden communication, misaligned goals, factuality, domain expertise, ethics, creativity, objectivity, etc.). Judges should provide independent, objective evaluations that create transparency around model strengths and weaknesses.
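For teams starting from scratch, a judge can begin as a rubric-driven prompt around an existing model that returns a structured score. The sketch below is a minimal illustration only: `call_model` is a placeholder for whichever provider API you use, and the rubric, JSON format, and 0-10 scale are assumptions, not Martian's API.

```python
import json

FACTUALITY_RUBRIC = """You are a strict factuality judge.
Given a question and an answer, rate the answer's factual accuracy
from 0 (entirely wrong) to 10 (fully correct).
Respond only with JSON: {"score": <int>, "rationale": "<one sentence>"}"""

def call_model(prompt: str) -> str:
    """Placeholder for a chat/completions call to the judging model."""
    raise NotImplementedError("wire this to your model API")

def judge_factuality(question: str, answer: str) -> dict:
    """Score a single (question, answer) pair on a 0-10 factuality scale."""
    reply = call_model(f"{FACTUALITY_RUBRIC}\n\nQuestion: {question}\nAnswer: {answer}")
    result = json.loads(reply)
    result["score"] = max(0, min(10, int(result["score"])))  # clamp to the rubric range
    return result
```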
Track 2: Intelligent Router Systems
Develop router systems that intelligently direct user queries to the most appropriate specialized models, using judge evaluations weighted by user preferences. Focus areas include routers that combine multiple judges (e.g. factuality, ethics, and lack of deception), query classification, efficiency optimization, and handling uncertainty.
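As a concrete starting point, here is a minimal preference-weighted router over precomputed judge scores and per-model cost. The model names, score dimensions, and cost figures are invented for illustration; a real router would learn or query these values rather than hard-code them.

```python
from dataclasses import dataclass

@dataclass
class ModelProfile:
    name: str
    judge_scores: dict[str, float]   # e.g. {"factuality": 0.92, "ethics": 0.80}
    cost_per_1k_tokens: float        # illustrative units

def route(profiles: list[ModelProfile],
          preferences: dict[str, float],
          cost_weight: float = 0.1) -> ModelProfile:
    """Pick the model whose judge scores best match the user's preferences, net of cost."""
    def utility(p: ModelProfile) -> float:
        quality = sum(w * p.judge_scores.get(dim, 0.0) for dim, w in preferences.items())
        return quality - cost_weight * p.cost_per_1k_tokens
    return max(profiles, key=utility)

# A user who cares mostly about factuality is routed to the cheaper specialist.
models = [
    ModelProfile("small-specialist", {"factuality": 0.92, "ethics": 0.80}, 0.3),
    ModelProfile("large-generalist", {"factuality": 0.88, "ethics": 0.85}, 2.0),
]
print(route(models, {"factuality": 0.8, "ethics": 0.2}).name)
```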
Track 3: Task Decomposition Frameworks
Create systems that can break down complex user requests into a series of more manageable steps, to be executed by different specialized models. This includes planning phases, execution phases, and coordination mechanisms. Investigate whether decomposition avoids or reduces some of the traps reported for monolithic reasoning models (e.g. reward hacking).
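One way to frame this track is a plan, route, execute, recombine loop, sketched below. The `planner`, `router`, and `executor` callables are placeholders for components your team would build; only the control flow is shown.

```python
def decompose_and_run(task: str, planner, router, executor) -> str:
    """Plan -> route each subtask -> execute -> recombine.

    planner(task) -> list[str], router(text) -> model_name, and
    executor(text, model_name) -> str are placeholders for your own components.
    """
    subtasks = planner(task)
    partial = [executor(sub, router(sub)) for sub in subtasks]
    synthesis = "Combine these partial results into one answer:\n" + "\n".join(partial)
    return executor(synthesis, router(synthesis))
```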
Track 4: Specialized Model Integration
Build frameworks that enable easy integration of new specialized models into the Expert Orchestration Architecture, including methods for model discovery, capability profiling, and dynamic performance evaluation.
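A simple capability-profiling step might look like the sketch below: when a new model is registered, it is run on a small probe set per dimension, scored by a judge, and the resulting profile is stored for routers to query. The `ask` and `judge` callables and the registry shape are assumptions for illustration.

```python
registry: dict[str, dict[str, float]] = {}   # model name -> capability profile

def ask(model_name: str, question: str) -> str:
    """Placeholder: query the model being registered."""
    raise NotImplementedError

def judge(question: str, answer: str) -> float:
    """Placeholder: a judge returning a score in [0, 1]."""
    raise NotImplementedError

def register_model(model_name: str, probes: dict[str, list[str]]) -> None:
    """Profile a new model on a few probe questions per capability dimension."""
    profile = {}
    for dimension, questions in probes.items():
        scores = [judge(q, ask(model_name, q)) for q in questions]
        profile[dimension] = sum(scores) / len(scores)
    registry[model_name] = profile
```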
Open Research Questions
Judges
Model Characteristic Analysis: Create dataset(s) that test an interesting model characteristic pertinent to safety (e.g., ethics, hallucinations, gender bias). Build a judge using this data and evaluate multiple models.
Judge Evaluation Metrics: Develop methods to measure judge accuracy, completeness, and reliability for specific characteristics. Explore how this impacts AI Safety.
Mechanistic Interpretability for Judges: Apply MI techniques to model internals to create better or more interpretable judges, e.g. judges that can evaluate outputs based on how they were generated.
Measuring model similarity: HuggingFace hosts thousands of LLMs. How do we measure whether two models have similar capabilities? How do we choose a subset of these models with capabilities diverse enough that a router trained over them will be performant? How does router performance vary with the size of the subset?
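One crude but cheap baseline for the similarity question: score both models with the same judge on a shared probe set and compare the score vectors, as in the sketch below. The probe scores here are made up, and cosine similarity is just one of many possible measures.

```python
import math

def capability_similarity(scores_a: list[float], scores_b: list[float]) -> float:
    """Cosine similarity between two models' per-probe judge scores;
    values near 1.0 suggest overlapping capabilities on this probe set."""
    dot = sum(a * b for a, b in zip(scores_a, scores_b))
    norm_a = math.sqrt(sum(a * a for a in scores_a))
    norm_b = math.sqrt(sum(b * b for b in scores_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Two models scored by the same judge on five probe prompts (illustrative values).
print(capability_similarity([0.9, 0.4, 0.8, 0.7, 0.2], [0.85, 0.5, 0.75, 0.6, 0.3]))
```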
Routers
Given a set of models with known capabilities measured by known judge scores:
Risk-Sensitive Routing: Build efficient routing algorithms that balance judge scores, compute costs, and system reliability for the best user experience.
Multi-Objective Routing: Create routers that use scores from multiple judges (e.g., answer correctness, ethics and legality) according to user preferences for the best user experience. What are the tradeoffs?
Routing algorithms: For expensive models, the judge provides a “pre-hoc” way to estimate prediction success without querying the model. For cheap models, we can ask the model itself to evaluate its answer and report its confidence (“post-hoc”). Find interesting ways to mix pre- and post-hoc routing to get the best of both worlds (a minimal sketch of one such mix appears after this list).
Multi-level routing: Investigate using a tree of choices rather than one-off routing. What are the pros and cons?
Reducing router training costs: Given a model and a task, how can we cheaply detect that the model is not a good fit for the task, avoiding further training time spent refining an estimate of just how bad a fit it is?
Task Decomposition: Model how a complex user task can be broken into multiple subtasks that are routed to the most capable models before the results are recombined. What are the AI Safety, cyber-security, and/or cost implications of this approach?
Universal router: For a set of tasks, create a single router across a set of LLMs that provides higher-quality answers than any single LLM does.
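Referring back to the pre-/post-hoc routing item above, the sketch below shows one way the two signals could be cascaded: a pre-hoc judge decides whether to even try the cheap model, and a post-hoc self-confidence check decides whether to escalate. All callables and thresholds are placeholders, not a prescribed algorithm.

```python
def cascade_route(query: str, cheap_model, expensive_model, prehoc_judge,
                  prehoc_threshold: float = 0.3, confidence_threshold: float = 0.7) -> str:
    """Mix pre-hoc and post-hoc routing between a cheap and an expensive model.

    prehoc_judge(query) -> float estimates the cheap model's success probability
    without calling it; cheap_model/expensive_model are text -> text callables.
    """
    if prehoc_judge(query) < prehoc_threshold:      # pre-hoc: cheap model likely to fail
        return expensive_model(query)
    answer = cheap_model(query)
    confidence = cheap_model(
        "Rate your confidence (0-1) that this answer is correct. Reply with only a number.\n"
        f"Q: {query}\nA: {answer}"
    )
    try:
        confident = float(confidence) >= confidence_threshold   # post-hoc self-check
    except ValueError:
        confident = False
    return answer if confident else expensive_model(query)
```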
Inferring Judges & Routers
Reverse Engineering: Given a black-box LLM or router, infer its embedded judge (reward signal) for specific characteristics.
Efficiency Analysis: Quantify potential electricity/resource consumption reduction from widespread adoption of optimal routing technologies.
Learning when to fail: Sometimes no model will successfully answer a user query. Can we cheaply detect when we should fail?
Learning with uncertain signals: How does judge noise affect the router training process? How does noisy feedback data affect the judge/router training process? Is off-policy data a problem when it comes to training routers?
Risk sensitivity: Rather than optimizing for expected cost/quality, can we optimize for some other risk profile? E.g. we might tolerate a slightly higher cost and lower quality, if we reduce the variance or minimize a long tail.
Create a distilled predictor: The Language Models (Mostly) Know What They Know paper shows that a model can sometimes predict whether it will be able to answer a question correctly. For a selected open “base” model, create a smaller “distilled predictor” that mirrors the base model’s ability to predict answer correctness (but can no longer compute the answer itself). You might use the techniques from that paper and/or the pruning and distillation techniques from the Movement Pruning paper to shrink the predictor.
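As a baseline before any distillation, one can estimate how predictable "base model correctness" is from the question alone, e.g. by fitting a small classifier on question embeddings labelled with whether the base model answered correctly. The sketch below uses random stand-in data and scikit-learn purely for illustration; a real experiment would substitute your recorded embeddings and correctness labels.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 384))      # stand-in for question embeddings
y = rng.integers(0, 2, size=1000)     # stand-in for "base model answered correctly"

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
predictor = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# AUROC measures how well the predictor ranks answerable questions above
# unanswerable ones - exactly the pre-hoc signal a router needs.
print("AUROC:", roc_auc_score(y_test, predictor.predict_proba(X_test)[:, 1]))
```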
Speakers & Collaborators
Philip Quirke
Organiser
Philip pivoted to AI Safety in 2023, after roles as a Software Engineer & Architect, Business Analyst, Project Manager, Product Manager, General Manager, and more. His AI journey started with an Apart Research Hackathon, which led to research grants, a stint at FAR AI, and finally landed him at Martian!
Yash Upadhyay
Organiser
Yash is the Co-Founder and Co-CEO of Martian, where he leads the company's mission to enhance AI performance and reliability through innovative model routing solutions. With a background in AI research and development, Yash has been instrumental in building tools that optimize the use of large language models, ensuring efficiency and cost-effectiveness for enterprise applications.
Etan Ginsberg
Organiser
Etan is a Co-Founder and Co-CEO of Martian, where he focuses on applying advanced AI infrastructure to help companies use large language models more effectively. His experience includes deep technical leadership and a track record of building high-performance systems. Etan's work at Martian is centered on making LLMs more reliable, affordable, and performant for enterprise use.
Chaitanya Bandi
Organiser
Chaitanya is the VP of Research at Martian, focusing on AI alignment and model interpretability. He has contributed to the development of model mapping techniques that transform opaque neural networks into transparent, verifiable programs, enhancing model efficiency and human-AI interaction. Chaitanya holds a Ph.D. from MIT and has a background in decision-making under uncertainty, with applications in operations management.
Luka Samkharadze
Organiser
Luka is a Founding Engineer at Stable and was previously at Martian, with rich hands-on experience and a diverse portfolio of projects.
Dory Zidon
Organiser
Dory is a key member of the Martian back-end team, contributing to the company's products, infrastructure and performance.
Josh Greaves
ML Tech Lead
Josh is the Machine Learning Tech Lead at Martian, where he focuses on reinforcement learning and large language models. His prior experience includes roles at Google Brain and Reliant AI.
Antía García Casal
Organiser
Antía is the Head of Design at Martian. She was previously a freelance visual designer with over 15 years of experience.
Alex Zverianskii
Organiser
Over the past 15 years, Alex has worked in businesses of diverse sizes, ranging from 200k to 100M MAU. He has engineered hundreds of real-time models, prepared analytics and data for an IPO, and built three startups from the ground up, with one successful exit.
Brad Fowler
Organiser
Brad is a Machine Learning Research Technical Lead at Martian. He holds a Master's degree in Information and Computer Engineering from the University of Cambridge and has over seven years of experience in artificial intelligence and software development.
Narmeen Oozeer
Organiser
Narmeen Oozeer is a Research Engineer focused on AI/ML interpretability at Martian. Her work centers on developing scalable interpretability methods to build better and more interpretable LLM routers. Narmeen has previously worked on activation transfers, allowing alignment interventions to be transferred between models of different scales.
Registered Jam Sites
Register A Location
Besides remote and virtual participation, our amazing organizers also host local hackathon sites where you can meet up in person and connect with others in your area.
The in-person events for the Apart Sprints are run by passionate individuals just like you! We organize the schedule, speakers, and starter templates, and you can focus on engaging your local research, student, and engineering community.
Our Other Sprints
Apr 25, 2025
-
Apr 27, 2025
Research
Economics of Transformative AI
This unique event brings together diverse perspectives to tackle crucial challenges in AI alignment, governance, and safety. Work alongside leading experts, develop innovative solutions, and help shape the future of responsible AI.
Sign Up
Apr 14, 2025
-
Apr 26, 2025
Research
Berkeley AI Policy Hackathon
This unique event brings together diverse perspectives to tackle crucial challenges in AI alignment, governance, and safety. Work alongside leading experts, develop innovative solutions, and help shape the future of responsible AI.
Sign Up

Sign up to stay updated on the latest news, research, and events.
