We are an AI safety lab

Our mission is to ensure AI systems are safe and beneficial

High-impact research

Our Lab publishes empirical work focused on state-of-the-art solutions for AI safety.

Research

Doing things differently

We are a decentralized organization designed to solve the biggest problems in AI risk.

Lab

Enabling builders

Apart provides opportunities for thousands of builders to pilot AI safety research.

Sprints

With partners and co-authors from world-class institutions

Our work


Researcher Spotlight: Alexandra Abbas

Our Researcher Spotlight series highlights the global community at the heart of Apart Research.

By Connor Axiotes on November 5, 2024

Benchmark Inflation: Revealing LLM Performance Gaps Using Retro-Holdouts

Our results demonstrate that public benchmark scores do not always accurately assess model properties, and underscore the importance of improved data practices in the field.

Jacob Haimes*¹, Cenny Wenner*¹, Kunvar Thaman¹, Vassil Tashev¹, Clement Neo¹, Esben Kran¹, Jason Schreiber¹

¹Apart Research, *Lead Authors
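As a rough illustration of the retro-holdout idea, here is a minimal sketch in Python. The data, the numbers, and the simple accuracy-gap metric are hypothetical stand-ins for illustration only, not the paper's actual method, datasets, or results:

```python
# Minimal sketch of quantifying a "benchmark inflation" gap.
# All data and numbers below are hypothetical illustrations, not the
# paper's actual method, datasets, or results.

def accuracy(predictions: list[str], gold: list[str]) -> float:
    """Fraction of items answered correctly."""
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)

# Toy stand-ins for one model's answers on a public test set and on a
# retro-holdout: unreleased items built to match the public set, so a
# score gap suggests contamination or overfitting to the released data
# rather than a genuine capability difference.
public_gold  = ["A", "B", "C", "D", "A", "C", "B", "D"]
public_pred  = ["A", "B", "C", "D", "A", "C", "B", "A"]  # 7/8 correct
holdout_gold = ["B", "A", "D", "C", "B", "D", "A", "C"]
holdout_pred = ["B", "A", "C", "C", "A", "D", "B", "C"]  # 5/8 correct

gap = accuracy(public_pred, public_gold) - accuracy(holdout_pred, holdout_gold)
print(f"Estimated inflation gap: {gap:.1%}")  # prints 25.0%
```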

Published research

We produce foundational research enabling the safe and beneficial development of advanced AI.

Google Scholar

Benchmark Inflation: Revealing LLM Performance Gaps Using Retro-Holdouts

Jacob Haimes*¹, Cenny Wenner*¹, Kunvar Thaman¹, Vassil Tashev¹, Clement Neo¹, Esben Kran¹, Jason Schreiber¹

¹Apart Research, *Lead Authors

Interpreting Learned Feedback Patterns in Large Language Models

Luke Marks∗†, Amir Abdullah∗†♢, Clement Neo†, Rauno Arike†, David Krueger□, Philip Torr‡, Fazl Barez∗‡

†Apart Research, ♢Cynch.ai, □University of Cambridge, ‡Department of Engineering Sciences, University of Oxford

Interpreting Context Look-ups in Transformers: Investigating Attention-MLP Interactions

Clement Neo*, Shay B. Cohen, Fazl Barez*

Increasing Trust in Language Models through the Reuse of Verified Circuits

Philip Quirke, Clement Neo, Fazl Barez

See all publications

Neuron to Graph: Interpreting Language Model Neurons at Scale

Alex Foote*, Neel Nanda, Esben Kran, Ioannis Konstas, Shay Cohen, Fazl Barez*

RTML workshop at ICLR 2023
[Read Publication]

DeepDecipher

Albert Garde*, Esben Kran*, Fazl Barez

NeurIPS 2023 XAI in Action Workshop
[Read Publication]

Understanding Addition in Transformers

Philip Quirke, Fazl Barez

arXiv
[Read Publication]

Locating Cross-Task Sequence Continuation Circuits in Transformers

Michael Lan, Fazl Barez

arXiv
[Read Publication]

Neuroplasticity in LLMs

[Read Publication]

Interpreting Context Look-ups in Transformers: Investigating Attention-MLP Interactions

Clement Neo*, Shay B. Cohen, Fazl Barez*

arXiv
[Read Publication]

Interpreting Learned Feedback Patterns in Large Language Models

Luke Marks∗†, Amir Abdullah∗†♢, Clement Neo†, Rauno Arike†, David Krueger□, Philip Torr‡, Fazl Barez∗‡

†Apart Research, ♢Cynch.ai, □University of Cambridge, ‡Department of Engineering Sciences, University of Oxford

arXiv
[Read Publication]

Detecting Edit Failures In Large Language Models: An Improved Specificity Benchmark

ACL 2023
[Read Publication]

Benchmark Inflation: Revealing LLM Performance Gaps Using Retro-Holdouts

Jacob Haimes*¹, Cenny Wenner*¹, Kunvar Thaman¹, Vassil Tashev¹, Clement Neo¹, Esben Kran¹, Jason Schreiber¹

¹Apart Research, *Lead Authors

arXiv
[Read Publication]

Sleeper Agents

Evan Hubinger et al.

Anthropic
[Read Publication]

Events

Join experts and fellow researchers as we build AI safety together.

Reprogramming AI Models Hackathon

This event concluded on Nov 25, 2024.

This hackathon gathered ambitious AI researchers and builders to work on exciting interpretability problems using Goodfire's "interpretable" model APIs, with participants hacking on their own research ideas or exploring one of the suggested directions.

Nov 22 to Nov 25, 2024
Howard University AI Safety Summit & Policy Hackathon

Independently organized SprintX
Washington, D.C.
Nov 19 (Canceled)

Reprogramming AI Models Hackathon

Independently organized SprintX
📈 Accelerating AI Safety
Virtual and In-Person
Nov 22 (Canceled)

Autonomous Agent Evaluations Hackathon [dates TBD]

Independently organized SprintX
Virtual & in-person
Jan 24 (Canceled)

Apart Research

Get involved

Check out the list below for ways you can interact or do research with Apart!

Let's have a meeting!

You can book a meeting here and we can talk about anything between the clouds and the dirt. We're looking forward to meeting you.

I would love to mentor research ideas

Research ideas on our website are validated by experts. If you would like to be one of these experts, write to us here. It can be a huge help for the community!

Stay updated on A*PART's work

Blog & Mailing list

The blog contains A*PART's public outreach. Sign up for the mailing list below to get future updates.

People

Members

Central committee board

Associate Kranc
Head of Research Department
Commanding Center Management Executive

Partner Associate Juhasz
Head of Global Research
Commanding Cross-Cultural Research Executive

Associate Soha
Commanding Research Executive
Manager of Experimental Design

Partner Associate Lækra
Head of Climate Research Associations
Research Equality- and Diversity Manager

Partner Associate Hvithammar
Honorary Fellow of Data Science and AI
P0rM Deep Fake Expert

Partner Associate Waade
Head of Free Energy Principle Modelling
London Subsidiary Manager

Partner Associate Dankvid
Partner Snus Executive
Bodily Contamination Manager

Partner Associate Nips
Head of Graphics Department
Cake Coding Expert

Honorary members

Associate Professor Formula T.
Honorary Associate Fellow of Research Ethics and Linguistics
Optimal Science Prediction Analyst

Alumni

Partner Associate A.L.T.
Commander of the Internally Restricted CINeMa Research
Keeper of Secrets and Manager of the Internal REC

Newsletter

Stay updated with Apart

Follow the weekly updates from the Apart community and stay on top of the latest news, research, and events.