Nov 11 - Nov 13, 2022

Online & In-Person

The Interpretability Hackathon

48 hours of intense research in interpretability, the modern neuroscience of AI.


This event has concluded.

Overview

Hosted by Esben Kran, Apart Research, Neel Nanda

Join this AI safety hackathon to find new perspectives on the "brains" of AI!

We will work with multiple research directions, and one of the most relevant is mechanistic interpretability: working towards a rigorous, research-driven understanding of neural networks and why they do what they do.
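To make that concrete, here is a minimal sketch of the kind of experiment you could run over the weekend, using the TransformerLens library (formerly EasyTransformer) by Neel Nanda. The library, model, and prompt are illustrative choices on our part, not part of the official starter templates:

```python
# A minimal mechanistic interpretability sketch: load a small model,
# cache all of its internal activations on a prompt, and inspect them.
from transformer_lens import HookedTransformer

# Load GPT-2 small with hooks attached to every internal activation
model = HookedTransformer.from_pretrained("gpt2")

prompt = "The Eiffel Tower is in the city of"
logits, cache = model.run_with_cache(prompt)

# The model's most likely next token for this prompt
next_token = logits[0, -1].argmax().item()
print(model.tokenizer.decode(next_token))

# Attention patterns of layer 0: [batch, n_heads, query_pos, key_pos]
attn_pattern = cache["pattern", 0]
print(attn_pattern.shape)
```

From a cache like this, you can start asking which heads attend to which tokens and which activations drive a given prediction, which is the bread and butter of mechanistic interpretability work.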

We provide you with the best starter templates that you can work from so you can focus on creating interesting research instead of browsing Stack Overflow. You're also very welcome to check out some of the ideas already posted!  

How to participate

Create a user on this itch.io page and click Participate. We will assume that you are going to participate, so please cancel your registration if you won't be part of the hackathon.

Instructions

You will work on research ideas you generate during the hackathon; you can find inspiration below.

Submission

Everyone helps rate the submissions on a shared set of criteria that we ask all judges to follow. During the 4-hour judging period, you will be assigned 5 projects to rate before you can judge projects of your own choosing; this way we avoid selection bias in the judging.

Resources

Inspiration

We have many ideas available for inspiration on the aisi.ai Interpretability Hackathon ideas list. A lot of interpretability research is available on distill.pub, Transformer Circuits, and Anthropic's research page.

Introductions to mechanistic interpretability

See also the tools available for interpretability:

Digestible research

Below is a talk by Esben on a principled introduction to interpretability for safety:


Speakers & Collaborators

Esben Kran

Organizer

Esben is the co-director of Apart Research and specializes in organizing research teams on pivotal AI security questions.

Neel Nanda

Speaker & Judge

Team lead for the mechanistic interpretability team at Google DeepMind and a prolific advocate for open-source interpretability research.


Registered Jam Sites

Register A Location

Besides remote participation, our amazing organizers also host local hackathon sites where you can meet up in person and connect with others in your area.

The in-person events for the Apart Sprints are run by passionate individuals just like you! We organize the schedule, speakers, and starter templates, and you can focus on engaging your local research, student, and engineering community.
