Jun 30, 2023

-

Jul 2, 2023

Online & In-Person

Safety Benchmarks Hackathon

Participate in the Alignment Jam on safety benchmarks to spend a weekend with AI safety researchers to formulate and demonstrate new ideas in measuring the safety of artificially intelligent systems.

00:00:00:00

00:00:00:00

00:00:00:00

00:00:00:00

Participate in the Alignment Jam on safety benchmarks to spend a weekend with AI safety researchers to formulate and demonstrate new ideas in measuring the safety of artificially intelligent systems.

This event is ongoing.

This event has concluded.

60

Sign Ups

0

Entries

Overview

Resources

Schedule

Entries

Overview

Arrow

Hosted by Apart Research, Esben Kran, and Fazl Barez

Explore safer AI with fellow researchers and enthusiasts

Large AI models are released nearly every week. We need to find ways to evaluate these models (especially at the complexity of GPT-4) to ensure that they will not have critical failures after deployment, e.g. autonomous power-seeking, biases for unethical behaviors, and other phenomena that arise in deployment (e.g. inverse scaling).

Participate in the Alignment Jam on safety benchmarks to spend a weekend with AI safety researchers to formulate and demonstrate new ideas in measuring the safety of artificially intelligent systems.

Rewatch the keynote talk Alexander Pan above

The MACHIAVELLI benchmark (left) and the Inverse Scaling Prize (right)

Sign up below to be notified before the kickoff! Read up on the schedule, see instructions for how to participate, and inspiration below.

Submission details

You are required to submit:

  • A PDF report using the template linked on the submission page

  • A maximum 10 minute video presenting your findings and results (see inspiration and instructions for how to do this on the submission page)

You are optionally encouraged to submit:

  • A slide deck describing your project

  • A link to your code

  • Any other material you would like to link

60

Sign Ups

0

Entries

Overview

Resources

Schedule

Entries

Overview

Arrow

Hosted by Apart Research, Esben Kran, and Fazl Barez

Explore safer AI with fellow researchers and enthusiasts

Large AI models are released nearly every week. We need to find ways to evaluate these models (especially at the complexity of GPT-4) to ensure that they will not have critical failures after deployment, e.g. autonomous power-seeking, biases for unethical behaviors, and other phenomena that arise in deployment (e.g. inverse scaling).

Participate in the Alignment Jam on safety benchmarks to spend a weekend with AI safety researchers to formulate and demonstrate new ideas in measuring the safety of artificially intelligent systems.

Rewatch the keynote talk Alexander Pan above

The MACHIAVELLI benchmark (left) and the Inverse Scaling Prize (right)

Sign up below to be notified before the kickoff! Read up on the schedule, see instructions for how to participate, and inspiration below.

Submission details

You are required to submit:

  • A PDF report using the template linked on the submission page

  • A maximum 10 minute video presenting your findings and results (see inspiration and instructions for how to do this on the submission page)

You are optionally encouraged to submit:

  • A slide deck describing your project

  • A link to your code

  • Any other material you would like to link

60

Sign Ups

0

Entries

Overview

Resources

Schedule

Entries

Overview

Arrow

Hosted by Apart Research, Esben Kran, and Fazl Barez

Explore safer AI with fellow researchers and enthusiasts

Large AI models are released nearly every week. We need to find ways to evaluate these models (especially at the complexity of GPT-4) to ensure that they will not have critical failures after deployment, e.g. autonomous power-seeking, biases for unethical behaviors, and other phenomena that arise in deployment (e.g. inverse scaling).

Participate in the Alignment Jam on safety benchmarks to spend a weekend with AI safety researchers to formulate and demonstrate new ideas in measuring the safety of artificially intelligent systems.

Rewatch the keynote talk Alexander Pan above

The MACHIAVELLI benchmark (left) and the Inverse Scaling Prize (right)

Sign up below to be notified before the kickoff! Read up on the schedule, see instructions for how to participate, and inspiration below.

Submission details

You are required to submit:

  • A PDF report using the template linked on the submission page

  • A maximum 10 minute video presenting your findings and results (see inspiration and instructions for how to do this on the submission page)

You are optionally encouraged to submit:

  • A slide deck describing your project

  • A link to your code

  • Any other material you would like to link

60

Sign Ups

0

Entries

Overview

Resources

Schedule

Entries

Overview

Arrow

Hosted by Apart Research, Esben Kran, and Fazl Barez

Explore safer AI with fellow researchers and enthusiasts

Large AI models are released nearly every week. We need to find ways to evaluate these models (especially at the complexity of GPT-4) to ensure that they will not have critical failures after deployment, e.g. autonomous power-seeking, biases for unethical behaviors, and other phenomena that arise in deployment (e.g. inverse scaling).

Participate in the Alignment Jam on safety benchmarks to spend a weekend with AI safety researchers to formulate and demonstrate new ideas in measuring the safety of artificially intelligent systems.

Rewatch the keynote talk Alexander Pan above

The MACHIAVELLI benchmark (left) and the Inverse Scaling Prize (right)

Sign up below to be notified before the kickoff! Read up on the schedule, see instructions for how to participate, and inspiration below.

Submission details

You are required to submit:

  • A PDF report using the template linked on the submission page

  • A maximum 10 minute video presenting your findings and results (see inspiration and instructions for how to do this on the submission page)

You are optionally encouraged to submit:

  • A slide deck describing your project

  • A link to your code

  • Any other material you would like to link