
AI capabilities and risks demo-jam: Creating visceral interactive demonstrations

August 23, 2024 4:00 PM to August 26, 2024 3:00 AM (UTC)
This event is finished. It occurred between August 23, 2024 and August 26, 2024.

Sign Up for the AI Capabilities and Risks Demo-Jam hackathon!


As frontier AI systems become rapidly more powerful and general, and as their risks become more pressing, it’s increasingly important for key decision-makers and the public to understand their capabilities. Let’s bring these insights out of dusty arXiv papers and build visceral, immediately engaging demos!

Well-crafted interactive demos can be incredibly powerful in conveying AI capabilities and risks. That's why we're inviting you – developers, designers, and AI enthusiasts – to team up and create innovative demos that make people feel, rather than just think, about the rate of AI progress.

Why interactive demos can have a drastic impact

Interactive demonstrations allow something that just reading about AI safety can't do: make the reader live through what potential scenarios could look like, and engage them with complex concepts firsthand.

Importantly, what people understand and think has an impact on political decisions. “The Impact of Public Opinion on Public Policy: A Review and an Agenda” finds that public opinion plays a key role in public policy, and more so the more salient that opinion is, while “Does Public Opinion Affect Political Speech?” finds that politicians adjust their speech and positions to reflect the preferences of the public. This is why, to impact AI policy, we need to communicate both with politicians and with people at large.

We are focusing on interactive demos as an underappreciated way to communicate on an issue, as interactivity allows one to truly live through the concept presented and get an intuitive feel for it.

For this hackathon, we’re excited about the following possibilities:

  • Present scenarios that allow users to emotionally connect with potential AI risks
  • Illustrate key AI safety concepts and trends in an engaging, hands-on manner
  • Showcase current AI capabilities and their implications in a tangible way

Some examples of demonstrations we would be excited to see as a result of this hackathon:

Take a look at our resources to see many more!

What to expect during a demo-jam hackathon?

The hackathon is a weekend-long event where you participate in teams of 1-5 people to create interactive applications that demonstrate AI risks, AI safety concepts, or current capabilities.

You will have the opportunity to:

  • Collaborate with like-minded individuals passionate about AI safety
  • Receive continuous feedback from other participants and mentors
  • Review and learn from other teams' projects
  • Contribute to raising awareness about crucial AI safety issues

Prizes, evaluation, and submission

During this hackathon, you will develop and submit an interactive application, as well as review other participants’ entries.

You’ll submit:

  • source code of the application, along with instructions to deploy it locally.
    • You can use any tools or language as long as the deployment instructions are simple to follow.
    • Be sure to remove your API keys before submission.
    • We recommend tools such as docker-compose so that participants can easily and safely judge your submission.
    • We also recommend using either public-weight models or AI tools with enough of a free trial for participants to test your submission.
  • a brief report summarizing your project following this template
  • a 2-minute video demonstration of how your interactive demo works
  • Optionally, an endpoint for participants to test your application directly (for example through Vercel).
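
One way to keep API keys out of a submitted repository is to read them from the environment at runtime. The sketch below is a minimal illustration in Python; the variable name `DEMO_LLM_API_KEY` and the helper `load_api_key` are hypothetical and not part of any hackathon template.

```python
import os


def load_api_key(var_name: str = "DEMO_LLM_API_KEY") -> str:
    """Return the API key stored in the environment variable `var_name`.

    The variable name is a hypothetical placeholder; use whatever name
    your deployment instructions tell reviewers to set.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        # Fail early with a message a reviewer can act on.
        raise RuntimeError(
            f"Set the {var_name} environment variable before starting the demo."
        )
    return key
```

With this approach, reviewers supply their own key through the environment, so no key needs to appear in the source code you submit.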

The submission will be reviewed under the following criteria:

  • Viscerality: Does the demo give you a felt sense of AI capabilities? Is it fun, engaging, immersive or understandable? Does it make you feel the real-world implications?
  • Importance: How important are the insights delivered by this demo for understanding how AI will affect humanity in the coming years?
  • Accuracy: If the demo shows capabilities or risks, does it present them accurately, giving takeaways that generalize well? If the demonstration explains a concept, is it accurate?
  • Ease of testing: How easy is it to try out the demo?
  • Safety: Is the demo safe enough to publish publicly without causing harm?
  • Overall: How good is the demo overall?

The judging panel's scores will count for half of the final score, with the other half coming from peer reviews. You will be assigned as a peer reviewer of some other hackathon submissions - you must review these to be eligible to win.
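
The half-and-half weighting described above can be sketched as follows. This is an illustration only: how per-criterion scores are aggregated is not specified here, so the simple averaging and the function name `final_score` are assumptions.

```python
from statistics import mean


def final_score(judge_scores: list[float], peer_scores: list[float]) -> float:
    """Combine scores so that judges and peers each count for half.

    Assumption: each group's scores are averaged first; the actual
    aggregation used by the organizers may differ.
    """
    if not judge_scores or not peer_scores:
        raise ValueError("Both judge and peer scores are required.")
    return 0.5 * mean(judge_scores) + 0.5 * mean(peer_scores)


# Two judges and three peer reviewers scoring the same submission.
print(final_score([8, 6], [4, 6, 5]))  # 6.0
```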

Top teams will win a share of our $2,000 prize pool:

🥇 1st place: $1,000

🥈 2nd place: $600

🥉 3rd place: $300

🎖️ 4th place: $100

Additionally, teams with high-quality submissions that are a good fit for AI Digest might be invited to collaborate on polishing and publishing their demo there, to reach a wider audience of policymakers and the public.

Why should I join?

There are loads of reasons to join! Here are just a few:

  • Participate in a game jam-style event focused on an AI safety-adjacent topic
  • Get to know new people interested in AI safety
  • Win up to $1,000
  • Receive a certificate of participation
  • Contribute to important public outreach on AI safety
  • And many many more… Come along!

Do I need experience in AI safety to join?

Not at all! This can be an occasion for you to learn more about AI safety and public outreach. We provide code templates and ideas to kickstart your project, mentors to give feedback on it, and a great community of interested developers to review it.

What if my project seems too risky to share?

We encourage projects to include concrete mitigation ideas for the risks they present. We are also aware that projects emerging from this hackathon might pose a risk if disseminated irresponsibly.

For all of Apart's research events and dissemination, we follow our Responsible Disclosure Policy.

I have more questions to ask

Head over to the Frequently Asked Questions on the submission tab, and feel free to ping us ( @mentor ) on Discord!

Speakers & Collaborators

Lucas Hansen

Lucas brings 10 years of technology experience to CivAI. Previously, he founded Qualia, a real estate software company which serves millions of homeowners each year.
Keynote Speaker, Judge

Épiphanie Gédéon

Épiphanie is a member of Centre pour la Sécurité de l'IA and currently an Apart Research Fellow.
Organizer, Mentor and Judge

Adam Binksmith

Adam runs Sage, an organization helping impactful communities figure out what's true.
Organizer, Mentor and Judge

Archana Vaidheeswaran

Archana is responsible for organizing the Apart Sprints, research hackathons to solve the most important questions in AI safety.
Organizer

Misha Yagudin

Misha runs Arb Research and Samotsvety Forecasting. Misha currently advises at Sage.
Mentor and Judge

Charbel-Raphaël Ségerie

Charbel-Raphaël is the executive director at Centre pour la Sécurité de l'IA and has led ML4Good, a bootcamp focused on AI safety.
Judge

Natalia Pérez-Campanero Antolín

A research manager at Apart, Natalia has a PhD in Interdisciplinary Biosciences from Oxford and has run the Royal Society's Entrepreneur-in-Residence program.
Judge

Jason Schreiber

Jason is co-director of Apart Research and leads Apart Lab, our remote-first AI safety research fellowship.
Organizer

Esben Kran

Esben is the co-director of Apart Research and specializes in organizing research teams on pivotal AI security questions.
Organizer

Resources

  • Examples of AI tools you can use in your demo
    • Langchain: A library for calling LLMs through a universal interface, allowing you to easily switch to other models.
    • Llama 3.1: A performant open-weight LLM. We recommend using open-weight models so that participants can more easily judge your demo.
    • Claude Sonnet 3.5: A proprietary LLM that is cheap and performant.
    • GPT-4o: A proprietary and performant LLM.
    • Retell.ai: A service for automated call platforms.
    • Openrouter: A universal interface to LLMs through API calls.
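
To illustrate why a universal interface makes switching models easy, here is a minimal sketch in plain Python. It does not use the actual Langchain or Openrouter APIs; the `ChatModel` protocol, the two stub backends, and `run_demo` are hypothetical names introduced for illustration, and the stubs return canned text instead of calling a real model.

```python
from typing import Protocol


class ChatModel(Protocol):
    """Anything that can turn a prompt into a reply."""

    def complete(self, prompt: str) -> str: ...


class StubOpenWeightModel:
    """Stand-in for a locally hosted open-weight model (no real inference)."""

    def complete(self, prompt: str) -> str:
        return f"[open-weight stub] {prompt}"


class StubHostedModel:
    """Stand-in for a hosted proprietary model (no network call)."""

    def complete(self, prompt: str) -> str:
        return f"[hosted stub] {prompt}"


def run_demo(model: ChatModel, user_input: str) -> str:
    """The demo logic depends only on the shared interface."""
    return model.complete(user_input)


# Swapping the backend is a one-line change for the caller.
print(run_demo(StubOpenWeightModel(), "Hello"))
print(run_demo(StubHostedModel(), "Hello"))
```

Because the demo logic only depends on `complete`, a reviewer without paid access could run it against an open-weight backend while you develop against a hosted one.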

See the updated calendar and subscribe

The schedule runs from 4 PM UTC Friday to 3 AM UTC Monday. We start with an introductory talk and end the event during the following week with an awards ceremony. Join the public iCal here.

You will also find Explorer events, such as collaborative brainstorming and team match-making, on Discord and in the calendar before the hackathon begins.

📍 Registered jam sites

Besides remote participation, our amazing organizers also host local hackathon locations where you can meet up in person and connect with others in your area.

Berkeley Hackathon: AI Capability and Risk Demos

Come and hack on demos at the CivAI office in Berkeley! Hosted by CivAI

Hackathon: AI Capability and Risk Demos

Come and hack on demos at the LISA office in London! Hosted by Safe AI London

🏠 Register a location

The in-person events for the Apart Sprints are run by passionate individuals just like you! We organize the schedule, speakers, and starter templates, and you can focus on engaging your local research, student, and engineering community. Read more about organizing.

Use this template for your submission [Required]

What to submit?

Please be sure to include the following in your submission:

  • A video of about 2 minutes demonstrating your application (you can use Loom or any other tool you see fit). A facecam setup where you narrate as you navigate your demonstration is ideal.
  • A git repository with your code. You can use any language for coding your application. You can use the HackBook as a basic template.
    • If you want to make it private because you are concerned about its risks, contact joy_void_joy on Discord.
    • Alternatively or in addition, you can provide an endpoint, for instance using localtunnel or Vercel, for participants to test and judge your entry.
  • The above template, to quickly describe your approach, code, and theory of impact.

Peer review

After submitting, remember to review your assigned projects in the following days. You will receive an email with three demos, along with Google Forms to vote on them. You must review them before Wednesday 3 AM UTC. Feel free to review more projects (you can find them on the submission page or in the Discord's #feedback-and-review channel); your votes are counted and will determine the overall rankings of projects.

How to review

To review a submission, you will need its Google Form to send your review. Enter your participant ID and judge the submission according to the criteria, adding any remarks you find pertinent.

You are not expected to pay for any services to review demos. However, do attempt to deploy the images locally or interact with the provided endpoint to judge ease of testing. You may contact the project owners on Discord or ping them in the #feedback-and-reviews channel. If you are not able to access the demonstration at all, you can judge based on the submitted video and screenshots.


Review Criteria


The submission will be reviewed under the following criteria:

  • Viscerality: Does the demo give you a felt sense of AI capabilities? Is it fun, engaging, immersive or understandable? Does it make you feel the real-world implications?
  • Importance: How important are the insights delivered by this demo for understanding how AI will affect humanity in the coming years?
  • Accuracy: If the demo shows capabilities or risks, does it present them accurately, giving takeaways that generalize well? If the demonstration explains a concept, is it accurate?
  • Ease of testing: How easy is it to try out the demo?
  • Safety: Is the demo safe enough to publish publicly without causing harm?
  • Overall: How good is the demo overall?

The judging panel's scores will count for half of the final score, with the other half coming from peer reviews. You will be assigned as a peer reviewer of some other hackathon submissions - you must review these to be eligible to win.

Frequently Asked Questions

Can I still register after the hackathon has started?

Yes, feel free to sign up. It's never too late to join and contribute your ideas!

What if I can't complete my project within the hackathon timeframe?

Focus on creating a minimal viable demo. Judges and participants will consider your concept and potential impact even if it's not fully polished.

Are there any restrictions on using pre-existing code or libraries?

You can use pre-existing code or libraries, but indicate what parts are your original work and adequately credit any external sources.

What language should I use? 

You can use whatever language and tools you see fit, as long as you follow the requirements above.

Can we update our submission after the initial deadline?

Unless otherwise specified by the organizers, submissions should be final by the deadline. Focus on submitting your best work within the given timeframe. In case you cannot, reach out to 'archanavaidheeswaran' on Discord to let her know about this.

How can I ensure that my entry does not do more harm than good?

Good question!

Since this hackathon is focused on demonstrating capabilities and risks, we are careful not to increase risks that would not otherwise have been present, in accordance with our Responsible Disclosure Policy.
If you worry your project may be used for harmful ends, you have several options:

  • Lock certain input fields on your endpoint. For instance, if you are doing a demo about automated phishing (like CivAI's cyber-demo), you can lock the target the same way they did.
  • Direct your project toward something that already exists or is more commonplace.
  • Direct your project toward an interactive explainer of the risks rather than a direct visualizer of capabilities.
  • Submit your code privately (contact @mentors or @joy_void_joy to do so) or provide only an endpoint for participants to review. The endpoint may be password gated, but ensure all judges and reviewers can access it. However, remember that Safety is one of the evaluation criteria, and you might lose points there if you need to contain your project in that way.
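
Locking an input field can be done on the server side, so that a value sent by the client is simply ignored. The sketch below assumes a hypothetical demo whose request carries a `target` and a `tone` field; the field names, the fictional persona, and `build_demo_request` are illustrative and are not taken from CivAI's implementation.

```python
# Values the user may still choose for the unlocked field.
ALLOWED_TONES = {"formal", "friendly"}

# The target is fixed to a fictional persona and cannot be overridden.
LOCKED_TARGET = "Alex Example (fictional persona)"


def build_demo_request(user_input: dict) -> dict:
    """Build the request the demo actually processes.

    Any `target` supplied by the client is discarded, and `tone` falls
    back to a default when it is not in the allow-list.
    """
    tone = user_input.get("tone", "formal")
    if tone not in ALLOWED_TONES:
        tone = "formal"
    return {"target": LOCKED_TARGET, "tone": tone}


# The client-supplied target is dropped; only the allowed tone survives.
print(build_demo_request({"target": "a real person", "tone": "friendly"}))
```

Enforcing the lock where the request is built, rather than only hiding the field in the page, means a user cannot bypass it by editing the form or calling the endpoint directly.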

In case of doubt, feel free to contact us (@mentor on Discord).

Are you more interested in existential risks, or only in risks that are already present and covered in the media?


We're interested both in risks that already exist and will have a concrete impact on society, and in later, more existential risks.
That said, it will probably be easier to make a visceral and factual demo about current capabilities than about later ones. One way to demonstrate more existential risks could be to extrapolate from past trends in, e.g., power-seeking behaviors, or to build an interactive demonstration of a toy model, like this toy model of a treacherous turn.

What kind of capabilities are you interested in?


There are many aspects of current AI capabilities that are interesting to showcase and are not currently well understood:

  • AI impersonation
  • AI persuasion
  • AI's capability for fluid voice-to-voice communication
  • General trends and advancements of AI systems overall
  • Capabilities for long-term planning and power-seeking behaviors

What should my demo look like?

There are many different kinds of demos that can work well, for instance:

  • A single-page interaction with a chatbot to demonstrate something dangerous it can do or something inherent in the way it might be used.
  • A more static page. This works especially well for explainers.
  • Something in between, like a wizard assistant of sorts.

Note that the overall presentation mainly affects the Viscerality and Overall scores and will not affect the other criteria. Using a simple framework for your aesthetic, like Bulma or Pico, should work well enough for your presentation to come across clearly.

I lack knowledge about frontend development, can I still participate in the hackathon?


Certainly:


See all submissions below: