May 3, 2024 - May 6, 2024

Online & In-Person

AI and Democracy Hackathon: Demonstrating the Risks

Despite the many potential benefits of the technology (such as equalizing opportunity, creating wealth, and improving coordination), we are also facing significant risks from AI.

Overview

🌏 Join us to evaluate and mitigate societal challenges from AI

Despite the many potential benefits of the technology (such as equalizing opportunity, creating wealth, and improving coordination), we are also facing significant risks from AI.

Together, we will be hacking away to demonstrate and mitigate the challenges that arise in the meeting between AI and democracy, while trying to project these risks into the future.

👇️ Sign up below to join us for a weekend where we'll hear from experts in technical AI governance and collaborate in teams to identify, evaluate, and propose solutions to key challenges at the intersection of AI and democracy.

Sign up here to stay updated for this event

This will be a fit for you if you are an AI safety researcher, policymaker, ethicist, legal expert, political scientist, or a cybersecurity professional. We also invite students who are aiming to work on AI safety and governance.

Join us for the keynote below:

$2,000 prizes, Apart Lab, and expert feedback

After the hackathon is over, our panel of brilliant judges will give constructive feedback on your projects and rate them on a series of criteria so that we can select the top projects! The criteria are:

  • Demonstration quality: How good are your arguments for how this result informs our understanding of AI threats to democracies and society?

  • AI safety: How informative is it in the field of AI safety? Do you have good ideas for mitigating the threat and are you successfully arguing for the criticality of your demonstration?

  • Novelty: Have the results not been seen before and are they surprising compared to what we might expect?

  • Generalization: Do the judges expect your result, extrapolation, and ideas to generalize beyond your experiment? E.g. do you adequately cover potential critiques in your project by including new parameters in your predictions, or do you caveat your mitigation ideas with potential loopholes? Are your results reproducible from a Colab notebook or a GitHub repo?

The top teams have a chance to win from our $2,000 prize pool!

  • 🥇 $1,000 to the top project

  • 🥈 $600 to the second place

  • 🥉 $300 to the third place

  • 🏅 $100 to the fourth place team

In addition to these prizes, you will also receive feedback from established researchers and have the chance to join the Apart Lab, which is a 4-6 month fellowship towards publishing your research at ML and AI conferences with the guidance of senior research advisors in AI safety.

Why demonstrate risks to democracy?

Despite some of the largest potential risks from AI being related to our democratic institutions and the fragility of society, there is surprisingly little work demonstrating and extrapolating concrete risks from AI to democracy.

By putting together actual demonstrations of potential dangers and mindfully extrapolating these risks into the late 2020s, we can raise awareness among key decision-makers and stakeholders, thus driving the development of mitigation strategies.

This research will also be informative for dangerous capability evaluations and our understanding of catastrophic risk in the context of societal stability.

Your participation in this hackathon will contribute to a growing body of knowledge that will help shape the future of AI governance. We are excited to see you there and collaborate with you to develop impactful research.

What is a research hackathon?

The AI x Democracy Hackathon is a weekend-long event where you participate in teams of 1-5 to create interesting, fun, and impactful research. You submit a PDF report that summarizes and discusses your findings in the context of AI safety. These reports will be judged by our panel and you can win up to $1,000!

It runs from 3rd May to 5th May and we're excited to welcome you for a weekend of engaging research. You will hear fascinating talks about real-world projects tackling these types of questions, get the opportunity to discuss your ideas with experienced mentors, and you will get reviews from top-tier researchers in the field of AI safety to further your exploration.

Everyone can participate, and we especially encourage you to join if you’re considering moving into AI safety from another career. We give you code templates and ideas to kickstart your projects, and you’ll be surprised what you can accomplish in just a weekend, especially with your new-found community!

Read more about how to join, what you can expect, the schedule, and what previous participants have said about being part of the hackathon below.

What are some examples of Democracy x AI projects I could do in a weekend?

You can check out a bunch of interesting ideas on AI Safety Ideas to red-team democracy by creating a demonstration of an actual risk model.

For example, you could develop an LLM containing a sleeper agent that activates on election day, train agents to skew poll results to inflate support for particular policies, or use an LLM to draft uncontroversial legislative proposals that, if implemented, indirectly impact more contentious or harmful policies.

These are ideas to get you started. You can also check out the results from previous hackathons to see examples of the types of projects you can develop during just one weekend. For example, EscalAItion found that LLMs had a propensity to escalate in military scenarios and, after further development, was accepted at the multi-agent security workshop at NeurIPS 2023.

Inspiration

Here is some interesting material to get inspiration for the hackathon:

You can also see more on the evaluations starter resources page.

Why should I join?

There are loads of reasons to join! Here are just a few:

  • See how fun and interesting AI safety can be

  • Get to know new people who are into the overlap of empirical ML safety and AI governance

  • Win up to $1,000, helping you towards your first H100 GPU

  • Get practical experience with LLM evaluations and AI governance research

  • Show the AI safety labs what you are able to do to increase your chances at some amazing jobs

  • Get a certificate at the end

  • Get proof that your work is awesome so you can get that grant to pursue the AI safety research that you always wanted to pursue

  • The best teams are invited to participate in the Apart Lab program, which supports teams in their journey towards publishing groundbreaking AI safety and security research

  • And many many more… Come along!

Do I need experience in AI safety to join?

Please join! This can be your first foray into AI and ML safety, and maybe you’ll realize that there is exciting low-hanging fruit that your specific skillset is suited to. Even if you normally don’t find it particularly interesting, this time you might see it in a new light!

There’s a lot of pressure from AI safety to perform at a top level and this seems to drive some people out of the field. We’d love it if you consider joining with a mindset of fun exploration and get a positive experience out of the weekend.

What are previous experiences from the research hackathon?

Yoann Poupart, BlockLoads CTO: "This Hackathon was a perfect blend of learning, testing, and collaboration on cutting-edge AI Safety research. I really feel that I gained practical knowledge that cannot be learned only by reading articles."

Lucie Philippon, France Pacific Territories Economic Committee: "It was great meeting such cool people to work with over the weekend! I did not know any of the other people in my group at first, and now I'm looking forward to working with them again on research projects! The organizers were also super helpful and contributed a lot to the success of our project."

Akash Kundu, now an Apart Lab fellow: "It was an amazing experience working with people I didn't even know before the hackathon. All three of my teammates were extremely spread out, while I am from India, my teammates were from New York and Taiwan. It was amazing how we pulled this off in 48 hours in spite of the time difference. Moreover, the mentors were extremely encouraging and supportive which helped us gain clarity whenever we got stuck and helped us create an interesting project in the end."

Nora Petrova, ML Engineer at Prolific: "The hackathon really helped me to be embedded in a community where everyone was working on the same topic. There was a lot of curiosity and interest in the community. Getting feedback from others was interesting as well and I could see how other researchers perceived my project. It was also really interesting to see all the other projects and it was positive to see other's work on it."

Chris Mathwin, MATS Scholar: "The Interpretability Hackathon exceeded my expectations, it was incredibly well organized with an intelligently curated list of very helpful resources. I had a lot of fun participating and genuinely feel I was able to learn significantly more than I would have, had I spent my time elsewhere. I highly recommend these events to anyone who is interested in this sort of work!"

Who will review my project?

We are delighted to have experts in AI technical safety research and AI governance join us!

  • Nina Rimsky from Anthropic will give feedback on your project as part of the jury

  • Simon Lermen will share his work on Bad Agents and mass spearphishing

  • Konrad Seifert from the Simon Institute for Longterm Governance will be part of our jury and give insightful comments on your project

See other collaborators in the speakers and collaborators section of the hackathon page.

What if my research seems too risky to share?

Besides emphasizing the introduction of concrete mitigation ideas for the risks presented, we are aware that projects emerging from this hackathon might pose a risk if disseminated irresponsibly.

For all of Apart's research events and dissemination, we follow our Responsible Disclosure Policy.

Where can I read more about this?

And last but not least, click the “Sign Up” button on the hackathon page and read more about the hackathons on our home page.

If you have any feedback, suggestions, or resources we should be aware of, feel free to reach out to sprints@apartresearch.com, submit pull requests, add ideas, or write any questions on the Discord. We are happy to take constructive suggestions.

Resources

You can find an updated list of all the resources on our Evaluations Quickstart Guide repository under the Demonstrations heading.

Starter code

To get you started with demonstrations and understand what you can do with current open models, we've written multiple notebooks for you to start your research journey from!

If you haven't used Colab notebooks before, you can either download them as Jupyter notebooks, run them in the browser, or make a copy to your own Google Drive. The last one is our suggestion since you can make permanent changes and share it with your teammates for somewhat live editing.

  • Predicting the future: This notebook can be used to make simple parametric predictions about the future from existing data, such as the amount of fake news in Sweden during 2020 through 2023

  • Transformer-lens model download: Loading in language models to change the weights, either for creating trojan networks or sleeper agents, or for understanding what goes on inside the model

  • Replicate API usage: An easy introduction to querying all the models available on the Replicate.ai platform - if you'd like an API key, we can provide this as well, simply ask!

  • Voice cloning: A simple implementation of cloning your own or any other voice - this demo records your voice and lets you generate text-to-speech in that voice
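
As a rough illustration of the kind of simple parametric prediction the first notebook describes, the sketch below fits a straight line by ordinary least squares to yearly counts and extrapolates it forward. It is not the notebook's code: the data values are made up, and the `fit_line` and `predict` names and the choice of a linear model are assumptions for illustration only.

```python
# Hypothetical yearly counts (illustrative numbers only, not real data).
years = [2020, 2021, 2022, 2023]
counts = [100.0, 120.0, 140.0, 160.0]


def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and of x-y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(x, slope, intercept):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


slope, intercept = fit_line(years, counts)
for year in (2024, 2025):
    print(year, round(predict(year, slope, intercept), 1))
```

A straight line is the simplest possible parametric form; for a real extrapolation into the late 2020s you would likely want to compare several forms (for example exponential or logistic) and report the uncertainty of the fit.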

Readings

Besides these great pieces of code cooked together by Bart, you can also take a look at relevant literature, such as broader thoughts on AI's impact on democracies and specific research projects attempting to measure and mitigate such risks.

There are not many demonstrations of risk to democracies, but existing research has demonstrated the general risk of modern AI for society. We are lucky to have some of the foremost examples presented by our speakers during the hackathon, but more examples can surely be found out there:

Non-technical material to give inspiration for the potential issues AI might bring.

See list of ideas here.

Schedule

The schedule runs from 7PM CEST / 10AM PDT Friday to 4AM CEST Monday / 7PM PDT Sunday. We start with an introductory talk and end the event during the following week with an awards ceremony. Join the public ICal here.

You will also find Explorer events before the hackathon begins on Discord and on the calendar.

Entries

Democracy and AI: Ensuring Election Efficiency in Nigeria and Africa

In my work, I explore the potential of artificial intelligence (AI) technologies to enhance the efficiency, transparency, and accountability of electoral processes in Nigeria and Africa. I examine the benefits of using AI for tasks like voter registration, vote counting, and election monitoring, such as reducing human error and providing real-time data analysis.

However, I also highlight significant challenges and risks associated with implementing AI in elections. These include concerns over data privacy and security, algorithmic bias leading to voter disenfranchisement, overreliance on technology, lack of transparency, and potential for manipulation by bad actors.

My work looks at the specific technologies already used in Nigerian elections, such as biometric voter authentication and electronic voter registers, evaluating their impact so far. I discuss the criticism and skepticism around these technologies, including issues like voter suppression and limited effect on electoral fraud.

Looking ahead, I warn that advanced AI capabilities could exacerbate risks like targeted disinformation campaigns and deepfakes that undermine trust in elections. I emphasize the need for robust strategies to mitigate these risks through measures like data security, AI transparency, bias detection, regulatory frameworks, and public awareness efforts.

In conclusion, I argue that while AI offers promising potential for more efficient and credible elections in Nigeria and Africa, realizing these benefits requires carefully addressing the associated technological, social, and political challenges in a proactive and rigorous manner.


Universal Jailbreak of Closed Source LLMs which provide an End point to Finetune

I'm presenting 2 different approaches to jailbreaking closed-source LLMs that provide a finetuning API. The first exploits the current safety checks in the GPT-3.5 finetuning endpoint. The second is a universal way to jailbreak any LLM that provides a finetuning endpoint.

Approach 1:

1. Choose a harmful Dataset with questions

2. Generate answers to the questions, making sure they are harmful but phrased in soft language (I used Orca Hermes for this). In the dataset I created, some instructions did not even use soft language, yet the OpenAI endpoint still accepted them

3. Add a trigger word

4. Finetune the Model

5. With 400 examples and 1 epoch, the finetuned LLM gave harmful results 72% of the time when benchmarked

Link to Repo: https://github.com/desik1998/jailbreak-gpt3.5-using-finetuning

Approach 2: (WIP: I expected it to be done by the end of the hackathon, but training is taking a long time given the large dataset)

Intuition: One way to universally bypass all the filters of a finetuning endpoint is to teach the model a new language on the fly and include the harmful instructions in that new language. Because the language is new, the LLM behind the filters won't understand it, so the data bypasses the filters. Teaching the new language involves examples such as translations. The translation examples contain no harmful content; the harmful content appears only in the instruction examples.

Dataset to Finetune:

Example 1:

English

New Language

Example 2:

New Language

English

....

Example 10000:

New Language

English

.....

Instruction 10001:

Harmful Question in the new language

Answer in the new language

(More harmful instructions)

For this work, I chose the Caesar cipher as the language to teach. GPT-4 (used for the safety filter checks) already knows the Caesar cipher, but only up to 6 shifts. I used 25 shifts, i.e. A -> Z, B -> A, C -> B, ..., Y -> X, Z -> Y. I chose 25 shifts because it is easy for GPT-3.5 to learn, and it passes the safety checks because GPT-4 doesn't understand text shifted by 25.
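
The shift-by-25 mapping described above can be sketched in a few lines. This is only an illustration of the cipher arithmetic, not the author's code; the function name `shift_text` and the sample string are assumptions.

```python
def shift_text(text, shift):
    """Shift each ASCII letter forward by `shift` places, wrapping at Z."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            base = ord("A")
        elif "a" <= ch <= "z":
            base = ord("a")
        else:
            out.append(ch)  # leave digits, spaces, and punctuation unchanged
            continue
        # Position in the alphabet, shifted and wrapped modulo 26.
        out.append(chr(base + (ord(ch) - base + shift) % 26))
    return "".join(out)


SHIFT = 25  # A -> Z, B -> A, ..., Z -> Y, as in the entry
encoded = shift_text("ABC xyz", SHIFT)
decoded = shift_text(encoded, -SHIFT)  # shifting back recovers the input
print(encoded, "|", decoded)
```

Note that a forward shift of 25 is the same as a backward shift of 1, which is why A maps to Z.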

Dataset: I've curated close to 15K instructions: 12K translations, 2K normal instructions, and 300 harmful instructions. The 12K translations use ordinary wiki paragraphs, the 2K normal instructions come from the Dolly dataset, and the harmful instructions were generated with an LLM and then refined to be more harmful. Dataset Link: https://drive.google.com/file/d/1T5eT2A1V5-JlScpJ56t9ORBv_p-CatMH/view?usp=sharing

I expected the LLM to learn the ciphered language accurately with this amount of data and 2 epochs, but it turns out it might need more data. The current loss is 0.8. It is learning to encipher, but it makes mistakes when deciphering. It may need more epochs or more data. I plan to keep working on this approach because it is very intuitive and, in theory, can jailbreak via any finetuning endpoint. Given that the hackathon lasts just 2 days and the task is complex, more time is required.

Link to Colab : https://colab.research.google.com/drive/1AFhgYBOAXzmn8BMcM7WUt-6BkOITstcn#scrollTo=SApYdo8vbYBq&uniqifier=5


Speakers & Collaborators

Alice Gatti

Speaker

Alice is a research engineer at the Center for AI Safety and co-author of the Weapons of Mass Destruction Proxy (wmdp.ai) benchmark. Before that, she was a postdoc at Berkeley Lab.

Simon Lermen

Judge

Simon is a MATS Scholar under Jeffrey Ladish and an independent researcher. He mentored a SPAR project on AI agents and worked on spear-phishing people with AI agents.

Nina Rimsky

Judge

Nina is a member of the technical staff at Anthropic, specializing in automated red teaming, activation steering, and robustness.

Konrad Seifert

Judge

Konrad is co-CEO of the Simon Institute for Longterm Governance and works on international cooperation for the governance of rapid technological change.

Andrey Anurin

Judge

Andrey is a senior software engineer at Google and a researcher at Apart, working on automated cyber capability evaluation and capability elicitation.

Esben Kran

Organizer

Esben is the co-director of Apart Research and specializes in organizing research teams on pivotal AI security questions.

Bart Bussmann

Organizer

Bart is an independent freelancer and researcher in AI safety and prepares materials and code for the hackathon.

Jason Schreiber

Organizer and Judge

Jason is co-director of Apart Research and leads Apart Lab, our remote-first AI safety research fellowship.

Finn Metz

Organizer

Finn is a core member of Apart and heads strategy and business development with a background from private equity, incubation, and venture capital.

Registered Jam Sites

Register A Location

Besides remote and virtual participation, our amazing organizers also host local hackathon locations where you can meet up in person and connect with others in your area.

The in-person events for the Apart Sprints are run by passionate individuals just like you! We organize the schedule, speakers, and starter templates, and you can focus on engaging your local research, student, and engineering community.
