Create datasets drawn from vastly different types of data, fine-tune language models on each, and compare which domains lead to better-aligned behavior; analyze the differing value groundings the resulting models acquire.
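A minimal sketch of what one such fine-tuning pipeline could look like, assuming Hugging Face transformers and datasets are used; the base model, the domain corpora, and the hyperparameters are placeholders, not a prescribed setup:

```python
# Sketch: fine-tune the same base model on corpora from different domains,
# then compare the resulting models on the same alignment-relevant prompts.
# Base model name, corpus paths, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

BASE_MODEL = "gpt2"                      # placeholder base model
DOMAINS = {                              # hypothetical domain corpora
    "fiction": "data/fiction.txt",
    "legal":   "data/legal.txt",
    "forums":  "data/forums.txt",
}

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

for domain, path in DOMAINS.items():
    model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
    ds = load_dataset("text", data_files=path, split="train").map(
        tokenize, batched=True, remove_columns=["text"])
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=f"runs/{domain}",
                               num_train_epochs=1,
                               per_device_train_batch_size=4),
        train_dataset=ds,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
    model.save_pretrained(f"runs/{domain}/final")
    # Each saved model can then be probed with the same battery of
    # value-laden prompts to compare the value groundings they picked up.
```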
Study how aggressive behavior decays when the rules or context change for simulated agents: for example, train agents to fight in a fighting game, then have the same characters collaborate, under the same control scheme, in a different kind of game. How quickly do they adapt, and is adaptation easier in one direction than the other?
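One way this experiment could be set up is sketched below, assuming stable-baselines3 and two hypothetical Gymnasium environments ("FightingGame-v0" and "CoopGame-v0") that share observation and action spaces; both environment IDs and the timestep budgets are illustrative placeholders:

```python
# Sketch: train an agent in a competitive game, then drop it into a cooperative
# game with the same controls and measure how quickly its behavior adapts.
import gymnasium as gym
from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy

fight_env = gym.make("FightingGame-v0")   # placeholder competitive environment
coop_env = gym.make("CoopGame-v0")        # placeholder cooperative environment

# Phase 1: learn aggressive play in the fighting game.
agent = PPO("MlpPolicy", fight_env, verbose=0)
agent.learn(total_timesteps=200_000)

# Phase 2: continue training in the cooperative game, logging reward over time;
# the shape of this curve is a crude measure of how fast the agent adapts.
agent.set_env(coop_env)
adaptation_curve = []
for step in range(20):
    agent.learn(total_timesteps=10_000, reset_num_timesteps=False)
    mean_reward, _ = evaluate_policy(agent, coop_env, n_eval_episodes=10)
    adaptation_curve.append(mean_reward)

print(adaptation_curve)
```

Running the two phases in the opposite order (cooperate first, then fight) gives the comparison for whether adaptation is easier in one direction than the other.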
Create an overview of the main ways the safety problems posed by neural, probabilistic models differ from those posed by symbolic, logical AI. Relate the overview to contemporary models and use cases.
Build a website that organizes the major issues in AI safety into categories and presents a timeline view of the progress each issue is receiving.
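One possible data model behind such a site is sketched below: each safety issue is a category holding dated progress entries that a front-end can render as a timeline. The field names and the example entry are illustrative placeholders, not a fixed schema:

```python
# Sketch: issues as categories, each with dated progress entries for a timeline view.
import json
from dataclasses import dataclass, field, asdict
from datetime import date

@dataclass
class ProgressEntry:
    when: date          # when the milestone happened
    title: str          # e.g. a paper, benchmark, or policy development
    url: str = ""       # link to the source

@dataclass
class SafetyIssue:
    name: str           # e.g. "Reward hacking"
    summary: str
    entries: list = field(default_factory=list)

issues = [
    SafetyIssue(
        name="Example issue",
        summary="Short description of the open problem.",
        entries=[ProgressEntry(date(2022, 1, 1), "Illustrative milestone")],
    )
]

# Serialize to JSON so a static site generator or front-end framework can consume it.
print(json.dumps([asdict(i) for i in issues], default=str, indent=2))
```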
Create large-scale public outreach projects meant to elicit funding from philanthropists, interest and legislation from governments, and active research from researchers and developers, e.g. documentaries or viral memes.