All Apart hackathon projects

This page lists all projects submitted to Apart hackathons.

AntiMidas: Building Commercially-Viable Agents for Alignment Dataset Generation

Jacob Arbeid, Jay Bailey, Sam Lowenstein, Jake Pencharz

January 27, 2025

AI Safety Entrepreneurship Hackathon

Omniscient Narrative Agent

Jord Nguyen, Akash Kundu, Gayatri K

December 13, 2024

The Concordia Contest: Advancing the Cooperative Intelligence of Language Model Agents

October 3, 2024

AutoSteer: Weight-Preserving Reinforcement Learning for Interpretable Model Control

Jeremias Lino Ferrao

December 3, 2024

Reprogramming AI Models Hackathon

Promoting School-Level Accountability for the Responsible Deployment of AI and Related Systems in K-12 Education: Mitigating Bias and Increasing Transparency

Chloe Jefferson

November 27, 2024

Howard University AI Safety Summit & Policy Hackathon

Diamonds are Not All You Need

Michael Andrzejewski, Melwina Albuquerque

November 10, 2024

Agent Security Hackathon

Speculative Consequences of A.I. Misuse

Joseph Karam, Charlie Nguyen, Andrew Lam

November 10, 2024

AI capabilities and risks demo-jam: Creating visceral interactive demonstrations

AI Alignment Knowledge Graph

Matin Mahmood, Samuel Ratnam, Sruthi Kuriakose, Pandelis Mouratoglou

November 10, 2024

Research Augmentation Hackathon: Supercharging AI Alignment

Robust Machine Unlearning for Dangerous Capabilities

Neel Jay, Austin Meek, Joshua Ehizibolo

November 7, 2024

AI Policy Hackathon at Johns Hopkins University

EscalAtion: Assessing Multi-Agent Risks in Military Contexts

Gabriel Mukobi, Anka Reuel, Juan-Pablo Rivera, Chandler Smith

September 12, 2024

October 2, 2023

DarkForest - Defending the Authentic and Humane Web

September 5, 2024

Hackathon for Technical AI Safety Startups

Sandbag Detection through Model Degradation

Cam Tice, Philipp Alexander Kreer, Fedor Ryzhenkov, Nathan Helm-Burger, Prithviraj Singh Shahani

Deception Detection Hackathon: Preventing AI deception

Unsupervised Recovery of Hidden Markov Models from Transformers with Evolutionary Algorithms

Dylan Bowman, Colin Lu

Computational Mechanics Hackathon!

rAInboltBench: Benchmarking user location inference through single images

Le "Qronox" Lam, Aleksandr Popov, Jord Nguyen, Trung Dung "mogu" Hoang, Marcel M, Felix Michalak

AI Security Evaluation Hackathon: Measuring AI Capability

Beyond Refusal: Scrubbing Hazards from Open-Source Models

Kyle Gabriel Reynoso, Ivan Enclonar, Lexley Maree Villasis

AI and Democracy Hackathon: Demonstrating the Risks

We Discovered An Neuron

Joseph Miller, Clement Neo

February 24, 2024

January 25, 2023

Investigating Neuron Behaviour via Dataset Example Pruning and Local Search

February 24, 2024

November 15, 2022

Seemingly Human: Dark Patterns in ChatGPT

Jin Suk Park, Angela Lu, Esben Kran

February 24, 2024

February 12, 2024

Model Cards for AI Algorithm Governance

Jaime Raldua Veuthey, Gediminas Dauderis, Chetan Talele

February 24, 2024

January 7, 2024

Detecting Implicit Gaming through Retrospective Evaluation Sets

Jacob Haimes, Lucie Philippon, Alice Rigg, Cenny Wenner

February 24, 2024

November 27, 2023

Preserving Agency in Reinforcement Learning under Unknown, Evolving and Under-Represented Intentions

Harry Powell, Luigi Berducci

February 24, 2024

September 25, 2023

In the Mirror: Using Chess to Simulate Agency Loss in Feedback Loops

February 24, 2024

September 24, 2023

Evaluating Myopia in Large Language Models

Marco Bazzani, Felix Binder

February 24, 2024

September 10, 2023

Joshua Sammet, Per Ivar Friborg, William Wale

February 24, 2024

Relating induction heads in Transformers to temporal context model in human free recall

February 24, 2024

Exploring the Robustness of Model-Graded Evaluations of Language Models

Simon Lermen, Ondřej Kvapil

February 24, 2024

Solving the CNN Mech Int Challenge

Stefan Heimersheim, Marius Hobhahn

February 24, 2024

Automated Sandwiching: Efficient Self-Evaluations of Conversation-Based Scalable Oversight Techniques

Sophia Pung, Gabriel Mukobi

February 24, 2024

February 16, 2023

Discovering Latent Knowledge in Language Models Without Supervision - extensions and testing

Agatha Duzan, Matthieu David, Jonathan Claybrough

February 24, 2024

December 19, 2022

Agreeableness vs. Truthfulness

February 24, 2024

October 18, 2022

Prompt+question Shield

January 27, 2025

AI Safety Entrepreneurship Hackathon

Classification on Latent Feature Activation for Detecting Adversarial Prompt Vulnerabilities

Hoang-Long Tran, Jack Kaunismaa, Edward Stevinson, Parv Mahajan, Oliver Clive-Griffin

December 3, 2024

Reprogramming AI Models Hackathon

Dynamic Risk Assessment in Autonomous Agents Using Ontologies and AI

Alejandra de Brunner

November 10, 2024

Agent Security Hackathon

Very Cooperative Agent

November 10, 2024

The Concordia Contest: Advancing the Cooperative Intelligence of Language Model Agents

October 6, 2024

Mia Hopman, Carissa Cullen, Jack Wittmayer, Vaishnavi Pamulapati

November 10, 2024

AI capabilities and risks demo-jam: Creating visceral interactive demonstrations

Grant Application Simulator

Michaël Trazzi

November 10, 2024

Research Augmentation Hackathon: Supercharging AI Alignment

Arman Mahjoor, Vansh Bataviya, Charles Landreaux, Ryan Mahjoor

October 29, 2024

AI Policy Hackathon at Johns Hopkins University

Simulation Operators: The Next Level of the Annotation Business

September 5, 2024

Hackathon for Technical AI Safety Startups

Detecting and Controlling Deceptive Representation in LLMs with Representational Engineering

Avyay M Casheekar, Kaushik Sanjay Prabhakar, Kanishk Rath, Sienka Dounia

August 29, 2024

Deception Detection Hackathon: Preventing AI deception

RNNs represent belief state geometry in hidden state

Computational Mechanics Hackathon!

Cybersecurity Persistence Benchmark

Davide Zani, Felix Michalak, Jeremias Ferrao

AI Security Evaluation Hackathon: Measuring AI Capability

Jekyll and HAIde: The Better an LLM is at Identifying Misinformation, the More Effective it is at Worsening It.

AI and Democracy Hackathon: Demonstrating the Risks

Fishing for the answer: Mapping the flow of information in LLM agent groups using lessons from fish schools

Matthew Lutz, Nyasha Duri

February 24, 2024

February 12, 2024

Obsolescent Souls

February 24, 2024

January 7, 2024

Visual Prompt Injection Detection

Yoann Poupart, Imene Kerboua

February 24, 2024

November 27, 2023

Jailbreaking the Overseer

Alexander Meinke

February 24, 2024

October 1, 2023

Discovering Agency Features as Latent Space Directions in LLMs via SVD

February 24, 2024

September 25, 2023

Agency as Shannon information. Unveiling limitations and common misconceptions

Ivan Madan, Hennadii Madan

February 24, 2024

September 24, 2023

Catherine Brewer

February 24, 2024

September 21, 2023

The AI governance gaps in developing countries

February 24, 2024

Who cares about brackets?

Theo Clark, Alex Roman, Hannes Thurnherr

February 24, 2024

From Sparse to Dense: Refining the MACHIAVELLI Benchmark for Real-World AI Safety

Heramb Podar, Vladislav Bargatin

February 24, 2024

Dropout Incentivizes Privileged Bases

Edoardo Pona, Victor Levoso Fernàndez, Abhay, Kunvar

February 24, 2024

Player Of Games

February 24, 2024

February 16, 2023

Identifying a Preliminary Circuit for Predicting Gendered Pronouns in GPT-2 Small

Chris Mathwin, Guillaume Corlouer

February 24, 2024

January 25, 2023

Investigating Training Dynamics via Token Loss Trajectories

February 24, 2024

December 19, 2022

Backup Transformer Heads are Robust to Ablation Distribution

Lucas Sato, Gabe Mukobi, Mishika Govil

February 24, 2024

November 15, 2022

AI: My Partner in Crime

February 24, 2024

October 18, 2022

AI Risk Management Assurance Network (AIRMAN)

January 27, 2025

AI Safety Entrepreneurship Hackathon

Utilitarian Decision-Making in Models - Evaluation and Steering

Adam Newgas, Sinem Erisken, Pandelis Mouratoglou

December 3, 2024

Reprogramming AI Models Hackathon

Vaishnavi Pamulapati, Diego Sabajo, Andres Sepulveda Morales, Elsa Donnat, Paul Vautravers

November 10, 2024

Agent Security Hackathon

LLM Research Collaboration Recommender

November 10, 2024

Research Augmentation Hackathon: Supercharging AI Alignment

Sue-Per GPT: Legal RAG Assistant

Atir Petkar, Jay Liu, Chelsea Wong, Nancy Vigil

November 7, 2024

AI Policy Hackathon at Johns Hopkins University

AI Safety Collective - Crowdsourcing Solutions for Critical AI Safety Challenges

Lye Jia Jun, Dhruba Patra, Philipp Blandfort

September 5, 2024

Hackathon for Technical AI Safety Startups

Detecting Deception in GPT-3.5-turbo: A Metadata-Based Approach

Siddharth Reddy Bakkireddy, Rakesh Reddy Bakkireddy

Deception Detection Hackathon: Preventing AI deception

Investigating the Effect of Model Capacity Constraints on Belief State Representations

Ari Brill, Chu Chen

Computational Mechanics Hackathon!

Handcrafting a Network to Predict Next Token Probabilities for the Random-Random-XOR Process

Computational Mechanics Hackathon!

Say No to Mass Destruction: Benchmarking Refusals to Answer Dangerous Questions

Alexander Pino, Carl Vinas, Joseph Dantes, Zmavli Caimle, Kyle Reynoso

AI Security Evaluation Hackathon: Measuring AI Capability

Artificial Advocates: Biasing Democratic Feedback using AI

Sam Patterson, Jeremy Dolan, Simon Wisdom, Maten

AI and Democracy Hackathon: Demonstrating the Risks

Albert Garde, Esben Kran

February 24, 2024

Automated Identification of Potential Feature Neurons

Michelle Wai Man Lo

February 24, 2024

January 25, 2023

Model editing hazards at the example of ROME

Oscar Persson, Jochem Hölscher

February 24, 2024

November 15, 2022

Iterated contract negotiation

Robert Klassert

February 24, 2024

February 11, 2024

2030 - The CEO Dilemma

Pierina Camarena, Leon Nyametso, Capucine Marteau

February 24, 2024

January 8, 2024

Cross-Lingual Generalizability of the SADDER Benchmark

Siddhant Arora, Jord Nguyen, Akash Kundu

February 24, 2024

November 27, 2023

LLMs With Knowledge of Jailbreaks Will Use Them

Jack Foxabbott, Marcel Hedman, Kaspar Senft, Kianoosh Ashouritaklimi

February 24, 2024

October 2, 2023

Uncertainty about value naturally leads to empowerment

February 24, 2024

September 26, 2023

Comparing truthful reporting, intent alignment, agency preservation and value identification

February 24, 2024

September 24, 2023

Building brakes for a speeding car: A global coordination proposal for AI safety

Charles Martinet, Blanche Freudenreich, Henry Papadatos, Manuel Bimich

February 24, 2024

Embedding and Transformer Synthesis

February 24, 2024

MAXIAVELLI: Thoughts on improving the MACHIAVELLI benchmark

Roman Leventov, Jason Hoelscher-Obermaier

February 24, 2024

Reverse Word Wizards: Pitting Language Models Against the Art of Reversal

Ingrid Backman, Asta Rassmussen, Klara Nielsen

February 24, 2024

February 16, 2023

Counting Letters, Chaining Premises & Solving Equations: Exploring Inverse Scaling Problems with GPT-3

D. Chipping, J. Harding, H. Mannering, P. Selvaraj

February 24, 2024

December 19, 2022

All Fish are Trees

February 24, 2024

October 18, 2022

Scoped LLM: Enhancing Adversarial Robustness and Security Through Targeted Model Scoping

Adriano, David Baek, Erik Nordby, Emile Delcourt

January 27, 2025

AI Safety Entrepreneurship Hackathon

Steering Swiftly to Safety with Sparse Autoencoders

Agatha Duzan, Guillaume Martres, Syrine Noame, Abhinand Shibu, Flavia Wallenhorst, Arthur Wuhrmann

December 3, 2024

Reprogramming AI Models Hackathon

November 10, 2024

Agent Security Hackathon

Phish Tycoon: phishing using voice cloning

Craig Albuquerque, Melwina Albuquerque

November 10, 2024

AI capabilities and risks demo-jam: Creating visceral interactive demonstrations

PurePrompt - An easy tool for prompt robustness and eval augmentation

November 10, 2024

Research Augmentation Hackathon: Supercharging AI Alignment

Understanding Incentives To Build Uninterruptible Agentic AI Systems

Damin Curtis (M.A. International Affairs), Norman Piotriowski (B.Sc. Data Science)

October 29, 2024

AI Policy Hackathon at Johns Hopkins University

Identity System for AIs

September 5, 2024

Hackathon for Technical AI Safety Startups

Modelling the oversight of automated interpretability against deceptive agents on sparse autoencoders

Simon Lermen, Mateusz Dziemian

Deception Detection Hackathon: Preventing AI deception

Benchmarking Dark Patterns in LLMs

Jord Nguyen, Akash Kundu, Sami Jawhar

AI Security Evaluation Hackathon: Measuring AI Capability

Unleashing Sleeper Agents

Nora Petrova, Jord Nguyen

AI and Democracy Hackathon: Demonstrating the Risks

Towards High-Quality Model-Written Evaluations

Jannes Elstner, Jaime Raldua Veuthey

February 24, 2024

November 27, 2023

Second-order Jailbreaks

Mikhail Terekhov, Romain Graux, Denis Rosset, Eduardo Neville, Gabin Kolly

February 24, 2024

October 2, 2023

ILLUSION OF CONTROL

February 24, 2024

September 25, 2023

Agency, value and empowerment.

Benjamin Sturgeon, Leo Hyams

February 24, 2024

September 24, 2023

Alvin Ånestrand, Matthias Endres, Harry Powell, Chris Lonsberry

February 24, 2024

Interpreting Planning in Transformers

Victor Levoso Fernandez, Abhay Sheshadri

February 24, 2024

Exploitation of LLMs to Elicit Misaligned Outputs

Desik Mandava, Jayanth Santosh, Aishwarya Gurung

February 24, 2024

Improving TransformerLens Head Detector

Mateusz Bagiński, Jay Bailey

February 24, 2024

Soft Prompts are a Convex Set

Amir Sarid, Bary Levy, Dan Barzily, Edo Arad, Gal Hyams, Geva Kipper, Guy Dar, Itay Yona, Yossi Gandelsman

February 24, 2024

January 25, 2023

Trojan detection and implementation on transformers

Clément Dumas, Charbel-Raphaël Segerie, Liam Imadache

February 24, 2024

December 19, 2022

Probing Conceptual Knowledge on Solved Games

Amir Sarid, Bary Levy, Dan Barzilay, Edo Arad, Itay Yona, Joey Geralnik

February 24, 2024

November 15, 2022

Reducing hindsight neglect with "Let's think step by step"

February 24, 2024

October 18, 2022

HITL For High Risk AI Domains

Tyler Edwards, Sruthi Kuriakose, Subramanyam Sahoo, Shamith Achanta

January 27, 2025

AI Safety Entrepreneurship Hackathon

Towards an Agent Marketplace for Alignment Research (AMAR)

January 27, 2025

AI Safety Entrepreneurship Hackathon

AI ADVISORY COUNCIL FOR SUSTAINABLE ECONOMIC GROWTH AND ETHICAL INNOVATION IN THE DOMINICAN REPUBLIC (CANIA)

Said Saillant, Kay Kozaronek, Shaun Pexton, Cyra Alesha, Elise Racine

October 29, 2024

AI Policy Hackathon at Johns Hopkins University

Multifaceted Benchmarking

Eduardo Neville, George Golynskyi, Tetra Jones

February 24, 2024

November 27, 2023

Exploring multi-agent interactions in the dollar auction

Thomas Broadley, Allison Huang

February 24, 2024

October 2, 2023

AutoAdminsteredAntidotes: Circuit detection in a poisoned model for MNIST classification

Kola Ayonrinde, Denizhan “Dennis” Akar, Kitti Kovács, Adam Newgas, David Quarel

February 24, 2024

Reasoning with Chain of Thought

February 24, 2024

October 18, 2022

Modernizing DC’s Emergency Communications

Thane Douglass, Anuoluwapo Soneye

November 26, 2024

AI Policy Hackathon at Johns Hopkins University

Wording influences truthfulness

February 27, 2024

October 18, 2022

Othello Mechint playground

Victor Levoso Fernandez, Edoardo Pona, Abhay Sheshadri, Kunvar

February 24, 2024

Detecting Phase Transitions

Jesse Hoogland, Lucas Texeira, Benjamin Gerraty, Rumi Salazar, Samuel Knoche

February 24, 2024

Simulating an Alien

February 24, 2024

October 18, 2022

Exploring OthelloGPT

February 24, 2024

Securing AGI Deployment and Mitigating Safety Risks

January 19, 2025

AI Safety Entrepreneurship Hackathon

January 19, 2025

AI Safety Entrepreneurship Hackathon

VaultX - AI-Driven Middleware for Real-Time PII Detection and Data Security

Diego Sabajo, Egbert Stuger

January 19, 2025

AI Safety Entrepreneurship Hackathon

Safe.ai: AI Agent Risk Assessment Platform

Aryan Thakar, Liam Ho

January 19, 2025

AI Safety Entrepreneurship Hackathon

Safe.ai: AI Agent Risk Assessment Platform

Ho Wei Liang, Aryan Thakar

January 19, 2025

AI Safety Entrepreneurship Hackathon

January 19, 2025

AI Safety Entrepreneurship Hackathon

LLM-prompt-optimiser based SAAS platform for evaluations

Anton Zheltoukhov, Iulia Levin

January 19, 2025

AI Safety Entrepreneurship Hackathon

Navigating the AGI Revolution: Retraining and Redefining Human Purpose

William Zen, Kushal Agrawal, Stefano Orsini, Manoj S

January 19, 2025

AI Safety Entrepreneurship Hackathon

AI Safety Evaluation – Benchmarking Framework

Maha Vishnu Sura

January 19, 2025

AI Safety Entrepreneurship Hackathon

RestriktAI: Enhancing Safety and Control for Autonomous AI Agents

João Lucas Duim, Juan Belieni

January 19, 2025

AI Safety Entrepreneurship Hackathon

Kailash Balasubramaniyam, Mohammed Areeb Aliku, Eden Simkins

January 19, 2025

AI Safety Entrepreneurship Hackathon

Enhancing human intelligence with neurofeedback

Thomas Rialan, Nicole Dobreva, Anthony Kirilov

January 19, 2025

AI Safety Entrepreneurship Hackathon

Building Bridges for AI Safety: Proposal for a Collaborative Platform for Alumni and Researchers

January 19, 2025

AI Safety Entrepreneurship Hackathon

SAGE: Safe, Adaptive Generation Engine for Long Form Document Generation in Collaborative, High Stakes Domains

Abrar Rahman, Anish Sundar

December 6, 2024

Reprogramming AI Models Hackathon

Chloe Anderson, Mashrur Wasek, Aidan Schurr, Chris Gagnon

December 6, 2024

AI Policy Hackathon at Johns Hopkins University

Glia for Healthcare Organisations

Chloe Anderson, Mashrur Wasek, Aidan Schurr, Chris Gagnon

December 6, 2024

AI Policy Hackathon at Johns Hopkins University

Analyzing Dataset Bias with SAEs

Nick Jiang, Joseph Tey

November 25, 2024

Reprogramming AI Models Hackathon

Investigate arithmetic features in Multi-lingual LLMs

Akash Kundu, Ashish Rai, Suhas K R

November 25, 2024

Reprogramming AI Models Hackathon

Bias Mitigation in LLM by Steering Features

Akanksha Devkar

November 25, 2024

Reprogramming AI Models Hackathon

Faithful or Factual? Tuning Mistake Acknowledgment in LLMs

Daniel Donnelly, Mia Hopman, Jack Wittmayer

November 24, 2024

Reprogramming AI Models Hackathon

Improving Llama-3-8B-Instruct Hallucination Robustness in Medical Q&A Using Feature Steering

Diego Sabajo, Eitan Sprejer, Matas Zabaljauregui, Oliver Morris

November 24, 2024

Reprogramming AI Models Hackathon

Unveiling Latent Beliefs Using Sparse Autoencoders

Carlos Cortez, Eivind Otto Hjelle, Sanchit Kalhan

November 24, 2024

Reprogramming AI Models Hackathon

Can we steer a model’s behavior with just one prompt? Investigating SAE-driven auto-steering

Nicole Nobili, Davide Ghilardi, Wen Xing

November 24, 2024

Reprogramming AI Models Hackathon

Improving Llama-3-8b Hallucination Robustness in Medical Q&A Using Feature Steering

Diego Sabajo, Eitan Sprejer, Matas Zabaljauregui, Oliver Morris

November 24, 2024

AI Policy Hackathon at Johns Hopkins University

Sparse Autoencoders and Gemma 2-2B: Pioneering Demographic-Sensitive Language Modeling for Opinion QA

November 24, 2024

Reprogramming AI Models Hackathon

Assessing Language Model Cybersecurity Capabilities with Feature Steering

November 24, 2024

Reprogramming AI Models Hackathon

Math Speaks All Languages: Enhancing LLM Problem-Solving Across Multilingual Contexts

Maksim Kostritsya, Kseniia Kuvshinova, Rauf Parchiev, Konstantin Polev

November 24, 2024

Reprogramming AI Models Hackathon

Edufire - Personalized Education Platform Using LLM Steering

November 24, 2024

Reprogramming AI Models Hackathon

Explaining Latents in Turing-LLM-1.0-254M with Pre-Defined Function Types

Daniel Davies, Ashwarya Maratha

November 24, 2024

Reprogramming AI Models Hackathon

Tentative proposal for AI control with weak supervisors through Mechanistic Inspection

November 24, 2024

Reprogramming AI Models Hackathon

Clear Thought and Clear Speech: Reducing Grammatical Scope Ambiguity

November 24, 2024

Reprogramming AI Models Hackathon

Joey SKAF, Mickaël Boillaud, Thaïs Distinguin

November 24, 2024

Reprogramming AI Models Hackathon

Investigating Feature Effects on Manipulation Susceptibility

Nishchal Prabhakar, Stefan Trnjakov, Mo Aziz

November 24, 2024

Reprogramming AI Models Hackathon

Let LLM Agents Perform LLM Surgery

Sharat Jacob Jacob

November 24, 2024

Reprogramming AI Models Hackathon

Feature Tuning versus Prompting for Ambiguous Questions

Elis Grahn, Axel Ahlqvist, Elliot Gestrin, Hemming Gong

November 24, 2024

Reprogramming AI Models Hackathon

Auto Prompt Injection

Yingjie Hu, Daniel Williams, Carmen Gavilanes, William Hesslefors Nairn

November 24, 2024

Reprogramming AI Models Hackathon

Feature based unlearning

Patrick Quinn, Yucheng Sun

November 24, 2024

Reprogramming AI Models Hackathon

Recovering Goodfire's SAE feature vectors from their API

Lovkush Agarwal

November 24, 2024

Reprogramming AI Models Hackathon

Encouraging Chain-of-Thought Reasoning

Shreyans Jain, Thomas Walker, Kutay Buyruk, Soumyadeep Bose

November 24, 2024

Reprogramming AI Models Hackathon

User Transparency Within AI

Jonathan King, Robert Hardy, Jeremiah Bailey, Amir Abdulgadir

November 20, 2024

Howard University AI Safety Summit & Policy Hackathon

Community-First: A Rights-Based Framework for AI Governance in India's Welfare Systems

Sanjnah Ananda Kumar

November 20, 2024

Howard University AI Safety Summit & Policy Hackathon

National Data Privacy and Governance Act

November 20, 2024

Howard University AI Safety Summit & Policy Hackathon

Implementing a Human-centered AI Assessment Framework (HAAF) for Equitable AI Development

November 20, 2024

Howard University AI Safety Summit & Policy Hackathon

A Critical Review of "Chips for Peace": Lessons from "Atoms for Peace"

Amritanshu Prasad

November 20, 2024

Howard University AI Safety Summit & Policy Hackathon

AI Monitoring as a Rapid and Scalable Policy Solution: Weekly Global Bulletins on AI Developments

November 20, 2024

Howard University AI Safety Summit & Policy Hackathon

Grandfather Paradox in AI – Bias Mitigation & Ethical AI

Maha Vishnu Sura

November 20, 2024

Howard University AI Safety Summit & Policy Hackathon

A Fundamental Rethinking to AI Evaluations: Establishing a Constitution-Based Framework

Arrow Paquera and Paul Ivan Enclonar

November 20, 2024

Howard University AI Safety Summit & Policy Hackathon

Hero Journey: Personalized Health Interventions for the Incarcerated

Anthony Li, Samuel Ntow, Antonio Bandeira, Niroj Bhandari

October 29, 2024

AI Policy Hackathon at Johns Hopkins University

Anurag Dhungana, Prakriti Bista and Sunil Shah

October 27, 2024

AI Policy Hackathon at Johns Hopkins University

Policy Analysis: AI and Sustainability: Climate Impact Monitoring

Parikirt Oggu & Shawn Reginauld

October 27, 2024

AI Policy Hackathon at Johns Hopkins University

Pratham Ashar and Vir Trivedi

October 27, 2024

AI Policy Hackathon at Johns Hopkins University

Digital Rebellion: Analyzing misaligned AI agent cooperation for virtual labor strikes

Michael Andrzejewski, Melwina Albuquerque

October 27, 2024

AI Policy Hackathon at Johns Hopkins University

Next-Gen AI-Enhanced Epidemic Intelligence

Axby Loh, Waikit Fung, Anthony Li

October 27, 2024

AI Policy Hackathon at Johns Hopkins University

AI and Public Health: TSA Pre Health Check

Jacob Lin, Habib Aina

October 27, 2024

AI Policy Hackathon at Johns Hopkins University

Mapping Intent: Documenting Policy Adherence with Ontology Extraction

Alejandra de Brunner, Mia Hopman, Jack Wittmayer

October 27, 2024

AI Policy Hackathon at Johns Hopkins University

Sachin Kumar, Anitej Suklikar, Samarth Parekh, Roshni Kainthan

October 27, 2024

AI Policy Hackathon at Johns Hopkins University

Towards a Unified Framework for Cybersecurity and AI Safety: Recommendations for Secure Development of Large Language Models

Lexley Maree Villasis, Srishti Dutta, Yohan Mathew

October 27, 2024

AI Policy Hackathon at Johns Hopkins University

Enviro - A Comprehensive Environmental Solution Using Policy and Technology

Arun Nimmagadda, Sohil Shah, Tushar Gidadhubli, Arnav Patel

October 27, 2024

AI Policy Hackathon at Johns Hopkins University

Enhancing Human Verification Systems to Address AI Agent Circumvention and Attributability Concerns

Yogev Angelovici, Anish Ganga, Saathvik Kannan, Zihao Zhou

October 27, 2024

AI Policy Hackathon at Johns Hopkins University

Politicians on AI Safety

Liam Robins, Elise Racine, Manikanta Revuri, Bhanu Reddy

October 27, 2024

AI Policy Hackathon at Johns Hopkins University

Policy Framework for Sustainable AI: Repurposing Waste Heat from Data Centers in the USA

October 27, 2024

AI Policy Hackathon at Johns Hopkins University

Predictive Analytics & Imagery for Environmental Monitoring

Shambhavi Adhikari, Yeji Kim, Dilrose Karakattil

October 27, 2024

AI Policy Hackathon at Johns Hopkins University

Proposal for U.S.-China Technical Cooperation on AI Safety

Angel Shen, Raghav Akula

October 27, 2024

AI Policy Hackathon at Johns Hopkins University

Proposal for a Provisional FDA Designation Targeting Biomedical Products Evaluated with Novel Methodologies

Gerard Boxo Corominas, Lucia Tortosa Nesterovich

October 27, 2024

AI Policy Hackathon at Johns Hopkins University

Infectious Disease Outbreak Prediction and Dashboard

Sukanya Krishna, Nikhil Dhanankam, Joyanta Jyoti Mondal

October 27, 2024

AI Policy Hackathon at Johns Hopkins University

Cross-model surveillance for emails handling

October 7, 2024

Agent Security Hackathon

Inference-Time Agent Security

October 6, 2024

Agent Security Hackathon

Intent Inspector - Protecting Against Prompt Injections for Agent Tool Misuse

Oliver Morris, Gerard Boxo Corominas

October 6, 2024

Agent Security Hackathon

October 6, 2024

Agent Security Hackathon

AI Agent Capabilities Evolution

Ekaterina Krupkina

October 6, 2024

Agent Security Hackathon

An Autonomous Agent for Model Attribution

October 6, 2024

Agent Security Hackathon

Using ARC-AGI puzzles as CAPTCHA task

Mikolaj Kniejski

October 6, 2024

Agent Security Hackathon

LLM Agent Security: Jailbreaking Vulnerabilities and Mitigation Strategies

Mohammed Arsalan, Vishwesh Bhat

October 6, 2024

Agent Security Hackathon

Mechanisms of Causal Reasoning

Ben Sturgeon, Jacy Reese Anthis, Mark Chimes, Sky Cope

February 24, 2024

November 15, 2022

Interpretability at a glance

Henrik Åslund, Harleen Hanspal, Avinash Kori

February 24, 2024

November 15, 2022

Interpreting Catastrophic Failure Modes in OpenAI’s Whisper

Edward Rees, John Hughes, Ellena Reid

February 24, 2024

November 15, 2022

How to find the minimum of a list - Transformer Edition

Ole, Stefan, Ayham, Devesh, Joe

February 24, 2024

November 15, 2022

Finding unusual neuron sets by activation vector distance

February 24, 2024

November 15, 2022

Caught Red-Bandit

Víctor Levoso, Alejandro González, Richard Annilo & Theresa Thoraldson

February 24, 2024

November 15, 2022

An Informal Investigation of Indirect Object Identification in Mistral GPT2-Small Battlestar

February 24, 2024

November 15, 2022

An Intuitive Logic for Understanding Autoregressive Language Models

Adnan Ben Mansour, Gaia Carenini, Alexandre Duplessis

February 24, 2024

November 15, 2022

Alignment Jam: Gradient-based Interpretability of Quantum-inspired neural networks

Antoine Debouchage

February 24, 2024

November 15, 2022

Algorithmic bit-wise boolean task on a transformer

Catalin Mitelut, Jeremy Scheurer, Lukas Petersson, Javier Rando

February 24, 2024

November 15, 2022