This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.
Accepted at the Agent Security Hackathon research sprint on October 7, 2024

An Autonomous Agent for Model Attribution

As LLM agents become more prevalent and powerful, the ability to trace fine-tuned models back to their base models is increasingly important for liability, IP protection, and detecting potential misuse. However, model attribution must often be done in a black-box setting, since adversaries may restrict direct access to model internals. The problem remains a neglected but critical area of AI security research. To date, most approaches have relied on manual analysis rather than automated techniques, which limits their applicability. Our approach aims to address these limitations by leveraging the advanced reasoning capabilities of frontier LLMs to automate the model attribution process.

By Jord Nguyen
