This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.
Accepted at the Interpretability research sprint on July 17, 2023

DPO vs PPO comparative analysis

We perform a comparative analysis of the DPO (Direct Preference Optimization) and PPO (Proximal Policy Optimization) algorithms, using interpretability techniques to try to understand the differences between the two.

By Rauno Arike, Luke Marks, Amir Abdullah, Luna Mendez
