Jan 20, 2025
LLM-prompt-optimiser based SAAS platform for evaluations
Anton ZHeltoukhov, Iulia Levin
An LLM evaluation SaaS platform built around a model-based prompt optimiser.
This is a solid start, but it feels like a rough draft. The authors have clearly identified a problem and proposed a solution, but they could benefit from more detail and a more polished presentation.
This is definitely a possible solution if you can get access to the prompts and evals from all the top-tier labs. However, it seems rather odd if it relies purely on what they have already created. Additionally, it might be hard to create standards for constantly changing questions.
The focus on improving evaluation quality through automated prompt optimization is interesting but may be too narrow in scope.
(1) Technical approach is sound but limited in scope.
(2) Addresses important but narrow aspect of safety evaluation.
(3) Implementation details need significant expansion.
Cite this work
@misc{
  title={LLM-prompt-optimiser based SAAS platform for evaluations},
  author={Anton ZHeltoukhov, Iulia Levin},
  date={1/20/25},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}


