I'm sorry, but those are vanity evals

by izzymiller on 4/14/2025, 5:42 PM with 1 comment

by izzymiller on 4/14/2025, 6:04 PM

Stoked to publish some of our private eval results and a bit of the behind-the-scenes of our framework! We've been using this approach for almost a year and have found it extremely high leverage for making meaningful improvements to the AI parts of our product.