DIY LLM Evaluation, a Case Study of Rhyming in ABBA Schema
Xebia
MAY 8, 2024
As Andrej Karpathy, a founding member of OpenAI, once said on Twitter: “I pretty much only trust two LLM evals right now: Chatbot Arena and the r/LocalLlama comments section.” Chatbot Arena is a website where you submit a prompt, see responses from two models, and choose the better one. I’ve been obsessed with getting AI to rhyme for years.