LLMs Demand Observability-Driven Development
Honeycomb
SEPTEMBER 20, 2023
There is a much longer list of things that make software less than 100% debuggable in practice. Some are related to cost/benefit tradeoffs, but most come down to weak telemetry, instrumentation, and tooling. ML teams typically build evaluation systems to measure the effectiveness of the model or prompt.