From weekend demo to a support assistant trusted around the clock.

Client: B2B SaaS, ~80 employees
Engagement: POC to Production, 6 weeks
Stack: Claude, retrieval, evals, observability
63% faster median first response across all support channels
24/7 coverage with no increase in headcount
0 critical incidents in the first quarter live

The problem

A promising prototype that nobody could trust in production.

The team had built a Claude-based assistant that drafted replies to customer tickets. In demos it was impressive. In practice, it was unpredictable on unusual tickets, the team had no way to measure whether it was improving, and no one had visibility once a reply went out.

Leadership wanted the efficiency the prototype hinted at, but couldn't put an unmonitored system in front of customers. They had no AI engineer on staff and no budget to hire one full-time.

What we did

We engineered the demo into a system with guardrails and eyes.

We started with their existing prototype and built an evaluation suite around it, scoring drafts against a graded set of real historical tickets. That gave everyone a number to trust and a safety net against regressions.
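For readers who want a feel for what such a suite involves, here is a minimal sketch in Python. It is not the client's implementation: the `GradedTicket` structure, the "required points" scoring rule, and the regression tolerance are all assumptions chosen for illustration. Real graded sets often rely on human or model-based grading rather than substring checks.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable


@dataclass
class GradedTicket:
    """One historical ticket plus the points a good reply must cover (hypothetical scheme)."""
    ticket: str
    required_points: list[str]


def score_draft(draft: str, required_points: list[str]) -> float:
    """Return the fraction of required points mentioned in the draft (case-insensitive)."""
    if not required_points:
        return 1.0
    text = draft.lower()
    hits = sum(1 for point in required_points if point.lower() in text)
    return hits / len(required_points)


def run_eval(draft_fn: Callable[[str], str], graded_set: list[GradedTicket]) -> float:
    """Average score of draft_fn over the whole graded set."""
    return mean(score_draft(draft_fn(t.ticket), t.required_points) for t in graded_set)


def passes_regression_gate(candidate: float, baseline: float, tolerance: float = 0.02) -> bool:
    """Allow a change only if the score has not dropped more than `tolerance` below baseline."""
    return candidate >= baseline - tolerance
```

In a setup like this, `draft_fn` would wrap whatever produces the assistant's draft, and the gate would run before any prompt or retrieval change ships.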

We added retrieval from their knowledge base, confidence thresholds that route uncertain tickets to humans, full tracing of every interaction, and token budgets that kept costs flat as volume grew. Then we documented all of it and trained their team to run it.
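The routing idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the production logic: the `Draft` fields, the queue names, the 0.8 confidence threshold, and the 1,500-token budget are hypothetical, and how a confidence value is produced is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Draft:
    ticket_id: str
    text: str
    confidence: float  # 0.0 to 1.0, assumed to come from an upstream scorer
    tokens_used: int   # tokens spent producing this draft


@dataclass
class RoutingDecision:
    ticket_id: str
    queue: str   # "standard" or "human_review" (hypothetical queue names)
    reason: str  # recorded so the decision can be traced later


def route(draft: Draft, min_confidence: float = 0.8, max_tokens: int = 1500) -> RoutingDecision:
    """Flag drafts that exceed the token budget or fall below the confidence threshold."""
    if draft.tokens_used > max_tokens:
        return RoutingDecision(draft.ticket_id, "human_review", "over_token_budget")
    if draft.confidence < min_confidence:
        return RoutingDecision(draft.ticket_id, "human_review", "low_confidence")
    return RoutingDecision(draft.ticket_id, "standard", "ok")
```

Returning a reason alongside the queue means every decision can be written to a trace, so a flagged ticket can be explained after the fact.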

The outcome

A system the team owns — and keeps improving without us.

Within six weeks the assistant was handling first-draft replies across every channel, with humans reviewing only the cases the system flagged. Median first response time dropped by 63%, nearly two-thirds.

Crucially, the team now operates it themselves: the evals catch problems before customers do, the dashboards show what's happening, and improvements ship on their schedule, not ours.

"We didn't just get a working system. We got a team that finally understands what we're running and can improve it themselves. That's the part we didn't know to ask for."
VP of Customer Experience, B2B SaaS client

Have a prototype that needs to perform?

We'll give you an honest read on what production will take, and a fixed-scope proposal if we're a fit.

Start a conversation →