Where Scripts Fail and AI Could Care
- Wesley Brach
- May 2
Rethinking Empathy and Authority in Customer Service
I recently spent the better part of multiple afternoons wrestling with a slice of the American healthcare bureaucracy. On the surface, my problem was minor: a billing discrepancy that should have taken five minutes to untangle. Instead, I bounced from one “representative” to the next, each one reading the same canned lines as if we were trapped in an endlessly looping stage play.
“I’m sorry you’re experiencing frustration. Unfortunately, that’s our policy.”
Every rep sounded almost compassionate - just enough vocal fry to simulate concern - yet none of them had the authority to fix anything. Worse, they seemed terrified to acknowledge that the company might bear any responsibility. The moment I pushed for clarification, they retreated to their script like school kids hiding behind the teacher’s desk. By the end of the call I felt two things at once:
Exasperation - because I still didn’t have an answer.
A strange optimism - because I realized this is exactly the kind of problem AI could solve.
1. Frontier LLMs, Not Your Grandfather’s IVR
Before we go further, let’s clear up a common misconception: the AI I’m talking about is not the creaky “Press 1 for billing” IVR or the keyword‑matching chatbot that dies the moment you misspell a word. Those are rule‑based relics. What changes the game is the new class of frontier large language models (LLMs) - systems like OpenAI’s o3. (Side note: calling o3 a frontier model is accurate at the time of publishing; given the pace of AI advancement, that may sound quaint a relatively short time later.) If you’ve only tried a free tool like ChatGPT or a smaller model like GPT‑4o mini (already impressive) and haven’t pitted it against a frontier model, you’re missing the magic. The leap feels less like upgrading to a faster computer and more like switching from Morse code to a live interpreter who gets the nuance, context, and emotion behind your words. These models can:
Parse tangled policy documents on the fly
Recognize contradictions a human agent might miss
Respond with prose that sounds less like a latex robot and more like someone who actually slept last night.
That’s the foundation for everything that follows.
2. The Real Gap Isn’t Technology, It’s Permission
People often ask, “If empathy is so important, why don’t humans just escalate these issues now?” The cynical response is fear: fear of saying the wrong thing, creating liability, or breaking protocol. But there’s also a practical reason: front‑line workers are paid (and trained) to resolve 80 percent of calls with a standardized flowchart. Escalation is expensive. If every agent routed “edge cases” upstream, supervisors would drown.
So we get what feels like empathy theater: reps recite empathetic phrases while quietly ensuring nothing escapes the script. They’re not heartless; they’re bound by incentives that reward efficiency over resolution and, ironically, programmed as tightly as any algorithm. The difference? Their “source code” is a corporate handbook and a looming threat of write‑ups instead of a transparent prompt you can audit and improve.
3. How AI Could Restore the Human Element
That’s why my mind jumped to artificial intelligence, not as a replacement for human agents, but as a smarter layer in front of them. Imagine calling the same healthcare line and being greeted by an AI assistant with three explicit directives in its system prompt:
Authentic Empathy
Actively reflect the caller’s emotion and context.
Resist hollow apologies in favor of meaningful questions (“Can you tell me more about how this billing error is affecting you right now?”).
Contradiction Detection
Cross‑check the policy database in real time.
Flag anything that conflicts - “Your plan can’t both cover and exclude that procedure.”
Judicious Escalation
Escalate only when the AI predicts the caller’s issue cannot be resolved within the first‑tier policy set.
Transfer with a structured summary so the human specialist starts on third base instead of home plate.
The AI becomes a triage nurse for customer service: filtering the obvious, surfacing the anomalies, and handing humans the cases that actually require judgment, discretion, or negotiated exceptions. Everyone stays in their lane, and - crucially - the system now rewards acknowledging contradictions instead of ignoring them.
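For the curious, here is roughly what those three directives could look like wired into a real call. This is a minimal sketch, assuming the OpenAI Python SDK; the model name, the prompt wording, and the triage_turn helper are illustrative placeholders, not a production design.

```python
# Minimal sketch of a three-directive triage layer using the OpenAI Python SDK.
# The model name, prompt text, and helper wiring are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

TRIAGE_SYSTEM_PROMPT = """\
You are a first-tier customer-service assistant for a health plan.
Follow three directives, in this order of priority:
1. Authentic empathy: reflect the caller's stated emotion and situation,
   and ask meaningful questions instead of issuing hollow apologies.
2. Contradiction detection: check every answer against the policy excerpts
   provided in context; if two clauses conflict, say so and quote both.
3. Judicious escalation: if the issue cannot be resolved under first-tier
   policy, produce a structured hand-off summary (caller goal, facts
   gathered, conflicting clauses, recommended next step).
"""

def triage_turn(caller_message: str, policy_excerpts: str) -> str:
    """Run one turn of the triage layer and return the assistant's reply."""
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; substitute whichever frontier model you use
        temperature=0.3,  # keep tone consistent from caller to caller
        messages=[
            {"role": "system", "content": TRIAGE_SYSTEM_PROMPT},
            {"role": "system", "content": f"Relevant policy excerpts:\n{policy_excerpts}"},
            {"role": "user", "content": caller_message},
        ],
    )
    return response.choices[0].message.content
```

The specific wording matters less than the fact that the agent’s “source code” is now a prompt you can read, audit, and revise.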
4. “But I Don’t Want a Robot Making Decisions About My Health!”
I hear you. The reflexive fear is that AI could deny coverage the way a script‑bound human sometimes does. Yet the uncomfortable truth is that we already trust critical decisions to humans who are themselves highly programmed. They memorize disclaimers, follow branching logic trees, and are disciplined for stepping outside the lines. If that’s not a form of artificial intelligence, I’m not sure what is. Frontier LLMs add three safeguards humans can’t promise at scale:
Transparency: The decision logic can be logged, audited, and improved.
Consistency: The same scenario triggers the same outcome, not the mood of the agent or whether their coffee kicked in.
Instant Learning: Fix the prompt or the training data once and you’ve upgraded every interaction.
That doesn’t mean humans vanish; it means we finally deploy them where discretion and empathy genuinely matter.
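To make the transparency claim concrete: a decision log can be as simple as one structured record per automated outcome, appended to a file an auditor can replay. A minimal sketch follows; every field name is an assumption for illustration, not a standard schema.

```python
# Minimal sketch of an auditable decision record. Field names are assumptions
# made for illustration, not a standard schema.
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class TriageDecision:
    call_id: str
    caller_issue: str            # short summary of what the caller asked for
    policy_clauses_cited: list   # which clauses the model relied on
    contradiction_found: bool    # did the contradiction-detection directive fire?
    escalated: bool              # did the escalation directive fire?
    model_reply: str             # the exact text sent back to the caller
    timestamp: float

def log_decision(decision: TriageDecision, path: str = "triage_audit.jsonl") -> None:
    """Append one decision as a JSON line so auditors can replay any call."""
    with open(path, "a") as f:
        f.write(json.dumps(asdict(decision)) + "\n")

log_decision(TriageDecision(
    call_id="demo-001",
    caller_issue="Billing code charged twice on the March statement",
    policy_clauses_cited=["Section 12.3"],
    contradiction_found=True,
    escalated=True,
    model_reply="I'm flagging a conflict between Section 12.3 and your statement...",
    timestamp=time.time(),
))
```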
5. Squeezing Gold from the “Calls May Be Recorded” Mine
One more missed opportunity: nearly every company records calls “for quality assurance,” yet most of that audio ends up in cold storage. A frontier LLM could devour those transcripts daily, spotting patterns and insights that legacy text‑analytics tools never noticed:
Policy clauses that confuse 30 percent of callers
Repeat disputes tied to a single billing code
New slang for which the script has zero response
Feed that feedback loop into product, policy, and training updates, and you graduate from “quality assurance” theater to genuine continuous improvement.
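As a rough illustration of that mining loop, a nightly job could batch the day’s transcripts through the same class of model and ask for recurring friction points. A sketch assuming the OpenAI Python SDK; the folder layout, model name, and prompt are invented for illustration, and a real pipeline would chunk transcripts to fit the context window.

```python
# Rough sketch of mining recorded-call transcripts for recurring friction.
# Assumes the OpenAI Python SDK; the directory layout, model name, and prompt
# wording are illustrative assumptions. A real pipeline would chunk the batch.
from pathlib import Path
from openai import OpenAI

client = OpenAI()

MINING_PROMPT = (
    "You will receive a batch of customer-call transcripts. Identify: "
    "(1) policy clauses callers repeatedly misunderstand, "
    "(2) billing codes tied to repeat disputes, "
    "(3) caller phrasing the current script has no response for. "
    "Return a short bulleted summary with rough counts."
)

def summarize_transcripts(folder: str = "transcripts/today") -> str:
    """Concatenate a day's transcripts and ask the model for recurring patterns."""
    batch = "\n\n---\n\n".join(p.read_text() for p in Path(folder).glob("*.txt"))
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder for whichever frontier model you use
        messages=[
            {"role": "system", "content": MINING_PROMPT},
            {"role": "user", "content": batch},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(summarize_transcripts())
```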
6. Beyond the Call Center
Healthcare isn’t the only sector ripe for this shift. Airlines, banks, even your local utilities are running the same empathy theater. Plug in an AI layer with the right guardrails, and suddenly we have a customer‑service stack where empathy scales and authority aligns with competence rather than tenure.
7. A Call to Prototype
This isn’t science fiction. The underlying models already exist; what’s missing is the will to redesign the workflow. So here’s my challenge - to myself, to product leaders, and to any organization still forcing bright, capable humans to read from a laminated cue card:
Audit your escalation data. How many calls should go up‑line but don’t?
Deploy a frontier LLM triage layer. Start with the three directives above.
Measure genuine resolution, not handle time. If average call length rises but repeat calls drop, you’re on the right track.
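That last metric is a few lines of arithmetic over whatever call log you already keep. A toy sketch, assuming pandas, made-up column names, and a seven-day repeat window:

```python
# Toy sketch of the "resolution over handle time" check. Column names and the
# seven-day repeat window are assumptions; adapt to your own call-log schema.
import pandas as pd

calls = pd.read_csv("call_log.csv", parse_dates=["started_at"])

# Average handle time, in minutes.
avg_handle_min = calls["duration_seconds"].mean() / 60

# A call counts as a "repeat" if the same account called again within 7 days.
calls = calls.sort_values(["account_id", "started_at"])
gap = calls.groupby("account_id")["started_at"].diff()
repeat_rate = (gap < pd.Timedelta(days=7)).mean()

print(f"Average handle time: {avg_handle_min:.1f} min")
print(f"Repeat-call rate within 7 days: {repeat_rate:.1%}")
# Rising handle time alongside a falling repeat rate is the pattern to look for.
```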
Because the true test of innovation isn’t whether it dazzles on a slide deck; it’s whether the next caller hangs up feeling heard, helped, and - dare I say - human again.