Quick answer: A production-quality customer-support AI in 2026 combines six patterns: retrieval scoping, reply-style conditioning, structured answer + citation, an explicit escalation contract, a hallucination guard, and a user-sentiment override. Skipping any one of them can be the difference between a 40% and a 75% deflection rate. None is exotic; each is a handful of lines of prompt.

The six patterns

1. Retrieval scoping

Do not dump the top 10 chunks into the prompt. Retrieve, rerank, keep the top 3, and include a per-chunk confidence score. The model reasons better over fewer, higher-quality passages. Give each chunk an id so the answer can cite it.
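As a rough illustration of the scoping step, the sketch below assumes the reranker has already attached a score in the range 0 to 1 to each chunk. The Chunk type, scope_chunks, and format_context are hypothetical names chosen for this example, not part of any particular retrieval library.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_id: str   # stable id the model can cite
    text: str
    score: float    # reranker confidence, assumed to be in [0, 1]


def scope_chunks(reranked: list[Chunk], k: int = 3) -> list[Chunk]:
    """Keep only the k highest-scoring chunks."""
    return sorted(reranked, key=lambda c: c.score, reverse=True)[:k]


def format_context(chunks: list[Chunk]) -> str:
    """Render chunks for the prompt, each with its id and confidence."""
    return "\n\n".join(
        f"[{c.chunk_id}] (confidence {c.score:.2f})\n{c.text}" for c in chunks
    )
```

Rendering the id in square brackets next to each passage gives the model something concrete to copy into its citations.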

2. Reply-style conditioning

Specify tone, length, and formatting once per prompt. “Reply in the user's language. Maximum 120 words. If you use steps, use numbered lists. Never use exclamation marks.” Consistency raises perceived quality dramatically.
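One way to keep the style rules in a single place is to define them once and append them to every system prompt. The sketch below is an assumption about how this could be wired up; build_system_prompt and violates_style are hypothetical helpers, and the post-hoc check covers only the two rules that are cheap to verify in code.

```python
STYLE_RULES = (
    "Reply in the user's language.",
    "Maximum 120 words.",
    "If you use steps, use numbered lists.",
    "Never use exclamation marks.",
)


def build_system_prompt(task_instructions: str, style_rules=STYLE_RULES) -> str:
    """Append the style block exactly once to the task instructions."""
    style = "\n".join(f"- {rule}" for rule in style_rules)
    return f"{task_instructions}\n\nStyle rules:\n{style}"


def violates_style(reply: str, max_words: int = 120) -> list[str]:
    """Return the names of the mechanically checkable rules the reply breaks."""
    problems = []
    if len(reply.split()) > max_words:
        problems.append("too_long")
    if "!" in reply:
        problems.append("exclamation_mark")
    return problems
```

Logging the output of violates_style over real traffic is a cheap way to see whether the model actually follows the conditioning.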

3. Structured answer + citation

Force the output to be JSON with four fields: answer, citations: chunkId[], confidence, escalate: boolean. Parse it, render it, log it. Citations feed your analytics on which docs are actually being used.
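A minimal parsing sketch for that output, using only the standard library. The field names follow the list above; the SupportReply type, the parse_reply function, and the choice to reject citations to chunk ids that were never supplied are assumptions made for illustration.

```python
import json
from dataclasses import dataclass

REQUIRED_FIELDS = {"answer", "citations", "confidence", "escalate"}


@dataclass
class SupportReply:
    answer: str
    citations: list[str]
    confidence: float
    escalate: bool


def parse_reply(raw: str, known_chunk_ids: set[str]) -> SupportReply:
    """Parse and validate the model's JSON; raise ValueError on any problem."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    answer, citations = data["answer"], data["citations"]
    confidence, escalate = data["confidence"], data["escalate"]
    if not isinstance(answer, str):
        raise ValueError("answer must be a string")
    if not isinstance(citations, list) or not all(isinstance(c, str) for c in citations):
        raise ValueError("citations must be a list of chunk ids")
    unknown = [c for c in citations if c not in known_chunk_ids]
    if unknown:
        raise ValueError(f"cited chunks that were never provided: {unknown}")
    # bool is a subclass of int in Python, so exclude it explicitly
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0 <= confidence <= 1:
        raise ValueError("confidence must be between 0 and 1")
    if not isinstance(escalate, bool):
        raise ValueError("escalate must be a boolean")
    return SupportReply(answer, citations, float(confidence), escalate)
```

Treating a failed parse as an automatic handoff, rather than showing the raw text to the user, is one reasonable default.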

4. Explicit escalation contract

Tell the model exactly when to escalate: “legal questions”, “refund disputes over £100”, “mentions of suicide/self-harm”, “confidence < 0.6”. Ambiguity here is what makes chatbots hated.
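The contract can also be kept in code so that the prompt text and the runtime check share one definition. The sketch below is one possible arrangement, not a prescribed design: the rule wording is adapted from the examples above, and should_escalate is a hypothetical backstop that enforces the confidence threshold even if the model forgets to set the flag.

```python
ESCALATION_RULES = (
    "The user asks a legal question.",
    "The user disputes a refund over £100.",
    "The user mentions suicide or self-harm.",
)
CONFIDENCE_FLOOR = 0.6


def escalation_prompt(rules=ESCALATION_RULES, floor=CONFIDENCE_FLOOR) -> str:
    """Render the contract as a numbered list for the system prompt."""
    lines = [f"{i}. {rule}" for i, rule in enumerate(rules, start=1)]
    lines.append(f"{len(rules) + 1}. Your confidence is below {floor}.")
    return "Set escalate to true if ANY of these apply:\n" + "\n".join(lines)


def should_escalate(model_escalate: bool, confidence: float,
                    floor: float = CONFIDENCE_FLOOR) -> bool:
    """Escalate if the model asked to, or if confidence is under the floor."""
    return model_escalate or confidence < floor
```

The topical rules (legal, refunds, self-harm) still depend on the model's judgment; only the numeric threshold is enforced deterministically here.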

5. Hallucination guard

Include an explicit rule: “If the provided documents do not answer the user's question, reply ‘I don't have that information; let me transfer you.’ Do not invent facts.” Pair with an offline eval that seeds questions not in the corpus and measures hallucination rate.
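The offline eval can be as small as the sketch below. It assumes you have a list of questions known to be absent from the corpus and a callable, here the hypothetical answer_fn, that returns the bot's reply text for a question. Counting any reply that lacks the fallback sentence as a hallucination is a deliberately strict simplification; a real eval might use a grader instead of a substring match.

```python
from typing import Callable, Sequence

FALLBACK = "I don't have that information; let me transfer you."


def hallucination_rate(
    out_of_corpus_questions: Sequence[str],
    answer_fn: Callable[[str], str],
    fallback: str = FALLBACK,
) -> float:
    """Fraction of unanswerable questions where the bot did not fall back."""
    if not out_of_corpus_questions:
        raise ValueError("need at least one seeded question")
    hallucinated = sum(
        1 for q in out_of_corpus_questions if fallback not in answer_fn(q)
    )
    return hallucinated / len(out_of_corpus_questions)
```

Running this on every prompt change turns the guard from a hope into a regression test.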

6. User-sentiment override

If the user is angry or frustrated, tone matters more than information. Detect sentiment (either in-model or via a lightweight classifier) and branch the prompt: empathic short acknowledgement first, then resolution. Production deflection jumps 10-15 percentage points with this alone.
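The branching itself is simple once a sentiment signal exists. In the sketch below, is_frustrated is a crude keyword heuristic standing in for whatever classifier you actually use; the marker list, the function names, and the wording of the extra instruction are all assumptions for illustration.

```python
# Placeholder signal: swap in a real sentiment classifier in production.
NEGATIVE_MARKERS = ("angry", "furious", "ridiculous", "unacceptable", "useless", "worst")

EMPATHY_INSTRUCTION = (
    "The user is frustrated. Open with one short, sincere acknowledgement "
    "of the problem, then give the resolution."
)


def is_frustrated(message: str) -> bool:
    """Very rough stand-in for a sentiment classifier."""
    text = message.lower()
    return any(marker in text for marker in NEGATIVE_MARKERS) or message.count("!") >= 3


def with_sentiment_branch(base_prompt: str, message: str) -> str:
    """Add the empathy-first instruction only when the user seems frustrated."""
    if is_frustrated(message):
        return f"{base_prompt}\n\n{EMPATHY_INSTRUCTION}"
    return base_prompt
```

Keeping the branch outside the base prompt means neutral conversations are not padded with unnecessary apologies.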

What to measure

Deflection rate (resolved without handoff).
Hallucination rate (on seeded-out-of-corpus questions).
Escalation correctness (false escalations, missed escalations).
Sentiment shift (pre vs post conversation).
Citation accuracy (do the cited chunks actually support the answer?).
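Two of these metrics fall straight out of labeled conversation logs. The sketch below assumes each logged conversation carries three booleans, two of them human-labeled; the Conversation type and its field names are hypothetical, so adapt them to whatever your logging schema records.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    resolved_without_handoff: bool   # human-labeled outcome
    escalated: bool                  # what the bot decided
    should_have_escalated: bool      # human-labeled ground truth


def deflection_rate(conversations: list[Conversation]) -> float:
    """Share of conversations resolved without a handoff."""
    if not conversations:
        raise ValueError("no conversations to score")
    resolved = sum(c.resolved_without_handoff for c in conversations)
    return resolved / len(conversations)


def escalation_errors(conversations: list[Conversation]) -> dict[str, int]:
    """Count false escalations and missed escalations separately."""
    false_escalations = sum(
        c.escalated and not c.should_have_escalated for c in conversations
    )
    missed_escalations = sum(
        c.should_have_escalated and not c.escalated for c in conversations
    )
    return {
        "false_escalations": false_escalations,
        "missed_escalations": missed_escalations,
    }
```

Reporting the two error types separately matters because they have different costs: a false escalation wastes agent time, while a missed one leaves a user stranded.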

Prompt Patterns for AI Customer Support (Use-case Deep-Dive)