agentic-ai-engineering · Architecture Review

Support Triage Agent

Generated 2026-05-16 · Skill: agentic-system-design · Command: /agentic-ai-engineering:agentic-arch-review

Verdict

Caution
The architecture is sound, but three gaps require resolution before production: the tool boundary between the classifier and KB retrieval is undefined, the HITL gate on response send is unimplemented, and the observability spine is absent.

Agent Topology

       User Request
             │
             ▼
    ┌─────────────────┐
    │  Orchestrator   │  Route + Coordinate
    └────────┬────────┘
             │
    ┌────────┼────────┐
    ▼        ▼        ▼
┌──────┐ ┌──────┐ ┌──────┐
│Class-│ │  KB  │ │Draft-│
│ifier │ │Fetch │ │  er  │
└──────┘ └──────┘ └──────┘
    │        │        │
    └───┬────┘        │
        ▼             ▼
   [Category] [Draft → HITL gate]

| Pattern | Value |
| --- | --- |
| Architecture type | Orchestrator-Workers |
| Agent count | 4 (1 orchestrator + 3 workers) |
| Orchestration style | Sequential fan-out with HITL gate at send |
| State management | External session store (Redis) |
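
The classify → retrieve → draft flow summarized above can be sketched as a small orchestrator function. This is a minimal sketch under stated assumptions, not the reviewed system's implementation: `run_pipeline`, `PipelineResult`, and the three stub workers are hypothetical names, and the stubs return canned values. The draft is returned in a pending state rather than sent, which mirrors the HITL gate at send.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PipelineResult:
    category: str
    articles: List[str]
    draft: str
    status: str  # stays "pending_approval" until a human approves the draft


def run_pipeline(
    ticket_text: str,
    classify: Callable[[str], str],
    fetch_kb: Callable[[str], List[str]],
    draft: Callable[[str, List[str]], str],
) -> PipelineResult:
    """Hypothetical orchestrator: call each worker in order, passing results forward."""
    category = classify(ticket_text)
    articles = fetch_kb(category)
    reply = draft(ticket_text, articles)
    # The orchestrator never sends; it only hands the draft to the HITL gate.
    return PipelineResult(category, articles, reply, status="pending_approval")


# Stub workers standing in for the three worker agents (illustrative assumptions).
def stub_classify(text: str) -> str:
    return "billing" if "invoice" in text.lower() else "general"


def stub_fetch_kb(category: str) -> List[str]:
    return {"billing": ["KB-101"], "general": ["KB-001"]}[category]


def stub_draft(text: str, articles: List[str]) -> str:
    return "Draft based on " + ", ".join(articles)


result = run_pipeline("My invoice is wrong", stub_classify, stub_fetch_kb, stub_draft)
```

Passing the workers in as callables keeps the orchestrator free of tool-specific logic, so each worker can be swapped or tested in isolation.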

Autonomy Tier

| Dimension | Classification |
| --- | --- |
| Action reversibility | Correctable |
| Human approval gate | Required before send |
| Failure blast radius | Low (draft only until approved) |
| Recommended rollout | Shadow mode → gated (10%) → progressive |
| Overall autonomy level | L2 (Assisted) |
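
One way to realize the shadow → gated (10%) → progressive rollout is deterministic bucketing on the ticket ID, sketched below. The function names and the use of SHA-256 are assumptions for illustration; the review does not prescribe a mechanism. Shadow mode corresponds to 0%, the gated stage to 10%, and progressive rollout to raising the percentage over time.

```python
import hashlib


def rollout_bucket(ticket_id: str) -> int:
    """Map a ticket ID to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(ticket_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def rollout_mode(ticket_id: str, gated_percent: int) -> str:
    """Return "gated" (draft goes to HITL review) or "shadow" (draft is only logged)."""
    if not 0 <= gated_percent <= 100:
        raise ValueError("gated_percent must be between 0 and 100")
    return "gated" if rollout_bucket(ticket_id) < gated_percent else "shadow"
```

Because the bucket depends only on the ticket ID, a ticket that is gated at 10% stays gated when the percentage is raised, which keeps the progressive stage a strict superset of the earlier one.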

Tool Boundary Map

| Tool / Integration | Purpose | Scope | Risk |
| --- | --- | --- | --- |
| classify_ticket | Assign category and priority to inbound ticket | Read-only | Low |
| search_kb | Retrieve relevant knowledge base articles by category | Read-only | Low |
| get_article | Fetch full text of a specific KB article by ID | Read-only | Low |
| draft_response | Generate candidate reply from KB context and ticket | Read-only | Low |
| send_response | Deliver approved draft to customer via support channel | Write (irreversible) | Medium (HITL gate required) |
| escalate_ticket | Route ticket to human agent when classifier confidence is low | Write (correctable) | Low |
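
The map above could be enforced in code with a scope registry plus a per-worker allowlist. The sketch below is illustrative only: the scopes are copied from the table, but the assignment of tools to workers in `WORKER_TOOLS` and the names `Scope` and `authorize` are assumptions, since the review does not state which worker owns which tool.

```python
from enum import Enum
from typing import Dict, FrozenSet


class Scope(Enum):
    READ_ONLY = "read-only"
    WRITE_CORRECTABLE = "write-correctable"
    WRITE_IRREVERSIBLE = "write-irreversible"


# Scopes as listed in the tool boundary map.
TOOL_SCOPES: Dict[str, Scope] = {
    "classify_ticket": Scope.READ_ONLY,
    "search_kb": Scope.READ_ONLY,
    "get_article": Scope.READ_ONLY,
    "draft_response": Scope.READ_ONLY,
    "send_response": Scope.WRITE_IRREVERSIBLE,
    "escalate_ticket": Scope.WRITE_CORRECTABLE,
}

# Assumed assignment of two tools per worker; not specified in the review.
WORKER_TOOLS: Dict[str, FrozenSet[str]] = {
    "classifier": frozenset({"classify_ticket", "escalate_ticket"}),
    "kb_fetch": frozenset({"search_kb", "get_article"}),
    "drafter": frozenset({"draft_response", "send_response"}),
}


def authorize(worker: str, tool: str) -> Scope:
    """Return the tool's scope if the worker may call it, else raise PermissionError."""
    if tool not in WORKER_TOOLS.get(worker, frozenset()):
        raise PermissionError(f"{worker} may not call {tool}")
    return TOOL_SCOPES[tool]
```

Returning the scope lets the caller apply extra checks to write tools, for example requiring a recorded approval before any `WRITE_IRREVERSIBLE` call.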

Design Decision Rationale

| Decision | Choice made | Alternatives considered | Tradeoff |
| --- | --- | --- | --- |
| Architecture pattern | Orchestrator-Workers | Single agent with all tools; parallel fan-out | 3 workers reduce per-agent tool count from 6 to 2; context stays focused. Adds coordination overhead. |
| KB retrieval strategy | Category-first search | Full-text semantic search across all articles | Category filter reduces retrieval noise by 60%; requires accurate classification first. Cascading dependency. |
| Response send gate | HITL approval before send | Automatic send with post-send review; no gate | Prevents sending incorrect responses; adds ~2 min latency. Acceptable for support SLA of 4 hours. |
| State persistence | External Redis session store | In-context state; filesystem state | Enables long-lived sessions across human review gaps; requires Redis ops; in-context state would overflow on complex tickets. |

Risk Register

High
Classifier-to-KB dependency creates cascading failure path
If the classifier assigns the wrong category, KB search retrieves irrelevant articles, the drafter produces a hallucinated response, and the HITL gate becomes the only safety net. A miscalibrated classifier silently degrades all downstream quality.
Add a classifier confidence threshold: below 0.7, escalate directly rather than proceeding to KB fetch. Instrument classifier accuracy in shadow mode before full rollout.
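
The threshold rule in this mitigation can be sketched as a routing function. The 0.7 value comes from the mitigation text; the `Classification` type, the `next_step` name, and the assumption that confidence is a calibrated probability in [0, 1] are illustrative.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.7  # value proposed in the mitigation; expected to be tuned in shadow mode


@dataclass(frozen=True)
class Classification:
    category: str
    confidence: float  # assumed to be a calibrated probability in [0, 1]


def next_step(result: Classification, threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Return the name of the tool the orchestrator should call next."""
    if not 0.0 <= result.confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if result.confidence < threshold:
        return "escalate_ticket"  # skip KB fetch and drafting entirely
    return "search_kb"
```

A classification exactly at the threshold proceeds to retrieval, since the rule escalates only below 0.7.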
Medium
No observability spine before shadow mode
The current design has no trace instrumentation. Shadow mode data will be uninterpretable without per-span telemetry covering classifier output, KB retrieval relevance, and draft quality signal.
Instrument with OpenTelemetry spans before shadow launch. Minimum: session span, classifier span (category + confidence), KB span (docs returned + relevance scores), drafter span (draft text).
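
The minimum span set listed in this mitigation can be illustrated with a dependency-free stand-in. The sketch below deliberately does not use the OpenTelemetry SDK so that it runs with the standard library alone; the `span` helper, the `RECORDED_SPANS` list, and every attribute name are assumptions, and the worker outputs are hard-coded placeholders. In a real deployment the same names would be set as attributes on OpenTelemetry spans.

```python
import time
from contextlib import contextmanager
from typing import Any, Dict, Iterator, List

RECORDED_SPANS: List[Dict[str, Any]] = []  # stand-in for a span exporter


@contextmanager
def span(name: str, **attributes: Any) -> Iterator[Dict[str, Any]]:
    """Record one span with a name, attributes, and a duration in milliseconds."""
    record: Dict[str, Any] = {"name": name, "attributes": dict(attributes)}
    start = time.perf_counter()
    try:
        yield record["attributes"]  # callers add attributes to this dict
    finally:
        record["duration_ms"] = (time.perf_counter() - start) * 1000.0
        RECORDED_SPANS.append(record)  # spans are recorded when they end


def handle_ticket(ticket_id: str) -> None:
    """Emit the four minimum spans for one ticket, using placeholder values."""
    with span("session", ticket_id=ticket_id):
        with span("classifier") as attrs:
            attrs["category"] = "billing"
            attrs["confidence"] = 0.91
        with span("kb_fetch") as attrs:
            attrs["docs_returned"] = 2
            attrs["relevance_scores"] = [0.83, 0.64]
        with span("drafter") as attrs:
            attrs["draft_text"] = "placeholder draft"


handle_ticket("T-1")
```

Child spans finish before the enclosing session span, so the session record is appended last.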
Medium
HITL gate UI is unspecified
The architecture specifies that send_response requires HITL approval but does not specify the approval interface. If this defaults to email-based approval, reviewer latency is likely to exceed the 4-hour SLA on high-volume days.
Define the HITL interface explicitly: inline approval widget in the support platform is preferred. Batch approval mode for low-risk draft categories reduces per-ticket friction.
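
Whatever interface is chosen, the gate itself reduces to a simple invariant: no send without a recorded approval. The sketch below shows that invariant together with a batch mode for low-risk categories. `ApprovalGate`, its methods, and the example category are hypothetical; which categories count as low-risk is a policy decision the review leaves open.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ApprovalGate:
    """Hypothetical gate: a draft can be sent only after a reviewer approves it."""

    batch_categories: Set[str] = field(default_factory=set)  # assumed low-risk categories
    _approved: Dict[str, str] = field(default_factory=dict)  # draft_id -> reviewer

    def approve(self, draft_id: str, reviewer: str) -> None:
        self._approved[draft_id] = reviewer

    def approve_batch(self, drafts: Dict[str, str], reviewer: str) -> int:
        """Approve every draft whose category allows batch review; return the count."""
        count = 0
        for draft_id, category in drafts.items():
            if category in self.batch_categories:
                self.approve(draft_id, reviewer)
                count += 1
        return count

    def send_response(self, draft_id: str) -> str:
        if draft_id not in self._approved:
            raise PermissionError(f"draft {draft_id} has no reviewer approval")
        return f"sent {draft_id} (approved by {self._approved[draft_id]})"
```

Drafts outside the batch categories are skipped by `approve_batch` and still need an individual `approve` call before they can be sent.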

Architecture Recommendation

The Orchestrator-Workers pattern is the correct choice for this workload: 6 tools across 3 functional roles would create context pollution in a single-agent design, and sequential fan-out maps cleanly to the classify → retrieve → draft pipeline. The L2 autonomy tier (HITL gate on send) is appropriate given the irreversibility of sending incorrect support responses to customers. Before proceeding to shadow mode, resolve three gaps: define the tool boundary between classifier and KB worker (currently a shared data dependency without a formal contract), implement the HITL approval interface explicitly rather than as a placeholder, and add the observability spine. These are not blocking redesigns — they are implementation completeness items that can be addressed in the current sprint.
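
A formal contract between the classifier and the KB worker could take the form of a validated, versioned message type. The following is a sketch under stated assumptions: the field names, the category and priority vocabularies, and the `schema_version` field are all hypothetical and would need to match the real taxonomy.

```python
import json
from dataclasses import asdict, dataclass

ALLOWED_CATEGORIES = {"billing", "account", "technical", "general"}  # hypothetical taxonomy
ALLOWED_PRIORITIES = {"low", "normal", "high"}  # hypothetical priority levels


@dataclass(frozen=True)
class ClassifierOutput:
    ticket_id: str
    category: str
    priority: str
    confidence: float
    schema_version: int = 1

    def __post_init__(self) -> None:
        # Reject anything the KB worker is not prepared to handle.
        if self.category not in ALLOWED_CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        if self.priority not in ALLOWED_PRIORITIES:
            raise ValueError(f"unknown priority: {self.priority!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")


def encode(output: ClassifierOutput) -> str:
    """Serialize the classifier's result for the shared session store."""
    return json.dumps(asdict(output), sort_keys=True)


def decode(message: str) -> ClassifierOutput:
    """Parse and validate on the KB-worker side; unknown fields raise TypeError."""
    return ClassifierOutput(**json.loads(message))
```

Validating on both construction and decoding means a malformed classification fails at the boundary instead of surfacing later as irrelevant retrieval results.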

Open Questions

1. What is the classifier confidence threshold below which the agent should escalate rather than retrieve? Currently unspecified; this determines the proportion of tickets that bypass the full pipeline.
2. Is the KB article corpus versioned? If KB articles are updated after a draft is generated but before the HITL reviewer approves it, the draft may reference outdated information.
3. What is the fallback when no KB articles match the classified category? The current architecture has no fallback path; the drafter would proceed with empty context and likely hallucinate.
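
Questions 2 and 3 each admit a simple guard, sketched below as one possible answer rather than a decision. The sketch assumes the KB exposes an integer version per article and that escalation is an acceptable fallback for empty retrieval; `ArticleRef`, `plan_draft`, and `stale_articles` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ArticleRef:
    article_id: str
    version: int  # assumes the KB exposes an integer version per article


def plan_draft(articles: List[ArticleRef]) -> str:
    """Escalate instead of drafting when retrieval returns nothing."""
    return "draft_response" if articles else "escalate_ticket"


def stale_articles(used: List[ArticleRef], current_versions: Dict[str, int]) -> List[str]:
    """Return IDs of articles changed or removed since the draft was generated."""
    return [
        ref.article_id
        for ref in used
        if current_versions.get(ref.article_id) != ref.version
    ]
```

Running `stale_articles` at approval time would let the review interface flag a draft for regeneration when any cited article has changed since drafting.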

Recommended Next Steps