Claude Code · Codex · Gemini CLI

Decision compression for Agentic AI engineering.

Senior engineer expertise for production agents — opinionated decision frameworks, architecture checklists, and artifact-grade deliverables across the full agent lifecycle.

$ claude plugin install https://github.com/deepak-karkala/agentic-ai-skills

Works with Claude Code · Codex · Gemini CLI · MIT License

Claude Code — project/
/agentic-ai-engineering:agentic-arch-review
Analyzing agent system context...
Reading design docs from .agentic/
── AGENT TIER ASSESSMENT ──────────────────
True Agent (Level 3) → Autonomous decision loops
Pattern: Orchestrator-Workers
Rationale: Parallel tool execution + cross-step memory
── RECOMMENDED ARCHITECTURE ───────────────
Orchestrator   Intent classification + routing
Worker A       Knowledge retrieval (RAG)
Worker B       Action execution (write/update)
Architecture review → .agentic/artifacts/architecture-review.html
2 open risks flagged, open in browser

24 skills · 5 specialist agents · 11 commands.

Skills activate automatically when you describe your problem. Explicit slash commands available for every workflow.

Strategy

4 skills
  • Process-fit scoring before you commit to build
  • Wedge positioning, ICP definition, GTM framing
  • Unit economics, LTV, data flywheel, moat depth
  • Governance controls and regulatory mapping
agentic-opportunity-framing · agentic-product-strategy · agentic-economics-and-moats · agentic-governance-and-adoption

Architecture

6 skills
  • Tier selection: pipeline vs ReAct vs True Agent
  • Multi-agent topologies and handoff contracts
  • Context engineering: memory tiers, write/select/compress
  • Tool schema design with ACI principles
agentic-system-design · context-engineering-for-agents · multi-agent-orchestration · single-agent-workflow-design · tool-interface-design · agentic-ubiquitous-language

Reliability

5 skills
  • 6-dimension eval scorecard with grader selection
  • Span taxonomy, circuit breakers, alert thresholds
  • Incident investigation: 6-layer fault classification
  • Hallucination containment with grounding checks
agent-eval-design · agent-observability · incident-investigation · hallucination-containment · trace-error-analysis

Production

7 skills
  • Deployment readiness: HITL gates, rollout posture
  • Token decomposition, model routing, cost-per-task
  • 8-threat taxonomy: injection, PII, unsafe execution
  • Agent UI patterns: streaming step cards, Autonomy Dial
deployment-readiness · latency-and-cost-optimization · agentic-security · human-in-the-loop-patterns · agent-ui-patterns · agentic-prototype · agentic-to-issues

Workflow

3 skills
  • Convert architecture to GitHub issues with acceptance criteria
  • Scope minimal prototype from design decisions
  • Structured handoff: owners, risks, artifact index
agentic-to-issues · agentic-prototype · agentic-handoff

Specialist Agents

5 agents
  • Systems Architect: deep architecture decomposition
  • Evals Auditor: eval suite gap analysis
  • Reliability Engineer: failure mode classification
  • Cost/Performance Analyst: per-component breakdown
agent-systems-architect · agent-evals-auditor · agent-reliability-engineer · agent-cost-performance-analyst · agent-product-strategist

A skill for every stage of the agent lifecycle.

From deciding whether to build, through architecture and evaluation, all the way to production hardening — skills activate automatically as you describe your problem.

01 Frame · /opportunity-framing · Process-fit scoring · Build / don't build
02 Architect · /agentic-plan · Pattern selection · Multi-agent topology
03 Build · /tool-interface-design · Tool schema + ACI · Context engineering
04 Evaluate · /agentic-evals · 6-dimension scorecard · LLM-as-Judge graders
05 Deploy · /agentic-ops · HITL gates · Security · Rollout posture
06 Operate · /agent-observability · Trace spans · Drift · Incident investigation

Each stage has a dedicated skill with step-by-step workflows, decision tables, and artifact outputs.

Senior engineer judgment, not generic advice.

Every question about agent architecture, evaluation strategy, or production readiness deserves a structured, defensible answer — not a hallucinated guess.

Without the plugin

Ask any AI to design an agent. Get a generic plan.

Generic advice about "using LangChain" or "adding memory" with no decision criteria, no trade-off analysis, no artifact you can share with stakeholders.

  • No systematic way to choose between agent patterns
  • Architecture decisions undocumented, impossible to review
  • No signal on what will fail in production
Without plugin
How should I architect my agent?
You could use LangChain or LlamaIndex for your agent framework. Consider adding memory with vector stores. ReAct is a good pattern. You might also want to look at AutoGen for multi-agent setups. Make sure to add logging!
✗ No decision criteria
✗ No trade-off analysis
✗ No shareable artifact
With agentic-ai-engineering

Structured architecture review with a shareable HTML report.

The plugin routes to /agentic-arch-review automatically. It assesses agent tier, selects a canonical pattern with rationale, and writes a stakeholder-facing HTML artifact.

  • Agent tier selection: pipeline vs ReAct vs True Agent
  • Pattern recommendation with explicit rationale
  • Handoff contracts, risk flags, open questions
With plugin
Design the architecture for my support agent
→ auto-routing to agentic-system-design...
── TIER ASSESSMENT ────────────────────────
True Agent (L3): judgment + tool loops
Pattern: Orchestrator-Workers
── RISK FLAGS ─────────────────────────────
Missing escalation gate (P1)
No trace spans defined (P2)
architecture-review.html written
Structured diagnosis

Stop guessing. Classify the fault layer first.

The incident investigation skill walks through 6 fault layers — prompt, context, tool interface, model behavior, infra, environment — to pinpoint root cause before suggesting a fix.

  • Layered classification prevents misattributing model errors to infra
  • Each layer has a specific diagnostic check and a targeted remediation
  • Circuit breaker and escalation patterns built into every fix recommendation
Incident investigation
agent is looping on tool_call, retried 12x
── FAULT LAYER CLASSIFICATION ─────────────
Layer 1   Prompt           clear
Layer 2   Context          clear
Layer 3   Tool interface   ✗ FAULT
Layer 4   Model behavior   downstream
Root cause: null return on timeout → retry loop
── FIX ────────────────────────────────────
1. Return structured error (not null) on timeout
2. Add circuit breaker: halt after 5 calls

Watch the skills work.

Select a workflow to see how the plugin responds — structured decision frameworks, not free-form conversation.

Opportunity Framing — Invoice Processing Agent
Should we build an agent for invoice processing and approval?
→ routing to agentic-opportunity-framing...
── PROCESS FIT SCORE ─────────────────────────────
Multi-step workflow     ✓ High       (8 steps, exception routing)
Judgment required       ✓ High       (policy-based approval)
Structured outputs      ✓ High       (JSON → ERP API)
Compounding value       ~ Moderate   (improves with volume)
HITL feasible           ✓ High       (approval gate exists)
Cost-benefit positive   ✓ High       ($40k/yr labor cost)
Data available          ✓ High       (2yr invoice history)
── DISQUALIFIER CHECK ────────────────────────────
No label leakage detected
No compounding error risk at this step count
Regulatory: standard finance audit trail required
◆ BUILD · Process-fit score: 8.2 / 10 · Recommended pattern: ReAct + tool calls
Architecture Review — Customer Support Agent
/agentic-ai-engineering:agentic-arch-review
Reading context from .agentic/design-docs/...
Delegating to agent-systems-architect...
── CURRENT PATTERN ───────────────────────────────
Detected: Single-agent ReAct
Recommended: Orchestrator-Workers
Rationale: parallel knowledge lookup + ticket routing
── ARCHITECTURE DECISIONS ────────────────────────
Memory tier       External (ticket history DB)
Tool footprint    Read-only first principle
Escalation gate   Missing: add HITL gate (P1)
Observability     No trace spans: add before prod
architecture-review.html written to .agentic/artifacts/
2 risks flagged · open in browser to review
Incident Investigation — Tool Retry Loop
Agent looping on tool_call, tool_retry logged 12 times
→ routing to incident-investigation...
Delegating deep analysis to agent-reliability-engineer...
── FAULT LAYER CLASSIFICATION ────────────────────
Layer 1   Prompt / instruction   clear
Layer 2   Context / memory       clear
Layer 3   Tool interface         ✗ FAULT
Layer 4   Model behavior         downstream
Layer 5   Infrastructure         clear
Layer 6   Environment            clear
── ROOT CAUSE ────────────────────────────────────
Tool returns null on timeout; agent interprets as partial result and retries indefinitely.
── IMMEDIATE CONTAINMENT ─────────────────────────
1. Return structured error (not null) on timeout
2. Add max_retries=3 to tool call wrapper
3. Circuit breaker: halt agent loop after 5 calls
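The three containment steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the plugin's output or API: `ToolResult`, `GuardedTool`, and `CircuitOpen` are hypothetical names, the wrapped tool is assumed to signal a timeout by raising `TimeoutError`, and `max_retries=3` is read as three retries after the first attempt.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToolResult:
    """Structured result so the agent never receives a bare None."""
    ok: bool
    value: Optional[str] = None
    error: Optional[str] = None   # machine-readable code, e.g. "timeout"
    retryable: bool = False       # tells the agent whether retrying makes sense


class CircuitOpen(Exception):
    """Raised when the call budget for one agent loop is exhausted."""


class GuardedTool:
    """Hypothetical wrapper: bounded retries plus a per-loop circuit breaker."""

    def __init__(self, fn: Callable[[str], str],
                 max_retries: int = 3, max_calls: int = 5):
        self.fn = fn
        self.max_retries = max_retries  # retries after the first attempt
        self.max_calls = max_calls      # hard cap across the whole agent loop
        self.calls = 0

    def call(self, arg: str) -> ToolResult:
        for _ in range(1 + self.max_retries):
            # Step 3: halt the loop once the call budget is spent.
            if self.calls >= self.max_calls:
                raise CircuitOpen(f"halted after {self.calls} calls")
            self.calls += 1
            try:
                return ToolResult(ok=True, value=self.fn(arg))
            except TimeoutError:
                continue  # Step 2: retry, but only up to max_retries
        # Step 1: report the timeout as a structured error, not None.
        return ToolResult(ok=False, error="timeout", retryable=False)
```

With a tool that always times out, the first `call` makes four attempts and returns `ToolResult(ok=False, error="timeout")`; a second `call` makes the fifth attempt and then raises `CircuitOpen`, so the agent cannot loop indefinitely.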
Security Audit — CRM-Integrated Sales Agent
Security review for our CRM-integrated sales agent
→ routing to agentic-security...
── THREAT SCAN (8-threat taxonomy) ───────────────
Prompt injection        ✗ P0   Unmitigated
Indirect injection      ✗ P0   CRM notes (critical)
PII leakage             ⚠ P1   Partial guard only
Unsafe tool execution   ✗ P1   No permission tiers
API key exposure        ✗ P0   Key in context window
Over-privileged agent          Scoped correctly
Data exfiltration              Output filtered
Replay attacks                 Idempotent operations
3 critical issues require fix before production
Indirect injection from CRM contact notes is P0: add output filtering + sandboxed tool execution.
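The scan above reduces to a simple triage rule: any P0 finding blocks production. A minimal Python sketch of that rule follows, with the rows transcribed from the sample output. `Finding`, `release_blockers`, and `production_ready` are hypothetical names chosen for illustration, and treating only P0 as blocking is an assumption; none of this is the plugin's actual data model.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Finding:
    threat: str
    severity: Optional[str]  # "P0", "P1", or None when no issue was flagged
    note: str


# Rows transcribed from the sample threat scan above.
FINDINGS = [
    Finding("Prompt injection", "P0", "Unmitigated"),
    Finding("Indirect injection", "P0", "CRM notes (critical)"),
    Finding("PII leakage", "P1", "Partial guard only"),
    Finding("Unsafe tool execution", "P1", "No permission tiers"),
    Finding("API key exposure", "P0", "Key in context window"),
    Finding("Over-privileged agent", None, "Scoped correctly"),
    Finding("Data exfiltration", None, "Output filtered"),
    Finding("Replay attacks", None, "Idempotent operations"),
]


def release_blockers(findings: List[Finding],
                     blocking: Tuple[str, ...] = ("P0",)) -> List[Finding]:
    """Return the findings whose severity blocks a production release."""
    return [f for f in findings if f.severity in blocking]


def production_ready(findings: List[Finding]) -> bool:
    """A system is ready only when no blocking findings remain."""
    return not release_blockers(findings)
```

Applied to the sample scan, `release_blockers(FINDINGS)` returns the three P0 rows (prompt injection, indirect injection, API key exposure), matching the "3 critical issues" line, and `production_ready(FINDINGS)` is `False`.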

Shareable artifacts, not just chat.

Flagship skills write structured HTML reports to disk — stakeholder-ready, version-controllable, reopenable anytime. These are real outputs from the plugin.

Up and running in three steps.

No API keys. No config required. Skills activate automatically when you describe your problem.

01

Install the plugin

One command in your terminal. Works with Claude Code, Codex, and Gemini CLI.

claude plugin install github:deepak-karkala/agentic-ai-skills
02

Run setup in your project

Optional but recommended — initializes artifact paths and project context.

/agentic-ai-engineering:setup-agentic-ai-engineering
03

Describe your problem

Skills auto-route from plain language. Or use explicit slash commands for any skill.

Should we build an agent for X?

Everything that's included.

24 skills, each with step-by-step workflows, decision tables, and explicit scope boundaries.

agentic-opportunity-framing: Process-fit scoring (7 traits), disqualifier check, build/don't-build filter, agent type classification
agentic-product-strategy: Wedge score, ICP definition, GTM framing, moat layer prioritization, market fit assessment
agentic-economics-and-moats: Unit economics, inference cost, contribution margin, LTV, data flywheel, moat depth quantification
agentic-governance-and-adoption: Governance controls, regulatory mapping, HITL policy design, adoption plan, compliance framing
agentic-system-design: Architecture decisions, agent tier selection, pattern selection (6 canonical single-agent patterns)
context-engineering-for-agents: Memory tier selection, write/select/compress/isolate framework, context failure mode prevention
multi-agent-orchestration: Topology selection, handoff contracts, coordination anti-patterns, upgrade triggers
single-agent-workflow-design: Control flow pattern selection, step sequencing, error recovery design
tool-interface-design: Tool schema design, ACI principles (poka-yoke, minimal footprint, structured errors), permission tiers, MCP wiring
agentic-ubiquitous-language: Shared vocabulary generation, glossary creation, term disambiguation for agent systems
agent-eval-design: 6-dimension scorecard, trajectory metrics, grader selection (deterministic/LLM-as-Judge/human), EDDOps lifecycle
agent-observability: Trace design, span taxonomy, circuit breakers, alert thresholds, observability architecture
incident-investigation: Failure timeline reconstruction, 6 fault layers, immediate containment, durable fix routing
hallucination-containment: 4 hallucination modes, grounding checks, citation requirements, verification layer design
trace-error-analysis: Backward-tracing diagnostic, span classification, error taxonomy, replay strategy
deployment-readiness: Production gates, guardrails, HITL design, rollout posture, pre-launch checklist
latency-and-cost-optimization: Token usage decomposition, model routing, caching, parallelization, latency-cost tradeoffs
agentic-security: 8-threat taxonomy, including prompt injection, indirect injection, PII leakage, unsafe tool execution, and API key exposure
human-in-the-loop-patterns: 5 HITL models, approval gate design, bounded autonomy contract, escalation ladder
agent-ui-patterns: Dual-panel layout, streaming step cards, Intent Preview, Autonomy Dial, Calibrated Trust patterns
agentic-prototype: Minimal prototype scoping from design decisions, scaffold generation guidance
agentic-to-issues: Architecture-to-GitHub-issues translation with acceptance criteria and dependency ordering
agentic-handoff: Structured handoff document covering system status, owners, open risks, and an artifact index for team transitions
setup-agentic-ai-engineering: One-time per-repo config (artifact paths, project context establishment, config.yml creation)

Ready to build production-grade agents?

24 skills. 5 specialist agents. Decision frameworks for the full lifecycle.

$ claude plugin install https://github.com/deepak-karkala/agentic-ai-skills

MIT LICENSE · CLAUDE CODE · CODEX · GEMINI CLI · OPENCODE