Claude Code · Codex · Gemini CLI

Decision compression for Agentic AI engineering.

Senior engineer expertise for production agents — opinionated decision frameworks, architecture checklists, and artifact-grade deliverables across the full agent lifecycle.

$ claude plugin install https://github.com/deepak-karkala/agentic-ai-skills


Works with Claude Code · Codex · Gemini CLI · MIT License

Claude Code — project/

❯ /agentic-ai-engineering:agentic-arch-review
Analyzing agent system context...
Reading design docs from .agentic/
── AGENT TIER ASSESSMENT ──────────────────
✓ True Agent (Level 3) → Autonomous decision loops
Pattern: Orchestrator-Workers
Rationale: Parallel tool execution + cross-step memory
── RECOMMENDED ARCHITECTURE ───────────────

Orchestrator	→	Intent classification + routing
Worker A	→	Knowledge retrieval (RAG)
Worker B	→	Action execution (write/update)

Architecture review → ✓ .agentic/artifacts/architecture-review.html
⚠ 2 open risks flagged — open in browser

Capabilities

24 skills · 5 specialist agents · 11 commands.

Skills activate automatically when you describe your problem. Explicit slash commands are available for every workflow.

Strategy

4 skills

Process-fit scoring before you commit to build
Wedge positioning, ICP definition, GTM framing
Unit economics, LTV, data flywheel, moat depth
Governance controls and regulatory mapping

agentic-opportunity-framing · agentic-product-strategy · agentic-economics-and-moats · agentic-governance-and-adoption

Architecture

6 skills

Tier selection: pipeline vs ReAct vs True Agent
Multi-agent topologies and handoff contracts
Context engineering: memory tiers, write/select/compress
Tool schema design with ACI principles

agentic-system-design · context-engineering-for-agents · multi-agent-orchestration · single-agent-workflow-design · tool-interface-design · agentic-ubiquitous-language

Reliability

5 skills

6-dimension eval scorecard with grader selection
Span taxonomy, circuit breakers, alert thresholds
Incident investigation: 6-layer fault classification
Hallucination containment with grounding checks

agent-eval-design · agent-observability · incident-investigation · hallucination-containment · trace-error-analysis

Production

7 skills

Deployment readiness: HITL gates, rollout posture
Token decomposition, model routing, cost-per-task
8-threat taxonomy: injection, PII, unsafe execution
Agent UI patterns: streaming step cards, Autonomy Dial

deployment-readiness · latency-and-cost-optimization · agentic-security · human-in-the-loop-patterns · agent-ui-patterns · agentic-prototype · agentic-to-issues

Workflow

3 skills

Convert architecture to GitHub issues with acceptance criteria
Scope minimal prototype from design decisions
Structured handoff: owners, risks, artifact index

agentic-to-issues · agentic-prototype · agentic-handoff

Specialist Agents

5 agents

Systems Architect: deep architecture decomposition
Evals Auditor: eval suite gap analysis
Reliability Engineer: failure mode classification
Cost/Performance Analyst: per-component breakdown

agent-systems-architect · agent-evals-auditor · agent-reliability-engineer · agent-cost-performance-analyst · agent-product-strategist

How it Works

A skill for every stage of the agent lifecycle.

From deciding whether to build, through architecture and evaluation, all the way to production hardening — skills activate automatically as you describe your problem.

01 Frame	/opportunity-framing	Process-fit scoring · Build / don't build
02 Architect	/agentic-plan	Pattern selection · Multi-agent topology
03 Build	/tool-interface-design	Tool schema + ACI · Context engineering
04 Evaluate	/agentic-evals	6-dimension scorecard · LLM-as-Judge graders
05 Deploy	/agentic-ops	HITL gates · Security · Rollout posture
06 Operate	/agent-observability	Trace spans · Drift · Incident investigation

Each stage has a dedicated skill with step-by-step workflows, decision tables, and artifact outputs.

The Difference

Senior engineer judgment, not generic advice.

Every question about agent architecture, evaluation strategy, or production readiness deserves a structured, defensible answer — not a hallucinated guess.

Without the plugin

Ask any AI to design an agent. Get a generic plan.

Generic advice about "using LangChain" or "adding memory" with no decision criteria, no trade-off analysis, no artifact you can share with stakeholders.

No systematic way to choose between agent patterns
Architecture decisions undocumented, impossible to review
No signal on what will fail in production

Without plugin

❯ How should I architect my agent?
You could use LangChain or LlamaIndex for your agent framework. Consider adding memory with vector stores. ReAct is a good pattern. You might also want to look at AutoGen for multi-agent setups. Make sure to add logging!
✗ No decision criteria
✗ No trade-off analysis
✗ No shareable artifact

With agentic-ai-engineering

Structured architecture review with a shareable HTML report.

The plugin routes to /agentic-arch-review automatically. It assesses agent tier, selects a canonical pattern with rationale, and writes a stakeholder-facing HTML artifact.

Agent tier selection: pipeline vs ReAct vs True Agent
Pattern recommendation with explicit rationale
Handoff contracts, risk flags, open questions

With plugin

❯ Design the architecture for my support agent
→ auto-routing to agentic-system-design...
── TIER ASSESSMENT ────────────────────────
✓ True Agent (L3) — judgment + tool loops
Pattern: Orchestrator-Workers
── RISK FLAGS ─────────────────────────────
⚠ Missing escalation gate (P1)
⚠ No trace spans defined (P2)
✓ architecture-review.html written

Structured diagnosis

Stop guessing. Classify the fault layer first.

The incident investigation skill walks through 6 fault layers — prompt, context, tool interface, model behavior, infra, environment — to pinpoint root cause before suggesting a fix.

Layered classification prevents misattributing model errors to infra
Each layer has a specific diagnostic check and a targeted remediation
Circuit breaker and escalation patterns built into every fix recommendation
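The layered classification can be pictured as a small ordered walk. The sketch below is illustrative only and is not the plugin's implementation: the `FaultLayer` enum, the `classify_fault` helper, and the incident fields are hypothetical names, and the assumption that the first failing layer is treated as the root cause (with later layers considered downstream) is inferred from the sample transcripts rather than documented.

```python
from enum import IntEnum
from typing import Callable, Dict, Optional


class FaultLayer(IntEnum):
    """The six fault layers named above, in investigation order."""
    PROMPT = 1
    CONTEXT = 2
    TOOL_INTERFACE = 3
    MODEL_BEHAVIOR = 4
    INFRASTRUCTURE = 5
    ENVIRONMENT = 6


# A check returns True when its layer looks healthy for this incident.
Check = Callable[[dict], bool]


def classify_fault(incident: dict, checks: Dict[FaultLayer, Check]) -> Optional[FaultLayer]:
    """Walk the layers in order and return the first one whose check fails.

    Layers after the first failure are treated as downstream symptoms
    (an assumption of this sketch), so they are not evaluated.
    """
    for layer in FaultLayer:
        check = checks.get(layer)
        if check is not None and not check(incident):
            return layer
    return None


# Hypothetical incident record and checks, loosely modeled on the retry-loop example.
incident = {"tool_returned_null": True, "retries": 12}
checks = {
    FaultLayer.PROMPT: lambda i: True,
    FaultLayer.CONTEXT: lambda i: True,
    FaultLayer.TOOL_INTERFACE: lambda i: not i["tool_returned_null"],
    FaultLayer.MODEL_BEHAVIOR: lambda i: i["retries"] <= 3,
}

root = classify_fault(incident, checks)
print(root.name)  # TOOL_INTERFACE
```

Here the model-behavior check would also fail (12 retries), but the walk stops at the tool interface, which is the point of classifying the layer before proposing a fix.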

Incident investigation

❯ agent is looping on tool_call, retried 12x
── FAULT LAYER CLASSIFICATION ─────────────

Layer 1	Prompt	— clear
Layer 2	Context	— clear
Layer 3	Tool interface	✗ FAULT
Layer 4	Model behavior	— downstream

Root cause: null return on timeout → retry loop
── FIX ────────────────────────────────────
1. Return structured error (not null) on timeout
2. Add circuit breaker: halt after 5 calls

See it in Action

Watch the skills work.

Select a workflow to see how the plugin responds — structured decision frameworks, not free-form conversation.

Opportunity Framing — Invoice Processing Agent

❯ Should we build an agent for invoice processing and approval?
→ routing to agentic-opportunity-framing...
── PROCESS FIT SCORE ─────────────────────────────

Multi-step workflow	✓ High	(8 steps, exception routing)
Judgment required	✓ High	(policy-based approval)
Structured outputs	✓ High	(JSON → ERP API)
Compounding value	~ Moderate	(improves with volume)
HITL feasible	✓ High	(approval gate exists)
Cost-benefit positive	✓ High	($40k/yr labor cost)
Data available	✓ High	(2yr invoice history)

── DISQUALIFIER CHECK ────────────────────────────
✓ No label leakage detected
✓ No compounding error risk at this step count
✓ Regulatory: standard finance audit trail required

◆ BUILD · Process-fit score: 8.2 / 10 · Recommended pattern: ReAct + tool calls
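To make the shape of a trait-based score concrete, here is a rough sketch. The rating scale, the equal weights, the 7.0 threshold, and the function names are all assumptions made for illustration; the skill's actual scoring formula is not described here, so this sketch does not reproduce the 8.2 shown in the transcript.

```python
from typing import Dict, List, Optional

# Assumed point scale for each rating (not taken from the plugin).
RATING_POINTS = {"High": 1.0, "Moderate": 0.5, "Low": 0.0}


def process_fit_score(ratings: Dict[str, str],
                      weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted average of trait ratings, scaled to 0-10 and rounded to 1 decimal."""
    weights = weights or {trait: 1.0 for trait in ratings}
    total = sum(weights[trait] for trait in ratings)
    earned = sum(weights[trait] * RATING_POINTS[rating]
                 for trait, rating in ratings.items())
    return round(10 * earned / total, 1)


def decide(ratings: Dict[str, str], disqualifiers: List[str],
           threshold: float = 7.0) -> str:
    """Any disqualifier vetoes the build; otherwise compare the score to the threshold."""
    if disqualifiers:
        return "DON'T BUILD"
    return "BUILD" if process_fit_score(ratings) >= threshold else "DON'T BUILD"


# The seven trait ratings from the invoice-processing transcript.
ratings = {
    "Multi-step workflow": "High",
    "Judgment required": "High",
    "Structured outputs": "High",
    "Compounding value": "Moderate",
    "HITL feasible": "High",
    "Cost-benefit positive": "High",
    "Data available": "High",
}

print(process_fit_score(ratings))          # 9.3 with equal weights
print(decide(ratings, disqualifiers=[]))   # BUILD
```

With equal weights the seven ratings come out at 9.3, so the plugin evidently weights traits differently; the useful part of the sketch is the structure, where disqualifiers act as a veto and the score only matters once they pass.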

Architecture Review — Customer Support Agent

❯ /agentic-ai-engineering:agentic-arch-review
Reading context from .agentic/design-docs/...
Delegating to agent-systems-architect...
── CURRENT PATTERN ───────────────────────────────
Detected: Single-agent ReAct
Recommended: Orchestrator-Workers
Rationale: parallel knowledge lookup + ticket routing
── ARCHITECTURE DECISIONS ────────────────────────

Memory tier	✓	External (ticket history DB)
Tool footprint	✓	Read-only first principle
Escalation gate	⚠	Missing — add HITL gate (P1)
Observability	✗	No trace spans — add before prod

✓ architecture-review.html written to .agentic/artifacts/
2 risks flagged · open in browser to review

Incident Investigation — Tool Retry Loop

❯ Agent looping on tool_call, tool_retry logged 12 times
→ routing to incident-investigation...
Delegating deep analysis to agent-reliability-engineer...
── FAULT LAYER CLASSIFICATION ────────────────────

Layer 1	Prompt / instruction	— clear
Layer 2	Context / memory	— clear
Layer 3	Tool interface	✗ FAULT
Layer 4	Model behavior	— downstream
Layer 5	Infrastructure	— clear
Layer 6	Environment	— clear

── ROOT CAUSE ────────────────────────────────────
Tool returns null on timeout; agent interprets as partial result and retries indefinitely.
── IMMEDIATE CONTAINMENT ─────────────────────────
1. Return structured error (not null) on timeout
2. Add max_retries=3 to tool call wrapper
3. Circuit breaker: halt agent loop after 5 calls
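The three containment steps can be combined in one tool-call wrapper. This is a minimal sketch under stated assumptions, not code the plugin generates: `ToolResult`, `ToolCaller`, and `CircuitOpen` are hypothetical names, the tool is assumed to signal a timeout by raising Python's built-in `TimeoutError`, and only the limits (3 retries, 5 calls) come from the transcript above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ToolResult:
    """Structured result so the agent never has to interpret a bare null."""
    ok: bool
    value: Any = None
    error: Optional[str] = None   # machine-readable error code
    retryable: bool = False


class CircuitOpen(Exception):
    """Raised when the agent loop exceeds its total tool-call budget."""


class ToolCaller:
    def __init__(self, tool: Callable[[], Any], max_retries: int = 3, max_calls: int = 5):
        self.tool = tool
        self.max_retries = max_retries   # retries allowed per logical call
        self.max_calls = max_calls       # total calls allowed across the loop
        self.calls = 0

    def _call_once(self) -> ToolResult:
        # Step 3: circuit breaker, checked before every underlying call.
        if self.calls >= self.max_calls:
            raise CircuitOpen(f"halted after {self.calls} tool calls")
        self.calls += 1
        try:
            value = self.tool()
        except TimeoutError:
            # Step 1: a timeout becomes a structured error, never None.
            return ToolResult(ok=False, error="timeout", retryable=True)
        if value is None:
            return ToolResult(ok=False, error="empty_result", retryable=False)
        return ToolResult(ok=True, value=value)

    def call(self) -> ToolResult:
        # Step 2: bounded retries, and only for errors marked retryable.
        result = self._call_once()
        retries = 0
        while not result.ok and result.retryable and retries < self.max_retries:
            retries += 1
            result = self._call_once()
        return result


def always_times_out():
    raise TimeoutError("upstream took too long")


caller = ToolCaller(always_times_out)
result = caller.call()
print(result.error, caller.calls)  # timeout 4
```

The first `call()` makes one attempt plus three retries and returns a `timeout` error. If the agent tries again, the fifth underlying call is the last one allowed, and the next attempt raises `CircuitOpen` instead of looping.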

Security Audit — CRM-Integrated Sales Agent

❯ Security review for our CRM-integrated sales agent
→ routing to agentic-security...
── THREAT SCAN (8-threat taxonomy) ───────────────

Prompt injection	✗ P0	Unmitigated
Indirect injection	✗ P0	CRM notes — critical
PII leakage	⚠ P1	Partial guard only
Unsafe tool execution	✗ P1	No permission tiers
API key exposure	✗ P0	Key in context window
Over-privileged agent	✓	Scoped correctly
Data exfiltration	✓	Output filtered
Replay attacks	✓	Idempotent operations

3 critical issues require fix before production
Indirect injection from CRM contact notes is P0 — add output filtering + sandboxed tool execution.
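A scan like this lends itself to a simple release gate. The sketch below encodes the eight findings from the transcript and blocks on any P0. The `Finding` type and `blocking_findings` helper are hypothetical, and treating P0 as the only blocking severity is an assumption based on the "3 critical issues" summary line.

```python
from typing import List, NamedTuple, Optional


class Finding(NamedTuple):
    threat: str
    severity: Optional[str]   # "P0", "P1", or None when the threat is mitigated
    note: str


# The eight rows of the sample threat scan.
SCAN = [
    Finding("Prompt injection", "P0", "Unmitigated"),
    Finding("Indirect injection", "P0", "CRM notes"),
    Finding("PII leakage", "P1", "Partial guard only"),
    Finding("Unsafe tool execution", "P1", "No permission tiers"),
    Finding("API key exposure", "P0", "Key in context window"),
    Finding("Over-privileged agent", None, "Scoped correctly"),
    Finding("Data exfiltration", None, "Output filtered"),
    Finding("Replay attacks", None, "Idempotent operations"),
]


def blocking_findings(findings: List[Finding], blocking_severity: str = "P0") -> List[Finding]:
    """Return the findings that must be fixed before production."""
    return [f for f in findings if f.severity == blocking_severity]


blockers = blocking_findings(SCAN)
release_allowed = not blockers

print(len(blockers), release_allowed)  # 3 False
for finding in blockers:
    print(f"{finding.severity} {finding.threat}: {finding.note}")
```

A check like this could run in CI against a machine-readable export of the scan, if one is available, so that a P0 finding fails the build rather than relying on someone reading the report.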

What Gets Produced

Shareable artifacts, not just chat.

Flagship skills write structured HTML reports to disk — stakeholder-ready, version-controllable, reopenable anytime. These are real outputs from the plugin.

Get Started

Up and running in three steps.

No API keys. No config required. Skills activate automatically when you describe your problem.

Install the plugin

One command in your terminal. Works with Claude Code, Codex, and Gemini CLI.

claude plugin install github:deepak-karkala/agentic-ai-skills

Run setup in your project

Optional but recommended — initializes artifact paths and project context.

/agentic-ai-engineering:setup-agentic-ai-engineering

Describe your problem

Skills auto-route from plain language. Or use explicit slash commands for any skill.

Should we build an agent for X?

Full Skill Inventory

Everything that's included.

24 skills, each with step-by-step workflows, decision tables, and explicit scope boundaries.

agentic-opportunity-framing: Process-fit scoring (7 traits), disqualifier check, build/don't-build filter, agent type classification

agentic-product-strategy: Wedge score, ICP definition, GTM framing, moat layer prioritization, market fit assessment

agentic-economics-and-moats: Unit economics, inference cost, contribution margin, LTV, data flywheel, moat depth quantification

agentic-governance-and-adoption: Governance controls, regulatory mapping, HITL policy design, adoption plan, compliance framing

agentic-system-design: Architecture decisions, agent tier selection, pattern selection (6 canonical single-agent patterns)

context-engineering-for-agents: Memory tier selection, write/select/compress/isolate framework, context failure mode prevention

multi-agent-orchestration: Topology selection, handoff contracts, coordination anti-patterns, upgrade triggers

single-agent-workflow-design: Control flow pattern selection, step sequencing, error recovery design

tool-interface-design: Tool schema design, ACI principles (poka-yoke, minimal footprint, structured errors), permission tiers, MCP wiring

agentic-ubiquitous-language: Shared vocabulary generation, glossary creation, term disambiguation for agent systems

agent-eval-design: 6-dimension scorecard, trajectory metrics, grader selection (deterministic/LLM-as-Judge/human), EDDOps lifecycle

agent-observability: Trace design, span taxonomy, circuit breakers, alert thresholds, observability architecture

incident-investigation: Failure timeline reconstruction, 6 fault layers, immediate containment, durable fix routing

hallucination-containment: 4 hallucination modes, grounding checks, citation requirements, verification layer design

trace-error-analysis: Backward-tracing diagnostic, span classification, error taxonomy, replay strategy

deployment-readiness: Production gates, guardrails, HITL design, rollout posture, pre-launch checklist

latency-and-cost-optimization: Token usage decomposition, model routing, caching, parallelization, latency-cost tradeoffs

agentic-security: 8-threat taxonomy: prompt injection, indirect injection, PII leakage, unsafe tool execution, API key exposure

human-in-the-loop-patterns: 5 HITL models, approval gate design, bounded autonomy contract, escalation ladder

agent-ui-patterns: Dual-panel layout, streaming step cards, Intent Preview, Autonomy Dial, Calibrated Trust patterns

agentic-prototype: Minimal prototype scoping from design decisions, scaffold generation guidance

agentic-to-issues: Architecture-to-GitHub-issues translation with acceptance criteria and dependency ordering

agentic-handoff: Structured handoff document: system status, owners, open risks, artifact index for team transitions

setup-agentic-ai-engineering: One-time per-repo config: artifact paths, project context establishment, config.yml creation