56 Detectors for AI Agent Architecture


AI agent codebases fail differently than traditional software.

A missing max_steps in a LangGraph loop doesn’t cause a compile error. An agent that concatenates untrusted tool output directly into a prompt passes every type check. A multi-agent handoff graph with a cycle runs fine in development — until it doesn’t. These aren’t edge cases — they’re structural properties, visible in the code before the agent ever runs.

Every major AI agent incident in the past year — Replit’s agent deleting a production database, Gemini CLI destroying user files, prompt injection exfiltrating data from Microsoft Copilot — was an architecture problem. Not a model problem. Not a code quality problem. A missing guardrail that no linter was built to see.

The agent_architecture metric was built to see them.

Four axes, 56 detectors

arxo analyze --metric agent_architecture

The metric is organized around four axes. Each detector checks for a specific structural gap — something that should exist in the code but doesn’t.

Reliability

Does the agent have the structural safeguards to run without spiraling? Detectors in this axis look for:

  • Loop guards — agent loops without max_steps, recursion_limit, or equivalent termination conditions
  • Memory bounds — unbounded context windows and tool state that grow without limits
  • Retry storms — retry logic without backoff, jitter, or circuit breakers
  • Cost budget enforcement — LLM calls without max_tokens or budget caps (OWASP LLM06: denial of wallet)
  • Checkpoint durability — long-running workflows without persistent state for crash recovery
  • Output validation — agent outputs consumed without schema checks or type validation
  • Hallucination propagation — outputs from one agent step fed into the next without grounding verification
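
The retry-storm bullet above has a small, well-known antidote: exponential backoff with full jitter and a hard attempt cap. A minimal sketch — the function name and defaults here are illustrative, not part of arxo or any framework:

```python
import random
import time

def call_with_backoff(fn, max_attempts=4, base_delay=0.5, max_delay=8.0):
    """Retry fn with exponential backoff and full jitter, capped attempts."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # budget exhausted: surface the error instead of storming
            # full jitter: sleep a random fraction of the capped exponential delay
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(random.uniform(0, delay))
```

The cap on attempts is the part the detector cares about: without it, a flaky downstream service turns one agent step into an unbounded retry storm.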

Safety

Can the agent be exploited, misused, or cause unintended side effects? This axis has three sub-groups:

Tool execution covers the surface area where agents interact with the outside world:

  • Prompt injection defense — no input sanitization, role boundary enforcement, or guardrail hooks (OWASP LLM01)
  • Sensitive data exposure — PII or credentials flowing into prompts or logs without redaction (OWASP LLM02)
  • Human approval absence — high-risk tool actions (shell, file write, API calls) without approval gates
  • Tool sandbox enforcement — process-capable tools running without isolation or containment
  • Untrusted output boundary — raw tool output concatenated into prompts without sanitization
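
The human-approval bullet can be sketched in a few lines. Assuming a hypothetical `gated_invoke` wrapper and `HIGH_RISK` set — neither is an arxo or framework API — the idea is that high-risk tool calls cannot execute without an explicit yes from an approval callback:

```python
HIGH_RISK = {"shell", "file_write", "http_post"}

def gated_invoke(tool_name, tool_fn, args, approve):
    """Run a tool, routing high-risk calls through an approval callback.

    `approve` is any callable returning True/False: a CLI prompt,
    a review queue, or a policy engine in a real system.
    """
    if tool_name in HIGH_RISK and not approve(tool_name, args):
        raise PermissionError(f"{tool_name} rejected by approval gate")
    return tool_fn(**args)
```

Low-risk tools pass straight through; the gate only adds friction where a bad call is destructive.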

MCP covers Model Context Protocol server integrations:

  • MCP auth gap — servers without authentication or authorization
  • MCP tool poisoning risk — tool descriptions containing hidden instructions
  • MCP rug pull risk — no descriptor integrity controls (pinning, hash, version lock)
  • MCP shadow server risk — unaudited MCP servers in the dependency chain
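
One way to implement the descriptor integrity controls the rug-pull detector looks for is to pin a hash of each tool descriptor at audit time and refuse the tool if the descriptor later changes. A sketch with hypothetical helper names:

```python
import hashlib
import json

def descriptor_fingerprint(descriptor: dict) -> str:
    """Stable SHA-256 over a canonical JSON form of a tool descriptor."""
    canonical = json.dumps(descriptor, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def verify_descriptor(descriptor: dict, pinned: str) -> None:
    """Refuse to use a tool whose descriptor changed since it was audited."""
    if descriptor_fingerprint(descriptor) != pinned:
        raise RuntimeError("tool descriptor changed since pinning (possible rug pull)")
```

Canonical JSON (sorted keys, fixed separators) matters: without it, a semantically identical descriptor can hash differently and produce false alarms.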

A2A covers agent-to-agent communication:

  • Agent card gap — missing A2A agent card declarations
  • Handoff cycle risk — multi-agent delegation graphs with cycles
  • Webhook auth gap — A2A webhook endpoints without authentication
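
The webhook auth gap has a standard fix: require a signature over the raw request body and compare it in constant time. A minimal HMAC-SHA256 sketch — the signing scheme here is illustrative, and real A2A deployments vary:

```python
import hashlib
import hmac

def verify_webhook(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Constant-time check of an HMAC-SHA256 signature over the raw body."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)
```

`hmac.compare_digest` avoids the timing side channel that a plain `==` comparison would leak.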

Governance

Are tool invocations constrained by policy? Detectors here check for:

  • Tool policy absence — tools registered without allowlists, scope limits, or invocation policies
  • Schema validation gap — tool inputs accepted without schema checks
  • Tool result validation gap — tool outputs consumed without explicit validation
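
The schema-validation gap can be closed without a heavy dependency. A minimal structural check, with a hypothetical `validate_args` helper — a production system would more likely use `jsonschema` or Pydantic:

```python
def validate_args(schema: dict, args: dict) -> dict:
    """Minimal structural check: no extra keys, all keys present, types match."""
    unexpected = set(args) - set(schema)
    if unexpected:
        raise ValueError(f"unexpected arguments: {sorted(unexpected)}")
    for name, expected_type in schema.items():
        if name not in args:
            raise ValueError(f"missing argument: {name}")
        if not isinstance(args[name], expected_type):
            raise TypeError(f"{name} must be {expected_type.__name__}")
    return args
```

Rejecting unexpected keys is the governance half of the check: it stops a model from smuggling extra parameters into a tool call that the policy never approved.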

Coordination

Can multi-agent systems coordinate without deadlocks, races, or cascading failures?

  • Coordination risk — multi-agent handoffs without typed message or state contracts
  • Routing pattern risk — agent routing without confidence thresholds or fallback routes
  • Deadlock risk — fanout flows without joins, barriers, or concurrency limiters
  • State isolation risk — mutable state shared across sessions without scoping
  • Fanout control absence — parallel execution without max_concurrent or semaphore limits
  • Idempotency gap — side-effecting operations without idempotency keys
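
The fanout-control and deadlock bullets share one fix: every parallel branch runs under a concurrency limiter, and every fanout ends in a join. A sketch with asyncio, where `bounded_fanout` is a hypothetical helper, not an arxo or framework API:

```python
import asyncio

async def bounded_fanout(factories, max_concurrent=4):
    """Run coroutine factories in parallel, never more than max_concurrent at once."""
    sem = asyncio.Semaphore(max_concurrent)

    async def run(factory):
        async with sem:
            return await factory()

    # gather joins every branch, so the fanout always has a barrier
    return await asyncio.gather(*(run(f) for f in factories))
```

The semaphore is the `max_concurrent` limit the detector looks for; `gather` is the join that keeps orphaned branches from outliving the flow.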

What it looks like in practice

A LangGraph agent with no recursion limit:

from langgraph.graph import StateGraph

graph = StateGraph(AgentState)
graph.add_node("agent", call_model)
graph.add_node("tools", call_tools)
graph.set_entry_point("agent")
graph.add_edge("agent", "tools")
graph.add_edge("tools", "agent")
app = graph.compile()  # ← loop_guard_absence

Arxo flags loop_guard_absence because the graph contains a cycle and nothing bounds it. The fix is one config value: in LangGraph, recursion_limit is part of the runtime config rather than an argument to compile:

app = graph.compile(checkpointer=memory)
app.invoke(inputs, {"recursion_limit": 25})

A CrewAI agent with shell access and no approval:

Agent(
    role="researcher",
    tools=[ShellTool(), FileWriteTool()],  # ← agent_shell_capable, human_approval_absence
)

Arxo flags two detectors: the agent has unrestricted shell access, and destructive tools have no human-in-the-loop gate.

Running it

arxo init
arxo analyze --metric agent_architecture

No configuration required for a first report. Every finding includes a detector ID, evidence from the code, and a specific remediation — not a generic warning, but the exact change to make.

We’ll be writing about each axis in depth. Next up: reliability — the 16 detectors that keep your agent from spiraling.