56 Detectors for AI Agent Architecture
AI agent codebases fail differently than traditional software.
A missing max_steps in a LangGraph loop doesn’t cause a compile error. An agent that concatenates untrusted tool output directly into a prompt passes every type check. A multi-agent handoff graph with a cycle runs fine in development — until it doesn’t. These aren’t edge cases — they’re structural properties, visible in the code before the agent ever runs.
Every major AI agent incident in the past year — Replit’s agent deleting a production database, Gemini CLI destroying user files, prompt injection exfiltrating data from Microsoft Copilot — was an architecture problem. Not a model problem. Not a code quality problem. A missing guardrail that no linter was built to see.
The agent_architecture metric was built to see them.
Four axes, 56 detectors
arxo analyze --metric agent_architecture
The metric is organized around four axes. Each detector checks for a specific structural gap — something that should exist in the code but doesn’t.
Reliability
Does the agent have the structural safeguards to run without spiraling? Detectors in this axis look for:
- Loop guards — agent loops without max_steps, recursion_limit, or equivalent termination conditions
- Memory bounds — unbounded context windows and tool state that grow without limits
- Retry storms — retry logic without backoff, jitter, or circuit breakers
- Cost budget enforcement — LLM calls without max_tokens or budget caps (OWASP LLM10: denial of wallet)
- Checkpoint durability — long-running workflows without persistent state for crash recovery
- Output validation — agent outputs consumed without schema checks or type validation
- Hallucination propagation — outputs from one agent step fed into the next without grounding verification
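Two of these gaps, loop guards and retry storms, have a canonical fix that fits in a few lines. A minimal sketch (function and parameter names are illustrative, not an arxo API):

```python
import random
import time

def call_with_backoff(fn, max_retries=3, base_delay=0.5):
    """Retry with exponential backoff and full jitter -- the shape the
    retry-storm detector looks for."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise
            # Sleep a random slice of an exponentially growing window,
            # so simultaneous failures don't retry in lockstep.
            time.sleep(random.uniform(0, base_delay * 2 ** attempt))

def run_agent(step_fn, state, max_steps=25):
    """Bounded agent loop: terminates after max_steps even if the model
    never emits a stop signal (the loop-guard pattern)."""
    for _ in range(max_steps):
        state, done = step_fn(state)
        if done:
            return state
    raise RuntimeError(f"agent exceeded max_steps={max_steps}")
```

The point is structural: the bound and the jitter are visible in the source, which is exactly what a static detector can check.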
Safety
Can the agent be exploited, misused, or cause unintended side effects? This axis has three sub-groups:
Tool execution covers the surface area where agents interact with the outside world:
- Prompt injection defense — no input sanitization, role boundary enforcement, or guardrail hooks (OWASP LLM01)
- Sensitive data exposure — PII or credentials flowing into prompts or logs without redaction (OWASP LLM02)
- Human approval absence — high-risk tool actions (shell, file write, API calls) without approval gates
- Tool sandbox enforcement — process-capable tools running without isolation or containment
- Untrusted output boundary — raw tool output concatenated into prompts without sanitization
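The untrusted-output-boundary gap is worth making concrete. A sketch of the pattern the detector expects, clip, screen, and delimit tool output before it touches a prompt (the regex and delimiter tags are assumptions for illustration; a real deployment would use a guardrail model or library, not one pattern):

```python
import re

# One pattern that suggests instruction smuggling. Illustrative, not exhaustive.
SUSPECT = re.compile(r"ignore\s+(all|previous|prior)\s+instructions", re.I)

def bound_tool_output(raw: str, max_chars: int = 4000) -> str:
    """Wrap untrusted tool output in an explicit data boundary and clip
    its length before it is interpolated into a prompt."""
    clipped = SUSPECT.sub("[redacted]", raw[:max_chars])
    return f"<tool_output>\n{clipped}\n</tool_output>"
```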
MCP covers Model Context Protocol server integrations:
- MCP auth gap — servers without authentication or authorization
- MCP tool poisoning risk — tool descriptions containing hidden instructions
- MCP rug pull risk — no descriptor integrity controls (pinning, hash, version lock)
- MCP shadow server risk — unaudited MCP servers in the dependency chain
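The rug-pull control the detector looks for is simple in principle: pin a hash of each tool descriptor at install time and fail closed if it drifts. A sketch under assumed descriptor fields (the function names are illustrative):

```python
import hashlib
import json

def descriptor_hash(descriptor: dict) -> str:
    """Canonical SHA-256 of an MCP tool descriptor (name, description,
    input schema). Pin it at install time, re-check it every session."""
    canonical = json.dumps(descriptor, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def verify_descriptor(descriptor: dict, pinned: str) -> None:
    """Fail closed if the server's tool description drifted from the pin."""
    if descriptor_hash(descriptor) != pinned:
        raise RuntimeError("MCP tool descriptor changed since it was pinned")
```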
A2A covers agent-to-agent communication:
- Agent card gap — missing A2A agent card declarations
- Handoff cycle risk — multi-agent delegation graphs with cycles
- Webhook auth gap — A2A webhook endpoints without authentication
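For the webhook gap, the minimum viable control is a signed request checked in constant time. A sketch, assuming an HMAC-SHA256 scheme with a shared secret (the signing scheme is an assumption for illustration, not part of the A2A spec):

```python
import hashlib
import hmac

def verify_webhook(body: bytes, signature: str, secret: bytes) -> bool:
    """Constant-time HMAC-SHA256 check for an inbound agent-to-agent
    webhook. Reject the request before parsing the body."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```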
Governance
Are tool invocations constrained by policy? Detectors here check for:
- Tool policy absence — tools registered without allowlists, scope limits, or invocation policies
- Schema validation gap — tool inputs accepted without schema checks
- Tool result validation gap — tool outputs consumed without explicit validation
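What a schema gate looks like in the smallest possible form, a required-key and type check on tool arguments before dispatch. This is a stand-in for a real JSON Schema or Pydantic validator, just to show the structure the detector expects:

```python
def validate_tool_input(args: dict, schema: dict) -> dict:
    """Minimal required-key/type gate for tool arguments. Rejects
    missing keys, unexpected keys, and wrong types before the tool runs."""
    missing = [k for k in schema if k not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    extra = [k for k in args if k not in schema]
    if extra:
        raise ValueError(f"unexpected arguments: {extra}")
    for key, typ in schema.items():
        if not isinstance(args[key], typ):
            raise TypeError(f"{key!r} must be {typ.__name__}")
    return args
```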
Coordination
Can multi-agent systems coordinate without deadlocks, races, or cascading failures?
- Coordination risk — multi-agent handoffs without typed message or state contracts
- Routing pattern risk — agent routing without confidence thresholds or fallback routes
- Deadlock risk — fanout flows without joins, barriers, or concurrency limiters
- State isolation risk — mutable state shared across sessions without scoping
- Fanout control absence — parallel execution without max_concurrent or semaphore limits
- Idempotency gap — side-effecting operations without idempotency keys
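The fanout-control fix is a semaphore around the gather, which is exactly the structure the detector looks for. A sketch (names are illustrative):

```python
import asyncio

async def bounded_fanout(coros, max_concurrent: int = 5):
    """Fan work out in parallel under a concurrency cap: a semaphore
    guarding each task instead of a bare asyncio.gather."""
    sem = asyncio.Semaphore(max_concurrent)

    async def guarded(coro):
        async with sem:
            return await coro

    return await asyncio.gather(*(guarded(c) for c in coros))
```

A bare `asyncio.gather` over N sub-agent calls is an unbounded fanout; the cap turns a burst of 100 LLM requests into a controlled pipeline.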
What it looks like in practice
A LangGraph agent with no recursion limit:
graph = StateGraph(AgentState)
graph.add_node("agent", call_model)
graph.add_edge("agent", "tools")
graph.add_edge("tools", "agent")
app = graph.compile() # ← loop_guard_absence
Arxo flags loop_guard_absence because the graph contains a cycle and no recursion limit is configured anywhere. In LangGraph the limit lives in the run config rather than in compile(), so the fix is one config entry at the call site:
app = graph.compile(checkpointer=memory)
result = app.invoke(inputs, config={"recursion_limit": 25})
A CrewAI agent with shell access and no approval:
Agent(
role="researcher",
tools=[ShellTool(), FileWriteTool()], # ← agent_shell_capable, human_approval_absence
)
Arxo flags two detectors: the agent has unrestricted shell access, and destructive tools have no human-in-the-loop gate.
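The approval gate itself is a small wrapper. A minimal sketch, framework-agnostic rather than a CrewAI API, with the approval callable injectable so the gate is testable:

```python
def require_approval(tool_fn, approve=input):
    """Wrap a destructive tool behind a human-in-the-loop gate.
    `approve` defaults to input() but can be any callable that
    returns the operator's answer."""
    def gated(*args, **kwargs):
        prompt = f"Allow {tool_fn.__name__}{args}? [y/N] "
        if approve(prompt).strip().lower() != "y":
            raise PermissionError(f"{tool_fn.__name__} denied by operator")
        return tool_fn(*args, **kwargs)
    return gated
```

Wrapping ShellTool's run method this way is the structural change that clears human_approval_absence: the gate exists in the code, so the analyzer can see it.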
Running it
arxo init
arxo analyze --metric agent_architecture
No configuration required for a first report. Every finding includes a detector ID, evidence from the code, and a specific remediation — not a generic warning, but the exact change to make.
We’ll be writing about each axis in depth. Next up: reliability — the 16 detectors that keep your agent from spiraling.