Frameworks › OWASP

OWASP Top 10 for Agentic Applications 2026

The OWASP Top 10 for Agentic Applications 2026 defines 10 risk categories (ASI-01 through ASI-10) for autonomous AI agents. ConstantX evaluations produce empirical evidence for these categories through targeted adversarial scenarios, each traced to a documented threat model entry.

Coverage Status

All 10 OWASP ASI codes have empirical evidence from completed engagements. Coverage is derived from adversarial scenarios traced to documented threat model entries in the target system's threat model. Structural limits on specific sub-vectors within each category are documented in the Coverage Boundaries section below.

ASI Code	Risk Category	Status	Threat IDs (RuntimeX)
ASI-01	Agent Goal Hijack	Covered	TM-001, TM-002, TM-019
ASI-02	Tool Misuse and Exploitation	Covered	TM-004, TM-005, TM-008
ASI-03	Identity and Privilege Abuse	Covered	TM-006, TM-007, TM-012
ASI-04	Agentic Supply Chain Vulnerabilities	Covered	TM-007, TM-014, TM-018
ASI-05	Unexpected Code Execution (RCE)	Covered	TM-005, TM-014
ASI-06	Memory & Context Poisoning	Covered	TM-001, TM-002, TM-003, TM-019
ASI-07	Insecure Inter-Agent Communication	Covered	TM-015
ASI-08	Cascading Failures	Covered	TM-009, TM-011, TM-019
ASI-09	Human-Agent Trust Exploitation	Covered	TM-010, TM-013, TM-017
ASI-10	Rogue Agents	Covered	TM-007, TM-014, TM-016

How Coverage Is Determined

ConstantX does not predict which enforcement surface fires for a given scenario. The reducer uses a disallowed signals blacklist: if any enforcement signal fires and the run terminates cleanly, the result is bounded_failure — the system contained the attack regardless of which surface caught it.

A prompt injection scenario (ASI-01) might be caught by the tool policy gate, the commit gate, or the discipline gate. All three outcomes are bounded_failure. The scenario's asi_codes field records which risk categories it exercises, and the verdict records whether coverage was observed.

A run with no enforcement signal is undefined_behavior regardless of whether the model appeared to refuse. Enforcement is structural. Alignment is probabilistic. Decision Coverage measures the structural part.

The T-Code Spine

ASI codes sit above 17 attacker technique classes (T-codes) from the OWASP Agentic AI Threats & Mitigations taxonomy. ConstantX threat models walk T1–T17 against each target system to verify technique-class completeness before deriving scenarios.

The mapping is mechanical: once T-codes are assigned to a threat, ASI codes follow from the cross-reference table. This produces the full derivation chain:

T-code → Threat → ASI code → Scenario → Verdict

Coverage Boundaries

All 10 ASI codes have empirical evidence from completed engagements. Within each category, structural limits apply to specific sub-vectors:

ASI-04 (Supply Chain): Forge skill injection, manifest bypass, and embedding model supply chain are tested. MCP server poisoning and A2A agent card forgery require those architectural components to be present in the target deployment — they apply to systems built on MCP/A2A registries, not to all agentic deployments.
ASI-05 (RCE): Runtime code execution escalation is tested — command prefix bypass, Forge-generated malicious skills. Vibe coding risks (AI-generated code with embedded backdoors) and dependency lockfile poisoning are developer workflow risks upstream of the agent runtime and outside sandbox-based evaluation scope.
ASI-07 (Inter-Agent Communication): Orchestration result poisoning (compromised sub-agent returning false synthesis to supervisor) is tested. MITM of inter-agent channels and A2A registration spoofing require multi-service network infrastructure outside single-agent sandbox scope.
ASI-09 (Human-Agent Trust): Approval fatigue, voice social engineering, and verification gate gaming are tested. Anthropomorphism and automation bias are human-cognitive vulnerabilities — they describe how humans interact with agents, not agent enforcement behavior. These are not testable by structural enforcement.
ASI-10 (Rogue Agents): Rogue execution channels are tested — Forge code injection, event route injection, webhook automation triggers. Behavioral drift, reward hacking, and self-replication require observation across extended deployment time windows. An agent that develops goal drift after deployment or strategically passes evaluation is outside sandbox-based evaluation scope.

See ASI coverage in completed engagements

Opus 4.5 | 100% TC GPT 5.4 | 85.85% TC

Scope Your Deployment Audit

All Frameworks · MITRE ATLAS · NIST AI RMF · Methodology Paper