
OWASP Top 10 for LLM Applications 2025 — Full Breakdown

Framework · 2026-03-21 · 15 min read · FilterPrompt Security Team

Every category in the OWASP Top 10 for Large Language Model Applications 2025 — what it means, real examples, and how to test for each.

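The OWASP Top 10 for LLM Applications is the single most-cited framework in AI security, with 8,100 searches per month for the top query and growing. This post walks the taxonomy category by category with real examples and testing guidance. The category IDs below follow the numbering of the earlier v1.1 list; the 2025 edition reorders and renames several of them (for example, Sensitive Information Disclosure is LLM02:2025, Improper Output Handling is LLM05:2025, Excessive Agency is LLM06:2025, and Model Denial of Service is folded into LLM10:2025 Unbounded Consumption).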

LLM01 — Prompt Injection

Attackers manipulate model behavior through crafted input. Direct injection is typed into the prompt; indirect injection is hidden in data the model reads (URLs, documents, emails). This is the highest-volume attack class in production LLMs, and every serious defense program starts here. Test with the FilterPrompt LLM01 probe battery, run monthly.
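
To make the indirect case concrete, here is a minimal sketch of a pre-prompt check on retrieved content. The phrase list, the `flag_injection` and `build_prompt` names, and the `<untrusted_document>` delimiter are all assumptions for illustration; this is not the FilterPrompt probe battery, and keyword matching alone will miss paraphrased or encoded attacks.

```python
import re

# Hypothetical phrase list for illustration; real probe sets are far larger.
SUSPICIOUS_PATTERNS = [
    r"ignore (all |any )?(previous|prior|above) instructions",
    r"disregard (the )?(system|previous) prompt",
    r"reveal (your|the) system prompt",
]
_COMPILED = [re.compile(p, re.IGNORECASE) for p in SUSPICIOUS_PATTERNS]


def flag_injection(text: str) -> list[str]:
    """Return the patterns that match `text` (empty list means no match)."""
    return [p.pattern for p in _COMPILED if p.search(text)]


def build_prompt(system: str, retrieved_doc: str, user_question: str) -> str:
    """Wrap untrusted content in labeled delimiters; refuse flagged documents."""
    hits = flag_injection(retrieved_doc)
    if hits:
        raise ValueError(f"retrieved document flagged: {hits}")
    return (
        f"{system}\n\n"
        "<untrusted_document>\n"
        f"{retrieved_doc}\n"
        "</untrusted_document>\n\n"
        f"User question: {user_question}"
    )
```

A check like this is a cheap first layer. It belongs in front of, not instead of, least-privilege tool design and output filtering.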

LLM02 — Insecure Output Handling

LLM output is passed downstream (rendered as HTML, executed as SQL, run in a shell) without sanitization, so the model becomes an attack vector for classic web and database vulnerabilities such as XSS and SQL injection. Defense: treat model output as untrusted; escape, parameterize, or validate it before rendering or executing.
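
As a rough illustration of treating model output as untrusted, the sketch below escapes output before it reaches HTML and binds model-supplied values as SQL parameters. It uses only the Python standard library; the `customers` table and both function names are hypothetical.

```python
import html
import sqlite3


def render_reply(model_output: str) -> str:
    """Escape model output before placing it inside an HTML page."""
    return f'<div class="reply">{html.escape(model_output)}</div>'


def lookup_customer(conn: sqlite3.Connection, model_supplied_name: str) -> list[tuple]:
    """Bind the model-supplied value as a parameter; never splice it into SQL."""
    cur = conn.execute(
        "SELECT id, name FROM customers WHERE name = ?",
        (model_supplied_name,),
    )
    return cur.fetchall()
```

The same principle applies to shell commands: pass arguments as a list to the process API rather than building a command string from model text.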

LLM03 — Training Data Poisoning

An attacker introduces malicious or biased data into pretraining or fine-tuning. The result can be backdoor triggers, biased outputs, or corrupted knowledge. Defense: dataset provenance, integrity hashes, evaluation against poisoning benchmarks.
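
The integrity-hash part of that defense can be sketched as a manifest of SHA-256 digests recorded when a dataset is approved and re-checked before each training run. The function names and the manifest shape below are assumptions, not a standard format.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large shards do not load into memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's relative path to its digest."""
    return {
        str(p.relative_to(data_dir)): sha256_file(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List files that were added, removed, or modified since the manifest."""
    current = build_manifest(data_dir)
    return sorted(
        name
        for name in set(manifest) | set(current)
        if manifest.get(name) != current.get(name)
    )
```

Hashes only detect tampering after approval. They say nothing about whether the approved data was clean, which is where provenance and poisoning evaluations come in.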

LLM04 — Model Denial of Service

Attackers craft prompts that force expensive generations or exhaust the context window, driving up cost and blocking legitimate users. Defense: rate limits, generation length caps, complexity scoring on prompts.
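
A minimal sketch of the first two defenses, assuming an admission check that runs before each model call. The cap values, `TokenBucket`, and `admit` are illustrative choices; production systems typically keep bucket state in a shared store rather than in process memory.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

MAX_PROMPT_CHARS = 8_000   # assumed input cap
MAX_OUTPUT_TOKENS = 512    # assumed generation cap


@dataclass
class TokenBucket:
    capacity: float
    refill_per_sec: float
    tokens: float = field(init=False)
    updated: float = field(init=False)

    def __post_init__(self) -> None:
        self.tokens = self.capacity
        self.updated = time.monotonic()

    def allow(self, cost: float = 1.0, now: Optional[float] = None) -> bool:
        """Refill based on elapsed time, then spend `cost` if available."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def admit(prompt: str, requested_tokens: int, bucket: TokenBucket) -> int:
    """Return the output-token cap to send upstream, or raise if rejected."""
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("prompt exceeds length cap")
    if not bucket.allow():
        raise RuntimeError("rate limit exceeded")
    return min(requested_tokens, MAX_OUTPUT_TOKENS)
```

One bucket per user or API key keeps a single abusive caller from starving everyone else.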

LLM05 — Supply Chain Vulnerabilities

The risk comes from third-party components: compromised model weights, poisoned fine-tuning datasets, malicious plugins, or unsafe vector stores. Defense: SBOM for AI, signed models, plugin allowlists.
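
A rough sketch of two of those controls, using pinned digests as a simpler stand-in for full cryptographic signing, which is not shown here. The lockfile contents, artifact name, and plugin names are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical lockfile: artifact name -> expected SHA-256 hex digest.
PINNED_DIGESTS = {
    "demo-model.bin": hashlib.sha256(b"demo weights").hexdigest(),
}
ALLOWED_PLUGINS = {"calendar_read", "web_search"}  # illustrative names


def verify_artifact(name: str, data: bytes) -> bool:
    """Accept an artifact only if its digest matches the pinned value."""
    expected = PINNED_DIGESTS.get(name)
    if expected is None:
        return False  # unknown artifacts are rejected, not trusted by default
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected)


def load_plugins(requested: list[str]) -> list[str]:
    """Refuse the whole request if any plugin is off the allowlist."""
    rejected = [p for p in requested if p not in ALLOWED_PLUGINS]
    if rejected:
        raise PermissionError(f"plugins not on allowlist: {rejected}")
    return requested
```

Pinned digests catch a swapped file but not a compromised publisher; signature verification against a trusted key covers that second case.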

LLM06 — Sensitive Information Disclosure

The model reveals its system prompt, PII, secrets, or proprietary data. Defense: firewall with PII/secret detection, system-prompt hardening, output DLP.
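
Output DLP can be sketched as a redaction pass over the model's response before it reaches the user. The three rules below are simplified examples chosen for illustration, not a complete detector and not FilterPrompt's rule set; real deployments add many more patterns plus entropy and context checks.

```python
import re

# Simplified, illustrative rules; each maps a label to a compiled pattern.
REDACTION_RULES = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "AWS_ACCESS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> tuple[str, dict[str, int]]:
    """Replace matches with a placeholder and count hits per rule."""
    counts: dict[str, int] = {}
    for label, pattern in REDACTION_RULES.items():
        text, n = pattern.subn(f"[{label} REDACTED]", text)
        if n:
            counts[label] = n
    return text, counts
```

The per-rule counts are useful as an alerting signal: a sudden rise in redactions for one user can indicate an active extraction attempt.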

LLM07 — Insecure Plugin/Tool Design

Tools and functions accept unvalidated input from the model, allowing SQL injection, SSRF, or unauthorized actions. Defense: schema validation, strict allowlists, principle of least privilege.
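
As one possible shape for that validation, the sketch below checks the arguments of a hypothetical URL-fetching tool before it runs. The argument names, the host allowlist, and the timeout bounds are assumptions; a schema library could replace the hand-written checks.

```python
from urllib.parse import urlparse

ALLOWED_HOSTS = {"api.example.com"}  # hypothetical allowlist


def validate_fetch_args(args: dict) -> dict:
    """Validate model-supplied arguments for a hypothetical fetch tool."""
    if set(args) != {"url", "timeout_s"}:
        raise ValueError("unexpected or missing arguments")
    url, timeout_s = args["url"], args["timeout_s"]
    if not isinstance(url, str):
        raise ValueError("url must be a string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(timeout_s, bool) or not isinstance(timeout_s, int):
        raise ValueError("timeout_s must be an integer")
    if not 1 <= timeout_s <= 30:
        raise ValueError("timeout_s out of range")
    parsed = urlparse(url)
    if parsed.scheme != "https" or parsed.hostname not in ALLOWED_HOSTS:
        raise ValueError("URL is not on the allowlist")
    return {"url": url, "timeout_s": timeout_s}
```

Allowlisting the parsed hostname, rather than searching the raw string, rejects tricks such as `https://api.example.com@evil.example/`.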

LLM08 — Excessive Agency

Agents are given too many tools or too broad permissions; an injection triggers unauthorized email, calendar, DB, or code-execution actions. Defense: minimum-viable tool sets, human approval on high-risk actions, per-tool blast radius analysis.
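
The first two defenses can be sketched as a dispatcher that sits between the agent and its tools. The tool names, the two sets, and the `approve` callback are hypothetical; in practice the callback would open a review ticket or prompt an operator.

```python
from typing import Any, Callable

GRANTED_TOOLS = {"search_docs", "send_email"}                # minimum set for this agent
HIGH_RISK_TOOLS = {"send_email", "run_code", "delete_rows"}  # require human sign-off


def dispatch(
    tool: str,
    args: dict,
    tools: dict[str, Callable[..., Any]],
    approve: Callable[[str, dict], bool],
) -> Any:
    """Run a tool call only if it is granted and, when risky, approved."""
    if tool not in GRANTED_TOOLS or tool not in tools:
        raise PermissionError(f"tool not granted to this agent: {tool}")
    if tool in HIGH_RISK_TOOLS and not approve(tool, args):
        raise PermissionError(f"reviewer denied call to {tool}")
    return tools[tool](**args)
```

Because the grant check happens outside the model, a successful injection can only reach tools the agent was already allowed to use.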

LLM09 — Overreliance

Users trust hallucinated model output for high-stakes decisions. Defense: source citations, confidence surfacing, human review workflows for regulated domains.
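
One way to enforce the citation and review defenses is a gate that runs before an answer is shown. This is a sketch under stated assumptions: answers cite sources with bracketed numbers like `[1]`, and the status strings and `review_status` name are invented for illustration. It checks that citations exist and point at real sources, not that the sources actually support the claim.

```python
import re


def review_status(answer: str, num_sources: int, regulated: bool) -> str:
    """Classify an answer as 'ok', 'unsupported', or 'needs_human_review'."""
    if regulated:
        return "needs_human_review"  # regulated domains always get a reviewer
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    valid = {n for n in cited if 1 <= n <= num_sources}
    if not cited or cited != valid:
        return "unsupported"  # no citations, or a citation to a missing source
    return "ok"
```

Answers marked `unsupported` can be shown with a visible warning or withheld, depending on how much risk the application tolerates.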

LLM10 — Model Theft

Attackers extract model weights or replicate behavior via query-based extraction. Defense: rate limits per user, output watermarking, extraction-detection scanning.
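
Per-user limits for extraction detection can be sketched as a sliding-window counter. The `ExtractionMonitor` class and its thresholds are illustrative assumptions; query volume is only one signal, and real detectors also look at input diversity and systematic coverage of the input space.

```python
from collections import defaultdict, deque


class ExtractionMonitor:
    """Track query timestamps per user inside a sliding time window."""

    def __init__(self, max_queries: int, window_s: float) -> None:
        self.max_queries = max_queries
        self.window_s = window_s
        self._seen: dict[str, deque] = defaultdict(deque)

    def record(self, user_id: str, now: float) -> bool:
        """Record one query; return True while the user is within the limit."""
        q = self._seen[user_id]
        # Drop timestamps that have aged out of the window.
        while q and now - q[0] >= self.window_s:
            q.popleft()
        q.append(now)
        return len(q) <= self.max_queries
```

A `False` result does not have to mean a hard block. Throttling or flagging the account for review are gentler first responses.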

How to score your app against the OWASP LLM Top 10

  1. Run FilterPrompt Scanner's OWASP LLM Top 10 sampler — free, ~5 minutes.
  2. Review the per-category score in the PDF report.
  3. Prioritize LLM01 (Prompt Injection), LLM06 (Sensitive Information Disclosure), and LLM08 (Excessive Agency) first; these have the highest exploitation rates.
  4. Fix, re-scan, and add rules to your AI firewall for anything you can't fix at the model level.
