What is an LLM vulnerability scanner?

An LLM vulnerability scanner sends batteries of adversarial probes — jailbreaks, prompt injections, PII extraction attempts, harmful-content requests — at a target LLM and grades the responses. The output is a vulnerability + optimization report with per-probe evidence, severity, and a prioritized fix list.

Which LLMs can I scan with FilterPrompt?

OpenAI, Anthropic, Google Gemini, Azure OpenAI, plus any OpenAI-compatible endpoint — Ollama, Groq, Mistral, Together AI, OpenRouter, Perplexity, Hugging Face, vLLM, or your own custom endpoint. Bring your own keys per tenant.

What kinds of vulnerabilities does FilterPrompt test for?

Jailbreaks (DAN, role hijack, translation smuggling), direct and indirect prompt injection, system-prompt extraction, harmful-content compliance, PII / secret leakage, bias & fairness, RAG poisoning, agent/tool abuse, output quality, and robustness — categories map to the OWASP LLM Top 10.

How are probes graded?

Each probe declares an evaluator: regex match, refusal-check, contains-check, or an AI judge (Gemini 3 Flash). Pass/fail comes with severity, category, the exact prompt sent, the model's full response, and the evaluator's reason — fully auditable.

How much does a scan cost?

1 credit per probe executed. New accounts get 1 welcome credit on signup. Pay-as-you-go credit packs after that — credits never expire. Connecting LLMs and creating tenants is free.

Prompt Injection Scanner Comparison 2026: FilterPrompt vs Garak, PromptFoo, PyRIT, Lakera Red, Mindgard

Comparison · 2026-05-10 · 10 min read · FilterPrompt Security Team

Honest side-by-side of the six LLM vulnerability scanners enterprise teams actually evaluate in 2026 — probe library size, OWASP LLM Top 10 coverage, report formats, CI integration, pricing and SOC 2.

If you're choosing an LLM vulnerability scanner in 2026, you're past the 'do I need this' phase — you're comparing tools. This page is the honest side-by-side: what each scanner actually produces in a real evaluation, where FilterPrompt wins, and where it doesn't.

Six scanners dominate enterprise shortlists right now: FilterPrompt, NVIDIA Garak, PromptFoo (red-team mode), Microsoft PyRIT, Lakera Red, and Mindgard. We've run alongside all five — and where a competitor genuinely wins a row, we say so. None of these are runtime guardrails or firewalls; they all produce a vulnerability report against an LLM you connect.

Side-by-side: the rows that matter

Run a free scan against your own LLM

The comparison above is useful, but the only number that matters is what the scanner finds against your model. Connect any LLM provider and FilterPrompt runs the full OWASP LLM Top 10 probe battery — the first scan is free and takes ~4 minutes.

The real difference the table won't show

Specs comparisons hide four things that decide whether a scanner actually ships value: time-to-first-report, support quality, judge methodology, and the audit-grade output your security team can actually hand to a buyer or auditor. Here's what we've seen across roughly 200 enterprise evaluations.

Time to first report

Garak and PyRIT look free until you cost a senior engineer's two-week ramp to wire them into CI, write the report renderer, and own the upgrade path. Lakera Red and Mindgard ship fast but require a sales call before you can even try them — average procurement cycle 6–8 weeks. FilterPrompt is signup-to-first-scan in under 4 minutes against your real provider key, and the OWASP-mapped PDF is generated automatically when the run finishes.

Judge methodology

Regex-only scanners (Garak's default judge) catch the obvious 'ignore previous instructions' but miss soft-refusal bypasses where the model refuses with a disclaimer then complies in the next paragraph. Classifier-only scanners (Lakera Red, Mindgard) catch most known attack families but degrade on novel encodings. FilterPrompt evaluates every probe through a multi-stage stack — deterministic regex, refusal-classifier, contains-check, and an LLM judge (Gemini 3 Flash) for the nuanced verdicts. The judge tier closes the gap on the ~15% of attacks that bypass classifier-only stacks.

Audit-grade output

A JSON dump of pass/fail probes is not what your CISO wants to send to an SOC 2 auditor or to a customer's security review. FilterPrompt produces an OWASP LLM Top 10–mapped PDF with severity-weighted scoring, attack transcript per finding, and remediation guidance — the same format buyers in regulated industries actually accept as evidence. Garak and PromptFoo will get you a list of failures; you still have to write the report.

A real anonymized scenario

A US-based financial services company evaluated four of the six tools above for the LLM-powered claim-assistant feature they were shipping. Lakera Red scored well on the demo dataset but their SaaS-only deployment and 6-week procurement cycle didn't match the team's timeline. PromptFoo did the CI piece well but produced no audit-grade output their security team would sign off on. They chose FilterPrompt for three reasons: an OWASP LLM Top 10 PDF on the first run, the Type I + Type II-in-progress SOC 2 trajectory, and a self-host option for their data-residency requirement. After 60 days: a quarterly scanning cadence in place, two prompt-injection findings remediated before launch, and one audit-evidence pack already handed to a procurement reviewer.

Try it on your own LLM in 60 seconds

No API, no CLI, no SDK to install. Sign up free, connect your LLM (OpenAI, Anthropic, Azure, Google or any custom endpoint) on the Tenants page, pick the OWASP LLM Top 10 suite, and click Run. The scanner fires the probe battery against your model and writes a full OWASP-mapped report you can read in the dashboard or export as PDF.

See your OWASP score

Every scan returns a full OWASP LLM Top 10 report with severity-weighted scoring, per-probe attack transcripts, and a PDF you can hand to your auditor or a prospect's security reviewer.

Pricing and guarantee

From $49/month after the free tier (10,000 probe credits). Annual plans bundle SOC 2 evidence packs and SLA support. Money-back guarantee: if your first scan doesn't surface at least one prompt-injection vulnerability in week 1, we refund the month — every customer to date has found something on scan one.