What is an LLM vulnerability scanner?

An LLM vulnerability scanner sends batteries of adversarial probes — jailbreaks, prompt injections, PII extraction attempts, harmful-content requests — at a target LLM and grades the responses. The output is a vulnerability + optimization report with per-probe evidence, severity, and a prioritized fix list.

Which LLMs can I scan with FilterPrompt?

OpenAI, Anthropic, Google Gemini, Azure OpenAI, plus any OpenAI-compatible endpoint — Ollama, Groq, Mistral, Together AI, OpenRouter, Perplexity, Hugging Face, vLLM, or your own custom endpoint. Bring your own keys per tenant.

What kinds of vulnerabilities does FilterPrompt test for?

Jailbreaks (DAN, role hijack, translation smuggling), direct and indirect prompt injection, system-prompt extraction, harmful-content compliance, PII / secret leakage, bias & fairness, RAG poisoning, agent/tool abuse, output quality, and robustness — categories map to the OWASP LLM Top 10.

How are probes graded?

Each probe declares an evaluator: regex match, refusal-check, contains-check, or an AI judge (Gemini 3 Flash). Pass/fail comes with severity, category, the exact prompt sent, the model's full response, and the evaluator's reason — fully auditable.

How much does a scan cost?

1 credit per probe executed. New accounts get 1 welcome credit on signup. Pay-as-you-go credit packs after that — credits never expire. Connecting LLMs and creating tenants is free.

AI Code Vulnerability Scanner: GitHub Tools, Detection, and Workflow

Guide · 2023-08-14 · 13 min read · FilterPrompt Security Team

How AI code vulnerability scanners work, the best GitHub-integrated tools in 2024, and how to wire AI code scanning into your CI without drowning in false positives.

Static application security testing (SAST) has been around for two decades. The tools work — and most of them quietly produce so much noise that developers ignore them. The promise of an AI code vulnerability scanner is not that it finds new bug classes (it largely doesn't), but that it triages findings the way a senior security engineer would: this one is real, this one is a false positive, this one is real but not exploitable in your code path. That triage is what finally makes code scanning usable in CI.

What 'AI code vulnerability scanner' actually means in 2024

The phrase covers three quite different tools, and conflating them is how teams end up with the wrong product:

AI-augmented SAST — a traditional static analysis engine (CodeQL, Semgrep, Snyk Code) with an LLM layer that explains findings, suggests fixes, and triages false positives.
Pure-LLM code scanners — newer tools that hand the entire diff or repo to an LLM and ask it to find vulnerabilities in natural language. Cheap to build, surprisingly noisy, occasionally brilliant.
Agentic code review bots — tools that act like a security reviewer on a pull request, reading context (issues, history, dependencies) and commenting like a senior engineer would. GitHub Copilot's security review feature, Snyk's PR assistant, and several open-source projects sit here.

Each is good at something different. AI-augmented SAST is the best fit for established codebases with compliance needs. Pure-LLM scanning shines on greenfield code where the model has full context. Agentic review bots win on developer experience because they meet engineers where they already work — the pull request.

AI vulnerability scanner GitHub options worth evaluating

If you're scanning code that lives on GitHub, the integration story matters more than the engine. Findings that don't appear as code annotations on a PR get ignored. Here are the tools worth shortlisting in 2024:

GitHub Advanced Security (CodeQL + Copilot Autofix)

GitHub's own offering. CodeQL is an excellent semantic SAST engine — it actually understands data flow, not just patterns. Copilot Autofix layers an LLM on top to suggest patches inline. Strengths: native PR integration, no extra vendor, supports most mainstream languages. Weaknesses: expensive (per-committer licensing), CodeQL queries take engineering time to author for custom rules, weak on JavaScript/TypeScript supply-chain risks unless paired with Dependabot.

Snyk Code (DeepCode AI)

Hybrid symbolic + ML engine, very fast, strong false-positive rate. The DeepCode lineage means it learns from real fixes scraped from open source. PR-native via Snyk's GitHub app. Strengths: speed, low noise, excellent IDE integration. Weaknesses: closed-source rules, occasional gaps in newer frameworks, pricing scales sharply past 10 developers.

Semgrep + Semgrep Assistant

Open-source pattern-based SAST with a managed cloud (Semgrep Pro) that adds AI-assisted triage. Strengths: rules are readable YAML you can author and review, large community ruleset, free tier is genuinely useful. Weaknesses: pattern matching misses data-flow bugs that CodeQL catches, AI triage is newer and still maturing.

SonarQube + SonarLint AI

The traditional code-quality giant has added AI-assisted review and AI code generation security checks (does the AI-generated code introduce a vulnerability?). Strengths: enterprise-grade governance, broad language support. Weaknesses: heavyweight, opinionated about workflow, AI features are bolt-ons rather than the core engine.

Open-source agentic scanners

Search 'ai vulnerability scanner github' and you'll find a long tail of agentic projects — security-focused autoGPT forks, GPT-Engineer-derived security reviewers, and academic agents that read repos and produce vulnerability reports. Most are interesting prototypes; few are production-ready. Worth experimenting with on greenfield repos, not betting on for compliance.

What an AI code scanner detects that traditional SAST misses

If your only goal is finding the same SQL injections and XSS patterns CodeQL has caught for years, a classic SAST is probably enough. The reasons to add AI to your code scanning are different:

Insecure use of LLM APIs — calling OpenAI with unsanitised user input, sending PII to a third-party AI provider, embedding system prompts in places where they leak. Traditional SAST has no rules for this; LLM-aware scanners do.
Prompt injection sinks in your own code — places where untrusted text reaches an LLM call without sanitisation. These look perfectly normal to a regex-based scanner.
Insecure prompt construction — string concatenation that lets an attacker break out of the system prompt context. An AI scanner can read the prompt template and reason about it.
Business-logic vulnerabilities in agentic code — e.g. an AI agent that calls a tool with parameters derived from another LLM's output without validation.
AI-generated code review — when developers paste Copilot output, the scanner can flag known-insecure patterns the model tends to produce (raw SQL string templating, weak crypto defaults).
Triage of existing SAST findings — the most underrated feature. An LLM reads the finding, the surrounding code, and the data flow, and tells you whether the finding is reachable. This is where 60–80% of false positives can be eliminated.

How to wire AI code scanning into CI without breaking the team

The fastest way to lose developer trust in a security tool is to block builds on noisy findings. The fastest way to keep their trust is to have the tool block only on high-confidence, exploitable issues. Here's a workflow that has held up across a few hundred engineering teams:

Run the scanner on every PR, but only fail the build on findings rated 'high' severity AND 'high' confidence after AI triage.
Surface medium and low severity as PR comments, never as build failures. Engineers will engage with comments; they will mute build-breakers.
Let the AI scanner suggest fixes inline. Suggested patches that engineers can accept with one click are accepted ~5x more often than findings without a fix suggestion.
Run a weekly full-repo scan (not just diff scan) to catch issues outside the PR diff. Surface results to the security team, not the devs, until they're triaged.
Track scanner accuracy over time. If false-positive rate climbs above 20% on critical findings, retune severity thresholds before the team starts ignoring everything.

AI code vulnerability scanning and the OWASP LLM Top 10

If your code calls an LLM API anywhere — directly or via a framework like LangChain, LlamaIndex, or the OpenAI SDK — your code scanner needs LLM-aware rules. The OWASP LLM Top 10 maps cleanly to code-level patterns a static scanner can flag:

LLM01 Prompt Injection — flag any LLM call where untrusted input is concatenated into the prompt without going through a sanitiser.
LLM02 Insecure Output Handling — flag any LLM response that's passed to eval(), exec(), a SQL driver, or rendered as HTML without escaping.
LLM05 Supply Chain Vulnerabilities — flag use of untrusted model checkpoints or unverified prompt template packages from npm/PyPI.
LLM06 Sensitive Information Disclosure — flag LLM calls that include obvious PII (regex on the prompt builder) or that fetch secrets and embed them.
LLM07 Insecure Plugin Design — flag tool definitions that accept free-form parameters without schema validation.
LLM08 Excessive Agency — flag agent definitions that grant tools broader scope than necessary (write access where read would do).

Few classic SAST tools have rules for these yet. This is where AI-powered code vulnerability scanners earn their keep — an LLM reading the code can recognise the pattern even when no formal rule exists.

Pitfalls of AI-based code vulnerability scanning

Three failure modes are common enough to call out:

Hallucinated findings

Pure-LLM scanners occasionally invent vulnerabilities that don't exist — the model 'reasons' the code is broken when it isn't. Mitigation: always require the scanner to cite the exact lines and explain the data flow. Findings without evidence get ignored.

Context limits

Even long-context models struggle with monorepos. A vulnerability that spans three files may be invisible if the scanner only sees one. Mitigation: pair LLM triage with a symbolic engine (CodeQL, Semgrep) that builds a real call graph.

Data leakage to the scanner

Sending your private code to a third-party LLM is itself a security decision. Check whether the scanner sends code to a hosted model, whether it's used for training (most enterprise tiers say no), and whether you can self-host the model for sensitive repos.

Making the buy-vs-build decision

If you have an AppSec team and a research budget, building an in-house AI code scanner on top of CodeQL + a frontier LLM is feasible — the open-source pieces are good. For everyone else, the maths favours buying. The maintenance cost of rules, models, and CI integration eats more time than the licence fee on any reputable managed product. The exception is if you have unusual stack requirements (esoteric language, air-gapped environment) where no commercial scanner has good coverage.

Bottom line

An AI code vulnerability scanner is not a magic bullet — it's a triage layer that finally makes 20-year-old SAST output usable, plus a small set of new rules that catch the LLM-era bugs your existing scanner has no rules for. Pick the tool that integrates cleanly with the GitHub workflow you already use, run it in PR-comment mode for a quarter before you ever block a build, and pair it with a runtime LLM scanner like FilterPrompt for the half of the OWASP LLM Top 10 that only shows up at request time.