What is an LLM vulnerability scanner?

An LLM vulnerability scanner sends batteries of adversarial probes (jailbreaks, prompt injections, PII extraction attempts, harmful-content requests) at a target LLM and grades the responses. The output is a vulnerability and optimization report with per-probe evidence, severity, and a prioritized fix list.

Which LLMs can I scan with FilterPrompt?

OpenAI, Anthropic, Google Gemini, Azure OpenAI, plus any OpenAI-compatible endpoint — Ollama, Groq, Mistral, Together AI, OpenRouter, Perplexity, Hugging Face, vLLM, or your own custom endpoint. Bring your own keys per tenant.

What kinds of vulnerabilities does FilterPrompt test for?

Jailbreaks (DAN, role hijack, translation smuggling), direct and indirect prompt injection, system-prompt extraction, harmful-content compliance, PII / secret leakage, bias & fairness, RAG poisoning, agent/tool abuse, output quality, and robustness — categories map to the OWASP LLM Top 10.
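As an illustration of what a category-to-OWASP mapping can look like, here is a minimal sketch. The identifiers and titles come from the 2025 edition of the OWASP Top 10 for LLM Applications. The category keys and the mapping itself are assumptions for illustration, not FilterPrompt's actual mapping. Categories without a clean OWASP counterpart are left unmapped on purpose.

```python
# Subset of the OWASP Top 10 for LLM Applications (2025 edition).
OWASP_LLM_2025 = {
    "LLM01": "Prompt Injection",
    "LLM02": "Sensitive Information Disclosure",
    "LLM04": "Data and Model Poisoning",
    "LLM06": "Excessive Agency",
    "LLM07": "System Prompt Leakage",
    "LLM08": "Vector and Embedding Weaknesses",
}

# Hypothetical category keys and an assumed mapping, for illustration only.
CATEGORY_TO_OWASP = {
    "jailbreak": ["LLM01"],
    "prompt_injection": ["LLM01"],
    "system_prompt_extraction": ["LLM07"],
    "pii_secret_leakage": ["LLM02"],
    "rag_poisoning": ["LLM04", "LLM08"],
    "agent_tool_abuse": ["LLM06"],
}


def owasp_labels(category: str) -> list[str]:
    """Return 'ID: title' labels for a scan category, or [] if unmapped."""
    ids = CATEGORY_TO_OWASP.get(category, [])
    return [f"{i}: {OWASP_LLM_2025[i]}" for i in ids]
```

An unmapped category such as bias returns an empty list, which a report can surface as "no OWASP equivalent" instead of forcing a poor fit.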

How are probes graded?

Each probe declares an evaluator: regex match, refusal-check, contains-check, or an AI judge (Gemini 3 Flash). Pass/fail comes with severity, category, the exact prompt sent, the model's full response, and the evaluator's reason — fully auditable.
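The three deterministic evaluator types can be sketched roughly as follows. This is a minimal illustration, not FilterPrompt's implementation: `Verdict`, the function names, and the `REFUSAL_MARKERS` list are all hypothetical. The AI-judge evaluator is omitted because it needs a live model call.

```python
import re
from dataclasses import dataclass

# Hypothetical refusal phrases; a real scanner would use a broader, tuned list.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm unable to")


@dataclass
class Verdict:
    passed: bool  # True means the model resisted the probe
    reason: str   # explanation kept for the audit trail


def regex_match(response: str, pattern: str) -> Verdict:
    """Fail the probe if the response matches a forbidden pattern."""
    hit = re.search(pattern, response, flags=re.IGNORECASE)
    if hit:
        return Verdict(False, f"matched forbidden pattern: {hit.group(0)!r}")
    return Verdict(True, "no forbidden pattern found")


def contains_check(response: str, needle: str) -> Verdict:
    """Fail the probe if a forbidden substring appears (case-insensitive)."""
    found = needle.lower() in response.lower()
    return Verdict(not found, f"{'found' if found else 'did not find'} {needle!r}")


def refusal_check(response: str) -> Verdict:
    """Pass the probe only if the response contains a refusal phrase."""
    # Normalise curly apostrophes so "I can’t" and "I can't" both match.
    text = response.lower().replace("\u2019", "'")
    refused = any(marker in text for marker in REFUSAL_MARKERS)
    return Verdict(refused, "model refused" if refused else "no refusal phrase detected")
```

Returning a reason string with every verdict is what makes a result auditable: a reviewer can see why a probe passed or failed without re-running it.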

How much does a scan cost?

Each probe executed costs 1 credit. New accounts get 1 welcome credit on signup. After that, credits come in pay-as-you-go packs and never expire. Connecting LLMs and creating tenants are free.

AI Risk and AI Protection: Protecto AI, Protect AI, and How to Secure Artificial Intelligence

Reference · 2018-09-18 · 15 min read · FilterPrompt Security Team

An expert reference on AI risk and artificial intelligence security — what risks of AI matter most, how Protect AI and Protecto AI compare to alternatives, and how to build an AI protection program that actually works.

Two of the most-searched names in this space — Protect AI and Protecto AI — are different companies solving overlapping but distinct problems. Both anchor a broader question every CISO is now asked weekly: what are the actual risks of AI, and what does artificial intelligence security as a program look like in practice? This reference walks through the AI risk landscape in 2026, compares the platform options including Protect AI and Protecto AI, and lays out the controls that hold up under regulator scrutiny.

Protect AI vs Protecto AI: what's the difference?

Same prefix, different focus. Easy to conflate, and the conflation matters because picking the wrong one wastes a procurement cycle.

Protect AI

Protect AI (protectai.com) is an AI security platform focused on the ML/MLOps supply chain and runtime security for AI applications. Their well-known products include ModelScan (scans model files for malicious code), Guardian (enforces ML supply-chain policy), and Layer (LLM runtime guardrails). Palo Alto Networks acquired the company in 2025 and folded the platform into Prisma AIRS. Strongest fit: enterprises with substantial in-house ML and a need to govern model artefacts end-to-end.

Protecto AI

Protecto (protecto.ai) is an AI data privacy and DLP platform. Their core offering is intelligent tokenisation and PII redaction for data flowing into and out of LLMs — preserving format and utility while stripping identifiers. Strongest fit: regulated industries (healthcare, financial services, legal) where the dominant AI risk is sensitive data leakage to third-party model providers.
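To make the idea of format-preserving tokenisation concrete, here is a minimal sketch for email addresses only. It is not Protecto's algorithm: `TokenVault`, the regex, and the `example.invalid` placeholder domain are assumptions for illustration. A production system would cover many identifier types and store the vault securely.

```python
import re

# Simplified email pattern; real PII detection needs far more than a regex.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class TokenVault:
    """Maps each distinct email to a stable, email-shaped placeholder and back."""

    def __init__(self) -> None:
        self._forward: dict[str, str] = {}
        self._reverse: dict[str, str] = {}

    def tokenize(self, text: str) -> str:
        """Replace every email with its placeholder before the text leaves your boundary."""
        def swap(match: re.Match) -> str:
            value = match.group(0)
            if value not in self._forward:
                token = f"user{len(self._forward) + 1}@example.invalid"
                self._forward[value] = token
                self._reverse[token] = value
            return self._forward[value]

        return EMAIL_RE.sub(swap, text)

    def detokenize(self, text: str) -> str:
        """Restore the original values in a model response."""
        for token, value in self._reverse.items():
            text = text.replace(token, value)
        return text
```

Because the placeholder still looks like an email address and repeats consistently, the model can reason about "the same person" across a prompt without ever seeing the real identifier.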

Quick chooser

Need to scan model files (.pkl, .safetensors) for malicious code in your CI? Protect AI.
Need to redact PHI/PII before sending prompts to OpenAI or Anthropic? Protecto.
Need adversarial scanning against your LLM endpoint and agents? Neither — that's the LLM vulnerability scanner category (FilterPrompt, Lakera Red, Garak).
Need runtime prompt-injection blocking? Protect AI Layer, Lakera Guard, FilterPrompt firewall, Cloudflare Firewall for AI all compete.

Most mature programs end up running two or three of these, not one. The categories don't overlap fully, and the integration story is usually 'they hand off to each other at well-defined boundaries'.

What are the actual risks of AI? A taxonomy

'Risks of AI' is a broad enough question that any answer needs structure. The useful split is across three planes (technical, business, and societal) because the controls for each live at different layers.

Technical risks

Prompt injection — direct and indirect, the dominant production attack against LLM apps.
Jailbreaks and policy bypass — extracting harmful or restricted content from models.
Sensitive data leakage — both into the model provider (your customer data) and out to the user (data the model retrieved that they shouldn't see).
Model and training data extraction — membership inference, model inversion, training-set leakage.
Data and model poisoning — backdoors planted in training data or RAG corpora, supply-chain attacks on model artefacts.
Excessive agency on AI agents — tool calls executed at attacker direction.
Insecure output handling — model output rendered in a context that executes (HTML, SQL, shell).
Adversarial inputs — perturbations that fool classifiers, including your own defensive classifiers.
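Of the risks above, insecure output handling is the easiest to show in code. The sketch below treats model output as untrusted input: it is escaped before rendering as HTML and passed to SQL only as a bound parameter. The function names and the `orders` table are hypothetical.

```python
import html
import sqlite3


def render_reply(model_output: str) -> str:
    """Escape model output before placing it in an HTML page."""
    return f"<p>{html.escape(model_output)}</p>"


def lookup_order(conn: sqlite3.Connection, order_id_from_model: str):
    """Bind model-supplied values as parameters; never format them into the SQL string."""
    cur = conn.execute(
        "SELECT status FROM orders WHERE id = ?",
        (order_id_from_model,),
    )
    return cur.fetchone()
```

The same principle applies to shell commands and any other context that executes: the model's text is data, and the surrounding code decides what is allowed to run.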

Business risks

Hallucination at scale — confidently wrong outputs in customer-facing contexts (legal, medical, financial advice).
Reputation risk — model outputs that are biased, offensive, or off-brand.
Vendor and concentration risk — dependence on a small number of foundation-model providers with opaque update cycles.
Cost and credit exposure — runaway agent loops, prompt-injection-driven API exhaustion.
IP and licensing risk — training data provenance, output ownership ambiguity.
Compliance gaps — EU AI Act, NIST AI RMF, ISO/IEC 42001, sector-specific rules (HIPAA, GLBA, PCI).
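Cost exposure from runaway agent loops is one of the few business risks with a direct code-level control. A minimal sketch of a per-task budget guard, assuming hypothetical names (`AgentBudget`, `BudgetExceeded`) and limits chosen by the caller:

```python
class BudgetExceeded(RuntimeError):
    """Raised when an agent task exceeds its step or token budget."""


class AgentBudget:
    """Tracks steps and tokens for one agent task and stops it at a hard cap."""

    def __init__(self, max_steps: int, max_tokens: int) -> None:
        self.max_steps = max_steps
        self.max_tokens = max_tokens
        self.steps = 0
        self.tokens = 0

    def charge(self, tokens_used: int) -> None:
        """Call once per model or tool invocation, before acting on the result."""
        self.steps += 1
        self.tokens += tokens_used
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit {self.max_steps} exceeded")
        if self.tokens > self.max_tokens:
            raise BudgetExceeded(f"token limit {self.max_tokens} exceeded")
```

A hard cap per task bounds the worst case regardless of whether the loop was caused by a bug or by an injected instruction.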

Societal and systemic risks

Fraud at scale enabled by deepfakes, mass-produced misinformation, model-induced labour disruption, and concentration of capability among a small set of actors. These rarely sit on a CISO's plate directly but increasingly show up in board-level AI policy discussions and regulatory drafts.

Artificial intelligence security as a program: the four pillars

A serious AI security program rests on four pillars. Treat them as a complete set; gaps in any one pillar tend to undo the others.

1. Inventory and governance

You cannot protect what you do not know about. Maintain an inventory of every model, prompt, agent, dataset, and embedding store in use — including shadow AI in business units. Assign an owner. Classify by data sensitivity and customer exposure. Without this pillar, the other three are guesswork.
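An inventory entry needs only a handful of fields to be useful. The sketch below is one possible shape, with hypothetical names (`AIAsset`, `Sensitivity`) and a simple ordering rule that puts customer-facing, high-sensitivity assets first.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Sensitivity(IntEnum):
    PUBLIC = 1
    INTERNAL = 2
    REGULATED = 3


@dataclass(frozen=True)
class AIAsset:
    name: str
    kind: str                 # e.g. "model", "prompt", "agent", "dataset", "embedding_store"
    owner: Optional[str]      # None marks an unowned (often shadow AI) asset
    sensitivity: Sensitivity
    customer_facing: bool


def unowned(assets: list[AIAsset]) -> list[str]:
    """Names of assets nobody has claimed yet."""
    return [a.name for a in assets if not a.owner]


def by_priority(assets: list[AIAsset]) -> list[str]:
    """Customer-facing first, then by descending data sensitivity."""
    ranked = sorted(assets, key=lambda a: (a.customer_facing, a.sensitivity), reverse=True)
    return [a.name for a in ranked]
```

The priority order doubles as the order in which to run baseline scans and roll out runtime controls.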

2. Pre-deploy assurance

Adversarial testing before production. Red-team every customer-facing model and agent. Map findings to OWASP LLM Top 10 and your governing framework. Block release on regression in critical categories. This is where AI vulnerability scanners earn their cost.
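Blocking a release on regression can be as simple as comparing failed-probe counts per category against a stored baseline. A minimal sketch, assuming a hypothetical set of critical categories and scan results already reduced to `{category: failed_probe_count}`:

```python
# Hypothetical choice of categories that must never regress.
CRITICAL = frozenset({"prompt_injection", "pii_leakage"})


def regressions(baseline: dict[str, int], current: dict[str, int]) -> dict[str, int]:
    """Return categories whose failed-probe count rose, with the size of the increase."""
    return {
        category: failures - baseline.get(category, 0)
        for category, failures in current.items()
        if failures > baseline.get(category, 0)
    }


def release_allowed(
    baseline: dict[str, int],
    current: dict[str, int],
    critical: frozenset = CRITICAL,
) -> bool:
    """Allow the release only if no critical category got worse."""
    return not (regressions(baseline, current).keys() & critical)
```

In CI, a `False` result would fail the pipeline step. Non-critical regressions can still be reported without blocking.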

3. Runtime protection

AI firewall on the request path: input filtering for injection, jailbreak, and PII; output filtering for exfiltration patterns and sensitive disclosure; tool-call validation for agents; rate limiting and credit-exhaustion guards. Every verdict logged for forensics.
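Tool-call validation is the part of this pillar most specific to agents. The sketch below checks a proposed call against an allowlist and validates each parameter with a full-match pattern. The tool names, patterns, and `validate_tool_call` itself are hypothetical.

```python
import re

# Hypothetical allowlist: tool name -> {parameter name: pattern the value must fully match}.
ALLOWED_TOOLS = {
    "get_order_status": {"order_id": r"\d{1,10}"},
    "send_email": {"to": r"[^@\s]+@example\.com", "subject": r".{1,120}"},
}


def validate_tool_call(name: str, args: dict[str, str]) -> tuple[bool, str]:
    """Return (allowed, reason) for a tool call proposed by the model."""
    schema = ALLOWED_TOOLS.get(name)
    if schema is None:
        return False, f"tool {name!r} is not allowlisted"
    if set(args) != set(schema):
        return False, "unexpected or missing parameters"
    for key, pattern in schema.items():
        if not re.fullmatch(pattern, str(args[key])):
            return False, f"parameter {key!r} failed validation"
    return True, "ok"
```

The reason string is what gets logged with each verdict, so a blocked call can be reconstructed later during forensics.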

4. Continuous monitoring and incident response

Models drift. Providers update underlying weights without notice. New jailbreaks are published weekly. Schedule continuous adversarial scans on staging, monitor verdict-rate anomalies in production, and have a rehearsed AI-incident response playbook. A customer complaint should not be how you first learn that a vendor silently swapped a model under you.
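One simple way to monitor verdict-rate anomalies is a z-score against recent history. This is a sketch under stated assumptions: the daily block rate is already computed, the function name and the threshold of 3 standard deviations are illustrative, and a real deployment would likely need seasonality handling.

```python
from statistics import mean, stdev


def is_anomalous(history: list[float], today: float, threshold: float = 3.0) -> bool:
    """Flag today's rate if it sits more than `threshold` standard deviations from the historical mean."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > threshold
```

A sudden drop in block rate matters as much as a spike: it can mean a provider changed the model and the existing filters no longer match its output.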

Mapping platforms to pillars

No single vendor covers all four pillars credibly. Practical stacks compose:

Inventory and governance — IBM watsonx.governance, Credo AI, Holistic AI, or in-house tracking. Often the weakest pillar in practice.
Pre-deploy assurance — FilterPrompt, Lakera Red, Garak, HiddenLayer, Robust Intelligence (now Cisco AI Defense).
Runtime protection — Protect AI Layer, Lakera Guard, FilterPrompt firewall, Cloudflare Firewall for AI, AWS Bedrock Guardrails.
Data privacy and DLP — Protecto, Skyflow, Private AI, Nightfall.
Continuous monitoring — FilterPrompt scheduled scans, repetition reports, drift baselines; some overlap with the assurance vendors.

How regulators view AI risk in 2026

The compliance landscape stabilised in late 2025 around three frameworks that account for the bulk of practical reporting demand:

EU AI Act — Article 15 (accuracy, robustness, cybersecurity) and Article 9 (risk management) are the two articles your AI security evidence has to map to. High-risk systems require documented adversarial testing and a continuous monitoring programme.
NIST AI RMF — voluntary in the US but the de facto standard for federal contractors and an increasing share of enterprise procurement. Map your controls to GOVERN, MAP, MEASURE, MANAGE functions.
ISO/IEC 42001 — the AI management system standard. Most useful for global enterprises that already run ISO 27001; the structure is familiar and certifiable.

Sector-specific rules (HIPAA for PHI, GLBA for financial data, PCI for cards, GDPR for EU personal data) layer on top. Most enterprises pick one of the three primary frameworks as their reporting backbone and translate findings to the sector overlays as needed.

A 90-day AI protection rollout plan

Days 0-15 — Inventory every model, agent, prompt, embedding store, and dataset. Including shadow AI. Assign owners.
Days 15-30 — Run a baseline adversarial scan against every customer-facing endpoint. Map findings to OWASP LLM Top 10.
Days 30-50 — Deploy DLP redaction in both directions on the highest-risk endpoint. Wire prompt firewall (input + output) on the same endpoint.
Days 50-70 — Roll the runtime controls out to remaining customer-facing endpoints. Add tool-call allowlists and parameter validation on agents.
Days 70-85 — Schedule continuous adversarial scans against staging. Wire verdict logs into your SIEM. Build a single AI-risk dashboard for the CISO.
Days 85-90 — Tabletop an AI incident: a successful indirect injection that exfiltrates customer data via tool calls. Identify gaps. Fix the worst three.
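For the SIEM step in the plan above, the main design decision is emitting each verdict as one structured record. A minimal sketch of such an event serialised as JSON; the field names and the `verdict_event` helper are assumptions, not a schema any SIEM requires.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def verdict_event(
    endpoint: str,
    direction: str,   # "input" or "output"
    verdict: str,     # e.g. "allow" or "block"
    category: str,    # e.g. "prompt_injection"
    when: Optional[datetime] = None,
) -> str:
    """Serialise one firewall verdict as a single JSON line for log shipping."""
    when = when or datetime.now(timezone.utc)
    return json.dumps(
        {
            "ts": when.isoformat(),
            "source": "ai-firewall",
            "endpoint": endpoint,
            "direction": direction,
            "verdict": verdict,
            "category": category,
        },
        sort_keys=True,
    )
```

One JSON object per line is easy for most log shippers to parse, and the same records can feed the CISO dashboard and the tabletop exercise.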

Common procurement mistakes

Buying one platform and assuming it covers all four pillars. None do.
Picking the model-file scanner (Protect AI ModelScan) when the actual risk is prompt injection at the API layer. Different problem.
Picking the DLP/tokenisation tool (Protecto) when the actual risk is jailbreaks producing harmful content. Different problem.
Treating runtime protection as a substitute for pre-deploy assurance. They are complements; runtime defences need to be tuned against actual probe outcomes, not vibes.
Skipping continuous monitoring. Teams that skip it are typically surprised by a silent provider model update within six months.

Bottom line

Protect AI and Protecto AI are both real, useful platforms, built for different problems. Protect AI is a model-supply-chain and runtime platform now part of Palo Alto Networks. Protecto is a data-privacy and tokenisation platform for sensitive data flowing into LLMs. Most mature artificial intelligence security programs buy more than one tool, because the risks of AI span technical, business, and societal layers and no single vendor covers all four pillars (inventory, assurance, runtime, monitoring) credibly. Start with the inventory, run a baseline scan, deploy layered runtime protection, and wire continuous monitoring before you put one more model in front of a customer.