What is an LLM vulnerability scanner?

An LLM vulnerability scanner sends batteries of adversarial probes — jailbreaks, prompt injections, PII extraction attempts, harmful-content requests — at a target LLM and grades the responses. The output is a vulnerability + optimization report with per-probe evidence, severity, and a prioritized fix list.

Which LLMs can I scan with FilterPrompt?

OpenAI, Anthropic, Google Gemini, Azure OpenAI, plus any OpenAI-compatible endpoint — Ollama, Groq, Mistral, Together AI, OpenRouter, Perplexity, Hugging Face, vLLM, or your own custom endpoint. Bring your own keys per tenant.

What kinds of vulnerabilities does FilterPrompt test for?

Jailbreaks (DAN, role hijack, translation smuggling), direct and indirect prompt injection, system-prompt extraction, harmful-content compliance, PII / secret leakage, bias & fairness, RAG poisoning, agent/tool abuse, output quality, and robustness — categories map to the OWASP LLM Top 10.

How are probes graded?

Each probe declares an evaluator: regex match, refusal-check, contains-check, or an AI judge (Gemini 3 Flash). Pass/fail comes with severity, category, the exact prompt sent, the model's full response, and the evaluator's reason — fully auditable.

How much does a scan cost?

1 credit per probe executed. New accounts get 1 welcome credit on signup. Pay-as-you-go credit packs after that — credits never expire. Connecting LLMs and creating tenants is free.

Red Teaming & Advanced Security — Operational Guide

Operations · 2023-06-05 · 16 min read · FilterPrompt Security Team

Master red teaming tactics and advanced security operations. Complete guide to simulated attacks, team structures, and building internal red teams.

Red teaming is the highest-fidelity assessment of a security program — an authorised, objective-driven adversarial simulation that tests not only technical controls but the people and processes meant to detect and respond to them. This operational guide covers what red teaming is, how it differs from penetration testing, how to run an engagement, how to build an internal red team, and the metrics that show whether your investment in advanced security is paying off.

Introduction to red teaming

Red teaming originates in Cold War US military planning, where a 'red team' represented an adversary's perspective in war games. The discipline migrated to cybersecurity in the 2000s as organisations realised that compliance-driven testing missed the gap between control existence and control effectiveness. A red team engagement asks: assuming a motivated adversary with realistic time and skill, can they achieve a defined objective (e.g., exfiltrate the customer database, deploy ransomware, compromise the CEO's mailbox) — and how would we know?

Red teaming fundamentals

What is red teaming?

Red teaming is authorised adversarial simulation against a defined objective, with explicit rules of engagement, white-cell oversight, and after-action documentation. It is broader than penetration testing (which is vulnerability-finding) and more constrained than open-ended attack (which is illegal). The output is not a CVE list — it is an attack narrative, a list of detection and response gaps, and recommended improvements.

Why red teaming matters

Three reasons: it identifies weaknesses before attackers do, it tests the blue team's actual response capability under realistic pressure, and it validates that controls work against a chained attack rather than in isolation. A control that passes a configuration review can still fail in operation — red teams find that gap.

The red team mission

Red teams think like attackers, not like auditors. The mission is comprehensive, objective-driven, and explicitly different from standard testing. Engagements last weeks-to-months, not days. Recommendations are strategic (close the detection gap on this attacker tradecraft) rather than tactical (patch this CVE).

Red team vs blue team: the dynamics

Red team (attackers) responsibilities

Reconnaissance — open-source intelligence on people, technology, and process
Initial access — phishing, supply-chain, exposed services, social engineering
Exploitation — chaining vulnerabilities to achieve foothold and escalation
Post-compromise — persistence, defence evasion, credential harvesting
Lateral movement — pivoting toward the objective with realistic operational tempo
Data extraction — reaching the engagement objective and demonstrating the impact

Blue team (defenders) responsibilities

Threat detection across endpoint, network, identity, and cloud telemetry
Incident response with documented playbooks and clear escalation
Counter-measures including blocking, isolation, and credential rotation
Forensic analysis after engagement to understand attacker dwell time
Process improvements based on documented gaps

Engagement dynamics

The two teams operate under rules of engagement signed by both sides and the engagement sponsor (typically CISO or audit committee). A small white-cell team has visibility into both sides and adjudicates conflicts. Modern engagements increasingly run 'purple team' phases — collaborative sessions where the red team replays specific tradecraft and the blue team tunes detection rules in real time. Purple is operationally efficient when the goal is detection improvement; pure red is appropriate when the goal is honest capability assessment.

Red teaming vs penetration testing vs bug bounty

Three commonly confused disciplines. Penetration testing finds vulnerabilities in a defined scope over days-to-weeks; output is a vulnerability list. Bug bounties run continuously across a public-facing scope; output is a stream of researcher-reported issues. Red teaming runs against a defined objective over weeks-to-months without prior knowledge by the blue team; output is an attack narrative and detection gap analysis. A mature program funds all three at different cadences.

Methodologies: TIBER-EU, CBEST, MITRE ATT&CK

Three methodologies dominate. TIBER-EU is the European Central Bank's framework for threat-led red teaming of financial institutions, using realistic threat intelligence and a structured engagement lifecycle. CBEST is the Bank of England equivalent for UK banks. MITRE ATT&CK is the de facto taxonomy for attacker tactics and techniques, used by red teams to scope tradecraft and by blue teams to map detection coverage. Adopt MITRE ATT&CK as the common language across both teams; adopt TIBER or CBEST if you are in a regulated financial institution.

Building an internal red team

The build-vs-buy decision: external red teams bring breadth of attacker tradecraft and fresh perspective; internal red teams bring continuity, deep environment knowledge, and lower cost-per-engagement at scale. Most mature programs do both — internal team for continuous testing and detection engineering, external team annually for fresh perspective. A credible internal red team is 4–8 people minimum: a lead, two operators, an infrastructure engineer, and a detection engineer who liaises with the blue team. Budget: $1.5M–$3M annually for a US-based team, less in EMEA and APAC.

Engagement lifecycle

Scoping — define objectives, scope boundaries, rules of engagement, white-cell composition
Threat intelligence — model the realistic adversary your business faces
Reconnaissance and infrastructure — set up command-and-control, payload delivery, redirectors
Execution — initial access, escalation, lateral movement against the objective
Detection check-ins — periodic comparison of what red team did vs what blue team detected
Wrap-up — achieve objective or document why not, preserve evidence
Reporting — attack narrative, MITRE ATT&CK mapping, gap list, prioritised recommendations
Remediation tracking — close the loop on the recommendations within a defined timeframe

Tactics, techniques, and procedures (TTPs)

Modern red teams emulate documented adversary TTPs from threat intelligence reports — APT29, FIN7, Conti, Lazarus, and similar. The point is not to reuse the same payloads but to reuse the same operational patterns: phishing → token theft → cloud lateral movement, or supply-chain → endpoint → backup destruction. Map every action to a MITRE ATT&CK technique so the blue team's detection coverage map is updated after each engagement.

Common attack scenarios

Initial access via spear-phishing and OAuth consent abuse
Cloud control-plane lateral movement via assumed-role chaining
Identity attack via Kerberoasting, AD CS abuse, or token theft
Supply-chain compromise via dependency confusion or CI/CD pipeline access
Insider threat simulation with privileged user credentials
Ransomware-style impact: file encryption + backup destruction (without actually encrypting)

Metrics that matter

Five operational metrics from each engagement: time-to-initial-access, dwell time before detection, mean-time-to-detect (MTTD), mean-time-to-respond (MTTR), and percentage of red-team techniques detected. Track trends across engagements — improvement is the goal, not absolute numbers. A red team report that does not surface these metrics is not operationally useful.

Red teaming AI systems

AI red teaming is the newest specialisation — adversarial testing of machine learning systems and LLMs. Distinct from network red teaming because the attack surface (prompt injection, jailbreaks, training data extraction, model evasion) is different from network and endpoint surfaces. Useful methodologies: NIST AI RMF, OWASP LLM Top 10, MITRE ATLAS for adversarial-ML tradecraft. Internal AI red teams typically sit between security and ML engineering; external services (FilterPrompt scanner, Lakera Red, Microsoft PyRIT) automate the most common adversarial probe batteries against LLMs.

Network security and red teaming

Network security companies and tools are core targets in red engagements — segmentation, lateral movement controls, NDR coverage, and Zero Trust enforcement are tested directly. The most common network-layer findings in 2026 engagements: flat internal networks behind perimeter Zero Trust marketing, DNS exfiltration channels still wide open, and east-west detection gaps in cloud VPCs that look segmented in diagrams but route freely in practice.

FAQ: red teaming questions

How often should we run a red team engagement?

External engagements annually at minimum, semi-annually for high-risk industries. Internal red teams should run continuous lower-tempo testing year-round, with specific objective-driven engagements quarterly.

How does red teaming differ from penetration testing?

Pen tests find vulnerabilities in a defined scope. Red teams achieve adversary objectives across a broader scope without blue-team foreknowledge. Pen tests measure attack surface; red teams measure response capability.

What does a red team engagement cost?

External engagements: $80K–$500K depending on scope, objectives, and engagement length (typically 4–12 weeks).

Conclusion: red teaming as a force multiplier

Red teaming is the discipline that turns a compliance-checked security program into an operationally tested one. The investment is significant but pays back in measurable detection improvements, blue-team skill growth, and credible assurance for boards and regulators. As AI systems become a larger share of attack surface, AI-specific red teaming joins network and endpoint red teaming as a core capability.