Blog
min read
Untrusted Project-Local Filters in RTK: When Your AI's Eyes Are Someone Else's to Control
%20(1).png)
Executive Summary
What happened: We discovered a medium-severity vulnerability (CVE-2026-45792) in RTK, a widely adopted open-source tool (50,000+ GitHub stars) used to optimize how AI coding assistants — specifically Claude Code — process information. The issue has been patched as of version 0.33.0.
What RTK does: RTK acts as a filter between an AI coding assistant and the developer's environment. It reduces noise in command output so the AI can work more efficiently. Think of it as a lens the AI looks through when reading code and security scan results.
What went wrong: RTK automatically loaded filter settings from any code repository a developer cloned — no approval required, no warning shown. This meant that anyone who contributed code to a repository could also control what the AI was allowed to see. An attacker could plant a backdoor in a codebase and simultaneously plant filter rules that made the backdoor invisible to the AI. The AI would review the code, report it as clean, and the compromised code would move toward production.
Why it matters: This isn't just a bug in one tool. It represents an emerging class of risk in AI-assisted development: attackers manipulating what the AI can observe rather than what it can execute. If your teams are using AI coding assistants, the tools that sit around them — filters, plugins, configuration files — are part of your attack surface. A security scanner that reports "no issues found" is only as trustworthy as the pipeline delivering its output to the reviewer, whether that reviewer is human or AI.
What was the business risk: Code that passed both AI-assisted review and automated security scanning could have shipped to production with attacker-planted vulnerabilities intact. The developer would have had no reason to question the results. The attack required no special access beyond the ability to commit a small configuration file to a repository — something contributors, contractors, or compromised accounts could do.
What's been done: RTK's maintainers responded within 24 hours and shipped a fix that blocks untrusted project-level filters by default, requires explicit per-repository opt-in, and uses hash-based integrity checks to revoke trust if the configuration changes.
What you should do now: If your engineering teams use RTK with Claude Code, ensure they have upgraded to v0.33.0 or later. More broadly, this is a signal to audit the trust model of any tooling that preprocesses, filters, or transforms information before it reaches an AI coding assistant. These tools have security relevance even if they never execute code, because they shape what the AI can and cannot see.
Background: What RTK Does
RTK is a token optimization tool for Claude Code. It works as a hook that intercepts shell command output and applies configurable filters before the output reaches the model. This solves a real problem: LLMs have finite context windows, and raw command output is often full of noise that wastes tokens without adding signal.
RTK's filter engine uses a TOML configuration format. Filters specify command patterns to match and regex rules (strip_lines_matching) that remove lines from the output. Prior to v0.32.0, the engine loaded configuration from three sources, in ascending priority:
- Built-in filters compiled into the RTK binary
- User-global configuration at
~/.config/rtk/filters.toml - Project-local configuration at
.rtk/filters.toml
Project-local filters overrode everything else.
The Vulnerability: Untrusted Configuration, Trusted Authority
The first two sources, built-in and user-global, are under your control. You installed the binary and wrote your global config. Project-local configuration comes from the repository, which means it comes from whoever committed it: a maintainer, a contributor, or an attacker.
RTK did not distinguish between these sources. A .rtk/filters.toml committed by an attacker got the same authority as one you wrote yourself, and the filters activated without any warning or approval prompt. This was reported in the GHSA under CWE-345 and CWE-426. The core issue was that RTK accepted repository-supplied filter configuration as trusted authority, allowing an attacker-controlled project file to alter what command output reached the model.
The kill chain:
- Attacker commits
.rtk/filters.tomlto a repository - You clone the repo and work in it with Claude Code + RTK
- Every shell command the LLM runs is filtered by attacker-controlled regexes
- Attacker chooses what the LLM can and cannot see
Because RTK's filter engine is general-purpose and regex-driven, the attacker could suppress anything that appeared in command output. Our proof of concept demonstrates hiding a credential-exfiltration backdoor in a Python file, but the mechanism worked for any content the attacker wanted to suppress:
- Malicious lines stripped from cat output, so the model never read them
- Security warnings removed from bandit or similar scanner output
- Backdoor code hidden from git diff during review
- Indicators of compromise suppressed from grep results
The filter file itself looked like standard RTK configuration, since filter names and comments could be crafted to appear routine ("strip ANSI codes," "reduce verbose logging").
The model received post-filter output and had no way to know that lines had been removed, because RTK operated below the model's observation layer.
Impact
On affected versions, the attacker controlled the LLM's perception of the codebase, and the downstream consequence was that compromised code could pass every check in the AI-assisted development workflow and ship to production.
If a developer asked Claude Code to review a file containing a backdoor, RTK filters would strip the malicious lines before the model saw them, and the model would report the file as clean. If the developer ran a security scanner, like bandit, through Claude Code, the attacker could strip warnings about the specific vulnerabilities they planted, and the scan would come back clean. In both cases, the developer had no reason to question the result, and the code would move forward through the development pipeline with attacker-planted vulnerabilities intact.
The filter file itself was camouflaged by default. A .rtk/filters.toml with names like "reduce_noise" and comments like "strip ANSI escape sequences" looked like standard configuration in any repository that uses RTK, so the attacker's filters can hide in plain sight among legitimate project settings.
Given RTK's rapid adoption curve (50,000+ stars in five months), the affected user base was significant and expanding. We scored this CVSS 4.0 6.9 (Medium). The attack required the victim to have RTK installed with the Claude Code hook active, a precondition that describes RTK's intended user base exactly.
Why This Matters Beyond RTK
The root issue is trust laundering: configuration from a git remote, written by someone the developer has never met, was granted the same authority as configuration the developer installed themselves. The trust boundary tracked content type (is this a valid filter?) rather than origin (did the developer put this here?).
Trust laundering through project-local files is a recurring pattern across AI development tooling. David Kaplan's recent research on privilege escalation via confused deputies in coding agents catalogs a closely related set of examples: .bashrc files, PowerShell profiles, agent instruction files like CLAUDE.md and AGENTS.md. In each case, the file looks like inert configuration, gets loaded automatically, and executes with whatever authority the consuming tool grants it. Kaplan's specific focus is cross-agent attacks where one coding agent rewrites another's security configuration, but the underlying mechanism is the same: attacker-controlled content is laundered into a trusted context by traveling through a channel (the repository) that the tool treats as authoritative.
RTK adds a new surface to this pattern. Most trust laundering in developer tooling targets what the model can execute: weakened sandbox configs, permissive approval policies, malicious build hooks. RTK shows that the observation layer is equally vulnerable. A tool that controls what the model can observe has security relevance even if it never executes anything, because an agent that can't see a backdoor can't flag it. The trust is laundered not into execution authority, but into perceptual authority, and the downstream effect is the same: compromised code ships.
This reframes what supply chain compromise looks like in AI-assisted development. Most AI security research focuses on malicious input to the model, but RTK shows that the attack surface also includes what doesn't get sent to the model. An attacker who controls the filtering layer doesn't need to inject anything into the model's context, only remove the evidence of what they've planted.
That removal compromises both sides of the AI-assisted development workflow at once. In most teams using coding agents, the same agent (or a similar one) that writes code also reviews it, and both operate in the same repository with the same tooling active. A single .rtk/filters.toml means the agent that generated the code couldn't see the backdoor, and the agent reviewing the code can't see it either, because both are working through the same filtered view of the codebase.
Tools that preprocess, filter, or transform information before it reaches a coding agent are part of the software supply chain, even if they never execute code. As teams rely on AI assistants for both writing and reviewing code, the trust model needs to account for everything in the path between the repository and the model's context window.
Mitigation and Fix
We recommended a single mitigation: when RTK detects a project-local .rtk/filters.toml, warn both the user (via stderr) and the LLM (via hook output) before applying any filters. One prompt per session per repo. The key principle: draw the trust boundary at the source, not the content. Trying to classify individual filter primitives as "safe" or "dangerous" is a blocklist that rots as the feature set evolves.
RTK's maintainers responded within a day of our report and implemented a fix that goes further than our recommendation:
- Project-local filters are now blocked by default when untrusted. Users see a warning: [rtk] WARNING: untrusted project filters — Filters NOT applied. Run rtk trust to review and enable.
- SHA-256 hash verification: if the file changes after you grant trust (e.g., after a git pull), trust is revoked and filters are blocked until re-reviewed. The hash is computed from the same in-memory buffer used to display the file during review, preventing time-of-check/time-of-use races between display and trust.
- Explicit opt-in via rtk trust / rtk untrust commands. The rtk trust flow reads the filter file, displays its full contents, prints a risk summary that flags high-risk primitives (replace rules, match_output rules, catch-all patterns), and only then records the hash-based trust entry.
Credit to Patrick Szymkowiak and the RTK team for an exemplary response. A 24-hour turnaround from report to merged fix is rare, and the fix itself reflects genuine security thinking: they didn't just add a warning, they built a proper trust model with hash-based change detection and explicit opt-in. That's the kind of response that makes responsible disclosure worth doing.
The fix is available in RTK v0.33.0, update if you haven't already.
Disclosure Timeline
- Mar 15, 2026 - Initial report to RTK maintainers
- Mar 16, 2026 - Maintainers acknowledge; fix merged to develop branch (PRs #623, #625)
- Mar 25, 2026 - Patch shipped in v0.33.0
- May 14, 2026 - CVE-2026-45792 assigned
- May 20, 2026 - Public CVE disclosed
Subscribe and get the latest security updates
Back to blog

%20(1).webp)
%20(1).png)
%20(1).png)

%20(1).png)
%20(1).webp)