The AI Kill Chain
Traditional cyber‑attack frameworks, like Lockheed Martin’s seven‑stage kill chain (reconnaissance → weaponization → delivery → exploitation → installation → command‑and‑control → actions on objectives), break an intrusion into discrete phases so defenders can detect and disrupt each one. In AI systems the attack surface is vastly compressed: an adversary can combine reconnaissance, weaponization and delivery in a single piece of content (a prompt, a file, a web page) and trigger exploitation the moment the model ingests it. Command‑and‑control and lateral movement occur when the agent chains tool calls, so the entire sequence collapses into a four‑step AI kill‑chain.
Most of the attacks we conducted have the following kill chain template:

- Initial Access begins through exposure to untrusted content[1]. An adversary forces the model to process some form of input — text, images or voice — either directly or indirectly via an integrated tool.
- Execution occurs when the AI model processes this malicious input and exhibits undefined behavior. As in classic application security where unexpected input can trigger a buffer overflow, the model enters a state of confusion and can be induced to perform unintended actions.
- Hijack Flow & Technique Cascade arises when a model in this confused state triggers a sequence of tool calls; each call corresponds to a technique in the MITRE ATT&CK or ATLAS matrices. Chaining several tools together leads to a malicious outcome and impact.
Consider the following example - While using an agent-based IDE, a developer invokes issue_read
to open a project’s GitHub issue. The tool retrieves a trojanized issue seeded with a prompt injection. Following the injected instructions, the IDE agent autonomously calls read_file
on the local .env (no user approval), Base64-encodes the secrets it finds, and then misuses url_fetch_content
to trigger a DNS exfiltration to an attacker-controlled domain.
issue_read
pulls a trojanized GitHub issue that seeds the agent which leads to execution.- The injected instructions execute: agent reads .env without approval via
read_file
( AML.TA0013, T1552) - Secrets are Base64-packaged and exfiltrated via DNS using u
rl_fetch_content
(AML.T0025, T1048)
This Hijack Flow & Technique Cascade escalates step-by-step, all without human oversight, leading to a malicious impact (T1496 - Resource Hijacking).
.png)