6.5 .

Indirect Prompt/Instruction Injection

sail

6.5

Risk

Indirect Prompt/Instruction Injection

Description

Agent accepts instructions from untrusted sources (e.g. tool output, retrieved documents), allowing embedded malicious instructions to trigger unsafe actions.

Example

Malicious instructions hidden in a retrieved HTML page cause the agent to run unsafe commands.

Assets Affected

Agentic platform (no code)

Tool / function

Model Response

Mitigation

Sanitize and validate all external data/tool outputs before agent processes them
Restrict sources of external instructions
Monitor for instruction injection patterns

Standards Mapping

ISO 42001: A.7.6, A.9.4
OWASP Top 10 for LLM: LLM01
NIST AI RMF: MEASURE 2.4, MANAGE 2.4
DASF v2: MODEL SERVING 9.9