SAIL

/

Operate - Safe Execution Environment - Sandbox

/

Indirect Prompt/Instruction Injection

6.5

.

Indirect Prompt/Instruction Injection

sail
6.5
Risk

Indirect Prompt/Instruction Injection

Description

Agent accepts instructions from untrusted sources (e.g. tool output, retrieved documents), allowing embedded malicious instructions to trigger unsafe actions.

Example

Malicious instructions hidden in a retrieved HTML page cause the agent to run unsafe commands.

Assets Affected

Agentic platform (no code)

Tool / function

Model Response

Mitigation
  • Sanitize and validate all external data/tool outputs before agent processes them
  • Restrict sources of external instructions
  • Monitor for instruction injection patterns
Standards Mapping
  • ISO 42001: A.7.6, A.9.4
  • OWASP Top 10 for LLM: LLM01
  • NIST AI RMF: MEASURE 2.4, MANAGE 2.4
  • DASF v2: MODEL SERVING 9.9