Blog

min read

Key Questions for Secure Deployment of Large Language Models

By

Dor Sarig

and

January 24, 2024

min read

With the recent release of the OWASP top 10 for LLM applications, the spotlight is on the security challenges that come with integrating these powerful tools into applications.
Prompt Injection vulnerabilities are of particular concern, and they underscore the complexity of maintaining a robust security posture with LLMs.
Whether you're new to the field or an experienced professional, asking the right questions is crucial for secure deployment:


🎯 Direct Prompt Injection
- How are system prompts protected from unauthorized overwriting or revelation?
- What mechanisms are in place to detect and prevent unauthorized commands?

🎯 Indirect Prompt Injection
- How do we handle external input to the LLM, and how can it be manipulated by an attacker?
- What controls are in place to sanitize or segregate untrusted content and limit their influence on user prompts?
- Are there mechanisms to visually highlight potentially untrustworthy responses to the user?

🎯 Extensible Functionality & Plugins
- How do we manage plugins or other extensible functionalities with the LLM?
- What privilege controls are in place for the LLM's access to backend systems and extensible functionalities?
- Are user approval mechanisms implemented for privileged operations?

🎯 Mitigation, Monitoring, & Awareness
- What measures have been taken to establish trust boundaries between the LLM, external sources, and extensible functionalities?
- How are we monitoring the behavior of the LLM to detect suspicious activities or signs of an attack?
- Have we conducted regular security testing, including penetration testing and code review, to identify and remediate prompt injection vulnerabilities?

Based on OWASP Top 10 for Large Language Model Applications.

FAQs

What are the key prompt injection risks flagged in the OWASP Top 10 for LLM applications?

The OWASP Top 10 for LLM applications highlights two primary prompt injection categories: direct and indirect. Direct injection involves unauthorized overwriting or revelation of system prompts, while indirect injection targets external inputs that attackers can manipulate to influence user prompts. Both represent critical attack surfaces in any LLM deployment.

How can attackers exploit indirect prompt injection in LLM applications?

In indirect prompt injection, an attacker manipulates external input fed into the LLM to alter its behavior or outputs. Mitigating this requires controls that sanitize or segregate untrusted content, limit its influence on user prompts, and ideally surface visual indicators to users when a response may originate from potentially untrustworthy sources.

What privilege controls should be in place for LLM plugins and extensible functionalities?

LLM deployments should enforce least-privilege access to backend systems and extensible functionalities, ensuring plugins operate only within defined permission scopes. User approval mechanisms for privileged operations are also recommended, reducing the risk that a compromised or manipulated LLM can execute high-impact actions without explicit human authorization.

How should security teams monitor LLM behavior in production to detect prompt injection attacks?

Effective runtime monitoring of LLMs involves continuously observing model behavior for suspicious activity or indicators of an active attack. This should be combined with regular security testing — including penetration testing and code review — specifically targeting prompt injection vulnerabilities to ensure issues are identified and remediated before they are exploited.

Why is establishing trust boundaries between an LLM and external sources important for secure deployment?

Trust boundaries define which inputs, systems, and data sources an LLM is permitted to interact with and to what degree. Without clear boundaries between the model, external content sources, and extensible functionalities, attackers can exploit those junctions to inject malicious instructions or escalate privileges, making boundary enforcement a foundational element of LLM security architecture.

Subscribe and get the latest security updates

Back to blog

MAYBE YOU WILL FIND THIS INTERSTING AS WELL

From a Copilot to Cluster Admin: inside AtlasOps, our free agent-security CTF

By

Ariel Fogel

and

Eilon Cohen

June 26, 2026

Blog
The Fable Recall Puts the Spotlight in the Wrong Place

By

Eilon Cohen

and

Ariel Fogel

June 14, 2026

Blog
Your agents answer to Hades: how one commit hijacks 4 AI coding tools

By

Ariel Fogel

and

June 10, 2026

Blog