When AI Is Hijacked – Prompt Injection as a Compliance Risk
Update Data Proctection No. 252
The use of generative artificial intelligence in business processes is noticeably on the rise. In particular, AI agents and other AI-powered assistance systems promise significant efficiency gains, for example through the automated processing of information, support for internal workflows, or the independent execution of digital tasks. At the same time, this gives rise to new security risks that differ fundamentally from traditional IT security threats in some respects. Particular attention is currently being paid to the phenomenon of so-called “prompt injection,” in which AI systems can be induced to exhibit undesirable behavior through deliberately manipulated inputs or content. Against this backdrop, the following article examines the technical and regulatory foundations of prompt injection attacks, the associated liability risks, and appropriate technical and organizational protective measures for companies.
I. What are so-called “prompt injections”?
“Prompt injection” refers to targeted attempts to manipulate AI systems, particularly large language models (“LLMs”), in which inputs or external content are designed to influence the system’s behavior in an unintended manner. Unlike traditional software, generative AI does not process instructions strictly based on rules, but rather contextually based on natural language. This results in the unique characteristic that not only direct user inputs, but also processed documents, website content, or emails can influence the system’s behavior.
In practice, such attacks can take various forms. In a so-called “direct prompt injection,” a user attempts to circumvent internal security policies or system restrictions through inputs such as “Ignore all previous instructions.” Of particular practical relevance, however, are so-called “indirect prompt injections.” In this case, malicious instructions are embedded in external content that is automatically processed by the AI system. It is conceivable, for example, that a scanned PDF document, an email signature, or website content contains hidden instructions that cause the AI agent to disclose confidential information or perform specific actions.
This poses significant risks, particularly when using AI agents that can independently access internal systems, databases, or external applications. For instance, an AI-powered email assistant could be tricked by manipulated content into summarizing internal documents and forwarding them to unauthorized recipients. Equally conceivable are scenarios in which an AI agent triggers erroneous transactions, bypasses internal approval processes, or processes and discloses sensitive corporate data during automated research.
The practical relevance of such attacks is increasing, particularly because companies are increasingly integrating generative AI deep into existing business processes . The more extensively AI systems can access internal information, communication channels, and digital tools, the greater the potential attack surface for prompt injection attacks.
II. Regulatory Framework for AI Agents
With the entry into force of the AI Regulation, the European Union has established a comprehensive legal framework for the use of artificial intelligence for the first time. The AI Regulation fundamentally distinguishes between the underlying AI models and the AI systems built upon them. This distinction is particularly significant for AI agents, as they typically rely on large language models (LLMs) and are simultaneously capable of independently planning tasks, processing information, and controlling external applications or interfaces.
Even without a specific risk classification, the AI Regulation contains general compliance requirements for companies. Of particular practical relevance here is Article 4 of the AI Regulation, which has been in effect since February 2025 and, for the first time, explicitly establishes an obligation to ensure sufficient “AI competence” (as we reported). Accordingly, providers and operators of AI systems must take measures to ensure that persons involved in the operation and use of AI systems possess adequate technical knowledge, experience, and training. For companies, this means in particular that employees working with generative AI and AI agents must not only be technically trained but also made aware of risks such as erroneous decisions, hallucinations, or manipulation attacks, for example through prompt injections.
Furthermore, the specific scope of obligations depends largely on the regulatory classification of the respective system. At the level of the underlying AI models, large language models are generally considered “general-purpose AI models” within the meaning of Art. 53 et seq. of the AI Regulation. Providers of such models are subject to specific transparency, documentation, and information obligations. If a model is additionally classified as a model “with systemic risk,” further requirements apply, such as regarding model evaluations, risk assessments, and cybersecurity measures.
However, classification at the system level is particularly relevant for businesses in practice. The specific intended use of the AI agent is decisive here. If an AI agent is used in sensitive areas such as human resources, the financial sector, or in the context of critical infrastructure, it may be classified as a high-risk AI system. In this case, extensive regulatory obligations apply, particularly with regard to risk management, human oversight, and technical robustness. According to Article 9 of the AI Regulation, a continuous risk management system must be established to identify, assess, and minimize risks. Article 14 of the AI Regulation also requires effective mechanisms for human oversight to detect and correct malfunctions or undesirable system decisions. Additionally, Article 15 of the AI Regulation mandates ensuring an appropriate level of accuracy, robustness, and cybersecurity.
These requirements take on particular significance in the context of AI agents. Unlike traditional AI applications, AI agents are often not limited to the mere generation of content, but can independently control processes, retrieve external data, or perform digital actions. The degree of their autonomy therefore significantly influences the system’s risk profile and, consequently, the intensity of regulatory requirements. A particular challenge is that the AI Regulation’s risk-based approach focuses primarily on the intended use, while the actual technical capabilities of an AI agent – such as the independent control of browsers or IT systems – have so far been taken into account only to a limited extent.
III. Technical and Organizational Safeguards
Companies should view prompt injection attacks not merely as a theoretical risk, but as a concrete security challenge in the productive use of generative AI. Since such attacks often cannot be completely prevented, the key in practice is to identify risks early, establish technical safeguards, and specifically limit the scope of action of AI agents.
From a technical perspective, it is advisable to first strictly limit permissions and system access. AI agents should only be able to access those data, applications, and interfaces that are absolutely necessary for the respective use case (“least privilege principle”). Systems that independently send emails, call up external tools, or have write access to internal systems are particularly critical. The broader the scope of an AI agent’s capabilities, the greater the potential attack surface for prompt injection attacks.
Furthermore, external content processed by AI systems should be handled in as isolated and controlled a manner as possible. This applies in particular to emails, website content, PDF documents, or other files that may contain hidden instructions. Companies are increasingly relying on technical protection mechanisms such as input filters, prompt sanitization, or isolated execution environments (“sandboxing”) to detect manipulative content early on or limit its impact.
Human oversight also remains of central importance. AI agents should not operate completely autonomously, especially in critical processes, but should be integrated into appropriate approval and control mechanisms. For example, it can be stipulated that sensitive actions – such as external communication, payments, or data sharing – are executed only after human confirmation. Especially in the context of prompt injections, such “human-in-the-loop” control can play a crucial role in detecting misconduct early on and preventing damage.
Finally, it is recommended to continuously review deployed AI systems through testing, monitoring, and so-called “red-teaming.” This involves deliberately attempting to induce systems to malfunction through manipulated inputs or atypical scenarios in order to identify vulnerabilities early on. Given the dynamic development of generative AI, a security concept implemented once is likely to be insufficient on a regular basis. Rather, the secure use of AI agents requires continuous adaptation of technical and organizational protective measures.
IV. Liability Risks in the Event of Non-Implementation
Companies that deploy AI agents without adequate security and control mechanisms expose themselves to significant liability risks. If, for example, prompt injection leads to the disclosure of confidential information, erroneous decisions, or unintended system actions, the question regularly arises as to whether sufficient technical and organizational safeguards were implemented.
In addition to potential fines under the AI Regulation, civil liability risks due to organizational negligence or breach of contractual obligations are of particular concern. Added to this are potential reputational damage and loss of trust among customers and business partners. Precisely because AI agents are increasingly accessing corporate systems and business processes autonomously, effective AI risk management is thus gaining significant importance from a liability law perspective as well.
V. Conclusion and Outlook
Prompt injection attacks illustrate that the use of generative AI, and AI agents in particular, not only offers significant efficiency potential but also brings with it new security and compliance risks. As autonomous AI systems become increasingly integrated into corporate processes, both regulatory requirements and expectations regarding technical and organizational safeguards are likely to rise further. Companies are therefore well advised to consider AI governance, IT security, and compliance together at an early stage and to adapt existing control mechanisms to the specific characteristics of generative AI. Particularly against the backdrop of the ongoing implementation of the AI Regulation and dynamic technological developments, it can be assumed that prompt injections and similar attack scenarios will continue to gain practical relevance in the future.
This article was created in collaboration with our student employee Emily Bernklau.