
Ignore Previous Instruction: The Persistent Challenge of Prompt Injection in Language Models
Prompt injections are an interesting class of emergent vulnerability in LLM systems. It arises because LLMs are unable to differentiate between system prompts, which are created by engineers to configure the LLM’s behavior, and user prompts, which are created by the user to query the LLM.
Unfortunately, at the