guide

Prompt Injection in Email Agents

Email is untrusted input. It may include direct instructions to the model, hidden text, quoted conversations, links, attachments, or forged context. Prompt injection controls should sit before extraction and before any outbound action.

last updated 2026-05-07 4 sections

section 01

Injection surfaces

Prompt injection can appear in visible text, HTML comments, CSS-hidden spans, quoted reply history, attachment text, linked pages, and forwarded messages. The agent should never treat sender-provided instructions as system policy.

surface	risk	control
Visible body	Direct instruction to ignore policy.	Use extraction schema and policy checks.
Hidden HTML	Invisible instruction enters model context.	Strip hidden and remote content.
Quoted history	Old or forged context changes intent.	Isolate latest reply.
Attachment text	Poisoned document content.	Scan and review before use.
Links	External content changes after receipt.	Do not fetch automatically for high-risk actions.

section 02

Pre-model cleanup

Before any model call, normalize the email into a controlled representation. Remove hidden HTML, remote images, tracking pixels, overly long quoted text, and unsupported attachment types. Keep raw content for audit, not default reasoning.

ok Convert HTML to safe text and preserve links as text references.
ok Remove invisible text, style-hidden content, and comments.
ok Separate latest reply from quoted history and forwards.
ok Limit content length and route overflow to review.
ok Mark attachment-derived content as untrusted.

section 03

Policy after extraction

Prompt injection can survive extraction, so policy checks still matter after the model proposes an action. Validate recipient, sender, account, requested action, risk, confidence, and template before sending.

ok Allow only known action types.
ok Reject outbound sends to unapproved domains when policy requires it.
ok Require review for low confidence or high-risk extracted intent.
ok Never allow email content to override sender, recipient, or approval policy.
ok Log the policy version used for every decision.

section 04

Safe failure mode

When the system detects injection risk, the safe outcome is review or clarification, not silence or an automatic denial. That keeps the workflow useful while preventing untrusted content from steering the agent.

related startup email pages

pillar

Prompt Injection in Email Agents

Injection surfaces

Pre-model cleanup

Policy after extraction

Safe failure mode

related startup email pages

Email marketing for startups

Startup email examples

Startup email templates