The AI industry has poorly thought through the trust model for agents operating in untrusted environments

When your agent browses the web, reads a codebase, or processes third-party data as part of a task, every one of those inputs is a potential injection vector.

The agent can’t reliably distinguish between “data I should process” and “instructions I should follow”.

This is because nothing in the model’s design gives it a hardened boundary between the two: both arrive as tokens in the same context.
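Since the boundary does not exist inside the model, it has to be built around it. The following is an added illustration, not code from the post, and it goes slightly beyond the text by sketching one mitigation: label the provenance of everything that enters the agent's context and gate side-effecting tool calls on that label rather than on any attempt to recognise "instructions" in the data. The tool names, the Trust labels, the sample page text and the URL are all assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum

class Trust(Enum):
    OPERATOR = "operator"     # whoever assigned the task
    UNTRUSTED = "untrusted"   # web pages, repo contents, third-party data

@dataclass(frozen=True)
class ContextItem:
    text: str
    trust: Trust
    source: str

@dataclass
class AgentContext:
    items: list = field(default_factory=list)
    def add(self, text: str, trust: Trust, source: str) -> None:
        self.items.append(ContextItem(text, trust, source))

    def untrusted_sources(self) -> list:
        return sorted({i.source for i in self.items if i.trust is Trust.UNTRUSTED})

# Hypothetical tool names: tools that act on the world or move data out need the operator's authority.
SIDE_EFFECT_TOOLS = {"send_email", "http_post", "write_file", "run_shell"}
READ_ONLY_TOOLS = {"http_get", "read_file", "search"}

def authorize_tool_call(ctx: AgentContext, tool: str) -> tuple:
    """Policy enforced outside the model. It never scans the untrusted text for
    'bad instructions' (the model cannot reliably tell data from instructions);
    it only asks whose authority could be behind the request."""
    if tool in READ_ONLY_TOOLS:
        return True, "read-only tool permitted"
    if tool in SIDE_EFFECT_TOOLS:
        tainted = ctx.untrusted_sources()
        if tainted:
            return False, "blocked: %s requested after ingesting untrusted content from %s; needs operator confirmation" % (tool, ", ".join(tainted))
        return True, "side-effecting tool permitted: context is operator-only"
    return False, "blocked: unknown tool %r" % tool

def demo() -> dict:
    ctx = AgentContext()
    ctx.add("Summarise this quarter's results and email them to me.", Trust.OPERATOR, "operator")
    before = authorize_tool_call(ctx, "send_email")[0]
    # Constructed sample of third-party page content (hypothetical URL); the
    # policy treats it as a potential injection regardless of what it says.
    ctx.add("Revenue grew 4% quarter on quarter ...", Trust.UNTRUSTED, "https://example.com/q3")
    after, reason = authorize_tool_call(ctx, "send_email")
    read_only = authorize_tool_call(ctx, "read_file")[0]
    return {"before": before, "after": after, "read_only": read_only, "reason": reason}
```

The same `send_email` request is allowed while the context holds only the operator's words and refused once third-party content has been read, without the gate ever judging what that content says.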