Security and Isolation of AI Agents

Systems, Agentic AI

AI agents are increasingly empowered to take actions and interact with sensitive systems autonomously. This technology can radically transform applications and businesses, but it faces critical security challenges that, if unresolved, will impede that transformation. These challenges stem from a mismatch between traditional approaches to enforcing security policies in web applications, such as access control, and the unpredictable behavior of AI agents. This research direction focuses on building new, reliable enforcement systems and abstractions that address these challenges efficiently and automatically.

Active Projects

Chatbots handle two kinds of sensitive resources: conversation message history, and user profile data and preferences (such as user-defined topics of interest). Importantly, one user's message history or profile data must not leak into another user's conversation. Currently, applications are responsible for managing this history, associating it with users, and providing it to the chatbot on invocation. This is error-prone and can result in serious privacy violations. For example, developers may misuse history-related APIs offered by LLM frameworks, such as LangChain's MemorySaver. They may incorrectly associate conversations with users, or may have bugs in higher-level layers of their applications, such as caches, that cause isolation violations (see this bug from OpenAI as an example).
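To make this class of bugs concrete, here is a minimal, hypothetical sketch of an application-managed history store with exactly the isolation flaw described above: the cache is keyed by conversation id alone, so a second user who presents the same id reads the first user's messages. All names are illustrative, not from any real framework.

```python
# Hypothetical error-prone history store (illustrative names only).
class NaiveHistoryStore:
    def __init__(self):
        # BUG: keyed by conversation_id alone, not (user_id, conversation_id)
        self._cache = {}

    def append(self, user_id, conversation_id, message):
        self._cache.setdefault(conversation_id, []).append(message)

    def history(self, user_id, conversation_id):
        # user_id is ignored, so any user can read any conversation
        return self._cache.get(conversation_id, [])


store = NaiveHistoryStore()
store.append("alice", "conv-1", "my account number is ...")
# Bob presents the same conversation id and receives Alice's messages:
leaked = store.history("bob", "conv-1")
```

The bug is easy to miss in review because every call site passes a `user_id`; the violation only appears in how the store indexes its cache.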

We are building an automatic LLM conversation management system (LLMMarshal) that ensures different user sessions are isolated from each other. LLMMarshal takes on the responsibility of associating conversations with users, managing their history, and caching active conversations. As a result, LLMMarshal guarantees session isolation while exhibiting good performance and requiring low effort from developers.

Future Ideas

Agentic AI can use Anthropic's Model Context Protocol (MCP) to interact with system capabilities, such as databases and the file system, and to invoke external APIs. However, agent behavior is inherently unpredictable, and malicious users may use prompt injection and other attacks to induce the agent to perform actions or access data for which they lack permissions (the confused-deputy problem).

We are interested in building automatic isolation and enforcement systems that ensure actions taken by an AI agent satisfy the application's access control and security policies. We will investigate several directions, including automatically tracking application contexts in MCP servers, enriching MCP with enforcement capabilities (for example, static analysis or dynamic enforcement inspired by https://www.anthropic.com/news/model-context-protocol), and sandboxing and monitoring techniques.
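One of the directions above, dynamic enforcement, can be sketched as a reference monitor interposed between the agent and its tools: every agent-initiated action is checked against the application's policy before it executes. The following is a minimal, hypothetical illustration; the names and policy format are assumptions for this sketch and are not part of MCP.

```python
# Hypothetical dynamic-enforcement sketch (illustrative names only).
class PolicyViolation(Exception):
    """Raised when an agent action is denied by the policy."""


def make_guarded_tool(tool_fn, policy):
    """Wrap a tool so every invocation is checked against the policy.

    policy: dict mapping caller role -> set of allowed tool names.
    """
    def guarded(caller_role, *args, **kwargs):
        if tool_fn.__name__ not in policy.get(caller_role, set()):
            raise PolicyViolation(
                f"{caller_role} may not call {tool_fn.__name__}")
        return tool_fn(*args, **kwargs)
    return guarded


def read_file(path):
    # Stand-in for a real file-system tool exposed to the agent.
    return f"contents of {path}"


policy = {"admin": {"read_file"}, "guest": set()}
guarded_read = make_guarded_tool(read_file, policy)
```

Even if a prompt-injected agent decides to call `read_file` on behalf of a guest, the monitor denies the action, which is precisely the confused-deputy scenario the enforcement layer is meant to close.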