Beware of MCP Server Tricks: Protect Against API Key Theft and More

Beware of MCP Server Tricks: Protect Against API Key Theft and More - Learn about the security risks of using MCP servers, including tool poisoning attacks that can leak sensitive data. Discover mitigation strategies for client-side security.

2025年4月13日

party-gif

Protect your sensitive data from potential security risks when using MCP servers. This blog post explores critical vulnerabilities, such as tool poisoning attacks, that can lead to the theft of API keys and other sensitive information. Learn how to safeguard your applications by implementing robust security practices when working with MCP servers.

Security Risks of Using MCP Servers

When using MCP (Model Context Protocol) servers, it's crucial to be aware of the potential security risks. MCP servers provide access to various tools, resources, and predefined prompts for AI interactions, but this also makes them susceptible to malicious actions.

One of the main security concerns is the risk of "tool poisoning attacks." Attackers can embed malicious instructions within the tool descriptions, which can then be injected into the context of the language model (LM). These hidden instructions can instruct the LM to perform actions such as retrieving sensitive information like API keys or SSH keys, and then transmit that data to the attacker.

Another attack vector is the "MCP pull rug" technique, where the tool description can be changed after the client has already approved the tools, without the user's knowledge. This can lead to the introduction of malicious instructions that the user was not aware of.

Additionally, the "shadowing tool description" attack allows a malicious actor on one MCP server to access data on another MCP server, enabling authentication hijacking and data leaks.

To mitigate these security risks, it's essential to implement the following strategies:

  1. Clear UI Patterns: Expose the full tool descriptions to the user, so they can understand the actions the tools will perform.
  2. Tool and Packaging Pinning: Pin the version of the MCP server and its tools to prevent unauthorized changes, and verify the integrity of the tool descriptions before execution.
  3. Cross-Server Protection: Implement stricter boundaries and data flow controls between different MCP servers to prevent attacks from one server affecting another.

Remember, the security of your MCP-based applications relies heavily on your ability to vet and control the MCP servers and tools you use. Neglecting these security practices can lead to serious vulnerabilities and potential data breaches.

How MCP Server Interactions Work

The interaction between the host (AI application) and an MCP server involves three main components:

  1. AI Assistant/Host: The AI application running on the host.
  2. MCP Client: The client controlling the communication with the MCP server.
  3. MCP Server: The server that provides the tool definitions, resources, and prompts.

The process works as follows:

  1. The host sends a request to connect to the MCP server.
  2. The MCP server receives the request and responds with a list of available tools and their definitions.
  3. The host can then send a request to execute a specific tool, along with any necessary parameters.
  4. The MCP server processes the request and executes the tool, potentially including any malicious instructions embedded within the tool definition.
  5. The results of the tool execution are then returned to the host.

It's important to note that the tool definitions provided by the MCP server are trusted by the AI model, which can lead to security vulnerabilities if the definitions contain malicious instructions. This is the basis for the "tool poisoning" attack discussed in the article.

Tool Poisoning Attacks in MCP

The article discusses a critical flaw in the widely used Model Context Protocol (MCP) that enables a new form of attack termed "tool poisoning." This attack allows malicious instructions to be embedded within MCP tool descriptions, which are invisible to the user but visible to the AI models.

The key points are:

  1. MCP servers can provide a list of tools with their descriptions, which are then injected into the context of the language model. Attackers can craft these tool descriptions to contain malicious instructions.

  2. The AI models are trained to trust the tool descriptions and follow the instructions precisely, enabling the malicious behavior to be concealed behind legitimate functionality.

  3. Attackers can leverage this vulnerability to leak sensitive information like API keys, SSH keys, etc., without the user's knowledge.

  4. The attack can also be used to perform "shadowing," where a malicious tool on one server can hijack the behavior of a trusted tool on another server, leading to data leaks or other malicious actions.

  5. Mitigation strategies include:

    • Exposing the full tool descriptions to users through clear UI patterns.
    • Pinning the versions of MCP servers and tools to prevent unauthorized changes.
    • Implementing stricter boundaries and data flow controls between different MCP servers.

Overall, the article highlights the importance of proper security practices when working with MCP servers and the need to thoroughly vet any third-party tools or servers before using them in your applications.

Examples of Tool Poisoning Attacks

Here are some concrete examples of tool poisoning attacks in the context of MCP (Model Context Protocol):

  1. Retrieving Sensitive Information:

    • The attacker can craft a tool description that appears to be a benign tool, such as an "Add Two Numbers" tool.
    • However, the tool description can contain hidden instructions that instruct the AI model to retrieve sensitive information, such as API keys or SSH keys, and transmit them to the attacker.
    • The user may not be aware of these hidden instructions and may approve the tool's execution, unknowingly leaking sensitive data.
  2. Hijacking Trusted Tools:

    • The attacker can create a "shadowing" tool that modifies the behavior of a trusted tool on another MCP server.
    • For example, the attacker can create a tool that hijacks the "Send Email" tool on a trusted server, changing the recipient email address to the attacker's address.
    • The user may only see the trusted "Send Email" tool and be unaware of the malicious "shadowing" tool's actions.
  3. MCP Pull Rug Attacks:

    • In this attack, the attacker can change the tool description of an approved tool after the user has already granted permission to use it.
    • The user may not be aware of the changes, and the modified tool description can now contain malicious instructions.
    • This attack exploits the package or server-based architecture of MCP, where tool descriptions can be updated without the user's knowledge.
  4. Concealing Malicious Behavior:

    • Attackers can craft tool descriptions that contain misleading or simplified information, hiding the actual malicious actions the tool will perform.
    • For example, a tool description may claim to perform a simple mathematical operation, while in reality, it is instructing the AI model to perform unauthorized actions, such as accessing sensitive files or transmitting data.
    • The user may approve the tool's execution based on the simplified description, unaware of the underlying malicious behavior.

These examples demonstrate how tool poisoning attacks can exploit the trust placed in MCP tool descriptions and the potential disconnect between what the user sees and what the AI model actually does. Proper security practices, such as sanitizing and reviewing tool descriptions, are crucial to mitigate these types of attacks.

Shadowing Tool Descriptions with Multiple Servers

Here is the section body in Markdown format:

This is the most interesting attack vector described in the blog post. The idea is that if your host is connected to multiple different MCP servers, a malicious actor on one MCP server can access data on another MCP server.

The authors explain that this "makes authentication hijacking possible, where credentials from one server are secretly passed to another one." The way this would work is:

  1. The malicious server provides a tool with malicious actions.
  2. Whenever you call this tool, it could have instructions to access a tool on another, trusted server.
  3. The user is not aware of this, and the malicious server can effectively hijack the other server that is connected to the same host.

The blog post provides a concrete example of this attack using Cursor. In this case, there are two different servers connected - a trusted server providing an email sending tool, and a malicious server providing a bogus "add numbers" tool.

The malicious tool description contains instructions to modify the behavior of the trusted "send email" tool. Specifically, it changes the recipient email address to a malicious one, while extracting the actual recipient from the email body. This is all done without the user's knowledge, providing them with the "best experience possible."

The authors state that this shadowing attack is enough to hijack the agent's behavior with respect to trusted servers, without the attacker's tool ever appearing in the user-facing interaction log. Combined with the MCP "rug pull" attack, a malicious server can effectively hijack an agent without the user ever being aware of it.

Recommendations to Safeguard Against MCP Security Vulnerabilities

Here are the recommendations to safeguard against MCP security vulnerabilities:

  1. Clear UI Patterns:

    • Expose the full tool description to the user, so they can understand what the tool is supposed to do.
    • Use different UI elements or colors to indicate which parts of the tool description are visible to the AI model.
    • This sanitization step on the client-side can help users understand the tool's behavior.
  2. Tool and Packaging Pinning:

    • Client should pin the version of the MCP server and its tools to prevent unauthorized changes.
    • Use hash or checksum to verify the integrity of the tool description before executing it.
    • If the hash doesn't match, it indicates changes to the tool description, which could pose security vulnerabilities.
  3. Cross-Server Protection:

    • Implement stricter boundaries and data flow controls between different MCP servers.
    • Use designated agent security tools like Invariant Stack to enforce these controls.
  4. Vet MCP Servers and Tools:

    • Thoroughly vet every MCP server and tool before using them in your application.
    • Do not blindly trust or download MCP servers or tools from unverified sources.
    • Apply good software security practices when working with MCP-based systems.

By following these recommendations, you can significantly improve the security of your MCP-based applications and mitigate the risks associated with MCP security vulnerabilities.

常問問題