The security conversation around AI infrastructure keeps circling the same topics: model alignment, data privacy, hallucination rates. Meanwhile, the actual attack surface that practitioners are building on top of every major LLM deployment has almost no authentication, no input validation, and credentials scattered through source code like confetti. That's not a prediction. That's what the audit found.
Model Context Protocol servers are the connective tissue of modern AI applications. They're how your LLM talks to your database, your file system, your browser, your third-party APIs. And most of them were written fast, shipped faster, and secured never. The REST API ecosystem took roughly a decade to develop mature security norms. MCP is trying to compress that timeline into months, and the gap between where the ecosystem is and where it needs to be is significant.
This post covers findings from an audit of 22 open-source MCP servers. The results are not subtle.
The MCP Explosion Nobody Was Ready For
What Is the Model Context Protocol?
Model Context Protocol is an open standard, originally released by Anthropic in late 2024, that defines how LLMs communicate with external tools and data sources. Before MCP, every AI application that wanted to connect to an external service built its own bespoke integration layer. Different function-calling schemas, different transport mechanisms, different conventions for how tool results came back. It worked, but it didn't scale, and it didn't compose.
MCP changes that by defining a universal plugin layer. An MCP server exposes a set of tools. An MCP client, typically an LLM runtime or agent framework, discovers those tools and calls them using a standardized protocol. The server handles the actual work: reading files, querying databases, making HTTP requests, executing code. The LLM just decides when and how to use each tool.
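As a rough illustration of that discover-then-call flow, the sketch below builds the two JSON-RPC 2.0 messages a client would send. The method names tools/list and tools/call come from the MCP specification; the tool name read_file and its path argument are hypothetical, and the transport that would carry these messages is left out.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # JSON-RPC request ids, incremented per message


def make_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def list_tools_request() -> dict:
    """Ask the server which tools it exposes."""
    return make_request("tools/list")


def call_tool_request(name: str, arguments: dict) -> dict:
    """Invoke one tool by name with a dict of arguments."""
    return make_request("tools/call", {"name": name, "arguments": arguments})


if __name__ == "__main__":
    print(json.dumps(list_tools_request()))
    # "read_file" and "path" are hypothetical names used for illustration.
    print(json.dumps(call_tool_request("read_file", {"path": "notes/todo.txt"})))
```

Note that nothing in this message shape carries identity or constrains the arguments. Both are left to the server, which is where the findings below come in.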
The design is elegant. A file system MCP server written once works with Claude, with GPT-based agents, with any compliant client. That portability is exactly what the ecosystem needed, and exactly why adoption moved so fast.
Why MCP Became the Standard Overnight
Anthropic published the spec and opened it up. Within weeks, major IDE vendors, agent frameworks, and developer tooling companies announced MCP support. By early 2025, the ecosystem had tipped. If you were building an AI application and you wanted your tool integrations to work across multiple runtimes, you built an MCP server.
The developer incentive structure rewarded shipping. Stars, downloads, and visibility went to the first usable implementation in each category. The first file system MCP server with decent README documentation got 800 stars before anyone asked whether it validated path inputs. The first database server got forked 200 times before anyone noticed it accepted arbitrary SQL from any connected client.
The parallel to the SOAP-to-REST transition is direct. When REST displaced SOAP in the late 2000s, developers celebrated the simplicity and moved fast. Security controls that SOAP's WS-Security stack enforced by convention got dropped in the rush to ship clean, simple JSON APIs. It took years of breach reports, OWASP guidance, and framework-level defaults before auth became automatic. MCP is at the beginning of that same curve, with one difference: the clients calling these servers are autonomous agents that don't need a human in the loop to make thousands of requests.
How We Audited 20+ Open-Source MCP Servers
Audit Methodology and Selection Criteria
Server selection started with signal, not convenience. GitHub stars and npm/PyPI download counts established a baseline of real-world adoption. A server with 50 stars and 200 weekly downloads is actually being used. That matters because theoretical vulnerabilities in abandoned code are less interesting than real vulnerabilities in code running inside active development environments right now.
Category diversity was the second filter. The audit covered servers across five categories: file system access, database query and management, web browsing and scraping, code execution and terminal access, and third-party API integrations including calendar, email, and productivity tools. Concentrating entirely on one category would have produced a skewed picture. The goal was a cross-section of the ecosystem, not a deep dive into a single niche.
The testing approach combined three methods. Static analysis caught obvious patterns: hardcoded credentials, missing input sanitization, insecure defaults in configuration. Manual code review went deeper, tracing data flow from incoming tool arguments through to system calls or external API requests. Live endpoint testing confirmed whether vulnerabilities were exploitable in practice, not just present in theory.
Severity ratings followed a five-tier system: Critical, High, Medium, Low, and Informational. Critical meant direct, unauthenticated access to sensitive operations with no preconditions. High meant exploitable with minimal effort or chaining. Medium required specific conditions. Low was a hardening gap without immediate exploitability. Informational flagged patterns worth monitoring.
OWASP Categories We Tested Against
The OWASP API Security Top 10 (2023 edition) provided the framework. Every tested server was evaluated against the full list, but four categories dominated the findings: Broken Object Level Authorization (API1), Broken Authentication (API2), Unrestricted Resource Consumption (API4), and Security Misconfiguration (API8).
Injection vulnerabilities also appeared frequently enough to warrant dedicated coverage, even though the 2023 list no longer tracks injection as a standalone category (it was API8 in the 2019 edition). The remaining OWASP categories produced findings too, but the four listed above account for the majority of the severity-weighted risk across the audited servers.
Responsible Disclosure
Every Critical and High finding in this audit was reported to the relevant repository maintainers before publication. Maintainers were given 30 days to respond and 60 days to patch before findings were included in this post. Some responded quickly. Several did not respond at all. Unpatched Critical findings are described without reproduction steps.
Finding #1. Authentication Is Basically Optional
Bearer Tokens? Never Heard of Them
Fourteen of the 22 audited servers had zero authentication mechanisms of any kind. No API keys. No bearer tokens. No OAuth flows. No session validation. A connected MCP client could call any exposed tool with any arguments and the server would execute the request.
That number needs context to land properly. These aren't toy demo projects. Several of the servers in this category have thousands of GitHub stars and are listed in official MCP server directories maintained by major AI vendors. Developers are installing these, connecting them to their LLM runtimes, and pointing them at real data.
The comparison to early REST APIs is worth sitting with. In 2007 and 2008, plenty of internal REST services ran without authentication because they were assumed to be behind a firewall. That assumption collapsed the first time someone misconfigured a network rule or spun up a cloud instance with a public IP. The MCP ecosystem is making the same assumption more than fifteen years later, with less excuse.
The 'Local Only' Fallacy
Several server maintainers, when contacted during responsible disclosure, responded with a variation of the same explanation: "This server is designed for local use only." That's not a security model. That's a deployment assumption.
The practical reality is that MCP servers get wrapped. A developer builds a local file system MCP server, it works well, and then it gets containerized and deployed as part of a cloud-hosted AI product. The "local only" assumption doesn't survive that transition, and there's nothing in the server code to enforce it. No bind address restriction. No localhost check. No warning in the startup logs.
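A server that really is meant for local use can enforce that in code. The following is a minimal sketch, not taken from any audited server: it refuses to bind to a non-loopback address unless the operator opts in, and logs a warning when they do. The function names and the allow_remote flag are hypothetical.

```python
import ipaddress
import logging

logger = logging.getLogger("mcp_server")


def is_loopback(host: str) -> bool:
    """Return True only for loopback IP literals or the name 'localhost'."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example an arbitrary hostname): treat as non-local.
        return False


def check_bind_address(host: str, allow_remote: bool = False) -> str:
    """Refuse non-loopback bind addresses unless the operator explicitly opts in."""
    if is_loopback(host):
        return host
    if not allow_remote:
        raise ValueError(
            f"refusing to bind to non-loopback address {host!r}; "
            "pass allow_remote=True to override"
        )
    logger.warning("binding to %s: this server is reachable from the network", host)
    return host
```

With a guard like this, "local only" becomes a default the code enforces rather than a sentence in the README, and containerizing the server forces someone to flip the flag deliberately.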
The compounding risk appears when these servers sit behind cloud-hosted LLM products. A SaaS AI assistant that connects to an MCP server on behalf of thousands of users is now routing all of those users' tool calls through a server that accepts requests from any client. The blast radius of a single misconfiguration scales with the user base of the product wrapping it.
Critical Finding
Three audited servers with "local only" documentation were trivially accessible over the network with default configurations. Two of these accepted arbitrary tool calls with no identity check. One exposed a full file system read capability to any connected client.
The fix isn't complicated. Bearer token validation at the transport layer, enforced on every request, is table stakes. Several mature MCP server implementations in the ecosystem do this correctly. The gap isn't knowledge. It's prioritization.
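For reference, here is roughly what that check can look like. This is a sketch under assumptions: it presumes an HTTP-based transport where request headers arrive as a mapping, and a single shared token loaded from wherever the deployment keeps its secrets. The function names are hypothetical.

```python
import hmac
from typing import Mapping, Optional


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Pull the token out of an 'Authorization: Bearer <token>' header, if present."""
    value = headers.get("Authorization") or headers.get("authorization") or ""
    scheme, _, token = value.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token


def is_authorized(headers: Mapping[str, str], expected_token: str) -> bool:
    """Return True only if the request carries the expected bearer token."""
    presented = extract_bearer_token(headers)
    if presented is None or not expected_token:
        # Missing header, wrong scheme, or a server with no token configured:
        # fail closed in every case.
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(presented.encode(), expected_token.encode())
```

The important property is that the check runs on every request before any tool dispatch, and that an unconfigured server rejects everything instead of accepting everything.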
Finding #2. Input Validation Is an Afterthought
Prompt Injection Through Tool Parameters
MCP tool parameters follow a specific path. The user provides input to an LLM. The LLM reasons about that input and decides to call a tool. It constructs the tool arguments, which the MCP client passes to the server. The server executes something based on those arguments. At no point in that chain is there a mandatory validation step, and most servers don't add one voluntarily.
The injection surface this creates is specific to the LLM context. A traditional API receives input from a human typing into a form or a developer constructing a request. An MCP server receives input that was synthesized by a language model, which means it can contain patterns that no human would naturally type. If a user's upstream message contains a carefully constructed string designed to manipulate the LLM's tool call construction, that string can flow directly into a system operation.
This is prompt injection weaponizing the MCP server against its own host. The attack doesn't require network access. It requires a user who can influence what the LLM puts into a tool argument, and an MCP server that doesn't validate what it receives.
"The most dangerous property of LLM-generated inputs is that they're syntactically plausible. They look like valid arguments. They pass informal eyeball checks. They just happen to contain traversal sequences or shell metacharacters that a human would never type."
Path Traversal and Shell Injection in File and Code Servers
File system MCP servers showed the most consistent pattern of path traversal vulnerabilities. The typical implementation accepts a file path as a tool argument and passes it directly to a read or write operation. No normalization. No allowlist of permitted directories. No check that the resolved path stays within the intended working directory.
A tool argument of ../../../../etc/passwd worked in four of the six file system servers audited. That's not a subtle finding. Path traversal is a decades-old vulnerability class with well-documented mitigations. The fact that it's present in freshly written MCP servers suggests that the developers building them aren't running through even a basic security checklist before shipping.
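The standard mitigation fits in a few lines. The sketch below resolves the requested path against a fixed root and rejects anything that lands outside it. The names resolve_inside and PathNotAllowed are hypothetical, and the example assumes a POSIX-style filesystem.

```python
from pathlib import Path


class PathNotAllowed(Exception):
    """Raised when a requested path resolves outside the permitted root."""


def resolve_inside(root: Path, requested: str) -> Path:
    """Resolve `requested` relative to `root` and ensure it stays under `root`."""
    root_resolved = root.resolve()
    # resolve() collapses ".." segments and follows symlinks that exist, so the
    # containment check below runs on the real target, not the raw string.
    # Joining an absolute `requested` discards the root, which the check also catches.
    candidate = (root_resolved / requested).resolve()
    if candidate != root_resolved and root_resolved not in candidate.parents:
        raise PathNotAllowed(f"{requested!r} resolves outside the permitted root")
    return candidate
```

A file tool would call resolve_inside on the argument first and only ever open the returned path, never the raw string it was given.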
Code execution and terminal MCP servers carried shell injection risks that were, in some cases, worse. Several implementations constructed shell commands by concatenating tool arguments directly into a command string. An argument containing a semicolon or a backtick could terminate the intended command and begin a new one. In a server with no authentication and no input validation, that's unauthenticated remote code execution with whatever privileges the server process holds.
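The safer construction passes arguments as a list and never involves a shell, so metacharacters arrive at the child process as literal text. The sketch below uses a hypothetical echo_argument tool that spawns a Python child process purely to demonstrate the behavior; a real server would substitute its own fixed program.

```python
import subprocess
import sys


def echo_argument(user_value: str, timeout: float = 10.0) -> str:
    """Hypothetical tool: hand one untrusted value to a child process and echo it back."""
    # The command is an argv list, not a string. No shell parses it, so ';',
    # backticks, and '$(...)' in user_value are passed through as plain characters.
    argv = [sys.executable, "-c", "import sys; print(sys.argv[1])", user_value]
    completed = subprocess.run(
        argv,
        capture_output=True,
        text=True,
        timeout=timeout,  # bound how long a single tool call may run
        check=True,       # raise if the child exits with a non-zero status
    )
    return completed.stdout.rstrip("\n")
```

Called with a value such as "report.txt; echo pwned", the function returns that exact string instead of running a second command. The vulnerable pattern is the same call with the arguments concatenated into one string and shell=True.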
The mitigation path here is straightforward: strict schema validation on every incoming argument, allowlist-based path resolution for file operations, parameterized command construction for anything touching a shell. None of these are novel techniques. They're just absent.
Finding #3. Rate Limiting and Abuse Controls Are Nonexistent
Infinite Loops and Runaway Agent Costs
Of the 22 servers audited, the number with any form of rate limiting was effectively zero. One server had a configurable timeout on individual operations. None had request rate caps, concurrency limits, or circuit breakers that would interrupt a runaway call pattern.
In a human-driven API context, rate limiting is a quality-of-life feature as much as a security control. Humans don't naturally make 10,000 requests per minute. Agents do. An agentic loop that encounters an error condition can retry indefinitely. A planning agent that decides a particular tool is useful for a task can call it hundreds of times in a single session. Without any throttle on the server side, there's nothing to interrupt that pattern.
The cost implications are direct when MCP servers proxy to paid third-party APIs. A server that wraps an external search API, a translation service, or a data enrichment provider and passes each tool call through to that provider has no protection against an agent that calls it 5,000 times in an hour. The bill lands with whoever owns the API credentials in the MCP server configuration.
Denial of Service via Agentic Tool Calls
The denial-of-service surface here is different from traditional API abuse. Classic DoS requires an external attacker. MCP servers can be knocked offline by their own legitimate clients if those clients are agents operating in a loop. A bug in an agent's planning logic, a misconfigured retry policy, or a deliberate prompt designed to trigger repeated tool calls can exhaust server resources, API quotas, or downstream service limits without any malicious external actor involved.
The pattern matches what Twitter saw in 2009 and 2010, before rate limiting became a hard platform constraint. Third-party applications with bugs hammered the API, took down service for other users, and generated bills that nobody had budgeted for. Stripe enforced rate limits early and avoided most of that pain. The MCP ecosystem is currently at the pre-enforcement stage, and the autonomous nature of agent clients makes the risk higher, not lower.
Missing circuit breakers, absent concurrency caps, and (in all but one of the audited servers) no timeout policies on long-running operations compound the problem. These are standard infrastructure patterns. They're just not present.
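To make the pattern concrete, here is a deliberately simplified circuit breaker. It opens after a run of consecutive failures and closes again after a cool-down; production implementations usually add a half-open probing state, which this sketch omits. The class and parameter names are hypothetical, and the clock is injectable so the behavior can be tested without sleeping.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Stop forwarding calls after repeated failures, then retry after a cool-down."""

    def __init__(
        self,
        max_failures: int = 5,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        """Return True if a call may proceed right now."""
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self.reset_after:
            # Cool-down elapsed: close the breaker and start counting again.
            self._opened_at = None
            self._failures = 0
            return True
        return False

    def record_success(self) -> None:
        self._failures = 0

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.max_failures:
            self._opened_at = self._clock()
```

A tool handler would check allow() before touching the downstream service and report the outcome afterward, so an agent stuck in a retry loop hits a fast local rejection instead of the paid API.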
Finding #4. Secrets and Credentials Live in Plain Sight
Hardcoded API Keys in Server Source Code
Eight of the 22 audited servers contained hardcoded credentials somewhere in their source code. Not in configuration files excluded from version control. In the actual source files, committed to public repositories, indexed by GitHub's code search.
The credentials varied: third-party API keys, database connection strings, service account tokens. Some appeared in example files that were clearly intended as templates but never had the actual values replaced. Others appeared in utility functions where a developer had tested with a real credential and never cleaned it up before pushing.
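Catching the most obvious cases before a push doesn't take much. The sketch below is illustrative only and is not the tooling used in the audit: it flags lines that look like an AWS access key ID or a quoted literal assigned to a secret-sounding name. The two patterns are assumptions chosen for the example, and a dedicated secret scanner will cover far more cases.

```python
import re
from typing import List, Tuple

# Illustrative patterns only; real scanners ship hundreds of provider-specific rules.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_assignment": re.compile(
        r"""(?i)\b(api[_-]?key|secret|token|password)\b\s*[:=]\s*["']([^"']{12,})["']"""
    ),
}


def scan_text(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, pattern_name) for every line that matches a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

Run against a file's contents, it returns line numbers to review by hand. A line that reads a value from the environment does not match, because no quoted literal follows the assignment.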
The LLM context window creates an additional exposure path that's specific to MCP. If an MCP server echoes configuration values back in its tool responses, or includes them in error messages, or logs them in a format that gets fed back into the LLM's context, those credentials become part of the conversation. A sufficiently capable agent, or a prompt injection attack targeting the agent, can then exfiltrate those values through a subsequent tool call or through the model's output.
Environment Variable Misuse and Leakage
Environment variables are a reasonable baseline for secrets in server applications, and several audited servers used them, which is better than hardcoding. But the implementations had problems that undermined the security benefit.
Common Misuse Patterns Found
Several servers printed the full environment at startup for debugging purposes, including every variable the process inherited. Others included environment variable values in exception stack traces that were returned to the MCP client as tool error responses. One server's health check endpoint returned a JSON object that included a subset of its configuration, including the API key it used to authenticate with a downstream service.
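One way to close the error-message path is to build tool errors from a short, redacted message instead of a raw stack trace. The sketch below is a minimal version under assumptions: secrets are identified by hypothetical name markers such as KEY or TOKEN in the environment variable name, and the returned dict shape is illustrative, not the MCP wire format.

```python
from typing import Iterable, List, Mapping


def redact(text: str, secrets: Iterable[str], placeholder: str = "[REDACTED]") -> str:
    """Replace every known secret value in `text` with a placeholder."""
    # Longest first, so a secret that contains another secret is masked whole.
    for secret in sorted({s for s in secrets if s}, key=len, reverse=True):
        text = text.replace(secret, placeholder)
    return text


def secret_values(
    env: Mapping[str, str],
    markers: Iterable[str] = ("KEY", "TOKEN", "SECRET", "PASSWORD"),
) -> List[str]:
    """Collect values of environment variables whose names look secret-bearing."""
    markers = tuple(markers)
    return [v for k, v in env.items() if v and any(m in k.upper() for m in markers)]


def safe_tool_error(exc: Exception, env: Mapping[str, str]) -> dict:
    """Build an error payload with no stack trace and with known secrets masked."""
    message = redact(str(exc), secret_values(env))
    return {"error_type": type(exc).__name__, "message": message}
```

Redaction by value is a backstop, not a guarantee: it only masks secrets the server already knows about, so it complements, rather than replaces, keeping configuration out of responses in the first place.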
The contrast with mature secret management is significant. Tools like HashiCorp Vault and AWS Secrets Manager exist precisely because environment variables aren't a sufficient security boundary for production systems. Secrets should have short rotation cycles, access should be audited, and values should never appear in logs or error output. Most MCP servers
The Web 1.0 API Negligence Cycle, Running on Replay
A Brief History of API Security Debt
SOAP APIs shipped in the early 2000s with XML signatures that were optional, WS-Security implementations that were inconsistent, and validation layers that most teams skipped entirely because the deadline was the deadline. REST APIs arrived and simplified everything, including the security. Early REST endpoints had no authentication requirements baked into the pattern, no rate limiting by default, and no schema validation on incoming payloads. Developers built fast. Security teams caught up later. Sometimes much later.
The industry spent roughly a decade retrofitting OAuth, API keys, HMAC signatures, and rate limiting onto systems that were never architected to carry them. The scars are still visible. Every major API breach in that era followed the same script: something shipped without auth, someone found it, data moved.
The MCP ecosystem is running that same script right now. Community servers are being published, starred, and installed by thousands of developers with no authentication layer, no input validation, and no documented threat model. The difference is that the lessons already exist this time. The industry has the receipts from the API era and is choosing not to read them.
Why AI Tooling Makes the Stakes Higher This Time
A compromised REST endpoint in 2008 meant a human had to do something with the access. They had to log in, navigate, exfiltrate, and cover tracks. The blast radius was real but bounded by human bandwidth. A compromised MCP server in an agentic pipeline operates differently. The agent doesn't pause. It doesn't ask for confirmation. It calls the tool, receives the response, and acts on it. If that response is poisoned, the downstream actions happen at machine speed without a human in the loop to catch the anomaly.
Security analysts who study the MCP attack surface describe the exposure as structurally different from prior API risk. The autonomous execution model means that tool poisoning, where a malicious server returns instructions that redirect agent behavior, produces effects before any human review is possible. That's not a marginal increase in risk. It's a different category of risk entirely.
Who Is Actually at Risk and How Bad Could It Get?
Individual Developers Running MCP Locally
The individual developer running an MCP server locally is not a low-risk user. They're often the highest-risk user, because they're the least likely to have a security team reviewing their configuration and the most likely to install a community server without reading the source code.
The exposure surface for a local deployment includes credential theft through environment variable access, filesystem compromise if the server has broad read or write permissions, and data exfiltration through tool calls that silently forward content to external endpoints. A malicious MCP server that has access to a developer's home directory has access to SSH keys, cloud credentials stored in dotfiles, and browser session data depending on the OS configuration.
This Is Not a Theoretical Risk
Community MCP servers with filesystem access and no authentication are being installed by developers right now. There is currently no purpose-built security scanning tooling for MCP packages equivalent to what exists for npm or PyPI. You are largely on your own.
Enterprises Deploying MCP in Production Pipelines
Enterprise deployments carry a different but larger risk profile. A single vulnerable MCP server in a production pipeline can become a lateral movement pivot. If the server has write access to a database, it can poison data that downstream systems trust. If it has access to email or calendar APIs, it can exfiltrate communications or inject meeting content. If it touches cloud resource APIs, the blast radius extends to infrastructure.
The supply chain risk is particularly sharp. An enterprise that installs a third-party MCP server is trusting that server's author, that server's dependencies, and every dependency of those dependencies. The same supply chain attack patterns that hit npm and PyPI ecosystems apply here, and MCP has none of the detection infrastructure those ecosystems have built over years. A compromised community server that gains write access to a production database or an outbound email relay is not a contained incident. It's a reportable breach.
What Secure MCP Servers Actually Look Like
Authentication and Authorization Patterns That Work
There's no single correct authentication pattern for MCP servers, but there are patterns that are clearly wrong. Shipping a server with no authentication is wrong. Shipping a server where the authentication check is optional or skippable by the client is wrong. Shipping a server where every tool shares the same access level regardless of sensitivity is wrong.
OAuth 2.0 works well for MCP servers that need to act on behalf of users across external services. The token exchange model maps cleanly to tool invocation contexts, and short-lived tokens limit the damage from credential exposure. API keys with rotation are appropriate for server-to-server contexts where the client is a known system rather than an end user. Keys should be rotated on a schedule and invalidated immediately on any suspected compromise. mTLS belongs in high-trust internal deployments where both sides of the connection need to prove identity before any tool call is accepted.
Authorization deserves its own attention. Blanket access, where a verified caller can invoke any tool the server exposes, is not a security model. Per-tool permission scopes mean that a client authorized to read a database is not automatically authorized to write to it. That distinction matters enormously when an agent's behavior is being influenced by a poisoned prompt or a compromised upstream tool.
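Per-tool scopes can be as simple as a lookup table checked before dispatch. In the sketch below, the tool names, scope strings, and the authorize_tool_call function are all hypothetical; the point is the deny-by-default shape, not the specific vocabulary.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical mapping from tool name to the scopes a caller must hold.
TOOL_SCOPES: Dict[str, FrozenSet[str]] = {
    "query_database": frozenset({"db:read"}),
    "update_record": frozenset({"db:read", "db:write"}),
    "send_email": frozenset({"email:send"}),
}


def authorize_tool_call(tool: str, granted_scopes: Iterable[str]) -> bool:
    """Allow a call only if the caller holds every scope the tool requires."""
    required = TOOL_SCOPES.get(tool)
    if required is None:
        # Tools without a declared scope set are denied, not silently allowed.
        return False
    return required.issubset(set(granted_scopes))
```

A client granted only db:read can query but cannot update, even if a poisoned prompt convinces the agent to try.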
"The principle of least privilege isn't a suggestion for MCP servers. It's the only thing standing between a tool call and a full infrastructure incident when something goes sideways upstream."
Input Validation, Rate Limiting, and Secrets Management Baselines
Every tool parameter should be validated against a strict schema before any execution happens. JSON Schema and Zod both work well for this. The validation layer needs to run before any business logic, not after. A tool that accepts a file path parameter and doesn't validate that path before passing it to the filesystem is a directory traversal vulnerability waiting to be triggered by an agent that received a crafted instruction.
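The sketch below shows the shape of that validation layer using only the standard library, as a stand-in for what JSON Schema or Zod would express declaratively. The read_file argument set, the field names, and the limits are assumptions made for the example.

```python
from dataclasses import dataclass

MAX_PATH_LENGTH = 4096       # assumed limit for illustration
MAX_READ_BYTES = 1_048_576   # assumed limit for illustration


class ValidationError(ValueError):
    """Raised when tool arguments do not match the expected schema."""


@dataclass(frozen=True)
class ReadFileArgs:
    path: str
    max_bytes: int = 65536


def parse_read_file_args(raw: object) -> ReadFileArgs:
    """Validate raw tool arguments before any business logic sees them."""
    if not isinstance(raw, dict):
        raise ValidationError("arguments must be an object")
    unknown = set(raw) - {"path", "max_bytes"}
    if unknown:
        # Reject unexpected keys instead of ignoring them.
        raise ValidationError(f"unexpected arguments: {sorted(unknown)}")
    path = raw.get("path")
    if not isinstance(path, str) or not path:
        raise ValidationError("path must be a non-empty string")
    if len(path) > MAX_PATH_LENGTH or "\x00" in path:
        raise ValidationError("path is too long or contains a null byte")
    max_bytes = raw.get("max_bytes", 65536)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(max_bytes, bool) or not isinstance(max_bytes, int):
        raise ValidationError("max_bytes must be an integer")
    if not 1 <= max_bytes <= MAX_READ_BYTES:
        raise ValidationError("max_bytes is out of range")
    return ReadFileArgs(path=path, max_bytes=max_bytes)
```

Schema validation checks type and shape only. Whether the path is allowed to be read is a separate check that still has to run on the validated value.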
Rate limiting for MCP servers should operate at three levels: per-session, per-tool, and per-client. A client that calls a high-privilege tool fifty times in ten seconds is either malfunctioning or compromised. Circuit breakers that trip on anomalous call volumes buy time for human review before damage compounds.
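A sliding-window limiter keyed on whatever combination of client, session, and tool the server cares about covers all three levels with one structure. The sketch below is a minimal in-memory version; the class name and key layout are hypothetical, and a multi-process deployment would need shared state that this example doesn't attempt.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Hashable


class SlidingWindowLimiter:
    """Allow at most `max_calls` per key within any `window_seconds` span."""

    def __init__(
        self,
        max_calls: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock
        self._calls: Dict[Hashable, Deque[float]] = defaultdict(deque)

    def allow(self, key: Hashable) -> bool:
        """Record and permit the call if the key is under its limit, else refuse."""
        now = self._clock()
        calls = self._calls[key]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True
```

Using a key such as (client_id, session_id, tool_name) gives per-tool limits inside each session; separate limiter instances with coarser keys can cap a whole client or a whole session.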
Secrets management is non-negotiable. Credentials belong in a secret store, not in environment variables that get logged, not in config files that get committed, and absolutely not hardcoded in server source code. Tool arguments should never appear in logs. A log line that captures a file write tool's payload is a log line that captures whatever sensitive content the agent was processing. Audit logging should capture metadata: which tool was called, by which client, at what time, with what result code. Not the payload.
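In code, metadata-only audit logging means the wrapper that records the call never serializes the arguments. The sketch below is one way to arrange that; call_with_audit, the record fields, and the sink callback are hypothetical names for the example.

```python
import json
import time
from typing import Any, Callable, Dict


def call_with_audit(
    tool_name: str,
    client_id: str,
    handler: Callable[..., Any],
    arguments: Dict[str, Any],
    sink: Callable[[str], None],
    clock: Callable[[], float] = time.time,
) -> Any:
    """Run a tool handler and emit one audit record containing metadata only."""
    started = clock()
    result_code = "error"
    try:
        result = handler(**arguments)
        result_code = "ok"
        return result
    finally:
        # Deliberately omit `arguments` and the handler's result: the record
        # says which tool ran, for whom, when, and how it ended.
        sink(json.dumps({
            "ts": started,
            "tool": tool_name,
            "client": client_id,
            "result": result_code,
        }, sort_keys=True))
```

The handler's exception, if any, still propagates to the caller; the audit record just notes that the call ended in an error without capturing what was in flight.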
Security Checklist for MCP Server Builders and Deployers
Before You Ship
Before You Install a Community MCP Server
The Industry Needs to Move Before the Attackers Do
What Protocol Maintainers Can Do
The MCP specification is young enough that security requirements can still be baked into the protocol itself rather than bolted on afterward. Anthropic and the working groups shaping MCP's evolution have a narrow window to make authentication mandatory at the protocol level, define standard authorization scopes, and publish a security profile that implementations are expected to meet. That window closes as the ecosystem matures and backward compatibility becomes a constraint.
An official MCP security best practices document, something with the specificity and authority of OWASP's API guidance, would give developers a reference point that doesn't currently exist. Package registries that index MCP servers should flag servers with known security issues the same way npm flags packages with critical CVEs. Neither of these things requires inventing new security concepts. They require applying existing ones to a new context.
What the Developer Community Must Demand
Developers who install, star, and share MCP servers are setting the culture of this ecosystem right now. Demanding authentication before recommending a server costs nothing. Asking "what does this server do with my credentials?" before installing costs nothing. The community that built strong security norms around npm packages, Docker images, and browser extensions can build the same norms here.
The tooling opportunity is real. MCP-specific static analysis, fuzzing frameworks for tool parameter handling, and runtime protection layers are all buildable with current technology. The market exists because the risk exists.
The industry solved API security. It took longer than it should have and cost more than it needed to. There's no reason to run the same experiment twice.