Guide
Model Context Protocol (MCP) explained
Every useful AI assistant eventually needs to reach outside the model weights: query a database, read a repo, post to Slack, or fetch live market data. Before 2024, each product reinvented that wiring with bespoke plugins and fragile prompt hacks. The Model Context Protocol (MCP) — introduced by Anthropic and now adopted across hosts like Cursor, Claude Desktop, and open-source agent frameworks — is an open standard that defines how an LLM host discovers capabilities from MCP servers, invokes tools, reads resources, and shares prompt templates over a small JSON-RPC surface. Think of MCP as USB-C for AI integrations: one port shape, many peripherals, explicit capability negotiation instead of ad-hoc REST wrappers stuffed into system prompts.
Why MCP exists: the integration mess it replaces
Early AI agent stacks glued tools together manually: hard-coded OpenAPI specs in prompts, one-off Python functions per integration, and no shared discovery layer. That worked for demos and broke in production:
- Capability drift — the model did not know which tools existed until a developer updated a giant system prompt.
- No versioning contract — changing a tool schema silently broke agents in the field.
- Security sprawl — every integration chose its own auth, logging, and sandbox story.
- Host lock-in — a Slack bot integration could not move to an IDE assistant without a rewrite.
MCP separates concerns: hosts run the user-facing agent and model; servers expose typed capabilities; clients inside the host speak MCP on the host's behalf. A Postgres MCP server written once can plug into any compliant host. That reuse is the economic argument — fewer N-times-M integrations, more composable agent stacks.
MCP architecture: hosts, clients, and servers
The vocabulary is precise and easy to confuse on first read:
Host
The application the human interacts with — an IDE, chat client, or autonomous agent runtime. The host owns the LLM session, user identity, and policy (which servers may connect, rate limits, approval UI). It spawns one or more MCP clients.
MCP client (inside the host)
A lightweight connector that maintains a session with a single MCP server. The client handles capability negotiation, routes tool calls, and streams results back to the host's agent loop. One host typically runs many clients in parallel — filesystem, GitHub, browser, internal CRM — each isolated.
MCP server
A process or remote service that implements the protocol and advertises what it can do. Servers declare tools (callable functions with JSON Schema inputs), resources (readable URIs like files or records), and optional prompts (reusable template blocks). The server enforces its own authorization: an MCP server is not trusted just because the host connected to it.
Data flows: user message enters host, agent decides it needs a tool, host routes the call through the right MCP client, server executes with local credentials, structured result returns to the model as a tool result message. This mirrors native LLM function calling but standardizes discovery and transport across vendors.
Capabilities: tools, resources, and prompts
Tools
Tools are the workhorses — search_issues,
run_sql_query, deploy_preview. Each tool publishes
a name, human-readable description, and JSON Schema for arguments. Good
descriptions materially affect whether the model picks the right tool;
treat them like API docs, not afterthoughts. Tool results should be
structured JSON or concise text — dumping 50 KB of raw HTML into the
context window burns tokens and confuses downstream reasoning.
Resources
Resources are addressable content the model can read without a full tool
round-trip when the host supports resource subscriptions. Examples: a file
at file:///project/README.md, a database row URI, or a cached
document snapshot. Resources complement
RAG:
RAG retrieves semantically similar chunks; resources expose canonical named
artifacts the user or server explicitly publishes.
Prompts
MCP prompts are server-provided templates — e.g. "code review checklist for this repo" — with argument slots. They let integrations ship domain expertise without every host hard-coding the same system prompt paragraphs. Prompts are optional; many early servers expose tools only.
Transport: stdio, SSE, and HTTP
MCP messages are JSON-RPC 2.0. The transport layer varies by deployment:
- stdio — the host spawns a local server subprocess and communicates over stdin/stdout. Common for filesystem, Git, and local DB servers in desktop IDEs. Simple and firewall-friendly; the server inherits the host machine's trust boundary.
- HTTP + Server-Sent Events (SSE) — remote servers stream events to the client over HTTP. Useful for shared team servers (internal knowledge base, ticketing system) without installing binaries on every laptop.
- Streamable HTTP — newer transports unify bidirectional messaging over HTTP for hosted SaaS MCP endpoints; check the spec revision your SDK targets.
Transport choice is a security decision, not just plumbing. stdio servers run as the local user — a malicious or compromised MCP server can read any file the user can. Remote HTTP servers need TLS, authentication, and network policies like any internal API. Pair MCP with the same discipline you apply to REST API design: least privilege, audit logs, and explicit scopes per connection.
Lifecycle: connect, negotiate, call, teardown
A typical session follows a fixed handshake:
- Initialize — client and server exchange protocol versions and capabilities (which feature flags each side supports).
- List capabilities — client calls
tools/list,resources/list,prompts/listto build the catalog the agent sees. - Agent loop — model chooses a tool; client sends
tools/callwith arguments; server validates schema, executes, returns content or structured error. - Notifications — optional server push when resources change (file saved, ticket updated).
- Shutdown — clean close; subprocess servers exit, HTTP sessions release.
Hosts should cache capability lists per session but re-list after server upgrades. Version your server package and document breaking schema changes — agents do not read changelogs, but operators do.
MCP vs native function calling and custom plugins
OpenAI, Google, and Anthropic models all support in-band function / tool calling in the chat API. MCP does not replace that mechanism — it standardizes what sits behind the host's tool registry. Benefits of MCP:
- Portability — ship one server, connect many hosts.
- Discovery — dynamic tool lists instead of redeploying prompt JSON for every new endpoint.
- Ecosystem — community servers for GitHub, Puppeteer, SQLite, Sentry, etc., composable like packages.
When MCP is overkill: a single static function inside one app (e.g. "format date") does not need a protocol. When MCP pays off: platform teams exposing internal systems to multiple agent surfaces, or products that let users install third-party capability packs safely.
Security: where MCP deployments go wrong
MCP concentrates risk: one compromised server can exfiltrate data or mutate production systems. Treat every server like a privileged microservice:
- Prompt injection via tool output — untrusted web pages returned as tool results can instruct the model to leak secrets. Sanitize, truncate, and scan outputs; see prompt injection defenses.
- Over-broad tools —
run_shell_commandwith user input is remote code execution with extra steps. Split tools narrowly; require human approval for destructive actions. - Credential storage — servers hold API keys; never log arguments containing tokens; rotate keys per server instance.
- Supply chain — installing unaudited MCP servers from the internet is equivalent to installing unaudited browser extensions with root access. Pin versions, review source, run in containers where possible.
- Confused deputy — the model requests tool A; a malicious resource tricks the host into calling tool B. Enforce server-side authorization on every call, not just connection time.
Enterprise deployments add OAuth per server, per-user token scoping, and allowlists of which hosts may attach to which servers — patterns familiar from OAuth 2.0 federated apps.
Building and operating MCP servers
Official SDKs exist for TypeScript, Python, and other languages. A minimal server implements the handshake, registers tools with schemas, and validates inbound arguments before side effects:
- Schema-first — generate JSON Schema from types (Zod, Pydantic) so docs and validation stay aligned.
- Idempotent reads, explicit writes — mirror idempotency habits; destructive tools need confirmation flows in the host UI.
- Structured errors — return actionable messages ("repo not found", "missing scope: issues:write") so the model can recover.
- Observability — log tool name, latency, success rate; trace IDs propagated from the host through observability stacks.
- Rate limits — agents loop fast; protect upstream APIs with quotas and backoff.
For data-heavy tools, prefer querying indexed stores or vector DBs over shipping entire tables into context — pair MCP tools with vector search when semantic retrieval beats full scans.
Production checklist
- Inventory servers — document each connected MCP server, owner, data classification, and blast radius.
- Least-privilege credentials — read-only DB users, scoped OAuth tokens, no shared admin keys across environments.
- Human-in-the-loop — require approval for payments, deploys, mass emails, and permission changes.
- Output limits — cap tool response size; summarize large payloads server-side.
- Injection testing — red-team tool outputs and untrusted resources before exposing to production users.
- Version pinning — lock server packages; test capability changes in staging hosts.
- Fallback behavior — when a server is down, degrade gracefully; do not let the agent hallucinate tool results.
- Cost controls — MCP calls often trigger paid APIs; meter per user and per tool.
Key takeaways
- MCP standardizes how LLM hosts discover and invoke external capabilities — tools, resources, and prompts over JSON-RPC.
- Hosts run clients; servers expose integrations — one server can serve many hosts if auth and policy allow.
- stdio suits local dev tools; HTTP/SSE suits shared team servers — transport choice defines the trust boundary.
- MCP complements native function calling; it does not replace model-level tool APIs but unifies what sits behind them.
- Security is the hard part — prompt injection, over-powerful tools, and supply-chain risk dominate failure modes.
- Treat MCP servers like privileged microservices — schema validation, observability, rate limits, and human approval for destructive actions.
Related reading
- AI agents and tool use explained — function calling, ReAct loops, and when workflows beat autonomy
- Prompt injection explained — defending agents that read untrusted tool output
- RAG explained — grounding models with retrieved documents alongside MCP resources
- LLM evaluation and benchmarking explained — measuring agent reliability after adding tools