The Model Context Protocol (MCP) has rapidly become the standard for connecting AI models to your data and tools. However, most initial implementations are strictly local—relying on stdio to pipe data between a local process and your AI client (like Claude Desktop or Cursor). While this works for personal scripts, it doesn’t scale for teams.
To truly unlock the potential of AI agents in the enterprise, you need to decouple the “Brain” (the AI client) from the “Hands” (the tools). This means moving your MCP servers from localhost to robust cloud infrastructure.
This guide details the architectural shift required to run MCP workloads on AWS ECS and EKS. We will cover how to deploy remote MCP servers using Server-Sent Events (SSE), how to host them on Fargate and Kubernetes, and—most importantly—how to secure them so you aren’t exposing your internal database tools to the open internet.
The Architecture Shift: From Stdio to Remote SSE
In a local setup, the MCP client spawns the server process and communicates via standard input/output. This is secure by default because it’s isolated to your machine. To move this to AWS, we must switch the transport layer.
The MCP specification supports SSE (Server-Sent Events) for remote connections. This changes the communication flow:
- Server-to-Client: Uses a persistent SSE connection to push events (like tool outputs or log messages).
- Client-to-Server: Uses standard HTTP POST requests to send commands (like “call tool X”).
Pro-Tip: Unlike WebSockets, SSE is unidirectional (Server -> Client). This is why the protocol also requires an HTTP POST endpoint for the client to talk back. When deploying to AWS, your Load Balancer must support long-lived HTTP connections for the SSE channel.
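To make the split concrete, here is a minimal client-side sketch, assuming the official Python SDK’s `mcp.client.sse.sse_client` helper; the server URL is a placeholder:

import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

async def main() -> None:
    # sse_client opens the long-lived GET /sse stream; the transport
    # POSTs commands back to the server's message endpoint internally.
    async with sse_client("https://mcp.example.com/sse") as (read, write):  # placeholder URL
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

asyncio.run(main())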
Option A: Serverless Simplicity with AWS ECS (Fargate)
For most standalone MCP servers—such as a tool that queries a specific RDS database or interacts with an internal API—AWS ECS Fargate is the ideal host. It removes the overhead of managing EC2 instances while providing native integration with AWS VPCs for security.
1. The Container Image
You need an MCP server that listens on a port (usually via a web framework like FastAPI or Starlette) rather than just running a script. Here is a conceptual Dockerfile for a Python-based remote MCP server:
FROM python:3.11-slim
WORKDIR /app
# Install MCP SDK and a web server (e.g., Starlette/Uvicorn)
RUN pip install "mcp[cli]" uvicorn starlette
COPY . .
# Expose the port for SSE and HTTP POST
EXPOSE 8080
# Run the server using the SSE transport adapter
CMD ["uvicorn", "server:app", "--host", "0.0.0.0", "--port", "8080"]
2. The Task Definition & ALB
When defining your ECS Service, you must place an Application Load Balancer (ALB) in front of your tasks. The critical configuration here is the Idle Timeout.
- Health Checks: Ensure your container exposes a simple `/health` endpoint, or the ALB will kill the task during long AI-generation cycles.
- Timeout: Increase the ALB idle timeout to at least 300 seconds. AI models can take time to “think” or process large tool outputs, and you don’t want the SSE connection to drop prematurely.
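Both settings can be applied via the console, Terraform, or a few lines of boto3. Here is a hedged sketch; the ARNs are placeholders for your own resources:

import boto3

elbv2 = boto3.client("elbv2")

# Raise the idle timeout so long-lived SSE connections survive model "think" time.
elbv2.modify_load_balancer_attributes(
    LoadBalancerArn="arn:aws:elasticloadbalancing:<region>:<account>:loadbalancer/app/mcp-alb/...",  # placeholder
    Attributes=[{"Key": "idle_timeout.timeout_seconds", "Value": "300"}],
)

# Point the target group health check at the lightweight /health route.
elbv2.modify_target_group(
    TargetGroupArn="arn:aws:elasticloadbalancing:<region>:<account>:targetgroup/mcp-tg/...",  # placeholder
    HealthCheckPath="/health",
    HealthCheckIntervalSeconds=30,
)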
Option B: Scalable Orchestration with Amazon EKS
If your organization already operates on Kubernetes, deploying MCP servers as standard Kubernetes Deployments on Amazon EKS allows for advanced traffic management. This is particularly useful if you are running a “Mesh” of MCP servers.
The Ingress Challenge
The biggest hurdle on EKS is the Ingress Controller. If you use NGINX Ingress, it defaults to buffering responses, which breaks SSE (the client waits for the buffer to fill before receiving the first event).
You must apply specific annotations to your Ingress resource to disable buffering for the SSE path:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mcp-server-ingress
  annotations:
    # Critical for SSE to work properly
    nginx.ingress.kubernetes.io/proxy-buffering: "off"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
spec:
  ingressClassName: nginx
  rules:
    - host: mcp.internal.yourcompany.com
      http:
        paths:
          - path: /sse
            pathType: Prefix
            backend:
              service:
                name: mcp-service
                port:
                  number: 80
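For completeness, here is a minimal Deployment and Service that the Ingress above can route to. The image name and labels are placeholder assumptions; only the Service name and port must match the Ingress backend:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: mcp-server
  template:
    metadata:
      labels:
        app: mcp-server
    spec:
      containers:
        - name: mcp-server
          image: your-registry/mcp-server:latest  # placeholder image
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: mcp-service
spec:
  type: ClusterIP  # keep it private; the Ingress handles routing
  selector:
    app: mcp-server
  ports:
    - port: 80
      targetPort: 8080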
Warning: Never expose an MCP server Service as `LoadBalancer` (public) without strict Security Groups or authentication. An exposed MCP server gives an AI direct execution access to whatever tools you’ve enabled (e.g., “Drop Database”).
Security: The “MCP Proxy” & Auth Patterns
This is the section that separates a “toy” project from a production deployment. How do you let an AI client (running on a developer’s laptop) access a private ECS/EKS service securely?
1. The VPN / Tailscale Approach
The simplest method is network isolation. Keep the MCP server in a private subnet. Developers must be on the corporate VPN or use a mesh overlay like Tailscale to reach the `http://internal-mcp:8080/sse` endpoint. This requires zero code changes to the MCP server.
2. The AWS SigV4 / Auth Proxy Approach
For a more cloud-native approach, AWS recently introduced the concept of an MCP Proxy. This involves:
- Placing your MCP Server behind an ALB with AWS IAM Authentication or Cognito.
- Running a small local proxy on the client machine (the developer’s laptop).
- The developer configures their AI client to talk to `localhost:proxy-port`.
- The local proxy signs requests with the developer’s AWS credentials (SigV4) and forwards them to the remote ECS/EKS endpoint.
This ensures that only users with the correct IAM Policy (e.g., AllowInvokeMcpServer) can access your tools.
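To make the pattern concrete, here is a hedged sketch of the signing step such a local proxy performs, using botocore’s SigV4 signer. The service name ("execute-api"), region, and upstream URL are placeholder assumptions, and a real proxy would also need to relay the streaming SSE channel and refresh credentials:

import boto3
import requests
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

UPSTREAM = "https://mcp.internal.yourcompany.com/messages"  # placeholder endpoint

def forward_signed(body: bytes) -> requests.Response:
    # Pull the developer's local AWS credentials (env vars, SSO, etc.).
    credentials = boto3.Session().get_credentials()
    # Build an unsigned request, then let botocore attach the SigV4 headers.
    aws_request = AWSRequest(
        method="POST",
        url=UPSTREAM,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    SigV4Auth(credentials, "execute-api", "us-east-1").add_auth(aws_request)
    # Replay the now-signed request against the remote MCP endpoint.
    return requests.post(UPSTREAM, data=body, headers=dict(aws_request.headers))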
Frequently Asked Questions (FAQ)
Can I use the official Amazon EKS MCP Server remotely?
Yes, but it’s important to distinguish between hosting a server and using a tool. AWS provides an open-source Amazon EKS MCP Server. This is a tool you run (locally or remotely) that gives your AI the ability to run kubectl commands and inspect your cluster. You can host this inside your cluster to give an AI agent “SRE superpowers” over that specific environment.
Why does my remote MCP connection drop after 60 seconds?
This is almost always a Load Balancer or Reverse Proxy timeout. SSE requires a persistent connection. Check your AWS ALB “Idle Timeout” settings or your NGINX `proxy_read_timeout`. Ensure they are set to a value higher than your longest expected idle time (e.g., 5-10 minutes).
Should I use ECS or Lambda for MCP?
While Lambda is cheaper for sporadic use, MCP is a stateful protocol (via SSE). Running SSE on Lambda requires using Function URLs with response streaming, which has a 15-minute hard limit and can be tricky to debug. ECS Fargate is generally preferred for the stability of the long-lived connection required by the protocol.

Conclusion
Moving your Model Context Protocol infrastructure from local scripts to AWS ECS and EKS is a pivotal step in maturing your AI operations. By leveraging Fargate for simplicity or EKS for mesh-scale orchestration, you provide your AI agents with a stable, high-performance environment to operate in.
Remember, “Powering Up” isn’t just about connectivity; it’s about security. Whether you choose a VPN-based approach or the robust AWS SigV4 proxy pattern, ensuring your AI tools are authenticated is non-negotiable in a production environment.
Next Step: Audit your current local MCP tools. Identify one “heavy” tool (like a database inspector or a large-context retriever) and containerize it using the Dockerfile pattern above to deploy your first remote MCP service on Fargate. Thank you for reading the DevopsRoles page!
