The Best MCP Gateways for the OpenAI Agents SDK in 2026
Romain Sestier · 10 min
An MCP gateway gives agents built on the OpenAI Agents SDK one authenticated endpoint in front of all your business systems — with the per-user credentials, curated tool surface, and downstream audit the SDK leaves to your code. OpenAI’s plane governs the call: allowed_tools, approvals, traces. The gateway governs what’s behind it: connectors, tokens for every end user, logs of what actually changed. With Agent Builder winding down, the SDK is where OpenAI agents get built. Verdict: StackOne for multi-user agents on systems of record; Composio or Pipedream for developer products needing breadth; Arcade when infrastructure control comes first.
Where OpenAI’s agent stack stands after the Agent Builder deprecation
On June 3, 2026, OpenAI announced it is winding down AgentKit’s visual Agent Builder — shutdown on November 30, 2026 — and the Evals platform, which goes read-only on October 31, 2026 and shuts down on November 30, 2026; ChatKit continues (OpenAI deprecations, June 2026). The documented migration paths are the Agents SDK for code-first teams and ChatGPT Workspace Agents for no-code builders.
For anyone deciding where to invest, the signal is clear: the Agents SDK plus the Responses API MCP tool is the durable center of OpenAI’s agent stack. One related note: AgentKit’s Connector Registry — launched in beta in October 2025, gated on the Global Admin Console — has not been announced as GA, and its dedicated docs page is currently unavailable (checked June 9, 2026); check its status before building on it.
How the Agents SDK and Responses API connect to MCP servers
The MCP support is genuinely good — full in Python, near-parity in JS/TS, with the gaps running both ways (Python MCP guide; JS/TS MCP guide, June 2026):
Three connection paths. HostedMCPTool hands the connection to the Responses API — OpenAI’s infrastructure calls the remote server directly, configured with server_label, server_url, a connector_id for first-party connectors, authorization, require_approval, and defer_loading. Or your process owns the connection: MCPServerStreamableHttp for remote servers, MCPServerStdio for local subprocesses (MCPServerSse exists, but the transport is deprecated by the MCP project). Python adds MCPServerManager for multi-server lifecycles.
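The two ownership models can be sketched as plain configuration dictionaries. This is a minimal sketch under stated assumptions, not SDK code: the helper names, the example URL, and the Bearer header scheme are hypothetical; only the field names (server_label, server_url, authorization, require_approval) come from the paragraph above.

```python
from typing import Any


def hosted_tool_config(label: str, url: str, token: str) -> dict[str, Any]:
    """Config for the hosted path: the Responses API calls the server."""
    return {
        "type": "mcp",
        "server_label": label,
        "server_url": url,
        "authorization": token,  # re-sent on every request, not stored by OpenAI
        "require_approval": "always",
    }


def streamable_http_params(url: str, token: str) -> dict[str, Any]:
    """Params for the self-owned path: your process opens the connection."""
    return {
        "url": url,
        # Assumption: the remote server accepts a standard Bearer header.
        "headers": {"Authorization": f"Bearer {token}"},
    }


EXAMPLE_URL = "https://mcp.example.com/mcp"  # hypothetical endpoint
hosted = hosted_tool_config("example", EXAMPLE_URL, "user-token")
local = streamable_http_params(EXAMPLE_URL, "user-token")
```

As we read the Python SDK, the first dictionary is the shape HostedMCPTool takes as tool_config, and the second is the shape MCPServerStreamableHttp takes as params; check the current MCP guide before relying on either.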
Tool governance is built in. Static filtering via create_static_tool_filter (allow/block lists) and dynamic filtering via a ToolFilterContext callable; per-tool approval policies ("always", "never", or per-tool maps) that pause the run for human sign-off; cache_tools_list to avoid re-fetching tool lists every run. Tracing captures list-tools and tool calls automatically, defaults to the OpenAI Traces dashboard, and supports 25+ external processors (tracing docs); for long-running agents, OpenAI documents Temporal, Restate, and DBOS integrations (running agents).
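The allow/block semantics of static filtering can be illustrated with a small stand-in. This is not create_static_tool_filter itself: filter_tools and the tool names are hypothetical, and the ordering (allow list first, then block list) is our reading of the SDK documentation.

```python
from typing import Iterable, Optional


def filter_tools(
    tool_names: Iterable[str],
    allowed: Optional[set[str]] = None,
    blocked: Optional[set[str]] = None,
) -> list[str]:
    """Keep tools on the allow list (if one is given), then drop blocked ones."""
    kept = []
    for name in tool_names:
        if allowed is not None and name not in allowed:
            continue  # not on the allow list
        if blocked is not None and name in blocked:
            continue  # explicitly blocked
        kept.append(name)
    return kept


# Hypothetical tool names exposed by a remote server.
server_tools = ["list_employees", "get_employee", "delete_employee", "run_report"]
visible = filter_tools(
    server_tools,
    allowed={"list_employees", "get_employee", "delete_employee"},
    blocked={"delete_employee"},
)
print(visible)
```

A block list alone is the lighter option when a server is mostly safe; an allow list is the safer default when the server's tool list can change underneath you.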
The Responses API MCP tool is the hosted half. allowed_tools constrains which tools the model sees; require_approval produces approval-request items your code answers, and OpenAI’s guidance is that sensitive actions should “always require approval.” There’s no per-call fee — you pay tokens for tool definitions and calls. Eight first-party connectors ship (Dropbox, Gmail, Google Calendar, Google Drive, Microsoft Teams, Outlook Calendar, Outlook Email, SharePoint); everything else is a custom server_url, with explicit safety guidance: connect only to “official servers hosted by the service providers themselves,” and treat tool outputs as prompt-injection surface (connectors and MCP guide, June 2026).
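One way to follow the "always require approval" guidance while exempting a few read-only tools is to build the require_approval value programmatically. The sketch below assumes the per-tool form is a map of the shape {"never": {"tool_names": [...]}}; verify that against the current connectors and MCP guide. The helper and tool names are hypothetical.

```python
from typing import Any, Union

ApprovalPolicy = Union[str, dict[str, Any]]


def approval_policy(read_only_tools: list[str]) -> ApprovalPolicy:
    """Require approval for everything except the named read-only tools."""
    if not read_only_tools:
        return "always"
    # Assumed shape of the per-tool map; not confirmed by the article.
    return {"never": {"tool_names": sorted(read_only_tools)}}


def needs_approval(policy: ApprovalPolicy, tool_name: str) -> bool:
    """Local mirror of the policy, useful for tests and dashboards."""
    if policy == "always":
        return True
    if policy == "never":
        return False
    exempt = policy.get("never", {}).get("tool_names", [])
    return tool_name not in exempt


policy = approval_policy(["search_records"])  # hypothetical tool name
```

Keeping the exempt list short and explicit means a newly added write tool falls into the approval path by default.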
Auth is where the seams show. The authorization parameter carries an OAuth access token that OpenAI does not store — you re-send it on every request. The JS SDK accepts an authProvider (an OAuthClientProvider); Python has no first-class OAuth provider option — static headers only. And the documented connector example reads its token from an environment variable.
Token economics are documented by OpenAI itself. Remote servers’ verbose tool definitions are injected into context, and the OpenAI cookbook works a real example: the same one-line request costs 945 total tokens on gpt-4.1 versus 38,022 on o4-mini, where the reasoning model re-processes the server’s full imported tool list (36,436 input tokens, most of them cached). The cookbook’s mitigations — allowed_tools to trim the list and previous_response_id to keep mcp_list_tools cached — are exactly the levers a curated gateway surface keeps short and stable. The newer mitigation is defer_loading with tool search (Responses API, GPT-5.4 and newer).
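The cookbook figures quoted above reduce to two ratios worth keeping in mind. The arithmetic below uses only the numbers from this paragraph; the variable names are ours.

```python
# Token counts quoted from the OpenAI cookbook example above.
GPT_41_TOTAL = 945
O4_MINI_TOTAL = 38_022
O4_MINI_INPUT = 36_436

# How much more the reasoning model consumed for the same request.
ratio = O4_MINI_TOTAL / GPT_41_TOTAL

# Share of the o4-mini total that was input (tool definitions dominate it).
input_share = O4_MINI_INPUT / O4_MINI_TOTAL

print(f"o4-mini used {ratio:.1f}x the tokens of gpt-4.1")
print(f"input tokens were {input_share:.1%} of the o4-mini total")
```

Roughly 40 times the tokens, almost all of it input, is why trimming and stabilizing the tool list matters more than trimming the prompt.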
What OpenAI’s native controls don’t cover
OpenAI’s plane governs the agent’s side of the call — which tools are visible, which need approval, what the trace shows. Four jobs stay on yours:
- Per-end-user credentials. This is the gap. As of June 9, 2026, neither the connectors/MCP guide nor the Agents SDK docs document a multi-user credential pattern — the SDK gives you nowhere to put 500 users’ tokens (see the auth seams above).
- The connectors themselves. Eight first-party connectors, all collaboration-suite-shaped. Workday, Salesforce, ServiceNow and the rest of the systems of record are custom server_url territory: someone builds, hosts, and maintains those servers, under guidance that says to connect only to official ones.
- Audit beyond Traces. Tracing records that the agent listed tools and called one, with inputs and outputs. It does not record the downstream provider requests the call produced: what actually changed in the system of record, under whose credentials.
- Injection scanning on tool outputs. The Agents SDK ships input/output and tool-level guardrails with tripwires (guardrails docs), and the separate open-source OpenAI Guardrails framework includes LLM-based prompt-injection detection on tool calls and outputs — but you wire it yourself; the MCP guide’s own mitigations are procedural, and nothing scans hosted MCP tool outputs automatically.
The gateway’s role is exactly this layer. It doesn’t replace allowed_tools or approvals — it makes them workable once the agent has real users and real systems behind it.
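The per-end-user credential gap comes down to a lookup your backend performs on every request. The sketch below is a minimal illustration, assuming a gateway that issues one token per user: GatewayTokenStore, tool_config_for, and the URL are hypothetical, and the in-memory dictionary stands in for whatever secure storage you actually use.

```python
from typing import Any


class GatewayTokenStore:
    """In-memory stand-in for wherever per-user gateway tokens live."""

    def __init__(self) -> None:
        self._tokens: dict[str, str] = {}

    def put(self, user_id: str, token: str) -> None:
        self._tokens[user_id] = token

    def get(self, user_id: str) -> str:
        try:
            return self._tokens[user_id]
        except KeyError:
            raise LookupError(f"user {user_id!r} has not linked an account") from None


def tool_config_for(user_id: str, store: GatewayTokenStore, url: str) -> dict[str, Any]:
    """Build a per-request MCP tool config carrying this user's gateway token."""
    return {
        "type": "mcp",
        "server_label": "gateway",
        "server_url": url,
        "authorization": store.get(user_id),  # one token per end user
        "require_approval": "always",
    }


GATEWAY_URL = "https://gateway.example.com/mcp"  # hypothetical endpoint
store = GatewayTokenStore()
store.put("alice", "tok-a")
store.put("bob", "tok-b")
config = tool_config_for("alice", store, GATEWAY_URL)
```

The point of the design is that your code only ever handles the gateway token; the downstream provider credentials stay behind the gateway.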
What to look for in an MCP gateway for the Agents SDK
| Criterion | Why it matters for the Agents SDK specifically |
|---|---|
| Remote streamable HTTP + OAuth that fits authorization pass-through | A remote streamable HTTP endpoint whose token drops into authorization; HostedMCPTool re-sends it every request and never stores it |
| Per-end-user credentials | The SDK gives you nowhere to put 500 users’ tokens; the gateway should manage downstream credentials per user so your code sends one gateway token |
| Curated tool surface that keeps allowed_tools short and stable | OpenAI’s cookbook shows the same request costing 945 total tokens on gpt-4.1 versus 38,022 on o4-mini, driven mostly by imported tool definitions; allowed_tools trims that overhead, but the savings only hold if the list doesn’t churn or balloon |
| Audit beyond Traces | Traces show the agent’s side; gateway request logs should capture what happened in the downstream system, exportable to Datadog or Grafana |
| Injection scanning on tool responses | OpenAI’s MCP mitigations are procedural; the gateway can scan responses before they reach the model |
| Depth on systems of record | The 8 first-party connectors stop at collaboration suites; the gateway’s catalog is your write capability on HRIS, CRM, ITSM, ERP |
The best MCP gateways for the OpenAI Agents SDK, compared
Same evidence rules as our full MCP gateway comparison: capability facts from public documentation, no performance claims, StackOne disclosed. All four are managed gateways — the native baseline (first-party connectors plus allowed_tools and approvals) already gives you call-level governance; what’s missing is the layer behind it:
| Platform | Remote MCP + OAuth | Per-user credentials | Tool surface | Audit beyond Traces | Catalog | Pricing |
|---|---|---|---|---|---|---|
| StackOne | Yes — managed remote MCP, OAuth 2.1 end-user flow | Per-user linked accounts via OAuth 2.1 end-user flow | Curated actions; two meta-tools (search + execute) keep context constant | Request logs to provider level, Datadog/Grafana export | 310+ connectors, 20,000+ agent-optimized actions | Free plan (full catalog) |
| Composio | Yes — hosted MCP servers | Yes (Connect Link per user_id) | Toolkit-level selection | Observability; audit detail light | ~1,000 toolkits | Free tier; from $29/mo |
| Pipedream Connect MCP | Yes — remote or self-hosted | Yes (per external_user_id) | Developer-managed tool selection | Logging; governance not detailed | 3,000+ APIs | Usage-based; free tier |
| Arcade | Yes — cloud, VPC, on-prem, air-gapped | End-user OAuth via your IdP | Developer-selected toolkits | Lifecycle governance | ~150 servers in registry | Free tier; from $25/mo |
1. StackOne
StackOne is the enterprise layer for AI agents to safely act on any application, and it meets the criteria like this. The server side is a managed remote MCP endpoint that drops straight into HostedMCPTool or MCPServerStreamableHttp, with an OAuth 2.1 flow where the end user authorizes the client themselves (criterion 1). Per-user credentials are the core of the design: each user links accounts once through SSO and a consent screen, StackOne holds the downstream credentials per linked account, and your code sends one gateway token per user — the authorization field has something safe to carry (criterion 2). The tool surface is built for the token math: tools aren’t direct wrappers over API endpoints but curated, context-optimized actions, and at scale agents get two meta-tools, search and execute, so context stays constant at any catalog size (a 460× reduction versus loading every definition) and allowed_tools stays short and stable (criterion 3). Request logs capture every call down to the underlying provider requests, exportable to Datadog or Grafana — the downstream half of OpenAI’s Traces (criterion 4). StackOne Defender scans tool responses for prompt injection before they reach the agent (89.0% detection accuracy in our published evaluation) (criterion 5). Depth is verifiable per system: Salesforce has 380 actions, Jira 147 actions, Workday 128 actions (criterion 6). Limitation: the catalog focuses on business systems, not consumer applications — for the consumer-app long tail, Zapier’s catalog is far bigger. When a system isn’t in the catalog, the AI Connector Builder builds or extends a connector on the same engine that powers the pre-built ones, so coverage isn’t capped at what ships out of the box. Best for: teams shipping multi-user Agents SDK agents that act on systems of record — where per-user credentials and downstream audit decide the deployment.
2. Composio
Composio markets 1,000+ toolkits reachable via MCP or direct APIs, along with good SDKs, end-user authorization through a hosted Connect Link with per-user user_id isolation, and published pricing (free tier, then from $29/month). The Agents SDK hookup is first-class and documented: the composio_openai_agents provider (OpenAIAgentsProvider) wraps toolkits as native SDK tools, or you can point HostedMCPTool or MCPServerStreamableHttp at its hosted MCP servers. For a developer wiring the Agents SDK to many tools quickly, it’s a fast path. What we couldn’t find in its public docs as of June 9, 2026 is an org-level control plane (central policy enforcement and approval workflows), and audit detail is light, so the downstream-log criterion is yours to construct. Best for: developers building agent products who want toolkit breadth and SDK speed ahead of organizational governance.
3. Pipedream Connect MCP
Pipedream’s MCP is built for developers embedding integrations into their own AI products: end users connect accounts through managed auth isolated per external_user_id, across 3,000+ APIs, remote-hosted or self-hosted, with published usage-based pricing. Its documented OpenAI integration wires those per-user hosted MCP servers into the Responses API MCP tool — the same shape HostedMCPTool uses from the Agents SDK; we found no dedicated Agents SDK guide as of June 9, 2026, but any of the SDK’s remote paths can point at the same server URL. It pairs naturally with an Agents SDK product whose users bring their own apps. It’s a developer primitive, not an IT product — governance beyond logging isn’t detailed in the docs. Best for: developers embedding user-authorized integrations into their own AI product.
4. Arcade
Arcade offers four deployment models (cloud, your VPC, on-premises, or fully air-gapped) and integrates with your existing OAuth and IdP flows so multi-user agents act with user-specific permissions rather than service accounts. It documents an OpenAI Agents integration via the agents-arcade package, which loads Arcade tools directly into SDK agents. Its registry lists ~150 MCP servers, fewer systems than the larger catalogs here, and pricing is published (free Hobby tier, Growth at $25/month plus usage). Best for: teams building multi-user agents with hard infrastructure-control requirements and a contained set of target systems.
How to connect StackOne to the OpenAI Agents SDK
- Create a StackOne project and scope which connectors and actions it exposes via connector profiles — admins decide the tool surface before any agent sees it.
- End users link their accounts through StackOne’s OAuth 2.1 flow: SSO sign-in, co-branded consent, and an account picker to opt in the specific linked accounts the agent may act on. StackOne holds the downstream credentials per linked account; the per-user gateway token your backend passes as authorization is what this flow issues.
- Point the agent at the StackOne MCP URL: HostedMCPTool if you want the Responses API to run the calls, or MCPServerStreamableHttp if your process should own the connection (full code walkthrough in the StackOne Agents SDK guide):
```python
from agents import Agent, HostedMCPTool


def approve(request):  # MCPToolApprovalRequest
    return {"approve": True}  # or route to your human-approval queue


agent = Agent(
    name="ops-agent",
    tools=[HostedMCPTool(
        tool_config={
            "type": "mcp",
            "server_label": "stackone",
            "server_url": STACKONE_MCP_URL,
            "authorization": user_gateway_token,  # per-user token from the OAuth flow in step 2
            "require_approval": "always",
        },
        on_approval_request=approve,  # "always" needs an answer, or no tool ever runs
    )],
)
```
- Tool filtering is optional. The surface is already curated and admin-scoped, so allowed_tools stays short by construction; alternatively, run tool modes and expose just the search + execute meta-tools for constant context.
- Wire audit. OpenAI Traces show the agent’s side; StackOne request logs capture the provider-level calls underneath, exportable to Datadog or Grafana.
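The approve callback in the example returns True unconditionally; in practice it is where a routing decision lives. The sketch below is one hedged way to write it: the prefix convention and tool names are hypothetical, the Fake* dataclasses stand in for the SDK's MCPToolApprovalRequest so the snippet runs on its own, and the "reason" key reflects our understanding of the SDK's approval result shape.

```python
from dataclasses import dataclass


@dataclass
class FakeApprovalData:
    name: str  # tool the model wants to call


@dataclass
class FakeApprovalRequest:
    data: FakeApprovalData  # stand-in for the SDK's approval request object


# Hypothetical naming convention for tools that only read data.
READ_ONLY_PREFIXES = ("list_", "get_", "search_")


def approve(request) -> dict:
    """Auto-approve read-only tools; refuse writes until a human signs off."""
    tool_name = request.data.name
    if tool_name.startswith(READ_ONLY_PREFIXES):
        return {"approve": True}
    # In a real deployment this branch would enqueue the request for review.
    return {"approve": False, "reason": f"{tool_name} needs human sign-off"}


read_result = approve(FakeApprovalRequest(FakeApprovalData("get_employee")))
write_result = approve(FakeApprovalRequest(FakeApprovalData("update_employee")))
```

Deciding by tool name is deliberately crude; a production policy would more likely key on connector, action type, and the arguments in the request.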
When you don’t need a gateway for the Agents SDK
- One agent, one system, your own credentials. An official MCP server plus MCPServerStreamableHttp and a static token is simpler and free; a gateway adds a hop you don’t need yet.
- Everything you need is in the 8 first-party connectors. If the agent only reads Gmail, Drive, and SharePoint under a single user’s OAuth token, the hosted connectors plus allowed_tools cover it.
- You’re still proving the use case. Prototype against one server, watch the token counts in Traces, and graduate to gateway controls when user count makes credential handling real.
The trigger points: the first time you sketch a user_id → token table, the first security review asking what the agent changed downstream, and the first context-window bill that’s mostly tool definitions.
StackOne is the governed layer between AI agents and 310+ enterprise systems with 20,000+ agent-optimized actions — over MCP, A2A, API, and SDKs — with end-user OAuth linking, connectors you can extend, and built-in prompt-injection defense. See pricing or book a demo.
More: The Best MCP Gateways in 2026, Compared · The Best MCP Gateways for ChatGPT Enterprise · StackOne MCP platform · Salesforce MCP · Workday MCP · ServiceNow MCP