Romain Sestier · June 9, 2026 · 10 min

The Best MCP Gateways for the OpenAI Agents SDK in 2026

Last updated: June 2026. Every OpenAI capability below is drawn from OpenAI’s public documentation as of June 9, 2026, linked per claim. StackOne is one of the gateways compared; criteria are disclosed so you can check our work. This page covers the Agents SDK (Python and JS/TS) and the Responses API MCP tool; the ChatGPT side is covered in our ChatGPT Enterprise guide.

An MCP gateway gives agents built on the OpenAI Agents SDK one authenticated endpoint in front of all your business systems — with the per-user credentials, curated tool surface, and downstream audit the SDK leaves to your code. OpenAI’s plane governs the call: allowed_tools, approvals, traces. The gateway governs what’s behind it: connectors, tokens for every end user, logs of what actually changed. With Agent Builder winding down, the SDK is where OpenAI agents get built. Verdict: StackOne for multi-user agents on systems of record; Composio or Pipedream for developer products needing breadth; Arcade when infrastructure control comes first.

Where OpenAI’s agent stack stands after the Agent Builder deprecation

On June 3, 2026, OpenAI announced it is winding down AgentKit’s visual Agent Builder — shutdown on November 30, 2026 — and the Evals platform, which goes read-only on October 31, 2026 and shuts down on November 30, 2026; ChatKit continues (OpenAI deprecations, June 2026). The documented migration paths are the Agents SDK for code-first teams and ChatGPT Workspace Agents for no-code builders.

For anyone deciding where to invest, the signal is clear: the Agents SDK plus the Responses API MCP tool is the durable center of OpenAI’s agent stack. One related note: AgentKit’s Connector Registry — launched in beta in October 2025, gated on the Global Admin Console — has not been announced as GA, and its dedicated docs page is currently unavailable (checked June 9, 2026); check its status before building on it.

How the Agents SDK and Responses API connect to MCP servers

The MCP support is genuinely good — full in Python, near-parity in JS/TS, with the gaps running both ways (Python MCP guide; JS/TS MCP guide, June 2026):

Three connection paths. HostedMCPTool hands the connection to the Responses API — OpenAI’s infrastructure calls the remote server directly, configured with server_label, server_url, a connector_id for first-party connectors, authorization, require_approval, and defer_loading. Or your process owns the connection: MCPServerStreamableHttp for remote servers, MCPServerStdio for local subprocesses (MCPServerSse exists, but the transport is deprecated by the MCP project). Python adds MCPServerManager for multi-server lifecycles.

Tool governance is built in. The SDK offers static filtering via create_static_tool_filter (allow/block lists), dynamic filtering via a callable that receives a ToolFilterContext, per-tool approval policies ("always", "never", or per-tool maps) that pause the run for human sign-off, and cache_tools_list to avoid re-fetching tool lists on every run. Tracing captures list-tools and tool calls automatically, defaults to the OpenAI Traces dashboard, and supports 25+ external processors (tracing docs); for long-running agents, OpenAI documents Temporal, Restate, and DBOS integrations (running agents).
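To make the allow/block semantics concrete, here is a minimal standalone sketch. It does not call the SDK: filter_tool_names is a hypothetical helper, the tool names are invented, and the ordering (allow list first, then block list) is an assumption worth checking against the create_static_tool_filter documentation.

```python
from typing import Iterable, Optional


def filter_tool_names(
    tool_names: Iterable[str],
    allowed: Optional[set[str]] = None,
    blocked: Optional[set[str]] = None,
) -> list[str]:
    """Illustrative allow/block filter (not the SDK's implementation)."""
    kept = []
    for name in tool_names:
        # An allow list, when given, admits only the names it contains.
        if allowed is not None and name not in allowed:
            continue
        # A block list, when given, removes names even if they were allowed.
        if blocked is not None and name in blocked:
            continue
        kept.append(name)
    return kept


# Hypothetical tool names exposed by a remote server.
server_tools = ["list_employees", "get_employee", "delete_employee"]
visible = filter_tool_names(
    server_tools,
    allowed={"list_employees", "get_employee", "delete_employee"},
    blocked={"delete_employee"},
)
print(visible)  # ['list_employees', 'get_employee']
```

With neither list supplied the helper returns every tool unchanged, which mirrors the unfiltered default.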

The Responses API MCP tool is the hosted half. allowed_tools constrains which tools the model sees; require_approval produces approval-request items your code answers, and OpenAI’s guidance is that sensitive actions should “always require approval.” There’s no per-call fee — you pay tokens for tool definitions and calls. Eight first-party connectors ship (Dropbox, Gmail, Google Calendar, Google Drive, Microsoft Teams, Outlook Calendar, Outlook Email, SharePoint); everything else is a custom server_url, with explicit safety guidance: connect only to “official servers hosted by the service providers themselves,” and treat tool outputs as prompt-injection surface (connectors and MCP guide, June 2026).
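As a rough illustration of how allowed_tools and a per-tool approval map fit together, the sketch below builds the tool entry as a plain dictionary. build_mcp_tool, the server URL, and the tool names are hypothetical, and the never/always map with tool_names reflects our reading of OpenAI's MCP guide; confirm the exact shape there before relying on it.

```python
from typing import Any


def build_mcp_tool(
    server_label: str,
    server_url: str,
    read_tools: list[str],
    write_tools: list[str],
) -> dict[str, Any]:
    """Build an `mcp` tool entry that exposes only named tools and gates writes."""
    return {
        "type": "mcp",
        "server_label": server_label,
        "server_url": server_url,
        # The model only sees tools named here.
        "allowed_tools": read_tools + write_tools,
        # Assumed map shape: reads run unattended, writes pause for approval.
        "require_approval": {
            "never": {"tool_names": read_tools},
            "always": {"tool_names": write_tools},
        },
    }


tool = build_mcp_tool(
    "hris",
    "https://mcp.example.com/mcp",  # placeholder URL
    read_tools=["list_employees"],
    write_tools=["update_employee"],
)
print(tool["allowed_tools"])  # ['list_employees', 'update_employee']
```

Splitting tools into read and write groups up front keeps the approval policy and the visible surface derived from the same two lists, so they cannot drift apart.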

Auth is where the seams show. The authorization parameter carries an OAuth access token that OpenAI does not store — you re-send it on every request. The JS SDK accepts an authProvider (an OAuthClientProvider); Python has no first-class OAuth provider option — static headers only. And the documented connector example reads its token from an environment variable.
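Because the token is re-sent on every request, something in your backend has to map each user to a current token. A minimal in-memory sketch of that table follows; TokenStore and its methods are hypothetical names, and a real deployment would need encrypted storage and a refresh flow, neither of which is shown.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredToken:
    access_token: str
    expires_at: float  # Unix timestamp


class TokenStore:
    """In-memory user_id -> token table (illustrative only)."""

    def __init__(self) -> None:
        self._tokens: dict[str, StoredToken] = {}

    def put(self, user_id: str, access_token: str, ttl_seconds: float,
            now: Optional[float] = None) -> None:
        issued = time.time() if now is None else now
        self._tokens[user_id] = StoredToken(access_token, issued + ttl_seconds)

    def authorization_for(self, user_id: str, now: Optional[float] = None) -> str:
        """Return the value to place in `authorization` for this user's request."""
        current = time.time() if now is None else now
        token = self._tokens.get(user_id)
        if token is None:
            raise KeyError(f"no token stored for user {user_id!r}")
        if token.expires_at <= current:
            raise PermissionError(f"token for user {user_id!r} has expired")
        return token.access_token


store = TokenStore()
store.put("user-1", "tok-abc", ttl_seconds=60, now=1000.0)
print(store.authorization_for("user-1", now=1030.0))  # tok-abc
```

The explicit now parameter exists only to make expiry easy to exercise; production code would rely on the clock.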

Token economics are documented by OpenAI itself. Remote servers’ verbose tool definitions are injected into context, and the OpenAI cookbook works a real example: the same one-line request costs 945 total tokens on gpt-4.1 versus 38,022 on o4-mini, where the reasoning model re-processes the server’s full imported tool list (36,436 input tokens, most of them cached). The cookbook’s mitigations are allowed_tools to trim the list and previous_response_id to keep mcp_list_tools cached; a curated gateway surface supports both by keeping the tool list short and stable. The newer mitigation is defer_loading with tool search (Responses API, GPT-5.4 and newer).
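The arithmetic behind those cookbook figures is worth spelling out. The short calculation below uses only the three numbers quoted above; the variable names are ours.

```python
# Figures quoted from the OpenAI cookbook example cited above.
GPT_4_1_TOTAL = 945       # total tokens, gpt-4.1
O4_MINI_TOTAL = 38_022    # total tokens, o4-mini
O4_MINI_INPUT = 36_436    # input tokens within the o4-mini total

ratio = O4_MINI_TOTAL / GPT_4_1_TOTAL
input_share = O4_MINI_INPUT / O4_MINI_TOTAL

print(f"o4-mini used {ratio:.1f}x the tokens of gpt-4.1")          # 40.2x
print(f"input tokens were {input_share:.1%} of the o4-mini total")  # 95.8%
```

In other words, roughly a 40x difference for the same request, with almost all of the larger figure being input rather than generated output.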

What OpenAI’s native controls don’t cover

OpenAI’s plane governs the agent’s side of the call — which tools are visible, which need approval, what the trace shows. Four jobs stay on yours:

Per-end-user credentials. This is the gap. As of June 9, 2026, neither the connectors/MCP guide nor the Agents SDK docs document a multi-user credential pattern — the SDK gives you nowhere to put 500 users’ tokens (see the auth seams above).
The connectors themselves. Eight first-party connectors, all collaboration-suite-shaped. Workday, Salesforce, ServiceNow and the rest of the systems of record are custom server_url territory — someone builds, hosts, and maintains those servers, under guidance that says to connect only to official ones.
Audit beyond Traces. Tracing records that the agent listed tools and called one, with inputs and outputs. It does not record the downstream provider requests the call produced — what actually changed in the system of record, under whose credentials.
Injection scanning on tool outputs. The Agents SDK ships input/output and tool-level guardrails with tripwires (guardrails docs), and the separate open-source OpenAI Guardrails framework includes LLM-based prompt-injection detection on tool calls and outputs — but you wire it yourself; the MCP guide’s own mitigations are procedural, and nothing scans hosted MCP tool outputs automatically.

The gateway’s role is exactly this layer. It doesn’t replace allowed_tools or approvals — it makes them workable once the agent has real users and real systems behind it.

What to look for in an MCP gateway for the Agents SDK

Criterion	Why it matters for the Agents SDK specifically
Remote streamable HTTP + OAuth that fits `authorization` pass-through	A remote streamable HTTP endpoint whose token drops into `authorization`; OpenAI never stores that token, so your code re-sends it on every request
Per-end-user credentials	The SDK gives you nowhere to put 500 users’ tokens; the gateway should manage downstream credentials per user so your code sends one gateway token
Curated tool surface that keeps `allowed_tools` short and stable	OpenAI’s cookbook shows the same request costing 945 total tokens on gpt-4.1 versus 38,022 on o4-mini, with the imported tool list driving the difference; `allowed_tools` trims that overhead, but the savings only hold if the list doesn’t churn or balloon
Audit beyond Traces	Traces show the agent’s side; gateway request logs should capture what happened in the downstream system, exportable to Datadog or Grafana
Injection scanning on tool responses	OpenAI’s MCP mitigations are procedural; the gateway can scan responses before they reach the model
Depth on systems of record	The 8 first-party connectors stop at collaboration suites; the gateway’s catalog is your write capability on HRIS, CRM, ITSM, ERP

The best MCP gateways for the OpenAI Agents SDK, compared

Same evidence rules as our full MCP gateway comparison: capability facts from public documentation, no performance claims, StackOne disclosed. All four are managed gateways — the native baseline (first-party connectors plus allowed_tools and approvals) already gives you call-level governance; what’s missing is the layer behind it:

Platform	Remote MCP + OAuth	Per-user credentials	Tool surface	Audit beyond Traces	Catalog	Pricing
StackOne	Yes — managed remote MCP, OAuth 2.1 end-user flow	Per-user linked accounts via OAuth 2.1 end-user flow	Curated actions; two meta-tools (search + execute) keep context constant	Request logs to provider level, Datadog/Grafana export	450+ connectors, 27,000+ agent-optimized actions	Free plan (full catalog)
Composio	Yes — hosted MCP servers	Yes (Connect Link per `user_id`)	Toolkit-level selection	Observability; audit detail light	~1,000 toolkits	Free tier; from $29/mo
Pipedream Connect MCP	Yes — remote or self-hosted	Yes (per `external_user_id`)	Developer-managed tool selection	Logging; governance not detailed	3,000+ APIs	Usage-based; free tier
Arcade	Yes — cloud, VPC, on-prem, air-gapped	End-user OAuth via your IdP	Developer-selected toolkits	Lifecycle governance	~150 servers in registry	Free tier; from $25/mo

1. StackOne

StackOne is the enterprise layer for AI agents to safely act on any application, and it meets the criteria like this. The server side is a managed remote MCP endpoint that drops straight into HostedMCPTool or MCPServerStreamableHttp, with an OAuth 2.1 flow where the end user authorizes the client themselves (criterion 1). Per-user credentials are the core of the design: each user links accounts once through SSO and a consent screen, StackOne holds the downstream credentials per linked account, and your code sends one gateway token per user — the authorization field has something safe to carry (criterion 2). The tool surface is built for the token math: tools aren’t direct wrappers over API endpoints but curated, context-optimized actions, and at scale agents get two meta-tools, search and execute, so context stays constant at any catalog size (a 460× reduction versus loading every definition) and allowed_tools stays short and stable (criterion 3). Request logs capture every call down to the underlying provider requests, exportable to Datadog or Grafana — the downstream half of OpenAI’s Traces (criterion 4). StackOne Defender scans tool responses for prompt injection before they reach the agent (89.0% detection accuracy in our published evaluation) (criterion 5). Depth is verifiable per system: Salesforce has 380 actions, Jira 147 actions, Workday 128 actions (criterion 6). Limitation: the catalog focuses on business systems, not consumer applications — for the consumer-app long tail, Zapier’s catalog is far bigger. When a system isn’t in the catalog, the AI Connector Builder builds or extends a connector on the same engine that powers the pre-built ones, so coverage isn’t capped at what ships out of the box. Best for: teams shipping multi-user Agents SDK agents that act on systems of record — where per-user credentials and downstream audit decide the deployment.

2. Composio

Composio markets 1,000+ toolkits reachable via MCP or direct APIs, with good SDKs, end-user authorization through a hosted Connect Link isolated per user_id, and published pricing (free tier, then from $29/month). The Agents SDK hookup is first-class and documented: the composio_openai_agents provider (OpenAIAgentsProvider) wraps toolkits as native SDK tools, or you can point HostedMCPTool or MCPServerStreamableHttp at its hosted MCP servers. For a developer wiring the Agents SDK to many tools quickly, it’s a fast path. What we couldn’t find in its public docs as of June 9, 2026 is an org-level control plane (central policy enforcement and approval workflows), and audit detail is light, so the downstream-log criterion is yours to construct. Best for: developers building agent products who want toolkit breadth and SDK speed ahead of organizational governance.

3. Pipedream Connect MCP

Pipedream’s MCP is built for developers embedding integrations into their own AI products: end users connect accounts through managed auth isolated per external_user_id, across 3,000+ APIs, remote-hosted or self-hosted, with published usage-based pricing. Its documented OpenAI integration wires those per-user hosted MCP servers into the Responses API MCP tool — the same shape HostedMCPTool uses from the Agents SDK; we found no dedicated Agents SDK guide as of June 9, 2026, but any of the SDK’s remote paths can point at the same server URL. It pairs naturally with an Agents SDK product whose users bring their own apps. It’s a developer primitive, not an IT product — governance beyond logging isn’t detailed in the docs. Best for: developers embedding user-authorized integrations into their own AI product.

4. Arcade

Arcade offers four deployment models (cloud, your VPC, on-premises, or fully air-gapped) and integrates with your existing OAuth and IdP flows so multi-user agents act with user-specific permissions rather than service accounts. It documents an OpenAI Agents integration via the agents-arcade package, which loads Arcade tools directly into SDK agents. Its registry lists ~150 MCP servers, fewer systems than the larger catalogs here, and pricing is published (free Hobby tier, Growth at $25/month plus usage). Best for: teams building multi-user agents with hard infrastructure-control requirements and a contained set of target systems.

How to connect StackOne to the OpenAI Agents SDK

Create a StackOne project and scope which connectors and actions it exposes via connector profiles — admins decide the tool surface before any agent sees it.
End users link their accounts through StackOne’s OAuth 2.1 flow: SSO sign-in, co-branded consent, and an account picker to opt in the specific linked accounts the agent may act on. StackOne holds the downstream credentials per linked account; the per-user gateway token your backend passes as authorization is what this flow issues.
Point the agent at the StackOne MCP URL — HostedMCPTool if you want the Responses API to run the calls, or MCPServerStreamableHttp if your process should own the connection (full code walkthrough in the StackOne Agents SDK guide):

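import os

from agents import (
    Agent,
    HostedMCPTool,
    MCPToolApprovalFunctionResult,
    MCPToolApprovalRequest,
)

# Placeholder sources (hypothetical env var names): your project's MCP URL and
# the current user's gateway token. In production, look the token up per user.
STACKONE_MCP_URL = os.environ["STACKONE_MCP_URL"]
user_gateway_token = os.environ["STACKONE_USER_TOKEN"]

def approve(request: MCPToolApprovalRequest) -> MCPToolApprovalFunctionResult:
    # Auto-approves every call; route to your human-approval queue instead.
    return {"approve": True}

agent = Agent(
    name="ops-agent",
    tools=[HostedMCPTool(
        tool_config={
            "type": "mcp",
            "server_label": "stackone",
            "server_url": STACKONE_MCP_URL,
            "authorization": user_gateway_token,  # per-user token from the OAuth flow in step 2
            "require_approval": "always",
        },
        on_approval_request=approve,  # "always" needs an answer, or no tool ever runs
    )],
)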

Tool filtering is optional. The surface is already curated and admin-scoped, so allowed_tools stays short by construction — or run tool modes and expose just the search + execute meta-tools for constant context.
Wire audit. OpenAI Traces show the agent’s side; StackOne request logs capture the provider-level calls underneath, exportable to Datadog or Grafana.
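For the other option named in the connection step, where your process owns the connection, a sketch with MCPServerStreamableHttp might look like the following. The helper names are hypothetical, and sending the gateway token as a Bearer Authorization header is an assumption about the gateway's auth scheme; check the StackOne Agents SDK guide for the supported form.

```python
from typing import Any


def streamable_http_params(server_url: str, gateway_token: str) -> dict[str, Any]:
    """Connection params for MCPServerStreamableHttp.

    The Bearer header is an assumption about how the gateway expects the token.
    """
    return {
        "url": server_url,
        "headers": {"Authorization": f"Bearer {gateway_token}"},
    }


async def run_for_user(server_url: str, gateway_token: str, prompt: str) -> str:
    # Imported here so the helper above stays usable without the SDK installed.
    from agents import Agent, Runner
    from agents.mcp import MCPServerStreamableHttp

    async with MCPServerStreamableHttp(
        name="stackone",
        params=streamable_http_params(server_url, gateway_token),
        cache_tools_list=True,  # avoid re-fetching the tool list on every run
    ) as server:
        agent = Agent(name="ops-agent", mcp_servers=[server])
        result = await Runner.run(agent, prompt)
        return str(result.final_output)
```

Invoke it per request with asyncio.run(run_for_user(url, token, "your prompt")), passing the token that belongs to the user making the request.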

When you don’t need a gateway for the Agents SDK

One agent, one system, your own credentials. An official MCP server plus MCPServerStreamableHttp and a static token is simpler and free — a gateway adds a hop you don’t need yet.
Everything you need is in the 8 first-party connectors. If the agent only reads Gmail, Drive, and SharePoint under a single user’s OAuth token, the hosted connectors plus allowed_tools cover it.
You’re still proving the use case. Prototype against one server, watch the token counts in Traces, and graduate to gateway controls when user count makes credential handling real.

The trigger points: the first time you sketch a user_id → token table, the first security review asking what the agent changed downstream, and the first context-window bill that’s mostly tool definitions.

StackOne is the governed layer between AI agents and 450+ enterprise systems with 27,000+ agent-optimized actions — over MCP, A2A, API, and SDKs — with end-user OAuth linking, connectors you can extend, and built-in prompt-injection defense. See pricing or book a demo.

More: The Best MCP Gateways in 2026, Compared · The Best MCP Gateways for ChatGPT Enterprise · StackOne MCP platform · Salesforce MCP · Workday MCP · ServiceNow MCP

Frequently Asked Questions

Does the OpenAI Agents SDK support MCP?

Yes — fully in Python, near-parity in JS/TS, with the gaps running both ways. Three documented connection paths: HostedMCPTool (the Responses API calls the remote server for you), MCPServerStreamableHttp (your process owns a streamable HTTP connection), and MCPServerStdio (local subprocess). Static and dynamic tool filtering, per-tool approval policies, tool-list caching, and tracing are built in. What the SDK doesn't cover is per-end-user credentials — that's the gap a gateway fills.

Do I need an MCP gateway for the OpenAI Agents SDK?

Not to get started — the SDK connects directly to any remote MCP server via HostedMCPTool or MCPServerStreamableHttp, with tool filtering and per-tool approvals built in. You need a gateway when the agent serves many users acting on many systems: as of June 2026 OpenAI documents no pattern for per-end-user credentials (the connector examples use an env-var OAuth token), token storage and refresh are your code, verbose tool definitions inflate every request, and Traces show the agent's side of each call but not what happened downstream. A gateway collapses that into one authenticated endpoint with credentials, curation, and audit behind it.

How do I connect the OpenAI Agents SDK to an MCP server?

Three documented paths: HostedMCPTool, where the Responses API calls the remote server directly (configured with server_url, authorization, allowed-tool filters, and require_approval); MCPServerStreamableHttp, where your process owns a streamable HTTP connection to a remote server; and MCPServerStdio for local subprocesses. (MCPServerSse exists too, but the transport is deprecated by the MCP project.) Both Python and JS/TS SDKs support static and dynamic tool filtering, per-tool approval policies, and tool-list caching, and tracing captures list-tools and tool calls automatically.

What happens to AgentKit's Agent Builder?

OpenAI announced on June 3, 2026 that it is winding down AgentKit's visual Agent Builder (shutdown November 30, 2026) and the Evals platform (read-only October 31, 2026, full shutdown November 30, 2026); ChatKit continues. The documented migration paths are the Agents SDK for code-first agents and ChatGPT Workspace Agents for no-code builders. If your agents touch MCP, the Agents SDK plus the Responses API MCP tool is the durable surface to build on.

How do I handle per-user credentials for MCP in the Agents SDK?

As of June 2026, OpenAI's documentation doesn't cover it: the authorization parameter carries an OAuth access token that OpenAI does not store, so you re-send it on every request — and the documented connector example reads it from an environment variable. Token storage, refresh, and per-user scoping are left to your code. The gateway pattern solves this: end users authorize once through the gateway's OAuth flow, the gateway manages the downstream credentials per user, and your code sends one gateway token per user in the authorization field.

Why do remote MCP servers use so many tokens in the Responses API?

Because tool definitions are injected into model context, and remote servers' definitions are often verbose. OpenAI's own cookbook shows the same one-line request costing 945 total tokens on gpt-4.1 versus 38,022 on o4-mini, where the reasoning model re-processes the server's full imported tool list — and its recommended mitigations are allowed_tools to trim the list and previous_response_id to keep the cached tool list warm. Newer mitigations include defer_loading with tool search (Responses API, GPT-5.4 and newer, per OpenAI's tool search guide). A gateway with a curated tool surface — or a constant-size search-and-execute pair of meta-tools — keeps allowed_tools short and stable so the overhead stays small as you add systems.