
MCP and the Standardisation of Agentic AI: What Enterprise Teams Should Build Around in 2026

Anthropic's Model Context Protocol, OpenAI's Responses API, and Google's A2A have quietly converged on how LLMs talk to tools. A practitioner's guide to the stack that is emerging – and what to commit to.

20 April 2026 · 12 min read · By the DataX Power team


The problem the industry finally stopped ignoring

For the first two years of the LLM boom, every foundation-model provider shipped its own incompatible way to wire models into tools. OpenAI had function calling. Anthropic had a different tool-use schema. Google had another. Vertex, Bedrock, and Azure all wrapped those primitives in subtly different SDKs. The result was predictable: enterprise teams ended up with agent frameworks glued together by bespoke adapters, and every model migration was a rewrite.

That era is closing. Three efforts – Anthropic's Model Context Protocol (MCP), OpenAI's Responses API and Agents SDK, and Google's Agent Development Kit with its Agent-to-Agent (A2A) protocol – have converged on a common mental model: a model talks to tools through a well-defined, transport-agnostic protocol, and tools can live anywhere. Only MCP and A2A are protocol specifications in the strict sense; the rest are APIs and frameworks. They are not identical, but they are interoperable in ways that matter, and the community tooling is treating them as complementary rather than competing.

What MCP actually is

Anthropic published the Model Context Protocol as an open specification in November 2024. MCP defines three primitives an LLM application can consume – tools (actions the model can invoke), resources (data the model can read), and prompts (reusable templates) – and exchanges them as JSON-RPC 2.0 messages. An MCP client (typically embedded in the LLM application) connects to one or more MCP servers (which expose the primitives) over stdio or HTTP, with server-sent events available for streaming responses.
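To make the message shapes concrete, here is a minimal sketch of a dispatcher for the two tool-related methods, `tools/list` and `tools/call`. It uses only the standard library and is not a conforming MCP server: it skips the initialisation handshake, capability negotiation, and the transport itself. The `lookup_ticket` tool, its schema, and its canned reply are hypothetical.

```python
import json

# Hypothetical catalogue with one read-only tool; the name, schema and
# handler are illustrative and not taken from any real server.
TOOLS = {
    "lookup_ticket": {
        "description": "Return the status of a ticket by ID.",
        "inputSchema": {
            "type": "object",
            "properties": {"ticket_id": {"type": "string"}},
            "required": ["ticket_id"],
        },
        "handler": lambda args: f"Ticket {args['ticket_id']} is open",
    }
}


def _reply(req_id, *, result=None, error=None):
    """Serialise a JSON-RPC 2.0 response carrying either a result or an error."""
    body = {"jsonrpc": "2.0", "id": req_id}
    if error is not None:
        body["error"] = error
    else:
        body["result"] = result
    return json.dumps(body)


def handle(raw: str) -> str:
    """Dispatch one JSON-RPC request string and return the response string."""
    req = json.loads(raw)
    req_id, method = req.get("id"), req.get("method")
    params = req.get("params") or {}

    if method == "tools/list":
        # Advertise each tool's name, description and JSON Schema for its input.
        tools = [
            {"name": name, "description": t["description"], "inputSchema": t["inputSchema"]}
            for name, t in TOOLS.items()
        ]
        return _reply(req_id, result={"tools": tools})

    if method == "tools/call":
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return _reply(req_id, error={"code": -32602, "message": "Unknown tool"})
        text = tool["handler"](params.get("arguments") or {})
        return _reply(req_id, result={"content": [{"type": "text", "text": text}], "isError": False})

    return _reply(req_id, error={"code": -32601, "message": "Method not found"})


if __name__ == "__main__":
    call = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
            "params": {"name": "lookup_ticket", "arguments": {"ticket_id": "T-1"}}}
    print(handle(json.dumps(call)))
```

In practice a team would reach for an official MCP SDK rather than hand-rolling this; the sketch is only meant to show how little is on the wire.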

The significant design choice is that MCP servers are independent from the model. A server that exposes, say, a Jira tool or a Postgres resource can be connected to Claude, ChatGPT, Gemini, or a locally-hosted Llama – any client that speaks the protocol. By Q1 2026 the public registry lists hundreds of MCP servers covering databases, filesystems, SaaS APIs, observability tools, version control, and internal enterprise systems. The upshot: the tool surface becomes a portable asset rather than a model-specific integration.
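One way to see why the tool definition is portable: the mapping from an MCP-style tool description to a provider's native tool schema is mechanical, and it is the client's job rather than the tool author's. The sketch below assumes the field names as we understand them from the providers' public documentation at the time of writing (`input_schema` for Anthropic's tool-use format, a nested `function` object with `parameters` for OpenAI's Chat Completions format); verify them against current docs before relying on them. The `search_issues` tool is hypothetical.

```python
from typing import Any

# A tool as an MCP server would advertise it (hypothetical example tool).
MCP_TOOL: dict[str, Any] = {
    "name": "search_issues",
    "description": "Search the issue tracker by free-text query.",
    "inputSchema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}


def to_anthropic_tool(tool: dict[str, Any]) -> dict[str, Any]:
    """Map an MCP tool description to Anthropic's tool-use shape (assumed field names)."""
    return {
        "name": tool["name"],
        "description": tool.get("description", ""),
        "input_schema": tool["inputSchema"],
    }


def to_openai_chat_tool(tool: dict[str, Any]) -> dict[str, Any]:
    """Map an MCP tool description to OpenAI's Chat Completions tool shape (assumed field names)."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "parameters": tool["inputSchema"],
        },
    }


if __name__ == "__main__":
    print(to_anthropic_tool(MCP_TOOL))
    print(to_openai_chat_tool(MCP_TOOL))
```

The JSON Schema for the arguments passes through untouched in both cases, which is the part that took real effort to write.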

What OpenAI's Responses API and Agents SDK add

OpenAI shipped the Responses API in March 2025 alongside an open-source Agents SDK, and has since added MCP support as a first-class option. The Responses API is a higher-level abstraction than the older Chat Completions endpoint – it handles multi-turn tool loops, built-in tools (web search, file search, computer use), and reasoning-state continuity across calls, which removes the boilerplate most agent frameworks were reinventing.

The Agents SDK layers a Python / TypeScript framework on top, with guardrails, handoff logic between specialised agents, tracing, and evaluation primitives. For teams building on OpenAI, it collapses what used to be LangChain or LlamaIndex-heavy application code into a much thinner runtime. Importantly, it treats MCP servers as a first-class tool source, which means an MCP server written for Claude lands unchanged in a GPT-based agent.

Google's A2A and the two-tier picture

Google released its Agent Development Kit (ADK) in 2025 alongside the Agent-to-Agent (A2A) protocol. Where MCP standardises how a single model reaches tools, A2A standardises how agents reach other agents across organisational boundaries – capability discovery, task delegation, result streaming, authentication. The design explicitly assumes a future where enterprises publish specialised agents (procurement, scheduling, research) that other agents can invoke.
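The discover-then-delegate shape can be sketched in a few lines. This is a deliberately generic illustration of the pattern, not the A2A wire format or schema: the `AgentDescriptor` fields, the `example.internal` endpoints, and the `send` callable are all hypothetical stand-ins, and authentication and streaming are left out.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AgentDescriptor:
    """What an agent publishes about itself (illustrative fields only)."""
    name: str
    endpoint: str
    skills: frozenset


def discover(directory: list, skill: str) -> list:
    """Capability discovery: return every published agent that advertises the skill."""
    return [agent for agent in directory if skill in agent.skills]


def delegate(directory: list, skill: str, task: dict, send: Callable[[str, dict], dict]) -> dict:
    """Task delegation: hand the task to the first agent offering the skill."""
    candidates = discover(directory, skill)
    if not candidates:
        raise LookupError(f"no agent advertises skill {skill!r}")
    return send(candidates[0].endpoint, task)


# Hypothetical directory of agents published by other teams.
DIRECTORY = [
    AgentDescriptor("procurement", "https://agents.example.internal/procurement", frozenset({"raise_po"})),
    AgentDescriptor("scheduling", "https://agents.example.internal/scheduling", frozenset({"book_room"})),
]

if __name__ == "__main__":
    # A fake transport that echoes where the task would have been sent.
    fake_send = lambda endpoint, task: {"sent_to": endpoint, "task": task}
    print(delegate(DIRECTORY, "book_room", {"when": "tomorrow 10:00"}, fake_send))
```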

Treating these pieces as a stack, not competitors, clarifies the picture: MCP is the tool-access layer, A2A is the inter-agent layer, and the Responses API with the Agents SDK, or ADK, is the runtime. Most enterprise conversations we're having in Q2 2026 are no longer "which agent framework should we pick?" They're "which parts of this stack are we committing to, and which do we keep pluggable?"

What enterprise teams should build around

The architectural guidance we give clients right now assumes the protocol layer is stable and the runtime layer is not. That translates to a handful of concrete moves:

Build tools as MCP servers, not framework plugins. Even if you're committed to a single provider today, writing tools against MCP buys you portability for the next migration – and in our experience the migration happens within 18 months.
Keep the agent runtime thin. Whether you use the OpenAI Agents SDK, Anthropic's upcoming client libraries, Google ADK, or an internal wrapper, expect to replace it within 24 months. Concentrate business logic in the tools; treat the runtime as replaceable.
Treat your agent's tool catalogue as an API surface. Version it, document it, test it, rate-limit it. This is the durable asset – the model behind it is not.
Separate the prompt / instruction layer from the tool layer. When you swap models, prompts almost always need tuning; tool definitions should not.
For multi-team deployments, pilot A2A for cross-team workflows even if you keep MCP for same-team tool-use. The cross-team shape (capability discovery, auth, delegation) is the hard part, and A2A's abstractions have become a reasonable default.

Security is where most deployments go wrong

An MCP server is code. Specifically, it is code that an LLM can invoke with arguments derived from (possibly adversarial) user input. The security model has to account for three failure modes that naive deployments routinely miss: prompt injection reaching a privileged tool, an over-scoped tool performing irreversible actions without confirmation, and supply-chain risk from pulling community MCP servers into production.

Practical controls that should be baseline, not optional: run MCP servers in sandboxed environments with least-privilege credentials; treat any write or delete operation as a separate, explicitly-confirmed tool rather than rolling it into a generic "update" tool; maintain an allowlist of approved MCP servers with signed versions; log every tool invocation with model input, resolved arguments, and result; and rate-limit per-tool per-user. The OWASP Top 10 for LLM Applications (2025 edition) codifies several of these under excessive-agency and supply-chain-vulnerability categories – it is the single most useful checklist to run an agent system against before shipping.
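Several of these controls fit in one small wrapper around tool invocation. The sketch below is a starting point under our own assumptions, not a hardened implementation: the allowlist is a plain set of tool names standing in for a signed-server allowlist, confirmation is a boolean the caller must set after asking the user, and the rate limit is an in-memory sliding window. Sandboxing and credential scoping happen outside this code. The class and tool names are hypothetical.

```python
import json
import logging
import time
from collections import defaultdict, deque

log = logging.getLogger("tool_audit")


class ToolGuard:
    """Allowlist, confirmation gate, per-user/per-tool rate limit, and audit log."""

    def __init__(self, allowlist, destructive, max_calls, window_s, clock=time.monotonic):
        self.allowlist = set(allowlist)
        self.destructive = set(destructive)   # tools that write or delete
        self.max_calls = max_calls
        self.window_s = window_s
        self.clock = clock                    # injectable so tests can control time
        self._calls = defaultdict(deque)      # (user, tool) -> recent call timestamps

    def invoke(self, user, tool, args, handler, confirmed=False):
        if tool not in self.allowlist:
            raise PermissionError(f"{tool} is not on the allowlist")
        if tool in self.destructive and not confirmed:
            raise PermissionError(f"{tool} requires explicit confirmation")

        # Sliding-window rate limit: drop timestamps that have aged out.
        now = self.clock()
        recent = self._calls[(user, tool)]
        while recent and now - recent[0] >= self.window_s:
            recent.popleft()
        if len(recent) >= self.max_calls:
            raise RuntimeError(f"rate limit exceeded for {user}/{tool}")
        recent.append(now)

        result = handler(**args)
        # Audit record with resolved arguments and result; a production log
        # would also carry the model input that led to the call.
        log.info(json.dumps({"user": user, "tool": tool, "args": args, "result": result}, default=str))
        return result


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    guard = ToolGuard({"read_ticket", "delete_ticket"}, {"delete_ticket"}, max_calls=5, window_s=60)
    print(guard.invoke("alice", "read_ticket", {"ticket_id": "T-1"}, lambda ticket_id: f"{ticket_id}: open"))
```

Keeping `delete_ticket` as its own tool, rather than a mode of a generic update tool, is what makes the confirmation gate enforceable by name.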

What to pilot in the next two quarters

For teams that haven't yet committed, the 90-day plan we're recommending is small but concrete. Stand up an internal MCP server exposing one high-value but low-risk tool – usually a read-only wrapper over an internal knowledge source or a ticketing system. Drive it from a single agent runtime end-to-end, including evaluation and tracing. Then port the same tool to a second runtime (e.g., MCP-over-Responses, MCP-over-Claude, MCP-over-ADK) to prove portability is real for your stack.

The point of the pilot is not the feature. It is to get your team's muscle memory around the new primitives – tool versioning, trace analysis, evaluation against a regression set – before the pressure of a customer-facing deployment. The teams we see succeed with agentic AI in 2026 are the ones that treated the first twelve months as infrastructure work, not product work.
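A regression set for the portability check can start very small. The sketch below assumes each runtime can be wrapped in a callable that takes the tool arguments and returns the tool's result; `runtime_a` and `runtime_b` are local stand-ins for those wrappers, and the ticket data is made up. It checks the tool's output only, not the model's behaviour around it.

```python
from typing import Callable

# Hypothetical regression cases: (tool arguments, expected result).
REGRESSION_SET = [
    ({"ticket_id": "T-1"}, "open"),
    ({"ticket_id": "T-2"}, "closed"),
]


def run_regression(call_tool: Callable[[dict], str], cases: list) -> list:
    """Run every case through one runtime and return the failures (empty means pass)."""
    failures = []
    for args, expected in cases:
        try:
            actual = call_tool(args)
        except Exception as exc:  # a crash counts as a failure, not an abort
            actual = f"error: {exc!r}"
        if actual != expected:
            failures.append({"args": args, "expected": expected, "actual": actual})
    return failures


# Stand-ins for the same tool driven through two different agent runtimes.
_STATUS = {"T-1": "open", "T-2": "closed"}


def runtime_a(args: dict) -> str:
    return _STATUS[args["ticket_id"]]


def runtime_b(args: dict) -> str:
    return _STATUS.get(args["ticket_id"], "unknown")


if __name__ == "__main__":
    for name, runtime in [("runtime_a", runtime_a), ("runtime_b", runtime_b)]:
        failures = run_regression(runtime, REGRESSION_SET)
        print(name, "failures:", len(failures))
```

Running the same cases through both wrappers is the cheapest available evidence that the tool, and not just the demo, survived the port.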
