Latenode

MCP Tools: How They Work, Where They Break, and Why

MCP tools are the action primitive in Model Context Protocol. Here's how discovery works, why descriptions break agents, and what tool poisoning actually looks like.

20 min read
cover.png

If you found this by searching "what are MCP tools" or "why is my MCP agent calling the wrong thing," you're in exactly the right place.

There's a lot of confusion about what MCP tools actually are inside the Model Context Protocol. Not confusion about whether they're useful - everyone seems to agree they are - but confusion about what they do versus what resources do, why descriptions matter so much, and where things break quietly in production. Most teams I hear from blame the model when the agent picks the wrong tool or fails silently. The model is rarely the problem.

The real issue is almost always the same: poorly written tool descriptions, blurry boundaries between primitives, and error handling that swallows failures instead of surfacing them. I've seen this pattern enough times that I wanted to write it down properly.

What breaks before the code does

  • MCP tools are the action primitive - they execute logic; resources just expose data
  • Discovery happens at runtime via list_tools, not through hardcoded integration
  • Bad descriptions are the #1 reason AI agents pick the wrong tool or fail silently
  • Tool poisoning is a real attack surface hiding in what most teams treat as documentation mcp_three_primitives_diagram

What MCP Tools Are Inside the Model Context Protocol

The Model Context Protocol (MCP) is an open standard, developed by Anthropic, for creating secure two-way connections between AI applications and external systems. Its purpose is to give AI models a consistent, protocol-level interface to the outside world rather than requiring a custom integration for every tool, database, and SaaS product an agent might need to touch. The NSA published specific security design guidance for MCP in early 2026, which is one of those signals that tells you a technology has moved from experimental to genuinely operational at scale.

Inside MCP, there are three server primitives: tools, resources, and prompts. Each does a different job. Tools are the executable primitive. They perform computations, trigger actions, call external APIs, run scripts, write data. Resources expose context and data for the model to read - think of them as the readable layer. Prompts are reusable instruction templates that can be injected into conversations. Together they form a complete surface for AI-to-system interaction. But they are not interchangeable, and conflating them is where things start going wrong.

MCP tools sit at the intersection of what AI agents can do and what real systems can tolerate. According to Celigo's technical breakdown of the protocol, tools are exposed over two standardized endpoints: tools/list for discovery and tools/call for invocation. Any compliant client, any compliant model, any compliant host can use those endpoints. That standardization is the point.

How MCP Tools Differ from Resources and Prompts

The distinction I keep explaining in support: tools perform actions, resources provide data, prompts template instructions. They're not interchangeable and the boundary matters exactly as much as you'd expect when the whole system is built around which primitive controls what.

Resources and tools are the pair people conflate most often. The misconception usually looks like this: a team builds a "tool" that retrieves a customer record from a database. It's wired correctly, it returns data, it works. But it doesn't actually do anything actionable - no update, no write, no downstream trigger. It's functioning as a resource dressed as a tool, which means the model has no guarantee it can request it at the right moment for the right reason.

Tools perform actions on external data and systems. Resources expose that data and make it available as context. A prompt is the template that tells the model how to use both. The clean mental model: resources answer "what do you know?", tools answer "what can you do?", and prompts answer "how should you think about it?"

Available resources tell the model what context exists. The tool is what the model calls when it decides to act on that context. Missing that boundary produces workflows that look correct on paper and do nothing useful in production.

Why the MCP Server Exposes Tools as Callable Functions

An MCP server publishes tools as named, schema-backed callable units. Each tool has a name, a description, and a JSON parameter schema defining what inputs it accepts and what it returns. Any compliant MCP client can query the server, receive the full list of tools, and invoke any of them without prior hardcoding. There's no bespoke integration required - just the open protocol.

In practice, a single MCP server might wrap Python functions, external API calls, file processing operations, image handling, database queries, or integration flows. The schema is what makes this work at a protocol level: the client doesn't need to know that one tool calls a Python function and another wraps a REST endpoint. It just needs the name, the description, and the input spec.

This is why MCP tools look superficially like functions but behave more like a published service contract. When you invoke a tool on an MCP server, you're not calling a local function - you're executing a defined capability against whatever the server wraps behind it. That indirection is intentional, and it's what makes the ecosystem interoperable.

How MCP Tool Discovery Actually Works at Runtime

Here's the part that separates MCP from conventional API integration: discovery happens at runtime, not at build time.

When an MCP-compatible client connects to an MCP server, the first thing it does is call tools/list - a capability request that returns every tool the server currently exposes, along with each tool's name, description, and parameter schema. The client doesn't know in advance what's available. It asks. The server answers. Then the client, or the model reasoning through it, decides what to call.

That's a fundamentally different architecture from static integrations where you hardcode an API endpoint, define the request format ahead of time, and ship. In a hardcoded integration, adding a new capability means updating the integration code. In an MCP setup, a server can expose a new tool and any connected client discovers it automatically on the next query. No deployment required on the client side.

This design is what makes MCP tools usable by agentic systems, not just human-triggered scripts. An agent that can discover available tools at runtime can reason about what's possible before deciding what to do. It can handle situations that weren't anticipated at build time, because the available tools surface from the server's current state rather than from whatever was hardcoded months ago. Digital Applied's 2026 ecosystem map places MCP at the center of agentic architecture precisely because of this dynamic discovery capability.

It's also worth knowing: the November 2025 MCP revision added support for parallel tool calls, meaning an agent can invoke multiple tools concurrently rather than sequentially. For multi-system workflows - say, an agent pulling ERP status and CRM history simultaneously - that's a meaningful performance difference.

📊 In practice:
An AI agent using list_tools at runtime can adapt to a server's current capabilities without redeployment. A hardcoded API wrapper cannot. That gap is the entire reason agentic systems need MCP rather than conventional integration patterns - the agent's decision space depends on what's currently available, not what was available when someone last updated the code.

What an LLM Does with a Tool List Before Calling Anything

Large language models don't just receive a tool list and start calling things. They read it first.

When an LLM client receives the results of a tools/list query, it processes each tool's name, description, and parameter schema as part of its reasoning context. It's using that information to decide which tool is the right one to invoke for the current task, what params to pass, and in what order to call things if multiple tools are needed.

This is where description quality stops being a documentation concern and starts being a performance variable. The model has no other source of ground truth about what a tool does. It can't inspect the code behind it. It can't test-run it. It reads the description and the schema. If those two things are ambiguous, vague, or inconsistent with each other, the model makes a worse tool call. Not because the model is broken, but because it's working with bad input.

Natural language is literally the interface here. The description isn't metadata - it's the instruction the model uses to decide whether and how to invoke the tool. Ambiguous descriptions measurably degrade tool-selection accuracy. I keep seeing teams discover this the slow way, after their agent starts doing strange things, and the first instinct is always "something's wrong with the model." It usually isn't.

MCP Tool Descriptions: Why Most of Them Are Broken

A 2024 arXiv study on MCP tool quality found that over 95% of tool descriptions contained at least one quality issue. Read that again slowly: ninety-five percent. And these weren't poorly written amateur tools - this was a systematic survey of tools in mcp ecosystems, across a range of servers and use cases.

What does a "quality issue" look like in practice? Usually one of three things: a description that says what the tool is called without saying what it does, a parameter schema that lists inputs without explaining what they control, or no return-value description so the model has no idea what to expect from the output. Any one of those gaps makes it harder for an AI model to use the tool correctly. All three together, and the model is essentially guessing from the name.

I see this pattern in support regularly. A team deploys an MCP-connected agent, watches it pick the wrong tool repeatedly, and opens a ticket convinced there's a bug somewhere. We dig into the context window. The tool descriptions look like variable names dressed up as sentences. "Processes data." "Handles user requests." "Returns information." The ai model is context-aware enough to try something - it just has no reliable signal for which something is right.

Structured information in a tool description isn't optional decoration. It's the primary signal the model uses to reason about capability boundaries. When that signal is weak, the model falls back to surface-level pattern matching on tool names, which produces exactly the kind of inconsistent, hard-to-reproduce failures that make agents look unreliable.

The tools in mcp that work well in production are the ones where someone treated description quality as engineering work, not documentation cleanup.

What a Good MCP Tool Description Has to Include

There are three required elements, and missing any one of them degrades model performance in a specific, predictable way.

Plain-English explanation of what the tool does. Not what it's called. Not what system it talks to. What it actually does from the model's perspective. "Retrieves the current status of a customer order by order ID" is good. "Order tool" is not. GitHub's SEP-1382 guidance on tool descriptions establishes this as the foundational requirement: the description must be unambiguous without any additional context.

Parameter documentation with purpose, not just type. A JSON schema can tell the model that a parameter is a string. The description needs to tell the model what that string controls. The difference between "customer_id": "string" and "customer_id": "The unique identifier from your CRM, formatted as CUST-XXXXX" is significant when the model is deciding whether to supply this value from user input or derive it from a previous tool call.

Return-value description. What does the tool output? What format is it in? What fields are present? If the model doesn't know what a tool returns, it can't plan what to do with the output afterward. Tool outputs feed into downstream reasoning - a model that doesn't know a tool returns a list of objects versus a single dictionary will make structurally wrong assumptions about how to process the result.

These aren't suggestions. They're the minimum viable description. Below this, you're relying on the model to infer what you left out, and models infer incorrectly just often enough to make production unreliable.

Common Description Smells That Break Tool Selection

The arXiv research used the term "smelly descriptions" for anti-patterns that consistently degraded model performance. These are the ones I see most often, and each one has a specific failure mode attached to it.

Vague action verbs that apply to everything. "Manages," "handles," "processes," "gets." These tools enable nothing more than uncertainty. A model reading three tools that all "handle" something has no basis for choosing between them. Replace with the specific action: "Creates," "Retrieves by ID," "Updates status field," "Sends notification to."

Parameter descriptions that repeat the parameter name. "order_id: The order ID" is not documentation. It's tautology. The model needs to understand what values are valid, where those values come from in context, and what happens if a wrong value is supplied. Additional context here is the difference between a tool call that works and one that generates a confusing error downstream.

Missing return description. This is the one that produces the most support tickets. The agent calls the tool, gets a response, doesn't know what to do with it, and either ignores it or hallucinated an interpretation. Use tools that tell you what comes back: "Returns a JSON object containing order_status, last_updated, and items_pending fields."

Descriptions written for human readers, not model consumers. "This tool is super handy for checking order status!" is user input styled as documentation. A model doesn't need enthusiasm. It needs precision. Write descriptions as if the consumer is a system that will execute logic based on what you write.

That last one is where I'd start if an agent is behaving strangely. Not the code. The descriptions. tool_description_quality_spectrum

Building MCP Servers: Error Handling and the Parts Most Teams Skip

Building an MCP server is straightforward until production, and then it isn't. The gap between a working demo and a reliable implementation is almost entirely in error handling and schema validation. These are the specific mistakes I see, what they produce, and how to catch them.

  • Returning generic errors instead of structured error responses

    When a tool call fails, the MCP server should return a structured error response with a meaningful code and a description the client can act on. Instead, most early implementations return a bare exception or a 500 with no body. The client gets a blank wall. The model doesn't know whether to retry, abort, or route to a different tool. Build explicit error response shapes for every failure mode before the server goes anywhere near production - at minimum: error code, error category (input validation failure vs. upstream API failure vs. timeout), and a description the model can reason about.

  • Skipping JSON schema validation before executing tool logic

    An MCP server receives a tool call with a parameter payload. If that payload doesn't match the declared JSON schema - wrong type, missing required field, malformed structure - the server needs to reject it cleanly before attempting execution. Servers that skip this step run partial logic on bad input, write corrupt data downstream, and return success codes that aren't accurate. Validate against the schema first. Reject early with a clear validation error. This is the check that prevents the class of bug where the server did something, but not the right something, and nobody finds out for three days.

  • Silently swallowing failures in async tool execution

    Using mcp for async operations introduces a specific failure mode: the tool accepts the request, queues the work, returns a success acknowledgment, and then the async work fails quietly. From the client's perspective, the tool succeeded. The downstream effect never happened. Add explicit status tracking for any async tool execution - a follow-up status endpoint, a webhook callback, or a visible queue entry - so the failure has a place to surface. A server that acknowledges a request it can't complete is not a working server.

  • No distinction between client errors and server errors in the response

    A remote MCP server receiving a malformed request should respond differently than a server that received a valid request but failed internally when executing it. The implementation detail that matters here: the model uses error codes to decide what to do next. A 4xx-style error means "the request was wrong, fix the call." A 5xx-style error means "the server had a problem, maybe retry." Without that distinction in your error response design, every failure looks the same to the client, and the model's retry and fallback logic can't work correctly.

  • Development tools left enabled in production-facing servers

    Development tools - verbose logging of full request bodies, debug endpoints that expose internal state, unauthenticated query endpoints - frequently survive the trip from staging to production when teams move fast. Check specifically for: any endpoint that returns raw stack traces, any logging configuration that writes full payloads to a shared log sink, and any development-only capability declared in the tools list. These aren't hypothetical concerns; they're the setup mistakes that land in security incident reports.

  • Missing rate limits on tool execution paths

    A well-described, correctly implemented MCP tool that wraps an external API call with no rate limiting is one aggressive agent loop away from an outage. The external API has limits your server needs to respect. When the server doesn't enforce them, the agent gets a successful tool list, starts calling at whatever rate its reasoning loop allows, and eventually produces a cascade of upstream 429 errors that look like a server reliability problem rather than a design gap. Build rate limits into the server implementation before the first external API integration lands.

That's the list I work through when a team says their MCP server "mostly works." Mostly is the tell.

Security Considerations Every MCP Server Needs Before Going Live

Most security discussions around MCP focus on transport-layer concerns: authentication, TLS, network exposure, connection authorization. Those matter. But the attack surface that teams aren't accounting for properly is the description layer - the text fields that most people treat as documentation.

MCP tools introduce a security surface that is structurally different from conventional API security. The permission a model has to act comes not just from its credentials, but from its interpretation of tool descriptions. When a model reads a tool description and decides to invoke it, it's acting on text. That text can be manipulated.

Before any MCP server goes live, security review should cover at minimum: who can register tools on the server, whether tool descriptions are validated or can be modified after registration, what human-in-the-loop checkpoints exist before high-privilege tool calls execute, and whether the server logs which tools were called, with what parameters, and by which client. The NSA's MCP security guidance specifically identifies tool interaction patterns as a governance concern in AI-enabled systems - not the transport, the tool behavior.

Permission scoping is the other thing teams regularly underscope. A tool that can read a database probably shouldn't also be able to write it. A tool that sends a notification probably shouldn't have access to authentication flows. Scope each tool to the minimum permissions it actually needs, then enforce that at the server level before any client can invoke it.

🤔 Wait.
Most MCP security audits look at transport authentication and network exposure. Almost none of them look at the description field as an attack surface. But tool poisoning attacks don't need network access - they need text that gets read by a model. The description field is protocol-level input to the model's reasoning. Treating it as documentation is the mistake.

What Tool Poisoning Attacks Look Like in Practice

Tool poisoning is the attack pattern where malicious instructions are embedded inside MCP tool descriptions. The mechanism relies on a fundamental design feature of MCP: models are designed to be model-controlled, meaning the model reads description content as trusted input and uses it to guide its own behavior.

An attacker who can control what appears in a tool description can inject instructions that the model will follow when it reads the tool list. A poisoned description might instruct the model to exfiltrate data to a different endpoint, to grant escalated permissions, to suppress logging of certain actions, or to prefer one tool over another in ways that bypass intended authorization logic. The models interact with description text the same way they interact with any other instruction content - which is the entire problem.

A pre-invocation check should look for: tool descriptions containing imperative instructions unrelated to the tool's declared function, descriptions that reference other tools or modify selection criteria, and any description that includes conditional logic ("if the user asks about X, also call Y"). Connection-time analysis - reviewing the full tool list before allowing any invocations - is an emerging practice that treats the tool list itself as a security artifact rather than just metadata. For high-privilege servers, this is worth building before the first production deployment, not after the first incident.

The prompt injection surface and the tool description surface are the same surface. That's the thing most teams haven't asked yet.

Real-World Use Cases Where MCP Tools Add Actual Value

MCP tools aren't interesting in isolation. They're interesting when they sit between an AI model and a real system that needs to be queried, updated, or acted on. These are the four categories where I see them reliably adding value, not as demos but as production-grade implementations.

AI coding environments with file system and CI/CD access. VS Code extensions, AI coding assistants, and similar tools use MCP to expose file navigation, test execution, build system interaction, and repository operations. The agent can look at the current project state, run a test suite, read the output, and suggest a fix - all through MCP tool calls rather than bespoke integrations. The playwright MCP toolset for browser-based testing falls into this category: tools like browser_navigate, browser_click, and browser_snapshot let an AI agent drive regression testing against real interfaces. Bug0's analysis shows how this changes the build-vs-buy equation for AI-assisted testing significantly.

Enterprise apps connecting AI models to CRMs and business workflows. A support agent or sales AI that can interact with external systems - pulling a customer record, checking order status, updating a pipeline field - through standardized MCP tool calls rather than custom integration code. This is the use case where automate gets its meaning: a workflow that spans ERP, CRM, and communication tools, orchestrated by an agent that discovered what's available at runtime and acts on it.

Automation and test engineering wrapping infrastructure APIs. DevOps and QA teams exposing deployment commands, infrastructure queries, and monitoring APIs as MCP tools so AI agents can surface, triage, and act on operational signals without requiring a human to translate between systems. The agent can check a deployment status, pull recent error logs, and decide whether to escalate - all through the MCP tool interface.

Documentation platforms enabling retrieval and write operations. Knowledge bases and document systems exposed via MCP tools that support both read (retrieve a policy, search files, find a contract template) and write (draft a document, update a record, post a summary). The distinction from resources matters here: when the operation changes something, it's a tool, not a resource.

For teams building these ai applications on a visual platform, Latenode's MCP Server Builder is one of the practical paths here. You can expose a workflow action - say, one that queries an ERP via API and returns structured order data - as an MCP tool, then wire that tool directly into Claude Desktop or Cursor. The workflow handles the integration complexity and authentication; the MCP interface handles the model-facing contract. A RevOps manager who needs live order status via an internal AI assistant doesn't need to know there's a multi-step workflow behind the tool call. They just get an answer. That's the version of "connect AI to external tools and data" that actually runs in production without becoming a maintenance burden. mcp_enterprise_workflow_pattern

References

  1. National Security Agency - Model Context Protocol (MCP): Security Design Considerations for AI-Enabled Systems - 15/02/2026
  2. Digital Applied - AI Agent Protocol Ecosystem Map 2026: Complete Visual - 17/03/2026
  3. Digital Applied - MCP Adoption Statistics 2026: Model Context Protocol - 19/04/2026
  4. Celigo - MCP tools: What they are + 7 tools to try (2026) - 21/05/2026
  5. Bug0 - Playwright MCP Changes the Build vs. Buy Equation for AI Testing in 2026 - 15/01/2026
  6. YouTube - The Future of MCP — David Soria Parra, Anthropic - 18/04/2026

FAQ

Frequently Asked Questions

No. MCP tools are a protocol-level abstraction with runtime discovery, a defined schema, and AI-oriented descriptions - APIs require hardcoded integration and have no native discovery mechanism for LLM clients. The difference is whether the client has to know in advance what's available.

Found this helpful? Share it →

Written by

Vasiliy Datsenko

Head of Customer Support

Vasiliy Datsenko is Head of Customer Support at Latenode and a product-focused automation writer. His work connects customer conversations, workflow automation research, AI use cases, and practical product education for teams trying to automate real business processes.

Author profile →

Fact checked by

Oleg Zankov

Founder and CEO

Founder and automation product builder behind Latenode. Expert in iPaaS, AI agents, and workflow automation architecture.

Author profile →