Tool Use / Function Calling

Understand how models call external code safely and reliably using structured outputs, validation, and execution boundaries.

Difficulty intermediate
Read time 10 min
tool-use function-calling structured-output json-schema agentic-systems reliability
Updated February 11, 2026

What Is Tool Use / Function Calling?

A language model can reason in text, but it cannot directly query your database, send an email, or check current inventory unless it has a bridge to those systems.

Tool use is that bridge.

Analogy: think of the model as a project coordinator and your backend functions as specialist teammates. The coordinator decides what needs to happen, then sends precise requests to specialists. Specialists execute and return results. The coordinator then explains the result to the user.

Function calling is the structured protocol for this coordination. Instead of free-form text like “maybe call weather service with London,” the model emits a typed function call with arguments that your app can validate and execute.

Technical definition: function calling is a constrained generation pattern where the model selects from declared tools and emits schema-valid arguments, enabling deterministic integration with external systems.

Why Does It Matter?

Without tools, models often guess. With tools, they can fetch facts, perform actions, and produce auditable outcomes.

This matters for:

  • Reliability: structured arguments reduce fragile string parsing.
  • Safety: the runtime can validate and gate actions before execution.
  • Freshness: tools can fetch real-time or private data unavailable in model weights.
  • Composability: model reasoning and deterministic software each do what they are best at.

In production, tool use is the difference between a “chatty assistant” and a “useful assistant.”

How It Works

A robust tool-calling loop usually has these parts.

1) Define tool contracts

Each tool has:

  • name,
  • description,
  • argument schema (often JSON Schema or equivalent),
  • execution permissions.

Good contracts are narrow and explicit. Ambiguous tool descriptions create unpredictable calls.
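The shape above can be sketched as a small declaration. This is illustrative only: the tool name `get_weather`, its fields, and the `input_schema` key are assumptions, not any specific provider's format.

```python
# A minimal tool-contract sketch using JSON-Schema-style constraints.
# Name, description, and fields are illustrative, not a vendor API.
GET_WEATHER_TOOL = {
    "name": "get_weather",
    "description": (
        "Fetch current weather for a single city. "
        "Use only when the user asks about weather."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. 'London'"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
        # Reject fields the contract does not declare.
        "additionalProperties": False,
    },
}
```

Note the narrow description and the `enum` constraint: both reduce the space of calls the model can plausibly emit.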

2) Give the model tool context

At runtime, pass available tool definitions and task instructions. The model decides whether to answer directly or emit a function call.
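A request carrying that context might look like the sketch below. The field names (`model`, `messages`, `tools`) vary by provider; none of these are a specific vendor's API.

```python
# Illustrative request payload: tool definitions travel alongside the
# conversation, so the model can choose between answering directly and
# emitting a function call. Field names are assumptions, not a real API.
request = {
    "model": "example-model",
    "messages": [
        {"role": "system", "content": "Use tools for live data; otherwise answer directly."},
        {"role": "user", "content": "What's the weather in London right now?"},
    ],
    "tools": [
        {
            "name": "get_weather",
            "description": "Fetch current weather for one city.",
            "input_schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }
    ],
}
```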

3) Validate model output before execution

Never execute raw model text as code.

Validation checks should include:

  • schema conformance,
  • required fields,
  • value constraints,
  • policy checks (auth scopes, tenant boundaries, sensitive operations).

If validation fails, ask the model to repair arguments or route to a fallback path.
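A hand-rolled sketch of the schema and constraint checks (a production system would more likely use a full JSON Schema validator; the function and schema here are illustrative):

```python
# Validation sketch: check required fields, declared fields, types, and
# enum constraints before any execution happens. Returns a list of
# errors so the caller can ask the model to repair its arguments.
def validate_args(schema: dict, args: dict) -> list[str]:
    errors = []
    props = schema.get("properties", {})
    for field in schema.get("required", []):
        if field not in args:
            errors.append(f"missing required field: {field}")
    for key, value in args.items():
        if key not in props:
            errors.append(f"unexpected field: {key}")
            continue
        spec = props[key]
        if spec.get("type") == "string" and not isinstance(value, str):
            errors.append(f"{key}: expected string")
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{key}: must be one of {spec['enum']}")
    return errors

schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "units": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}

assert validate_args(schema, {"city": "London"}) == []
assert validate_args(schema, {"units": "kelvin"}) != []  # missing city, bad enum
```

Policy checks (auth scopes, tenant boundaries) would layer on top of this; schema conformance alone is not a safety decision.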

4) Execute in controlled runtime

The application executes the approved function call, not the model.

Best practice:

  • least privilege credentials,
  • timeouts and retries,
  • idempotency for state-changing actions,
  • audit logs for each call.
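A minimal executor sketch illustrating two of these practices, an allowlisted registry and an audit log. Credentials, retries, and idempotency keys are elided, and the `get_weather` stub is illustrative:

```python
import time

# Executor sketch: only registered tools run, and every call is timed
# and logged. The model never touches this layer directly.
REGISTRY = {
    "get_weather": lambda args: {"city": args["city"], "temp_c": 18},  # stub
}

AUDIT_LOG: list[dict] = []

def execute(tool_name: str, args: dict) -> dict:
    if tool_name not in REGISTRY:
        raise PermissionError(f"tool not registered: {tool_name}")
    start = time.monotonic()
    try:
        result = REGISTRY[tool_name](args)
        status = "ok"
    except Exception as exc:
        result, status = {"error": str(exc)}, "error"
    AUDIT_LOG.append({
        "tool": tool_name,
        "args": args,
        "status": status,
        "elapsed_s": round(time.monotonic() - start, 3),
    })
    return result
```

Raising on unregistered tools, rather than silently skipping them, makes a misrouted call visible instead of a quiet no-op.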

5) Return tool result to model (or user)

For multi-step tasks, feed tool results back to the model so it can decide the next step. For simple tasks, the app may return results directly to the user without an additional model generation.
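The multi-step case is a loop: call the model, execute any tool call it emits, append the result, repeat until it produces a final answer. A sketch, with a stubbed-in `fake_model` and `fake_execute` standing in for a real API client and executor:

```python
# Tool loop sketch: the model plans, the app executes, results flow
# back in as messages until the model returns a final answer.
def run_loop(model, execute, messages, max_steps=5):
    for _ in range(max_steps):
        reply = model(messages)
        if reply["type"] == "final":
            return reply["text"]
        # reply is a tool call: execute it and feed the result back
        result = execute(reply["name"], reply["args"])
        messages.append({"role": "tool", "name": reply["name"], "content": result})
    raise RuntimeError("tool loop did not converge")

# Tiny fake model: asks for one tool call, then answers.
def fake_model(messages):
    if any(m["role"] == "tool" for m in messages):
        return {"type": "final", "text": "It is 18 degrees C in Paris."}
    return {"type": "tool_call", "name": "get_weather", "args": {"city": "Paris"}}

def fake_execute(name, args):
    return {"temp_c": 18}  # stub result

answer = run_loop(fake_model, fake_execute,
                  [{"role": "user", "content": "Weather in Paris?"}])
```

The `max_steps` cap matters in practice: without it, a confused model can loop on tool calls indefinitely.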

6) Use structured final outputs

Even after tool execution, require schema-constrained final responses when possible (for UI rendering, storage, or downstream workflows).
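For instance, a scheduling assistant's final response might be constrained to a schema a UI can render directly. The field names below are illustrative assumptions:

```python
# Sketch of a schema-constrained final response. Downstream code checks
# conformance before rendering or storing it.
FINAL_SCHEMA = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "event_id": {"type": "string"},
        "status": {"type": "string", "enum": ["booked", "needs_followup"]},
    },
    "required": ["summary", "status"],
}

response = {
    "summary": "Booked 30 minutes with Dana on Tuesday at 2:30 PM.",
    "status": "booked",
}

# Minimal conformance check before handing off to the UI.
missing = [k for k in FINAL_SCHEMA["required"] if k not in response]
assert missing == []
```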

Example workflow

User: “Book me a meeting next week with Dana, any 30-minute slot after 2 PM.”

  • Model chooses find_available_slots with date and attendee constraints.
  • Runtime validates and executes calendar query.
  • Model receives the available slots and selects one that matches the user's preference.
  • Model calls create_event with selected slot.
  • Runtime creates event and returns confirmation.
  • Model provides concise user-facing summary.

The model did planning and language. Your app did permissions and execution.

Key Terminology

  • Tool contract: Formal definition of callable function and argument schema.
  • Structured output: Machine-readable model output that follows a declared schema.
  • Validation layer: Runtime checks that gate function execution.
  • Executor: Application component that runs approved tools.
  • Least privilege: Security principle granting only minimum required permissions.

Real-World Applications

  • Customer support automation: Check order status, issue refunds within policy limits, and generate case notes.
  • Developer assistants: Run code search, test commands, and static analyzers through explicit tool interfaces.
  • Operations copilots: Query dashboards, open incidents, and draft remediation steps with human approval gates.
  • Personal productivity assistants: Schedule meetings, summarize inboxes, and create task tickets through connected APIs.

Common Misconceptions

  1. “Function calling means the model executes code directly.” No. The model proposes structured calls; your runtime decides what is actually executed.

  2. “If schema validation passes, it is automatically safe.” Schema validity is necessary, not sufficient. You still need auth, policy checks, and sandboxing.

  3. “Tool use removes hallucinations completely.” It reduces guessing when data/actions are tool-backed, but reasoning and interpretation errors can still occur.

Further Reading

  • OpenAI platform documentation on function calling and structured outputs.
  • JSON Schema core documentation for designing strict and composable tool interfaces.
  • Anthropic documentation on tool use patterns and safe tool execution practices.