Welcome to one of the most powerful LLM capabilities: Tool Calling (also known as Function Calling). It lets an LLM request calls to your Python functions, so it can work with real-time data and trigger actions in the real world!
Coming from Software Engineering? Function calling is RPC for AI. The LLM sees a function signature (like a gRPC proto or OpenAPI spec) and decides to call it with arguments. If you've built webhook systems or defined API contracts, you already understand the core pattern — you're just exposing your functions to an AI client instead of a human one.
What is Tool Calling?
Instead of the LLM making up information, it can request to call your functions!
Key mental model: the model never runs your code. It returns a structured request — function name plus arguments as JSON — and stops. Your code runs the function and passes the result back in a second call. Think of the model as a caller handing you an RPC envelope and waiting; you are the executor.
What Can Go Wrong? Tool calling introduces real-world side effects — the LLM is now triggering your code. Common failure modes:
- Hallucinated arguments (the model confidently invents values that were never real): it calls get_weather("Atlantis") or fills a parameter with a plausible-looking but made-up value.
- Wrong tool selection: The LLM picks delete_user() when it should pick get_user()
- Infinite loops: The LLM keeps calling the same tool without making progress
- Injection via arguments: User input flows through the LLM into tool arguments, so never pass them to eval(), SQL, or shell commands without sanitization
- Cost spirals: Multi-turn tool loops burn tokens on every round trip
We'll cover tool-execution and error-handling patterns in Days 30-31 (and the infinite-loop / max-iterations safeguard in Day 37), but keep these failure modes in mind as you design tool schemas.
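To make the injection bullet concrete, here is a minimal sketch using only the standard library. The get_user tool, the users table, and its columns are hypothetical and exist only for illustration; the point is that a model-supplied argument is bound as a query parameter instead of being spliced into the SQL string.

```python
import sqlite3

# Hypothetical in-memory database standing in for real application data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")


def get_user(name: str) -> dict:
    """Hypothetical tool: look up a user by name.

    `name` arrives from the model, so treat it as untrusted input.
    """
    # The ? placeholder binds `name` as data; it is never parsed as SQL.
    row = conn.execute(
        "SELECT name, email FROM users WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return {"error": f"No user named {name!r}"}
    return {"name": row[0], "email": row[1]}


print(get_user("alice"))
# A classic injection string is treated as a literal (non-matching) name.
print(get_user("alice' OR '1'='1"))
```

The same idea applies to shell commands (pass an argument list, not a formatted string) and to eval(), which is best avoided entirely for model-supplied input.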
The Tool Calling Flow
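The round trip described in the mental model above can be sketched without any API key. In this sketch fake_model is a hypothetical stand-in for the LLM API (not a real SDK call) and the weather data is hardcoded; what matters is the order in which messages accumulate: user, assistant (with a tool request), tool, assistant (final answer).

```python
import json


def get_weather(city: str) -> dict:
    """Mock tool implementation with hardcoded data."""
    return {"city": city, "temp": 22, "condition": "sunny"}


def fake_model(messages: list) -> dict:
    """Hypothetical stand-in for the LLM; it never runs get_weather itself."""
    last = messages[-1]
    if last["role"] == "user":
        # The "model" replies with a structured request and stops.
        return {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_1",
                "function": {
                    "name": "get_weather",
                    "arguments": json.dumps({"city": "Paris"}),
                },
            }],
        }
    # The tool result is now in the history, so it can answer in plain text.
    result = json.loads(last["content"])
    text = f"{result['city']} is {result['temp']}°C and {result['condition']}."
    return {"role": "assistant", "content": text}


def run_flow(question: str) -> list:
    messages = [{"role": "user", "content": question}]   # 1. user asks
    reply = fake_model(messages)                          # 2. model requests a tool
    messages.append(reply)
    for call in reply.get("tool_calls", []):              # 3. our code executes it
        args = json.loads(call["function"]["arguments"])
        result = get_weather(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    messages.append(fake_model(messages))                 # 4. model writes the answer
    return messages


transcript = run_flow("What's the weather in Paris?")
for message in transcript:
    print(message["role"], "->", message.get("content"))
```

The real loop later in this lesson has the same shape; only fake_model is replaced by an actual API call.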
Defining Tools: The Modern Way
Modern SDKs auto-generate JSON schemas from Pydantic models — no hand-writing schemas.
OpenAI with Pydantic (Recommended)
# script_id: day_028_function_calling_basics/pydantic_tool_definition
from openai import OpenAI, pydantic_function_tool
from pydantic import BaseModel, Field

client = OpenAI()

# Define your tool as a Pydantic model; the SDK generates the schema for you
class GetWeather(BaseModel):
    """Get the current weather for a location."""
    city: str = Field(description="The city name, e.g., 'London'")
    # json_schema_extra injects raw JSON-schema keys (here enum = the only allowed values) into the generated tool schema
    unit: str = Field(default="celsius", description="Temperature unit", json_schema_extra={"enum": ["celsius", "fahrenheit"]})

# pydantic_function_tool() converts the model to the OpenAI tool format automatically
tools = [pydantic_function_tool(GetWeather)]

# Make a request with tools
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
    tool_choice="auto",  # let the model decide; you can also force a tool (see Forcing Tool Use below)
)
print(response.choices[0].message)
What the SDK Generates Under the Hood
The pydantic_function_tool() call above produces this raw JSON schema — you rarely need to write this by hand anymore, but understanding it helps with debugging:
# script_id: day_028_function_calling_basics/generated_schema_example
# This is what pydantic_function_tool(GetWeather) generates:
{
    "type": "function",
    "function": {
        "name": "GetWeather",
        "description": "Get the current weather for a location.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The city name, e.g., 'London'"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature unit"
                }
            },
            "required": ["city"]
        }
    }
}
Simplified for readability. The current SDK also emits "strict": true and "additionalProperties": false, and under strict mode lists every property (including unit) in "required" regardless of defaults — Day 29 covers the exact output.
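When debugging a schema, it can help to check those three strict-mode points mechanically. The helper below is a hypothetical sketch, not part of any SDK, and it only checks the three points the note above mentions (the "strict" flag, "additionalProperties", and every property being listed in "required"); it is not a full validator for strict mode.

```python
def strict_mode_problems(tool: dict) -> list[str]:
    """Return reasons why a tool schema misses the strict-mode points above."""
    function = tool["function"]
    params = function.get("parameters", {})
    problems = []
    if function.get("strict") is not True:
        problems.append('"strict" is not true')
    if params.get("additionalProperties") is not False:
        problems.append('"additionalProperties" is not false')
    # Under strict mode every property must be listed in "required".
    missing = sorted(set(params.get("properties", {})) - set(params.get("required", [])))
    if missing:
        problems.append(f"properties missing from required: {missing}")
    return problems


# The simplified schema shown above, trimmed to the relevant keys.
simplified = {
    "type": "function",
    "function": {
        "name": "GetWeather",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}

for problem in strict_mode_problems(simplified):
    print("-", problem)
```

Running it on the simplified schema reports all three differences, which is exactly the gap the note describes between the readable version and what the SDK sends.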
Implementing Tool Functions
# script_id: day_028_function_calling_basics/tool_calling_loop
import ast
import json
import operator

from openai import OpenAI, pydantic_function_tool
from pydantic import BaseModel, Field

client = OpenAI()

# Define tool schemas as Pydantic models
class GetWeather(BaseModel):
    """Get current weather for a city."""
    city: str = Field(description="City name")
    # json_schema_extra injects raw JSON-schema keys (here enum = the only allowed values) into the generated tool schema
    unit: str = Field(default="celsius", description="Temperature unit", json_schema_extra={"enum": ["celsius", "fahrenheit"]})

class Calculate(BaseModel):
    """Perform mathematical calculations."""
    expression: str = Field(description="Math expression like '2 + 2'")

# Your actual function implementations
def get_weather(city: str, unit: str = "celsius") -> dict:
    """Get weather for a city (mock implementation)."""
    # In reality, you'd call a weather API here
    weather_data = {
        "Tokyo": {"temp": 18, "condition": "cloudy"},
        "London": {"temp": 12, "condition": "rainy"},
        "Paris": {"temp": 22, "condition": "sunny"},
    }
    data = weather_data.get(city, {"temp": 20, "condition": "unknown"})
    if unit == "fahrenheit":
        data["temp"] = data["temp"] * 9/5 + 32
    return {"city": city, **data, "unit": unit}

def calculate(expression: str) -> dict:
    """Safely evaluate a mathematical expression."""
    # Whitelist of supported binary operators; anything else is rejected
    ops = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}

    def safe_eval(node):
        # Accept numeric literals only, so strings and other constants are rejected
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        elif isinstance(node, ast.BinOp) and type(node.op) in ops:
            return ops[type(node.op)](safe_eval(node.left), safe_eval(node.right))
        elif isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -safe_eval(node.operand)
        raise ValueError("Unsupported expression")

    try:
        tree = ast.parse(expression, mode='eval')
        result = safe_eval(tree.body)
        return {"expression": expression, "result": result}
    except Exception as e:
        return {"error": str(e)}

# Map tool names (the Pydantic class names) to implementations
AVAILABLE_FUNCTIONS = {
    "GetWeather": get_weather,
    "Calculate": calculate,
}

# Generate tool schemas from Pydantic models; no manual JSON needed
tools = [pydantic_function_tool(GetWeather), pydantic_function_tool(Calculate)]
Complete Tool Calling Loop
# script_id: day_028_function_calling_basics/tool_calling_loop
def chat_with_tools(user_message: str) -> str:
    """Chat with tool calling capability."""
    messages = [{"role": "user", "content": user_message}]

    # First API call - might request tool use
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        tools=tools,
        tool_choice="auto"
    )
    assistant_message = response.choices[0].message

    # Check if model wants to use tools
    if assistant_message.tool_calls:
        # Add assistant's response to messages
        messages.append(assistant_message)

        # Process each tool call
        for tool_call in assistant_message.tool_calls:
            function_name = tool_call.function.name
            function_args = json.loads(tool_call.function.arguments)
            print(f"Calling {function_name} with {function_args}")

            # Execute the function
            if function_name in AVAILABLE_FUNCTIONS:
                result = AVAILABLE_FUNCTIONS[function_name](**function_args)
            else:
                result = {"error": f"Unknown function: {function_name}"}

            # Add tool result to messages
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result)
            })

        # Second call: the model hasn't seen the function output yet. We send it back
        # so it can phrase a natural-language answer ("Paris is 22°C and sunny")
        # instead of leaving the user with raw JSON
        final_response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages
        )
        return final_response.choices[0].message.content
    else:
        return assistant_message.content

# Test it!
print(chat_with_tools("What's the weather in Paris?"))
print()
print(chat_with_tools("Calculate 15 * 7 + 23"))
print()
print(chat_with_tools("What's 2+2 and what's the weather in London?"))
Multiple Tool Calls
The model can request several tool calls in one turn, like a client batching requests. The chat_with_tools loop above already handles this: it iterates every entry in tool_calls, executes each, and appends one {"role": "tool", ...} message per call before re-calling the model. The calls run sequentially here; reach for asyncio if a tool is slow.
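As a rough sketch of the asyncio option, the snippet below runs several requested calls concurrently and returns the tool messages in request order. slow_lookup is a hypothetical blocking tool, and the call dicts are simplified stand-ins for the SDK's tool_call objects (only an id and a JSON arguments string), so treat the shapes as assumptions.

```python
import asyncio
import json
import time


def slow_lookup(city: str) -> dict:
    """Hypothetical blocking tool; the sleep stands in for network latency."""
    time.sleep(0.2)
    return {"city": city, "temp": 20}


async def run_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Run every requested call concurrently; results keep the request order."""

    async def run_one(call: dict) -> dict:
        args = json.loads(call["arguments"])
        # to_thread keeps the blocking function off the event loop.
        result = await asyncio.to_thread(slow_lookup, **args)
        return {
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        }

    # gather returns results in the same order as the awaitables passed in.
    return await asyncio.gather(*(run_one(call) for call in tool_calls))


calls = [
    {"id": "call_1", "arguments": '{"city": "Paris"}'},
    {"id": "call_2", "arguments": '{"city": "London"}'},
]
start = time.perf_counter()
tool_messages = asyncio.run(run_tool_calls(calls))
elapsed = time.perf_counter() - start
print([m["tool_call_id"] for m in tool_messages], f"took {elapsed:.2f}s")
```

Each tool message still carries its own tool_call_id, so the model can match results to requests regardless of which call finished first.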
Tool Calling with Anthropic (Claude)
Anthropic uses input_schema instead of parameters. You can generate this from Pydantic too:
# script_id: day_028_function_calling_basics/anthropic_tool_calling
import json
from anthropic import Anthropic
from pydantic import BaseModel, Field

anthropic_client = Anthropic()

# Same Pydantic model, different SDK format
class GetWeather(BaseModel):
    """Get current weather for a city."""
    city: str = Field(description="City name")
    # json_schema_extra injects raw JSON-schema keys (here enum = the only allowed values) into the generated tool schema
    unit: str = Field(default="celsius", description="Temperature unit", json_schema_extra={"enum": ["celsius", "fahrenheit"]})

def get_weather(city: str, unit: str = "celsius") -> dict:
    """Get weather for a city (mock implementation)."""
    return {"city": city, "temp": 22, "condition": "sunny", "unit": unit}

available_functions = {"get_weather": get_weather}

# Anthropic format: use model_json_schema() to generate input_schema
tools = [
    {
        "name": "get_weather",
        "description": GetWeather.__doc__,
        "input_schema": GetWeather.model_json_schema(),
    }
]

def chat_with_claude_tools(user_message: str) -> str:
    """Chat using Claude's tool calling."""
    messages = [{"role": "user", "content": user_message}]
    response = anthropic_client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )

    # Check for tool use (this simple version handles the first tool_use block only)
    for block in response.content:
        if block.type == "tool_use":
            tool_name = block.name
            tool_input = block.input

            # Execute function
            result = available_functions[tool_name](**tool_input)

            # Continue conversation with result
            messages.append({"role": "assistant", "content": response.content})
            messages.append({
                "role": "user",
                "content": [{
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": json.dumps(result)
                }]
            })

            # Get final response; keep passing tools, since the history now
            # contains tool_use and tool_result blocks
            final = anthropic_client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                tools=tools,
                messages=messages
            )
            return final.content[0].text

    # No tool was requested: return the text answer directly
    return response.content[0].text
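The Claude example above stops at the first tool_use block. If a response contains several, each one needs its own tool_result, and they all go into a single user message. The sketch below shows one way to build that message; build_tool_results is a hypothetical helper, and the SimpleNamespace objects merely imitate the type, id, name, and input attributes used above rather than real SDK classes.

```python
import json
from types import SimpleNamespace


def get_weather(city: str, unit: str = "celsius") -> dict:
    """Mock implementation, as in the example above."""
    return {"city": city, "temp": 22, "condition": "sunny", "unit": unit}


available_functions = {"get_weather": get_weather}


def build_tool_results(content_blocks) -> list[dict]:
    """Execute every tool_use block and return one tool_result per block."""
    results = []
    for block in content_blocks:
        if block.type != "tool_use":
            continue  # text blocks need no result
        func = available_functions.get(block.name)
        if func is None:
            output = {"error": f"Unknown tool: {block.name}"}
        else:
            output = func(**block.input)
        results.append({
            "type": "tool_result",
            "tool_use_id": block.id,  # ties the result to its request
            "content": json.dumps(output),
        })
    return results


# Stand-in for response.content with two tool requests in one turn.
blocks = [
    SimpleNamespace(type="text", text="Let me check both cities."),
    SimpleNamespace(type="tool_use", id="toolu_1", name="get_weather", input={"city": "Paris"}),
    SimpleNamespace(type="tool_use", id="toolu_2", name="get_weather", input={"city": "London"}),
]
user_turn = {"role": "user", "content": build_tool_results(blocks)}
print(json.dumps(user_turn, indent=2))
```

In the real loop you would append the assistant message first, then this user_turn, and call messages.create again.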
Tool Design Best Practices
Day 29 covers tool-schema design (good vs bad definitions) in depth.
Forcing Tool Use
Sometimes you want to ensure a specific tool is called:
# script_id: day_028_function_calling_basics/forcing_tool_use
# Assumes client, messages, and tools are defined as in the earlier examples

# Force specific tool (the name must match the tool schema, i.e. the Pydantic class name)
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "GetWeather"}}
)

# Force ANY tool (no direct response allowed)
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    tools=tools,
    tool_choice="required"
)

# Let model decide (default)
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    tools=tools,
    tool_choice="auto"
)

# No tools (even if defined)
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    tools=tools,
    tool_choice="none"
)
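A forced tool name has to match the name field in the tool schema, which for pydantic_function_tool is the Pydantic class name (GetWeather in the generated schema shown earlier), not the Python function that implements it. One way to catch a mismatch before the request is sent is a small guard like the one below; force_tool is a hypothetical helper, not an SDK function, and the tools list here is a trimmed stand-in for the generated schemas.

```python
def force_tool(tools: list[dict], name: str) -> dict:
    """Build a tool_choice value for `name`, failing fast if it is not registered."""
    known = [tool["function"]["name"] for tool in tools]
    if name not in known:
        raise ValueError(f"Unknown tool {name!r}; registered tools: {known}")
    return {"type": "function", "function": {"name": name}}


# Trimmed stand-in for [pydantic_function_tool(GetWeather), pydantic_function_tool(Calculate)]
tools = [
    {"type": "function", "function": {"name": "GetWeather", "parameters": {}}},
    {"type": "function", "function": {"name": "Calculate", "parameters": {}}},
]

print(force_tool(tools, "GetWeather"))

try:
    force_tool(tools, "get_weather")  # implementation name, not the schema name
except ValueError as exc:
    print(exc)
```

The returned dict can be passed directly as tool_choice in the calls above.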
Complete Tool Calling System
# script_id: day_028_function_calling_basics/complete_tool_system
from openai import OpenAI, pydantic_function_tool
from pydantic import BaseModel, Field
from typing import Callable
import json

class ToolSystem:
    """Reusable tool calling system with Pydantic-based tool registration."""

    def __init__(self):
        self.client = OpenAI()
        self.tools = []
        self.functions = {}

    def register(self, schema: type[BaseModel]):
        """Decorator to register a function as a tool using a Pydantic model."""
        def decorator(func: Callable):
            self.tools.append(pydantic_function_tool(schema))
            # The tool name the model sees is the Pydantic class name
            self.functions[schema.__name__] = func
            return func
        return decorator

    def chat(self, message: str, max_tool_rounds: int = 5) -> str:
        """Chat with automatic tool handling."""
        messages = [{"role": "user", "content": message}]

        for _ in range(max_tool_rounds):
            request = {"model": "gpt-4o-mini", "messages": messages}
            if self.tools:
                # Only include tool parameters when tools are registered,
                # rather than sending explicit None values
                request["tools"] = self.tools
                request["tool_choice"] = "auto"
            response = self.client.chat.completions.create(**request)
            assistant_message = response.choices[0].message

            # No tool requested: this is the final answer
            if not assistant_message.tool_calls:
                return assistant_message.content

            messages.append(assistant_message)
            for tool_call in assistant_message.tool_calls:
                func = self.functions.get(tool_call.function.name)
                if func:
                    args = json.loads(tool_call.function.arguments)
                    result = func(**args)
                else:
                    result = {"error": "Unknown function"}
                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": json.dumps(result)
                })

        return "Max tool rounds exceeded"

# Usage: define schemas as Pydantic models, register with decorator
system = ToolSystem()

class GetTime(BaseModel):
    """Get the current time."""
    pass

class AddNumbers(BaseModel):
    """Add two numbers together."""
    a: float = Field(description="First number")
    b: float = Field(description="Second number")

@system.register(GetTime)
def get_time():
    from datetime import datetime
    return {"time": datetime.now().strftime("%H:%M:%S")}

@system.register(AddNumbers)
def add_numbers(a: float, b: float):
    return {"result": a + b}

# Chat!
print(system.chat("What time is it?"))
print(system.chat("What's 42 + 17?"))
Summary
Quick Reference
| Step | OpenAI | What it does |
|---|---|---|
| Define a tool | pydantic_function_tool(MyModel) | Turns a Pydantic model into the tool schema |
| Send tools | client.chat.completions.create(model="gpt-4o-mini", tools=tools, messages=...) | Lets the model see your functions |
| Detect a call | response.choices[0].message.tool_calls | None if the model just answered in text |
| Read arguments | json.loads(tool_call.function.arguments) | Arguments arrive as a JSON string |
| Return a result | {"role": "tool", "tool_call_id": tc.id, "content": json.dumps(result)} | Feeds the function output back in as a JSON string |
| Loop | re-call create(...) with the tool result appended | Model produces the final answer |
Exercises
- Add a second tool. You have a calculator and a clock; add a get_weather(city) tool (return a hardcoded dict for now) and confirm the model picks the right tool per question.
- Handle the no-tool case. Ask a question that needs no tool ("Tell me a joke") and verify your loop returns the model's text answer directly without crashing on an empty tool_calls.
- Measure tool-selection accuracy. Write 10 prompts where you know the correct tool, run them, and count how often the model calls the expected function.
- Make the model call two tools at once. Ask "What time is it and what's 8 times 9?", inspect whether the model returns multiple entries in tool_calls, and make sure your loop executes all of them.
Solutions (approaches)
- Define a Weather Pydantic model with a city field, register it alongside the others, and dispatch on tool_call.function.name.
- Guard the loop: if not message.tool_calls: return message.content. Only enter the execute-and-resend branch when calls exist.
- Store (prompt, expected_tool) pairs, run each, compare tool_calls[0].function.name to the expected name, and print the hit rate.
- Iterate over every item in tool_calls, append one {"role": "tool", ...} message per call (matching tool_call_id), then re-call the model once.
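For the accuracy exercise, the bookkeeping can be sketched independently of the API. In the snippet below, tool_selection_accuracy is a hypothetical helper and keyword_picker is a deliberately naive stand-in for a real model call; in your own solution you would replace it with a function that sends the prompt and returns tool_calls[0].function.name (or None when no tool was called).

```python
from typing import Callable, Optional


def tool_selection_accuracy(
    cases: list[tuple[str, str]],
    pick_tool: Callable[[str], Optional[str]],
) -> float:
    """Return the fraction of prompts for which pick_tool chose the expected tool."""
    hits = 0
    for prompt, expected in cases:
        chosen = pick_tool(prompt)
        if chosen == expected:
            hits += 1
        else:
            print(f"MISS: {prompt!r} -> {chosen!r}, expected {expected!r}")
    return hits / len(cases)


def keyword_picker(prompt: str) -> Optional[str]:
    """Stand-in for the model: picks a tool from simple keyword rules."""
    text = prompt.lower()
    if "time" in text:
        return "GetTime"
    if any(ch.isdigit() for ch in text):
        return "AddNumbers"
    return None  # no tool chosen


cases = [
    ("What time is it?", "GetTime"),
    ("What's 42 + 17?", "AddNumbers"),
    ("Add three and four", "AddNumbers"),  # no digits, so the naive picker misses
]
accuracy = tool_selection_accuracy(cases, keyword_picker)
print(f"Hit rate: {accuracy:.0%}")
```

Printing the misses alongside the hit rate makes it easier to see whether a tool description needs rewording.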
Checkpoint
Run chat_with_tools("What's the weather in Paris?") and confirm the full loop fires: the model returns a tool call, your code prints Calling GetWeather with {...} (the tool name is the Pydantic class name), executes get_weather, and the model folds the result into a natural-language final answer. If the model answers the weather directly without ever calling the tool, check that you passed your tools list into the request and that the function descriptions clearly state when to use them.
What's Next?
You can now expose functions and let the model call them, but a tool is only as reliable as its schema, and hand-written JSON schemas are brittle. Tomorrow, in Day 29: Tool Schemas with Pydantic, we go deeper into generating clean, validated tool definitions from Pydantic models so your contracts stay correct as they grow.