When you want an LLM to call your Python functions, you need to describe those functions in a format the LLM understands: JSON Schema (a standard way to describe the shape of a JSON object — its fields, their types, and which are required). The good news: modern SDKs generate these schemas automatically from Pydantic models. You rarely need to write them by hand.
Coming from Software Engineering? This is like OpenAPI/Swagger spec generation. Just as FastAPI auto-generates API docs from your type annotations, LLM SDKs auto-generate tool schemas from Pydantic models. The days of hand-writing JSON schemas for every function are over. The LLM never runs your code. It reads your tool descriptions and, when it decides a tool fits, returns a JSON object naming the tool and its arguments — your code does the actual call. The schema is how the model knows which arguments are valid. And just as a frontend reads your OpenAPI spec to know which endpoints exist and what to send, the LLM reads these tool schemas to pick a tool and fill in its arguments.
The Modern Approach: Pydantic Does the Work
Every major LLM SDK now supports generating tool schemas from Pydantic models. Define your tool once, use it everywhere.
OpenAI: pydantic_function_tool()
# script_id: day_029_tool_schemas_pydantic/openai_pydantic_tool
from openai import OpenAI, pydantic_function_tool
from pydantic import BaseModel, Field
from typing import Optional
client = OpenAI()
class SearchProducts(BaseModel):
    """Search for products in the catalog."""
    query: str = Field(description="Search query for products")
    category: Optional[str] = Field(None, description="Filter by category")
    max_price: Optional[float] = Field(None, description="Maximum price filter")
    max_results: int = Field(10, description="Maximum results to return")

# One line: the SDK generates the full JSON schema
tools = [pydantic_function_tool(SearchProducts)]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Find me running shoes under $100"}],
    tools=tools,
)
That's it. The SDK inspects SearchProducts, pulls the docstring as the description, reads field types and descriptions, and produces the full OpenAI tool schema.
Anthropic: model_json_schema()
Anthropic's SDK doesn't have a pydantic_function_tool() equivalent, but Pydantic's built-in model_json_schema() gets you there:
# script_id: day_029_tool_schemas_pydantic/anthropic_pydantic_tool
from anthropic import Anthropic
from pydantic import BaseModel, Field
from typing import Optional
client = Anthropic()
class SearchProducts(BaseModel):
    """Search for products in the catalog."""
    query: str = Field(description="Search query for products")
    category: Optional[str] = Field(None, description="Filter by category")
    max_price: Optional[float] = Field(None, description="Maximum price filter")
    max_results: int = Field(10, description="Maximum results to return")

# Generate Anthropic-format tool definition from the same Pydantic model
tools = [
    {
        "name": "search_products",
        "description": SearchProducts.__doc__,
        "input_schema": SearchProducts.model_json_schema(),
    }
]

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "Find me running shoes under $100"}],
)
LangChain: @tool Decorator
LangChain goes even further — it generates schemas directly from function signatures:
# script_id: day_029_tool_schemas_pydantic/langchain_tool_decorator
from langchain_core.tools import tool
@tool
def search_products(query: str, category: str | None = None, max_price: float | None = None) -> list:
    """Search for products in the catalog by name, category, or price range."""
    # Your implementation here
    return [{"name": "Running Shoe", "price": 89.99}]

# Schema is auto-generated from the function signature + docstring
print(search_products.name) # "search_products"
print(search_products.description) # "Search for products in the catalog..."
print(search_products.args_schema) # Pydantic model generated from type hints
For more control, combine @tool with a Pydantic input model:
# script_id: day_029_tool_schemas_pydantic/langchain_tool_with_schema
from langchain_core.tools import tool
from pydantic import BaseModel, Field
class SearchInput(BaseModel):
    query: str = Field(description="Search query for products")
    category: str | None = Field(None, description="Filter by category")
    max_price: float | None = Field(None, description="Maximum price filter")

@tool(args_schema=SearchInput)
def search_products(query: str, category: str | None = None, max_price: float | None = None) -> list:
    """Search for products in the catalog."""
    return [{"name": "Running Shoe", "price": 89.99}]
Under the Hood: What a Raw JSON Schema Looks Like
You rarely write these by hand, but understanding the format helps when debugging tool-calling issues.
# script_id: day_029_tool_schemas_pydantic/raw_json_schema_example
# This is what pydantic_function_tool(SearchProducts) generates for OpenAI.
# Note the strict-mode artifacts: "strict": true, every field in "required",
# optionals expressed as anyOf with null, and additionalProperties: false.
{
    "type": "function",
    "function": {
        "name": "SearchProducts",
        "strict": True,
        "description": "Search for products in the catalog.",
        "parameters": {
            "type": "object",
            "title": "SearchProducts",
            "description": "Search for products in the catalog.",
            "properties": {
                "query": {
                    "type": "string",
                    "title": "Query",
                    "description": "Search query for products"
                },
                "category": {
                    "anyOf": [{"type": "string"}, {"type": "null"}],
                    "title": "Category",
                    "description": "Filter by category"
                },
                "max_price": {
                    "anyOf": [{"type": "number"}, {"type": "null"}],
                    "title": "Max Price",
                    "description": "Maximum price filter"
                },
                "max_results": {
                    "type": "integer",
                    "default": 10,
                    "title": "Max Results",
                    "description": "Maximum results to return"
                }
            },
            "required": ["query", "category", "max_price", "max_results"],
            "additionalProperties": False
        }
    }
}
Key things to notice:
- name comes from the class name (OpenAI) or you set it explicitly (Anthropic)
- description comes from the docstring
- properties map to Pydantic fields with types auto-converted
- required: with bare model_json_schema() (the Anthropic path) this includes only fields without defaults; OpenAI's pydantic_function_tool uses strict mode, so all fields land in required and optional fields become anyOf with null
- Anthropic uses input_schema instead of parameters: same content, different key
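The difference in required can be checked locally without calling any provider. The sketch below prints the bare Pydantic schema and then applies a hypothetical strictify() helper that imitates only the two most visible strict-mode changes (every property required, no extra properties). It is a simplified illustration for a flat model, not the OpenAI SDK's actual implementation.

```python
import copy
from typing import Optional

from pydantic import BaseModel, Field


class SearchProducts(BaseModel):
    """Search for products in the catalog."""

    query: str = Field(description="Search query for products")
    category: Optional[str] = Field(None, description="Filter by category")
    max_results: int = Field(10, description="Maximum results to return")


def strictify(schema: dict) -> dict:
    """Hypothetical helper: imitate two strict-mode changes on a flat object schema."""
    strict = copy.deepcopy(schema)  # leave the original schema untouched
    strict["required"] = list(strict.get("properties", {}))  # every field is required
    strict["additionalProperties"] = False  # reject unknown keys
    return strict


bare = SearchProducts.model_json_schema()
strict = strictify(bare)

print(bare["required"])    # only the field without a default
print(strict["required"])  # every field, in declaration order
print(bare["properties"]["category"]["anyOf"])  # optional field: string or null
```

Nested models, $defs, and the other adjustments a real strict conversion makes are deliberately left out of this sketch.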
Provider Format Differences
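The schema content is largely the same. What differs is the wrapper and, on the OpenAI side, the strict-mode adjustments that pydantic_function_tool() applies: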
# script_id: day_029_tool_schemas_pydantic/provider_format_helpers
from pydantic import BaseModel, Field
class GetWeather(BaseModel):
    """Get current weather for a city."""
    city: str = Field(description="City name")

# Helper to generate for any provider from one Pydantic model
def to_openai(model: type[BaseModel]) -> dict:
    """Convert Pydantic model to OpenAI tool format."""
    from openai import pydantic_function_tool
    return pydantic_function_tool(model)

def to_anthropic(model: type[BaseModel], name: str | None = None) -> dict:
    """Convert Pydantic model to Anthropic tool format."""
    # Falls back to the lowercased class name, e.g. GetWeather -> "getweather"
    tool_name = name or model.__name__.lower()
    return {
        "name": tool_name,
        "description": model.__doc__ or "",
        "input_schema": model.model_json_schema(),
    }

# Same model, both providers
openai_tool = to_openai(GetWeather)
anthropic_tool = to_anthropic(GetWeather, name="get_weather")
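Note that the fallback in to_anthropic lowercases the class name, so GetWeather becomes getweather rather than get_weather, which is why the call above passes name explicitly. If you would rather derive a snake_case default, one possible approach is a small regex helper. The snake_case function below is a hypothetical addition, not part of any SDK.

```python
import re


def snake_case(class_name: str) -> str:
    """Hypothetical helper: convert a CamelCase class name to snake_case."""
    # Insert "_" wherever a lowercase letter or digit is followed by an uppercase letter
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", class_name).lower()


print(snake_case("GetWeather"))
print(snake_case("SearchProducts"))
print("GetWeather".lower())  # what the plain .lower() fallback produces
```

This simple pattern does not split runs of capitals, so a name such as HTTPRequest would need special handling.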
Pydantic Features That Map to Schema Constraints
Pydantic gives you rich schema control through field definitions:
# script_id: day_029_tool_schemas_pydantic/pydantic_schema_constraints
from pydantic import BaseModel, Field
from typing import Literal, Optional
from enum import Enum
class Priority(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

class CreateTicket(BaseModel):
    """Create a support ticket."""
    # String with enum constraint: LLM will only output these values
    priority: Priority = Field(description="Ticket priority level")
    # Literal type: alternative to enum for small fixed sets
    category: Literal["bug", "feature", "question"] = Field(description="Ticket type")
    # String with length constraints
    title: str = Field(description="Ticket title", min_length=5, max_length=200)
    # Optional field with default
    assignee: Optional[str] = Field(None, description="Assign to a team member")
    # Number with range
    severity: int = Field(description="Severity from 1 to 5", ge=1, le=5)
    # Array field
    tags: list[str] = Field(default_factory=list, description="Tags for categorization")

# See what schema Pydantic generates:
import json
print(json.dumps(CreateTicket.model_json_schema(), indent=2))
| Pydantic Feature | JSON Schema Result | LLM Behavior |
|---|---|---|
| str | "type": "string" | Free text |
| int | "type": "integer" | Whole numbers |
| float | "type": "number" | Any number |
| bool | "type": "boolean" | true/false |
| list[str] | "type": "array", "items": {"type": "string"} | Array of strings |
| Literal["a", "b"] | "enum": ["a", "b"] | Constrained choices |
| Enum | "enum": [...] | Constrained choices |
| Optional[str] | Not in required (bare model_json_schema() path) | LLM may omit |
| Field(ge=1, le=5) | "minimum": 1, "maximum": 5 | Bounded range |
| Field(min_length=5) | "minLength": 5 | Minimum string length |
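A few rows of the table can be verified directly. The sketch below uses a trimmed-down CreateTicket, reads the constraints back out of the generated schema, and then shows that the same constraints reject out-of-range arguments when you validate them. The ticket values are made up for illustration.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class CreateTicket(BaseModel):
    """Create a support ticket."""

    category: Literal["bug", "feature", "question"] = Field(description="Ticket type")
    title: str = Field(description="Ticket title", min_length=5, max_length=200)
    severity: int = Field(description="Severity from 1 to 5", ge=1, le=5)


props = CreateTicket.model_json_schema()["properties"]
print(props["category"]["enum"])                                   # allowed values
print(props["severity"]["minimum"], props["severity"]["maximum"])  # numeric bounds
print(props["title"]["minLength"], props["title"]["maxLength"])    # length bounds

# The schema only describes the constraints; validation is what enforces them.
try:
    CreateTicket(category="bug", title="Login fails on mobile", severity=9)
except ValidationError as exc:
    print(f"rejected: {exc.error_count()} error(s)")
```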
Nested Models for Complex Tools
# script_id: day_029_tool_schemas_pydantic/nested_models
from pydantic import BaseModel, Field
from typing import Optional
class Location(BaseModel):
    """Event location details."""
    name: str = Field(description="Venue name")
    address: Optional[str] = Field(None, description="Street address")
    virtual: bool = Field(False, description="Whether this is a virtual event")

class CreateEvent(BaseModel):
    """Create a calendar event."""
    title: str = Field(description="Event title")
    date: str = Field(description="Event date in ISO format (YYYY-MM-DD)")
    attendees: list[str] = Field(default_factory=list, description="Attendee email addresses")
    location: Optional[Location] = Field(None, description="Event location")

# Pydantic handles nested models automatically: the generated schema defines
# Location under "$defs" and the location property points to it via "$ref"
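To see where the nested model ends up, you can inspect the generated schema. In Pydantic v2 the Location sub-schema is stored under "$defs" and referenced from the location property with "$ref". The sketch below uses shortened versions of the two models above.

```python
from typing import Optional

from pydantic import BaseModel, Field


class Location(BaseModel):
    """Event location details."""

    name: str = Field(description="Venue name")


class CreateEvent(BaseModel):
    """Create a calendar event."""

    title: str = Field(description="Event title")
    location: Optional[Location] = Field(None, description="Event location")


schema = CreateEvent.model_json_schema()
print(sorted(schema["$defs"]))                    # names of the nested model definitions
print(schema["properties"]["location"]["anyOf"])  # a $ref to Location, or null
```

This matters when debugging: if a provider rejects a schema with references, the "$defs" section is the first place to look.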
Best Practices for Tool Schema Design
Good: Descriptive fields that guide the LLM
# script_id: day_029_tool_schemas_pydantic/good_schema_example
from typing import Literal

from pydantic import BaseModel, Field

class SearchProducts(BaseModel):
    """Search for products in the catalog by name, category, or price range.
    Use this when the user wants to find products to buy."""
    query: str = Field(description="Search query, e.g., 'red shoes' or 'laptop'")
    category: Literal["electronics", "clothing", "home", "sports"] | None = Field(
        None, description="Product category to filter by"
    )
Bad: Vague names and no descriptions
# script_id: day_029_tool_schemas_pydantic/bad_schema_example
from pydantic import BaseModel

class DoStuff(BaseModel):
    """Does stuff."""
    x: str  # No description: the LLM will guess what to put here
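Missing descriptions are easy to catch mechanically. As a rough sketch, the hypothetical fields_missing_description() helper below walks a model's fields (via Pydantic v2's model_fields) and lists those declared without a description. You could run something like it in a test suite over every tool model.

```python
from pydantic import BaseModel, Field


def fields_missing_description(model: type[BaseModel]) -> list[str]:
    """Hypothetical lint helper: names of fields declared without a description."""
    return [name for name, info in model.model_fields.items() if not info.description]


class DoStuff(BaseModel):
    """Does stuff."""

    x: str  # no Field(description=...)


class GetWeather(BaseModel):
    """Get current weather for a city."""

    city: str = Field(description="City name")


print(fields_missing_description(DoStuff))     # lists the undescribed field
print(fields_missing_description(GetWeather))  # nothing to report
```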
Strict Schemas: Guaranteeing Valid Output
A normal tool schema describes the shape you want, but the model can still
return something slightly off. Strict / structured-output modes make the
provider enforce the schema, so you get valid, parseable output every time —
no defensive try/except json.loads dance. Under the hood the provider
restricts the model so it can only produce output that keeps matching the
schema — you don't need the internals, just that the result is guaranteed to
parse.
Coming from Software Engineering? This is the difference between documenting an API contract and having the framework validate it. Strict mode is server-side request validation for the model's output.
# script_id: day_029_tool_schemas_pydantic/strict_outputs
from openai import OpenAI
client = OpenAI()
# OpenAI: strict tool — the function arguments are guaranteed to match the schema
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Book a flight to Tokyo on 2026-03-15 for 2"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "book_flight",
            "strict": True,  # enforce the schema
            "parameters": {
                "type": "object",
                "properties": {
                    "destination": {"type": "string"},
                    "date": {"type": "string"},
                    "passengers": {"type": "integer"},
                },
                "required": ["destination", "date", "passengers"],
                "additionalProperties": False,  # required when strict=True
            },
        },
    }],
)
On the Anthropic side, the Messages API offers the same guarantee via
output_config={"format": {"type": "json_schema", "schema": {...}}} for the
response body, and strict: True on a tool to validate its inputs. Either way,
the rule is the same: let the provider enforce the contract instead of
parsing-and-hoping.
In production: strict mode removes a whole class of "the model returned almost-JSON" bugs. The trade-offs: schemas compile on first use (a one-time latency hit, then cached), and strict modes accept only a subset of JSON Schema. Depending on the provider, features such as recursion or numeric ranges may not be supported, so validate those client-side.
Summary
| Approach | When to Use |
|---|---|
| pydantic_function_tool() (OpenAI) | OpenAI SDK: one-liner, recommended |
| model_json_schema() (Anthropic) | Anthropic SDK: generate input_schema from Pydantic |
| @tool decorator (LangChain) | LangChain: auto-schema from function signature |
| Raw JSON schema | Debugging, understanding what SDKs generate, edge cases |
Quick Reference
| Goal | Code | Notes |
|---|---|---|
| Define a tool model | class GetWeather(BaseModel): city: str = Field(..., description="...") | Descriptions are the model's only hint, so write them |
| OpenAI tool spec | pydantic_function_tool(GetWeather) | One-liner, OpenAI-formatted |
| Anthropic input_schema | GetWeather.model_json_schema() | Feed into {"name", "description", "input_schema"} |
| LangChain tool | @tool on a typed function | Schema inferred from the signature |
| Validate model output | GetWeather(**json.loads(args)) | Pydantic raises if the LLM sent bad args |
| Inspect generated schema | GetWeather.model_json_schema() | See exactly what the LLM will receive |
Exercises
- Add validation to a tool model. Add a @field_validator (Pydantic v2) that rejects an empty city, then feed it deliberately bad LLM arguments and confirm it raises before you ever call the function.
- Compare generated schemas. Run model_json_schema() and pydantic_function_tool() on the same model and diff the output. Note how the OpenAI wrapper nests the schema under function.parameters.
- Document a poorly-described tool. Take a model with a bare x: str field, add a real Field(description=...), and measure whether the LLM fills the argument more correctly.
- Build a tool registry. Write a dict mapping tool name → (Pydantic model, handler function) so adding a tool is one entry instead of edits in three places.
Solutions (approaches)
- Stack @field_validator("city") on top of @classmethod on a method such as def non_empty(cls, v): ... that raises ValueError. Construct the model from bad args inside a try/except ValidationError.
- Print both as JSON; the OpenAI form is {"type": "function", "function": {..., "parameters": {...schema...}}} while model_json_schema() is the bare schema.
- Swap the docstring/Field(description=...) and rerun the same prompts; clearer descriptions reduce wrong or empty arguments.
- REGISTRY = {"get_weather": (GetWeather, get_weather)}; generate tool specs by iterating the registry and dispatch by name on a tool call.
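The first and last approaches fit together naturally. The following is one possible sketch rather than a reference solution: a registry keyed by tool name, a validator that rejects an empty city, Anthropic-style specs generated by iterating the registry, and a dispatch() function that validates the raw JSON arguments before calling the handler. The handler body and the dispatch() and anthropic_tools() names are placeholders.

```python
from typing import Callable

from pydantic import BaseModel, Field, ValidationError, field_validator


class GetWeather(BaseModel):
    """Get current weather for a city."""

    city: str = Field(description="City name")

    @field_validator("city")
    @classmethod
    def non_empty(cls, value: str) -> str:
        if not value.strip():
            raise ValueError("city must not be empty")
        return value


def get_weather(args: GetWeather) -> str:
    # Placeholder implementation; a real handler would call a weather service.
    return f"Weather for {args.city}: unknown"


# Tool name -> (input model, handler). Adding a tool is one new entry.
REGISTRY: dict[str, tuple[type[BaseModel], Callable[..., str]]] = {
    "get_weather": (GetWeather, get_weather),
}


def anthropic_tools() -> list[dict]:
    """Build Anthropic-style tool specs by iterating the registry."""
    return [
        {"name": name, "description": model.__doc__ or "", "input_schema": model.model_json_schema()}
        for name, (model, _handler) in REGISTRY.items()
    ]


def dispatch(name: str, raw_arguments: str) -> str:
    """Validate the model's JSON arguments, then call the matching handler."""
    model, handler = REGISTRY[name]
    args = model.model_validate_json(raw_arguments)  # raises ValidationError on bad input
    return handler(args)


print(dispatch("get_weather", '{"city": "Paris"}'))
try:
    dispatch("get_weather", '{"city": "   "}')
except ValidationError as exc:
    print(f"rejected before the handler ran: {exc.error_count()} error(s)")
```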
Checkpoint
Run to_openai and to_anthropic on the same Pydantic model and confirm: both emit valid tool schemas with your fields, types, and descriptions intact — one Pydantic class, two provider formats, no hand-written JSON. If a constraint like an enum or Field(description=...) is missing from the output, check that you generated the schema with model_json_schema() rather than copying field names by hand.
What's Next?
Now that your tools have proper schemas, let's learn how to execute tool calls and return results back to the LLM in Day 30!