datamodel-code-generator
Generate Python data models from schema definitions in seconds.
What it does
Pick any supported input and the Python model style you want as output.
With --input-model path/to/file.py:ClassName, you can even retarget an existing Pydantic, dataclass, or TypedDict
class defined in another Python file to a different output type.
- Converts OpenAPI 3, AsyncAPI, JSON Schema, Apache Avro, XML Schema, Protocol Buffers/gRPC, GraphQL, MCP tool schemas, and raw data (JSON/YAML/CSV) into Python models
- Generates from existing Python types (Pydantic, dataclass, TypedDict) via --input-model
- Generates Pydantic v2, Pydantic v2 dataclass, dataclasses, TypedDict, or msgspec output
- Handles complex schemas: $ref, allOf, oneOf, anyOf, enums, and nested types
- Produces type-safe, validated code ready for your IDE and type checker
Try It In Your Browser
Generate models in your browser without installing anything.
Playground privacy
The playground runs datamodel-code-generator locally in your browser with Pyodide. Your schema and options are not
sent to a backend for generation. If you copy a repro URL, the schema and options are encoded in the URL fragment
(#state=...), which browsers do not send to the server; the full URL can still be stored in your browser history
or wherever you share it.
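The fragment behavior described above can be illustrated with the standard library. This is a minimal sketch, assuming a hypothetical repro URL (the host, path, query, and state payload are placeholders, not the playground's real address): it splits the URL and rebuilds the request target an HTTP client would send, which contains the path and query but not the fragment.

```python
from urllib.parse import urlsplit

# Hypothetical repro URL; host, path, query, and payload are placeholders.
repro_url = "https://playground.example/app?theme=dark#state=eyJzY2hlbWEiOiAie30ifQ"

parts = urlsplit(repro_url)

# An HTTP request target is built from the path and query only.
request_target = parts.path + (f"?{parts.query}" if parts.query else "")

print("sent to server:", request_target)
print("kept in browser:", parts.fragment)
```

The fragment stays on the client side, but it is still part of the full URL string, which is why history entries and shared links carry the schema.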
Installation
Use uv tool install when you want datamodel-codegen available as a standalone CLI. Use uv add --dev when a project
or CI workflow should pin the generator version in its lockfile.
Omitting --output-model-type is deprecated
Starting from version 0.53.0, omitting --output-model-type is deprecated.
We recommend using --output-model-type pydantic_v2.BaseModel for new projects.
Quick Start
Command
datamodel-codegen \
--input schema.json \
--input-file-type jsonschema \
--output-model-type pydantic_v2.BaseModel \
--preset standard-py312-20260619 \
--output model.py
This quick start uses standard-py312-20260619 as the modern Python 3.12 baseline.
Preset names include the target Python version: py312 means Python 3.12.
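To make the naming convention concrete, here is a small sketch that reads the Python version out of a preset name. The helper `preset_python_version` is hypothetical and not part of datamodel-code-generator; it assumes the version token is a hyphen-delimited `py` followed by a one-digit major version and the minor version, as in the two preset names mentioned in this section.

```python
import re


def preset_python_version(preset: str) -> str:
    """Return the dotted Python version encoded in a preset name's py token."""
    # Assumption: the token looks like "py312" (one-digit major, then minor)
    # and is delimited by hyphens or the ends of the name.
    match = re.search(r"(?:^|-)py(\d)(\d+)(?:-|$)", preset)
    if match is None:
        raise ValueError(f"no py token found in preset name: {preset!r}")
    major, minor = match.groups()
    return f"{major}.{minor}"


print(preset_python_version("standard-py312-20260619"))
```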
See CLI Reference for all options. For the options used in this command, see Presets,
--preset, --input-file-type, and --output-model-type.
For more schema-aware output that preserves schema-authored names, reuses models, and embeds generated
documentation, use practical-py312-20260619.
Input (schema.json)
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Pet",
  "type": "object",
  "required": ["name"],
  "properties": {
    "name": {
      "type": "string",
      "description": "The pet's name"
    },
    "species": {
      "type": "string",
      "enum": ["dog", "cat", "bird", "fish"],
      "default": "dog"
    },
    "age": {
      "type": "integer",
      "minimum": 0,
      "description": "Age in years"
    },
    "vaccinated": {
      "type": "boolean",
      "default": false
    }
  }
}
Output
# generated by datamodel-codegen:
# filename: schema.json

from __future__ import annotations

from enum import StrEnum
from typing import Annotated

from pydantic import BaseModel, ConfigDict, Field


class Species(StrEnum):
    dog = 'dog'
    cat = 'cat'
    bird = 'bird'
    fish = 'fish'


class Pet(BaseModel):
    model_config = ConfigDict(
        populate_by_name=True,
    )
    name: Annotated[str, Field(description="The pet's name")]
    species: Species = Species.dog
    age: Annotated[int | None, Field(description='Age in years', ge=0)] = None
    vaccinated: bool = False

That's it! Your schema is now a fully typed Python model.
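To see why the generated fields look the way they do, it can help to read the schema the same way by hand. The sketch below is an illustration only, not the generator's actual algorithm: it uses the standard library to classify each property of the Pet schema (inlined here without the `$schema` line) as required, optional with a schema default, or optional with no default. The helper `summarize_fields` is a hypothetical name.

```python
import json

# The quick-start schema, trimmed and inlined so the sketch is self-contained.
SCHEMA = json.loads("""
{
  "title": "Pet",
  "type": "object",
  "required": ["name"],
  "properties": {
    "name": {"type": "string", "description": "The pet's name"},
    "species": {"type": "string", "enum": ["dog", "cat", "bird", "fish"], "default": "dog"},
    "age": {"type": "integer", "minimum": 0, "description": "Age in years"},
    "vaccinated": {"type": "boolean", "default": false}
  }
}
""")


def summarize_fields(schema: dict) -> dict[str, str]:
    """Classify each property as required or optional, noting any default."""
    required = set(schema.get("required", []))
    summary = {}
    for name, prop in schema["properties"].items():
        if name in required:
            summary[name] = "required"
        elif "default" in prop:
            summary[name] = f"optional, default {prop['default']!r}"
        else:
            summary[name] = "optional, default None"
    return summary


for field, kind in summarize_fields(SCHEMA).items():
    print(f"{field}: {kind}")
```

This lines up with the output above: `name` has no default, `species` and `vaccinated` carry their schema defaults, and `age` becomes `int | None` with a default of `None`.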
Choose Your Input
| Input Type | File Types | Example |
|---|---|---|
| OpenAPI 3.0/3.1/3.2 | .yaml, .json | API specifications |
| AsyncAPI | .yaml, .json | Event-driven API specifications |
| JSON Schema | .json, .yaml | Data validation schemas |
| Apache Avro | .avsc, .json | Avro schemas |
| XML Schema | .xsd | XML document schemas |
| Protocol Buffers / gRPC | .proto | Protobuf messages and service schemas |
| GraphQL | .graphql | GraphQL type definitions |
| MCP Tool Schemas | .json, .yaml | MCP tool input/output schemas |
| JSON/YAML/CSV Data | .json, .yaml, .csv | Infer schema from data |
| Python Models | .py | Pydantic, dataclass, TypedDict |
Conformance Signals
CI exercises datamodel-code-generator against pinned external corpora for XML Schema, JSON Schema, AsyncAPI, Apache Avro, and Protocol Buffers. See the Conformance Dashboard for the generated summary of runner scripts, tox environments, CI jobs, expected corpus counts, and upstream sources.
Choose Your Output
# Pydantic v2 (recommended for new projects)
datamodel-codegen --output-model-type pydantic_v2.BaseModel ...
# Python dataclasses
datamodel-codegen --output-model-type dataclasses.dataclass ...
# TypedDict (for type hints without validation)
datamodel-codegen --output-model-type typing.TypedDict ...
# msgspec (high-performance serialization)
datamodel-codegen --output-model-type msgspec.Struct ...
See Supported Data Types for the full list.
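If you want to script generation for several output types, one option is to build the command line in Python. This is a rough sketch under stated assumptions: `build_command` and the output file names are hypothetical, and only the `--input`, `--output`, and `--output-model-type` options shown in this document are used. The script prints the commands and does not run them.

```python
import shlex

# The four output model types shown in the commands above.
OUTPUT_MODEL_TYPES = (
    "pydantic_v2.BaseModel",
    "dataclasses.dataclass",
    "typing.TypedDict",
    "msgspec.Struct",
)


def build_command(input_path: str, output_path: str, model_type: str) -> list[str]:
    """Build an argv list for datamodel-codegen (hypothetical helper)."""
    if model_type not in OUTPUT_MODEL_TYPES:
        raise ValueError(f"unexpected output model type: {model_type!r}")
    return [
        "datamodel-codegen",
        "--input", input_path,
        "--output-model-type", model_type,
        "--output", output_path,
    ]


for model_type in OUTPUT_MODEL_TYPES:
    # Name each output file after the module part of the model type.
    output_path = f"models_{model_type.split('.')[0]}.py"
    print(shlex.join(build_command("schema.json", output_path, model_type)))
```

Each list can be passed to `subprocess.run(..., check=True)` once datamodel-codegen is installed.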
Common Recipes
Get CLI Help from LLMs
You can generate a prompt for asking LLMs about CLI options.
See LLM Integration for examples.
Generate from URL
pip install 'datamodel-code-generator[http]'
datamodel-codegen --url https://example.com/api/openapi.yaml --output model.py
Use with pyproject.toml
[tool.datamodel-codegen]
input = "schema.yaml"
output = "src/models.py"
output-model-type = "pydantic_v2.BaseModel"
Then simply run datamodel-codegen without arguments; it reads these settings from pyproject.toml.
See pyproject.toml Configuration for more options.
CI/CD Integration
Validate generated models in your CI pipeline:
- uses: koxudaxi/datamodel-code-generator@0.64.1
  with:
    input: schemas/api.yaml
    output: src/models/api.py
See CI/CD Integration for more options.
Next Steps
- CLI Reference - All command-line options with examples
- Presets - Recommended immutable option bundles
- pyproject.toml Configuration - Configure via pyproject.toml
- One-liner Usage - uvx, pipx, clipboard integration
- CI/CD Integration - GitHub Actions and CI validation
- Conformance Dashboard - External corpus and CI coverage signals
- Custom Templates - Customize generated code with Jinja2
- Code Formatting - Configure black, isort, and ruff
- FAQ - Common questions and troubleshooting
Sponsors
- Astral
- OpenAI
Used by
These projects use datamodel-code-generator. See the linked examples for real-world usage.
- PostHog/posthog - Generate models via npm run
- airbytehq/airbyte - Generate Python, Java/Kotlin, and TypeScript protocol models
- apache/iceberg - Generate Python code
- open-metadata/OpenMetadata - datamodel_generation.py
- openai/codex - Python SDK dev dependency
- vllm-project/vllm - Test dependency for model tests
- stanfordnlp/dspy - Generate Pydantic models from JSON Schema for reliability tests
- topoteretes/cognee - Runtime generation of graph data models from JSON Schema
- e2b-dev/E2B - Generate MCP server TypedDict models via Makefile
- apache/airflow - Generate OpenAPI datamodels for airflow-ctl and task-sdk via pyproject codegen config
- browser-use/browser-use - Eval dependency
- firebase/genkit - Generate core typing models from JSON Schema
- open-telemetry/opentelemetry-python - Generate SDK configuration dataclasses from JSON Schema
- DataDog/integrations-core - Config models
- argoproj-labs/hera - Makefile
- tensorzero/tensorzero - Generate Python dataclasses from JSON Schema in the schema generation pipeline
- IBM/compliance-trestle - Building models from OSCAL schemas