datamodel-code-generator

🚀 Generate Python data models from schema definitions in seconds.


✨ What it does

Schema files, raw data, and existing Python models flow through datamodel-code-generator into Python model output types.

Pick any supported input, then pick the Python model style you want as output. With --input-model path/to/file.py:ClassName, you can even retarget an existing Pydantic, dataclass, or TypedDict class defined in another Python file to a different output type.

  • 📄 Converts OpenAPI 3, AsyncAPI, JSON Schema, Apache Avro, XML Schema, Protocol Buffers/gRPC, GraphQL, MCP tool schemas, and raw data (JSON/YAML/CSV) into Python models
  • 🐍 Generates from existing Python types (Pydantic, dataclass, TypedDict) via --input-model
  • 🎯 Generates Pydantic v2, Pydantic v2 dataclass, dataclasses, TypedDict, or msgspec output
  • 🔗 Handles complex schemas: $ref, allOf, oneOf, anyOf, enums, and nested types
  • ✅ Produces type-safe, validated code ready for your IDE and type checker
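
As a rough illustration of what --input-model can point at, the sketch below shows a hypothetical inventory.py containing a standard-library dataclass. The file name, class name, and fields are made up for this example; only the path/to/file.py:ClassName form comes from the text above, and the command in the comment is an assumed combination of flags shown elsewhere on this page.

```python
# inventory.py (hypothetical file name used only for this sketch)
from dataclasses import dataclass, field


@dataclass
class Product:
    """An existing hand-written model that could be retargeted."""

    sku: str
    name: str
    price_cents: int = 0
    tags: list[str] = field(default_factory=list)


# Assumed invocation, following the path/to/file.py:ClassName form:
#   datamodel-codegen --input-model inventory.py:Product \
#       --output-model-type pydantic_v2.BaseModel --output product_model.py

if __name__ == "__main__":
    # The dataclass itself is ordinary Python and works without the generator.
    print(Product(sku="A-1", name="Widget"))
```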

🧪 Try It In Your Browser

Generate models in your browser without installing anything.

Open Playground

Playground privacy

The playground runs datamodel-code-generator locally in your browser with Pyodide. Your schema and options are not sent to a backend for generation. If you copy a repro URL, the schema and options are encoded in the URL fragment (#state=...), which browsers do not send to the server; the full URL can still be stored in your browser history or wherever you share it.
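
The fragment behaviour described above can be checked with the standard library. This is a minimal sketch using a made-up repro URL (the host, path, and state payload are hypothetical); it only shows that the part after # is separate from what a client would request from the server.

```python
from urllib.parse import urlsplit

# Hypothetical repro URL; host, path, and state payload are made up.
repro_url = "https://example.com/playground/#state=eyJzY2hlbWEiOiAie30ifQ"

parts = urlsplit(repro_url)

# The URL a client actually requests: everything except the fragment.
request_target = parts._replace(fragment="").geturl()

print("sent to server:", request_target)
print("kept in browser:", parts.fragment)
```

The full string, fragment included, is still what ends up in browser history or in a chat message when the URL is shared.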


📦 Installation

uv tool install datamodel-code-generator
pip install datamodel-code-generator
uv add --dev datamodel-code-generator
conda install -c conda-forge datamodel-code-generator
pipx install datamodel-code-generator
uvx datamodel-codegen --help

Use uv tool install when you want datamodel-codegen available as a standalone CLI. Use uv add --dev when a project or CI workflow should pin the generator version in its lockfile.


Omitting --output-model-type is deprecated

Starting from version 0.53.0, omitting --output-model-type is deprecated.

We recommend using --output-model-type pydantic_v2.BaseModel for new projects.


🏃 Quick Start

Command

datamodel-codegen \
  --input schema.json \
  --input-file-type jsonschema \
  --output-model-type pydantic_v2.BaseModel \
  --preset standard-py312-20260619 \
  --output model.py

This quick start uses standard-py312-20260619 as the modern Python 3.12 baseline. Preset names include the target Python version: py312 means Python 3.12.

See CLI Reference for all options. See Presets, --preset, --input-file-type, and --output-model-type for this command.

For more schema-aware output that preserves schema-authored names, reuses models, and embeds generated documentation, use practical-py312-20260619.

Input (schema.json)
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Pet",
  "type": "object",
  "required": ["name"],
  "properties": {
    "name": {
      "type": "string",
      "description": "The pet's name"
    },
    "species": {
      "type": "string",
      "enum": ["dog", "cat", "bird", "fish"],
      "default": "dog"
    },
    "age": {
      "type": "integer",
      "minimum": 0,
      "description": "Age in years"
    },
    "vaccinated": {
      "type": "boolean",
      "default": false
    }
  }
}

Output

model.py
# generated by datamodel-codegen:
#   filename:  schema.json

from __future__ import annotations

from enum import StrEnum
from typing import Annotated

from pydantic import BaseModel, ConfigDict, Field


class Species(StrEnum):
    dog = 'dog'
    cat = 'cat'
    bird = 'bird'
    fish = 'fish'


class Pet(BaseModel):
    model_config = ConfigDict(
        populate_by_name=True,
    )
    name: Annotated[str, Field(description="The pet's name")]
    species: Species = Species.dog
    age: Annotated[int | None, Field(description='Age in years', ge=0)] = None
    vaccinated: bool = False

🎉 That's it! Your schema is now a fully-typed Python model.


📥 Choose Your Input

| Input Type | File Types | Example |
|---|---|---|
| 📘 OpenAPI 3.0/3.1/3.2 | .yaml, .json | API specifications |
| 📡 AsyncAPI | .yaml, .json | Event-driven API specifications |
| 📋 JSON Schema | .json, .yaml | Data validation schemas |
| 🪶 Apache Avro | .avsc, .json | Avro schemas |
| 🧾 XML Schema | .xsd | XML document schemas |
| 🧩 Protocol Buffers / gRPC | .proto | Protobuf messages and service schemas |
| 🔷 GraphQL | .graphql | GraphQL type definitions |
| 🛠️ MCP Tool Schemas | .json, .yaml | MCP tool input/output schemas |
| 📊 JSON/YAML/CSV Data | .json, .yaml, .csv | Infer schema from data |
| 🐍 Python Models | .py | Pydantic, dataclass, TypedDict |

✅ Conformance Signals

CI exercises datamodel-code-generator against pinned external corpora for XML Schema, JSON Schema, AsyncAPI, Apache Avro, and Protocol Buffers. See the Conformance Dashboard for the generated summary of runner scripts, tox environments, CI jobs, expected corpus counts, and upstream sources.


📤 Choose Your Output

# 🆕 Pydantic v2 (recommended for new projects)
datamodel-codegen --output-model-type pydantic_v2.BaseModel ...

# 🏗️ Python dataclasses
datamodel-codegen --output-model-type dataclasses.dataclass ...

# 📝 TypedDict (for type hints without validation)
datamodel-codegen --output-model-type typing.TypedDict ...

# ⚡ msgspec (high-performance serialization)
datamodel-codegen --output-model-type msgspec.Struct ...

See Supported Data Types for the full list.
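
To make "type hints without validation" concrete, here is a small sketch with a hand-written TypedDict. The PetDict class is a stand-in written for this example, not actual generator output, which may look different.

```python
from typing import TypedDict


class PetDict(TypedDict, total=False):
    # Hand-written stand-in for a TypedDict-style model (an assumption,
    # not copied from real generator output).
    name: str
    age: int


# A static type checker would flag the string age, but at runtime a
# TypedDict is a plain dict and nothing is validated.
bad: PetDict = {"name": "Rex", "age": "three"}

print(type(bad) is dict)
print(bad["age"])
```

If bad input must be rejected at runtime, a validating output type such as pydantic_v2.BaseModel is the better fit; TypedDict suits code that only needs static checking.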


🍳 Common Recipes

🤖 Get CLI Help from LLMs

Generate a prompt to ask LLMs about CLI options:

datamodel-codegen --generate-prompt "Best options for Pydantic v2?" | claude -p

See LLM Integration for more examples.

🌐 Generate from URL

pip install 'datamodel-code-generator[http]'
datamodel-codegen --url https://example.com/api/openapi.yaml --output model.py

⚙️ Use with pyproject.toml

pyproject.toml
[tool.datamodel-codegen]
input = "schema.yaml"
output = "src/models.py"
output-model-type = "pydantic_v2.BaseModel"

Then simply run:

datamodel-codegen

See pyproject.toml Configuration for more options.

🔄 CI/CD Integration

Validate generated models in your CI pipeline:

.github/workflows/validate-models.yml
- uses: koxudaxi/datamodel-code-generator@0.64.1
  with:
    input: schemas/api.yaml
    output: src/models/api.py

See CI/CD Integration for more options.


📚 Next Steps


💖 Sponsors

Astral

OpenAI


🏢 Used by

These projects use datamodel-code-generator. See the linked examples for real-world usage.

See all dependents →