Documentation

Features

Reference docs for pydantic-fixturegen.

main

Features: what pydantic-fixturegen delivers

Explore capabilities across discovery, generation, emitters, security, and tooling quality.

Discovery

  • AST and safe-import discovery with optional hybrid mode.
  • Include/exclude glob patterns, --public-only, and discovery warnings.
  • Structured error payloads (--json-errors) with taxonomy code 20.
  • Sandbox controls for timeout and memory limits.

Schema ingestion

  • pfg gen json --schema schema.json ingests standalone JSON Schema documents (via datamodel-code-generator) and immediately reuses the cached module for generation, explain, and diff workflows.
  • pfg gen openapi spec.yaml --route "GET /users" materialises OpenAPI 3.x components, isolates the schemas referenced by the selected routes, and emits a per-schema JSON sample (using {model} in your output template to fan out across operations).
  • pfg doctor --schema / --openapi surface coverage gaps for schema-driven models, so you can spot unsupported field shapes before writing a single Python class.
  • pfg gen examples spec.yaml --out spec_with_examples.yaml injects deterministic example payloads into every referenced schema/component, so your OpenAPI docs stay realistic without manual curation.
  • Generated modules are cached under .pfg-cache/schemas keyed by the document fingerprint and selected routes, which keeps reruns instant while still regenerating when the source spec changes.

FastAPI integration

  • pfg fastapi smoke app.main:app inspects live FastAPI apps, generates deterministic request/response bodies with pydantic-fixturegen, and scaffolds a pytest suite that asserts every documented route returns a 2xx status and passes response-model validation.
  • pfg fastapi serve app.main:app --port 8050 spins up a mock server that mirrors your routes but responds with fixture-generated payloads—perfect for demos, contract-first development, or onboarding stakeholders before the real backend exists.
  • Dependency injection hooks can be bypassed via --dependency-override original=stub, so auth/session providers or rate-limiters don’t block smoke tests and mock responses.
  • Install the fastapi extra to pull in FastAPI + Uvicorn support for these commands.

Polyfactory interoperability

  • Automatic ModelFactory detection: when the polyfactory extra is installed, fixturegen scans discovered modules (and sibling *.factories packages) for ModelFactory subclasses and delegates matching models to them so nested relations, JSON samples, and pytest fixtures reuse the exact same logic without rewriting factories.
  • Config toggles under [polyfactory] let you opt out (prefer_delegation = false) or point discovery at custom modules; logs highlight every delegate that gets wired in so you can audit migrations.
  • pfg gen polyfactory ./models.py --out factories.py exports ready-to-import ModelFactory classes whose .build() methods call fixturegen under the hood, so teams can keep Polyfactory-centric APIs while benefiting from deterministic GenerationConfig settings.
  • Seeds/locales stay in sync via the shared SeedManager: delegations reseed Polyfactory’s Faker + Random objects per path, keeping data parity between existing factories and fixturegen runs.

Generation engine

  • Depth-first instance builder with recursion limits and constraint awareness.
  • Deterministic seeds cascade across Python random, Faker, and optional NumPy.
  • NumPy array provider with configurable dtype/shape caps (enable the numpy extra).
  • Numeric distribution controls (uniform, normal, spike) for ints/floats/decimals via the [numbers] configuration block or PFG_NUMBERS__* env vars.
  • Configurable recursion handling with max_depth and cycle_policy (reuse, stub, null) so self-referential models keep realistic data; emitted JSON/fixtures annotate reused references via a cycles metadata block for easy review—even depth-limit fallbacks record exactly which policy fired and why.
  • Portable SplitMix64 RNG core keeps seeds stable across Python versions/OSes, with --rng-mode legacy available for short-term migrations.
  • Heuristic provider mapping that inspects field names, aliases, constraints, and Annotated markers to auto-select providers for emails, slugs, ISO country/language codes, filesystem paths, and more—every decision is surfaced in pfg gen explain with confidence scores.
  • Relation-aware generation: declarative relations config / --link CLI flags reuse pools of generated models so foreign keys and shared references stay in sync, and JSON bundles can include related models via --with-related.
  • Optional validator retries (respect_validators / validator_max_retries) that keep re-generation deterministic while surfacing structured diagnostics when model validators never converge.
  • Field policies for enums, unions, and optional probabilities (p_none).
  • Configuration precedence: CLI → environment (PFG_*) → pyproject/YAML → defaults.

Decision support

  • Alternatives & migration guides compare pydantic-fixturegen with Polyfactory, Pydantic-Factories, factory_boy, and hand-written fixtures so you can justify the tooling choice to stakeholders.
  • Concrete migration playbooks cover Polyfactory delegation, dissolving Faker scripts into presets, and preserving Hypothesis strategies when you move orchestration to fixturegen.
  • Case studies describe how real teams wired fixturegen into snapshot-based CI, schema contracts, anonymisation flows, and data science pipelines.

Hypothesis

  • pydantic_fixturegen.hypothesis.strategy_for(Model) turns the generation metadata into shrinkable Hypothesis strategies that honour the same constraints, providers, and recursion policies as the fixture engine.
  • pfg gen strategies emits a ready-to-import Python module that wires the discovered models into strategy_for, so property-based tests can share the exact same configuration (seed, cycle policy, RNG mode) as CLI/json workflows.
  • Profiles (typical, edge, adversarial) bias the exporter toward larger boundary coverage when you need to stress validators or replicate adversarial scenarios.

Snapshot & diff tooling

  • Snapshot coverage is available everywhere: inside pytest via the pfg_snapshot fixture and from the CLI via pfg snapshot verify/pfg snapshot write, which wrap pfg diff so you can fail CI or refresh artifacts without bespoke scripts.
  • When pytest-regressions is present, the fixture automatically honours --force-regen (regenerate + fail) and --regen-all (regenerate + pass), keeping your existing snapshot workflow intact while still using fixturegen’s deterministic builders.
  • pfg diff and the snapshot runner annotate mismatches with useful hints: field additions/removals in JSON payloads, $defs churn inside schemas, fixture header drift (seed/style/model digest), and the top constraint failures encountered while regenerating. You see _why_ an artifact changed before digging into the raw diff.

Emitters

  • JSON/JSONL with optional orjson, sharding, and metadata banners.
  • High-volume datasets via pfg gen dataset with streaming CSV writers and PyArrow-backed Parquet/Arrow sinks (install the dataset extra); cycle metadata is preserved in a cycles column for downstream QA.
  • Pytest fixtures with deterministic parametrisation, configurable style/scope, and atomic writes.
  • JSON Schema emission with sorted keys and trailing newline stability.

Database seeding

  • pfg gen seed sqlmodel connects to SQLite/Postgres URLs, batches fixturegen payloads into SQLModel/SQLAlchemy sessions, and supports schema creation, truncation, rollback-only runs, and dry-run logging. The allowlist guard ensures you explicitly opt into non-sqlite hosts with --allow-url.
  • pfg gen seed beanie uses Motor to stream documents into MongoDB (or mongomock_motor for local tests); pass --cleanup to delete inserted documents at the end of each run so integration tests can re-use the same collections.
  • Shared helpers in pydantic_fixturegen.testing.seeders.SQLModelSeedRunner turn any SQLModel engine and ModelArtifactPlan into a pytest fixture that seeds inside a transaction before every test.

Deterministic anonymizer

  • pfg anonymize ingests JSON/JSONL payloads (files or directory trees), applies rule-driven scrubbing (faker/hash/mask strategies), and mirrors the original layout when writing sanitized artifacts.
  • Rule bundles live in TOML/YAML/JSON files and support glob/regex patterns, required flags, Faker providers, hash algorithm selection, and mask templates. Profiles such as pii-safe or adversarial pre-seed sensible rule sets and privacy budgets, and you can layer overrides with --salt, --entity-field, or [anonymize.budget] entries.
  • Privacy budgets enforce deterministic redaction quality: if required rules never match or a strategy fails more than allowed, the CLI exits with a structured EmitError. --report writes a JSON summary containing before/after diff samples, per-strategy counts, and active thresholds so CI can archive evidence of each run.
  • Doctor integration: pass --doctor-target models.py to pipeline sanitized data straight into pfg doctor gap detection; the resulting coverage snapshot is embedded in the report for auditing.
  • Python API helpers (anonymize_payloads, anonymize_from_rules) expose the exact pipeline inside applications or bespoke scripts without shelling out to the CLI.

Coverage lockfiles

  • pfg lock captures a manifest of every model/field discovered (coverage counts, provider labels, gap summaries) and writes it to .pfg-lock.json so CI can diff coverage just like dependencies.
  • pfg verify recomputes coverage and compares it against the lockfile, failing with a unified diff when coverage regresses or new fields appear without fresh fixtures/documentation.
  • Manifests store the discovery options (module vs schema vs OpenAPI) so teams can enforce budgets consistently across services; combine with --json-errors for machine-readable CI logs.

Plugins and extensibility

  • Pluggy hooks: pfg_register_providers, pfg_modify_strategy, pfg_emit_artifact.
  • Entry-point discovery for third-party packages.
  • Strategy and provider registries open for customization via Python API or CLI.

Security and sandboxing

  • Safe-import sandbox blocking network access, jailing filesystem writes, and capping memory.
  • Exit code 40 for timeouts, plus diagnostic events for violations.
  • pfg doctor surfaces risky imports and coverage gaps.

Profiles

  • Built-in bundles exposed via --profile or [tool.pydantic_fixturegen].profile: pii-safe, realistic, edge, and adversarial.
  • pii-safe masks identifiers with example.com emails, example.invalid URLs, reserved IP ranges, and deterministic test card numbers while nudging optional PII fields toward None.
  • realistic restores richer Faker/identifier output and keeps optional contact fields populated for staging data.
  • edge randomizes enum/union selection, increases optional None probabilities, and biases numeric sampling toward narrow spikes near min/max.
  • adversarial further increases p_none, constrains collections to a handful of elements, and dials numeric spikes even tighter to stress downstream validators.
  • Profiles compose with presets and explicit overrides, so you can layer additional field policies or configuration on top.

Tooling quality

  • Works on Linux, macOS, Windows for Python 3.10–3.14.
  • Atomic IO for all emitters to prevent partial artifacts.
  • Ruff, mypy, pytest coverage ≥90% enforced in CI.
  • Optional watch mode ([watch] extra), JSON logging, and structured diagnostics for CI integration.

Dive deeper into specific areas via configuration, providers, emitters, and security.

Edit this page