content-model-simulator
v0.6.1
Local-only simulator for Contentful content models. Preview entries, validate schemas, and export ready-to-import bundles — without connecting to Contentful.
Stop designing Contentful models blind.
You push a content type, realize a field is wrong, roll it back. You plan a CMS migration with no idea how the data will actually map. You change a field and hope nothing breaks downstream.
This tool runs Contentful entirely offline — schemas, entries, validation, references, the full pipeline — so you catch mistakes before production sees them.
- Validate models before migration — catch bad field design locally, not after import
- Audit existing spaces without touching production — pull schemas and entries, inspect, iterate, all read-only
- Preview contentful-migration files locally: replay your migration scripts against a mock, with no Contentful connection
Why this matters
"Just push it to the space and see" is the most expensive habit in Contentful development. Every iteration costs:
- Bad field design caught too late: you find out a Symbol should've been a Text after editorial has 200 entries in it
- Broken migrations discovered in production: the import runs, half the references are dangling, you spend a day cleaning up
- Hidden validation gaps between CMS and code — Contentful enforces things your code doesn't, or vice versa, and bugs leak through
- Editorial chaos from untested content models — the model "works" until editors hit the edge cases nobody planned for
This tool collapses the loop: define schemas, point at data (or generate mock data), and get the same Contentful dashboard view you'd see after import — locally, in seconds, without uploading anything.
30-second quick start
npx cms-sim init my-project
cd my-project
npx cms-sim --schemas=schemas/ --open
That's it. You now have a local Contentful preview with mock entries in your browser.
Who this is for
- Content architects designing or iterating on Contentful content models
- Developers planning a migration from WordPress, Sanity, or another CMS to Contentful
- Teams that want to validate content model changes before deploying them
Who this is NOT for
- Running the actual migration (use contentful-migration, contentful-import, or the Management API for that)
- Editing content (this is read-only simulation, not a CMS)
- Non-Contentful platforms (schemas use the exact Contentful format)
What you get
Content Browser
Browse entries exactly like in Contentful. Filter by content type and locale, inspect every field, follow references between entries.
Content Model Graph
Interactive SVG diagram of your content types and their relationships. Zoom, pan, drag. See the full picture at a glance.
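The relationships shown in the graph can be read directly from Contentful-format schemas. The following is a minimal sketch, not the simulator's own implementation: `modelEdges` is a hypothetical helper that assumes Contentful's standard `linkContentType` validation and emits one edge per allowed target type, using `*` when a link is unrestricted.

```javascript
// Hypothetical helper: derive graph edges (content type -> content type)
// from Contentful-format schemas. Not part of cms-sim's API.
function modelEdges(schemas) {
  const edges = [];
  for (const contentType of schemas) {
    for (const field of contentType.fields) {
      // A reference is either a single Link field or an Array whose items are Links.
      const link = field.type === 'Array' ? field.items : field;
      if (!link || link.type !== 'Link' || link.linkType !== 'Entry') continue;

      // linkContentType lists the content types this link may point to.
      const targets = (link.validations ?? []).flatMap((rule) => rule.linkContentType ?? []);
      for (const to of targets.length > 0 ? targets : ['*']) {
        edges.push({ from: contentType.id, field: field.id, to });
      }
    }
  }
  return edges;
}

// Illustrative schemas (made up for this example).
const graphSchemas = [
  {
    id: 'blogPost',
    fields: [
      { id: 'title', type: 'Symbol' },
      { id: 'author', type: 'Link', linkType: 'Entry', validations: [{ linkContentType: ['author'] }] },
      { id: 'related', type: 'Array', items: { type: 'Link', linkType: 'Entry' } },
      { id: 'heroImage', type: 'Link', linkType: 'Asset' },
    ],
  },
  { id: 'author', fields: [{ id: 'name', type: 'Symbol' }] },
];

const graphEdges = modelEdges(graphSchemas);
console.log(graphEdges);
```

Asset links are skipped here because they do not connect two content types; a fuller version might draw them as a separate node kind.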
Core capabilities
| Capability | What it does |
|---|---|
| Offline simulation | 10-step pipeline: load → validate → transform → link → resolve → convert → merge → validate → stats → report. Zero network calls. |
| Contentful validation | Catches missing required fields, unknown fields, unresolved links, field type mismatches — the same errors Contentful would reject. |
| Pull from Contentful | cms-sim pull downloads your existing content model and entries (read-only CDA token). Add --management-token to include all field validations. Modify locally, simulate, then apply when ready. |
| Pull from Sanity | cms-sim pull-sanity converts a Sanity NDJSON export (sanity dataset export) into the same on-disk shape as cms-sim pull. Schema inference, _ref → Link Entry, image refs → Link Asset, Portable Text → RichText, and locale-shape {en, es} fan-out are all handled automatically. Offline, zero deps. |
| Pull from WordPress | cms-sim pull-wordpress converts a WordPress WXR XML export (wp-admin → Tools → Export → All content) into the same on-disk shape as cms-sim pull. Schema inference, taxonomies + author + parent → Link Entry, attachments → assets/assets.json + featured-image → Link Asset, Gutenberg / Classic HTML body → RichText, and Polylang language taxonomy → per-doc locale are all handled automatically. Offline, zero deps. |
| Mock data generator | No data? No problem. Auto-generates realistic entries from your schemas with field-type-aware values and cross-references. |
| CMS migration preview | Feed WordPress XML, Sanity NDJSON, or generic JSON exports alongside your Contentful schemas. See exactly how the migrated content will look. |
| CI/CD validation | cms-sim validate --json for pipelines. Exit code 1 on errors. |
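To make the validation row concrete, here is a small sketch of the four error classes it names (missing required fields, unknown fields, unresolved links, type mismatches). It is an illustration under stated assumptions, not the simulator's code: `validateEntry`, `typeChecks`, and the flat `entry.fields` shape (no per-locale wrapping) are hypothetical simplifications.

```javascript
// Hypothetical, deliberately partial type checks for a few Contentful field types.
const typeChecks = {
  Symbol: (v) => typeof v === 'string',
  Text: (v) => typeof v === 'string',
  Integer: (v) => Number.isInteger(v),
  Number: (v) => typeof v === 'number' && Number.isFinite(v),
  Boolean: (v) => typeof v === 'boolean',
  Array: (v) => Array.isArray(v),
  Link: (v) => v?.sys?.type === 'Link' && typeof v.sys.id === 'string',
};

// Returns a list of human-readable error strings for one entry.
// knownIds is a Set of every entry/asset id present in the data set.
function validateEntry(schema, entry, knownIds) {
  const errors = [];
  for (const field of schema.fields) {
    const value = entry.fields[field.id];
    if (value === undefined || value === null) {
      if (field.required) errors.push(`missing required field: ${field.id}`);
      continue;
    }
    const check = typeChecks[field.type];
    if (check && !check(value)) {
      errors.push(`type mismatch on ${field.id}: expected ${field.type}`);
      continue;
    }
    if (field.type === 'Link' && !knownIds.has(value.sys.id)) {
      errors.push(`unresolved link on ${field.id}: ${value.sys.id}`);
    }
  }
  // Anything present on the entry but not declared in the schema is unknown.
  const declared = new Set(schema.fields.map((f) => f.id));
  for (const id of Object.keys(entry.fields)) {
    if (!declared.has(id)) errors.push(`unknown field: ${id}`);
  }
  return errors;
}

const postSchema = {
  id: 'blogPost',
  fields: [
    { id: 'title', type: 'Symbol', required: true },
    { id: 'views', type: 'Integer' },
    { id: 'author', type: 'Link', linkType: 'Entry' },
  ],
};
const brokenEntry = {
  fields: {
    views: '12', // string where an Integer is expected
    author: { sys: { type: 'Link', linkType: 'Entry', id: 'missing-author' } },
    subtitle: 'not in the schema',
  },
};

const entryErrors = validateEntry(postSchema, brokenEntry, new Set(['author-1']));
console.log(entryErrors);
```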
Real workflows
1. From-scratch model design
Sketch a model, generate mock entries, see how it actually feels in the Contentful UI — before any space exists.
npx cms-sim --schemas=schemas/ --open
npx cms-sim --schemas=schemas/ --locales=en,es,fr --entries-per-type=10 --open
2. Existing space audit (pull → inspect → iterate)
Read-only pull from a live space. Inspect schemas and entries locally, change things, re-simulate. Production never sees the iteration.
# Download your current model (read-only)
npx cms-sim pull --space-id=YOUR_SPACE --access-token=YOUR_CDA_TOKEN --output=my-project/
# ⭐ Include ALL field validations (in, regexp, size, range, unique)
npx cms-sim pull --space-id=YOUR_SPACE --access-token=YOUR_CDA_TOKEN \
--management-token=YOUR_CMA_TOKEN --output=my-project/
# Preview locally, modify schemas, re-run
npx cms-sim --schemas=my-project/schemas/ --open
# With real entries
npx cms-sim pull --space-id=abc123 --access-token=TOKEN --include-entries --output=my-project/
npx cms-sim --schemas=my-project/schemas/ --input=my-project/data/entries.ndjson --open
Why --management-token? The Contentful CDA may omit editor-level validations (in, regexp, size, range, unique) from content type responses. Adding a CMA token ensures your pulled schemas include every validation rule, so the simulator can catch the same errors Contentful would reject.
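For reference, this is roughly what those validation rules look like on a field in Contentful's content type format, together with a small checker. The field definitions follow Contentful's documented validation shapes; `checkValue` is a hypothetical helper written for this illustration and is not how the simulator evaluates rules. The `unique` rule is shown but not checked, since it needs every entry of the content type rather than a single value.

```javascript
// Field definitions carrying the five validation kinds named above.
const slugField = {
  id: 'slug',
  name: 'Slug',
  type: 'Symbol',
  required: true,
  validations: [
    { unique: true },
    { size: { min: 3, max: 60 } },
    { regexp: { pattern: '^[a-z0-9-]+$', flags: null } },
  ],
};
const statusField = {
  id: 'status',
  name: 'Status',
  type: 'Symbol',
  validations: [{ in: ['draft', 'review', 'live'] }],
};
const ratingField = {
  id: 'rating',
  name: 'Rating',
  type: 'Integer',
  validations: [{ range: { min: 1, max: 5 } }],
};

// Hypothetical checker for a single value (unique is intentionally skipped).
function checkValue(field, value) {
  const errors = [];
  for (const rule of field.validations ?? []) {
    if (rule.in && !rule.in.includes(value)) {
      errors.push(`${field.id}: value is not in the allowed list`);
    }
    if (rule.regexp) {
      // Contentful stores flags as null when unset; RegExp needs undefined.
      const re = new RegExp(rule.regexp.pattern, rule.regexp.flags ?? undefined);
      if (!re.test(value)) errors.push(`${field.id}: does not match ${rule.regexp.pattern}`);
    }
    if (rule.size) {
      const min = rule.size.min ?? -Infinity;
      const max = rule.size.max ?? Infinity;
      if (value.length < min || value.length > max) errors.push(`${field.id}: length out of bounds`);
    }
    if (rule.range) {
      const min = rule.range.min ?? -Infinity;
      const max = rule.range.max ?? Infinity;
      if (value < min || value > max) errors.push(`${field.id}: number out of range`);
    }
  }
  return errors;
}

console.log(checkValue(slugField, 'Hello World'));
console.log(checkValue(slugField, 'hello-world'));
console.log(checkValue(statusField, 'archived'));
console.log(checkValue(ratingField, 7));
```

Without these rules in the pulled schema, none of the failing values above could be flagged locally, which is the gap the CMA token closes.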
3. WordPress / Sanity migration planning
Point the simulator at a real export. See exactly how the migrated content will land — fields, references, locales, the works — before importing anything.
# WordPress XML
npx cms-sim --schemas=schemas/ --input=data/export.xml --open
# Sanity NDJSON (with custom transformers)
npx cms-sim --schemas=schemas/ --input=data/export.ndjson --transforms=transforms/ --open
# Auto-scaffold schemas + transforms from a WordPress export
npx cms-sim scaffold --input=data/export.xml --output=my-project/
cms-sim pull-sanity: zero-config Sanity → Contentful
When you don't want to hand-write transforms/, pull-sanity converts a Sanity dataset NDJSON export (the file sanity dataset export produces) into the same on-disk shape that cms-sim pull writes for a real Contentful space — contentful-space.json + schemas/<type>.js + data/entries.ndjson + assets/assets.json — in one command:
# Generate the NDJSON from your Sanity project (their CLI, not ours)
sanity dataset export production export.ndjson
# Convert it to Contentful shape (offline, zero deps, read-only)
npx cms-sim pull-sanity --input=export.ndjson --output=pulled-sanity/
# Now everything downstream works the same as with cms-sim pull
npx cms-sim --schemas=pulled-sanity/schemas/ --input=pulled-sanity/data/entries.ndjson --open
npx cms-sim to-import --input=output/ --schemas=pulled-sanity/schemas/ --output=bundle/
What it handles automatically:
- Schema inference from real document samples: Symbol / Text / Integer / Number / Boolean / Date / Object / RichText / Link Entry / Link Asset / Array variants. Disagreement between samples picks the dominant shape and emits a warning.
- _ref → Link Entry rewrites for both top-level and nested arrays, plus linkContentType validations derived from the cross-document refs actually present in the corpus.
- Image refs → Link Asset alongside a generated assets/assets.json index (id, title, fileName, contentType, url, size).
- Portable Text → Contentful RichText (paragraph / heading / lists / blockquote / decorators / hyperlinks). Unknown marks, custom block types, and nested list levels emit explicit warnings while preserving the underlying text.
- Locale detection ({en, es}, {en-US, pt-BR}, …). Each Sanity doc fans out to one document per detected locale; the matching schema fields are marked localized: true. contentful-space.json reflects the detected locales.
The example at example-sanity-pull/ is a runnable walk-through against the link-vehicles dataset.
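As an illustration of the reference rewriting described above, the sketch below turns Sanity `_ref` pointers into Contentful link objects. It is a simplified stand-in, not the code behind pull-sanity: `rewriteRefs` and `toLink` are hypothetical names, and the rule that an `image` object with an `asset._ref` becomes an Asset link is an assumption based on Sanity's usual image shape.

```javascript
// Build a Contentful link object of the given linkType ('Entry' or 'Asset').
function toLink(id, linkType) {
  return { sys: { type: 'Link', linkType, id } };
}

// Hypothetical recursive rewrite: walks any JSON value and replaces
// Sanity references with Contentful links, at top level or nested in arrays.
function rewriteRefs(value) {
  if (Array.isArray(value)) return value.map(rewriteRefs);
  if (value === null || typeof value !== 'object') return value;

  // Sanity image: { _type: 'image', asset: { _ref: '...' } } -> Link Asset
  if (value._type === 'image' && value.asset && typeof value.asset._ref === 'string') {
    return toLink(value.asset._ref, 'Asset');
  }
  // Plain reference: { _type: 'reference', _ref: '...' } -> Link Entry
  if (typeof value._ref === 'string') return toLink(value._ref, 'Entry');

  const out = {};
  for (const [key, child] of Object.entries(value)) out[key] = rewriteRefs(child);
  return out;
}

// Made-up document in the shape of one line of a Sanity NDJSON export.
const sanityDoc = {
  _id: 'post-1',
  _type: 'post',
  title: 'Hello',
  author: { _type: 'reference', _ref: 'author-1' },
  cover: { _type: 'image', asset: { _type: 'reference', _ref: 'image-abc123-800x600-jpg' } },
  related: [{ _type: 'reference', _ref: 'post-2', _key: 'k1' }],
};

const rewrittenDoc = rewriteRefs(sanityDoc);
console.log(JSON.stringify(rewrittenDoc, null, 2));
```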
cms-sim pull-wordpress — zero-config WordPress → Contentful
The same single-command path, but for WordPress WXR XML exports (the file wp-admin's Tools → Export → All content produces):
# Export from WordPress
# wp-admin → Tools → Export → All content → "Download Export File"
# Convert it to Contentful shape (offline, zero deps, read-only)
npx cms-sim pull-wordpress --input=wp-export.xml --output=pulled-wp/
# Now everything downstream works the same as with cms-sim pull
npx cms-sim --schemas=pulled-wp/schemas/ --input=pulled-wp/data/entries.ndjson --open
npx cms-sim to-import --input=output/ --schemas=pulled-wp/schemas/ --output=bundle/
What it handles automatically:
- Default content-type filtering. nav_menu_item, wp_navigation, wp_template, wp_template_part, wp_global_styles, wp_block, oembed_cache, customize_changeset, custom_css, revision are dropped because they have no Contentful equivalent. --skip-types=<csv> adds more.
- Stable, prefixed ids per type: wp_<postId> for posts/pages, wp_author_<login>, wp_category_<slug>, wp_tag_<slug>, so cross-references resolve without ambiguity.
- Reference rewriting. post.author (login) → Link Entry; post.categories[] / post.tags[] (slugs) → Array<Link Entry>; category.parent (slug) → self-referential Link Entry. linkContentType validations derived from actual cross-doc refs.
- Attachments → assets/assets.json with {id, title, fileName, contentType, url, size}. Featured-image references (<wp:postmeta> key _thumbnail_id, a post-id pointer) resolve to featuredImage: { sys: Link Asset } on the parent document.
- Gutenberg / Classic HTML body → Contentful RichText via htmlToRichText. Block elements, inline marks, hyperlinks, and inline <img> (as embedded-asset-block placeholders) all map.
- Polylang multi-locale detection. <category domain="language" nicename="…"> on items → per-doc locale tag. All distinct locales surfaced in contentful-space.json. Single-locale sites just inherit the channel <language>.
The example at example-wp-pull/ is a runnable walk-through with a synthetic WXR fixture exercising authors, nested categories, tags, attachments, featured images, Gutenberg + Classic bodies, and the nav-menu filter.
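The id scheme and reference rewriting listed above can be pictured with a short sketch. The id prefixes come from the list; everything else is assumed for illustration: `wpId`, `entryLink`, `linkPostReferences`, and the input shape of `post` are hypothetical and do not describe the internals of pull-wordpress.

```javascript
// Prefixed id builders, following the documented wp_* scheme.
const wpId = {
  post: (postId) => `wp_${postId}`,
  author: (login) => `wp_author_${login}`,
  category: (slug) => `wp_category_${slug}`,
  tag: (slug) => `wp_tag_${slug}`,
};

// Contentful entry link pointing at the given id.
const entryLink = (id) => ({ sys: { type: 'Link', linkType: 'Entry', id } });

// Hypothetical rewrite of one parsed post: logins and slugs become entry links.
function linkPostReferences(post) {
  return {
    id: wpId.post(post.postId),
    author: entryLink(wpId.author(post.author)),
    categories: post.categories.map((slug) => entryLink(wpId.category(slug))),
    tags: post.tags.map((slug) => entryLink(wpId.tag(slug))),
  };
}

// Made-up post, as it might look after parsing a WXR <item>.
const linkedPost = linkPostReferences({
  postId: 42,
  author: 'jane',
  categories: ['news'],
  tags: ['cms', 'migration'],
});
console.log(JSON.stringify(linkedPost, null, 2));
```

Because each prefix encodes the type, a category and a tag that share the slug `news` still get distinct ids, which is what keeps the links unambiguous.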
4. contentful-migration preview (from-migrations)
You already have migration scripts as your source of truth. Replay them against a mock and see the resulting model — without running them against a real Contentful environment.
# Convert migration files → cms-sim schemas (no Contentful connection needed)
npx cms-sim from-migrations --migrations=./migrations/ --output=./schemas/
# TypeScript migration files (requires tsx)
npx tsx $(which cms-sim) from-migrations --migrations=./migrations/ --output=./schemas/
# Then simulate as usual
npx cms-sim --schemas=./schemas/ --open
Why? Many teams already have contentful-migration files as their content model source of truth. from-migrations lets you preview the model locally without running migrations against a real Contentful space.
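To show what "replaying against a mock" means in practice, here is a minimal sketch. `migrationScript` uses the fluent-chaining style of the contentful-migration API (`createContentType`, `createField`, `name`, `type`, `required`, `displayField`). `createRecordingMock` is a hypothetical stand-in written for this example; it is not the package's MigrationMock and supports only the handful of methods shown.

```javascript
// A migration in contentful-migration's fluent style. In a real migration
// file this function would be the module's export.
function migrationScript(migration) {
  const blogPost = migration
    .createContentType('blogPost')
    .name('Blog Post')
    .displayField('title');

  blogPost.createField('title').name('Title').type('Symbol').required(true);
  blogPost.createField('body').name('Body').type('RichText');
}

// Hypothetical recording mock: captures calls into plain schema objects
// instead of sending anything to Contentful.
function createRecordingMock() {
  const contentTypes = [];

  // Returns an object whose setters store a property and return itself for chaining.
  const chainable = (target, props) => {
    const api = {};
    for (const prop of props) {
      api[prop] = (value) => {
        target[prop] = value;
        return api;
      };
    }
    return api;
  };

  const migration = {
    createContentType(id) {
      const contentType = { id, fields: [] };
      contentTypes.push(contentType);
      const api = chainable(contentType, ['name', 'description', 'displayField']);
      api.createField = (fieldId) => {
        const field = { id: fieldId };
        contentType.fields.push(field);
        return chainable(field, ['name', 'type', 'required', 'localized', 'linkType', 'items', 'validations']);
      };
      return api;
    },
  };
  return { migration, contentTypes };
}

const recorder = createRecordingMock();
migrationScript(recorder.migration);
console.log(JSON.stringify(recorder.contentTypes, null, 2));
```

The recorded objects have the same shape as the schema files described under Content Type Schema Format, which is why a replayed migration can feed the simulator directly.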
5. Bridge simulation → real migration (to-import)
When the simulation looks right, export a JSON bundle that the official contentful-import CLI consumes. The simulator never connects to Contentful — you run contentful-import yourself when ready.
# 1. Simulate locally (iterate until the model looks right)
npx cms-sim --schemas=./schemas/ --output=./output/my-sim --name=my-sim
# 2. Export the bundle
npx cms-sim to-import \
--input=./output/my-sim \
--schemas=./schemas/ \
--output=./bundle.json
# 3. Push to Contentful with their official tool
npx contentful-import \
--space-id=YOUR_SPACE \
--management-token=$CMA_TOKEN \
--content-file=./bundle.json
Default behavior is intentional. Entries import as draft (you publish from the Contentful UI when ready); pass --publish to mark everything as published on import. The export refuses if the simulation has validation errors; pass --allow-errors to override. Editor interfaces are omitted because Contentful creates defaults server-side. End-to-end example with annotated schemas: examples/contentful-import/.
Used in real projects
This tool was built and dogfooded across actual content modeling work:
- WordPress → Contentful migration planning — Gutenberg-heavy blog imports previewed before touching the live space (examples/wordpress/)
- Sanity → Contentful evaluation — multi-locale NDJSON imports with custom transformers, validated before any data was written to Contentful (examples/sanity/)
- Existing Contentful space audits — pulled schemas and entries from production spaces (verified against real spaces with 5,200+ entries and 23 content types) without modifying anything
- contentful-migration script previews: migration files replayed locally to catch field-design mistakes before running them against a real environment
Help wanted: real WordPress data (especially with ACF)
cms-sim pull-wordpress ships in v0.6.1 with synthetic WXR fixtures and the example-wp-pull/ walk-through (12 docs, Gutenberg + Classic bodies, 2 attachments, nested categories, a featured-image flow). That covers the common shapes but misses edge cases that only show up in real production exports:
- ACF (Advanced Custom Fields) — repeaters, flexible content, post-object refs, gallery / image fields, conditional logic
- Polylang / WPML with real translation grouping (the synthetic fixture only exercises the per-doc language taxonomy)
- Custom post types + ACF combos
- WooCommerce product structures (product attributes, variations, custom taxonomies)
- Gutenberg blocks beyond the basics — custom blocks, embeds, columns, reusable blocks
If you maintain a WordPress site and can share a WXR export (wp-admin → Tools → Export → All content), it would help shake out edge cases before they hit users. An anonymized export is fine — replace post bodies with lorem ipsum, scrub user emails / display names — the schema shape and meta keys are what matters.
Drop the file in a GitHub issue with a short note about the plugins / post-types used (ACF? WooCommerce? Polylang?), a gist link, or any file-share. Thanks.
What this tool does NOT do
- Does NOT upload, create, or modify anything in your Contentful space
- Does NOT run migrations: that's contentful-migration / contentful-import
- Does NOT make network calls during simulation: cms-sim pull is the only command that reads from Contentful (read-only CDA token, optional CMA token for full validations)
This is a simulation tool, not a migration tool. Once your simulation looks correct, you use Contentful's own tools to perform the actual migration.
CLI Reference
Simulate (default)
cms-sim --schemas=<dir> [options]
REQUIRED:
--schemas=<dir> Content type definitions directory (.js/.mjs/.json)
DATA SOURCE (optional):
--input=<path> Source data (NDJSON, JSON, XML, or directory)
If omitted, mock entries are auto-generated
OPTIONS:
--transforms=<dir> Custom transformer modules directory
--plugins=<dir> Plugin directory (auto-discovers schemas/, transforms/, setup files)
--config=<file> JSON config file (cms-sim.config.json)
--output=<dir> Output directory (default: ./output/<name>_<timestamp>)
--name=<string> Project name
--base-locale=<code> Base locale (default: en)
--locales=<list> Comma-separated locale codes
--locale-map=<file> JSON file mapping source → target locale codes
--entries-per-type=<n> Mock entries per content type (default: 3)
--content-type=<id> Filter to a specific content type
--format=<fmt> Input format: ndjson, json-array, json-dir, wxr, sanity, auto (default: auto)
--json JSON output only (skip HTML)
--open Auto-open in browser
--watch, -w Re-run on file changes with browser auto-reload
--template-css=<file> Custom CSS for HTML output
--template-head=<file> Custom HTML for <head>
--verbose, -v Verbose logging
--help, -h Show help
Pull
cms-sim pull --space-id=<id> --access-token=<token> [options]
--management-token=<tok> CMA token — includes ALL field validations
(in, regexp, size, range, unique)
Or set CONTENTFUL_MANAGEMENT_TOKEN env var
--environment=<env> Environment (default: master)
--output=<dir> Output directory (default: ./contentful-export)
--include-entries Download published entries
--include-assets Download asset files
--max-entries=<n> Max entries (default: 1000)
--content-type=<id> Filter entries by content type
--preview Use Content Preview API (drafts)
Validate
cms-sim validate --schemas=<dir> [options]
Exits with code 1 if errors found. Use --json for machine-readable output.
Diff
cms-sim diff --old=<dir> --new=<dir> [options]
--json Output diff as JSON to stdout
--html=<file> Write a visual HTML diff (KPI cards + per-CT drilldown +
entry-count bars + issues lists)
--open Open the generated HTML in the browser
Auto-detects mode: if both dirs contain manifest.json, performs a full
report diff (schemas + entry counts + validation issues + stats);
otherwise compares schemas only.
To Import (export bundle for contentful-import)
cms-sim to-import --input=<sim-dir> --schemas=<dir> --output=<file.json> [options]
--input=<dir> Simulation output directory (manifest.json, entries/, ...)
--schemas=<dir> Original schemas dir (.js/.mjs/.json) — needed for full
CT fidelity (validations, defaultValues, descriptions)
--output=<file> Output JSON file path
--space-id=<id> Placeholder space id for sys.space (default: __simulation__)
--environment=<id> Placeholder environment id (default: master)
--publish Mark every entry/asset as published on import (default: draft)
--allow-errors Export even if simulation has errors (default: blocked)
Produces a JSON bundle compatible with the official `contentful-import` CLI.
The simulator never connects to Contentful — you run `contentful-import` yourself
when ready: `npx contentful-import --content-file=<file.json>`.
From Migrations
cms-sim from-migrations --migrations=<dir> [options]
cms-sim from-migrations file1.js file2.js [options]
--migrations=<dir> Directory of migration files (.js/.mjs/.cjs/.ts)
--output=<dir> Output directory for schemas (default: ./schemas)
--verbose, -v Print each file being loaded
Supports both prop-style and fluent-chaining contentful-migration APIs.
TypeScript files require running under tsx.
Init & Scaffold
cms-sim init [<name>] # Scaffold a new project with example schemas
cms-sim init . # Scaffold in current directory
cms-sim scaffold --input=<file.xml> # Auto-generate schemas from WordPress XML
Security: cms-sim pull only reads from Contentful and never writes. Schema and transform files are loaded via dynamic import(). Only point --schemas / --transforms at directories you trust. Tokens are never stored or logged.
Programmatic API
import fs from 'node:fs'; // needed for the writeFileSync calls below
import {
simulate, generateMockData, SchemaRegistry,
generateContentBrowserHTML, generateModelGraphHTML, writeReport,
} from 'content-model-simulator';
// Load schemas
const schemas = new SchemaRegistry();
await schemas.loadFromDirectory('./schemas');
// Generate mock data (or use readDocuments() for real data)
const { documents, assets } = generateMockData(schemas, {
entriesPerType: 5,
locales: ['en', 'es'],
});
// Simulate
const report = simulate({
documents, schemas, assets,
options: { name: 'my-model', locales: ['en', 'es'] },
});
// Write outputs
writeReport(report, './output');
fs.writeFileSync('./output/content-browser.html', generateContentBrowserHTML(report));
fs.writeFileSync('./output/visual-report.html', generateModelGraphHTML(report));
Full API exports: simulate, readDocuments, readDocumentsStream, SchemaRegistry, TransformerRegistry, generateMockData, generateContentBrowserHTML, generateModelGraphHTML, generateDiffHTML, writeReport, diffSchemas, diffReports, pull, fromMigrations, fromMigration, writeMigrationSchemas, discoverMigrationFiles, MigrationMock, toContentfulImport, readWXR, parseWXR, readSanity, parseSanity, htmlToRichText, looksLikeHTML, isRichTextDocument, stripGutenbergComments.
Content Type Schema Format
// schemas/blogPost.mjs
export default {
id: 'blogPost',
name: 'Blog Post',
displayField: 'title',
fields: [
{ id: 'title', name: 'Title', type: 'Symbol', required: true, localized: true },
{ id: 'body', name: 'Body', type: 'RichText', required: true, localized: true },
{ id: 'author', name: 'Author', type: 'Link', linkType: 'Entry' },
{ id: 'heroImage', name: 'Hero Image', type: 'Link', linkType: 'Asset' },
{ id: 'tags', name: 'Tags', type: 'Array', items: { type: 'Symbol' } },
],
};
Schemas can be .js (ESM default export), .mjs, or .json files. They use the exact Contentful content type definition format.
Custom Transformers
// transforms/event.js
export function register(registry) {
registry.register('sourceType', (doc, locale, options) => ({
id: `event-${doc.data.slug}-${locale}`,
contentType: 'event',
locale,
fields: {
title: { [locale]: doc.data.eventName },
date: { [locale]: new Date(doc.data.timestamp).toISOString() },
},
}), 'event');
}
Config File
{
"name": "my-project",
"input": "./data/export.ndjson",
"schemas": "./schemas",
"transforms": "./transforms",
"baseLocale": "en",
"locales": ["en", "es", "fr"],
"localeMap": { "en_US": "en", "es_MX": "es" }
}
Output Structure
output/my-project_2026-04-12/
├── content-types/ # CT definition JSON files
├── entries/ # Entries grouped by content type
├── assets.json # Extracted assets
├── validation-report.json
├── manifest.json # Summary stats
├── content-browser.html # Interactive entry browser
└── visual-report.html # Content model graph
CMS Migration Guides
| Source CMS | Guide |
|---|---|
| WordPress | examples/wordpress/ (end-to-end with real Gutenberg data) |
| Sanity | examples/sanity/ (end-to-end with multi-locale NDJSON) |
License
MIT
