schema-drift
v1.1.3
Published
Database schema drift detector for polyglot Node.js stacks — snapshots and diffs MySQL and MongoDB schemas over time.
Maintainers
Readme
schema-drift
Database schema drift detector for polyglot Node.js stacks — snapshots and diffs MySQL and MongoDB schemas over time.
Why schema-drift?
Your database silently changes over time. A developer runs ALTER TABLE directly on production.
A new code path starts inserting MongoDB documents with a different field structure.
Someone renames a field but doesn't migrate existing documents.
schema-drift catches all of this by snapshotting your actual database state and diffing it over time —
across both MySQL and MongoDB simultaneously.
Install
npm install -g schema-driftQuick start
schema-drift snapshot # capture baseline
schema-drift snapshot # capture again after changes
schema-drift diff # see what changedCommands
| Command | Description |
|---------|-------------|
| snapshot | Capture current schema state |
| diff | Compare two snapshots, show drift |
| status | Quick health summary |
| list | List all saved snapshots |
Configuration
Create .schemadriftrc in your project root:
{
"label": "production",
"sampleSize": 500,
"connections": {
"mysql": { "host": "localhost", "user": "root", "password": "pass", "database": "myapp" },
"mongodb": { "uri": "mongodb://localhost:27017", "database": "myapp" }
}
}How MongoDB sampling works
MongoDB is schemaless — there is no information_schema. Instead, schema-drift samples
up to sampleSize documents per collection and builds a statistical field profile:
which fields exist, how frequently, and what types they contain.
Drift is reported when type distributions shift significantly between snapshots.
Local development
Prerequisites
- Node.js ≥ 18
- npm ≥ 9
- Docker (for running MySQL and MongoDB locally)
Setup
git clone https://github.com/santoshkumar-in/schema-drift.git
cd schema-drift
npm installRun locally without installing globally
# Run directly
node src/index.js snapshot
# Or link it so `schema-drift` resolves as a command globally on your machine
npm link
schema-drift snapshot
npm linkcreates a symlink from your globalbintosrc/index.js. Runnpm unlink -g schema-driftto remove it.
Project structure
src/
├── index.js # CLI entry point — Commander.js wiring
├── commands/
│ ├── snapshot.js # schema-drift snapshot
│ ├── diff.js # schema-drift diff
│ ├── status.js # schema-drift status
│ └── list.js # schema-drift list
├── core/
│ ├── mysql.js # MySQL introspection via information_schema
│ ├── mongo.js # MongoDB statistical schema inference
│ ├── differ.js # Snapshot comparison engine
│ ├── store.js # Snapshot file persistence (.schema-drift/)
│ └── suggestions.js # Migration code generator
└── utils/
├── config.js # cosmiconfig loader for .schemadriftrc
└── render.js # terminal output — drift report, tables, summaryKey modules to know
src/core/mongo.js is the most interesting module. inferCollectionSchema() samples documents via $sample, walks every document recursively with walkDocument() to extract field paths, and builds a frequency and type distribution map. This is the statistical inference engine that makes MongoDB schema tracking possible without a fixed schema.
src/core/differ.js compares two snapshots. For MySQL it diffs column types, nullability, and indexes. For MongoDB it detects type distribution shifts, frequency changes, and field renames using camelCase/snake_case heuristics.
src/core/store.js manages snapshot persistence. Snapshots are stored as JSON files in .schema-drift/ with an index.json tracking the latest per label. You can commit these files to get a full audit trail in git history.
Running tests
npm testTests use Jest with --experimental-vm-modules for ESM support. Test files live in __tests__/ and require no real database — MySQL and MongoDB are fully mocked.
To test a specific suite:
node --experimental-vm-modules node_modules/.bin/jest differ
node --experimental-vm-modules node_modules/.bin/jest mongoTesting against real databases
The unit tests mock all database calls. The most useful local test is running the full end-to-end flow against live databases — real connections, real snapshots, real drift detection.
1. Start databases with Docker
Create docker-compose.yml in a scratch folder (e.g. ~/schema-drift-testenv/):
version: "3.8"
services:
mysql:
image: mysql:8.0
environment:
MYSQL_ROOT_PASSWORD: testpass
MYSQL_DATABASE: testapp
ports:
- "3306:3306"
healthcheck:
test: ["CMD", "mysqladmin", "ping", "-h", "localhost", "-ptestpass"]
interval: 5s
retries: 10
mongodb:
image: mongo:7
ports:
- "27017:27017"
healthcheck:
test: ["CMD", "mongosh", "--eval", "db.runCommand('ping')"]
interval: 5s
retries: 10cd ~/schema-drift-testenv
docker compose up -d
docker compose ps # wait until both show healthy (~15s)2. Seed test data
Save as seed.sh in the same folder and run it:
#!/usr/bin/env bash
set -e
MYSQL_CTR=$(docker ps --filter "ancestor=mysql:8.0" --format "{{.Names}}" | head -1)
MONGO_CTR=$(docker ps --filter "ancestor=mongo:7" --format "{{.Names}}" | head -1)
docker exec "$MYSQL_CTR" mysql -uroot -ptestpass testapp -e "
CREATE TABLE IF NOT EXISTS users (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(100) NOT NULL,
email VARCHAR(255) NOT NULL UNIQUE,
created_at DATETIME DEFAULT NOW()
);
CREATE TABLE IF NOT EXISTS orders (
id INT PRIMARY KEY AUTO_INCREMENT,
user_id INT NOT NULL,
total DECIMAL(10,2),
status VARCHAR(20) DEFAULT 'pending',
created_at DATETIME DEFAULT NOW()
);
INSERT IGNORE INTO users (name, email) VALUES
('Priya Sharma', '[email protected]'),
('Rahul Verma', '[email protected]');
INSERT INTO orders (user_id, total, status) VALUES
(1, 1299.00, 'delivered'), (1, 450.50, 'pending');
"
docker exec "$MONGO_CTR" mongosh testapp --eval "
db.products.insertMany([
{ name: 'Laptop', price: 55000, stock: 10, tags: ['electronics'] },
{ name: 'Phone', price: 18000, stock: 45, tags: ['mobile'] }
]);
db.events.insertMany([
{ type: 'page_view', userId: 'u1', url: '/home', timestamp: new Date() },
{ type: 'click', userId: 'u2', url: '/product', timestamp: new Date() }
]);
"
echo "✓ MySQL seeded"
echo "✓ MongoDB seeded"chmod +x seed.sh && bash seed.sh3. Configure schema-drift
In your schema-drift project root, create .schemadriftrc:
{
"label": "local",
"sampleSize": 100,
"connections": {
"mysql": { "host": "localhost", "user": "root", "password": "testpass", "database": "testapp" },
"mongodb": { "uri": "mongodb://localhost:27017", "database": "testapp" }
}
}4. Take a baseline snapshot
node src/index.js snapshot --label local5. Introduce real drift
MYSQL_CTR=$(docker ps --filter "ancestor=mysql:8.0" --format "{{.Names}}" | head -1)
MONGO_CTR=$(docker ps --filter "ancestor=mongo:7" --format "{{.Names}}" | head -1)
docker exec "$MYSQL_CTR" mysql -uroot -ptestpass testapp -e "
ALTER TABLE users ADD COLUMN phone VARCHAR(15);
ALTER TABLE orders MODIFY COLUMN total VARCHAR(20);
CREATE INDEX idx_orders_status ON orders(status);
"
docker exec "$MONGO_CTR" mongosh testapp --eval '
db.products.updateMany({ name: "Phone" }, { $set: { price: "18000" } });
db.products.insertOne({ name: "Keyboard", cost: 1500, stock: 15 });
db.events.updateMany({ type: "click" }, { $set: { sessionId: "abc123" } });
'6. Snapshot again and diff
node src/index.js snapshot --label local
node src/index.js diff --label localThe report catches all 6 changes: the new phone column, the total type change (error), the new index, MongoDB price type drift (error), the price → cost possible rename, and the new sessionId field.
7. Try the other commands
node src/index.js status --label local # health summary
node src/index.js list --label local # list snapshots on disk
node src/index.js diff --label local --db mysql # MySQL only
node src/index.js diff --label local --ci # exits 1 on errorsTeardown
docker compose down -v # -v wipes volumes and all seeded dataContributing
Bug reports and pull requests are welcome. For significant changes, open an issue first to discuss the approach.
When contributing:
- Tests required for all new behaviour —
npm testmust pass - No real database required for tests — keep all DB calls mocked
- Keep commits focused — one logical change per commit
License
MIT © Santosh Kumar
