@symbiosedb/auto-seed

v1.0.1

Auto-seed functionality for SymbioseDB - generate realistic mock data for all database types

Auto-Seed: Smart Mock Data Generation

Intelligent mock data generation across all 4 database types with cross-DB relationship awareness

Generate realistic test data automatically for PostgreSQL, Vector, Graph, and Blockchain databases with full foreign key awareness and cross-database consistency.

✨ Features

  • Foreign Key Intelligence - Automatically picks existing IDs for FK columns
  • Cross-Database Aware - Maintains ID consistency across SQL/Vector/Graph/Blockchain
  • Dependency Resolution - Automatically determines correct seed order using topological sort
  • Realistic Data - Context-aware generation (emails, names, addresses, etc.)
  • Reproducible - Seed support for consistent test data
  • Locale Support - Generate region-specific data (en_US, fr_FR, etc.)
  • 100% Test Coverage - 44/44 tests passing, strict TDD methodology

🚀 Quick Start

import { SeedOrchestrator } from '@symbiosedb/auto-seed';

const orchestrator = new SeedOrchestrator();

// Define your schema
const usersSchema = {
  dbType: 'sql',
  tableName: 'users',
  columns: [
    { name: 'id', type: 'uuid', nullable: false, isPrimaryKey: true },
    { name: 'email', type: 'string', nullable: false },
    { name: 'name', type: 'string', nullable: false },
  ],
  primaryKeys: [],
  foreignKeys: [],
  uniqueConstraints: [],
};

// Seed 100 users
const result = await orchestrator.seedTable(usersSchema, 100);

console.log(result);
// {
//   tableName: 'users',
//   dbType: 'sql',
//   recordsCreated: 100,
//   duration: 45, // milliseconds
//   preview: [
//     { id: '123e4567-e89b-12d3-a456-426614174000', email: 'john.doe@example.com', name: 'John Doe' },
//     { id: '223e4567-e89b-12d3-a456-426614174001', email: 'jane.smith@example.com', name: 'Jane Smith' },
//     // ... first 5 records
//   ]
// }

📦 Components

1. SchemaAnalyzer

Analyzes table/collection schemas to detect constraints and relationships.

import { SchemaAnalyzer } from '@symbiosedb/auto-seed';

const analyzer = new SchemaAnalyzer();

// Detect primary keys
const primaryKeys = analyzer.detectPrimaryKeys(schema);

// Detect foreign keys
const foreignKeys = analyzer.detectForeignKeys(schema);

// Detect unique constraints
const constraints = analyzer.detectUniqueConstraints(schema);

// Works with all 4 DB types: SQL, Vector, Graph, Blockchain

2. DependencyGraph

Builds dependency graph and determines correct seed order.

import { DependencyGraph } from '@symbiosedb/auto-seed';

const graph = new DependencyGraph();

// Add tables
graph.addTable('users', usersSchema);
graph.addTable('posts', postsSchema);
graph.addTable('comments', commentsSchema);

// Get topological order (users → posts → comments)
const order = graph.getTopologicalOrder();
// ['users', 'posts', 'comments']

// Detect circular dependencies
const hasCycles = graph.hasCycles(); // false
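
To make the ordering step concrete, here is a minimal sketch of how a topological sort with cycle detection could work, using Kahn's algorithm. It is not the package's actual DependencyGraph implementation: the `topologicalOrder` function and the `Dependencies` map are hypothetical names, and the input is simplified to "table name → tables it references".

```typescript
// Hypothetical sketch, not the real DependencyGraph internals.
// Maps each table name to the tables it references through foreign keys.
type Dependencies = Map<string, string[]>;

// Returns tables in seed order (parents first), or [] when a cycle exists.
function topologicalOrder(deps: Dependencies): string[] {
  const inDegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();

  deps.forEach((_parents, table) => {
    inDegree.set(table, 0);
    dependents.set(table, []);
  });

  deps.forEach((parents, table) => {
    for (const parent of parents) {
      if (!deps.has(parent)) continue; // ignore references to unknown tables
      inDegree.set(table, (inDegree.get(table) ?? 0) + 1);
      dependents.get(parent)!.push(table);
    }
  });

  // Start with tables that reference nothing.
  const queue: string[] = [];
  inDegree.forEach((degree, table) => {
    if (degree === 0) queue.push(table);
  });

  const order: string[] = [];
  while (queue.length > 0) {
    const table = queue.shift()!;
    order.push(table);
    for (const child of dependents.get(table) ?? []) {
      const remaining = (inDegree.get(child) ?? 0) - 1;
      inDegree.set(child, remaining);
      if (remaining === 0) queue.push(child);
    }
  }

  // Tables left unprocessed are part of (or depend on) a cycle.
  return order.length === deps.size ? order : [];
}

const seedOrder = topologicalOrder(
  new Map<string, string[]>([
    ["comments", ["posts", "users"]],
    ["users", []],
    ["posts", ["users"]],
  ])
);
console.log(seedOrder); // ['users', 'posts', 'comments']
```

Returning an empty array on a cycle mirrors the behavior described under Circular Dependency Detection; throwing an error would be an equally reasonable design.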

3. CrossDBRegistry

Tracks generated IDs across all 4 database types for consistency.

import { CrossDBRegistry } from '@symbiosedb/auto-seed';

const registry = new CrossDBRegistry();

// Register IDs for SQL users
registry.registerID('users', 'sql', 'user-1');
registry.registerID('users', 'sql', 'user-2');

// When seeding Vector collection, pick random user ID
const userId = registry.getRandomID('users'); // 'user-1' or 'user-2'

// Ensure same user has same ID across all DBs
registry.registerID('user_embeddings', 'vector', 'emb-1');
registry.registerID('user_nodes', 'graph', 'user-1'); // Same ID!
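
For readers curious what such a registry might look like inside, the sketch below keeps a list of IDs per table name. Only the method names `registerID` and `getRandomID` come from the example above; the `IdRegistry` class, its storage layout, and the injectable `random` parameter are assumptions made for illustration.

```typescript
// Hypothetical sketch of an ID registry; the real CrossDBRegistry may differ.
type DBType = "sql" | "vector" | "graph" | "blockchain";

interface RegisteredId {
  dbType: DBType;
  id: string;
}

class IdRegistry {
  private readonly idsByTable = new Map<string, RegisteredId[]>();

  // Record an ID that was generated for a table in a given database type.
  registerID(table: string, dbType: DBType, id: string): void {
    const entries = this.idsByTable.get(table) ?? [];
    entries.push({ dbType, id });
    this.idsByTable.set(table, entries);
  }

  // Pick one registered ID for a table, or undefined when none exist.
  // `random` is injectable (an assumption) so that picks can be made reproducible.
  getRandomID(table: string, random: () => number = Math.random): string | undefined {
    const entries = this.idsByTable.get(table);
    if (!entries || entries.length === 0) return undefined;
    return entries[Math.floor(random() * entries.length)].id;
  }
}

const sketchRegistry = new IdRegistry();
sketchRegistry.registerID("users", "sql", "user-1");
sketchRegistry.registerID("users", "sql", "user-2");
console.log(sketchRegistry.getRandomID("users")); // 'user-1' or 'user-2'
```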

4. SmartDataGenerator

Generates realistic data with context awareness and FK intelligence.

import { SmartDataGenerator } from '@symbiosedb/auto-seed';

const generator = new SmartDataGenerator();

// Context-aware generation
const emailColumn = { name: 'email', type: 'string', nullable: false };
const email = generator.generateValue(emailColumn, registry);
// 'john.doe@example.com'

const nameColumn = { name: 'name', type: 'string', nullable: false };
const name = generator.generateValue(nameColumn, registry);
// 'John Doe'

// FK-aware generation
const postsSchema = {
  // ... schema with user_id FK
  foreignKeys: [
    { columnName: 'user_id', referencedTable: 'users', referencedColumn: 'id' }
  ]
};

const post = generator.generateRecord(postsSchema, registry);
// { id: '...', user_id: 'user-1', title: '...' }
// user_id automatically picked from registry!

// Locale support
generator.setLocale('fr_FR'); // French cities, names, etc.

// Reproducible with seed
generator.setSeed(12345);

Supported column types:

  • uuid → UUIDs
  • string → Context-aware (email, name, phone, address, etc.)
  • text → Paragraphs, descriptions
  • integer, bigint, float, decimal → Numbers
  • boolean → True/false
  • date, timestamp → Dates
  • json → JSON objects
  • vector, embedding → Arrays of random numbers (configurable dimensions)

Context-aware column names:

  • email → faker.internet.email()
  • name, first_name, last_name → faker.person.*
  • phone → faker.phone.number()
  • address, city, country → faker.location.*
  • title → faker.lorem.sentence()
  • description, content → faker.lorem.paragraph()
  • price, amount → faker.commerce.price()
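
The name-based matching and seed support described above can be sketched without any dependencies. This is an illustration only: the rule table, `generateString`, and the small word lists are hypothetical stand-ins for the faker calls listed above, and the mulberry32-style generator stands in for whatever seeding mechanism the package actually uses.

```typescript
// Hypothetical sketch; the real SmartDataGenerator delegates to faker.
type Rng = () => number;

// mulberry32-style seeded PRNG: the same seed yields the same sequence.
function mulberry32(seed: number): Rng {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function pick(rng: Rng, items: readonly string[]): string {
  return items[Math.floor(rng() * items.length)];
}

const FIRST_NAMES = ["Ada", "Grace", "Linus", "Margaret"];
const LAST_NAMES = ["Lovelace", "Hopper", "Torvalds", "Hamilton"];

// Ordered rules: the first pattern that matches the column name wins.
const rules: { pattern: RegExp; make: (rng: Rng) => string }[] = [
  {
    pattern: /email/i,
    make: (rng) => `${pick(rng, FIRST_NAMES).toLowerCase()}${Math.floor(rng() * 1000)}@example.com`,
  },
  { pattern: /first_name/i, make: (rng) => pick(rng, FIRST_NAMES) },
  { pattern: /last_name/i, make: (rng) => pick(rng, LAST_NAMES) },
  { pattern: /name/i, make: (rng) => `${pick(rng, FIRST_NAMES)} ${pick(rng, LAST_NAMES)}` },
];

// Generate a value for a string column based on its name.
function generateString(columnName: string, rng: Rng): string {
  const rule = rules.find((r) => r.pattern.test(columnName));
  return rule ? rule.make(rng) : `value-${Math.floor(rng() * 1e6)}`;
}

// Two generators created with the same seed produce identical values.
const firstRun = generateString("email", mulberry32(12345));
const secondRun = generateString("email", mulberry32(12345));
console.log(firstRun === secondRun); // true
```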

5. SeedOrchestrator

Coordinates multi-table seeding with automatic dependency resolution.

import { SeedOrchestrator } from '@symbiosedb/auto-seed';

const orchestrator = new SeedOrchestrator();

// Seed multiple tables in correct order
const results = await orchestrator.seedMultipleTables([
  { tableName: 'comments', dbType: 'sql', count: 100, schema: commentsSchema },
  { tableName: 'users', dbType: 'sql', count: 10, schema: usersSchema },
  { tableName: 'posts', dbType: 'sql', count: 50, schema: postsSchema },
]);

// Automatically seeds in order: users → posts → comments
// results[0].tableName === 'users'
// results[1].tableName === 'posts'
// results[2].tableName === 'comments'

// Seed with related tables automatically
orchestrator.registerSchema(usersSchema);
orchestrator.registerSchema(postsSchema);

const relatedResults = await orchestrator.seedRelatedTables(postsSchema, 20);
// Automatically seeds users first, then posts

// Options
await orchestrator.seedTable(schema, 100, {
  locale: 'fr_FR',  // French data
  seed: 12345,      // Reproducible
  reset: true,      // Clear existing data first
});

🔗 Cross-Database Seeding

Auto-Seed intelligently handles relationships across all 4 database types:

// SQL: users table
const usersSchema = {
  dbType: 'sql',
  tableName: 'users',
  columns: [
    { name: 'id', type: 'uuid', nullable: false, isPrimaryKey: true },
    { name: 'name', type: 'string', nullable: false },
  ],
  primaryKeys: [],
  foreignKeys: [],
  uniqueConstraints: [],
};

// Vector: user embeddings (references SQL users)
const embeddingsSchema = {
  dbType: 'vector',
  tableName: 'user_embeddings',
  columns: [
    { name: 'id', type: 'uuid', nullable: false, isPrimaryKey: true },
    { name: 'user_id', type: 'uuid', nullable: false }, // Cross-DB FK!
    { name: 'embedding', type: 'vector', nullable: false, dimensions: 128 },
  ],
  primaryKeys: [],
  foreignKeys: [
    {
      columnName: 'user_id',
      referencedTable: 'users',
      referencedColumn: 'id',
      referencedDBType: 'sql', // Cross-DB reference!
    },
  ],
  uniqueConstraints: [],
};

// Graph: user nodes (same IDs as SQL)
const userNodesSchema = {
  dbType: 'graph',
  tableName: 'user_nodes',
  columns: [
    { name: 'id', type: 'uuid', nullable: false, isPrimaryKey: true },
    { name: 'label', type: 'string', nullable: false },
  ],
  primaryKeys: [],
  foreignKeys: [],
  uniqueConstraints: [],
};

// Blockchain: user creation attestations
const attestationsSchema = {
  dbType: 'blockchain',
  tableName: 'user_attestations',
  columns: [
    { name: 'id', type: 'uuid', nullable: false, isPrimaryKey: true },
    { name: 'user_id', type: 'uuid', nullable: false },
    { name: 'hash', type: 'string', nullable: false },
  ],
  primaryKeys: [],
  foreignKeys: [],
  uniqueConstraints: [],
};

// Seed all 4 DBs with consistent IDs
const results = await orchestrator.seedMultipleTables([
  { tableName: 'users', dbType: 'sql', count: 10, schema: usersSchema },
  { tableName: 'user_embeddings', dbType: 'vector', count: 10, schema: embeddingsSchema },
  { tableName: 'user_nodes', dbType: 'graph', count: 10, schema: userNodesSchema },
  { tableName: 'user_attestations', dbType: 'blockchain', count: 10, schema: attestationsSchema },
]);

// Result:
// - 10 users in SQL with IDs: user-1, user-2, ..., user-10
// - 10 embeddings in Vector with user_id referencing user-1 to user-10
// - 10 nodes in Graph with same IDs: user-1, user-2, ..., user-10
// - 10 attestations in Blockchain referencing user-1 to user-10
// All have consistent IDs across all 4 databases!

📊 Test Coverage

44/44 tests passing (100%)

| Component | Tests | Status |
|-----------|-------|--------|
| SchemaAnalyzer | 8 | ✅ |
| DependencyGraph | 5 | ✅ |
| CrossDBRegistry | 6 | ✅ |
| SmartDataGenerator | 10 | ✅ |
| SeedOrchestrator | 9 | ✅ |
| Integration Tests | 6 | ✅ |
| Total | 44 | |

All tests follow strict TDD methodology (RED → GREEN → REFACTOR).

🔗 Integration Testing

The Auto-Seed system includes comprehensive integration tests that verify cross-database seeding scenarios:

SQL + Vector Integration

// Seed SQL users, then Vector embeddings with matching user_id FKs
await orchestrator.seedTable(usersSchema, 10);
await orchestrator.seedTable(embeddingsSchema, 10);
// ✅ All embeddings reference valid user IDs from SQL table

SQL + Graph Integration

// Seed SQL users, then Graph nodes with same user IDs
await orchestrator.seedTable(usersSchema, 5);
await orchestrator.seedTable(graphNodesSchema, 5);
// ✅ All graph nodes use same user IDs as SQL table

SQL + Blockchain Integration

// Seed SQL transactions, then Blockchain attestations
await orchestrator.seedTable(transactionsSchema, 20);
await orchestrator.seedTable(attestationsSchema, 20);
// ✅ All attestations reference valid transaction IDs

All 4 DB Types Integration

// Seed users across all 4 database types with consistent IDs
await orchestrator.seedTable(sqlUsersSchema, 5);        // SQL
await orchestrator.seedTable(vectorEmbeddings, 5);      // Vector
await orchestrator.seedTable(graphNodes, 5);            // Graph
await orchestrator.seedTable(blockchainAttestations, 5); // Blockchain
// ✅ Same user IDs across ALL 4 databases

Complex FK Constraints

// departments → employees → tasks (3-level FK chain)
await orchestrator.seedTable(departmentsSchema, 3);  // 3 departments
await orchestrator.seedTable(employeesSchema, 10);   // 10 employees
await orchestrator.seedTable(tasksSchema, 20);       // 20 tasks
// ✅ FK integrity maintained: tasks → employees → departments
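
One way to assert FK integrity in your own tests is to look for child rows whose foreign key points at no parent row. The helper below is a hypothetical sketch that works on plain in-memory rows; `findDanglingReferences` is not part of the package, and the column names are examples.

```typescript
// Hypothetical helper, not part of @symbiosedb/auto-seed.
type Row = Record<string, unknown>;

// Returns child rows whose FK value matches no parent primary key.
// Null or undefined FK values are treated as valid (nullable FK).
function findDanglingReferences(
  parents: Row[],
  children: Row[],
  fkColumn: string,
  pkColumn = "id"
): Row[] {
  const parentIds = new Set(parents.map((parent) => parent[pkColumn]));
  return children.filter(
    (child) => child[fkColumn] != null && !parentIds.has(child[fkColumn])
  );
}

const departments: Row[] = [{ id: "dept-1" }, { id: "dept-2" }];
const employees: Row[] = [
  { id: "emp-1", department_id: "dept-1" },
  { id: "emp-2", department_id: "dept-9" }, // dangling reference
  { id: "emp-3", department_id: null },     // allowed: nullable FK
];

const dangling = findDanglingReferences(departments, employees, "department_id");
console.log(dangling.map((row) => row.id)); // ['emp-2']
```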

Circular Dependency Detection

// table_a references table_b, table_b references table_a
orchestrator.registerSchema(tableASchema);
orchestrator.registerSchema(tableBSchema);

const graph = orchestrator['dependencyGraph'];
graph.hasCycles(); // returns true

// ✅ Cycle detected, topological sort returns empty array
// Manual intervention required (seed with nullable FKs first)
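
The "seed with nullable FKs first" workaround can be sketched as a two-pass approach: insert one side with its FK left null, insert the other side, then backfill. The code below is a hypothetical in-memory illustration only; `seedCyclicPair` and the `a_id` / `b_id` column names are assumptions, and a real seeder would issue INSERT and UPDATE statements instead.

```typescript
// Hypothetical in-memory sketch of two-pass seeding for a two-table cycle.
type Row = { id: string; [column: string]: string | null };

function seedCyclicPair(count: number): { tableA: Row[]; tableB: Row[] } {
  // Pass 1a: insert table_a rows with the FK to table_b left null.
  const tableA: Row[] = [];
  for (let i = 0; i < count; i++) {
    tableA.push({ id: `a-${i + 1}`, b_id: null });
  }

  // Pass 1b: table_b rows can now reference existing table_a IDs.
  const tableB: Row[] = [];
  for (let i = 0; i < count; i++) {
    tableB.push({ id: `b-${i + 1}`, a_id: tableA[i].id });
  }

  // Pass 2: backfill table_a.b_id now that table_b IDs exist.
  tableA.forEach((row, i) => {
    row.b_id = tableB[i].id;
  });

  return { tableA, tableB };
}

const cyclicSeed = seedCyclicPair(3);
console.log(cyclicSeed.tableA[0]); // { id: 'a-1', b_id: 'b-1' }
```

This only works when at least one FK in the cycle is nullable (or deferrable) while the first pass runs.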

🧪 Running Tests

# Run all tests
npm test

# Run with coverage
npm run test:coverage

# Watch mode
npm run test:watch

🏗️ Architecture

Auto-Seed System
├─ SchemaAnalyzer (FK detection, constraint analysis)
├─ DependencyGraph (Topological sort, cycle detection)
├─ CrossDBRegistry (ID tracking across all 4 DBs)
├─ SmartDataGenerator (Realistic data with FK awareness)
└─ SeedOrchestrator (Multi-table coordination)

How it works:

  1. Schema Analysis - Detect primary keys, foreign keys, constraints
  2. Dependency Resolution - Build dependency graph, determine seed order
  3. Data Generation - Generate realistic data respecting FK constraints
  4. ID Tracking - Register all generated IDs in cross-DB registry
  5. Cross-DB Consistency - Ensure same entity has same ID across all 4 DBs

📝 TypeScript Types

export type DBType = 'sql' | 'vector' | 'graph' | 'blockchain';

export type DataType =
  | 'uuid'
  | 'integer'
  | 'bigint'
  | 'float'
  | 'decimal'
  | 'string'
  | 'text'
  | 'boolean'
  | 'date'
  | 'timestamp'
  | 'json'
  | 'vector'
  | 'embedding';

export interface TableSchema {
  dbType: DBType;
  tableName: string;
  columns: Column[];
  primaryKeys: Column[];
  foreignKeys: ForeignKey[];
  uniqueConstraints: Constraint[];
}

export interface SeedResult {
  tableName: string;
  dbType: DBType;
  recordsCreated: number;
  duration: number; // milliseconds
  preview: Record<string, any>[]; // First 5 records
  errors?: string[];
}
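
The `Column`, `ForeignKey`, and `Constraint` types referenced above are not shown in this README. The sketch below infers plausible shapes for the first two from the schema examples earlier in this document; `Constraint` never appears with a value, so its shape here is a pure guess, and `isCrossDbReference` is a hypothetical helper added only to show the types in use.

```typescript
// Inferred from the README examples; the package's real definitions may differ.
type DBType = "sql" | "vector" | "graph" | "blockchain";
type DataType =
  | "uuid" | "integer" | "bigint" | "float" | "decimal" | "string" | "text"
  | "boolean" | "date" | "timestamp" | "json" | "vector" | "embedding";

interface Column {
  name: string;
  type: DataType;
  nullable: boolean;
  isPrimaryKey?: boolean; // seen on 'id' columns in the examples
  dimensions?: number;    // seen on the 'vector' column example
}

interface ForeignKey {
  columnName: string;
  referencedTable: string;
  referencedColumn: string;
  referencedDBType?: DBType; // set in the cross-DB example
}

// Pure guess: no Constraint value appears anywhere in the README.
interface Constraint {
  columns: string[];
}

// Hypothetical helper: true when the FK points into a different database type.
function isCrossDbReference(ownerDbType: DBType, fk: ForeignKey): boolean {
  return fk.referencedDBType !== undefined && fk.referencedDBType !== ownerDbType;
}

const embeddingFk: ForeignKey = {
  columnName: "user_id",
  referencedTable: "users",
  referencedColumn: "id",
  referencedDBType: "sql",
};
console.log(isCrossDbReference("vector", embeddingFk)); // true
```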

🎯 Use Cases

1. Unit Testing

// Seed test data for unit tests
beforeEach(async () => {
  await orchestrator.seedTable(usersSchema, 10, { seed: 12345 });
  await orchestrator.seedTable(postsSchema, 50, { seed: 12345 });
});

2. Integration Testing

// Seed realistic data for integration tests
const results = await orchestrator.seedMultipleTables([
  { tableName: 'users', dbType: 'sql', count: 100, schema: usersSchema },
  { tableName: 'posts', dbType: 'sql', count: 500, schema: postsSchema },
  { tableName: 'comments', dbType: 'sql', count: 2000, schema: commentsSchema },
]);

3. Demo Environments

// Seed demo database with realistic data
await orchestrator.seedTable(usersSchema, 1000, { locale: 'en_US' });
await orchestrator.seedTable(productsSchema, 500, { locale: 'en_US' });
await orchestrator.seedTable(ordersSchema, 5000, { locale: 'en_US' });

4. Performance Testing

// Seed large datasets for performance testing
await orchestrator.seedTable(schema, 1000000, { reset: true });

📖 API Reference

See PLAN.md for the full implementation plan and architecture details.

🚀 Next Steps (Phase 2)

  • REST API endpoints (POST /api/.../tables/:tableID/mock)
  • CLI command (symbiosedb seed <table> --count 100)
  • Studio UI component ("Generate Mock Data" button)

📄 License

MIT


Built with SymbioseDB - The Beautiful Database for Everything™

Auto-Seed Phase 1 Complete ✓ (November 2024)