@1337plus/llmdatagen

v1.1.6

Universal JSON data generator using LLM with configurable schemas

LLM Data Generator

A comprehensive data generation platform powered by Large Language Models. Generate structured data, create content, and interact with 100+ AI models through both CLI and web interfaces.

Overview

LLM Data Generator pairs AI models from the OpenRouter catalog with a CLI and a web interface to generate data and content. It covers structured JSON data for testing, CSV files for analysis, and written content such as blog posts and documentation. For multi-user deployments it adds role-based access, storage quotas, and a batch queue.

Features

🎯 Multi-Format Data Generation

  • JSON/JSONL: Schema-validated structured data generation
  • CSV/TSV: Tabular data with custom columns and relationships
  • XML: Hierarchical data structures
  • YAML: Configuration files and data serialization
  • SQL: Database schemas and sample data
  • Markdown: Documentation, articles, and formatted content
  • Text: Custom formats and unstructured content

🖥️ Dual Interface Design

  • Interactive CLI: Terminal-based wizard with model selection and progress tracking
  • Web Interface: Full-featured browser-based application with visual editors

🤖 Advanced AI Integration

  • 100+ Models: Access to OpenRouter's complete model catalog
  • Smart Model Selection: Auto-router picks the best model for your task
  • Real-time Cost Tracking: Know costs before and after generation
  • Progress Monitoring: Live generation status and error handling

👥 User Management & Authentication

  • Role-based Access: Admin and user roles with different permissions
  • Multi-tenant Support: Isolated user data and configurations
  • API Key Management: Personal and system-wide key management
  • Storage Quotas: Configurable limits and usage tracking

📊 Advanced Features

  • Queue System: Batch processing for large-scale generation
  • Generation History: Track and revisit previous generations
  • File Management: Upload, organize, and manage data files
  • Chat Interface: Interactive conversations with AI models
  • Data Viewer/Editor: Visual editing of generated content
  • Configuration Wizard: Step-by-step setup for complex scenarios

Installation

NPM Package (Recommended)

# Install globally
npm install -g @1337plus/llmdatagen

# Or use with npx without installation
npx @1337plus/llmdatagen

From Source

# Clone the repository
git clone https://github.com/diviatrix/llm-data-gen.git
cd llm-data-gen

# Install dependencies
npm install

# Run locally
npm start

System Requirements

  • Node.js: 18.0.0 or higher
  • Platform: Windows, macOS, Linux
  • Memory: 512MB RAM minimum
  • Storage: 100MB available space

Quick Start

CLI Interface

# Start interactive mode
llmdatagen

# Direct generation with config file
llmdatagen generate --config myconfig.json

# Test API connection
llmdatagen test

# Validate configuration
llmdatagen validate config.json

Web Interface

# Start web server (default port 3000)
npm run web

# Start on custom port
PORT=8080 npm run web

# Development mode with hot reload
npm run dev:web

Open http://localhost:3000 in your browser to access the full web interface.

Usage Examples

Schema-driven JSON Generation

{
  "type": "json",
  "count": 50,
  "schema": {
    "name": { "type": "string" },
    "email": { "type": "string", "format": "email" },
    "age": { "type": "number", "minimum": 18, "maximum": 80 },
    "skills": {
      "type": "array",
      "items": { "type": "string" },
      "minItems": 2,
      "maxItems": 5
    }
  }
}
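
The schema above uses JSON-Schema-style keywords (type, format, minimum, maximum, minItems, maxItems). As a rough illustration of what those constraints mean for a generated record, here is a minimal sketch of a checker. `validateRecord` and `checkValue` are hypothetical helpers, not part of the package. The sketch covers only the keywords used in the example, treats every listed field as required (an assumption), and uses a deliberately loose email pattern.

```javascript
// Hypothetical helpers, not part of @1337plus/llmdatagen.
// They check one record against the field-level schema from the example above.
const schema = {
  name: { type: 'string' },
  email: { type: 'string', format: 'email' },
  age: { type: 'number', minimum: 18, maximum: 80 },
  skills: { type: 'array', items: { type: 'string' }, minItems: 2, maxItems: 5 },
};

function checkValue(path, value, rule) {
  const errors = [];
  if (rule.type === 'array') {
    if (!Array.isArray(value)) return [`${path}: expected array`];
    if (rule.minItems !== undefined && value.length < rule.minItems) {
      errors.push(`${path}: fewer than ${rule.minItems} items`);
    }
    if (rule.maxItems !== undefined && value.length > rule.maxItems) {
      errors.push(`${path}: more than ${rule.maxItems} items`);
    }
    if (rule.items) {
      value.forEach((item, i) => errors.push(...checkValue(`${path}[${i}]`, item, rule.items)));
    }
    return errors;
  }
  // 'string' and 'number' map directly onto JavaScript's typeof results.
  if (typeof value !== rule.type) return [`${path}: expected ${rule.type}`];
  if (rule.minimum !== undefined && value < rule.minimum) errors.push(`${path}: below ${rule.minimum}`);
  if (rule.maximum !== undefined && value > rule.maximum) errors.push(`${path}: above ${rule.maximum}`);
  // Loose pattern: something@something.something, no whitespace.
  if (rule.format === 'email' && !/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value)) {
    errors.push(`${path}: not an email address`);
  }
  return errors;
}

// Returns a list of error strings; an empty list means the record passed.
// Assumption: every field named in the schema is required.
function validateRecord(record, fields) {
  return Object.entries(fields).flatMap(([key, rule]) =>
    key in record ? checkValue(key, record[key], rule) : [`${key}: missing`]);
}
```

Calling `validateRecord(record, schema)` on each generated item gives a quick sanity check before the data is used in tests.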

CSV Data Generation

{
  "type": "csv",
  "count": 100,
  "prompt": "Generate customer data with columns: name, email, phone, city, purchase_amount",
  "output": {
    "format": "csv",
    "filename": "customers.csv"
  }
}
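
Generated values such as names or addresses often contain commas or quotes, so CSV output needs proper field quoting. The sketch below shows RFC 4180-style quoting for the columns named in the prompt. `toCsv` and `escapeCsvField` are hypothetical names used for illustration; how the package itself writes CSV is not documented in this README.

```javascript
// Hypothetical helpers; they illustrate RFC 4180-style quoting,
// not the package's own CSV writer.
function escapeCsvField(value) {
  const text = value === null || value === undefined ? '' : String(value);
  // Quote fields containing a comma, double quote, CR, or LF,
  // and double any embedded double quotes.
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Serializes an array of objects to CSV text with a header row and CRLF line endings.
function toCsv(columns, rows) {
  const header = columns.map(escapeCsvField).join(',');
  const body = rows.map((row) => columns.map((col) => escapeCsvField(row[col])).join(','));
  return [header, ...body].join('\r\n') + '\r\n';
}
```

Missing fields are written as empty strings, which keeps every row at the same column count.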

Content Creation

{
  "type": "text",
  "count": 10,
  "prompt": "Write technical blog post titles about AI and machine learning trends in 2025",
  "output": {
    "format": "markdown",
    "filename": "blog_titles.md"
  }
}

Web Interface Features

🏠 Dashboard

  • System status and health monitoring
  • Quick access to recent projects
  • Usage statistics and quotas

💬 Chat Interface

  • Interactive conversations with AI models
  • File attachment support (images, documents, code)
  • Conversation history and export
  • Model switching mid-conversation

🔧 Configuration Manager

  • Visual JSON editor with syntax highlighting
  • Template library with examples
  • Validation and testing tools
  • Import/export configurations

📈 Data Generator

  • Batch generation with progress tracking
  • Multiple output format support
  • Preview and validation
  • Download and sharing options

📂 File Manager

  • Upload and organize data files
  • Preview and edit capabilities
  • Bulk operations and organization
  • Integration with generation workflows

🎛️ Admin Panel (Local Mode)

  • User creation and management
  • Role assignment and permissions
  • System configuration
  • Usage monitoring and quotas

⚙️ Settings

  • API key management
  • Model preferences and defaults
  • Output directory configuration
  • Notification preferences

Configuration

Environment Variables

# OpenRouter API configuration
OPENROUTER_API_KEY=your_api_key_here

# Server configuration
PORT=3000
NODE_ENV=production

# Authentication (optional)
JWT_SECRET=your_jwt_secret
SESSION_TIMEOUT=24h

# Storage (optional)
DATA_DIR=./user-data
MAX_FILE_SIZE=10485760
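
Environment variables always arrive as strings, so numeric settings need parsing and validation. The following sketch shows one way to read the variables listed above. `loadConfig` is a hypothetical function. Its defaults mirror the sample values shown here, except the `development` fallback for `NODE_ENV`, which is an assumption. Whether the package applies the same defaults is not confirmed by this README. Note that 10485760 bytes is 10 MiB (10 × 1024 × 1024).

```javascript
// Hypothetical loader. Variable names come from the list above;
// the fallback values are assumptions based on the sample values.
function loadConfig(env = process.env) {
  const port = Number.parseInt(env.PORT ?? '3000', 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  const maxFileSize = Number.parseInt(env.MAX_FILE_SIZE ?? '10485760', 10);
  if (!Number.isInteger(maxFileSize) || maxFileSize <= 0) {
    throw new Error(`Invalid MAX_FILE_SIZE: ${env.MAX_FILE_SIZE}`);
  }
  return {
    apiKey: env.OPENROUTER_API_KEY ?? null, // null means no key configured
    port,
    nodeEnv: env.NODE_ENV ?? 'development', // assumed fallback
    jwtSecret: env.JWT_SECRET ?? null,
    sessionTimeout: env.SESSION_TIMEOUT ?? '24h',
    dataDir: env.DATA_DIR ?? './user-data',
    maxFileSize, // in bytes
  };
}
```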

Directory Structure

~/Documents/llmdatagen/
├── configs/           # Configuration files
│   └── examples/      # Template configurations
├── output/           # Generated data files
│   └── data/         # Organized by date/project
└── uploads/          # User uploaded files

API Reference

Authentication Endpoints

  • POST /api/auth/login - User authentication
  • POST /api/auth/logout - Session termination
  • GET /api/auth/me - Current user info

Generation Endpoints

  • POST /api/generate - Start data generation
  • GET /api/generate/status/:id - Check generation status
  • GET /api/generate/history - Generation history

Configuration Endpoints

  • GET /api/configs - List configurations
  • POST /api/configs - Create configuration
  • PUT /api/configs/:id - Update configuration
  • DELETE /api/configs/:id - Delete configuration

File Management Endpoints

  • GET /api/files - List files
  • POST /api/files/upload - Upload file
  • GET /api/files/:id - Download file
  • DELETE /api/files/:id - Delete file

For detailed API documentation, see docs/api_reference.md.
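
As an illustration of how the generation endpoints might be combined, the sketch below starts a job with POST /api/generate and polls GET /api/generate/status/:id until it finishes. Only the endpoint paths come from the list above. The JSON request body, the Bearer token header, the `id` field in the response, and the `completed` and `failed` status values are all assumptions, so check docs/api_reference.md for the actual contract. It relies on the global `fetch` available in Node.js 18 and later.

```javascript
// Sketch of a client for the generation endpoints listed above.
// Assumptions (not confirmed by this README): JSON bodies, Bearer auth,
// an `id` field in the POST response, and 'completed' / 'failed' statuses.
function buildGenerateRequest(baseUrl, config, token) {
  return {
    url: new URL('/api/generate', baseUrl).toString(),
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify(config),
    },
  };
}

// Starts a generation job, then polls its status until it reaches a final state.
// `fetchImpl` can be replaced with a stub for testing.
async function generateAndWait(baseUrl, config, token, { intervalMs = 2000, fetchImpl = fetch } = {}) {
  const { url, options } = buildGenerateRequest(baseUrl, config, token);
  const started = await fetchImpl(url, options);
  if (!started.ok) throw new Error(`POST /api/generate failed: ${started.status}`);
  const { id } = await started.json();

  const statusUrl = new URL(`/api/generate/status/${encodeURIComponent(id)}`, baseUrl).toString();
  for (;;) {
    const res = await fetchImpl(statusUrl, { headers: { Authorization: `Bearer ${token}` } });
    if (!res.ok) throw new Error(`Status check failed: ${res.status}`);
    const job = await res.json();
    if (job.status === 'completed' || job.status === 'failed') return job;
    // Wait before the next poll to avoid hammering the server.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

With the web server running locally, a call could look like `generateAndWait('http://localhost:3000', { type: 'json', count: 5 }, token)`, where `token` is whatever credential the login endpoint returns.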

Advanced Features

Queue System

Process multiple generation tasks in the background:

  • Batch processing for large datasets
  • Priority-based task scheduling
  • Progress tracking and notifications
  • Error handling and retry logic
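
To make priority-based scheduling and retry logic concrete, here is a minimal in-memory sketch. `TaskQueue` is an illustrative class, not the package's queue. The "higher number runs first" convention and the default of two retries are assumptions.

```javascript
// Illustrative in-memory queue; the package's actual queue
// implementation is not shown in this README.
class TaskQueue {
  constructor() {
    this.items = [];
    this.seq = 0; // insertion counter, used to break priority ties
  }

  // Higher `priority` runs first; equal priorities run in insertion order.
  add(name, run, priority = 0) {
    this.items.push({ name, run, priority, seq: this.seq++ });
    this.items.sort((a, b) => b.priority - a.priority || a.seq - b.seq);
  }

  // Removes and returns the next task, or undefined when the queue is empty.
  next() {
    return this.items.shift();
  }

  // Runs tasks one at a time, retrying a failed task up to `maxRetries` more times.
  async drain(maxRetries = 2) {
    const results = [];
    for (let task = this.next(); task; task = this.next()) {
      for (let attempt = 0; ; attempt += 1) {
        try {
          results.push({ name: task.name, ok: true, value: await task.run() });
          break;
        } catch (error) {
          if (attempt >= maxRetries) {
            results.push({ name: task.name, ok: false, error: String(error) });
            break;
          }
        }
      }
    }
    return results;
  }
}
```

Re-sorting on every `add` keeps the sketch short. A binary heap would scale better for large batches.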

Model Management

  • Dynamic model selection based on task complexity
  • Cost optimization with model routing
  • Performance monitoring and analytics
  • Custom model preferences per user

Data Processing

  • Multi-format export capabilities
  • Data validation and cleanup
  • Transformation and filtering
  • Integration with external tools

Development

Setup Development Environment

# Clone and install
git clone https://github.com/diviatrix/llm-data-gen.git
cd llm-data-gen
npm install

# Set up environment
cp .env.example .env
# Edit .env with your configuration

# Run tests
npm test
npm run test:coverage

# Start development servers
npm run dev:web    # Web interface
npm start          # CLI interface

Available Scripts

  • npm start - Run CLI tool
  • npm run web - Start web server
  • npm run lint - Check code style
  • npm test - Run test suite
  • npm run test:coverage - Coverage report
  • npm run build-css - Build stylesheets

Project Structure

llm-data-gen/
├── lib/                    # Core libraries
│   ├── cli/               # CLI interface components
│   ├── streaming/         # Data streaming utilities
│   ├── utils/            # Shared utilities
│   └── workers/          # Background processing
├── public/                # Web interface assets
│   ├── css/              # Stylesheets
│   ├── js/               # Frontend JavaScript
│   └── pages/            # HTML templates
├── test/                  # Test suites
│   ├── unit/             # Unit tests
│   └── integration/      # Integration tests
├── docs/                  # Documentation
└── configs/              # Example configurations

Documentation

Contributing

We welcome contributions from the community! Please read our contributing guidelines:

  1. Fork the repository on GitHub
  2. Create a feature branch from main
  3. Make your changes with appropriate tests
  4. Ensure all tests pass and code follows style guidelines
  5. Submit a pull request with a clear description

Development Guidelines

  • Follow existing code style and conventions
  • Add tests for new functionality
  • Update documentation for user-facing changes
  • Use semantic commit messages

Reporting Issues

  • Use GitHub Issues for bug reports and feature requests
  • Include system information and steps to reproduce
  • Check existing issues to avoid duplicates

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

  • GitHub Issues: Bug reports and feature requests
  • Documentation: Comprehensive guides and examples
  • Community: Share configurations and use cases

Built with ❤️ by 1337.plus