npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

miniml

v1.0.16

Published

A minimal, embeddable semantic data modeling language for generating SQL queries from YAML model definitions. Inspired by LookML.

Readme

MiniML

A minimal, embeddable semantic data modeling language for generating SQL queries from YAML model definitions, inspired by LookML.

Generates SQL queries driven by a semantic model and selected dimensions and measures. Capable of generating optimal SQL from minimal queries to very complex queries with multiple joins based on the specified configuration.

Also generates metadata for each model suitable for use by AI to execute queries.

Designed to be a stand-alone and light-weight library, embeddable in any server or app without requiring additional data platforming. Supports BigQuery and Snowflake dialects.

Table of Contents

Quick Start

1. Define a Model

Create a YAML model file (sales.yaml) that describes your data structure:

description: Sales fact table capturing transactional-level purchase data across products, stores, and customers.

from: acme.sales

join:
  customer_join: JOIN acme.customers USING (customer_id)
  product_join: JOIN acme.products USING (product_id)
  store_join: JOIN acme.stores USING (store_id)

always_join:
  - product_join  # Always include product data for consistency

date_field: date                    # Primary date field for filtering
default_date_range: last 7 days     # Default time window when no dates specified
include_today: false                # Exclude today's partial data

dimensions:
  date:
    - The calendar date on which the sale occurred.
    - DATE(sale_date)
  sale_id: Unique identifier for each sale transaction.
  customer_id: Unique identifier for the customer making the purchase.
  customer_name:
    - Display name or label associated with the customer.
    - customer_name
    - customer_join
  product_id: Unique identifier for the product sold.
  product_name:
    - Display name or label associated with the product.
    - product_name
    - product_join
  store_id: Unique identifier for the store where the sale took place.
  store_name:
    - Display name or label of the store location.
    - store_name
    - store_join

measures:
  total_amount:
    - Total sale amount generated by the transaction.
    - SUM(total_amount)
  price_avg:
    - Average unit price across all sales.
    - AVG(unit_price)
  count:
    - The count of records for the row.
    - COUNT(*)

2. Produce a description of the model

npx miniml sales.yaml

OUTPUT

## DIMENSIONS
- `date` The calendar date on which the sale occurred.
- `sale_id` Unique identifier for each sale transaction.
- `customer_id` Unique identifier for the customer making the purchase.
- `customer_name` Display name or label associated with the customer.
- `product_id` Unique identifier for the product sold.
- `product_name` Display name or label associated with the product.
- `store_id` Unique identifier for the store where the sale took place.
- `store_name` Display name or label of the store location.

## MEASURES
- `total_amount` Total sale amount generated by the transaction.
- `price_avg` Average unit price across all sales.
- `count` The count of records for the row.

3. Generate a SQL query from the model

npx miniml sales.yaml --dimensions=date --measures=total_amount

OUTPUT

SELECT
  DATE(sale_date) AS date,
  SUM(total_amount) AS total_amount
FROM acme.sales
JOIN acme.products USING (product_id)
WHERE DATE(sale_date) >= CURRENT_TIMESTAMP - INTERVAL 7 DAY
  AND DATE(sale_date) < DATE_TRUNC(CURRENT_TIMESTAMP, DAY)
GROUP BY ALL

Add more options:

npx miniml sales.yaml \
  --dimensions=date,customer_name,product_name \
  --measures=total_amount,price_avg,count \
  --date-from=2025-01-01 \
  --date-to=2025-01-31 \
  --order-by=-date \
  --limit=10

OUTPUT

SELECT
  DATE(sale_date) AS date,
  customer_name AS customer_name,
  product_name AS product_name,
  SUM(total_amount) AS total_amount,
  AVG(unit_price) AS price_avg,
  COUNT(*) AS count
FROM acme.sales
JOIN acme.customers USING (customer_id)
JOIN acme.products USING (product_id)
WHERE DATE(sale_date) BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY ALL
ORDER BY date DESC
LIMIT 10

Installation

To use MiniML in your project:

npm install miniml

Programmatic Usage

import { loadModel, renderQuery } from 'miniml';

// Load the model
const model = await loadModel('./sales.yaml');

// Generate a SQL query
const sql = renderQuery(model, {
    dimensions: ['date', 'customer_name'],
    measures: ['total_amount', 'count'],
    date_from: '2025-01-01',
    date_to: '2025-01-31',
    order_by: ['-date'],
    limit: 10
});

console.log(sql);

OUTPUT

SELECT
  DATE(sale_date) AS date,
  customer_name AS customer_name,
  SUM(total_amount) AS total_amount,
  COUNT(*) AS count
FROM acme.sales
JOIN acme.customers USING (customer_id)
WHERE DATE(sale_date) BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY ALL
ORDER BY date DESC
LIMIT 10

Model Definition

Basic Structure

A MiniML model is a YAML file with the following structure:

description: Human-readable description of the model
dialect: bigquery | snowflake
from: schema.table_name

join:
  join_name: JOIN clause definition

always_join:
  - join_name  # Joins to always include

where: base_filter_conditions

date_field: primary_date_column        # Auto-detected if not specified
default_date_range: last 30 days       # Default time window
include_today: false                   # Include/exclude current day

dimensions:
  dimension_name: definition

measures:
  measure_name: definition

tags:
  - public  # Tags that can be used for purposes such as access control

info: Detailed usage instructions for the model

Dimensions

Dimensions are fields used for grouping data. They can be defined in several formats:

Simple string (field name only):

dimensions:
  customer_id: Unique customer identifier

Array format (description, SQL, join):

dimensions:
  customer_name:
    - Display name of the customer
    - customer_name
    - customer_join

Object format:

dimensions:
  date:
    description: The transaction date
    sql: DATE(created_at)
    join: date_dimension_join

Measures

Measures are aggregated fields. They default to SUM() if no SQL is provided:

Simple string (defaults to SUM):

measures:
  revenue: Total revenue amount  # becomes SUM(revenue) AS revenue

Custom aggregation:

measures:
  avg_price:
    - Average unit price
    - AVG(unit_price)

Joins

Define reusable join clauses. Joins are rendered in the order they are defined under the join section, regardless of the order they are referenced by dimensions or measures.

join:
  customer_join: LEFT JOIN customers USING (customer_id)
  product_join: LEFT JOIN products USING (product_id)
  store_join: LEFT JOIN stores USING (store_id)

always_join:
  - product_join  # Always include this join, regardless of field usage

dimensions:
  customer_name:
    - Customer display name
    - customer_name
    - customer_join  # References the join above
  store_name:
    - Store location name  
    - store_name
    - store_join

Generated SQL join order:

-- Joins appear in YAML definition order: customer_join, product_join, store_join
-- Even if referenced in reverse order by dimensions/measures
FROM base_table
JOIN customers USING (customer_id)      -- First in YAML
JOIN products USING (product_id)        -- Second in YAML  
JOIN stores USING (store_id)            -- Third in YAML

This predictable ordering ensures consistent SQL generation and helps with query optimization and dependency management between joins.

Always Join

The always_join property specifies joins that should always be included in generated SQL queries, regardless of whether any dimensions or measures explicitly reference them. This is useful for joins that provide essential context, filtering, or data integrity constraints.

join:
  customer_join: LEFT JOIN customers USING (customer_id)
  product_join: LEFT JOIN products USING (product_id)
  store_join: LEFT JOIN stores USING (store_id)

always_join:
  - product_join  # Always included for data integrity
  - store_join    # Always included for consistent filtering

dimensions:
  date: Transaction date
  customer_name:
    - Customer name from customer table
    - customer_name
    - customer_join  # Only included when customer_name is used

Alternatively, specify all to always include all joins.

always_join: all

Example Query:

// This query only uses dimensions that don't require joins
const sql = renderQuery(model, {
    dimensions: ['date'],
    measures: ['count']
});

Generated SQL:

SELECT
  DATE(sale_date) AS date,
  COUNT(*) AS count
FROM acme.sales
JOIN products USING (product_id)  -- Always included
JOIN stores USING (store_id)      -- Always included
GROUP BY ALL

Use Cases:

  • Data Quality: Ensure referential integrity by always joining lookup tables
  • Security: Always include joins for row-level security filters
  • Business Logic: Maintain consistent business rules across all queries
  • Performance: Pre-join commonly used dimensions to optimize query plans

Best Practices:

  • Use sparingly - only for joins that are truly always needed
  • Consider the performance impact of additional joins on large datasets
  • Document why specific joins are marked as always_join in your model

Date Field Configuration

MiniML provides sophisticated date handling capabilities through three key properties: date_field, default_date_range, and include_today.

Date Field (date_field)

The date_field property specifies the primary date dimension used for automatic date filtering. If not explicitly defined, MiniML automatically detects it using these rules (in order):

  1. Exact matches: "date", "timestamp"
  2. Pattern matches: Fields ending with "date", "time", "_at", "_on", "_until", starting with "date", or containing "datetime"
# Explicit date field
date_field: order_date

dimensions:
  order_date: The date when the order was placed
  created_at: Record creation timestamp
  
# Auto-detected: MiniML will choose "order_date" (exact match)
# Auto-detection example
dimensions:
  transaction_date: Purchase date      # Will be auto-detected
  customer_id: Customer identifier
  
# No explicit date_field needed - "transaction_date" will be chosen

Default Date Range (default_date_range)

Automatically applies date filtering when no explicit date_from/date_to parameters are provided. Supports human-readable relative date expressions:

default_date_range: last 30 days
# Other examples:
# default_date_range: last 7 days
# default_date_range: last 12 months  
# default_date_range: last 1 year
# default_date_range: last 24 hours

Supported time units: hour/hours, day/days, week/weeks, month/months, year/years

Generated SQL Examples:

BigQuery:

WHERE order_date >= CURRENT_TIMESTAMP - INTERVAL 30 DAY

Snowflake:

WHERE order_date >= CURRENT_TIMESTAMP - INTERVAL '30 DAY'

Include Today (include_today)

Controls whether the current day is included in default date range filtering. Defaults to true if not specified.

default_date_range: last 7 days
include_today: false  # Exclude today from the range (default is true)

With include_today: true (default):

-- BigQuery
WHERE order_date >= CURRENT_TIMESTAMP - INTERVAL 7 DAY

-- Includes all of today's data

With include_today: false:

-- BigQuery  
WHERE order_date >= CURRENT_TIMESTAMP - INTERVAL 7 DAY 
  AND order_date < DATE_TRUNC(CURRENT_TIMESTAMP, DAY)
  
-- Excludes today's data (up to midnight)

Complete Date Configuration Example

description: E-commerce orders with 30-day default window
dialect: bigquery
from: ecommerce.orders
date_field: order_date           # Primary date field for filtering
default_date_range: last 30 days # Auto-apply 30-day lookback  
include_today: false             # Exclude today's partial data

dimensions:
  order_date:
    - Order placement date
    - DATE(order_date)
  customer_id: Customer identifier

measures:
  revenue: 
    - Total revenue
    - SUM(amount)

Query without explicit dates:

const sql = renderQuery(model, {
    dimensions: ['order_date'], 
    measures: ['revenue']
});

Generated SQL:

SELECT
  DATE(order_date) AS order_date,
  SUM(amount) AS revenue
FROM ecommerce.orders  
WHERE order_date >= CURRENT_TIMESTAMP - INTERVAL 30 DAY
  AND order_date < DATE_TRUNC(CURRENT_TIMESTAMP, DAY)
GROUP BY ALL

Query with explicit dates (overrides defaults):

const sql = renderQuery(model, {
    dimensions: ['order_date'],
    measures: ['revenue'], 
    date_from: '2024-01-01',
    date_to: '2024-01-31'
});
// Uses explicit dates instead of default_date_range

Date Granularity

When using the primary date_field as a dimension, you can apply date truncation:

const sql = renderQuery(model, {
    dimensions: ['order_date'],
    measures: ['revenue'],
    date_granularity: 'week'  // Group by week
});

Generated SQL:

-- BigQuery
SELECT
  DATE_TRUNC(DATE(order_date), WEEK) AS order_date,
  SUM(amount) AS revenue
FROM ecommerce.orders
WHERE order_date >= CURRENT_TIMESTAMP - INTERVAL 30 DAY
GROUP BY ALL

Supported granularities: MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, YEAR

SQL Dialect Detection

MiniML can automatically detect the SQL dialect from the file path:

  • Files containing "bigquery" use BigQuery syntax
  • Files containing "snowflake" use Snowflake syntax
  • Otherwise, specify explicitly with the dialect field

Examples:

models/bigquery/sales.yaml     # Automatically uses BigQuery dialect
models/snowflake/users.yaml    # Automatically uses Snowflake dialect  
models/sales.yaml              # Requires explicit dialect: bigquery

This means you can omit the dialect field when your file paths include the database type.

Query Generation

Query Options

The renderQuery function accepts these options:

interface MinimlQueryOptions {
    dimensions?: string[];       // Fields to group by
    measures?: string[];         // Aggregated fields to calculate
    date_from?: string | null;   // Start date filter (YYYY-MM-DD)
    date_to?: string | null;     // End date filter (YYYY-MM-DD)
    where?: string;              // Additional WHERE conditions
    having?: string;             // HAVING clause for aggregate filters
    order_by?: string[];         // Sort fields (prefix with - for DESC)
    limit?: number;              // Limit result count
    distinct?: boolean;          // Add DISTINCT to SELECT
    date_granularity?: string;   // Date truncation (day, week, month, year)
}

Date Handling

MiniML provides powerful date filtering and processing capabilities. For comprehensive documentation on date configuration, see the Date Field Configuration section.

Basic date filtering:

// Uses model's default_date_range if defined
renderQuery(model, { measures: ['count'] });


// Override with specific date range
renderQuery(model, {
    measures: ['count'],
    date_from: '2024-01-01',
    date_to: '2024-01-31'
});

// Specify a null for date_from to bypass the default_date_range
renderQuery(model, { measures: ['count'], date_from: null });

Date granularity:

// Group by week using the primary date_field
renderQuery(model, {
    dimensions: ['date'],
    measures: ['count'],
    date_granularity: 'week'
});
// Generates: DATE_TRUNC('WEEK', sale_date) AS date

Filtering

WHERE clause (for dimensions):

renderQuery(model, {
    dimensions: ['customer_id'],
    measures: ['total_amount'],
    where: "customer_type = 'premium'"
});

HAVING clause (for measures):

renderQuery(model, {
    dimensions: ['customer_id'],
    measures: ['total_amount'],
    having: "total_amount > 1000"
});

Ordering

renderQuery(model, {
    dimensions: ['customer_id'],
    measures: ['total_amount'],
    order_by: ['customer_id', '-total_amount']  // ASC, then DESC
});

API Reference

Core Functions

loadModel(file: string): Promise<MinimlModel>

Asynchronously loads and processes a YAML model file.

loadModelSync(file: string): MinimlModel

Synchronously loads and processes a YAML model file.

renderQuery(model: MinimlModel, options: MinimlQueryOptions): string

Generates a SQL query from a model and query options.

Interfaces

MinimlModel

interface MinimlModel {
    description: string;
    dialect: string;
    from: string;
    join: Record<string, string>;
    always_join?: string[];           // Joins to always include
    where: string;
    date_field?: string;              // Primary date field (auto-detected)
    default_date_range?: string;      // Default time window (e.g., "last 30 days")
    include_today?: boolean;          // Include current day in default range
    dimensions: Record<string, MinimlDef>;
    measures: Record<string, MinimlDef>;
    info: string;  // Auto-generated documentation
}

MinimlDef

interface MinimlDef {
    key: string;
    description: string;
    sql?: string;
    join?: string;
}

Examples

E-commerce Analytics

# ecommerce.yaml
description: E-commerce sales and customer analytics

dialect: bigquery
from: ecommerce.orders
date_field: order_date

join:
  customer_join: LEFT JOIN ecommerce.customers USING (customer_id)
  product_join: LEFT JOIN ecommerce.products USING (product_id)

where: status = 'completed'

dimensions:
  date: Order date
  customer_id: Customer identifier
  customer_email:
    - Customer email address
    - email
    - customer_join
  product_category:
    - Product category
    - category
    - product_join

measures:
  revenue: Total revenue
  order_count: 
    - Number of orders
    - COUNT(DISTINCT order_id)
  avg_order_value:
    - Average order value
    - AVG(total_amount)

Usage:

const model = await loadModel('./ecommerce.yaml');

// Revenue by category, last 7 days
const sql = renderQuery(model, {
    dimensions: ['product_category'],
    measures: ['revenue', 'order_count'],
    date_from: '2024-01-20',
    date_to: '2024-01-27',
    order_by: ['-revenue'],
    limit: 10
});

Complex Joins and Calculations

# advanced.yaml
description: Advanced analytics with multiple joins

dialect: snowflake
from: fact.sales
date_field: sale_date

join:
  customer_join: LEFT JOIN dim.customers c ON c.id = customer_id
  product_join: LEFT JOIN dim.products p ON p.id = product_id
  region_join: LEFT JOIN dim.regions r ON r.code = c.region_code

dimensions:
  customer_segment:
    - Customer segment classification
    - c.segment
    - customer_join
  product_brand:
    - Product brand name
    - p.brand
    - product_join
  region_name:
    - Geographic region
    - r.name
    - region_join

measures:
  revenue_per_customer:
    - Average revenue per customer
    - SUM(amount) / COUNT(DISTINCT customer_id)
  profit_margin:
    - Profit margin percentage
    - (SUM(amount) - SUM(cost)) / SUM(amount) * 100

Supported SQL Dialects

BigQuery

  • Supports GROUP BY ALL syntax
  • Uses DATE_TRUNC for date granularity
  • Compatible with BigQuery standard SQL functions

Snowflake

  • Supports Snowflake-specific functions
  • Uses DATE_TRUNC for date granularity
  • Compatible with Snowflake SQL syntax

More dialects coming soon!

Advanced Features

Template Variables in Info

Models support Jinja-style templating in the info section:

info: |
  ## Available Dimensions
  {%- for dimension in dimensions %}
  - `{{ dimension.key }}`: {{ dimension.description }}
  {%- endfor %}

  ## Available Measures  
  {%- for measure in measures %}
  - `{{ measure.key }}`: {{ measure.description }}
  {%- endfor %}

Filter Reference Expansion

MiniML automatically expands dimension and measure references in filter clauses:

// If you have a measure: conversion_rate: "clicks / impressions"
// And use it in a HAVING clause:
renderQuery(model, {
    having: "conversion_rate > 0.05"
});
// MiniML expands this to: HAVING (clicks / impressions) > 0.05

Validation

MiniML validates query parameters against model definitions and provides helpful error messages:

// This will throw an error with specific invalid keys
renderQuery(model, {
    dimensions: ['invalid_dimension'],
    measures: ['invalid_measure']
});
// Error: The following keys are invalid:
// - dimensions: invalid_dimension  
// - measures: invalid_measure

Security

Because MiniML allows flexible where and having clauses to be specified in native SQL format, comprehensive SQL validation is implemented to guard against SQL injection attacks for user-provided query parameters:

  • AST-based validation: All user-provided SQL expressions are parsed and validated using node-sql-parser
  • Allowlist approach: Only safe SQL constructs are permitted (comparisons, logical operators, basic functions)
  • Model-aware validation: Column references are validated against your model schema
  • Dialect-specific rules: Validation adapts to BigQuery/Snowflake syntax differences
  • Date input validation: Date parameters (date_from, date_to) are strictly validated using regex patterns

Security Model

Trusted Inputs: YAML model files are considered trusted input and are not validated for SQL injection. Model files should be:

  • Created and maintained by trusted developers
  • Stored securely with appropriate access controls
  • Reviewed before deployment to production environments
  • Treated as code artifacts subject to version control and security policies

Untrusted Inputs: All query parameters provided at runtime (where, having, date_from, date_to, etc.) are considered untrusted and undergo strict validation.

Safe Expression Examples

WHERE clauses:

renderQuery(model, {
    where: "account_name = 'Acme Corp'"
});

renderQuery(model, {
    where: "date >= '2024-01-01' AND category_name LIKE 'Electronics%'"
});

renderQuery(model, {
    where: "revenue > 1000 OR quantity IS NOT NULL"
});

HAVING clauses:

renderQuery(model, {
    having: "total_amount > 1000 AND order_count >= 5"
});

Blocked Constructs

For security, the following are automatically blocked:

  • Subqueries: SELECT, EXISTS, ANY, ALL
  • DDL/DML statements: DROP, ALTER, CREATE, INSERT, UPDATE, DELETE
  • System functions: Potentially dangerous database functions
  • Comments: SQL comments that could hide malicious code
  • Unauthorized columns: References to columns not defined in your model

Validation Errors

When unsafe expressions are detected, MiniML provides helpful error messages:

// Error: "Subqueries are not allowed in WHERE clauses. Use simple comparisons instead."
renderQuery(model, {
    where: "customer_id IN (SELECT id FROM users WHERE active = 1)"
});

// Error: "Column 'user_password' not found. Available dimensions: account_name, date, category_name"
renderQuery(model, {
    where: "user_password = 'secret'"
});

Building and Development

# Install dependencies
npm install

# Build TypeScript
npm run build

# Run tests (requires Mocha setup)
npx mocha test/**/*.test.js

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass
  5. Submit a pull request

License

MIT License - see LICENSE file for details.