nx-md-parser

v2.2.1

Published

Extensible Multi-Format AI-Powered Markdown to JSON Transformer with support for any markdown variant through custom parsers, intelligent schema validation, auto-fixing, table parsing, and optional persistent machine learning

Downloads

1,292

Readme

nx-md-parser

License: ISC

Extensible Multi-Format AI-Powered Markdown to JSON Transformer - Transform markdown documents into structured JSON with support for multiple markdown formats, intelligent schema validation, auto-fixing, and machine learning capabilities. Built with an extensible parser architecture that can handle any markdown variant.

✨ Features

📝 Extensible Multi-Format Markdown Support

  • Auto-Detection: Automatically detects and selects appropriate parser for any markdown format
  • Built-in Formats: Heading (### Section), bullet (- Section), and colon (Key: Value) formats
  • Custom Parsers: Easy to add support for any markdown variant (YAML frontmatter, numbered sections, etc.)
  • Parser Registry: Register multiple parsers for different formats
  • Format Selection: Explicitly specify format or let auto-detection choose
  • Backward Compatible: All existing code continues to work unchanged

🤖 Advanced AI-Powered Matching

  • Multi-Algorithm Fuzzy Matching: Jaccard tokens, Jaro-Winkler, Dice coefficient, Levenshtein ratio
  • Configurable Weights & Thresholds: Fine-tune matching sensitivity for different use cases
  • Machine Learning: Learn aliases and improve matching accuracy over time
  • Context-Aware: Different thresholds for key-to-key, title-to-key, and object matching
  • Persistent Learning (Optional): Save/load ML data with @xronoces/xronox-ml
  • Schema Consistency: Same intelligent matching for objects AND arrays

🔧 Intelligent Auto-Fixing

  • Typo Correction: Automatically fix property name typos using advanced fuzzy matching
  • Case Normalization: Handle camelCase, snake_case, Title Case seamlessly
  • Type Conversion: Smart conversion (string → number, string → boolean, etc.)
  • Structural Repair: Restructure flat objects into nested schemas
  • Missing Data: Add missing properties with sensible defaults
  • Content Intelligence: Parse tables, lists, key-value pairs automatically
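
The type conversion step can be pictured with a small sketch. This is not the library's implementation; coerceValue and TargetType are hypothetical names used only to show how a raw markdown string might be coerced toward the type a schema expects, and left untouched when the conversion is not safe.

```typescript
// Hypothetical helper (not part of nx-md-parser): coerce a raw string
// toward the type a schema field expects.
type TargetType = "string" | "number" | "boolean";

function coerceValue(raw: string, target: TargetType): string | number | boolean {
  const text = raw.trim();
  if (target === "number") {
    const n = Number(text);
    // Keep the original text when it is empty or not a finite number.
    return text !== "" && isFinite(n) ? n : raw;
  }
  if (target === "boolean") {
    const lower = text.toLowerCase();
    if (lower === "true") return true;
    if (lower === "false") return false;
    return raw; // anything else is left for the caller to report
  }
  return raw;
}

console.log(coerceValue("5432", "number"));  // 5432
console.log(coerceValue("true", "boolean")); // true
console.log(coerceValue("abc", "number"));   // abc (left unchanged)
```

Returning the original string on failure, rather than throwing, lets a later validation pass decide whether the field counts as fixed or failed.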

🏗️ Modular Architecture

  • Extensible Parser System: Add support for any markdown format by implementing BaseMarkdownParser
  • Clean Separation: Parsers, converters, transformers, and utilities in separate modules
  • Plugin Architecture: Register custom parsers for specialized formats (YAML, XML, custom syntax)
  • Format Detection: Intelligent auto-selection of appropriate parsers from registered options
  • Multiple Parsers: Support multiple parsers simultaneously for different document types

📋 Schema Validation

  • Intuitive Schema DSL: Clean, TypeScript-friendly schema definition
  • Nested Objects: Unlimited depth object support
  • Array Handling: Complex array schemas with validation
  • Table Parsing: Markdown tables (| Header |) → Arrays of objects
  • Advanced Content: Key-value pairs, nested structures, mixed content types
  • Validation Status: Clear validated, fixed, or failed status reporting

🔄 Enterprise-Grade nx-helpers Integration

  • Advanced Merging: Intelligent object merging with deduplication
  • Role-Based Aggregation: Merge data with specific roles using mergeWithRoles
  • Schema Loading: Load schemas from JSON files
  • Dual Transformer Support: Use either nx-md-parser or nx-helpers JSONTransformer

📝 Markdown Parsing Capabilities

nx-md-parser intelligently parses various markdown structures with support for multiple formats:

Heading Format (###)

### User Profile
John Doe

### Settings
Dark mode enabled

{
  "userProfile": "John Doe",
  "settings": "Dark mode enabled"
}
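
The example above implies that a heading such as "User Profile" becomes the key userProfile. A minimal sketch of that kind of key normalization follows; toCamelCase is a hypothetical helper written for illustration, and the library's own normalization may differ in its details.

```typescript
// Hypothetical helper: turn a heading or property label into a camelCase key.
function toCamelCase(label: string): string {
  const words = label
    // Add a space at lower-to-upper boundaries so "userProfile" splits too.
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    // Split on anything that is not a letter or digit (spaces, _, -, etc.).
    .split(/[^A-Za-z0-9]+/)
    .filter((w) => w.length > 0)
    .map((w) => w.toLowerCase());
  return words
    .map((w, i) => (i === 0 ? w : w.charAt(0).toUpperCase() + w.slice(1)))
    .join("");
}

console.log(toCamelCase("User Profile")); // userProfile
console.log(toCamelCase("user_profile")); // userProfile
console.log(toCamelCase("Settings"));     // settings
```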

Bullet Format (-)

- User Profile
John Doe

- Settings
Dark mode enabled

{
  "userProfile": "John Doe",
  "settings": "Dark mode enabled"
}

Auto-Detection

Both formats work identically - nx-md-parser automatically detects which format you're using!

Tables → Arrays of Objects

| Name | Age | Active |
|------|-----|--------|
| Alice | 28  | true   |
| Bob   | 34  | false  |

[
  { "name": "Alice", "age": 28, "active": true },
  { "name": "Bob", "age": 34, "active": false }
]
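
To make the table-to-array mapping concrete, here is a rough sketch of how such a conversion could work. parseTable, splitRow and coerceCell are hypothetical names, not the library's internals, and the sketch simply lower-cases single-word headers instead of applying full key normalization.

```typescript
type Cell = string | number | boolean;

// Convert a cell's text to a boolean or number when it clearly is one.
function coerceCell(text: string): Cell {
  if (text === "true") return true;
  if (text === "false") return false;
  const n = Number(text);
  return text !== "" && !isNaN(n) ? n : text;
}

// Drop the outer pipes, then split on the inner ones.
function splitRow(line: string): string[] {
  return line
    .trim()
    .replace(/^\|/, "")
    .replace(/\|$/, "")
    .split("|")
    .map((c) => c.trim());
}

function parseTable(markdown: string): Record<string, Cell>[] {
  const lines = markdown
    .split("\n")
    .map((l) => l.trim())
    .filter((l) => l.charAt(0) === "|");
  if (lines.length < 2) return [];
  const headers = splitRow(lines[0]).map((h) => h.toLowerCase());
  // lines[1] is the |---|---| separator row, so data starts at index 2.
  return lines.slice(2).map((line) => {
    const cells = splitRow(line);
    const row: Record<string, Cell> = {};
    headers.forEach((h, i) => {
      row[h] = coerceCell(cells[i] ?? "");
    });
    return row;
  });
}

const table = [
  "| Name | Age | Active |",
  "|------|-----|--------|",
  "| Alice | 28  | true   |",
  "| Bob   | 34  | false  |",
].join("\n");

console.log(JSON.stringify(parseTable(table)));
// [{"name":"Alice","age":28,"active":true},{"name":"Bob","age":34,"active":false}]
```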

Lists → Arrays

### Features
- Schema validation
- Auto-fixing capabilities
- Machine learning integration

{
  "features": [
    "Schema validation",
    "Auto-fixing capabilities",
    "Machine learning integration"
  ]
}

Key-Value Pairs → Nested Objects

### Database
Host: localhost
Port: 5432
SSL: true

{
  "database": {
    "host": "localhost",
    "port": 5432,
    "ssl": true
  }
}

Colon-Separated Format

Title: My Project
Description: Project description
Tags: TypeScript, React
Active: true

{
  "title": "My Project",
  "description": "Project description",
  "tags": ["TypeScript", "React"],
  "active": true
}
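
As an illustration of the colon format, the sketch below turns Key: Value lines into an object, splitting comma-separated values into arrays. parseColonLines is a hypothetical function written for this README, not the package's ColonParser, and the comma heuristic is an assumption: it would also split ordinary prose that happens to contain a comma.

```typescript
type Value = string | number | boolean | string[];

function parseColonLines(text: string): Record<string, Value> {
  const out: Record<string, Value> = {};
  for (const line of text.split("\n")) {
    const idx = line.indexOf(":");
    if (idx <= 0) continue; // skip blank lines and lines without a key
    const key = line.slice(0, idx).trim().toLowerCase();
    const raw = line.slice(idx + 1).trim();
    if (raw === "true" || raw === "false") {
      out[key] = raw === "true";
    } else if (raw !== "" && !isNaN(Number(raw))) {
      out[key] = Number(raw);
    } else if (raw.indexOf(",") !== -1) {
      // Assumed heuristic: a comma means a list of values.
      out[key] = raw.split(",").map((s) => s.trim());
    } else {
      out[key] = raw;
    }
  }
  return out;
}

const parsed = parseColonLines(
  "Title: My Project\nDescription: Project description\nTags: TypeScript, React\nActive: true"
);
console.log(JSON.stringify(parsed));
// {"title":"My Project","description":"Project description","tags":["TypeScript","React"],"active":true}
```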

Mixed Content Types

All parsing types can be nested and combined for complex document structures.

🚀 Installation

npm install nx-md-parser

Note: nx-helpers is a peer dependency. npm 7 and later install peer dependencies automatically; with older npm versions, install it explicitly with npm install nx-helpers.

🔍 Logging Configuration

The parser uses micro-logs for detailed logging of internal decision-making processes. Set the DEBUG_LEVEL environment variable to control log verbosity:

# Very verbose - shows all internal reasoning and decisions
DEBUG_LEVEL=debug npm test

# Important decisions and results
DEBUG_LEVEL=info npm test

# Warnings and potential issues only
DEBUG_LEVEL=warn npm test

# Errors only
DEBUG_LEVEL=error npm test

Or create a .env file:

DEBUG_LEVEL=debug

The logs provide visibility into:

  • Format detection reasoning and confidence scores
  • Parser selection decisions
  • Section header vs content classification
  • Content merging logic for bullet formats
  • Schema transformation steps
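
The level filtering described above can be sketched as follows. This is only an illustration of the usual "show this level and anything more severe" rule; parseLevel and shouldLog are hypothetical helpers, and the "info" fallback for an unset or unknown DEBUG_LEVEL is an assumption, since the README does not state the default.

```typescript
// Levels ordered from most verbose to least verbose.
const LEVELS = ["debug", "info", "warn", "error"] as const;
type Level = (typeof LEVELS)[number];

// Turn a raw DEBUG_LEVEL value into a known level.
// The "info" fallback is an assumption, not documented behaviour.
function parseLevel(value: string | undefined, fallback: Level = "info"): Level {
  const v = (value ?? "").toLowerCase();
  const found = LEVELS.find((l) => l === v);
  return found ?? fallback;
}

// A message is shown when it is at least as severe as the configured level.
function shouldLog(configured: Level, message: Level): boolean {
  return LEVELS.indexOf(message) >= LEVELS.indexOf(configured);
}

// In Node.js the configured level would come from process.env.DEBUG_LEVEL.
const configured = parseLevel("warn");
console.log(shouldLog(configured, "debug")); // false
console.log(shouldLog(configured, "error")); // true
```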

📖 Quick Start

import { JSONTransformer, Schema } from 'nx-md-parser';

// Define your schema
const schema = Schema.object({
  title: Schema.string(),
  tags: Schema.array(Schema.string()),
  metadata: Schema.object({
    author: Schema.string(),
    priority: Schema.string(),
  }),
  active: Schema.boolean(),
});

// Create transformer (auto-detects format)
const transformer = new JSONTransformer(schema);

// Works with heading format (###)
const headingResult = transformer.transformMarkdown(`
### Title
My Awesome Project

### Tags
- TypeScript
- React
- Node.js

### Metadata
#### Author
John Doe

#### Priority
High

### Active
true
`);

// Also works with bullet format (-)
const bulletResult = transformer.transformMarkdown(`
- Title
My Awesome Project

- Tags
- TypeScript
- React
- Node.js

- Metadata
Author: John Doe
Priority: High

- Active
true
`);

// And with colon format (Key: Value)
const colonResult = transformer.transformMarkdown(`
Title: My Awesome Project
Tags: TypeScript, React, Node.js
Metadata: Author - John Doe, Priority - High
Active: true
Version: 1.0.0
`);

// All formats produce the same structured result!
console.log(headingResult.result);  // Your structured JSON
console.log(bulletResult.result);   // Same structured JSON
console.log(colonResult.result);    // Same structured JSON

Format Selection & Auto-Detection

import { JSONTransformer, Schema, MarkdownFormat, analyzeMarkdownFormat } from 'nx-md-parser';

// Auto-detect format (recommended)
const transformer = new JSONTransformer(schema); // Automatically chooses best parser

// Force specific built-in format
const headingTransformer = new JSONTransformer(schema, {
  parserOptions: { format: MarkdownFormat.HEADING }
});

const bulletTransformer = new JSONTransformer(schema, {
  parserOptions: { format: MarkdownFormat.BULLET }
});

// Analyze what formats your markdown supports
const analysis = analyzeMarkdownFormat(yourMarkdown);
console.log('Primary format:', analysis.primaryFormat);
console.log('Confidence:', analysis.allMatches[0]?.confidence);
console.log('Section ranges:', analysis.allMatches[0]?.sectionRanges);

// Works with any registered format - the system is extensible!

🎯 Advanced Usage

Custom Fuzzy Matching Configuration

import { JSONTransformer, Schema, defaultMatcherConfig } from 'nx-md-parser';

const schema = Schema.object({
  title: Schema.string(),
  description: Schema.string(),
});

const transformer = new JSONTransformer(schema, {
  // Custom matcher configuration, passed via the matcherConfig option
  matcherConfig: {
    thresholds: {
      keyToKey: 0.8,        // Higher threshold for key matching
      titleToKey: 0.6,      // Lower threshold for title matching
      generic: 0.5          // Baseline threshold
    },
    weights: {
      jaroWinkler: 0.5,     // 50% weight on character similarity
      jaccardTokens: 0.3,   // 30% weight on token similarity
      dice: 0.2             // 20% weight on n-gram similarity
    }
  }
});

Machine Learning - Learning Aliases

import { learnAliasesFromTransformations } from 'nx-md-parser';

// Learn from successful transformations
const learningResult = learnAliasesFromTransformations([
  {
    input: { "Projct Name": "Test", "Desc": "Test description" },
    output: { title: "Test", description: "Test description" },
    schema: yourSchema
  },
  // ... more examples
]);

console.log(learningResult.proposedAliases);
// { "Projct Name": ["title"], "Desc": ["description"] }
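
One simple way to picture alias learning is to pair input keys with output keys that ended up holding the same value. The sketch below does only that; proposeAliases and the Example interface are hypothetical, and the real learnAliasesFromTransformations may use a different and richer strategy. Note that this naive approach is ambiguous when two fields share a value.

```typescript
interface Example {
  input: Record<string, unknown>;
  output: Record<string, unknown>;
}

// Propose "input key -> schema key" aliases by matching equal values.
function proposeAliases(examples: Example[]): Record<string, string[]> {
  const aliases: Record<string, string[]> = {};
  for (const { input, output } of examples) {
    for (const inKey of Object.keys(input)) {
      for (const outKey of Object.keys(output)) {
        const sameValue =
          JSON.stringify(input[inKey]) === JSON.stringify(output[outKey]);
        // Identical keys need no alias.
        if (!sameValue || inKey === outKey) continue;
        const list = aliases[inKey] ?? (aliases[inKey] = []);
        if (list.indexOf(outKey) === -1) list.push(outKey);
      }
    }
  }
  return aliases;
}

const proposed = proposeAliases([
  {
    input: { "Projct Name": "Test", Desc: "Test description" },
    output: { title: "Test", description: "Test description" },
  },
]);
console.log(JSON.stringify(proposed));
// {"Projct Name":["title"],"Desc":["description"]}
```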

Schema Loading from Files

import { createTransformerFromSchemaFile } from 'nx-md-parser';

// schema.json
// {
//   "type": "object",
//   "properties": {
//     "title": { "type": "string" },
//     "tags": { "type": "array", "items": { "type": "string" } }
//   }
// }

const transformer = createTransformerFromSchemaFile('./schema.json');

Advanced Merging with Roles

import { mergeWithRoles } from 'nx-md-parser';

const roleBasedData = [
  { role: 'user-profile', value: { name: 'Alice', email: 'alice@example.com' } },
  { role: 'user-preferences', value: { theme: 'dark', notifications: true } },
  { role: 'account-settings', value: { plan: 'premium', storage: '100GB' } }
];

const merged = mergeWithRoles(roleBasedData);
// {
//   userProfile: { name: 'Alice', email: 'alice@example.com' },
//   userPreferences: { theme: 'dark', notifications: true },
//   accountSettings: { plan: 'premium', storage: '100GB' }
// }
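
The shape of the result above suggests that each kebab-case role becomes a camelCase key. A minimal sketch of that idea is shown here; mergeByRole, kebabToCamel and RoleItem are hypothetical names, and the handling of repeated roles (shallow merge, later items win) is an assumption rather than documented mergeWithRoles behaviour.

```typescript
interface RoleItem {
  role: string;
  value: Record<string, unknown>;
}

// "user-profile" -> "userProfile"
function kebabToCamel(role: string): string {
  return role.replace(/-([a-z0-9])/g, (_m, ch: string) => ch.toUpperCase());
}

function mergeByRole(items: RoleItem[]): Record<string, Record<string, unknown>> {
  const merged: Record<string, Record<string, unknown>> = {};
  for (const item of items) {
    const key = kebabToCamel(item.role);
    // Assumed policy: items sharing a role are shallow-merged, later items win.
    merged[key] = { ...(merged[key] ?? {}), ...item.value };
  }
  return merged;
}

console.log(
  JSON.stringify(
    mergeByRole([
      { role: "user-profile", value: { name: "Alice" } },
      { role: "user-preferences", value: { theme: "dark" } },
    ])
  )
);
// {"userProfile":{"name":"Alice"},"userPreferences":{"theme":"dark"}}
```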

Custom Parsers & Format Extension

import { BaseMarkdownParser, MarkdownFormat, MarkdownSection, getFormatDetector } from 'nx-md-parser';

// Example: YAML Frontmatter parser
class YamlFrontmatterParser extends BaseMarkdownParser {
  canParse(markdown: string): boolean {
    return markdown.startsWith('---\n');
  }

  parseSections(markdown: string): MarkdownSection[] {
    // Parse YAML frontmatter + markdown body
    return [];
  }

  getFormatName(): MarkdownFormat {
    return 'yaml-frontmatter' as any;
  }
}

// Example: Numbered sections parser
class NumberedSectionsParser extends BaseMarkdownParser {
  canParse(markdown: string): boolean {
    return /^\d+\.\s/.test(markdown);
  }

  parseSections(markdown: string): MarkdownSection[] {
    // Parse numbered sections like "1. Introduction"
    return [];
  }

  getFormatName(): MarkdownFormat {
    return 'numbered-sections' as any;
  }
}

// Register multiple custom parsers
const detector = getFormatDetector();
detector.registerParser(new YamlFrontmatterParser());
detector.registerParser(new NumberedSectionsParser());

// Now supports: headings, bullets, colon format, YAML frontmatter, numbered sections, etc.
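
The parseSections bodies above are left empty. As a rough idea of what a numbered-sections implementation could look like, here is a standalone sketch. It does not extend BaseMarkdownParser, and the local Section interface is a simplified stand-in for MarkdownSection (it omits the format field), so treat it as a starting point rather than a drop-in parser.

```typescript
// Simplified stand-in for the library's MarkdownSection type.
interface Section {
  heading: string;
  content: string;
  level: number;
}

// Split text into sections at lines such as "1. Introduction".
function parseNumberedSections(markdown: string): Section[] {
  const sections: Section[] = [];
  let heading: string | null = null;
  let body: string[] = [];

  // Store the section collected so far, if any, and reset the body buffer.
  const flush = (): void => {
    if (heading !== null) {
      sections.push({ heading, content: body.join("\n").trim(), level: 1 });
    }
    body = [];
  };

  for (const line of markdown.split("\n")) {
    const match = /^(\d+)\.\s+(.*)$/.exec(line);
    if (match) {
      flush();
      heading = match[2].trim();
    } else if (heading !== null) {
      body.push(line); // text before the first numbered line is ignored
    }
  }
  flush();
  return sections;
}

console.log(JSON.stringify(parseNumberedSections("1. Introduction\nHello\n\n2. Setup\nInstall it\n")));
// [{"heading":"Introduction","content":"Hello","level":1},{"heading":"Setup","content":"Install it","level":1}]
```

Because every numbered line starts a new section, a numbered list inside a section body would be split as well; a real parser would need a rule to tell the two apart.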

JSON to Markdown Generation

import { jsonToMarkdown } from 'nx-md-parser';

const data = {
  title: "Project Alpha",
  features: ["AI", "ML", "Cloud"],
  metadata: { version: "1.0.0" }
};

console.log(jsonToMarkdown(data));
// # Title
// Project Alpha
//
// # Features
// - AI
// - ML
// - Cloud
//
// # Metadata
// ## Version
// 1.0.0
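
For readers curious how the output above could be produced, the following sketch reproduces it with a short recursive function. toMarkdown, titleCase and the Json type are hypothetical and written for this example; the actual jsonToMarkdown may format some values differently.

```typescript
type Json = string | number | boolean | Json[] | { [key: string]: Json };

// "title" -> "Title"
function titleCase(key: string): string {
  return key.charAt(0).toUpperCase() + key.slice(1);
}

function toMarkdown(data: { [key: string]: Json }, level = 1): string {
  const blocks: string[] = [];
  for (const key of Object.keys(data)) {
    const value = data[key];
    const heading = "#".repeat(level) + " " + titleCase(key);
    let body: string;
    if (Array.isArray(value)) {
      body = value.map((item) => "- " + String(item)).join("\n"); // list items
    } else if (typeof value === "object") {
      body = toMarkdown(value, level + 1); // nested object: deeper headings
    } else {
      body = String(value);
    }
    blocks.push(heading + "\n" + body);
  }
  // Top-level sections are separated by a blank line, nested ones are not.
  return blocks.join(level === 1 ? "\n\n" : "\n");
}

const sample: { [key: string]: Json } = {
  title: "Project Alpha",
  features: ["AI", "ML", "Cloud"],
  metadata: { version: "1.0.0" },
};
console.log(toMarkdown(sample));
```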

📚 API Reference

Core Classes

JSONTransformer

new JSONTransformer(
  schema: SchemaType,
  options?: {
    matcherConfig?: Partial<MatcherConfig>;
    parserOptions?: ParserOptions;
  }
)

transformMarkdown(markdown: string): TransformResult
transform(input: any): TransformResult

Parser Options:

interface ParserOptions {
  format?: MarkdownFormat;           // AUTO, HEADING, BULLET, COLON, MIXED
  sectionKeywords?: string[];        // Keywords for bullet section detection
  fuzzyThreshold?: number;          // Fuzzy matching threshold
}

LearningTransformer (Optional)

new LearningTransformer(
  schema: SchemaType,
  matcherConfig?: Partial<MatcherConfig>,
  mlOptions?: {
    storage?: { type: 'file' | 'database', path?: string },
    enableLearning?: boolean
  }
)

transformMarkdown(markdown: string): TransformResult
transform(input: any): TransformResult
transformMarkdownWithLearning(markdown: string): Promise<TransformResult>
transformWithLearning(input: any): Promise<TransformResult>

Requires: npm install @xronoces/xronox-ml

Features:

  • Persistent machine learning data storage
  • Continuous improvement from transformation history
  • Automatic loading of learned configurations
  • Graceful fallback when ML package unavailable

Parser Classes

BaseMarkdownParser - Abstract base class for creating custom parsers

abstract class BaseMarkdownParser {
  canParse(markdown: string): boolean;
  parseSections(markdown: string): MarkdownSection[];
  getFormatName(): MarkdownFormat;
}

HeadingParser - Parses ### Section format

import { HeadingParser } from 'nx-md-parser';
const parser = new HeadingParser();

BulletParser - Parses - Section bullet format

import { BulletParser } from 'nx-md-parser';
const parser = new BulletParser();

ColonParser - Parses Key: Value colon format

import { ColonParser } from 'nx-md-parser';
const parser = new ColonParser();

FormatDetector - Auto-detects and selects appropriate parsers

import { FormatDetector, getFormatDetector, analyzeMarkdownFormat } from 'nx-md-parser';

const detector = getFormatDetector();
const format = detector.detect(markdown);  // MarkdownFormat
const parser = detector.getParser(format, markdown);

// Advanced format analysis with confidence scores and line ranges
const analysis = analyzeMarkdownFormat(markdown);
console.log(analysis.primaryFormat);     // 'heading' | 'bullet' | 'colon'
console.log(analysis.allMatches[0]);     // { format, confidence, sections, sectionRanges }

Schema Builders

Schema.string(): SchemaType
Schema.number(): SchemaType
Schema.boolean(): SchemaType
Schema.array(items: SchemaType): SchemaType
Schema.object(properties: Record<string, SchemaType>): SchemaType

Utility Functions

Transformation Utilities

mergeTransformResults(...results: TransformResult[]): TransformResult
jsonToMarkdown(data: any, level?: number): string

Format Analysis

analyzeMarkdownFormat(markdown: string): FormatAnalysisResult
// Returns detailed analysis of what formats the markdown supports
// with confidence scores, section counts, and line ranges

Schema Management

loadSchemaFromFile(filePath: string): SchemaType
createTransformerFromSchemaFile(schemaFilePath: string): JSONTransformer
createNxHelpersTransformer(schema: SchemaType, config?: Partial<MatcherConfig>): any

Machine Learning

learnAliasesFromTransformations(transformations: TransformationExample[]): LearningResult

Parser Types & Enums

enum MarkdownFormat {
  AUTO = 'auto',       // Auto-detect format
  HEADING = 'heading',  // ### Section format
  BULLET = 'bullet',    // - Section format
  COLON = 'colon',      // Key: Value format
  MIXED = 'mixed'       // Mixed formats
}

interface MarkdownSection {
  heading: string;
  content: string;
  level: number;
  format: 'heading' | 'bullet' | 'mixed';
}

interface ParserOptions {
  format?: MarkdownFormat;
  sectionKeywords?: string[];
  fuzzyThreshold?: number;
}

nx-helpers Integration

// Merging
mergeNoRedundancy(base: T, override: Partial<T>): T
mergeMultiple(...objects: Partial<T>[]): T
mergeWithRoles(items: MergableItem[]): any

// Matching
bestMatchOneToMany(term: string, candidates: string[], config: MatcherConfig): StringScore | null
defaultMatcherConfig(): MatcherConfig

// Schema Building (nx-helpers)
nxString: SchemaNode
nxNumber: SchemaNode
nxBoolean: SchemaNode
nxArray(items: SchemaNode): SchemaNode
nxObject(properties: Record<string, SchemaNode>): SchemaNode

🔬 Examples

Run the comprehensive examples:

# Basic usage
npm run example

# Advanced features (merging, ML, etc.)
npm run integration-example

🧪 Testing

npm test              # Run test suite
npm run test:watch    # Watch mode

📊 Performance & Accuracy

Matching Algorithms (nx-helpers v1.5.0)

  • Jaro-Winkler: Character-level similarity (40% weight)
  • Jaccard Tokens: Token-based similarity (30% weight)
  • Dice Coefficient: N-gram similarity (20% weight)
  • Levenshtein Ratio: Edit distance (10% weight)
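
The weighting above amounts to a weighted average of per-algorithm scores. The sketch below shows that combination with two of the four measures implemented (Jaccard tokens and Levenshtein ratio); Jaro-Winkler and Dice are left out for brevity, so the weights are re-normalized over the scorers supplied. All function names are hypothetical and this is not the nx-helpers implementation.

```typescript
type Scorer = (a: string, b: string) => number; // returns a value in [0, 1]

// Share of whitespace-separated tokens the two strings have in common.
function jaccardTokens(a: string, b: string): number {
  const ta = new Set(a.toLowerCase().split(/\s+/).filter(Boolean));
  const tb = new Set(b.toLowerCase().split(/\s+/).filter(Boolean));
  if (ta.size === 0 && tb.size === 0) return 1;
  let shared = 0;
  ta.forEach((t) => {
    if (tb.has(t)) shared++;
  });
  return shared / (ta.size + tb.size - shared);
}

// 1 minus the edit distance divided by the longer string's length.
function levenshteinRatio(a: string, b: string): number {
  if (a.length === 0 && b.length === 0) return 1;
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return 1 - prev[b.length] / Math.max(a.length, b.length);
}

// Weighted average; dividing by the total means weights need not sum to 1.
function combinedScore(
  a: string,
  b: string,
  parts: { scorer: Scorer; weight: number }[]
): number {
  const total = parts.reduce((sum, p) => sum + p.weight, 0);
  if (total === 0) return 0;
  const weighted = parts.reduce((sum, p) => sum + p.weight * p.scorer(a, b), 0);
  return weighted / total;
}

const parts = [
  { scorer: jaccardTokens, weight: 0.3 },
  { scorer: levenshteinRatio, weight: 0.1 },
];
console.log(combinedScore("projct name", "project name", parts));
```

A caller would compare the combined score against a threshold, such as the keyToKey or titleToKey values, to decide whether two names count as a match.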

Real-World Results

  • Typo Correction: 85%+ accuracy on common typos
  • Case Handling: 100% accuracy on case variations
  • Context Awareness: Different thresholds for different match types
  • Machine Learning: Continuous improvement with usage data

🤝 Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

ISC License - see LICENSE file for details.

🙏 Acknowledgments

  • Built on nx-helpers for advanced AI capabilities
  • Inspired by the need for intelligent markdown processing in enterprise workflows
  • Thanks to the nx-intelligence team for the powerful fuzzy matching algorithms

📋 What's New in v2.1

  • Format Analysis API: New analyzeMarkdownFormat() function provides detailed format detection with confidence scores and line ranges
  • Multi-Format Detection: Detects all supported formats in a document with ranking by confidence
  • Section Range Analysis: Get exact line numbers for each section in your markdown
  • Mixed Content Detection: Identifies documents that contain multiple format types

📋 What's New in v2.0

  • Extensible Multi-Format Architecture: Support for any markdown variant through custom parsers, not just headings and bullets
  • Intelligent Parser System: Auto-detection and selection from multiple registered parsers
  • Plugin Architecture: Easy to extend with custom parsers for YAML frontmatter, numbered sections, XML, or any format
  • Modular Design: Clean separation of parsing, conversion, and transformation logic
  • Backward Compatibility: All existing code continues to work unchanged
  • Enhanced TypeScript: Better type safety with extensible parser interfaces

Made with ❤️ by the nx-intelligence team