@udx/mq - Markdown Query

A powerful tool for querying and transforming markdown documents, designed as a companion to @udx/mcurl. Think of it as "jq for markdown" - a tool that lets you treat markdown as structured data.

Key Capabilities

  • Clean Content Extraction: Pull narrative content without code blocks for cleaner analysis
  • Structured Querying: Filter and transform markdown content like jq does for JSON
  • Document Analysis: Generate actionable insights and understand document structure
  • Format Conversion: Transform between JSON, markdown, and other formats
  • Composability: Combine with other tools in Unix-style pipelines

Why Clean Content Extraction Matters

Code blocks in technical documents serve a crucial purpose for developers, but they act as "noise" when you analyze the narrative flow. By separating content from code, mq helps:

  • Improve focus on conceptual information
  • Extract cleaner summaries without code snippets
  • Identify key points and arguments more easily
  • Create more approachable versions of technical content

Installation

npm install -g @udx/mq

Usage Examples

Extract Clean Content (No Code Blocks)

# Extract clean content without code blocks
mq --clean-content --input test/fixtures/test-code-blocks.md

# Filter content to include only h1 and h2 headings and their content
mq --clean-content=2 --input test/fixtures/complex-test.md

# Get clean content in JSON format
mq --clean-content --format json --input test/fixtures/test-code-blocks.md

Basic Query Operations

# Extract headings from a document (returns JSON structure by default)
mq --input test/fixtures/basic-test.md '.headings[]'

# Analyze document structure (returns formatted Markdown report)
mq --analyze --input test/fixtures/complex-test.md

# Generate a table of contents (returns Markdown TOC)
mq --input test/fixtures/test-document.md '.toc'

# Extract code blocks by language (returns JSON structure)
mq --language javascript --input test/fixtures/test-code-blocks.md

# Extract code content only in raw format
mq --language javascript --input test/fixtures/test-code-blocks.md | jq -r '.[0].content'

# Extract all images (returns JSON structure)
mq --input test/fixtures/test-images.md '.images[]'

# Extract first sentences from sections (returns text content)
mq --first-sentences 2 --input test/fixtures/test-sentences.md

Pipe with mcurl

# Fetch web content and analyze it
mcurl https://udx.io | mq --analyze

# Fetch web content and extract key information
mcurl https://udx.io/work | mq --clean-content

# First analyze the overall structure of web content
mcurl https://udx.io/about | mq --analyze

Complex Queries

# Extract level 2 headings
mq --input test/fixtures/complex-test.md '.headings[] | select(.level == 2)'

# Extract links to specific domain
mq --input test/fixtures/test-document.md '.links[] | select(.href | contains("example"))'

# Extract code blocks and make them collapsible
mq --input test/fixtures/test-code-blocks.md --transform-code-blocks

Integration with curl and jq

One of the most powerful aspects of mq is its ability to integrate with curl, mcurl, and jq in Unix-style pipelines:

# Fetch a GitHub markdown file and extract headings
curl -s https://raw.githubusercontent.com/WordPress/wordpress-develop/HEAD/README.md | mq '.headings[]'

# Get content from a website and extract clean narrative content
mcurl https://udx.io/about | mq --clean-content

# Process markdown content and pipe to jq for further filtering
curl -s https://raw.githubusercontent.com/WordPress/wordpress-develop/HEAD/README.md | mq --clean-content --format json | \
  jq '[.[] | select(.type=="heading" and .level == 1)]'

# Extract expertise data from UDX API using proper jq patterns
curl -s 'https://udx.io/wp-json/udx/v2/works/search?query=&page=1' | \
  jq '.facets.expertise[] | select(.count > 10) | {name: .name, count: .count}'

Advanced Features

Clean Content Extraction

The clean content extractor is one of mq's most powerful features for document analysis. It removes code blocks while preserving the document's narrative structure:

# Extract clean content without code blocks
mq --clean-content --input test/fixtures/test-code-blocks.md

# Limit extraction to specific heading levels (h1 and h2 only)
mq --clean-content=2 --input test/fixtures/complex-test.md

# Get JSON output for programmatic processing
mq --clean-content --format json --input test/fixtures/test-code-blocks.md | jq length
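
For programmatic processing in Node.js itself, here is a minimal sketch that runs the same command and consumes the parsed result. The flags and fixture path are taken from the examples above; the exact shape of the JSON output is inferred from the jq filters used elsewhere in this README, so treat it as an assumption.

// Hypothetical example: consume `mq --clean-content --format json` output from Node.js
const { execFileSync } = require('child_process');

const stdout = execFileSync(
  'mq',
  ['--clean-content', '--format', 'json', '--input', 'test/fixtures/test-code-blocks.md'],
  { encoding: 'utf8' }
);

// Assumed: the JSON output parses to an array of content nodes (objects with fields like `type` and `level`)
const nodes = JSON.parse(stdout);
console.log(`Extracted ${nodes.length} content nodes`);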

Benefits of Clean Content Extraction

  • Improved Analysis: Focus on the narrative without code noise
  • Better Summarization: Generate more coherent summaries from technical content
  • Hierarchical Understanding: Preserve document structure while filtering code
  • Content Repurposing: Transform code-heavy tutorials into conceptual guides
  • Incremental Content Processing: Extract varying amounts of content for different purposes

Advanced UDX API Examples

# Extract links from HTML content using mq
mcurl https://udx.io/about | mq '.links[0:5]'

# Extract clean content from a WordPress page for easier reading
mcurl https://udx.io/guidance | mq --clean-content

# First analyze page structure, then extract specific elements
mcurl https://udx.io/work | mq --analyze

Approach

Best Practices for Working with Markdown and APIs

  • Native Node.js Functions: Prefer Node.js built-in modules (such as https) for fetching API data over dedicated third-party HTTP clients. For example:

    // Using native Node.js rather than dedicated modules
    const https = require('https');
      
    function fetchContent(url) {
      // Function fetches content from URL using native Node.js modules
      // Input: url - String URL to fetch
      // Output: Promise that resolves to response body
      return new Promise((resolve, reject) => {
        https.get(url, (res) => {
          let data = '';
          res.on('data', (chunk) => { data += chunk; });
          res.on('end', () => { resolve(data); });
        }).on('error', reject);
      });
    }
  • Logging and Debugging: Always log API request metadata and response data for troubleshooting:

    // Proper logging for API requests
    function logApiRequest(url, options, response) {
      // Log API request details when verbose mode is enabled
      // Input: url - request URL, options - request options, response - API response
      // Output: None, logs to console
      if (process.env.DEBUG || process.env.VERBOSE) {
        console.log(`[API Request] ${options.method || 'GET'} ${url}`);
        console.log(`[API Response] Status: ${response.statusCode}`);
        if (process.env.VERBOSE) {
          console.log(`[API Response Body] ${JSON.stringify(response.body).substring(0, 200)}...`);
        }
      }
    }
  • Use Lodash for Complex Operations: Leverage Lodash for data transformations to improve readability and fault tolerance in your pipeline.
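
    For example, a minimal sketch (the helper is illustrative and not part of this package; the heading fields follow the JSON examples above):

    // Hypothetical helper: summarize heading levels from `mq --format json` output using Lodash
    const _ = require('lodash');

    function summarizeHeadings(headings) {
      // Input: headings - Array of objects like { type: 'heading', level: Number }
      // Output: Object mapping heading level to count, e.g. { '1': 1, '2': 4 }
      return _.countBy(headings, 'level');
    }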

  • Progressive Enhancement Workflow:

    1. Start by analyzing content structure with mq --analyze
    2. Extract relevant sections with targeted selectors
    3. Process and transform with clean content extraction
    4. Format output appropriately for your use case
  • Testing Strategy: Test your pipelines using REST API tools, Mocha for unit tests, or simple curl commands for verification.
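
    A possible Mocha sketch (the test file itself is an assumption, not shipped with this package) that runs one of the documented commands against a repository fixture:

    // Hypothetical test: verify `mq --input test/fixtures/basic-test.md '.headings[]'` produces output
    const assert = require('assert');
    const { execFileSync } = require('child_process');

    describe('mq pipeline', () => {
      it('extracts headings from the basic fixture', () => {
        const out = execFileSync(
          'mq',
          ['--input', 'test/fixtures/basic-test.md', '.headings[]'],
          { encoding: 'utf8' }
        );
        assert.ok(out.trim().length > 0, 'expected non-empty heading output');
      });
    });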

  • Documentation: Add comprehensive function headers that explain purpose, inputs, and outputs for all custom operations.

Common Pipelines

# Extract content → Clean → Filter → Format as JSON
mcurl https://udx.io/about | mq --clean-content | mq --format json | jq 'length'

# Analyze content structure then target specific elements
mcurl https://udx.io/work | mq --analyze && mcurl https://udx.io/work | mq '.headings[0:5]'

# Process multiple sources with consistent transformations
for url in "udx.io/about" "udx.io/work" "udx.io/guidance"; do
  echo "Processing $url"
  mcurl https://$url | mq --clean-content=2 | wc -l
done

UDX API Integration Patterns

mq can be used as part of a larger data processing pipeline, working alongside other tools like curl and jq:

# Use mq for HTML content processing
mcurl https://udx.io/work | mq --clean-content | grep "Cloud"

# Use curl+jq for JSON API processing (not mcurl!)
curl -s 'https://udx.io/wp-json/udx/v2/works/search?query=&page=1' | \
  jq '.facets.expertise[] | select(.count > 10) | {name: .name, count: .count}'

# Get industry distribution with better formatting
curl -s 'https://udx.io/wp-json/udx/v2/works/search?query=&page=1' | \
  jq '.facets.industries[] | select(.count > 5) | {name: .name, count: .count}'

# Pipeline: Extract content from UDX pages, clean it, then analyze structure
for page in "about" "work" "guidance"; do
  mcurl "https://udx.io/$page" | mq --clean-content | mq --analyze | grep -i "headings"
done