@udx/mcurl

v1.1.1

Published

a month ago

curl but in markdown - fetches content from URLs and converts to markdown

0High
0Medium
0Low

curl markdown html json yaml web-scraping

mCurl

curl but in markdown - fetches content from URLs and converts to markdown.

By default, mCurl returns the body/article/content of any URL that's HTML and converts it to valid Markdown. If the response is in JSON, it also converts it to Markdown in a readable format.

Features

Converts HTML to Markdown, extracting main content by default
Converts JSON to Markdown in a structured, readable format
Pluggable handlers for different content types (extensible architecture)
Selector support for extracting specific parts of HTML
Global config file support (~/.udx/mcurl.yml) for domain-specific settings
Options to include site's menu/header/footer
Designed for AI tool integration and chaining with SERP API responses
Customizable headers, user agents, and authentication

Installation

# Install globally
npm install -g @udx/mcurl
# After installation, the command is available as 'mcurl'

After installation, you can use the command mcurl directly.

Usage

# Basic usage - fetch a URL and convert to markdown
mcurl https://udx.io

# Extract specific content using CSS selector
mcurl --selector "article.main-content" https://udx.io/about

# Include header and footer
mcurl --include header,footer https://udx.io

# Use custom headers
mcurl --header "Authorization: Bearer token" https://api.example.com

# Save output to file
mcurl --output result.md https://udx.io

# Use custom user agent
mcurl --user-agent "MyBot/1.0" https://udx.io

# Verbose output
mcurl --verbose https://udx.io

# Use like curl with JSON response converted to markdown
mcurl -X POST "https://api.example.com/endpoint" -H "Content-Type: application/json" -d '{"query": "example"}'

# WARNING: Never use internal development ports (like localhost:3456) in production or share them externally
# Always use public API endpoints for production use

# Example with Elasticsearch-like query (for development environments)
mcurl -X POST "http://localhost:3456/harvest-v1/_search" -H "Content-Type: application/json" -d '{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {"range": {"date": {"gte": "2024-01-01", "lte": "2024-12-31"}}}
      ]
    }
  },
  "aggs": {
    "by_billable": {
      "terms": {
        "field": "billable"
      },
      "aggs": {
        "hours_sum": {"sum": {"field": "hours"}},
        "cost_estimate": {"sum": {"script": {"source": "doc[\"hours\"].value * 85"}}},
        "revenue": {"sum": {"field": "billableAmount"}}
      }
    }
  }
}'

# Complex Elasticsearch-like queries (for production)
mcurl -X POST "https://api.example.com/search" -H "Content-Type: application/json" -d '{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {"range": {"date": {"gte": "2024-01-01", "lte": "2024-12-31"}}}
      ]
    }
  },
  "aggs": {
    "by_billable": {
      "terms": {
        "field": "billable"
      },
      "aggs": {
        "hours_sum": {"sum": {"field": "hours"}},
        "cost_estimate": {"sum": {"script": {"source": "doc[\"hours\"].value * 85"}}},
        "revenue": {"sum": {"field": "billableAmount"}}
      }
    }
  }
}'

Configuration File

mCurl supports a global configuration file at ~/.udx/mcurl.yml that allows you to specify default options and domain-specific settings:

# Global defaults
user-agent: "mCurl/1.0"
timeout: 30000
format: markdown

# Domain-specific settings
domains:
  "udx.io":
    headers:
      - "Cookie: session=abc123"
    selector: "main.content"
  
  "api.example.com":
    headers:
      - "Authorization: Bearer YOUR_TOKEN"
    user-agent: "CustomBot/2.0"

Examples

HTML Example

# Fetch and convert HTML to markdown
mcurl https://udx.io

Output:

# UDX - Digital Experience Agency

We build digital experiences that transform businesses and drive growth.

## Our Services

- Web Development
- Digital Marketing
- UX Design
- Cloud Solutions

JSON Example

# Fetch and convert JSON to markdown
mcurl https://udx.io/wp-json/udx/v2/works/search?query=&page=0

Output:

# JSON Response

## results

| id | title | excerpt | ... |
| --- | --- | --- | --- |
| 123 | Project A | This is project A | ... |
| 456 | Project B | This is project B | ... |
...

## pagination

```json
{
  "currentPage": 0,
  "totalPages": 5,
  "totalItems": 48
}



## Integration with AI Tools

mCurl is designed to be easily integrated with AI tools and command-line workflows:

```bash
# Fetch content and pipe to other tools
mcurl https://udx.io | head -20

# Chain with jq for JSON processing
mcurl https://api.github.com/repos/torvalds/linux --format json | jq '.stargazers_count'

License

MIT

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme