jsonrocks

v0.0.67

turn any page into json

JSON Rocks

json.rocks

A web scraping service that converts any webpage into structured JSON data - perfect for generating rich link previews in chat applications.

Turn any URL into clean, structured metadata including title, description, images, videos, and more. Ideal for Slack-style link unfurling, Discord embeds, WhatsApp previews, and any chat application that needs rich link previews.

🎯 Use Case: Chat Link Previews

When users paste URLs in your chat app, json.rocks automatically extracts:

  • Title - Page title for the preview card
  • Description - Meta description or article excerpt
  • Image - Open Graph image, featured image, or thumbnail
  • Logo - Site favicon or publisher logo
  • Author - Article author name
  • Publisher - Site/publication name
  • Videos - Embedded videos (YouTube, Vimeo, etc.)
  • Date - Publication date
  • URL - Canonical URL

Perfect for building chat features like:

  • Slack-style link unfurling
  • Discord rich embeds
  • WhatsApp link previews
  • iMessage link bubbles
  • Telegram instant view

🚀 Quick Start

Installation

npm install -g jsonrocks

Run the Server

# Start on default port 9980
jsonrocks

# Or run directly from source
./bin/server.js

# Custom port
./bin/server.js --port 8080

# HTTPS mode (requires SSL certificates)
./bin/server.js --https --key ./privkey.pem --cert ./fullchain.pem

Test the API

# Extract metadata from a URL
curl "http://localhost:9980/?uri=https://github.com/spux/json.rocks"

# Search the web (returns JSON results)
curl "http://localhost:9980/?uri=chat+applications"

📖 API Reference

Main Endpoint: GET /

Extract structured JSON metadata from any URL.

Parameters:

  • uri (required) - The URL to scrape OR a search query
  • refresh (optional) - Set to true to bypass cache
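
A minimal sketch of how the two parameters above might be combined into a request URL. The helper name `buildRequestUrl` and the default base URL are illustrative assumptions, not part of json.rocks; only `uri` and `refresh` come from the parameter list.

```javascript
// Hypothetical helper: builds a request URL from the documented `uri` and
// `refresh` parameters. The base URL assumes the default port 9980.
function buildRequestUrl(target, { refresh = false, base = 'http://localhost:9980/' } = {}) {
  const requestUrl = new URL(base)
  // URLSearchParams percent-encodes the value, so `&` or `#` inside the
  // target cannot be mistaken for part of the outer query string.
  requestUrl.searchParams.set('uri', target)
  if (refresh) requestUrl.searchParams.set('refresh', 'true')
  return requestUrl.toString()
}

// A page URL that itself carries a query string, fetched with cache bypass
console.log(buildRequestUrl('https://example.com/article?id=1&lang=en', { refresh: true }))

// A search query: spaces are serialized as "+"
console.log(buildRequestUrl('chat applications'))
```

Encoding the target matters most when it contains its own query string; without it, everything after the first `&` would be read as a separate parameter of the json.rocks request.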

Example Request:

// Fetch link preview data
const response = await fetch('http://localhost:9980/?uri=https://example.com/article')
const data = await response.json()

Example Response:

{
  "title": "Amazing Article Title",
  "description": "A compelling description of the article content that will appear in your chat preview.",
  "image": "https://example.com/images/featured.jpg",
  "logo": "https://example.com/favicon.ico",
  "author": "Jane Doe",
  "publisher": "Example News",
  "date": "2025-01-13T10:30:00.000Z",
  "url": "https://example.com/article",
  "lang": "en",
  "videos": [
    {
      "url": "https://youtube.com/watch?v=...",
      "width": 1280,
      "height": 720
    }
  ],
  "images": [
    {
      "src": "https://example.com/image1.jpg",
      "alt": "Image description"
    }
  ],
  "links": [
    {
      "href": "https://example.com/related",
      "text": "Related Article"
    }
  ]
}

Health Check: GET /health

Returns server status.

curl http://localhost:9980/health
# Response: { "status": "ok" }
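
The health endpoint can also be polled from a script, for example to wait for the server before sending preview requests. The following is a sketch under assumptions: `waitForHealthy`, its options, and the retry policy are hypothetical and not part of the package; only the `/health` path and the `{ "status": "ok" }` body come from the text above.

```javascript
// Hypothetical helper: polls GET /health until it reports { status: "ok" }.
// `fetchImpl` is injectable so the function can be exercised without a server.
async function waitForHealthy(baseUrl, { attempts = 5, delayMs = 500, fetchImpl = fetch } = {}) {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      const response = await fetchImpl(new URL('/health', baseUrl))
      if (response.ok) {
        const body = await response.json()
        if (body.status === 'ok') return true
      }
    } catch {
      // Connection refused or invalid JSON: treat as "not ready yet".
    }
    // Pause between attempts, but not after the final one.
    if (attempt < attempts) {
      await new Promise(resolve => setTimeout(resolve, delayMs))
    }
  }
  return false
}
```

Calling `await waitForHealthy('http://localhost:9980')` resolves to `true` once the server answers and to `false` after the attempts run out. The default `fetch` assumes Node.js 18 or later.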

Admin Endpoints

View allowed domains:

curl http://localhost:9980/admin/domains

Reload domain configuration:

curl -X POST http://localhost:9980/admin/reload-domains

💬 Chat App Integration Examples

Basic Link Preview

async function generateLinkPreview(url) {
  const response = await fetch(
    `http://localhost:9980/?uri=${encodeURIComponent(url)}`
  )
  const data = await response.json()

  return {
    title: data.title,
    description: data.description,
    thumbnail: data.image,
    favicon: data.logo,
    siteName: data.publisher,
    author: data.author,
    publishedAt: data.date
  }
}

// Usage in your chat app
const preview = await generateLinkPreview('https://github.com/spux/json.rocks')
console.log(preview)

React Component Example

import { useState, useEffect } from 'react'

function LinkPreview({ url }) {
  const [preview, setPreview] = useState(null)
  const [loading, setLoading] = useState(true)

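  useEffect(() => {
    setLoading(true)
    fetch(`http://localhost:9980/?uri=${encodeURIComponent(url)}`)
      .then(res => res.json())
      .then(data => setPreview(data))
      .catch(() => setPreview(null)) // network or parse failure: render nothing
      .finally(() => setLoading(false))
  }, [url])

  if (loading) return <div>Loading preview...</div>
  if (!preview) return null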

  return (
    <div className="link-preview">
      {preview.image && <img src={preview.image} alt={preview.title} />}
      <h3>{preview.title}</h3>
      <p>{preview.description}</p>
      <span className="source">{preview.publisher}</span>
    </div>
  )
}

Node.js Backend Integration

import express from 'express'
import fetch from 'node-fetch'

const app = express()

// API endpoint for your chat app to request link previews
app.get('/api/preview', async (req, res) => {
  const { url } = req.query

  if (!url) {
    return res.status(400).json({ error: 'URL required' })
  }

  try {
    const response = await fetch(
      `http://localhost:9980/?uri=${encodeURIComponent(url)}`
    )
    const data = await response.json()

    // Transform to your chat app's format
    res.json({
      title: data.title,
      description: data.description,
      image: data.image,
      siteName: data.publisher,
      url: data.url
    })
  } catch (error) {
    res.status(500).json({ error: 'Failed to generate preview' })
  }
})

app.listen(3000)

Python/Flask Integration

from flask import Flask, request, jsonify
import requests

app = Flask(__name__)

@app.route('/api/preview')
def get_preview():
    url = request.args.get('url')
    if not url:
        return jsonify({'error': 'URL required'}), 400

    try:
        # Pass the target via params so requests URL-encodes it
        response = requests.get(
            'http://localhost:9980/', params={'uri': url}, timeout=10
        )
        data = response.json()

        return jsonify({
            'title': data.get('title'),
            'description': data.get('description'),
            'image': data.get('image'),
            'siteName': data.get('publisher'),
            'url': data.get('url')
        })
    except Exception as e:
        return jsonify({'error': str(e)}), 500

if __name__ == '__main__':
    app.run(port=3000)

⚙️ Configuration

Domain Allowlist

For security, only whitelisted domains can be scraped. json.rocks includes 1000+ popular domains by default, covering:

  • Social media (Twitter, Reddit, LinkedIn, etc.)
  • Development platforms (GitHub, GitLab, npm, etc.)
  • News sites (NYTimes, BBC, CNN, etc.)
  • Cloud providers (AWS, Azure, Vercel, Netlify, etc.)
  • Documentation sites (MDN, React, Node.js, etc.)
  • And many more...

Default domains: Loaded from data/allowed-domains-top1000.json (committed to git)

Adding Custom Domains

To add your own domains without modifying the default list:

  1. Create data/allowed-domains-custom.json:

    [
      "yourdomain.com",
      "blog.yourdomain.com",
      "mycompany.com"
    ]
  2. Restart the server - custom domains are automatically merged with defaults

  3. Or reload without restart:

    export ADMIN_PASS=your-password
    curl -u admin:$ADMIN_PASS -X POST http://localhost:9980/admin/reload-domains

Note: Subdomains are automatically included (e.g., github.com allows api.github.com, gist.github.com)

Total domains: the default list (1000+) and your custom additions are merged and deduplicated automatically.
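
The merge and the subdomain rule described above can be sketched as follows. This is an illustration only, not the server's code: `mergeAllowlists` and `isAllowedHost` are hypothetical names, and the real matching logic may differ.

```javascript
// Hypothetical sketch of allowlist handling; not the actual server code.

// Merge the default and custom lists, normalising case and dropping duplicates.
function mergeAllowlists(defaults, custom) {
  const normalise = domain => domain.trim().toLowerCase()
  return new Set([...defaults, ...custom].map(normalise))
}

// A hostname is allowed if it equals an entry or is a subdomain of one.
// Comparing whole labels avoids suffix tricks such as "evilgithub.com".
function isAllowedHost(hostname, allowlist) {
  const labels = hostname.toLowerCase().split('.')
  // Stop before the bare top-level label so "com" alone never matches.
  for (let i = 0; i < labels.length - 1; i++) {
    if (allowlist.has(labels.slice(i).join('.'))) return true
  }
  return false
}

const allowlist = mergeAllowlists(['github.com', 'example.com'], ['yourdomain.com', 'GitHub.com'])
console.log(allowlist.size)                               // 3
console.log(isAllowedHost('api.github.com', allowlist))   // true
console.log(isAllowedHost('evilgithub.com', allowlist))   // false
```

Matching on whole dot-separated labels rather than with a plain `endsWith` check is the design choice that keeps look-alike domains out.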

See DOMAIN_MANAGEMENT.md for detailed configuration guide.

Rate Limiting

Default limits:

  • Cached requests: 100 requests/minute per IP
  • Non-cached requests: 5 requests/minute per IP

Configured in bin/server.js (lines 229-251).
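
To make the two tiers concrete, here is a minimal fixed-window limiter using the default numbers above. It is a sketch, not the code in bin/server.js: `createRateLimiter`, the tier names, and the fixed-window strategy are all assumptions.

```javascript
// Hypothetical fixed-window limiter; the server's real implementation may differ.
// Limits mirror the documented defaults: 100/min cached, 5/min non-cached.
const LIMITS = { cached: 100, uncached: 5 }
const WINDOW_MS = 60_000

// `now` is injectable so the window logic can be checked without waiting.
function createRateLimiter(now = Date.now) {
  const windows = new Map() // "ip|tier" -> { start, count }

  return function allow(ip, tier) {
    const key = `${ip}|${tier}`
    const current = now()
    let entry = windows.get(key)
    // Start a fresh window when none exists or the old one has expired.
    if (!entry || current - entry.start >= WINDOW_MS) {
      entry = { start: current, count: 0 }
      windows.set(key, entry)
    }
    if (entry.count >= LIMITS[tier]) return false
    entry.count++
    return true
  }
}

let clock = 0
const allow = createRateLimiter(() => clock)
const results = Array.from({ length: 6 }, () => allow('203.0.113.7', 'uncached'))
console.log(results)                          // [ true, true, true, true, true, false ]
clock += WINDOW_MS
console.log(allow('203.0.113.7', 'uncached')) // true (new window)
```

The map in this sketch grows with every IP seen; a long-running service would need to evict stale entries.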

Caching

Two-tier caching strategy:

  • Memory cache: LRU cache (100 items, 30-minute TTL)
  • Disk cache: JSON files in data/ directory organized by domain

Bypass cache:

curl "http://localhost:9980/?uri=https://example.com&refresh=true"
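
The lookup order and the refresh bypass described above can be sketched like this. Everything here is an assumption for illustration: the `MemoryCache` class, the `lookup` function, and the use of a plain `Map` in place of the JSON files on disk. The real server scrapes asynchronously; the sketch is synchronous to keep the control flow visible.

```javascript
// Hypothetical sketch of the lookup order: memory, then disk, then scrape.
// A Map keeps insertion order, which is enough for a small LRU.
class MemoryCache {
  constructor({ maxItems = 100, ttlMs = 30 * 60 * 1000, now = Date.now } = {}) {
    Object.assign(this, { maxItems, ttlMs, now, entries: new Map() })
  }

  get(key) {
    const entry = this.entries.get(key)
    if (!entry) return undefined
    this.entries.delete(key)
    if (this.now() - entry.storedAt >= this.ttlMs) return undefined // expired
    this.entries.set(key, entry) // re-insert to mark as most recently used
    return entry.value
  }

  set(key, value) {
    this.entries.delete(key)
    this.entries.set(key, { value, storedAt: this.now() })
    if (this.entries.size > this.maxItems) {
      // The first key in iteration order is the least recently used.
      this.entries.delete(this.entries.keys().next().value)
    }
  }
}

// `disk` is any object with get/set (a Map here, standing in for the JSON
// files); `scrape` stands in for the fetch-and-extract step.
function lookup(uri, { memory, disk, scrape, refresh = false }) {
  if (!refresh) {
    const inMemory = memory.get(uri)
    if (inMemory !== undefined) return { source: 'memory', data: inMemory }
    const onDisk = disk.get(uri)
    if (onDisk !== undefined) {
      memory.set(uri, onDisk) // promote to the faster tier
      return { source: 'disk', data: onDisk }
    }
  }
  const fresh = scrape(uri)
  memory.set(uri, fresh)
  disk.set(uri, fresh)
  return { source: 'scrape', data: fresh }
}

const memory = new MemoryCache()
const disk = new Map()
const scrape = uri => ({ title: `Scraped ${uri}` })
console.log(lookup('https://example.com', { memory, disk, scrape }).source)                // scrape
console.log(lookup('https://example.com', { memory, disk, scrape }).source)                // memory
console.log(lookup('https://example.com', { memory, disk, scrape, refresh: true }).source) // scrape
```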

🔒 Security Features

  • Domain allowlist - Only whitelisted domains can be scraped
  • IP blocking - Blocks private IP ranges, localhost, AWS metadata endpoints
  • Rate limiting - Prevents abuse with per-IP limits
  • Content size limits - Max 5MB per request
  • Concurrent request limits - Max 5 concurrent requests per IP
  • Request timeout - 5-second timeout and a limit of 3 redirects
  • XSS protection - HTML escaping throughout
  • SSRF protection - URL validation and private IP blocking
  • Security headers - CSP, X-Frame-Options, X-Content-Type-Options
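
As an illustration of the private IP blocking listed above, the sketch below checks an IPv4 literal against the usual private, loopback, and link-local ranges. It is not the server's implementation: the function names and the exact range list are assumptions, and a complete guard would also cover IPv6 and validate the address a hostname resolves to.

```javascript
// Hypothetical IPv4 check for the private-range blocking described above.
const BLOCKED_RANGES = [
  ['10.0.0.0', 8],      // private
  ['172.16.0.0', 12],   // private
  ['192.168.0.0', 16],  // private
  ['127.0.0.0', 8],     // loopback
  ['169.254.0.0', 16],  // link-local, includes 169.254.169.254 (cloud metadata)
  ['0.0.0.0', 8]        // "this network"
]

// Convert dotted-quad notation to a number, or null if it is not valid IPv4.
function ipv4ToInt(address) {
  const parts = address.split('.')
  if (parts.length !== 4) return null
  let value = 0
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part) || Number(part) > 255) return null
    value = value * 256 + Number(part)
  }
  return value
}

function isBlockedIPv4(address) {
  const value = ipv4ToInt(address)
  if (value === null) return true // fail closed on anything unparseable
  return BLOCKED_RANGES.some(([base, prefix]) => {
    // Two addresses share a /prefix network when they fall in the same block.
    const blockSize = 2 ** (32 - prefix)
    return Math.floor(value / blockSize) === Math.floor(ipv4ToInt(base) / blockSize)
  })
}

console.log(isBlockedIPv4('169.254.169.254')) // true
console.log(isBlockedIPv4('8.8.8.8'))         // false
```

Plain arithmetic is used instead of bitwise operators because JavaScript bitwise operations work on signed 32-bit integers, which makes addresses above 127.255.255.255 awkward to compare.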

🏗️ Architecture

┌─────────────┐
│  Chat App   │
└──────┬──────┘
       │ HTTP GET /?uri=...
       ▼
┌─────────────────┐
│  json.rocks     │
│  (Port 9980)    │
├─────────────────┤
│ • Rate Limiter  │
│ • Domain Check  │
│ • IP Validator  │
└──────┬──────────┘
       │
       ├─→ Memory Cache (LRU)
       │   └─→ Return if cached
       │
       ├─→ Disk Cache
       │   └─→ Return if cached
       │
       └─→ Web Scraper
           ├─→ Fetch HTML (axios)
           ├─→ Extract Content (unfluff)
           ├─→ Extract Metadata (metascraper)
           └─→ Return JSON

Tech Stack:

  • Fastify - Web framework
  • Axios - HTTP client
  • Unfluff - Content extraction
  • Metascraper - Metadata extraction with platform-specific plugins
  • Cheerio - HTML parsing
  • LRU-Cache - In-memory caching

📦 Deployment

Docker (Recommended)

FROM node:18-alpine

WORKDIR /app
COPY package*.json ./
RUN npm ci --production

COPY . .

EXPOSE 9980

CMD ["node", "bin/server.js"]

# Build and run
docker build -t json-rocks .
docker run -p 9980:9980 json-rocks

Docker Compose

version: '3.8'
services:
  json-rocks:
    build: .
    ports:
      - "9980:9980"
    volumes:
      - ./data:/app/data
    environment:
      - NODE_ENV=production
    restart: unless-stopped

Production Deployment

Using PM2:

npm install -g pm2
pm2 start bin/server.js --name json-rocks
pm2 save
pm2 startup

Nginx reverse proxy:

server {
    listen 80;
    server_name json.rocks;

    location / {
        proxy_pass http://localhost:9980;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
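
With the proxy configuration above, every connection reaches the service from the proxy's own address, so per-IP limits only work if the application reads the forwarded headers. The sketch below shows one way to pick the client IP; `clientIp` and its trusted-proxy list are hypothetical, and whether json.rocks does anything similar is not stated here. Fastify also offers a `trustProxy` server option for this purpose.

```javascript
// Hypothetical helper: pick the client IP for per-IP limits when the service
// sits behind the proxy configured above. Forwarded headers are only trusted
// when the direct peer is a known proxy, since clients can forge them.
function clientIp(headers, remoteAddress, trustedProxies = ['127.0.0.1', '::1']) {
  if (!trustedProxies.includes(remoteAddress)) return remoteAddress

  // X-Real-IP is set by the proxy to the address it observed.
  if (headers['x-real-ip']) return headers['x-real-ip'].trim()

  const forwarded = headers['x-forwarded-for']
  if (forwarded) {
    // $proxy_add_x_forwarded_for appends the observed address, so the last
    // entry is trustworthy; earlier entries come from the client.
    const hops = forwarded.split(',').map(hop => hop.trim()).filter(Boolean)
    if (hops.length > 0) return hops[hops.length - 1]
  }
  return remoteAddress
}

console.log(clientIp({ 'x-forwarded-for': '1.2.3.4, 203.0.113.9' }, '127.0.0.1')) // 203.0.113.9
console.log(clientIp({ 'x-forwarded-for': '10.0.0.1' }, '198.51.100.4'))          // 198.51.100.4
```

Header names are lowercase here because Node.js lowercases incoming header names.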

🛠️ Development

Requirements

  • Node.js 14+
  • npm or yarn

Setup

# Clone the repository
git clone https://github.com/spux/json.rocks.git
cd json.rocks

# Install dependencies
npm install

# Start development server
npm start

Project Structure

json.rocks/
├── bin/
│   └── server.js           # Main server (738 lines)
├── js/
│   ├── jr.js              # Utility library
│   └── json-renderer.js   # Frontend JSON renderer
├── data/
│   ├── allowed-domains.json  # Security domain allowlist
│   └── [cached data]/        # Disk cache organized by domain
├── images/                 # UI assets
├── index.html             # Landing page (search interface)
├── package.json
└── README.md

🎨 Features

Automatic Content Extraction

  • Article text - Main content extraction
  • Metadata - Title, description, author, date
  • Media - Images with alt text, videos, iframes
  • Links - All hyperlinks extracted
  • Schema.org - Structured data with JSON-LD

Platform-Specific Extractors

  • Amazon - Product info
  • Spotify - Track/album metadata
  • YouTube - Video details
  • Twitter - Tweet embeds
  • Generic fallback - Works on any webpage

Search Functionality

# Search the web for JSON content
curl "http://localhost:9980/?uri=best+javascript+frameworks"

Returns search results in structured JSON format.
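
Because the same `uri` parameter carries either a URL or a search query, a client may want to know in advance which one it is sending. The sketch below is one plausible way to tell them apart; `classifyUri` is a hypothetical name, and the rule the server actually applies is not documented here.

```javascript
// Hypothetical classifier: the server's actual rule for telling a URL from a
// search query is not documented here, so this is only one plausible approach.
function classifyUri(input) {
  const value = input.trim()
  try {
    const parsed = new URL(value)
    // Only web URLs count; other schemes fall through to search.
    if (parsed.protocol === 'http:' || parsed.protocol === 'https:') {
      return { kind: 'url', value: parsed.href }
    }
  } catch {
    // Not an absolute URL; treat it as a search query.
  }
  return { kind: 'search', value }
}

console.log(classifyUri('https://github.com/spux/json.rocks').kind) // url
console.log(classifyUri('best javascript frameworks').kind)         // search
```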


📊 Response Examples

GitHub Repository

{
  "title": "spux/json.rocks: Search the web of JSON",
  "description": "A web scraping service that converts any webpage into structured JSON data",
  "image": "https://opengraph.githubassets.com/...",
  "publisher": "GitHub",
  "author": "spux",
  "url": "https://github.com/spux/json.rocks"
}

News Article

{
  "title": "Breaking News: Important Event Occurs",
  "description": "Detailed coverage of the significant event...",
  "image": "https://news.example.com/images/featured.jpg",
  "author": "Jane Reporter",
  "publisher": "News Network",
  "date": "2025-01-13T15:30:00.000Z",
  "lang": "en"
}

YouTube Video

{
  "title": "Amazing Video Title",
  "description": "Video description and details",
  "image": "https://i.ytimg.com/vi/VIDEO_ID/maxresdefault.jpg",
  "publisher": "YouTube",
  "videos": [{
    "url": "https://www.youtube.com/embed/VIDEO_ID",
    "width": 1280,
    "height": 720
  }]
}

🤝 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

ISC License - See package.json for details


🔗 Links

  • NPM Package: https://www.npmjs.com/package/jsonrocks
  • GitHub: https://github.com/spux/json.rocks
  • Live Demo: https://json.rocks

💡 FAQ

Q: Can I use this for my chat application?

A: Yes! That's exactly what it's designed for. See the integration examples above.

Q: How do I add support for more domains?

A: Add your domains to data/allowed-domains-custom.json, as described under Adding Custom Domains. See DOMAIN_MANAGEMENT.md.

Q: Is there a hosted version?

A: Yes, available at https://json.rocks (GitHub Pages deployment).

Q: What about rate limiting?

A: Default limits are 5 req/min for uncached requests and 100 req/min for cached requests, per IP. Configure them in bin/server.js.

Q: Does it support authentication?

A: Currently no. Add authentication middleware for production use.

Q: Can I self-host?

A: Absolutely! See the deployment section above.


⚠️ Known Limitations

  • Domain allowlist required for security (not all sites allowed by default)
  • Rate limits apply (5 req/min uncached)
  • File-based disk cache (consider Redis for production)
  • No built-in authentication (add your own middleware)
  • Admin endpoints are public (secure before production)

🗺️ Roadmap

  • [ ] Add authentication support
  • [ ] Redis cache option
  • [ ] GraphQL API
  • [ ] Webhook support
  • [ ] Batch URL processing
  • [ ] Screenshot capture
  • [ ] PDF generation
  • [ ] Custom extraction rules

Built with ❤️ for the chat app community