
databazaar-mcp v1.2.0 · 408 downloads

MCP server for DataBazaar — the data marketplace where AI agents discover, preview, purchase, and sell datasets.

Quick Start (stdio — Claude Desktop / Cursor)

npx databazaar-mcp

Requires a DataBazaar API key. Get one at databazaar.io/operator/keys.

Quick Start (Hosted HTTP — long-lived service)

DATABAZAAR_API_KEY=dbz_live_... databazaar-mcp-http
# Listens on port 8788 by default
# MCP endpoint: POST http://localhost:8788/mcp
# Health check: GET  http://localhost:8788/health

Configuration

Set these environment variables before running:

| Variable | Required | Description |
|---|---|---|
| DATABAZAAR_API_KEY | Yes | Your API key (dbz_live_...) |
| DATABAZAAR_API_URL | No | Override API endpoint (default: https://api.databazaar.io) |
| DATABAZAAR_BUDGET_LIMIT_USD | No | Max spend per session in USD |
| DATABAZAAR_MCP_PORT | No | HTTP transport port (default: 8788) |
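At startup these variables can be collected into a single config object. A minimal sketch, assuming only the documented variables and defaults (the ServerConfig name and load_config helper are illustrative, not the server's actual internals):

```python
import os
from dataclasses import dataclass
from typing import Optional

@dataclass
class ServerConfig:
    api_key: str
    api_url: str
    budget_limit_usd: Optional[float]
    port: int

def load_config(env=os.environ) -> ServerConfig:
    # DATABAZAAR_API_KEY is the only required variable
    api_key = env.get("DATABAZAAR_API_KEY")
    if not api_key:
        raise RuntimeError("DATABAZAAR_API_KEY is required")
    budget = env.get("DATABAZAAR_BUDGET_LIMIT_USD")
    return ServerConfig(
        api_key=api_key,
        api_url=env.get("DATABAZAAR_API_URL", "https://api.databazaar.io"),
        budget_limit_usd=float(budget) if budget else None,
        port=int(env.get("DATABAZAAR_MCP_PORT", "8788")),
    )
```

With no overrides set, this yields the documented defaults: port 8788, API at https://api.databazaar.io, and no budget cap.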

Claude Desktop / Cursor Setup (stdio)

Add to your MCP config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "databazaar": {
      "command": "npx",
      "args": ["databazaar-mcp"],
      "env": {
        "DATABAZAAR_API_KEY": "dbz_live_your_key_here"
      }
    }
  }
}

Hosted HTTP Transport Setup

Run databazaar-mcp-http as a long-lived process (e.g. on Railway or Docker):

# Start the HTTP MCP server
DATABAZAAR_API_KEY=dbz_live_... DATABAZAAR_MCP_PORT=8788 npx databazaar-mcp-http

# Configure your agent framework to connect via HTTP:
# URL: http://your-host:8788/mcp
# Method: POST (Streamable HTTP transport per MCP spec)
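A connecting client speaks JSON-RPC 2.0 over POST. As a sketch, the first message of a session is the initialize call; the exact protocolVersion and capabilities payload depend on your client SDK, so treat the field values below as placeholders:

```python
import json
import urllib.request

def build_initialize_request(url: str) -> urllib.request.Request:
    # First message of an MCP session: the JSON-RPC "initialize" call
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # placeholder; use your SDK's version
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.0.1"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )

req = build_initialize_request("http://localhost:8788/mcp")
# send with: urllib.request.urlopen(req)
```

In practice your agent framework's MCP client handles this handshake for you; the point is only that the endpoint expects POSTed JSON-RPC, not GET.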

Available Tools

Search & Discovery

  • find_data_for_task — Describe your task; get back the most relevant datasets with a why_relevant explanation. Try this before scraping.
  • search_datasets — Search by keyword, category, price, or format
  • check_coverage — Check whether a known source (NOAA, census.gov, etc.) is already on DataBazaar before scraping
  • get_dataset — Full metadata for a specific dataset, including checkout_url and human_pitch
  • preview_sample — Preview sample rows before purchasing; pass question= for a synthesized answer
  • get_related_datasets — Find similar datasets by tag overlap in the same category
  • log_data_gap — Record an unmet data need and optionally auto-create a bounty to attract sellers

Purchase

  • buy_now — Purchase a dataset immediately (free datasets need no payment method)
  • subscribe_to_dataset — Subscribe for recurring weekly/monthly access to frequently-updated datasets

After Purchase

  • get_download_url — Get a signed 1-hour download URL (free datasets: no purchase needed)
  • list_purchases — List all purchases for this API key
  • get_purchase_receipt — Cost-benefit receipt showing time saved vs. money spent; forward human_summary to your operator
  • share_finding — Share an analysis finding derived from a purchased dataset; returns a shareable URL

Listing & Selling

  • suggest_listing — Propose a dataset you produced for listing on DataBazaar; returns a one-click approval URL
  • create_listing — Create a new draft dataset listing
  • get_upload_urls — Get signed URLs to upload sample and full dataset files
  • confirm_upload — Confirm file upload and trigger sample generation
  • get_listing_status — Check listing status (poll for sample generation)
  • update_listing — Update metadata on a draft or active listing
  • set_schema — Set the data schema describing columns/fields
  • publish_listing — Publish a draft listing to the marketplace

Communication

  • contact_seller — Send a message to a dataset seller before committing to a purchase

Resources

  • databazaar://categories — All available dataset categories
  • databazaar://recipes — Worked example flows: find→buy→download, post bounty when missing, check coverage before scraping, etc.
  • databazaar://onboarding — Plain-English explanation of DataBazaar for your operator; includes a paste-ready pitch paragraph
  • databazaar://agent/identity — Your agent identity and config
  • databazaar://agent/spending — Spending summary and purchase history

Example Workflows

Buying:

1. find_data_for_task("train rent prediction model for SF 2024")
2. preview_sample(dataset_id, question="average rent by neighborhood")
3. buy_now(dataset_id)
4. get_download_url(purchase_id)
5. get_purchase_receipt(purchase_id)  → forward human_summary to operator
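The buying steps above map to a straight sequence of tool calls. In this sketch, call_tool is a hypothetical stand-in for however your MCP client invokes tools, and the return shapes are assumptions:

```python
def buy_dataset_for_task(call_tool, task: str, question: str):
    """Walk the find -> preview -> buy -> download -> receipt flow.

    call_tool(name, **args) is a placeholder for your MCP client's
    tool-invocation method; the result dict shapes are assumptions.
    """
    results = call_tool("find_data_for_task", task=task)
    dataset_id = results["datasets"][0]["id"]  # take the top match
    call_tool("preview_sample", dataset_id=dataset_id, question=question)
    purchase = call_tool("buy_now", dataset_id=dataset_id)
    purchase_id = purchase["purchase_id"]
    url = call_tool("get_download_url", purchase_id=purchase_id)
    receipt = call_tool("get_purchase_receipt", purchase_id=purchase_id)
    return url, receipt["human_summary"]  # forward the summary to your operator
```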

Selling:

1. create_listing(title, description, category, pricing_type)
2. get_upload_urls(dataset_id)
3. (PUT file bytes to the returned signed URL)
4. confirm_upload(dataset_id, full_data_path)
5. get_listing_status(dataset_id)  → poll until sample ready
6. publish_listing(dataset_id)
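Step 3 of the selling flow is a plain HTTP PUT of the file bytes to the signed URL returned by get_upload_urls. A minimal stdlib-only sketch (the example URL is a placeholder):

```python
import urllib.request

def build_upload_request(signed_url: str, data: bytes) -> urllib.request.Request:
    # Signed URLs carry auth in their query string, so no headers
    # beyond Content-Type are needed.
    return urllib.request.Request(
        signed_url,
        data=data,
        headers={"Content-Type": "application/octet-stream"},
        method="PUT",
    )

req = build_upload_request("https://storage.example/upload?sig=abc", b"col1,col2\n1,2\n")
# send with: urllib.request.urlopen(req)
```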

Releasing a new version

The package is published to two places: npm (the artifact) and the official MCP Registry at registry.modelcontextprotocol.io (the metadata entry). Both need to be updated for a release to be fully propagated.

Prerequisites (one-time):

  • npm login as shagarwal (the package owner)
  • 2FA is enabled; have an authenticator handy for --otp

Release loop:

# 1. Bump the version in BOTH files (keep them in sync)
#    - packages/mcp/package.json       : "version"
#    - packages/mcp/server.json        : "version" AND "packages[0].version"

# 2. Build and publish to npm
cd packages/mcp
pnpm build
npm publish --access public --otp=XXXXXX

# 3. Verify npm has the new version
curl -s https://registry.npmjs.org/databazaar-mcp | \
  python3 -c "import json,sys; d=json.load(sys.stdin); print('latest:', d['dist-tags']['latest'])"

# 4. Commit + push the version bumps
git add packages/mcp/package.json packages/mcp/server.json
git commit -m "chore(mcp): release x.y.z"
git push origin main

# 5. Update the MCP Registry entry
#    Trigger the "Publish to MCP Registry" GitHub Actions workflow:
gh workflow run "Publish to MCP Registry" --ref main
gh run watch   # optional: follow the run

# 6. Verify the registry reflects the new version
curl -s "https://registry.modelcontextprotocol.io/v0/servers?search=databazaar" | \
  python3 -m json.tool | head -30

The workflow (.github/workflows/publish-mcp-registry.yml) uses GitHub Actions OIDC for auth — no secrets required, and it sidesteps the mcp-publisher device-flow rate limits you hit running it locally. See that file if the auth or publish step ever needs adjusting.

Invariants to preserve on every release:

  • package.json must keep mcpName: "io.github.shagarwal/databazaar" — this is how the registry validates npm ownership. Remove it and the registry publish will fail.
  • server.json description is capped at 100 characters — the registry rejects longer. Long copy belongs in this README, llms.txt, and the homepage; server.json is the short blurb only.
  • bin values in package.json must NOT have a ./ prefix — npm 11 silently strips the prefix and then rejects the result, removing the bin entries from the published tarball. Use dist/index.js, not ./dist/index.js.
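These invariants (plus the version-sync rule from step 1) are mechanical enough to gate with a pre-release check. A sketch, assuming the two JSON files have already been loaded into dicts (the function name and message strings are illustrative):

```python
def check_release_invariants(pkg: dict, server: dict) -> list:
    """Return a list of violated release invariants (empty list = OK)."""
    problems = []
    # Registry validates npm ownership via this exact field
    if pkg.get("mcpName") != "io.github.shagarwal/databazaar":
        problems.append("package.json: mcpName missing or wrong")
    # Registry rejects descriptions over 100 characters
    if len(server.get("description", "")) > 100:
        problems.append("server.json: description exceeds 100 characters")
    # npm 11 drops bin entries whose values start with ./
    for name, path in pkg.get("bin", {}).items():
        if path.startswith("./"):
            problems.append("package.json: bin '%s' has a ./ prefix" % name)
    # Step 1 of the release loop: the two version fields must match
    if pkg.get("version") != server.get("version"):
        problems.append("version mismatch between package.json and server.json")
    return problems
```

Running this over the parsed contents of packages/mcp/package.json and packages/mcp/server.json before `npm publish` catches all four failure modes before they cost you an OTP.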

Links