@knowledgesdk/cli
v0.2.6
CLI tool for KnowledgeSDK - extract, scrape, classify, search and manage knowledge from websites
KnowledgeSDK CLI
What is KnowledgeSDK?
KnowledgeSDK is an API that turns any website into structured, searchable knowledge — built for developers, AI agents, and data pipelines.
- 🔍 Extract — Crawl & extract structured knowledge from any website
- 📄 Scrape — Convert any URL to clean Markdown
- 🏢 Classify — AI-powered business classification from a URL
- 📸 Screenshot — Full-page screenshots of any website
- 🗺️ Sitemap — Discover all URLs on a domain
- 🧠 Search — Semantic search across your extracted knowledge base
Installation
npm install -g @knowledgesdk/cli
Or use without installing:
npx @knowledgesdk/cli <command>
Quick Start
# 1. Set your API key
npx knowledgesdk config --key sk_ks_your_key
# 2. Extract knowledge from a website
npx knowledgesdk extract https://stripe.com
# 3. Search your knowledge base
npx knowledgesdk search "pricing plans"
Configuration
API keys are stored in ~/.knowledgesdk/config.json.
# Set API key
npx knowledgesdk config --key sk_ks_your_key
# Set a custom API base URL
npx knowledgesdk config --url https://api.myinstance.com
# View current config
npx knowledgesdk config
# Clear stored config
npx knowledgesdk config --clear
You can also use environment variables instead of the config file:
export KNOWLEDGESDK_API_KEY=sk_ks_your_key
export KNOWLEDGESDK_BASE_URL=https://api.knowledgesdk.com # optional
Commands
extract — Extract knowledge from a website
Crawls a website and extracts structured knowledge from its pages.
# Basic extraction (synchronous)
npx knowledgesdk extract https://stripe.com
# Run asynchronously and get a job ID back
npx knowledgesdk extract https://stripe.com --async
# Run asynchronously with a webhook callback
npx knowledgesdk extract https://stripe.com --async --callback-url https://myapp.com/hook
# Limit the number of pages crawled
npx knowledgesdk extract https://stripe.com --max-pages 20
# Save result to a file
npx knowledgesdk extract https://stripe.com --output result.json
# Output raw JSON
npx knowledgesdk extract https://stripe.com --json
| Flag | Description |
|------|-------------|
| --async | Run asynchronously; returns a job ID |
| --callback-url <url> | Webhook URL to notify when done |
| --max-pages <n> | Maximum pages to crawl |
| --output <file> | Save JSON result to file |
| --json | Output raw JSON |
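When the extract command is driven from a script, it can help to assemble its arguments from a typed options object. The following is a minimal sketch that uses only the flags listed in the table above. The `ExtractOptions` interface and the `buildExtractArgs` function are hypothetical names introduced for illustration; they are not part of the CLI or the SDK.

```typescript
// Options mirror the flags in the table above. The interface and its
// field names are hypothetical, chosen for this sketch only.
interface ExtractOptions {
  async?: boolean;
  callbackUrl?: string;
  maxPages?: number;
  output?: string;
  json?: boolean;
}

// Build the argument list for `knowledgesdk extract <url>`.
function buildExtractArgs(url: string, opts: ExtractOptions = {}): string[] {
  const args: string[] = ["extract", url];
  if (opts.async) {
    args.push("--async");
  }
  if (opts.callbackUrl !== undefined) {
    args.push("--callback-url", opts.callbackUrl);
  }
  if (opts.maxPages !== undefined) {
    // Reject fractions, zero, negatives, NaN and Infinity before they reach the CLI.
    const n = opts.maxPages;
    if (!isFinite(n) || Math.floor(n) !== n || n < 1) {
      throw new RangeError("maxPages must be a positive integer");
    }
    args.push("--max-pages", String(n));
  }
  if (opts.output !== undefined) {
    args.push("--output", opts.output);
  }
  if (opts.json) {
    args.push("--json");
  }
  return args;
}
```

The resulting array can be appended to `["knowledgesdk"]` and passed to Node's `child_process.execFile` with `npx` as the executable, which avoids shell quoting problems with URLs and file names.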
scrape — Scrape a URL to Markdown
Fetches a single page and returns its content as clean Markdown.
npx knowledgesdk scrape https://docs.stripe.com
npx knowledgesdk scrape https://docs.stripe.com --output content.md
npx knowledgesdk scrape https://docs.stripe.com --json
| Flag | Description |
|------|-------------|
| --output <file> | Save markdown to file |
| --json | Output raw JSON including metadata |
classify — Classify a business
Uses AI to classify a website into an industry/category.
npx knowledgesdk classify https://stripe.com
npx knowledgesdk classify https://stripe.com --json
| Flag | Description |
|------|-------------|
| --json | Output raw JSON |
sitemap — Get a website's sitemap
npx knowledgesdk sitemap https://stripe.com
# Limit the number of URLs shown
npx knowledgesdk sitemap https://stripe.com --limit 50
# Output raw JSON
npx knowledgesdk sitemap https://stripe.com --json
| Flag | Description |
|------|-------------|
| --limit <n> | Limit number of URLs shown |
| --json | Output raw JSON |
screenshot — Take a screenshot
npx knowledgesdk screenshot https://stripe.com
npx knowledgesdk screenshot https://stripe.com --output screenshot.png
| Flag | Description |
|------|-------------|
| --output <file> | Save PNG to file |
| --json | Output raw JSON (includes base64 image) |
search — Search your knowledge base
npx knowledgesdk search "pricing plans"
npx knowledgesdk search "integration options" --limit 5
npx knowledgesdk search "API documentation" --json
| Flag | Description |
|------|-------------|
| --limit <n> | Maximum results to return |
| --json | Output raw JSON |
webhooks — Manage webhooks
# List all webhooks
npx knowledgesdk webhooks list
# Create a webhook
npx knowledgesdk webhooks create \
--url https://myapp.com/hook \
--events EXTRACTION_COMPLETED,PAGE_SCRAPED
# Delete a webhook
npx knowledgesdk webhooks delete weh_xxx
Available events:
| Event | Description |
|-------|-------------|
| EXTRACTION_COMPLETED | Full extraction job finished |
| EXTRACTION_FAILED | Full extraction job failed |
| PAGE_SCRAPED | Individual page scraped |
| JOB_STARTED | Async job started |
| JOB_FAILED | Async job failed |
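A script that creates webhooks may want to validate event names before passing them to `--events`. The sketch below hard-codes the five names from the table above and checks a comma-separated list against them. `WEBHOOK_EVENTS`, `isWebhookEvent`, and `parseEventList` are hypothetical helpers written for this example, and the list may need updating if the API adds events.

```typescript
// Event names copied from the table above.
const WEBHOOK_EVENTS = [
  "EXTRACTION_COMPLETED",
  "EXTRACTION_FAILED",
  "PAGE_SCRAPED",
  "JOB_STARTED",
  "JOB_FAILED",
] as const;

type WebhookEvent = (typeof WEBHOOK_EVENTS)[number];

// Type guard: true only for names that appear in WEBHOOK_EVENTS.
function isWebhookEvent(name: string): name is WebhookEvent {
  return WEBHOOK_EVENTS.some((known) => known === name);
}

// Split a comma-separated list (the format shown for --events) and
// validate every entry, throwing on the first unknown name.
function parseEventList(list: string): WebhookEvent[] {
  const events: WebhookEvent[] = [];
  for (const raw of list.split(",")) {
    const name = raw.trim();
    if (!isWebhookEvent(name)) {
      throw new Error(`Unknown webhook event: ${name}`);
    }
    events.push(name);
  }
  return events;
}
```

Validating locally gives a clear error for a typo such as `PAGE_SCRAPPED` before any request is sent.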
jobs — Monitor async jobs
# Check job status once
npx knowledgesdk jobs get job_xxx
# Poll until the job completes
npx knowledgesdk jobs poll job_xxx
# Poll with a custom interval (in ms)
npx knowledgesdk jobs poll job_xxx --interval 5000
# Output final result as JSON
npx knowledgesdk jobs poll job_xxx --json
Global Flags
| Flag | Description |
|------|-------------|
| --help / -h | Show help for any command |
| --version / -v | Show CLI version |
Output Formats
By default, the CLI renders human-readable, colored output. Use --json on any command to get raw JSON, or --output <file>, on commands that list it such as extract, scrape, and screenshot, to save results to disk.
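Scripts that capture `--json` output usually need to guard against empty or malformed text before using it. Here is a minimal sketch of such a guard. `parseJsonOutput` is a hypothetical helper, and because the shape of each command's JSON is not documented in this README, the result is typed as `unknown` and left for the caller to inspect.

```typescript
// Parse the text a command printed when run with --json.
// Throws a descriptive error when the text is empty or not valid JSON.
function parseJsonOutput(stdout: string): unknown {
  const text = stdout.trim();
  if (text === "") {
    throw new Error("Command produced no output");
  }
  try {
    return JSON.parse(text);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    throw new Error(`Output is not valid JSON: ${reason}`);
  }
}
```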
Error Handling
The CLI provides friendly error messages for common issues:
| HTTP Status | Message |
|-------------|---------|
| 401 | Invalid API key — prompts you to run config |
| 403 | Access forbidden |
| 429 | Rate limit exceeded — suggests retrying |
| 500/502/503 | Server error — suggests retrying |
If no API key is found at all, the CLI will tell you exactly how to set one.
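The table above can be read as a small decision procedure: map the status to a message, and retry only for 429 and the 5xx codes. The sketch below illustrates that idea and is not the CLI's actual implementation. The message strings are paraphrased from the table, and the function names and the backoff parameters (1 s base, 30 s cap) are assumptions made for this example.

```typescript
// Summary for each status, paraphrased from the table above.
// The exact wording the CLI prints is not shown in this README.
function describeHttpError(status: number): string {
  switch (status) {
    case 401:
      return "Invalid API key; run the config command to set one";
    case 403:
      return "Access forbidden";
    case 429:
      return "Rate limit exceeded; retry later";
    case 500:
    case 502:
    case 503:
      return "Server error; retry later";
    default:
      return "Unexpected HTTP status " + status;
  }
}

// Statuses for which the table suggests retrying.
function isRetryable(status: number): boolean {
  return status === 429 || status === 500 || status === 502 || status === 503;
}

// Exponential backoff: baseMs * 2^attempt, capped at capMs.
// The default base and cap are assumptions of this sketch.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

A wrapper script could call `isRetryable` after a failed request and wait `backoffDelayMs(attempt)` milliseconds before trying again, giving up after a fixed number of attempts.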
Requirements
- Node.js >= 18 (uses native fetch)
- A KnowledgeSDK API key from knowledgesdk.com
Development
# Install dependencies
npm install
# Build TypeScript
npm run build
# Watch mode
npm run dev
# Run locally
node dist/index.js --help
Documentation
Full API reference → https://knowledgesdk.com/docs
Contributing
We ❤️ PRs!
- Fork → git checkout -b feat/awesome
- Add tests & docs
- PR against main
