zero-knowledge-indexing

v1.1.2

choyon-dev

Zero-Knowledge Indexing

A secure, zero-configuration Node.js library for Google's Indexing API with automatic batching, caching, and a CLI setup wizard.

Features

🔒 Zero-Knowledge Security: Prevents browser usage and enforces secure credential handling
⚡ Smart Batching: Automatically groups URLs into optimal batches (up to 100 URLs)
🗄️ Intelligent Caching: Remembers indexed URLs for 24 hours to prevent quota waste
🔄 Automatic Retry: Handles rate limits with exponential backoff
🛠️ Interactive CLI Setup: Guided setup with browser automation and validation
📊 Progress Callbacks: Real-time progress tracking for large indexing jobs
🔧 Dry-Run Mode: Test indexing without making API calls
📝 Comprehensive Logging: Detailed logs with Winston
🏗️ TypeScript Support: Full type definitions included
⚙️ Configuration Files: Support for .indexingrc project settings
🚦 Error Recovery: Network resilience with timeouts and connection handling
🗺️ Sitemap Support: Parse and index all URLs from XML sitemaps
📁 Bulk Import: Index URLs from JSON, CSV, or text files
📈 Status Monitoring: Check indexing status and generate reports
🏥 Health Monitoring: API connectivity and performance metrics
🔄 Circuit Breaker: Prevents cascade failures with automatic recovery

Quick Start

1. Install

npm install zero-knowledge-indexing

2. Setup

Basic Setup (Text Guide)

npx index-ready setup

Interactive Setup (Recommended)

npx index-ready setup --interactive

The interactive setup will:

Automatically open browser tabs for each setup step
Guide you through the process with prompts
Validate your service account JSON file
Test your setup automatically
Generate environment variable commands

3. Use in Code

import { GoogleIndexingAPI } from "zero-knowledge-indexing";

// Initialize (credentials auto-detected)
const indexer = new GoogleIndexingAPI();

// Index your URLs
const result = await indexer.indexUrls([
  "https://example.com/new-post-1",
  "https://example.com/new-post-2",
]);

console.log("Successful:", result.successful.length);
console.log("Failed:", result.failed.length);

Environment Variables

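Provide the service account credentials as a JSON string in the GOOGLE_INDEXING_KEY environment variable (the values below are placeholders):

export GOOGLE_INDEXING_KEY='{"type":"service_account","private_key":"-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n","client_email":"your-service-account@your-project.iam.gserviceaccount.com"}'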

Manual Setup

1. Create Google Cloud Project

Go to Google Cloud Console
Create a new project or select existing
Note your Project ID

2. Enable Indexing API

Visit Indexing API page
Click "Enable"

3. Create Service Account

Go to Service Accounts
Click "Create Service Account"
Name: indexing-service-account
Skip roles for now
Create JSON key and download

4. Add to Search Console

Open Google Search Console
Select your property
Go to Settings → Users and permissions
Add service account email as "Owner"

5. Secure Credentials

Never commit JSON files to Git
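Add the key file to .gitignore. A blanket *.json pattern also matches package.json and other JSON files that should be committed, so a pattern for the specific key file is usually safer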
Use environment variables in production
The package checks automatically that service account files are ignored by Git

CLI Commands

Setup Commands

# Interactive setup with browser automation
npx index-ready setup --interactive

# Basic text-based setup
npx index-ready setup

Indexing Commands

# Index all URLs from a sitemap
npx index-ready sitemap https://example.com/sitemap.xml --max 100

# Index URLs from a file
npx index-ready file urls.txt --format txt

# Index with filters
npx index-ready sitemap https://example.com/sitemap.xml \
  --include "/blog/.*" \
  --exclude "/draft/.*" \
  --max 50

Monitoring Commands

# Check indexing status of URLs
npx index-ready status https://example.com/page1 https://example.com/page2

# Check status from file
npx index-ready status --file urls.txt

# Check status from sitemap
npx index-ready status --sitemap https://example.com/sitemap.xml

# Health check
npx index-ready health

# View performance metrics
npx index-ready metrics

# Clear cache
npx index-ready clear-cache

API Reference

GoogleIndexingAPI

Constructor

new GoogleIndexingAPI(options?: IndexingOptions)

Options:

serviceAccountPath?: string - Path to service account JSON file
credentials?: string - JSON string of credentials (from env)
projectId?: string - Google Cloud Project ID
timeout?: number - Request timeout in ms (default: 30000)
maxRetries?: number - Max retry attempts (default: 5)
maxCacheSize?: number - Cache size limit (default: 10000)
circuitBreakerThreshold?: number - Failures before circuit opens (default: 10)
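
To make the circuitBreakerThreshold option more concrete, here is a minimal sketch of a circuit breaker. It is not taken from the package source: the class name, the cooldown value, and the half-open trial behavior are assumptions chosen for illustration. Only the default threshold of 10 comes from the option list above.

```typescript
// Hypothetical circuit breaker. Only the default threshold (10) mirrors the
// documented circuitBreakerThreshold option; everything else is assumed.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(
    threshold: number = 10,
    cooldownMs: number = 60000, // assumed recovery window
    now: () => number = () => Date.now(),
  ) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // Returns true when a request may be attempted.
  canRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      // Cooldown elapsed: let the next request through as a trial.
      this.state = "half-open";
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure(): void {
    this.failures += 1;
    // A failed trial request, or too many consecutive failures, opens the circuit.
    if (this.state === "half-open" || this.failures >= this.threshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  getState(): BreakerState {
    return this.state;
  }
}
```

A caller would check canRequest() before each API call and report the outcome with recordSuccess() or recordFailure(). The injectable clock exists only to make the sketch easy to test.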

indexUrls(urls: string[]): Promise

Indexes an array of URLs. The onProgress and onError callbacks are passed to the constructor, as shown here.

const indexer = new GoogleIndexingAPI({
  onProgress: (processed, total, results) => {
    console.log(`${processed}/${total} URLs processed`);
  },
  onError: (error, url) => {
    console.error(`Failed: ${url}`, error.message);
  },
});

const result = await indexer.indexUrls(urls);
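
The Features list mentions that URLs are grouped into batches of up to 100. As a rough illustration of that idea, the sketch below splits a URL list into fixed-size chunks. The helper name chunkUrls is hypothetical and the package's actual batching logic may differ.

```typescript
// Hypothetical helper: split a list of URLs into batches of at most `size`.
// The default of 100 follows the limit stated in the Features list.
function chunkUrls(urls: string[], size: number = 100): string[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: string[][] = [];
  for (let i = 0; i < urls.length; i += size) {
    batches.push(urls.slice(i, i + size));
  }
  return batches;
}
```

With 250 URLs and the default size, this yields three batches of 100, 100, and 50 URLs.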

indexFromSitemap(sitemapUrl, options): Promise

Index all URLs from an XML sitemap.

const result = await indexer.indexFromSitemap(
  "https://example.com/sitemap.xml",
  {
    filter: {
      includePatterns: ["/blog/.*"],
      excludePatterns: ["/draft/.*"],
      maxUrls: 100,
    },
  },
);
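
For readers wondering how the filter options might behave, the following sketch extracts loc entries from sitemap XML and applies include, exclude, and maxUrls rules. It assumes the patterns are regular expressions tested against the full URL and that exclusion wins over inclusion. Neither assumption is confirmed by the package documentation, and the function names are illustrative.

```typescript
// Hypothetical shape, modeled on the filter object in the example above.
interface UrlFilter {
  includePatterns?: string[];
  excludePatterns?: string[];
  maxUrls?: number;
}

// Pull every <loc> value out of a sitemap document with a simple regex.
// A production implementation would more likely use a real XML parser.
function extractSitemapUrls(xml: string): string[] {
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/gi;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = locPattern.exec(xml)) !== null) {
    // Sitemaps escape "&" as "&amp;", so undo that for the common case.
    found.push(match[1].replace(/&amp;/g, "&"));
  }
  return found;
}

// Keep URLs matching any include pattern (or all URLs if none are given),
// drop those matching any exclude pattern, then cap the result at maxUrls.
function applyFilter(urls: string[], filter: UrlFilter = {}): string[] {
  const include = (filter.includePatterns ?? []).map((p) => new RegExp(p));
  const exclude = (filter.excludePatterns ?? []).map((p) => new RegExp(p));
  const kept = urls.filter((url) => {
    const included = include.length === 0 || include.some((re) => re.test(url));
    const excluded = exclude.some((re) => re.test(url));
    return included && !excluded;
  });
  return filter.maxUrls === undefined ? kept : kept.slice(0, filter.maxUrls);
}
```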

indexFromFile(filePath, options): Promise

Index URLs from a file (JSON, CSV, or text).

const result = await indexer.indexFromFile("urls.json", {
  format: "json",
  filter: { maxUrls: 50 },
});
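
The expected layout of each file format is not spelled out here, so the sketch below states its own assumptions: JSON is an array of strings, CSV carries the URL in the first column, and text files hold one URL per line. The function names are hypothetical and the package may parse files differently.

```typescript
type UrlFileFormat = "json" | "csv" | "txt";

// Accept only absolute http(s) URLs; this also drops CSV headers and blank lines.
function isHttpUrl(value: string): boolean {
  try {
    const parsed = new URL(value);
    return parsed.protocol === "http:" || parsed.protocol === "https:";
  } catch {
    return false;
  }
}

// Hypothetical parser under the format assumptions described above.
function parseUrlList(content: string, format: UrlFileFormat): string[] {
  let candidates: string[];
  if (format === "json") {
    const parsed: unknown = JSON.parse(content);
    if (!Array.isArray(parsed)) {
      throw new TypeError("expected a JSON array of URLs");
    }
    candidates = parsed.filter((item): item is string => typeof item === "string");
  } else {
    const lines = content.split(/\r?\n/);
    // For CSV, take the first column; for text, take the whole line.
    candidates = format === "csv" ? lines.map((line) => line.split(",")[0]) : lines;
  }
  return candidates
    .map((value) => value.trim().replace(/^"|"$/g, "")) // strip surrounding quotes
    .filter(isHttpUrl);
}
```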

checkIndexingStatus(urls): Promise

Check indexing status of URLs.

const report = await indexer.checkIndexingStatus(urls);
console.log(`${report.indexed} indexed, ${report.notFound} not found`);

Health & Monitoring

// Health check
const health = await indexer.healthCheck();

// Get metrics
const metrics = indexer.getMetrics();

// Clear cache
await indexer.clearCache();

Error Handling

The package handles common Google API errors:

400 Bad Request: Invalid URL format (validated before sending)
403 Forbidden: Service account lacks the Owner role in Search Console (the error provides a direct link to Search Console settings)
429 Too Many Requests: Automatic exponential backoff retry (up to 5 attempts)
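
The retry behavior for 429 responses can be pictured with a small exponential backoff sketch. The base delay of one second, the 30 second cap, and the function names are assumptions. Only the default of five retries comes from this README. Jitter is left out for brevity, although real clients often add it.

```typescript
// Delay before retry number `attempt` (1-based): baseMs * 2^(attempt - 1), capped.
// baseMs and capMs are assumed values, not documented defaults.
function backoffDelayMs(attempt: number, baseMs: number = 1000, capMs: number = 30000): number {
  return Math.min(capMs, baseMs * 2 ** (attempt - 1));
}

// Run `operation`, retrying retryable failures up to `maxRetries` times.
// `sleep` is injectable so the waiting can be replaced in tests.
async function withRetry<T>(
  operation: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  maxRetries: number = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up once the retry budget is spent or the error is not retryable.
      if (attempt > maxRetries || !isRetryable(error)) throw error;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Under these assumptions the waits grow as 1 s, 2 s, 4 s, 8 s, and 16 s before the final failure is rethrown.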

Security Features

Browser Protection: Throws error if used in browser environment
Credential Validation: Ensures JSON structure and required fields
GitIgnore Check: Warns if service account files aren't ignored
Production Safety: Enforces .gitignore in production environments
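
As an illustration of the first two checks, the sketch below shows one way a browser guard and a credential structure check could look. The function names are hypothetical, and the required fields are inferred from the GOOGLE_INDEXING_KEY example earlier in this README rather than from the package source.

```typescript
// Fields taken from the GOOGLE_INDEXING_KEY example; a real key file has more.
interface ServiceAccountKey {
  type: "service_account";
  private_key: string;
  client_email: string;
}

// Hypothetical guard: refuse to run where browser globals are present.
function assertServerSide(globalScope: object = globalThis): void {
  if ("window" in globalScope && "document" in globalScope) {
    throw new Error("Service account credentials must not be used in a browser");
  }
}

// Parse a credentials JSON string and verify the fields the sketch relies on.
function parseServiceAccountKey(raw: string): ServiceAccountKey {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error("Credentials are not valid JSON");
  }
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Credentials must be a JSON object");
  }
  const key = parsed as Record<string, unknown>;
  if (key.type !== "service_account") {
    throw new Error('Expected "type" to be "service_account"');
  }
  for (const field of ["private_key", "client_email"]) {
    const value = key[field];
    if (typeof value !== "string" || value.length === 0) {
      throw new Error(`Missing required field: ${field}`);
    }
  }
  return key as unknown as ServiceAccountKey;
}
```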

Caching

URLs are cached for 24 hours in .indexing-cache.json to:

Prevent duplicate API calls
Save quota
Speed up repeated indexing attempts
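
The 24-hour window can be sketched as a simple time-to-live check. The cache shape used here (a map from URL to submission timestamp) and the helper names are assumptions. The actual layout of .indexing-cache.json is not documented in this README.

```typescript
const CACHE_TTL_MS = 24 * 60 * 60 * 1000; // 24 hours, as stated above

// Hypothetical cache shape: URL -> time (ms since epoch) it was last submitted.
type IndexingCache = Record<string, number>;

function isFresh(cache: IndexingCache, url: string, now: number = Date.now()): boolean {
  const submittedAt = cache[url];
  return typeof submittedAt === "number" && now - submittedAt < CACHE_TTL_MS;
}

// Split URLs into those that still need submitting and those the cache covers.
function partitionByCache(
  cache: IndexingCache,
  urls: string[],
  now: number = Date.now(),
): { pending: string[]; skipped: string[] } {
  const pending: string[] = [];
  const skipped: string[] = [];
  for (const url of urls) {
    (isFresh(cache, url, now) ? skipped : pending).push(url);
  }
  return { pending, skipped };
}

// Return a new cache with the given URLs stamped and expired entries dropped.
function markSubmitted(
  cache: IndexingCache,
  urls: string[],
  now: number = Date.now(),
): IndexingCache {
  const next: IndexingCache = {};
  for (const key of Object.keys(cache)) {
    if (isFresh(cache, key, now)) next[key] = cache[key];
  }
  for (const url of urls) next[url] = now;
  return next;
}
```

Serializing the returned object with JSON.stringify would give a file in the spirit of .indexing-cache.json, with expired entries pruned on every write.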

Logging

Logs are written to the console in color. Set the log level with the LOG_LEVEL environment variable:

export LOG_LEVEL=debug  # info, warn, error, debug

License

MIT License - see LICENSE file.

Contributing

Fork the repository
Create your feature branch
Add tests
Submit a pull request

Support

For issues or questions:

Check the setup guide
Open an issue on GitHub
Review logs for detailed error messages