zero-knowledge-indexing
v1.1.2
Zero-Knowledge Indexing: A secure Node.js library for Google Indexing API with automatic batching, caching, and CLI setup wizard.
Zero-Knowledge Indexing
A secure, zero-configuration Node.js library for Google's Indexing API with automatic batching, caching, and a CLI setup wizard.
Features
- 🔒 Zero-Knowledge Security: Prevents browser usage and enforces secure credential handling
- ⚡ Smart Batching: Automatically groups URLs into optimal batches (up to 100 URLs)
- 🗄️ Intelligent Caching: Remembers indexed URLs for 24 hours to prevent quota waste
- 🔄 Automatic Retry: Handles rate limits with exponential backoff
- 🛠️ Interactive CLI Setup: Guided setup with browser automation and validation
- 📊 Progress Callbacks: Real-time progress tracking for large indexing jobs
- 🔧 Dry-Run Mode: Test indexing without making API calls
- 📝 Comprehensive Logging: Detailed logs with Winston
- 🏗️ TypeScript Support: Full type definitions included
- ⚙️ Configuration Files: Support for .indexingrc project settings
- 🚦 Error Recovery: Network resilience with timeouts and connection handling
- 🗺️ Sitemap Support: Parse and index all URLs from XML sitemaps
- 📁 Bulk Import: Index URLs from JSON, CSV, or text files
- 📈 Status Monitoring: Check indexing status and generate reports
- 🏥 Health Monitoring: API connectivity and performance metrics
- 🔄 Circuit Breaker: Prevents cascade failures with automatic recovery
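As a rough illustration of the Smart Batching feature listed above, the sketch below splits a URL list into groups of at most 100. The helper name `chunkUrls` is hypothetical and is not part of this package's API; how the library batches internally is not documented here.

```typescript
// Hypothetical helper (not the library's API): split a list into
// consecutive batches of at most `size` items.
function chunkUrls(urls: string[], size = 100): string[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: string[][] = [];
  for (let i = 0; i < urls.length; i += size) {
    batches.push(urls.slice(i, i + size));
  }
  return batches;
}

// 250 URLs become two full batches and one partial batch.
const urls = Array.from({ length: 250 }, (_, i) => `https://example.com/post-${i}`);
const batches = chunkUrls(urls);
console.log(batches.map((b) => b.length)); // [ 100, 100, 50 ]
```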
Quick Start
1. Install
npm install zero-knowledge-indexing
2. Setup
Basic Setup (Text Guide)
npx index-ready setup
Interactive Setup (Recommended)
npx index-ready setup --interactive
The interactive setup will:
- Automatically open browser tabs for each setup step
- Guide you through the process with prompts
- Validate your service account JSON file
- Test your setup automatically
- Generate environment variable commands
3. Use in Code
import { GoogleIndexingAPI } from "zero-knowledge-indexing";
// Initialize (credentials auto-detected)
const indexer = new GoogleIndexingAPI();
// Index your URLs
const result = await indexer.indexUrls([
"https://example.com/new-post-1",
"https://example.com/new-post-2",
]);
console.log("Successful:", result.successful.length);
console.log("Failed:", result.failed.length);
Environment Variables
Set your service account credentials securely:
export GOOGLE_INDEXING_KEY='{"type":"service_account","private_key":"-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n","client_email":"[email protected]"}'
Manual Setup
1. Create Google Cloud Project
- Go to Google Cloud Console
- Create a new project or select existing
- Note your Project ID
2. Enable Indexing API
- Visit Indexing API page
- Click "Enable"
3. Create Service Account
- Go to Service Accounts
- Click "Create Service Account"
- Name: indexing-service-account
- Skip roles for now
- Create JSON key and download
4. Add to Search Console
- Open Google Search Console
- Select your property
- Go to Settings → Users and permissions
- Add service account email as "Owner"
5. Secure Credentials
- Never commit JSON files to Git
- Add to .gitignore: *.json
- Use environment variables in production
- The package validates this automatically
CLI Commands
Setup Commands
# Interactive setup with browser automation
npx index-ready setup --interactive
# Basic text-based setup
npx index-ready setup
Indexing Commands
# Index all URLs from a sitemap
npx index-ready sitemap https://example.com/sitemap.xml --max 100
# Index URLs from a file
npx index-ready file urls.txt --format txt
# Index with filters
npx index-ready sitemap https://example.com/sitemap.xml \
--include "/blog/.*" \
--exclude "/draft/.*" \
--max 50
Monitoring Commands
# Check indexing status of URLs
npx index-ready status https://example.com/page1 https://example.com/page2
# Check status from file
npx index-ready status --file urls.txt
# Check status from sitemap
npx index-ready status --sitemap https://example.com/sitemap.xml
# Health check
npx index-ready health
# View performance metrics
npx index-ready metrics
# Clear cache
npx index-ready clear-cache
API Reference
GoogleIndexingAPI
Constructor
new GoogleIndexingAPI(options?: IndexingOptions)
Options:
- serviceAccountPath?: string - Path to service account JSON file
- credentials?: string - JSON string of credentials (from env)
- projectId?: string - Google Cloud Project ID
- timeout?: number - Request timeout in ms (default: 30000)
- maxRetries?: number - Max retry attempts (default: 5)
- maxCacheSize?: number - Cache size limit (default: 10000)
- circuitBreakerThreshold?: number - Failures before circuit opens (default: 10)
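To make the circuitBreakerThreshold option more concrete, here is a minimal sketch of the general circuit breaker pattern. It is not the library's implementation: the class name, the 60-second cooldown, and the half-open behavior are all assumptions made for illustration. Only the default threshold of 10 comes from the options above.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Hypothetical circuit breaker (illustrative only, not the package's code).
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  // cooldownMs is an assumed value; the README does not document one.
  constructor(threshold = 10, cooldownMs = 60000, now: () => number = Date.now) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // Whether a request may be attempted right now.
  canRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      // Cooldown elapsed: allow trial requests until a result is recorded.
      this.state = "half-open";
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure(): void {
    this.failures += 1;
    // A failed trial request, or reaching the threshold, opens the circuit.
    if (this.state === "half-open" || this.failures >= this.threshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }
}
```

A caller would check `canRequest()` before each API call and report the outcome with `recordSuccess()` or `recordFailure()`, so that repeated failures stop further requests until the cooldown passes.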
indexUrls(urls: string[]): Promise
Indexes an array of URLs. Progress and error callbacks (onProgress, onError) are passed to the constructor:
const indexer = new GoogleIndexingAPI({
onProgress: (processed, total, results) => {
console.log(`${processed}/${total} URLs processed`);
},
onError: (error, url) => {
console.error(`Failed: ${url}`, error.message);
},
});
const result = await indexer.indexUrls(urls);
indexFromSitemap(sitemapUrl, options): Promise
Index all URLs from an XML sitemap.
const result = await indexer.indexFromSitemap(
"https://example.com/sitemap.xml",
{
filter: {
includePatterns: ["/blog/.*"],
excludePatterns: ["/draft/.*"],
maxUrls: 100,
},
},
);
indexFromFile(filePath, options): Promise
Index URLs from a file (JSON, CSV, or text).
const result = await indexer.indexFromFile("urls.json", {
format: "json",
filter: { maxUrls: 50 },
});
checkIndexingStatus(urls): Promise
Check indexing status of URLs.
const report = await indexer.checkIndexingStatus(urls);
console.log(`${report.indexed} indexed, ${report.notFound} not found`);
Health & Monitoring
// Health check
const health = await indexer.healthCheck();
// Get metrics
const metrics = indexer.getMetrics();
// Clear cache
await indexer.clearCache();
Error Handling
The package handles common Google API errors:
- 400 Bad Request: Invalid URL format (validated before sending)
- 403 Forbidden: Service account lacks Owner role in Search Console; the error provides a direct link to Search Console settings
- 429 Too Many Requests: Automatic exponential backoff retry (up to 5 attempts)
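The 429 handling above can be pictured with a generic exponential backoff loop. This is a sketch under assumptions, not the package's code: `withRetry`, `backoffDelayMs`, the 1-second base delay, the 32-second cap, and the `status` field on thrown errors are all hypothetical. It also treats the limit of 5 as retries after the first call, which the README does not specify.

```typescript
// Hypothetical error shape: any thrown value that may carry an HTTP status.
interface HttpError {
  status?: number;
}

// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 32000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Runs `fn`, retrying only on 429 responses, up to `maxRetries` retries.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 5,
  baseMs = 1000,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const status = (err as HttpError | null | undefined)?.status;
      // Give up on non-rate-limit errors or once the retries are exhausted.
      if (status !== 429 || attempt >= maxRetries) throw err;
      await new Promise<void>((resolve) =>
        setTimeout(resolve, backoffDelayMs(attempt, baseMs)),
      );
    }
  }
}

// With the assumed defaults, the first five waits double each time.
console.log([0, 1, 2, 3, 4].map((n) => backoffDelayMs(n)));
// [ 1000, 2000, 4000, 8000, 16000 ]
```

Production implementations often add random jitter to each delay so that many clients do not retry in lockstep; that is omitted here for clarity.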
Security Features
- Browser Protection: Throws error if used in browser environment
- Credential Validation: Ensures JSON structure and required fields
- GitIgnore Check: Warns if service account files aren't ignored
- Production Safety: Enforces .gitignore in production environments
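The Credential Validation check above might look something like the following sketch. The function name `parseServiceAccountKey` is hypothetical, and the set of required fields is an assumption based only on the three fields shown in the GOOGLE_INDEXING_KEY example in this README; the package may check more or different fields.

```typescript
// Fields taken from the GOOGLE_INDEXING_KEY example in this README.
interface ServiceAccountKey {
  type: string;
  private_key: string;
  client_email: string;
}

// Hypothetical validator: parses a JSON string and checks the assumed required fields.
function parseServiceAccountKey(raw: string): ServiceAccountKey {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error("Credentials are not valid JSON");
  }
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Credentials must be a JSON object");
  }
  const key = parsed as Record<string, unknown>;
  if (key.type !== "service_account") {
    throw new Error('Expected "type" to be "service_account"');
  }
  for (const field of ["private_key", "client_email"]) {
    const value = key[field];
    if (typeof value !== "string" || value.length === 0) {
      throw new Error(`Missing or empty field: ${field}`);
    }
  }
  return key as unknown as ServiceAccountKey;
}
```

In a Node.js script, the input would typically be the value of the GOOGLE_INDEXING_KEY environment variable, validated once at startup so that a malformed key fails fast rather than at the first API call.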
Caching
URLs are cached for 24 hours in .indexing-cache.json to:
- Prevent duplicate API calls
- Save quota
- Speed up repeated indexing attempts
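The idea behind the 24-hour cache can be sketched with a small time-to-live map. This is an in-memory stand-in for illustration only: the class `UrlCache` and its methods are hypothetical, and the real package persists its cache to .indexing-cache.json in a format not described here.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical in-memory stand-in for the on-disk cache file.
class UrlCache {
  // url -> timestamp (ms) at which it was last indexed
  private readonly seen = new Map<string, number>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be demonstrated without waiting.
  constructor(ttlMs: number = DAY_MS, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  markIndexed(url: string): void {
    this.seen.set(url, this.now());
  }

  // True if the URL was indexed less than ttlMs ago; expired entries are dropped.
  isFresh(url: string): boolean {
    const indexedAt = this.seen.get(url);
    if (indexedAt === undefined) return false;
    if (this.now() - indexedAt >= this.ttlMs) {
      this.seen.delete(url);
      return false;
    }
    return true;
  }

  // URLs that still need an API call.
  filterUncached(urls: string[]): string[] {
    return urls.filter((url) => !this.isFresh(url));
  }
}

let clock = 0;
const cache = new UrlCache(DAY_MS, () => clock);
cache.markIndexed("https://example.com/a");
console.log(cache.filterUncached(["https://example.com/a", "https://example.com/b"]));
// [ 'https://example.com/b' ]

clock = DAY_MS; // 24 hours later the entry has expired
console.log(cache.filterUncached(["https://example.com/a"]));
// [ 'https://example.com/a' ]
```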
Logging
Logs are output to console with colors. Set log level:
export LOG_LEVEL=debug # info, warn, error, debug
License
MIT License - see LICENSE file.
Contributing
- Fork the repository
- Create your feature branch
- Add tests
- Submit a pull request
Support
For issues or questions:
- Check the setup guide
- Open an issue on GitHub
- Review logs for detailed error messages
