
search-agent v1.0.2

Oblien Search SDK - AI-powered web search, content extraction, and website crawling. Full documentation at https://oblien.com/docs/search-api

Oblien Search SDK

AI-powered web search, content extraction, and website crawling SDK for Node.js.

Installation

npm install search-agent

Quick Start

import { SearchClient } from 'search-agent';

const client = new SearchClient({
  clientId: process.env.OBLIEN_CLIENT_ID,
  clientSecret: process.env.OBLIEN_CLIENT_SECRET
});

// Search with AI-generated answers
const results = await client.search(
  ['What is machine learning?'],
  { summaryLevel: 'intelligent', includeAnswers: true }
);

console.log(results[0].answer);

Authentication

Get your API credentials from the Oblien Dashboard.

const client = new SearchClient({
  clientId: 'your-client-id',
  clientSecret: 'your-client-secret',
  apiURL: 'https://api.oblien.com' // optional, defaults to production
});

API Reference

Search

Perform AI-powered web searches with batch processing support.

Method

client.search(queries, options)

Parameters

queries (string[], required)

  • Array of search queries to execute
  • Supports batch processing of multiple queries

options (object, optional)

  • includeAnswers (boolean, default: false) - Generate AI-powered answers for search results
  • includeMetadata (boolean) - Include search metadata in response
  • summaryLevel ('low' | 'medium' | 'intelligent') - AI answer quality level
    • low - Fast, basic answers
    • medium - Balanced quality and speed (recommended)
    • intelligent - Highest quality, slower generation
  • maxResults (number) - Maximum number of results per query
  • language (string) - Language code (e.g., 'en', 'es', 'fr', 'de')
  • region (string) - Region code (e.g., 'us', 'uk', 'eu', 'asia')
  • freshness ('day' | 'week' | 'month' | 'year' | 'all') - Result freshness filter
  • startDate (string) - Start date for range filtering (ISO 8601 format)
  • endDate (string) - End date for range filtering (ISO 8601 format)
  • includeImages (boolean) - Include images in search results
  • includeImageDescriptions (boolean) - Include AI-generated image descriptions
  • includeFavicon (boolean) - Include website favicons
  • includeRawContent ('none' | 'with_links' | 'with_images_and_links') - Raw content inclusion level
  • chunksPerSource (number) - Number of content chunks per source
  • country (string) - Country code filter
  • includeDomains (string[]) - Array of domains to include in search
  • excludeDomains (string[]) - Array of domains to exclude from search
  • searchTopic ('general' | 'news' | 'finance') - Search topic category
  • searchDepth ('basic' | 'advanced') - Search depth level
  • timeRange ('none' | 'day' | 'week' | 'month' | 'year') - Time range filter
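The startDate/endDate options above take ISO 8601 date strings. As a sketch, a small helper can build a rolling date window for them; `lastNDaysRange` is our own illustrative name, not part of the SDK:

```javascript
// Build { startDate, endDate } for the last N days as YYYY-MM-DD strings,
// matching the ISO 8601 format the search options expect.
function lastNDaysRange(days, now = new Date()) {
  const end = now.toISOString().slice(0, 10);
  const start = new Date(now.getTime() - days * 86400000)
    .toISOString()
    .slice(0, 10);
  return { startDate: start, endDate: end };
}

// Hypothetical usage with the search client:
// const results = await client.search(
//   ['earnings reports'],
//   { ...lastNDaysRange(30), maxResults: 10 }
// );
```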

Returns

A Promise that resolves to an array of search results. Each result contains:

{
  success: boolean,           // Whether the search succeeded
  query: string,             // Original search query
  results: Array<{
    url: string,            // Result URL
    title: string,          // Result title
    content: string,        // Result content/snippet
    // Additional fields based on options
  }>,
  answer?: string,           // AI-generated answer (if includeAnswers=true)
  metadata?: object,         // Search metadata (if includeMetadata=true)
  time_took: number         // Time taken in milliseconds
}

Examples

Basic search:

const results = await client.search(['TypeScript tutorial']);

console.log(results[0].results[0].title);
console.log(results[0].results[0].url);

Search with AI answers:

const results = await client.search(
  ['What is quantum computing?'],
  { summaryLevel: 'intelligent', includeAnswers: true }
);

console.log(results[0].answer);

Batch search with options:

const results = await client.search(
  [
    'Latest AI developments',
    'Machine learning best practices',
    'Neural networks explained'
  ],
  {
    includeAnswers: true,
    summaryLevel: 'medium',
    maxResults: 5,
    freshness: 'week',
    language: 'en',
    region: 'us'
  }
);

results.forEach(result => {
  if (!result.success) {
    console.error(`Query failed: ${result.query}`);
    return;
  }
  console.log(`Query: ${result.query}`);
  console.log(`Answer: ${result.answer}`);
  console.log(`Results count: ${result.results.length}`);
});

Advanced filtering:

const results = await client.search(
  ['climate change research'],
  {
    includeDomains: ['nature.com', 'science.org', 'pnas.org'],
    searchTopic: 'general',
    freshness: 'year',
    includeMetadata: true
  }
);

Extract

Extract specific content from web pages using AI-powered extraction.

Method

client.extract(pages, options)

Parameters

pages (array, required)

  • Array of page objects to extract content from
  • Each page must include:
    • url (string, required) - Page URL to extract from
    • details (string[], required) - Array of extraction instructions
    • summaryLevel ('low' | 'medium' | 'intelligent', optional) - Extraction quality level

options (object, optional)

  • includeMetadata (boolean) - Include extraction metadata
  • timeout (number) - Request timeout in milliseconds (default: 30000)
  • maxContentLength (number) - Maximum content length to extract
  • format ('markdown' | 'html' | 'text') - Output format (default: 'markdown')
  • extractDepth ('basic' | 'advanced') - Extraction depth level
  • includeImages (boolean) - Include images in extracted content
  • includeFavicon (boolean) - Include page favicon
  • maxLength (number) - Maximum extraction length

Returns

A Promise that resolves to an extraction response:

{
  success: boolean,          // Whether extraction succeeded
  data: Array<{
    result: object,         // Extracted content
    page: {
      url: string,
      details: string[],
      summaryLevel: string
    }
  }>,
  errors: string[],          // Array of error messages (if any)
  time_took: number         // Time taken in milliseconds
}

Examples

Basic extraction:

const extracted = await client.extract([
  {
    url: 'https://example.com/article',
    details: [
      'Extract the article title',
      'Extract the main content',
      'Extract the author name'
    ]
  }
]);

if (extracted.success) {
  console.log(extracted.data[0].result);
}

Batch extraction with different instructions:

const extracted = await client.extract([
  {
    url: 'https://example.com/blog/post-1',
    details: [
      'Extract blog post title',
      'Extract publication date',
      'Extract main content',
      'Extract tags'
    ],
    summaryLevel: 'medium'
  },
  {
    url: 'https://example.com/products',
    details: [
      'Extract all product names',
      'Extract product prices',
      'Extract product descriptions'
    ],
    summaryLevel: 'intelligent'
  },
  {
    url: 'https://example.com/about',
    details: [
      'Extract company mission',
      'Extract team members',
      'Extract contact information'
    ]
  }
], {
  format: 'markdown',
  timeout: 60000
});

extracted.data.forEach(item => {
  console.log(`URL: ${item.page.url}`);
  console.log(`Extracted:`, item.result);
});

Advanced extraction with options:

const extracted = await client.extract([
  {
    url: 'https://research.example.com/paper',
    details: [
      'Extract paper abstract',
      'Extract key findings',
      'Extract methodology',
      'Extract conclusions',
      'Extract references'
    ],
    summaryLevel: 'intelligent'
  }
], {
  format: 'markdown',
  extractDepth: 'advanced',
  includeMetadata: true,
  maxContentLength: 100000
});

Crawl

Crawl and research websites with real-time streaming results.

Method

client.crawl(instructions, onEvent, options)

Parameters

instructions (string, required)

  • Natural language instructions for the crawl
  • Must include the target URL and what to extract/research
  • Example: "Crawl https://example.com/blog and extract all article titles and summaries"

onEvent (function, optional)

  • Callback function for real-time streaming events
  • Receives event objects with type and data
  • Event types:
    • page_crawled - Page was successfully crawled
    • content - Content chunk extracted
    • thinking - AI thinking process (if enabled)
    • error - Error occurred
    • crawl_end - Crawl completed

options (object, optional)

  • type ('deep' | 'shallow' | 'focused') - Crawl type (default: 'deep')
    • deep - Comprehensive crawl with detailed analysis
    • shallow - Quick crawl of main pages only
    • focused - Targeted crawl based on instructions
  • thinking (boolean) - Enable AI thinking process (default: true)
  • allow_thinking_callback (boolean) - Stream thinking events to callback (default: true)
  • stream_text (boolean) - Stream text results to callback (default: true)
  • maxDepth (number) - Maximum crawl depth (number of link levels)
  • maxPages (number) - Maximum number of pages to crawl
  • includeExternal (boolean) - Include external links in crawl
  • timeout (number) - Request timeout in milliseconds
  • crawlDepth ('basic' | 'advanced') - Crawl depth level
  • format ('markdown' | 'html' | 'text') - Output format
  • includeImages (boolean) - Include images in crawled content
  • includeFavicon (boolean) - Include favicons
  • followLinks (boolean) - Follow links during crawl

Returns

A Promise that resolves to the final crawl response:

{
  success: boolean,          // Whether crawl succeeded
  time_took: number         // Time taken in milliseconds
}

Examples

Basic crawl:

const result = await client.crawl(
  'Crawl https://example.com/blog and summarize all blog posts'
);

console.log(`Completed in ${result.time_took}ms`);

Crawl with event streaming:

const result = await client.crawl(
  'Crawl https://example.com/docs and extract all API endpoints with their descriptions',
  (event) => {
    if (event.type === 'page_crawled') {
      console.log(`Crawled page: ${event.url}`);
    } else if (event.type === 'content') {
      console.log(`Content found: ${event.text}`);
    } else if (event.type === 'thinking') {
      console.log(`AI thinking: ${event.thought}`);
    }
  },
  {
    type: 'deep',
    maxPages: 20
  }
);

Advanced crawl with options:

let crawledPages = [];
let extractedContent = [];

const result = await client.crawl(
  'Research https://news.example.com and find all articles about artificial intelligence from the last week',
  (event) => {
    switch (event.type) {
      case 'page_crawled':
        crawledPages.push(event.url);
        console.log(`Progress: ${crawledPages.length} pages crawled`);
        break;
      
      case 'content':
        extractedContent.push(event.text);
        console.log(`Extracted content length: ${event.text.length} chars`);
        break;
      
      case 'thinking':
        console.log(`AI Analysis: ${event.thought}`);
        break;
      
      case 'error':
        console.error(`Error: ${event.error}`);
        break;
    }
  },
  {
    type: 'focused',
    thinking: true,
    maxDepth: 3,
    maxPages: 50,
    includeExternal: false,
    format: 'markdown'
  }
);

console.log(`Crawl completed:`);
console.log(`- Total pages: ${crawledPages.length}`);
console.log(`- Content chunks: ${extractedContent.length}`);
console.log(`- Time: ${result.time_took}ms`);

Research-focused crawl:

const result = await client.crawl(
  `Crawl https://company.example.com and create a comprehensive report including:
   - Company overview and mission
   - Product offerings and features
   - Pricing information
   - Customer testimonials
   - Contact information and locations`,
  (event) => {
    if (event.type === 'content') {
      // Process extracted content in real-time
      processAndStore(event.text);
    }
  },
  {
    type: 'deep',
    thinking: true,
    maxDepth: 4,
    format: 'markdown'
  }
);

Error Handling

All methods throw errors for failed requests. Always use try-catch blocks:

try {
  const results = await client.search(['test query']);
  console.log(results);
} catch (error) {
  console.error('Search failed:', error.message);
}

Common error scenarios:

Missing credentials:

// Throws: 'clientId is required'
new SearchClient({ clientSecret: 'secret' });

Invalid parameters:

// Throws: 'queries must be a non-empty array'
await client.search([]);

// Throws: 'Page at index 0 is missing required field: details'
await client.extract([{ url: 'https://example.com' }]);

API errors:

try {
  await client.search(['query']);
} catch (error) {
  // Error message includes API error details
  console.error(error.message);
}

TypeScript Support

Full TypeScript definitions are included:

import { SearchClient, SearchOptions, SearchResult } from 'search-agent';

const client = new SearchClient({
  clientId: process.env.OBLIEN_CLIENT_ID!,
  clientSecret: process.env.OBLIEN_CLIENT_SECRET!
});

const options: SearchOptions = {
  summaryLevel: 'intelligent',
  maxResults: 10,
  freshness: 'week',
  includeAnswers: true
};

const results: SearchResult[] = await client.search(
  ['TypeScript best practices'],
  options
);

Rate Limiting

The API implements rate limiting. Refer to response headers for current limits:

  • X-RateLimit-Limit - Total requests allowed per window
  • X-RateLimit-Remaining - Requests remaining in current window
  • X-RateLimit-Reset - Window reset time (Unix timestamp)
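When a request is rejected for exceeding these limits, retrying with exponential backoff is a reasonable client-side strategy. The sketch below assumes the thrown error carries a numeric `status` field (e.g. 429); the SDK's actual error shape is not documented here, so treat `withRetry` as illustrative:

```javascript
// Retry a request on rate-limit (429) or server (5xx) errors, doubling the
// delay after each failed attempt. Assumes error.status exists; adapt the
// retryable check to the error shape your client actually throws.
async function withRetry(fn, { retries = 3, baseDelayMs = 500 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      const retryable = error.status === 429 || error.status >= 500;
      if (!retryable || attempt >= retries) throw error;
      const delay = baseDelayMs * 2 ** attempt; // 500ms, 1s, 2s, ...
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Hypothetical usage:
// const results = await withRetry(() => client.search(['test query']));
```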

Best Practices

Search

  1. Batch related queries - Process multiple related searches in one request
  2. Use appropriate summary levels - 'low' for speed, 'intelligent' for quality
  3. Set reasonable maxResults - Typical range: 5-20 results
  4. Apply filters early - Use domain filters and freshness to improve relevance

// Good: Batch related queries
const results = await client.search(
  ['ML frameworks', 'ML tools', 'ML libraries'],
  { includeAnswers: true, summaryLevel: 'medium', maxResults: 5 }
);

// Better: Use filters for precision
const results = await client.search(
  ['latest research'],
  {
    includeAnswers: true,
    includeDomains: ['arxiv.org', 'nature.com'],
    freshness: 'month'
  }
);

Extract

  1. Be specific with details - Clear instructions produce better results
  2. Batch similar extractions - Process related pages together
  3. Set appropriate timeouts - Increase for large pages or slow sites

// Good: Specific extraction instructions
const extracted = await client.extract([
  {
    url: 'https://example.com',
    details: [
      'Extract the main heading (h1 tag)',
      'Extract the first paragraph of content',
      'Extract any pricing information mentioned'
    ]
  }
]);

// Better: Include context in instructions
const extracted = await client.extract([
  {
    url: 'https://example.com/product',
    details: [
      'Extract product name from the title',
      'Extract pricing from the price section',
      'Extract feature list from the features section',
      'Extract customer rating if available'
    ],
    summaryLevel: 'intelligent'
  }
], {
  timeout: 60000,
  format: 'markdown'
});

Crawl

  1. Provide clear instructions - Include URL and specific research goals
  2. Set appropriate limits - maxPages and maxDepth prevent runaway crawls
  3. Use event callbacks - Process results in real-time for large crawls
  4. Choose the right type - 'focused' for specific targets, 'deep' for comprehensive coverage

// Good: Clear, focused instructions
await client.crawl(
  'Crawl https://docs.example.com and extract all API method signatures',
  (event) => processEvent(event),
  { type: 'focused', maxPages: 30 }
);

// Better: Detailed instructions with constraints
await client.crawl(
  `Research https://example.com/blog for articles about:
   - Cloud computing trends
   - DevOps best practices
   - Container orchestration
   Extract title, date, and summary for each relevant article`,
  (event) => {
    if (event.type === 'content') {
      saveToDatabase(event.text);
    }
  },
  {
    type: 'deep',
    maxPages: 50,
    maxDepth: 3,
    thinking: true
  }
);

License

MIT