@kognitivedev/web-search
v0.2.28
Published
Web search package with search adapters, page fetching, LLM-powered content extraction, and synthesis
Maintainers
Readme
@kognitivedev/web-search
Provider-agnostic web search orchestration with pluggable search adapters, page fetchers, LLM-powered extraction, and final-answer synthesis.
Installation
bun add @kognitivedev/web-search @kognitivedev/adapter-ai-sdk @ai-sdk/openai zodIf you already have a runtime adapter from another package, you only need @kognitivedev/web-search and zod.
What It Provides
createWebSearchWorkflow()for a full search -> fetch -> extract -> synthesize pipelinecreateWebSearchTool()for exposing the workflow as a provider-agnostic Kognitive tool- four built-in search adapters:
GoogleSearchAdapterSerperSearchAdapterTavilySearchAdapterBraveSearchAdapter
- two built-in page fetchers:
HttpPageFetcherScrapeDoPageFetcher
- extraction and synthesis agents you can customize with your own instructions
- optional structured output for the final synthesis step
How the Pipeline Works
createWebSearchWorkflow() runs four stages:
- search with the configured
SearchAdapter - fetch the top pages with the configured
PageFetcher - extract only query-relevant content from each page with a structured extraction agent
- synthesize the extracted evidence into a final answer with sources
By default, the workflow fetches up to 5 pages and uses a 15_000ms fetch timeout.
Quick Start
import { openai } from "@ai-sdk/openai";
import { createAISDKRuntimeAdapter } from "@kognitivedev/adapter-ai-sdk";
import {
SerperSearchAdapter,
createWebSearchWorkflow,
} from "@kognitivedev/web-search";
const runtime = createAISDKRuntimeAdapter({
model: openai("gpt-4.1-mini"),
});
const workflow = createWebSearchWorkflow({
runtime,
searchAdapter: new SerperSearchAdapter({
apiKey: process.env.SERPER_API_KEY!,
gl: "us",
hl: "en",
}),
});
const result = await workflow.execute({
query: "latest TypeScript 5.9 release notes",
maxResults: 8,
maxPages: 4,
timeRange: "month",
});
if (result.status === "completed") {
console.log(result.output.answer);
console.log(result.output.sources);
}Tool Integration
Use createWebSearchTool() when you want agents to invoke web search directly.
import { openai } from "@ai-sdk/openai";
import { createAgent } from "@kognitivedev/agents";
import { createAISDKRuntimeAdapter } from "@kognitivedev/adapter-ai-sdk";
import {
BraveSearchAdapter,
createWebSearchTool,
} from "@kognitivedev/web-search";
const runtime = createAISDKRuntimeAdapter({
model: openai("gpt-4.1"),
});
const webSearchTool = createWebSearchTool({
runtime,
searchAdapter: new BraveSearchAdapter({
apiKey: process.env.BRAVE_SEARCH_API_KEY!,
}),
});
const agent = createAgent({
name: "research-assistant",
instructions: "Use web-search when the user asks for current information.",
runtime,
tools: [webSearchTool],
});The tool uses:
id: "web-search"- a built-in description optimized for current-events and live-information use cases
WebSearchInputSchemaas its input contracttoModelOutput()that returns the synthesized answer and appends sources
Search Adapters
All adapters implement:
interface SearchAdapter {
readonly provider: string;
search(query: SearchQuery): Promise<SearchResponse>;
}Supported query fields:
querymaxResultsregiontimeRangeincludeDomainsexcludeDomains
Serper
import { SerperSearchAdapter } from "@kognitivedev/web-search";
const adapter = new SerperSearchAdapter({
apiKey: process.env.SERPER_API_KEY!,
gl: "us",
hl: "en",
});Notes:
- sends domain filters through
siteSearchandnotSiteSearch - maps
timeRangeto Serpertbsvalues - good default choice when you want a Google-like result set with a simple API
Tavily
import { TavilySearchAdapter } from "@kognitivedev/web-search";
const adapter = new TavilySearchAdapter({
apiKey: process.env.TAVILY_API_KEY!,
includeAnswer: false,
includeRawContent: false,
});Notes:
- supports domain filters and
timeRange - maps result content into the package's neutral
snippetfield
Brave
import { BraveSearchAdapter } from "@kognitivedev/web-search";
const adapter = new BraveSearchAdapter({
apiKey: process.env.BRAVE_SEARCH_API_KEY!,
});Notes:
- supports
regionandtimeRange - returns Brave web results mapped into neutral result items
Google Custom Search
import { GoogleSearchAdapter } from "@kognitivedev/web-search";
const adapter = new GoogleSearchAdapter({
apiKey: process.env.GOOGLE_SEARCH_API_KEY!,
cx: process.env.GOOGLE_SEARCH_CX!,
});Notes:
- translates include/exclude domains directly into the
qquery string - uses Google Custom Search Engine
cx
Page Fetching
Search results only give snippets. Fetchers turn the linked pages into markdown so the extraction agent can work from real page content.
HttpPageFetcher
HttpPageFetcher is the default fetcher.
import { HttpPageFetcher } from "@kognitivedev/web-search";
const workflow = createWebSearchWorkflow({
runtime,
searchAdapter,
pageFetcher: new HttpPageFetcher(),
fetchOptions: {
timeoutMs: 10_000,
userAgent: "MyCrawler/1.0",
maxContentLength: 2 * 1024 * 1024,
},
});It:
- fetches HTML directly
- strips
script,style,nav,footer,header,noscript, and comments - converts cleaned HTML to markdown with
turndown - supports timeout, header overrides, max content length, proxying, and scoped SSL bypass
ScrapeDoPageFetcher
Use ScrapeDoPageFetcher when pages need rendering, geo-targeting, or a managed anti-bot fetch layer.
import { ScrapeDoPageFetcher } from "@kognitivedev/web-search";
const workflow = createWebSearchWorkflow({
runtime,
searchAdapter,
pageFetcher: new ScrapeDoPageFetcher({
token: process.env.SCRAPEDO_TOKEN!,
render: true,
geoCode: "us",
device: "desktop",
blockResources: true,
}),
});It:
- requests
output=markdownfrom scrape.do - optionally enables JS rendering
- supports
super,geoCode,device,blockResources, and custom timeout options
Structured Output
Pass outputSchema at execution time when you want the final synthesis step to return both text and structured JSON.
const outputSchema = {
name: "release_summary",
schema: {
type: "object",
properties: {
summary: { type: "string" },
breakingChanges: {
type: "array",
items: { type: "string" },
},
links: {
type: "array",
items: { type: "string" },
},
},
required: ["summary", "breakingChanges"],
},
};
const result = await workflow.execute({
query: "latest Next.js release changes",
outputSchema,
});
if (result.status === "completed") {
console.log(result.output.answer);
console.log(result.output.structuredAnswer);
}The returned WebSearchResult shape is:
interface WebSearchResult<TStructured = unknown> {
answer: string;
structuredAnswer?: TStructured;
sources: Array<{
url: string;
title: string;
snippet: string;
relevantContent: string;
confidence: number;
}>;
query: string;
searchProvider: string;
totalResultsFetched: number;
}Workflow Configuration
createWebSearchWorkflow() accepts:
searchAdapter: requiredruntime: requiredpageFetcher: optional, defaults tonew HttpPageFetcher()fetchOptions: optional fetch-time controlsmaxPages: optional workflow-level page cap, default5maxContentTokens: optional extraction truncation limitextractInstructions: optional replacement prompt for the extraction agentsynthesisInstructions: optional replacement prompt for the synthesis agentlogger: optional logger, defaults toConsoleLogger("[web-search]")
Request-time inputs can override some workflow defaults:
maxPagesmaxResultsregiontimeRangeincludeDomainsexcludeDomainsoutputSchema
Custom Extraction and Synthesis Prompts
The package ships with built-in prompts for both agent stages, but you can replace them:
const workflow = createWebSearchWorkflow({
runtime,
searchAdapter,
extractInstructions: [
"Extract only information directly relevant to the query.",
"Prefer short factual bullets over long quotations.",
].join("\n"),
synthesisInstructions: [
"Write a concise executive summary.",
"Call out contradictions between sources.",
"End with a short source list.",
].join("\n"),
});Empty Results and Confidence Filtering
The workflow returns a deterministic fallback when it cannot produce a credible answer:
- if all fetches fail, the result is normalized into an empty answer payload
- if every extracted page has
confidence <= 0.2, synthesis is skipped - the fallback answer is:
No relevant results could be found for this query. Please try again.This makes the tool safer to compose inside larger agents and workflows because downstream code always receives a stable WebSearchResult.
Export Surface
Main exports:
- workflow:
createWebSearchWorkflow,WebSearchInputSchema - tool:
createWebSearchTool - adapters:
BaseSearchAdapter,GoogleSearchAdapter,SerperSearchAdapter,TavilySearchAdapter,BraveSearchAdapter - fetchers:
HttpPageFetcher,ScrapeDoPageFetcher - agents:
createExtractAgent,createSynthesisAgent - types:
SearchQuery,SearchResponse,WebSearchInput,WebSearchResult,WebSearchWorkflowConfig, fetcher types
Related Docs
- package docs:
apps/docs/docs/web-search/* - API reference:
apps/docs/docs/reference/web-search.md - workflow primitives:
@kognitivedev/workflows - tools:
@kognitivedev/tools
