@aws-blocks/bb-knowledge-base

v0.1.2

Semantic document retrieval backed by Amazon Bedrock Knowledge Bases.

When to use: Search over your own documents — FAQs, product guides, support articles, internal wikis. Point it at a folder and query with natural language.

When NOT to use: If you need structured key-value lookups, use KVStore. If you need relational queries, use Database. If you need full-text keyword search only (no semantic understanding), roll your own with DistributedTable.

Quick Start

import { Scope, ApiNamespace } from '@aws-blocks/core';
import { KnowledgeBase } from '@aws-blocks/bb-knowledge-base';

const scope = new Scope('my-app');

const kb = new KnowledgeBase(scope, 'docs', {
  source: './knowledge',
  description: 'Product documentation',
});

export const api = new ApiNamespace(scope, 'api', (context) => ({
  async search(query: string) {
    const results = await kb.retrieve(query, { maxResults: 5 });
    return { results };
  },
}));

API

const kb = new KnowledgeBase(scope, id, options)

| Method | Returns | Description |
|--------|---------|-------------|
| retrieve(query, options?) | Promise<RetrieveResult[]> | Search for relevant document chunks. Returns results ranked by relevance score. |

Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| source | string | (required) | Document source: a local folder path or an s3:// URI pointing to a bucket or folder. |
| chunking | ChunkingConfig | { strategy: 'semantic' } | How documents are split into chunks. |
| embeddingDimensions | 256 \| 512 \| 1024 | 1024 | Embedding model dimensions. |
| description | string | (none) | Human-readable description for the knowledge base. |
| removalPolicy | 'destroy' \| 'retain' | 'retain' | CDK removal behavior for BB-created data buckets (imported s3:// URI sources are unaffected). Defaults to RETAIN (bucket and documents preserved on cdk destroy) unless in sandbox mode. Pass 'destroy' for ephemeral stacks; this also enables autoDeleteObjects. |
| logger | ChildLogger | (none) | Optional logger for internal operations. When omitted, a default Logger at error level is created. |

Source Configuration

// Local folder — synced to S3 on deploy
new KnowledgeBase(scope, 'docs', { source: './knowledge' });

// Existing S3 bucket (with optional prefix)
new KnowledgeBase(scope, 'docs', { source: 's3://my-bucket' });
new KnowledgeBase(scope, 'docs', { source: 's3://my-bucket/docs/prefix/' });

S3 URI source: When using an s3:// URI, the CDK construct imports the existing bucket instead of creating a new one. An optional path prefix narrows which objects Bedrock ingests. No BucketDeployment is created — your documents must already be in the bucket. In local development, S3 URI sources are not supported (use a local folder path instead).
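
To make the two source forms concrete, here is a minimal sketch of how a source string could be split into a local path or a bucket plus prefix. `parseSource`, `ParsedSource`, and `assertLocalSource` are hypothetical names for illustration only; they are not part of this package, and the real construct may validate sources differently.

```typescript
// Hypothetical helpers: not exported by @aws-blocks/bb-knowledge-base.
type ParsedSource =
  | { kind: 'local'; path: string }
  | { kind: 's3'; bucket: string; prefix: string };

function parseSource(source: string): ParsedSource {
  if (!source.startsWith('s3://')) {
    // Anything that is not an s3:// URI is treated as a local folder path.
    return { kind: 'local', path: source };
  }
  const rest = source.slice('s3://'.length);
  const slash = rest.indexOf('/');
  const bucket = slash === -1 ? rest : rest.slice(0, slash);
  // Everything after the first slash is the optional key prefix.
  const prefix = slash === -1 ? '' : rest.slice(slash + 1);
  if (bucket.length === 0) {
    throw new Error(`Invalid S3 URI (missing bucket): ${source}`);
  }
  return { kind: 's3', bucket, prefix };
}

// Assumed guard for local development, where S3 URI sources are not supported.
function assertLocalSource(source: string): string {
  const parsed = parseSource(source);
  if (parsed.kind !== 'local') {
    throw new Error('S3 URI sources are not supported in local development');
  }
  return parsed.path;
}
```

Under these assumptions, `s3://my-bucket/docs/prefix/` yields bucket `my-bucket` with prefix `docs/prefix/`, and `s3://my-bucket` yields an empty prefix.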

Chunking Strategies

| Strategy | Description |
|----------|-------------|
| 'semantic' | (Default) Splits at natural topic boundaries using breakpoint detection. |
| 'fixed' | Fixed-size chunks with configurable chunkSize and chunkOverlap. |
| 'hierarchical' | Two-level chunking (parent 1500 tokens, child 300 tokens). |
| 'none' | No chunking; each document is a single chunk. |

Chunking Options

chunking accepts a ChunkingConfig. Options apply only to the relevant strategy; others are ignored.

| Option | Type | Default | Applies to | Description |
|--------|------|---------|------------|-------------|
| strategy | 'semantic' \| 'fixed' \| 'hierarchical' \| 'none' | 'semantic' | all | Chunking strategy. |
| chunkSize | number | 300 | 'fixed' | Max tokens per chunk. |
| chunkOverlap | number | 20 | 'fixed' | Overlap percentage between consecutive chunks (0–100). |
| breakpointPercentile | number | 95 | 'semantic' | Breakpoint percentile for topic-boundary detection (0–100). |

chunking: { strategy: 'fixed', chunkSize: 500, chunkOverlap: 10 }
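
To build intuition for how chunkSize and chunkOverlap interact, the sketch below splits text into fixed-size windows with a percentage overlap. It is illustrative only: real chunking happens inside Bedrock on model tokens, whereas this hypothetical `chunkFixed` helper approximates tokens with whitespace-separated words.

```typescript
// Illustrative only: approximates tokens with whitespace-separated words.
// chunkFixed is a hypothetical helper, not part of the package.
function chunkFixed(text: string, chunkSize = 300, chunkOverlap = 20): string[] {
  if (chunkSize < 1) throw new Error('chunkSize must be at least 1');
  if (chunkOverlap < 0 || chunkOverlap > 100) {
    throw new Error('chunkOverlap must be a percentage between 0 and 100');
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  // Number of words shared between consecutive chunks.
  const overlapWords = Math.floor((chunkSize * chunkOverlap) / 100);
  // Always advance by at least one word so the loop terminates at 100% overlap.
  const step = Math.max(1, chunkSize - overlapWords);
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(' '));
    if (start + chunkSize >= words.length) break; // this chunk reached the end
  }
  return chunks;
}
```

With chunkSize 500 and chunkOverlap 10, as in the config line above, each window would share roughly 50 of these approximate tokens with the previous one.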

Retrieve Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| maxResults | number | 10 | Maximum results to return. Range: 1–100. |
| filter | MetadataFilter | (none) | Metadata filter with AND semantics across all key-value pairs. |

Retrieve Result

| Field | Type | Description |
|-------|------|-------------|
| text | string | Chunk text content. |
| score | number | Relevance score 0.0–1.0. |
| source | string | Source document path or URL. |
| metadata | Record<string, string> | Document metadata. Includes auto-populated folder from subfolders. |
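
As a rough illustration of working with this result shape, the sketch below drops low-scoring chunks and groups the rest by source document. `RetrieveResultLike` is a local mirror of the fields in the table, and `groupBySource` and its 0.5 default threshold are hypothetical; in real code you would presumably use the package's own result type.

```typescript
// Local mirror of the documented result fields (an assumption for this sketch).
interface RetrieveResultLike {
  text: string;
  score: number; // relevance, 0.0 to 1.0
  source: string;
  metadata: Record<string, string>;
}

// Hypothetical helper: keep results at or above minScore, grouped by source.
function groupBySource(
  results: RetrieveResultLike[],
  minScore = 0.5,
): Map<string, RetrieveResultLike[]> {
  const grouped = new Map<string, RetrieveResultLike[]>();
  for (const result of results) {
    if (result.score < minScore) continue; // skip weak matches
    const existing = grouped.get(result.source) ?? [];
    existing.push(result);
    grouped.set(result.source, existing);
  }
  return grouped;
}
```

Grouping by source can be handy when showing one citation per document rather than one per chunk.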

Metadata Filtering

Filter results by document metadata. All conditions use AND semantics:

// Only return chunks from the 'faq' folder
const results = await kb.retrieve('how do I reset my password', {
  filter: { folder: { equals: 'faq' } },
});

// Multiple filters (AND)
const enterpriseResults = await kb.retrieve('pricing', {
  filter: {
    folder: { equals: 'products' },
    category: { equals: 'enterprise' },
  },
});

Subfolder paths automatically populate the folder metadata key. For example, a file at ./knowledge/faq/billing.md gets metadata.folder = 'faq'.
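
The AND semantics can be sketched as a plain predicate over a chunk's metadata. This is a simplified model that covers only the `equals` condition shown above; `EqualsFilter` and `matchesFilter` are hypothetical names, and the real MetadataFilter type may support more operators.

```typescript
// Assumed filter shape, based only on the `equals` conditions shown above.
type EqualsFilter = Record<string, { equals: string }>;

// Hypothetical helper: true only when every filter key matches (AND semantics).
function matchesFilter(
  metadata: Record<string, string>,
  filter: EqualsFilter,
): boolean {
  return Object.entries(filter).every(
    ([key, condition]) => metadata[key] === condition.equals,
  );
}

const exampleFilter: EqualsFilter = {
  folder: { equals: 'products' },
  category: { equals: 'enterprise' },
};

// Matches: both keys agree with the filter.
const enterpriseMatch = matchesFilter(
  { folder: 'products', category: 'enterprise' },
  exampleFilter,
);

// Does not match: folder agrees but category differs.
const starterMatch = matchesFilter(
  { folder: 'products', category: 'starter' },
  exampleFilter,
);
```

In this model an empty filter matches every chunk, and a chunk missing a filtered key never matches.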

Error Handling

import { isBlocksError } from '@aws-blocks/core';
import { KnowledgeBaseErrors } from '@aws-blocks/bb-knowledge-base';

try {
  const results = await kb.retrieve('query');
} catch (e: unknown) {
  if (isBlocksError(e, KnowledgeBaseErrors.NotReady)) {
    // KB not yet deployed or ingested
  }
  if (isBlocksError(e, KnowledgeBaseErrors.ValidationError)) {
    // Empty query
  }
  throw e;
}

| Error Constant | Name | When |
|---|---|---|
| KnowledgeBaseErrors.RetrievalFailed | RetrievalFailedException | Bedrock retrieval call failed |
| KnowledgeBaseErrors.NotReady | KnowledgeBaseNotReadyException | KB not deployed or env vars missing |
| KnowledgeBaseErrors.InvalidSource | InvalidSourceConfigException | Source folder not found or invalid config |
| KnowledgeBaseErrors.InvalidFilter | InvalidFilterException | Invalid filter keys in Bedrock query |
| KnowledgeBaseErrors.ValidationError | KnowledgeBaseValidationError | Empty or invalid query |
| KnowledgeBaseErrors.BrowserNotSupported | BrowserNotSupportedException | Used in a browser context; KnowledgeBase is server-side only |

Deploy Behavior

cdk deploy automatically triggers document ingestion (fire-and-forget). Ingestion runs asynchronously after the deploy completes. Check the AWS console to monitor ingestion progress.

Scaling & Cost (AWS)

  • Embedding model: Amazon Titan Text Embeddings V2
  • Vector store: S3 Vectors (serverless, no provisioning)
  • Embedding cost: ~$0.00002 per 1,000 tokens (ingestion)
  • Retrieval cost: ~$0.00002 per 1,000 tokens (query embedding) + S3 Vectors query cost
  • Storage: S3 standard pricing for source documents + S3 Vectors for embeddings
  • Max document size: 50 MB per file
  • Supported formats: .md, .txt, .html, .htm, .csv, .json (plus binary formats parsed on AWS: .pdf, .doc, .docx, .xls, .xlsx)
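
For a back-of-the-envelope estimate, the embedding price above can be turned into a small calculation. This sketch covers only the embedding charge at the approximate rate quoted here; S3 Vectors queries and storage are excluded, the corpus size is made up, and the rate should be checked against current AWS pricing.

```typescript
// Approximate embedding price quoted above (USD per 1,000 tokens).
const USD_PER_1K_TOKENS = 0.00002;

// Hypothetical helper: embedding cost only, no S3 Vectors or storage charges.
function estimateEmbeddingCostUsd(tokens: number): number {
  if (tokens < 0) throw new Error('tokens must be non-negative');
  return (tokens / 1000) * USD_PER_1K_TOKENS;
}

// Made-up corpus: 2,000 documents averaging 5,000 tokens each (10 million tokens).
const ingestionCostUsd = estimateEmbeddingCostUsd(2000 * 5000); // about $0.20
```

The same function applies to query embeddings, where token counts per call are typically far smaller.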

Local Development

In local dev mode, KnowledgeBase reads documents from the source folder, chunks by paragraphs, and uses TF-IDF for relevance scoring. Results are cached to .bb-data/{fullId}/chunks.json for fast restarts.
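
The paragraph-plus-TF-IDF approach can be sketched as follows. This is a generic textbook variant written for illustration, not the package's actual mock: the function names, the tokenizer, and the smoothed IDF formula are all assumptions, and the scores here are not normalized to the 0.0–1.0 range.

```typescript
// Generic TF-IDF sketch; the real local mock may tokenize and weight differently.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Treat blank-line-separated paragraphs as chunks.
function splitParagraphs(doc: string): string[] {
  return doc.split(/\n\s*\n/).map((p) => p.trim()).filter((p) => p.length > 0);
}

function rankByTfIdf(
  chunks: string[],
  query: string,
  maxResults = 10,
): { text: string; score: number }[] {
  const docs = chunks.map((text) => ({ text, terms: tokenize(text) }));
  const queryTerms = Array.from(new Set(tokenize(query)));
  const scored = docs.map(({ text, terms }) => {
    let score = 0;
    for (const term of queryTerms) {
      // Term frequency: share of this chunk's tokens equal to the query term.
      const tf = terms.filter((t) => t === term).length / Math.max(1, terms.length);
      const docsWithTerm = docs.filter((d) => d.terms.includes(term)).length;
      // Smoothed inverse document frequency; rarer terms weigh more.
      const idf = Math.log((1 + docs.length) / (1 + docsWithTerm)) + 1;
      score += tf * idf;
    }
    return { text, score };
  });
  // Keep only chunks that share at least one term with the query.
  return scored
    .filter((r) => r.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxResults);
}
```

Because matching is purely lexical, a query for "password" would not match a chunk that only says "credentials", which is the kind of gap the parity notes below describe.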

Parity notes:

  • Scoring uses TF-IDF (keyword-based) rather than real embeddings. Scores are relative within the mock and won't match production Bedrock scores exactly.
  • The API contract (method signatures, error types, result shape) is identical to AWS.
  • Metadata filtering and maxResults work identically.
  • S3 URI sources are not supported in local development — use a local folder path.

Wipe cached data with rm -rf .bb-data.
