@se-studio/search
v1.0.70
AI-powered site search with Upstash Search for Next.js marketing sites
AI-powered site search for Next.js marketing sites using Upstash Search. Combines semantic and full-text search with zero infrastructure to manage.
Overview
This package provides:
- Search client – typed wrapper around @upstash/search with automatic batch handling
- Content indexing – extracts searchable text from CMS content using MarkdownConverter
- Webhook handler – incremental index updates on Contentful publish/delete
- Full rebuild – enumerates all content and re-indexes in batches
- API route factories – drop-in Next.js route handlers for search and rebuild
- Client hook – useSearch() for building search UIs with debouncing
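
The debouncing mentioned for the client hook can be pictured with a small sketch. This is not the hook's implementation, which is not shown in this README; `debounce`, `sent`, and the 20 ms delay are illustrative names and values chosen for the example.

```typescript
// Hypothetical helper: delays `fn` until `waitMs` has passed without a new call,
// so only the last of a burst of calls is executed.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    // Cancel the pending call, if any, and schedule a fresh one.
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// Records which queries would actually be sent to the search endpoint.
const sent: string[] = [];
const search = debounce((query: string) => {
  sent.push(query);
}, 20);

// Three keystrokes in quick succession: only the final query is sent.
search('u');
search('up');
search('upstash');
```

The effect is that a user typing quickly triggers one search request for the final input rather than one per keystroke.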
Setup
1. Install
The package is a workspace dependency. Add it to your app's package.json:
{
  "dependencies": {
    "@se-studio/search": "workspace:*"
  }
}
2. Create an Upstash Search database
Go to console.upstash.com/search and create a database. Copy the REST URL and token.
3. Environment variables
Add to your .env.local:
UPSTASH_SEARCH_REST_URL=https://your-search-url.upstash.io
UPSTASH_SEARCH_REST_TOKEN=your-token
4. Search config
Create src/lib/search-config.ts:
import 'server-only';
import type { SearchIndexingConfig } from '@se-studio/search';
export const searchIndexingConfig: SearchIndexingConfig = {
  searchIndex: {
    connection: {
      url: process.env.UPSTASH_SEARCH_REST_URL ?? '',
      token: process.env.UPSTASH_SEARCH_REST_TOKEN ?? '',
    },
    publishedIndexName: 'published',
    previewIndexName: 'preview',
  },
  contentTypes: [
    { type: 'page', enabled: true },
    { type: 'article', enabled: true },
    { type: 'person', enabled: false },
  ],
  indexComponents: true,
  respectIndexedFlag: true,
  respectHiddenFlag: true,
};
5. Search API route
Create src/app/api/search/route.ts:
import { createSearchApiHandler } from '@se-studio/search/api';
import { buildInformation } from '@/lib/converter-context';
import { searchIndexingConfig } from '@/lib/search-config';
export const GET = createSearchApiHandler({
  searchConfig: searchIndexingConfig.searchIndex.connection,
  publishedIndexName: searchIndexingConfig.searchIndex.publishedIndexName,
  previewIndexName: searchIndexingConfig.searchIndex.previewIndexName,
  isPreview: buildInformation.preview ?? false,
});
6. Rebuild API route
Create src/app/api/search/rebuild/route.ts:
import { createSearchClient } from '@se-studio/search/client';
import { rebuildSearchIndex } from '@se-studio/search/indexing';
import { createRebuildApiHandler } from '@se-studio/search/api';
import { buildOptions, getContentfulConfig } from '@/lib/cms-server';
import { customerName, license } from '@/lib/constants';
import { buildInformation, converterContext } from '@/lib/converter-context';
import { searchIndexingConfig } from '@/lib/search-config';
import { baseUrl, revalidationSecret } from '@/lib/server-config';
const isPreview = buildInformation.preview ?? false;
export const POST = createRebuildApiHandler({
  rebuildSecret: revalidationSecret ?? '',
  isPreview,
  rebuildFn: () => {
    const config = getContentfulConfig(isPreview);
    const client = createSearchClient(searchIndexingConfig.searchIndex.connection);
    // Pick the index that matches the environment this app was built for.
    const indexName = isPreview
      ? searchIndexingConfig.searchIndex.previewIndexName
      : searchIndexingConfig.searchIndex.publishedIndexName;
    return rebuildSearchIndex({
      client,
      indexName,
      indexingConfig: searchIndexingConfig,
      converterContext,
      contentfulConfig: config,
      fetchOptions: buildOptions({ preview: isPreview }),
      urlCalculators: converterContext.urlCalculators,
      siteConfig: { canonicalBaseUrl: baseUrl, source: customerName, license },
    });
  },
});
7. Webhook integration
Update your src/app/api/revalidate/route.ts to call the search webhook handler after cache revalidation. See the example-brightline app for the full pattern.
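
The ordering described above (revalidate the cache first, then update the search index) can be sketched as follows. Both callbacks are placeholders: the signature of the package's webhook handler is not shown in this README, so `revalidateThenIndex`, `revalidate`, and `updateSearchIndex` are hypothetical names, and the choice to tolerate an indexing failure is an assumption of this sketch rather than documented behaviour.

```typescript
// Placeholder dependencies (assumptions): `revalidate` stands for the app's
// existing cache revalidation, `updateSearchIndex` for whatever wraps the
// package's webhook handler in your route.
interface RevalidateDeps<P> {
  revalidate: (payload: P) => Promise<void>;
  updateSearchIndex: (payload: P) => Promise<void>;
}

// Runs cache revalidation first, then the search index update.
// A failed index update is logged and reported, but does not fail the request.
async function revalidateThenIndex<P>(
  payload: P,
  deps: RevalidateDeps<P>,
): Promise<{ revalidated: true; indexed: boolean }> {
  await deps.revalidate(payload);
  try {
    await deps.updateSearchIndex(payload);
    return { revalidated: true, indexed: true };
  } catch (error) {
    console.error('Search index update failed', error);
    return { revalidated: true, indexed: false };
  }
}
```

The example-brightline app remains the reference for the real wiring; this only illustrates the sequence.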
8. Client-side search
'use client';
import { useSearch } from '@se-studio/search/hooks';
export function SearchPage() {
  const { query, setQuery, results, isLoading, error, totalCount } = useSearch();
  return (
    <div>
      <input value={query} onChange={(e) => setQuery(e.target.value)} placeholder="Search..." />
      {isLoading && <p>Searching...</p>}
      {error && <p>Error: {error}</p>}
      {results.map((r) => (
        <a key={r.id} href={r.metadata.href}>
          <h3>{r.content.title}</h3>
          <p>{r.content.description}</p>
        </a>
      ))}
    </div>
  );
}
Triggering a full rebuild
curl -X POST https://your-site.com/api/search/rebuild \
  -H "Authorization: Bearer YOUR_REVALIDATION_SECRET"
The index to rebuild (published vs preview) is determined by the app's own isPreview flag, set at initialization time in createRebuildApiHandler.
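
The same request can be issued from a script or CI job. A minimal sketch, assuming Node 18+ (global `fetch` and `Request`); `buildRebuildRequest` and `triggerRebuild` are illustrative names and not part of the package. The route path and Bearer header are taken from the curl command above.

```typescript
// Hypothetical helper: builds the same POST request as the curl command.
function buildRebuildRequest(siteUrl: string, secret: string): Request {
  // Resolve the rebuild route against the site origin.
  const url = new URL('/api/search/rebuild', siteUrl);
  return new Request(url.href, {
    method: 'POST',
    headers: { Authorization: `Bearer ${secret}` },
  });
}

// Hypothetical helper: sends the request and fails loudly on a non-2xx status.
async function triggerRebuild(siteUrl: string, secret: string): Promise<void> {
  const response = await fetch(buildRebuildRequest(siteUrl, secret));
  if (!response.ok) {
    throw new Error(`Rebuild failed with HTTP ${response.status}`);
  }
}
```

Because the target index is fixed by the deployed app's isPreview flag, rebuilding the preview index means calling the preview deployment's URL, not passing a different parameter.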
Advanced: documentTransformer
SearchIndexingConfig accepts an optional documentTransformer callback that is invoked for every SearchDocument after it has been built — in both the full-rebuild and webhook-driven incremental-update paths. Use it to patch metadata, inject custom fields, or drop specific documents entirely.
import type { ContentData } from '@se-studio/search'; // re-exported from @se-studio/markdown-renderer
import type { SearchIndexingConfig } from '@se-studio/search';
export const searchIndexingConfig: SearchIndexingConfig = {
  // ...
  documentTransformer: (doc, contentData) => {
    // Augment: add a custom field
    doc.metadata.myCustomField = 'value';
    // Drop: returning null removes the document from the index
    if (doc.metadata.slug === 'draft-preview') return null;
    return doc;
  },
};
Signature:
documentTransformer?: (doc: SearchDocument, contentData: ContentData) => SearchDocument | null;
- doc – the built SearchDocument (safe to mutate in place or return a new object)
- contentData – the raw CMS content data for the entry, giving access to all fields
- Return the document (modified or not) to include it, or null to drop it
The transformer runs on every chunk of a multi-chunk entry, so a 3-chunk page will call it 3 times.
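
The per-chunk behaviour can be illustrated with a small stand-alone sketch. The types below are simplified stand-ins for the package's SearchDocument and ContentData, and `applyTransformer` plus the `pricing#0`-style ids are hypothetical; the real indexing code is not shown in this README.

```typescript
// Simplified stand-ins for the package types (assumption for illustration).
interface SearchDocument {
  id: string;
  content: { title: string };
  metadata: Record<string, unknown>;
}
type ContentData = Record<string, unknown>;
type DocumentTransformer = (doc: SearchDocument, contentData: ContentData) => SearchDocument | null;

// Hypothetical driver: calls the transformer once per chunk and drops nulls.
function applyTransformer(
  chunks: SearchDocument[],
  contentData: ContentData,
  transformer: DocumentTransformer,
): SearchDocument[] {
  const kept: SearchDocument[] = [];
  for (const chunk of chunks) {
    const result = transformer(chunk, contentData);
    if (result !== null) kept.push(result);
  }
  return kept;
}

// A 3-chunk page plus a single-chunk entry that should be dropped.
const docs: SearchDocument[] = [
  ...[0, 1, 2].map((i) => ({
    id: `pricing#${i}`,
    content: { title: 'Pricing' },
    metadata: { slug: 'pricing' },
  })),
  { id: 'draft#0', content: { title: 'Draft' }, metadata: { slug: 'draft-preview' } },
];

let calls = 0;
const transformed = applyTransformer(docs, {}, (doc) => {
  calls += 1;
  if (doc.metadata.slug === 'draft-preview') return null;
  return { ...doc, metadata: { ...doc.metadata, myCustomField: 'value' } };
});

console.log(calls, transformed.length); // 4 3
```

Since every chunk passes through the callback independently, a drop rule should key on something shared by all chunks of an entry (such as the slug), otherwise an entry may end up partially indexed.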
Canonical URLs and custom indexes (IIndexableContent)
For Upstash Search, each SearchDocument already carries metadata.href (full path), populated from the CMS model’s href in buildSearchDocuments — do not infer URLs from slug + content type at query time.
If you maintain a separate full-text index (e.g. Lunr, blob JSON), use the shared IIndexableContent type from @se-studio/search. It requires:
- href – full canonical path as used in <a href> (e.g. /insights/my-article)
- content, title, id, contentType (string label), metadata
Optional slug remains useful for tokenisation; do not rely on it alone to build user-facing links.
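
As a rough illustration of the shape described above: the interface below is a local mirror of the listed fields, with field types assumed for the example. The real IIndexableContent exported by @se-studio/search may differ, and `toResultLink` is a hypothetical helper.

```typescript
// Local mirror of the listed fields; types are assumptions, not the package's definition.
interface IndexableContentSketch {
  id: string;
  href: string;
  title: string;
  content: string;
  contentType: string;
  metadata: Record<string, unknown>;
  slug?: string;
}

const entry: IndexableContentSketch = {
  id: 'article-42',
  href: '/insights/my-article',
  title: 'My article',
  content: 'Plain-text body used for full-text matching.',
  contentType: 'article',
  metadata: {},
  slug: 'my-article',
};

// Link rendering reads href directly; slug is never used to rebuild the path.
function toResultLink(item: IndexableContentSketch): { url: string; label: string } {
  return { url: item.href, label: item.title };
}

console.log(toResultLink(entry));
```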
Helpers:
- stripMarkdownToPlainText, calculateReadingTime – @se-studio/search/indexing
- indexArticleLinksToIndexableContent – builds IIndexableContent[] from IArticleLink[] using one MarkdownExporter.fetchContent() call per article (see below). Pass markdownContext (config, siteConfig, urlCalculators, optional customConverters); contentContext is supplied per entry from each fetchContent result.
Contentful indexing hazards (includes truncation)
The Contentful REST API can silently truncate includes.Entry and includes.Asset when a single response is large. For example, batching many entries in one getEntries call with include, then walking includes to resolve rich text or linked entries, often yields incomplete trees. Symptoms include missing body text or featured images in the index, with no error from the API.
Recommended default: fetch each entry individually via MarkdownExporter.fetchContent() (same pattern as rebuildSearchIndex in this package: one call per link/slug). That uses the normal per-entry REST/converter path instead of a multi-entry batched query.
If you must batch requests, you need a strategy that does not depend on unbounded resolved includes in one response (e.g. lower depth, smaller batches, or follow-up fetches) — the package cannot fix API limits from TypeScript alone.
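
One way to keep per-entry fetching practical for larger sites is to cap how many fetches run at once. The sketch below is an assumption about how you might do that in your own code, not something the package provides: `mapWithConcurrency` is a hypothetical helper, and `fetchEntryStub` stands in for a per-entry call such as MarkdownExporter.fetchContent(), whose signature is not shown here.

```typescript
// Hypothetical helper: runs `fetchOne` for each item with at most `limit`
// calls in flight, preserving input order in the result.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  fetchOne: (item: T) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  // Each worker repeatedly claims the next unprocessed index.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await fetchOne(items[index]);
    }
  }
  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Stub that only simulates latency; replace with a real per-entry fetch.
async function fetchEntryStub(slug: string): Promise<string> {
  await new Promise((resolve) => setTimeout(resolve, 5));
  return `body of ${slug}`;
}

const bodiesPromise = mapWithConcurrency(['a', 'b', 'c'], 2, fetchEntryStub);
```

Each response then carries the includes for a single entry only, which keeps it well away from the size at which truncation was observed, at the cost of more requests.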
Architecture
- Single Upstash database, two indexes: published and preview
- Text extraction: Uses MarkdownConverter to deeply extract text from page components, then strips markdown formatting
- Content truncation: Body text is truncated to ~4,000 chars (Upstash limit)
- Batch upsert: Documents are upserted in batches of 100 (Upstash API limit)
- Webhook-driven: Incremental updates on Contentful publish/delete events
- Flags: Respects indexed and hidden fields on content entries
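
The truncation and batching rules above can be sketched as two small helpers. This is an illustration under stated assumptions, not the package's code: `truncateBody` and `toBatches` are hypothetical names, the 4,000 and 100 constants are the figures quoted in the list, and the package may cut text differently (for example at a word boundary).

```typescript
const MAX_BODY_CHARS = 4000; // approximate body limit quoted above
const BATCH_SIZE = 100; // upsert batch size quoted above

// Cuts body text to at most `maxChars` UTF-16 code units.
function truncateBody(text: string, maxChars: number = MAX_BODY_CHARS): string {
  return text.length <= maxChars ? text : text.slice(0, maxChars);
}

// Splits a list into consecutive batches of at most `size` items.
function toBatches<T>(items: readonly T[], size: number = BATCH_SIZE): T[][] {
  if (size < 1) throw new RangeError('size must be at least 1');
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

// 250 documents with oversized bodies become three upsert batches.
const sampleDocs = Array.from({ length: 250 }, (_, i) => ({
  id: `doc-${i}`,
  body: truncateBody('x'.repeat(5000)),
}));
const batches = toBatches(sampleDocs);

console.log(batches.map((b) => b.length)); // [ 100, 100, 50 ]
```

Note that slicing by code units can split a surrogate pair at the cut point; a production version may want to trim to the previous whole character or word.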
Subpath Exports
| Import | Purpose |
|--------|---------|
| @se-studio/search | Types only |
| @se-studio/search/client | createSearchClient() |
| @se-studio/search/indexing | rebuildSearchIndex(), buildSearchDocuments, indexArticleLinksToIndexableContent, stripMarkdownToPlainText, calculateReadingTime, … |
| @se-studio/search/webhook | createSearchWebhookHandler() |
| @se-studio/search/api | createSearchApiHandler(), createRebuildApiHandler() |
| @se-studio/search/hooks | useSearch() |
