@sanity/agent-schema

v0.3.0

Published

a month ago

Schema processing utilities for Sanity Agent

Schema utilities for Sanity Agent. Fetches Lexicon schemas, compresses them for LLM context windows, provides full JSON views for schema explorers, and resolves field definitions for validation and mutation agents.

Install

pnpm add @sanity/agent-schema @sanity/client groq-js

Getting started

The most common workflow: fetch a schema, optionally enrich it with dataset stats, and compress it into text for an LLM.

import {
  fetchSchemaByDescriptorId,
  fetchDatasetStats,
  compressedSchema,
} from "@sanity/agent-schema";

// 1. Fetch and parse the schema from Lexicon
//    The `lexiconClient` should target the Lexicon API host.
//    `descriptorId` identifies a specific schema version (typically from an application resource).
const types = await fetchSchemaByDescriptorId(lexiconClient, descriptorId);

// 2. Optionally fetch dataset stats (document/reference counts)
//    The `agentClient` should target the agent API host with `useProjectHostname: false`.
const statsResult = await fetchDatasetStats(agentClient, {
  organizationId,
  projectId,
  dataset,
});

// 3. Compress into agent-friendly text
const text = await compressedSchema(types, {
  datasetStats: statsResult.status === "ready" ? statsResult.stats : undefined,
  filter: { read: '_type in ["post", "author"]' },
});

The result is a compact text summary of your document types, fields, references, and counts — ready to drop into an LLM system prompt.

Usage

Compress a schema for LLM context

compressedSchema is the main entry point for generating LLM-ready text. It accepts parsed schema types and returns a compact summary.

import { compressedSchema } from "@sanity/agent-schema";

// Filter by GROQ expression
const byGroq = await compressedSchema(types, {
  filter: { read: '_type in ["post", "author"]' },
});

// Filter by explicit type list
const byTypeList = await compressedSchema(types, {
  filter: ["post", "author"],
});

// Combine filter, stats, and field display options
// (`datasetStats` comes from a ready fetchDatasetStats result)
const withOptions = await compressedSchema(types, {
  filter: { read: '_type == "post"', write: '_type == "draft"' },
  datasetStats,
  fields: { maxCount: 5, includeDescription: true },
});
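
When the list of type names comes from configuration or user input, the GROQ form of the filter can be assembled from an array. This is a minimal sketch: `typeFilter` is a hypothetical helper, not part of the package, and it assumes type names are plain strings that JSON-style quoting represents correctly in GROQ.

```typescript
// Hypothetical helper: builds a GROQ `_type in [...]` expression from type names.
// JSON.stringify wraps each name in double quotes and escapes special characters.
function typeFilter(typeNames: readonly string[]): string {
  const quoted = typeNames.map((typeName) => JSON.stringify(typeName));
  return `_type in [${quoted.join(", ")}]`;
}

// Produces the same expression used in the examples above.
const readFilter = { read: typeFilter(["post", "author"]) };
```

The resulting object could then be passed as the `filter` option. When no separate `write` expression is needed, passing the plain `string[]` form is simpler.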

Browse full schema JSON

fullSchema returns the complete JSON structure of a type: every field, nested object, validation rule, and option. It is intended for schema explorer tools.

import { fullSchema } from "@sanity/agent-schema";

// Get the entire type
fullSchema(types, { type: "post" });
// => { name: "post", type: "document", fields: [...] }

// Zoom into a specific field path
fullSchema(types, { type: "post", path: "content[].codeBlock" });
// => { name: "codeBlock", type: "object", fields: [...] }

When a type's output exceeds sizeThreshold (default 200KB), fullSchema returns a navigator object that lists child paths to drill into, rather than the full JSON.

Resolve a field at a path

resolveFieldSchema answers "what type is the value at this path?" — used by validators and mutation code.

import { resolveFieldSchema } from "@sanity/agent-schema";

resolveFieldSchema("content[0].text", postSchema, allTypes);
// => { name: "text", type: "string" }

resolveFieldSchema("tags", postSchema, allTypes);
// => { name: "tags", type: "array", of: [...] }

resolveFieldSchema("nonexistent.path", postSchema, allTypes);
// => null

For union arrays where the same path exists in multiple item types, use resolveFieldSchemaAllUnions to get all matches:

import { resolveFieldSchemaAllUnions } from "@sanity/agent-schema";

resolveFieldSchemaAllUnions("content", postSchema, allTypes);
// => [blockSchema, imageSchema, codeSchema]
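
A validator might use the resolved definitions to check a value's runtime type. The sketch below is an illustration only: `FieldDef`, `candidateTypes`, and `matchesAnyBranch` are hypothetical names, and the mapping from JavaScript values to schema type names is an assumption based on the primitive types listed in this README.

```typescript
// Local stand-in for the field definitions shown above; the package's
// ManifestSchemaType carries more properties than this.
type FieldDef = { name: string; type: string };

// Maps a runtime value to the schema type names it could satisfy.
function candidateTypes(value: unknown): string[] {
  if (typeof value === "string") return ["string", "text", "url", "date", "datetime"];
  if (typeof value === "number") return ["number"];
  if (typeof value === "boolean") return ["boolean"];
  if (Array.isArray(value)) return ["array"];
  if (value !== null && typeof value === "object") return ["object"];
  return [];
}

// Accepts a value when at least one resolved branch matches its runtime type.
// An empty branch list (the path did not resolve) is rejected.
function matchesAnyBranch(value: unknown, branches: readonly FieldDef[]): boolean {
  const candidates = candidateTypes(value);
  return branches.some((branch) => candidates.includes(branch.type));
}
```

With resolveFieldSchema, the single result can be wrapped in a one-element array (or an empty array for null). With resolveFieldSchemaAllUnions, the returned list can be passed directly.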

Process schema for custom tooling

processSchema is the core transformation step. It categorizes types, expands inline object references, computes bidirectional reference graphs, and identifies title fields. This is what compressedSchema uses internally.

import { processSchema } from "@sanity/agent-schema";

const { documentTypes, objectTypes } = processSchema(types, {
  filter: ["post", "author"],
});

// documentTypes: Map<string, ProcessedDocumentType>
// objectTypes: Map<string, ManifestSchemaType>

for (const [name, doc] of documentTypes) {
  console.log(doc.titleField); // "title"
  console.log(doc.outgoingReferences); // ["author", "category"]
  console.log(doc.incomingReferences); // ["comment"]
  console.log(doc.outgoingReferencePaths); // [{ type: "author", path: "author" }]
}

Use this when you need the processed data structures without the text formatting — for custom schema visualizations, reference analysis, or building infrastructure like document counters.
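
As one example of reference analysis, the sketch below finds document types that nothing else references. It runs on hand-built sample data rather than real processSchema output. `DocRefs` and `unreferencedTypes` are hypothetical names that mirror only the two reference properties shown above.

```typescript
// Minimal local shape mirroring the reference properties logged above.
type DocRefs = { outgoingReferences: string[]; incomingReferences: string[] };

// Returns the names of types with no incoming references, sorted alphabetically.
function unreferencedTypes(documentTypes: Map<string, DocRefs>): string[] {
  const roots: string[] = [];
  documentTypes.forEach((doc, typeName) => {
    if (doc.incomingReferences.length === 0) roots.push(typeName);
  });
  return roots.sort();
}

// Hand-built sample standing in for the documentTypes map.
const sample = new Map<string, DocRefs>([
  ["post", { outgoingReferences: ["author"], incomingReferences: ["comment"] }],
  ["author", { outgoingReferences: [], incomingReferences: ["post"] }],
  ["comment", { outgoingReferences: ["post"], incomingReferences: [] }],
]);

const roots = unreferencedTypes(sample); // ["comment"]
```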

Architecture

fetchSchemaByDescriptorId()                fetchDatasetStats()
         │                                          │
         ▼                                          ▼
  ManifestSchemaType[]                      DatasetStatsResult
         │                                          │
         ├──► compressedSchema()  ◄─────────────────┘
         │        ──► text summary of all types
         │
         ├──► fullSchema()
         │        ──► JSON subtree of one type (all fields expanded)
         │
         ├──► resolveFieldSchema()
         │        ──► single field definition at a path
         │
         ├──► resolveFieldSchemaAllUnions()
         │        ──► same, all union branches
         │
         └──► processSchema()
                  ──► categorize types, expand fields, compute refs

API reference

Fetch

`fetchSchemaByDescriptorId(client, descriptorId, options?)`

Fetches a schema from the Lexicon API and returns parsed ManifestSchemaType[].

The client should be a @sanity/client instance configured for the Lexicon API host. The descriptorId identifies a specific schema version — typically obtained from an application resource.

const types = await fetchSchemaByDescriptorId(client, descriptorId);

Caching: Pass { cache: true } for a built-in in-memory cache (5 min TTL), or provide a custom adapter:

// Built-in memory cache
const types = await fetchSchemaByDescriptorId(client, descriptorId, { cache: true });

// Custom cache (e.g. Redis)
const typesViaRedis = await fetchSchemaByDescriptorId(client, descriptorId, {
  cache: {
    get: (key) => redis.get(key),
    set: (key, response) => redis.set(key, response),
  },
});

| Option | Type | Default | Description |
| ------ | ---- | ------- | ----------- |
| cache | boolean \| SchemaCache | | Enable caching. true for in-memory, or a custom adapter |
| ttlMs | number | 300000 | Cache TTL in ms (only applies to built-in memory cache) |
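
Because ttlMs applies only to the built-in cache, a custom adapter has to handle expiry itself. Below is a minimal sketch of a Map-backed adapter with its own TTL. The `CacheAdapter` shape is an assumption inferred from the `get`/`set` example above; the package's actual SchemaCache type may differ, for instance by allowing async methods.

```typescript
// Assumed adapter shape, inferred from the get/set example; not the package's type.
type CacheAdapter<T> = {
  get: (id: string) => T | undefined;
  set: (id: string, response: T) => void;
};

// Builds a Map-backed adapter whose entries expire `ttlMs` after being set.
// `now` is injectable so expiry can be exercised without waiting.
function createTtlCache<T>(ttlMs: number, now: () => number = Date.now): CacheAdapter<T> {
  const entries = new Map<string, { value: T; expiresAt: number }>();
  return {
    get(id) {
      const entry = entries.get(id);
      if (!entry) return undefined;
      if (now() >= entry.expiresAt) {
        entries.delete(id); // drop stale entries lazily on read
        return undefined;
      }
      return entry.value;
    },
    set(id, response) {
      entries.set(id, { value: response, expiresAt: now() + ttlMs });
    },
  };
}
```

If the shape matches, an instance such as `createTtlCache(60_000)` could be passed as the `cache` option.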

`fetchDatasetStats(client, options)`

Fetches document and reference counts from the agent API. Returns { status: 'ready', stats } or { status: 'processing' } if stats are still being computed.

The client should be configured to point at the agent API host (e.g. api.sanity.io) with useProjectHostname: false.

const result = await fetchDatasetStats(client, {
  organizationId: "org-abc",
  projectId: "my-project",
  dataset: "production",
});

if (result.status === "ready") {
  result.stats.documentCount; // { post: 42, author: 7 }
  result.stats.referenceCount; // { post: { incoming: {...}, outgoing: {...} } }
}
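
Since stats may still be processing on the first call, a caller may want to retry a few times before falling back to a schema without counts. The following is a sketch under stated assumptions: `StatsResult` is a local stand-in for the result union described above, and `waitForStats`, its attempt limit, and its delay are hypothetical choices, not package behavior.

```typescript
// Local stand-in for the { status: 'ready', stats } | { status: 'processing' } union.
type StatsResult<S> = { status: "ready"; stats: S } | { status: "processing" };

// Calls `fetchStats` until it reports "ready" or `maxAttempts` is reached.
// `sleep` is injected so callers pick the timer and tests can skip the wait.
async function waitForStats<S>(
  fetchStats: () => Promise<StatsResult<S>>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 5,
  delayMs = 2000,
): Promise<S | undefined> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await fetchStats();
    if (result.status === "ready") return result.stats;
    if (attempt < maxAttempts) await sleep(delayMs);
  }
  return undefined; // still processing; the caller can continue without stats
}
```

A caller might pass `() => fetchDatasetStats(client, options)` as `fetchStats` and a `setTimeout`-based promise as `sleep`, then hand the result (or `undefined`) to compressedSchema as `datasetStats`.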

Compress

`compressedSchema(types, options?)`

Generates compressed text for agent consumption. Processes the schema, applies filters, and formats the output.

| Option | Type | Description |
| ------ | ---- | ----------- |
| filter | SchemaFilter | GROQ filter { read?, write? } or string[] |
| datasetStats | DatasetStats | Document/reference counts to enrich output |
| fields | SchemaFieldDisplayOptions | Control field display (see below) |
| includeEmpty | boolean | When true, keep document types whose documentCount is explicitly 0. Defaults to false, which drops them. |
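
The includeEmpty rule turns on the word "explicitly": a count of 0 differs from a missing count. The sketch below illustrates one reading of that rule and is not the package's code. `keepType` is a hypothetical helper, and keeping types that are absent from the stats is an assumption drawn from the description above.

```typescript
// Hypothetical illustration of the includeEmpty rule.
// A type is dropped only when the stats report a count of exactly 0.
// Types missing from `documentCount` (count unknown) are kept.
function keepType(
  typeName: string,
  documentCount: Record<string, number> | undefined,
  includeEmpty = false,
): boolean {
  if (includeEmpty || !documentCount) return true;
  return documentCount[typeName] !== 0;
}
```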

Field display options:

| Option | Type | Default | Description |
| ------ | ---- | ------- | ----------- |
| showAll | boolean | false | Show all fields (ignores maxCount) |
| maxCount | number | 10 | Max fields to display per type |
| includeDescription | boolean | false | Append field titles that differ from the field name |
| includeHidden | boolean | false | Include hidden fields |
| includeDeprecated | boolean | false | Include deprecated fields |
| maxEnumValues | number | 10 | Max enum values before truncating |
| maxDescriptionLength | number | 50 | Truncate field descriptions beyond this length |

Full schema

`fullSchema(types, options)`

Returns the complete JSON structure for a schema type, optionally scoped to a field path.

Without path, you get the entire type. With path, you get the subtree rooted at that field. This is useful because some types are huge — path lets you zoom in without fetching the whole thing.

When the output exceeds sizeThreshold, a navigator object is returned instead with child paths to drill into:

// Navigator shape (returned when output is too large)
{
  _rawLargeNode: true,
  name: "post",
  type: "document",
  children: [
    { type: "array", path: "content[]" },
    { type: "object", path: "metadata" },
  ],
  hint: "This node is large. Fetch a specific child path to drill in."
}

| Option | Type | Default | Description |
| ------ | ---- | ------- | ----------- |
| type | string | required | Document type name to return |
| path | string | | Field path; returns the subtree from this point down |
| sizeThreshold | number | 200KB | Byte limit before returning a navigator instead |
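
A caller has to tell a navigator apart from a normal subtree and then request the child paths. The sketch below shows one way to do that one level deep. `NavigatorNode` follows the navigator shape shown above; `SchemaNode`, `isNavigator`, and `drillIn` are hypothetical names. It also assumes each `children[].path` value can be passed back as the `path` option.

```typescript
// Local stand-ins for the two result shapes; not the package's exported types.
type NavigatorNode = {
  _rawLargeNode: true;
  name: string;
  type: string;
  children: { type: string; path: string }[];
  hint: string;
};
type SchemaNode = { name: string; type: string; fields?: unknown[] };

// Type guard: navigator objects are marked with `_rawLargeNode: true`.
function isNavigator(node: SchemaNode | NavigatorNode): node is NavigatorNode {
  return (node as { _rawLargeNode?: unknown })._rawLargeNode === true;
}

// Fetches the root; if it is a navigator, fetches each listed child path once.
// `fetchNode` wraps whatever call returns a node for a path, for example
// (path) => fullSchema(types, { type: "post", path }).
function drillIn(
  fetchNode: (path?: string) => SchemaNode | NavigatorNode,
): Map<string, SchemaNode | NavigatorNode> {
  const results = new Map<string, SchemaNode | NavigatorNode>();
  const root = fetchNode();
  if (!isNavigator(root)) {
    results.set("", root); // small enough: the whole type came back
    return results;
  }
  for (const child of root.children) {
    results.set(child.path, fetchNode(child.path));
  }
  return results;
}
```

A child may itself come back as a navigator. This sketch stores it as-is rather than recursing, so the caller decides how deep to go.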

Process

`processSchema(types, options?)`

Processes raw ManifestSchemaType[] into categorized, enriched data structures. This is the core transformation that compressedSchema uses internally.

Returns a SchemaTypes object:

type SchemaTypes = {
  documentTypes: Map<string, ProcessedDocumentType>;
  objectTypes: Map<string, ManifestSchemaType>;
};

Each ProcessedDocumentType contains:

| Property | Type | Description |
| -------- | ---- | ----------- |
| name | string | Document type name |
| title | string? | Human-readable title |
| description | string? | Type description |
| deprecated | object? | Deprecation info with reason |
| titleField | string | Best candidate for display title |
| fields | ManifestField[] | Expanded field definitions |
| outgoingReferences | string[] | Types this document references |
| incomingReferences | string[] | Types that reference this document |
| outgoingReferencePaths | ReferenceInfo[]? | Outgoing refs with field paths |
| incomingReferencePaths | ReferenceInfo[]? | Incoming refs with field paths |

Options:

| Option | Type | Description |
| ------ | ---- | ----------- |
| filter | SchemaFilter | GROQ filter or string[] of type names |
| fields | SchemaFieldDisplayOptions | Field display options |

Navigate

`resolveFieldSchema(path, typeSchema, schema)`

Returns the single field definition at the end of a path. Handles nested objects, array indices, built-in type fields (slug.current, block.children), and type references. Returns null if the path doesn't resolve.

`resolveFieldSchemaAllUnions(path, typeSchema, schema)`

Same as resolveFieldSchema, but when the path crosses a union array, returns the field definition from every item type instead of just the first match.

`filterToPath(type, path, schema)`

Returns a schema type narrowed to a specific field path — the same subtree extraction that fullSchema uses internally.

import { filterToPath } from "@sanity/agent-schema";

const filtered = filterToPath(postType, "content[].codeBlock", allTypes);

`toSchemaPath(path)`

Converts a document path to a schema path by stripping array indices and key selectors.

import { toSchemaPath } from "@sanity/agent-schema";

toSchemaPath('content[_key=="abc"].children[0].text');
// "content.children.text"
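
To make the conversion concrete, here is an illustrative re-implementation. It is not the package's source: `stripSelectors` is a hypothetical name, and the regular expression assumes selectors never contain a closing bracket inside a key value.

```typescript
// Illustrative only: removes every non-empty bracketed selector from a path,
// covering numeric indices ([0]) and key selectors ([_key=="abc"]).
function stripSelectors(documentPath: string): string {
  return documentPath.replace(/\[[^\]]+\]/g, "");
}

const schemaPath = stripSelectors('content[_key=="abc"].children[0].text');
// "content.children.text"
```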

Type helpers

`isPrimitiveType(type, options?)`

Returns true for primitive types (string, number, boolean, text, url, date, datetime). Pass { includeBlockTypes: true } to also include block and span.

`isBuiltInType(type)`

Returns true for Sanity built-in types (image, file, slug, geopoint, reference, block, span).

`isGenericObject(node)`

Returns true for object-like schema nodes without a fields array.

`shouldSkipType(typeName)`

Returns true for internal types that should be excluded from processing (types matching system.*, sanity.*, assist.*, media.*, mux.*).
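
The prefix check described above can be pictured as follows. This is a sketch, not the package's implementation: `SKIP_PREFIXES` and `looksInternal` are hypothetical names, and the prefix list is copied from the description rather than read from the package's exported constants.

```typescript
// Prefixes taken from the description above; the package's own list may differ.
const SKIP_PREFIXES = ["system.", "sanity.", "assist.", "media.", "mux."];

// True when a type name starts with one of the internal prefixes.
function looksInternal(typeName: string): boolean {
  return SKIP_PREFIXES.some((prefix) => typeName.startsWith(prefix));
}

// Example: keep only user-defined type names.
const userTypes = ["post", "sanity.imageAsset", "author", "mux.videoAsset"].filter(
  (typeName) => !looksInternal(typeName),
);
```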

Constants

`BUILT_IN_TYPE_FIELDS`

Maps built-in type names to their implicit field definitions:

import { BUILT_IN_TYPE_FIELDS } from "@sanity/agent-schema";

BUILT_IN_TYPE_FIELDS.slug.current; // { name: "current", type: "string" }
BUILT_IN_TYPE_FIELDS.image.asset; // { name: "asset", type: "reference" }
BUILT_IN_TYPE_FIELDS.block.children; // { name: "children", type: "array", of: [...] }

Supported types: slug, geopoint, reference, image, file, block, span.

`ARRAY_ITEM_SYSTEM_FIELDS`

System fields present on all array items:

import { ARRAY_ITEM_SYSTEM_FIELDS } from "@sanity/agent-schema";

ARRAY_ITEM_SYSTEM_FIELDS._key; // { name: "_key", type: "string" }
ARRAY_ITEM_SYSTEM_FIELDS._type; // { name: "_type", type: "string" }

`DEFAULT_LINK_ANNOTATION`

The default link annotation Sanity adds to Portable Text blocks when no marks.annotations are declared.

`SCHEMA_CONSTANTS`

Internal constants used by the schema processor (skip patterns, title field candidates, default limits). Exported for consumers that need to extend the defaults.

Types

| Type | Description |
| ---- | ----------- |
| ManifestSchemaType | A parsed schema type (document, object, array, etc.) |
| ManifestField | A field definition (ManifestSchemaType with optional fieldset) |
| DatasetStats | { documentCount?, referenceCount? } |
| DatasetStatsSchema | Zod schema for runtime validation of DatasetStats |
| CompressedSchemaOptions | Options for compressedSchema() |
| SchemaFilter | { read?, write? } GROQ filter or string[] of type names |
| SchemaFieldDisplayOptions | Field display options for compressedSchema |
| FullSchemaOptions | Options for fullSchema() |
| SchemaCache | Cache adapter for fetchSchemaByDescriptorId |
| FetchSchemaOptions | Options for fetchSchemaByDescriptorId |
| ProcessedDocumentType | A document type after processing (fields expanded, refs computed) |
| SchemaTypes | Return type of processSchema (documentTypes + objectTypes Maps) |
| SchemaProcessorOptions | Options for processSchema() |

Deprecated aliases

navigateToFieldSchema and navigateToFieldSchemaAllUnions are deprecated aliases for resolveFieldSchema and resolveFieldSchemaAllUnions. They will be removed in the next minor version.
