canonize

v0.1.2

Published

2 months ago

Aggressive type coercion for Zod schemas. Wrap any Zod schema with intelligent coercion that handles edge cases, maintains type safety, and lets validation focus on business logic.

stevekinney

zod schema validation coercion type-coercion transform parse typescript standard-schema

canonize

Aggressive type coercion for Zod schemas.

Canonize exists for the messy middle ground between "invalid" and "usable." When working with LLM tool calls, you might get a parameters object that does not match your schema exactly, but it is close enough to work. Canonize takes any Zod schema and returns a version that tries its hardest to coerce incoming input into the shape you expect.

The Problem

You've defined a beautiful Zod schema:

const userSchema = z.object({
  age: z.number(),
  active: z.boolean(),
  tags: z.array(z.string()),
});

Then reality hits. Your API receives:

{ age: "30", active: "yes", tags: "admin,user" }

Zod's built-in z.coerce helps with simple cases, but it won't parse "yes" as true, split "admin,user" into an array, or handle the dozen other formats your data might arrive in.

You're left writing preprocessing logic, custom transforms, or wrapper functions for every schema. The business logic gets buried under input normalization.

The Solution

Wrap your schema with canonize() and move on:

import { canonize } from 'canonize';
import { z } from 'zod';

const userSchema = canonize(
  z.object({
    age: z.number(),
    active: z.boolean(),
    tags: z.array(z.string()),
  }),
);

// All of these now work:
userSchema.parse({ age: '30', active: 'yes', tags: 'admin,user' });
userSchema.parse({ age: 30.5, active: 1, tags: ['admin'] });
userSchema.parse({ age: '30px', active: 'enabled', tags: '["admin"]' });

Canonize handles the messy real-world inputs so your schema can focus on validation:

"30", "30px", "30.5" → 30 (number)
"yes", "true", "on", "1", 1 → true (boolean)
"admin,user", '["admin","user"]' → ["admin", "user"] (array)
"2024-01-15", 1705276800000, "now" → Date object
Nested objects, unions, discriminated unions, intersections—all coerced recursively

When to Use Canonize

API endpoints receiving form data, query strings, or JSON from unknown clients
Configuration files where users write enabled: yes instead of enabled: true
LLM tool calls where the model outputs "42" instead of 42
Legacy system integration with inconsistent data formats
CSV/spreadsheet imports where everything is a string

Installation

npm install canonize zod
# or
bun add canonize zod
# or
pnpm add canonize zod

Quick Start

import { canonize } from 'canonize';
import { z } from 'zod';

// Wrap any Zod schema
const schema = canonize(
  z.object({
    name: z.string(),
    age: z.number(),
    active: z.boolean(),
    tags: z.array(z.string()),
  }),
);

// Coercion happens automatically
schema.parse({ name: 123, age: '30', active: 'yes', tags: 'a,b,c' });
// { name: '123', age: 30, active: true, tags: ['a', 'b', 'c'] }

Armorer Integration

Canonize works cleanly with Armorer tool schemas. Wrap the schema (or raw shape) with canonize so LLM arguments are coerced before execution.

import { createTool } from 'armorer';
import { canonize } from 'canonize';
import { z } from 'zod';

const addNumbers = createTool({
  name: 'add-numbers',
  description: 'Add two numbers together',
  schema: canonize({
    a: z.number(),
    b: z.number(),
  }),
  async execute({ a, b }) {
    return a + b;
  },
});

API Reference

Core Function

`canonize<T>(schema: T): T`

Wraps a Zod schema with aggressive type coercion. Returns the same schema type for full TypeScript inference.

import { canonize } from 'canonize';
import { z } from 'zod';

const numberSchema = canonize(z.number());
numberSchema.parse('42'); // 42
numberSchema.parse('42px'); // 42
numberSchema.parse('1,234'); // 1234

const boolSchema = canonize(z.boolean());
boolSchema.parse('yes'); // true
boolSchema.parse('0'); // false
boolSchema.parse('enabled'); // true

const arraySchema = canonize(z.array(z.number()));
arraySchema.parse('1,2,3'); // [1, 2, 3]
arraySchema.parse('[1,2,3]'); // [1, 2, 3]
arraySchema.parse(42); // [42]

Supported Zod types:

Primitives: string, number, boolean, bigint, date, null, nan
Collections: array, object, tuple, record, map, set
Composites: union, discriminatedUnion, intersection
Special: enum, literal, any, unknown, custom
Wrappers: optional, nullable, default, catch, readonly, lazy

Diagnostics

`safeParseWithReport(schema, input)`

Coerces inputs and returns a report of what changed alongside the parse result.

import { safeParseWithReport } from 'canonize';
import { z } from 'zod';

const schema = z.object({ count: z.number(), enabled: z.boolean() });
const result = safeParseWithReport(schema, { count: '42', enabled: 'yes' });

if (result.success) {
  console.log(result.data);
}
console.log(result.report.warnings);

`coerceWithReport(schema, input)`

Returns the coerced value plus warnings without running validation.

`createRepairHints(error, options?)`

Generates compact, LLM-friendly suggestions from a ZodError.

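import { createRepairHints } from 'canonize';

// `result` comes from a safeParseWithReport call, as in the example above
if (!result.success) {
  const hints = createRepairHints(result.error);
}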

Type Detection Utilities

`getZodTypeName(schema: ZodTypeAny): string`

Returns the Zod type name for a schema. Useful for building custom coercion logic.

import { getZodTypeName } from 'canonize';
import { z } from 'zod';

getZodTypeName(z.string()); // 'string'
getZodTypeName(z.array(z.number())); // 'array'
getZodTypeName(z.object({})); // 'object'
getZodTypeName(z.string().optional()); // 'optional'

`unwrapSchema(schema: ZodTypeAny): ZodTypeAny`

Removes wrapper types (optional, nullable, default, catch, readonly) to get the inner schema.

import { unwrapSchema, getZodTypeName } from 'canonize';
import { z } from 'zod';

const wrapped = z.string().optional().nullable().default('hello');
const inner = unwrapSchema(wrapped);
getZodTypeName(inner); // 'string'

Circular Reference Tracking

`CircularTracker`

A WeakSet-based tracker for detecting circular references during coercion. Prevents infinite loops when processing self-referential data structures.

import { CircularTracker } from 'canonize';

const tracker = new CircularTracker();
const obj: { self: unknown } = { self: null };
obj.self = obj; // circular reference

tracker.has(obj); // false
tracker.add(obj);
tracker.has(obj); // true
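
To make the WeakSet idea concrete, here is a minimal sketch of cycle detection during a recursive walk. It is not canonize's implementation: `containsCycle` is a hypothetical helper, and it uses a plain `WeakSet` rather than `CircularTracker`. It also assumes that objects are removed from the set when the walk leaves them, so a value that is merely shared in two places is not reported as circular.

```typescript
// Hypothetical helper for illustration; canonize's internal traversal may differ.
function containsCycle(value: unknown, path = new WeakSet<object>()): boolean {
  // Primitives and null cannot take part in a cycle.
  if (typeof value !== 'object' || value === null) return false;

  // The object is already on the current path, so we have looped back to it.
  if (path.has(value)) return true;

  path.add(value);
  const found = Object.values(value).some((child) => containsCycle(child, path));
  // Leaving this branch: shared (non-circular) references remain legal.
  path.delete(value);
  return found;
}

const node: { self: unknown } = { self: null };
node.self = node;
containsCycle(node); // true

const shared = { n: 1 };
containsCycle({ a: shared, b: shared }); // false
```

A `WeakSet` fits this job because it holds objects by identity and does not keep them alive after the walk finishes.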

Schema Creation Helpers

`createCanonizePrimitive(primitive: CanonizePrimitive): ZodTypeAny`

Creates a coerced Zod schema for a primitive type.

import { createCanonizePrimitive } from 'canonize';

const stringSchema = createCanonizePrimitive('string');
const numberSchema = createCanonizePrimitive('number');
const booleanSchema = createCanonizePrimitive('boolean');
const nullSchema = createCanonizePrimitive('null');

Supported primitives: 'string' | 'number' | 'boolean' | 'null'

`createCanonizeSchema<T>(schema: T): ZodObject`

Creates a Zod object schema from a record of primitive type names.

import { createCanonizeSchema } from 'canonize';

const schema = createCanonizeSchema({
  name: 'string',
  age: 'number',
  active: 'boolean',
});

schema.parse({ name: 123, age: '30', active: 'yes' });
// { name: '123', age: 30, active: true }

Constants

`ZodType`

Object containing Zod type name constants for use in type detection.

import { ZodType } from 'canonize';

ZodType.STRING; // 'string'
ZodType.NUMBER; // 'number'
ZodType.ARRAY; // 'array'
ZodType.OBJECT; // 'object'
ZodType.UNION; // 'union'
// ... and more

Available constants:

| Category    | Constants                                                   |
| ----------- | ----------------------------------------------------------- |
| Primitives  | STRING, NUMBER, BOOLEAN, DATE, BIGINT, NULL, UNDEFINED, NAN |
| Collections | ARRAY, OBJECT, TUPLE, RECORD, MAP, SET                      |
| Composites  | UNION, DISCRIMINATED_UNION, INTERSECTION                    |
| Enums       | ENUM, NATIVE_ENUM, LITERAL                                  |
| Wrappers    | OPTIONAL, NULLABLE, DEFAULT, CATCH, LAZY, READONLY, BRANDED |
| Special     | ANY, UNKNOWN, NEVER, CUSTOM                                 |

Types

`CanonizeSchema<T>`

Type alias representing a canonized schema. Preserves the original schema's type information.

import type { CanonizeSchema } from 'canonize';
import { z } from 'zod';

type MySchema = CanonizeSchema<z.ZodObject<{ name: z.ZodString }>>;

`CanonizePrimitive`

Union type for primitive type names accepted by createCanonizePrimitive.

import type { CanonizePrimitive } from 'canonize';

const primitive: CanonizePrimitive = 'string'; // 'string' | 'number' | 'boolean' | 'null'

Coercion Rules

String

| Input             | Output       |
| ----------------- | ------------ |
| "hello"           | "hello"      |
| 123               | "123"        |
| true              | "true"       |
| null, undefined   | ""           |
| [1, 2, 3]         | "1, 2, 3"    |
| { key: "value" }  | "key: value" |
| new Date()        | ISO string   |
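
The rows above can be read as a small decision function. The sketch below is only an illustration of the table, not canonize's source: `coerceToString` is a hypothetical name, and joining multiple object entries with `", "` is an assumption (the table only shows a single-key object).

```typescript
// Hypothetical helper mirroring the String table; not part of canonize's API.
function coerceToString(input: unknown): string {
  if (input === null || input === undefined) return '';
  if (typeof input === 'string') return input;
  // Note: toISOString() throws a RangeError for an invalid Date.
  if (input instanceof Date) return input.toISOString();
  if (Array.isArray(input)) return input.map((item) => coerceToString(item)).join(', ');
  if (typeof input === 'object') {
    // Assumption: entries are rendered as "key: value" and joined with ", ".
    return Object.entries(input as Record<string, unknown>)
      .map(([key, value]) => `${key}: ${coerceToString(value)}`)
      .join(', ');
  }
  // Numbers, booleans, bigints, and similar values use their default string form.
  return String(input);
}

coerceToString(123); // '123'
coerceToString([1, 2, 3]); // '1, 2, 3'
coerceToString({ key: 'value' }); // 'key: value'
```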

Number

| Input            | Output |
| ---------------- | ------ |
| "42"             | 42     |
| "42px", "42em"   | 42     |
| "1,234", "1_234" | 1234   |
| "1e5"            | 100000 |
| true / false     | 1 / 0  |
| [42]             | 42     |
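
One way to get the behavior in this table is to strip separators and then read the leading numeric portion of the string. The following is a rough sketch under that assumption; `coerceToNumber` is a hypothetical helper, and the regular expression is illustrative rather than taken from canonize.

```typescript
// Hypothetical helper illustrating the Number table; canonize's internals may differ.
function coerceToNumber(input: unknown): number | undefined {
  if (typeof input === 'number') return Number.isNaN(input) ? undefined : input;
  if (typeof input === 'boolean') return input ? 1 : 0;
  // A single-element array is unwrapped: [42] becomes 42.
  if (Array.isArray(input) && input.length === 1) return coerceToNumber(input[0]);
  if (typeof input !== 'string') return undefined;

  // Drop thousands separators: "1,234" and "1_234" both become "1234".
  const cleaned = input.trim().replace(/[,_]/g, '');
  // Read a leading number (sign, decimals, exponent) and ignore trailing units such as "px".
  const match = /^[+-]?(\d+\.?\d*|\.\d+)(e[+-]?\d+)?/i.exec(cleaned);
  return match ? Number(match[0]) : undefined;
}

coerceToNumber('42px'); // 42
coerceToNumber('1,234'); // 1234
coerceToNumber('1e5'); // 100000
coerceToNumber('not a number'); // undefined
```

Returning `undefined` for unrecognized input keeps the sketch honest about failure, which leaves the final verdict to schema validation.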

Boolean

| Input                                             | Output |
| ------------------------------------------------- | ------ |
| "true", "yes", "on", "y", "t", "enabled", "1"     | true   |
| "false", "no", "off", "n", "f", "disabled", "0"   | false  |
| 1, non-zero numbers                               | true   |
| 0                                                 | false  |
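
The table maps naturally onto two lookup sets. This is a sketch only: `coerceToBoolean` is a hypothetical helper, and trimming plus case-insensitive matching are assumptions about how the keywords are compared.

```typescript
// Keyword lists copied from the Boolean table above.
const TRUE_WORDS = new Set(['true', 'yes', 'on', 'y', 't', 'enabled', '1']);
const FALSE_WORDS = new Set(['false', 'no', 'off', 'n', 'f', 'disabled', '0']);

// Hypothetical helper; not part of canonize's public API.
function coerceToBoolean(input: unknown): boolean | undefined {
  if (typeof input === 'boolean') return input;
  if (typeof input === 'number') {
    // 0 is false, any other real number is true; NaN is left undecided.
    return Number.isNaN(input) ? undefined : input !== 0;
  }
  if (typeof input === 'string') {
    // Assumption: keywords are matched after trimming and lowercasing.
    const word = input.trim().toLowerCase();
    if (TRUE_WORDS.has(word)) return true;
    if (FALSE_WORDS.has(word)) return false;
  }
  // Unrecognized input: let validation decide.
  return undefined;
}

coerceToBoolean('yes'); // true
coerceToBoolean('disabled'); // false
coerceToBoolean(2); // true
coerceToBoolean('maybe'); // undefined
```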

Date

| Input               | Output             |
| ------------------- | ------------------ |
| ISO string          | new Date(string)   |
| Unix timestamp (ms) | new Date(number)   |
| "now"               | Current time       |
| "today"             | Start of today     |
| "yesterday"         | Start of yesterday |
| "tomorrow"          | Start of tomorrow  |
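
As an illustration of the keyword rows, the sketch below resolves relative dates against a reference time. Both function names are hypothetical, the injectable `now` parameter exists only to make the example deterministic, and "start of day" is assumed to mean local midnight (the table does not say whether local time or UTC is used).

```typescript
// Local midnight of the given date, shifted by a number of days.
function startOfDay(date: Date, offsetDays = 0): Date {
  return new Date(date.getFullYear(), date.getMonth(), date.getDate() + offsetDays);
}

// Hypothetical helper illustrating the Date table; canonize's internals may differ.
function coerceToDate(input: unknown, now: Date = new Date()): Date | undefined {
  if (input instanceof Date) return input;
  if (typeof input === 'number') return new Date(input); // Unix timestamp in milliseconds
  if (typeof input !== 'string') return undefined;

  switch (input.trim().toLowerCase()) {
    case 'now':
      return new Date(now.getTime());
    case 'today':
      return startOfDay(now);
    case 'yesterday':
      return startOfDay(now, -1);
    case 'tomorrow':
      return startOfDay(now, 1);
  }

  // Fall back to the built-in parser and reject "Invalid Date".
  const parsed = new Date(input);
  return Number.isNaN(parsed.getTime()) ? undefined : parsed;
}

coerceToDate(1705276800000)?.toISOString(); // '2024-01-15T00:00:00.000Z'
coerceToDate('2024-01-15')?.getTime(); // 1705276800000 (date-only ISO strings parse as UTC)
```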

Array

| Input            | Output            |
| ---------------- | ----------------- |
| "1,2,3"          | ["1", "2", "3"]   |
| "[1,2,3]" (JSON) | [1, 2, 3]         |
| null, ""         | []                |
| Set, Map         | Array from values |
| Single value     | [value]           |
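
A possible reading of these rows, in order of precedence, is sketched below. `coerceToArray` is a hypothetical helper; trying JSON before comma splitting is an assumption based on the `"[1,2,3]"` row, and element coercion (for example turning `"1"` into `1` for `z.array(z.number())`) would happen in a later step.

```typescript
// Hypothetical helper illustrating the Array table; canonize's internals may differ.
function coerceToArray(input: unknown): unknown[] {
  if (Array.isArray(input)) return input;
  if (input === null || input === '') return [];
  if (input instanceof Set) return Array.from(input);
  if (input instanceof Map) return Array.from(input.values());

  if (typeof input === 'string') {
    const text = input.trim();
    if (text.startsWith('[')) {
      try {
        const parsed: unknown = JSON.parse(text);
        if (Array.isArray(parsed)) return parsed;
      } catch {
        // Not valid JSON; fall through to comma splitting.
      }
    }
    // Comma-separated values stay strings at this stage.
    return text.split(',').map((item) => item.trim());
  }

  // Any other single value is wrapped.
  return [input];
}

coerceToArray('1,2,3'); // ['1', '2', '3']
coerceToArray('[1,2,3]'); // [1, 2, 3]
coerceToArray(42); // [42]
```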

Object

| Input           | Output               |
| --------------- | -------------------- |
| JSON string     | Parsed object        |
| Map             | Object.fromEntries() |
| null, undefined | {}                   |
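
The object rules can be sketched the same way. `coerceToObject` is a hypothetical helper, and returning `undefined` for input that cannot be turned into an object is an assumption made for this example.

```typescript
// Hypothetical helper illustrating the Object table; canonize's internals may differ.
function coerceToObject(input: unknown): Record<string, unknown> | undefined {
  if (input === null || input === undefined) return {};
  if (input instanceof Map) return Object.fromEntries(input);

  if (typeof input === 'string') {
    try {
      const parsed: unknown = JSON.parse(input);
      // Only plain JSON objects qualify, not arrays or primitives.
      if (typeof parsed === 'object' && parsed !== null && !Array.isArray(parsed)) {
        return parsed as Record<string, unknown>;
      }
    } catch {
      // Not valid JSON; fall through.
    }
    return undefined;
  }

  if (typeof input === 'object' && !Array.isArray(input)) {
    return input as Record<string, unknown>;
  }
  return undefined;
}

coerceToObject('{"a":1}'); // { a: 1 }
coerceToObject(new Map([['a', 1]])); // { a: 1 }
coerceToObject(null); // {}
```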

Union

Coercion tries options in order:

Exact primitive match (preserves numbers in string | number)
Object/record schemas for plain objects
Array schemas for arrays and CSV strings
Boolean schemas for boolean-like strings
First union member, then remaining members
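
The ordering above can be sketched as a function that picks which union member to attempt first. Everything here is hypothetical: `MemberKind` is a simplified stand-in for real Zod schemas, a string containing a comma is assumed to count as a CSV string, and the boolean keyword list is abbreviated.

```typescript
// Simplified stand-in for the members of a union schema.
type MemberKind = 'string' | 'number' | 'boolean' | 'array' | 'object';

const BOOLEAN_WORDS = new Set(['true', 'false', 'yes', 'no', 'on', 'off']);

// Hypothetical helper: returns the member to try first for a given input.
function pickUnionMember(
  input: unknown,
  members: readonly MemberKind[],
): MemberKind | undefined {
  const has = (kind: MemberKind): boolean => members.indexOf(kind) !== -1;

  // 1. Exact primitive match (keeps 42 a number in string | number).
  if (typeof input === 'string' && has('string')) return 'string';
  if (typeof input === 'number' && has('number')) return 'number';
  if (typeof input === 'boolean' && has('boolean')) return 'boolean';

  // 2. Object schemas for plain objects.
  const isPlainObject = typeof input === 'object' && input !== null && !Array.isArray(input);
  if (isPlainObject && has('object')) return 'object';

  // 3. Array schemas for arrays and CSV strings.
  const isCsv = typeof input === 'string' && input.includes(',');
  if ((Array.isArray(input) || isCsv) && has('array')) return 'array';

  // 4. Boolean schemas for boolean-like strings.
  const isBooleanWord = typeof input === 'string' && BOOLEAN_WORDS.has(input.trim().toLowerCase());
  if (isBooleanWord && has('boolean')) return 'boolean';

  // 5. Otherwise start with the first member; the caller then tries the rest.
  return members[0];
}

pickUnionMember(42, ['string', 'number']); // 'number'
pickUnionMember('a,b', ['number', 'array']); // 'array'
pickUnionMember('yes', ['number', 'boolean']); // 'boolean'
```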

Discriminated Union

Uses the discriminator field to select the variant, then coerces fields:

const schema = canonize(
  z.discriminatedUnion('type', [
    z.object({ type: z.literal('a'), value: z.number() }),
    z.object({ type: z.literal('b'), value: z.string() }),
  ]),
);

schema.parse({ type: 'a', value: '42' }); // { type: 'a', value: 42 }

Tool Parameter Helpers

The canonize/tool-parameters module provides schema builders for LLM tool definitions. These handle malformed AI outputs gracefully with sensible defaults.

import {
  boolean,
  number,
  string,
  selector,
  containerSelector,
  collection,
  numbers,
  choices,
  count,
  url,
  exportFormat,
  imageFormat,
  links,
  linkMetadataSchema,
  type LinkMetadata,
} from 'canonize/tool-parameters';

`boolean(defaultValue)`

const enabled = boolean(true);
enabled.parse('yes'); // true
enabled.parse('FALSE'); // false
enabled.parse(1); // true
enabled.parse(undefined); // true (default)

`number(defaultValue, options?)`

const limit = number(10, { min: 1, max: 10_000, int: true });
limit.parse('42px'); // 42
limit.parse('1,234'); // 1234
limit.parse(undefined); // 10 (default)

`string()`

const name = string();
name.parse('  hello  '); // 'hello' (trimmed)
name.parse(123); // '123'

`selector()`

CSS selector string, trimmed and validated non-empty.

const sel = selector();
sel.parse('  .class  '); // '.class'

`containerSelector()`

Container selector with intelligent coercion for common LLM mistakes:

const container = containerSelector();
container.parse('main'); // 'main'
container.parse('*'); // null (wildcard → entire document)
container.parse('null'); // null
container.parse('a'); // null (link selector → entire document)
container.parse('body a'); // 'body' (extracts container)
container.parse('all'); // null (natural language)

`collection(...defaultValues)`

String array with flexible separators (comma, semicolon, pipe, newline):

const tags = collection('default');
tags.parse('foo,bar'); // ['foo', 'bar']
tags.parse('foo;bar'); // ['foo', 'bar']
tags.parse('foo|bar'); // ['foo', 'bar']
tags.parse('foo\nbar'); // ['foo', 'bar']
tags.parse(undefined); // ['default']
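
A minimal sketch of the separator handling, assuming a single character class covers all four separators and that empty segments are dropped. `splitCollection` is a hypothetical name, not an export of `canonize/tool-parameters`.

```typescript
// Hypothetical helper illustrating flexible separators; not canonize's implementation.
function splitCollection(input: unknown, defaults: string[] = []): string[] {
  if (input === undefined || input === null) return defaults;
  if (Array.isArray(input)) return input.map((item) => String(item).trim());

  return String(input)
    .split(/[,;|\n]/) // comma, semicolon, pipe, or newline
    .map((item) => item.trim())
    .filter((item) => item.length > 0); // assumption: empty segments are discarded
}

splitCollection('foo;bar'); // ['foo', 'bar']
splitCollection('foo | bar'); // ['foo', 'bar']
splitCollection(undefined, ['default']); // ['default']
```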

`numbers(options?)`

Number array with flexible input handling:

const ids = numbers({ int: true, min: 0 });
ids.parse('1,2,3'); // [1, 2, 3]
ids.parse([1, '2', 3]); // [1, 2, 3]
ids.parse(undefined); // []

`choices(values, defaultValue?)`

Enum with fuzzy matching (case-insensitive, prefix, contains):

const sort = choices(['date', 'name', 'size'], 'date');
sort.parse('Date'); // 'date' (case-insensitive)
sort.parse('nam'); // 'name' (prefix match)
sort.parse('date_desc'); // 'date' (contains match)
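
The three matching strategies can be pictured as successive passes over the allowed values. This sketch assumes that order (exact, then prefix, then contains) and that "contains" means the input contains the choice; `matchChoice` is a hypothetical helper.

```typescript
// Hypothetical helper illustrating fuzzy enum matching; canonize's rules may differ.
function matchChoice<T extends string>(
  input: string,
  values: readonly T[],
  fallback?: T,
): T | undefined {
  const needle = input.trim().toLowerCase();
  if (needle === '') return fallback;
  const lowered = values.map((value) => value.toLowerCase());

  // 1. Case-insensitive exact match.
  let index = lowered.indexOf(needle);
  // 2. Prefix match: a choice starts with the input ("nam" matches "name").
  if (index === -1) index = lowered.findIndex((choice) => choice.startsWith(needle));
  // 3. Contains match: the input contains a choice ("date_desc" matches "date").
  if (index === -1) index = lowered.findIndex((choice) => needle.includes(choice));

  return index === -1 ? fallback : values[index];
}

const sortKeys = ['date', 'name', 'size'] as const;
matchChoice('Date', sortKeys, 'date'); // 'date'
matchChoice('nam', sortKeys, 'date'); // 'name'
matchChoice('date_desc', sortKeys, 'date'); // 'date'
```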

`count()`

Number for count/statistic values (defaults to 0):

const total = count();
total.parse('42'); // 42
total.parse(null); // 0

`url()`

URL string with cleanup (removes wrapping quotes, brackets):

const link = url();
link.parse('"https://example.com"'); // 'https://example.com'
link.parse('<https://example.com>'); // 'https://example.com'
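
Cleanup of this kind amounts to peeling matching wrapper characters off both ends. The sketch below assumes a particular set of wrapper pairs and repeats until none apply; `cleanUrl` is a hypothetical helper and performs no URL validation.

```typescript
// Hypothetical helper illustrating wrapper removal; not canonize's implementation.
function cleanUrl(input: string): string {
  // Assumed wrapper pairs: quotes, angle brackets, square brackets, parentheses.
  const wrappers: Array<[string, string]> = [
    ['"', '"'],
    ["'", "'"],
    ['<', '>'],
    ['[', ']'],
    ['(', ')'],
  ];

  let text = input.trim();
  let changed = true;
  // Repeat so nested wrappers such as "<...>" inside quotes are all removed.
  while (changed) {
    changed = false;
    for (const [open, close] of wrappers) {
      if (text.length >= 2 && text.startsWith(open) && text.endsWith(close)) {
        text = text.slice(1, -1).trim();
        changed = true;
      }
    }
  }
  return text;
}

cleanUrl('"https://example.com"'); // 'https://example.com'
cleanUrl('<https://example.com>'); // 'https://example.com'
```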

`exportFormat(options?)`

Export format enum (markdown, csv, json):

exportFormat(); // defaults to 'markdown'
exportFormat({ defaultValue: 'csv' }); // defaults to 'csv'
exportFormat({ includeJson: false }); // 'markdown' | 'csv' only

`imageFormat(defaultValue?)`

Image format enum (jpeg, png):

imageFormat(); // defaults to 'png'
imageFormat('jpeg'); // defaults to 'jpeg'

`links()` and `linkMetadataSchema`

Array of link metadata objects:

const linkList = links();
linkList.parse([{ title: 'Example', url: 'https://example.com' }]);

// Or use the schema directly
import { linkMetadataSchema, type LinkMetadata } from 'canonize/tool-parameters';

const link: LinkMetadata = {
  title: 'Example',
  url: 'https://example.com',
  source: 'html', // optional: 'html' | 'markdown' | 'element' | 'link'
  rel: 'noopener', // optional
  target: '_blank', // optional
  referrerPolicy: null, // optional
  text: 'Click here', // optional: raw link text
};

Advanced Usage

Lazy Schemas (Recursive Types)

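type Tree = { value: number; children?: Tree[] };

// The explicit annotation lets TypeScript type the self-referential schema
const TreeNode: z.ZodType<Tree> = canonize(
  z.lazy(() =>
    z.object({
      value: z.number(),
      children: z.array(TreeNode).optional(),
    }),
  ),
);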

TreeNode.parse({
  value: '1',
  children: [{ value: '2' }, { value: '3' }],
});

Intersection Types

const schema = canonize(
  z.intersection(z.object({ a: z.number() }), z.object({ b: z.string() })),
);

schema.parse({ a: '1', b: 2 }); // { a: 1, b: '2' }

Map and Set

const mapSchema = canonize(z.map(z.string(), z.number()));
mapSchema.parse([
  ['a', '1'],
  ['b', '2'],
]); // Map { 'a' => 1, 'b' => 2 }
mapSchema.parse({ a: '1', b: '2' }); // Map { 'a' => 1, 'b' => 2 }

const setSchema = canonize(z.set(z.number()));
setSchema.parse([1, '2', 3]); // Set { 1, 2, 3 }
setSchema.parse('1,2,3'); // Set { 1, 2, 3 }

Error Handling

Coercion errors are caught internally. When coercion fails, the original value passes through unchanged to Zod, which then validates it as usual:

const schema = canonize(z.number());

schema.parse('42'); // 42 (coercion succeeds)
schema.parse('not a number'); // throws ZodError (coercion fails, Zod validates original)
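
The pass-through behavior described here follows a common pattern: attempt the conversion, and hand back the untouched input if it throws. The sketch below shows that pattern in isolation; `tryCoerce` and `parseStrictNumber` are hypothetical names and do not appear in canonize.

```typescript
// Hypothetical wrapper: run a coercion, fall back to the original value on failure.
function tryCoerce(input: unknown, coerce: (value: unknown) => unknown): unknown {
  try {
    return coerce(input);
  } catch {
    // Coercion failed; validation will see (and can reject) the original value.
    return input;
  }
}

// Example coercion that throws instead of returning NaN.
function parseStrictNumber(value: unknown): number {
  const parsed = typeof value === 'string' && value.trim() !== '' ? Number(value) : NaN;
  if (Number.isNaN(parsed)) {
    throw new TypeError(`Cannot coerce ${String(value)} to a number`);
  }
  return parsed;
}

tryCoerce('42', parseStrictNumber); // 42
tryCoerce('not a number', parseStrictNumber); // 'not a number' (left for validation to reject)
```

Keeping the original value means the eventual validation error describes what the caller actually sent rather than a half-converted intermediate.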

Circular references throw immediately:

const obj: { self: unknown } = { self: null };
obj.self = obj;

const schema = canonize(z.object({ self: z.any() }));
schema.parse(obj); // throws Error: Circular reference detected

StandardSchema Compatibility

Canonize is fully compatible with StandardSchema, the interoperability spec implemented by Zod, Valibot, ArkType, and others.

Since Zod v4 implements StandardSchema, all canonized schemas have the ~standard property:

const schema = canonize(z.object({ count: z.number() }));

// Use with any StandardSchema-aware tool
const result = await schema['~standard'].validate({ count: '42' });
// { value: { count: 42 } }

Canonize is Zod-specific because intelligent coercion requires schema introspection (knowing field types). StandardSchema only provides a validate() function without type information.
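
To illustrate the point, here is a hand-written schema that exposes only the core of the `~standard` surface. The interface below is a simplified subset written for this example (the real spec also has optional fields such as issue paths and type metadata), and `positiveNumber` is hypothetical. Nothing on the object says which type it expects, so a generic wrapper has no way to know that `'5'` should become `5`.

```typescript
// Simplified subset of the Standard Schema v1 result and interface shapes.
type ValidationResult<T> =
  | { value: T; issues?: undefined }
  | { issues: ReadonlyArray<{ message: string }> };

interface MinimalStandardSchema<T> {
  readonly '~standard': {
    readonly version: 1;
    readonly vendor: string;
    readonly validate: (value: unknown) => ValidationResult<T> | Promise<ValidationResult<T>>;
  };
}

// A hypothetical schema: the only way to interact with it is to call validate().
const positiveNumber: MinimalStandardSchema<number> = {
  '~standard': {
    version: 1,
    vendor: 'example',
    validate: (value) =>
      typeof value === 'number' && value > 0
        ? { value }
        : { issues: [{ message: 'Expected a positive number' }] },
  },
};

positiveNumber['~standard'].validate(5); // { value: 5 }
positiveNumber['~standard'].validate('5'); // { issues: [...] } with no hint that a number was expected
```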

License

MIT
