@cdxgen/cdx-purl
v0.0.3
Published
Pure JavaScript package-url parser and builder
Downloads
212
Readme
cdx-purl
Strict, definition-driven package-url parser/builder for Node.js.
- Pure JavaScript
- No runtime dependencies
- ABNF-first strict parsing
- Type rules are generated from
specification/types/*-definition.jsonintogenerated/type-rules.js - Deterministic canonicalization and roundtrip behavior
Install
pnpm add @cdxgen/cdx-purlnpm install @cdxgen/cdx-purlbun add @cdxgen/cdx-purldeno add npm:@cdxgen/cdx-purlQuick start
import {
parse,
build,
roundTrip,
Purl,
NpmPurl,
MavenPurl,
TypedPurlBuilders,
TypedPurls,
} from "@cdxgen/cdx-purl";
const parsed = parse("pkg:npm/%40angular/[email protected]");
const built = build(parsed);
const canonical = roundTrip(built);
const npmPurl = NpmPurl.builder()
.setScope("@angular")
.setName("animation")
.setVersion("12.3.1")
.build()
.toString();
const MavenBuilder = TypedPurlBuilders.maven;
const mvn = new MavenBuilder()
.setNamespace("org.apache.commons")
.setName("io")
.buildString();
console.log(parsed, canonical, npmPurl, mvn, !!TypedPurls.swift);API
parse(input)parse and validate strict ABNF + type rulesbuild(parts)canonical build with strict validationroundTrip(input)parse then build canonical outputPurlimmutable strict value objectPurlBuilderfluent generic builderTypedPurlBuildersgenerated builder classes keyed by typeTypedPurlsgenerated typed classes keyed by typegetTypedPurlBuilder(type)andgetTypedPurlClass(type)lookup helpers
Runtime note: index.js imports generated TYPE_RULES_SOURCE from generated/type-rules.js; it does not read specification/types JSON files at execution time.
All named typed exports (for example NpmPurlBuilder, RpmPurl, VscodeExtensionPurlBuilder) are concrete generated classes, not placeholders.
Validation behavior (strict by design)
Unknown qualifier keys are rejected
import { build } from "@cdxgen/cdx-purl";
build({
type: "maven",
namespace: "org.apache.commons",
name: "io",
version: "1.3.4",
qualifiers: { mykey: "value" }, // throws E_UNKNOWN_QUALIFIER
subpath: null,
});Only explicitly allowed multivalue qualifiers are accepted
import { build } from "@cdxgen/cdx-purl";
// throws E_MULTIVALUE_QUALIFIER
build({
type: "maven",
namespace: "org.apache.commons",
name: "io",
version: "1.3.4",
qualifiers: { classifier: "sources,docs" },
subpath: null,
});
// allowed: checksum supports comma-delimited values
build({
type: "generic",
namespace: null,
name: "openssl",
version: "1.1.1w",
qualifiers: {
checksum:
"sha1:da39a3ee5e6b4b0d3255bfef95601890afd80709,sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
},
subpath: null,
});Checksum values are validated by algorithm
Checksum entries must use algorithm:digest form. The library validates each entry against an internal algorithm map and rejects digests that are missing an algorithm, non-hex, or wrong length for the selected algorithm.
import { parse } from "@cdxgen/cdx-purl";
// success: valid lengths for each algorithm
parse(
"pkg:generic/[email protected]?checksum=" +
"sha1:da39a3ee5e6b4b0d3255bfef95601890afd80709," +
"sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
);
// throws E_CHECKSUM_MISSING_ALGORITHM
parse(
"pkg:generic/[email protected]?checksum=da39a3ee5e6b4b0d3255bfef95601890afd80709",
);
// throws E_CHECKSUM_INVALID_DIGEST_FOR_ALGORITHM (wrong length)
parse("pkg:generic/[email protected]?checksum=sha256:abc123");
// throws E_CHECKSUM_INVALID_DIGEST_FOR_ALGORITHM (non-hex)
parse(
"pkg:generic/[email protected]?checksum=" +
"sha1:zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz",
);Type-specific required and prohibited fields are enforced
import { build } from "@cdxgen/cdx-purl";
// swift namespace is required and must contain host/owner
build({
type: "swift",
namespace: null, // throws E_REQUIRED_COMPONENT / E_SWIFT_NAMESPACE
name: "alamofire",
version: null,
qualifiers: null,
subpath: null,
});Subpath validation in builder and object flows
When you provide purl parts directly (build, Purl.from, typed builders), subpath must be relative.
Absolute-style values are rejected with E_INVALID_SUBPATH and an actionable message.
This includes leading / and Windows absolute forms like C:\\docs\\api or \\\\server\\share\\docs.
import { build, parse } from "@cdxgen/cdx-purl";
// throws E_INVALID_SUBPATH
build({
type: "generic",
namespace: null,
name: "openssl",
version: "1.1.1w",
qualifiers: null,
subpath: "/docs/api",
});
// accepted: relative subpath
build({
type: "generic",
namespace: null,
name: "openssl",
version: "1.1.1w",
qualifiers: null,
subpath: "docs/api",
});
// parser compatibility: raw purl subpaths are canonicalized
// `//docs///api/` -> `docs/api`
parse("pkg:generic/[email protected]#//docs///api/"); // => { type: "generic", namespace: null, name: "openssl", version: "1.1.1w", qualifiers: null, subpath: "docs/api" }Supported purl types
The library generates typed classes/builders for every type definition at generation time from specification/types/*-definition.json:
alpm, apk, bazel, bitbucket, bitnami, cargo, chrome-extension, cocoapods, composer, conan, conda, cpan, cran, deb, docker, gem, generic, github, golang, hackage, hex, huggingface, julia, luarocks, maven, mlflow, npm, nuget, oci, opam, otp, pub, pypi, qpkg, rpm, swid, swift, vscode-extension, yocto.
| Requirement area | Official spec source | What this library enforces | Primary error codes |
| ------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------- |
| Canonical type and qualifier key casing | specification/purl-proposed-grammar.abnf + type definitions | Type and qualifier keys canonicalized to lowercase; deterministic qualifier ordering | E_INVALID_TYPE, E_INVALID_QUALIFIER_KEY, E_DUPLICATE_QUALIFIER |
| ABNF strict parse rules | specification/purl-proposed-grammar.abnf | Scheme must be pkg; strict separator usage; strict percent-encoding acceptance and decoding | E_INVALID_SCHEME, E_INVALID_QUALIFIERS, E_BAD_PERCENT_ENCODING, E_DISALLOWED_PERCENT_ENCODING, E_INVALID_SUBPATH |
| Component requirements (namespace, name, version, subpath) | specification/types/*-definition.json (*_definition.requirement) | Enforces required/prohibited/optional semantics per type | E_REQUIRED_COMPONENT, E_PROHIBITED_COMPONENT, E_MISSING_NAME, E_MISSING_VERSION |
| Character constraints and normalization | specification/types/*-definition.json (permitted_characters, normalization_rules, case_sensitive) | Applies case and normalization rules before validation and canonical build | E_PERMITTED_CHARACTERS, E_INVALID_CHARACTER |
| Qualifier allow-list per type | specification/types/*-definition.json (qualifiers_definition) | Rejects unknown qualifier keys except global/spec-compat policy keys | E_UNKNOWN_QUALIFIER |
| Qualifier value cardinality | strict policy | Multi-value qualifiers rejected unless explicitly allowed (checksum) | E_MULTIVALUE_QUALIFIER, E_INVALID_QUALIFIER_VALUE, E_CHECKSUM_MISSING_ALGORITHM, E_CHECKSUM_INVALID_DIGEST_FOR_ALGORITHM |
| Required qualifiers | specification/types/*-definition.json (qualifiers_definition.requirement) | Missing required qualifier is rejected in parse and build flows | E_REQUIRED_QUALIFIER |
| Type-specific semantic rules | type specs and implementation policy | CPAN uppercase author namespace and no :: in distribution name; Swift host/owner namespace shape | E_CPAN_NAMESPACE, E_CPAN_NAME, E_SWIFT_NAMESPACE |
| Typed class safety | generated type registry from definitions | Type-locked parse for each typed class with mismatch detection | E_TYPE_MISMATCH |
Normalization opcode model
Normalization rule free text from specification/types/*-definition.json is compiled into opcode arrays in generated/type-rules.js.
index.js applies these opcodes in order via applyNormalizationRules(...) and rejects unknown opcodes with E_UNKNOWN_NORMALIZATION_OP.
Current opcode semantics:
to_lowercase: lowercase the full value.apply_kebab_case: collapse non-alphanumeric runs to-and trim outer dashes (preserves letter case).replace_underscore_with_dash: replace all_with-.replace_dot_with_underscore: replace all.with_.replace_non_alnum_with_underscore: lowercase then replace each non[a-z0-9]character with_.
Opcode precedence is resolved in the generator (scripts/generate-type-rules.mjs) so emitted arrays are deterministic across regeneration.
For conflicting rules, precedence is applied before runtime. Example for PyPI names:
replace_dot_with_underscorereplace_underscore_with_dash
This guarantees canonical idempotence for mixed forms like A_B-C.D~x -> a-b-c-d~x.
Known normalization limitation
alpm.versionincludes a normalization rule that referencesvercmp(8)semantics.- This library currently keeps
alpmsupport without applying that version normalization logic. - Parsing and validation still run for
alpm, but no additionalvercmp-based rewrite is applied to version values.
Test commands
pnpm test
pnpm test:base
pnpm test:typed
pnpm test:typed:all
pnpm test:mutation
pnpm test:fuzzDeterministic fuzz controls
PURL_FUZZ_SEED=1337
PURL_FUZZ_CASES=300
PURL_FUZZ_HOPS=10
PURL_FUZZ_MUTATION_CASES=200
PURL_TYPED_FUZZ_SEED=4242
PURL_TYPED_FUZZ_CASES=240
PURL_TYPED_FUZZ_HOPS=8
pnpm test:fuzzCoverage includes:
- base fixture compatibility tests
- typed class and builder coverage across all registered types
- adversarial strict-policy tests
- mutation-driven negative tests generated from type definitions
- deterministic fuzz and mutation fuzz with reproducible seeds
Support policy
- Runtime support: Node
>=20, Bun>=1.0.0, Deno>=1.40.0 - CI test matrix (Node):
20.x,22.x,24.x,25.x - CI runtime smoke coverage: Bun and Deno
- Build and publish path: Node
24.xwith npm trusted publishing + provenance
License
MIT License. See LICENSE for details.
