@semiont/content

v0.5.9

Working-tree storage for project resources and PDF text-layer extraction

Working-tree storage for project resources, with optional git staging, plus PDF text-layer extraction.

Installation

npm install @semiont/content

Architecture Context

Infrastructure Ownership: In production applications, the working tree store is created and managed by @semiont/make-meaning's startMakeMeaning() function, which serves as the single orchestration point for all infrastructure components. Backend code accesses it as knowledgeBase.content.

The quick start example below shows direct instantiation for testing, CLI tools, or content management scripts.

Quick Start

import { WorkingTreeStore, deriveStorageUri } from '@semiont/content';
import { SemiontProject } from '@semiont/core/node';

const project = new SemiontProject('/path/to/project');
const store = new WorkingTreeStore(project);

// Derive a stable file:// URI from a resource name
const uri = deriveStorageUri('My Document', 'text/markdown');
// => "file://my-document.md"

// Write content to the working tree (API/GUI/AI path)
const stored = await store.store(Buffer.from('# My Document\n'), uri);
console.log(stored.checksum);  // SHA-256 hex of the content
console.log(stored.byteSize);  // 14

// Register a file that is already on disk (CLI path)
const registered = await store.register('file://docs/overview.md');

// Read content back by URI
const content = await store.retrieve(uri);
console.log(content.toString()); // "# My Document\n"

// Move and remove files
await store.move(uri, 'file://docs/my-document.md');
await store.remove('file://docs/my-document.md');

Working Tree Storage

The working tree (project root) is the source of truth for file content. Resources are identified by their file:// URI, which is stable across content changes; moves are tracked by events.

my-project/                  ← project root
├── .semiont/                ← project config and event log
└── docs/
    └── overview.md          ← storageUri "file://docs/overview.md"
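As a rough illustration of how a project-relative file:// URI could map onto a path under the project root, here is a minimal sketch. The helper name resolveStorageUri and its escape check are assumptions made for this example; they are not part of @semiont/content, and the store's actual resolution logic may differ.

```typescript
import * as path from 'node:path';

// Hypothetical helper (not exported by @semiont/content): maps a
// project-relative file:// storage URI onto an absolute path under the root.
function resolveStorageUri(projectRoot: string, storageUri: string): string {
  const prefix = 'file://';
  if (!storageUri.startsWith(prefix)) {
    throw new Error(`Not a file:// URI: ${storageUri}`);
  }
  const relative = storageUri.slice(prefix.length);
  const root = path.resolve(projectRoot);
  const absolute = path.resolve(root, relative);
  // Reject URIs such as "file://../secrets" that would leave the project root.
  if (absolute !== root && !absolute.startsWith(root + path.sep)) {
    throw new Error(`URI escapes project root: ${storageUri}`);
  }
  return absolute;
}

// Resolves to docs/overview.md inside the given project root.
console.log(resolveStorageUri('/my-project', 'file://docs/overview.md'));
```

Keeping the URI relative to the project root is what lets it stay stable when the project directory itself is relocated.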

There are two write paths:

  • store(content, storageUri) — write bytes to disk. Used when the file does not yet exist and the caller provides content (API/GUI/AI path).
  • register(storageUri, expectedChecksum?) — read an existing file and record its metadata (CLI path). If expectedChecksum is provided and does not match, throws ChecksumMismatchError.

Both return the same metadata:

interface StoredResource {
  storageUri: string;    // file:// URI (e.g. "file://docs/overview.md")
  checksum: string;      // SHA-256 hex of content
  byteSize: number;      // Size in bytes
  created: string;       // ISO 8601 timestamp
}
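The checksum check on the register() path and the metadata shape above can be sketched with Node's built-in crypto module. Everything prefixed with Sketch, along with describeContent and sha256Hex, is a hypothetical stand-in written for this example; the package's own ChecksumMismatchError and internals may look different.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical stand-in for the package's ChecksumMismatchError.
class SketchChecksumMismatchError extends Error {
  readonly expected: string;
  readonly actual: string;
  constructor(expected: string, actual: string) {
    super(`Checksum mismatch: expected ${expected}, got ${actual}`);
    this.name = 'SketchChecksumMismatchError';
    this.expected = expected;
    this.actual = actual;
  }
}

// Mirrors the StoredResource fields documented above.
interface SketchStoredResource {
  storageUri: string;
  checksum: string;
  byteSize: number;
  created: string;
}

// SHA-256 of the content, rendered as lowercase hex.
function sha256Hex(content: string | Buffer): string {
  return createHash('sha256').update(content).digest('hex');
}

// Builds metadata for content, verifying an expected checksum when one is given.
function describeContent(
  storageUri: string,
  content: Buffer,
  expectedChecksum?: string,
): SketchStoredResource {
  const checksum = sha256Hex(content);
  if (expectedChecksum !== undefined && expectedChecksum !== checksum) {
    throw new SketchChecksumMismatchError(expectedChecksum, checksum);
  }
  return {
    storageUri,
    checksum,
    byteSize: content.byteLength,
    created: new Date().toISOString(),
  };
}

const meta = describeContent('file://docs/overview.md', Buffer.from('# My Document\n'));
console.log(meta.byteSize);        // 14
console.log(meta.checksum.length); // 64 (hex characters)
```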

Git Integration

When .semiont/config sets sync = true in its [git] section, the store keeps the git index up to date automatically:

  • store() / register() run git add
  • move() runs git mv
  • remove() runs git rm (or git rm --cached with keepFile: true)

Every method accepts { noGit: true } to skip staging for a single call. Without git sync, the store falls back to plain filesystem operations.
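The mapping from store operations to git commands described above can be sketched as a pure function. The StoreOperation and CallOptions types and the gitArgsFor helper are hypothetical names invented for this example, and the assumption that git receives the path with the file:// prefix stripped is mine; the store's real implementation is not shown in this README.

```typescript
// Hypothetical description of a store call, for illustration only.
type StoreOperation =
  | { kind: 'store' | 'register'; uri: string }
  | { kind: 'move'; from: string; to: string }
  | { kind: 'remove'; uri: string; keepFile?: boolean };

interface CallOptions {
  noGit?: boolean;
}

// Assumption: git is given the project-relative path without the file:// prefix.
function toRelativePath(uri: string): string {
  return uri.replace(/^file:\/\//, '');
}

// Returns the git argument list for an operation, or null when staging is
// skipped (git sync disabled, or noGit set for this call).
function gitArgsFor(
  op: StoreOperation,
  gitSync: boolean,
  options: CallOptions = {},
): string[] | null {
  if (!gitSync || options.noGit) {
    return null;
  }
  switch (op.kind) {
    case 'store':
    case 'register':
      return ['add', toRelativePath(op.uri)];
    case 'move':
      return ['mv', toRelativePath(op.from), toRelativePath(op.to)];
    case 'remove':
      return op.keepFile
        ? ['rm', '--cached', toRelativePath(op.uri)]
        : ['rm', toRelativePath(op.uri)];
  }
}

console.log(gitArgsFor({ kind: 'remove', uri: 'file://docs/overview.md', keepFile: true }, true));
// [ 'rm', '--cached', 'docs/overview.md' ]
console.log(gitArgsFor({ kind: 'store', uri: 'file://docs/overview.md' }, true, { noGit: true }));
// null
```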

PDF Text-Layer Extraction

For native (non-scanned) PDFs, extractPdfTextLayer() extracts positioned text using pdfjs-dist. It returns null for scanned/image-only PDFs.

import { extractPdfTextLayer, locate } from '@semiont/content';

// pdfBytes holds the raw bytes of a PDF file, loaded elsewhere
const layer = await extractPdfTextLayer(pdfBytes);
if (layer) {
  console.log(layer.text);          // Full extracted text
  console.log(layer.pages.length);  // Number of pages; entries hold page dimensions in PDF points

  // Find bounding rectangles for a span of the text (one per line)
  const rects = locate(layer, 120, 178);
  // => PdfCoordinate[] in PDF point space (origin: bottom-left)
}

Coordinates are in PDF point space, originating from the bottom-left of the page. The Y-flip to canvas pixels happens downstream in the browser; the server has no canvas. The PdfCoordinate geometry type lives in @semiont/core alongside the viewrect FragmentSelector codec.
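The downstream Y-flip can be illustrated with a small conversion function. The SketchPdfRect shape (x, y, width, height, with y as the bottom edge) is an assumption made for this example; the real PdfCoordinate type in @semiont/core may use different fields, and pdfRectToCanvas is a hypothetical helper rather than part of either package.

```typescript
// Hypothetical rectangle in PDF point space: origin at the bottom-left of the
// page, y is the rectangle's bottom edge.
interface SketchPdfRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Rectangle in canvas pixels: origin at the top-left, y grows downward.
interface CanvasRect {
  left: number;
  top: number;
  width: number;
  height: number;
}

function pdfRectToCanvas(
  rect: SketchPdfRect,
  pageHeightPts: number,
  scale: number,
): CanvasRect {
  return {
    left: rect.x * scale,
    // Measure the rectangle's top edge from the top of the page.
    top: (pageHeightPts - (rect.y + rect.height)) * scale,
    width: rect.width * scale,
    height: rect.height * scale,
  };
}

// A US Letter page is 792 points tall; render at 2 pixels per point.
const canvasRect = pdfRectToCanvas({ x: 72, y: 700, width: 200, height: 12 }, 792, 2);
console.log(canvasRect); // { left: 144, top: 160, width: 400, height: 24 }
```

Doing the flip only at render time keeps stored coordinates independent of zoom level and device pixel ratio.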

Utilities

import {
  calculateChecksum,       // SHA-256 hex of a string or Buffer
  verifyChecksum,          // Compare content against an expected checksum
  deriveStorageUri,        // ("My Doc", "text/markdown") → "file://my-doc.md"
} from '@semiont/content';

deriveStorageUri takes a SupportedMediaType; the media-type registry — which types are admitted, their extensions, and their capabilities — lives in @semiont/core's media-types.ts. See docs/mime-types.md.
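To make the name-to-URI mapping concrete, the following sketch reproduces the two documented examples ("My Doc" and "My Document"). The function sketchDeriveStorageUri, its slug rules, and the small extension table are assumptions for illustration; the real deriveStorageUri draws its extensions from the @semiont/core registry and may normalize names differently.

```typescript
// Illustrative extension table only; the real registry lives in @semiont/core.
const SKETCH_EXTENSIONS: Record<string, string> = {
  'text/markdown': 'md',
  'text/plain': 'txt',
  'application/pdf': 'pdf',
};

// Hypothetical re-creation of the documented behavior, not the library's code.
function sketchDeriveStorageUri(name: string, mediaType: string): string {
  const extension = SKETCH_EXTENSIONS[mediaType];
  if (extension === undefined) {
    throw new Error(`Unsupported media type: ${mediaType}`);
  }
  const slug = name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse runs of other characters into one hyphen
    .replace(/^-+|-+$/g, '');    // trim leading and trailing hyphens
  return `file://${slug}.${extension}`;
}

console.log(sketchDeriveStorageUri('My Doc', 'text/markdown'));      // file://my-doc.md
console.log(sketchDeriveStorageUri('My Document', 'text/markdown')); // file://my-document.md
```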

Development

# Install dependencies
npm install

# Build package
npm run build

# Run tests
npm test

# Type checking
npm run typecheck

License

Apache-2.0