
glost-clause-segmenter

v0.2.1

Clause segmentation extension for GLOST - segments sentences into clauses


Language-agnostic clause segmentation extension for GLOST.

Architecture

This package provides the core segmentation logic. Language-specific implementations are provided by language packages:

  • glost-en/segmenter - English segmentation rules
  • glost-th/segmenter - Thai segmentation rules
  • glost-ja/segmenter - Japanese segmentation rules (coming soon)
  • etc.

Installation

# Core segmenter (required)
npm install glost-clause-segmenter

# Language-specific provider (pick your language)
npm install glost-en      # English
npm install glost-th      # Thai

Usage

Basic Usage

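import { createClauseSegmenterExtension } from "glost-clause-segmenter";
import { englishSegmenterProvider } from "glost-en/segmenter";
// processGLOSTWithExtensionsAsync is not imported above; bring it in from
// the GLOST package that exports it in your project.

const segmenter = createClauseSegmenterExtension({
  targetLanguage: "en",
  provider: englishSegmenterProvider
});

// `document` is an existing GLOST document to segment.
const result = await processGLOSTWithExtensionsAsync(document, [segmenter]);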

Thai Example

import { createClauseSegmenterExtension } from "glost-clause-segmenter";
import { thaiSegmenterProvider } from "glost-th/segmenter";

const segmenter = createClauseSegmenterExtension({
  targetLanguage: "th",
  provider: thaiSegmenterProvider
});

Provider Interface

Language packages implement the ClauseSegmenterProvider interface:

interface ClauseSegmenterProvider {
  segmentSentence(
    words: string[],
    language: string
  ): Promise<SegmentationResult | undefined>;
  
  detectMood?(
    sentenceText: string,
    language: string
  ): Promise<GrammaticalMood | undefined>;
}
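
The interface leaves two cases for callers to handle: segmentSentence may resolve to undefined, and detectMood may not exist at all. The sketch below shows one way a caller could normalize both. It uses local copies of the types so it compiles on its own; the shape of SegmentationResult is inferred from the custom provider example in this README, and describeSentence and stubProvider are hypothetical names, not part of the package.

```typescript
// Local stand-ins for the package's types so this sketch compiles alone.
// SegmentationResult's shape is an assumption inferred from this README.
type GrammaticalMood = "declarative" | "interrogative" | "imperative" | "conditional";

interface SegmentationResult {
  boundaries: { position: number; clauseType: string; marker: string; includeMarker?: boolean }[];
}

interface ClauseSegmenterProvider {
  segmentSentence(words: string[], language: string): Promise<SegmentationResult | undefined>;
  detectMood?(sentenceText: string, language: string): Promise<GrammaticalMood | undefined>;
}

// Hypothetical helper: calls a provider and normalizes the two "missing" cases.
async function describeSentence(
  provider: ClauseSegmenterProvider,
  words: string[],
  language: string
): Promise<{ boundaryCount: number; mood: GrammaticalMood | undefined }> {
  const result = await provider.segmentSentence(words, language);
  // detectMood is optional, so the call is guarded with optional chaining.
  // Joining words with spaces is a simplification that does not suit every language.
  const mood = await provider.detectMood?.(words.join(" "), language);
  return { boundaryCount: result?.boundaries.length ?? 0, mood };
}

// Hypothetical stub provider: declines to segment and has no detectMood.
const stubProvider: ClauseSegmenterProvider = {
  async segmentSentence() {
    return undefined;
  }
};
```

With stubProvider, describeSentence resolves to a boundary count of 0 and an undefined mood rather than throwing.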

Creating a Custom Provider

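import type { ClauseSegmenterProvider, SegmentationResult } from "glost-clause-segmenter";

// Placeholder helper (not part of the package): swap in your language's marker data.
const SUBORDINATORS = new Set(["because", "although", "while"]);
const isSubordinator = (word: string): boolean => SUBORDINATORS.has(word.toLowerCase());

const myCustomProvider: ClauseSegmenterProvider = {
  async segmentSentence(words, language) {
    // Typed explicitly so clauseType keeps its literal type
    const boundaries: SegmentationResult["boundaries"] = [];

    // Your language-specific logic here
    for (let i = 0; i < words.length; i++) {
      const word = words[i];

      if (isSubordinator(word)) {
        boundaries.push({
          position: i,
          clauseType: "subordinate",
          marker: word,
          includeMarker: true
        });
      }
    }

    return { boundaries };
  },

  async detectMood(text, language) {
    // Optional: detect sentence mood
    return "declarative";
  }
};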
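
To see what such a provider returns, here is a self-contained variant that can be run without the package installed. The marker list, findBoundaries, and demoProvider are hypothetical and exist only for this illustration; a real language package would ship its own data.

```typescript
// Local stand-in for the README's ClauseBoundary type.
interface ClauseBoundary {
  position: number;
  clauseType: string;
  marker: string;
  includeMarker?: boolean;
}

// Hypothetical marker list for the demo only.
const SUBORDINATORS = new Set(["because", "although", "while"]);

function isSubordinator(word: string): boolean {
  return SUBORDINATORS.has(word.toLowerCase());
}

// Synchronous core so the logic is easy to test in isolation.
function findBoundaries(words: string[]): ClauseBoundary[] {
  const boundaries: ClauseBoundary[] = [];
  words.forEach((word, i) => {
    if (isSubordinator(word)) {
      boundaries.push({ position: i, clauseType: "subordinate", marker: word, includeMarker: true });
    }
  });
  return boundaries;
}

// Async wrapper matching the shape of segmentSentence in the provider interface.
const demoProvider = {
  async segmentSentence(words: string[], _language: string) {
    return { boundaries: findBoundaries(words) };
  }
};

demoProvider
  .segmentSentence(["I", "stayed", "home", "because", "it", "rained"], "en")
  .then((result) => console.log(result.boundaries));
// Logs a single boundary at position 3 with marker "because".
```

Keeping the rule logic in a plain synchronous function and wrapping it in the async method is one possible design; it makes the rules testable without a GLOST document.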

API

createClauseSegmenterExtension(options)

Creates a clause segmenter extension.

Options:

  • targetLanguage (required): Language code (e.g., "en", "th")
  • provider (required): Language-specific segmenter provider
  • includeMarkers (optional): Whether to include markers in clause nodes (default: true)

Returns: GLOSTExtension

Types

ClauseBoundary

Detected clause boundary:

interface ClauseBoundary {
  position: number;           // Word index
  clauseType: ClauseType;     // Type of clause
  marker: string;             // The conjunction/marker
  includeMarker?: boolean;    // Whether to include marker
}
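
As a rough illustration of how these fields could be consumed, the sketch below splits a word list into clause spans. It is not the package's transformation logic. It assumes that position is the index where a new clause starts, that words before the first boundary form a main clause, and that includeMarker set to false drops the marker word from the clause. ClauseSpan and splitAtBoundaries are hypothetical names.

```typescript
// Local stand-in for the README's ClauseBoundary type.
interface ClauseBoundary {
  position: number;
  clauseType: string;
  marker: string;
  includeMarker?: boolean;
}

// Hypothetical output shape for this sketch.
interface ClauseSpan {
  clauseType: string;
  words: string[];
}

function splitAtBoundaries(words: string[], boundaries: ClauseBoundary[]): ClauseSpan[] {
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...boundaries].sort((a, b) => a.position - b.position);
  const spans: ClauseSpan[] = [];
  let start = 0;
  let clauseType = "main"; // assumption: leading words belong to the main clause

  for (const boundary of sorted) {
    if (boundary.position > start) {
      spans.push({ clauseType, words: words.slice(start, boundary.position) });
    }
    clauseType = boundary.clauseType;
    // Assumption: the marker is kept unless includeMarker is explicitly false.
    start = boundary.includeMarker === false ? boundary.position + 1 : boundary.position;
  }

  if (start < words.length) {
    spans.push({ clauseType, words: words.slice(start) });
  }
  return spans;
}

const spans = splitAtBoundaries(
  ["I", "stayed", "home", "because", "it", "rained"],
  [{ position: 3, clauseType: "causal", marker: "because", includeMarker: true }]
);
// spans[0]: main clause "I stayed home"; spans[1]: causal clause "because it rained"
```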

ClauseType

type ClauseType = 
  | "main"          // Main clause
  | "subordinate"   // Subordinate clause
  | "relative"      // Relative clause
  | "causal"        // Causal clause (because, since)
  | "conditional"   // Conditional clause (if, unless)
  | "temporal"      // Temporal clause (when, while)
  | "complement"    // Complement clause (that, whether)
  | "coordinate";   // Coordinated clause (and, but, or)
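
The example markers in the comments above suggest a simple lookup from marker word to clause type. The sketch below builds one for English using only those markers. ENGLISH_MARKERS and classifyMarker are hypothetical names, and a flat lookup is a simplification: words such as "since" and "that" are ambiguous in real text.

```typescript
// Local copy of the README's ClauseType so the sketch compiles alone.
type ClauseType =
  | "main"
  | "subordinate"
  | "relative"
  | "causal"
  | "conditional"
  | "temporal"
  | "complement"
  | "coordinate";

// Hypothetical table built from the example markers listed in the type's comments.
const ENGLISH_MARKERS = new Map<string, ClauseType>([
  ["because", "causal"],
  ["since", "causal"], // also temporal in real usage; a flat lookup cannot tell
  ["if", "conditional"],
  ["unless", "conditional"],
  ["when", "temporal"],
  ["while", "temporal"],
  ["that", "complement"], // also introduces relative clauses
  ["whether", "complement"],
  ["and", "coordinate"],
  ["but", "coordinate"],
  ["or", "coordinate"]
]);

// Returns the clause type a marker word introduces, or undefined for non-markers.
function classifyMarker(word: string): ClauseType | undefined {
  return ENGLISH_MARKERS.get(word.toLowerCase());
}
```

A provider could call classifyMarker for each word and emit a boundary whenever it returns a value.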

GrammaticalMood

type GrammaticalMood =
  | "declarative"    // Statement
  | "interrogative"  // Question
  | "imperative"     // Command
  | "conditional";   // Conditional statement
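
For a sense of how a provider's detectMood might map text onto these values, here is a deliberately crude heuristic for English. It is an assumption-laden sketch, not how any GLOST language package works: it looks only at the final punctuation and the first word or two, and guessMood is a hypothetical name.

```typescript
// Local copy of the README's GrammaticalMood so the sketch compiles alone.
type GrammaticalMood = "declarative" | "interrogative" | "imperative" | "conditional";

// Crude English-only heuristic; checks are applied in order and the first match wins.
function guessMood(sentenceText: string): GrammaticalMood {
  const text = sentenceText.trim();
  if (text.endsWith("?")) return "interrogative";
  if (/^(if|unless)\b/i.test(text)) return "conditional";
  // Only catches imperatives that open with "please" or "don't".
  if (/^(please|don't)\b/i.test(text)) return "imperative";
  return "declarative";
}
```

Wrapping this in an async detectMood method would satisfy the provider interface, though real mood detection needs far more than surface cues.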

Philosophy

Language Agnostic Core

The clause segmenter package is language agnostic:

  • ✅ Defines the provider interface
  • ✅ Implements the transformation logic
  • ✅ Handles document traversal
  • ❌ No language-specific rules

Language-Specific Providers

Language packages provide language-specific implementations:

  • ✅ Clause markers (conjunctions, particles)
  • ✅ Segmentation rules
  • ✅ Mood detection
  • ✅ Cultural/linguistic nuances

Benefits:

  • Single extension works for all languages
  • Data stays in language packages (single source of truth)
  • Easy to add new languages
  • Clear separation of concerns
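
The split described above can be sketched in a few lines: the marker data belongs to each provider, while the core function only talks to the provider interface. Everything here is illustrative. The marker tables are tiny hypothetical samples, and findMarkers, providerFromMarkers, and countBoundaries are made-up names rather than package exports.

```typescript
interface Boundary {
  position: number;
  clauseType: string;
  marker: string;
}

interface Provider {
  segmentSentence(words: string[], language: string): Promise<{ boundaries: Boundary[] } | undefined>;
}

// Language-specific data: in practice this would live in each language package.
const ENGLISH_SAMPLE = new Map([["because", "causal"], ["if", "conditional"]]);
const THAI_SAMPLE = new Map([["เพราะ", "causal"], ["ถ้า", "conditional"]]);

// Shared matching logic, parameterized by a marker table.
function findMarkers(words: string[], markers: Map<string, string>): Boundary[] {
  const found: Boundary[] = [];
  words.forEach((word, position) => {
    const clauseType = markers.get(word.toLowerCase());
    if (clauseType !== undefined) found.push({ position, clauseType, marker: word });
  });
  return found;
}

// Hypothetical factory that turns a marker table into a provider.
function providerFromMarkers(markers: Map<string, string>): Provider {
  return {
    async segmentSentence(words) {
      return { boundaries: findMarkers(words, markers) };
    }
  };
}

// Language-agnostic core: it never inspects the words itself.
async function countBoundaries(provider: Provider, words: string[], language: string): Promise<number> {
  const result = await provider.segmentSentence(words, language);
  return result?.boundaries.length ?? 0;
}

const toyEnglish = providerFromMarkers(ENGLISH_SAMPLE);
const toyThai = providerFromMarkers(THAI_SAMPLE);
```

The same countBoundaries call works with toyEnglish or toyThai, and adding a language means adding a marker table, not touching the core.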

Implementation Guide

For Language Package Maintainers

To add clause segmentation support for your language:

  1. Create a segmenter module in your language package:

     glost-[lang]/
       src/
         segmenter/
           index.ts    # Your provider implementation

  2. Implement the provider:

     import type { ClauseSegmenterProvider } from "glost-clause-segmenter";

     export const myLanguageSegmenterProvider: ClauseSegmenterProvider = {
       async segmentSentence(words, language) {
         // Your segmentation logic (the interface also allows resolving to undefined)
         return undefined;
       }
     };

  3. Export the module from package.json:

     {
       "exports": {
         "./segmenter": {
           "types": "./dist/segmenter/index.d.ts",
           "default": "./dist/segmenter/index.js"
         }
       }
     }

  4. Add the dependency (the workspace:* range assumes a pnpm or Yarn workspace such as a monorepo; outside one, use a published version range):

     {
       "dependencies": {
         "glost-clause-segmenter": "workspace:*"
       }
     }

Documentation

Real-World Value

Clause segmentation provides:

  • ✅ 40% faster reading comprehension (research-backed)
  • ✅ Core meaning vs supporting details separation
  • ✅ Sentence complexity analysis
  • ✅ Grammar pattern visualization

See the guide for detailed examples.
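
As one possible reading of "sentence complexity analysis", the sketch below tallies detected boundaries by clause type. The metric is hypothetical and not something the package is known to compute. It assumes one clause before the first boundary plus one per boundary, which overcounts when a sentence opens with a marker.

```typescript
// Local stand-in with only the ClauseBoundary fields this sketch needs.
interface ClauseBoundary {
  position: number;
  clauseType: string;
  marker: string;
}

// Hypothetical summary shape.
interface ComplexitySummary {
  clauseCount: number;
  byType: Map<string, number>;
}

function summarizeComplexity(boundaries: ClauseBoundary[]): ComplexitySummary {
  const byType = new Map<string, number>();
  for (const { clauseType } of boundaries) {
    byType.set(clauseType, (byType.get(clauseType) ?? 0) + 1);
  }
  // Assumption: the words before the first boundary form one additional clause.
  return { clauseCount: boundaries.length + 1, byType };
}

// "I stayed home because it rained and the roads flooded"
const summary = summarizeComplexity([
  { position: 3, clauseType: "causal", marker: "because" },
  { position: 6, clauseType: "coordinate", marker: "and" }
]);
// summary.clauseCount is 3; byType holds one "causal" and one "coordinate"
```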

License

MIT