semantic-rbo

v1.4.1

Published

2 months ago

Rank Biased Overlap with semantic clustering for consensus building


bandf

rbo rank-biased-overlap clustering semantic cosine ai consensus embeddings nlp

Semantic RBO

Consensus-building library that merges multiple stakeholders' prioritized lists using semantic clustering and Rank-Biased Overlap (RBO).

Features

Semantic Clustering - Groups similar ideas using embedding vectors (MiniLM) and cosine similarity
Rank-Biased Overlap - Weights rankings by position importance (top items matter more)
Paraphrase Detection - Automatically recognizes when different people express the same idea differently
Agreement Metrics - Quantifies how much stakeholders agree with each other
Markdown Reports - Generates publication-ready consensus reports

Advanced Features

Configurable Decay Functions - Beyond exponential decay, supports linear, logarithmic, square root, plateau, and custom decay functions for rank weighting
Faction Detection - Automatically clusters submitters into factions based on agreement patterns, identifying coalitions and outliers
Threshold Sensitivity Analysis - Test consensus stability across multiple similarity thresholds to validate results
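The decay shapes named above could look roughly like the following. This is an illustrative sketch only: the function names, the formulas, and parameters such as `p`, `listLength`, and `plateauWidth` are assumptions made for this example, not the library's actual API or defaults.

```javascript
// Hypothetical decay functions: each maps a 1-based rank to a weight.
// None of these names or formulas are taken from semantic-rbo itself.
const decays = {
    // Exponential: each step down multiplies the weight by p (0 < p < 1).
    exponential: (rank, { p = 0.9 } = {}) => Math.pow(p, rank - 1),
    // Linear: falls evenly from 1 at rank 1 to 1/listLength at the last rank.
    linear: (rank, { listLength = 10 } = {}) =>
        Math.max(0, (listLength - rank + 1) / listLength),
    // Logarithmic: slow decay, 1 / log2(rank + 1).
    logarithmic: (rank) => 1 / Math.log2(rank + 1),
    // Square root: 1 / sqrt(rank).
    sqrt: (rank) => 1 / Math.sqrt(rank),
    // Plateau: the first plateauWidth ranks count fully, then decay exponentially.
    plateau: (rank, { plateauWidth = 3, p = 0.9 } = {}) =>
        rank <= plateauWidth ? 1 : Math.pow(p, rank - plateauWidth),
};

// A custom decay is just another function of rank.
const topThreeOnly = (rank) => (rank <= 3 ? 1 : 0);

// Build the weight for every position of a list of the given length.
function weightsFor(decay, length, options) {
    return Array.from({ length }, (_, i) => decay(i + 1, options));
}

console.log(weightsFor(decays.linear, 4, { listLength: 4 })); // [ 1, 0.75, 0.5, 0.25 ]
```

Flatter curves such as the logarithmic one give lower-ranked items more say, while a plateau treats the first few positions as equally important.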

Installation

npm install semantic-rbo

Requires Node.js 22.0.0 or higher.

Quick Start

import { buildConsensus } from 'semantic-rbo/builder';

const result = await buildConsensus({
    documents: [
        {
            docId: `alice`,
            steps: [
                `Improve user onboarding`,
                `Add dark mode support`,
                `Fix mobile responsiveness`
            ]
        },
        {
            docId: `bob`,
            steps: [
                `Better onboarding experience`,  // Similar to Alice's #1
                `Performance optimization`,
                `Dark theme option`               // Similar to Alice's #2
            ]
        }
    ]
});

console.log(result.consensus);
// Outputs ranked list with semantic grouping and agreement scores

Documentation

See the full documentation for details.

How It Works

The Pipeline

flowchart LR
    subgraph Input
        A[Alice's Priorities]
        B[Bob's Priorities]
        C[Carol's Priorities]
    end

    subgraph Process
        D[Understand<br/>Meaning]
        E[Group Similar<br/>Ideas]
        F[Merge<br/>Rankings]
    end

    subgraph Output
        G[Consensus<br/>List]
        H[Agreement<br/>Metrics]
    end

    A --> D
    B --> D
    C --> D
    D --> E --> F --> G
    F --> H

A Concrete Example

Here's how two stakeholders' priorities flow through the system:

flowchart TD
    subgraph "Step 1: Raw Input"
        A1["Alice: 1. Improve onboarding<br/>2. Add dark mode<br/>3. Fix mobile"]
        B1["Bob: 1. Better onboarding<br/>2. Performance<br/>3. Dark theme"]
    end

    subgraph "Step 2: Understand Meaning"
        E["Each item converted to<br/>a numeric meaning vector"]
    end

    subgraph "Step 3: Find Matches"
        M1["'Improve onboarding' ≈ 'Better onboarding'"]
        M2["'Add dark mode' ≈ 'Dark theme'"]
        M3["'Fix mobile' — unique"]
        M4["'Performance' — unique"]
    end

    subgraph "Step 4: Consensus"
        C1["1. Onboarding — both ranked it #1"]
        C2["2. Dark mode — Alice #2, Bob #3"]
        C3["3. Performance — Bob #2 only"]
        C4["4. Mobile — Alice #3 only"]
    end

    A1 --> E
    B1 --> E
    E --> M1
    E --> M2
    E --> M3
    E --> M4
    M1 --> C1
    M2 --> C2
    M4 --> C3
    M3 --> C4

Notice how "Improve onboarding" and "Better onboarding" are recognized as the same idea despite different wording. The final ranking reflects both position importance (onboarding was #1 for both) and breadth of support.

Semantic Clustering

Traditional voting systems treat "improve UX" and "better user experience" as completely different ideas, which splits votes and distorts results. This system compares embedding vectors, so it matches on meaning rather than exact wording:

flowchart TD
    subgraph "Different Words, Same Idea"
        A1["Improve onboarding"]
        A2["Better first-time experience"]
        A3["Streamline new user signup"]
    end

    A1 --> C["Grouped as ONE idea"]
    A2 --> C
    A3 --> C

    style C fill:#90EE90

This means stakeholders don't need to coordinate their vocabulary beforehand. Everyone can express priorities naturally, and the system finds the common ground.
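To make the grouping step concrete, here is a minimal sketch of threshold-based clustering with cosine similarity. It is not the library's implementation: the greedy single-pass strategy, the `clusterBySimilarity` helper, the `0.8` threshold, and the toy 3-dimensional vectors (standing in for real MiniLM embeddings) are all assumptions chosen for illustration.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
    let dot = 0, normA = 0, normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Greedy clustering (illustrative): put each item in the first cluster whose
// seed item is similar enough, otherwise start a new cluster.
function clusterBySimilarity(items, threshold = 0.8) {
    const clusters = [];
    for (const item of items) {
        const match = clusters.find(
            (c) => cosineSimilarity(c.seed.vector, item.vector) >= threshold
        );
        if (match) match.members.push(item.text);
        else clusters.push({ seed: item, members: [item.text] });
    }
    return clusters.map((c) => c.members);
}

// Toy vectors standing in for real sentence embeddings.
const ideas = [
    { text: 'Improve onboarding', vector: [0.9, 0.1, 0.0] },
    { text: 'Better first-time experience', vector: [0.8, 0.2, 0.1] },
    { text: 'Add dark mode', vector: [0.0, 0.1, 0.9] },
    { text: 'Streamline new user signup', vector: [0.85, 0.15, 0.05] },
];

console.log(clusterBySimilarity(ideas));
// Two groups: the three onboarding phrasings together, and 'Add dark mode' alone.
```

Lowering the threshold merges more loosely related items, and raising it splits them apart, which is why checking results at several thresholds is useful.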

Rank-Biased Weighting

Not all rankings are equal. If something is everyone's #1 priority, it should outrank something that only appears at #5. The system applies exponential weighting so top positions carry more influence:

flowchart TD
    subgraph "How Position Affects Weight"
        R1["#1 Priority → High Influence"]
        R2["#2 Priority → Medium Influence"]
        R3["#3 Priority → Lower Influence"]
        R4["...and so on"]
    end

    style R1 fill:#FF6B6B
    style R2 fill:#FFB347
    style R3 fill:#FFE066

As a result, widely shared top priorities rise to the top, while a niche concern raised by a single stakeholder cannot dominate the list.
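The Alice and Bob example from earlier can be worked through by hand. The sketch below assumes a weight of `p^(rank - 1)` with `p = 0.8` and scores each cluster by summing the weights of every list it appears in. The `rankWeight` and `mergeRankings` helpers, the value of `p`, and the summing rule are assumptions for illustration, and the cluster labels are assigned manually rather than by embeddings.

```javascript
// Assumed exponential weight: rank 1 -> 1, rank 2 -> p, rank 3 -> p^2, ...
const rankWeight = (rank, p = 0.8) => Math.pow(p, rank - 1);

// Each list holds cluster labels in priority order (labels assigned by hand here).
const rankedLists = {
    alice: ['onboarding', 'dark-mode', 'mobile'],
    bob: ['onboarding', 'performance', 'dark-mode'],
};

// Sum the rank weights per cluster and sort by total score, highest first.
function mergeRankings(lists, p = 0.8) {
    const totals = new Map();
    for (const [submitter, clusters] of Object.entries(lists)) {
        clusters.forEach((cluster, index) => {
            const entry = totals.get(cluster) ?? { cluster, score: 0, supporters: [] };
            entry.score += rankWeight(index + 1, p);
            entry.supporters.push(submitter);
            totals.set(cluster, entry);
        });
    }
    return [...totals.values()].sort((a, b) => b.score - a.score);
}

for (const { cluster, score, supporters } of mergeRankings(rankedLists)) {
    console.log(`${cluster}: ${score.toFixed(2)} (${supporters.join(', ')})`);
}
// onboarding: 2.00 (alice, bob)
// dark-mode: 1.44 (alice, bob)
// performance: 0.80 (bob)
// mobile: 0.64 (alice)
```

Under these assumptions the ordering matches the consensus shown in the concrete example: an item ranked by both stakeholders outscores one ranked higher by only a single stakeholder.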

Before and After

The transformation from scattered input to actionable consensus:

flowchart LR
    subgraph Before["Before: Scattered Input"]
        direction TB
        B1["Team A's list"]
        B2["Team B's list"]
        B3["Team C's list"]
        B4["Different words"]
        B5["Different order"]
        B6["Overlap unclear"]
    end

    T["Semantic<br/>RBO"]

    subgraph After["After: Clear Consensus"]
        direction TB
        A1["1. Top priority — everyone"]
        A2["2. Second priority — most"]
        A3["3. Third priority — some"]
        A4["Agreement scores"]
        A5["Support counts"]
    end

    Before --> T --> After

    style Before fill:#FFE4E1
    style After fill:#E1FFE4

Understanding Agreement

Beyond the consensus list, the system reveals how stakeholders relate to each other:

flowchart TD
    subgraph "Pairwise Agreement"
        P1["Alice ↔ Bob: 78% similar"]
        P2["Alice ↔ Carol: 45% similar"]
        P3["Bob ↔ Carol: 52% similar"]
    end

    subgraph "Alignment Analysis"
        Central["Alice & Bob: Central<br/>— close to consensus"]
        Outlier["Carol: Outlier<br/>— different priorities"]
    end

    P1 --> Central
    P2 --> Outlier
    P3 --> Outlier

    style Central fill:#90EE90
    style Outlier fill:#FFB347

This helps identify which stakeholders are aligned, which have unique perspectives worth discussing, and whether the group has broad agreement or significant divisions.
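One way such pairwise figures could be computed is with a truncated Rank-Biased Overlap over the clustered lists. The sketch below sums `p^(d-1)` times the overlap fraction at each depth `d`, then divides by the total weight so that identical lists score 1. The `rankBiasedOverlap` helper, the `p = 0.9` default, and the normalization are assumptions for illustration; the library's own agreement metric may be defined differently.

```javascript
// Truncated, normalized RBO between two rankings of cluster labels.
// Assumes neither list contains duplicate labels.
function rankBiasedOverlap(listA, listB, p = 0.9) {
    const depth = Math.min(listA.length, listB.length);
    const seenA = new Set();
    const seenB = new Set();
    let overlap = 0;    // size of the intersection of the two top-d prefixes
    let weighted = 0;   // sum of weight * (overlap / d)
    let weightSum = 0;  // sum of weights, used to normalize into [0, 1]
    for (let d = 1; d <= depth; d++) {
        const a = listA[d - 1];
        const b = listB[d - 1];
        if (a === b) {
            overlap += 1;
        } else {
            if (seenB.has(a)) overlap += 1;
            if (seenA.has(b)) overlap += 1;
        }
        seenA.add(a);
        seenB.add(b);
        const weight = Math.pow(p, d - 1);
        weighted += weight * (overlap / d);
        weightSum += weight;
    }
    return depth === 0 ? 0 : weighted / weightSum;
}

const aliceList = ['onboarding', 'dark-mode', 'mobile'];
const bobList = ['onboarding', 'performance', 'dark-mode'];

console.log(rankBiasedOverlap(aliceList, aliceList)); // 1
console.log(rankBiasedOverlap(aliceList, bobList).toFixed(2)); // 0.73
```

Averaging a stakeholder's score against everyone else gives a rough sense of who sits near the consensus and who is an outlier.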

For a complete guide to reading and acting on results, see Interpreting Results.

Use Cases

Requirements gathering from multiple stakeholders
Feature prioritization across teams
Strategic planning synthesis
Survey response aggregation
Collaborative decision making

License

MIT